Context: when processing a large flow of documents, specialists spend much of their time on routine work. Automation is difficult because a single document type comes in many different forms, and manual markup is slow.
Solution: we built a document recognition system based on our own OCR and Text Detection models.
The solution is integrated into the customer's business process, with the following results:
Load on specialists reduced by 60%;
Document processing speed in the business process increased by 10%;
Costs due to lost documents reduced by 5%.
Built solutions for automatic recognition of the following document types:
Order-out; Contract agreement.
To build the model, we used:
A marked-up document template;
Document reference fields with their characteristics;
A set of document images.
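The inputs listed above can be pictured as a small data structure. This is a minimal sketch, not the project's actual format: the class names ReferenceField and DocumentTemplate, the relative-coordinate convention, and the "kind" and "required" characteristics are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# All names below are hypothetical; the original system's template format is not described.

@dataclass
class ReferenceField:
    name: str                                # e.g. "contract_number"
    box: Tuple[float, float, float, float]   # (x0, y0, x1, y1) as fractions of page size
    kind: str = "text"                       # assumed characteristic: "text", "date", "number"
    required: bool = True                    # assumed characteristic


@dataclass
class DocumentTemplate:
    doc_type: str
    fields: List[ReferenceField] = field(default_factory=list)

    def to_pixels(self, width: int, height: int) -> Dict[str, Tuple[int, int, int, int]]:
        """Scale the relative field boxes to pixel coordinates of a concrete image."""
        return {
            f.name: (
                round(f.box[0] * width),
                round(f.box[1] * height),
                round(f.box[2] * width),
                round(f.box[3] * height),
            )
            for f in self.fields
        }


template = DocumentTemplate(
    doc_type="contract agreement",
    fields=[
        ReferenceField("contract_number", (0.5, 0.0, 1.0, 0.25), kind="number"),
        ReferenceField("signing_date", (0.5, 0.25, 1.0, 0.5), kind="date"),
    ],
)

# Field regions for a 1000 x 2000 pixel scan of this document type.
pixel_boxes = template.to_pixels(width=1000, height=2000)
```

Storing boxes as fractions of the page keeps one template usable across scans of different resolutions.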
The system consists of the following models:
OCR model for extracting characters;
Text Detection model for locating text blocks;
Table Detection model for recognizing tables;
Model for building an electronic version of the document.
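One way such models could be chained is sketched below, under stated assumptions: detected text blocks are matched to template fields by intersection over union (IoU), and the OCR model reads the best-matching block. The function names, the IoU matching rule, and the 0.3 threshold are illustrative and not taken from the original system. The detector and OCR are stand-in stubs rather than TensorFlow models, and table detection and document assembly are omitted.

```python
from typing import Callable, Dict, List, Tuple

Box = Tuple[int, int, int, int]  # (x0, y0, x1, y1) in pixels


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix0, iy0 = max(a[0], b[0]), max(a[1], b[1])
    ix1, iy1 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix1 - ix0) * max(0, iy1 - iy0)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0


def recognize_document(
    image,
    template_fields: Dict[str, Box],
    detect_text: Callable[[object], List[Box]],
    read_text: Callable[[object, Box], str],
    min_iou: float = 0.3,  # assumed threshold, chosen only for the example
) -> Dict[str, str]:
    """Run detection, match blocks to template fields, then OCR the matched blocks."""
    blocks = detect_text(image)
    result: Dict[str, str] = {}
    for name, field_box in template_fields.items():
        best = max(blocks, key=lambda b: iou(b, field_box), default=None)
        if best is not None and iou(best, field_box) >= min_iou:
            result[name] = read_text(image, best)
        else:
            result[name] = ""  # field not found on this page
    return result


# Stub models standing in for the real Text Detection and OCR networks.
def fake_detector(image) -> List[Box]:
    return [(510, 10, 990, 480), (0, 0, 100, 50)]


def fake_ocr(image, box: Box) -> str:
    return "A-123" if box[0] > 500 else "header"


fields = {
    "contract_number": (500, 0, 1000, 500),
    "signing_date": (500, 500, 1000, 1000),
}
extracted = recognize_document(None, fields, fake_detector, fake_ocr)
```

Passing the detector and OCR in as callables keeps the matching logic testable without loading any model weights.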
Customer industries: Telecom, Finance.
Technology stack: TensorFlow, Python, Flask.