Where The “User” Stands With OCR Technology…

Automating a business process can entail many things: repurposing staff, changing corporate mindsets, and implementing new technology. Although this can seem daunting, the investment is usually well worth it once you look at the savings and the assurance of business continuity. This article focuses on the user-facing technology, in other words, the software used by the people who actually do the work.

First, we need to understand which parts of the process are done by a human and which parts are done by the software. You can find a simple chart below, but please note that the specific setup is different for each company and process.

Step | Step Name | Step Description | Human or Software?
1 | Scanning | Using a scanner to transfer paper documents into digital format | Human
2 | Recognition | Extracting data from a form or “batch” of documents/images | Software
3 | Verification | Validating the extracted data from Step 2 | Human
4 | Export | Saving both the document and the extracted data in the business system of choice | Software
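
For readers who think in code, the chart above can be written down as a small data structure. This is only an illustrative sketch: the names `Actor`, `Step`, `PIPELINE`, and `steps_for` are made up for this article and are not part of any OCR product.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Actor(Enum):
    HUMAN = "Human"
    SOFTWARE = "Software"


@dataclass(frozen=True)
class Step:
    number: int
    name: str
    actor: Actor


# The four steps from the chart; a real deployment may assign them differently.
PIPELINE = [
    Step(1, "Scanning", Actor.HUMAN),
    Step(2, "Recognition", Actor.SOFTWARE),
    Step(3, "Verification", Actor.HUMAN),
    Step(4, "Export", Actor.SOFTWARE),
]


def steps_for(actor: Actor) -> List[str]:
    """Return the names of the steps handled by the given actor, in order."""
    return [step.name for step in PIPELINE if step.actor is actor]


if __name__ == "__main__":
    print("Human steps:", steps_for(Actor.HUMAN))
    print("Software steps:", steps_for(Actor.SOFTWARE))
```

Filtering by `Actor.HUMAN` picks out the two steps the rest of this article covers: Scanning and Verification.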

Now that we understand which steps are done by humans, let’s look at the technology those valuable people use. The first is what is called a “Scanning Station” or “Scanning Client”. As the chart above describes, the purpose of the scanning station is to take physical paper documents and digitize them. The scanning station is typically connected to a local or network scanner into which the user feeds paper. Once the software captures an image of each page, it presents the images to the user to review and adjust. Common options are rotating the image by a set number of degrees, cropping, deleting, and even redacting. More advanced options may allow the user to separate the batch of scanned images into individual documents or change image resolutions. For a great example of scanning station technology, please check out our YouTube video of ABBYY’s scanning technology: https://www.youtube.com/watch?v=yOskkih5EKo
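
To make the “separate the batch into documents” option more concrete, here is a minimal sketch of the idea. It assumes the user has marked which pages start a new document; the function name `split_batch` and the page labels are hypothetical and do not reflect how any particular scanning product works internally.

```python
from typing import List, Sequence


def split_batch(pages: Sequence[str], first_page_indexes: Sequence[int]) -> List[List[str]]:
    """Split a flat batch of scanned pages into documents.

    first_page_indexes holds the positions (0-based) of the pages that the
    user marked as the first page of a new document. Page 0 always starts
    a document, and out-of-range positions are ignored.
    """
    if not pages:
        return []
    valid = {i for i in first_page_indexes if 0 < i < len(pages)}
    starts = sorted(valid | {0})
    ends = starts[1:] + [len(pages)]
    return [list(pages[start:end]) for start, end in zip(starts, ends)]


if __name__ == "__main__":
    batch = ["page1", "page2", "page3", "page4", "page5"]
    # The user marks page3 and page5 as the start of new documents.
    for document in split_batch(batch, [2, 4]):
        print(document)
```

In this example the five-page batch becomes three documents: two pages, two pages, and one page.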

The other step done by humans in some organizations is Verification. The “Verification Station” or “Verification Client” lets the user see what the software extracted and where on the document it was found, and make adjustments as needed. At this point in our process, the documents or images have already been captured, so a direct connection to a scanner is not needed. The Verification Station will typically show the image/document on one side of the screen and the extracted data on the other. The extracted data side contains fields populated with the recognized values. If the software has low confidence in its recognition, it will highlight either the whole word or part of the word in red. Some verification software will even allow the user to “train” it on where to find fields.
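
The low-confidence highlighting described above can be sketched in a few lines. This is an assumption-laden illustration, not ABBYY’s implementation: it assumes each recognized character carries a confidence score between 0.0 and 1.0, uses a made-up threshold of 0.80, and marks uncertain runs with square brackets where a real verification screen would use red.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class RecognizedChar:
    char: str
    confidence: float  # assumed scale: 0.0 (no confidence) to 1.0 (certain)


THRESHOLD = 0.80  # hypothetical cut-off; real products make this configurable


def low_confidence_spans(chars: List[RecognizedChar],
                         threshold: float = THRESHOLD) -> List[Tuple[int, int]]:
    """Return (start, end) ranges, end exclusive, of consecutive uncertain characters."""
    spans = []
    start = None
    for i, rc in enumerate(chars):
        if rc.confidence < threshold:
            if start is None:
                start = i
        elif start is not None:
            spans.append((start, i))
            start = None
    if start is not None:
        spans.append((start, len(chars)))
    return spans


def render(chars: List[RecognizedChar], threshold: float = THRESHOLD) -> str:
    """Wrap uncertain runs in [brackets], standing in for red highlighting."""
    out = []
    pos = 0
    for start, end in low_confidence_spans(chars, threshold):
        out.append("".join(rc.char for rc in chars[pos:start]))
        out.append("[" + "".join(rc.char for rc in chars[start:end]) + "]")
        pos = end
    out.append("".join(rc.char for rc in chars[pos:]))
    return "".join(out)


if __name__ == "__main__":
    scores = [0.99, 0.98, 0.97, 0.41, 0.95, 0.96, 0.99]
    word = [RecognizedChar(c, s) for c, s in zip("INV0ICE", scores)]
    print(render(word))
```

Here only the fourth character falls below the threshold, so only part of the word is flagged for the user to check. If every character scored low, the whole word would be flagged instead.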

For a basic example of ABBYY’s verification station in use, check out this video: https://www.youtube.com/watch?v=IzEDHh3Zl30

For a more advanced view of ABBYY’s verification station and how the “training” is used, check out this video: https://www.youtube.com/watch?v=_0JAAlMt4nM
