
White Paper: Best Practices in Form Design for ABBYY FlexiCapture Fixed Form Processing
Introduction to ABBYY® FlexiCapture
ABBYY FlexiCapture is a comprehensive forms processing and document capture solution. Its standard import features cover scanning documents, receiving documents by email, processing images from a watched network folder, and manually processing images. Its standard export features include output to a wide variety of file formats, to enterprise content management systems, and to Microsoft SharePoint.

The product is available in two architectures, depending on your needs. The standalone system is for a single user processing smaller volumes of documents on a single workstation. The distributed version is a scalable client-server application that runs on Microsoft IIS. It supports large-scale processing of high document volumes spread across multiple servers, with load balancing and failover in a Microsoft Server cluster.

Within both installation types, a further consideration is whether you need to process fixed (structured) forms, semi-structured forms, or unstructured forms. The last two types require the FlexiLayout Studio product and its licensing. Structured forms can be processed by the standard versions of FlexiCapture, both standalone and distributed. Both versions include a tool called the FormDesigner for creating “scannable” paper forms.
Hints for Using the ABBYY FormDesigner Tool
You can save yourself considerable grief by treating the ABBYY FormDesigner as a tool for creating form prototypes, which you then export in image format to Adobe InDesign for the final design. There may be exceptions to this principle for certain simple forms with limited, widely spaced elements. Using the FormDesigner for prototyping only makes it much easier to produce forms that work well in FlexiCapture, even though the FormDesigner automatically creates much of the document definition (template).
The following are some examples of how the FormDesigner can help you design a better form once you begin the production form design in a form design tool like Adobe InDesign.
- Use the designer to obtain the exact dimensions of form objects, then use these dimensions when you build the form in InDesign. For example, use the ABBYY tool to design a series of text entry fields with marking frames 4 mm wide by 5 mm tall (see Figure One).
- Use the designer to quickly create alternative prototype versions of a form containing a “straw man” set of form fields, with the form elements generated by the ABBYY FormDesigner.
- Build an image containing the required page anchor elements, such as black squares or angled corner marks.
- Create a barcode element for matching the document definition.
- Create all of the columns and one row of data for a table element.
- Use the designer to create a “filling template” element that shows users of the form how to properly print handwritten letters and mark checkboxes (see Figure Two).
- Create a table object with checkbox or text entry fields (see Figure Three).
- Create text entry fields using constrained elements.
- Build a prototype form with a combination of elements for rapid testing with the customer.
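When carrying dimensions such as the 4 mm by 5 mm marking frame from the prototype into another tool, it can help to know roughly how many pixels that frame will occupy on the scanned image. The following is a minimal Python sketch of that arithmetic. The helper name `mm_to_pixels` and the 200 and 300 dpi resolutions are assumptions chosen for illustration, not settings taken from FlexiCapture or from this paper.

```python
MM_PER_INCH = 25.4


def mm_to_pixels(mm: float, dpi: int) -> int:
    """Convert a physical length in millimetres to whole pixels at a scan resolution."""
    return round(mm / MM_PER_INCH * dpi)


# Hypothetical scan resolutions, used only to illustrate the conversion.
for dpi in (200, 300):
    width_px = mm_to_pixels(4, dpi)
    height_px = mm_to_pixels(5, dpi)
    print(f"{dpi} dpi: a 4 x 5 mm frame is about {width_px} x {height_px} px")
```

At the assumed 300 dpi, a 4 mm by 5 mm frame comes out at roughly 47 by 59 pixels, which gives a feel for how little room each handwritten character has.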
Figure One: Text Entry Field (4 x 5 mm)
Figure Two: Sample Filling Template from ABBYY
Figure Three: Table Object with Entry Fields
Form Design Considerations for Various Field Types
In general, provide as much information as possible in the document definition field properties for each field on the form. This may seem counterintuitive until you realize that the more FlexiCapture knows about a field, the fewer possibilities it has to consider when assigning a value to it. It can make a difference, for example, whether you specify a data type as just a name or, more narrowly, as a first name. Finally, any time you can use a database lookup to restrict the possible values for a field, you will increase the accuracy of the recognition.
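To illustrate why restricting a field to known values helps, here is a minimal Python sketch that snaps a raw recognition result to the closest entry in a list of allowed values. It is not how FlexiCapture performs database lookups internally. The function `snap_to_lookup`, the `FIRST_NAMES` list, and the misread value are all hypothetical, and the matching uses Python's standard `difflib` module.

```python
import difflib


def snap_to_lookup(raw_value, allowed_values, cutoff=0.6):
    """Return the closest allowed value for a raw recognition result, or None."""
    # Compare in upper case so letter case does not affect the match.
    by_upper = {value.upper(): value for value in allowed_values}
    matches = difflib.get_close_matches(
        raw_value.upper(), list(by_upper), n=1, cutoff=cutoff
    )
    return by_upper[matches[0]] if matches else None


# Hypothetical lookup table standing in for a database column of first names.
FIRST_NAMES = ["James", "Maria", "Robert", "Susan"]

print(snap_to_lookup("Jarnes", FIRST_NAMES))  # James ("m" misread as "rn")
print(snap_to_lookup("Xqzv", FIRST_NAMES))    # None (nothing is close enough)
```

With an unrestricted text field, the misread “Jarnes” would have to be accepted or corrected by hand. With only four allowed values, there is a single plausible candidate.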
Text Entry Field for ICR:
We have achieved superior recognition results for handwriting (ICR) by using the character box series marking type and designating the number of cells when creating the document definition. These full squares seem to give users a better way to enter data, and FlexiCapture works better with this marking pattern (see Figure Four). The second type that works very well is the same character box series with the cell boxes printed in a dropout color. When you create the document definition for this type, you just need to change the marking type to “simple,” because the character box markings disappear when the form is scanned, leaving only the characters themselves.

Finally, we have not had very good results with the “dotted frame” marking method provided by the FormDesigner. The issue with this marking type is that the dots tend to enlarge because of the variability introduced by the scanning operation. Think of what is known as the “fax game.” Someone in an office makes a copy of a document and sends this first copy through a fax machine, which is in effect a very low quality scanner. The person on the other end receives the fax and sends it by fax to another person, and the cycle repeats. After a dozen faxes the document has changed quite a bit: text areas have darkened and noise (dots and speckles) has appeared. Evidently the same thing happens to text entry fields marked with dotted frames, which tend to thicken with scanning. Even when the document definition is set to de-speckle, you will experience poor recognition results for handwritten values.
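The thickening effect can be pictured with a toy model. The Python sketch below treats one edge of a dotted frame as a row of pixels and assumes that each copy or scan generation darkens any pixel adjacent to a dark one. This is a deliberately crude assumption for illustration, not a model of any real scanner or of FlexiCapture's image processing. The dot spacing is also hypothetical.

```python
def dilate(row):
    """One copy generation: a pixel turns dark if it or either neighbour is dark."""
    n = len(row)
    return [
        1 if row[i] or (i > 0 and row[i - 1]) or (i < n - 1 and row[i + 1]) else 0
        for i in range(n)
    ]


def render(row):
    """Draw dark pixels as '#' and light pixels as '.'."""
    return "".join("#" if pixel else "." for pixel in row)


# Hypothetical dotted frame edge: one dark pixel followed by four light ones.
row = [1, 0, 0, 0, 0] * 4
for generation in range(3):
    print(generation, render(row))
    row = dilate(row)
```

In this toy model the gaps between dots close after two generations, so the dotted edge behaves like a solid line that then competes with the handwriting inside the frame.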
Figure Four: Text Entry Field using Character Box Series
OMR Checkmark Field:
Here at UFC, Inc. we have found superior results when using the rectangular field marking for these fields. As an alternative, we tried oval markings instead of rectangles. Because the document definition offers no option for ovals, we specified rectangular markings when we created it. This marking type worked very well even with the ovals.
Conclusion
Use the ABBYY FormDesigner tool to create high quality prototype forms for testing and evaluation, then create the production form in Adobe InDesign using the form elements produced in the ABBYY tool. Take care when specifying page elements on forms so that you optimize the data extraction capability of the ABBYY FlexiCapture system.
Appendix: Scripting Example for Counting Checkbox Values
A potential client asked us whether we could (1) count the number of times a group of checkboxes was marked, and (2) count the markings by the category to which each checkmark belonged, either normal or significant. Some checkmarks were in the normal category, while others were assigned to the more critical “significant” category. The following is the script that we created in the document definition for this operation. You may find it useful when you have to tally OMR fields and make calculations on them. If you have questions we would be glad to help you.
    Dim deficiencies
    Dim sigdeficiencies
    'Created by Jim Hill, UFC, Inc.
    'The script determines a count of checkboxes on the forms for the four field groups (1.Controls – 4.Other)
    'It reports a separate value for total number of deficiencies and another for significant deficiencies
    'Significant deficiencies are bolded fields on the image and the fields are designated -SIG
    'Set the variable to zero so it's not a null so it can be incremented
    'But only do this the first time for checkbox group 1 then increment throughout the project
    deficiencies = 0
    sigdeficiencies = 0
    'Change the field value to the proper value, could do this in a for next loop but there are only three values
    if Me.Field("1a").Value = "Y" then
        deficiencies = deficiencies + 1
    end if
    'increment then repeat the counter for the remaining fields
    if Me.Field("1b").Value = "Y" then
        deficiencies = deficiencies + 1
    end if
    if Me.Field("1c-SIG").Value = "Y" then
        deficiencies = deficiencies + 1
        'increment the significant deficiencies since 1c is a bolded value on the form (significant) field
        'change this for each bolded field on the form
        sigdeficiencies = sigdeficiencies + 1
    end if
    if Me.Field("2a").Value = "Y" then
        deficiencies = deficiencies + 1
    end if
    if Me.Field("2b").Value = "Y" then
        deficiencies = deficiencies + 1
    end if
    if Me.Field("2c-SIG").Value = "Y" then
        deficiencies = deficiencies + 1
        sigdeficiencies = sigdeficiencies + 1
    end if
    if Me.Field("2d-SIG").Value = "Y" then
        deficiencies = deficiencies + 1
        sigdeficiencies = sigdeficiencies + 1
    end if
    if Me.Field("2e-SIG").Value = "Y" then
        deficiencies = deficiencies + 1
        sigdeficiencies = sigdeficiencies + 1
    end if
    if Me.Field("2f-SIG").Value = "Y" then
        deficiencies = deficiencies + 1
        sigdeficiencies = sigdeficiencies + 1
    end if
    if Me.Field("2g").Value = "Y" then
        deficiencies = deficiencies + 1
    end if
    if Me.Field("2h").Value = "Y" then
        deficiencies = deficiencies + 1
    end if
    if Me.Field("2i-SIG").Value = "Y" then
        deficiencies = deficiencies + 1
        sigdeficiencies = sigdeficiencies + 1
    end if
    if Me.Field("3a-SIG").Value = "Y" then
        deficiencies = deficiencies + 1
        sigdeficiencies = sigdeficiencies + 1
    end if
    if Me.Field("3b").Value = "Y" then
        deficiencies = deficiencies + 1
    end if
    if Me.Field("3c").Value = "Y" then
        deficiencies = deficiencies + 1
    end if
    if Me.Field("3d-SIG").Value = "Y" then
        deficiencies = deficiencies + 1
        sigdeficiencies = sigdeficiencies + 1
    end if
    if Me.Field("3e").Value = "Y" then
        deficiencies = deficiencies + 1
    end if
    if Me.Field("4a").Value = "Y" then
        deficiencies = deficiencies + 1
    end if
    if Me.Field("4b").Value = "Y" then
        deficiencies = deficiencies + 1
    end if
    if Me.Field("4c-SIG").Value = "Y" then
        deficiencies = deficiencies + 1
        sigdeficiencies = sigdeficiencies + 1
    end if
    if Me.Field("4d").Value = "Y" then
        deficiencies = deficiencies + 1
    end if
    if Me.Field("4e-SIG").Value = "Y" then
        deficiencies = deficiencies + 1
        sigdeficiencies = sigdeficiencies + 1
    end if
    Me.Field("Count of Deficiencies").Value = deficiencies
    Me.Field("Count of Significant Deficiencies").Value = sigdeficiencies
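The script's own comments note that the tally could be written as a loop. As a minimal sketch of that idea, the Python function below counts marked checkboxes from a plain dictionary and uses the `-SIG` suffix in the field name to decide whether a mark is significant. It runs outside FlexiCapture, so the dictionary stands in for the `Me.Field(...)` lookups. The function name `count_deficiencies` and the sample values are hypothetical.

```python
def count_deficiencies(field_values):
    """Return (total, significant) counts of checkboxes marked "Y".

    field_values maps a field name such as "1a" or "1c-SIG" to its value.
    A name ending in "-SIG" is treated as a significant deficiency.
    """
    total = 0
    significant = 0
    for name, value in field_values.items():
        if value != "Y":
            continue
        total += 1
        if name.endswith("-SIG"):
            significant += 1
    return total, significant


# Hypothetical recognition results for a few of the form's checkboxes.
sample = {"1a": "Y", "1b": "N", "1c-SIG": "Y", "2a": "N", "2c-SIG": "Y"}
total, significant = count_deficiencies(sample)
print(total, significant)  # 3 2
```

Deriving the category from the field name means that adding or reclassifying a checkbox only requires renaming the field, not editing the counting logic.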