Whitepaper from UFC on Clustering in FlexiCapture

Clustering of Servers in ABBYY FlexiCapture

By Jim Hill

Introduction

Configuring an ABBYY FlexiCapture Distributed system in a cluster using Microsoft Windows Server Clustering provides many advantages. Each server in the cluster (called a node) can be configured for either failover or network load balancing. The distributed version of FlexiCapture has been architected to take maximum advantage of these features. Clustering and load balancing both have licensing implications, so please contact your ABBYY account manager for details.

ABBYY FlexiCapture Web Services API Example

Introduction

ABBYY FlexiCapture solves business challenges by capturing documents, extracting information from them, and using that information to drive business processes. Advanced automatic document classification and learning features, along with out-of-the-box solutions for common business challenges like accounts payable, have helped to establish ABBYY FlexiCapture as one of the most cutting-edge tools that companies can use to transform their business.

Whitepaper from UFC on ABBYY Recognition Server

ABBYY Recognition Server - Web Service API Sample

By Joe Hill

Summary:

ABBYY Recognition Server converts paper or electronic documents into compressed, searchable, archive-compliant files. It can also extract text and barcodes from paper or electronic documents and return the results in XML format. The results include not only the textual content that was found but also more detailed information, such as paragraph and line locations and text formatting. This information may be used to build additional applications that perform operations such as document indexing and searching.

This document shows how the ABBYY Recognition Server web service API can be used to convert a document into an archival format. It also shows how to extract text and barcodes from a document and how to use this information to power a document search application. The example is shown in the context of Microsoft Visual Studio .NET, but the concepts apply to any development and runtime environment that can call a web service, such as Java. The VB.NET code is presented for illustration only.

Whitepaper from UFC about Fixed vs Semi-Structured Forms

Fixed vs Semi-Structured Forms in ABBYY FlexiCapture

By Jim Hill

Summary: Anyone new to data capture faces an immediate decision: whether to take a fixed or a semi-structured approach to extracting data from a form. What constitutes a fixed form versus a semi-structured one, and what are some guidelines for distinguishing them in ABBYY FlexiCapture?

One day, my office phone rang. The caller had seen information about our products on our company web site. As a new salesperson tasked with selling document capture, I was used to answering this type of call, but this time the caller threw in an interesting spin: they needed ballpark pricing right now. They went on to explain that they needed to extract data from a particular document and export it to a database. I began to ask them about their document and how the information was laid out on the page. Was the information always in the same position on the page, or did it move around from document to document? Were there multiple versions of the document, and what annual page volume did they expect? These qualification questions were required because the product I was going to recommend depended very closely on the type and location of the data to be extracted from the form. Once I determined that they were most likely looking at a form in which the data moved around from document to document, I was able to provide them with the ballpark pricing they required. Without that information, a verbal estimate would have been impossible. Why? As you will learn in this article, the structure of a document is extremely important in determining the type of technology used to extract data from it.
