ABBYY FlexiCapture-FlexiLayout Part 1: Starting from Scratch

Learn how to use ABBYY FlexiLayout Studio for semi-structured forms.

Hello. Today I'm going to show you how to create a FlexiLayout from scratch. The first thing I'm going to do is show you the samples I have, which are registration forms. All of them look just a little bit different. You can see the text on the screen is located in different places.

The details that we want to extract are the names, the planet names, the spaceship number, the dates, and this bar code. All of this information is located at different spots. It's the same detail we want to capture, just located in different places, which is what makes this a perfect semi-structured form.

To design for that, we need what we call FlexiLayout Studio, which comes with ABBYY FlexiCapture. This studio is what gives us the ability to capture semi-structured forms. So the first thing we're going to do is create a project. We'll give it a location.

For this case, we'll just put this in our demo folder, and we'll give it a name. We'll call this our Halloween Registration Form, and we will start it. Now the project is going to ask us for some basic properties.

Now, let me warn you here that FlexiLayout Studio is going to require much more training than this video that you're going to see today. This is just a way to get you started, and there are a ton of features that we just won't have time to cover in this video. But for the sake of these images, we know that these are one-page documents.

So, I'm actually just going to select this and say the minimum number is one. The maximum number is one page, and that's just what we're going to process there. But as you can see here, there are a lot of different options we can set, and therefore FlexiLayout Studio does require training.

In fact, a week or two of training is typical for a new user. So we're going to set some properties. We're going to hit Okay, and then we're going to add some images. I'm going to browse to our folder, select the images that I just showed you, and load these on the screen.

And if I double-click each one of them, you'll see each image that we have. Now we're going to zoom out, and we can just scroll up and down to see each of these images. Now the important part about FlexiLayouts is that most of the information that we want to capture is text-based, meaning that we want to use text on the document to help us develop rules on how to extract the different things.

So the very first thing we're going to capture today is the planet name. For this one, we want to make sure that we capture that the planet name is this Mars satellite [inaudible 00:02:47]. We just want to capture that one field, the planet name, and you can see it's different on every one of these.

So the way we're going to do that, is we are going to use these rules within the FlexiLayout to determine how we map out the planet name. And it all comes down to what we call search elements. Search elements give us the reasoning behind why we looked for text, and then these blocks are what we are going to return to the software to extract.

So we may reference a bunch of text on the screen, but we really at the end of the day, only want to capture something, even though we use other text on the screen to map out where that something is located. So that's why there's a difference between search elements and blocks. But blocks are what we are returning to the software, and that's what we expect to return.

So the search elements are here. Now what we are going to do is, we're going to create a new search element. And we're going to right click, add an element, and you can see there are a lot of different element features here. For this case, we are able to determine a pattern on these documents.

You can see they're labeled, Your Planet Name. And if we look at every single document, you'll see that there is a pattern here with that name: even though it's sometimes located in different places, the text is the same. So we're going to create what we call a labeled field. And there are a lot of properties on every single field and element that you want to capture.

Now once again, we're not going to go into every one of these options today. This is definitely more advanced training that you would need. But just to get the concepts down, you can see we're going to say, "Hey. This is a required element." We're going to tell it what the label text of the field is. In this case it's, Your Planet Name.

And we're going to tell it where the field position is. We're going to say it's to the right, and if we remember looking at every one of these forms, it is to the right. So you can see there are a lot of different options here we can select, but I think in this case we're good enough to be dangerous.

So we're going to go ahead and select Okay, and we will have a labeled field. So now what we can do, is highlight and match these documents and if we select here, our labeled field, you'll see that we did capture a lot of the field. So we can kind of scroll down. On this document, you can see here.

So now what we want to do, is give this labeled field some intelligence. So for one thing, it's very important to name these fields correctly. So we're going to call this one LF, for labeled field, and we're going to say, Planet Name. And then, I notice that it wasn't capturing this whole text, so we're going to go ahead and make max length a little bit larger here. Okay.

And we'll go ahead and match one of these, just for fun again. I'll come down here to our Planet Name, and we may need to tweak that just a little bit more length-wise to get that field. There we go. There we go. So you can see here, I determined what the label is and what the field is, hence the name, this is a labeled field element.
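To make the labeled field idea a bit more concrete, here is a minimal Python sketch of the same logic: find the label text, look to the right of it, and stop once a maximum length is reached. This is not ABBYY's API or the FlexiLayout language. The Word class, the find_label and labeled_field functions, the coordinates, and the "Jupiter" value are all hypothetical and only there for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Word:
    """One recognized word and its bounding box (hypothetical structure)."""
    text: str
    left: int
    top: int
    right: int
    bottom: int


def find_label(words: List[Word], label: str) -> Optional[List[Word]]:
    """Return the consecutive words that spell the label, or None.

    Assumes the words are listed in reading order.
    """
    target = label.lower().split()
    n = len(target)
    for i in range(len(words) - n + 1):
        run = words[i:i + n]
        # Ignore case and a trailing colon such as "Name:".
        if [w.text.lower().rstrip(":") for w in run] == target:
            return run
    return None


def labeled_field(words: List[Word], label: str, max_length: int,
                  required: bool = True) -> Optional[str]:
    """Find the label, then collect the text to its right up to max_length."""
    run = find_label(words, label)
    if run is None:
        if required:
            raise ValueError(f"required label not found: {label!r}")
        return None
    label_right = run[-1].right
    label_top = min(w.top for w in run)
    label_bottom = max(w.bottom for w in run)
    # Field position "to the right": words that start past the label
    # and overlap it vertically (same text line).
    candidates = sorted(
        (w for w in words
         if w.left >= label_right and w.top < label_bottom and w.bottom > label_top),
        key=lambda w: w.left,
    )
    picked: List[str] = []
    length = 0
    for w in candidates:
        extra = len(w.text) + (1 if picked else 0)  # +1 for the joining space
        if length + extra > max_length:
            break
        picked.append(w.text)
        length += extra
    return " ".join(picked)


# Two made-up pages: same label text, different location on the page.
page_a = [
    Word("Your", 10, 10, 40, 20), Word("Planet", 45, 10, 90, 20),
    Word("Name:", 95, 10, 135, 20), Word("Mars", 150, 10, 185, 20),
    Word("Satellite", 190, 10, 250, 20), Word("SS-1234", 150, 40, 200, 50),
]
page_b = [
    Word("Registration", 10, 10, 100, 20), Word("Your", 200, 300, 230, 310),
    Word("Planet", 235, 300, 280, 310), Word("Name", 285, 300, 320, 310),
    Word("Jupiter", 330, 300, 380, 310),
]

print(labeled_field(page_a, "Your Planet Name", max_length=20))
print(labeled_field(page_b, "Your Planet Name", max_length=20))
```

With a small max_length such as 5, the first page would return only "Mars", which mirrors what happened in the video when the field was not capturing the whole text until the max length was made larger.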

The other thing we want to capture is spaceship number, so this is very similar. You know, it's the same kind of field. It's actually labeled the same on every form and that's also what we want to capture here. So it's very much the same. We're going to right click, Capture a labeled field. We're going to give it an intelligent name. We are going to say to the software that this is required.

We're going to give it a label. Give it the max length if you want, and actually we'll do this here. We tell the software where it's at and hit, Okay. So now what we will do in this case is look at the values, see what we return back for the planet and the spaceship, and you can see it's determining the label and the field for us just fine.

Now an element of FlexiCapture that is more advanced is learning how to use the hypothesis tree to determine the quality of the results you got. You can see here they're all green, however they do change color based on the quality. For example, if it's poor quality, it will go yellow or even different from that.
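As a rough way to picture the idea of hypotheses with a quality, here is a small Python sketch that scores several candidate matches for a label and keeps the best one. It is only an illustration and not how FlexiLayout Studio computes quality: the Hypothesis class, the use of a text similarity ratio as the quality, the 0.9 threshold, and the two-color mapping are all assumptions.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import List


@dataclass
class Hypothesis:
    """One candidate match for a label, with an assumed 0.0 to 1.0 quality."""
    found_text: str
    quality: float


def score(expected_label: str, found_text: str) -> float:
    """Assumed quality measure: plain text similarity, ignoring case."""
    return SequenceMatcher(None, expected_label.lower(), found_text.lower()).ratio()


def build_hypotheses(expected_label: str, candidates: List[str]) -> List[Hypothesis]:
    """Turn each candidate piece of text into a scored hypothesis."""
    return [Hypothesis(text, score(expected_label, text)) for text in candidates]


def best_hypothesis(hypotheses: List[Hypothesis]) -> Hypothesis:
    """Keep the hypothesis with the highest quality."""
    return max(hypotheses, key=lambda h: h.quality)


def quality_color(quality: float, good_threshold: float = 0.9) -> str:
    """Map quality to a display color; the threshold is a made-up value.

    Only green and yellow are modeled here; other colors are left out.
    """
    return "green" if quality >= good_threshold else "yellow"


# Hypothetical OCR candidates for the label "Your Planet Name".
hypotheses = build_hypotheses(
    "Your Planet Name",
    ["Y0ur P1anet Nane", "Your Planet Name", "Spaceship Number"],
)
best = best_hypothesis(hypotheses)
print(best.found_text, quality_color(best.quality))
```

The point of the sketch is just that a poorly recognized label still produces a hypothesis, only with a lower quality, which is the kind of thing the colors in the hypothesis tree are telling you.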

So it's very important to learn the hypothesis tree as you get more advanced here with FlexiLayouts. So what we've done so far, is we've told the software where to locate planet name and where to locate spaceship number. So now that we've told it where to locate it, we want to tell it to return that to us for referencing in our OCR or verification processes.

And the way that we do that is, we map these elements to blocks. So we're going to return a block, which is text, and we'll make it an intelligent block. We'll call it Planet Name, and then we want to tell the software that this planet name comes from that field, Planet Name.

And we actually don't want the label, but we want the field itself, so we're going to go ahead and hit, Okay, and hit Okay there. And of course, we have spaceship number as well, so we are going to say this is our Spaceship Number. Oops. I'm going to rename that here.

And then of course we want to tell it, what element is relevant to the spaceship number, which is the Spaceship Number labeled field, and we want the field part of that. So we're going to go ahead and hit, Okay, and hit Okay here. Now when we match these, you can see we get a little bit more intelligence here on the block.
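The mapping from search elements to blocks can be pictured with a short Python sketch. Again, this is not the FlexiLayout Studio API: the LabeledFieldResult class, the map_to_blocks function, the "Spaceship Number" label text, and the sample values are hypothetical. What it shows is that each element finds both a label and a field, but the block returns only the field part.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class LabeledFieldResult:
    """What a labeled field element found: the label and the field next to it."""
    label: str
    field: str


def map_to_blocks(elements: Dict[str, LabeledFieldResult],
                  mapping: Dict[str, str]) -> Dict[str, str]:
    """Build the blocks to return: block name -> field text of its source element.

    The label is used to locate the value but is deliberately not returned.
    """
    return {block: elements[element].field for block, element in mapping.items()}


# Hypothetical search results for one page.
elements = {
    "LF Planet Name": LabeledFieldResult("Your Planet Name", "Mars Satellite"),
    "LF Spaceship Number": LabeledFieldResult("Spaceship Number", "SS-1234"),
}

# Which element feeds which block.
mapping = {
    "Planet Name": "LF Planet Name",
    "Spaceship Number": "LF Spaceship Number",
}

blocks = map_to_blocks(elements, mapping)
print(blocks)
```

Keeping the element names (with the LF prefix) separate from the block names is the same reason the video stresses naming things intelligently: the elements describe how the value was found, and the blocks describe what is handed back.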

So the software is now going to tell us where it found the planet name and spaceship number. And in this case, when it's in green, it's telling us what the blocks are that we're going to return to the software. So it's very, very important. And those are just labeled fields. Now we can get very, very specific.

And in fact, in some of our future videos, we will get more specific on the elements that we can return, including bar codes and photos. But just understanding that FlexiLayouts are all about text is the important part here.

So you can see, we can match these, and if we just scroll down through every one of our samples here, you can see it's now determining, based on the label, where we found those fields.

So that is a very, very basic introduction to ABBYY FlexiCapture layouts, FlexiLayouts. I hope you learned something about this, and I look forward to exploring FlexiLayouts with you as we get fancier in our next videos. So, thank you very much.


Information about the Author
Travis Spangler
About Me
Articles by Travis Spangler: Travis writes articles dealing with various technical aspects of document capture and forms processing. He is fluent in Microsoft .NET and holds several certifications including ABBYY FlexiCapture and IRISXTract. As general manager and sales director, he controls the daily operations and manages customer accounts to ensure both customers and prospects are receiving the very best from UFC, Inc.