In this video you will learn how to set up field extraction training with a single variant in ABBYY FlexiCapture.
Hello. Today I’m going to show you how to set up field extraction training within ABBYY FlexiCapture. This gives us the ability to tell the software where fields are located without the need to create templates. It actually allows end users to tell the software where fields are located, and it does not require IT input, IT permissions, or IT consultants to do this. This is ideal for simple forms, what we call fixed forms, or forms that have a consistent structure. I want to show you how this is done. It’s actually a very simple process.
The first thing we’re gonna do is create a project. We’ll just call this our demo project. We’re gonna create a document definition, a brand new one. In our case, today, we’re gonna be doing transcripts, we’re actually gonna be processing transcript documents. So, I’m just gonna load a sample of our first transcript. Give it a name. We’ll tell the software it is OCR, it’s not a handwritten project, and we’ll finish. What it’s gonna do, of course, like usual, is put us into the document definition editor. Within the editor, we want to do a couple different steps.
The first thing we wanna do is add some fields here. So, for fun, let’s just do, maybe, student name. This is very, very important. We want to tell it it can have a region and should be matched. Secondly, let’s add GPA. We’ll call it our overall GPA, cumulative GPA. Let’s make sure we do these two setups, can have the region and should be matched. Once again, very, very important for any fields that we want to apply some of that end user training. Now we have two fields. Now notice it’s a little bit different than what we typically do in the document definition editor. We are not drawing regions, and that’s because we want the end user to tell us where these fields are when they process these documents.
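The two checkboxes mentioned above are easy to miss, so here is a minimal sketch of the rule in Python. This is not the FlexiCapture API; the `FieldSetup` class, its attribute names, and the `untrainable` helper are hypothetical stand-ins that only model the idea that a field needs both settings before end users can train it.

```python
from dataclasses import dataclass


@dataclass
class FieldSetup:
    # Hypothetical stand-in for a field in a document definition.
    name: str
    can_have_region: bool = False
    should_be_matched: bool = False

    def is_trainable(self) -> bool:
        # Per the walkthrough, both settings must be on for user training.
        return self.can_have_region and self.should_be_matched


def untrainable(fields):
    """Return the names of fields missing one of the two settings."""
    return [f.name for f in fields if not f.is_trainable()]


fields = [
    FieldSetup("Student Name", can_have_region=True, should_be_matched=True),
    # Deliberately misconfigured to show what the check catches.
    FieldSetup("Cumulative GPA", can_have_region=True, should_be_matched=False),
]

print(untrainable(fields))  # prints ['Cumulative GPA']
```

In the real product these are checkboxes in the document definition editor; the sketch only shows why it is worth reviewing every field before publishing.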
The next thing I’m gonna do is set up some settings on my document section. In this case, I’m just gonna give it a name. Then there is yet another very important setting, allow field extraction training. That is very important, because it’s what triggers the software to expect some end user training. It will give us a warning here, and that’s okay; it’s just telling us not to use static elements. I’m just gonna hit okay here, and then I will save and publish our document definition. What I’ve done is just set up a section with two different fields. So let’s go ahead and close this, and we’ll publish it.
Next, we will go to view field extraction batches and we’re going to create a new batch. When we do this, we’re gonna select the section and the document definition, we’re gonna select the default variant, and then we’re going to load some samples here. You want these to be similar samples. Obviously the data’s gonna be different on each sample, but the layout needs to be similar for this variant. The software is going to now process them. I’m gonna hit background mode and we will jump directly into this batch.
Okay, now that we’re within the batch, we can see the software recognized each transcript. Let’s just take a quick look at them. We’ll see that the layout is very similar: the student name is located here, at the top, and the cumulative GPA, or the overall GPA, is located down here, at the bottom. This is the same variant, just, obviously, different data. I’m just gonna show these to you quickly so you believe me that they’re different, and you can see there, they’re just different values.
The next thing I want to do now is go to the very first one and I want to start telling the software for this variant where the fields are found. Here, I’m gonna select student name. I’m, of course, just gonna lasso the values or click, and I’m gonna do the same thing with the other ones. I’m gonna tell the software here’s where the name is, here’s where the GPA is. Once again, here’s where the name is and here’s where the GPA is. So what I’ve done is I’ve set those fields for each of the samples and the next thing I’m going to do is tell the software to go to our training states.
We’re going to tell the software now, use what I provided you to train itself where to find these fields for this given variant. That’s what I’ve done. I’ve trained it. You can see some logging here is done, but it’s telling me it’s completed. Now, in order to test our setup, what we will do is create a working batch. I’m gonna create a new batch and I’m just gonna load those images. Now I will jump into the batch and you’ll be able to see that we will now have results for each of those documents that we’ve extracted.
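To make the idea of training on a fixed form more concrete, here is a minimal sketch in Python. It is not how FlexiCapture trains internally, which the video does not describe; it is an assumption-based illustration in which the regions the user drew on each sample are simply averaged per field. The `learn_regions` function and the pixel coordinates are made up for the example.

```python
from statistics import mean


def learn_regions(labeled_samples):
    """Average the user-drawn region of each field across all samples.

    Each sample maps a field name to a (left, top, right, bottom) box.
    """
    collected = {}
    for sample in labeled_samples:
        for field, box in sample.items():
            collected.setdefault(field, []).append(box)
    return {
        # zip(*boxes) groups all lefts, all tops, all rights, all bottoms.
        field: tuple(round(mean(coords)) for coords in zip(*boxes))
        for field, boxes in collected.items()
    }


# Illustrative coordinates only: two samples of the same variant.
samples = [
    {"Student Name": (100, 50, 400, 80), "Cumulative GPA": (500, 900, 560, 930)},
    {"Student Name": (104, 52, 396, 82), "Cumulative GPA": (498, 904, 562, 934)},
]

learned = learn_regions(samples)
print(learned["Student Name"])    # prints (102, 51, 398, 81)
print(learned["Cumulative GPA"])  # prints (499, 902, 561, 932)
```

Averaging only works because every sample shares one layout, which is why the samples loaded into the training batch need to belong to the same variant.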
Now that the software has recognized the documents, I double-click and go into the actual document. You’ll see here, since this is a live batch, that the software remembers that we trained the variant on the location. You can see, as I jump around between the fields, it has now located what we call the region of each of those fields. This is what we call training a single variant, or field extraction training. It’s a very nice tool. It gives the end users control over training, and it doesn’t require an IT admin to have previous knowledge of the document. It just gives us the ability to have full control of our destiny here with some of these new features. This one’s specifically called field extraction processing.
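What happens in the working batch can be pictured as applying a known region to a new page. The sketch below is a conceptual illustration under stated assumptions, not FlexiCapture code: `words_in_region` is a hypothetical helper, and the word boxes and regions are invented values standing in for OCR output and trained field locations.

```python
def words_in_region(words, region):
    """Join the words whose center point falls inside the region.

    words:  list of (text, (left, top, right, bottom)) pairs
    region: (left, top, right, bottom)
    """
    left, top, right, bottom = region
    hits = []
    for text, (x0, y0, x1, y1) in words:
        cx, cy = (x0 + x1) / 2, (y0 + y1) / 2
        if left <= cx <= right and top <= cy <= bottom:
            hits.append(text)
    return " ".join(hits)


# Illustrative OCR output for a new transcript of the same variant.
page_words = [
    ("Jane", (110, 55, 170, 78)),
    ("Doe", (180, 55, 230, 78)),
    ("3.85", (505, 905, 550, 928)),
]

# Illustrative regions standing in for what training produced.
regions = {
    "Student Name": (100, 50, 400, 82),
    "Cumulative GPA": (495, 900, 565, 935),
}

extracted = {name: words_in_region(page_words, box) for name, box in regions.items()}
print(extracted)  # prints {'Student Name': 'Jane Doe', 'Cumulative GPA': '3.85'}
```

The point is only that, on a fixed form, a stored location is enough to pull the value from a document the software has not seen before.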
Now, this is a single variant. We do have a slightly deeper process when we work with multiple variants of a transcript, which will be in our next video. We look forward to showing this to you. I hope you enjoyed this video and understanding how we can utilize this software to make it a lot easier for us to process our business documents. Thank you so much for watching.