Watch a video to learn how to create a Document Skill with ABBYY Vantage’s Document Skill Designer.
Hello. Today I’m going to show you how we create our very first Document Skill within ABBYY Vantage. Now, when we’re logged into Vantage, what we want to do is go to our Skill Designer and select “Document Skill”. We can give the skill a name and then we can proceed. The first thing we want to do is upload some samples of the documents that we want to teach the software how to extract. So I will do that. I’m going to upload two different samples. In practice, you would want multiple samples, and potentially multiple samples of even the same format, if that’s where you’re going. So for this demo we will upload just two different samples, but in practice, we would have many more. We will continue moving down our action screen.
So we’ve uploaded documents. Now we want to label and create some business rules. This is where we tell the software where the information that I want to extract is on the document. It’s actually very, very simple. You can see the types of fields that we can extract here. We can do tables. We can do barcodes. We can do checkboxes. We can group elements. And of course we can grab any text on the document. What I’m going to do is I’m not going to touch anything besides lassoing the field on the sample that I want to extract. When I click that, you’ll see a new field is populated here on the right. For the sake of this demo, I’m just going to select a few different fields. And the software will add the fields here. I will go ahead and give them a proper name.
The other cool part about Vantage is we get the ability to group fields. So just for fun, I thought I’d show you how we can group, for example, the banking information. So what I’ll do is I’ll click “Add Group” here. I can rename the group. And as you’ll see, I can click to create an item. And here, once again, we just lasso the fields. And then what we can do here is rename the fields.
The other thing that we can do at this point is obviously we have checkboxes on this form, which are going to be pretty key for the type of data that we want to extract on this document. And so what I can do is I can click the “Add Checkbox” or “Checkmark” field. The software will put this checkmark here, and now I need to teach it where it’s found. So I’m going to click it and I’m just going to tell the software it’s located here. And then I’m going to give it a proper name. And I will do the same thing for the savings account. Once again, we will name that properly.
One of the important things to do when you use a checkbox is to tell the software which one is the properly checked value. So in today’s document here, I’m going to unselect checking just so the software knows, from my click, which one is properly checked.
Now that I’ve defined the fields for one document, I can do it for the next document. Even though the document looks different, that’s okay. We’re going to teach the software where to find this information by clicking here on the right. And all we do is we click in the field and then we click the text on the document. Here we’ve got our banking information field.
And then remember, on the checkboxes, we just want to teach the software where they’re found. So we click and point. And then the last thing we want to do on the checkbox is make sure the one that’s selected on the document is properly selected on the form. So since checking is the one selected on the document, we’re going to keep that selected there.
Now I’ve labeled my two documents. In reality, we would want more, but for today’s demo, we will move on to the next available action, which is training. I will click the training button and eventually we’ll see some results here in the results tab. Now we can look at the results tab for this pair of documents. We have the fields listed and the banking information listed. We can tell which fields the software found correctly and which ones it did not. In this case, they’re all correct. These are pretty basic forms, however, so we would expect that. And we can see specifically which documents were correct, and if there were incorrect findings on the form, we would have an incorrect option here as well.
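To make the idea of the results tab a little more concrete, here is a minimal sketch of how per-field correct and incorrect counts could be tallied across a set of labeled documents. This is only an illustration and not how Vantage computes its results; the function name, the dictionary layout, and the sample field names are all assumptions made up for this example.

```python
from collections import defaultdict


def field_accuracy(labeled, predicted):
    """Count, per field, how many documents matched the labeled value.

    Both arguments are hypothetical structures of the form
    {document_id: {field_name: value}}; they are not a Vantage format.
    """
    totals = defaultdict(lambda: {"correct": 0, "incorrect": 0})
    for doc_id, expected_fields in labeled.items():
        found = predicted.get(doc_id, {})
        for field, expected in expected_fields.items():
            # A missing prediction counts as incorrect.
            key = "correct" if found.get(field) == expected else "incorrect"
            totals[field][key] += 1
    return dict(totals)


# Illustrative data only: two documents, one text field and one checkbox.
labeled = {
    "doc1": {"bank_name": "First Bank", "checking": True},
    "doc2": {"bank_name": "Acme Credit", "checking": False},
}
predicted = {
    "doc1": {"bank_name": "First Bank", "checking": True},
    "doc2": {"bank_name": "Acme", "checking": False},
}
results = field_accuracy(labeled, predicted)
```

With only two samples, as in the demo, a single miss moves a field from fully correct to half correct, which is one reason more samples are recommended in practice.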
Just like we do with any other skill, the very last action that we have is to publish. Before I publish, I want to show you one more thing about every field that you have. If I go back to the editor, I want to point out this little gear. This is our field options. The options that we have on a field are to make it a required field, and we also have advanced properties. And then we can of course create business rules. Today we’re not going to talk about business rules, but I will click the button just so you get an idea of what a business rule is. We can look up information. We can merge fields. We can compare fields. We can even script fields here. But what I really wanted to call your attention to is how we zone in and teach the software about the type of data for that field, because not all data is just text. We have different data types that are supported. We can also mark fields as required fields or key fields or dimension fields. And we can provide more context around that if you have questions. We can also set a maximum length and, sometimes more importantly, define the type of data that we want to extract from the field using a regular expression. And there is a regular expression editor. So this is how we really zone in and make sure that we’re teaching the OCR about the type of data that we expect to find in that field.
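As a rough illustration of what a required flag, a maximum length, and a regular expression do for a field, here is a small sketch using Python’s standard `re` module. The field names, the patterns, and the `validate_field` helper are hypothetical and are not taken from Vantage; the regular expression syntax that the Vantage editor accepts may also differ from Python’s.

```python
import re

# Hypothetical field rules for the banking group in the demo.
# The names, patterns, and lengths are illustrative assumptions only.
FIELD_RULES = {
    "routing_number": {"pattern": r"\d{9}", "max_length": 9, "required": True},
    "account_number": {"pattern": r"\d{4,17}", "max_length": 17, "required": True},
}


def validate_field(name, value):
    """Return a list of problems found for one extracted value."""
    rule = FIELD_RULES[name]
    if value is None or value == "":
        # An empty value is only a problem when the field is required.
        return ["missing required value"] if rule["required"] else []
    problems = []
    if len(value) > rule["max_length"]:
        problems.append("exceeds maximum length")
    # fullmatch requires the whole value to fit the pattern, not just a part.
    if re.fullmatch(rule["pattern"], value) is None:
        problems.append("does not match expected pattern")
    return problems
```

For example, `validate_field("routing_number", "12345678O")` flags the value because the OCR read the letter O where a digit was expected, which is the kind of mistake a pattern on the field helps catch.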
Once again, the last thing we’ll do here is publish. I’m going to go ahead and publish our skill, and that will take us back to the skills catalog, which again is where we live to find all of our skills.
And I hope you’ve enjoyed this video on creating your very first Vantage Document Skill. Thank you.