Configuring Multipage Documents in FlexiLayout Studio

ABBYY FlexiCapture 12 Video – Configuring Multipage Documents in FlexiLayout Studio

Watch our video in order to learn how to configure multipage documents in ABBYY FlexiCapture FlexiLayout Studio.

Hello. Today I’d like to show you how we can process a multipage document, or even a single file that has multiple individual documents in it, such as a PDF that contains several documents. What we’re going to do is create a new FlexiLayout project. We’ll give it a proper name and then we’ll move on to the properties window. This properties window shows us, down here at the bottom, a checkbox that says “Allow multipage documents”. This is very, very important. In this scenario, we do want to set a proper minimum number of pages and, if we know it, a maximum number of pages. Sometimes in a business situation we may not be able to predict the maximum number of pages a document can have. In that case, if we don’t know the maximum, we can write INF, which stands for infinity, and that means the software is going to allow us to have a variable number of documents or even pages within a document.
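To make the minimum and maximum page setting concrete, here is a minimal Python sketch of the idea. It is not FlexiLayout Studio code; the function names are hypothetical and it only assumes that a limit is entered either as a whole number or as INF.

```python
import math

def parse_page_limit(text):
    """Turn a page-limit entry such as '3' or 'INF' into a number."""
    cleaned = text.strip().upper()
    if cleaned == "INF":
        return math.inf  # no upper bound on the page count
    return int(cleaned)

def page_count_allowed(page_count, minimum, maximum):
    """Check a page count against the minimum and maximum entries."""
    return parse_page_limit(minimum) <= page_count <= parse_page_limit(maximum)
```

With a maximum of INF, a 250-page file passes the check, while a maximum of 3 would reject a 5-page file.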

The next step that we’ll do is we will upload a document. I have a PDF here that I’m going to show you. This PDF obviously has multiple pages. This is the price list document, and so I have prices that span multiple pages. However, you’ll notice that this is the beginning of a brand new price list even within the single documents. So in today’s video we have one PDF that contains multiple pages but even contains multiple documents with multiple pages. So what we’re going to do is upload this document. I’m just going to drag and drop the document and let the software process this.

Now that the documents are processed, we’re going to assign proper search elements to them, specifically a header and a footer. The header and footer are critical in telling the software where a document starts and stops, and they are also critical when we want to automatically split a file into documents and process them accordingly. For today’s demo, I’m actually going to disable the footer. If you can predict the footer of a document or a set of documents, it’s very important that you define it, but today, to keep the demo simple and short, we will just show you how to add a header. What we’re going to do is add a static text header and look for the words “Price List” at the beginning of each document that I’ve shown you here. So I’m just going to call this my “Price List” header.

We’re going to look for the word “Price List”. And before I move on, I’m going to make this a required element. So in other words, in order for a document to be considered a header, it will have to find the word “Price List” on the document in some location. Now that I’ve done that, I’m going to go ahead and show you what happens if we right click and match the whole batch of documents here. First off, what I’ll show you is there are lines that come up here. The blue line indicates what I would refer to as a baseline or even a reference line. That is what we are telling the software we know about the document and how a document starts and stops. The red line indicates that ABBYY is recognizing something different than what we are considering the reference or the baseline and the reason why that is is because we’ve uploaded a document with multiple documents within it so we haven’t been able to split it at this point.

After we’ve right-clicked and matched, the software is saying that it recognizes the words “Price List” on page one and page four of the file, which makes it two separate documents. So in this case it may be helpful for us as end users to go ahead and set what we call a reference document. To do that, we right click on those pages and assemble them into a reference document. Now I’m going to uncheck this last one because I disabled my footer. What we’re doing here is telling the FlexiLayout how to reference this or, to say it another way, what the truth or the reality is of how these documents are split. Now that I’ve processed them and set that reference document, you can see here that instead of a red line we have a green line, and the green line is simply an indicator that ABBYY FlexiCapture agrees with how we’ve split the document.
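The splitting idea can be sketched in a few lines of Python. This is only an illustration of the concept, not how FlexiLayout Studio is implemented: the page texts are made up, and treating the first page as a document start even without a header is a simplifying assumption of the sketch.

```python
def split_on_header(pages, header_text="Price List"):
    """Group page numbers into documents; a page with the header starts a new one."""
    documents = []
    for number, text in enumerate(pages, start=1):
        # Assumption: the very first page always opens a document.
        if header_text.lower() in text.lower() or not documents:
            documents.append([number])
        else:
            documents[-1].append(number)
    return documents

# Hypothetical five-page file with the header on pages one and four.
pages = [
    "Price List 2019",
    "widgets continued",
    "more widgets",
    "Price List 2020",
    "gadgets continued",
]
detected = split_on_header(pages)   # what the software finds (red or green line)
reference = [[1, 2, 3], [4, 5]]     # what we told it is true (blue line)
agrees = detected == reference      # a "green line" when both splits match
```

Comparing the detected split with the reference split plays the role of the colored lines: when the two agree, the line would be green.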

So if we select them all and right click match, you can see once again that the items in blue are what I told the software the split should be in reality, and green is what ABBYY is actually performing automatically. So in this case, when we see a green line that matches the blue line, that means the software agrees with what we’re telling it the actual splitting should be. To emphasize, though: when we’re processing a multipage document, the most important thing you should be able to do is set up the header, and a footer if applicable. The header and footer are what tell the software when to start and stop a new document, and in the multipage or multidocument world, this is absolutely critical. From here, of course, we can process these and add additional elements. If you’re not sure how to configure additional elements, please reference our video library for additional how-to videos. But once we’ve been able to split the document and have the pieces referenced correctly, with these green lines matching our blue lines, we have the ability to process a multipage and potentially even a multidocument file. I hope you’ve enjoyed this video. If you have any other questions, please reach out to us. Thank you.

ABBYY FlexiCapture 12 Video – Creating Alternative Layouts

Watch our video in order to learn how to create an alternative layout in FlexiLayout Studio in the ABBYY FlexiCapture intelligent capture system.

Hello. Today I’d like to show you how we can use ABBYY FlexiLayout Studio to create an Alternate FlexiLayout or what we sometimes refer to as an Alternative FlexiLayout. And the reason why we would have an alternate is because we want to extract the same information from a variety of different forms, but the look and the feel and the structure of those forms are drastically different. And in this case we would want to tell the software that we’re going to use alternates and we’ll direct the software, how to figure out which alternative to use. The first thing I’m going to show you is what the documents look like. I have two transcripts and one is just from a generic like a homeschool document and the other one is from a document from Doe High School and I’ve already created the first one for us so that we don’t have to worry about this and waste video time to do so.

But this is the typical structure where you would add your search elements, define the header so that we can figure out which form this is and process those accordingly. Now I’m just going to add an alternate tier by right clicking at the very top of the tree and go in to add FlexiLayout Alternative. When I do that I will get a new set of search elements and you can see here I now have brand new search elements. And at this point I may want to rename these so that I can tell later on downstream which alternate is selected by the software. In this case, our typical document processing and FlexiLayout design happens just the same, so we will define a header and the header is very crucial because this is going to tell us how we locate this document and what makes it stand apart differently than the other ones.

And then of course we can have a footer. I disabled it in this case, but we can reference other footers. Now for ease of this video, I’m not going to describe how we would go and process the results accordingly, but you can see, I’m trying to find a student name and a GPA and just like any other FlexiLayout, we can add logic to go find that information using labeled fields and relations and all those cool little features that we have within the FlexiLayout Studio. But for today’s demo, I’m not going to do this. I’m just going to create the placeholders. So I’m just going to say this would be our student name and then of course we would provide logic and how to extract that in your forms and maybe I’m just going to select a different one for GPA. So now that we have the information being extracted and of course you can test and tune your documents, we would create a block. When we create a block, we will add the type of block, for example, student name, and this is where it actually becomes pretty important. We want to tell the software for this layout. So when you go to the Homeschool layout, I want you to reference the Homeschool student name and then you want to select alternative layouts and make sure you set the source element as well. And the software will automatically, of course with intelligence go to the proper FlexiLayout search element for us here.

So now I can see in the Doe Alternate FlexiLayout, I want to get it from this source and in the Homeschool I want to get it from this source and just to keep it very basic, we’ll do another one. This is going to be our GPA.

And in the Homeschool one we want to reference this GPA, and in the Doe one we want to reference this GPA from this source element. So creating these different Alternate Layouts has given us a lot of flexibility in controlling how the software extracts that information. Now let me just be clear. The normal design and logic of a FlexiLayout still happens. So you want to define your headers and footers when and if applicable. You want to define your search elements, your relationships, and your grouping, just like normal. The only thing we’re dictating here is which FlexiLayout Alternate is being used. Now, something that is handy to do when you have an alternate is to set the layout here. So we know that this one is going to be Homeschool.

And we know that this one is going to be Doe, and the cool part is that now, when we match, we can see here some logic that tells us when a document is referenced. So if it’s a certain check mark, we can tell that the software has referenced those for us. Now, just like normal in FlexiLayout Studio, we would want to process these and test and make sure that what’s matching is green. And then of course we would export it. And just to show you how things look on the other side, I will open our project setup station.

So in our project setup station I’ve created a document definition and uploaded that FlexiLayout that we exported. But to show you the results here, you can see that even though I have one document definition, the content and location of those fields differ based on the layout. So this is a very cool way for us to kind of be able to dictate which layout the software is using, but have one FlexiLayout so that way we don’t have to have multiple document definitions for the same content. And so later on downstream we can have the same workflow, the same rules, the same export path that every other document in this document definition will have. So it’s a really good way for us to have one document definition and apply one FlexiLayout with Alternates here. So I hope you enjoyed this video! Please feel free to reach out to us if we can help you with anything!
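One way to picture alternates is as a lookup: the header decides which alternative applies, and each block then reads from that alternative’s own source element. The Python sketch below is only an illustration. The header strings and element names are hypothetical, not taken from the project in the video.

```python
# Hypothetical alternatives: header text and element names are illustrative only.
ALTERNATIVES = {
    "Homeschool": {
        "header": "Homeschool Transcript",
        "sources": {"StudentName": "HomeschoolStudentName", "GPA": "HomeschoolGPA"},
    },
    "Doe": {
        "header": "Doe High School",
        "sources": {"StudentName": "DoeStudentName", "GPA": "DoeGPA"},
    },
}

def pick_alternative(page_text):
    """Return the first alternative whose header text appears on the page."""
    lowered = page_text.lower()
    for name, alternative in ALTERNATIVES.items():
        if alternative["header"].lower() in lowered:
            return name
    return None

def source_element(block_name, page_text):
    """Resolve which search element feeds a block for the matched alternative."""
    name = pick_alternative(page_text)
    if name is None:
        return None
    return ALTERNATIVES[name]["sources"][block_name]
```

The block names stay the same for every alternative, which mirrors the point made above: one document definition, one set of fields, and only the source behind each field changes.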

ABBYY FlexiCapture 12 Video – Creating Tables in FlexiLayout (Basic)

Watch our video to learn how to create a basic table in the ABBYY FlexiCapture FlexiLayout Studio application and the power of this innovative solution.

Today, I’d like to show you how we use FlexiLayout Studio and our FlexiLayout templates to extract tables or tabular information from documents. Now I’m going to show you my samples real quick. I have a document that has two pages’ worth of tables. We have a part number, a description or a name, and then we have a price, and it’s two pages long. And then my last sample here is another table where we have a name or description and we have a price, but we don’t have a part number. We want to extract this information from the documents and make sure that we process them accordingly. So what I’m going to do, and this is probably one of the most important first steps, is add a block, and this is going to be a table block. What we’ll do here is create our columns. As I shared with you, we have a part number, we have a name or a description, and then we have a price. Okay, so we’re simply going to define our fields here. Just to keep this video a little bit speedier, we’ve already assumed that we know the headers, or how a document starts in our case, so let’s go ahead and add a table element. Now when we add a table element, we can of course give it an intelligent name and we can tell the software here about the columns. But if you notice, it’s actually looking for a block. That’s the block that we created in the first step, so we’re just simply going to tell the software about that block. Now of course we’d want to name this something intelligent, like a table block, or something similar to that. So you can see here we have part number, name and description, and a price.

If you look at the properties of these, you can see we can get very, very specific. Now for ease of our first table, we’re going to keep it very simple and just use keywords as part of the name. But you can see that we can reference other elements of a document to tell us when a table starts and even when it stops. So there’s a lot of different configuration we can do here. I won’t do it for ease of this demo, but obviously there’s a lot of flexibility here when we’re extracting tables. For this demo, I know that the word “description” is sometimes found or the word “name” is sometimes found, so I’m just simply going to tell the software that this can be name or description by using our pipe symbol there.
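The pipe symbol simply lists alternative keywords for one column. As a rough Python illustration of that idea (the function name is hypothetical, and the real matching in FlexiLayout Studio is more tolerant than this exact comparison):

```python
def matches_any_variant(search_text, column_header):
    """Return True if the header equals one of the pipe-separated variants."""
    variants = [variant.strip().lower() for variant in search_text.split("|")]
    return column_header.strip().lower() in variants
```

So a column configured as `Name|Description` would accept a header that reads either “Name” or “Description”, but not “Price”.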

We have other columns that are pretty relevant to us. If there’s a fixed column order, we want to tell the software that, and I’ll just double click here so you kind of see what we do, but it’s really just defining your own order. You can have an array of these, so if you have multiple different ways, a document table comes in, you can of course set multiple different column orders for that. Here we’ll tell the software to use a header and we can even tell the software to look for a footer that is optional by default and we can tell the software how to detect rows. Now you can understand for this demo that I’m keeping it very simple and these are very simple tables, but I want you to understand the amount of flexibility you have here. As you just looked at those three tabs, there’s a lot of different options that we can use using source elements and other search elements that we have to get our tables zoned in here for us.

So once again, I have our columns here applied. I’m just going to go ahead and set these up to be processed. I’m just going to right click on our first item here and match. And what I’ll show you here is that we were able to extract the table. Now you can of course double click on the table and see some more of the specifics here. And you can see we’ve got our part number column, our name slash description column, and our price column. And then here on my other sample, I will go ahead and match this one as well. By clicking on the table element, I once again have the name, description, and the price, and notice how in my demo here I don’t have a part number field. And there’s configuration of course that can either let that be optional or force it to be required.

But this is pretty awesome in how we extract a table. It’s very easy and it’s very flexible in the way we do it; a lot of different ways that we can configure it to do what we want it to do. But once I’ve extracted the table here, just as our usual practice, we can export this to our AFL file and then upload that to a project. And now we’re processing tables accordingly. I hope you enjoyed this video. Please let me know if you have any questions and thank you so much!

ABBYY FlexiCapture 12 Video – Creating Tables in FlexiLayout Studio (Advanced)

Watch our video in order to learn a more advanced method for creating tables in the ABBYY FlexiCapture FlexiLayout Studio by utilizing repeating groups.

Hello. Today I’m going to explain to you how we create tables, but in a more advanced way. Sometimes we get in situations where there’s repeating information or tabular information on a document, but we can’t use a table element as shown in the first, basic video on table extraction. So we have to use a strategy based on repeating groups, and repeating groups give us the ability to grab that repeating information but with maybe a little bit more intelligence or a little bit more complexity. As you can see on some of my samples here, I have documents that have information in tables. This first page is pretty easy, but then we start getting into situations like what you see here, where we have some information at the top, and then there’s some white space, and then there’s another section and some additional repeating information towards the bottom.

And although in some situations we can use tables for this, it’s probably more appropriate that we use some sort of repeating group elements so that we can tell the software how to find a row and repeat itself as the rows continue. So let’s walk through this. I’m just going to go back to our first page here. Prior to this video, I did set up the ability for us to ignore a header on these documents. Every single page of mine has a header, and to keep this a very simple, very pointed video, I’ll skip over how we did that. Really all we’re doing is we have our group here where we’re ignoring the top header, so that it doesn’t add any complexity to our video on using repeating groups. So the very first thing I’m going to do is figure out how we can map a table, and when it comes to a table we want to anchor the table by its columns initially.

So we know that name is a very good anchor, and as I go across this I can see that name isn’t always listed, but if I can find the name column once, then in this case, on this price list, I can see where that column is referenced on each of these subsequent samples here. You can see here is another document with a name, and then of course it goes down. And the last document here also has a name column. They’re also structured very similarly. In this case what I’m going to do is find the name column. So I’m just going to create a search element and it’s going to be static text, and I’m going to say go find me the name column. I can even call this our static name, and I’m going to tell the software to find the name column. Once again, when you create a FlexiLayout, your documents will be different than mine and some of your concepts may be a little different than mine, but the way we attack them is probably very similar. So use your own names and your own types of elements. But in my case, I want to anchor off the name, so I’m going to tell the software here to always find it on the first page.

And then I’m also going to tell it to always find the one at the top. So if we do see the word “name” multiple times in a document, which should probably be fairly common in this type of document, we’ll tell it to always give me the one that’s nearest to the top. In other words, go find the one that’s at the very top of the document. Here is the very top name column. So there we’re going to get the name column. Now it’s very important that we find a field like price to make that our anchor per row. Once again, we use the name field as our anchor per table, but now we want the price field to be our anchor per row, because that’s very consistent. Now, just because I know how the software works, I’m going to create what we call a separator, because I’m going to have the software repeatedly find me this row, but I need to somehow find the separator here, and that separator will tell me where the price begins. So I’m just going to create an element and it’s going to be a separator. A separator is simply a line, and I’m just going to give it a name, and it’s going to be a vertical separator.

And then we have our relations here. I’m just going to tell it to use the name field and go to the right so that we find the separator to the right. So go find the name field and we’re going to say to the right and I’ll probably give it some sort of offset. And once again your documents may be a little different, but I’m just going to let it kind of push that over to the right there a little bit. So we have some room. And lastly what I’ll do here is say go find me the one nearest to the name. Now this is making the assumption that I have a good solid barrier here or a line in between the name and the price column. So I’m just going to match this document so you can kind of see what we did here.
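The two choices made so far, taking the “name” nearest to the top and then the separator nearest to it on the right, can be sketched geometrically in Python. This is an illustration only: the `Rect` type, the coordinates, and the way the offset shifts the search area are assumptions of the sketch, not FlexiLayout Studio’s actual behavior.

```python
from typing import NamedTuple

class Rect(NamedTuple):
    left: int
    top: int      # smaller values are higher on the page
    right: int
    bottom: int

def topmost(candidates):
    """Pick the candidate nearest to the top of the page."""
    return min(candidates, key=lambda rect: rect.top)

def nearest_separator_to_right(anchor, separators, offset=0):
    """Pick the separator closest to the anchor among those right of it."""
    # Assumption: the offset pushes the start of the search area to the right.
    to_the_right = [s for s in separators if s.left >= anchor.right + offset]
    return min(to_the_right, key=lambda s: s.left - anchor.right, default=None)
```

For example, with “name” found at both `Rect(40, 300, 90, 315)` and `Rect(40, 100, 90, 115)`, `topmost` returns the second one, and the separator search then starts from its right edge.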

And that’s the column. So we found name and we found a separator. Now that we know where name is and we know how to section off the page because we have the separator, we can get into some intelligence. So what I’m going to do is create a repeating group, and now we’re going to start building this table. When we create the group, we’re just going to call it our table field. And we’re going to create some intelligence here. So we’re going to tell the software to ignore, in my case, the header, and this is just once again special to my documents here.

So we’re just going to tell the software to ignore all instances of this header so we don’t get any confusion. And I’m going to tell the software that it’s going to find these below that name field. So we’re just going to say below. And so the software is going to say, okay, now I have repeating information that’s going to be below the name field. Pretty simple here. We have our table. Now let’s focus on price, because once again, price is our anchor. If I can find a price, I can find everything else related to this row. So what I’m going to do is create an element, and in this case we’re going to consider it a character string. This is going to be our anchoring field, so I’m just going to call it cs price. Cs stands for character string. And I’m going to add my own alphabet here because I know we have some common characters that we find. Let’s just add common things that we see in documents or in prices here.

And then lastly, I’m gonna create a new relation. I’m going to say, okay, now that you can find these characters, go find it to the right of the name of this separator here.

I’m just going to go ahead and apply this and just so you can see it take place, I’m going to go ahead and walk through this. So here’s our name column. You can see it highlighted, here’s our separator. And then lastly you can see our price and we’re able to capture all of these currencies. And then if I of course match another page, it’ll be very similar because I have a name and I have a separator and I have the prices as well. So now we have price and that’s going to be our anchor. That’s our row anchor. So what we would commonly like to do as a best practice is we create a group and we’re gonna call this group our row. And this will be how we define the full row. So in this case, our row, we actually have properties on a given row and we’re just going to tell the software, don’t find the row if you don’t find the price. In other words, you don’t even attempt to grab the row if we don’t have a price. And maybe something else we’ll do is we’ll say go find this information if it’s below the price name.

We’re just going to give it some room here to create its own square so that we can tell the software how to anchor in and kind of rectangle in this given row. We’re going to say go get the price and let’s say it’s below the top of the price. Maybe even give it some offset here, a negative offset. So that just gives us a little bit taller of a rectangle and then we’ll say it’s above the bottom in a similar fashion of price.
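Boxing in the row around the price can be pictured as taking the top and bottom of the price and padding them a little. The Python sketch below is illustrative only; the padding value and function names are hypothetical, and y coordinates are assumed to grow downward.

```python
def row_band(price_top, price_bottom, padding=5):
    """Return (top, bottom) of a band slightly taller than the price itself."""
    # The negative offset in the demo corresponds to moving each edge outward.
    return price_top - padding, price_bottom + padding

def in_band(element_top, element_bottom, band):
    """Check whether an element lies fully inside the row band."""
    top, bottom = band
    return element_top >= top and element_bottom <= bottom
```

Anything that falls inside the band, such as the name or part number on the same line, can then be treated as part of that row.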

Now we have a group and that group defines that whole row. What we’ll do here is just go ahead and run one. Now if I go into my table in my repeating group and I want to find the row there, you can see in gray how the software has now mapped out the row. So it found the price, and now that we’ve found price we’ve structured it to actually find the whole row itself, and now we can be more intelligent within the group without adding a lot more complexity. We’re just going to go ahead and add what we call a character string, and we’re going to call this the name column, and we’re just going to tell it to grab any character it finds. But the important part about the name is that we’re going to grab the one that’s to the left of price. So in this case we’ll just simply tell the software to grab price and go find me this to the left of it.

And just to be a little bit more intelligent here, we’re going to find the one nearest to the right edge of the page.

Now what we’re going to do is add an element for the part number and once again, that’ll be a character string. And we’ll call this cs part number. Once again, we’ll grab any characters we can, we can add a little bit more intelligence because now that we’ve found the name, we can say it’s to the left of the name.

And also just to be careful, we’ll say go get me the character string closest to the left of the page. And that’ll just kind of give us a little bit more insurance there that we’re grabbing the right fields. So what I’ll do now is go ahead and match that first one and I’ll show you here. We’ll dive into some of the rows. And as you can see, not only am I grabbing the price, but we’re also grabbing the part name and a part number. The cool part about this is that as I match a whole document, I’ll show you here all of the elements. You can see we’re grabbing all of these elements off the table, even when the document spans multiple pages. So this is a very cool way to do it. And of course we can process these through the other samples. At this point, what we will do is create a repeating group block, and that will pass the information back to the FlexiCapture application. So I’ll call this our repeating group table. We’ll give it a source element here, and you can see we’re going to tell the software to grab it top to bottom. That’s very helpful because typically when we read a table, we want to read it as we’re seeing it on the screen. We will then add the additional fields.
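Putting the row logic together: price anchors the row, the name is the nearest text to its left, the part number is the text closest to the left edge, and rows are read top to bottom. Here is a minimal Python sketch of that reasoning. It is not FlexiLayout code; recognizing a price by a leading dollar sign, the 5-unit line tolerance, and the `(text, left, top)` word tuples are all assumptions made for the illustration.

```python
def build_rows(words, line_tolerance=5):
    """Assemble table rows from (text, left, top) tuples, using price as the anchor."""
    prices = [w for w in words if w[0].startswith("$")]  # assumed price pattern
    rows = []
    for price in sorted(prices, key=lambda w: w[2]):  # read top to bottom
        same_line = [
            w for w in words
            if abs(w[2] - price[2]) <= line_tolerance and w[1] < price[1]
        ]
        if not same_line:
            continue  # no row content without something left of the price
        name = max(same_line, key=lambda w: w[1])  # nearest to the price
        left_of_name = [w for w in same_line if w[1] < name[1]]
        part = min(left_of_name, key=lambda w: w[1])[0] if left_of_name else None
        rows.append({"part_number": part, "name": name[0], "price": price[0]})
    return rows
```

A sample without a part number column still produces rows, with `part_number` left as `None`, which matches the earlier note that the field can be optional.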

And now we have our blocks created. At this point, what we would do is we would save our results, go to file and export and we’ll generate that AFL file that we’re familiar with when we are working with FlexiLayouts. From here what we would do then is we would create a document definition, of course if you’ve done this before, I’m actually going to show you something pretty cool about repeating groups that I think is very helpful, especially in a lot of business ways that we read a repeating group or a table. So what I’m just going to do is go ahead and create us a new document definition. I’ll load that FlexiLayout just so that we have it convenient to us.

And you can see here we have our table and I have a test sample. I’m just gonna go ahead and run this test sample just so you can see typically how a repeating group looks and you can see we have each row and it’s outlined here and as I click, the software will highlight for me where it’s at. Now this is a default way of how a repeating group looks. Now, sometimes we like repeating groups to look like tables. So in this case, and this is a very neat feature, you can right click and say, show as table. So now instead of breaking them out into separate, repeating groups with repeating rows, it will actually format it as a table. So now when we run a test here, you can see it looks like a table. It feels like a table just as if the user was reading it on the documents. So this is a very cool way and flexible way to extract information from tables, especially with repeating groups, because we have a lot of control over how we structure it. And sometimes that gives us an advantage instead of using a table element. I hope you have enjoyed this video. If you have any questions, please feel free to leave a comment for us. Thank you so much!

ABBYY FlexiCapture Video – Nested Repeating Groups in FlexiLayout

Watch our video to learn how to set up nested repeating groups in ABBYY FlexiCapture FlexiLayout Studio. Nested repeating groups can be useful in many ways, including collecting data from complicated tables where the table element cannot be used.

Hello. Today I’m going to show you how we add nested repeating groups in a repeating group scenario. We already covered another video where we just focused on one repeating group within a document, but sometimes we don’t get that luxury and we actually need to add what we call nested repeating groups to a FlexiLayout template, so that we can extract traditional repeating information on a document.

If you recall in this document, we have student information and then below each student, we have the class numbers along with some GPAs and other information we want to capture. So we added a repeating group already in our previous video where we’ve extracted just the student information. But now what we want to do is extract the class information along with some GPAs. So, the first steps that I’m going to do is I’m going to try to find some things that allow me to anchor things.

I’m going to use class number, because we’re going to extract the class number, I’m also going to use this word called entries, so that will help me get entries, and since those show up only once within the parent repeating group, I’m going to go ahead and add them at this base repeating group element. In this case, I’m just going to go ahead and give them some names. Since this is a static text, I typically give them just a name of S for static text and then I’ll do class number and then, of course, here we will use class number.

It’s very important that when we have cases like this where the label spans multiple lines, that we tell the software that. So, and the way that we do that is tell it to take spaces into account and permit multiple lines. Then, of course, as we typically want to do, is we want to just add a relationship here, so that it finds the class number text closest to, we’ll just call it the student element here. So, it’s going to look for this whole element here called the student element and we want it to find the one closest.

The other thing that does is allows it to order it from page top to bottom when we’re displaying them to ourselves or to the user. So that will show us how to get the class number. I’m also going to add the entries here, so we can extract the entries. So we’ll do something very similar. We want to tell it that we’re looking for entries and then, of course, just very similar, we want to tell the software that it’s kind of anchored back to the student.

Let’s go ahead and test this. So we’ll go ahead and just run a match and once again, as you’re familiar with now, you can see that within this group, the first repeating group instance, we are now extracting these proper headers. I’m going to allow you to do description on your own after you’ve analyzed this video. Think about different ways that you could capture description. That’s why we’re going to leave it out of this video. But you can see here, we’re finding the repeated information for the group.

So let’s go ahead and add our nested group. All we need to do is add an element within here called a repeating group. We’re going to give it just the name of RG2, so that we know this is the second repeating group. We’re going to keep it fairly simple here. We’re just going to tell the software that it’s above the average GPA and that it’s below the student. So now we know how to box in for each repeating piece of information, in this case the students, we can box in where this repeating group is located. Just above and below.

Okay. So now that we’ve done that, we want to extract these specific values here. That’s why we’ve previously grabbed the class number text and the entries text, so that’s going to kind of give us some secrets here. In this case, for class number, I’m just going to add what we call a character string element. The character string element gives us the ability to just find what we call regular expressions or certain patterns of characters within the software. Here we’re going to give this the name of class number. One thing I will do here that is very, very important is we want to tell the software to only find this in cases where it found a student. Or, in this case, we’ll go ahead and just use class number since we want to use that as an anchor. But in other words, if we don’t find class number, then we don’t want to return an empty repeating group.

We’ll just tell it here: don’t find this class number if we don’t find the text called class number within this repeating group. What I will do is go ahead and add a regular expression for this. This is three digits long. You can do that, just do three digits and that’s a digit there. Then you can tell the software that it has a minimum value of three and a maximum value of three. So it’s going to look for a three-digit-long character string. Then, of course, I always like to box in each element as much as we can, so that the software knows, at the end of the day, the rectangle that it’s extracting from. We’re going to tell it simply that it’s below the class number, so these values here for the class numbers are below this S class number keyword that we’re using.
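To make the three-digit pattern concrete, here is a minimal sketch in plain Python. It is not FlexiLayout syntax; the pattern, the function name, and the sample text are all assumptions made up for illustration. It only shows the idea of "exactly three digits, no more and no fewer".

```python
import re

# Hypothetical pattern (not FlexiLayout syntax): exactly three digits,
# with no digit immediately before or after the run.
CLASS_NUMBER = re.compile(r"(?<!\d)\d{3}(?!\d)")


def find_class_numbers(text: str) -> list[str]:
    """Return every standalone three-digit run found in the text."""
    return CLASS_NUMBER.findall(text)


# Made-up sample text: "2050" has four digits, so it should be skipped.
sample = "Class Number\n101\n2050\n330"
print(find_class_numbers(sample))
```

The lookarounds play the role of the minimum and maximum of three: a longer run of digits does not count as a match.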

Next thing we will do is we will tell it that it’s left of the right. So we’re going to say that these are located to the left of the right side of class number. That’s how we do it, it’s going to be to the left of the right boundary of this element. Then, lastly, we kind of want to do the opposite way. We’re going to say it’s right of the left. Then, like we typically do, we just bring this back and we kind of anchor it back to the closest element that we can find, keeping them in proper order and we’ll do something like that so we can keep them in a good order there for us to preview.
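The "below", "left of the right" and "right of the left" relations can be pictured as building one rectangle under the keyword. The sketch below is plain Python under stated assumptions: the `Rect` class, the coordinate convention (y grows downward) and every number in it are hypothetical, and this is not how FlexiLayout Studio represents regions internally.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    left: float
    top: float
    right: float
    bottom: float  # assumption: y grows downward, so bottom > top


def search_area(page: Rect, label: Rect) -> Rect:
    """Box in the region below a label and inside its left/right boundaries."""
    return Rect(
        left=label.left,     # "right of the left" boundary of the label
        top=label.bottom,    # "below" the label
        right=label.right,   # "left of the right" boundary of the label
        bottom=page.bottom,  # nothing limits the area further down in this sketch
    )


def contains(area: Rect, box: Rect) -> bool:
    """True when the box lies completely inside the area."""
    return (area.left <= box.left and box.right <= area.right
            and area.top <= box.top and box.bottom <= area.bottom)


page = Rect(0, 0, 100, 200)
label = Rect(10, 20, 40, 30)          # made-up position of the keyword
area = search_area(page, label)
print(area)
print(contains(area, Rect(15, 35, 30, 45)))  # a value directly under the label
print(contains(area, Rect(50, 35, 60, 45)))  # a value under a different column
```

Each relation trims one side of the rectangle, which is why three of them are enough to fence in a column of values under a keyword.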

So that will enable us to get the class number. If we match this and we dive into these, now you’ll see here. We’ll just go ahead and steal our first repeating group. If we see everything within our repeating group, we’re extracting all this information, and then we’re also extracting within the nested repeating group, those different class numbers as well. So, let’s go ahead and add the entries. Very similar process, it’s once again a character string. We’ll give it an intelligent name, like entries. Just like what we did before, however, we want to be very cautious about this and make sure we tell the software not to find it, or not to attempt to find it even, if it can’t find, in this case, the S class number, or the static value of class number.

We’ll just say don’t find it if it’s not found. In this case, we’ll just go ahead and tell the software that it’s just any part of the alphabet. So, any characters of the alphabet can be found in the entries field. Of course, if you were doing this in your production environment, or a production template, you may want to be more specific about that. Including using a regular expression there. In this case, I’m just going to go ahead and leave it alone and then lastly, once again, control the different elements in relations that we find it. Just like what we did before when we used class number, now we’re going to use the text of entries to kind of box in where we find it.

So, first off, we know that these are going to be found below the entries keyword. So look below. Go ahead and look to the left of the right of the word entries. And right of the left of the word entries. And we’ll go ahead and say that it’s closest to the class number. So now, if we go ahead and match these, we’ll see fairly consistent results here that I think you’re going to like. So we’re not only grabbing the first repeating group, we’re also grabbing repeating information within each repeating group. Here I’ll just dive in to our first repeating group, or our first student. Of course, we found everything again and then if we dive in to each repeating group, on the nested side, you can see I’m now finding both class numbers here once again.

That’s a very good example and a very good situation where we would apply nested repeating groups when we don’t have very nice or pretty table or tabular formatted documents in order to return columns and rows, we can now use a repeating group to do that. Then, of course, we can use that information downstream when we’re ready to simply perform some sort of export and have them based on columns and rows.
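As a rough picture of what "columns and rows" means for nested repeating groups, the sketch below flattens a nested structure into one row per inner instance. It is plain Python, not an ABBYY export format; the dictionary layout, the field names and the sample values are all made up for illustration.

```python
def to_rows(students: list[dict]) -> list[tuple[str, str, str]]:
    """Flatten student -> classes into (student, class_number, entries) rows."""
    rows = []
    for student in students:             # outer repeating group
        for cls in student["classes"]:   # nested repeating group
            rows.append((student["student"], cls["class_number"], cls["entries"]))
    return rows


# Hypothetical extraction result: two students, with two and one classes.
extracted = [
    {"student": "Jane Doe", "classes": [
        {"class_number": "101", "entries": "Algebra"},
        {"class_number": "205", "entries": "Biology"},
    ]},
    {"student": "John Roe", "classes": [
        {"class_number": "330", "entries": "Chemistry"},
    ]},
]

for row in to_rows(extracted):
    print(row)
```

The outer value is simply repeated on every row that belongs to it, which is the usual way a nested group ends up in a flat table.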

I’m going to challenge you now to go ahead and set up description. There’s multiple different ways to do it, of course. You can use a character string. You can use what we call a region. But just think about a couple of different ways you can use what you’ve learned in the last couple of videos to extract the description on each repeating group. It’s not meant to trick you. But it’s more just different ways and brainstorming. There’s more than one way to skin a cat on this FlexiLayout, so if you do it differently, it does not mean you’re right or that I’m wrong. Or vice versa. So, go ahead and give it a shot or give some ideas, or make up your own sample and give it a shot. I think you’ll enjoy the flexibility that ABBYY has provided you within this FlexiLayout studio. As always, if we can be of any assistance to you, please reach out to us. We’d be thrilled to help you. Thank you so much.

ABBYY FlexiCapture Video – Repeating Groups in FlexiLayout

Watch our video to learn how to set up repeating groups in ABBYY FlexiCapture FlexiLayout Studio. Repeating groups can be useful in many ways, including collecting data from complicated tables where the table element cannot be used.

Hello. Today we are going to discuss how to create a repeating group within an ABBYY FlexiCapture FlexiLayout. The reason why we have repeating groups is because we want to extract information that’s repeated on a document, but maybe it’s not in a tabular format, or not in a nice, clean table. So, we can’t use a table element, because it’s too restrictive, so we have to tell the software that we have information that repeats. Therefore, when we extract it, we want it to repeat, and when we export the document, we want it to repeat across the table and columns too.

What we’re going to do is focus on just the repeating group element, we’re not going to go through the process of setting up a FlexiLayout from scratch or plugging it in to the FlexiCapture software. That has been covered in other videos and I ask you to reference those, if you have any questions on that process.

The first thing we’re going to do is simply create a repeating group search element. And I’m going to name it just what we call repeating group one. The reason for this naming standard (and everybody has their own naming standards, and you can develop your own) is so that we know that this is the base level of a repeating group. Then, as we nest, if we have two nested groups, you can see we’d have RG one, and then we would have everything nested under RG one, RG two, and RG three, and so forth. That’s the purpose of what this is all about.

Within a repeating group, we want to tell the software what we’re extracting. In this case, I’m going to tell it about three labeled fields, we’re going to extract student information and then the average GPA. Then, we’ll have that, of course, twice because there are two pieces of student information and two average GPAs on this document.

So, what I’m going to do is just create three different labeled fields. Now, for ease of this demo, I’m going to keep them labeled fields. Of course, you can process documents and you would, maybe even, develop this one a little bit different than me, but just understand for ease of a demo, I’m just going to restrict these to labeled fields. Of course, you can use any other search elements for repeating purposes.

Just quickly, I’m going to set up the student that we want to extract. And, just assuming that we want to stay away from this ID number, I’m just going to tell the search area to limit that, here on where we find the student number. I’m going to add some additional labeled fields, here for the ID number. And field position to the right is sufficient for us. Then, lastly, we’ll do the average GPA, so we’re capturing this average GPA per student. And the label here will be average GPA.

So, now we have three different labeled fields set up that will be repeating. What I’m going to simply do, is just test this. You’ll see that it is extracting repeated information, it doesn’t have them together yet, and that’s going to be the next step, but you can see, it’s capturing down here in our hypothesis tree multiple instances of a repeating group.

Sometimes you will have an instance or a last instance of a repeating group. As long as that last instance is all these yellow bubbles, then you won’t have a problem with the software, it shows there as a hidden element, and it will not get exported, and also not used in the preview to the users. We are verifying our extracted details, here.

So, it did extract the information that we needed to. Now, you’ll see that it’s kind of messing these up, and there’s some tricks we can use, and also different elements that we can use to help with this so it keeps this information together as one repeating group. For this case, I’m just going to tell the software that we want the student to be the anchor, and all I’m going to do there, is just tell the software that I want always to capture the one closest to the top of the document. Your logic on your documents may be different. So, just note that this is, obviously, for demo and video purposes.

The other thing that I will do, is I will tell it then the number and the average GPA that I want it to be closest to the student that it’s referencing. So, what I will do, is simply just tell it to find me the search element that’s relevant to student and once again, the search element that is relevant to the student.
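The idea of anchoring on the student and taking the value closest to it can be sketched like this. It is plain Python under stated assumptions: positions are reduced to a single vertical coordinate, and the names and numbers are made up. FlexiLayout Studio does this through relations in the element properties, not through code like this.

```python
def group_instances(anchors, values):
    """Pair each anchor with the value whose vertical position is closest to it.

    Both arguments are lists of (text, y) tuples. Anchors are walked from the
    top of the page down, so the instances come out in reading order.
    """
    instances = []
    for name, y in sorted(anchors, key=lambda a: a[1]):  # top of page first
        value, _ = min(values, key=lambda v: abs(v[1] - y))
        instances.append({"student": name, "average_gpa": value})
    return instances


# Hypothetical hits, deliberately listed out of order.
students = [("John Roe", 400.0), ("Jane Doe", 100.0)]
gpas = [("3.2", 420.0), ("3.8", 120.0)]

print(group_instances(students, gpas))
```

Without the "closest to the student" rule, nothing stops the first student from being paired with the second student's GPA, which is the mix-up described above.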

If I test this now, you’ll see the first repeating group here, we found the label and the field for each of those, that we wanted to reference. Then, on the second one, as well, here.

So, that’s how we set up a repeating group. It’s a very simple process. Then, of course, we would add a block. We have a repeating group block and you can, of course, add multiple pieces of text as well, here.

So, what I’m going to do, is show you what the software looks like from a repeating group perspective, if we add a repeating group. It’s just a display to the end user that changes, here. I’m just going to tell the software that this is all instances of a repeating group block. Otherwise, we can return some separate text, here. So, just hang with me as I set up this. We’ll just use the student name for this sample. Once again, this is only going to show you what the user sees once we’ve extracted it. So, we’ll just call this the student name. And, we’ll tell this, this is the student name field.

So, that’s if we set up a repeating group block. We can tell the software just to return it outside of the repeating group, when we do that it’s very important however, that we use this ‘has repeating instances’, and then we want to tell the software, of course, where we find the student name, in this case.

What I’m going to do is save it. I’m going to export this, so that we can use it. And, we will come back to our demo here and we’ll, actually, create a new form together. We’ll load a sample, we’ll tell it where our FlexiLayout that we just exported was. Of course, we’ll give it an intelligent name, tell it the marking type, or excuse me, the OCR or handwriting type that we want. And then, lastly you’ll see here.

So, you can see the two different ways that I return the block to the software. It’s going to be the same information, it just displays differently to the end user. It operates just a little bit differently. And you can see one groups them, in what we call a formal group. And then, the other one just brings it into a tabular format.

So, what I’m going to do is simply test this with you. Just so you can see the little bit of a display difference, here, on where the software extracts that information. So, of course, in the real world, we would want to add additional blocks, so that we return the ID number and the average GPA that we referenced there in the FlexiLayout, as well.

I hope you enjoyed this short video of how to create a repeating group. Repeating groups are very, very powerful and very commonly used in the development of FlexiLayouts.

If you have any questions, please reach out to us. We’d love to be of assistance to you. Thank you so much.

ABBYY FlexiCapture Video – FlexiLayout Part 3: Using Search Elements

Watch our video to learn how to use additional search elements (such as barcodes, dates, etc.) using relations within FlexiLayout Studio.

Well, hello. Today we’re going to continue learning FlexiLayout Studio for ABBYY FlexiCapture. Now FlexiLayouts, just to remind you, are a way that we can get semi-structured data or data that has some sort of textual relationship instead of location. What we’ve done in the past is we’ve located some elements based on some obvious things, like keywords for example. Your planet name in this case, or your space ship number in this case. We use what we call labeled fields to find it. We have a label and we have a field and that’s what we want to capture.

Today we’re going to look at some other advanced elements that we can use. An example would be maybe a bar code, or maybe some dates, etcetera. We’re gonna learn that today. We’re gonna learn a little bit about relationships and how we can use the information that we find on the form or different text on the form to help us determine a relationship that will help us find other text on the form.

Let’s start out today by looking for this bar code. There’s a bar code if we look at each one of these. There’s a bar code on this form. This one will be a very simple one. Let’s go ahead and add a bar code field. Remember we always do a search element and then whatever we return in blocks, these blocks is what gets returned to the FlexiCapture product so that we can capture these on documents as we process them.

We’re gonna create an element and this one’s gonna be pretty obvious here. We’re going to have what we call a bar code element. On the bar code element, we have a lot of different features that we can reference. You can see we can control what kind of bar codes, how they orient. And, once again, how they relate to each other and other elements on the form. In this case, we’re gonna be happy that we have a bar code.

We’re gonna have a bar code. There is just one bar code on every one of the samples, which is fine. In this case we’re just gonna tell the software to look for a bar code. We’re gonna tell it where its source is. Remember every block relates to a search element source. We’re gonna tell the software this is a bar code. We’ll go ahead and match these documents here.

Now, that we have that you can see, now we have our returning in green, which means we found it. We are returning that block, which is the bar code. I’m gonna intelligently name this one here. Instead of block, we’ll maybe call this our bar code. Then if we see here, we can match and we’ll see that all of our bar codes are found here.

Now that we found bar code, let’s look for something like name here. Name is an interesting field because if we just tell the software to look for the word name, colon, it will obviously find things like this field, which is exactly what we do not want to happen. However, if we look at every one of these, we can tell the software to use what we call a relationship. Now, there are multiple ways to find elements on a form. Every developer is gonna do it little bit differently, but today we’re going to target using relationships on how we do this. Other developers might use what we call a first found field. In this case, we’re going to simply look for anything that has the word name in it, but it has to be above your planet name.

In this case, we’re going to say we want to find a labeled field. And in this case, we’ll say look for anything that has the name, name, colon. If we stopped there, you would definitely have problems in relation to your planet name. In fact, let’s go ahead and stop there. We’ll just add it as a block, so it’s obvious to see. Let’s give it an intelligent name here, just so we can make sure we understand the difference. I should’ve been a little bit more disciplined. We should call this a really good name here. We’ll call it a labeled field. Little bit more intelligently.

When we search for these, the software will sometimes find it correctly, and sometimes not. In this case, we did fine on most of these but you can see here we did not find name, colon. That’s just to speak to some of the differences in what we call our hypotheses here. However, it did find name, colon, but you’ll see in this case … This one’s fine. In this case here, maybe you can see we’re getting confused here with the planet name.

If we can put a relationship on here that says anytime you find name, colon for the guest name, we don’t want to talk about planet name. But we realize that the guest name must always be above your planet name. So what we’re going to do is we’re gonna say go look for your planet name. Go look for that field and make sure that when we find the guest name that it’s always above it. You can see we can play around with some offsets, which is very common. We typically need to do that when we’re developing more complex forms, but in this case we’ll go ahead and let the software run.

Now that we’ve set up that relationship or that relation, you can see here it’s very obvious that we’re determining where the guest name is located. There’s no confusion on this name field with this name field.

To just show you again how that would work, let’s assume that we need to grab the date field here. It’s very obvious, there’s a date field and we can see that in most of these, it’s between space ship number and this bar code. Once again, we’ll just take a peek. There’s space ship number and the bar code. We’ll just take another peek here. There’s space ship number and the bar code and that date is always in between there.

What we’re gonna tell the software is to look for a date. We have this date element. We can say we’ll give it a smart name. Maybe like date arrival. Of course, it’s a date. We can tell it what format we want it to be in. We could get very specific based on your requirements. We want to tell it a relationship. Now, in this case remember that it’s always below the space ship number field. We’re gonna look for the space ship number field. We’re gonna say it must be below it. And it must be above the bar code. We’re just gonna look for the bar code and say that it must always be above it. In other words, it’s gonna be between the bar code and the space ship number vertically on this form.
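Here is a small sketch of the "below one element and above another" idea. It is plain Python, not FlexiLayout code; the `Found` class, the coordinates (y grows downward) and the sample dates are assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Found:
    text: str
    top: float
    bottom: float  # assumption: y grows downward, so bottom > top


def between_vertically(candidates, upper, lower):
    """Keep candidates that sit below `upper` and above `lower` on the page."""
    return [c for c in candidates
            if c.top >= upper.bottom and c.bottom <= lower.top]


# Made-up positions for the two reference elements and three date-like hits.
spaceship_number = Found("Space Ship Number: 42", top=100, bottom=110)
bar_code = Found("<bar code>", top=300, bottom=340)
dates = [
    Found("01/01/2020", top=40, bottom=50),     # above the space ship number
    Found("10/31/2020", top=200, bottom=210),   # between the two references
    Found("12/25/2020", top=400, bottom=410),   # below the bar code
]

print([d.text for d in between_vertically(dates, spaceship_number, bar_code)])
```

Only the candidate that satisfies both relations survives, which is what lets the date element ignore any other dates printed elsewhere on the form.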

We’re gonna hit Okay, and now we’re gonna add our search block here. In this case, we’ll just consider it text. Remember we always gotta give it a source. What we’ll do here is go ahead and match these documents. Now that I’ve matched these documents if we give this an intelligent name again, so that we can easily see it … Can’t have a space in there.

Now that we’ve given it an intelligent name you can see we’re gonna capture it because we told the software it’s in the relationship of below a field and above a field. If we look at the arrival date there, it’s found.

These relationships that you have are specific to every form that you have. The rules are kind of up to you in how you develop it and you have a lot of flexibility using elements. Like I said, every developer’s going to do it differently. I would challenge you to look at some of these fields at a deeper level. Even read some documentation to learn about how you can find different data. You have different types, currencies, phones, dates, you can even look for separators, so if we wanted to capture this picture over here on the documents we can tell it to look for certain bars around the image so that we can capture the actual image itself.

A lot of different freedom that we have here as developers to customize our form and how we capture different fields off of it. There was a couple where we found the bar code. We also found the guest name and the arrival date there. Remember the way that we did that is through these relations and we have a lot of flexibility in the way that we control that. Every form is gonna be different, just understand that, but make sure you play around with it. Get comfortable with it, there’s a lot of different technology features in here that you can use to your advantage, and to make life a little bit easier from a development perspective.

I hope you enjoyed this video. Definitely, jump in and learn about these different types of elements that we can capture. If you have any questions, please reach out to us. We would love to be of assistance to you. Thank you so much.

Related Content:

ABBYY FlexiCapture Video – FlexiLayout Part 1: Starting from Scratch

Watch our video to learn how to use ABBYY FlexiCapture FlexiLayout Studio for semi-structured forms.

Hello. Today I’m going to show you how to create a FlexiLayout from scratch. The first thing I’m going to do is show you the samples I have, and what these are, are registration forms. And all of them look just a little bit different. You can see the text on the screen is located in different places.

The details that we want to extract, which are the names, the planet names, the spaceship number, the dates, this bar code. All of this information is located at different spots. Now, it’s the same detail we want to capture, but just located in different locations, which is what is a perfect semi-structured form.

And therefore we need what we call FlexiLayout Studio to help us design that. So ABBYY FlexiCapture comes with what we call a FlexiLayout Studio. And this studio is what gives us the ability to capture semi-structured forms. So the first thing we’re going to do, is create a project. We’ll give it a location.

For this case, we’ll just put this in our demo folder, and we’ll give it a name. We’ll call this our Halloween Registration Form, and we will start it. Now what the project’s doing now, is it is going to ask us for some properties, some basic properties.

Now, let me warn you here, that FlexiLayout Studio is going to require much more training than this video that you’re going to see today. This is just a way to get you started, and there are a ton of features that we just won’t have time to cover in this video. But for sake of these images, we know that these are one page documents.

So, I’m actually just going to select this and say the minimum number is one. The maximum number is one page, and that’s just what we’re going to process there. But as you can see here, there is a lot of different options we can set, and therefore FlexiLayout Studio does require training.
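The minimum and maximum page settings amount to a simple range check. The sketch below is plain Python with hypothetical function names; it only assumes, as mentioned in the multipage video, that INF stands for "no upper limit". It is not how FlexiLayout Studio stores or evaluates the setting.

```python
import math


def parse_max_pages(value: str) -> float:
    """Turn a maximum-pages setting into a number; 'INF' means no upper limit."""
    return math.inf if value.strip().upper() == "INF" else int(value)


def page_count_allowed(pages: int, minimum: int, maximum: float) -> bool:
    """True when the document's page count falls inside the configured range."""
    return minimum <= pages <= maximum


print(page_count_allowed(1, 1, parse_max_pages("1")))      # one-page form
print(page_count_allowed(2, 1, parse_max_pages("1")))      # too many pages
print(page_count_allowed(250, 1, parse_max_pages("INF")))  # open-ended maximum
```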

In fact, a week or two of training is typical for a new beginning user. So we’re going to set some properties. We’re going to hit Okay, and then we’re going to add some images. I’m going to browse to our folder, and we’ll select the images that I just showed you, and we will load these on the screen.

And if I double-click every one of them, you’ll see here, we have each image that we have. Now we’re going to zoom out, which we will here, and we can just scroll up and down to see each of these images. Now the important part about FlexiLayouts, is that most of the information that we want to capture is text-based, meaning that we want to use text on the document to help us develop rules on how to extract the different things.

So the very first thing we’re going to capture today is, we want to make sure that we capture that the planet name is this Mars satellite [inaudible 00:02:47], for this one. But we want to capture that field. We just want to capture the planet name. So you can see it’s different here on every one of these.

So the way we’re going to do that, is we are going to use these rules within the FlexiLayout to determine how we map out the planet name. And it all comes down to what we call search elements. Search elements give us the reasoning behind why we looked for text, and then these blocks are what we are going to return to the software to extract.

So we may reference a bunch of text on the screen, but we really at the end of the day, only want to capture something, even though we use other text on the screen to map out where that something is located. So that’s why there’s a difference between search elements and blocks. But blocks are what we are returning to the software, and that’s what we expect to return.

So the search elements are here. Now what we are going to do is, we’re going to create a new search element. And we’re going to right click, add an element, and you can see there are a lot of different element features here. For this case, we are able to determine a pattern on these documents.

You can see they’re labeled, Your Planet Name. And if we look at every single document, you’ll see that there is a pattern here with that name; even though it’s located sometimes in different locations, the text is the same. So we’re going to create what we call a labeled field. And there are a lot of properties on every single field and element that you want to capture.

Now once again, we’re not going to go into every one of these options today. This is definitely more advanced training that you would need. But just to get the concepts down you can see, we’re going to say, “Hey. This is a required element.” We’re going to tell it, what’s the text of the field that we want to label. In this case it’s, Your Planet Name.

And we’re going to tell it where the field position is. We’re going to say it’s to the right and if we remember looking at every one of these forms, it is to the right. So you can see there’s a lot of different options here we can select, but I think in this case we’re good enough to be dangerous.

So we’re going to go ahead and select Okay, and we will have a labeled field. So now what we can do, is highlight and match these documents and if we select here, our labeled field, you’ll see that we did capture a lot of the field. So we can kind of scroll down. On this document, you can see here.

So now what we want to do, is give this labeled field some intelligence. So for one thing, it’s very important to name these fields correctly. So we’re going to call this one LF, for labeled field, and we’re going to say, Planet Name. And then, I notice that it wasn’t capturing this whole text, so we’re going to go ahead and make max length a little bit larger here. Okay.

And we’ll go ahead and match one of these, just for fun again. I’ll come down here to our Planet Name, and we may need to tweak that just a little bit more length-wise to get that field. There we go. There we go. So you can see here, I determined what the label is and what the field is, hence the name, this is a labeled field element.
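As a rough text-only analogy for a labeled field with a max length, the sketch below finds a label on a line and returns what follows it, cut off at a maximum number of characters. This is plain Python with a hypothetical function name and a made-up sample value; the real element works on regions of the page, so treat it only as a way to see why a too-small max length clips the field.

```python
from typing import Optional


def read_labeled_field(line: str, label: str, max_length: int) -> Optional[str]:
    """Return the text to the right of the label, cut to max_length characters.

    Returns None when the label is not on the line at all.
    """
    start = line.find(label)
    if start == -1:
        return None
    value = line[start + len(label):].strip()
    return value[:max_length]


# Made-up line of recognized text.
line = "Your Planet Name: Jupiter Orbital Station"

print(read_labeled_field(line, "Your Planet Name:", 7))   # clipped value
print(read_labeled_field(line, "Your Planet Name:", 40))  # whole value fits
```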

The other thing we want to capture is spaceship number, so this is very similar. You know, it’s the same field. It’s actually labeled the same on every form and that’s also what we want to capture here. So it’s very much the same. We’re going to right click, Capture a labeled field. We’re going to give it an intelligent name. We are going to say to the software that this is required.

We’re going to give it a label. Give it the max length if you want, and actually we’ll actually do this here. We tell the software where it’s at and hit, Okay. So now what we will do here in this case, is we will look at the values, see what we return back for the planet, the spaceship and you can see it’s determining the label, and the field for us just fine.

Now an element of FlexiCapture that is more advanced is learning how to use the hypothesis tree to determine the quality of the results you got. You can see here they’re all green, however they do go different colors based on the quality. For example if it’s poor quality, it will go yellow or even different from that.

So it’s very important to learn the hypothesis tree as you get more advanced here with FlexiLayouts. So what we’ve done so far, is we’ve told the software where to locate planet name and where to locate spaceship number. So now that we’ve told it where to locate it, we want to tell it to return that to us for referencing in our OCR or verification processes.

And the way that we do that is, we map these elements to blocks. So we’re going to return a block, which is text and we’ll make it an intelligent block. We’ll call it Planet Name, and then we want to tell the software that this planet name comes from that field, Planet Name.

And we actually don’t want the label, but we want the field itself, so we’re going to go ahead and hit, Okay, and hit Okay there. And of course, we have spaceship number as well, so we are going to say this is our Spaceship Number. Oops. I’m going to rename that here.

And then of course we want to tell it, what element is relevant to the spaceship number, which is the Spaceship Number labeled field, and we want the field part of that. So we’re going to go ahead and hit, Okay, and hit Okay here. Now when we match these, you can see we get a little bit more intelligence here on the block.
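The split between search elements and blocks can be pictured as a lookup: each block names the search element it comes from, and only the field part (not the label) is returned. The sketch below is plain Python; the class, the element and block names, and the sample values are all hypothetical and are not the FlexiLayout Studio data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LabeledField:
    label: str  # the keyword text found on the page
    field: str  # the value found next to the label


def build_blocks(elements: dict[str, LabeledField],
                 block_sources: dict[str, str]) -> dict[str, str]:
    """Return one value per block, taken from the field part of its source element."""
    return {block: elements[source].field
            for block, source in block_sources.items()}


# Hypothetical search results and block-to-source mapping.
elements = {
    "LF_PlanetName": LabeledField("Your Planet Name", "Jupiter"),
    "LF_SpaceshipNumber": LabeledField("Spaceship Number", "A-113"),
}
block_sources = {
    "PlanetName": "LF_PlanetName",
    "SpaceshipNumber": "LF_SpaceshipNumber",
}

blocks = build_blocks(elements, block_sources)
print(blocks)
```

The labels are still needed to locate the values, but they never appear in the returned blocks, which mirrors choosing the field rather than the label as the block source.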

So the software is now going to tell us where it found the planet name and spaceship number. And in this case, when it’s in green, it’s telling us what the blocks are that we’re going to return to the software. So it’s very, very important. And those are just labeled fields. Now we can get very, very specific.

And in fact, in some of our future videos, we will get more specific on the elements that we can return, including videos and … I’m sorry. Including bar codes and including photos. But just understanding that FlexiLayouts are all about text, is the important part here.

So you can see, we can match these, and if we just scroll down through every one of our samples here, you can see it’s now determining, based on the label where we found those fields.

So that is a very, very basic introduction to ABBYY FlexiCapture layouts, FlexiLayouts. I hope you did learn something about this and look forward to exploring FlexiLayouts with you as we get fancier in our next videos. So, thank you very much.

Related Content:

ABBYY FlexiCapture Video – FlexiLayout Part 2: Export to FlexiCapture

Watch our video to learn how to transfer a FlexiLayout template to an ABBYY FlexiCapture project.

Hello. In our previous video, we were discussing FlexiLayouts and how we can use them for unstructured or semi structured data. Today, we’re gonna explain to you how we can transfer our FlexiLayout to FlexiCapture so we can actually start processing documents.

What I’ll do is I’ll remind you just a little bit about what our FlexiLayout looks like. Let me open the project here. You’ll recall that we set up a few elements. We call those search elements. Elements help us find information on the document, so in this case we had planet name, and in this case we had spaceship number, and we use what we call labeled fields to detect those.
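To make the labeled-field idea concrete, here is a minimal Python sketch of the concept: find a label in the recognized text and return whatever follows it. This is only an illustration and not how FlexiLayout Studio works internally. The function name `find_labeled_field`, the block names, and the sample values are all hypothetical.

```python
import re


def find_labeled_field(page_text, label):
    """Return the text that follows `label` on the same line, or None.

    Hypothetical helper: it mimics the idea of a labeled field
    (locate the label, then take the field next to it).
    """
    pattern = re.compile(re.escape(label) + r"\s*:?\s*(.+)", re.IGNORECASE)
    for line in page_text.splitlines():
        match = pattern.search(line)
        if match:
            return match.group(1).strip()
    return None


# Made-up recognized text standing in for one sample page.
sample_page = """Halloween Registration Form
Planet Name: Zorgon
Spaceship Number: 4471
"""

# "Blocks" here are simply the named values handed on for verification.
blocks = {
    "PlanetName": find_labeled_field(sample_page, "Planet Name"),
    "SpaceshipNumber": find_labeled_field(sample_page, "Spaceship Number"),
}
```

In this sketch the search step only locates text; anything like data types or validation rules would be applied later, which mirrors the split between the FlexiLayout and the FlexiCapture project described in this video.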

Now, we have blocks, and you’ll recall that blocks are what gets returned to the FlexiCapture project. In other words, this is the information that we will use intelligently when we verify the actual document itself. This is a FlexiLayout, and to start the process here, we’re simply gonna export it. So, we’re gonna export the FlexiLayout and we’ll give it an intelligent name. You see it has an intelligent extension already with it, so we’re just gonna go ahead and leave that there, and just keep the default name. We’ll save it, and we’ll go ahead and close FlexiLayout Studio.

What we’ll do now is we will go ahead and open the ABBYY Project Setup Station. We will show you how to incorporate this into a new project. So, let me bring this over here on my screen so you can see it, and we’ll start a new project. I’m gonna save this in a similar path. So, we’ll just go up here, get our path here, and we’ll just call this our Halloween Registration Form FlexiCapture Project. So, now we have an empty project, and we’re gonna go to Project, Document Definitions. We’re gonna create a new document definition. From here we’ll give it an intelligent name. We’ll determine whether it’s handwritten or typed, and we’ll kind of move forward in this wizard.

Now, this is where we’ll stop. There are a couple of different things here. One is we’re going to load some samples, which is fine. We can load whichever sample we want. And then we’re going to load a FlexiLayout. Now, this is key because this is where we’re gonna use that exported file in order to find our FlexiLayout. You recall that I exported that .AFL file, so I’m gonna simply click that, hit OK, and hit Finish.

Now, you’ll remember that those blocks in FlexiLayout Studio are what’s returned to the actual project itself. So, now we can control the different requirements of these fields, including data types, rules, verification settings, etc. We have full control over the rules, so the FlexiLayout is what’s responsible for determining the textual rules, and then those blocks are what gets returned to the project that we’re in here, to give us a little bit more control from a verification standpoint.

So, we can simply save this. We’ll go ahead and publish it, too, so we’ll close this, publish it, and we’ll just run a batch. Let’s just go ahead and load all of our forms. We’ll let those process here, and we will now go into the batch and you’ll see they’ll start processing here for us. We’ll look at the first one. And you can see we now have our OCR results.

So, that is how you get a document over from a FlexiLayout to a FlexiCapture project. It’s as simple as exporting the FlexiLayout, and then using FlexiCapture to modify the field settings there to get specific with what we want from either an export perspective or a verification perspective.

So, this is a quicker video. Keep looking for our other videos in the series, and we look forward to working with you here in the future. Thank you so much.

Related Content: