Watch in our demo how to split a document that contains multiple invoices with ABBYY Vantage.
Hello. Today I’d like to show you some document splitting in ABBYY Vantage. And the cool part about this is that the software has the ability to split based on text that is found on the document. And one of the ways that we typically use this sort of functionality is with invoices. So a document comes in that may have multiple invoices in it. So let me show you an example of one here.
In this PDF I have just a few different invoices. They’re completely different organizations, completely different invoices, but these are coming in and we need to be able to automatically split them. So what the software can do is read these documents and split them based on logic, and today’s logic is pretty simple. We’re gonna extract the invoice number and split when we see a difference. But we can get pretty advanced in the way that that sort of logic is applied.
So what will happen in today’s demo is a document is gonna come in. We will use the assemble skill to split a document. Now we have what’s called a document splitter skill that’s used to split it and we have full control over that splitting logic, and I’ll show that a little bit to you behind the scenes here in a second. But once we’ve split it, we will then formally extract using our Invoice US Skill. So we’re gonna split using our splitter skill. Now that we’ve split successfully, we will extract using our document extraction skill. And then just for review purposes, we’ll actually come in here and we’ll see that document here that’s ready for review, and the reason why we’re doing this in today’s demo is so that we can see that the splitting has actually taken place. Alright, let’s go ahead and fire off a splitter skill.
So now that we’ve fired that off, the software is going to perform that splitting. So it’s looking for differences in the documents and will auto split those documents. And then based on our process skill, we will route those for human review. So what I’m gonna do here is just watch our skill monitor in ABBYY Vantage. And you can see we now have an invoice that is in the starting stage. And when that’s ready for review, it will go to what we call the Manual Review Stage. So that is our human in the loop step so that we can see these documents.
So now we have that transaction ready for human review so that we can see the splitting. I’m gonna go ahead and open up that transaction. A couple of things I want you to notice: within this transaction, I now have four separate documents. So if you remember, I had a PDF that had four pages in it and they were each individual documents, and now ABBYY Vantage has split that up for us. Not only that, but after the split, we applied our invoice document extraction skill against each document. So now I have every invoice, its content that we read off of it, and of course the line items and those sorts of things on an invoice that we would typically want to extract. So this shows us the power of splitting.
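The flow shown in this part of the demo (split the incoming file, extract each resulting document, then hand everything to manual review) can be sketched roughly as below. This is only an illustration in plain Python, not ABBYY Vantage code: the names `split_one_per_page`, `extract_invoice`, and `run_process` are hypothetical, and pages are assumed to be simple dictionaries of values that have already been read off the page.

```python
from typing import Callable

# Hypothetical stand-in: each page is a dict of values already read by OCR.
PageData = dict


def split_one_per_page(pages: list[PageData]) -> list[list[PageData]]:
    """Stand-in splitter: treats every page as its own document, as in the demo PDF."""
    return [[page] for page in pages]


def extract_invoice(document: list[PageData]) -> dict:
    """Stand-in extractor: reads the invoice number from the document's first page."""
    return {"invoice_number": document[0].get("invoice_number")}


def run_process(pages: list[PageData], split: Callable, extract: Callable) -> list[dict]:
    """Split the incoming file, extract each document, then mark it for manual review."""
    transaction = []
    for document in split(pages):
        transaction.append({
            "page_count": len(document),
            "fields": extract(document),
            "stage": "manual_review",  # the human-in-the-loop step from the demo
        })
    return transaction
```

Running `run_process` on a four-page input with the stand-in splitter gives four entries in the transaction, each with its own extracted fields, which mirrors the four documents seen in the review screen.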
Now in practice, document splitting can be much more advanced. We can use page numbers. We can use differing text. We can look at every single page. So let me kind of show you then behind the scenes how this really gets implemented. So if we go back and we look at our process skill, you’ll recall that we have a document splitter skill that we referenced here in that process skill.
Now, if we go and we look at our invoice splitter skill, it’s pretty simple. We’re going to of course perform OCR on a document as it comes in. We will look for different extraction rules. And in today’s demo, once again, this is very basic. We’re looking for an invoice number. We’re looking for maybe something that indicates the first page of a document, but once again, this is all customizable at this point based on your document type. And then we have a splitter script, which gives us the ability, frankly, to go as customized as we want to for the splitting. So we give you a sample script here to start out with, and this is definitely something we can consult on, but here’s where we kind of tell the software, here’s when a new document starts and stops, and then when to merge pages into the previous invoice if we don’t see a different invoice number.
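The splitting rule described here (start a new document on a first-page indicator or on a changed invoice number, otherwise merge the page into the previous invoice) can be sketched as follows. This is a minimal standalone illustration in Python and is not the sample script that ships with ABBYY Vantage or its scripting API. The `Page` and `Document` classes and the `split_pages` function are hypothetical, and the sketch assumes the invoice number and first-page flag have already been extracted for each page.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Page:
    """One page plus the values the extraction rules found on it (hypothetical shape)."""
    index: int
    invoice_number: Optional[str] = None  # None when no number was found on the page
    is_first_page: bool = False           # True when a first-page indicator was found


@dataclass
class Document:
    invoice_number: Optional[str]
    pages: list = field(default_factory=list)


def split_pages(pages):
    """Group consecutive pages into documents.

    A new document starts when a page carries a first-page indicator or an
    invoice number that differs from the current document's number.
    Otherwise the page is merged into the current document.
    """
    documents = []
    current = None
    for page in pages:
        differs = (
            current is not None
            and page.invoice_number is not None
            and current.invoice_number is not None
            and page.invoice_number != current.invoice_number
        )
        if current is None or page.is_first_page or differs:
            current = Document(invoice_number=page.invoice_number)
            documents.append(current)
        elif current.invoice_number is None:
            # The document had no number yet, so adopt this page's (if any).
            current.invoice_number = page.invoice_number
        current.pages.append(page.index)
    return documents
```

For example, pages numbered `INV-100`, (no number), `INV-200`, `INV-200` would come out as two documents: the page without a number is merged into the `INV-100` invoice, and the change to `INV-200` starts a new one.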
So we just have a lot of control here, now that we have the ability to classify, extract the text, and then perform the splitting. And we have full control over how that batch is formulated here in Vantage, which eventually gets us to a beautiful batch that’s outlined here as we would expect.
So a little bit behind the scenes, but a lot of this is really just low-code, no-code options here where we can assemble automatically. And honestly, within minutes we could have documents coming in, in this case invoices just being split and then routed as needed for our downstream steps. So I hope you were able to take this content and apply it to your business use cases here. Just a lot of power at our disposal.