image over text

Image over text is a term for a specific format of electronic document, generally associated with the PDF specification. An image over text PDF embeds searchable text behind the scanned image of a document. It is created by first scanning the document and then running the scan through an OCR engine. Next, each word of the OCR text is mapped to the zone of the scanned image where it was recognized. As a result, the displayed PDF can be searched for words and phrases, and when a PDF viewer such as Adobe Reader finds a search term, it can show the term's location within the document. Perhaps one of the most useful attributes of the image over text PDF is that the text within the document can be added to the index of an enterprise document system or content search engine. This makes all of the text within the scanned image available for searching by users trying to locate a document.
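
The word-to-zone mapping can be pictured with a small sketch. The Python below is only an illustration under stated assumptions: the names `Zone`, `OcrWord`, `find_phrase`, and `index_text` are hypothetical, the coordinates are made up, and no real OCR engine or PDF library is involved. It assumes the OCR step has already produced a list of words, each with the rectangle it occupies on the scanned page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Zone:
    """Rectangle on the scanned page where a word was recognized (hypothetical units)."""
    page: int
    x: float
    y: float
    width: float
    height: float


@dataclass(frozen=True)
class OcrWord:
    """One recognized word together with its zone on the image."""
    text: str
    zone: Zone


def _normalize(token: str) -> str:
    # Lowercase and drop surrounding punctuation so "Invoice," matches "invoice".
    return token.lower().strip(".,;:!?\"'()")


def find_phrase(words: list[OcrWord], phrase: str) -> list[list[Zone]]:
    """Return, for each match of the phrase, the zones of the matching words."""
    terms = [t for t in (_normalize(p) for p in phrase.split()) if t]
    if not terms:
        return []
    normalized = [_normalize(w.text) for w in words]
    hits = []
    for start in range(len(words) - len(terms) + 1):
        end = start + len(terms)
        if normalized[start:end] == terms:
            hits.append([w.zone for w in words[start:end]])
    return hits


def index_text(words: list[OcrWord]) -> str:
    """Join the hidden text into one string that a search index could ingest."""
    return " ".join(w.text for w in words)


# Made-up sample standing in for OCR output of a single scanned page.
SAMPLE_WORDS = [
    OcrWord("Invoice", Zone(page=1, x=72, y=90, width=58, height=12)),
    OcrWord("number", Zone(page=1, x=134, y=90, width=52, height=12)),
    OcrWord("12345", Zone(page=1, x=190, y=90, width=40, height=12)),
]

if __name__ == "__main__":
    for match in find_phrase(SAMPLE_WORDS, "invoice number"):
        print(match)
    print(index_text(SAMPLE_WORDS))
```

In this sketch, `find_phrase` plays the role of the viewer's search, returning the zones a viewer could highlight on top of the image, while `index_text` stands in for the text handed to a document system's index.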
