What is PDF/A?

Portable Document Format (PDF) files are a handy way to exchange documents with others without requiring them to have a copy of the application that was used to create the original. A PDF lets the recipient view and print an accurate representation of the original document. Within a few years of its introduction, PDF grew to become a de facto standard. As use of the format exploded, so did the variations in it. Not everyone implemented the format in exactly the same way, and not everyone supported every new feature that was added as it evolved.

Demand grew both for improved standardization and for a version of PDF suited to long-term document archiving: a file format that would assure that documents could be viewed and printed at any point in the future, regardless of other changes in the rapidly changing world of computer technology. At the same time, paper document storage was rapidly giving way to electronic document storage. Since many documents form a legal record, it was also important that the content of the original documents be preserved exactly and remain viewable and printable indefinitely.

In 1995, Adobe (the original creator of the PDF format) began participating in working groups that develop standards for publication by the International Organization for Standardization (ISO), an effort that came to include a long-term archival PDF standard. In 2005, the first ISO standard was published for the newly named PDF/A format. The “A” in PDF/A stands for “Archive,” and the approved standard was rapidly embraced as an ideal way to store documents for long-term archiving. Prior to the approval of the standard, the Tagged Image File Format (TIFF) was the most widely used archival format.

Some of the advantages of PDF/A over TIFF are:

  • Smaller file sizes.

  • Access to searchable OCR text, not just images.

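  • Support for digital signatures, which can show that a document has not been altered. Note that the usage restrictions available in regular PDFs, such as limiting printing, depend on encryption, which PDF/A does not permit.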

  • Improved document structure. TIFFs have only pages. PDF/A files can have bookmarks, hyperlinks, and tags.

  • TIFFs do not support annotations as part of the document. PDF/A files can include numerous types of annotations, and these can be stored directly in the document.

  • TIFFs do support embedded metadata tags, but PDF/A documents have vastly improved metadata features which include the ability to embed sophisticated XML data in addition to commonly used metadata, such as creator information and keywords.

  • The access to searchable text as well as additional metadata makes it much easier to find PDF/A documents after they have been filed.

The original ISO standard for PDF/A documents was released with two levels of conformance: PDF/A-1a (also called Level 1-A) and PDF/A-1b (also called Level 1-B). PDF/A-1 documents that conform to Level 1-A are required to reproduce the original document accurately and without visual ambiguity, to contain text that maps to Unicode (a character encoding that covers a full international character set, including symbols), and to obey proper content structure rules. PDF/A-1 documents that conform to the less stringent Level 1-B are required only to reproduce the original document accurately and without visual ambiguity. The ISO names for the two levels are ISO 19005-1:2005:PDF/A-1a for Level A compliance and ISO 19005-1:2005:PDF/A-1b for Level B compliance.
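A PDF/A file declares the part and conformance level it claims in its embedded XMP metadata, using the `pdfaid:part` and `pdfaid:conformance` properties. The sketch below shows one way to read that claim from an XMP packet that has already been extracted from a file. The function name `read_pdfa_claim` and the sample packet are illustrative assumptions, not part of any standard tool, and extracting the packet from the PDF itself is left out.

```python
import xml.etree.ElementTree as ET

# Namespace URIs used by XMP packets that carry PDF/A identification.
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
PDFAID_NS = "http://www.aiim.org/pdfa/ns/id/"


def read_pdfa_claim(xmp_packet: str):
    """Return (part, conformance) claimed in an XMP packet, or None.

    Hypothetical helper for illustration. The values may be serialized
    either as attributes on rdf:Description or as child elements, so
    both forms are checked.
    """
    root = ET.fromstring(xmp_packet)
    for desc in root.iter(f"{{{RDF_NS}}}Description"):
        part = desc.get(f"{{{PDFAID_NS}}}part")
        if part is None:
            part = desc.findtext(f"{{{PDFAID_NS}}}part")
        level = desc.get(f"{{{PDFAID_NS}}}conformance")
        if level is None:
            level = desc.findtext(f"{{{PDFAID_NS}}}conformance")
        if part is not None:
            return part.strip(), (level or "").strip().upper()
    return None


# Made-up sample packet claiming PDF/A-1b.
SAMPLE_XMP = """<x:xmpmeta xmlns:x="adobe:ns:meta/">
  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
    <rdf:Description rdf:about=""
        xmlns:pdfaid="http://www.aiim.org/pdfa/ns/id/">
      <pdfaid:part>1</pdfaid:part>
      <pdfaid:conformance>B</pdfaid:conformance>
    </rdf:Description>
  </rdf:RDF>
</x:xmpmeta>"""

print(read_pdfa_claim(SAMPLE_XMP))  # ('1', 'B')
```

This reads only what the file claims about itself. Confirming that a document actually meets the claimed level requires a dedicated PDF/A validator.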

Since the original PDF/A-1 standard was published in 2005, additional PDF/A standards have followed: PDF/A-2 in 2011 and PDF/A-3 in 2012. Both were created to address features introduced in versions 1.5, 1.6, and 1.7 of the increasingly powerful PDF specification. PDF/A-2 allows documents to contain additional types of image compression that help to make color images smaller. In addition, PDF/A-2 provides support for digital signatures and for embedding PDF/A documents within a document. PDF/A-3 differs from PDF/A-2 in only one regard: other types of files may be embedded within a document, instead of just PDF/A files as in PDF/A-2. This includes word processing files, spreadsheets, XML files, and others.

There are a few differences between PDF/A documents and regular PDF documents. Because PDF/A documents are intended for long-term storage in a changing world of technology, they are required to contain everything that could be needed in the future to view or print them. This includes things like fonts. Embedding the fonts used in a document will often make a PDF/A document larger than a regular PDF. Some features, such as transparency, are also forbidden in the first PDF/A standard (PDF/A-1a and PDF/A-1b). That restriction, however, was lifted in PDF/A-2.
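The differences between the PDF/A parts can be summarized as a small lookup table. The sketch below is an illustrative assumption rather than a validator: it encodes only the transparency and embedded-file rules mentioned in this article (plus the fact that PDF/A-1 permits no embedded files), and the names `PDFA_PARTS` and `lowest_part_for` are made up for the example.

```python
# Rules per PDF/A part, limited to the features discussed in this article.
# "embedded_files" is one of: "none", "pdfa_only", "any".
PDFA_PARTS = {
    1: {"transparency": False, "embedded_files": "none"},
    2: {"transparency": True, "embedded_files": "pdfa_only"},
    3: {"transparency": True, "embedded_files": "any"},
}


def lowest_part_for(needs_transparency: bool, attachment: str = "none") -> int:
    """Return the lowest PDF/A part that permits the requested features.

    attachment is "none", "pdfa" (an embedded PDF/A file), or "other"
    (any other file type, such as a spreadsheet).
    """
    if attachment not in ("none", "pdfa", "other"):
        raise ValueError(f"unknown attachment kind: {attachment!r}")
    for part in sorted(PDFA_PARTS):
        rules = PDFA_PARTS[part]
        if needs_transparency and not rules["transparency"]:
            continue
        if attachment == "pdfa" and rules["embedded_files"] == "none":
            continue
        if attachment == "other" and rules["embedded_files"] != "any":
            continue
        return part
    raise ValueError("no PDF/A part permits the requested features")


print(lowest_part_for(False))           # 1
print(lowest_part_for(True))            # 2
print(lowest_part_for(False, "pdfa"))   # 2
print(lowest_part_for(False, "other"))  # 3
```

Choosing the lowest part that works is only one possible policy; an archive may prefer a single part for all of its documents.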

A future article will discuss other PDF-based standards, such as PDF/E and PDF/VT.

Information about the Author
About Me
Joe is the chief technologist for UFC, Inc. He guides the decisions on which products UFC offers as well as research on new software applications under the Jovation and MuWave trademarks.
