Understanding the PDF File Format – Multiple trailers in a PDF file

A PDF file consists of a ‘dump’ of PDF objects plus a reference table that records where each object is located in the PDF file and which object is the root. This makes the PDF file format very powerful, because objects only need to be read as required.

The PDF file format was also designed so that it could be easily updated. Rather than having to rewrite the whole file, it allows you to add new or changed objects onto the end of the file and then add a new reference table describing the changes. So you might have a hypothetical PDF file containing objects 1, 2, 3 and 4 with a reference table. Then you edit the PDF file, producing a new version of object 4 and a new object 5. The updated object 4 and the new object 5 are added to the end of the file, followed by a new table telling the PDF viewer to use this new version of object 4. The original version of object 4 can still be in the file but is now ignored.
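To make the append-only idea concrete, here is a minimal Python sketch of writing such an update. It is not a full PDF writer: the function name `append_update` and its parameters are our own invention for illustration, it assumes every object has generation number 0, and it only covers classic reference tables (not cross-reference streams). The byte offset 500 used for the previous table is made up.

```python
def append_update(original: bytes, changed: dict, root_ref: bytes,
                  size: int, prev_offset: int) -> bytes:
    """Append new or changed objects plus a new reference table.

    Hypothetical helper: the original bytes are copied unchanged and
    everything new is written after them.
    """
    out = bytearray(original)
    if not out.endswith(b"\n"):
        out += b"\n"

    offsets = {}
    for num in sorted(changed):
        offsets[num] = len(out)  # byte offset where this object starts
        out += b"%d 0 obj\n%s\nendobj\n" % (num, changed[num])

    table_offset = len(out)
    out += b"xref\n"
    for num in sorted(offsets):
        # One single-entry subsection per object keeps the sketch simple.
        # Each entry is a 10-digit offset, a 5-digit generation and 'n'.
        out += b"%d 1\n%010d 00000 n \n" % (num, offsets[num])

    # /Prev points back at the older table so a reader can follow the chain.
    out += b"trailer\n<< /Size %d /Root %s /Prev %d >>\n" % (
        size, root_ref, prev_offset)
    out += b"startxref\n%d\n%%%%EOF\n" % table_offset
    return bytes(out)


# Stand-in for an existing file whose reference table sits at offset 500.
original = b"%PDF-1.4\n(objects 1 to 4 and the first table go here)\n%%EOF\n"
updated = append_update(
    original,
    {4: b"<< /Note (new version of object 4) >>", 5: b"(brand new object 5)"},
    root_ref=b"1 0 R",
    size=6,
    prev_offset=500,
)
```

The important property is that `updated` starts with exactly the bytes of `original`. Nothing already in the file is moved or rewritten, which is why the old version of object 4 is still present but no longer referenced.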

The new reference table will contain the locations of the changed objects and a /Prev pointer to the previous table. You can chain any number of reference tables. So the way we would read our hypothetical PDF would be to read the newest table first and note the locations of objects 4 and 5. We would also note that there is a /Prev entry and then go to that table. There we would read the locations of objects 1, 2 and 3 but ignore object 4 because we have already found a newer version. That table has no /Prev entry, so we would stop there.
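The reading order described above can be sketched in a few lines of Python. This is a simplified model rather than a real parser: we assume the tables have already been parsed into dictionaries keyed by their byte offset, and the names `resolve_offsets`, `entries` and `prev`, along with all the offsets, are made up for this example.

```python
def resolve_offsets(tables: dict, newest_table_offset: int) -> dict:
    """Walk the /Prev chain from the newest table back to the oldest.

    The first location found for an object number wins, because it comes
    from the most recent table.
    """
    offsets = {}
    visited = set()
    pos = newest_table_offset
    while pos is not None:
        if pos in visited:
            # Guard against a damaged file whose /Prev entries form a loop.
            raise ValueError("circular /Prev chain")
        visited.add(pos)
        table = tables[pos]
        for obj_num, obj_offset in table["entries"].items():
            # setdefault keeps an existing (newer) entry and ignores older ones.
            offsets.setdefault(obj_num, obj_offset)
        pos = table.get("prev")  # None when there is no /Prev entry
    return offsets


# Our hypothetical file: the original table at offset 500 lists objects 1-4,
# and the update table at offset 900 lists the new object 4 and object 5.
tables = {
    500: {"entries": {1: 15, 2: 80, 3: 150, 4: 300}},
    900: {"entries": {4: 620, 5: 760}, "prev": 500},
}
resolved = resolve_offsets(tables, 900)
```

Here `resolved[4]` is 620, the location of the updated object, while the older location 300 from the first table is ignored.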

The location of the first table we read (the most recently added one) is always found at the end of the file, so if we are appending data, it is easy to add a new pointer to the latest reference table at the end.
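In the file itself, this pointer is the number that follows the `startxref` keyword just before the final `%%EOF` marker. As a rough sketch, assuming the whole file is already in memory and that looking at the last 1024 bytes is enough (the function name and the tail size are our own choices), we could find it like this:

```python
import re


def find_last_table_offset(pdf_bytes: bytes, tail_size: int = 1024) -> int:
    """Return the byte offset given by the final startxref in the file."""
    tail = pdf_bytes[-tail_size:]
    matches = re.findall(rb"startxref\s+(\d+)", tail)
    if not matches:
        raise ValueError("no startxref found near the end of the file")
    # An updated file contains one startxref per update; the last one wins.
    return int(matches[-1])


# Stand-in for a file that has been updated once (offsets are made up).
sample = (
    b"%PDF-1.4\n(original objects and table)\n"
    b"startxref\n500\n%%EOF\n"
    b"(appended objects and table)\n"
    b"startxref\n900\n%%EOF\n"
)
start = find_last_table_offset(sample)
```

For the sample above, `start` is 900, so a reader would begin with the newest table and only reach the table at offset 500 by following /Prev.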

This is one of the key features which gives the PDF file format its power and flexibility.

This post is part of our “Understanding the PDF File Format” series. In each article, we aim to take a specific PDF feature and explain it in simple terms. If you wish to learn more about PDF, we have 13 years' worth of PDF knowledge and tips in our series index.

Mark Stephens

System Architect and Lead Developer at IDRSolutions
Mark Stephens has been working with Java and PDF since 1999 and has diversified into HTML5, SVG and JavaFX. He also enjoys speaking at conferences and has been a Speaker at user groups, Business of Software, Seybold and JavaOne conferences. He has a very dry sense of humor and an MA in Medieval History for which he has not yet found a practical use.
