What are PDF Xref tables?
Xref tables are part of the original PDF file specification and one of the features that give the PDF file format its flexibility. An xref table sits just before the trailer, and the startxref keyword at the very end of the file gives its byte offset. A PDF file can contain multiple linked xref tables, all of which need to be read. The pointer to the first xref table to read (the most recent one) is usually found in the last 1024 bytes of the PDF file. Linearized PDF files include some extra data at the start of the file so you can start displaying the file before it has been fully read.
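As a rough illustration of how a reader might locate that pointer, here is a minimal Python sketch. The function name find_startxref and the sample bytes are made up for this example; a real file would be read from disk, and a robust reader would handle more edge cases.

```python
import re

def find_startxref(pdf_bytes: bytes, tail_size: int = 1024) -> int:
    """Return the byte offset written after the last 'startxref' keyword."""
    # Only the end of the file is inspected, as described above.
    tail = pdf_bytes[-tail_size:]
    # Search backwards so an updated file yields its most recent pointer.
    pos = tail.rfind(b"startxref")
    if pos == -1:
        raise ValueError("no startxref keyword in the last %d bytes" % tail_size)
    match = re.match(rb"startxref\s+(\d+)", tail[pos:])
    if match is None:
        raise ValueError("startxref is not followed by an offset")
    return int(match.group(1))

# Hand-made fragment for illustration only, not a complete PDF.
sample = (
    b"...objects...\n"
    b"xref\n0 1\n0000000000 65535 f \n"
    b"trailer\n<< /Size 1 >>\n"
    b"startxref\n14\n%%EOF\n"
)
offset = find_startxref(sample)
print(offset)                     # prints 14
print(sample[offset:offset + 4])  # prints b'xref'
```

The returned number is a byte offset from the start of the file, so the reader can jump straight to the xref keyword without scanning anything in between.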
If you open a PDF file in a text editor and search for the word ‘xref’, you will find something like this:
xref
0 271
0000000000 65535 f
0000000015 00000 n
0000000102 00000 n
This is the xref table. A PDF consists of lots of COS objects, and the table tells you where each one is located in the file. This is very useful: a PDF reader only has to read these values and can then load each object when it is needed. It does not need to parse or load the whole file.
The two numbers after the xref keyword describe the table entries. In this case the object numbers start at zero and the xref table has 271 entries. Each following entry gives the object's offset from the start of the file, then the generation number (you can have several revisions of an object) and a flag to say whether the object is in use (n) or free (f). If the PDF file has been edited and objects changed, the changed versions are often appended to the end of the PDF along with an updated xref table showing the new locations. So it is possible for a PDF file to contain several xref tables, and the values in the later tables take precedence.
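The entry layout described above can be sketched in a few lines of Python. This is an illustrative parser, not production code: the names XrefEntry and parse_xref_section are invented here, and the sample is shortened to three entries so that the count matches the lines shown.

```python
from typing import Dict, NamedTuple

class XrefEntry(NamedTuple):
    offset: int      # byte offset for in-use entries (next free object number for free ones)
    generation: int
    in_use: bool     # True for 'n', False for 'f'

def parse_xref_section(text: str) -> Dict[int, XrefEntry]:
    """Parse a classic (uncompressed) xref section into {object number: entry}."""
    lines = [ln.strip() for ln in text.strip().splitlines() if ln.strip()]
    if not lines or lines[0] != "xref":
        raise ValueError("section must start with the xref keyword")
    entries: Dict[int, XrefEntry] = {}
    i = 1
    while i < len(lines):
        if lines[i].startswith("trailer"):
            break  # the trailer follows the last entry
        # Header line: first object number, then the number of entries.
        first_obj, count = (int(x) for x in lines[i].split())
        i += 1
        for obj_num in range(first_obj, first_obj + count):
            offset, generation, flag = lines[i].split()
            entries[obj_num] = XrefEntry(int(offset), int(generation), flag == "n")
            i += 1
    return entries

# Shortened, hand-written sample (3 entries instead of 271).
sample_xref = """xref
0 3
0000000000 65535 f
0000000015 00000 n
0000000102 00000 n
trailer
"""
table = parse_xref_section(sample_xref)
print(table[1])  # XrefEntry(offset=15, generation=0, in_use=True)
```

To honour several xref tables in one file, a reader could parse each section and let entries from later tables overwrite earlier ones in the same dictionary.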
If you look at byte offset 15 in the PDF file I took the xref table from, you will find the start of object 1:
1 0 obj<</Type/Font ...
If you are looking at a PDF file created with version 1.5 or above, you may not find an xref table at all, because PDF 1.5 introduced an alternative: the cross-reference data can be stored in a compressed stream, and the objects themselves can be packed inside compressed object streams.
Xref tables also explain why a PDF file becomes corrupted if you add or remove a byte: everything after that point moves, so all the pointers to it are now wrong.
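A small Python sketch makes this concrete. It builds a tiny hand-made file fragment in which object 1 starts at byte offset 15, then inserts a single byte ahead of it. The helper name object_starts_at and the fragment itself are assumptions for illustration only.

```python
def object_starts_at(pdf_bytes: bytes, offset: int, obj_num: int, generation: int = 0) -> bool:
    """Check that 'obj_num generation obj' begins exactly at the given offset."""
    expected = b"%d %d obj" % (obj_num, generation)
    return pdf_bytes.startswith(expected, offset)

# Hand-made fragment: a 15-byte header followed by object 1.
header = b"%PDF-1.4\n%\xe2\xe3\xcf\xd3\n"
pdf = header + b"1 0 obj<</Type/Font>>\nendobj\n"

print(len(header))                     # prints 15
print(object_starts_at(pdf, 15, 1))    # prints True

# Insert one extra byte before the object: the stored offset is now stale.
edited = pdf[:9] + b" " + pdf[9:]
print(object_starts_at(edited, 15, 1))  # prints False
print(object_starts_at(edited, 16, 1))  # prints True
```

The object is still intact in the edited bytes, but the xref entry would still say 15, which is why tools that modify a PDF have to rewrite or append an updated xref table.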
Is there an easy way to explore PDF Xref tables?
Our Java PDF Viewer has an inspection mode which you can use to view Xref tables and see what data they point to.
It works in the trial version as well (so you do not need to buy our software to use it). Find out more…