I continue to be impressed by the sheer variety we come across in PDF files, even after 10 years of writing a PDF parser…
Today we had a PDF file created with a tool called PdfGenLib. Here is a sample of what it produces…
The interesting thing is that it appears to be hard-coded to generate a 6-digit number for each PDF object reference (i.e. 000001 0 R rather than 1 0 R). The PDF reference does not say that you cannot do this, but it adds nothing to the PDF file except to make it larger, so I have not seen any other tool do this.
And it means that some clever code I wrote last week to allow for the first object in the PDF being object 0 0 R (and not object 1 0 R) needed to be modified. Still, it means I will never be short of work 😉
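To make the point concrete, here is a minimal sketch of reading an object reference in a way that does not care how many leading zeros the writer used. It is not the code from our parser: the function name `parse_reference` and the regular expression are assumptions made purely for illustration, and it only handles a single, already isolated reference token.

```python
import re
from typing import Tuple

# Hypothetical helper for illustration only (not our parser's code).
# Matches "<object number> <generation number> R", with the two numbers
# separated by one or more PDF whitespace characters.
_PDF_WS = r"[\x00\t\n\f\r ]+"
_REFERENCE = re.compile(r"([0-9]+)" + _PDF_WS + r"([0-9]+)" + _PDF_WS + r"R")


def parse_reference(token: str) -> Tuple[int, int]:
    """Return (object_number, generation) for a token such as '000001 0 R'."""
    match = _REFERENCE.fullmatch(token.strip("\x00\t\n\f\r "))
    if match is None:
        raise ValueError("not an indirect reference: %r" % token)
    # int() ignores leading zeros, so '000001' and '1' give the same value,
    # and '0' is accepted as a valid object number.
    return int(match.group(1)), int(match.group(2))


print(parse_reference("000001 0 R"))
print(parse_reference("1 0 R"))
print(parse_reference("0 0 R"))
```

The idea is to compare the numeric value of the reference rather than its text, so padded and unpadded forms resolve to the same object and object 0 needs no special text-matching case.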
Do you have any similar experiences with ‘strange’ PDF files?
This post is part of our “Understanding the PDF File Format” series. In each article, we discuss a PDF feature, bug, gotcha or tip. If you wish to learn more about PDF, we have 13 years' worth of PDF knowledge and tips, so click here to visit our series index!