I continue to be impressed by the sheer variation we come across in PDF files, even after 10 years of writing a PDF parser…
Today we had a PDF file created with a tool called PdfGenLib. Here is a sample of what it produces…
The interesting thing is that it appears to be hard-coded to generate a 6-digit number for each PDF object reference (i.e. 000001 0 R rather than 1 0 R). The PDF Reference does not say that you cannot do this, but it adds nothing to the PDF file except to make it larger, and I have not seen any other tool do it.
And it means that some clever code I wrote last week to allow for the first object in the PDF being object 0 0 R (and not object 1 0 R) needed to be modified. Still, it means I will never be short of work 😉
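For anyone hitting the same thing, here is a minimal sketch of reading indirect references so that zero-padded and unpadded forms come out the same. It is written in Python for illustration and is not the code from our parser; the names `REF_PATTERN` and `parse_references` are made up for this example, and it assumes the references appear as plain text in a string you have already extracted.

```python
import re

# Hypothetical pattern for this sketch: object number, generation number, then "R".
# The lookarounds stop it matching in the middle of a longer number or name.
REF_PATTERN = re.compile(r"(?<![\w.])(\d+)\s+(\d+)\s+R(?!\w)")


def parse_references(text: str) -> list[tuple[int, int]]:
    """Return (object number, generation) pairs found in text.

    Converting the digits with int() discards any leading zeros,
    so "000001 0 R" and "1 0 R" both yield (1, 0).
    """
    return [(int(m.group(1)), int(m.group(2))) for m in REF_PATTERN.finditer(text)]


if __name__ == "__main__":
    sample = "/Parent 000001 0 R /Kids [000003 0 R 4 0 R]"
    print(parse_references(sample))
```

The point is simply to compare object numbers as integers rather than as the raw text, so the padding never matters further down the line.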
Do you have any similar experiences with ‘strange’ PDF files?
This post is part of our “Understanding the PDF File Format” series. In each article, we discuss a PDF feature, bug, gotcha or tip. If you wish to learn more about PDF, we have 13 years' worth of PDF knowledge and tips, so click here to visit our series index!