At IDR Solutions, while working on some files last week, we came across a very good example of why it is a bad idea to edit a PDF file directly. Let me share the story…
One of our customers wanted to remove some annotations from a PDF file, so they deleted the /Annots entry from the Page object by hand. They then wondered why the file was so much slower to load and render.
In theory this looks like a minor edit. But a PDF file is not an ordinary file. It is a data dump with a look-up table (the cross-reference table) at the end. A PDF viewer can read just the look-up table and then jump straight to the objects it needs using random access. This is one reason why opening a PDF and moving around it is very fast.
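To make the look-up idea concrete, here is a minimal Python sketch. It is not how any particular viewer is implemented. The helper names build_pdf and read_object are made up for illustration, and the sketch assumes a classic, uncompressed cross-reference table with a single subsection starting at object 0.

```python
import re

def build_pdf(objects):
    """Assemble a tiny PDF-like byte string with a cross-reference table (illustrative helper)."""
    out = bytearray(b"%PDF-1.4\n")
    offsets = []
    for num, body in enumerate(objects, start=1):
        offsets.append(len(out))              # byte position where this object starts
        out += b"%d 0 obj\n%s\nendobj\n" % (num, body)
    xref_pos = len(out)
    out += b"xref\n0 %d\n" % (len(objects) + 1)
    out += b"0000000000 65535 f \n"           # entry 0 is the head of the free list
    for off in offsets:
        out += b"%010d 00000 n \n" % off      # every entry is exactly 20 bytes long
    out += b"trailer\n<< /Size %d /Root 1 0 R >>\n" % (len(objects) + 1)
    out += b"startxref\n%d\n%%%%EOF\n" % xref_pos
    return bytes(out)

def read_object(data, num):
    """Locate object `num` using only the end of the file and the look-up table."""
    # The last few bytes tell us where the table starts.
    xref_pos = int(re.search(rb"startxref\s+(\d+)", data[-64:]).group(1))
    # Skip the "xref" line and the subsection header line.
    entries_start = data.index(b"\n", data.index(b"\n", xref_pos) + 1) + 1
    entry = data[entries_start + 20 * num : entries_start + 20 * num + 20]
    offset = int(entry[:10])                  # first 10 digits are the byte offset
    end = data.index(b"endobj", offset) + len(b"endobj")
    return offset, data[offset:end]

pdf = build_pdf([
    b"<< /Type /Catalog /Pages 2 0 R >>",
    b"<< /Type /Pages /Kids [3 0 R] /Count 1 >>",
    b"<< /Type /Page /Parent 2 0 R /Annots [] >>",
])
offset, page_obj = read_object(pdf, 3)
print(offset, page_obj)
```

Note that read_object never scans the objects before the one it wants: it reads the tail, one 20-byte table entry, and then the object itself.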
However, if you edit the file so that one of the objects becomes shorter, every object that follows it will be in a different place from the one recorded in the look-up table. Most PDF tools will spot this. They will then read the entire file and work out what the correct look-up table positions should be, if this is possible. This is a much slower process. Sometimes, manually editing the PDF file will make it totally unusable.
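The effect of a hand edit on the recorded positions can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the names layout and stale_entries are hypothetical, the objects are toy dictionaries, and the offsets are kept in a Python dict rather than in a real cross-reference table.

```python
def layout(objects):
    """Concatenate numbered objects and record each one's byte offset."""
    data, offsets = b"%PDF-1.4\n", {}
    for num, body in objects.items():
        offsets[num] = len(data)
        data += b"%d 0 obj\n%s\nendobj\n" % (num, body)
    return data, offsets

def stale_entries(data, offsets):
    """Return the object numbers whose recorded offset no longer points at their header."""
    return [num for num, off in offsets.items()
            if not data.startswith(b"%d 0 obj" % num, off)]

objects = {
    1: b"<< /Type /Catalog /Pages 2 0 R >>",
    2: b"<< /Type /Pages /Kids [3 0 R] /Count 1 >>",
    3: b"<< /Type /Page /Parent 2 0 R /Annots [4 0 R] >>",
    4: b"<< /Type /Annot /Subtype /Text >>",
}
data, offsets = layout(objects)

# Hand edit: remove the /Annots entry from the page, leaving the recorded offsets untouched.
edited = data.replace(b" /Annots [4 0 R]", b"")

print(stale_entries(data, offsets))    # → []
print(stale_entries(edited, offsets))  # → [4]
```

Only the object after the edited one is affected here because the toy file is so small. In a real document, every object stored after the edit point would have a wrong offset, which is what forces a viewer to fall back to scanning the whole file.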
So if you need to edit a PDF file, please use a proper tool (like iText), which will let you delete objects and then correctly update the look-up tables in the PDF file. It will make your life much easier…