Why we need to see your PDF files…


What makes writing a PDF parser especially interesting (i.e. complex) is that the specification is often ambiguous and that a PDF file is a very complex structure. To display a PDF file, the parser has to correctly scan the PDF object data structure, correctly decode and assemble all the data, and then parse the stream of PostScript-like drawing commands. There could be issues at any level.
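To give a flavour of what "issues at any level" means, here is a minimal sketch of the very first level only: checking that the raw bytes even look like a PDF. This is not our parser's code; the class name `PdfSanityCheck`, the `Problem` enum and the choice to search only the first 1024 bytes for the header are all assumptions made for illustration. It relies only on the fact that a PDF contains a `%PDF-` header, a `startxref` keyword and a closing `%%EOF` marker.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of any real PDF library.
final class PdfSanityCheck {

    // The first structural marker found to be missing, or NONE if all are present.
    enum Problem { NONE, MISSING_HEADER, MISSING_STARTXREF, MISSING_EOF }

    static Problem check(byte[] data) {
        // ISO-8859-1 maps every byte to exactly one char, so binary content is preserved.
        String text = new String(data, StandardCharsets.ISO_8859_1);

        // Assumption: tolerate a little junk before the header, but only within the first 1024 bytes.
        int header = text.indexOf("%PDF-");
        if (header < 0 || header >= 1024) {
            return Problem.MISSING_HEADER;
        }

        // The last "startxref" keyword points at the most recent cross-reference data.
        int startxref = text.lastIndexOf("startxref");
        if (startxref < 0) {
            return Problem.MISSING_STARTXREF;
        }

        // The end-of-file marker should follow that final startxref.
        if (text.indexOf("%%EOF", startxref) < 0) {
            return Problem.MISSING_EOF;
        }
        return Problem.NONE;
    }

    public static void main(String[] args) {
        byte[] sample = "%PDF-1.7\n1 0 obj\nendobj\nstartxref\n9\n%%EOF\n"
                .getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(check(sample));
    }
}
```

A file that passes this check can still fail at every later stage (object lookup, stream decoding, command parsing), which is exactly why a description of the symptom is rarely enough to go on.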

Occasionally we have to tweak our parser to fix bugs in our code, to handle things we had not considered, to cope with areas where the PDF does something which is permissible but not clear from the spec, or even to accept cases where the PDF does not actually follow the specification. Most writers of PDF creation tools create PDFs according to their own interpretation of the PDF specification and, if the file opens in Acrobat, they leave it at that. If it does not open in our parser, it is obviously our fault, not theirs.

Over time we have become very adept at tweaking our code to allow for all the little idiosyncrasies of various PDF tools – we have lots of interesting internal flags in our source code, and the excellent tracing in IntelliJ IDEA (my preferred Java IDE) allows us to follow the flow through code we know very well. It is normally a quick fix followed by a regression test.

Sometimes, people send screenshots or just say the file does not open. Unfortunately, it is very hard to help in these cases. Send us the file and we can quickly find the issue. Screenshots are generally like giving a car mechanic a picture of your car and asking what is wrong – let him open up the bonnet and hear the engine and you’ll get a quick answer.



One Reply to “Why we need to see your PDF files…”

  1. Even though our customers often have confidential PDFs, we have still been able to send effective test cases. If you have the full version of Acrobat, there is a Redaction tool that can remove all but the offending page, and even most of the text on that page. If this stripped-down, redacted PDF still shows the error when you open it in JPedal, it’s just as good as the original. Foxit PDF Editor can do the same job.


IDRsolutions Ltd 2020. All rights reserved.