We get to see a very wide range of PDF files from a large set of PDF creation tools in the process of developing a Java PDF viewer. PDF files can be constructed in a huge variety of ways and we are constantly looking at ways to make our code faster, use less memory and produce a better quality display. To do this, we have to make assumptions which we may need to change at a later date.
A good example of this is how much we down-sample an image. A PDF can contain some very large images, which are then shrunk down when they are displayed. In our Viewer, we detect huge images and reduce them in size. The user does not notice the difference, except that we use a lot less memory (images use lots of memory in Java) and things are much quicker. Actually, Java does not always do a great job of scaling down huge images and preserving quality, so sometimes we get better image quality! And if we need a bigger version of the image, we regenerate it at a bigger scaling.
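To make the idea concrete, here is a minimal sketch of this kind of down-sampling in plain Java. It is not the code from our Viewer: the class name `DownsampleSketch`, the helpers `sampleFactor` and `downsample`, and the choice of power-of-two factors are all assumptions made for illustration.

```java
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.image.BufferedImage;

class DownsampleSketch {

    // Hypothetical helper: the largest power-of-two factor that still keeps
    // the image at least as big as the area it will be drawn into.
    static int sampleFactor(int imageW, int imageH, int targetW, int targetH) {
        if (targetW <= 0 || targetH <= 0) {
            return 1; // nothing sensible to aim for, so leave the image alone
        }
        int factor = 1;
        while (imageW / (factor * 2) >= targetW && imageH / (factor * 2) >= targetH) {
            factor *= 2;
        }
        return factor;
    }

    // Hypothetical helper: produce a smaller copy of the image.
    // TYPE_INT_RGB is an assumption here; it discards any transparency.
    static BufferedImage downsample(BufferedImage src, int factor) {
        if (factor <= 1) {
            return src;
        }
        int w = Math.max(1, src.getWidth() / factor);
        int h = Math.max(1, src.getHeight() / factor);
        BufferedImage out = new BufferedImage(w, h, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = out.createGraphics();
        g.setRenderingHint(RenderingHints.KEY_INTERPOLATION,
                RenderingHints.VALUE_INTERPOLATION_BILINEAR);
        g.drawImage(src, 0, 0, w, h, null);
        g.dispose();
        return out;
    }
}
```

With these helpers, a 4000 by 4000 image destined for a 500 by 500 area gets a factor of 8, while an image already smaller than its target is left untouched. If a bigger version is needed later, the original can be decoded again with a smaller factor.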
One assumption we made was that an image should never be reduced below the size of the page. This was a reasonable assumption most of the time, until we came across some PDF files where the page was 3000 by 3000 pixels. If we followed our original strategy, we would end up with some huge images which would never be fully displayed onscreen.
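One possible change, sketched below purely as an illustration, would be to cap the lower bound at the size of the area actually visible onscreen. The class `TargetSizeSketch`, the method `minimumImageSize` and the idea of passing in a viewport size are all hypothetical, and this is only one of several options, not a decision we have made.

```java
import java.awt.Dimension;

class TargetSizeSketch {

    // Hypothetical rule: never shrink an image below the page size,
    // unless the page itself is larger than the visible viewport.
    static Dimension minimumImageSize(Dimension page, Dimension viewport) {
        return new Dimension(
                Math.min(page.width, viewport.width),
                Math.min(page.height, viewport.height));
    }
}
```

For a 3000 by 3000 page shown in a 1920 by 1080 window, the lower bound drops to 1920 by 1080, while an ordinary page that fits inside the window keeps the original page-size rule. Zooming in would still need the image to be regenerated at a bigger scaling.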
So should we change our assumptions? Any suggestions?
This post is part of our “Understanding the PDF File Format” series. In each article, we discuss a PDF feature, bug, gotcha or tip. If you wish to learn more about PDF, we have 13 years' worth of PDF knowledge and tips, so click here to visit our series index!
IDRsolutions develop a Java PDF library, a PDF forms to HTML5 converter, a PDF to HTML5 or SVG converter and a Java Image Library that doubles as an ImageIO replacement. On the blog, our team post about anything interesting they learn.