Large images in PDF files (and why I think they are a bad idea)

You can embed images in a PDF file as either vector graphics or bitmapped (pixel) images. This gives you huge power and flexibility but, as with many things, the PDF file specification gives you these powers without necessarily giving you the wisdom to use them well.

I have been looking at a customer PDF which does not open in our Java PDF viewer (it does sometimes happen, and then we jump on the case). It turns out that this PDF file contains a very large bitmapped image (19,000 pixels by 12,000 pixels, to be exact). This means you can zoom into it in great detail, but most people are only going to see a heavily scaled-down version with most of the detail removed. And the large image makes the PDF file much bigger, even with compression.

If we decode this into an image, it is going to need a lot of memory (about 900 megabytes to hold the data if we include transparency, which we need for clipping). We can down-sample the image, but there are no clues as to what the best down-sampled size would be or how we should do the down-sampling. In our viewer we can choose the viewable window, but we are looking at PDF to HTML5 conversion here. We would either have to guess or adopt the strategy Google Maps uses and build lots of different tiled versions at different resolutions.
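To make those numbers concrete, here is a rough sketch of the arithmetic in Java. The class and method names are made up for illustration (they are not from our viewer or any PDF library), and the 4 bytes per pixel and 256-pixel tile size are assumptions: 4 bytes matches an ARGB raster, and 256 is just a commonly used tile size for map-style viewers.

```java
/** Rough sizing helpers. All names here are hypothetical, for illustration only. */
final class LargeImageMath {

    /** Bytes needed to hold an uncompressed raster at the given bytes per pixel. */
    static long rawBytes(int width, int height, int bytesPerPixel) {
        // Cast to long first: 19,000 x 12,000 x 4 overflows a 32-bit int.
        return (long) width * height * bytesPerPixel;
    }

    /** Tiles needed to cover an image of this size (partial edge tiles count as whole tiles). */
    static long tilesAt(int width, int height, int tileSize) {
        long cols = (width + tileSize - 1) / tileSize;
        long rows = (height + tileSize - 1) / tileSize;
        return cols * rows;
    }

    /**
     * Number of resolution levels if we keep halving the image
     * (rounding up) until the whole thing fits inside a single tile.
     */
    static int pyramidLevels(int width, int height, int tileSize) {
        int levels = 1; // the full-resolution level
        while (width > tileSize || height > tileSize) {
            width = (width + 1) / 2;
            height = (height + 1) / 2;
            levels++;
        }
        return levels;
    }
}
```

For the 19,000 by 12,000 image, `LargeImageMath.rawBytes(19000, 12000, 4)` gives 912,000,000 bytes, which is where the "about 900 megabytes" comes from. Under the 256-pixel tile assumption, the full-resolution level alone needs `tilesAt(19000, 12000, 256)` = 3,525 tiles, and `pyramidLevels(19000, 12000, 256)` comes to 8 levels, so the tiled approach is a lot of extra output for a single image.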

There are lots of different down-sampling strategies, which produce versions of different quality at different speeds.
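As a rough illustration of that trade-off, here is a minimal sketch using the standard Java2D classes. The `DownSampler` class and its method names are hypothetical, and this is not necessarily how our own code does it. The first method scales in one pass with whichever interpolation hint you pass in (nearest neighbour is generally the fastest and roughest, bicubic the slowest and smoothest). The second halves the image repeatedly with bilinear interpolation, a commonly suggested approach which tends to look better for large reductions at the cost of extra passes.

```java
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.image.BufferedImage;

/** Two down-sampling sketches. Names are hypothetical, for illustration only. */
final class DownSampler {

    /** Single-pass scale using the given RenderingHints.VALUE_INTERPOLATION_* hint. */
    static BufferedImage scale(BufferedImage src, int w, int h, Object interpolationHint) {
        // ARGB keeps the alpha channel, which the post notes is needed for clipping.
        BufferedImage dst = new BufferedImage(w, h, BufferedImage.TYPE_INT_ARGB);
        Graphics2D g = dst.createGraphics();
        try {
            g.setRenderingHint(RenderingHints.KEY_INTERPOLATION, interpolationHint);
            g.drawImage(src, 0, 0, w, h, null);
        } finally {
            g.dispose();
        }
        return dst;
    }

    /**
     * Multi-pass scale: halve each dimension (never going below the target)
     * until the target size is reached. Assumes a positive target that is no
     * larger than the source; if the source is already at or below the
     * target, it is returned unchanged.
     */
    static BufferedImage scaleInSteps(BufferedImage src, int targetW, int targetH) {
        BufferedImage current = src;
        int w = src.getWidth();
        int h = src.getHeight();
        while (w > targetW || h > targetH) {
            w = Math.max(targetW, w / 2);
            h = Math.max(targetH, h / 2);
            current = scale(current, w, h, RenderingHints.VALUE_INTERPOLATION_BILINEAR);
        }
        return current;
    }
}
```

Note that both methods still need the full-size source image in memory to begin with, so on their own they do nothing for the 900 megabyte problem. They only decide what the smaller output looks like and how long it takes to produce.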

If you want complex diagrams with the ability to zoom in, the best format to use is vector graphics. This produces smaller, much better quality PDF files which use less memory. Do you agree with me?

This post is part of our “Understanding the PDF File Format” series. In each article, we discuss a PDF feature, bug, gotcha or tip. If you wish to learn more about PDF, we have 13 years' worth of PDF knowledge and tips, so click here to visit our series index!

Mark Stephens

System Architect and Lead Developer at IDRSolutions
Mark Stephens has been working with Java and PDF since 1999 and has diversified into HTML5, SVG and JavaFX. He also enjoys speaking at conferences and has been a Speaker at user groups, Business of Software, Seybold and JavaOne conferences. He has a very dry sense of humor and an MA in Medieval History for which he has not yet found a practical use.