How do PDF files manage limitless position accuracy of shapes & images?

PDF files are versatile creatures; when generated correctly they will happily display flawlessly on any platform, and they can handle being scaled both in and out almost infinitely and still maintain an accurate representation of the page. But how do they manage it?

Part of the reason is their roots in PostScript. Most people expect that a PDF simply contains marked up content that a viewer uses to display a document, but it’s not quite as simple as that – it’s more like a description of how to draw a document.

Another factor is that they support vector graphics – or rather they support paths that can have painting operators applied, for example a stroke, fill, clipping boundary, or some combination of these.
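To make "a path with a painting operator applied" a little more concrete, here is a minimal Python sketch that builds the text of a tiny content stream. The operators themselves (`re`, `w`, `S`, `f`, `B`, `W`, `n`) are standard PDF operators, but the helper names `rect_path`, `paint` and `PAINT_OPS` are made up for this illustration and do not come from any particular library.

```python
# Hypothetical helpers for illustration only; the operator strings are
# standard PDF content stream operators.

def rect_path(x, y, w, h):
    """Path construction: append a rectangle to the current path (`re`)."""
    return f"{x} {y} {w} {h} re"

# Painting operators that can be applied to a constructed path.
PAINT_OPS = {
    "stroke": "S",
    "fill": "f",
    "fill_and_stroke": "B",
    # W marks the path as a clipping boundary; n ends the path without painting.
    "clip": "W n",
}

def paint(path, mode, line_width=None):
    """Return content stream text that draws `path` using the given mode."""
    parts = []
    if line_width is not None:
        parts.append(f"{line_width} w")  # set the stroke line width
    parts.append(path)
    parts.append(PAINT_OPS[mode])
    return "\n".join(parts)

# A very thin stroked rectangle, in user space units.
stream = paint(rect_path(10, 10, 100, 50), "stroke", line_width=0.1)
print(stream)
```

The point to notice is that the path is described once and the painting operator decides what happens to it, which is why the same geometry can be stroked, filled, or used as a clip.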

How not to…

How to!

But what about raster graphics?

It is very common for PDF files to contain chopped up raster images that are then tiled when drawn. Get this wrong, and the consequences can be dire…

And have you ever noticed how a PDF can still manage to maintain a decent representation of the page even when you zoom out?

I have recently been reading the PDF spec to learn how we can replicate this with our PDF to HTML5 Converter and Java PDF Library, and have made a great discovery.

For shapes, the spec defines that “any pixel whose square region intersects the shape, no matter how small the intersection is” will be painted. What this means is that it’s not possible to make a shape disappear. Even if a line has a width of 0.1px, this will get rounded up rather than down, so that “no shape ever disappears as a result of unfavorable placement”. This is very clever, and it is interesting to see it in action once you know to look out for it. Zoom out in this file, and you should see that 1px lines can end up filling the whole page!
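As a rough illustration of the shape rule, the Python sketch below works out which pixels an axis-aligned rectangle would paint if every pixel it touches is filled. It is a simplification under stated assumptions: coordinates are already in device space, pixel (c, r) covers the square from (c, r) to (c+1, r+1), and the rectangle has x0 < x1 and y0 < y1. The function name `pixels_touched_by_rect` is invented for this example.

```python
import math

def pixels_touched_by_rect(x0, y0, x1, y1):
    """Return (col, row) for every pixel whose square overlaps the rectangle.

    Assumes device-space coordinates with x0 < x1 and y0 < y1.
    """
    cols = range(math.floor(x0), math.ceil(x1))
    rows = range(math.floor(y0), math.ceil(y1))
    return [(c, r) for r in rows for c in cols]

# A vertical line 0.1px wide that sits inside one pixel column
# still paints that whole column.
thin = pixels_touched_by_rect(2.45, 0, 2.55, 3)

# The same width straddling a pixel boundary paints two columns.
straddling = pixels_touched_by_rect(2.95, 0, 3.05, 3)

# Zooming out: a 1-unit-wide line drawn at 1% scale is 0.01px wide,
# yet it still claims a full pixel column.
scale = 0.01
zoomed_out = pixels_touched_by_rect(50 * scale, 0, 51 * scale, 1)

print(len(thin), len(straddling), len(zoomed_out))
```

Because the width never drops below one pixel, many thin lines drawn close together can cover far more of the page when zoomed out than their true widths would suggest.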

Raster graphics are handled slightly differently: for raster graphics it is defined that “only those pixels whose centres lie within the region shall be painted”. I find this particularly interesting because it shows that it’s not just a case of scaling the image up or down by the scaling value of the current zoom level and then placing it on the page at the calculated x and y positions. Rather, it’s actually the opposite: the size that the image appears on the page needs to be calculated first, and then the image should be output at the size that fits.
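Here is a small one-dimensional Python sketch of the pixel-centre rule, to show why the output size has to be derived from where the image lands on the page. It assumes device-space coordinates, pixel i having its centre at i + 0.5, and a region that includes its left edge but not its right edge (that half-open choice is an assumption of this sketch). The name `pixels_by_centre` is invented for the example.

```python
import math

def pixels_by_centre(x0, x1):
    """Return the pixel indices whose centre (i + 0.5) lies in [x0, x1)."""
    first = math.ceil(x0 - 0.5)
    end = math.ceil(x1 - 0.5)  # exclusive upper bound
    return range(first, end)

# Two image tiles of identical size (10.3px) that share an edge at x = 10.3.
tile_a = pixels_by_centre(0.0, 10.3)
tile_b = pixels_by_centre(10.3, 20.6)

# The output width depends on position, not just on the scaled size.
print(len(tile_a), len(tile_b))

# Every pixel belongs to exactly one tile: no gap and no overlap.
covered = list(tile_a) + list(tile_b)
print(covered == list(range(covered[0], covered[-1] + 1)))
```

Both tiles have the same scaled width, but they end up with different pixel widths because of where their edges fall. Scaling each tile to a rounded size independently of its position is the kind of shortcut that leaves gaps or overlaps between tiles.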

This post is part of our “Understanding the PDF File Format” series. In each article, we aim to take a specific PDF feature and explain it in simple terms. If you wish to learn more about PDF, we have 13 years’ worth of PDF knowledge and tips, so click here to visit our series index!

Leon is a developer at IDRsolutions and product manager for JPDF2HTML5. He is responsible for managing the JPDF2HTML5 product strategy and roadmap, and also spends a lot of his time writing code to build new features, improve functionality, fix bugs, and improve the testing for JPDF2HTML5.
Leon Atherton
