PDF files are versatile creatures; when generated correctly, they display flawlessly on any platform, and they can be zoomed in and out almost indefinitely while still maintaining an accurate representation of the page. But how do they manage it?
Part of the reason is their roots in PostScript. Most people expect that a PDF simply contains marked-up content that a viewer uses to display a document, but it’s not quite as simple as that – it’s more like a description of how to draw a document.
Another factor is that they support vector graphics – or rather they support paths that can have painting operators applied, for example a stroke, fill, clipping boundary, or some combination of these.
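To make the idea of "a path plus a painting operator" a little more concrete, here is a minimal sketch that builds the text of a tiny content stream fragment for a rectangle. The operators themselves (`re`, `S`, `f`, `B`, `W`, `n`) are standard PDF path operators; the function name `rectangle_path` and its parameters are made up for this illustration and are not part of any library.

```python
# Standard PDF path-painting operators, keyed by hypothetical friendly names.
PAINT_OPERATORS = {
    "stroke": "S",           # stroke the path outline
    "fill": "f",             # fill the path interior
    "fill_and_stroke": "B",  # fill, then stroke
    "none": "n",             # end the path without painting it
}

def rectangle_path(x, y, width, height, paint="stroke", clip=False):
    """Return content stream text that draws one rectangle.

    Illustrative helper only: it emits a path construction operator (re),
    an optional clipping operator (W), and one painting operator.
    """
    ops = [f"{x} {y} {width} {height} re"]  # append a rectangle to the path
    if clip:
        ops.append("W")  # mark the path as a clipping boundary
    ops.append(PAINT_OPERATORS[paint])  # the painting operator ends the path
    return "\n".join(ops)

print(rectangle_path(10, 20, 100, 50, paint="fill_and_stroke"))
print(rectangle_path(0, 0, 5, 5, paint="none", clip=True))
```

The second call shows the "clipping boundary only" case: the path is used to clip and is then discarded without being painted.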
How not to…
But what about raster graphics?
It is very common for PDF files to contain chopped up raster images that are then tiled when drawn. Get this wrong, and the consequences can be dire…
And have you ever noticed how a PDF can still manage to maintain a decent representation of the page even when you zoom out?
For shapes, the spec defines that “any pixel whose square region intersects the shape, no matter how small the intersection is” will be painted. What this means is that it’s not possible to make a shape disappear. Even if a line has a width of 0.1px, this will get rounded up rather than down, so that “no shape ever disappears as a result of unfavorable placement”. This is very clever, and it is interesting to see in action when you know to look out for it. Zoom out in this file, and you should see that 1px lines can end up filling the whole page!
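The shape rule is easy to try out in a few lines. The sketch below is only an illustration, not how any particular viewer is implemented: it handles axis-aligned rectangles only, and it assumes that device pixel (c, r) covers the square from (c, r) up to but not including (c + 1, r + 1), with the rectangle's own boundary counted as part of the shape. The function names are invented for this example.

```python
import math

def pixels_for_shape(x0, y0, x1, y1):
    """Pixels painted for an axis-aligned rectangle under the shape rule.

    Every pixel whose square region intersects the rectangle is painted,
    however small the overlap. Coordinates are in device space.
    """
    cols = range(math.floor(x0), math.floor(x1) + 1)
    rows = range(math.floor(y0), math.floor(y1) + 1)
    return {(c, r) for c in cols for r in rows}

def scale_rect(rect, zoom):
    """Scale a rectangle about the origin, as zooming the page would."""
    return tuple(v * zoom for v in rect)

# A vertical line only 0.1px wide and 2.6px tall.
hairline = (2.45, 0.2, 2.55, 2.8)

print(sorted(pixels_for_shape(*hairline)))
# → [(2, 0), (2, 1), (2, 2)]

# Zoom out to 1%: the line is now far smaller than a pixel, yet one
# pixel is still painted, so the shape never disappears.
print(sorted(pixels_for_shape(*scale_rect(hairline, 0.01))))
# → [(0, 0)]
```

This also shows why thin lines can swamp a page when you zoom out a long way: each one still claims at least one full pixel, no matter how little of the page that pixel should represent.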
Raster graphics are handled slightly differently – for raster graphics it is defined that “only those pixels whose centres lie within the region shall be painted”. I find this particularly interesting because it shows that it’s not just a case of scaling the image up or down by the scaling value of the current zoom level and then placing it on the page at the calculated x and y positions; rather it’s actually the opposite – the size that it appears on the page needs to be calculated first, and then the image should be output at the size to fit.
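Here is a rough sketch of the image rule, working in the order described above: first work out which device pixels the image region covers, then look up a source sample for each of those pixels. It is a simplification under stated assumptions: the region is an axis-aligned rectangle in device space, a pixel centre sitting exactly on the right or top edge is treated as outside the region, and the nearest-sample lookup is simply the easiest option to show, not something the quoted rule prescribes. All names are made up for this example.

```python
import math

def centre_span(lo, hi):
    """Indices of pixels whose centre (index + 0.5) lies in [lo, hi)."""
    return range(math.ceil(lo - 0.5), math.ceil(hi - 0.5))

def pixels_for_image(x0, y0, x1, y1):
    """Pixels painted for an image region under the centre rule."""
    return {(c, r) for c in centre_span(x0, x1) for r in centre_span(y0, y1)}

def sample_for_pixel(c, r, region, width, height):
    """Map a device pixel back to a source sample (nearest sample, an
    assumption of this sketch) in a width x height image."""
    x0, y0, x1, y1 = region
    u = (c + 0.5 - x0) / (x1 - x0)  # 0..1 across the region
    v = (r + 0.5 - y0) / (y1 - y0)
    return int(u * width), int(v * height)

# A 4x2 image squeezed into a 2x1 pixel region: only two pixels are
# painted, and each one picks a single source sample.
region = (0.0, 0.0, 2.0, 1.0)
for c, r in sorted(pixels_for_image(*region)):
    print((c, r), "<-", sample_for_pixel(c, r, region, 4, 2))

# Two tiles that meet at x = 1.5: every pixel belongs to exactly one.
left = pixels_for_image(0.0, 0.0, 1.5, 1.0)
right = pixels_for_image(1.5, 0.0, 3.0, 1.0)
print(sorted(left), sorted(right))
# → [(0, 0)] [(1, 0), (2, 0)]
```

The tile check at the end hints at why this rule suits chopped up images: with the centre test, neighbouring tiles that share an edge neither overlap nor leave a gap, whereas the "any intersection" rule used for shapes would paint the shared column of pixels twice.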
This post is part of our “Understanding the PDF File Format” series. In each article, we aim to take a specific PDF feature and explain it in simple terms. If you wish to learn more about PDF, we have 13 years’ worth of PDF knowledge and tips, so click here to visit our series index!
IDRsolutions develop a Java PDF library, a PDF forms to HTML5 converter, a PDF to HTML5 or SVG converter and a Java Image Library that doubles as an ImageIO replacement. On the blog, our team post about anything interesting they learn.