One of my favourite features of the PDF file format is its flexibility. You can do just about anything with it. The drawback is that this flexibility does not always get used to best effect.
I was investigating a PDF file issue today for our Java PDF renderer. The issue turned out to be that the PDF file contained an RGB image encoded at 16 bits per colour component. Here is the object information in Acrobat.
The PDF file format lets you choose a bit depth for your images through the BitsPerComponent entry (1, 2, 4, 8 or 16 bits per colour component). Most RGB images use 8 bits as a trade-off between quality and size. I cannot recollect ever seeing a 16-bit image in 11 years. I deal with the image by down-sampling it to 8 bits, and there is no visible loss in quality.
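To make the down-sampling step concrete, here is a minimal sketch in Java. It is not the code from our renderer: the class and method names are hypothetical, and it assumes the image stream has already been decoded into raw samples with no special Decode array. PDF stores each 16-bit sample high-order byte first, so the simplest reduction is to keep that byte and drop the low-order one.

```java
// Hypothetical helper, for illustration only.
final class SixteenToEightBit {

    /**
     * Reduces 16-bit samples (two bytes each, high-order byte first)
     * to 8-bit samples by keeping only the high-order byte of each pair.
     */
    static byte[] downsample(byte[] samples16) {
        if (samples16.length % 2 != 0) {
            throw new IllegalArgumentException(
                    "16-bit sample data must contain an even number of bytes");
        }
        byte[] samples8 = new byte[samples16.length / 2];
        for (int i = 0; i < samples8.length; i++) {
            // Byte 2*i is the high-order half of sample i; byte 2*i+1 is discarded.
            samples8[i] = samples16[2 * i];
        }
        return samples8;
    }
}
```

Dropping the low byte is the cheapest option. A renderer that wants rounding rather than truncation could rescale each 16-bit value to the 0 to 255 range instead.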
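The image in question is actually black and white, so rather than 16-bit RGB it could have been encoded as Gray or Binary. 16-bit RGB uses 48 bits per pixel (3 components of 16 bits each), so 8-bit Gray would reduce the size of the raw image data by a factor of 6, and 1-bit Binary by a factor of up to 48!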
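The arithmetic behind those factors can be checked with a short calculation. This is a sketch only: the class name and the 1000 x 1000 dimensions used afterwards are made up for illustration, and the sizes are for raw sample data before any stream compression is applied. The one PDF-specific detail is that each row of samples is padded to a whole number of bytes.

```java
// Hypothetical helper, for illustration only.
final class ImageDataSize {

    /**
     * Size in bytes of uncompressed image sample data.
     * Each row is rounded up to a whole number of bytes.
     */
    static long rawBytes(int width, int height, int components, int bitsPerComponent) {
        long bitsPerRow = (long) width * components * bitsPerComponent;
        long bytesPerRow = (bitsPerRow + 7) / 8; // round up to a byte boundary
        return bytesPerRow * height;
    }
}
```

For an assumed 1000 x 1000 image, `rawBytes(1000, 1000, 3, 16)` gives 6,000,000 bytes for 16-bit RGB, `rawBytes(1000, 1000, 1, 8)` gives 1,000,000 bytes for 8-bit Gray, and `rawBytes(1000, 1000, 1, 1)` gives 125,000 bytes for 1-bit Binary, which is where the factors of 6 and 48 come from.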
The PDF file format provides lots of useful features, but how those features are used needs careful consideration. You can see what a difference it can make to the size of the data, which also affects load time. The same format can just as easily produce bloated, slow PDF files as fast, compact ones. Prawn looks very promising and I hope the developers look in detail at the PDF spec to see what useful tricks it offers.
This post is part of our “Understanding the PDF File Format” series. In each article, we discuss a PDF feature, bug, gotcha or tip.
IDRsolutions develop a Java PDF library, a PDF forms to HTML5 converter, a PDF to HTML5 or SVG converter and a Java Image Library that doubles as an ImageIO replacement. On the blog our team post about anything interesting they learn about.