I get asked a lot of questions about PDF to image conversion in Java, so I thought it would make a good topic for a blog post. I will be writing some follow-up articles to cover specific image issues with TIFF, JPEG and PNG.
A PDF is essentially a vector graphics format. Developers often want to create images of the pages to show on the web or to use as thumbnails. So the first decision is how big to make the image. The visible area of a PDF page is defined by its CropBox, so it often makes sense to use this. A common page size is 595 by 842 (A4 measured in points, at 72 points per inch), so at one pixel per point we create an image 595×842 pixels in size. This is what our standard PDF to image conversion example does.
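The sizing arithmetic can be sketched in plain Java. This is only a minimal illustration, not the conversion example itself: the names `PageImageSize`, `toPixels` and `blankPage` are made up for this post, and it assumes you already have the CropBox width and height in points from whichever PDF library you use.

```java
import java.awt.image.BufferedImage;

/** Sketch: size a page image from CropBox dimensions given in PDF points (1/72 inch). */
final class PageImageSize {

    /** Converts a length in points to pixels; a scale of 1.0 means one pixel per point. */
    static int toPixels(double points, double scale) {
        return (int) Math.round(points * scale);
    }

    /** Allocates a blank RGB image for a page with the given CropBox size (hypothetical helper). */
    static BufferedImage blankPage(double cropWidthPts, double cropHeightPts, double scale) {
        return new BufferedImage(
                toPixels(cropWidthPts, scale),
                toPixels(cropHeightPts, scale),
                BufferedImage.TYPE_INT_RGB);
    }

    public static void main(String[] args) {
        BufferedImage page = blankPage(595, 842, 1.0);
        System.out.println(page.getWidth() + " x " + page.getHeight()); // prints "595 x 842"
    }
}
```

Passing a scale of 2.0 instead would give a 1190×1684 image of the same page.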
That is fine unless the user wants to zoom into the image. In that case we need to create a bigger image. The issue here is that we can rapidly use up a great deal of memory: doubling each dimension to 1190×1684 pixels uses four times the memory. So you need to think very carefully about the optimum values to use.
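To put rough numbers on that, here is a small sketch that estimates the pixel memory for a page image and works backwards from a memory budget to a maximum scale. It assumes 4 bytes per pixel (as with `BufferedImage.TYPE_INT_RGB` or `TYPE_INT_ARGB`) and ignores any other overhead; the class and method names are hypothetical.

```java
/** Sketch: estimate pixel memory for a page image, assuming 4 bytes per pixel. */
final class ImageMemoryEstimate {

    private static final long BYTES_PER_PIXEL = 4L; // assumption: int-packed RGB/ARGB

    /** Approximate bytes needed for the pixel data alone. */
    static long estimateBytes(int widthPx, int heightPx) {
        return (long) widthPx * heightPx * BYTES_PER_PIXEL;
    }

    /** Largest scale whose pixel data fits in the budget, for a page measured in points. */
    static double maxScaleFor(double widthPts, double heightPts, long budgetBytes) {
        return Math.sqrt(budgetBytes / (widthPts * heightPts * BYTES_PER_PIXEL));
    }

    public static void main(String[] args) {
        long base = estimateBytes(595, 842);
        long doubled = estimateBytes(1190, 1684);
        System.out.println(base);           // prints 2003960 (about 2 MB)
        System.out.println(doubled);        // prints 8015840 (about 8 MB)
        System.out.println(doubled / base); // prints 4
    }
}
```

Memory grows with the square of the scale, which is why a modest-looking zoom factor can become expensive quickly.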
There is also the question of how much detail the page contains. Sometimes the PDF will contain highly detailed images. If you squeeze them down you will lose image quality, so you might want to use a higher scaling to take advantage of them. But if there are loads of poor quality images, then there is no advantage to creating a big image.
So we created a second example which can produce bigger images if the embedded images are high quality, and which allows the user to create different sizes of image for a page. Interestingly, the main performance hit is the actual creation of the blank image in Java.
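One possible way to let embedded image quality drive the page scale is sketched below. This is an assumption about how such a heuristic could work, not a description of how the example is implemented: `EmbeddedImage` is a hypothetical holder for an image's pixel width and the width it occupies on the page in points, and the cap is there to keep memory use under control.

```java
import java.util.List;

/** Sketch: pick a page scale from the resolution of the embedded images (hypothetical heuristic). */
final class ScaleChooser {

    /** Hypothetical description of one embedded image on the page. */
    static final class EmbeddedImage {
        final int pixelWidth;       // width of the raw image data in pixels
        final double drawnWidthPts; // width it is drawn at on the page, in points

        EmbeddedImage(int pixelWidth, double drawnWidthPts) {
            this.pixelWidth = pixelWidth;
            this.drawnWidthPts = drawnWidthPts;
        }

        /** How many source pixels are available per point of page width. */
        double pixelsPerPoint() {
            return pixelWidth / drawnWidthPts;
        }
    }

    /** Uses the most detailed image to choose a scale, never below 1.0 and never above maxScale. */
    static double chooseScale(List<EmbeddedImage> images, double maxScale) {
        double scale = 1.0;
        for (EmbeddedImage image : images) {
            scale = Math.max(scale, image.pixelsPerPoint());
        }
        return Math.min(scale, maxScale);
    }

    public static void main(String[] args) {
        // A detailed 1200px-wide image drawn 300 points wide has 4 pixels per point.
        List<EmbeddedImage> detailed = List.of(new EmbeddedImage(1200, 300));
        // A poor 150px-wide image drawn 300 points wide has only 0.5 pixels per point.
        List<EmbeddedImage> poor = List.of(new EmbeddedImage(150, 300));

        System.out.println(chooseScale(detailed, 3.0)); // prints 3.0 (capped)
        System.out.println(chooseScale(poor, 3.0));     // prints 1.0 (no benefit from scaling up)
    }
}
```

A page with only low-resolution images stays at one pixel per point, while a page with detailed images is scaled up until the cap is reached.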
So converting PDF files into images is very much a balancing act. Think very carefully about what you want to achieve to get the best balance on image size: a bigger image has more detail and lets you zoom in, but it takes longer to create and the file will be larger.
And if you want to look at PDF to image conversion in Java, do have a look at our examples.