PDF to image quality

The PDF file format was designed to use vector graphics as far as possible. The problem with bitmaps is that a pixel is the smallest unit you can draw. While you can use some clever tricks such as anti-aliasing and hinting to smooth lines, you can't draw fractions of a pixel. If a line is half a pixel wide, you need to make it either zero or one pixel wide.

In principle, this means that as soon as you convert a PDF to an image, you can lose detail. This article describes one possible answer.

The solution we have adopted in our JPedal software is to provide an example which lets the user draw the PDF onto a much bigger image (providing more pixels), which can then be scaled down or adjusted to take account of various dpi settings. You can get the details of the example here and the example source code here.

There are a number of possible scenarios so we added some flags to allow the user to choose how to improve the image quality.

Scenario One – Bigger initial image

We added a flag, EXTRACT_AT_PAGE_SIZE, which takes a pair of values giving the target page size:

//set a page size (aspect ratio is preserved, so JPedal will do a best fit to this)
mapValues.put(JPedalSettings.EXTRACT_AT_PAGE_SIZE, new String[]{"2000","1500"});

JPedal will attempt to scale up the page to fit into this box. So a PDF page of dimensions 1000×750 would be scaled up by a factor of 2 to 2000×1500, while a page which is 1000×1000 would be scaled up by a factor of 1.5 to 1500×1500.
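The best-fit arithmetic above can be sketched in a few lines of Java. This is only an illustration of the calculation, not JPedal's own code; the class and method names are made up for the example.

```java
// Hypothetical helper, not part of JPedal: reproduces the best-fit arithmetic from the text.
class BestFitScale {

    /** Largest uniform scale at which the page still fits inside the target box. */
    static double bestFitScale(double pageWidth, double pageHeight,
                               double boxWidth, double boxHeight) {
        // Use the smaller of the two ratios so neither dimension overflows the box
        return Math.min(boxWidth / pageWidth, boxHeight / pageHeight);
    }

    public static void main(String[] args) {
        // The two pages discussed above, fitted into a 2000x1500 box
        double[][] pages = {{1000, 750}, {1000, 1000}};
        for (double[] page : pages) {
            double scale = bestFitScale(page[0], page[1], 2000, 1500);
            System.out.println(page[0] + "x" + page[1] + " scales by " + scale
                    + " to " + (page[0] * scale) + "x" + (page[1] * scale));
        }
    }
}
```

Taking the minimum of the two ratios is what preserves the aspect ratio: the tighter dimension decides the scale and the other dimension simply ends up smaller than the box.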

If you wanted to optimise a page for 96 dpi rather than 72 dpi (i.e. 1.333 times more detail), you could get the page size (it's in the PdfPageData class in JPedal), multiply the dimensions by 1.333 and then use this method to produce an image. If saving as a JPEG, you can then alter the dpi settings and it will display as intended at 96 dpi.
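As a rough sketch of that calculation, the helper below converts a page size at 72 dpi into the pixel size needed for another dpi, and formats it as the String[] used with EXTRACT_AT_PAGE_SIZE in the snippet above. The class and method names are hypothetical, and the page dimensions are passed in as plain numbers rather than read from PdfPageData, so no JPedal calls are assumed here.

```java
// Hypothetical helper, not part of JPedal: works out the target size for a given dpi.
class DpiPageSize {

    /** The dpi at which the page dimensions are assumed to be expressed. */
    static final double BASE_DPI = 72.0;

    /** Pixel dimensions needed to render the page at targetDpi. */
    static int[] sizeForDpi(double pageWidth, double pageHeight, double targetDpi) {
        return new int[] {
            (int) Math.round(pageWidth * targetDpi / BASE_DPI),
            (int) Math.round(pageHeight * targetDpi / BASE_DPI)
        };
    }

    /** Formats the size as the String[] shape shown for EXTRACT_AT_PAGE_SIZE. */
    static String[] asPageSizeValues(int[] size) {
        return new String[] {String.valueOf(size[0]), String.valueOf(size[1])};
    }

    public static void main(String[] args) {
        // A US Letter page is 612x792 at 72 dpi
        int[] size = sizeForDpi(612, 792, 96);
        String[] values = asPageSizeValues(size);
        System.out.println(values[0] + " x " + values[1]);
    }
}
```

Multiplying by the target dpi before dividing by 72 keeps the arithmetic exact for common page sizes, which avoids the small rounding drift you get from using a truncated factor such as 1.333.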

If you wanted the highest possible image quality, you could create a larger version (providing the maximum number of physical pixels) and then bicubically rescale it to the desired size.
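One way to do the bicubic rescale is with the standard Java 2D API. The sketch below assumes you already have the oversized page as a BufferedImage (however it was produced); the class and method names are illustrative only.

```java
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.image.BufferedImage;

// Hypothetical helper: downsamples an oversized page image using bicubic interpolation.
class BicubicDownscale {

    static BufferedImage scaleTo(BufferedImage source, int targetWidth, int targetHeight) {
        BufferedImage target =
                new BufferedImage(targetWidth, targetHeight, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = target.createGraphics();
        try {
            // Ask Java 2D for bicubic interpolation and favour quality over speed
            g.setRenderingHint(RenderingHints.KEY_INTERPOLATION,
                    RenderingHints.VALUE_INTERPOLATION_BICUBIC);
            g.setRenderingHint(RenderingHints.KEY_RENDERING,
                    RenderingHints.VALUE_RENDER_QUALITY);
            // Draw the large image into the smaller canvas, resampling as it goes
            g.drawImage(source, 0, 0, targetWidth, targetHeight, null);
        } finally {
            g.dispose();
        }
        return target;
    }
}
```

For very large reductions, a single resampling pass can still look rough, and scaling down in several smaller steps is a commonly suggested alternative at the cost of more time.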

All these processes are slower and use more memory but give a higher quality result – like so much in life, there is always a trade-off.

Scenario Two – Use the embedded images

PDFs contain raw images, which are often scaled down when drawn onto the PDF and detail is lost. So we thought, rather than scale down the image to fit the PDF, why not scale up the PDF to fit the image!

This worked very well until someone found a PDF with an image which was actually scaled down 47 times. When we used this file, the amount of memory required to create the page was way beyond the capabilities of any current machine.

So we added a limit on how much the page could be scaled up.

//do not scale above this figure
mapValues.put(JPedalSettings.EXTRACT_AT_BEST_QUALITY_MAXSCALING, Integer.valueOf(2));

So here we allow the page to scale up to twice its size. If the raw image is bigger than that, it will be scaled down, but the results should still be better.
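The effect of the limit can be pictured as a simple clamp. The following is a sketch of the idea only, under the assumption that the wanted scale is the raw image size divided by the size at which it is drawn on the page; it is not JPedal's internal logic, and the names are invented for the example.

```java
// Hypothetical illustration of capping the page scale, not JPedal's implementation.
class ImageScaleLimit {

    /**
     * @param rawImagePixels  width of the embedded image in pixels
     * @param drawnSizeOnPage width the image occupies when drawn on the page
     * @param maxScaling      upper limit on how far the page may be scaled up
     */
    static double pageScale(double rawImagePixels, double drawnSizeOnPage, double maxScaling) {
        // Scale that would show the image at its full raw resolution
        double wanted = rawImagePixels / drawnSizeOnPage;
        // Never go above the configured limit, however large the raw image is
        return Math.min(wanted, maxScaling);
    }

    public static void main(String[] args) {
        // An image scaled down 47 times on the page is capped at the limit of 2
        System.out.println(pageScale(4700, 100, 2));
        // An image scaled down 1.5 times fits under the limit and is used as-is
        System.out.println(pageScale(150, 100, 2));
    }
}
```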

Lastly, we needed a way to choose which of these two rules takes priority, so we added a flag:

//which takes priority (default is false)
mapValues.put(JPedalSettings.PAGE_SIZE_OVERRIDES_IMAGE, Boolean.TRUE);

So we now have a flexible way to generate higher quality images, and users can choose the best trade-offs for them in terms of speed, memory, quality, etc.


Mark Stephens

System Architect and Lead Developer at IDRSolutions
Mark Stephens has been working with Java and PDF since 1999 and has diversified into HTML5, SVG and JavaFX. He also enjoys speaking at conferences and has been a Speaker at user groups, Business of Software, Seybold and JavaOne conferences. He has a very dry sense of humor and an MA in Medieval History for which he has not yet found a practical use.