Mark Stephens Mark has been working with Java and PDF since 1999 and is a big NetBeans fan. He enjoys speaking at conferences. He has an MA in Medieval History and a passion for reading.

Downsampling by averaging

All the techniques for downsampling involve tradeoffs in terms of speed, quality, and memory usage. They also tend to do better on some types of image than on others. One of the simplest, but still very effective, methods is averaging.

We use averaging in our PDF Viewer to downsample images on the fly. If a PDF file contains a 2000×2000 image on the page and it is being displayed at 25% scaling, the image is only going to appear as a 500×500 image. If we can reduce the image to this size, it will use one sixteenth of the memory (a quarter of the width and a quarter of the height) and be much faster to render. We can also use this technique where raw images in the PDF file are scaled down to a smaller size on a page. As the user zooms in over 100%, we can create a higher quality image each time from the raw data, optimised for the zoom level.
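As a rough illustration of the 25% example, the sketch below works out how many times an image could be halved in each direction before it drops below its displayed size. The class and method names are hypothetical, and the rule of halving only while the scale is 50% or less is an assumption made for this example, not a description of how the viewer chooses its factor.

```java
// Hypothetical helper for illustration only.
final class HalvingSteps {

    /**
     * Returns how many times an image can be halved in each direction
     * without ending up smaller than the size it is displayed at.
     * scale is the display scaling, where 1.0 means 100%.
     */
    static int forScale(double scale) {
        if (scale <= 0) {
            throw new IllegalArgumentException("scale must be positive");
        }
        int steps = 0;
        // Each halving doubles the effective scale of the reduced image
        while (scale <= 0.5) {
            scale *= 2;
            steps++;
        }
        return steps;
    }
}
```

At 25% scaling this gives 2 halvings, which takes a 2000×2000 image down to 500×500.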

There are different ways of downsampling. We average each block of 2×2 pixels to give a new value. On a black and white image, all 4 pixels set gives a 100% pixel, 3 pixels set gives 75%, 2 pixels gives 50%, 1 pixel gives 25%, and no pixels set gives a blank pixel. This tends to produce a smoothing effect, much like anti-aliasing, which gives really good results on pictures but blurs text and sharp edges. You can also apply a kernel to the data to sharpen the final image.
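A minimal sketch of 2×2 averaging might look like the following. It assumes an 8-bit grayscale image stored as one byte per pixel, row by row, with an even width and height. The class and method names are made up for this example and are not taken from JPedal.

```java
// Hypothetical helper for illustration only.
final class BlockAverage {

    /**
     * Halves an 8-bit grayscale image in each direction by averaging
     * every 2x2 block of pixels into a single output pixel.
     * Assumes one byte per pixel and an even width and height.
     */
    static byte[] halve(byte[] pixels, int width, int height) {
        int newW = width / 2;
        int newH = height / 2;
        byte[] out = new byte[newW * newH];
        for (int y = 0; y < newH; y++) {
            for (int x = 0; x < newW; x++) {
                int top = (2 * y) * width + 2 * x; // top-left pixel of the block
                int bottom = top + width;          // pixel directly below it
                // Mask with 0xFF because Java bytes are signed
                int sum = (pixels[top] & 0xFF) + (pixels[top + 1] & 0xFF)
                        + (pixels[bottom] & 0xFF) + (pixels[bottom + 1] & 0xFF);
                out[y * newW + x] = (byte) (sum / 4);
            }
        }
        return out;
    }
}
```

The output array has a quarter of the pixels of the input, so calling the method twice reduces an image to one sixteenth of its original size.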

Averaging in this way also has major advantages in Java. Java supports many different types of colourspaces, but lots of things need ARGB mode. You cannot apply PDF clipping in any other mode, for example. The issue with this is that an ARGB image uses 32 bits (4 bytes) per pixel. PDF files can contain huge black and white images (using 1 bit per pixel). Converting these to ARGB is going to need 32 times the memory. Do the maths on a 14000×11000 pixel image! So being able to reduce the image size before you do this has massive memory and performance benefits.
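To put numbers on that, here is a small sketch that estimates the raw storage for both forms. It assumes each 1-bit row is padded to a whole byte and ignores any object or header overhead. The class and method names are hypothetical.

```java
// Hypothetical helper for illustration only.
final class MemoryEstimate {

    /** Bytes needed for a 1 bit per pixel image, each row padded to a whole byte. */
    static long oneBitBytes(int width, int height) {
        long bytesPerRow = (width + 7L) / 8;
        return bytesPerRow * height;
    }

    /** Bytes needed for the same image as ARGB at 4 bytes per pixel. */
    static long argbBytes(int width, int height) {
        return 4L * width * height;
    }
}
```

For the 14000×11000 image, `oneBitBytes` gives 19,250,000 bytes (about 19 MB) and `argbBytes` gives 616,000,000 bytes (about 616 MB). Halving the image in each direction first, to 7000×5500, brings the ARGB figure down to 154,000,000 bytes.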

Averaging a set of bytes is very fast, can use low-level bit manipulation operations, and does not need lots of memory, so we can do it when we redraw the page.

Large black and white images can also be converted into grayscale (8 bits per pixel, but we only have a quarter of the pixels afterwards). This gives really good results on black and white photographs but tends to blur sharp lines.
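The conversion from packed 1-bit data to grayscale can be sketched with a few shifts and masks. This example assumes the bits are packed most significant bit first, that each row is padded to a whole byte, that the width and height are even, and that a set bit should map towards the maximum gray value. The polarity depends on the image, so it may need inverting. The names are hypothetical and this is not JPedal's implementation.

```java
// Hypothetical helper for illustration only.
final class BitDownsample {

    /**
     * Converts a packed 1 bit per pixel image into an 8-bit grayscale image
     * with half the width and height, by counting the set bits in each 2x2 block.
     */
    static byte[] toGray(byte[] packed, int width, int height) {
        int rowBytes = (width + 7) / 8; // rows are assumed padded to a whole byte
        int newW = width / 2;
        int newH = height / 2;
        byte[] gray = new byte[newW * newH];
        for (int y = 0; y < newH; y++) {
            int topRow = 2 * y * rowBytes;
            int bottomRow = topRow + rowBytes;
            for (int x = 0; x < newW; x++) {
                int byteIndex = x / 4;       // each byte holds 4 horizontal pixel pairs
                int shift = 6 - 2 * (x % 4); // MSB-first position of this pair
                int topPair = (packed[topRow + byteIndex] >> shift) & 0b11;
                int bottomPair = (packed[bottomRow + byteIndex] >> shift) & 0b11;
                int set = Integer.bitCount(topPair) + Integer.bitCount(bottomPair);
                // 0, 1, 2, 3 or 4 set pixels map to 0, 63, 127, 191 or 255
                gray[y * newW + x] = (byte) (set * 255 / 4);
            }
        }
        return gray;
    }
}
```

Each output byte replaces 4 input bits, so the grayscale result needs twice the memory of the packed original, but only a quarter of what a full-size 8-bit conversion would need.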

Downsampling is not perfect, but it is a good trade-off if you are looking for a fast way to cut general images down to size.

 


