All the techniques for downsampling involve trade-offs in terms of speed, quality, and memory usage. They also tend to do better on some types of image than on others. One of the simplest, but still very effective, methods is averaging.
We use averaging in our PDF Viewer to downsample images on the fly. If a PDF file contains a 2000×2000 image on the page and it is being displayed at 25% scaling, the image is only going to appear as a 500×500 image. If we can reduce the image to this size, it will use a sixteenth of the memory (a quarter of the width multiplied by a quarter of the height) and be much faster to render. We can also use this technique where raw images in the PDF file are scaled down to a smaller size on a page. As the user zooms in over 100%, we can create a higher-quality image each time from the raw data, optimised for the zoom level.
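As a rough illustration of the sizing arithmetic above, here is a minimal sketch that assumes the image is shrunk by repeatedly halving its width and height, stopping before it becomes smaller than the size at which it will be displayed. The class and method names are hypothetical, and this is one possible policy rather than a description of what the viewer actually does.

```java
final class HalvingPasses {

    /**
     * Returns how many times an image can be halved in each dimension
     * while staying at least as large as its displayed size.
     * A scale of 1.0 means 100%, 0.25 means 25%.
     */
    static int forScale(double scale) {
        if (!(scale > 0)) {
            throw new IllegalArgumentException("scale must be positive");
        }
        int passes = 0;
        // Each halving pass doubles the effective scale of what is left.
        while (scale <= 0.5) {
            scale *= 2;
            passes++;
        }
        return passes;
    }

    public static void main(String[] args) {
        int passes = forScale(0.25);
        System.out.println(passes);          // 2
        System.out.println(2000 >> passes);  // 500
    }
}
```

With the 2000×2000 image at 25%, two passes give the 500×500 image described above. At 30% only one pass would be used under this policy, leaving a 1000×1000 image for the renderer to scale the rest of the way.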
There are different ways of downsampling. We average each block of 2×2 pixels to give a new value. On a black and white image, all 4 pixels set gives a 100% pixel, 3 pixels set gives 75%, 2 gives 50%, 1 gives 25%, and no pixels set gives a blank pixel. This has a smoothing effect, which gives really good results on pictures but blurs text and sharp edges. You can also apply a kernel to the data to sharpen the final image.
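The 2×2 averaging described above can be sketched as follows. This is a minimal example, not the viewer's own code: it assumes an 8-bit greyscale image stored one byte per pixel in row order, and the class and method names are made up for illustration.

```java
final class BlockAverage {

    /**
     * Halves the width and height of an 8-bit greyscale image by
     * averaging each 2x2 block of pixels. A trailing odd row or
     * column is dropped to keep the sketch short.
     */
    static byte[] halve(byte[] src, int width, int height) {
        int outW = width / 2;
        int outH = height / 2;
        byte[] out = new byte[outW * outH];
        for (int y = 0; y < outH; y++) {
            int row0 = (2 * y) * width; // top row of the block
            int row1 = row0 + width;    // bottom row of the block
            for (int x = 0; x < outW; x++) {
                int col = 2 * x;
                // Java bytes are signed, so mask to get values 0..255.
                int sum = (src[row0 + col] & 0xFF)
                        + (src[row0 + col + 1] & 0xFF)
                        + (src[row1 + col] & 0xFF)
                        + (src[row1 + col + 1] & 0xFF);
                // Add 2 before dividing by 4 to round to nearest.
                out[y * outW + x] = (byte) ((sum + 2) >> 2);
            }
        }
        return out;
    }
}
```

A block with three pixels at 255 and one at 0 comes out as 191, which is the 75% pixel mentioned above.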
Averaging in this way also has major advantages in Java. Java supports many different types of colourspaces, but lots of things need ARGB mode. You cannot apply PDF clipping in any other mode, for example. The issue with this is that an ARGB image uses 32 bits (4 bytes) per pixel. PDF files can contain huge black and white images (using 1 bit per pixel). Converting these to ARGB is going to need 32 times the memory. Do the maths on a 14000×11000 pixel image! So being able to reduce the image size before you do this has massive memory and performance benefits.
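To put numbers on the 14000×11000 example, here is a small sketch of the arithmetic. It assumes the pixels are packed with each row padded to a whole byte and ignores any object overhead; the class and method names are illustrative only.

```java
final class ImageMemory {

    /** Bytes needed for the raw pixel data, with rows padded to whole bytes. */
    static long bytesFor(long width, long height, int bitsPerPixel) {
        long bytesPerRow = (width * bitsPerPixel + 7) / 8;
        return bytesPerRow * height;
    }

    public static void main(String[] args) {
        long w = 14000, h = 11000;
        System.out.println(bytesFor(w, h, 1));          // 19250000  (1 bit per pixel)
        System.out.println(bytesFor(w, h, 32));         // 616000000 (ARGB)
        System.out.println(bytesFor(w / 2, h / 2, 32)); // 154000000 (ARGB after one 2x2 pass)
    }
}
```

So the 1-bit image needs about 19MB, the same image as ARGB needs about 616MB, and a single 2×2 pass before conversion brings the ARGB version down to about 154MB.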
Averaging a set of bytes is very fast, can use low-level bit manipulation operations, and does not need lots of memory, so we can do it whenever we redraw the page.
Large black and white images can also be converted into grayscale. The result uses 8 bits per pixel, but we only have a quarter of the pixels afterwards, so it takes just twice the memory of the 1-bit original. This gives really good results on black and white photographs but tends to blur sharp lines.
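The black and white to grayscale conversion can be sketched with a few bit operations. This is a minimal example under stated assumptions, not the viewer's implementation: the source is taken to be packed 8 pixels per byte with the most significant bit first, each row padded to a whole byte, and a set bit is mapped towards 255 (invert the result if your data uses the opposite polarity). The names are hypothetical.

```java
final class OneBitToGrey {

    /** Reads a single pixel (0 or 1) from MSB-first packed rows. */
    private static int bit(byte[] src, int bytesPerRow, int x, int y) {
        return (src[y * bytesPerRow + (x >> 3)] >> (7 - (x & 7))) & 1;
    }

    /**
     * Converts a 1-bit image into an 8-bit grey image of half the width
     * and height by counting the set bits in each 2x2 block.
     */
    static byte[] downsample(byte[] src, int width, int height) {
        int bytesPerRow = (width + 7) >> 3;
        int outW = width / 2;
        int outH = height / 2;
        byte[] out = new byte[outW * outH];
        for (int y = 0; y < outH; y++) {
            for (int x = 0; x < outW; x++) {
                int set = bit(src, bytesPerRow, 2 * x, 2 * y)
                        + bit(src, bytesPerRow, 2 * x + 1, 2 * y)
                        + bit(src, bytesPerRow, 2 * x, 2 * y + 1)
                        + bit(src, bytesPerRow, 2 * x + 1, 2 * y + 1);
                // 0..4 set bits map to 0, 63, 127, 191, 255.
                out[y * outW + x] = (byte) (set * 255 / 4);
            }
        }
        return out;
    }
}
```

Reading one bit at a time keeps the sketch easy to follow. A faster version could process a whole byte per step using a lookup table, at the cost of more code.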
Downsampling is not perfect, but it is a good trade-off if you are looking for a fast way to cut general images down to size.