Mark Stephens Mark has been working with Java and PDF since 1999 and is a big NetBeans fan. He enjoys speaking at conferences. He has an MA in Medieval History and a passion for reading.

Reading ‘JPEG’ data inside a PDF in Java


PDF files contain compressed raw image data. This data is sometimes equivalent to a JPEG file, so if you can extract the raw data and save it as a file with the extension .jpeg, it will open as a JPEG.
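As a rough sketch of that idea, assume you have already pulled the image stream out of the PDF into a byte[] (how you do that depends on your PDF library and is not shown). The class and method names below are made up for illustration. The code checks for the JPEG start-of-image marker (0xFF 0xD8) and, if it is present, writes the bytes out unchanged:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of any PDF library
final class JpegDump {

    // A JPEG stream starts with the Start Of Image marker 0xFF 0xD8
    static boolean looksLikeJpeg(byte[] data) {
        return data != null && data.length >= 2
                && (data[0] & 0xFF) == 0xFF
                && (data[1] & 0xFF) == 0xD8;
    }

    // Writes the bytes unchanged; returns false if they do not look like a JPEG
    static boolean saveAsJpeg(byte[] data, Path target) throws IOException {
        if (!looksLikeJpeg(data)) {
            return false;
        }
        Files.write(target, data);
        return true;
    }
}
```

A call such as JpegDump.saveAsJpeg(data, Paths.get("image.jpeg")) would then give you a file to try in an image viewer. Passing the marker check only tells you the data is a JPEG stream; it says nothing about whether the colours will display correctly.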

Sometimes is the key word here, because you may well need to interpret the data using colour information in the PDF file. For example, the actual data may be encoded as Gray or DeviceRGB data, in which case it will look correct when you open the JPEG. But it may need some additional details (such as indexed colours) or be YCCK, in which case you will see the image but the colours will be wrong.

Although Java cannot always make sense of this JPEG data (because the colour detail is held in the PDF rather than in the JPEG data itself), you can still use ImageIO to open it and access the pixel data. The actual pixel data is stored in a Raster object.
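To give a feel for what a Raster exposes, here is a small sketch. The RasterInfo class and its guessColour method are invented for this example, and the band-count mapping is only a rule of thumb, since the colour space recorded in the PDF is the real authority. An empty Raster is created by hand to stand in for one returned by a JPEG reader:

```java
import java.awt.image.DataBuffer;
import java.awt.image.Raster;

// Hypothetical helper for illustration only
final class RasterInfo {

    // Guess what the bands might hold from the band count alone.
    // This is only a hint; the PDF's colour space is the real authority.
    static String guessColour(Raster ras) {
        switch (ras.getNumBands()) {
            case 1:  return "Gray or indexed";
            case 3:  return "RGB or YCbCr";
            case 4:  return "CMYK or YCCK";
            default: return "unknown";
        }
    }

    public static void main(String[] args) {
        // Stand-in for the Raster a JPEG reader would return: 4 x 2 pixels, 3 bands
        Raster ras = Raster.createInterleavedRaster(DataBuffer.TYPE_BYTE, 4, 2, 3, null);
        System.out.println(ras.getWidth() + "x" + ras.getHeight() + " " + guessColour(ras));

        // Raw sample values are available even when the colours are not understood
        int[] pixel = ras.getPixel(0, 0, (int[]) null);
        System.out.println("bands in first pixel: " + pixel.length);
    }
}
```

The point is that width, height, band count and every raw sample are available without Java needing to know what the samples mean as colours.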

So if you want to recreate the image you will need to get the pixel data and ‘merge’ it with the colour data. Here is how you can read the actual pixel data in Java. Even if Java does not understand the colours, it can access the actual pixels themselves.

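import java.awt.image.Raster;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.util.Iterator;
import javax.imageio.ImageIO;
import javax.imageio.ImageReader;
import javax.imageio.stream.ImageInputStream;

//read the image data - data is a byte[] containing the data
ByteArrayInputStream in = new ByteArrayInputStream(data);

//choose a JPEG decoder that can return a Raster
ImageReader iir = null;
Iterator<ImageReader> iterator = ImageIO.getImageReadersByFormatName("JPEG");

while (iterator.hasNext())
{
    ImageReader candidate = iterator.next();
    if (candidate.canReadRaster())
    {
        iir = candidate;
        break;
    }
}

//stop if no suitable decoder was found
if (iir == null)
    throw new IOException("No JPEG ImageReader can read a Raster");

//keep the image stream in memory rather than in a disk cache
ImageIO.setUseCache(false);
ImageInputStream iin = ImageIO.createImageInputStream(in);
iir.setInput(iin, true);

//this is the actual pixel data
Raster ras = iir.readRaster(0, null);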
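As one small illustration of what ‘merging’ can mean, here is a sketch for the indexed colour case. It assumes the Raster has a single band whose samples are palette entry numbers, and that you have already read the palette from the PDF as RGB triples. Both are assumptions for this example, and IndexedMerge and applyPalette are invented names, not part of any library:

```java
import java.awt.image.BufferedImage;
import java.awt.image.DataBuffer;
import java.awt.image.Raster;
import java.awt.image.WritableRaster;

// Hypothetical helper for illustration only
final class IndexedMerge {

    // indices: one band, each sample is an entry number in the palette
    // palette: RGB triples (r0, g0, b0, r1, g1, b1, ...), assumed to come from the PDF
    static BufferedImage applyPalette(Raster indices, byte[] palette) {
        int w = indices.getWidth();
        int h = indices.getHeight();
        BufferedImage out = new BufferedImage(w, h, BufferedImage.TYPE_INT_RGB);
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                // look up the palette entry for this pixel
                int i = indices.getSample(indices.getMinX() + x, indices.getMinY() + y, 0) * 3;
                int r = palette[i] & 0xFF;
                int g = palette[i + 1] & 0xFF;
                int b = palette[i + 2] & 0xFF;
                out.setRGB(x, y, (r << 16) | (g << 8) | b);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Two pixels: the first uses palette entry 0, the second entry 1
        WritableRaster ras = Raster.createInterleavedRaster(DataBuffer.TYPE_BYTE, 2, 1, 1, null);
        ras.setSample(1, 0, 0, 1);
        byte[] palette = {(byte) 255, 0, 0, 0, 0, (byte) 255}; // red, blue
        BufferedImage img = applyPalette(ras, palette);
        System.out.println(Integer.toHexString(img.getRGB(0, 0)));
        System.out.println(Integer.toHexString(img.getRGB(1, 0)));
    }
}
```

Other cases, such as YCCK, need a proper colour conversion rather than a table lookup, so treat this as the simplest example only.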

Merging the different colours is a whole series of articles – are you interested?





IDRsolutions Ltd 2020. All rights reserved.