The PDF file I have been looking at today has an issue which is more of a Java 'feature' than a PDF bug, but it touches on some PDF features, so it is worth covering.
When you define an ICC colorspace in a PDF, you can also define an alternate colorspace which can be used to display the data (generally DeviceRGB for 3 components and DeviceCMYK for 4). The alternate can be used instead of the ICC profile and gives good enough results in most cases. Because converting through an ICC profile is relatively slow in Java, we use the alternate colorspace in preference for image conversion. And if the data is compressed with DCTDecode, we can use ImageIO to extract the image. So far so good.
However, we have found a file which does not work with ImageIO. It throws an exception inside Java itself:
java.awt.color.CMMException: Invalid image format
at com.sun.imageio.plugins.jpeg.JPEGImageReader.readImage(Native Method)
The JPEG does work, however, if you decode it as an ICC JPEG by extracting the Raster and then converting the colors manually. So we have adopted the pragmatic solution. We still try to decode the image with the alternate colorspace first, so we get all the speed benefits. But we check whether that fails and, if it does, treat the image as an ICC colorspace image instead. It seems a reasonable workaround for an issue in the JVM.
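To make the workaround concrete, here is a minimal sketch of the "try ImageIO first, fall back to the Raster" idea. It is not our actual library code: the class name `JpegFallbackDecoder` and its two methods are hypothetical, and it assumes the raw JPEG samples are 8-bit and map directly onto the components of the ICC profile. Real files may store YCbCr, YCCK or inverted CMYK samples, which would need extra handling before the conversion.

```java
import java.awt.color.ICC_ColorSpace;
import java.awt.color.ICC_Profile;
import java.awt.image.BufferedImage;
import java.awt.image.Raster;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.util.Iterator;
import javax.imageio.ImageIO;
import javax.imageio.ImageReader;
import javax.imageio.stream.ImageInputStream;

// Hypothetical helper illustrating the fallback; names are assumptions, not a real API.
final class JpegFallbackDecoder {

    private JpegFallbackDecoder() {}

    // Fast path: let ImageIO decode the JPEG. Slow path: raw Raster plus ICC conversion.
    static BufferedImage decode(byte[] jpegData, ICC_Profile profile) throws IOException {
        try {
            BufferedImage image = ImageIO.read(new ByteArrayInputStream(jpegData));
            if (image != null) {
                return image;
            }
        } catch (IOException | RuntimeException e) {
            // CMMException is a RuntimeException; fall through to the manual route.
        }
        return decodeViaRaster(jpegData, profile);
    }

    // Reads the undecoded samples and converts each pixel through the ICC profile.
    static BufferedImage decodeViaRaster(byte[] jpegData, ICC_Profile profile) throws IOException {
        Iterator<ImageReader> readers = ImageIO.getImageReadersByFormatName("JPEG");
        if (!readers.hasNext()) {
            throw new IOException("No JPEG ImageReader available");
        }
        ImageReader reader = readers.next();
        try (ImageInputStream in = ImageIO.createImageInputStream(new ByteArrayInputStream(jpegData))) {
            reader.setInput(in);
            Raster raster = reader.readRaster(0, null); // no color conversion applied

            ICC_ColorSpace colorSpace = new ICC_ColorSpace(profile);
            int bands = raster.getNumBands();
            if (bands != colorSpace.getNumComponents()) {
                throw new IOException("Raster bands do not match ICC profile components");
            }

            int width = raster.getWidth();
            int height = raster.getHeight();
            BufferedImage out = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
            int[] samples = new int[bands];
            float[] color = new float[bands];

            for (int y = 0; y < height; y++) {
                for (int x = 0; x < width; x++) {
                    raster.getPixel(raster.getMinX() + x, raster.getMinY() + y, samples);
                    for (int b = 0; b < bands; b++) {
                        color[b] = samples[b] / 255f; // assumes 8-bit samples
                    }
                    float[] rgb = colorSpace.toRGB(color);
                    int r = Math.round(rgb[0] * 255f);
                    int g = Math.round(rgb[1] * 255f);
                    int bl = Math.round(rgb[2] * 255f);
                    out.setRGB(x, y, (r << 16) | (g << 8) | bl);
                }
            }
            return out;
        } finally {
            reader.dispose();
        }
    }
}
```

The per-pixel `toRGB` call keeps the sketch short but is slow on large images, which is exactly why the alternate colorspace route is tried first.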
I have come across quite a few issues with ICC colorspaces in Java 5, and there are still some in Java 6, so I hope Java 7/8 will improve ICC support. Do you have any favorite workarounds for ICC colorspace limitations in Java?
This post is part of our "Understanding the PDF File Format" series. In each article, we discuss a PDF feature, bug, gotcha or tip. If you wish to learn more about PDF, we have 13 years' worth of PDF knowledge and tips in our series index.