PDF is a very flexible file format in which colour can be represented in many different ways, which reflects the fact that PDF is used in many different environments.
One of the most useful colour formats is CMYK, which matches how professional printers work. Colours are made by mixing together 4 inks: Cyan, Magenta, Yellow and Black (the black ink is actually called Key, which is the K in CMYK). PDFs can be created in CMYK so that professional printers can use them and users can be sure that the printed output is correct.
However, it turns out that some images in PDFs are not actually CMYK. They use a different form of encoding, called YCCK. Most of the time, this is hidden from the user, but if you are working with PDFs or doing any development, you may need to understand what is going on.
YCCK does not have its own type in PDF. It is always treated as CMYK and detected internally. If you save out DCTDecode (JPEG) data which is flagged as CMYK, it may well be YCCK; there is a flag in the JPEG header to show whether it is.
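As a rough illustration of checking that flag: in JPEG data the information usually lives in the Adobe APP14 marker segment, whose final byte is a transform value (0 = no transform, so 4-component data is plain CMYK; 1 = YCbCr; 2 = YCCK). The sketch below walks the marker segments that precede the scan data and returns that byte. The class and method names are made up for this example, and it assumes you already have the raw JPEG bytes in memory; it is not how any particular PDF library does it.

```java
final class AdobeMarkerSketch {

    /**
     * Returns the Adobe APP14 transform byte (0, 1 or 2),
     * or -1 if no Adobe APP14 segment is found before the scan data.
     * Hypothetical helper written for illustration only.
     */
    static int findAdobeTransform(byte[] jpeg) {
        // Every JPEG stream starts with the SOI marker 0xFFD8.
        if (jpeg.length < 4 || (jpeg[0] & 0xFF) != 0xFF || (jpeg[1] & 0xFF) != 0xD8) {
            return -1;
        }
        int pos = 2;
        while (pos + 4 <= jpeg.length) {
            if ((jpeg[pos] & 0xFF) != 0xFF) {
                return -1; // not positioned on a marker, give up
            }
            int marker = jpeg[pos + 1] & 0xFF;
            if (marker == 0xFF) {
                pos++; // fill byte before a marker, skip it
                continue;
            }
            if (marker == 0xDA || marker == 0xD9) {
                return -1; // start of scan or end of image: headers are over
            }
            // Segment length is big-endian and includes its own 2 bytes.
            int length = ((jpeg[pos + 2] & 0xFF) << 8) | (jpeg[pos + 3] & 0xFF);
            if (length < 2) {
                return -1; // corrupt segment
            }
            boolean fits = pos + 2 + length <= jpeg.length;
            if (marker == 0xEE && length >= 14 && fits
                    && jpeg[pos + 4] == 'A' && jpeg[pos + 5] == 'd'
                    && jpeg[pos + 6] == 'o' && jpeg[pos + 7] == 'b'
                    && jpeg[pos + 8] == 'e') {
                // Layout after "Adobe": version(2) flags0(2) flags1(2) transform(1)
                return jpeg[pos + 15] & 0xFF;
            }
            pos += 2 + length; // jump to the next marker
        }
        return -1;
    }
}
```

A return value of 2 on 4-component data would indicate YCCK. If the marker is missing (-1), the data carries no hint either way and a decoder has to fall back on its own default.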
YCCK also consists of 4 components. As with CMYK, there is a black element (the K value) but instead of Cyan, Magenta and Yellow, there are a luma value (Y) and 2 chroma values (Cb and Cr). The maths on this is quite fiddly, so the best way to think of it is that the information is encoded not in terms of ink colours but in terms of how your eye sees the colour. Your eye is more sensitive to luminance than to chrominance. By separating off the chroma values, they can be compressed more (reducing the file size) without the eye noticing. You just can’t do this with CMYK.
So that explains why it might be used, but what about actually using the data? What you can do is convert the 3 YCC values into 3 CMY colour values. Add back the K component and you have CMYK.
As with lots of colour operations, there are 2 ways to do this:
1. Use a mathematical formula. This provides a fast approximation but is not always correct, especially on very dark or very light colours.
2. Use colour profiles. These files are essentially very precise lookup tables allowing accurate mapping from one colourspace to another. They are more precise but slower, at least in Java.
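To make the formula route concrete, here is a minimal sketch. It assumes the commonly used JFIF YCbCr to RGB equations followed by a simple complement (C = 255 - R and so on), with K passed through untouched, and 8-bit samples in the range 0 to 255. The class and method names are invented for this example, and a given library may well use different constants or integer lookup tables.

```java
final class YcckToCmykSketch {

    /**
     * Converts one YCCK sample (each component 0..255) to CMYK.
     * Formula-based approximation, no colour profile involved.
     */
    static int[] ycckToCmyk(int y, int cb, int cr, int k) {
        // Step 1: YCbCr -> RGB using the JFIF equations.
        double r = y + 1.402 * (cr - 128);
        double g = y - 0.344136 * (cb - 128) - 0.714136 * (cr - 128);
        double b = y + 1.772 * (cb - 128);

        // Step 2: RGB -> CMY is the complement; K is carried over unchanged.
        return new int[] {
            255 - clamp(r),
            255 - clamp(g),
            255 - clamp(b),
            k
        };
    }

    /** Rounds to the nearest integer and limits the result to 0..255. */
    private static int clamp(double v) {
        return (int) Math.max(0, Math.min(255, Math.round(v)));
    }
}
```

For example, `ycckToCmyk(128, 128, 128, 40)` gives `{127, 127, 127, 40}`, because neutral chroma leaves R, G and B equal to the luma value. One caveat worth checking on real files: some writers store the 4 components inverted, so the result may need flipping (255 - value) before it matches what you expect.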
So if you are doing some serious work with the CMYK colourspace or saving out CMYK data, do be aware that not all CMYK is CMYK and it may need conversion into ‘proper’ CMYK.
This post is part of our “Understanding the PDF File Format” series. In each article, we discuss a PDF feature, bug, gotcha or tip.