Java PDF Blog

How does color work in PDF files?

Color is a complex topic in PDF. This article helps to explain how it works.

How to define Color in PDF files

Color can be defined in different ways in a PDF. This is because PDF is a very flexible format with lots of uses, and different tasks have come up with different ways to talk about color. A way of defining colors is called a ColorSpace.

Televisions and computer screens use 3 ‘base’ colors: Red, Green and Blue. Light in these three colors is mixed together in different amounts to give all the colors you see on the screen (the RGB ColorSpace). If the image is black and white, it only needs one channel, so it can use a Gray ColorSpace.

A printer would usually print using a combination of 4 inks (Cyan, Magenta, Yellow and Key, which is really black) to produce color prints (the CMYK ColorSpace). Or it might use a selection of one or more known inks and print them one at a time (Separation/DeviceN ColorSpaces). There are several established definitions of ink colors, such as the Pantone scheme.

RGB and CMYK do not entirely reflect how the human eye sees color, so PDF also supports the Lab ColorSpace. This has 3 components: a Lightness value, a red/green value and a yellow/blue value.

You can also create a ColorSpace based on an ICC profile (ICCColorSpace, which the PDF specification calls ICCBased). The PDF file will generally contain the embedded ICC profile that defines the ColorSpace.

Because PDFs are used in digital, print and lots of other environments, with lots of different types of content, the PDF specification allows you to choose the most appropriate and natural way to think about color for that process and for how you intend to use the file.

Color conversion

When a PDF is displayed, the software has to work out how to convert the color into an appropriate form (for example, a print PDF using CMYK needs to be displayed on an RGB computer screen).

Converting between ColorSpaces is not always a simple task. For some conversions there is a simple mathematical formula, while others need complex translation tables called profiles. Even where a formula exists, there are different versions of it which give different results. There is also a trade-off between fast, approximate methods and slower, more accurate ones.
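To make the point about competing formulas concrete, here is a minimal sketch of two common formula-based CMYK to RGB conversions. The class and method names (CmykFormulas, multiplicative, clamped) are made up for this illustration and are not taken from any particular PDF library. The first version multiplies the ink coverage by the black level; the second adds black to each ink and clamps the result, which is the simple DeviceCMYK to DeviceRGB approach described in the PDF specification.

```java
final class CmykFormulas {

    /** Multiplicative formula: each channel is scaled by (1 - ink) and (1 - black). */
    static int[] multiplicative(double c, double m, double y, double k) {
        return new int[] {
            (int) Math.round(255 * (1 - c) * (1 - k)),
            (int) Math.round(255 * (1 - m) * (1 - k)),
            (int) Math.round(255 * (1 - y) * (1 - k))
        };
    }

    /** Clamped additive formula: black is added to each ink and the sum is capped at 1. */
    static int[] clamped(double c, double m, double y, double k) {
        return new int[] {
            (int) Math.round(255 * (1 - Math.min(1.0, c + k))),
            (int) Math.round(255 * (1 - Math.min(1.0, m + k))),
            (int) Math.round(255 * (1 - Math.min(1.0, y + k)))
        };
    }

    public static void main(String[] args) {
        // Same CMYK input (components in the range 0.0 to 1.0), two different RGB answers.
        int[] a = multiplicative(0.5, 0.0, 0.0, 0.5);
        int[] b = clamped(0.5, 0.0, 0.0, 0.5);
        System.out.println(a[0] + "," + a[1] + "," + a[2]); // prints 64,128,128
        System.out.println(b[0] + "," + b[1] + "," + b[2]); // prints 0,128,128
    }
}
```

Both are fast and neither uses a profile, yet the red channel for the same input differs by a quarter of the full range. That is the kind of visible difference you can get between tools that pick different formulas.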

Every PDF tool has to choose the methods which offer the best compromise for its requirements. Xpdf, for example, usually uses a formula to handle CMYK, which is why some shades of black or white can look different compared with Adobe Acrobat, which uses a profile.

The most accurate way to convert between colors is to use a profile. When I wrote the color handling code for our Java PDF viewer, I needed to convert all the colors in a PDF file into sRGB so that I could use them in Java. Wherever possible, I used profiles to give the closest match to what Adobe Acrobat does.
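As a rough illustration of profile-based conversion in Java, the sketch below uses the standard java.awt.color classes to wrap an ICC profile as a ColorSpace and convert its components to sRGB. This is not the code from our viewer. The class and method names (ProfileConversion, loadColorSpace, toSrgb) are hypothetical, and because the JDK does not ship a CMYK profile, you would need to pass the path to your own ICC file as an argument. Without an argument, the example falls back to the built-in sRGB profile just so it runs.

```java
import java.awt.color.ColorSpace;
import java.awt.color.ICC_ColorSpace;
import java.awt.color.ICC_Profile;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;

final class ProfileConversion {

    /** Loads an ICC profile from a file and wraps it as a Java ColorSpace. */
    static ColorSpace loadColorSpace(String iccPath) throws IOException {
        try (InputStream in = Files.newInputStream(Paths.get(iccPath))) {
            return new ICC_ColorSpace(ICC_Profile.getInstance(in));
        }
    }

    /** Converts components of the source ColorSpace to sRGB floats (0.0 to 1.0). */
    static float[] toSrgb(ColorSpace source, float[] components) {
        return source.toRGB(components);
    }

    public static void main(String[] args) throws IOException {
        // Assumption: args[0], if given, is the path to an ICC profile you supply.
        ColorSpace cs = args.length > 0
                ? loadColorSpace(args[0])
                : new ICC_ColorSpace(ICC_Profile.getInstance(ColorSpace.CS_sRGB));

        // Use the first component at its maximum and the rest at their minimum.
        float[] input = new float[cs.getNumComponents()];
        for (int i = 0; i < input.length; i++) {
            input[i] = cs.getMinValue(i);
        }
        input[0] = cs.getMaxValue(0);

        System.out.println(Arrays.toString(toSrgb(cs, input)));
    }
}
```

The conversion itself is a single call to toRGB. The hard part in practice is getting hold of the right profile, whether that is one embedded in the PDF or a sensible default for a device ColorSpace such as CMYK.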