Java PDF Blog

How do Filter and DecodeParms Objects change a PDF Image?

One of the hardest things I found to get to grips with in the PDF file format was using Filter and DecodeParms together. So let us look at some examples to see how they work.

Here is the PDF object dump for an Image object in a PDF file, which uses a /Filter to compress the data and a /DecodeParms to hold the values for that filter.

<< /DecodeParms [<< /BlackIs1 true /K -1 /Rows 3300 /Columns 2560 >>]
/Type /XObject
/Subtype /Image
/ColorSpace [/Indexed /DeviceRGB 1 ]
/Width 2560
/BitsPerComponent 1
/Length 45 0 R
/Height 3300
/Filter [ /CCITTFaxDecode]
>>
stream

It could also be written with the DecodeParms in a separate object, like this:

<<
/DecodeParms [44 0 R]
/Type /XObject
/Subtype /Image
/ColorSpace [/Indexed /DeviceRGB 1 ]
/Width 2560
/BitsPerComponent 1
/Length 45 0 R
/Height 3300
/Filter [ /CCITTFaxDecode]
>>
stream

44 0 obj
<<
/BlackIs1 true
/K -1
/Rows 3300
/Columns 2560
>>
endobj

So far so good. The Image has a /CCITTFaxDecode filter which uses some values stored in /DecodeParms (either directly or as a reference to a separate object).

But the image can have more than one filter, in which case there needs to be a DecodeParms entry for each filter. Both /Filter and /DecodeParms are then arrays with one entry per filter.
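To make the "directly or as an object" point concrete, here is a minimal Java sketch of how a parser might hand back the same dictionary in both cases. The class `DecodeParmsResolver`, the `Ref` type, and the map-based object table are all hypothetical stand-ins invented for this illustration; they are not taken from any real PDF library.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, minimal object model for illustration only; not a real PDF library.
class DecodeParmsResolver {

    // Stands in for an indirect reference such as "44 0 R".
    static final class Ref {
        final int objectNumber;

        Ref(int objectNumber) {
            this.objectNumber = objectNumber;
        }
    }

    // Returns the dictionary itself, whether it was written inline
    // or stored in a separate object and referenced.
    @SuppressWarnings("unchecked")
    static Map<String, Object> resolveDict(Object value, Map<Integer, Object> objects) {
        Object resolved = value;
        if (value instanceof Ref) {
            resolved = objects.get(((Ref) value).objectNumber);
        }
        if (resolved instanceof Map) {
            return (Map<String, Object>) resolved;
        }
        throw new IllegalArgumentException("Expected a dictionary but found: " + resolved);
    }

    // Builds the CCITT parameters used in the examples above.
    static Map<String, Object> ccittParms() {
        Map<String, Object> parms = new HashMap<>();
        parms.put("BlackIs1", Boolean.TRUE);
        parms.put("K", -1);
        parms.put("Rows", 3300);
        parms.put("Columns", 2560);
        return parms;
    }

    public static void main(String[] args) {
        // Object table with the parameters stored as object 44.
        Map<Integer, Object> objects = new HashMap<>();
        objects.put(44, ccittParms());

        Map<String, Object> inline = resolveDict(ccittParms(), objects);
        Map<String, Object> indirect = resolveDict(new Ref(44), objects);

        // Both spellings end up as the same set of values.
        System.out.println(inline.equals(indirect));
    }
}
```

Once the reference has been resolved, the decoding code no longer needs to care which of the two forms the file used.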

<<
/DecodeParms [43 0 R 44 0 R]
/Type /XObject
/Subtype /Image
/ColorSpace [/Indexed /DeviceRGB 1 ]
/Width 2560
/BitsPerComponent 1
/Length 45 0 R
/Height 3300
/Filter [/A85 /CCITTFaxDecode]
>>
stream

43 0 obj
<<
>>
endobj
44 0 obj
<<
/BlackIs1 true
/K -1
/Rows 3300
/Columns 2560
>>
endobj

So in this case, object 43 0 is the DecodeParms value for the /A85 filter and object 44 0 is the DecodeParms value for the /CCITTFaxDecode filter. Because /A85 does not take any values, but we still need a DecodeParms entry for each filter, we end up with an empty object (the PDF specification also allows the null object in that slot). As you can imagine, we had a lot of fun getting our PDF parser to handle all these cases properly!

It works, but it is a clunky mechanism. I would much prefer to have seen all the values in a single Filter object, which would have been simpler to use and avoided these awkward cases. What do you think? Does it have any real advantages?
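The pairing rule described above can be sketched in a few lines of Java. This is only an illustration under some assumptions: the values are taken to be already parsed and resolved into plain Java objects (a `String` for a name, a `Map` for a dictionary, a `List` for an array), and the names `FilterPairing`, `FilterStep`, `asList` and `pair` are hypothetical, not part of any real parser.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration; values are assumed to be already parsed and resolved.
class FilterPairing {

    // One filter name together with its (possibly empty) parameters.
    static final class FilterStep {
        final String filter;
        final Map<String, Object> parms;

        FilterStep(String filter, Map<String, Object> parms) {
            this.filter = filter;
            this.parms = parms;
        }
    }

    // Wraps a single value in a list so the single-value form and the array form look alike.
    static List<Object> asList(Object value) {
        if (value == null) {
            return Collections.emptyList();
        }
        if (value instanceof List) {
            return new ArrayList<Object>((List<?>) value);
        }
        return Collections.singletonList(value);
    }

    // Matches the filter at position i with the DecodeParms entry at position i.
    @SuppressWarnings("unchecked")
    static List<FilterStep> pair(Object filterValue, Object decodeParmsValue) {
        List<Object> filters = asList(filterValue);
        List<Object> parms = asList(decodeParmsValue);
        List<FilterStep> steps = new ArrayList<>();
        for (int i = 0; i < filters.size(); i++) {
            // A missing, null, or non-dictionary entry is treated as "no parameters".
            Object entry = i < parms.size() ? parms.get(i) : null;
            Map<String, Object> dict = entry instanceof Map
                    ? (Map<String, Object>) entry
                    : Collections.<String, Object>emptyMap();
            steps.add(new FilterStep((String) filters.get(i), dict));
        }
        return steps;
    }

    public static void main(String[] args) {
        Map<String, Object> ccitt = new HashMap<>();
        ccitt.put("BlackIs1", Boolean.TRUE);
        ccitt.put("K", -1);
        ccitt.put("Rows", 3300);
        ccitt.put("Columns", 2560);

        // Mirrors the two-filter example: an empty dictionary for /A85, real values for CCITT.
        List<FilterStep> steps = pair(
                Arrays.asList("A85", "CCITTFaxDecode"),
                Arrays.asList(new HashMap<String, Object>(), ccitt));

        for (FilterStep step : steps) {
            System.out.println(step.filter + " -> " + step.parms);
        }
    }
}
```

Normalising everything to lists first means the one-filter and many-filter cases go through the same loop, and an empty dictionary, a null entry, or no DecodeParms at all each come out as an empty parameter map. The decoder would then apply the steps in list order.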