Understanding the PDF file Format – PDF dictionary

The basis of a PDF file is a set of linked PDF objects, and the key part of a PDF object is the PDF dictionary. So it is pretty important to understand how the PDF dictionary works.

If you view a PDF object, you will see something like this:

2479 0 obj
<<
/LastChar 89
/BaseFont /MXREUX+HelveticaLTStd-Roman
/Type /Font
/Encoding /WinAnsiEncoding
/Subtype /Type1
/FontDescriptor 2480 0 R
/FirstChar 32
/Widths [278 0 0 0 0 0 0 0 0 0 0 0 0 0 278 0 556 556 556 0 0 0 0 0 0 0 0 0 0 0 0 0 0 667 0 722 0 667 0 778 722 278 500 0 556 833 722 778 0 0 722 0 0 722 0 944 0 667]
>>
endobj

2484 0 obj
<<
/Subtype /Type1C
/Length 886
/Filter /FlateDecode
>>
stream (rest of object omitted)

This shows two PDF objects, 2479 and 2484 (other objects refer to them as 2479 0 R and 2484 0 R). Each object starts with two numbers, the object number and the generation number (0 here), followed by the keyword obj, which marks the start of the object. The PDF dictionary is contained between the << >> brackets. Object 2484 also has some binary data, which comes after the keyword stream. The last part of an object is the keyword endobj.
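As a rough illustration of that layout, here is a minimal Python sketch that pulls the object number, generation number, dictionary text and stream marker out of one object. The function name split_object and the regular expression are assumptions made for this example, not part of any PDF library, and the sketch only copes with a single dictionary that does not contain nested << >> pairs.

```python
import re

# Hypothetical helper for illustration only.
# Assumes one object whose dictionary has no nested << >> pairs.
OBJ_RE = re.compile(
    rb"(\d+)\s+(\d+)\s+obj\s*<<(.*?)>>\s*(stream\b)?",
    re.DOTALL,
)

def split_object(raw: bytes) -> dict:
    """Split a raw PDF object into header numbers, dictionary text and stream flag."""
    m = OBJ_RE.search(raw)
    if m is None:
        raise ValueError("no PDF object found")
    return {
        "number": int(m.group(1)),        # object number, e.g. 2484
        "generation": int(m.group(2)),    # generation number, e.g. 0
        "dictionary": m.group(3).strip(), # raw text between << and >>
        "has_stream": m.group(4) is not None,
    }

sample = b"""2484 0 obj<<
/Subtype /Type1C
/Length 886
/Filter /FlateDecode
>>
stream
"""
info = split_object(sample)
print(info["number"], info["generation"], info["has_stream"])
```

A real parser works on bytes rather than text because the stream data is binary, which is why the sketch uses a bytes pattern.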

So let's have a closer look and see what is going on between those brackets. An entry in a PDF dictionary consists of a pair: a key (which always starts with a /) and a value (which can be many things). So /LastChar, /BaseFont, /Type, /Encoding, etc. are all keys which are defined in the PDF file specification. It is possible to add your own keys, which will be ignored by Acrobat. The order in which entries are listed is not important, because values are looked up by key and all keys and their values are read before anything happens with the object.
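To make the point about ordering concrete, here is a minimal Python sketch that reads a flat dictionary body into a Python dict. The name parse_flat_dictionary and the token pattern are assumptions for this example; it only recognises names, numbers, simple arrays and N G R references, and it keeps every value as a string.

```python
import re

# Tokens: a /Name, a [ ... ] array, an "N G R" reference, or a bare number.
# Illustrative only: strings, nested dictionaries and nested arrays are not handled.
TOKEN_RE = re.compile(
    r"/[^\s/\[\]<>()]+|\[[^\]]*\]|\d+\s+\d+\s+R\b|[+-]?\d*\.?\d+"
)

def parse_flat_dictionary(body: str) -> dict:
    """Read alternating key and value tokens into a Python dict."""
    tokens = TOKEN_RE.findall(body)
    if len(tokens) % 2 != 0:
        raise ValueError("expected key/value pairs")
    entries = {}
    for key, value in zip(tokens[0::2], tokens[1::2]):
        if not key.startswith("/"):
            raise ValueError(f"key must be a name: {key!r}")
        entries[key] = value
    return entries

a = parse_flat_dictionary(
    "/LastChar 89 /Type /Font /FontDescriptor 2480 0 R /Widths [278 0 556]"
)
b = parse_flat_dictionary(
    "/Widths [278 0 556] /FontDescriptor 2480 0 R /Type /Font /LastChar 89"
)
print(a == b)  # the same entries in a different order compare equal
```

Because the whole dictionary is read before any entry is used, a reader can look up /Type or /Subtype first no matter where they appear in the file.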

The type of value depends on the key: /LastChar takes a number; /BaseFont, /Type and /Subtype take a name (a constant value starting with a /); /Widths is an array of numbers; and /FontDescriptor points to another object. In general, a value can also be stored in a separate object, so in place of the actual data (number, string, etc.) the dictionary can hold a reference such as 2480 0 R to the object that contains it.
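One way to picture direct and indirect values is the small Python sketch below. The Ref type, the resolve function and the object table are all hypothetical, as are object 2500 and the placeholder contents of object 2480, which the listing above does not show.

```python
from typing import NamedTuple

class Ref(NamedTuple):
    """An indirect reference such as 2480 0 R."""
    number: int
    generation: int

def resolve(value, objects):
    """Follow indirect references until a direct value is reached."""
    seen = set()
    while isinstance(value, Ref):
        if value in seen:
            raise ValueError("circular reference")
        seen.add(value)
        value = objects[value]
    return value

# Hypothetical object table keyed by (object number, generation).
objects = {
    Ref(2479, 0): {
        "/Type": "/Font",                  # direct name value
        "/FirstChar": 32,                  # direct number value
        "/LastChar": Ref(2500, 0),         # same kind of value, stored indirectly
        "/FontDescriptor": Ref(2480, 0),   # points to another dictionary
    },
    Ref(2480, 0): {"/Type": "/FontDescriptor"},  # placeholder contents
    Ref(2500, 0): 89,                      # invented object holding just a number
}

font = objects[Ref(2479, 0)]
print(resolve(font["/FirstChar"], objects))
print(resolve(font["/LastChar"], objects))
```

Calling resolve on every value before using it means the rest of the code does not need to care whether the file stored the value directly or in another object.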

The key/value pairs contain all the data for the PDF objects, and now you know how they work.

This post is part of our “Understanding the PDF File Format” series. In each article, we aim to take a specific PDF feature and explain it in simple terms. If you wish to learn more about PDF, we have 13 years' worth of PDF knowledge and tips in our series index.

Mark Stephens

System Architect and Lead Developer at IDRSolutions
Mark Stephens has been working with Java and PDF since 1999 and has diversified into HTML5, SVG and JavaFX. He also enjoys speaking at conferences and has been a Speaker at user groups, Business of Software, Seybold and JavaOne conferences. He has a very dry sense of humor and an MA in Medieval History for which he has not yet found a practical use.

One thought on “Understanding the PDF file Format – PDF dictionary”

  1. Jaime

    Is there anything in the PDF Reference related to this statement you made?:

    > The order they are listed is not important as all keys and their values are read before anything happens with the object.
