Interesting PDF bugs – Mixed up font object

A PDF file can contain several different types of embedded font. Here is an interesting font Object from a PDF created by the DocuCorp LPDF Driver 400.110.020 Jun 29 2006 (according to the Producer entry of the PDF file).

26 0 obj<</Subtype /TrueType/FirstChar 32/LastChar 255/Widths 29 0 R/FontDescriptor 28 0 R/Encoding /WinAnsiEncoding/Type /Font/BaseFont /ABGDHB+ICON1>>
endobj
29 0 obj
[ 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
endobj
28 0 obj<</FontName /ABGDHB+ICON1/FontBBox [ 0 0 1000 1000]/Flags 4/ItalicAngle 0/Ascent 826/Descent -214/CapHeight 826/StemV 94/FontFile 27 0 R>>
endobj

There are two very odd things about this data…

Firstly, Object 29 contains the widths of all the glyphs and lists every one of them as zero. This is clearly wrong (the characters are visible on the page), so the widths must be overridden elsewhere. At the very least, it makes the file larger than it should be.
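One defensive way to cope with a table like this is to treat an all-zero /Widths array as carrying no information and take the advance width from somewhere else. The sketch below is only an illustration: the function names are hypothetical, and the fallback source (a callback that reads the width from the embedded font program) is an assumption, since the file itself does not tell us where the real widths come from.

```python
from typing import Callable, Sequence


def widths_are_unusable(widths: Sequence[float]) -> bool:
    """True when a /Widths array is empty or every entry is zero."""
    return len(widths) == 0 or all(w == 0 for w in widths)


def glyph_width(
    code: int,
    first_char: int,
    widths: Sequence[float],
    font_program_width: Callable[[int], float],
) -> float:
    """Pick a width for a character code.

    font_program_width is a hypothetical callback that returns the width
    stored in the embedded font itself (an assumption for this sketch).
    """
    index = code - first_char
    # Fall back when the table is all zeros or the code is outside it.
    if widths_are_unusable(widths) or not 0 <= index < len(widths):
        return font_program_width(code)
    return widths[index]


# /FirstChar 32 and /LastChar 255 imply 255 - 32 + 1 = 224 entries.
all_zero = [0] * (255 - 32 + 1)
```

With the values from Object 26, `glyph_width(65, 32, all_zero, lookup)` ignores the zero entry and defers to `lookup`, while a table with real values is used as-is.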

More importantly, the Subtype of the font is /TrueType, so we expect the embedded font to be stored as a /FontFile2 object. This example instead contains a /FontFile object, which is supposed to contain Type 1 (PostScript) font data but actually contains the TrueType data in this file. What a mess! Thankfully, it is relatively easy to fix in our code.
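A tolerant reader can avoid trusting the key name at all and look at the first few bytes of the decoded font stream instead. The following is a minimal sketch of that idea, not a description of how any particular renderer does it; the function names and the returned labels are made up for illustration. It relies on the well-known signatures of the formats: TrueType data starts with the version bytes 00 01 00 00 (or the tag `true`), OpenType/CFF starts with `OTTO`, and Type 1 data starts with `%!` (or 80 01 in the binary PFB form).

```python
def sniff_font_program(data: bytes) -> str:
    """Guess the font format from the first bytes of a decoded stream."""
    head = data[:4]
    if head in (b"\x00\x01\x00\x00", b"true"):
        return "truetype"
    if head == b"OTTO":
        return "opentype-cff"
    if head == b"ttcf":
        return "truetype-collection"
    if data[:2] in (b"%!", b"\x80\x01"):
        return "type1"
    return "unknown"


# What each FontDescriptor key is supposed to hold. /FontFile3 is left out
# because its contents depend on the stream's own /Subtype entry.
EXPECTED_BY_KEY = {"FontFile": "type1", "FontFile2": "truetype"}


def resolve_font_format(key: str, data: bytes) -> str:
    """Prefer what the bytes say; fall back to the key if they are unclear."""
    detected = sniff_font_program(data)
    if detected != "unknown":
        return detected
    return EXPECTED_BY_KEY.get(key, "unknown")
```

For a file like this one, `resolve_font_format("FontFile", stream_bytes)` would report `truetype` as long as the stream really begins with a TrueType header, so the data can be handed to the TrueType parser despite the wrong key. The input must be the stream after its filters (such as FlateDecode) have been applied.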

One of the biggest issues with developing a renderer is the number of PDF files which do not conform to the PDF file format specification but still work in Acrobat. This is a really good example.

This post is part of our “Understanding the PDF File Format” series. In each article, we discuss a PDF feature, bug, gotcha or tip. If you wish to learn more about PDF, we have 13 years' worth of PDF knowledge and tips in our series index.


Mark Stephens

System Architect and Lead Developer at IDRSolutions
Mark Stephens has been working with Java and PDF since 1999 and has diversified into HTML5, SVG and JavaFX. He also enjoys speaking at conferences and has been a Speaker at user groups, Business of Software, Seybold and JavaOne conferences. He has a very dry sense of humor and an MA in Medieval History for which he has not yet found a practical use.

