Interesting PDF bugs – Missing image data

After a brief absence, we are back with a bang! Today’s PDF is a very nasty one, allegedly created with these tools:

Creator Form Z_ZSD000073 language ZF

Producer SAP R/3 Release 46C

This file is nasty for two reasons. Firstly, it uses non-embedded CID fonts, so it is not really cross-platform: it relies on the PDF viewer finding a suitable match for each font (rather than displaying exactly what was intended). The PDF spec sets out a very sensible set of rules, but does not enforce them, so creation tools can do silly things.
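As a rough illustration of what "non-embedded" means in practice, here is a minimal Python sketch. It assumes the font dictionaries have already been parsed into plain Python dicts with all indirect references resolved (the function name and the sample dictionary are hypothetical, not taken from the file discussed here). It relies on the fact that an embedded font program is referenced from the font descriptor through a `FontFile`, `FontFile2` or `FontFile3` entry, and that for a composite (Type0) font the descriptor sits on the descendant CIDFont.

```python
# Keys in a FontDescriptor that point at an embedded font program.
EMBEDDED_KEYS = ("FontFile", "FontFile2", "FontFile3")


def is_embedded(font: dict) -> bool:
    """Return True if the (already parsed) font dictionary carries font data."""
    if font.get("Subtype") == "Type0":
        # Composite fonts keep the descriptor on the descendant CIDFont.
        return any(is_embedded(d) for d in font.get("DescendantFonts", []))
    descriptor = font.get("FontDescriptor", {})
    return any(key in descriptor for key in EMBEDDED_KEYS)


# Hypothetical example: a CID font whose descriptor has no FontFile* entry.
sample_font = {
    "Type": "Font",
    "Subtype": "Type0",
    "DescendantFonts": [
        {
            "Type": "Font",
            "Subtype": "CIDFontType0",
            "FontDescriptor": {"Type": "FontDescriptor", "Flags": 4},
        }
    ],
}

print(is_embedded(sample_font))
```

When this returns False, the viewer has to substitute a font from the local system, which is why the result can differ between platforms.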

More seriously, here is an excerpt showing the data for one of the fonts in the PDF file.

 

The Adobe PDF file spec includes lots of flexible ways to map data onto specific glyphs, and this is usually set by the Encoding value. So what is the correct encoding to use with this PDF file?

The actual answer here is to ignore the Encoding value (ETen-B5-H) and use Identity-H (from the BaseFont name).
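The workaround can be sketched as a small heuristic. This is not JPedal's actual code, just a minimal Python illustration under some assumptions: the font dictionary is already parsed into a plain dict, the function name `effective_cmap` is made up, and the BaseFont value in the example is a hypothetical name chosen only because it ends in `-Identity-H` (the real name from the file is not reproduced here). Treating `Identity-V` the same way is also an assumption on our part.

```python
# CMap names that, if found as a suffix of BaseFont, override /Encoding.
# Identity-V is included as an assumption; the post only mentions Identity-H.
IDENTITY_CMAPS = ("Identity-H", "Identity-V")


def effective_cmap(font: dict) -> str:
    """Choose the CMap to decode text with for a Type0 font dictionary."""
    base_font = font.get("BaseFont", "")
    declared = font.get("Encoding", "")
    for identity in IDENTITY_CMAPS:
        # A BaseFont ending in "-Identity-H" hints that the producer really
        # wrote identity-encoded data, whatever /Encoding claims.
        if base_font.endswith("-" + identity):
            return identity
    # Otherwise trust the declared Encoding, as the spec intends.
    return declared


# Hypothetical font dictionary: the BaseFont name here is invented.
font = {
    "Type": "Font",
    "Subtype": "Type0",
    "BaseFont": "SomeFont-Identity-H",
    "Encoding": "ETen-B5-H",
}

print(effective_cmap(font))
```

Fonts whose BaseFont name carries no such suffix still fall through to the declared Encoding, so well-formed files are unaffected by the check.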

I was actually surprised that this PDF opened in Adobe Acrobat, but it does and it now opens correctly in JPedal as well. It is a really good example of how you often have to ignore the spec if you want to open some PDF files.

This post is part of our “Understanding the PDF File Format” series. In each article, we discuss a PDF feature, bug, gotcha or tip. If you wish to learn more about PDF, we have 13 years’ worth of PDF knowledge and tips, so click here to visit our series index!


Mark Stephens

System Architect and Lead Developer at IDRSolutions
Mark Stephens has been working with Java and PDF since 1999 and has diversified into HTML5, SVG and JavaFX. He also enjoys speaking at conferences and has been a Speaker at user groups, Business of Software, Seybold and JavaOne conferences. He has a very dry sense of humor and an MA in Medieval History for which he has not yet found a practical use.
Markee174


3 thoughts on “Interesting PDF bugs – Missing image data”

  1. dreamfly912

    where can i find the tool for viewing the internal of pdf file as shown in the pic in this article?

    • dreamfly912

      thank u for ur replay so soon 🙂
