Interesting PDF bugs – zero bytes in a string

I have been looking at a PDF file which did not open in our Java Viewer. The issue turned out to be that we were misreading a rather crucial text string. In general, string values in a PDF are written in one of two ways:

1. Literal strings: encoded text characters enclosed in parentheses (if the text is WinAnsi encoded it will look like standard text).

(this is a text string)

2. Hex strings: pairs of hexadecimal digits enclosed in angle brackets.

<01ff12>

The offending string (it is the /U value, which is used to decrypt the file) initially looked like case 2 until I opened it up in a hex editor. The last part of the value is a long run of zero bytes (0x00) before you reach the closing > character.

The way to handle this appears to be to simply ignore these values. I suspect they are there because the PDF creation library (ISIS information Systems according to the file) is writing out a fixed-length string value. It was easily fixed with a small code tweak, but it is a shame to find garbage which we then have to ignore.
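To make the fix concrete, here is a minimal sketch of a hex string decoder that skips zero bytes. It is not the actual code from our viewer: the class name HexStringSketch and the method decodeHexString are made up for illustration, and the input is assumed to be the raw bytes found between the < and > delimiters. As far as I can tell, the PDF specification counts the zero byte among its whitespace characters and says whitespace inside a hex string is ignored, so skipping these bytes should be safe.

```java
import java.io.ByteArrayOutputStream;

// Illustrative sketch only; the names here are hypothetical, not the viewer's API.
class HexStringSketch {

    // Returns the value of a hex digit, or -1 for anything else
    // (zero bytes, whitespace, or other padding).
    static int hexValue(int c) {
        if (c >= '0' && c <= '9') return c - '0';
        if (c >= 'a' && c <= 'f') return c - 'a' + 10;
        if (c >= 'A' && c <= 'F') return c - 'A' + 10;
        return -1;
    }

    // Decodes the raw bytes found between '<' and '>' into the string's byte values.
    static byte[] decodeHexString(byte[] raw) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int high = -1; // pending high nibble, or -1 if none is waiting
        for (byte b : raw) {
            int digit = hexValue(b & 0xFF);
            if (digit < 0) {
                continue; // ignore zero bytes and any other non-hex padding
            }
            if (high < 0) {
                high = digit;
            } else {
                out.write((high << 4) | digit);
                high = -1;
            }
        }
        if (high >= 0) {
            out.write(high << 4); // odd number of digits: treat the missing one as 0
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        // The hex digits from the example above, followed by zero-byte padding.
        byte[] raw = {'0', '1', 'f', 'f', '1', '2', 0, 0, 0, 0};
        StringBuilder hex = new StringBuilder();
        for (byte b : decodeHexString(raw)) {
            hex.append(String.format("%02x", b & 0xFF));
        }
        System.out.println(hex);
    }
}
```

Skipping every non-hex byte is a deliberately forgiving choice for this sketch; a stricter parser might skip only whitespace and zero bytes and report anything else as an error.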

This post is part of our “Understanding the PDF File Format” series. In each article, we discuss a PDF feature, bug, gotcha or tip.

Mark Stephens

System Architect and Lead Developer at IDRSolutions
Mark Stephens has been working with Java and PDF since 1999 and has diversified into HTML5, SVG and JavaFX. He also enjoys speaking at conferences and has been a Speaker at user groups, Business of Software, Seybold and JavaOne conferences. He has a very dry sense of humor and an MA in Medieval History for which he has not yet found a practical use.