Mark Stephens Mark Stephens has been working with Java and PDF since 1999 and has diversified into HTML5, SVG and JavaFX. He also enjoys speaking at conferences and has been a Speaker at user groups, Business of Software, Seybold and JavaOne conferences. He has a very dry sense of humor and an MA in Medieval History for which he has not yet found a practical use.

Understanding the PDF File Format – Custom font encodings


One very powerful feature of the PDF file specification is the option to create custom font encodings. This means that for each font you can choose exactly which glyph each text index value used in the Tj command maps onto. This has a number of advantages, including:

1. Making it very easy to map font values with subsetted fonts (especially CID fonts). If you are only using a few glyphs in a font, subsetting can substantially reduce the PDF file size and improve its loading speed.

2. Making it simple to map values from any system/platform so that they display correctly on multiple platforms.

Custom encoding is set up in the /Differences value of the Encoding dictionary. Here is an example:

246 0 obj
<<
/Type /Encoding
/BaseEncoding /MacRomanEncoding
/Differences [32/space 97/a 99/c/d/e/f 104/h/i 108/l 
110/n/o 115/s/t/u 121/y]
>>
endobj

The /BaseEncoding defines the general encoding to use and then the /Differences key lists our changes. Each entry is a number followed by one or more values: the first value is assigned to that number, and for each further value we increment the counter. So [32/space 97/a 99/c/d/e/f 104/h/i 108/l 110/n/o 115/s/t/u 121/y] would define 32 as space, 97 as a, 99 as c, 100 as d, 101 as e, etc. Any number not listed keeps its value from the base encoding, so you can mix and match standard and non-standard values.
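The counting rule is easy to try out in code. Here is a minimal Python sketch, not taken from any PDF library, that turns the text of a /Differences array into a number-to-glyph-name table. The function name parse_differences and the simple regular-expression tokenizer are my own assumptions for illustration; a real PDF parser would also have to handle escapes such as #20 inside names.

```python
import re

# A token is either a /name (a glyph name) or a bare integer (a new starting code).
TOKEN = re.compile(r"/([^\s/\[\]]+)|(\d+)")

def parse_differences(array_text):
    """Return {code: glyph_name} for the text of a /Differences array."""
    mapping = {}
    code = None  # no starting code seen yet
    for name, number in TOKEN.findall(array_text):
        if number:
            # A bare number resets the counter.
            code = int(number)
        else:
            if code is None:
                raise ValueError("glyph name found before any starting code")
            # A name is assigned to the current code, then the counter moves on.
            mapping[code] = name
            code += 1
    return mapping

example = "[32/space 97/a 99/c/d/e/f 104/h/i 108/l 110/n/o 115/s/t/u 121/y]"
differences = parse_differences(example)
print(differences[99], differences[100], differences[101], differences[102])
```

Numbers that never appear in the table (98, for instance) are simply left to the base encoding.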

In this case, all the values are glyph names, but they can also be octal, hex or decimal values. Here is an example I found this week in a PDF file:

obj
<<
/Differences[2/71/105/108/76/111/98/101
/73/117/115/116/114/97/100/121/82/99/104
/87/110 32/space]
/Type/Encoding
>>

If a number has a / in front of it, it is a value; otherwise it sets the glyph number that the next value is assigned to, which is rather confusing.
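To see the difference between the two kinds of number, here is a small Python sketch that walks through the array from that file. It assumes the values written with a / are decimal character codes, which is only my guess for this particular file (they could equally be octal or hex elsewhere). The function name decode_numeric_differences is made up for this example.

```python
import re

def decode_numeric_differences(array_text):
    """Return {code: glyph} where numeric /values are read as decimal characters."""
    mapping = {}
    code = None
    for name, number in re.findall(r"/([^\s/\[\]]+)|(\d+)", array_text):
        if number:
            # No leading slash: this is the next glyph number to use.
            code = int(number)
            continue
        if code is None:
            raise ValueError("value found before any starting number")
        # Leading slash: a value. Assumption: all-digit values are decimal codes.
        mapping[code] = chr(int(name)) if name.isdigit() else name
        code += 1
    return mapping

found = ("[2/71/105/108/76/111/98/101/73/117/115/116/114"
         "/97/100/121/82/99/104/87/110 32/space]")
decoded = decode_numeric_differences(found)
print("".join(decoded[c] for c in range(2, 22)))
```

Under that assumption, numbers 2 to 21 each pick up one letter and 32 is the space, which looks like a subsetted font holding only the letters the document actually uses.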

You can actually use any name as a glyph (like missing_glyph), but if it is a non-standard one then you need to define it in your font.

So Differences gives you a very powerful way to define custom glyph settings for every font. Have you ever used it?

This post is part of our “Understanding the PDF File Format” series. In each article, we aim to take a specific PDF feature and explain it in simple terms. If you wish to learn more about PDF, we have 13 years' worth of PDF knowledge and tips, so visit our series index!

This post is also part of our “Fonts Articles Index”, a set of articles in which we explore fonts.
