Mark Stephens

Mark Stephens has been working with Java and PDF since 1999 and has diversified into HTML5, SVG and JavaFX.

He also enjoys speaking at conferences and has been a Speaker at user groups, Business of Software, Seybold and JavaOne conferences. He has a very dry sense of humor and an MA in Medieval History for which he has not yet found a practical use.

PDF to HTML5 conversion – correctly mapping the embedded glyphs


One of the challenges when using a font converted from a PDF file for HTML5 display is ensuring that the correct glyph is used. Every font contains a character map (cmap) telling the font which glyph to display for a given character code. Here is an example from a font inside a PDF file. In this screenshot you can see that the glyph called lparenori is glyph number 4 and is assigned to character 52.

So when we generate the HTML, we need to write this character out with the value 52, and it will then appear correctly in the font. So where does this value come from?

We cannot just use the value in the PDF file. In the PDF itself, the character has the value 03, as seen on the line [(<0F,03>)85(\n)]TJ

We know that 03 is the glyph called lparenori from the Differences look-up table in the PDF font object.
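To make the Differences look-up concrete, here is a minimal Python sketch of how such an array can be expanded into a code-to-name table. In a PDF, a Differences array is a sequence in which each integer sets the next character code and each name that follows is assigned to consecutive codes. The function name `expand_differences` and the second entry in the sample array are hypothetical; only the pairing of 03 with lparenori comes from the example above.

```python
def expand_differences(differences):
    """Expand a PDF /Differences array into a {code: glyph_name} dict.

    An integer sets the current character code; each name that follows
    is assigned to that code, which then advances by one.
    """
    mapping = {}
    code = None
    for item in differences:
        if isinstance(item, int):
            code = item
        else:
            if code is None:
                raise ValueError("glyph name found before any starting code")
            mapping[code] = item
            code += 1
    return mapping


# Hypothetical array: only 3 -> "lparenori" is taken from the article,
# the entry at 15 (0x0F) is a made-up placeholder name.
differences = [3, "lparenori", 15, "placeholderglyph"]
code_to_name = expand_differences(differences)
print(code_to_name[0x03])  # prints lparenori
```

A real converter would read the array from the font's Encoding dictionary rather than from a hard-coded list.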

So all we have to do is:

1. read the value from the PDF,

2. use the font encoding to find the glyph name,

3. then use the data inside the actual embedded font to work out that lparenori is character 52.

Simple really!
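The three steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any particular converter is implemented: the two dictionaries stand in for the PDF font encoding and for the data read from the embedded font, the function name `html_entity_for_pdf_code` is hypothetical, and the value 52 is assumed to be decimal.

```python
def html_entity_for_pdf_code(pdf_hex, code_to_name, name_to_char):
    """Map a hex value from the PDF to an HTML numeric character reference."""
    # Step 1: read the value from the PDF (a hex string such as "03").
    pdf_code = int(pdf_hex, 16)
    # Step 2: use the font encoding to find the glyph name.
    glyph_name = code_to_name[pdf_code]
    # Step 3: use the embedded font data to find the character code.
    char_code = name_to_char[glyph_name]
    # Write the character out as a decimal numeric character reference.
    return f"&#{char_code};"


# Stand-in tables holding only the values from the example in the article.
code_to_name = {0x03: "lparenori"}   # from the Differences table
name_to_char = {"lparenori": 52}     # from the embedded font (assumed decimal)

print(html_entity_for_pdf_code("03", code_to_name, name_to_char))  # prints &#52;
```

A missing entry in either table raises a `KeyError` here; a real converter would need a fallback for glyphs that cannot be resolved.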

Note also that you can have glyph names which are not standard Adobe names, but it makes life simpler if you can stick to the standard list defined by Adobe.
