Mark Stephens Mark Stephens has been working with Java and PDF since 1999 and has diversified into HTML5, SVG and JavaFX. He also enjoys speaking at conferences and has been a Speaker at user groups, Business of Software, Seybold and JavaOne conferences. He has a very dry sense of humor and an MA in Medieval History for which he has not yet found a practical use.

PDF to HTML5 conversion – non-standard glyphs


I have been looking at a PDF to HTML5 conversion issue where there was some odd text appearing on the HTML5 page but not in the PDF file. It turned out to be rather interesting…

Every glyph inside a PDF file has a name (A, B, Space, ellipsis, etc.). There is a whole set of standard names defined, but you can also use any arbitrary value. The names are listed in the charset and inside the fonts. So long as the names match wherever they are used, you can call them what you want.

However, if you use your own glyph names, the software may not be able to resolve the actual character you want to display or extract as text. So what should we do when generating HTML5 from these files? The only value we have is the glyph name, so this is the odd text we were seeing on the screen (in this case angbracketleft and angbracketright).
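To make the failure mode concrete, here is a minimal sketch of a name-to-character lookup that falls back to the raw glyph name when it cannot resolve one. The class `GlyphNameFallback`, its method `toText` and the tiny table are hypothetical and only for illustration; they are not the converter's actual code, and a real table of standard names would be far larger.

```java
import java.util.Map;

class GlyphNameFallback {

    // Illustrative subset of standard glyph names (assumption: a real
    // converter would carry the full standard list, not these four).
    private static final Map<String, String> STANDARD_GLYPHS = Map.of(
            "A", "A",
            "B", "B",
            "space", " ",
            "ellipsis", "\u2026");

    // Returns the character for a known name; for an unknown name the
    // only value available is the name itself, so that is what comes back.
    static String toText(String glyphName) {
        return STANDARD_GLYPHS.getOrDefault(glyphName, glyphName);
    }

    public static void main(String[] args) {
        System.out.println(toText("ellipsis"));       // resolved to a character
        System.out.println(toText("angbracketleft")); // prints angbracketleft
    }
}
```

With a fallback like this, an unrecognised name such as angbracketleft ends up in the output verbatim, which is the odd text described above.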

So we have added some mapping code to the static helper class HTMLHelper so that you can replace these names with an appropriate value:

/**
 * Replace any non-standard glyph names with the text to display.
 */
public String mapNonstandardGlyfName(String glyf, PdfFont currentFontData) {

    // String.replace matches literal text, so no regular expression is needed
    glyf = glyf.replace("angbracketright", ")");
    glyf = glyf.replace("angbracketleft", "(");

    return glyf;
}

That looks rather better!

[Image: fixed text]
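If more non-standard names turn up, the same replacement could be driven by a lookup table instead of one line of code per name. This is only a sketch: `GlyphNameMapper` and its table are hypothetical, not part of HTMLHelper, and the `PdfFont` parameter is left out so the example compiles on its own.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class GlyphNameMapper {

    // Hypothetical table: non-standard glyph name -> replacement text.
    // LinkedHashMap keeps the replacements in insertion order.
    private static final Map<String, String> REPLACEMENTS = new LinkedHashMap<>();
    static {
        REPLACEMENTS.put("angbracketleft", "(");
        REPLACEMENTS.put("angbracketright", ")");
    }

    // Applies every replacement as a literal (non-regex) substitution.
    static String map(String glyphName) {
        String result = glyphName;
        for (Map.Entry<String, String> entry : REPLACEMENTS.entrySet()) {
            result = result.replace(entry.getKey(), entry.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(map("angbracketleft"));  // prints (
        System.out.println(map("angbracketright")); // prints )
        System.out.println(map("A"));               // prints A
    }
}
```

Names that are not in the table pass through unchanged, so standard glyph names are unaffected.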

 

 

This post is part of our “Fonts Articles Index”, a series of articles in which we explore fonts.
