PDF to HTML conversion – font conflict

I have been looking at a PDF to HTML conversion issue where the text from the PDF file was not appearing on the HTML page. This turned out to be a ‘font conflict’…

A PDF page can use several different versions of a font that all share the same name. The usual reason is that the PDF creation tool embeds a subsetted font containing only the glyphs it needs and no others. If it later uses other glyphs on the page, it has to add those as well, often by embedding another subsetted font. This does not cause a conflict in the PDF because the font name is not used as the key identifier.
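
To make that concrete, here is a rough Python sketch of the idea, not our actual converter code. It assumes a page whose fonts are looked up by a resource key (the keys F1 and F2, the SubsetFont class and the can_draw helper are all made up for illustration), so two subsets can both be called "Arial" without getting in each other's way.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SubsetFont:
    base_name: str      # font name as reported in the PDF
    glyphs: frozenset   # characters this subset is able to draw


# Hypothetical page resources: both subsets are named "Arial",
# but each one is reached through its own resource key.
page_fonts = {
    "F1": SubsetFont("Arial", frozenset("Helo")),
    "F2": SubsetFont("Arial", frozenset("World")),
}


def can_draw(resource_key: str, text: str) -> bool:
    """Return True if the subset selected by resource_key has every glyph in text."""
    font = page_fonts[resource_key]
    return all(ch in font.glyphs for ch in text)
```

Here can_draw("F1", "Hello") and can_draw("F2", "World") both succeed even though the two fonts share a name, because the lookup never goes through the name.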

In HTML, the font name is far more important, and embedding multiple font fragments that all share the same name causes major issues, such as most of the glyphs not appearing. There are three ways around this problem:

1. Stop the use of multiple fonts in the HTML. This is the easiest option, but some characters may not appear as you want. This is our current solution in today’s release.

2. Merge the font data into one font. This is messy…

3. Rename the fonts. This is our long-term goal, but it is slightly messy as it needs some work on the fonts.
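
The renaming option could look something like the Python sketch below. This is only an illustration of the naming step under some assumptions: the function names, the "_1"/"_2" suffix scheme and the font file path are invented here, and the harder part (rewriting the name inside the font data itself) is not shown.

```python
from collections import Counter


def unique_family_names(names):
    """Give each font fragment its own family name.

    Names that occur once are kept as they are; repeated names get a
    numeric suffix. Simplification: assumes a suffixed name such as
    "Arial_1" does not already appear among the input names.
    """
    totals = Counter(names)
    seen = Counter()
    result = []
    for name in names:
        if totals[name] == 1:
            result.append(name)
        else:
            seen[name] += 1
            result.append(f"{name}_{seen[name]}")
    return result


def font_face_rule(family, url):
    """Build one CSS @font-face rule for an embedded font file."""
    return f'@font-face {{ font-family: "{family}"; src: url("{url}"); }}'


# Two fragments called "Arial" plus one unrelated font.
families = unique_family_names(["Arial", "Arial", "Times"])
```

Each fragment then gets its own @font-face rule, for example font_face_rule("Arial_1", "fonts/arial_1.woff"), and the text that came from that fragment refers to "Arial_1" rather than plain "Arial".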

The good news is that the text now appears correctly in the latest release and we have plans to improve the handling of font conflicts still further. Stay tuned…

This post is part of our “Fonts Articles Index”; in these articles we explore fonts.

Mark Stephens

System Architect and Lead Developer at IDRSolutions
Mark Stephens has been working with Java and PDF since 1999 and has diversified into HTML5, SVG and JavaFX. He also enjoys speaking at conferences and has been a Speaker at user groups, Business of Software, Seybold and JavaOne conferences. He has a very dry sense of humor and an MA in Medieval History for which he has not yet found a practical use.
One thought on “PDF to HTML conversion – font conflict”

  1. jirong

    really fast to push a quick fix!
    2 thumbs up!!!
