I have been looking at a PDF to HTML conversion issue where the text from the PDF file was not appearing on the HTML page. This turned out to be a ‘font conflict’…
A PDF page can use several different versions of a font that all share the same name. The usual reason is that the PDF creation tool embeds a subsetted font containing only the glyphs it needs and no others. If it later uses other glyphs on the page, it has to add those as well, often by embedding another subsetted font. This does not cause a conflict because the font name is not used as the key identifier.
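To illustrate why same-named subsets can coexist in a PDF, here is a minimal sketch that models a page's font resources as a Python dictionary. The resource names (`F1`, `F2`), the font name `Arial` and the glyph sets are made-up values for illustration, not taken from a real file, and the model leaves out almost everything a real font object contains.

```python
# Simplified model of a PDF page's font resources (illustrative values only).
# Text on the page selects a font by its resource name (the dictionary key),
# so two subsets can share the same font name without clashing.
page_fonts = {
    "F1": {"font_name": "Arial", "glyphs": {"H", "e", "l", "o"}},
    "F2": {"font_name": "Arial", "glyphs": {"W", "r", "d"}},
}


def glyphs_available(resource_name: str) -> set:
    """Return the glyphs in the subset selected by a resource name."""
    return page_fonts[resource_name]["glyphs"]


# Both subsets stay reachable even though their font names are identical.
same_name = page_fonts["F1"]["font_name"] == page_fonts["F2"]["font_name"]
```

Because the lookup goes through `F1` or `F2` rather than through `Arial`, neither subset hides the other.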
In HTML, the font name is far more important, and embedding multiple fragments of a font all with the same name causes major issues, such as most of the glyphs not appearing. There are three ways around this problem:
1. Stop the use of multiple fonts in the HTML. This is the easiest option but could result in not all the characters appearing as you want. This is our current solution in today’s release.
2. Merge the font data into one font. This is messy…
3. Rename the fonts. This is our long-term goal, but it is slightly messy as it needs some work on the fonts.
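The conflict and two of the workarounds can be sketched in a few lines of Python. This is a deliberately simplified model, not our converter's code: it assumes the HTML side keeps one font per name, with a later subset replacing an earlier one, whereas real browser behaviour is more involved. The function names, the `Arial` font and the `_1` suffix scheme are all hypothetical.

```python
# Two subsets of the same font, as they might come out of a PDF
# (illustrative values only).
subsets = [
    ("Arial", {"H", "e", "l", "o"}),
    ("Arial", {"W", "r", "d"}),
]


def register_by_name(fonts):
    """Name-keyed lookup: a later subset replaces an earlier one with the same name."""
    table = {}
    for name, glyphs in fonts:
        table[name] = glyphs
    return table


def merge_subsets(fonts):
    """Option 2: combine the glyphs of same-named subsets into one font."""
    merged = {}
    for name, glyphs in fonts:
        merged.setdefault(name, set()).update(glyphs)
    return merged


def rename_subsets(fonts):
    """Option 3: give each repeated name a numeric suffix so it becomes unique."""
    seen = {}
    renamed = []
    for name, glyphs in fonts:
        count = seen.get(name, 0)
        seen[name] = count + 1
        unique = name if count == 0 else f"{name}_{count}"
        renamed.append((unique, glyphs))
    return renamed


conflicted = register_by_name(subsets)                 # one entry, glyphs lost
merged = merge_subsets(subsets)                        # one entry, all glyphs
renamed = register_by_name(rename_subsets(subsets))    # two entries, all glyphs
```

In this model `conflicted` ends up with a single `Arial` entry that has lost the first subset's glyphs, while `merged` and `renamed` both keep every glyph. A real implementation of renaming would also have to update the name stored inside the font data and every reference to it in the generated HTML and CSS, and guard against the new name clashing with a font that already exists.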
The good news is that the text now appears correctly in the latest release and we have plans to improve the handling of font conflicts still further. Stay tuned…
This post is part of our “Fonts Articles Index”. In these articles we explore fonts.
Post by Mark Stephens