When you convert PDF to HTML5, you can run into the problem of duplicate font names. A PDF file can embed lots of fonts and subset them to include just the glyphs actually used (keeping the size of the font data down). So a page could contain several different fonts, all called Arial. This is not an issue in a PDF file because the font name is just a piece of information, not the key used to identify the font.
In an HTML5 file, however, the font name is the unique key: it links the CSS font-family property on the text to the @font-face rule which embeds the font. So we need to ensure that every font name is unique in the HTML5. How do we handle this?
This is how we deal with it. The first time Arial appears, we call it Arial. If a different version of Arial appears, we append the FontID (which is how the PDF identifies the font) and the size of the font data to give a unique name (for example, Arial_C2_0_5400). Luckily, because the PDF does not use the font name as a key, we can alter it for our own use without breaking anything else, and so handle all these fonts. Does this seem sensible?
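To make the naming rule concrete, here is a minimal sketch in Python. It is not our converter's actual code: the class `FontNamer`, the function `font_face_rule`, the FontID `C0_0`, the size 7120 and the file paths are all made up for illustration. It only assumes that each embedded font comes with a name, a FontID and the length of its font data.

```python
class FontNamer:
    """Hands out one CSS font-family name per distinct embedded font (illustrative only)."""

    def __init__(self):
        self._assigned = {}  # (name, font_id, data_length) -> CSS name already given
        self._used = set()   # CSS names handed out so far

    def css_name(self, base_name, font_id, data_length):
        key = (base_name, font_id, data_length)
        if key in self._assigned:
            # Same embedded font seen again: reuse the name it already has.
            return self._assigned[key]
        name = base_name
        if name in self._used:
            # A different font already claimed this name, so append the
            # FontID and the size of the font data to make it unique.
            name = f"{base_name}_{font_id}_{data_length}"
        self._used.add(name)
        self._assigned[key] = name
        return name


def font_face_rule(css_name, url):
    """Builds the @font-face rule that embeds the font under its unique name."""
    return f"@font-face {{ font-family: '{css_name}'; src: url('{url}'); }}"


namer = FontNamer()
first = namer.css_name("Arial", "C0_0", 7120)   # hypothetical first Arial subset
second = namer.css_name("Arial", "C2_0", 5400)  # a different Arial subset
print(font_face_rule(first, "fonts/arial_1.woff"))
print(font_face_rule(second, "fonts/arial_2.woff"))
```

The first subset keeps the plain name Arial and the second becomes Arial_C2_0_5400, so each font-family value points at exactly one @font-face rule. The sketch assumes the suffixed name never clashes with another font's name; a real converter would need to check for that as well.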