PDF to HTML5 conversion has to deal with fonts embedded in PDF files, and there are three ways to do this:
1. Ignore the issue and hope the text looks okay in the available fonts.
2. Convert the text into shapes or draw as images.
3. Create new fonts for web use and use the embedded font within the HTML5 page.
We decided to go with option 3 for our PDF to HTML5 conversion. PDF files use two font technologies: TrueType and PostScript. We added initial support for TrueType last year, and today we released the first version of our PostScript font support. We convert the PostScript fonts to OpenType fonts (.otf) so that they work on as many platforms as possible.
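To make option 3 a little more concrete: once a font has been converted to a web-friendly file, the HTML5 page has to tell the browser where to find it, and in CSS that is done with an @font-face rule. The Java sketch below only illustrates that idea and is not the converter's actual code. The class name FontFaceRule, the family name and the file path are all made up for the example.

```java
// Hypothetical helper, not part of any IDRsolutions API.
final class FontFaceRule {

    // Builds a CSS @font-face rule that points the browser at a converted font file.
    static String build(String family, String url) {
        // Escape backslashes and double quotes so the family name stays a valid CSS string.
        String safeFamily = family.replace("\\", "\\\\").replace("\"", "\\\"");
        return "@font-face {\n"
             + "  font-family: \"" + safeFamily + "\";\n"
             + "  src: url(\"" + url + "\") format(\"opentype\");\n"
             + "}\n";
    }

    public static void main(String[] args) {
        // Example values only; a real converter would use the name and path of the extracted font.
        System.out.print(build("ExtractedFont", "fonts/ExtractedFont.otf"));
    }
}
```

Text on the page can then refer to the font by its family name, so it is drawn with the original glyphs rather than a substitute.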
You can see the difference it makes in this screenshot (substituted font on the left, extracted and reused font on the right). What do you think?
Because this is a first release, we have left it disabled by default. You can enable it with the JVM flag -DconvertOTFFonts="true"
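A -D flag simply sets a Java system property, so if you cannot change the command line you may be able to set the same property from code instead. The sketch below assumes the converter reads the property after your code has set it, which we have not confirmed here, and the class OtfFontSupport is a hypothetical wrapper written for this example.

```java
// Hypothetical wrapper; only the property name comes from the flag above.
final class OtfFontSupport {

    // Property name taken from the -DconvertOTFFonts flag.
    static final String PROPERTY = "convertOTFFonts";

    // Has the same effect as -DconvertOTFFonts="true" on the command line,
    // assuming it runs before the converter reads the property.
    static void enable() {
        System.setProperty(PROPERTY, "true");
    }

    // Reports whether the property is currently set to "true" (case-insensitive).
    static boolean isEnabled() {
        return Boolean.parseBoolean(System.getProperty(PROPERTY, "false"));
    }
}
```

Call OtfFontSupport.enable() as early as possible, before starting any conversion.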
We look forward to your feedback…
This post is part of our “Fonts Articles Index”, a series of articles in which we explore fonts.
IDRsolutions develop a Java PDF library, a PDF forms to HTML5 converter, a PDF to HTML5 or SVG converter and a Java Image Library that doubles as an ImageIO replacement. On the blog, our team post about anything interesting they learn.