I have been working on mapping text as closely as possible to PDF layout and thought an article on progress would be of interest as it shows how you can debug HTML5 and what we are up to…
I have been working on some sample newspaper pages. These are good samples to work with because the multi-column format makes them particularly challenging.
The first improvement was actually to spot that there was a bug in our code (it does sadly happen!). If the text occupied exactly the same space as the slot on the PDF page, our code still reduced the font size by one point. That is now fixed!
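A bug like this typically comes down to a comparison that treats an exact fit as an overflow. Here is a minimal sketch in Python of the corrected check. The `fit_font_size` helper, its parameters and the small tolerance are hypothetical and for illustration only; this is not our actual converter code.

```python
def fit_font_size(font_size: int, text_width: float, slot_width: float,
                  tolerance: float = 1e-6) -> int:
    """Return the font size to use for text placed in a slot.

    Hypothetical helper: widths are in points. The size is reduced by one
    point only when the text is genuinely wider than the slot.
    """
    # A buggy version would test `text_width >= slot_width`, which also
    # shrinks text that fits the slot exactly.
    if text_width > slot_width + tolerance:
        return font_size - 1
    return font_size
```

With this check, `fit_font_size(9, 100.0, 100.0)` keeps the size at 9, while text that really overflows (say 100.5 in a 100.0 slot) still drops to 8.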
For investigating HTML5 issues, Chrome has a neat debugger which allows you to inspect any element on the page (right-click the element and choose Inspect from the context menu). This makes it very easy to examine the size and styling of an individual element.
I was originally rounding the font size up or down, whichever gave the better fit, in a simple symmetrical manner. Using the debugger, it became clear that when adjusting the font size for best fit it was better to stick to the lower size value unless the gap was over half the font size. This gives a better representation.
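A minimal sketch of that rule in Python. It assumes the "gap" is the horizontal space left in the slot when the text is set at the lower size; that reading, and the `choose_font_size` name and parameters, are illustrative assumptions rather than our converter's actual code.

```python
def choose_font_size(lower_size: int, width_at_lower: float,
                     slot_width: float) -> int:
    """Pick between `lower_size` and the next whole point size.

    Assumption: `width_at_lower` is the text width (in points) when it is
    set at `lower_size`, and `slot_width` is the width of the PDF slot.
    """
    gap = slot_width - width_at_lower
    # Prefer the lower size; only step up when the leftover gap is
    # more than half the font size.
    if gap > lower_size / 2:
        return lower_size + 1
    return lower_size
```

For example, at 8pt a gap of 2 points stays at 8, while a gap of 6 points (more than half of 8) moves up to 9.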
The big issue with text from a PDF file is that quite often the best font size would be 8.5pt but I have to choose between 8pt and 9pt. This means that I cannot reproduce the hard right-aligned margin on the columns of text (but it is a reasonably good representation of the page). As the page uses Type1 embedded fonts, it will look even better when we add in our Type1 to OTF font support and include the actual fonts as web fonts!
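To put rough numbers on the 8pt versus 9pt choice, here is a small Python sketch. It assumes text width scales linearly with font size (browsers may deviate slightly), and the 170pt column width is a made-up figure, not one measured from the sample pages.

```python
def width_at(size: float, ideal_size: float, slot_width: float) -> float:
    """Width of a line at `size`, given it fills `slot_width` exactly at `ideal_size`.

    Assumes width scales linearly with font size.
    """
    return slot_width * size / ideal_size


slot = 170.0   # hypothetical column width in points
ideal = 8.5    # the size that would fill the column exactly

for size in (8, 9):
    difference = width_at(size, ideal, slot) - slot
    print(f"{size}pt: {difference:+.1f}pt relative to the column width")
```

Under these assumptions the line falls 10pt short at 8pt and overruns by 10pt at 9pt, so neither whole size lands on the right-hand margin.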
How do you like to debug HTML5?
This post is part of our “Fonts Articles Index”. In these articles we explore fonts.
IDRsolutions develop a Java PDF library, a PDF forms to HTML5 converter, a PDF to HTML5 or SVG converter and a Java Image Library that doubles as an ImageIO replacement. On the blog our team post about anything interesting they learn.