Mark Stephens

Mark Stephens has been working with Java and PDF since 1999 and has diversified into HTML5, SVG and JavaFX.

He also enjoys speaking at conferences and has spoken at user groups and at the Business of Software, Seybold and JavaOne conferences. He has a very dry sense of humor and an MA in Medieval History for which he has not yet found a practical use.

PDF to HTML conversion – tradeoffs on adjusting font size


When you convert a PDF to HTML, you move from an environment where text can be any floating-point size to one where you have to work with integer point sizes. This causes problems: if the PDF font size is 8.5pt, should we use 8 or 9 in the HTML? (8.5 is not allowed, and both 8 and 9 are strictly wrong because the text will not fit exactly.)

To improve the positioning of the text in HTML we alter the CharSpacing (moving the characters closer together or further apart) and also check whether adjusting the font size would produce a better fit. This is especially important when substituting fonts in the HTML. Some fonts I have seen are much thinner than the replacement HTML fonts, so 9pt in font X is actually more like 18pt in the font we use to display the HTML.
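One way to picture the "better fit" check is as a width comparison. The sketch below is only an illustration and is not the converter's actual algorithm: the class `FitFontSize`, the method `fittedSize` and the sample widths are all made up for this example. It assumes that the width of a run of text grows roughly in proportion to its font size, so the size that matches the PDF width is the reference size scaled by the ratio of the two widths, rounded to a whole point.

```java
public class FitFontSize {

    /**
     * Hypothetical helper (not part of any real API).
     * Estimates the integer pt size at which the substitute font would
     * occupy the same width as the text did in the PDF, assuming width
     * scales linearly with font size.
     *
     * @param targetWidth   width of the text run in the PDF
     * @param measuredWidth width of the same text in the substitute font at referenceSize
     * @param referenceSize pt size used to obtain measuredWidth
     */
    static int fittedSize(double targetWidth, double measuredWidth, int referenceSize) {
        if (measuredWidth <= 0) {
            throw new IllegalArgumentException("measuredWidth must be positive");
        }
        // Scale by the width ratio, round to a whole point, never go below 1pt
        return (int) Math.max(1, Math.round(referenceSize * targetWidth / measuredWidth));
    }

    public static void main(String[] args) {
        // Substitute font at 9pt is only half as wide as the PDF text: use 18pt
        System.out.println(fittedSize(120.0, 60.0, 9));
        // Substitute font at 8pt is slightly too wide: still rounds to 8pt
        System.out.println(fittedSize(100.0, 104.0, 8));
    }
}
```

The second call shows why small differences matter: a mismatch of a few percent can tip the rounded value to the next whole point on one line and not on the next.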

The drawback with this approach is that it can result in blocks of text with slightly changing font sizes (8, 9, 8, 9, 9, 8pt for example). This is more accurate but looks odd. So in our latest release we have added a compromise in the HTML. We now ignore small changes in HTML font size (so that line of text will now be 8, 8, 8, 8, 8, 8pt), but the 9pt is still changed to 18pt. You can adjust the threshold value using the code call:

//only adjust font if change bigger than 5pt
HTMLoutput.setValue(HTMLDisplay.UseFontResizing, 5);

Altering the value 5 to zero would include any tiny change, and 10 would only alter the font size if the change was greater than 10pt. 5 seems a sensible default, and you can experiment to find the best compromise for your files when converting PDF to HTML. What value works best for you?
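To see how the threshold behaves, here is a minimal standalone sketch of the rule as described above. It is not the library's internal code: the class `FontSizeThreshold` and its methods are hypothetical names, and it assumes the rule is simply "keep the original size unless the fitted size differs by more than the threshold".

```java
import java.util.Arrays;

public class FontSizeThreshold {

    // Hypothetical helper (not part of any real API): keep the base size
    // unless the fitted size differs from it by MORE than the threshold (in pt).
    static int chooseFontSize(int baseSize, int fittedSize, int threshold) {
        return Math.abs(fittedSize - baseSize) > threshold ? fittedSize : baseSize;
    }

    // Applies the same rule to every fitted size in a block of text.
    static int[] applyThreshold(int baseSize, int[] fittedSizes, int threshold) {
        int[] result = new int[fittedSizes.length];
        for (int i = 0; i < fittedSizes.length; i++) {
            result[i] = chooseFontSize(baseSize, fittedSizes[i], threshold);
        }
        return result;
    }

    public static void main(String[] args) {
        // Fitted sizes for text whose base size is 8pt
        int[] fitted = {8, 9, 8, 9, 9, 8};

        // Threshold 5: the 1pt wobble is ignored
        System.out.println(Arrays.toString(applyThreshold(8, fitted, 5))); // [8, 8, 8, 8, 8, 8]
        // Threshold 0: every change is kept
        System.out.println(Arrays.toString(applyThreshold(8, fitted, 0))); // [8, 9, 8, 9, 9, 8]
        // A 9pt to 18pt change (9pt difference) still goes through at threshold 5
        System.out.println(chooseFontSize(9, 18, 5)); // 18
    }
}
```

Under this reading, a threshold of 10 would leave that last case at 9pt, because a 9pt difference is not greater than 10.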


