Mark Stephens
Mark has been working with Java and PDF since 1999 and is a big NetBeans fan. He enjoys speaking at conferences. He has an MA in Medieval History and a passion for reading.

PDF to HTML5 conversion – duplicate text in PDF files for bold effects


A popular trick in PDF files is to print the same text twice, with the second copy offset slightly, to create a bold effect.

[Image: the text as it appears in the PDF]

The trick does not carry over to HTML5, so a straight conversion just gives you doubled, overlapping text. How ugly!

[Image: the same text in HTML5, doubled and overlapping]

So we add some ‘intelligence’ to the conversion to ignore these duplicate characters. It needs to be smart enough to work correctly when we get genuine double characters, such as the "ll" in following or the "oo" in moon, so we look at the position of each letter and the gap between the letters.
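As a rough illustration of the position-and-gap idea, here is a minimal Java sketch. It is not the converter's actual code: the `DuplicateGlyphFilter` and `Glyph` names, the 25% threshold and the coordinate model are all assumptions made for this example. A character is treated as a bold-effect duplicate only if the same character has already been drawn almost on top of it; a genuine double letter sits roughly a full character width further along, so it is kept.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of duplicate-text filtering; all names are illustrative. */
final class DuplicateGlyphFilter {

    /** A positioned character; x, y and width share the same page units. */
    static final class Glyph {
        final char ch;
        final double x;
        final double y;
        final double width;

        Glyph(char ch, double x, double y, double width) {
            this.ch = ch;
            this.x = x;
            this.y = y;
            this.width = width;
        }
    }

    // Assumed threshold: an offset of up to 25% of the glyph width counts as an overprint.
    static final double MAX_OFFSET_RATIO = 0.25;

    /** True if b is the same character as a and is drawn almost on top of it. */
    static boolean isOverprint(Glyph a, Glyph b) {
        if (a.ch != b.ch) {
            return false;
        }
        double limit = a.width * MAX_OFFSET_RATIO;
        return Math.abs(a.x - b.x) <= limit && Math.abs(a.y - b.y) <= limit;
    }

    /** Returns the glyphs in drawing order with overprinted duplicates dropped. */
    static List<Glyph> removeOverprints(List<Glyph> glyphs) {
        List<Glyph> kept = new ArrayList<>();
        for (Glyph candidate : glyphs) {
            boolean duplicate = false;
            // Check every kept glyph, because the PDF may draw the whole word
            // once and then draw it again, rather than doubling letter by letter.
            for (Glyph earlier : kept) {
                if (isOverprint(earlier, candidate)) {
                    duplicate = true;
                    break;
                }
            }
            if (!duplicate) {
                kept.add(candidate);
            }
        }
        return kept;
    }

    /** Joins the characters of the glyphs into a string, in list order. */
    static String text(List<Glyph> glyphs) {
        StringBuilder sb = new StringBuilder();
        for (Glyph g : glyphs) {
            sb.append(g.ch);
        }
        return sb.toString();
    }
}
```

The nested loop keeps the sketch short but is quadratic in the number of characters; a real converter would more likely compare only against nearby characters on the same line, and would need to tune the threshold against real files.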

This gives a much better representation of the text 🙂

[Image: the HTML5 text with the duplicate characters removed]

The PDF file format uses lots of tricks which work very well in PDF but need care when they are translated into HTML5.

 




IDRsolutions Ltd 2021. All rights reserved.