PDF to HTML5 conversion – 2 ways to handle text content

When you convert a PDF file into HTML5, there are several ways to handle the text. For example, you can draw it on the canvas, or you can put it in a div and embed the font. Both approaches have advantages and disadvantages. Here are two ways to show text:

1. Put the text in divs. This produces very small files, and the text is selectable and scales beautifully, but there is a tradeoff in exact spacing.

2. Draw the text as shapes on the canvas. If you have your own font engine (and we have spent 8 years building one, including TrueType font hinting), writing the text to the canvas gives a very good representation, but the text is not selectable and the files are larger.
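
To make the difference concrete, here is a minimal sketch of what each option might emit for one piece of text. It is not our converter's actual output: the `TextRun` and `PathCmd` shapes and the function names `toDiv` and `toCanvasScript` are hypothetical, the baseline-to-top offset is a rough approximation, and the glyph outline is assumed to come from a font engine as straight-line segments only. The canvas calls it generates (`beginPath`, `moveTo`, `lineTo`, `closePath`, `fill`) are the standard 2D context methods.

```typescript
// A positioned run of text as it might be extracted from a PDF page.
// Hypothetical shape for illustration, not a real converter's data model.
interface TextRun {
  text: string;
  x: number;          // left edge in CSS pixels
  y: number;          // baseline position in CSS pixels
  fontSize: number;   // in CSS pixels
  fontFamily: string; // assumed to be a plain name with no quote characters
}

// Escape the characters that are special in HTML text content.
function escapeHtml(s: string): string {
  return s.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

// Option 1: real text in an absolutely positioned div.
// The browser lays out the glyphs, so the text stays selectable,
// but spacing depends on the browser's own font rendering.
function toDiv(run: TextRun): string {
  const top = run.y - run.fontSize; // rough approximation: baseline to box top
  const style =
    `position:absolute;left:${run.x}px;top:${top}px;` +
    `font-family:'${run.fontFamily}';font-size:${run.fontSize}px;white-space:pre`;
  return `<div style="${style}">${escapeHtml(run.text)}</div>`;
}

// A glyph outline as simple path commands (hypothetical font engine output).
type PathCmd =
  | { op: "M"; x: number; y: number } // move to
  | { op: "L"; x: number; y: number } // line to
  | { op: "Z" };                      // close the current contour

// Option 2: the glyph drawn as a filled shape on a canvas.
// Returns JavaScript source that draws the outline on a 2D context.
function toCanvasScript(outline: PathCmd[], ctxName: string = "ctx"): string {
  const lines: string[] = [`${ctxName}.beginPath();`];
  for (const c of outline) {
    if (c.op === "Z") {
      lines.push(`${ctxName}.closePath();`);
    } else if (c.op === "M") {
      lines.push(`${ctxName}.moveTo(${c.x},${c.y});`);
    } else {
      lines.push(`${ctxName}.lineTo(${c.x},${c.y});`);
    }
  }
  lines.push(`${ctxName}.fill();`);
  return lines.join("\n");
}
```

Even in this toy form the tradeoff shows: option 1 emits one short element per run of text, while option 2 emits several drawing calls for every single glyph, which is why the canvas files are larger and the text cannot be selected.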

We like our customers to have the choice, so we have been working on option 1 and have added a first release of option 2 in today’s release. You can see the results in yesterday’s blog post.

What would you like to see next?

This post is part of our “HTML5 Article index”. In these articles, we aim to help you understand the world of HTML5.


Mark Stephens

System Architect and Lead Developer at IDRSolutions
Mark Stephens has been working with Java and PDF since 1999 and has diversified into HTML5, SVG and JavaFX. He also enjoys speaking at conferences and has been a Speaker at user groups, Business of Software, Seybold and JavaOne conferences. He has a very dry sense of humor and an MA in Medieval History for which he has not yet found a practical use.