Mark Stephens has been working with Java and PDF since 1999 and has diversified into HTML5, SVG and JavaFX. He also enjoys speaking at conferences and has been a Speaker at user groups, Business of Software, Seybold and JavaOne conferences. He has a very dry sense of humor and an MA in Medieval History for which he has not yet found a practical use.

PDF to HTML5 conversion – Tradeoff on filesize versus accuracy

Here is a good example of the sort of tradeoff you run into when converting PDF files to HTML5. The PDF file contains a right-aligned table in which each row includes a line of dots.

[Image: the original text in the PDF]

When we convert this to HTML5, we can keep every dot in its own div, so that we can position each one exactly. This works fine, but it creates large files which can crash mobile browsers.

So why not roll the dots together into a single string inside one div? The file is much smaller and faster to display, but we lose the alignment.
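
To make the size difference concrete, here is a rough sketch of the two strategies. The function names, coordinates and inline styles are made up for illustration and are not the markup our converter actually emits.

```python
def dots_as_separate_divs(xs, y):
    """One absolutely positioned div per dot: exact placement, lots of markup."""
    return "".join(
        f'<div style="position:absolute;left:{x}px;top:{y}px">.</div>'
        for x in xs
    )


def dots_as_single_div(count, x, y):
    """All dots in one div: tiny markup, but the browser decides the spacing."""
    return f'<div style="position:absolute;left:{x}px;top:{y}px">{"." * count}</div>'


# Hypothetical row: 40 dots, 5 pixels apart, starting at x = 100.
xs = [100 + 5 * i for i in range(40)]
separate = dots_as_separate_divs(xs, 200)
single = dots_as_single_div(len(xs), xs[0], 200)

print(len(separate), "characters of markup with one div per dot")
print(len(single), "characters of markup with a single div")
```

Multiply that difference by every dotted row on every page and the appeal of the single div is obvious.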

So is there any way we can keep the single div but stretch it out?

It turns out there are two possible values we could use.

charSpacing allows us to insert a gap between characters, but you cannot use floating point values. So if you have 40 characters on a line, you can only adjust the line length in steps of at least 40 pixels. We need to stretch the line by about 10 pixels, so it is no good to us. If we wanted to use it we would need to break the line up, which gets messy 🙁
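
The arithmetic behind that is simple enough to sketch. This assumes, as described above, that only whole-pixel spacing values are usable and that every character on the line picks up the extra gap; the helper names are hypothetical.

```python
def char_spacing_stretch(char_count: int, spacing_px: int) -> int:
    """Total extra width when every character gets spacing_px extra pixels."""
    return char_count * spacing_px


def closest_char_spacing(char_count: int, target_px: float):
    """Best whole-pixel spacing for a target stretch, plus the error it leaves."""
    spacing = round(target_px / char_count)
    achieved = char_spacing_stretch(char_count, spacing)
    return spacing, achieved - target_px


# 40 characters that need to grow by about 10 pixels in total.
spacing, error = closest_char_spacing(40, 10)
print(f"best spacing: {spacing}px, error: {error}px")
print(f"smallest non-zero stretch: {char_spacing_stretch(40, 1)}px")
```

With whole pixels the choice is between no stretch at all and overshooting by 30 pixels, which is why this option was dropped.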

wordSpacing allows us to stretch the spaces on the line and gives us a small file and right alignment.
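
As a minimal sketch of the idea, the extra width we need is shared out across the spaces on the line. The widths, the sample line and the function names below are illustrative assumptions, not values or code taken from the converter.

```python
def word_spacing_px(text: str, natural_width_px: float, target_width_px: float) -> float:
    """Extra pixels to add to each space so the line reaches the target width."""
    spaces = text.count(" ")
    if spaces == 0:
        raise ValueError("no spaces on the line, so word spacing cannot stretch it")
    return (target_width_px - natural_width_px) / spaces


def styled_div(text: str, spacing_px: float) -> str:
    """Single div carrying the whole line, stretched with the CSS word-spacing property."""
    return f'<div style="word-spacing:{spacing_px:.2f}px">{text}</div>'


# Hypothetical line: 21 dots separated by 20 spaces, 200px wide, needs to be 210px.
line = " ".join(["."] * 21)
spacing = word_spacing_px(line, natural_width_px=200.0, target_width_px=210.0)
html = styled_div(line, spacing)
print(html)
```

Because the stretch is divided between the spaces rather than applied per character, a small adjustment such as 10 pixels is easy to hit, and we still only emit one div.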

This is the strategy used in today’s release. You can see the compromises on accuracy, speed and filesize. So let us know what matters most to you…

