Mark Stephens has been working with Java and PDF since 1999 and has diversified into HTML5, SVG and JavaFX. He also enjoys speaking at conferences and has spoken at user groups, Business of Software, Seybold and JavaOne conferences. He has a very dry sense of humor and an MA in Medieval History for which he has not yet found a practical use.

PDF hacks and HTML5 – ‘hidden’ PDF text

While debugging our PDF to HTML5 converter we have come across all sorts of interesting ‘PDF’ features which need converting to an HTML5 equivalent.

Today, I have been looking at a PDF page which had extra text in the HTML5 version. It turns out that the text is also in the PDF, but it is invisible there. You can select it but you cannot see it, because a white box has been drawn over it in the PDF…

In general this is not a good way to delete PDF text (especially if it is sensitive or confidential!). The text is still there in the PDF and can be easily extracted.
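To see why the text survives, here is a minimal sketch in Python. The content stream below is hand-written for illustration (it is not taken from the PDF discussed here), and the regular expression is a deliberately naive stand-in for a real text extractor. It only understands literal strings shown with the `Tj` operator.

```python
import re

# Hypothetical page content stream, written by hand for this example.
# It shows the text "Confidential", then sets a white fill colour (1 1 1 rg)
# and fills a rectangle (re f) on top of it.
CONTENT_STREAM = (
    "BT /F1 12 Tf 72 700 Td (Confidential) Tj ET\n"
    "1 1 1 rg\n"
    "70 695 120 18 re f\n"
)


def extract_shown_text(stream):
    """Return the literal strings passed to the Tj (show text) operator."""
    # Matches "(...)" followed by Tj, allowing backslash escapes in the string.
    return re.findall(r"\(((?:[^()\\]|\\.)*)\)\s*Tj", stream)


if __name__ == "__main__":
    print(extract_shown_text(CONTENT_STREAM))
```

The white rectangle only changes what is painted on the page. The text operator is still in the stream, so anything that reads the stream rather than the rendered pixels will find it.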

The white box is also drawn in the HTML5, but because the shape is on the canvas layer (and the text is in a div on the separate text layer, which sits above the canvas) the text is not hidden.

The practical fix is to put the text onto the canvas and we have a flag to do this. This is not totally satisfactory because text on the canvas acts like a bitmap. It does not scale without pixellation.
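As a rough illustration of the problem, the sketch below shows one way a converter could spot text that is hidden by a shape painted later. This is not how our converter works. The `Item` class, the bounding boxes and the assumption that every shape is fully opaque are all made up for the example.

```python
from dataclasses import dataclass


@dataclass
class Item:
    """A painted page item in PDF paint order (hypothetical model)."""
    kind: str            # "text" or "shape"
    x: float
    y: float
    width: float
    height: float
    content: str = ""    # the text string; unused for shapes


def contains(outer, inner):
    """True if outer's bounding box fully encloses inner's bounding box."""
    return (outer.x <= inner.x
            and outer.y <= inner.y
            and outer.x + outer.width >= inner.x + inner.width
            and outer.y + outer.height >= inner.y + inner.height)


def covered_text(items):
    """Return text items fully covered by a shape painted after them.

    Assumes every shape is opaque, which real PDFs do not guarantee.
    """
    hidden = []
    for index, item in enumerate(items):
        if item.kind != "text":
            continue
        later_items = items[index + 1:]
        if any(later.kind == "shape" and contains(later, item)
               for later in later_items):
            hidden.append(item)
    return hidden


# Example page: one text item under a box, one text item left uncovered.
page = [
    Item("text", 72, 700, 80, 12, "Confidential"),
    Item("shape", 70, 695, 120, 18),
    Item("text", 72, 650, 60, 12, "Visible"),
]

if __name__ == "__main__":
    print([item.content for item in covered_text(page)])
```

Text flagged this way is the text that would show through in a layered HTML5 page, so it is the candidate for drawing on the canvas instead of the text layer.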

As is often the case, the quality of the PDF affects what we can do in HTML5.
