Mark Stephens

Mark Stephens has been working with Java and PDF since 1999 and has diversified into HTML5, SVG and JavaFX.

He also enjoys speaking at conferences and has been a Speaker at user groups, Business of Software, Seybold and JavaOne conferences. He has a very dry sense of humor and an MA in Medieval History for which he has not yet found a practical use.

PDF to HTML5 conversion – Hyphen is a special character


A dash or hyphen is a special character in HTML5 and needs to be treated with care. It is not just a character: it also marks a point where the browser may break the line, and that affects the width calculated for the div element. So if you have the div element

<div>party games</div>

a browser will give you the width of a single-line text element 11 characters long.

But

<div>party-games</div>

returns a width of 6 characters by default (just "party-"), because the browser assumes the text can be wrapped after the hyphen. If we are trying to adjust the text to get a best fit, this causes a lot of problems. In the screenshot you can see what can happen.

The HTML page contains a div element with a hyphen

The single div element contains the text “Der 63-jährigeCano, dessenrichtiger” but as far as the width is concerned, the div contents are “Der 63-“. So if we try to adjust the content to fit the space we get a mess with the text wrapped over the next line. Not pretty 🙁

The solution is to break this into 2 divs (with the hyphen at the end of the first value)

and it looks much better!
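
As a rough illustration of the split described above, here is a minimal Java sketch. It is not the converter's actual code: the class name `HyphenSplitter` and the methods `splitAfterHyphens` and `toDivs` are hypothetical names invented for this example. It only shows the idea of ending a div right after each hyphen, and it leaves out positioning, styling and HTML escaping.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of any real converter API.
class HyphenSplitter {

    // Cuts the text after every hyphen, so each piece except possibly
    // the last one ends with '-'. A trailing hyphen does not create an
    // empty final piece.
    static List<String> splitAfterHyphens(String text) {
        List<String> parts = new ArrayList<>();
        int start = 0;
        for (int i = 0; i < text.length(); i++) {
            if (text.charAt(i) == '-' && i + 1 < text.length()) {
                parts.add(text.substring(start, i + 1));
                start = i + 1;
            }
        }
        if (start < text.length()) {
            parts.add(text.substring(start));
        }
        return parts;
    }

    // Wraps each piece in its own div. Assumption: the text needs no
    // HTML escaping and the divs are positioned elsewhere.
    static String toDivs(String text) {
        StringBuilder html = new StringBuilder();
        for (String part : splitAfterHyphens(text)) {
            html.append("<div>").append(part).append("</div>");
        }
        return html.toString();
    }

    public static void main(String[] args) {
        System.out.println(toDivs("party-games"));
    }
}
```

Applied to the text from the screenshot, `toDivs("Der 63-jährigeCano, dessenrichtiger")` would give one div containing "Der 63-" and a second containing "jährigeCano, dessenrichtiger", so neither div has a break point in the middle of its content.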

I think it can be improved still further, but that is for another post…

2 Replies to “PDF to HTML5 conversion – Hyphen is a special…”

  1. Hi,

    I was using jpedal2html 5.13b16 version to convert pdf into html.
    After converting from pdf to html, all data after the hyphen are lost in the converted html.
    I wonder whether this is a known issue in the software version I am using? What is the latest version available? Was it resolved in the later versions?

    Thanks
    Shiva
