Chika Okereke

Chika is a Java developer. When not experimenting with the new features of Java, he is a keen basketball player (he is the tall guy you might see at Devoxx).

PDF to HTML5 conversion – Text rotation


I have been tackling rotated text in our HTML output in several different ways, in an effort to get the best combination of file size, accuracy and speed.

The first approach was to draw the rotated text on the canvas, since a transformation can be applied on the canvas element. This had its advantages and disadvantages. The main advantage was that the rotation looked similar to the PDF file in question.

However, the main drawback was that when the user zoomed in, the rotated text appeared pixelated, and it was un-selectable because it was on the canvas. In some cases this could be overlooked, but being the perfectionists that we are, we could not settle for less!

The other approach we tried was to apply a rotation to each individual div, for example transform: rotate(90deg);. This worked great at first, until we had files where the text looked broken even within the same sentence. It turns out that each div rotates about its own point of origin, which by default is its center, so divs of different sizes rotate differently. This led us to add transform-origin: bottom left;, which forces every piece of text to rotate about its bottom-left corner.
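To see why the pivot matters, here is a minimal Java sketch of the geometry. It assumes the CSS default of transform-origin: 50% 50% and two made-up div sizes; the class and method names are hypothetical and are not taken from our converter.

```java
/** Illustrative only: where a div's bottom-left corner ends up after a CSS rotate(). */
final class PivotDemo {

    /** Rounds away floating point noise such as cos(90deg) = 6.1e-17. */
    static double clean(double v) {
        return Math.round(v * 1e6) / 1e6;
    }

    /**
     * Returns the position {x, y} of the div's bottom-left corner after rotating
     * the div by the given angle. Coordinates are CSS pixels with y growing
     * downwards. The pivot is either the div's own bottom-left corner or its
     * center (the CSS default when no transform-origin is set).
     */
    static double[] bottomLeftAfterRotation(double left, double top, double width,
                                            double height, double degrees,
                                            boolean pivotAtBottomLeft) {
        double cornerX = left;
        double cornerY = top + height;
        double pivotX = pivotAtBottomLeft ? cornerX : left + width / 2;
        double pivotY = pivotAtBottomLeft ? cornerY : top + height / 2;

        double r = Math.toRadians(degrees);
        double dx = cornerX - pivotX;
        double dy = cornerY - pivotY;
        // Same rotation matrix that CSS rotate() uses: [cos -sin; sin cos].
        double x = pivotX + dx * Math.cos(r) - dy * Math.sin(r);
        double y = pivotY + dx * Math.sin(r) + dy * Math.cos(r);
        return new double[] { clean(x), clean(y) };
    }

    public static void main(String[] args) {
        // Two hypothetical divs on the same line of text: a short word and a long one.
        double[][] divs = { { 0, 0, 40, 20 }, { 40, 0, 100, 20 } };
        for (double[] d : divs) {
            double[] centered = bottomLeftAfterRotation(d[0], d[1], d[2], d[3], 90, false);
            double[] anchored = bottomLeftAfterRotation(d[0], d[1], d[2], d[3], 90, true);
            System.out.println("center pivot: " + centered[0] + "," + centered[1]
                    + "  bottom-left pivot: " + anchored[0] + "," + anchored[1]);
        }
    }
}
```

With the center as pivot, the corner of each div moves by an amount that depends on the div's width and height, so neighbouring pieces of text drift apart. With the bottom-left corner as pivot, the corner stays exactly where the div was positioned, whatever its size.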

The advantage was that the text could be treated as text and was selectable once again. However, in certain files the output wasn't as accurate, and the generated files were larger.

Another approach we took was rasterizing each glyph as a shape. This gave by far the best output in comparison to the original PDF. But as the saying goes, “You pays your money and you takes your choice”: the output file is dramatically larger, almost double the size of the file created using the rotated divs. The reason is that each glyph has to be created and then applied in its location. For example, one piece of code creates the glyph C, and a second piece scales it, moves it and gives the text its color.

The advantage is that the text is more or less a perfect representation of the PDF. However, as well as the output file being larger than with the other techniques we have tried, the rasterized text is treated as images as opposed to text, which makes it un-selectable and defeats the objective of it being text.

And finally, the technique that combines the best aspects of all the others we have tried. (You guessed it, by this stage we were in the new office.. :D) We use -webkit-transform:matrix(0.0,-1.0,1.0,0.0,25, 340); to rotate the text and determine where it should be placed. The values in the matrix that would otherwise define the font size are set to one, because the font size gets applied at a later stage. All the alterations are made in the div that applies the font information. The disadvantage is that we need to add the code for each browser 🙁
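As a rough illustration, the matrix values can be derived from a rotation angle and a position. The Java sketch below is an assumption about how such a declaration could be generated, not our production code: the class name, the method names and the list of vendor prefixes are all hypothetical. The first four values are a pure rotation (magnitude one, no font scaling) and the last two are the translation.

```java
/** Illustrative only: builds prefixed CSS matrix() declarations for rotated text. */
final class RotatedTextCss {

    // Assumed list of vendor prefixes; the empty string is the unprefixed property.
    private static final String[] PREFIXES = { "-webkit-", "-moz-", "-ms-", "-o-", "" };

    /** Rounds away floating point noise and avoids printing "-0.0". */
    static double clean(double v) {
        return Math.round(v * 1e6) / 1e6;
    }

    /**
     * Returns matrix(a,b,c,d,tx,ty) for a rotation by the given angle followed by
     * a translation to (tx, ty). For a pure rotation a = cos, b = sin, c = -sin, d = cos.
     */
    static String matrix(double degrees, double tx, double ty) {
        double r = Math.toRadians(degrees);
        double a = clean(Math.cos(r));
        double b = clean(Math.sin(r));
        double c = clean(-Math.sin(r));
        double d = clean(Math.cos(r));
        return "matrix(" + a + "," + b + "," + c + "," + d + ","
                + clean(tx) + "," + clean(ty) + ")";
    }

    /** Repeats the same transform once per prefix, ending with the unprefixed form. */
    static String declarations(double degrees, double tx, double ty) {
        String m = matrix(degrees, tx, ty);
        StringBuilder css = new StringBuilder();
        for (String prefix : PREFIXES) {
            css.append(prefix).append("transform:").append(m).append(";");
        }
        return css.toString();
    }

    public static void main(String[] args) {
        // A rotation of -90 degrees placed at (25, 340), matching the values quoted above.
        System.out.println(declarations(-90, 25, 340));
    }
}
```

Generating the declarations in one place like this keeps the per-browser repetition out of the rest of the code, although every rotated div still carries all of the prefixed copies.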

To make sure that all the text is spaced out correctly and aligned properly, we reused code we already had: transform-origin: bottom left;. In the output we currently generate, the number one advantage is that the text behaves as text, which means it is selectable. As you can imagine, the file size is also reduced, as the changes are added to each rotated div. What do you think about the new changes, or do you have a better solution?




IDRsolutions Ltd 2022. All rights reserved.