One of the things I have missed most in moving PDF content to HTML5 is PDF's clipping functionality.
In a PDF file you can set a clip, which can be any irregular shape. Only content inside the clip is drawn; the rest is not (it is that simple). HTML5 has nothing like this, which means we have to emulate it. Otherwise, content that should be invisible (such as crop marks or hidden lines) starts to appear.
This turned out to be quite a complex task. Eliminating anything which is not in the clipped area was easy. The tricky bit is handling items which intersect with the clip (i.e. drawn so they are only partly visible). Images can be clipped but shapes have to be altered. The hardest items to handle were shapes.
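To make the three cases concrete, here is a minimal sketch of the first step: sorting items into "outside", "inside" and "intersects". It assumes both the clip and the item are reduced to axis-aligned bounding boxes, which is a simplification (a real PDF clip can be any shape). The names Rect and classify are made up for this illustration and are not taken from our converter.

```python
from typing import NamedTuple


class Rect(NamedTuple):
    """Axis-aligned box; (x0, y0) is the lower-left corner, (x1, y1) the upper-right."""
    x0: float
    y0: float
    x1: float
    y1: float


def classify(item: Rect, clip: Rect) -> str:
    """Return 'outside', 'inside' or 'intersects' for an item's bounding box.

    Assumption: a rectangular clip. An irregular clip would need a real
    shape-against-shape test instead of these comparisons.
    """
    # No overlap on at least one axis: nothing of the item can be visible.
    if (item.x1 <= clip.x0 or item.x0 >= clip.x1
            or item.y1 <= clip.y0 or item.y0 >= clip.y1):
        return "outside"
    # Every edge of the item lies within the clip: draw it unchanged.
    if (item.x0 >= clip.x0 and item.x1 <= clip.x1
            and item.y0 >= clip.y0 and item.y1 <= clip.y1):
        return "inside"
    # Anything else is partly visible and has to be clipped or altered.
    return "intersects"
```

Items classified as "outside" can simply be dropped and "inside" items drawn as normal; only the "intersects" items need the extra work described here.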
In the PDF file format a shape can be stroked (the outline is drawn), filled (the interior is coloured in), or both. So you have to work out how the shape interacts with the clip. For example, if the clip is totally inside the shape, a stroked shape can be ignored (its outline lies outside the clip, so none of it is visible), but a filled shape still needs the clipped area filled in. We had to dig out our old Maths notes on trigonometry to calculate the points where the lines appear and disappear!
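As an illustration of finding where a line appears and disappears, here is a rough sketch that clips a single straight segment against a rectangular clip. It uses the standard Liang-Barsky parametric method rather than the trigonometry we actually worked through, and it assumes a rectangular clip and straight lines only (no curves, no irregular clips). The function name clip_segment is hypothetical.

```python
from typing import Optional, Tuple

Point = Tuple[float, float]


def clip_segment(p0: Point, p1: Point,
                 xmin: float, ymin: float,
                 xmax: float, ymax: float) -> Optional[Tuple[Point, Point]]:
    """Return the visible part of segment p0-p1 inside the rectangle, or None.

    Liang-Barsky: treat the segment as p0 + t * (p1 - p0) for t in [0, 1]
    and narrow the range of t against each of the four clip edges.
    """
    x0, y0 = p0
    x1, y1 = p1
    dx, dy = x1 - x0, y1 - y0
    t_enter, t_exit = 0.0, 1.0  # where the line appears / disappears

    # One (p, q) pair per clip edge: left, right, bottom, top.
    for p, q in ((-dx, x0 - xmin), (dx, xmax - x0),
                 (-dy, y0 - ymin), (dy, ymax - y0)):
        if p == 0:
            if q < 0:
                return None  # parallel to this edge and outside it
            continue
        t = q / p
        if p < 0:
            t_enter = max(t_enter, t)  # segment is entering across this edge
        else:
            t_exit = min(t_exit, t)    # segment is leaving across this edge
        if t_enter > t_exit:
            return None  # the visible range is empty

    return ((x0 + t_enter * dx, y0 + t_enter * dy),
            (x0 + t_exit * dx, y0 + t_exit * dy))
```

For example, clip_segment((-50, 50), (150, 50), 0, 0, 100, 100) returns ((0.0, 50.0), (100.0, 50.0)): the line appears at the left edge of the clip and disappears at the right edge. An altered stroked shape could then be rebuilt from the visible pieces of each of its edges.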
I am sure we will find some additional cases which we have not currently covered. So try the latest version and let us know what you think.