We have been getting lots of feedback and bug reports from users and potential users. Please keep them coming – that is how we can improve the conversion! There is no manual on what works best for PDF to HTML5 conversion so it is only by seeing the results and tuning the heuristics that we can improve the process…
One particularly interesting case I saw last week involved a PDF where the first letter of a word was in a different font from the rest. Because we auto-fit the rest of the word (but not the first character), the rest of the word appeared one text size smaller and looked odd. I am currently experimenting with skipping the auto-fit in this case.
Another user suggested that converting PDF files to version 1.5 improved HTML5 conversion (presumably because it simplifies the PostScript data, which gives cleaner HTML5). Here is his suggestion:
Here is the command to convert the PDF to an optimized PDF. The converted PDF works much better with your JPDF2HTML5 Library than the original one. I think you could post a blog entry about it, which could be very helpful for other users. I run this command on every PDF before running JPDF2HTML5, and every PDF that had a problem (see attachment) worked well.
The command is available with Ghostscript on Linux:
gs -sDEVICE=pdfwrite -dCompatibilityLevel=1.5 -dPDFSETTINGS=/screen -dNOPAUSE -dQUIET -dBATCH -sOutputFile=output.pdf input.pdf
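If you want to try this on a batch of files, the command above can be wrapped in a small shell script. This is only a rough sketch, not something we ship: the GS_BIN variable, the function names and the "-optimized" output suffix are our own illustrative choices, and the Ghostscript options are exactly those from the suggestion.

```shell
#!/bin/sh
# Sketch: pre-process PDFs with the Ghostscript command quoted above.
# GS_BIN, the function names and the "-optimized" suffix are assumptions
# made for this example; only the gs options come from the suggestion.

GS_BIN="${GS_BIN:-gs}"

# optimize_pdf INPUT OUTPUT: rewrite INPUT as a PDF 1.5 file at OUTPUT.
optimize_pdf() {
    "$GS_BIN" -sDEVICE=pdfwrite -dCompatibilityLevel=1.5 -dPDFSETTINGS=/screen \
        -dNOPAUSE -dQUIET -dBATCH -sOutputFile="$2" "$1"
}

# optimized_name INPUT: print the output path for INPUT,
# e.g. report.pdf becomes report-optimized.pdf.
optimized_name() {
    printf '%s-optimized.pdf\n' "${1%.pdf}"
}

# Process every PDF named on the command line; originals are left untouched.
for src in "$@"; do
    out=$(optimized_name "$src")
    if optimize_pdf "$src" "$out"; then
        echo "wrote $out"
    else
        echo "failed: $src" >&2
    fi
done
```

Saved as, say, optimize.sh (a made-up name), it could be run as sh optimize.sh *.pdf and the resulting -optimized.pdf files passed to the converter. Bear in mind that /screen is Ghostscript's lowest-quality preset and downsamples images, so it is worth checking that the images still look acceptable.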
Thanks to Thoren for his suggestion. If you have any PDF files that do not convert well to HTML5, please send them to us in a bug report so we can investigate. Or do you have any tips to share?