Mark Stephens Mark has been working with Java and PDF since 1999 and is a big NetBeans fan. He enjoys speaking at conferences. He has an MA in Medieval History and a passion for reading.

Converting Microsoft Office documents to PDF, HTML5 or SVG



As this is a question we get asked a lot at IDRsolutions, I decided to write a blog article on the topic, which may well develop into a series…

Microsoft Office files are an industry standard and lots of people want to convert them into PDF, HTML5 or SVG. One option is to use Microsoft Office itself, but there is an alternative which is cross-platform and free: LibreOffice. It is a fork of the Open Source office suite OpenOffice and has excellent support for Word, PowerPoint and other Office file formats. The two are very similar, with slightly different strengths and weaknesses (and both are free, so try both yourself and choose).

LibreOffice has two very useful features. Firstly, it is cross-platform, so it will run on Linux and OS X boxes and not just Windows. Secondly, it does not need a user to run it: the software can be run in headless mode from the command line or from your own programs. This is really easy to do. So

libreoffice --headless --convert-to pdf myFile.docx

will turn the Word file myFile.docx into a PDF file. We get to see a lot of PDF files and the PDF files created by LibreOffice are generally very good.

LibreOffice has several APIs (including Java), or you can just call it as an external process with this code in Java:

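import java.io.DataOutputStream;
import java.io.IOException;

// Get an instance of a shell; commands are written to its standard input.
// fileName and userInputDirPath are supplied by the surrounding code.
Process pqShell = Runtime.getRuntime().exec("sh");
String shellCommand = "libreoffice --headless --convert-to pdf " + fileName;
// try-with-resources closes the stream, so the shell exits at end of input
try (DataOutputStream dos = new DataOutputStream(pqShell.getOutputStream())) {
    dos.writeBytes("cd " + userInputDirPath + "\n");
    dos.writeBytes(shellCommand + "\n");
} catch (IOException ex) {
    ex.printStackTrace(); // report the failure instead of swallowing it
}
pqShell.waitFor(); // block until the shell (and the conversion) has finished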
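As an alternative to feeding commands to a shell, the same conversion can probably be launched directly with ProcessBuilder, which avoids quoting problems when a file name contains spaces. The following is only a minimal sketch: the class and method names are made up for illustration, and it assumes the libreoffice executable is on the PATH.

```java
import java.io.File;
import java.io.IOException;
import java.util.List;

/** Sketch only: class and method names are illustrative, not a LibreOffice API. */
final class LibreOfficeRunner {

    /** Builds the argument list for the command line shown in the article. */
    static List<String> buildCommand(String fileName) {
        // Each argument is a separate list element, so no shell quoting is needed
        return List.of("libreoffice", "--headless", "--convert-to", "pdf", fileName);
    }

    /** Runs the conversion inside workingDir and returns the process exit code. */
    static int convertToPdf(File workingDir, String fileName)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(buildCommand(fileName))
                .directory(workingDir) // takes the place of the "cd" sent to the shell
                .inheritIO()           // pass LibreOffice's own output to our console
                .start();
        return process.waitFor();      // wait for the conversion to finish
    }
}
```

A call such as LibreOfficeRunner.convertToPdf(new File(userInputDirPath), fileName) would then stand in for the shell-based version; how to react to a non-zero exit code is left to the caller.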

The --convert-to parameter can take other output file types as well (e.g. txt for Office to text, html for Office to HTML). There are lots of additional features which we may document in later articles…
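To illustrate switching the target format, here is a small hedged sketch that builds the command for a chosen format and guesses the name of the file LibreOffice will write. The class and method names are hypothetical. It also uses the --outdir option, which is not covered in this article, and it assumes the output keeps the input's base name with the new extension.

```java
import java.nio.file.Paths;
import java.util.List;

/** Sketch only: class and method names are illustrative, not a LibreOffice API. */
final class ConvertTarget {

    /** Command line for converting fileName to the given format, written into outDir. */
    static List<String> buildCommand(String format, String outDir, String fileName) {
        return List.of("libreoffice", "--headless",
                "--convert-to", format,
                "--outdir", outDir,
                fileName);
    }

    /** Assumed output name: the input's base name with its last extension swapped. */
    static String expectedOutputName(String fileName, String format) {
        String name = Paths.get(fileName).getFileName().toString();
        int dot = name.lastIndexOf('.');
        // A leading dot (index 0) is treated as part of the name, not an extension
        String stem = dot > 0 ? name.substring(0, dot) : name;
        return stem + "." + format;
    }
}
```

For example, ConvertTarget.buildCommand("html", "out", "myFile.docx") yields the arguments for an Office to HTML conversion, and the list can be handed to a ProcessBuilder.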

The HTML output is quite simple, so we have been linking the PDF files created via LibreOffice to our PDF to HTML5 converter and testing for several months now. We (and our test customers) have been very pleased with the results and we know of lots of companies using LibreOffice internally for file conversion.

So we have added LibreOffice to our free online converter, which now allows people to convert not just PDF files but also Office documents (Word, Excel and PowerPoint) to HTML5.

We recommend this additional functionality to our commercial clients who want to process a wider range of documents with our PDF to HTML5 converter.

We are very impressed with the possibilities of LibreOffice as part of a two-stage conversion process to turn Office documents into HTML5 via PDF. I am less enthusiastic about direct Office to HTML conversion. I hope that if you are doing anything with Office documents on a server or desktop, you will have a look and experiment with it as part of your solution.

IDRsolutions develop a Java PDF Viewer and SDK, an Adobe forms to HTML5 forms converter, a PDF to HTML5 converter and a Java ImageIO replacement. On the blog our team post anything interesting they learn about.



IDRsolutions Ltd 2019. All rights reserved.