Mark Stephens Mark has been working with Java and PDF since 1999 and is a big NetBeans fan. He enjoys speaking at conferences. He has an MA in Medieval History and a passion for reading.

Converting Microsoft Office documents to PDF, HTML5 or SVG


Office to PDF, HTML5, and SVG

As this is a question we get asked a lot at IDRsolutions, I decided to write a blog article on the topic, which may well develop into a series…

Microsoft Office files are an industry standard, and lots of people want to convert them into PDF, HTML5 or SVG. One option is to use Microsoft Office itself, but there is an alternative which is cross-platform and free: LibreOffice. It is a fork of the Open Source office suite OpenOffice, and it has excellent support for Word, PowerPoint and other Office file formats. The two are very similar, with slightly different strengths and weaknesses (and both are free, so try both yourself and choose).

LibreOffice has TWO very useful features. Firstly, it is cross-platform, so it will run on Linux and OS X boxes and not just Windows. Secondly, it does not need a user to run it: the software can run headless from the command line or be called from your own programs. This is really easy to do. So

libreoffice --headless --convert-to pdf myFile.docx

will turn the Word file myFile.docx into a PDF file. We get to see a lot of PDF files and the PDF files created by LibreOffice are generally very good.

LibreOffice has several APIs (including Java) or you can just call it as an external process with this code in Java.

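// Get an instance of the shell
            Process pqShell = Runtime.getRuntime().exec("sh");
            String shellCommand = "libreoffice --headless --convert-to pdf " + fileName;
            try {
                java.io.DataOutputStream dos = new java.io.DataOutputStream(pqShell.getOutputStream());
                // Change to the input directory, run the conversion there, then leave the shell
                dos.writeBytes("cd " + userInputDirPath + "\n");
                dos.writeBytes(shellCommand + "\n");
                dos.writeBytes("exit\n");
                dos.flush();
                // Block until the shell (and so the conversion) has finished
                pqShell.waitFor();
            } catch (Exception ex) {
                // Report the failure instead of silently swallowing it
                ex.printStackTrace();
            } finally {
                pqShell.destroy();
            }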
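
As an alternative to piping commands into a shell, the conversion can also be launched directly with ProcessBuilder. The following is only a minimal sketch, not the code we use in production: the class name OfficeToPdf, the timeout parameter and the use of the --outdir option to choose the output folder are our own choices for illustration, and it assumes Java 9 or later with the libreoffice executable on the PATH (some installs call it soffice).

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.List;
import java.util.concurrent.TimeUnit;

// Hypothetical helper class for illustration only
final class OfficeToPdf {

    // Builds the argument list. Each argument is passed to the process as-is,
    // so file names containing spaces need no shell quoting.
    static List<String> buildCommand(Path input, Path outDir) {
        return List.of(
                "libreoffice",          // assumption: on the PATH under this name
                "--headless",
                "--convert-to", "pdf",
                "--outdir", outDir.toString(),
                input.toString());
    }

    // Runs the conversion; returns true if LibreOffice exited with code 0
    // before the timeout expired.
    static boolean convert(Path input, Path outDir, long timeoutSeconds)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(buildCommand(input, outDir))
                .inheritIO()            // pass LibreOffice's messages through to our console
                .start();
        if (!process.waitFor(timeoutSeconds, TimeUnit.SECONDS)) {
            process.destroyForcibly();  // give up on a conversion that hangs
            return false;
        }
        return process.exitValue() == 0;
    }
}
```

A call such as OfficeToPdf.convert(Paths.get("myFile.docx"), Paths.get("out"), 120) would then write the PDF into the out folder. Passing the arguments as a list avoids building a command string by hand, which matters when file names come from users.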

The --convert-to parameter takes the target file type as its value (e.g. txt for Office to Text, html for Office to HTML, etc.). There are lots of additional features which we may document in later articles…
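
When calling LibreOffice from a program, it helps to know where the converted file will end up, because the output keeps the input's base name and takes the target type as its new extension. Here is a small sketch of that naming rule in Java. The class and method names are hypothetical, and it assumes the format is given as a plain extension such as pdf, txt or html.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper class for illustration only
final class ConvertTarget {

    // Predicts the output file for a conversion, e.g. docs/myFile.docx
    // converted to txt in folder out gives out/myFile.txt.
    static Path expectedOutput(Path input, Path outDir, String format) {
        String name = input.getFileName().toString();
        int dot = name.lastIndexOf('.');
        // Keep the whole name if there is no extension to strip
        String base = (dot > 0) ? name.substring(0, dot) : name;
        return outDir.resolve(base + "." + format);
    }

    public static void main(String[] args) {
        Path input = Paths.get("docs", "myFile.docx");
        Path outDir = Paths.get("out");
        for (String format : new String[] {"pdf", "txt", "html"}) {
            System.out.println(format + " -> " + expectedOutput(input, outDir, format));
        }
    }
}
```

Checking that this file exists after the process finishes is a simple way to confirm the conversion actually produced something.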

The HTML output is quite simple, so we have been linking the PDF files created via LibreOffice to our PDF to HTML5 converter and testing for several months now. We (and our test customers) have been very pleased with the results and we know of lots of companies using LibreOffice internally for file conversion.

So we have added LibreOffice to our free online converter, which now allows people to convert not just PDF files but also Office documents to HTML5: Word documents to HTML5, Excel documents to HTML5 and PowerPoint to HTML5.

We recommend this additional functionality to our commercial clients who want to process a wider range of documents with our PDF to HTML5 converter.

We are very impressed with the possibilities of LibreOffice as part of a two-stage conversion process to turn Office documents into HTML5 via PDF. I was less enthusiastic about direct Office to HTML conversion. I hope that if you are doing anything with Office documents on server or desktop, you have a look and experiment with it as part of your solution.
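
The two-stage idea can be sketched in a few lines of Java. This is only an illustration of how the stages chain together: both stages are passed in as plain functions, the class name is hypothetical, and the second stage is a stand-in for whichever PDF to HTML5 converter you use, not the API of our converter.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.function.UnaryOperator;

// Hypothetical class for illustration only
final class TwoStageConversion {

    // Runs the Office file through the first stage, then feeds the PDF
    // it returns into the second stage, and returns the final output.
    static Path run(Path officeFile,
                    UnaryOperator<Path> officeToPdf,
                    UnaryOperator<Path> pdfToHtml) {
        Path pdf = officeToPdf.apply(officeFile);
        return pdfToHtml.apply(pdf);
    }

    public static void main(String[] args) {
        // Stub stages that only rename the file, to show the flow of data
        UnaryOperator<Path> toPdf = p -> Paths.get(p.toString().replace(".pptx", ".pdf"));
        UnaryOperator<Path> toHtml = p -> Paths.get(p.toString().replace(".pdf", ".html"));
        System.out.println(run(Paths.get("deck.pptx"), toPdf, toHtml));
    }
}
```

In a real setup the first stage would call LibreOffice and return the PDF it wrote, so the second stage never needs to know that the document started life as an Office file.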



IDRsolutions Ltd 2020. All rights reserved.