In this article, I will show you how to convert PDF files to HTML in Java using our library, BuildVu. Converting PDF to HTML optimises your PDF content for display in browsers. We have a separate article explaining the benefits of converting PDF to HTML.
How to convert PDF to HTML using Java
- Download the BuildVu trial jar
- Add the BuildVu jar to your project libraries
- Choose conversion options
- Choose viewer options
- Set PDF file path and output directory
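import java.io.File;
// BuildVu classes; package names as listed in the BuildVu javadoc, check them against your version
import org.jpedal.examples.html.PDFtoHTML5Converter;
import org.jpedal.exception.PdfException;
import org.jpedal.render.output.IDRViewerOptions;
import org.jpedal.render.output.html.HTMLConversionOptions;

HTMLConversionOptions conversionOptions = new HTMLConversionOptions();
// Set conversion options here e.g. conversionOptions.setCompressImages(true);
IDRViewerOptions viewerOptions = new IDRViewerOptions();
// Set viewer options here e.g. viewerOptions.setViewerUI(IDRViewerOptions.ViewerUI.Clean);
File pdfFile = new File("C:/MyDocument.pdf");
File outputDir = new File("C:/MyOutputDirectory/");
PDFtoHTML5Converter converter = new PDFtoHTML5Converter(pdfFile, outputDir, conversionOptions, viewerOptions);
try {
    // Runs the conversion and writes the result into outputDir
    converter.convert();
} catch (PdfException e) {
    e.printStackTrace();
}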
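Before handing the paths to the converter, it can help to check that the PDF is readable and that the output directory exists. The following is a minimal sketch using only the Java standard library. The class `ConversionPaths` and its `prepare` method are hypothetical helpers for illustration and are not part of BuildVu.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

// Hypothetical helper, not part of BuildVu
public class ConversionPaths {

    /**
     * Checks that pdfFile is a readable file and makes sure outputDir exists,
     * creating it (and any missing parent directories) if necessary.
     */
    public static void prepare(File pdfFile, File outputDir) throws IOException {
        if (!pdfFile.isFile() || !pdfFile.canRead()) {
            throw new IOException("Cannot read PDF file: " + pdfFile.getAbsolutePath());
        }
        // Does nothing if the directory already exists;
        // throws if the path exists but is not a directory
        Files.createDirectories(outputDir.toPath());
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.out.println("Usage: ConversionPaths <pdfFile> <outputDir>");
            return;
        }
        File pdfFile = new File(args[0]);
        File outputDir = new File(args[1]);
        prepare(pdfFile, outputDir);
        System.out.println("Ready to convert " + pdfFile + " into " + outputDir);
    }
}
```

Calling `ConversionPaths.prepare(pdfFile, outputDir)` just before constructing the converter would surface a wrong path as a clear `IOException` instead of a failure partway through the conversion.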
How to convert PDF to HTML from the command line
You can run BuildVu directly from the command line, which is useful when you want to call the converter from another language or from a script.
- Download the BuildVu trial jar
- Set the input directory and output directory
- Choose conversion options
- Increase the -Xmx value (the maximum Java heap size) if your documents need more memory
java -Xmx512M -jar buildvu-html.jar /inputDirectory/ /outputDirectory/
The default mode generates the document inside the IDRViewer. To generate just the raw content to be used inside your own custom solution, you can use:
java -Dorg.jpedal.pdf2html.viewMode=content -jar buildvu-html.jar /inputDirectory/ /outputDirectory/
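If you launch the converter from a script or another JVM, you can assemble the same command programmatically. The sketch below builds the argument list from the pieces shown above: the `-Xmx` heap setting, the optional `org.jpedal.pdf2html.viewMode` property, the jar and the two directories. The class `BuildVuCommand` and its `build` method are hypothetical names used for illustration and are not part of BuildVu. The jar path and directories are placeholders.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of BuildVu
public class BuildVuCommand {

    /**
     * Builds the argument list for running the converter in a separate JVM.
     * Pass null as viewMode to keep the default output inside the IDRViewer.
     */
    public static List<String> build(String jarPath, int maxHeapMb, String viewMode,
                                     String inputDir, String outputDir) {
        List<String> command = new ArrayList<>();
        command.add("java");
        command.add("-Xmx" + maxHeapMb + "M");
        if (viewMode != null) {
            // System properties must come before -jar to reach the JVM
            command.add("-Dorg.jpedal.pdf2html.viewMode=" + viewMode);
        }
        command.add("-jar");
        command.add(jarPath);
        command.add(inputDir);
        command.add(outputDir);
        return command;
    }

    public static void main(String[] args) {
        List<String> command = build("buildvu-html.jar", 512, "content",
                "/inputDirectory/", "/outputDirectory/");
        // Print the command instead of running it, so the sketch works without the jar
        System.out.println(String.join(" ", command));
        // To run it for real, pass the list to a ProcessBuilder:
        // new ProcessBuilder(command).inheritIO().start().waitFor();
    }
}
```

Passing the arguments as a list rather than a single string avoids quoting problems when a directory name contains spaces.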
See the BuildVu documentation to learn more about converting PDF to HTML using Java. If you want to convert PDF to SVG instead, we cover that in a separate article.
BuildVu allows you to:
- View PDF files in a web app
- Convert PDF documents to HTML5
- Parse PDF documents as HTML