Converting PDF to HTML makes your PDF content easier to display in a browser. BuildVu is one of the leading PDF to HTML solutions for developers. In this article I will explain how to use our Java API to convert PDF files into HTML with BuildVu.
Why convert PDF to HTML?
Converting PDF to HTML makes your documents easier to display on web and mobile devices, providing a smoother user experience. HTML content is also fully searchable and indexable by search engines, helping boost SEO.
We also have a dedicated article discussing the advantages of converting PDF to HTML.
Convert PDF to HTML using Java
- Download the BuildVu trial jar
- Add the BuildVu jar to your project libraries
- Choose conversion options
- Choose viewer options
- Set PDF file path and output directory
HTMLConversionOptions conversionOptions = new HTMLConversionOptions();
// Set conversion options here e.g. conversionOptions.setCompressImages(true);
IDRViewerOptions viewerOptions = new IDRViewerOptions();
// Set viewer options here e.g. viewerOptions.setViewerUI(IDRViewerOptions.ViewerUI.Clean);
File pdfFile = new File("C:/MyDocument.pdf");
File outputDir = new File("C:/MyOutputDirectory/");
PDFtoHTML5Converter converter = new PDFtoHTML5Converter(pdfFile, outputDir, conversionOptions, viewerOptions);
try {
    converter.convert();
} catch (PdfException e) {
    e.printStackTrace();
}
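Before handing the paths to the converter, it can help to fail early on a missing PDF or an unusable output directory. The following is a minimal sketch using only the Java standard library; `ConversionPreflight` and its methods are hypothetical names introduced here for illustration and are not part of BuildVu.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.util.Locale;

// Hypothetical helper (not part of BuildVu): checks the input PDF and
// output directory before a conversion is attempted.
final class ConversionPreflight {

    private ConversionPreflight() {
    }

    /** Throws IOException if the file is missing, unreadable, or not named *.pdf. */
    static void requireReadablePdf(File pdfFile) throws IOException {
        if (!pdfFile.isFile() || !pdfFile.canRead()) {
            throw new IOException("PDF not found or not readable: " + pdfFile);
        }
        // Case-insensitive extension check; this does not inspect the file contents.
        if (!pdfFile.getName().toLowerCase(Locale.ROOT).endsWith(".pdf")) {
            throw new IOException("Not a .pdf file: " + pdfFile);
        }
    }

    /** Creates the directory (and any missing parents); throws if the path is an existing non-directory. */
    static void ensureOutputDir(File outputDir) throws IOException {
        Files.createDirectories(outputDir.toPath());
    }
}
```

Calling `ConversionPreflight.requireReadablePdf(pdfFile)` and `ConversionPreflight.ensureOutputDir(outputDir)` just before constructing the converter would surface path mistakes as a clear `IOException` rather than a failure part-way through conversion.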
Office Docs to HTML?
BuildVu’s main function is to convert PDFs to HTML5, but it can also process Office documents if you first use LibreOffice to convert them to PDF. Once LibreOffice is installed, the DocumentToPDFConverter class automates this step for you.
if (DocumentToPDFConverter.hasConvertibleFileType(document)) {
    try {
        DocumentToPDFConverter.convert(document, libreOfficeExecutablePath);
        // Code to convert generated PDF file to HTML5 goes here
    } catch (IOException e) {
        // Problem occurred – see Javadoc for reasons
    } catch (InterruptedException e) {
        // Process was interrupted
    }
} else {
    // File type not supported
}
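For context, this is roughly what the Office-to-PDF step looks like when LibreOffice is driven by hand from Java. It is only a sketch of the manual equivalent, using LibreOffice's `--headless --convert-to pdf --outdir` command-line options; it makes no claim about how `DocumentToPDFConverter` works internally, and `LibreOfficeCommandSketch`, `buildCommand`, and `expectedPdf` are hypothetical names introduced for illustration.

```java
import java.io.File;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch (not BuildVu code): builds a headless LibreOffice
// command line that converts one document to PDF.
final class LibreOfficeCommandSketch {

    private LibreOfficeCommandSketch() {
    }

    /** Returns the argument list for a headless LibreOffice PDF conversion. */
    static List<String> buildCommand(String sofficePath, File document, File outDir) {
        return Arrays.asList(
                sofficePath,
                "--headless",
                "--convert-to", "pdf",
                "--outdir", outDir.getPath(),
                document.getPath());
    }

    /** LibreOffice writes <base name>.pdf into outDir; this derives that path. */
    static File expectedPdf(File document, File outDir) {
        String name = document.getName();
        int dot = name.lastIndexOf('.');
        // Strip only the final extension; names without one are used as-is.
        String base = dot > 0 ? name.substring(0, dot) : name;
        return new File(outDir, base + ".pdf");
    }
}
```

The list returned by `buildCommand` could be passed to `new ProcessBuilder(command).inheritIO().start().waitFor()`, which throws the same `IOException` and `InterruptedException` types handled in the snippet above. Where `DocumentToPDFConverter` itself places the generated PDF is not stated here, so check the Javadoc rather than relying on `expectedPdf` for that case.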
With over 20 years of experience working with PDFs, we have many other blog posts that can help you understand the PDF file format.