Converting a PDF to HTML in PHP lets you display PDF content directly in the browser, with no plugins, iframes, or viewer dependencies. It’s the right approach for invoice rendering, report display, document portals, and any workflow where you need PDF content to behave like a proper web page.
This tutorial covers three approaches, open source first:
- pdf2htmlEX (free, open source, best fidelity of the free options)
- pdftohtml / poppler-utils (free, open source, lightweight, good for simple documents)
- BuildVu (commercial, production-grade, cloud or self-hosted)
If cost is the hard constraint, start with pdf2htmlEX: it produces significantly more accurate output than Poppler for most documents. If you need font accuracy, form support, and SVG output at scale in a production environment, BuildVu is the correct tool.
| Feature | pdf2htmlEX | Poppler (pdftohtml) | BuildVu |
|---|---|---|---|
| Visual Fidelity | High (Preserves fonts/layouts) | Basic (Often loses formatting) | Highest (Production-grade) |
| Cost | Free (Open Source) | Free (Open Source) | Commercial |
| Interactive Support | No JS or Forms | No JS or Forms | Full JS and Form support |
| Setup | Moderate (Requires binary/Docker) | Easy (Standard Linux utility) | Easy (Cloud API or Composer) |
| Primary Use Case | Accurate web-viewing (Free) | Simple text extraction | Enterprise/Customer-facing portals |
Method 1: Convert PDF to HTML in PHP Using pdf2htmlEX
pdf2htmlEX converts PDF files to HTML while preserving text, fonts, formatting, and vector graphics with a fidelity that poppler-based tools can’t match. It outputs a self-contained HTML file with embedded CSS and JavaScript that replicates the original PDF layout in the browser.
Install pdf2htmlEX
# Debian/Ubuntu, via apt (may be an older version)
sudo apt-get install pdf2htmlex
# macOS
brew install pdf2htmlex
# Docker (recommended, avoids dependency conflicts)
docker pull pdf2htmlex/pdf2htmlex
To verify the install:
pdf2htmlEX --version
Call pdf2htmlEX from PHP
<?php
$inputFile = escapeshellarg('/path/to/your-document.pdf');
$outputDir = '/path/to/output/';
// Ensure the output directory exists
if (!is_dir($outputDir)) {
    mkdir($outputDir, 0755, true);
}
// --dest-dir sets the output directory; the last argument is the file name within it
$destDir = escapeshellarg($outputDir);
$command = "pdf2htmlEX --zoom 1.3 --dest-dir {$destDir} {$inputFile} output.html 2>&1";
$result = shell_exec($command);
// pdf2htmlEX may print progress messages even when the conversion succeeds
if ($result !== null && $result !== '') {
    echo 'pdf2htmlEX output: ' . $result;
} else {
    echo 'Conversion complete.';
}
The --zoom 1.3 flag corrects for the 96/72 dpi scaling difference between PDF units and browser pixels. Without it, the rendered page will appear approximately 25% smaller than the original PDF.
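The arithmetic behind that factor can be checked with a short sketch. It assumes, as the paragraph above states, that the mismatch comes from PDF units at 72 per inch against CSS pixels at 96 per inch. The constant and variable names are illustrative and not part of pdf2htmlEX.

```php
<?php
// PDF user space uses 72 units per inch; CSS defines 96 pixels per inch.
const PDF_UNITS_PER_INCH = 72;
const CSS_PIXELS_PER_INCH = 96;

// Exact ratio between the two unit systems; 1.3 is a rounded form of it.
$exactZoom = CSS_PIXELS_PER_INCH / PDF_UNITS_PER_INCH;

// Relative size of an unzoomed page compared with the original.
$unzoomedScale = PDF_UNITS_PER_INCH / CSS_PIXELS_PER_INCH;

printf("Exact zoom: %.3f\n", $exactZoom);                                // Exact zoom: 1.333
printf("Unzoomed page is %.0f%% smaller\n", (1 - $unzoomedScale) * 100); // 25% smaller
```

If the last few percent of scale matter for your layout, passing the exact ratio (for example `--zoom 1.3333`) instead of the rounded 1.3 is worth trying.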
Using pdf2htmlEX via Docker from PHP
If your server environment makes binary installation difficult, the Docker approach is cleaner:
<?php
$inputFile = '/host/path/to/your-document.pdf';
$outputDir = '/host/path/to/output/';
$inputArg = escapeshellarg(basename($inputFile));
$inputDir = escapeshellarg(dirname($inputFile));
$outputArg = escapeshellarg($outputDir);
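// -w /pdf makes the mounted input directory the working directory;
// --dest-dir /output writes the HTML into the mounted output directory
$command = "docker run --rm -v {$inputDir}:/pdf -v {$outputArg}:/output -w /pdf pdf2htmlex/pdf2htmlex --zoom 1.3 --dest-dir /output {$inputArg} 2>&1";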
$result = shell_exec($command);
echo $result ?? 'Conversion complete.';
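Both invocations above use shell_exec(), which returns only the command's output, so a failed conversion is hard to tell apart from a successful one that printed progress messages. A minimal sketch of an alternative is PHP's exec(), which also reports the exit status. It assumes pdf2htmlEX signals failure with a non-zero exit code, the usual command-line convention. The runCommand() helper is hypothetical and not part of pdf2htmlEX or PHP.

```php
<?php
/**
 * Run a shell command and return its exit status together with its output.
 * runCommand() is an illustrative helper, not part of pdf2htmlEX or PHP.
 */
function runCommand(string $command): array
{
    $outputLines = [];
    $exitCode = 0;
    // exec() appends each output line to $outputLines and stores the status in $exitCode.
    exec($command . ' 2>&1', $outputLines, $exitCode);

    return [
        'ok' => $exitCode === 0,
        'exitCode' => $exitCode,
        'output' => implode("\n", $outputLines),
    ];
}

$input = escapeshellarg('/path/to/your-document.pdf');
$dest = escapeshellarg('/path/to/output');
$result = runCommand("pdf2htmlEX --zoom 1.3 --dest-dir {$dest} {$input}");

if ($result['ok']) {
    echo 'Conversion complete.';
} else {
    // Assumption: a non-zero exit code means the conversion failed.
    echo "pdf2htmlEX failed with exit code {$result['exitCode']}:\n{$result['output']}";
}
```

The same helper works for the Docker command, since docker run normally passes the container's exit status through.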
Method 2: Convert PDF to HTML in PHP Using pdftohtml / poppler-utils
pdftohtml from poppler-utils is the simpler, lighter-weight free option. It has been around longer than pdf2htmlEX and is available in virtually every Linux package repository, which makes it easy to install on shared hosting or minimal VPS environments where Docker is not an option.
Install poppler-utils
# Debian/Ubuntu
sudo apt-get install poppler-utils
# macOS
brew install poppler
Call pdftohtml from PHP
<?php
$inputFile = escapeshellarg('/path/to/your-document.pdf');
$outputDir = '/path/to/output/';
$outputFile = escapeshellarg($outputDir . 'output');
$command = "pdftohtml -c -noframes {$inputFile} {$outputFile} 2>&1";
$result = shell_exec($command);
echo $result ?? 'Conversion complete.';
The -c flag preserves the original layout using absolute CSS positioning. The -noframes flag outputs a single HTML file rather than a frameset (the default frameset output is not useful for modern applications).
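To use the result from PHP you also need to know where the HTML file ends up. The sketch below builds the same command and derives the expected output path. It assumes that with -noframes pdftohtml appends ".html" to the output base name. buildPdftohtmlCommand() is a hypothetical helper and not part of poppler-utils.

```php
<?php
/**
 * Build a pdftohtml command and the path of the HTML file it should produce.
 * buildPdftohtmlCommand() is an illustrative helper, not part of poppler-utils.
 */
function buildPdftohtmlCommand(string $inputFile, string $outputDir, string $baseName): array
{
    // Normalise the directory so exactly one slash separates it from the base name.
    $outputBase = rtrim($outputDir, '/') . '/' . $baseName;

    $command = sprintf(
        'pdftohtml -c -noframes %s %s 2>&1',
        escapeshellarg($inputFile),
        escapeshellarg($outputBase)
    );

    // Assumption: with -noframes, pdftohtml writes "<base name>.html".
    return ['command' => $command, 'htmlFile' => $outputBase . '.html'];
}

$job = buildPdftohtmlCommand('/path/to/your-document.pdf', '/path/to/output/', 'output');
echo $job['command'], "\n";
echo $job['htmlFile'], "\n"; // /path/to/output/output.html
```

After running `$job['command']`, checking `is_file($job['htmlFile'])` gives a simple confirmation that the conversion produced a file before you serve it.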
Method 3: Convert PDF to HTML in PHP Using BuildVu
BuildVu is a commercial SDK purpose-built for PDF-to-HTML conversion. It runs as a microservice, either on the IDRsolutions shared cloud or self-hosted on your own infrastructure.
Although the services can be accessed with standard HTTP requests, this tutorial uses our open-source PHP IDRCloudClient, which offers a straightforward PHP wrapper for the REST API.
Prerequisites
To install the idrsolutions-php-client package using Composer, execute the following command:
composer require idrsolutions/idrsolutions-php-client
Code Examples
Here is a basic code example demonstrating how to generate HTML from PDF. Configuration options and advanced features are detailed below:
// $endpoint must already hold the URL of your BuildVu service (cloud or self-hosted)
$parameters = array(
    'input' => IDRCloudClient::INPUT_UPLOAD,
    'file' => __DIR__ . '/path/to/file.pdf'
);
$results = IDRCloudClient::convert(array(
    'endpoint' => $endpoint,
    'parameters' => $parameters
));
IDRCloudClient::downloadOutput($results, __DIR__ . '/');
echo $results['downloadUrl'];
Return result to a callback URL
The BuildVu Microservice supports a callback URL to send the status upon conversion completion, eliminating the need to constantly poll the service. You can provide the callback URL to the parameters array as demonstrated below:
$parameters = array(
    //'token' => 'Token', // Required only when connecting to the IDRsolutions trial and cloud subscription service
    'input' => IDRCloudClient::INPUT_UPLOAD,
    'callbackUrl' => 'http://listener.url',
    'file' => __DIR__ . '/path/to/file.pdf'
);
Configuration Options
The BuildVu API allows for conversion customization using a stringified JSON object with key-value pair configuration options. Add these settings to the parameters array. A comprehensive list of options for converting PDF files to HTML or SVG is available in the BuildVu documentation.
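Because the settings entry shown next is a JSON string, one way to avoid hand-quoting mistakes is to build it from a PHP array with json_encode(). This is a sketch only: "key1" and "key2" are placeholders, not real BuildVu option names.

```php
<?php
// Placeholder option names: replace "key1"/"key2" with real BuildVu options.
$settings = array(
    'key1' => 'value1',
    'key2' => 'value2',
);

$parameters = array(
    // ...'input', 'file' and other entries as in the examples in this section...
    // json_encode() turns the array into a stringified JSON object.
    'settings' => json_encode($settings),
);

echo $parameters['settings']; // prints {"key1":"value1","key2":"value2"}
```

Building the string this way also handles escaping of quotes or slashes inside option values.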
'settings' => '{"key":"value","key":"value"}'
Upload by URL
In addition to uploading a local file, you can provide a URL for the BuildVu Microservice to download and convert. Simply replace the input and file values in the parameters array with the following.
'input' => IDRCloudClient::INPUT_DOWNLOAD,
'url' => 'http://exampleURL/exampleFile.pdf'
Using Authentication
If you require authentication for your BuildVu Microservice, provide username and password when converting and downloading HTML from PDF. Add two variables named username and password to the parameters array, as shown below.
'username' => 'Username_If_Required',
'password' => 'Password_If_Required',
In such cases, you'll also need to provide the authentication values to the downloadOutput method.
IDRCloudClient::downloadOutput($results, __DIR__ . '/', 'newFileName', 'username', 'password');
Which solution is right for you?
The free route is viable for batch-converting simple internal documents where visual accuracy is not critical. For customer-facing document portals, report viewers or any PDF with complex formatting, BuildVu is the correct tool.
BuildVu allows you to:
- View PDF files in a web app
- Convert PDF documents to HTML5
- Parse PDF documents as HTML
What is BuildVu?
BuildVu is a commercial SDK for converting PDF files into standalone HTML or SVG.
Why use BuildVu?
BuildVu allows you to integrate PDF into your HTML workflow effortlessly and securely by producing clean HTML that is easy for developers to work with.
What licenses are available?
We have three licenses available:
- Cloud, for conversion using the shared IDRsolutions cloud server
- Self-hosted server, for your own cloud or on-premise servers
- Enterprise, for more demanding requirements
How to use BuildVu?
Want to learn more about BuildVu and how to use it? We have plenty of tutorials and guides to help you.