Leon Atherton Leon has over 15 years’ Java experience and leads BuildVu, also contributing to cloud services and internal tooling. Wearing many hats across technical and growth roles as a core member, he enjoys motorsport, strategy games, and software side projects.

Easily Convert PDF to HTML in PHP (Tutorial)



Converting a PDF to HTML in PHP lets you display PDF content directly in the browser, with no plugins, no iframes, and no viewer dependencies. It’s the right approach for invoice rendering, report display, document portals, and any workflow where you need PDF content to behave like a proper web page.

This tutorial covers three approaches, open source first:

  • pdf2htmlEX (free, open source, best fidelity of the free options)
  • pdftohtml / poppler-utils (free, open source, lightweight, good for simple documents)
  • BuildVu (commercial, production-grade, cloud or self-hosted)

If cost is the hard constraint, start with pdf2htmlEX: it produces significantly more accurate output than Poppler for most documents. If you need font accuracy, form support, and SVG output at scale in a production environment, BuildVu is the correct tool.

| Feature | pdf2htmlEX | Poppler (pdftohtml) | BuildVu |
|---|---|---|---|
| Visual Fidelity | High (Preserves fonts/layouts) | Basic (Often loses formatting) | Highest (Production-grade) |
| Cost | Free (Open Source) | Free (Open Source) | Commercial |
| Interactive Support | No JS or Forms | No JS or Forms | Full JS and Form support |
| Setup | Moderate (Requires binary/Docker) | Easy (Standard Linux utility) | Easy (Cloud API or Composer) |
| Primary Use Case | Accurate web-viewing (Free) | Simple text extraction | Enterprise/Customer-facing portals |

Method 1: Convert PDF to HTML in PHP Using pdf2htmlEX

pdf2htmlEX converts PDF files to HTML while preserving text, fonts, formatting, and vector graphics with a fidelity that poppler-based tools can’t match. It outputs a self-contained HTML file with embedded CSS and JavaScript that replicates the original PDF layout in the browser.

Install pdf2htmlEX

# Debian/Ubuntu, via apt (may be an older version, if your release packages it at all)
sudo apt-get install pdf2htmlex
# macOS
brew install pdf2htmlex
# Docker (recommended: avoids dependency conflicts)
docker pull pdf2htmlex/pdf2htmlex

To verify the install:

pdf2htmlEX --version

 

Call pdf2htmlEX from PHP



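<?php

$inputFile  = '/path/to/your-document.pdf';
$outputDir  = '/path/to/output/';
$outputName = 'output.html';

// Ensure the output directory exists
if (!is_dir($outputDir) && !mkdir($outputDir, 0755, true)) {
    exit("Could not create output directory: {$outputDir}" . PHP_EOL);
}

// --dest-dir sets the target directory; the last argument is the file name inside it
$command = sprintf(
    'pdf2htmlEX --zoom 1.3 --dest-dir %s %s %s 2>&1',
    escapeshellarg($outputDir),
    escapeshellarg($inputFile),
    escapeshellarg($outputName)
);

// pdf2htmlEX prints progress messages even on success, so check the exit code
exec($command, $outputLines, $exitCode);

if ($exitCode !== 0) {
    echo 'pdf2htmlEX failed: ' . implode(PHP_EOL, $outputLines);
} else {
    echo 'Conversion complete.';
}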


The --zoom 1.3 flag corrects for the scaling difference between PDF points (72 per inch) and browser pixels (96 per inch); the exact ratio is 96/72, or about 1.33. Without it, the rendered page will appear approximately 25% smaller than the original PDF.
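The arithmetic behind that figure can be checked with a few lines of PHP. This is a sketch only: the function names and the US Letter example are illustrative, and it assumes, as the paragraph above describes, that one PDF point is drawn as one pixel when no zoom is applied.

```php
<?php

// PDF user space is measured in points (72 per inch); CSS pixels are 96 per inch.
const PDF_POINTS_PER_INCH = 72;
const CSS_PIXELS_PER_INCH = 96;

/** Exact zoom factor that maps one PDF point to its true size in CSS pixels. */
function exactZoom(): float
{
    return CSS_PIXELS_PER_INCH / PDF_POINTS_PER_INCH;
}

/** Width in CSS pixels of a page that is $widthPt points wide, at a given zoom. */
function renderedWidthPx(float $widthPt, float $zoom): float
{
    return $widthPt * $zoom;
}

$letterWidthPt = 612.0; // US Letter: 8.5 in x 72 pt per inch

printf("Exact zoom: %.4f\n", exactZoom());
printf("Letter width at zoom 1.0: %.1f px\n", renderedWidthPx($letterWidthPt, 1.0));
printf("Letter width at exact zoom: %.1f px\n", renderedWidthPx($letterWidthPt, exactZoom()));
printf("Shrink without zoom: %.0f%%\n", (1 - 1 / exactZoom()) * 100);
```

A US Letter page is 816 pixels wide at true size, so 1.3 is a slight under-correction; passing a closer value such as 1.333 is an option if pixel-exact sizing matters.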

Using pdf2htmlEX via Docker from PHP

If your server environment makes binary installation difficult, the Docker approach is cleaner:



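<?php

$inputFile = '/host/path/to/your-document.pdf';
$outputDir = '/host/path/to/output/';

// Mount the PDF's directory at /pdf (the working directory) and the output directory at /output
$command = sprintf(
    'docker run --rm -v %s -v %s -w /pdf pdf2htmlex/pdf2htmlex --zoom 1.3 --dest-dir /output %s 2>&1',
    escapeshellarg(dirname($inputFile) . ':/pdf'),
    escapeshellarg(rtrim($outputDir, '/') . ':/output'),
    escapeshellarg(basename($inputFile))
);

// Check the exit code: the converter prints progress messages even on success
exec($command, $outputLines, $exitCode);

echo $exitCode === 0
    ? 'Conversion complete.'
    : 'pdf2htmlEX failed: ' . implode(PHP_EOL, $outputLines);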

Method 2: Convert PDF to HTML in PHP Using pdftohtml / poppler-utils

pdftohtml from poppler-utils is the simpler, lighter-weight free option. It has been around longer than pdf2htmlEX and is available in virtually every Linux package repository, which makes it easy to install on shared hosting or minimal VPS environments where Docker is not an option.

Install poppler-utils



# Debian/Ubuntu
sudo apt-get install poppler-utils

# macOS
brew install poppler

Call pdftohtml from PHP



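<?php

$inputFile  = '/path/to/your-document.pdf';
$outputDir  = '/path/to/output/';
$outputBase = $outputDir . 'output'; // pdftohtml appends .html

$command = sprintf(
    'pdftohtml -c -noframes %s %s 2>&1',
    escapeshellarg($inputFile),
    escapeshellarg($outputBase)
);

// Check the exit code rather than whether the tool printed anything
exec($command, $outputLines, $exitCode);

echo $exitCode === 0
    ? 'Conversion complete.'
    : 'pdftohtml failed: ' . implode(PHP_EOL, $outputLines);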

The -c flag preserves the original layout using absolute CSS positioning. The -noframes flag outputs a single HTML file rather than a frameset (the default frameset output is not useful for modern applications).
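If you need to switch these flags per document, a small command builder keeps the escaping in one place. The sketch below is illustrative: the function name and its boolean parameters are hypothetical, and only the -c and -noframes flags described above are used.

```php
<?php

/**
 * Build a pdftohtml command line (hypothetical helper, not part of poppler-utils).
 *
 * @param string $inputPdf   Path to the source PDF
 * @param string $outputBase Output path without the .html extension
 * @param bool   $keepLayout Add -c (absolute CSS positioning)
 * @param bool   $singleFile Add -noframes (one HTML file instead of a frameset)
 */
function buildPdftohtmlCommand(
    string $inputPdf,
    string $outputBase,
    bool $keepLayout = true,
    bool $singleFile = true
): string {
    $parts = ['pdftohtml'];
    if ($keepLayout) {
        $parts[] = '-c';
    }
    if ($singleFile) {
        $parts[] = '-noframes';
    }
    // Escape both paths so spaces or quotes in file names cannot break the command
    $parts[] = escapeshellarg($inputPdf);
    $parts[] = escapeshellarg($outputBase);

    // Merge stderr into stdout so error messages are captured by exec()
    return implode(' ', $parts) . ' 2>&1';
}

echo buildPdftohtmlCommand('/path/to/your-document.pdf', '/path/to/output/output'), PHP_EOL;
```

The returned string can be passed to exec() in the same way as the command above.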

Method 3: Convert PDF to HTML in PHP Using BuildVu

BuildVu is a commercial SDK purpose-built for PDF-to-HTML conversion. It runs as a microservice, either on the IDRsolutions shared cloud or self-hosted on your own infrastructure.

Although the services can be accessed with standard HTTP requests, this tutorial uses our open-source PHP IDRCloudClient, which offers a straightforward PHP wrapper for the REST API.

Prerequisites

To install the idrsolutions-php-client package using Composer, run composer require idrsolutions/idrsolutions-php-client in your project directory.


 

Code Examples

Here is a basic code example demonstrating how to generate HTML from PDF. Configuration options and advanced features are detailed below:
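A minimal sketch of such a conversion is shown here. It assumes the client exposes an IDRsolutions\IDRCloudClient class with static convert and downloadOutput methods, that the shared cloud endpoint is https://cloud.idrsolutions.com/cloud/buildvu, and that the result array carries a downloadUrl entry; treat these names as assumptions to check against the client's README. The helper function, file paths, and token placeholder are hypothetical.

```php
<?php

use IDRsolutions\IDRCloudClient;

/**
 * Build the request parameters for uploading a local PDF (hypothetical helper).
 * 'upload' is assumed to be the input mode the service expects for local files.
 */
function buildUploadParameters(string $pdfPath, ?string $token = null): array
{
    $parameters = ['input' => 'upload', 'file' => $pdfPath];
    if ($token !== null) {
        // Assumed to be needed only for the trial or cloud subscription service
        $parameters['token'] = $token;
    }
    return $parameters;
}

$autoload = __DIR__ . '/vendor/autoload.php';
if (is_file($autoload)) {
    require_once $autoload;

    // Upload the PDF and wait for the conversion to finish
    $results = IDRCloudClient::convert([
        'endpoint'   => 'https://cloud.idrsolutions.com/cloud/buildvu',
        'parameters' => buildUploadParameters(__DIR__ . '/your-document.pdf', 'YOUR_TOKEN'),
    ]);

    // Save the converted output next to this script
    IDRCloudClient::downloadOutput($results, __DIR__ . '/');
    echo $results['downloadUrl'], PHP_EOL;
} else {
    fwrite(STDERR, "Run the Composer install step first.\n");
}
```

Keeping the parameters in their own function makes it easy to add the options from the following sections without touching the conversion call.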


 

Return Result to a Callback URL

The BuildVu Microservice supports a callback URL to send the status upon conversion completion, eliminating the need to constantly poll the service. You can provide the callback URL to the parameters array as demonstrated below:
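As a sketch, the callback can be added with a small helper that validates the URL first. The parameter key callbackUrl is an assumption about the service's API, and the helper name and listener address are hypothetical.

```php
<?php

/**
 * Return a copy of the parameters with a callback URL added (hypothetical helper).
 * The key name 'callbackUrl' is an assumption about the BuildVu service.
 */
function withCallbackUrl(array $parameters, string $callbackUrl): array
{
    // Reject malformed URLs early instead of waiting for a callback that never arrives
    if (filter_var($callbackUrl, FILTER_VALIDATE_URL) === false) {
        throw new InvalidArgumentException("Not a valid URL: {$callbackUrl}");
    }
    $parameters['callbackUrl'] = $callbackUrl;
    return $parameters;
}

$parameters = withCallbackUrl(
    ['input' => 'upload', 'file' => __DIR__ . '/your-document.pdf'],
    'https://example.com/buildvu-listener'
);

print_r($parameters);
```

The resulting array is passed as the parameters value of the conversion request in place of the basic one.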


 

Configuration Options

The BuildVu API allows for conversion customization using a stringified JSON object with key-value pair configuration options. Add these settings to the parameters array. A comprehensive list of options for converting PDF files to HTML or SVG is available in the BuildVu documentation.
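One way to produce the stringified JSON is json_encode, which avoids hand-written quoting mistakes. In this sketch the parameter key settings and the option org.jpedal.pdf2html.textMode with the value svg_realtext are assumptions used for illustration; check the option names against the BuildVu documentation before relying on them.

```php
<?php

/**
 * Return a copy of the parameters with conversion settings attached as a
 * JSON string (hypothetical helper; the 'settings' key is an assumption).
 */
function withSettings(array $parameters, array $settings): array
{
    // JSON_THROW_ON_ERROR (PHP 7.3+) turns encoding failures into exceptions
    $parameters['settings'] = json_encode($settings, JSON_THROW_ON_ERROR);
    return $parameters;
}

$parameters = withSettings(
    ['input' => 'upload', 'file' => __DIR__ . '/your-document.pdf'],
    ['org.jpedal.pdf2html.textMode' => 'svg_realtext'] // illustrative option only
);

echo $parameters['settings'], PHP_EOL;
```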


 

Upload by URL

In addition to uploading a local file, you can provide a URL for the BuildVu Microservice to download and convert. Simply replace the input and file values in the parameters array with the following.
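A sketch of the URL variant is shown here. It assumes the service accepts an input mode of download together with a url entry; the helper name and the example address are hypothetical.

```php
<?php

/**
 * Build request parameters that ask the service to fetch the PDF itself
 * (hypothetical helper; 'download' and 'url' are assumed parameter names).
 */
function buildUrlParameters(string $pdfUrl): array
{
    if (filter_var($pdfUrl, FILTER_VALIDATE_URL) === false) {
        throw new InvalidArgumentException("Not a valid URL: {$pdfUrl}");
    }
    // No 'file' entry is needed: the service downloads the document from the URL
    return ['input' => 'download', 'url' => $pdfUrl];
}

print_r(buildUrlParameters('https://example.com/your-document.pdf'));
```

The URL must be reachable from the BuildVu server, not just from the machine running the PHP script.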


 

Using Authentication

If you require authentication for your BuildVu Microservice, provide username and password when converting and downloading HTML from PDF. Add two variables named username and password to the parameters array, as shown below.

In such cases, you'll also need to provide the authentication values to the downloadOutput method.


 

Which Solution Is Right for You?

The free route is viable for batch-converting simple internal documents where visual accuracy is not critical. For customer-facing document portals, report viewers or any PDF with complex formatting, BuildVu is the correct tool.



BuildVu allows you to

View PDF files in a Web app
Convert PDF documents to HTML5
Parse PDF documents as HTML