How to convert PDF to HTML in Python


In this article I will show you how to use our PDF to HTML API to generate HTML from PDF files with our BuildVu library. BuildVu is a PDF to HTML conversion tool built for developers. Converting PDF to HTML helps you optimise your PDF content for display in browsers. We have a separate article explaining the benefits of converting PDF to HTML.

Convert PDF to HTML using Python

This tutorial uses our open-source Python IDRCloudClient, which provides a simple Python wrapper around the REST API.

Prerequisites

Using pip, install the IDRCloudClient package with the following command:

pip install IDRCloudClient

Code Examples

Below is a basic code example for converting PDF files to HTML or SVG. Additional configuration options and advanced features are detailed below.

    from IDRSolutions import IDRCloudClient

    client = IDRCloudClient('https://cloud.idrsolutions.com/cloud/' + IDRCloudClient.BUILDVU)

    try:
        result = client.convert(
            # token='Token',  # Required only when connecting to the IDRsolutions trial and cloud subscription service
            input=IDRCloudClient.UPLOAD,
            file='/path/to/exampleFile.pdf'
        )

        outputURL = result['downloadUrl']

        client.downloadResult(result, 'path/to/output/dir')

        if outputURL is not None:
            print("Download URL: " + outputURL)

    except Exception as error:
        print(error)

Return result to a callback URL

The BuildVu Microservice supports a callback URL to send the status of a conversion on completion. Using a callback URL eliminates the need to continually check the service for updates. You can provide the callback URL to the `convert` method as shown below:

    result = client.convert(
        # token='Token',  # Required only when connecting to the IDRsolutions trial and cloud subscription service
        input=IDRCloudClient.UPLOAD,
        callbackUrl='http://listener.url',
        file='/path/to/exampleFile.pdf'
    )
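To receive the callback, something has to be listening at the URL you supply. The following is a minimal sketch of such a listener using only the Python standard library. It assumes the service sends a POST request with a JSON body; that assumption, the handler name, and the port are illustrative and not taken from the BuildVu documentation, so check the actual payload before relying on it.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def parse_callback_body(body: bytes) -> dict:
    """Decode a callback request body, assumed here to be a JSON document."""
    if not body:
        return {}
    data = json.loads(body.decode("utf-8"))
    # Wrap non-object payloads so callers always receive a dict.
    return data if isinstance(data, dict) else {"payload": data}


class CallbackHandler(BaseHTTPRequestHandler):
    """Hypothetical handler that logs whatever the conversion service posts."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        try:
            status = parse_callback_body(self.rfile.read(length))
        except ValueError:
            # Covers both invalid UTF-8 and invalid JSON.
            self.send_response(400)
            self.end_headers()
            return
        print("Conversion callback received:", status)
        self.send_response(200)
        self.end_headers()


def run_listener(host: str = "0.0.0.0", port: int = 8080) -> None:
    """Serve until interrupted; host and port are assumptions for this sketch."""
    HTTPServer((host, port), CallbackHandler).serve_forever()
```

Calling run_listener() starts the server. The address passed as callbackUrl must be reachable from the machine running the BuildVu Microservice.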

Configuration Options

The BuildVu API allows for conversion customization using a stringified JSON object with key-value pair configuration options. Provide these settings to the convert method. A comprehensive list of options for converting PDF files to HTML or SVG is available here.

    settings='{"key":"value","key":"value"}'
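Writing the JSON string by hand is error-prone once there are more than a couple of options. One way to avoid quoting mistakes is to build a dictionary and serialise it with the standard json module, as sketched below. The option names are placeholders, not real BuildVu keys; substitute keys from the options list.

```python
import json

# Placeholder option names; replace them with real keys from the options list.
options = {
    "exampleOptionOne": "value",
    "exampleOptionTwo": "value",
}

# json.dumps produces the stringified JSON object that the settings argument expects.
settings = json.dumps(options)
print(settings)
```

The resulting string can then be passed to the convert method as settings=settings.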

Upload by URL

In addition to uploading a local file, you can provide a URL for the BuildVu Microservice to download and convert. Simply replace the input and file values in the convert method with the following.

    input=IDRCloudClient.DOWNLOAD,
    url='http://exampleURL/exampleFile.pdf'
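If your code handles both local files and remote URLs, a small helper can choose the right pair of arguments. This is a sketch only: convert_kwargs_for is a hypothetical function, not part of IDRCloudClient, and it assumes only that the object passed as modes exposes the UPLOAD and DOWNLOAD attributes used in this tutorial.

```python
from urllib.parse import urlparse


def convert_kwargs_for(source: str, modes) -> dict:
    """Build the input-related keyword arguments for convert().

    `modes` is any object exposing UPLOAD and DOWNLOAD attributes,
    such as the IDRCloudClient class used in this tutorial.
    """
    # Treat http(s) sources as remote files for the service to download.
    if urlparse(source).scheme in ("http", "https"):
        return {"input": modes.DOWNLOAD, "url": source}
    # Anything else is treated as a path to a local file to upload.
    return {"input": modes.UPLOAD, "file": source}
```

It could be used as client.convert(**convert_kwargs_for(source, IDRCloudClient)), with any other arguments added alongside.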

Using Authentication

For deployments of your own BuildVu Microservice that require a username and password for PDF-to-HTML or SVG conversions, provide these credentials with each conversion. Pass a variable named auth to the convert method as demonstrated below.

    auth=('username', 'password')
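Hard-coding credentials in source files is best avoided. A common alternative is to read them from environment variables, sketched below. The variable names BUILDVU_USERNAME and BUILDVU_PASSWORD and the auth_from_env helper are assumptions made for this example, not something the client library defines.

```python
import os
from typing import Optional, Tuple


def auth_from_env(
    user_var: str = "BUILDVU_USERNAME",
    password_var: str = "BUILDVU_PASSWORD",
) -> Optional[Tuple[str, str]]:
    """Return a (username, password) tuple, or None if either variable is unset."""
    username = os.environ.get(user_var)
    password = os.environ.get(password_var)
    if username is None or password is None:
        return None
    return (username, password)
```

When the helper returns a tuple, pass it to the convert method as auth=auth_from_env().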

BuildVu allows you to

View PDF files in a Web app

Convert PDF documents to HTML5

Parse PDF documents as HTML
