Extraction covers pulling fonts, images, and other content out of PDF, HTML5, SVG, and similar formats.

Extraction

How to extract clipped images from PDF file in…

PDF files are not directly supported by Java. This tutorial shows you how to extract clipped images from a PDF file in 5 simple...
Mark Stephens
58 sec read

How to extract images from PDF file in Java

PDF files are not directly supported by Java. This tutorial shows you how to extract images from a PDF file in 5 simple steps...
Mark Stephens
58 sec read

Size does matter

Recently I have been looking into an issue in our PDF text extraction. A case was found where text extraction would appear to freeze....
Kieran France
1 min read

Improving the way settings get passed in ExtractPagesAsHTML

Over the last year, we have grown our PDF to HTML5 converter to be increasingly configurable, capable of suiting a huge range of requirements...
Leon Atherton
1 min read

PDF puzzlers – when is a return character significant…

The PDF file format is a very ‘flexible’ file format. You can put returns into the middle of most objects. There is a...
Mark Stephens
41 sec read