The main body of a book or article.

Text

Apache Tika PDF support in JPedal

JPedal now contains an Apache Tika Parser which can parse and extract unstructured text from PDF files. How to use an Apache Tika PDF...
Jacob Collins
29 sec read

Understanding the PDF File Format

We have been working with PDF files since 1999 and developed complex software to display PDF files. We have learnt a lot about the...
Leon Atherton
4 min read

Three ways to convert PDF to HTML5: Text and…

There are several ways that you can deal with text and fonts in PDF files when converting to HTML5. Here are the...
Leon Atherton
2 min read

Tutorial: How To Copy Text in JavaFX and…

At IDRSolutions we have a PDF Viewer that can highlight and copy text. Because we are developing a JavaFX implementation of our PDF Viewer, we required...
Nathan Howard
1 min read

Size does matter

Recently I have been looking into an issue in our PDF text extraction. A case was found where text extraction would appear to freeze....
Kieran France
1 min read