The main body of a book or article.

Text

How to extract text from a PDF as JSON

Some PDF files can be “tagged”, which means they contain information about the structure of the file. This structure is embedded as metadata within...
Jacob Collins
1 min read

How to process PDFs for use with AI (Tutorial)

As Artificial Intelligence becomes more popular for processing large bodies of text, it becomes apparent that PDF files pose a challenge. PDF is a...
Jacob Collins
1 min read

How to search for text in a PDF file…

Can you determine whether a PDF is searchable for text without opening it? Well, you will need some special software. This might be useful...
Jacob Collins
48 sec read

Apache Tika PDF support in JPedal

JPedal now contains an Apache Tika Parser which can parse and extract unstructured text from PDF files. How to use an Apache Tika PDF...
Jacob Collins
29 sec read

Understanding the PDF File Format

We have been working with PDF files since 1999 and have developed complex software to display PDF files. We have learnt a lot about the...
Leon Atherton
3 min read

Tutorial: How To Copy Text in JavaFX and…

At IDRSolutions we have a PDF Viewer that can highlight and copy text. Because we are developing a JavaFX implementation of our PDF Viewer, we required...
Nathan Howard
1 min read