This tutorial shows you how to find words in a PDF file in a few simple steps using the JPedal Java PDF library. JPedal includes a PDF search engine with an easy-to-use Java API for finding words and phrases in a PDF document.
How to search PDF file in Java
- Download the JPedal trial jar
- Create a File handle, InputStream or URL pointing to the PDF file
- Include a password if the file is password protected
- Open the PDF file
- Scan the pages
- Close the PDF file
And the Java code to search a PDF:
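// FindTextInRectangle and SearchType are supplied by the JPedal jar
import java.io.File;

File file = new File("/path/to/document.pdf");
FindTextInRectangle extract = new FindTextInRectangle(file);
// Uncomment if the file is password protected
//extract.setPassword("password");
if (extract.openPDFFile()) {
    int pageCount = extract.getPageCount();
    // Pages are numbered from 1
    for (int page = 1; page <= pageCount; page++) {
        // Coordinates of any matches found on this page
        float[] coords = extract.findTextOnPage(page, "textToFind",
                SearchType.MUTLI_LINE_RESULTS);
    }
}
extract.closePDFfile();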
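The loop above stores each page's result in coords but does not use it. Below is a minimal sketch of one way to turn such an array into rectangles. It assumes, and this tutorial does not confirm it, that each match is packed as a fixed-size group of consecutive floats whose first four values are x1, y1, x2, y2. The class MatchRectangles, the method toRectangles and the stride parameter are hypothetical names for illustration only; check the JPedal Javadoc for the real layout and for how "no match" is reported.

```java
import java.util.ArrayList;
import java.util.List;

public class MatchRectangles {

    /**
     * Splits a flat coordinate array into one float[4] per match.
     * ASSUMPTION: every match occupies `stride` consecutive values and the
     * first four of them are x1, y1, x2, y2. Verify against the JPedal Javadoc.
     */
    static List<float[]> toRectangles(float[] coords, int stride) {
        List<float[]> rectangles = new ArrayList<>();
        if (coords == null || stride < 4) {
            return rectangles; // nothing usable to split
        }
        // Incomplete trailing groups are ignored
        for (int i = 0; i + stride <= coords.length; i += stride) {
            rectangles.add(new float[] {
                coords[i], coords[i + 1], coords[i + 2], coords[i + 3]
            });
        }
        return rectangles;
    }

    public static void main(String[] args) {
        // Made-up sample values standing in for a findTextOnPage result
        float[] sample = {10f, 700f, 60f, 688f, 10f, 650f, 60f, 638f};
        for (float[] r : toRectangles(sample, 4)) {
            System.out.println(r[0] + ", " + r[1] + " to " + r[2] + ", " + r[3]);
        }
    }
}
```

Passing the group size as a parameter keeps the sketch usable if the array turns out to carry extra values per match.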
Why can’t I just search the PDF file directly?
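You cannot simply search the raw bytes of a PDF file because the text is not stored as plain, readable strings. Page content is usually held in compressed binary streams, and even after decompression a word may be split into separate fragments or written in a font-specific encoding.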
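To make this concrete, here is a small self-contained sketch that uses only the Java standard library, not JPedal. It builds two hand-written fragments in the style of a PDF content stream (the fragments and the class name NaivePdfSearchDemo are illustrative assumptions, not taken from a real file) and shows two ways a plain byte search for "Hello" can fail: once the stream has been Deflate-compressed, and when the word is split into kerned pieces inside a TJ array.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

public class NaivePdfSearchDemo {

    // Returns true if needle occurs in haystack as a contiguous run of bytes
    static boolean containsBytes(byte[] haystack, byte[] needle) {
        outer:
        for (int i = 0; i + needle.length <= haystack.length; i++) {
            for (int j = 0; j < needle.length; j++) {
                if (haystack[i + j] != needle[j]) continue outer;
            }
            return true;
        }
        return false;
    }

    // Compresses data with Deflate, the scheme behind PDF's FlateDecode filter
    static byte[] deflate(byte[] input) {
        Deflater deflater = new Deflater();
        deflater.setInput(input);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[256];
        while (!deflater.finished()) {
            out.write(buffer, 0, deflater.deflate(buffer));
        }
        deflater.end();
        return out.toByteArray();
    }

    // Reverses deflate(); wraps the checked exception to keep callers simple
    static byte[] inflate(byte[] input) {
        Inflater inflater = new Inflater();
        inflater.setInput(input);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[256];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(buffer);
                if (n == 0 && inflater.needsInput()) break; // truncated input
                out.write(buffer, 0, n);
            }
        } catch (DataFormatException e) {
            throw new IllegalStateException("not valid Deflate data", e);
        } finally {
            inflater.end();
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] word = "Hello".getBytes(StandardCharsets.US_ASCII);

        // Illustrative fragment: two strings drawn with the Tj operator
        byte[] raw = "BT /F1 12 Tf 72 720 Td (Hello World) Tj 0 -14 Td (Hello World again) Tj ET"
                .getBytes(StandardCharsets.US_ASCII);
        byte[] compressed = deflate(raw);
        System.out.println("uncompressed stream: " + containsBytes(raw, word));
        System.out.println("compressed stream:   " + containsBytes(compressed, word));
        System.out.println("after inflating:     " + containsBytes(inflate(compressed), word));

        // Illustrative fragment: the same word split by a kerning adjustment
        byte[] kerned = "BT /F1 12 Tf 72 720 Td [(Hel) -20 (lo)] TJ ET"
                .getBytes(StandardCharsets.US_ASCII);
        System.out.println("kerned TJ array:     " + containsBytes(kerned, word));
    }
}
```

A search engine such as the one in JPedal has to decode the streams and reassemble the text fragments before any matching can happen, which is why the tutorial code works on pages rather than on raw bytes.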
Related tutorials
If you are looking to search PDF files in JPedal, we recommend you start with these tutorials: