Mark Stephens Mark has been working with Java and PDF since 1999 and is a big NetBeans fan. He enjoys speaking at conferences. He has an MA in Medieval History and a passion for reading.

How to search a PDF file in Java


Java has no built-in support for reading PDF files. This tutorial shows how to search the text content of a PDF file in a few simple steps using the JPedal PDF library.

Why use a third-party library to handle PDF files?

A PDF file is a complex hybrid of binary and text data, so it has to be decoded before its text content can be read. In this example, we use our JPedal PDF library to keep the task simple.
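To illustrate why decoding is needed, here is a minimal sketch using only the standard library. It assumes the common case where a page's content stream is stored deflate-compressed (the PDF FlateDecode filter). The class name RawPdfSearchDemo, the helper names, and the sample content stream are made up for the illustration and are not part of JPedal.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

class RawPdfSearchDemo {

    // Compress bytes with zlib/deflate, the encoding behind PDF's FlateDecode filter.
    static byte[] deflate(byte[] input) {
        Deflater deflater = new Deflater();
        deflater.setInput(input);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[1024];
        while (!deflater.finished()) {
            int n = deflater.deflate(buffer);
            out.write(buffer, 0, n);
        }
        deflater.end();
        return out.toByteArray();
    }

    // Reverse of deflate(): recover the original bytes.
    static byte[] inflate(byte[] input) {
        Inflater inflater = new Inflater();
        inflater.setInput(input);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[1024];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(buffer);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    break; // truncated or unsupported input: stop instead of looping forever
                }
                out.write(buffer, 0, n);
            }
        } catch (DataFormatException e) {
            throw new IllegalStateException("Not valid deflate data", e);
        } finally {
            inflater.end();
        }
        return out.toByteArray();
    }

    // Naive search: treat the bytes as Latin-1 text and look for the term.
    static boolean containsText(byte[] data, String term) {
        return new String(data, StandardCharsets.ISO_8859_1).contains(term);
    }

    // Made-up content stream: 20 lines of text drawn with PDF text operators.
    static byte[] sampleContentStream() {
        StringBuilder sb = new StringBuilder();
        for (int line = 0; line < 20; line++) {
            sb.append("BT /F1 12 Tf 72 ").append(720 - 14 * line)
              .append(" Td (textToFind on line ").append(line).append(") Tj ET\n");
        }
        return sb.toString().getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        byte[] stored = deflate(sampleContentStream());
        System.out.println("found in stored bytes: " + containsText(stored, "textToFind"));
        System.out.println("found after inflating: " + containsText(inflate(stored), "textToFind"));
    }
}
```

The search over the stored bytes fails even though the text is there, and only succeeds once the stream is inflated. Real files add further layers on top of this (font encodings, text split across several operators), which is the work a PDF library takes on.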

How to search a PDF file in Java

Step 1 Download the JPedal trial jar.
Step 2 Create a File handle, InputStream or URL pointing to the PDF file

FindTextInRectangle extract = new FindTextInRectangle(path);

Step 3 Include a password if the file is password protected

extract.setPassword("password");

Step 4 Open the PDF file

if (extract.openPDFFile()) {

Step 5 Scan the pages

  int pageCount = extract.getPageCount();
  for (int page = 1; page <= pageCount; page++) {
    float[] coords = extract.findTextOnPage(page, "textToFind",
          SearchType.MUTLI_LINE_RESULTS);
  }
}

Step 6 Close the PDF file

 extract.closePDFfile();
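
The loop in Step 5 overwrites coords on every pass, so nothing is kept once it finishes. Below is a minimal sketch of one way to record which pages produced a hit. So that it runs without the JPedal jar, the hypothetical PageSearch interface stands in for the extract.findTextOnPage call. It also assumes that a page with no match yields either null or an empty array, which should be checked against the JPedal documentation. The coordinate values in the stub are arbitrary placeholders.

```java
import java.util.ArrayList;
import java.util.List;

class PageHitCollector {

    // Hypothetical stand-in for extract.findTextOnPage(page, term, searchType).
    interface PageSearch {
        float[] find(int page);
    }

    // Returns the 1-based numbers of the pages on which the search returned coordinates.
    static List<Integer> pagesWithHits(int pageCount, PageSearch search) {
        List<Integer> pages = new ArrayList<>();
        for (int page = 1; page <= pageCount; page++) {
            float[] coords = search.find(page);
            // Assumption: "no match" is reported as null or as an empty array.
            if (coords != null && coords.length > 0) {
                pages.add(page);
            }
        }
        return pages;
    }

    public static void main(String[] args) {
        // Stub: pretend the term occurs on pages 2 and 5 of a 6-page file.
        PageSearch stub = page -> (page == 2 || page == 5)
                ? new float[] {72f, 700f, 140f, 688f}
                : new float[0];
        System.out.println(pagesWithHits(6, stub)); // prints [2, 5]
    }
}
```

With JPedal on the classpath, the stub would be replaced by a lambda that calls extract.findTextOnPage with the search term and search type from Step 5.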

 


JPedal makes searching PDF files for text simple


