Mark Stephens
Mark has been working with Java and PDF since 1999 and is a big NetBeans fan. He enjoys speaking at conferences. He has an MA in Medieval History and a passion for reading.

How to search a PDF file in Java


Java has no built-in support for PDF files. This tutorial shows how to search the text content of a PDF file in five simple steps using the JPedal PDF library.

Why use a third-party library to handle PDF files?

A PDF file is a complex hybrid of binary and text data, and it must be decoded before its textual content can be read. In this example, we use our JPedal PDF library to make the task simple.
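To make the need for decoding concrete, here is a minimal sketch that does not use JPedal at all. It assumes a page's text sits in a stream compressed with deflate (the scheme behind PDF's common FlateDecode filter) and compares a search over the stored bytes with a search over the decoded bytes. The class name RawSearchDemo, its helper methods, and the sample content string are made up for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical demo class: shows why a plain byte search over stored data misses text.
public class RawSearchDemo {

    // Compress bytes with zlib/deflate, as a PDF writer might do for a content stream.
    static byte[] deflate(byte[] input) {
        Deflater deflater = new Deflater();
        deflater.setInput(input);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[256];
        while (!deflater.finished()) {
            int n = deflater.deflate(buffer);
            out.write(buffer, 0, n);
        }
        deflater.end();
        return out.toByteArray();
    }

    // Reverse the compression so the original bytes can be inspected.
    static byte[] inflate(byte[] input) throws DataFormatException {
        Inflater inflater = new Inflater();
        inflater.setInput(input);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[256];
        while (!inflater.finished()) {
            int n = inflater.inflate(buffer);
            if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                throw new DataFormatException("truncated or unsupported stream");
            }
            out.write(buffer, 0, n);
        }
        inflater.end();
        return out.toByteArray();
    }

    // Naive byte-level substring search.
    static boolean contains(byte[] haystack, byte[] needle) {
        outer:
        for (int i = 0; i + needle.length <= haystack.length; i++) {
            for (int j = 0; j < needle.length; j++) {
                if (haystack[i + j] != needle[j]) {
                    continue outer;
                }
            }
            return true;
        }
        return false;
    }

    public static void main(String[] args) throws DataFormatException {
        // Sample text-drawing operators; the string in parentheses is the visible text.
        byte[] content = "BT /F1 12 Tf 72 720 Td (textToFind) Tj ET"
                .getBytes(StandardCharsets.US_ASCII);
        byte[] needle = "textToFind".getBytes(StandardCharsets.US_ASCII);

        byte[] stored = deflate(content);
        System.out.println("found in stored bytes:  " + contains(stored, needle));
        System.out.println("found in decoded bytes: " + contains(inflate(stored), needle));
    }
}
```

The search over the stored bytes is expected to miss the text, while the search over the decoded bytes finds it. Real files add further layers on top of compression, such as font-specific character encodings and text split across several drawing operators, which is why a dedicated library is the practical choice.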

How to search a PDF file in Java

Step 1 Create a File handle, InputStream or URL pointing to the PDF file

FindTextInRectangle extract=new FindTextInRectangle(path);

Step 2 Include a password if the file is password protected

extract.setPassword("password");

Step 3 Open the PDF file

if (extract.openPDFFile()) {

Step 4 Scan the pages

      int pageCount=extract.getPageCount();
      for (int page=1; page<=pageCount; page++) {
 
          float[] coords = extract.findTextOnPage(page, "textToFind", SearchType.MUTLI_LINE_RESULTS);
      }
}

Step 5 Close the PDF file

 extract.closePDFfile();






IDRsolutions Ltd 2020. All rights reserved.