Mark Stephens Mark has been working with Java and PDF since 1999 and is a big NetBeans fan. He enjoys speaking at conferences. He has an MA in Medieval History and a passion for reading.

How to search a PDF file in Java


Java has no built-in support for PDF files. This tutorial shows how to search the text content of a PDF file in a few simple steps using the JPedal PDF library.

Why use a third-party library to handle PDF files?

PDF files are a complex hybrid of binary and text data, so a file has to be decoded before its textual content can be read. In this example, we will use our JPedal PDF library to make this task simple.
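To see why decoding matters, here is a minimal sketch that uses only the standard java.util.zip classes, not JPedal. It compresses a made-up page content fragment with deflate, the scheme behind PDF's commonly used FlateDecode filter, and shows that a naive byte search finds the text only after the data is inflated again. The SAMPLE string and the class and method names are invented for illustration. Real PDF files add further complications, such as font encodings, that this sketch ignores.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

class FlateSearchDemo {

    // Made-up content stream fragment (illustrative only, not taken from a real file).
    static final String SAMPLE =
            "BT /F1 12 Tf 72 720 Td (Hello PDF search) Tj ET\n"
          + "BT /F1 12 Tf 72 700 Td (Second line of text) Tj ET\n"
          + "BT /F1 12 Tf 72 680 Td (Third line of text) Tj ET\n";

    // Compress bytes with zlib/deflate.
    static byte[] deflate(byte[] input) {
        Deflater deflater = new Deflater();
        deflater.setInput(input);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[256];
        while (!deflater.finished()) {
            int n = deflater.deflate(buffer);
            out.write(buffer, 0, n);
        }
        deflater.end();
        return out.toByteArray();
    }

    // Decompress zlib/deflate bytes; malformed input is reported as an unchecked exception.
    static byte[] inflate(byte[] input) {
        Inflater inflater = new Inflater();
        inflater.setInput(input);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[256];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(buffer);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new IllegalArgumentException("Truncated or unsupported stream");
                }
                out.write(buffer, 0, n);
            }
        } catch (DataFormatException e) {
            throw new IllegalArgumentException("Not valid deflate data", e);
        } finally {
            inflater.end();
        }
        return out.toByteArray();
    }

    // Naive search: ISO-8859-1 maps every byte to one char, so this is a plain byte match.
    static boolean containsBytes(byte[] haystack, String needle) {
        return new String(haystack, StandardCharsets.ISO_8859_1).contains(needle);
    }

    public static void main(String[] args) {
        byte[] compressed = deflate(SAMPLE.getBytes(StandardCharsets.ISO_8859_1));
        System.out.println("Found in compressed bytes: " + containsBytes(compressed, "search"));
        System.out.println("Found after inflating:     " + containsBytes(inflate(compressed), "search"));
    }
}
```

A PDF library does this kind of decoding, and a good deal more, before any text matching can take place.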

How to search a PDF file in Java

Step 1 Create a File handle, InputStream or URL pointing to the PDF file

FindTextInRectangle extract=new FindTextInRectangle(path);

Step 2 Include a password if the file is password protected

extract.setPassword("password");

Step 3 Open the PDF file

if (extract.openPDFFile()) {

Step 4 Scan the pages

  int pageCount = extract.getPageCount();
  for (int page = 1; page <= pageCount; page++) {
    float[] coords = extract.findTextOnPage(page, "textToFind",
          SearchType.MUTLI_LINE_RESULTS);
  }
}

Step 5 Close the PDF file

 extract.closePDFfile();
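The loop in Step 4 overwrites coords on every pass, so in practice you will want to keep the results for each page. The sketch below shows one way to do that, and to make sure the file is closed even if a search throws. It does not use JPedal itself: TextSearcher is a hypothetical interface that mirrors the method names from the steps above (with the SearchType argument left out for brevity), and FakeSearcher is an in-memory stand-in so the example runs on its own. Treating a null or empty array as "no match" is an assumption of this sketch, not a statement about JPedal's behaviour.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class PageScanSketch {

    // Hypothetical interface mirroring the calls used in the tutorial steps.
    interface TextSearcher {
        boolean openPDFFile();
        int getPageCount();
        float[] findTextOnPage(int page, String text);
        void closePDFfile();
    }

    // Scans every page, keeps the result for each page with a hit,
    // and always closes the searcher afterwards.
    static Map<Integer, float[]> searchAllPages(TextSearcher searcher, String text) {
        Map<Integer, float[]> hits = new LinkedHashMap<>();
        try {
            if (searcher.openPDFFile()) {
                int pageCount = searcher.getPageCount();
                for (int page = 1; page <= pageCount; page++) {
                    float[] coords = searcher.findTextOnPage(page, text);
                    // Assumption: null or empty means nothing was found on this page.
                    if (coords != null && coords.length > 0) {
                        hits.put(page, coords);
                    }
                }
            }
        } finally {
            searcher.closePDFfile();
        }
        return hits;
    }

    // In-memory stand-in: each string plays the role of one page's text.
    static final class FakeSearcher implements TextSearcher {
        private final String[] pages;
        boolean closed;

        FakeSearcher(String... pages) {
            this.pages = pages;
        }

        public boolean openPDFFile() {
            return true;
        }

        public int getPageCount() {
            return pages.length;
        }

        public float[] findTextOnPage(int page, String text) {
            int index = pages[page - 1].indexOf(text);
            // Made-up "coordinates": just the character offset of the first match.
            return index < 0 ? new float[0] : new float[] { index };
        }

        public void closePDFfile() {
            closed = true;
        }
    }

    public static void main(String[] args) {
        FakeSearcher fake = new FakeSearcher("intro text", "nothing here", "more text");
        Map<Integer, float[]> hits = searchAllPages(fake, "text");
        System.out.println("Pages with matches: " + hits.keySet());
        System.out.println("Closed: " + fake.closed);
    }
}
```

With the real library, the same try/finally shape can be wrapped around the calls from Steps 3 to 5.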

 


JPedal makes searching PDF files for text simple







IDRsolutions Ltd 2021. All rights reserved.