Mark Stephens Mark has been working with Java and PDF since 1999 and is a big NetBeans fan. He enjoys speaking at conferences. He has an MA in Medieval History and a passion for reading.

How to search a PDF file in Java

Java has no built-in support for reading PDF files. This tutorial shows how to search the text content of a PDF file in a few simple steps using the JPedal Java PDF library, which provides an easy-to-use Java API for searching text in PDF documents from your own code.

Why use a third-party library to handle PDF files?

PDF files are a complex hybrid of binary and text data, so a file has to be decoded before its textual content can be read. In this example, we will use our JPedal Java PDF library to make this task simple.
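
To illustrate why a plain byte search is not enough, here is a minimal, simplified sketch that uses only the JDK. It assumes a page content stream compressed with the Flate (zlib) filter, which PDF files commonly use. The class name, the sample stream and the search term are made up for illustration, and this is not a description of how JPedal works internally.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Illustrative only: a hypothetical demo class, not part of JPedal.
class RawBytesVsDecoded {

    /** Compresses data in zlib format, as a Flate-encoded stream would store it. */
    static byte[] deflate(byte[] data) {
        Deflater deflater = new Deflater();
        deflater.setInput(data);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[256];
        while (!deflater.finished()) {
            int n = deflater.deflate(buffer);
            out.write(buffer, 0, n);
        }
        deflater.end();
        return out.toByteArray();
    }

    /** Reverses deflate(); this is the "decode" step a raw search skips. */
    static byte[] inflate(byte[] data) {
        Inflater inflater = new Inflater();
        inflater.setInput(data);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[256];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(buffer);
                if (n == 0 && inflater.needsInput()) {
                    break; // input ended early; stop rather than loop forever
                }
                out.write(buffer, 0, n);
            }
        } catch (DataFormatException e) {
            throw new IllegalStateException("Not valid zlib data", e);
        } finally {
            inflater.end();
        }
        return out.toByteArray();
    }

    /** Naive search: view the bytes as ISO-8859-1 text and look for the term. */
    static boolean containsText(byte[] bytes, String term) {
        return new String(bytes, StandardCharsets.ISO_8859_1).contains(term);
    }

    public static void main(String[] args) {
        // Made-up content stream: two lines of text drawn with the Tj operator.
        String content = "BT /F1 12 Tf 72 720 Td (Invoice total: 42) Tj ET\n"
                       + "BT /F1 12 Tf 72 700 Td (Invoice date: today) Tj ET\n";
        byte[] stored = deflate(content.getBytes(StandardCharsets.ISO_8859_1));

        // Searching the bytes as stored finds nothing ...
        System.out.println(containsText(stored, "Invoice total"));          // false
        // ... while searching the decoded stream finds the text.
        System.out.println(containsText(inflate(stored), "Invoice total")); // true
    }
}
```

Real documents add further layers on top of compression, such as font-specific character encodings and text split across several drawing operations, which is why a dedicated library is the practical choice.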

How to search a PDF file in Java

  1. Download the JPedal trial jar
  2. Create a File handle, InputStream or URL pointing to the PDF file
    FindTextInRectangle extract=new FindTextInRectangle(path);
  3. Include a password if the file is password protected
    extract.setPassword("password");
  4. Open the PDF file
    if (extract.openPDFFile()) {
  5. Scan the pages
      int pageCount = extract.getPageCount();
      for (int page = 1; page <= pageCount; page++) {
        float[] coords = extract.findTextOnPage(page, "textToFind",
              SearchType.MUTLI_LINE_RESULTS);
      }
    }
  6. Close the PDF file
     extract.closePDFfile();
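
The steps above close the file only if nothing goes wrong during the search. A minimal sketch of the same loop with the close call moved into a finally block is shown below. So that it runs without the JPedal jar, it uses a hypothetical PdfTextSearcher interface and an in-memory FakeSearcher. Only the method names are taken from the steps above, the search-type argument is left out for brevity, and treating a null or empty array as "no match" is an assumption rather than documented JPedal behaviour.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: a hypothetical sketch, not part of JPedal.
class SearchLoopSketch {

    /** Hypothetical stand-in for the search object used in the steps above. */
    interface PdfTextSearcher {
        boolean openPDFFile();
        int getPageCount();
        float[] findTextOnPage(int page, String text);
        void closePDFfile();
    }

    /** Returns the 1-based numbers of all pages with at least one match. */
    static List<Integer> pagesContaining(PdfTextSearcher searcher, String text) {
        List<Integer> hits = new ArrayList<>();
        try {
            if (searcher.openPDFFile()) {
                int pageCount = searcher.getPageCount();
                for (int page = 1; page <= pageCount; page++) {
                    float[] coords = searcher.findTextOnPage(page, text);
                    // Assumption: null or an empty array means no match on this page.
                    if (coords != null && coords.length > 0) {
                        hits.add(page);
                    }
                }
            }
        } finally {
            // Runs even if the search throws, so the file is always released.
            searcher.closePDFfile();
        }
        return hits;
    }

    /** In-memory fake with one string per page, so the sketch runs on its own. */
    static final class FakeSearcher implements PdfTextSearcher {
        private final String[] pages;
        boolean closed;

        FakeSearcher(String... pages) {
            this.pages = pages;
        }

        public boolean openPDFFile() { return true; }

        public int getPageCount() { return pages.length; }

        public float[] findTextOnPage(int page, String text) {
            // Placeholder coordinates; a real library would return actual positions.
            return pages[page - 1].contains(text) ? new float[] {0f, 0f, 0f, 0f} : new float[0];
        }

        public void closePDFfile() { closed = true; }
    }

    public static void main(String[] args) {
        FakeSearcher fake = new FakeSearcher("intro", "the textToFind is here", "appendix");
        System.out.println(pagesContaining(fake, "textToFind")); // prints [2]
    }
}
```

With the real library, the object created in step 2 would take the place of FakeSearcher, with the calls adapted to its actual method signatures.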
    

JPedal makes searching PDF files for text simple


The JPedal PDF library allows you to:

Display PDF files in Java apps
View PDF files in Java
Convert PDF files to images