Kieran France is a programmer for IDRSolutions in charge of their internal test suite. In his spare time he enjoys tinkering with gadgets and code.

Improving PDF text search in JPedal


I have been working on PDF search and felt it was time to share some enhancements and changes with you. Just over a month ago we announced the removal of the search option from the right-click menu, as it had become obsolete. In the upcoming release the option will be removed completely (not just commented out).

Originally, the option allowed you to search within highlighted areas and displayed a dialogue listing the coordinates of any search terms found. We have replaced it with a new option in our main search functionality.

 

Search Within Highlights In The Viewer

To search within highlights in the JPedal PDF viewer, you need to use either the external window or the side tab bar option for the search. The menu bar search displays the first result as soon as it is found, which changes the highlights, whereas the external window and the side tab bar complete the search before altering any highlighted areas.

To turn on this functionality you need to activate the option from the advanced options panel shown below.

JPedal viewer search external window with advanced options open.

Note that the option to search only within highlighted areas is used only if it is selected and highlights are present. If there are no highlighted areas, the search continues as normal according to the other options chosen.
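
The rule above can be sketched in a few lines of Java. This is only an illustration of the decision, not the viewer's actual code: the class and method names are hypothetical, and treating a "normal" search as one that covers the whole page is an assumption made for the example.

```java
public class SearchScopeSketch {

    /** True only when the option is selected AND at least one highlight exists. */
    static boolean restrictToHighlights(boolean optionSelected, int[][] highlights) {
        return optionSelected && highlights != null && highlights.length > 0;
    }

    /**
     * Returns the areas a search should cover: the highlights when the rule
     * applies, otherwise a single whole-page area (assumed fallback).
     */
    static int[][] areasToSearch(boolean optionSelected, int[][] highlights, int[] wholePage) {
        if (restrictToHighlights(optionSelected, highlights)) {
            return highlights;
        }
        return new int[][]{wholePage};
    }

    public static void main(String[] args) {
        // Hypothetical coordinates in the order x1, y1, x2, y2
        int[][] highlights = {{20, 150, 60, 158}};

        System.out.println(restrictToHighlights(true, highlights));    // true
        System.out.println(restrictToHighlights(true, new int[0][]));  // false: no highlights
        System.out.println(restrictToHighlights(false, highlights));   // false: option not selected
    }
}
```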

 

Search Within Highlights In Your Code

As this functionality is based on one of our Viewer example classes and cannot simply be called via PdfGroupingAlgorithms, we thought it best to describe how to achieve it yourself.

First, you will need a way within your program to specify highlight areas, or any other area you wish to search within.

Next, you just need to cycle through your areas using any of the search methods that take a search area as input. Each time a search completes, move the returned results into another object to hold them, then move on to the next area.

A quick example of how this can be achieved for a single page can be found below.

 // Requires: import java.util.ArrayList;
 int[][] areas = new int[3][4];
 String[] search = {"test", "search"};
 int type = SearchType.DEFAULT;

 /**
  * Fill areas with coordinates in the order
  * x1, y1, x2, y2
  * Hard coded here just for example
  */
 areas[0] = new int[]{0, 0, 80, 60};
 areas[1] = new int[]{20, 80, 120, 9};
 areas[2] = new int[]{20, 150, 60, 158};

 ArrayList<float[]> resultHolder = new ArrayList<float[]>();

 for (int i = 0; i < areas.length; i++) {
     // Search only within the current area (x1, y1, x2, y2)
     float[] results = findText(areas[i][0], areas[i][1], areas[i][2], areas[i][3], search, type);
     /**
      * Store results for use after all areas searched
      */
     resultHolder.add(results);
 }
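
If you want to try the loop without a PDF loaded, the same pattern can be written as a self-contained sketch. Everything named here is an assumption for illustration: `AreaSearcher` is a hypothetical stand-in for whichever search method you call, the dummy searcher returns made-up values, and skipping null or empty results assumes that is how "nothing found" is reported.

```java
import java.util.ArrayList;
import java.util.List;

public class HighlightSearchSketch {

    /** Hypothetical stand-in for the real search call; not part of JPedal. */
    interface AreaSearcher {
        float[] findText(int x1, int y1, int x2, int y2, String[] terms, int searchType);
    }

    /** Runs one search per area and keeps each area's results in a shared list. */
    static List<float[]> searchAreas(int[][] areas, String[] terms, int searchType,
                                     AreaSearcher searcher) {
        List<float[]> resultHolder = new ArrayList<>();
        for (int[] area : areas) {
            float[] results = searcher.findText(area[0], area[1], area[2], area[3],
                                                terms, searchType);
            // Assumption: null or empty means nothing was found in this area
            if (results != null && results.length > 0) {
                resultHolder.add(results);
            }
        }
        return resultHolder;
    }

    public static void main(String[] args) {
        int[][] areas = {{0, 0, 80, 60}, {20, 150, 60, 158}};
        String[] terms = {"test", "search"};

        // Dummy searcher: pretends every area contains one hit covering the whole area
        AreaSearcher dummy = (x1, y1, x2, y2, t, type) -> new float[]{x1, y1, x2, y2};

        List<float[]> found = searchAreas(areas, terms, 0, dummy);
        System.out.println(found.size() + " areas returned results");
    }
}
```

Passing the searcher in as a parameter keeps the loop independent of any particular search method, so you can swap the dummy for a real call once you have a decoded page.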

Thoughts and Opinions?

So what do you think of having this style of functionality within a PDF?
I would like to hear what you think.

IDRsolutions develop a Java PDF Viewer and SDK, an Adobe forms to HTML5 forms converter, a PDF to HTML5 converter and a Java ImageIO replacement. On the blog our team post anything interesting they learn about.


IDRsolutions Ltd 2019. All rights reserved.