Kieran France is a programmer for IDRSolutions. He enjoys tinkering with most things, including gadgets, code and electronics. He spends his time working on the JPedal library and our internal test suite.

Search PDF Files With Regular Expressions – Generating Teasers


Recently I have had some questions about how to display search results that include a couple of words from either side of the search result. This is something we already have set up in our simple viewer.

We achieved this quite easily because our search functionality is built around a regular expression engine. Every time we find a search result, we run a secondary search that returns the search term with the text on either side of it. To do this, we only need to add a prefix and a suffix to the search expression we have been given.

To find a word before the search term, we add the following prefix to the search expression (the backslashes are doubled because it is written as a Java string literal).

(?:\\S+\\s)?\\S*

To find a word after the search term, we add the following suffix to the search expression.

\\S*(?:\\s\\S+)?
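As a rough illustration of how the two pieces fit together, here is a minimal sketch using only `java.util.regex`. The class name `TeaserExample`, the constant names and the `teaser` method are made up for this example and are not part of JPedal. Wrapping the search expression in a non-capturing group is also my own assumption, so that an expression containing `|` still combines cleanly with the prefix and suffix.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the JPedal API.
class TeaserExample {

    // Prefix and suffix from the article, as Java string literals.
    static final String WORD_BEFORE = "(?:\\S+\\s)?\\S*";
    static final String WORD_AFTER = "\\S*(?:\\s\\S+)?";

    // Returns the first match of the search expression with up to one
    // word on either side, or null if the expression is not found.
    static String teaser(String text, String searchExpression) {
        // Assumption: group the expression so alternation stays contained.
        // For literal user input, pass Pattern.quote(term) instead.
        Pattern pattern = Pattern.compile(
                WORD_BEFORE + "(?:" + searchExpression + ")" + WORD_AFTER);
        Matcher matcher = pattern.matcher(text);
        return matcher.find() ? matcher.group() : null;
    }

    public static void main(String[] args) {
        // prints "quick brown fox"
        System.out.println(teaser("The quick brown fox jumps", "brown"));
    }
}
```

Because both pieces are optional, a term at the very start or end of the text still matches; it simply comes back with fewer surrounding words.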

From here, the easiest way to find more words before or after the search term is to repeat the prefix or suffix once for each extra word you want. Handy little expressions like these are great to keep as static final variables in a utility class or jar. If you have already written them once, why not make it easy to move them to other projects in the future, ready and waiting to be used again?
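One possible shape for such a utility class is sketched below. The names `TeaserPatterns` and `withContext` are invented for illustration, and the loop simply repeats the prefix and suffix the requested number of times, as described above. It is a sketch under those assumptions rather than the code used in JPedal.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical utility class holding the reusable expressions.
final class TeaserPatterns {

    static final String WORD_BEFORE = "(?:\\S+\\s)?\\S*";
    static final String WORD_AFTER = "\\S*(?:\\s\\S+)?";

    private TeaserPatterns() {
        // constants and static helpers only
    }

    // Builds a pattern matching the search expression with up to
    // wordsEachSide words of context before and after it.
    static Pattern withContext(String searchExpression, int wordsEachSide) {
        StringBuilder regex = new StringBuilder();
        for (int i = 0; i < wordsEachSide; i++) {
            regex.append(WORD_BEFORE);
        }
        // Assumption: group the expression so alternation stays contained.
        regex.append("(?:").append(searchExpression).append(")");
        for (int i = 0; i < wordsEachSide; i++) {
            regex.append(WORD_AFTER);
        }
        return Pattern.compile(regex.toString());
    }

    // Convenience helper: first teaser found in the text, or null.
    static String firstTeaser(String text, String searchExpression, int wordsEachSide) {
        Matcher matcher = withContext(searchExpression, wordsEachSide).matcher(text);
        return matcher.find() ? matcher.group() : null;
    }
}
```

For example, `TeaserPatterns.firstTeaser("one two three four five six seven", "four", 2)` returns `two three four five six`. Repeating optional groups many times can make matching slower on long text, so it is probably sensible to keep the word count small.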

This article is part of our Search PDF Files With Regular Expressions series. The articles in this series cover our use of regular expressions with JPedal to search PDF files. By using the link above you will find the other articles in the series.
