Update: JPDF2HTML5 has been rebranded as BuildVu and JPDFForms has been rebranded as FormVu.

Different commands to filter text files

At IDR Solutions this year, we have been looking into profiling our code using the Java JIT (Just-In-Time) compiler’s inlining functionality. This provided us with some useful information, such as:

  • Which methods are the most frequently called
  • Which methods are inlined (a hot method)
  • Which methods can be inlined but are too big

We piped the output information into a text file and soon realized that although the information it provided was useful, it was not sorted in any useful way.

This led me to look at the different sorting and filtering UNIX commands that are available. I thought I would share some of the useful commands and options I used.

I have made a dummy file (profilerOutput.txt) to show you how the commands affect the contents of my file.

@1   b hot method too big
@35   b hot method too big
@15   c too big
@2   a hot method too big
@6   e inlined
@3   b hot method too big
@12   a hot method too big
@4   d hot method too big
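If you want to follow along, something like the snippet below should recreate the dummy file. The heredoc is just one way of writing it out, and the line count at the end is only a sanity check I have added for illustration.

```shell
# Recreate the sample file used in this post.
# Three spaces separate the ID from the method name, as in the listing above.
cat > profilerOutput.txt <<'EOF'
@1   b hot method too big
@35   b hot method too big
@15   c too big
@2   a hot method too big
@6   e inlined
@3   b hot method too big
@12   a hot method too big
@4   d hot method too big
EOF

# Sanity check: print the number of lines in the file.
wc -l < profilerOutput.txt
```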

1) Grep


First, I had to extract only the data that I deemed relevant. In this case I wanted to find methods that could be inlined but were too big, so that we can work towards making these methods smaller and the code can run faster. The grep command searches each line for a given pattern. Every line that contains the pattern is written to standard output (or to a file if I redirect it: >> appends to the file, while > overwrites it). To make it easier to run other commands on the result, I redirected the output to a new file.

grep "hot method too big" profilerOutput.txt >> grep.txt

@1   b hot method too big
@35   b hot method too big
@2   a hot method too big
@3   b hot method too big
@12   a hot method too big
@4   d hot method too big

As you can see, the list now only contains the "hot method too big" lines.
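A few other grep options can be handy at this stage. The sketch below uses a small inline sample (the `sample` variable is my own stand-in for the file, not part of the profiler output) to show fixed-string matching, counting, and inverting the match.

```shell
# Hypothetical inline sample in the same shape as profilerOutput.txt.
sample='@1   b hot method too big
@15   c too big
@6   e inlined
@2   a hot method too big'

# -F treats the pattern as a fixed string rather than a regular expression.
# This pattern has no special characters, so the result is the same as without -F.
printf '%s\n' "$sample" | grep -F "hot method too big"

# -c prints only the number of matching lines.
printf '%s\n' "$sample" | grep -c "hot method too big"

# -v inverts the match and keeps the lines that do NOT contain the pattern.
printf '%s\n' "$sample" | grep -v "hot method too big"
```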

2) Sort

Next I wanted to group and count how many times a method appears. To do this I need to use the uniq command, but uniq only compares adjacent lines, so before I can do that I need to put matching lines next to each other.

I can do this using sort. Sort will reorder the lines of a file (or of multiple files, if we wish) according to the options we specify.

I needed to use an additional option with this command in order to sort on the second column, which in this case is my representation of the method name. As an example, this command will put all the a’s together, all the b’s together, etc.

-k  sorts by a key starting at the column that I specify (where the columns are separated by spaces). With -k2 the key runs from the second column to the end of the line.

sort -k2 grep.txt >> sort1.txt

@12   a hot method too big
@2   a hot method too big
@1   b hot method too big
@3   b hot method too big
@35   b hot method too big
@4   d hot method too big
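As a rough illustration of how the key works, the sketch below compares -k2 with -k2,2, which stops the key at the end of the second column. The `sample` variable is a made-up subset of my file. For this data both forms group the method names in the same way, because everything after the method name is identical within each group.

```shell
# Hypothetical inline sample in the same shape as grep.txt.
sample='@35   b hot method too big
@2   a hot method too big
@4   d hot method too big
@12   a hot method too big'

# -k2: the key is everything from field 2 to the end of the line.
printf '%s\n' "$sample" | sort -k2

# -k2,2: the key is field 2 only (the method name in this layout).
printf '%s\n' "$sample" | sort -k2,2
```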

3) Uniq

I now need to count how many times a method was called. The profiling output I received gave each method a different calling ID (@##), which isn’t significant to me.

I needed to find a way to group the methods without the command comparing the ID. This is where the uniq command comes in.

It has some handy arguments that let you ignore certain parts of a line and then group and count the lines based on the remainder of the line.

The command arguments I used were:

-c  counts the duplicates for you and displays the very first instance it encounters, prefixed with the count. This displayed number does not affect the -f argument.
-f  lets you ignore the first n fields at the start of the line (where fields are separated by spaces). In this case it will ignore the ID.

uniq -f 1 -c sort1.txt >> uniq.txt

2  @12   a hot method too big
3  @1   b hot method too big
1  @4   d hot method too big
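To see why the sort step has to come first, here is a small sketch with a made-up three-line sample (the `sample` variable is only for illustration). Because uniq only merges lines that are next to each other, the unsorted input is not grouped at all.

```shell
# Hypothetical inline sample: the two "b" lines are not adjacent.
sample='@1   b hot method too big
@2   a hot method too big
@3   b hot method too big'

# Unsorted: no two neighbouring lines match, so every line gets a count of 1.
printf '%s\n' "$sample" | uniq -f 1 -c

# Sorted by method name first: the "b" lines are adjacent and collapse into one entry.
printf '%s\n' "$sample" | sort -k2 | uniq -f 1 -c
```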

4) Sort

Finally, I wanted the methods in order, with the most frequent method first. This will let me easily identify the biggest troublemakers. I will use two different arguments here.

-n  which sorts numerically, based on the number at the start of each line
-r  which reverses the results so that the biggest value is displayed first

sort -n -r uniq.txt >> finalOutput.txt

3  @1   b hot method too big
2  @12   a hot method too big
1  @4   d hot method too big
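The -n option matters more once the counts reach double digits. The sketch below uses invented counts (the `counts` variable is not real profiler data) to compare a plain sort with a numeric one.

```shell
# Hypothetical counts, chosen so that one of them has two digits.
counts='9 d
10 a
2 b'

# Plain sort compares character by character, so "10" comes before "2" and "9".
printf '%s\n' "$counts" | sort

# -n compares the leading numbers as numbers; -r then puts the largest first.
printf '%s\n' "$counts" | sort -n -r
```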

And there you have it. Now I have an idea of my most frequently used methods that aren’t inlined.
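If you do not need the intermediate files, the four steps can also be chained with pipes. This is only a sketch: `count_hot_methods` is a name I made up for illustration, and it reads the profiler lines from standard input.

```shell
# count_hot_methods is a hypothetical helper; it reads profiler lines on stdin.
count_hot_methods() {
    grep "hot method too big" |   # keep only the hot methods that were too big
        sort -k2 |                # put identical method names next to each other
        uniq -f 1 -c |            # count them, ignoring the @ID field
        sort -n -r                # most frequent first
}

# Small inline sample in the same shape as profilerOutput.txt.
printf '%s\n' \
    '@1   b hot method too big' \
    '@15   c too big' \
    '@2   a hot method too big' \
    '@3   b hot method too big' |
    count_hot_methods
```

With the real file, the equivalent call would be count_hot_methods < profilerOutput.txt.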


Georgia Ingham

Java Developer at IDRsolutions
Georgia is a Java Developer at IDRSolutions. She is currently working alongside the team on the development of JPedal and JPDF2HTML5. Her hobbies include reading and cycling.