Lyndon Armitage Lyndon is a general Developer. He has a keen interest in AI and Games Programming and runs a blog that he periodically updates.

Save time, Test your Code Part 1


In a previous blog post, Sam mentioned that we often have to test on and account for differences between browsers, and that using a web server program such as XAMPP can help greatly during development and debugging across not only different browsers, but different platforms as well.

However, using a web server is a manual way of testing, and things can easily be missed. It’s a very useful way of debugging individual files and problems on separate platforms, but often you want to test a great many files to see whether their layout or content has changed, and checking each one individually is laborious. This is where automated tests come in!

Here at IDR Solutions we have several different automated tests for our output as well as unit tests for our actual code. In this series of blog articles I hope to describe several we have created that might be useful or inspirational to your own projects.

Testing For Changes To A Baseline Of Files

One of the output tests that we find invaluable trawls through a large collection of PDF files we have accumulated over the years, runs our PDF to HTML5 converter against them, and compares the output to that of a previous run to detect changes. This form of testing is often called black-box testing, as we treat the code as a black box and are only interested in the output it produces.

This allows us to make changes to fix a specific issue our converter is having with a file or group of files, run the tests, and see if it affects other files negatively or positively without having to manually compare each file individually, making it an excellent time saver.

Diagram showing how the baseline test works. This kind of testing can be applied to a variety of other systems.

However, there are some cases where this test does not help. For example, changing the structure of our output, as we did not so long ago, causes every file to change. That can produce a lot of false positives and hide the effect of any earlier change to the code that you haven’t yet tested.

So how do you minimise the impact of changes? Make them incremental and independent, that’s how! Because each developer has access to these tests, we can run them before any commit to the main repository, and we make sure to tell each other what our changes affect so that nobody is hit by a massive unexpected change in their test output, something we call a ‘baseline breaker’.

This test can also take a while to complete for a large selection of files (between 10 and 20 minutes on our newest machines). This can be worked around by limiting the test data when it’s known what is likely to change; e.g. after already running the full tests on a change that still needs work, we can limit the test data to only the files affected by that change.
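
As an illustration of limiting the test data, here is a minimal Python sketch. It is not the actual test code: the function name `select_test_files`, the idea of passing glob-style patterns, and the assumption that the test files are `.pdf` files under a single folder are all made up for this example.

```python
from fnmatch import fnmatch
from pathlib import Path
from typing import Sequence


def select_test_files(corpus: Path, patterns: Sequence[str] = ()) -> list[Path]:
    """Return the PDF files under `corpus` that the test run should cover.

    With no patterns, every PDF is returned (a full baseline run).
    With patterns, only files whose path relative to `corpus` matches
    at least one glob-style pattern are kept (a limited run).
    """
    pdfs = sorted(p for p in corpus.rglob("*.pdf") if p.is_file())
    if not patterns:
        return pdfs
    return [
        p
        for p in pdfs
        if any(fnmatch(p.relative_to(corpus).as_posix(), pat) for pat in patterns)
    ]
```

Calling `select_test_files(Path("corpus"), ["forms/*"])` would then restrict a run to the files in a hypothetical `forms` subfolder, while omitting the patterns runs everything.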

The key feature of a test like this, the ability to test against a large selection of files or data automatically, is also partly a downside: if you cannot find or produce a large data set, it won’t necessarily pick up all the bugs you may introduce.

For more information on this test you can read a previous blog post by Leon.

Next Time

In the next blog post I will talk about how to set up some automated tests similar to the baseline test but geared more towards testing the functionality of a program or, in the case of our HTML forms, its output.

This post is part of our “Testing Articles Index”; in these articles we provide a guide to testing.



2 Replies to “Save time, Test your Code Part 1”

  1. Hello,

    Let’s think about the following example: the JPedal calls that convert a PDF page into a JPEG image. Sometimes the resulting image is lacking a table, a logo image or something else. Obviously a human comparing the tested image against a baseline image would immediately spot any difference, but how do you automate this kind of test (if you do)?

    Do you compare to a baseline JPEG at byte level (this would mean that applying the same method to the same PDF from two different builds produces the same bytes)?
    Or is there no need for an image comparison because such problems are always with the PDF structure and are therefore caught by unit tests?

    Thank you,

    1. Hi Pierangelo,

      Thanks for the questions!

      The example given in this article refers to our baseline tests for our HTML conversion. Currently when automatically comparing files it does the following:

      1. First it checks that the baseline file it is checking exists in the new test output. If it doesn’t we output a message telling us a file is missing from the new output.
      2. Next it checks that the new test file we are checking against exists in the baseline. If it doesn’t we output a message telling us a new file has been found that’s not in the baseline.
      3. If both of these are true it will then check the length/size of the files. If they differ it will output a message telling us the file has changed.
      4. Finally, if the files’ lengths are the same and the files in question are text-based (.html, .svg, .css etc.), it checks their contents against each other and outputs a message if they do not match.
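
The four steps above could be sketched in Python roughly as follows. This is an illustration rather than the actual IDR Solutions test code: the function names, the message wording and the exact set of text extensions are assumptions made for the example.

```python
from pathlib import Path

# Extensions treated as text; the list here is an assumption and can be extended.
TEXT_SUFFIXES = {".html", ".svg", ".css"}


def relative_files(root: Path) -> set[str]:
    """Collect every file under `root` as a path relative to `root`."""
    return {p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()}


def compare_to_baseline(baseline: Path, output: Path) -> list[str]:
    """Compare a new test output folder against a baseline folder."""
    messages = []
    base_files = relative_files(baseline)
    new_files = relative_files(output)

    # Step 1: files in the baseline that are missing from the new output.
    for name in sorted(base_files - new_files):
        messages.append(f"MISSING: {name}")

    # Step 2: files in the new output that are not in the baseline.
    for name in sorted(new_files - base_files):
        messages.append(f"NEW: {name}")

    for name in sorted(base_files & new_files):
        old, new = baseline / name, output / name
        # Step 3: a different size means the file has changed.
        if old.stat().st_size != new.stat().st_size:
            messages.append(f"CHANGED (size): {name}")
        # Step 4: same size, so compare contents, but only for text-based files.
        elif old.suffix.lower() in TEXT_SUFFIXES and old.read_bytes() != new.read_bytes():
            messages.append(f"CHANGED (content): {name}")
    return messages
```

An empty list returned from `compare_to_baseline(Path("baseline"), Path("output"))` would mean that no differences were detected under these rules; the two folder names are placeholders.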

      For image files within our HTML baseline (normally backgrounds or images on the page), we don’t check anything beyond their length/size. This normally works well as our PDF2HTML5 code extends some of the same code our PDF2IMAGE code does, meaning we’d actually pick up on missing lines etc. in the baseline test.

      However, I do believe we have some tests specific to the PDF2IMAGE code that test on a byte level and highlight any discrepancies found in a similar manner to the baseline tests. They are normally run before making a build live and when we are working on our image handling code.

      Hope that helps,


IDRsolutions Ltd 2022. All rights reserved.