I recently rewrote the regression tests for our PDF to HTML5 converter to add a variety of improvements, so I thought it would make an interesting article to explain how we test updates to our PDF to HTML5 (and PDF to SVG, PDF to JavaFX, etc.) converter.
The software development methodology that we follow is essentially an agile one – we offer a monthly release, but in addition to that, we offer a daily release with the latest developments and fixes. This means that we are continually running our regression tests, not only to make sure that our changes aren’t breaking anything, but also to help us develop those changes.
Over the last 13 years we have built up quite a collection of PDF files that are interesting to us for a multitude of reasons. Our blog is full of things we’ve learned from these files, and occasionally a complaint or two that a file is doing silly things.
So, we have a directory that contains semi-organised PDF files that we run our tests on. From this we generate a baseline. That is, we run our converter on all of the PDFs, and we store the output in another directory. For those who are unfamiliar, each converted PDF produces a directory with the same name as the PDF, and within it is an HTML file and, if the page requires one, a directory for each page. Each page directory contains a separate directory for fonts, images and shades (and anything else we feel needs a directory).
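As a rough illustration of the baseline step, here is a minimal Python sketch. It is not our actual test code: `build_baseline` and the `convert` callable are hypothetical names standing in for whatever invokes the converter, and the sketch assumes the output directory is named after the PDF without its `.pdf` extension and mirrors any sub-folders of the PDF directory.

```python
from pathlib import Path
from typing import Callable


def build_baseline(pdf_dir: Path, baseline_dir: Path,
                   convert: Callable[[Path, Path], None]) -> list[Path]:
    """Convert every PDF under pdf_dir and keep the output under baseline_dir.

    `convert(pdf, out_dir)` is a hypothetical stand-in for the real converter.
    """
    outputs = []
    for pdf in sorted(pdf_dir.rglob("*.pdf")):
        # One output directory per PDF, named after the PDF itself.
        # Assumption: the extension is dropped and sub-folders are mirrored.
        out_dir = baseline_dir / pdf.relative_to(pdf_dir).with_suffix("")
        out_dir.mkdir(parents=True, exist_ok=True)
        convert(pdf, out_dir)
        outputs.append(out_dir)
    return outputs
```

Passing the converter in as a function keeps the sketch independent of any particular command line or API, and the same function can be reused to produce the "update" output that is later compared against the baseline.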
We can now start breaking things, er, developing. When we think we have a fix, or something that we want to test against the rest of our files, we can run our regression tests. These convert each PDF in turn and then compare the output with the baseline that we created earlier. If a PDF fails the comparison, we get notified, and a report gets generated. In the report we get quick links to the PDF, and to the before and after versions of the pages that have failed. Under each page we get a list of files that have changed, and a reason for the failure (file not in baseline, file not in update, text has changed, length has changed, etc.).
This allows us to see very quickly what effect our changes have had, and whether a change is a good one or something that needs more work and testing.
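The comparison described above can be sketched in a few lines of Python. Again, this is an illustrative sketch rather than our real implementation: `compare_output` is a hypothetical name, and it assumes that "length has changed" means the byte length differs while "text has changed" means the length is the same but the content is not.

```python
from pathlib import Path


def compare_output(baseline: Path, update: Path) -> list[tuple[str, str]]:
    """Return (relative path, reason) for every file that fails the comparison."""

    def files(root: Path) -> set[str]:
        # Relative paths of all files below root, with forward slashes.
        return {p.relative_to(root).as_posix()
                for p in root.rglob("*") if p.is_file()}

    before, after = files(baseline), files(update)
    failures = []
    for rel in sorted(before | after):
        if rel not in before:
            failures.append((rel, "file not in baseline"))
        elif rel not in after:
            failures.append((rel, "file not in update"))
        else:
            old = (baseline / rel).read_bytes()
            new = (update / rel).read_bytes()
            if len(old) != len(new):
                # Assumption: "length" refers to the size in bytes.
                failures.append((rel, "length has changed"))
            elif old != new:
                failures.append((rel, "text has changed"))
    return failures
```

An empty list means the PDF passes; anything else is what would feed the report, grouped under the page each file belongs to.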
This post is part of our “Testing Articles Index”. In these articles we provide a guide to testing.
IDRsolutions develop a Java PDF library, a PDF forms to HTML5 converter, a PDF to HTML5 or SVG converter and a Java Image Library that doubles as an ImageIO replacement. On the blog our team post about anything interesting they learn about.