Responsive CSS Testing Made Simple with the BBC’s Wraith

A Q&A with BBC News developer David Blooman

Wraith in action

Last November, the BBC News team created a front-end regression tool that collects and diffs screenshots to automatically highlight discrepancies produced (intentionally or otherwise) by CSS changes. Last week, the team released Wraith as an open source tool and explained a little of its background:

This tool came about as we continued to see small changes from release to release, as more teams joined the project, small front end bugs cropped up more frequently. Our solution was to compare screenshots of pages, at the pull request level and when merged into master. This process produces fewer bugs and unintended changes, while also being able to ensure intentional changes appear correctly.

We spoke with David Blooman, who developed the tool last fall and worked with Simon Thulbourn to prepare it for public release.

The Background

Q. What kind of testing set-up were you working with before making Wraith, and why did you decide to make the switch?

In November, we had very little in the way of visual regression testing for news; it was mostly manual. We were still at an early stage in the project, building features with a relatively small team, and I was the only one testing. We had automated acceptance and unit tests, but visual changes were difficult to spot and fix before releasing. I decided that we needed to create an automated tool to ensure we would be able to scale the website up to desktop features without creating lots of visual regressions.

Q. Why did you build custom, instead of using one of the existing commercial or open source testing tools?

Initially, I did look at all the available tools, but they didn’t suit our needs. I wanted full-page screenshots to ensure total test coverage, which would help identify every change on the page, not just a specific part. I also found that the other open source tools were great at what they set out to do, but didn’t have much scope for responsive web pages. This was key for us: the ability to quickly add a URL and browser width and kick off a test. There were elements of lots of different tools that I liked, but I decided that bringing them together into a new tool would be better suited to our team.
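To illustrate the "add a URL and browser width and kick off a test" idea, here is a minimal Python sketch of how a list of screenshot jobs could be expanded from a small configuration. It is not Wraith's code or its config format: the `CONFIG` layout, the example domains, the widths, the file naming, and the `screenshot_jobs` helper are all assumptions made for illustration.

```python
from itertools import product

# Hypothetical configuration: every key, URL, and width below is an
# illustrative assumption, not Wraith's actual config format.
CONFIG = {
    "domains": {"live": "http://www.example.com", "test": "http://test.example.com"},
    "paths": {"home": "/", "news": "/news"},
    "widths": [320, 600, 768, 1024],
}


def screenshot_jobs(config):
    """Return one (domain_label, url, width, filename) tuple per
    combination of domain, path, and browser width."""
    jobs = []
    for (domain_label, base), (path_label, path), width in product(
        config["domains"].items(), config["paths"].items(), config["widths"]
    ):
        # Hypothetical naming scheme: one folder per page, one file per width and domain.
        filename = f"{path_label}/{width}_{domain_label}.png"
        jobs.append((domain_label, base + path, width, filename))
    return jobs


if __name__ == "__main__":
    for job in screenshot_jobs(CONFIG):
        print(job)
```

With a layout like this, covering a new page or breakpoint means adding one entry to `paths` or `widths`; each resulting job would then be handed to whatever takes the screenshot.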

Under the Hood


Q. Wraith accomplishes three major things, right? It takes screenshots of two designated sites at a lot of resolutions using PhantomJS, diffs them using ImageMagick, and then spits out a set of images that flag differences between the screenshots of the two sites. Which of these functions was the trickiest to set up and get working?

You are right, the premise is very simple. PhantomJS does most of the work for us by getting the screenshots; the trickiest function was the comparing. I experimented a lot with other comparison tools. ChunkyPNG was one that I used for a while, but it was slower than ImageMagick. I also found that PhantomJS had problems with anti-aliasing, and this caused ImageMagick to output a diff image that showed many more changes than there actually were. I had to pass extra arguments to ImageMagick in order to get a diff image that showed real changes, but not changes caused by inconsistencies in the images themselves. The final part was how to express the changes as a value, so I decided on the number of pixels that had changed, which is then output into a .txt file. This was also affected by anti-aliasing, so that had to be taken into consideration too.
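The idea of counting changed pixels while ignoring anti-aliasing noise can be sketched in a few lines of Python. This is a simplified stand-in, not the ImageMagick comparison Wraith runs: the `changed_pixel_count` function, the per-channel tolerance rule, and the 20% default are assumptions chosen for illustration, and images are represented as plain lists of RGB tuples rather than real PNG files.

```python
def changed_pixel_count(image_a, image_b, tolerance=0.2):
    """Count pixels whose colour differs by more than `tolerance`.

    Images are lists of rows, each row a list of (r, g, b) tuples with
    channel values from 0 to 255. `tolerance` is a fraction from 0.0 to 1.0
    of the full channel range (an assumed, simplified rule).
    """
    same_shape = len(image_a) == len(image_b) and all(
        len(row_a) == len(row_b) for row_a, row_b in zip(image_a, image_b)
    )
    if not same_shape:
        raise ValueError("images must have identical dimensions")

    threshold = tolerance * 255
    changed = 0
    for row_a, row_b in zip(image_a, image_b):
        for pixel_a, pixel_b in zip(row_a, row_b):
            # A pixel counts as changed only if some channel moved past the threshold.
            if max(abs(a - b) for a, b in zip(pixel_a, pixel_b)) > threshold:
                changed += 1
    return changed


WHITE = (255, 255, 255)
NEAR_WHITE = (250, 250, 250)  # the kind of slight shift anti-aliasing can produce
RED = (255, 0, 0)

before = [[WHITE, WHITE], [WHITE, WHITE]]
after = [[NEAR_WHITE, WHITE], [WHITE, RED]]

if __name__ == "__main__":
    print(changed_pixel_count(before, after))               # tolerant comparison
    print(changed_pixel_count(before, after, tolerance=0))  # strict comparison
```

With the tolerance applied, only the red pixel is counted; with a tolerance of zero, the near-white pixel is counted as well, which is the kind of inflated result the answer above describes.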

Q. Did you encounter any especially interesting (or troublesome) challenges while designing and building Wraith?

The biggest challenge was in making the compare job actually work. PhantomJS will take the screenshot fine, but the images that are created may be different sizes. This is fine for width, as we have already set this in the config file, but height was something that changed a lot. A one-pixel height increase would cause ImageMagick to error, so we initially put a fixed height into Wraith. This is fine in some ways, but it means very large images are created, with potentially a lot of whitespace. I added a piece of functionality after we open-sourced Wraith that actually fixed this issue by measuring the image sizes, but it was something that had caused us lots of issues in the beginning.
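One way to deal with screenshots of different heights is to measure both images and pad the shorter one before comparing. The interview does not say what Wraith does once the sizes are measured, so the Python sketch below is only an assumed approach; `pad_to_height`, `match_heights`, and the white fill colour are hypothetical, and images are again plain lists of RGB tuples.

```python
WHITE = (255, 255, 255)


def pad_to_height(image, height, fill=WHITE):
    """Return a copy of `image` with rows of `fill` appended until it has
    `height` rows. Images that are already tall enough are copied unchanged."""
    width = len(image[0]) if image else 0
    missing = max(height - len(image), 0)
    return [list(row) for row in image] + [[fill] * width for _ in range(missing)]


def match_heights(image_a, image_b, fill=WHITE):
    """Measure both images and pad the shorter one to the taller one's height."""
    target = max(len(image_a), len(image_b))
    return pad_to_height(image_a, target, fill), pad_to_height(image_b, target, fill)


BLACK = (0, 0, 0)
short_page = [[BLACK, BLACK], [BLACK, BLACK]]               # 2 rows
tall_page = [[BLACK, BLACK], [BLACK, BLACK], [BLACK, BLACK]]  # 3 rows

if __name__ == "__main__":
    padded_short, padded_tall = match_heights(short_page, tall_page)
    print(len(padded_short), len(padded_tall))
```

Because the target height comes from the images themselves, the padding is never more than the difference between the two screenshots, which avoids the large blank areas a fixed height can produce.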

Wraith In Use

Q. How has your use of Wraith changed your production process? What kind of time savings are you seeing?

In some situations, such as CSS file path changes, we can easily check that we haven’t broken anything, so the monotony of clicking through the website is replaced with an automated script. We can also use Wraith to greatly reduce regression testing: moving most of the visual changes to an automated solution allows us to spend more time testing troublesome browsers and ensuring that the core functionality is ready for release. In terms of time saved, I would say it is significant across all the stages of testing a feature. It starts with a pull request, then moves to the test servers, with another run on staging. By hand, this would take hours to complete given all the pages we test with Wraith, but a complete test for us using Wraith takes less than 30 minutes.

Q. With the success of Wraith, are you planning any similar projects to support development and workflow?

We are always looking to improve our workflow and reduce the time spent waiting on tests to run. To that end, a project is in the works to build an automated system to test pull requests, running all the tests and then feeding that information back into GitHub to alert the team to whether a pull request should be merged. This will save developers time, as tests will not have to be run locally, meaning faster feedback and reviews.





