• 19 Feb 2021
  • Neos
  • Martin Ficzel

Automated testing of screenshots with Sitegeist.Monocle & BackstopJS

The frontends of modern websites are complex constructs in which many components are reused in multiple places, so the effects of an individual change are often hard to estimate. Supposedly harmless adjustments frequently have an impact in unexpected places.

To avoid visual inconsistencies, Neos frontends at sitegeist have been developed and tested for several years now using our component styleguide Sitegeist.Monocle. To keep components visually isolated, styles are strictly separated using CSS Modules. Together with our four-eyes principle and multiple test stages, this approach has significantly improved quality assurance. Still, changes occasionally slipped through – especially when they caused unexpected side effects in unrelated areas.

The obvious solution – test everything – is not only impractical, but also unreliable. Not every team member perceives visual differences the same way. Colors, spacing, or typography can be subtle, and the old joke that backend developers only see 16 colors has some truth to it.

Since visual perception is subjective, an objective measurement makes sense. We’re engineers, after all. The idea is simple: take a screenshot, compare it to a reference image, and raise an alert if too many pixels differ.
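The idea can be illustrated with a minimal sketch. This is not how BackstopJS compares images internally; it only shows the principle of counting differing pixels and applying a threshold. Images are modelled as flat RGBA byte arrays, and the names `mismatchRatio`, `hasRegression`, `tolerance` and `maxMismatch` are made up for this illustration.

```javascript
// Minimal sketch of threshold-based screenshot comparison.
// Assumption: both images are decoded into flat RGBA byte arrays (4 bytes per pixel).

// Returns the fraction (0..1) of pixels that differ between two images.
// `tolerance` is a hypothetical per-channel tolerance for tiny rendering noise.
function mismatchRatio(referenceImage, candidateImage, tolerance = 0) {
  if (referenceImage.length !== candidateImage.length || referenceImage.length % 4 !== 0) {
    throw new Error("Images must have identical dimensions and RGBA layout");
  }
  const pixelCount = referenceImage.length / 4;
  if (pixelCount === 0) return 0;

  let differing = 0;
  for (let i = 0; i < referenceImage.length; i += 4) {
    // A pixel counts as different as soon as one channel deviates too much.
    for (let c = 0; c < 4; c++) {
      if (Math.abs(referenceImage[i + c] - candidateImage[i + c]) > tolerance) {
        differing++;
        break;
      }
    }
  }
  return differing / pixelCount;
}

// Raises the "alert": true if more than `maxMismatch` of all pixels differ.
function hasRegression(referenceImage, candidateImage, maxMismatch = 0.001) {
  return mismatchRatio(referenceImage, candidateImage) > maxMismatch;
}

// Usage: a 2x2 image in which one of the four pixels changed.
const referenceImage = new Uint8Array(16).fill(255);
const candidateImage = Uint8Array.from(referenceImage);
candidateImage[4] = 0; // red channel of the second pixel

console.log(mismatchRatio(referenceImage, candidateImage)); // 0.25
console.log(hasRegression(referenceImage, candidateImage)); // true
```

The threshold is the important design decision: set too low, harmless anti-aliasing noise fails the test; set too high, small but real regressions pass unnoticed.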

However, this approach doesn’t work on actual CMS pages, since editorial content (text changes, image updates) would trigger false positives. This is where Monocle comes in: our styleguide already includes all components in isolation, along with stable dummy data.

Testing Tool - BackstopJS

To take and compare screenshots we use BackstopJS, which lets us define a set of scenarios based on URLs. BackstopJS opens every defined scenario, takes a screenshot, and compares it with a reference image.

To connect BackstopJS with the Monocle styleguide, we created an additional package, Sitegeist.Monocle.BackstopJS. It generates a BackstopJS configuration file containing scenarios for all elements of a styleguide in all breakpoints.
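To give an impression of what such a generated configuration could contain, here is a rough sketch that builds a BackstopJS configuration object from a list of prototypes and breakpoints. It is not the code of Sitegeist.Monocle.BackstopJS. The function `buildBackstopConfig`, its input shape, the preview URL pattern and the prototype names are assumptions made for illustration; only the configuration keys (`id`, `viewports`, `scenarios`, `paths`, `engine`, `report`, `misMatchThreshold`) are standard BackstopJS options.

```javascript
// Sketch: build a BackstopJS configuration from styleguide prototypes.
// Assumption: the preview URL pattern below is hypothetical, not Monocle's actual route.
function buildBackstopConfig({ baseUri, prototypes, breakpoints }) {
  return {
    id: "styleguide",
    // One viewport per breakpoint; BackstopJS runs every scenario in every viewport.
    viewports: Object.entries(breakpoints).map(([label, width]) => ({
      label,
      width,
      height: 768,
    })),
    // One scenario per styleguide element.
    scenarios: prototypes.map((name) => ({
      label: name,
      url: `${baseUri}/preview?prototypeName=${encodeURIComponent(name)}`, // hypothetical URL
      misMatchThreshold: 0.1, // allowed mismatch in percent
    })),
    paths: {
      bitmaps_reference: "backstop_data/bitmaps_reference",
      bitmaps_test: "backstop_data/bitmaps_test",
      html_report: "backstop_data/html_report",
    },
    engine: "puppeteer",
    report: ["browser"],
  };
}

// Usage with made-up prototype names and breakpoints.
const styleguideConfig = buildBackstopConfig({
  baseUri: "http://127.0.0.1:8081",
  prototypes: ["Vendor.Site:Component.Button", "Vendor.Site:Component.Teaser"],
  breakpoints: { mobile: 375, desktop: 1280 },
});

console.log(JSON.stringify(styleguideConfig, null, 2));
```

Written to a JSON file, an object of this shape can be passed to BackstopJS via its `--config` option.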

Consistent font rendering

Unfortunately, font rendering differs considerably between operating systems, so BackstopJS reports errors when the reference and test screenshots are created on different operating systems.

Fortunately, we have been using Docker in the form of DDEV as the basis of our development setups for some time now and therefore have a defined Linux environment for each developer. Only the chromium package had to be added to the .ddev/config.yaml.

Image Lazy Loading

Modern websites load many images only when the visitor actually scrolls close to them. This can save a lot of bandwidth and loading time.

Unfortunately, this behavior is not helpful in automated screenshot tests. For one thing, screenshots often capture empty image areas. Far more problematic, however, is that the deferred loading is not fully deterministic, so the same screenshot is sometimes created with and sometimes without loaded images, which of course leads to error messages from BackstopJS.

Our solution is to use a special Flow context in which lazy loading and other functions that interfere with testing are deactivated by default. The visual regression tests then do not have to wait for images to load and run at the maximum possible speed.

Alternatively, there are other approaches to lazy loading in visual regression testing. For example, JavaScript code that triggers the loading of the images can be injected at the start of a BackstopJS run. However, the approach using a special Flow context also allows us to configure other functions specifically for the tests. For example, we avoid database access during the tests by deactivating functions such as the editorial maintenance of translations in this context.

Continuous Integration in GitLab CI

The final challenge was the integration into GitLab CI. On the one hand, the screenshot tests had to run in parallel with the existing code linters and tests so that developers receive prompt feedback and changes are not delayed. On the other hand, we wanted to avoid having to run `yarn`, `composer install`, and `yarn build` again in every step.

Our solution is based on GitLab CI build artifacts. Essentially, a `build` stage installs the website, builds the JS and CSS, and warms up the Flow cache. The subsequent `test` and `deploy` stages can then start with a fully installed system and work in parallel where appropriate.

For the BackstopJS tests to run inside the GitLab Runner, we had to provide a special Docker container that includes the chromium package.
