Mutation testing: the best code-quality tool you’ve never heard of

Quality software is a combination of quality code and quality tests. It’s long been established that basic code coverage is a low-quality metric, as it largely shows only that the tests execute the code doing what it was designed to do. However, the real world has edge cases, interruptions, disconnections, attacks and hacks – how does your code perform when it’s executed in a way it wasn’t designed for?

This is where mutation testing comes in. Much like Netflix’s ‘Chaos Monkey’, which randomly terminated services in their live environment to force the engineers to build with resiliency in mind, mutation testing attacks your code from the *inside*.

Mutation testing works by subtly changing how your program works (like replacing if (X > Y) with if (X < Y)) to see if your tests notice that the logic has been inverted. Repeat this for every statement in the code base, and you have mutation testing in action. There’s a whole host of other clever mutations too, but that’s for another deep-dive post.
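To make that concrete, here is a minimal hand-rolled sketch in Python. It is not how any particular mutation testing tool is implemented, and every name in it (is_over_limit, weak_suite, strong_suite, survivors) is made up for illustration. It builds two mutants of a single comparison and checks which ones each test suite catches.

```python
import operator


def make_is_over_limit(compare=operator.gt):
    """Build the function under test; 'compare' is the operator we mutate."""
    def is_over_limit(amount, limit):
        return compare(amount, limit)
    return is_over_limit


# The original code uses '>'. Each mutant swaps in a different operator.
original = make_is_over_limit(operator.gt)
mutants = {
    "> -> <": make_is_over_limit(operator.lt),    # inverted logic
    "> -> >=": make_is_over_limit(operator.ge),   # off-by-one at the boundary
}


def weak_suite(fn):
    """Only checks values far away from the boundary."""
    return fn(10, 5) is True and fn(1, 5) is False


def strong_suite(fn):
    """Also checks the boundary case where amount == limit."""
    return weak_suite(fn) and fn(5, 5) is False


def survivors(suite):
    """Return the mutants the suite fails to notice (it still passes)."""
    return [name for name, mutant in mutants.items() if suite(mutant)]


# Both suites pass against the unmodified code, as they should.
assert weak_suite(original) and strong_suite(original)

print("weak suite survivors:", survivors(weak_suite))
print("strong suite survivors:", survivors(strong_suite))
```

The inverted comparison is caught by almost any test, but the boundary mutant slips past the weak suite. That surviving mutant is exactly the kind of gap a mutation testing run reports back to you, and adding the boundary test is the fix.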

And it works spectacularly well. When I introduced this with my team at the Student Loans Company, there was some initial scepticism of the utility of this ‘random error tool’. But as the team strengthened the test cases to address the bugs that slipped through, something interesting happened – they started refactoring the code. In fact, we’d inadvertently gamified the testing process, and the whole team was invested! What’s more, the code became much more decoupled, simpler, and modular – all good things to have in your code base.

In fact, I don’t think we ever had a production bug with the system we developed using this method – no mean feat! Mutation testing highlighted hundreds of issues at source, which prevented them from entering the code base. Regressions were rare too, a nice side effect of decoupling.

Overall, mutation testing made a transformative change to how my team wrote software. It made them better developers too, as they all started writing more defensive code to avoid triggering mutation issues in the first place.

There are some who will say that mutation testing is resource intensive, and that they couldn’t possibly apply it to their huge, legacy code base. My response: compute is cheap and it can be run alongside your existing code pipeline. The benefits to your codebase are immense, but to transform how your team writes code? Even greater.



