April 8, 2014


When Might We Use Algorithms To Calculate Expected Test Answers?

Just a short placeholder post, inspired by a conversation with a client's team today.

The question raised was: is it ever right to use an algorithm to calculate expected test results?

The advice from TDD veterans is to try to avoid duplicating source code in our tests. This may have been misinterpreted - like much oversimplified advice (and I'm just as guilty of this) - as "don't use algorithms in your tests".

Looking at a very simple example, a program that generates Fibonacci sequences of a specified length, perhaps I can illustrate what the advice should be.

First of all, when might we use an algorithm or general formula to generate expected test answers instead of just supplying the data for specific test cases?

The answer to that question is another question: when might you use an algorithm to generate a Fibonacci sequence instead of just hard-coding the sequence?

If someone said to me "Jason, write me a program to generate the first 8 Fibonacci numbers", my code would look something like:

return new int[] {0, 1, 1, 2, 3, 5, 8, 13};

...because that would be much simpler than an algorithm.

If someone asked me to write a program to generate up to, say, 50 Fibonacci numbers, an algorithm would be much shorter and simpler than typing out that whole sequence.
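For comparison, here is a minimal sketch of what the algorithmic version might look like, assuming Java (the hard-coded line above reads as Java). The class and method names are hypothetical, chosen only for illustration. One detail worth noting: the sketch returns long rather than int, because the 48th Fibonacci number no longer fits in an int.

```java
import java.util.Arrays;

// Hypothetical names, for illustration only.
class FibonacciGenerator {

    // Returns the first `length` Fibonacci numbers, starting 0, 1, 1, 2, ...
    // long is used because the 48th number (2971215073) exceeds Integer.MAX_VALUE.
    static long[] generate(int length) {
        if (length < 0) {
            throw new IllegalArgumentException("length must not be negative");
        }
        long[] sequence = new long[length];
        for (int i = 0; i < length; i++) {
            // The first two numbers are 0 and 1; each later one is the sum of the previous two.
            sequence[i] = (i < 2) ? i : sequence[i - 1] + sequence[i - 2];
        }
        return sequence;
    }

    public static void main(String[] args) {
        // prints [0, 1, 1, 2, 3, 5, 8, 13]
        System.out.println(Arrays.toString(generate(8)));
    }
}
```

For 8 numbers the hard-coded array is still simpler; the loop only starts to pay for itself as the requested length grows.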

The same goes for tests: if we have a handful of test cases, it might seem overkill to write an algorithm to generate the expected answers. On the other hand, if we wanted to exhaustively test our implementation (e.g., test all the Fibonacci numbers in a sequence of 50, or test a maths algorithm over a range from 0 to 10,000, incrementing by one each time), then hard-coding the answers would be a heck of a lot of work.

In those cases, I'd use an algorithm in the tests. But, and this is very important, not the same algorithm as the solution I'm testing.

Uncle Bob Martin sums it up best with his analogy for unit testing of "double-entry book-keeping". If you've ever done your own accounts, you may well have learned that - although it's extra work - it can save a lot of time and heartache later if we take time to double check all our figures.

Double-entry book-keeping works like sign-up forms that require us to type in our email address twice. It compares one piece of information to another piece of information that is in no way derived from the original (e.g., it's typed in twice) on the understanding that they should be, if correct, the same. Of course, we could enter it wrong both times, but the chances of that happening are greatly reduced. Maybe I type it in right 99% of the time, leaving a 1% chance of a mistake. With double-entry, and assuming the two mistakes are independent, the odds of getting it wrong both times are 1% of 1%, or 0.01%.

Going back to the Fibonacci example, if I wanted to exhaustively test my solution across a large range of numbers, I might choose to use an iterative solution in my source code, and calculate expected answers using a tail-recursive algorithm - maybe as a one-off job for future test runs, if it's computationally very expensive.

The two algorithms may well have different performance properties, but the answers they produce should be identical. The chances of both algorithms being wrong are dramatically smaller than one of them being wrong by itself.
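As a rough sketch of that idea, assuming Java and hypothetical names throughout: an iterative solution stands in for the code under test, and a separately written recursive function, tail-recursive in form, supplies the expected answers. Java does not eliminate tail calls, but the recursion here never goes deeper than 50 frames, so that is not a concern for this range.

```java
// Hypothetical names, for illustration only.
class FibonacciCrossCheck {

    // Stand-in for the solution under test: an iterative generator.
    static long[] generateIteratively(int length) {
        long[] sequence = new long[length];
        for (int i = 0; i < length; i++) {
            sequence[i] = (i < 2) ? i : sequence[i - 1] + sequence[i - 2];
        }
        return sequence;
    }

    // Test-side oracle: the Fibonacci number at a zero-based index,
    // computed by a differently structured, accumulator-passing recursion.
    static long expectedAt(int index) {
        return expectedAt(index, 0L, 1L);
    }

    private static long expectedAt(int remaining, long current, long next) {
        return remaining == 0 ? current : expectedAt(remaining - 1, next, current + next);
    }

    // Returns the index of the first disagreement between the two algorithms, or -1 if none.
    static int firstMismatch(int length) {
        long[] actual = generateIteratively(length);
        for (int i = 0; i < length; i++) {
            if (actual[i] != expectedAt(i)) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        int mismatch = firstMismatch(50);
        System.out.println(mismatch == -1
                ? "all 50 numbers agree"
                : "mismatch at index " + mismatch);
    }
}
```

In a real suite the comparison would sit inside a test method of whatever framework is in use. The part that matters is that expectedAt is written separately from the solution rather than calling it or being copied from it.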

Just to finish with a cautionary note: be sure to let the test code - and the duplication in it - lead you to a general algorithm, rather than leaping in straight away and writing one in the hope you'll need it for squillions of future tests. There's an art to refactoring tests into general specifications, but that's for another post.
