October 22, 2008

...Learn TDD with Codemanship

Musings On Testing Multithreaded Logic

I'm currently pondering the problem of multi-threaded logic and concurrent testing.

Threads are radioactive, man! You should try to stay well clear, and if you can't avoid them, handle with extreme caution.

You see, if I write a line of code, then there are only certain ways it can be wrong. If I write two lines of code, that doubles the ways it the code can be wrong. If I write three lines, then that pretty much doubles it again. Do you see where I'm going with this? Every new piece of logic can exponentially increase the probability of getting it wrong. Which is one very good reason to favour short and simple methods.

If we keep our methods short and simple, and test them effectively, then we can get a grip on our logic and produce reliable results.

But when we add the possibility of more than one thread of execution happening at the same time, then we get a literal explosion of possible combinations of interleaving of logic. If no data is shared between threads, we're probably okay. Which is why things like Java servlets and ASP.NET web forms are relatively "thread-safe", and dunderheads like me and you can get away with using this highly isolated concurrent processing.

But the moment threads start sharing data, we get problems. To illustrate this, I've tried to visualise an example interleaving of executable code statements from two threads both trying to transfer funds between the same accounts.

In this example, thread 1 checks the pre-condition for withdrawing money from the payer's account, and immediately after so does thread 2. Now both threads believe that account1 has sufficient funds to cover both transfers. Which, of course, it doesn't. Thread 1 calls the withdraw() method first, leaving only 20 in the account to cover thread 2's 50 withdrawal. But thread 2 is none the wiser, and calls withdraw() anyway. Oops!

A different interleaving of the code being executed by the two threads leads to a wholly different outcome.

There are several unique interleavings that can be achieved from just these few lines of code being executed by just two concurrent threads. If we wanted to ensure we get the multi-threaded logic right, I'd want to test all of them.

One of the real drawbacks of trying to test multi-threaded code is that the operating system (or virtual machine) decides the priority of threads and the interleaving of executable statements, and this is usually in competition with a whole bunch of running programs, all doing complex things. Effectively, you should not expect the same interleavings each time, and therefore you should expect your multi-threaded tests to be unpredictable and difficult to reproduce.

This is not very helpful, obviously. Automated tests should be totally predictable and reproducible. But how do we achieve this?

My proposal is to create a tool that reads your multi-threaded code for a test, and generates a set of Command classes to capture every executable step. It would build a sequence of commands that are semantically identical to the multi-threaded code, but that we can now explicitly choose when to execute. We can then create copies of this sequence, and calculate every possible interleaving for a given number of concurrent threads.

Not a trivial undertaking, I grant you. But quite do-able, in my opinion. Applicable test assertions could be checked before and after each concurrent step is executed.

It still needs a lot of thought, but I think this idea might have some merit, so I'm going to look into it a bit deeper.

Posted 9 years, 8 months ago on October 22, 2008