January 28, 2012
Non-functional Test-Driven DevelopmentIt's the question that comes up everytime I introduce someone to Test-driven Development: "But what about performance?"
The thing about TDD is that adage "be careful what you wish for" applies. The solution we end up with is constrained by tests. There may be a million and one ways of achieving a goal, and some will perform better than others. The trick with TDD is to ask the right questions.
What I like about TDD - and similar precise approaches to defining requirements - is that it forces us to be explicit and unambiguous about what we want from our software.
So, my stock in trade reply to the question "But what about performance?" is "yes, what about performance?"
Software performance has different dimensions, and if it's important then we need to define exactly what performance we require in specific scenarios. A great way to do this is using non-functional tests.
There's the dimension of time, for example. How long should it take for the code to run?
Imagine a search algorithm that looks for a customer name in a sorted list. We could just loop through the list, and if there are only 1,000 customers and the occasional search, that might be fine. But if there are 10,000,000 customers and users are frequently searching, then a simple loop probably isn't going to cut the mustard.
We can constrain our search algorithm with a basic timing test, like the one below, that makes it explicit that our worst case search - the customer we're looking for isn't in the list - should take a maximum of 1 millisecond to complete.
Execution time is only one dimension, of course. What if we need to constrain the memory footprint of our code while it's running? In Java, we can use the JVM to get information about memory usage, and we can create a multithreaded test to monitor how much more memory is being eaten up as our code executes. Let's imagine we need to constrain the memory footprint when sorting our list of 10 million customers by name, forcing us to use an in-place sorting algorithm that uses up a maximum of another 10KB of memory:
And here, with massive caveats for my less-than-amazing knowledge of the Java Runtime (I make no warranties, the value of shares can go down as well as up, etc etc):
Leaving aside the fact that my brute-force method for calculating memory footprint is a bit hokey (and on running the tests several times, quite variable, it seems), the basic idea is hopefully useful. No doubt some fine fellow will point out a much better way.
You may be able to envisage now how we could use tests to explicitly constrain other non-functional runtime qualities of our code. But we can also often find ways to constrain code at design time, too.
We might have a requirement that our methods should be short and simple. Static code analysis tools like XDepend and Checkstyle can give us hooks into the structure of our code and enable us to create tests that, when code fails to live up to our quality standards, alert us to that fact early enough to do something meaningful about it.
Using executable tests, we can steer our software between acceptable limits of performance, scalability, portability, maintainability, and a whole heap of other -ilities we might care about.
But what about the more, how shall we put this, etheric -ilities, like usability, accessibility and so on? These things tend to be pretty ill-defined and qualitative. Can we make them explicit and testable, just like execution time or memory footprint?
I believe that we can, and to without reason, because I've done it and seen it done. We could, say, define a test that fails if a carefully selected group of target users (e.g., legal secretaries with more than 2 years Windows and web browsing experience), when presented with our application for the first time, fail to get their heads around it fast enough to complete certain tasks we set them within a specified time, without any help or documentation.
With a bit of imagination and lateral thinking, it's possible to meaningfully test many more software qualities than we usually do. And my experience of non-functional TDD is that we tend to get what we have tests for, and we tend not to get what don't have tests for. So agreeing executable non-functional tests tends to lead to better non-functional software quality, if it's done well.
As I warned before, though, be very careful what you wish for.
Posted 3 weeks, 2 days ago on January 28, 2012