July 23, 2016

...Learn TDD with Codemanship

On The Compromises of Acceptance Test-Driven Development

I'm currently writing a book on Test-Driven Development to accompany the redesigned training workshop. Having thought very hard about TDD for many years, the first 140 pages were very easy to get out.

But things have - predictably - slowed down now that I'm on the chapter on end-to-end TDD and driving internal designs from customer tests.

The issue is that the ways we currently tackle this are all compromises, and there are many gods that need appeasing, just as there are many ways that folk do it.

Some developers will write, say, a failing FitNesse test and come up with an implementation to pass that test. Some will write a failing automated customer test and then drive an internal design using unit tests and "classic TDD". Some will write a failing automated customer test that makes all the assertions about desired outcomes (e.g., "the donated DVD should be in the library"), and rely entirely on interaction tests to drive out the internal design using mock objects. Some will use test doubles only for external dependencies, ensuring their automated customer test runs faster. Some will include external dependencies and use their automated customer test to do integration testing as well. Some will drive the UI with their automated customer tests, effectively making them complete end-to-end system tests. Some will drive the application through controllers or services, excluding the UI as well as external back-end dependencies, so they can concentrate on the internal design.

And, of course, some won't automate their customer tests at all, relying entirely on their own developer tests for design and regression testing, and favouring manual by-eye confirmation of delivery by the customer herself.

And many will use a combination of some or all of these approaches, as required.

In my own approach, I observe that:

a. You cannot automate customer acceptance. The most important part of ATDD is agreeing the test examples and getting the customer's test data. Making those tests executable through automation helps to eliminate ambiguity, but really we're only doing it because we know we'll be running those tests many times, and automating will save us time and money. We still have to let the dog see the rabbit to get confirmation of acceptance. The customer has to step through the tests with working software and see it for themselves at least once.

b. Non-executable customer tests can be ambiguous, and manually reconciling customer-provided data with unit test parameters can be hit-and-miss

c. The customer rarely, if ever, gets involved with writing "customer tests" using the available tools like FitNesse and Cucumber.

We're probably kidding ourselves that we even need a special set of tools distinct from the xUnit frameworks we would use for other kinds of tests, because - chances are - we're going to be writing those tests

d. Customer tests executed using these tools tend to run slow, even when external dependencies are excluded

e. Relying entirely on top-level tests to check that the work got done right can - and usually does - lead to problems with maintainability later. We might identify a class that could be split off into a component to be reused in ther applications, but where are its functional tests? Imagine we could only test a car radio when it's installed in a Ford Mondeo. This is especially pertinent for teams thinking about breaking down monolithic architectures into component-based or service-based designs.

f. When you exclude the UI and external dependencies, you are still a long way from "done" after your customer test has passed. There's many a slip twixt cup and lip.

g. Once we've established a design that passes the customer's test, the main purpose of having automated tests is to catch regressions as the code evolves. For this, we want to be able to test as much of our code as quickly and cheaply as possible. Over-reliance on slower-running customer tests can be at odds with this goal.

With all this in mind, and revisiting the original goal of driving designs directly from the customer's examples, it's difficult to craft a workable single narrative about how we might approach this.

I tend to automate a "happy path" test automated at entry point to the domain model, drive an internal design mostly through "classic" TDD, and use test doubles (stubs, mocks and dummies) to exclude external dependencies (as well as fake complex components I don't want to get into yet - "fake it 'til you make it".) A lot of edge cases get dealt with only in unit tests and with by-eye customer testing. I will work to pass one customer test assertion at a time, running the FitNesse test to get feedback before moving on to the next assertion.

This does lead to three issues:

1. It's not a system test, so there's still more TDD to do after passing the customer's test

2. It produces some duplication of test code, as the customer test will usually ask some of the same questions as the unit tests I write for specific behaviours

3. Even excluding the UI and external dependencies, they still run much slower than a unit test

I solve issue #3 by adapting my FitNesse fixtures to also be JUnit tests that can be run by me as part of continuous regression testing (see an example at https://gist.github.com/jasongorman/74f6a0a049e03b7030ab46e8b01128e7 ). That test is absolutely necessary, because it's typically the only place that checks that we get all of the desired outcomes from a user action. It's the customer test that drives me to wire the objects doing the work together. I prefer to drive the collaborations this way rather than use mock objects, because I have found over the years that an over-reliance on mocks can lead to maintainability issues. I want as few tests as possible that rely on the internal design.

Being honest, I don't know how to easily solve issue #2. It would require the ability to compose tests so that we can apply the same assertions to different set-ups and actions. I did experiment with an Assertion interface with a check() method, but ending up with every assertion having its own implementation just got kerrrazy. I think what's actually needed is a DSL of some kind that hides all of that complexity.

On issue #1, I've long understood that passing an automated customer test does not mean that we're finished. But there is a strong need to separate the concerns of our application's core logic from its user interface and from external dependencies. Most UIs can actually be unit tested, and if you implement an abstraction for the UI logic, the amount of actual code that directly depends on the UI framework tends to be minimal. All you're really doing is checking that logical views are rendered correctly, and that user actions map correctly onto their logical event handlers. The small sliver of GUI code that remains can be driven by integration tests, usually.

I don't write system tests to test logic of any kind. The few that I will write - complicated and cumbersome as they usually are - really just check that, once the car is assembled, when you turn the key in the ignition, it starts. A dozen or more "smoke tests" tend to suffice to check that the thing works when everything's plugged in.

So I continue to iterate this chapter, refining the narrative down to "this is how I would do it", but I suspect I will still be dissatisfied with the result until there's a workable solution to the duplication issue.

Posted 1 year, 3 months ago on July 23, 2016