July 23, 2016

On The Compromises of Acceptance Test-Driven Development

I'm currently writing a book on Test-Driven Development to accompany the redesigned training workshop. Having thought very hard about TDD for many years, I found the first 140 pages very easy to get out.

But things have - predictably - slowed down now that I'm on the chapter on end-to-end TDD and driving internal designs from customer tests.

The issue is that the ways we currently tackle this are all compromises, and there are many gods that need appeasing, just as there are many ways that folk do it.

Some developers will write, say, a failing FitNesse test and come up with an implementation to pass that test. Some will write a failing automated customer test and then drive an internal design using unit tests and "classic TDD". Some will write a failing automated customer test that makes all the assertions about desired outcomes (e.g., "the donated DVD should be in the library"), and rely entirely on interaction tests to drive out the internal design using mock objects. Some will use test doubles only for external dependencies, ensuring their automated customer test runs faster. Some will include external dependencies and use their automated customer test to do integration testing as well. Some will drive the UI with their automated customer tests, effectively making them complete end-to-end system tests. Some will drive the application through controllers or services, excluding the UI as well as external back-end dependencies, so they can concentrate on the internal design.

And, of course, some won't automate their customer tests at all, relying entirely on their own developer tests for design and regression testing, and favouring manual by-eye confirmation of delivery by the customer herself.

And many will use a combination of some or all of these approaches, as required.

In my own approach, I observe that:

a. You cannot automate customer acceptance. The most important part of ATDD is agreeing the test examples and getting the customer's test data. Making those tests executable through automation helps to eliminate ambiguity, but really we're only doing it because we know we'll be running those tests many times, and automating will save us time and money. We still have to let the dog see the rabbit to get confirmation of acceptance. The customer has to step through the tests with working software and see it for themselves at least once.

b. Non-executable customer tests can be ambiguous, and manually reconciling customer-provided data with unit test parameters can be hit-and-miss.

c. The customer rarely, if ever, gets involved with writing "customer tests" using the available tools like FitNesse and Cucumber. We're probably kidding ourselves that we even need a special set of tools distinct from the xUnit frameworks we would use for other kinds of tests, because - chances are - we're going to be writing those tests ourselves anyway.

d. Customer tests executed using these tools tend to run slow, even when external dependencies are excluded.

e. Relying entirely on top-level tests to check that the work got done right can - and usually does - lead to problems with maintainability later. We might identify a class that could be split off into a component to be reused in other applications, but where are its functional tests? Imagine we could only test a car radio when it's installed in a Ford Mondeo. This is especially pertinent for teams thinking about breaking down monolithic architectures into component-based or service-based designs.

f. When you exclude the UI and external dependencies, you are still a long way from "done" after your customer test has passed. There's many a slip twixt cup and lip.

g. Once we've established a design that passes the customer's test, the main purpose of having automated tests is to catch regressions as the code evolves. For this, we want to be able to test as much of our code as quickly and cheaply as possible. Over-reliance on slower-running customer tests can be at odds with this goal.

With all this in mind, and revisiting the original goal of driving designs directly from the customer's examples, it's difficult to craft a workable single narrative about how we might approach this.

I tend to automate a "happy path" customer test at the entry point to the domain model, drive an internal design mostly through "classic" TDD, and use test doubles (stubs, mocks and dummies) to exclude external dependencies (as well as to fake complex components I don't want to get into yet - "fake it 'til you make it".) A lot of edge cases get dealt with only in unit tests and with by-eye customer testing. I will work to pass one customer test assertion at a time, running the FitNesse test to get feedback before moving on to the next assertion.

This does lead to three issues:

1. It's not a system test, so there's still more TDD to do after passing the customer's test

2. It produces some duplication of test code, as the customer test will usually ask some of the same questions as the unit tests I write for specific behaviours

3. Even excluding the UI and external dependencies, customer tests still run much slower than unit tests

I solve issue #3 by adapting my FitNesse fixtures to also be JUnit tests that can be run by me as part of continuous regression testing (see an example at https://gist.github.com/jasongorman/74f6a0a049e03b7030ab46e8b01128e7 ). That test is absolutely necessary, because it's typically the only place that checks that we get all of the desired outcomes from a user action. It's the customer test that drives me to wire the objects doing the work together. I prefer to drive the collaborations this way rather than use mock objects, because I have found over the years that an over-reliance on mocks can lead to maintainability issues. I want as few tests as possible that rely on the internal design.
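To give a flavour of what I mean - this isn't the code from that gist, just a bare-bones sketch with invented names (a plain SLIM-style fixture plus a stand-in Library class so it compiles) - the methods the FitNesse table calls are exactly the ones the JUnit test calls:

    import static org.junit.Assert.assertTrue;

    import java.util.HashSet;
    import java.util.Set;

    import org.junit.Test;

    public class DonateDvdFixture {

        public String title;                            // input column, set by the FitNesse table

        private final Library library = new Library();

        // Output column, e.g. "donated dvd is in library?"
        public boolean donatedDvdIsInLibrary() {
            library.donate(title);
            return library.contains(title);
        }

        // The same fixture doubles as a JUnit test, so the check runs in continuous regression testing
        @Test
        public void donatedDvdShouldBeInLibrary() {
            title = "Jaws";
            assertTrue(donatedDvdIsInLibrary());
        }

        // Stand-in for the real domain model's entry point, just so the sketch compiles
        static class Library {
            private final Set<String> titles = new HashSet<>();
            void donate(String title) { titles.add(title); }
            boolean contains(String title) { return titles.contains(title); }
        }
    }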

Being honest, I don't know how to easily solve issue #2. It would require the ability to compose tests so that we can apply the same assertions to different set-ups and actions. I did experiment with an Assertion interface with a check() method, but ending up with every assertion having its own implementation just got kerrrazy. I think what's actually needed is a DSL of some kind that hides all of that complexity.
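For what it's worth, the experiment looked roughly like this (names invented for illustration): each assertion became a little object with a check() method that could, in principle, be plugged into both the customer test and the unit tests:

    import static org.junit.Assert.assertTrue;

    import java.util.Set;

    interface Assertion<T> {
        void check(T actual);
    }

    // One small class per assertion - which is exactly where it starts to get out of hand
    class DonatedDvdIsInLibrary implements Assertion<Set<String>> {

        private final String title;

        DonatedDvdIsInLibrary(String title) {
            this.title = title;
        }

        @Override
        public void check(Set<String> libraryTitles) {
            assertTrue(libraryTitles.contains(title));
        }
    }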

On issue #1, I've long understood that passing an automated customer test does not mean that we're finished. But there is a strong need to separate the concerns of our application's core logic from its user interface and from external dependencies. Most UIs can actually be unit tested, and if you implement an abstraction for the UI logic, the amount of actual code that directly depends on the UI framework tends to be minimal. All you're really doing is checking that logical views are rendered correctly, and that user actions map correctly onto their logical event handlers. The small sliver of GUI code that remains can be driven by integration tests, usually.
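As a quick illustration of the kind of UI logic I mean - names invented, and deliberately trivial - here's a logical view abstraction and an event handler under a plain unit test, with no GUI framework in sight:

    import static org.junit.Assert.assertEquals;

    import org.junit.Test;

    public class DonationPresenterTest {

        // The logical view: the only thing the real GUI code has to implement
        interface LibraryView {
            void showConfirmation(String message);
        }

        // The UI logic: maps a user action onto what should be rendered
        static class DonationPresenter {
            private final LibraryView view;

            DonationPresenter(LibraryView view) {
                this.view = view;
            }

            void onDonate(String title) {
                view.showConfirmation(title + " has been added to the library");
            }
        }

        @Test
        public void donatingADvdRendersAConfirmationMessage() {
            StringBuilder rendered = new StringBuilder();
            DonationPresenter presenter = new DonationPresenter(rendered::append);

            presenter.onDonate("Jaws");

            assertEquals("Jaws has been added to the library", rendered.toString());
        }
    }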

I don't write system tests to test logic of any kind. The few that I will write - complicated and cumbersome as they usually are - really just check that, once the car is assembled, when you turn the key in the ignition, it starts. A dozen or more "smoke tests" tend to suffice to check that the thing works when everything's plugged in.

So I continue to iterate this chapter, refining the narrative down to "this is how I would do it", but I suspect I will still be dissatisfied with the result until there's a workable solution to the duplication issue.


July 16, 2016

Taking Agile To The Next Level

As with all of my Codemanship training workshops, there's a little twist in the tail of the Agile Software Development course.

Teams learn all about the Agile principles, and the Agile manifesto, Extreme Programming, and Scrum, as you'd expect from an Agile Software Development course.

But they also learn why all of that ain't worth a hill of beans in reality. The problem with Agile, in its most popular incarnations, is that teams iterate towards the wrong thing.

Software doesn't exist in a vacuum, but XP, Scrum, Lean and so forth barely pay lip-service to that fact. What's missing from the manifesto, and from the implementations of the manifesto, is end goals.

In his 1988 book, Principles of Software Engineering Management, Tom Gilb introduced us to the notion of an evolutionary approach to development that iterates towards testable goals.

It had a big influence on me, and I devoted a lot of time in the late 90s and early 00s to exploring and refining these ideas.

On the course, I ask teams to define their goals last, after they've designed and started building a solution. Invariably, more than 50% of them discover they're building the wrong thing.

Going beyond the essential idea that software should have testable goals - based on my own experiences trying to do that - I soon learned that not all goals are created equal. It became very clear that, when it comes to designing goals and ways of testing them (measures), we need to be careful what we wish for.

Today, the state of the art in this area - still relatively unexplored in our industry - is a rather naïve and one-dimensional view of defining goals and associated tests.

Typically, goals are just financial, and a wider set of perspectives isn't taken into account (e.g., we can reduce the cost of manufacture, but will that impact product quality or customer satisfaction?)

Typically, goals are not caveated by obligations on the stakeholder that benefits (e.g., the solution should reduce the cost of sales, but only if every sales person gets adequate training in the software).

Typically, the tests ask the wrong questions (e.g., the airline that measured speed of baggage handling without noticing the increase in lost or damaged property and insurance claims, and then mandated that every baggage handling team at every airport copy how the original team hit their targets.)

Now, don't get me wrong: a development team with testable goals is a big improvement on the vast majority of teams who still work without any goals other than "build this".

But that's just a foundation on which we have to build. Setting the wrong goals, implemented unrealistically and tested misleadingly, can do just as much damage as having no goals at all. Ask any developer who's worked under a regime of management metrics.

Going beyond Gilb's books, I explored the current thinking from business management on goals and measures.

Balancing Goals

First, we need to identify goals from multiple stakeholder perspectives. It's not just what the bean counters care about. How often have we seen companies ruined by an exclusive focus on financial numbers, at the expense of retaining the best employees, keeping customers happy, being kind to the environment, and so on? We're really bad at considering wider perspectives. The law of unintended consequences can be greatly magnified by the unparalleled scalability of software, and there may always be side-effects. But we could at least try to envisage some of the most obvious ones.

Conceptual tools like the Balanced Scorecard and the Performance Prism can help us to do this.

Back in the early 00s, I worked with people like Mike Bourne, Professor of Business Performance Innovation, to explore how these ideas could be applied to software development. The results were highly compatible, but still - more than a decade later - before their time, evidently.

Pre-Conditions

If business goals are post-conditions, then we - above all others - should recognise that many of them will have pre-conditions that constrain the situations in which our strategy or solution will work. A distributed patient record solution for hospitals cannot reduce treatment errors (e.g., giving penicillin to an unconscious patient who is allergic) if the computers they're using can't run our software.

For every goal, we must consider "when would this not be possible?" and clearly caveat for that. Otherwise we can easily end up with unworkable solutions.

Designing Tests

Just as with software or system acceptance tests, to completely clarify what is meant by a goal (e.g., improve customer satisfaction) we need to use examples, which can be worked into executable performance tests. Precise English (or French, or Chinese, etc.) just isn't precise enough.

Let's run with my example of "improve customer satisfaction"; what does that mean, exactly? How can we know that customer satisfaction has improved?

Imagine we're running a chain of restaurants. Perhaps we could ask customers to leave reviews, and grade their dining experience out of 10, with 1 being "very poor" and 10 being "perfect".

Such things exist, of course. Diners can go online and leave reviews for restaurants they've eaten at. As can the people who own the restaurant. As can online "reputation management" firms who employ armies of paid reviewers to make sure you get a great average rating. So you could be a very highly rated restaurant with very low customer satisfaction, and the illusion of meeting your goal is thus created.

Relying solely on online reviews could actively hurt your business if they invited a false sense of achievement. Why would service improve if it's already "great"?

If diners were really genuinely satisfied, what would the real signs be? They'd come back. Often. They'd recommend you to friends and family. They'd leave good tips. They'd eat all the food you served them.

What has all this got to do with software? Let's imagine a tech example: an online new music discovery platform. The goal is to amplify good bands posting good music, giving them more exposure. Let's call it "Soundclown", just for jolly.

On Soundclown, listeners can give tracks a thumbs-up if they like them, and a thumbs-down if they really don't. Tracks with more Likes get promoted higher in the site's "billboards" for each genre of music.

But here's the question: just because a track gets more Likes, does that mean more listeners really liked it? Not necessarily. As a user of many such sites, I see how mechanisms for users interacting with music and musicians get "gamed" for various purposes.

First and foremost, most sites identify to the musician who the listener that Liked their track is. This becomes a conduit for unsolicited advertising. Your track may have a lot of Likes, but that could just be because a lot of users would like to sell you promotional services. In many cases, it's evident that they haven't even listened to the track that they're Liking (when you notice it has more Likes than it's had plays.)

If I were designing Soundclown, I'd want to be sure that the music being promoted was genuinely liked. So I might measure how many times a listener plays the track all the way through, for example. The musical equivalent of "but did they eat it all?" and "did they come back for more?"
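Just to make that concrete - the numbers and names here are plucked out of thin air - a "genuine like" count might only consider plays that got most of the way through the track:

    import java.util.List;

    public class GenuineLikes {

        // Count only the listens that got (nearly) all the way through the track
        static long completePlays(List<Double> fractionOfTrackListened) {
            return fractionOfTrackListened.stream()
                                          .filter(fraction -> fraction >= 0.95)
                                          .count();
        }

        public static void main(String[] args) {
            List<Double> spamLikers = List.of(0.02, 0.0, 0.1);   // "liked" it, never really listened
            List<Double> genuineFans = List.of(1.0, 0.97, 1.0);  // played it all the way through

            System.out.println(completePlays(spamLikers));       // 0
            System.out.println(completePlays(genuineFans));      // 3
        }
    }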

We might also ask for a bit more proof than clicking a thumbs-up icon. One website I use keeps a list of my "fans", but are they really fanatical about my music? Judging by the tumbleweed when I alert my "fans" to new music, the answer is evidently "nope". Again, we could learn from the restaurant, and allow listeners to "tip" artists, conveying some kind of reward that costs the listener something somehow.

Finally, we might consider removing all unintended backwards marketing channels. Liking my music shouldn't be an opportunity for you to try and sell me something. That very much lies at the heart of the corruption of most social networks. "I'm really interested in you. Now buy my stuff!"

This is Design. So, ITERATE!

The lesson I learned early is that, no matter how smart we think we've been about setting goals and defining tests, we always need to revisit them - probably many times. This is a design process, and should be approached the same way we design software.

We can make good headway using a workshop format I designed many years ago.

Maintaining The Essence

Finally, but most important of all, our goals need to be expressed in a way that doesn't commit us to any solution. If our goal is to promote the best musicians, then that's our goal. We must always keep our eyes on that prize. It takes a lot of hard work and diligence not to lose sight of our end goals when we're bogged down in technical solution details. Most teams fail in that respect, and let the technical details become the goal.






April 20, 2016

A* - A Truly Iterative Development Process

Much to my chagrin, having promoted the idea for so many years, software development still hasn't caught on to the idea that what we ought to be doing is iterating towards goals.

NOT working through a queue of tasks. NOT working through a queue of features.

Working towards a goal. A testable goal.

We, as an industry, have many names for working through queues: Agile, Scrum, Kanban, Feature-driven Development, the Unified Process, DSDM... All names for "working through a prioritised list of stuff that needs to be done or delivered". Of course, the list is allowed to change depending on feedback. But the goal is usually missing. Without the goal, what are we iterating towards?

Ironically, working through a queue of items to be delivered isn't iterating - something I always understood to be the whole point of Agile. But, really, iterating means repeating a process, feeding back the results of each cycle, until we reach some goal. Reaching the goal is when we're done.

What name do we give to "iterating towards a testable goal"? So far, we have none. Buzzword Bingo hasn't graced the door of true iterative development yet.

Uncatchy names like goal-driven development and competitive engineering do exist, but haven't caught on. Most teams still don't have even a vague idea of the goals of their project or product. They're just working through a list that somebody - a customer, a product owner, a business analyst - dreamed up. Everyone's assuming that somebody else knows what the goal is. NEWSFLASH: They don't.

The Codemanship way compels us to ditch the list. There is no release plan. Only business/user goals and progress. Features and change requests only come into focus for the very near future. The question that starts every rapid iteration is "where are we today, and what's the least we could do today to get closer to where we need to be?" Think of development as a graph algorithm: we're looking for the shortest path from where we are to some destination. There are many roads we could go down, but we're particularly interested in exploring those that bring us closer to our destination.

Now imagine a shortest-path algorithm that has no concept of destination. It's just a route map, a plan - an arbitrary sequence of directions that some product owner came up with that we hope will take us somewhere good, wherever that might be. Yup. It just wouldn't work, would it? We'd have to be incredibly lucky to end up somewhere good - somewhere of value.
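For anyone who hasn't met it, the analogy is with the A* search algorithm itself. Purely as an illustration - the toy graph, the costs and the heuristic below are all invented - it works by always exploring the route that currently looks cheapest from where we are to the destination:

    import java.util.Comparator;
    import java.util.HashMap;
    import java.util.LinkedList;
    import java.util.List;
    import java.util.Map;
    import java.util.PriorityQueue;

    public class ShortestPathToGoal {

        // graph: node -> (neighbour -> cost of that step); heuristic: node -> estimated cost to the goal
        static List<String> aStar(Map<String, Map<String, Double>> graph,
                                  Map<String, Double> heuristic,
                                  String start, String goal) {
            Map<String, Double> costSoFar = new HashMap<>();
            Map<String, String> cameFrom = new HashMap<>();
            PriorityQueue<String> frontier = new PriorityQueue<>(
                    Comparator.comparingDouble(n -> costSoFar.get(n) + heuristic.get(n)));

            costSoFar.put(start, 0.0);
            frontier.add(start);

            while (!frontier.isEmpty()) {
                String current = frontier.poll();
                if (current.equals(goal)) break;                     // destination reached

                for (Map.Entry<String, Double> step : graph.getOrDefault(current, Map.of()).entrySet()) {
                    double cost = costSoFar.get(current) + step.getValue();
                    if (cost < costSoFar.getOrDefault(step.getKey(), Double.MAX_VALUE)) {
                        costSoFar.put(step.getKey(), cost);          // found a cheaper way to get here
                        cameFrom.put(step.getKey(), current);
                        frontier.remove(step.getKey());              // re-queue with the better estimate
                        frontier.add(step.getKey());
                    }
                }
            }

            if (!goal.equals(start) && !cameFrom.containsKey(goal)) {
                return List.of();                                    // no route to the goal
            }
            LinkedList<String> route = new LinkedList<>();
            for (String n = goal; n != null; n = cameFrom.get(n)) {
                route.addFirst(n);
            }
            return route;
        }

        public static void main(String[] args) {
            Map<String, Map<String, Double>> graph = Map.of(
                    "where we are", Map.of("quick win", 1.0, "big rewrite", 5.0),
                    "quick win", Map.of("goal achieved", 2.0),
                    "big rewrite", Map.of("goal achieved", 1.0));
            Map<String, Double> estimatedDistanceToGoal = Map.of(
                    "where we are", 3.0, "quick win", 2.0, "big rewrite", 1.0, "goal achieved", 0.0);

            // Prints [where we are, quick win, goal achieved] - the cheapest route, found step by step
            System.out.println(aStar(graph, estimatedDistanceToGoal, "where we are", "goal achieved"));
        }
    }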

And so it is - in my quest for a one-word name to describe "iteratively seeking the shortest (cheapest) path to a testable goal" - I propose simply A*.

As in:

"What method are we following on this project?"

"A*"

Of course, there are prioritised lists in my A* method: but they are short and only concern themselves with what we're doing next to TRY to bring us closer to our goal. Teams meet every few days (or every day, if you're really keen), assess progress made since last meeting, and come up with a very short plan, the results of which will be assessed at the next meeting. And rinse and repeat.

In A*, the product owner has no vision of the solution, only a vision of the problem, and a clear idea of how we'll know when that problem's been solved. Their primary role is to tell us if we're getting warmer or colder with each short cycle, and to help us identify where to aim next.

They don't describe a software product, they describe the world around that product, and how it will be changed by what we deliver. We ain't done until we see that change.

This puts a whole different spin on software development. We don't set out with a product vision and work our way through a list of features, even if that list is allowed to change. We work towards a destination - accepting that some avenues will turn out to be dead-ends - and all our focus is on finding the cheapest way to get there.

And, on top of all that, we embrace the notion that the destination itself may be a moving target. And that's why we don't waste time and effort mapping out the whole route beyond the near future. Any plan that tries to look beyond a few days ends up being an expensive fiction that we become all too easily wedded to.






February 16, 2016

Software Craftsmanship 2016 - London workshop open for registration

I'm still working on the official international web page (with more workshops in your area TBA), but if you're in the London area, you can now register for the primary workshop happening in South Wimbledon. It's happening on Saturday May 14th, so no need to ask for a day off.

It's going to be a hell of a thing. No sessions, just one big breakout area full of passionate coders doing what passionate coders do best (coding passionately!)

It's completely free, and paid for by my company Codemanship. So if you want to show your appreciation, beg the boss for some training ;)

The day will be followed by the Software Craftsman's Ball at a local hostelry. And, yes, by that I mean drinking, in case you were wondering.




September 12, 2015

TDD Katas Too Easy For You? Try the Codemanship Team Dojo

One thing I hear regularly is how the kinds of practical exercises we do in training workshops and pair programming interviews are "too trivial" to test real developers.

Curiously, and without any exceptions, it turns out that the people who make such claims are unable to complete the exercises to a high standard in the allotted time; leading me to think that just maybe we overestimate ourselves sometimes. And it's not escaped my attention that those who brag the loudest tend to do them least well.

But if you really want a bigger challenge - one that's more befitting of your programming genius - then there's always my Team Dojo.

It comprises a number of user stories, with executable acceptance tests, for a social network for programmers, from which we can search for developers and build teams based on a number of criteria.

The exercise is undertaken from a standing start. All you get is your computers and a network connection. You'll have to decide what language(s) and platforms you'll be developing for, set up version control if you think you need it (which, of course, you do), build automation, CI, all of that good stuff - none of it is provided.

Once you've got the sausage machine up and running, you then need to work through the user stories, designing and implementing working code in order to have someone outside your team verify that it does indeed pass each acceptance test on a machine that isn't yours.

Give yourself a maximum of a standard working day (8 hours) to complete it. Afterwards, assess the quality of the implementation code for readability, simplicity, lack of duplication etc. Give yourself a percentage score for Code Cleanliness, and then multiply the points you picked up from passing acceptance tests by that (e.g., 100 points from passing acceptance tests x 80% Code Cleanliness = a final score of 80).

Most good developers can do it in under a day. Curiously, teams of 3 or more tend to struggle to complete it in 8 hours. The rare great teams can do it in under 4 hours. Go figure!

You will learn LOTS, though you may well wish for the naïve simplicity of FizzBuzz by the time you get half-way through...





August 7, 2015

Taking Baby Steps Helps Us Go Faster

Much has been written about this topic, but it comes up so often in pairing that I feel it's worth repeating.

The trick to going faster in software development is to take smaller steps.

I'll illustrate why with an example from a different domain: recording music. As an amateur guitar player, I attempt to make recorded music. Typically, what I do is throw together a skeleton for a song - the basic structure, the chord progressions, melody and so on - using a single sequenced instrument, like a nice synth patch. That might take me an afternoon for a 5-minute piece of music.

Then I start working out guitar parts - if it's going to be that style of arrangement - and begin recording them (muso's usually call this "tracking".)

Take a fiddly guitar solo, for example; a 16-bar solo might last 30 seconds at ~120 beats per minute. Easy, you might think, to record it in one take. Well, not so much. I'm trying to get the best take possible, because it's metal and standards are high.

I might record the whole solo as one take, but it will take me several takes to get one I'm happy with. And even then, I might really like the performance on take #3 in the first 4 bars, and really like the last 4 bars of take #6, and be happy with the middle 8 from take #1. I can edit them together - it's a doddle these days - to make one "super take" that's a keeper.

Every take costs time: at least 30 seconds if I let my audio workstation software loop over those 16 bars writing a new take each time.

To get the takes I'm happy with, it cost me 6 x 30 seconds (3 minutes).

Now, imagine I recorded those takes in 4-bar sections. Each take would last 7.5 seconds. To get the first 4 bars so I'm happy with them, I would need 3 x 7.5 seconds (22.5 seconds). To get the last 4 bars, 6 x 7.5 seconds (45 seconds), and to get the middle 8, just 15 seconds.

So, recording it in 4-bar sections would cost me 1 minute 22.5 seconds.

Of course, there would be a bit of an overhead to doing smaller takes, but what I tend to find is that - overall - I get the performances I want sooner if I bite off smaller chunks.

A performance purist, of course, would insist that I record the whole thing in one take for every guitar part. And that's essentially what playing live is. But playing live comes with its own overhead: rehearsal time. When I'm recording takes of guitar parts, I'm essentially also rehearsing them. The line between rehearsal and performance has been blurred by modern digital recording technology. Having a multitrack studio in my home that I can spend as much time recording in as I want means that I don't need to be rehearsed to within an inch of my life, like we had to be back in the old days when studio time cost real money.

Indeed, the lines between composing, rehearsing, performing and recording have been completely blurred. And this is much the same as in programming today.

Remember when compilers took ages? Some of us will even remember when compilers ran on big central computers, and you might have to wait 15-30 minutes to find out if your code was syntactically correct (let alone if it worked.)

Those bad old days go some way to explaining the need for much up-front effort in "getting it right", and fuelled the artificial divide between "designing" and "coding" and "testing" that sadly persists in dev culture today.

The reality now is that I don't have to go to some computer lab somewhere to book time on a central mainframe, any more than I have to go to a recording studio to book time with their sound engineer. I have unfettered access to the tools, and it costs me very little. So I can experiment. And that's what programming (and recording music) essentially is, when all's said and done: an experiment.

Everything we do is an experiment. And experiments can go wrong, so we may have to run them again. And again. And again. Until we get a result we're happy with.

So biting off small chunks is vital if we're to make an experimental approach - an iterative approach - work. Because bigger chunks mean longer cycles, and longer cycles mean we either have to settle for less - okay, the first four bars aren't that great, but it's the least worst take of the 6 we had time for - or we have to spend more time to get enough iterations (movie directors call it "coverage") to better ensure that we end up with enough of the good stuff.

This is why live performances generally don't sound as polished as studio performances, and why software built in big chunks tends to take longer and/or not be as good.

In guitar, the more complex and challenging the music, the smaller the steps we should take. I could probably record a blues-rock number in much bigger takes, because there's less to get wrong. Likewise in software, the more there is that can go wrong, the better it is to take baby steps.

It's basic probability, really. Guessing a 4-digit number blind could take up to 10,000 attempts; guessing it one digit at a time takes at most 40.










August 1, 2015

My First, Last & Only Blog Post About #NoEstimates

I've been keeping one eye on the whole #NoEstimates debate on Twitter, and folk have asked me my opinion quite a few times. So here it is.

I believe, very firmly, that the problem with estimation stems from us asking the wrong question.

In fact, this is where many big problems in software development arise; by asking the customer "What software would you like us to build?"

This naturally leads to a shopping list of features, and then a request to know "How much will all that cost and how long will it take?"

If we asked instead "What problem are we trying to solve, and how will we know when we've solved it?" - together with accompanying questions like "When do you need this solution?", "What is a solution worth to you?" and "How much money do you have to invest in solving it?" - we can set out on a different journey.

I believe software development needs to be firmly grounded in reality, and the reality is that it's R&D. At the start, the honest answer to questions like "What features are needed?", "How much will it cost?" and "How long will it take?" is I Don't Know.

Pretending to know the unknowable is what lands us in hot water in the first place. We don't know if we can solve the problem with the budget and the time available.

In the management quest for accounting certainties, though, nobody wants to hear that, and no developer with a mortgage to pay wants to admit it. So we go with the fairy tale instead.

Once we're in the fairy tale - where we know if we deliver this list of features, it will solve the customer's problem, and we can predict how long and how much it will take - it's almost impossible to get out of it. Budgets are committed. Deadlines are agreed. Necks are on chopping blocks.

So, what we do instead, is we wait for the reality to unfold, and then when it no longer matches the fairy tale, there's a major shitstorm of blame and recrimination. Typically, the finger is pointed at everyone and everything except that first mistake; the original sin of software projects: pretending to know the future.

After getting their fingers burned once, the customer's and manager's instinct is to "fix" the problem by "improving" estimates next time around. This is fixing the fairy tale by inventing an even more elaborate fairy tale, to try and disguise the fact that it's fantasy. This is the management equivalent of sacrificing virgins to make it rain.

The only way out of the estimating nightmare is to call "bullshit" on it, and publicly accept - indeed, embrace - the uncertainty that's inherent in what we're doing.

Yes, you might lose the business if you start out saying "I don't know", but consider that the business you're losing is the same old Death March teams have been suffering for decades. That's not work. That's just passing the time for money.

By all means offer a guess, so the customer can budget realistically. But you must be absolutely 100% crystal clear with them that, at the end of the day, we don't know. We just don't know. It's a punt.

Sell yourself on what you do know. What's your track record as a team? What have you delivered in the past? How much did that cost? How long did that take? And - most importantly, but regrettably least asked - did it work?

When a movie studio hires a director, the director makes no guarantees that this new film will be a commercial success, or that it will cost no more than budgeted, or be completed dead on time. The history of cinema is littered with amazingly good, and often very successful, movies that cost more and took longer than planned. But somehow, James Cameron seems to have no trouble getting movies off the ground. That's because of his track record, not his ability to accurately predict production costs and schedules.

Studios gamble with huge sums of money, and - yes - they do ask for estimates, and things do get hairy when schedules slip and costs overrun, but fundamentally they know what game they're in.

It's time we did, too.




May 26, 2015

Ditch The Backlog and Start Iterating!

Goals.

Yes, those.

The old ways have a habit of sneaking back in through the back door. And so it is with Agile Software Development that, despite all our protestations about being iterative and open to feedback and change, The Big Plan™ found its way back cunningly disguised as the backlog.

The reality is that most Agile teams are not iterating their designs in rapid feedback cycles, but instead are incrementally working their way through a plan for a solution that was cooked up by what we used to call "requirements analysts" - generally speaking, people who talk to the customer to find out what they want and draw up a specification - right at the start.

The backlog on many teams doesn't change much. And this is because the goal of each small frequent delivery is not to try out the software and see how it can be made better in the next delivery, but to test each delivery to check that it conforms to The Big Plan™.

The box-ticking exercise of user acceptance testing usually just asks "is that what we agreed?" The software isn't tested for real, by real users, working on real problems to ask "is that what we really need?"

And so it is that many Agile teams still get that skip-ful of feedback when the "iterated" solution finds its way into the real world. To all intents and purposes, that's a Big Bang release. Y'know, the kind we thought we'd stopped doing.

Better to get that kind of feedback throughout. Better also to shift focus from The Big Plan™ to actual end user goals, not a list of system features that someone believed would meet those goals (if they ever thought to ask what those goals were.)

Imagine we're working for an airline. We turn up for work and are presented with a backlog of feature requests for an online check-in facility. Dutifully, we work our way through this backlog, agreeing acceptance tests with our customer and ticking them off one by one. Eventually, the system goes live. At which point we discover that, because all of the flights we operate are long-haul, and therefore almost all our passengers need to check in baggage for the hold, we've had almost zero impact on the time it takes to check-in.

What we could have been doing, instead of working our way through The Big Plan™, is working our way towards reducing check-in times. If that was the original goal, then that's what we should have been iterating towards.

This is actually a founding principle of Agile, before it was called "Agile". Tom Gilb's ideas about evolutionary project management, dating back to the late 1980s, clearly highlight the need to focus on goals, not plans. Each iteration needs to bring us closer to the goal, and we need to test and measure progress not by software delivered against plan - I mean, damn, there was a major clue right there! - but by progress towards reaching our goals.

Instead of putting all our faith in the online check-in solution that was presented to us, we could have been focusing on those baggage check-in queues and streamlining them. The solution might not even involve software. In which case we swallow our pride and acknowledge we don't have the answer, instead of wasting a big chunk of time and money pretending we do.

This requires a different relationship with the customer, where developers like us are just one part of a cross-discipline team tasked with solving problems that may or may not involve software. We should be incentivised to write software that really achieves something, and to be prepared to change direction when we learn that we're on the wrong track.

The first step in that journey is to ditch the backlog. Put a match to The Big Plan™.

Instead of plans, have goals; a handful of headline requirements that really are requirements - to reduce check-in times, to detect and treat heart disease sooner, to save 1p on the cost of manufacturing a widget, to get 20% more children in Africa through school, or whatever the goal is that someone thinks is valuable enough to justify investing the kind of money that software costs to create and maintain. We ain't done until the goal's been achieved at least to some extent, or until we've abandoned the goal.

That requires developers to play an integral part in a wider - and probably longer-term - game. We are not actors who turn up and say the lines someone else wrote on a set someone else built. We write our lines and build our sets and then act them out to an audience whose feedback should determine what happens next in the story.




April 8, 2015

Reality-driven Development - Creating Software For Real Users That Solve Real Problems In the Real World

It's a known fact that software development practices cannot be adopted until they have a pithy name to identify the brand.

Hence it is that, even though people routinely acknowledge that it would be a good idea for development projects to connect with reality, very few actually do because there's no brand name for connecting your development efforts with reality.

Until now...

Reality-driven Development is a set of principles and practices aimed at connecting development teams to the underlying reality of their efforts so that they can create software that works in the real world.

RDD doesn't replace any of your existing practices. In some cases, it can short-circuit them, though.

Take requirements analysis, for example: the RDD approach compels us to immerse ourselves in the problem in a way traditional approaches just can't.

Instead of sitting in meeting rooms talking about the problem with customers, we go out to where the problem exists and see and experience it for ourselves. If we're tasked with creating a system for call centre operatives to use, we spend time in the call centre, we observe what call centre workers do - pertinent to the context of the system - and most importantly, we have a go at doing what the call centre workers do.

It never ceases to amaze me how profound an effect this can have on the collaboration between developers and their customers. Months of talking can be compressed into a day or two of real-world experience, with all that tacit knowledge communicated in the only way that tacit knowledge can be. Requirements discussions take on a whole different flavour when both parties have a practical, first-hand appreciation of what they're talking about.

Put the shoe on the other foot (and that's really what this practice is designed to do): imagine your customer is tasked with designing software development tools, based entirely on an understanding they've built about how we develop software purely based on our description of the problem. How confident are you that we'd communicate it effectively? How confident are you that their solutions would work on real software projects? You would expect someone designing dev tools to have been a developer at some point. Right? So what makes us think someone who's never worked in a call centre will be successful at writing call centre software? (And if you really want to see some pissed off end users, spend an hour in a call centre.)

So, that's the first practice in Reality-driven Development: Real-world Immersion.

We still do the other stuff - though we may do it faster and more effectively. We still gather user stories as placeholders for planning and executing our work. We still agree executable acceptance tests. We still present it to the customer when we want feedback. We still iterate our designs. But all of these activities are now underpinned with a much more solid and practical shared understanding of what it is we're actually talking about. If you knew just how much of a difference this can make, it would be the default practice everywhere.

Just exploring the problem space in a practical, first-hand way can bridge the communication gap in ways that none of our existing practices can. But problem spaces have to be bounded, because the real world is effectively infinite.

The second key practice in Reality-driven Development is to set ourselves meaningful Real-world Goals: that is, goals that are defined in and tested in the real world, outside of the software we build.

Observe a problem in the real world. For example, in our real-world call centre, we observe that operatives are effectively chained to their desks, struggling to take regular comfort breaks, and struggling to get away at the end of a shift. We set ourselves the goal of every call centre worker getting at least one 15-minute break every 2 hours, and working a maximum of 15 minutes' unplanned overtime at the end of a day. This goal has nothing to do with software. We may decide to build a feature in the software they use that manages breaks and working hours, and diverts calls that are coming in just before their break is due. It would be the software equivalent of when the cashier at the supermarket checkout puts up one of those little signs to dissuade shoppers from joining their queue when they're about to knock off.

Real-world Goals tend to have a different flavour to management-imposed goals. This is to be expected. If you watch any of those "Back to the floor" type TV shows, where bosses pose as front-line workers in their own businesses, it is very often the case that the boss doesn't know how things really work, and what the real operational problems are. This raises natural cultural barriers and issues of trust. Management must trust their staff to drive development and determine how much of the IT budget gets spent. This is probably why almost no organisation does it this way. But the fact remains that, if you want to address real-world problems, you have to take your cues from reality.

Important, too, is the need to strike a balance in your Real-world Goals. While we've long had practices for discovering and defining business goals for our software, they tend to suffer from a rather naïve 1-dimensional approach. Most analysts seek out financial goals for software and systems - to cut costs, or increase sales, and so on - without looking beyond that to the wider effect the software can have. A classic example is music streaming: while businesses like Spotify make a great value proposition for listeners, and for major labels and artists with big back catalogues, arguably they've completely overlooked 99.9% of small and up-and-coming artists, as well as writers, producers and other key stakeholders. A supermarket has to factor in the needs of suppliers, or their suppliers go out of business. Spotify has failed to consider the needs of the majority of musicians, choosing to focus on one part of the equation at the expense of the other. This is not a sustainable model. Like all complex systems, dynamic equilibrium is usually the only viable long-term solution. Fail to take into account key variables, and the system tips over. In the real world, few problems are so simple as to only require us to consider one set of stakeholders.

In our call centre example, we must ask ourselves about the effect of our "guaranteed break" feature on the business itself, on its end customers, and anyone else who might be affected by it. Maybe workers get their breaks, but not without dropping calls, or not without a drop in sales. All of these perspectives need to be looked at and addressed, even if by addressing it we end up knowingly impacting people in a negative way. Perhaps we can find some other way to compensate them. But at least we're aware.

The third leg of the RDD table - the one that gives it the necessary balance - is Real-world Testing.

Software testing has traditionally been a standalone affair. It's vanishingly rare to see software tested in context. Typically, we test it to see if it conforms to the specification. We might deploy it into a dedicated testing environment, but that environment usually bears little resemblance to the real-world situations in which the software will be used. For that, we release the software into production and cross our fingers. This, as we all know, pisses users off no end, and rapidly eats away at the goodwill we rely on to work together.

Software development does have mechanisms that go back decades for testing in the real world. Alpha and Beta testing, for example, are pretty much exactly that. The problem with that kind of small, controlled release testing is that it usually doesn't have clear goals, and lacks focus as a result. All we're really doing is throwing the software out there to some early adopters and saying "here, waddaya think?" It's missing a key ingredient - real-world testing requires real-world tests.

Going back to our Real-world Goals, in a totally test-driven approach, where every requirement or goal is defined with concrete examples that can become executable tests, we're better off deploying new versions of the software into a real-world(-ish) testing environment that we can control completely, where we can simulate real-world test scenarios in a repeatable and risk-free fashion, as often as we like.

A call centre scenario like "Janet hasn't taken a break for 1 hour and 57 minutes, there are 3 customers waiting in the queue, they should all be diverted to other operators so Janet can take a 15-minute break. None of the calls should be dropped" can be simulated in what we call a Model Office - a recreation of all or part of the call centre, into which multiple systems under development may be deployed for testing and other purposes.
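Here's roughly what that scenario might look like as an executable test. The ModelOffice and Operator classes below are invented stand-ins - just enough to make the sketch run - not a real simulation framework:

    import static org.junit.Assert.assertEquals;
    import static org.junit.Assert.assertTrue;

    import java.util.ArrayList;
    import java.util.List;

    import org.junit.Test;

    public class GuaranteedBreakScenarioTest {

        @Test
        public void callsAreDivertedSoJanetGetsHerBreakAndNoneAreDropped() {
            ModelOffice office = new ModelOffice();
            Operator janet = office.addOperator("Janet", 117);   // 1 hour 57 minutes since her last break
            Operator sam = office.addOperator("Sam", 10);

            office.receiveCalls(3);                               // 3 customers waiting in the queue

            assertTrue(janet.isOnBreak());
            assertEquals(3, sam.callsTaken());
            assertEquals(0, office.droppedCalls());
        }

        // --- invented stand-in model, just enough to make the scenario executable ---

        static class ModelOffice {
            private static final int BREAK_DUE_AFTER_MINUTES = 115;
            private final List<Operator> operators = new ArrayList<>();
            private int dropped = 0;

            Operator addOperator(String name, int minutesSinceLastBreak) {
                Operator operator = new Operator(name, minutesSinceLastBreak);
                operators.add(operator);
                return operator;
            }

            void receiveCalls(int count) {
                for (int i = 0; i < count; i++) {
                    // Divert each call away from anyone whose guaranteed break is due
                    Operator available = operators.stream()
                            .filter(op -> !op.isBreakDue(BREAK_DUE_AFTER_MINUTES))
                            .findFirst()
                            .orElse(null);
                    if (available == null) {
                        dropped++;
                    } else {
                        available.takeCall();
                    }
                }
                operators.stream()
                        .filter(op -> op.isBreakDue(BREAK_DUE_AFTER_MINUTES))
                        .forEach(Operator::startBreak);
            }

            int droppedCalls() { return dropped; }
        }

        static class Operator {
            private final String name;
            private final int minutesSinceLastBreak;
            private int callsTaken = 0;
            private boolean onBreak = false;

            Operator(String name, int minutesSinceLastBreak) {
                this.name = name;
                this.minutesSinceLastBreak = minutesSinceLastBreak;
            }

            boolean isBreakDue(int thresholdMinutes) { return minutesSinceLastBreak >= thresholdMinutes; }
            void takeCall() { callsTaken++; }
            void startBreak() { onBreak = true; }
            boolean isOnBreak() { return onBreak; }
            int callsTaken() { return callsTaken; }
        }
    }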

Our call centre model office simulates the real environment faithfully enough to get meaningful feedback from trying out software in it, and should allow us to trigger scenarios like this over and over again. In particular, model offices enable us to exercise the software in rare edge cases and under unusually high peak loads that Alpha and Beta testing are less likely to throw up. (e.g., what happens if everyone is due a break within the next 5 minutes?)

Unless you're working on flight systems for fighter aircraft or control systems for nuclear power stations, it doesn't cost much to set up a testing environment like this, and the feedback you can get is worth far more.

The final leg of the RDD table is Real-world Iterating.

So we immerse ourselves in the problem, find and agree real-world goals and test our solutions in a controlled simulation of the real world. None of this, even taken together with existing practices like ATDD and Real Options, guarantees that we'll solve the problem - certainly not first time.

Iterating is, in practice, the core requirements discipline of Agile Software Development. But too many Agile teams iterate blindly, making the mistake of believing that the requirements they've been given are the real goals of the software. If they weren't elucidated from a real understanding of the real problem in the real world, then they very probably aren't the real goals. More likely, what teams are iterating towards is a specification for a solution to a problem they don't understand.

The Agile Manifesto asks us to value working software over comprehensive documentation. Reality-driven Development widens the context of "working software" to mean "software that testably solves the user's problem", as observed in the real world. And we iterate towards that.

Hence, we ask not "does the guaranteed break feature work as agreed?", but "do operatives get their guaranteed breaks, without dropping sales calls?" We're not done until they do.

This is not to say that we don't agree executable feature acceptance tests. Whether or not the software behaves as we agreed is the quality gate we use to decide if it's worth deploying into the Model Office at all. The software must jump the "it passes all our functional tests" gate before we try it on the "but will it really work, in the real world?" gate. Model Office testing is more complex and more expensive, and ties up our customers. Don't do it until you're confident you've got something worth testing in it.

And finally, Real-world Testing wouldn't be complete unless we actually, really tested the software in the real real world. At the point of actual deployment into a production environment, we can have reasonably high confidence that what we're putting in front of end users is going to work. But that confidence must not spill over into arrogance. There may well be details we overlooked. There always are. So we must closely observe the real software in real use by real people in the real world, to see what lessons we can learn.

So there you have it: Reality-driven Development

1. Real-world Immersion
2. Real-world Goals
3. Real-world Testing
4. Real-world Iterating


...or "IGTI", for short.


March 19, 2015

Requirements 2.0 - Make It Real

This is the second post in a series to float radical ideas for changing the way we handle requirements in software development. The previous post was Ban Feature Requests

In my previous post, I put forward the idea that we should ban customers from making feature requests so that we don't run the risk of choosing a solution too early. For example, in a user story, we'd get rid of most of the text, just leaving the "So that..." clause to describe why the user wants the software changed.

Another area where there's great risk of pinning our colours to a specific solution is in the collaboration between a customer and a UI/UX designer. The issue here is that things like wireframes and UI mock-ups tend to be the first concrete discussion points we put in front of customers. Up to this point, it's all very handwavy and vague. But seeing a web page with a text box and a list and some buttons on it can make it real enough to have a more meaningful discussion about the problem we're trying to solve.

This would be fine if we didn't get so attached to those designs. But, let's face it, we do. We get very attached to them, and then the goal of development transforms into "what must we do in order to realise that design?", when in reality, we're still exploring the problem space.

So, we need some way to make our ideas concrete, so we can have meaningful discussions about the problem, without presenting the customer with a design for a solution.

Here's what I do, when the team and the customer are willing to play ball:



I make it real by... well... making it real. I call this Tactile Modeling. (No doubt by tomorrow afternoon, some go-getting young hipster will have renamed it "Illustrating Requirements Using Things You Can See and Hold In Your Hand-driven Development". But for now, it's Tactile Modeling.)

Now, I'm old enough to remember when we were all so young and stupid we really thought that visual models in notations like UML would serve this purpose. Yeah, I know. It's like watching old movies of women smoking next to their babies. Boy, were we dumb!

But the idea of being able to concretely explore examples and business scenarios in a practical way can carry real power to break down the communication barriers; far more effectively than our current go-to techniques like agreeing acceptance tests in some airless meeting room with a customer who is pulling domain facts out of thin air half the time.

So, if we're talking about a system for managing a video library, let's create a video library and explore real-world systems for managing it. Let's get some videos. Let's get some shelves to put them on. Let's get some boxes and folders and sticky-tape and elastic bands and build a video library management system out of real actual atoms and stuff, and explore how it works in different scenarios.

And instead of drawing boxes and arrows and wireframes and wizardry up on the whiteboard or in a modelling tool (like PowerPoint, for example), let's whip out our camera phones and take snaps at key steps and take videos to show how a process works and stick them in the Wiki for everyone to see.

And let's not sit in meeting rooms going "blah blah blah must be scalable etc etc", let's have our discussions inside this environment we've created, so we're surrounded by the problem domain, and at any point requiring clarification, the clarifier can jump up and show us what they mean, so that we can all see it (using our eyes).

As our understanding evolves, and we start to create software to be used in some of these scenarios to help the end users in their work, we can deploy that software into this fake video library and gradually swap out the belt-and-braces information systems with slick software, all the while testing to see that we're achieving our goals.

Now, I know what some of you are thinking: "but our problem domain is all abstract concepts like 'currency', 'option' and 'ennui'. " Well, here's the good news. Movies are an abstract concept. Sure, they come in boxes sometimes, or on cassettes. But that's just the physical representation - the medium - through which that concept is expressed. It's the same movie whether we download it as a file, buy it on a disc or get someone to paint it as a mural. That's what separates us from the beasts of the jungle. Well, that and the electrified fence around our compound. But mostly, it's our ability to express abstract concepts like money, employment contract and stock portfolio that we've built our entire civilisation on. Money can be represented by little pieces of paper with numbers written on them. (A radical idea, I know, but worth a try sometime.) And so on.

There is always a way to make it practical: something we can pick up and look at and manipulate and move to model the information in the system, be it information about hospital patients, or about chemical components in self-replicating molecules, or about single adults who are looking for love.

Of course, there's more to it than that. But you get the gist, I'm sure. And we'll look at some of that in the next post, no doubt. In particular, the idea of a model office: a simulated testing and learning environment into which we should be deploying our software to see how it fares in something approaching the real world.

Wanna have a meaningful conversation about requirements? Then make it real.