July 26, 2012
Empirical Design & Testing In The Wild
Ponderings and musings on that question of why we code. I'm not talking about why I am a programmer. That's easy - I enjoy it. Really, it's the question of "why software?"
It's no secret that, as an industry, we tend to be solution-led. We figure out how to do something often before we've thought of a good reason for doing it.
Maybe it's because we enjoy inventing solutions more than we do solving problems. Who knows?
And it's fair to say that it can cut both ways. Many times, we have a solution sitting on the shelf gathering dust because nobody found a use for it, and then one day someone made that connection to a real problem and said "hey, you know what we could use for that?"
But I'm seeing far too many solutions-looking-for-problems out there. CRM is the classic case-in-point. Large organisations know that they want it, but what is the goal of CRM? All too often, they can't articulate their reasons for wanting a particular CRM (or ERP, or whatever) solution. They just want it, and there's some vague acknowledgement that it might make things better somehow.
I suspect some of the most successful software solutions have attached themselves to problems almost by accident. How often have you seen software being used for something that it wasn't intended to be used for? Who said, for example, that Twitter was an open messaging solution, and not the micro-blogging solution it was designed to be? As a micro-blogging solution, it's arguably a failure. What it's turned out to be is something like AOL Instant Messenger, but anyone can join in the conversation.
Successes like Twitter and Facebook occur by providence more than by design. Users discover things they can do with the software, projecting their own use cases into it and working around the available features to find ways to exploit the underlying computerific nature of the beast.
Strip away the brand names and the logos and the unique designs, and you're left with a fundamental set of use cases upon which all software is based to some degree or another.
We're not supposed to use it that way, but for the majority, Microsoft Excel is a database solution. Indeed, I've seen Microsoft Word used as a database solution. You can store structured data in it. Ergo, it's a database.
You see, people have problems. And when all's said and done, software is nothing more than an interface to the computer that they can use to solve their problems. A user interface of any kind presents us with a language we can use to communicate with the computer, and users can be very creative about how they use that language. In Word, it may well be "add row to table", but in the user's mind it's "add item to order" or "register dog with kennel".
So too in Twitter, posting an update on my "micro-blog" might actually mean something else to me. I might be sending an open message to someone. I might be alerting followers to an interesting documentary I'm watching on TV at that moment. I might be asking for technical support. I've seen Twitter used in so many different ways.
I'm fascinated by watching people use software, and especially by the distance between their own internal conceptual model of what they think they're doing (adding an item to an order) and what the software thinks they're doing (adding a row to a table).
For me, these are the most enlightening use cases. What do people actually do using our software?
When I examine usage logs, I often find patterns of repeated sequences of user interactions. When I was younger and more naive, I believed that these revealed a need to offer further automation (e.g., wizards) to speed up these repetitive tasks, and to an extent that's usually true. It's a very mechanistic way of looking at these patterns.
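As a rough illustration of that mechanistic view, here's a minimal Python sketch that counts every run of consecutive actions in a usage log and keeps the ones that repeat. The event names and the `repeated_sequences` helper are invented for the example, and it assumes the log has already been split out per user and per session.

```python
from collections import Counter


def repeated_sequences(events, length=3, min_count=2):
    """Count each run of `length` consecutive events and keep the repeats."""
    windows = (
        tuple(events[i:i + length])
        for i in range(len(events) - length + 1)
    )
    counts = Counter(windows)
    return {seq: n for seq, n in counts.items() if n >= min_count}


# Hypothetical log of what the software thinks one user did.
log = [
    "open_document", "add_row", "type_text", "save",
    "add_row", "type_text", "save",
    "add_row", "type_text", "save",
    "close_document",
]

result = repeated_sequences(log)
for seq, n in result.items():
    print(n, " -> ".join(seq))
```

The most frequent sequence here (add a row, type some text, save) is the sort of pattern that might turn out to be "add item to order" in the user's head.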
But now I suspect that what these patterns reveal is more profound than that.
Imagine examining a log of instructions sent to the CPU of your computer. You would undoubtedly find much repetition. But tracing those patterns up through the technology stack, we will discover that these repetitions are a product of sequences of instructions defined at increasingly higher levels of abstraction - layers of languages, if you like. A simple expression or statement in Java might result in a whole sequence of machine instructions. A method containing multiple statements might result in even longer sequences. And a user interface or domain-specific language (which, by the way, is also a user interface, and vice-versa) might ultimately invoke many such methods with each interaction.
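You can see one layer of this for yourself. The sketch below uses Python rather than Java, and shows bytecode for Python's virtual machine rather than real machine instructions, but the principle is the same: one short statement expands into a sequence of lower-level instructions. The `order_total` function is made up for the example, and the exact instructions listed will vary between Python versions.

```python
import dis


def order_total(price, quantity):
    # One short statement at the level the programmer works at.
    return price * quantity + 5


# Each entry is one instruction for CPython's virtual machine.
instructions = list(dis.get_instructions(order_total))
for instruction in instructions:
    print(instruction.opname, instruction.argrepr)

print(len(instructions), "instructions for one line of source")
```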
What I'm suggesting is that there can often be an unspoken - usually unacknowledged - language that sits above the user interface. This is the language of what the user intends.
And for all our attempts to define this user language up-front (with use cases and user stories), I don't think I've ever seen software where the mapping between software features and user intentions was precisely 1-to-1. When I resolve to watch closely, I've always found the user working around the software to at least some extent to get what they really want.
Inevitably, we don't get it right first time. Which is why we iterate. (We do iterate, right?) But what is that iteration based on? What are we feeding back in that helps to refine the design of our software?
It's my contention that requirements analysis and UI/UX design should be as much - if not more - an activity based on watching what users do with our software as it is on asking them what they want to do before we write it.
User acceptance testing helps us agree that we delivered what we agreed we should, but we need to go further. It's not enough to know that users can do what we expected they should be able to do using the software, because so much software gets its real value from being misused.
And it's not enough that we observe people using our software in captivity, under controlled conditions and sticking to the agreed scripts. We need to know what they'll likely do with it in the wild.
Going forward, here's how I plan to adapt my thinking about software design:
I plan to shift even more of the effort to redesign. I plan to base redesign not on wishy-washy "customer feedback" but on detailed, objective observations taken from the real world (or as near as damn-it) as to how the software's actually being used. Repetition and patterns in real-world usage data will reveal that there are goals and concepts I must have missed, and I will examine the patterns and the data, and then use that as input to ongoing collaborative analysis and redesign with the users.
I will keep doing this until no more usage patterns emerge and the design now encapsulates all of those missing goals and concepts, at which point hopefully the conceptual language of my software will be a 1-to-1 match for the user's.
I plan to refine this approach so that less and less we present users with our interpretation of what we think they need, and more and more we allow the patterns that emerge from continued usage to inform us what really needs to be in the software.
I consider this to be a scientific, empirical approach to software design. Design based on careful observation, which is then tested and retested based on further observations until what we observe is a precise match for what our users intend.
In iterative design, every design iteration is a theory, and every theory must be thoroughly tested by experiment. My feeling is that, for all these years, I've been doing the experiments wrong. And this has meant that the feedback going into the next iteration is less meaningful.
The whole point of iterative design is that we want to converge on the best design possible with the time and resources available to us. The $64,000 question is: converge on what? How do we know if we're getting hotter or colder?
That final test has always felt somehow lacking to me. We deliver some working software, the customer tests it to see that it's what we agreed it should be, and then we move on to the next iteration, where - instead of refining the design - we usually just add more features to it.
It's never felt right to me. In theory, the customer could come back and say "okay, so it does what we agreed, but now here are my changes to what we agreed for the next iteration". But they generally don't. That gets put off, and put off, and put off. Usually until a major roll-out, which is where most testing in the wild happens, and where most of the really meaningful feedback tends to come from.
This is one of Agile's dirty little secrets. The majority of the teams are doing short increments and loooong iterations. The real learning doesn't start until a great deal of the software's already been written. And then, thanks to Agile's other dirty little secret (Unclean Code), there's less we can do about it. Usually bugger all, in fact.
Of course, we're not going to be allowed to deploy software that doesn't have the minimum viable set of features into a real business - any more than we'd be allowed to cut the ribbon on 10% of a suspension bridge - which is why I favour testing software in the most realistic simulations of the real world possible.
Whenever I mention the idea of a "model office" I hear murmurs of approval. Everyone thinks it's a good idea. So, naturally, nobody does it*.
But if you want to get that most meaningful feedback, and therefore converge on the real value in your software, testing in captivity isn't going to work. You need to be able to observe end users trying to do their jobs, live their lives and organise their pool parties using your software. If you can't observe them in the wild, you need to at least create a testing environment that can fool them into thinking they're in the wild, so you can observe them using it in the way they naturally would.
That's my idea, basically. Deploy your software into the wild (or a very realistic simulation of it) and carefully and objectively observe what your real users do with it in realistic situations. Look for the patterns in that detailed usage data. Those patterns are goals and concepts that matter to your users which your software doesn't encapsulate. Make your software encapsulate those patterns. Then rinse and repeat until your software and your users are speaking exactly the same language.
* You think I'm kidding? Seriously, using a model office to test your software in is THE best idea in software development. Bar none. Nothing gets you closer to your users faster, except for actually becoming them. Nothing reveals the true nature of the user's problems, and the real gaps in your software, more directly. Nothing. NO-THING!
And I bet you still won't use one.
July 10, 2012
Software Apprentices Will Need Insights, Not Buzzwords
A lot of the debate that goes on in the world of software development about the processes, practices and techniques that we should be applying seems to hinge on what we choose to call what we do. This has the effect of creating the illusion that nobody really agrees on anything. And as my thoughts turn more exclusively to apprenticeships for software developers, this presents something of a problem.
If John Q Apprentice learns how to write software at Company X, there's no guarantee that he'll come away from that with knowledge and skills that Company Y would agree are important. My mind borks at the prospect of "Agile apprenticeships" or "Scrum apprenticeships" or "Extreme Programming apprenticeships", because I fear for apprentices being sold such narrow perspectives on what is, in fact, a very wide discipline.
I know I bang on about the potential for "evidence-based" approaches, but this is the real reason why. I believe we have a responsibility to young, impressionable minds to find a way for them to learn their craft (there, I've said it) without bamboozling them with buzzwords and brand names.
I'm planning to take on 2 "apprentices" in the near future (two undergraduates who I'll mentor throughout their degree studies and beyond), and this problem's weighed heavily on my mind.
What I really want to do, apart from giving them an opportunity to get a thousand or more hours of good, focused hands-on practice before they hit the job market, is give them a thorough grounding in the underlying principles of software development - free from fashions and fads - and use this practice time to help them internalise those principles until they become part of their developer DNA, so to speak.
Most importantly, I don't want to present a picture of software development that's personal, subjective and founded on little more than anecdotes. It's the height of arrogance, in my (arrogant) opinion, to tell people "you'll just have to take my word for it". I don't want to saddle two bright and enthusiastic young developers with the "Jason Gorman way" of writing software. That's a burden I wouldn't wish on anyone. Except maybe myself. Well, even then...
Fortunately, it's not necessary. In all the areas that count, much work has been done in the last few decades to establish principles upon which one could base a perfectly workable discipline of software development.
We know, for example, that more feedback more often, and more meaningful feedback, helps us solve complex problems more effectively and more economically than trying to get it right first time.
And we know that close collaboration with our customers is a major factor in project success rates.
Just as we know that testing earlier and more frequently throughout the development process catches problems sooner, making them so much easier to fix that the time saved later often outweighs the extra time spent testing.
There are a bunch of underlying principles upon which I feel I could build a good apprenticeship without lazily resorting to throwing buzzwords out there and saying "trust me, it works". My ambition is that apprentices will not only be fluent in the practices they apply, but also know why they're applying them, and can find ways to apply these underlying principles regardless of the specific environments they find themselves in.
Buzzwords may last them a few productive years before they fall out of favour, but I'm hoping some key fundamental insights will serve them throughout their careers.
April 25, 2012
Entrepreneurial Programming - The Sixty Four Challenge
All this talk about "lean start-ups" and "bacon entrepreneurs" (or whatever... TBH, I wasn't really paying attention) has got me thinking... It seems that a little experiment, in the form of a challenge, might be in order. Many people - including people who should know better - continue to assert that quality and getting something to market quickly are a trade-off. It's the old "quick and dirty" school of thought.
If quick-and-dirty is the best short-term solution, then it stands to reason that in a short-term endeavour, quick-and-dirty would give you an advantage over Clean Code.
I'm not at all convinced that it would. All the evidence I've seen suggests that the opposite is true.
But I'm not here to tell you ghost stories. How could we put it to the test? Asking a sample of people to start a real tech business and run it in a certain way just for an experiment doesn't seem reasonable. We've all got better things to do with our time. Well, maybe.
But, for a big enough sample, it might be worth investing a chunk of time to answer this question - along with potentially lots of other questions about what is the least we can do to start a successful tech business?
Here's a rough outline of an experiment in entrepreneurial programming I've been kicking around. I'll be interested to know what folk think.
This experiment would be called THE SIXTY FOUR CHALLENGE:
We would create an artificial tech business economy. 64,000 people will be given 64 tokens to spend on tech products and services created by one or more of 64 "tech businesses".
Each tech business is a team of people who get together to create a product or service out of software (e.g., a web or smartphone app).
Each team has no more than 64 person-days (64 x 8 hours) to design, build, sell and support their product or service.
The challenge lasts 64 days from a standing start to the final reckoning. At the end of those 64 days, we would tot up how much money (tokens) each startup has made from our artificial market.
Each start-up has a seed fund of 64 tokens, which they can use to buy things like hosting and professional services from other start-ups (at a negotiated value in tokens/hour or day - so a team made up entirely of web designers could potentially win just by doing web design for other teams, which many would argue is what the web is anyway). Hours worked for other teams would not count against the maximum 64 person-days allotted to your team.
We would create special payment gateways and other tools for processing token payments and exchanging tokens between teams, sitting behind which would be an artificial bank that holds all of these accounts and provides transparency to the whole endeavour.
You can change - and even completely re-write - the code as many times as you like over the 64 days.
At the end, the final accounts would be totted up and also the source code would be evaluated, and we'd see whether cleaner code = slower start-up. My guess is we would see no clear correlation, and that taking care over code quality would not be a significant disadvantage.
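To get a feel for the size of this artificial economy, here's the arithmetic from the outline above as a few lines of Python. The figures come straight from the rules; the "even share" at the end is only a baseline I've assumed for comparison, not part of the challenge itself.

```python
PEOPLE = 64_000
TOKENS_PER_PERSON = 64
TEAMS = 64
SEED_TOKENS_PER_TEAM = 64
PERSON_DAYS_PER_TEAM = 64
HOURS_PER_DAY = 8

# Total spending power held by the artificial market.
consumer_tokens = PEOPLE * TOKENS_PER_PERSON

# Total seed funding circulating between the teams.
seed_tokens = TEAMS * SEED_TOKENS_PER_TEAM

# Effort budget for each team, in hours.
hours_per_team = PERSON_DAYS_PER_TEAM * HOURS_PER_DAY

# Assumed baseline: what each team would take if the market split evenly.
even_share = consumer_tokens // TEAMS
even_share_per_hour = even_share / hours_per_team

print(consumer_tokens, seed_tokens, hours_per_team)
print(even_share, even_share_per_hour)
```

So there are 4,096,000 tokens to compete for, against only 4,096 tokens of seed money and 512 hours of effort per team. A team that did no better than an even split would take 64,000 tokens, or 125 tokens per hour worked.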
What do you reckon? Answers on a postcard, please.
April 19, 2012
Enough With The Movements! Movements Are Stupid.
I've been around the block a few times as a software developer, and as such I've witnessed several movements in the industry come and go.
Each movement (object technology, patterns, component-based, model-driven, Agile, service-oriented, Lean, craftsmanship etc etc) attempts to address a genuine problem, usually. And at the core of every movement, there's a little kernel of almost universal truth that remains true long after the movement that built upon it fell out of favour with the software chattering classes.
The problem I perceive is that this kernel of useful insight tends to become enshrouded in a shitload of meaningless gobbledygook, old wives' tales and sales-speak, so that the majority of people jumping on to the bandwagon as the movement gains momentum often miss the underlying point completely (often referred to as "cargo cults").
Along with this kernel of useful insights there also tends to be a small kernel of software developers who actually get it. Object technology is not about Smalltalk. Patterns are not about frameworks. Components are not about COM or CORBA. Model-driven is not about Rational Rose. SOA is not about web services. Agile is not about Scrums. Responsibility-driven Design is not about mock objects. Craftsmanship is not about masters and apprentices or guilds or taking oaths.
In my experience, movements are a hugely inefficient medium for communicating useful insights. They are noisy and lossy.
My question is, do we need movements? When I flick through my textbooks from my physics degree course, they don't read as a series of cultural movements within the physics community. What is true is true. If we keep testing it and it keeps working, then the insights hold.
What is the problem in switching from a model of successive waves of movements, leaving a long trail of people who still don't get it, and possibly never will, to a model that focuses on testable, tested, proven insights into software development?
I feel for the kid who comes into this industry today - or on any other day. I went through the exact same thing before I started reading voraciously to find out what had come before. They may be deluged with wave after wave of meaningless noise, and every year, as more books get published about the latest, greatest shiny thing, it must get harder and harder to pick out the underlying signal from all the branding, posturing and reinvention of the wheel.
You see, it's like this. Two decades of practice and reading has inexorably led me to the understanding that very little of what I've learned that's genuinely important wasn't known about and written about before I was even born. And, just as it is with physics, once you peel away the layers of all these different kinds of particle, you discover underlying patterns that can be explained surprisingly succinctly.
For those who say "oh, well, software development's much more complicated than that", I call "bullshit". We've made it much more complicated than it needs to be. It's a lot like physics or chess (both set-theoretic constructs where simple rules can give rise to high complexity, just like code): sure, it's hard, but that's not the same as complicated. The end result of what we do as programmers can be massively complicated. But the underlying principles and disciplines are simple. Simple and hard.
We do not master complexity by playing up to it. By making what we do complicated. We master complexity by keeping it simple and mastering how software comes about at the most fundamental level.
Logic is simple, but algorithms can be complex. A Turing Machine is simple, but a multi-core processor is complex. Programming languages are simple, but a program can be highly complex. Programming principles are simple, but can give rise to highly complex endeavours.
Complexity theory teaches us that to shape complex systems, we must focus on the simple underlying rules that give rise to them. At its heart, software development has a surprisingly small core of fundamental principles that are easy to understand and hard to master, of many of which your average programmer is blissfully unaware.
True evolution and progress in software development, as far as I can see, will require us to drop the brands, dump the fads and the fashions, and focus on what we know - as proven from several decades of experience and several trillion lines of code.
March 22, 2012
DIY Codemanship Coding Dojo Kit
If I haven't managed to slot you in to this week's Coding Dojo World Tour (of London), fear not. You can sing along with this handy lyric sheet that outlines the dojo. Once you get past all the shameless plugs for various conferences, companies, books and battery-operated adult toys, you'll find two famous katas - Fibonacci Sequence Generator and FizzBuzz. If you don't already know these code katas, just Google them. You'll find plenty of advice.
The dojo works like this:
1. Group selects one of the katas to perform
2. Group decides HOW they're going to perform them. There are 3 approaches on offer - By The Book, From Hell & As If You Meant It.
3. Working in pairs, perform the kata using the approach you've decided on.
Takes about an hour. Or two hours if you're hopelessly drunk (which I did momentarily consider as a possible 4th approach, but rejected on moral and public health grounds.)
So really, it's 6 katas in one. Which is handy.
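For anyone who hasn't met them, the two katas are usually stated along these lines. This is just one plain solution to each, in Python, assuming the common versions of the rules (multiples of 3 and 5 for FizzBuzz, a sequence starting 0, 1 for Fibonacci). The dojo sheet may word them differently, and the point of a dojo is how you get there, not the answer.

```python
def fizzbuzz(n):
    """Return the FizzBuzz word for n, or n itself as a string."""
    if n % 15 == 0:
        return "FizzBuzz"
    if n % 3 == 0:
        return "Fizz"
    if n % 5 == 0:
        return "Buzz"
    return str(n)


def fibonacci(count):
    """Return the first `count` Fibonacci numbers, starting 0, 1."""
    sequence = []
    a, b = 0, 1
    for _ in range(count):
        sequence.append(a)
        a, b = b, a + b
    return sequence


print([fizzbuzz(n) for n in range(1, 16)])
print(fibonacci(8))
```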
February 24, 2012
Agile Design - How A Bit Of Informal Visual Modeling Can Save A Heap Of Heartache
All my courses are, of course, fine holiday fun. But the Agile Design workshop's especially enjoyable, as it brings together a whole range of disciplines while challenging participants to work effectively together in designing and implementing different features of the same simple system. The group works in pairs (or threes, depending on the overall numbers). After a bit of a crash course in basic UML - use cases, class diagrams and sequence diagrams - each pair is given a user story for a community DVD library, and tasked with iteratively fleshing out an object oriented design to pass an acceptance test agreed with the customer (me).

In a break from the traditional approach, we turn the design process around - arguably the right way round - and spend day #1 telling the story using plain old objects, designing and implementing a functioning domain model that includes all the concepts and functions required to pass the tests.

On day #2, we look at how these concepts and functions should be presented to the end users, designing a graphical user interface and retelling the story, this time through the GUI.
The impetus behind the course is to help teams avoid the design train wreck that can ensue when Agile teams pick up stories and go off into their silos to do the design for their part of the overall system. I've seen very experienced teams end up with duplicated classes, database tables, multiple architectures and disjoints in the same code base.
Using informal visual models in a collaborative design approach can aid us in externalising our thinking so that other people can see how what they're doing fits in with what everyone else is doing.
Getting the team around the whiteboard to explore shared concepts like the domain model, the screenflow of the user interface or the patterns used in the technical architecture - especially in the earlier stages of development - can draw out misunderstandings and disjoints that might otherwise have only come to light in integration, when these issues can be much more costly to fix (and therefore often never get fixed).

Importantly, teams are soon testing their designs by implementing them in code (test-driven, of course), and important design decisions and changes to the shared vision that happen as a result of making the designs work for real can be visualised and communicated by sketching them out on flipchart paper or on whiteboards and keeping them around the team's work area for everyone to see.
On the course, teams discover just how much active collaboration's needed to coordinate design effectively, and learn to take the time to resolve design issues and conflicts at the whiteboard if they can. Pairs need to be going out of their way to find out what the other pairs are working on. In real life, we tend to put a wholly inadequate amount of effort into collaborative design, and our ad hoc, inconsistent, and sometimes just plain wrong, designs can be the end result.
The more visible our work is, the easier it is to bring design issues out into the open early, and the sooner we're able to establish a shared language for meaningfully talking about our designs.
And we're not just talking about developers here, either. Testers and graphic designers can play an active and valuable role in this process, as well as the customer, of course. They should take an active interest in establishing the design of use cases, in designing UI storyboards and screenflows, and in designing good acceptance tests that will effectively constrain our designs to what will meet the customer's real needs.
That's why I love this workshop. You get a buzz and an energy in the room, and a real sense of "stuff happening" and of progress being made. And it incorporates disciplines like continuous integration, TDD and BDD (or, as I know it, "TDD with a B instead of a T"), making it a much closer fit to real-world Agile Software Development.
December 18, 2011
Leadership Without Leaders
Distracted today by interesting discussions on That Twitter. One in particular about the folly of giving decision-making power to one group of people ("leaders") and tasking another group with understanding the implications of those decisions ("workers", if you like). Why do we have dedicated leaders on teams - people who have to seek expert advice so they can make "informed decisions"? Why aren't those decisions being made by the experts themselves?
I don't have a problem with leadership, I should stress. At some point we all need to take the lead so the team can move forward in a specific direction.
My concern is that we appoint someone as leader. Leader as job title is what I have a problem with.
I also don't dispute that a single point of contact between the team and the business is often a good idea. But let's not confuse the role of "team representative" with "leader".
In politics, and management, the distinction is often blurred. We elect "representatives" who then say they are "in power" and start telling us what to do, rather than asking us what we think should be done. Many of society's ills can be traced back to this confusion between people who speak for us and people who decide for us.
A team can find ways to make decisions that don't require a dedicated leader. I've done this many times in the past. When a decision needed to be taken, the team put it to the vote. Should we use iBATIS or Hibernate? Show of hands. Hibernate it is, then.
At this juncture in the debate, there are a couple of objections that tend to come up:
1. What if your team tends to make uninformed decisions?
Well, that's probably because you hired the wrong people. If you don't trust the majority opinion of your team, then you don't trust your team.
2. Who is accountable for the ultimate outcome of decisions, then?
The team. We stand or fall as a coherent unit.
There may be someone on the team the customer gets on best with. Y'know, one of those "people persons" people tend to bang on about. Dandy. It's like having an expert in talking to the customer. If you have someone on the team who excels at something the rest of you don't, it makes perfect sense to let them lead in that area. Empower them to act as the "face of the team", by all means. But be very clear as a team where that power ends.
Similarly, if someone on the team is acknowledged as the expert in, say, code quality and is good at spotting code smells and SOLID and all that good, healthy stuff, then empower them to lead in that area.
Generally speaking, empower people who excel in key areas to lead in those areas, and take your lead when you're not the best person to be making those decisions.
When the team can't agree, have a simple constitution that kicks in - a basic democratic decision-making process, with checks and balances for changing course as early as possible when the team discovers they made a wrong call.
If we're unwilling to do this, and, as dedicated leaders, unwilling to relinquish decision-making power to experts and to the team as a whole, we are essentially saying "I know better than this team". And maybe you do, but the fact that they're your team suggests perhaps you didn't. Either you hired an inadequate team, or you joined an inadequate team, or you just stood back and let someone else build the team for you, and didn't insist on getting the right people. None of these makes you look good, frankly.
One very interesting Twitter response suggested that leaders "carry the can". To quote: "Do-ers don't go down if the org goes down". This is patently not true. If the org goes down, we all go down. In fact, in many businesses, it's the workers who go to the wall first. Middle management, when ordered to cut costs, are not in the business of firing themselves.
Why can't the whole team carry the whole can? Teams succeed or fail as a whole. They should share the risks, and share the rewards equally. We are grown-ups, after all.
December 11, 2011
Beyond Fashions & Fads
This blog post by Peter Krantz neatly cocks a snook at all those who refer to Agile methods as "fashions" or "fads". Peter finds quotes relating to many of the key problems XP tries to tackle - like the need for iterative and evolutionary design, the need for early user feedback, the need to stop treating design and "production" as artificially separate activities, the need for small and highly-prioritised releases, the need to automate testing and the need to recognise the universal truth that some developers are orders of magnitude more productive than others, and that who you have on your team is therefore of vital importance.
If you hunt around, you'll even find evidence of teams applying what looks suspiciously like TDD in the 1960s (e.g., on NASA's Project Mercury).
It may have become fashionable after 2001 to do some of these things, and there may indeed be teams doing them without understanding why they're doing them. But the pioneers of software development have been espousing these things since before many of us were born.
July 12, 2011
As the Customer, You Can Make Agility Happen
When it comes to agility, I've discovered after many years - okay, so I actually discovered it right at the start, but needed a good excuse to blog about something - that all the power lies with the customer. If you want to force an "Agile transition", all you have to do is ask the right questions.
For example, if you ask to see working software at the end of every week, deployed on to your computers, then the developers are not going to have much choice but to deliver something on a regular basis. The weekly "show-and-tell" is one of the customer's most powerful weapons. And criminally underutilised. I've been amazed at how reluctant many customers are to even take 15 minutes out of their week to see where their money went. (And when I say "their money", I mean someone else's money that they have responsibility for spending wisely, of course.)
And, as a customer, you're holding most of the planning cards, too. If you choose to change the plan based on feedback every week, identifying new requirements that are coming to light as the work unfolds, and reprioritising existing requirements, then you can. All you have to do is accept that fixed-price R&D is a folly, and that your focus should be on what you're getting for your money, not on whether you're on plan and on budget according to some grand overall plan that was plucked out of the air at a time when you knew the least. Once your emphasis shifts from sticking to a plan and a budget to maximising the value you get from every weekly delivery of working software, you are very much in Agile territory, whether the developers like it or not.
By all means, have high-level "headline" business goals for the software. Indeed, be one of the 1% of software projects that even know what their business goals are. And measure your progress week-on-week towards those business goals, and then steer the software accordingly.
And what about the quality of the software? How can a non-technical person have any influence on that? For sure, it will come back to bite you on the backside if the software is of a low quality. The time it takes to fix 100 bugs discovered in a release is roughly 10-100x what it would cost to have weeded them out before release. That's time that could have been spent adding more value to the software. It's not uncommon for teams, after a few months, to wind up spending half their time just fixing bugs. Which is a terrific way to waste your money, don't you think?
Of course, we have to be careful what we mean by a "bug". Too many teams end up dealing with what are actually change requests - "the software does this, but what we wanted was that" - which have been reported as bugs. This is usually because the requirements were ambiguous and ill-formed. The developers interpreted them, and - not surprisingly - didn't quite deliver what you had in mind. The devil's in the detail, you see. Delivering working software's a great way to find out what you really needed - this is a learning process, after all. But it's one hell of an expensive way to find out what you wanted in the first place.
As a customer, you can at least ensure that the developers deliver what you ask for, even if it turns out you asked for the wrong thing. Insist on agreeing acceptance tests for every requirement. These should be clear, unambiguous and executable; that's to say, they should be expressed in a form that you can actually perform as a real test. Use specific examples, ideally culled from the real world, to make it absolutely crystal clear what the software should do in key business scenarios. (E.g., "Given a bank account with a balance of $50 and no arranged overdraft facility, when the account holder tries to withdraw $50.01, the system tells them they have insufficient funds and the account is not debited.")
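To show what "executable" might look like for that bank account example, here's a minimal sketch in Python. The Account class, the InsufficientFunds exception and the withdraw method are all names I've made up for illustration; your developers' code will look different, and the point is only that the example translates line-for-line into a test a computer can run.

```python
from decimal import Decimal


class InsufficientFunds(Exception):
    """Raised when a withdrawal would exceed the available funds."""


class Account:
    # Hypothetical account model, invented purely for this illustration.
    def __init__(self, balance, overdraft_limit="0.00"):
        self.balance = Decimal(balance)
        self.overdraft_limit = Decimal(overdraft_limit)

    def withdraw(self, amount):
        amount = Decimal(amount)
        # Refuse before touching the balance, so a failed withdrawal
        # leaves the account exactly as it was.
        if amount > self.balance + self.overdraft_limit:
            raise InsufficientFunds("insufficient funds")
        self.balance -= amount


def test_withdrawal_exceeding_balance_is_refused():
    # Given a bank account with a balance of $50 and no arranged overdraft
    account = Account(balance="50.00")

    # When the account holder tries to withdraw $50.01
    try:
        account.withdraw("50.01")
        refused = False
    except InsufficientFunds:
        refused = True

    # Then they are told they have insufficient funds...
    assert refused
    # ...and the account is not debited
    assert account.balance == Decimal("50.00")


if __name__ == "__main__":
    test_withdrawal_exceeding_balance_is_refused()
```

Amounts are held as Decimal rather than floating point because a test that hinges on a single cent shouldn't be at the mercy of rounding.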
Make your expectations absolutely clear. The software should pass all the acceptance tests you agree with the developers. If it doesn't, then they are not "done". Be clear, too, that you do not want the software to do any more than the tests require. If delivering features wrong is a waste of time and money, then delivering features nobody asked for is doubly so.
So now you have them caught in your vice-like grip. Every week, you sit down and plan with them the work for that week. Every requirement that gets scheduled is expressed as a suite of precise, executable tests using realistic examples. At the end of the week, there's a show and tell, at which the first thing you as the customer will do is perform the acceptance tests to see whether you got what you asked for.
The developers will have scored each of your feature requests for its relative complexity or difficulty (in much the same way that golfers score holes, estimating how many shots it might take to complete each one.) At the end of each week, tot up the total number of those estimated complexity points to see how much "stuff" the team got done. You can realistically schedule the same amount of "stuff" for the next week.
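The arithmetic is simple enough to do on the back of an envelope, but as a rough sketch it might look like this in Python. The function names, feature names and point scores are hypothetical, and cutting the plan off at the first feature that doesn't fit is just one reasonable assumption about how you'd schedule.

```python
def velocity(completed_points):
    """Total the complexity points of the features accepted this week."""
    return sum(completed_points)


def plan_next_week(backlog, capacity):
    """Take features in priority order until the next one would exceed capacity.

    backlog is a list of (feature name, estimated points) pairs,
    highest priority first.
    """
    scheduled = []
    used = 0
    for name, points in backlog:
        if used + points > capacity:
            break  # stop at the first feature that doesn't fit
        scheduled.append(name)
        used += points
    return scheduled


# Last week the team finished features scored 3, 5 and 2 points.
last_week = velocity([3, 5, 2])

# Hypothetical prioritised backlog for the coming week.
backlog = [
    ("export to CSV", 5),
    ("password reset", 3),
    ("audit log", 8),
    ("search", 2),
]
print(plan_next_week(backlog, last_week))
```

With ten points delivered last week, the first two features fit and the eight-point "audit log" waits for a later week.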
What we tend to find with the average software development team is that, working on the same software week after week, we see a visible decline in the amount of "stuff" they deliver over weeks, months and years. It's not uncommon to find, after even just a few months, that teams are achieving a small fraction of what they were at the start. Why is this, and what can you do, as a non-programmer, to slow that decline and get more "stuff" for your money?
The reason why productivity declines, often rapidly, in software development is that, over time, the code gets harder and harder to change. It gets complicated, difficult to understand, riddled with duplication, and congealed with dependencies that make it nigh-on impossible to change a line of code without breaking many other parts of the software. It becomes rigid and brittle.
On a twelve month project, we might expect - if unmanaged - that adding or changing a line of code in month twelve could cost 4-5 times as much as it did at the start. A feature request that would have cost $1,000 to implement in the first month could cost $5,000 at the end of that year.
A disciplined programmer will continually take steps to keep the code malleable and robust for as long as possible. She will work hard to ensure that her code is kept clean - simple, modular, easy to understand, low in duplication and with the dependencies carefully managed to limit the costly ripple effect when making changes. She will also ensure that she is able to retest the code comprehensively as quickly and cheaply as possible, which usually means she will write her own automated tests as she goes - since a suite of well-written automated developer tests can check if anything is broken in a matter of seconds. Waiting for your test team to tell her will introduce long feedback loops. By the time she hears that her change has introduced a bug, it may have been weeks or months since she made that mistake. Instead of just fixing it there and then, you will have to schedule a bug fix task, and she will have to go through the entire test and release cycle again before you'll see that bug fix in the finished product. That massively escalates the cost of quality, and the cost of making changes safely.
So, as a customer, you have to be very interested in code cleanliness and in automated developer testing (and, while we're about it, automating your acceptance tests - because there's nothing worse than discovering months after the fact that a feature that was tested and working is no longer working).
Now, naturally, you're probably not very interested in hearing about test coverage reports or about code quality metrics. And why should you be? But you should be interested in the end results of quality - or the lack thereof. I can tell you whether a development team has been taking sufficient care over the code, and putting enough effort into test automation, by a couple of very simple tests.
First of all, deliveries will be reliable. That is to say, when the developers say they're done, they really are done. As evidenced by the lack of bug reports post-release. If teams get hundreds of bug reports over the period of, say, a few months of software releases, I would wager good money that they're not automating their tests. What happens is a release goes out, a flood of bug reports come in, the developers work frantically to fix those bugs and then the software goes through another release. But their frantic bug-fixing has had side effects. Inadvertently, they've introduced new bugs. And their inability to quickly retest the code means that those new bugs may well have gone undetected. So a whole new batch of bug reports go in, and they repeat the process. The feedback cycles are very long, and it can be very difficult to stabilise the software. If they're lucky, with each bug-fix release the overall number of known issues will decrease until the bug count's low enough for the software to be usable in the real world. But sometimes the overall bug count doesn't go down, and teams get caught in a vicious Groundhog Day of bug-fixing and more bug-fixing that can last months.
Think of the software as a boat that has leaks. It's taking on water, and the overall bug count is the level of that water. In come your developers with buckets to bail out the water. But, as they madly bail, the water is still coming in through the leaks. In software development, bugs "leak" into the code through the gaps between us writing or changing code and then testing the code. The more frequently we test the code, the smaller we make those leaks. Good programmers test and retest the code literally hundreds of times a day, and the only way to achieve such short code-test feedback cycles is by automation. A computer can execute 10,000 tests in under a minute. A test team, doing it all manually following the same test scripts, could take weeks or months.
So, as a customer, you should take a very keen interest in the water level in your boat and in how fast water appears to be leaking in. Monitor bug counts. Know how many bugs are still open. Know how many bugs are being reported each week. A good team will achieve very low bug counts - an average of < 1 bug per weekly release (that's about 1 bug every few weeks for an average project), and the number of open bugs will hover as close to zero as makes little difference. Of course, that's just a ballpark. But if you're seeing dozens of bugs reported each week (or hundreds over months), and the average count of open bugs is in the dozens or even hundreds, then you can be sure there's a problem.
You should also take a keen interest in how the pace of development's being sustained. Maybe you started with the developers delivering a dozen features in a weekly release, and by the sixth month that's down to 2-3. Many agile teams that use Burndown Charts, but who don't put enough effort into code quality, see an often alarmingly rapid decline in their ability to make progress. The backlog of your requirements gets, well, backed-up, and you can see often quite clearly that progress is slowing.
As a customer, you can demand low bug counts. Or, rather, you can refuse to accept software that has any known bugs (with the previous caveat about bugs not being change requests, of course).
This can get contentious on teams where the term "bug" is open to interpretation. Having unambiguous executable acceptance tests can help massively in this respect. The software must pass all of the acceptance tests. And it should do nothing other than what's required by the acceptance tests.
I tend to close the stable door thus:
1. If the software fails an acceptance test, that is a bug
2. If the software displays any undesired behaviour (defined by a set of what I call "invariant tests" - tests that it must always pass in any scenario, like, for example, "the system will never respond with an unhandled exception message"), that is a bug
3. If the software passes all of the agreed tests, and you require it to do something that is not defined by any test, that is a new requirement or change request
In any case, it's only a bug when a test is failed. And, since good programmers don't say they're done until they've passed all your tests, expect not to get software that has bugs in it.
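Those three rules are mechanical enough to write down as code. Here's a minimal sketch in Python, assuming each agreed test can be run as a function that returns True when it passes; the triage function and the test names in the usage example are hypothetical.

```python
def triage(acceptance_tests, invariant_tests):
    """Classify a reported issue by running the agreed tests.

    Each test is a (name, check) pair, where check() returns True
    when the software passes. Returns the verdict and the names of
    any failing tests.
    """
    all_tests = list(acceptance_tests) + list(invariant_tests)
    failures = [name for name, check in all_tests if not check()]
    if failures:
        # Rules 1 and 2: any failed agreed test is a bug.
        return "bug", failures
    # Rule 3: everything agreed passes, so the request is for
    # behaviour that no test defines.
    return "new requirement or change request", []


# Hypothetical usage: one acceptance test passes, one invariant fails.
acceptance = [("refuses withdrawal over balance", lambda: True)]
invariants = [("never shows an unhandled exception", lambda: False)]
print(triage(acceptance, invariants))
```

The useful part isn't the code so much as the fact that the verdict depends only on the agreed tests, which takes the argument out of it.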
Now, a reality check: when we bandy about requirements like "it will never" or "it will always", we'd be making a pretty grand claim if we said that we'd proved that our software never does things it's not supposed to, and always does things it is supposed to. Except on very, very simple software, the number of tests we'd have to write to genuinely prove it would be astronomical. Effectively infinite. But we tend to find a law of diminishing returns. 10 tests for a software function may make us 99% sure it works. 100 tests may make us 99.9% sure. 1000 may give us 99.99% assurance, and so on. It is often not possible to achieve 100% assurance, and therefore often impractical to even try. But it is entirely realistic to achieve a level of assurance that's as close to 100% as dammit through everyday good programming practices like developer testing, pair programming (when an extra pair of eyes scrupulously inspects the code as it's being written for potential problems) and the like. And I've learned from experience that we can ratchet up the level of test assurance really quite far for those critical bits of the software without enduring too much pain.
What I'm trying to say is that code that's 99.99% bug-free is entirely achievable, but there'll always be that one-in-a-million freak chance scenario, when ley lines converge and Jupiter is in Uranus etc etc and a bug that's been lurking hidden deep in the nested conditional logic of the application will finally surface. Which is why I leave a tiny bit of wiggle room for development teams; that one-bug-every-once-in-a-blue-moon type of scenario.
So, as a customer, if you demand to see working software every week, and demand that the software is only "working" if it passes your executable acceptance tests, and focus on delivering maximum value each week that makes progress towards your high-level business goals instead of obsessing over sticking to the original plan, and if you expect software to be delivered with almost no bugs (allowing for that once in a blue moon scenario), and if you expect the pace of development to be sustained for months, even years, then you can back a team of programmers into that corner we call "Agile", and they'll either deliver on that or they won't. And, of course, if they don't, then as the customer you can apply the ultimate sanction and then go find a team that can.
June 27, 2011
Continuous Delivery is a Platform for Excellence, Not Excellence Itself
In case anyone was wondering, I tend to experience a sort of "hierarchy of needs" in software development. When I meet teams, I usually find out where they are on this ladder and ask them to climb up to the next rung. It goes a little like this:
0. Are you using a version control system for your code? No? Okay, things really are bad. Sort this out first. You'd be surprised how much relies on that later. Without the ability to go back to previous versions of your code, everything you do will carry a much higher risk. This is your seatbelt.
1. Do you produce working software on a regular basis (e.g., weekly) that you can get customer feedback on? No? Okay, start here. Do small releases and short iterations.
2. How closely do you collaborate with the customer and the end users? If the answer is "infrequently", "not at all", or "oh, we pay a BA to do that", then I urge them to get regular direct collaboration with the customer - this means programmers talking to customers. Anything else is a fudge.
3. Do you agree acceptance tests with the customer so you know if you've delivered what they wanted? No? Okay, then you should start doing this. "Customer collaboration" can be massively more effective when we make things explicit. Teams need a testable definition of "done": it makes things much more focused and predictable and can save an enormous amount of time. Writing working code is a great way to figure out what the customer really needed, but it's a very expensive way to find out what they wanted.
4. Do you automate your tests? No? Well, the effect of test automation can be profound. I've watched teams go round and round in circles trying to stabilise their code for a release, wasting hundreds of thousands of pounds. The problem with manual testing (or little or no testing at all) is that you get very long feedback cycles between a programmer making a mistake and that mistake being discovered. It becomes very easy to break the code without finding out until weeks or even months later, and the cost of fixing those problems escalates dramatically the later they're discovered. Start automating your acceptance tests at the very least. The extra effort will more than pay for itself. I've never seen an instance when it didn't.
5. Do your programmers integrate their code frequently, and is there any kind of automated process for building and deploying the software? No? Software development has a sort of metabolism. Automated builds and continuous integration are like high fibre diets. You'd be surprised how many symptoms of dysfunctional software development miraculously vanish when programmers start checking in every hour or three. It will also be the foundation for that Holy Grail of software development, which we will come to later.
6. Do your programmers write the tests first, and do they only write code to pass failing tests? No? Okay, this is where it gets more serious. Adopting Test-driven Design is a non-trivial undertaking, but the benefits are becoming well-understood. Teams that do TDD tend to produce much more reliable code. They tend to deliver more predictably, and, in many cases, a bit sooner and with less hassle. They also often produce code that's a bit simpler and cleaner. Most importantly, the feedback we get from developer tests (unit tests) is often the most useful of all. When an acceptance test fails, we have to debug an entire call stack to figure out what went wrong and pinpoint the bug. Well-written unit tests can significantly narrow it down. We also get feedback far sooner from small unit tests than we do from big end-to-end tests, because we write far less code to pass each test. Getting this feedback sooner has a big effect on our ability to safely change our code, and is a cornerstone in sustaining the pace of development long enough for us to learn valuable lessons from it.
Now, before we continue, notice that I called it "Test-driven Design", and not "Test-driven Development". Test-driven Development is defined as "Test-driven Design + Refactoring", which brings us neatly on to...
7. Do you refactor your code to keep it clean? The thing about Agile that too many teams overlook is that being responsive to change is in no small way dependent on our ability to change the code. As code grows and evolves, there's a tendency for what we call "code smells" to creep in. A "code smell" is a design flaw in the code that indicates the onset of entropy - growing disorder in the code. Examples of code smells include things like long and complex methods, big classes or classes that do too many things, classes that depend too much on other classes, and so on. All these things have a tendency to make the code harder to change. By aggressively eliminating code smells, we can keep our code simple and malleable enough to allow us to keep on delivering those valuable changes.
8. Do you collect hard data to help objectively measure how well you're doing 1-7? If you come to me and ask me to help you diet (though God knows why you would), the first thing I'm going to do is recommend you buy a set of bathroom scales and a tape measure. Too many teams rely on highly subjective personal feelings and instincts when assessing how well they do stuff. Conversely, some teams - a much smaller number - rely too heavily on metrics and reject their own experience and judgement when the numbers disagree with their perceptions. Strike a balance here: don't rely entirely on voodoo, but don't treat statistics as gospel either. Use the data to inform your judgement. At best, it will help you ask the right questions, which is a good start towards 9.
9. Do you look at how you're doing - in particular at the quality of the end product - and ask yourselves "how could we do this better?" And do you actually follow up on those ideas for improving? Yes, yes, I know. Most Agile coaches would probably introduce retrospectives at stage 0 in their hierarchy of needs. I find, though, that until we have climbed a few rungs up that ladder, discussion is moot. Teams may well need them for clearing the air and for personal validation and ego-massaging and having a good old moan, but I've seen far too many teams abuse retrospectives by slagging everything off left, right and centre and then doing absolutely nothing about it afterwards. I find retrospectives far more productive when they're introduced to teams who are actually not doing too badly, actually, thanks very much. And I always temper 9 with 8 - too many retrospectives are guided by healing crystals and necromancy, and not enough benefit from the revealing light of empiricism. Joe may well think that Jim's code is crap, but a dig around with NDepend may reveal a different picture. You'd be amazed how many truly awful programmers genuinely believe it's everybody else's code that sucks.
10. Can your customer deploy the latest working version of the software at the click of a mouse whenever they choose to, and as often as they choose to? You see, when the code is always working, and when what's in source control is never more than maybe an hour or two away from what's on the programmers' desktops, and when making changes to the code is relatively straightforward, and when rolling back to previous versions - any previous version - is a safe and simple process, then deployment becomes a business decision. They're not waiting for you to debug it enough for it to be usable. They're not waiting for small changes that should have taken hours but for some reason seem to take weeks or months. They can ask for feature X in the morning, and if the team says X is ready at 5pm then they can be sure that it is indeed ready and, if they choose to, they can release feature X to the end users straight away. This is the Holy Grail - continuous, sustained delivery. Short cycle times with little or no latency. The ability to learn your way to the most valuable solutions, one lesson at a time. The ability to keep on learning and keep on evolving the solution indefinitely. To get to this rung on my ladder, you cannot skip 1-9. There's little point in even trying continuous delivery if you're not 99.99% confident that the software works and that it will be easy to change, or that it can be deployed and rolled back if necessary at the touch of a button.
Now at this point you're probably wondering what happened to user experience, scalability, security, or what about safety-critical systems, or what about blah blah blah etc etc. I do not deny that these things can be very important. But I've learned from experience that these are things that come after 1-10 in my hierarchy of needs for programmers. That's not to say they can't be more important to customers and end users - indeed, user experience is often 1 on their list. But to achieve a great user experience, software that works and that can evolve is essential, since it's user feedback that will help us find the optimal user experience.
To put it another way, on my list, 10 is actually still at the bottom of the ladder. Continuous delivery and ongoing optimisation of our working practices is a platform for true excellence, not excellence itself. 10 is where your journey starts. Everything before that is just packing and booking your flights.
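For anyone who'd like rung 7 made a bit more concrete, here's a small sketch in Python of the kind of refactoring I have in mind: a method that does three jobs at once, split into three small functions that each do one. The invoice example, the 10% member discount and the 20% tax rate are all invented for illustration, and amounts are in whole pence to keep the arithmetic exact.

```python
def invoice_total_before(items, customer_is_member):
    # One function doing three jobs: summing, discounting and taxing.
    total = 0
    for price_pence, quantity in items:
        total += price_pence * quantity
    if customer_is_member:
        total = total * 90 // 100
    total = total * 120 // 100
    return total


# The same behaviour after refactoring: each step has a name and
# can be understood, tested and changed on its own.
MEMBER_DISCOUNT_PERCENT = 10  # hypothetical
TAX_PERCENT = 20              # hypothetical


def subtotal(items):
    return sum(price_pence * quantity for price_pence, quantity in items)


def apply_member_discount(amount, customer_is_member):
    if not customer_is_member:
        return amount
    return amount * (100 - MEMBER_DISCOUNT_PERCENT) // 100


def add_tax(amount):
    return amount * (100 + TAX_PERCENT) // 100


def invoice_total_after(items, customer_is_member):
    discounted = apply_member_discount(subtotal(items), customer_is_member)
    return add_tax(discounted)


# Refactoring must not change behaviour, so the old and new versions
# are checked against each other before the old one is deleted.
basket = [(1000, 2), (500, 1)]
assert invoice_total_before(basket, True) == invoice_total_after(basket, True)
assert invoice_total_before(basket, False) == invoice_total_after(basket, False)
```

Those last two assertions are where rungs 4 and 6 earn their keep: without fast automated tests, there's no cheap way of knowing the tidy-up hasn't broken anything.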

