July 26, 2012

Learn TDD with Codemanship

Empirical Design & Testing In The Wild

Ponderings and musings on that question of why we code.

I'm not talking about why I am a programmer. That's easy - I enjoy it. Really, it's the question of "why software?"

It's no secret that, as an industry, we tend to be solution-led. We figure out how to do something often before we've thought of a good reason for doing it.

Maybe it's because we enjoy inventing solutions more than we do solving problems. Who knows?

And it's fair to say that it can cut both ways. Many times, we have a solution sitting on the shelf gathering dust because nobody found a use for it, and then one day someone made that connection to a real problem and said "hey, you know what we could use for that?"

But I'm seeing far too many solutions-looking-for-problems out there. CRM is the classic case-in-point. Large organisations know that they want it, but what is the goal of CRM? All too often, they can't articulate their reasons for wanting a particular CRM (or ERP, or whatever) solution. They just want it, and there's some vague acknowledgement that it might make things better somehow.

I suspect some of the most successful software solutions have attached themselves to problems almost by accident. How often have you seen software being used for something that it wasn't intended to be used for? Who said, for example, that Twitter was an open messaging solution, and not the micro-blogging solution it was designed to be? As a micro-blogging solution, it's arguably a failure. What it's turned out to be is something like AOL Instant Messenger, but anyone can join in the conversation.

Successes like Twitter and Facebook occur by providence more than by design. Users discover things they can do with the software, projecting their own use cases into it and working around the available features to find ways to exploit the underlying computerific nature of the beast.

Strip away the brand names and the logos and the unique designs, and you're left with a fundamental set of use cases upon which all software is based to some degree or another.

We're not supposed to use it that way, but for the majority, Microsoft Excel is a database solution. Indeed, I've seen Microsoft Word used as a database solution. You can store structured data in it. Ergo, it's a database.

You see, people have problems. And when all's said and done, software is nothing more than an interface to the computer that they can use to solve their problems. A user interface of any kind presents us with a language we can use to communicate with the computer, and users can be very creative about how they use that language. In Word, it may well be "add row to table", but in the user's mind it's "add item to order" or "register dog with kennel".

So too in Twitter, posting an update on my "micro-blog" might actually mean something else to me. I might be sending an open message to someone. I might be alerting followers to an interesting documentary I'm watching on TV at that moment. I might be asking for technical support. I've seen Twitter used in so many different ways.

I'm fascinated by watching people use software, and especially by the distance between their own internal conceptual model of what they think they're doing (adding an item to an order) and what the software thinks they're doing (adding a row to a table).

For me, these are the most enlightening use cases. What do people actually do using our software?

When I examine usage logs, I often find patterns of repeated sequences of user interactions. When I was younger and more naive, I believed that these revealed a need to offer further automation (e.g., wizards) to speed up these repetitive tasks, and to an extent that's usually true. It's a very mechanistic way of looking at these patterns.
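The kind of pattern-spotting I mean can be sketched quite simply: treat the usage log as a flat sequence of events and count every short sub-sequence that recurs. This is a minimal illustration in Python — the event names and the log itself are hypothetical, not taken from any real system:

```python
from collections import Counter

def repeated_sequences(events, min_len=2, max_len=5, min_count=2):
    """Count every contiguous sub-sequence of actions that occurs more than once."""
    counts = Counter()
    for n in range(min_len, max_len + 1):
        for i in range(len(events) - n + 1):
            counts[tuple(events[i:i + n])] += 1
    # keep only the sequences that actually repeat
    return {seq: c for seq, c in counts.items() if c >= min_count}

# A hypothetical usage log: raw UI actions, not user intentions
log = ["open_form", "add_row", "set_qty", "add_row", "set_qty",
       "add_row", "set_qty", "save", "open_form", "add_row", "set_qty", "save"]

patterns = repeated_sequences(log)
# The pair ("add_row", "set_qty") recurs four times - a candidate for an
# unnamed concept in the user's head, such as "add item to order"
```

The mechanistic reading of such a result is "offer a wizard for add_row + set_qty"; the more profound reading is that the pair names a concept — "add item" — that the software's own language is missing.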

But now I suspect that what these patterns reveal is more profound than that.

Imagine examining a log of instructions sent to the CPU of your computer. You would undoubtedly find much repetition. But tracing those patterns up through the technology stack, we will discover that these repetitions are a product of sequences of instructions defined at increasingly higher levels of abstraction - layers of languages, if you like. A simple expression or statement in Java might result in a whole sequence of machine instructions. A method containing multiple statements might result in even longer sequences. And a user interface or domain-specific language (which, by the way, is also a user interface, and vice-versa) might ultimately invoke many such methods with each interaction.

What I'm suggesting is that there can often be an unspoken - usually unacknowledged - language that sits above the user interface. This is the language of what the user intends.

And for all our attempts to define this user language up-front (with use cases and user stories), I don't think I've ever seen software where the mapping between software features and user intentions was precisely 1-to-1. When I resolve to watch closely, I've always found the user working around the software to at least some extent to get what they really want.

Inevitably, we don't get it right first time. Which is why we iterate. (We do iterate, right?) But what is that iteration based on? What are we feeding back in that helps to refine the design of our software?

It's my contention that requirements analysis and UI/UX design should be as much - if not more - an activity based on watching what users do with our software as it is on asking them what they want to do before we write it.

User acceptance testing helps us agree that we delivered what we agreed we should, but we need to go further. It's not enough to know that users can do what we expected they should be able to do using the software, because so much software gets its real value from being misused.

And it's not enough that we observe people using our software in captivity, under controlled conditions and sticking to the agreed scripts. We need to know what they'll likely do with it in the wild.

Going forward, here's how I plan to adapt my thinking about software design:

I plan to shift even more of the effort to redesign. I plan to base redesign not on wishy-washy "customer feedback" but on detailed, objective observations taken from the real world (or as near as damn-it) of how the software's actually being used. Repetition and patterns in real-world usage data will reveal goals and concepts I must have missed; I will examine the patterns and the data, and then use that as input to ongoing collaborative analysis and redesign with the users.

I will keep doing this until no more usage patterns emerge and the design now encapsulates all of those missing goals and concepts, at which point hopefully the conceptual language of my software will be a 1-to-1 match for the user's.

I plan to refine this approach so that less and less we present users with our interpretation of what we think they need, and more and more we allow the patterns that emerge from continued usage to inform us what really needs to be in the software.

I consider this to be a scientific, empirical approach to software design. Design based on careful observation, which is then tested and retested based on further observations until what we observe is a precise match for what our users intend.

In iterative design, every design iteration is a theory, and every theory must be thoroughly tested by experiment. My feeling is that, for all these years, I've been doing the experiments wrong. And this has meant that the feedback going into the next iteration is less meaningful.

The whole point of iterative design is that we want to converge on the best design possible with the time and resources available to us. The $64,000 question is: converge on what? How do we know if we're getting hotter or colder?

That final test has always felt somehow lacking to me. We deliver some working software, the customer tests it to see that it's what we agreed it should be, and then we move on to the next iteration, where - instead of refining the design - we usually just add more features to it.

It's never felt right to me. In theory, the customer could come back and say "okay, so it does what we agreed, but now here are my changes to what we agreed for the next iteration". But they generally don't. That gets put off, and put off, and put off. Usually until a major roll-out, which is where most testing in the wild happens, and where most of the really meaningful feedback tends to come from.

This is one of Agile's dirty little secrets. The majority of teams are doing short increments and loooong iterations. The real learning doesn't start until a great deal of the software's already been written. And then, thanks to Agile's other dirty little secret (Unclean Code), there's less we can do about it. Usually bugger all, in fact.

Of course, we're not going to be allowed to deploy software that doesn't have the minimum viable set of features into a real business - any more than we'd be allowed to cut the ribbon on 10% of a suspension bridge - which is why I favour testing software in the most realistic simulations of the real world possible.

Whenever I mention the idea of a "model office" I hear murmurs of approval. Everyone thinks it's a good idea. So, naturally, nobody does it*.

But if you want to get that most meaningful feedback, and therefore converge on the real value in your software, testing in captivity isn't going to work. You need to be able to observe end users trying to do their jobs, live their lives and organise their pool parties using your software. If you can't observe them in the wild, you need to at least create a testing environment that can fool them into thinking they're in the wild, so you can observe them using it in the way they naturally would.

That's my idea, basically. Deploy your software into the wild (or a very realistic simulation of it) and carefully and objectively observe what your real users do with it in realistic situations. Look for the patterns in that detailed usage data. Those patterns are goals and concepts that matter to your users which your software doesn't encapsulate. Make your software encapsulate those patterns. Then rinse and repeat until your software and your users are speaking exactly the same language.

* You think I'm kidding? Seriously, using a model office to test your software in is THE best idea in software development. Bar none. Nothing gets you closer to your users faster, except for actually becoming them. Nothing reveals the true nature of the user's problems, and the real gaps in your software, more directly. Nothing. NO-THING!

And I bet you still won't use one.
