May 17, 2013

Straw Man TDD

A lot of the criticisms of Test-driven Development I hear are really attacks on a mythical version of TDD that no right-minded advocate ever put forward.

Nevertheless, being a TDD trainer and coach, I do still devote time to answering these straw man criticisms and objections. I thought it would be useful to collect some of the most common misconceptions in one place that I can point people to when I'm just too tired and/or drunk to answer them any more.

1. TDD means not doing any up-front thinking about design

Nobody has ever suggested this. It would be madness. Read books like Extreme Programming Explained again. You'll see sketches. You'll see CRC cards. You'll even see UML. (Gasp!)

The question really is about how much up-front design is sufficient. And the somewhat glib answer is "just enough". I tend to qualify that as "just enough to know what tests you need to pass". So, if your approach is focused on roles, responsibilities and interactions, then I'd want to have a high-level idea of what those are before diving in to code. If it's more an algorithmic focus, I'd want to have a test list that can act as a roadmap for key examples that - taken together - explain the algorithm. And so on.

I'd stop at the point where I'm asking questions that are best answered in code (e.g., is this an interface? Should this method be exposed?). Code is for details.

2. TDD takes significantly longer because you write twice as much code

Once you've got the hang of TDD - and that can take months of practice - we find it doesn't take significantly longer. Mostly because the bulk of our time isn't spent typing, it's spent thinking and, when we don't take care, fixing problems. Fixing problems, we find, generally takes more time than avoiding them. So much so, in fact, that working in the very short feedback loops of TDD and testing thoroughly as we go can turn out to be a way of saving time.

Most developers and teams who report a loss of productivity when they try TDD are actually reporting the learning curve. Which can be steep. This is why it can make good commercial sense to seek help in those early stages from someone who's been there, done that and got the t-shirt.

3. TDD leads to mountains of test code that make it harder to change your source code

There are three key steps in TDD, but most developers miss out or skimp on the third one - refactoring. So, when they report that they tried TDD for a few months, but found after a while that they couldn't change their source code without breaking loads of unit tests, I'm inclined to believe that this is what's really happened.

Test code is source code. If the test code is difficult to change, your code is difficult to change. So we must apply as much effort to the maintainability of test code as to the code it's testing. It must be easy to read and understand. It must be as simple as we can make it. It must be low in duplication. And, very importantly, it must be loosely coupled to the interfaces of the objects it's testing.

Think of UI testing. Maybe we wrote thousands of lines of scripts that click buttons and populate text boxes and all that sort of thing, binding our UI tests very closely to the implementation of the UI itself. So if we want to change the UI design - and we will - a whole bunch of dependent tests break.

Better to refactor our UI test scripts so that interactions with the concrete UI are encapsulated in one place and invoked through meaningfully-named helper functions, so we can write test scripts in the abstract (e.g., submitMortgageApplication() instead of submitButton.click()).

The same applies to unit tests. If we repeatedly invoke the same methods on an object in our tests, better to encapsulate those interactions behind abstract and meaningful interfaces so it all happens in one place only.
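To make the UI example concrete, here's a minimal sketch in Java. Only submitMortgageApplication() comes from the example above; the UiDriver interface, the MortgageApplicationScreen class and the field ids are made up for illustration, and a recording driver stands in for a real UI so the sketch runs anywhere.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for whatever actually drives the concrete UI.
interface UiDriver {
    void type(String fieldId, String text);
    void click(String buttonId);
}

// The one place that knows about field and button ids (all assumed names).
class MortgageApplicationScreen {
    private final UiDriver ui;

    MortgageApplicationScreen(UiDriver ui) {
        this.ui = ui;
    }

    // Test scripts call this instead of poking at individual widgets.
    void submitMortgageApplication(String applicantName, int loanAmount) {
        ui.type("applicantName", applicantName);
        ui.type("loanAmount", Integer.toString(loanAmount));
        ui.click("submitButton");
    }
}

// Records interactions instead of touching a real UI.
class RecordingUiDriver implements UiDriver {
    final List<String> actions = new ArrayList<>();

    public void type(String fieldId, String text) {
        actions.add("type " + fieldId + "=" + text);
    }

    public void click(String buttonId) {
        actions.add("click " + buttonId);
    }
}
```

If the UI design changes, say the submit button gets a new id, only MortgageApplicationScreen should need to change. The test scripts that call submitMortgageApplication() stay as they are.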

4. TDD does not guarantee bug-free code

This isn't a straw man, per se. But to say that "we don't bother doing X because X is not completely perfect" isn't much of an argument against doing X when no approach guarantees perfection. When people throw this one at me, I'm naturally keen to see their bug-free code.

Let's face it, the vast majority of teams who don't do TDD would benefit from doing something like TDD. They'd benefit from working towards more explicit, testable outcomes. They'd benefit from shorter and less subjective feedback loops. They'd benefit from continuous refactoring. They'd benefit from fast, cheap regression testing. Their software would be more reliable and easier to maintain, and - once they've worked their way up the learning curve - it won't cost them more to achieve those better results. There are, of course, other approaches than TDD that can achieve these things. But, by Jiminy, they don't half feel like TDD when you're doing them (which I have).

UPDATE

5. You are not designing domain abstractions, you are designing tests

This is a new addition to the fold, courtesy of some chap on That Twitter who obviously thinks I don't know one end of a domain model from a horse's backside.

Now, I've spent a fair chunk of my career modeling businesses - back in the good old days of "enterprise architecture", when that was where the big bucks were. So I do know a thing or two about this.

What I know is that those domain abstractions have to come from somewhere. How do we know we need a customer and that customer might have both a billing address and a shipping address, which may be the same address, and that customer may be a person or a company?

We know it because we see examples that require it to be so. If we don't see examples on which these generalisations are based, then our domain model is pure conjecture based on what we think the world our systems are modeling might look like (probably). I design software to be used, and it has been considered a good idea to drive the design from examples of usage for longer than I've been alive. Even when we're not designing software, but simply modeling the domain in order to understand it - perhaps to improve the way our business works - it works best when we explore with examples and generalise as we go. In TDD, we call this "triangulation".

I will very often sketch out the concepts that play a part in a collection of scenarios - or examples - and create a generalised model that satisfies them all as a basis for the tests I'm about to write. (See Straw Man #1, of which this is just another example.)

When we generalise without exploring examples, we tend to find our domain models suffer from a smell we call "Speculative Generality". We can end up with unnecessarily complex models that often turn out not to be what's needed to satisfy the needs of end users.

Good user-centred software design is a process of discovery. We don't magic these abstractions and generalisations out of thin air. We discover the need for them. At its very essence, that's what TDD is. I can't think of a single mainstream software development method of the last few decades that wasn't driven by usage scenarios or examples. There's a very good reason for that. To just go off and "model the domain" is a fool's errand. Model for a purpose, and that purpose comes first.

If you practice TDD, but don't think about the domain and the design up-front, then you're doing TDD wrong. It's highly recommended you think ahead. Just as long as you don't code ahead.

UPDATE #2

6. TDD doesn't work for the User Interface

Let's backtrack a little. Remember those good old days, about 10 minutes ago, when I told you that you should decouple your test code from the interfaces that it tests?

Those were the days. David Cameron was Prime Minister, and you could buy a pint of beer for under £4.

Anyhoo, it turns out - as if by magic - that it's not such a bad idea to decouple the logic of user interactions from the specific UI implementation in the architecture of your software. That is to say, your knobs and widgets in the UI should do - to use the scientific parlance - "f**k all" as regards the logic of your application.

The workflow of user interactions exists independent of whether that workflow is through a Java desktop application or an iOS smartphone app.

A tiny sliver of code is needed to glue the logical user experience to the physical user experience. If more than 5% of your code is dependent on the UI framework you're using, you're very probably doing it wrong.

And for that last 5%... well, you'd be surprised at how testable it really is. It may take some ingenuity, but it's often more do-able than you think.

Take web apps: all it takes is a fake HTTP context, and we've got ourselves 100% coverage. (Whatever that means.) Java Swing is equally get-at-able. As are .NET desktop GUIs. You just have to know where to stick your wotsit.


If you'd like to see a few other TDD myths debunked, while getting some hands-on practice in an intensive and fun workshop, join us in London on July 13th.





May 15, 2013

How Can You Attract And Retain Great Developers?

Companies often wonder how they can attract and retain great software developers.

Well, here's the thing about great software developers. They don't approach what they do as just a job. To them, it's a passion, a calling. They do it because they love to do it.

To make sense of this, let's change the context of the question.

You have started a band. How can you attract and retain great musicians for your band?

Here's how you might not do it:

1. Constantly remind them that this is your band and that they must do as you tell them

2. Get them playing awful, tacky music (e.g., that song from Four Weddings & A Funeral, anything by Meatloaf) at weddings and school proms

3. Force them to play with crappy musicians who make tonnes of mistakes, and don't give them time and space to help those musicians improve. And then blame them if it sounds crap.

4. Consistently approach the band's musical output with a "that'll do" attitude. "Yeah, the vocal's off-key in the chorus, but we're on a deadline so let's just print the CDs already"

5. Make unrealistic demands of them. "We're going to be playing 2 shows a day for the next 6 months", "We've got 5 days to rehearse this 90-minute set"

6. When they do something amazing, ignore it. Focus on you. You're the band leader, after all.

7. Routinely remind them that they are dispensable. Great musicians grow on trees, remember?

8. Discourage a musician-led culture in your band, and restrict time to practice, learn and grow as musicians. You're there to make money. Who gives a shit about day trips to NAMM or time off to attend guitar clinics?

9. Most important of all, remember: when the band's a success, it's because you're a great band leader. When it's a failure, it's because your musicians suck.

Now, ask me again: how can you attract and retain great software developers?



Legacy Code Without Automated Tests Is Not An Excuse For Less Rigour

While I'm on the subject of bad ideas when refactoring legacy code, I feel I should draw attention to what appears to be a common misunderstanding - even among us experts.

I watch a lot of screencasts where folk demonstrate how they would refactor legacy code. Typically, they start by stating bluntly that you shouldn't refactor code without automated tests.

Then they go on to do exactly that, refactoring the code to make it testable - usually by introducing some kind of dependency injection - so that they can write their first automated unit test.

Some excuse themselves from the need to re-test while they do these initial refactorings because they were using automated refactoring tools. I'm not quite sure how this urban myth got started, but let me burst that bubble right here.

At the time of writing, no refactoring tool is that reliable. Even if you're expert at using the tool, and selecting all the right options for more complex refactorings, which most of us aren't, every once in a while the tool screws up our code. And when I say "once in a while", I mean regularly.

I've learned from the school of Hard Knocks to re-test my code even after using the simplest automated refactorings. Even if those tests run slowly. Even if I have to follow manual test scripts and click the buttons myself.

The reason we fear legacy code is that it is difficult to change. It's difficult to change because it's easy to break. The time to be giving ourselves a hall pass to excuse ourselves from regression testing is not right at the start, when the code is probably at its most brittle.

In these early stages, when our priority is probably getting fast-running automated tests (i.e., unit tests) in place to enable the kind of architectural refactoring we want to do, we must approach the code with utmost care. That means we need to apply the greatest rigour.





April 19, 2013

Dark Life & New Ways Of Seeing

This article in the Guardian about how some astrobiologists theorise that there could be a "hidden biosphere" that has evolved on Earth in parallel with the tree of life from which we sprang reminded me of that age-old problem of how we can expect to find things we're not looking for.

We similarly overlooked a big chunk of the mass of the universe because we looked at electromagnetic radiation, seeing only that which emits or reflects electromagnetic waves.

In software development, we too can naively interpret our inability to see something as non-existence of the thing we can't see. Typically, we can't see it because we're not looking for it.

These could be, for example, the bugs nobody tested for. Like dark matter, and dark life, the bugs are still there, and they can still bite. But I've seen too many teams apply the strategy of not looking, as if that somehow means those bugs don't exist. This is like covering our faces and assuming that, because we can't see other people, they can't see us.

New ways of seeing are therefore vitally important. We can "see" dark matter by measuring its gravitational effects. And we could see dark life by applying tests for a wider set of biological possibilities. Then a whole new world (or universe) emerges out of the shadows, and our understanding is expanded.

Developers may believe their multithreaded code has few bugs, but that may be because they haven't tested it in multithreaded scenarios. They may believe their software is easy to use, but that may be because they haven't tested it with users who weren't involved in the design. They may believe their software is performant, but that may be because they haven't tested it under a high load. They may believe their classes are loosely coupled, but that may be because they haven't looked at a graph of class dependencies.

New ways of seeing offer up new possible understandings. And I can't help feeling we, as an industry, invest far too little in expanding our senses so we can expand our understanding of software. Too much of it is about "looking at text files", and I find that limits our vision and restricts our understanding.






April 6, 2013

Science & Software

Pairing with my apprentice-to-be, Will, on Friday, we got to chatting about the increasingly intimate relationship between software development and science.

I was reminded of something I heard at university (back in the days when we wrote our dissertations with quills). A PhD student who I hung out with had been working on a large-scale simulation of atoms in a crystal lattice to try and crack the problem of why washing powder clogs. His code was written in FORTRAN and was designed to run on the HP minicomputer - a monster of a computer, almost as powerful as my Android phone!

His research was building on work done by a previous PhD student, who had also written code to simulate the same crystals. His PhD was predicated on the assumption that he'd be able to take the existing code and adapt it for his research - in much the same way that an algorithm to traverse a tree and search for something could be adapted to traverse the same tree and search for something else.

Problem was that the existing code was impossible to understand, and the person who wrote it was long gone. My friend lost several months rewriting the simulation from scratch.

Now, this was more than two decades ago. Much physics was still done with pen and paper or in the lab. But, increasingly, more and more research was done almost entirely using software to either search through large amounts of data collected from experiments, or to simulate physical systems that would be too expensive - or even impossible - to recreate in the lab.

I'm told by friends who carried on with physics that software-based research is much more common these days. And they often share anecdotes about the trouble software causes them that sound jolly familiar. I wouldn't be surprised if a lot of research time and money is being lost to the kinds of problems we come up against daily in business when software's involved.

When I studied physics, they encouraged us to keep a diary of the lab work we did. The idea was that if, for example, we got run over by a bus, someone else could read our lab diary and continue our research. Hoorah - progress continues unabated.

The lab diary codifies your method. These days, I suspect, the method may - in at least some key cases - be codified as software. If you can't understand the software, you can't understand the method, and you can't continue the research.

Similarly, if the software is buggy, then your method is buggy, and your results and their conclusions are suspect. Cue faster-than-light neutrinos. The interactions of subatomic particles at CERN are interpreted by software. My first instinct on hearing the sensational news was "I'd like to read their code".

As science becomes more and more reliant on software, the integrity of our science will rely more and more on the integrity of our software. As yet, this is a fringe topic in physics. Universities may teach computer programming and computational maths, but they don't really help or encourage students to write software to a high-enough standard.

I can't help feeling that some element of the discipline of writing good software would benefit science students. But a science degree is already a big ask in terms of time commitment. Throwing in a day a week of "software craftsmanship" or "software engineering" may be the straw that breaks the camel's back.

I do think, though, that the model of apprenticeship I'm proposing to trial with Will (and A.N.Other, if I can find the right person) could present a solution.

This is something I'm going to give more thought to.



March 7, 2013

Intensive Test-driven Development, London April 20th

The world's best-value public TDD course is back!

I'll be running an intensive TDD workshop in central London on Saturday April 20th.

Previous public TDD workshops have sold out, and folk have traveled from as far afield as Russia and Dubai to take advantage of the amazingly low £99 price tag.

You can find out more and book here







January 12, 2013

The World-Famous Legacy Code Singleton Fudge

When refactoring legacy code that relies on static methods to access external systems - for example, for data access - our first goal is usually to make the code unit-testable. Therefore, we seek to invert that dependency on a static method to make it substitutable.

I demonstrated in previous blog posts how we do this in fairly simple situations. The dance is pretty straightforward: you turn the static method into an instance method and then do a "Find and Replace" to swap references to the class that method is on with "new ClassName().", so the target of invocation is now an instance. We then give client code a way to do the old polymorphic switcheroo by injecting that instance into, say, the constructor of the class where it's being used.

As easy as cake.
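Here's a rough sketch of where that simple version of the dance ends up, written in Java. The names (CustomerData, CustomerLookup, Greeter, fetchName) are invented for illustration, and the extracted interface is my assumption about how the substitution is done. Subclassing and overriding would do the job too.

```java
// Hypothetical interface extracted so the dependency can be swapped.
interface CustomerLookup {
    String fetchName(int customerId);
}

// Formerly a class with a static fetchName(); now an instance method.
class CustomerData implements CustomerLookup {
    public String fetchName(int customerId) {
        // Stand-in for the real call to the external database.
        throw new UnsupportedOperationException("would hit the real database");
    }
}

class Greeter {
    private final CustomerLookup customers;

    // Production code keeps working: it gets the real thing by default.
    Greeter() {
        this(new CustomerData());
    }

    // Tests (or anyone else) can inject a substitute here.
    Greeter(CustomerLookup customers) {
        this.customers = customers;
    }

    String greet(int customerId) {
        // Was: CustomerData.fetchName(customerId) - a static call.
        return "Hello, " + customers.fetchName(customerId);
    }
}
```

A unit test can now construct a Greeter with a stub, for example new Greeter(id -> "Ada"), and never go near the database.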

What often comes up, though, is a more complex and less-than-ideal situation. What if our static method is being accessed by many, many classes? We'd have to introduce dependency injection into every class that uses it, and then - if those classes are some way down the call stack - into every link in the chain to pass references from the top down to where they're used. This could be a lot of work, and I've not found a quick automated way of doing that. And what if - horror of horrors - the methods that access our static data access method are also static?

Take this example. Here's a pretty nasty data access class that offers static methods for updating and retrieving object state in an external database.



Imagine that - for reasons best known to themselves - the developers use these methods inside every single business object in their middle tier. Like:



Our goal is to get decent automated unit test assurance in place quickly, so we can then safely set about refactoring this whole can of worms properly.

In this situation, I've often used a fudge to get my tests in place. The fudge being to deliberately introduce a Singleton. (Gasp!)



I first extract a new class that contains the implementations of the Update and Fetch methods, and leave delegate methods in our original DataAccess class. Then I extract an interface from this new DataAccessImpl class (IDataAccess) so we can make those methods polymorphic.

Here comes the fudge - I then introduce a static method for setting the data access implementation at runtime, so our unit tests can set it as a mock or a stub, and our production code can set it once as the real McCoy.

Yes, pretty rank. But we've made our application logic unit-testable at a relatively low expense, and can now start writing those tests that will be the safety net for unpicking this whole mess once and for all. To unpick it first and then write the tests would be too risky, in my experience.
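For anyone who wants to see the shape of the fudge in code, here's a minimal sketch in Java. DataAccess, DataAccessImpl and IDataAccess are the names used above, with Update and Fetch lower-cased to follow Java convention. Everything else is assumed for illustration: the method signatures, the setImplementation name, the Customer business object and the in-memory stub.

```java
import java.util.HashMap;
import java.util.Map;

// Interface extracted from DataAccessImpl (signatures are assumptions).
interface IDataAccess {
    void update(String key, String state);
    String fetch(String key);
}

// The extracted class holding the real implementations.
class DataAccessImpl implements IDataAccess {
    public void update(String key, String state) {
        throw new UnsupportedOperationException("real database write goes here");
    }

    public String fetch(String key) {
        throw new UnsupportedOperationException("real database read goes here");
    }
}

// The original static facade, now delegating to a swappable instance.
class DataAccess {
    private static IDataAccess implementation = new DataAccessImpl();

    // The fudge: a static setter (hypothetical name) for tests and start-up code.
    static void setImplementation(IDataAccess newImplementation) {
        implementation = newImplementation;
    }

    static void update(String key, String state) {
        implementation.update(key, state);
    }

    static String fetch(String key) {
        return implementation.fetch(key);
    }
}

// A made-up business object that still calls the static methods, unchanged.
class Customer {
    private final String id;
    private String name;

    Customer(String id) {
        this.id = id;
        this.name = DataAccess.fetch(id);
    }

    void rename(String newName) {
        name = newName;
        DataAccess.update(id, name);
    }

    String name() {
        return name;
    }
}

// A stub for unit tests: keeps state in memory instead of a database.
class InMemoryDataAccess implements IDataAccess {
    final Map<String, String> rows = new HashMap<>();

    public void update(String key, String state) {
        rows.put(key, state);
    }

    public String fetch(String key) {
        return rows.get(key);
    }
}
```

A test calls DataAccess.setImplementation(new InMemoryDataAccess()) before exercising Customer. Bear in mind that this is shared static state, so tests need to set (or reset) the implementation themselves rather than rely on whatever the previous test left behind.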




November 21, 2012

Who's Afraid Of Legacy Code?

British society has a bit of a problem with age. We're culturally obsessed with youth, and prefer to hide away our ageing population - presumably so they don't keep reminding us that we, too, will one day get old.

This paradox leads to severe consequences for society, as we choose to ignore the facts of life and live as if old age will never come. We don't look after ourselves as well as we should, we don't plan for our futures as well as we could, and we structure our society to favour youth in many respects. That our society is gradually getting older on average, with over-65s now making up a big chunk of the population, confounds our desire to live in a youthful world. Ironically, this growing generation of senior citizens has led to successive governments pandering to the older vote at the expense of the young. Young people are now paying net for an older generation who have not adequately provided for themselves (chiefly because nobody thought people would be living as long as they are), and the "grey vote" has become such a powerful bloc that young adults of tomorrow can expect to be shouldering even more of that burden.

Software development has grown a similar paradox. We're obsessed with the shiny and the new, despite the fact that we're surrounded by a growing legacy of old code.

Nobody thought that the software they were writing back in the 90's, or the 80's, or 70's or 60's, or even the 50's, would still be in use today. And so, they didn't plan for the future we now find ourselves in.

When you open up a book or a magazine or read a blog post about software development, chances are it will be about writing new code. Aside from some noteworthy exceptions like Michael Feathers' Working Effectively With Legacy Code, most coverage of software development is about programming on a blank sheet.

This has a two-fold effect. Firstly, most developers lack the skills and the disciplines needed to maintain or add value to existing software. And secondly, most software is not written with a potentially long life in mind.

Not only do developers lack the skills for legacy code, they have a marked tendency to run a mile in the opposite direction from acquiring those skills. I run a training company, so I know how low the demand is for learning them.

Employers, too, fail to recognise the need for and the value of legacy code skills. They rarely ask for them when hiring developers, and tend not to support developers seeking to improve things in those areas. When did you last see a job advertisement asking for experience of restoring or rehabilitating old, knackered code?

This is despite the fact that most developers are working on legacy code, and that their inability to add value to it and respond to the changing needs of the business and the end users is often cited as a major barrier to business competitiveness by managers.

As Ivan Moore recently put it, legacy code is "the elephant in the room". It comprises the bulk of the work and the overall cost of software development, but occupies a minimal slice of our thoughts and our care.

In the last decade, Test-driven Development has become de rigueur. And a jolly good thing, too.

But, as I see time after time, it's entirely possible to produce legacy code doing TDD. And even if you master writing clean, maintainable code, what about all the code you've already written that's out there serving users right now? Do we just write those huge investments off to experience?

No. That would be silly, and immensely wasteful.

If only for the learning experience of dealing with the consequences of design decisions - I've yet to meet a genuinely great developer who hasn't devoted significant time to cleaning up old code - it's high time for us to really get to grips with legacy code.







November 6, 2012

Michael Feathers' Code History Mining Workshop, Jan 14

Just a quick tip for learning-hungry developers out there.

Michael Feathers will be running his code history mining workshop in London on Jan 14th. Highly recommended.

http://codehistorymining.eventbrite.co.uk/









October 28, 2012

Refactoring Legacy Code #2 - Making Web Apps More Unit-Testable

Following on from that last post about refactoring legacy classes that depend on external systems (like a database) - which has been read by literally dozens of people, and that's no idle boast - I also get asked a lot about making web applications unit-testable.

Taking classic ASP.NET as a typical example - and again using a toy but typical example - the problem is also external dependencies. When we reference ASP.NET objects like Session and Request, we tie our code to the ASP.NET process and the lifecycle of our web forms. We can't just create an instance of a web form's class and start invoking methods on the controls on our page, because outside of ASP.NET, those objects won't be there.



Our goal in making our legacy code unit-testable is to be able to test as much of the logic of our app as possible quickly and effectively, and to do this we need to isolate as much code as we can from external dependencies like these.

I'm a big believer that server pages and web forms should do as little as possible. Really, they should just be a very thin film of glue that binds the logic of user interactions and the display - which, if we think about it, is only marginally about Session and Request and HTML controls - with the meat and potatoes separated away from knowledge of those details.

I might start to refactor this by extracting the meat and potatoes, complete with ASP.NET dependencies, into its own method.



Next, if I'm looking for some way to write the order data to the page without actually referencing the page or any of its controls, I need to extract methods that I can use to delegate this work through.



Now, for the magic. You'll like this. Not a lot, but you'll like it. If I make these helper methods for writing customer data to the web form public, I can extract an interface on the form's class, and have our controlling method speak to the form through that interface.



Next, we need to tackle that reference to Session. There are many different ways of breaking this dependency, but the simplest here might be to hide it behind another extracted helper method as a stepping stone to where I want to go next.



Now, I could just extract another interface on our form's class and pass that in. But I'm guessing we may want to have a shared abstraction we can reuse in a wider set of situations. Basically, imagine we don't want to implement SetSessionVariable (and, presumably, GetSessionVariable) on every web form. So, I'm going to extract a new class, and then extract an interface on that class.
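The shape of that shared session abstraction might look something like this, sketched in Java rather than C#. The UserSession and MapBackedUserSession names are my own for illustration, and the map-backed class is only a stand-in for the class that would wrap the real ASP.NET Session object.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical interface extracted from the session wrapper class.
// The method names follow SetSessionVariable / GetSessionVariable.
interface UserSession {
    void setSessionVariable(String name, Object value);
    Object getSessionVariable(String name);
}

// Stand-in for the extracted class. In the web app it would delegate to
// the framework's session; here it wraps a plain map so the sketch runs anywhere.
class MapBackedUserSession implements UserSession {
    private final Map<String, Object> values = new HashMap<>();

    public void setSessionVariable(String name, Object value) {
        values.put(name, value);
    }

    public Object getSessionVariable(String name) {
        return values.get(name);
    }
}
```

Code that depends on UserSession rather than on the framework's session object can then be handed either implementation.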





Now we have a DisplayCustomerWithOrders method that depends only on abstractions for Session and for the web form - importantly, abstractions we control.

Next, I would extract this method into its own class. If you like, we can call it a "controller". (Let's make that one sacrifice to appease the gods of enterprise architecture.)





Now we're really getting somewhere. As it stands we could move CustomerController into a new .NET library, along with the interfaces it depends on, and this would all be unit-testable without the need to be running in the ASP.NET process.

We've got a bit of tweaking to do, first, though. For starters, if we follow the rule (not blindly, but with sound reason) that objects should be born with their collaborators, then let's refactor CustomerController along those lines, so any other controller methods we add can access the userSession and the view.

And while we're about it, we should make it possible for us to inject our DataRepository, so we can write unit tests that won't hit a real database.





We now have a controller that's isolated from the front and the back end of this application, and can be unit-tested using, for example, mock objects to check that it calls for the right customer and tells the view to set the right customer field and order values on the GUI.
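As a rough illustration of what that buys us, here's a Java sketch of a controller born with its collaborators, plus hand-rolled fakes a unit test could use in place of a mocking library. CustomerController, DataRepository and DisplayCustomerWithOrders (lower-cased here) are names from this post. The interface methods, the Customer class, the "customerName" session key and what the controller stores in the session are all assumptions.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Abstractions we control (method names are assumptions).
interface UserSession {
    void setSessionVariable(String name, Object value);
}

interface CustomerView {
    void setCustomerName(String name);
    void setOrders(List<String> orders);
}

interface DataRepository {
    Customer findCustomer(int customerId);
}

// Made-up domain object: a name and a list of order descriptions.
class Customer {
    final String name;
    final List<String> orders;

    Customer(String name, List<String> orders) {
        this.name = name;
        this.orders = orders;
    }
}

class CustomerController {
    private final UserSession userSession;
    private final CustomerView view;
    private final DataRepository repository;

    // All collaborators are injected, so none of them need be the real thing.
    CustomerController(UserSession userSession, CustomerView view, DataRepository repository) {
        this.userSession = userSession;
        this.view = view;
        this.repository = repository;
    }

    void displayCustomerWithOrders(int customerId) {
        Customer customer = repository.findCustomer(customerId);
        userSession.setSessionVariable("customerName", customer.name);
        view.setCustomerName(customer.name);
        view.setOrders(customer.orders);
    }
}

// Hand-rolled fakes that remember what the controller told them.
class FakeUserSession implements UserSession {
    final Map<String, Object> values = new HashMap<>();

    public void setSessionVariable(String name, Object value) {
        values.put(name, value);
    }
}

class FakeCustomerView implements CustomerView {
    String shownName;
    List<String> shownOrders = new ArrayList<>();

    public void setCustomerName(String name) {
        shownName = name;
    }

    public void setOrders(List<String> orders) {
        shownOrders = new ArrayList<>(orders);
    }
}
```

A test builds the controller with the fakes and a stub repository, calls displayCustomerWithOrders, and then checks what ended up in the fake view and the fake session. No ASP.NET process and no database are involved.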

A little bit of clean-up in our web form's class, just to tie up the loose ends...



The observant among you will have noticed that our refactored ASP.NET web form class is not smaller than it was. This is because my example is very simple in terms of business and control logic, and also because we only had one event to deal with. If this web form had multiple event handlers, and our business logic was more sophisticated, like in a real application, then the ratio of unit-testable code to web form code would normally start to tip in our favour.

It's often feasible for 90% or more of our code to end up in unit-testable classes when we abstract away the external stuff like GUIs and databases, and make them substitutable for testing and other purposes.

Again, while all this refactoring was going on, I was disciplined enough to run a basic Selenium test script after each individual step to make sure the app was still working. But at the earliest opportunity, I would start writing unit tests to check the logic. Selenium's dandy and all, but when you have 10,000 business rules to check, testing them through a web browser requires a lot of down-time.