August 6, 2012


Back To Basics #9 - Automate The Donkey Work

This is the ninth in a series of ten posts about basic principles for software developers aimed at side-stepping the flim-flam and hifalutin hyperbole so, hopefully, the dog can see the rabbit.

I don't know about you, but I'm not a big fan of mindless, repetitive tasks.

In software development, we find that there are some activities we end up repeating many times.

Take testing, for example. An averagely complicated piece of software might require us to perform thousands of tests to properly ensure that every line of code is doing what it's supposed to. That can spell weeks of clicking the same buttons, typing in the same data etc etc, over and over again.

If we only had to test the software once, then it wouldn't be such a problem. Yeah, it'll be a few dull weeks, but when it's over, the champagne corks are popping.

Chances are, though, that it won't be the only time we need to perform those tests. If we make any changes to the software, there's a real chance that features that we tested once and found to be working might have been broken. So when we make changes after the software's been tested once, it will need testing again. Now we're breaking a real sweat!

Some inexperienced teams (and, of course, those experienced teams who should know better) try to solve this problem by preventing changes after the software's been tested.

This is sheer folly, though. By preventing change, we prevent learning. And when we prevent learning, we usually end up preventing ourselves from solving the customer's problems, since software development is a learning process.

The other major drawback to relying on repeated manual testing is that it can take much longer to find out if a mistake has been made. The longer a mistake goes undetected, the more it costs to fix (potentially by orders of magnitude).

A better solution to repeated testing is to write computer programs that execute those tests for us. These could be programs that click buttons and input data like a user would, or programs that call functions inside the software to check the internal logic is correct or that the communication between different pieces of the software is working as we'd expect.
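To make the second kind of test program concrete, here is a minimal sketch of a test that calls a function inside the software and checks its internal logic. It uses Python's built-in unittest module. The function order_total and its discount rule are invented purely for illustration; they are not taken from any real application.

```python
import unittest


def order_total(prices, discount_percent=0):
    """Hypothetical function under test: sum the prices, then apply a discount."""
    if not 0 <= discount_percent <= 100:
        raise ValueError("discount_percent must be between 0 and 100")
    subtotal = sum(prices)
    return round(subtotal * (100 - discount_percent) / 100, 2)


class OrderTotalTest(unittest.TestCase):
    """Each test replaces a check someone would otherwise repeat by hand."""

    def test_total_with_no_discount(self):
        self.assertEqual(order_total([10.00, 5.50]), 15.50)

    def test_total_with_discount(self):
        self.assertEqual(order_total([100.00], discount_percent=25), 75.00)

    def test_rejects_impossible_discount(self):
        with self.assertRaises(ValueError):
            order_total([10.00], discount_percent=150)


def run_tests():
    """Run the whole suite and return the result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(OrderTotalTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)


if __name__ == "__main__":
    run_tests()
```

Once tests like these exist, re-running all of them after every change costs a fraction of a second rather than an afternoon of clicking.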

How much testing you should automate depends on a range of factors.

Writing automated test programs that perform user actions tends to be expensive and time-consuming, so you may decide to automate some key user interface tests, and then rely more on automating internal ("unit") tests - which can be cheaper to write and often run much faster - to really put the program through its paces.

If time's tight, you may choose to write more automated tests for parts of the software that present the greatest risk, or have the greatest value to the customer.

Automating tests can require a big investment, but can pay significant dividends throughout the lifetime of the software. Testing that might take days by hand might only take a few minutes if done by a computer program. You could go from testing once every few weeks to testing several times an hour. This can be immensely valuable in a learning process that aims to catch mistakes as early as possible.

Basic Principle #7 states that software that can't be put to use has no value. Here's another obvious truism for you: while software's being tested, we can't be confident that it's fit for use.

Or, to use more colourful language, anyone who releases software before it's been adequately tested is bats**t crazy.

If it takes a long time to test your software, then there'll be long periods when you don't know if the software can be put to use, and if your customer asked you to release it, you'd either have to tell them to wait or you'd release it under protest. (Or just don't tell them it might not work and brace yourself for the fireworks - yep, it happens.)

If we want to put the customer in the driving seat on decisions about when to release the software - and we should - then we need to be able to test the software quickly and cheaply so we can do it very frequently.

Repeating tests isn't the only kind of donkey work we do. Modern software is pretty complicated. Even a "simple" web application can involve multiple parts, written in multiple programming languages, that must be installed in multiple technology environments that each have their own way of doing things.

Imagine, say, a Java web application. To put it into use, we might have to compile a bunch of Java program source files and package up the executable files created by compilation into an archive (like a ZIP file) for deploying to a Java-enabled web server like the Apache Software Foundation's Tomcat. Along with the machine-ready (well, Java Virtual Machine-ready) executable files, a bunch of other source files need to be deployed, such as HTML templates for web pages, and files that contain important configuration information that the web application needs. It's quite likely that the application will store data in some kind of structured database, too. Making our application ready for use might involve running scripts to set up this database, and if necessary to migrate old data to a new database structure.

This typical set-up would involve a whole sequence of steps when doing it by hand. We'd need to get the latest tested (i.e. working) version of the source files from the team's source code repository. We'd need to compile the code. Then package up all the executable and supporting files and copy them across to the web server (which we might need to stop and restart afterwards). Then run the database scripts. And then, just to be sure, run some smoke tests - a handful of simple tests just to "kick the tyres", so to speak - to make sure that what we've just deployed actually works.

And if it doesn't work, we need to be able to put everything back just the way it was (and smoke test again to be certain) as quickly as possible.
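The deploy, smoke test and roll back sequence described above can be sketched in a few lines. This is only an outline under stated assumptions: the function deploy and the steps passed to it are hypothetical, and each step here is a stand-in that touches a dictionary rather than a real compiler, web server or database.

```python
def deploy(steps, smoke_test, rollback):
    """Run each deployment step in order, then smoke test.

    If any step or the smoke test fails, roll back and smoke test again.
    Returns True if the new version is live, False if we rolled back.
    """
    try:
        for name, step in steps:
            print(f"running: {name}")
            step()
        if smoke_test():
            return True
        print("smoke test failed")
    except Exception as error:
        print(f"step failed: {error}")

    print("rolling back")
    rollback()
    # Smoke test again to be certain everything is back the way it was.
    if not smoke_test():
        raise RuntimeError("rollback did not restore a working system")
    return False


if __name__ == "__main__":
    server = {"version": "1.0"}  # stand-in for a real web server
    working_versions = {"1.0"}   # pretend version 2.0 has a bug

    steps = [
        ("fetch latest tested source", lambda: None),
        ("compile and package", lambda: None),
        ("copy to web server", lambda: server.update(version="2.0")),
        ("run database scripts", lambda: None),
    ]
    live = deploy(
        steps,
        smoke_test=lambda: server["version"] in working_versions,
        rollback=lambda: server.update(version="1.0"),
    )
    print(f"new version live: {live}, serving {server['version']}")
```

In a real script each lambda would be replaced by a call to whatever build tool, file copy or database migration the team actually uses. The shape stays the same: ordered steps, a smoke test, and a way back.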

When we're working in teams, with each developer working on different pieces of the software simultaneously, we would also follow a similar procedure (but without releasing the software to the end users) every time we integrated our work into the shared source code repository, so we could be sure that all the individual pieces work correctly together and that any changes we've made haven't inadvertently impacted on changes someone else has been making.

So we could be repeating this sequence of steps many, many times. This is therefore another great candidate for automation. Experienced teams write what we call "build scripts" and "deployment scripts" to do all this laborious and repetitive work for us.

There are many other examples of boring, repetitive and potentially time-consuming tasks that developers should think about automating - like writing programs that automatically generate the repetitive "plumbing" code that we often have to write in many kinds of applications these days (for example, code that reads and writes data to databases can often end up looking pretty similar, and can usually be inferred automatically from the data structures involved).
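As a rough illustration of inferring database "plumbing" from a data structure, the sketch below builds SQL statements from the fields of a Python dataclass. The Customer class, the helper names insert_sql and select_sql, and the convention that the table is named after the class are all assumptions made for this example.

```python
import sqlite3
from dataclasses import astuple, dataclass, fields


@dataclass
class Customer:  # hypothetical data structure
    id: int
    name: str
    email: str


def insert_sql(record_type):
    """Infer a parameterised INSERT statement from a dataclass's fields."""
    columns = [f.name for f in fields(record_type)]
    placeholders = ", ".join("?" for _ in columns)
    table = record_type.__name__.lower()  # assumed naming convention
    return f"INSERT INTO {table} ({', '.join(columns)}) VALUES ({placeholders})"


def select_sql(record_type, key="id"):
    """Infer a SELECT-by-key statement from the same field list."""
    columns = ", ".join(f.name for f in fields(record_type))
    table = record_type.__name__.lower()
    return f"SELECT {columns} FROM {table} WHERE {key} = ?"


if __name__ == "__main__":
    # Round trip through an in-memory database using the generated SQL.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, email TEXT)")
    conn.execute(insert_sql(Customer), astuple(Customer(1, "Ada", "ada@example.com")))
    row = conn.execute(select_sql(Customer), (1,)).fetchone()
    print(Customer(*row))
```

Add a field to Customer and both statements pick it up automatically, which is exactly the kind of repetitive hand-editing we're trying to avoid.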

We need to be vigilant for repetition and duplication in our work as software developers, and shrewdly weigh up the pros and cons of automating the work to save us time and money in the future.
