August 6, 2018

Learn TDD with Codemanship

Agile Baggage

In the late 1940s, a genuine mystery gripped the world as it rebuilt after WWII. Thousands of eye witnesses - including pilots, police officers, astronomers, and other credible observers - reported seeing flying objects that had performance characteristics far beyond any known natural or artificial phenomenon.

These "flying saucers" - as they became popularly known - were the subject of intense study by military agencies in the US, the UK and many other countries. Very quickly, the extraterrestrial hypothesis - that these objects were spacecraft from another world - caught the public's imagination, and "flying saucer" became synonymous with Little Green Men.

In an attempt to outrun that pop culture baggage, serious studies of these objects adopted the less sensational term "Unidentified Flying Object". But that, too, soon became shorthand for "alien spacecraft". These days, you can't be taken seriously if you study UFOs, because it lumps you in with some very fanciful notions, and some - how shall we say? - rather colorful characters. Scientists don't study UFOs any more. It's not good for the career.

These days, scientific studies of strange lights in the sky - like the Ministry of Defence's Project Condign - use the term Unidentified Aerial Phenomena (UAP) in an attempt to outrun the cultural baggage of "UFOs".

The fact remains, incontrovertibly, that every year thousands of witnesses see things in the sky that conform to no known physical phenomena, and we're no closer to understanding what it is they're seeing after 70 years of study. The most recent scientific studies, in the last 3 decades, all conclude that a portion of reported "UAPs" are genuine unknowns, that they are of real defence significance, and worthy of further scientific study. But well-funded studies never seem to materialise, because of the connotation that UFOs = Little Green Men.

The well has been poisoned by people who claim to know the truth about what these objects are, and they'll happily reveal all in their latest book or DVD - just £19.95 from all good stores (buy today and get a free Alien Grey lunch box!). If these people would just 'fess up that, in reality, they don't know what they are, either - or, certainly, they can't prove their theories - the scientific community could get back to trying to find out, like they attempted to in the late 1940s and early 1950s.

Agile Software Development ("agile" for short) is also now dragging a great weight of cultural baggage behind it, much of it generated by a legion of people also out to make a fast buck by claiming to know the "truth" about what makes businesses successful with technology.

Say "agile" today, and most people think you're talking about Scrum (and its scaled variations). The landscape is very different to 2001, when the term was coined at a ski resort in Utah. Today, there are about 20,000 agile coaches in the UK alone. Two thirds of them come from non-technical backgrounds. Like the laypeople who became "UFO researchers", many agile coaches apply a veneer of pseudoscience to what is - in essence - a technical pursuit.

The result is an appearance of agility that often lacks the underlying technical discipline to make it work. Things like unit tests, continuous integration, design principles, refactoring: they're every bit as important as user stories and stand-up meetings and burndown charts.

Many of us saw it coming years ago. Call it "frAgile", "Cargo Cult agile", or "WAgile" (Waterfall-Agile) - it was on the cards as soon as we realised Agile Software Development was being hijacked by management consultants.

Post-agilism was an early response: an attempt to get back to "doing what works". Software Craftsmanship was a more defined reaction, reaffirming the need for technical discipline if we're to be genuinely responsive to change. But these, too, accrued their baggage. Software craft today is more of a cult of personality, dominated by a handful of the most vocal proponents of what has become quite a narrow interpretation of the technical disciplines of writing software. Post-agilism devolved into a pseudo-philosophical talking shop, never quite getting down to the practical detail. Their wells, too, have been poisoned.

But teams are still delivering software, and some teams are more successfully delivering software than others. Just as with UFOs, beneath the hype, there's a real phenomenon to be understood. It ain't Scrum and it ain't Lean and it certainly ain't SAFe. But there's undeniably something that's worthy of further study. Agile has real underlying insights to offer - not necessarily the ones written on the Manifesto website, though.

But, to outrun the cultural baggage, what shall we call it now?




July 5, 2018

The Grand Follies of Software Development

Just time for a few thoughts on software development's grand follies - things many teams chase that tend to make things worse.

Scale - on and on and on we go about scaling up or scaling out our software systems to handle millions of users and tens of thousands of requests every second. By optimising our architectures to work on Facebook scale, or Netflix scale, we potentially waste a lot of time and money and opportunities to get a product out there by doing something much simpler. The bottom line is that almost all software will never need to work on that scale, just like almost every person will never need a place to moor their $120 million yacht. If you're ever lucky enough to have that problem, good for you! Facebook and the others solved their scaling problems when they needed to, and they had the resources to do it because of their enormous scale.

Likewise the trend for scaling up software development itself. Organisations that set out to build large products - millions or tens of millions of lines of code - are going about it fundamentally arse-backwards. If you look at big software products today, they typically started out as small software products. Sure, MS Word today is over 10M LOC, but Word 1.0 was tens of thousands of lines of code. That original small team created something useful that became very popular, and it grew incrementally over time. Nature handles complexity very well, where design is concerned. It doesn't arrive at something like the human brain in a single step. Like Facebook and their scaling problems, Microsoft crossed that bridge when they got to it, by which time they had the money to crack it. And it takes a lot of money to create a new version of Word. There's no economy of scale, and at the scale they do it now, very little latitude for genuine innovation. Microsoft's big experiments these days are relatively small, like they always had to be. Focus on solving the problems you have now.

That can be underpinned by a belief that some software systems are irreducibly complex - that a Word processor would be unusable without the hundreds of features of MS Word. Big complex software, in reality, starts as small simple software and grows. Unless, of course, we set out to reproduce software that has become big and complex. Which is fine, if that's your business model. But you're going to need a tonne of cash, and there are no guarantees yours will fare better in the market. So it's one heck of a gamble. Typically, such efforts are funded by businesses (or governments) with enormous resources, and they usually fail spectacularly. Occasionally we hear about them, but a keenness to manage their brand means most get swept under the carpet - which might explain why organisations continue to attempt them.

Reuse - oh, this was a big deal in the 90s and early noughties. I came across project after project attempting to build reusable components and services that the rest of the organisation could stitch together to create working business solutions. Such efforts suffered from spectacular levels of speculative generality, trying to solve ALL THE PROBLEMS and satisfy such a wide range of use cases that the resulting complexity simply ran away from them. We eventually - well, some of us, anyway - learned that it's better to start by building something useful. Reuse happens organically and opportunistically. The best libraries and frameworks are discovered lurking in the duplication inside and across code bases.

"Waste" - certain fashionable management practices focus on reducing or eliminating waste from the software development process. Which is fine if we're talking about building every developer their own office complex, but potentially damaging if we're talking about eliminating the "waste" of failed experiments. That can stifle innovation and lead - ironically - to the much greater waste of missed opportunities. Software's a gamble. You're gonna burn a lot of pancakes. Get used to it, and embrace throwing those burned pancakes away.

Predictability - alongside the management trend for "scaling up" the process of innovation comes the desire to eliminate the risks from it. This, too, is an oxymoron: innovation is inherently risky. The bigger the innovation, the greater the risk. But it's always been hard to get funding for risky ventures. Which is why we tend to find that the ideas that end up being greenlit by businesses are typically not very innovative. This is because we're still placing big bets at the crap table of software development, and losing is not an option. Instead of trying to reduce or eliminate risk, businesses should be reducing the size of their bets and placing more of them - a lot more. This is intimately tied to our mad desire to do everything at "enterprise scale". It's much easier to innovate with lots of small, independent teams trying lots of small-scale experiments and rapidly iterating their ideas. Iterating is the key to this process. So much of management theory in software development is about trying to get it right first time, even today. It's actually much easier and quicker and cheaper to get it progressively less wrong. And, yes, like natural evolution, there will be dead ends. The trick is to avoid falling to the Sunk Cost fallacy of having invested so much time and money in that dead end that you feel compelled to persist.

"Quick'n'dirty" - I shouldn't need to elaborate on this. It's one of the few facts we can rely on in software development. In the vast majority of cases, development teams would deliver sooner if they took more care. And yet, still, we fall for it. Start-ups especially have this mindset ("move fast and break things"). Note that, over time, the most successful tech start-ups tend to abandon this mentality. And, yes, I am suggesting that this way of thinking is a sign of a dev organisation's immaturity. There. I've said it.



June 27, 2018

Team Craft

We're a funny old lot, software developers.

90% of us are working on legacy code 90% of the time, and yet I can only think of one book about working with legacy code that's been published in the last 20 years.

We spend between 50%-80% of our time reading code, and yet I can only think of a couple of books about writing code that's easier to understand that have ever been published.

We have a problem with our priorities, it would seem. And maybe none more so than in the tiny amount of focus we place on how we work together as teams to get shit done.

Our ability to work together, to communicate, to coordinate, to build shared understanding and reach shared decisions and to make stuff happen - I call it Team Craft - rarely gets an airing in books, training courses and conferences.

In my TDD workshop, we play a little game called Evil FizzBuzz. If you've applied for a developer job in recent years, you may well have been asked to do the FizzBuzz coding exercise. It's a trivial problem - output a list of integers from 1 to 100, replace any that are divisible by 3 with "Fizz", any that are divisible by 5 with "Buzz", and any that are divisible by 3 and 5 with "FizzBuzz". Simple as peas.

I made it "evil" by splitting the rules up and requiring that individual pairs only work on code for their rule. (e.g., they can only work on generating a sequence from 1..100, or only on replacing numbers with Fizz, or Buzz etc).

They must coordinate their efforts to produce a single unified solution that passes my customer acceptance test - a complete comma-delimited sequence of the required length, with the numbers, the Fizzes, the Buzzes and FizzBuzzes in the right place. This is an exercise - superficially - in Continuous Integration. But, it turns out, it exercises far more than that.
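To make the split concrete, here's a minimal sketch in Python of how the rules might be divided into separately owned pieces and then integrated. The function names and the exact way the work is divided are my assumptions for illustration, not the workshop's actual rules or any team's solution.

```python
# Each function stands in for the slice of the problem one pair might own.
# The split and the names are illustrative assumptions.

def sequence(start=1, end=100):
    """Generate the integers from start to end inclusive."""
    return list(range(start, end + 1))


def fizz(n, current):
    """Replace multiples of 3 with 'Fizz', leaving other values untouched."""
    return "Fizz" if n % 3 == 0 else current


def buzz(n, current):
    """Replace multiples of 5 with 'Buzz', or 'FizzBuzz' if Fizz already applied."""
    if n % 5 != 0:
        return current
    return "FizzBuzz" if current == "Fizz" else "Buzz"


def fizzbuzz(start=1, end=100):
    """Integrate the separate rules into one comma-delimited string."""
    items = []
    for n in sequence(start, end):
        value = str(n)
        value = fizz(n, value)
        value = buzz(n, value)
        items.append(value)
    return ",".join(items)
```

Even in this tiny version, the pieces only work together if the pairs agree on the order the rules run in and on what each rule receives and returns. That agreement is the part that tends to go missing.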

An average developer can complete FizzBuzz in less than 30 minutes. An average team can't complete it in under an hour. No, seriously. 9 out of 10 teams who attempt it don't complete it. Go figure!

Watching teams attempt Evil FizzBuzz is fascinating. The first observation I've made - from dozens of teams who've tried it - is that the individual technical skills of the developers on the team appear to have little bearing on how they'll fare.

FizzBuzz is easy. It doesn't require strong Code Fu. And yet, somehow, it defeats 90% of teams. There must be something else at play here; some other skillset outside of coding and unit testing and refactoring and Git and wotnot that determines how a team will perform.

Over the years since it was introduced, I've developed an instinct for which teams will crack it. I can usually tell within the first 10 minutes if they're going to complete Evil FizzBuzz within the hour, just by looking at the way they interact.

Here are the most typical kinds of rocks I've seen teams' ships dashed on trying to complete Evil FizzBuzz.

1. Indecision - 45 minutes in and the team is still debating options. Should we do it in Java or JavaScript? Jenkins or TeamCity? NUnit or xUnit.net? Making affirmative decisions as a group is a hard skill. But it can be learned. There are various models for group decision making - from a show of hands to time-boxed A/B experiments to flipping a coin. I maintain that the essence of agility is that ability to make effective decisions quickly and cheaply and move on.

2. Priorities - the team spends 30 minutes discussing the design, and then someone starts to think about setting up the GitHub repository and a CI server.

3. Forgetting They're In a Team - I see this one a lot. For example, someone sets up a repository, then forgets to invite the rest of the team to contribute to it. Or - and this is my favourite - someone writes their code in a totally different set of project files, only realising too late that their bit isn't included in the end product. To coordinate efforts in such a small solution space, developers need to be hyper-aware of what the rest of the team are doing.

4. Trying To Win The Argument Instead Of The Game - as with 1-3, this is also very common on development teams. We get bogged down in trying to "win" the debate about what language we should use or whether we should use the Chain of Responsibility design pattern or go for tabs or spaces, and completely lose sight of what we're setting out to achieve in the first place. This effect seems to escalate the more technically strong individuals on the team are. Teams of very senior developers or software architects tend to crash and burn more frequently than teams of average developers. We've kind of made this rod for our own backs, as a profession. Career advancement tends to rely more on winning arguments than achieving business goals. Sadly, life's like that. Just look at the people who end up in boardrooms or in government: prepared for leadership in the debating societies of our top schools and colleges. Organisations where that isn't part of the culture tend to do much better at Evil FizzBuzz.

5. All Talk, No Code, No Pictures - the more successful teams get around a whiteboard and visualise what they're going to do. They build a better shared understanding, sooner. The teams who stand around in a circle talking about it invariably end up with every pair walking away with a different understanding, leading to the inevitable car crash at the end. It's especially important for each pair to understand how their part fits in with the whole. The teams that do best tend to agree quickly on how the parts will interact. I've known this for years: the key to scaling up development is figuring out the contracts early. Use of stubs and mocks can help turn this into an explicit executable understanding. Also, plugging their laptops into the projector and demonstrating what they intend is always an option - but one that few teams take up. To date, no team has figured out that Mob Programming is allowed by the rules of the exercise, but a couple of teams came close in their use of the available technology in the room.

6. Focus On Plans, Not Goals - It all seems to be on track; with 5 minutes to go the team are merging their respective parts, only to discover at the very last minute that they haven't solved the problem I set them. Because they weren't setting out to. They came up with a plan, and focused on executing that plan. The teams that crack it tend to revisit the goals continually throughout the exercise. Does this work? Does this work? Does this work? Equally, teams who get 30 minutes in and don't realise they've used 50% of their time show a lack of focus on getting the job done. I announce the time throughout, to try and make them aware. But I suspect often - when they've got their heads down coding and are buried in the plan - they don't hear me. The teams who set themselves milestones - e.g. by 20 minutes we should have a GitHub repository with everyone contributing and a CI server showing a green build so we can start pushing - tend to do especially well.
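On the point in item 5 about figuring out the contracts early and using stubs and mocks to make them executable, here's a rough sketch in Python of what that could look like. The `Rule` signature, `render`, `stub_rule` and `recording_rule` are all hypothetical names I've made up for illustration; a real team would agree their own.

```python
from typing import Callable, List

# The "contract" is just an agreed function signature (an assumption here):
# a rule takes the number and the current text, and returns the new text.
Rule = Callable[[int, str], str]


def render(numbers: List[int], rules: List[Rule]) -> str:
    """Integration code owned by one pair; it depends only on the Rule contract."""
    items = []
    for n in numbers:
        text = str(n)
        for rule in rules:
            text = rule(n, text)
        items.append(text)
    return ",".join(items)


def stub_rule(n: int, current: str) -> str:
    """Stub standing in for another pair's unfinished rule: it changes nothing."""
    return current


def recording_rule(calls: list) -> Rule:
    """Hand-rolled mock: records how the integration code calls a rule."""
    def rule(n: int, current: str) -> str:
        calls.append((n, current))
        return current
    return rule
```

With something like this in place, the pair writing `render` can test it against `stub_rule` long before anyone else's code exists, and the pairs writing real rules know exactly what shape their function has to be.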

From long experience on real teams, I've observed relationships between these elements of Team Craft. Teams that lack clear objectives tend to consume themselves with internal debate and "pissing contests". It also tends to make prioritising nigh-on impossible. Tabs vs spaces matters a lot more when you think you have infinite time to debate it. Lack of visualisation of what we're going to do - or attempt to do - tends to lead to less awareness of the team, and less effective coordination. And all of these factors combined tend to lead to an inability to make shared decisions when they're needed.

But before you conclude from this that the individual technical skills don't matter, I need to tell you about the final rule of Evil FizzBuzz: once the build goes green for the first time, it must not go red again. Breaking the build means disqualification. (Hey, it's an exercise in Continuous Integration...)

A few teams get dashed on those rocks, and the lesson from that is that technical discipline does matter. How we work together as teams is crucial, but potentially all for nought if we don't take good care of the fundamentals.






June 21, 2018

Adopting TDD - The Codemanship Roadmap

I've been doing Test-Driven Development for 20 years, and helping dev teams to do it for almost as long. Over that time I've seen thousands of developers and hundreds of teams try to adopt this crucial enabling practice. So I've built a pretty clear picture of what works and what doesn't when you're adopting TDD.

TDD has a steep learning curve. It fundamentally changes the way you approach code, putting the "what" before the "how" and making us work backwards from the question. The most experienced developers, with years of test-after, find it especially difficult to rewrite their internal code to make it comfortable. It's like learning to write with your other hand.

I've seen teams charge at the edifice of this learning curve, trying to test-drive everything from Day #1. That rarely works. Productivity nosedives, and TDD gets jettisoned at the next urgent deadline.

The way to climb this mountain is to ascend via a much shallower route, with a more gentle and realistic gradient. You will most probably not be test-driving all your code in the first week. Or the first month. Typically, I find it takes 4-6 months for teams to get the hang of TDD, with regular practice.

So, I have a recommended Codemanship Route To TDD which has worked for many individuals and teams over the last decade.

Week #1: For teams, an orientation in TDD is a really good idea. It kickstarts the process, and gets everyone talking about TDD in practical detail. My 3-day TDD workshop is designed specifically with this in mind. It shortcuts a lot of conversations, clears up a bunch of misconceptions, and puts a rocket under the team's ambitions to succeed with TDD.

Week #2-#6: Find a couple of hours a week, or 20 minutes a day, to do simple TDD "katas", and focus on the basic Red-Green-Refactor cycle, doing as many micro-iterations as you can to reinforce the habits.

Week #7-#11: Progress onto TDD-ing real code for 1 day a week. This could be production code you're working on, or a side project. The goal for that day is to focus on doing it right. The other 4 days of the week, you can focus on getting stuff done. So, overall, your productivity maybe only dips a bit each week. As you gain confidence, widen this "doing it right" time.

Week #12-#16: By this time, you should find TDD more comfortable, and no longer struggle to remember what you're supposed to do and when. Your mind is freed up to focus on solving the problem, and TDD is becoming your default way of working. You'll be no less productive TDD-ing than you were before (maybe even more productive), and the code you produce will be more reliable and easier to change.

The Team Dojo: Some teams are keen to put their new TDD skills to the test. An exercise I've seen work well for this is my Team Dojo. It's a sufficiently challenging problem, and really works on those individual skills as well as collaborative skills. Afterwards, you can have a retrospective on how the team did, examining their progress (customer tests passed), code quality and the discipline that was applied to it. Even in the most experienced teams, the dojo will reveal gaps that need addressing.

Graduation: TDD is hard. Learning to test-drive code involves all sorts of dev skills, and teams that succeed tell me they feel a real sense of achievement. It can be good to celebrate that achievement. Whether it's a party, or a little ceremony or presentation, when organisations celebrate the achievement with their dev teams, it shows real commitment to them and to their craft.
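For anyone wondering what a simple kata from the Week #2-#6 stage might look like, here's a minimal sketch in Python using the standard `unittest` module. The leap year problem is my choice of example, not one prescribed by the roadmap above, and the comments describe the order a practitioner would hypothetically have worked in.

```python
import unittest


def leap_year(year: int) -> bool:
    """The simplest code that passes the tests below, tidied up afterwards
    (the 'green' and 'refactor' steps)."""
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


class LeapYearTest(unittest.TestCase):
    # In a real kata, each test is written first and seen to fail
    # (the 'red' step) before any code is added to make it pass.

    def test_year_divisible_by_4_is_leap(self):
        self.assertTrue(leap_year(2016))

    def test_century_is_not_leap(self):
        self.assertFalse(leap_year(1900))

    def test_year_divisible_by_400_is_leap(self):
        self.assertTrue(leap_year(2000))

    def test_other_years_are_not_leap(self):
        self.assertFalse(leap_year(2018))
```

Saved to a file, this runs with `python -m unittest <filename>`. The value of the kata is in the rhythm of one failing test, one small change and one tidy-up, repeated many times, rather than in the finished function.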

Of course, you don't have to do it my way. What's important is that you start slow and burn your pancakes away from the spotlight of real projects with real deadlines. Give yourself the space and the safety to get it wrong, and over time you'll get it less and less wrong.

If you want to talk about adopting TDD on your team, drop me a line.




June 10, 2018

Only This Week - Save Up To 65% On Codemanship Training




For one week only, we’re offering a veritable picnic of on-site code craft training at never-to-be-repeated prices.

Save up to 65%, and train your developers in key skills like TDD, refactoring and OO design for as little as £40 per person per day. That’s full, action-packed hands-on days of code craft training.

Book any Codemanship training course before June 17th and save a whopping 50%. Book all four of our courses and save 65%. That’s a massive £12,000.


Find out more by visiting www.codemanship.com




June 8, 2018

The Entire Codemanship TDD Course Book - Absolutely Free

Changes are afoot with my code craft training and coaching company, Codemanship, and as part of that, I'm making my 222-page TDD course book available to download as a spiffy full-colour PDF for free.



It covers everything from the basics of Red-Green-Refactor, through software design principles to apply to your growing code, all the way up to advanced topics other TDD books and courses don't reach, like mutation testing, property-based and data-driven testing and Continuous Inspection. Many people who've read the book have commented on how straightforward and to-the-point it is. Shorter than most TDD/code craft books, but covers more, all in practical detail.

Of course, to get the best from the book, you should try the exercises.

Better still, try the exercises with the guy who wrote the book in the room to guide you.





May 25, 2018

Ever-Decreasing Cycles - I Called It Right

I'm right about something roughly once in a decade, if I'm lucky. Looking back over 13 years of blog posts, I nominate this little gem as a candidate for "That Thing I Called Right", which predicted that - as our computers grew ever more powerful - continuous background code review would become a thing.

The progression seemed perfectly logical. At the time I wrote it, we'd seen the advent of continuous background code compilation, giving us instant feedback when we make silly syntax errors. Younger developers may not be aware of just what a difference that made to those of us who remember when compiling the code involved going away to get a coffee (or lunch, or dinner and a show). So much time saved!

With less brain power dedicated to "does it run?", we were freed up to think about a higher question: does it work? In 2008, continuous background testing tools like Infinitest and JUnitMax were becoming more popular. Today, I see them quite widely used, and can easily foresee a time when we're all using them within the next decade.

So we've progressed from "does it run?" to "does it work?" as our computers have increased their processing power, and the next evolution I predicted was to continuously ask "will it be easy to change?" At the time, the majority of code analysis tools took too long to do what they did to be running continuously in the background alongside compilation and functional testing. (There were one or two adventurous experimental tools, but we haven't heard much from them in the meantime.)

With Microsoft's Roslyn compiler, continuous background code review is now finally a thing. We can write code quality checks and build them into the compilation pipeline, creating feedback on things like variable names, method size and complexity, couplings, and all that stuff we care about for maintainability, in real time, as we type the code. I suspect such a capability will be added to other compiler platforms in the next decade or so.
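To give a flavour of the kind of check I mean, here's a small sketch. To be clear, this is not Roslyn and not C#: it's an analogous toy in Python using the standard `ast` module (it relies on `end_lineno`, so Python 3.8 or later), and the thresholds and the `review` function are assumptions I've picked purely for illustration.

```python
import ast

MAX_FUNCTION_LINES = 10  # assumed threshold, purely illustrative
MIN_NAME_LENGTH = 3      # assumed threshold, purely illustrative


def review(source: str) -> list:
    """Return human-readable warnings about long functions and short names."""
    warnings = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.FunctionDef):
            # Method size: count the lines the function spans.
            length = node.end_lineno - node.lineno + 1
            if length > MAX_FUNCTION_LINES:
                warnings.append(f"{node.name}: {length} lines long")
        elif isinstance(node, ast.Name) and isinstance(node.ctx, ast.Store):
            # Variable names: flag very short names where they are assigned.
            if len(node.id) < MIN_NAME_LENGTH:
                warnings.append(f"line {node.lineno}: short name '{node.id}'")
    return warnings
```

A check like this is cheap enough to run on every keystroke for a single file. The harder number-crunching comes with checks that need to look across a whole code base, like coupling.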

Sure, it's still early days, and my experiments with it suggest computing power needs maybe one or two more iterations to rise to meet the number-crunching challenge, but in a practical form that we can begin using today - just like those plucky pioneers who ventured out with Infinitest in the early days - it's here. There'll be a learning curve. Start climbing it now, is my recommendation.

My hope for continuous background code review is that it will yet again free up our minds to focus on more important questions, like "is this what they really need?"

And that will be a great day for software.


* And, yes, I had hoped I'd been right about high-integrity software becoming mainstream, but interest in that has flat-lined these past 20 years. Maybe next year... Ho hum.



April 28, 2018

8 Rules of Maintainable Code: A Handy Cut-Out-And-Keep Chart

If you've been on the Codemanship TDD training course, you may vaguely recall the first afternoon when we discuss design principles and how they can shape our code as it emerges.

I posit 8 principles that I ask participants to apply to the exercises, drawing from Simple Design, "Tell, Don't Ask" and S.O.L.I.D. These 8 factors are interrelated, and form a kind of - if somewhat complex - virtuous circle.

Code that's easier to change tends to be easier to test quickly. Fast-running tests make refactoring easier. Which helps us make our code easier to change. And around we go.

We don't do slides on the course (hoorah!), but I'm trying this morning to visualise these 8 principles and how they relate to each other in a single graphic.

There's the simple version:



And this is my latest iteration, to print off and hang on your toilet wall or put on a spiffy t-shirt. All non-profit uses are fine.



Going beyond maintainability, there's also a relationship between Clean code and reliability. Code that can be tested very quickly tends to have far fewer bugs. And code that's simpler and easier to understand is less likely to get broken when we change it. So, it's more of a virtuous triangle, really.




April 11, 2018

The Foundation of a Dev Profession Should Be Mentoring

What makes something like engineering or law or medicine a "profession"? Ask me 20 years ago, I'd have said it was standards and ethics, policed by some kind of professional body and/or the law. There are certain things, say, an electronic engineer isn't supposed to do, certain things you can't ask your doctor for, certain things a lawyer would end up in jail for doing.

Ask me today, and my answer would be this: a profession is a community of people following a vocation - like writing software or teaching children - that professes how it works to people who want to learn how to do it.

Experienced school teachers help people learning to be school teachers how to teach. They pass on the benefit of their experience, including all the stuff an even more experienced teacher passed on to them.

I still very much believe that standards and ethics must be part of a profession of software development. But I'm increasingly convinced that the bedrock of any such profession would be mentoring. I think of all the time I wasted in my early years of programming, and all the things that would have helped enormously to know back then. Even programming for fun in my teenage bedroom would have been made easier with some basic code craft like unit testing and rudimentary version control.

I was very lucky to be exposed to much more experienced "software engineers" who nudged me firmly in the direction of rigorous user-centred iterative software development, mentioning books I should read, newsgroups I should visit, courses I should go on, and showing me with their day-to-day examples techniques I still apply - and teach - today.

I make it my business today to pass on the benefits of the mentoring I received. And that, to my mind, should be the basis for a profession of software development.

For that to work, though, it's necessary that developers stay developers. "Use it or lose it" has never been more true than in software. I see developers I coached 10 years ago get promoted into management roles - sheesh, I know a lot of CTOs, according to LinkedIn - and quickly lose their coding abilities and fall behind with the technology. Their experience might be invaluable to someone starting out, but it's hard to lead by example if the last programming you did was in Visual C++ 6.0 and your junior devs are working in F#.

So, another pillar of this professional foundation must necessarily be parallel career progression - up to CTO equivalent - for developers. Looking for work for the first time in a decade has left me in little doubt that - with a handful of glorious exceptions that I'm exploring - many employers don't want older (i.e., more expensive) developers, and even the most senior dev roles typically pay a lot less than management equivalents. I meet a lot of senior managers who are reluctantly in these roles because they have big mortgages and school fees to pay. They'd much rather have stayed hands-on. If the best potential mentors are disappearing into meeting rooms all day, it will always be impossible to square this circle.

The idea's been floated before - including by me - but I think it's finally time to start a software developer's guild, with a specific purpose of championing long-term mentoring and parallel career progression for devs who want to stay devs.

Who's with me?




April 6, 2018

Could Refactoring (& Refuctoring) Help Us Test Claims About Benefits of Clean Code?

One of the more frustrating things about teaching developers about code craft and "Clean Code" is the lack of credible hard evidence from respectable sources about the claimed benefits of it.

Not only does this make code craft a tougher sell to skeptics - and there was a time when I was one of them, decades ago - but it also calls into question whether the alleged benefits are real.

The biggest barrier to doing research in this area has been twofold:

1. The lack of data points. Most software engineering academic studies take data from a handful of projects. If this were, say, medical research, we'd never get our medicines on to the market.

2. The problem of comparing apples with apples. There are so many factors in software development that it's pretty much impossible to isolate one and rule out all others. Studies into the effects of adopting TDD can't account for the variations in experience and ability, for example. Teams new to TDD tend to have to deal with a steep learning curve before they become productive again.

When I consider some of the theories about what makes code harder to change - the central plank of the code craft thesis - for some we have strong evidence to back them up, for others... not so much.

I've had a bit of a brainwave in this area that might help researchers. Take a code base, then specifically vary it along a single dimension. e.g., refactor to remove duplication, or "refuctor" to introduce duplication (by inlining functions and modules). The resulting variants should all be functionally equivalent, but you could fine-grain the levels of variation. Then ask developers to make changes to the logic, and measure how much code had to be edited to achieve those changes. Automated acceptance tests would ensure that every change was logically equivalent.
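As a rough sketch of the measurement step, here's one way "how much code had to be edited" might be counted, in Python with the standard `difflib` module. Counting added and removed lines is just one possible metric, and `lines_edited` and the tiny pricing snippets are hypothetical examples of mine, not data from any study.

```python
import difflib


def lines_edited(before: str, after: str) -> int:
    """Count lines added or removed between two versions of a source file."""
    diff = difflib.ndiff(before.splitlines(), after.splitlines())
    return sum(1 for line in diff if line.startswith(("+ ", "- ")))


# Two functionally equivalent variants of the same (made-up) code,
# each shown before and after the same logic change: a new tax rate.
duplicated_before = "a = price * 1.2\nb = cost * 1.2\n"
duplicated_after = "a = price * 1.25\nb = cost * 1.25\n"

refactored_before = "RATE = 1.2\na = price * RATE\nb = cost * RATE\n"
refactored_after = "RATE = 1.25\na = price * RATE\nb = cost * RATE\n"

duplicated_cost = lines_edited(duplicated_before, duplicated_after)
refactored_cost = lines_edited(refactored_before, refactored_after)
```

In this toy case the duplicated variant needs more lines touched than the refactored one for the same change. A real experiment would need many variants, many changes and many developers before it told us anything.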

I can easily envisage how refactoring (and its evil twin, refuctoring) could be used to vary readability, complexity, duplication, coupling and cohesion (e.g., by moving methods between classes to introduce or eliminate feature envy), "swappability" (e.g., by introducing dependency injection, or by reversing the dependency inversion by using explicit references to concrete implementations of interfaces) and a range of other code qualities. Automated tests could ensure that every variant still works exactly the same way on the outside.

And the tests themselves could be varied. For example, you could manipulate test suite execution time so that in some cases developers had to wait an hour for feedback, while others only need wait seconds for the same feedback.

I think I might be on to something. What do you think?