April 6, 2018
Could Refactoring (& Refuctoring) Help Us Test Claims About Benefits of Clean CodeOne of the more frustrating things about teaching developers about code craft and "Clean Code" is the lack of credible hard evidence from respectable sources about the claimed benefits of it.
Not only does this make code craft a tougher sell to skeptics - and there was a time when I was one of them, decades ago - but it also calls into question whether the alleged benefits are real.
The biggest barrier to doing research in this area has been twofold:
1. The lack of data points. Most software engineering academic studies take data from a handful of projects. If this were, say, medical research, we'd never get our medicines on to the market.
2. The problem of comparing apples with apples. There are so many factors in software development that it's pretty much impossible to isolate one and rule out all others. Studies into the effects of adopting TDD can't account for the variations in experience and ability, for example. Teams new to TDD tend to have to deal with a steep learning curve before they become productive again.
When I consider some of the theories about what makes code harder to change - the central plank of the code craft thesis - some we have strong evidence to back them up, others... not so much.
I've had a bit of a brainwave in this area that might help researchers. Take a code base, then specifically vary it along a single dimension. e.g., refactor to remove duplication, or "refuctor" to introduce duplication (by inlining functions and modules). The resulting variants should all be functionally equivalent, but you could fine-grain the levels of variation. Then ask developers to make changes to the logic, and measure how much code had to be edited to achieve those changes. Automated acceptance tests would ensure that every change was logically equivalent.
I can easily envisage how refactoring (and it's evil twin, refuctoring) could be used to vary readability, complexity, duplication, coupling and cohesion (e.g., by moving methods between classes to introduce or eliminate feature envy), "swabbability" (e.g., by introducing dependency injection, or by reversing the dependency inversion by using explicit references to concrete implementations of interfaces) and a range of other code qualities. Automated tests could ensure that every variant still works exactly the same way on the outside.
And the tests themselves could be varied. For example, you could manipulate test suite execution time so that in some cases developers had to wait an hour for feedback, while others only need wait seconds for the same feedback.
I think I might be on to something. What do you think?
December 5, 2017
Automating Code ReviewsThis comes up quite often these days, so I thought I'd scribble my thoughts down, for posterity if nothing else.
I increasingly come across dev teams who have adopted a policy where every check-in needs to be reviewed before it can be accepted. In many cases, this has created a bottleneck while developers waiting to get a green build are stuck on the availability of their peers to do the reviewing.
Imagine, every time you want to check your code in, you have to wait for a tester to put your code through its paces. We knew that was a major bottleneck, so we started automating our tests. If the tester would normally check to see what happens if a customer cancels an order, we would write a unit test for the cancel() function of an order.
It's really not much different for code inspections. If a reviewer would normally check that no classes are too big (say, having more than 200 lines of code), we could write a bit of code to inspect every class and report any that exceed our limit.
A pretty comprehensive code inspection could cover a large amount of code, checking for a whole range of issues, in a tiny fraction of the time it takes a human. More importantly, those checks could be run any time. No need to wait for Jenny to get off the phone, or Rajesh to come back from lunch. You'd no longer be blocked.
This, of course, takes some considerable investment early on to develop the right suite of automated quality checks. But I see more and more teams struggling to maintain the pace of development and high code quality, and such an investment really pays for itself many times over, even on relatively short timescales.
It's for this reason that I'm going to be giving Continuous Inspection a big push in 2018. I think most teams should seriously consider it.
August 21, 2017
Codemanship Code Craft FxCop RulesSo, here they are. Hot from the oven, my FxCop code rules for the upcoming Codemanship Code Craft "Driving Test".
Some rubbish code, yesterday.
If you're signed up to be one of our valiant guinea pigs for the trail driving test on Sept 16th, I heartily recommend you download them and get a bit of practice. Try writing code that breaks each of the 11 rules, and then refactoring that code to make the nasty messages go away.
There's versions for Visual Studio 2013, 2015 and 2017, plus instructions on installing and suing the rules with your own projects.
And even if you're not doing the driving test on Sept 16th, have a go anyway. Your code may not be as clean as you think ;)
Any bugs or false positives, drop me a line.
August 6, 2017
What *Exactly* Is "Feature Envy"?I'm currently writing some custom FxCop rules for the trial Codemanship Code Craft "driving test" on Sept 16th. The aim is that not only will I be able to automatically check candidate's code, but they will be able to while they're writing it, too. The power of Continuous Inspection!
One of the rules is that methods of one class must not display Feature Envy for another class. Typically, Feature Envy's defined as:
A method accesses the features of another class more than its own.
And this might seem trivial to check for using a tool like FxCop. Look at all the member bindings inside a method. If there are more bindings to members of other types then to members of the type on which this method's declared, then we've got Feature Envy. To fix it, we can just move the method to the focus of its envy.
But I'm not sure it's quite that simple. This example might be an open-and-shut case:
But how about this?
The majority of feature calls in this method are to methods of the same class. But that code smell we saw in the first example is still here, on lines 3 and 4. Proof? What if we extract those 2 lines into their own method?
The method obviousFeatureEnvy now completely satisfies our definition of Feature Envy and should be moved to the other class.
I think this leads me to a better definition of Feature Envy:
Feature Envy is when any unit of executable code - a method, a block, a statement or an expression - uses features of another class more than features of its own class
Basically, if you can extract any portion of code into a method that displays the original, "classic" definition of Feature Envy.
But wait; there's more. Take a look at this example:
Technically, only one of these methods satisfies our definition of Feature Envy, but if were to inline the call stack, we'd end up with one method with very obvious Feature Envy.
It's much more complex than I thought. But, for the driving test, I'll probably keep it simple and stick with the classic - and much easier - definition of Feature Envy.
But one day, when I've got time...
August 10, 2015
A Hierarchy Of Software Design NeedsDesign is not a binary proposition. There is no clear dividing line between a good software design and bad software design, and even the best designs are compromises that seek to balance competing forces like performance, readability, testability, reuse and so on.
When I refactor a design, it can sometimes introduce side-effects - namely, other code smells - that I deem less bad than what was there before. For example, maybe I have a business object that renders itself as HTML - bad, bad, bad! Right?
The HTML format is likely to change more often than the object's data schema, and we might want to render it to other formats. So it makes sense to split out the rendering part to a separate object. But in doing so, we end up creating "feature envy" - an unhealthy high coupling between our renderer and the business object so it can get the data in needs - in the process.
I consider the new feature envy less bad than the dual responsibility, so I live with it.
In fact, there tends to be a hierarchy of needs in software design, where one design issue will take precedence over another. It's useful, when starting out, to know what that hierarchy of needs is.
Now, the needs may differ depending on the requirements of our design - e.g., on a small-memory device, memory footprint matters way more than it does for desktop software usually - but there is a fairly consistent pattern that appears over and over in the majority of applications.
There are, of course, a universe of qualities we may need to balance. But let's deal with the top six to get you thinking:
1. The Code Must Work
Doesn't matter how good you think the design is if it doesn't do what the customer needs. Good design always comes back to "yes, but does it pass the acceptance tests?" If it doesn't, it's de facto a bad design, regardless.
2. The Code Must Be Easy To Understand
By far the biggest factor in the maintainability of code is whether or not programmers can understand it. I will gladly sacrifice less vital design goals to make code more readable. Put more effort into this. And then put even more effort into it. However much attention you're paying to readability, it's almost certainly not enough. C'mon, you've read code. You know it's true.
But if the code is totally readable, but doesn't work, then spend more time on 1.
3. The Code Must Be As Simple As We Can Make It
Less code generally means a lower cost of maintenance. But beware; you can take simplicity too far. I've seen some very compact code that was almost intractable to human eyes. Readability trumps simplicity. And, yes, functional programmers, I'm particularly looking at you.
4. The Code Must Not Repeat Itself
The opposite of duplication is reuse. Yes it is: don't argue!
Duplication in our code can often give us useful clues about generalisations and abstractions that may be lurking in there that need bringing out through refactoring. That's why "removing duplication" is a particular focus of the refactoring step in Test-driven Development.
Having said that, code can get too abstract and too general at the expense of readability. Not everything has to eventually turn into the Interpreter pattern, and the goal of most projects isn't to develop yet another MVC framework.
In the Refuctoring Challenge we do on the TDD workshops, over-abstracting often proves to be a sure-fire way of making code harder to change.
5. Code Should Tell, Not Ask
"Tell, Don't Ask" is a core pillar of good modular -notice I didn't say "object oriented" - code. Another way of framing it is to say "put the work where the knowledge is". That way, we end up with modules where more dependencies are contained and fewer dependencies are shared between modules. So if a module knows the customer's date of birth, it should be responsible for doing the work of calculating the customer's current age. That way, other modules don't have to ask for the date of birth to do that calculation, and modules know a little bit less about each other.
It goes by many names: "encapsulation", "information hiding" etc. But the bottom line is that modules should interact with each other as little as possible. This leads to modules that are more cohesive and loosely coupled, so when we make a change to one, it's less likely to affect the others.
But it's not always possible, and I've seen some awful fudges when programmers apply Tell, Don't Ask at the expense of higher needs like simplicity and readability. Remember simply this: sometimes the best way is to use a getter.
6. Code Should Be S.O.L.I.D.
You may be surprised to hear that I put OO design principles so far down my hierarchy of needs. But that's partly because I'm an old programmer, and can vaguely recall writing well-designed applications in non-OO languages. "Tell, Don't Ask", for example, is as do-able in FORTRAN as it is in Smalltalk.
Don't believe me? Then read the chapter in Bertrand Meyer's Object Oriented Software Construction that deals with writing OO code in non-OO languages.
From my own experiments, I've learned that coupling and cohesion have a bigger impact on the cost of changing code. A secondary factor is substitutability of dependencies - the ability to insert a new implementation in the slot of an old one without affecting the client code. That's mostly what S.O.L.I.D. is all about.
This is the stuff that we can really only do in OO languages that directly support polymorphism. And it's important, for sure. But not as important as coupling and cohesion, lack of duplication, simplicity, readability and whether or not the code actually works.
Luckily, apart from the "S" in S.O.L.I.D. (Single Responsibility), the O.L.I.D. is fairly orthogonal to these other concerns. We don't need to trade off between substitutability and Tell, Don't Ask, for example. They're quite compatible, as are the other design needs - if you do it right.
In this sense, the trade off is more about how much time I devote t thinking about S.O.L.I.D. compared to other more pressing concerns. Think about it: yes. Obsess about it: no.
Like I said, there are many, many more things that concern us in our designs - and they vary depending on the kind of software we're creating - but I tend to find these 6 are usually at the top of the hierarchy.
So... What's your hierarchy of design needs?
July 31, 2015
Triangulating Your Test CodeWhile we're triangulating our solutions in TDD, our source code ought to be getting more general with each new test case.
But it's arguably not just the solution that should be getting more general; our test code could probably be generalised, too.
Take a look at this un-generalised code for the first two tests in a TDD'd implementation of a Fibonacci sequence generator:
Jumping in at this point, we see that our solution is still hard-coded. The trick to triangulation is to spot the pattern. The pattern for the first two Fibonacci numbers is that they are the same as their index in the sequence (assuming a zero-based array).
We can generalise our list into a loop that generates the list using the pattern (see Bob Martin's post on the Transformation Priority Premise, or, what I more simply call triangulation patterns).
But we can also generalise our test code into a single parameterised test, using the pattern as the test name, so it reads more like the specification we hope our tests in TDD will become:
Now, because all subequent tests are going to follow the same pattern (we provide an index and check what the expected Fibonacci number is at that index), we could carry on reusing this parameterised test for the rest of the problem.
Then we'd have to generalise the name of the test - a key part of our test-driven specification - to the point where every single patterm (every rule) is summarised in one test. I no likey. It's much harder to read, and when a test case fails, it's not entirely clear which rule was broken.
So, what I like to do is keep a bit of duplication in order to have one generalised test for each patterm/rule in the specification.
So, continuing on, I might end up with:
Notice that, although these two test methods are duplication, I've taken the step of refactoring out the duplicated knowledge of how to create and interact with the object being tested. This kind of duplication in test code tends to hurt us most. Many teams report how tight coupling between tests and objects under test led to interfaces being much more expensive to change. So I feel this is a small compromise that aid readability while not sacrificing too much to duplication.
April 25, 2015
Continuous Inspection ScreencastIt's been quite a while since I did a screencast. Here's a new one about Continuous Inspection, which is a thing. (Oh yes.)
March 1, 2015
Continuous Inspection at NorDevConOn Friday, I spent a very enjoyable day at the Norfolk developer's conference NorDevCon (do you see what they did there?) It was my second time at the conference, having given the opening keynote last year, and it's great to see it going from strength to strength (attendance up 50% on 2014), and to see Norwich and Norfolk being recognised as an emerging tech hub that's worthy of inward investment.
I was there to run a workshop on Continuous Inspection, and it was a good lark. You can check out the slides, which probably won't make a lot of sense without me there to explain them - but come along to CraftConf in Budapest this April or SwanseaCon 2015 in September and I'll answer your questions.
You can also take a squint at (or have a play with) some code I knocked up in C# to illustrate a custom FxCop code rule (Feature Envy) to see how I implemented the example from the slides in a test-driven way.
I'm new to automating FxCop (and an infrequent visitor to .NET Land), so please forgive any naivity. Hopefully you get the idea. The key things to take away are: you need a model of the code (thanks Microsoft.Cci.dll), you need a language to express rules against that model (thanks C#), and you need a way to drive the implementation of rules by writing executable tests that fail (thanks NUnit). The fun part is turning the rule implementation on its own code - eating your own dog food, so to speak. Throws up all sorts of test cases you didn't think of. It's a work in progress!
I now plan, before CraftConf, to flesh the project out a bit with 2-3 more example custom rules.
Having enjoyed a catch-up with someone who just happens to be managing the group at Microsoft who are working on code analysis tools, I think 2015-2016 is going to see some considerable ramp-up in interest as the tools improve and integration across the dev lifecycle gets tighter. If Continuous Inspection isn't on your radar today, you may want to put it on your radar for tomorrow. It's going to be a thing.
Right now, though, Continuous Inspection is very much a niche pastime. An unscientific straw poll on social media, plus a trawl of a couple of UK job sites, suggests that less than 1% of teams might even be doing automated code analysis at all.
I predicted a few years ago that, as computers get faster and code gets more complex, frequent testing of code quality using automated tools is likely to become more desirable and more do-able. I think we're just on the cusp of that new era today. Today, code quality is an ad hoc concern relying on hit-and-miss practices like pair programming, where many code quality issues often get overlooked by pair who have 101 other things to think about, and code reviews, where issues - if they get spotted at all in the to-and-fro - are flagged up long after anybody is likely to do anything about them.
In related news, after much discussion and braincell-wrangling, I've chosen the name for the conference that will be superceding Software Craftsmanship 20xx later this year (because craftsmanship is kind of done now as a meme). Watch this space.
February 13, 2015
Intensive TDD, Continuous Inspection Recipes & Crappy Remote Collaboration ToolsA mixed bag for today's post, while I'm at my desk.
First up, after the Intensive TDD workshop on March 14th sold out (with a growing waiting list), I've scheduled a second workshop on Saturday April 11th, with places available at the insanely low price of £30. Get 'em while they're hot!
Secondly, I'm busy working on a practical example for a talk I'm giving at NorDevCon on Feb 27th about Continuous Inspection.
What I'm hoping to do is work through a simple example based on my Dependable Dependencies Principle, where I'll rig up an automated code analysis wotsit to find the most complex, most depended upon and least tested parts of some code to give early warning about where it might be most likely to be broken and might need better testing and simplifying.
To run this metric, you need 3 pieces of information:
* Cyclomatic Complexity of methods
* Afferent couplings per method
* Test coverage per method
Now, test coverage could mean different things. But for a short demonstration, I should probably keeep it simple and fairly brute force - e.g., % LOC reached by the tests. Not ideal, but in a short session, I don't want to get dragged into a discussion about coverage metrics. It's also a readily-available measure of coverage, using off-the-shelf tools, so it will save me time in preparing and allow viewers to try it for themselves without too much fuss and bother.
What's more important is to demonstrate the process going from identifying a non-functional requirement (e.g., "As the Architect, I want early warning about code that presents a highr risk of being unreliable so that I can work with the developers to get better assurance for it"), to implementing an executable quality gate using available tools in a test-driven manner (everybody forgets to agree tests for their metrics!), to managing the development process when the gate is in place. All the stuff that constitutes effective Continuous Inspection.
At time of writing, tool choice is split between a commercial code analysis tool called JArchitect, and SonarQube. It's a doddle to rig up in JArchitect, but the tool costs £££. It's harder to rig up in SonarQube, but the tools are available for free. (Except, of course, nothing's ever really free. Extra time taken to get what you want out of a tool also adds up to £££.) We'll see how it goes.
Finally, after a fairly frustrating remote pairing session on Wednesday where we were ultimately defeated by a combination of Wi-Fi, Skype, TeamViewer and generally bad mojo, it's occured to me that we really should be looking into remote collaboration more seriously. If you know of more reliable tools for collaboration, please tweet me at @jasongorman.
February 9, 2015
Mock Abuse: How Powerful Mocking Tools Can Make Code Even Harder To ChangeConversation turned today to that perennial question about mock abuse; namely that there are some things mocking frameworks enable us to do that we probably shouldn't ought to.
In particular, as frameworks have become more powerful, they've made it possible for us to substitute the un-substitutable in our tests.
Check out this example:
Because Orders invokes the static database access method getAllOrders(), it's not possible for us to use dependency injection to make it so we can unit test Orders without hitting the database. Boo! Hiss!
Along comes our mocking knight in shining armour, enabling me to stub out that static method to give a test-specific response:
Problem solved. Right?
Well, for now, maybe yes. But the mocking tool has not solved the problem that I still couldn't substitute CustomerData.getAllOrders() in the actual design if I wanted to (say, to use a different kind of back-end data store or a web service). So it's solved the "how do I unit test this?" problem, but not in a way that buys me any flexibility or solves the underlying design problem.
If anything, it's made things a bit worse. Now, if I want to refactor Orders to make the database back end swappable, I've got a bunch of test code that also depends on that static method (and in arguably a bigger way - more code depends on that internal dependency. If you catch my drift.)
I warn very strongly against using tools and techniques like these to get around inherent internal dependency problems, because - when it comes to refactoring (and what's the point in having fast-running unit tests if we can't refactor?) all that extra test code can actually bake in the design problems.
Multiply this one toy example by 1,000 to get the real scale I sometimes see this one in real code bases. This approach can make rigid and brittle designs even more rigid and more brittle. In the long term, it's better to make the code unit-testable by fixing the dependency problem, even if this means living with slow-running (or even - gasp! - manual) tests for a while.