June 29, 2018
.NET Code Analysis using NDependIt's been a while since I used it in anger, but I've been having fun this week reaquainting myself with NDepend, the .NET code analysis tool.
Those of us who are interested on automating code reviews for Continuous Inspection have a range of options for .NET - ranging from tools built on the .NET Cecil decompiler - e.g., FxCop - to compiler-integrated tools on the Roslyn platform.
Ou of all them, I find NDepend to be by far the most mature. It's code model is much more expressive and intuitive (oh, the hours I've spent trying to map IL op codes on to source code!), it integrates out of the box with a range of popular build and reporting tools like VSTS, TeamCity, Excel and SonarQube. And in general, I find I'm up and running with a suite of usable quality gates much, much faster.
Under the covers, I believe we're still in Cecil/IL territory, but all the hard work's been done for us.
Creating analysis projects in NDepend is pretty straightforward. You can either select a set of .NET assemblies to be analysed, or a Visual Studio project or solution. It's very backwards-compatible, working with solutions as far back as VS 2005 (which, for training purposes, I still use occasionally).
I let it have a crack at the files for the Codemanship refactoring workshop, which are deliberately riddled with tasty code smells. My goal was to see how easy it would be to use NDepend to automatically detect the smells.
It found all the solution's assemblies, and crunched through them - building a code model and generating a report - in about a minute. When it's done, it opens a dashboard view which summarises the results of the analysis.
There's a lot going on in NDepend's UI, and this would be a very long blog post if I explored it all. But my goal was to use NDepend to detect the code smells in these projects, so I've focused on the features I'd use to do that.
First of all, out of the box with the code rules that come with NDepend, it has not detected any of the smells I'm interested in. This is tyicaly of any code analysis tool: the rules are not your rules. They're someone else's interpretation of code quality. FxCop's developers, for example, evidently have a way higher tolerance for complexity than I do.
The value in these tools is not in what they do out of the box, but in what you can make them do with a bit of thought. And for .NET, NDepend excels at this.
In the dialog at the bottom of the NDepend window, we can explore the code rules that it comes with and see how they've been implemented using NDepend's code model and some LINQ.
I'm interested in methods with too many parameters, so I clicked on that rule to bring up its implementation.
I happen to think that 5 parameters is too many, so could easily change the threshold where this rule is triggered in the LINQ. When I did, the results list immediately updated, showing the methods in my solution that have too many parameters.
This matches my expectation, and the instant feedback is very useful when creating custom quality gates - really speeds up the learning process.
To view the offending code, I just had to double click on that method in the results list, and NDepend opened it in Visual Studio. (You can use NDepend from within Visual Studio, too, if you want a more seamless experience.)
The interactive and integrated nature of NDepend makes it a useful tool to have in code reviews. I've always found going through the code inspecting source files by eye looking for issues hard work and really rather time-consuming. Being able to search for them interatively like this can help a lot.
Of course, we don't just want to look for code smells in code reviews - that's closing the stable door after the horse has bolted a lot of the time. It's quite fashionable now for dev teams to include code reviews as part of their check-in process - the dreaded Pull Request. It makes sense, as a last line of defence, to try to prevent issus being checked into the code respository. What I'm seeing more and more, though, is that pull requests can become a bottleneck for the team. Like any manual testing, it slows us down and hampers Continuous Delivery.
The command-line version of NDepend can easily be integrated into your build pipeline, allowing for some pretty comprehensive code reviews that can be performed automatically (and therefore quickly, alleviating the bottleneck).
I decided to turn this code rule into a quality gate that coud be used in a build, and set a policy that it should fail the build if more than 5 examples of long parameter lists are found.
So, up and running with a simple quality gate in no time. But what about more complex code smells, like message chains and feature envy? In the next blog post I'll go deeper into NDepend's Code Query Language and explore the kinds of queries we can create with more thought.
May 25, 2018
Ever-Decreasing Cycles - I Called It RightI'm right about something roughly once in a decade, if I'm lucky. Looking back over 13 years of blog posts, I nominate this little gem as a candidate for "That Thing I Called Right", which predicted that - as our computers grew ever more powerful - continuous background code review would become a thing.
The progression seemed perfectly logical. At the time I wrote it, we'd seen the advent of continuous background code compilation, giving us instant feedback when we make silly syntax errors. Younger developers may not be aware of just what a difference that made to those of us who remember compiling the code involving going away to get a coffee (or lunch, or dinner and a show). So much time saved!
With less brain power dedicated to "does it run?", we were freed up to think about a higher question: does it work?. In 2008, continuous background testing tools like Infinitest and JUnitMax were becoming more popular. Today, I see them quite widely used, and can easily foresee a time when we're all using them within the next decade.
So we've progressed from "does it run?" to "does it work?" as our computers have increased their processing power, and the next evolution I predicted was to continuously ask "will it be easy to change?" At the time, the majority of code analysis tools took too long to do what they did to be running continuously in the background alongside compilation and functional testing. (There were one or two adventurous experimental tools, but we haven't heard much from them in the meantime.)
With Microsoft's Roslyn compiler, continuous background code review is now finally a thing. We can write code quality checks and build them into the compilation pipeline, creating feedback on things like variable names, method size and complexity, couplings, and all that stuff we care about for maintainability, in real time, as we type the code. I suspect such a capability will be added to other compiler platforms in the next decade or so.
Sure, it's still early days, and my experiments with it suggest computing power needs maybe one or two more iterations to rise to meet the number-crunching challenge, but in a practical form that we can begin using today - just like those plucky pioneers who ventured out with Infinitest in the early days it's here. There'll be a learning curve. Start climbing it now, is my recommendation.
My hope for continuous background code review is that it will yet again free up our minds to focus on more important questions, like "is this what they really need?"
And that will be a great day for software.
* And, yes, I had hoped I'd been right about high-integrity software becoming mainstream, but interest in that has flat-lined these past 20 years. Maybe next year... Ho hum.
December 31, 2017
New Year's Resolutions - Making High-Integrity Code & Automated Code Inspections MainstreamWhat's your software development New Year's Resolution for 2018?
Through Codemanship, I'm going to be giving two things a big push, starting tomorrow:
1. Techniques for producing high-integrity code
This has been my pet topic for the best part of 20 years. Ever since I started contracting, I've been shocked at just how unreliable the majority of software we create is. Especially because I know from practical experience that the techniques we can apply to produce software that almost never fails are actually quite straightforward and can be applied economically if you know what you're doing.
I've been banging the drum for quality by design ever since, but - to my immense frustration - it never seems to catch on. Techniques like Design By Contract, data-driven and property-based testing, and even good old-fashioned guided inspections, are perfectly within reach of the average dev team. No need for Z specifications, proofs of correctness, or any of that hifalutin malarkey in the majority of cases. You'd be amazed what's possible using the tools you already know, like your xUnit framework.
But still, two decades later, most teams see basic unit testing as "advanced". New tools and technologies spread like wildfire through our community, but good practices catch on at a glacial pace.
As we rely on software more and more, software needs to become more reliable. Our practices have lagged far behind the exponentially increasing potential for harm. We need to up our game.
So, in 2018 I'm going to be doing a lot of promoting of these techniques, as well as raising awareness of their value in engineering load-bearing code that can be relied on.
2. Continuous code inspections
The more code I see (hundreds of code bases ever year), the more convinced I become that the practical route to the most maintainable code is automating code inspections. Regular code reviews are too little, too late, and suffer the economic drawbacks of all after-the-fact manual ad hoc testing. Pair programming is better, but it's a very human activity. It misses too much, because pairs are trying to focus on too many things simultaneously. Like code reviews, it's too subjective, too ad hoc, too hit-and-miss.
For years now, I've been in the habit of automating key code quality checks so all of the code can be checked all of the time. The economic argument for this is clear: code inspection is just another kind of testing. It's unit testing for code quality bugs. If testing is infrequent and arbitrary, many bugs will slip through the net. Later code reviews may pick them up, but the longer maintainability issues persist, the more it costs to a. live with them until they are fixed (because they make the code harder to change), and b. fix them.
Dev teams that do continuous automated inspection tend to produce much cleaner code, and they do it with little to no extra effort. This is for the exact same reasons that dev teams that do continuous automated functional testing tend to produce much more reliable code than teams that test manually, and take little to no extra time and effort to achieve that. Many teams even save time and money.
To be honest, automating code inspections involves a non-trivial learning curve. Devs have to reason about code and express their views on design in a way many of us aren't used to. It's its own problem domain, and the skills and experience required to do it well are currently in short supply. But the tools are readily available, should teams choose to try it.
So, a significant investment has to be made to get automated code inspections up and running. But the potential for reuse of code quality checks is massive. There's just one teeny obstacle: we have to agree on what constitutes a code quality bug, within the team, between teams, and across dev communities. Right now, I have some big issues with what the developers of some code analysis tools suggest is "good code". So I switch off their off-the-peg rules and write my own checks. But, even then, it pays off quite quickly.
Anyhoo, those are the two things I'm going to be focusing on in 2018. Wish me luck!
December 15, 2017
Code Reviews Are Just Another Kind Of Testing
One of my focuses in 2018 is going to be code reviews. As with any kind of testing (and, yes, code review is testing), the quality of the end product tends to reflect the effectiveness of our approach. More simply, teams who are better at code reviews tend to produce better code.
I think it really helps to see code review as testing. Framed in that context, we naturally consider the question of test assurance.
Test assurance has four dimensions:
Scope - How much of our code is tested?
Frequency - How often is our code tested?
Range - How many qualities are we testing for?
Efficacy - How good are our tests at catching bugs?
The batch size of code reviews also seems to have a big impact on their effectiveness. When we're testing the logic of our code, we've learned that it's most economical to run our tests very frequently. Infrequent testing means large batch sizes of new or changed code. The old programmer joke about "show me a line of code and I'll tell, you what's wrong with it, show me 500 lines of code and I'll say 'looks okay to me'" applies here. Code review is best applied one design decision at a time. Infrequent reviews can miss a tonne of stuff.
Also, with lower frequency, comes later feedback. We know that logic errors cost exponentially more to fix the longer they go undiscovered. The same is true of maintainability bugs. The cost of change increases as the code grows. The cheapest time to refactor out a code smell is as soon as it's introduced.
Range is a no-brainer. We probably won't address code qualities we're not testing for. Simples. If we aren't checking for, say, Feature Envy, we'll likely end up with high coupling. When teams say "We do code reviews many times a day", it's important to qualify that with "reviewing for what, exactly?" In the majority of cases, code reviews are a highly subjective and ad hoc affair. What Jane may look for in the code, Jim might not.
Finally, the question arises "How good are our code reviews, anyway?" If Phil wrote a method that the team agreed is probably too long or too complex, does it get flagged by our review? The majority of code reviews are leaky buckets, letting through many, many issues because of their informal and ad hoc nature. Code reviews tend to be opportunistic, relying heavily on who is reviewing the code, which code they just happen to look at, and what happens to be going on in their brains at that specific moment. This has always been the weakness of relying on pair programming to ensure code quality.
Which brings me to a possible fifth dimension of test assurance: improvement. Are we learning and improving our tests? Inevitably, when we test for the stuff we thought of, we miss the stuff we didn't. Teams need to be regularly exploring the code for problems their code reviews might be missing, and adapt their inspections when necessary. And, over time, a developer's understanding of "good code" and "bad code" evolves. 25 years ago, I believed that a 100-line function was just fine. Today, I believe that 10 LOC is pushing it. Our code reviews need reviews, too.
Those of us who firmly believe that fast-running automated tests are the key to maintaining the integrity of our software as it evolves might ask ourselves how code review is any different. It's a kind of testing. Therefore, should it not be subject to the same factors as logic testing? It's not a huge leap to conclude that maybe it should.
December 5, 2017
Automating Code ReviewsThis comes up quite often these days, so I thought I'd scribble my thoughts down, for posterity if nothing else.
I increasingly come across dev teams who have adopted a policy where every check-in needs to be reviewed before it can be accepted. In many cases, this has created a bottleneck while developers waiting to get a green build are stuck on the availability of their peers to do the reviewing.
Imagine, every time you want to check your code in, you have to wait for a tester to put your code through its paces. We knew that was a major bottleneck, so we started automating our tests. If the tester would normally check to see what happens if a customer cancels an order, we would write a unit test for the cancel() function of an order.
It's really not much different for code inspections. If a reviewer would normally check that no classes are too big (say, having more than 200 lines of code), we could write a bit of code to inspect every class and report any that exceed our limit.
A pretty comprehensive code inspection could cover a large amount of code, checking for a whole range of issues, in a tiny fraction of the time it takes a human. More importantly, those checks could be run any time. No need to wait for Jenny to get off the phone, or Rajesh to come back from lunch. You'd no longer be blocked.
This, of course, takes some considerable investment early on to develop the right suite of automated quality checks. But I see more and more teams struggling to maintain the pace of development and high code quality, and such an investment really pays for itself many times over, even on relatively short timescales.
It's for this reason that I'm going to be giving Continuous Inspection a big push in 2018. I think most teams should seriously consider it.
August 17, 2017
Your House, Your (Code Quality) Rules
Picking up where I left off on the custom FxCop rules for the Codemanship Code Craft "Driving Test" has reminded me of something that's vitally important.
This morning I wrote a class that enumerates a type's collaborators. The code currently looks like this:
Codified in this class is an understanding of what I mean by a "collaborator", for the purposes of the driving test.
First of all, I'm not including non-project types. This is a judgement call to keep design rules realistic. My rule will limit the number of collaborating types to 3. If that includes core library types etc, it's going to be really tough.
I'm including fields, parameters and local variables. I'm also including the declaring types of any bound members. So if I call a method that returns an object and then I call a method on that, it'll be included.
I'm also not counting base classes as collaborators. Again, it's a judgement call.
I'm working alone, so I get to make the rules. But in a team setting that absolutely should not happen. Don't send the "tools dev" away to work in isolation on quality gates for Continuous Inspection. Because what will happen is, when they return and unleash their rules on the rest of the team, there'll be tears before bedtime.
The whole team needs to be involved. This is a great candidate for mob programming, in my experience. While you're waiting for business requirements in the early stages of a project/product, here's what the team could be doing to get the delivery engine up and running.
It will require the team to have discussions about code quality with a level of precision they probably never have before. I think this is a good thing.
August 6, 2017
What *Exactly* Is "Feature Envy"?I'm currently writing some custom FxCop rules for the trial Codemanship Code Craft "driving test" on Sept 16th. The aim is that not only will I be able to automatically check candidate's code, but they will be able to while they're writing it, too. The power of Continuous Inspection!
One of the rules is that methods of one class must not display Feature Envy for another class. Typically, Feature Envy's defined as:
A method accesses the features of another class more than its own.
And this might seem trivial to check for using a tool like FxCop. Look at all the member bindings inside a method. If there are more bindings to members of other types then to members of the type on which this method's declared, then we've got Feature Envy. To fix it, we can just move the method to the focus of its envy.
But I'm not sure it's quite that simple. This example might be an open-and-shut case:
But how about this?
The majority of feature calls in this method are to methods of the same class. But that code smell we saw in the first example is still here, on lines 3 and 4. Proof? What if we extract those 2 lines into their own method?
The method obviousFeatureEnvy now completely satisfies our definition of Feature Envy and should be moved to the other class.
I think this leads me to a better definition of Feature Envy:
Feature Envy is when any unit of executable code - a method, a block, a statement or an expression - uses features of another class more than features of its own class
Basically, if you can extract any portion of code into a method that displays the original, "classic" definition of Feature Envy.
But wait; there's more. Take a look at this example:
Technically, only one of these methods satisfies our definition of Feature Envy, but if were to inline the call stack, we'd end up with one method with very obvious Feature Envy.
It's much more complex than I thought. But, for the driving test, I'll probably keep it simple and stick with the classic - and much easier - definition of Feature Envy.
But one day, when I've got time...
July 22, 2017
Code Analysis for Dependency Inversion
As work continues on the next book and training course, I'm thinking about how we could analyse our code for adherence to the Dependency Inversion Principle (the "D" in S.O.L.I.D.)
The DIP states that "High-level modules should not depend upon low-level modules. Both should depend upon abstractions. Abstractions should not depend upon details, details should depend upon abstractions."
This is a roundabout way of saying dependencies should be swappable. The means by which we make them swappable is dependency injection (often confused with Dependency Inversion, and the two are very closely related.)
Dependency injection is simply passing an objects collaborators in (e.g., through a constructor) instead of that object instantianting them itself. When we directly instantiate an object, we bind ourselves to its exact type. This makes it impossible to swap that collaborator with a different implementation without modifying the client code, making our design inflexible and difficult to adapt or extend.
In practice, what this means is that most of our objects are composed from the outside.
For example, in my Reading Ease calculator, the Program class - the entry point for this console app - creates all of the objects involved in doing the calculation and "plugs" them together via constructors.
I've used the analogy of Russian dolls to describe how we compose simpler collaborations into more complex collaborations (collaborations within collaborations). This means that the lowest-level objects in the call stack typically get created first.
Inside those lower-level classes, there's no direct instantiation of collaborators.
So, when we analyse the dependencies, we should find that classes that have clients in our code - classes that are further down the call stack - don't directly instantiate their collaborators.
More simply, if things depend on you, then don't use new.
There are, of course, exceptions. Factories and Builders are designed to instantiate and hide the details. Integration code - e.g., opening database connections - is also designed to hide details. We can't very well pass our database connections into those, or we'd be spreading that knowledge. Typically what we're talking about here is dependencies on our own classes. And what a kerfuffle it would be to try to apply DIP to strings and ints and collections and other core library types all the time. Though, again, there are situations where that may be called for.
If I was measuring adherence to the Dependency Inversion Principle, then, I'd look at a class and ask "Do any other of my classes depend on this?" If the answer is "yes", then I'd check to see if it creates instances of any other of my classes. I might also check - and this would be language-dependent - if those dependencies are on abstract types (abstract classes, interfaces).
November 8, 2016
Business Benefits of Continuous Delivery: We Need Hard DataSomething that's been bugging me for a while is our apparent lack of attention to the proclaimed business benefits of Continuous Delivery.
I'm not going to argue for one second that CD doesn't have business benefits; I'm a firm believer in the practice myself. But that's just it... I'm a believer in the business benefits of Continuous Delivery. And it's a belief based on personal and anecdotal experience, not on a good, solid body of hard evidence.
I had naturally assumed that such evidence existed, given that the primary motivation for CD, mentioned over and over again in the literature, is the reduced lead times on delivering feature and change requests. It is, after all, the main point of CD.
But where is the data that supports reduced lead times? I've looked, but not found it. I've found surveys about adopting CD. I've found proposed metrics, but no data. I've found largely qualitative studies of one or two organisations. But no smoking gun, as yet.
There's a mountain of data that backs up the benefits of defect prevention, but the case for CI currently rests on little more than smoke.
This, I reckon, we need to fix. It's a pillar on which so much of software craftsmanship and Agile rests; delivering working software sooner (and for longer).
Anything that supports the case for Continuous Delivery indirectly supports the case for Continuous Integration, TDD, refactoring, automation, and a bunch of other stuff we believe is good for business. And as such, I think we need that pillar to unassailably strong.
We need good data - not from surveys and opinion polls - on lead times that we can chart against CD practices so we can build a picture of what real, customer-visible impact these practices have.
To be genuinely useful and compelling, it would need to come from hundreds of places and cover the full spectrum of Continuous Delivery from infrequent manual builds with infrequent testing and no automation, to completely automated Continuous Deployment several times a day with high confidence.
One thing that would of particular interest to Agile mindsets would be how the lead times change over time. As the software grows, do lead times get longer? What difference does, say, automated developer testing make to the shape of the curve?
Going beyond that, can we understand what impact shorter lead times can have on a business? Shorter lead times, in of themselves have no value. The value is in what they enable a business to do - specifically, to learn faster. But what, in real terms, are the business benefits of learning faster? How would we detect them? Are businesses that do CD outperforming competitors who don't in some way? Are they better at achieving their goals?
Much to ponder on.
April 25, 2015
Non-Functional Tests Can Help Avoid Over-Engineering (And Under-Engineering)Building on the topic of how we tackle non-functional requirements like code quality, I'm reminded of those times where my team has evolved an architecture that developers taking over from us didn't understand the reasons or rationale for.
More than once, I've seen software and systems scrapped and new teams start again from scratch because they felt the existing solution was "over-engineered".
Then, months later, someone on the new team reports back to me that, over time, their design has had to necessarily evolve into something similar to what they scrapped.
In these situations it can be tricky: a lot of software really is over-engineered and a simpler solution would be possible (and desirable in the long term).
But how do we tell? How can we know that the design is the simplest thing that a team could have done?
For that, I think, we need to look at how we'd know that software was functionally over-complicated and see if we can project any lessons we leearn on to non-functional complexity.
A good indicator of whether code is really needed is to remove it and see if any acceptannce tests fail. You'd be surprised how many features and branches in code find their way in there without the customer asking for them. This is especially true when teams don't practice test-driven development. Developers make stuff up.
Surely the same goes for the non-functional stuff? If I could simplify the design, and my non-functional tests still pass, then it's probable that the current design is over-engineered. But in order to do that, we'd need a set of explicit non-functional tests. And most teams don't have those. Which is why designs can so easily get over-engineered.
Just a thought.