July 15, 2017

Learn TDD with Codemanship

Finding Load-Bearing Code - Thoughts On Implementation

I've been unable to shake this idea about identifying the load-bearing code in our software.

My very rough idea was to instrument the code and then run all our system or customer tests and record how many times methods are executed. The more times a method gets used (reused), the more critical it may be, and therefore may need more of our attention to make sure it isn't wrong.

This could be weighted by estimates for each test scenario of how big the impact of failure could be. But in my first pass at a tool, I'm thinking method call counts would be a simple start.

So, the plan is to use something like Roslyn or Reflection.Emit to inject a little call-counting code into the beginning of the body of every method in the (in this case, C#) code under test.

The MethodCallCounter could be something as simple as a wrapper around a dictionary mapping method names to call counts.
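To make the idea concrete, here's a minimal sketch in Python (the post envisages C# with Roslyn or Reflection.Emit doing the injection; here a decorator stands in for the injected instrumentation, and the method name is made up for illustration):

```python
from collections import defaultdict

class MethodCallCounter:
    """A thin wrapper around a dictionary of method name -> call count."""
    counts = defaultdict(int)

    @classmethod
    def record(cls, method_name):
        cls.counts[method_name] += 1

def counted(fn):
    # Stand-in for the injected snippet: bumps the counter every
    # time the wrapped method is executed.
    def wrapper(*args, **kwargs):
        MethodCallCounter.record(fn.__qualname__)
        return fn(*args, **kwargs)
    return wrapper

@counted
def apply_brakes():
    pass

for _ in range(3):
    apply_brakes()

print(MethodCallCounter.counts["apply_brakes"])  # 3
```

Run the system or customer tests with every method wrapped like this, and the dictionary at the end is the raw reuse data.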

And this code, too, could be injected into the assembly we're instrumenting, or a reference added to a teeny tiny Codemanship.LoadBearing DLL.

Then a smidgen of code to write the results to a file (e.g., a spreadsheet) for further analysis.

The next step would be to create a test context that knows how critical the scenario is, using the customer's estimate of potential impact of failure, and instead of just incrementing the method call count, actually adds this number. So methods that get called in high-risk scenarios are shown as bearing a bigger load.

External to this would be a specific kind of runner (e.g., NUnit runner, FitNesse, SpecFlow etc) that executes the tests while changing the FailureImpact value using information tagged in each customer test somehow.


(PS. This is also kind of how I'd add logging to a system, in case you were wondering.)

July 13, 2017

Do You Know Where Your Load-Bearing Code Is?

Do you know where your load-bearing code is?

In any system, there's some code that - if it were to fail - would be a big deal. Identifying that code helps us target our testing effort to where it's really needed.

But how do we find our load-bearing code? I'm going to propose a technique for measuring the "load-beariness" of individual methods. Let's call it criticality.

Working with your customer, identify the potential impact of failure for specific usage scenarios. It's a bit like estimating the relative value of features, only this time we're not asking "what's it worth?" but "what's the potential cost of failure?" For example, applying the brakes in an ABS system would have a very high cost of failure, while changing the font on a business report would have a relatively low one. And a feature may be low-risk by itself, but be used millions of times every day, greatly amplifying the risk.

Execute a system test case. See which methods were invoked end-to-end to pass the test. For each of those methods, assign the estimated cost of failure.

Now rinse and repeat with other key system test cases, adding the cost of failure to every method each scenario hits.

A method that's heavily reused in many low-risk scenarios could turn out to be cumulatively very critical. A method that's only executed once in a single very high-risk scenario could also be very critical.
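The accumulation described above can be sketched in a few lines of Python (scenario names, costs and method names here are invented for illustration):

```python
from collections import defaultdict

# Each system test scenario: the customer's estimated cost of failure,
# and the methods it executed end-to-end.
scenarios = [
    (100, ["apply_brakes", "read_wheel_speed"]),   # high-risk
    (1,   ["format_report", "read_wheel_speed"]),  # low-risk
    (1,   ["format_report", "set_font"]),          # low-risk
]

# Add each scenario's cost of failure to every method it hits.
criticality = defaultdict(int)
for cost, methods in scenarios:
    for method in methods:
        criticality[method] += cost

# read_wheel_speed bears load from both high- and low-risk scenarios;
# set_font is only ever exercised in one low-risk scenario.
print(sorted(criticality.items(), key=lambda kv: -kv[1]))
```

Sorting the totals gives you the raw data for the criticality "heat map".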

As you play through each test case, you'll build a "heat map" of criticality in your code. Some areas will be safe and blue, some areas will be risky and red, and a few little patches of code may be white hot.

That is your load-bearing code. Target more exhaustive testing at it: random, data-driven, combinatorial, whatever makes sense. Test it more frequently. Inspect it carefully, many times with many pairs of eyes. Mathematically prove it's correct if you really need to. And, of course, do whatever you can to simplify it. Because simpler code is less likely to fail.

And you don't need code to make a start. You could calculate method criticality from, say, sequence diagrams, or from CRC cards, to give you a heads-up on how rigorous you may need to be implementing the design.

July 10, 2017

Codemanship Bite-Sized - 2-Hour Training Workshops for Busy Teams

One thing that clients mention often is just how difficult it is to make time for team training. A 2 or 3-day course takes your team out of action for a big chunk of time, during which nothing's getting delivered.

For those teams that struggle to find time for training, I've created a spiffing menu of action-packed 2-hour code craft workshops that can be delivered any time from 8am to 8pm.

Choose from:

  • Test-Driven Development workshops

    • Introduction to TDD

    • Specification By Example/BDD

    • Stubs, Mocks & Dummies

    • Outside-In TDD

  • Refactoring workshops

    • Refactoring 101

    • Refactoring To Patterns

  • Design Principles workshops

    • Simple Design & Tell, Don’t Ask

    • S.O.L.I.D.

    • Clean Code Metrics

To find out more, visit http://www.codemanship.co.uk/bitesized.html

July 6, 2017

Conceptual Correlation - Source Code + How To Build Your Own

Although it's only rough and ready, I've published the source code for my Conceptual Correlation calculator so you can get a feel for how it works and how you might implement your own in whichever language you're interested in.

It's actually only about 100 lines of code (not including tests), and if I put my brain in gear, it could well be significantly less. It's a pretty simple process:

1. Parse the code (or the IL code, in this case) using a parser, compiler, decompiler - whatever will get you the names used in the code

2. Tokenize those code names into individual words (e.g., thisMethodName becomes "this" "method" "name")

3. Tokenize the contents of a requirements text file

4. Filter stop words (basically, noise - "the", "at", "we", "I" etc) from these sets of words. You can find freely available lists of stop words online for many languages

5. Lemmatize the word sets - meaning boil down different inflections of the same word ("report", "reports", "reporting") to a single dictionary root

6. Optionally - just for jolly - count the occurrences of each word

7. Calculate what % of the set of code words are also contained in the requirements words

8. Output the results in a usable format (e.g., console)
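Steps 2-7 can be sketched in Python like so (the real tool uses Mono.Cecil and LemmaSharp; the tiny stop-word list, crude suffix-stripping "lemmatizer" and example names below are simplifications for illustration):

```python
import re

STOP_WORDS = {"the", "at", "we", "i", "a", "and", "of", "to"}

def tokenize(name):
    # Step 2: split PascalCase/camelCase (and digits) into words.
    parts = re.findall(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])|\d+", name)
    return [p.lower() for p in parts]

def lemmatize(word):
    # Step 5: very crude stand-in for a real lemmatizer.
    for suffix in ("ing", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word

def word_set(words):
    # Steps 4-5: drop stop words, reduce to dictionary roots.
    return {lemmatize(w) for w in words if w not in STOP_WORDS}

code_names = ["SubmitMortgageApplication", "MortgageApplicationFactory"]
requirements_text = "The customer submits a mortgage application"

code_words = word_set(w for name in code_names for w in tokenize(name))
req_words = word_set(requirements_text.lower().split())

# Step 7: % of code words that also appear in the requirements.
correlation = 100 * len(code_words & req_words) / len(code_words)
print(round(correlation))  # 75 - "factory" is the odd one out
```

Step 8 is then just a matter of printing the figure (and perhaps the non-correlated words) to the console.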

No doubt someone will show us how it can be done in a single line of F#... ;)

July 5, 2017

A Little Test for My Conceptual Correlation Metric

Here's a little test for my prototype .NET command line tool for calculating Conceptual Correlation. Imagine we have a use case for booking seats on flights for passengers.

The passenger selects the flight they want to reserve a seat on. They choose the seat by row and seat number (e.g., row A, seat 1) and reserve it. We create a reservation for that passenger in that seat.

We write two implementations: one very domain-driven...

And one... not so much.

We run Conceptual.exe over our first project's binary to compare against the use case text, and get a good correlation.

Then we run it over the second project's output and get zero correlation.

QED :)

You can download the prototype here. What will it say about your code?

Conceptual Correlation - Prototype Tool for .NET

With a few hours spare time over the last couple of days, I've had a chance to throw together a simple rough prototype of a tool that calculates the Conceptual Correlation between a .NET assembly (with a .pdb file in the same directory, very important that!) and a .txt file containing requirements descriptions (e.g., text copied and pasted from your acceptance tests, or use case documents).

You can download it as a ZIP file, and to use it, just unzip the contents to a folder, and run the command-line Conceptual.exe with exactly 2 arguments: the first is the file name of the .NET assembly, the second is the file name of the requirements .txt.


Conceptual.exe "C:\MyProjects\FlightBooking\bin\debug\FlightBooking.dll" "C:\MyProjects\FlightBooking\usecases.txt"

I've been using it as an external tool in Visual Studio, with a convention-over-configuration argument of $(BinDir)\$(TargetName)$(TargetExt) $(ProjectDir)\requirements.txt

I've tried it on some fair-sized assemblies (e.g., Mono.Cecil.dll), and on some humungous text files (the entire text of my 200-page TDD book - all 30,000 words), and it's been pretty speedy on my laptop and the results have been interesting and look plausible.

Assumes code names are in PascalCase and/or CamelCase.

Sure, it's no Mercedes. At this stage, I just want to see what kind of results folk are getting from their own code and their own requirements. Provided with no warranty with no technical support, use at own risk, your home is at risk if you do not keep up repayments, mind the gap, etc etc. You know the drill :)

Conceptual.exe uses Mono.Cecil to pull out code names, and LemmaSharp to lemmatize words (e.g., "reporting" and "reports" become "report"). Both are available via NuGet.

Have fun!

July 2, 2017

Conceptual Correlation - A Working Definition

During an enjoyable four days in Warsaw, Poland, I put some more thought into the idea of Conceptual Correlation as a code metric. (Hours of sitting in airports, planes, buses, taxis, trains and hotel bars give plenty of time for the mind to wander.)

I've come up with a working definition to base a prototype tool on, and it goes something like this:

Conceptual Correlation - the % of non-noise words that appear in names of things in our code (class names, method names, field names, variable names, constants, enums etc) that also appear in the customer's description of the problem domain in which the software is intended for use.

That is, if we were to pull out all the names from our code, parse them into their individual words (e.g., submit_mortgage_application would become "submit" "mortgage" "application"), and build a set of them, then Conceptual Correlation would be the % of that set that appeared in a similar set created by parsing, say, a FitNesse Wiki test page about submitting mortgage applications.

So, for example, a class name like MortgageApplicationFactory might have a conceptual correlation of 67% (unless, of course, the customer actually processes mortgage applications in a factory).
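Working that example through as a quick sanity check (the requirements words here are invented; the point is that two of the class name's three words belong to the domain, while "factory" is a solution word):

```python
# MortgageApplicationFactory tokenizes into three words...
name_words = {"mortgage", "application", "factory"}

# ...of which only two would plausibly appear in the customer's
# description of the problem domain (hypothetical word set).
requirement_words = {"mortgage", "application", "approve", "lender"}

correlation = 100 * len(name_words & requirement_words) / len(name_words)
print(round(correlation))  # 67
```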

I might predict that a team following the practices of Domain-Driven Design might write code with a higher conceptual correlation, perhaps with just the hidden integration code (database access, etc) bringing the % down. Whereas a team that are much more solution-driven or technology-driven might write code with a relatively lower conceptual correlation.

For a tool to be useful, it would not only report the conceptual correlation (e.g., between a .NET assembly and a text file containing its original use cases), but also provide a way to visualise and access the "word cloud" to make it easier to improve the correlation.

So, if we wrote poorly correlated code for booking seats on flights, the tool would bring up a selection of candidate words from the requirements text to replace the non-correlated names in our code with.

I currently envisage this popping up as an aid when we use a Rename refactoring, perhaps accentuating words that haven't been used yet.

A refactored version of the code would show a much higher conceptual correlation.

The devil's in the detail, as always. Would the tool need to make non-exact correlations, for example? Would "seat" and "seating" be a match? Or a partial match? Also, would the strength of the correlation matter? Maybe "seat" appears in the requirements text many times, but only once in the code. Should that be treated as a weaker correlation? And what about words that appear together? Or would that be making it too complicated? Methinks a simple spike might answer some of these questions.

June 25, 2017

Conceptual Correlation

I've long recommended running requirements documents (e.g., acceptance tests) through tag cloud generators to create a cheap-and-cheerful domain glossary for developers to refer to when we need inspiration for a name in our code.

But, considering today how we might assess the readability of code automatically, I imagined what we could learn by doing this for both the requirements and our code, and then comparing the resulting lexicons to see how much conceptual overlap there is.

I'm calling this overlap Conceptual Correlation, and I'm thinking this wouldn't be too difficult to automate in a basic form.

The devil's in the detail, of course. "Noise" words like "the", "a", "and" and so on would need to be stripped out. And would we look for exact word matches? Would we wish to know the incidence of each word and include that in our comparison? (e.g., if "flight" came up often in the requirements for a travel booking website, but was only mentioned once in the code, would that be a weaker correlation?)

I'm thinking that something like this, coupled with a readability metric similar to the Flesch-Kincaid index, could automatically highlight code that might be harder to understand.

Lots to think about... But it also strikes me as very telling that tools like this don't currently exist for most programming languages. I could only find one experimental tool for analysing Java code readability. Bizarre, when you consider just what a big deal we all say readability is.

May 30, 2017

20 Dev Metrics - 20. Diversity

The final metric in my series 20 Dev Metrics is Diversity.

First of all, we can have diversity of people: their ages, their genders, their sexual orientations, their ethnic backgrounds, their nationalities, their abilities (and disabilities), their socio-economic backgrounds, their educational backgrounds, and so on.

But we can go beyond this and also consider diversity of ideas. The value of diversity is essentially more choice. A team with 10 different ideas for improving customer retention is in a better position for solving their problem than a team with only one.

Nurturing diversity of people can lead to a greater diversity of ideas, but I believe we shouldn't take that effect for granted. Teams made up of strikingly different people are still quite capable of group-think. Culture is susceptible to homogenisation, because people tend to try to fit in. A more diverse group of people may just take a bit longer to reach that uniformity. Therefore, diversity is not a destination, but a journey; a process that continually renews itself by ingesting new people and new ideas.

For example, on your current product or project, how many different ideas were considered? How many prototypes were tried? You'd be amazed at just how common it is for dev teams to start with a single idea and stick to it to the bitter end.

What processes and strategies does your organisation have for generating or finding new ideas and testing them out? Where do ideas come from? Is it from anyone in the team, or do they all come from the boss? (The dictatorial nature of the traditional hierarchical organisation tends to produce a very narrow range of ideas.)

What processes and strategies does your organisation have for attracting and retaining a diverse range of people? Does it have any at all? (Most don't.)

How outward-looking are the team? Do they engage with a wide range of communities and are they exposed to a wide range of ideas? Or are they inward-looking and insular, mostly seeking solutions in their own backyard?

The first step to improving diversity is measuring it. Does the makeup of the team roughly reflect the makeup of the general population? If not, then maybe we need to take steps to open the team up to a wider range of people. Perhaps we need to advertise jobs in other places? Perhaps we need to look at the team's "brand" when we're hiring to see what kind of message we're sending out? Does "Must be willing to work long hours" put off parents with young children? Does "Regular team paintballing" exclude people with certain disabilities? Does "We work hard, play hard" say to the teetotaller "You probably won't fit in"?

Most vitally, is your organisation the kind that insists on developers arriving fully-formed (and therefore are always drawing from the narrow pool of people who are already software developers)? Or do you offer chances for people who wouldn't normally be in that pool to learn and become developers? Do you offer paid apprenticeships or internships, for example? Are they open to anyone? Are you advertising them outside of the software development community? How would a 55-year-old recently forced to take early retirement find out about your apprenticeship? How would an 18-year-old who can't afford to go to university hear about your internship? These people probably don't read Stack Overflow.

May 22, 2017

20 Dev Metrics - 19. Progress

Some folk have - quite rightly - asked "Why bother with a series on metrics?" Hopefully, I've vindicated myself with a few metrics you haven't seen before. And number 19 in the series of 20 Dev Metrics is something that I have only ever seen used on teams I've led.

When I reveal this metric, you'll roll your eyes and say "Well, duh!" and then go back to your daily routine and forget all about it, just like every other developer always has. Which is ironic, because - out of all the things we could possibly measure - it's indisputably the most important.

The one thing that dev teams don't measure is actual progress towards a customer goal. The Agile manifesto claimed that working software is the primary measure of progress. This is incorrect. The real measure of progress is vaguely alluded to with the word "value". We deliver "value" to customers, and that has somehow become confused with working software.

Agile consultants talk of the "flow of value", when what they really mean is the flow of working software. But let's not confuse buying lottery tickets with winning jackpots. What has value is not the software itself, but what can be achieved using the software. All good software development starts there.

If an app to monitor blood pressure doesn't help patients to lower their blood pressure, then what's the point? If a website that matches singles doesn't help people to find love, then why bother? If a credit scoring algorithm doesn't reduce financial risk, it's pointless.

At the heart of IT's biggest problems lies this failure of almost all development teams to address customers' end goals. We ask the customer "What software would you like us to build?", and that's the wrong question. We effectively make them responsible for designing a solution to their problem, and then - at best - we deliver those features to order. (Although, let's face it, most teams don't even do that.)

At the foundations of Agile Software Development, there's this idea of iterating rapidly towards a goal. Going back as far as the mid-1970s, with the germ of Rapid Development, and the late 1980s with Tom Gilb's ideas of an evolutionary approach to software design driven by testable goals, the message was always there. But it got lost under a pile of daily stand-ups and burndown charts and weekly show-and-tells.

So, number 19 in my series is simply Progress. Find out what it is your customer is trying to achieve. Figure out some way of regularly testing to what extent you've achieved it. And iterate directly towards each goal. Ditch the backlog, and stop measuring progress by tasks completed or features delivered. It's meaningless.

Unless, of course, you want the value of what you create to be measured by the yard.