April 25, 2016
Mutation Testing & "Debuggability"More and more teams are waking up to the benefit of checking the levels of assurance their automated tests give them.
Assurance, as opposed to coverage, answers a more meaningful question about our regression tests: if the code was broken, how likely is it that our tests would catch that?
To answer that question, you need to test your tests. Think of bugs as crimes in your code, and your tests as police officers. How good are your code police at detecting code crimes? One way to check would be to deliberately commit code crimes - deliberately break the code - and see if any tests fail.
This is a practice called mutation testing. We can do it manually, while we pair - I'm a big fan of that - and we can do it using one of the increasingly diverse (and rapidly improving) mutation testing tools available.
For Java, for example, there are tools like Jester and PIT. What they do is take a copy of your code (with unit tests), and "mutate" it - that is, make a single change to a line of code that (theoretically) should break it. Examples of automated mutations include turning a + into a -, or a < into <=, or ++ into --, and so on.
After it's created a "mutant" version of the code, it runs the tests. If one or more tests fail, then they are said to have "killed the mutant". If no test fails, then the mutant survives, and we may need to have a think about whether that line of code that was mutated is being properly tested. (Of course, it's complicated, and there will be some false positives where the mutation tool changed something we don't really care about. But the results tend to be about 90% useful, which is a boon, IMHO.)
Here's a mutation testing report generated by PIT for my Combiner spike:
Now, a lot of this may not be news for many of you. And this isn't really what this blog post is about.
What I wanted to draw your attention to is that - once I've identified the false positives in the report - the actual level of assurance looks pretty high (about 95% of mutations I cared about got killed.) Code coverage is also pretty high (97%).
While my tests appear to be giving me quite high assurance, I'm worried that may be misleading. When I write spikes - intended to as proof of concept and not to be used in anger - I tend to write a handful of tests that work at a high level.
This means that when a test fails, it may take me some time to pinpoint the cause of the problem, as it may be buried deep in the call stack, far removed from the test that failed.
For a variety of good reasons, I believe that tests should stick close to the behaviour being tested, and have only one reason to fail. So when they do fail, it's immediately obvious where and what the problem might be.
Along with a picture of the level of assurance my tests give me, I'd also find it useful to know how far removed from the problem they are. Mutation testing could give me an answer.
When tests "kill" a mutant version of the code, we know:
1. which tests failed, and
2. where the bug was introduced
Using that information, we can calculate the depth of the call stack between the two. If multiple tests catch the bug, then we take the shallowest depth out of those tests.
This would give me an idea of - for want of a real word - the debuggability of my tests (or rather, the lack of it). The shallower the depth between bugs and failing tests, the higher the debuggability.
I also note a relationship between debuggability and assurance. In examining mutation testing reports, I often find that the problem is that my tests are too high-level, and if I wrote more focused tests closer to the code doing that work, they would catch edge cases I didn't think about at that higher level.
February 2, 2014
Announcing Codemanship Developer Suite - Architect EditionIntroducing the new Codemanship Developer Suite - Architect Edition.
If you are a developer, DO NOT READ BEYOND THIS POINT. The following text is for non-technical budget holders only.
For an annual fee of £500,000 (not including sales tax, maintenance and catering), Codemanship Developer Suite - Architect Edition will solve all of your software development problems, and some problems that don't even have anything to do with developing software, like male-pattern baldness and erectile dysfunction.
Codemanship Developer Suite - Architect Edition does this by putting you in the driving seat, removing all key technical decisions from your developers, who are probably incompetent because they keep telling you that creating a social media site with all the features of Faceboook can't be done in 4 weeks.
Codemanship Developer Suite - Architect Edition comes as a tightly-integrated family of tools that take the guesswork out of making shit up as you go along.
* Enterprise Text Manipulation & Management allows architects to precisely capture the logic of your software and systems in textual form that can be shared among developers. Choose among dozens of available logic specification languages, including Java, C#, C++, Visual Basic, Ruby, PHP and Python.
* Cloud-based Text File History Management "Git" Hub allows teams to share their logic specifications and maintain a history of all changes for audit purposes. It's good because it's in the cloud.
* Automated Executable Software Generation directly from textual descriptions of the software's logic targeted at a wide variety of compatible platforms
* Automated Testing & Verification & Test Suite Management using the exact same logic specification languages used to describe the software itself, greatly reducing the testing learning curve.
* Team Architecture Management tools that have been proven to work over thousands of years. (Pencils not included.)
* Gratuitous & Highly Misleading Reports Of Development Activities for that comforting illusion of control
Our objective and entirely plausible studies have shown that teams can expect a return on investment of up to 10,000,000%, meaning that most businesses will become immensely wealthy purely as a result of buying Codemanship Developer Suite - Architect Edition.
But don't take our word for it; here's some eyewitness testimony from an entirely real customer who really uses it:
"Since moving to Codemanship Developer Suite, my daughters nightmares have abated and we no long suffer at the hands of the poltergeist who has haunted us since 1977." - Mortimer D. Batmobile, Head of Development, S.P.E.C.T.R.E.
To find out more about Codemanship Developer Suite - Architect Edition and to arrange a trial demonstration, click here
March 28, 2012
Announcing A Powerful New Framework - Programming LanguageProgramming Language is a powerful new framework that enables developers to quickly and easily handle Dependency Injection, Inversion of Control, Model-View-Controller and many other common design problems.
Programming Language is easy to use and takes no time to master. There's a version of Programming Language for pretty much every platform - Java, .NET, Linux, iOS etc. You name it, there's a Programming Language for it.
Programming Language is completely object-oriented (providing you're working on an OO programming platform, of course.)
Here are just a couple of examples that illustrate the power and flexibility of the Programming Language framework.
Dependency Injection in Programming Language for Java
In this example, we use Programming Language to provide a mapping between a method parameter, declared as an instance of some interface Abstraction, and a concrete implementation of abstraction. These mappings are stored in a special, easily configurable file called a "Java class". The client method can now access features of Implementation without binding directly to it, and we can easily substitute Implementation for any other class that implements the Abstraction interface (e.g., for the purposes of mocking or stubbing it in unit tests).
Inversion of Control in Programming Language .NET
In this second example, we use the advanced IoC feature of Programming Language to define the explicit order of workflow in a user interface using a special mapping file called a "C# class". This gives us greater flexibility over the workflow. If we want to change the order of events, we simply edit the special mapping file, recompile and - bingo!
Of course, these are just two simple examples. But I'm sure even the least experienced developers among you will already see the incredible potential of the Programming Language framework.
Here's a list of some of the other powerful features accessible through Programming Language:
* Observers & Events
* Undoable Commands
* Role-based Access Control
* And many, many more.
You can download the latest stable build of Programming Language here.
January 24, 2012
Jason's Handy Guide To Evaluating Software PackagesI get asked this question a lot, but it never occurred to me to write down my usual answer.
How do we evaulate shrink-wrapped software against our needs?
Well, that's easy. You still need to do the usual business requirements analysis. Identify who will be using this system, and what their goals will be for using it. In the good old days, we called these "Use Cases". Yep, even if you're buying and not building the software, you still need use cases.
The next step is to flesh out the design of your use cases, as we might normally do, by describing how the user interacts with the software to achieve their goal.
When we're describing software we haven't built yet, this is design. When we're describing how we'll use software that already exists, this is a process of validation. Can the user achieve their goal using the software we're evaluating?
Even with the most feature-rich packages, we tend to find we don't get an exact match. It's not always possible to achieve every user goal using the software. So as we validate the software against our use cases, we may identify gaps. There are almost always gaps.
The next question we need to answer is can we fill those gaps? Let's say we're evaluating Microsoft PowerPoint for our training business. It doesn't do everything we need out of the box. Let's pretend we have a use case where the trainer needs to populate a slide with an organisation chart showing the reporting structure of the group attending the course. She has a spreadsheet with those names listed in alphabetical order and with information about who reports to whom. using PowerPoint's built-in scripting language, Visual Basic for Applications (VBA), it is indeed possible to take that information and automatically generate an Org Chart.
So that gap could be plugged, with some work. Write a reminder about it down on a blank index card. This is now a potential "User Story" for some programming work that would need to be done if we went the PowerPoint route.
Of course, people identify gaps in software all the time, and it's possible that someone somewhere has already found a solution to plugging some of your gaps with handy tools and utilities. Google is your friend here: search for solutions before you think about reinventing the wheel. If you find one, and there's money involved, write down roughly how much on the index card.
Finally, don't forget the non-functional requirements. A package may offer the right features, but it may not be able to handle a high-enough volume of users, or it may not be secure enough for your purposes, or it may take a long time for users to learn. Evaluate thye software against these criteria, too. Be as explicit as you can. Handwavy requirements like "it must be scalable" aren't very helpful for validating software. What do you mean by "scalable" - a certain number of users at any one time, or a certain number of transactions per second, or the ability to run it on more servers?
All too often, businesses buy a solution and then validate that it does what they need - often by actually trying to roll it out. Whether buying or building, the key is to have clear, testable requirements and to validate the software against them. Don't be seduced by their sales patter and let them lead you like a donkey to the slaughter to their feature list. What their software does is far less important than what we can do with their software.
April 2, 2011
Spring Is In The AirYesterday was April 1st - April Fool's Day. Lots of pranks were pulled; not that I would ever stoop to such levels, of course.
And pranks from April Fool's Days of yore also redid the rounds. One that did make me chuckle was this oldie from 2006 (Ah, 2006! Those were the days...) Someone claimed to have created an XML version of C#, demonstrating just how easy it was to write simple code using this new markup language.
The idea of writing C# code in XML is thus shown to be absurd and ridiculous. And yet, these days many of us do something very similar. "Dependency injection" and "inversion of control" frameworks do exactly this. They allow us to express things like object creation, dependency injection and flow of control in XML files.
The usual selling point of these frameworks, like Spring and Castle Windsor, is that they enable us to "soft-code" certain key dependencies to achieve a greater separation of concerns and make our applications easier to change. Some even tout the possibility of allowing non-programmers to edit these configuration files and thus make changes to, say, the flow of the UI without all that fiddly coding, building, testing and redeploying. Hooray for their side!
In reality, it's magic beans (powered by electric parsnips). If you edit an MVC mapping file, you can easily end up with the flow of the UI not matching the state of the user's session. A controller may end up having certain expectations - certain pre-conditions - broken because B didn't follow A as originally envisioned. In MVC, workflows are not infinitely flexible. So it's entirely possible to break an application by editing one of these configuration files. And if it can break it, then editing these files must be followed by retesting of the application. And, in my book, that makes them source code - something only a programmer who knows the system and works with good programming discipline can be trusted to modify.
The consequence of that is simple: when you embed DI or IoC or MVC logic in XML files, all you're really doing is converting C# or Java or Ruby or Visual Basic into the kind of XML representation we'd normally interpret as an April Fool's prank. (Okay, with Visual Basic it might actually be an improvement, but you get the point.)
It's entirely illusory that when you extract a dependency from your code into an XML file that you havr removed that dependency. All it takes is one change to the XML and the effect of that dependency can still be felt in the shape of failing tests and broken functionality, every bit as much as if you'd left it in the code. The dependency is still there, and no separation of concerns has been achieved.
The advantage of leaving dependencies in your Java or C# or Ruby or VB code is that they are easier to see, both by inspection, by the compiler and by static code analysis tools designed to help you manage dependencies.
Heavy reliance on these frameworks will also hit your unit tests where it hurts. They'll run slower and the test set-ups will get more complicated and difficult to manage.
And the funny thing is, I never had any trouble implementing patterms like Model-View-Controller, Dependency Injection or Inversion of Control before these frameworks started appearing. MVC is easy. If you want it really decoupled, try implicit invication (you may know it as the "Observer" pattern). Dependency injection literally just means passing in references to objects, making them more readily substitutible (helps with testability, for example). I mean, c'mon. How hard is it to write a constructor or implement the "Visitor" pattern? As for Inversion of Control; well, that just means that instead of baking in a workflow in the interactions between collaborating objects, we defer to a higher-level object that co-ordinates that workflow from above, making it easier to change and also introducing the possibility of changing the workflow dynamically at run-time.
Since those XML files are in effect source code, I see no real advantages in taking what would have been plain old code written in an elegant programming language with good tool support and translating into crappy markup and sweeping it under the carpet, in the vain hope that what the compiler can't see won't hurt us.
I find it better to have all the dependencies out in the open. It makes managing them so much easier.
November 13, 2010
A Short Rant About Generating Tests From CodeGrrr. Mutter. Grumble.
Just been noodling with Visual Studio 2010 and yet again it seems Microsoft have completely missed the point.
One thing in particular gets my goat: unit test generation from code. As far as I can see, this is a clear endorsement for not doing test-driven development. When you invent tools that work the opposite way around, what message are you sending to impressionable developers?
Generating tests from code is a great way to verify that your code does what it does. It promotes solutions to the role of problem statements, the antithesis of test-driven.
Yet again, Microsoft seem to encourage us to start with the code and work our way back to the requirements by handing us tools that encourage us to think in this direction. Increasingly I find trying to be test-driven with Visual Studio an exercise in planing against the grain of the tool.
September 23, 2009
Marketing Codemanship: What Search Terms Do I Pick?I recently got about $50 of free Google AdWords vouchers to help promote my company web site, codemanship.com. Yay!
Only, not so yay... You see, I've stumbled across something of a puzzle. I don't - repeat, DO NOT - want to tout myself as an "Agile Coach", or indeed, an "Agile" anything. Because, when I use that word - even just in passing - I immediately get pigeonholed by clients as a Scrum Master. Happens every time. And that's not business that particularly interests me, but it's a tough expectation to break once the perception's been established.
But if I'm not selling Agile coaching, then what, exactly, am I selling? And this is my puzzle. What keywords could I choose to have my Google ad appear when IT or development managers search on them looking for my kind of services?
Would an IT manager search for "software craftsmanship"? Very doubtful. Indeed, these days very few people are searching for "software craftsmanship". I know this because the SC2009 web site ranks second in Google's search results for that phrase, and my web stats tell me interest via search engines is very, very small compared to, for example, "use cases".
So I'm a bit stuck as to what keywords I should hang my hat on here. If a company's software/systems are poorly crafted - unreliable, untested, untestable, overly complex, hard to fathom, riddled with duplication and the worst kinds of dependencies - how could I use search engine advertising to find my quarry? Indeed, would a budget holder ever be actively seeking what I'm offering?
June 21, 2009
Free Pencil & Paper UML Trial EditionWhen people ask me what my favourite UML modeling tool is, I tell them that pencil and paper is the one to beat (or marker and whiteboard, if you need team modeling capabilities).
It's possibly a sign of our times that many software developers have not heard about "pencil and paper" before, and the follow-up question is usually "where can I get a demo version of that?"
So as a service to all your pencil and paper novices out there, parlezuml.com is giving you 10 blank sheets of special UML modeling paper which you can use to evaluate the tool today.
May 26, 2009
Devin Townsend's KiHoorah!
Devin Townsend's latest album "Ki" (pronounced "Kee") is almost upon us and you can preview many of the tracks on YouTube.
This Canadian chap is, in my humble onion, by far the most talented singer/writer/instrumentalist I've had the good fortune to listen to.
Full Title Track, Ki
April 16, 2009
Twitter Re-Educates Me On "Value"Twitter is currently teaching me a valuable lesson in - er - value.
Whenever I try to log into the service at the moment, be it via the web site, from my mobile phone or via TweetDeck, it's 50/50 that I'll actually get a response from the site.
If I am successfully logged in, when I submit an update it's 50/50 again that it'll actually get successfully posted.
Add to that some "logic quirks" with the functionality itself and some very dodgy AJAX nonsense going on, and we have one demonstrably flakey Web 2.0
And yet current speculation from those in the know (though the crisis in our financial markets maybe suggests that these people know nothing) values Twitter at some ridiculous, astronomical figure. We're talking the GDP of a small, but developed, economy here.
I am still convinced that a team of 4-6 good developers could deliver Twitter pretty much as it is on the Google App Engine in less than a fortnight. And deliver a far more reliable and robust implementation at that.
But how does the relationship between reliability and value work in this case? It calls into question the simplistic predictable relationship - a commoditised version of "value", if you like - that we've been assuming when we talk about things like "value streams" and "sustainable delivery of value".
If Twitter runs on just a few thousand lines of code, and I suspect there's very little under the bonnet in reality - then, if the service is valued in the billions, each line of code could be carrying the weight of millions of dollars in equity.
Imagine writing a line of code knowing that! How much time and effort would put in to making it right, and into making it scale?
Except, of course, Twitter's developers had no idea when they wrote it. And neither, I suspect, do any of us when we're writing code. No idea at all.