May 19, 2017

Learn TDD with Codemanship

20 Dev Metrics - 18. External Dependencies

18th in my series 20 Dev Metrics is External Dependencies.

If our code relies too much on other people's APIs, we can end up wasting a lot of time fixing things that are broken when the contracts change. (Anyone who's written code that consumes the Facebook API will probably know exactly what I mean.)

In an ideal world, APIs would remain backwards-compatible. But in the real world, where 3rd-party developers aren't as disciplined as we are, they change all the time. So our code has to keep changing to continue to work.

I would argue that, with the way our tools have evolved, it's too easy these days to add external dependencies to our software.

It helps to be aware of the burden we're creating as we suck in each new library or web service, lest we fall prey to the error of buying the whole Mercedes just for the cigarette lighter.

The simplest metric is just to count the number of dependencies. The more there are, the more unstable our code will become.

It's also worth knowing how much of our code has direct dependencies on external APIs. Maybe we only depend on JDBC, but if 50% of our code directly references JDBC interfaces, we still have a problem.

You should aim to have as little of your code as possible depend directly on 3rd-party APIs, and to use as few different APIs as you can to build the software you need.

(And, yes, I'm including GUI frameworks etc in my definition of "external dependencies")

May 5, 2017

20 Dev Metrics - 15. Backwards Compatibility

Metric No. 15 in my 20 Dev Metrics series is short and sweet - Backwards Compatibility.

If you've heard of the Liskov Substitution Principle (the "L" in "SOLID"), which states that an instance of any class can be replaced with an instance of any of its subclasses... Well, let me introduce you to the Gorman Substitution Principle:

"A version of any API can be replaced with a later version"

Or, to put it more bluntly: thou shalt not break client shit that was working.

For a published component or service (reusable code with an API), run new releases against the tests for previous releases. How many releases back can you go before tests start to break?

This is a particular bugbear of mine; we're just a bit too change-happy with our APIs. So much so, that I wonder how many billions of dollars are wasted every year fixing client code that didn't need to be broken.

May 4, 2017

20 Dev Metrics - 14. Interface Specificity

The 14th in my series of 20 Dev Metrics is Interface Specificity, which measures the extent to which interfaces are made to be client or usage-specific. That is to say, the extent to which interfaces only include methods that specific clients need to use.

This helps us to observe the interface segregation principle (the "I" in "SOLID"), and reminds us that interfaces are for collaborating through, and therefore should be designed from the client's perspective.

Imagine we have a class Book, which has methods for getting the ISBN of a publication, and the rating. A class Library uses the ISBN to search for books, and a different class BookStats uses the rating to calculate statistics about the book.

The Library doesn't need to know about a book's rating, and BookStats doesn't need to know its ISBN. Generally speaking, we should seek to limit the knowledge classes have about other classes in the system, so we can limit the chances of them being broken by changes. So instead of binding both Library and BookStats to the same general Book class, we can split Book's interface and expose to each client only the methods it needs to use.
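As a rough sketch of what that split might look like in Java: the interface names HasIsbn and HasRating, and all of the method signatures, are my own assumptions for illustration, since the post only names Book, Library and BookStats.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical client-specific interfaces; the names are illustrative only.
interface HasIsbn {
    String getIsbn();
}

interface HasRating {
    double getRating();
}

// Book still offers both features, but each client sees only one of them.
class Book implements HasIsbn, HasRating {
    private final String isbn;
    private final double rating;

    Book(String isbn, double rating) {
        this.isbn = isbn;
        this.rating = rating;
    }

    @Override
    public String getIsbn() {
        return isbn;
    }

    @Override
    public double getRating() {
        return rating;
    }
}

// Library knows about ISBNs and nothing else.
class Library {
    private final Map<String, HasIsbn> catalogue = new HashMap<>();

    void add(HasIsbn item) {
        catalogue.put(item.getIsbn(), item);
    }

    HasIsbn findByIsbn(String isbn) {
        return catalogue.get(isbn);
    }
}

// BookStats knows about ratings and nothing else.
class BookStats {
    static double averageRating(List<? extends HasRating> rated) {
        return rated.stream().mapToDouble(HasRating::getRating).average().orElse(0.0);
    }
}
```

A change to how ratings work can no longer break Library, because Library never sees getRating().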

Interface Specificity is calculated thus: divide the number of methods used by a client class by the total number of methods exposed by the supplier type. If the supplier only exposes methods used by that client, then Interface Specificity is 100%. If the supplier has 4 methods, and the client only uses 2, then it's 50%. And so on.
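The arithmetic is simple enough to sketch in a few lines of Java. This assumes we've already extracted the method names by some other means (a static analysis tool, say); the class name and the method names in the usage note are hypothetical.

```java
import java.util.Set;

class InterfaceSpecificity {
    /**
     * Fraction (0.0 to 1.0) of the supplier's exposed methods that the client uses.
     * Methods the client "uses" that the supplier doesn't expose are ignored.
     */
    static double of(Set<String> methodsUsedByClient, Set<String> methodsExposedBySupplier) {
        if (methodsExposedBySupplier.isEmpty()) {
            throw new IllegalArgumentException("supplier exposes no methods");
        }
        long used = methodsUsedByClient.stream()
                .filter(methodsExposedBySupplier::contains)
                .count();
        return (double) used / methodsExposedBySupplier.size();
    }
}
```

For a supplier exposing getIsbn, getTitle, getAuthor and getRating, a client using only getIsbn and getTitle scores 0.5, which is the 50% case described above.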

An average of Interface Specificity across the software could serve as an indicator of how we're doing generally on this front. It would rarely reach 100%, but 80% or above would suggest we're probably doing okay.

May 3, 2017

20 Dev Metrics - 13. Swappability of Dependencies

The 13th in my series 20 Dev Metrics is Swappability of Dependencies.

Swappability lies at the core of object oriented and component-based design, and so we should take a keen interest in how easy it would be to replace an object's collaborators without it having to change. For example, we might want to swap a data access object with a stub for testing, or swap a payment processing service when the customer is in a specific country.

Swappability as a general concept is pretty much universal, but differs in its implementation depending on the language. To make a dependency swappable in C++, we must do more than we would need to in, say, Ruby and other dynamically-typed languages.

I'll illustrate with a Java example.

Here we're depending directly on a static method of a class ImdbService to get information about a video the customer wants to rent. If we wanted to get that information from a different source (e.g., Amazon), there's no easy way to do it.

In our refactored design, we've made that dependency swappable by 3 steps:

1. We made the static method an instance method, so it can be overridden

2. We passed the instance into the constructor ("dependency injection"), so instantiation happens outside of Pricer. i.e., someone else decides what implementation to use

3. We extracted an interface for ultimate swappability ("dependency inversion"). Pricer can use any service that implements that interface.
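The three steps above might end up looking something like this. It's a sketch rather than the original listing: Pricer, ImdbService and VideoInfoService.fetchVideo() come from the text, while the Video type, its fields and the pricing rule are assumptions made up for illustration.

```java
// Hypothetical value type; the post doesn't show what fetchVideo() returns.
class Video {
    final String title;
    final double rating;

    Video(String title, double rating) {
        this.title = title;
        this.rating = rating;
    }
}

// Step 3: the extracted interface that Pricer depends on.
interface VideoInfoService {
    Video fetchVideo(String videoId);
}

// Step 1: fetchVideo() is now an instance method rather than a static one.
class ImdbService implements VideoInfoService {
    @Override
    public Video fetchVideo(String videoId) {
        // A real implementation would call out to IMDb; this stand-in returns canned data.
        return new Video("Unknown title " + videoId, 0.0);
    }
}

class Pricer {
    private final VideoInfoService videoInfo;

    // Step 2: the collaborator is injected, so the caller chooses the implementation.
    Pricer(VideoInfoService videoInfo) {
        this.videoInfo = videoInfo;
    }

    // Hypothetical pricing rule, purely for illustration.
    double price(String videoId) {
        Video video = videoInfo.fetchVideo(videoId);
        return video.rating > 4.0 ? 5.0 : 4.0;
    }
}
```

Because VideoInfoService has a single method, a test can swap in a stub with a lambda, e.g. new Pricer(id -> new Video("Stub", 4.5)), without touching Pricer itself.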

In dynamically-typed languages, we may not need an interface - technically speaking - but many programmers get into the habit of creating classes with empty methods to represent an interface, mostly because it makes more sense than extending an implementation (e.g., is an AmazonVideoService really a kind of ImdbService?).

In C++, we would absolutely need an interface, as we can only readily override methods declared as virtual. And other languages like Java are somewhere in between.

Measuring swappability in Java would be a matter of analysing references to other objects and determining where those references are instantiated. If they're instantiated inside the client class, then they're not swappable. If they're passed in as a method parameter, they're swappable - but only if all of the methods used are overrideable. Hence, binding to a pure interface gives ultimate swappability. And, of course, if static methods are used, then that's zero swappability.

How would I calculate swappability for a Java class?

I'd calculate swappability for each individual reference, and then divide the total for all of them by the maximum possible swappability.

If a reference is static, then it has 0% swappability.

If a reference isn't dependency-injected, it has 0% swappability.

If a reference is dependency-injected, its swappability will depend on which of its methods are being used:

a. If a method used is abstract, that counts as 100% swappable

b. If a method used has an implementation, but is overrideable, that counts as partially swappable - 50%

c. If a method used cannot be overridden, that has 0% swappability.

For each reference, swappability is the average swappability of methods used. For the class as a whole, swappability is the average swappability of references. And at a package or system level, it's the average swappability across all of the classes.

So, when Pricer uses ImdbInfo.fetchVideo(), it has zero swappability because it's a static reference. When Pricer uses a dependency-injected VideoInfoService.fetchVideo(), it has 100% swappability because that method is abstract.
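Since there's no tool for this, here's a minimal sketch of how the scoring rules above could be expressed in Java. It assumes something else has already worked out, for each reference, whether it's static, whether it's injected, and what kind of methods are called through it; all of the type and field names are hypothetical.

```java
import java.util.List;

// The three kinds of method from rules a, b and c, with their scores.
enum MethodKind {
    ABSTRACT(1.0), OVERRIDABLE(0.5), FINAL(0.0);

    final double score;

    MethodKind(double score) {
        this.score = score;
    }
}

// One reference from a client class to a collaborator.
class Reference {
    final boolean isStatic;
    final boolean injected;
    final List<MethodKind> methodsUsed;

    Reference(boolean isStatic, boolean injected, List<MethodKind> methodsUsed) {
        this.isStatic = isStatic;
        this.injected = injected;
        this.methodsUsed = methodsUsed;
    }

    double swappability() {
        // Static or non-injected references can't be swapped at all.
        if (isStatic || !injected) {
            return 0.0;
        }
        // Otherwise, average the scores of the methods used
        // (assumption: a reference with no methods used scores 0).
        return methodsUsed.stream().mapToDouble(m -> m.score).average().orElse(0.0);
    }
}

class Swappability {
    // Class-level swappability: the average across all of its references.
    static double ofClass(List<Reference> references) {
        return references.stream().mapToDouble(Reference::swappability).average().orElse(0.0);
    }
}
```

A class with one static reference and one injected reference that only calls abstract methods would score 0.5 overall.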

You'll no doubt be delighted to learn that there are no automated tools for calculating this metric at present for any languages. So this is some tooling you would need to rig up yourself. For now, though, I find it a very useful conceptual tool for reasoning about swappability of dependencies.

A cruder approach would be to calculate what proportion of references are to interfaces, and from a tooling perspective this is much simpler, but arguably a bit of a blunt instrument... And very language-specific. For example, a field may be of an interface type, but if it's instantiated inside the constructor of that class, then it's not swappable.

April 28, 2017

20 Dev Metrics - 11. Coupling

Number 11 in my series 20 Dev Metrics helps us to predict the potential impact of changing one part of our software on the rest of the software. Coupling is a measure of how interrelated code is, at various levels of organisation (classes, components, systems, services etc).

It's simply a matter of counting references in, say, one class to other classes (or features of other classes), or classes in one component to classes in other components. And so on.
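To make the counting concrete, here's a small Java sketch at the class level. It assumes the references have already been gathered into a map from each class name to the set of class names it refers to; the Coupling class is hypothetical, and counting incoming as well as outgoing references is my own addition for illustration.

```java
import java.util.Map;
import java.util.Set;

class Coupling {
    // How many other classes does this class refer to?
    static int outgoing(String className, Map<String, Set<String>> references) {
        return (int) references.getOrDefault(className, Set.of()).stream()
                .filter(referenced -> !referenced.equals(className))
                .count();
    }

    // How many other classes refer to this class?
    static int incoming(String className, Map<String, Set<String>> references) {
        return (int) references.entrySet().stream()
                .filter(entry -> !entry.getKey().equals(className))
                .filter(entry -> entry.getValue().contains(className))
                .count();
    }
}
```

The same counting works at the component level if the map keys are components rather than classes.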

When software is tightly coupled, changes can "ripple" out along the dependencies, breaking other parts of the code, like the ripples that cascade outwards when we throw a pebble into a pond. One of the goals of good modular software design is to localise those ripples and therefore minimise the impact of changes. So we aim for modules that are loosely coupled, and know as little about each other as possible.

Some people mistakenly believe this is an object oriented design principle. But it actually applies to modular software and systems of any kind. If we were writing our software in Pascal or COBOL, it would be just as true. However the technology allows us to modularise code, those modules need to be loosely coupled.

Many tools exist that can do this counting for us, thankfully.

December 21, 2016

"Our Developers Don't Do Any Design". Yes They Do. They Have To.

A complaint I hear often from managers about their development teams is "they don't do any design".

This is a nonsense, of course. Designedness - is that a word? It is now - is a spectrum, with complete randomness at one end and zero randomness at the other. i.e., completely unintentional vs. nothing unintentional.

Working code is very much towards the zero randomness end of the spectrum. Code with no design wouldn't even compile, let alone kind of sort of work.

To look at it another way, working code is a tiny, tiny subset of possible combinations of alphanumeric characters. The probability of accidentally stumbling on a sequence of random characters that makes working code is so vanishingly remote, we can dismiss it as obvious silliness.

Arguably, software design is a process of iteratively whittling down the possibilities until we arrive at something that ticks the right boxes, of which there will be so very many if the resulting software is to do what the customer wants.

It's clear, though, that this tiny set of possible working code configurations contains more than one choice. And when you say "they don't do any design", what you really mean is "I don't like the design that they've chosen". They've done lots and lots of design, making hundreds and thousands (and possibly millions) of design choices. You would just prefer they made different design choices.

In which case, you need to more clearly define the properties of this tiny subset that would satisfy your criteria. Should they require modules in their design to be more loosely coupled, for example? If so, then add that to the list of requirements; the tests their design needs to pass.

Finally, in some cases, when managers claim their development teams "don't do any design", what they really mean is they don't follow a prescribed design process, producing the requisite artefacts as proof that design was done.

The finished product is the ultimate design artefact. If you want to know what they built, look at the code. The design is in there. And if you can't understand the code, maybe you should let someone who can worry about design.

September 30, 2016

Software Development Doesn't Scale. Dev Culture Does

For a couple of decades now, the Standish Group have published an annual "CHAOS" report, detailing the results of surveys taken by IT managers about the outcomes of IT projects.

One clear trend that emerged - and remains as true today as in 1995 - is that the bigger they are, the harder they fall. The risk of an IT project failing outright rises rapidly with project size and cost. When they reach a certain size - and it's much smaller than you may think - failure is almost guaranteed.

The reality of software development is that, once we get above a dozen or so people working for a year or two on the same product or system, the prognosis does not look good at all.

This is chiefly because - and how many times do we need to say this, folks? - software development does not scale.

If that's true, though, how do big software products come into existence?

The answer lies in city planning. A city is made up of hundreds of thousands of buildings, on thousands of streets, with miles of sewers and underground railways and electrical cabling and lawns and trees and shops and traffic lights and etc etc.

How do such massively complex structures happen? Is a city planned and constructed by a single massive team of architects and builders as a single project with a single set of goals?

No, obviously not. Rome was not built in a day. By the same guys. Reporting to one boss. With a single plan.

Cities appear over many, many decades. The suburbs of London were once, not all that long ago, villages outside London. An organic process of development, undertaken by hundreds of thousands of people and organisations all working towards their own unique goals, and co-operating or compromising when goals aligned or conflicted, produced the sprawling metropolis that is now London.

Trillions of pounds have been spent creating the London of today. Most of that investment is nowhere to be seen any more, having been knocked down (or bombed) and built over many times. You could probably create a "London" for a fraction of the cost in a fraction of the time, if it were possible to coordinate such a feat.

And that's my point: it simply isn't possible to coordinate such a feat, not on that scale. An office complex? Sure. A housing estate? Why not? A new rail line with new train stations running across North London? With a few tens of billions and a few decades, it's do-able.

But those big projects exist right at the edge of what is manageable. They invariably go way over budget, and are completed late. If they were much bigger, they'd fail altogether.

Cities are a product of many lifetimes, working towards many goals, with no single clear end goal, and with massive inefficiency.

And yet, somehow, London mostly looks like London. Toronto mostly looks like Toronto. European cities mostly look like European cities. Russian cities mostly look like Russian cities. It all just sort of, kind of, works. A weird conceptual cohesion emerges from the near-chaos.

This is the product of culture. Yes, London has hundreds of thousands of buildings, designed by thousands of people. But those people didn't work in bubbles, completely oblivious to each other's work. They could look at other buildings. Read about their design and their designers. Learn a thousand and one lessons about what worked and what didn't without having to repeat the mistakes that earned that knowledge.

And knowledge is weightless. It travels fast and travels cheaply. Hence, St Petersburg looks like the palaces of Versailles, and that area above Leicester Square looks like 19th century Hong Kong.

Tens of thousands of architects and builders, guided by organising principles plucked from the experience of others who came before.

Likewise, with big software products. Many teams, with many goals, building on top of each other, cooperating when it makes sense, compromising when there are conflicts. But, essentially, each team is doing their own thing for their own reasons. Any attempt to standardise, or impose order from above, fails. Every. Single. Time.

Better to focus on scaling up developer culture, which - those of us who participate in the global dev community can attest - scales beautifully. We have no common goal, no shared boss; but, somehow, I find myself working with the same tools, applying the same practices and principles, as thousands of developers around the world, most of whom I've never met.

Instead of having an overriding architecture for your large system, try to spread shared organising principles, like Simple Design and S.O.L.I.D. It's not a coincidence that hundreds of thousands of developers use dependency injection to make external dependencies swappable. We visit the same websites, watch the same screencasts, read the same books. On a 10,000-person programme, your architect isn't the one who sits in the Big Chair at head office drawing UML diagrams. Your architect is Uncle Bob. Or Michael Feathers. Or Rebecca Wirfs-Brock. Or Barbara Liskov. Or Steve Freeman. Or even me (a shocking thought!)

But it's true. I probably have more influence over the design of some systems than the people getting paid to design it. And all I did was blog, or record a screencast, or speak at a conference. Culture - in this web age - spreads fast, and scales rapidly. You, too, can use these tools to build bridges between teams, share ideas, and exert tacit influence. You just have to let go of having explicit top-down control.

And that's how you scale software development.

July 17, 2016

Oodles of Free Legacy UML Tutorials

See how we used to do things back in Olden Times by visiting the legacy UML tutorials section of the Codemanship website (the content from the highly-popular-with-your-granddad-back-in-the-day

I maintain that:

a. Visual modelling & UML are still useful and probably due for a comeback, and

b. Visual modelling and Agile Software Development can work well together when applied sparingly and sensibly

Check it out.

April 15, 2016

Compositional Coverage

A while back, I blogged about how the real goal of OO design principles is composability of software - the ability to wire together different implementations of the same abstractions to make our code do different stuff (or the same stuff, differently).

I threw in an example of an Application that could be composed of different combinations of database, external information service, GUI and reporting output.

This example design offers us 81 unique possible combinations of Database, Stock Data, View and Output for our application. e.g., A Web GUI with an Oracle database, getting stock data from Reuters and writing reports to Excel files.

A few people who discussed the post with me had concerns, though. Typically, in software, more combinations means more ways for our software to be wrong. And they're quite right. How do we assure ourselves that every one of the possible combinations of components will work as a complete whole?

A way to get that assurance would be to test all of the combinations. Laborious, potentially. Who wants to write 81 integration tests? Not me, that's for sure.

Thankfully, parameterised testing, with an extra combinatorial twist, can come to the rescue. Here's a simple "smoke test" for our theoretical design above:

This parameterised test accepts each of the different kinds of component as a parameter, which it plugs into the Application through the constructor. I then use a testing utility I knocked up to generate the 81 possible combinations (the code for which can be found here - provided with no warranty, as it was just a spike).
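To give a flavour of what generating every combination involves, here's a minimal Java sketch. It is not the testing utility mentioned above, just an illustration of the idea under my own assumptions; the Combinations class is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class Combinations {
    /** Returns every way of picking one option from each list, in list order. */
    static <T> List<List<T>> allOf(List<List<T>> options) {
        List<List<T>> result = new ArrayList<>();
        result.add(new ArrayList<>()); // start with one empty combination
        for (List<T> choices : options) {
            List<List<T>> extended = new ArrayList<>();
            // Extend every partial combination with every choice for this slot.
            for (List<T> partial : result) {
                for (T choice : choices) {
                    List<T> next = new ArrayList<>(partial);
                    next.add(choice);
                    extended.add(next);
                }
            }
            result = extended;
        }
        return result;
    }
}
```

With four kinds of component and three implementations of each, allOf() returns 3 x 3 x 3 x 3 = 81 combinations, each of which could be fed to the parameterised test as one set of constructor arguments.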

When I run the test, it checks the trade price calculation using every combination of components. Think of it like that final test we might do for a car after we've checked all the individual components work correctly - when we bolt them all together, and turn the key in the ignition, does it go?

The term I'm using for how many possible combinations of components we've tested is compositional coverage. In this example, I've achieved 100% compositional coverage, as every possible combination is tested.

Of course, this is a dummy example. The components don't really do anything. But I've simulated the possible cost of integration tests by building in a time delay, to illustrate that these ain't your usual fast-running unit tests. In our testing pyramid, these kinds of tests would be near the top, just below acceptance and system tests. We wouldn't run them after, say, every refactoring, because they'd be too slow. But we might run them a few times a day.

More complex architectures may generate thousands of possible combinations of components, and lead to integration tests (or "composition tests") that take hours to run. In these situations, we could probably buy ourselves pretty decent compositional coverage by doing pairwise combinations (and, yes, the testing utility can do that, too).

Changing that test to use pairwise combinations reduces the number of tests run to just 9.

April 11, 2016

Intensive S.O.L.I.D. - London, Sat June 4th

Just a quick note to mention that I'll be running a Codemanship Intensive S.O.L.I.D. training workshop in London on Saturday June 4th at the amazingly low price of £59 for a jam-packed day of OO and refactoring goodness.