March 23, 2011

Gorman Metrics - Unifying OO Design Metrics At Class & Package Level

Having just run another OO design principles workshop in London for the BBC, it occurred to me that I have yet to "go public" on one key aspect of the course that I think could be of wider interest.

For several years I've been working from the notion that OO design principles have been artificially distinct at different levels of code organisation. We discuss and analyse dependencies at the class level and the package level differently for some reason. It's probably because different people have addressed different design principles at different times, and nobody sat down and asked "okay, so what have all these got in common?"

On the course, I proffer 4 unified principles of dependency management, which are prioritised as:

1. Minimise dependencies - often not mentioned in the context of dependency management, but less code, and less duplication, usually means fewer dependencies to worry about in the first place.

2. Localise dependencies - no code at all may be the ideal from the standpoint of minimising dependencies, but it has its drawbacks regarding usefulness. For the code we do have to write, we aim whenever possible to encapsulate dependencies between data and functions (and between functions and other functions) inside classes, and to encapsulate dependencies between classes inside packages. This has the effect of localising the impact of change.

3. Stabilise dependencies - while the ideal for localising dependencies might be putting everything in one big class, this is not ideal from the perspective of reuse or for the comprehensibility or testability of the software. Inevitably, in order to reuse code and to break the design down into manageable, testable chunks, we must have some dependencies crossing class and package boundaries. In those cases, we strive to depend on things that are less likely to change.

4. Abstract dependencies - we want to be able to add behaviour to our software without, if at all possible, impacting other parts of the software. OO gives us a way to do this by extending existing classes and packages, rather than modifying them (the Open-Closed Principle). Therefore, the more depended-upon classes and packages are, the more open to extension they need to be.

At the class level, we have a fairly obvious notion of class coupling. That's just the number of dependencies between one class and other classes. Simples.

Class cohesion refers to the "inner-connectedness" of classes - the extent to which features of that class are related to each other. There is no standard metric for "class cohesion", though. For some bizarre reason, possibly something to do with beards, we've been measuring the lack of class cohesion, and we've been doing it in what I think is a rather overblown and counterintuitive way. Lack of Cohesion of Methods is an old and established metric - a measure, if you like, of the extent to which methods on a class don't access the fields in that class. I think it's a crappy metric, to be perfectly frank. And often not very helpful.

If we're interested in class cohesion, let's measure class cohesion, damn it! I use this metric:

Class Cohesion = Number of Feature Couplings In A Class / Number of Methods
(where "feature" is a field, a method, a constant etc, and methods > 0)

Basically, it's the average number of internal dependencies per method. The higher that is, the more cohesive the class. (Though not too high, eh?!). Simples.
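
To make the arithmetic concrete, here's a minimal Python sketch of that calculation. The dictionary representation, the function name class_cohesion and the Account example are all made up for illustration, and I'm assuming each distinct use of a feature by a method counts as one coupling.

```python
def class_cohesion(feature_uses: dict[str, set[str]]) -> float:
    """Average number of internal feature couplings per method.

    feature_uses maps each method name to the set of features of the
    SAME class (fields, methods, constants) that the method uses.
    Assumption: each distinct method-to-feature use counts once.
    """
    if not feature_uses:
        raise ValueError("class cohesion is undefined for a class with no methods")
    couplings = sum(len(used) for used in feature_uses.values())
    return couplings / len(feature_uses)


# Hypothetical Account class: 4 methods, 7 internal couplings in total.
account = {
    "deposit": {"balance"},
    "withdraw": {"balance", "overdraft_limit"},
    "can_withdraw": {"balance", "overdraft_limit"},
    "statement": {"balance", "can_withdraw"},
}

print(class_cohesion(account))  # 7 couplings / 4 methods = 1.75
```

A class whose methods never touch its own fields or each other would score 0 here, which is the case Lack of Cohesion of Methods is trying to flag from the other direction.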

Scale that up to the package level, where relational cohesion is a widely-used metric, and we get some consistency.

Package Cohesion = Number of Class Couplings In Package / Number Of Classes

(Where classes > 0)

Package coupling is just the number of class couplings that cross package boundaries.
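
Both package-level numbers can be read off the same list of class couplings. The sketch below is one way to do it in Python. The names package_metrics, class_package and couplings are hypothetical, and I'm assuming that package coupling for a given package counts couplings crossing its boundary in either direction.

```python
def package_metrics(package, class_package, couplings):
    """Return (package cohesion, package coupling) for one package.

    class_package maps class name -> package name.
    couplings is a list of (source_class, target_class) dependencies.
    Assumption: a coupling "crosses the boundary" if exactly one end
    of it lives in the package, whichever way it points.
    """
    classes = [c for c, p in class_package.items() if p == package]
    if not classes:
        raise ValueError("package cohesion is undefined for an empty package")
    internal = 0
    crossing = 0
    for source, target in couplings:
        source_inside = class_package[source] == package
        target_inside = class_package[target] == package
        if source_inside and target_inside:
            internal += 1
        elif source_inside or target_inside:
            crossing += 1
    return internal / len(classes), crossing


# Hypothetical system with three packages.
class_package = {
    "Order": "sales", "OrderLine": "sales", "Invoice": "sales",
    "Customer": "crm", "Logger": "util",
}
couplings = [
    ("Order", "OrderLine"), ("Invoice", "Order"),   # inside sales
    ("Order", "Customer"), ("Invoice", "Logger"),   # cross the sales boundary
    ("Customer", "Logger"),                         # nothing to do with sales
]

cohesion, coupling = package_metrics("sales", class_package, couplings)
print(cohesion, coupling)
```

For the sales package that gives 2 internal couplings over 3 classes, and 2 couplings crossing the boundary.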

So we can treat coupling and cohesion at class and package level pretty much the same, and the same principles - and similar metrics - apply at both levels.

Martin Metrics (named after Uncle Bob, of course) have been routinely applied at the package level, but I also think they can apply at the class level.

Package Instability is the ratio of a package's outgoing dependencies (on other packages it depends on) to the sum of its outgoing and incoming dependencies (from packages that depend on it). The more a package depends on other packages, the more "instable" it's said to be (because it's more likely to be affected by changes to those packages it depends on). The more a package is depended upon, the more "stable" it's said to be, because the impact of changing it is likely to spread further and cost more, and also because changes in other packages are less likely to affect it.

We favour depending on more stable packages because they're less likely to change. Simples.

Classes can have dependencies on other classes, and can have other classes depend on them. So instability can be applied at this level, too. Classes should depend on classes that are more stable than they are.
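
Because the ratio only needs a list of "who depends on whom", the same function works whether the names are classes or packages. Here's a rough Python sketch. The function names and the example dependency graph are invented for illustration, and treating an element with no dependencies at all as having zero instability is my assumption, since the ratio is undefined in that case.

```python
def instability(node, dependencies):
    """Outgoing / (outgoing + incoming) for a class or a package.

    dependencies is a list of (source, target) pairs.
    """
    outgoing = sum(1 for s, t in dependencies if s == node and t != node)
    incoming = sum(1 for s, t in dependencies if t == node and s != node)
    total = outgoing + incoming
    if total == 0:
        return 0.0  # assumption: an isolated element counts as stable
    return outgoing / total


def unstable_dependencies(dependencies):
    """Dependencies whose target is less stable than their source."""
    return [
        (s, t) for s, t in dependencies
        if instability(t, dependencies) > instability(s, dependencies)
    ]


# Hypothetical dependency graph; the names could be classes or packages.
deps = [
    ("ui", "services"), ("ui", "model"),
    ("services", "model"), ("services", "logging"),
    ("model", "logging"),
    ("model", "ui"),  # points "uphill" at something less stable
]

for name in ("ui", "services", "model", "logging"):
    print(name, instability(name, deps))
print(unstable_dependencies(deps))
```

In this made-up graph only the dependency from model to ui gets flagged: model has an instability of 0.5, while ui, which it depends on, comes out at 2/3.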

Finally, if we wish to favour depending on abstractions (because they're easier to extend), we can apply the same metric at both class and package level.

In Martin Metrics, the Abstractness of a package is given as:

Package Abstractness = Abstract Classes (inc. interfaces) / Total Classes in Package
(Where total classes > 0)

Our design goal is to strike a balance of Instability (I) and Abstractness (A), such that A + I = 1. The more stable - or the less instable, depending on whether your glass is half-empty - a package is, the more abstract we need it to be. (Okay, not so simples.)
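
A quick way to see how far a package is from that balance is to measure how much A + I misses 1 by. Martin's own metrics call this the distance from the main sequence, |A + I - 1|. The Python below is just a sketch of the arithmetic. The function names and the example numbers are hypothetical, and instability is taken as an already-computed input.

```python
def package_abstractness(abstract_classes: int, total_classes: int) -> float:
    """Abstract classes (including interfaces) over total classes."""
    if total_classes <= 0:
        raise ValueError("abstractness is undefined for an empty package")
    return abstract_classes / total_classes


def imbalance(abstractness: float, instability: float) -> float:
    """How far A + I is from 1; 0.0 means perfectly balanced."""
    return abs(abstractness + instability - 1.0)


# Hypothetical package: 2 abstract classes out of 8, instability 0.75.
a = package_abstractness(2, 8)
print(a, imbalance(a, 0.75))  # 0.25 and 0.0: balanced

# Hypothetical package: all 6 classes concrete, yet heavily depended upon.
a = package_abstractness(0, 6)
print(a, imbalance(a, 0.0))   # 0.0 and 1.0: stable but not abstract
```

The second package is the one to worry about: everything depends on it, and none of it is open to extension.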

I can apply this at the class level, too. The abstractness of a class can be given by the ratio of abstract methods to the total number of methods in a class. An interface would be 100% abstract. An abstract class with 2 abstract methods and 2 method implementations would be 50% abstract. A concrete class would be 0% abstract.
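
In a language with reflection you can get this ratio straight from the class. The sketch below uses Python's abc and inspect modules. The function name class_abstractness and the three example classes are hypothetical, and counting only public methods (names not starting with an underscore) is my assumption about what should be included.

```python
import inspect
from abc import ABC, abstractmethod


def class_abstractness(cls) -> float:
    """Abstract methods over total methods, counting public methods only."""
    methods = [
        func for name, func in inspect.getmembers(cls, inspect.isfunction)
        if not name.startswith("_")
    ]
    if not methods:
        raise ValueError("abstractness is undefined for a class with no methods")
    abstract = sum(
        1 for func in methods if getattr(func, "__isabstractmethod__", False)
    )
    return abstract / len(methods)


class Repository(ABC):  # interface-like: every method is abstract
    @abstractmethod
    def find(self, key): ...

    @abstractmethod
    def save(self, item): ...


class Report(ABC):  # 2 abstract methods, 2 implementations
    @abstractmethod
    def title(self): ...

    @abstractmethod
    def body(self): ...

    def render(self):
        return f"{self.title()}\n{self.body()}"

    def word_count(self):
        return len(self.body().split())


class Greeter:  # plain concrete class
    def greet(self, name):
        return f"Hello, {name}"


print(class_abstractness(Repository))  # 1.0
print(class_abstractness(Report))      # 0.5
print(class_abstractness(Greeter))     # 0.0
```

Note that inherited methods are counted too, so a concrete subclass that implements everything scores 0 regardless of how abstract its parent is.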

We could then measure the balance of abstractness and instability at the class level, just as we can at the package level. We would hope that interfaces would have zero instability, and that concrete classes would have 100% instability, and that everywhere in between would strike the right balance.

So there you have it, my unified theory of dependency management with unified dependency metrics. I shall call them Gorman Metrics. Because I can.

I look forward to never seeing them in any metrics tools any time soon. (Yeah, I know. I suppose at some point I'll have to roll up my sleeves and write one. But you're very welcome to beat me to it.)
