January 28, 2009


Why Refactoring Is Like Snooker - UPDATED

One of the toughest things to master in software development is the compound refactoring.

Being able to visualise a well-defined route - a sequence of individual (and ideally automated) refactorings that will take us from the code as it is now to what we wish it to be - is a true art that takes years to develop the instincts for.

Joshua Kerievsky's book Refactoring to Patterns illustrates some common examples of compound refactorings, and this gives us some feel for the art. But it is not the art itself, any more than a book on how to draw famous cartoon characters is a book on how to draw cartoon characters.

One question that comes up is whether, from our existing set of refactorings, all routes are possible and all outcomes can be achieved using just those refactorings. I'm not convinced that we have the complete set of primitives from which to construct a "language of refactoring" that is general purpose enough for exclusive use. One finds, in practice, that little gaps appear that you must jump with a bit of hand-coded noodling, because the refactoring that could bridge the gap does not yet exist.

It's also a bit of a pain that some very well-known refactorings (like Move Method) have not been automated in many refactoring tools. Possibly this is because automating them is too hard? I don't know, but it again leaves perilous gaps that we just have to jump the hard way.

One wonders, given a language grammar that defines the set of all possible programs that could be written in that language, and a set of refactorings that can transform structures in that language, whether one could prove or disprove that it's possible to transform any specific program structure into any other specific program structure using sequences of those refactorings*.

I also wonder if, just maybe, I need to get out more...
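The reachability question can at least be played with in miniature. Here is a rough sketch, not a proof of anything: a "program" is modelled as nothing more than a set of (class, method) pairs, the two refactorings are simplified stand-ins I've made up for illustration (`move_method` and `extract_class`, with `NEW_CLASS_NAMES` as an arbitrary bound), and a breadth-first search looks for a route from one structure to another. The toy does not model behaviour at all, so it simply assumes every step is behaviour-preserving.

```python
from collections import deque
from itertools import product

# Hypothetical bound on class names, so the search space stays finite.
NEW_CLASS_NAMES = ("C",)


def move_method(program):
    """Yield (label, new_program) for each move of a method to another existing class."""
    classes = sorted({cls for cls, _ in program})
    for (cls, method), target in product(sorted(program), classes):
        if target != cls:
            moved = (program - {(cls, method)}) | {(target, method)}
            yield f"Move {method} from {cls} to {target}", moved


def extract_class(program):
    """Yield (label, new_program) for each move of a method into a brand new class."""
    used = {cls for cls, _ in program}
    for cls, method in sorted(program):
        for new_cls in NEW_CLASS_NAMES:
            if new_cls not in used:
                moved = (program - {(cls, method)}) | {(new_cls, method)}
                yield f"Extract {method} from {cls} into new class {new_cls}", moved


def find_route(start, goal, refactorings):
    """Breadth-first search for a shortest sequence of refactorings, or None."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        program, route = queue.popleft()
        if program == goal:
            return route
        for refactoring in refactorings:
            for label, nxt in refactoring(program):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, route + [label]))
    return None


start = frozenset({("A", "f"), ("A", "g")})
goal = frozenset({("A", "f"), ("C", "g")})

gap = find_route(start, goal, [move_method])                   # no route: a gap
route = find_route(start, goal, [move_method, extract_class])  # one-step route
```

With only `move_method` available there is no route to the goal, because nothing can create class `C`; that is the kind of gap you would have to jump by hand. Add `extract_class` to the set of primitives and a route appears.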

Anyway, back in the real world, compound refactorings can be a little like playing snooker or bar billiards. For many of us, potting one ball is a significant achievement. Potting a second ball directly after that first is a rarer event, and we can surmise that a player who does it routinely is a better player. But some players are so well-practiced that they can pot balls in long sequences. They simultaneously aim the cue ball to pot, and judge the force, angle and spin to have the cue ball end up somewhere that will make the next shot easier.

When we start refactoring, most of us feel good about ourselves if we can apply individual refactoring primitives like extracting methods or introducing method parameters and so on. We're not really thinking about the refactoring after, and certainly not 6 refactorings into a sequence towards some architectural end goal. That's advanced stuff, and - just like in snooker - it requires thousands of hours of practice to build those instincts.

But starting with individual refactorings will help clean up your code to some extent, and these are a great place to start. You will familiarise yourself with each refactoring primitive until they become trivial and second nature, and then you can start to tackle small sequences of 2-3 refactorings towards more ambitious goals - like turning a switch statement into the Strategy pattern. One day, years from now, you will be adept at taking code in pretty much any state and safely and cleanly refactoring it into any structure you desire. And then, my fine friend, you will be among a very elite group indeed, because - I'll level with you - I've yet to meet such a person.
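To make the switch-to-Strategy example a little more concrete, here is a minimal sketch of what such a short sequence might look like. It's in Python, so an if/elif chain stands in for the switch statement, and plain functions stand in for the strategy classes you'd write in Java or C#. The shipping-cost domain and every name in it are invented purely for illustration.

```python
# Before: one function that branches on a type code.
def shipping_cost_before(method, weight_kg):
    if method == "standard":
        return 5.0 + 1.0 * weight_kg
    elif method == "express":
        return 10.0 + 2.0 * weight_kg
    raise ValueError(f"unknown shipping method: {method}")


# Shot 1: Extract Method on each branch, so every case has a name of its own.
def standard_cost(weight_kg):
    return 5.0 + 1.0 * weight_kg


def express_cost(weight_kg):
    return 10.0 + 2.0 * weight_kg


# Shot 2: replace the conditional with a lookup of interchangeable strategies.
STRATEGIES = {
    "standard": standard_cost,
    "express": express_cost,
}


def shipping_cost_after(method, weight_kg):
    try:
        strategy = STRATEGIES[method]
    except KeyError:
        raise ValueError(f"unknown shipping method: {method}") from None
    return strategy(weight_kg)
```

The first shot is useful in its own right, but its real job is to leave the cue ball somewhere handy: once every branch is a named function with the same signature, swapping the conditional for a lookup is the easy pot.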

Hey. Maybe I'll be the first. Race you to it!

*UPDATE: nespera (Chris R) has tweeted me to remind me - quite rightly - that we would need a proof that a sequence of refactorings does/doesn't exist between two programs that exhibit the same external behaviour. Yes, indeed. Else it wouldn't be refactoring. Silly Jason!

Posted on January 28, 2009