April 6, 2013


Science & Software

Pairing with my apprentice-to-be, Will, on Friday, we got to chatting about the increasingly intimate relationship between software development and science.

I was reminded of something I heard at university (back in the days when we wrote our dissertations with quills). A PhD student who I hung out with had been working on a large-scale simulation of atoms in a crystal lattice to try and crack the problem of why washing powder clogs. His code was written in FORTRAN and was designed to run on the HP minicomputer - a monster of a computer, almost as powerful as my Android phone!

His research was building on work done by a previous PhD student, who had also written code to simulate the same crystals. His PhD was predicated on the assumption that he'd be able to take the existing code and adapt it for his research - in much the same way that an algorithm to traverse a tree and search for something could be adapted to traverse the same tree and search for something else.

Problem was that the existing code was impossible to understand, and the person who wrote it was long gone. My friend lost several months rewriting the simulation from scratch.

Now, this was more than two decades ago. Much physics was still done with pen and paper or in the lab. But, increasingly, more and more research was done almost entirely using software to either search through large amounts of data collected from experiments, or to simulate physical systems that would be too expensive - or even impossible - to recreate in the lab.

I'm told by friends who carried on with physics that software-based research is much more common these days. And they often share anecdotes about the trouble software causes them that sound jolly familiar. I wouldn't be surprised if a lot of research time and money is being lost to the kinds of problems we come up against daily in business when software's involved.

When I studied physics, they encouraged us to keep a diary of the lab work we did. The idea was that if, for example, we got run over by a bus, someone else could read our lab diary and continue our research. Hoorah - progress continues unabated.

The lab diary codifies your method. These days, I suspect, the method may - in at least some key cases - be codified as software. If you can't understand the software, you can't understand the method, and you can't continue the research.

Similarly, if the software is buggy, then your method is buggy, and your results and their conclusions are suspect. Cue faster-than-light neutrinos. The interactions of subatomic particles at CERN are interpreted by software. My first instinct on hearing the sensational news was "I'd like to read their code".

As science becomes more and more reliant on software, the integrity of our science will rely more and more on the integrity of our software. As yet, this is a fringe topic in physics. Universities may teach computer programming and computational maths, but they don't really help or encourage students to write software to a high-enough standard.

I can't help feeling that some element of the discipline of writing good software would benefit science students. But a science degree is already a big ask in terms of time commitment. Throwing in a day a week of "software craftsmanship" or "software engineering" may be the straw that breaks the camel's back.

I do think, though, that the model of apprenticeship I'm proposing to trial with Will (and A.N.Other, if I can find the right person) could present a solution.

This is something I'm going to give more thought to.
