Computational science: eliminating the errors

Posted by n.chuehong on 14 October 2010 - 7:27pm

There's an interesting feature over at Nature by Zeeya Merali called "Computational science: ...Error - why scientific programming does not compute" (disclaimer: I was one of the people interviewed for the piece). In it, Merali considers the issues surrounding computational software written in a scientific context, particularly in light of the problems revealed by the leak of emails from the Climatic Research Unit at the University of East Anglia last year.

As the article notes, a lot of scientific software has grown ever more complex, and there is a steep learning curve for new researchers joining a collaboration. Additionally, software is often used to produce or analyse data published in papers, without readers being able to verify the results. This is a big issue because research is built on the principle that scientific discoveries should be both reproducible and repeatable. Without this, we cannot build on others' achievements.

However, where I think the article gets the perspective slightly wrong is in suggesting that this complexity and fragility are the exclusive preserve of scientific software; there is plenty of commercial and public-sector software which falls into the same traps. Any piece of software which has evolved organically over time, through a succession of developers, can suffer from these issues.

Likewise, simply applying techniques from industry won't solve the problem, although some of them will help. There's certainly scope for many of the benefits promised by software engineering best practice to be realised, and many scientific software projects have done so: we'll be publishing examples of these on our website.

So what makes things different in the research software community? Why does the issue of complex, difficult-to-maintain code appear to afflict us more? This is something which the Software Sustainability Institute has been considering in partnership with other organisations, but which might have at its heart a fundamental difference in approach...

One of the big differences in our field is that there are many more stakeholders, including researchers, collaborators, institutions, the peer review system, and funders. Each has a different perspective on what they want the software to achieve, and on what they think is an appropriate level of effort to deliver it.

Moreover, scientific research is geared towards discovery, not maintenance, and so it is difficult to get funding to simply refactor your software. Worse still, where there is a possibility of breaking new ground in software algorithms or modelling, it is rarely the case that the theoretical breakthroughs are happening at the same time as the applications delivering the science are being built. This "interest gap" means that it is often more difficult to get the right mix of domain scientist, computer scientist and developer in your team - someone must compromise their interests to remain involved.

Nevertheless, to actually effect change in scientific software, I think that it all comes down to three important points:

1. We mustn't be afraid to share our code, no matter how messy it is.

This is such a fundamental thing that we could do better as researchers, but don't. In fact, we often fail to do this for non-software things as well: we'd rather share our toothbrush than our workflows and data. Partly it's a fear that others will sneer at our efforts; partly it's a fear that others will achieve greater things with our work than we can. Well, if you can't exploit something you wrote better than someone else can, maybe you do need to start collaborating with others!

Matt Might has an interesting take on this through his CRAPL license. Let's just get over this.

2. We mustn't be afraid to cite other people's code when we use it.

Not only does this promote reuse, which in turn is likely to produce more robust code, but it also changes the mindset of treating software as an inferior part of the research process. I hope that the DataCite initiative can be broadened to encourage DOIs for code in the future.

3. We must be encouraged to see software as the capital investment, not hardware.

Given the relative refresh cycles of software versus hardware, and the resurgence of networked, distributed computing resources (in the guise of cloud computing), it's interesting that hardware is now the consumable and software is finally seen as the long-term investment: something which Dan Atkins has been saying for some time.

Working together to create open, extensible software platforms for research is the equivalent of international consortia coming together to create facilities like the Large Hadron Collider. Indeed there are already good examples of this, such as the International Virtual Observatory Alliance.

There have been great strides in recent years towards open data. If we make the same progress with open software and open standards, we will have a chance of getting to the goal of robust, sustainable, scientific software: open knowledge.

And knowledge shared is a beautiful thing.