Why is reproducibility important?

Posted by s.hettrick on 8 July 2011 - 10:57am

The second discussion session at the Effective Scientific Programming workshop in Newcastle, chaired by James Perry, focused on reproducible results and related issues. We discussed why reproducibility is important, what obstacles get in the way of achieving it, and noted some real-world examples where reproducibility came into play.

Outcomes

Perhaps the most obvious reason reproducibility matters is correctness. If you run your code twice and get different results each time, how do you know which (if either) is correct? Reproducibility also enables others to re-run your code and verify that they get the same results as you, which is important for transparency. And perhaps less obviously, it allows us to regenerate exactly any data that has been lost, and to generate more data that matches the results we already have.

There are many obstacles to achieving reproducibility, some obvious and some not so obvious. Clearly you need to run your code with the same input data and the same settings if you expect to get the same output. But are you sure you're even running the same code? Changes to the source can be hard to track (a version-control system can help). Build tools and settings can make a difference too. For example, changes to compilers and optimisation levels can alter the order of operations and subtly change the results of floating-point computations, and a new version of a library may behave differently from the old one. Even the hardware itself can play a part (whether due to bugs or just architectural differences, such as different levels of precision).
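To illustrate the point about the order of operations, here is a minimal Python sketch (not something shown at the workshop). It adds the same three numbers in two different orders, much as a compiler or optimiser might rearrange a sum. The function name sum_in_order is hypothetical and chosen only for this example.

```python
import math


def sum_in_order(values):
    """Add the values strictly left to right, one at a time."""
    total = 0.0
    for value in values:
        total += value
    return total


# The same three numbers, added in two different orders.
forward = sum_in_order([0.1, 0.2, 0.3])
backward = sum_in_order([0.3, 0.2, 0.1])

print(forward)   # 0.6000000000000001
print(backward)  # 0.6

# The two results differ in the last bit, although they agree
# to within any tolerance that matters for most applications.
print(forward == backward)
print(math.isclose(forward, backward, rel_tol=1e-12))
```

Floating-point addition is not associative, so both answers are legitimate roundings of the same sum. This is one reason two builds of the same source can disagree in the final digits.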

It's often enough to be aware of these issues, as bit-for-bit reproducibility is not always practical and is not necessary for many applications. Even so, we can get closer to this ideal by gathering and recording as much information as possible about the software and hardware configuration that generated our results. This sort of information can be extremely valuable if, for example, a certain version of the code or a specific machine turns out to have a bug, as it allows quick identification of the data that may have been affected. When porting code to a new machine, differences in the hardware may not be under our control, but we can run a small problem on the new machine and compare the output with results from the old machine to check that there are no surprises.
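As a rough sketch of both ideas, the Python below records some basic facts about the environment alongside a run, and compares two sets of results within a tolerance rather than bit for bit. It uses only the standard library. The function names (capture_environment, results_match), the settings shown and the tolerance are all assumptions made for illustration, and a real project would probably want to record more, such as a version-control revision and library versions.

```python
import json
import math
import platform
import sys
from datetime import datetime, timezone


def capture_environment(settings):
    """Collect basic facts about the software and hardware behind a run."""
    return {
        "timestamp_utc": datetime.now(timezone.utc).isoformat(),
        "python_version": sys.version,
        "implementation": platform.python_implementation(),
        "operating_system": platform.platform(),
        "machine": platform.machine(),
        "settings": settings,
    }


def results_match(old, new, rel_tol=1e-9):
    """Compare two lists of numbers within a relative tolerance."""
    return len(old) == len(new) and all(
        math.isclose(a, b, rel_tol=rel_tol) for a, b in zip(old, new)
    )


# Hypothetical settings for a small test problem.
record = capture_environment({"grid_size": 64, "seed": 12345})
print(json.dumps(record, indent=2))

# Hypothetical outputs of the same small problem on two machines.
old_machine = [0.6, 1.25, 3.0]
new_machine = [0.6000000000000001, 1.25, 3.0]
print(results_match(old_machine, new_machine))
```

Saving the record next to each output file means that, if a particular machine or version is later found to be faulty, the affected results can be picked out by searching the records.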

Reproducibility in scientific code is a goal that is easy to overlook, and achieving it can be more difficult than it appears. We hope that discussion at events like the Effective Scientific Programming workshop will encourage researchers to give more thought to the issues surrounding reproducibility in their future work.