Automated testing to boost recipy’s confidence

Posted by s.aragon on 15 August 2016 - 2:18pm

By Mike Jackson, Software Architect

A major challenge to reproducibility in computational science is the effort required to keep track of provenance, which is what makes research that relies upon code reproducible. recipy provides an almost effortless way to track provenance in Python. I am working with recipy’s developers, Software Sustainability Institute fellow Robin Wilson and Janneke van der Zwaan of Geography and Environment at the University of Southampton, to develop an automated test suite for recipy as a precursor to expanding the development of recipy and promoting recipy more widely.

recipy is an open source Python package hosted on GitHub and released under the Apache License Version 2.0. It is available via this repository or as a Python package that can be installed via Python’s pip package manager. Once a researcher has installed recipy, all they have to do is add “import recipy” at the top of their Python scripts, and all of their inputs, outputs and code will be logged in an easily searchable database. The researcher can then go back to the database in the future and ask “How did I produce graph.png?”, and recipy will tell them exactly what code they ran, what versions of what Python libraries they used, and what input files were used to produce what output files. recipy can be used by any research community in any field which uses computational tools written in Python as part of their research.
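To make the “How did I produce graph.png?” question concrete, here is a minimal sketch of that kind of lookup. It does not use recipy’s own database or query interface: the record fields (script, inputs, outputs, libraries), the sample values and the function name find_runs_producing are all hypothetical, chosen only to illustrate searching provenance records by output file.

```python
# Hypothetical provenance records; field names and values are illustrative only,
# not recipy's actual schema.
records = [
    {
        "script": "make_graph.py",
        "inputs": ["data.csv"],
        "outputs": ["graph.png"],
        "libraries": ["examplelib v1.0"],
    },
    {
        "script": "clean_data.py",
        "inputs": ["raw.csv"],
        "outputs": ["data.csv"],
        "libraries": ["examplelib v1.0"],
    },
]


def find_runs_producing(records, filename):
    """Return every record whose list of outputs contains filename."""
    return [record for record in records if filename in record["outputs"]]


for run in find_runs_producing(records, "graph.png"):
    print(run["script"], "read", run["inputs"], "using", run["libraries"])
```

The same lookup on "data.csv" would point at clean_data.py, which is how a chain of scripts can be traced back from a final figure to the raw input.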

recipy provides wrappers for Python packages which are then invoked when Python loads these packages. This allows information about the package versions, as well as calls to their input and output functions, to be logged. At present, support is provided for a range of scientific Python packages including SciPy, Pillow, scikit-learn, scikit-image, NiBabel and the Geospatial Data Abstraction Library. Wrappers for any package can be added. Provenance information is stored as JSON documents within TinyDB, a pure Python database.
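The wrapping idea can be sketched in a few lines of plain Python. This is not recipy’s implementation: for simplicity it patches a module after it has been created rather than hooking into Python’s import machinery, it keeps records in a list rather than in TinyDB, and every name in it (PROVENANCE_LOG, log_calls, patch_module, fakelib) is hypothetical.

```python
import functools
import json
import sys
import types
from datetime import datetime, timezone

# In-memory list standing in for the provenance database (hypothetical).
PROVENANCE_LOG = []


def log_calls(func, name, kind, module_name, module_version):
    """Return a wrapper that records each call to func, then delegates to it."""

    @functools.wraps(func)
    def wrapper(path, *args, **kwargs):
        PROVENANCE_LOG.append({
            "script": sys.argv[0],
            "module": module_name,
            "version": module_version,
            "function": name,
            "kind": kind,  # "input" or "output"
            "path": str(path),
            "date": datetime.now(timezone.utc).isoformat(),
        })
        return func(path, *args, **kwargs)

    return wrapper


def patch_module(module, inputs=(), outputs=()):
    """Replace the named I/O functions on module with logging wrappers."""
    version = getattr(module, "__version__", "unknown")
    for kind, names in (("input", inputs), ("output", outputs)):
        for name in names:
            original = getattr(module, name)
            wrapped = log_calls(original, name, kind, module.__name__, version)
            setattr(module, name, wrapped)


# A stand-in "scientific package" with one reader and one writer (hypothetical).
fakelib = types.ModuleType("fakelib")
fakelib.__version__ = "1.2.3"
fakelib.load = lambda path: f"data from {path}"
fakelib.save = lambda path, data: len(data)

patch_module(fakelib, inputs=["load"], outputs=["save"])

data = fakelib.load("input.csv")
fakelib.save("graph.png", data)

# Each record is a plain dict, so it serialises directly to a JSON document.
print(json.dumps(PROVENANCE_LOG, indent=2))
```

Because the wrappers delegate to the original functions unchanged, the calling script behaves exactly as before; the only difference is that each read and write leaves a record behind.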

recipy was first developed at the Institute’s Collaborations Workshop 2015 Hack Day by Robin, Janneke and Raquel Alegre, where it won the Hack Day Prize. Reception to recipy has been very positive: a talk given by Robin on recipy at EuroSciPy 2015 won the Best Presentation prize; a recipy entry on the Python subreddit was top-voted; and recipy’s GitHub repository currently has 16 forks and 243 stars. Robin gave a demonstration of the current version of recipy at the Institute’s Collaborations Workshop 2016.

Robin and Janneke wish to capitalise on this interest and engage with researchers, to increase the number of researchers who use recipy, to increase recipy's “bus factor” (the number of people who understand how recipy works and how to develop it) and to allow the development to be driven by the research community, and undertaken by research software engineers, regardless of location.

recipy’s GitHub repository is starting to get significant numbers of pull requests with feature additions and/or bug fixes. Robin and Janneke also have plans for new features. In addition, they launched a survey to help determine how development of recipy should proceed. However, Robin and Janneke are wary about implementing new features, or merging pull requests, as they are concerned these may break recipy’s existing functionality.

Our collaboration seeks to overcome this barrier to development and more widespread promotion by developing an automated test framework for recipy, which can be run locally or using a continuous integration server. Since recipy is multi-platform, this automated test suite will run under the same environments that recipy does. We will be using both Travis CI and AppVeyor hosted continuous integration services to run the test framework under both Linux and Windows operating systems.
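As a rough illustration of the kind of check such a suite might contain, the sketch below uses Python’s standard unittest module to confirm that writing a file leaves a matching provenance record. It is an assumption about the shape of a test, not part of the planned framework: save_with_provenance is a hypothetical stand-in for a tracked output function, and the JSON-lines log file stands in for recipy’s database.

```python
import json
import os
import tempfile
import unittest


def save_with_provenance(path, text, log_path):
    """Hypothetical tracked output function: write a file and log that it was written."""
    with open(path, "w") as f:
        f.write(text)
    record = {"output": os.path.abspath(path)}
    with open(log_path, "a") as log:
        log.write(json.dumps(record) + "\n")


class ProvenanceLoggingTest(unittest.TestCase):
    def test_output_file_is_logged(self):
        # A temporary directory keeps the test independent of the platform
        # and of any files left over from earlier runs.
        with tempfile.TemporaryDirectory() as tmp:
            out = os.path.join(tmp, "result.txt")
            log_path = os.path.join(tmp, "provenance.jsonl")

            save_with_provenance(out, "42", log_path)

            with open(log_path) as log:
                records = [json.loads(line) for line in log]
            self.assertEqual(len(records), 1)
            self.assertEqual(records[0]["output"], os.path.abspath(out))


def run_suite():
    """Load and run the test case, returning the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ProvenanceLoggingTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

A test written this way uses only the standard library and temporary paths, so the same file can be run locally and by a continuous integration service on Linux or Windows.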

Providing an automated test framework can reduce the risk, for Robin, Janneke and other developers, that new features or enhancements introduce bugs into recipy which remain undetected until recipy is used. This would, in turn, give Robin and Janneke the confidence to promote recipy more widely.

recipy should be as flexible and broadly applicable as possible, so it should be easy to use, extend and customise; e.g. to provide a rich yet easy-to-use command-line interface, to provide a graphical user interface, to support recording provenance from new Python modules, or to support different databases for recording provenance. We will review recipy’s implementation and make suggestions as to how its usability, modularity and extensibility can be improved.

recipy has the potential to significantly benefit the research community in any field which uses computational tools written in Python to enable research. This is a very broad community, encompassing fields as diverse as history, neurology, geography and engineering, all of which could benefit from the automatic provenance tracking provided by recipy. This collaboration contributes to evolving recipy into a more robust tool in which researchers can have confidence. I look forward to reporting on progress.