Overcoming Entry Barriers to Motivate Better Practice in Research Software Engineering

Posted by s.aragon on 14 December 2017 - 10:20am

By James Grant, University of Bath, Andrew Washbrook, University of Edinburgh, Louise Brown, University of Nottingham, Niels Drost, Netherlands eScience Center, and Andrew Bennett, European Centre for Medium-Range Weather Forecasts

What can be termed "coding" is only a subset of wider software engineering practice, which also includes version control, continuous integration and good software design. Coding is prevalent in academia, but the practices that allow sustainable software to be produced are frequently overlooked. Engaging with researchers who develop software, to motivate the uptake of these approaches, methods and tools and to highlight the benefits they deliver, is the first step in spreading best practice in our community.

In discussions with researchers, we find that the use of version control is often highlighted as the first methodology that they would like to introduce into their workflow. We would therefore like to 1) identify approaches that can promote the use of version control by reducing barriers from textbook to full integration and 2) highlight the wider benefits of the methods beyond traditional software development.

Software-related courses at undergraduate level tend to focus on code syntax and functionality, with limited time spent on software management practices. By including version control in these training programmes, we can avoid much of the bad practice and inefficiency that arises when software management practices are absent. If done well, this could expose students to wider concepts of software engineering and make it more likely that they adopt good practices in their future projects. We will come back to a silver lining for lecturers reluctant to include new material in their courses.

Online repositories such as GitHub, GitLab and Bitbucket are reducing the entry barriers to the use of version control, while the collaboration benefits of services such as Overleaf can demonstrate efficiencies for writing papers and grant applications. Increasingly, version control can also be used to manage data repositories and simulation or analysis workflows. If these methods can be introduced across academia to support research more widely then, by extension, software developers are more likely to use them for their programming.

Other software engineering principles such as regression testing and continuous integration (CI) should be encouraged. Supporting researchers to trial co-coding, pair programming or code review, or simply offering mentoring, creates the opportunity to discuss approaches to documentation and code structure, to introduce testing and validation, and to identify specific training needs. It is important that researchers feel that the feedback on their software is constructive, so this needs to be done carefully and in a way that promotes personal skills improvement.

A further application of testing and CI is in grading. If students use version control to upload their code to a repository, continuous integration can then be used to verify that the code meets the objectives whilst also evaluating the coding style. This approach to courses will teach good practice, improving the quality and sustainability of the software that students develop in their future careers. The silver lining for lecturers is that, after the initial effort of setting this up (with members of their group!), large parts of marking can be automated, while the skills of the researchers involved are developed further and the power of the methods is demonstrated.
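As a rough illustration of the grading idea above, the following Python sketch shows the kind of script a CI job could run each time a student pushes to their repository. Everything in it is an assumption made for the example rather than a description of any existing course: the function names (`functional_score`, `style_issues`, `grade`), the 79-character line limit, the tab check and the 20% weight given to style are all hypothetical, and a real course would more likely rely on an established test framework and linter.

```python
"""Minimal sketch of an automated grading step for a CI job.

All names, rules and weights here are hypothetical, chosen for illustration.
"""
from typing import Callable, List, Tuple

MAX_LINE_LENGTH = 79  # assumed style rule for this example


def functional_score(func: Callable[[int], int],
                     cases: List[Tuple[int, int]]) -> float:
    """Return the fraction of (input, expected) cases the submission passes."""
    passed = 0
    for argument, expected in cases:
        try:
            if func(argument) == expected:
                passed += 1
        except Exception:
            pass  # a crash simply counts as a failed case
    return passed / len(cases)


def style_issues(source: str) -> List[str]:
    """Report over-long lines and tab characters in the submitted source."""
    issues = []
    for number, line in enumerate(source.splitlines(), start=1):
        if len(line) > MAX_LINE_LENGTH:
            issues.append(f"line {number}: longer than {MAX_LINE_LENGTH} characters")
        if "\t" in line:
            issues.append(f"line {number}: contains a tab character")
    return issues


def grade(func: Callable[[int], int], cases: List[Tuple[int, int]],
          source: str, style_weight: float = 0.2) -> float:
    """Combine the functional and style results into one mark between 0 and 1."""
    style_score = 1.0 if not style_issues(source) else 0.0
    mark = (1 - style_weight) * functional_score(func, cases) + style_weight * style_score
    return round(mark, 2)


if __name__ == "__main__":
    # A hypothetical exercise: students must implement square(n).
    source = "def square(n):\n    return n * n\n"

    def square(n):
        return n * n

    print(grade(square, [(2, 4), (3, 9)], source))
```

In a setup like this the CI service would run the script on every push and report the mark, or simply a pass or fail, back to the student, so feedback arrives immediately and the lecturer only needs to review edge cases by hand.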

Advocating the benefits of best practice in software development can bring process improvements across academia beyond research software, and help to bridge gaps between disciplines. Research software management will soon be a requirement of grant applications and reporting, so good practice will need to be embedded as software begins to gain recognition as a research output. Robust software practices also provide a clear record of individual contributions for Research Software Engineers (RSEs) and postdoctoral staff, which enables skills to be more easily demonstrated as part of their career progression.

For the community, open, sustainable, high-quality software that outlasts individual projects makes better use of resources. Open sharing of reproducible, validated software reduces duplication of effort and increases the time that can be focused on research. Ultimately, helping to create an environment that produces more high-quality, reproducible research is our underlying motivation.