Research software

By Simon Hettrick, Deputy Director.

This is a story about reproducibility. It’s about the first study I conducted at the Institute, the difficulties I’ve faced in reproducing analysis that was originally conducted in Excel, and it’s testament to the power of a tweet that’s haunted me for three years.

The good news is that the results from my original analysis still stand, although I found a mistake in a subdivision of a question when conducting the new analysis. This miscalculation was caused by a missing “a” in a sheet containing 3763 cells. This is as good a reason as any for moving to a more reproducible platform.

2014: a survey odyssey

Back in 2014, I surveyed a group of UK universities to see how their researchers used software. We believed that an inexorable link existed between software and research, but we had yet to prove it. I designed the study, but I never intended to perform the analysis. This was a job better suited to someone who could write code, and I could not. Unfortunately, things didn’t go to plan and I found myself in the disquieting situation of having an imminent deadline and no one available to do the coding. Under these circumstances, few people have the fortitude to take some time out to learn how to…


[Image: Code/Theory workshop]

By Caroline Jay, University of Manchester, and Robert Haines, University of Manchester.

A group of research software engineers (RSEs) recently gathered in Manchester to explore the challenges of translating between scientific narrative and software. The full report from the Code/Theory Workshop is available in Research Ideas and Outcomes; here, we summarise the outcomes of the afternoon. Software engineers are sometimes seen as peripheral to the academic enterprise, providing the tools to do research, rather than actively contributing to the research itself. The overwhelming conclusion of the workshop was that, in reality, software engineers play a central role in the research process, and it is vital to get this message across.

Why is code/theory translation challenging?

Participants started by identifying the challenges of translating between code and theory. A key theme that emerged was the difficulty of designing research software. As scientific theory is continually changing, how do you design a plan?

All participants faced the challenge of getting to grips with new and diverse domains. In some…


We have funding available at EPCC for PhD places to study different aspects of research software, related to the work of the Software Sustainability Institute.

To be eligible for funding covering both fees and stipend, students must be UK nationals, or EU nationals who have been resident in the UK for at least 3 years before commencing the studentship. 

The deadline for applications for funded places is 15th May 2017. Prospective students should contact Neil Chue Hong (n.chuehong @ …), who will help them develop a short (2-3 page) research proposal, which they must submit as part of their application. Please note that Neil is on leave between 21st April and 7th May, so responses will be slightly delayed.


[Image: Open access academia]

By Jon Hill, University of York, and Software Sustainability Institute Fellow.

A controversial title, but one I hope to explain! When running a couple of workshops late last year, I spoke at length on a number of aspects of open science. These included software sustainability, data and software licensing, collaboration and manuscript writing. I was inspired by this fantastic paper posted on arXiv by Greg Wilson et al. I will caveat this text with the fact that I am not a lawyer, and none of the text below should be taken as legal advice.

After running these two workshops (“Tools for Constructing the Tree of Life” and “Good enough practice in Computational Geography”) and speaking to the attendees, I realised there is a disturbingly large gulf between those involved in the open science movement and the rest of academia. Many participants knew the words 'open access' and 'open source', but conflated the ideas and didn't link any licences to these terms. There was also a lot of confusion about which licences to use and which were appropriate, as well as about the concept of copyright. Unfortunately, academics have to rely on the lawyers…


[Image: Researchfish]

By Simon Hettrick, Deputy Director.

Researchfish® allows researchers to record the impact of their research beyond the standard metric of how many papers they have written. These outcomes, as they are called, cover publications, but also collaborations, events, awards and other metrics including - and of most interest to me - software.

Researchfish® was established with the support of MRC and initially focused on collecting outcomes from medical research. It has since been adopted by a broad range of funders, including the UK’s seven Research Councils. I recently had an interesting talk with the EPSRC’s Louise Tillman about what these outcomes might say about research software in the UK and, thanks to her, a week later I found myself in possession of a spreadsheet containing the research outcomes related to software for EPSRC researchers.

Just having the outcomes is pretty exciting, but to make things more interesting, I decided that I would write the analysis code myself. I’m not a software developer, but it’s getting progressively more difficult to stay that way when I spend my life surrounded by Research Software Engineers. Hence this post not only reports an investigation into Researchfish…


[Image: A word cloud of the software used in research]

By Simon Hettrick, Deputy Director.

Over the last couple of years, we’ve had occasion to ask people about the software they use in their research. We’re about to start a long-running survey to collect this information properly, but I thought it might be fun to take a rough look at the data we’ve collected from a few different surveys.

It would be easy to survey people if there existed a super-list of all possible research software from which people could choose. But no such list exists. This raises a question: how many different types of software do we expect to see in research? Hundreds, thousands, more? The lack of this list is rather annoying, because it means we have to collect free-form text rather than ask people to choose from a drop-down list. Free-form text is the bane of anyone who collects survey data, because it takes so much effort to clean. It is truly amazing how many different ways people can find to say the same thing!
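To give a flavour of why free-form answers need cleaning, here is a minimal Python sketch of the kind of normalisation involved. It is only an illustration: the alias table, the function names and the sample responses are all hypothetical, and it is not the process used on the survey data.

```python
from collections import Counter

# Hypothetical alias table: maps tidied-up variants to one canonical name.
# A real table would be built up by inspecting the actual responses.
ALIASES = {
    "ms excel": "excel",
    "microsoft excel": "excel",
    "r studio": "rstudio",
}


def normalise(response: str) -> str:
    """Lower-case, trim and collapse whitespace, then apply the alias table."""
    cleaned = " ".join(response.lower().split())
    return ALIASES.get(cleaned, cleaned)


def count_software(responses):
    """Count canonical software names, skipping blank responses."""
    return Counter(normalise(r) for r in responses if r.strip())


# Made-up sample responses, purely for illustration.
responses = ["Excel", " MS Excel", "microsoft  excel", "RStudio", "R Studio", ""]
counts = count_software(responses)
print(counts.most_common())  # [('excel', 3), ('rstudio', 2)]
```

Five non-blank answers collapse to two pieces of software here. Simple rules like these only catch the easy cases, which is why misspellings and abbreviations still tend to need manual checking.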

I gathered five of our surveys from 2014 to 2016, which together cover 1261 survey participants. From these, we collected 2958 different responses to the question “What software do you use in your research?”, but after a few hours of fairly laborious data cleaning (using OpenRefine to make things easier) these were boiled…


Software management plan guide and service

By Mike Jackson, Software Architect.

Software management plans set down goals and processes that ensure software is accessible and reusable throughout a project and beyond. To complement our guide on Writing and using a software management plan, we have now developed a prototype software management plan service, powered by the Digital Curation Centre's data management plan service, DMPonline.

It is easy to concentrate on the short-term issues when developing scientific software. Deadlines for publications, collaboration with others and the demands of a daily routine all conspire to prevent proper planning. A software management plan can help to formalise a set of structures and goals that ensure research software is accessible and reusable in the short, medium and long term. It also helps researchers to consider whether third-party software to be used within a research project will be available, and supported, for the lifetime of the project. It can also give funders confidence that software they have funded will survive beyond the funding period, and that there is something to show for their investment.

In 2012 we wrote a guide on Writing…


A collection of papers on challenges related to the development, deployment, and maintenance of reusable scientific software has been published by the Journal of Open Research Software. This also launches a new section of the journal, Issues in Research Software, dedicated to publishing peer-reviewed papers that cover different aspects of creating, maintaining and evaluating open source research software. The aim of the section is to promote the dissemination of best practice and experience related to the development and maintenance of reusable, sustainable research software.

The first collection contains invited papers from the Working towards Sustainable Software for Science: Practice and Experiences workshop held in conjunction with SuperComputing 2013 at the Colorado Convention Center in November 2013.

The event was a forum for the discussion of the challenges related to the development, deployment, and maintenance of reusable software (see the original call for papers here). Papers include Experience Reports, Position Papers and Survey Papers, on such challenges as:

  • the development process that leads to new software
  • the support, maintenance and usage of software
  • the role of communities or industry
  • policy issues relating to developing sustainable software
  • education and training



Some people have it easy when it comes to explaining what they do for a living to their kids: just ask a plumber or a vet. Alas, this is not the case for many of us in the research software community, or for Scott Henwood, a lead software architect at the Canadian Network for the Advancement of Research, Industry and Education (CANARIE).

After being challenged by his teenagers to explain his work, Scott put together an Introduction to Research Software Platforms video, which draws parallels between a well-known software service, Flickr, and the work of researchers. We think that it’s more than just teenagers who could benefit from seeing this video!

View the video on YouTube.

By Simon Hettrick, Deputy Director.

In an earlier post, I discussed our plans for investigating the number of researchers who rely on software. We’ve spent the last month looking at the feasibility of some of our ideas. In this post, I’ll present our findings about one of these approaches and some of the problems that we’ve encountered. If you’ve ever wondered what happens when a clueless physicist starts to dabble in social science, then this is the post for you.

First of all, a quick recap. Anecdotally, it seems that the number of researchers who rely on software for their research is – pretty much – everyone. There are few researchers who don’t use software in some way, even when we discount things like word processing and other non-result-generating software. But without a study, we lack the evidence to make this case convincingly. And it’s not just about the size of the community, it’s also about demographics. Seemingly simple questions are unanswerable without knowing the make-up of the research software community. How much EPSRC funding is spent on researchers who rely on software? Is that greater, proportionally speaking, than the AHRC?

We had a few ideas about how to determine the size of the research software community. The one that has progressed furthest is the conceptually most straightforward idea: simply ask the research community. This is…
