Software and research: the Institute's Blog

Latest version published on 11 April, 2018.

By Simon Hettrick, Deputy Director.

When I first started thinking about how we could create a career path for Research Software Engineers (RSEs) in academia, I assumed we would have to persuade university management to change their policies and make it possible, or at least much easier, for researchers to retain RSEs within their groups. The actual solution has been somewhat different, and much more effective.

Pioneers at a growing number of universities have seized the initiative and set up their own RSE group. These groups employ a number of RSEs and then hire them out to researchers at their home organisation. It’s a win-win for researchers: they gain access to the skills they need and—unlike hiring new personnel—they only pay when they need those skills. By servicing an entire university, RSE groups tap into enough demand to allow a number of RSEs to be consistently employed.

When RSE groups are first launched they tend to hire generalists, but as they grow they can hire more specialists, which makes skills available that researchers could only dream of accessing without such a group. As they grow, RSE groups need senior staff who can run larger projects and oversee the work of others, and this creates the RSE career path that has been so sorely needed.

In other words, we’re winning the fight for RSE…

Latest version published on 10 April, 2018.


By Peter Murray-Rust, ContentMine Ltd; Rachel Spicer, EMBL-EBI, University of Cambridge; Josh Heimbach, InterMine, University of Cambridge; Yo Yehudi, InterMine, University of Cambridge and Code is Science; Naomi Penfold, eLife

Image to the right: Bike thefts in Cambridge over 2017. Rendered by Rachel Spicer, using R (ggmap) + Google Maps + Open police data

Open Data Day (ODD) is an international event held on the first Saturday of March. It started in 2010 and is supported by Open Knowledge International. It aims to raise the profile of all types of open data, from government to research.

Creating our own ODD

The Open Data Day organiser’s guide recommended picking a focus. We didn’t have a huge amount of time to organise, and we knew this wasn’t going to be a large event but mainly a motivation to meet some busy…

Latest version published on 6 April, 2018.

By Laura Fortunato, University of Oxford

Reproducible Research Oxford is a project based at the University of Oxford, launched in October 2016. The project aims to lay the groundwork for a culture of research reproducibility across the University, focusing on training in the effective use of computational tools in research. These tools are widely used in some disciplines, and they can enable researchers to easily track the process leading from data to results, so that it is fully reproducible. However, researchers often lack the opportunities, incentives and confidence to make best use of these tools.

As part of the project, we have set up a partnership between the University and Software and Data Carpentry, non-profit volunteer organisations focused on teaching researchers across disciplines the computing and data skills they need for effective and reproducible research. Since the start of the project, we have run four Software Carpentry workshops and one Data Carpentry workshop (the first to be held in Oxford!), and we have hosted the first Oxford-based Software/Data Carpentry instructor training. So far, we have provided training to upwards of 100 learners from across the University who attended our workshops, in addition…

Latest version published on 6 April, 2018.

By Matt Archer, Paul Brown, Stephen Dowsland, David Mawdsley, Amy Krause, Mark Turner (order is alphabetical).

So… you’ve just started on an exciting new data science project, but you know nothing about the domain you’re working on. Besides briefly panicking, how do you get up to speed on the area you’re working on?

First things first: it's good to meet the researchers you'll be working with as quickly as possible. Most researchers are excited about their research; this enthusiasm is infectious. Ask questions. Be interested.

To get a basic grounding in your new area, YouTube is an invaluable source of quick bursts of domain knowledge, covering both a general subject area and the detailed specifics and intricacies of a niche within it. Video tutorials take many forms, but the useful ones to look for are short explainers on concepts or tooling, as well as longer-form recordings of lectures, workshops and panel discussions. YouTube has become a primary channel for user training materials from large software vendors: there are thousands of video tutorials on how to use tools or perform specific actions in Jupyter Notebooks, Excel and Adobe Photoshop. If large, commonly used pieces of software exist in the domain you’re trying to learn, there may be similar videos available to help you get started with them.

It can be useful to ask for a background reading list from the researchers you're working with. Selectively…

Latest version published on 5 April, 2018.

By Matthew Archer, Stephen Dowsland, Rosa Filgueira, R. Stuart Geiger, Alejandra Gonzalez-Beltran, Robert Haines, James Hetherington, Christopher Holdgraf, Sanaz Jabbari Bayandor, David Mawdsley, Heiko Mueller, Tom Redfern, Martin O'Reilly, Valentina Staneva, Mark Turner, Jake VanderPlas, Kirstie Whitaker (authors in alphabetical order)

In our institutions, we employ multidisciplinary research staff who work with colleagues across many research fields to use and create software to understand and exploit research data. These researchers collaborate with others across the academy to create software and models to understand, predict and classify data. They do so not just as a service to advance the research of others, but also as scholars with opinions about computational research as a field, making supportive interventions to advance the practice of science.

Some of our institutions use the term "data scientist" to refer to these team members, others use "research software engineer" (RSE), and some use both. Where both terms are used, the difference seems to be that data scientists in an academic context focus more on using software to understand data, while research software engineers more often make software libraries for others to use. However, in some places, one term or the other is used to cover both, according to local tradition.

What we have in common

Regardless of job title, we hold in common many of the skills involved and the goal of driving the use of open and reproducible…
