Software and research: the Institute's Blog

Latest version published on 23 May, 2018.

By Martin Callaghan, University of Leeds; Daniel S. Katz, University of Illinois; Alexander Struck, Cluster of Excellence Image Knowledge Gestaltung, HU-Berlin; and Matt Williams, University of Bristol

This post is part of the Collaborations Workshops 2018 speed blogging series.

This blog post is the result of a discussion group during the Collaborations Workshop 2018 organised by the Software Sustainability Institute. We talked about some national and institutional efforts being made to establish RSE groups and positions and are writing this blog to share our thoughts. The most successful of these RSE efforts have come from within UK universities. We believe sharing strategies and case studies on how to implement pilots should help grassroots movements and support…


Latest version published on 22 May, 2018.

By M. H. Beals, Catherine Jones, Geraint Palmer, Mike Jackson, Henry Wilde, John Hammersley, Daniel Grose, Robin Long, Adrian-Tudor Panescu, Kirstie Whitaker

This post is part of the Collaborations Workshops 2018 speed blogging series.

What does reproducible mean? Who do we want to help and support by making our research reproducible? At what point does non-reproducible research become good enough (and carry on to the highest standards of reproducibility)?

In our discussions during the first speed blogging session at the Software Sustainability Institute’s Collaborations Workshop in Cardiff in March 2018, we brainstormed criteria for judging the quality of reproducible research. What emerged were two clear messages: 1) We all have our own overlapping definitions of the desirable features of reproducible research, and 2) there is no great benefit in rehashing old discussions.

In this blog post we outline 9 criteria that can be met by reproducible research. We believe that meeting as many of these as possible is moving in the right direction. Source code and data availability are often seen as important requirements, but documenting what code is trying to achieve, which other software libraries are required to run the code, the greater research ecosystem, what lessons were learned in the development of the…


Latest version published on 21 May, 2018.

By Matthew Upson, Data Scientist at Juro.

Unlike most of the 2017/2018 cohort, when I applied to become a fellow of the Software Sustainability Institute, I was a civil servant rather than an academic. In this blog post I want to talk about why Government needs sustainable software, the work being done to deliver it, and the lessons we learnt after the first year.

But Government already has sustainable software...

There's quite a bit of disambiguation that needs to be done to the statement 'Government needs sustainable software'. In fact, Government already has sustainable software, and lots of it. One need only look at alphagov, the GitHub organisation for the Government Digital Service. Sustainable, often open source, software is alive and well here, written by professional software developers, and in many other places in central and local Government alike. But this isn't the whole story.

There are other parts of Government that write software but, as with many people in academia, you may have a hard time convincing them of this fact. In central Government (this is where my experience lies, so I will focus largely upon it) there are literally thousands of statisticians, operational researchers, social researchers, economists, scientists, and engineers. Any one of these…


Latest version published on 18 May, 2018.

By Becky Arnold, University of Sheffield

Coding is now the backbone of much of scientific research. Despite this, in many cases the coding education of researchers is nonexistent, or doesn’t extend far beyond how to use a for loop. As a result we largely learn tips, tricks and best practice the hard way, and in small fragments. The solution: teach each other! If everyone knows a little, then put together we know a lot.

The approach of the Sheffield astrophysics group

Since May 2017 we have held lunchtime code review and collaboration meetings every two weeks. These meetings are kept very informal to encourage discussion, and as well as reviewing one another’s code we use these sessions to exchange information. If you spent the week figuring something out, read an interesting article, or picked up a new trick, here is the place to tell others about it. People can also opt to give short tutorials if they wish; for example, we’ve had ones on version control, wrapping one language with another, and best practice.

These sessions have had numerous benefits:

1. Save time and frustration

Researchers have a vast array of administrative and research responsibilities which leave little time to spare. But how many days of that valuable time have you spent knocking your head against coding issues, scouring Stack Overflow, and crying into coffee cups at 2 am about the code that just won’t work? An hour every week or two…


Latest version published on 15 May, 2018.

By Becky Arnold, University of Sheffield

On the 2nd of May 2018, David Hubber, a postdoc at Ludwig-Maximilians-Universität Munich, gave a seminar on Long-term Software Development for Scientists, as part of a series of talks around good practice at the University of Sheffield. David discussed how to structure code efficiently, code module design, decoupling strategies and test-driven development.

I’ve had the pleasure of meeting David previously; we’re both astrophysicists working on simulations of star-forming regions. In particular, David has a great deal of experience in working with big codes that span tens of thousands of lines. While my own work is on a smaller scale, I, like many researchers I suspect, am familiar with writing programs that start small and quickly evolve into a huge, unmanageable mess.

David discussed the dangers of “the blob” and “spaghetti code”, before going on to cover strategies to avoid them. Some of these strategies were stylistic: he described how keeping style choices consistent in your code makes it easier for your future self and others to understand, and thus more maintainable.

He also discussed how comments, while absolutely vital, can be something of a double-edged sword if misused. Excessive comments can…
