Software and research: the Institute's Blog

Collaborations Workshop lives up to its name

By Robin Wilson, Researcher, University of Southampton.

As a Fellow of the Software Sustainability Institute in 2013, I attended the Collaborations Workshop 2013. To be honest, I did so rather reluctantly: I was in a very busy stage of my PhD at the time, and although it seemed like a reasonable way to spend a few days, I felt that it was unlikely to produce anything of direct benefit to me. I couldn't have been more wrong.

In the first session at that year's workshop I met someone from the IT as a Utility Network+ and showed off a proof-of-concept instrument that I'd been developing during my PhD (in all honesty, I'd brought it with me so that if the conference was boring I could slip back to my room and test the instrument!). He was fascinated by it, and strongly suggested that we apply for an IT as a Utility Network+ pilot project to get some funding to continue development. We did so, and won £50,000 of funding – which was enough to employ a post-doc for six months and develop a full prototype instrument. The Collaborations Workshop was living up to its name: within an hour of the start of the conference I'd developed a collaboration which led to significant funding!

Making biology compute with CGAT

By Alexander Hay, the Institute’s Policy & Communications Consultant, talking with Andreas Heger, CGAT.

This article is part of our series: Breaking Software Barriers, in which Alexander Hay investigates how our Research Software Group has helped projects improve their research software. If you would like help with your software, let us know.

Life Sciences often suffer from a lack of programming skills. This isn’t always a problem – you don’t need to know how to code in order to gauge the diurnal eating habits of squirrels, for example – but it does become an issue when you need to work with large datasets.

This is a growing problem. Next Generation sequencing techniques produce vastly more data than ever before, and more people are needed to handle and analyse it properly. Many life scientists do not need these skills, or at least have not needed them until recently. The most sensible solution, then, is to train biologists in these skills.

Array of Python coders aim high at Diamond Hackathon

By Steve Crouch, Research Software Group Leader, and Mark Basham, SSI Fellow and Senior Software Scientist at Diamond Light Source.

January 30th 2015 saw the latest Institute-sponsored Hackathon at Diamond Light Source, bringing together top coding talent from across the STFC Rutherford Appleton Laboratory and beyond.

Participants were able to propose interesting scientific and software development projects related to Python, and work on them in groups. One constraint, however, was that these groups be formed of individuals who don't normally work together, and this led to some surprising results.

Fellows 2015 inaugural meeting

By Shoaib Sufi, Community manager.

On Thursday 29 January 2015, the day a mini blizzard struck Manchester and closed its airport, something much more important to software in research was happening in sunny London. The Fellows 2015 Inaugural meeting took place in rooms annexed to the library at Imperial College London. The Fellows were introduced to the Institute, discussed thorny problems related to software and data, and helped each other create plans for their Fellowship year.

We kicked off with a brief overview of the Institute, with a focus on insights into our strategically important research into understanding the research software community, understanding how best to engage their communities, and how to benefit from our web and social media presence. There was a lot of interest in running Software Carpentry events and taking part in our Open Call for projects.

Next stop was discussion time. There was a highly informative discussion about the challenges that the Fellows faced in using software in their research domains. A number of themes stood out, such as the difficulty of recruiting and maintaining software engineering effort (highly relevant to our work with Research Software Engineers), getting credit for software outputs, support for sharing data (especially big data), standards for reviewing code and data for journals, and clarifying intellectual property rights in software and data.

How open is your software?

By Mike Jackson, Software Architect.

The UK open source service, OSS Watch, have recently published their Openness Rating tool. This tool allows projects to assess their openness and can be applied to both open and closed source software. In this blog post, I'll provide a summary of the Openness Rating tool and how it complements our own online Sustainability Evaluation Service.

Exergames - how to make physiotherapy fun

By Alina Călin, Chief Research Officer at MIRA Rehab, and Dr. Emma Stanmore, Lecturer in Nursing at the University of Manchester.

This article is part of our series: a day in the software life, in which we ask researchers from all disciplines to discuss the tools that make their research possible.

Gamification is the next big innovation in the field of rehabilitation. It makes use of remote sensors and aspects of video game design to engage patients in their rehab and make it more accessible, which in turn encourages participation and so keeps costs down.

Exergames was our attempt to develop this possibility, and is a successful collaboration between The University of Manchester, Central Manchester University Hospitals NHS Foundation Trust and MIRA REHAB Ltd. It led to the development of several exergames, all designed to use the Microsoft Kinect sensor and all based on exercises known to help prevent falls and improve function in older people.

Building a bridge between a virtual machine and the outside world

By Mike Jackson, Software Architect.

The Distance project at the University of St. Andrews use Windows XP virtual machines for developing their Distance for Windows software. Their interface code, implemented in Visual Basic, is not held under revision control, and institutional security policies mean that their XP virtual machines cannot be connected to the network.

In this blog post, I describe my experiences of using Git and shared folders to address both these problems, as part of our recent open call collaboration.

Scientific coding and software engineering: what's the difference?

By Daisie Huang, Software Engineer, Dryad Digital Repository.

What differentiates scientific coders from research software engineers? Scientists tend to be data-discoverers: they view data as a monolithic chunk to be examined and explore it at a fairly fine scale. Research Software Engineers (and software engineers in general) tend to figure out the goal first and then build a machine to do it well. In order for scientists to fully leverage the discoveries of their predecessors, software engineers are needed to automate and simplify the tasks that scientists already know how to do.

Scientists want to explore. Engineers want to build

I've been thinking a lot about the role of coding in science. As a software engineer turned scientist, my research is extremely computational in nature: I work with genomes, which are really just long character strings with biological properties. My work depends on software developed by myself and many, many other scientists. Scientists are, by and large, inquisitive and intelligent people who are fast learners and can quickly pick up new skills, so it seems natural that many would teach themselves programming. When I first started talking to scientist-coders, I thought that perhaps I could relate to them from a programming perspective, and maybe bring some experience in formal software design practices to teaching scientists about coding. I started working with Software Carpentry and organisations of computational scientists in my field (Phylotastic, Open Tree of Life, Mesquite) and getting more involved in figuring out what motivates scientists to take time out of their research and learn to code.

An introduction to CGAT

By Andreas Heger, CGAT Technical Director.

Today, biologists have access to high-throughput measurement techniques that can assay many variables or entities at the same time. One striking example has been the advent of massively parallel sequencing techniques in the form of next-generation sequencing (NGS).

While the sequencing of the human genome took more than ten years and cost billions of pounds just a decade ago, a researcher can now send off material to a sequencing service and expect the equivalent of multiple human genomes' worth of data within a few weeks, for not much more than the cost of a typical experiment. Unfortunately, few biologists are trained to deal with the handling and statistical issues of the resultant large data sets.

Adopting automated testing

By Mike Jackson, Software Architect.

Automated tests provide a way to check that research software both produces scientifically-valid results and that it continues to do so if it is extended, refactored, optimised or tidied. Yet one challenge that can face researchers, especially those with large, legacy codes, is this - where to start?

The prospect of having to write dozens of unit tests can be off-putting at the best of times, let alone if one has data to analyse, a paper to write or a conference to prepare for. Our new guide on Adopting automated testing describes an approach for introducing tests by focusing on end-to-end, or system, tests first.
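
To give a flavour of what an end-to-end test can look like, here is a minimal sketch in Python. It is not taken from the guide: `run_analysis`, `outputs_match` and the reference data are hypothetical names and values invented for illustration, with `run_analysis` standing in for whatever legacy code you already have. The idea is simply to run the whole program on a known input and compare its results, within a tolerance, against output you have already checked and trust.

```python
import math


def run_analysis(samples):
    """Hypothetical stand-in for a legacy research code, run end to end."""
    mean = sum(samples) / len(samples)
    variance = sum((x - mean) ** 2 for x in samples) / len(samples)
    return {"mean": mean, "std": math.sqrt(variance)}


def outputs_match(actual, expected, rel_tol=1e-9):
    """Compare two result dictionaries key by key, within a relative tolerance."""
    if actual.keys() != expected.keys():
        return False
    return all(
        math.isclose(actual[key], expected[key], rel_tol=rel_tol)
        for key in expected
    )


# Reference input and output (illustrative values): in practice these would be
# recorded once from a run whose results have been checked and are trusted.
REFERENCE_INPUT = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
REFERENCE_OUTPUT = {"mean": 5.0, "std": 2.0}


def test_end_to_end():
    """System test: the whole pipeline still reproduces the trusted result."""
    assert outputs_match(run_analysis(REFERENCE_INPUT), REFERENCE_OUTPUT)


if __name__ == "__main__":
    test_end_to_end()
    print("end-to-end test passed")
```

A single test like this says nothing about which part of the code is wrong when it fails, but it does flag that a refactoring or optimisation has changed the results, which is often the most useful first safety net for a large legacy code.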