Software and research: the Institute's Blog

At the SeIUCCR Summer School in September I was asked a blinder of a question:

“How do I choose sustainable software for my project?”

Assuming an open-source context for this question, there are many things worth considering. It could be that the functionality of your software needs to be extended. Not wanting to re-invent the wheel, you're looking for an appropriate library to provide that functionality. Or perhaps you have an analysis tool that outputs a certain data format that you need to post-process into an image. What should you look for in software?

It's easy to reach for the first software package you come across that seems to do what you want. Perhaps it's already installed on your target platform, or it's the first thing you found on Google. But picking the wrong software can have expensive consequences if it doesn't do everything you want or, even worse, if development and support come to a stop!

Taking a little time to make an informed choice is time well spent. So what questions can you ask about the software to find out if it’s suitable?

First off, and…


By Colin Semple, a Software Sustainability Institute Agent.

It has become a cliche to announce that biology is undergoing a revolution, driven by the rapid advance of new technologies for high-throughput sequencing (HTS) of DNA and RNA. A cursory glance at the dramatic increase in sequencing capacity (and the corresponding fall in costs) over the past couple of years reveals rates of improvements that outpace Moore's Law, the famous doubling of processing power every two years seen during the evolution of computer hardware. This is prompting biologists of almost every flavour to think bigger than ever before.

There has been, and continues to be, an explosion in very large and ambitious experiments involving the generation of large and often entirely novel datasets. In short, it is a very good time to be a biologist. It is also a very busy, and rather scary, time to be a bioinformaticist, as we attempt not to drown in the deluge of sequence data demanding computational processing and analysis. The related trials and tribulations of those of us in the firing line are often ignored, but certainly not at ISMB 2011, the largest yearly conference of the bioinformatics community.

Of course, like any large conference, ISMB 2011 covered a wide variety of areas from population genomics to text mining, systems biology to bioimaging, and disease genetics to evolution. However…


By Kathryn Rose, a Software Sustainability Institute Agent.

Despite being associated with the icy landmass of Antarctica, the Southern Ocean hosts an abundance of marine life. Whilst most people might associate this wildlife with whales, seals and penguins, what many do not realise is that the majority of this great biodiversity is actually located on the sea floor (over 8,500 known species so far). In light of this, the International Polar Year saw the launch of the Census of Antarctic Marine Life, an international initiative established to investigate and record the distribution and abundance of marine life in the Southern Ocean. This work will contribute new insight into the complex biology of the Southern Ocean and how this system is responding to global environmental change.

Dr. Huw Griffiths, of the British Antarctic Survey, was one of a group of UK-based scientists to travel to the Antarctic to participate in the census. Working for several months at a time on board a ship in the Southern Ocean, the team was able to record countless new species. The quantity and significance of the data collected by both Dr. Griffiths and the numerous groups participating from other nations highlighted the need for a robust database and software system to support such a large-scale survey.

The project required a system that would allow all the new species data to be located in one global database. Consequently, the initiative worked hard, in collaboration…


By Kristy Revell, a Software Sustainability Institute Agent.

When OpenStreetMap (OSM) was born in 2004, it was created as "an initiative to create and provide free geographic data, such as street maps, to anyone". Since then, OSM has grown rapidly and been developed by more than 400,000 volunteers.

I first became aware of OSM during a workshop in Kibera, an informal settlement in Nairobi. Vast amounts of data are collected on this community by the community, facilitated by the Map Kibera project. This project made me ponder the role of OSM in collecting and disseminating data from the developing world. To better understand this role, I took a few moments to speak with Dr Muki Haklay, Senior Lecturer in GIScience at UCL, London. Dr Haklay has vast experience in participatory mapping with his work stretching to all corners of the UK, from Dorset to London to Newcastle. His research is soon to head overseas, starting in Cameroon.

OSM plays an interesting role in engaging communities, enabling them to effect change, and providing researchers with data. Dr Haklay shares his views on OSM openly with me. He admits that he is a big fan of OSM, but he also describes himself as a critical friend: one who admits that OSM is not without its problems. For example, in terms of representing communities in the UK, one problem is that wealthy areas are mapped to a greater extent than less wealthy areas.

Yet Dr Haklay also asserts that there are groups within the OSM…


By Aleksandra Pawlik, one of the Institute's Agents.

The maintenance of scientific legacy code gives many scientists (and software engineers) a major headache. Supporting users, adding new functionality and fixing bugs causes problems to accumulate, until it seems easier to abandon the software and develop it again from scratch. But freezing the legacy code for (possibly) a few years of rewriting means that new contributions have to wait until the rewritten software is released. Is there a solution that enables continuity of legacy software development, yet makes it possible to keep the software up to date and user friendly?

Scientists studying molecular physics found themselves facing this question several years ago. They work on two projects, UKRmol-in and UKRmol-out. Their software, parts of which could be traced back as far as the 1970s, proved to be a useful and sometimes crucial tool. Unfortunately, it had become difficult to use and maintain. It started to look like the software was heading down a blind alley, but a few scientists managed to find a way forward.

The code had a number of contributors from different research institutions, each with different experience, skills and programming practices. There was no comprehensive methodology or set of guidelines for those who, over the years, had developed the software. In fact, the use of…
