Software and research: the Institute's Blog

In October, the GeoTOD II team presented their work to key members of data.gov.uk and the Ordnance Survey. The meeting went well, and there was interest in GeoTOD II's work on exposing legacy data sets, linking geographic and locational information, and UML to RDFS conversion, as well as in GeoTOD II's longer-term plans.

GeoTOD II is contributing to the evolution of the UK's Location Strategy by exploring ways of linking geographic information with location data. The Software Sustainability Institute is assisting GeoTOD II in their use of the OGSA-DAI open source framework for distributed data management. OGSA-DAI is a key component of the GeoTOD II architecture.

Following the recent Nature article "Computational science: ... Error - why scientific programming does not compute", spawned by the Climategate affair, there's another interesting article, "Changing software, hardware a nightmare for tracking scientific data", on the Nobel Intent blog at Ars Technica. Again, it is the pace of technological advance, so important for making new discoveries, that is also forcing us to question whether we can reproduce our past results.

The author notes the difficulties of keeping a fully reproducible analysis pipeline working, with issues such as software and hardware obsolescence, data decay and dependency on services provided by others all contributing to the challenges of providing reproducible research.

Some of the things touched on, e.g. the potential need to keep outdated hardware running, are part of the work we have recently completed with Curtis + Cartwright on the purposes, benefits and approaches to software preservation. Another is the problem that arises when "the focus tends to be on getting the job done, not writing easy-to-maintain or well-commented code", something which the SSI is trying to address through our workshops and…

Continue Reading

NeSCForge has been the home to many of the software projects from the UK's e-Science programme and beyond. However, pressures of funding have led to the decision to close NeSCForge permanently on 20th December 2010.

So what do you do if your software was using NeSCForge or another site marked for closure? Don't panic - it's fairly painless and the SSI can help guide you through the process.

We've recently extended our collection of guides for developers to include:

Hopefully, these will be useful for those of you looking to migrate projects to a new home.

It's a timely reminder that you need to be aware of risks and react quickly when change is required to ensure software sustainability. Nevertheless, as long as you keep your community on board for the ride (for…

Continue Reading

Cloud computing is, in my experience, a subject that creates excitement and scepticism in equal quantities. My introduction to the subject came courtesy of a presentation by Werner Vogels, Amazon's Vice President & Chief Technology Officer, on Amazon's Elastic Compute Cloud. It was a fascinating presentation, but it was even more interesting to hear the passion behind the questions that followed. I’m not going to focus on the technical side of things in this post, because what interests me is the way in which cloud computing has grasped the public’s attention.

Unlike Grid computing, which is part of the same family of technologies, there has been a lot of public interest in cloud computing. It’s been featured on TV and radio. I was even amazed to find an overview of cloud computing in the local magazine that’s distributed in my hometown (it was a woefully misrepresented introduction, but an introduction nonetheless). Okay, there might not be a particularly deep understanding - people may think that SaaS, PaaS and IaaS are finalists in the Swedish version of the X-Factor – but there is knowledge of the concept. That’s important, and it’s the result of good marketing. And that’s no surprise: cloud computing is being promoted by big hitters like Amazon, Google and Microsoft. This also means that many people started using clouds without being aware of it, simply…

Continue Reading

There's an interesting feature over at Nature by Zeeya Merali called "Computational science: ... Error - why scientific programming does not compute" (disclaimer: I was one of the people interviewed for the piece). In it, Merali considers the issues of computational software written in a scientific context, particularly in light of the problems revealed by the leak of emails from the Climatic Research Unit at the University of East Anglia last year.

As the article notes, a lot of scientific software has grown ever more complex, and there is a steep learning curve for new researchers joining a collaboration. Additionally, software is often used to produce or analyse data published in papers, without readers being able to verify the results. This is a big issue for research, as we build on the principle that scientific discoveries should be both reproducible and repeatable. Without this, we cannot build on others' achievements.

However, where I think the article has the perspective slightly wrong is in suggesting that this complexity and fragility is the exclusive preserve of scientific software; there are plenty of examples of commercial and public-sector software that fall into the same traps. Any piece of software that has evolved organically over time, under a succession of developers, can suffer from these issues.

Likewise, simply applying techniques from industry won't solve the problem, although some things may help it. There's certainly scope for many of the benefits promised by software engineering best practice to be realised,…

Continue Reading