Fellows blogs

EuroSciPy

By Nikoleta Glynatsi, Cardiff University

EuroSciPy is a cross-disciplinary gathering of sorcerers, knights and rangers united for the good of scientific research, focused on the use and development of the Python programming language. EuroSciPy 2017 was the 10th such conference and took place in Erlangen, Germany, from 28 August to 1 September 2017. The conference aims for participants to show their latest work, learn from each other and collaborate on developing projects, all of which is achieved through multiple days of tutorials, talks and sprints.

Days one and two started off with several tutorials in two parallel streams (novice and advanced levels). I attended all the advanced workshops, which touched upon several tools such as pandas, scikit-learn and SymPy. The tutorials were the highlight of the conference, and I highly recommend the following:

  • Scikit-learn with Olivier Grisel and Tim Head. Material and video (…



By Blair Archibald, Software Sustainability Institute fellow.

Software is (almost) never written or run in isolation. Instead, it builds on top of a wide range of dependencies, from compilers and language runtime environments to application-specific libraries. This is a huge challenge for reproducible research. Not only should the software we write be sustainable (e.g. well versioned, documented, and tested), but the environments that the software exists within also need to be documented and, ideally, recreatable.

Many have suggested virtual machines/containers as one solution to this problem (see, for instance, the recent Docker Containers for Reproducible Research Workshop, C4RR), where you ship not just your computational code but also the environment alongside it. While this is a good start on tackling this problem, I'm not sure it's fully sufficient. Often the environment for the image is constructed using a standard Linux distribution's package manager, and these usually, by default, install the newest possible (stable) version of a package, meaning that two people running a VM/container at two…


Astronomy research software

By Thomas Robitaille, Software Sustainability Institute Fellow

I recently attended the European Week of Astronomy and Space Science (EWASS), the largest yearly conference for European astronomy. This year, it featured a session on Developments and Practices in Astronomy Research Software that touched on many themes important to software sustainability.

With the rise of open source projects in astronomy, this session was a great way to expose astronomers at a major conference to best practices in software development and to update them on available free software and projects, including Astropy, TARDIS, Stingray and more. Talks covered topics such as reproducible science, software best practices, when to make code public, transparency, credit, and citation of software, as well as a number of examples of best practices and lessons learned in specific projects. The session was extremely successful, with many talks ending up standing-room only. On the second day, a hack day allowed attendees to work on related projects and…


Learning programming for non-programmers

By Thomas Etherington, Senior Research Leader, Royal Botanic Gardens, Kew, and Institute Fellow.

One of the Software Sustainability Institute’s missions is to increase programming skills through training. So as part of my Software Sustainability Institute Fellowship, and with support from the Royal Geographical Society, I am organising an introductory programming workshop targeted at geographers. However, as well as telling people what they need to know from the perspective of someone with programming skills, I also think it is important to understand how learning computer programming is viewed from the non-programmer's perspective so that the training can be made as effective as possible. Therefore, the application process for attending the workshop involved a survey that asked non-programmers questions about their perceptions of computer programming.  I’ve summarised here what I think are some interestingly consistent themes and how these will affect the design of my workshop.

I had a total of 18 applicants for the workshop, the majority of whom had interests in human geography, so the following does need…


Computational Neuroscience

By Stephen Eglen, University of Cambridge.

The annual Computational Neuroscience meeting was held this July in Antwerp, Belgium. This is a well-established meeting for researchers to discuss matters around computational modelling and analysis of neuronal systems. Although computational simulation and analysis is at the heart of this field, historically there has been little evidence of sharing of code. I was pleasantly surprised, however, to discover at the meeting that many leading labs now embrace open science. Below I outline my key observations based on observing and presenting at two workshops.

Workshop 1: Recent methods and analyses for neuronal population recordings

Recent technological developments mean that it is now possible to record the spiking activity of many hundreds or thousands of neurons simultaneously. This workshop described some of these recent techniques and the challenges for data analysis. Two themes of general interest emerged in my view from the first day:

  1. People are now sharing their computational methods; most speakers at the workshop already made…


Coding Retreat

By Dr Eilis Hannon, University of Exeter, Software Sustainability Fellow.

On Friday 30th June, the first Coding Retreat was trialled at the University of Exeter. The idea originated from a Writing Retreat I attended, where the focus was to clear all distractions and use the time not just to write but to produce high quality output. This parallels many of the challenges that researchers face when writing software, including the sense that there is no time to make the code nice or to think about how someone else would use it. So it struck me that a Coding Retreat could be a mechanism to provide the discipline to promote good practice.

Based on the Writing Retreat I attended, the premise was to develop a similar event providing researchers with the time and space to focus on and prioritise writing high quality, sustainable software. The workshop was designed to be inclusive: open to any programming language, any discipline and any project. The only requirement was that attendees had a project to work on, either improving or finishing ongoing work or starting something new.

The day started with refreshments and an overview of what the day would involve. The primary objective of the workshop is to use the time, cleared from the usual daily distractions, to produce high-quality code putting into practice all the principles acknowledged…


Cython

By Thomas Etherington, Senior Research Leader, Royal Botanic Gardens, Kew, and Software Sustainability Institute Fellow.

Getting code running fast enough to be useful is an important consideration for making software sustainable. For Python programmers, the Cython project provides an opportunity to speed up your Python code. As part of my Software Sustainability Institute Fellowship, I spent a couple of days learning about Cython from one of the lead developers, and I’ve summarised, from my perspective, when Cython could be a useful tool for others to explore.

My interest in Cython began when I looked into the code of a SciPy function and saw code that looked quite Pythonic, but clearly wasn’t actual Python code. It transpires that a lot of SciPy functions have been written using Cython, which is a language that can either: compile Python code directly to C, or wrap C or C++ code in Python, so that computational speeds associated with lower-level C programming can be leveraged from a higher-level Python programming interface. So while SciPy is one of my favourite Python packages, the code itself actually consists of “more than 200,000 lines of C++, 60,000 lines of C, [...] compared to about 70,000 lines of Python code…


Best coding practices

By Sanjay Manohar, University of Oxford, and Software Sustainability Institute fellow.

We have a problem in neuroscience. The complexity of data analysis techniques increases every year. With each increase in complexity comes an increase in the possibility of error. We have already been plagued by such problems, which have been reported on the internet and in the news. With increasing quantities of code being written, these problems seem unlikely to subside. How can we address the root of the problem?

In a questionnaire I regularly give to my students, one recurrent and ominous statistic always emerges: neuroscientists are, by and large, taught by each other to code. A PhD student is taught by his postdoc, who is in turn guided by her PI, who herself learned to program on the job with help from colleagues. Nobody in the loop has been formally taught coding practices, or implements the kinds of guidelines or conventions that are commonly imposed when programming in industry. This, I believe, is a central part of the problem.

I see a…


Data Science for Doctors

By Steve Harris, University College London Hospital, and Software Sustainability Institute Fellow.

This article was first published at Data Science Breakfast Club.

The general public would assume that the medical profession are numerically literate. University-educated, technically trained in biological science and more, and with the unique legal privilege of prescribing medicines where incorrect dosing leads to disaster, there would seem to be no excuse. However, most medics would deny any affinity with maths, and exude a distaste for statistics. This is a truism even amongst anaesthetists, whose professional training includes physics and pharmacokinetics.

Despite this public denial, we are nonetheless a data-literate profession. We read and interpret scientific papers, we run audit projects and write business cases for improvement projects. These are all intrinsically quantitative undertakings. However, our adopted posture gets in the way of our doing these things as well as possible, and engaging with statistics and data.

Data science would appear to be a rebranding of statistics. Despite underpinning quantum mechanics, modern finance, and the humble weather forecast, statistics has never managed to seem cool. At worst, it is cited as ‘lies,…


Software in engineering

By Edward Smith, Institute’s fellow, Imperial College London.

For an engineer, software design concepts are not only familiar, they are central to the education we are forced to endure. These include standardisation, quality testing and the importance of outlining a clear specification. However, when it comes to software development, engineering academics seem to forget these principles; principles that shaped the industrial revolution and allowed us to engineer the modern world. In this short blog, I want to explore why academic engineers don't apply these best-practice concepts to software.

It is clear we are in the middle of a revolution; one which arguably will change the world more rapidly than the industrial revolution over a hundred years ago. Aside from the scientific developments, among them steam power, electricity and mastery of materials such as iron and steel, it was the methodologies forged during this period that were pivotal to the revolution. The key concepts for mass production were the division of labour and automation, allowing much greater production by fewer people. In addition, the standardisation of parts allowed each person to specialise in and optimise a given part, with the parts fitting together through an agreed interface between them.

Consider a car. Before the industrial revolution, a…
