Fellows blogs

Easy statistics, bad science, pre-registration, analytical flexibility

By Cyril Pernet, University of Edinburgh

I attended Brainhack Global in Warwick on 2nd and 3rd March 2017. On this occasion, I was invited to give a talk on reproducibility and to listen to Pia Rothstein's talk about pre-registration. These somewhat echo two papers published earlier this year: "Scanning the horizon: towards transparent and reproducible neuroimaging research" by Russ Poldrack et al. in Nature Neuroscience and "A manifesto for reproducible science" by Marcus Munafò et al. in Nature Human Behaviour. Here, I’ll talk about these two concepts that are 'new' because of the ease with which computers can nowadays…

Participation in distributed open source projects

By Mario Antonioletti, Software Sustainability Institute, Nikoleta Evdokia Glynatsi, Cardiff University, Lawrence Hudson, Natural History Museum, Cyril Pernet, University of Edinburgh, Thomas Robitaille, Freelance.

Running an open-source project with geographically distributed participants is a substantial undertaking. Attracting and retaining participants can be hard. Communicating project progress via social media, websites and mailing lists can take a lot of time, as does organising and running regular meetings since contributors are typically separated both by geography and time zones. It is important to identify mechanisms for (i) attracting participants (social aspect), (ii) communications and development processes (which can encompass workflows, standards, conventions, project administration) and (iii) remote collaboration. In this blog post, we summarise some social and technical challenges and…

Software Sustainability Practice

By Blair Archibald, University of Glasgow, Gary Leeming, University of Manchester, Andy South, Freelancer, Software Sustainability Institute Fellows

Software plays a key role in a modern research environment with over 92% of academics reporting the use of research software. With such a large impact there is huge variation in the potential audience for the work of the Software Sustainability Institute across different disciplines. In some areas there already exists best practice, but many may find it difficult to understand the value or justification for making the effort to engage with software sustainability. Our mission, as fellows, is to help them.

As fellows, we need to interact with different stakeholders: the individual researchers who use and write software as part of their general practice, groups and disciplines who use software to enable new results to push their field forward, and policy makers who have global influence over the software conditions of funding and practice. We can target each of these stakeholders differently and provide a justification of improved software practice.

Open source

By Alice Harpole, University of Southampton, Danny Wong, Royal College of Anaesthetists, and Eilis Hannon, University of Exeter, Software Sustainability fellows

There has been a collective push in recent years to make all empirical data open access, and this is often a requirement where it has been funded by taxpayers. One reason for this is to improve the overall quality of research and remove any barriers from replicating, reproducing or building on existing findings, with the by-product of promoting a more collaborative style of working. In addition to making the data available, it is important to make it user-friendly by providing clear documentation of what exactly it is and how the data was generated, processed and analysed. There are a number of situations where the key contribution from the research is not simply the underlying data but the software used to produce the findings or conclusions, for example, where a new methodology is proposed, or where the research is not based on any experimental data but instead on simulations. Openly sharing software is as critical here as sharing the raw data for experimental studies. What’s more, there are likely many projects where both the data and software are equally important, and while there is an expectation to provide the data, this currently…

CW montage

By Melody Sandells, Director at CORES Science and Engineering Limited and Institute Fellow

It was way back in 2012 when I first participated in a Software Sustainability Institute Collaborations Workshop. As attendance was a condition of my fellowship I had to go, and I had no idea what to expect...

As it turned out, it was one of the best meetings I've ever attended, alongside a mix of researchers, software engineers and people from funding bodies. There were some keynote speakers and lightning presentations, which were enthralling and entertaining, and I seem to remember laughing quite a bit. There were lots of smaller group activities to discuss diverse topics, some that I knew nothing about but left caring deeply. The Collaborations Workshop has shaped me to this day more than I’d probably admit, and gifted me some of the colleagues I have now.

I’ve watched a few more collaborations workshops go by with green eyes, particularly the addition of Hack Days. If you want to judge the value of those then look no further than Robin Wilson’s Recipy. I have recently embarked on a new career direction outside traditional academia,…

British Library awards, Library Carpentry

By James Baker, Lecturer in Digital History and Archives, University of Sussex, and Software Sustainability Institute Fellow

Librarians play a crucial role in cultivating world-class research, and in most disciplinary areas today world-class research relies on the use of software. Established non-profit organisations such as Software Carpentry and Data Carpentry offer introductory software skills training with a focus on the needs and requirements of research scientists. Library Carpentry is a comparable introductory software skills training programme with a focus on the needs and requirements of library professionals; by software skills, I mean coding and data manipulation that go beyond the use of familiar office suites. As librarians have substantial expertise working with data, we believe that adding software skills to their armoury is an effective and important use of professional development resource that benefits both library professionals and their colleagues and collaborators across higher education and beyond.

In November 2015 the first Library Carpentry workshop programme took place at City University London Centre for Information, generously supported by the…

Diversity

By Eilis Hannon, Research Fellow in Bioinformatics, in the Complex Disease Epigenetic Group at the University of Exeter.

This post summarises a discussion with Lawrence Hudson, Roberto Murcio, Penny Andrew and Robin Long as part of the Fellow Selection Day 2017.

The question of how to improve diversity is suitably broad and vague to initially induce silence in a group, but eventually, true to its name, it promotes a wide-ranging discussion. Sometimes the task is divided up to target particular under-represented groups, as it starts to become a bit of a minefield to develop a scheme that improves diversity in general. What opens the door to some parts of society can simultaneously close the doors to others. Hackathon events are a common and successful method of attracting young people to computer science; however, if they take place over the weekend and are marketed as providing beer and pizza for sustenance, you start to exclude anyone with caring responsibilities or discourage anyone who doesn’t drink.

Before we can think about trying to improve diversity, it is helpful to consider what exactly do we mean and what are the benefits…

Big Data

By Anna Leida, eScience Lab, University of Manchester

At the New Scientist Live festival of science and innovation, Professor Sir Nigel Shadbolt of the University of Oxford, co-founder of the Open Data Institute, gave a talk on the promise and peril of big data and artificial intelligence. Big Data is the popular scientific term to describe the ability of computers to access and successfully analyse large amounts of data from multiple sources. This ability is the foundation for intelligence, and is an activity our human brains do on a daily basis, but where we have so far been in universal solitude - at least as far as we know. So why would we not welcome a little company in the ivory tower of intelligence, even if it is only by artificial means?

During half an hour in a fully packed auditorium, Professor Shadbolt walked the audience through history. Starting with the prosaic description of HAL in "2001: A Space Odyssey" (1968), via the invention of the "World Wide Web" in the 1980s, to machines now outsmarting humans in a series of data processing and analysis tasks, such as Deep Blue (chess), Watson (Q&A) and

GSOC blog

By Raniere Silva, Software Sustainability Institute, David Pérez-Suárez, University College London.

The Google Summer of Code (GSoC) is a programme run by Google to sponsor the development of open source projects by university students between June and August (see our previous post Downloading Developers: The Google Summer of Code). After the summer, Google sponsors some GSoC mentors to meet in Sunnyvale, California, for a two-day summit where they can discuss what went well and what can be improved.

When we discovered that we would attend the summit (Raniere represented NumFOCUS and David represented Open Astronomy), we were happy to know in advance that a familiar face would be present. The summit kicked off on a Friday. Mentors arrived at their respective hotels with their many (figurative) hats; not all attendees make their living from their projects (we don't). The summit followed the unconference style and its schedule for the next two days started to take form the same Friday night. To propose a session, participants needed to write their…

NASA picture of London

By Olivia Guest, Software Sustainability Institute Fellow

The Open Data Science Conference, ODSC, was held for the first time in London on October 8th and 9th. As far as I understand, it has its roots in the US and has only recently expanded to another continent. I’m not sure what I expected as I was still very much recovering from PyCon UK (yes, I’m a lightweight). However, I had noticed that quite a few talks were on packages I and/or colleagues use (e.g., TensorFlow, scikit-learn, etc.) so I was excited to see how and what they’re used for by others.

The first talk was delivered by Gaël Varoquaux, a core developer of scikit-learn, joblib, and other Python packages. He touched on a number of important issues. Firstly, he defined what a data scientist is as the combination of statistics and code, and,…
