Software and research: the Institute's Blog

Latest version published on 9 November, 2017.

By Stephan Druskat (Humboldt-Universität zu Berlin, Germany)

On 26 October 2017, the Force11 Software Citation Implementation Working Group and Force11 Hackathon was hosted at the Force2017 Conference in Berlin, Germany, and led by Neil Chue Hong (Software Sustainability Institute), Lars Holm Nielsen (Zenodo) and Martin Fenner (DataCite). Participants took a full day to exchange, discuss, plan, and hack towards implementations of the software citation workflow.

Software citation is at the very heart of the process to create recognition for software as a scholarly product, and finding and implementing a working solution for this issue is the key to attribution and credit for the creators of scientific software. At the same time, the ability to cite software also helps to unlock its potential for sustainability by boosting its accessibility, and by fostering its persistence, since it encourages tagging published versions with unique identifiers such as DOIs. Thus, software citation also becomes a natural path to the reproducibility of research results in the context of open science, where not only the data, but also the complete toolchain employed in a research endeavour, is made openly available.

Software citation is still hard

However, software citation is hard, and for diverse reasons. It is…

Latest version published on 6 November, 2017.

By Mike Jackson, Software Architect

On 18th October I attended GRADnet's "Moving Forward for 2nd Year PGRs" day in London for physics post-graduates, and ran two sessions on "Writing better software for research".

SEPnet, the South East Physics Network, is a consortium of universities in the south east of England, promoting excellence in physics in both academia and industry, via research, collaboration, training, and outreach. GRADnet is SEPnet's collaborative graduate school which provides professional skills training to PhD students.

GRADnet's "Moving Forward for 2nd Year PGRs" day offered attendees a choice of 5 sessions both morning and afternoon, on Creating impact, How to write a successful Fellowship Application, Research data management, Unconscious Bias and Writing better software for research. 66 students attended the event, held at the Park Crescent Conference Centre, London.

My 2.5-hour session on Writing better software for research provided students with a hands-on code review to get them thinking about the qualities of good, and bad, code. I gave an introduction to a selection of best practices from Wilson et al.'s highly recommended 2014 paper Best Practices for…

Latest version published on 2 November, 2017.

By Mike Jackson, Software Architect

In Using Excel for data storage and analysis in LUX-ZEPLIN, I summarised how Excel is both used and managed within the LUX-ZEPLIN (LZ) project and made recommendations for improvements. In this second of two blog posts, I describe how LZ could migrate their data from Excel to MongoDB, with supporting software in Python for computation and presentation. I also describe a proof-of-concept which extracts data from Excel, populates MongoDB with this data, and computes the radiogenic backgrounds expected from a subset of the possible sources of contamination.

As a reminder, the BG table is an Excel spreadsheet, with 43 sheets, used by LZ to calculate radiogenic backgrounds, and the WS Backgrounds Table is a sheet within the BG table which summarises the radiogenic backgrounds expected during the lifetime of the experiment from each source of contamination.

Migrating from Excel to MongoDB and Python

Excel combines data, computation and presentation. For example, a cell with a formula in Excel is a combination of data and computation, in effect a tiny program. The migration plan was based around migrating from the BG table into a solution…
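The separation of data, computation and presentation described above can be illustrated with a minimal Python sketch. This is not LZ's actual schema or calculation: the field names, the functions `total_activity_mbq` and `summarise`, and all the values are hypothetical and chosen only to show the idea. The dictionaries stand in for documents of the kind that could be stored in MongoDB.

```python
# Data: plain records, shaped like documents that could live in MongoDB.
# All field names and values here are invented for illustration only.
sources = [
    {"component": "example part A", "mass_kg": 2.0, "activity_mbq_per_kg": 1.5},
    {"component": "example part B", "mass_kg": 0.5, "activity_mbq_per_kg": 4.0},
]


def total_activity_mbq(doc):
    """Computation: what a spreadsheet formula cell would do, kept in code."""
    return doc["mass_kg"] * doc["activity_mbq_per_kg"]


def summarise(docs):
    """Presentation: format the computed results, separately from the data."""
    return [f"{d['component']}: {total_activity_mbq(d):.2f} mBq" for d in docs]


if __name__ == "__main__":
    for line in summarise(sources):
        print(line)
```

In a spreadsheet, the multiplication would sit inside a cell alongside the numbers it uses; here the data can change without touching the computation, and the computation can be tested on its own.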

Latest version published on 1 November, 2017.

By Mike Jackson, Software Architect

The LUX-ZEPLIN (LZ) project are building one of the largest and most sensitive dark matter detectors ever constructed. I’ve been providing consultancy, as part of an Institute open call project, on how LZ can migrate their data storage and analysis software from Microsoft Excel to a database management system-centred solution. In the first of two blog posts, I summarise how Excel is both used and managed within LZ and give recommendations for improvements.

As described in my blog post at the outset of the consultancy, Shining a light on dark matter, LZ partners at University College London and University of Coimbra maintain LZ's backgrounds control software. At the heart of the backgrounds control software is a Microsoft Excel spreadsheet (termed the “BG table”). While fit for purpose in the experiment’s early design and procurement stage, Excel is now reaching its limits in terms of sustainability, its ability to interface with other software in the experiment (for example, analysis software that interprets dark matter data), and the interface with…

Latest version published on 16 October, 2017.

By Raniere Silva, Software Sustainability Institute, and David Pérez-Suárez, University College London.

Last year Raniere found out that the GNOME User and Developer European Conference (GUADEC) 2017 would be hosted in Manchester and that he should attend. Early this year, during Science Together, Raniere mentioned GUADEC to David Pérez-Suárez and we agreed to show up at the conference to find out what we could learn from GNOME about onboarding newcomers and best software development practices.

Onboarding Newcomers


All open source projects struggle with onboarding newcomers. And, most of the time, driving yourself to the first contribution is a journey that will have old source code, out-of-date documentation, undocumented culture and other rocks on the way. Thankfully, many contributors to open source are working collaboratively with other…
