Software and research: the Institute's Blog

GUADEC

By Raniere Silva, Software Sustainability Institute, and David Pérez-Suárez, University College London.

Last year, Raniere found out that the GNOME User and Developer European Conference (GUADEC) 2017 would be hosted in Manchester and decided that he should attend. Early this year, during Science Together, Raniere mentioned GUADEC to David Pérez-Suárez, and we agreed to go to the conference to find out what we could learn from GNOME about onboarding newcomers and software development best practices.

Onboarding Newcomers

All open source projects struggle with onboarding newcomers. Most of the time, making your first contribution is a journey through old source code, out-of-date documentation, undocumented culture and other rocks along the way. Thankfully, many contributors to open source are working collaboratively with other…

Digital humanities

By Giacomo Peru

On 26th and 27th September, Oxford held one of the first Data Carpentry workshops for Humanities*. The workshop is the fruit of a collaboration between Reproducible Research Oxford and the Software Sustainability Institute. Iain Emsley undertook the task of porting the Ecology lessons to a Humanities version, using Early English Books Online Text Creation Partnership texts as the dataset. The Python lessons were ported first, and R will come next. The instructors were Iain (Python); Pip Willcox, from the Bodleian Libraries’ Centre for Digital Scholarship (Spreadsheets); and Lucia Michielin, from the University of Edinburgh (Open Refine and SQL).

According to the instructors, the dataset needs more cleaning (for example, multiple authors appear in the same column!). The lessons need further revision, but the team hopes to submit them to Data Carpentry for consideration by the end of the year.

Contributions are therefore welcome!
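
As an illustration of the kind of cleaning mentioned above, here is a minimal sketch of splitting a multi-author column into one row per author. The column names, the sample rows and the ";" separator are all assumptions made for the example; the actual dataset may be laid out differently.

```python
import csv
import io

# Hypothetical sample data: the column names, the placeholder authors and
# the ";" separator are assumptions, not taken from the workshop dataset.
RAW = """title,author
Book one,"Author A; Author B"
Book two,Author C
"""


def split_authors(rows, sep=";"):
    """Yield one (title, author) pair per author found in each row."""
    for row in rows:
        for author in row["author"].split(sep):
            author = author.strip()
            if author:  # skip empty fragments such as a trailing separator
                yield row["title"], author


pairs = list(split_authors(csv.DictReader(io.StringIO(RAW))))
for title, author in pairs:
    print(f"{title}: {author}")
```

Keeping one author per row makes it easier to count, filter or group by author later on, in Python, in a spreadsheet or in SQL.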

Dataset

Spreadsheets

Open Refine

EuroSciPy

By Nikoleta Glynatsi, Cardiff University

EuroSciPy is a cross-disciplinary gathering of sorcerers, knights and rangers united for the good of scientific research, focused on the use and development of the Python programming language. EuroSciPy 2017 was the 10th such conference and took place in the city of Erlangen, Germany, from the 28th of August to the 1st of September 2017. The conference aims for participants to show their latest work, learn from each other and collaborate on developing projects. All of this is achieved through multiple days of tutorials, talks and sprints.

Days one and two started off with several tutorials in two parallel streams (novice and advanced levels). I attended all the advanced workshops, where several tools were touched upon, such as pandas, scikit-learn and SymPy. The tutorials were the highlight of the conference, and I highly recommend the following:

  • Scikit-learn with Olivier Grisel and Tim Head. Material and video (…

Porting formulae

By Mike Jackson, Software Architect

As part of my open call consultancy for LUX-ZEPLIN (LZ), I looked at how LZ could migrate their data and computation from Excel to MongoDB and Python. There are many resources with valuable advice on cleaning data in Excel into a form suitable for analysis using Python, R or other data analysis packages. Unfortunately, how to handle formulae and cross-references is little discussed.

Based on my experiences, I have written a guide, “Tips for porting formulae from Excel into code”, which offers some (hopefully) helpful hints on how to identify and highlight formulae and cross-references, which can help when porting these to Python or R, and on how to restructure tables so that raw data is contiguous and therefore easy for data analysis packages to read, or to export into a database or files. Feedback, suggestions and additional advice are more than welcome.

Feel free to add these as comments!
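
To give a flavour of what identifying formulae and cross-references can look like in code, here is a minimal sketch. It is not taken from the guide: the `SHEET` dictionary, the `find_formulae` helper and the regular expression are all hypothetical, and the sketch assumes cell contents have already been exported as plain strings, with formulae starting with "=".

```python
import re

# Hypothetical worksheet: cell address -> raw cell content as a string.
# In this sketch, any string starting with "=" is treated as a formula.
SHEET = {
    "A1": "12.5",
    "A2": "7.5",
    "A3": "=A1+A2",
    "B1": "=SUM(A1:A3)*Rates!C2",
}

# Matches A1-style references, optionally prefixed by a sheet name
# (e.g. Rates!C2) and optionally extended to a range (e.g. A1:A3).
# Limitation: function names that end in digits (e.g. LOG10) would also match.
CELL_REF = re.compile(
    r"(?:[A-Za-z_]\w*!)?\$?[A-Z]{1,3}\$?\d+(?::\$?[A-Z]{1,3}\$?\d+)?"
)


def find_formulae(sheet):
    """Return {cell: [references]} for every cell that holds a formula."""
    found = {}
    for cell, content in sheet.items():
        if content.startswith("="):
            found[cell] = CELL_REF.findall(content)
    return found


formulae = find_formulae(SHEET)
for cell, refs in formulae.items():
    # References containing "!" point at another sheet.
    cross_sheet = [ref for ref in refs if "!" in ref]
    print(cell, "depends on", refs, "| other sheets:", cross_sheet)
```

A listing like this shows which cells hold raw data and which are derived, which is a useful starting point for deciding what becomes a Python or R function.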

Fellowship Programme 2018

By Raniere Silva, Software Sustainability Institute.

Applications for the Software Sustainability Institute Fellowship Programme 2018 close on Monday, 9th October 2017 at 23:59 BST.

Joining the Software Sustainability Institute Fellow community means much more than just receiving funding to attend or organise conferences, workshops and other events. Some of our Fellows have said that it is "a great way to collaborate across disciplines", "a really worthwhile experience that I cannot recommend more" and "a great opportunity to meet friendly new people", among other positive comments.