Jupyter Notebook workshop

Posted by a.pawlik on 5 October 2015 - 9:00am

By Aleksandra Pawlik, Training Lead.

Jupyter is an excellent solution for tackling problems with reproducible research. On 28 September, in collaboration with ELIXIR UK, the Institute ran a workshop to introduce researchers to programmatic data manipulation using Jupyter.

The training was fully hands-on, allowing the participants to try out different Jupyter features for themselves. In short, Jupyter is an elaborate electronic lab book. It allows the user not only to take notes, embed images and link to resources but, most importantly, to write interactive source code and to capture and display the outputs within the notebook itself. Jupyter originated from the very popular IPython Notebook. Currently the two open source projects are developed separately, although there is still a strong connection between them in terms of technology and communities.

The workshop started with an overview of the Jupyter interface, with a focus on working with code in cells and on the different underlying kernels, which allow users to write code in a variety of languages. We primarily used Python during the workshop, and the Software Carpentry materials came in very handy. After working with some sample datasets, manipulating data with the NumPy library and then creating pretty plots using matplotlib, we moved on to some Jupyter-specific features. We covered widgets as well as line and cell magics (including some confusing behavior of the %history magic). We also looked at the Rich Display system that the Notebook comes with.
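As a rough illustration of the Rich Display system mentioned above, the sketch below defines a small class with a `_repr_html_` method, which is the hook the Notebook looks for when deciding how to render an object. The `Measurements` class, its column names and its values are all made up for this example and are not taken from the workshop materials.

```python
from html import escape


class Measurements:
    """Hypothetical container for a small table of readings (illustrative only)."""

    def __init__(self, columns, rows):
        self.columns = list(columns)
        self.rows = [list(row) for row in rows]

    def _repr_html_(self):
        # The Notebook checks for this method and, if it exists, renders the
        # returned HTML instead of the plain-text repr of the object.
        header = "".join(f"<th>{escape(str(c))}</th>" for c in self.columns)
        body = "".join(
            "<tr>" + "".join(f"<td>{escape(str(v))}</td>" for v in row) + "</tr>"
            for row in self.rows
        )
        return f"<table><tr>{header}</tr>{body}</table>"


# Made-up sample values, not data from the workshop.
table = Measurements(["sample", "value"], [["a", 1.5], ["b", 2.0]])

# Outside a notebook the method can still be called directly to inspect the markup.
html = table._repr_html_()
```

In a notebook, leaving `table` as the last expression in a cell should show the rendered table rather than the default object representation.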

For the workshop, all participants were asked to bring their own laptops with Jupyter installed (using the Anaconda package). The main advantage of this approach was that the attendees could then work independently and actually use Jupyter for their research. We also demonstrated two ways to use Jupyter remotely without having to install it: the first was SageMath and the second TryJupyter. Whilst both only require an internet connection and an up-to-date browser, researchers may be unable, or not allowed, to upload their data to these remote services in order to run the analysis. Nevertheless, SageMath and TryJupyter can be useful as a quick and easy demo or teaching tool.

The Institute helped to run a similar workshop in May 2015. It was our first workshop on Jupyter and allowed us to pilot the teaching approach, curriculum and choice of materials. The May workshop was not targeted at any particular audience, so the attendees included both researchers and research software engineers. The former were more interested in hands-on learning of how to use Jupyter for programmatic data analysis and visualisation. The latter wanted to dive into the details of Jupyter's architecture, features and functionality. Based on these outcomes we decided to split the Jupyter training, since the first group clearly did not find it that useful to investigate the intricacies of the Notebook.

What was extremely helpful in planning and running the workshop was the plethora of well-developed training materials. We used the materials provided by Software Carpentry and by the Jupyter/IPython Notebook core development team (IPython Notebook in depth). All these training materials are under permissive licenses, which helps in teaching best practices in research software and encourages the development of open science.