CW16 Demo sessions

Demo sessions gave an in-depth look at a particular tool or approach and a chance to ask developers and experts how it might apply to attendees' areas of work.

Here is the list of demos that took place at CW16. Due to the demand for slots (and all the cool technology and advice on offer) we had two sessions, each 35 minutes long.

The demos in each session ran in parallel in different rooms. Each demo had 35 minutes for presentation and discussion. Attendees were encouraged to ask questions and enquire about how they could use the approaches and tools on show; these sessions were designed to be interactive.

Session 1

Tuesday 22 March 2016, 14:10-14:45

recipy: effortless provenance tracking in Python

Speaker: Robin Wilson

recipy is an open-source Python module that was first developed at the CW15 Hack Day. It provides an almost effortless way to track provenance in Python - all you have to do is add 'import recipy' at the top of your Python script, and all of your inputs, outputs and code will be logged in an easily-searchable database. You can then go back to the database and ask "How did I produce graph.png?" - and it will tell you exactly what code you ran, what versions of libraries you used, and what input files were used to produce what output files. A number of our users have found this to be a real life-saver!
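To make concrete what such a provenance record might hold, here is a minimal sketch using only the Python standard library. It is not recipy's implementation: the functions `log_run` and `how_was_it_made`, the table layout and the file names are all hypothetical, chosen purely for illustration.

```python
import hashlib
import sqlite3
import sys


def log_run(db, script_source, inputs, outputs):
    """Store one provenance record: code hash, Python version, files read and written."""
    db.execute(
        "CREATE TABLE IF NOT EXISTS runs "
        "(code_sha1 TEXT, python_version TEXT, inputs TEXT, outputs TEXT)"
    )
    db.execute(
        "INSERT INTO runs VALUES (?, ?, ?, ?)",
        (
            hashlib.sha1(script_source.encode("utf-8")).hexdigest(),
            sys.version.split()[0],
            ",".join(inputs),
            ",".join(outputs),
        ),
    )
    db.commit()


def how_was_it_made(db, output_name):
    """Return every logged run that wrote the given output file."""
    rows = db.execute(
        "SELECT code_sha1, python_version, inputs, outputs FROM runs"
    ).fetchall()
    return [row for row in rows if output_name in row[3].split(",")]


# Hypothetical example: a plotting script that read data.csv and wrote graph.png.
db = sqlite3.connect(":memory:")
log_run(db, "plot(load('data.csv'))", ["data.csv"], ["graph.png"])
matches = how_was_it_made(db, "graph.png")
print(matches[0][2])  # the inputs recorded for graph.png
```

With recipy itself, the single 'import recipy' line takes the place of the explicit `log_run` call, and the lookup is done from the command line with `recipy search graph.png`.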

This demo session will consist of:

  1. A brief presentation on recipy, including the motivation, how it works and how you can use it.
  2. A demonstration of recipy - including a 'first look' at some as-yet unreleased features.
  3. An interactive discussion about the way forward for recipy - including finding out what potential users want, prioritising new features, discussing the development process, and possibly coming up with projects for the CW16 Hack Day.

There is a survey to help with the next steps in the development of recipy. Please fill it in.

Slides and video.

System testing and validating Jupyter notebooks with nbval

Speaker: Oliver Laslett

Good documentation should be clear, comprehensive, and up-to-date to assist both developers and users. Code examples in documentation are a powerful format for explaining key features and user interfaces.

Examples in static documentation can quickly become outdated; this is particularly true for open-source projects in their infancy, in which features are regularly added or altered. The Jupyter Notebook provides a practical way of creating (executable) documentation.

In this session we will introduce (i) executable documentation in Jupyter, (ii) re-execution of the examples and automatic comparison with stored outputs using nbval, to ensure the documentation is up-to-date. As a side effect, this approach provides (iii) additional system tests.

By re-executing the notebook, we can detect deviations in the documentation examples due to changes in the source code. By treating each output cell in the notebook as a unit test -- which passes if the output is essentially unchanged -- we can integrate this testing into established continuous integration workflows.

In this demo, we will show nbval, a notebook validation tool built on the popular py.test framework. It executes notebook cells and checks the outputs for integrity and consistency. Outdated documentation is registered as a test failure.
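The idea of treating each stored output as a test can be sketched with the standard library alone. This is not how nbval is implemented: the `cells` list stands in for a notebook, and `run_cell` and `validate` are hypothetical helpers that compare printed output only.

```python
import contextlib
import io


def run_cell(source, namespace):
    """Execute one cell's source and return whatever it printed."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(source, namespace)
    return buffer.getvalue()


def validate(cells):
    """Re-run every cell in order; return indices whose output differs from the stored one."""
    namespace = {}  # shared between cells, like a notebook kernel
    failures = []
    for index, (source, stored_output) in enumerate(cells):
        if run_cell(source, namespace) != stored_output:
            failures.append(index)
    return failures


# A stand-in "notebook": (cell source, output stored when the docs were written).
cells = [
    ("x = 2 + 2\nprint(x)", "4\n"),
    ("print(x * 10)", "40\n"),
    ("print(sorted([3, 1, 2]))", "[1, 2, 3]\n"),
]
print(validate(cells))  # an empty list means the documentation is still up to date

# Simulate a code change that makes the second stored output stale.
stale = [cells[0], ("print(x * 100)", "40\n"), cells[2]]
print(validate(stale))
```

With nbval installed, the equivalent check on a real notebook is run as `py.test --nbval notebook.ipynb`, where the notebook name here is a placeholder.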

This workshop will cover the basics of py.test and getting started with nbval. Once set up, we will write and test our first notebook documentation, and discuss the advantages and pitfalls of executable documentation. The demo requires only basic Python knowledge and familiarity with the Jupyter Notebook, so come along and get hands-on with testable documentation.

Video. Slides unavailable.

Humanities Linked Data at the Oxford University e-Research Centre

Speaker: Terhi Nurmikko-Fuller

In this session, I will demonstrate two projects from the University of Oxford e-Research Centre, in which Linked Data has been applied to enrich and support scholarship in the Humanities: the first is a performance study capturing information about Richard Wagner's "Ring Cycle" as part of the Transforming Musicology project; the second is an investigation of early English texts from EEBO-TCP and the HathiTrust in the ElEPHãT project.

The session starts off with a quick introduction to the basics of Linked Data, and is followed by a description of the MuSAK annotation kit. Participants are invited to explore the RDF triples generated as part of the capture of the live performance of the “Ring Cycle” via the ‘Follow-Your-Nose’ approach using the Q&D Browser.
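For readers new to Linked Data, triples and the follow-your-nose idea can be sketched in a few lines of plain Python. The identifiers and relations below are invented placeholders, not data or vocabulary from the Transforming Musicology project, and `describe` and `follow` are hypothetical helpers; real RDF uses full URIs and would normally be handled with an RDF library.

```python
# Each triple is (subject, predicate, object). All identifiers are made up.
triples = [
    ("ex:performance1", "ex:performanceOf", "ex:opera1"),
    ("ex:performance1", "ex:hasAnnotation", "ex:annotation1"),
    ("ex:annotation1", "ex:refersTo", "ex:motif1"),
    ("ex:opera1", "ex:partOf", "ex:cycle1"),
]


def describe(resource):
    """Return every (predicate, object) pair stated about a resource."""
    return [(p, o) for s, p, o in triples if s == resource]


def follow(start, max_hops=3):
    """Follow-your-nose: from one resource, visit each linked resource in turn."""
    seen = []
    frontier = [start]
    for _ in range(max_hops + 1):  # the start resource counts as hop 0
        next_frontier = []
        for resource in frontier:
            if resource in seen:
                continue
            seen.append(resource)
            next_frontier.extend(o for _, o in describe(resource))
        frontier = next_frontier
    return seen


print(describe("ex:performance1"))
print(follow("ex:performance1"))
```

Each resource's description points to further resources, which is what lets a browser move from a performance to its annotations and on to whatever they refer to.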

The second part of the session will consist of a brief introduction to the ElEPHãT project. Participants can engage with the prototype through exercises that allow them to create worksets using a graphical user-interface, and to query some of the project data using predefined SPARQL queries. They will also be introduced to the Semantic Alignment and Linking Tool (SALT).

This session will introduce participants to some of the interdisciplinary research projects currently underway at the University of Oxford e-Research Centre. It will provide insights into the potential of Linked Data, provide examples of existing tools, and illustrate how these can be used to enable new types of research questions.

Video. Slides unavailable.

Write Software Management Plans with DMPonline

Speaker: Mike Jackson

When developing research software, we need to know what we are going to write, who it is for (even if this is just us), how we will get it to them, how it will help them, and how we will assess whether it has helped them or not. A Software Management Plan can help us think about these and decide upon the processes and infrastructure we will use when developing our software.

The Software Sustainability Institute has drawn together advice and guidance to help researchers write Software Management Plans. We have developed a Checklist for a Software Management Plan. Our checklist is complemented by extensions to the Digital Curation Centre's service for Data Management Plans, DMPonline, which now allows researchers to create personalised Software Management Plans.

In this demonstration, I'll give an introduction to writing Software Management Plans using DMPonline.

For more information about Software Management Plans and links to our checklist and information on using DMPonline, please visit

Video. Slides unavailable.

Session 2

Tuesday 22 March 2016, 14:45-15:20

Public domain licensing and liability

Speaker: qLegal

Slides and video.

ChemBio Hub – a free, web-based research data management tool

Speaker: Karen Porter

Research data management - why should you care?
The problems of research data management (RDM) are well rehearsed and the consequences of poor processes are legion – exemplified by much recent negative publicity around lack of reproducibility in published data.  Funders require a clear RDM policy statement which demonstrates how to avoid these problems, but often researchers, despite best intentions and a willingness to comply with their commitments, do not have the relevant skills and tools to adopt best practice.

The ChemBio Hub software has been developed to support best practice and to ensure that the barrier to adoption is as low as possible.

So, you’re not interested in Chemical Biology?
Don’t let the name put you off. It’s true that our remit in developing the software was to manage data related to chemical biology, but we have built the software to be discipline-agnostic. This means that, as well as data related to chemical compounds, it can be used to manage data in diverse fields of research including biology, physical chemistry, zoology and materials science – anywhere that tabular data are generated, stored and analysed.

What will you see?
The ChemBio Hub demo will focus on the specific example of inventory management – an important requirement in any lab and one which is often addressed with an out-of-date spreadsheet in an inaccessible location. We have proven repeatedly that the ChemBio Hub software is very useful in addressing this need. The demo will include a walkthrough of the process of loading and validating data, as well as showing how easily relevant data can be retrieved.
With simple interfaces to upload data, a robust security model to manage access to projects, and a very powerful search utility to retrieve relevant information stored, ChemBio Hub has provided a useful platform with widespread applications.  Come and learn what it can do for you and your researcher colleagues to help manage data simply, securely and efficiently.

How can you get involved?
By joining forces we can enlarge the community, improve the tools for our users and contribute to a real and necessary improvement in the efficiency of how research data is managed. Why not start by attending this demonstration? All our tools are free and accessible online via all major web browsers.
The ChemBio Hub is a University of Oxford project funded by the Wellcome Trust, HEFCE, the Nuffield Department of Medicine, and the John Fell Fund.

Video. Slides unavailable.

Property based testing in Python

Speaker: Vince Knight

This demo will explain the role property based testing has to play in research.

Property based testing can be used to create a wide range of test cases for software. This is relevant to research software, where testing exact behaviour is not always of value (for example, because of stochastic processes).

This demo will show the use of the Python Hypothesis library to implement property based testing in a game theoretic library. I will also tell the story of how this test suite found a bug that would not otherwise have been identified.
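The idea can be sketched without any third-party library. The example below is a hand-rolled stand-in, not the Hypothesis API and not the test suite from the talk: `play_match` is a hypothetical simulation of a repeated Prisoner's Dilemma between two random players, using the conventional payoffs 3, 1, 5 and 0. Individual scores change with every seed, so instead of checking exact values the test asserts a property that must hold for all of them.

```python
import random

# Conventional Prisoner's Dilemma payoffs: (first player's score, second player's score).
PAYOFFS = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}


def play_match(seed, turns):
    """Two players choose C or D at random each turn; return their total scores."""
    rng = random.Random(seed)
    score_a = score_b = 0
    for _ in range(turns):
        a, b = PAYOFFS[(rng.choice("CD"), rng.choice("CD"))]
        score_a += a
        score_b += b
    return score_a, score_b


def check_joint_score_bounds(cases=200):
    """Property: each turn's joint payoff is 2, 5 or 6, so totals lie in [2n, 6n]."""
    generator = random.Random(0)  # fixed seed so the generated cases are repeatable
    for _ in range(cases):
        seed = generator.randrange(10**6)
        turns = generator.randrange(0, 50)
        score_a, score_b = play_match(seed, turns)
        assert 2 * turns <= score_a + score_b <= 6 * turns, (seed, turns)
    return cases


print(check_joint_score_bounds())  # number of generated cases that satisfied the property
```

Hypothesis automates the input generation done here by the loop, and additionally shrinks any failing case to a minimal example.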

Slides and video.

One-Man Crowdsourcing

Speaker: M. H. Beals

This demonstration will explain how to re-purpose a combination of existing plug-ins and free on-line services to quickly (and painlessly) create a crowd-sourcing platform. Anyone whose research would be aided by crowd (human) identification of data, but who lacks the time or funding to develop a purpose-built crowd-sourcing platform, would benefit.

Video. Slides unavailable.