Mentorship programme: Developing sustainable software for the Swarm mission

Posted by j.laird on 15 August 2022 - 5:52pm

Magnetic waves across Earth’s outer core.
Image by Université Grenoble Alpes.

By Ashley Smith.

This blog post reflects on our Learning to Code mentorship programme as part of a Research Software Camp.

As the SSI has identified before, there is a gap in training of intermediate software engineering skills across academia. This gap has become more prominent and problematic as the need for research software engineers (RSEs) has grown in recent years with the increasing reliance of research on software.

I have experienced this training gap directly while moving through my PhD and into my current position as a post-doctoral researcher. I am now quite aware of what experience I need to acquire, but I find few workshops or similar events that address those needs well without devoting very large amounts of time to full training courses. Particularly as an RSE, there are many small bits of knowledge that one can only get through long experience or tailored guidance from someone who already has that experience. So when I found out about the SSI’s Learning to Code mentorship programme, it seemed like a perfect match for my situation.

Swarm spacecraft mission

I am a member of the European Space Agency Swarm Data, Innovation, and Science Cluster (ESA Swarm DISC), tasked with building innovation around data usage and processing of the Swarm spacecraft mission. Briefly, Swarm is a multi-spacecraft constellation in Low Earth Orbit which monitors and maps the magnetic field and plasma environment around Earth, contributing to research and services all the way from studying flows of liquid iron in Earth's core to modelling space weather effects in the ionosphere driven by events occurring on the surface of the Sun.

We already have a solid foundation of tools in place: the VirES platform for access and visualisation of Swarm data and models, and a connected JupyterHub system where researchers can bring their own Python code to perform arbitrary analysis on the data. These are made accessible through a Python package (viresclient) and a guide with a rich set of examples (Swarm Notebooks). We need to expand these with both community-contributed recipes and specific analysis tools that can be brought to the data in modular, reliable ways. As part of this effort, I am coordinating the development of a new package (https://github.com/Swarm-DISC/SwarmX) to provide the "analysis & visualisation" layer of capabilities on top of data retrieved through viresclient (and eventually other data sources). This is a difficult endeavour because we need to consider many things:

designing a sensible architecture for the package.
organising and helping contributors (scientist-developers with funding through Swarm DISC) to write code within this framework, and gathering feedback from them for requirements of the system.
engineering a sensible approach to handle dependencies (including trickier community libraries from within specific research domains).
setting up adequate continuous integration to test the system and build documentation.
being mindful of many other community packages and approaches (e.g. https://heliopython.org/projects/).
building all this in a sustainable way, deeply inspired by The Turing Way.
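
To make the layering idea concrete, here is a minimal sketch of an analysis layer that sits on top of interchangeable data sources. It is only an illustration under stated assumptions: the names Sample, DataSource, InMemorySource and field_intensity are hypothetical and are not taken from SwarmX or viresclient, and the in-memory source merely stands in for a real retrieval backend.

```python
from dataclasses import dataclass
from math import sqrt
from typing import Protocol, Sequence


@dataclass(frozen=True)
class Sample:
    """One magnetic field measurement (hypothetical structure)."""
    time: str  # ISO 8601 timestamp, e.g. "2022-01-01T00:00:00"
    b_nec: tuple[float, float, float]  # North, East, Centre components in nT


class DataSource(Protocol):
    """Anything that can return samples for a time interval."""

    def fetch(self, start: str, end: str) -> Sequence[Sample]: ...


class InMemorySource:
    """Stand-in data source; a real one would wrap a retrieval client."""

    def __init__(self, samples: Sequence[Sample]) -> None:
        self._samples = list(samples)

    def fetch(self, start: str, end: str) -> Sequence[Sample]:
        # ISO 8601 strings in the same format compare chronologically.
        return [s for s in self._samples if start <= s.time < end]


def field_intensity(source: DataSource, start: str, end: str) -> list[float]:
    """Analysis layer: field magnitude, independent of where data came from."""
    return [sqrt(sum(c * c for c in s.b_nec)) for s in source.fetch(start, end)]


if __name__ == "__main__":
    source = InMemorySource([
        Sample("2022-01-01T00:00:00", (3.0, 4.0, 0.0)),
        Sample("2022-01-02T00:00:00", (0.0, 0.0, 2.0)),
    ])
    print(field_intensity(source, "2022-01-01T00:00:00", "2022-01-03T00:00:00"))
```

Because the analysis function depends only on the small DataSource interface, another backend could be swapped in later without touching the analysis code, and the function can be tested in continuous integration without network access.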

How a mentor helped

As you can see, there are a lot of things I am concerned about in building this system, and each of them can be solved in many different ways. I can get a long way toward good solutions by reading about existing projects and guides, but this journey can be bewildering given the number and variety of tools and concepts. Regular conversations with a mentor, in my case Anastasis Georgoulas, who already has extensive experience of these issues in a broad context not tied to any one research domain, make this journey more manageable.

We met once a week (mostly!) and had informal discussions, which we tracked through notes on HackMD. I began by giving an orientation on what I am trying to achieve and what we already have in place. At first there was only a loose goal for the mentorship period, which became more clearly defined later on: to arrive at an early "v0.1" release of our Python package as a foundation for future work.

Through our discussions, we worked through the issues listed above and moved toward some practical solutions for each case. Along the way there were some detours into the details of specific problems (e.g. configuring Fortran compilers on GitHub Actions across different platforms) and into other related projects I am working on.

All in all it was a good experience. I feel that the most valuable thing was, instead of working alone as usual, having time with someone who empathises with the technical challenges I am facing. This made me more positive that we will be able to publish a tool that our research domain can use.