By David De Roure, Professor of e-Research, University of Oxford and Turing Fellow at The Alan Turing Institute.
This post is part of our 10 year anniversary series.
It feels as though a lot has changed in research software since the SSI was founded in 2010. But perhaps it's more interesting to look at what hasn't changed - to ask what has sustained. So my story starts earlier, dare I say "before the Web..." - I know that sounds like pre-history! Some of the software I used in much earlier days of computing is still going strong, and often it has underpinned later developments. Hence this blog post is an account of my 10 favourite pieces of software, encountered in a four-decade journey through software in academia, and a reflection on what it means to sustain.
1. Unix

It's 1982 and I'm a Maths with Physics undergrad at Southampton. To do my coursework I have to get to grips with Unix V6 on a PDP-11. Little did I know I was to become Southampton's first Unix Systems Programmer, soon to be loading Unix V7M mag tapes. Fast forward nearly four decades and here I am with macOS on my laptop, a Raspberry Pi on my desk, and some machine learning somewhere in the cloud. And it's all Unix - my ’80s sysprog skills still work today. They must have got something right back then. (Oh, and Les Carr was the second sysprog, and today we're both SSI Co-Investigators - little did we know our destiny!)
2. TeX

I recall my friend and colleague Sebastian Rahtz, lecturer in Humanities Computing, sitting of an evening with one of those new IBM PCs and installing MicroTeX. He went on to spend a good many years working with TeX and LaTeX, founding TeX Live, which is now the default TeX in major Linux distributions, and later running Oxford's Open Source Advisory Service. LaTeX is going strong today, as all of us doing our collaborative authoring in Overleaf can attest. So Donald Knuth definitely got it right when he released TeX in 1978. And we're only up to version 3.141592653.
Ok, so this hasn't sustained per se, but it did enable a bunch of people to time-travel significantly into the future. Before the Web we had interactive hypermedia, thanks to the Microcosm system pioneered by Wendy Hall and her team, building on the work of Ted Nelson and Doug Engelbart. Microcosm let you navigate (and author) links between multimedia documents; it had search and even the ability to link things up automatically; it was used successfully in teaching and training, and in my case in early music information retrieval research. Why didn't it sustain per se? See number five. But it created the insights that helped shape the alternative future we enjoy today.
5. CERN httpd
Well, I could just say "The Web" but I needed to choose a piece of software. This was the code by Tim Berners-Lee and colleagues that we installed in the early ’90s so that we could "put up a web site" - i.e. handcraft some HTML in a text editor to make content available to people who were downloading the Mosaic browser or working in the text-mode Lynx. Of course, finding sites wasn't so easy, but at one point I had a list of seven in my logbook. The significance of CERN httpd is that it enabled people to make content available, and certainly the Web wouldn't have taken off without that! This is how the project looked back in those days.
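In that same spirit, "putting up a web site" still amounts to handcrafting some HTML and pointing a server at it. Here is a minimal sketch using Python's standard library - a modern stand-in, not CERN httpd itself, and the filename and page text are purely illustrative:

```python
import http.server
import socketserver
import threading
import urllib.request
from pathlib import Path

# Handcraft some HTML, just as one did in a text editor in the early '90s
Path("index.html").write_text(
    "<html><body><h1>Hello, early Web!</h1></body></html>"
)

# Serve the current directory, much as an httpd serves a document root
handler = http.server.SimpleHTTPRequestHandler
with socketserver.TCPServer(("127.0.0.1", 0), handler) as httpd:
    port = httpd.server_address[1]
    threading.Thread(target=httpd.serve_forever, daemon=True).start()

    # A browser (or line-mode client) is just an HTTP GET away
    with urllib.request.urlopen(f"http://127.0.0.1:{port}/index.html") as resp:
        page = resp.read().decode()

    httpd.shutdown()

print("Hello, early Web!" in page)  # True: the handcrafted page was served
```

Everything here ships with Python; the barrier to making content available is now even lower than it was in the early ’90s.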
6. MUD

Welcome! By what name shall I call you? The first MUD (multi-user dungeon/dimension/domain) appeared in the late ’70s, and in the ’80s I met one of the inventors doing his PhD in AI (cool!). Then in the ’90s I used a MUD as a teaching vehicle - object-oriented programming is always better with objects. And here I am today writing papers about Pokémon Go! MUDs had it all, but in line mode - online chat, role-playing, interactive fiction. Yes, that does sound pretty much like social media today. In the trade we call these "Social Machines". And you can still play the original MUD1 today (N.B. needs telnet!).
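The teaching point about objects can be sketched in a few lines: rooms and players as objects, navigation as method calls. A hypothetical miniature - the room names and descriptions are invented here, and real MUDs did a great deal more:

```python
class Room:
    """A MUD location: a description plus exits to other rooms."""
    def __init__(self, name, description):
        self.name = name
        self.description = description
        self.exits = {}  # direction -> Room

    def connect(self, direction, other):
        self.exits[direction] = other

class Player:
    """A player moving through the world in line mode."""
    def __init__(self, location):
        self.location = location

    def go(self, direction):
        if direction in self.location.exits:
            self.location = self.location.exits[direction]
        return self.location.description

# A two-room world, in the spirit of a MUD's text descriptions
hall = Room("hall", "You are in a dusty hall. Exits: north.")
cave = Room("cave", "You are in a dark cave. Exits: south.")
hall.connect("north", cave)
cave.connect("south", hall)

player = Player(hall)
print(player.go("north"))  # You are in a dark cave. Exits: south.
```

The world model maps so directly onto classes and instances that students meet objects as things they can walk around in, not as an abstraction.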
7. NetLogo

It's 1997, and I'm back in the land of Lisp, at MIT with Hal Abelson. But we are designing something new - figuring out how to program a hypothetical architecture consisting of millions of devices. This "Amorphous Computing" was about engineering with emergent behaviours, and it anticipated today's cyber-physical systems spectacularly. I went on to teach this way of thinking at MIT, and then in Southampton for many years, where I used NetLogo, an excellent open source interactive modelling environment which I use to this day to get my head around emergent behaviours in (socio-technical) systems. This software enabled me to train people for our IoT-based, cyber-physical future before it arrived, and you can use it today to prepare for tomorrow (N.B. includes epidemic model).
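NetLogo has its own modelling language, but the agent-based idea it teaches can be sketched in Python: give agents simple local rules - random encounters, infection on contact, recovery with time - and watch an epidemic curve emerge that no single rule describes. All the parameters below are illustrative, not taken from any particular NetLogo model:

```python
import random

random.seed(42)

# Minimal agent-based epidemic sketch: local rules, emergent global behaviour
N, STEPS, P_INFECT, RECOVER_AFTER = 200, 60, 0.3, 10

state = ["S"] * N        # S(usceptible), I(nfected), R(ecovered)
sick_for = [0] * N
state[0] = "I"           # one initial case

for _ in range(STEPS):
    for i in range(N):
        if state[i] == "I":
            # each infected agent meets one random other agent
            j = random.randrange(N)
            if state[j] == "S" and random.random() < P_INFECT:
                state[j] = "I"
            # recovery is just time passing
            sick_for[i] += 1
            if sick_for[i] >= RECOVER_AFTER:
                state[i] = "R"

print({s: state.count(s) for s in "SIR"})
```

No line of this code mentions an "epidemic curve", yet plotting the infected count over time produces one - which is exactly the lesson about emergence.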
8. Taverna

The new millennium brought the UK e-Science programme - multiple funders investing in developing research software, and an idea for an institute to help sustain it! To choose one favourite from this era I have to go for Taverna, an environment for designing and executing computational workflows. Taverna led the way by making it possible to build workflows for a growing world of distributed Web services, facilitating automation so we could work at scale, and, most importantly, it made those workflows "first-class citizens" that could be shared and cited. Taverna is now Apache Taverna, and the myExperiment workflow "commons" grew to be a resource for sharing and for study. And this all came about thanks to the myGrid e-Science project, led by SSI Co-Investigator Carole Goble.
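The idea Taverna made first-class - the workflow as an explicit, shareable object rather than a script hidden on someone's machine - can be sketched like this. A toy in Python, emphatically not Taverna's API; the step names and stand-in "services" are invented:

```python
# Stand-ins for distributed Web services a workflow might call
def fetch(x):
    return list(range(x))

def double(xs):
    return [v * 2 for v in xs]

def total(xs):
    return sum(xs)

# The workflow itself is data: an explicit, named sequence of steps
# that can be inspected, shared and cited independently of any one run.
workflow = [("fetch", fetch), ("double", double), ("total", total)]

def run(workflow, value):
    """Execute each step, threading the output of one into the next."""
    for name, step in workflow:
        value = step(value)
    return value

print(run(workflow, 5))  # fetch -> [0..4], double -> [0,2,4,6,8], total -> 20
```

Because `workflow` is a value rather than buried control flow, it can be serialised, published and re-run - the property that made sharing on myExperiment possible.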
9. Numbers Into Notes
10. SampleRNN
Right up to date! The Centre for Practice & Research in Science & Music (PRiSM) at the Royal Northern College of Music (RNCM) launched its lab in 2020 with a team including the first PRiSM Research Software Engineer, Chris Melen. And the first software crafted by Chris is PRiSM SampleRNN, an implementation of the SampleRNN recurrent neural network algorithm for TensorFlow 2. Chris worked with composer Sam Salem, and the software has enjoyed immediate uptake as composers and performers increasingly seek to work with AI. For me, PRiSM SampleRNN is a great example of software being adopted by a new community, in this case very creatively. And it shows why we need more RSEs in arts and humanities.
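At the heart of SampleRNN is autoregressive generation: predict the next audio sample from those before it, then feed the prediction back in. A toy sketch of that loop follows, with an invented stand-in formula in place of the real recurrent network - this is not PRiSM SampleRNN or TensorFlow code, just the shape of the idea:

```python
def toy_model(history):
    """Stand-in for the trained network: here, a damped echo of
    the last two samples (an invented formula for illustration)."""
    return 0.9 * history[-1] - 0.5 * history[-2]

samples = [0.0, 1.0]      # seed samples to start the recurrence
for _ in range(100):      # generate one audio sample at a time
    samples.append(toy_model(samples))

print(len(samples), round(samples[2], 2))  # 102 0.9
```

The real system replaces `toy_model` with a trained recurrent network and runs this loop tens of thousands of times per second of audio, which is what lets composers coax new material out of a corpus of recordings.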
I'd better stop at 10. In case you were wondering, I would have also added the BUGS software in '89, Stat-JR in 2012, oh, and make, which was in my previous blog post.
All of the software on this list was designed for sharing, and much of it has sustained impressively. But in every case, developing or experimenting with the software offered a glimpse of the future and a chance to shape it - it's software for time travel, exploring alternative futures. It has created the experiences, insights and practices that helped us understand what was to come and, importantly, it has often created the people and the teams too. And that is my reflection: software is more than code - it's a vehicle for time travel, insight, knowledge... and mostly it's about people. My thanks to all the amazing people I've had the privilege to work with on this journey.