Research Software Engineer, Diversity and Informatics, Natural History Museum, London
- Open-source software to support museum activities and reproducible scientific research
- Open data
- Open science
- Python and R
I am a software engineer who retrained (rather late in life) in life sciences. I develop software that makes the digitisation of the Natural History Museum's collections possible. Such collections have immense cultural, scientific, historical and aesthetic value and constitute an enormous evidence base for scientific research on the natural world.
All natural history institutions face the same problems: how to digitise their vast and diverse collections and how to give their specimens a public digital presence. A digital presence might be basic textual metadata about a jar of fish in alcohol, a low-resolution JPEG image of a pinned butterfly or a detailed 3D scan of a complete dinosaur skeleton. The diversity and sheer size of collections present substantial challenges to digitisation.
One of my major outputs in support of this work is Inselect, an open-source, cross-platform desktop application that automates the cropping of individual specimen images from whole-drawer scans and similar images. It combines image processing, barcode reading, validation of user-defined metadata and batch processing. Inselect has been heavily used within the Natural History Museum and has been adopted by several other institutions.
I also develop software to support large scientific research projects such as PREDICTS (Projecting Responses of Ecological Diversity In Changing Terrestrial Systems), an investigation into how local biodiversity responds to human pressures such as land-use change, with the aim of improving forecasts of the possible future states of global biodiversity. A major output of the project is the first public release of its database of 3.2 million biodiversity records collated from around 450 published papers.
I love programming in Python and using other open-source languages, tools and libraries such as Django, PostgreSQL, R, Qt and git.