Open-source biodiversity data great for both research and training
Posted by g.law
on 26 November 2019 - 11:42am
By Software Sustainability Institute Fellow Gergana Daskalova, University of Edinburgh
We live in a time of change. Change resonates through our own lives, but also through the lives of all the different animals and plants around the planet - the world’s biodiversity. Biodiversity is never static and has always been changing, but more and more of those changes are now being prompted by human activities. Shifts in biodiversity are like a boomerang. Sometimes they start from things we humans do, like using land for agriculture and mining natural resources. The consequences of the biodiversity shifts then come right back at us, as biodiversity change can influence the functioning of ecosystems and the services they provide for humanity.
Understanding the patterns and drivers of biodiversity change is at the core of global change ecology, a research field that has gained momentum thanks to the rise of open-source biodiversity data and large databases bringing together studies from around the world. As more and more ecological data become available, the questions we can ask also grow - larger spatial scales, syntheses across different biomes and across more and more of the species found on Earth. These new opportunities come with new challenges. How can we pull apart the signal from the noise, and how do we meaningfully analyse and interpret findings from large-scale syntheses? And more practically, how do we manage large amounts of data in a way that doesn’t crash our computers and is easily reproducible by other researchers?
Over the last four years I have lived through the opportunities and challenges of large-scale open-source biodiversity data. All around me I have seen many other people similarly trying to advance their skills so that they can keep up with increasingly quantitative research fields. What I’ve realised along the way is that open-source biodiversity data are not just great for research; they are also ideal for training ecologists and equipping them with the skills to take on the next big questions about how the world around us is changing.
The aim of my Institute Fellowship is to advance skills in efficient data synthesis and visualisation, and to promote the use of open-source data in quantitative training. In the Coding Club initiative, which I have been leading along with a team of undergraduates, postgraduates and research staff, we create free online tutorials covering topics such as data synthesis, analysis and visualisation. We use open-source data, for example data from the Global Biodiversity Information Facility (GBIF). GBIF is the largest global database of species occurrence records. With the help of the SSI, I visited the GBIF headquarters in Copenhagen, Denmark, and presented the Coding Club model of teaching quantitative skills. We discussed further opportunities for collaboration and how to increase the visibility of open-source data in both research and training. We have already incorporated GBIF data in Coding Club tutorials (for example here) and are looking forward to creating more content using open-source data.
Training ecologists at all career stages in managing, analysing and visualising large ecological databases will allow us to answer important research questions about how the world’s biodiversity is changing in rigorous and reproducible ways.
Photo caption: The numbat, a small marsupial found in Australia, is one of the many species whose distribution range and population size have been impacted by land clearing and the introduction of non-native predators by humans. Despite its size, the numbat has a key role in maintaining the natural regeneration of woodlands in the drylands of Australia. The numbat spends the whole day digging hundreds of little holes in the tough soil of the Outback. These holes then become the ideal place for tree species to germinate and grow, eventually creating healthy woodland ecosystems.