Seeking input: Challenges to sustainability of open source research data tools



Danielle Robinson

Raniere Silva

Posted on 12 February 2019


Posted by r.silva on 12 February 2019 - 9:54am

Open Data sticker. Image by Jonathan Gray. Available at Flickr.

By Danielle Robinson, Code for Science & Society, with an introduction by Raniere Silva, Software Sustainability Institute.

As we reported in 2014, 7 out of 10 UK researchers said that it's impossible to conduct research without software. Some of that software can be categorised as open source data tools, defined as open source projects that facilitate any part of a researcher's data workflow, including data collection, analysis, visualisation, sharing, reuse, publication, and/or collaborative data projects. Code for Science & Society (CS&S) are working on a research project with the goal of identifying systemic challenges to the sustainability of data-driven tooling in science and scholarship, and they want your input.

The blog post below was first published at Code for Science & Society blog and is reproduced here with their permission.

Data-intensive research increasingly depends on open source (OS) software and data tools. These tools meet the needs of data-driven researchers across fields better than commercial offerings and are often led by researchers with a deep understanding of scientific domains. Open communities build and maintain these tools, and this work is often funded by grants and donations (Mozilla 2018; Eghbal 2016). While the scientific community’s usage of and participation in OS expands, the broader open source software community is experiencing a sustainability crisis (Eghbal 2016, see also the recent GitHub survey).

We are specifically interested in hearing from folks who work with and/or contribute to OS data tools for research and data science. For this purpose, OS data tools are defined as open source projects that facilitate any part of your data workflow, including data collection, analysis, visualisation, sharing, reuse, publication, and/or collaborative data projects. Sustainability is defined as a project's long-term capacity to operate stably.

Looking ahead, maintaining an innovative, independent data and research tools ecosystem is key to scientific advancement. Without coordinated effort, the open research tooling community will miss an opportunity to grow and become self-sustaining. Revamping models for funding and sustaining open source projects that serve scientific and data-driven research communities is timely, given the broader conversations on open source sustainability and the cooperative movement within the open research space itself. The Open Source Alliance for Open Scholarship, the US Software Sustainability Institute conceptualisation project, the Joint Roadmap for Open Science Tools, and NumFOCUS' recent summit and sustainability workshop highlight the conversations on sustainability happening in science, scholarship, and data.

The research community is unique in many ways (structure, economies, participants), and it faces unique challenges (and opportunities) around the sustainability of software tools. Open source projects rooted in the research community (Jupyter, Dat, RStudio) have grown into widely used tools across industries. Balancing the needs of a project’s founding community with the demands of growth is a challenge for any open software project. Doing this on a limited budget, while attending to research priorities, and without fundraising, business development, or other core operational expertise on staff adds to the challenge.

Through work with our sponsored projects, CS&S is developing operational and management capacity in our projects and project staff to support them as they grow into sustainable entities. We do this by identifying common needs (i.e., problems faced by multiple projects), filling gaps and upskilling our community in management and operations, building shared solutions to systemic problems, and collaborating with organisations like NumFOCUS. As we continue to work with sponsored projects, we are looking to dig deeper into the sustainability challenges facing open research tools, so that we can help meet these challenges in the broader community.

What's Your Perspective?

Through interviews and research, we aim to develop a deeper understanding of the sector’s strengths and identify areas of need.

Do you build, maintain, and/or use open source data tools? You might work as a contractor, at an academic institution, or at a big corporation. We want to hear from you. If you have five minutes, please take this survey. If you have 15 minutes, we'd love to talk about your experiences in open source research and data-centric projects. Reach out @codeforsociety or email us at hi@codeforscience.org.

Link to Survey

Please share this link with colleagues who may be interested: https://goo.gl/forms/Qk7TJslrMHP388tq1


This work is funded by a Gordon and Betty Moore Foundation grant to Code for Science & Society.

The Fine Print

The data generated by this survey will be used by Code for Science & Society and NumFOCUS. You may take this survey anonymously and no questions are required. This survey will take about 5 minutes. Your responses will help Code for Science & Society and NumFOCUS to develop programs and resources to better support OS data tool sustainability. Summary data (from multiple choice and check box questions) will be shared openly. Anonymous quotes (no names or project affiliations) may be included in a report, which will be shared openly. If you have additional questions about the survey, email hi@codeforscience.org.


Eghbal, Nadia. 2016. “Roads and Bridges: The Unseen Labor Behind Our Digital Infrastructure.” https://www.fordfoundation.org/media/2976/roads-and-bridges-the-unseen-labor-behind-our-digital-infrastructure.pdf.

Mozilla. 2018. “Open Source Archetypes: A Framework for Purposeful Open Source.” https://blog.mozilla.org/wp-content/uploads/2018/05/MZOTS_OS_Archetypes_report_ext_scr.pdf.
