The SciCodes Consortium: coordinating research software registries and repositories
Posted by j.laird
on 21 April 2022 - 10:00am
By Hervé Ménager, Tom Morrell and Alice Allen.
Scientific disciplines that rely on computational methods often have a resource, a code registry or repository, that serves as a library for the discipline and collects the software itself and/or metadata about the software. SciCodes, formed in 2021, is a consortium of academic discipline and institutional software registries and repositories. Among its goals are sharing work methods and creating a virtual registry standard to enable searching across multiple software registries.
Software in science
If you work in geodynamics, astronomy, or biostatistics, or with scientific research software in any field, you are likely familiar with at least one of the Astrophysics Source Code Library (ASCL), Computational Infrastructure for Geodynamics (CIG), bio.tools, or Zenodo. Software registries and repositories such as these, along with CaltechDATA, CoMSES, DOECODE, and others, improve research by making codes discoverable, which supports transparency and reproducibility, and by promoting reuse of software, which can make research more efficient. These services also actively promote formal software citation in research articles.
Several years ago, managers and editors of these and other similar resources got together to share and discuss their practices, and to develop a list of best practices for software registries and repositories. We met virtually for about a year, and then held a workshop to refine our ideas. At the conclusion of that project, the group decided to continue to meet and formed the SciCodes consortium.
One of the goals of the consortium is to make it possible to search for code across multiple software registries. Software developed for one discipline may also be useful in another. For example, WND-CHARM (Shamir, 2013), written originally for use in biological imaging, has also proven useful in galaxy morphology research. Wouldn’t it be great if there were a way for you to query multiple research software resources to find code that solves a computational problem you have? We think so! And we are working toward a way to do this!
One of the first efforts of the group is to render our own holdings – the metadata in each software library – in the CodeMeta format (codemeta.json). This translates (or “crosswalks”) the information in each of our resources, which use different schemas, to one standard schema. Having the metadata from these various resources in one standard schema will allow us to build a search tool that can then search all of this metadata, enabling you to find the code that you need, regardless of which discipline it originated in.
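As a rough illustration of the crosswalk idea, here is a minimal Python sketch. The registry names, field names, and sample records are all hypothetical and invented for this example; they are not the actual schemas of any SciCodes member. The target property names (name, description, codeRepository, url, keywords) are CodeMeta terms, and the context URL assumes CodeMeta 2.0.

```python
# Assumption: CodeMeta 2.0 context; newer CodeMeta versions use a different URL.
CODEMETA_CONTEXT = "https://doi.org/10.5063/schema/codemeta-2.0"

# Hypothetical crosswalk tables: source field name -> CodeMeta property.
CROSSWALKS = {
    "registry_a": {
        "title": "name",
        "abstract": "description",
        "site": "codeRepository",
        "tags": "keywords",
    },
    "registry_b": {
        "tool_name": "name",
        "summary": "description",
        "homepage": "url",
        "topics": "keywords",
    },
}


def to_codemeta(record, registry):
    """Translate one registry record into a CodeMeta-style dictionary."""
    mapping = CROSSWALKS[registry]
    doc = {"@context": CODEMETA_CONTEXT, "@type": "SoftwareSourceCode"}
    for source_field, codemeta_property in mapping.items():
        # Fields missing from the source record are simply left out.
        if source_field in record:
            doc[codemeta_property] = record[source_field]
    return doc


def search(docs, term):
    """Return the names of entries whose name, description, or keywords contain term."""
    term = term.lower()
    hits = []
    for doc in docs:
        keywords = doc.get("keywords", [])
        if isinstance(keywords, str):
            keywords = [keywords]
        haystack = " ".join(
            [doc.get("name", ""), doc.get("description", ""), *keywords]
        ).lower()
        if term in haystack:
            hits.append(doc.get("name", ""))
    return hits


# Invented sample records from two registries with different schemas.
record_a = {
    "title": "example-image-classifier",
    "abstract": "Classifies images using generic numerical features.",
    "tags": ["imaging", "classification"],
}
record_b = {
    "tool_name": "example-galaxy-sorter",
    "summary": "Sorts galaxy images by morphology.",
    "topics": ["astronomy", "imaging"],
}

catalog = [
    to_codemeta(record_a, "registry_a"),
    to_codemeta(record_b, "registry_b"),
]
matches = search(catalog, "imaging")
```

Once both records use the same property names, a single search function can cover them without knowing which registry each came from. A real crosswalk would also need to handle nested fields, people, identifiers, and licenses, which this sketch leaves out.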
The SciCodes consortium also works to improve software citation and findability, strengthen our individual resources by adopting and adapting the best practices we identified, and share advances and information through presentations at our monthly meetings. Because the consortium’s members are spread out over many time zones, the group holds two meetings, seven hours apart, on the same day each month. Meetings include discussions on best practices and presentations from group members. The consortium is currently led by Hervé Ménager and Tom Morrell, who were elected in late 2021 to lead the group for overlapping terms.
Are you writing software for research? If so, please consider submitting it for inclusion in a suitable registry/repository. And make your own software easier to cite by listing your preferred citation on your code’s download site, preferably in a standard format such as codemeta.json or CITATION.cff.
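If you have not written a codemeta.json file before, the following Python sketch shows one way to generate a minimal one. The software name, author, and URLs are placeholders invented for this example, and the context URL assumes CodeMeta 2.0; check the CodeMeta documentation for the full list of properties before relying on it.

```python
import json


def build_codemeta(name, description, repository_url, license_url, authors):
    """Build a minimal CodeMeta-style record; authors is a list of (given, family) pairs."""
    return {
        # Assumption: CodeMeta 2.0 context URL.
        "@context": "https://doi.org/10.5063/schema/codemeta-2.0",
        "@type": "SoftwareSourceCode",
        "name": name,
        "description": description,
        "codeRepository": repository_url,
        "license": license_url,
        "author": [
            {"@type": "Person", "givenName": given, "familyName": family}
            for given, family in authors
        ],
    }


def write_codemeta(metadata, path="codemeta.json"):
    """Write the record as UTF-8 JSON, keeping non-ASCII names readable."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(metadata, handle, indent=2, ensure_ascii=False)
        handle.write("\n")


# Placeholder values; replace them with your own software's details.
example = build_codemeta(
    name="example-tool",
    description="A placeholder description of what the software does.",
    repository_url="https://example.org/example-tool",
    license_url="https://spdx.org/licenses/MIT",
    authors=[("Ada", "Example")],
)
```

Calling write_codemeta(example) would save the file as codemeta.json in the current directory, where it can sit alongside your source code.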