At the Collaborations Workshop 21, we discussed the issue of software reusability in digital humanities (DH) research. This blog post provides an overview of some of the underlying issues, as well as the conversation points, resources and avenues for future exploration that arose from our discussion.
There are many significant obstacles to reusing research software in digital humanities scholarship. Firstly, there is the issue of awareness: DH encompasses many fields of study, time periods, and methodologies whose practitioners do not usually publish in the same venues or necessarily collaborate on research aims and objectives. It is already a substantial challenge to communicate and make a tool visible beyond its original community of practice, or even beyond the national or geographical borders of its initial development and use; the problem is compounded when a tool might have applications beyond its own field or domain. Secondly, much funding in the humanities focuses on novelty. Existing projects find it difficult to secure funding for further development, let alone maintenance, of existing tools, while new projects must promise brand new software in order to appeal to funders, which poses further challenges for sustainability and long-term maintenance.
There have been several past attempts to overcome this issue, from tool directories (e.g. Project Bamboo and DiRT), most of which are now obsolete, to taxonomies of research methodologies (TaDiRAH, NeMO, etc.) and, of course, a number of discipline- or methodology-specific procedures and initiatives, e.g. for digital scholarly editing or digital archaeology. However, these discussions are usually limited to specific scholarly fields and communities of practice, and many researchers starting new DH projects do not yet see themselves as members of the DH community and so are not aware of the discussions taking place within it.
To bring together new perspectives on this topic and share resources and knowledge, we hosted a roundtable as part of the Software Sustainability Institute 2021 Collaborations Workshop with three researchers who have a range of experience in developing, creating and sustaining different kinds of digital history projects: Dr Melodee Beals, Lecturer in Digital History at Loughborough University and an SSI fellow; Professor Tim Hitchcock, Professor of Digital History at the University of Sussex; and Dr Matteo Romanello, Ambizione SNF Lecturer at the University of Lausanne.
Should we aim for reusability? How?
In a short presentation, “I’ve made it! Now what?: Encouraging Software Reuse in the Digital Humanities After the Fact”, Melodee Beals explored the role of bespoke software in research. She argued for creating new tools for your own research to solve specific problems, and suggested that the best way to encourage others to use them is to demonstrate them in action in your own research and publications, showing how they work, the code, and use cases. She also highlighted teaching as a form of advocacy for software projects, discussing her own collaborations with PhD students. Rather than attempting to compete with major commercial providers, Melodee suggested focusing primarily on making your research software solve the problem at hand.
In “Bad Data about Dead People”, Tim Hitchcock drew attention to the difficulties of working with other people’s data and of trying to meld different types of data. He emphasised that creating a dataset is an editorial process, and that the relationship between the kinds of data we work with and the knowledge they represent is always problematic and political. Picking up on the threads of Melodee’s talk and its focus on creating software to solve your own problems, Tim suggested researchers should always consider a creator’s motivations and politics, and how digital remediation and the application of tools influence the data. As such, the extent to which other people’s data can be reused is very much in question.
Finally, Matteo Romanello’s talk “Tools in DH and Penelope’s Shroud: Shared Issue or Common Strategy?” highlighted that building new software can be a symptom of a bigger issue: the lack of funding for the maintenance, support, sustainability and expansion of existing software. As such, reinventing the wheel isn’t always intentional, but it can be justified, not only from a funding perspective but also because it gives the opportunity to apply different methods (and in different languages). An example of this is INCEpTION, an annotation platform built on another tool. Working with funders to transform the landscape would be helpful, and advocacy from influential, larger organisations could encourage a shift in how funders consider this issue. On an individual level, Matteo suggested that reuse of software can be facilitated by open source code, documentation, use cases, and a sustainability plan.
Bringing the threads together
Several common issues arose from the talks and the Q&A. The first was how and when to build sustainability into the creation process. Melodee suggested leaving five minutes at the end of a coding session to add comments and write a synopsis of what you’ve just done. Both Tim and Matteo drew attention to the difficulties of getting recognition for good documentation: academic articles don’t do justice to the processes involved, and certain kinds of higher-status output are prioritised.
Some ideas about what makes code reusable were also discussed. Attendees drew attention to making source code open, ensuring that the licence is clear, and providing documentation that helps others understand the code. The importance of providing guidance not only on how to use software and data, but also on how to update and revise them, was emphasised.
The distinction between producing software as a comprehensive knowledge creation infrastructure and developing specialised software for bespoke projects was a topic of discussion, with the speakers highlighting how this impacts both funding and the credit given to this type of labour. For research software development to be recognised as a valid research output, the community needs to establish robust peer review processes, publishing venues, citation practices and sharing methods for research software.
The need to communicate the above points to the DH research community, as well as to funding bodies, was flagged several times during the discussion. Alongside the technical plan and the data management plan, funders should also require evidence regarding the research software developed or used within a funded project, and should further support sustainable practices for software development and (re)use.
This workshop was our first attempt at mapping the field of software sustainability in DH. Working as a small team, we (Emily and Anna-Maria) are now encouraging further conversations and practices around the above issues, focusing on different areas of the field such as digital scholarly editing and digital archives.
We are planning a series of specialised events on software development and sustainability for DH researchers and stakeholders in the coming months, as well as a number of future collaborative efforts. Stay tuned, and if you have suggestions or would like to be involved in future events, please get in touch!
Resources for digital humanists thinking about sustainability: