Is AI After My Job? Navigating the Future of Research Software Engineering

Author(s)
Nadine Spychala (SSI fellow)
Colin Sauze (SSI fellow)
Bryn Noel Ubald
Scott Archer-Nicholls

Posted on 7 November 2025

Estimated read time: 18 min

CW25 logo, a silhouette profile
This blog is part of the Collaborations Workshop 2025 Speed Blog Series. Join in and have a go at speed blogging at Collaborations Workshop 2026.

Introduction

The use of GenAI tools has become one of the most polarising topics in technology today, and the Research Software Engineering (RSE) community is no exception. In many RSE teams, there's a notable divide: some embrace them enthusiastically, while others maintain their distance, viewing LLM-based AI as overhyped, unreliable, unethical, or fundamentally threatening to their profession. This reluctance and scepticism may be particularly pronounced among senior developers and those with extensive experience. While many RSEs have experimented with basic tools like GitHub Copilot, relatively few have fully integrated advanced AI coding assistants into their daily workflows, potentially missing out on significant productivity improvements that early adopters consistently report.

At the recent Collaborations Workshop 2025, our discussion group tackled various concerns about the use of AI tools (generative AI, large language models, and similar AI tools) in research software development. Throughout this text, 'AI' refers to these generative AI technologies. Two major concerns stood out in current RSE experiences. The first is whether AI will make RSEs redundant. The second is the question of reliability and dependency: when using AI tools, how do we know when we're being "driven into a river"? This metaphor, inspired by over-reliance on GPS navigation [26], captures the risk of becoming so dependent on AI assistance that we lose the ability to recognise when it is leading us astray.

What emerged was a clear message: while fears about AI are valid, choosing not to engage isn't a viable option. Those who don't engage with AI tools may find themselves at a disadvantage, while those who master them may enhance their unique value as RSEs. Training in AI literacy, use of existing best development practices, development of critical evaluation skills, and the establishment of community standards are needed to harness AI's power while capitalising on the domain and research expertise that defines excellent research software engineering.

The AI Dilemma: Promise and Peril

The Concerns Are Real

The anxiety around AI isn't unfounded. RSEs are raising legitimate concerns about:

Deskilling: Like GPS navigation that can leave drivers unable to read maps, AI coding assistants risk creating what some call "vibe coding" - development based on feel rather than deep understanding. The experience of becoming dependent on AI - and, as a consequence, losing or never developing skills - varies dramatically across career stages: junior developers may never develop fundamental coding skills, while senior engineers might find some of their hard-won capabilities gradually atrophying through disuse.

Quality, Bias, and Technical Debt: AI-generated code often looks polished but can introduce subtle or even gross errors, create unexplainable logic, and fail in unexpected ways. Debugging these issues can be particularly challenging because AI-generated code may implement logic that seems reasonable on the surface, but does not correspond to the way humans would normally code. Moreover, the "computer always says yes" phenomenon means AI tools rarely push back on requests, even when they should. A tendency toward confirmation bias - where AI consistently follows the user's direction regardless of merit - can obscure when we've taken a wrong turn, making it increasingly difficult to recognise flawed approaches or dead-end paths.

On a more philosophical note, our group also discussed how AI's outputs come with an unexpected cost: they lack the distinctive character that makes human-written code feel approachable and engaging. There's something to be said for code that bears the fingerprints of its creator - quirky variable names, explanatory comments, elegant workarounds that speak to human creativity.

Domain-Specific Risks: The stakes vary dramatically across research domains. In medical research software, AI-generated errors could impact patient safety decisions. In climate modelling, subtle bugs could affect policy recommendations. High-risk domains require especially careful validation of AI-generated code.

Ethical and Legal Concerns: AI training data can include copyrighted code and personal information scraped from the internet without explicit consent, raising both intellectual property and privacy concerns that remain legally murky [2]. Beyond data collection issues, LLMs can perpetuate and amplify societal biases present in their training data, potentially reinforcing stereotypes related to gender, race, or other sensitive attributes (though this is less relevant to discussions about using AI as coding assistants) [1]. The concentration of AI development in big tech companies also means that access to advanced AI capabilities is increasingly controlled by commercial entities whose priorities may not align with research needs - creating dependencies on proprietary systems where pricing, availability, and development directions are determined by market forces rather than scientific priorities. Such interests may not align with research values or democratic principles, potentially giving unethical entities influence over scientific computing infrastructure. Additionally, the opacity of these "black box" models makes it difficult to understand how they arrive at outputs, complicating accountability when errors or biased results occur.

Environmental Impact: The computational costs of AI tools raise serious climate concerns, as training and running large language models require enormous energy consumption. Researchers at the University of Massachusetts, Amherst found that training a large AI model can emit more than 313 tons of carbon dioxide equivalent - nearly five times the lifetime emissions of the average American car, including manufacture of the car itself [3] (note that this 2019 study examined models orders of magnitude smaller than current frontier models). These training emissions, while substantial, are dwarfed by deployment at scale: while individual inference queries have much smaller carbon footprints, GPT-4o inference at 700 million daily queries would generate between 138,125 and 163,441 tons of CO2 annually [5]. The energy demands continue every time the model is used (however, see [4]). Models also show striking disparities in carbon footprint - recent benchmarking [33] shows that the most advanced models produce over 70 times more CO2 per query than efficient alternatives.

It is important to contextualise these emissions: data centres currently account for just 0.5% of global CO2 emissions (with AI comprising about 8% of that, or 0.04% of global emissions) [6]. However, the IEA projects data centres will reach 1-1.4% of global emissions by 2030, with AI's share growing to 35-50% of data centre power. As AI usage continues, the cumulative environmental impact could become substantial, raising questions about alignment with climate goals.

But the Benefits Are Transformative

The productivity gains from AI tools, when used skilfully, can be extraordinary. Since systematic research on RSE-specific contexts is lacking, we rely on evidence from industry and general software development - though many of these benefits are likely to transfer to RSE work.

Multiple studies show developers complete tasks 55-56% faster with AI coding assistants like GitHub Copilot [7, 8]. McKinsey research [12] found documenting code takes half the time, writing new code nearly half the time, and refactoring nearly two-thirds the time. A large-scale study [3] found 26% more completed tasks with 13.5% more commits. However, productivity impacts vary significantly by context and developer experience. A 2025 METR randomised controlled trial [16] found experienced open-source developers took 19% longer when using AI tools on their own repositories - despite believing they were 20% faster. Factors included time reviewing AI code, context switching overhead, and misalignment with project-specific standards in mature codebases. This suggests benefits may be greatest for newer developers or unfamiliar technologies, while experienced developers on complex codebases may see smaller gains.

Research shows mixed but generally positive code quality impacts. GitHub studies [10, 11] found 53.2% greater likelihood of passing unit tests and 13.6% fewer readability errors. However, independent studies like GitClear's research [15] have raised concerns about increased code churn and maintainability issues, indicating quality impact varies based on usage context.

Industry reports suggest significant potential for AI in documentation workflows, including generating technical specifications and API documentation [14]. For RSEs, who often work with research codebases that suffer from inadequate or missing documentation - a common issue in academic software development - AI tools could help generate comprehensive README files and inline comments, making these codebases more accessible to new team members and improving overall code maintainability.

While systematic studies specifically focused on RSE contexts are lacking, making it unclear how these general productivity gains translate to the specialised work that RSEs do, anecdotal reports from our discussion group (and more broadly from RSE community members) indicate that AI tools usually help in one or multiple of the following ways:

  • Code generation and completion: Writing new code faster, generating boilerplate, autocompletion.
  • Learning and knowledge transfer: Helping understand unfamiliar code, languages, APIs, domain knowledge.
  • Code analysis and improvement: Analysing existing codebases, suggesting optimisations, finding issues.
  • Documentation and communication: Generating docs, comments, explanations.
  • Debugging and problem solving: Helping trace issues, explain errors.

Learning and knowledge acceleration deserves special treatment - while AI tools risk deskilling through over-dependence, they also offer perhaps one of the most transformative benefits: the ability to dramatically flatten learning curves. The key lies in how they're used - as a crutch that prevents learning, or as an accelerator that enhances it. Crucially, successful AI use depends heavily on one's ability to evaluate AI outputs. If someone has no foundational knowledge in a domain, they will likely struggle to judge whether AI has solved a task well. The fundamental requirement for effective LLM use is being able to verify whether the AI is providing good responses.

AI serves as an always-available collaborator, particularly valuable for RSEs working in isolation or tackling new technologies - both common situations. Studies [2, 9, 13] show that newer and less-experienced developers see the highest adoption rates and greatest productivity gains from AI tools, with research across three domains finding that AI tools significantly narrow the gap between worst and best performers.

For RSEs specifically, this learning acceleration is particularly valuable given the breadth of technologies and domains they typically encounter. AI assistance can enable rapid mastery of new programming languages, understanding legacy scientific codebases faster, and getting up to speed on unfamiliar libraries. The weeks or months required to become productive with a new research tool, domain-specific algorithm, or scientific computing framework can be compressed into days.

This is especially significant for early-career RSEs who may feel overwhelmed by the vast landscape of scientific computing, or for experienced RSEs moving into new research domains where the learning curve has traditionally been steep. The ability to quickly understand and work with unfamiliar codebases, APIs, and domain-specific practices removes a major barrier to contributing effectively across diverse research projects.

Keeping Your Job in an AI-Driven World

Fears and concerns about AI are valid, yet choosing not to engage isn't a viable option. Just as GPS revolutionised navigation while creating new dependencies, AI tools are transforming how we write code while introducing new risks. Experienced users can leverage these tools effectively because they understand when to trust them and when to question their outputs. The challenge lies in ensuring that newcomers develop the underlying skills to recognise when they're being "led into a river."

The Threat

The threat of AI displacement is real and already happening. Recent data shows that 14% of workers have already experienced job displacement due to AI or automation [18], with Goldman Sachs estimating that 6-7% of the US workforce [17] could be displaced if AI is widely adopted. A comprehensive PwC survey spanning 44 countries found that 30% of workers globally fear their jobs could be replaced by AI within the next three years [19]. Entry-level positions face particular risk, with estimates suggesting that AI could impact nearly 50 million US jobs in the coming years [20].  

Coding is both where AI currently excels most and where it's improving fastest - telling evidence came in 2024, when software developer employment flatlined after years of consistent growth [22]. According to 80000 Hours [21], within five years, AI will likely surpass humans at coding even for complex projects, with developers transitioning into AI system management roles that blend coding knowledge with other capabilities - though this shift will challenge some.

The Opportunity

Understanding AI's limitations helps identify where human skills remain essential. AI struggles with three categories of tasks [21]: those lacking sufficient training data (like robotics control, which has no equivalent to the internet's vast linguistic datasets), messy long-horizon challenges requiring judgment calls without clear answers over years (like building companies, directing novel research, setting organisational strategy), and situations requiring a person-in-the-loop for reasons such as legal liability or high reliability. Moreover, there are valuable complementary skills when deploying AI: spotting problems, understanding model limitations, writing specifications, grasping user needs, designing AI systems with error checking, coordinating people, ensuring cybersecurity as AI integrates throughout the economy, and bearing ultimate responsibility. These skills resemble human management: both difficult for AI to master and complementary to its capabilities. As AI improves, they become more needed, multiplying their value.

As stated previously, while coding represents AI's strongest current capability and its most rapid area of improvement, it has simultaneously made learning to code more accessible and expanded what individual researchers can accomplish. This may increase the value of spending months (rather than years) learning coding as a complementary skill, especially if cheaper production costs expand overall software demand.

80000 Hours’ single most important piece of practical advice for navigating AI-related transitions is to learn to deploy AI to solve real problems. As AI capabilities advance, people who can effectively direct these systems become increasingly powerful. For RSEs, starting points might be using cutting-edge AI tools as coding assistants in their current work, and when opportunities arise, building AI-based applications to address real problems.

The RSE Advantage

We believe RSEs may be uniquely positioned to thrive in an AI-driven world. Unlike pure software development, RSE work requires deep domain expertise, understanding of research methodologies, stakeholder communication and management skills, and the ability to translate evolving scientific requirements into robust software solutions.

Importantly, many of the skills where AI struggles align closely with RSE competencies. RSEs routinely navigate messy long-horizon challenges - directing the software implementation of novel research projects, setting technical strategy for evolving scientific requirements, and making judgment calls where clear answers don't exist. Their work inherently requires a person-in-the-loop for research integrity, reproducibility, and ethical compliance. Moreover, RSEs are particularly well-positioned to develop the complementary skills that become increasingly valuable as AI advances: problem-solving in complex research contexts, writing specifications that bridge science and software, grasping diverse user needs across research domains, coordinating interdisciplinary teams, and bearing ultimate responsibility for research software reliability. These capabilities - to a huge extent involving project and people management in research contexts - are precisely the skills that AI finds most difficult to replicate and that become more valuable as AI handles more routine coding tasks.

RSEs also operate at the cutting edge of research, where problems are often novel and solutions aren't readily available in LLM training datasets. Just as research itself evolves and adapts, RSEs can leverage their research background to stay ahead of automation. It's precisely in these uncharted territories - where science meets software - that AI tools may remain limited and human insight becomes indispensable.

Essential Skills for the AI Era
  • Prompt Engineering: Learning to communicate effectively with AI tools, including crafting prompts that encourage critical evaluation rather than blind compliance (a minimal sketch follows this list).
  • AI Literacy: Understanding AI capabilities and limitations, knowing when to use different tools and when tasks can or cannot be delegated to an AI agent, and recognising the environmental and ethical implications of various choices.
  • Enhanced Validation Practices: AI-generated code may require especially careful validation since errors may not be immediately obvious to human reviewers. Using and extending already existing best development practices, as well as the ability to judge when AI suggestions are leading in the wrong direction, will be crucial.
  • Continuous Learning: As AI capabilities evolve rapidly, staying current with both benefits and risks requires ongoing professional development.
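
To make the prompt-engineering point concrete, here is a minimal sketch of a prompt that asks for critique rather than agreement. The reviewed snippet and the exact wording are illustrative assumptions, not a recommended template.

```python
# A minimal sketch of a "critical evaluation" prompt, assuming you have a
# piece of AI-suggested code that you want reviewed rather than rubber-stamped.
# The snippet and wording are illustrative, not a fixed template.

suggested_code = """
def mean(values):
    return sum(values) / len(values)   # fails on an empty list
"""

prompt = (
    "You are reviewing code for a research software project.\n"
    "Do NOT simply agree with or extend the code. Instead:\n"
    "1. List concrete failure modes and edge cases (with inputs that trigger them).\n"
    "2. State any assumptions the code makes about its inputs.\n"
    "3. Say explicitly if you think the approach is a dead end, and why.\n\n"
    f"Code to review:\n{suggested_code}"
)

print(prompt)  # paste into your preferred assistant, or send it via its API
```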

Community Actions and Recommendations

The research software community can navigate this transition successfully via:

Training and Education

  • Establish foundational AI literacy programs: Help RSEs understand AI capabilities, limitations, and appropriate use cases across different research domains.
  • Develop adaptive learning frameworks: Since AI models evolve rapidly and best practices are still emerging, create flexible training approaches that can evolve with the technology rather than rigid curricula that quickly become outdated.
  • Foster critical evaluation skills: Train RSEs to assess when AI assistance is helpful versus harmful, and how to validate AI-generated outputs effectively.

Standards and Guidelines

  • Develop protocols for disclosing AI assistance in code development and research publications.
  • Create environmental impact metrics for AI usage (tools like CodeCarbon can help track computational costs; see the sketch after this list).
  • Establish quality standards and validation procedures for AI-generated code in research contexts.
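
As a starting point for such metrics, the sketch below shows one way a single workload could be wrapped with CodeCarbon's EmissionsTracker. It is a minimal example rather than a full reporting pipeline; the project name and the workload function are placeholders.

```python
# Minimal sketch of tracking the emissions of one workload with CodeCarbon
# (pip install codecarbon). The project name and workload are placeholders.
from codecarbon import EmissionsTracker

def run_analysis():
    # Stand-in for the real computation whose footprint you want to measure.
    return sum(i * i for i in range(10_000_000))

tracker = EmissionsTracker(project_name="rse-ai-usage")
tracker.start()
try:
    result = run_analysis()
finally:
    emissions_kg = tracker.stop()  # estimated kg CO2-equivalent for this run

print(f"Result: {result}, estimated emissions: {emissions_kg:.6f} kg CO2eq")
```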

Available Tools and Technologies

The research software community has access to a growing ecosystem of AI tools, each with different strengths:

  • Conversational AI Models: ChatGPT, Claude, Gemini, and open-source alternatives like Mistral and DeepSeek offer different capabilities for code generation and problem-solving.
  • Integrated Development Tools: CursorAI, GitHub Copilot, and Windsurf provide AI assistance directly within coding environments, while tools like Claude Code, OpenCode and OpenAI's Codex agent enable autonomous coding through various interfaces including command line.
  • Local and Specialised Tools: Ollama allows running models locally for sensitive research (see the sketch after this list), while tools like LangChain facilitate building custom AI applications. OpenCode provides terminal-based AI assistance with support for 75+ LLM providers.
  • Cloud-Based Autonomous Agents: OpenAI's Codex and Claude Code offer cloud-based software engineering with parallel task execution, while platforms like Manus provide more general-purpose autonomous assistance that can handle coding, among other complex tasks.
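
To illustrate the local option, the sketch below sends a question to a model served by a locally running Ollama instance over its HTTP API, so prompts and code never leave the machine. It assumes Ollama is running on its default port and that a model (here "llama3") has already been pulled; adapt the model name to whatever is installed.

```python
# Minimal sketch: ask a locally served model (via Ollama's HTTP API) to explain
# a piece of code, keeping everything on the local machine.
import requests

snippet = "for i, x in enumerate(xs): acc = acc + x * weights[i]"

response = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "llama3",  # assumes this model has been pulled locally
        "prompt": f"Explain what this Python line does and suggest a clearer rewrite:\n{snippet}",
        "stream": False,  # return one JSON object instead of a token stream
    },
    timeout=120,
)
response.raise_for_status()
print(response.json()["response"])
```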

The variety of available tools means RSEs can choose solutions that match their security requirements, domain needs, and workflow preferences. Moreover, experienced developers have moved beyond simple chat-based interactions to develop structured workflows for AI-assisted development. Mitchell Hashimoto, for instance, proposes to iteratively build non-trivial features through multiple focused AI sessions, each tackling specific aspects while maintaining human oversight for critical decisions [23]. Geoffrey Litt advocates for a 'surgical' approach where AI handles secondary tasks like documentation and bug fixes asynchronously, freeing developers to focus on core design problems [24]. Peter Steinberger takes this further with parallel AI agents working on different aspects of a codebase simultaneously, using atomic commits and careful prompt engineering to maintain code quality [25]. These workflows share common principles: breaking work into manageable chunks, maintaining clear documentation for AI context, and crucially, always reviewing AI-generated code before deployment.

Conclusion: Looking Forward - Integration, Not Replacement

The expertise in leveraging AI dev tools is being developed primarily in industry, creating a risk that RSEs may fall behind in AI proficiency compared to their industry counterparts. Rather than rejecting AI tools outright, RSEs could position themselves at the forefront of AI usage knowledge, leveraging it to their advantage while actively addressing the legitimate concerns and working to mitigate the risks we've outlined - from deskilling and environmental impact to ethical considerations and quality assurance.

The RSEs who thrive will be those who learn to use AI tools effectively while maintaining the critical thinking, domain expertise, and ethical considerations that define excellent research software. They understand that their value lies not just in writing code, but in asking the right questions, understanding research contexts, and ensuring that software serves scientific discovery. We still need skilled developers who can recognise when they're being led astray.

As we navigate this transition, the research software community can work together to ensure AI enhances rather than replaces the human elements that make RSE work valuable. The question isn't whether AI will change our field (it will and already does) - it's whether we'll shape that change to serve research and society's best interests.

Want to join the conversation? The research software community continues to explore these questions through workshops, training programs, and collaborative initiatives. Visit software.ac.uk to learn more about upcoming events and resources.

5 strategies to help foster cross-disciplinary activity in skill-sharing communities

Author(s)
Phil Reed (SSI fellow)
Oscar Seip (Research Community Manager)
Nicky Nicolson (SSI fellow)
Malvika Sharan (SSI fellow)
Samantha Wittke (SSI fellow)
Dave Horsfall (SSI fellow)
Alessandro Felder (SSI fellow)
Emily Lumley
Kwabena Amponsah

Posted on 14 October 2025

Estimated read time: 11 min

CW25 logo. 5 strategies diagram
This blog is part of the Collaborations Workshop 2025 Speed Blog Series. Join in and have a go at speed blogging at Collaborations Workshop 2026.

The fragmented nature of training materials, distributed across research institutions and within project repositories, often leads to duplicated resources, ineffective practice, and wasted storage, contributing to the lack of sustainability of the materials themselves. Furthermore, opportunities are missed to reuse domain-agnostic materials in additional disciplines; for example, research software management materials written to support the life sciences may be relevant to all.

In this blog post, we describe five strategies or general principles to help foster cross-disciplinary activity in skill-sharing communities.​

Establish a shared vocabulary

To foster effective cross-disciplinary activity, it is essential for communities to establish a shared vocabulary which reflects their different contexts, audiences, disciplinary practices, and career stages. For instance, the term ‘science’ in American English is often used in a way that aligns more closely with how ‘research’ is used in British English. However, such terms can also be unintentionally exclusionary or carry biases in terms of hierarchies that favour one discipline or method over another. In German and Dutch, for example, the concept of Wissenschaft or wetenschap, which translates as ‘knowledge-ship’ or ‘knowledge craft’, covers not only the natural and social sciences but also the arts and humanities, placing them on an equal footing. Similarly, “data” might sound neutral, but not to humanities scholars. Furthermore, programming languages each have their own unique features, even though they are predominantly based on English, which presents its own set of challenges.

One way to address these challenges is to involve people who are embedded within different research subcultures who can act as champions or community brokers. These individuals understand the specific values, practices, and terminology of their respective communities and can help bridge gaps in communication. Moreover, they are able to articulate how different skills and different ways of working can complement each other. By ‘speaking both languages’, they are able to promote mutual understanding, translate concepts and skills across domains, and facilitate more inclusive and effective exchange. This role is therefore essential, especially in interdisciplinary projects, where a lack of understanding can hinder effective collaboration and negatively affect the project’s outcomes. ​

Effective and intentional event design

Different types of events, from structured training to unconferences and informal community spaces, provide multiple entry points for people to learn and contribute in a cross-disciplinary setting, regardless of their background. Designing events that facilitate both formal and informal spaces allows learners to be exposed to new concepts and exchange ideas from their respective fields. This exposure helps build an understanding of how general skills, like data management, reproducibility, or visualisation, can be applied across disciplines, from bioinformatics to the humanities. Engaging people in groups and encouraging collaborative documentation can also help build shared training materials and resources that are useful across various communities.

Let's walk through a learner's journey to see how different cross-disciplinary events can contribute to continuous learning:

  • Gaining Foundational Skills: A researcher new to data science might start with The Carpentries training. Offered in a supportive group setting, these short-format workshops provide foundational programming and data skills, empowering new learners to continue their journey.
  • General Discussions and Problem-Solving: As they apply their new skills, they might join The Turing Way's "collaborations cafés" to discuss best practices like version control, code review, and data management. They could even participate in a "Book Dash" to document skills they want their team to learn, collaborating with people from different disciplines.
  • Local Engagement: To maintain momentum, they could start their own "Hacky Hours" or Advent of Code meetups with others in their department. These regular gatherings create a safe, less intimidating space to share tools, learn about new packages, or even reproduce a recent data analysis method.
  • Connecting with the broader Ecosystem: Finally, the group might apply to join OLS’s Open Seeds training program or collectively attend an event like the annual SSI Collaborations Workshop or RSECon to connect with a wider community. This provides exposure to new topics, tools, and approaches, further strengthening their skills and network in a cross-disciplinary environment.

This mix of organised and casual formats normalises the process of acquiring new skills even beyond a traditional training format or classroom environment.

Leadership and diversity

To successfully foster cross-disciplinary activity, a skill-sharing community requires strong, intentional leadership. This includes both officially designated leaders, responsible for building and supporting communities, and "open leaders", those who organically step up to act as stewards, connecting new learners with experts and resources. These leaders, by acting as bridges between different groups and often diverse communities, help share and translate skills across disciplinary “languages.”

Effective leaders also actively champion diversity within their communities and teams. While community-building may not be in everyone's job description, a good leader encourages an environment where community thrives. A crucial part of this is creating intentional pathways for different kinds of open leadership. This supports a variety of roles that bring diverse perspectives from across different disciplines into the community's leadership structure.

Over the last few years, a variety of Research Technical Professional (RTP) roles, such as research community managers, technical trainers, and data stewards, have emerged in addition to the established Research Software Engineer (RSE) role (discussed in this policy briefing by Karoune and Sharan). These roles expand on the work of the RSE community and are spreading to departments beyond research computing. By acting as open leaders, individuals in these roles provide specialised expertise and model best practices that their community members can adopt.

Regardless of their specific role, leaders and role models should focus on removing barriers to collaboration and co-creation. This will encourage more people to move beyond simply consuming content and to become active participants (as illustrated by CSCCE). By bringing expertise from different backgrounds, these leaders can act as catalysts, building a thriving, cross-disciplinary community right where they are.

Shared spaces for collaboration

Traditional, synchronous meetings often fail to accommodate global teams or varied schedules, limiting engagement. Furthermore, without dedicated, persistent forums, valuable insights can be lost after events, and the crucial space for constructive disagreement, rapid iteration, and "failing fast" – essential for innovation – is often absent. This lack of ongoing, accessible shared environments can impede the organic development of trust and sustained interdisciplinary relationships. Communities typically have preferences for particular shared spaces, which don’t necessarily map across disciplines.

To address these challenges, community managers must receive and provide specific training for the effective use of shared digital spaces for collaboration using platforms like Slack, HackMD, Google Docs, or GitHub. If all communities get used to using these platforms for collaboration, the established “common ground” supports more efficient working across disciplines. When selecting tools, it's vital to acknowledge the audience's preferences – recognising that not all disciplines are familiar with platforms like GitHub – and choose the most appropriate option, even if it's a hybrid approach. By proactively providing these dedicated virtual spaces and the time to engage with them, we empower genuine, effective interdisciplinary collaboration.​

The specific training for such shared digital spaces and platforms can be found in many places such as institutional training catalogues, research infrastructure training catalogues, and materials offered by the platforms themselves. Training registry providers are also developing ways to address the fragmentation of materials across silos; for example, the mTeSS-X project is adding a feature to facilitate seamless content exchange between catalogues built on the TeSS Platform. This will significantly enhance the findability, accessibility, and reuse of high-quality training resources (FAIR principles).

Collaboration through co-design and team science

Cross-disciplinary skill-sharing is successful when communities embrace collaborative approaches that recognise the value of diverse expertise. No single person can hold all the skills required to deliver complex projects, whether in data science, genomics, digital humanities, or beyond. Effective collaboration, therefore, requires us to work in groups, not in isolation.

A co-design approach ensures that communities shape both the problems and the solutions together. By inviting contributions from participants at all stages of a project, we can avoid siloed thinking and instead co-create resources, training, and infrastructure that respond to a broad range of needs. This shared design process helps embed inclusivity from the outset, so that outcomes are more useful across disciplines.

In team science models, each member contributes their own expertise, for example, domain knowledge, software engineering, project management, or community engagement. This complementary approach strengthens the whole project. Importantly, recognising and celebrating these varied contributions also helps shift perceptions of what counts as valuable academic work, supporting career recognition for diverse skills.

Successful cross-disciplinary projects often emerge when people from different disciplines are not only users of one another’s outputs but also co-creators in shared spaces. This means designing environments where disciplinary boundaries can be crossed safely, and where failures are accepted as part of the innovation process. By embedding collaboration in the culture of our communities, we can move beyond fragmented exchanges of knowledge and towards more sustainable, resilient models of skill-sharing.

 

These five strategies demonstrate a range of approaches to help foster cross-disciplinary activity in skill-sharing communities. If we establish a shared vocabulary, we can better understand people working in different disciplines who may use familiar terms in unfamiliar ways. Effective and intentional event design will better connect people via community brokers. Presenting career pathways to leadership with diverse representation will inspire and support us all in the longer term. Creating shared spaces will encourage people to find and refine their voice. Those taking a collaborative approach to projects will recognise that different skills are required for a project to be successful. These strategies are interrelated, as illustrated in the figure above.

Before you can work with other relevant people, you have to find them. The SSI Collaborations Workshop is an effective way to find people who are not “in the room” (as in, not previously in your network). The SSI is also a very good example of widening to other communities, after it formed with a focus on research software engineers, and now supports the wider reach of digital research technical professionals. It epitomises the idea of research community support work as an ongoing, living process, not a “one and done”.

Authors

Phil Reed, The University of Manchester, phil.reed@manchester.ac.uk, 0000-0002-4479-715X (corresponding)
Oscar Seip, The University of Manchester, oscar.seip@manchester.ac.uk, 0000-0002-8503-2698
Nicky Nicolson, Kew Gardens, n.nicolson@kew.org
Malvika Sharan, St. Jude Children’s Research Hospital and OLS (Open Life Science), malvikasharan@gmail.com, 0000-0001-6619-7369
Samantha Wittke, CSC - IT Center for Science, samantha.wittke@csc.fi, 0000-0002-9625-7235
Emily Lumley, Imperial College London, elumley@ic.ac.uk
Kwabena Amponsah, University of Nottingham, kwabena.amponsah1@nottingham.ac.uk, 0000-0002-7506-9040
Dave Horsfall, University of Newcastle, dave.horsfall@newcastle.ac.uk, 0000-0002-8086-812X
Alessandro Felder, Neuroinformatics Unit, Sainsbury Wellcome Centre & Gatsby Computational Neuroscience Unit, University College London, a.felder@ucl.ac.uk, 0000-0003-3510-9906

 

Original image by Samantha Wittke and Phil Reed.

 

Code That Works Isn’t Always Code That Lasts

Author(s)
Robert Chisholm (SSI fellow)
Patrick J. Roddy
Liam Pattinson
Andrew Gait
Daniel Cummins
Jason Klebes
Connor Aird

Posted on 25 September 2025

Estimated read time: 7 min

CW25 logo, a comic strip
This blog is part of the Collaborations Workshop 2025 Speed Blog Series. Join in and have a go at speed blogging at Collaborations Workshop 2026.

When working on research software, there will always be a balance between investing time in functionality, usability, and performance. Additionally, software used and developed in academic research environments presents several unique challenges. Oftentimes, researchers focus on functionality first and foremost, with little to no effort spent on the ease of use of their code, nor on the performance. This is understandable given the pressures researchers are under to publish. Grants are typically awarded for novel and innovative ideas, rather than for maintaining existing code. As a result, researchers often have little incentive to sustain or update their code after publishing the associated papers. They may also view their code as being in an unsuitable state for public release, further discouraging long-term maintenance.

It is unrealistic to expect researchers to fully optimise their software’s performance while basic functionality is being developed. However, steps towards improved usability and performance can be undertaken from the earliest stages of development. Simple things, like comments and docstrings, can be introduced as the codebase is being developed. Moreover, API documentation can be autogenerated from the function docstrings. A simple README can be iteratively developed to include things such as the motivation behind the code, the installation instructions, and community guidelines. In addition, tests can further improve the usability and can be created from the start.

Most research software is not created with the expectation that it will be passed on; instead, it’s developed with only the current project in sight. It’s important to recognise that these small improvements can greatly improve future maintenance and usability, both for the original author and any researcher who inherits the software.

Bare minimum documentation: benefits and barriers

Missing docs harm the ability of others to pick a codebase up.
Is documentation the bare minimum for usability? Just docstrings and a good README can be sufficient. It doesn’t necessarily require a fully fledged user guide.

It can be argued that the bare minimum requirement for software to be considered usable is the presence of documentation. Research software often uses idiosyncratic workflows. A lack of instructions documenting exactly how the code should be installed and run can be a significant barrier for new users. Documentation doesn’t necessarily need to be a set of nicely formatted HTML docs – even the presence of a few docstrings, comments, and a README is often sufficient. However, the upkeep of documentation as a project develops presents a continuous challenge. Project maintainers should encourage each new piece of added functionality to include corresponding documentation.
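
As an illustration of that bare minimum, a single docstring in a common convention already helps both a human reader and a documentation generator such as Sphinx. The function below is a made-up example, not taken from any particular project.

```python
def moving_average(values, window):
    """Return the simple moving average of a sequence.

    Parameters
    ----------
    values : sequence of float
        The data points, in the order they were observed.
    window : int
        Number of consecutive points averaged for each output value.

    Returns
    -------
    list of float
        Averages of each consecutive ``window``-sized slice; the result is
        ``len(values) - window + 1`` items long.
    """
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]
```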

A possible barrier to producing minimal documentation is a lack of confidence in the researcher that their code is worth the effort of making it understandable to others. Perhaps the researcher or PI needs to be convinced that their code is of interest to others outside their research group, or that their area of study has more general applicability (particularly in terms of programming or algorithms) than they may have first thought. A lot of the decisions about what documentation is required need to be made early on in the process of adding additional developers (e.g. RSE) to a project.

Unit tests can sit alongside the core code base as a means of demonstrating the functionality of the code. They improve not only the usability, but also the sustainability of the code.  Unit tests give others the confidence to build on existing code, knowing that they haven’t inadvertently broken other parts of the code base. In addition, the presence of tests in a project facilitates profiling and improving performance. Integration and regression tests can provide self-contained snippets to run specific parts of the code in isolation, providing a standardised test suite against which the correctness and performance of the code base can be measured.
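
A couple of pytest-style tests are often enough to provide that confidence. The sketch below assumes the moving_average example above lives in a module called analysis; both names are placeholders.

```python
# Minimal pytest sketch for the moving_average example above
# (run with `pytest test_moving_average.py`).
import pytest

from analysis import moving_average  # assumed module name for the example


def test_known_values():
    # Averages of [1, 2, 3] and [2, 3, 4] with a window of 3.
    assert moving_average([1.0, 2.0, 3.0, 4.0], window=3) == [2.0, 3.0]


def test_rejects_oversized_window():
    with pytest.raises(ValueError):
        moving_average([1.0, 2.0], window=5)
```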

Bare minimum profiling: benefits and barriers

Profiling is an essential and inexpensive first step to understanding and improving research software performance.  

Researchers who write code often lack formal programming training, which leads to bad habits that can greatly impact the performance of their code. As with documentation, profiling does not have to involve difficult tooling or in-depth analysis, but could consist of basic measures such as timing parts of the program. However, academic developers and users with little software experience or training may not recognise when performance is unexpectedly poor for a given problem. Many researchers will think of HPC or GPU compute when asked to improve the performance of their code, without having ever profiled their existing code to understand whether easily addressed bottlenecks exist. Consulting an RSE throughout the software development process, even with a short consultation, could help keep software both usable and performant.
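
A first pass at profiling can be as lightweight as the sketch below: a coarse wall-clock timer around a suspect section, followed by cProfile when something looks unexpectedly slow. The workload function is a placeholder for the code under investigation.

```python
# Two cheap first steps for profiling, using only the standard library.
# The workload below stands in for whatever part of the code feels slow.
import cProfile
import pstats
import time


def workload():
    return sorted(str(i * i) for i in range(200_000))


# Step 1: a coarse wall-clock timing of one section.
start = time.perf_counter()
workload()
print(f"workload took {time.perf_counter() - start:.3f} s")

# Step 2: a function-level profile, printing the ten most expensive calls.
profiler = cProfile.Profile()
profiler.enable()
workload()
profiler.disable()
pstats.Stats(profiler).sort_stats("cumulative").print_stats(10)
```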

Suggestions

One possible solution currently employed by the UCL Centre for Advanced Research Computing (ARC) is their ways-of-working agreement, which is detailed on their website. Before a project commences, ARC requires that their collaborating research groups understand and agree to facilitate the development of tests, that refactoring without the constant production of features is often essential, and that their code must be stored on GitHub or a similar remote git repository provider. This agreement helps ensure that collaborators allocate time for maintenance, which will ultimately benefit their software by increasing its usability and providing a convenient way for RSEs to prioritise documentation and testing.​

Conclusion

Fundamentally, any research software should solve the problem it is intended for, i.e. it must be functional. However, an incalculable amount of time is spent rewriting functionally correct code into more performant and/or more usable and sustainable code.

When beginning any research software endeavour, these three factors must be given adequate consideration.  Features, tools, or workflows to enable usability and the assessment of its performance should be implemented – at least to a minimum level – from the outset. Adding in documentation and unit tests to a fully functional but undocumented and untested code base can be an enormous task, and provides a huge barrier to the future usability and sustainability of the software.​

There are many popular tools for making a developer’s life easier when it comes to writing documentation and assessing performance.  Setting these up from the beginning of the project helps ensure sustained effort goes towards these throughout the project lifecycle.

If you’re a researcher who could benefit from learning more about documenting, testing, or profiling your code, reach out to your local Research Computing or RSE team. Many universities now offer training to upskill research staff and students in the skills required to produce robust and usable research software. If you are already familiar with these techniques, make sure to advocate for them with your collaborators who aren’t. Everyone benefits when research software is built to be used by others.

Authors

Patrick J. Roddy, University College London, patrick.roddy@ucl.ac.uk, 0000-0002-6271-1700
Robert Chisholm, University of Sheffield, robert.chisholm@sheffield.ac.uk, 0000-0003-3379-9042
Liam Pattinson, University of York, liam.pattinson@york.ac.uk, 0000-0001-8604-6904
Andrew Gait, University of Manchester, andrew.gait@manchester.ac.uk, 0000-0001-9349-1096
Daniel Cummins, Imperial College London, daniel.cummins17@imperial.ac.uk, 0000-0003-2177-6521
Jason Klebes, University of Edinburgh, jason.klebes@ed.ac.uk, 0000-0002-9166-7331
Connor Aird, University College London, c.aird@ucl.ac.uk

Original image from xkcd.com

 

Telling Our Success Stories: Recognising Contributions in Research Software

Author(s)
Stephan Druskat (SSI fellow)
Jack Atkinson (SSI fellow)
Arielle Bennett (SSI fellow)
Will Haese-Hill
Tamora James
Michael Sparks
Jonathan Cooper
Jeremy Cohen
Adrian D'Alessandro

Posted on 22 September 2025

Estimated read time: 11 min

CW25 logo, two figures on a podium
This blog is part of the Collaborations Workshop 2025 Speed Blog Series. Join in and have a go at speed blogging at Collaborations Workshop 2026.

Research software is an increasingly important element of research across almost all domains. Its development, maintenance, and adaptation to new use cases require a blend of subject expertise, programming skills, and socio-technical capabilities that are often spread across teams. To support the rapidly growing need for software development for research processes, the role of the “Research Software Engineer” (RSE) has emerged and developed over the last decade, professionalising previously informal positions often held by postdocs. However, how we effectively recognise, reward, and support all those who make contributions to research software is an ongoing discussion and challenge. We cannot cover all aspects in this post, but we will try to describe key approaches and some of the issues surrounding them.

What do we mean by recognition and reward?

The fundamental question to answer is: what are we recognising and why? Codebase level contributions can include infrastructure, testing, sustainability, maintainability, reproducibility, and mentoring. State of the art contributions can include reusable tools, practices, or community standards. Some of these can be measured quantitatively with metrics, some qualitatively, and some narratively. We briefly discuss available options below.

Recognition that a contribution has been made – whether to the specific codebase, the project, the state of the art, a process, or a community – allows the recipient to feel that their contribution has been appreciated by a community. Recognition is about the acknowledgement of the skills, effort, and outcomes achieved by people. It can be as simple as a “thank you”, but may also extend to feedback that enhances the contributor’s reputation within a community. Sites such as StackOverflow rely on such approaches to maintain contributions. Citations for software are one way to recognise contributions, but there are so many more possibilities. Recognition of contributions can also extend into tangible rewards, e.g., monetary rewards, or career opportunities that translate into long-term benefits and ensure that people can continue to contribute and flourish. Reward and recognition are thus not distinct categories – a good reputation can improve career prospects.

There are at least two kinds of rewards. The first is one-off displays, for example bonuses, gifts, and awards. Then there are the ultimately more important rewards, which manifest in terms of someone’s long-term career - like a title change, promotion, or the ability to shift between institutions or even move into or out of academia, industry, or public services, based on the strength of contributions to research software projects. This latter form of reward is perhaps less obvious to early career software engineers, but is important down the line. It can affect eligibility for hiring panels, promotion boards, or even funding bodies. In other words, reward and recognition enable movement and credibility. This type of reward requires recognising contributions in a way that is credibly shareable with others.

What matters can vary significantly by contributor. It will also depend on their desired career path and current stage, and so is likely to vary over time. For those within an academic environment who eventually aim to become professors or permanent researchers, citations, visibility, and reputation within their field, leading to recognition within university promotion schemes, will be valued. Within an industry context, being able to show the impact your work has had on the company’s performance will typically lead to salary increases, one-off bonus payments, or new career opportunities.

The hybrid nature of many research technical roles means some research software contributors find themselves on a research career track, while others are considered technical professionals. Each of these comes with its own notions of reward and recognition, which drive promotion criteria, career development, and salary raises. A technical-track person will be less inclined to chase academic publication metrics, since these are not prioritised when evaluating individual performance. Further, the domain in which a person finds themselves working will also play a part. In Bioinformatics, for example, having even a minor role in a Bioinformatics Resource Center (BRC) can ensure authorship credit on the annual release paper (with dozens of others), garnering many citations for proportionally little effort compared to a publication covering a novel software tool. This inconsistency could skew reward incentives for a person on a research track. It also disincentivises academics from contributing to existing software, since this is unlikely to lead long term to the types of recognition required to advance in their careers.

Groups or teams also need collective recognition as they seek to establish or maintain their reputation. Being recognised for their work leads to them being more sought out as trusted collaborators, potentially resulting in ongoing funding for the team and increased sustainability for their work. Having more users of a piece of software both enhances and is driven by the reputation of the developers and may be leveraged to secure funding, or to translate users into contributors. The importance of such recognition to a team will depend on its institutional context, but all teams need to be able to justify their continued existence along these lines.

Critically, the academic structures for reward and recognition lead to particular challenges for maintaining research software. Instead, novel development is heavily weighted in many cases. Despite acknowledging that it is important that existing software tools are maintained, our current global system does not sufficiently reward this work, causing maintainer burnout, unreliable tools, and the reinvention of the wheel over and over. We need a better system for aligning incentives, career development and satisfaction, and the actual needs of the research community. ​

How do we evaluate contributions? Qualitative vs quantitative

There are numerous different approaches to measuring and evaluating the quality and utility of people's contributions to research software, encompassing a range of factors, both countable (quantitative) and more nebulous (qualitative). Historically, these evaluation techniques have focused on the more quantitative side of metrics.

Attempts at improving documentation and evaluation of software have tapped into various pre-existing “traditional” metrics. In academia, publications and citations dominate, and projects utilise both software papers and DOIs for code. Efforts such as the Citation File Format project aim to make it easier for researchers to acknowledge the code they utilise, but still emphasise citation number as a key value metric for software. In the general software development realm, community impact is demonstrated through metrics such as contributions on third-party platforms (GitHub, StackOverflow, etc.) or being recognised as a core maintainer for a software application, tool, or library. Such quantitative metrics, whether GitHub stars, citation numbers, lines of code, downloads, etc. are alluring because of the ease of collection and ready translation across projects and domains. However, the homogeneity of these metrics means they can only tell a superficial story. For example, are people starring a particular repository because they're actively using the tool, because they want to come back and look at it later, or because they plan to include it in a list of resources for other people? If someone contributes 1,000s of lines of code to a project, does that automatically make them a better programmer than someone who contributes 100s?

An overzealous focus on readily quantifiable metrics can only ever tell a fraction of the story of a tool or piece of software, and its utility and impact. Likewise, these metrics are often in danger of being gamed, which can introduce skew into evaluations based upon them. One alternative, a narrative approach to describing contributions and software use, delivers much more information about real impact, but places a greater burden on both the author and the reader. It can be hard from a qualitative perspective to capture the breadth and scale of reach of online research outputs. It also presents a barrier to those who experience challenges with reading/writing or are not working in their native language.

Ideally, we need a combination of the two types of metrics. We can incorporate recognised metrics, each of which tells one part of a story, into a broader narrative as evidence behind a more complete story. Funders could easily implement this approach for future grant applications, although this should also be accompanied by corresponding guidance for panel members in how to interpret this new combined narrative approach to evidence.​

Conclusion

Recognising and rewarding software contributions can take as many forms as there are people writing code, reviewing pull requests, leading project meetings, researching users, creating documentation, and delivering training. Individuals are likely to all want and need different types of reward and recognition depending on their career stage, ambitions, personal motivations, and job description. However, the consensus in our speed blog is that the current approach, which is still largely focused on citations, downloads, or other quantitative metrics, when it is not focused on the impact factor of the journal, does not capture the full picture.

Efforts such as the Citation File Format, All Contributors bot, the Software Authorship and Contribution Task Force, and the Journal of Open Source Software all seek to address parts of the puzzle of recognition, but none are expected to comprehensively cover the full spectrum of individual needs for recognition. There is also a major challenge in developing wider acceptance of new approaches to recognition and reward in the context of research software, especially in communities that have long-established structures that are fundamental to the way that they manage career progression.

We are unlikely to find a 'one-size-fits-all' solution that adequately captures the complexity and reality of software projects, but we propose that a mixed-methods approach to assessment, combining quantitative metrics with more in-depth user and impact stories, should be pursued as a path forward.

If you're interested in advocating for improved recognition and reward for research technical professionals:

  • Join communities and projects that are working on these problems, in addition to those referenced above: STEP-UP, The Turing Way, RSE Society, SSI itself.
  • Think about how you and your institution can recognise and reward people who contribute to research software in ways that they can meaningfully build on.
  • Look for opportunities to highlight the value of the work that technical professionals do, whether that's external awards (like HiddenREF or RSE Society Awards), institutional kudos (for example, staff recognition or open science awards), blog posts, or even a festival.

Authors (equal attribution)

  • Will Haese-Hill, University of Glasgow, william.haese-hill@glasgow.ac.uk, ORCID: 0000-0002-1393-0966, @haessar.bsky.social, GitHub
  • Tamora James, University of Sheffield, t.d.james@sheffield.ac.uk, ORCID: 0000-0003-1363-4742
  • Stephan Druskat, German Aerospace Center (DLR), stephan.druskat@dlr.de, ORCID: 0000-0003-4925-7248
  • Michael Sparks, University of Manchester, michael.sparks@manchester.ac.uk, ORCID: 0009-0001-3059-0000
  • Jonathan Cooper, UCL Advanced Research Computing Centre, j.p.cooper@ucl.ac.uk, ORCID: 0000-0001-6009-3542, LinkedIn
  • Jeremy Cohen, Imperial College London, jeremy.cohen@imperial.ac.uk, ORCID: 0000-0003-4312-2537
  • Jack Atkinson, University of Cambridge, jwa34@cam.ac.uk, ORCID: 0000-0001-5001-4812, @jatkinson1000, Mastodon
  • Arielle Bennett, The Alan Turing Institute, ariellebennettlovell@gmail.com, ORCID: 0000-0002-0154-2982, @arielleb.bsky.social
  • Adrian D'Alessandro, Imperial College London, a.dalessandro@imperial.ac.uk, ORCID: 0009-0002-9503-5777

Adding a little Carpentries Magic to Workshop Organisation at the Collaborations Workshop

Author(s): Jannetta Steyn, SSI fellow

Posted on 30 July 2025

Image: The Wallace Monument

This year I was fortunate enough to attend my fifth Collaborations Workshop and it was as enjoyable and exciting as the previous four.

It was held at the University of Stirling. I hadn't been to Stirling before; it's a really beautiful little city, with the Wallace Monument keeping an eye on all the goings-on.

As always, apart from listening to the inspirational keynotes, the idea of the Collaborations Workshop is to provide an opportunity for delegates to discuss topics proposed by attendees. Some of the topics are then put forward as suggestions for the hack day, and those that generate enough interest become projects, which small teams work on for about five hours.

During one of the discussions I mentioned my struggles with organising Carpentries workshops and the bash script I had written to try to automate some of the work. Another delegate, Colin Sauze, mentioned the problems he had experienced with workshop attendees who misjudge the level of skill required to attend. So we discussed the possibility of creating an online test that people interested in attending a workshop would need to pass before they could register. Putting these two ideas together sounded, to us, like a good topic for the hack day, so Colin pitched it for us during one of the next sessions.

We managed to get a team of six people together (five in person and one online) and decided to use my existing bash script and MariaDB database for creating workshop websites as a starting point. Hollie Rowland became our project manager, and Deborah Udoh and Tosan Okome worked on the online tests. Hui Ling worked on creating a CSV version of my MariaDB database, because Colin thought the database was too heavyweight for what we wanted. Colin worked on converting the bash script to work with the CSV file, and I worked on creating a calendar invite (.ics file) that could be included in emails about the workshop.
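
For readers curious what that last piece might involve: the sketch below is a hypothetical illustration (not the CarpentriesMagic code) of reading workshop details from a CSV file and writing a minimal .ics invite with Python's standard library. The CSV column names and file names are assumptions for the sake of the example.

```python
# Hypothetical sketch, not the CarpentriesMagic implementation.
import csv
from datetime import datetime, timezone


def make_ics(title, location, start, end, uid="workshop@example.org"):
    """Return minimal iCalendar (.ics) text for a single event."""
    fmt = "%Y%m%dT%H%M%SZ"
    stamp = datetime.now(timezone.utc).strftime(fmt)
    return "\r\n".join([
        "BEGIN:VCALENDAR",
        "VERSION:2.0",
        "PRODID:-//CarpentriesMagic//WorkshopAdmin//EN",
        "BEGIN:VEVENT",
        f"UID:{uid}",
        f"DTSTAMP:{stamp}",
        f"DTSTART:{start.strftime(fmt)}",
        f"DTEND:{end.strftime(fmt)}",
        f"SUMMARY:{title}",
        f"LOCATION:{location}",
        "END:VEVENT",
        "END:VCALENDAR",
        "",
    ])


if __name__ == "__main__":
    # Assumed CSV columns: title, location, start, end (ISO 8601 timestamps).
    with open("workshops.csv", newline="") as f:
        for row in csv.DictReader(f):
            invite = make_ics(
                row["title"],
                row["location"],
                datetime.fromisoformat(row["start"]).astimezone(timezone.utc),
                datetime.fromisoformat(row["end"]).astimezone(timezone.utc),
            )
            with open(f"{row['title'].replace(' ', '_')}.ics", "w") as out:
                out.write(invite)
```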

We had a tremendous amount of fun and pitched our hack at the end of the day with great enthusiasm. Our Collaborations Workshop ended on a real high when we were announced as second prize winners of the hack day.

There was a great deal of interest in our project, and I know many people have the same difficulties as me when organising workshops. So I have decided to carry on working on the project under the name we gave it on the hack day: CarpentriesMagic. (We'll have to ask Hollie how we came up with the name, because I can't remember.) I have registered an organisation on GitHub at https://github.com/CarpentriesMagic. There are three repositories at the moment: WorkshopAdmin (https://github.com/CarpentriesMagic/WorkshopAdmin), which contains the bash scripts; WorkshopAdminUI (https://github.com/CarpentriesMagic/WorkshopAdminUI), which is a Java-based GUI that works with a MariaDB database for adding workshops, helpers, instructors and so on; and a third repository that is simply an example of a workshop website created with the bash script. There is also a CarpentriesMagic channel in the Carpentries' Slack workspace and a mailing list on the Carpentries' TopicBox. If you are interested in using any of this or getting involved in the development, please get in contact. We would love for this project to become something that all Carpentries workshop organisers can use.

Thanks again to the Software Sustainability Institute for organising the Collaborations Workshop and for making funds available to its fellows to attend. It is still the highlight of my year!

 

MetaGreenData: Making Code More Sustainable, One Metadata File at a Time

Author(s): Jyoti Bhogal, SSI fellow

Posted on 22 July 2025

Images (L to R): Jyoti Bhogal in the backdrop of the ‘University of Stirling’ and the centrally located Airthrey Loch; The William Wallace Monument, also known as the National Wallace Monument, located on the Abbey Craig in Stirling, Scotland

Ever thought about how much energy your code consumes? Or how to make your software more eco-friendly? Well, during the recent Collaborations Workshop 2025 Hack Day, my team decided to tackle this very challenge. Let me walk you through our journey.

The third day of the Collaborations Workshop 2025 was a dedicated Hack Day. Attendees were invited to pitch ideas, and many wonderful ones were put forward. People then joined the team whose idea they were most interested in working on. Ultimately, 10 teams were formed, and they collaborated throughout an intense day, with amazing ideas taking shape by the end of it!

Earlier, on the second day of the workshop, I had attended a collaborative ideas discussion session on “MetaGreenData: standardised reporting of the environmental impact of compute”, chaired by Kirsty Pringle and joined by Bryn Ubald, Caterina Doglioni, Connor Aird, Jyoti Bhogal, and Saranjeet Kaur Bhogal. This session motivated me to join the “Group: D - Delphinium” team for the Hack Day (team members: Christina Bremer, Duncan Leggatt, Jyoti Bhogal, Loïc Lannelongue, Will Haese-Hill; joining remotely: Michael Sparks). I enjoyed collaborating with my team during the Hack Day - we divided the tasks amongst ourselves. Throughout the day the judges visited the different teams asking questions, and at the end of the day we were asked to give a presentation of our work. In this blog post, I share the product my team created by the end of the Hack Day!
Image: Group photo while working on the hack idea, (L to R) Caterina Doglioni, Will Haese-Hill, Duncan Leggatt, Loïc Lannelongue, Jyoti Bhogal, Christina Bremer

Motivation

While many developers are keen on optimising their code for speed and efficiency, not many consider the environmental impact. That's where my team saw the opportunity. The idea was simple: create a tool that helps developers understand and reduce the carbon footprint of their software.

Brainstorming the Solution

The brainstorming session was filled with energy and ideas. The aim was to come up with a solution that was:

  • User-friendly: Easy for developers to integrate into their workflow.
  • Informative: Provides clear insights into the environmental impact.
  • Actionable: Offers suggestions to reduce carbon emissions.

After much discussion, we decided to build a metadata generator that captures essential information about a piece of software, making it easier to assess and improve its sustainability.

Building MetaGreenData

Enter MetaGreenData. This is a Django-based web application designed to help developers generate metadata files for their software projects. Here's how it was designed:

  1. Understanding the Standards: To begin with, existing metadata standards, like CodeMeta and the Citation File Format (CFF), were explored. These standards provide a structured way to describe software, making it easier to share and cite.
  2. Designing the Workflow: Then a simple form was created where users can input details about their software. This included information like the software's name, version, authors, and more.
  3. Generating the Metadata: The tool was designed to generate a metadata file in the chosen format (CodeMeta or CFF) based on the input. This file can then be added to the software's repository (see the sketch after this list).
  4. Integrating Carbon Footprint Estimation: Next, the Green Algorithms calculator was integrated to estimate the carbon footprint based on the software's computational requirements. 
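
To give a concrete flavour of what such a generator produces, here is a minimal sketch, not the MetaGreenData code itself: it builds a small CITATION.cff file from form-style input (using the PyYAML package) and makes a deliberately rough, Green-Algorithms-style energy and carbon estimate. The field names, default power figures, and carbon intensity value are illustrative assumptions only.

```python
# Illustrative sketch only; not the MetaGreenData implementation.
import yaml  # pip install pyyaml


def build_cff(title, version, authors, date_released):
    """Return CITATION.cff content as a YAML string.

    `authors` is a list of (given_names, family_names) tuples.
    """
    cff = {
        "cff-version": "1.2.0",
        "message": "If you use this software, please cite it as below.",
        "title": title,
        "version": version,
        "date-released": date_released,
        "authors": [
            {"given-names": given, "family-names": family}
            for given, family in authors
        ],
    }
    return yaml.safe_dump(cff, sort_keys=False)


def estimate_carbon_g(runtime_h, n_cores, core_power_w=12.0,
                      memory_gb=16.0, mem_power_w_per_gb=0.3725,
                      pue=1.67, carbon_intensity_g_per_kwh=250.0):
    """Very rough carbon estimate (gCO2e) for a compute job.

    Simplified from the published Green Algorithms methodology; all the
    default values here are placeholder assumptions, not calibrated figures.
    """
    power_w = n_cores * core_power_w + memory_gb * mem_power_w_per_gb
    energy_kwh = runtime_h * power_w * pue / 1000.0
    return energy_kwh * carbon_intensity_g_per_kwh


if __name__ == "__main__":
    print(build_cff("MetaGreenData", "0.1.0",
                    [("Jyoti", "Bhogal")], "2025-05-15"))
    print(f"Estimated footprint: {estimate_carbon_g(2.0, 8):.1f} gCO2e")
```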

Visualising the Process

To make things clearer, here's a simple flowchart of how MetaGreenData works:


 

The Outcome

MetaGreenData is now available on GitHub! 🎉
The code lives in a GitHub repository called MetaGreenData, and we plan to develop it further and make it more user-friendly.

Last but definitely not least, this project won first prize at the Collaborations Workshop 2025 Hack Day. Here’s my team’s picture from the day!

Image credit: Software Sustainability Institute; (L to R) Christina Bremer, Will Haese-Hill, Duncan Leggatt, Loïc Lannelongue, Jyoti Bhogal, Joining remotely: Michael Sparks

And here’s my certificate:

Certificate of Achievement: First place at the Hack Day, Collaborations Workshop 2025, organised by the Software Sustainability Institute

References and Inspirations

The project was inspired and informed by several resources, including the CodeMeta and Citation File Format (CFF) standards and the Green Algorithms calculator.

MetaGreenData is just the beginning. The vision is of a future where sustainability is a core consideration in software development. By making it easier to assess and reduce the environmental impact of code, the hope is to inspire developers to make greener choices.
 

Collaborations Workshop 2025 Report

Author(s): Kyro Hartzenberg, Events Manager

Posted on 27 June 2025

Image: CW25 group photo

Collaborations Workshop 2025 (CW25) took place as a hybrid event from Tuesday 13 May to Thursday 15 May 2025 at Stirling Court Hotel, University of Stirling. Over the course of three days, CW25 brought together researchers, developers, innovators, managers, funders, publishers, policy makers, leaders, and educators to explore best practices and the future of research software.

Delegates explored the theme of “Future-proofing research software: evolving together as a diverse community”, delving into a variety of sessions to address the challenges and opportunities facing the research software community. 

Objectives and goals

CW25's key objective was to bring the community together to brainstorm and discuss what research software will look like in the next five years. It also aimed to increase confidence in, and understanding of, key topics for the research software community that fit into the wider aims and goals of the Software Sustainability Institute.

 

Work on skills and competencies for digital research professionals at CW25

Author(s): Aleksandra Nenadic (Training Team Lead), Phil Reed (SSI fellow), Jonathan Cooper, Tamora James

Posted on 18 June 2025

Image: CW25 logo, DIRECT

This year's Collaborations Workshop 2025 (CW25) brought together members of the community to explore and refine the evolving DIRECT framework - a community-developed framework mapping digital research skills and competencies to support diverse progression pathways for researchers who code, RSEs, data specialists, RSE group leads, research project leads, and others, and to help them find relevant resources or track and manage their professional profiles and development.

The creation of the DIRECT framework began two years ago at CW23 (originally called the "RSE Skills and Competencies Toolkit" - now extended to "Digital Research Competencies Framework" to encompass other roles and domains in digital research) and has been progressing ever since. With strong momentum behind it, CW25 presented the perfect opportunity to gather community input once again and collaboratively shape the next phase. We proposed a 30-minute workshop at CW25 to share progress and gather feedback, with the hope that it would spark enough interest to carry the work forward into the Hack Day—which, fortunately, it did.

The workshop 

The workshop was held on Wednesday 14 May (CW25 Day 2) and was co-led by Aleksandra Nenadic, Phil Reed, and Aman Goel (University of Manchester), with contributions from collaborators including David Horsfall (Newcastle University), Adrian D'Alessandro (Imperial College London), Jonathan Cooper (UCL), and Eli Chadwick (University of Manchester), all of whom have been closely involved in shaping the framework over the past two years.

The session invited participants to reflect on how well the existing version of the framework captures the skills needed across a range of domains, with a particular focus on AI, HPC and general systems infrastructure, domain-specific research and professional (non-technical) competencies.

Following brief introductions and a video overview recorded by Dave Horsfall, the 20+ participants rolled up their sleeves for a hands-on review of the framework, providing feedback via shared documents. These live contributions not only helped assess the current structure but also helped to fill gaps in the list of training resources and seeded ideas for future improvements and collaboration. The session concluded with a pitch for a Hack Day project (the Hack Day traditionally takes place on Day 3 of the Collaborations Workshop) to continue shaping the framework and extend the work on its practical implementation as a Django web application.

The Hack Day 

During the Hack Day, we split into two focused sub-groups: one dedicated to further refining the framework content and its documentation (Phil Reed, Aleksandra Nenadic and Patricia Loto), and another building a Django-based web application (Adrian D’Alessandro, Bryn Ubald, Andrew Gait, Tamora James, Connor Aird, Ryan Smith) to make the framework interactive and accessible to end users. 

The framework team worked toward a version 1.0 release, incorporating feedback from the earlier workshop held the day before, streamlining structure, and capturing missing skills identified by contributors. Meanwhile, the web app team developed essential features such as skill browsing and user profile creation, and made progress towards competency visualisation using "competency wheels." The app will allow individuals and teams to self-assess, compare skill sets, and even define templates for key roles - e.g. for data scientists, archivists, or RSEs with HPC specialisms.
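
To make the shape of such an app more concrete, here is a hypothetical sketch of the kind of Django data model that could sit behind skill browsing, self-assessment, and role templates; the model and field names are assumptions for illustration and are not taken from the team's actual codebase.

```python
# Hypothetical Django models illustrating one possible data model;
# not the DIRECT web app's actual code.
from django.conf import settings
from django.db import models


class Skill(models.Model):
    """A single skill or competency in the framework."""
    name = models.CharField(max_length=200)
    category = models.CharField(max_length=100)  # e.g. "AI", "HPC", "Professional"
    description = models.TextField(blank=True)

    def __str__(self):
        return self.name


class RoleTemplate(models.Model):
    """A template set of skills for a role, e.g. 'RSE with HPC specialism'."""
    name = models.CharField(max_length=200)
    skills = models.ManyToManyField(Skill, related_name="role_templates")


class Profile(models.Model):
    """A user's self-assessed competency profile."""
    user = models.OneToOneField(settings.AUTH_USER_MODEL, on_delete=models.CASCADE)
    skills = models.ManyToManyField(Skill, through="SkillLevel")


class SkillLevel(models.Model):
    """Self-assessed level for one skill; these values would feed a 'competency wheel'."""
    profile = models.ForeignKey(Profile, on_delete=models.CASCADE)
    skill = models.ForeignKey(Skill, on_delete=models.CASCADE)
    level = models.PositiveSmallIntegerField(default=0)  # e.g. 0 (none) to 4 (expert)
```

Comparing a profile's levels against a RoleTemplate would then be a straightforward query, which is what makes features like role comparison and team-level views feasible.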

The event attracted several first-time CW Hack Day participants, including contributors from the humanities, showing how widely applicable and inclusive the framework could become. The group made effective use of GitHub Projects to organise their tasks (28 planned for the Hack Day) and pull requests (12 PRs made, closing 18 issues/tasks, with 3 more in progress), fostering transparency, collaboration, and accountability throughout the day. There was a strong emphasis on sustainable development, good documentation, and attribution for all contributors, including those working outside GitHub.

Image: Participants at the Hack Day

Future Work 

We have several things lined up for 2025:

  • Dave Horsfall’s leadership of the DIRECT framework development is now funded by the UKRI Digital Research Infrastructure programme as part of the DisCouRSE NetworkPlus grant. We hope that flexible funding from this network will provide extra dedicated effort next year.
  • As part of his SSI Fellowship, Phil Reed is taking the DIRECT Framework to the UK-Ireland Digital Humanities Association annual event DHA25, as a workshop to capture and compare the voices of digital humanities researchers with the mostly STEM-focused work conducted so far.
  • The University of Manchester is hosting an Open Research Conference on 9-10 June 2025. Aleks Nenadic and Phil Reed are presenting a talk about the DIRECT Framework and its benefits within the wider open research movement.
  • The DIRECT Framework is featured as part of two workshop proposals for RSECon25 in September.
  • The team are in discussions with Australian Research Data Commons exploring common interests and opportunities.

Call for involvement

If you're passionate about helping researchers better articulate and develop their technical and non-technical competencies, keep an eye on the DIRECT framework—or better yet, get involved! Connect with the contributors, join the conversation, and help us shape the future of digital research skills.

 

CW25 - That's a wrap!

Author(s): Kyro Hartzenberg, Events Manager

Posted on 26 May 2025

Image: CW25 group photo

Collaborations Workshop 2025 (CW25) brought together researchers, developers, innovators, managers, funders, publishers, policy makers, leaders, and educators to explore best practices and the future of research software.

Over the course of three days, delegates explored the theme of “Future-proofing research software: evolving together as a diverse community”, delving into a variety of sessions to address the challenges and opportunities facing the research software community. 

Highlights of CW25 included thought leaders sharing insights into the evolving landscape of research software, thereby sparking meaningful conversations and reflections. CW25 featured interactive workshops, where delegates could dive deep into topics such as how to make software that others want to use, improving carbon literacy for researchers, and safeguarding research and culture - just to name a few. Delegates also had opportunities to propose and develop innovative solutions to issues, promoting teamwork and creativity.

On the third and final day, delegates were invited to take part in the Hack Day, where teams came together to transform pitches into outputs, demonstrating the power of collaboration and innovation.

CW25 also offered scheduled and informal socialising opportunities - from a traditional Scottish ceilidh to card games and scenic walks, allowing delegates to form new connections and build on existing friendships, outside of the official programme.

We look forward to continuing the conversations and collaborations that came from CW25, and hope to see new and returning delegates at CW26.

Stay tuned for access to the recordings and a more detailed report on CW25. In the meantime, have a look at the amazing pictures in our CW25 Album.

 

CW25 in-person tickets are officially sold out!

Author(s): Kyro Hartzenberg, Events Manager

Posted on 28 April 2025

Image: CW25 logo, a road heading forward

In-person tickets for Collaborations Workshop 2025 (CW25) are now officially sold out. 

Remote tickets will remain open via Eventbrite until Tuesday 6 May (or until sold out). Remote tickets include access to, and participation in, all sessions, including keynotes, panel discussions, lightning talks, collaborative ideas sessions, and hack day sessions.

Event Overview

CW25 will centre around future-proofing research software and how we evolve together as a diverse community. Over three days, delegates will be invited to take part in discussion sessions, collaborative ideas sessions, and hack day sessions, to explore and create together.

Event Programme

The full event programme is now available.

CW25 will feature an impressive lineup of keynote speakers and expert panellists, all of whom are outlined in the full programme.

Delegates can also start familiarising themselves with the scheduled mini-workshops and lightning talks, which will focus on the changes in the landscape, what research software will look like in the future, and the challenges facing the research software community.

Date and Location

CW25 will take place as a hybrid event from Tuesday 13 May to Thursday 15 May 2025 at Stirling Court Hotel, University of Stirling. We encourage all delegates to familiarise themselves with the CW25 Venue Guide.

Stirling is a beautiful, historic city with easy access to wild landscapes and outdoor activities. It is also ideally placed as a gateway to the Scottish Highlands and is only a short train trip from Edinburgh or Glasgow.
