Introduction to Green Computing

Author(s)

Jyoti Bhogal

SSI fellow

Caterina Doglioni

Estimated read time: 15 min

This guide helps you understand how everyday coding, data use, or AI tools contribute to environmental impacts, and what you can realistically do to measure such impacts. It offers practical starting points: how to measure, reflect, and improve. Whether you're a researcher, developer, or policymaker, it helps you make more informed and responsible decisions.

What is green computing?

According to the Brundtland Commission (1987), sustainability is defined as “meeting the needs of the present without compromising the ability of future generations to meet their own needs.” There are several definitions of environmentally sustainable (also called “green”) computational workflows (see e.g. [1], or the definitions given by IBM or Green Compute UK). We follow the definition of sustainability above, and narrow it down to (1) activities concerning computing, and (2) preserving the environment as one of the enablers to meet future generations’ needs.

In this guide, computing encompasses every aspect and resource that is necessary for a computational workflow. Broadly speaking, this includes the hardware, software, and data that enable these workflows.

These elements have an impact on the environment, including (but not limited to) the energy cost to power hardware when running software workflows or storing and moving data, the impact on natural resources of mining the material needed to build the hardware, the water needed both to produce energy and to cool the hardware, as well as the electronic waste created during hardware disposal.

Therefore, we define green computing as practices that minimise the negative impact of computing on the environment.

Why is it necessary?

The environmental impact of computing, in research and beyond, has historically been under-recognised. Users rely on computing hardware and electricity costs that are not immediately visible. For example, in the research sector, organisations generally cover the computing costs (e.g. electricity bills) centrally, creating the impression that computing power is a free resource. Another example is the use of generative AI tools (e.g. large language models) or software-as-a-service: the user pays a subscription for the computing power running in a data centre, but does not directly see the energy consumption and impacts related to the data centre.

However, computing has a significant impact in terms of environmental resources. Examples of these impacts are below:

The global electricity consumption of data centres is a driver of greenhouse gas (GHG) emissions, and in particular CO2 emissions, which are estimated at 126 MtCO2e for 2020 [6]. While emissions depend on how electricity is sourced, electricity consumption from data centres is projected to at least double from 2024 to 2030 and could reach around 3% of global electricity consumption, as reported in the Energy and AI report by the International Energy Agency (IEA). In particular, an affordable, reliable and sustainable electricity supply will be necessary for further developments of AI, as a typical AI-focused data centre consumes as much electricity as 100,000 households.
Computing systems, particularly data centres and AI infrastructure, can consume significant amounts of water, directly for cooling, and indirectly, for electricity generation [2].
Data centres are localised consumers of electricity and water, as well as CO2 and GHG emitters. This aspect can be exacerbated by the presence of generators required for continuous uptime or consumption spikes. This puts pressure on the community, infrastructure and ecosystem where they are located, and creates challenges for local integration.
Computing hardware relies on the extraction of rare earth elements and critical minerals, whose mining processes can cause significant environmental damage, including large-scale waste generation, land disruption, and pollution.
Electronic waste (e-waste) is one of the fastest-growing waste streams globally and contains hazardous materials that can harm ecosystems and human health. It is projected to reach 82 million tonnes per year by 2030.

The main goal of this introduction is to raise awareness of the environmental impact of computing. Because this impact and its costs are diffuse and not borne by the individual, it is not directly visible. Quantifying and surfacing this impact enables more informed and responsible decision-making in the use, design and execution of computational workflows. Moreover, the impact of an overall mindset [3][4] and policy change that requires computing providers to be transparent on their environmental impact, and subsequently mitigate it, can be sizable.

Furthermore, this guide provides pointers on how to make one’s own computational workflow more environmentally sustainable, adding an environmental cost dimension to computational workflow optimisation. Implementing and quantifying such improvements not only leads to an improved understanding of the environmental consequences of computational choices, but also motivates broader adoption of sustainable practices, including seeking out, developing and applying technological improvements that can be implemented at scale. To name a few examples: work from the University of Glasgow [5] studying more power-efficient computing chips led to their implementation in large-scale, high-throughput scientific computing infrastructures [6]; a case study by Finnish and African researchers shows that a redesign of their software architecture reduced energy consumption by more than 60%; a study by AI researchers from Carnegie Mellon University and Hugging Face reports on modelling the optimisation of large language models in real-world settings as a prerequisite for implementation [7]; finally, shifting the execution of computing workflows to different times, geographical locations or compute resources can reduce energy demands [8][9].

It should also be noted that improved energy efficiency can drive increased demand, a rebound effect known as the Jevons paradox [10][11]: when a resource can be used more efficiently, it also becomes cheaper to use, which leads to more extensive use. This effect is particularly noticeable in computing workflows involving large AI models, whose demand keeps growing as energy efficiency is improved [12]. Other such examples exist, and qualitative and quantitative research is ongoing to inform effective mitigation techniques [13]. Nevertheless, understanding why and how to improve environmental sustainability aspects is still a crucial factor in the overall picture.

Software sustainability, understood through the FAIR4RS [14] principles and advocated by the SSI, is fundamental to individual and collective green computing strategies: building reproducible, well-documented and overall better code encourages a more thought-through approach to the consequences of computing workflows on others.

Measuring the environmental impact of a computational workflow: a quickstart guide

An essential first step in reducing the environmental impact of computational workflows is measuring it. This information is useful to workflow developers and users, with examples of self-reporting in [15], and it is increasingly becoming part of recommendations to policy makers and funders (see the examples of the UK [16] and France [17]).

However, measuring the entire environmental impact of a computational workflow is complex. This process encompasses hardware production, transport, use and disposal, and mirrors how we might assess the environmental cost of a can of tomato sauce: natural resources are impacted by growing the tomatoes, processing, packaging, transport to the supermarket, and waste management. An accurate measurement of all these components requires a full Lifecycle Assessment (LCA) following international standards, such as ISO 14040:2006 and ISO 14044:2006.

To make this task conceptually more manageable, we split the environmental impacts of a computational workflow into two categories: embodied impacts, related to the manufacturing and disposal of the computing hardware, and operational impacts, stemming from the energy consumed to run the computational workflow. Given this complexity, it is important to define the boundaries of a computational workflow measurement, as suggested by the Green Software Foundation’s Software Carbon Intensity (SCI) specification (ISO/IEC 21031:2024), which defines a methodology to calculate the rate of carbon emissions for a software system. This approach involves several steps: first, deciding what to include in the measurement; second, choosing a standardised measure of what each component is doing (e.g. inference for AI systems, or reading a file for data analysis software) to understand how the environmental impacts will scale; third, defining a measurement technique for each of these components and units; and finally, measuring and reporting the results and the chosen methodologies.
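As a rough illustration of how these steps combine, the SCI specification expresses the result as a rate, SCI = ((E × I) + M) per R, where E is the energy consumed, I the carbon intensity of the electricity, M the embodied emissions attributed to the software, and R the functional unit. The sketch below is a minimal Python rendering of that formula. The function name and all numbers are hypothetical placeholders, not measurements.

```python
def software_carbon_intensity(energy_kwh, grid_intensity_g_per_kwh,
                              embodied_g, functional_units):
    """Return gCO2e per functional unit: ((E * I) + M) / R."""
    if functional_units <= 0:
        raise ValueError("functional_units must be positive")
    # Operational emissions: energy used times the grid's carbon intensity.
    operational_g = energy_kwh * grid_intensity_g_per_kwh
    return (operational_g + embodied_g) / functional_units


# Hypothetical example: 1000 inference requests consumed 2 kWh on a grid
# emitting 200 gCO2e/kWh, with 100 gCO2e of embodied hardware emissions
# attributed to this workload.
sci = software_carbon_intensity(2.0, 200.0, 100.0, 1000)
print(f"{sci:.2f} gCO2e per request")  # prints 0.50 gCO2e per request
```

Reporting the result per functional unit (here, per request) is what lets the figure be compared as the workload scales up or down.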

While this introduction does not give a complete list of methods and tools to perform a lifecycle assessment or calculate the SCI, it can serve as a quick-start guide with resources that can be used as a first step for individual computational workflows, in terms of:

Embodied impacts: these require a full lifecycle assessment of the hardware, which often needs to be done by the hardware manufacturers (see e.g. this example). Relevant information is starting to be collected in databases such as the Boavizta database [18].
Operational impacts: An operational metric for the environmental impact of computational workflows is their energy consumption during execution, which can then be transformed into other indicators of environmental impact, such as CO2 emissions. This can, for example, be estimated either with measurements while the software runs (see [19, 20] for a list of tools that can be used by software practitioners) or through modelling using online calculators [21]. It is important to note that measuring and reducing energy consumption (including the optimisation of idle times, where the computing servers still consume energy) is in active development for High Performance Computing, see for example [22].
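To illustrate the modelling route, the sketch below estimates energy use from runtime, core count, memory and the data centre's power usage effectiveness (PUE), and then converts it to CO2e using a grid carbon intensity. It is a simplified rendering of the kind of model behind online calculators such as [21], not their implementation. The function names and every numeric value are hypothetical placeholders to be replaced with figures for your own hardware and location.

```python
def estimate_energy_kwh(runtime_h, n_cores, watts_per_core, core_usage,
                        memory_gb, watts_per_gb, pue):
    """Energy (kWh) = runtime x (core power + memory power) x PUE / 1000."""
    core_watts = n_cores * watts_per_core * core_usage
    memory_watts = memory_gb * watts_per_gb
    return runtime_h * (core_watts + memory_watts) * pue / 1000.0


def estimate_emissions_g(energy_kwh, grid_intensity_g_per_kwh):
    """Convert energy into gCO2e using the carbon intensity of the grid."""
    return energy_kwh * grid_intensity_g_per_kwh


# Hypothetical job: 10 hours on 8 fully used cores drawing 10 W each,
# 32 GB of memory at 0.375 W/GB, a PUE of 1.5, and a 250 gCO2e/kWh grid.
energy = estimate_energy_kwh(10, 8, 10.0, 1.0, 32, 0.375, 1.5)
emissions = estimate_emissions_g(energy, 250.0)
print(f"{energy:.2f} kWh, {emissions:.0f} gCO2e")  # prints 1.38 kWh, 345 gCO2e
```

Because the same energy figure yields very different emissions on different grids, it helps to report both the energy and the carbon intensity assumed.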

Building on these measurement approaches, the reader can then go on to monitor and moderate the environmental impact of their software workflow. In the guides following this introduction, the reader will find further information and checklists on good practices for green data, code efficiency assessment, and writing energy-efficient code on appropriate hardware.

Takeaway message and some next steps

Green computing doesn’t require perfection, as it can start with awareness and small, intentional changes. Begin by measuring your current impact, then look for simple optimisations, like reducing unnecessary runs or improving efficiency. Over time, these small steps can scale into meaningful change. As a next step, you can try one of the tools below to estimate your footprint and explore ways to integrate sustainability into your regular workflow. Finally, the other SSI green computing guides highlight further practical steps on how to minimise the operational impacts on the environment of your computing workflows.

Useful resources

Here are some tools useful for estimating the carbon footprint of a computational workflow (also included in [19, 20]):

Tools to use if you want to run the code and see the impact:
- CodeCarbon: https://codecarbon.io/
- Scaphandre: https://github.com/hubblo-org/scaphandre
- CarbonTracker: https://carbontracker.info/
Tools to use if you want to see the impact without running the code. These do a post-hoc analysis based on models, without execution:
- Green Algorithms calculator: https://calculator.green-algorithms.org/
- ML CO₂ Impact calculator: https://calculator.linkeddata.es/

Declaration of Delegation to Generative AI (GAIDeT)

The authors declare the use of generative AI in the research and writing process. According to the GAIDeT taxonomy (2025), the following tasks were delegated to GAI tools under full human supervision: Proofreading and editing, Formulation of conclusions, Citation formatting.

The GAI tools used were: Claude Haiku 4.5, Sonnet 4.5, ChatGPT 4.5.

Responsibility for the final manuscript lies entirely with the authors. GAI tools are not listed as authors and do not bear responsibility for the final outcomes.

Declaration submitted by: Jyoti Bhogal, Caterina Doglioni

Additional note: We used Claude models to help with suggestions of transitions between paragraphs to increase coherence, and ChatGPT to generate an initial draft of the introduction and take-away messages. We recognise the environmental impact of the use of GenAI for text editing, as well as the difficulty in obtaining an estimate that is relevant for this guide. As a first step, based on this link and this link, we estimate the energy use involved in queries concerning this guide between 3 Wh (using 10 “average queries” as lower limit) and 400 Wh (10 x Claude code sessions as upper limit), where the system boundaries cover exclusively the operational costs of model queries (inference).

Citations

[1] S. Murugesan. 2008. Harnessing Green IT: Principles and Practices. IT Professional 10, 1 (January-February 2008), 24–33. https://doi.org/10.1109/MITP.2008.10

[2] David Mytton. 2021. Data centre water consumption. npj Clean Water 4, 11 (2021). https://doi.org/10.1038/s41545-021-00101-w

[3] Heike Rau, Stephan Nicolai, and Silja Stoll-Kleemann. 2022. A systematic review to assess the evidence-based effectiveness, content, and success factors of behavior change interventions for enhancing pro-environmental behavior in individuals. Frontiers in Psychology 13 (2022), 901927. https://doi.org/10.3389/fpsyg.2022.901927

[4] Hunt Allcott. 2011. Social norms and energy conservation. Journal of Public Economics 95, 9-10 (2011), 1082–1095. https://doi.org/10.1016/j.jpubeco.2011.03.003

[5] Emanuele Simili, Gordon Stewart, Samuel Skipsey, Dwayne Spiteri, Albert Borbely, and David Britton. 2024. ARMing HEP for the future Energy Efficiency of WLCG sites (ARM vs. x86). EPJ Web of Conferences 295 (2024), 11007. https://doi.org/10.1051/epjconf/202429511007

[6] Jens Malmodin, Nina Lövehagen, Pernilla Bergmark, and Dag Lundén. 2024. ICT sector electricity consumption and greenhouse gas emissions – 2020 outcome. Telecommunications Policy 48, 3 (2024), 102701. https://doi.org/10.1016/j.telpol.2023.102701

[7] Jillian Fernandez, Chaitanya Na, Vivek Tiwari, Yonatan Bisk, Sasha Luccioni, and Emma Strubell. 2025. Energy considerations of large language model inference and efficiency optimizations. In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 32556–32569

[8] Felipe Oviedo, Fiodar Kazhamiaka, Esha Choukse, Allen Kim, Amy Luers, Melanie Nakagawa, Ricardo Bianchini, and Juan M. Lavista Ferres. 2025. Energy Use of AI Inference: Efficiency Pathways and Test-Time Compute. arXiv:2509.20241 [cs.LG]. https://arxiv.org/abs/2509.20241

[9] Nicolas Tirel, Philippe Roose, Sergio Ilarri, Adel Noureddine, and Olivier Le Goaër. 2025. Workload Shifting Techniques: From Digital Inebriation to Sobriety. ACM Computing Surveys 58, 5 (2025), 1–36. https://doi.org/10.1145/3769301

[10] Blake Alcott. 2005. Jevons' paradox. Ecological Economics 54, 1 (2005), 9–21. https://doi.org/10.1016/j.ecolecon.2005.03.020

[11] Steve Sorrell. 2009. Jevons' Paradox revisited: The evidence for backfire from improved energy efficiency. Energy Policy 37, 4 (2009), 1456–1469. https://doi.org/10.1016/j.enpol.2008.12.003

[12] Dongyang Yu and Bingjie Xu. 2026. The Jevons Paradox in the AI era: Artificial intelligence adoption for enhancing environmental sustainability at the firm level. Economic Analysis and Policy 90 (2026), 946–966. https://doi.org/10.1016/j.eap.2026.01.060

[13] Alexandra Sasha Luccioni, Emma Strubell, and Kate Crawford. 2025. From Efficiency Gains to Rebound Effects: The Problem of Jevons' Paradox in AI's Polarized Environmental Debate. In Proceedings of the 2025 ACM Conference on Fairness, Accountability, and Transparency (FAccT '25). Association for Computing Machinery, New York, NY, USA, 76–88. https://doi.org/10.1145/3715275.3732007

[14] Michelle Barker, Neil P. Chue Hong, David S. Katz, et al. 2022. Introducing the FAIR Principles for research software. Scientific Data 9 (2022), 622. https://doi.org/10.1038/s41597-022-01710-x

[15] Y. Qin, A. S. Havulinna, Y. Liu, et al. 2022. Combined effects of host genetics and diet on human gut microbiota and incident disease in a single population cohort. Nature Genetics 54 (2022), 134–142. https://doi.org/10.1038/s41588-021-00991-z

[16] Martin Juckes, Michael Bane, Jennifer Bulpett, Katie Cartmell, Miranda MacFarlane, Molly MacRae, Alex Owen, Charlotte Pascoe, and Poppy Townsend. 2023. Sustainability in Digital Research Infrastructure: UKRI Net Zero DRI Scoping Project final technical report. Zenodo, August 1, 2023. https://doi.org/10.5281/zenodo.8199984

[17] How to include environmental sustainability criteria in national AI funding schemes? Reflecting on the example of France and the Green Algorithms tool. Zenodo, January 7, 2025. https://doi.org/10.5281/zenodo.14607021

[18] Thibault Simon, David Ekchajzer, Adrien Berthelot, Eric Fourboul, Samuel Rince, and Romain Rouvoy. 2025. BoaviztAPI: A Bottom-Up Model to Assess the Environmental Impacts of Cloud Services. SIGENERGY Energy Informatics Review 4, 5 (December 2024), 84–90. https://doi.org/10.1145/3727200.3727213

[19] Rene Schiffmann. 2025. UofM-Green-Compute/Energy-Evaluation-Tools: First release of Green Computing Tools Table (v1.0). Zenodo, 2025. https://doi.org/10.5281/zenodo.16780717

[20] Boavizta. ICT Sustainability Tools. Retrieved March 31, 2026, from https://boavizta.github.io/ict-sustainability-tools/

[21] Loïc Lannelongue, Jason Grealey, and Michael Inouye. 2021. Green Algorithms: Quantifying the Carbon Footprint of Computation. Advanced Science 8, 12 (June 2021), 2100707. https://doi.org/10.1002/advs.202100707

[22] Pawel Czarnul, Jerzy Proficz, and Adam Krzywaniak. 2019. Energy-Aware High-Performance Computing: Survey of State-of-the-Art Tools, Techniques, and Environments. Scientific Programming 2019 (2019), 8348791. https://doi.org/10.1155/2019/8348791

Acknowledgements

This guide was written by Jyoti Bhogal, Independent Researcher & SSI Fellow, and Caterina Doglioni, University of Manchester. The guide was reviewed by Loïc Lannelongue and Diego Alonso Alvarez.

Getting Started with ML and AI in Research Software

Author(s)

Paul J. Wright

Estimated read time: 5 min

Machine learning (ML) and artificial intelligence (AI) have quickly become part of the modern research toolkit, changing how scientific software is built and maintained. As their use grows, it is important to maintain fundamental principles of research software—reusability, reproducibility, and transparency. This guide introduces the paradigm shift from rule-based to experiment-focused software and outlines practices and tools for designing research software to support ML tasks.

Deterministic Foundations of Research Software

In traditional research software, computational methods transform data into results, from imaging and data analysis to large-scale scientific simulation. These programs are typically deterministic, meaning the same input always produces the same output. The developer explicitly defines the transformation function (for example, y=f(x)) and, because the behaviour is fully determined by the program logic, the results are repeatable, verifiable, and transparent. These are properties that standard quality assurance practices, such as unit testing, rely on.

ML departs from this paradigm. Instead of the developer manually defining the transformation function, the ML model approximates f(x) during training by estimating its parameters from example input-output pairs (x, y). The trained model is therefore shaped by the learning algorithm, the training and validation data, the optimisation process, and configuration choices.

The accompanying figure contrasts a deterministic system with three independently trained models, each receiving the same input but producing different outputs across training runs.
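The contrast can be reproduced in a few lines. The sketch below is a toy, standard-library-only illustration; the function names and the tiny dataset are hypothetical. A hand-written rule always returns the same answer, while a one-parameter model fitted with a few steps of stochastic gradient descent ends up with slightly different weights depending on the random seed.

```python
import random


def rule_based(x):
    """Deterministic: the developer wrote the transformation explicitly."""
    return 2.0 * x


def train(seed, steps=20, learning_rate=0.01):
    """Fit y = w * x by stochastic gradient descent from a random start."""
    rng = random.Random(seed)
    data = [(x, 2.0 * x) for x in (1.0, 2.0, 3.0, 4.0)]
    w = rng.uniform(-1.0, 1.0)                # random initialisation
    for _ in range(steps):
        x, y = rng.choice(data)               # random sampling of examples
        gradient = 2.0 * (w * x - y) * x      # derivative of squared error
        w -= learning_rate * gradient
    return w


print("rule-based output for x=3:", rule_based(3.0))
for seed in (0, 1, 2):
    w = train(seed)
    print(f"seed={seed}: learned w={w:.4f}, output for x=3: {w * 3.0:.4f}")
```

Re-running `train` with the same seed reproduces the same weight, which is why the seed belongs in the experimental record alongside the code.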

This paradigm shift affects governance and reproducibility. In deterministic systems, codebase version control is generally sufficient, since the program logic fully determines the outcome. In ML systems, the code specifies how training should proceed but does not capture the experimentation itself: the data used, the hyperparameters tried, and the random seeds of countless configurations along the way. It is entirely possible to produce a high-performing model with no reliable record of its construction. The artefact exists; the path back to it does not.

Reproducibility, Governance, and Supporting Tools

This shift moves researchers from a script that produces a result to an experiment that produces a model. Reproducibility now relies on capturing all conditions under which training occurred.

Fortunately, established ML Engineering and MLOps practices offer ways to manage this complexity. MLOps extends software engineering principles to ML, focusing on automation, reproducibility, and governance throughout the entire ML model lifecycle. Standard workflows typically feature dataset versioning, training pipelines, experiment tracking, model registries, continuous integration and deployment, production monitoring, and environment reproducibility. Collectively, these practices reduce ad hoc work, making the experimental record a first-class artefact alongside the model itself. In practice, this means being deliberate about three key areas, with supporting tools discussed at the end of this section:

The first is data. Reliable ML systems depend on clear data provenance and reproducible preprocessing. Best practices include capturing raw data hashes, version-controlling data preprocessing pipelines, and validating that incoming data conforms to expected schemas and distributions. Generating distinct training, validation, and test data is essential to prevent data leakage, with the test set only used for final evaluation. The dataset should be tagged with its version. Any data augmentation during training must be tracked and repeatable (applied offline and saved, or with a chosen random seed). Without these safeguards, a model's provenance becomes opaque: even perfectly versioned code cannot reveal which data actually shaped the model.
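Two of these safeguards, hashing the raw data and making the data split repeatable, need nothing beyond the standard library. The following is a minimal sketch under stated assumptions: the helper names, the split fractions, the seed value and the inline bytes standing in for a raw data file are all hypothetical.

```python
import hashlib
import random


def sha256_of_bytes(raw):
    """Fingerprint raw data so a later run can confirm it is unchanged."""
    return hashlib.sha256(raw).hexdigest()


def split_indices(n_items, seed, val_fraction=0.1, test_fraction=0.1):
    """Return disjoint train/validation/test index lists, fixed by the seed."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    n_test = int(n_items * test_fraction)
    n_val = int(n_items * val_fraction)
    test = indices[:n_test]
    val = indices[n_test:n_test + n_val]
    train = indices[n_test + n_val:]
    return train, val, test


# Stand-in for the bytes of a raw data file (hypothetical content).
raw = b"x,y\n1,2\n2,4\n3,6\n"
record = {"data_sha256": sha256_of_bytes(raw), "split_seed": 42}
train, val, test = split_indices(100, seed=record["split_seed"])
print(record["data_sha256"][:12], len(train), len(val), len(test))
```

Storing the hash and the seed together with the dataset version tag makes it possible to check later that a model was trained on exactly the data it claims.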

The second is the training process itself. Model behaviour is highly sensitive to choices such as architecture, optimisation strategy, random initialisation, and evaluation metrics. These sources of variability must be systematically captured through experiment tracking. Without such tracking, a model may be impossible to reproduce or explain, and subsequent runs may behave unpredictably.
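Dedicated platforms handle this at scale, but the underlying idea can be sketched with a plain append-only log. The example below is an illustrative assumption rather than any particular tool's API: the function names, the record fields and the `runs.jsonl` file name are hypothetical.

```python
import json
import os
import tempfile
import time


def make_run_record(config, seed, data_sha256, metrics):
    """Collect what is needed to trace a training run back to its inputs."""
    return {
        "timestamp": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
        "config": config,
        "seed": seed,
        "data_sha256": data_sha256,
        "metrics": metrics,
    }


def append_run(path, record):
    """Append one run as a single JSON line; earlier runs are never edited."""
    with open(path, "a", encoding="utf-8") as handle:
        handle.write(json.dumps(record, sort_keys=True) + "\n")


def load_runs(path):
    """Read back every recorded run, oldest first."""
    with open(path, encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]


# Hypothetical usage, written to a temporary directory for the demonstration.
with tempfile.TemporaryDirectory() as folder:
    log_path = os.path.join(folder, "runs.jsonl")
    append_run(log_path, make_run_record(
        {"learning_rate": 0.01, "architecture": "linear"}, 0, "abc123",
        {"validation_loss": 0.42}))
    print(len(load_runs(log_path)), "run(s) recorded")
```

An append-only format means failed and discarded runs stay on record too, which is often what explains how the final model was reached.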

Third is model evaluation. Offline metrics alone are insufficient; models should be validated against datasets that were previously excluded from the training process. Take care not to exhaust this held-out data through repeated model evaluation: each time it informs a modelling decision, it becomes less independent of the model. When data arrives continuously, production monitoring provides an additional source of out-of-sample evaluation and is essential for detecting data drift and model degradation over time.
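As one very simple illustration of drift monitoring, the sketch below flags an incoming batch whose mean has moved away from the training data by more than a chosen number of training standard deviations. This is a crude heuristic chosen for brevity, not a recommended detector; the function names, the threshold and the sample values are hypothetical.

```python
import statistics


def mean_shift_in_std_units(reference, incoming):
    """Distance between batch means, measured in reference standard deviations."""
    reference_std = statistics.stdev(reference)
    if reference_std == 0:
        raise ValueError("reference values have zero variance")
    shift = statistics.fmean(incoming) - statistics.fmean(reference)
    return abs(shift) / reference_std


def has_drifted(reference, incoming, threshold=1.0):
    """Flag a batch whose mean moved more than `threshold` standard deviations."""
    return mean_shift_in_std_units(reference, incoming) > threshold


training_feature = [1.0, 2.0, 3.0, 4.0, 5.0]
print(has_drifted(training_feature, [1.5, 2.5, 3.5, 4.5]))  # prints False
print(has_drifted(training_feature, [6.0, 7.0, 8.0]))       # prints True
```

A check on the mean alone misses changes in spread or shape, so real monitoring usually compares whole distributions and tracks model performance as well.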

Several tools support these practices directly, spanning orchestration, versioning, tracking, and model management:

Pipeline orchestration tools, such as Prefect, manage and schedule workflows, while purpose-built ML pipeline frameworks (for example, ZenML) provide abstractions tailored to ML workloads.
Data versioning tools, such as Data Version Control (DVC), extend version control to datasets and data preprocessing steps, linking them directly to model outputs. This helps ensure experiments can be reliably reproduced as datasets evolve over time.
Experiment tracking platforms, such as MLflow and Weights & Biases, record configurations, hyperparameters, metrics, and artefacts across training runs, enabling results to be traced.
Model registries and sharing platforms, such as Hugging Face, serve as structured repositories for storing trained models with metadata, documentation, and evaluation results. This supports repeatability, reproducibility, and model reuse within the research community.

Combined, they allow ML systems to be developed with the same rigour and traceability expected of any other piece of scientific software.

Acknowledgements

This guide was written by Paul J. Wright and reviewed by Yo Yehudi.

Paul J. Wright's ORCID: https://orcid.org/0000-0001-9021-611X
Yo Yehudi's ORCID: https://orcid.org/0000-0003-2705-1724

Inclusive Event Planning from Start to Finish

Author(s)

Laura Crawford

Estimated read time: 5 min

This guide shares practical steps for planning accessible and inclusive events in research software communities. It builds on existing accessibility guidance and highlights extra things to consider when your event includes technical talks, live coding, shared repositories or ongoing collaboration.

It is aimed at event organisers, workshop leads, programme committees and community organisers working in research software.

Aim of the guide

After reading this guide, you should be able to:

Spot key stages where inclusion should be considered
Use general accessibility guidance as a starting point
Design clearer and more accessible calls for proposals
Plan fair hybrid and remote participation
Set expectations for accessible technical talks and materials
Support inclusive follow-up collaboration

Why this guide is useful

There is already good guidance on accessible events in general. The Zero Project Conference Accessibility Guide is a strong place to start. It covers venue access, communication, budgeting and general event planning. You may also find the Software Sustainability Institute’s Event Organisation Guide helpful for general planning considerations.

However, research software events often include extra layers. There may be live coding, technical demonstrations, shared repositories and follow-up development work. In this context, accessibility also includes how technical knowledge is shared and how collaboration continues after the event.

This guide builds on general accessibility advice and adds practical steps that are specific to research software communities.

Start with strong foundations

Begin with an established accessibility framework such as the Zero Project guide. Use it to shape your planning from the start. Inclusion works best when it is part of the plan from the beginning.

When designing calls for contributions to research software events, also consider:

Does your call for proposals assume people already know the “right” tools or workflows?
Are you asking for links to code repositories? If so, have you explained how they will be judged?
Are you taking into account that early-career researchers may have smaller or less polished repositories?
Are you valuing learning, experimentation and community contribution as well as technical complexity?
Have you made it clear if talks do not need to present finished or “perfect” software?
Have you shared simple guidance on accessible slides, demos and shared materials?
Have you included a section for knowledge expectations, including programming languages or tools?

Calls for proposals in research software often request repository links, previous talks or evidence of impact. These can unintentionally favour people who have had more time, funding or institutional support.

If you request repositories, explain what you are looking for. For example:

Clarity of documentation
Openness to collaboration
Evidence of community use
Reflection on lessons learned

Make it clear that a small, well-documented project can be as valuable as a large and complex one. If possible, provide an example of a strong submission or a template. This helps reduce uncertainty, particularly for early-career or new contributors.

If your audience is international, programme organisers should also consider time zones and avoid scheduling all key sessions in one region’s working hours.

Design for real participation

Many research software events include live demonstrations, coding sessions and hybrid formats. These can exclude people if not planned carefully.

When planning your sessions:

Ask presenters to use large, readable fonts
Encourage high-contrast colour schemes in slides and terminals
Ask speakers to describe what they are doing during live coding
Avoid assuming everyone has fast internet or powerful hardware
Share technical requirements well in advance
Offer browser-based or lightweight options where possible

If your event is hybrid or online:

Explain clearly how remote participants can ask questions
Use shared notes or moderated chat
Avoid making important decisions in side conversations
Record sessions and explain how to engage later

Design participation so that remote attendees and early-career participants are not treated as an afterthought.

Think beyond the event day

Research software events often lead to ongoing work. There may be new repositories, working groups or collaborations.

Event organisers should set expectations early that:

Slides and recordings should meet basic accessibility standards
Materials should be shared in a clearly communicated, accessible location, with information on how long they will remain available
Shared repositories should include clear README files
Licensing should be clearly stated
Decisions and next steps should be written down and shared

If you expect follow-up collaboration, explain how people can stay involved. Provide clear sign-up forms or contact points. Do not rely only on informal invitations.

Organisers do not need to control every future activity, but they can create fair starting conditions.

Takeaway message

Accessible and inclusive event planning in research software builds on good general practice. It also requires attention to technical communication, hybrid participation and shared outputs.

Set expectations early. Communicate clearly. Make small improvements each time you run an event.

Inclusion is not a single action. It is part of how you design, run and follow up on your event.

About the author

Laura Crawford, Rosalind Franklin Institute.

Find Laura on GitHub

ORCID: https://orcid.org/0000-0002-3553-7049

Acknowledgements

This guide was reviewed by Patricia Herterich, Chief of Staff at OLS and SSI Fellow.

Using Git with shared folders and virtual machines

Author(s)

Mike Jackson

Estimated read time: 5 min

This is a guide on using Git and GitHub within a VMware virtual machine (VM) which, for whatever reason (e.g. organisational security policies), cannot be connected to a network.

Automating unit testing with Continuous Integration

Author(s)

Steve Crouch

Software Team Lead

Estimated read time: 16 min

By Steve Crouch, SSI Research Software Group lead.

This guide is the third in the Unit Testing for Scale and Profit series.

In a project where changes are frequently made to research software, it is helpful to know that the code still works as expected. In our last two episodes, we looked at the benefits of having a set of unit tests and how test parameterisation lets us write many tests efficiently. However, particularly in projects with more than one contributor, it would be good to have assurance that the software still works without everyone having to pull down all the changes and test them. In this guide, we'll look at Continuous Integration, which aims to reduce this burden by automating tasks such as running tests in the background. It can also be used for much more.

An introduction to unit testing

Author(s)

Steve Crouch

Software Team Lead

Estimated read time: 19 min

Demonstrating that a process generates the right results is important in any field of research, whether it’s research software generating those results or not. Automation, where possible, enables us to define a potentially complex process in a repeatable way that is quicker and far less prone to error than doing it manually. In this guide we’ll look into techniques of automated testing to improve the predictability of a software change, make development more productive, and help us produce code that works as expected and yields desired results. We'll use Python for illustration purposes, but the concepts and approaches can be readily applied to many other languages.

New materials to introduce beginners to GitHub and APIs

Author(s)

Rachael Ainsworth

SSI fellow

Reka Solymosi

SSI fellow

Estimated read time: 2 min

New materials are now available to introduce beginners to GitHub and Application Programming Interfaces (APIs). The materials were developed for Open Data Manchester’s Pick N Mix series – free online sessions which enable people to improve their data skills.

Getting started with Travis CI

Author(s)

Mike Jackson

Research Software Engineer

Estimated read time: 2 min

I have just completed developing an interoperability test harness for the Provenance Tool Suite. As part of this work, I used the Travis CI hosted continuous integration server for the first time. I've now written up a walkthrough of Travis CI as part of our Build and test examples on GitHub.

Top tips for using Mercurial

Author(s)

Matthew Turk

Estimated read time: 14 min

I love Mercurial. It's easily my favorite version control system, and I use it for all of my projects. Much like Git, Bazaar, Darcs and so on, it's a distributed version control system - it's decentralised, in that every clone of the repository holds the complete history - and it enables you to create local changes, review past changes, and create experimental branches that can later be abandoned.

Tips for sustainable software development on supercomputers

Author(s)

Derek Groen

SSI fellow

Estimated read time: 5 min

This blog is already chock-full of useful tips for software development, and many of them apply to sustaining software on supercomputers as well. Here are a few tips on developing sustainable software for supercomputer environments.
