New Guide: Green Data Practices for efficient utilisation of research data

We are pleased to announce a new guide, Green Data Practices: Efficient utilisation of research data. It offers practical steps for researchers and Digital Research Technical Professionals who want to reduce the environmental impact of working with research data.

Scientific research generates data, and storing, transferring and repeatedly processing that data consumes energy and so produces emissions. As data volumes grow, these emissions can become a significant and often overlooked part of the environmental footprint of research.

The guide focuses on simple actions that teams and individuals can apply in their day-to-day work across laptops, HPC systems and cloud platforms. It introduces three guiding principles for working with data more sustainably: store less, move less and compute less.

It explains how researchers can identify common sources of data wastage, including duplicate copies, old datasets, repeated downloads and unnecessary preprocessing. It also covers how to manage data across its lifecycle, from retention and archiving to deletion, so that storage space is used more efficiently.
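As an illustration of the "store less" principle, the sketch below shows one way a team might look for duplicate copies in a project directory. It is not taken from the guide: the function names `file_digest` and `find_duplicates` are hypothetical, and it assumes that byte-for-byte identical files are the duplicates of interest.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root: Path) -> list[list[Path]]:
    """Group files under root that have identical content (hypothetical helper)."""
    # Group by size first: files of different sizes cannot be identical,
    # so most files never need to be read and hashed.
    by_size = defaultdict(list)
    for p in root.rglob("*"):
        if p.is_file() and not p.is_symlink():
            by_size[p.stat().st_size].append(p)

    # Hash only the files that share a size with at least one other file.
    by_hash = defaultdict(list)
    for paths in by_size.values():
        if len(paths) > 1:
            for p in paths:
                by_hash[file_digest(p)].append(p)

    return [sorted(group) for group in by_hash.values() if len(group) > 1]
```

Calling `find_duplicates(Path("my_project"))` returns groups of paths with identical content. The sketch only reports candidates; deciding which copy to keep, archive or delete remains a judgement for the team.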

The guide also highlights the importance of reducing data movement and avoiding unnecessary recomputation. By keeping datasets in shared, versioned locations, reusing preprocessing outputs, designing pipelines that only recompute what has changed, and scheduling computationally intensive tasks for cleaner energy periods, research teams can improve efficiency while reducing waste.
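To make "only recompute what has changed" concrete, here is a minimal sketch of one possible approach: fingerprint a step's inputs and parameters, and skip the step when the fingerprint matches the one stored from the previous run. The names `input_fingerprint` and `run_if_changed`, and the `.fingerprint` sidecar file, are assumptions made for illustration rather than anything prescribed by the guide. Workflow managers offer more complete versions of the same idea.

```python
import hashlib
import json
from pathlib import Path


def input_fingerprint(inputs: list[Path], params: dict) -> str:
    """Hash input file names, input contents and parameters into one digest."""
    h = hashlib.sha256()
    for p in sorted(inputs):
        h.update(p.name.encode())
        # Reads each file fully into memory; very large inputs would
        # need chunked reading instead.
        h.update(p.read_bytes())
    h.update(json.dumps(params, sort_keys=True).encode())
    return h.hexdigest()


def run_if_changed(inputs: list[Path], params: dict, output: Path, step) -> bool:
    """Run step(inputs, params, output) only if inputs or params changed.

    Returns True if the step ran, False if the existing output was reused.
    """
    # Hypothetical convention: store the fingerprint next to the output.
    stamp = output.with_name(output.name + ".fingerprint")
    fingerprint = input_fingerprint(inputs, params)

    if output.exists() and stamp.exists() and stamp.read_text() == fingerprint:
        return False  # nothing changed, so the previous output is reused

    step(inputs, params, output)
    stamp.write_text(fingerprint)
    return True
```

Here `step` is any callable that writes `output` from `inputs` and `params`. Hashing file contents rather than relying on modification times means a fresh download of unchanged data does not trigger recomputation, at the cost of reading every input once per check.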

Green data practices can also reduce cost and improve speed and reproducibility. Shared datasets and reusable pipelines make it easier for teams to compare results, avoid duplicated effort and build more transparent research workflows.

After reading the guide, readers should be able to identify sources of data wastage, manage data more efficiently throughout its lifecycle, reduce unnecessary data transfers, and design pipelines that avoid unnecessary recomputation.

Read the guide