Laurent Gatto

Instructor Training

By Steve Crouch, Software Sustainability Institute, with Karin Lagesen, University of Oslo, and Laurent Gatto, University of Cambridge.

Last month, we held a Software and Data Carpentry Instructor Training workshop at the University of Cambridge, sponsored by the R Consortium. The demand for Carpentry events in the UK, and for trained instructors to facilitate them, has always been very high, and I found this a very enjoyable way to increase the UK's instructor pool.

The main organiser of the event was Laurent Gatto, a Software Sustainability Institute Fellow who has delivered numerous Carpentry courses since becoming a certified instructor in 2014. We also had the able helping hands of Paul Judge and Gabriella Rustici from the University of Cambridge Bioinformatics Training facility, who assisted greatly with the event and helped us make great use of the sophisticated presentation systems in the training room.

The workshop was held on the 19th and 20th of September, with Karin Lagesen and me as instructors. We were delighted with the very high level of engagement from the 25 trainees - this was very much the kind of group we hope…

Image by CASTLE ROCK INNOVATIONS.

By David Perez-Suarez, University College London, Phil Bradbury, University of Manchester, Aleksandra Nenadic, University of Manchester, Laurent Gatto, Cambridge University, and Niall Beard, University of Manchester.

A speed blog from the Collaborations Workshop 2016 (CW16).

Remote collaboration: challenges in Human-Computer-Human interactions.

Tools that were mentioned during the discussion: GitHub, Bitbucket, GitHub issue tracker, Skype, Google Hangouts (though both Skype and Google Hangouts limit the number of participants), Google Docs, spreadsheets, Jira, todo lists, time sheets, Dropbox, … but are tools really the problem?

Use cases: coding, remote teaching, writing papers, large open-source development.

We started our discussion with a list of tools and use cases from our own experience: GitHub, Bitbucket, GitHub issue tracker, Skype, Google Hangouts (though both Skype and Google Hangouts limit the number of participants), Google Docs, spreadsheets, Jira, todo lists, time sheets, Dropbox, … for situations like coding, remote teaching, writing papers and large open-source development. Despite the availability of these tools, some of them really good, we were left wondering whether the tools were really the problem.

How is the Team…

By Laurent Gatto, Software Sustainability Institute Fellow.

This past week saw the yearly Bioconductor conference take place at the Dana-Farber Cancer Institute, Boston, MA. It started with a Developer Day on July 30th and continued with scientific talks and workshops until August 1st.

Bioconductor is an R-based open-source, open-development software project that provides tools for the analysis and comprehension of high-throughput genomics data. It was set up in 2001 by Robert Gentleman, co-founder, alongside Ross Ihaka, of R and is overseen by a core team based primarily at the Fred Hutchinson Cancer Research Center in Seattle, WA and by other members coming from a range of other US-based and international institutions.

From a programming point of view, Bioconductor benefits from the features of the R language, a high-level and expressive language in which new computational methods can be prototyped quickly and easily. In addition, there is a well-established system for packaging together software, data

By Stephen Eglen and Laurent Gatto, Software Sustainability Institute Fellows.

R is a well-established environment for statistical computing. It is often seen as an alternative to computing environments such as MATLAB or Python. In this post, we give five reasons why we chose to use R for research.

1. Plotting

R generates beautiful graphics with minimal effort. Publication-quality plots can be rendered in a wide range of vector- and raster-based formats. Recent extensions to the plotting system allow complex visualisations to be expressed succinctly. See the R Graphics Gallery for example plots along with the code that generated them.

2. Packaging

R comes with a robust packaging system that allows developers and domain experts to distribute their code easily. Packages come complete with documentation, vignettes (see point 3 below), and data files. Windows and Mac users can download packages in binary form, where C and Fortran code is pre-compiled. As of January 2013, the Comprehensive R Archive Network (CRAN) contains 5088 packages. The package build system is rigorous to ensure that packages will work for other users. Within the field of Computational Biology, the…

Senior Research Associate, Department of Biochemistry, University of Cambridge


I use statistics and machine learning to uncover relevant patterns in high throughput biology data and make every effort to make my research outputs (papers, software and data) open to everyone to read and re-use.


In biology, localisation is function: knowledge of the sub-cellular localisation of proteins is of paramount importance to assess and study their function and refine our understanding of cellular processes. Spatial or organelle proteomics is the systematic study of proteins and their sub-cellular localisation. My work is focused on the analysis of multivariate quantitative mass spectrometry-based proteomics data to infer sub-cellular localisation of proteins using contemporary and novel machine learning approaches. This research is implemented in a set of open source R/Bioconductor packages such as MSnbase and pRoloc. The software suite allows researchers to manage data, metadata and sub-cellular marker sets, apply state-of-the-art machine learning techniques to predict protein-organelle associations, and incorporate data from other organelle proteomics initiatives and biological repositories. Particular emphasis is placed on reproducibility of the analyses, rigorous data exploration, and comprehension of the data and the analysis pipeline, leading to a sound understanding of the data and informed interpretation of the results.

I am an affiliated member of the Bioconductor…

