Winners of the call for developing training materials for data science announced

Posted by j.laird on 12 February 2021 - 9:30am

The winning proposals for our call for developing training materials for data science have been chosen.

In October 2020 we invited quotes for the provision of Carpentries-style data science course curricula for statistical and machine learning skills. The call was supported by the Scottish Funding Council’s (SFC) Upskilling Fund to deliver a programme of data skills training for the Scottish workforce. 

We received seven excellent proposals, from which the review panel chose two winning submissions:

  • A course on applied machine learning for the health/biomedical domain in Python and R, to be developed by a team consisting of Nathan Pollard (the lead), Alistair Johnson, Tom Pollard and Marzyeh Ghassemi. The course will use anonymised, real-world patient data (X-ray images and electronic health records). It will help learners understand key concepts in machine learning and the ethical issues around working with health data, and give them the skills to apply machine learning to real-life tasks. Although the lessons follow a health narrative, the knowledge gained will be domain agnostic, and the course will emphasise how the approaches can be applied more broadly.
  • An R course on statistical analysis for public health, to be developed by a team consisting of Emma Rand (the lead), Andrew Stewart and Ezra Herman. The course will use the NHANES data set (US National Center for Health Statistics), drawn from a series of health and nutrition surveys run since the early 1960s. The course will cover (1) variable types and distributions (normal, binomial, Poisson); (2) statistical thinking (correlations between continuous variables, the logic of hypothesis testing, predictive relationships); and (3) statistical models and making predictions: simple and general linear regression models, generalised linear models for binomial data (logistic regression) and linear mixed effect models for repeated measures.

The alpha versions of materials will be piloted by summer 2021, with beta versions available for wider uptake by the community by the end of 2021. Materials will be developed publicly in the Carpentries Incubator, where the wider community will be able to review, comment and contribute after the initial development phase.
