Smartphones for improved disease spread modelling

Posted by a.hay on 3 October 2014 - 10:00am

By Katayoun Farrahi, Lecturer at the Department of Computing, Goldsmiths, and Rémi Emonet, Associate Professor and Software Engineer at Jean Monnet University

This article is part of our series: a day in the software life, in which we ask researchers from all disciplines to discuss the tools that make their research possible.

In our globalised world, people can travel across several continents in a single day, carrying diseases with them. The importance of containing disease outbreaks to prevent global epidemics cannot be overstated, as evidenced by the recent Ebola outbreak in West Africa.

Diseases spread through physical proximity. Having a mechanism to know or predict interactions between people would allow us to track the movement of diseases, which would be an invaluable tool in preventing an epidemic. But how would such a feat be achieved? Most individuals carry a mobile phone. A phone can tell us a lot about its owner by continuously collecting a wide range of information, such as location and interaction.

Starting from these observations, we decided to investigate how mobile phones can help contain epidemics. We gave 72 students smartphones and logged their calls and messaging over a nine-month period. The phones used Bluetooth to identify and record any other devices present in the immediate vicinity. In addition, participants answered daily health questionnaires that included questions about influenza symptoms. Our studies used these data to uncover correlations and to drive simulations of disease spread.
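
To make the Bluetooth logging concrete, here is a minimal Python sketch of how proximity records could be turned into pairwise interaction counts. The record format (timestamp, scanning phone, phone seen) and the name count_interactions are assumptions made for illustration; they are not the study's actual schema or code.

```python
from collections import Counter


def count_interactions(sightings):
    """Count how often each unordered pair of devices was seen together.

    `sightings` is an iterable of (timestamp, scanner_id, seen_id) tuples.
    This record format is hypothetical, chosen only for illustration.
    """
    counts = Counter()
    seen_events = set()
    for timestamp, scanner, seen in sightings:
        if scanner == seen:
            continue  # ignore a device reporting itself
        pair = tuple(sorted((scanner, seen)))
        # Two phones may log each other during the same scan:
        # count that encounter only once.
        event = (timestamp, pair)
        if event in seen_events:
            continue
        seen_events.add(event)
        counts[pair] += 1
    return counts


sightings = [
    (0, "A", "B"),
    (0, "B", "A"),  # same encounter, logged by the other phone
    (1, "A", "B"),
    (1, "A", "C"),
]
interactions = count_interactions(sightings)
```

Counts of this kind are the sort of quantity that can then be compared with questionnaire answers or reported friendships.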

Many interesting findings emerged, such as the fact that the number of interactions correlated with both reported friendships and mutual sharing on Facebook. This means that we can use, to some extent, patterns of friendship and Facebook behaviour to gauge how people interact in real life.

Of course, not everyone owns a smartphone, but other phones can also provide valuable information. Phone operators know which phone is within range of a given cell tower at any time, which provides a time-stamped location for the phone’s user. (Accessing such data can be a challenge: it is often held by big phone operators, and sharing it requires privacy precautions.) There are limitations: Liberia and Sierra Leone are among the countries with the lowest mobile phone presence (although ownership is growing rapidly: 44% in Sierra Leone and 59% in Liberia). While still low, this level of ownership should provide a good estimate of people’s interactions in densely populated regions.

During the project, we have spent a lot of time writing code to analyse data and validate hypotheses through simulation, as well as performing pure data analysis: extracting statistics, computing confidence intervals and so on. Computationally heavy simulations, which can take hours or days to run on multiple machines, are needed to simulate various tracing policies. We also write code that mixes real data from the data collection campaign with simulations based on epidemiological models. Even though we were computer literate from the beginning, this is demanding work that requires multi-site collaboration between people with various backgrounds.
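
As a rough illustration of mixing recorded contacts with an epidemiological model, the Python sketch below runs a simple susceptible-infected-recovered style process over a list of time-stamped contacts. The model, the function name simulate_outbreak and all parameter values are assumptions chosen for illustration; the article does not specify which models or settings were actually used.

```python
import random


def simulate_outbreak(contacts, seeds, p_transmit, infectious_steps, rng):
    """Spread an infection along time-stamped contacts.

    contacts: list of (time_step, person_a, person_b) tuples (hypothetical format)
    seeds: people infected just before the first recorded contact
    p_transmit: probability of transmission per contact
    infectious_steps: number of time steps a person stays infectious
    rng: a random.Random instance, passed in for reproducibility
    Returns a dict mapping each infected person to their infection time.
    """
    contacts = sorted(contacts)
    start = contacts[0][0] if contacts else 0
    # Seeds are infected one step before the first contact, so they are
    # already infectious when the recorded data begin.
    infected_at = {person: start - 1 for person in seeds}
    for t, a, b in contacts:
        for source, target in ((a, b), (b, a)):
            if source not in infected_at or target in infected_at:
                continue
            elapsed = t - infected_at[source]
            # Infectious from the step after infection until recovery.
            if not 0 < elapsed <= infectious_steps:
                continue
            if rng.random() < p_transmit:
                infected_at[target] = t
    return infected_at


contacts = [(0, "A", "B"), (1, "B", "C"), (5, "C", "D")]
outbreak = simulate_outbreak(contacts, ["A"], 1.0, 2, random.Random(0))
no_spread = simulate_outbreak(contacts, ["A"], 0.0, 2, random.Random(0))
```

In this toy run with certain transmission, the infection passes from A to B to C, but C has recovered before meeting D. A realistic study would repeat such runs many times with different random seeds and compare outcomes across tracing policies, which is where the heavy computation comes from.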

We use version control for both our code and our articles. We originally wrote separate programs in MATLAB and Java, and we are progressively moving new code to Python, which acts as a common ground where we all find a lot of advantages. One of us promotes these tools as an instructor for the Software Carpentry project, which teaches basic software skills and best practice to researchers.

Obtaining a country-wide anonymised dataset of phone localisation data would be of great use for research in this domain, and might eventually help in preventing epidemics. As the recent Ebola outbreak in West Africa shows, such information can be vital to stop the spread of some truly terrible diseases.