Software and research: the Institute's Blog

Data Science for Doctors

By Steve Harris, University College London Hospital, and Software Sustainability Institute Fellow.

This article was first published at Data Science Breakfast Club.

The general public would assume that the medical profession is numerically literate. Its members are university-educated, technically trained in biological science and more, and hold the unique legal privilege of prescribing medicines, where incorrect dosing leads to disaster, so there would seem to be no excuse. However, most medics would deny any affinity with maths and exude a distaste for statistics. This holds true even amongst anaesthetists, whose professional training includes physics and pharmacokinetics.

Despite this public denial, we are nonetheless a data-literate profession. We read and interpret scientific papers, we run audit projects, and we write business cases for improvement projects. These are all intrinsically quantitative undertakings. However, our adopted posture gets in the way of doing these things as well as possible, and of engaging with statistics and data.

Data science would appear to be a rebranding of statistics. Despite underpinning quantum mechanics, modern finance, and the humble weather forecast, statistics has never managed to seem cool. At worst, it is cited as ‘lies,…

Increasing diversity

By Thomas Robitaille, Freelance, Alice Harpole, University of Southampton, Olivier Philippe, University of Southampton, Louise Brown, University of Nottingham, Clem Tanzi, qLegal, Mateusz Kuzak, Netherlands eScience Center.

This post is part of the Collaborations Workshops 2017 speed blogging series.

Diversity has many aspects, such as age, ethnicity, and disability, but the most commonly addressed one is gender. Without ranking the importance of one aspect over another, gender has the advantage of being easily assessable: it is easier to measure the situation against a standard and to compare it between different projects, careers, conferences, etc.

However, even if we take the sole issue of gender in its simplified form (a binary distinction between male and female), it quickly appears that the context in which software development takes place is already shaped by gender issues at a higher level, such as representation in education or in career paths. Therefore, any directly assessable standard is biased, and each event where we try to promote diversity should take that into account (see for instance the…

Research IT, Enterprise IT

By Laurence Billingham, British Geological Survey, David Golding, University of Leeds, Robert Haines, University of Manchester, Martin Hammitzsch, German Research Centre for Geoscience, James Hetherington, University College London, Simon Hettrick, Software Sustainability Institute.

This post is part of the Collaborations Workshops 2017 speed blogging series.

Universities need to strike a balance between risk and strategic opportunities (world-class research and world-class teaching). A semi-independent "sandboxed" service for research IT can deliver both, by isolating the stuff that needs to change fast from the stuff that needs to always work.

In mobile development, apps are "sandboxed" so that one app cannot break the phone. This analogy can work for services too. In research-led universities, we need…

Seductive Data

By Eilis Hannon, University of Exeter, Martin Callaghan, University of Leeds, James Baldwin, Sheffield Hallam University, Mario Antonioletti, Software Sustainability Institute, David Pérez-Suárez, University College London.

This post is part of the Collaborations Workshops 2017 speed blogging series.

In our daily work we may, at some point, need to access data from third parties that we wish to merge or compare with data that we have generated or obtained. Often we turn to Google to find pertinent data sources. Domain experts may be able to refer us to data sources, and the right keywords can unlock what we are trying to find on the web. We can also filter results using advanced Boolean operators. To make sense of the results, we can consider a number of factors, such as the top links and the domains that are most relevant to the topic. For specific domains, there will be known and trusted data providers, e.g. the Gene Expression Omnibus (GEO) or the…

Privacy and Trust in IoT & Open Data

By Sinan Shi, University College London, David De Roure, University of Oxford, Nikoleta Glynatsi, Cardiff University, Emma Tattershall, Science and Technology Facilities Council, Andrew Landells, University of Southampton, Chris Gutteridge, University of Southampton, Gary Leeming, University of Manchester.

This post is part of the Collaborations Workshops 2017 speed blogging series.

The privacy risks within a socially connected infrastructure are not well understood and are constantly changing. Personal information can be private but still be accidentally shared by others and made available more widely. One of the largest challenges for privacy is the lack of understanding of what data could be used for now, and as more data are collected and made available, future purposes become even more difficult to predict. Often, seemingly innocuous data sets can be used to derive more private data, such as the waking times and other habits of…
