Library carpentry

By Julianne Schneider, Data Curator

This post was originally published at the Software Carpentry website.

In the space of a year, interest and participation in the Library Carpentry community has exploded like an amoeba who over-ate at an algae banquet and attempted one too many pseudopods. For Library Carpentry, though, this is a good thing; the pseudopods are propelling us forward across institutions, disciplines, and continents. The community, grounded in collaborative tools like Github and Gitter (I always want to type Glitter), is coalescing around lesson development and holding new workshops. Why is the buzz so strong? I think it’s a combination of relentless energy from people like Belinda Weaver and Tim Dennis (to name just a few), the acceptance and active encouragement of new people who want to contribute in some way, and the mutual recognition by all of us that in any one thing, we are all absolute beginners, and we all give each other permission to be terrible until we aren’t.

I am still terrible at Github and the command line, and I use Tim’s Github workflow post every time I work with Github - seriously, this is Github workflow…

Data Science for Doctors

By Steve Harris, University College London Hospital, and Software Sustainability Institute Fellow.

This article was first published at Data Science Breakfast Club.

The general public would assume that the medical profession are numerically literate. University-educated, technically trained in biological science and more, and with the unique legal privilege of prescribing medicines where incorrect dosing leads to disaster, there would be no excuse. However, most medics would deny any affinity with maths, and exude a distaste for statistics. This is a truism even amongst anaesthetists, whose professional training includes physics and pharmacokinetics.

Despite this public denial, we are nonetheless a data-literate profession. We read and interpret scientific papers, we run audit projects and write business cases for improvement projects. These are all intrinsically quantitative undertakings. However, our adopted posture gets in the way of our doing these things as well as possible, and engaging with statistics and data.

Data science would appear to be a rebranding of statistics. Despite underpinning quantum mechanics, modern finance, and the humble weather forecast, statistics has never managed to seem cool. At worst, it is cited as ‘lies,…

Increasing diversity

By Thomas Robitaille, Freelance, Alice Harpole, University of Southampton, Olivier Philippe, University of Southampton, Louise Brown, University of Nottingham, Clem Tanzi, qLegal, Mateusz Kuzak, Netherlands eScience Center.

This post is part of the Collaborations Workshops 2017 speed blogging series.

Diversity has many aspects (age, ethnicity, disability, and so on), but the most commonly addressed one is gender. Setting aside the relative importance of each aspect, gender has the advantage of being easily assessable: it is easier to measure the situation against a standard and to compare the situation between different projects, careers, conferences, etc.

However, even if we take the sole issue of gender and its simplified version (a binary distinction between male and female), it quickly appears that the context in which software development takes place is already shaped by gender issues at a higher level, such as representation in the education field or in career plans. Therefore, any definition of a (directly assessable) standard is biased, and each event where we try to enforce diversity should take that into account (see for instance the…

Research IT, Enterprise IT

By Laurence Billingham, British Geological Survey, David Golding, University of Leeds, Robert Haines, University of Manchester, Martin Hammitzsch, German Research Centre for Geoscience, James Hetherington, University College London, Simon Hettrick, Software Sustainability Institute.

This post is part of the Collaborations Workshops 2017 speed blogging series.

Universities need to strike a balance between risk and strategic opportunities (world-class research and world-class teaching). A semi-independent "sandboxed" service for research IT can deliver both, by isolating the stuff that needs to change fast from the stuff that needs to always work.

In mobile development, apps are "sandboxed" so that one app cannot break the phone. This analogy can work for services too. In research-led universities, we need…

Seductive Data

By Eilis Hannon, University of Exeter, Martin Callaghan, University of Leeds, James Baldwin, Sheffield Hallam University, Mario Antonioletti, Software Sustainability Institute, David Pérez-Suárez, University College London.

This post is part of the Collaborations Workshops 2017 speed blogging series.

In our daily work we may, at some point, need to access data from third parties that we wish to merge or compare with data that we have generated or obtained. Often we turn to Google to find pertinent data sources. Domain experts may be able to refer us to data sources, or there may be keywords that unlock what we are trying to find on the web. We can also filter results using advanced Boolean operators. To make sense of the results, we can consider a number of factors, such as the top links and the domains that are most relevant to the topic. For specific domains, there will be known and trusted data providers, e.g. the Gene Expression Omnibus (GEO) or the…

