Brexit Data Challenge event

Posted by s.aragon on 27 April 2018 - 8:00am

By Reka Solymosi, University of Manchester.

I have been co-organiser of the University of Manchester's R user group for about a year now. I started when I moved to Manchester, because I’ve always found that software user groups are a sure way to meet people with the same interests and motivations as myself, outside of my discipline. Having started work at the School of Law, I found that there were not many people as excited about open data and data analysis as I was, and decided to reach out.

For the first year, we followed a steady format for the R user group meetings: we met monthly to listen to presentations given by volunteers about their work. This received generally positive feedback (and was in line with the previous user group format), but often ended up with me or my co-organiser Heather Robinson presenting, as we struggled to motivate volunteers. After a year we took a poll, and while people seemed to like the format, some raised concerns that it was passive and did not engage more novice users. One suggestion was to organise a collaborative hackathon.

Opportunity

As we were considering possible topics for the hackathon that would engage people from a wide range of disciplines, we came across the Consumer Data Research Centre (CDRC) Brexit Data Challenge. The premise was to investigate the hypothesis set out in the Economist article “The immigration paradox: Explaining the Brexit vote” (14th July 2016). We identified this as a topic that could motivate lots of people from various disciplines, and so proposed organising a half-day hackathon in which we would put together the two-page submission.

Preparation

In preparation, we had a look at hackathon formats and guidance available online. I had a browse through the blog posts by other fellows and, in particular, saw the post by Derek Groen about hackathons for writing collaborative papers and his learnings about producing "papers versus programs". The first suggestion was to have a clear aim and a suitable topic identified. The focus on the Brexit Data Challenge had mostly taken care of this for us. The second was to identify projects and project leaders in advance. To address this, we set up a GitHub repository and suggested that people raise their ideas as issues, so others could read and begin to think about them. This allowed some preparatory work to take place in advance of the meeting, to make sure "time spent during the Paper Hackathon is spent more on science and less on the organisational activities around it".

I also emailed the fellows mailing list to ask for any suggestions and received lots of really helpful feedback; for example, a link from Allen Pope to lessons learned from the Polar Computing Research Coordination Network (RCN). Particularly helpful advice for the goal of encouraging participation and learning by less experienced R users came from Neil Chue Hong, who advised that as a facilitator my primary role is to make sure that each and every participant can get involved with an activity that they feel is contributing to the overall goal of the hackathon. Additionally, a reference to tips from Joshua Tauberer raised the need for a code of conduct to ensure the hackathon is a safe space, and so a code of conduct was added to our hackathon page.

Participants

We decided to extend the hackathon invite beyond our user group, so we invited the Manchester Metropolitan University (MMU) R user group and also made an announcement at the HER+Data meetup group. We encouraged people of all abilities to apply. In the end, equal numbers of people signed up from those who identified as beginners and those who were more comfortable users.

The Event

The event itself took place over the course of an afternoon. Altogether, 12 people participated. There were exactly as many people who identified as beginners as more advanced users, which was helpful for pairing people up. We decided to form one collective group, which would create one submission as the output. The majority of the data wrangling and some analysis were finished during the afternoon (which flowed into the evening and night). There was lots of collaboration and discussion, facilitated by pizza and beer/wine. You can have a look at the activities in the GitHub repository for the event.

Output

The output we produced can be viewed here. We are also happy to announce that the output has been shortlisted by the CDRC review panel for presentation at the Geographical Information Science (GIS) Research UK (GISRUK) conference. A member of the hackathon team will present our results in a 10-12 minute talk, alongside the other candidates, and the winner will be announced at the conference close. So if you are in attendance, come see us present!

Feedback

We ran a short survey to gauge the feedback from participants. We received some positive comments:

"It was great to meet and work with others. I benefited from looking at how others solve common data analysis problems using R. I use Excel on the day to day basis so the hackathon provided me with an opportunity to learn something new."

To stay updated, subscribe to the mailing list or send an email to LISTSERV@listserv.manchester.ac.uk with no subject and the body "SUBSCRIBE RUM Your Name". Also check out our friends at the MMU R User Group.