Web scraping for arts, humanities and social sciences workshop 2013

Durham University, UK, 1 July 2013

By Nick Pearce, SSI Fellow and Teaching Fellow at The Foundation Centre, Durham University


The whole event was a highlight: I met lots of interesting people, and I was really pleased to spend time with Chris Hanretty, who delivered the workshop. The popularity of the event not only raised my personal profile at the institution but also raised the profile of what might be called computational methods, which is no mean achievement at an institution as traditional as Durham.

Event report

I'd had the idea of putting on an event to promote web scraping as a relatively simple 'entry level' coding tool, to entice people who wouldn't think of themselves as coders to see the value in coding more generally. I really wasn't sure what the take-up would be like, but I was delighted when the event filled up within hours of being advertised.

On the day 25 people turned up, from a wide variety of backgrounds. They came from Law, Anthropology, Education, Psychology, Archaeology and Modern Languages amongst others, as well as a couple of STEMers. There was also a wide range of computing experience, from people who already did some coding to those who had never seen HTML tags before. Chris’s materials and teaching style were very well received, and by the end most of the participants were working on real scripts scraping data off the web, a massive achievement! Afterwards I happened to bump into one of the anthropologists who had attended, who told me that he had been fiddling with some web scraping code the following night, which was great to hear.

I think putting on this event really demonstrates the appetite for coding that exists across a range of disciplines which are relatively new to it. I think the implication for the SSI is to try to ensure that the focus isn't too narrowly on the natural sciences, and perhaps to see what other low-hanging fruit there is for disciplines that haven't really had a computational turn yet.