Bash Scripting Workshop

Posted by s.aragon on 19 November 2018 - 9:05am

Image courtesy of Becky Arnold

By Becky Arnold, University of Sheffield.

This is part of a series of talks on good coding practice and related topics Becky Arnold has organised as part of her Fellowship plan.

On the 7th of November, Raniere Silva of the Software Sustainability Institute gave a one-day workshop on bash scripting at the University of Sheffield. The Unix shell has tremendous power. This workshop was geared towards researchers who had some experience of working on Unix-like systems, but wanted to build on that to better exploit the shell's full potential. Will Furnass of the University of Sheffield Research Software Engineering group also helped out at this event.

The material for the workshop can be found here, and the files used for the exercises can be found here.

Raniere started off with a recap of the Unix shell and then moved on to using loops in it. Researchers may commonly have hundreds or more data files which they need to use and manipulate in the course of their work. Doing this by hand quickly becomes time-consuming at best and unmanageable at worst. Having oodles of fantastic data is useless if you can’t handle it. By using loops, actions can be applied to thousands of files with only a few lines of code.
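As a rough illustration of the idea (not taken from the workshop material), a loop like the one below applies the same action to every matching file in a directory. The function name, the .dat extension and the file names are all made up for this sketch.

```shell
#!/usr/bin/env bash
# Sketch only: count the lines in every .dat file in a directory.
# The function name and the .dat naming scheme are hypothetical.

count_lines_per_file() {
    local dir="$1"
    local file
    for file in "$dir"/*.dat; do
        # If nothing matches, the pattern is left unexpanded; skip it.
        [ -e "$file" ] || continue
        # Arithmetic expansion strips any padding that wc adds to the number.
        printf '%s %d\n' "$(basename "$file")" "$(( $(wc -l < "$file") ))"
    done
}

# Demo on a throwaway directory holding three small files.
workdir="$(mktemp -d)"
printf '1\n2\n3\n' > "$workdir/run1.dat"
printf '4\n5\n'    > "$workdir/run2.dat"
printf '6\n'       > "$workdir/run3.dat"
count_lines_per_file "$workdir"
```

The loop body is the same whether the directory holds three files or three thousand, which is where the time saving comes from.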

Next, Raniere went over using the grep command to find lines in a file which match a pattern. We started simply by identifying lines which contained certain words in a text file of haikus. (The second haiku in particular, “My thesis” not found, resonated with me as I move into the final year of my PhD. If the goal of poetry is to evoke emotion, that one succeeded by evoking icy dread.) We then moved on to using grep to filter data; in the example given, we extracted the data for stars of a certain mass from within much larger files. We then used sed and awk to modify the data files and perform operations using the data itself, which is the lifeblood of research.
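To give a flavour of how the three tools fit together, here is a small sketch. The catalogue layout (an id, a mass=... field and a label on each line), the values and the function names are invented for illustration and are not the workshop's exercise files.

```shell
#!/usr/bin/env bash
# Sketch only: a made-up star catalogue with id, mass (solar masses), label.
workdir="$(mktemp -d)"
stars="$workdir/stars.txt"
cat > "$stars" <<'EOF'
s001 mass=1.0 cluster
s002 mass=2.5 field
s003 mass=1.0 field
s004 mass=0.5 cluster
EOF

# grep: keep only the lines for stars of a given mass.
# -F treats the pattern as a fixed string, so the dot is not a wildcard;
# the trailing space stops "1.0" from also matching "1.05".
select_by_mass() {
    grep -F "mass=$2 " "$1"
}

# sed: modify the data by stripping the "mass=" label from each line.
strip_mass_label() {
    sed 's/mass=//' "$1"
}

# awk: compute with the data, here the mean of the second column.
mean_mass() {
    awk '{ total += $2 } END { if (NR > 0) printf "%.2f\n", total / NR }' "$1"
}

select_by_mass "$stars" 1.0
strip_mass_label "$stars" > "$workdir/clean.txt"
mean_mass "$workdir/clean.txt"
```

Run as written, this prints the two lines for s001 and s003, followed by the mean mass of the four stars, 1.25.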

Finally, Raniere gave us an introduction to makefiles. I’ve come across these a few times in the course of my PhD and found them largely impenetrable. This is frustrating when trying to understand how a past result was reached. Raniere first went over the purpose of makefiles and how they save time in two ways: by generating files automatically, which reduces the number of steps the user must take manually, and by regenerating only the files which need to be updated, which reduces processing time. He then explained the structure of makefiles and showed us how to write them ourselves.
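The "only regenerate what needs updating" idea can be sketched in the shell itself. This is not a makefile and not material from the workshop; it only imitates the timestamp rule that make applies, which is to rebuild a target when it is missing or older than the file it depends on. The function name, the file names and the sort step standing in for a real recipe are all hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: imitate make's rule "rebuild if the target is missing
# or older than its source". Names and the sort "recipe" are made up.

build_if_stale() {
    local source="$1" target="$2"
    # -nt is true when the first file was modified more recently.
    if [ ! -e "$target" ] || [ "$source" -nt "$target" ]; then
        sort -n "$source" > "$target"   # stand-in for an expensive step
        echo "rebuilt"
    else
        echo "up to date"
    fi
}

workdir="$(mktemp -d)"
raw="$workdir/raw.dat"
sorted="$workdir/sorted.dat"
printf '3\n1\n2\n' > "$raw"

# Give the source a fixed, old timestamp so the comparisons are unambiguous.
touch -t 201811070900 "$raw"

build_if_stale "$raw" "$sorted"   # target missing, so it is built

# Pretend that build finished an hour after the source was written.
touch -t 201811071000 "$sorted"
build_if_stale "$raw" "$sorted"   # source is older, nothing to do

touch "$raw"                      # "edit" the source: its timestamp is now
build_if_stale "$raw" "$sorted"   # source is newer, so it is rebuilt
```

A makefile expresses the same relationship declaratively, as a target, the files it depends on and the commands that produce it, and make works out which steps can be skipped.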

This was the first workshop in a series of talks and workshops I’ve organised with funds from my Software Sustainability Institute fellowship, and there was an almost eerie lack of hiccups. I was pleased by the level of interest; all 35 of the available places were booked and there were another 17 people on the waitlist. The next workshop, given by Yo Yehudi, is also fully booked at the time of writing this post. It will cover how to run and contribute to open source projects.

If you wish to discuss this post with us, send us an email or contact us on Twitter @SoftwareSaved.