By Olivia Guest, University of Oxford, Robin Wilson, University of Southampton, Martin Jones, Python for biologists, and Craig MacLachlan, Met Office Hadley Centre.
A speed blog from the Collaborations Workshop 2016 (CW16).
Why are sustainable software practices difficult to teach?
Programming is a difficult thing to learn for students who have not been exposed to it before. However, for general programming there are at least some factors that help to make it easier. Feedback is generally very rapid; after writing and running a piece of code, students can see the result straight away. This isn't true for e.g. automated testing; the payoff for writing a test suite comes long after the fact, when it helps to catch a bug. The same goes for version control — until students have encountered one of the problems that version control is designed to solve, it seems like an unnecessary extra step in development.
Increasingly, programming is becoming a necessary tool for students who don't have a computer science background (represented in this discussion group: meteorologists, biologists, psychologists and physicists). Students coming to programming for the first time often lack general computing skills, so they struggle with the extra burden of both understanding the concepts and implementing the practices (eg, writing documentation and tests).
Teaching sustainable software practices
The nature of programming means that for many students there's a high intimidation factor (doubly so when some of their classmates are more experienced than them). Adding extra steps to what already seems like an arcane process can make programming seem even more unapproachable. This has the potential to cause stereotype threat: the feeling of being unskilled purely through comparison with an imagined expert (eg, a person doesn’t identify as a "geek", so they assume they cannot possibly do the things an imaginary geek can do).
As mentioned, in a class of undergraduates and postgraduates it is very likely that some will have pre-existing programming skills, some will have a natural interest and aptitude, and some will have less background knowledge. Their motivation will vary with these factors, just as in any class. To add to this issue, many degrees do not even offer a dedicated programming course to teach students the basics.
Some of the concepts and tools we want to teach under the umbrella of sustainable software are difficult to shoehorn into a convenient format. For example, it's easy to set an exercise that requires students to write a program using for-loops (all you need is an input file and a desired output), but setting up a realistic example of, eg, version control involves many more moving parts (a GitHub account, a Git repository, SSH keys, someone to play the part of a programming colleague, etc.). These are often very challenging to set up, particularly for less experienced students.
Practical suggestions for educators
Motivation - The person delivering the training may find it obvious that people should be using sustainable development practices, but it’s still important to show why they are useful. Version control might save your life one day; is there a better reason to use it? Code review could have prevented a paper being retracted, or a rocket exploding. Give people a compelling reason to see the benefit of the tools. Motivation based on situations that the students aren’t currently in (eg, large-scale collaboration as a motivation for version control) is unlikely to be successful.
Easy entry point - Becoming an expert in good development practice doesn’t happen overnight; it doesn’t even happen over a two-day course. When introducing an idea, try to pick an easy entry point. For example, if you want to teach software testing, why not start with assert statements and defensive programming?
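As a rough illustration of that entry point, here is a minimal sketch of assert statements used defensively. The function name `mean_temperature` and the plausible-range limits are made up for this example; they are not taken from any particular course.

```python
def mean_temperature(readings):
    """Return the mean of a list of temperature readings in degrees Celsius."""
    # Defensive checks: stop early, with a clear message, if the input is bad.
    assert len(readings) > 0, "readings must not be empty"
    # The range limits below are an assumption chosen for illustration.
    assert all(-90 <= r <= 60 for r in readings), "reading outside plausible range"
    return sum(readings) / len(readings)


print(mean_temperature([10.0, 14.0, 12.0]))  # prints 12.0
```

Calling `mean_temperature([])` raises an `AssertionError` carrying the message, which gives students immediate feedback of the kind they already get from ordinary programming exercises.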
Follow-on information - Often time is short, so it’s not possible to cover many topics in depth. It’s still good to show people where the basic concepts can be taken. Once you’ve covered assertions and defensive programming, show the path to unit and integration testing.
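One possible next step on that path is a small unit test written with Python's standard `unittest` module. The sketch below is only an illustration; `gc_content` is a hypothetical function of the kind a biology student might write.

```python
import unittest


def gc_content(sequence):
    """Return the fraction of bases in a DNA sequence that are G or C."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    sequence = sequence.upper()
    return (sequence.count("G") + sequence.count("C")) / len(sequence)


class TestGcContent(unittest.TestCase):
    def test_all_gc(self):
        self.assertEqual(gc_content("GGCC"), 1.0)

    def test_mixed_bases(self):
        self.assertEqual(gc_content("ATGC"), 0.5)

    def test_lowercase_is_accepted(self):
        self.assertEqual(gc_content("atgc"), 0.5)

    def test_empty_sequence_is_rejected(self):
        with self.assertRaises(ValueError):
            gc_content("")


# Run the tests programmatically so the example works in a script or a notebook.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestGcContent)
result = unittest.TextTestRunner().run(suite)
```

Each test is one short, named check, so students can see that a test suite is a collection of the assertions they have already learned to write.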
Forcing the use of tools - If it is difficult to persuade people to use sustainable practices, then the alternative is to force them! This doesn’t work in all situations and, obviously, should be accompanied by a strong motivation, but it can apply in assessed courses. For example, some courses require that code is submitted as a link to a GitHub repository, which ensures that students have to use some sort of version control (even if some students simply do a single commit at the end of their project). Other approaches build sustainable practices into the assessment, either automatically (for example, a test suite that fails when the function written doesn’t have a docstring) or more qualitatively (for example, reserving a portion of the marks for “good sustainability practices” such as effective use of version control, testing etc).
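An automatic docstring check of this kind could look something like the sketch below. It is an assumption about how such a check might be written, not a description of any existing marking system; `missing_docstrings` and the two sample functions are hypothetical.

```python
import inspect


def missing_docstrings(namespace):
    """Return the sorted names of functions in `namespace` that have no docstring."""
    return sorted(
        name
        for name, obj in namespace.items()
        if inspect.isfunction(obj) and not (obj.__doc__ or "").strip()
    )


# Two stand-in functions representing a student's submission.
def documented(x):
    """Return x doubled."""
    return 2 * x


def undocumented(x):
    return 2 * x


submission = {"documented": documented, "undocumented": undocumented}
print(missing_docstrings(submission))  # prints ['undocumented']
```

A marking script could fail the submission, or deduct marks, whenever the returned list is not empty.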
Tools that might be of use to educators
- recipy, a Python tool for recording how outputs were created (eg, which script produced a graph).
- Jupyter Notebook for introducing the idea of reproducibility (run from start to finish and end up with the same result).
- Automated testing on submission of coursework (eg, via email, with testing output sent back).
- Internally-hosted version control servers (no issue with public vs private etc).
- Integrating sustainable practices into the assessment (for example, you have to submit as a GitHub repo, the automated testing system checks for docstrings, marks given for use of good sustainable practices).
- Pair programming.
- Git GUIs (very simplified: basically just introducing the 'Commit' button, and then possibly an 'Oh Crap' button for reverting to previous versions).
- A good old-fashioned paper lab notebook (or the equivalent in Markdown, Word or whatever).