Talk on structuring code efficiently by David Hubber

Posted by s.aragon on 15 May 2018 - 9:10am

By Becky Arnold, University of Sheffield

This is part of a series of talks on good coding practice and related topics that Becky Arnold has organised as part of her Fellowship plan.

On the 2nd of May 2018, David Hubber, a postdoc at Ludwig-Maximilians-Universität Munich, gave a seminar on Long-term Software Development for Scientists, as part of a series of talks around good practice at the University of Sheffield. David discussed how to structure code efficiently, code module design, decoupling strategies and test-driven development.

I’ve had the pleasure of meeting David previously; we’re both astrophysicists working on simulations of star-forming regions. In particular, David has a great deal of experience in working with big codes that span tens of thousands of lines. My own work is on a smaller scale, but I think many researchers, myself included, are familiar with writing programs that start small and quickly evolve into a huge, unmanageable mess.

David discussed the dangers of “the blob” and “spaghetti code”, before going on to cover strategies to avoid them. Some of these strategies were stylistic: he described how keeping style choices consistent throughout your code makes it easier for your future self and others to understand, and therefore more maintainable.

He also discussed how comments, while absolutely vital, can be something of a double-edged sword if misused. Excessive comments can reduce the maintainability of your code because they easily go out of date as the code changes, and comments that contain wrong information are worse than no comments at all. Another danger David mentioned is the temptation to comment bad code rather than rewrite it to make it more straightforward.
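As a small illustration of that last point (my own sketch, not code from David's talk; the function and parameter names are invented for the example), compare a function that relies on a comment to explain itself with a rewrite where the names do the explaining:

```python
# Before: the comment carries all the meaning and has to be kept in sync by hand.
def f(a, b):
    # a is mass in kg, b is speed in m/s; returns kinetic energy in joules
    return 0.5 * a * b * b


# After: the names carry the meaning, so there is less that can go out of date.
def kinetic_energy(mass_kg, speed_m_per_s):
    return 0.5 * mass_kg * speed_m_per_s**2
```

Both versions compute the same value; the second simply needs less commentary to be understood.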

Testing was also discussed. David explained the concept of unit testing and how the short-term delay incurred in writing tests is often recovered many times over in the long term. He covered logging systems and suggested placing assert statements at key points in the code as sanity checks. I see a huge amount of value in this, particularly in physics. For example, nothing can go faster than the speed of light, roughly 3 × 10^8 m/s, but a program will happily accept the assignment velocity = 9 × 10^12 without producing an error. Without any checks, in the best-case scenario this just causes a crash, though the crash may come much later in the program, making the bug difficult to trace. In the worst case the program runs anyway, producing a wrong result with the user completely unaware that anything’s wrong.
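A minimal sketch of the kind of sanity check described above might look like the following. This is my own illustration rather than code from the talk: the helper name check_velocity and the constant name SPEED_OF_LIGHT are assumptions, and the velocity is assumed to be in metres per second.

```python
# Speed of light in a vacuum, in metres per second.
SPEED_OF_LIGHT = 299_792_458.0


def check_velocity(velocity):
    """Return velocity unchanged, or fail immediately if it is unphysical.

    Hypothetical helper for illustration; velocity is a speed in m/s.
    """
    assert abs(velocity) <= SPEED_OF_LIGHT, (
        f"unphysical velocity {velocity:.3e} m/s exceeds the speed of light"
    )
    return velocity


check_velocity(2.0e4)  # a physically sensible value passes silently

try:
    check_velocity(9e12)  # the unphysical value from the example above
except AssertionError as error:
    print(f"Sanity check failed: {error}")
```

The check fails at the point where the bad value first appears, rather than much later in the run. One caveat: Python skips assert statements when run with the -O flag, so checks that must always run are better written as an explicit if statement that raises an exception.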

Approximately 80 people attended this talk. We got quite lucky, as there were only a few seats left in the lecture theatre! There were a lot of questions and discussion between sections of the talk, which were interesting and beneficial. Following the talk, several attendees requested copies of the slides and several expressed enthusiasm about these sessions. One (very kind) email I received stated:

“I just wanted to thank you for organising these talks, it seems like there has been such a dearth of intermediate level programming courses/talks so this is fantastic!”