By Steve Crouch.
In August I instructed at the SeIUCCR Summer School in the shiny new training facilities at the Hartree Centre in Daresbury. Targeted at UK doctoral and postdoctoral researchers in the engineering and physical sciences, the school teaches researchers about tools and techniques for e-Infrastructure, software development and data management to support and improve their science.
My session covered sustainable software development, which I've presented in one form or another at the school for the last two years. This time, I asked the participants what they thought made good code good, splitting them into groups to discuss the issue.
The attendees thought that good code should be:
- Correct: it produces the right results, with accompanying tests that demonstrate its correctness and so increase confidence in the code.
- Documented: supporting documentation is a must, and should explain how to use the code, its overall architecture and how to develop the code.
- Extensible: it should be possible to adapt and extend the code to include other functions.
- Readable: the code should be understandable, following a logical, consistent structure and style. It should be well-commented so design rationale in relation to the underlying science is clearly explained, and input and output to functions are described with reference to appropriate data formats.
- Robust: the code should be fault-tolerant and developed defensively so it gives meaningful feedback on unexpected inputs rather than simply crashing.
- Portable: the code should be easily portable so it can be exploited by users on other platforms.
- Maintainable: the code should be in a form that means bug fixes and other updates can be accomplished efficiently. This also applies to other materials such as documentation.
- Reproducible: others should be able to use the code to repeat experiments described in papers, and be able to understand how the implementation accomplishes the science.
- Scalable: it should be possible to run the code over larger problem spaces, and the code should take advantage of parallelism where appropriate, either through execution on distributed computing resources such as clusters or exploiting extra CPU cores on a single machine (or both).
- Structured: the code should be well designed and modular, but not over-architected with unnecessary layers of abstraction.
- Secure: where data confidentiality is a requirement, code should take measures to protect access to data, particularly if the code forms part of a service such as a web-based resource.
- Efficient: code should execute in a timely fashion.
- Accessible: the code should be available to others to make use of, to avoid reinventing the wheel.
- Testable: code should be in a form that is readily testable for fitness for purpose and correctness in uses beyond those of its initial project.
The votes from the participants on the importance of each of these were as follows:
As a group, the participants thought that good code should be correct, documented, extensible, readable and robust. This largely overlaps with widely accepted opinion on what makes code good (e.g. "What makes good code good" by Paul DiLascia, MSDN Magazine, 07/2004, or "The pyramid of code quality, the 5 characteristics of good code" by Alberto Gutierrez, 07/2009).
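To make some of these qualities a little more concrete, here is a minimal sketch in Python. It was not part of the session: the function name, the temperature-readings scenario and the error messages are all made up for illustration. It tries to show a documented, readable function that checks its inputs defensively and comes with a small test of its correctness.

```python
import math
from typing import Iterable


def mean_temperature(readings_celsius: Iterable[float]) -> float:
    """Return the arithmetic mean of temperature readings in degrees Celsius.

    Raises ValueError with an explanatory message if the input is empty or
    contains a value that is not a finite number, rather than failing obscurely.
    (Hypothetical example function, not taken from any real project.)
    """
    values = list(readings_celsius)
    if not values:
        raise ValueError("no readings supplied: cannot compute a mean")
    for index, value in enumerate(values):
        # Reject strings, None, NaN and infinities with a message that says where.
        if not isinstance(value, (int, float)) or not math.isfinite(value):
            raise ValueError(f"reading {index} is not a finite number: {value!r}")
    return sum(values) / len(values)


def test_mean_temperature() -> None:
    """Check a known case and confirm that bad input is rejected clearly."""
    assert math.isclose(mean_temperature([10.0, 20.0, 30.0]), 20.0)
    try:
        mean_temperature([])
    except ValueError as error:
        assert "no readings" in str(error)
    else:
        raise AssertionError("empty input should be rejected")


test_mean_temperature()
```

The test is deliberately tiny. In a real project it would more likely live in a separate test file and be run by a test framework, but even a check this small documents the intended behaviour and guards against accidental changes.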
My thanks to Claire Devereux and David Wallom for organising yet another successful SeIUCCR Summer School, and for the invitation to come along and present the session.