At the Effective Scientific Programming workshop on 20 June 2011, Mike Jackson posed one of the essential programming questions to the 31 attendees: "what makes good code good?" The attendees, who mostly viewed themselves as "scientists who do some programming" rather than "scientific programmers", proposed the following:
- The most important quality is that the code must be fit for purpose. It must do what it is intended to do and do it correctly. Without this quality, any research deriving from the code could be fundamentally flawed.
- The code must be readable, well commented and well documented. It may be worked on not just by the original author but by others, for example research associates recruited onto a project. Similarly, the original author may need to return to the code after a long break.
- Source code itself says what the code does and how it does it. However, it is also important that comments record the design rationale: why the code is as it is and why certain implementation decisions were taken. This can be particularly useful when these decisions might otherwise seem unconventional or non-intuitive.
- The code should be elegant and concise. There should be no redundant or repeated code. However, the code should not be so concise that it becomes cryptic and difficult to understand.
- The code should be efficient and, where possible, optimise its usage of processor, memory and storage resources. However, optimisation and efficiency should not come at the expense of clarity and readability. A few seconds' improvement in the run time of a function is not necessarily worth a day of debugging if it goes wrong.
- The code should be well-designed and modular as this contributes to the qualities above - code that is elegant, concise and readable. It also promotes testing and code reuse.
- There should be tests for correctness so that both the author and others can have confidence that the code is fit for purpose. In addition, these tests should be shipped with the code so that users can test their deployments, prove to themselves that it works as expected and test any changes they make.
- The code should be scalable and able to handle inputs which require processing, memory or storage beyond the bounds needed by the original authors. If the limits of what it can handle are reached, it should fail gracefully, not just crash with a core dump.
- Usability, the delivery of software that is easy to use, is also important. This relates more to the question "what makes good software good" than "what makes good code good", but it is beneficial to be aware that someone who is not the original author may use the software.
- And finally, the software should be portable. Though there may only be a need for it to run on one platform at the time of writing, developing multi-platform software may make it attractive to a wider range of users, or to other application areas.
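As an illustration only, a few of these qualities (a comment recording design rationale, graceful failure on bad input, and tests shipped alongside the code) might look like the following minimal Python sketch. The function `log_mean` and its tests are hypothetical examples invented here; they were not proposed at the workshop.

```python
import math


def log_mean(values):
    """Return the arithmetic mean of the natural logarithms of `values`."""
    values = list(values)
    # Fail gracefully: report bad input with a clear message instead of
    # letting a ZeroDivisionError or math domain error surface later.
    if not values:
        raise ValueError("log_mean() needs at least one value")
    if any(v <= 0 for v in values):
        raise ValueError("log_mean() needs strictly positive values")
    # Design rationale: math.fsum() is used instead of the built-in sum()
    # because it limits accumulated floating-point rounding error on long
    # inputs. The small extra cost is accepted in exchange for accuracy.
    return math.fsum(math.log(v) for v in values) / len(values)


# Tests shipped with the code, so that users can check their own deployment
# and any changes they make.
def test_log_mean_known_values():
    # ln(1) = 0, ln(e) = 1, ln(e^2) = 2, so the mean should be 1.
    assert math.isclose(log_mean([1.0, math.e, math.e ** 2]), 1.0)


def test_log_mean_rejects_bad_input():
    for bad in ([], [1.0, 0.0], [-2.0]):
        try:
            log_mean(bad)
        except ValueError:
            continue
        raise AssertionError(f"expected ValueError for {bad!r}")


if __name__ == "__main__":
    test_log_mean_known_values()
    test_log_mean_rejects_bad_input()
    print("all tests passed")
```

The tests are written as plain functions using `assert`, so they can be run directly or collected by a test runner without changing the code.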
The responses, with their focus on correctness, readability, scalability, usability and clarity, overlap closely with commonly understood definitions of good code as taught on software development courses or expressed in numerous articles and blogs (see, for example, What makes good code good by Paul DiLascia, MSDN Magazine, 07/2004, p144, or Christopher Diggins's The Properties of Good Code, 27/09/2005).
Adopting these guidelines can help scientists ensure that their code is more than just a throwaway prototype: an asset that can continue to contribute to the evolution of both their own research and that of their peers.