By Nicolas Gruel, University of Manchester, Andrew Walker, University of Leeds, Vince Knight, Cardiff University, and Mike Jackson, University of Edinburgh.
This post is part of the Collaborations Workshops 2017 speed blogging series.
Code review is accepted as a key process in the creation of maintainable software with a low defect rate. Wilson et al. recommend code reviews as one of their Best Practices for Scientific Computing. The value of code review is beginning to be recognised in the development of research software, for example, in ObsPy and Astropy which make extensive use of code reviews in their project workflows. But, the review process can sometimes act as a barrier to new contributors joining a project or development community. During the Software Sustainability Institute’s Collaborations Workshop 2017, we discussed providing a constructive, kind, balanced and useful review.
Code review can come in various forms and at various times. The classic group discussion around a page or two of code, where developers debate code written by a team member working in the same building does not scale to large-scale or distributed projects. This is also a difficult model for academia, where research groups may not contain more than one person knowledgeable enough about a piece of software to provide any useful feedback. In a distributed world, code to be reviewed, in the form of a proposed change to code in a project, is often submitted as a patch, via e-mail, or as a pull request which is managed online using a platform such as GitHub, BitBucket or GitLab.
Constructive Code Critique
So why is it important to think about code review? First, we know that this is an important aspect of the creation and maintenance of quality code, with significant evidence for its cost-effectiveness for bug detection (see, for example, citations 65 and 66 in Wilson et al.). Second, a review is a core part of the process of community building that is necessary for the creation of modern software. It is in this area that a kind and balanced review can be most important. A good review can add a member to the community and the project could benefit for years to come. A bad experience can drive off a potential new community member. Taken to the extreme, a project can get a reputation as being unwelcoming. Finally, code review can allow maintenance of knowledge about changes to the code base across a development team, building strength in the project and providing the community spirit that draws many contributors to these projects. We consider a good code review to consist of three stages, before, during and after the merging of the submitted code.
Before the review
The review process starts before a line of code is written. This ensures a clear and well-directed process by all involved. First of all, it is important to ensure that all potential reviewers are identified and know that they might be involved in a code review, to begin with.
Once these individuals are identified, all should be involved in the design of two documents:
Contribution guidelines: what constitutes a contribution to the code base, how should this be presented and what is expected by the reviewers. How should tests be run/written? What sort of documentation is expected? (e.g. a good “how-to” guide describing how to exercise the contribution, with sample data, can be valuable) Is there a style convention being used? For more ideas of things to include, take a look at this post by Github.
A code of conduct: loosely speaking, this sets the standard for peer to peer behaviour and extends a level of expectation not only from reviewers to contributors but vice versa. Here is the Python code of conduct as an example.
The purpose of these two documents (and any other that might be important for your project) is to set a level of expectation from all parties. More importantly, though it helps with the process of depersonalising the next part of the review: any comments coming from the code review will, hopefully, no longer be viewed as personal, subjective ones but based on the guidelines of the project itself.
Similarly, automated tests and style checkers (e.g. PEP8 and/or Pylint) can automatically generate information on certain aspects of code quality for both submitters and reviewers. Tie as much as possible into a test framework/continuous integration tools (like Travis CI or Gitlab CI), and state that these will be consulted when reviewing pull requests or proposed patches, again, so the review itself becomes less personal.
During the review
Code is personal work and receiving comments on it can be difficult. It is important that you depersonalise the code review and never attack personal choices in the code. Code reviews should judge if the code is “good enough”, even if it is not written in the way you would write it.
When presenting your review, refer back to automatically-generated content (e.g. the results of style checkers) as well as the contribution guidelines described in the previous section as this helps depersonalise your comments. These documents also help ensure that aspects not covered by automated processes (cryptic or verbose variable names, for example) can also be discussed in a constructive way. These documents can also provide an educational benefit, helping to increase awareness within contributors of good programming practices, and therefore to improve the quality of their code generally.
It is important that you are correct and precise in your review and suggestions. You should avoid suggesting things that might actually not work, only suggest alternatives that you know, and have checked, will work.
Including examples of code (formatted in a way that clearly shows inputs and output) is a good way of ensuring the discussion is positive and constructive. It can be linked to specific technology or software stacks (e.g. in astrophysics it will be better to follow guidelines similar to Astropy).
Reviews can and should be a discussion and a building process which will improve the quality of the code and, by consequence, the quality of the software.
After the review
Regardless of what issues your review has revealed, or how challenging some of the code may have been to review, remember that the developer has voluntarily submitted their work to your criticism. Thank them for allowing you to read their code. Be sure to highlight anything you have learned from their code, or in the process of doing the review, as this emphasises that the review is a two-way process, a conversation.
If the code is a submission to your open source project, then also thank the developer for taking the time to make their submission. A little bit of gratitude could go a long way, promoting you and your software as being welcoming to contributors, which can help you grow the community around your software.
If you have no response to your review, you may want to do a friendly follow-up contact, to find out how they are getting on with any changes, and to offer a chance to clarify any points you raised. Again, this keeps the dialogue going and could reassure the developer that their code was not all bad!
That concludes our advice on reviews, endeavouring, in the spirit of Fox News, to ensure reviews are “fair and balanced”. If you have any hints and tips of your own, please feel free to share them with us!