By Matthew Upson, Freelance Data Scientist.
On 8th March I was lucky enough to go to the annual Git conference: 'Git Merge', held this year in Barcelona. With it being just 30 minutes from home, it was not something I could miss, especially after being kindly offered a ticket by Raniere from the Software Sustainability Institute, who could not make it.
First of all, congratulations (and thank you) to all the organisers and sponsors of Git Merge from GitHub, GitLab, Bitbucket, Microsoft, and Sticker Mule. It was a slick, well-organised event that went without a hitch, served excellent food, and took place in the impressive 16th-century Convent of the Angels at Barcelona's Museum of Contemporary Art in the centre of the city.
One of the interesting things about a conference on all things Git is the huge range of people who attend. For many (perhaps most) software developers, version control is an essential part of their daily work, so the potential audience is very wide. Aside from software developers who use and are interested in Git, there were two other distinct groups of people attending: those who are not yet Git users (or have only just become Git users) and wanted to learn more, and those with very deep expertise in Git, such as contributors to the official Git implementation and developers of web-based hosting services for version control such as GitHub, GitLab, or Bitbucket. It was an international event: I spoke to delegates from Belgium, Canada, Germany, South Africa, Spain, Switzerland, the UK, and the USA.
There was a wide range of presentations; these were my highlights:
Diane Hosfelt from Mozilla spoke about the importance of developing security-critical code in the open, and of managing it with a sensible workflow. Diane, like myself a recovering civil servant, described the change in culture she experienced on moving to Mozilla and working openly on security-critical browser components. In government (and indeed in any security-conscious large organisation) there is sometimes a reluctance to share work within the organisation, let alone outside it. This cultural barrier to openness can be motivated by nothing more than the fear that others might criticise your code. But as Diane pointed out when she asked the audience 'How many of you have tried to read the OpenSSL source code?' (not very many, it seemed): just because your code is in the open doesn't mean anyone is actually going to read it. Managing code with the Issue > Fork > Pull Request > Review > Rebase > Test > Merge workflow is therefore just as important as publishing openly, to ensure that critical code undergoes quality review, and that the review is documented for all to see.
Carlos Martin Nieto from GitHub gave an interesting breakdown of the tech behind GitHub backups. In short: your data is safe with GitHub, and moving from a system of storing data in volumes (on AWS EC2) to storing data as objects (on AWS S3) led to 90% savings in their backup costs. The key takeaway for me: if you must rely on a piece of software, test it very regularly. GitHub makes thousands of backups and restores every day using bespoke software based on Git, a practice which reduces the risk of unfortunate incidents like the partial loss of GitLab’s backups in 2017.
Edward Thomson from Microsoft, the maintainer of libgit2 (the library that sits behind Git functionality in many Git clients), gave a slick demonstration of a security vulnerability which caused him many sleepless nights. It exploited case insensitivity on macOS and Windows systems, allowing an attacker to execute malicious code using Git hooks through a social engineering attack. Edward highlighted the importance of community, which allowed developers working on Mercurial (another version control system, which also suffered from the issue), Git, libgit2, and GitHub to coordinate simultaneous patch releases to deal with this vulnerability. It also led to the creation of the Git security mailing list, to allow better communication in future.
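To make the case-insensitivity problem concrete, here is a minimal sketch in Python of the kind of check involved. It is not the code from Git or libgit2: the function name `looks_like_git_dir` and the example paths are hypothetical, and it assumes the simplest form of the problem, a path component such as `.Git` that a case-insensitive filesystem would treat as the `.git` directory. The real patches are likely to have covered more filesystem-specific cases than this.

```python
def looks_like_git_dir(path):
    """Return True if any component of a repository path would collide
    with '.git' on a case-insensitive filesystem.

    Assumption: paths use '/' as the separator, and only differences in
    letter case are considered. Other filesystem quirks are ignored.
    """
    return any(part.casefold() == ".git" for part in path.split("/"))


# Hypothetical paths that a checked-out tree might contain.
examples = [
    ".Git/hooks/post-checkout",  # would land inside .git/hooks on macOS or Windows
    "src/.GIT/config",
    "docs/git.md",
    ".gitignore",
]

for path in examples:
    print(path, looks_like_git_dir(path))
```

A client that refuses to write any path flagged by a check like this cannot be tricked into dropping a hook script into its own `.git` directory, which is the heart of the attack described above.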
William Chia presented on 'Empowering non-developers to use Git', based on his experiences at GitLab. GitLab loves openness, version control, and collaboration, so much so that GitLab is used by pretty much everyone in the company for pretty much everything. Marketing managers, graphic designers, lawyers, and sales professionals work in Markdown using static site generators (like Middleman, Jekyll, or Hugo) for their daily work, allowing easy collaboration that scales effortlessly.
What I love about this approach is that once the cultural barrier to working with something that looks a bit like code (but is actually easy and intuitive) has been overcome, the problem of recording who did what, when, and why, and of transmitting knowledge within the organisation, is at least partially solved. This is a huge issue that many large organisations struggle with, and GitLab's approach is a lightweight, easy, and cheap one that many other organisations could learn from. Bravo GitLab. Also, a peek behind the GitLab curtain: starting with GitLab 10.6, released at the end of March, GitLab Continuous Integration (GitLab CI/CD) will work not only with repositories in GitLab but also with projects hosted on GitHub.
Another highlight for me was a presentation on 'Git driven refactoring' by Ashley Ellis Pearce from GitHub. This was a thought-provoking and engaging presentation which demonstrated simple but elegant ways of testing adherence to the SOLID principles using your log and diffs. For example, 'When adding features there should be no red in the diffs' is a way of testing the open/closed principle: essentially, if there is red (that is, you needed to delete something) in order to add a new feature to a class, then your class may not be as open to extension as it should be.
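The 'no red in the diffs' rule can be approximated mechanically. The sketch below is not from Ashley's talk; it is one possible way to apply the idea, assuming you feed it the output of `git show --numstat`, which lists added and deleted line counts per file. The function names and the sample file paths are hypothetical.

```python
import subprocess


def parse_numstat(numstat_text):
    """Map each path to (added, deleted) line counts.

    Expects the tab-separated format printed by `git diff --numstat` or
    `git show --numstat`: added<TAB>deleted<TAB>path. Binary files are
    reported with '-' in place of the counts and are skipped here.
    """
    counts = {}
    for line in numstat_text.splitlines():
        parts = line.split("\t", 2)
        if len(parts) != 3:
            continue  # blank or unrecognised line
        added, deleted, path = parts
        if added == "-" or deleted == "-":
            continue  # binary file, no line counts available
        counts[path] = (int(added), int(deleted))
    return counts


def files_with_red(numstat_text):
    """Return the paths that had at least one line deleted, sorted."""
    counts = parse_numstat(numstat_text)
    return sorted(path for path, (_, deleted) in counts.items() if deleted > 0)


def numstat_for_commit(rev="HEAD"):
    """Ask Git for the numstat of one commit (must be run inside a repository)."""
    result = subprocess.run(
        ["git", "show", "--numstat", "--format=", rev],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


# Hypothetical numstat output for a commit that adds a feature.
sample = "10\t0\tsrc/new_feature.py\n3\t2\tsrc/shape.py\n-\t-\tlogo.png\n"
print(files_with_red(sample))
```

Here `src/shape.py` is flagged because two of its lines were deleted, which by this heuristic suggests the class had to be modified rather than extended. It is only a prompt for a closer look at the diff: deletions in a feature commit are not always a design problem.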
Ashley's presentation was another reminder of why it is so important to write useful commit messages. Having a clean and sensible history created from disciplined commits can become a tool in its own right when seeking to improve the quality of our code.
Finally, another presentation from Microsoft employees: Derrick Stolee and Johannes Schindelin (the latter of whom was responsible for implementing git rebase -i) demonstrated the intricacies of bringing a performant Git to Windows. Now that Windows development has been moved over to Git, Git for Windows users will be pleased to see a new emphasis on performance, driven by the need to handle the Windows codebase, which runs to hundreds of gigabytes.
One final thing worthy of note that I was introduced to at the event was xltrail, a server-based tool that adds support for Microsoft Excel workbooks (including embedded VBA) to Git, allowing them to enjoy the same level of version control as plain text files. A similar tool, Simuldocs, also exists to add version control with Git to Microsoft Word.
All in all, Git Merge 2018 was a great event, well worth attending in future.