Why mycode.R_final.v2_usethisone is not helping your workflow
Image from https://unsplash.com/@casman
By Olly Butters (editor), Esther Plomp, Sarah Gibson, Eirini Zormpa, Bezaye Tesfaye, Tudor Amariei
This post is part of the CW20 speed blog posts series.
Developing your ideal workflow is hard: it can have many components, will be context/domain specific and can be difficult to change once you have started. We think one of the most important parts, relevant to almost everyone, is version control. Here are some of the reasons why we think you should use it from the very beginning of your project.
It’s the Time Machine you always wished you had
So you’ve been regularly committing your source code, but all of a sudden you reach a point where for some reason it doesn’t work anymore or you decide the feature you are working on is junk and you want to go back to how it was before you started working on it. This is where version control can come to the rescue. You have two main strategies to approach this:
Assuming you have been regularly tagging your code to signify big changes (e.g. v1, v2, ice cream sandwich, Billie Holiday etc), then you can revert straight back to one of these. That does mean you may lose some intermediate bits of your work though.
The second option is to navigate your code, commit by commit, until you get to a working/desired version and then take it from there. You can easily check for the differences between the versions and keep the bits you want. A key point here is that committing often and with small changes makes this approach work better.
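To make the two strategies a bit more concrete, here is a rough sketch that drives Git from Python in a throwaway directory. It assumes Git is installed and on your PATH. The file name `analysis.py`, the tag `v1`, the commit messages and the placeholder user details are all made up for illustration.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path


def git(repo, *args):
    """Run one git command inside `repo` and return its standard output."""
    cmd = [
        "git",
        "-c", "user.name=Example Author",        # placeholder identity, so the
        "-c", "user.email=author@example.org",   # sketch works without global config
        *args,
    ]
    result = subprocess.run(cmd, cwd=repo, check=True, capture_output=True, text=True)
    return result.stdout.strip()


def demo():
    """Commit a working file, tag it, break it, then restore the tagged version."""
    with tempfile.TemporaryDirectory() as repo:
        script = Path(repo) / "analysis.py"  # hypothetical file name
        git(repo, "init", "--quiet")

        # A version that works: commit it and tag it as a milestone.
        script.write_text("print('working')\n")
        git(repo, "add", "analysis.py")
        git(repo, "commit", "--quiet", "-m", "Add working analysis")
        git(repo, "tag", "v1")

        # A later change that turns out to be junk.
        script.write_text("print('broken')\n")
        git(repo, "commit", "--quiet", "-am", "Try a new feature")

        # Strategy 2: list the commits (newest first) so you can inspect them one by one.
        history = git(repo, "log", "--oneline").splitlines()

        # Strategy 1: bring the file back exactly as it was at the v1 tag.
        git(repo, "checkout", "v1", "--", "analysis.py")
        restored = script.read_text()
        return history, restored


if __name__ == "__main__" and shutil.which("git"):
    history, restored = demo()
    print("\n".join(history))
    print(restored, end="")
```

In day-to-day use you would type the same `git tag`, `git log` and `git checkout` commands straight into a terminal (or click the equivalent buttons in your editor); the Python wrapper is only there to keep the example self-contained.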
Using version control can also make you more organised. First of all, version control makes it really explicit what has changed between different versions of your files, as long as you’re using helpful commit messages! This is really useful for code, but even more so for publications. Git is not just for code: you can use it for any text-based file.
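To give a feel for what "explicit about what has changed" looks like, here is a small sketch using Python's standard `difflib` module rather than Git itself. The two draft sentences and the file names `draft_v1.txt` and `draft_v2.txt` are invented for the example; the output is in the same unified diff style that `git diff` uses.

```python
import difflib

# Two hypothetical versions of the same paragraph in a paper draft.
old_version = [
    "The sample had 120 participants.\n",
    "Results were significant.\n",
]
new_version = [
    "The sample had 112 participants.\n",
    "Results were significant.\n",
]


def show_changes(old_lines, new_lines):
    """Return the unified diff between two versions as a list of lines."""
    return list(
        difflib.unified_diff(
            old_lines,
            new_lines,
            fromfile="draft_v1.txt",
            tofile="draft_v2.txt",
        )
    )


diff = show_changes(old_version, new_version)
print("".join(diff))
```

Lines starting with `-` were removed, lines starting with `+` were added, and unchanged lines are shown for context, so a changed number in a manuscript is as easy to spot as a changed line of code.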
Git may also make you more intentional in the way you write your code. When trying to analyse new data it can be tempting to just write code without first coming up with a structure of what the analysis should do. Thinking of your code as needing to work when you commit it can help you think more deeply about what you want to achieve when you start.
Image created by Scriberia for The Turing Way community and used under a CC-BY licence
As if the above wasn’t awesome enough, there is another great reason to add version control to your software development workflow - the magic that comes from automation. When you commit source code it is easy to have it automatically checked against your style guide (you know - those pesky four-space indents, or semicolons at the end of a line). It can even automatically fix any violations for you.
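As a toy illustration of the kind of check that can run on every commit, the sketch below flags two made-up style rules (indents in multiples of four spaces, no trailing whitespace) and fixes one of them automatically. The function names and the rules are assumptions for this example; in practice you would usually plug an existing linter or formatter into a commit hook or CI job rather than write your own.

```python
def find_style_problems(source, indent=4):
    """Return (line_number, message) pairs for lines that break the toy style rules."""
    problems = []
    for number, line in enumerate(source.splitlines(), start=1):
        # Rule 1: no spaces or tabs left dangling at the end of a line.
        if line != line.rstrip():
            problems.append((number, "trailing whitespace"))
        # Rule 2: leading spaces must come in multiples of `indent`.
        leading = len(line) - len(line.lstrip(" "))
        if line.strip() and leading % indent != 0:
            problems.append((number, f"indent is not a multiple of {indent} spaces"))
    return problems


def fix_trailing_whitespace(source):
    """Automatically repair the one rule that is safe to fix without thinking."""
    return "\n".join(line.rstrip() for line in source.splitlines()) + "\n"


# A hypothetical snippet with a three-space indent and trailing spaces on line 2.
sample = "def add(a, b):\n   return a + b  \n"

for number, message in find_style_problems(sample):
    print(f"line {number}: {message}")

fixed = fix_trailing_whitespace(sample)
```

After the automatic fix only the indentation problem remains, which mirrors how real tools tend to work: some issues get fixed for you, others are reported for you to sort out.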
Turning it up a notch you can start using continuous integration (CI), which can automatically build the environment you run your code in (i.e. something like a virtual machine or a container) and then run your code in it. If you have a set of tests such as: load in some data, run the program and see if the output matches that of a previous run, then bang, you’ve just written a regression test. Now if your cat walks across your keyboard while you code, you’ll notice if a plus has changed to a minus as your tests will start to fail.
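Here is a minimal sketch of that regression testing idea, with the cat included. The function names, the rainfall example and the baseline numbers are all invented; the point is only that outputs recorded from a run you trust are compared against what the code produces now.

```python
def total_rainfall(morning_mm, afternoon_mm):
    """The code under test: add two measurements."""
    return morning_mm + afternoon_mm


def total_rainfall_after_cat(morning_mm, afternoon_mm):
    """The same function after a cat turned the plus into a minus."""
    return morning_mm - afternoon_mm


# Inputs and outputs recorded from an earlier run that we trust (hypothetical values).
BASELINE = {
    (2, 3): 5,
    (10, 4): 14,
}


def regression_failures(function, baseline):
    """Re-run every recorded input and collect the cases whose output changed."""
    failures = []
    for inputs, expected in baseline.items():
        actual = function(*inputs)
        if actual != expected:
            failures.append((inputs, expected, actual))
    return failures


print(regression_failures(total_rainfall, BASELINE))
print(regression_failures(total_rainfall_after_cat, BASELINE))
```

The first call reports no failures, while the cat-edited version fails on every recorded case. A CI service would run a check like this on each commit and flag the change before it goes any further.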
To really push it to the max, you can start doing continuous deployment (CD) too. Here the code you commit to version control can automatically get deployed into your production environment if it has passed all the above CI. No more having to remember where to deploy code to, or manually doing it!
Using version control makes it easier to track who did what and when. This is especially convenient when you have multiple collaborators who would all like to work on the same file/code at the same time. Without version control, this collaborative work is prone to errors as people are able to overwrite the contributions of others. Platforms such as GitHub, GitLab and Bitbucket ensure that everyone has direct access to the right version, rather than having to send a dozen emails around. This allows you to easily collaborate with people across the globe in different time zones and will accelerate your progress.
Boost your CV
As more industrial and academic groups move towards flexible, remote and open working, having publicly viewable examples of your coding practices can really boost your profile when it comes to career development. Collaboration platforms such as GitHub can be used as an indicator of your ability to effectively collaborate remotely and asynchronously. It also gives an employer the opportunity to see the kinds of projects you’ve worked on, as well as the skills you’ve developed and how these could add value to the position they are looking to fill.
Contributing to a project doesn’t just mean writing code - submitting bug reports and reviewing code are just as important. Platforms like GitHub can also show your contributions to these on your profile.
The simple fact is that using version control and remote collaborative platforms is a standard feature of working in a software engineering group and is becoming the standard in larger academic consortia. Demonstrating your knowledge and use of these skills will work in your favour.
Tools and resources
So you managed to get this far and you think version control could help you, but the learning curve scares you. This is understandable. Facing a command-line interface is intimidating, to say the least. Using an advanced IDE (Integrated Development Environment) is going to make it a whole lot easier to jump into using any sort of version control system. There are loads to choose from, but we like Visual Studio Code, Atom and the JetBrains IDEs. All of these have version control integrations either out of the box or added via their package managers and will work with most programming languages.
A few really good links for getting started with Git are:
Do bear in mind that Git is a big thing in its own right, with lots of complexities you will never need to know about. So don’t be intimidated by it - you will probably only ever use what’s in the guides above.
In summary, version control can not only help you keep track of versions of your code (and roll back to earlier versions when you need to!), it also helps you to organise, automate and collaborate in your workflow and can function as your CV. Whilst the initial time investment may seem high, the benefits you will get from it will be higher and your future self will appreciate it!