Infrastructure for closer collaboration - our top tips

By Mike Jackson.

What happens once your research software has become established and a number of developers, from a number of different projects, start working on it? What happens if you take the big step and open source your project? In this post, I follow up our earlier top tips on the infrastructure needed to start developing research software, and to strengthen community engagement and deliver reliable software, with top tips on infrastructure that fosters closer and more productive collaboration with developers.

Whether you're part of an international collaboration or just a solo researcher, infrastructure will make a valuable contribution to your development. Our tips will help you be unselfish: open, responsive, communicative and considerate to your fellow researchers to encourage them to engage with you and contribute to the onward development of your software and your research.

1. Keep your software releasable with a continuous integration server

Once you have automated your build and test system, you can set up a continuous integration server. A continuous integration server monitors your source-code repository and rebuilds your software, reruns your tests, publishes the results and can even email your developers to let them know when the work is done. This can happen every time an update is made, or the server can be configured to run at a regular interval.

A continuous integration server helps you keep your software in a releasable state so that you can quickly provide your community with bug-fixed versions and updates. You can also make your test results publicly available to increase confidence in your software. For example, check out the dashboards for Taverna and the MICE particle physics software (both implemented using the Jenkins continuous integration server). 
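
To make the cycle concrete, here is a minimal sketch in Python of a single pass of what a continuous integration server does: run a build step, run a test step and publish a summary. It is not how Jenkins or any other server is implemented. The `run_step`, `run_pipeline` and `summarise` helpers and the placeholder commands are hypothetical, and a real server would also watch the repository and notify your developers.

```python
import subprocess
import sys


def run_step(name, command):
    """Run one pipeline step and return (name, passed, output)."""
    result = subprocess.run(command, capture_output=True, text=True)
    return name, result.returncode == 0, result.stdout + result.stderr


def run_pipeline(steps):
    """Run the steps in order, stopping at the first failure."""
    results = []
    for name, command in steps:
        outcome = run_step(name, command)
        results.append(outcome)
        if not outcome[1]:
            break  # no point testing a build that failed
    return results


def summarise(results):
    """Produce the one-line-per-step report a server might publish."""
    return "\n".join(
        f"{name}: {'PASSED' if passed else 'FAILED'}"
        for name, passed, _ in results
    )


# Placeholder commands: substitute your project's real build and test commands.
EXAMPLE_STEPS = [
    ("build", [sys.executable, "-c", "print('compiling...')"]),
    ("test", [sys.executable, "-c", "assert 1 + 1 == 2"]),
]

if __name__ == "__main__":
    print(summarise(run_pipeline(EXAMPLE_STEPS)))
```

In practice you would not write this yourself: you would point an existing server at your repository and give it your build and test commands. The sketch only shows why an automated build and test system has to come first.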

2. Keep your code consistent with a style checker

Coding standards (or programming style) promote consistency and contribute to maintainable software by encouraging developers to write code that can be understood by other developers. They are an agreement on what code should look like. The standard can cover variable, class and function names, bracketing, indentation, whether to use tabs or spaces, and what comments are to be provided and in what form.

A style checker is a type of static code analysis tool that automatically checks whether code conforms to a set of coding standards, and reports on any deviations. A style checker can be added to a continuous integration server so that developers are given feedback on the consistency and presentation of their code, in addition to its correctness. Examples of style checkers are CheckStyle for Java, lint for C/C++ or, as used by the MICE particle physics software, pylint for Python.
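
As an illustration of the idea, the sketch below is a toy style checker written in Python. It is far simpler than the tools named above. The three rules (a maximum line length of 79 characters, no tab characters, lower-case function names) and the `check_style` function are assumptions chosen for the example, not any project's actual standard.

```python
import re

MAX_LINE_LENGTH = 79  # assumed limit for this example
FUNCTION_DEF = re.compile(r"^\s*def\s+([A-Za-z_][A-Za-z0-9_]*)")
SNAKE_CASE = re.compile(r"^_*[a-z][a-z0-9_]*$")


def check_style(source):
    """Return a list of (line_number, message) for each deviation found."""
    problems = []
    for number, line in enumerate(source.splitlines(), start=1):
        if len(line) > MAX_LINE_LENGTH:
            problems.append((number, "line too long"))
        if "\t" in line:
            problems.append((number, "tab character found; use spaces"))
        match = FUNCTION_DEF.match(line)
        if match and not SNAKE_CASE.match(match.group(1)):
            problems.append(
                (number, f"function name '{match.group(1)}' is not lower_case")
            )
    return problems


if __name__ == "__main__":
    sample = "def ComputeMean(values):\n\treturn sum(values) / len(values)\n"
    for number, message in check_style(sample):
        print(f"line {number}: {message}")
```

Because the checker reports deviations as data rather than printing them directly, a continuous integration job could fail the build whenever the returned list is not empty.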

3. Use a shared calendar to help track your commitments

It's difficult to arrange a time to meet when your developers are spread across many sites - and even many time zones. A shared calendar is a simple way to show everyone's availability. It might be something as simple as a wiki page you all update or a dedicated application like Google Calendar.

Shared calendars can help in many situations: deciding who can meet with a group of users or a potential collaborator, who is available to present at a conference, when you can all have a face-to-face development day. Shared calendars can also help identify when support might be less available (something that often happens when different developers' holidays coincide).

4. Track usage of your software to understand your impact

If you want to show the impact of your software, you will want to demonstrate its - hopefully widespread - distribution. A simple way is to record downloads from your website. However, downloads can be misleading because they don't always convert into uses. Ideally, you want to distinguish between downloads, users who used your software once or twice, and regular users. For this level of detail, you will need a usage tracker (also known as a usage monitor or usage data collector).

For example, the Eclipse Usage Data Collector keeps time-stamped records of the bundles, extensions and commands used by Eclipse users, as well as platform-specific information such as their operating system.

How you implement usage tracking depends on your software. Your software could record whenever a user opts to update it. This could be used as an indication of continued use - especially if the user performs updates over a long period of time. For an online portal, you could record usage as the time between a user logging in and logging out.

Some people are wary of tracking usage, because it can be perceived as spying. It's important that you do not collect personal information, that you are open about what information you collect, and that users can opt out if they wish (see Eclipse for example: they publish a complete list of the information they collect as well as their terms of use).
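
The sketch below shows one way these ideas could fit together in Python: time-stamped events stored against a random identifier rather than a name, an opt-out that also discards earlier records, and a simple split into downloads, occasional users and regular users. It is not how the Eclipse Usage Data Collector works. The `UsageTracker` class, the action names and the rule that use on three or more distinct days counts as "regular" are all assumptions made for the example.

```python
import uuid
from collections import defaultdict
from datetime import datetime, timezone


class UsageTracker:
    """Records time-stamped, anonymous events for users who have not opted out."""

    def __init__(self, regular_threshold=3):
        # Assumed rule: use on at least this many distinct days is "regular".
        self.regular_threshold = regular_threshold
        self.events = defaultdict(list)  # anonymous id -> [(timestamp, action)]
        self.opted_out = set()

    @staticmethod
    def new_anonymous_id():
        """A random identifier that carries no personal information."""
        return uuid.uuid4().hex

    def opt_out(self, user_id):
        """Stop tracking this user and discard anything already recorded."""
        self.opted_out.add(user_id)
        self.events.pop(user_id, None)

    def record(self, user_id, action, when=None):
        """Store one event unless the user has opted out."""
        if user_id in self.opted_out:
            return
        when = when or datetime.now(timezone.utc)
        self.events[user_id].append((when, action))

    def classify(self, user_id):
        """Label a user by the number of distinct days of actual use."""
        recorded = self.events.get(user_id, [])
        days = {when.date() for when, action in recorded if action != "download"}
        if not days:
            return "download only" if recorded else "unknown"
        if len(days) >= self.regular_threshold:
            return "regular"
        return "occasional"
```

Counting distinct days rather than raw events means a single busy afternoon does not make someone look like a regular user. Whatever rule you choose, publish it alongside the list of information you collect.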

That completes our latest top tips. As always, please let us know what you think.

Posted by m.jackson on 16 July 2012 - 5:20pm