Choosing the right open-source software for your project

When you need to fix a problem, it's all too easy to reach for the nearest software library or package that seems to fit your needs, but it's a risky move. A lot of time is invested in setting up new software, and if you haven't done your homework, you could be making an expensive mistake. This guide takes you through the questions you should ask before you invest your time in new software.

Why write this guide?

At the SeIUCCR Summer School in September 2011, we were asked a number of questions about choosing the right software, so we wrote this guide in response.

Why it is important to pick the right software

Often, you can find many open-source choices that appear to fit the bill, but picking the wrong software can have expensive consequences. A lot of time is required to learn new software and integrate it into your project, and time is money - especially when project goals have to be delayed. Choosing the wrong software can be an expensive mistake, so a little time invested in making an informed choice is time well spent.

What questions should you ask?

You've found a piece of software that seems to fit the bill, but how can you be sure it's a good choice? Let's begin by asking a few questions.

Does it do what you want?

It might sound obvious, but does the software do what you want it to do? What are your requirements? Try to think about the functionality you may need in the future, and whether the software goes far enough towards meeting your needs. Check for alternative packages that may fit the bill better. If you've come across the best fit so far, and it still doesn't quite do all you need, check to see if the software can be readily extended to do what you want. It might have a number of sensible extensibility points within a framework which can be coded against to provide plug-in functionality.
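To make the idea of an extensibility point concrete, here is a minimal sketch in Python. Every name in it (ExportPlugin, register, CsvExport) is hypothetical and invented for illustration; real packages define their own plug-in mechanisms, so check the documentation of the software you are evaluating.

```python
from abc import ABC, abstractmethod


class ExportPlugin(ABC):
    """Hypothetical extension point: the host defines it, plug-ins implement it."""

    @abstractmethod
    def export(self, data: dict) -> str:
        """Turn the host's data into text in some output format."""


# Registry of available plug-ins, keyed by format name.
PLUGINS: dict[str, ExportPlugin] = {}


def register(name: str, plugin: ExportPlugin) -> None:
    """Make a plug-in available to the host under the given name."""
    PLUGINS[name] = plugin


class CsvExport(ExportPlugin):
    """A plug-in written against the extension point, without editing host code."""

    def export(self, data: dict) -> str:
        return "\n".join(f"{key},{value}" for key, value in data.items())


register("csv", CsvExport())

result = PLUGINS["csv"].export({"a": 1, "b": 2})
print(result)
```

If the software you are considering offers something along these lines, you can add missing functionality without maintaining your own modified copy of its source.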

Of course, it's not always about just what you want. Think about who else will be using or developing against the software, i.e. your community. Is it suitable for them? Perhaps the community has established software that you should consider using instead.

It's also a good idea to check if any features you need in the software are likely to be deprecated in the future.
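One way to spot features on their way out is to look for deprecation warnings when you exercise the software. The sketch below assumes a Python library that signals deprecation through the standard warnings module; old_api is a hypothetical stand-in for a real library call, and other languages and projects announce deprecations differently (release notes are always worth reading too).

```python
import warnings


def old_api():
    """Stand-in for a library function that its maintainers have deprecated."""
    warnings.warn(
        "old_api() is deprecated; use new_api()", DeprecationWarning, stacklevel=2
    )
    return 42


def find_deprecations(func):
    """Call func and return the messages of any deprecation warnings it emits."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # make sure nothing is filtered out
        func()
    return [
        str(w.message)
        for w in caught
        if issubclass(w.category, (DeprecationWarning, PendingDeprecationWarning))
    ]


messages = find_deprecations(old_api)
print(messages)
```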

Is the software good for its role?

Investigate prior uses of the software. Has anyone used it in the manner you want to use it? Perhaps there are compatible use cases or case studies in the software's documentation or on its website that confirm the software's suitability.

At a technical level, is there evidence that the software will play well with your other software? If you're looking to use the software in a closely integrated fashion with other software, it may be worth checking out our guide, "Defending your code against dependency problems".

And, very importantly, does the software have a track record for reliability?

Is the software actively used, developed and supported?

The importance of an active community around the software shouldn't be underestimated. Is there evidence that the software is being used actively by others, and importantly, is the software supported? Check out the mechanisms for how support is provided: support forums, direct email support and issue trackers, and investigate whether they are actively used and have a good level of response to queries and issues.

If you intend to develop the software, the development model of the software is also important. Is it clear how to make contributions to the project - is there a well-defined policy for how this is handled, and how decisions are made for including such contributions?

Does the software have a future?

If development and support of a software package ends, due to lack of funding or because the software has been superseded, you could find yourself with unsupported and rapidly decaying software. Obviously, you want to avoid this!

Check whether the software, its development, support and community will be around within your own project's lifecycle (and preferably beyond). This is often difficult to gauge. The presence of an active and sizeable user/developer community can be a positive indicator. Look for evidence of planned future software releases, and whether the software has a good history of frequent releases. An actively maintained project website is also a good indicator. There might even be a roadmap for development.
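Release history can be turned into a rough number. The following sketch takes a list of release dates and reports the gaps between them and the time since the last release. The dates and the "today" value are made up for illustration; in practice you would copy them from the project's download page or release notes, and what counts as "frequent" depends on the kind of software.

```python
from datetime import date
from statistics import median


def release_gaps(release_dates):
    """Return the number of days between consecutive releases, oldest first."""
    ordered = sorted(release_dates)
    return [(later - earlier).days for earlier, later in zip(ordered, ordered[1:])]


# Hypothetical release dates for an imaginary project.
releases = [date(2010, 1, 15), date(2010, 6, 1), date(2010, 11, 20), date(2011, 4, 10)]
today = date(2011, 9, 1)  # fixed here so the example is repeatable

gaps = release_gaps(releases)
days_since_last = (today - max(releases)).days

print("Days between releases:", gaps)
print("Typical gap (median):", median(gaps))
print("Days since last release:", days_since_last)
```

A time since the last release that is many times longer than the typical gap is a hint, not proof, that development has slowed or stopped.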

An important aspect of sustainability is the adoption of open standards. Does the software support them appropriately (where you would expect to find them), and is there evidence of tried-and-tested interoperability with other implementations supporting those standards? Examples include the Keyhole Markup Language (KML) from the Open Geospatial Consortium for geographical annotation and visualisation, and the wide variety of Grid open standards from the Open Grid Forum.

How is the software provided?

In most cases, good user and developer documentation is a must. Are the prerequisites of the software well defined and straightforward to obtain and deploy, and do they fit your own requirements? For example, if the software doesn't support an appropriate operating system or programming language (if you intend to develop the software) it may not be suitable. For libraries, be sure to check the quality of the API documentation.

If you are intending to develop the software, an obvious question is whether you have access to the source code. Ideally this should be provided via a source-code repository such as Subversion or CVS. You should also check that it is easy to develop the software, that it's understandable and modular, and is straightforward to build. Does the software have support for testing it? As you develop new features, especially with complex software, it is important to make sure the feature hasn't broken any existing functionality. Regression tests can help reduce this risk. Ideally, the software should allow you to rapidly build and test against newly developed features.
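As an illustration of the regression-testing point, here is a minimal sketch using Python's built-in unittest module. The functions normalise and normalise_all are hypothetical: the first stands for existing behaviour that other code depends on, the second for a newly developed feature. The first test guards the old behaviour while the second covers the new one.

```python
import unittest


def normalise(name: str) -> str:
    """Existing behaviour that other code relies on: trim and lower-case a name."""
    return name.strip().lower()


def normalise_all(names):
    """Newly added feature built on top of normalise()."""
    return [normalise(n) for n in names]


class RegressionTests(unittest.TestCase):
    def test_existing_behaviour_unchanged(self):
        # Fails if a later change alters what normalise() returns.
        self.assertEqual(normalise("  Taverna "), "taverna")

    def test_new_feature(self):
        self.assertEqual(normalise_all([" Jmol", "JGraphT "]), ["jmol", "jgrapht"])


# Run the tests programmatically so the outcome can be inspected.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(RegressionTests)
outcome = unittest.TextTestRunner(verbosity=0).run(suite)
print("All tests passed:", outcome.wasSuccessful())
```

Software that already ships with a test suite like this, and documents how to run it, is much easier to develop safely.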

With respect to the software licence, it is very important that you check whether you have the right to use the software in its intended production environment, or the right to distribute it along with your software. Does the software itself respect third-party copyright and licensing of any supplied dependencies? If not, whilst you may adhere to the licence of the software itself, you may not adhere to the licensing of those third-party dependencies.
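For Python projects, a first pass over dependency licences can be automated with the standard library's importlib.metadata module. The sketch below lists what each installed distribution declares about its licence. It is only a starting point under stated assumptions: the "License" field is free text and is often missing, so the code falls back to "License ::" classifiers, and anything reported as UNDECLARED needs checking by hand. It is no substitute for reading the licences themselves.

```python
from importlib import metadata


def licence_report():
    """Map each installed distribution's name to its declared licence metadata."""
    report = {}
    for dist in metadata.distributions():
        name = dist.metadata.get("Name") or "(unknown)"
        # "License" is free text and may hold a whole licence file,
        # so keep only its first line.
        declared = (dist.metadata.get("License") or "").strip()
        first_line = declared.splitlines()[0] if declared else ""
        # Trove classifiers are a common second place to declare a licence.
        classifiers = [
            c
            for c in (dist.metadata.get_all("Classifier") or [])
            if c.startswith("License ::")
        ]
        report[name] = first_line or "; ".join(classifiers) or "UNDECLARED"
    return report


report = licence_report()
for name in sorted(report):
    print(f"{name}: {report[name]}")
```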

Choosing the right version

Of course, using the latest stable release, and not a development or snapshot release, is often the best course of action. Selecting a development release can offer more features, but often at the price of reliability.
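If a project publishes many versions, picking out the latest stable one can be scripted. This sketch assumes, purely for illustration, that stable releases are numbered MAJOR.MINOR.PATCH and that anything with a suffix such as "-beta1" or "-SNAPSHOT" is a development release. Projects vary in their numbering schemes, so check the convention used by the software you are evaluating.

```python
import re

# Assumed convention: stable releases look like "1.10.2" with no suffix.
STABLE = re.compile(r"^(\d+)\.(\d+)\.(\d+)$")


def latest_stable(versions):
    """Return the highest stable version string, or None if there is none."""
    stable = []
    for version in versions:
        match = STABLE.match(version)
        if match:
            # Compare numerically so that 1.10.x sorts above 1.9.x.
            stable.append((tuple(int(part) for part in match.groups()), version))
    return max(stable)[1] if stable else None


# Hypothetical list of versions offered on a download page.
available = ["1.9.0", "1.10.2", "2.0.0-beta1", "2.0.0-SNAPSHOT", "1.10.0"]
chosen = latest_stable(available)
print(chosen)
```

Note that the comparison is numeric rather than alphabetical; sorting version strings as plain text would wrongly place 1.9.0 above 1.10.2.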

If the project has been forked, try to determine whether the version you intend to use comes from the fork or from the original source project, and decide which source is more appropriate for your project.

It is certainly worth checking whether there are any known issues or bugs with the version that could cause problems, by looking at the release notes and issue tracker (if there is one). Known issues present an obvious risk. Are there solid plans to fix them, and are there any tried-and-tested workarounds in the meantime?

Some Good Examples

There are many good examples out there of successful open-source projects that tick a great many of the right boxes:

  • JGraphT: a Java library for producing graphs. User support and announce mailing lists, source code under revision control (SVN), feature and bug trackers with high activity and prompt responses, a user Wiki for all things JGraphT, a code contribution policy and good API documentation.
  • Taverna: a workflow management system, for designing and executing workflows. Started as a system for bioinformaticians, its use has expanded to include many other disciplines. Has comprehensive user, developer and API documentation, project roadmap, source code under revision control in Subversion with a contribution licensing policy (Contributor License Agreement), an extensibility framework, an active and responsive bug tracking system, supports a host of domain-specific and web service standards, regularly updated website, development funding until 2014, organised training and workshops and a long list of example uses of Taverna.
  • Jmol: an interactive web browser applet for viewing molecules. Has bug and feature request trackers with good activity, source code under revision control via SVN, user and developer mailing lists, a developer's guide, comprehensive API documentation, good project feedback, support for a multitude of standard chemistry formats, and a huge list of websites successfully using Jmol.

Ask the developers!

You shouldn't be afraid to ask the developer, or the developer community, any questions about the software you need answered. In fact, it's a sign of good support if the developers are friendly and approachable. Asking a developer can give you an indication of the type of responses you can expect from them in the future, as well as filling in any knowledge gaps.

Spending time early on to learn about the software - even if you eventually decide that it is unsuitable - is far better than coming to that realisation after the software has become an integral part of your project.

Further reading