Software Evaluation Guide

By Mike Jackson, Steve Crouch and Rob Baxter

How do I figure out if this software is "good"?

Assessing the quality of software - either your own or someone else's - is a tricky balance between hard objectivity and the very subjective (but very valid) individual user experience. The Software Sustainability Institute provide a software evaluation service based on two complementary approaches developed over many years in the research software arena. The service can help you to improve your software: it assesses general usability and identifies technical or development issues, as well as any barriers to sustainability.

Why write this guide?

This guide describes the two approaches we take to software evaluation, providing a set of guidelines that researchers and developers may find useful in performing their own assessments of code quality, usability and overall sustainability.

How to go about evaluating software

The two approaches we use are complementary; either can be used, and sometimes one approach makes more sense than the other.

Our criteria-based approach is a quantitative assessment of the software in terms of sustainability, maintainability, and usability. This can inform high-level decisions on specific areas for software improvement. This approach forms the basis of our online sustainability evaluation, a web-based assessment you can use straight out of the box.

Our tutorial-based approach provides a pragmatic evaluation of the usability of the software in the form of a reproducible record of experiences. This gives a developer practical insight into how users approach the software and into any technical barriers that could prevent adoption.

Tailor your approach

Whichever method you favour, please remember that the nature of the software under review can vary. It might be:

A software package released as a binary.
A software package released as source which the user must build.
An online portal.
A set of service endpoints e.g. web services or REST services.

There will likely also be related artefacts that fall under the remit of an evaluation, e.g. a project web site, wiki, blog, issue tracker, user documentation, tutorials, e-mail forums etc. As a result, you should not take this guide as a set of hard and fast rules, all of which must be applied, but rather as guidelines, some of which will apply in a given situation and some of which will not. Use your best judgement in selecting these, bearing in mind that the goal is to produce valuable information on the state of the software package.

Evaluation nature and scope

A software evaluation is done for someone. Someone wants to know about the state of a particular package, and may even be paying you to look into it! So, at the outset, you should agree with this "someone" the scope of the evaluation. This includes what software and other project resources will be evaluated and the user classes from whose perspective the evaluation will be done. The user classes determine the tasks that will form the basis of any evaluation, especially a tutorial-based evaluation. The following classes of user can be assumed:

User. A person who, depending on the artefact, downloads, installs, configures and uses the artefact but does not write any code to use in conjunction with it. The software may be a web portal, a GUI or a command-line tool. Within this class there could be different types of user e.g. for the journalTOCs portal the users could be researchers wanting to use the service or journal publishers wanting to find out how the service used their RSS feeds.
User-Developer. A user who writes code which extends but does not change the software e.g. a client to some service endpoints, or a pluggable component coded against some extensibility point. As an analogy, a developer of web services using Apache Axis.
Developer. A user who writes code that changes the software e.g. fixes bugs, makes the software more efficient, or extends its functionality. As an analogy, someone who changes Apache Axis to make WSDL2Java easier to use.
Member. A Developer who is a project member and has write access to the source code repository. Unlike a Developer, a Member has to be aware of such issues as the policy on upgrading to new versions of prerequisite packages, coding standards, who owns copyright, licensing, how changes are managed, whether they're expected to support components they develop, how the project is run etc. As an analogy, a member of the Apache Axis developer team.

As for how much time to spend on an evaluation to get useful information, our rule of thumb is that an ideal period is 1-2 weeks in duration (or 3-5 days of effort), depending on the complexity of the software and the nature of the evaluation tasks.

Evaluators need support too!

One important thing to determine before embarking on a major evaluation is the level and availability of support in case you run into technical problems. If you're performing a commissioned evaluation for the developers of the software, then checking and/or securing in advance the availability of the software development team during the period of evaluation can be vital. Actually, this is pretty important even if you're evaluating it for someone else! Of course, what support you might need depends on the type and depth of the evaluation, and the nature of the software.

Without the required level of technical support, the risks are that:

The period of evaluation could be greatly increased, perhaps exceeding an agreed effort or time tolerance for the activity.
An evaluation could fail completely, e.g. due to a basic typo in the instructions of which the developers are unaware but which they could readily help with. Of course, an evaluation could fail completely in any case, but it's particularly annoying for everyone concerned if it's simply due to a documentation typo which a 30-second IM chat with a handy developer could resolve.

It's perhaps ironic if you can't complete an evaluation because you can't get the software to work; but then, that is a valid evaluation result!

Tips on writing up

When you write up your evaluation findings, don't forget to include the basic facts:

When the evaluation was done.
A brief overview of the software, what it does, and the project/researcher that produced it.
What versions of the software were used and where these came from e.g. a web site, the developer.
The classes of user considered in the evaluation.

As there may be dozens of issues identified, put a recommendations section at the start of the report listing what you believe to be the most pressing or important issues encountered, along with, if possible, a rough estimate of what addressing each would involve (e.g. time to do, impact upon architecture etc.). Group these into sensible categories; it may also help to cross-reference them to the relevant sections in the report.
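If you keep your findings in a structured form before writing them up, the basic facts and the grouped recommendations described above could be captured along the following lines. This is only a minimal sketch in Python: the class names, field names and sample values are all hypothetical, invented for illustration, and are not part of any Institute tool or template.

```python
from __future__ import annotations

from collections import defaultdict
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Recommendation:
    """One pressing issue, as it would appear in the recommendations section."""
    category: str         # hypothetical grouping, e.g. "Documentation"
    summary: str          # what the issue is
    rough_estimate: str   # e.g. time to do, impact upon architecture
    report_section: str   # cross-reference to the detailed section


@dataclass
class EvaluationReport:
    """The basic facts every write-up should state, plus recommendations."""
    evaluated_on: date
    overview: str                 # what the software does and who produced it
    versions: dict[str, str]      # version used -> where it came from
    user_classes: list[str]       # e.g. "User", "User-Developer"
    recommendations: list[Recommendation] = field(default_factory=list)

    def recommendations_by_category(self) -> dict[str, list[Recommendation]]:
        """Group recommendations by category, keeping first-seen order."""
        grouped: dict[str, list[Recommendation]] = defaultdict(list)
        for rec in self.recommendations:
            grouped[rec.category].append(rec)
        return dict(grouped)


# Sample data below is made up purely to show the shape of a report.
report = EvaluationReport(
    evaluated_on=date(2024, 1, 15),
    overview="Example tool for processing data files, produced by an example project.",
    versions={"1.2.0": "project web site"},
    user_classes=["User", "Developer"],
    recommendations=[
        Recommendation("Documentation", "Install guide omits a prerequisite", "1 day", "3.1"),
        Recommendation("Build", "Build script assumes a fixed path", "2 days", "4.2"),
        Recommendation("Documentation", "No quick-start tutorial", "3 days", "3.4"),
    ],
)

for category, recs in report.recommendations_by_category().items():
    print(category)
    for rec in recs:
        print(f"  - {rec.summary} (estimate: {rec.rough_estimate}; see section {rec.report_section})")
```

Keeping the cross-reference and the rough estimate alongside each recommendation makes it straightforward to generate the opening recommendations section from the same notes used for the detailed findings.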

Software evaluation methodologies

Here are detailed descriptions of the two approaches we use at the Institute. Happy evaluating!

Criteria-based software evaluation (PDF) (DOC)
Tutorial-based software evaluation (PDF)