Going the Distance with natural abundance

By Alexander Hay, Policy & Communications Consultant, talking with Eric Rexstad, University of St. Andrews.

This article is part of our series: Breaking Software Barriers, in which we investigate how our Research Software Group has helped projects improve their research software. If you would like help with your software, let us know.

Abundance is a good thing not just for animals, but also for the researchers studying them. Measuring it is, however, harder than it sounds, which is why it is an area of particular interest for Eric Rexstad, research fellow at the University of St. Andrews' Centre for Research into Ecological and Environmental Modelling.

The method in question is distance sampling, which estimates the population of a particular species in a given area. For example, "how many harbour porpoises live in the North Sea?" as Eric puts it. Yet this leads on to more complex questions - in particular, how do animal populations react to perturbations or changes in the local environment, such as those caused by pollution or development?

Distance gets distant

One attempt to gauge this more complex picture is Distance, a software package for biologists that has been in development at St. Andrews for the last 20 years. It works by estimating abundance in a given area via the survey data entered into it by researchers, and by modelling how observers detect animals in the first place.
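To give a flavour of the idea, here is a minimal sketch of a textbook line-transect estimate, not the code inside Distance itself. It assumes a half-normal detection function with no truncation, so the chance of spotting an animal falls off smoothly with its perpendicular distance from the observer's path. The function name, the survey numbers and the units are all hypothetical.

```python
import math

def estimate_abundance(distances, transect_length, region_area):
    """Textbook line-transect estimate, assuming a half-normal detection
    function g(x) = exp(-x^2 / (2 * sigma^2)) and no truncation.

    distances: perpendicular distances of detected animals from the line
    transect_length: total line length, in the same units as distances
    region_area: study area, in those units squared
    """
    n = len(distances)
    if n == 0:
        raise ValueError("at least one detection is needed")
    # Maximum-likelihood estimate of sigma for untruncated half-normal data.
    sigma = math.sqrt(sum(x * x for x in distances) / n)
    if sigma == 0:
        raise ValueError("all distances are zero; sigma cannot be estimated")
    # Effective strip half-width: the integral of g(x) from 0 to infinity.
    esw = sigma * math.sqrt(math.pi / 2)
    # Animals per unit area: detections over the effectively searched area.
    density = n / (2 * esw * transect_length)
    return {
        "sigma": sigma,
        "esw": esw,
        "density": density,
        "abundance": density * region_area,
    }

# Hypothetical survey: five detections along 2,000 m of transect
# in a 1,000,000 square metre study area.
result = estimate_abundance([5.0, 12.0, 3.5, 20.0, 8.0], 2000.0, 1_000_000.0)
print(round(result["abundance"], 1))
```

The key step is the effective strip half-width, which converts "how far away could we still see animals?" into an area that was, in effect, fully searched.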

Distance offers a range of models that can be fitted to the survey data, as well as the means to assess each one so the best choice for the job can be selected. The estimates come with measures of precision, and users can choose from a wide range of options to adjust and customise the model being developed.
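Choosing between candidate models can be sketched in the same spirit. The example below fits two simple detection functions to the same distances and compares them using the Akaike Information Criterion (AIC), where a lower score is better. Using AIC, the choice of these two particular models, and every name in the code are assumptions made for illustration rather than a description of how Distance is implemented.

```python
import math

def aic_half_normal(distances):
    """AIC for an untruncated half-normal model of detection distances."""
    n = len(distances)
    sum_sq = sum(x * x for x in distances)
    sigma = math.sqrt(sum_sq / n)  # maximum-likelihood estimate
    # Density of distances: f(x) = 2 / (sigma * sqrt(2*pi)) * exp(-x^2 / (2*sigma^2))
    log_lik = n * (math.log(2) - math.log(sigma) - 0.5 * math.log(2 * math.pi))
    log_lik -= sum_sq / (2 * sigma ** 2)
    return 2 * 1 - 2 * log_lik  # one fitted parameter

def aic_negative_exponential(distances):
    """AIC for an untruncated negative-exponential model."""
    n = len(distances)
    scale = sum(distances) / n  # maximum-likelihood estimate
    # Density of distances: f(x) = (1 / scale) * exp(-x / scale)
    log_lik = -n * math.log(scale) - sum(distances) / scale
    return 2 * 1 - 2 * log_lik  # one fitted parameter

def select_model(distances):
    """Return the name of the lowest-AIC model and all the scores."""
    if not distances or not any(x > 0 for x in distances):
        raise ValueError("need at least one positive distance")
    scores = {
        "half-normal": aic_half_normal(distances),
        "negative-exponential": aic_negative_exponential(distances),
    }
    best = min(scores, key=scores.get)
    return best, scores

# Hypothetical perpendicular distances, in metres.
best, scores = select_model([1.0, 2.0, 3.0])
print(best, {name: round(score, 2) for name, score in scores.items()})
```

Both models here have a single parameter, so the comparison comes down to which one gives the observed distances the higher likelihood; with richer models, AIC also penalises the extra parameters.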

This does, however, pose a dilemma. Distance is in a state of constant development, and so needs to be updated on a regular basis. Some parts of the software are also more than 15 years old. Yet the developers have other commitments to tend to, as well as the small matter of a 10,000-strong user community that needs both training and support.

"With programmer time at a premium, we faced decisions about whether to include a rich new set of techniques without a GUI, or include a small set of new features while maintaining the legacy of the user interface," Eric explains. 

“A drink from a fire hose”

The solution to these problems was the Institute, which Eric contacted during one of its Open Calls. As he explains, "Our initial contact, as I recall, was via email and Skype with Steve Crouch. I remember having the opinion that he understood the nature of our challenges quite directly."

The next step was to await Steve's proposed course of action. This came in the form of the Institute's Software Architect, Mike Jackson. Working with him turned out to be a revelation for Eric. "My colleagues and I likened working with Mike to getting a drink from a fire hose. The amount of detail he provided in answer to vague questions on our part was astounding. For the specific questions we asked him, the answers were even more detailed."

The resulting changes include a new cloud-based internal documentation system, which is certainly a step up from "some handwritten notes in someone's filing cabinet and a few Word documents living on a couple of PCs" as Eric puts it. In addition, there is now an enhanced GitHub repository available on the main Distance site, which makes it easier to give the user community much-needed support.

Keeping down the "expense" of software

Eric points out that the Institute also gave great advice on how to organise the software itself. "We are in a position to fundamentally reorganise the heart of the software.  By doing so, this will enable us to maintain the software into the future and facilitate the process of adding new functionality to the software," Eric continues. "Our hope is that the lag between methods appearing in the statistical literature and software available to biologists and resource managers will be shortened as a result of a reorganising of our underlying analysis 'engines'."

Overall, Eric strongly recommends the Institute's help, including Mike's voluminous answers, which he and his colleagues are still going through. Given the contributions made by the consultation to Distance, which, as noted, has been in existence for over 20 years, "it is our feeling that the Institute can make even more valuable contributions to software projects that are still in their infancy," Eric concludes.

He adds that since software can be "expensive" in terms of time invested in its development, it stands to reason that it should be easily reused. "Therefore the Institute has the role of assisting researchers in developing software that can continue to be useful not just until the next paper gets written, but with a longer view of the usefulness of software to the scientific enterprise."

If you'd like free help to assess or improve your software, why not submit an application to the Institute's Open Call?

Posted by s.crouch on 20 April 2015 - 10:10am