By Jon Hill, University of York, and Software Sustainability Institute Fellow.
A controversial title, but one I hope to explain! While running a couple of workshops late last year, I spoke at length on a number of aspects of open science, including software sustainability, data and software licensing, collaboration and manuscript writing. I was inspired by this fantastic paper posted on arXiv by Greg Wilson et al. I will caveat this text with the fact that I am not a lawyer and none of the text below should be taken as legal advice.
After running these two workshops, “Tools for Constructing the Tree of Life” and “Good enough practice in Computational Geography”, and speaking to the attendees, I realised there is a disturbingly large gulf between those involved in the open science movement and the rest of academia. Many participants knew the terms 'open access' and 'open source', but conflated the ideas and didn't link any licences to them. There was also a lot of confusion about which licences to use and which were appropriate, as well as about the concept of copyright. Unfortunately, academics have to rely on the lawyers available at their institution, who are on the side of the University, and not necessarily on the side of an academic who wants to make their work more open. Stories of conflicts between PhD students and supervisors over what could and could not be done with data and software were quite common. So it was clear to me that a knowledge gap exists here.
What is open access?
Open access is defined as research that is free of all restrictions on access and free of many restrictions on use. It can apply to any type of research output, including peer-reviewed papers, books, manuscripts or pre-prints. There is often a distinction between gratis open access, where the output is free to access but there are still restrictions on usage, and libre open access, which truly fits the definition of open access. Typical licences for open access are the Creative Commons licences. CC-BY and CC0 are the recommended licences, despite publishing houses offering their own, more restrictive licences. Licences such as CC-BY-SA are known as copyleft licences, as they require derivative works to be released under the same or a compatible licence.
What is open source software?
Open source software (OSS) is the idea that software (including tests, documentation, blueprints, etc.) is available to anyone. Moreover, it should be free in both senses of the word: free as in speech and free as in beer. Anyone should be able to download and edit the source code. Under some licences a user can redistribute their changes as they see fit (often called a fork of the project), with some restrictions. As with open access, there are a number of licences available, which vary from the extremely liberal to licences compatible with patenting software. Common licences are GPL, Apache and BSD. Of these three popular licences, only the GPL is a copyleft licence; Apache and BSD are permissive licences that allow derivatives to be released under different terms.
Who is the copyright owner?
Beyond the question of which licence outputs should be released under, there’s also the complex question of copyright. For staff at universities (that includes postdocs, research fellows, technicians, lecturers, etc.), the University owns the copyright. They are your employer, and under UK law an employer owns the copyright in work created in the course of employment. There are grey areas though: what if you write a piece of work related to your job at home, on weekends only, but use the laptop funded by the University? The University almost certainly owns the copyright for that. For PhD students, the situation is more complex. Students are not employed by a University, so by default the copyright is owned by them. However, student projects will involve members of staff at a University (as supervisors, for example), so disentangling copyright could get very complex! Many Universities use some wording like the following:
For the avoidance of doubt, the University claims ownership of all intellectual property specified in section 12.3 of this statute which is devised, made or created:
by persons employed by the University in the course of their employment
by student members in the course of their studies, where a member of staff has also been involved...
There might also be other factors to consider, such as funder requirements. Research Councils UK (RCUK) mandates that all funded work should be open access (either Gold or Green), and the Wellcome Trust even recommends the CC-BY licence. When it comes to software, most funders are less restrictive. The Natural Environment Research Council (NERC), for example, only mandates that you do something with the code:
At completion of a project, the software must be exploited either commercially, within an academic community or as OSS.
When making code or data open source or open access, it is important to make sure the licence is appropriate. You may have to involve the University’s research contracts team. For example, I had to get my current University to sign a letter to allow me to commit code to an open source project, Fluidity, to enable the project to use the University’s copyright. Other open source projects don’t go to such lengths, but Fluidity could be commercially exploited (and indeed has been in the past). Your University should be supportive of releasing your software, data and research outputs under appropriate licences, so do get in touch with the appropriate person if you have any doubts.
How, then, can we mitigate such copyright issues to make sure research outputs remain freely available in the future? A real difference could be made by informing incoming PhD students, or even undergraduates, about the virtues of open science and open source software, about copyright issues, and about what these mean in practice. An academic should be able to concentrate on their teaching and research, but the complications and caveats of licensing often make not releasing software the easy option. This has a direct impact on sustainable software, with conflicts arising when a knowledgeable PhD student or postdoc, or even help from other sources such as the Institute's Research Software Engineers, contributes to software efforts. Greater knowledge instilled at an early career stage might help academics feel more confident in knowing what they are allowed to do with their software and data.