BioC 2014 - the right way to conduct Bioconductor

Posted by a.pawlik on 26 August 2014 - 10:00am

Last month saw the BioC 2014 conference take place at the Dana-Farber Cancer Institute, Boston, MA. Starting with a Developer Day on July 30th, it continued with a series of talks and workshops until August 1st.

Bioconductor is an R-based open-source, open-development software project. It provides tools for the analysis and comprehension of high-throughput genomics data. First developed in 2001 by Robert Gentleman, who also co-created R with Ross Ihaka, it is overseen by a core team based at the Fred Hutchinson Cancer Research Center, alongside several other American and international institutions.

From a programming point of view, Bioconductor inherits all the benefits of R, a high-level, expressive language that allows users to prototype new computational methods quickly and easily. It also offers a well-established system for packaging software, data and annotation together with documentation, as well as state-of-the-art support for statistical computing, data mining and visualisation. Since the very beginning of the project, special emphasis has been placed on documentation and reproducible research. Each Bioconductor package must, for example, include a vignette: a dynamically generated document that provides a task-oriented description of the package's functionality.

The project further promotes good practices in software development. For instance, package authors are strongly encouraged to include unit tests. All the software released through the Bioconductor project is open source and is distributed both as standard R packages and through a public Subversion server. Developers and users are invited to contribute to the project by submitting their own packages, each of which is individually reviewed before acceptance. One of the major strengths of the Bioconductor project is that it brings together developers with a wide range of skills, such as statisticians, computational biologists, computer scientists and biologists.

The collaborative nature of the project is also reflected in the interoperability of its components, which the project encourages by promoting the re-use of existing packages and classes. Finally, the project provides training to researchers on computational and statistical methods for the analysis of biological data.

The Bioconductor project is extremely well regarded in the field of computational biology and has produced some of the most respected software used in biology. It provides a friendly and constructive environment for beginners and experienced computational biologists and bioinformaticians alike.