Anatomy of BioJS, an Open Source Community for the Life Sciences

FEATURE ARTICLE: CUTTING EDGE (elifesciences.org)

GUY YACHDAV*, TATYANA GOLDBERG, SEBASTIAN WILZBACH, DAVID DAO, IRIS SHIH, SAKET CHOUDHARY, STEVE CROUCH, MAX FRANZ, ALEXANDER GARCÍA, LEYLA J GARCÍA, BJÖRN A GRÜNING, DEVASENA INUPAKUTIKA, IAN SILLITOE, ANIL S THANKI, BRUNO VIEIRA, JOSÉ M VILLAVECES, MARIA V SCHNEIDER, SUZANNA LEWIS, STEVE PETTIFER, BURKHARD ROST AND MANUEL CORPAS*

*For correspondence: [email protected] (GY); [email protected] (MC)

Abstract

BioJS is an open source software project that develops visualization tools for different types of biological data. Here we report on the factors that influenced the growth of the BioJS user and developer community, and outline our strategy for building on this growth. The lessons we have learned on BioJS may also be relevant to other open source software projects.

DOI: 10.7554/eLife.07009.001

This is an open-access article, free of all copyright, and may be freely reproduced, distributed, transmitted, modified, built upon, or otherwise used by anyone for any lawful purpose. The work is made available under the Creative Commons CC0 public domain dedication.

Yachdav et al. eLife 2015;4:e07009. DOI: 10.7554/eLife.07009

Introduction

BioJavaScript (BioJS; http://biojs.net/) was set up to meet a need for an open source library of reusable components to visualize and analyse biological data on the web (Figure 1; Corpas et al., 2014). These components are discrete modules that can be reused, extended and combined to meet a particular visualization need. Unlike proprietary (or closed-source) systems, which are typically distributed as ‘executable’ files under restrictive licenses, open source software projects make their source code freely available under a permissive license (Millington, 2012; Balch, et al., 2015). This allows other users to modify, extend and re-distribute the software with few restrictions and at no cost to other users.

However, the fact that the community is allowed and encouraged to contribute to an open source software project is no guarantee that they will contribute. It is certainly not a sufficient condition for ensuring sustainability: repositories of open source software such as GitHub (https://github.com) and SourceForge (http://sourceforge.net) are littered with abandoned projects that have failed to gain the support needed to survive beyond the originator’s initial enthusiasm and/or funding.

In the case of BioJS we were acutely aware of the need to gain buy-in from the community for two reasons:

1. To fulfil the vision of a suite of tools capable of displaying diverse biological data requires expertise and capacity that is well beyond that of any individual group of developers working in isolation.
2. To encourage users to spend time integrating BioJS tools into their websites and applications, the project has to have, and be seen to have, a potential lifespan far beyond that of its initial funding.

Predicting the success of projects is hard, and there is nothing in the nature of open source initiatives that makes this any easier. There have been attempts to quantify success in open source software (see, e.g., the metrics proposed by Crowston et al., 2003), but these have gained little traction. More useful, we feel, have been articles that provide pragmatic advice to would-be open software developers, such as ‘Ten simple rules for the open development of scientific software’ (Prlic and Procter, 2012). Rather than repeating or re-working the general-purpose recommendations of others, here we describe what we have learned from the BioJS project.

Figure 1. Examples of BioJS tools. Tree Viewer (visualization of phylogeny data in a tree-like graph); MSA Viewer (visualization and analysis of multiple sequence alignments); Proteome (multilevel visualization of proteomes in UniProt; The UniProt Consortium, 2015); 3D structures (visualization of protein structures); Dot-bracket (visualization of RNA secondary structures); Muts-needle plot (presentation of mutation distribution across protein sequences); Protein Feature Viewer (visualization of position-based annotations in protein sequences); Plasmids (visualization of DNA plasmids); Pathway visualization (visualization of data from Pathway Commons; Cerami et al., 2011). Note that all visualization tools are native to the browser and do not require any specialized software (such as Adobe Flash, Java Virtual Machine or Microsoft Silverlight) to be installed or loaded.
DOI: 10.7554/eLife.07009.002

Project evolution

BioJS was initially developed in 2012 through a collaboration between the European Bioinformatics Institute (EMBL-EBI) and The Genome Analysis Centre (TGAC). The project began as a set of individual graphical components deposited in a bespoke online registry. Since then it has expanded into a community of 41 code contributors spread across four continents, a Google Group forum with more than 150 members (https://groups.google.com/forum/#!forum/biojs), and 15 published papers (as of May 2015). The project’s first paper, published in 2013 (Gomez, et al., 2013), has been cited 31 times to date.

The BioJS registry (http://biojs.io) offers a modern platform for fast and customizable access to components. The registry provides a centralized resource where deposited components can be discovered, tested and downloaded at the push of a button. BioJS components are organized as packages. This modular approach reduces duplication of effort and has been used by other projects, such as BioGem (Bonnal, et al., 2012).

Several recognized projects and institutions have already shown commitment to BioJS by utilizing and developing components: examples of this include PredictProtein (Yachdav et al., 2014), CATH (Sillitoe et al., 2013), Genome3D (Lewis et al., 2013), Reactome (Croft et al., 2014), Expression Atlas (Petryszak et al., 2014), Ensembl (Cunningham et al., 2015), InterMine (Smith et al., 2012), PolyMarker (Ramirez-Gonzalez et al., 2015) and the TGAC Browser (http://browser.tgac.ac.uk).

Building and growing an open source community

In this section we discuss the factors that, we believe, have played an important role in the growth of the BioJS open source community.

Clear mission and vision statements for the project

A vision is an ambitious aim that clearly states the value that the project adds. BioJS’s vision is that ‘every online biological dataset in the world should be visualized with BioJS tools’. This vision statement communicates the way that BioJS aims to ‘change the world’. The members of our community are energized and motivated to contribute to this worthy cause. We also have a mission statement that describes what we do in order to achieve our vision: ‘we develop an open-source library of JavaScript components to visualize biological data’. Whereas the vision is broad, ambitious and overarching, the mission statement is more practical and actionable. Having both a mission statement and a vision helps the BioJS community define what to do and who we are.

Aligning interests with the wider community

BioJS aims to create the largest, most comprehensive repository of JavaScript-based tools that visualize online biological data. Many research groups that generate and make their data available will benefit if we are successful. Groups developing new components will now find that a host of core JavaScript functionality (I/O, parsers, basic visualization) is already available as part of the BioJS library. This creates an ecosystem where new contributors can plug into and extend existing components according to their needs and preferences. Furthermore, developers can leverage the visibility of the BioJS registry to increase the exposure of their own tools, and […]

Low technological barriers

Shared interests and goals may not be enough if contributing to a project involves significant and costly technical effort, so we have designed BioJS such that a potential contributor is faced with a relatively small set of technical requirements: s/he needs to know JavaScript, to ensure that their component conforms to the NPM package manager (https://www.npmjs.com), and to follow some simple naming conventions; contributors are not required to understand the core system. Moreover, multiple parallel contributions can be worked on at once, eliminating cross-development dependencies and affording contributors independence in creating their own components.

Detailed and complete documentation and tutorials

The educational section of the BioJS website (http://edu.biojs.net) includes a comprehensive tutorial tailored for different types and levels of developers. Newcomers will find that the ‘getting started’ section sheds light on what the project is about and how core concepts (such as packaging) help to achieve modularity. Tutorials offer a step-by-step set of instructions on how to create a new BioJS component and detail the two-step process for publishing components. Contributors can then ‘graduate’ to the more advanced and detailed sections of the tutorial to explore the nuances of BioJS packages, components, the registry and the JavaScript technologies development stack. Good documentation is crucial for an open source project such as BioJS as it helps to ‘flatten’ […]
