Published online 16 November 2011. Nucleic Acids Research, 2012, Vol. 40, Database issue D1301–D1307. doi:10.1093/nar/gkr1074

WikiPathways: building research communities on biological pathways

Thomas Kelder1,*, Martijn P. van Iersel1, Kristina Hanspers2, Martina Kutmon1, Bruce R. Conklin2, Chris T. Evelo1,3 and Alexander R. Pico2

1Department of Bioinformatics - BiGCaT, Maastricht University, Maastricht, The Netherlands, 2Gladstone Institute of Cardiovascular Disease, San Francisco, California, USA and 3Netherlands Consortium for Systems Biology (NCSB), The Netherlands

Received August 6, 2011; Revised October 2, 2011; Accepted October 28, 2011

*To whom correspondence should be addressed. Tel: +31 88 866 4270; Email: [email protected]
Correspondence may also be addressed to Alexander R. Pico. Tel: +1 415 734 2741; Fax: +1 415 355 0960; Email: [email protected]

Present addresses:
Thomas Kelder, TNO, Research Group Microbiology and Systems Biology, Zeist, The Netherlands.
Martijn P. van Iersel, European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridge, UK.

© The Author(s) 2011. Published by Oxford University Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

ABSTRACT

Here, we describe the development of WikiPathways (http://www.wikipathways.org), a public wiki for pathway curation, since it was first published in 2008. New features are discussed, as well as developments in the community of contributors. New features include a zoomable pathway viewer, support for pathway ontology annotations, the ability to mark pathways as private for a limited time and the availability of stable hyperlinks to pathways and the elements therein. WikiPathways content is freely available in a variety of formats such as the BioPAX standard, and the content is increasingly adopted by external databases and tools, including Wikipedia. A recent development is the use of WikiPathways as a staging ground for centrally curated databases such as Reactome. WikiPathways is seeing steady growth in the number of users, page views and edits for each pathway. To assess whether the community curation experiment can be considered successful, here we analyze the relation between use and contribution, which gives results in line with other wiki projects. The novel use of pathway pages as supplementary material to publications, as well as the addition of tailored content for research domains, is expected to stimulate growth further.

INTRODUCTION

WikiPathways (http://www.wikipathways.org) is a resource for biological pathways in the form of a wiki. It serves as a repository for biological knowledge in the form of pathway diagrams and as a platform for curating, sharing and publishing pathways. We launched WikiPathways in 2008 as an experiment in community-based curation of biological pathways (1). WikiPathways has continued to develop and has been adopted by the research community in several ways.

Here, we present the latest developments for WikiPathways, focusing on two specific aspects. In the first part, we highlight new features developed in recent years, and show how they fit with our general philosophy of community curation and collaboration. In the second half, we analyze the size and activity of the community to assess the success of WikiPathways so far.

New features of WikiPathways

Pathway diagrams are found everywhere: in textbooks, in review articles, on posters and on whiteboards. Their utility is to turn abstract knowledge into an understandable visualization. WikiPathways enables biologists to capture their rich, intuitive mental models of biological pathways and avoid the incomprehensible 'hairball' state of networks derived from big data. From this understanding of how pathways could be used, we derived a number of assumptions that have guided the addition of new features. The first assumption is that manual curation leads to higher quality pathways in the long term. Consequently, we developed features that make manual interaction with the site easier. Our second assumption is that the success of WikiPathways is dependent on the community of curators, and thus we have added features to stimulate growth of the community and avoided features that raise the barrier to entry for newcomers. Finally, it is our goal for WikiPathways to be a public resource for biological research. The content should be easily accessible for a wide range of applications, including those that we support directly, as well as those that we have not envisioned ourselves.

New developments in the pathway page

The most visible aspect of WikiPathways is the pathway page. Each pathway has a dedicated page that provides information on a specific biological mechanism, including the pathway diagram, description, hyperlinks to detailed information about genes, proteins and metabolites, and relevant literature references. Entities such as genes, proteins or metabolites in a pathway can be annotated with many different identifier systems, such as Ensembl (2), Entrez Gene (3) or ChEBI (4). Hyperlinks are provided to many different external sources of information, such as genome browsers, experimental platforms, protein databases, Gene Ontology and Wikipedia. These are directly available by clicking on a gene or protein box on the pathway page and allow the researcher to browse to detailed information related to components of the pathway. This way, the pathway page provides an organized summary of information related to a biological mechanism, and provides a starting point for researchers to browse through more specific and detailed resources.

Earlier versions of the pathway page contained a static image of the pathway, without interactive access to linked databases. Zooming in was possible only after opening the editor applet. Aiming to increase the manual interaction with the pathway page, we replaced the static pathway image with an interactive pathway viewer (Figure 1). This interactive viewer makes it possible to zoom and pan the diagram and click on elements in the pathway to view detailed information, similar to the popular Google Maps interface. Genes, proteins and metabolites can be clicked for direct access to external databases. The interactive viewer also contains a search function that makes it easier to locate elements on the pathway diagram.

Figure 1. The new interactive pathway viewer.

Organizing pathway information

Currently, WikiPathways contains over 1600 pathways and supports 21 species, including vertebrates, several plant species, bacteria and model organisms such as worm, yeast and fruit fly. To support manual interaction with this growing amount of knowledge, we attempted to make it easier to organize and browse. Therefore, we recently implemented an ontology annotation feature that allows each pathway to be annotated with terms from several ontologies covering different topics [these currently include the Pathway, Human disease and Cell type ontologies from the BioPortal collection (5)]. This allows users to specify exactly the context that applies to the pathway, e.g. a specific disease, organ or cell type. By using the pathway ontology, a hierarchical organization of the pathways can be created that groups smaller, more specific pathways (e.g. oxidative phosphorylation) into larger superpathways (e.g. energy metabolism). These annotations make it easier to find pathways related to a specific topic, by allowing users to search pathways by ontology terms. They will also facilitate data analysis and visualization methods; for example, a pathway interaction network could be built from the hierarchical organization in the pathway ontology to analyze biological mechanisms at different levels of detail.

Pathways as a public resource

WikiPathways is a public resource intended to benefit biological research worldwide. All content is freely available under the Creative Commons Attribution license (6). Because of this license, there is no limit to redistribution of pathway content. The WikiPathways content is already accessible from several external resources, such as NCBI BioSystems (7) and BioGPS (8). We are continually looking for new ways to make pathway information available to a wider public. For example, we recently added Interactive Pathway Maps into human gene and pathway-related articles at Wikipedia (9). Implemented as reusable templates, these interactive pathway maps link related Gene Wiki (10) articles. Each of these pathways at Wikipedia is viewed tens of thousands of times per day. We were careful not to add flagrant links to WikiPathways. Clicking on pathway at Wikipedia, for example,

become publicly available to the WikiPathways community.

Pathways in bioinformatics applications

Pathways produced at WikiPathways are in a format that can be directly used in downstream data analysis by a number of software tools. Thus, we complete a cycle starting with researcher knowledge that, when synthesized with standardized data, leads to novel pathway models that can be used to visualize and analyze other data sets, leading to new insights, experiments and knowledge. WikiPathways content is distributed through numerous online resources and bioinformatics software packages.