City University of New York (CUNY), CUNY Academic Works, Publications and Research, Hunter College, 2009
This work is made publicly available by the City University of New York (CUNY) at CUNY Academic Works: https://academicworks.cuny.edu/hc_pubs/193

Published online 3 November 2008
Nucleic Acids Research, 2009, Vol. 37, Database issue D619–D622
doi:10.1093/nar/gkn863

Reactome knowledgebase of human biological pathways and processes

Lisa Matthews1, Gopal Gopinath1, Marc Gillespie1,2, Michael Caudy1, David Croft3, Bernard de Bono3, Phani Garapati3, Jill Hemish1, Henning Hermjakob3, Bijay Jassal3, Alex Kanapin1, Suzanna Lewis4, Shahana Mahajan5,6, Bruce May1, Esther Schmidt3, Imre Vastrik3, Guanming Wu1, Ewan Birney3, Lincoln Stein1,7 and Peter D’Eustachio1,6,*

1Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, 2College of Pharmacy and Allied Health Professions, St. John’s University, Queens, NY 11439, USA, 3European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK, 4Lawrence Berkeley National Laboratory, Berkeley, CA 94720, 5Hunter College, New York, NY 10010, 6NYU School of Medicine, New York, NY 10016, USA and 7Ontario Institute for Cancer Research, Toronto, ON, Canada M5G0A3

Received September 17, 2008; Revised October 14, 2008; Accepted October 16, 2008

*To whom correspondence should be addressed. Tel: +1 212 263 5779; Fax: +1 212 263 8166; Email: [email protected]

The authors wish it to be known that, in their opinion, the first two authors should be regarded as joint First Authors.

© 2008 The Author(s)
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

ABSTRACT

Reactome (http://www.reactome.org) is an expert-authored, peer-reviewed knowledgebase of human reactions and pathways that functions as a data mining resource and electronic textbook. Its current release includes 2975 human proteins, 2907 reactions and 4455 literature citations. A new entity-level pathway viewer and improved search and data mining tools facilitate searching and visualizing pathway data and the analysis of user-supplied high-throughput data sets. Reactome has increased its utility to the model organism communities with improved orthology prediction methods allowing pathway inference for 22 species and through collaborations to create manually curated Reactome pathway datasets for species including Arabidopsis, Oryza sativa (rice), Drosophila and Gallus gallus (chicken). Reactome’s data content and software can all be freely used and redistributed under open source terms.

EXPANDED COVERAGE OF HUMAN PATHWAYS

The current release of Reactome (version 26, September 2008) covers approximately 12.5% of 20 000 curated UniProt human proteins, a 2.7-fold increase over the last three years. Forty-six major domains of human biology, such as apoptosis, the HIV and influenza life cycles, DNA replication, transcription, hemostasis and carbohydrate metabolism are annotated, as are normal functions of 1005 proteins associated with OMIM disease phenotypes (http://www.ncbi.nlm.nih.gov/omim/).

IMPROVED TOOLS, SOFTWARE AND DATA MODEL

Revised orthology prediction methods

The OrthoMCL clustering procedure (1,2) (http://reactome.org/electronic_inference.html) applied to data from OrthoMCL DB (http://www.orthomcl.org/cgi-bin/OrthoMclWeb.cgi), Version 2, is used to identify orthologs of curated human proteins in each of 22 evolutionarily divergent species for which high-quality whole-genome sequence data are available (3). In line with changes in the OrthoMCL clustering procedure, only the longest transcript of each gene is considered, and a gene-based rather than a protein-based method is used to map the Ensembl identifiers used by OrthoMCL to the UniProt accessions used in Reactome. These changes have improved our success rate for electronic inference without measurably affecting accuracy.

Improved tools for analysis of large-scale data sets, data-mining and modeling

SkyPainter. The current version of the SkyPainter tool allows users to more effectively visualize the functional relationships among genes identified in large-scale experiments (Figure 1A and B). For a user-submitted list of genes, each reaction arrow on the reaction map is colored according to the number of user-specified genes whose products participate in the reaction. In addition, hypergeometric testing is now used to display statistically over-represented events in the event hierarchy. Statistically over-represented events may also be viewed as an ordered list and a mapping from submitted identifiers to reactions is provided. The colored reaction maps can be downloaded in publication quality PNG, SVG or PDF format.

BioMart. The newly implemented Reactome BioMart tool (www.biomart.org) facilitates data mining, cross-database analysis and large-scale analysis of gene function. A user can formulate queries across selected data sets (pathways, reactions and complexes) specifying data attributes and filters to narrow searches. For example, querying the Reactome dataset, a user can identify all complexes that contain a given protein (Figure 1C). Alternatively, a user can link a query of the Reactome ‘complexes’ dataset to a UniProt proteome query and retrieve sequences of all the proteins in these complexes.

Popular searches, e.g. all reactions or proteins or genes in a given pathway, all pathways inferred for a given species or all reactions or pathways involving a set of specified genes can be launched with predefined ‘canned’ queries. Additional context-sensitive help documentation is under construction.

Changes in the Reactome data model. These have been minimized, to facilitate data curation and use of the knowledgebase. Two additions are a ‘black box’ reaction class to allow the annotation of events for which not all defining attributes can be provided and an ‘entityOnOtherCell’ attribute to allow description of individual events and complexes that span two cells.

COLLABORATIONS

We are actively collaborating in the creation of model organism Reactome projects for Arabidopsis, rice, Drosophila and chicken. The Arabidopsis Reactome (http://arabidopsisreactome.org/), developed with the human Reactome data model, software and curation tools, contains seven manually curated pathways and 311 pathways inferred [...]

In collaboration with the Protein Ontology (PRO) group (8), Reactome annotations of complexes and physiological states of proteins are being applied to build a hierarchy of proteins and their functional forms. For example, protein objects used to illustrate TGF-β signaling in PRO use corresponding protein objects in the Reactome annotated pathway. The Reactome and PRO collaboration, as a source for curated annotations of proteins, extends the application of Reactome to a larger ontology community.

INCREASED DATA INTEGRATION AND USER SUPPORT

We are increasing the accessibility of Reactome data through data integration and improved online documentation. Reactome pathway annotations are now included in the Pathway Interaction Database (http://pid.nci.nih.gov/) (9). Reactome’s online documentation is now available as WIKI pages that provide easy access to user, author and curator guides as well as glossaries and standard operating procedure (SOP) guidelines. In addition, an Editorial Calendar lists modules being prepared for the knowledgebase, their planned release dates and contact information for module curators.

To support data mining, analysis and modeling of Reactome content by other groups, individual reactions and pathways can be exported in SBML (http://sbml.org/Main_Page) (10), Protégé (http://protege.stanford.edu), Cytoscape (http://www.cytoscape.org/) (11) and BioPax (http://www.biopax.org/) (levels 2 and 3) formats. The entire data content of Reactome can be downloaded as a MySQL database or in SBML or BioPax 2 and 3 formats. A SOAP based Web Services API is now available to access the Reactome data. Details about this API are provided in many forms including a ‘Flash’ tutorial and a PDF user’s guide at http://www.reactome.org/download/index.html.

FUTURE DIRECTIONS

A long-term objective of the Reactome project is to provide users with intuitive graphical representations of pathways and reactions. Toward this goal, Reactome has developed a beta version of an entity-level pathway visualization tool: through enhanced navigation features [...]
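
The hypergeometric over-representation test mentioned for SkyPainter can be illustrated with a short sketch. The text does not give the exact formulation used by Reactome, so this assumes the standard one-sided test: with N genes in the background, K of them annotated to an event, and n genes submitted by the user, the p-value is the probability of seeing at least the observed k annotated genes among the n submitted. The function names, the toy event memberships and the background size of 20 are all hypothetical and serve only as illustration; they are not Reactome data or Reactome code.

```python
from math import comb


def hypergeom_upper_tail(k: int, N: int, K: int, n: int) -> float:
    """Return P(X >= k) for X ~ Hypergeometric(N, K, n).

    N: size of the background gene set (assumed universe)
    K: number of background genes annotated to the event
    n: number of submitted genes
    k: number of submitted genes annotated to the event
    """
    if not (0 <= K <= N and 0 <= n <= N):
        raise ValueError("require 0 <= K <= N and 0 <= n <= N")
    upper = min(K, n)
    # math.comb returns 0 when the second argument exceeds the first,
    # so impossible draws contribute nothing to the sum.
    favourable = sum(
        comb(K, i) * comb(N - K, n - i) for i in range(max(k, 0), upper + 1)
    )
    return favourable / comb(N, n)


def rank_events(submitted, event_members, background_size):
    """Order events from most to least over-represented (smallest p first).

    Assumes every submitted gene belongs to the background set.
    """
    submitted = set(submitted)
    results = []
    for event, members in event_members.items():
        members = set(members)
        hits = len(submitted & members)
        p = hypergeom_upper_tail(hits, background_size, len(members), len(submitted))
        results.append((event, hits, p))
    return sorted(results, key=lambda r: r[2])


# Toy data (hypothetical): two events over a 20-gene background.
events = {
    "Apoptosis": {"A", "B", "C", "D"},
    "DNA replication": {"E", "F", "G", "H", "I", "J"},
}
ranked = rank_events({"A", "B", "C", "E"}, events, background_size=20)
for event, hits, p in ranked:
    print(f"{event}: {hits} submitted genes, p = {p:.4g}")
```

The sorted output corresponds to the ordered list of over-represented events described above. A real analysis would normally also correct for testing many events at once; whether and how SkyPainter does so is not stated in the text.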