D470–D478 Nucleic Acids Research, 2015, Vol. 43, Database issue Published online 26 November 2014 doi: 10.1093/nar/gku1204 The BioGRID interaction database: 2015 update Andrew Chatr-aryamontri1, Bobby-Joe Breitkreutz2, Rose Oughtred3, Lorrie Boucher2, Sven Heinicke3, Daici Chen1, Chris Stark2, Ashton Breitkreutz2, Nadine Kolas2, Lara O’Donnell2, Teresa Reguly2, Julie Nixon4, Lindsay Ramage4, Andrew Winter4, Adnane Sellam5, Christie Chang3, Jodi Hirschman3, Chandra Theesfeld3, Jennifer Rust3, Michael S. Livstone3, Kara Dolinski3 and Mike Tyers1,2,4,*
1Institute for Research in Immunology and Cancer, Universite´ de Montreal,´ Montreal,´ Quebec H3C 3J7, Canada, 2The Lunenfeld-Tanenbaum Research Institute, Mount Sinai Hospital, Toronto, Ontario M5G 1X5, Canada, 3Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08544, USA, 4School of Biological Sciences, University of Edinburgh, Edinburgh EH9 3JR, UK and 5Centre Hospitalier de l’UniversiteLaval´ (CHUL), Quebec,´ Quebec´ G1V 4G2, Canada
Received September 26, 2014; Revised November 4, 2014; Accepted November 5, 2014
ABSTRACT semi-automated text-mining approaches, and to en- hance curation quality control. The Biological General Repository for Interaction Datasets (BioGRID: http://thebiogrid.org) is an open access database that houses genetic and protein in- INTRODUCTION teractions curated from the primary biomedical lit- Massive increases in high-throughput DNA sequencing erature for all major model organism species and technologies (1) have enabled an unprecedented level of humans. As of September 2014, the BioGRID con- genome annotation for many hundreds of species (2–6), tains 749 912 interactions as drawn from 43 149 pub- which has led to tremendous progress in the understand- lications that represent 30 model organisms. This ing of gene organization, genome evolution and the genetic interaction count represents a 50% increase com- basis for disease. At the same time, sequencing-based meth- pared to our previous 2013 BioGRID update. BioGRID ods have uncovered many intricacies of gene regulation at data are freely distributed through partner model or- a genomic scale, including expression patterns, alternative splicing, non-coding transcription and the myriad of regu- ganism databases and meta-databases and are di- latory factors that bind DNA and RNA (7–11). Proteomics rectly downloadable in a variety of formats. In addi- approaches, largely based on mass spectrometry, have sim- tion to general curation of the published literature ilarly mapped the abundance and post-translational mod- for the major model species, BioGRID undertakes ifications of proteins at impressive depth of coverage (12– themed curation projects in areas of particular rele- 15). At the phenotypic level, genome-wide reagent collec- vance for biomedical sciences, such as the ubiquitin- tions for systematic perturbation of gene function have led proteasome system and various human disease- to compendia of functional profiles for many different phe- associated interaction networks. BioGRID curation notypic characteristics (16–19). This wealth of new data has is coordinated through an Interaction Management been accrued in model organism systems, and particularly System (IMS) that facilitates the compilation inter- in humans, in both normal and disease contexts. In spite of action records through structured evidence codes, this data deluge, the fundamental problem of how genotype is translated into phenotype, and how genetic mutations can phenotype ontologies, and gene annotation. The Bi- affect this complex relationship, remains a formidable road- oGRID architecture has been improved in order to block in our understanding of fundamental biology and the support a broader range of interaction and post- basis for human disease. translational modification types, to allow the repre- It is now evident that genes and their encoded proteins sentation of more complex multi-gene/protein inter- function in the context of a vast, dynamic network of inter- actions, to account for cellular phenotypes through actions (20–23). The generation of comprehensive genetic structured ontologies, to expedite curation through and protein interaction maps will thus be essential for un- raveling the many complexities of biological processes and for understanding the general genotype to phenotype map-
*To whom correspondence should be addressed. Tel: +1 514 343 6668; Fax: +1 514 343 5839; Email: [email protected]