D836–D842 Nucleic Acids Research, 2018, Vol. 46, Database issue Published online 30 October 2017 doi: 10.1093/nar/gkx1006 Mouse Genome Database (MGD)-2018: knowledgebase for the laboratory mouse Cynthia L. Smith*, Judith A. Blake, James A. Kadin, Joel E. Richardson, Carol J. Bult and the Mouse Genome Database Group
The Jackson Laboratory, 600 Main Street, Bar Harbor, ME 04609, USA
Received September 22, 2017; Revised October 11, 2017; Editorial Decision October 12, 2017; Accepted October 19, 2017
ABSTRACT nomic data for the laboratory mouse. To this end, MGD serves as an authoritative resource for the catalog of mouse The Mouse Genome Database (MGD; http://www. genes and genome features connecting reference genomic informatics.jax.org) is the key community mouse sequence information to mouse biological data, including database which supports basic, translational and (i) molecular function, biological process and cellular loca- computational research by providing integrated data tion information encoded using the Gene Ontology (GO); on the genetics, genomics, and biology of the lab- (ii) a comprehensive listing of mouse mutations, variants oratory mouse. MGD serves as the source for bio- and human disease models, with mutant genotypes an- logical reference data sets related to mouse genes, notated to Mammalian Phenotype (MP) terms and Dis- gene functions, phenotypes and disease models with ease Ontology (DO) terms and (iii) provide authoritative an increasing emphasis on the association of these nomenclature and identifiers for mouse gene names, sym- data to human biology and disease. We report here bols, alleles and strains as the recognized primary commu- nity resource (Table 1). Standardization of mouse nomen- on recent enhancements to this resource, including clature and annotation of data with commonly used biolog- improved access to mouse disease model and hu- ical ontologies and standardized vocabularies ensure that man phenotype data and enhanced relationships of data are consistently annotated, making precise data min- mouse models to human disease. ing possible. MGD is a core component of the Mouse Genome In- INTRODUCTION formatics (MGI) consortium (http://www.informatics.jax. org). Other database resources coordinated through the The laboratory mouse is used extensively as a model for MGI consortium include the Gene Expression Database investigating the etiopathogenesis of human disease. The (GXD) (10), the Mouse Tumor Biology Database (MTB) relevance of laboratory mouse to biomedical research in- (11), the Gene Ontology project (GO) (12), MouseMine cludes its extensive experimental genetics capabilities, fully- (13), the International Mouse Strain Resource (IMSR) (14) sequenced inbred strain genomes, published genotype to and the CrePortal database of recombinase expressing mice phenotype associations, and data for genome wide cover- (www.CrePortal.org, unpublished). Data and information age of induced variation from mouse large-scale mutagen- for these resources are obtained through a combination esis programs (1–4) and reviewed in (5). Resources such as of expert curation of the biomedical literature and by au- the Collaborative Cross (6) and Diversity Outbred projects tomated or semi-automatic processing of data sets down- (7) capture most of the genetic variation present in labora- loaded from more than fifty other data resources. Metrics tory mouse strains and serve as unique platforms for stud- of current MGD content is shown in Table 2. ies of human relevant quantitative traits and complex in- In this report, we highlight several improvements to the herited syndromes. Furthermore, the International Mouse capture, annotation, integration and presentation of data Phenotyping Consortium (IMPC) is generating a catalogue associated with mouse models of human disease. These of gene function through systematic generation and phe- include the incorporation of the Disease Ontology (DO), notyping of a genome-wide collection of traditional gene new disease detail pages and ontology browser with list- knockout (KO) and CRIPSR mutant lines in the mouse (8). ings of associated genes and mouse models, improved on- The Mouse Genome Database (MGD) is the primary tology vocabulary browsers and incorporation of a Human community knowledgebase for mouse phenotype and gene Phenotype Ontology (HPO) browser. We added human dis- function and mouse models of human disease (9). MGD’s ease to phenotype relationships from Orphanet, which ex- goal is to advance understanding of human biology and pands our number of human diseases annotated to HPO disease by facilitating access to integrated genetics and ge-
*To whom correspondence should be addressed. Tel: +1 207 288 6663; Fax: +1 207 288 6830; Email: [email protected]