
D840–D847 Nucleic Acids Research, 2016, Vol. 44, Database issue
Published online 17 November 2015
doi: 10.1093/nar/gkv1211

Mouse genome database 2016

Carol J. Bult*, Janan T. Eppig, Judith A. Blake, James A. Kadin, Joel E. Richardson and the Mouse Genome Database Group

The Jackson Laboratory, 600 Main Street, Bar Harbor, ME 04609, USA

Received September 29, 2015; Accepted October 23, 2015

*To whom correspondence should be addressed. Tel: +1 207 288 6324; Fax: +1 207 288 6830; Email: [email protected]

© The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact [email protected]

ABSTRACT

The Mouse Genome Database (MGD; http://www.informatics.jax.org) is the primary community model organism database for the laboratory mouse and serves as the source for key biological reference data related to mouse genes, gene functions, phenotypes and disease models with a strong emphasis on the relationship of these data to human biology and disease. As the cost of genome-scale sequencing continues to decrease and new technologies for genome editing become widely adopted, the laboratory mouse is more important than ever as a model system for understanding the biological significance of human genetic variation and for advancing the basic research needed to support the emergence of genome-guided precision medicine. Recent enhancements to MGD include new graphical summaries of biological annotations for mouse genes, support for mobile access to the database, tools to support the annotation and analysis of sets of genes, and expanded support for comparative biology through the expansion of homology data.

INTRODUCTION

The laboratory mouse is widely recognized as the premier animal model for investigating genetic and cellular systems relevant to human biology and disease. A large arsenal of experimental genetic tools is available for mouse, including unique inbred strains, a complete reference genome, deep sequencing data for 17 additional inbred lines (1), extensive genome variation maps (e.g. SNPs) and technologies for directly and specifically manipulating genomes (2,3). An international effort to generate targeted mutations in all protein-coding genes in mouse, begun in 2007 (4), is virtually complete (5), and the phenotyping phase to functionally characterize these genes is underway (6). New resources including the Collaborative Cross (7,8) and Diversity Outbred mice (9,10) are beginning to bear fruit in analysis of complex traits and multigenic diseases (11–13). In the arena of human genetics and genomics, exome sequencing and the trend toward lower cost of whole genome sequencing are changing the ways biological systems are interrogated. The mouse is essential for providing comparative functional analysis and for annotating rapidly emerging human genomes.

The Mouse Genome Database (MGD) is the primary community resource for integrated genetic, genomic, functional and phenotypic information supporting the link between mouse models and human phenotypes and disease. MGD's semantic and contextual integration of genome-scale data with well-defined functional and phenotypic data is critical for understanding and interpreting similarities and differences between mouse and human biology. This knowledge is essential for effective translational applications and for the development of new disease models, as well as for generating insights that foster hypotheses about mechanisms of biological processes.

MGD is the central component of several coordinated database projects that are part of the Mouse Genome Informatics (MGI) consortium (http://www.informatics.jax.org). Other MGI data resources include the Gene Expression Database (GXD) (14), the Mouse Tumor Biology Database (MTB) (15), the Gene Ontology project (GO) (16), MouseMine (17) and the MouseCyc database of biochemical pathways (18). These resources are tightly integrated and can be accessed as a single data resource, the MGI database. Taken together, these resources provide a combination of data breadth, depth, integration and quality that exists nowhere else for mouse (Table 1).

Table 1. Summary of MGD content (September 2015)

Content category                                          September 2015
Number of genes and genome features                       54 879
Number of genes with nucleotide sequence data             46 206
Number of mouse genes with protein sequence data          25 059
Number of mouse genes with human orthologs                17 101
Number of mouse genes with rat orthologs                  18 553
Number of protein coding genes with GO annotations        24 189
Total number of GO annotations                            304 660
Number of mutant alleles in mice                          44 602
Number of QTL                                             5054
Number of genotypes with phenotype annotation (MP)        55 491
Total number of MP annotations                            285 251
Number of human diseases with one or more mouse models    1387
Number of references in the MGD bibliography              218 487

NEW FEATURES AND IMPROVEMENTS

In this report, we describe new features and user interface enhancements to MGD, including new graphical summaries of biological annotations for mouse genes, release of a smartphone app, tools to support the annotation and analysis of sets of genes and expanded support for comparative biology through the expansion of homology data.

What does this gene do?

One of the primary interface improvement projects recently implemented in MGI, to be publicly released on 22 October 2015, allows users to get a high-level visual overview of biological annotations available for a given gene (Figure 1A). Each major biological annotation type in MGI (phenotype, function, expression) is represented by a grid labeled with 14–27 grouping terms. If there are one or more annotations related to a grouping term, the cell in the grid is filled in; mouse-clicking on a colored cell launches a new web browser window with annotation details (Figure 1B).

In addition to providing visual summaries of gene annotations, we also implemented enhancements to how literature references are represented on MGI gene detail pages (Figure 2). Users can still review a list of all references associated with a gene or genome feature, but the references associated with phenotype, function and developmental annotations are now clearly identified. Reference sets can further be filtered by author, journal, year, and data type and downloaded if desired. The reference filter functions also allow users to view only peer-reviewed publications for a gene and not references that cite data sources and internal curation processes.

Mobile app

The MGI mobile app for iOS, MGI GenomeCompass, was released to the Apple Store in 2015 (Figure 3). The application lets users create a favorites list of mouse genes, disease terms and/or phenotype terms. When new information about the items in a favorites list is added to the MGI database, the user receives notifications of these updates when they launch the app. Updates can be sorted by item name or the date when the new information was added to MGI. The update summaries are hyperlinked to the MGI website. For each item in their favorites list, users can record notes and add customized labels to search for and group the features and terms they are tracking by user-defined concepts.

Working with sets of genes

To support users who want to search for information for sets of genes, we have implemented new functionality that allows search results to be downloaded or forwarded to the MGI Batch Query tool or to MouseMine (Figure 4). Forwarding a set of genes to MouseMine (17) automatically [...] save gene lists and perform set operations among multiple lists (union, intersection, etc.). Forwarding to the Batch Query tool in MGI allows users to custom annotate a set of genes (Figure 4C). Options include annotation with associated human disease terms from OMIM, Mammalian Phenotype Ontology terms, Gene Ontology terms, developmental gene expression information from GXD, etc. Result sets from these analysis tools can be exported as text or Excel files.

Expanded orthology/homology representation

At the core of MGD's support for comparative biology are the representations of orthology and homology of mouse genes to genes in other vertebrates, including non-mammalian vertebrates such as zebrafish (Danio rerio) and chicken (Gallus gallus). Assertions of orthology are used as evidence of function for mouse genes based on experimentally determined knowledge in other organisms (19,20). We draw primarily on external sources for homology assertions including Homologene (21,22) and the HGNC Comparison of Orthology Predictions (HCOP) (23). Although more than 90% of protein-coding genes in mouse have a 1:1 orthology relationship with a gene in human or rat, we also represent many-to-many 'orthology' relationships. For example, based on current genome annotations, there is one human SERPINA1 gene with five mouse homologs, presumably due to gene duplication in the mouse lineage. When available, MGD Gene Detail pages provide links to HCOP and HGNC (HUGO Gene Nomenclature Committee) in the Vertebrate Homology and Human Homologs sections, respectively. Nomenclature searches throughout MGI return appropriate HGNC homologs.

In addition to orthology assertions, we currently represent, for each mouse gene, a link to Ensembl gene trees (24) and to Protein Information Resource Super Family (PIRSF) gene family sets (25). In the future we will include Panther gene families (http://www.pantherdb.org/genes/) (26), which will expand our ability to use orthology and homology rule-based algorithms to generate functional (i.e.