The Functional Annotation of Mammalian Genomes: the Challenge of Phenotyping
Total Page:16
File Type:pdf, Size:1020Kb
ANRV394-GE43-14 ARI 2 October 2009 19:8 The Functional Annotation of Mammalian Genomes: The Challenge of Phenotyping Steve D.M. Brown,1 Wolfgang Wurst,2 Ralf Kuhn,¨ 2 and John M. Hancock1 1MRC Mammalian Genetics Unit, MRC Harwell, Harwell Science and Innovation Campus, Oxfordshire OX11 0RD, United Kingdom; email: [email protected], [email protected] 2Helmholtz Center Munich, German Research Center for Environmental Health, 85764 Munich, Germany; email: [email protected], [email protected] Annu. Rev. Genet. 2009. 43:305–33 Key Words First published online as a Review in Advance on mouse genetics, mutagenesis, phenotype, bioinformatics, ontologies August 18, 2009 The Annual Review of Genetics is online at Abstract genet.annualreviews.org The mouse is central to the goal of establishing a comprehensive func- This article’s doi: tional annotation of the mammalian genome that will help elucidate 10.1146/annurev-genet-102108-134143 various human disease genes and pathways. The mouse offers a unique Copyright c 2009 by Annual Reviews. combination of attributes, including an extensive genetic toolkit that Annu. Rev. Genet. 2009.43:305-333. Downloaded from www.annualreviews.org All rights reserved underpins the creation and analysis of models of human disease. An 0066-4197/09/1201-0305$20.00 international effort to generate mutations for every gene in the mouse genome is a first and essential step in this endeavor. However, the greater challenge will be the determination of the phenotype of every mu- tant. Large-scale phenotyping for genome-wide functional annotation Access provided by WIB6385 - GSF-Forschungszentrum fur Umwelt und Gesundheit on 02/17/16. For personal use only. presents numerous scientific, infrastructural, logistical, and informat- ics challenges. These include the use of standardized approaches to phenotyping procedures for the population of unified databases with comparable data sets. The ultimate goal is a comprehensive database of molecular interventions that allows us to create a framework for biolog- ical systems analysis in the mouse on which human biology and disease networks can be revealed. 305 ANRV394-GE43-14 ARI 2 October 2009 19:8 INTRODUCTION distinctive attributes that underline its role as The sequencing of the human and mouse a key model organism in functional genomic genomes has transformed the landscape of studies (42). First, an extensive genetic toolkit mammalian biology, providing a fundamen- has been developed that allows the generation tal underpinning to ongoing developments in of a wide variety of mutational lesions. This functional, population, and evolutionary ge- adaptability in generating a range of mutant al- nomics. The mouse and human genome se- leles underpins recent efforts to establish muta- quences have allowed us to determine a fairly tions for every gene in the mouse genome (38). accurate picture of the nature and content Ultimately, we can expect to produce a resource of protein-coding genes in these two mam- with multiple mutant alleles at every mouse lo- malian genomes (70, 89). These analyses rein- cus. Second, the mouse has been an important force the high levels of similarity between the experimental organism for over a century, re- two genomes—approximately 99% of mouse sulting in a depth and breadth of knowledge of genes have homologues in the human genome developmental, physiological, behavioral, and (25). Recent analyses have also revealed the ex- biochemical mechanisms that underpins con- traordinary breadth of noncoding transcripts in tinuing investigations as a model for human mammalian genomes (11, 29) and have under- biological processes. Third, given its relatively lined that a much larger fraction of the genome small size and short generation time, the mouse is transcribed than had heretofore been real- remains the most economically viable mam- ized. Despite the extensive sequence and tran- malian organism for large-scale functional ge- script annotation that has taken place, the extent nomic studies. This combination of features of our knowledge of the function of both coding means that the mouse is uniquely placed for and noncoding loci in the mammalian genome the development and analysis of models for is woeful. Studies of the role and function of the human disease, particularly for systems that various classes of noncoding transcripts are in are not possible or relevant to model in other their infancy. But even for protein-coding loci, organisms. we are some way from developing a compre- The first step in addressing the challenge of hensive picture of gene function. delivering a functional annotation of the mouse Undertaking a comprehensive functional genome has been to put in place programs to annotation of the mouse genome will be pivotal generate comprehensive mutant resources (4, to developing a systems biology of mammals— 5): A worldwide effort is now under way to gen- which in itself will be critical if we are to un- erate new libraries of mutant alleles for the bulk derstand how transformations in genetic net- of mouse genes. However, this effort, which works lead to disease. We need to establish should be substantially realized in the next five through model organisms, such as the mouse, years, is only the beginning. The biggest ob- Annu. Rev. Genet. 2009.43:305-333. Downloaded from www.annualreviews.org the functions and interactions of genetic loci, stacle to undertaking a comprehensive deter- carry out perturbations, by mutation or other mination of the function of genes remains the approaches, and investigate the phenotypic out- determination of phenotype. In this article, we come. The ultimate goal is to create a compre- review the complementary approaches that are hensive database of molecular interventions in critical for generating a rich diversity of alle- Access provided by WIB6385 - GSF-Forschungszentrum fur Umwelt und Gesundheit on 02/17/16. For personal use only. vivo. This database will be a central, though not les at mouse loci. However, we pay consider- Functional the only, element in developing a framework for able attention to the many and formidable chal- annotation: gaining biological systems in the mouse through which lenges that need to be met if these extensive mu- knowledge about the human biology and disease networks can be tant resources are to be characterized in a man- function of individual revealed. ner that will contribute to a profound knowl- genes to provide a The mouse is the organism of choice to un- edge of mammalian biological systems. There detailed picture of the functions of all genes dertake a comprehensive functional annotation need to be considerable developments in the of a mammalian genome. It has a number of science of phenotyping as well as significant 306 Brown et al. ANRV394-GE43-14 ARI 2 October 2009 19:8 advances in technologies, infrastructure, and lo- adult mice. Toavoid embryonic lethality and to gistics that underpin phenotyping. We review study gene function only in specific cell types, and discuss current progress in mouse pheno- Gu et al. (64) introduced a modified, condi- Gene targeting: typing as well as future areas for development, tional gene targeting scheme that allows re- the introduction of including the many critical informatics chal- searchers to restrict gene inactivation to specific predesigned, site- lenges that lie ahead. cell types or developmental stages. In a condi- specific modifications tional mutant, gene inactivation is achieved by into the genome of the insertion of two 34-base pair (bp) recogni- embryonic stem cells by homologous GENE-DRIVEN MUTAGENESIS tion (loxP) sites of the DNA recombinase Cre recombination into introns of the target gene such that recom- Gene Targeting in Embryonic FRT: FLP bination results in the deletion of loxP-flanked Stem Cells recombinase exons (113). Recombination and gene inactiva- recognition target Gene targeting allows the introduction of pre- tion are achieved by crossing the strain harbor- designed, site-specific modifications into the ing the conditional (loxP-flanked) allele with a genome of embryonic stem (ES) cells by ho- transgenic strain expressing Cre recombinase mologous recombination (28). It is extensively in one or several cell types. Target gene in- used for the preplanned disruption of genes in activation occurs in a spatially and temporally the murine germline resulting in mutant knock- restricted manner, according to the pattern of out mouse strains. Since the first demonstra- Cre expression. More than 100 Cre transgenic tion of homologous recombination in ES cells strains with tissue-specific recombinase expres- in 1987 (137), more than 4000 knockout mouse sion have been generated covering a large range strains have been generated. Gene inactivation of cell types (101). In addition, gene inacti- is achieved through the insertion of a selectable vation can also be induced in adult mice us- marker into an exon of the target gene or the ing Cre transgenes activated by small molecule replacement of one or more exons. The mu- inducers (16, 53, 86). The generation of con- tant allele is initially assembled in a specifically ditional alleles involves the same technology designed gene targeting vector such that the se- as the production of germline knockouts (87, lectable marker is flanked at both sides with ge- 139). For a conditional gene targeting vector, nomic segments derived from the target gene a selection marker and a loxP site are inserted that serve as homology regions to initiate ho- into one intron of the target gene while a sec- mologous recombination. Upon electropora- ond loxP sequence is placed into another intron tion of such a vector into ES cells and the selec- (Figure 1a). Upon homologous recombination tion of stable integrants, clones that underwent and germline transmission of the modified al- homologous recombination can be identified lele, the selection marker gene, flanked by FLP through the analysis of genomic DNA using recombinase recognition (FRT) sites, can be re- Annu.