Next Generation Mendelian Genetics

Sarah Ng

A dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy

University of Washington
2012

Reading Committee:
Jay Shendure, Chair
Michael Bamshad
Deborah Nickerson

Program Authorized to Offer Degree: Genome Sciences

University of Washington

Abstract

Next Generation Mendelian Genetics

Sarah B Ng

Chair of the Supervisory Committee:
Associate Professor Jay Shendure
Department of Genome Sciences

The study of Mendelian disorders has been of immense utility in uncovering the genetic and molecular basis of numerous human traits, and has greatly furthered the identification of genes, the annotation of gene function and our understanding of biological pathways and cellular processes. Over the last 30 years, linkage analysis has been the most successful approach for finding the genes underlying Mendelian disorders, contributing to the identification of over 1,500 genes. However, thousands of disorders remain unsolved. Here I present a new paradigm to efficiently identify the genetic basis of Mendelian disorders, based on the direct observation of potentially causative mutations throughout the genome of affected individuals. This is enabled by massively parallel, or “next-generation”, sequencing, which has made it increasingly feasible to generate large amounts of sequencing data at low cost. Although whole human genomes can now be sequenced, it is more cost-effective to focus on specific regions of interest – for example, all the protein-coding regions (the “exome”), in which the majority of known Mendelian disease mutations are found. In this dissertation, I first describe a method for the efficient enrichment and sequencing of the human exome.
I then validate this method by sequencing and describing the genetic variation in twelve human exomes – eight HapMap samples and four samples with Freeman-Sheldon syndrome (FSS) – and, in a proof-of-concept experiment, show how the variation uncovered in exome data from the individuals affected with FSS can be filtered to identify the known causal gene. Next, I present the first successful applications of this approach to disorders of unknown genetic basis: a) Miller syndrome, a recessive disorder with only 40 described cases, and b) Kabuki syndrome, a dominant disorder where the majority of affected individuals are sporadic cases with no familial transmission. The development of exome sequencing and these filtering methods for exome data represents a new paradigm by which Mendelian disorders can be studied and new genes associated with disease can be found, and, as sequencing becomes ubiquitous, is likely to become a standard tool for the elucidation of the molecular basis of disease.

TABLE OF CONTENTS

List of Figures ... iii
List of Tables ... iv

Chapter 1: Introduction ... 1
  1.1 What is a Mendelian disorder? ... 1
  1.2 Finding “disease genes” ... 3
  1.3 “Next-generation” sequencing ... 6
  1.4 Target enrichment strategies ... 7
  1.5 The “exome” ... 8
  1.6 Interpretation of variant function ... 9
  1.7 Dissertation Aims ... 9

Chapter 2: Targeted capture and sequencing of 12 human exomes ... 11
  2.1 Abstract ... 12
  2.2 Exome sequencing of twelve individuals ... 12
  2.3 Identification of causal mutations underlying a dominant Mendelian disorder ... 21
  2.4 Conclusion ... 24
  2.5 Materials and methods ... 25
  2.6 Acknowledgements ... 32

Chapter 3: Exome sequencing identifies the cause of a Mendelian disorder ... 33
  3.1 Abstract ... 34
  3.2 Introduction ... 34
  3.3 Results ... 37
  3.4 Discussion ... 43
  3.5 Conclusion ... 48
  3.6 Materials and methods ... 49
  3.7 Acknowledgements ... 50

Chapter 4: Exome sequencing identifies MLL2 mutations as a cause of Kabuki syndrome ... 51
  4.1 Abstract ... 52
  4.2 Introduction ... 52
  4.3 Results ... 53
  4.4 Discussion ... 59
  4.5 Conclusion ... 60
  4.6 Materials and Methods ... 61
  4.7 Acknowledgements ... 63

Chapter 5: Massively parallel sequencing and rare disease ... 65
  5.1 Early Successes ... 65
    5.1.1 Molecular diagnosis of mutations in known genes ... 65
    5.1.2 Clinical diagnosis based on sequence data ... 66
    5.1.3 Novel disease gene discovery ... 67
    5.1.4 Filtering based on function ... 69
    5.1.5 Ranking variants by effect and conservation ... 70
    5.1.6 Filtering for rare variants ... 70
    5.1.7 Searching for de novo mutations ... 72
  5.2 Current Limitations ... 73
    5.2.1 Limited sequencing scope ... 73
    5.2.2 Spurious gene identifications ... 74
    5.2.3 Missing variant calls ... 74
  5.3 Future directions ... 75
  5.4 Conclusions ... 77

References
