UCLA Electronic Theses and Dissertations
Title: Analyses of Next-Generation Sequencing Data to Identify Genes Associated with Complex Neuropsychiatric Disorders
Permalink: https://escholarship.org/uc/item/64r0d41z
Author: Rao, Aliz Raksi
Publication Date: 2017
Supplemental Material: https://escholarship.org/uc/item/64r0d41z#supplemental
Peer reviewed | Thesis/dissertation

UNIVERSITY OF CALIFORNIA
Los Angeles

Analyses of Next-Generation Sequencing Data to Identify Genes Associated with Complex Neuropsychiatric Disorders

A dissertation submitted in partial satisfaction of the requirements for the degree Doctor of Philosophy in Bioinformatics

by

Aliz Raksi Rao

2017

© Copyright by Aliz Raksi Rao 2017

ABSTRACT OF THE DISSERTATION

Analyses of Next-Generation Sequencing Data to Identify Genes Associated with Complex Neuropsychiatric Disorders

by

Aliz Raksi Rao
Doctor of Philosophy in Bioinformatics
University of California, Los Angeles, 2017
Professor Stanley F. Nelson, Chair

Over the past decade, decreases in the cost of DNA sequencing have allowed for a surge in the amount of data being generated. This has led to the discovery of genes causal for hundreds of Mendelian disorders and genes associated with many complex disorders. Given the opportunity to use sequencing data to tackle neuropsychiatric diseases with a complex genetic architecture, I take a data-first approach to study two diseases, bipolar disorder and autism spectrum disorder (ASD). Combining information gleaned from next-generation sequencing (NGS) data with the latest analytic methods sheds new light on the biology of these diseases. In the first part of this dissertation, I present a whole-exome analysis of nine affected individuals from four families in which bipolar disorder was transmitted over several generations, and six unrelated, affected individuals.
Our results demonstrate the genetic heterogeneity of bipolar disorder and provide support for a rare-variant oligogenic disease model.

In the second part, I present an approach to identify rare variants associated with autism spectrum disorder (ASD) in a whole-genome sequencing study of 71 individuals diagnosed with ASD and their family members. I demonstrate that by incorporating knowledge of population-wide variant frequencies into analyses of NGS data and taking an approach sensitive to complex family structures, as opposed to utilizing only case-control or trio data, one can identify patterns that would otherwise have been missed and thus gain novel insights into disease etiology.

Finally, I present a mutational burden dataset called SORVA (Significance Of Rare VAriants), which is useful in vetting candidate variants and genes from NGS studies. In effect, my studies of complex disorders using next-generation sequencing show that the field is constantly evolving, with new computational approaches enabling many advances in the study of psychiatric disorders and of ASD in particular.

The dissertation of Aliz Raksi Rao is approved.

Eleazar Eskin
Janet S. Sinsheimer
Rita M. Cantor
Stanley F. Nelson, Committee Chair

University of California, Los Angeles
2017

DEDICATION

To my father, whose curiosity and enthusiasm for science I wholeheartedly share. Thank you for showing me the path and supporting me throughout this journey of discovery.

To my spouse and partner, who has demonstrated incredible patience by supporting me along the way.
“Success is the sum of small efforts, repeated day in and day out.”

TABLE OF CONTENTS

Abstract of the Dissertation
Signature Page
Dedication
Table of Contents
List of Figures
List of Tables
List of Acronyms
Acknowledgements
Vita

1 Introduction
  1.1 Challenges of studying complex disorders via next-gen sequencing
  1.2 Current methods to analyze large NGS datasets
  1.3 The genetic basis of bipolar disorder
  1.4 The genetic basis of autism spectrum disorders
  1.5 Bibliography

2 Rare deleterious mutations are associated with disease in bipolar disorder families
  2.1 Abstract
  2.2 Introduction
  2.3 Materials and Methods
    2.3.1 Sample selection
    2.3.2 Exome sequencing and bioinformatics analysis
    2.3.3 Variant annotation, filtering and interpretation
    2.3.4 Statistical analysis
  2.4 Results
    2.4.1 Sample characteristics
    2.4.2 Identification of rare, damaging mutations
  2.5 Discussion
  2.6 Conclusions
  2.7 Supplementary Methods
    2.7.1 Exome capture and re-sequencing
    2.7.2 Quality control (QC)
    2.7.3 Sequence alignment and variant calling
    2.7.4 Variant annotation, filtering and interpretation
  2.8 Supplementary Tables
  2.9 Bibliography

3 Whole-genome sequencing of web-based recruited individuals with autism spectrum disorders reveals novel candidate genes
  3.1 Abstract
  3.2 Introduction
  3.3 Methods
    3.3.1 Sample recruitment
    3.3.2 Sequencing and bioinformatics pipeline
    3.3.3 Validation of de novo events
    3.3.4 Functional Enrichment and Network Analyses
  3.4 Results
  3.5 Discussion