Klein Washington 0250E 18537.Pdf (2.932Mb)
Total Page:16
File Type:pdf, Size:1020Kb
© Copyright 2018 Jason C. Klein Massively parallel characterization of enhancers in evolution and disease Jason C. Klein A dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy University of Washington 2018 Reading Committee: Jay Shendure, Chair James Thomas R. David Hawkins Program Authorized to Offer Degree: Genome Sciences University of Washington Abstract Massively parallel characterization of enhancers in evolution and disease Jason C. Klein Chair of the Supervisory Committee: Professor Jay Shendure, MD, PhD Genome Sciences On average, protein-coding sequence is over 99.9% identical between humans, yet some individuals develop disease while others do not. Similarly, protein-coding sequences are over 99% identical between human and chimpanzees despite large phenotypic differences. Our genome makes several different phenotypes using the same set of instructions (genes) by regulating the relative timing and expression level of genes throughout development. One way in which our genome does this is through enhancers. Enhances were first discovered in 1981 as pieces of DNA that are able to increase gene expression independent of their relative position and orientation to the transcriptional start site (TSS). Despite enhancers being implicated in a variety of evolutionary phenotypes and disease, the field still lacks a comprehensive understanding of how they function. Massively parallel reporter assays (MPRAs) have served as an indispensable tool to screen short pieces of DNA for enhancer activity. MPRAs test the ability of thousands to hundreds of thousands of sequences to increase expression of a reporter gene on a plasmid in a single experiment. However, the assay relies on testing short pieces of DNA (<200bp), which may contribute to the assay’s relatively low sensitivity. In chapter two of this dissertation, I discuss a protocol I developed to assemble libraries of short (<230bp) DNA fragments into large (up to 700bp) sequences. This protocol has allowed our group and others to screen large synthetic sequences for enhancer activity as well as for protein folding for the first time. In chapters three and four, I apply MPRAs to characterize regulatory changes over primate evolution and identify mutations within enhancers that may contribute to a patient’s risk for Osteoarthritis. Enhancers have previously been suggested to play a unique role in evolution and speciation due to their decreased pleiotropy and penetrance compared to protein-coding changes. In chapter three, I synthesized and screened present day and ancestral orthologs for 348 liver enhancers in order to characterize each enhancer’s evolutionary-functional trajectory throughout 40 million years of primate evolution. We identified groups of enhancers with similar trajectories, most of which could be explained by one or two mutational events. We identified and quantified the correlation between sequence and functional divergence of naturally evolving sequence and implicate cytosine deamination within CpGs as a potential driver in enhancer evolution. In chapter four, I applied the same assay to identify common polymorphisms that are associated with Osteoarthritis, which alter enhancer activity. From 35 lead GWAS variants, I identified 1605 SNPs associated with Osteoarthritis in European populations. I then synthesized 196bp around the major and minor alleles for each SNP, and screened each sequence for enhancer activity. We confidently measured enhancer activity of both alleles for 753 SNPs, and found six that drove differential enhancer activity at an FDR of 5%. Our lead SNP increases expression of an isoform of HBP1 (a negative regulator of the Wnt-beta-catenin pathway) with an alternative TSS in an osteosarcoma cell line, CRISPR-edited chondrosarcoma cell line, and cartilage from knee Osteoarthritis patients. Chapter five discusses ongoing and future projects to systematically compare different enhancer assays to improve sensitivity and specificity of enhancer screens. It also suggests future applications of these assays to better understand the role of gene regulation in evolution and disease. TABLE OF CONTENTS List of Figures ................................................................................................................................. v List of Tables ................................................................................................................................. vi Chapter 1. Introduction ................................................................................................................... 1 1.1 The discovery and characterization of enhancers ........................................................... 1 1.2 Traditional Methods to Screen for Enhancers ................................................................ 3 1.3 Biochemical Screens for Enhancers ................................................................................ 4 1.4 Massively Parallel Reporter Assays................................................................................ 6 1.5 CRISPR-based screens for Enhancers ............................................................................ 7 1.6 Enhancers in Evolution ................................................................................................. 16 1.7 Enhancers in Disease .................................................................................................... 17 1.8 Topics in this dissertation ............................................................................................. 19 Chapter 2. Multiplex Pairwise Assembly of array-derived DNA oligonucleotides ................ Error! Bookmark not defined. 2.1 Abstract ......................................................................................................................... 20 2.2 Introduction ................................................................................................................... 21 2.3 Results ........................................................................................................................... 23 2.3.1 Assembling targets in sets of 131-250 ...................................................................... 23 2.3.2 Multiplex assembly of 2,271 pairs of fragments ...................................................... 25 2.3.3 Error correction of assembled targets ....................................................................... 26 2.3.4 MPA using longer, more accurate oligos .................................................................. 27 2.3.5 Hierarchical MPA ..................................................................................................... 28 i 2.4 Discussion ..................................................................................................................... 29 2.5 Methods......................................................................................................................... 32 2.5.1 Target designs ........................................................................................................... 32 2.5.2 Pairwise oligonucleotide assembly ........................................................................... 34 2.5.3 Hierarchical pairwise oligonucleotide assembly ...................................................... 35 2.5.4 In silico design of static tag library ........................................................................... 35 2.5.5 Design and synthesis of Dial-Out retrieval primers .................................................. 36 2.5.6 Static tag library synthesis and preparation .............................................................. 37 2.5.7 Tagging of assembled targets.................................................................................... 37 2.5.8 Sequence verification of Dial-Out tagged targets ..................................................... 38 2.5.9 Dial-Out Retrieval ..................................................................................................... 38 2.5.10 Analysis of average nucleotide accuracy .................................................................. 38 Chapter 3. Functional Characterization of Enhancer Evolution in the Primate Lineage .............. 52 3.1 Abstract ......................................................................................................................... 52 3.2 Introduction ................................................................................................................... 53 3.3 Results ........................................................................................................................... 55 3.3.1 Identification of candidate hominoid-specific enhancers ......................................... 55 3.3.2 Computationally predictinv the activity of ancestral and orthologous sequences .... 56 3.3.3 Functional characterization of ancestral and orthologous sequences ....................... 58 3.3.4 Evolutionary-functional trajectories for hundreds of enhancer tiles across the primate phylogeny ................................................................................................................ 61 3.3.5 Characterizing molecular mechanisms for enhancer modulation ............................. 64 ii 3.4 Discussion ..................................................................................................................... 66 3.5 Methods........................................................................................................................