Global Measurements of Human Transcription Factor Occupancy
Global measurements of human transcription factor occupancy: Insights into development and genome evolution

Andrew Ben Stergachis

A dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy

University of Washington
2013

Reading Committee:
John A. Stamatoyannopoulos
Michael J. MacCoss
James H. Thomas

Program Authorized to Offer Degree: Genome Sciences

©Copyright 2013 Andrew Ben Stergachis

University of Washington

Abstract

Global measurements of human transcription factor occupancy: Insights into development and genome evolution

Andrew Ben Stergachis

Chair of the Supervisory Committee: Associate Professor John A. Stamatoyannopoulos, Department of Genome Sciences

Transcription factors (TFs) are a class of proteins that interact with the genome, dictating which parts of the genome are utilized by a given cell. Despite the central role TFs play in regulating the genome, technologies available to date have not permitted global measurement of TF occupancy within the cell. Consequently, our understanding of how TFs model cellular development and genome evolution has been limited. To address this, I focused my graduate studies on the development of genomic and proteomic tools that facilitate global measurements of TF occupancy within a cell, including the development of: (i) targeted proteomic assays to quantify TF protein abundances within the nucleus; (ii) a protein-centric method to map TF protein occupancy in functionally distinct chromatin microenvironments; (iii) a genome-wide DNaseI footprinting method that enables construction of global maps of TF occupancy along the genome and core human regulatory networks; and (iv) a method for mapping the in vivo affinity of a TF genome-wide. Using a combination of these techniques, I have identified the sequence and chromatin features that direct TF occupancy within a cell and have characterized the mechanisms underlying TF occupancy dynamics during development and oncogenesis.
These global TF occupancy maps have revealed novel insights into how TF occupancy shapes the evolution of both non-coding and coding genomic sequence. Specifically, it appears that TF affinity, not occupancy, plays the dominant role in shaping the evolution of TF binding elements. In addition, coding TF binding elements significantly contribute to protein evolution and codon usage biases in mammals. Together, these results depict the sequence, chromatin, developmental and evolutionary forces that shape TF occupancy within a cell, revealing unforeseen evolutionary constraints on the human genome.

Table of Contents

CHAPTER 1 – INTRODUCTION
    1.1 – Transcription factors (TFs) and their role in gene regulation
    1.2 – Methodology for quantifying TF occupancy
    1.3 – Dissertation aims
    1.4 – Organization of thesis

CHAPTER 2 – DEVELOPMENT OF TARGETED PROTEOMIC ASSAYS FOR HUMAN TRANSCRIPTION FACTORS
    2.1 – ABSTRACT
    2.2 – INTRODUCTION
    2.3 – RESULTS
    2.4 – FIGURES
        Figure 2.M1. Development of targeted proteomics assays using enriched in vitro–synthesized full-length proteins.
        Figure 2.M2. Targeted assays can be efficiently developed using in vitro–synthesized proteins and applied to measure proteins in vivo.
        Figure 2.S1. Poor coverage of target transcription factor proteins in NIST database
        Figure 2.S2. In vitro-synthesized proteins are enriched full-length proteins
        Figure 2.S3. Calibration curve for the schistosomal GST peptide IEAIPQIDK
        Figure 2.S4. Histogram of dot-products for quality score 1 and 2 peptides
        Figure 2.S5. Spearman correlation of our empirical peptide ranking with ESPPredictor rankings
        Figure 2.S6. Peptide MS/MS spectrum counts are a poor predictor of targeted peptide signal intensity using selected reaction monitoring-mass spectrometry
        Figure 2.S7. Reagent cost for generating in vitro-synthesized proteins from plasmids
    2.5 – METHODS

CHAPTER 3 – PROTEIN-CENTRIC MAPPING OF NUCLEAR TRANSCRIPTION FACTOR OCCUPANCY
    3.1 – ABSTRACT
    3.2 – INTRODUCTION
    3.3 – RESULTS
        Segregation of TFs into biochemically defined chromatin compartments
        Patterning of TF occupancy across specific chromatin niches
        Many TFs are poorly solubilized by widely-used extraction methods
        TFs can occupy both euchromatic and heterochromatic niches
        Euchromatic occupancy by a linker histone (H1.x)
        Restriction of individual TF isoforms to distinct chromatin niches
    3.4 – DISCUSSION
    3.5 – FIGURES
        Figure 3.M1. Segregation of transcription factors into discrete chromatin compartments
        Figure 3.M2. Patterned partitioning of TFs across nuclear microenvironments
        Figure 3.M3. TFs and linker histones occupy both euchromatic and heterochromatic niches
        Figure 3.M4. Transcription factor isoforms guide chromatin niche occupancy
        Figure 3.M5. Assortment of human transcription factor into chromatin niches
        Figure 3.S1. Relative protein abundance between HepG2 and K562 nuclei
        Figure 3.S2. Fine-scale mapping of transcription factor chromatin compartments
        Figure 3.S3. Digestion patterns of MNase soluble and stable chromatin
        Figure 3.S4. Segregation of transcription factors into discrete chromatin compartments from MNase treated nuclei
        Figure 3.S5. NKRF and CTCF isoforms occupy distinct chromatin niches
        Figure 3.S6. PML, ZNF512 and HIC2 isoforms occupy distinct chromatin niches, as measured by SRM
    3.6 – METHODS

CHAPTER 4 – GENOME-WIDE DNASEI FOOTPRINTS ENCODE GLOBAL MAPS OF TRANSCRIPTION FACTOR OCCUPANCY
    4.1 – ABSTRACT
    4.2 – INTRODUCTION
    4.3 – RESULTS
        Regulatory DNA is populated with DNase I footprints
        Footprints are quantitative markers of factor occupancy
        Footprints harbour functional SNVs and lack methylation
        Transcription factor structure is imprinted on the genome