Evolution, Structure and Assembly of Basal Transcription Factor II D
Total Page:16
File Type:pdf, Size:1020Kb
Evolution, structure and assembly of basal Transcription Factor II D Simona V. Antonova The work described in this thesis was partially funded by the Bijvoet NWO Graduate Programme (grant 022.004.019). Paranimfen Elisa Bloemberg Ragna Senf Printed by Print: Ridderprint | www.ridderprint.nl ISBN: 978-94-6416-229-5 Cover Adaptation of ―Transcription factors‖ by Kelvinsong, distributed under a CC-BY 3.0 license Evolution, structure and assembly of basal Transcription Factor II D Evolutie, structuur en assemblage van basale transcriptiefactor II D (met een samenvatting in het Nederlands) Proefschrift ter verkrijging van de graad van doctor aan de Universiteit Utrecht op gezag van de rector magnificus, prof.dr. H.R.B.M. Kummeling, ingevolge het besluit van het college voor promoties in het openbaar te verdedigen op donderdag 22 oktober 2020 des ochtends te 11.00 uur door Simona Vladimirova Antonova geboren op 5 augustus 1989 te Sofia Promotoren: Prof. dr. H.T.M. Timmers Prof. dr. A.J.R. Heck Content CHAPTER 1 11 General introduction 11 1.1. Chromatin 13 1.2. The histone code 15 1.3. Chromatin dynamics 16 1.4. Promoter architecture 20 1.5. Transcription initiation and transcription cycle 23 1.6. TFIID 30 1.7. Sharing with SAGA 45 1.8. Concluding remarks and outline 48 CHAPTER 2 49 Epigenetics and transcription regulation during eukaryotic diversification: the saga of TFIID 49 CHAPTER 3 87 Chaperonin CCT checkpoint function in basal transcription factor TFIID assembly 87 CHAPTER 4 121 Proteomic analysis of human TFIID in different cellular compartments 121 CHAPTER 5 153 TAF1L integrates into a holo-TFIID complex with canonical composition 153 CHAPTER 6 171 General discussion 171 6.1. Cellular distribution and assembly of TFIID 173 6.2. Sharing the TAFs 180 6.3. Final remarks and conclusions 181 APPENDICES 183 Bibliography 184 Summary 199 Nederlandse sammenvatting 200 Dankwoord 200 List of publications 204 Curriculum vitae 205 Voorzitter van de promotieplechtigheid / Chairperson of the defense ceremony: Prof. dr. H.M. Verkooijen Promotiecommissie / Doctoral committee members: Prof. dr. P. J. Coffer (beoordelingscommissie / assesment committee member) Prof. dr. F.C.P. Holstege (beoordelingscommissie / assesment committee member) Prof. dr. W.L. de Laat (beoordelingscommissie / assesment committee member) Dr. S.M. Lemeer Prof. dr. T.K. Sixma Prof. dr. B. Snel (beoordelingscommissie / assesment committee member) Prof. dr. L. Tora Prof. dr. C.P. Verrijzer (beoordelingscommissie / assesment committee member) List of abbreviations Ac acetylation Acetyl Co-A acetyl coenzyme-A AML acute myeloid leukemia ARP actin-related protein ATAC ada2a-containing B2M β2 microglobulin BDP1 B double prime 1 BP(S) base pair(s) BrD bromodomain BRE B recognition element BRED B recognition element (downstream) BREU B recognition element (upstream) BRF1 B-related factor 1 BTAF1 B-TFIID TATA-box binding protein associated factor 1 BTF basal transcription factor CBP CREB-binding protein CCD conserved domain database CCT chaperonin containing TCP1 CDK cyclin-dependent kinase CHD chromodomain ChEC-Seq chromatin endogenous cleavage and sequencing CHX cycloheximide Co-IP co-immunoprecipitation COMPASS complex proteins associated with Set1 Co-REST co-repressor of RE1 silencing transcription factor Cryo-EM cryogenic electron microscopy CTD C-terminal domain CUT cryptic unstable transcripts DBD DNA-binding domain DCE downstream core element DNA deoxyribonucleic acid Dox doxycycline DPE downstream promoter element DSIF DRB sensitivity inducing factor DTT dithiothreitol DUB deubiquitinating (domain/protein) DUF3591 domain of unknown function 3591 EER4 enhanced ethylene response 4 ERAP-1 endoplasmic reticulum aminopeptidase 1 ESC embryonic stem cell FACT facilitates chromatin transcription FAD flavin adenine dinucleotide FDR false discovery rate FECA first eukaryotic common ancestor FRT Flp recombinase target (site) GFP green fluorescent protein GNAT GCN5-related N-acetyltransferases GSTF gene specific transcription factor GTF general transcription factor H1 histone H2A histone 2A H2B histone 2B H3 histone 3 H4 histone 4 HAT histone acetyl transferase HDAC histone deacetylase HDMT histone demethyltransferase HEAT huntingtin, elongation factor 3, protein phosphatase 2A, TOR1 HF histone fold HFD histone fold domain HGT horizontal gene transfer HMT histone methyltransferase HRP horseradish peroxidase iBAQ intenstity-based absolute quantification Inr initiator IP immunoprecipitation iPSC induced pluripotent stem cells iTOL interactive tree of life kDA kilo Dalton KDM lysine demethylase KMT lysine methyltransferase KO knock-out LECA last eukaryotic common ancestor LFQ label-free quantification MDA mega Dalton MBD methyl-CpG-binding domain Me methylation MLL mixed lineage leukemia MOZ monocytic leukemia zinc-finger mRNA messenger RNA MBT malignant brain tumour (domain) MTE motif ten element MYST MOZ-Ybf2/Sas3-Sas2/Tip60 (family) NAD nicotinamide adenine dinucleotide NC2 negative cofactor 2 NELF negative elongation factor NLS nuclear localization signal NTD N-terminal domain NuRD nucleosome remodeling and deacetylase PCAF p300/CREB-binding protein-associated factor PCR polymerase chain reaction PDB protein data bank PHD plant homeodomain PIC pre-initiation complex Pol I RNA polymerase I Pol II RNA polymerase II Pol III RNA polymerase III P-TEFB positive transcription elongation factor b PTM post-translational modification qMS quantitative mass spectrometry RNA ribonucleic acid RNAPI RNA polymerase I RNAPII RNA polymerase II RNAPIII RNA polymerase III rRNA ribosomal RNA SAGA Spt-Ada-Gcn5 acetyltransferase SCP super core promoter SEC super elongation complex SET su(var)3-9, enhancer of zeste, trithorax SF splicing factor SL1 selective factor 1 SUMO small ubiquitin-like modifier SUT stable unannotated transcripts SWI/SNF SWItch/sucrose non-fermentable) TAF TBP-association factor TAND TAF1 N-terminal domain TBP TATA-binding protein TBPL TBP-like protein TF transcription factor TFB transcription factor B TFIIA transcription factor II A TFIIB transcription factor II B TFIID transcription factor II D TFIIE transcription factor II E TFIIF transcription factor II F TFIIH transcription factor II H TLF TBP-like factor TRF TBP-related factor TRIC tailless complex polypeptide 1 ring complex tRNA transfer RNA TSS transcription start site UAS upstream activation sequence Ub ubiquitin WCE whole cell extract WGD whole genome duplication WH winged helix WT wild type ZnK zinc knuckle General introduction 1 Chapter 1 General introduction 11 Chapter 1 1 Chromatin ~ Epigenetics ~ Transcription “…the transfer of information from nucleic acid to nucleic acid, or from nucleic acid to protein may be possible, but transfer from protein to protein, or from protein to nucleic acid is impossible. Information means here the precise determination of sequence, either of bases in the nucleic acid or of amino acid residues in the protein.” (The Central Dogma, Symp. Soc. Exp. Biol. 1958, vol 12, pp 138-163) Over half a century ago and shortly after the discovery of the deoxyribonucleic acid (DNA) double- helical structure, the process of transcribing DNA-encoded information to RNA messenger molecules and subsequently to a synthesized functional protein was described as the central dogma of molecular biology. This, together with the early 1960‘s genetic work of Jacob and Monod as well as the discovery of messenger RNA (mRNA) molecule and the subsequent RNA polymerase II purification were among the main fundamental research achievements of the last century, which marked the beginning of central fields of molecular biology, such as transcriptional regulation and epigenetics. Since then our understanding of the mechanisms and structures behind the immense complexity of eukaryotic transcription has expanded exponentially, revealing a chaotic world of tightly regulated processes. This thesis aims to further elucidate our scientific perception of eukaryotic transcription on the evolutionary and structural example of a protein complex central to this process, transcription factor II D (TFIID). For this purpose, the general introduction of the manuscript will first illustrate the nuclear protein network tasked with the execution and maintenance of transcriptional activity in the human cell. Next, I will delve into the process of transcriptional initiation with emphasis on the structural and functional role of TFIID in it. 12 General introduction 1.1. Chromatin 1 The DNA is a nucleotide-based biopolymer made of sugar-phosphate backbone and four nitrogen-containing nucleobases – guanine (G), cytosine (C), thymidine (T) and adenine (A). Its well-characterized double helical shape is attained through complementary A-T and C-G base pairing of the two strands in regular right-handed twists (1-5). Such coiling causing slight distortion in the direct opposition of the two strands and, thus, forms unequally sized opposing distances between each ~10 bps DNA turn, termed minor and major grooves (Fig. 1A) (6). These structures provide the first layer of regulation in the decoding of the genetic information through differential protein-binding properties (7). In humans, the entirety of this information is contained within approximately 3.2x109 base pairs (bps) of DNA, which makes up to a meter in stretched length (8). As most cells in the human body are somatic and, therefore, diploid, up to 2 meters of genetic material has to be packed within the nuclear dimensions of about 0,5x10-9 m3 (9). An analogy can be made with fitting 200