Protein-Protein and Protein-DNA Interactions of the Human C2H2 Zinc Finger Proteins
by Ernest Radovani

A thesis submitted in conformity with the requirements for the degree of Doctor of Philosophy
Graduate Department of Molecular Genetics
University of Toronto

© Copyright 2019 by Ernest Radovani

Abstract

TFs (transcription factors) bind DNA in a sequence-specific manner and regulate transcription. Humans encode ~1600 TFs, which have been identified almost entirely on the basis of a putative DNA-binding domain. Knowledge of the mechanisms by which they regulate transcription is sparse. TFs sometimes contain effector domains, but often utilize uncharacterized regions to recruit various cofactors, including chromatin modifiers, components of the general transcription machinery, and other TFs. This thesis focuses on describing the interactome of C2H2-ZNFs (C2H2 zinc finger proteins), which have been classified as TFs due to their ability to bind DNA and regulate transcription in some studied cases. With ~750 proteins, C2H2-ZNFs have greatly expanded in mammals and are the largest and least well characterized subfamily of DNA-binding proteins in humans. Through AP/MS (affinity purification and mass spectrometry), we have identified PPIs (protein-protein interactions) for 345 C2H2-ZNFs and, through ChIP-seq (chromatin immunoprecipitation followed by next-generation sequencing), we have identified genomic binding sites for 217 C2H2-ZNFs. From the AP/MS data it appears that, overall, the C2H2-ZNFs exhibit a diverse set of interactions that may give them the ability to perform dual functions in transcription activation and repression, suggesting that they may act as TFs.
However, they also exhibit PPIs suggestive of roles additional to what the conventional definition of a TF entails, such as in AS (alternative splicing), which provides evidence that they are multifunctional proteins. Furthermore, C2H2-ZNFs are enriched for binding to AS exons and 3'-ends of genes, providing additional evidence for involvement in co-transcriptional processes. Strikingly, from the AP/MS data, C2H2-ZNFs extensively associate with each other and, based on their DNA-binding patterns, I show that interacting pairs may co-bind DNA, since they bind in closer proximity to each other in the genome compared to pairs that were not found to interact. Lastly, by integrating the PPI data with ChIP-seq, ChIA-PET (chromatin interaction analysis by paired-end tag sequencing), and HiC data, I show that C2H2-ZNF interacting pairs may mediate long range DNA interactions and thus organize chromatin architecture. Altogether, this work provides a snapshot of the C2H2-ZNF interactome and its potential to regulate gene expression.
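The proximity comparison mentioned in the abstract (interacting pairs binding closer together than non-interacting pairs) can be illustrated with a minimal sketch. This is not the thesis's actual pipeline: the function name `nearest_distances`, the representation of peaks as per-chromosome lists of midpoints, and the toy coordinates are all assumptions made for illustration only.

```python
from bisect import bisect_left
from statistics import median


def nearest_distances(peaks_a, peaks_b):
    """For each peak midpoint of factor A, return the distance (bp) to the
    nearest peak midpoint of factor B on the same chromosome.

    Both arguments are hypothetical dicts: chromosome name -> list of midpoints.
    """
    distances = []
    for chrom, positions_a in peaks_a.items():
        positions_b = sorted(peaks_b.get(chrom, []))
        if not positions_b:
            continue  # factor B has no peaks on this chromosome
        for pos in positions_a:
            i = bisect_left(positions_b, pos)
            # The nearest B peak is either just left or just right of pos.
            candidates = positions_b[max(i - 1, 0):i + 1]
            distances.append(min(abs(pos - q) for q in candidates))
    return distances


# Toy data (invented): A and B stand in for an interacting pair,
# A and C for a pair not found to interact.
znf_a = {"chr1": [1000, 5000], "chr2": [300]}
znf_b = {"chr1": [1100, 9000], "chr2": [250]}
znf_c = {"chr1": [40000], "chr2": [90000]}

median_interacting = median(nearest_distances(znf_a, znf_b))
median_non_interacting = median(nearest_distances(znf_a, znf_c))
print(median_interacting, median_non_interacting)
```

In a real analysis the two distance distributions would be compared across many pairs with an appropriate statistical test; the medians here only show the shape of the comparison.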
Table of Contents

Chapter 1 -- Introduction
  1.1 Overview of the regulation of gene expression
  1.2 Compaction of the genome by histones
    1.2.1 Histones
    1.2.2 The histone code
    1.2.3 Epigenetic states
  1.3 The Transcription cycle of RNA Polymerase II
  1.4 Chromatin architecture and gene expression
    1.4.1 Topologically Associated Domains
    1.4.2 Long range interactions and transcription
    1.4.3 Long range interactions involving exons
    1.4.4 Long range interactions involving 3'-ends of genes
    1.4.5 Nuclear compartmentalization
  1.5 Transcription factors and mechanisms by which they regulate transcription
    1.5.1 Overview of human transcription factors
    1.5.2 Transcription regulation through PPIs of TFs with chromatin-related proteins
    1.5.3 Transcription regulation through PPIs of TFs with general transcription factors
    1.5.4 Transcription regulation through PPIs of TFs with other transcription factors
  1.6 C2H2-ZNF transcription factors
    1.6.1 Organization of DNA-binding domains
    1.6.2 Auxiliary domains associated with human C2H2-ZNFs
    1.6.3 RNA-binding by C2H2-ZNFs
    1.6.4 RNA-DNA hybrid binding
    1.6.5 Transcriptional regulation of endogenous retroelements
    1.6.6 Transcriptional regulation of protein coding genes
    1.6.7 Regulation of alternative splicing
    1.6.8 Regulation of 3'-end formation
    1.6.9 Roles in genome organization
  1.7 Summary and thesis outline

Chapter 2 -- Protein-protein interactions of the C2H2-ZNF proteins
  2.1 Introduction
  2.2 Results
    2.2.1 Optimization of the AP/MS method
    2.2.2 Overview of analyzed C2H2-ZNF proteins
    2.2.3 Overview of the SAINT analysis of the PPI data
    2.2.4 Diversity of PPIs for C2H2-ZNFs
    2.2.5 Frequently occurring prey proteins in the C2H2-ZNF PPI network
    2.2.6 Annotation of the preys according to function
    2.2.7 Interaction of C2H2-ZNFs with metabolic proteins
    2.2.8 Interaction of C2H2-ZNFs with RNA-related proteins
    2.2.9 Interaction of C2H2-ZNFs with transcription factors
    2.2.10 Interaction of C2H2-ZNFs with transcription related effector proteins
    2.2.11 Interaction of C2H2-ZNFs with histones and proteins involved in post-translational modifications and DNA replication/repair
    2.2.12 Interaction of C2H2-ZNFs with virus related proteins
    2.2.13 Roles of prey proteins in transcription activation and repression
    2.2.14 Sub-cellular localization of prey proteins in the network
  2.3 Discussion
  2.4 Chapter 2 supplementary figures

Chapter 3 -- DNA-binding landscape of C2H2-ZNF proteins across the human genome
  3.1 Introduction
  3.2 Results
    3.2.1 ChIP-seq workflow for C2H2-ZNFs
    3.2.2 Comparison of C2H2-ZNF ChIP-seq experiments to previously published datasets
    3.2.3 Distribution of C2H2-ZNF peaks in open and closed chromatin
    3.2.4 Motif analysis
    3.2.5 Overlap of C2H2-ZNF binding sites across the genome
    3.2.6 Potential cooperative DNA-binding between C2H2-ZNFs
    3.2.7 Binding of C2H2-ZNFs at functional elements
    3.2.8 C2H2-ZNFs and R-loops
  3.3 Discussion
  3.4 Chapter 3 supplementary figures

Chapter 4 -- C2H2-ZNF proteins and organization of chromatin architecture
  4.1 Introduction
  4.2 Results
    4.2.1 RNA Polymerase II ChIA-PET
    4.2.2 Overlap of C2H2-ZNF PPIs with ChIA-PET data
    4.2.3 Overlap of C2H2-ZNF PPIs with HiC data
    4.2.4 Overlap of C2H2-ZNF binding sites at LRIs with mutations in cancer cells
  4.3 Discussion
  4.4 Chapter 4 supplementary figures

Chapter 5 -- Discussion, conclusions, and future directions
  5.1 C2H2-ZNFs: not just TFs, but Jacks of all trades?
  5.2 C2H2-ZNFs and the regulation of chromatin architecture
  5.3 C2H2-ZNFs and RNA
  5.4 Summary and conclusions

Chapter 6 -- Methods
  6.1 Generation of stable cell lines using the Flp-IN TREx system to express inducible GFP-tagged C2H2-ZNFs
  6.2 Generation of stable cell lines using the MAPLE system to express VA-tagged C2H2-ZNFs
  6.3 AP/MS workflows
    6.3.1 Workflow 1 (Two step purification)
    6.3.2 Workflow 2 (One step FLAG purification)
    6.3.3 Workflow 3 (One-step GFP purification)
    6.3.4 Data accession for optimization of AP/MS experiments
  6.4 SAINT analysis
  6.5 Overlap of binding sites