Dissecting the Regulatory Roles and Cellular Functions of Mammalian Zscan5b and Primate-Specific Paralogs by Younguk Sun Dissert
Total Page:16
File Type:pdf, Size:1020Kb
DISSECTING THE REGULATORY ROLES AND CELLULAR FUNCTIONS OF MAMMALIAN ZSCAN5B AND PRIMATE-SPECIFIC PARALOGS BY YOUNGUK SUN DISSERTATION Submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Cell and Developmental Biology in the Graduate College of the University of Illinois at Urbana-Champaign, 2016 Urbana, Illinois Doctoral Committee: Professor Lisa J. Stubbs, Chair Associate Professor David H. Rivier Associate Professor Craig A. Mizzen Assistant Professor Eric C. Bolton Assistant Professor Rachel Smith-Bolton ABSTRACT Certain subfamilies of zinc finger transcription factors (ZNF-TFs), especially those including KRAB or SCAN protein-interaction domains, have evolved rapidly in mammals due to repeated rounds of gene duplication, deletion, and divergence. The functions of the encoded proteins, and the evolutionary role of their duplication and divergence, are not well understood. I have focused on mammalian ZNF subfamilies that are rooted by relatively unique, conserved members but that have duplicated to generate groups of divergent paralogous transcription factors in primate species. One such family is the ZSCAN5 subfamily, represented by a single, conserved founding member, ZSCAN5B, and clustered paralogs that arose early in primate history, and this clustered subfamily has been the subject of my thesis work. I have shown that three of the primate paralogs, ZSCAN5A, B, and D, are expressed in different tissues and cell types while ZSCAN5C is not detectably expressed in most tissues. Using chromatin immunoprecipitation (ChIP-seq) in human cell lines and mouse placenta tissue, I have identified mammalian ZSCAN5B as a protein that binds with high specificity to tRNA genes (tDNA) and other polymerase III (Pol III) transcripts including several types of transposable elements (TEs). Primate-specific paralogs ZSCAN5A and ZSCAN5D also bind tDNAs, although ZSCAN5D preferentially recognizes a subset of TE-derived “extra-TFIIIC” or ETC sites. I also used siRNA knockdown followed by transcriptome sequencing (RNA-seq) to identify biological functions of the dominantly expressed paralog ZSCAN5A and the ancestral ZSCAN5B gene. In addition to tDNAs and other Pol III transcripts, I show that ZSCAN5A and ZSCAN5B gene knockdown also dysregulates many Pol II genes including some that are located near the DNA binding regions. Significant overlap between the differentially expressed gene (DEG) sets detected after ZSCAN5A or ZSCAN5B knockdown suggested that the two proteins may act cooperatively through their similar SCAN domains. The DEG sets thus suggested similar biological functions for ZSCAN5A and ZSCAN5B, but also highlighted some potentially distinct activities. In particular, based on these data I hypothesize that ZSCAN5B primarily regulates ribosome biogenesis and tRNA processing, whereas ZSCAN5A has an especially important role in ii mitotic progression. Consistent with this latter function, I show that ZSCAN5A as well as its human paralogs are cell-cycle regulated with peak expression around the time of mitosis. To test these hypotheses about gene function, I generated a series of resources including both human cell lines and transgenic mice engineered to overexpress or knock down ZSCAN5 genes specifically. Focusing on the predicted role of ZSCAN5A in cell cycle progression, I showed that ZSCAN5A knockdown dysregulates cell cycle progression in cultured human cells, leading to an accumulation of cells in mitotic phase and the appearance of aneuploid cells. These data suggest that, either through a cyto- architectural function, regulation of Pol III transcription, position effects on Pol II genes, or a combination of these roles, ZSCAN5A has an essential role in maintenance of chromosome integrity in human cells. These data open up many new avenues for future exploration. Together these data define unexpected functions for a conserved mammalian gene and two paralogs found only in primate genomes. The human paralogs have evolved from the basic tDNA-binding function of the conserved founder gene, ZSCAN5B, to bind more widely to Pol III transcripts and other TFIIIC binding sites and affect the transcription of a wider selection of nearby genes. Data presented here indicate that despite being duplicated only in primates, ZSCAN5A has been integrated into the control of a very ancient process, namely chromosome segregation during mitosis, and plays an important role in chromosome integrity. The new tools and resources I have developed provide essential tools needed to explore those functions in further depth. iii ACKNOWLEDGEMENTS Words can only say so much how much I appreciate all the support and encouragement from my advisor, Dr. Lisa Stubbs, my colleagues and friends, and my family. Firstly, this project would have not happened and done without Dr. Lisa Stubbs and thesis committee members’ enthusiastic support and untiring guidance throughout the completion of my degree. I am also very lucky to have my friend, Dr. Chase Bolt, and grateful for the countless scientific discussions and moral support over the countless pints of beers. Dr. Elbert Branscomb’s inspirational input, Christopher Seward and Joe Troy’s proficient bioinformatics analysis, Huimin Zhang and Jaeeun Yoo’s remarkable experimental assistances have greatly fueled the completion of this study. I am much indebted to my parents, Yongseung Sun and Kyungran Baik, and my sister, Youngeun Sun, for their incessant and motivating encouragement to keep me moving forward. Many thanks are due to my closest and loving friends – Jiyoung Jeong, Dongwook Kim, Jihye Nam – and colleagues, Dr. Patricia Annie Weisner, Li-Hsin Chang, Erin Borchardt, and Seeon Lee. Lastly, I would like to express my sincere gratitude to my undergraduate advisor, Dr. Suk-Tae Kwon, for having faith in me and gave me an opportunity to start my career in the field of molecular biology. iv TABLE OF CONTENTS CHAPTER 1: INTRODUCTION ......................................................................................1 CHAPTER 2: ZSCAN5B AND ITS PRIMATE-SPECIFIC PARALOGS BIND RNA POLYMERASE III GENES AND EXTRA-TFIIIC (ETC) SITES IN MAMMALIAN CELLS ..............................................................................................................................29 CHAPTER 3: A ROLE FOR MAMMALIAN SCAN-CONTAINING ZINC FINGER TRANSCRIPTION FACTOR ZSCAN5A IN CELL CYCLE PROGRESSION .............78 CHAPTER 4: TOOLS AND RESOURCES FOR ANALYSIS OF BIOLOGICAL FUNCTIONS OF MOUSE Zscan5b .............................................................................103 CHAPTER 5: CONCLUSIONS AND SUMMARY ......................................................118 APPENDIX: SUPPLEMENTAL TABLES AND FIGURES .......................................120 v CHAPTER 1: INTRODUCTION Evolution of Regulatory Networks While it is still controversial and vague to define a true meaning of ‘life’, scientifically and traditionally, ‘life’ is specifically distinguished by the capacity that it can grow, metabolize, respond to stimuli, adapt, and reproduce. This reproductive capacity elicits numerous generations of offspring, and each generation offers a new opportunity for evolution of form and function. From simple single-cell bacteria to highly complex vertebrate mammals, every known living organism on the planet earth follows a simple rule known as the “Central dogma of molecular biology”(Crick, 1970; Crick, 1958). Each organism carries its own genetic material that is transcribed and further translated into useful peptides to sustain its viability, and replicated to produce additional progeny. Amongst these important processes, an accurate and sophisticated transcription of each gene is essential for the successful development and survival of the living organism (Jacob and Monod, 1961). This fine control is called “transcriptional regulation”. The transcriptional regulation system is principally composed of cis- and trans- regulatory elements (Davidson 2006). Cis-regulatory elements are typically non-coding DNA segments that contain binding sites for transcription factors (TFs) and other regulatory proteins that are required to activate and sustain transcription; the well- characterized examples of these cis-elements are enhancers and promoters (Blackwood and Kadonaga, 1998; Pennacchio et al., 2013; Smale and Kadonaga, 2003). Conversely, trans-regulatory factors include proteins, RNAs, and other diffusible molecules that affect gene expression, primarily by interacting with the cis DNA sites (Gilad et al., 2008). The basic biochemical function of a TF is firstly, to recognize and bind a short, specific string of nucleotides (called a motif) within cis-regulatory regions, and secondly, to recruit or bind other proteins relevant to transcriptional regulation; these include other 1 TFs, chromatin remodeling complex components, and the general RNA polymerase machinery (Karin, 1990; Latchman, 1997). Thus, the interaction between a TF and a specific cis-regulatory element may endow positive or negative changes to the basal expression of a target gene (Lee and Young, 2000; Nikolov and Burley, 1997; Roeder, 1996). Because these systems orchestrate the specification of a particular cell type or structure, heritable changes to regulatory networks result in the evolution of animal morphology and phenotypic variations across organisms (Carroll et al., 2013; Rubinstein and de Souza, 2013). Throughout the course of evolution, as biological systems have become more complex, the amount of information devoted to