Thesis

Structural and functional studies of poly(U) polymerases

MUNOZ TELLO, Paola Andréa

Abstract

We studied the structural and functional aspects of two poly(U) polymerases (PUPs) in order to better understand how these enzymes work: the Cid1 PUP from Schizosaccharomyces pombe and RET1 from Trypanosoma brucei. These proteins are involved in the polyuridylation of several types of RNAs, such as mRNAs and gRNAs. Using X-ray crystallography, we determined the atomic structure of Cid1 bound to its substrate UTP, to a non-hydrolysable form of UTP and to a pseudo-product, ApU, which would correspond to the first step in the uridylation of a polyadenylated mRNA. Recently, we obtained crystals containing a functional fragment of the RET1 protein, and solving the structure is the next step. Our studies provide an atomic view of UTP selectivity and of RNA recognition by the enzymes involved in polyuridylation, detailing the molecular mechanisms of this family of enzymes and their complexes.

Reference

MUNOZ TELLO, Paola Andréa. Structural and functional studies of poly(U) polymerases. Doctoral thesis: Univ. Genève, 2014, no. Sc. 4716

URN : urn:nbn:ch:unige-408396 DOI : 10.13097/archive-ouverte/unige:40839

Available at: http://archive-ouverte.unige.ch/unige:40839


UNIVERSITY OF GENEVA
FACULTY OF SCIENCE
Department of Molecular Biology
Professor Stéphane Thore

Structural and Functional Studies of Poly(U) Polymerases

THESIS

Presented to the Faculty of Science of the University of Geneva to obtain the degree of Doctor of Science (Docteur ès Sciences), specialization in Biology

by

Paola MUNOZ TELLO
from Cali (COLOMBIA)

THESIS No. 4716

GENEVA
Atelier d’impression Repromail
2014

Acknowledgements

First of all, I would like to thank the members of the jury, Professor Cecilia Arraiano, Professor Andrzej Dziembowski and Professor Thomas Schalch, for agreeing to evaluate my thesis research.

I would like to express my most sincere gratitude and appreciation to Professor Stéphane Thore for accepting me as a PhD candidate in his group and for his guidance, support, patience, enthusiasm and advice throughout my time in the lab.

I am grateful to all the members of the Thore lab who have passed through over these years for the good atmosphere, advice and support, with special thanks to Lionel, Fabi and Caroline, without whom the PhD in the lab would not have been the same. Lionel, you have been a great support and made my stay a pleasant and memorable one. Fabi, we have had our ups and downs, but you made me grow as a person and as a scientist, and I really appreciate that. Caroline, our discussions about science and life are memorable, and thank you for teaching me that radioactivity is not as scary as it seems, although its danger should not be underestimated. Thank you for all the laughs, the patience, the cries, the really good advice and the helpful discussions we have had during all these years.

I would also like to thank all the members of my PhD TAC committee, Professor Thanos Halazonetis, Professor Robbie Loewith and Professor Angela Kramer, for the helpful discussions and advice.

I would like to extend my special thanks to the past and present members of the Department, the Faculty of Biology of the University of Geneva and the PhD International Program for sharing information, equipment, consumables and advice throughout my PhD; it has been a pleasure to be able to exchange ideas with all of you.

Finally, I would like to thank all my family and friends from Cali, Toulouse and all around the world, without whose love, affection and encouragement this work would not have been possible; you have been an invaluable support to me during this time. The last paragraph of the acknowledgements, originally written in Spanish, is addressed especially to my family.

This is the most special and important acknowledgement, because the road to get here was not easy for our family. I am immensely grateful to my parents Gloria María Tello and José Omar Muñoz, my brother Andrés Muñoz, my aunt Elizabeth Tello, and God for giving me the opportunity to live my dreams. The truth is that without you I could never have come so far, nor endured so many years away from the family (on the other side of the pond…) in search of a better future. I know it was a sacrifice for all of us that is only now bearing fruit. And not only them but my whole family, because they always believed in me and in how far I could go as a scientist. Thank you for all the rosaries, prayers and little candles lit for me, asking God for the best for my future. All these years were full of effort, on your part as much as on mine, and they were more than worth it, because thanks to that effort I have arrived where I am now, with my feet up and the highest academic recognition: a doctorate.

Mom, I did not study Medicine as you wanted, but I ended up being a Doctor anyway.

I love and adore you all very much!

Table of Contents

ACKNOWLEDGEMENTS

TABLE OF CONTENTS

1 ABSTRACT
1.1 FRENCH VERSION
1.2 ENGLISH VERSION

2 INTRODUCTION
2.1 RNA LIFE
2.1.1 TYPES OF RNAS
2.2 RNA PROCESSING AND MODIFICATION
2.3 3’-END ADDITION OF NON-TEMPLATED NUCLEOTIDES
2.3.1 POLYADENYLATION
2.3.2 POLYURIDYLATION
2.4 THE NON-CANONICAL RIBONUCLEOTIDYL TRANSFERASES FAMILY
2.5 SUBSTRATES OF POLYURIDYLATION IN THE DIFFERENT CELL COMPARTMENTS
2.5.1 POLYURIDYLATION IN THE NUCLEUS
2.5.2 POLYURIDYLATION IN THE ORGANELLES (MITOCHONDRIA)
2.5.3 POLYURIDYLATION IN THE CYTOPLASM
2.6 KNOWN FUNCTIONS OF POLYURIDYLATION
2.6.1 IN THE NUCLEUS
2.6.2 IN THE ORGANELLES
2.6.3 IN THE CYTOPLASM

3 GOALS OF THE THESIS WORK
3.1 PROJECT I: FUNCTIONAL AND STRUCTURAL STUDIES OF CID1 POLY(U) POLYMERASE FROM S. POMBE
3.2 PROJECT II: STRUCTURAL STUDIES OF MITOCHONDRIAL RNA PROCESSING COMPLEXES IN T. BRUCEI

4 RESULTS
4.1 FUNCTIONAL AND STRUCTURAL STUDIES OF CID1 POLY(U) POLYMERASE FROM S. POMBE
4.1.1 CID1 OVEREXPRESSION AND PURIFICATION
4.1.2 CID1 CRYSTALLIZATION
4.1.3 DATA COLLECTION AND PHASING
4.1.4 PUBLICATION I: “FUNCTIONAL IMPLICATIONS FROM THE CID1 POLY(U) POLYMERASE CRYSTAL STRUCTURE”
4.1.5 CID1 SUBSTRATE/RNA COMPLEX CRYSTALLIZATION
4.1.6 PUBLICATION II: “A CRITICAL SWITCH IN THE ENZYMATIC PROPERTIES OF CID1 PROTEIN DECIPHERED FROM ITS PRODUCT-BOUND CRYSTAL STRUCTURE”
4.1.7 MATERIALS AND METHODS
4.2 STRUCTURAL STUDIES OF MITOCHONDRIAL RNA PROCESSING COMPLEXES IN T. BRUCEI
4.2.1 GRNA MATURATION COMPLEXES OR RET1 TUTASE DSS1 EXONUCLEASE COMPLEX (RDS COMPLEX)
4.2.2 POLYADENYLATION/URIDYLATION COMPLEX
4.2.3 MATERIALS AND METHODS

5 GENERAL DISCUSSION
5.1 STRUCTURAL DETERMINANTS FOR CID1 POLY(U) POLYMERASE UTP SELECTIVITY AND RNA RECOGNITION
5.2 RDS COMPLEX STRUCTURAL AND FUNCTIONAL STUDIES: LIGHT INTO A TRYPANOSOMAL TUTASE, RET1

6 ABBREVIATIONS

7 REFERENCES

Abstract

1.1 French Version

Ribonucleic acid, better known as RNA, is a very versatile and unstable molecule that can undergo rapid and spontaneous degradation. To avoid, for example, premature degradation, RNAs are subjected to important post-transcriptional modifications. In fact, the vast majority of RNA-related processes, such as export to the cytoplasm or efficient translation, require such modifications (Muthukrishnan et al. 1975; Munroe and Jacobson 1990; Gallie 1991; Caponigro and Parker 1995; Meyer et al. 2004; Alexandrov et al. 2006). RNA degradation is an important mechanism in the regulation of gene expression that eliminates, for example, erroneously transcribed messenger RNAs (mRNAs) as well as mRNAs that the cell no longer needs. In eukaryotes, RNA degradation begins with deadenylation, followed by removal of the 5’ cap and finally digestion of the messenger by several exonucleases (Decker and Parker 1993). Recently, the major influence of 3’ uridylation as a regulatory step in certain RNA degradation pathways has led researchers to focus on the action of poly(U) polymerases (PUPs), the enzymes responsible for this modification. Uridylation of mitochondrial RNAs in trypanosomes by terminal uridyl transferases (TUTases), a group of PUPs, is critical for the biogenesis of the guide RNAs (gRNAs) required for RNA editing, as well as for the proper translation of mRNAs, which requires the addition of long poly(A) and poly(U) tails. These numerous observations underline the importance of polyuridylation in these species. In eukaryotes, polyuridylation of different types of RNAs in the different cellular compartments is extremely important for the regulation of gene expression, which shows how widespread this phenomenon is (Blum et al. 1990; Blum and Simpson 1990; Li et al. 2005; Yu et al. 2005; Horwich et al. 2007; Mullen and Marzluff 2008; Kurth and Mochizuki 2009; Rissland and Norbury 2009; Ameres et al. 2010; Aphasizheva and Aphasizhev 2010; Kamminga et al. 2010; Schmidt et al. 2011). The Cid1 protein is the first PUP member discovered in higher eukaryotes and shows a preference for UTP over ATP both in vitro and in vivo, a phenomenon that is currently not fully characterized (Rissland and Norbury 2009).

During the period covered by this thesis, we studied the structural and functional aspects of two PUPs in order to better understand how these enzymes work: the Cid1 PUP from Schizosaccharomyces pombe and RET1 (RNA editing TUTase 1) from Trypanosoma brucei. Both proteins are involved in the polyuridylation of several types of RNAs, such as mRNAs and gRNAs.

Using X-ray crystallography, we determined the atomic structure of Cid1 bound to its substrate UTP, to a non-hydrolysable form of UTP and to a pseudo-product, ApU, which would correspond to the first step in the uridylation of a polyadenylated mRNA. Using mutational analyses, we propose a catalytic cycle based on the atomic model of the enzyme, in which local and global movements are required for translocation and accommodation of the RNA substrate within the enzyme. Our studies reveal the molecular mechanisms underlying the UTP selectivity of Cid1 by identifying the key residues involved. In addition, we identified the molecular basis for the selection of an RNA (rather than a DNA) substrate as well as some of the RNA-binding properties of Cid1, features that are essential for microRNA (miRNA) biogenesis, the regulation of histone mRNAs and, more generally, the regulation of RNA degradation.

My second project focuses on biochemical and structural analyses of a new complex identified in the mitochondrion of Trypanosoma brucei, composed of the TUTase RET1, the exonuclease DSS1 and three other proteins with no discernible domain, named RDS1, RDS2 and RDS3. This complex is involved in the biogenesis and polyuridylation of gRNAs. Several expression and purification strategies for sub-complexes were tested in order to obtain the pure complexes required for X-ray structural studies. Recently, we obtained crystals containing a functional fragment of the RET1 protein and were able to collect diffraction data to 3.5 Å resolution. Solving the structure is the next step.

Our studies provide an atomic view of UTP selectivity and of RNA recognition by the enzymes involved in polyuridylation, detailing the molecular mechanisms of this family of enzymes and their complexes.

1.2 English Version

Ribonucleic acid, most commonly known as RNA, is a very labile molecule prone to rapid and spontaneous decay. To avoid premature degradation, RNA molecules undergo several post-transcriptional modifications, which are also required for proper RNA function, such as export to the cytoplasm or efficient translation (Muthukrishnan et al. 1975; Munroe and Jacobson 1990; Gallie 1991; Caponigro and Parker 1995; Meyer et al. 2004; Alexandrov et al. 2006). RNA degradation is an important mechanism for the regulation of gene expression, leading to the disposal of erroneously transcribed messenger RNAs (mRNAs) or of mRNAs that are no longer required by the cell. In eukaryotes, the general mRNA degradation pathway begins with poly(A) tail removal, followed by decapping and finally mRNA digestion by exonucleases (Decker and Parker 1993). In recent years, the major influence of 3'-end uridylation as a regulatory step within several RNA degradation pathways has drawn attention to poly(U) polymerase (PUP) enzymes. Uridylation in trypanosomal mitochondria by the RET1 and RET2 terminal uridyl transferases (TUTases) is important for guide RNA (gRNA) biogenesis (necessary for the RNA editing process) and for the proper translation of mRNAs via the addition of long poly(A)/(U) tails. These two examples highlight the importance of this modification in these species. In higher eukaryotes, polyuridylation of several types of RNAs in different cell compartments is crucial for the regulation of gene expression, supporting the idea of a widespread phenomenon critical for RNA metabolism (Blum et al. 1990; Blum and Simpson 1990; Li et al. 2005; Yu et al. 2005; Horwich et al. 2007; Mullen and Marzluff 2008; Kurth and Mochizuki 2009; Rissland and Norbury 2009; Ameres et al. 2010; Aphasizheva and Aphasizhev 2010; Kamminga et al. 2010; Schmidt et al. 2011). The protein Cid1 is the first member of the PUP enzymes found in higher eukaryotes and shows a preference for UTP over ATP both in vivo and in vitro, a feature that is not fully understood (Rissland and Norbury 2009).

During the course of this thesis, we carried out functional and structural studies of two PUP enzymes to reveal novel aspects of poly(U) polymerases: the cytoplasmic Cid1 PUP from Schizosaccharomyces pombe and the mitochondrial RET1 from Trypanosoma brucei. Both proteins are involved in the polyuridylation of coding and non-coding RNAs.

We have determined the crystal structures of the Cid1 protein bound to its substrate UTP, to a non-hydrolysable form of UTP and to its pseudo-product ApU. Supplemented by point mutations, our atomic models are used to propose a catalytic cycle in which local and global movements of the enzyme are needed for RNA accommodation and translocation. Our study provides molecular insights into the UTP selectivity of Cid1 by identifying residues critical for UTP recognition. Furthermore, we highlight the molecular basis for the selectivity for RNA (rather than DNA) substrates as well as the RNA-binding properties of Cid1, features with critical implications for miRNAs, histone mRNAs and, more generally, cellular RNA degradation.

As a second project, we focused on biochemical and structural analyses of a newly identified complex in trypanosomal mitochondria composed of the RET1 TUTase, the DSS1 exonuclease and the proteins RDS1, RDS2 and RDS3, whose functions are currently unknown. This complex is involved in guide RNA biogenesis and polyuridylation. Several sub-complex combinations were expressed and purified with the aim of crystallization and structural studies. Recently, we crystallized a functional fragment of the RET1 protein and collected diffraction data to 3.5 Å resolution.

Overall, our studies provide atomic details of the properties of polyuridylating enzymes, such as UTP selectivity and RNA recognition, bringing new molecular insights into the function of this family of enzymes.

Introduction

2.1 RNA Life

The three essential molecules for known terrestrial life are deoxyribonucleic acid (DNA), ribonucleic acid (RNA) and proteins. DNA is the most stable molecule in a cell; each of its nucleotides is composed of one of four bases, i.e. adenine (A), guanine (G), cytosine (C) or thymine (T), a sugar ring (deoxyribose) and a phosphate moiety (Fig. 1). DNA stores the genetic information and passes it from one generation to the next. In 1953, James Watson and Francis Crick published the first model of the DNA double helix structure, leading to a revolution in genetics and biology (Watson and Crick 1953; Watson and Crick 1953). The DNA of an organism can be huge and is tightly packed inside the nucleus, although DNA size and packing architecture vary between organisms. In order to pass the genetic information to the translation machinery in the cytoplasm, the DNA has to be transcribed into RNA. In other words, the RNA molecule is the link between the DNA sequence and the corresponding amino acid sequence of the protein polymer, the essential player in the catalytic and structural functions of a cell. In contrast to DNA, an RNA molecule is present in several copies in the cell and is mainly a single-stranded nucleic acid whose nucleotides contain one of four major bases, A, C, G or uracil (U), a ribose ring and a phosphate moiety (Fig. 1). RNA molecules can also form double-stranded structures through self-complementary sequences, which can be crucial for the stability and regulation of this molecule (Buratti and Baralle 2004; Cooper et al. 2009; Cruz and Westhof 2009; Sharp 2009).

RNA is produced by the RNA polymerases during a process known as transcription. Eukaryotic transcription is initiated by these enzymes together with the necessary transcription elongation factors. The DNA template is read by the RNA polymerase, which simultaneously synthesizes an RNA copy of the transcription unit (Hahn 2004). When the end of the transcription unit is reached, the RNA polymerase falls off the DNA template and the newly formed strand of RNA is released (Richard and Manley 2009). In eukaryotes, there are three RNA polymerases, named RNA polymerase I, II and III. Each of them is responsible for the transcription of a specific set of RNAs and recognizes a specific set of promoters. They were distinguished by their sensitivity to α-amanitin (Roeder and Rutter 1969; Kedinger et al. 1970). RNA polymerase I is insensitive to this drug and localizes in the nucleolus, whereas RNA polymerases II and III are found in the nucleoplasm; RNA polymerase II is strongly inhibited by α-amanitin, while RNA polymerase III is inhibited only at high concentrations.
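As a minimal sketch of the template-directed copying described above, the following Python snippet pairs each base of a DNA template strand with the RNA base incorporated opposite it. The function name `transcribe` and the convention of writing the template 3'→5' (so that the RNA reads 5'→3') are assumptions made for illustration only; promoters, termination and the polymerases themselves are not modelled.

```python
# Watson-Crick pairing between a DNA template base and the RNA base
# added opposite it (A->U, C->G, G->C, T->A).
TEMPLATE_TO_RNA = str.maketrans("ACGT", "UGCA")


def transcribe(template_3_to_5: str) -> str:
    """Return the RNA (written 5'->3') copied from a DNA template written 3'->5'."""
    template = template_3_to_5.upper()
    if set(template) - set("ACGT"):
        raise ValueError("template must contain only A, C, G or T")
    return template.translate(TEMPLATE_TO_RNA)


# Hypothetical six-nucleotide template used only as an illustration.
print(transcribe("TACGGT"))  # AUGCCA
```

Note that uracil, not thymine, is placed opposite adenine, which is the base difference between RNA and DNA mentioned in the text.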


Figure 1: Nucleotide chemical structure. In DNA, a nucleotide (nt) is composed of a base (Cytosine, Thymine, Adenine or Guanine), a deoxyribose and a phosphate moiety. In RNA, the nucleotide is composed of a base (Cytosine, Uracil, Adenine or Guanine), a ribose and a phosphate moiety (figure created with ChemDraw software).

Unlike DNA, RNA molecules are very labile and undergo fast and spontaneous decay due to the presence of the 2'-hydroxyl group in the sugar ring. In order to be functional and avoid degradation, the large variety of produced RNA molecules are either bound by proteins or need to carry specific chemical modifications. These modifications play a critical role in the organization and regulation of gene expression in the cell due to their essential influence on the steady-state levels of RNA.

2.1.1 Types of RNAs

In all prokaryotic and eukaryotic organisms, there are at least three primary types of RNAs with different functions during protein synthesis: messenger RNAs (mRNAs), ribosomal RNAs (rRNAs) and transfer RNAs (tRNAs; Fig. 2). In addition, several other types of RNAs play an important role in the regulation of gene expression in different organisms; the most prominent are small nucleolar RNAs (snoRNAs), small nuclear RNAs (snRNAs), microRNAs (miRNAs), small interfering RNAs (siRNAs), long intergenic non-coding RNAs (lincRNAs) and PIWI-interacting RNAs (piRNAs). The different types of RNAs and their main functions are briefly described in the following sections (sections 2.1.1.1 to 2.1.1.4).

Figure 2: Involvement of the three primary types of RNA in protein synthesis. An mRNA molecule is translated into protein by the action of tRNAs together with the ribosome. The major components of the ribosome are the rRNAs. Adapted from Griffiths et al. (1993).

2.1.1.1 mRNAs

mRNAs are the only RNAs that encode proteins and are transcribed by RNA polymerase II. They represent ~5% of the total RNA in a cell and are the most heterogeneous in size and sequence. Because mRNAs are the key mediators between the genetic information encoded by the DNA and the amino acid sequence of the protein, their levels and localization are tightly regulated.

2.1.1.2 rRNAs

rRNAs are transcribed by RNA polymerases I and III. Together with ribosomal proteins, these RNAs make up the translation machinery of the cell, the ribosome. Prokaryotes have three rRNA species: the 5S and 23S rRNAs, which are part of the 50S large ribosomal subunit, and the 16S rRNA, which is found in the 30S small ribosomal subunit. In eukaryotes, the 60S large subunit contains three RNA molecules, the 5S, 5.8S and 28S rRNAs, while the 40S small subunit contains the 18S rRNA species. The rRNAs are the major structural component of ribosomes and form the peptidyl transferase center, whose activity is responsible for peptide bond formation during translation (Simonovic and Steitz 2009).

2.1.1.3 tRNAs

tRNAs are transcribed by RNA polymerase III, and there are at least 20 tRNA species, corresponding to the 20 amino acids necessary for protein production. They actively participate in the synthesis of proteins, as they are used to decode the sequence of the mRNA. They carry an amino acid on their 3'-CCA end and contact the mRNA through their anti-codon loop, which is complementary to three nucleotides of the mRNA sequence (Sprinzl and Cramer 1979; Green and Noller 1997). Once in the ribosome, their anti-codon region base-pairs with the mRNA codon, while their 3’-CCA end positions the amino acid next to the newly synthesized polypeptide. They are the translators of the cell.
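The codon/anti-codon complementarity described above can be illustrated with a short Python sketch. It assumes strict Watson-Crick pairing only; wobble pairing and modified tRNA bases, which matter in real decoding, are deliberately ignored, and the function name `anticodon_for` is a hypothetical helper introduced here for illustration.

```python
# Strict Watson-Crick complements between RNA bases.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def anticodon_for(codon: str) -> str:
    """Return the anti-codon (5'->3') that pairs antiparallel with a codon (5'->3')."""
    codon = codon.upper()
    if len(codon) != 3 or set(codon) - set("ACGU"):
        raise ValueError("a codon is three RNA bases (A, C, G or U)")
    # Complement each base, then reverse so both strands are written 5'->3'.
    return codon.translate(COMPLEMENT)[::-1]


print(anticodon_for("AUG"))  # CAU
```

The reversal step reflects the antiparallel orientation of the two paired strands: the first codon base pairs with the last anti-codon base.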

2.1.1.4 Other Types of RNAs

A large number of novel RNA species have been described in recent years, underscoring the functional complexity of RNAs. Indeed, many transcribed RNAs do not encode a polypeptide chain but are nevertheless crucial for the regulation of gene expression and for transcriptome metabolism. Some of these other types of RNAs are described in the following sections (sections 2.1.1.4.1 to 2.1.1.4.5).

2.1.1.4.1 snRNAs

snRNAs are found only in eukaryotes and are transcribed by RNA polymerase II or III. These small uridine-rich RNAs are always present as RNA/protein complexes named small nuclear ribonucleoproteins (snRNPs) and play an essential role in pre-mRNA splicing in the nucleus (Maniatis and Reed 1987; Guthrie and Patterson 1988; Mowry and Steitz 1988). One of them, the U6 snRNA, is uridylated in the nucleus, a modification that is essential for its function (Trippe et al. 1998; Trippe et al. 2006).

2.1.1.4.2 snoRNAs

snoRNAs are transcribed by RNA polymerase II or III. They play a crucial role in RNA biogenesis and guide the methylation and/or pseudouridylation of rRNAs, tRNAs and snRNAs. They are often embedded in small nucleolar ribonucleoproteins (snoRNPs), which are essential for the modification of the targeted RNA (Kiss-Laszlo et al. 1996; Ganot et al. 1997; Maden and Hughes 1997; Lafontaine and Tollervey 1998; Jady and Kiss 2001).

2.1.1.4.3 miRNAs

miRNAs are single-stranded RNAs of approximately 22 nt and are crucial for gene regulation. They are generated from imperfect hairpin RNA transcripts and bind to a complementary sequence usually found in the 3’ UTR of an mRNA. In most cases, miRNA binding leads to translational repression and decay of the targeted mRNA. However, miRNAs have also been implicated in translational activation (Filipowicz et al. 2008) and heterochromatin formation (Kim et al. 2008).

2.1.1.4.4 siRNAs

siRNAs derive from long, perfectly base-paired double-stranded RNAs and are about 20 to 25 nt in size. Like miRNAs, they are essential regulators of gene expression, and their proper biogenesis and maturation are therefore crucial for the cell (Babiarz et al. 2008; Tam et al. 2008; Watanabe et al. 2008).

2.1.1.4.5 piRNAs

piRNAs are around 24 to 29 nt in size and are specific to the germline. Transcribed by RNA polymerase II, these RNAs interact with PIWI proteins and are mainly responsible for the silencing of transposable elements in the germline (Girard et al. 2006; Vagin et al. 2006; Houwing et al. 2007; Kirino and Mourelatos 2007).

2.2 RNA Processing and Modification

RNA processing is a general term used to describe the events necessary for RNA maturation (Abelson 1979). Processing of RNAs can occur within the RNA molecule but also at the 5’- and 3’-ends. These processing events are an important determinant of the biological fate of RNAs, necessary for gene regulation and crucial for the proper function of the RNA.

RNA modifications encompass diverse enzymatic reactions such as pseudouridylation (conversion of uridine into pseudouridine, Ψ), methylation (addition of a methyl group to a nucleotide) and RNA editing (Adenosine-to-Inosine (A-to-I), Cytidine-to-Uridine (C-to-U) or uridine insertion/deletion in trypanosomal mitochondrial mRNAs). Such modifications can be critical for the main function of the RNA as well as for the encoded protein, as some of them can modify the mRNA translation process (Benne et al. 1986; Shaw et al. 1988; Flomen et al. 2004; Nishikura 2006; Ge and Yu 2013). Indeed, the nucleotide composition of a given RNA determines its chemical and structural properties and, in the case of mRNAs, its open reading frame. These modifications will either facilitate RNA maturation or target the RNA for degradation. One of the best characterized single-nucleotide modifications is RNA editing. In higher eukaryotes, the most important editing events are the A-to-I editing of double-stranded RNA (dsRNA) structures, catalyzed by the Adenosine Deaminases that Act on RNA (ADARs), and the A-to-I editing of tRNAs, catalyzed by the Adenosine Deaminases that Act on tRNA (ADATs; Fig. 3; Savva et al. 2012). The newly formed inosine nucleotide is recognized as a guanosine by the translation machinery and thus will alter the sequence of the encoded protein (Fig. 3). Additionally, editing reactions can influence the splicing pattern by recruiting splicing factors to the editing site. Editing can also trigger regulatory pathways such as silencing if the editing event happens within the UTR of mRNAs or on non-coding RNAs (Flomen et al. 2004; Nishikura 2006). Another type of editing, specific to trypanosomal species, is the insertion/deletion of uridines within mitochondrial mRNA molecules (Benne et al. 1986; Shaw et al. 1988). This editing event is crucial for mRNA function, since it restores open reading frames (ORFs), adds or deletes start or stop codons and corrects frame shifts (Benne et al. 1986; Feagin et al. 1988; Shaw et al. 1988). In this case, a complex called the 20S editosome is responsible for the editing of the mitochondrially encoded mRNAs (Pollard et al. 1992; Corell et al. 1996).
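To make the recoding effect of A-to-I editing concrete, the following Python sketch converts one adenosine into inosine and then decodes the codon as the translation machinery would, reading inosine as guanosine. The helper names and the four-entry excerpt of the standard genetic code are assumptions chosen for illustration; the snippet does not model ADAR or ADAT substrate selection.

```python
# Small excerpt of the standard genetic code, enough for the two examples below.
CODON_TABLE = {"CAG": "Gln", "CGG": "Arg", "AUA": "Ile", "AUG": "Met"}


def edit_a_to_i(rna: str, index: int) -> str:
    """Deaminate the adenosine at `index` (0-based) into inosine, written 'I'."""
    if rna[index] != "A":
        raise ValueError("A-to-I editing requires an adenosine at this position")
    return rna[:index] + "I" + rna[index + 1:]


def as_read_by_ribosome(rna: str) -> str:
    """Inosine is recognized as guanosine by the translation machinery."""
    return rna.replace("I", "G")


def decode(codon: str) -> str:
    """Return the amino acid for a codon that may contain inosine."""
    return CODON_TABLE[as_read_by_ribosome(codon)]


edited = edit_a_to_i("CAG", 1)
print(edited, decode("CAG"), "->", decode(edited))  # CIG Gln -> Arg
```

A single deamination is therefore enough to change the encoded amino acid, here from glutamine to arginine, without any change to the underlying DNA sequence.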


Figure 3: A-to-I editing. Adenosine is deaminated by ADARs or ADATs (depending on the RNA target) to form an inosine, which is recognized as a guanosine by the translation machinery of the cell (figure created with ChemDraw software).

In addition to internal nucleotide modifications, RNA molecules can also be modified at their extremities, and such modifications are essential for proper RNA function. One of the best characterized modifications occurs at the 5'-end and is known as the 5’ cap (Shatkin 1976). The cap consists of a guanine nucleotide methylated at position 7 and connected to the mRNA via an unusual 5′-to-5′ triphosphate bond, whose formation is catalyzed by the capping enzyme complex (Shatkin 1976; Shuman 1995). This modification is crucial for mRNA maturation, stability and translation efficiency (Merrick 2004; Meyer et al. 2004; Li and Kiledjian 2010). Decapping, catalyzed by the Dcp1-Dcp2 complex in concert with the Lsm1-7 complex, leads to the decay of the mRNA by the Xrn1 exonuclease, further highlighting the critical role of the 5’ cap in mRNA stability (Fig. 4; Decker and Parker 1993; Hsu and Stevens 1993; Dunckley and Parker 1999; Lykke-Andersen 2002; van Dijk et al. 2002).

More than 150 RNA modifications have been reported so far (Czerwoniec et al. 2009). The work presented in this thesis focuses solely on proteins involved in a 3’-end modification whose importance was underestimated for over 40 years but which has critical implications for miRNA biogenesis and, more generally, for cellular RNA degradation.


Figure 4: General mRNA decay pathway. Cytoplasmic mRNA degradation occurs by two general pathways, both of which are initiated by deadenylation. Shortening of the poly(A) tail is carried out by the Pan2/Pan3 complex or the Ccr4/Pop2/Not complex. Following deadenylation, mRNAs can be subjected to 3’ to 5’ degradation by the exosome. mRNAs can also be decapped by the Dcp1/Dcp2 decapping enzyme and then subjected to 5’ to 3’ degradation by Xrn1 or 3’ to 5’ decay by the exosome or DIS3L2.

2.3 3’-End Addition of Non-Templated Nucleotides

The processing or modification of RNA 3’-ends plays an important role in determining the biological fate of RNAs (Munroe and Jacobson 1990; Meyer et al. 2004; Alexandrov et al. 2006). One major type of modification encountered by mRNAs is the addition of non-templated nucleotides (Munroe and Jacobson 1990; West et al. 2006; Doma and Parker 2007; Martin and Keller 2007; Rissland and Norbury 2009). The functional consequence of this nucleotide addition is essentially to protect newly transcribed mRNAs from degradation. More generally, tail addition to RNAs regulates cellular RNA content by influencing RNA steady-state levels. Nuclear polyadenylation is essential to degrade various classes of non-coding RNAs (ncRNAs) in the nucleus (Deutscher 1990; Keller 1995; Manley 1995; LaCava et al. 2005). However, once in the cytoplasm, RNAs carrying a 3'-poly(A) tail are protected from 3'-to-5' exonucleases. Polyuridylation is another 3’ modification, which involves the addition of uridines at the 3’-end of RNA molecules. This modification is found on various types of RNAs such as mRNAs, small RNAs, miRNAs or gRNAs (Blum et al. 1990; Blum and Simpson 1990; Li et al. 2005; Yu et al. 2005; Horwich et al. 2007; Mullen and Marzluff 2008; Kurth and Mochizuki 2009; Rissland and Norbury 2009; Ameres et al. 2010; Aphasizheva and Aphasizhev 2010; Kamminga et al. 2010; Schmidt et al. 2011). It is known to have a major impact on multiple aspects of RNA turnover and metabolism (Blum et al. 1990; Blum and Simpson 1990; Mullen and Marzluff 2008; Rissland and Norbury 2009; Aphasizheva and Aphasizhev 2010; Schmidt et al. 2011).
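The notion of a non-templated 3’ tail can be sketched in a few lines of Python: nucleotides are appended to the 3’-end without reference to any template, and the identity of the last homopolymer run indicates which tail the RNA currently carries. The helper names `add_tail` and `trailing_run` and the short example sequence are hypothetical and serve only as an illustration; no enzyme kinetics or substrate preference is modelled.

```python
def add_tail(rna: str, nucleotide: str, count: int) -> str:
    """Append `count` copies of one nucleotide to the 3'-end of an RNA (written 5'->3')."""
    if nucleotide not in ("A", "C", "G", "U"):
        raise ValueError("nucleotide must be one of A, C, G or U")
    if count < 0:
        raise ValueError("count must be non-negative")
    return rna + nucleotide * count


def trailing_run(rna: str) -> tuple[str, int]:
    """Return the 3'-terminal nucleotide and the length of its homopolymer run."""
    if not rna:
        return ("", 0)
    last = rna[-1]
    return (last, len(rna) - len(rna.rstrip(last)))


body = "GGCAUC"                                # hypothetical RNA body
polyadenylated = add_tail(body, "A", 6)        # poly(A) tail
uridylated = add_tail(polyadenylated, "U", 2)  # uridines added after the poly(A) tail

print(trailing_run(polyadenylated))  # ('A', 6)
print(trailing_run(uridylated))      # ('U', 2)
```

In the second case the RNA still contains its poly(A) stretch, but its 3’-terminal residues are uridines, as in a uridylated polyadenylated mRNA.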

2.3.1 Polyadenylation

Eukaryotic mRNAs start to be modified during their transcription, when capping and polyadenylation take place at their 5'- and 3’-ends, respectively (except for histone mRNAs and some viral mRNAs; Parker and Song 2004). Pre-mRNAs are first cleaved by the cleavage and polyadenylation machinery at the polyadenylation site located near what will become the 3’-end. This cleavage is followed by addition of the poly(A) tail by nuclear poly(A) polymerases (nPAPs). This event defines the 3’ untranslated region (UTR) of the RNA, which is crucial for the regulation of gene expression (Wilkie et al. 2003). Mutations and changes in the length of this region directly affect a variety of processes such as mRNA stability, mRNA localization and mRNA translation efficiency (van der Velden and Thomas 1999; Conne et al. 2000; Bashirullah et al. 2001; Babendure et al. 2006; Mayr and Bartel 2009). Once the mRNAs are exported to the cytoplasm, they may undergo several additional modifications such as methylation, editing, deadenylation, decapping and polyuridylation, which in turn influence the stability or degradation of the RNA (Benne et al. 1986; Blum et al. 1990; Shyu et al. 1991; Decker and Parker 1993; Seiwert et al. 1996; Dunckley and Parker 1999; Lykke-Andersen 2002; Yu et al. 2005; Mullen and Marzluff 2008; Rissland and Norbury 2009; Kamminga et al. 2010; Schmidt et al. 2011). Polyadenylation regulates RNA degradation, one of the most important mechanisms controlling gene expression, which serves not only to remove mRNAs that should no longer be translated but also to dispose of incorrectly transcribed mRNAs that have escaped the nuclear surveillance mechanisms. The general RNA degradation pathway is well conserved throughout eukaryotes, from yeast to mammals, and proceeds in two major directions: 5’-3’ degradation by the Xrn1 exoribonuclease and 3’-5’ degradation catalyzed by the exosome (Fig. 4; for a recent review, see Schoenberg and Maquat 2012). However, before degrading the mRNA body, the cell machinery must first identify the mRNA to be degraded. The cellular cues initiating mRNA degradation are still poorly understood for mRNAs of house-keeping genes, whereas physiological inputs that trigger mRNA decay, such as pro-inflammatory responses, heat shock or differentiation, are far better characterized (Carballo et al. 1998; Lai et al. 2000). Deadenylation is generally the rate-limiting event in cytoplasmic mRNA degradation and is catalyzed by the PAN2/PAN3 complex followed by the CCR4/NOT complex (Shyu et al. 1991; Decker and Parker 1993). Once the poly(A) tail has been removed, the Dcp1-Dcp2 decapping complex removes the 7-methylguanylate cap (m7G) from the 5’-end of the mRNA, allowing degradation of the RNA in the 5’-to-3’ direction by the Xrn1 exonuclease (Decker and Parker 1993; Hsu and Stevens 1993; Dunckley and Parker 1999; Lykke-Andersen 2002; van Dijk et al. 2002). Alternatively, following deadenylation, the cytoplasmic exosome complex may degrade the deadenylated RNA in the 3’-to-5’ direction (Anderson and Parker 1998; Wang and Kiledjian 2001).

1.5.2 Polyuridylation

Recently, another player in the mRNA decay pathways has come into focus: the cytoplasmic non-canonical nucleotidyl transferases (ncNTrs). These enzymes add uridine residues at the 3’-end of RNA molecules, which can be coding RNAs or ncRNAs. Even though this modification has been known since the late fifties, its significance had been underestimated (Canellakis 1957; Wilkie and Smellie 1968; Wilkie and Smellie 1968). In the mid-eighties, the importance of uridylation increased with the discovery and the characterization of the uridine insertion/deletion editing mechanism in the mitochondria of kinetoplastids. The process was subsequently shown to be crucial for translation efficiency of local mRNAs (Benne et al. 1986; Blum et al. 1990; Seiwert et al. 1996). During the last decade, evidence accumulated that polyuridylation also exists in higher eukaryotes. Norbury and colleagues showed its implication during the S-phase of the cell cycle (Rissland et al. 2007). Further studies demonstrated that polyuridylation is a critical step in the degradation of non-polyadenylated mRNAs encoding histone proteins in mammals (Mullen and Marzluff 2008).

This new enzymatic step occurring at the 3'-end of RNAs added another level of complexity to the known mRNA decay pathways, further highlighting the importance of regulation for this 3’-end modification (Mullen and Marzluff 2008; Rissland and Norbury 2009; Schmidt et al. 2011). Polyuridylation has also been found to occur on other types of RNA molecules such as miRNAs, siRNAs and piRNAs (Li et al. 2005; Yu et al. 2005; Horwich et al. 2007; Kurth and Mochizuki 2009; Ameres et al. 2010; Kamminga et al. 2010). Finally, studies from the Aphasizhev laboratory on ncNTr family members present in trypanosomal species demonstrated additional roles for these enzymes, not only in the RNA editing process but also during the processing of gRNAs and during mRNA translation in the mitochondria (Aphasizhev et al. 2003; Aphasizheva et al. 2011).

In the following paragraphs, we focus on the latest developments concerning terminal polyuridylation by non-canonical polymerases, a long-underestimated 3’-end post-transcriptional modification undergone by various types of RNAs that influences their half-life and functions.

1.6 The Non-Canonical Ribonucleotidyl Transferases Family

Enzymes performing terminal polyuridylation belong to the Polymerase β-like nucleotidyl transferases superfamily and more precisely to the family of template-independent polymerases that covalently add nucleotides to the 3’-end of RNA molecules. This family is generally divided into three subgroups: (i) The canonical RNA-specific nucleotidyl transferases (rNTrs) group, which corresponds to the nuclear poly(A) polymerases α, β and γ. These are found in eukaryotes and share similar enzymatic and RNA-binding domains. (ii) The non-canonical rNTrs comprise the Gld-2-, Trf4/5-, Cid1-type poly(A) or poly(U) polymerases, 2’-5’-oligo(A) synthetases and the terminal uridylyl transferases (TUTases). They differ from the canonical rNTrs by their divergent nucleotide recognition motif (NRM) but possess a similar catalytic domain. (iii) The third group consists of the CCA-adding enzymes, which also have a conserved catalytic domain but contain the most divergent NRM of this family. Canonical rNTrs and CCA-adding enzymes have been characterized in depth in past reviews; thus, we will only focus on the non-canonical rNTrs (Martin and Keller 2007).

Every member of the non-canonical rNTrs group is characterized by an enzymatic domain made of two lobes termed the catalytic and the central domains. The catalytic domain is made of four or five β-strands and carries a conserved catalytic triad. Specifically, a DxD or DxE motif (aspartate (Asp) or glutamate (Glu) residues separated by one hydrophobic residue) is located in the second β-strand of the catalytic domain. A single Asp or Glu positioned in the third β-strand of the catalytic domain completes the triad. Thus, the consensus signature of the catalytic site is hG[GS]x(7-13)Dh[DE]h (with h for hydrophobic residues, uppercase letters for invariant residues, and x for any residue). The central domain contains the NRM, which corresponds to a 10-15 amino acid long loop forming one end of the nucleotide triphosphate binding pocket. As mentioned above, there are three types of NRM corresponding to the three subgroups of the Pol β-NTrs family (canonical-, non-canonical- and CCA-rNTrs associated types). Sub-classification of the rNTrs was attempted based on sequence conservation within the NRM.
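As an illustration only, the consensus signature hG[GS]x(7-13)Dh[DE]h quoted above can be translated into a regular expression and used to scan a protein sequence. The sketch below is not part of the thesis work: the set of residues counted as hydrophobic (h), the function name and the toy sequence are all assumptions made for the example, and the pattern covers only the DxD/DxE part of the triad, not the third acidic residue located in the following β-strand.

```python
import re

# Assumption: "h" (hydrophobic) is taken here as A, V, I, L, M, F, W, Y or C.
# The text does not define the exact residue set, so adjust as needed.
HYDROPHOBIC = "AVILMFWYC"

# hG[GS]x(7-13)Dh[DE]h written as a regular expression.
SIGNATURE = re.compile(
    rf"[{HYDROPHOBIC}]G[GS].{{7,13}}D[{HYDROPHOBIC}][DE][{HYDROPHOBIC}]"
)


def find_catalytic_signature(sequence):
    """Return (start, matched_text) for each non-overlapping match (0-based start)."""
    return [(m.start(), m.group()) for m in SIGNATURE.finditer(sequence.upper())]


# Hypothetical toy sequence, not a real protein.
toy = "MKTLGSAAAAAAAADLDVQQ"
print(find_catalytic_signature(toy))
```

A scan of this kind only flags candidate sites; confirming that a match really forms the catalytic site requires structural or biochemical evidence.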

Furthermore, an RNA binding domain (RBD) is also found in all canonical and a few non-canonical rNTrs. It shares structural homology with the RNA recognition motif (RRM) protein family and binds RNA substrates in a non-sequence-specific manner (Zhelkovsky et al. 1995; Martin and Keller 1996). The position of the RRM domain in the sequence varies, e.g. near the C-terminus for the canonical PAPs or at the N-terminus in some non-canonical rNTrs. The RBD is absent in numerous non-canonical NTr enzymes, indicating either that these proteins can act on any RNA or that their activity is restricted by binding to a yet to be identified protein partner that targets them to specific RNAs. In at least one case, the enzyme ZCCHC11 is targeted to specific pre-miRNA species through interactions with the Lin28 proteins (Yu et al. 2007; Heo et al. 2009; Chang et al. 2012).

From a phylogenetic point of view, several models have been proposed to explain the evolution of the Pol β-NTrs family. The hypothesis of Aravind and Koonin (Aravind and Koonin 1999) is that the Pol β-NTrs family members have rapidly and independently diverged from a common ancestor presenting a very general and non-specific nucleotidyl transferase activity. The different family members would have acquired distinct functional domains to occupy vacant evolutionary niches. Horizontal gene transfer and lineage-specific gene loss could then explain the current distribution of the different groups in the three domains of life. Some evidence, such as the discovery of the archaeal and bacterial minimal nucleotidyl transferases (MNT family) and the restricted phylogenetic distribution of most of the Pol β-NTrs family members, supports this model (Aravind and Koonin 1999). However, it has been shown that a bacterial poly(A) polymerase that possesses the RBD of a CCA-adding enzyme is able to act as a CCA-rNTr (Betat et al. 2004). This suggests that the CCA-adding enzymes could be the ancestors of the poly(A) polymerases and possibly the founders of all the remaining rNTrs, which would have adopted different RNA binding domains mediating different target specificities.

Within the non-canonical rNTrs, there are two main groups of poly(U) polymerases based on their substrates: the Cid1-like family and the RNA editing enzymes. The Cid1 (Caffeine-induced death suppressor) protein from Schizosaccharomyces pombe is the pioneer of cytoplasmic poly(U) polymerases (Aravind and Koonin 1999). Many other proteins are part of this group, with limited sequence homology but highly similar enzymatic properties, such as the trypanosomal protein RNA editing TUTase 1 (RET1), which modifies both the gRNAs and the mRNAs in a process that still awaits full characterization (Blum and Simpson 1990; Aphasizhev et al. 2003; Aphasizheva and Aphasizhev 2010; Aphasizhev and Aphasizheva 2011; Aphasizheva et al. 2011). Seven proteins from this sub-group are found in humans, although their precise actions also require more detailed characterization (Fig. 5; Trippe et al. 2006; Rissland et al. 2007; Heo et al. 2009).

Figure 5: Schematic representation of human TUTase proteins. The TUTases catalytic motif is composed of the nucleotidyl transferase domain (NTr domain, dark aquamarine box) and the PAP-associated domain (magenta box). Light aquamarine box indicates a conserved inactive NTr domain (sequence variations). Green and yellow boxes represent zinc finger and zinc knuckles, respectively. Violet box corresponds to the RNA recognition motif (RRM).

The RNA editing enzymes are responsible for mitochondrial mRNA editing by U-insertion/deletion in kinetoplastids (Rusche et al. 1997; Aphasizhev et al. 2002; Panigrahi et al. 2003; Wang et al. 2003; Stuart et al. 2005; Aphasizhev and Aphasizheva 2008). Two main proteins have been studied in detail: RNA editing TUTase 2 (RET2) and the mitochondrial editosome-like complex associated TUTase 1 (MEAT1). RET2 and MEAT1 are found with the 20S editosome complex of trypanosomes and are crucial for the U insertion-type of editing in this organism (Ernst et al. 2003; Wang et al. 2003). Crystal structures of RET2 and MEAT1 showed a conserved domain organization except for the middle domain (MD; Deng et al. 2005; Stagno et al. 2010). The lack of sequence similarity within this middle domain suggests divergent functions for these specific modules, which are only found in trypanosomal TUTases.

1.7 Substrates of Polyuridylation in the Different Cell Compartments

Polyuridylation at the 3’-end of RNAs varies with their localization as well as with the type of RNA to be modified. Until a few years ago, this process had been mostly reported in the mitochondria of the parasitic protist Trypanosoma (Benne et al. 1986; Blum et al. 1990). More recently, non-canonical rNTrs were found in the cytoplasm of eukaryotic organisms and were shown to modify a wide range of non-translated and translated RNAs (Abelson 1979; Mullen and Marzluff 2008; Rissland and Norbury 2009; van Wolfswinkel et al. 2009; Ibrahim et al. 2010; Kamminga et al. 2010; Schmidt et al. 2011; Zhao et al. 2012). Details of the different polyuridylation substrates in the nucleus, cytoplasm and mitochondria are described hereafter and summarized in Fig. 6.


Figure 6: Substrates of polyuridylation in the different cell compartments. For each substrate, the players for polyuridylation are presented for the organisms mentioned. The bent arrow indicates the polyuridylation event.

1.7.1 Polyuridylation in the nucleus

Until now, the only known substrate of uridylation reported in the nucleus is the U6 snRNA (Fig. 6). This RNA is uridylated by the U6 Terminal Uridylyl Transferase (TUTase), which is essential for cell survival in mammals (Trippe et al. 2006). U6 TUTase is responsible for the addition or restoration of four uridine residues at the 3’-end of U6 snRNA, since this 3’-end is constantly subjected to exonuclease activity (Trippe et al. 1998; Trippe et al. 2006). These four U residues form an intramolecular double strand with a stretch of adenines within the U6 snRNA molecule, which is important for mRNA splicing (Trippe et al. 2006).


1.7.2 Polyuridylation in the organelles (mitochondria)

Uridylation events in the organelles have only been reported in mitochondria (Aphasizheva and Aphasizhev 2010; Aphasizhev and Aphasizheva 2011). Uridylation is apparently absent from the chloroplast, although proteins from the rNTrs family such as the poly(A) polymerase are present there (Zimmer et al. 2009). One possible reason is the close evolutionary conservation of the RNA processing pathways found in the chloroplast and in bacteria (Raynal and Carpousis 1999).

Poly(U) tails have been reported mostly in kinetoplastid organisms. Two main substrates are targeted in these organisms: gRNAs and mRNAs (Fig. 6). The gRNAs are specific to kinetoplastid species and are crucial for cell survival as they guide the editosome machinery to its mRNA targets in these organisms (Blum et al. 1990). RET1, the first characterized ncNTr, is responsible for the polyuridylation of gRNAs, and this step allows the proper maturation of the gRNA (Aphasizhev et al. 2003; Aphasizheva and Aphasizhev 2010). It is not yet fully understood how the RET1 enzyme recognizes its gRNA substrate or how the pre-gRNA processing step takes place (Aphasizhev et al. 2003; Aphasizheva and Aphasizhev 2010). RET1 was more recently shown to be involved in polyuridylation of mRNAs, where it works in concert with the kinetoplast poly(A) polymerase 1 (KPAP1). These two proteins add alternating A/U residues specifically to the 3’-end of mitochondrially-encoded mRNAs. Their action is coordinated by the kinetoplast polyadenylation/uridylation factor 1 and 2 (KPAF1 and KPAF2) complex (Aphasizheva et al. 2011). As for the cytoplasmic RNAs, our molecular understanding of the factors that specifically target RET1 and KPAP1 activities to the mRNAs is poor and awaits further structural and biochemical characterization (Aphasizhev and Aphasizheva 2011; Aphasizheva et al. 2011).

It is noteworthy that RNAs with poly(U) tails have also been observed in human mitochondria under certain conditions (Slomovic et al. 2005; Slomovic and Schuster 2008; Borowski et al. 2010; Szczesny et al. 2010). In spite of this, how this process is achieved in this compartment and its implication(s) for human mitochondrial RNA metabolism still remain to be characterized.

1.7.3 Polyuridylation in the cytoplasm

Cytoplasmic polyuridylation occurs on a variety of RNA molecules ranging from polyadenylated to non-polyadenylated RNA molecules including mRNAs, small RNAs, miRNAs or piRNAs (Fig. 6; Mullen and Marzluff 2008; Rissland and Norbury 2009; van Wolfswinkel et al. 2009; Ibrahim et al. 2010; Kamminga et al. 2010; Schmidt et al. 2011; Zhao et al. 2012).

As mentioned previously, several eukaryotic mRNAs were shown to be uridylated in the cytoplasm by the poly(U) polymerase Cid1 (Fig. 6; Rissland et al. 2007; Rissland and Norbury 2009). Until now, only a handful of mRNAs have been identified as specifically uridylated, such as act1, urg1 and adh1 (Rissland and Norbury 2009). More recently, one study looked at the 3'-end sequence of mRNAs at a genome-wide level and revealed that U tails are apparently attached to short poly(A) tracts rather than to the mRNA body (Chang et al. 2014). Interestingly, while some mRNAs like the one encoding the poly(A) binding protein 4 are polyuridylated in more than 25% of the cases, about 80% of mRNAs have a uridylation frequency between 2 and 5%. Overall, the functional relevance of these low levels of uridylation and the identity of the actors responsible for cytoplasmic mRNA polyuridylation are currently poorly understood. Their characterization will likely shed new light on this particular process.

Non-polyadenylated mRNAs are also uridylated in the cytoplasm (Fig. 6). This is the case for the histone mRNAs, which are polyuridylated by the Cid1 orthologous enzymes TUTase1 and TUTase3 (also named PAPD1 and Trf4-2, respectively; Mullen and Marzluff 2008). It is not fully clear, however, how these enzymes would switch between cell compartments or how they could select which substrates to uridylate and which ones to adenylate in vivo (Mullen and Marzluff 2008). One should not exclude that several pools of PAPD1, differentially located in the cell, may exist. Several years later, another Cid1 orthologous enzyme, ZCCHC11, was also shown to polyuridylate histone mRNAs following inhibition or completion of DNA replication (Schmidt et al. 2011). More data are required to fully understand the role of ncNTrs during regulated histone mRNA degradation, in particular regarding the factors bringing together the histone mRNAs and the ncNTrs.

Diverse ncRNAs from a range of organisms have been shown to carry one or several non-templated uridine residues at their 3'-end (Fig. 6; Li et al. 2005; Ibrahim et al. 2006; Ibrahim et al. 2010; Heo et al. 2012; Thornton et al. 2012). However, the functional consequences associated with these modifications are highly diverse. 3’ uridylation of various miRNAs has been observed in multiple sequencing studies, suggesting a wide role for uridylation in the miRNA biogenesis pathway (Landgraf et al. 2007; Morin et al. 2008; Choi et al. 2012). These mono- or polyuridylation events have been found in both pre-miRNAs and mature miRNAs (Ibrahim et al. 2010; Newman et al. 2011; Heo et al. 2012; Thornton et al. 2012). In C. elegans and humans, polyuridylation of the pre-let-7-miRNA has been reported and is performed by the proteins PUP-2 and ZCCHC11, respectively (Hagan et al. 2009; Lehrbach et al. 2009; Thornton et al. 2012). Association between the pre-miRNA and the Lin28 protein induces a conformational change in the pre-miRNA loop, which possibly favors modification by ZCCHC11 (Nam et al. 2011; Loughlin et al. 2012). However, the presence of a single 3'-overhanging nucleotide appears critical for the uridylation process, therefore excluding the so-called "group I" or canonical miRNAs from being subject to uridylation (Heo et al. 2012). Furthermore, in the same study, the ZCCHC6 enzyme was found to be responsible for the mono-uridylation of group II let7 pre-miRNAs, and this modification is independent of the Lin28 protein (Heo et al. 2012). ZCCHC11 has also been implicated in the uridylation of specific mature miRNAs such as miR-26 (Jones et al. 2009). Further biochemical and biophysical studies are needed in order to identify the specific enzymes responsible for the uridylation of other miRNAs in higher organisms as well as the target-specific effects induced by this 3'-end modification. 3’ uridylation of piRNAs has been observed in zebrafish and Drosophila, but the enzymes responsible for this modification are currently unknown (Ameres et al. 2010; Kamminga et al. 2010). Another class of ncRNAs subjected to 3’ uridylation is the siRNAs (Fig. 6). In nematodes, this particular type of RNA substrate is modified by the proteins CDE-1 (Cosuppression defective 1) and HESO-1 (HEN1 SUPPRESSOR 1; Ren et al. 2012). In plants and lower eukaryotes, MUT68 has been implicated in this event (Ibrahim et al. 2010). HESO-1 has also been shown to act on atypical substrates, i.e. the products of miRNA-directed mRNA cleavage (Ren et al. 2014). In this case, the uridine nucleotides are apparently added to the 5'-fragment of the cleaved mRNA while it is still bound by the AGO1 complex (Ren et al. 2014). Further studies will help determine the generality of this mechanism, as the responsible enzyme HESO-1 does not seem to be conserved in higher eukaryotes.

1.8 Known Functions of Polyuridylation

3’ uridylation is a widespread phenomenon in eukaryotes with multiple biological consequences depending on the substrate to be modified (coding RNA or ncRNA), the eventual targets of the uridylated RNA and the cell compartment. In the following section, we will review the main known consequences of uridylation in the nucleus (Fig. 7), in the mitochondria (Fig. 8) and in the cytoplasm (Fig. 9).

1.8.1 In the nucleus

The highly specific uridylation of U6 snRNA by U6 TUTase is necessary for U6 snRNA stability and function (Trippe et al. 2006). As mentioned before, this is the only substrate known to be uridylated in this compartment, and the main function of this uridylation event is the stabilization of this snRNA, which is crucial for RNA splicing. Mammalian U6 snRNA uridylation in vivo has been reported with up to 20 nucleotides added at the 3’-end of the RNA molecule (Rinke and Steitz 1985; Lund and Dahlberg 1992). It is important to note that U6 snRNA is also subjected to adenylation and that this event inhibits its uridylation (Fig. 7; Chen et al. 2000). In a further analysis, Chen and colleagues observed U6 snRNA transcripts in which the adenine residue at position 102 was replaced with a uridine, strongly suggesting a role for uridylation in the regeneration of the U6 snRNA 3’ end following its shortening by exonucleases (Chen et al. 2000). Biochemical studies identified the TUTase responsible for U6 snRNA 3’ uridylation, and this enzyme was named U6 TUTase (Trippe et al. 2006). siRNA-mediated silencing of the U6 TUTase led to U6 snRNA decay, confirming the necessity of uridylation for U6 snRNA stability (Trippe et al. 2006). So far, this is the only uridylation event known in the nucleus (Fig. 7).


Figure 7: Function of polyuridylation in the nucleus. U6 snRNA is the only known substrate for polyuridylation in the nucleus by U6 TUTase. Polyuridylation is thought to regenerate the 3’ end of U6 snRNA following its shortening by exonucleases. If this RNA is adenylated, the polyuridylation event is inhibited.

1.8.2 In the organelles

So far, no polyuridylation event has been found in the chloroplasts of plant and algal cells. In this organelle, the RNA surveillance mechanism is rather close to the one found in bacteria, where polyadenylation of the 3’-end of mRNAs is the main destabilizing factor (Lisitsky et al. 1996; Lisitsky et al. 1997; Blum et al. 1999).

Polyuridylation of RNAs in the mitochondria has been observed and has two major impacts depending on the type of RNA modified. As mentioned before, this event is mostly specific to kinetoplastid organisms and is found on gRNAs and on mRNAs. For gRNAs, polyuridylation is the final maturation step and is necessary for the editing process of mitochondrially-encoded mRNAs, whereas uridylation together with polyadenylation is necessary for the proper translation of edited and never-edited mRNAs (Fig. 8).

In order to mature, pre-gRNAs need to pass through an exonucleolytic process followed by stabilization by the gRNA Binding Complex (GRBC) and RET1 uridylation (Blum and Simpson 1990; Weng et al. 2008; Aphasizheva and Aphasizhev 2010). A mature gRNA is thus composed of a 5’ phosphate from the transcription followed by an anchor region complementary to a target unedited mRNA, a guiding region that directs the editing of its mRNA target and a final poly(U) tract at the 3’-end. In RET1-depleted cells, gRNAs are stable but not able to perform their editing function, suggesting a crucial role of the oligo(U) tail in the editing event in the mitochondria. This oligo(U) tract may stabilize the gRNA-mRNA hybrid through binding with the purine-rich pre-edited region (Blum and Simpson 1990). The uridylated gRNA bound to its mRNA target recruits the 20S editosome. This gRNA-mediated mRNA editing in kinetoplastid trypanosomes is crucial for parasite survival, as these editing events are needed for the proper establishment of the coding sequence of the mitochondrial mRNAs (Blum and Simpson 1990).

Figure 8: Function of polyuridylation in the mitochondria of trypanosomes. In order to be properly matured, gRNAs are polyuridylated by RET1 TUTase allowing the gRNA to “guide” the editing reaction. To be translationally competent, mitochondrial mRNAs require addition of a long A/U tail, which is performed by the RET1/KPAP1 complex and coordinated by the KPAF1/KPAF2 complex. The mRNA is then recognized by the ribosome and translation can be started.

After editing, mRNAs need to be further modified at the 3’-end in order to be translationally competent in trypanosomal mitochondria. The addition of a long 3’ A/U tail is necessary to make the mRNAs favorable for translation (Fig. 8; Aphasizheva et al. 2011). This nucleotide addition is performed by the RET1/KPAP1 complex, which adds around 200 alternating adenines and uridines to the 3’-end of the targeted mRNA and is modulated by the KPAF1/KPAF2 protein complex (Aphasizheva et al. 2011). The polyadenylated/polyuridylated mRNA is then recognized by the translation machinery to produce the encoded protein (Aphasizheva et al. 2011).

1.8.3 In the cytoplasm

The functional outcome of polyuridylation in this compartment offers new insights into RNA turnover and small RNA biogenesis (Fig. 9). Most cytoplasmic uridylated mRNAs and sRNAs are targeted for decay, thus strongly implying a destabilizing role for uridylation. RNA cRACE studies in fission yeast revealed a role for uridylation in a new deadenylation-independent decapping-mediated degradation pathway (Rissland and Norbury 2009). Furthermore, a single uridine at the 3’-end of an RNA molecule is sufficient for recognition by the Lsm1-7 complex, known to link 3'-end deadenylation and 5'-end decapping, therefore supporting the relationship between uridylation and mRNA degradation (Song and Kiledjian 2007). The recently discovered Dis3L2 protein in fission yeast was shown to preferentially degrade mRNAs with 3’-end uridylation, and its deletion together with Lsm1 led to the accumulation of uridylated RNAs (Malecki et al. 2013). The mammalian Cid1 orthologs ZCCHC11 and ZCCHC6, together with the tumor suppressor protein Lin28, uridylate pre-let7-miRNA in humans, thereby influencing miRNA production both positively and negatively (Schmidt et al. 2011). Interestingly, the mammalian DIS3L2 exonuclease was also shown to specifically degrade uridylated pre-let7-miRNA, discriminating it from 3’-unmodified RNAs (Lubas et al. 2013). Further studies of the DIS3L2 exonuclease and the TUTases will be necessary to better understand their respective functions and, more importantly, the functional link existing between these enzymatic activities.

Upon inhibition of DNA replication or conclusion of S-phase, histone mRNAs must be rapidly degraded in order to avoid their accumulation and their interference with other cellular pathways (Osley 1991). Histone mRNAs are not polyadenylated but possess a stem-loop structure at their 3’-end that is crucial for pre-mRNA processing, export and proper translation (Pandey and Marzluff 1987; Gallie et al. 1996; Battle and Doudna 2001). Studies aiming to understand the mechanism by which histone mRNA degradation is triggered found that histone mRNAs are targeted for decay by uridylation (Mullen and Marzluff 2008; Schmidt et al. 2011). Indeed, knockdown of the ZCCHC11 enzyme led to the accumulation of histone-encoding mRNAs (Schmidt et al. 2011). Interestingly, as for polyadenylated mRNAs in fission yeast, uridylation of histone mRNAs was shown to promote decapping followed by 5’-3’ degradation (Mullen and Marzluff 2008). More recently, 3’-5’ degradation of histone mRNAs by the ERI1 exonuclease has been reported (Hoefig et al. 2013). Lsm1-7 promotes the degradation of the 3’-end stem-loop of the mRNA by recruiting ERI1 to the stem-loop and simultaneously blocks translation of these mRNAs. In this case, the Lsm1-7 complex binds to the uridylated histone mRNAs and to ERI1 (Hoefig et al. 2013).


Figure 9: Function of polyuridylation in the cytoplasm. In humans, ZCCHC11 in concert with Lin28 polyuridylates pre-let7-miRNA, which will then be degraded by the DIS3L2 exonuclease. ZCCHC6 alone is responsible for the monouridylation of group II pre-miRNA, which will be further processed by Dicer. In the case of mature miRNAs, ZCCHC11 monouridylates some miRNAs, leading to indirect consequences for the mRNAs targeted by these miRNAs. In plants, zebrafish and flies, methylation and polyuridylation have antagonistic effects. Methylated siRNAs, piRNAs and miRNAs will be stabilized whereas polyuridylated ones will be degraded. Finally, in S. pombe, polyuridylation of mRNAs by Cid1 PUP leads to decapping and decay.

Studies in plant, animal and fly species have demonstrated an antagonistic role of uridylation and 2’-O-methylation in these organisms (Li et al. 2005; Yu et al. 2005; Kurth and Mochizuki 2009; Ameres et al. 2010; Kamminga et al. 2010). HEN1 (HUA ENHANCER 1) and its homologs methylate sRNAs in plants, piRNAs in animals and Ago2-associated siRNAs in flies, protecting these RNAs against 3’ uridylation (Li et al. 2005; Yu et al. 2005; Kurth and Mochizuki 2009; Ameres et al. 2010; Kamminga et al. 2010). In zebrafish hen1 mutants, piRNAs derived from retrotransposons are found uridylated and their levels are decreased, suggesting a sensitivity of these uridylated piRNAs to degradation. Interestingly, a mild repression of retrotransposons is observed in these mutants, thus highlighting a destabilizing role for uridylation of piRNAs and a stabilizing role for methylation, which is the likely cause of transposon silencing in the hen1 mutant (Kamminga et al. 2010). In C. elegans, CSR-1 is an Ago protein necessary for proper chromosome segregation rather than regulation of mRNA levels (Claycomb et al. 2009; Gu et al. 2009; van Wolfswinkel et al. 2009). CDE-1, a C. elegans PUP, uridylates unmethylated siRNAs of the CSR-1 pathway (van Wolfswinkel et al. 2009). Mutation of the CDE-1 enzyme leads to accumulation of CSR-1 siRNAs, which promotes erroneous chromosome segregation and defective gene silencing (van Wolfswinkel et al. 2009). Uridylation is thus a destabilizing factor for CSR-1 siRNAs, which regulates CSR-1-dependent and specific siRNA levels in this organism. In Chlamydomonas reinhardtii, MUT68 was first known to adenylate 5’ cleavage fragments of mRNAs targeted by the RNA-induced silencing complexes (RISC), thereby promoting their decay (Ibrahim et al. 2006). Further studies showed an important role for MUT68 in miRNA and siRNA degradation through 3’ uridylation (Ibrahim et al. 2010). Taken together, these data highlight the crucial role of sRNA uridylation in diverse biological processes in several organisms and show that defects in the regulation of this phenomenon can have important consequences for the expression of the RNA targets (Fig. 9).

Polyuridylation has also been reported to stabilize RNAs and thus prevent their degradation. In Arabidopsis thaliana, uridylation of oligoadenylated mRNAs has been suggested to prevent their 3’ trimming and instead favor 5’-to-3’ mRNA degradation (Sement et al. 2013). Indeed, URT1 (UTP:RNA uridylyl transferase 1) was shown to uridylate oligo(A)-tailed mRNAs in vivo, and its absence contributed to the degradation of oligoadenylated mRNAs, highlighting a new role for uridylation in mRNA stability. The ZCCHC11 enzyme, besides its role in histone mRNA and pre-miRNA decay, has also been implicated in indirect mRNA stabilization through uridylation of mature miRNAs. ZCCHC11-dependent uridylation of mature cytokine-targeting miRNAs is known to lead to the stabilization of cytokine transcripts and hence regulates cytokine gene expression. Mature miR-26 can bind the interleukin IL-6 mRNA in its 3’ UTR and targets this cytokine-encoding mRNA for degradation (Jones et al. 2009). Upon miR-26 uridylation by ZCCHC11, the miRNA is unable to bind the 3’ UTR of the mRNA, and thus the transcript is stabilized even though no degradation of the miRNA is observed. In this case, uridylation of miRNAs indirectly stabilizes the mRNA targeted by the unmodified miRNA by preventing the formation of the miRNA/mRNA complex and thus blocks the miRNA-directed silencing function. This is further confirmed by ZCCHC11 knockdown experiments, in which several cytokine mRNAs are downregulated in the absence of uridylation (Minoda et al. 2006; Jones et al. 2009).

Together, these data support a crucial role of cytoplasmic polyuridylation in the regulation of gene expression and in the stability control of coding and non-coding RNAs in eukaryotes. Understanding in detail the enzymatic properties of PUPs is therefore of prime importance and is the subject of the present thesis research.

Goals of the Thesis Work

The aim of this thesis work was to understand the molecular mechanisms by which polyuridylating enzymes discriminate UTP from ATP, CTP and GTP. In parallel, we wished to gain insight into how these enzymes recognize their RNA targets at the atomic level, to help explain their ability to modify such a variety of RNAs. For this purpose, we used X-ray crystallography as our main tool in order to reveal novel aspects of poly(U) polymerases along with their protein/protein and protein/RNA interactions. In the first project, we obtained structural information on protein/substrate, protein/RNA and protein/substrate/RNA interactions, which was complemented by mutational and functional studies of the different components of these complexes. In the second project, we set out to study protein/protein and protein/RNA interactions in order to elucidate the specific functions of the different subunits of the RDS complex (RET1 TUTase DSS1 exonuclease complex), a major uridylating complex in trypanosomal species.

1.9 Project I: Functional and Structural Studies of Cid1 Poly(U) Polymerase from S. pombe

Cid1, a template-independent cytoplasmic RNA poly(U) polymerase from S. pombe and the founding member of the PUP enzyme family, can use both ATP and UTP as substrates but preferentially uses UTP in vivo (Rissland and Norbury 2009). This enzyme lacks any known RRM, suggesting that it needs a protein partner for RNA target selection, even though no partners have been identified so far (Rissland et al. 2007). The molecular mechanism by which this enzyme discriminates UTP from ATP had not been described previously. Hence, in order to understand its UTP selectivity and how the enzyme selects and recognizes its RNA target, we aimed to solve the crystal structure of the enzyme alone and in binary and ternary complexes. Point mutants were then generated and several biochemical analyses were performed to better characterize its function and selectivity in vitro.

1.10 Project II: Structural Studies of Mitochondrial RNA Processing Complexes in T. brucei

DNA in the mitochondria of trypanosomes is densely packed in a structure called the kinetoplast, composed of approximately 40 to 50 maxicircles, which give rise to rRNA and mRNA precursors, and thousands of minicircles encoding mostly gRNAs. These gRNAs are necessary for mRNA editing; however, their biogenesis is not completely understood. Our collaborators in the group of Prof. Ruslan Aphasizhev characterized the complex responsible for gRNA biogenesis, called the RDS complex. This complex is composed of the TUTase RET1, the exonuclease DSS1 and three other proteins lacking any recognizable motif, named RDS1, RDS2 and RDS3. Our purpose in this project was to solve the atomic structure of the RDS complex. However, purification of a stable complex has not yet been achieved. Therefore, we are now working to obtain atomic structures of the individual subunits, with a special emphasis on RET1 alone, RET1 with UTP and/or with its RNA substrate, and RET1 in a protein/protein complex with DSS1. We aim to solve the crystal structures of the RDS1, RDS2 and RDS3 proteins, either as a whole complex or as several sub-complexes along with RET1 and DSS1, in order to further investigate the functions of these proteins.

Results

The results obtained during this PhD research are presented in the two following sections, 4.1 and 4.2. The first section focuses on the “Functional and structural studies of Cid1 poly(U) polymerase from S. pombe” project, and the second section describes the “Structural studies of mitochondrial RNA processing complexes in T. brucei” project. The first section includes two publications, complemented with unpublished work performed on the Cid1 enzyme. The second section describes the ongoing work on the different proteins of the RDS complex from trypanosomes, carried out in collaboration with the laboratory of Ruslan Aphasizhev in Boston and with a master student in our laboratory, Marius Long.

1.11 Functional and Structural Studies of Cid1 Poly(U) Polymerase from S. pombe

1.11.1 Cid1 Overexpression and Purification

Early studies of the Cid1 full-length (FL) enzyme showed that the protein is not stable during purification (Rissland et al. 2007). We therefore tested two different constructs named Cid1S (amino acids 40 to 366) and Cid1L (amino acids 40 to 377; Fig. 10A), based on secondary structure predictions performed with the Phyre program (Kelley and Sternberg 2009). We cloned Cid1S and Cid1L into two of our expression vectors, carrying either a heptahistidine tag (pST2) or a nonahistidine tag followed by an MBP (Maltose Binding Protein) tag (pST18). A TEV (Tobacco Etch Virus) protease cleavage site is located between the tag and the recombinant protein ORF so that the tag can be removed prior to crystallization. We initially tested expression of the His-tagged/MBP versions of Cid1S and Cid1L at small scale under several different conditions (different E. coli strains, media, induction temperatures and times; Fig. 10B). We observed no expression in Luria Broth (LB), Terrific Broth (TB) or autoinduction (AI) media when we used the Rosetta 2 strain of E. coli (Fig. 10B). With the E. coli strain BL21-Star, all three media gave good expression, but the best condition for Cid1 expression was AI (Fig. 10C). We then checked the solubility of the two enzyme constructs after lysis, followed by a small-scale purification with Ni-NTA beads (Fig. 11A, 11B and 11C). We obtained a low quantity of soluble protein for Cid1S, whereas the amount was higher for Cid1L (Fig. 11B and 11C, respectively). To be sure that the eluted protein was indeed our recombinant Cid1, we took one of the elution fractions and cleaved the tag with the TEV protease overnight (O/N) at 16°C (Fig. 11D). Both constructs are TEV cleavable, but since we obtained higher amounts of protein for Cid1L, the large-scale purification optimizations were performed with this construct. For structural biology studies, the protein purification should yield at least 1 mg of 95% pure protein.
The large-scale purification of the protein was straightforward, and the details of the buffers and purification method are described in the materials and methods of chapter 4.1.4. Throughout the whole purification, we assessed the purity of the enzyme on a 10% SDS-PAGE gel (Fig. 12A). In the size exclusion chromatography step, we observed two peaks (Fig. 12B). The first peak corresponds to the aggregated population of tagged protein and contaminants, and the second peak corresponds to the Cid1L monomer (Fig. 12B). At the end of the purification, we obtained 10 mg of Cid1L enzyme with more than 95% purity out of 10 g of lysed cells. The protein was further concentrated to 9.5-10 mg/ml for setting up factorial crystallization screens.
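The yield figures quoted above can be turned into a quick bookkeeping check. The following is a minimal sketch, not part of the original protocol: the helper names are hypothetical, and the only inputs are the numbers given in the text (10 mg of protein from 10 g of cells, a 1 mg minimum for structural studies, and a target concentration of 9.5-10 mg/ml).

```python
def yield_per_gram(protein_mg: float, cells_g: float) -> float:
    """Milligrams of purified protein obtained per gram of lysed cells."""
    if cells_g <= 0:
        raise ValueError("cell mass must be positive")
    return protein_mg / cells_g


def volume_ml(protein_mg: float, conc_mg_per_ml: float) -> float:
    """Sample volume (ml) if all the protein is brought to one concentration."""
    if conc_mg_per_ml <= 0:
        raise ValueError("concentration must be positive")
    return protein_mg / conc_mg_per_ml


# Values quoted in the text for the Cid1L construct.
PROTEIN_MG = 10.0
CELLS_G = 10.0
MINIMUM_MG = 1.0  # minimum amount stated for structural biology studies

cid1l_yield = yield_per_gram(PROTEIN_MG, CELLS_G)
meets_minimum = PROTEIN_MG >= MINIMUM_MG
# Assumption: no protein is lost during the concentration step.
volumes = {c: volume_ml(PROTEIN_MG, c) for c in (9.5, 10.0)}

print(f"yield: {cid1l_yield:.2f} mg per g of cells")
print(f"meets 1 mg minimum: {meets_minimum}")
for conc, vol in volumes.items():
    print(f"at {conc} mg/ml: {vol:.2f} ml of sample")
```

Under the no-loss assumption, the preparation corresponds to roughly 1 ml of concentrated sample, which is the quantity that limits how many crystallization conditions can be screened.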

Figure 10: Cid1 truncated versions scheme and expression tests. (A) Cid1 enzyme constructs Cid1L (40-377 Aa.) and Cid1S (40-365 Aa.) tested. The tag is in light blue, the TEV cleavage site is in dark blue and the Cid1 enzyme is in grey, with the nucleotide transferase domain in aquamarine and the PAP-associated domain in magenta. (B) Cid1L and Cid1S expression tests in the Rosetta 2 E. coli strain with three different media (LB, TB and AI), loaded onto 15% SDS-PAGE gels. (C) Cid1L and Cid1S expression tests in the BL21-Star E. coli strain with three different media (LB, TB and AI), loaded onto 15% SDS-PAGE gels. The arrow corresponds to the Cid1 tagged protein in panels B and C. LB and TB expressions were induced at 18°C with 0.1 mM IPTG (Isopropyl-β-D-thiogalactopyranoside). AI expressions were performed at 30°C in panels B and C. Ni for non-induced sample, O/D for overday, O/N for overnight, h for hour and Mw for protein molecular weight.

Figure 11: Cid1 enzyme truncated versions solubility, purification and TEV cleavage trials. (A) Cid1S and Cid1L small-scale expressions in LB and AI media, lysed with the EmulsiFlex and loaded onto a 10% SDS-PAGE gel to check the presence of Cid1 in the soluble fraction. (B) Batch purification of Cid1S expressed in LB or AI media. FT for flow-through, W for wash and E1, 2, 3 and 4 for elutions 1, 2, 3 and 4. (C) Batch purification of Cid1L expressed in LB or AI media. (D) Overnight TEV cleavage at 16°C of Cid1L and Cid1S constructs expressed in LB or AI media. Ni for non-induced sample, T for total fraction, P for pellet (insoluble fraction), S for soluble fraction, BT for sample before TEV cleavage and AT for sample after TEV cleavage.

Figure 12: Cid1 purification steps and final gel filtration chromatography profile. (A) Final 15% SDS-PAGE gel summarizing the different purification steps. (B) Cid1 gel filtration chromatography profile. Tot for total fraction, Sol for soluble fraction, FT for flow-through, BT for before TEV cleavage, AT for after TEV cleavage, rFT for reloading flow-through, Bgf for before gel filtration, Agf for after gel filtration and Prot for pure protein.

1.11.2 Cid1 Crystallization

We first screened ~480 individual crystallization conditions for the protein alone, testing two different protein concentrations (9.5 mg/ml and 4.5 mg/ml) at 18°C. Several crystallization hits were obtained among these conditions, but only one gave single crystals (Fig. 13A). The Morpheus screen condition containing 0.1 M MES/Imidazole pH 6.5, 30% Glycerol/PEG4000 and 0.09 M of a mix of NaI, NaBr and NaF gave single crystals in the middle of urchins (Fig. 13A). After one round of optimization of the crystallization condition, we checked whether these crystals indeed contained Cid1L. For this purpose, we fished three big rod crystals, washed them three times in three different reservoir drops and loaded them onto a 15% SDS-PAGE gel together with the whole drop, the last wash of the crystals and the protein sample (Fig. 13B). The SDS-PAGE gel clearly shows that the crystals contain the Cid1L protein (Fig. 13B).

Figure 13: Cid1 crystals characterization and optimization. (A) Initial Morpheus hit. (B) Silver stained 10% SDS-PAGE gel of dissolved Cid1 crystals. (C) First optimization. (D) Crystals grown using macro-seeding technique. (E) Crystals grown using the pseudo-batch approach.

Changing the buffer, precipitant and additive concentrations and combinations still gave us very small crystals with no defined edges in sitting drop vapor diffusion plates (Fig. 13C). During optimization trials, we noticed that leaving the protein together with the crystallization condition in an Eppendorf tube gave bigger crystals with sharper edges, suggesting that equilibration is not required for crystal growth. We therefore tried two other crystallization methods: macro-seeding and pseudo-batch (in which the reservoir and the drop are at the same concentration, so that no equilibration takes place between them, unlike in the standard vapor diffusion method; Fig. 13D and 13E). With macro-seeding, we obtained bigger crystals, but their edges were not sharp enough. With the pseudo-batch method, however, the crystals grew bigger and had sharp edges (Fig. 13E). These crystals were then tested for diffraction using our in-house source and beamline ID14-4 of the European Synchrotron Radiation Facility (ESRF), together with crystals soaked with 2.5 mM UTP or ATP supplemented with 5 mM MgCl2.

1.11.3 Data Collection and Phasing

Cid1 crystals diffracted up to 2.4Å and 2.28Å resolution for the native and UTP-soaked crystals respectively (Fig. 14A and 14B). Diffraction data reduction and scaling were performed as described in materials and methods chapter 4.1.4.
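To relate the quoted resolution limits to what is measured on the detector, Bragg's law (λ = 2d sin θ) gives the scattering angle at which a reflection of spacing d is recorded. The sketch below is only illustrative: the function name is hypothetical, and the wavelength of 0.9811 Å is the value listed for the UTP-soaked data set in Table 1 of the publication reproduced in section 1.11.4. Applying the same wavelength to the native data set is an assumption.

```python
import math


def bragg_angle_deg(d_angstrom: float, wavelength_angstrom: float) -> float:
    """Bragg angle theta (degrees) for lattice spacing d, from lambda = 2 d sin(theta)."""
    s = wavelength_angstrom / (2.0 * d_angstrom)
    if not 0.0 < s <= 1.0:
        raise ValueError("spacing not reachable at this wavelength")
    return math.degrees(math.asin(s))


WAVELENGTH = 0.9811  # Angstrom, UTP-soaked data set (publication Table 1)

theta_utp = bragg_angle_deg(2.28, WAVELENGTH)
# Assumption: the native data were collected at the same wavelength.
theta_native = bragg_angle_deg(2.4, WAVELENGTH)

for label, theta in (("UTP-soaked, 2.28 A", theta_utp), ("native, 2.4 A", theta_native)):
    print(f"{label}: theta = {theta:.2f} deg, 2*theta = {2 * theta:.2f} deg")
```

A smaller spacing (higher resolution) corresponds to a larger scattering angle, which is why the highest resolution reflections are found at the edge of the diffraction image.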

Data reduction and scaling only give us the intensities and amplitudes of the spots. To obtain the exact position of the atoms inside the crystal lattice, we need to calculate the phases. To do so, we used the molecular replacement (MR) method, in which phases calculated from an atomic model sharing at least 25% sequence identity with our protein are used as starting phases. Unfortunately, molecular replacement experiments with TbTUT4 as a model were not successful, since this protein has only 20% sequence identity with Cid1.
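The sequence identity criterion mentioned above can be illustrated with a short calculation. This is a minimal sketch under stated assumptions: the two aligned sequences are made-up toy strings (not Cid1 or TbTUT4), the function names are hypothetical, and identity is computed over aligned, gap-free columns, which is only one of several conventions in use.

```python
MR_IDENTITY_THRESHOLD = 25.0  # percent, rule of thumb quoted in the text


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != "-" and b != "-"]
    if not pairs:
        return 0.0
    matches = sum(1 for a, b in pairs if a == b)
    return 100.0 * matches / len(pairs)


def is_plausible_mr_model(identity_percent: float) -> bool:
    """True if the identity reaches the rule-of-thumb threshold."""
    return identity_percent >= MR_IDENTITY_THRESHOLD


# Toy alignment for illustration only (hypothetical sequences).
toy_identity = percent_identity("MKT-AYIAKQR", "MRTGAY-SKQL")
print(f"toy identity: {toy_identity:.1f}%")

# The 20% identity reported for TbTUT4 falls below the threshold.
print("TbTUT4 suitable as MR model:", is_plausible_mr_model(20.0))
```

The threshold is a guideline rather than a hard limit, but it is consistent with the failure of molecular replacement reported here for a model at 20% identity.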


Figure 14: Cid1 enzyme high resolution diffraction patterns. (A) High resolution diffraction pattern of the native Cid1 enzyme, with a snapshot of the crystal used for data collection. (B) High resolution diffraction pattern of the Cid1 enzyme soaked with 2.5 mM UTP supplemented with 5 mM MgCl2.

We then expressed a seleno-methionine (SeMet) derivative of Cid1. In this derivative, all methionines of the protein are replaced by seleno-methionine, which has 18 more electrons than methionine (selenium in place of sulfur) and thus allows phasing by anomalous diffraction methods. We used the same purification method as for the WT protein and obtained around 2 mg of pure monomeric SeMet-derivative protein from 5 g of cell pellet (Fig. 15A and 15B). We performed crystallization trials in the same conditions as for the WT protein, as well as general grid screens and micro-seeding (Fig. 15C). All the crystals obtained were tested at the synchrotron, but unfortunately we could not obtain enough anomalous signal (SigAno <1) to identify the positions of the seleno-methionines (Fig. 15D).

Figure 15: SeMet derivative Cid1 enzyme purification, crystallization and data analysis. (A) Final 15% SDS-PAGE gel summarizing the different purification steps. (B) SeMet derivative Cid1 gel filtration chromatography profile. (C) SeMet derivative Cid1 crystals using the micro-seeding method (left) or the WT pseudo-batch condition (right). (D) XSCALE.LP table showing the subset of resolution limit, the I/sigma, the Rmeas and the Rmrgd-F, and the Anomal Corr and SigAno, which shows the SeMet anomalous signal.

In order to determine the phases with a different method, we tried heavy atom soaking of the native crystals. Once inside the crystals, heavy atoms can be very destructive, depending on the concentration and the soaking time. Several rounds of soaking under different conditions were tried and are summarized in Table 1. In the first round of heavy atom soaking with the KAuCl4 compound, the crystals cracked very fast, suggesting that this heavy atom interacts with the protein. Lower concentrations of the KAuCl4 compound and O/N soaking, as well as a 7-day K2PtCl4 platinum soak, allowed us to obtain one useful derivative from which we could solve the structure using a single-wavelength anomalous diffraction (SAD) phasing strategy (Fig. 16A and 16B). These preliminary phases allowed us to build a partial atomic model of Cid1, which was then used to obtain a molecular replacement solution for a native dataset at higher resolution. We tried the same procedure with the datasets collected from UTP-soaked Cid1 crystals. In some cases, the density was good enough to start building the enzyme (Fig. 16C). Details of the structure solution, along with the mutational and functional analysis of Cid1 based on the crystal structure, were published in the journal Structure (Cell Press) and are shown in the chapter below (section 4.1.4).

Heavy atom compounds: K2PtCl4, K2PtBr4, K2Pt(CN)2, C8H8HgO2, C9H9HgNaO2S, C2H5HgCl, HgCl2, NaAu(CN)2, KAuCl4, K2ReCl6, GdCl3, NdCl3, YbCl3, Samarium Acetate, Samarium Nitrate
Soaking concentrations: 0.5 mM, 1 mM, 2 mM, 5 mM
Soaking times: 1 h, 5 h, 8 h, O/N, 2 days, 7 days

Table 1: Different heavy atom compounds soaking concentrations and times tested for SAD phasing.
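The size of the soaking screen described in Table 1 can be estimated by enumerating the combinations of concentration and time. The sketch below is hedged: the table does not state that every compound was tested under every condition, so the numbers computed here are an upper bound on the screen, not a record of the experiments performed. Compound formulas are written as read from Table 1, and the variable names are hypothetical.

```python
from itertools import product

# Values as read from Table 1.
COMPOUNDS = [
    "K2PtCl4", "K2PtBr4", "K2Pt(CN)2", "C8H8HgO2", "C9H9HgNaO2S",
    "C2H5HgCl", "HgCl2", "NaAu(CN)2", "KAuCl4", "K2ReCl6",
    "GdCl3", "NdCl3", "YbCl3", "Samarium Acetate", "Samarium Nitrate",
]
CONCENTRATIONS_MM = [0.5, 1.0, 2.0, 5.0]
SOAKING_TIMES = ["1 h", "5 h", "8 h", "O/N", "2 days", "7 days"]

# Every (concentration, time) pair for a single compound.
conditions_per_compound = list(product(CONCENTRATIONS_MM, SOAKING_TIMES))

# Upper bound, assuming a full grid for every compound (not stated in the text).
max_total_soaks = len(COMPOUNDS) * len(conditions_per_compound)

print(f"{len(conditions_per_compound)} conditions per compound")
print(f"at most {max_total_soaks} soaks for {len(COMPOUNDS)} compounds")
```

Because heavy atoms can damage the crystals, such a grid is usually explored selectively, starting from low concentrations and short times for the most reactive compounds.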


Figure 16: Data analysis of crystals soaked with a platinum and gold compound. (A) Diffraction pattern of a K2PtCl4 compound and its XSCALE.LP table. (B) Diffraction pattern of a KAuCl4 compound and its XSCALE.LP table. For panels A and B, XSCALE.LP table showing the subset of resolution limit, the I/sigma, the Rmeas, the Rmrgd-F and the Anomal Corr and SigAno, which shows the anomalous signal of the heavy atoms. (C) Electron density map from the KAuCl4 derivative obtained after heavy atom localization, phasing and density modification procedures.

1.11.4 Publication I: “Functional Implications from the Cid1 Poly(U) Polymerase Crystal Structure”

We solved the atomic structure of the Cid1 enzyme in complex with its substrate UTP and identified several residues critical for the UTP selectivity and RNA recognition of the enzyme. The structure reveals multiple residues from the nucleotide recognition motif (NRM) and from several positively charged patches of the enzyme that are shown to stabilize the UTP molecule. Indeed, histidine 336 of the NRM loop is in direct contact with the uracil base of the UTP, a feature of PUPs that had not been shown before. Our atomic model, complemented with point mutations and functional studies, is used to propose a catalytic cycle in which local and global movements of the enzyme are needed for RNA accommodation and translocation. Our research provides molecular insight into the UTP selectivity and RNA recognition of Cid1, with crucial implications for the RNAs targeted by PUPs and for the PUP-dependent polyuridylation process.

Structure Article

Functional Implications from the Cid1 Poly(U) Polymerase Crystal Structure

Paola Munoz-Tello,1 Caroline Gabus,1 and Stéphane Thore1,*
1Department of Molecular Biology, University of Geneva, 30 Quai Ernest Ansermet, Geneva 1211, Switzerland
*Correspondence: [email protected]
DOI 10.1016/j.str.2012.04.006

SUMMARY

In eukaryotes, mRNA degradation begins with 3′-end poly(A) tail removal, followed by decapping, and the mRNA body is degraded by exonucleases. In recent years, the major influence of 3′-end uridylation as a regulatory step within several RNA degradation pathways has generated significant attention toward the responsible enzymes, which are called poly(U) polymerases (PUPs). We determined the atomic structure of the Cid1 protein, the founding member of the PUP family, in its UTP-bound form, allowing unambiguous positioning of the UTP molecule. Our data also suggest that the RNA substrate accommodation and product translocation by the Cid1 protein rely on local and global movements of the enzyme. Supplemented by point mutations, the atomic model is used to propose a catalytic cycle. Our study underlines the Cid1 RNA binding properties, a feature with critical implications for miRNAs, histone mRNAs, and, more generally, cellular RNA degradation.

INTRODUCTION

In bacteria, the addition of the poly(A) tail at the 3′-end of mRNA molecules is a destabilizing factor for the mRNAs (for a recent review, see Régnier and Hajnsdorf, 2009). In eukaryotes, a similar effect on mRNA stability has been attributed to 3′-end uridylation (Song and Kiledjian, 2007; Mullen and Marzluff, 2008). This addition of uridyl nucleotides to RNA substrates has been involved in several distinct RNA metabolic pathways (Shen and Goodman, 2004; Norbury, 2010). Recently, the general mRNA degradation pathway in the Schizosaccharomyces pombe was shown to be modulated by the protein Cid1 (Rissland and Norbury, 2009). The polyuridylation of polyadenylated mRNAs is crucial for triggering the UTP-dependent degradation pathway in Trypanosoma brucei mitochondria, and this modification is performed by RNA editing terminal uridylyl transferase 1 (RET-1; Ryan and Read, 2005). The uridylation of U6 snRNA 3′-end by U6 snRNA-specific terminal uridylyl transferase (TUTase) occurs in molecules lacking the cyclic 2′,3′-phosphate group. This post-transcriptional modification is required for the maintenance of the 3′-end of the U6 snRNA (Hirai et al., 1988; Chen et al., 2000). Another example is the regulation of mammalian let-7 miRNA production, which is, at least partially, mediated by the Zcchc11-dependent uridylation of the let-7 precursor (Heo et al., 2009). Finally, several reports have recently shown that uridylation happens on other types of poly(A)-minus RNAs, such as histone mRNAs, subsequently targeting these particular RNAs for degradation (Mullen and Marzluff, 2008; Schmidt et al., 2011).

In the past decade, an entire family of polymerase proteins, called terminal uridylyl transferase (TUT) or poly(U) polymerase (PUP), has been identified in the metazoa (Martin and Keller, 2007). Several of its members have been shown to uridylate mRNA substrates in vivo (Kwak and Wickens, 2007; Martin and Keller, 2007). One of the first characterized enzymes with the above-mentioned destabilizing activity is the protein Cid1 (caffeine-induced death suppressor), a cytoplasmic PUP (Wang et al., 2000). Its action was first identified in the S-M phase transition control. Deeper analysis revealed that the Cid1 protein was not adding a poly(A) tail but rather a poly(U) stretch in vivo (Rissland et al., 2007). Moreover, the half-life of mRNAs with uridine residues added to the poly(A) tail, such as act1, was longer in cid1 mutants compared to the wild-type, indicating that the Cid1-dependent uridylation was likely to promote mRNA degradation (Rissland et al., 2007; Rissland and Norbury, 2009). The Cid1 uridylation of polyadenylated mRNAs like act1, adh1, and urg1 promotes decapping, interceded by the Lsm1-7 complex, which leads to their decay (Rissland and Norbury, 2009). In vitro, these enzymes can add both poly(A) or poly(U) stretches. However, in vivo, they exert their function in a uridine-specific fashion (Rissland et al., 2007). Members of this family, Cid1 being the most fully characterized, exhibit highly processive activity, leading to the addition of long U-stretches (Rissland et al., 2007). However, the length of the uridine tail observed in vivo is far shorter (between 1 to 10 nucleotides). The reason for these observed differences is presently unclear. One possible mechanism would consist of a balance between the PUP activity and an exonuclease trimming mechanism allowing the control of the poly(U) length as observed with the U6 snRNA (Chen et al., 2000) or for the poly(A) tail (Kim and Richter, 2006).

Here we report the crystal structure of Schizosaccharomyces pombe Cid1 protein in complex with UTP. The structure reveals the selective pocket for the pyrimidine nucleotide and highlights the specific interaction between the uracil base and multiple amino acids of the protein. Multiple residues of the nucleotide recognition motif (NRM) and from various parts of the protein are shown to stabilize the UTP. Mutations of the corresponding residues lead to functional consequences as reflected by variable polymerase activity in vitro. In addition, superpositions of our UTP bound structure and several related Pol β family

Structure 20, 977–986, June 6, 2012 ©2012 Elsevier Ltd All rights reserved 977

51 Structure The Atomic Model of Cid1

members suggest a swivel motion between the catalytic and the central domains, as well as various conformational changes in conserved loops. These multiple observations are corroborated by polymerase assays showing differential effects depending on the mutations. These elements point out the requirement for a structural change during the fitting/translocation mechanism of the RNA molecule by the PUP enzymes. This study shows the molecular basis for the specific recognition of UTP substrate, and reveals several local differences between the Pol β family members. We think that it reflects their distinctive substrate recognition properties and/or activities.

RESULTS

Structure Determination Process

We have determined the atomic structure of the Cid1 protein in complex with UTP at 2.3 Å resolution. The Cid1 protein is predicted to contain two unstructured termini surrounding a nucleotidyl transferase (NT) domain. Both termini were truncated based on these predictions and the resulting fragment (residues 40–377) was overexpressed, purified and crystallized. The structure was solved using the single anomalous dispersion technique with a gold compound KAuCl4. We collected a complete dataset at the peak wavelength 1.040 Å. The crystal diffracted to 2.6 Å resolution and belonged to space group P212121. The asymmetric unit contained one molecule (see statistics in Table 1). The phased electron density map was of sufficient quality to build an initial model containing about half the amino acid sequence of the protein. Our phased electron density also showed a large area with very poor density, indicating a disordered region possibly linked to the extended soaking time (7 days at 1 mM). However, this partial model was used subsequently to solve by molecular replacement a dataset collected from a native Cid1 crystal, soaked overnight with 2.5 mM of UTP. This crystal did not show any signs of pseudo-merohedral twinning, differently than observed for the crystals of the protein alone (P.M.-T. and S.T., data not shown) and diffracted to 2.28 Å resolution with the statistics reported in Table 1. The native crystal belonged to the monoclinic space group P21 and contained two molecules in the asymmetric unit. A large unidentified density was visible in both the 2Fo-Fc and the Fo-Fc electron density maps. We built a uridine triphosphate (UTP) together with a triphosphate-bound magnesium ion. The final models contain the Cid1 sequence (residues 40–377 in molecule A and 41–377 in molecule B) as well as two UTP molecules and seven magnesium ions. Loops consisting of residues 108–116 and 308–323 were not defined clearly in the electron density. It is assumed that these regions are disordered and they were excluded from the final model structure. The crystallographic data for these two datasets (gold-bound and UTP-soak) are reported in Table 1.

Structure of the Cid1 Protein

The Cid1 protein was first characterized as a factor involved in the S-M phase transition process (Wang et al., 1999) and later was classified as a NT from the polymerase β superfamily based on sequence conservation (Figure 1A; Wang et al., 2000). The overall protein fold is related to the polymerase β fold with two characteristic domains, catalytic (CAT) and central (CD) domains, separated by a trench where the UTP is bound (Figure 1B).

The CAT domain contains the typical features of the DNA polymerase β enzyme, i.e., a five-stranded β sheet with two α helices on one side and the catalytic aspartic acid triad facing the CD domain (residues 101, 103, and 160; Figure 1B). The CD domain is composed of six α helices connected by multiple loops of various lengths (Figure 1B). The CD domain contains the NRM as shown in Figure 1B (see below). The CAT domain and the CD domain are linked by two loops (res. 56–58 and res. 163–165; Figure 1B).

Since the Cid1 protein has been classified in the group of non-canonical template-independent NT by Martin and colleagues (Martin and Keller, 2007), we used a 17-mer long RNA finishing with a Cytidine to test the polymerizing activity of our engineered protein. Our construct displays similar activity level compared to the full-length enzyme (Figure 1C). We also introduced two point mutations at positions 101 and 103, demonstrating that the

Table 1. Data Collection

Crystal                                   KAuCl4-Soaked        UTP-Soaked
Wavelength (Å)                            1.0396               0.9811
Temperature (K)                           100                  100
Maximum resolution (Å)                    43.8–2.6 (2.67–2.6)  45.2–2.28 (2.3–2.28)
Space group                               P212121              P21
Unit cell dimensions (Å)                  52.8, 75.7, 78.4     53.5, 76.9, 81.9, β = 91.17
No. of observations                       190,902              83,724
No. of unique reflections                 18,538               29,999
Redundancy                                10.3                 2.8
Data completeness (%)                     99.2 (91.9)          98.74 (59.3)
I/σ(I)                                    28.13 (1.91)         11.5 (2.64)
Rmeas                                     6.7 (90.4)           8.8 (49.5)

Refinement
No. of atoms                                                   5,454
  Protein                                                      5,146
  Ligand (UTP)                                                 56
  Magnesium                                                    7
  Water                                                        245
Rwork (%)                                                      0.188
Rfree (%)                                                      0.247
RMSD from ideal geometry
  Bond lengths (Å)                                             0.011
  Bond angles (°)                                              1.112
Mean B (Å2)
  Protein chain                                                30.9
  Nucleotide/magnesium chains                                  26.1
  Waters                                                       31.7
Residues in favored region of the Ramachandran plot (%)        603 (97.3)
Residues in allowed region of the Ramachandran plot (%)        17 (2.7)

Values in bracket are for the highest resolution shell.
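Several entries of Table 1 are related by simple arithmetic, which gives a quick way to check the table for internal consistency. The sketch below is an illustration only, with hypothetical variable names: it recomputes the redundancy as the ratio of observations to unique reflections, sums the atom counts of the refined model, and recomputes the Ramachandran percentages from the residue counts, using only values listed in the table.

```python
# Values copied from Table 1 (data collection).
datasets = {
    "KAuCl4-soaked": {"observations": 190_902, "unique": 18_538, "redundancy": 10.3},
    "UTP-soaked": {"observations": 83_724, "unique": 29_999, "redundancy": 2.8},
}

# Redundancy = number of observations / number of unique reflections.
computed_redundancy = {
    name: round(d["observations"] / d["unique"], 1) for name, d in datasets.items()
}

# Refinement section: atom counts by category and the reported total.
atom_counts = {"protein": 5_146, "ligand_UTP": 56, "magnesium": 7, "water": 245}
reported_total_atoms = 5_454
computed_total_atoms = sum(atom_counts.values())

# Ramachandran statistics: residue counts and percentages.
favored, allowed = 603, 17
favored_pct = round(100.0 * favored / (favored + allowed), 1)
allowed_pct = round(100.0 * allowed / (favored + allowed), 1)

for name, value in computed_redundancy.items():
    print(f"{name}: redundancy {value} (table: {datasets[name]['redundancy']})")
print(f"atoms: {computed_total_atoms} (table: {reported_total_atoms})")
print(f"Ramachandran: {favored_pct}% favored, {allowed_pct}% allowed")
```

All recomputed values agree with the entries reported in the table at the stated precision.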


Figure 1. The Cid1 Protein Belongs to the Polymerase β Family
(A) Sequence alignment of multiple members of the Pol β family. Uniprot accession numbers are as follows: hsZCCHC11-Q5TAX3, hsZCCHC6-Q5VYS8, spCID1-O13833, tbTUT4-A4UBD5, tbRET2-Q86MV5, tbMEAT1-Q4GZ86, hsPAPD1-Q9WVV4, hsGLD2-Q6PIY7, scTRF4-P53632, and scPAP1-P29468. Secondary structure in the Cid1 protein is indicated above the alignment (rectangle, α helix; arrow, β strand). The three catalytic aspartic acid are indicated with a star and, three functionally important regions are underlined (green, regions 1 and 2; yellow, NRM, see text for details).
(B) Overall view of the Cid1 structure with the labeled secondary structure elements. The catalytic (CAT) and central (CD) domains are colored respectively in orange and green. The NRM loop is colored in magenta. The catalytic residues are colored in cyan and the bound-uridine triphosphate in yellow. Atoms are colored according to type (red for oxygen, orange for phosphate and cyan or yellow for carbon). Four magnesium ions are represented with gray spheres. Two arrows indicate the start and end locations of the CAT domain.
(C) In vitro poly(U) polymerase assays. Experiments were performed with 50 and 150 ng of poly(U) polymerase, of Cid1 or Cid1-AMA protein as indicated, and a radiolabeled 17-mer long RNA substrate. Each reaction mix was incubated for 10 min at 30°C. Ct, no protein. The size of reaction products is indicated on the left of the image.
All the structural panels here and in the following figures were generated with PyMol (DeLano, 2002).

poly(U) addition is due to our protein and not to a co-purifying contaminant.

The Cid1/UTP/Mg2+ Complex Structure

The Cid1/UTP complex structure was obtained from an overnight soaked crystal. The UTP is buried at the bottom of the cleft found between the CAT and the CD domains (Figure 2A; Figure S1 available online). Multiple interactions are observed between the nucleotide triphosphate and the protein and rely on residues conserved in the PUP/TUT group of proteins (Figures 1A and 2B). Eight direct hydrogen bonds are found between the protein and the UTP molecule.

The nucleobase and the sugar ring are recognized by every edge capable of engaging in a hydrogen bond (Figure 2C). Noticeably, the Cid1 protein is capable of sensing the presence of the two carbonyl oxygen atoms of the uridine base via the Asn171 and the His336 residues (Figure 2C). Until the present work, the contacts describing the pyrimidine recognition mode


Figure 2. The Cid1/UTP/Mg2+ Complex Association
(A) Surface representation of the Cid1 structure with the domains colored as in Figure 1. The UTP is shown as a stick representation.
(B) Detailed view of the UTP-Cid1 hydrogen bonds.
(C) Close-up view of the uridine base and sugar ring interactions. For both panels, side chains of interacting amino acid are shown using ball-and-stick representation. Atoms are colored as before (slate and magenta for carbon). Hydrogen bonds within 2.4–3.6 Å distance are shown as black dash lines. The magnesium ion is shown as a gray sphere, and its hydrogen bonds are shown as magenta dash lines.
(D) Poly(U) polymerase assays. The Cid1 WT or the indicated mutants were incubated with radioactively labeled 17-mer oligonucleotides. The lane labeled Ct shows the RNA primer after incubation without enzyme. The mutant enzymes show limited activity as compared with Cid1 WT. See also Figure S1.
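The distance criterion quoted in the legend of Figure 2 (hydrogen bonds within 2.4–3.6 Å) can be expressed as a simple geometric test between a donor and an acceptor atom. The following is a minimal sketch and not the procedure used by the authors: the function name is hypothetical, the coordinates are made-up values for illustration (they are not taken from the Cid1 structure), and a purely distance-based test ignores the angular criteria that are often also applied.

```python
import math

# Distance window quoted in the legend of Figure 2 (Angstrom).
HBOND_MIN = 2.4
HBOND_MAX = 3.6


def is_hydrogen_bond(donor_xyz, acceptor_xyz,
                     d_min: float = HBOND_MIN, d_max: float = HBOND_MAX) -> bool:
    """Distance-only test: True if donor and acceptor lie within the window."""
    distance = math.dist(donor_xyz, acceptor_xyz)
    return d_min <= distance <= d_max


# Hypothetical coordinates (Angstrom), for illustration only.
donor = (10.0, 4.0, 7.5)          # e.g. a side-chain nitrogen
acceptor_near = (12.1, 5.6, 8.3)  # within the window
acceptor_far = (15.0, 4.0, 7.5)   # 5.0 A away, outside the window

print(f"near pair: {math.dist(donor, acceptor_near):.2f} A ->",
      is_hydrogen_bond(donor, acceptor_near))
print(f"far pair:  {math.dist(donor, acceptor_far):.2f} A ->",
      is_hydrogen_bond(donor, acceptor_far))
```

Applied to every polar atom pair between a ligand and the protein, a test of this kind yields a list of candidate contacts, which would then be inspected against the electron density.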

by the TUT enzymes were focused on water-mediated interactions (Deng et al., 2005; Stagno et al., 2007b, 2010). Although these interactions allow the TUT proteins to select UTP, the mechanistic basis of uridine selectivity by the Cid1 protein has been left unclear. In particular, the capacity of the Cid1 protein to use both UTP and ATP for polymerization is unexplained. We show that mutations of the His336 into its equivalent residues in TUT or PAP proteins (H336L or H336N respectively and H336A) decrease the overall activity of the protein compared to the wild-type enzyme (WT; Figure 2D). The results of experiments performed in the presence of both UTP and ATP suggest that mutant proteins still incorporate UMP preferentially over AMP (P.M.-T. and S.T., data not shown; Rissland et al., 2007). We therefore conclude that the Histidine 336 certainly participates in the selection process since we observe reduced polymerizing activity when mutated. But this residue is not sufficient to fully explain the atypical property of the Cid1 protein.

In addition to the nucleobase edges, the 3′-oxygen of the sugar ring is stabilized via H-bonds with the Thr172 and the Asn171 residues of Cid1. However, the protein does not recognize the 2′ position of the ring, differently than observed in the TUT structures, in which the equivalent residue (serine) contacts both the 3′ and the 2′-oxygens (Deng et al., 2005; Stagno et al., 2007b, 2010). The triphosphate moiety is stabilized by hydrogen bonds with Lys193 and Lys197, both highly conserved in the Pol β family (Figure 1A), as well as with Ser90, Ser211 and Tyr212. A single magnesium ion is coordinated on one side by the catalytic amino acid Asp103 and, on the other side, by the phosphate-bound oxygen atoms, a situation encountered in the TUT4/UTP co-crystal structure. Another magnesium ion is found above the uridine base in a position that is likely to become occupied by the incoming RNA substrate. This ion is stabilized via π interaction forming an organomagnesium compound (Dougherty, 1996).

The Nucleotide Recognition Motif

Poly(A) and poly(U) polymerase enzymes have been shown to contain a highly conserved motif in the CD domain. This sequence, called the NRM, has been used to derive the nucleotide-binding specificity of a given polymerase (Martin and Keller,

Structure 20, 977–986, June 6, 2012 ©2012 Elsevier Ltd All rights reserved

54 Structure The Atomic Model of Cid1

2007). The NRM sequences are divided into two groups: (1) group 1 for the canonical NT, which contains PAP enzymes, and (2) group 2 for the non-canonical NT to which the TUT enzymes and the Cid1 protein belong (Figure 1A). In the co-crystal structures of RET2, MEAT1 and TUT4 trypanosomal proteins, no direct contacts were observed between the amino acids of the NRM motif and the nucleotide base. Furthermore, the function of individual residues in the NRM loop is still unclear. In particular, the question of how the NRM sequence distinguishes between the PAP and the PUP/TUT groups is unanswered. The superposition of the NRM loops (from our structure and from other TUT or PAP structures) indicates an excellent match within the PUP/TUT group and highlights a couple of clear differences with the PAP group (Figure 3A).

A specific hydrogen bond between the NE2 position of the Histidine residue 336 in the NRM sequence and the O4 carbonyl oxygen of the uracil base is found in our Cid1/UTP structure (Figures 2C and 3B). This additional bond ensures that the two oxygen atoms from the nucleobase are selected by specific interaction with the side chains of residues 171 and 336 (Figure 2C). It is striking to note that the Cid1 protein and the two human poly(U) polymerases Zcchc11 and Zcchc6, which have demonstrated poly(U) polymerase activity in vivo, also have a Histidine residue at the equivalent position (Figures 1A and 3A; Rissland et al., 2007; Heo et al., 2009). It would be interesting to see whether the two human proteins also have the capacity to use ATP as substrate in vitro and are able to select UTP over ATP as the Cid1 protein.

The next residue of particular interest in the NRM sequence and which has a differential location in the two groups corresponds to the Phenylalanine 332 (Figures 3A and 3B). In the yeast poly(A) polymerase, this residue is an Alanine, while a larger hydrophobic residue (Phe or Leu) is found in the PUP/TUT protein sequences (Figures 1A and 3A). In our structure, the Phe332 is buried into a hydrophobic pocket, directly facing the NRM loop (Figures 3B and 3C). This pocket is formed by residues from an extended loop of the CD domain and from residues of the helix 4 (Figure 3C). The residues forming this pocket are reasonably conserved in the TUT and the PAP proteins (Figure 1A). Furthermore, the Cid1 helix 4 is equivalent to the helix F of the yeast PAP protein, which is implicated in the pivotal movement of the CAT domain upon substrate RNA association (Balbo et al., 2007). A similar observation was made with the TUT4 protein structure upon binding of UTP (Stagno et al., 2007b). The template-independent properties of the Cid1 protein imply that it relies on a "conformational breathing" mechanism to accommodate and to translocate its substrate upon the addition of UMP (Suezaki and Go, 1975). We believe that the Phe332 residue is implicated in this mechanism by properly positioning the NRM loop once an NTP molecule is bound by the Cid1 protein. To demonstrate the importance of the Phe332 within the catalytic cycle of the Cid1 protein, we mutated the residue to modify its properties from a large hydrophobic residue to an acidic residue. When the Phe332 is mutated to a Leucine, the enzyme still displays PUP activity (Figure 3D). A clear decrease of activity is observed when the Phe332 is replaced by an Alanine and the activity is abolished in the Aspartate mutant (Figure 3D). Therefore, the decrease of activity seems to correlate with the destabilization of the contact between the NRM loop and helix 4. We propose, based on our structure and on the activity tests performed with the mutant enzymes, the following enzymatic cycle (Figure 3E). Upon triphosphate binding, the CAT domain closes onto the CD domain inducing a rotation of helix 4 (Figure 3E). This helix would then act like a "ratchet" helix locking the Phe332 residue in the hydrophobic pocket (Figure 3D). The stabilization of Phe332 orients the NRM loop and ensures that Asp330, Glu333 and residue 336 (Leucine in the case of TUT4, RET2 and MEAT1 or Histidine in the case of Cid1, Zcchc11 and Zcchc6) is held in proximity to the edges of the nucleobase to be added (Figure 3B). The stabilization of the correct nucleotide base would provide the necessary time for hydrolysis to occur, leading to the addition of a UMP molecule to the RNA primer.

Identification of the Cid1 Surface Involved in RNA Binding

In the last decade, several structures of trypanosomal TUTase have been solved (Deng et al., 2005; Stagno et al., 2007b, 2010). Our model allows us to visualize similarities and differences existing between the poly(U) polymerase involved in cellular RNA regulation (Trippe et al., 1998; Perumal and Reddy, 2002; Shen and Goodman, 2004) and the TUT enzymes associated with RNA processing mechanisms in mitochondria (Aphasizhev and Aphasizheva, 2011). Despite the very limited sequence homology existing between the Cid1 protein and the trypanosomal enzymes TUT4, RET2, and MEAT1 (Figure 1A), their folds are highly related (Figure 4A). Structural superpositions point out multiple loops in the CAT and the CD domains that may have different orientations. Based on their electrostatic properties, these loops are likely to participate in the selection/stabilization of the RNA primer (Figure 4B).

The loop between β strand 4 and 5 (Region 1 in Figure 1A) is located above the UTP molecule in our structure (Figure 4C). This protein fragment is highly conserved in all the NT sequences (Figure 1A) and is also positively charged (Figure 4B). In the TUT4/UpU structure, the corresponding amino acids (residues 118–124 in the TUT4 sequence) are in proximity to the second uridine moiety and interact through hydrogen bonds (Stagno et al., 2007a). The Arg121 of TUT4, equivalent to Arg139 in the Cid1 protein, has a proven role in the catalytic activity of the TUT4 protein, which is likely to be linked to its role in substrate selection/stabilization (Stagno et al., 2007a, 2007b). Furthermore, upon UTP association, the corresponding residues in the MEAT1 protein structure move up to 7 Å toward the CD domain (Figures 4B and 4C; Stagno et al., 2010). The β sheet 4, which immediately follows this loop, contains the conserved Lysine in position 144 as well as several hydrophobic residues (Figure 1A). Mutation of Arg126 in the TUT4 protein (equivalent to Lys144 in Cid1) abolished TUT activity (Stagno et al., 2007b). In fact, the substitution of Arg126 to Alanine disrupted the salt bridge between Arg126 and Asp136 leading to inactivation of the TUT4 protein. The strong sequence conservation and the above-mentioned observations clearly indicate that this protein region will play a major role in regulating the activity of the Cid1 protein. We mutated the two most conserved residues, Arg139 and Lys144, to assess their importance in Cid1 activity. The Arg139 mutation strongly reduces the polymerization indicating that, as observed in the TUTase group, this


residue is essential to select/stabilize the RNA primer (Figure 4D). Differently from the TUT4 protein, mutation of Lys144 resulted in a limited decrease of the polymerization level when compared to the WT enzyme suggesting that the catalytic mechanism of the Cid1 protein can function despite the absence of a stabilizing partner for Asp160 (Figure 4D).

A second loop, corresponding to residues 202–206, is found upstream of the conserved Tyr212, on which the UTP molecule

Figure 3. The NRM (A) Superposition of the NRM loop structures from the Cid1 protein (shown in green), the yeast PAP (Protein Data Bank [PDB] code: 2Q66; in cyan), the trypanosomal TUT4 (PDB code: 2IKF; in orange) and the RET2 (PDB code: 2B51; in violet) enzymes. Important side chains are shown with ball-and-stick representation, with the corresponding protein color and label (labels in bracket correspond to PAP, TUT4 or RET2 sequence). (B) Global view of the Cid1 structure along the axis of helix 4, showing its critical location between the CAT, the CD and the NRM loop. The Cid1 protein is colored as in panel 1B with the helix 4 shown in yellow. The catalytic Aspartates, Phe332 and His336 residues are shown in ball-and-stick representation (carbon colored in orange and magenta, respectively). (C) Residue Phe332 is buried in a large hydrophobic cavity, lined by residues represented in ball-and-stick, labeled and colored in cyan. The pocket is adjacent to helix 4, which is proposed to communicate the association of the triphosphate group. (D) Poly(U) polymerase assays. The Cid1 WT or the indicated mutants were incubated with radioactively labeled 17-mer oligonucleotides. The lane labeled Ct shows the RNA primer after incubation without enzyme. The three mutants (F332L, F332A, and F332D) have different levels of activity reflecting the importance of the modification. If the hydrophobic character is kept (F332L and to a lower degree F332A), the enzyme is still active. Once a charged residue is introduced (F332D), the activity is almost abolished. (E) Proposed cycle depicting NTP association and hydrolysis. Association of the triphosphate moiety of the NTP induces a CAT movement toward the CD domain, and helix 4 rotates locking Phe332. The NRM loop is positioned precisely to select the uridine Watson-Crick edges. Stabilization of UTP provides time for substrate RNA association and hydrolysis followed by translocation of the newly generated RNA-U molecule. A new cycle can then be started by the association of a new NTP molecule.

Figure 4. Putative RNA Binding Sites (A) Structural similarities between the structures of Cid1 and the TUT enzymes from trypanosome (MEAT PDB code: 3HJ4; RET2 and TUT4). Overall, these models show good resemblance with excellent conservation of secondary structure elements. They are shown as cartoon, and colored as in panel 3A (green for Cid1, slate for MEAT, violet for RET2 and orange for TUT4). The RET2 protein contains an additional domain (shown in gray), which is not conserved in the Pol β family. (B) Electrostatic representation of the Cid1 protein. The protein is oriented almost similarly to the panel A. The Cid protein surface is colored according to its electrostatic potential: blue for positive and red for negative charge. The bottom of the cleft is rather negatively charged while the upper surface of the CD domain is mostly positively charged. The location of the region 1 and 2 is indicated and labeled. (C) Superposition of the loop between β sheets 4 and 5 showing variable orientation upon UTP association. We superposed the models of the TUT4/UpU structure (PDB code: 2Q0G), the MEAT with and without UTP bound (PDB code for the MEAT/UTP model: 3HIY) and the RET2 structure. The large movement of the loop, as exemplified in the MEAT structure, indicates that the corresponding Cid1 loop is susceptible to reorganization. In particular, Arg139 (or equivalent residues) is shown to be essential for PUP activity. The color coding is similar to panel A except for the MEAT/UTP complex shown in white. (D) Poly(U) polymerase assays as described in Figure 2D. The R139A and the Y205A mutants have reduced activity although the mutation should not affect the UTP binding properties. The K144A mutant displays an almost similar level of activity as the WT enzyme. (E) The region 2 in our sequence alignment is located above the active site. The same region is involved in the poly(A) stabilization in the scPAP/RNA complex structure (shown as a cyan cartoon; PDB code: 2Q66) via hydrogen bonds and van der Waals interaction with residues 226 and 227. In the Cid1 structure, these residues are replaced by Proline and Tyrosine, respectively (shown as green ball-and-stick). Both residues are suitable for interactions with the growing poly(U) tail. The model of TUT4 (in orange) is shown to highlight the structural differences observed for this particular region between the TUT and the PAP enzymes. Atoms are shown as spheres, and colored according to their type. Magnesium ions are gray. (F) Poly(U) polymerase assays as in Figure 2D. The mutation with the lowest activity is the Y205D. In this case, the enzyme is almost inactive clearly suggesting that region 2 is important for RNA stabilization. See also Figure S2.
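The reasoning behind the Phe332 mutant series (F332L, F332A, F332D) is that activity is retained while the hydrophobic character of the side chain is kept and lost once a charged residue is introduced. The sketch below only parses such mutation labels and checks whether both residues fall into the same coarse class. The class table is a simplifying assumption introduced here for illustration, the helper names are hypothetical, and the code does not predict enzymatic activity.

```python
import re
from typing import NamedTuple

# Coarse side-chain classes: an illustrative assumption, not taken from the paper.
SIDE_CHAIN_CLASS = {
    "A": "hydrophobic", "V": "hydrophobic", "L": "hydrophobic", "F": "hydrophobic",
    "D": "acidic", "E": "acidic",
    "K": "basic", "R": "basic", "H": "basic",
    "N": "polar",
}


class PointMutation(NamedTuple):
    wild_type: str
    position: int
    mutant: str


_LABEL = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_mutation(label):
    """Split a label such as 'F332D' into wild-type residue, position, mutant."""
    match = _LABEL.match(label)
    if match is None:
        raise ValueError(f"not a point mutation label: {label!r}")
    wild_type, position, mutant = match.groups()
    return PointMutation(wild_type, int(position), mutant)


def keeps_hydrophobic_character(label):
    """True when both the original and the substituted residue are hydrophobic."""
    mutation = parse_mutation(label)
    return (SIDE_CHAIN_CLASS.get(mutation.wild_type) == "hydrophobic"
            and SIDE_CHAIN_CLASS.get(mutation.mutant) == "hydrophobic")


if __name__ == "__main__":
    for label in ("F332L", "F332A", "F332D"):
        print(label, keeps_hydrophobic_character(label))
```

Under this coarse table, F332L and F332A keep the hydrophobic character while F332D does not, which mirrors the ordering described in the Figure 3D legend without modelling the size difference between Leucine and Alanine.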


stacks (Figure 4E; Region 2 in the Figure 1A). The sequence of this loop is poorly conserved between the PUP enzymes. Furthermore, although disordered in the apo RET2 structure, the same loop is clearly resolved in the RET2/UTP structure (Deng et al., 2005). In our case, this loop contains a Tyrosine residue (Tyr205), which may be involved in the stabilization of bound RNA (Figure 4E). We have also superposed our Cid1 model onto the structures of the poly(A) polymerases (Figure 4E; Martin et al., 2000; Balbo and Bohm, 2007; Hamill et al., 2010). In the yeast PAP model, the same loop makes direct contacts with the short poly(A) stretch in multiple locations (Figures 4C and 4E; Balbo and Bohm, 2007). The structural similarities observed between the Cid1 and PAP loops suggest a similar mode of association between the Cid1 protein and its growing product. Mutations of Tyr205 into various residues highlight the role of this residue. It appears that a hydrophobic residue is needed in this loop to keep Cid1 active (Figures 4D and 4F). If the enzyme contains a charged residue like an Aspartate, then its ability to polymerize is significantly decreased (Figure 4F and Figures S2A and S2B). The decreased activity is observed with three RNA primers (a 17 nucleotide primer, a 15-mer poly(A) and a short UpU substrate).

DISCUSSION

The present structure of fission yeast Cid1 protein reveals the molecular basis for UTP binding by this particular enzyme. The active site is located at the bottom of the cleft separating the CAT and CD domains, with the three aspartic acid residues facing the phosphate β and γ of the bound UTP (Figures 2A and 2B). Stabilization of the UTP is performed by several H-bonds, principally between Lys193 and 197 and the phosphate groups, as well as His336 and Asn171 and the uracil moiety. Two magnesium ions are found in the pocket; one is coordinated by the catalytic Asp103 residue and the phosphate-bound oxygen atoms and the other positioned above the uracil base of the UTP, most likely at the position of the RNA primer. In our structure, a single magnesium ion coordinates the phosphate oxygens and Aspartate 103 (Figure 2D). A similar situation was encountered in the RET2 structure although, in their case, two of the catalytic aspartates were bound to the magnesium ion (Deng et al., 2005). The absence of the substrate RNA certainly prevented the stable association of the second magnesium ion as observed in the ternary structure of the TUT4 enzyme from trypanosome (Stagno et al., 2007a).

A large number of attempts have been carried out to obtain the structure of Cid1 in complex with other nucleotides, such as ATP, or with short reaction products such as ApU or UpU. We obtained numerous crystals from co-crystallization or soaking procedures and determined these atomic structures by molecular replacement using the model built previously. However, the subsequent electron density maps did not show additional densities in the active site regions. The difficulty to obtain a product-bound form of this enzyme is recurrent in this protein family. The structure of the yeast poly(A) polymerase could only be solved in complex with ATP and reaction substrate by using an inactive protein (Balbo and Bohm, 2007). The structure of the trypanosomal TUT4 enzyme was solved in complex with the dinucleotide UpU, mimicking the minimal product of the reaction. In this case, the authors had to use very high concentration of UpU (10 mM) to obtain the co-structure. One reason for such difficulties may be that the template-independent properties of these enzymes result in a substrate binding site with low-affinity. From the electrostatic potential point of view, the bottom of the cleft is rather negatively charged (Figure 4B) which disfavors a stable association between the enzyme and its substrates (RNA primer and NTP molecule). Previous studies have shown that the TUTase proteins TUT4 and RET1 have an affinity constant for the UTP compound of 1 and 45 mM respectively (Aphasizheva et al., 2004; Stagno et al., 2007b).

The proposed hinge region between the central and the catalytic domains suggests that the Cid1 protein translocates its RNA substrate by adjusting the accessibility of the active site-bound UTP (Figures 3B and 3E). We suggest that Phe332 plays a critical role in the communication between the CAT domain closure and the selection of the nucleobase by the NRM loop. These movements can be observed on their own, as seen in the apo and UTP-bound RET2 structures, or through a protein partner. The recruitment of the Zcchc11 protein by the Lin28A protein could reflect such a need (Hagan et al., 2009). When compared with the yeast PAP enzyme, the Cid1 protein lacks the additional domain found above the active site cleft, which globally reorients after catalysis. The reorganization of the PAP RNA binding sites is, at least partially, mimicked by movements of the two Cid1 domains. This conformational flexibility is likely to dictate the ratio between the processive and the distributive character of the enzyme. These phenomena are noteworthy to understand since the length of the poly(U) polymerase-added poly(U) tail determines the fate of the substrate RNA. With the help of our atomic model and of our activity assays of mutated proteins, a more comprehensive analysis of the PUP-dependent polyuridylation process is achieved.

EXPERIMENTAL PROCEDURES

Expression, Purification, and Mutagenesis

The nucleotide sequence of the Cid1 protein encoding residues 40–377 was amplified by PCR from Schizosaccharomyces pombe genomic DNA (kindly provided by M. Bühler, FMI, Basel) and sequenced. The ORF was cloned into pST18 (derived from pET42) for expression in E. coli as a fusion protein with N-terminal His9 and MBP tag followed by a TEV cleavage site. The recombinant protein was overexpressed in Escherichia coli BL21-Star cells, grown in TB media at 37°C for 6 hrs followed by overnight induction at 18°C with 0.1 mM isopropyl-β-D-thiogalactopyranoside (IPTG). Induced cells were harvested by centrifugation, resuspended in buffer A (20 mM Phosphate buffer pH 7.5, 300 mM NaCl, 25 mM imidazole, 0.1% Triton X-100, DNase 1 mg/ml, Lysozyme 1 mg/ml, 5 mM β-mercaptoethanol), supplemented with protease inhibitors (phenylmethylsulfonyl fluoride (PMSF) 1 mM, leupeptin 1 mg/ml, and pepstatin 2 mg/ml).

Cells were lysed using an emulsiflex system (AVESTIN) and cleared by centrifugation at 15,000 rpm for 35 minutes at 4°C. The soluble fraction was subjected to an initial affinity purification using a chelating HiTrap FF crude column (GE Healthcare) charged with Ni2+ ions. The protein was eluted with 250 mM imidazole and desalted against 20 mM phosphate buffer pH 7.5, 300 mM NaCl, 25 mM imidazole, and 5 mM β-mercaptoethanol. The Tobacco Etch Virus protease was added at a ratio of 1/50 (TEV-protein solution) and the reaction was performed at 16°C for 10 hours. Cleaved protein was separated from the tag, the protease and the contaminants by reapplication to the HiTrap column. After concentration, remaining impurities were removed by gel filtration chromatography with a Superdex 200 column (GE Healthcare), in a buffer containing 20 mM HEPES pH 7.5, 150 mM NaCl, 1 mM β-mercaptoethanol, and 10% (v/v) glycerol. Samples were concentrated (Amicon) to 9.5 mg/ml


and used immediately or stored at –80°C. All protein samples eluted from the gel filtration column as monomers and were homogeneous when analyzed by SDS-PAGE.

All Cid1 point mutants were prepared by site-directed mutagenesis using the Quick-Change mutagenesis kit from Stratagene according to manufacturer instructions. Point mutations were verified by DNA sequencing prior to large-scale expression. The oligonucleotides used for the mutations of the catalytic Aspartate residues 101 and 103 (called Cid-AMA), the Arginine 139, the Lysine 144, the Tyrosine 205, the Phenylalanine 332 and the Histidine 336 are shown in Table S1 of supplemental data.

Protein Crystallization and X-Ray Data Collection

Cid1 was crystallized in two steps. Crystals were produced using the sitting-drop vapor diffusion technique at 18°C, with a reservoir solution containing 0.1 M Imidazole/MES (equal ratio) pH 6.1, 12%–20% Glycerol, 6%–10% polyethylene glycol (PEG) 4000 and 90–144 mM of a halogen salt solution (NaI, NaBr, NaF in equal ratio). The protein concentration was 9.5 mg/ml. Due to a high level of nucleation, a "pseudo-batch" procedure was performed at 18°C to obtain diffraction quality crystals. In this case, crystallization conditions in the drop and in the reservoir were the same. Equal volumes of the gel filtration buffer and the crystallization condition mentioned before were mixed in the reservoir. Crystals appeared in two days and grew to a maximum size of around 0.15 × 0.15 × 0.1 mm in 1 week. Crystals of Cid1 in complex with UTP were obtained by soaking native crystals for 14 hours with 2.5 mM of UTP supplemented with 5 mM of MgCl2. Crystals were passed into perfluoropolyether cryo oil (Hampton Research) prior to flash freezing in liquid nitrogen. Diffraction data for the UTP-soaked protein crystal were collected at Beamline ID14-4 at the European Synchrotron Radiation Facility (ESRF, Grenoble). Diffraction was observed to about 2.3 Å resolution. Native crystals were soaked with numerous heavy metal compounds at a range of concentrations and incubation times. We collected a highly redundant data set from a crystal soaked for 7 days in the crystallization solution supplemented with 1 mM KAuCl4. A single anomalous diffraction (SAD) data set to a resolution of 2.7–2.8 Å was collected at the Beamline PXIII at the Swiss Light-Source (SLS, Villigen-Paul Scherrer Institute).

Structure Determination and Refinement

The structure of Cid1 was determined by SAD method using the anomalous signal of the gold atom. Briefly, the data set was indexed with the XDS package (Kabsch, 2010). Heavy atom location was performed with the Shelx suite using data to 3.5 Å (Sheldrick, 2010). Phasing was done using the SHARP procedure (Bricogne et al., 2003). Finally, density modification to a maximum resolution of 2.8 Å was done with the program SOLOMON (Abrahams and Leslie, 1996), which allowed unambiguous hand determination and gave a readily interpretable electron density map. An initial model was built using the molecular graphic program coot (Emsley and Cowtan, 2004) and refined with the routine phenix.refine from the program Phenix (Adams et al., 2010). When more than 90% of the protein chain was included, the model was used for phasing the data set from the UTP-soaked crystal using the program Phaser from the CCP4 package (McCoy et al., 2007). The final model was refined to 2.3 Å with 99.7% of the residues in the preferred and allowed regions of the Ramachandran plot.

Poly(U) Polymerase Assays

For each experiment, two quantities of enzymes were used (50 and 150 ng or 2U and 6U for the PUP from NEB). The same protocol was used for the different proteins (mutants or WT). The various RNA primers were labeled using the T4 polynucleotide kinase (NEB) and γ-32P-ATP according to manufacturer's instructions. Labeled RNA substrate (130 pmole) was incubated with the respective amount of enzymes in 25 ml of 10 mM Tris-Cl, pH 7.9, 50 mM NaCl, 10 mM MgCl2, 1 mM DTT, 0.5 mM UTP for 10 min at 30°C. Reactions were stopped with an equal volume of 100 mM EDTA and 2% sodium dodecyl sulfate (SDS). Extended RNA primers were extracted with phenol-chloroform, precipitated with ethanol and resuspended in 90% formamide loading buffer except for the reactions with the UpU substrate which were directly inactivated. Products were denatured at 95°C and separated by gel electrophoresis in a 12% (or 25% for the UpU) polyacrylamide/7M urea gel.

ACCESSION NUMBERS

Atomic coordinates and structure factors for the Cid1/UTP model have been deposited in the Protein Data Bank (http://www.pdb.org) with the accession code 4EP7.

SUPPLEMENTAL INFORMATION

Supplemental Information includes two figures, one table, and Supplemental Experimental Procedures and can be found with this article online at doi:10.1016/j.str.2012.04.006.

ACKNOWLEDGMENTS

We wish to acknowledge Marc Bühler for the generous gift of Schizosaccharomyces pombe genomic DNA. We also acknowledge the staff of X06SA at the Swiss Light-Source and of ID14-4 at the ESRF for developing excellent beamlines and supports with data collection. We wish to thank all members from the Stéphane Thore lab for critical reading, helpful discussions and technical advice. We are grateful for financial contributions from the Ernest Boninchi and the Ernst and Lucie Schmidheiny foundations together with the R'equip subsidy from the Swiss National Science Foundation (n316030_128787), used to acquire the Biostructural platform equipments. This work was partially supported by individual research grants from the Swiss National Science Foundation to S.T. (n31003A_124909 and n31003A-140924).

Received: February 15, 2012
Revised: April 17, 2012
Accepted: April 17, 2012
Published online: May 17, 2012

REFERENCES

Abrahams, J.P., and Leslie, A.G.W. (1996). Methods used in the structure determination of bovine mitochondrial F1 ATPase. Acta Crystallogr. D Biol. Crystallogr. 52, 30–42.

Adams, P.D., Afonine, P.V., Bunkóczi, G., Chen, V.B., Davis, I.W., Echols, N., Headd, J.J., Hung, L.W., Kapral, G.J., Grosse-Kunstleve, R.W., et al. (2010). PHENIX: a comprehensive Python-based system for macromolecular structure solution. Acta Crystallogr. D Biol. Crystallogr. 66, 213–221.

Aphasizheva, I., Aphasizhev, R., and Simpson, L. (2004). RNA editing terminal uridylyl transferase 1: identification of functional domains by mutational analysis. J. Biol. Chem. 279, 24123–24130.

Aphasizhev, R., and Aphasizheva, I. (2011). Mitochondrial RNA processing in trypanosomes. Res. Microbiol. 162, 655–663.

Balbo, P.B., and Bohm, A. (2007). Mechanism of poly(A) polymerase: structure of the enzyme-MgATP-RNA ternary complex and kinetic analysis. Structure 15, 1117–1131.

Balbo, P.B., Toth, J., and Bohm, A. (2007). X-ray crystallographic and steady state fluorescence characterization of the protein dynamics of yeast polyadenylate polymerase. J. Mol. Biol. 366, 1401–1415.

Bricogne, G., Vonrhein, C., Flensburg, C., Schiltz, M., and Paciorek, W. (2003). Generation, representation and flow of phase information in structure determination: recent developments in and around SHARP 2.0. Acta Crystallogr. D Biol. Crystallogr. 59, 2023–2030.

Chen, Y., Sinha, K., Perumal, K., and Reddy, R. (2000). Effect of 3′ terminal adenylic acid residue on the uridylation of human small RNAs in vitro and in frog oocytes. RNA 6, 1277–1288.

DeLano, W.L. (2002). The PyMOL User's Manual (San Carlos, CA: DeLano Scientific).

Deng, J., Ernst, N.L., Turley, S., Stuart, K.D., and Hol, W.G. (2005). Structural basis for UTP specificity of RNA editing TUTases from Trypanosoma brucei. EMBO J. 24, 4007–4017.

Dougherty, D.A. (1996). Cation-pi interactions in chemistry and biology: a new view of benzene, Phe, Tyr, and Trp. Science 271, 163–168.


Emsley, P., and Cowtan, K. (2004). Coot: model-building tools for molecular graphics. Acta Crystallogr. D Biol. Crystallogr. 60, 2126–2132.

Hagan, J.P., Piskounova, E., and Gregory, R.I. (2009). Lin28 recruits the TUTase Zcchc11 to inhibit let-7 maturation in mouse embryonic stem cells. Nat. Struct. Mol. Biol. 16, 1021–1025.

Hamill, S., Wolin, S.L., and Reinisch, K.M. (2010). Structure and function of the polymerase core of TRAMP, a RNA surveillance complex. Proc. Natl. Acad. Sci. USA 107, 15045–15050.

Heo, I., Joo, C., Kim, Y.K., Ha, M., Yoon, M.J., Cho, J., Yeom, K.H., Han, J., and Kim, V.N. (2009). TUT4 in concert with Lin28 suppresses microRNA biogenesis through pre-microRNA uridylation. Cell 138, 696–708.

Hirai, H., Lee, D.I., Natori, S., and Sekimizu, K. (1988). Uridylation of U6 RNA in a nuclear extract in Ehrlich ascites tumor cells. J. Biochem. 104, 991–994.

Kabsch, W. (2010). Xds. Acta Crystallogr. D Biol. Crystallogr. 66, 125–132.

Kim, J.H., and Richter, J.D. (2006). Opposing polymerase-deadenylase activities regulate cytoplasmic polyadenylation. Mol. Cell 24, 173–183.

Kwak, J.E., and Wickens, M. (2007). A family of poly(U) polymerases. RNA 13, 860–867.

Martin, G., and Keller, W. (2007). RNA-specific ribonucleotidyl transferases. RNA 13, 1834–1849.

Martin, G., Keller, W., and Doublié, S. (2000). Crystal structure of mammalian poly(A) polymerase in complex with an analog of ATP. EMBO J. 19, 4193–4203.

McCoy, A.J., Grosse-Kunstleve, R.W., Adams, P.D., Winn, M.D., Storoni, L.C., and Read, R.J. (2007). Phaser crystallographic software. J. Appl. Cryst. 40, 658–674.

Mullen, T.E., and Marzluff, W.F. (2008). Degradation of histone mRNA requires oligouridylation followed by decapping and simultaneous degradation of the mRNA both 5′ to 3′ and 3′ to 5′. Genes Dev. 22, 50–65.

Norbury, C.J. (2010). 3′ Uridylation and the regulation of RNA function in the cytoplasm. Biochem. Soc. Trans. 38, 1150–1153.

Perumal, K., and Reddy, R. (2002). The 3′ end formation in small RNAs. Gene Expr. 10, 59–78.

Régnier, P., and Hajnsdorf, E. (2009). Poly(A)-assisted RNA decay and modulators of RNA stability. Prog. Mol. Biol. Transl. Sci. 85, 137–185.

Rissland, O.S., and Norbury, C.J. (2009). Decapping is preceded by 3′ uridylation in a novel pathway of bulk mRNA turnover. Nat. Struct. Mol. Biol. 16, 616–623.

Rissland, O.S., Mikulasova, A., and Norbury, C.J. (2007). Efficient RNA polyuridylation by noncanonical poly(A) polymerases. Mol. Cell. Biol. 27, 3612–3624.

Ryan, C.M., and Read, L.K. (2005). UTP-dependent turnover of Trypanosoma brucei mitochondrial mRNA requires UTP polymerization and involves the RET1 TUTase. RNA 11, 763–773.

Schmidt, M.J., West, S., and Norbury, C.J. (2011). The human cytoplasmic RNA terminal U-transferase ZCCHC11 targets histone mRNAs for degradation. RNA 17, 39–44.

Sheldrick, G.M. (2010). Experimental phasing with SHELXC/D/E: combining chain tracing with density modification. Acta Crystallogr. D Biol. Crystallogr. 66, 479–485.

Shen, B., and Goodman, H.M. (2004). Uridine addition after microRNA-directed cleavage. Science 306, 997.

Song, M.G., and Kiledjian, M. (2007). 3′ Terminal oligo U-tract-mediated stimulation of decapping. RNA 13, 2356–2365.

Stagno, J., Aphasizheva, I., Aphasizhev, R., and Luecke, H. (2007a). Dual role of the RNA substrate in selectivity and catalysis by terminal uridylyl transferases. Proc. Natl. Acad. Sci. USA 104, 14634–14639.

Stagno, J., Aphasizheva, I., Rosengarth, A., Luecke, H., and Aphasizhev, R. (2007b). UTP-bound and Apo structures of a minimal RNA uridylyltransferase. J. Mol. Biol. 366, 882–899.

Stagno, J., Aphasizheva, I., Bruystens, J., Luecke, H., and Aphasizhev, R. (2010). Structure of the mitochondrial editosome-like complex associated TUTase 1 reveals divergent mechanisms of UTP selection and domain organization. J. Mol. Biol. 399, 464–475.

Suezaki, Y., and Go, N. (1975). Breathing mode of conformational fluctuations in globular proteins. Int. J. Pept. Protein Res. 7, 333–334.

Trippe, R., Sandrock, B., and Benecke, B.J. (1998). A highly specific terminal uridylyl transferase modifies the 3′-end of U6 small nuclear RNA. Nucleic Acids Res. 26, 3119–3126.

Wang, S.W., Norbury, C., Harris, A.L., and Toda, T. (1999). Caffeine can override the S-M checkpoint in fission yeast. J. Cell Sci. 112, 927–937.

Wang, S.W., Toda, T., MacCallum, R., Harris, A.L., and Norbury, C. (2000). Cid1, a fission yeast protein required for S-M checkpoint control when DNA polymerase delta or epsilon is inactivated. Mol. Cell. Biol. 20, 3234–3244.

Structure 20, 977–986, June 6, 2012 ©2012 Elsevier Ltd All rights reserved

Structure, Volume 20

Supplemental Information

Functional Implications from the Cid1 Poly(U) Polymerase Crystal Structure

Paola Munoz-Tello, Caroline Gabus, and Stéphane Thore

Inventory of Supplemental Information

I. Supplementary Data

A. Figure S1. Fo – Fc difference electron density map of the bound UTP molecule, related to Figure 2.

B. Figure S2. Poly(U) polymerase assays with different RNA primers, related to Figure 4.

C. Table S1. Cid1 mutagenesis primers.

II. Supplementary Experimental Procedure

Cloning the Cid1 gene from genomic DNA.

Supplemental Data

Figure S1, related to Figure 2:

View of the Fo – Fc difference density map of the bound UTP molecule. The Cid1 protein is shown in green (CD domain) and orange (CAT domain). The NRM loop is colored in magenta. The NRM amino acids and the UTP molecule are shown in ball and stick representation. Two magnesium ions are depicted as gray spheres. The difference electron density map was calculated using the refined protein chains and the magnesium ions as models and is shown as gray mesh (contoured at 2.7 σ).

Figure S2, related to Figure 4:

Poly(U) polymerization assays with different RNA primers. (A) UTP polymerization by mutants R139A, Y205A and K144A compared to the wild type Cid1 protein with a UpU RNA template. (B) UTP polymerization by mutants Y205F, Y205V and Y205D compared to the wild type Cid1 protein using the same primer as in panel A. (C) UTP polymerization by mutants R139A, Y205A and K144A compared to the wild type Cid1 protein with a poly(A) tract (15 nt). (D) UTP polymerization by mutants Y205F, Y205V and Y205D compared to the wild type Cid1 protein using the same primer as in panel C. Equivalent amounts of protein were used in all the reactions (50 and 150 ng) and enzymatic conditions were as described in the experimental procedures section. Products were separated on a 25% or 12% polyacrylamide/7 M urea gel (panels A and B, and panels C and D, respectively).

Table S1: Cid1 mutagenesis primers.

Cid1    Forward primer (5'→3')                  Reverse primer (5'→3')
AMA     CTTAAAAATTCGGCTATGGCTTTGTGCGTGCTTATG    CATAAGCACGCACAAAGCCATAGCCGAATTTTTAAG
R139A   TTTTACAAAGGGCAGCAATTCCCATTATC           GATAATGGGAATTGCTGCCCTTTGTAAAA
K144A   CAAGAATTCCCATTATCGCATTAACATCTGATACG     CGTATCAGATGTTAATGCGATAATGGGAATTCTTG
Y205A   GCAAATCAACTCTCCTGCCTTTGGAACTCTTTCC      GGAAAGAGTTCCAAAGGCAGGAGAGTTGATTTGC
Y205D   GCAAATCAACTCTCCTGACTTTGGAACTCTTTC       GAAAGAGTTCCAAAGTCAGGAGAGTTGATTTGC
Y205F   GCAAATCAACTCTCCTTTCTTTGGAACTCTTTCC      GGAAAGAGTTCCAAAGAAAGGAGAGTTGATTTGC
Y205V   GCAAATCAACTCTCCTGTCTTTGGAACTCTTTCC      GGAAAGAGTTCCAAAGACAGGAGAGTTGATTTGC
F332A   GCGATTGAAGATCCTGCCGAGATTTCACATAATG      CATTATGTGAAATCTCGGCAGGATCTTCAATCGC
F332D   GCGATTGAAGATCCTGACGAGATTTCACATAATG      CATTATGTGAAATCTCGTCAGGATCTTCAATCGC
F332L   CGATTGAAGATCCTTTGGAGATTTCACATAATG       CATTATGTGAAATCTCCAAAGGATCTTCAATCG
H336A   CTTTCGAGATTTCAGCTAATGTGGGTAGG           CCTACCCACATTAGCTGAAATCTCGAAAG
H336L   CTTTCGAGATTTCACTTAATGTGGGTAGG           CCTACCCACATTAAGTGAAATCTCGAAAG
H336N   CTTTCGAGATTTCAAATAATGTGGGTAGG           CCTACCCACATTATTTGAAATCTCGAAAG
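Each mutagenesis primer pair in Table S1 appears to consist of a forward primer and its exact reverse complement. As a minimal sketch of how that relationship could be checked, the Python snippet below tests two pairs copied from the table (R139A and H336A). The function names are illustrative assumptions and are not part of the original procedure.

```python
# Illustrative check, not part of the original protocol: a fully overlapping
# mutagenesis primer pair should be mutual reverse complements.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_complementary_pair(forward: str, reverse: str) -> bool:
    """True if the reverse primer is the exact reverse complement of the forward primer."""
    return reverse_complement(forward) == reverse.upper()


# Two primer pairs copied from Table S1.
PRIMER_PAIRS = {
    "R139A": ("TTTTACAAAGGGCAGCAATTCCCATTATC", "GATAATGGGAATTGCTGCCCTTTGTAAAA"),
    "H336A": ("CTTTCGAGATTTCAGCTAATGTGGGTAGG", "CCTACCCACATTAGCTGAAATCTCGAAAG"),
}

for name, (fwd, rev) in PRIMER_PAIRS.items():
    print(name, is_complementary_pair(fwd, rev))
```

The same check can be applied to the remaining rows of the table to catch transcription errors in a primer list.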

Supplemental Experimental Procedures

Cloning the Cid1 gene from the Schizosaccharomyces pombe genomic DNA

The DNA construct coding for S. pombe Cid1 was synthesized by nested PCR, using S. pombe genomic DNA (kind gift from M. Bühler). The two introns found in the Cid1 gene were removed using polymerase chain reaction and the following primers:

Forward primer (N-term): 5'-CAGTACATGTCGCACAAGGAATTTACGAAGTTTTGC-3'

Forward primer (intron 1): 5'-CTTAAACGAATATCCCCTGATGCTGAATTGGTAG-3'

Forward primer (intron 2): 5'-GCTTATAGCTGAAGGATTTGAAGGAAAATTTTTAC-3'

Reverse primer (intron 1): 5'-CTACCAATTCAGCATCAGGGGATATTCGTTTAAG-3'

Reverse primer (intron 2): 5'-GTAAAAATTTTCCTTCAAATCCTTCAGCTATAAGC-3'

Reverse primer (C-term): 5'-GCCGCTCGAGTCAGGCCTCCTCAAATAATGAATC-3'

The primers were designed to remove the predicted disordered N-terminal (residues 1–39) and C-terminal (residues 378–405) regions of the protein. The PCR product was cloned into pST18 (derived from pET42) for expression in E. coli as a fusion protein with an N-terminal His9 and MBP tag followed by a TEV cleavage site.

1.11.5 Cid1 Substrate/RNA Complex Crystallization

To better understand how the Cid1 catalytic cycle takes place, we tried to obtain the crystal structures of several binary and ternary complexes. Different compounds and concentrations were used for soaking and co-crystallization of the native crystals (Fig. 13E). The trials performed are summarized in Table 2. Multiple crystallization conditions were also tested (Fig. 17). Unfortunately, all these trials were unsuccessful: none of the electron density maps calculated from the datasets of these soaked crystals showed any positive density in the active site that would indicate a bound ligand (Fig. 18).

Ligands: ATP; ATP/ApU; ATP/ApCpp; ATP/UpA; ATP/AAAAU; ApCpp; ApCpp/ApU; UMPnPP; UMPnPP/UTP; UMPnPP/UpU; UTP/UpU; UpA; UpU; ApU; UTP/AAAAU; AAAAAAAAAU; AAAAAAAAAAAAAAA; UUUUUUUUUUUUUUU; 17-mer RNA
Soaking concentrations: 10 mM, 5 mM, 2.5 mM
Soaking times: 15 min, 2 h, 4 h, 6 h, 8 h, 16 h, 2 days
Co-crystallization concentrations: 1 mM, 0.75 mM, 0.5 mM, 0.25 mM
Rows marked X X: AAAAAAAAAU; AAAAAAAAAAAAAAA; UUUUUUUUUUUUUUU; 17-mer RNA

Table 2: Ligand soaking and co-crystallization trials. X marks trials that were not tested.

Based on a publication in which the authors observed the formation of a phosphodiester bond by collecting and analyzing data from crystals soaked for different times (Nakamura et al. 2012), we tested whether the protein was still active in several crystallization buffers and then tried soakings at different times and concentrations in that particular condition (Fig. 19). In spite of our efforts, we could not observe any catalytic activity inside the crystal or any other ligand in the active site.

Figure 17: Different crystallization conditions tested for co-crystallization and soaking of ligands.

Figure 18: Cid1 electron density maps zoomed into a ligand-free active site. Tyrosine 205 and Histidine 336 are shown in the active site to orient the protein. The positive and negative Fo – Fc density maps are shown as mesh and colored in green and red, respectively. The 2Fo – Fc density map is colored in blue.

Figure 19: Cid1 in vitro polymerization assays with radiolabeled ApU RNA in different crystallization buffers. Experiments were performed with 150 ng of Cid1 and a radiolabeled 2-mer RNA substrate with 10 mM of UTP (U) or ATP (A). Each reaction mix was incubated for 20 min at 30°C. -, no protein control; Buf1, NEB buffer used for the polymerization reaction (Materials and Methods section 4.1.4); Buf2, 0.1 M CHES pH 9.0 and 6% PEG 3350; Buf3, 0.1 M Imidazole/MES pH 6.1, 15% Glycerol/PEG 4000 and 0.06 M NaI, NaBr, NaF; Buf4, 0.1 M Hepes pH 7.5, 6% 2-propanol and 0.2 M NaCl; Buf5, 4% Tacsimate pH 8.0 and 9% PEG 3350. All the buffers were supplemented with 5 mM MgCl2.

We then decided to mutate one residue from the catalytic triad (residue D160) in an attempt to stabilize binary or ternary Cid1 complexes, as successfully done for the PAP/ATP/RNA ternary complex (Balbo and Bohm 2007). Protein expression and purification were performed as for the WT protein, and the D160A mutant eluted as a monomeric protein from the gel filtration column (Fig. 20). The WT Cid1 crystallization condition was supplemented with 5 mM of TCEP, leading to larger crystals (Fig. 21A), and new crystallization conditions were also tested (Fig. 21B). The same binary and ternary complexes attempted for WT Cid1 were tested in soaking and co-crystallization trials (Table 2). We tested approximately 200 crystals and only one of them had a positive density inside the active site, corresponding to an ApU molecule (Fig. 22). Details of the structure solution, along with the mutational and functional analysis of Cid1 based on this crystal structure, were published in the journal Nucleic Acids Research and are shown in the following chapter (section 4.1.6).

Figure 20: Gel filtration chromatographic profile of the D160A Cid1 protein mutant purification.

Figure 21: Different D160A mutant crystal hits. (A) D160A mutant crystallization conditions supplemented with TCEP. (B) Additional D160A crystallization conditions used for soaking or co-crystallization.

Figure 22: Cid1 electron density map zoomed into a ligand-bound active site. Tyrosine 205 and Histidine 336 are shown in the active site to orient the protein. The positive and negative Fo – Fc density maps are shown as mesh and colored in green and red, respectively. The 2Fo – Fc density map is colored in blue.

1.11.6 Publication II: “A Critical Switch in the Enzymatic Properties of Cid1 Protein Deciphered from its Product-bound Crystal Structure”

In order to trap other Cid1 binary and ternary complexes, we mutated one of the aspartate residues of the catalytic triad (D160A) to inactivate the enzyme, as successfully performed for the poly(A) polymerase/ATP/RNA ternary complex (Balbo and Bohm 2007). We then solved the crystal structure of Cid1 in complex with a non-hydrolysable UTP (UMPnPP) and with a pseudo-product, ApU. This pseudo-product mimics the post-catalysis reaction state, since Cid1 targets are polyadenylated. This atomic model gave us deeper molecular insights into the likely catalytic mechanism and substrate preference of the Cid1 enzyme. We first confirmed the RNA selectivity of the enzyme by showing a direct contact of aspartate 103 with the 2’-OH group of the RNA substrate. We also highlighted the enzyme’s strong preference for uridylated substrates and emphasized the importance of several other key residues for RNA accommodation and Cid1’s activity. Taken together, our study allows us to propose a switching mechanism of Cid1 activity, which changes from a distributive to a processive mode, potentially allowing the addition of two distinct U-tails of different sizes.

Nucleic Acids Research Advance Access published December 9, 2013
Nucleic Acids Research, 2013, 1–9
doi:10.1093/nar/gkt1278

A critical switch in the enzymatic properties of the Cid1 protein deciphered from its product-bound crystal structure

Paola Munoz-Tello, Caroline Gabus and Stéphane Thore*

Department of Molecular Biology, University of Geneva, Geneva, 1211, Switzerland

Received September 18, 2013; Revised November 14, 2013; Accepted November 16, 2013

ABSTRACT

The addition of uridine nucleotide by the poly(U) polymerase (PUP) enzymes has a demonstrated impact on various classes of RNAs such as microRNAs (miRNAs), histone-encoding RNAs and messenger RNAs. Cid1 protein is a member of the PUP family. We solved the crystal structure of Cid1 in complex with non-hydrolyzable UMPNPP and a short dinucleotide compound ApU. These structures revealed new residues involved in substrate/product stabilization. In particular, one of the three catalytic aspartate residues explains the RNA dependence of its PUP activity. Moreover, other residues such as residue N165 or the β-trapdoor are shown to be critical for Cid1 activity. We finally suggest that the length and sequence of Cid1 substrate RNA influence the balance between Cid1’s processive and distributive activities. We propose that particular processes regulated by PUPs require the enzymes to switch between the two types of activity as shown for the miRNA biogenesis where PUPs can either promote DICER cleavage via short U-tail or trigger miRNA degradation by adding longer poly(U) tail. The enzymatic properties of these enzymes may be critical for determining their particular function in vivo.

INTRODUCTION

Cytoplasmic messenger RNA (mRNA) homeostasis is a complex phenomenon in which the balance between mRNA stability and mRNA degradation is tightly monitored. Eukaryotic species, from plants to mammals, rely heavily on the poly(A) tail to control the mRNA degradation process (1–2). Recently, a long-known family of proteins, named poly(U) polymerases (PUPs), was shown to be an unexpected player in mRNA homeostasis in various eukaryotes (3–6). The PUPs add poly(U) tails to mRNAs, which appear to modulate the rates and/or the directionality of the mRNA degradation process (4,7–9). Furthermore, members of the PUP family also regulate the production of histone mRNAs, microRNAs, siRNAs or U6 snRNA in the nucleus (6,10–17).

The PUP enzymes contain a nucleotidyl transferase domain, which has been characterized structurally (18–23). These structures show that the nucleotide triphosphate and the substrate RNA are bound in a space between the catalytic domain and the central domain (21,24). The PUP Cid1 protein was identified in Schizosaccharomyces pombe and was shown to have the capacity to use both uridine triphosphate (UTP) and adenosine triphosphate (ATP) as substrates for polymerization in vitro, while apparently being specific for UTP in vivo (25–26). This is surprising, as other members of the PUP family were depicted to be highly specific for UTP (18–20,27–28). Residue H336 found in the nucleotide recognition motif (NRM) loop has been shown to be mainly responsible for determining the specificity of the nucleotide recognized in vitro and in vivo (21–23). Additional charged residues were shown to be involved in substrate RNA association, although their exact function was not entirely characterized (21–22).

To further understand the mechanism behind RNA recognition and elongation, we determined the structure of Cid1 bound to a non-hydrolyzable nucleotide UMPNPP and bound to its minimal pseudo-product ApU. These structures revealed new key residues for the substrate/product recognition. In particular, one of the three catalytic aspartate residues interacts with the 2′-OH group of the pseudo-product-bound dinucleotide, thus explaining the RNA specificity displayed by this class of enzymes. Moreover, residue N165 interacts with the adenosine base in our pseudo-product-bound crystal structure. This interaction is essential for the enzymatic reaction. Finally, we demonstrate that the polymerizing capacities of Cid1 are modulated by a highly flexible loop named the β-trapdoor (residues 310–322) and by the residue K144 when UTP is used as a substrate (22). Our study thus highlights the structural basis behind several unique

*To whom correspondence should be addressed. Tel: +41 2237 96127; Fax: +41 2237 96868; Email: [email protected]

© The Author(s) 2013. Published by Oxford University Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact [email protected]


properties of the PUP enzyme family: properties mediated by a combination of specific residues and conserved secondary structures. We speculate that the specific type of activity that PUP enzymes display may determine their involvement in particular biochemical pathways and require factors to regulate their polymerizing activity.

MATERIALS AND METHODS

Expression and purification

Schizosaccharomyces pombe Cid1 protein (residues 40–377) was expressed and purified as described previously (21). Briefly, Cid1 was expressed in Escherichia coli BL21 Star cells (Novagen) as a His9-MBP fusion protein and purified using HiTrap FF crude column (GE Healthcare). After cleavage with the Tobacco Etch Virus protease (ratio of 1/50) at 16°C, the sample was reloaded onto the HiTrap column to remove the tag, the protease and the contaminants. The purified protein was finally applied on a Superdex 200 column (GE Healthcare), in a buffer containing 20 mM Hepes (pH 7.5), 150 mM NaCl, 1 mM β-mercaptoethanol and 10% (v/v) glycerol. Fractions containing the protein were pooled and concentrated (30 000 MWCO Amicon) to 10 mg/ml and stored at −80°C. All protein samples eluted from the gel filtration column as monomers. The same expression and purification procedure was performed for all the Cid1 mutants.

Mutagenesis

Cid1 point mutants were prepared by site-directed mutagenesis using the Quick-Change mutagenesis kit from Stratagene according to manufacturer’s instructions. The Cid1 Δ310–322 deletion mutant was prepared by nested polymerase chain reaction. All the constructs were verified by DNA sequencing. The oligonucleotides used for the mutations are shown in Supplementary Table S1.

Protein crystallization and X-ray data collection

D160A mutant protein was used at 10 mg/ml. Crystals were produced using the sitting-drop vapor diffusion technique at 18°C, with a reservoir solution containing 0.1 M imidazole/MES (equal ratio) (pH 6.1), 12–20% glycerol, 6–10% polyethylene glycol (PEG) 4000, 126 mM of a halogen salt (NaI, NaBr, NaF in equal ratio) and 10 mM TCEP [tris(2-carboxyethyl)phosphine] as an additive. Crystals appeared in 2 days and grew to a maximum size of around 0.25 × 0.25 × 0.2 mm in 1 week. Crystals of D160A mutant were soaked for 16 h with 2.5 mM UMPNPP and UTP or with 2.5 mM of ApU and ATP, in both cases supplemented with 5 mM of MgCl2. Crystals were passed into the crystallization solution supplemented with 10% glycerol before flash freezing in liquid nitrogen.

Diffraction data for the UMPNPP-soaked and ApU-soaked protein crystals were collected using the Beamline ID14–4 at the European Synchrotron Radiation Facility (ESRF, Grenoble). Complete data sets for UMPNPP- and ApU-soaked crystals were collected to 1.9 and 1.94 Å resolution, respectively, and the statistics are reported in Table 1.

Table 1. Data collection and refinement statistics

Crystal                                 UMPNPP-soaked Cid1        ApU-soaked Cid1
Data collection
  Wavelength (Å)                        0.9394                    0.9394
  Temperature (K)                       100                       100
  Maximum resolution (Å)                45.0–1.90                 45.3–1.94
  Space group                           P21                       P21
  Unit cell dimensions (Å)              a = 53.8, b = 77.3,       a = 54.0, b = 77.1,
                                        c = 82.1,                 c = 81.8,
                                        α = γ = 90.0, β = 90.9    α = γ = 90.0, β = 90.7
  Number of observations                260 839                   246 855
  Number of unique reflections          52 778                    49 569
  Redundancy                            5.0                       5.0
  Data completeness (%)                 99.5 (99.7)               99.6 (99.2)
  I/σ(I)                                28.47 (13.2)              13.14 (2.3)
  Rmeas                                 4.4 (16.7)                9.3 (87.8)
Refinement
  Resolution (Å)                        19.8–1.90                 45.3–1.94
  Number of atoms                       5688                      5569
    Protein                             5112                      5129
    Ligand                              58 (UMPNPP)               78 (ApU)
    Magnesium ions                      4                         3
    Bromide ions                        3                         3
    Water molecules                     511                       356
  Rwork (%)                             17.94                     17.92
  Rfree (%)                             20.81                     22.57
RMSD from ideal geometry
  Bond lengths (Å)                      0.008                     0.008
  Bond angles (degrees)                 1.082                     1.077
Mean B (Å²)
  Protein chain                         22.6                      33.9
  Nucleotide/ions                       26.1                      35.2
  Water                                 30.6                      41.3
Residues in favored region of
  the Ramachandran plot (%)             97.52                     96.72
Residues in allowed region of
  the Ramachandran plot (%)             2.15                      2.63

Structure determination and refinement

The structures of the D160A mutant in complex with UMPNPP or in complex with ApU were determined by molecular replacement with the Cid1 protein structure as a search model (PDB code 4EP7). The data sets were indexed with the XDS package (29). Phasing was done using the program Phaser from the CCP4 package (30) and the models were refined using the routine phenix.refine from the Phenix program (31). A large unidentified density was visible in both the 2Fo – Fc and the Fo – Fc electron density maps where an UMPNPP or an ApU molecule was built (Supplementary Figure S1A and B). Atomic coordinates for the UMPNPP and ApU molecules were taken from the HIC-Up database (32) and fitted in the empty electron density map using the program COOT (33). The final Cid1/UMPNPP and Cid1/ApU models were refined to a resolution of 1.9 and 1.94 Å, respectively. Both crystal structures contain two molecules of Cid1 in the asymmetric unit. The Cid1/UMPNPP structure consists of residues 38–377 from the Cid1 protein, 2 UMPNPP compounds, 4 magnesium ions and 3 bromide

ions present in the crystallization condition. The Cid1/ApU model contains the same Cid1 sequence (residues 38–377 for molecule A and residues 39–377 for molecule B), 2 ApU dinucleotides, 3 magnesium ions and 3 bromide ions. For both models, loops containing residues 109–115 and 309–322 were not visible in our electron density maps.

Poly(U) polymerase assays

RNA primers (U15, A15 or ApU) were labeled using T4 polynucleotide kinase (NEB) and γ-32P-ATP using manufacturer’s protocol. Labeled RNA substrates (10 pmoles) were incubated with 5 or 15 pmoles of Cid1 proteins [wild-type (WT) or mutant] in 25 µl of 10 mM Tris-Cl (pH 7.9) containing 50 mM NaCl, 10 mM MgCl2, 1 mM DTT and 0.5 mM UTP (or ATP when indicated) for 40 min at 37°C. Reactions were stopped with an equal volume of 100 mM EDTA and 2% sodium dodecyl sulfate. Extended primers were extracted with phenol-chloroform, precipitated with ethanol and resuspended in 90% formamide loading dye. Products were denatured at 95°C for 2 min and separated by gel electrophoresis in a 12% (or 25% for the ApU substrate) polyacrylamide/7 M urea gel.

Electrophoretic mobility shift assay

The indicated amount of proteins was mixed with the labeled RNAs in 20 mM Hepes (pH 7.5) containing 150 mM NaCl, 5 mM MgCl2, 1 mM β-mercaptoethanol, 10% glycerol and incubated at 37°C for 10 min. Following incubation, 2 µl of loading dye (0.1% bromophenol blue, 0.1% xylene cyanol, 180 mM Tris-boric acid and 50% glycerol) was added, and the samples were immediately loaded on a 10% native polyacrylamide gel in 0.5× TBE buffer (45 mM Tris-boric acid, 1 mM EDTA). The gel was then run at 300 V for 120 min at 4°C and exposed in a cassette with an X-ray film (Fujifilm) for 30 to 45 min. Following quantification with ImageQuant software (GE Healthcare), affinity constants were calculated using the non-linear regression with one site-specific binding option from GraphPad Prism 6 program (GraphPad).

For comparing directly the RNA binding efficiency of various mutated Cid1 proteins, we used a unique concentration of protein (WT or mutant) and measured complex formation with a given probe (U15 or A15). Then, we quantified and normalized the amount of complex obtained with a given mutant protein by the one obtained with the WT enzyme for the same probe. Protein concentrations were set to 1.5 µM for the U15 probe and 11 µM for the A15 RNA. We used these two fixed concentrations of protein to maximize complex formation based on WT Cid1 Kd value for a given probe and to take into account the lower solubility observed with mutant proteins K144A and F332A. After loading and running the sample as indicated previously, we quantified our signal, normalized it against WT signal and plotted it as histogram. Individual binding reactions were performed three times in triplicates.

RESULTS AND DISCUSSION

Overall structure of the Cid1/UMPNPP and Cid1/ApU complexes

We mutated one residue from the catalytic triad (residue D160) in an attempt to stabilize the UMPNPP- or the ApU-bound Cid1 complexes as successfully done in the poly(A) polymerase (PAP)/ATP/RNA ternary complex structure (24). Here, we report the atomic structures of the D160A Cid1 mutant in complex with the non-hydrolyzable UMPNPP and the D160A Cid1 mutant in complex with a minimal pseudo-product (ApU) (Figure 1A and B; see X-ray statistics in Table 1). Both ligands were built using unbiased Fo – Fc Fourier difference electron density map calculated with the protein model only after a round of simulated annealing (Supplementary Figure S1).

UMPNPP mode of binding is similar to our Cid1/UTP complex (21) confirming that D160A mutation does not induce any conformational changes in the overall protein fold or in the substrate nucleotide recognition by the protein. The only difference is the absence of the first magnesium ion necessary for the nucleophilic attack, as previously observed in the PAP/ATP/RNA crystal structure (Figure 1A) (24).

The bound ApU molecule should be considered as a mimic of the post-catalysis reaction state as defined by Stagno et al. (27). As such, ApU is considered as a pseudo-product of the Cid1-mediated reaction. The pseudo-product ApU is representative of the in vivo situation encountered by Cid1 protein as, in S. pombe, Cid1 natural mRNA substrates are polyadenylated (Figure 1B) (26). In the ApU-bound Cid1 structure, the uridine nucleotide is recognized as in the UMPNPP and the UTP-bound Cid1 structures (21–23) (Figure 1C). The adenosine base of the bound ApU molecule is stacked on the uridine base, a situation reminiscent of the TUT4/UpU structure (Figure 1C) (27). Several Cid1 residues are involved in the stabilization of the adenosine nucleotide. First, its ribose 2′-OH group establishes a direct hydrogen bond with the catalytic residue D103 (Figures 1C and D). Although Cid1 was previously classified as an RNA-specific nucleotidyl transferase, the basis for such specificity was not described (8). We suggest that the catalytic residue D103 stabilizes the proper type of substrate near the catalytic site by detecting the presence of a 2′-hydroxyl group (Figure 1D). This mechanism of RNA detection is probably shared across the entire family, as a similar interaction is found in the TUT4/UpU crystal structure (27). Second, residue D103 and the ApU molecule have another water-mediated interaction with the N3 position of the adenine, likely used to stabilize the incoming RNA substrate before the nucleotidyl transfer reaction (Figure 1C and D). The adenine is further stabilized via additional water-mediated interaction between its N6 position and residue E333 (Figure 1C and D). None of these water-mediated interactions confer nucleotide specificity on the 3′-end of the substrate RNA, as expected for a template-independent polymerase (8). In the ApU complex structure, the N1 position of the bound adenine is recognized via a specific hydrogen bond with the OD1 position of residue N165. In


Figure 1. Crystal structure of Cid1 in complex with its minimal pseudo-product. (A) Cartoon representation of D160A Cid1 bound to the non-hydrolyzable UMPNPP. The UMPNPP, the catalytic aspartic acids and D160A residue are shown as ball and sticks and colored according to atom type (carbon, yellow or cyan; nitrogen, blue; oxygen, red; phosphate, orange). (B) Cartoon representation of D160A Cid1 bound to ApU. The ApU molecules are shown and colored as in A. The catalytic (CAT) and central (CD) domains of Cid1 are colored in salmon and green, respectively, with the ‘ratchet’ helix four colored in yellow in A and B. (C) Side view of the active site of the Cid1/UTP and D160A/ApU superposed structures. Cid1/UTP atomic model is colored in gray. Residues from the NRM loop are shown as ball and sticks and colored as in A (carbon atom in magenta). Hydrogen bonds within 2.4–3.6 Å distance are shown as black dashed lines for adenine stabilization and as red dashed lines for uridine stabilization. Amino acids from the Cid1/UTP structure are shown with thin stick radius compared with the one of the Cid1/ApU model. (D) Top view of the D160A showing ApU interactions. Atoms, residues and hydrogen bonds are colored as in C. Magnesium ions are illustrated as gray spheres in A and C. All the structural panels here and in the following figures were generated with PyMol (34).

Trypanosoma brucei TUT4/UpU structure, the equivalent residue is an arginine (R141), which cannot contact the uracil base because it forms a salt bridge with residue E300 (E333 in Cid1) (27). However, if we superpose the TUT4/UpU and Cid1/ApU structures, the N3 atom of the uracil base in the trypanosomal structure is located at nearly the same place as the N1 atom of the adenosine in our structure (Figure 2A). Therefore, we suggest that N165 is a key residue for binding to the 3′-end of the RNA substrate either before or after the Cid1 uridylation reaction. This proposal is valid whether this substrate RNA contains a 3′-end adenosine nucleotide, as expected for the native poly(A)-containing substrates in S. pombe, or whether the RNA went through at least one round of polyuridylation (Figure 2A). Our proposal can be extrapolated to any RNA 3′-end base, although it will require the nucleotide tautomeric form with a hydrogen-bonded N1 atom to establish a proper bond with N165 OD1 position. Finally, the TUT/PUP superposition also shows that residue R139 in Cid1 is not in contact with the pseudo-product ApU (Figure 2B). We presume that the two enzyme groups may have evolved toward slightly different RNA recognition/stabilization modes, most likely reflecting their specific enzymatic properties.

RNA binding properties

Cid1 has been classified as a non-canonical template-independent nucleotidyl transferase (3,7–8,21–23,35). Previous studies have also shown that the overall activity of Cid1 is unexpectedly high when presented with U15 (26). To further understand the Cid1 enzymatic properties, we generated various Cid1 mutants driven by the analysis of our Cid1/ApU crystal structure. We then measured their binding properties for two different substrates (A15 and U15) in the absence of any polymerizing activity to avoid interference from the newly synthesized tail (Figure 3 and Supplementary Figure S2).

We first compared the capacities of the WT protein and of our mutant D160A to bind to the two selected


Figure 2. Structural comparison between the Cid1/ApU and the TbTUT4/UpU complex structures. (A) Close view of the NRM loop of TbTUT4/UpU (PDB code 2Q0G) and Cid1/ApU showing related active site pocket architecture. TbTUT4/UpU structure is shown in cartoon representation and colored in violet. The main interacting amino acids and the UpU molecule are shown as ball and stick model colored in violet or green, respectively. Cid1/ApU model, residues and atoms are shown and colored as in Figure 1. (B) View from the top of the active site highlighting the differential position of the loop between β-sheets four and five. This loop contains residue R139 (R121 in TbTUT4). TbTUT4 residues are labeled in italic and the radius of their ball and stick representations is reduced compared with the one used for Cid1 residues. Magnesium ions are illustrated as gray spheres in all panels. The black dashed lines represent hydrogen bonds within 2.4–3.6 Å distance.

substrates, mimics of a short poly(A) tail (A15) or a newly added U-tail (U15; Figure 3A and B; Supplementary Figure S2A and B). The affinity constant calculated between the Cid1 WT and the U15 probe was ~1 µM, and ~20 µM with the A15 probe (Figure 3C). These values were identical for the D160A mutant, reinforcing the fact that this mutation does not alter the enzyme’s RNA binding capacity. Overall, we observe a 20-fold difference between the affinity constants of the various tested Cid1 proteins for a 15-mer homouridine as compared with the A15 (Figure 3C). Cid1’s intrinsically higher affinity for homouridine polymer probably explains why Cid1 exhibits a more efficient polymerizing activity on U15 RNA (26).

We also mutated residue N165 into an alanine (or an aspartate) and assessed the mutated Cid1 enzyme’s binding capacity (Supplementary Figure S2C and D). The affinity constant for the N165A mutant did not differ from that of the WT protein (Figure 3C). These values are comparable or slightly lower than the reported values for nucleotidyl transferase enzymes (23,27,36). The binding experiments agree with our observation that both A and U nucleotide would be forming a stable hydrogen bond with N165 residue (Figures 2A and 3C). We then tested residue R139, which is found in the loop between β-strands 4 and 5 (Figure 2B). In the TbTUT4, the equivalent amino acid R121 is involved in the primer stabilization as well as in the catalytic activity (Figure 2B) (19,27). The association between the two RNA substrates and the R139A mutant is similar to that of the WT protein (Figure 3D). These binding assays corroborate our structural data, which did not show any interaction with R139 residue (Figure 2B). We concluded that this residue is not important on its own for the recognition of the substrate RNA in contrast to TbTUT4 (Figure 3D). Furthermore, it has been suggested that the β-sheet structure on top of the active site, named β-trapdoor, plays a role during UTP exchange, although its role in RNA association has not been assessed (22). We could not observe any significant difference between the affinity constants measured with Cid1 truncation of residues 310–322 (Δ310–322) for both RNA substrates suggesting that the β-trapdoor does not determine the specificity of the associated RNA per se (Figure 3C).

Several residues that have been shown to be important for the Cid1 activity such as F332, K144 or H336 were finally tested for their potential effect on the RNA binding properties. Strikingly, the F332 mutant had a reduced capacity for binding RNAs (Figure 3D). We previously proposed that F332 was critical for the UTP clamping mechanism mediated by the ratchet helix 4 (21). We suggest that the importance of the domain movement during the nucleotide transfer likely extends to RNA template trapping mechanism as well (Figure 3D) (21). We then looked into the RNA binding capacity of the H336A mutant protein for either of the RNA substrates, which was unimpaired, thereby confirming that H336 is only important for the UTP selectivity (Figure 3D) (21). When we tested the RNA binding capacity of the K144A mutant protein, we observed that its association with A15 was strongly reduced, whereas it was unchanged for a 15-mer U stretch (Figure 3D). Therefore, K144 residue seems important for the stabilization of our short incoming substrate RNA. These results are slightly different from the data published by Gilbert’s laboratory, possibly due to the differences in the RNA used for the binding studies (a 40-mer RNA in their study) (22). The poly(A) polymerase equivalent residue K145 is found to contact the 2′-OH group of the nucleotide -2 in the PAP/ATP/RNA crystal structure (24). Therefore, the Cid1 protein may use a substrate stabilization strategy closer to the one of the PAP enzymes. Altogether, it appears

76 6 Nucleic Acids Research, 2013

that Cid1 enzyme is highly specialized with residues critical for the nucleotide triphosphate association (H336), residues specialized for the RNA primer association (K144) and residues involved in the nucleotidyl transferase enzymatic cycle (β-trapdoor, D103, N165 or F332).
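The 20-fold affinity difference reported above can be made concrete with a short numerical sketch. It assumes a simple one-site binding isotherm with protein in large excess over the RNA probe, which is an illustrative simplification and not necessarily the fitting model used for Figure 3C. The two Kd values (1 and 20, in the same concentration units) are taken from the text; the function name and the protein concentrations scanned are hypothetical.

```python
# Minimal sketch: fraction of RNA probe bound under a one-site binding model.
# Assumption (not from the source): protein is in excess, so free ~ total protein.

def fraction_bound(protein_conc, kd):
    """Return the bound fraction of probe at a given protein concentration."""
    return protein_conc / (kd + protein_conc)

# Kd values quoted in the text for WT Cid1 (same concentration units for both).
KD_U15 = 1.0
KD_A15 = 20.0

# Hypothetical protein concentrations, in the same units as the Kd values.
for conc in (0.5, 1.0, 5.0, 20.0):
    bound_u = fraction_bound(conc, KD_U15)
    bound_a = fraction_bound(conc, KD_A15)
    print(f"protein={conc:>5}: U15 bound={bound_u:.2f}, A15 bound={bound_a:.2f}")
```

Under this model, half of the probe is bound when the protein concentration equals the Kd, so the A15 probe needs 20 times more protein than the U15 probe to reach the same occupancy.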

Polymerization activity of Cid1 mutants

To complete our RNA binding study, we measured the polymerization activities of the same set of mutants. As we observed significant differences between Cid1 affinity for a U-tail as compared with an A-tail, we looked at the polymerizing activity in the presence of either UTP or ATP. The N165A mutant is unable to add more than one nucleotide per substrate regardless of the provided nucleotide triphosphate (UTP or ATP) even though its RNA binding capacity is unimpaired, at least for the 15-mer primer (Figures 3C, 4A and B). If N165 residue is changed into a charged residue (N165D), the enzyme is able to add up to 3–4 uridines or up to 20 adenines to the 15-nt RNA template but is still almost unable to use the ApU substrate (Figure 4, mutant N165D). Overall, these experiments demonstrate the critical role of N165 residue in the polymerizing reaction. Being located at the beginning of helix 4, N165 is likely important during the swivel motion occurring throughout the catalytic cycle of the enzyme (21). The rotation of the catalytic domain around the helix 4 can open the space between the domains. We suggest that N165 residue is involved in pulling on the RNA primer after the addition of the UMP. On displacement of the RNA primer, the UTP binding cavity is emptied allowing a new UTP molecule to take its place, and a new cycle to start.

Another region of interest near the NRM loop is the β-trapdoor (22). Deletion of the β-trapdoor residues does not affect the overall PUP activity of the enzyme for either of the RNA primers as long as we use UTP as a substrate (Figure 4A and C). We can observe the typical biphasic profile with, on one hand, short products of 7–9 added nucleotides corresponding to a distributive activity and, on the other hand, highly elongated products that did not enter the gel (Figure 4A and B). However, when we look at activity of the Δ310–322 mutant protein in the presence of ATP, we do not observe any larger products suggesting that Cid1 without the β-trapdoor is almost exclusively a distributive PAP enzyme (Figure 4B and D). This is even more striking with the short substrate ApU (Figure 4D). Because these observed differences between PAP and PUP activities cannot be linked to a difference in the RNA binding properties of the truncated protein (Figure 3C; Supplementary Figure S2E and F), we

Figure 3. Characterization of U15 and A15 RNA binding properties of the WT and mutant Cid1 protein by EMSA. (A) EMSA showing the interaction of Cid1 with an A15 RNA. (B) Same experiments with a U15 RNA. In A and B, the individual reactions contain either no protein or the indicated amounts. (C) RNA binding curves and calculated Kd from quantified EMSA experiments performed with Cid1 WT, D160A, N165A and Δ310–322 mutants for both A15 and U15. Individual binding experiments were repeated at least three times and representative EMSA experiments are shown in Supplementary Figure S2. (D) Histogram representation of the interaction measured between various Cid1 mutants with either a U15 or an A15 RNA probe. Bound fractions correspond to the amount of formed complex and were normalized by the one observed between Cid1 WT protein and either the U15 or the A15 probe. Experiments were repeated at least three times and done in triplicate.


Figure 4. Poly(U) and poly(A) polymerization assays with two different RNA probes. (A) Polymerization assay in the presence of UTP with an A15 primer. Lane 1 shows the RNA primer after incubation without enzyme. The RNA after elongation with WT Cid1 (lanes 2–3), N165A (lanes 4–5), N165D (lanes 6–7), Δ310–322 deleted protein (lanes 8–9) and K144A mutants are loaded in the respective lanes. (B) Polymerization assay in the presence of ATP with an A15 as substrate. Similar mutants were used and reactions were loaded as in A. (C) Polymerization assay with UTP and the substrate ApU. Lane 1 shows the ApU substrate after incubation without enzyme. The polymerizing reaction in the presence of WT Cid1, N165A, N165D and Δ310–322 mutant proteins is shown in lanes 2–3, 4–5, 6–7 and 9–10, respectively. (D) Polymerization assays with ATP and an ApU substrate. Similar mutants were used and reactions were loaded as in C.

conclude that the β-trapdoor has an important role during the catalytic cycle. This may be to favor NTP exchange after each polymerization cycle by promoting domain movements (22).

Finally, we tested the polymerizing activity of the K144A mutant. This single-point mutant has a PUP activity comparable with the WT protein, but its PAP activity is drastically reduced as demonstrated by the almost complete absence of extended products (compare Figure 4A and B). As observed for the Δ310–322 deleted protein, its activity is now almost exclusively distributive when presented with ATP (Figure 4A and B). As the K144A mutant has an impaired A15 RNA association, we propose that the change in the polymerizing profile is linked with the poly(A) RNA binding capacities of the protein.

CONCLUSIONS

Our atomic models provide missing information regarding the association of RNAs with the uridylating enzyme


Cid1. We identify multiple determinants behind the RNA binding properties of Cid1 and decipher the roles played by several residues or protein fragments during Cid1's polymerization activity. When the 3'-end of the template RNA is not a uridine, Cid1 has a low RNA binding affinity and therefore a highly distributive activity reflecting its classification as a template-independent nucleotidyl transferase (8). Our binding tests show that on addition of 6–7 uridine nucleotides, the enzyme's affinity for the 3'-end of its template becomes at least 20 times higher. Following this switch in its overall affinity, Cid1's activity becomes highly processive. We show that residue K144 is involved in this switch of activity. The Cid1 catalytic cycle is complex and its completion critically depends on several key residues: (i) the β-trapdoor controls at least partially the UTP exchange after the reaction; (ii) N165 residue, which is hydrogen bonded to the 3'-penultimate nucleotide in our atomic model, may be involved in the displacement of the elongated RNA product after UMP addition. Finally, the overall catalytic cycle depends on the previously proposed domain movement around helix 4 and residue F332 to proceed (21).

Until now, the length of the characterized poly(U) tail due to PUP activities in vivo is generally short (between 1 and 5 nt) (4,5,37,38). We propose that Cid1 enzymatic activity is influenced by its higher affinity for the homouridine-containing tails. To prevent Cid1 from generating long U-tails that can trigger RNA degradation by exonucleases like Dis3L2 (9,39), we suggest that factors are used to control the local activity, like the targeting factor Lin28 (11,40).

Lastly, as shown by the differential use of particular protein residues like residue R139 or K144, TUT and PUP enzymes have evolved differently (19,21). It has recently been proposed that DSS1 protein, a uridine-specific exonuclease, plays a critical role during the guide RNA maturation process in trypanosome, a process where RET1 TUT enzyme is critical (41,42). As our present results shed new light on the properties of the polyuridylating enzyme Cid1, it will be exciting to see whether a similar switch exists in the RET1 enzyme (43–45).

ACCESSION NUMBERS

Atomic coordinates and structure factors have been deposited in the Protein Data Bank with accession codes 4NKT for the Cid1-UMPNPP complex and 4NKU for the Cid1-ApU complex.

SUPPLEMENTARY DATA

Supplementary Data are available at NAR Online.

ACKNOWLEDGEMENTS

The authors acknowledge the staff of ID14-4 at the ESRF for developing excellent beamlines and supporting optimal data collection. They also thank all the members of Stephane Thore Lab for their critical reading, helpful discussion and technical advice. They thank the University of Geneva for gratefully allowing the installation of the biostructural platform together with the generous financial support of the Boninchi foundation, the Schmidheiny foundation and the Swiss National Science Foundation R'equip grant (316030-128787).

FUNDING

Swiss National Science Foundation [31003A_140924 and 31003A_124909 to S.T.]. Funding for open access charge: Swiss National Science Foundation.

Conflict of interest statement. None declared.

REFERENCES

1. Eckmann,C.R., Rammelt,C. and Wahle,E. (2011) Control of poly(A) tail length. Wiley Interdiscip. Rev. RNA, 2, 348–361.
2. Schuster,G. and Stern,D. (2009) RNA polyadenylation and decay in mitochondria and chloroplasts. Prog. Mol. Biol. Transl. Sci., 85, 393–422.
3. Rissland,O.S. and Norbury,C.J. (2009) Decapping is preceded by 3' uridylation in a novel pathway of bulk mRNA turnover. Nat. Struct. Mol. Biol., 16, 616–623.
4. Sement,F.M., Ferrier,E., Zuber,H., Merret,R., Alioua,M., Deragon,J.M., Bousquet-Antonelli,C., Lange,H. and Gagliardi,D. (2013) Uridylation prevents 3' trimming of oligoadenylated mRNAs. Nucleic Acids Res., 41, 7115–7127.
5. Lapointe,C.P. and Wickens,M. (2013) The nucleic acid-binding domain and translational repression activity of a Xenopus terminal uridylyl transferase. J. Biol. Chem., 288, 20723–20733.
6. Schmidt,M.J., West,S. and Norbury,C.J. (2011) The human cytoplasmic RNA terminal U-transferase ZCCHC11 targets histone mRNAs for degradation. RNA, 17, 39–44.
7. Kwak,J.E. and Wickens,M. (2007) A family of poly(U) polymerases. RNA, 13, 860–867.
8. Martin,G. and Keller,W. (2007) RNA-specific ribonucleotidyl transferases. RNA, 13, 1834–1849.
9. Malecki,M., Viegas,S.C., Carneiro,T., Golik,P., Dressaire,C., Ferreira,M.G. and Arraiano,C.M. (2013) The exoribonuclease Dis3L2 defines a novel eukaryotic RNA degradation pathway. EMBO J., 32, 1842–1854.
10. Mullen,T.E. and Marzluff,W.F. (2008) Degradation of histone mRNA requires oligouridylation followed by decapping and simultaneous degradation of the mRNA both 5' to 3' and 3' to 5'. Genes Dev., 22, 50–65.
11. Heo,I., Joo,C., Kim,Y.K., Ha,M., Yoon,M.J., Cho,J., Yeom,K.H., Han,J. and Kim,V.N. (2009) TUT4 in concert with Lin28 suppresses microRNA biogenesis through pre-microRNA uridylation. Cell, 138, 696–708.
12. Chen,Y., Sinha,K., Perumal,K. and Reddy,R. (2000) Effect of 3' terminal adenylic acid residue on the uridylation of human small RNAs in vitro and in frog oocytes. RNA, 6, 1277–1288.
13. Trippe,R., Sandrock,B. and Benecke,B.J. (1998) A highly specific terminal uridylyl transferase modifies the 3'-end of U6 small nuclear RNA. Nucleic Acids Res., 26, 3119–3126.
14. Trippe,R., Guschina,E., Hossbach,M., Urlaub,H., Luhrmann,R. and Benecke,B.J. (2006) Identification, cloning, and functional analysis of the human U6 snRNA-specific terminal uridylyl transferase. RNA, 12, 1494–1504.
15. Li,J., Yang,Z., Yu,B., Liu,J. and Chen,X. (2005) Methylation protects miRNAs and siRNAs from a 3'-end uridylation activity in Arabidopsis. Curr. Biol., 15, 1501–1507.
16. Shen,B. and Goodman,H.M. (2004) Uridine addition after microRNA-directed cleavage. Science, 306, 997.
17. Zhao,Y.Y., Yu,Y., Zhai,J.X., Ramachandran,V., Dinh,T.T., Meyers,B.C., Mo,B.X. and Chen,X.M. (2012) The Arabidopsis nucleotidyl transferase HESO1 uridylates unmethylated small RNAs to trigger their degradation. Curr. Biol., 22, 689–694.
18. Deng,J., Ernst,N.L., Turley,S., Stuart,K.D. and Hol,W.G. (2005) Structural basis for UTP specificity of RNA editing TUTases from Trypanosoma brucei. EMBO J., 24, 4007–4017.
19. Stagno,J., Aphasizheva,I., Rosengarth,A., Luecke,H. and Aphasizhev,R. (2007) UTP-bound and Apo structures of a minimal RNA uridylyltransferase. J. Mol. Biol., 366, 882–899.
20. Stagno,J., Aphasizheva,I., Bruystens,J., Luecke,H. and Aphasizhev,R. (2010) Structure of the mitochondrial editosome-like complex associated TUTase 1 reveals divergent mechanisms of UTP selection and domain organization. J. Mol. Biol., 399, 464–475.
21. Munoz-Tello,P., Gabus,C. and Thore,S. (2012) Functional implications from the Cid1 poly(U) polymerase crystal structure. Structure, 20, 977–986.
22. Yates,L.A., Fleurdepine,S., Rissland,O.S., De Colibus,L., Harlos,K., Norbury,C.J. and Gilbert,R.J. (2012) Structural basis for the activity of a cytoplasmic RNA terminal uridylyl transferase. Nat. Struct. Mol. Biol., 19, 782–787.
23. Lunde,B.M., Magler,I. and Meinhart,A. (2012) Crystal structures of the Cid1 poly (U) polymerase reveal the mechanism for UTP selectivity. Nucleic Acids Res., 40, 9815–9824.
24. Balbo,P.B. and Bohm,A. (2007) Mechanism of poly(A) polymerase: structure of the enzyme-MgATP-RNA ternary complex and kinetic analysis. Structure, 15, 1117–1131.
25. Wang,S.W., Toda,T., MacCallum,R., Harris,A.L. and Norbury,C. (2000) Cid1, a fission yeast protein required for S-M checkpoint control when DNA polymerase delta or epsilon is inactivated. Mol. Cell. Biol., 20, 3234–3244.
26. Rissland,O.S., Mikulasova,A. and Norbury,C.J. (2007) Efficient RNA polyuridylation by noncanonical poly(A) polymerases. Mol. Cell. Biol., 27, 3612–3624.
27. Stagno,J., Aphasizheva,I., Aphasizhev,R. and Luecke,H. (2007) Dual role of the RNA substrate in selectivity and catalysis by terminal uridylyl transferases. Proc. Natl Acad. Sci. USA, 104, 14634–14639.
28. Ringpis,G.E., Stagno,J. and Aphasizhev,R. (2010) Mechanism of U-insertion RNA editing in trypanosome mitochondria: characterization of RET2 functional domains by mutational analysis. J. Mol. Biol., 399, 696–706.
29. Kabsch,W. (2010) Xds. Acta Crystallogr. D. Biol. Crystallogr., 66, 125–132.
30. McCoy,A.J., Grosse-Kunstleve,R.W., Adams,P.D., Winn,M.D., Storoni,L.C. and Read,R.J. (2007) Phaser crystallographic software. J. Appl. Crystallogr., 40, 658–674.
31. Adams,P.D., Afonine,P.V., Bunkoczi,G., Chen,V.B., Davis,I.W., Echols,N., Headd,J.J., Hung,L.W., Kapral,G.J., Grosse-Kunstleve,R.W. et al. (2010) PHENIX: a comprehensive Python-based system for macromolecular structure solution. Acta Crystallogr. D. Biol. Crystallogr., 66, 213–221.
32. Kleywegt,G.J. (2007) Crystallographic refinement of ligand complexes. Acta Crystallogr. D. Biol. Crystallogr., 63, 94–100.
33. Emsley,P. and Cowtan,K. (2004) Coot: model-building tools for molecular graphics. Acta Crystallogr. D. Biol. Crystallogr., 60, 2126–2132.
34. Stevenson,A.L. and Norbury,C.J. (2006) The Cid1 family of non-canonical poly(A) polymerases. Yeast, 23, 991–1000.
35. DeLano,W.L. (2002) The PyMol User's Manual. DeLano Scientific, San Carlos, CA, USA.
36. Balbo,P.B., Toth,J. and Bohm,A. (2007) X-ray crystallographic and steady state fluorescence characterization of the protein dynamics of yeast polyadenylate polymerase. J. Mol. Biol., 366, 1401–1415.
37. Heo,I., Ha,M., Lim,J., Yoon,M.J., Park,J.E., Kwon,S.C., Chang,H. and Kim,V.N. (2012) Mono-Uridylation of Pre-MicroRNA as a key step in the biogenesis of group II let-7 MicroRNAs. Cell, 151, 521–532.
38. Knouf,E.C., Wyman,S.K. and Tewari,M. (2013) The human TUT1 nucleotidyl transferase as a global regulator of microRNA abundance. PloS One, 8, e69630.
39. Ustianenko,D., Hrossova,D., Potesil,D., Chalupnikova,K., Hrazdilova,K., Pachernik,J., Cetkovska,K., Uldrijan,S., Zdrahal,Z. and Vanacova,S. (2013) Mammalian DIS3L2 exoribonuclease targets the uridylated precursors of let-7 miRNAs. RNA, 19, 1632–1638.
40. Heo,I., Joo,C., Cho,J., Ha,M., Han,J. and Kim,V.N. (2008) Lin28 mediates the terminal uridylation of let-7 precursor MicroRNA. Mol. Cell, 32, 276–284.
41. Mattiacio,J.L. and Read,L.K. (2008) Roles for TbDSS-1 in RNA surveillance and decay of maturation by-products from the 12S rRNA locus. Nucleic Acids Res., 36, 319–329.
42. Penschow,J.L., Sleve,D.A., Ryan,C.M. and Read,L.K. (2004) TbDSS-1, an essential Trypanosoma brucei exoribonuclease homolog that has pleiotropic effects on mitochondrial RNA metabolism. Eukaryot. Cell, 3, 1206–1216.
43. Aphasizhev,R., Aphasizheva,I. and Simpson,L. (2003) A tale of two TUTases. Proc. Natl Acad. Sci. USA, 100, 10617–10622.
44. Aphasizheva,I., Ringpis,G.E., Weng,J., Gershon,P.D., Lathrop,R.H. and Aphasizhev,R. (2009) Novel TUTase associates with an editosome-like complex in mitochondria of Trypanosoma brucei. RNA, 15, 1322–1337.
45. Panigrahi,A.K., Schnaufer,A., Carmean,N., Igo,R.P. Jr, Gygi,S.P., Ernst,N.L., Palazzo,S.S., Weston,D.S., Aebersold,R., Salavati,R. et al. (2001) Four related proteins of the Trypanosoma brucei RNA editing complex. Mol. Cell. Biol., 21, 6833–6840.

Supplementary Information

A critical switch in the enzymatic properties of the Cid1 protein deciphered from its product-bound crystal structure

Paola Munoz-Tello1, Caroline Gabus1 and Stéphane Thore1, *

1 Department of Molecular Biology, University of Geneva, Geneva, 1211, Switzerland

* To whom correspondence should be addressed. Tel: 4122 379 6127; Fax: 4122 379 6868; Email: [email protected]

Supplementary Tables

Table SI. Primers for mutagenesis

Mutant      Primer forward (5'→3')                 Primer reverse (5'→3')
R139A       TTTTACAAAGGGCAGCAATTCCCATTATC          GATAATGGGAATTGCTGCCCTTTGTAAAA
K144A       CAAGAATTCCCATTATCGCATTAACATCTGATACG    CGTATCAGATGTTAATGCGATAATGGGAATTCTTG
D160A       CGTTTCAATGTGCTATTGGATTTAAC             GTTAAATCCAATAGCACATTGAAACG
N165A       GATATTGGATTTAACACGCGTCTAGCTATTC        GAATAGCTAGACGCGTGTTAAATCCAATATC
F332D       GCGATTGAAGATCCTGACGAGATTTCACATAATG     CATTATGTGAAATCTCGTCAGGATCTTCAATCGC
H336A       CTTTCGAGATTTCAGCTAATGTGGGTAGG          CCTACCCACATTAGCTGAAATCTCGAAAG
Δ310-322    GGATGGACTTCAGCTGACAGGTATATTCTTGCG      CGCAAGAATATACCTGTCAGCTGAAGTCCATCC
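As a consistency check on the primer table, the sketch below tests whether each reverse primer is the exact reverse complement of its forward partner, which is the usual layout for site-directed mutagenesis primer pairs. The sequences are the table entries with the wrapped cell fragments joined end to end; that joining, and the helper names, are assumptions of this sketch and not part of the original protocol.

```python
# Minimal sketch: check that each reverse primer is the reverse complement
# of its forward primer. Sequences are joined from the wrapped table cells.

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]

# mutant name -> (forward primer, reverse primer), both written 5'->3'
PRIMERS = {
    "R139A": ("TTTTACAAAGGGCAGCAATTCCCATTATC",
              "GATAATGGGAATTGCTGCCCTTTGTAAAA"),
    "K144A": ("CAAGAATTCCCATTATCGCATTAACATCTGATACG",
              "CGTATCAGATGTTAATGCGATAATGGGAATTCTTG"),
    "D160A": ("CGTTTCAATGTGCTATTGGATTTAAC",
              "GTTAAATCCAATAGCACATTGAAACG"),
    "N165A": ("GATATTGGATTTAACACGCGTCTAGCTATTC",
              "GAATAGCTAGACGCGTGTTAAATCCAATATC"),
    "F332D": ("GCGATTGAAGATCCTGACGAGATTTCACATAATG",
              "CATTATGTGAAATCTCGTCAGGATCTTCAATCGC"),
    "H336A": ("CTTTCGAGATTTCAGCTAATGTGGGTAGG",
              "CCTACCCACATTAGCTGAAATCTCGAAAG"),
    "d310-322": ("GGATGGACTTCAGCTGACAGGTATATTCTTGCG",
                 "CGCAAGAATATACCTGTCAGCTGAAGTCCATCC"),
}

def mismatched_pairs(primers):
    """List the mutants whose reverse primer is not the reverse complement."""
    return [name for name, (fwd, rev) in primers.items()
            if reverse_complement(fwd) != rev]

print("Pairs that are not exact reverse complements:", mismatched_pairs(PRIMERS))
```

A non-empty list would point to a transcription error in one of the two columns for that mutant.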

Supplementary Figures and Legends

Figure S1: View of the Fo – Fc difference density maps of the UMPNPP and ApU compounds bound to Cid1.

(A) Fo – Fc difference density map of the UMPNPP nucleotide bound to Cid1. The Cid1 protein is shown in lime (CD domain) and salmon pink (CAT domain) with helix 4 in yellow. The NRM loop residues are colored in light magenta, the aspartate triad residues are colored in cyan and the major residues interacting with the ligands are shown in slate. Side chains and ligands are shown using a ball and stick representation (carbon as indicated before, nitrogen in blue, oxygen in red). The difference map was calculated using the refined protein as a model and is shown as a gray mesh (contoured at 2.5σ). (B) Fo – Fc difference density map of the minimal product ApU bound to Cid1. The difference map was calculated using the refined protein as a model and is shown as a gray mesh (contoured at 2.5σ). Side chains and ligand are shown as in panel A.

Figure S2: Characterization of the RNA binding properties of the WT and mutant Cid1 proteins by electrophoretic mobility shift assay (EMSA).

(A) EMSA showing the association between D160A mutant and an A15 probe. (B) EMSA showing the interaction between D160A mutant and U15 RNA. (C) EMSA showing the binding between N165A mutant and A15 RNA. (D) EMSA illustrating the binding between N165A mutant and a U15 probe. (E) EMSA displaying the interaction of Δ310-322 mutant and A15 RNA. (F) EMSA showing the association of Δ310-322 mutant and a U15 probe. Individual reactions in all the panels contained no protein or the indicated amounts.

1.11.7 Materials and Methods

1.11.7.1 Media Compositions

1.11.7.1.1 Luria Broth (LB) Media

For 1L of LB media: 10g tryptone, 5g yeast extract, 10g NaCl and autoclaved deionized water to make 1L, supplemented with the appropriate antibiotics. Autoclave for 20min at 120°C.

1.11.7.1.2 Terrific Broth (TB) Media

For 1L of TB media: 12g tryptone, 24g yeast extract, 4ml of glycerol and autoclaved water to make 1L after addition of 200ml of K salts and the appropriate antibiotics.

For 2L of K salts: 250.8g K2HPO4, 46.2g KH2PO4 and water to make 2L. Autoclave for 20min at 120°C.

1.11.7.1.3 Autoinduction (AI) Media

Several stock solutions are needed to obtain the final AI media.

10X ZY (1L):
100g Tryptone
50g Yeast extract
qsp 1L Water

1M MgSO4 (100ml):
24.65g MgSO4·7H2O
qsp 100ml Water

50X 5052 (1L):
250g Glycerol
25g Glucose
100g Lactose
qsp 1L Water

20X NPS (2L):
132g (NH4)2SO4
272g KH2PO4
284g Na2HPO4
qsp 2L Water

Autoclave for 20min at 120°C.

For 1L of AI media add 100ml of 10X ZY, 1ml of 1M MgSO4, 20ml of 50X 5052, 50ml of 20X NPS supplemented with the appropriate antibiotics and autoclaved water to make 1L.
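The arithmetic behind this recipe can be checked with a short sketch: the molarity of the MgSO4 stock from the weighed mass, and the final strength of each component after dilution into 1L. The molar mass of MgSO4·7H2O (246.47 g/mol) is a standard value supplied here, and the function names are hypothetical; the masses and volumes are those given in the recipe.

```python
# Minimal sketch: stock molarity and final concentrations for the AI medium.

def molarity(mass_g, molar_mass_g_per_mol, volume_l):
    """Molar concentration of a solution prepared from a weighed solid."""
    return mass_g / molar_mass_g_per_mol / volume_l

def final_concentration(stock_conc, stock_volume_ml, final_volume_ml):
    """Dilution rule C1*V1 = C2*V2, solved for the final concentration C2."""
    return stock_conc * stock_volume_ml / final_volume_ml

MGSO4_7H2O_MOLAR_MASS = 246.47  # g/mol, standard value (not from the source)

# 24.65 g of MgSO4.7H2O made up to 100 ml should give a ~1 M stock.
mgso4_stock_molar = molarity(24.65, MGSO4_7H2O_MOLAR_MASS, 0.100)

FINAL_VOLUME_ML = 1000.0
final_strengths = {
    "ZY (X)": final_concentration(10.0, 100.0, FINAL_VOLUME_ML),
    "5052 (X)": final_concentration(50.0, 20.0, FINAL_VOLUME_ML),
    "NPS (X)": final_concentration(20.0, 50.0, FINAL_VOLUME_ML),
    "MgSO4 (M)": final_concentration(mgso4_stock_molar, 1.0, FINAL_VOLUME_ML),
}

print(f"MgSO4 stock: {mgso4_stock_molar:.3f} M")
for name, value in final_strengths.items():
    print(f"{name}: {value:.4g}")
```

With these volumes every concentrated stock ends up at 1X, and MgSO4 at about 1 mM.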

1.11.7.1.4 Seleno-Methionine (SeMet) Derivative Media

Several media are needed for SeMet-derivative protein expression.

Medium A (1L):
100ml M9 medium (10X)
10ml Trace elements solution (100X)
20ml 20% (w/v) Glucose
1ml 1M MgSO4
0.3ml 1M CaCl2
1ml Biotin (1mg/ml)
1ml Thiamin (1mg/ml)
Appropriate antibiotic(s)

100X Trace elements solution (1L):
5g EDTA
0.83g FeCl3·6H2O
84mg ZnCl2
13mg CuCl2·2H2O
10mg CoCl2·6H2O
10mg H3BO3
1.6mg MnCl2·6H2O

10X M9 medium (1L):
80g Na2HPO4
40g KH2PO4
5g NaCl
5g NH4Cl

For all the media, add water to make 1L. For medium A, use autoclaved water. Autoclave the M9 medium for 20min at 120°C. The trace elements solution needs to be filtered with a 0.22μm filter.

1.11.7.2 Protein Expression and Batch Purification Tests

Small-scale cultures of E. coli BL21-Star or Rosetta 2 cells (Novagen), previously transformed with either the Cid1L or the Cid1S construct, were grown in LB or TB media at 37°C for 6 hours. The cells were then either induced overnight at 18°C with 0.1mM IPTG or diluted 1/50 in AI media for overnight expression at 30°C. Expression of the recombinant proteins was analyzed by SDS-PAGE. Cells expressing Cid1L or Cid1S were harvested by centrifugation at 4 000 rpm and resuspended in 50mM Phosphate buffer pH 7.5 containing 300mM NaCl, 20mM imidazole, 0.1% Triton X-100, DNase I (1μg/ml), Lysozyme (1μg/ml), 5mM β-mercaptoethanol (β-MT), and a cocktail of protease inhibitors (1mM PhenylMethylSulfonyl Fluoride, 1μg/ml of leupeptin and 2μg/ml of pepstatin). Cells were lysed using an Emulsiflex system (AVESTIN) and the lysate was cleared by centrifugation at 15 000 rpm for 30 minutes at 4°C. The soluble fraction was subjected to an initial affinity purification using 1ml slurry of Ni-NTA beads (GE Healthcare). The protein was eluted from the beads with a buffer containing 250mM imidazole and desalted against 50mM Phosphate buffer pH 7.5, 300mM NaCl, 20mM imidazole, and 5mM β-mercaptoethanol. The TEV protease was added at a ratio of 1/50 (TEV/protein) and the sample was incubated at 16°C overnight. The total and soluble fractions, the elution peak and the sample after protease treatment were analyzed by SDS-PAGE.

1.11.7.3 Seleno-Methionine Derivative Protein Expression, Purification and Crystallization

Cid1L was overexpressed in E. coli B834 cells, which are auxotrophic for methionine. Cells were grown in LB media at 37°C for 8 hours, then diluted 1/50 and grown overnight at 37°C. The overnight culture was diluted 1/100 and grown for 6 hours, centrifuged at 4 000 rpm for 15 min at room temperature (RT) and resuspended in 1L of Medium A (without methionine), described previously, for overnight growth at 37°C. The next morning, the cells were harvested by centrifugation at 4 000 rpm for 15 min at RT, resuspended in 1L of Medium A supplemented with 50mg of seleno-methionine and incubated at 37°C for 30 min, followed by a 4-hour induction at 37°C with 1mM IPTG. Induced cells were harvested by centrifugation at 4 000 rpm for 15 min at 4°C and resuspended in a lysis buffer containing 50mM Phosphate buffer pH 7.5, 300mM NaCl, 0.1% Triton X-100, 20mM Imidazole, DNase I (1μg/ml), Lysozyme (1μg/ml), 5mM β-MT, and a cocktail of protease inhibitors (1mM PhenylMethylSulfonyl Fluoride, 1μg/ml of leupeptin and 2μg/ml of pepstatin). Cells were lysed using the Emulsiflex system (AVESTIN) and the lysate was cleared by centrifugation at 15 000 rpm for 30 min at 4°C. SeMet-derivative Cid1L was then purified and crystallized as for Cid1 WT protein at 9.5mg/ml. We also tested a second crystallization technique called micro-seeding, which consists of crushing the obtained crystals in order to seed a new set of crystallization trials. At least two or three rounds of micro-seeding were needed to obtain larger crystals at 18°C in a reservoir solution containing 0.1M HEPES pH 7.0, 20% PEG 6000, 0.2M Ammonium Chloride.

1.12 Structural Studies of Mitochondrial RNA Processing Complexes in T. brucei

1.12.1 gRNA Maturation Complexes or RET1 TUTase DSS1 Exonuclease Complex (RDS complex)

1.12.1.1 RDS Complex Expression and Purification

Using RET1 TUTase pull-down assays and mass spectrometry analysis, Ruslan Aphasizhev's group in Boston identified four proteins implicated in gRNA biogenesis (unpublished data). The first one is DSS1, an 80 kDa 3’ exonuclease known to be an essential protein influencing the RNA editing reaction and the RNA stability in trypanosomal mitochondria (Penschow et al. 2004). The other three proteins, named RDS1, RDS2 and RDS3 (standing for RET1 TUTase DSS1 complex protein), are new proteins of 189, 180 and 123 kDa, respectively, without any discernible motif. Together with RET1 TUTase, these four proteins make up the five-protein RDS complex in trypanosomes. RNAi tests on RET1 TUTase as well as DSS1 exoribonuclease show accumulation of pre-gRNAs in the mitochondria, suggesting that this RDS complex might be necessary for gRNA biogenesis (unpublished data). In order to better understand the function of this newly identified RDS complex, we started a collaborative project with their laboratory aiming to solve the crystal structure of the factors, either as sub-complexes or single subunits. It is noteworthy that RET1 TUTase is also involved in the uridylation of mitochondrially-encoded mRNAs as part of a polyadenylation/uridylation complex, which will be discussed in section 4.2.2 (Aphasizheva and Aphasizhev 2010; Aphasizheva et al. 2011).

We received several clones from our collaborators ready for E. coli protein expression trials of RET1, DSS1, RDS1, RDS2 and RDS3 full-length (FL) proteins. These plasmids contained either one single protein tagged with an N- or C-terminal His-tag or two ORFs with one protein tagged and the other one untagged for co-expression and co-purification assays. All the expression and purification tests performed with the different plasmids are summarized in Table 3. Unfortunately, in our co-expression trials we could only express one of the proteins, whatever combination we tried, and obtained very poor expression levels (Table 3). Since the size of these proteins is rather high (more than 100kDa each), we decided to try to express and purify the proteins separately. We would then reconstitute the complex in vitro from these individual subunits. RDS1, RDS2 and RDS3 expressions and purifications were still unsuccessful, most probably due to their large size. For DSS1 protein, we obtained very good expression levels but the protein was not soluble (Fig. 23A). We tried to solubilize it by adding an N-terminal GST tag and testing different lysis buffers, but without success (Table 4, Fig. 23A). The RET1 TUTase construct had good expression levels and small-scale purification seemed promising (Table 3; Fig. 23B).

On RET1 small-scale elutions, we performed a size exclusion chromatography in order to analyze the oligomeric state of the enzyme, as well as mass spectrometry analysis in order to confirm the presence of RET1. Mass spectrometry analysis revealed that both of the upper bands seen on the gel of the elution correspond to RET1 protein (Fig. 24). We realized that these two protein forms are found in several different oligomeric states and that they are indiscernible (Fig. 25A and 25B). We then focused on optimizing RET1 enzyme purification to remove these different populations and eventually set up crystallization screens.

Figure 23: Batch purification of DSS1 and RET1 FL enzymes. (A) SDS-PAGE gels of the batch purifications and solubility tests performed on DSS1-expressing cells. (B) SDS-PAGE gel showing the different batch purification steps of RET1 FL enzyme. In all the panels, the arrow indicates the protein of interest and T is Total fraction, S is Soluble fraction, FT is Flow-Through, W is Wash and E1, E2 and E3 are the three eluted fractions.

Expression Ni-NTA Protein(s) LB or TB 18° TEV Size exclusion AI 30°C O/N Soluble affinity and tag(s) O/N cleavage chromatography purification R2 BL21 R2 BL21 ★ ★ NO for R2/AI: complex DSS1- Only RET BL21★ aggregated, RET1 alone 6xHis / and few NO FEW FEW FEW YES for - multimeric or in RET1 DSS1 for LB R2 complex with protein untagged expression RET1 ~65kDa 6xHis- RDS2 / Only Only - - NO RET1 RDS2 RDS2 untagged RET1 RET1 alone multimeric untagged Only Only Only Only Only Purifiable or in complex with / RDS1- RET1 RET1 RET1 RET1 RET1 without tag protein ~65kDa 6xHis 6xHis-N- DSS1 / Only NO RET1 DSS1 untagged Purifiable RET1 aggregated or in RET1 FEW YES FEW FEW YES w/o tag: 2 complex with protein untagged sizes ~65kDa DSS1- NO NO NO NO NO 6xHis Few binds 6xHis-N- YES YES YES column or Several oligomers DSS1 beads 6xHis-N- FEW FEW FEW FEW FEW NO NO NO RDS3 6xHis-N- FEW FEW FEW FEW NO NO NO NO RDS2 RDS1- NO 6xHis GST-8xHis- RET1L and YES NO DSS1 RET1L and GST-8xHis- YES NO DSS1

Table 3: Expression and purification trials for the different proteins forming the RDS complex. R2 corresponds to Rosetta 2 E. coli strain and N is the TEV cleavage site.

Figure 24: 10% SDS-PAGE gel used for mass spectrometry analysis of RET1 FL Ni-NTA purification elutions. The first two bands correspond to RET1 enzyme whereas the third band corresponds to a contaminant.

Buffer 1 Buffer 2 Buffer 3 Buffer 4 Buffer 5 Buffer 6 50mM 50mM Tris 50mM Tris 50mM Tris 50mM Tris pH 50mM Tris pH Phosphate pH 8.0 pH 8.0 pH 8.0 8.0 8.0 buffer pH 7.5 300mM NaCl 1M NaCl 1M NaCl 1M NaCl 1M NaCl 1M NaCl 20mM 20mM 20mM 20mM 20mM 20mM imidazole imidazole imidazole imidazole imidazole imidazole 5mM β-MT 5mM β-MT 5mM β-MT 5mM β-MT 5mM β-MT 5mM β-MT 0.1% Triton X- 0.015mM 2mM 0.1% Triton X- 0.015mM 2mM CHAPS 100 DDM CHAPS 100 DDM 5% Glycerol 5% Glycerol

Table 4: DSS1 solubility test buffers.

Figure 25: Gel Filtration chromatography analysis of RET1 FL after Ni-NTA purification elutions. (A) Gel Filtration chromatogram. (B) 10% SDS-PAGE gel analysis of the different gel filtration peaks. Bgf, sample before gel filtration and Agg, aggregates.

1.12.1.2 RET1 Terminal Uridyl Transferase

Based on secondary structure predictions, we designed three different constructs of RET1 TUTase, named RET1L, RET1S and RET1S2, to improve its behavior during the various purification steps (Fig. 26). We cloned them into our pST0 vector containing a GST-8xHis tag, which contains a TEV cleavage site for tag removal (Fig. 37). We then performed expression tests with RET1L and RET1S (Table 5) followed by small-scale Ni-NTA affinity purifications (Fig. 27). RET1L being poorly expressed (Fig. 27A), we continued with a large-scale purification of the RET1S construct. As a first trial, we used the same purification protocol as for the Cid1 enzyme: i) Day 1: Ni-NTA affinity purification, followed by desalting to remove imidazole and O/N TEV cleavage; ii) Day 2: reloading on the Ni-NTA column to remove the tag and contaminants, followed by a size exclusion chromatography step. Unfortunately, during the reloading process we realized that RET1S protein has some affinity for Ni2+ ions, since it was able to bind the Ni-NTA column without a tag. We therefore changed our purification method, using a GST column for affinity purification instead, and optimized it as shown in the purification scheme in materials and methods section 4.2.3.3 (Fig. 38). In particular, we had to add a heparin chromatographic step in order to remove the nucleic acid contamination of our preparation (Fig. 28). It is noteworthy that, after TEV cleavage, this heparin step made a second reloading onto the GST column unnecessary: the GST tag remained in the 100mM NaCl flow-through (FT) of the purification and the uncleaved RET1S protein eluted in the 1M NaCl wash. The untagged version of RET1S eluted from the heparin column with few contaminants at ~480mM NaCl and was directly injected into a Superdex 200 gel filtration column, where it eluted as a dimeric protein (Velution=65ml). The final SDS-PAGE gel summarizing the different purification steps is shown in figure 29 (Fig. 29A). From our large-scale purifications, we obtained 5mg of protein out of 10g of cell pellet.

In order to assess the activity of the C-terminally truncated RET1S construct, we performed polyuridylation activity assays at 37°C with a 17-mer unlabeled RNA supplemented with ATP or UTP, MgCl2 and equimolar amounts of the tested proteins. We used Cid1 enzyme as a positive control for polyuridylation activity. The activity test confirmed that the RET1S construct was active and uses only UTP as a substrate, unlike Cid1, which also uses ATP (Fig. 29B).

We tested over 1500 crystallization conditions at 18°C and 4°C at a protein concentration of 3.5 mg/ml. We could not go higher in protein concentration because the RET1S construct was very unstable and heavily precipitated upon concentration. We crystallized the protein at 18°C in a reservoir solution containing 25% of Sokalan HP56, 0.2M of ammonium acetate and 0.1M HEPES-NaOH pH 7.0 but failed to optimize the condition (Fig. 29C). We tested these small needles without obtaining any diffraction.

Figure 26: RET1 enzyme constructs.

Media   E. coli strain   Temperature   [IPTG]   Time   Expression   Solubility
AI      Rosetta 2        30°C          -        O/N    Few          Soluble
AI      BL21★            30°C          -        O/N    Yes          Soluble
LB      Rosetta 2        18°C          0.1mM    O/N    Few          Soluble
LB      BL21★            18°C          0.1mM    O/N    Yes          Soluble
LB      BL21★            37°C          1mM      4h     Yes          Insoluble
TB      Rosetta 2        18°C          0.1mM    O/N    Few          Soluble
TB      BL21★            18°C          0.1mM    O/N    Yes          Soluble
TB      BL21★            37°C          1mM      4h     Yes          Insoluble

Table 5: RET1 expression tests. In bold are the best expression conditions for all the RET1 constructs.
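The expression tests of Table 5 can be screened programmatically. The following is a minimal sketch that encodes the rows of Table 5 and keeps the conditions that are both well expressed and soluble. The selection rule (Expression "Yes" and Solubility "Soluble") is an assumption about what makes a condition suitable, not a criterion stated in the text, and the names `Trial`, `TRIALS` and `usable_conditions` are hypothetical.

```python
from typing import List, NamedTuple, Optional


class Trial(NamedTuple):
    """One row of Table 5 (RET1 expression tests)."""
    media: str
    strain: str
    temperature_c: int
    iptg_mm: Optional[float]  # None for auto-induction (AI) medium
    time: str
    expression: str
    solubility: str


# Rows transcribed from Table 5; "BL21-Star" stands for the BL21 Star strain.
TRIALS = [
    Trial("AI", "Rosetta 2", 30, None, "O/N", "Few", "Soluble"),
    Trial("AI", "BL21-Star", 30, None, "O/N", "Yes", "Soluble"),
    Trial("LB", "Rosetta 2", 18, 0.1, "O/N", "Few", "Soluble"),
    Trial("LB", "BL21-Star", 18, 0.1, "O/N", "Yes", "Soluble"),
    Trial("LB", "BL21-Star", 37, 1.0, "4h", "Yes", "Insoluble"),
    Trial("TB", "Rosetta 2", 18, 0.1, "O/N", "Few", "Soluble"),
    Trial("TB", "BL21-Star", 18, 0.1, "O/N", "Yes", "Soluble"),
    Trial("TB", "BL21-Star", 37, 1.0, "4h", "Yes", "Insoluble"),
]


def usable_conditions(trials: List[Trial]) -> List[Trial]:
    """Keep trials that are well expressed AND soluble (assumed criterion)."""
    return [t for t in trials
            if t.expression == "Yes" and t.solubility == "Soluble"]


if __name__ == "__main__":
    for trial in usable_conditions(TRIALS):
        print(trial.media, trial.strain, f"{trial.temperature_c}°C", trial.time)
```

Under this assumed criterion, only low-temperature or auto-induction cultures of the BL21 Star strain are retained, which is in line with the large-scale protocol described in the materials and methods (BL21-Star cells, overnight induction at 18°C).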


Figure 27: RET1L and RET1S small-scale purifications. (A) RET1L 10% SDS-PAGE gel showing the different batch purification steps. (B) RET1S 10% SDS-PAGE gel showing the different batch purification steps. In all the panels, the arrow indicates the protein of interest and T is Total fraction, S is Soluble fraction, FT is Flow-Through, W is Wash and E1, E2 and E3 are the three eluted fractions. Ni, non-induced; I, induced.

Figure 28: 1% agarose gel showing the nucleic acids contamination of the RET1 eluted sample. Bef Hep, sample before heparin purification; Aft Hep, sample after heparin purification.


Figure 29: RET1S purification, polymerization activity assay and crystallization. (A) 10% SDS-PAGE gel showing the different purification steps and gel filtration chromatogram showing the monomeric state of RET1S, the aggregates and the GST tag. Tot, total fraction; Sol, soluble fraction; FT, flow through; BT, before TEV; AT, after TEV; He, heparin step; Bgf, Before GF; Agf, after GF; Prot, pure protein. (B) Poly(U) and poly(A) polymerization assay. 181μM of a 17-mer RNA were incubated for 40 min at 37°C with either RET1S or Cid1 supplemented with 5mM MgCl2 and 10mM of UTP or ATP. The smeared signal observed corresponds to the poly(U)-containing product of the reaction. Ct, RNA alone with UTP. Prot, protein alone. (C) RET1S small thin needle-like crystals.

Finally, we decided to test our RET1S2 construct for large-scale expression and purification using the optimized purification method of RET1S shown in figure 38 of materials and methods section 4.2.3.3 (Fig. 38). The protein was well expressed and soluble, and eluted as a monomer from the Superdex 75 column (Velution=56ml; Fig. 30A). We checked the polyuridylating activity of RET1S2 and confirmed the activity of this construct (Fig. 30B). As for RET1S, we performed crystallization screens at 18°C and 4°C at 4mg/ml. Two crystallization hits were revealed and optimized using the sitting drop vapor diffusion technique (Fig. 31A and 31B). The first hit gave very small sea urchins, which, after several rounds of optimization and the use of the pseudo-batch technique (previously described for Cid1), formed needle-like RET1S2 crystals. These crystals grew to full size in a week in a reservoir solution containing 0.1M Bicine/Tris pH 8.5, 25-30% Glycerol/PEG 4000, 0.9M NPS (0.3M sodium nitrate, 0.3M disodium hydrogen phosphate and 0.3M ammonium sulfate). The second RET1S2 initial hit contained 0.1M Bicine pH 9.0, 20% polyacrylate 2100 and 0.2M NaCl, and formed very small needle-like sea urchins. In order to optimize these crystals we combined the micro-seeding technique with the pseudo-batch technique (Fig. 31B).

RET1S2 crystals grew to full size in a week and a half, giving plate-like crystals in a reservoir solution containing 20% polyacrylate 2100, 0.1M Bicine pH 8.6 and 0.2M NaI supplemented with non-diluted crystal seeds (Fig. 31B). Several crystals with different cryoprotection solutions were tested for diffraction at the PXIII beamline of the Swiss Light Source (SLS) synchrotron. Needle-like crystals diffracted very poorly (no better than 15Å) whereas plate-like crystals diffracted up to 3.5Å resolution (Fig. 32A). We processed the diffraction data, and the RET1S2 plate-like crystals belonged to the space group P2(1)2(1)2 with the following cell parameters: a=127.70Å, b=193.18Å, c=59.44Å, α=β=γ=90°. A summary of the processing and scaling statistics is shown in figure 32B (Fig. 32B). We are now aiming to solve the structure by molecular replacement (MR) using RET2 enzyme (PDB code: 2B51) as a first model (Fig. 32B and 33A) and are also trying to obtain better quality crystals. RET2 TUTase has only 26% sequence identity with RET1 (Deng et al. 2005). Thus, expression, purification and crystallization of a seleno-methionine derivative of RET1S2 are being set up in case MR fails to provide the phases. Other models will also be tried for MR, such as TUT4 (33% identity, PDB code: 2IKF) and Cid1 (25% identity; PDB code: 4EP7), which might help to obtain phase information and subsequently calculate electron density maps in order to solve the structure of the RET1S2 fragment (Fig. 33B and 33C; Stagno et al. 2007; Munoz-Tello et al. 2012).
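The reported cell parameters allow a quick estimate of the crystal content before molecular replacement. The sketch below computes the volume of the orthorhombic cell and a Matthews coefficient, Vm = V / (Z x n x MW), together with the usual solvent-fraction estimate 1 - 1.23/Vm. The space group P2(1)2(1)2 has Z = 4 asymmetric units per cell. The molecular weight of the RET1S2 fragment is not given in this section, so `ASSUMED_MW_DA` is a hypothetical placeholder and the number of copies per asymmetric unit is scanned rather than asserted; the function names are illustrative.

```python
def orthorhombic_cell_volume(a: float, b: float, c: float) -> float:
    """Volume (in cubic Angstrom) of a cell with all angles at 90 degrees."""
    return a * b * c


def matthews_coefficient(cell_volume: float, z: int,
                         copies_per_asu: int, mw_da: float) -> float:
    """Matthews coefficient Vm in cubic Angstrom per Dalton."""
    return cell_volume / (z * copies_per_asu * mw_da)


def solvent_fraction(vm: float) -> float:
    """Approximate solvent fraction from Vm (standard 1.23 constant)."""
    return 1.0 - 1.23 / vm


# Cell parameters reported for the RET1S2 plate-like crystals.
A, B, C = 127.70, 193.18, 59.44
Z_P21212 = 4  # asymmetric units per cell in space group P2(1)2(1)2

# HYPOTHETICAL placeholder: replace with the real mass of the RET1S2 fragment.
ASSUMED_MW_DA = 60_000.0

if __name__ == "__main__":
    volume = orthorhombic_cell_volume(A, B, C)
    print(f"cell volume: {volume:.0f} A^3")
    for n in (1, 2, 3, 4):
        vm = matthews_coefficient(volume, Z_P21212, n, ASSUMED_MW_DA)
        print(f"{n} copies/ASU: Vm = {vm:.2f} A^3/Da, "
              f"solvent = {solvent_fraction(vm):.0%}")
```

With the true molecular weight substituted for the placeholder, the copy number giving a Vm in the range typical of protein crystals would indicate how many molecules a molecular replacement search should look for.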


Figure 30: RET1S2 purification and polymerization assay. (A) 10% SDS-PAGE gel showing the different purification steps and gel filtration chromatogram showing the monomeric state of RET1S2. (B) Poly(U) and poly(A) polymerization assay. 181μM of a 17-mer RNA were incubated for 40 min at 37°C with either RET1S2 or Cid1 supplemented with 5mM MgCl2 and 10mM of UTP or ATP. Prot stands for RET1S2 protein alone.

Figure 31: RET1S2 crystal hits optimization. (A) Needle-like RET1S2 crystals optimized using the pseudo-batch technique. (B) Plate-like RET1S2 crystals optimized using micro-seeding and pseudo-batch techniques.

Figure 32: RET1S2 diffraction data. (A) Diffraction pattern of RET1S2 plate-like crystals shown in the figure. (B) RET1S2 data processing final table showing the statistics of the data reduction and processing.


Figure 33: Protein models for molecular replacement phasing. (A) RET2 TUTase structure (26% sequence identity with RET1; PDB code: 2B51). (B) TUT4 structure (33% sequence identity with RET1; PDB code: 2IKF) (C) Cid1 enzyme structure (25% sequence identity with RET1; PDB code: 4EP7).

1.12.2 Polyadenylation/Uridylation Complex

In trypanosomes, the polyadenylation/uridylation complex composed of KPAP1 poly(A) polymerase, RET1 TUTase and KPAF1/KPAF2 factors is required for mRNA translation (Aphasizheva et al. 2011). This complex adds a long A/U tail to the 3’-end of the mRNA that is then recognized by the translation machinery. In order to better understand the function of each of these proteins within the complex, we tried to solve its atomic structure by X-ray crystallography. KPAF1 and KPAF2 factors are mainly composed of pentatricopeptide repeats (PPR), which form a helix-loop-helix structure and have been shown to be important for RNA recognition and to enhance the catalytic activity of RET1 and KPAP1 (Aphasizheva et al. 2011). Another PhD student in our laboratory is currently working on the structural characterization of the KPAF1/KPAF2 complex. We focused our efforts on the catalytic part of the polyadenylation/polyuridylation complex, namely RET1 (structural work discussed in section 4.2.1.2) and KPAP1.

Based on secondary structure predictions, we cloned a truncated version of KPAP1 protein with either an N-terminal 9xHis-MBP tag, a GST-8xHis tag or a 7xHis tag and performed small-scale expression tests (Fig. 34). All the constructs were expressed but the GST-tagged and His-tagged versions were not soluble. We then performed a large-scale purification of the MBP-tagged KPAP1 protein (Fig. 35). At the affinity purification step, we realized that the protein was not very stable since the MBP tag was being cleaved off spontaneously (Fig. 35A). Nevertheless, we pooled the elution fractions together for an O/N TEV cleavage at 4°C (Fig. 35B) followed by a reloading onto the Ni-NTA column in order to remove the contaminants and the tag (Fig. 35C). Most of the protein either precipitated or bound the nickel column (Fig. 35C). The little protein that remained was then concentrated and loaded onto a gel filtration column to evaluate the oligomeric state of the untagged protein (Fig. 35D). Unfortunately, the protein was found in the aggregated fractions. We tried to optimize the purification buffers but the protein still eluted as an aggregated population (Table 6). Further optimization of the purification method as well as of the protein construct is needed in order to obtain a non-aggregated population of KPAP1 enzyme.

Figure 34: KPAP1 (29-436 Aa.) protein construct with three different purification tags.

Figure 35: KPAP1 enzyme purification. (A) 10% SDS-PAGE gel showing the Total (T) and Soluble (S) fractions, the flow-through (FT) of the Ni-NTA column and the Ni-NTA elution of KPAP1. We observe self-cleavage of the MBP-KPAP1 tagged protein. (B) Before TEV (BT) and After TEV (AT) cleavage samples loaded on a 10% SDS-PAGE gel. (C) 10% SDS-PAGE gel showing the reloading steps of KPAP1 untagged to remove the contaminants and MBP tag. rFT, reloading Flow-Through; rE, reloading elution. (D) KPAP1 GF profile showing the protein eluting in the aggregated fractions.

Buffer set 1
  Lysis buffer:   50mM Phosphate buffer pH 7.5, 300mM NaCl, 20mM Imidazole, 5mM β-MT, 0.1% Triton X-100
  Wash buffer:    50mM Phosphate buffer pH 7.5, 300mM NaCl, 20mM Imidazole, 5mM β-MT
  Elution buffer: 50mM Phosphate buffer pH 7.5, 300mM NaCl, 250mM Imidazole, 5mM β-MT

Buffer set 2
  Lysis buffer:   50mM Phosphate buffer pH 7.5, 500mM NaCl, 20mM Imidazole, 5mM β-MT, 0.1% Triton X-100
  Wash buffer:    50mM Phosphate buffer pH 7.5, 500mM NaCl, 20mM Imidazole, 5mM β-MT
  Elution buffer: 50mM Phosphate buffer pH 7.5, 500mM NaCl, 250mM Imidazole, 5mM β-MT

Buffer set 3
  Lysis buffer:   50mM Tris buffer pH 8.0, 500mM NaCl, 20mM Imidazole, 5mM β-MT, 0.1% Triton X-100
  Wash buffer:    50mM Tris buffer pH 8.0, 500mM NaCl, 20mM Imidazole, 5mM β-MT
  Elution buffer: 50mM Tris buffer pH 8.0, 500mM NaCl, 250mM Imidazole, 5mM β-MT

Buffer set 4
  Lysis buffer:   50mM Bis-Tris buffer pH 6.5, 500mM NaCl, 20mM Imidazole, 5mM β-MT, 0.1% Triton X-100
  Wash buffer:    50mM Bis-Tris buffer pH 6.5, 500mM NaCl, 20mM Imidazole, 5mM β-MT
  Elution buffer: 50mM Bis-Tris buffer pH 6.5, 500mM NaCl, 250mM Imidazole, 5mM β-MT

Gel filtration buffers
  20mM MES pH 6.5, 500mM NaCl, 1mM TCEP, 0.015mM DDM, 10% Glycerol
  20mM HEPES pH 7.5, 500mM NaCl, 10% Glycerol, 1mM β-MT
  20mM MES pH 6.5, 500mM NaCl, 1mM TCEP, 10% Glycerol

Table 6: Different buffers tested for KPAP1 enzyme purification.

We also tried to obtain KPAF1 and/or KPAF2 together with RET1 TUTase but we did not manage to obtain a stable complex. Trials with the whole complex will be performed once we obtain a stable KPAP1 protein.

1.12.3 Materials and Methods

1.12.3.1 RDS complex proteins cloning

We received several ready-to-use expression and co-expression vectors from our collaborators in Boston containing the RDS complex protein components. An example of plasmid organization is presented in figure 36 (Fig. 36).


Figure 36: Schematic representation of the expression plasmid used for the co-expression of DSS1-6x His tag with RET1 FL in E. coli cells. Only the unique cutter enzymes are represented, ori is the high-copy-number ColE1/pMB1/pBR322/pUC origin of replication, bom is basis of mobility region from pBR322 and rop is Rop protein, which maintains plasmids at low copy number.

New DSS1, KPAP1 and RET1 constructs were designed based on secondary structure predictions obtained from the online server Phyre (Kelley and Sternberg 2009). Phyre performs a local alignment of the query sequence against a fold library. The output of this server is a structure prediction model, which is based on the alignment of the query sequence with all the sequences present in the library. This model allows us to visualize the potentially disordered regions of the protein sequence that could impair crystallization, and to remove them for structural purposes. The laboratory has developed a series of pET42-derived expression vectors, called pST vectors, with identical multi-cloning sites, allowing us to use a single pair of PCR primers to clone our proteins of interest at the N-terminus of different tags. An example of a pST vector with a GST-8xHis tag, named pST0, is shown in figure 37 (Fig. 37). The primers used for the PCR amplification of the protein fragments are shown in table 7 (Table 7). The constructs were cloned in the corresponding vectors using NcoI and XhoI restriction enzymes (New England Biolabs) O/N at 37°C followed by T4 DNA Ligase (New England Biolabs) ligation for 3h at RT. The ligated vectors were then transformed into XL1 E. coli cells, extracted and sequenced.


Figure 37: Schematic representation of the expression plasmid pST0 used for the overexpression of GST-8x His tag RET1 constructs in E. coli cells. Only the unique cutter enzymes are represented. ColE1 origin: the high-copy-number origin of replication. MCS: Multiple Cloning Site.

For KPAP1 (29-436aa) 5' - gctattCCATGGACaaaagcgacagccagtgg - 3'

Rev KPAP1 (29-436aa) 5' - cccaatCTCGAGtcatcgttcctgaagtagagtcgc - 3'

For RET1L and RET1S 5' - gctattGGCGCGCCtatggtaagtaagtaccaccgc - 3'

Rev RET1L 5' - ccgattGCGGCCGCtcaagtaaccttgctgatgccgc - 3'

Rev RET1S 5' - ccgattGCGGCCGCtcatgctgcggcgaagacgcac - 3'

For DSS1 (56-743aa) 5' - gctattGGCGCGCCtatgcttgacaaggaactgcttc - 3'

Rev DSS1 (53-743aa) 5' - ccgattGCGGCCGCtcacgagtccagagatggaagaagg - 3'

For RET1S2 5' - ccgattCCATGGgtgtgcggttgtattc - 3'

Rev RET1S2 5' - ccgattCTCGAGtcatgctgcggcgaagac - 3'

Table 7: Primers used for cloning.
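The uppercase stretches in the primers of Table 7 mark the restriction sites added for cloning. As a minimal sketch, the script below scans each primer for four standard recognition sequences: NcoI (CCATGG), XhoI (CTCGAG), AscI (GGCGCGCC) and NotI (GCGGCCGC). The enzyme names are assigned here from the recognition sequences alone; the text only names NcoI and XhoI, so whether AscI and NotI were the enzymes actually used for the RET1L, RET1S and DSS1 constructs is an assumption. The dictionary keys and the `find_sites` helper are hypothetical names.

```python
from typing import Dict, List

# Standard recognition sequences; all four are palindromic, so a single
# strand search is sufficient.
SITES: Dict[str, str] = {
    "NcoI": "CCATGG",
    "XhoI": "CTCGAG",
    "AscI": "GGCGCGCC",
    "NotI": "GCGGCCGC",
}

# Primer sequences transcribed from Table 7 (5' to 3').
PRIMERS: Dict[str, str] = {
    "For KPAP1": "gctattCCATGGACaaaagcgacagccagtgg",
    "Rev KPAP1": "cccaatCTCGAGtcatcgttcctgaagtagagtcgc",
    "For RET1L/RET1S": "gctattGGCGCGCCtatggtaagtaagtaccaccgc",
    "Rev RET1L": "ccgattGCGGCCGCtcaagtaaccttgctgatgccgc",
    "Rev RET1S": "ccgattGCGGCCGCtcatgctgcggcgaagacgcac",
    "For DSS1": "gctattGGCGCGCCtatgcttgacaaggaactgcttc",
    "Rev DSS1": "ccgattGCGGCCGCtcacgagtccagagatggaagaagg",
    "For RET1S2": "ccgattCCATGGgtgtgcggttgtattc",
    "Rev RET1S2": "ccgattCTCGAGtcatgctgcggcgaagac",
}


def find_sites(primer: str) -> List[str]:
    """Return the names of all listed enzymes whose site occurs in the primer."""
    sequence = primer.upper()
    return [name for name, site in SITES.items() if site in sequence]


if __name__ == "__main__":
    for name, sequence in PRIMERS.items():
        print(f"{name:16s} {', '.join(find_sites(sequence)) or 'none'}")
```

A check of this kind is useful before ordering primers, since an unintended second site inside the gene-specific part of a primer would cut the insert during cloning.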

1.12.3.2 Protein Expression and Purification Tests

Small-scale cultures of E. coli BL21-Star or Rosetta 2 cells (Novagen), previously transformed with the appropriate protein construct, were grown in LB or TB media at 37°C for 6 hours. The cells were then induced overnight at 18°C with 0.1mM IPTG or diluted in AI media (1/50) for overnight expression at 30°C. Expression of the recombinant proteins was analyzed by SDS-PAGE. Cells expressing the protein were harvested by centrifugation at 4 000 rpm and resuspended in 50mM Tris buffer pH 8.5 containing 300mM NaCl, 20mM imidazole, 0.1% Triton X-100, DNase I (1μg/ml), Lysozyme (1μg/ml), 5mM β-MT, and a cocktail of protease inhibitors (1mM PhenylMethylSulfonyl Fluoride, 1μg/ml of leupeptin and 2μg/ml of pepstatin). Cells were lysed using an Emulsiflex system (AVESTIN) and the lysate was cleared by centrifugation at 15 000 rpm for 30 minutes at 4°C. The soluble fraction was subjected to an initial affinity purification using 1ml slurry of Ni-NTA beads (GE Healthcare). The protein was eluted from the beads with a buffer containing 250mM imidazole and desalted against 50mM Tris buffer pH 8.5, 300mM NaCl, 20mM imidazole, and 5mM β-MT. The TEV protease was added at a ratio of 1/50 (TEV/protein) and the sample was incubated at 16°C or 4°C overnight. The total and soluble fractions, the elution peak and the sample after protease treatment were analyzed by SDS-PAGE. When enough material was obtained, eluted proteins were loaded on a Superdex 200 analytical column to check the oligomeric state of the protein or complexes.

1.12.3.3 Large-Scale RET1 TUTase Expression, Purification and Crystallization

RET1 constructs were overexpressed in BL21-Star E. coli cells, grown in TB media at 37°C for 6 hours followed by overnight induction at 18°C with 0.1mM IPTG. Induced cells were harvested by centrifugation at 4 000 rpm for 30 min at 4°C and resuspended in lysis buffer A (Table 8) supplemented with DNase I (1μg/ml), Lysozyme (1μg/ml), and a cocktail of protease inhibitors (1mM PhenylMethylSulfonyl Fluoride, 1μg/ml of leupeptin and 2μg/ml of pepstatin).

Cells were lysed using the Emulsiflex system (AVESTIN) and the lysate was cleared by centrifugation at 16 000 rpm for 30 min at 4°C. The soluble fraction was subjected to several purification steps summarized in figure 38, and the different buffers used during the purification are shown in table 8 (Fig. 38; Table 8). Briefly, the soluble fraction is loaded on a GST column and the tagged RET1 protein is eluted with Elution buffer A containing 10mM of reduced glutathione. The eluted fractions are then set up for TEV protease cleavage at 4°C O/N. The next morning, the sample is diluted into Wash buffer Hep and loaded on a heparin column to remove nucleic acid contamination and the tag. The heparin column is eluted with the High Salt buffer Hep and the peak corresponding to RET1 is subjected to a gel filtration step in GF buffer A to check the oligomeric state of the protein prior to crystallization trials.
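The dilution before the heparin column is a simple mixing calculation. As read from Table 8, Elution buffer A contains 500mM NaCl whereas Wash buffer Hep contains 100mM NaCl. The sketch below computes the fold dilution and the volume of diluent needed to bring a sample to the target salt concentration. The NaCl-free diluent and the 10ml sample volume are assumptions made for illustration only; the text does not specify how the dilution was carried out, and the function names are hypothetical.

```python
def dilution_factor(c_start_mm: float, c_target_mm: float,
                    c_diluent_mm: float = 0.0) -> float:
    """Fold dilution (final volume / sample volume) needed to reach the target.

    Derived from the mixing balance:
        c_start * V_sample + c_diluent * V_diluent = c_target * V_final
    """
    if not (c_diluent_mm < c_target_mm <= c_start_mm):
        raise ValueError("need c_diluent < c_target <= c_start")
    return (c_start_mm - c_diluent_mm) / (c_target_mm - c_diluent_mm)


def diluent_volume_ml(sample_ml: float, factor: float) -> float:
    """Volume of diluent to add to a sample for a given fold dilution."""
    return sample_ml * (factor - 1.0)


if __name__ == "__main__":
    # NaCl concentrations read from Table 8; the diluent is ASSUMED salt-free.
    factor = dilution_factor(c_start_mm=500.0, c_target_mm=100.0)
    sample_ml = 10.0  # hypothetical sample volume
    print(f"fold dilution: {factor:.1f}")
    print(f"diluent to add to {sample_ml:.0f} ml: "
          f"{diluent_volume_ml(sample_ml, factor):.0f} ml")
```

If the diluent itself contains some salt, passing its concentration as `c_diluent_mm` increases the required fold dilution accordingly.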


Lysis buffer A:       50mM Tris pH 8.0, 500mM NaCl, 0.1% Triton X-100, 5mM β-MT
Wash buffer A:        50mM Tris pH 8.0, 500mM NaCl, 5mM β-MT
Wash buffer Hep:      50mM Tris pH 8.0, 100mM NaCl, 5mM β-MT
High Salt buffer Hep: 50mM Tris pH 8.0, 1M NaCl, 5mM β-MT
Elution buffer A:     50mM Tris pH 8.0, 500mM NaCl, 10mM L-Glutathione reduced, 5mM β-MT
GF buffer A:          50mM MES pH 6.5, 500mM NaCl, 2mM TCEP

Table 8: Buffers used for RET1 optimized purification.

Figure 38: RET1 optimized purification scheme. The lysate containing RET1 GST-8xHis tagged protein was loaded 2 times on a 2x5ml GST column. The GST affinity purification was followed by a cleavage reaction, overnight at 4°C, by TEV protease in order to remove the tags. After tag removal, the sample was loaded 3 times on a heparin column in order to remove the nucleic acid contamination and separate the GST-8xHis tag from the protein. As a last step, to further polish the sample and exchange its buffer, the protein was injected on a size exclusion chromatography column previously equilibrated with the GF buffer. The protein was then concentrated and ready for setting up crystallization screens.

RET1S2 protein was crystallized in two steps. Crystals were produced using the sitting-drop vapor diffusion technique at 18°C, with a reservoir solution containing 20% polyacrylate 2100, 0.1M Bicine pH 8.6 and 0.2M NaI. The same pseudo-batch technique used for Cid1 enzyme was applied to RET1S2, combined with micro-seeding, in order to obtain diffracting plate-like crystals. Crystals appeared in two days and grew to a maximum size of 0.1 x 0.04 x 0.2 mm in a week and a half. Crystals were transferred into the reservoir solution supplemented with 15% glycerol prior to freezing in liquid nitrogen.


1.12.3.4 In Vitro Polymerization Assay

A 17-mer RNA was used for all the RET1 polymerization assays. 181μM of unlabeled RNA were incubated with 70μM or 180μM of RET1 constructs, or Cid1 (positive control), supplemented with 5mM MgCl2 and 10mM of UTP or ATP for 40min at 37°C. The reactions were then stopped with an equal volume of 100mM EDTA and 2% sodium dodecyl sulfate (SDS). Extended RNA primers were then extracted with phenol/chloroform, precipitated with 80% ethanol and resuspended in 90% formamide loading buffer. Products were denatured at 95°C for 3 min and separated by gel electrophoresis in a 12% polyacrylamide/7M urea gel.

1.12.3.5 Large-Scale KPAP1 Expression and Purification

KPAP1 was overexpressed in BL21-Star E. coli cells, grown in TB media at 37°C for 6 hours followed by overnight induction at 18°C with 0.1mM IPTG. Induced cells were harvested by centrifugation at 4 000 rpm for 30 min at 4°C and resuspended in a lysis buffer containing 50mM Tris buffer pH 8.0, 500mM NaCl, 20mM imidazole, 0.1% Triton X-100 and 5mM β-MT supplemented with DNase I (1μg/ml), Lysozyme (1μg/ml), and a cocktail of protease inhibitors (1mM PhenylMethylSulfonyl Fluoride, 1μg/ml of leupeptin and 2μg/ml of pepstatin).

Cells were lysed using the Emulsiflex system (AVESTIN) and the lysate was cleared by centrifugation at 16 000 rpm for 30 min at 4°C. The soluble fraction was subjected to a first affinity purification using a chelating HiTrap FF crude column (GE Healthcare) charged with Ni2+ ions. The protein was eluted with a buffer containing 250mM of imidazole and desalted against a buffer containing 50mM Tris buffer pH 8.0, 500mM NaCl, 20mM imidazole and 5mM β-MT. The eluted fractions were then set up for TEV protease cleavage at 4°C O/N. Cleaved protein was reloaded on the Ni-NTA column to remove the tag and contaminants, followed by a final gel filtration step to check the oligomeric state of the enzyme.

General Discussion

1.13 Structural Determinants for Cid1 Poly(U) Polymerase UTP Selectivity and RNA Recognition

The Cid1 enzyme belongs to the polymerase β superfamily and was uncovered through its resistance to the combination of hydroxyurea and caffeine, two drugs that disturb S-M phase control, during a screen aiming to identify new players of this checkpoint (Aravind and Koonin 1999; Wang et al. 1999; Wang et al. 2000). Nevertheless, no link between Cid1 and the S-M checkpoint has been revealed so far. Cid1 was initially considered a poly(A) polymerase until further biochemical analysis uncovered its preferential polyuridylation activity in vivo and in vitro (Wang et al. 2000; Read et al. 2002; Rissland et al. 2007; Aphasizheva et al. 2011). Recent studies have implicated the Cid1 enzyme in a new RNA degradation pathway independent of deadenylation, which starts with polyuridylation of the mRNA followed by decapping and mRNA body decay (Rissland and Norbury 2009). Cid1 protein lacks any RNA binding domain and is highly processive in vitro. Surprisingly, only one to five uridines have been found in vivo on Cid1 target mRNAs even though longer untemplated U-tails can be observed in vitro (Rissland and Norbury 2009). This evidence, together with the fact that Cid1 preferentially uses UTP in vivo whereas in vitro it is able to use both ATP and UTP as substrates, led us to investigate the molecular details of Cid1 activity. Although Cid1 was discovered more than ten years ago, the molecular mechanism by which it exerts its polyuridylation activity was poorly understood. Structural studies should therefore shed light on how this enzyme accommodates its RNA target and provide insights into how its catalytic activity is performed. We decided to work with a truncated version of Cid1 based on secondary structure predictions, since earlier studies showed that the full-length version of the enzyme was unstable during purification (Fig. 10A; Rissland et al. 2007).
We then solved the crystal structure of Cid1 together with its substrate UTP and obtained critical molecular insights on the enzyme’s specificity and function (Munoz-Tello et al. 2012). The Cid1/UTP complex allowed us to identify the crucial residues for UTP recognition, highlighting the UTP selectivity of the enzyme. As for other members of the polymerase β family, Cid1 is composed of a catalytic (CAT) domain with the three catalytic aspartic residues and a central domain (CD). The UTP molecule is bound at the bottom of the cleft between the two domains, which compose the active site. The UTP molecule is stabilized via several direct and water-mediated interactions, which are similar to those seen in other TUTase crystal structures (Deng et al. 2005; Stagno et al. 2007; Stagno et al. 2010). Two magnesium ions are stabilized in the active site. The first one stabilizes the triphosphate moiety of the UTP together with the catalytic aspartate 103, and the second one is found on top of the uracil base, taking the place of the substrate RNA molecule. Surprisingly, histidine residue 336 is hydrogen bonded to the uracil base of the UTP molecule, a stabilization feature that had not been shown before in any of the known PUP structures, and is the main residue responsible for UTP selection. Indeed, deletion of this residue strongly reduces the specific uridylating activity of the Cid1 enzyme. When we superpose the NRM region of the PUP and PAP enzymes, we observe that histidine 336 (according to Cid1 numbering) is conserved in all PUPs whereas most of the PAPs have an asparagine at this position, likely underlining the importance of this residue in UTP selectivity (Fig. 39). This was further confirmed by other published crystal structures of the Cid1 enzyme, where the authors showed a complete abolition of the uridylating activity upon mutation of this histidine (Yates et al. 2012). In fact, the authors show that mutating this histidine into an asparagine (which is present at that position in poly(A) polymerases) turned the enzyme into a solely poly(A) polymerase (Yates et al. 2012). These data have to be taken with care: in our hands, and in the experiments reported by another competing group, i.e. Lunde and colleagues, only a reduction of the polyuridylating activity was observed upon H336 mutation rather than a complete abolition. Further biochemical studies will be needed to be able to definitely turn a PUP into a PAP in vivo and to fully understand the role of this histidine residue in UTP selection in the protein family.
In the NRM loop we also find phenylalanine 332 (F332), which is buried in a hydrophobic pocket and is necessary for the proper folding of the protein. The hydrophobic pocket is composed of residues found in helix 4 and an extended loop of the CD domain. Interestingly, this helix 4 corresponds to helix F in the yeast PAP enzyme, which is required for the movement of the CAT domain during catalysis (Balbo and Bohm 2007). Furthermore, a similar cavity is observed in the trypanosomal TUT4 protein, suggesting a conserved movement of the two domains forming the active site cleft during catalysis in the entire polymerase family (Stagno et al. 2007). When we mutate this hydrophobic residue (F332) into an alanine, we observe a very strong decrease of Cid1’s activity, underlining the importance of phenylalanine 332 in the proper positioning of the NRM and the CAT domain during catalysis. Furthermore, other residues found on positively charged patches of the enzyme, such as arginine 139 and tyrosine 205, seem to play an important role in RNA recognition. Upon mutation of these residues, Cid1 activity is strongly decreased, implicating these positively charged patches in RNA translocation and accommodation during the catalytic cycle of Cid1. Based on these results we propose a catalytic cycle where binding of the triphosphate moiety of the nucleotide leads to a movement of the CAT domain towards the CD domain of the enzyme, making helix 4 rotate and thus locking phenylalanine 332 into the hydrophobic pocket immediately below helix 4 (Munoz-Tello et al. 2012). The uracil base of the UTP is then recognized by the NRM loop, giving the RNA molecule time to bind and allowing triphosphate hydrolysis for the addition of a single uridine. Upon pyrophosphate hydrolysis, the active site becomes unstable and the CAT and CD domains move apart from each other. This leads to the translocation of the RNA molecule, and a new cycle of polymerization can be performed. The conformational changes needed for Cid1 polyuridylation partially mimic the yeast PAP reorganization upon RNA binding, and this flexibility likely influences the processive and distributive modes of action of the enzyme.

Figure 39: Superposition of the Nucleotide Recognition Motif (NRM) of several eukaryotic poly(U) polymerases (PUPs) and poly(A) polymerases (PAPs). PUP enzymes NRM are framed in yellow whereas PAP enzymes NRM are framed in purple. The PUP enzymes aligned are S. pombe Cid1 (SpCid1), C. elegans PUP-1 (CePUP-1) and PUP-2 (CePUP-2), Human ZCCHC11 (HsTUT4) and A. thaliana AT1 (AtAT1). The PAP enzymes used for the alignment are S. cerevisiae TRF4 (ScTRF4), Human PAPD5 (HsPAPD5), S. pombe Cid14 (SpCid14) and Cid12 (SpCid12) and Human GLD2 (HsGLD2). The black triangle represents the conserved residues in the NRM loop. Phe332 (Cid1 numbering) is very well conserved in PUPs but not in PAPs. His336 (Cid1 numbering) is conserved at this position in PUPs whereas in PAPs this residue corresponds to an Asparagine, highlighting the importance of this Histidine in PUP enzymes. An asterisk marks other important residues of the NRM.

Our Cid1 structure bound to an ApU molecule sheds further light on the catalytic mechanism and substrate preference of this enzyme. Previous studies have shown the implication of the Cid1 enzyme in the polyuridylation of polyadenylated RNAs (Rissland et al. 2007; Rissland and Norbury 2009). Thus, we consider the ApU dinucleotide observed in our crystal structure as a pseudo-product corresponding to the post-catalysis state of the polyuridylation reaction. The uridine base of the ApU molecule is stabilized in the same way as the uracil part in the Cid1/UTP complex (Munoz-Tello et al. 2012; Munoz-Tello et al. 2014). The adenine base of the pseudo-product is stabilized via several water-mediated interactions with Cid1 but also via a direct hydrogen bond with asparagine 165, the first residue of helix 4. We further analyzed the importance of asparagine 165 by mutational studies. As mentioned before, helix 4 is required for the global movements of the enzyme upon catalysis (Munoz-Tello et al. 2012). The Cid1/ApU binding mode is different from the one found in the TbTUT4/UpU complex, where a movement of the loop between β-sheets 4 and 5 is necessary for direct stabilization of UpU (Stagno et al. 2007). In our case, the same loop does not move when we compare our two structures. Asparagine 165 is identified as a key residue contacting the substrate in the Cid1/ApU crystal. This amino acid does not influence the RNA binding properties of Cid1 but is required for Cid1’s enzymatic activity, since mutation of this residue into an alanine leads to the complete abolition of the enzyme’s activity. The reason behind this might be its involvement in the swivel motion of the enzyme during the catalytic cycle and its strategic position at the beginning of helix 4 (Munoz-Tello et al. 2012; Munoz-Tello et al. 2014). Furthermore, our Cid1/ApU complex delineated the molecular basis for the selectivity for RNA rather than DNA substrates by showing a direct contact of aspartate 103 of the catalytic triad with the 2’-OH group of the RNA substrate in our atomic model. This binding mode resembles the previously described RNA specificity of the TUT4 RNA uridylyl transferase from Trypanosoma brucei, suggesting a conserved mechanism of RNA detection among the entire family (Stagno et al. 2007).
Based on electrophoretic mobility shift assays (EMSA), we showed that Cid1, whether WT, mutated at position 160 (active site) or 165 (residue binding to the last base of the substrate), or deleted of its β-trapdoor, has a substantially higher affinity for poly(U) than for poly(A) substrates (Munoz-Tello et al. 2014). The β-trapdoor is a very flexible region found above the active site of Cid1, which was suggested to be involved in keeping the UTP inside the cleft (Yates et al. 2012). However, when we deleted this structured region of Cid1, we observed a strong effect only on the poly(A) activity of the enzyme, thus implicating the β-trapdoor in its polyadenylation activity rather than in its polyuridylation activity. It is important to note that when lysine 144, found in β-sheet 4 of Cid1, is mutated, binding to a short stretch of adenosines is impaired, whereas binding to a U15 RNA is not affected. This suggests that lysine 144 is required for the stabilization of short incoming RNA molecules. Yates and collaborators did not observe this effect in their RNA binding assays, most likely because they used a stretch of 40 nucleotides (Yates et al. 2012). In poly(A) polymerase, the residue equivalent to lysine 144 is lysine 145, which contacts the 2’-OH group of the -2 nucleotide in the PAP ternary crystal structure (Balbo and Bohm 2007). Thus, Cid1 may stabilize its substrate in a manner closer to that of PAP enzymes. The residue equivalent to lysine 144 in the trypanosomal TUT4 structure is aspartate 126, whose mutation strongly affects the polyuridylation activity of the enzyme (Stagno et al. 2007). Taken together, these results suggest a mechanism, involving lysine 144, that allows Cid1 to switch from a distributive to a processive mode of action, potentially allowing the addition of U-tails of two distinct lengths. This second structure allowed us to further complete our catalytic cycle model, highlighting the importance of several additional key residues, such as asparagine 165, for the complete description of the catalytic reaction: (i) the β-trapdoor partially modulates the exchange of UTP after each cycle; (ii) asparagine 165, which is in direct contact with the 3’-penultimate nucleotide in our structure, may be implicated in the translocation of the RNA molecule after catalysis. Finally, as mentioned before, the overall activity of the enzyme clearly depends on global movements of the enzyme around helix 4 and phenylalanine 332 in order to proceed to the next polyuridylation cycle (Munoz-Tello et al. 2012).

Our atomic structures shed new light on the UTP selectivity of Cid1, allowing us to propose a model for catalysis in which local and global movements of the enzyme are crucial for RNA translocation and accommodation. Yates and collaborators, Lunde and collaborators and our group obtained critical information about the RNA recognition mode of the enzyme, although a protein structure with a longer RNA target would clarify this matter further (Lunde et al. 2012; Munoz-Tello et al. 2012; Yates et al. 2012; Munoz-Tello et al. 2014). Unfortunately, our attempts to obtain diffracting crystals of the RNA complex were not successful. We also gained insights into the catalytic switch of Cid1 from a distributive to a processive enzyme and underlined its preference for uridylated substrates. Nevertheless, several questions remain unanswered and will need further structural, biochemical and genetic studies. We still do not understand how this enzyme selects its targets and whether a cellular mechanism exists to regulate the Cid1-mediated U-tail addition, although we obtained some hints about the Cid1 enzymatic mechanism from our ApU complex structure (Munoz-Tello et al. 2014). We also do not know why certain mRNAs need to be regulated by uridylation in fission yeast or in other organisms. We propose that Cid1 enzymatic activity is influenced by its strong affinity for uridine-containing tails. Factors such as Lin28 in humans, the Dis3L2 exonuclease or other unknown factors might be necessary to further regulate Cid1’s activity (Heo et al. 2009; Malecki et al. 2013). Indeed, U-tails of up to 15 nt are found in dis3L2-depleted S. pombe cells, reinforcing the idea that other factors might regulate Cid1’s activity in vivo (Malecki et al. 2013). It is tempting to speculate that Cid1, like its human ortholog ZCCHC11, might need an RNA binding protein or other factors to guide the enzyme towards a specific RNA target and regulate its function in the cell (Heo et al. 2009). In-depth studies identifying Cid1 protein partners will be needed to clarify this matter.

1.14 RDS Complex Structural and Functional Studies: Light into a Trypanosomal TUTase, RET1

To better understand the role of PUP enzymes in eukaryotes, we started a collaboration with Ruslan Aphasizhev’s laboratory in Boston on a PUP named RET1, which is responsible for the polyuridylation of the gRNAs necessary for the RNA editing reaction and required for proper mRNA translation (Blum and Simpson 1990; Ernst et al. 2003). The kinetoplast DNA in the mitochondria of trypanosomes contains approximately 40 to 50 maxicircles, which encode rRNA and mRNA precursors, and thousands of minicircles encoding gRNAs. These gRNAs are required for mRNA editing and their biogenesis is not completely understood. Through RET1 TUTase pull-down assays and mass spectrometry analysis, our collaborators discovered a complex containing RET1 plus four proteins, i.e. the DSS1 exonuclease and three other proteins without any discernible motif, named RDS1, RDS2 and RDS3, all essential for gRNA biogenesis (unpublished data). So far, the significance of this newly characterized complex in gRNA synthesis and the functions of its proteins are not fully understood, except that they are all required for proper gRNA biogenesis. Interestingly, RET1 is also involved in mRNA translation through the formation of long A/U tails, which are required at the 3’-end of mitochondrially encoded mRNAs in order for them to be translated (Aphasizheva et al. 2011). The addition of this A/U tail is coordinated by the KPAF1/KPAF2 complex and catalyzed by the KPAP1/RET1 complex (Aphasizheva et al. 2011). The KPAF1 and KPAF2 proteins contain 20 and 5 PPR repeats, respectively. Their main known function is to enhance the poly(A) and poly(U) activities of KPAP1 and RET1 (Aphasizheva et al. 2011). The implications of RET1 in both of these pathways await complete characterization.

In order to better understand the function of this new complex involved in gRNA biogenesis and maturation, we aimed to solve its atomic structure either as a reconstituted complex or through structural studies of individual subunits. First, we attempted several co-expressions and co-purifications of these proteins, without success. We then decided to express, purify and crystallize each protein alone. Trials with the DSS1, RDS1, RDS2 and RDS3 proteins were still unsuccessful, most probably because of the large protein size, which might not be very well tolerated in E. coli (Table 3). Expression with protein chaperones or using the insect cell Multi-BAC system might lead to successful results (Berger et al. 2004; Prasad et al. 2011). Nevertheless, we were able to express, purify and crystallize a truncated version of RET1 (named RET1S2), which still possesses polyuridylating activity (Fig. 29B and 30B). The enzyme gave plate-like crystals of 2 μm thickness and around 200 μm length (Fig. 31B). We tested these plates for diffraction at the PXI beamline of the Swiss Light Source (SLS) and obtained a complete data set diffracting up to 3.5 Å (Fig. 32A). So far, we have been able to process and scale the data, but we have only just started the analysis for phasing (Fig. 32B). Molecular replacement using RET2 (PDB code: 2B51), Cid1 (PDB code: 4EP7) and TUT4 (PDB code: 2IKF) will be tried first in order to obtain the phases needed to calculate an electron density map from our diffraction data (Fig. 33; Deng et al. 2005; Stagno et al. 2007; Munoz-Tello et al. 2012). These protein models will need to be modified and adapted for molecular replacement. For example, the RET2 enzyme possesses an additional non-conserved domain, which will need to be removed before the phasing experiment. Another option will be to use all three PDB files as an ensemble for molecular replacement. In that case, we will superpose the three models, remove the most variable parts so that only the main structured parts of each protein remain, and use them for phasing. At the same time, a seleno-methionine derivative of the truncated RET1 protein is being prepared and will be tested in crystallization trials in case our molecular replacement attempts with the RET2, TUT4 and Cid1 enzymes remain unsuccessful. Interestingly, RET1 is suspected to bind a zinc ion through four conserved cysteines in the N-terminal region. RET1S2 crystals contain zinc ion(s), as demonstrated by the fluorescence scan performed on a RET1S2 crystal. This signal could also be helpful for phasing, but better-diffracting crystals will be needed, as the data from the tested crystal were not recorded at the optimal wavelength. Lastly, if these other phasing methods fail to give us a proper density map for building RET1S2, soaking with heavy atoms will be required for single-wavelength anomalous diffraction (SAD) phasing, as used for the determination of the Cid1/UTP structure (Munoz-Tello et al. 2012). Since RET1 is a PUP with two different types of substrates (gRNAs and mRNAs), it will be interesting to try to obtain the crystal structure of RET1 in complex with each of these RNAs to better understand the RNA recognition mode of the enzyme as well as how the RNA is accommodated within the protein. It is interesting to note that gRNAs are thought to contain two stem-loops, which distinguishes them from mRNAs. Obtaining crystals with both a gRNA and an mRNA will help us understand how this PUP enzyme is able to recognize such different types of substrates for polyuridylation. As for the Cid1 and TUT4 PUPs, co-crystallization and soaking trials with UTP alone and with UpU or UMP molecules will be carried out in order to trap a binary and/or ternary complex in the active site of RET1S2. If needed, the RET1S2 enzyme will be mutated into an inactive protein to further trap RNA molecules in the active site, as was done successfully for our Cid1/ApU complex and for the PAP/ATP/RNA ternary complex (Balbo and Bohm 2007; Munoz-Tello et al. 2014).
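The ensemble strategy described above (superposing the RET2, Cid1 and TUT4 models and discarding the most variable parts) can be illustrated with a minimal sketch. It assumes that the three models have already been superposed and aligned residue by residue with dedicated software, and it only shows the trimming step. The function name `conserved_core`, the 2.0 Å cutoff and the toy coordinates are hypothetical choices made for illustration; they are not taken from the PDB entries or from the procedure actually used in this work.

```python
from math import dist
from statistics import mean


def conserved_core(models, cutoff=2.0):
    """Return indices of aligned positions that are structurally conserved.

    models: list of equally long lists, one per superposed model. Each entry
            is the (x, y, z) C-alpha position of an aligned residue, or None
            where that model has a gap in the alignment.
    cutoff: maximum allowed distance (in Angstrom) between any model's atom
            and the centroid of that position (hypothetical default).
    """
    length = len(models[0])
    if any(len(model) != length for model in models):
        raise ValueError("all models must cover the same aligned positions")

    keep = []
    for i in range(length):
        coords = [model[i] for model in models]
        if any(c is None for c in coords):
            continue  # position missing in at least one model: drop it
        centroid = tuple(mean(axis) for axis in zip(*coords))
        spread = max(dist(c, centroid) for c in coords)
        if spread <= cutoff:
            keep.append(i)  # position belongs to the shared structured core
    return keep


# Toy, already superposed C-alpha traces (hypothetical values, not PDB data).
ret2 = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 6.0, 0.0)]
cid1 = [(0.2, 0.1, 0.0), (3.9, 0.2, 0.0), None, (11.0, 0.0, 0.0)]
tut4 = [(0.1, -0.1, 0.0), (3.7, 0.1, 0.1), (7.5, 0.1, 0.0), (11.2, -5.0, 0.0)]

core = conserved_core([ret2, cid1, tut4])
print(core)
```

In this toy case the third position is dropped because one model has a gap there, and the fourth is dropped because the three traces diverge by more than the cutoff. A real ensemble would be trimmed on full coordinate files, and the cutoff would have to be tuned against the molecular replacement results.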

As I mentioned before, RET1 is required for proper mRNA translation in the mitochondria of trypanosomes through the formation of long A/U tails, together with the KPAP1 poly(A) polymerase and the enhancer factors KPAF1 and KPAF2 (Aphasizheva et al. 2011). We therefore tried to obtain a stable KPAP1/RET1 complex or the KPAP1 protein alone (Fig. 34 and Fig. 35; Table 6). Unfortunately, we were not able to purify the KPAP1 poly(A) polymerase from E. coli cells. Several new constructs, based on secondary structure predictions and on moving the tag from the N-terminus to the C-terminus, will need to be tested to purify the KPAP1 enzyme, as was done for RET1. If truncated versions of KPAP1 are purified, activity tests with an mRNA molecule will have to be performed to confirm that our construct is still active. To fully characterize this polyadenylation/polyuridylation complex, another PhD student in our laboratory is studying the structural and functional bases of the KPAF1/KPAF2 complex.

In both of these projects, the RET1 structure will shed further light on how this enzyme is bound by other proteins and on the ways in which RET1 protein partners influence the activity and substrate recruitment of the enzyme. Thus, the structures may help us understand how this TUTase is able to form complexes with different RNA substrates and proteins in different pathways in trypanosomal mitochondria. For this purpose, obtaining protein/protein structures of RET1 with any of the above-mentioned enzymes would help to answer this question at least partially. Furthermore, atomic models of RET1 with either a gRNA or an mRNA may help us understand the complex mode of RNA binding of RET1, bringing some interesting molecular insights into the RNA selectivity and conformational flexibility of PUP enzymes.

Polyuridylation was for a long time an underestimated 3’-end modification, most probably because sequencing techniques were focused on polyadenylated RNAs. With the development of new and adapted techniques to detect 3’ uridylation, this event is gaining attention for its impact on RNA degradation and stability (Choi et al. 2012; Clamer et al. 2014). RNA sequencing analysis of mammalian cells, not depending on oligo(dT) primers but rather using 3’ ligated linkers specific for small RNAs of 200 nt or less, showed a widespread tendency towards 3’-end uridylation of small RNAs (Choi et al. 2012). Interestingly, besides the already known uridylated targets, the authors also found this 3’ modification on transcriptional start-site-associated RNAs and on spliced introns, suggesting a larger role of polyuridylation in RNA metabolism in mammals, despite the fact that PUPs are mostly localized in the cytoplasm. Optimized RNA sequencing methods applied in different backgrounds, such as DNA replication inhibition and stress conditions, and refinements of these methods are necessary to understand the global biological consequences of uridylation in RNA metabolism, and might also be the starting point for uncovering new ncNTrs among eukaryotes. With the development of RNA-Seq, more and more RNAs are found to be uridylated in several organisms, but the enzymes responsible for this process are still unknown. The identification of polyuridylating enzymes is now becoming critical for obtaining a larger picture of polyuridylation in eukaryotes, its evolution and its functional implications in the cell.

It is important to note that RNA uridylation in the cytoplasm has been shown to induce tumorigenesis in mammals. Uridylation at the 3’-end of the tumor suppressor pre-let-7 microRNA by the cytoplasmic ZCCHC11 and ZCCHC6 enzymes blocks let-7 miRNA maturation, which in turn stimulates tumor growth (Heo et al. 2009). Lin28 is a pluripotency factor in stem cells and, once expressed, it helps maintain an undifferentiated and proliferative state by blocking the expression of let-7 miRNA through the recruitment of ZCCHC11 for uridylation-mediated decay (Yu et al. 2007; Heo et al. 2009; Chang et al. 2012). In adult somatic cells, the Lin28-let-7 pathway is normally silenced, even though expression of LIN28A or LIN28B is still observed in a wide variety of human cancers (Viswanathan et al. 2009; Piskounova et al. 2011). Inhibition of this oncogenic pathway blocks the tumorigenicity of cancer cells (Piskounova et al. 2011). It has recently been shown that modified let-7 microRNAs are degraded by the DIS3L2 exonuclease (Chang et al. 2013). Furthermore, DIS3L2, which preferentially trims uridylated cytoplasmic RNAs, has been found mutated in patients with Perlman syndrome, and in some cases this mutation leads to the development of Wilms’ tumor at early stages of the child’s growth (Chang et al. 2013). Even though RNA uridylation has been linked with tumor growth, the biological significance of this event is still poorly understood and is under active study. In order to better understand tumorigenesis, it is necessary to identify the RNA targets as well as the protein partners that recruit either the RNA or the poly(U) polymerase. Such information will allow in-depth studies of the link between TUTases and diseases. Furthermore, structural and biochemical studies of substrate recognition by TUTases and ncPAPs will provide a rational foundation for therapeutic purposes. In kinetoplastid organisms, this information will bring new insights into U-insertion/deletion, gRNA biogenesis and the translational efficiency required for parasite survival, and will thus help in the design of new trypanocides to treat various trypanosomal diseases, including the fatal human sleeping sickness.

In summary, our structural and functional studies on the Cid1 PUP and the ongoing work on the RET1 TUTase contribute to our understanding of the flexibility and substrate interactions of PUPs, completing the work done on the TUTases implicated in the RNA editing process taking place in the mitochondria (Deng et al. 2005; Stagno et al. 2007; Stagno et al. 2007; Stagno et al. 2010). Nevertheless, in order to complement our structural results, functional tests based on in vivo studies would bring a better understanding of these enzymes. In particular, studies of the Cid1 H336A, K144A or even β-trapdoor-deleted mutants in cells lacking Cid1 WT would allow a precise understanding of their role in substrate recognition and, more generally, in polyuridylation. Furthermore, our studies, combined with structural and bioinformatic studies of other PUPs, provide crucial structural and architectural information on the mode of action of PUPs, thereby providing a strong basis for testing inhibitors for therapeutic benefit. It is important to note that 3’ uridylation is involved in several key aspects of RNA biology and that not all the proteins implicated in this process in eukaryotes are known, which highlights the importance of further studies of this event and its players. Several research groups have now started to focus their work on identifying new ncNTrs along with their targets and possible protein partners. Thus, we will most likely hear a lot more about PUPs and their influence on RNA metabolism and turnover during the coming years.

Abbreviations

• A Adenine
• ADAR Adenosine Deaminase that Acts on RNA
• ADAT Adenosine Deaminase that Acts on tRNA
• AI Autoinduction
• AT After TEV cleavage
• BT Before TEV cleavage
• C Cytosine
• CAT Catalytic Domain
• CD Central Domain
• CDE-1 Cosuppression defective 1
• Cid1 Caffeine-induced death suppressor 1
• DNA Deoxyribonucleic acid
• dsRNA Double-stranded RNA
• ESRF European Synchrotron Radiation Facility
• FL Full-Length
• G Guanine
• GRBC gRNA binding complex
• gRNA Guide RNA
• GST Glutathione-S-transferase
• h Hours
• HESO-1 HEN1 SUPPRESSOR 1
• IPTG Isopropyl-β-D-thiogalactopyranoside
• KPAF1 Kinetoplast Polyadenylation/uridylation Factor 1
• KPAF2 Kinetoplast Polyadenylation/uridylation Factor 2
• KPAP1 Kinetoplast poly(A) polymerase
• LB Luria Broth
• lincRNA Long intergenic non-coding RNA
• MBP Maltose Binding Protein
• MD Middle Domain
• MEAT 1 Mitochondrial Editosome-like complex Associated TUTase 1
• min Minutes
• miRNA Micro RNA
• MNT Minimal Nucleotidyl Transferase
• MR Molecular Replacement
• mRNA Messenger RNA
• Mw Molecular weight
• ncNTrs Non-canonical ribonucleotidyl transferase
• ncRNA Non-coding RNA
• nPAP Nuclear Poly(A) Polymerase
• NRM Nucleotide Recognition Motif
• nt Nucleotide
• O/N Overnight
• ORF Open Reading Frame
• PAP Poly(A) Polymerase
• PEG Polyethylene Glycol
• piRNA PIWI-interacting RNA
• Pol Polymerase
• PPR Pentatricopeptide repeats
• PUP Poly(U) Polymerase
• RBD RNA binding domain
• RDS RET1 TUTase DSS1 exonuclease
• RET1 RNA editing TUTase 1
• RET2 RNA editing TUTase 2
• RNA Ribonucleic acid
• RNP Ribonucleoprotein
• rNTrs Ribonucleotidyl transferase
• RRM RNA recognition motif
• rRNA Ribosomal RNA
• SAD Single-wavelength anomalous diffraction
• SeMet Seleno-methionine
• siRNA Small interfering RNA
• SLS Swiss Light Source
• snoRNA Small nucleolar RNA
• snoRNP Small nucleolar ribonucleoprotein
• snRNA Small nuclear RNA
• snRNP Small nuclear ribonucleoprotein
• sRNA Small RNA
• T Thymidine
• TB Terrific Broth
• TEV Tobacco Etch Virus
• tRNA Transfer RNA
• TUTase Terminal Uridyl Transferase
• U Uracil
• URT1 UTP:RNA uridylyl transferase 1
• UTR Untranslated region
• WT Wild-type
• ZCCHC11 Zinc finger, CCHC containing 11
• ZCCHC6 Zinc finger, CCHC containing 6
• β-MT β-mercaptoethanol

References

Abelson, J. (1979). RNA processing and the intervening sequence problem. Annu Rev Biochem 48, 1035-1069.
Alexandrov, A., I. Chernyakov, W. Gu, S. L. Hiley, T. R. Hughes, E. J. Grayhack and E. M. Phizicky (2006). Rapid tRNA decay can result from lack of nonessential modifications. Mol Cell 21, 87-96.
Ameres, S. L., M. D. Horwich, J. H. Hung, J. Xu, M. Ghildiyal, Z. Weng and P. D. Zamore (2010). Target RNA-directed trimming and tailing of small silencing RNAs. Science 328, 1534-1539.
Anderson, J. S. and R. P. Parker (1998). The 3' to 5' degradation of yeast mRNAs is a general mechanism for mRNA turnover that requires the SKI2 DEVH box protein and 3' to 5' exonucleases of the exosome complex. EMBO J 17, 1497-1506.
Aphasizhev, R. and I. Aphasizheva (2008). Terminal RNA uridylyltransferases of trypanosomes. Biochim Biophys Acta 1779, 270-280.
Aphasizhev, R. and I. Aphasizheva (2011). Uridine insertion/deletion editing in trypanosomes: a playground for RNA-guided information transfer. Wiley Interdiscip Rev RNA 2, 669-685.
Aphasizhev, R., I. Aphasizheva and L. Simpson (2003). A tale of two TUTases. Proc Natl Acad Sci U S A 100, 10617-10622.
Aphasizhev, R., S. Sbicego, M. Peris, S. H. Jang, I. Aphasizheva, A. M. Simpson, A. Rivlin and L. Simpson (2002). Trypanosome mitochondrial 3' terminal uridylyl transferase (TUTase): the key enzyme in U-insertion/deletion RNA editing. Cell 108, 637-648.
Aphasizheva, I. and R. Aphasizhev (2010). RET1-catalyzed uridylylation shapes the mitochondrial transcriptome in Trypanosoma brucei. Mol Cell Biol 30, 1555-1567.
Aphasizheva, I., D. Maslov, X. Wang, L. Huang and R. Aphasizhev (2011). Pentatricopeptide repeat proteins stimulate mRNA adenylation/uridylation to activate mitochondrial translation in trypanosomes. Mol Cell 42, 106-117.
Aravind, L. and E. V. Koonin (1999). DNA polymerase beta-like nucleotidyltransferase superfamily: identification of three new families, classification and evolutionary history. Nucleic Acids Res 27, 1609-1618.
Babendure, J. R., J. L. Babendure, J. H. Ding and R. Y. Tsien (2006). Control of mammalian translation by mRNA structure near caps. RNA 12, 851-861.
Babiarz, J. E., J. G. Ruby, Y. Wang, D. P. Bartel and R. Blelloch (2008). Mouse ES cells express endogenous shRNAs, siRNAs, and other Microprocessor-independent, Dicer-dependent small RNAs. Genes Dev 22, 2773-2785.
Balbo, P. B. and A. Bohm (2007). Mechanism of poly(A) polymerase: structure of the enzyme-MgATP-RNA ternary complex and kinetic analysis. Structure 15, 1117-1131.
Bashirullah, A., R. L. Cooperstock and H. D. Lipshitz (2001). Spatial and temporal control of RNA stability. Proc Natl Acad Sci U S A 98, 7025-7028.

Battle, D. J. and J. A. Doudna (2001). The stem-loop binding protein forms a highly stable and specific complex with the 3' stem-loop of histone mRNAs. RNA 7, 123-132.
Benne, R., J. Van den Burg, J. P. Brakenhoff, P. Sloof, J. H. Van Boom and M. C. Tromp (1986). Major transcript of the frameshifted coxII gene from trypanosome mitochondria contains four nucleotides that are not encoded in the DNA. Cell 46, 819-826.
Berger, I., D. J. Fitzgerald and T. J. Richmond (2004). Baculovirus expression system for heterologous multiprotein complexes. Nat Biotechnol 22, 1583-1587.
Betat, H., C. Rammelt, G. Martin and M. Morl (2004). Exchange of regions between bacterial poly(A) polymerase and the CCA-adding enzyme generates altered specificities. Mol Cell 15, 389-398.
Blum, B., N. Bakalara and L. Simpson (1990). A model for RNA editing in kinetoplastid mitochondria: "guide" RNA molecules transcribed from maxicircle DNA provide the edited information. Cell 60, 189-198.
Blum, B. and L. Simpson (1990). Guide RNAs in kinetoplastid mitochondria have a nonencoded 3' oligo(U) tail involved in recognition of the preedited region. Cell 62, 391-397.
Blum, E., A. J. Carpousis and C. F. Higgins (1999). Polyadenylation promotes degradation of 3'-structured RNA by the Escherichia coli mRNA degradosome in vitro. J Biol Chem 274, 4009-4016.
Borowski, L. S., R. J. Szczesny, L. K. Brzezniak and P. P. Stepien (2010). RNA turnover in human mitochondria: more questions than answers? Biochim Biophys Acta 1797, 1066-1070.
Buratti, E. and F. E. Baralle (2004). Influence of RNA secondary structure on the pre-mRNA splicing process. Mol Cell Biol 24, 10505-10514.
Canellakis, E. S. (1957). Incorporation of radioactive uridine-5'-monophosphate into ribonucleic acid by soluble mammalian enzymes. Biochim Biophys Acta 23, 217-218.
Caponigro, G. and R. Parker (1995). Multiple functions for the poly(A)-binding protein in mRNA decapping and deadenylation in yeast. Genes Dev 9, 2421-2432.
Carballo, E., W. S. Lai and P. J. Blackshear (1998). Feedback inhibition of macrophage tumor necrosis factor-alpha production by tristetraprolin. Science 281, 1001-1005.
Chang, H., J. Lim, M. Ha and V. N. Kim (2014). TAIL-seq: genome-wide determination of poly(A) tail length and 3' end modifications. Mol Cell 53, 1044-1052.
Chang, H. M., N. J. Martinez, J. E. Thornton, J. P. Hagan, K. D. Nguyen and R. I. Gregory (2012). Trim71 cooperates with microRNAs to repress Cdkn1a expression and promote embryonic stem cell proliferation. Nat Commun 3, 923.
Chang, H. M., R. Triboulet, J. E. Thornton and R. I. Gregory (2013). A role for the Perlman syndrome exonuclease Dis3l2 in the Lin28-let-7 pathway. Nature 497, 244-248.
Chen, Y., K. Sinha, K. Perumal and R. Reddy (2000). Effect of 3' terminal adenylic acid residue on the uridylation of human small RNAs in vitro and in frog oocytes. RNA 6, 1277-1288.
Choi, Y. S., W. Patena, A. D. Leavitt and M. T. McManus (2012). Widespread RNA 3'-end oligouridylation in mammals. RNA 18, 394-401.
Clamer, M., L. Hofler, E. Mikhailova, G. Viero and H. Bayley (2014). Detection of 3'-end RNA uridylation with a protein nanopore. ACS Nano 8, 1364-1374.
Claycomb, J. M., et al. (2009). The Argonaute CSR-1 and its 22G-RNA cofactors are required for holocentric chromosome segregation. Cell 139, 123-134.

Conne, B., A. Stutz and J. D. Vassalli (2000). The 3' untranslated region of messenger RNA: A molecular 'hotspot' for pathology? Nat Med 6, 637-641.
Cooper, T. A., L. Wan and G. Dreyfuss (2009). RNA and disease. Cell 136, 777-793.
Corell, R. A., et al. (1996). Complexes from Trypanosoma brucei that exhibit deletion editing and other editing-associated properties. Mol Cell Biol 16, 1410-1418.
Cruz, J. A. and E. Westhof (2009). The dynamic landscapes of RNA architecture. Cell 136, 604-609.
Czerwoniec, A., S. Dunin-Horkawicz, E. Purta, K. H. Kaminska, J. M. Kasprzak, J. M. Bujnicki, H. Grosjean and K. Rother (2009). MODOMICS: a database of RNA modification pathways. 2008 update. Nucleic Acids Res 37, D118-121.
Decker, C. J. and R. Parker (1993). A turnover pathway for both stable and unstable mRNAs in yeast: evidence for a requirement for deadenylation. Genes Dev 7, 1632-1643.
Deng, J., N. L. Ernst, S. Turley, K. D. Stuart and W. G. Hol (2005). Structural basis for UTP specificity of RNA editing TUTases from Trypanosoma brucei. EMBO J 24, 4007-4017.
Deutscher, M. P. (1990). Ribonucleases, tRNA nucleotidyltransferase, and the 3' processing of tRNA. Prog Nucleic Acid Res Mol Biol 39, 209-240.
Doma, M. K. and R. Parker (2007). RNA quality control in eukaryotes. Cell 131, 660-668.
Dunckley, T. and R. Parker (1999). The DCP2 protein is required for mRNA decapping in Saccharomyces cerevisiae and contains a functional MutT motif. EMBO J 18, 5411-5422.
Ernst, N. L., B. Panicucci, R. P. Igo, Jr., A. K. Panigrahi, R. Salavati and K. Stuart (2003). TbMP57 is a 3' terminal uridylyl transferase (TUTase) of the Trypanosoma brucei editosome. Mol Cell 11, 1525-1536.
Feagin, J. E., J. M. Abraham and K. Stuart (1988). Extensive editing of the cytochrome c oxidase III transcript in Trypanosoma brucei. Cell 53, 413-422.
Filipowicz, W., S. N. Bhattacharyya and N. Sonenberg (2008). Mechanisms of post-transcriptional regulation by microRNAs: are the answers in sight? Nat Rev Genet 9, 102-114.
Flomen, R., J. Knight, P. Sham, R. Kerwin and A. Makoff (2004). Evidence that RNA editing modulates splice site selection in the 5-HT2C receptor gene. Nucleic Acids Res 32, 2113-2122.
Gallie, D. R. (1991). The cap and poly(A) tail function synergistically to regulate mRNA translational efficiency. Genes Dev 5, 2108-2116.
Gallie, D. R., N. J. Lewis and W. F. Marzluff (1996). The histone 3'-terminal stem-loop is necessary for translation in Chinese hamster ovary cells. Nucleic Acids Res 24, 1954-1962.
Ganot, P., M. L. Bortolin and T. Kiss (1997). Site-specific pseudouridine formation in preribosomal RNA is guided by small nucleolar RNAs. Cell 89, 799-809.
Ge, J. and Y. T. Yu (2013). RNA pseudouridylation: new insights into an old modification. Trends Biochem Sci 38, 210-218.
Girard, A., R. Sachidanandam, G. J. Hannon and M. A. Carmell (2006). A germline-specific class of small RNAs binds mammalian Piwi proteins. Nature 442, 199-202.
Green, R. and H. F. Noller (1997). Ribosomes and translation. Annu Rev Biochem 66, 679-716.
Gu, W., et al. (2009). Distinct argonaute-mediated 22G-RNA pathways direct genome surveillance in the C. elegans germline. Mol Cell 36, 231-244.
Guthrie, C. and B. Patterson (1988). Spliceosomal snRNAs. Annu Rev Genet 22, 387-419.

Hagan, J. P., E. Piskounova and R. I. Gregory (2009). Lin28 recruits the TUTase Zcchc11 to inhibit let-7 maturation in mouse embryonic stem cells. Nat Struct Mol Biol 16, 1021-1025.
Hahn, S. (2004). Structure and mechanism of the RNA polymerase II transcription machinery. Nat Struct Mol Biol 11, 394-403.
Heo, I., M. Ha, J. Lim, M. J. Yoon, J. E. Park, S. C. Kwon, H. Chang and V. N. Kim (2012). Mono-Uridylation of Pre-MicroRNA as a Key Step in the Biogenesis of Group II let-7 MicroRNAs. Cell 151, 521-532.
Heo, I., C. Joo, Y. K. Kim, M. Ha, M. J. Yoon, J. Cho, K. H. Yeom, J. Han and V. N. Kim (2009). TUT4 in concert with Lin28 suppresses microRNA biogenesis through pre-microRNA uridylation. Cell 138, 696-708.
Hoefig, K. P., N. Rath, G. A. Heinz, C. Wolf, J. Dameris, A. Schepers, E. Kremmer, K. M. Ansel and V. Heissmeyer (2013). Eri1 degrades the stem-loop of oligouridylated histone mRNAs to induce replication-dependent decay. Nat Struct Mol Biol 20, 73-81.
Horwich, M. D., C. Li, C. Matranga, V. Vagin, G. Farley, P. Wang and P. D. Zamore (2007). The Drosophila RNA methyltransferase, DmHen1, modifies germline piRNAs and single-stranded siRNAs in RISC. Curr Biol 17, 1265-1272.
Houwing, S., et al. (2007). A role for Piwi and piRNAs in germ cell maintenance and transposon silencing in Zebrafish. Cell 129, 69-82.
Hsu, C. L. and A. Stevens (1993). Yeast cells lacking 5'-->3' exoribonuclease 1 contain mRNA species that are poly(A) deficient and partially lack the 5' cap structure. Mol Cell Biol 13, 4826-4835.
Ibrahim, F., J. Rohr, W. J. Jeong, J. Hesson and H. Cerutti (2006). Untemplated oligoadenylation promotes degradation of RISC-cleaved transcripts. Science 314, 1893.
Ibrahim, F., L. A. Rymarquis, E. J. Kim, J. Becker, E. Balassa, P. J. Green and H. Cerutti (2010). Uridylation of mature miRNAs and siRNAs by the MUT68 nucleotidyltransferase promotes their degradation in Chlamydomonas. Proc Natl Acad Sci U S A 107, 3906-3911.
Jady, B. E. and T. Kiss (2001). A small nucleolar guide RNA functions both in 2'-O-ribose methylation and pseudouridylation of the U5 spliceosomal RNA. EMBO J 20, 541-551.
Jones, M. R., L. J. Quinton, M. T. Blahna, J. R. Neilson, S. Fu, A. R. Ivanov, D. A. Wolf and J. P. Mizgerd (2009). Zcchc11-dependent uridylation of microRNA directs cytokine expression. Nat Cell Biol 11, 1157-1163.
Kamminga, L. M., M. J. Luteijn, M. J. den Broeder, S. Redl, L. J. Kaaij, E. F. Roovers, P. Ladurner, E. Berezikov and R. F. Ketting (2010). Hen1 is required for oocyte development and piRNA stability in zebrafish. EMBO J 29, 3688-3700.
Kedinger, C., M. Gniazdowski, J. L. Mandel, Jr., F. Gissinger and P. Chambon (1970). Alpha-amanitin: a specific inhibitor of one of two DNA-pendent RNA polymerase activities from calf thymus. Biochem Biophys Res Commun 38, 165-171.
Keller, W. (1995). No end yet to messenger RNA 3' processing! Cell 81, 829-832.
Kelley, L. A. and M. J. Sternberg (2009). Protein structure prediction on the Web: a case study using the Phyre server. Nat Protoc 4, 363-371.
Kim, D. H., P. Saetrom, O. Snove, Jr. and J. J. Rossi (2008). MicroRNA-directed transcriptional gene silencing in mammalian cells. Proc Natl Acad Sci U S A 105, 16230-16235.
Kirino, Y. and Z. Mourelatos (2007). Mouse Piwi-interacting RNAs are 2'-O-methylated at their 3' termini. Nat Struct Mol Biol 14, 347-348.

Kiss-Laszlo, Z., Y. Henry, J. P. Bachellerie, M. Caizergues-Ferrer and T. Kiss (1996). Site-specific ribose methylation of preribosomal RNA: a novel function for small nucleolar RNAs. Cell 85, 1077-1088.

Kurth, H. M. and K. Mochizuki (2009). 2'-O-methylation stabilizes Piwi-associated small RNAs and ensures DNA elimination in Tetrahymena. RNA 15, 675-685.

LaCava, J., J. Houseley, C. Saveanu, E. Petfalski, E. Thompson, A. Jacquier and D. Tollervey (2005). RNA degradation by the exosome is promoted by a nuclear polyadenylation complex. Cell 121, 713-724.

Lafontaine, D. L. and D. Tollervey (1998). Birth of the snoRNPs: the evolution of the modification-guide snoRNAs. Trends Biochem Sci 23, 383-388.

Lai, W. S., E. Carballo, J. M. Thorn, E. A. Kennington and P. J. Blackshear (2000). Interactions of CCCH zinc finger proteins with mRNA. Binding of tristetraprolin-related zinc finger proteins to Au-rich elements and destabilization of mRNA. J Biol Chem 275, 17827-17837.

Landgraf, P., et al. (2007). A mammalian microRNA expression atlas based on small RNA library sequencing. Cell 129, 1401-1414.

Lehrbach, N. J., J. Armisen, H. L. Lightfoot, K. J. Murfitt, A. Bugaut, S. Balasubramanian and E. A. Miska (2009). LIN-28 and the poly(U) polymerase PUP-2 regulate let-7 microRNA processing in Caenorhabditis elegans. Nat Struct Mol Biol 16, 1016-1020.

Li, J., Z. Yang, B. Yu, J. Liu and X. Chen (2005). Methylation protects miRNAs and siRNAs from a 3'-end uridylation activity in Arabidopsis. Curr Biol 15, 1501-1507.

Li, Y. and M. Kiledjian (2010). Regulation of mRNA decapping. Wiley Interdiscip Rev RNA 1, 253-265.

Lisitsky, I., P. Klaff and G. Schuster (1996). Addition of destabilizing poly (A)-rich sequences to endonuclease cleavage sites during the degradation of chloroplast mRNA. Proc Natl Acad Sci U S A 93, 13398-13403.

Lisitsky, I., A. Kotler and G. Schuster (1997). The mechanism of preferential degradation of polyadenylated RNA in the chloroplast. The exoribonuclease 100RNP/polynucleotide phosphorylase displays high binding affinity for poly(A) sequence. J Biol Chem 272, 17648-17653.

Loughlin, F. E., L. F. Gebert, H. Towbin, A. Brunschweiger, J. Hall and F. H. Allain (2012). Structural basis of pre-let-7 miRNA recognition by the zinc knuckles of pluripotency factor Lin28. Nat Struct Mol Biol 19, 84-89.

Lubas, M., C. K. Damgaard, R. Tomecki, D. Cysewski, T. H. Jensen and A. Dziembowski (2013). Exonuclease hDIS3L2 specifies an exosome-independent 3'-5' degradation pathway of human cytoplasmic mRNA. EMBO J 32, 1855-1868.

Lund, E. and J. E. Dahlberg (1992). Cyclic 2',3'-phosphates and nontemplated nucleotides at the 3' end of spliceosomal U6 small nuclear RNA's. Science 255, 327-330.

Lunde, B. M., I. Magler and A. Meinhart (2012). Crystal structures of the Cid1 poly (U) polymerase reveal the mechanism for UTP selectivity. Nucleic Acids Res 40, 9815-9824.

Lykke-Andersen, J. (2002). Identification of a human decapping complex associated with hUpf proteins in nonsense-mediated decay. Mol Cell Biol 22, 8114-8121.

Maden, B. E. and J. M. Hughes (1997). Eukaryotic ribosomal RNA: the recent excitement in the nucleotide modification problem. Chromosoma 105, 391-400.

Malecki, M., S. C. Viegas, T. Carneiro, P. Golik, C. Dressaire, M. G. Ferreira and C. M. Arraiano (2013). The exoribonuclease Dis3L2 defines a novel eukaryotic RNA degradation pathway. EMBO J 32, 1842-1854.

Maniatis, T. and R. Reed (1987). The role of small nuclear ribonucleoprotein particles in pre-mRNA splicing. Nature 325, 673-678.

Manley, J. L. (1995). Messenger RNA polyadenylylation: a universal modification. Proc Natl Acad Sci U S A 92, 1800-1801.

Martin, G. and W. Keller (1996). Mutational analysis of mammalian poly(A) polymerase identifies a region for primer binding and catalytic domain, homologous to the family X polymerases, and to other nucleotidyltransferases. EMBO J 15, 2593-2603.

Martin, G. and W. Keller (2007). RNA-specific ribonucleotidyl transferases. RNA 13, 1834-1849.

Mayr, C. and D. P. Bartel (2009). Widespread shortening of 3'UTRs by alternative cleavage and polyadenylation activates oncogenes in cancer cells. Cell 138, 673-684.

Merrick, W. C. (2004). Cap-dependent and cap-independent translation in eukaryotic systems. Gene 332, 1-11.

Meyer, S., C. Temme and E. Wahle (2004). Messenger RNA turnover in eukaryotes: pathways and enzymes. Crit Rev Biochem Mol Biol 39, 197-216.

Minoda, Y., K. Saeki, D. Aki, H. Takaki, T. Sanada, K. Koga, T. Kobayashi, G. Takaesu and A. Yoshimura (2006). A novel Zinc finger protein, ZCCHC11, interacts with TIFA and modulates TLR signaling. Biochem Biophys Res Commun 344, 1023-1030.

Morin, R. D., et al. (2008). Application of massively parallel sequencing to microRNA profiling and discovery in human embryonic stem cells. Genome Res 18, 610-621.

Mowry, K. L. and J. A. Steitz (1988). snRNP mediators of 3' end processing: functional fossils? Trends Biochem Sci 13, 447-451.

Mullen, T. E. and W. F. Marzluff (2008). Degradation of histone mRNA requires oligouridylation followed by decapping and simultaneous degradation of the mRNA both 5' to 3' and 3' to 5'. Genes Dev 22, 50-65.

Munoz-Tello, P., C. Gabus and S. Thore (2012). Functional implications from the Cid1 poly(U) polymerase crystal structure. Structure 20, 977-986.

Munoz-Tello, P., C. Gabus and S. Thore (2014). A critical switch in the enzymatic properties of the Cid1 protein deciphered from its product-bound crystal structure. Nucleic Acids Res 42, 3372-3380.

Munroe, D. and A. Jacobson (1990). mRNA poly(A) tail, a 3' enhancer of translational initiation. Mol Cell Biol 10, 3441-3455.

Muthukrishnan, S., G. W. Both, Y. Furuichi and A. J. Shatkin (1975). 5'-Terminal 7-methylguanosine in eukaryotic mRNA is required for translation. Nature 255, 33-37.

Nakamura, T., Y. Zhao, Y. Yamagata, Y. J. Hua and W. Yang (2012). Watching DNA polymerase eta make a phosphodiester bond. Nature 487, 196-201.

Nam, Y., C. Chen, R. I. Gregory, J. J. Chou and P. Sliz (2011). Molecular basis for interaction of let-7 microRNAs with Lin28. Cell 147, 1080-1091.

Newman, M. A., V. Mani and S. M. Hammond (2011). Deep sequencing of microRNA precursors reveals extensive 3' end modification. RNA 17, 1795-1803.

Nishikura, K. (2006). Editor meets silencer: crosstalk between RNA editing and RNA interference. Nat Rev Mol Cell Biol 7, 919-931.

Osley, M. A. (1991). The regulation of histone synthesis in the cell cycle. Annu Rev Biochem 60, 827-861.

Pandey, N. B. and W. F. Marzluff (1987). The stem-loop structure at the 3' end of histone mRNA is necessary and sufficient for regulation of histone mRNA stability. Mol Cell Biol 7, 4557-4559.

Panigrahi, A. K., A. Schnaufer, N. L. Ernst, B. Wang, N. Carmean, R. Salavati and K. Stuart (2003). Identification of novel components of Trypanosoma brucei editosomes. RNA 9, 484-492.

Parker, R. and H. Song (2004). The enzymes and control of eukaryotic mRNA turnover. Nat Struct Mol Biol 11, 121-127.

Penschow, J. L., D. A. Sleve, C. M. Ryan and L. K. Read (2004). TbDSS-1, an essential Trypanosoma brucei exoribonuclease homolog that has pleiotropic effects on mitochondrial RNA metabolism. Eukaryot Cell 3, 1206-1216.

Piskounova, E., C. Polytarchou, J. E. Thornton, R. J. LaPierre, C. Pothoulakis, J. P. Hagan, D. Iliopoulos and R. I. Gregory (2011). Lin28A and Lin28B inhibit let-7 microRNA biogenesis by distinct mechanisms. Cell 147, 1066-1079.

Pollard, V. W., M. E. Harris and S. L. Hajduk (1992). Native mRNA editing complexes from Trypanosoma brucei mitochondria. EMBO J 11, 4429-4438.

Prasad, S., P. B. Khadatare and I. Roy (2011). Effect of chemical chaperones in improving the solubility of recombinant proteins in Escherichia coli. Appl Environ Microbiol 77, 4603-4609.

Raynal, L. C. and A. J. Carpousis (1999). Poly(A) polymerase I of Escherichia coli: characterization of the catalytic domain, an RNA binding site and regions for the interaction with proteins involved in mRNA degradation. Mol Microbiol 32, 765-775.

Read, R. L., R. G. Martinho, S. W. Wang, A. M. Carr and C. J. Norbury (2002). Cytoplasmic poly(A) polymerases mediate cellular responses to S phase arrest. Proc Natl Acad Sci U S A 99, 12079-12084.

Ren, G., X. Chen and B. Yu (2012). Uridylation of miRNAs by hen1 suppressor1 in Arabidopsis. Curr Biol 22, 695-700.

Ren, G., M. Xie, S. Zhang, C. Vinovskis, X. Chen and B. Yu (2014). Methylation protects microRNAs from an AGO1-associated activity that uridylates 5' RNA fragments generated by AGO1 cleavage. Proc Natl Acad Sci U S A 111, 6365-6370.

Richard, P. and J. L. Manley (2009). Transcription termination by nuclear RNA polymerases. Genes Dev 23, 1247-1269.

Rinke, J. and J. A. Steitz (1985). Association of the lupus antigen La with a subset of U6 snRNA molecules. Nucleic Acids Res 13, 2617-2629.

Rissland, O. S., A. Mikulasova and C. J. Norbury (2007). Efficient RNA polyuridylation by noncanonical poly(A) polymerases. Mol Cell Biol 27, 3612-3624.

Rissland, O. S. and C. J. Norbury (2009). Decapping is preceded by 3' uridylation in a novel pathway of bulk mRNA turnover. Nat Struct Mol Biol 16, 616-623.

Roeder, R. G. and W. J. Rutter (1969). Multiple forms of DNA-dependent RNA polymerase in eukaryotic organisms. Nature 224, 234-237.

Rusche, L. N., J. Cruz-Reyes, K. J. Piller and B. Sollner-Webb (1997). Purification of a functional enzymatic editing complex from Trypanosoma brucei mitochondria. EMBO J 16, 4069-4081.

Savva, Y. A., L. E. Rieder and R. A. Reenan (2012). The ADAR protein family. Genome Biol 13, 252.

Schmidt, M. J., S. West and C. J. Norbury (2011). The human cytoplasmic RNA terminal U-transferase ZCCHC11 targets histone mRNAs for degradation. RNA 17, 39-44.

Schoenberg, D. R. and L. E. Maquat (2012). Regulation of cytoplasmic mRNA decay. Nat Rev Genet 13, 246-259.

Seiwert, S. D., S. Heidmann and K. Stuart (1996). Direct visualization of uridylate deletion in vitro suggests a mechanism for kinetoplastid RNA editing. Cell 84, 831-841.

Sement, F. M., E. Ferrier, H. Zuber, R. Merret, M. Alioua, J. M. Deragon, C. Bousquet-Antonelli, H. Lange and D. Gagliardi (2013). Uridylation prevents 3' trimming of oligoadenylated mRNAs. Nucleic Acids Res 41, 7115-7127.

Sharp, P. A. (2009). The centrality of RNA. Cell 136, 577-580.

Shatkin, A. J. (1976). Capping of eucaryotic mRNAs. Cell 9, 645-653.

Shaw, J. M., J. E. Feagin, K. Stuart and L. Simpson (1988). Editing of kinetoplastid mitochondrial mRNAs by uridine addition and deletion generates conserved amino acid sequences and AUG initiation codons. Cell 53, 401-411.

Shuman, S. (1995). Capping enzyme in eukaryotic mRNA synthesis. Prog Nucleic Acid Res Mol Biol 50, 101-129.

Shyu, A. B., J. G. Belasco and M. E. Greenberg (1991). Two distinct destabilizing elements in the c-fos message trigger deadenylation as a first step in rapid mRNA decay. Genes Dev 5, 221-231.

Simonovic, M. and T. A. Steitz (2009). A structural view on the mechanism of the ribosome-catalyzed peptide bond formation. Biochim Biophys Acta 1789, 612-623.

Slomovic, S., D. Laufer, D. Geiger and G. Schuster (2005). Polyadenylation and degradation of human mitochondrial RNA: the prokaryotic past leaves its mark. Mol Cell Biol 25, 6427-6435.

Slomovic, S. and G. Schuster (2008). Stable PNPase RNAi silencing: its effect on the processing and adenylation of human mitochondrial RNA. RNA 14, 310-323.

Song, M. G. and M. Kiledjian (2007). 3' Terminal oligo U-tract-mediated stimulation of decapping. RNA 13, 2356-2365.

Sprinzl, M. and F. Cramer (1979). The -C-C-A end of tRNA and its role in protein biosynthesis. Prog Nucleic Acid Res Mol Biol 22, 1-69.

Stagno, J., I. Aphasizheva, R. Aphasizhev and H. Luecke (2007). Dual role of the RNA substrate in selectivity and catalysis by terminal uridylyl transferases. Proc Natl Acad Sci U S A 104, 14634-14639.

Stagno, J., I. Aphasizheva, J. Bruystens, H. Luecke and R. Aphasizhev (2010). Structure of the mitochondrial editosome-like complex associated TUTase 1 reveals divergent mechanisms of UTP selection and domain organization. J Mol Biol 399, 464-475.

Stagno, J., I. Aphasizheva, A. Rosengarth, H. Luecke and R. Aphasizhev (2007). UTP-bound and Apo structures of a minimal RNA uridylyltransferase. J Mol Biol 366, 882-899.

Stuart, K. D., A. Schnaufer, N. L. Ernst and A. K. Panigrahi (2005). Complex management: RNA editing in trypanosomes. Trends Biochem Sci 30, 97-105.

Szczesny, R. J., L. S. Borowski, L. K. Brzezniak, A. Dmochowska, K. Gewartowski, E. Bartnik and P. P. Stepien (2010). Human mitochondrial RNA turnover caught in flagranti: involvement of hSuv3p helicase in RNA surveillance. Nucleic Acids Res 38, 279-298.

Tam, O. H., et al. (2008). Pseudogene-derived small interfering RNAs regulate gene expression in mouse oocytes. Nature 453, 534-538.

Thornton, J. E., H. M. Chang, E. Piskounova and R. I. Gregory (2012). Lin28-mediated control of let-7 microRNA expression by alternative TUTases Zcchc11 (TUT4) and Zcchc6 (TUT7). RNA 18, 1875-1885.

Trippe, R., E. Guschina, M. Hossbach, H. Urlaub, R. Luhrmann and B. J. Benecke (2006). Identification, cloning, and functional analysis of the human U6 snRNA-specific terminal uridylyl transferase. RNA 12, 1494-1504.

Trippe, R., B. Sandrock and B. J. Benecke (1998). A highly specific terminal uridylyl transferase modifies the 3'-end of U6 small nuclear RNA. Nucleic Acids Res 26, 3119-3126.

Vagin, V. V., A. Sigova, C. Li, H. Seitz, V. Gvozdev and P. D. Zamore (2006). A distinct small RNA pathway silences selfish genetic elements in the germline. Science 313, 320-324.

van der Velden, A. W. and A. A. Thomas (1999). The role of the 5' untranslated region of an mRNA in translation regulation during development. Int J Biochem Cell Biol 31, 87-106.

van Dijk, E., N. Cougot, S. Meyer, S. Babajko, E. Wahle and B. Seraphin (2002). Human Dcp2: a catalytically active mRNA decapping enzyme located in specific cytoplasmic structures. EMBO J 21, 6915-6924.

van Wolfswinkel, J. C., J. M. Claycomb, P. J. Batista, C. C. Mello, E. Berezikov and R. F. Ketting (2009). CDE-1 affects chromosome segregation through uridylation of CSR-1-bound siRNAs. Cell 139, 135-148.

Viswanathan, S. R., et al. (2009). Lin28 promotes transformation and is associated with advanced human malignancies. Nat Genet 41, 843-848.

Wang, B., N. L. Ernst, S. S. Palazzo, A. K. Panigrahi, R. Salavati and K. Stuart (2003). TbMP44 is essential for RNA editing and structural integrity of the editosome in Trypanosoma brucei. Eukaryot Cell 2, 578-587.

Wang, S. W., C. Norbury, A. L. Harris and T. Toda (1999). Caffeine can override the S-M checkpoint in fission yeast. J Cell Sci 112 (Pt 6), 927-937.

Wang, S. W., T. Toda, R. MacCallum, A. L. Harris and C. Norbury (2000). Cid1, a fission yeast protein required for S-M checkpoint control when DNA polymerase delta or epsilon is inactivated. Mol Cell Biol 20, 3234-3244.

Wang, Z. and M. Kiledjian (2001). Functional link between the mammalian exosome and mRNA decapping. Cell 107, 751-762.

Watanabe, T., et al. (2008). Endogenous siRNAs from naturally formed dsRNAs regulate transcripts in mouse oocytes. Nature 453, 539-543.

Watson, J. D. and F. H. Crick (1953). Genetical implications of the structure of deoxyribonucleic acid. Nature 171, 964-967.

Watson, J. D. and F. H. Crick (1953). Molecular structure of nucleic acids; a structure for deoxyribose nucleic acid. Nature 171, 737-738.

Weng, J., I. Aphasizheva, R. D. Etheridge, L. Huang, X. Wang, A. M. Falick and R. Aphasizhev (2008). Guide RNA-binding complex from mitochondria of trypanosomatids. Mol Cell 32, 198-209.

West, S., N. Gromak, C. J. Norbury and N. J. Proudfoot (2006). Adenylation and exosome-mediated degradation of cotranscriptionally cleaved pre-messenger RNA in human cells. Mol Cell 21, 437-443.

Wilkie, G. S., K. S. Dickson and N. K. Gray (2003). Regulation of mRNA translation by 5'- and 3'-UTR-binding factors. Trends Biochem Sci 28, 182-188.

Wilkie, N. M. and R. M. Smellie (1968). Chain extension of ribonucleic acid by enzymes from rat liver cytoplasm. Biochem J 109, 485-494.

Wilkie, N. M. and R. M. Smellie (1968). Polyribonucleotide synthesis by subfractions of microsomes from rat liver. Biochem J 109, 229-238.

Yates, L. A., S. Fleurdepine, O. S. Rissland, L. De Colibus, K. Harlos, C. J. Norbury and R. J. Gilbert (2012). Structural basis for the activity of a cytoplasmic RNA terminal uridylyl transferase. Nat Struct Mol Biol 19, 782-787.

Yu, B., Z. Yang, J. Li, S. Minakhina, M. Yang, R. W. Padgett, R. Steward and X. Chen (2005). Methylation as a crucial step in plant microRNA biogenesis. Science 307, 932-935.

Yu, J., et al. (2007). Induced pluripotent stem cell lines derived from human somatic cells. Science 318, 1917-1920.

Zhao, Y. Y., Y. Yu, J. X. Zhai, V. Ramachandran, T. T. Dinh, B. C. Meyers, B. X. Mo and X. M. Chen (2012). The Arabidopsis Nucleotidyl Transferase HESO1 Uridylates Unmethylated Small RNAs to Trigger Their Degradation. Curr Biol 22, 689-694.

Zhelkovsky, A. M., M. M. Kessler and C. L. Moore (1995). Structure-function relationships in the Saccharomyces cerevisiae poly(A) polymerase. Identification of a novel RNA binding site and a domain that interacts with specificity factor(s). J Biol Chem 270, 26715-26720.

Zimmer, S. L., A. Schein, G. Zipor, D. B. Stern and G. Schuster (2009). Polyadenylation in Arabidopsis and Chlamydomonas organelles: the input of nucleotidyltransferases, poly(A) polymerases and polynucleotide phosphorylase. Plant J 59, 88-99.
