Evolution of Telomerase RNA

by

Dhenugen Logeswaran

A Dissertation Presented in Partial Fulfillment of the Requirements for the Degree Doctor of Philosophy

Approved November 2019 by the Graduate Supervisory Committee:

Julian Chen, Chair
Giovanna Ghirlanda
Chad Borges

ARIZONA STATE UNIVERSITY

December 2019

ABSTRACT

The highly specialized telomerase ribonucleoprotein enzyme is composed

minimally of telomerase reverse transcriptase (TERT) and telomerase RNA (TR) for

catalytic activity. Telomerase is an RNA-dependent DNA polymerase that synthesizes

DNA repeats at chromosome ends to maintain genome stability. While TERT is highly conserved among various groups of species, the TR subunit exhibits remarkable divergence in primary sequence, length, secondary structure and biogenesis, making TR identification extremely challenging even among closely related groups of organisms.

A unique computational approach combined with in vitro telomerase activity reconstitution studies was used to identify 83 novel TRs from 10 animal phyla spanning 18 diverse classes, from the most basal sponges to the late-evolving vertebrates.

This revealed that three structural domains, the pseudoknot, a distal stem-loop moiety and box H/ACA, are conserved within TRs from basal groups to vertebrates, while group-specific elements emerge or disappear during animal TR evolution along different lineages.

Next the corn-smut fungus Ustilago maydis TR was identified using an RNA-

immunoprecipitation and next-generation sequencing approach followed by

computational identification of TRs from 19 additional class Ustilaginomycetes fungi,

leveraging conserved gene synteny among TR genes. Phylogenetic comparative analysis,

in vitro telomerase activity and TR mutagenesis studies reveal a secondary structure of

TRs from higher fungi, which is also conserved with vertebrates and filamentous fungi,

providing a crucial link in TR evolution within the Opisthokonta super-kingdom.

Lastly, work by collaborators from Texas A&M University and others identified the first bona fide TR from the model plant Arabidopsis thaliana. Computational analysis was performed to identify 85 novel AtTR orthologs from three major plant clades: angiosperms, gymnosperms and lycophytes, which facilitated phylogenetic comparative analysis to infer the first plant TR secondary structural model. This model was confirmed using site-specific mutagenesis and telomerase activity assays of in vitro reconstituted enzyme. The structures of plant TRs are conserved across land plants, providing an evolutionary bridge that unites the disparate structures of previously characterized TRs from ciliates and vertebrates.

DEDICATION

This dissertation is dedicated to my family whose unconditional love and unwavering

support made all this possible in my life. In memory of my beloved father and my first

teacher, Caruppiah Logeswaran, who passed away two years ago. My mother,

Santhirapavani Logeswaran whose inspiring strength is the driving force that made me reach many milestones. My brother, Lajanugen Logeswaran, my intellectual role model.

My wife Nirupa Nagaratnam, who supported me physically, emotionally and

intellectually through college and my PhD. Their sacrifices are unmatched, and I will

forever be grateful. Finally, our son Ashwick, whose arrival is the single greatest joy in

our life.

ACKNOWLEDGMENTS

My deepest gratitude to my advisor Prof. Julian Chen who demanded rigorous and meticulous work instilling in me the passion for scientific glory. He taught me everything about being a great scientist from critical thinking to even managing time. His lessons will forever accompany me in all my future scientific endeavors.

I am grateful to my fellow lab members of the Chen lab past and present; Joshua

Podlevsky, Dustin Rand, Yinnan Chen, Zhenqiu Huang, Bowen Liu and Yang Li for

being great team members and for many insightful discussions. Undergraduate students

past and present who performed tedious work to acquire invaluable laboratory training.

TABLE OF CONTENTS

LIST OF TABLES
LIST OF FIGURES

CHAPTER

1 INTRODUCTION
1.1 Overview of Telomeres and Telomerase Research
1.2 Telomerase Reverse Transcriptase
1.3 Telomerase RNA
1.4 Recent Advances in Structural Studies of Telomerase Holoenzyme
1.5 References

2 STRUCTURE AND FUNCTION OF METAZOAN TELOMERASE RNA
2.1 Abstract
2.2 Introduction
2.3 Materials and Methods
2.4 Results
2.5 Discussion
2.6 References

3 STRUCTURE AND FUNCTION OF USTILAGO FUNGAL TELOMERASE RNA
3.1 Abstract
3.2 Introduction
3.3 Materials and Methods
3.4 Results
3.5 Discussion
3.6 References

4 STRUCTURE AND FUNCTION OF LAND PLANTS TELOMERASE RNA
4.1 Abstract
4.2 Introduction
4.3 Materials and Methods
4.4 Results
4.5 Discussion
4.6 References

REFERENCES

APPENDIX

A SPECIES IDENTIFICATION VIA PCR AMPLIFICATION OF RIBOSOMAL RNA AND SANGER SEQUENCING
B PLASMID MAP OF PCM955-3XFLAG-UMATERT
C READ COVERAGE OF U. MAYDIS CANDIDATE #2 LOCUS
D UMATR α SEQUENCE AND TEMPLATE ANNOTATION
E MULTIPLE SEQUENCE ALIGNMENT OF UMAG_03168 HOMOLOGS
F MULTIPLE SEQUENCE ALIGNMENT OF CAR2-ORNITHINE OXO-ACID TRANSAMINASE (OAT)
G EXPANDED PHYLOGENETIC TREE OF LAND PLANTS
H CO-AUTHOR APPROVAL

LIST OF TABLES

Table
4.1 Species with Land Plant TRs Identified in this Study

LIST OF FIGURES

Figure
1.1 Telomere Sequences of Major Eukaryotic Groups
1.2 Schematic of the End-replication problem
1.3 Telomerase adds DNA Repeats to Telomeres de novo
1.4 Domain Architecture of the Catalytic TERT Protein
1.5 Conserved Structural Domains of TR from well-studied Groups
1.6 Diverse TR Biogenesis Pathways from Major Organism Groups
1.7 Cryo-EM structures of Tetrahymena and Human Telomerase
2.1 Phylogeny Assisted TR Identification Approach
2.2 Phylogenetic Tree of Metazoan Kingdom with Number of TRs Identified Shown
2.3 Validation and Characterization of Saccoglossus kowalevskii (acorn worm) TR
2.4 Validation and Characterization of Pomacea diffusa (apple snail) TR
2.5 Validation and Characterization of Crassostrea virginica (American oyster) TR
2.6 The CR4/5 Domain Comprising the P6.1 Stem Loop is Conserved Across Basal Metazoa
2.7 Comparison of TR Essential Template Core, Distal Stem Loop Moiety and 3’ Biogenesis Domains from Major Metazoan Clades
3.1 Evolutionary Relationships of Major Fungal Groups and TR Identification Status of Each Subphyla Shown
3.2 Expression of 3xFLAG-UmaTERT in U. maydis cells
3.3 Testing Conditions for Optimal Lysis, Immunoprecipitation and Telomerase Activity of 3xFLAG-UmaTERT Clone A
3.4 RNase Sensitive Telomerase Activity of 3xFLAG-UmaTERT
3.5 Bioinformatics Pipeline for Screening UmaTR Candidates
3.6 Validation of Candidate #2 as UmaTR
3.7 Characterization of U. maydis TR Transcripts
3.8 Synteny Conservation of the TR Locus in Class Ustilaginomycetes
3.9 Identification of the Minimal UmaTR Regions Required for Telomerase Activity
3.10 Multiple Sequence Alignment of T/PK and eCR4/5 Domains of Select Ustilaginales Species
3.11 Secondary Structure Model of the UmaTR Core Domains
4.1 AtTR Harbors the RNA Template for Arabidopsis Telomerase
4.2 Land Plant Clades Share Universally Conserved Regions Within TRs
4.3 Sequence Alignments of TR Structural Elements from Respective Clades to Identify Group-Specific Co-Variations
4.4 Plant TRs Share a Conserved Secondary Structure
4.5 Functional Characterization of Critical Structural Elements in AtTR
4.6 Evolution of TR Pseudoknot Structures

CHAPTER 1

INTRODUCTION

1.1 Overview of telomeres and telomerase research

Of the three domains of life, a defining feature of eukarya is the predominant

presence of linear chromosomes. Linear chromosomes require end-protection to keep them

from being recognized as damaged DNA ends or from being fused together with other

chromosomes resulting in genomic instability. Insight into the presence of an end

protection mechanism in chromosomes first came from experiments performed by

Hermann Muller, who discovered that double-stranded DNA breaks formed by X-ray irradiation are incapable of fusing to native linear chromosomal ends (Muller, 1938). This

observation suggests that native chromosome termini are protected from abnormal fusion

by a mechanism that was unknown at that time. Muller thus named the protected chromosome termini ‘telomeres’, from the Greek words for ‘end’ (telos) and

‘part’ (meros). Around the same time, pioneering cytogeneticist Barbara McClintock observed that chromosomes broken during mitosis had ends that were distinct from the

natural ends of chromosomes in maize (McClintock, 1939; McClintock, 1941). This

seminal work by Muller and McClintock laid the foundation for telomere biology many

years before DNA was recognized as the genetic material.

Telomeres are deoxyribonucleoprotein complexes composed of highly repetitive

DNA sequences bound to a specific protein complex. The very first telomere sequence

identified by Elizabeth Blackburn was from the basal single-celled ciliate Tetrahymena

thermophila (Blackburn & Gall, 1978). T. thermophila cells were uniquely suited for this purpose because of the abundance of chromosome termini in their millions of copies

of minichromosomes. T. thermophila telomeres contain ‘TTGGGG’ hexanucleotide repeats (Figure 1.1). Human telomeric DNA, comparable to that of T. thermophila in its repetitive nature, is composed of ‘TTAGGG’ repeats (Figure 1.1) (Moyzis et al., 1988).

The ‘TTAGGG’ telomeric sequence is highly conserved among vertebrates, a vast number of invertebrates, filamentous fungi, certain plants and protozoans (Figure 1.1)

(Meyne et al., 1989; Podlevsky et al., 2008). This suggests that the ‘TTAGGG’ repeat sequence is the most ubiquitous and ancestral telomere sequence in eukaryotes (Figure

1.1).

Plant telomeres have a repeat similar to ‘TTAGGG’ but with an additional ‘T’, giving the repeating sequence ‘TTTAGGG’ (Figure 1.1).

This is observed in a majority of land plants with notable exceptions (J. Fajkus et al.,

2005; Riha & Shippen, 2003). Species from the Solanaceae family and the genus Allium, which includes onion, have either ‘T’-rich or highly unusual telomere repeats (P. Fajkus et al., 2016; Peška et al.,

2015). Interestingly, a small group of plants contains the vertebrate-like ‘TTAGGG’ telomere sequences (J. Fajkus et al., 2005).
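The repeat units quoted above can be compared against a sequence with a short script. The following is a minimal Python sketch, not a method used in this work: the dictionary labels, the function names and the toy sequence are hypothetical, and only the three repeat units themselves come from the text.

```python
import re

# Repeat units quoted in the text; the labels are illustrative only.
TELOMERE_REPEATS = {
    "Tetrahymena thermophila": "TTGGGG",
    "vertebrates (e.g. human)": "TTAGGG",
    "most land plants": "TTTAGGG",
}


def longest_tandem_run(sequence, unit):
    """Return the largest number of back-to-back copies of `unit` in `sequence`."""
    runs = re.findall(f"(?:{re.escape(unit)})+", sequence.upper())
    return max((len(run) // len(unit) for run in runs), default=0)


def best_matching_repeat(sequence):
    """Return (label, copies) for the repeat unit with the longest tandem run."""
    scores = {
        label: longest_tandem_run(sequence, unit)
        for label, unit in TELOMERE_REPEATS.items()
    }
    label = max(scores, key=scores.get)
    return label, scores[label]


if __name__ == "__main__":
    toy_read = "ACGA" + "TTAGGG" * 5 + "AC"  # hypothetical sequence
    print(best_matching_repeat(toy_read))
```

Counting only uninterrupted tandem copies is a deliberate simplification; real telomeric reads contain variant repeats and sequencing errors that such a sketch ignores.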

Incomplete replication of DNA at telomeres was recognized not long after the mechanism of DNA replication was discovered. When Watson and Crick first described the DNA double helix and the complementary nature of DNA, they proposed a semi-conservative mechanism for DNA replication where the two daughter DNA molecules formed from parental DNA contain one parental and one newly synthesized strand each.

Elegant experiments performed by Meselson and Stahl confirmed the semi-conservative

DNA replication mechanism (Meselson & Stahl, 1958). Following this discovery, it was recognized that conventional DNA polymerases are incapable of completely replicating

linear chromosomal ends, a limitation described as the ‘end-replication problem’ (Figure 1.2)

(Olovnikov, 1973; Watson, 1972). This problem arises due to the inherent properties of

DNA polymerases and the DNA replication mechanism itself. All known DNA

polymerases synthesize DNA in a 5’- to 3’- direction. The synthesis requires a free 3’

hydroxyl group for nucleophilic attack of an incoming deoxyribonucleotide for

subsequent polymerization. The free 3’-hydroxyl group is provided by RNA primers

which are removed and replaced with DNA following synthesis (Figure 1.2). An RNA primer

that anneals to the extreme terminus of the chromosome cannot be replaced by DNA due

to the absence of a downstream RNA primer. This generates a daughter strand that is

shorter than the parental strand giving rise to the end-replication problem (Figure 1.2).

The ‘end-replication problem’ was proposed during a time when telomere

structures were largely unexplored. Thus, it was described assuming that telomere DNA

was blunt ended. However, later research revealed that linear chromosomes have long 3’-

single stranded DNA overhangs (Figure 1.2) (Makarov et al., 1997). These overhangs are

generated by the exonucleases Apollo and Exo1, which resect the blunt-ended DNA strand produced by leading strand synthesis (Sfeir et al., 2005). Mechanistic details of the end-replication problem were demonstrated in yeast, showing a net loss of the leading telomere

(Figure 1.2) (Soudet et al., 2014). Without replenishment, this loss occurs with each cell division and round of DNA replication, and telomeres eventually reach critical lengths, leading to genome-wide instability.
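The cumulative effect of the end-replication problem can be illustrated with a simple counting model. The sketch below is a hypothetical Python illustration only: the function name is invented, and the starting length, loss per division and critical length are placeholder values chosen for arithmetic convenience, not measurements from this work.

```python
def divisions_until_critical(initial_bp, loss_per_division_bp, critical_bp,
                             gain_per_division_bp=0):
    """Count cell divisions until telomere length is at or below `critical_bp`.

    Returns None when the net change per division is zero or positive, that is,
    when added repeats fully compensate for replicative loss.
    """
    if initial_bp <= critical_bp:
        return 0
    net_loss = loss_per_division_bp - gain_per_division_bp
    if net_loss <= 0:
        return None
    length = initial_bp
    divisions = 0
    while length > critical_bp:
        length -= net_loss
        divisions += 1
    return divisions


if __name__ == "__main__":
    # Placeholder numbers, assumed for illustration only.
    print(divisions_until_critical(10_000, 100, 5_000))
    # With compensating repeat addition, the critical length is never reached.
    print(divisions_until_critical(10_000, 100, 5_000, gain_per_division_bp=100))
```

The optional `gain_per_division_bp` argument stands in for any mechanism that replenishes telomeric repeats; a constant loss per division is an assumption, since actual shortening rates vary.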

The interaction between telomeric DNA and telomere-binding proteins is one of the major mechanisms that prevent telomeres from being recognized as DNA damage and the ensuing cellular senescence. In mammals, the protein complex associated with

telomeric DNA repeats is composed of six proteins and is known as the shelterin complex (de

Lange, 2018). Three proteins, Telomeric Repeat-binding Factors 1 and 2 (TRF1, TRF2)

and Protection of Telomeres 1 (POT1), directly bind to telomeric DNA. Both TRF1 and

TRF2 interact with the double stranded region of telomeres whereas POT1 associates

with the single-stranded 3’ overhang (Baumann & Cech, 2001; Bilaud et al., 1997;

Broccoli et al., 1997; Choi et al., 2011; Chong et al., 1995). The remaining three proteins

do not directly interact with telomeric DNA but instead associate with TRF1, TRF2 and

POT1. The Repressor/Activator Protein 1 (RAP1) binds specifically to TRF2 but not

TRF1 (B. Li et al., 2000). The TRF1- and TRF2-Interacting Nuclear protein 2 (TIN2), as the name states, interconnects TRF1 and TRF2. The TIN2- and POT1-interacting protein 1

(TPP1) bridges POT1 and TIN2 (Houghtaling et al., 2004; Kim et al., 1999; Liu et al.,

2004; Ye & de Lange, 2004).

Replicative potential of somatic cells was found to be tightly correlated with telomere length (Harley et al., 1990). Normal human somatic cells are mortal with limited replicative capacity. Leonard Hayflick discovered that cell cultures derived from human tissues are capable of only a finite number of cell divisions before undergoing growth arrest or senescence, which is termed the ‘Hayflick limit’ (Hayflick, 1965). Thus, telomere length is viewed as a “mitotic clock” counting down as the cells divide. Upon

reaching a critical length, short telomeres lose protective function and trigger cellular

senescence, limiting the lifespan of cells and contributing to cellular aging (Harley et al.,

1992; Logeswaran & Chen, 2019).

Although somatic cells have limited replicative potential, germline cells, cancer cells

and stem cells have nearly unlimited proliferative capability. How do these rapidly

dividing cells solve the ‘end-replication problem’? Pioneering studies in T. thermophila performed by Carol Greider in Elizabeth Blackburn’s laboratory answered this question. Cell extracts from T. thermophila were found to have enzymatic activity capable of performing de novo DNA synthesis at telomeres (Greider & Blackburn, 1985; Greider

& Blackburn, 1987). While it was named terminal transferase activity at that time, it later came to be known as telomerase. The discovery of telomerase spawned a field of research with implications in various areas of biology including cancer, stem cells and anti-aging research. Carol Greider and Elizabeth Blackburn together received two-thirds of the 2009 Nobel Prize in Physiology or Medicine for their discovery of telomerase, highlighting the importance of their contributions.

Telomerase counteracts progressive loss of telomeric DNA by adding short

DNA repeats to chromosome termini (Shay & Wright, 2019). Telomerase is a ribonucleoprotein enzyme composed of an integral RNA component, telomerase RNA

(TR) and the catalytic telomerase reverse transcriptase (TERT) (Greider & Blackburn,

1987; Greider & Blackburn, 1989). In contrast to conventional reverse transcriptases, only a short sequence within the TR acts as a template for telomere repeat addition.

Despite utilizing an extremely short template, telomerase is capable of synthesizing immensely long stretches of DNA (Figure 1.3) (Shippen-Lentz & Blackburn, 1990). This property of regenerating the template for multiple rounds of repeat synthesis is unique to telomerase. Although the exact mechanism of how template regeneration occurs is not well understood, it is known that the template temporarily dissociates from substrate

DNA after repeat synthesis, translocates, and the 3’ region of the template re-hybridizes

to the substrate so that the catalytic TERT can synthesize the next repeat (Figure 1.3) (Qi et al., 2012).
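The cycle of alignment, repeat synthesis and translocation described above can be sketched in a few lines of Python. This is an illustrative toy model under stated assumptions, not a description of the enzymatic mechanism: the template 5'-CUAACCCUAAC-3' is the commonly cited human TR template region and is not taken from this chapter, the function names are hypothetical, and alignment is reduced to exact suffix matching.

```python
# Watson-Crick partners for copying an RNA template into DNA.
RNA_TO_DNA_COMPLEMENT = {"A": "T", "U": "A", "G": "C", "C": "G"}


def template_product(template_rna_5to3):
    """DNA strand (written 5'->3') complementary to the entire RNA template."""
    return "".join(RNA_TO_DNA_COMPLEMENT[base] for base in reversed(template_rna_5to3))


def add_repeats(primer_dna, template_rna_5to3, rounds):
    """Extend `primer_dna` by `rounds` cycles of alignment, synthesis and translocation."""
    product = template_product(template_rna_5to3)
    dna = primer_dna
    for _ in range(rounds):
        # Alignment: the longest primer 3' suffix that pairs with the 3' region of
        # the template, leaving at least one template position still to be copied.
        overlap = max(
            (k for k in range(1, len(product)) if dna.endswith(product[:k])),
            default=0,
        )
        if overlap == 0:
            raise ValueError("primer 3' end cannot pair with the template")
        # Synthesis: copy the remaining template positions up to the template 5' end.
        dna += product[overlap:]
        # Translocation is implicit: the next iteration re-aligns the new 3' end.
    return dna


if __name__ == "__main__":
    assumed_template = "CUAACCCUAAC"  # assumption: human TR template region
    print(add_repeats("TTAGGGTTAGGGTTAG", assumed_template, rounds=2))
```

Because the template is slightly longer than one repeat, the newly synthesized 3' end can pair again with the 3' region of the template, which is what allows each round in the sketch to add another repeat.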

1.2 Telomerase reverse transcriptase

The TERT protein is a core component of the telomerase enzyme and performs its catalytic function. TERT uses the TR template for de novo synthesis of telomeric DNA repeats (Shippen-Lentz & Blackburn, 1990). Except for a select group of insects, TERT is conserved across eukaryotes with linear chromosomes. The TERT protein consists of four independently folded domains: the telomerase essential N-terminal (TEN) domain, the telomerase RNA binding domain (TRBD), the reverse transcriptase (RT) domain and the C-terminal extension (CTE) (Figure 1.4A). While the TEN domain and TRBD are telomerase specific, the RT domain and CTE share conserved motifs with conventional reverse transcriptases (RTs) and DNA polymerases (Lingner et al., 1997).

Despite telomerase structural studies being historically challenging, a key crystal structure of the beetle TERT provided important insight into the overall architecture of

TERT (Figure 1.4C) (Gillis et al., 2008). While insect TERTs lack a TEN domain, the

TRBD-RT-CTE domains were found to be arranged in a ring-like organization forming a central cavity to accommodate the TR template/DNA substrate duplex (Gillis et al.,

2008; M. Mitchell et al., 2010). However, the authenticity of the beetle TERT has recently been questioned due to the absence of the TEN domain and of a crucial variable region found in a vast majority of TERTs (Jiang et al., 2018).

The TERT-specific TEN domain binds to both the TR subunit and the single-stranded DNA substrate. The TEN domain has a DNA ‘anchor’ site for binding single-stranded telomeric DNA, thereby increasing processive repeat synthesis (Finger & Bryan, 2008;

Jacobs et al., 2006; Lue, 2005; Lue & Li, 2007; Romi et al., 2007; Sealey et al., 2010;

Wyatt et al., 2007). As the substrate DNA is retained via TEN domain interactions, the active site of TERT has a greater probability of performing a subsequent repeat addition without complete substrate dissociation. Additionally, the TEN domain harbors a low-affinity

TR binding site, the mechanistic significance of which is poorly understood (Lai et al.,

2001; Moriarty et al., 2002; Moriarty et al., 2004).

The highly conserved TRBD (Podlevsky et al., 2008) is integral to the ring-like

organization of TERT and forms part of the central cavity of the TERT active site (Figure

1.4C) (Gillis et al., 2008). Thus, the α-helix rich TRBD is a crucial domain for telomerase

ribonucleoprotein assembly and has high affinity for the TR structural domain CR4/5

(Bley et al., 2011; Huang et al., 2014; Moriarty et al., 2004; Rouda & Skordalakes, 2007).

In addition to RNA binding, motif T of TRBD has been shown to be important in

telomerase processivity (Drosopoulos & Prasad, 2009; M. Mitchell et al., 2010).

The RT domain is the catalytic domain of TERT and contains motifs conserved

among DNA polymerases and reverse transcriptases. These motifs are named motifs 1-2

and A through E from the N to the C terminus. Apart from sharing homologous motifs,

structural organization of the RT domain was found to be similar to DNA polymerases

and reverse transcriptases. This architecture is analogous to the human right hand and

best demonstrated by the crystal structures of E. coli DNA polymerase I and HIV-RT

(Kohlstaedt et al., 1992; Ollis et al., 1985). In the TERT RT domain, the fingers domain,

comprised of motifs 1 and 2, is important for binding nucleotides required for DNA

synthesis and to position the RNA template (Bosoy & Lue, 2001; Gillis et al., 2008;

Wyatt et al., 2010). The palm domain composed of motifs A through E forms the

catalytic site for DNA synthesis. Like all polymerases, telomerase uses a two-metal-ion mechanism for catalytic DNA polymerization (Steitz, 1999). Three aspartic acid residues located in the palm, within motifs A and C, are universally conserved and indispensable for enzymatic activity (Bryan et al., 2000; Counter et al., 1997; Harrington et al., 1997;

Nakayama et al., 1998; Weinrich et al., 1997; Wyatt et al., 2010).

The CTE domain has a structure and function similar to the thumb domain of RTs

(Gillis et al., 2008; Hoffman et al., 2017; M. Mitchell et al., 2010; Nakamura et al., 1997) and is important in telomeric DNA binding, telomerase activity and processivity (Hossain et al., 2002; Huard et al., 2003; Tomlinson et al., 2016).

1.3 Telomerase RNA

Although the TERT subunit is highly conserved across distinct evolutionary groups, TR is highly divergent in primary sequence, length and biogenesis pathway

(Podlevsky et al., 2008; Podlevsky & Chen, 2016). TRs have traditionally been well studied in three major groups of species: ciliates (basal unicellular eukaryotes), yeasts and vertebrates. Ciliate TRs range from 140 to 210 nucleotides in length and are the smallest discovered (Figure 1.5A) (McCormick-Graham & Romero,

1995). In contrast, yeast and filamentous fungi have extremely long TRs ranging from

920-2430 nucleotides (Figure 1.5D, E) (Dandjinou et al., 2004; Qi et al., 2013).

Vertebrate TRs are intermediate in length with 312-559 nucleotides (Figure 1.5C) (Chen et al., 2000; Xie et al., 2008). Additionally, identification and characterization of invertebrate TRs primarily from echinoderms suggests a length range comparable to vertebrates (Figure 1.5B) (Y. Li et al., 2013; Podlevsky et al., 2016). Despite this drastic variation in length, two structural elements have been identified to be universally

conserved across all known TRs: the template pseudoknot domain and a distal stem-loop moiety (Figure 1.5) (Brown et al., 2007; Chen et al., 2000; Chen & Greider, 2004; Chen et al., 2002; Y. Li et al., 2013; Lin et al., 2004; Podlevsky et al., 2016; Qi et al., 2013).

These two domains are sufficient to reconstitute telomerase activity in vitro combined with TERT either as a single RNA or as two trans fragments (J. R. Mitchell & Collins,

2000; Qi et al., 2013; Tesmer et al., 1999).

The ubiquitous pseudoknot structure found in the template-pseudoknot domain is structurally conserved (Figure 1.5). The pseudoknot is formed by intramolecular base pairing between the loop of a hairpin and nucleotides outside of this stem. The human TR pseudoknot contains a conserved triple helix structure formed by Hoogsteen base pairing with a Watson-Crick base-paired stem (Theimer et al., 2005). TRs from additional species have been shown to harbor similar triple helices (Qiao & Cech, 2008; Shefer et al., 2007). Although the pseudoknot is indispensable for telomerase activity, its exact role in catalysis is unclear (Chen & Greider, 2005; Ly et al., 2003; Qiao & Cech, 2008).
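The definition of a pseudoknot given above, pairing between a hairpin loop and nucleotides outside the hairpin, corresponds to base pairs that interleave rather than nest. The following minimal Python sketch checks a list of base pairs for this property; the function names and the toy coordinates are hypothetical, and triple-helix (Hoogsteen) interactions are not modeled.

```python
def crossing_pairs(base_pairs):
    """Return every pair of base pairs (i, j), (k, l) that interleave as i < k < j < l."""
    normalized = sorted(tuple(sorted(pair)) for pair in base_pairs)
    crossings = []
    for index, (i, j) in enumerate(normalized):
        for k, l in normalized[index + 1:]:
            if i < k < j < l:
                crossings.append(((i, j), (k, l)))
    return crossings


def has_pseudoknot(base_pairs):
    """True if at least two base pairs interleave, the hallmark of a pseudoknot."""
    return bool(crossing_pairs(base_pairs))


if __name__ == "__main__":
    hairpin_stem = [(1, 12), (2, 11), (3, 10)]   # nested pairs: a plain hairpin
    loop_to_downstream = [(6, 18), (7, 17)]      # loop positions 6-7 pair outside the hairpin
    print(has_pseudoknot(hairpin_stem))
    print(has_pseudoknot(hairpin_stem + loop_to_downstream))
```

Positions are arbitrary 1-based nucleotide indices; the same check can be applied to base pairs parsed from any secondary structure annotation.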

Structural studies of the human TR pseudoknot, primarily by NMR, show that

RNA fragments proximal to the pseudoknot triple helix have a sharp bend in the RNA core domain inducing formation of a triangular structure (Zhang et al., 2011; Zhang et al.,

2010). Additionally, the NMR structure of the smallest vertebrate pseudoknot, from teleost fish, is in good agreement with the human models, suggesting a conserved tertiary structure of the pseudoknot (Wang et al., 2016). This pseudoknot architecture potentially serves to position the template and offers greater flexibility for conformational changes during the telomerase catalytic cycle (Musgrove et al., 2018).

The template regeneration property, which is unique to telomerase, requires that

the short template be precisely defined in a vastly larger TR. A TR structural feature termed template boundary element (TBE) is part of the template-pseudoknot core and

halts DNA synthesis past the template and prevents synthesis of non-telomeric DNA

sequences (Figure 1.5). A survey of structural elements from well characterized TRs

suggests diverse mechanisms for template boundary definition. Ciliate TBE is a stem 5’

of the template which contains conserved residues at the base of the stem (Figure 1.5A)

(Autexier & Greider, 1995; Lai et al., 2002). This helix has high affinity for the TRBD of

TERT where an RNA-protein interaction mediated boundary definition was found to

prevent template read through (Jansson et al., 2015). Similar to ciliates, a stable helix

immediately upstream of the template acts as TBE in yeast and filamentous fungi.

However, the fungal TBE functions via a distinct mechanism to restrict non-templated

nucleotide addition. Fungal TBE imposes constraints on the availability of ssRNA for

DNA synthesis restricting TERT from utilizing residues upstream of the template (Figure

1.5 D, E) (Seto et al., 2003; Tzfati et al., 2000). Interestingly, part of the fission yeast

TBE stem overlaps with the template itself. The dynamic formation and disruption of base pairing is presumably responsible for the heterogeneity in telomeric repeats observed in Schizosaccharomyces pombe telomeres (Webb & Zakian, 2015). However, in

most vertebrates including human, the template boundary is defined by the core-

enclosing helix P1 and the linker between template and P1 (Figure 1.5C). The linker

length controls template dynamics during telomere repeat synthesis and restricts the

availability of single-stranded RNA, in contrast to the steric hindrance observed in ciliates

(Chen & Greider, 2003; Moriarty et al., 2005). Rodent TRs, however, do not have structural

features upstream of the template, but rather end 2 nt 5’ of the template (Hinkley et al.,

1998). By limiting availability of RNA sequence upstream of the template, template

read-through is averted (Chen & Greider, 2003).

A stem-loop moiety located 3’ of the template-pseudoknot region is the second universally conserved TR structural element (Figure 1.5). Characterization of this element from major phylogenetic groups has revealed structurally different but functionally homologous domains. It has been termed CR4/5 in vertebrates and filamentous fungi (Figure 1.5B, E), three-way junction (TWJ) in yeasts (Figure 1.5D), eCR4/5 in invertebrates and helix IV in ciliates (Figure 1.5A, B) (Blackburn & Collins,

2011; Brown et al., 2007; Chen et al., 2000; Chen et al., 2002; Podlevsky et al., 2016; Qi et al., 2013). The vertebrate and filamentous fungal CR4/5 and yeast TWJ form a three-stem junction in which two stems form short hairpins while the third forms a longer helix emanating from the rest of the TR (Figure 1.5 C, D, E) (Qi et al., 2013). Functionally, vertebrate CR4/5 is essential for in vitro telomerase activity and telomere maintenance in vivo and harbors a highly conserved 4-bp stem, termed P6.1, that is crucial for telomerase activity.

Point mutations in the P6.1 loop render telomerase inactive (Chen et al., 2002).

Vertebrate CR4/5 has high-affinity interactions with the TRBD, as evidenced by

cross-linking, mass spectrometry studies and a recent co-crystal structure (Bley et al.,

2011; Huang et al., 2014). Filamentous fungal CR4/5 shows similar properties in terms of function where it is indispensable for activity (Qi et al., 2013). Yeast TWJ, although

structurally similar to CR4/5, is not absolutely essential for activity (Brown et al., 2007;

Zappulla et al., 2005).

While a three-way-junction-based secondary structure forms the vertebrate and fungal

distal stem loop moieties, invertebrates contain a simple internal stem termed eCR4/5 that

is functionally homologous to vertebrate and fungal CR4/5 (Figure 1.5B) (Y. Li et al.,

2013; Podlevsky et al., 2016). Strikingly, eCR4/5 is dispensable for telomerase activity in

vitro where the template-pseudoknot core by itself shows 30-40% activity of full-length

TR without the eCR4/5 domain. Finally, ciliate TRs have a single helix, named helix IV, which has weak binding affinity to TERT and is required for telomerase activity (Lai et al., 2003; Mason et al., 2003).

The extreme divergence of TRs from distinct evolutionary clades is predominantly due to the diverse pathways of TR biogenesis and maturation (Figure 1.6).

Vertebrate TRs share a biogenesis pathway reminiscent of box H/ACA small nucleolar

(sno-) and small Cajal body (sca-) RNAs (Figure 1.6). Box H/ACA snoRNAs have the

box H motif flanked by two stem-loops followed by the box ACA motif, which lies close to the 3’ end (Marz et al., 2011). This type of arrangement can be found at the 3’ end of vertebrate TRs (Jády et al., 2004; J. R. Mitchell et al., 1999). Additionally, similar to

H/ACA sno- and sca-RNAs, two copies of the dyskerin complex, composed of dyskerin,

Nhp2, Nop10 and Gar1, are also found associated with the box H/ACA domain in vertebrate TRs (Cheng & Roberts, 2001; Egan & Collins, 2010; Girard et al., 1993;

Hamma et al., 2005; Maiorano et al., 1999; Pogacic et al., 2000). Although the sno- and scaRNPs perform dyskerin-assisted pseudouridylation of target RNAs, no such function has been observed in the telomerase RNP. The terminal loop of the stem-loop between the box H and box ACA motifs in the TR contains a short sequence known as the

CAB box (Reichow et al., 2007; Theimer et al., 2007). The CAB box is bound by

telomerase Cajal body protein 1 (TCAB1) for Cajal body localization. Cajal bodies are

nuclear localized structures in which important RNA processing events such as splicing

and post-transcriptional modifications occur (Venteicher et al., 2009). A similar 3’

domain organization in echinoderm TRs suggests that echinoderms share a similar

biogenesis and localization pathway with vertebrates (Figure 1.5C) (Podlevsky et al.,

2016). The 5’ end of human TR contains a G-rich tract which putatively forms a

G-quadruplex structure (Lattmann et al., 2011; Sexton & Collins, 2011). Resolution of this structure by specific enzymes has been shown to influence assembly of the telomerase

RNP.

Yeasts have a biogenesis pathway completely distinct from that of vertebrates and echinoderms (Figure 1.6). Proximal to the 3’ end of most Saccharomyces genus budding yeast TRs, a single-stranded ‘U’-rich motif binds to Sm proteins (Seto et al., 1999).

Fission yeast Schizosaccharomyces TRs were also shown to bind Sm proteins (Tang et al.,

2012). These Sm proteins form a heteroheptameric ring that binds the uridine-rich motif, which is required for accumulation of the telomerase RNP and post-transcriptional modification of the TR. Additionally, the TRs of the fission yeast Schizosaccharomyces pombe and of all known filamentous fungi undergo spliceosomal cleavage of a 3’ terminal intron as part of the TR maturation pathway (Box et al., 2008; Kannan et al., 2015; Qi et al., 2015).

Candida budding yeast TRs are proposed to follow a similar pathway, owing to the presence of conserved splice site sequences proximal to the 3’ end of the precursor TR (Egan & Collins, 2012; Gunisova et al., 2009).

Two different stem loops which are part of the conserved core domain in yeast

TRs act as scaffolds to bind various proteins involved in biogenesis. The first stem from

the 5’ end of the TR is situated immediately upstream of the template and is also the

TBE. The terminal hairpin of this arm binds to the Ku heterodimer which is important for

assembly, localization and recruitment of telomerase to telomeres (Figure 1.5D) (Fisher

& Zakian, 2005; Kabaha et al., 2008; Stellwagen et al., 2003). A helix formed

downstream of the template between the pseudoknot and the template is bound by ever-

shorter telomere protein 1 (Est1p) which is important in telomere synthesis (Figure 1.5E)

(Seto et al., 2002; Evans & Lundblad, 2002).

The ciliate telomerase holoenzyme has been extensively characterized in the

model Tetrahymena thermophila. Initial studies on Tetrahymena telomerase holoenzyme purification identified five different proteins associated with the telomerase catalytic core enzyme, including p19, p45, p50, p75 and Teb1 (Min & Collins, 2009; Witkin & Collins, 2004). However, recent studies identified additional protein subunits, which are discussed later. The p65 protein associates with helix IV of ciliate TR, maintaining a kink in the

RNA structure to facilitate tight binding of TERT (Singh et al., 2012; Stone et al., 2007).

Interestingly, although vertebrate and fungal TRs are transcribed by RNA polymerase II

(Pol II), ciliate TRs are RNA polymerase III (Pol III) transcripts (Box et al., 2008;

Chapon et al., 1997; Chen et al., 2000; Gunisova et al., 2009; Kannan et al., 2015;

McCormick-Graham & Romero, 1995; J. R. Mitchell et al., 1999; Qi et al., 2015). The

Pol III-transcribed ciliate TRs contain a poly ‘U’ sequence at the 3’ end. This genome-encoded poly ‘U’ tract signals transcription termination. The length disparity among TRs from distinct evolutionary groups could be explained in part by the characteristics of the transcription machinery used in biogenesis.
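
The 3’-terminal poly ‘U’ feature of Pol III transcripts lends itself to a simple sequence check. The Python sketch below measures the length of the uninterrupted U-run at the 3’ end of a sequence and flags sequences whose run meets a threshold. The function names, the default cutoff of four uridines and the example sequences are hypothetical choices made for illustration; they are not parameters reported in the cited work.

```python
def terminal_u_run(rna):
    """Length of the uninterrupted U-run at the 3' end of a sequence."""
    # Accept DNA or RNA input by converting T to U.
    rna = rna.upper().replace("T", "U")
    return len(rna) - len(rna.rstrip("U"))


def has_terminal_poly_u(rna, min_run=4):
    """True if the 3'-terminal U-run is at least min_run long (assumed cutoff)."""
    return terminal_u_run(rna) >= min_run


# Hypothetical sequences for illustration only.
print(terminal_u_run("ACGGAUCCUUUU"))
print(has_terminal_poly_u("ACGGAUCCUUA"))
```

Only the run that ends exactly at the 3’ terminus is counted, so an internal U-rich stretch followed by other nucleotides is deliberately ignored.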

1.4 Recent advances in structural studies of telomerase holoenzyme

With advances in cryo-EM-based structure determination, the Tetrahymena telomerase holoenzyme structure has improved significantly since the first negative-stain EM maps were published in 2013 (Jiang et al., 2013). The initial negative-stain EM structures from homogeneously purified holoenzyme complexes showed the specific interactions of 8 different proteins with the core enzyme. These include the p65 protein, which is required for catalytic activity and telomerase assembly, along with p50 and two heterotrimeric RPA-related complexes, the TEB complex and the CST complex (Wang et al., 2019). The most recently published structure was solved by cryo-EM at a resolution of 4.8 Å with telomeric DNA and 6.4 Å without (Jiang et al., 2018). This structure showed that the telomerase catalytic core forms a compact, interlocked structure in which the ring formed by the TRBD-RT-CTE domains is encircled by the T/PK core of Tetrahymena TR, enclosed by stem 1 (Figure 1.7A).

Another milestone in telomerase structural biology is the cryo-EM structure of human telomerase with telomeric substrate, published in 2018 and solved at an overall resolution of 10.2 Å (Figure 1.7B) (Nguyen et al., 2018). Human cells transiently overexpressing telomerase core subunits were used for purification of the telomerase holoenzyme. Negative-stain EM was used to identify homogeneous complexes, which were subsequently used in structure determination. The structure shows a bilobal architecture in which the catalytic core lobe, comprising TERT and the T/PK core, is tethered to the H/ACA RNP lobe via the intervening sequences of the TR. Independently, the catalytic core lobe and the H/ACA RNP lobe were solved at 7.7 Å and 8.2 Å, respectively. The highly dynamic tether region of the TR between the two lobes severely limited the resolution of the overall structure. The catalytic core adopts an organization similar to that of the Tetrahymena telomerase core, in which the RNA loops around TERT to form a compact structure. The H/ACA RNP lobe comprises two sets of the dyskerin, NOP10, NHP2 and GAR1 tetramer. This structure presents the first global architecture of telomerase from a multicellular animal and provides an important structural basis for therapeutic intervention in cancer or ageing-related pathologies.

1.5 References

Autexier, C., & Greider, C. W. (1995). Boundary elements of the Tetrahymena telomerase RNA template and alignment domains. Genes & Development, 9(18), 2227-2239.

Baumann, P., & Cech, T. R. (2001). Pot1, the Putative Telomere End-Binding Protein in Fission Yeast and Humans. Science, 292(5519), 1171-1175.

Bilaud, T., Brun, C., Ancelin, K., Koering, C. E., Laroche, T., & Gilson, E. (1997). Telomeric localization of TRF2, a novel human telobox protein. Nature Genetics, 17(2), 236-239.

Blackburn, E. H., & Collins, K. (2011). Telomerase: an RNP enzyme synthesizes DNA. Cold Spring Harbor Perspectives in Biology, 3(5).

Blackburn, E. H., & Gall, J. G. (1978). A tandemly repeated sequence at the termini of the extrachromosomal ribosomal RNA genes in Tetrahymena. Journal of Molecular Biology, 120(1), 33-53.

Bley, C. J., Qi, X., Rand, D. P., Borges, C. R., Nelson, R. W., & Chen, J. J.-L. (2011). RNA–protein binding interface in the telomerase ribonucleoprotein. Proceedings of the National Academy of Sciences of United States of America, 108(51), 20333-20338.

Bosoy, D., & Lue, N. F. (2001). Functional analysis of conserved residues in the putative "finger" domain of telomerase reverse transcriptase. Journal of Biological Chemistry, 276(49), 46305-46312.

Box, J. A., Bunch, J. T., Tang, W., & Baumann, P. (2008). Spliceosomal cleavage generates the 3' end of telomerase RNA. Nature, 456(7224), 910-914.

Broccoli, D., Smogorzewska, A., Chong, L., & de Lange, T. (1997). Human telomeres contain two distinct Myb-related proteins, TRF1 and TRF2. Nature Genetics, 17(2), 231-235.

Brown, Y., Abraham, M., Pearl, S., Kabaha, M. M., Elboher, E., & Tzfati, Y. (2007). A critical three-way junction is conserved in budding yeast and vertebrate telomerase RNAs. Nucleic Acids Research, 35(18), 6280-6289.

Bryan, T. M., Goodrich, K. J., & Cech, T. R. (2000). Telomerase RNA bound by protein motifs specific to telomerase reverse transcriptase. Molecular Cell, 6(2), 493-499.

Chapon, C., Cech, T. R., & Zaug, A. J. (1997). Polyadenylation of telomerase RNA in budding yeast. RNA, 3(11), 1337-1351.

Chen, J. J.-L., Blasco, M. A., & Greider, C. W. (2000). Secondary structure of vertebrate telomerase RNA. Cell, 100(5), 503-514.

Chen, J. J.-L., & Greider, C. W. (2003). Template boundary definition in mammalian telomerase. Genes & Development, 17(22), 2747-2752.

Chen, J. J.-L., & Greider, C. W. (2004). An emerging consensus for telomerase RNA structure. Proceedings of the National Academy of Sciences of United States of America, 101(41), 14683-14684.

Chen, J. J.-L., & Greider, C. W. (2005). Functional analysis of the pseudoknot structure in human telomerase RNA. Proceedings of the National Academy of Sciences of United States of America, 102(23), 8080-8085; discussion 8077-8089.

Chen, J. J.-L., Opperman, K. K., & Greider, C. W. (2002). A critical stem-loop structure in the CR4-CR5 domain of mammalian telomerase RNA. Nucleic Acids Research, 30(2), 592-597.

Cheng, X., & Roberts, R. J. (2001). AdoMet-dependent methylation, DNA methyltransferases and base flipping. Nucleic Acids Research, 29(18), 3784-3795.

Choi, K. H., Farrell, A. S., Lakamp, A. S., & Ouellette, M. M. (2011). Characterization of the DNA binding specificity of Shelterin complexes. Nucleic Acids Research, 39(21), 9206-9223.

Chong, L., van Steensel, B., Broccoli, D., Erdjument-Bromage, H., Hanish, J., Tempst, P., & de Lange, T. (1995). A human telomeric protein. Science, 270(5242), 1663-1667.

Counter, C. M., Meyerson, M., Eaton, E. N., & Weinberg, R. A. (1997). The catalytic subunit of yeast telomerase. Proceedings of the National Academy of Sciences of United States of America, 94(17), 9202-9207.

Dandjinou, A. T., Lévesque, N., Larose, S., Lucier, J.-F., Abou Elela, S., & Wellinger, R. J. (2004). A phylogenetically based secondary structure for the yeast telomerase RNA. Current Biology, 14(13), 1148-1158.

de Lange, T. (2018). Shelterin-Mediated Telomere Protection. Annual Review of Genetics, 52(1), 223-247.

Drosopoulos, W. C., & Prasad, V. R. (2009). Telomerase-specific T Motif is a Restrictive Determinant of Repetitive Reverse Transcription by Human Telomerase. Molecular and Cellular Biology.

Egan, E. D., & Collins, K. (2010). Specificity and Stoichiometry of Subunit Interactions in the Human Telomerase Holoenzyme Assembled In Vivo. Molecular and Cellular Biology, 30(11), 2775-2786.

Egan, E. D., & Collins, K. (2012). Biogenesis of telomerase ribonucleoproteins. RNA (New York, N.Y.), 18(10), 1747-1759.

Fajkus, J., Sýkorová, E., & Leitch, A. R. (2005). Telomeres in evolution and evolution of telomeres. Chromosome Research, 13(5), 469-479.

Fajkus, P., Peska, V., Sitova, Z., Fulneckova, J., Dvorackova, M., Gogela, R., Sykorova, E., Hapala, J., & Fajkus, J. (2016). Allium telomeres unmasked: the unusual telomeric sequence (CTCGGTTATGGG)n is synthesized by telomerase. Plant Journal, 85(3), 337-347.

Finger, S. N., & Bryan, T. M. (2008). Multiple DNA-binding sites in Tetrahymena telomerase. Nucleic Acids Research, 36(4), 1260-1272.

Fisher, T. S., & Zakian, V. A. (2005). Ku: a multifunctional protein involved in telomere maintenance. DNA Repair (Amst), 4(11), 1215-1226.

Gillis, A. J., Schuller, A. P., & Skordalakes, E. (2008). Structure of the Tribolium castaneum telomerase catalytic subunit TERT. Nature, 455(7213), 633-637.

Girard, J.-P., Caizergues-Ferrer, M., & Lapeyre, B. (1993). The SpGAR1 gene of Schizosaccharomyces pombe encodes the functional homologue of the snoRNP protein GAR1 of Saccharomyces cerevisiae. Nucleic Acids Research, 21(9), 2149-2155.

Greider, C. W., & Blackburn, E. H. (1985). Identification of a specific telomere terminal transferase activity in Tetrahymena extracts. Cell, 43(2 Pt 1), 405-413.

Greider, C. W., & Blackburn, E. H. (1987). The telomere terminal transferase of Tetrahymena is a ribonucleoprotein enzyme with two kinds of primer specificity. Cell, 51(6), 887-898.

Greider, C. W., & Blackburn, E. H. (1989). A telomeric sequence in the RNA of Tetrahymena telomerase required for telomere repeat synthesis. Nature, 337(6205), 331-337.

Gunisova, S., Elboher, E., Nosek, J., Gorkovoy, V., Brown, Y., Lucier, J. F., Laterreur, N., Wellinger, R. J., Tzfati, Y., & Tomaska, L. (2009). Identification and comparative analysis of telomerase RNAs from Candida species reveal conservation of functional elements. RNA, 15(4), 546-559.

Hamma, T., Reichow, S. L., Varani, G., & Ferré-D'Amaré, A. R. (2005). The Cbf5–Nop10 complex is a molecular bracket that organizes box H/ACA RNPs. Nature Structural & Molecular Biology, 12(12), 1101-1107.

Harley, C. B., Futcher, A. B., & Greider, C. W. (1990). Telomeres shorten during ageing of human fibroblasts. Nature, 345(6274), 458-460.

Harley, C. B., Vaziri, H., Counter, C. M., & Allsopp, R. C. (1992). The telomere hypothesis of cellular aging. Experimental Gerontology, 27(4), 375-382.

Harrington, L., Zhou, W., McPhail, T., Oulton, R., Yeung, D. S., Mar, V., Bass, M. B., & Robinson, M. O. (1997). Human telomerase contains evolutionarily conserved catalytic and structural subunits. Genes & Development, 11(23), 3109-3115.

Hayflick, L. (1965). The limited in vitro lifetime of human diploid cell strains. Experimental Cell Research, 37, 614-636.

Hinkley, C. S., Blasco, M. A., Funk, W. D., Feng, J., Villeponteau, B., Greider, C. W., & Herr, W. (1998). The mouse telomerase RNA 5'-end lies just upstream of the telomerase template sequence. Nucleic Acids Research, 26(2), 532-536.

Hoffman, H., Rice, C., & Skordalakes, E. (2017). Structural Analysis Reveals the Deleterious Effects of Telomerase Mutations in Telomerase-Associated Bone Marrow Failure Syndromes. Journal of Biological Chemistry.

Hossain, S., Singh, S., & Lue, N. F. (2002). Functional analysis of the C-terminal extension of telomerase reverse transcriptase. A putative "thumb" domain. Journal of Biological Chemistry, 277(39), 36174-36180.

Houghtaling, B. R., Cuttonaro, L., Chang, W., & Smith, S. (2004). A dynamic molecular link between the telomere length regulator TRF1 and the chromosome end protector TRF2. Current Biology, 14(18), 1621-1631.

Huang, J., Brown, A. F., Wu, J., Xue, J., Bley, C. J., Rand, D. P., Wu, L., Zhang, R., Chen, J. J. L., & Lei, M. (2014). Structural basis for protein-RNA recognition in telomerase. Nature Structural & Molecular Biology, 21, 507.

Huard, S., Moriarty, T. J., & Autexier, C. (2003). The C terminus of the human telomerase reverse transcriptase is a determinant of enzyme processivity. Nucleic Acids Research, 31(14), 4059-4070.

Jacobs, S. A., Podell, E. R., & Cech, T. R. (2006). Crystal structure of the essential N-terminal domain of telomerase reverse transcriptase. Nature Structural & Molecular Biology, 13(3), 218-225.

Jády, B. E., Bertrand, E., & Kiss, T. (2004). Human telomerase RNA and box H/ACA scaRNAs share a common Cajal body-specific localization signal. Journal of Cell Biology, 164(5), 647-652.

Jansson, L. I., Akiyama, B. M., Ooms, A., Lu, C., Rubin, S. M., & Stone, M. D. (2015). Structural basis of template-boundary definition in Tetrahymena telomerase. Nature Structural & Molecular Biology, 22, 883.

Jiang, J., Miracco, E. J., Hong, K., Eckert, B., Chan, H., Cash, D. D., Min, B., Zhou, Z. H., Collins, K., & Feigon, J. (2013). The architecture of Tetrahymena telomerase holoenzyme. Nature, 496(7444), 187-192.

Jiang, J., Wang, Y., Sušac, L., Chan, H., Basu, R., Zhou, Z. H., & Feigon, J. (2018). Structure of Telomerase with Telomeric DNA. Cell, 173(5), 1179-1190.e1113.

Kabaha, M. M., Zhitomirsky, B., Schwartz, I., & Tzfati, Y. (2008). The 5' arm of Kluyveromyces lactis telomerase RNA is critical for telomerase function. Molecular and cellular biology, 28(6), 1875-1882.

Kannan, R., Helston, R. M., Dannebaum, R. O., & Baumann, P. (2015). Diverse mechanisms for spliceosome-mediated 3′ end processing of telomerase RNA. Nature Communications, 6(1), 6104.

Kim, S. H., Kaminker, P., & Campisi, J. (1999). TIN2, a new regulator of telomere length in human cells. Nature Genetics, 23(4), 405-412.

Kohlstaedt, L., Wang, J., Friedman, J., Rice, P., & Steitz, T. (1992). Crystal structure at 3.5 A resolution of HIV-1 reverse transcriptase complexed with an inhibitor. Science, 256(5065), 1783-1790.

Lai, C. K., Miller, M. C., & Collins, K. (2002). Template boundary definition in Tetrahymena telomerase. Genes & Development, 16(4), 415-420.

Lai, C. K., Miller, M. C., & Collins, K. (2003). Roles for RNA in telomerase nucleotide and repeat addition processivity. Molecular Cell, 11(6), 1673-1683.

Lai, C. K., Mitchell, J. R., & Collins, K. (2001). RNA binding domain of telomerase reverse transcriptase. Molecular and Cellular Biology, 21(4), 990-1000.

Lattmann, S., Stadler, M. B., Vaughn, J. P., Akman, S. A., & Nagamine, Y. (2011). The DEAH-box RNA helicase RHAU binds an intramolecular RNA G-quadruplex in TERC and associates with telomerase holoenzyme. Nucleic Acids Research, 39(21), 9390-9404.

Li, B., Oestreich, S., & de Lange, T. (2000). Identification of human Rap1: implications for telomere evolution. Cell, 101(5), 471-483.

Li, Y., Podlevsky, J. D., Marz, M., Qi, X., Hoffmann, S., Stadler, P. F., & Chen, J. J.-L. (2013). Identification of purple sea urchin telomerase RNA using a next-generation sequencing based approach. RNA, 19(6), 852-860.

Lin, J., Ly, H., Hussain, A., Abraham, M., Pearl, S., Tzfati, Y., Parslow, T. G., & Blackburn, E. H. (2004). A universal telomerase RNA core structure includes structured motifs required for binding the telomerase reverse transcriptase protein. Proceedings of the National Academy of Sciences of United States of America, 101(41), 14713-14718.

Lingner, J., Hughes, T. R., Shevchenko, A., Mann, M., Lundblad, V., & Cech, T. R. (1997). Reverse transcriptase motifs in the catalytic subunit of telomerase. Science, 276, 561-567.

Liu, D., Safari, A., O'Connor, M. S., Chan, D. W., Laegeler, A., Qin, J., & Songyang, Z. (2004). PTOP interacts with POT1 and regulates its localization to telomeres. Nature Cell Biology, 6(7), 673-680.

Logeswaran, D., & Chen, J. J.-L. (2019). Effects of Telomerase Activation. In D. Gu & M. E. Dupre (Eds.), Encyclopedia of Gerontology and Population Aging (pp. 1-8). Cham: Springer International Publishing.

Lue, N. F. (2005). A physical and functional constituent of telomerase anchor site. Journal of Biological Chemistry, 280(28), 26586-26591.

Lue, N. F., & Li, Z. (2007). Modeling and structure function analysis of the putative anchor site of yeast telomerase. Nucleic Acids Research, 35(15), 5213-5222.

Ly, H., Blackburn, E. H., & Parslow, T. G. (2003). Comprehensive structure-function analysis of the core domain of human telomerase RNA. Molecular and Cellular Biology, 23(19), 6849-6856.

Maiorano, D., Brimage, L. J., Leroy, D., & Kearsey, S. E. (1999). Functional conservation and cell cycle localization of the Nhp2 core component of H + ACA snoRNPs in fission and budding yeasts. Exp Cell Res, 252(1), 165-174.

Makarov, V. L., Hirose, Y., & Langmore, J. P. (1997). Long G tails at both ends of human chromosomes suggest a C strand degradation mechanism for telomere shortening. Cell, 88(5), 657-666.

Marz, M., Gruber, A. R., Höner Zu Siederdissen, C., Amman, F., Badelt, S., Bartschat, S., Bernhart, S. H., Beyer, W., Kehr, S., Lorenz, R., Tanzer, A., Yusuf, D., Tafer, H., Hofacker, I. L., & Stadler, P. F. (2011). Animal snoRNAs and scaRNAs with exceptional structures. RNA biology, 8(6), 938-946.

Mason, D. X., Goneska, E., & Greider, C. W. (2003). Stem-loop IV of tetrahymena telomerase RNA stimulates processivity in trans. Molecular and Cellular Biology, 23(16), 5606-5613.

McClintock, B. (1939). The Behavior in Successive Nuclear Divisions of a Chromosome Broken at Meiosis. Proceedings of the National Academy of Sciences of United States of America, 25(8), 405-416.

McClintock, B. (1941). The Stability of Broken Ends of Chromosomes in Zea Mays. Genetics, 26(2), 234-282.

McCormick-Graham, M., & Romero, D. P. (1995). Ciliate telomerase RNA structural features. Nucleic Acids Research, 23(7), 1091-1097.

Meselson, M., & Stahl, F. W. (1958). The replication of DNA in Escherichia coli. Proceedings of the National Academy of Sciences of United States of America, 44(7), 671-682.

Meyne, J., Ratliff, R. L., & Moyzis, R. K. (1989). Conservation of the human telomere sequence (TTAGGG)n among vertebrates. Proceedings of the National Academy of Sciences of United States of America, 86(18), 7049-7053.

Min, B., & Collins, K. (2009). An RPA-related sequence-specific DNA-binding subunit of telomerase holoenzyme is required for elongation processivity and telomere maintenance. Molecular Cell, 36(4), 609-619.

Mitchell, J. R., Cheng, J., & Collins, K. (1999). A box H/ACA small nucleolar RNA-like domain at the human telomerase RNA 3' end. Molecular and Cellular Biology, 19(1), 567-576.

Mitchell, J. R., & Collins, K. (2000). Human telomerase activation requires two independent interactions between telomerase RNA and telomerase reverse transcriptase. Molecular Cell, 6(2), 361-371.

Mitchell, M., Gillis, A., Futahashi, M., Fujiwara, H., & Skordalakes, E. (2010). Structural basis for telomerase catalytic subunit TERT binding to RNA template and telomeric DNA. Nature Structural & Molecular Biology, 17(4), 513-518.

Moriarty, T. J., Huard, S., Dupuis, S., & Autexier, C. (2002). Functional multimerization of human telomerase requires an RNA interaction domain in the N terminus of the catalytic subunit. Molecular and Cellular Biology, 22(4), 1253-1265.

Moriarty, T. J., Marie-Egyptienne, D. T., & Autexier, C. (2004). Functional organization of repeat addition processivity and DNA synthesis determinants in the human telomerase multimer. Molecular and Cellular Biology, 24(9), 3720-3733.

Moriarty, T. J., Marie-Egyptienne, D. T., & Autexier, C. (2005). Regulation of 5' template usage and incorporation of noncognate nucleotides by human telomerase. RNA, 11(9), 1448-1460.

Moyzis, R. K., Buckingham, J. M., Cram, L. S., Dani, M., Deaven, L. L., Jones, M. D., Meyne, J., Ratliff, R. L., & Wu, J. R. (1988). A highly conserved repetitive DNA sequence, (TTAGGG)n, present at the telomeres of human chromosomes. Proceedings of the National Academy of Sciences of United States of America, 85(18), 6622-6626.

Muller, H. (1938). The remaking of chromosomes. The Collecting Net, 13(8), 181-198.

Nakamura, T. M., Morin, G. B., Chapman, K. B., Weinrich, S. L., Andrews, W. H., Lingner, J., Harley, C. B., & Cech, T. R. (1997). Telomerase catalytic subunit homologs from fission yeast and human. Science, 277, 955-959.

Nakayama, J., Tahara, H., Tahara, E., Saito, M., Ito, K., Nakamura, H., Nakanishi, T., Ide, T., & Ishikawa, F. (1998). Telomerase activation by hTRT in human normal fibroblasts and hepatocellular carcinomas. Nature Genetics, 18(1), 65-68.

Nguyen, T. H. D., Tam, J., Wu, R. A., Greber, B. J., Toso, D., Nogales, E., & Collins, K. (2018). Cryo-EM structure of substrate-bound human telomerase holoenzyme. Nature, 557(7704), 190-195.

Ollis, D. L., Brick, P., Hamlin, R., Xuong, N. G., & Steitz, T. A. (1985). Structure of large fragment of Escherichia coli DNA polymerase I complexed with dTMP. Nature, 313(6005), 762-766.

Olovnikov, A. M. (1973). A theory of marginotomy. The incomplete copying of template margin in enzymic synthesis of polynucleotides and biological significance of the phenomenon. Journal of Theoretical Biology, 41(1), 181-190.

Peška, V., Fajkus, P., Fojtová, M., Dvořáčková, M., Hapala, J., Dvořáček, V., Polanská, P., Leitch, A. R., Sýkorová, E., & Fajkus, J. (2015). Characterisation of an unusual telomere motif (TTTTTTAGGG)n in the plant Cestrum elegans (Solanaceae), a species with a large genome. Plant Journal, 82(4), 644-654.

Podlevsky, J. D., Bley, C. J., Omana, R. V., Qi, X., & Chen, J. J.-L. (2008). The telomerase database. Nucleic Acids Research, 36(Database issue), D339-343.

Podlevsky, J. D., & Chen, J. J. (2016). Evolutionary perspectives of telomerase RNA structure and function. RNA Biol, 13(8), 720-732.

Podlevsky, J. D., Li, Y., & Chen, J. J.-L. (2016). Structure and function of echinoderm telomerase RNA. RNA, 22(2), 204-215.

Pogacic, V., Dragon, F., & Filipowicz, W. (2000). Human H/ACA small nucleolar RNPs and telomerase share evolutionarily conserved proteins NHP2 and NOP10. Molecular and Cellular Biology, 20(23), 9028-9040.

Qi, X., Li, Y., Honda, S., Hoffmann, S., Marz, M., Mosig, A., Podlevsky, J. D., Stadler, P. F., Selker, E. U., & Chen, J. J. L. (2013). The common ancestral core of vertebrate and fungal telomerase RNAs. Nucleic acids research, 41(1), 450-462.

Qi, X., Rand, D. P., Podlevsky, J. D., Li, Y., Mosig, A., Stadler, P. F., & Chen, J. J. L. (2015). Prevalent and distinct spliceosomal 3′-end processing mechanisms for fungal telomerase RNA. Nature Communications, 6(1), 6105.

Qi, X., Xie, M., Brown, A. F., Bley, C. J., Podlevsky, J. D., & Chen, J. J. (2012). RNA/DNA hybrid binding affinity determines telomerase template-translocation efficiency. The EMBO journal, 31(1), 150-161.

Qiao, F., & Cech, T. R. (2008). Triple-helix structure in telomerase RNA contributes to catalysis. Nature Structural & Molecular Biology, 15(6), 634-640.

Reichow, S. L., Hamma, T., Ferré-D'Amaré, A. R., & Varani, G. (2007). The structure and function of small nucleolar ribonucleoproteins. Nucleic Acids Research, 35(5), 1452-1464.

Riha, K., & Shippen, D. E. (2003). Telomere structure, function and maintenance in Arabidopsis. Chromosome Research, 11(3), 263-275.

Romi, E., Baran, N., Gantman, M., Shmoish, M., Min, B., Collins, K., & Manor, H. (2007). High-resolution physical and functional mapping of the template adjacent DNA binding site in catalytically active telomerase. Proceedings of the National Academy of Sciences of United States of America, 104(21), 8791-8796.

Rouda, S., & Skordalakes, E. (2007). Structure of the RNA-binding domain of telomerase: implications for RNA recognition and binding. Structure, 15(11), 1403-1412.

Sealey, D. C. F., Zheng, L., Taboski, M. A. S., Cruickshank, J., Ikura, M., & Harrington, L. A. (2010). The N-terminus of hTERT contains a DNA-binding domain and is required for telomerase activity and cellular immortalization. Nucleic Acids Research, 38(6), 2019-2035.

Seto, A. G., Umansky, K., Tzfati, Y., Zaug, A. J., Blackburn, E. H., & Cech, T. R. (2003). A template-proximal RNA paired element contributes to Saccharomyces cerevisiae telomerase activity. RNA, 9(11), 1323-1332.

Seto, A. G., Zaug, A. J., Sobel, S. G., Wolin, S. L., & Cech, T. R. (1999). Saccharomyces cerevisiae telomerase is an Sm small nuclear ribonucleoprotein particle. Nature, 401(6749), 177-180.

Sexton, A. N., & Collins, K. (2011). The 5′ Guanosine Tracts of Human Telomerase RNA Are Recognized by the G-Quadruplex Binding Domain of the RNA Helicase DHX36 and Function To Increase RNA Accumulation. Molecular and Cellular Biology, 31(4), 736-743.

Sfeir, A. J., Chai, W., Shay, J. W., & Wright, W. E. (2005). Telomere-end processing the terminal nucleotides of human chromosomes. Molecular Cell, 18(1), 131-138.

Shefer, K., Brown, Y., Gorkovoy, V., Nussbaum, T., Ulyanov, N. B., & Tzfati, Y. (2007). A triple helix within a pseudoknot is a conserved and essential element of telomerase RNA. Molecular and Cellular Biology, 27(6), 2130-2143.

Shippen-Lentz, D., & Blackburn, E. H. (1990). Functional evidence for an RNA template in telomerase. Science, 247(4942), 546-552.

Singh, M., Wang, Z., Koo, B.-K., Patel, A., Cascio, D., Collins, K., & Feigon, J. (2012). Structural Basis for Telomerase RNA Recognition and RNP Assembly by the Holoenzyme La Family Protein p65. Molecular Cell, 47(1), 16-26.

Soudet, J., Jolivet, P., & Teixeira, M. T. (2014). Elucidation of the DNA end-replication problem in Saccharomyces cerevisiae. Molecular Cell, 53(6), 954-964.

Steitz, T. A. (1999). DNA Polymerases: Structural Diversity and Common Mechanisms. Journal of Biological Chemistry, 274(25), 17395-17398.

Stellwagen, A. E., Haimberger, Z. W., Veatch, J. R., & Gottschling, D. E. (2003). Ku interacts with telomerase RNA to promote telomere addition at native and broken chromosome ends. Genes & Development.

Stone, M. D., Mihalusova, M., O'Connor, C. M., Prathapam, R., Collins, K., & Zhuang, X. (2007). Stepwise protein-mediated RNA folding directs assembly of telomerase ribonucleoprotein. Nature, 446(7134), 458-461.

Tang, W., Kannan, R., Blanchette, M., & Baumann, P. (2012). Telomerase RNA biogenesis involves sequential binding by Sm and Lsm complexes. Nature, 484(7393), 260-264.

Tesmer, V. M., Ford, L. P., Holt, S. E., Frank, B. C., Yi, X., Aisner, D. L., Ouellette, M., Shay, J. W., & Wright, W. E. (1999). Two inactive fragments of the integral RNA cooperate to assemble active telomerase with the human protein catalytic subunit (hTERT) in vitro. Molecular and Cellular Biology, 19(9), 6207-6216.

Theimer, C. A., Blois, C. A., & Feigon, J. (2005). Structure of the human telomerase RNA pseudoknot reveals conserved tertiary interactions essential for function. Molecular Cell, 17(5), 671-682.

Theimer, C. A., Jady, B. E., Chim, N., Richard, P., Breece, K. E., Kiss, T., & Feigon, J. (2007). Structural and functional characterization of human telomerase RNA processing and cajal body localization signals. Molecular Cell, 27(6), 869-881.

Tomlinson, C. G., Holien, J. K., Mathias, J. A., Parker, M. W., & Bryan, T. M. (2016). The C-terminal extension of human telomerase reverse transcriptase is necessary for high affinity binding to telomeric DNA. Biochimie, 128-129, 114-121.

Tzfati, Y., Fulton, T. B., Roy, J., & Blackburn, E. H. (2000). Template boundary in a yeast telomerase specified by RNA structure. Science, 288(5467), 863-867.

Venteicher, A. S., Abreu, E. B., Meng, Z., McCann, K. E., Terns, R. M., Veenstra, T. D., Terns, M. P., & Artandi, S. E. (2009). A human telomerase holoenzyme protein required for Cajal body localization and telomere synthesis. Science, 323(5914), 644-648.

Wang, Y., Susac, L., & Feigon, J. (2019). Structural Biology of Telomerase. Cold Spring Harbor Perspectives in Biology.

Wang, Y., Yesselman, J. D., Zhang, Q., Kang, M., & Feigon, J. (2016). Structural conservation in the template/pseudoknot domain of vertebrate telomerase RNA from teleost fish to human. Proceedings of the National Academy of Sciences of United States of America, 201607411.

Watson, J. D. (1972). Origin of concatemeric T7 DNA. Nature New Biology, 239(94), 197-201.

Webb, C. J., & Zakian, V. A. (2015). Telomerase RNA stem terminus element affects template boundary element function, telomere sequence, and shelterin binding. Proceedings of the National Academy of Sciences of United States of America, 112(36), 11312-11317.

Weinrich, S. L., Pruzan, R., Ma, L., Ouellette, M., Tesmer, V. M., Holt, S. E., Bodnar, A. G., Lichtsteiner, S., Kim, N. W., Trager, J. B., Taylor, R. D., Carlos, R., Andrews, W. H., Wright, W. E., Shay, J. W., Harley, C. B., & Morin, G. B. (1997). Reconstitution of human telomerase with the template RNA component hTR and the catalytic protein subunit hTRT. Nature Genetics, 17(4), 498-502.

Witkin, K. L., & Collins, K. (2004). Holoenzyme proteins required for the physiological assembly and activity of telomerase. Genes & Development, 18(10), 1107-1118.

Wyatt, H. D. M., Lobb, D. A., & Beattie, T. L. (2007). Characterization of physical and functional anchor site interactions in human telomerase. Molecular and Cellular Biology, 27(8), 3226-3240.

Wyatt, H. D. M., West, S. C., & Beattie, T. L. (2010). InTERTpreting telomerase structure and function. Nucleic Acids Research.

Xie, M., Mosig, A., Qi, X., Li, Y., Stadler, P. F., & Chen, J. J.-L. (2008). Structure and function of the smallest vertebrate telomerase RNA from teleost fish. Journal of Biological Chemistry, 283(4), 2049-2059.

Ye, J. Z.-S., & de Lange, T. (2004). TIN2 is a tankyrase 1 PARP modulator in the TRF1 telomere length control complex. Nature Genetics, 36(6), 618-623.

Zappulla, D. C., Goodrich, K., & Cech, T. R. (2005). A miniature yeast telomerase RNA functions in vivo and reconstitutes activity in vitro. Nature Structural & Molecular Biology, 12(12), 1072-1077.

Zhang, Q., Kim, N. K., & Feigon, J. (2011). Architecture of human telomerase RNA. Proceedings of the National Academy of Sciences of United States of America, 108(51), 20325-20332.

Zhang, Q., Kim, N. K., Peterson, R. D., Wang, Z., & Feigon, J. (2010). Structurally conserved five nucleotide bulge determines the overall topology of the core domain of human telomerase RNA. Proceedings of the National Academy of Sciences of United States of America, 107(44), 18761-18768.

Figure 1.1. Telomere sequences of major eukaryotic groups. Asterisk indicates majority of sub-groups have the shown telomere sequence. Most eukaryotic groups share the highly conserved TTAGGG telomeric repeat sequence.

Figure 1.2. Schematic of the end-replication problem. Conventional DNA polymerases cannot completely replicate linear chromosomal ends, leading to daughter DNA strands shorter than the parental DNA (black). Exonuclease processing results in a significantly shortened leading-strand template compared to the parental DNA.

Figure 1.3. Telomerase adds DNA repeats to telomeres de novo. Telomerase extends the 3’ end of the telomeric DNA overhang by adding 5’-GGTTAG-3’ repeats using the short template integral to the telomerase RNA. Telomerase is a unique RNA-dependent DNA polymerase which regenerates the short template for a subsequent repeat addition cycle.

Figure 1.4. Domain architecture of the catalytic TERT protein. (A) TERT consists of four structural domains: telomerase essential N-terminal (TEN, grey) domain, telomerase RNA binding domain (TRBD, orange), reverse transcriptase domain (RT, blue) and the C-terminal extension (CTE, yellow). TEN and TRBD are telomerase-specific while RT and CTE share motifs with reverse transcriptases and conventional DNA polymerases. (B) Surface representation of the crystal structure of the TEN domain from ciliate (Tetrahymena thermophila) TERT. (C) Surface representation of the crystal structure of the red flour beetle (Tribolium castaneum) TERT with a hairpin heteroduplex in the active site. The TRBD-RT-CTE domains form a unique ring-like architecture with TRBD and CTE forming protein-protein interactions. TERT domains are colored as in the linear representation in (A).

Figure 1.5. Conserved structural domains of TR from well-studied groups. Two structural domains are ubiquitous among all known TRs: a template/pseudoknot domain and a distal stem-loop moiety (shown in yellow boxes and labeled in red). The template boundary element (TBE) from each group is shown in a blue box. (A) Ciliates include the smallest TRs known to date. (B) Recently well-studied echinoderm TRs fall in the intermediate size range and contain an H/ACA domain (grey) with a CAB box for biosynthesis and localization. (C) Vertebrate TRs harbor the highly conserved CR4/5 domain crucial for telomerase activity and are intermediate in length, with H/ACA biogenesis domains similar to those of echinoderms. (D) Yeast TRs, in contrast, are very large and contain Ku and Sm protein-binding sites (grey). (E) Filamentous fungal TRs include the largest identified TRs and contain a vertebrate-like CR4/5 domain and an Est1-binding site (grey).

Figure 1.6. Diverse TR biogenesis pathways from major organism groups. Ciliate TR follows a small-RNA-type biogenesis pathway and is transcribed by Pol III. Fungi exhibit diversity within the group: yeasts show snRNA-type biogenesis with or without spliceosomal cleavage, while filamentous fungi show 3’ terminal intron splicing. Vertebrates and echinoderms share a box H/ACA-type biogenesis pathway. Adapted from Podlevsky & Chen (2016).

Figure 1.7. Cryo-EM structures of Tetrahymena and human telomerase. (A) Schematic of the Tetrahymena telomerase core enzyme showing TERT domains interlocked by the TR core (left). Surface representation of the structure of the Tetrahymena telomerase catalytic core (middle and right). Adapted from Jiang et al. (2018). (B) Schematic of human telomerase showing the bilobal architecture of the catalytic core and the H/ACA RNP. Individual protein and RNA domains modelled into the cryo-EM density map are shown (middle and right). Adapted from Nguyen et al. (2018).

CHAPTER 2

STRUCTURE AND FUNCTION OF METAZOAN TELOMERASE RNA

2.1 Abstract

Telomerase RNA (TR) is a non-coding RNA essential to the catalytic function of the telomerase ribonucleoprotein enzyme. Due to the extreme diversity in sequence, structure and biogenesis mechanisms, TRs have only been identified from the chordate and echinoderm phyla of the animal kingdom. In this study, we employed a phylogeny-guided, structure-based bioinformatics approach and identified 83 TRs from 10 additional metazoan phyla spanning 18 diverse classes including early branching sponges, cnidarian nettles, mollusks, worms, eels and jawless vertebrates. In vitro synthesized TRs from three representative species, Saccoglossus kowalevskii (acorn worm) from phylum Hemichordata, and Pomacea diffusa (apple snail) and Crassostrea virginica (American oyster) from phylum Mollusca, reconstitute active telomerase with the corresponding telomerase reverse transcriptase (TERT) components. Comparison of secondary structures inferred by phylogenetic comparative analysis shows that three structural domains, pseudoknot, CR4/5 and box H/ACA, are conserved from the basal lineages including cnidaria and sponges to vertebrates, supporting a monophyletic origin of animal TRs. However, TRs from two separate lineages contain a CR4/5-equivalent domain, called eCR4/5, that lacks the crucial P6.1 stem-loop of vertebrate CR4/5 but is capable of assembly with TERT in trans and is crucial for telomerase catalytic activity. Furthermore, structural comparison shows that a template-boundary element, the P1.1 helix, is ancient and found across most metazoan TRs, including some primitive chordates, but was lost rather recently in vertebrate TRs. This study reveals the detailed evolutionary pathway of TRs across diverse clades of animal species, identifying ancestral as well as later-evolved TR structural domains.

2.2 Introduction

The termini of eukaryotic linear chromosomes are capped by telomeres, nucleoprotein complexes that safeguard genome stability, ensure proper chromosome partitioning, and block chromosome fusion events arising from undesired double-stranded DNA break repair mechanisms (Podlevsky & Chen, 2012; Zakian, 2009). Telomeres shorten with each cell division following genome duplication due to the aptly named ‘end replication problem’ (Soudet et al., 2014). The telomerase enzyme is responsible for the synthesis of telomeric DNA to offset telomere erosion (Musgrove et al., 2018). The core enzyme is minimally composed of the catalytic telomerase reverse transcriptase (TERT), which synthesizes telomeric DNA from a short region within the integral telomerase RNA (TR) component. Although TR is indispensable for telomerase function, there is extremely limited sequence homology and little structural similarity among TRs from protists, fungi, and metazoans (Podlevsky & Chen, 2016).

The few shared TR structural elements found within most species include a template-proximal pseudoknot and a distal stem-loop moiety located centrally within the greater TR (Blackburn & Collins, 2010; Lin et al., 2004; Podlevsky et al., 2016a; Qi et al., 2013). While present within all TRs with a determined secondary structure, the distal stem-loop moiety is highly divergent in TRs from evolutionarily distant species. Exacerbating the divergence in TR sequence and secondary structure are the distinct biogenesis processes employed by evolutionarily separate groups of species (Podlevsky & Chen, 2016). The overall architecture of TRs appears to have been extensively shaped by the myriad of accessory proteins, which vary dramatically among species and bind species-specific structural domains within TR. These accessory proteins are essential for TR biogenesis, localization, RNP formation, and the regulation of telomerase activity. Discerning the origins, evolution, and structure/function relationship of the telomerase RNP requires the identification of numerous TRs from all major taxa of eukaryotes (Chen et al., 2000; Podlevsky & Chen, 2016). Massive divergence in TR primary sequence has been a nearly insurmountable obstacle for TR identification from important model organisms and taxa by conventional molecular and bioinformatics approaches.

Biochemical methods for TR identification primarily involve the purification of telomerase holoenzyme from cell lysates (Cifuentes-Rojas et al., 2011; Greider & Blackburn, 1989; Leonardi et al., 2008; Qi et al., 2013; Webb & Zakian, 2008). Although the first TR was identified using this method, these purification protocols are tedious, time consuming and impractical for a number of organisms. Even in species amenable to telomerase enzyme purification, the process requires multiple steps and extensive optimization (Li et al., 2013).

PCR-based methods have partially overcome the drawbacks associated with biochemical purification and have been successfully applied for TR identification in a number of vertebrate and yeast species (Chen et al., 2000; Dandjinou et al., 2004). For the identification of vertebrate TRs, degenerate PCR primers were designed to target the highly conserved pseudoknot and CR4/5 sequences. The limits of this strategy were best demonstrated in teleost fish species, where variation in the primer target sites rendered PCR amplification unsuccessful (Xie et al., 2008). This situation was even more pronounced in echinoderms, where advanced approaches were required for TR finding (Podlevsky et al., 2016b). To overcome primer design complications, an alternative approach relied on targeting protein-coding genes flanking TR whose positions are conserved across species, a property known as syntenic conservation. This has been applied for TR discovery in fungi of the genus Saccharomyces (Dandjinou et al., 2004). These PCR-based approaches are limited to, and best suited for, identifying additional TRs from groups of species in which a closely related TR has already been identified. RT-PCR from total RNA has been applied in Candida fungal species. This was possible because of their considerably long TR template region, against which primers were designed (Gunisova et al., 2009). The long template sequence provides a sufficient number of nucleotides for good priming; however, most species do not harbor such long templates. Although these PCR-based variants offer greater flexibility in terms of experimental setup, they are still hindered by the lack of sequence conservation. Additional molecular approaches include RNA/DNA hybridization and whole-genome gene-knockout library screening (Hsu et al., 2007; Kachouri-Lafond et al., 2009; McEachern & Blackburn, 1995).

Several bioinformatics approaches have been developed to leverage the rapidly expanding number of sequenced genomes and transcriptomes made available by next-generation sequencing advances. The basic local alignment search tool, BLAST, was designed as a straightforward means of searching for homologous sequences within closely related species (Altschul et al., 1990; Qi et al., 2013). BLAST has been recently enhanced and applied for TR discovery with the addition of position-specific weight matrices (PWM) using the Fragrep program (Mosig et al., 2006; Podlevsky et al., 2016b; Xie et al., 2008). This approach has been selectively successful at overcoming the low sequence conservation inherent to the highly divergent TR. However, these bioinformatics techniques require a sufficient number of well-aligned TR sequences from relatively closely related species for calculating the nucleotide probability at each position, which is necessary for generating the PWM search pattern. These bioinformatics tools are therefore best applied to the identification of TRs from species that recently diverged from a group of species in which a TR has already been identified.

Herein, we have applied bioinformatics analysis on available genomic and transcriptomic data to identify TRs from each of the major phyla that comprise the metazoan kingdom of animals. Additionally, to validate identified candidate TRs, we cloned representative TRs of the deuterostome and protostome groups and showed in vitro reconstitution of active telomerase enzyme via direct primer extension assays. This comprehensive survey of metazoan TRs revealed that the consensus structure of metazoan TR comprises three structural domains, akin to the initially characterized vertebrate TR. However, structural features of the pseudoknot domain that are conserved among metazoan TRs, and across all eukaryotic TRs, have been lost within vertebrates. Additionally, the absolutely essential central domain structural element has been lost twice independently within metazoans, in echinoderms and protostomes, having been replaced with a simpler structure that is less critical for the telomerase activity of these species. In contrast to the variations found in the pseudoknot and central domains, the H/ACA domain is absolutely conserved, revealing that this biogenesis pathway for TR emerged early in the metazoan lineage. Together, this research describes the evolution of TRs across the metazoan kingdom of animals in terms of sequence conservation and structural elements.

2.3 Materials and Methods

Sequence alignment analysis

Multiple sequence alignments of vertebrate and echinoderm TRs were performed initially using the program BioEdit and the ClustalW algorithm. The alignments were further refined manually using the highly conserved regions and known motifs as anchor points. Closely related species were aligned first, and the alignment was then expanded to include sequences from more divergent species.

Isolation of total RNA

Total RNA was isolated from the dissected gonadal tissue of P. giganteus and P. ochraceus, intestinal tissue of P. diffusa, liver tissue of C. virginica and from whole body tissues of S. kowalevskii, S. bromophenolosus and C. teleta using TRI-Reagent (Molecular Research Center, Inc.) following the manufacturer’s instructions, with an additional acid phenol extraction step prior to chloroform extraction and ethanol precipitation. RNA quality was determined by electrophoresis on a 1% agarose/formaldehyde denaturing gel and on a 2100 Bioanalyzer (Agilent).

Bioinformatics search strategy

The next-generation sequencing data were de novo assembled using the Trinity assembly program (Haas et al., 2013) with default parameters. The assembled transcripts were searched using the Infernal (Inference of RNA Alignment) program with a PWM sequence and secondary structure pattern generated from the multiple sequence alignment of 42 vertebrate and 13 echinoderm TRs. Transcriptomes and genomes available from NCBI, Ensembl (Challis et al., 2017) and Dryad, as well as sequencing reads from NCBI that were de novo assembled using the Trinity assembly program, were searched with the Infernal program. The Infernal program was continuously retrained with a progressively updated multiple sequence alignment that was expanded to include the newly identified TRs to effectively ‘walk the phylogenetic tree’ (Figure 2.1C).
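The retraining loop described above can be sketched as a plan of Infernal command lines, one set per search round. This is a minimal illustration, not the pipeline used in this study: `cmbuild`, `cmcalibrate` and `cmsearch` with `--tblout` are standard Infernal programs, but the function names, file names and ordering of targets are hypothetical, and the manual verification of hits between rounds is represented only by a placeholder alignment file name.

```python
from typing import List

def infernal_round_commands(alignment_sto: str, target_fasta: str,
                            round_id: int) -> List[List[str]]:
    """Return the Infernal command lines for one search round.

    File names are hypothetical placeholders chosen for illustration.
    """
    cm_file = f"tr_round{round_id}.cm"
    hits_file = f"tr_round{round_id}_hits.tbl"
    return [
        # Build a covariance model from a structure-annotated alignment.
        ["cmbuild", cm_file, alignment_sto],
        # Calibrate E-value parameters for the model.
        ["cmcalibrate", cm_file],
        # Search the target genome or transcriptome, writing a hit table.
        ["cmsearch", "--tblout", hits_file, cm_file, target_fasta],
    ]

def plan_tree_walk(seed_alignment: str,
                   targets: List[str]) -> List[List[List[str]]]:
    """Plan successive rounds, ordered from closest to most distant target."""
    plan = []
    alignment = seed_alignment
    for round_id, fasta in enumerate(targets, start=1):
        plan.append(infernal_round_commands(alignment, fasta, round_id))
        # Assumption: verified hits are merged by hand into an updated
        # alignment that seeds the next round (hypothetical file name).
        alignment = f"tr_alignment_round{round_id}.sto"
    return plan

if __name__ == "__main__":
    for commands in plan_tree_walk("seed.sto", ["clade_a.fa", "clade_b.fa"]):
        for cmd in commands:
            print(" ".join(cmd))
```

Each command list could be passed to `subprocess.run` once Infernal is installed. The sketch deliberately stops at planning so that hit verification remains a manual step between rounds.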

Species identification

Whole body tissue of S. bromophenolosus was used for genomic DNA isolation using the Wizard Genomic DNA Purification Kit (Promega). A fragment of the 18S ribosomal RNA gene was PCR amplified from the isolated genomic DNA using primers targeting conserved 18S ribosomal gene regions (Wang et al., 2014) and Sanger sequenced (Appendix A).

Determination of RNA 5’- and 3’-ends

The 5’- and 3’-ends of the S. kowalevskii and P. diffusa TERTs and TRs were determined by Rapid Amplification of cDNA Ends (RACE) using the FirstChoice RLM-RACE kit (Ambion) following poly(A) tailing of the total RNA using poly(A) polymerase (USB). For all other metazoan TRs identified in this study, the 5’-end was predicted by the proximity of a TATA box for transcription initiation and a putative P1 helix, while the 3’-end was predicted as 3 nt downstream of the box ACA motif as previously described (Chen et al., 2000; Podlevsky & Chen, 2016).

Cloning of TERTs and TRs

A partial TERT gene of S. kowalevskii (GenBank accession no. of scaffold NW_003141316) was identified from the S. kowalevskii genome database by BLAST. Partial TERT sequences of P. diffusa and the putative full-length isoform X1 of Crassostrea virginica TERT were obtained by BLAST searches from Ampubase (Ip et al., 2018) and the NCBI (NCBI RefSeq no. XP_022325546.1), respectively. Both 5’ and 3’ ends of P. diffusa TERT were determined by RACE followed by RT-PCR and cloning. The 5’ and 3’ ends of the TR transcripts of S. kowalevskii and P. diffusa were characterized by RACE. The full-length TR sequences were PCR amplified from genomic DNA and cloned.

Telomerase in vitro reconstitution

3xFLAG-tagged TERT (S. kowalevskii, P. diffusa and C. virginica) was expressed in rabbit reticulocyte lysate (RRL) from the p3xFLAG-AtTERT plasmid using the TNT Quick Coupled transcription/translation kit (Promega) following the manufacturer’s instructions. Full-length TR or TR fragments were in vitro transcribed by T7 RNA polymerase, gel purified and assembled with TERT protein for 30 min at 30˚C.

Telomerase activity assay

12 µl of in vitro reconstituted telomerase enzyme was immuno-purified with 3 µl of anti-FLAG M2 magnetic beads (Sigma M8823) at room temperature for 1 hr. The telomerase enzyme on beads was assayed in a 10 µl reaction containing 1X telomerase reaction buffer (50 mM Tris-HCl, pH 8.0, 50 mM NaCl, 0.5 mM MgCl2, 5 mM BME and 1 mM spermidine), 1 µM DNA primer, specified dNTPs or ddNTPs, and 0.18 µM of 32P-dGTP (3,000 Ci/mmol, 10 mCi/ml; Perkin-Elmer). Reactions were incubated at 30˚C for 60 min and terminated by phenol/chloroform extraction, followed by ethanol precipitation. The DNA products were resolved on a 10% (wt/vol) polyacrylamide/8 M urea denaturing gel, dried, exposed to a phosphor storage screen and imaged on a Typhoon gel scanner (GE Healthcare).
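As a worked check of the radiolabel concentration quoted above, the following sketch converts the stated specific activity and activity concentration of the 32P-dGTP stock into a molar concentration and the volume needed per 10 µl reaction. It assumes a fresh stock with no correction for radioactive decay, and the function names are illustrative only.

```python
def stock_molarity_uM(specific_activity_Ci_per_mmol: float,
                      activity_conc_mCi_per_ml: float) -> float:
    """Molar concentration (µM) of a radiolabeled nucleotide stock."""
    ci_per_ml = activity_conc_mCi_per_ml / 1000.0           # mCi/ml -> Ci/ml
    mmol_per_ml = ci_per_ml / specific_activity_Ci_per_mmol  # mmol/ml equals mol/l
    return mmol_per_ml * 1e6                                 # mol/l -> µM

def volume_needed_ul(final_uM: float, reaction_ul: float,
                     stock_uM: float) -> float:
    """Stock volume (µl) giving the requested final concentration."""
    return final_uM * reaction_ul / stock_uM

if __name__ == "__main__":
    stock = stock_molarity_uM(3000.0, 10.0)    # about 3.33 µM
    vol = volume_needed_ul(0.18, 10.0, stock)  # about 0.54 µl
    print(f"stock = {stock:.2f} µM, volume per reaction = {vol:.2f} µl")
```

Under these assumptions the stock is about 3.33 µM, so roughly 0.54 µl of label per 10 µl reaction gives the stated 0.18 µM.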

2.4 Results

A phylogeny-assisted approach for identification of metazoan TR homologs

The identification of TR genes from many groups of metazoan species has remained a challenging biochemical and bioinformatics task, limiting our understanding of TR evolution in the animal kingdom. We devised a bioinformatics approach to leverage the publicly available genome and transcriptome sequencing data for TR identification, initially targeting species that are closely related, based on their phylogenetic relationships, to species with previously identified TR genes (Figure 2.1). Our search strategy employed the Infernal (Inference of RNA Alignment) program (Nawrocki & Eddy, 2013), which searches for conserved secondary structures and primary sequence similarity. For training the Infernal program, we used the aligned sequences and conserved secondary structures of TRs from vertebrates and echinoderms (Figure 2.1 A, B) (Chen et al., 2000; Podlevsky et al., 2016b). Both vertebrate and echinoderm TRs comprise three distinct structural domains, the pseudoknot, CR4/5 (or eCR4/5 for echinoderms) and H/ACA (Figure 2.1A). Since there is no structural or sequence similarity between the vertebrate CR4/5 and the echinoderm eCR4/5, we performed the Infernal search using only the aligned sequences and conserved structures of the pseudoknot and H/ACA domains, excluding the CR4/5 or eCR4/5 domains (Figure 2.1B).

Our detailed sequence alignment revealed several regions with greater than 80% sequence identity. Within the pseudoknot, these regions comprise the expected template for telomeric DNA sequence as well as the region that forms a triple helix. This minimal region of the triple helix comprises two U-tracts and an A-tract as well as the terminus of P2b. In addition to the namesake box H and ACA moieties, the base of P7a and the terminus of P8b have well-conserved residues that are likely important for maintaining this helical structure. Despite the prevalence of a CAB box across most of these vertebrate and echinoderm TRs, this element was below the 80% identity threshold. We then trained the Infernal program with our detailed sequence alignment, which included defined regions of the secondary structure, to create a hybrid sequence and structure PWM (Figure 2.1B).
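The 80% identity criterion can be illustrated with a short sketch that scores each alignment column and reports contiguous blocks above the threshold. The toy alignment is invented for illustration and is not TR sequence data. The way gaps are handled (counted against identity) and the minimum block length are assumptions, since the text does not state them.

```python
from collections import Counter
from typing import List, Tuple

def column_identity(alignment: List[str]) -> List[float]:
    """Fraction of sequences sharing the most common residue in each column.

    Assumption: gaps ('-') count against identity, so the denominator is
    the total number of sequences.
    """
    n = len(alignment)
    identities = []
    for column in zip(*alignment):
        residues = [c for c in column if c != "-"]
        top = Counter(residues).most_common(1)[0][1] if residues else 0
        identities.append(top / n)
    return identities

def conserved_blocks(alignment: List[str], threshold: float = 0.8,
                     min_len: int = 3) -> List[Tuple[int, int]]:
    """Half-open (start, end) column ranges with identity above threshold."""
    blocks, start = [], None
    # A trailing sentinel value closes a block that runs to the last column.
    for i, value in enumerate(column_identity(alignment) + [0.0]):
        if value > threshold and start is None:
            start = i
        elif value <= threshold and start is not None:
            if i - start >= min_len:
                blocks.append((start, i))
            start = None
    return blocks

if __name__ == "__main__":
    toy = ["UUAGGGAC", "UUAGGGCU", "UUAGGGGA", "UUAGGGU-", "UUAGGGAA"]
    print(conserved_blocks(toy))
```

For the toy input, the first six columns are identical in every sequence and form the single reported block.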

Identification of deuterostome telomerase RNAs

TR discovery within metazoans began with closely related species and radiated out as we ‘walked the phylogenetic tree’ (Figure 2.2). We began by searching for TRs within deuterostomes, specifically basal chordate groups, to identify additional regions of similarity that might have existed between early emerging chordates and echinoderms (Figure 2.2). Searching generated single hits for the three cyclostomata (jawless fish) and five cephalochordata (primitive fish-like eels) species with either genome or transcriptome sequence data available (Figure 2.2). These eight putative TRs were aligned with the previous vertebrate and echinoderm TR sequences to increase the diversity of the Infernal sequence and structure PWM, thus broadening our search parameters for additional deuterostome species (Figure 2.1C). This search identified 13 putative TR sequences: nine echinoderm species from the starfish class Asteroidea as well as four acorn worm species from phylum Hemichordata (Figure 2.2). Although we have identified and extensively characterized a number of vertebrate and echinoderm TRs, phylum Hemichordata had not previously been explored for TR studies (Chen et al., 2000; Li et al., 2013; Podlevsky et al., 2016b).

In vitro reconstituted acorn worm telomerase is active and processive

In order to functionally characterize hemichordate TR, we chose the model invertebrate acorn worm (Saccoglossus kowalevskii). Total RNA was isolated from a single adult acorn worm, and the size of the full-length TR (SkoTR) transcript was determined by 5’- and 3’-RACE to be 436 nt, which is close in length to human and other vertebrate TRs (Figure 2.3B). The full-length TR gene was PCR amplified from genomic DNA, cloned and sequenced to verify the sequence identified from the acorn worm genome. Next, we identified a partial acorn worm TERT (SkoTERT) sequence by BLAST. To determine the identity of the ORF and the actual ends of the mRNA, we performed 5’- and 3’-RACE. The results showed the mRNA to be 2.8 kb in length. The coding sequence for SkoTERT was then cloned into an expression vector to synthesize SkoTERT protein in the rabbit reticulocyte lysate (RRL) system. Acorn worm telomerase activity was reconstituted by combining in vitro transcribed SkoTR and in vitro synthesized SkoTERT protein in RRL. The in vitro reconstituted acorn worm telomerase is active and highly processive, producing radiolabeled telomeric DNA products with the characteristic 6-nt ladder banding pattern, while reactions devoid of SkoTR do not show telomerase-extended DNA (Figure 2.3A, C). Moreover, telomeric DNA primers with six different circular permutations, which hybridize to the SkoTR template at different positions, follow the template-dependent offset ladder pattern (Figure 2.3A). Collectively, these results provide strong evidence for successful functional characterization of the acorn worm TR.
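The logic of the circularly permuted primer experiment can be sketched in a few lines: each rotation of a 6-nt repeat ends at a different register, so the number of nucleotides required to reach a given repeat boundary shifts by 1 nt per permutation. The repeat sequence TTAGGG, the number of repeats per primer, and the choice of boundary are assumptions made for illustration. They are not the primer sequences or pause sites reported in this study.

```python
from typing import List

def circular_permutations(repeat: str = "TTAGGG",
                          n_repeats: int = 3) -> List[str]:
    """All rotations of the repeat unit, each concatenated n_repeats times."""
    return [(repeat[i:] + repeat[:i]) * n_repeats for i in range(len(repeat))]

def nt_to_repeat_end(primer: str, repeat: str = "TTAGGG") -> int:
    """Nucleotides to add so the primer 3' end completes the canonical repeat.

    Raises ValueError if the primer 3' end is not a rotation of the repeat.
    """
    k = len(repeat)
    rotation = (repeat + repeat).index(primer[-k:])
    return (k - rotation) % k

if __name__ == "__main__":
    for primer in circular_permutations():
        print(primer, nt_to_repeat_end(primer))
```

The six primers give six distinct offsets (0 through 5), which is the arithmetic basis for the ladder shifting by one position from one permuted primer to the next.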

Secondary structure models of novel deuterostome TRs reveal conserved and disparate structural motifs

Using sequence alignment of newly identified TRs with select vertebrates and echinoderms, we deduced secondary structure models for the novel deuterostome TRs. The template-pseudoknot and H/ACA domains showed sequence conservation. The template-adjacent P1.1 stem and box H/ACA motifs were found in all newly identified deuterostome TRs. The central domain architecture, however, showed greater structural variability. All identified early branching chordates contain the CR4/5 domain conserved in vertebrates and filamentous fungi, including the highly conserved 4-bp stem-loop P6.1. This includes the major chordate subgroup cephalochordata, which is composed of invertebrates, suggesting that the CR4/5 domain is widespread in chordates and not exclusive to vertebrates. All identified asteroidea TRs, however, lack the CR4/5 structure, which is replaced by structurally variable domains, similar to all previously characterized echinoderm TRs (Li et al., 2013; Podlevsky et al., 2016b).
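Secondary structure inference by phylogenetic comparative analysis rests on finding alignment column pairs that stay base-paired while their sequence changes. A minimal sketch of that test is shown below, assuming Watson-Crick and G-U wobble pairs count as compatible. The toy alignment is invented for illustration, and the scoring is far simpler than a full covariation analysis.

```python
from typing import List, Tuple

# Watson-Crick pairs plus G-U wobble pairs.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}

def pair_support(alignment: List[str], i: int, j: int) -> Tuple[float, int]:
    """Score columns i and j as a candidate base pair.

    Returns the fraction of sequences whose residues at i and j can pair,
    and the number of distinct pair types observed. Several distinct pair
    types at full compatibility indicate covariation.
    """
    observed = [(seq[i], seq[j]) for seq in alignment]
    compatible = [p for p in observed if p in PAIRS]
    return len(compatible) / len(observed), len(set(compatible))

if __name__ == "__main__":
    toy = ["GAAAC", "AAAAU", "CAAAG", "UAAAA"]
    print(pair_support(toy, 0, 4))  # columns that covary
    print(pair_support(toy, 1, 3))  # columns that cannot pair
```

In the toy input, columns 0 and 4 are compatible in every sequence with four different pair types, the signature of a covarying helix position, whereas columns 1 and 3 never pair.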

Acorn worm TR contains the vertebrate conserved CR4/5 domain including the highly conserved P6.1 stem loop

Interestingly, despite hemichordates being placed as a sister group to Echinodermata (Bourlat et al., 2006), hemichordate TRs contain the conserved CR4/5 domain (Figure 2.3B). To determine whether the hemichordate CR4/5 is functionally homologous to vertebrate CR4/5, we performed telomerase direct activity assays by combining synthetic SkoTERT protein with in vitro transcribed SkoT/PK and SkoCR4/5 added as two fragments. Based on previous studies, the T/PK and CR4/5 domains bind independently to TERT and are the minimal TR domains required to reconstitute vertebrate telomerase activity in vitro (Chen et al., 2002; Mitchell & Collins, 2000; Tesmer et al., 1999). We also tested mutants of homologous residues of the highly conserved P6.1 loop, which disrupt activity in vertebrates (Chen et al., 2002). Together, the SkoT/PK and SkoCR4/5 domains were sufficient to reconstitute in vitro telomerase activity comparable to that of full-length TR (Figure 2.3C, right, lane 3), while mutations in the P6.1 loop of the CR4/5 domain completely abolished activity (Figure 2.3C, lanes 4, 5), demonstrating that hemichordate CR4/5 is functionally equivalent to the vertebrate CR4/5 domain.

Identification of protostome telomerase RNAs

Upon exhausting TR sequence hits within available deuterostome sequencing data, we proceeded to search protostomes, the sister group to deuterostomes within bilateria (Figure 2.2). We identified putative TR sequences from 23 species of molluscs covering 4 major classes, 7 annelid species including the model worm Capitella teleta and the agriculturally important earthworm Eisenia fetida, and one species each from the small phyla brachiopoda and phoronida, for a total of 32 TRs from the protostome group (Figure 2.2). Sequence alignments of identified candidate TRs with select deuterostomes clearly showed the presence of hallmark TR domains, providing crucial evidence for the authenticity of the identified TR candidates.

Functional characterization of the first protostome TRs

Despite a wealth of publicly available genomic and transcriptomic data for protostomes, to the best of our knowledge not a single TR has previously been reported from this group. This is in part due to the divergent primary sequences of TRs. Of the 32 protostome TR candidates identified, we chose two molluscs, one each from class gastropoda and bivalvia, for functional characterization. The gastropod mollusc apple snail (Pomacea diffusa) is important in the pet trade, whereas the bivalve American oyster (Crassostrea virginica) is an important food source. Due to their ease of availability, these two representative species were prime candidates for TR functional characterization. Performing RACE to identify the 5’ and 3’ termini of apple snail TR (PdTR), we found that it is 408 nt in length (Figure 2.4B). This indicates that protostome TRs fall within the length range of deuterostome TRs (Chen et al., 2000; Podlevsky et al., 2016b). The TERT-coding mRNAs of both species were bioinformatically identified, and P. diffusa TERT was experimentally characterized. Subsequently, TERT protein coding sequences were cloned for RRL expression. Active telomerases from both species were reconstituted in vitro by combining the respective synthetic TERTs and in vitro transcribed TRs (Figure 2.4A, Figure 2.5A). The 6-nt banding pattern characteristic of telomerase can be clearly seen for both species, validating the first functionally characterized protostome TRs. Six circularly permuted DNA primers, which hybridize to different positions on the TR template, were tested for telomerase extension. A banding pattern offset by 1 nt was clearly seen, consistent with the template-dependent nucleotide addition property unique to telomerase (Figure 2.4A, Figure 2.5A). It is also worth noting that CvTR has the shortest template identified in any TR, with only 7 nt that can synthesize a 6-nucleotide telomere repeat sequence.

Protostome TRs show remarkable diversity in the central domain

By constructing secondary structure models for select protostome TRs via phylogenetic comparative analysis, we found that the T/PK core and the box H/ACA structural domains were conserved. Template boundary definition mediated by a template-proximal helix appears predominant, given the universal presence of P1.1 in the T/PK domain of all identified protostome TRs (Figure 2.4B, Figure 2.5B). Unlike deuterostomes, none of the identified protostome TRs contain CR4/5; rather, they show structurally variable central domains with no sequence conservation, akin to echinoderm TRs (Figure 2.4B, Figure 2.5B) (Podlevsky et al., 2016b). To determine whether a domain functionally equivalent to CR4/5 or eCR4/5 exists within the representative mollusc protostomes apple snail and American oyster, we performed in vitro telomerase assays comparing full-length TR against the template-pseudoknot fragment (T/PK) and truncated central domain fragments assembled in trans with synthetic TERT in RRL. Both species show similar TR domain requirements for in vitro activity. Although the T/PK fragment alone did not produce detectable activity, T/PK in combination with the central domain or the P6 stem showed activity for apple snail telomerase (Figure 2.4C). The P5 stem, however, did not show detectable activity in trans with T/PK, indicating that P6 is the eCR4/5 element in apple snail (Figure 2.4B, C). Interestingly, two-fragment trans assays showed RNA concentration-dependent activity in apple snail (Figure 2.4D). In direct telomerase activity assays performed with increasing concentrations of the trans fragments, activity did not reach the level obtained with the single full-length fragment and plateaued at 2 µM RNA concentration (Figure 2.4D). Oyster TR has a long stem as the central domain, and truncations followed by two-fragment trans telomerase activity assays allowed determination of the minimal fragment required for in vitro activity in combination with the T/PK domain (Figure 2.5B, C). The terminal stem-loop CDΔ2 was identified as the eCR4/5 domain in CvTR (Figure 2.5C).

The most basal metazoan TRs contain the CR4/5 domain conserved in higher chordates

Continuing our search for TR candidates across metazoans to the sea nettle cnidarian phylum, we identified putative TR sequences from 25 species that belong principally to anthozoan sea anemones and corals, followed by scyphozoan jellyfish (Figure 2.2). During this progressive sequence search, we continuously updated our multiple sequence alignment with our newly identified putative TR sequences to retrain Infernal and update our sequence and structure PWM (Figure 2.1B). We ended our search with the identification of putative TRs from the placozoa phylum of flat animals and three poriferan sponges, the most basal living metazoans (Figure 2.2). Inferring secondary structure models for representatives from major cnidarian classes and the three sponges, we found the T/PK and box H/ACA domains to be conserved with all previously identified metazoan TRs, with the universal presence of the P1.1 stem as TBE. The T/PK domain was found to contain an additional stem termed P2.1. Strikingly, both cnidarian and poriferan TRs contain the chordate-specific CR4/5 domain including the highly conserved P6.1 stem-loop (Figure 2.6 A-F). This suggests that the CR4/5 domain was an ancestral structural element in the metazoan lineage that was replaced by eCR4/5 in protostomes and echinoderms but re-emerged in hemichordates and chordates.

2.5 Discussion

Much has been learnt from the identification and characterization of vertebrate, echinoderm, fungal and ciliate TRs over the past three decades (Podlevsky & Chen, 2016). While TR studies from these distinct evolutionary groups have offered a global overview of TR evolution in eukaryotes, the lack of a comprehensive study of TRs in the animal kingdom has hindered the delineation of TR structural evolution in animals. Among animals, vertebrate TRs have been the most extensively studied for TR structure and function (Chen et al., 2000; Xie et al., 2008). Identification of the first invertebrate TR from purple sea urchin and a detailed study on echinoderm TR structure have expanded our understanding of TR evolution within the animal kingdom (Li et al., 2013; Podlevsky et al., 2016b). However, a vast majority of animal phyla remain unexplored for TR studies (Figure 2.2). In order to discern TR structural evolution within the animal kingdom, we have identified novel TRs from disparate animal clades, performed comprehensive phylogenetic comparative analysis to deduce secondary structure models, and experimentally validated select TRs. We find that the overall architecture of TR structural elements and their requirements for function are conserved even among the most basal of animals, while clade-specific features emerge or disappear.

We employed a phylogeny-assisted reiterative homology search strategy to identify novel animal TRs. TR identification approaches based on sequence conservation, such as BLAST searches, have historically been extremely limited due to the immense disparities in TR primary sequences even among closely related species. Circumventing the drawbacks associated with BLAST, a modified TR search method implemented in fragrep 2, which uses primary sequences but leverages position-specific weight matrices (PWMs) to assign weighted scores to a given nucleotide position based on a multiple sequence alignment, has achieved better success in identification of TRs (Podlevsky et al., 2016b; Xie et al., 2008). However, more distantly related TRs, which might show co-variation or point mutations within the PWM blocks, either completely fail to be identified by fragrep 2 or give a large number of hits, making downstream bioinformatics screens for TR-specific structural motifs impractical. Our strategy takes advantage of both sequence conservation and secondary structural information from a sequence alignment of previously known TRs to identify TRs from publicly available genomic or transcriptomic data using Infernal (Nawrocki & Eddy, 2013). Initially, a multiple sequence alignment of TRs with structural annotations is generated using existing secondary structure information supported experimentally and/or via co-variation. A statistical model of the alignment that considers both secondary structure information and position-specific sequence conservation, known as a covariance model, is generated using Infernal. This model is used to search against the genome or transcriptome of a closely related target species to obtain TR candidates. The secondary structure model and primary sequence alignment are used to verify the hits to identify a bona fide TR. This process is repeated by generating an improved covariance model that includes the newly identified TR and searching for TRs from organisms in the next closely related clade (Figure 2.1C). Careful secondary structural analysis and a strategic search based on well-established phylogenetic relationships have allowed us to identify TRs from 83 animal species covering the animal kingdom, which would have been extremely tedious with conventional methods.
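The reiterative search described above can be outlined in a few lines of Python. This is a minimal sketch under stated assumptions, not the pipeline used in this study: all function names, file names and the E-value cutoff are hypothetical, and the verification step (inspection of the secondary structure model and sequence alignment in this work) is represented by a caller-supplied function. Only the basic Infernal programs cmbuild, cmcalibrate and cmsearch, with the --tblout and -E options, are referenced, and the commands are assembled rather than executed.

```python
from typing import Any, Callable, List, Sequence, Tuple


def infernal_commands(alignment: str, cm_file: str, target_db: str,
                      hits_table: str, e_value: float = 1e-3) -> List[List[str]]:
    """Assemble (but do not run) the Infernal commands for one search round.

    File names and the E-value cutoff are illustrative assumptions.
    """
    return [
        # Build a covariance model from a structure-annotated alignment.
        ["cmbuild", cm_file, alignment],
        # Calibrate the model so that cmsearch can report E-values.
        ["cmcalibrate", cm_file],
        # Search one genome or transcriptome and write a tabular hit list.
        ["cmsearch", "--tblout", hits_table, "-E", str(e_value),
         cm_file, target_db],
    ]


def reiterative_search(seed_alignment: Any,
                       targets: Sequence[str],
                       find_candidates: Callable[[Any, str], List[str]],
                       is_bona_fide: Callable[[str], bool],
                       extend_alignment: Callable[[Any, str], Any]
                       ) -> Tuple[Any, List[Tuple[str, str]]]:
    """Walk through targets ordered from the closest to the most distant clade.

    Each verified candidate is added to the alignment, so that the model
    used for the next target already includes it.
    """
    alignment = seed_alignment
    identified: List[Tuple[str, str]] = []
    for target in targets:
        for candidate in find_candidates(alignment, target):
            if is_bona_fide(candidate):  # hypothetical verification hook
                identified.append((target, candidate))
                alignment = extend_alignment(alignment, candidate)
    return alignment, identified
```

In practice, find_candidates would run the assembled commands (for example through subprocess) and parse the hit table, while is_bona_fide stands in for the verification of each hit against the expected TR structural motifs.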

The two-domain requirement for in vitro telomerase activity is conserved in metazoan TRs. Evolutionary groups traditionally well characterized for TRs, such as vertebrates and fungi, minimally require two structural domains within the TR for in vitro telomerase activity reconstitution (Mitchell & Collins, 2000; Qi et al., 2013; Tesmer et al., 1999). These two domains, comprising the template-pseudoknot and a distal stem-loop moiety, reconstitute telomerase activity even when added in trans as two independent RNA fragments with in vitro synthesized TERT protein. Our previous work on dissecting echinoderm and flagellate TRs to determine the minimal TR domains required for activity shows that the two-domain requirement is highly conserved and is an ancient feature, evolving as early as the first flagellate TRs (Li et al., 2013; Podlevsky et al., 2016a; Podlevsky et al., 2016b). Interestingly, the dependence of in vitro activity on these domains varies between the groups. For instance, vertebrate and fungal telomerase requires both domains to be present to show activity comparable to that of full-length TR. In contrast, flagellates and echinoderms are only partially dependent on the distal stem-loop moiety, with the T/PK domain by itself responsible for ~30-40% of full-length activity (Podlevsky & Chen, 2016). Ciliate TRs, in turn, show only partial activity if fragments are added in trans compared to a single RNA fragment (Mason et al., 2003). This is potentially due to the compact nature of ciliate TRs, which limits the independence of the two TR domains. Characterization of the structural domains required for activity in vitro from novel metazoan TRs shows the conservation of this property (Figures 2.3C, 2.4C, 2.5C). The T/PK domain is highly conserved in all identified novel metazoan TRs. However, the distal stem-loop moiety shows dramatic diversity in its functional dependence for in vitro activity. The vertebrate-conserved CR4/5 domain was found in all newly identified TRs from early evolving chordates, suggesting an absolute requirement of their CR4/5 for in vitro activity. Asteroidea TRs from echinoderms putatively contain the eCR4/5 domain identified and well characterized from other classes of echinoderms. Interestingly, despite being placed as a sister clade of echinoderms, the first hemichordate TR identified was found to contain the chordate-conserved CR4/5 domain, including the highly conserved P6.1 stem loop (Figure 2.3B). Functional characterization of hemichordate TR from acorn worm shows highly active and processive in vitro reconstituted telomerase. Subsequent secondary structure-based dissection of acorn worm TR shows a complete lack of activity in the absence of CR4/5 or in the presence of a mutated P6.1 loop of the CR4/5 domain added in trans (Figure 2.3C). TRs from the protostome group, however, have completely switched to a highly diverse eCR4/5-type distal stem-loop moiety (Figures 2.4B, 2.5B). Structure-based truncations of the TR from the gastropod mollusc apple snail show that a simple stem-loop eCR4/5 is required for telomerase activity (Figure 2.4C, lanes 2, 3). Interestingly, absence of this eCR4/5 completely disrupted activity, as opposed to the partial disruption observed for echinoderm eCR4/5 (Figure 2.4C). This suggests that while apple snail eCR4/5 is structurally similar to echinoderm eCR4/5, in terms of absolute function it is equivalent to chordate CR4/5. Although the American oyster belongs to a sister class of Gastropoda, its TR can reconstitute partial activity in the presence of only the T/PK fragment (Figure 2.5C).

Most basal metazoans, including corals and sponges, show the highly conserved CR4/5 domain, suggesting an absolute functional dependence on both the T/PK and CR4/5 domains for in vitro activity (Figure 2.6). This also lends support to the ancestral nature of the CR4/5 domain, as it is found in the earliest branching animals as well as filamentous fungi. The loss of the P6.1 stem occurs in the protostome group and in echinoderms of the deuterostome clade, potentially due to co-evolution of a protein binding site that evolved to accommodate simpler structures such as eCR4/5.

The template-adjacent helix is the ubiquitous template boundary element in all newly identified metazoan TRs (Figure 2.7). The T/PK core domain comprises a stem of variable length that is located immediately upstream of the TR template in disparate groups of species. This stem loop, termed P1.1 in invertebrate echinoderms, TBE in fungal TRs and Helix II in ciliates, prevents addition of non-telomeric DNA sequences to telomeric DNA as a result of template read-through (Jansson et al., 2015; Podlevsky et al., 2016b; Qi et al., 2013; Tzfati et al., 2000). The template-proximal helix-type TBE mostly prevents template bypass by limiting the availability of single-stranded RNA for DNA repeat synthesis; ciliates, however, contain conserved residues at the base of the Helix II stem that form a TERT binding site and cause steric interference, preventing usage of non-template sequences by the catalytic TERT. In contrast, most vertebrates do not contain P1.1 but rather prevent template read-through by tightly maintaining the linker length between the more distal P1b stem and the template (Chen & Greider, 2003). Surprisingly, lampreys, which are vertebrates, seem to use the P1.1-type template boundary definition based on secondary structure models proposed in this study. This suggests that the switch to the P1-type TBE is a recent event in TR structural evolution, specific to later evolving vertebrates. Moreover, it has been previously demonstrated that the P1.1 helix in echinoderms can be deleted completely and the telomerase switches to the P1-type template boundary, and vice versa (Podlevsky et al., 2016). This suggests that the switch between the P1 and P1.1 types is plastic. However, based on secondary structural models of protostomes and basal metazoans, the P1.1 type is universal, indicating that P1.1 is favored over the P1 type (Figure 2.7). This is potentially a means of restricting the linker length between the template and P1. For instance, as species-specific insertions between the template and P1 occur, local structures such as P1.1 are formed to limit the linker length between the template and P1. The presence of the P2.1 stem downstream of the template and upstream of P2 in basal metazoans (Figure 2.7) could be explained similarly, as limiting the overall single-stranded region between P2.1 and P1.1 for accommodation in the TERT active site.

The box H/ACA sca-/sno-RNA type TR biogenesis is conserved in animals (Figure 2.7). It is proposed that TR sequence and secondary structural divergence is primarily due to the distinct biogenesis pathways employed by phylogenetic groups (Podlevsky & Chen, 2016). Vertebrates and echinoderms employ a sno-/sca-RNP biogenesis pathway for 3’ processing, trafficking and localization of TR (Jády et al., 2004; Li et al., 2013; Mitchell et al., 1999). Two stem loops separated by a distinct sequence motif, box H (ANANNA), and followed by an ACA motif are a feature that TRs share with H/ACA group RNAs (Chen et al., 2000). This 3’ biogenesis region of TRs is bound by two sets of dyskerin, NOP10, NHP2 and GAR1 proteins shared with H/ACA snoRNAs, as evidenced by biochemical and mass spectrometry studies and the recent cryo-EM structure of human telomerase (Egan & Collins, 2010; Fu & Collins, 2007). Insight into the structural organization and the interactions of these proteins with TR and with one another was provided by the recently solved landmark cryo-EM structure of human telomerase (Nguyen et al., 2018). Additionally, the apical loop of the box H/ACA stem loop contains a conserved 4-nucleotide sequence known as the CAB box, bound by telomerase Cajal body protein 1 (TCAB1), which is required for Cajal body localization in vertebrate and echinoderm TRs (Figure 2.1A) (Li et al., 2013; Venteicher et al., 2009). All novel TRs identified in this study share the box H/ACA domain characteristic of the snoRNA-type biogenesis pathway (Figure 2.7). This finding provides the first evidence for conservation of the shared 3’ processing mechanism of animal TRs, suggesting that box H/ACA-type biogenesis evolved very early during animal evolution, being found in the most basal of animals such as cnidarians and sponges. However, the CAB box is not conserved among metazoan TRs. This could be due to undersampling of the CAB box sequences used to generate the consensus sequence, which might not encompass all possibilities. The closest related clade to metazoa with characterized TRs is the filamentous fungi. Filamentous fungal TR 3’ ends are processed by intron splicing, where the mature TR is formed by removal of a terminal intron found at the 3’ end of the TR (Figure 2.7) (Qi et al., 2015). Although the detailed mechanisms of filamentous fungal TR biogenesis are yet to be determined, they would be distinct from metazoan TR biogenesis. This suggests that the biogenesis pathway unique to animal TRs evolved immediately after filamentous fungi and modern multicellular animals diverged from their common ancestor.

Our comprehensive identification and analysis of metazoan TRs, the largest survey of TRs from the most diverse of clades, shows that metazoan TRs, despite being completely dissimilar in terms of primary sequence, share conserved structural domains, conserved minimal TR domain requirements for in vitro activity, a near-universal template boundary definition mechanism and a conserved biogenesis pathway (Figure 2.7). Moreover, we believe that our TR identification method offers by far the most unbiased strategy for TR identification. By leveraging phylogenetic comparative analysis and well-established secondary structure models, it is widely applicable to TR identification from diverse groups, providing a crucial tool for TR identification and thus for telomerase studies.
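As a small illustration of the box H (ANANNA) and ACA motifs discussed in this section, the following Python sketch scans an RNA sequence for both. It is not the screening procedure used in this study: the function names and the toy sequence are hypothetical, and the assumption that box ACA lies exactly 3 nucleotides upstream of the 3’ end follows the convention used for inferring 3’ ends in Figure 2.7. A real screen would additionally require the two flanking stem loops, which a sequence pattern alone cannot verify.

```python
import re
from typing import List, Tuple

# Box H consensus ANANNA; the lookahead allows overlapping matches.
BOX_H = re.compile(r"(?=(A[ACGU]A[ACGU]{2}A))")


def normalize(seq: str) -> str:
    """Upper-case the sequence and convert DNA-style T to RNA-style U."""
    return seq.upper().replace("T", "U")


def find_box_h(seq: str) -> List[Tuple[int, str]]:
    """Return (0-based start, matched motif) for every ANANNA occurrence."""
    rna = normalize(seq)
    return [(m.start(), m.group(1)) for m in BOX_H.finditer(rna)]


def has_terminal_aca(seq: str, tail: int = 3) -> bool:
    """Check for ACA located exactly `tail` nucleotides before the 3' end."""
    rna = normalize(seq)
    end = len(rna) - tail
    return end >= 3 and rna[end - 3:end] == "ACA"


# Hypothetical toy sequence, not a real TR fragment.
toy = "GGCCAGAGGACCGGCCACAUGU"
print(find_box_h(toy))        # [(4, 'AGAGGA')]
print(has_terminal_aca(toy))  # True
```

Because ANANNA is a short and degenerate pattern, it occurs frequently by chance, so such a scan is only useful as a filter applied to candidates that already satisfy the structural criteria.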

2.6 References

Altschul, S. F., Gish, W., Miller, W., Myers, E. W., & Lipman, D. J. (1990). Basic local alignment search tool. Journal of Molecular Biology, 215(3), 403-410.

Blackburn, E. H., & Collins, K. (2010). Telomerase: An RNP Enzyme Synthesizes DNA. Cold Spring Harbor Perspectives in Biology, 3(5).

Bourlat, S. J., Juliusdottir, T., Lowe, C. J., Freeman, R., Aronowicz, J., Kirschner, M., Lander, E. S., Thorndyke, M., Nakano, H., Kohn, A. B., Heyland, A., Moroz, L. L., Copley, R. R., & Telford, M. J. (2006). Deuterostome phylogeny reveals monophyletic chordates and the new phylum Xenoturbellida. Nature, 444(7115), 85-88.

Challis, R. J., Kumar, S., Stevens, L., & Blaxter, M. (2017). GenomeHubs: simple containerized setup of a custom Ensembl database and web server for any species. Database, 2017.

Chen, J.-L., Blasco, M. A., & Greider, C. W. (2000). Secondary structure of vertebrate telomerase RNA. Cell, 100(5), 503-514.

Chen, J.-L., & Greider, C. W. (2003). Template boundary definition in mammalian telomerase. Genes & Development, 2747-2752.

Chen, J.-L., Opperman, K. K., & Greider, C. W. (2002). A critical stem-loop structure in the CR4-CR5 domain of mammalian telomerase RNA. Nucleic Acids Research, 30(2), 592-597.

Cifuentes-Rojas, C., Kannan, K., Tseng, L., & Shippen, D. E. (2011). Two RNA subunits and POT1a are components of Arabidopsis telomerase. Proceedings of the National Academy of Sciences of the United States of America, 108(1), 73-78.

Dandjinou, A. T., Lévesque, N., Larose, S., Lucier, J.-F., Abou Elela, S., & Wellinger, R. J. (2004). A phylogenetically based secondary structure for the yeast telomerase RNA. Current Biology, 14, 1148-1158.

Egan, E. D., & Collins, K. L. (2010). Specificity and stoichiometry of subunit interactions in the human telomerase holoenzyme assembled in vivo. Molecular and Cellular Biology, 30, 2775-2786.

Fu, D., & Collins, K. (2007). Purification of Human Telomerase Complexes Identifies Factors Involved in Telomerase Biogenesis and Telomere Length Regulation. Molecular Cell, 28(5), 773-785.

Greider, C. W., & Blackburn, E. H. (1989). A telomeric sequence in the RNA of Tetrahymena telomerase required for telomere repeat synthesis. Nature, 337, 331-337.

Gunisova, S., Elboher, E., Nosek, J., Gorkovoy, V., Brown, Y., Lucier, J. F., Laterreur, N., Wellinger, R. J., Tzfati, Y., & Tomaska, L. (2009). Identification and comparative analysis of telomerase RNAs from Candida species reveal conservation of functional elements. RNA, 15(4), 546-559.

Haas, B. J., Papanicolaou, A., Yassour, M., Grabherr, M., Blood, P. D., Bowden, J., Couger, M. B., Eccles, D., Li, B., Lieber, M., Macmanes, M. D., Ott, M., Orvis, J., Pochet, N., Strozzi, F., Weeks, N., Westerman, R., William, T., Dewey, C. N., Henschel, R., Leduc, R. D., Friedman, N., & Regev, A. (2013). De novo transcript sequence reconstruction from RNA-seq using the Trinity platform for reference generation and analysis. Nature Protocols, 8, 1494-1512.

Hsu, M., McEachern, M. J., Dandjinou, A. T., Tzfati, Y., Orr, E., Blackburn, E. H., & Lue, N. F. (2007). Telomerase core components protect Candida telomeres from aberrant overhang accumulation. Proceedings of the National Academy of Sciences of the United States of America, 104(28), 11682-11687.

Ip, J. C. H., Mu, H., Chen, Q., Sun, J., Ituarte, S., Heras, H., Van Bocxlaer, B., Ganmanee, M., Huang, X., & Qiu, J. W. (2018). AmpuBase: a transcriptome database for eight species of apple snails (Gastropoda: Ampullariidae). BMC Genomics, 19(1), 179.

Jády, B. E., Bertrand, E., & Kiss, T. (2004). Human telomerase RNA and box H/ACA scaRNAs share a common Cajal body-specific localization signal. Journal of Cell Biology, 164, 647-652.

Jansson, L. I., Akiyama, B. M., Ooms, A., Lu, C., Rubin, S. M., & Stone, M. D. (2015). Structural basis of template-boundary definition in Tetrahymena telomerase. Nature Structural & Molecular Biology, 22(11), 883-888.

Kachouri-Lafond, R., Dujon, B., Gilson, E., Westhof, E., Fairhead, C., & Teixeira, M. (2009). Large telomerase RNA, telomere length heterogeneity and escape from senescence in Candida glabrata. FEBS Letters, 583(22), 3605-3610.

Leonardi, J., Box, J. A., Bunch, J. T., & Baumann, P. (2008). TER1, the RNA subunit of fission yeast telomerase. Nature Structural & Molecular Biology, 15(1), 26-33.

Li, Y., Podlevsky, J. D., Marz, M., Qi, X., Hoffmann, S., Stadler, P. F., & Chen, J. J.-L. (2013). Identification of purple sea urchin telomerase RNA using a next-generation sequencing based approach. RNA, 19(6), 852-860.

Lin, J., Ly, H., Hussain, A., Abraham, M., Pearl, S., Tzfati, Y., Parslow, T. G., & Blackburn, E. H. (2004). A universal telomerase RNA core structure includes structured motifs required for binding the telomerase reverse transcriptase protein. Proceedings of the National Academy of Sciences of the United States of America, 101, 14713-14718.

Mason, D. X., Goneska, E., & Greider, C. W. (2003). Stem-loop IV of Tetrahymena telomerase RNA stimulates processivity in trans. Molecular and Cellular Biology, 23(16), 5606-5613.

McEachern, M. J., & Blackburn, E. H. (1995). Runaway telomere elongation caused by telomerase RNA gene mutations. Nature, 376(6539), 403-409.

Mitchell, J. R., Cheng, J., & Collins, K. (1999). A box H/ACA small nucleolar RNA-like domain at the human telomerase RNA 3' end. Molecular and Cellular Biology, 19(1), 567-576.

Mitchell, J. R., & Collins, K. L. (2000). Human telomerase activation requires two independent interactions between telomerase RNA and telomerase reverse transcriptase. Molecular Cell, 6, 361-371.

Mosig, A., Sameith, K., & Stadler, P. (2006). Fragrep: An efficient search tool for fragmented patterns in genomic sequences. Genomics, Proteomics and Bioinformatics, 4, 56-60.

Musgrove, C., Jansson, L. I., & Stone, M. D. (2018). New perspectives on telomerase RNA structure and function. Wiley Interdisciplinary Reviews RNA, 9(2).

Nawrocki, E. P., & Eddy, S. R. (2013). Infernal 1.1: 100-fold faster RNA homology searches. Bioinformatics, 29(22), 2933-2935.

Nguyen, T. H. D., Tam, J., Wu, R. A., Greber, B. J., Toso, D., Nogales, E., & Collins, K. (2018). Cryo-EM structure of substrate-bound human telomerase holoenzyme. Nature, 557(7704), 190-195.

Podlevsky, J. D., & Chen, J. J.-L. (2012). It all comes together at the ends: telomerase structure, function, and biogenesis. Mutation Research, 730(1-2), 3-11.

Podlevsky, J. D., & Chen, J. J.-L. (2016). Evolutionary perspectives of telomerase RNA structure and function. RNA Biology, 13(8), 720-732.

Podlevsky, J. D., Li, Y., & Chen, J. J.-L. (2016a). The functional requirement of two structural domains within telomerase RNA emerged early in eukaryotes. Nucleic Acids Research, 44, 9891-9901.

Podlevsky, J. D., Li, Y., & Chen, J. J.-L. (2016b). Structure and function of echinoderm telomerase RNA. RNA, 22, 204-215.

Qi, X., Li, Y., Honda, S., Hoffmann, S., Marz, M., Mosig, A., Podlevsky, J. D., Stadler, P. F., Selker, E. U., & Chen, J. J.-L. (2013). The common ancestral core of vertebrate and fungal telomerase RNAs. Nucleic Acids Research, 41(1), 450-462.

Qi, X., Rand, D. P., Podlevsky, J. D., Li, Y., Mosig, A., Stadler, P. F., & Chen, J. J.-L. (2015). Prevalent and distinct spliceosomal 3'-end processing mechanisms for fungal telomerase RNA. Nature Communications, 2, 1-8.

Soudet, J., Jolivet, P., & Teixeira, M. T. (2014). Elucidation of the DNA end-replication problem in Saccharomyces cerevisiae. Molecular Cell, 53, 954-964.

Tesmer, V. M., Ford, L. P., Holt, S. E., Frank, B. C., Yi, X., Aisner, D. L., Ouellette, M., Shay, J. W., & Wright, W. E. (1999). Two inactive fragments of the integral RNA cooperate to assemble active telomerase with the human protein catalytic subunit (hTERT) in vitro. Molecular and Cellular Biology, 19, 6207-6216.

Tzfati, Y., Fulton, T. B., Roy, J., & Blackburn, E. H. (2000). Template boundary in a yeast telomerase specified by RNA structure. Science, 288(5467), 863-867.

Venteicher, A. S., Abreu, E. B., Meng, Z., McCann, K. E., Terns, R. M., Veenstra, T. D., Terns, M. P., & Artandi, S. E. (2009). A human telomerase holoenzyme protein required for Cajal body localization and telomere synthesis. Science, 323, 644-648.

Wang, Y., Tian, R. M., Gao, Z. M., Bougouffa, S., & Qian, P. Y. (2014). Optimal eukaryotic 18S and universal 16S/18S ribosomal RNA primers and their application in a study of symbiosis. PLoS One, 9(3), e90053.

Webb, C. J., & Zakian, V. A. (2008). Identification and characterization of the Schizosaccharomyces pombe TER1 telomerase RNA. Nature Structural & Molecular Biology, 15(1), 34-42.

Xie, M., Mosig, A., Qi, X., Li, Y., Stadler, P. F., & Chen, J. J.-L. (2008). Structure and function of the smallest vertebrate telomerase RNA from teleost fish. Journal of Biological Chemistry, 283(4), 2049-2059.

Zakian, V. A. (2009). The ends have arrived. Cell, 139(6), 1038-1040.

Figure 2.1. Phylogeny assisted TR identification approach. (A) Schematic comparison of TR secondary structures between chordates and echinoderms. TR characteristic structural domains are shown in colored boxes and labeled. Length ranges of TRs from each group are indicated. (B) Position weight matrix of template/pseudoknot and H/ACA domains of 42 vertebrate and 13 echinoderm TRs shown with conserved sequence motifs and base paired regions indicated underneath the matrix. Both structural domains were used for searches of novel TRs using Infernal. (C) Workflow for reiterative homology search strategy for identification of novel TRs using Infernal.

Figure 2.2. Phylogenetic tree of the metazoan kingdom with the number of TRs identified shown. Previously, 42 TRs were identified from vertebrates of the phylum Chordata (Chen et al., 2000; Xie et al., 2008) and 13 TRs from the phylum Echinodermata (Li et al., 2013; Podlevsky et al., 2016b). An additional 83 metazoan TRs from 10 diverse phyla across 18 classes, spanning the early branching basal sponges to primitive fish-like animals, were identified in this study. (TR identification – Joshua Podlevsky & Dhenugen Logeswaran)

Figure 2.3. Validation and characterization of acorn worm TR. (A) (top) Template sequence of acorn worm TR (open box - red) with hybridizing positions of the six circularly permuted telomeric DNA primers (a-f) shown. Extension products are shown in lower case blue color with the number of nucleotides added to reach the end of the template for each telomeric primer shown to the right. (bottom) Direct activity assay of telomerase reconstituted in vitro from acorn worm TERT synthesized in rabbit reticulocyte lysate and synthetic acorn worm TR generated by T7 RNA polymerase transcription. A 32P end-labeled oligonucleotide was added as recovery control (r.c.) to each reaction prior to phenol-chloroform extraction and ethanol precipitation of telomerase-extended DNA products. Numbers of nucleotides added to the telomeric primer are denoted to the right of the gel. (B) Secondary structure model of acorn worm TR inferred from phylogenetic comparative analysis. Characteristic TR domains T/PK, CR4/5 and H/ACA are shown in yellow. Kingdom-conserved (red), phylum-conserved (orange) and class-conserved (blue) residues are shown. Class-conserved residues are determined from the alignment of 4 acorn worm species (one partial). (C) (left) Characterization of the hemichordate CR4/5 domain. Schematic of the T/PK and CR4/5 domains denoting start and end positions of the fragments used for the activity assay relative to the full-length TR. Wild-type CR4/5 residues and point mutations of the P6.1 loop are shown with nucleotide positions. (right) Acorn worm TERT synthesized in vitro was assembled with T/PK and one of the CR4/5 fragments (WT, m1, m2) and used for the telomerase direct activity assay. The end-labeled oligonucleotide is the recovery control (r.c.), with the number of nucleotides added to the telomeric primer shown on the left of the gel. (TR identification, secondary structure model – Joshua Podlevsky; primer design, cloning, RNA synthesis and purification – Dhenugen Logeswaran; RACE, cloning, in vitro telomerase reconstitution, direct activity assay – Yang Li)

Figure 2.4. Validation and characterization of Pomacea diffusa (apple snail) TR. (A) (top) Template sequence of apple snail TR (PdTR) (open box - red) with hybridizing positions of the six circularly permuted telomeric DNA primers (a-f) shown. Extension products are shown in lower case blue color with the number of nucleotides added to reach the end of the template for each telomeric primer shown to the right. (bottom) Direct activity assay of telomerase reconstituted in vitro from apple snail TERT (PdTERT) synthesized in rabbit reticulocyte lysate and synthetic PdTR generated by T7 RNA polymerase transcription. A 32P end-labeled oligonucleotide was added as recovery control (r.c.) to each reaction prior to phenol-chloroform extraction and ethanol precipitation of telomerase-extended DNA products. Numbers of nucleotides added to the telomeric primer are denoted to the right of the gel. (B) Secondary structure model of PdTR deduced from phylogenetic sequence analysis. Characteristic TR domains T/PK and H/ACA are shown in yellow. Kingdom-conserved, phylum-conserved and class-conserved residues are shown in red, orange and blue, respectively. Class-conserved residues are determined from the alignment of 10 Gastropoda species. (C) Determination of the minimal central domain region required for in vitro apple snail telomerase activity. (left) Outline of the T/PK and central domain (CD) denoting start and end positions of the fragments used for the activity assay relative to the full-length TR. (right) PdTERT synthesized in vitro was assembled with T/PK and either the full-length central domain or the P5 or P6 fragments and used for the telomerase direct activity assay. The end-labeled oligonucleotide is the recovery control (r.c.), with the +1 position shown. (D) Telomerase activity reconstituted in vitro by assembling 1, 2 and 3 µM concentrations of the T/PK fragment shown in (C) with synthetic PdTERT, compared with full-length TR. (TR identification, secondary structure model, primer design, cloning, RNA synthesis and purification – Dhenugen Logeswaran; RACE, cloning, in vitro telomerase reconstitution, direct activity assay – Yang Li)

Figure 2.5. Validation and characterization of Crassostrea virginica (American oyster) TR. (A) (top) Template sequence of American oyster TR (CvTR) (open box - red) with hybridizing positions of the five circularly permuted telomeric DNA primers (a-e) shown. Extension products are shown in lower case blue color with the number of nucleotides added to reach the end of the template for each telomeric primer shown to the right. (bottom) Direct activity assay of telomerase reconstituted in vitro from American oyster TERT (CvTERT) synthesized in rabbit reticulocyte lysate and synthetic CvTR generated by T7 RNA polymerase transcription. A 32P end-labeled oligonucleotide was added as recovery control (r.c.) to each reaction prior to phenol-chloroform extraction and ethanol precipitation of telomerase-extended DNA products. Numbers of nucleotides added to the telomeric primer are denoted to the right of the gel. (B) Secondary structure model of CvTR deduced from phylogenetic sequence analysis. Characteristic TR domains T/PK and H/ACA are shown in yellow. Kingdom-conserved, phylum-conserved and class-conserved residues are shown in red, orange and blue, respectively. Class-conserved residues are determined from the alignment of 9 Bivalvia species. (C) Determination of the minimal central domain region required for in vitro American oyster telomerase activity. (top and left) Outline of the T/PK and central domain (CD) denoting start and end positions of the fragments used for the activity assay relative to the full-length TR. (right) CvTERT synthesized in vitro was assembled with T/PK and either the full-length central domain or the P5 or P6 fragments and used for the telomerase direct activity assay. The end-labeled oligonucleotide is the recovery control (r.c.), with the +1 position shown. (TR identification – Joshua Podlevsky; secondary structure model, primer design, cloning, RNA synthesis and purification – Dhenugen Logeswaran; RACE, cloning, in vitro telomerase reconstitution, direct activity assay – Yang Li; cloning – Tamara Olson)

Figure 2.6. The CR4/5 domain comprising the P6.1 stem loop is conserved across basal metazoa. Comparison of representative TR central domains from the phyla Cnidaria, Placozoa and Porifera. Cnidaria are represented by one species from each major class: Anthozoa, digitate coral (A); Scyphozoa, moon jellyfish (B); Staurozoa, horned stalked jellyfish (C); Hydrozoa, freshwater jellyfish (D). Placozoa is represented by Trichoplax adhaerens (E) and Porifera by a sponge of the class Demospongiae (F). Kingdom-conserved, phylum-conserved and class-conserved residues are shown in red, orange and blue, respectively. Phylum-conserved residues from Cnidaria are labeled based on 25 species. Class-conserved residues from the phylum Cnidaria are labeled based on 18 Anthozoa, 4 Scyphozoa and 2 Staurozoa species, respectively. (TR identification and figure – Joshua Podlevsky)

Figure 2.7. Comparison of the TR essential template core, distal stem-loop moiety and 3’ biogenesis domains from major metazoan clades. The phylogenetic relationship of the major metazoan lineages Deuterostomia (orange), Protostomia (blue) and basal metazoa (green) and of filamentous fungi (black) is shown. Size ranges of TRs from the respective groups are indicated below each group. An asterisk marks a range that includes TRs without experimentally verified ends, for which the 5’ end was inferred from proximity to the TATA box and the 3’ end was placed 3 nucleotides downstream of box ACA. The template core domain in all groups shown comprises a pseudoknot domain, with the template boundary element (TBE) shown in thick lines. The structure of the distal element of each respective group is shown (middle). The 3’ biogenesis domain is either the box H/ACA type in metazoa or 3’ intron splicing in filamentous fungi.

CHAPTER 3

STRUCTURE AND FUNCTION OF USTILAGO FUNGAL TELOMERASE RNA

3.1 Abstract

The telomerase ribonucleoprotein is a unique RNA-dependent DNA polymerase with an integral telomerase RNA (TR) component that prevents telomere erosion by adding DNA repeats to the ends of linear chromosomes. The extensively studied yeast TRs and the recently identified filamentous fungal TRs are exceptionally large and show remarkable diversity in terms of TR secondary structural elements. In order to expand our knowledge of fungal TR structure, function and evolution, we identified TRs from 20 species of the phylum Basidiomycota, which represents the “higher fungi” and includes economically important mushrooms and smut fungi. Initially, the TR from the phytopathogenic fungus Ustilago maydis was identified using an RNA immunoprecipitation (RNA-IP) and next-generation sequencing based approach. U. maydis is a model organism in cell biology and is economically important as the cause of corn smut disease. Orthologs of this TR were identified from fungi of the class Ustilaginomycetes in publicly available genomes using a gene synteny based method. TR dissection analysis and in vitro telomerase activity reconstitution showed two regions of the TR to be indispensable for telomerase activity. Secondary structure models of these regions deduced from phylogenetic comparative analysis showed the presence of a pseudoknot domain (PK) and a distal stem-loop moiety harboring a stem homologous to the P6.1 stem-loop, which is highly conserved in vertebrate and filamentous fungal CR4/5. The two core domains were sufficient to reconstitute telomerase activity and can be added in trans as two RNA fragments. Remarkably, both domains are crucial for in vitro telomerase activity, and no activity was detected without the distal domain, suggesting that the U. maydis distal moiety is functionally equivalent to vertebrate and filamentous fungal CR4/5. Overall, this study shows that TRs from the higher fungi of the class Ustilaginomycetes have conserved structural core domains and show functional conservation with vertebrate and filamentous fungal TRs, providing a crucial link in TR structural and functional evolution in the opisthokonta super-kingdom.

3.2 Introduction

The switch from circular to linear chromosomes in early emerging eukaryotes warranted protective caps at the ends of chromosomes and specialized DNA polymerases for faithful replication of chromosomal DNA. These protective caps, known as telomeres, are DNA-protein complexes that protect the integrity of the cellular genome. Although most eukaryotic cells carry a large reserve of telomeric DNA, a portion of it is lost with each round of DNA replication, and this loss cannot be compensated for by conventional DNA polymerases. This is because conventional DNA polymerases require a free 3’ hydroxyl to which incoming deoxyribonucleotides are added during DNA polymerization. The lack of a free 3’ hydroxyl at the extreme termini of linear chromosomes makes it mechanistically impossible for conventional DNA polymerases to synthesize DNA at the ends of chromosomes, a limitation described as the end-replication problem (Olovnikov, 1973; Watson, 1972). This necessitated the evolution of a DNA polymerase that can supply its own integral template, giving rise to the telomerase enzyme (Podlevsky & Chen, 2016).

Telomerase counterbalances telomere erosion by synthesizing short, tandem DNA repeats onto linear chromosome termini. Since telomere attrition is more pronounced in rapidly dividing cells such as stem and germline cells, human telomerase is highly expressed in these cell types to offset the rapid loss (Meyerson et al., 1997). This constant replenishing of lost telomeres confers replicative immortality to these cells. However, human somatic cells lack detectable telomerase activity, limiting their replicative potential. Thus, telomerase deficiency phenotypes are observed mainly as stem cell defects and include dyskeratosis congenita, aplastic anemia and idiopathic pulmonary fibrosis (Armanios, 2009).

The telomerase ribonucleoprotein core enzyme is composed of the catalytic telomerase reverse transcriptase (TERT) and the integral telomerase RNA (TR), which contains a short template sequence that dictates DNA synthesis. The catalytic TERT subunit is readily identified across eukaryotes using simple BLAST searches. Identified TERTs share homology with hallmark motifs in DNA polymerases and reverse transcriptases, while TERT-specific domains have evolved that provide unique properties to telomerase (A. G. Lai et al., 2017; Lingner et al., 1997). In contrast, TRs are highly divergent in primary sequence and length even among closely related groups of organisms (Podlevsky & Chen, 2016). The most well-studied groups for TR biochemistry and biology include ciliates, yeast and vertebrates. Ciliate TRs are the smallest at ~150 nucleotides, vertebrate TRs are intermediate in length (~450 nt), and yeast TRs are at least 3-4 times larger than vertebrate TRs (Musgrove et al., 2018).

Domain requirements for in vitro telomerase activity show greater diversity for TR than for TERT. While all TERT domains must be intact to reconstitute in vitro telomerase activity, only two separate domains of the TR subunit are required in a vast majority of eukaryotes. The first domain is the template/pseudoknot domain (T/PK), whereas the second domain is a distal stem-loop moiety. All TRs identified to date, except for those of flagellates, have a conserved pseudoknot (PK) in the T/PK domain that is universally required for telomerase activity (Podlevsky & Chen, 2016). The PK domain harbors a conserved triple helix structure which is absolutely essential for telomerase activity (Qiao & Cech, 2008; Shefer et al., 2007). The structure and requirement of the second, distal stem-loop moiety, however, vary widely among distinct groups of species. This moiety, termed CR4/5 in vertebrates and filamentous fungi, is critical for in vitro activity (Chen et al., 2000; Chen et al., 2002; Podlevsky et al., 2016b; Qi et al., 2013). However, equivalent CR4/5 (eCR4/5) domains from invertebrate echinoderm and flagellate TRs are dispensable (Li et al., 2013; Podlevsky et al., 2016a; Podlevsky et al., 2016b). The homologous structure in ciliates, termed stem-loop IV, is a terminal stem-loop that stimulates telomerase processivity in vitro (Mason et al., 2003).

Although the telomerase RNA is far longer than its template, only the short template is used in telomeric DNA synthesis. This property requires that the template be precisely defined to prevent synthesis of non-telomeric DNA sequences. The most ubiquitous mechanism of template boundary definition in telomerases is a template-adjacent helix that controls the availability of single-stranded RNA for use in DNA synthesis, although a homologous stem is absent in vertebrate TRs (Chen et al., 2000; Dandjinou et al., 2004; C. K. Lai et al., 2002).

Yeasts are the most well-studied fungi for telomerase biology and biochemistry (Figure 3.1). Apart from binding to TERT, budding yeast Saccharomyces cerevisiae and fission yeast Schizosaccharomyces pombe TRs act as large scaffolds with multiple protein-binding arms involved in TR biogenesis and localization (Egan & Collins, 2012). Identification and characterization of the filamentous fungal TR from Neurospora crassa showed the conservation of core domains even with the more distantly related vertebrate TRs (Kuprys et al., 2013; Qi et al., 2013; Qi et al., 2015). Despite important developments in fungal telomerase biology, all characterized fungal TRs belong to phylum Ascomycota (Figure 3.1). The sister phylum Basidiomycota, which includes species of economic and medicinal importance such as mushrooms, model organisms and phytopathogenic fungi important in crop science, remains untapped for TR studies. Additionally, the conservation of the vertebrate-type telomeric repeat sequence ‘TTAGGG’ in most basidiomycete fungi and telomere maintenance properties shared with mammals warrant detailed investigation into Basidiomycota telomerase biology (Guzmán & Sánchez, 1994; Pérez et al., 2009; Yu et al., 2013).

Here we report the identification of novel TRs from 20 Basidiomycota fungi of class Ustilaginomycetes. Initial identification of the corn smut fungus Ustilago maydis TR (UmaTR) provided a launching point to identify additional Ustilaginomycetes TRs by leveraging gene synteny conservation. Phylogenetic comparative analysis and functional characterization demonstrated conserved structural and functional determinants for in vitro telomerase activity, providing important insight into TR structural evolution within the Opisthokonta super-kingdom.

3.3 Materials and methods

Ustilago maydis (U. maydis) cell growth and harvesting

The Ustilago maydis cells were grown at 27 °C in yeast extract (1%), peptone (2%), dextrose (2%) medium (YPD) for liquid cultures. For YPD-agar plates, 2.4% Bacto agar was added to YPD medium and aseptically poured into sterile petri plates. After solidifying, the plates were stored at 4 °C until use. The U. maydis cells from a frozen glycerol stock were aseptically four-way streaked on a YPD-agar plate and incubated at 27 °C for 2 days with the edges wrapped in Parafilm to prevent loss of moisture. A single colony from the plate was picked and transferred to 5 ml of YPD liquid medium in a sterile 50 ml Falcon tube and grown with shaking at 250 rpm overnight at 27 °C until an OD600 of 0.5-0.7, corresponding to log-phase growth, was reached. Cells were observed with a light microscope (Figure 3.2A) and harvested by centrifugation at 1,500 g for 15 min, and the pellets were stored at -80 °C until further use. The clone A U. maydis cells expressing 3xFLAG-UmaTERT were grown as described above but in the presence of 50 µg/ml hygromycin.

U. maydis cell lysis

A cell pellet of U. maydis corresponding to an OD600 of ~0.7 from 3 ml of culture was resuspended in 500 µl of buffer (50 mM HEPES pH 7.5, 1 mM EDTA, 150 mM NaCl, 1X Roche protease inhibitor cocktail, 1 mM PMSF, 0.35% BME and 10% glycerol). For the initial detergent screen, identical resuspensions were done with buffers supplemented with various detergents or with no detergent. The detergent conditions were 1% Triton X-100 + 0.1% NaDOC (sodium deoxycholate), 1% Triton X-100, 0.1% NaDOC, 0.5% CHAPS, 1% Tween 20 and 1% NP40. To the suspensions, 500 µl of 0.5 mm glass beads were added, and the samples were vortexed at maximum speed for 20 min in a cold room followed by 10 min on ice. This was repeated twice total. Whole cell lysates were centrifuged at maximum speed (21,000 g) for 10 min at 4 °C to separate insoluble components. The supernatant was transferred to fresh tubes to be used for immunoprecipitation and/or TRAP assay. Total protein was quantified by Bradford assay and proteins were visualized by SDS-PAGE analysis.

Western blotting

Clarified lysate, size-fractionated on an SDS-PAGE gel, was transferred to a PVDF membrane by the semi-dry transfer method using the Trans-Blot Turbo (Bio-Rad) transfer apparatus following the manufacturer’s instructions. The blot was blocked with 5% non-fat milk in 1X TBST (Tris-buffered saline with 0.1% Tween 20) for 30 min at room temperature. Anti-FLAG primary antibody (Sigma) was used at a final dilution of 1:3000 overnight at 4 °C. Following washes (3 times with 1X TBST, 5 min each), the blot was blocked with 5% non-fat milk in 1X TBST, and GAM-HRP (goat anti-mouse, horseradish peroxidase-conjugated) secondary antibody was used at a 1:10,000 dilution for 1 hour. Following washes as described above, the blot was visualized using Immobilon ECL Ultra Western HRP substrate according to the manufacturer’s instructions.

Immunoprecipitation (IP) of 3xFLAG tagged UmaTERT and TRAP assay

For each IP, 10 µl of monoclonal anti-FLAG mouse M2 affinity gel (5 µl beads) was added to a low-retention 1.5 ml centrifuge tube. The tube was centrifuged at 1,000 x g for 15 sec and the buffer was aspirated. Beads were washed twice, each time with 30 µl of 1X TBS. Following aspiration of the buffer, the lysate was added immediately and rotated for 1 hr at 4 °C. Following IP, the beads were washed three times with 1X TBS and once with 1X PE buffer (telomerase reaction buffer). Following washes, the TRAP assay master mix (1X PE buffer, 0.5 µM telomeric DNA primer, dNTPs (50 µM final each)) was added to the beads and incubated at 30 °C for 2 hours. Following phenol/chloroform extraction, the DNA was ethanol precipitated, washed with 70% ethanol and resuspended in 10 µl of water. The purified DNA (5 µl) was added to the PCR master mix (1X Ex Taq buffer, dNTPs (0.15 mM each final), primers (0.4 µM each of 32P-labeled TS primer and ACX primer), 0.625 U of Ex Taq DNA polymerase Hot-start version). Reactions were thermocycled under the conditions 94 °C / 2 min, 94 °C / 30 sec, 60 °C / 30 sec and 60 °C / 5 min for 25 or 35 cycles. Extended products were size-fractionated on a 10% native PAGE gel and visualized via autoradiography by exposure to a phosphor screen.

Bioinformatics analysis

The Ustilago maydis reference genome was screened for template permutations using a custom-written script. The template was defined as 8-12 nucleotides in length and a circular permutation of 5’-CCCTAA-3’. Putative template-containing loci of 2 kb in length, chosen based on the length of the Neurospora crassa TR, were extracted, which resulted in 782 loci. Paired-end Illumina sequencing reads (109,069,132) were mapped to the extracted loci in a strand-specific manner using bowtie2 (Langmead & Salzberg, 2012) (Figure 3.5A). Default parameters were used for bowtie2 mapping, with the additional --no-unal flag to eliminate unmapped reads from the output. Mapped loci were ranked based on the number of mapping reads, and read-covered regions with at most 50 bp of interrupted coverage were defined as a locus for downstream analysis. The loci were used as queries for BLAST against the non-redundant nucleotide database for annotation of known genes. For homolog search, the loci were used as queries in a standalone BLAST (version 2.2.31+) search against the Ustilago bromivora genome (Figure 3.5B). The Tablet genome viewer (Milne et al., 2012) was used to visualize reference loci and the mapped reads. Bedtools (Quinlan & Hall, 2010) was used to compute the per-base coverage.

For TR identification based on gene synteny, the UMAG_03168 and OAT homologs were identified by performing standalone BLAST searches against the respective Ustilaginomycetes genomes. Both protein-coding genes from the identified species were manually annotated and the intervening sequence was searched for a conserved template. Identified putative TR sequences that show gene synteny conservation were used in the sequence alignment and phylogenetic comparative analysis to determine a secondary structure model for UmaTR. Multiple sequence alignment of Ustilaginomycetes TRs was performed initially using the ClustalW algorithm of the BioEdit program. Manual refinements were made to preliminary alignments with highly conserved regions and invariant primary sequence motifs as anchor points.
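The template screen described in this subsection can be illustrated with a minimal Python sketch. It is not the custom script used in this work: the function names and the toy sequence are hypothetical, and seeding the search with the 8 nt candidates only (every longer candidate begins with one of them) is an assumption made to keep the example short. The sketch enumerates every 8-12 nt circular permutation of 5’-CCCTAA-3’ and reports where a candidate occurs on either strand of a sequence.

```python
# Illustrative sketch only; names and the seeding strategy are assumptions.

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def template_candidates(repeat="CCCTAA", min_len=8, max_len=12):
    """Enumerate circular permutations of `repeat` that are min_len..max_len nt long."""
    n = len(repeat)
    # Tandem copies long enough to read max_len nt from any of the n offsets.
    tandem = repeat * (max_len // n + 2)
    return {
        tandem[offset:offset + length]
        for length in range(min_len, max_len + 1)
        for offset in range(n)
    }


def find_template_sites(genome, seeds):
    """Return sorted (start, strand, seed) hits; start is 0-based on the given strand."""
    genome = genome.upper()
    hits = []
    for seed in seeds:
        # A TR gene on the minus strand shows up as the reverse complement.
        for strand, pattern in (("+", seed), ("-", reverse_complement(seed))):
            start = genome.find(pattern)
            while start != -1:
                hits.append((start, strand, seed))
                start = genome.find(pattern, start + 1)  # allow overlapping hits
    return sorted(hits)


candidates = template_candidates()
seeds = {c for c in candidates if len(c) == 8}
toy_genome = "ACGTTAACCCTAACGGTT"  # hypothetical sequence, not U. maydis data
sites = find_template_sites(toy_genome, seeds)
```

Neighbouring hits in `sites` overlap because adjacent permutations share most of their sequence. A real screen would collapse such hits into a single site before extracting the surrounding locus.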

In vitro reconstitution and direct telomerase activity assay

The 3xFLAG-UmaTERT protein was expressed in rabbit reticulocyte lysate (RRL) from the pCITE-4a-3xFLAG-UmaTERT plasmid using the TNT Quick Coupled transcription/translation kit (Promega) following the manufacturer’s instructions. The full-length UmaTR or UmaTR fragments were in vitro transcribed by T7 RNA polymerase, gel purified and assembled with 3xFLAG-UmaTERT protein for 30 min at 30 °C. Twelve microliters of in vitro reconstituted telomerase enzyme was immunopurified with 3 µl of anti-FLAG M2 magnetic beads (Sigma M8823) at room temperature for 1 hr. The telomerase enzyme on beads was assayed in a 10 µl reaction containing 1X telomerase reaction buffer (50 mM Tris-HCl, pH 8.0, 50 mM NaCl, 0.5 mM MgCl2, 5 mM BME and 1 mM spermidine), 1 µM DNA primer, and specified dNTPs with 0.18 µM of 32P-dGTP (3,000 Ci/mmol, 10 mCi/ml; PerkinElmer). Reactions were incubated at 30 °C for 60 min and terminated by phenol/chloroform extraction, followed by ethanol precipitation. The DNA products were resolved on a 10% (wt/vol) polyacrylamide/8 M urea denaturing gel, dried, exposed to a phosphor storage screen and imaged on a Typhoon gel scanner (GE Healthcare).

Total RNA isolation and Northern blot

Total RNA was extracted from U. maydis cells by liquid N2 grinding followed by Tri-reagent (MRC) extraction according to the manufacturer’s protocol, with an additional phenol/chloroform extraction step. Ten micrograms of total RNA was fractionated on a 1.5% formaldehyde agarose gel along with in vitro T7-transcribed RNA markers. RNA was transferred to a Hybond-XL membrane using downward capillary transfer. Transferred RNA was UV crosslinked to the membrane using the optimal crosslink mode, followed by blocking with ULTRAhyb buffer at 65 °C for 30 min. Following blocking, an α-32P-UTP labeled riboprobe was used for probing. The membrane was washed and exposed to a phosphor screen for at least overnight and up to several days prior to imaging using the Typhoon gel scanner (GE Healthcare).

Rapid amplification of cDNA ends (RACE)

RACE to determine the 5’ and 3’ ends of UmaTR was performed following a protocol similar to that of the FirstChoice RLM-RACE kit (Invitrogen). Kit enzymes were replaced with calf intestinal alkaline phosphatase (CIP) and RNA 5’ pyrophosphohydrolase (RppH) purchased from New England Biolabs (NEB). Treatment with poly(A) polymerase (NEB) was performed according to the manufacturer’s instructions.

3.4 Results

Plasmid design for ectopic expression of 3xFLAG-UmaTERT

As the first step, the plasmid pCM955 was chosen as the vector into which the TERT expression cassette was cloned (Kojic et al., 2006). This plasmid has a hygromycin phosphotransferase gene (hph) codon-optimized for U. maydis expression to offer antibiotic selection of transformants. The hph gene is under the U. maydis heat shock protein 70 promoter (Phsp70) and is followed by the hsp70 terminator (Thsp70) for control of hph gene expression by the cellular transcription and translation machinery (Appendix B). For maintenance in bacteria, the pCM955 vector has an ampicillin resistance marker. The U. maydis TERT (UmaTERT) gene was cloned into the pCM955 plasmid under the control of the glyceraldehyde-3-phosphate dehydrogenase (GAPDH) promoter (Pgap) (Kinal et al., 1993). The Pgap promoter offers stable and high levels of constitutive expression throughout the cell cycle, as GAPDH is a housekeeping gene. An in-frame 3xFLAG fusion tag was appended to the N terminus of TERT to facilitate immunoprecipitation with the well-characterized anti-FLAG antibodies. The pCM955-3xFLAG-UmaTERT plasmid was sequence verified for subsequent transformation. U. maydis cells transformed with pCM955-3xFLAG-UmaTERT were a kind gift from Dr. Neal Lue (Cornell).

Verification of U. maydis cells for 3xFLAG-UmaTERT expression

Five clones of U. maydis cells transformed with the pCM955-3xFLAG-UmaTERT plasmid were subjected to expression validation via Western blotting (Figure 3.2B). Cells were lysed by bead beating in the presence of ionic detergent, and cell lysate proteins were resolved on an SDS-PAGE gel along with similarly processed wild-type cell lysate as a negative control and RRL-expressed FLAG-hTERT as a positive control. Size-fractionated proteins were transferred to a PVDF membrane and bands were detected by probing with anti-FLAG primary antibody followed by HRP-conjugated secondary antibodies for chemiluminescent detection. The results showed that clone A, unlike the other four clones, has detectable expression of 3xFLAG-UmaTERT (Figure 3.2B). The RRL-expressed FLAG-hTERT served as a positive control for validating the FLAG antibody (Figure 3.2B).

Determination of conditions for optimal 3xFLAG-UmaTERT yield and activity

We sought to verify whether the constitutively expressed 3xFLAG-UmaTERT functionally assembles with the TR subunit. To achieve this, cell lysis was performed under various conditions followed by immunoprecipitation and the telomere repeat amplification protocol (TRAP) assay (Figure 3.3). The TRAP assay provides a highly sensitive PCR-based method for detection of telomerase activity. Both the direct lysate and the immunoprecipitates were tested for telomerase activity with TRAP. Of the various detergents and detergent combinations, 1% Tween 20 provided the best TRAP activity with successful immunoprecipitation of 3xFLAG-UmaTERT, as demonstrated by anti-FLAG Western (Figure 3.3, lane 17). The banding pattern characteristic of TRAP activity is a gradual decrease in band intensity as the molecular weight increases, whereas uneven intensities might suggest spurious DNA amplification. The characteristic pattern is best demonstrated in the presence of Tween 20 compared to all other detergents or detergent combinations. Thus, Tween 20 was chosen as the optimal detergent for downstream experiments.

Functional validation of 3xFLAG-UmaTERT

To determine whether PCR inhibitors or factors detrimental to telomerase activity might exist in the lysate, the input lysate was diluted at the indicated ratios with lysis buffer, followed by IP and TRAP with or without RNase A treatment of the immunoprecipitates. The results clearly demonstrate the RNase-sensitive nature of the TRAP signal, indicating that the observed bands arise from amplification of DNA extended by authentic telomerase activity (Figure 3.4). Similar TRAP band intensities among the dilutions suggest that if inhibitors are present in the lysate, they do not significantly affect telomerase activity or PCR amplification. Additionally, to rule out immunoprecipitation of non-specific molecules, a control replicate was performed with wild-type cells. The absence of any detectable TRAP activity with wild-type cell lysate provides evidence for specific pull-down of 3xFLAG-tagged TERT and associated molecules (Figure 3.4).

Identification of U. maydis TR candidates via a custom bioinformatics pipeline

Next-generation sequencing (NGS) data generated from the library prepared from RNAs that co-IP with 3xFLAG-UmaTERT were processed using a custom bioinformatics pipeline to obtain candidate transcripts for further screening (Figure 3.5A). The sequencing experiment generated more than 109 million 75 bp paired-end reads. As the first step of the pipeline, the reference genome of U. maydis was screened for regions that contain a putative template, which generated 782 loci. The putative template of U. maydis was predicted to be at least 8 and up to 12 nt in length and a circular permutation of 5’-CCCUAA-3’, as per the telomeric repeat sequence of 5’-TTAGGG-3’ (Guzmán & Sánchez, 1994). The paired-end RNA-seq reads were quality controlled and subsequently mapped to these 782 loci (Figure 3.5A). The loci were ranked based on the number of mapping reads, and the genomic regions of the top five loci were extracted based on coverage with gaps less than 50 bp. The expanded loci were further analyzed by BLAST search against the entire National Center for Biotechnology Information (NCBI) database to eliminate any known or hypothetical genes. These candidates were also used as search queries against the closely related U. bromivora genome using standalone BLAST, focusing especially on conservation of the putative template (Figure 3.5B). Only candidate #2 has no homology to any previously known or putative genes and has a homolog with the template conserved in U. bromivora (Figure 3.5B).
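The ranking and coverage-based extension steps of the pipeline can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the code used in this study: the function names are hypothetical, per-base depth is assumed to be available as a simple list (for example parsed from per-base coverage output), and gaps of up to 50 bp are bridged, following the wording in Materials and methods.

```python
# Illustrative sketch only; function names and input layout are assumptions.

def covered_intervals(depth):
    """Convert a per-base depth list into (start, end) intervals with depth > 0.

    Coordinates are 0-based and half-open.
    """
    intervals, start = [], None
    for pos, d in enumerate(depth):
        if d > 0 and start is None:
            start = pos                      # a covered stretch begins
        elif d == 0 and start is not None:
            intervals.append((start, pos))   # the covered stretch ends
            start = None
    if start is not None:                    # coverage runs to the last base
        intervals.append((start, len(depth)))
    return intervals


def merge_with_gap(intervals, max_gap=50):
    """Join covered intervals separated by at most `max_gap` uncovered bases."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start - merged[-1][1] <= max_gap:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [tuple(m) for m in merged]


def rank_loci(read_counts, top_n=5):
    """Return the `top_n` (locus, count) pairs with the most mapped reads."""
    ranked = sorted(read_counts.items(), key=lambda item: item[1], reverse=True)
    return ranked[:top_n]
```

Bridging short gaps keeps a transcript from being split where coverage drops out locally, while still separating it from neighbouring transcribed regions that lie further away.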

Validation of Candidate #2 as UmaTR

To validate candidate #2 as UmaTR, telomerase activity was reconstituted in RRL with synthetic 3xFLAG-UmaTERT and synthetic candidate #2 RNA. Primers with two different circular permutations of the telomeric sequence were used as substrates in the direct telomerase primer-extension assay. The results clearly show offset banding between the two primers, whereas no activity is detected without the RNA (Figure 3.6). These results support the identification of candidate #2 as UmaTR.

Characterization of the UmaTR transcript

To determine the length of the UmaTR transcript, 5’ and 3’ RACE were performed. Although a single 5’ end was detected, 3’ RACE showed multiple bands following agarose gel electrophoresis analysis (Figure 3.7 A, C). Cloning and sequencing of these bands revealed 3 different 3’ ends, suggesting the presence of multiple TR isoforms (Figure 3.7 A, B). These isoforms, named α, β and γ, were determined to be 1293, 1478 and 1571 nucleotides in length, respectively. Interestingly, 3’ RACE products were observed even when the input RNA was not treated with poly(A) polymerase (PAP) (Figure 3.7C, lanes 5-7). This suggests that a significant amount of RNA of all three isoforms is poly(A) tailed. However, the significance and implications of the apparent presence of a poly(A) tail in the TR isoforms are unknown at this time. Northern blot analysis of the isoforms using a probe designed against the template region shows that isoform α is highly abundant compared to β and γ (Figure 3.7 E, F). Thus, isoform α was used for downstream analysis and experiments and will be referred to as UmaTR (Appendix C). Additionally, RT-PCR analysis using an RT primer downstream of the largest isoform γ suggests the presence of a polycistronic RNA, from which the mature TR isoforms could be generated by endonucleolytic processing (Figure 3.7D).

Identification of UmaTR homologs from Ustilaginomycetes species

An initial standalone BLAST search using UmaTR as the query against the U. bromivora genome provided a hit in which only the template and a few flanking residues were conserved (Figure 3.5B). This suggests that the TR primary sequence is highly variable even between species belonging to the same genus. As expected, BLAST searches were not successful in identifying homologs from more distant relatives of U. maydis. Careful analysis of the TR-encoding loci from U. maydis and U. bromivora showed that homologs of the putative protein-coding gene designated UMAG_03168 were present upstream of both TR sequences. Interestingly, when sequences downstream of this gene were analyzed from additional species belonging to class Ustilaginomycetes, all of them harbored the template sequence and conserved flanking residues (Figure 3.8). Additionally, the CAR2 ornithine aminotransferase (OAT) coding gene was found downstream of the identified TRs in all species of order Ustilaginales but not in orders Urocystidales and Violaceomycetales (Figure 3.8). This suggests that the relative positions of UMAG_03168 homologs and TR are conserved across Ustilaginomycetes, showing conserved gene synteny. Leveraging this conserved gene synteny, TRs were found in a total of 20 Ustilaginomycetes species.
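The synteny-based search can be sketched as follows. This is a minimal Python illustration, not the procedure used in this work, where the flanking genes were annotated manually. The function names, the coordinates, the toy contig and the 3 kb fallback window used when no downstream flanking gene is found are all hypothetical. Both flanking genes are assumed to lie on the plus strand of the contig.

```python
# Illustrative sketch only; names, coordinates and the fallback window are assumptions.

_COMPLEMENT = str.maketrans("ACGT", "TGCA")

# The six 8 nt circular permutations of the 5'-CCCTAA-3' template repeat.
SEEDS = {("CCCTAA" * 3)[i:i + 8] for i in range(6)}


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def intervening_sequence(contig, upstream_end, downstream_start=None,
                         fallback_window=3000):
    """Return the sequence between two flanking genes (0-based, half-open).

    If no downstream flanking gene is known, take `fallback_window` bases
    after the upstream gene instead (a hypothetical choice).
    """
    if downstream_start is None:
        downstream_start = upstream_end + fallback_window
    return contig[upstream_end:min(downstream_start, len(contig))]


def contains_template(region, seeds=SEEDS):
    """True if any template permutation occurs on either strand of `region`."""
    region = region.upper()
    return any(s in region or reverse_complement(s) in region for s in seeds)


# Toy contig: upstream gene stub + intergenic region + downstream gene stub.
contig = "ATGAAACGT" + "GGTTACCCTAACCGAT" + "ATGCCC"
region = intervening_sequence(contig, upstream_end=9, downstream_start=25)
found = contains_template(region)
```

Restricting the template search to the interval defined by the flanking genes is what makes such a short motif informative, since an 8 nt sequence alone would occur by chance many times across a genome.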

UmaTR contains two independent regions required for in vitro activity

Truncation analysis of UmaTR was performed to determine the regions required for in vitro telomerase activity. As the requirement of two independent TR domains for telomerase activity is universal, we performed activity assays with two UmaTR fragments added in trans. The two fragments of UmaTR, the 5’ fragment and the 3’ fragment, were defined based on the read coverage map of the UmaTR locus. Two peaks of high coverage observed in the coverage map (Appendix D) might suggest two independent TERT binding sites in UmaTR, which allowed us to set the breakpoint at position 990 between the two covered regions (Figure 3.9). The 3’ fragments 3F2-3F11 were either 5’ or 3’ truncations of fragment 3F1 (991-1293) (Figure 3.9). Similarly, 5’ or 3’ truncations of the 5’ fragment 5F1 retain the essential template sequence and are termed 5F2-5F5 (Figure 3.9). The T7-transcribed 5’ fragment 5F1 (1-990) was assembled with fragments 3F1-3F11 in separate reactions with synthetic 3xFLAG-UmaTERT in RRL to reconstitute telomerase, and in vitro activity was assayed by direct primer extension using an 18-mer primer of the 5’-GTTAGG-3’ repeat (Figure 3.9). Conversely, the full-length 5’ fragment or its truncations (5F1-5F5) were T7 transcribed and assembled in separate reactions with the 3’ fragment 3F1 (991-1293) and assayed for telomerase activity as described in Materials and methods. The results clearly show that fragments 5F3 and 3F6 are the minimal UmaTR domains crucial for in vitro telomerase activity (Figure 3.9).

Multiple sequence alignment analysis

As the identified Ustilaginomycetes TRs are on average >1 kb in length, multiple sequence alignments are challenging. However, the truncation analysis of UmaTR identified regions of the TR that are essential for activity and thus more likely to be conserved. Because aligning large RNAs requires sequence motifs known to be well conserved as starting points, the 5F3 region (Figure 3.9) was the first to be aligned. The downstream high-coverage region showed conservation (Appendix D) and was thus aligned next. After aligning both regions and using them as anchor points, the intervening sequences were aligned, gradually revealing conserved motifs (Figure 3.10).

Secondary structure models of the UmaTR domains essential for in vitro telomerase activity

The substantial reduction in the length of the TR domains determined by truncation analysis aided the secondary structure determination of the two domains 5F3 and 3F6. Using phylogenetic comparative analysis, the secondary structures of both domains were determined based on the sequence alignment of TR domains (Figure 3.10). The template-containing fragment 5F3 showed a pseudoknot domain characteristic of TRs and is thus herein named the template/pseudoknot domain (T/PK) (Figure 3.10, Figure 3.11). The template-adjacent helix arm has conserved residues at the base of the stem, is potentially the template boundary element and is named as such (TBE) (Figure 3.10, 3.11). The 3F6 distal region crucial for activity is structurally and functionally homologous to filamentous fungal and vertebrate CR4/5, including the highly conserved 4 bp stem-loop P6.1 (Figure 3.11).

3.5 Discussion

Biochemical TR identification methods are hindered by the low abundance of telomerase in a vast majority of model organisms. Method development is further obstructed by the extreme diversity in TR primary sequence and length. In this study, we report the identification of TRs from fungi of phylum Basidiomycota and show that TRs from class Ustilaginomycetes share conserved secondary structural and functional determinants with vertebrates and filamentous fungi, expanding our understanding of Opisthokonta TR evolution.

We chose the phytopathogenic fungus Ustilago maydis, which causes corn smut disease, to pursue the identification of TR due to its many favorable attributes. Firstly, as a genetically tractable model organism, Ustilago maydis offers the advantages of easy and inexpensive culture in the laboratory, a high-quality genome sequence and a wide variety of well-established genetic tools at the researcher’s disposal (Kamper, 2004; Kämper et al., 2006; Schuster et al., 2016). Moreover, U. maydis is specifically suited for TR studies because the catalytic TERT component has been characterized and determined to be essential for telomere maintenance in vivo, and because the presence of vertebrate-like telomere repeats of 5’-TTAGGG-3’ suggests that its telomere-binding proteins are conserved with humans (Bautista-Espana et al., 2014; Guzmán & Sánchez, 1994; Yu et al., 2013).

Biochemical purification of UmaTR was performed using TERT-mediated RNA-IP as the first step in the UmaTR identification process. The TERT protein served as bait to co-immunoprecipitate the associated TR. Although custom-developed polyclonal antibodies against TERT could immunoprecipitate native telomerase holoenzyme complexes along with the TR, development of custom antibodies is expensive and time consuming. Thus, a U. maydis strain ectopically expressing 3xFLAG-UmaTERT was used, and 3xFLAG-UmaTERT protein expression was confirmed via Western blot (Figure 3.2). Furthermore, immunoprecipitation and the TRAP-based telomerase activity assay showed that the fusion TERT protein associates with the native UmaTR and forms catalytically active telomerase (Figure 3.4). An additional reaction with wild-type cells was included to control for non-specific amplification of telomeric DNA associating with TERT (Figure 3.4). The results show that the RNA-IP was specific and that TRAP activity is detected only in the 3xFLAG-UmaTERT clone, suggesting specific PCR amplification of telomerase-extended DNA rather than non-specifically bound telomeric DNA (Figure 3.4).

Our bioinformatics pipeline to identify UmaTR includes modifications from our

previous approaches (Li et al., 2013; Qi et al., 2013). Initially we screened for loci with

putative template sequences in the U. maydis reference genome as the template sequence

is the only universally conserved TR primary sequence element and is determined by the

telomeric repeat sequence. The template sequence was defined as minimally being 8 nt in

length and upto 12 nt. As telomerase requires that the DNA substrate hybridize via

Watson-Crick base pairing to the TR template, we hypothesized a requirement of at least

2 nt for the initial hybridization to form a stable hetero-duplex. As the telomeric repeat is

6 nt in length in U. maydis, we chose the lower limit of template length to be 8 nt including both the alignment region and the 6 nt repeat length. To allow for more freedom in the upper limit of the template length, at most 2 repeat lengths of 12 nt was set as the upper limit. The NGS data was then mapped to the putative template containing loci in a strand specific manner preserving orientation of transcription information.

Following ranking based on mapped reads, BLAST searches and homolog identification, candidate #2 was singled out for further analysis as it fulfilled all the required criteria of a TR candidate (Figure 3.5).
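
The ranking step can be sketched in the same spirit. The following Python example counts, for each putative locus, the reads that overlap it on the same strand and then sorts the loci by read count. The interval representation (0-based, half-open), the names and the toy data are all assumptions made for illustration; in practice this step would be carried out with a read aligner and interval tools rather than pure Python.

```python
from collections import namedtuple

# Hypothetical interval record: 0-based start, exclusive end, strand "+" or "-".
Interval = namedtuple("Interval", "name start end strand")


def count_reads(loci, reads):
    """Count reads overlapping each locus on the same strand."""
    counts = {}
    for locus in loci:
        counts[locus.name] = sum(
            1
            for read in reads
            if read.strand == locus.strand
            and read.start < locus.end
            and read.end > locus.start
        )
    return counts


def rank_loci(loci, reads):
    """Return (locus name, read count) pairs in descending order of read count."""
    counts = count_reads(loci, reads)
    return sorted(counts.items(), key=lambda item: item[1], reverse=True)


# Toy data: read r3 overlaps locus_A but lies on the opposite strand,
# so it is not counted.
loci = [Interval("locus_A", 100, 200, "+"), Interval("locus_B", 500, 600, "-")]
reads = [
    Interval("r1", 150, 200, "+"),
    Interval("r2", 190, 240, "+"),
    Interval("r3", 150, 200, "-"),
    Interval("r4", 550, 600, "-"),
]
print(rank_loci(loci, reads))
```

Requiring the read and the locus to share a strand is what preserves the direction of transcription, so antisense reads at a locus do not inflate its rank.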

Homologs of UmaTR could not be found by BLAST searches against the genomes of target species. This is not surprising, as TRs are notoriously divergent in primary sequence (Podlevsky & Chen, 2016). However, this lack of conservation was more pronounced than expected within the Ustilago genus. Conserved gene synteny of the U. maydis TR locus was nevertheless observed, providing a synteny-based approach for TR identification. Gene synteny has previously been applied successfully for TR identification in Yarrowia, Candida and Leishmania species (Červenák et al., 2019; Gunisova et al., 2009; Vasconcelos et al., 2014). This approach rests on the assumption that homologous genes with the same function occupy a conserved locus, flanked by the same neighboring genes, in the respective species. In this regard, Ustilaginomycetes species show remarkable syntenic conservation of the TR locus, and a deeper exploration may provide insight into the potential regulatory function of the flanking genes in TR metabolism.
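
The logic of a synteny-based search can be sketched as follows. Given an annotation of a target genome in which genes are labeled by their ortholog in U. maydis, the sketch looks for a contig on which orthologs of the two genes flanking the TR are immediate neighbors, and returns the intergenic interval between them as the region to inspect for a TR. The flanking gene labels follow Figure 3.8 (UMAG_03168 and CAR2); the annotation format, the function name and the coordinates are hypothetical.

```python
def syntenic_interval(genes, flank_a, flank_b):
    """Find the intergenic region between adjacent orthologs of two flanking genes.

    genes: iterable of (contig, start, end, ortholog_label) tuples.
    Returns (contig, interval_start, interval_end), or None when the two
    orthologs are not immediate neighbors on any contig.
    """
    by_contig = {}
    for contig, start, end, label in genes:
        by_contig.setdefault(contig, []).append((start, end, label))

    for contig, entries in by_contig.items():
        entries.sort()  # order genes along the contig by start coordinate
        for (_, end_1, label_1), (start_2, _, label_2) in zip(entries, entries[1:]):
            if {label_1, label_2} == {flank_a, flank_b}:
                return contig, end_1, start_2
    return None


# Toy annotation of a hypothetical target species.
annotation = [
    ("contig1", 0, 900, "other_gene"),
    ("contig1", 3500, 4800, "CAR2"),
    ("contig1", 1000, 2000, "UMAG_03168"),
    ("contig2", 10, 500, "CAR2"),
]
print(syntenic_interval(annotation, "UMAG_03168", "CAR2"))
```

Because the match is on an unordered pair of neighbors, the sketch tolerates inversion of the locus but not the loss or relocation of a flanking gene, in which case the single conserved flank would have to be used alone.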

RACE and Northern blot experiments performed to characterize the UmaTR transcript showed the presence of three isoforms, α, β and γ, with variable 3’ ends but a single 5’ end (Figure 3.7). Remarkably, all three isoforms contain poly-A tails, as shown by 3’ RACE performed with or without poly-A polymerase treatment (Figure 3.7C). Human TR (hTR) was shown to mature via a polyadenylation-dependent pathway in which hTR precursors with poly-A tails were detected in vivo in human cells (Nguyen et al., 2015). However, the poly-A tail is processed by PARN, and the mature hTR associated with hTERT is devoid of a poly-A tract. UmaTR, in contrast, shows significant accumulation of poly-A tailed α, β and γ isoforms that are detectable by 3’ RACE (Figure 3.7C). While the significance of this phenomenon is unknown at this point, a number of interesting questions could be addressed by further exploration. For instance, why does UmaTR, which is a long non-coding RNA, have a poly-A tail? What is the functional significance of this? Does poly-A tailed UmaTR associate with UmaTERT? Does UmaTR have a unique biogenesis pathway different from that of filamentous fungi? These questions are currently under exploration. Independent of this, we pursued the structural and functional characterization of the most abundant UmaTR isoform, α.

Telomerase reconstituted in vitro with the α isoform was catalytically active (Figure 3.6). However, the processivity of U. maydis telomerase was low, with only a faint second repeat added (Figure 3.6). This is likely due to the absence of telomerase holoenzyme components required to increase telomerase processivity. This is in contrast

to filamentous fungal telomerase which was demonstrated to be highly processive in vitro

(Qi et al., 2013). This suggests the presence of unique telomerase holoenzyme

components in U. maydis not conserved with the vertebrate system. The identification of

these components could offer novel insight into mechanisms governing telomerase

processivity.

Except for flagellates, the TR pseudoknot (PK) domain is universally present and

critical for telomerase activity (Podlevsky & Chen, 2016; Podlevsky et al., 2016a). The

PK domain identified by comparative phylogenetic analysis in U. maydis TR shows

unique sequence composition with a conserved secondary structure (Figure 3.11). The

classic H-type PK found in TRs is organized such that the loop residues of a stem-loop base pair intramolecularly with residues downstream of the first stem. The PK domain of UmaTR showed unpaired residues between the P2 and P3 stems, as did the other Ustilaginales TR PKs (Figures 3.10 and 3.11). This is in contrast to vertebrate and filamentous

fungal TR PKs which show a compact organization with no unpaired residues between

PK-forming helices (Chen et al., 2000; Qi et al., 2013). The P2 stem of UmaTR did not show high conservation (Figure 3.10), in contrast to vertebrate P2, which contains residues highly conserved among vertebrates. However, this is somewhat similar to the filamentous fungal PK1, in which 2 base pairs share 80% identity while the remaining 4 base pairs are not conserved but show co-variation (Qi et al., 2013). The UmaTR P3 stem, however, is conserved among Ustilaginales, as it is in vertebrates and filamentous fungi. This suggests that the P3 stem is more stable and may induce P2 base pairing, leading to the formation of a stable PK. The 5’ linker adjoining the P2 and P3 stems in UmaTR contains conserved ‘U’ residues for the formation of a universally conserved triple helix that is required for telomerase function by an as yet unknown mechanism (Theimer et al., 2005). The downstream linker between P2 and P3 is more variable in primary sequence. A conserved helix, PK2.1, whose function is unknown, was identified in the downstream PK linker in filamentous fungal TR (Qi et al., 2013). However, the downstream linker between P2 and P3 in vertebrates does not show similar structures (Chen et al., 2000). UmaTR is thus more like vertebrate TRs in this regard, as no helices were identified in the downstream linker.

Template boundary definition by a template-adjacent helix is ubiquitous in TRs except in vertebrates, and is likely the mechanism defining the 5’ boundary of the template in UmaTR, as indicated by a conserved stem 5’ of the template (Figure 3.11).

The distal stem-loop moiety in UmaTR is critical for activity and functionally homologous to the vertebrate and filamentous fungal CR4/5. The CR4/5 domain of TR is a three-way junction formed by the P5, P6 and P6.1 stems, in which the P6.1 helix is a short 4-bp stem-loop absolutely required for activity in vertebrates and filamentous fungi (Chen et al., 2002; Qi et al., 2013). Crosslinking and mass spectrometry-based mapping showed that P6 stem-loop residues and a P6.1 loop residue crosslink to the TERT-TRBD in a vertebrate system (Bley et al., 2011). A later co-crystal structure of the CR4/5 and TERT-TRBD shows the protein domain wedged into the P6-P6.1 junction, making extensive protein-RNA contacts with P6 and P6.1 (Huang et al., 2014). The requirement of CR4/5 for telomerase activity could be explained by the high affinity of CR4/5 for TERT, which likely induces conformational changes required for, and occurring during, TERT catalysis. A structure analogous to CR4/5, including a highly conserved 4-bp stem P6.1, was found in the UmaTR distal moiety, suggesting the presence of the CR4/5 domain (Figure 3.11). Thus, the conserved structural and functional features of ‘higher fungi’ TRs illuminated by this study provide an important piece of the puzzle of Opisthokonta TR evolution.

3.6 References

Armanios, M. Y. (2009). Syndromes of telomere shortening. Annual Review of Genomics and Human Genetics, 10, 45-61.

Bautista-Espana, D., Anastacio-Marcelino, E., Horta-Valerdi, G., Celestino-Montes, A., Kojic, M., Negrete-Abascal, E., Reyes-Cervantes, H., Vazquez-Cruz, C., Guzman, P., & Sanchez-Alonso, P. (2014). The telomerase reverse transcriptase subunit from the dimorphic fungus Ustilago maydis. PLOS ONE, 9(10), e109981.

Bley, C. J., Qi, X., Rand, D. P., Borges, C. R., Nelson, R. W., & Chen, J. J.-L. (2011). RNA-protein binding interface in the telomerase ribonucleoprotein. Proceedings of the National Academy of Sciences of the United States of America, 108(51), 20333-20338.

Červenák, F., Juríková, K., Devillers, H., Kaffe, B., Khatib, A., Bonnell, E., Sopkovičová, M., Wellinger, R. J., Nosek, J., Tzfati, Y., Neuvéglise, C., & Tomáška, Ľ. (2019). Identification of telomerase RNAs in species of the Yarrowia clade provides insights into the co-evolution of telomerase, telomeric repeats and telomere-binding proteins. Scientific Reports, 9(1), 13365.

Chen, J. J.-L., Blasco, M. A., & Greider, C. W. (2000). Secondary structure of vertebrate telomerase RNA. Cell, 100(5), 503-514.

Chen, J. J.-L., Opperman, K. K., & Greider, C. W. (2002). A critical stem-loop structure in the CR4-CR5 domain of mammalian telomerase RNA. Nucleic Acids Research, 30(2), 592-597.

Dandjinou, A. T., Lévesque, N., Larose, S., Lucier, J.-F., Abou Elela, S., & Wellinger, R. J. (2004). A phylogenetically based secondary structure for the yeast telomerase RNA. Current Biology, 14, 1148-1158.

Egan, E. D., & Collins, K. (2012). Biogenesis of telomerase ribonucleoproteins. RNA, 18(10), 1747-1759.

Gunisova, S., Elboher, E., Nosek, J., Gorkovoy, V., Brown, Y., Lucier, J. F., Laterreur, N., Wellinger, R. J., Tzfati, Y., & Tomaska, L. (2009). Identification and comparative analysis of telomerase RNAs from Candida species reveal conservation of functional elements. RNA, 15(4), 546-559.

Guzmán, P. A., & Sánchez, J. G. (1994). Characterization of telomeric regions from Ustilago maydis. Microbiology, 140(3), 551-557.

Huang, J., Brown, A. F., Wu, J., Xue, J., Bley, C. J., Rand, D. P., Wu, L., Zhang, R., Chen, J. J.-L., & Lei, M. (2014). Structural basis for protein-RNA recognition in telomerase. Nature Structural & Molecular Biology, 21(6), 507-512.

Kämper, J. (2004). A PCR-based system for highly efficient generation of gene replacement mutants in Ustilago maydis. Molecular Genetics and Genomics, 271(1), 103-110.

Kämper, J., Kahmann, R., Bölker, M., Ma, L.-J., Brefort, T., Saville, B. J., Banuett, F., Kronstad, J. W., Gold, S. E., Müller, O., Perlin, M. H., Wösten, H. A. B., de Vries, R., Ruiz-Herrera, J., Reynaga-Peña, C. G., Snetselaar, K., McCann, M., Pérez-Martín, J., Feldbrügge, M., Basse, C. W., Steinberg, G., Ibeas, J. I., Holloman, W., Guzman, P., Farman, M., Stajich, J. E., Sentandreu, R., González-Prieto, J. M., Kennell, J. C., Molina, L., Schirawski, J., Mendoza-Mendoza, A., Greilinger, D., Münch, K., Rössel, N., Scherer, M., Vraneš, M., Ladendorf, O., Vincon, V., Fuchs, U., Sandrock, B., Meng, S., Ho, E. C. H., Cahill, M. J., Boyce, K. J., Klose, J., Klosterman, S. J., Deelstra, H. J., Ortiz-Castellanos, L., Li, W., Sanchez-Alonso, P., Schreier, P. H., Häuser-Hahn, I., Vaupel, M., Koopmann, E., Friedrich, G., Voss, H., Schlüter, T., Margolis, J., Platt, D., Swimmer, C., Gnirke, A., Chen, F., Vysotskaia, V., Mannhaupt, G., Güldener, U., Münsterkötter, M., Haase, D., Oesterheld, M., Mewes, H.-W., Mauceli, E. W., DeCaprio, D., Wade, C. M., Butler, J., Young, S., Jaffe, D. B., Calvo, S., Nusbaum, C., Galagan, J., & Birren, B. W. (2006). Insights from the genome of the biotrophic fungal plant pathogen Ustilago maydis. Nature, 444(7115), 97-101.

Kinal, H., Park, C. M., & Bruenn, J. A. (1993). A family of Ustilago maydis expression vectors: new selectable markers and promoters. Gene, 127(1), 151-152.

Kojic, M., Zhou, Q., Lisby, M., & Holloman, W. K. (2006). Rec2 interplay with both Brh2 and Rad51 balances recombinational repair in Ustilago maydis. Molecular and Cellular Biology, 26(2), 678-688.

Kuprys, P. V., Davis, S. M., Hauer, T. M., Meltser, M., Tzfati, Y., & Kirk, K. E. (2013). Identification of Telomerase RNAs from Filamentous Fungi Reveals Conservation with Vertebrates and Yeasts. PLOS ONE, 8(3), e58661.

Lai, A. G., Pouchkina-Stantcheva, N., Di Donfrancesco, A., Kildisiute, G., Sahu, S., & Aboobaker, A. A. (2017). The protein subunit of telomerase displays patterns of dynamic evolution and conservation across different metazoan taxa. BMC Evolutionary Biology, 17(1), 107-107.

Lai, C. K., Miller, M. C., & Collins, K. (2002). Template boundary definition in Tetrahymena telomerase. Genes & Development, 415-420.

Langmead, B., & Salzberg, S. L. (2012). Fast gapped-read alignment with Bowtie 2. Nature Methods, 9(4), 357-359.

Li, Y., Podlevsky, J. D., Marz, M., Qi, X., Hoffmann, S., Stadler, P. F., & Chen, J. J.-L. (2013). Identification of purple sea urchin telomerase RNA using a next-generation sequencing based approach. RNA, 19(6), 852-860.

Lingner, J., Hughes, T. R., Shevchenko, A., Mann, M., Lundblad, V., & Cech, T. R. (1997). Reverse transcriptase motifs in the catalytic subunit of telomerase. Science, 276, 561-567.

Mason, D. X., Goneska, E., & Greider, C. W. (2003). Stem-loop IV of Tetrahymena telomerase RNA stimulates processivity in trans. Molecular and Cellular Biology, 23(16), 5606-5613.

Meyerson, M., Counter, C. M., Eaton, E. N., Ellisen, L. W., Steiner, P., Caddle, S. D., Ziaugra, L., Beijersbergen, R. L., Davidoff, M. J., Liu, Q., Bacchetti, S., Haber, D. A., & Weinberg, R. A. (1997). hEST2, the putative human telomerase catalytic subunit gene, is up-regulated in tumor cells and during immortalization. Cell, 90(4), 785-795.

Milne, I., Stephen, G., Bayer, M., Cock, P. J. A., Pritchard, L., Cardle, L., Shaw, P. D., & Marshall, D. (2012). Using Tablet for visual exploration of second-generation sequencing data. Briefings in Bioinformatics, 14(2), 193-202.

Musgrove, C., Jansson, L. I., & Stone, M. D. (2018). New perspectives on telomerase RNA structure and function. Wiley Interdisciplinary Reviews RNA, 9(2).

Nguyen, D., Grenier St-Sauveur, V., Bergeron, D., Dupuis-Sandoval, F., Scott, M. S., & Bachand, F. (2015). A Polyadenylation-Dependent 3' End Maturation Pathway Is Required for the Synthesis of the Human Telomerase RNA. Cell Reports, 13(10), 2244-2257.

Olovnikov, A. M. (1973). A theory of marginotomy. The incomplete copying of template margin in enzymic synthesis of polynucleotides and biological significance of the phenomenon. Journal of Theoretical Biology, 41, 181-190.

Pérez, G., Pangilinan, J., Pisabarro, A. G., & Ramírez, L. (2009). Telomere Organization in the Ligninolytic Basidiomycete Pleurotus ostreatus. Applied and Environmental Microbiology, 75(5), 1427-1436.

Podlevsky, J. D., & Chen, J. J.-L. (2016). Evolutionary perspectives of telomerase RNA structure and function. RNA Biology, 13(8), 720-732.

Podlevsky, J. D., Li, Y., & Chen, J. J.-L. (2016a). The functional requirement of two structural domains within telomerase RNA emerged early in eukaryotes. Nucleic Acids Research, 44, 9891-9901.

Podlevsky, J. D., Li, Y., & Chen, J. J.-L. (2016b). Structure and function of echinoderm telomerase RNA. RNA, 22, 204-215.

Qi, X., Li, Y., Honda, S., Hoffmann, S., Marz, M., Mosig, A., Podlevsky, J. D., Stadler, P. F., Selker, E. U., & Chen, J. J.-L. (2013). The common ancestral core of vertebrate and fungal telomerase RNAs. Nucleic Acids Research, 41(1), 450-462.

Qi, X., Rand, D. P., Podlevsky, J. D., Li, Y., Mosig, A., Stadler, P. F., & Chen, J. J.-L. (2015). Prevalent and distinct spliceosomal 3'-end processing mechanisms for fungal telomerase RNA. Nature Communications, 6, 1-8.

Qiao, F., & Cech, T. R. (2008). Triple-helix structure in telomerase RNA contributes to catalysis. Nature Structural & Molecular Biology, 15, 634-640.

Quinlan, A. R., & Hall, I. M. (2010). BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics, 26(6), 841-842.

Schuster, M., Schweizer, G., Reissmann, S., & Kahmann, R. (2016). Genome editing in Ustilago maydis using the CRISPR-Cas system. Fungal Genetics and Biology, 89, 3-9.

Shefer, K., Brown, Y., Gorkovoy, V., Nussbaum, T., Ulyanov, N. B., & Tzfati, Y. (2007). A triple helix within a pseudoknot is a conserved and essential element of telomerase RNA. Molecular and Cellular Biology, 27, 2130-2143.

Theimer, C. A., Blois, C. A., & Feigon, J. (2005). Structure of the human telomerase RNA pseudoknot reveals conserved tertiary interactions essential for function. Molecular Cell, 17, 671-682.

Vasconcelos, E. J. R., Nunes, V. S., da Silva, M. S., Segatto, M., Myler, P. J., & Cano, M. I. N. (2014). The putative Leishmania telomerase RNA (LeishTER) undergoes trans-splicing and contains a conserved template sequence. PLOS ONE, 9, e112061.

Watson, J. D. (1972). Origin of concatemeric T7 DNA. Nature New Biology, 239(94), 197-201.

Yu, E. Y., Kojic, M., Holloman, W. K., & Lue, N. F. (2013). Brh2 and Rad51 promote telomere maintenance in Ustilago maydis, a new model system of DNA repair proteins at telomeres. DNA Repair, 12(7), 472-479.

Figure 3.1. Evolutionary relationships of major fungal groups and the TR identification status of each subphylum. Major fungal phyla and subphyla are shown with representative organisms from each subphylum. TR sequences have been identified and characterized exclusively from the phylum Ascomycota, with no TRs identified in the phylum Basidiomycota. The organism of interest in this study, Ustilago maydis, is shown in red.

Figure 3.2. Expression of 3xFLAG-UmaTERT in U. maydis cells. (A) Light microscopy of Ustilago maydis cells (400X magnification). A single cell is indicated with a red arrow. (B) Western blot analysis of pCM955-3xFLAG-UmaTERT transformants #2, #3, A, B and C. Detectable expression of 3xFLAG-UmaTERT in clone A at the expected size of ~156 kDa is indicated with a red arrow. WT cell lysate was loaded as a negative control and FLAG-hTERT as a positive control.

Figure 3.3. Testing conditions for optimal lysis, immunoprecipitation and telomerase activity of 3xFLAG-UmaTERT clone A. Clone A cells expressing 3xFLAG-UmaTERT were lysed under conditions shown above the gel and TRAP telomerase activity assay was performed to elucidate the optimal condition for cell lysis, immunoprecipitation and telomerase activity. Western blot of cell lysates of respective lysis condition shown under the TRAP gel. Ladder like banding pattern and decreasing intensity with increasing molecular weight is characteristic of TRAP activity. All detergents were at concentrations described in the materials and methods section. Both lysate and immunoprecipitates were assayed for TRAP activity. Lysis in the presence of Tween20 followed by IP and TRAP shows best activity (Lane 17). RRL expressed hTERT combined with synthetic hTR assayed for TRAP activity is positive control shown in lane 1 (Cell growth, lysis, WB – Dhenugen Logeswaran, TRAP assay – Joshua Podlevsky).

Figure 3.4. RNase-sensitive telomerase activity of 3xFLAG-UmaTERT. Following lysis, U. maydis clone A cell lysate was diluted at the indicated ratios with lysis buffer prior to immunoprecipitation to check for the presence of inhibitors of immunoprecipitation or carry-over PCR inhibitors. Immunoprecipitates were subjected to the TRAP telomerase activity assay with or without RNase treatment, with WT included as a negative control. The WT cell lysate does not show any TRAP bands, suggesting specific pulldown of 3xFLAG-UmaTERT in the recombinant samples, and all dilutions show RNase-sensitive telomerase activity, suggesting the functional assembly of 3xFLAG-UmaTERT with endogenous UmaTR. RRL-expressed hTERT combined with synthetic hTR served as the positive control for the TRAP assay (H) (Cell growth, lysis – Dhenugen Logeswaran, TRAP assay – Zhenqiu Huang).

Figure 3.5. Bioinformatics pipeline for screening UmaTR candidates. (A) Next-generation sequencing of RNA extracted from immunoprecipitated 3xFLAG-UmaTERT yielded 109,069,132 sequencing reads. Independently, 782 putative template-containing loci were extracted from the U. maydis genome. The reads were mapped to the loci, which were then ranked by the number of mapped reads. (B) Top 5 candidates from the bioinformatics analysis pipeline in descending order of the number of reads mapped. Candidate #2 shows the presence of a 9 nt template, with no BLAST hits to annotated homologous genes but with a homologous unannotated sequence from the closely related U. bromivora in which the template is conserved (NGS library preparation – Joshua Podlevsky).

Figure 3.6. Validation of candidate #2 as UmaTR. (Left) Sequence of the putative template (red in open box) with the hybridizing positions of two circularly permuted telomeric DNA primers (a, b) shown. Nucleotides predicted to be added by telomerase (lower case, blue) and the number of nucleotides added (+4, +5) are shown. (Right) Direct telomerase activity assay using in vitro reconstituted U. maydis telomerase with synthetic UmaTERT and synthetic candidate #2 RNA. A 22-mer is included as a marker to reflect the size of the +4 band. Numbers on the left show the number of nucleotides added to each primer. A radiolabeled recovery control (r.c.) was added prior to phenol/chloroform extraction of extended DNA products (Primer design, PCR, TR synthesis and purification – Dhenugen Logeswaran; Telomerase in vitro reconstitution, activity assay – Yang Li).

Figure 3.7. Characterization of U. maydis TR transcripts. (A) Schematic of the overlapping UmaTR isoforms with the position and sequence of the template shown (red) and the annealing positions of Northern blot probes 1 and 2 indicated. Probe 2 can detect only isoforms β and γ. The common 5’ end and the variable 3’ ends of isoforms α, β and γ, determined by 5’ and 3’ RACE respectively, are shown with end nucleotide positions, and poly-A tails are indicated as ‘A(n)’. Hybridization of the 3’ RACE adapter to the poly-A tail and annealing of the adapter-specific primer Ra are shown. (B) Schematic of the UmaTR transcript locus cDNA generated using the 3’ RACE adapter, with the ends of the isoforms shown. PCR primer annealing positions and orientations are shown below the cDNA. (C) Agarose gel electrophoresis of PCR products generated using the cDNA in B as template, with the respective gene-specific forward primers shown above each lane and the adapter-specific reverse primer Ra used in all reactions. Asterisks correspond to the isoforms as marked in the cDNA schematic. The presence of bands in the poly-A polymerase (PAP) (-) reactions shows the presence of a poly-A tail in all three isoforms. (D) Agarose gel electrophoresis of PCR products with the indicated forward primers and the R1 reverse primer. Asterisks correspond to the isoforms as marked in the cDNA schematic. The presence of bands in PAP (+) and PAP (-) reactions with R1 as reverse primer shows the presence of a large polycistronic RNA with a poly-A tail that encompasses all three isoforms of UmaTR. (E) Northern blot analysis of UmaTR isoforms. Total RNA (10 µg) was loaded along with size markers and probed using probe 1 (P1) shown in A. The isoforms detected are shown to the right, with isoform α showing the highest abundance. (F) Northern blot analysis using probe 2 shows a lower abundance of β and, as expected, does not detect α, providing evidence to support the high abundance of α. (RACE, PCR, Northern, agarose gel electrophoresis – Yang Li; Primer design, Northern, PCR, RNA synthesis and purification – Dhenugen Logeswaran)

Figure 3.8. Synteny conservation of the TR locus in class Ustilaginomycetes. (Left) Evolutionary relationships between the species and their respective orders are shown. Branch length is not proportional to evolutionary distance. (Right) Schematic of the gene synteny in the TR locus from Ustilaginomycetes species, with coding sequences flanking the TR and their transcription orientations shown. The coding sequence 5’ of the TR is a hypothetical protein designated as UMAG_03168 in U. maydis. Homologous sequences were found in all species shown above (Appendix E) and are transcribed in the same orientation as the TR. The coding sequence 3’ of the TR is a CAR2-ornithine aminotransferase protein showing syntenic conservation in all species except for A. flocculosa, U. primulicola and V. palustris (Appendix F).

Figure 3.9. Identification of the minimal UmaTR regions required for telomerase activity. Truncation analysis of UmaTR 5’- and 3’- fragments. (Top) Schematic of the truncated UmaTR 5’- and 3’- fragments and the fragment combinations used in the activity assay. Numbers above the UmaTR schematic denote the nucleotide positions within the RNA. (Bottom) Activity assay of telomerase reconstituted from UmaTR 5F and 3F fragments in the given combinations. The minimal UmaTR fragments showing activity are shaded in grey. A 32P recovery control (r.c.) was added prior to phenol/chloroform extraction and ethanol precipitation of extended DNA products. The number of nucleotides added to the primer is indicated to the left of the gel (activity assay – Yang Li; RNA synthesis – Dhenugen Logeswaran).

Figure 3.10. Multiple sequence alignment of T/PK and eCR4/5 domains of select Ustilaginales species. Nucleotides are colored by identity (A: green, T: red, C: blue and G: black), with shaded residues shown as white text on a colored background. TR domains are shown within white boxes above the alignment, with base-paired regions given the same name. Highly variable regions intervening between conserved motifs are omitted, with the number of omitted nucleotides shown between the conserved or base-paired regions. (A) Alignment of the T/PK domain from 18 Ustilaginales species with 75% shading applied. (B) Alignment of the eCR4/5 domain from 15 Ustilaginales species with 80% shading applied.

Figure 3.11. Secondary structure model of the UmaTR core domains. Secondary structure models of the template-pseudoknot (T/PK) and the CR4/5 domains inferred from domain-specific multiple sequence alignments of Ustilaginales species (Figure 3.10). Invariant residues (red) and nucleotides with ≥75% identity (green) are indicated. Major structural features are shown. Regions for which the secondary structure was not determined are indicated as dotted lines, with the number of omitted residues indicated. The positions of the minimal regions required for activity, as indicated in Figure 3.9, are shown in blue.

CHAPTER 4

STRUCTURE AND FUNCTION OF LAND PLANTS TELOMERASE RNA

Reproduced in part, with permission, from:

Song, J., Logeswaran, D., Castillo-Gonzalez, C., Li, Y., Bose, S., Aklilu, B., Ma, M.,

Polkhovskiy, A., Chen, J. J-L., Shippen, D. E. (2019). The conserved structure of plant

telomerase RNA provides the missing link for an evolutionary pathway from ciliates to

humans. Proc. Natl. Acad. Sci. USA, (in press)

4.1 Abstract

Telomerase is essential for maintaining telomere integrity. Although telomerase

function is widely conserved, the integral telomerase RNA (TR) that provides a template

for telomeric DNA synthesis has diverged dramatically. Nevertheless, TR molecules

retain two highly conserved structural domains critical for catalysis: a template-proximal pseudoknot (PK) structure and a downstream stem-loop structure. Using the authentic TR from the model plant Arabidopsis thaliana (AtTR), identified independently by the Shippen lab (as part of this collaborative study) and the Fajkus lab, 85 AtTR orthologs from three major clades of plants (angiosperms, gymnosperms and lycophytes) were identified using bioinformatics methods. Through phylogenetic comparison, a secondary structure model conserved among plant TRs was inferred and verified using site-directed mutagenesis and in vitro telomerase activity assays. The conserved plant TR structure contains a template-PK core domain enclosed by a P1 stem and a long 3’ stem, P4/5/6, each of which resembles a corresponding structural element in ciliate and vertebrate TRs. However, the plant TR contains additional stems and linkers within the template-PK core, allowing for expansion of the PK structure from the simple PK in the smaller ciliate TR during evolution. Hence, the plant TR provides an evolutionary bridge that unites the disparate structures of previously characterized TRs from ciliates and vertebrates.

4.2 Introduction

Many non-coding RNAs (ncRNAs) function as integral components of ribonucleoprotein (RNP) complex enzymes that govern cellular processes such as translation, RNA splicing and telomere maintenance (Wilusz et al., 2009). The telomerase RNA (TR or TER) assembles with the telomerase reverse transcriptase

(TERT) protein to form the catalytic core of an enzyme that maintains telomere function and genome integrity by continually adding telomeric DNA repeats onto chromosome ends (Shay & Wright, 2019). TR contains a template for the synthesis of G-rich telomere repeat arrays catalyzed by TERT. In addition, TR harbors highly conserved structural domains that serve as a scaffold for binding accessory proteins that facilitate RNP biogenesis, engagement with the chromosome terminus and regulation of telomerase enzyme activity (Podlevsky & Chen, 2016).

The essential role of telomerase in telomere maintenance is universally conserved across Eukarya, except for a small group of insect species that evolved a retrotransposon-mediated mechanism (Casacuberta, 2017). Nevertheless, key aspects of the telomerase

RNP have diverged dramatically, including the sequence and length of TR, the protein composition of the holoenzyme and the mechanism of RNP maturation (Egan & Collins,

2012). For example, TR genes in ciliated protozoa encode relatively small RNAs (140-210 nt in length) that are transcribed by RNA polymerase III (Pol III) (Greider & Blackburn, 1989; Lingner et al., 1994). The La-related protein P65 in Tetrahymena recognizes the 3’ poly-U tail of TR and bends the RNA to facilitate telomerase RNP assembly (Jiang et al., 2015; Singh et al., 2012). In contrast, fungi maintain much larger

TR molecules (900 to 2,400 nt.) that are transcribed by RNA polymerase II (Pol II)

(Podlevsky & Chen, 2016). The 3’ end maturation of fungal TRs requires components of

the canonical snRNA biogenesis pathway and results in RNP assembly with Sm and Lsm

proteins (Box et al., 2008; Noël et al., 2012). Like fungi, vertebrates also utilize Pol II to transcribe a TR with a size ranging from 312 to 559 nt (Chen et al., 2000). However, vertebrate telomerase RNP processing and biogenesis proceeds via a small nucleolar

RNA (snoRNA) maturation pathway (Mitchell et al., 1999). In vertebrates, a highly conserved structural motif in the 3’ H/ACA domain of TR binds the protein components of the H/ACA snoRNP (Dyskerin, NOP10, NHP2, and GAR1), which then protect the 3’ end of the mature TR from exonuclease degradation (Egan & Collins, 2010; Tseng et al., 2018; Wang & Meier, 2004).

Within TR, two conserved domains are critical for telomerase catalysis (Qi et al.,

2013). The first is the template-pseudoknot domain which bears a single-stranded template region typically corresponding to 1.5-2 copies of the telomeric repeat

(Podlevsky & Chen, 2016). The 5’ boundary of the TR template is defined by a template boundary element (TBE) that promotes polymerase fidelity by preventing incorporation of non-telomeric nucleotides into telomeric DNA (Autexier & Greider, 1995; Chen &

Greider, 2003b; Jansson et al., 2015; Tzfati et al., 2000). In addition to the template and

TBE, the pseudoknot (PK) structure located downstream of the template is essential for

TERT-TR interaction and enzyme activity (Blackburn & Collins, 2011; Podlevsky &

Chen, 2012). The PK structures of vertebrate and yeast TRs are generally larger and more stable (Chen et al., 2000; Qi et al., 2013), harboring longer helices than the PK structures of ciliate TRs, which are relatively primitive and less stable (Autexier & Greider, 1998; Gilley & Blackburn, 1999). NMR studies of TR reveal a unique triple-helix structure in the PK that plays an essential, but poorly understood, role in

promoting telomerase activity (Theimer et al., 2005). Another essential domain of TR,

called helix IV in ciliates or CR4/5 in vertebrates, can reconstitute telomerase activity in

trans together with the template-PK domain (Chen et al., 2002; Mason et al., 2003;

Mitchell & Collins, 2000; Xie et al., 2008). TRs from other groups of eukaryotes

including echinoderms and trypanosomes also possess a second structural domain called

eCR4/5 that can bind independently to TERT in trans and is functionally equivalent to

the vertebrate CR4/5. The requirement of two conserved structural TR domains for

telomerase activity is therefore universally conserved among all major groups of

eukaryotes from trypanosomes to vertebrates (Podlevsky et al., 2016a).

The identification of two telomerase-associated RNAs from A. thaliana termed

AtTER1 and AtTER2 (Cifuentes-Rojas et al., 2011; Cifuentes-Rojas et al., 2012) was

previously described. AtTER1 was proposed to serve as the template for telomeric DNA

synthesis by telomerase (Cifuentes-Rojas et al., 2011). However, recent data has refuted

the role of AtTER1 in telomere maintenance (Fajkus et al., 2019). Moreover, Fajkus and

colleagues recently reported the identification of a novel telomerase RNA from A.

thaliana termed AtTR that is required for telomere maintenance and is conserved across

land plants (Fajkus et al., 2019).

Using next-generation sequencing analysis of TERT-associated RNAs, Shippen

and co-workers independently identified AtTR as the bona fide RNA component for

Arabidopsis telomerase in this collaborative study. We show that this AtTR is sufficient

to reconstitute telomerase activity with A. thaliana TERT (AtTERT) protein in vitro. In

addition, by employing phylogenetic sequence analysis of homologous TRs from the

three distantly related plant lineages including angiosperms, gymnosperms and the early

branching lycophytes, we determine a conserved structural model for plant TRs that was

verified using mutagenesis. Our findings provide an evolutionary bridge to unite the

disparate structures of the previously characterized TRs from ciliates and vertebrates as

well as a new platform to explore the evolution of the telomerase RNP enzyme.

4.3 Materials and methods

In vitro reconstitution of Arabidopsis telomerase.

3xFLAG tagged Arabidopsis TERT (AtTERT) was expressed in rabbit

reticulocyte lysate (RRL) from the p3xFLAG-AtTERT plasmid using the TNT Quick

Coupled transcription/translation kit (Promega) following manufacturer’s instructions.

The AtTR fragments were in vitro transcribed by T7 RNA polymerase, gel purified and

assembled with TERT protein for 30 min at 30˚C at a final concentration of 1.5 µM.

Telomerase direct primer extension

12 µl of in vitro reconstituted telomerase enzyme was immuno-purified with 3 µl

of anti-FLAG M2 magnetic beads (Sigma M8823) at room temperature for 1 hr. The

telomerase enzyme on beads was assayed in a 10 µl reaction containing 1X telomerase

reaction buffer (50 mM Tris-HCl, pH 8.0, 50 mM NaCl, 0.5 mM MgCl2, 5 mM BME and

1 mM spermidine), 1 µM DNA primer, specified dNTPs or ddNTPs, and 0.18 µM of

32P-dGTP (3,000 Ci/mmol, 10 mCi/ml; Perkin-Elmer). Reactions were incubated at 30˚C for 60 min and terminated by phenol/chloroform extraction, followed by ethanol precipitation. The 22-mer size marker was prepared in a 10 µl reaction containing

(GGGTTTA)3 oligo, 1x TdT reaction buffer, 5 units of terminal deoxynucleotidyl transferase (TdT, Affymetrix) and 0.1 µM of 32P-dGTP. The reaction was incubated at

room temperature for 3 sec and terminated by addition of 10 µl 2x formamide loading

buffer (10 mM Tris-HCl, pH 8.0, 80% (vol/vol) formamide, 2 mM EDTA, 0.08%

bromophenol blue, and 0.08% xylene cyanol). The DNA products were resolved on a

10% (wt/vol) polyacrylamide/8 M urea denaturing gel, dried, exposed to a

phosphor storage screen and imaged on a Typhoon gel scanner (GE Healthcare).
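The labeled-nucleotide concentration quoted above can be checked with a short calculation. The following is a minimal sketch, assuming a fresh 32P-dGTP stock at the stated specific activity (3,000 Ci/mmol) and radioactive concentration (10 mCi/ml) and ignoring radioactive decay; the function names are illustrative and not part of any protocol.

```python
def stock_molarity_um(radioactive_conc_mci_per_ml, specific_activity_ci_per_mmol):
    """Molar concentration (µM) of a labeled nucleotide stock.

    mCi/ml divided by Ci/mmol gives 1e-3 mmol/ml, i.e. 1e-3 mol/L,
    so multiplying by 1e3 converts the quotient to µM.
    """
    return radioactive_conc_mci_per_ml / specific_activity_ci_per_mmol * 1e3


def label_volume_ul(final_um, reaction_volume_ul, stock_um):
    """Volume of stock (µl) needed to reach final_um in the reaction (C1*V1 = C2*V2)."""
    return final_um * reaction_volume_ul / stock_um


# Values taken from the primer extension reaction described above.
stock_um = stock_molarity_um(10, 3000)              # about 3.33 µM
volume_ul = label_volume_ul(0.18, 10, stock_um)     # about 0.54 µl per 10 µl reaction
print(round(stock_um, 2), round(volume_ul, 2))
```

Under these assumptions, roughly half a microliter of undiluted label would supply 0.18 µM 32P-dGTP in a 10 µl reaction.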

Bioinformatics analysis

AtTR orthologs were identified by standalone BLAST (version 2.2.31+) searches

initially using AtTR as query from closely related species. The BLASTN search was

performed with the -task dc-megablast parameter to allow for identification of more

variable sequences. For more distantly related species, position weight matrix (PWM)

search using fragrep2 (Mosig et al., 2007) was performed for candidate identification.

The PWM was created using sequence alignment from AtTR orthologs identified via

BLAST and the match scores were relaxed during PWM searches to allow for

identification of more divergent sequences. Once a reliable secondary structure was

established using the TRs identified via BLAST and fragrep2, secondary structure-based searches were performed using Infernal (Nawrocki & Eddy, 2013) for identification of orthologs from more distantly related species.
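The idea of a PWM search with relaxed match scores can be illustrated with a simplified sketch. It is not the fragrep2 algorithm, which searches for fragmented patterns; it only shows how a log-odds matrix built from aligned sequences tolerates a mismatch once the score cutoff is lowered. The toy alignment, the target sequence, the uniform background and the cutoff values are all assumptions made for illustration.

```python
import math

ALPHABET = "ACGU"


def build_pwm(aligned, pseudocount=1.0):
    """Log-odds PWM from equal-length, gap-free sequences (uniform background assumed)."""
    pwm = []
    for col in range(len(aligned[0])):
        counts = {base: pseudocount for base in ALPHABET}
        for seq in aligned:
            counts[seq[col]] += 1
        total = sum(counts.values())
        pwm.append({b: math.log2((counts[b] / total) / 0.25) for b in ALPHABET})
    return pwm


def scan(pwm, target, min_fraction):
    """Return (start, score) for windows scoring >= min_fraction of the best possible score."""
    best = sum(max(column.values()) for column in pwm)
    hits = []
    for start in range(len(target) - len(pwm) + 1):
        window = target[start:start + len(pwm)]
        score = sum(column[base] for column, base in zip(pwm, window))
        if score >= min_fraction * best:
            hits.append((start, round(score, 2)))
    return hits


# Hypothetical alignment of a conserved motif from closely related species.
pwm = build_pwm(["CUAAACCCU", "CUAAACCCU", "CUAAACCCA"])
# Hypothetical candidate carrying one substitution inside the motif.
divergent = "GGACUAAGCCCUGAACC"

print(scan(pwm, divergent, min_fraction=0.9))  # strict cutoff: no hit
print(scan(pwm, divergent, min_fraction=0.7))  # relaxed cutoff: hit at position 3
```

Lowering the cutoff recovers the divergent candidate at the cost of admitting more false positives, which is why candidates from such searches still need structural and functional validation.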

Sequence alignment analysis

Multiple sequence alignment of land plant TRs was performed initially using the

ClustalW algorithm of the BioEdit program. Manual refinements were made to preliminary alignments with highly conserved regions and invariant primary sequence

motifs as anchor points. Sequences from closely related species of the

family were aligned first and the alignment was expanded by including sequences in

order of phylogenetic relationships to the existing alignment.

4.4 Results

AtTR and AtTERT reconstitute active telomerase in vitro.

Initially we tested whether AtTR can assemble with AtTERT in vitro to

reconstitute active telomerase. As shown in Figure 4.1A, recombinant FLAGx3-AtTERT

protein synthesized in rabbit reticulocyte lysate was assembled with T7 RNA polymerase

transcribed AtTR in vitro and the reconstituted telomerase was immuno-purified followed

by a direct primer extension assay (Figure 4.1A). Importantly, the primer extension

activity is AtTR-dependent as no activity was detected in the absence of AtTR (Figure

4.1A, lane 1). Seven A. thaliana telomeric DNA primers with permuted sequences of

TTTAGGG bearing different 3’ terminal sequences were examined using in vitro

reconstituted telomerase enzyme. The reaction with (GTTTAGG)3 generated a 7-nt ladder pattern of products with major bands at positions +6, +13 and +20 (Figure 4.1A, lane 8), consistent with the 7-nt telomeric DNA repeats synthesized by A. thaliana telomerase. A. thaliana telomerase exhibited similar levels of activity with the different permuted telomeric DNA primers and generated the expected offset banding patterns

(Figure 4.1A, lanes 2-7), indicating correct primer-template alignment and specific usage of the template.
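The expected banding pattern can be reproduced with a small simulation. This is a sketch under two stated assumptions: the DNA 3' end anneals using its longest suffix that matches the start of the template-encoded product while leaving at least one nucleotide to add, and a major band forms each time synthesis reaches the 5' end of the template. The function names are illustrative.

```python
def encoded_dna(rna_template):
    """DNA copied from an RNA template written 5'->3' (its reverse complement)."""
    pair = {"A": "T", "U": "A", "G": "C", "C": "G"}
    return "".join(pair[base] for base in reversed(rna_template))


def major_band_offsets(rna_template, primer, rounds=3):
    """Cumulative nucleotides added at the end of each round of repeat synthesis."""
    product = encoded_dna(rna_template)
    strand = primer
    offsets = []
    for _ in range(rounds):
        # Assumed alignment rule: longest matching 3' suffix, at least 1 nt left to copy.
        for k in range(len(product) - 1, 0, -1):
            if strand.endswith(product[:k]):
                break
        else:
            raise ValueError("primer 3' end cannot anneal to the template")
        strand += product[k:]          # copy the rest of the template
        offsets.append(len(strand) - len(primer))
    return offsets


wild_type_template = "CUAAACCCU"   # AtTR residues 43-51, 5'->3'
print(major_band_offsets(wild_type_template, "GTTTAGG" * 3))  # [6, 13, 20]
print(major_band_offsets(wild_type_template, "TTTAGGG" * 3))  # [5, 12, 19]
```

The first call reproduces the +6, +13 and +20 bands observed with the (GTTTAGG)3 primer, and a permuted primer shifts the whole ladder by its register, as seen in the offset patterns of lanes 2-7.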

To further examine the templating function of AtTR, we generated an AtTR template mutant (AtTRhum) with a template sequence similar to the human TR (hTR)

template that allows the synthesis of 6-nt TTAGGG repeats. The telomeric TTAGGG

repeats are ubiquitously conserved in most lineages of eukaryotes (Podlevsky & Chen,

2016). The 9-nt AtTR template sequence for the synthesis of 7-nt repeats (TTTAGGG)n,

which comprises the first nine residues of 5’-CUAAACCCUGAACC-3’, is flanked by a G residue at its 3’ boundary and could

potentially be expanded to a longer 14 nt template by mutating the G residue to A. To

convert the native A. thaliana template sequence to a human-like template, we simply

deleted one A residue in the polymerization template sequence and the non-conserved G

residue in the alignment sequence, which resulted in a 12-nt 5’-CUAACCCUAACC-3’ template for synthesizing TTAGGG repeats. As expected, the telomerase reconstituted from the AtTRhum template mutant generated the first major band at position +5 (+gttag)

and the second major band at +11, indicating the addition of a 6-nt DNA repeat using the

human-like template (Figure 4.1B, lane 8). Moreover, the inclusion of

dideoxyribonucleotides, either ddTTP or ddATP, terminated the primer extension reaction at the

expected positions on the template of the AtTRWT and AtTRhum (Figure 4.1B, lanes 2-3 and 6-7). In addition, under processive conditions with all three nucleotides, the AtTRhum

template with a long 6-nt alignment region led to significantly higher processivity based

on the ratio of +11/+5 products (Figure 4.1B, lanes 4 and 8), consistent with a previous

finding that longer templates correlate with high repeat addition processivity (Chen &

Greider, 2003a). Altogether, these data demonstrate that the template sequence

43-CUAAACCCU-51 within AtTR is a bona fide template for telomeric DNA repeat

synthesis by A. thaliana TERT.
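The conversion of the native template to the human-like template can be followed step by step. The sketch below deletes one A from the polymerization template and the flanking G from the 14-nt region, then derives the encoded DNA and the first extension product from the (GTTAGG)3 primer. The index positions are counted within the 14-nt region rather than in AtTR numbering, and the alignment rule (longest matching 3' suffix) is an assumption made for illustration.

```python
def delete_at(seq, index):
    """Return seq with the residue at a 0-based index removed."""
    return seq[:index] + seq[index + 1:]


def encoded_dna(rna_template):
    """DNA copied from an RNA template written 5'->3' (its reverse complement)."""
    pair = {"A": "T", "U": "A", "G": "C", "C": "G"}
    return "".join(pair[base] for base in reversed(rna_template))


def first_extension(rna_template, primer):
    """Nucleotides added in the first round, assuming longest-suffix primer alignment."""
    product = encoded_dna(rna_template)
    for k in range(len(product) - 1, 0, -1):
        if primer.endswith(product[:k]):
            return product[k:]
    raise ValueError("primer 3' end cannot anneal to the template")


native_region = "CUAAACCCUGAACC"   # 9-nt template, flanking G, then AACC
assert native_region[9] == "G" and native_region[2] == "A"

# Remove the G first so the index of the A is unaffected.
humanized = delete_at(delete_at(native_region, 9), 2)

print(humanized)                                   # CUAACCCUAACC
print(encoded_dna(humanized))                      # GGTTAGGGTTAG
print(first_extension(humanized, "GTTAGG" * 3))    # GTTAG
```

The 12-nt product encodes two TTAGGG registers, and the first round adds five nucleotides (GTTAG), in agreement with the +5 band reported for AtTRhum.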

Plant TRs share a conserved secondary structure

To discern the structure of AtTR, we employed phylogenetic comparative

analysis to infer a secondary structure model from the sequence alignment of plant TR

homologs identified from three major clades of land plant species: angiosperms, gymnosperms and lycophytes (Figure 4.2A, B). Orthologs of AtTR were identified by searching genomic sequence data from the National Center for Biotechnology Information (NCBI) using sequence homology search tools including BLAST, fragrep2 (Mosig et al., 2007) and

Infernal (Nawrocki & Eddy, 2013). While BLAST was able to find TR homologs from closely related species, fragrep2 allowed for identification of TRs from more distantly related species by utilizing position-specific weight matrix (PWM)-based searches with

PWMs derived from multiple sequence alignments, as opposed to using the primary sequence as the search query. Collectively, we identified 85 AtTR orthologs, 70 from angiosperms, 11 from gymnosperms and 4 from lycophytes (Appendix G, Table 4.1), with lycophytes representing the most ancestral land plant TRs identified to date. To infer secondary structure, multiple sequence alignment analysis was performed with 16 representative TR sequences (12 angiosperms, 3 gymnosperms and 1 lycophyte) selected from the 85 sequences to allow at least one representative from each individual order spanning three distinct clades (Figure 4.2B). All TR sequences, including those from the basal groups, gymnosperms and lycophytes, can be reliably aligned with the TR sequences from angiosperms (Figure 4.2B), revealing universally conserved structural elements of plant TRs. From the alignment of 16 divergent plant TR sequences, universal nucleotide covariations were identified to infer base-paired structural elements conserved among all three clades (Figure 4.2B). Group-specific covariations were identified from alignments of secondary structural elements from closely related species belonging to specific groups (Figure 4.3). Comparison of TR secondary structures from three representative species, A. thaliana from angiosperms, Picea glauca (spruce) from

gymnosperms and S. kraussiana (spike moss) from lycophytes, revealed three common structural features: a conserved template-PK core domain enclosed by stem P1c, a long stem that comprises consecutive short base-paired regions termed P4, P5 and P6, and a long-range base-paired stem P1a formed between the extreme distal 5’ and 3’ sequences

(Figure 4.4 A-C).
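The logic of covariation analysis can be sketched as a test applied to two alignment columns. The classification below is a simplified illustration, assuming gap-free columns and counting G:U wobble pairs as compatible; the toy alignment is hypothetical and is not taken from the plant TR alignment.

```python
CANONICAL = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def classify_columns(alignment, i, j):
    """Classify alignment columns i and j (0-based) as a candidate base pair.

    'covariation'    : all pairs canonical and two species differ at BOTH positions
    'compatible'     : all pairs canonical, but changes occur at one position only
    'invariant pair' : the same canonical pair in every sequence
    'not paired'     : at least one sequence cannot form a canonical pair
    """
    pairs = [(seq[i], seq[j]) for seq in alignment]
    if not all(pair in CANONICAL for pair in pairs):
        return "not paired"
    distinct = set(pairs)
    if len(distinct) == 1:
        return "invariant pair"
    if any(a[0] != b[0] and a[1] != b[1] for a in distinct for b in distinct):
        return "covariation"
    return "compatible"


# Hypothetical 5-column alignment: columns 0 and 4 covary, columns 1 and 3 do not pair.
toy_alignment = ["GAAAC", "AAAAU", "CAAAG"]
print(classify_columns(toy_alignment, 0, 4))  # covariation
print(classify_columns(toy_alignment, 1, 3))  # not paired
```

Column pairs that repeatedly score as covariation along a proposed helix provide the strongest support, whereas invariant pairs, such as those in highly conserved regions, carry no covariation evidence and need experimental testing.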

The plant template-PK (T/PK) core domain resembles those from ciliate, fungal and vertebrate TRs, consisting of a template, a universal PK structure formed by stems P2 and P3, and a core-enclosing stem P1c (Figure 4.4A-C). However, the plant T/PK core domain contains additional plant-specific stems, namely P1.1 (in P. glauca and S. kraussiana), P2.1 (in A. thaliana and P. glauca) and P2.2 (in P. glauca and S. kraussiana) (Figure 4.4A-C). The P1.1 stem can be found in the invertebrate echinoderm and fungal TRs, and could potentially function as a TBE (Podlevsky et al., 2016b; Qi et al., 2013). The P2.1 and P2.2 stems are not present in all plant TRs, suggesting that they are more adaptable and may be important for a function specific to some plant groups.

One possible role for the variable P2.1 and P2.2 stems is to maintain the length of the linker between the template and the pseudoknot structure within the T/PK core domain.

In addition to the T/PK core domain, the plant TR contains a long helical structure with three consecutive short stems, P4, P5 and P6, located near the 3’ end between P1a and P1b (Figure 4.4A-C). The location and structure of the plant P4/P5/P6 stem resemble those of the vertebrate CR4/5 domain, echinoderm eCR4/5 domain or ciliate helix IV, all of which are essential for telomerase activity (Chen et al., 2002; Mason et al., 2003; Podlevsky et al., 2016b). The three-way junction formed between P1a, P1b and P4/5/6 appears to be a

conserved feature of plant TR (Figure 4.4 A-C). This P1a-mediated three-way junction is unique to plant TR and is not found in other known TRs.

In addition to inferring the conserved secondary structure, the multiple sequence alignment of the 16 representative plant TRs spanning land plant evolution revealed five highly conserved regions (CR), CR1 to CR5, containing nucleotides that are invariant among these 16 distantly related species (Figure 4.2B). Such remarkable conservation of nucleotide identity usually predicts essential functions of these regions as evident in vertebrate TRs (Chen et al., 2000). CR1 corresponds to the template of AtTR. CR2 and

CR3 form the universal P2 and P3 stems of the PK, while CR4 and CR5 form a P5 structural element that includes the short 3-bp P5 stem, an asymmetric internal loop and the upper part of stem P4 (Figure 4.4 A-C). While lacking the P6.1 stem-loop, the universal P5 structural element of the plant TR resembles the CR4/5 domain conserved in vertebrate, fission yeast and filamentous fungal TRs (Chen et al., 2002; Qi et al., 2013).

This highly conserved P5 stem may serve as a protein binding site or play a crucial role in telomerase function.

The AtTR PK domain is essential for telomerase function and homologous to human TR

With a robust secondary structure model for AtTR, we sought to map the structural elements essential for telomerase activity. Full-length or truncated AtTR constructs were assembled with recombinant FLAGx3-AtTERT in vitro and the immuno-purified enzymes were analyzed for telomerase activity by direct primer extension.

Analysis of three truncated AtTR fragments, 11-179, 25-153 and 42-136 (Figure 4.5A), showed that AtTR-25-153 is the minimal PK fragment sufficient to reconstitute about

40% of wild-type activity without the P4/5/6 domain (Figure 4.5B, lanes 2 and 3). The

core-enclosing P1c stem appeared to be important for telomerase function as the

AtTR-42-136 fragment with P1c removed was unable to reconstitute any significant activity

(Figure 4.5B, lane 4). Equivalent to the CR4/5 domain of human TR, the 3’ P1a/4/5/6

domain of AtTR can also function in trans as a separate RNA molecule to stimulate the

reconstituted activity from the basal 40% to 66% of wild-type level (Figure 4.5C). A basal activity of telomerase reconstituted from the T/PK domain alone was previously reported with Trypanosome and Echinoderm TRs (Podlevsky & Chen, 2016; Podlevsky et al., 2016b), indicating an evolutionary transition of functional dependence for the two conserved TR domains.

The PK structure of plant TRs closely resembles the PK structures in ciliate and vertebrate TRs, with differences in size and complexity. In the human TR PK structure, the

invariant U residues in the J2/3 upstream region (J2/3u) are essential to telomerase

activity (Chen & Greider, 2005). To determine if the invariant U residues in plant TR PK

are functionally homologous to the human TR, we reconstituted telomerases with two

AtTR mutants, U92C and UU94/95CC. The activity assays of the mutant enzymes showed

no activity (Figure 4.5D, lanes 2 and 3), indicating these U residues in the AtTR PK

domain are absolutely required for telomerase activity. Therefore, the T/PK domains of

AtTR and hTR are both structurally and functionally homologous.

Another critical function provided by the T/PK domain is defining the functional

template boundary through specific structural elements, i.e. the P1 stem in vertebrate TR

(Chen & Greider, 2003b). The P1c stem in the T/PK domain of AtTR resembles the P1

stem in human TR, and presumably functions as the template boundary element. To test

this idea, we generated an AtTR mutant 38UU with two U residues inserted between the

P1c stem and the template to increase the linker length, a critical determinant of the

template boundary. In the wild-type AtTR template, a G residue immediately flanks the

5’ boundary and does not serve as a template even in the presence of dCTP substrate

(Figure 4.5E, lanes 1 and 2). However, in the presence of dCTP, the telomerase enzyme

reconstituted with the AtTR mutant 38UU utilized the G residue as a template beyond the

template boundary (Figure 4.5E, lanes 3 and 4). Thus, A. thaliana and human telomerases

share a homologous mechanism for template boundary definition.

While the overall secondary structure of AtTR is well supported by co-variation

evidence and chemical probing data, we performed mutagenesis analysis to provide

additional support for the highly conserved P5 stem and the plant-specific P2.1 stem

(Figure 4.5A). The 3-bp P5 stem is formed by two highly conserved regions, CR4 and

CR5, with only limited co-variation support for one of the 3 base-pairs. We thus generated AtTR full-length constructs, P5-m1 and -m2, with two single point mutations,

G194C and C239G, introduced to disrupt the invariant G:C base-pairing in the P5 stem, or a compensatory mutant P5-m3 with both point mutations to restore the base-pairing

(Figure 4.5A). The activity assay showed that P5-m1 and -m2 single point mutations abolished telomerase activity (Figure 4.5F, lanes 2 and 3), while the compensatory mutation P5-m3 restored activity (Figure 4.5F, lane 4), consistent with the essential

base-paired structure of stem P5. A similar mutagenesis approach was employed to confirm

the base-paired structure and the functional importance of stem P2.1 (Figure 4.5G).

Altogether, these in vitro studies strongly support the robustness of the phylogenetic comparative analysis for inferring RNA secondary structure in plant TR.
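The reasoning behind the compensatory mutagenesis of stem P5 can be expressed as a simple pairing check. The sketch below tracks only the two residues involved, G194 and C239, and assumes that P5-m1 carries G194C and P5-m2 carries C239G, in the order the mutations are listed above. It models base-pairing potential only and makes no prediction of enzyme activity.

```python
import re

WATSON_CRICK = {("G", "C"), ("C", "G"), ("A", "U"), ("U", "A")}


def apply_mutations(bases, mutations):
    """Apply point mutations such as 'G194C' to a {position: nucleotide} mapping."""
    mutated = dict(bases)
    for mutation in mutations:
        match = re.fullmatch(r"([ACGU])(\d+)([ACGU])", mutation)
        if match is None:
            raise ValueError(f"unrecognized mutation: {mutation}")
        wild_type, position, replacement = match.group(1), int(match.group(2)), match.group(3)
        if mutated[position] != wild_type:
            raise ValueError(f"position {position} is not {wild_type}")
        mutated[position] = replacement
    return mutated


def is_paired(bases, i, j):
    """True if residues i and j can form a Watson-Crick pair."""
    return (bases[i], bases[j]) in WATSON_CRICK


p5_pair = {194: "G", 239: "C"}          # invariant G:C pair in stem P5
constructs = {
    "WT": [],
    "P5-m1": ["G194C"],                 # assumed assignment
    "P5-m2": ["C239G"],                 # assumed assignment
    "P5-m3": ["G194C", "C239G"],        # compensatory double mutant
}

for name, mutations in constructs.items():
    paired = is_paired(apply_mutations(p5_pair, mutations), 194, 239)
    print(name, "paired" if paired else "disrupted")
```

Each single mutant leaves a C:C or G:G mismatch, while the double mutant restores a C:G pair, which matches the loss and recovery of activity seen in Figure 4.5F.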

4.5 Discussion

Telomerase emerged in early eukaryotes as a specialized reverse transcriptase with an integral RNA template to counteract the end-replication problem and maintain genomic integrity. While the catalytic TERT component of telomerase is conserved among eukaryotes, the TR component has diverged significantly during evolution. A missing piece in the evolutionary history of telomerase has been plant TR. Recent studies from the Fajkus lab (Fajkus et al., 2019) indicated that the previously identified AtTER1

(Cifuentes-Rojas et al., 2011) was not the authentic TR in A. thaliana. The Shippen lab, in collaboration with this study, and the Fajkus lab (Fajkus et al., 2019) independently identified AtTR as the bona fide TR in Arabidopsis thaliana. To investigate the function of AtTR, we employed in vitro reconstitution experiments using a rigorous non-PCR assay of direct primer extension to test the authenticity of the AtTR template. We determined that AtTR possessed a functional template for telomeric DNA synthesis by

AtTERT in vitro.

AtTR was first described in 2012 by Wu and collaborators as a root-specific, conserved Pol III-dependent ncRNA (Wu et al., 2012). The ATTR gene (GenBank

AB646770.1) includes a U6-like Type III promoter and poly(T) terminator. The promoter has a consensus cis upstream sequence element (USE) and a TATA box-like element 25 bp upstream of the transcription start site (TSS). The discovery of plant TRs being Pol III

RNA transcripts leads to an interesting question: was the first TR a Pol II or Pol III transcript? TR was originally identified in ciliates as a small Pol III RNA transcript with sizes ranging from 140 to 210 nt (Figure 4.6). RNA polymerase III is generally employed for transcribing small RNA such as 5S rRNA and tRNA due to its sequence-dependent

termination at a U-rich termination site. A large RNA would encounter a high frequency

of U-rich sequences and suffer premature termination with Pol III transcription, which is consistent with the small size of ciliate TR (Lingner et al., 1994). Surprisingly, TRs

identified later in vertebrates and fungi are larger Pol II transcripts with sizes of 312-559

nt and 920-2425 nt, respectively (Chen et al., 2000; Qi et al., 2013). While it seems

reasonable to assume that the Pol III TR transcript is more ancestral, TRs from early

branching flagellates, including trypanosomes, are large Pol II transcripts ranging

between 781-993 nt (Figure 4.6). Discerning the origin of TR will require discovery of

TRs from the early branching lineages of eukaryotes, a daunting task considering the

extremely divergent nature of TR.

The conserved secondary structures of plant TRs presented in this study were

determined by employing phylogenetic comparative analysis, a gold standard for

inferring RNA secondary structures (Chen et al., 2000; Pace et al., 1989). Moreover, the

secondary structure of AtTR was verified by mutagenesis analysis using an in vitro

reconstitution system. In the AtTR structure, the most crucial structural element is the

PK, which is conserved in all known TRs except Trypanosome (Figure 4.6).

Trypanosome TR contains two structural domains, the template-core and eCR4/5, both of

which are required for telomerase activity in vitro and can function in trans as two

separate RNA fragments (Podlevsky et al., 2016a). However, the minimal template core

domain of Trypanosome TR does not contain a PK, arguing that the critical TR PK was a

later adaptation. Nevertheless, helix III of Trypanosome TR is potentially homologous to

the PK forming helix III of Tetrahymena TR as both helices are located between the

template and the core enclosing helix, i.e. helix I in Tetrahymena TR or P1 stem in other

TRs. The PK structure of Tetrahymena TR only requires formation of a 4 bp stem between the loop sequence of helix III and an upstream complementary sequence (Fig.

5). This 4 bp stem is structurally equivalent to the vertebrate P2 stem which is longer and contains two consecutive stems, P2a and P2b, and with an additional P2a.1 stem in the mammalian TR PK (Figure 4.6). How this primitive ciliate TR PK evolved to the more complex vertebrate TR PK has been unclear. The structure of plant TR PK now provides an explanation for the structural transition from ciliate to vertebrate PK. Similar to ciliate

PK, plant PK contains a short unstable 4 bp P2 stem and a longer 8-9 bp P3 stem.

Notably, the ciliate and plant PK structures differ in the length of the joining sequences,

J2/3 upstream (J2/3u) and J2/3 downstream (J2/3d) (Figure 4.6). The length of J2/3u increases from 3 nt in Tetrahymena to 8 nt in plants, similar to the 8 nt J2b/3 in vertebrate

TR PK (Figure 4.6). The length of J2/3d sequence also increases from 4 nt in

Tetrahymena to 14 nt in the A. thaliana PK. We propose that the longer J2/3d makes it possible to expand the short 4 bp P2 stem to a longer P2a/P2b stem in vertebrate PK during evolution. Notably, plant TR contains additional stems (P2.1 and P2.2) located between the template and the P2 stem (Figure 4.6). These additional stems may reflect selective pressure to maintain the spatial constraints for the enzyme active site as the P2 stem expands during evolution. Therefore, the plant TR PK provides an evolutionary bridge for the structural transition from ciliate TR to vertebrate TR.

4.6 References

Autexier, C., & Greider, C. W. (1995). Boundary elements of the Tetrahymena telomerase RNA template and alignment domains. Genes & Development, 9(18), 2227-2239.

Autexier, C., & Greider, C. W. (1998). Mutational analysis of the Tetrahymena telomerase RNA: identification of residues affecting telomerase activity in vitro. Nucleic Acids Research, 26(3), 787-795.

Blackburn, E. H., & Collins, K. (2011). Telomerase: an RNP enzyme synthesizes DNA. Cold Spring Harbor Perspectives in Biology, 3(5).

Box, J. A., Bunch, J. T., Tang, W., & Baumann, P. (2008). Spliceosomal cleavage generates the 3' end of telomerase RNA. Nature, 456, 910-914.

Casacuberta, E. (2017). Drosophila: Retrotransposons Making up Telomeres. Viruses, 9(7).

Chen, J. J.-L., Blasco, M. A., & Greider, C. W. (2000). Secondary structure of vertebrate telomerase RNA. Cell, 100(5), 503-514.

Chen, J. J.-L., & Greider, C. W. (2003a). Determinants in mammalian telomerase RNA that mediate enzyme processivity and cross-species incompatibility. The EMBO Journal, 22(2), 304-314.

Chen, J. J.-L., & Greider, C. W. (2003b). Template boundary definition in mammalian telomerase. Genes & Development, 17(22), 2747-2752.

Chen, J. J.-L., & Greider, C. W. (2005). Functional analysis of the pseudoknot structure in human telomerase RNA. Proceedings of the National Academy of Sciences of the United States of America, 102(23), 8080-8085.

Chen, J. J.-L., Opperman, K. K., & Greider, C. W. (2002). A critical stem-loop structure in the CR4-CR5 domain of mammalian telomerase RNA. Nucleic Acids Research, 30(2), 592-597.

Cifuentes-Rojas, C., Kannan, K., Tseng, L., & Shippen, D. E. (2011). Two RNA subunits and POT1a are components of Arabidopsis telomerase. Proceedings of the National Academy of Sciences of the United States of America, 108(1), 73-78.

Cifuentes-Rojas, C., Nelson, A. D., Boltz, K. A., Kannan, K., She, X., & Shippen, D. E. (2012). An alternative telomerase RNA in Arabidopsis modulates enzyme activity in response to DNA damage. Genes & Development, 26(22), 2512-2523.

Egan, E. D., & Collins, K. (2012). Biogenesis of telomerase ribonucleoproteins. RNA, 18(10), 1747-1759.

Egan, E. D., & Collins, K. L. (2010). Specificity and stoichiometry of subunit interactions in the human telomerase holoenzyme assembled in vivo. Molecular and Cellular Biology, 30, 2775-2786.

Fajkus, P., Peska, V., Zavodnik, M., Fojtova, M., Fulneckova, J., Dobias, S., Kilar, A., Dvorackova, M., Zachova, D., Necasova, I., Sims, J., Sykorova, E., & Fajkus, J. (2019). Telomerase RNAs in land plants. Nucleic Acids Research, 47(18), 9842-9856.

Gilley, D., & Blackburn, E. H. (1999). The telomerase RNA pseudoknot is critical for the stable assembly of a catalytically active ribonucleoprotein. Proceedings of the National Academy of Sciences of the United States of America, 96(12), 6621-6625.

Greider, C. W., & Blackburn, E. H. (1989). A telomeric sequence in the RNA of Tetrahymena telomerase required for telomere repeat synthesis. Nature, 337, 331-337.

Jansson, L. I., Akiyama, B. M., Ooms, A., Lu, C., Rubin, S. M., & Stone, M. D. (2015). Structural basis of template-boundary definition in Tetrahymena telomerase. Nature Structural & Molecular Biology, 22(11), 883-888.

Jiang, J., Chan, H., Cash, D. D., Miracco, E. J., Ogorzalek Loo, R. R., Upton, H. E., Cascio, D., O'Brien Johnson, R., Collins, K., Loo, J. A., Zhou, Z. H., & Feigon, J. (2015). Structure of Tetrahymena telomerase reveals previously unknown subunits, functions, and interactions. Science, 350(6260), aab4070.

Lingner, J., Hendrick, L. L., & Cech, T. R. (1994). Telomerase RNAs of different ciliates have a common secondary structure and a permuted template. Genes & Development, 8(16), 1984-1998.

Mason, D. X., Goneska, E., & Greider, C. W. (2003). Stem-loop IV of tetrahymena telomerase RNA stimulates processivity in trans. Molecular and Cellular Biology, 23(16), 5606-5613.

Mitchell, J. R., Cheng, J., & Collins, K. (1999). A box H/ACA small nucleolar RNA-like domain at the human telomerase RNA 3' end. Molecular and Cellular Biology, 19(1), 567-576.

Mitchell, J. R., & Collins, K. L. (2000). Human telomerase activation requires two independent interactions between telomerase RNA and telomerase reverse transcriptase. Molecular Cell, 6, 361-371.

Mosig, A., Chen, J. J.-L., & Stadler, P. F. (2007). Homology Search with Fragmented Nucleic Acid Sequence Patterns. Lecture Notes in Computer Science, 4645, 335-345.

Nawrocki, E. P., & Eddy, S. R. (2013). Infernal 1.1: 100-fold faster RNA homology searches. Bioinformatics, 29(22), 2933-2935.

Noël, J.-F., Larose, S., Abou Elela, S., & Wellinger, R. J. (2012). Budding yeast telomerase RNA transcription termination is dictated by the Nrd1/Nab3 non-coding RNA termination pathway. Nucleic Acids Research, 40(12), 5625-5636.

Pace, N. R., Smith, D. K., Olsen, G. J., & James, B. D. (1989). Phylogenetic comparative analysis and the secondary structure of ribonuclease P RNA--a review. Gene, 82(1), 65-75.

Podlevsky, J. D., & Chen, J. J.-L. (2012). It all comes together at the ends: telomerase structure, function, and biogenesis. Mutation Research, 730(1-2), 3-11.

Podlevsky, J. D., & Chen, J. J.-L. (2016). Evolutionary perspectives of telomerase RNA structure and function. RNA Biology, 13(8), 720-732.

Podlevsky, J. D., Li, Y., & Chen, J. J.-L. (2016a). The functional requirement of two structural domains within telomerase RNA emerged early in eukaryotes. Nucleic Acids Research, 44, 9891-9901.

Podlevsky, J. D., Li, Y., & Chen, J. J.-L. (2016b). Structure and function of echinoderm telomerase RNA. RNA, 22, 204-215.

Qi, X., Li, Y., Honda, S., Hoffmann, S., Marz, M., Mosig, A., Podlevsky, J. D., Stadler, P. F., Selker, E. U., & Chen, J. J.-L. (2013). The common ancestral core of vertebrate and fungal telomerase RNAs. Nucleic Acids Research, 41(1), 450-462.

Shay, J. W., & Wright, W. E. (2019). Telomeres and telomerase: three decades of progress. Nature Reviews Genetics, 20(5), 299-309.

Singh, M., Wang, Z., Koo, B. K., Patel, A., Cascio, D., Collins, K., & Feigon, J. (2012). Structural Basis for Telomerase RNA Recognition and RNP Assembly by the Holoenzyme La Family Protein p65. Molecular Cell, 47, 16-26.

Theimer, C. A., Blois, C. A., & Feigon, J. (2005). Structure of the human telomerase RNA pseudoknot reveals conserved tertiary interactions essential for function. Molecular Cell, 17, 671-682.

Tseng, C.-K., Wang, H.-F., Schroeder, M. R., & Baumann, P. (2018). The H/ACA complex disrupts triplex in hTR precursor to permit processing by RRP6 and PARN. Nature Communications, 9(1), 5430.

Tzfati, Y., Fulton, T. B., Roy, J., & Blackburn, E. H. (2000). Template Boundary in a Yeast Telomerase Specified by RNA Structure. Science, 288(5467), 863-867.

Wang, C., & Meier, U. T. (2004). Architecture and assembly of mammalian H/ACA small nucleolar and telomerase ribonucleoproteins. The EMBO Journal, 23(8), 1857-1867.

Wilusz, J. E., Sunwoo, H., & Spector, D. L. (2009). Long noncoding RNAs: functional surprises from the RNA world. Genes & Development, 23(13), 1494-1504.

Wu, J., Okada, T., Fukushima, T., Tsudzuki, T., Sugiura, M., & Yukawa, Y. (2012). A novel hypoxic stress-responsive long non-coding RNA transcribed by RNA polymerase III in Arabidopsis. RNA Biology, 9(3), 302-313.

Xie, M., Mosig, A., Qi, X., Li, Y., Stadler, P. F., & Chen, J. J.-L. (2008). Structure and function of the smallest vertebrate telomerase RNA from teleost fish. Journal of Biological Chemistry, 283(4), 2049-2059.

Figure 4.1. AtTR harbors the RNA template for Arabidopsis telomerase. (A) In vitro reconstitution of A. thaliana telomerase activity. Sequences of the putative template with the annealing position of seven circular permuted telomeric DNA primers are shown (right). The predicted primer-extended products are shown in red. A. thaliana telomerase is reconstituted in vitro from synthesized FLAGx3-AtTERT and 1.5 μM of T7 transcribed full-length AtTR (268 nt). The affinity-purified telomerase was assayed for activity in the presence of 32P-dGTP, dTTP, dATP and seven plant telomeric DNA primers with permuted sequences. A radiolabeled 18-mer recovery control (r.c.) was added before product purification and precipitation. Numbers to the right of the gel denote the number of nucleotides added to the primer. (B) Template directed nucleotide addition by A. thaliana telomerase. Telomerase was reconstituted in vitro with AtTERT and either AtTRWT or AtTRhum. The reconstituted telomerase was assayed for activity in the presence of 32P-dGTP and different combinations of dTTP, dATP, ddTTP or ddATP. A 21 nt plant telomeric DNA primer (GTTTAGG)3 was used for AtTR, and an 18 nt human telomeric DNA primer (GTTAGG)3 was used for the AtTRhum. A radiolabeled 18-mer recovery control (r.c.) was added before product purification and precipitation. Numbers and sequences of nucleotides added to the primers are indicated. (in vitro reconstitution, telomerase activity assay, gel electrophoresis – Yang Li; Cloning, PCR, RNA synthesis and purification – Dhenugen Logeswaran)

Figure 4.2. Land plant clades share universally conserved regions within TRs. (A) Evolutionary relationship between major land plant clades. A single representative species of each order is included. (B) Multiple sequence alignment of plant TRs. Alignment of TR sequences from species shown in A. Multiple sequence alignment was performed using the ClustalW algorithm in the BioEdit program. Highly conserved regions and motifs were aligned first followed by alignment of intervening sequences using conserved regions as anchors. The total number of nucleotides in each TR is indicated at the end of the respective sequence. Individual nucleotides are colored by identity (A; green, G; black, U; red, C; blue) and nucleotides that are conserved ≥ 75% are shaded (White text on colored background). Five conserved regions (CRs) are indicated with red lines above the alignment. The template and base paired helices (P1- P6) in the secondary structures are denoted within white boxes below the alignment.


Figure 4.3. Sequence alignments of TR structural elements from respective clades to identify group-specific co-variations. Individual nucleotides are colored by identity (A, green; G, black; T, red; C, blue), with shaded residues shown as white text on a colored background. Variable shading thresholds were applied to make co-variations clear. Individual TR elements are indicated above each alignment block, with the secondary structure shown in dot-bracket notation at the bottom. Intervening residues of structural elements that form long-range base pairs are omitted, and the number of nucleotides omitted is shown between the base-paired regions. (A) Sequence alignments of TR structural elements from 15 species belonging to the order Brassicales including AtTR. Shading of P1a (80%), P2.1 (75%), P1b/P1c (60%) and P6 (80%) is shown. (B) Sequence alignments of TR structural elements of 6 species from order Pinales including. Shading of P1b/P1c (50%) and P2.2 (65%) is shown. (C) Sequence alignments of TR structural elements of 4 species from division Lycophyta including. Shading of 50% is shown for all elements P1a, P1c, P1.1 and P6.
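For readers unfamiliar with dot-bracket notation, the following Python sketch shows how such a string maps to base-paired positions. It is a generic illustration, not the procedure used to build the alignments. The function name `dot_bracket_pairs` and the toy hairpin are hypothetical, and only a single bracket type is handled, so pseudoknotted pairings are outside its scope.

```python
def dot_bracket_pairs(structure):
    """Return a sorted list of (i, j) index pairs (0-based) for a
    dot-bracket string that uses '(' , ')' and '.' only."""
    stack = []
    pairs = []
    for index, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(index)  # opening partner waits for its match
        elif symbol == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {index}")
            pairs.append((stack.pop(), index))
        elif symbol != ".":
            raise ValueError(f"unexpected symbol {symbol!r} at position {index}")
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1]}")
    return sorted(pairs)


# Hypothetical hairpin: a 3 bp stem closing a 4 nt loop.
toy_sequence = "GGGAAAACCC"
toy_structure = "(((....)))"
toy_pairs = dot_bracket_pairs(toy_structure)
paired_bases = [(toy_sequence[i], toy_sequence[j]) for i, j in toy_pairs]
```

Comparing the nucleotides at each paired position across aligned species is what reveals co-variation: a change on one side of a helix that is compensated on the other side supports the pairing.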


Figure 4.4. Plant TRs share a conserved secondary structure. Representative TR secondary structures determined by phylogenetic sequence analysis are shown for (A) A. thaliana from angiosperms, (B) Picea glauca (spruce) from gymnosperms and (C) S. kraussiana (spike moss) from lycophytes. The characteristic TR pseudoknot (PK) is shaded in yellow. Universal co-variations (green), group-specific co-variations (blue) and plant-invariant residues (red) are indicated. Universal co-variations and invariant residues are based on the sequence alignment of 16 divergent plant species spanning 8 eudicots, 2 monocots, 2 early-branching angiosperms, 3 gymnosperms and 1 lycophyte. The aligned sequences are shown in Figure 4.2. Group-specific co-variations are based on the sequence alignments shown in Figure 4.3.


Figure 4.5. Functional characterization of critical structural elements in AtTR. (A) A schematic of AtTR secondary structure. The 5’ and 3’ residues of truncated AtTR fragments are denoted on the AtTR structure. The positions and identities of the specific point mutations introduced are indicated. (B) Identification of a minimal PK fragment and (C) functional analysis of stem P1a/4/5/6. Full-length AtTR (AtTR-FL) and various truncated AtTR fragments were assembled with AtTERT in vitro and analyzed for activity by primer extension assay. The number of nucleotides (+6, +13 or +20) added in each major product band is indicated. The P1a/4/5/6 fragment was generated by deleting residues 25-153 from AtTR-FL and replacing them with a GAAA tetraloop. The relative activities of the reactions are indicated under the gel. A recovery control (r.c.) is shown. (D) The functional requirement of invariant U residues in the PK domain. (E) The effect of P1c linker length on template boundary definition. (F) Compensatory mutagenesis analysis of stem P5. (G) Compensatory mutagenesis analysis of stem P2.1. AtTR-FL constructs bearing specific point mutations were assembled with AtTERT in vitro and analyzed for telomerase activity. To analyze template boundary definition with AtTR-38UU, the reconstituted enzyme was assayed in the absence (-) or presence (+) of dCTP in addition to dGTP, dATP and dTTP. (In vitro reconstitution, telomerase activity assays – Yang Li; RNA synthesis and purification – Dhenugen Logeswaran)


Figure 4.6. Evolution of TR pseudoknot structures. A simplified phylogenetic tree of major eukaryotic lineages is shown in the left panel. Branch lengths in the tree do not reflect evolutionary distance. The lineages with TR transcribed by Pol II (green) and Pol III (orange) are depicted. The size range of TRs from each group is indicated. The PK structures of TRs from the major groups of eukaryotes, including ciliates, plants, fishes and mammals, are shown in the right panel. Trypanosome TR does not have a PK structure in the template core domain. The P2 and P3 stems conserved from ciliates to mammals are shown in red, with highly conserved nucleotides explicitly denoted. The vertebrate-specific stem extension P2a is shown in blue, while the mammal-specific stem extension P2a.1 is shown in green. The lengths of the joining sequences between stems P2 and P3, the J2/3 upstream (J2/3u) and downstream (J2/3d) regions, are indicated. (Concept and figure – Julian J.-L. Chen)

Table 4.1.

Species with land plant TRs identified in this study

Order | Species | Accession | Start coordinate (a) | End coordinate (b) | Source
Apiales | Daucus carota | NC_030389.1 | 27,698,101 | 27,698,376 | NCBI
Asterales | Chrysanthemum seticuspe | BDUE01009703.1 | 11,137 | 11,419 | NCBI
Boraginales | Echium plantagineum | QFAX02000220.1 | 135,925 | 136,171 | NCBI
Brassicales | Aethionema arabicum | KE151693.1 | 19,486 | 19,752 | NCBI
Brassicales | Arabidopsis halleri | FJVB01000013.1 | 273,652 | 273,920 | NCBI
Brassicales | Arabidopsis lyrata | NW_003302193.1 | 6,235 | 6,501 | NCBI
Brassicales | Arabis alpina | LT669791.1 | 32,604,818 | 32,605,077 | NCBI
Brassicales | Arabis montbretiana | LNCH01009117.1 | 36,530 | 36,800 | NCBI
Brassicales | Arabis nordmanniana | LNCG01220153.1 | 3,675 | 3,942 | NCBI
Brassicales | Barbarea vulgaris | LXTM01001115.1 | 52,644 | 52,908 | NCBI
Brassicales | Boechera stricta | MLHT01000206.1 | 2,167,990 | 2,168,256 | NCBI
Brassicales | Brassica cretica | QGKV01138583.1 | 73 | 347 | NCBI
Brassicales | Brassica juncea | CM007199.1 | 42,541,815 | 42,542,082 | NCBI
Brassicales | Brassica rapa | NC_024798.1 | 13,226,512 | 13,226,780 | NCBI
Brassicales | Capsella bursa-pastoris | MPGU01000291.1 | 544,071 | 544,341 | NCBI
Brassicales | Cardamine hirsuta | Chr4 | 14,202,139 | 14,202,402 | MPIPZ
Brassicales | Conringia planisiliqua | FNXX01000004.1 | 7,410,188 | 7,410,452 | NCBI
Brassicales | Crucihimalaya himalaica | SMJT01000124.1 | 207,463 | 207,727 | NCBI
Brassicales | Euclidium syriacum | FPAK01000008.1 | 2,642,798 | 2,643,063 | NCBI
Brassicales | Eutrema heterophyllum | PKMM01021225.1 | 255,915 | 256,145 | NCBI
Brassicales | Eutrema salsugineum | NW_006256908.1 | 4,817,520 | 4,817,781 | NCBI
Brassicales | Eutrema yunnanense | PKML01061038.1 | 473 | 736 | NCBI
Brassicales | Leavenworthia alabamica | KE157026.1 | 94,536 | 94,815 | NCBI
Brassicales | Raphanus raphanistrum | JRQH01003943.1 | 2,932 | 3,186 | NCBI
Brassicales | Raphanus sativus | NW_017353142.1 | 35,244,840 | 35,245,094 | NCBI
Brassicales | Schrenkiella parvula | CM001190.1 | 12,955,768 | 12,956,035 | NCBI
Brassicales | Sisymbrium irio | KE156162.1 | 139,418 | 139,686 | NCBI
Brassicales | Tarenaya hassleriana | NW_010971389.1 | 564,830 | 565,090 | NCBI
Brassicales | Thlaspi arvense | AZNP01000142.1 | 120,620 | 120,884 | NCBI
Caryophyllales | Beta vulgaris | NC_025816.2 | 46,989,999 | 46,990,277 | NCBI
Cucurbitales | Cucurbita argyrosperma | SDJN01000158.1 | 222,899 | 223,147 | NCBI
Fabales | Pisum sativum | PUCA014342884.1 | 375 | 635 | NCBI
Fagales | Casuarina equisetifolia | RDRV01000354.1 | 115,161 | 115,414 | NCBI
Gentianales | Coffea eugenioides | NC_040043.1 | 6,019,019 | 6,019,286 | NCBI
Lamiales | Olea europaea | NW_019237129.1 | 278,546 | 278,801 | NCBI
Malpighiales | Caryocar brasiliense | STGP01026219.1 | 4,821 | 5,089 | NCBI
Malpighiales | Manihot esculenta | NC_035172.1 | 30,848,292 | 30,848,551 | NCBI
Malpighiales | Populus simonii | CM017472.2 | 14,153,485 | 14,153,758 | NCBI
Malpighiales | Viola pubescens | NBIL01136792.1 | 11,205 | 11,463 | NCBI
Malvales | Aquilaria agallochum | KK907007.1 | 4,840 | 5,116 | NCBI
Malvales | Aquilaria sinensis | SMDT01003036.1 | 616,167 | 616,432 | NCBI
Malvales | Corchorus capsularis | AWWV01006766.1 | 15,731 | 16,000 | NCBI
Malvales | Corchorus olitorius | AWUE01012270.1 | 7,870 | 8,137 | NCBI
Malvales | Durio zibethinus | NW_019167871.1 | 10,632,362 | 10,632,624 | NCBI
Malvales | Gossypioides kirkii | CM008983.1 | 32,191,550 | 32,191,784 | NCBI
Malvales | Gossypium arboreum | NC_030666.1 | 87,580,015 | 87,580,260 | NCBI
Malvales | Gossypium australe | CM016621.1 | 70,003,453 | 70,003,727 | NCBI
Malvales | Gossypium thurberi | CM013381.1 | 23,878,808 | 23,879,056 | NCBI
Malvales | Kokia drynarioides | NTFQ01013625.1 | 69,702 | 69,937 | NCBI
Malvales | Theobroma cacao | NC_030859.1 | 13,601,850 | 13,602,113 | NCBI
Myrtales | Eucalyptus camaldulensis | BADO01007437.1 | 1,766 | 2,014 | NCBI
Myrtales | Eucalyptus melliodora | SISH01000046.1 | 4,217,044 | 4,217,293 | NCBI
Oxalidales | Aristotelia chilensis | VEXP01036680.1 | 842 | 1,093 | NCBI
Oxalidales | Cephalotus follicularis | BDDD01000524.1 | 142,301 | 142,563 | NCBI
Proteales | Macadamia integrifolia | UZVR01001767.1 | 83,061 | 83,327 | NCBI
Rosales | Rosa chinensis | NC_037093.1 | 60,849,275 | 60,849,535 | NCBI
Sapindales | Atalantia buxifolia | MKYR01004417.1 | 843,867 | 844,129 | NCBI
Sapindales | Azadirachta indica | AMWY02057456.1 | 1,105 | 1,362 | NCBI
Sapindales | Citrus hindsii | QWBT01000927.1 | 5,076,792 | 5,077,050 | NCBI
Sapindales | Citrus clementina | NW_006261964.1 | 4,968,658 | 4,968,914 | NCBI
Sapindales | Xanthoceras sorbifolium | CM010616.1 | 13,811,758 | 13,812,007 | NCBI
Solanales | Cuscuta australis | NQVE01000092.1 | 783,927 | 784,177 | NCBI
Solanales | Nicotiana rustica | ML520654.1 | 26,960 | 27,208 | NCBI
Solanales | Nicotiana tabacum | NW_015926110.1 | 63,719 | 63,965 | NCBI
Solanales | Solanum tuberosum | NW_006239035.1 | 694,552 | 694,810 | NCBI
Zingiberales | Musa balbisiana | CM017189.1 | 27,011,643 | 27,011,918 | NCBI
Alismatales | Posidonia oceanica | GGFN01190223.1 | 5 | 250 | NCBI*
Arecales | Calamus simplicifolius | UESW01003909.1 | 1,490,526 | 1,490,801 | NCBI
Amborellales | Amborella trichopoda | NW_006494910.1 | 7,781,446 | 7,781,726 | NCBI
Magnoliales | Liriodendron chinense | PVNU02000262.1 | 764,788 | 765,066 | NCBI
Gnetales | Gnetum montanum | scaffold866741 | 96,878 | 97,202 | DRYAD
Cupressales | Sequoia sempervirens | VDFB01200574.1 | 58,757 | 59,059 | NCBI
Ginkgoales | Ginkgo biloba | Chr9 | 251,572,510 | 251,572,855 | GIGA DB
Pinales | Abies balsamea | aalba5_s00030163 | 46,478 | 46,823 | TG DB
Pinales | Larix sibirica | NWUY0100044616.1 | 10,666 | 11,015 | NCBI
Pinales | Picea abies | CBVK0101923023.1 | 6,762 | 7,110 | NCBI
Pinales | Picea glauca | ALWZ04S1636083.1 | 4,036 | 4,382 | NCBI
Pinales | Pinus lambertiana | LMTP010003768.1 | 303,995 | 304,339 | NCBI
Pinales | Pinus sylvestris | contig_7214027 | 1,193 | 1,533 | NCBI
Pinales | Pinus taeda | APFE031443769.1 | 20,896 | 21,241 | NCBI
Pinales | Pseudotsuga menziesii | LPNX010568464.1 | 175,954 | 176,301 | NCBI
Isoetales | Isoetes echinospora | GGKY01093994.1 | 1,209 | 1,488 | NCBI*
Selaginellales | Selaginella kraussiana | LDJE01041645.1 | 1,146 | 1,441 | NCBI
Selaginellales | Selaginella bryopteris | GEMU01091170.1 | 1 | 305 | NCBI*
Selaginellales | Selaginella tamariscina | PUQB01000486.1 | 141,633 | 141,932 | NCBI

a: Predicted based on multiple sequence alignment with AtTR
b: 3’ end of TR inferred based on the presence of a poly ‘U’ tract
NCBI: National Center for Biotechnology Information – Genome Database, URL: www.ncbi.nlm.nih.gov/genome/
NCBI*: National Center for Biotechnology Information – Transcriptome Shotgun Assembly Sequence Database, URL: www.ncbi.nlm.nih.gov/genbank/tsa/
MPIPZ: Max Planck Institute for Plant Breeding Research – Genomic Resource, URL: chi.mpipz.mpg.de
DRYAD: Dryad digital repository, URL: datadryad.org
GIGA DB: GigaDB data repository, URL: gigadb.org
TG DB: TreeGenes database, URL: treegenesdb.org
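As a rough illustration of how the coordinates in Table 4.1 relate to TR length, the Python sketch below converts a start and end coordinate into a length in nucleotides. It assumes that both coordinates are 1-based and inclusive, which the table does not state explicitly. The helper name `tr_length` is hypothetical, and the three rows are copied from the table.

```python
def tr_length(start, end):
    """Length in nucleotides of a locus given comma-formatted coordinates.

    Assumption: coordinates are 1-based and inclusive, so the length is
    end - start + 1.
    """
    start_pos = int(start.replace(",", ""))
    end_pos = int(end.replace(",", ""))
    if end_pos < start_pos:
        raise ValueError("end coordinate precedes start coordinate")
    return end_pos - start_pos + 1


# (species, start, end) rows copied from Table 4.1
rows = [
    ("Daucus carota", "27,698,101", "27,698,376"),
    ("Arabidopsis lyrata", "6,235", "6,501"),
    ("Picea glauca", "4,036", "4,382"),
]
lengths = {species: tr_length(start, end) for species, start, end in rows}
```

Because the 5’ and 3’ ends are predicted rather than mapped (notes a and b), lengths computed this way are estimates.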

REFERENCES

Altschul, S. F., Gish, W., Miller, W., Myers, E. W., & Lipman, D. J. (1990). Basic local alignment search tool. Journal of Molecular Biology, 215(3), 403-410.

Armanios, M. Y. (2009). Syndromes of telomere shortening. Annual Review of Genomics and Human Genetics, 10, 45-61.

Autexier, C., & Greider, C. W. (1995). Boundary elements of the Tetrahymena telomerase RNA template and alignment domains. Genes & Development, 9(18), 2227-2239.

Autexier, C., & Greider, C. W. (1998). Mutational analysis of the Tetrahymena telomerase RNA: identification of residues affecting telomerase activity in vitro. Nucleic Acids Research, 26(3), 787-795.

Baumann, P., & Cech, T. R. (2001). Pot1, the Putative Telomere End-Binding Protein in Fission Yeast and Humans. Science, 292(5519), 1171-1175.

Bautista-Espana, D., Anastacio-Marcelino, E., Horta-Valerdi, G., Celestino-Montes, A., Kojic, M., Negrete-Abascal, E., Reyes-Cervantes, H., Vazquez-Cruz, C., Guzman, P., & Sanchez-Alonso, P. (2014). The telomerase reverse transcriptase subunit from the dimorphic fungus Ustilago maydis. PLOS ONE, 9(10), e109981.

Bilaud, T., Brun, C., Ancelin, K., Koering, C. E., Laroche, T., & Gilson, E. (1997). Telomeric localization of TRF2, a novel human telobox protein. Nature Genetics, 17(2), 236-239.

Blackburn, E. H., & Collins, K. (2011). Telomerase: an RNP enzyme synthesizes DNA. Cold Spring Harbor Perspectives in Biology, 3(5).

Blackburn, E. H., & Gall, J. G. (1978). A tandemly repeated sequence at the termini of the extrachromosomal ribosomal RNA genes in Tetrahymena. Journal of Molecular Biology, 120(1), 33-53.

Bley, C. J., Qi, X., Rand, D. P., Borges, C. R., Nelson, R. W., & Chen, J. J.-L. (2011). RNA–protein binding interface in the telomerase ribonucleoprotein. Proceedings of the National Academy of Sciences of the United States of America, 108(51), 20333-20338.

Bosoy, D., & Lue, N. F. (2001). Functional analysis of conserved residues in the putative "finger" domain of telomerase reverse transcriptase. Journal of Biological Chemistry, 276(49), 46305-46312.

Bourlat, S. J., Juliusdottir, T., Lowe, C. J., Freeman, R., Aronowicz, J., Kirschner, M., Lander, E. S., Thorndyke, M., Nakano, H., Kohn, A. B., Heyland, A., Moroz, L. L., Copley, R. R., & Telford, M. J. (2006). Deuterostome phylogeny reveals monophyletic chordates and the new phylum Xenoturbellida. Nature, 444(7115), 85-88.

Box, J. A., Bunch, J. T., Tang, W., & Baumann, P. (2008). Spliceosomal cleavage generates the 3' end of telomerase RNA. Nature, 456(7224), 910-914.

Broccoli, D., Smogorzewska, A., Chong, L., & de Lange, T. (1997). Human telomeres contain two distinct Myb-related proteins, TRF1 and TRF2. Nature Genetics, 17(2), 231-235.

Brown, Y., Abraham, M., Pearl, S., Kabaha, M. M., Elboher, E., & Tzfati, Y. (2007). A critical three-way junction is conserved in budding yeast and vertebrate telomerase RNAs. Nucleic Acids Research, 35(18), 6280-6289.

Bryan, T. M., Goodrich, K. J., & Cech, T. R. (2000). Telomerase RNA bound by protein motifs specific to telomerase reverse transcriptase. Molecular Cell, 6(2), 493-499.

Casacuberta, E. (2017). Drosophila: Retrotransposons Making up Telomeres. Viruses, 9(7).

Červenák, F., Juríková, K., Devillers, H., Kaffe, B., Khatib, A., Bonnell, E., Sopkovičová, M., Wellinger, R. J., Nosek, J., Tzfati, Y., Neuvéglise, C., & Tomáška, Ľ. (2019). Identification of telomerase RNAs in species of the Yarrowia clade provides insights into the co-evolution of telomerase, telomeric repeats and telomere-binding proteins. Scientific Reports, 9(1), 13365.

Challis, R. J., Kumar, S., Stevens, L., & Blaxter, M. (2017). GenomeHubs: simple containerized setup of a custom Ensembl database and web server for any species. Database, 2017.

Chapon, C., Cech, T. R., & Zaug, A. J. (1997). Polyadenylation of telomerase RNA in budding yeast. RNA, 3(11), 1337-1351.

Chen, J. J.-L., Blasco, M. A., & Greider, C. W. (2000). Secondary structure of vertebrate telomerase RNA. Cell, 100(5), 503-514.

Chen, J. J.-L., & Greider, C. W. (2003). Determinants in mammalian telomerase RNA that mediate enzyme processivity and cross-species incompatibility. The EMBO Journal, 22(2), 304-314.

Chen, J. J.-L., & Greider, C. W. (2003). Template boundary definition in mammalian telomerase. Genes & Development, 17(22), 2747-2752.

Chen, J. J.-L., & Greider, C. W. (2004). An emerging consensus for telomerase RNA structure. Proceedings of the National Academy of Sciences of the United States of America, 101(41), 14683-14684.

Chen, J. J.-L., & Greider, C. W. (2005). Functional analysis of the pseudoknot structure in human telomerase RNA. Proceedings of the National Academy of Sciences of the United States of America, 102(23), 8080-8085; discussion 8077-8089.

Chen, J. J.-L., Opperman, K. K., & Greider, C. W. (2002). A critical stem-loop structure in the CR4-CR5 domain of mammalian telomerase RNA. Nucleic Acids Research, 30(2), 592-597.

Cheng, X., & Roberts, R. J. (2001). AdoMet-dependent methylation, DNA methyltransferases and base flipping. Nucleic Acids Research, 29(18), 3784-3795.

Choi, K. H., Farrell, A. S., Lakamp, A. S., & Ouellette, M. M. (2011). Characterization of the DNA binding specificity of Shelterin complexes. Nucleic Acids Research, 39(21), 9206-9223.

Chong, L., van Steensel, B., Broccoli, D., Erdjument-Bromage, H., Hanish, J., Tempst, P., & de Lange, T. (1995). A human telomeric protein. Science, 270(5242), 1663-1667.

Cifuentes-Rojas, C., Kannan, K., Tseng, L., & Shippen, D. E. (2011). Two RNA subunits and POT1a are components of Arabidopsis telomerase. Proceedings of the National Academy of Sciences of the United States of America, 108(1), 73-78.

Cifuentes-Rojas, C., Nelson, A. D., Boltz, K. A., Kannan, K., She, X., & Shippen, D. E. (2012). An alternative telomerase RNA in Arabidopsis modulates enzyme activity in response to DNA damage. Genes & Development, 26(22), 2512-2523.

Counter, C. M., Meyerson, M., Eaton, E. N., & Weinberg, R. A. (1997). The catalytic subunit of yeast telomerase. Proceedings of the National Academy of Sciences of the United States of America, 94(17), 9202-9207.

Dandjinou, A. T., Lévesque, N., Larose, S., Lucier, J.-F., Abou Elela, S., & Wellinger, R. J. (2004). A phylogenetically based secondary structure for the yeast telomerase RNA. Current Biology, 14(13), 1148-1158.

de Lange, T. (2018). Shelterin-Mediated Telomere Protection. Annual Review of Genetics, 52(1), 223-247.

Drosopoulos, W. C., & Prasad, V. R. (2009). Telomerase-specific T Motif is a Restrictive Determinant of Repetitive Reverse Transcription by Human Telomerase. Molecular and Cellular Biology.

Egan, E. D., & Collins, K. (2010). Specificity and stoichiometry of subunit interactions in the human telomerase holoenzyme assembled in vivo. Molecular and Cellular Biology, 30, 2775-2786.

Egan, E. D., & Collins, K. (2012). Biogenesis of telomerase ribonucleoproteins. RNA (New York, N.Y.), 18(10), 1747-1759.

Fajkus, J., Sýkorová, E., & Leitch, A. R. (2005). Telomeres in evolution and evolution of telomeres. Chromosome Research, 13(5), 469-479.

Fajkus, P., Peska, V., Sitova, Z., Fulneckova, J., Dvorackova, M., Gogela, R., Sykorova, E., Hapala, J., & Fajkus, J. (2016). Allium telomeres unmasked: the unusual telomeric sequence (CTCGGTTATGGG)n is synthesized by telomerase. Plant Journal, 85(3), 337-347.

Fajkus, P., Peska, V., Zavodnik, M., Fojtova, M., Fulneckova, J., Dobias, S., Kilar, A., Dvorackova, M., Zachova, D., Necasova, I., Sims, J., Sykorova, E., & Fajkus, J. (2019). Telomerase RNAs in land plants. Nucleic Acids Research, 47(18), 9842-9856.

Finger, S. N., & Bryan, T. M. (2008). Multiple DNA-binding sites in Tetrahymena telomerase. Nucleic Acids Research, 36(4), 1260-1272.

Fisher, T. S., & Zakian, V. A. (2005). Ku: a multifunctional protein involved in telomere maintenance. DNA Repair (Amst), 4(11), 1215-1226.

Fu, D., & Collins, K. (2007). Purification of Human Telomerase Complexes Identifies Factors Involved in Telomerase Biogenesis and Telomere Length Regulation. Molecular Cell, 28(5), 773-785.

Gilley, D., & Blackburn, E. H. (1999). The telomerase RNA pseudoknot is critical for the stable assembly of a catalytically active ribonucleoprotein. Proceedings of the National Academy of Sciences of the United States of America, 96(12), 6621-6625.

Gillis, A. J., Schuller, A. P., & Skordalakes, E. (2008). Structure of the Tribolium castaneum telomerase catalytic subunit TERT. Nature, 455(7213), 633-637.

Girard, J.-P., Caizergues-Ferrer, M., & Lapeyre, B. (1993). The SpGARI gene of Schizosaccharomyces pombe encodes the functional homologue of the snoRNP protein GAR1 of Saccharomyces cerevisiae. Nucleic Acids Research, 21(9), 2149-2155.

Greider, C. W., & Blackburn, E. H. (1985). Identification of a specific telomere terminal transferase activity in Tetrahymena extracts. Cell, 43(2 Pt 1), 405-413.

Greider, C. W., & Blackburn, E. H. (1987). The telomere terminal transferase of Tetrahymena is a ribonucleoprotein enzyme with two kinds of primer specificity. Cell, 51(6), 887-898.

Greider, C. W., & Blackburn, E. H. (1989). A telomeric sequence in the RNA of Tetrahymena telomerase required for telomere repeat synthesis. Nature, 337(6205), 331-337.

Gunisova, S., Elboher, E., Nosek, J., Gorkovoy, V., Brown, Y., Lucier, J. F., Laterreur, N., Wellinger, R. J., Tzfati, Y., & Tomaska, L. (2009). Identification and comparative analysis of telomerase RNAs from Candida species reveal conservation of functional elements. RNA, 15(4), 546-559.

Guzmán, P. A., & Sánchez, J. G. (1994). Characterization of telomeric regions from Ustilago maydis. Microbiology, 140(3), 551-557.

Haas, B. J., Papanicolaou, A., Yassour, M., Grabherr, M., Blood, P. D., Bowden, J., Couger, M. B., Eccles, D., Li, B., Lieber, M., Macmanes, M. D., Ott, M., Orvis, J., Pochet, N., Strozzi, F., Weeks, N., Westerman, R., William, T., Dewey, C. N., Henschel, R., Leduc, R. D., Friedman, N., & Regev, A. (2013). De novo transcript sequence reconstruction from RNA-seq using the Trinity platform for reference generation and analysis. Nature Protocols, 8, 1494-1512.

Hamma, T., Reichow, S. L., Varani, G., & Ferré-D'Amaré, A. R. (2005). The Cbf5–Nop10 complex is a molecular bracket that organizes box H/ACA RNPs. Nature Structural & Molecular Biology, 12(12), 1101-1107.

Harley, C. B., Futcher, A. B., & Greider, C. W. (1990). Telomeres shorten during ageing of human fibroblasts. Nature, 345(6274), 458-460.

Harley, C. B., Vaziri, H., Counter, C. M., & Allsopp, R. C. (1992). The telomere hypothesis of cellular aging. Experimental Gerontology, 27(4), 375-382.

Harrington, L., Zhou, W., McPhail, T., Oulton, R., Yeung, D. S., Mar, V., Bass, M. B., & Robinson, M. O. (1997). Human telomerase contains evolutionarily conserved catalytic and structural subunits. Genes & Development, 11(23), 3109-3115.

Hayflick, L. (1965). The limited in vitro lifetime of human diploid cell strains. Experimental Cell Research, 37, 614-636.

Hinkley, C. S., Blasco, M. A., Funk, W. D., Feng, J., Villeponteau, B., Greider, C. W., & Herr, W. (1998). The mouse telomerase RNA 5′-end lies just upstream of the telomerase template sequence. Nucleic Acids Research, 26(2), 532-536.

Hoffman, H., Rice, C., & Skordalakes, E. (2017). Structural Analysis Reveals the Deleterious Effects of Telomerase Mutations in Telomerase-Associated Bone Marrow Failure Syndromes. Journal of Biological Chemistry.

Hossain, S., Singh, S., & Lue, N. F. (2002). Functional analysis of the C-terminal extension of telomerase reverse transcriptase. A putative "thumb" domain. Journal of Biological Chemistry, 277(39), 36174-36180.

Houghtaling, B. R., Cuttonaro, L., Chang, W., & Smith, S. (2004). A dynamic molecular link between the telomere length regulator TRF1 and the chromosome end protector TRF2. Current Biology, 14(18), 1621-1631.

Hsu, M., McEachern, M. J., Dandjinou, A. T., Tzfati, Y., Orr, E., Blackburn, E. H., & Lue, N. F. (2007). Telomerase core components protect Candida telomeres from aberrant overhang accumulation. Proceedings of the National Academy of Sciences of the United States of America, 104(28), 11682-11687.

Huang, J., Brown, A. F., Wu, J., Xue, J., Bley, C. J., Rand, D. P., Wu, L., Zhang, R., Chen, J. J.-L., & Lei, M. (2014). Structural basis for protein-RNA recognition in telomerase. Nature Structural & Molecular Biology, 21, 507.

Huard, S., Moriarty, T. J., & Autexier, C. (2003). The C terminus of the human telomerase reverse transcriptase is a determinant of enzyme processivity. Nucleic Acids Research, 31(14), 4059-4070.

Ip, J. C. H., Mu, H., Chen, Q., Sun, J., Ituarte, S., Heras, H., Van Bocxlaer, B., Ganmanee, M., Huang, X., & Qiu, J. W. (2018). AmpuBase: a transcriptome database for eight species of apple snails (Gastropoda: Ampullariidae). BMC Genomics, 19(1), 179.

Jacobs, S. A., Podell, E. R., & Cech, T. R. (2006). Crystal structure of the essential N-terminal domain of telomerase reverse transcriptase. Nature Structural & Molecular Biology, 13(3), 218-225.

Jády, B. E., Bertrand, E., & Kiss, T. (2004). Human telomerase RNA and box H/ACA scaRNAs share a common Cajal body-specific localization signal. Journal of Cell Biology, 164(5), 647-652.

Jansson, L. I., Akiyama, B. M., Ooms, A., Lu, C., Rubin, S. M., & Stone, M. D. (2015). Structural basis of template-boundary definition in Tetrahymena telomerase. Nature Structural & Molecular Biology, 22, 883.

Jiang, J., Chan, H., Cash, D. D., Miracco, E. J., Ogorzalek Loo, R. R., Upton, H. E., Cascio, D., O'Brien Johnson, R., Collins, K., Loo, J. A., Zhou, Z. H., & Feigon, J. (2015). Structure of Tetrahymena telomerase reveals previously unknown subunits, functions, and interactions. Science, 350(6260), aab4070.

Jiang, J., Miracco, E. J., Hong, K., Eckert, B., Chan, H., Cash, D. D., Min, B., Zhou, Z. H., Collins, K., & Feigon, J. (2013). The architecture of Tetrahymena telomerase holoenzyme. Nature, 496(7444), 187-192.

Jiang, J., Wang, Y., Sušac, L., Chan, H., Basu, R., Zhou, Z. H., & Feigon, J. (2018). Structure of Telomerase with Telomeric DNA. Cell, 173(5), 1179-1190.e1113.

Kabaha, M. M., Zhitomirsky, B., Schwartz, I., & Tzfati, Y. (2008). The 5' arm of Kluyveromyces lactis telomerase RNA is critical for telomerase function. Molecular and Cellular Biology, 28(6), 1875-1882.

Kachouri-Lafond, R., Dujon, B., Gilson, E., Westhof, E., Fairhead, C., & Teixeira, M. (2009). Large telomerase RNA, telomere length heterogeneity and escape from senescence in Candida glabrata. FEBS Letters, 583(22), 3605-3610.

Kämper, J. (2004). A PCR-based system for highly efficient generation of gene replacement mutants in Ustilago maydis. Molecular Genetics and Genomics, 271(1), 103-110.

Kämper, J., Kahmann, R., Bölker, M., Ma, L.-J., Brefort, T., Saville, B. J., Banuett, F., Kronstad, J. W., Gold, S. E., Müller, O., Perlin, M. H., Wösten, H. A. B., de Vries, R., Ruiz-Herrera, J., Reynaga-Peña, C. G., Snetselaar, K., McCann, M., Pérez-Martín, J., Feldbrügge, M., Basse, C. W., Steinberg, G., Ibeas, J. I., Holloman, W., Guzman, P., Farman, M., Stajich, J. E., Sentandreu, R., González-Prieto, J. M., Kennell, J. C., Molina, L., Schirawski, J., Mendoza-Mendoza, A., Greilinger, D., Münch, K., Rössel, N., Scherer, M., Vraneš, M., Ladendorf, O., Vincon, V., Fuchs, U., Sandrock, B., Meng, S., Ho, E. C. H., Cahill, M. J., Boyce, K. J., Klose, J., Klosterman, S. J., Deelstra, H. J., Ortiz-Castellanos, L., Li, W., Sanchez-Alonso, P., Schreier, P. H., Häuser-Hahn, I., Vaupel, M., Koopmann, E., Friedrich, G., Voss, H., Schlüter, T., Margolis, J., Platt, D., Swimmer, C., Gnirke, A., Chen, F., Vysotskaia, V., Mannhaupt, G., Güldener, U., Münsterkötter, M., Haase, D., Oesterheld, M., Mewes, H.-W., Mauceli, E. W., DeCaprio, D., Wade, C. M., Butler, J., Young, S., Jaffe, D. B., Calvo, S., Nusbaum, C., Galagan, J., & Birren, B. W. (2006). Insights from the genome of the biotrophic fungal plant pathogen Ustilago maydis. Nature, 444(7115), 97-101.

Kannan, R., Helston, R. M., Dannebaum, R. O., & Baumann, P. (2015). Diverse mechanisms for spliceosome-mediated 3′ end processing of telomerase RNA. Nature Communications, 6(1), 6104.

Kim, S. H., Kaminker, P., & Campisi, J. (1999). TIN2, a new regulator of telomere length in human cells. Nature Genetics, 23(4), 405-412.

Kinal, H., Park, C. M., & Bruenn, J. A. (1993). A family of Ustilago maydis expression vectors: new selectable markers and promoters. Gene, 127(1), 151-152.

Kohlstaedt, L., Wang, J., Friedman, J., Rice, P., & Steitz, T. (1992). Crystal structure at 3.5 A resolution of HIV-1 reverse transcriptase complexed with an inhibitor. Science, 256(5065), 1783-1790.

Kojic, M., Zhou, Q., Lisby, M., & Holloman, W. K. (2006). Rec2 interplay with both Brh2 and Rad51 balances recombinational repair in Ustilago maydis. Molecular and Cellular Biology, 26(2), 678-688.

Kuprys, P. V., Davis, S. M., Hauer, T. M., Meltser, M., Tzfati, Y., & Kirk, K. E. (2013). Identification of Telomerase RNAs from Filamentous Fungi Reveals Conservation with Vertebrates and Yeasts. PLOS ONE, 8(3), e58661.

Lai, A. G., Pouchkina-Stantcheva, N., Di Donfrancesco, A., Kildisiute, G., Sahu, S., & Aboobaker, A. A. (2017). The protein subunit of telomerase displays patterns of dynamic evolution and conservation across different metazoan taxa. BMC Evolutionary Biology, 17(1), 107-107.

Lai, C. K., Miller, M. C., & Collins, K. (2002). Template boundary definition in Tetrahymena telomerase. Genes & Development, 16(4), 415-420.

Lai, C. K., Miller, M. C., & Collins, K. (2003). Roles for RNA in telomerase nucleotide and repeat addition processivity. Molecular Cell, 11(6), 1673-1683.

Lai, C. K., Mitchell, J. R., & Collins, K. (2001). RNA binding domain of telomerase reverse transcriptase. Molecular and Cellular Biology, 21(4), 990-1000.

Langmead, B., & Salzberg, S. L. (2012). Fast gapped-read alignment with Bowtie 2. Nature Methods, 9(4), 357-359.

Lattmann, S., Stadler, M. B., Vaughn, J. P., Akman, S. A., & Nagamine, Y. (2011). The DEAH-box RNA helicase RHAU binds an intramolecular RNA G-quadruplex in TERC and associates with telomerase holoenzyme. Nucleic Acids Research, 39(21), 9390-9404.

Leonardi, J., Box, J. A., Bunch, J. T., & Baumann, P. (2008). TER1, the RNA subunit of fission yeast telomerase. Nature Structural & Molecular Biology, 15(1), 26-33.

Li, B., Oestreich, S., & de Lange, T. (2000). Identification of human Rap1: implications for telomere evolution. Cell, 101(5), 471-483.

Li, Y., Podlevsky, J. D., Marz, M., Qi, X., Hoffmann, S., Stadler, P. F., & Chen, J. J.-L. (2013). Identification of purple sea urchin telomerase RNA using a next-generation sequencing based approach. RNA, 19(6), 852-860.

Lin, J., Ly, H., Hussain, A., Abraham, M., Pearl, S., Tzfati, Y., Parslow, T. G., & Blackburn, E. H. (2004). A universal telomerase RNA core structure includes structured motifs required for binding the telomerase reverse transcriptase protein. Proceedings of the National Academy of Sciences of the United States of America, 101(41), 14713-14718.

Lingner, J., Hendrick, L. L., & Cech, T. R. (1994). Telomerase RNAs of different ciliates have a common secondary structure and a permuted template. Genes & Development, 8(16), 1984-1998.

Lingner, J., Hughes, T. R., Shevchenko, A., Mann, M., Lundblad, V., & Cech, T. R. (1997). Reverse transcriptase motifs in the catalytic subunit of telomerase. Science, 276, 561-567.

Liu, D., Safari, A., O'Connor, M. S., Chan, D. W., Laegeler, A., Qin, J., & Songyang, Z. (2004). PTOP interacts with POT1 and regulates its localization to telomeres. Nature Cell Biology, 6(7), 673-680.

Logeswaran, D., & Chen, J. J.-L. (2019). Effects of Telomerase Activation. In D. Gu & M. E. Dupre (Eds.), Encyclopedia of Gerontology and Population Aging (pp. 1-8). Cham: Springer International Publishing.

Lue, N. F. (2005). A physical and functional constituent of telomerase anchor site. Journal of Biological Chemistry, 280(28), 26586-26591.

Lue, N. F., & Li, Z. (2007). Modeling and structure function analysis of the putative anchor site of yeast telomerase. Nucleic Acids Research, 35(15), 5213-5222.

Ly, H., Blackburn, E. H., & Parslow, T. G. (2003). Comprehensive structure-function analysis of the core domain of human telomerase RNA. Molecular and Cellular Biology, 23(19), 6849-6856.

Maiorano, D., Brimage, L. J., Leroy, D., & Kearsey, S. E. (1999). Functional conservation and cell cycle localization of the Nhp2 core component of H + ACA snoRNPs in fission and budding yeasts. Experimental Cell Research, 252(1), 165-174.

Makarov, V. L., Hirose, Y., & Langmore, J. P. (1997). Long G tails at both ends of human chromosomes suggest a C strand degradation mechanism for telomere shortening. Cell, 88(5), 657-666.

Marz, M., Gruber, A. R., Höner Zu Siederdissen, C., Amman, F., Badelt, S., Bartschat, S., Bernhart, S. H., Beyer, W., Kehr, S., Lorenz, R., Tanzer, A., Yusuf, D., Tafer, H., Hofacker, I. L., & Stadler, P. F. (2011). Animal snoRNAs and scaRNAs with exceptional structures. RNA Biology, 8(6), 938-946.

Mason, D. X., Goneska, E., & Greider, C. W. (2003). Stem-loop IV of Tetrahymena telomerase RNA stimulates processivity in trans. Molecular and Cellular Biology, 23(16), 5606-5613.

McClintock, B. (1939). The Behavior in Successive Nuclear Divisions of a Chromosome Broken at Meiosis. Proceedings of the National Academy of Sciences of the United States of America, 25(8), 405-416.

McClintock, B. (1941). The Stability of Broken Ends of Chromosomes in Zea Mays. Genetics, 26(2), 234-282.

McCormick-Graham, M., & Romero, D. P. (1995). Ciliate telomerase RNA structural features. Nucleic Acids Research, 23(7), 1091-1097.

McEachern, M. J., & Blackburn, E. H. (1995). Runaway telomere elongation caused by telomerase RNA gene mutations. Nature, 376(6539), 403-409.

Meselson, M., & Stahl, F. W. (1958). The replication of DNA in Escherichia coli. Proceedings of the National Academy of Sciences of the United States of America, 44(7), 671-682.

Meyerson, M., Counter, C. M., Eaton, E. N., Ellisen, L. W., Steiner, P., Caddle, S. D., Ziaugra, L., Beijersbergen, R. L., Davidoff, M. J., Liu, Q., Bacchetti, S., Haber, D. A., & Weinberg, R. A. (1997). hEST2, the putative human telomerase catalytic subunit gene, is up-regulated in tumor cells and during immortalization. Cell, 90(4), 785-795.

Meyne, J., Ratliff, R. L., & Moyzis, R. K. (1989). Conservation of the human telomere sequence (TTAGGG)n among vertebrates. Proceedings of the National Academy of Sciences of the United States of America, 86(18), 7049-7053.

Milne, I., Stephen, G., Bayer, M., Cock, P. J. A., Pritchard, L., Cardle, L., Shaw, P. D., & Marshall, D. (2012). Using Tablet for visual exploration of second-generation sequencing data. Briefings in Bioinformatics, 14(2), 193-202.

Min, B., & Collins, K. (2009). An RPA-related sequence-specific DNA-binding subunit of telomerase holoenzyme is required for elongation processivity and telomere maintenance. Molecular Cell, 36(4), 609-619.

Mitchell, J. R., Cheng, J., & Collins, K. (1999). A box H/ACA small nucleolar RNA-like domain at the human telomerase RNA 3' end. Molecular and Cellular Biology, 19(1), 567-576.

Mitchell, J. R., & Collins, K. (2000). Human telomerase activation requires two independent interactions between telomerase RNA and telomerase reverse transcriptase. Molecular Cell, 6(2), 361-371.

Mitchell, M., Gillis, A., Futahashi, M., Fujiwara, H., & Skordalakes, E. (2010). Structural basis for telomerase catalytic subunit TERT binding to RNA template and telomeric DNA. Nature Structural & Molecular Biology, 17(4), 513-518.

Moriarty, T. J., Huard, S., Dupuis, S., & Autexier, C. (2002). Functional multimerization of human telomerase requires an RNA interaction domain in the N terminus of the catalytic subunit. Molecular and Cellular Biology, 22(4), 1253-1265.

Moriarty, T. J., Marie-Egyptienne, D. T., & Autexier, C. (2004). Functional organization of repeat addition processivity and DNA synthesis determinants in the human telomerase multimer. Molecular and Cellular Biology, 24(9), 3720-3733.

Moriarty, T. J., Marie-Egyptienne, D. T., & Autexier, C. (2005). Regulation of 5' template usage and incorporation of noncognate nucleotides by human telomerase. RNA, 11(9), 1448-1460.

Mosig, A., Chen, J. J.-L., & Stadler, P. F. (2007). Homology Search with Fragmented Nucleic Acid Sequence Patterns. Lecture Notes in Computer Science, 4645, 335-345.

Mosig, A., Sameith, K., & Stadler, P. (2006). Fragrep: An efficient search tool for fragmented patterns in genomic sequences. Genomics, Proteomics and Bioinformatics, 4, 56-60.

Moyzis, R. K., Buckingham, J. M., Cram, L. S., Dani, M., Deaven, L. L., Jones, M. D., Meyne, J., Ratliff, R. L., & Wu, J. R. (1988). A highly conserved repetitive DNA sequence, (TTAGGG)n, present at the telomeres of human chromosomes. Proceedings of the National Academy of Sciences of the United States of America, 85(18), 6622-6626.

Muller, H. (1938). The remaking of chromosomes. The Collecting Net, 13(8), 181-198.

Musgrove, C., Jansson, L. I., & Stone, M. D. (2018). New perspectives on telomerase RNA structure and function. Wiley Interdisciplinary Reviews RNA, 9(2).

Nakamura, T. M., Morin, G. B., Chapman, K. B., Weinrich, S. L., Andrews, W. H., Lingner, J., Harley, C. B., & Cech, T. R. (1997). Telomerase catalytic subunit homologs from fission yeast and human. Science, 277, 955-959.

Nakayama, J., Tahara, H., Tahara, E., Saito, M., Ito, K., Nakamura, H., Nakanishi, T., Ide, T., & Ishikawa, F. (1998). Telomerase activation by hTRT in human normal fibroblasts and hepatocellular carcinomas. Nature Genetics, 18(1), 65-68.

Nawrocki, E. P., & Eddy, S. R. (2013). Infernal 1.1: 100-fold faster RNA homology searches. Bioinformatics, 29(22), 2933-2935.

Nguyen, D., Grenier St-Sauveur, V., Bergeron, D., Dupuis-Sandoval, F., Scott, M. S., & Bachand, F. (2015). A Polyadenylation-Dependent 3' End Maturation Pathway Is Required for the Synthesis of the Human Telomerase RNA. Cell Reports, 13(10), 2244-2257.

Nguyen, T. H. D., Tam, J., Wu, R. A., Greber, B. J., Toso, D., Nogales, E., & Collins, K. (2018). Cryo-EM structure of substrate-bound human telomerase holoenzyme. Nature, 557(7704), 190-195.

Noël, J.-F., Larose, S., Abou Elela, S., & Wellinger, R. J. (2012). Budding yeast telomerase RNA transcription termination is dictated by the Nrd1/Nab3 non-coding RNA termination pathway. Nucleic Acids Research, 40(12), 5625-5636.

Ollis, D. L., Brick, P., Hamlin, R., Xuong, N. G., & Steitz, T. A. (1985). Structure of large fragment of Escherichia coli DNA polymerase I complexed with dTMP. Nature, 313(6005), 762-766.

Olovnikov, A. M. (1973). A theory of marginotomy. The incomplete copying of template margin in enzymic synthesis of polynucleotides and biological significance of the phenomenon. Journal of Theoretical Biology, 41(1), 181-190.

Pace, N. R., Smith, D. K., Olsen, G. J., & James, B. D. (1989). Phylogenetic comparative analysis and the secondary structure of ribonuclease P RNA--a review. Gene, 82(1), 65-75.

Pérez, G., Pangilinan, J., Pisabarro, A. G., & Ramírez, L. (2009). Telomere Organization in the Ligninolytic Basidiomycete Pleurotus ostreatus. Applied and Environmental Microbiology, 75(5), 1427-1436.

Peška, V., Fajkus, P., Fojtová, M., Dvořáčková, M., Hapala, J., Dvořáček, V., Polanská, P., Leitch, A. R., Sýkorová, E., & Fajkus, J. (2015). Characterisation of an unusual telomere motif (TTTTTTAGGG)n in the plant Cestrum elegans (Solanaceae), a species with a large genome. Plant Journal, 82(4), 644-654.

Podlevsky, J. D., Bley, C. J., Omana, R. V., Qi, X., & Chen, J. J.-L. (2008). The telomerase database. Nucleic Acids Research, 36(Database issue), D339-343.

Podlevsky, J. D., & Chen, J. J.-L. (2016). Evolutionary perspectives of telomerase RNA structure and function. RNA Biology, 13(8), 720-732.

Podlevsky, J. D., & Chen, J. J.-L. (2012). It all comes together at the ends: telomerase structure, function, and biogenesis. Mutation Research, 730(1-2), 3-11.

Podlevsky, J. D., Li, Y., & Chen, J. J.-L. (2016). The functional requirement of two structural domains within telomerase RNA emerged early in eukaryotes. Nucleic Acids Research, 44, 9891-9901.

Podlevsky, J. D., Li, Y., & Chen, J. J.-L. (2016). Structure and function of echinoderm telomerase RNA. RNA, 22(2), 204-215.

Pogacic, V., Dragon, F., & Filipowicz, W. (2000). Human H/ACA small nucleolar RNPs and telomerase share evolutionarily conserved proteins NHP2 and NOP10. Molecular and Cellular Biology, 20(23), 9028-9040.

Qi, X., Li, Y., Honda, S., Hoffmann, S., Marz, M., Mosig, A., Podlevsky, J. D., Stadler, P. F., Selker, E. U., & Chen, J. J.-L. (2013). The common ancestral core of vertebrate and fungal telomerase RNAs. Nucleic Acids Research, 41(1), 450-462.

Qi, X., Rand, D. P., Podlevsky, J. D., Li, Y., Mosig, A., Stadler, P. F., & Chen, J. J.-L. (2015). Prevalent and distinct spliceosomal 3'-end processing mechanisms for fungal telomerase RNA. Nature Communications, 2, 1-8.

Qi, X., Xie, M., Brown, A. F., Bley, C. J., Podlevsky, J. D., & Chen, J. J.-L. (2012). RNA/DNA hybrid binding affinity determines telomerase template-translocation efficiency. The EMBO Journal, 31(1), 150-161.

Qiao, F., & Cech, T. R. (2008). Triple-helix structure in telomerase RNA contributes to catalysis. Nature Structural & Molecular Biology, 15(6), 634-640.

Quinlan, A. R., & Hall, I. M. (2010). BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics, 26(6), 841-842.

Reichow, S. L., Hamma, T., Ferré-D'Amaré, A. R., & Varani, G. (2007). The structure and function of small nucleolar ribonucleoproteins. Nucleic Acids Research, 35(5), 1452-1464.

Riha, K., & Shippen, D. E. (2003). Telomere structure, function and maintenance in Arabidopsis. Chromosome Research, 11(3), 263-275.

Romi, E., Baran, N., Gantman, M., Shmoish, M., Min, B., Collins, K., & Manor, H. (2007). High-resolution physical and functional mapping of the template adjacent DNA binding site in catalytically active telomerase. Proceedings of the National Academy of Sciences of the United States of America, 104(21), 8791-8796.

Rouda, S., & Skordalakes, E. (2007). Structure of the RNA-binding domain of telomerase: implications for RNA recognition and binding. Structure, 15(11), 1403-1412.

Schuster, M., Schweizer, G., Reissmann, S., & Kahmann, R. (2016). Genome editing in Ustilago maydis using the CRISPR-Cas system. Fungal Genetics and Biology, 89, 3-9.

Sealey, D. C. F., Zheng, L., Taboski, M. A. S., Cruickshank, J., Ikura, M., & Harrington, L. A. (2010). The N-terminus of hTERT contains a DNA-binding domain and is required for telomerase activity and cellular immortalization. Nucleic Acids Research, 38(6), 2019-2035.

Seto, A. G., Umansky, K., Tzfati, Y., Zaug, A. J., Blackburn, E. H., & Cech, T. R. (2003). A template-proximal RNA paired element contributes to Saccharomyces cerevisiae telomerase activity. RNA, 9(11), 1323-1332.

Seto, A. G., Zaug, A. J., Sobel, S. G., Wolin, S. L., & Cech, T. R. (1999). Saccharomyces cerevisiae telomerase is an Sm small nuclear ribonucleoprotein particle. Nature, 401(6749), 177-180.

Sexton, A. N., & Collins, K. (2011). The 5′ Guanosine Tracts of Human Telomerase RNA Are Recognized by the G-Quadruplex Binding Domain of the RNA Helicase DHX36 and Function To Increase RNA Accumulation. Molecular and Cellular Biology, 31(4), 736-743.

Sfeir, A. J., Chai, W., Shay, J. W., & Wright, W. E. (2005). Telomere-end processing the terminal nucleotides of human chromosomes. Molecular Cell, 18(1), 131-138.

Shay, J. W., & Wright, W. E. (2019). Telomeres and telomerase: three decades of progress. Nature Reviews Genetics, 20(5), 299-309.

Shefer, K., Brown, Y., Gorkovoy, V., Nussbaum, T., Ulyanov, N. B., & Tzfati, Y. (2007). A triple helix within a pseudoknot is a conserved and essential element of telomerase RNA. Molecular and Cellular Biology, 27(6), 2130-2143.

Shippen-Lentz, D., & Blackburn, E. H. (1990). Functional evidence for an RNA template in telomerase. Science, 247(4942), 546-552.

Singh, M., Wang, Z., Koo, B.-K., Patel, A., Cascio, D., Collins, K., & Feigon, J. (2012). Structural Basis for Telomerase RNA Recognition and RNP Assembly by the Holoenzyme La Family Protein p65. Molecular Cell, 47(1), 16-26.

Soudet, J., Jolivet, P., & Teixeira, M. T. (2014). Elucidation of the DNA end-replication problem in Saccharomyces cerevisiae. Molecular Cell, 53(6), 954-964.

Steitz, T. A. (1999). DNA Polymerases: Structural Diversity and Common Mechanisms. Journal of Biological Chemistry, 274(25), 17395-17398.

Stellwagen, A. E., Haimberger, Z. W., Veatch, J. R., & Gottschling, D. E. (2003). Ku interacts with telomerase RNA to promote telomere addition at native and broken chromosome ends. Genes & Development.

Stone, M. D., Mihalusova, M., O'Connor, C. M., Prathapam, R., Collins, K., & Zhuang, X. (2007). Stepwise protein-mediated RNA folding directs assembly of telomerase ribonucleoprotein. Nature, 446(7134), 458-461.

Tang, W., Kannan, R., Blanchette, M., & Baumann, P. (2012). Telomerase RNA biogenesis involves sequential binding by Sm and Lsm complexes. Nature, 484(7393), 260-264.

Tesmer, V. M., Ford, L. P., Holt, S. E., Frank, B. C., Yi, X., Aisner, D. L., Ouellette, M., Shay, J. W., & Wright, W. E. (1999). Two inactive fragments of the integral RNA cooperate to assemble active telomerase with the human protein catalytic subunit (hTERT) in vitro. Molecular and Cellular Biology, 19(9), 6207-6216.

Theimer, C. A., Blois, C. A., & Feigon, J. (2005). Structure of the human telomerase RNA pseudoknot reveals conserved tertiary interactions essential for function. Molecular Cell, 17(5), 671-682.

Theimer, C. A., Jady, B. E., Chim, N., Richard, P., Breece, K. E., Kiss, T., & Feigon, J. (2007). Structural and functional characterization of human telomerase RNA processing and cajal body localization signals. Molecular Cell, 27(6), 869-881.

Tomlinson, C. G., Holien, J. K., Mathias, J. A., Parker, M. W., & Bryan, T. M. (2016). The C-terminal extension of human telomerase reverse transcriptase is necessary for high affinity binding to telomeric DNA. Biochimie, 128-129, 114-121.

Tseng, C.-K., Wang, H.-F., Schroeder, M. R., & Baumann, P. (2018). The H/ACA complex disrupts triplex in hTR precursor to permit processing by RRP6 and PARN. Nature Communications, 9(1), 5430.

Tzfati, Y., Fulton, T. B., Roy, J., & Blackburn, E. H. (2000). Template boundary in a yeast telomerase specified by RNA structure. Science, 288(5467), 863-867.

Vasconcelos, E. J. R., Nunes, V. S., da Silva, M. S., Segatto, M., Myler, P. J., & Cano, M. I. N. (2014). The putative Leishmania telomerase RNA (LeishTER) undergoes trans-splicing and contains a conserved template sequence. PLOS ONE, 9, e112061.

Venteicher, A. S., Abreu, E. B., Meng, Z., McCann, K. E., Terns, R. M., Veenstra, T. D., Terns, M. P., & Artandi, S. E. (2009). A human telomerase holoenzyme protein required for Cajal body localization and telomere synthesis. Science, 323(5914), 644-648.

Wang, C., & Meier, U. T. (2004). Architecture and assembly of mammalian H/ACA small nucleolar and telomerase ribonucleoproteins. The EMBO Journal, 23(8), 1857-1867.

Wang, Y., Susac, L., & Feigon, J. (2019). Structural Biology of Telomerase. Cold Spring Harbor Perspectives in Biology.

Wang, Y., Tian, R. M., Gao, Z. M., Bougouffa, S., & Qian, P. Y. (2014). Optimal eukaryotic 18S and universal 16S/18S ribosomal RNA primers and their application in a study of symbiosis. PLOS ONE, 9(3), e90053.

Wang, Y., Yesselman, J. D., Zhang, Q., Kang, M., & Feigon, J. (2016). Structural conservation in the template/pseudoknot domain of vertebrate telomerase RNA from teleost fish to human. Proceedings of the National Academy of Sciences of the United States of America, 201607411.

Watson, J. D. (1972). Origin of concatemeric T7 DNA. Nature New Biology, 239(94), 197-201.

Webb, C. J., & Zakian, V. A. (2008). Identification and characterization of the Schizosaccharomyces pombe TER1 telomerase RNA. Nature Structural & Molecular Biology, 15(1), 34-42.

Webb, C. J., & Zakian, V. A. (2015). Telomerase RNA stem terminus element affects template boundary element function, telomere sequence, and shelterin binding. Proceedings of the National Academy of Sciences of the United States of America, 112(36), 11312-11317.

Weinrich, S. L., Pruzan, R., Ma, L., Ouellette, M., Tesmer, V. M., Holt, S. E., Bodnar, A. G., Lichtsteiner, S., Kim, N. W., Trager, J. B., Taylor, R. D., Carlos, R., Andrews, W. H., Wright, W. E., Shay, J. W., Harley, C. B., & Morin, G. B. (1997). Reconstitution of human telomerase with the template RNA component hTR and the catalytic protein subunit hTRT. Nature Genetics, 17(4), 498-502.

Wilusz, J. E., Sunwoo, H., & Spector, D. L. (2009). Long noncoding RNAs: functional surprises from the RNA world. Genes & Development, 23(13), 1494-1504.

Witkin, K. L., & Collins, K. (2004). Holoenzyme proteins required for the physiological assembly and activity of telomerase. Genes & Development, 18(10), 1107-1118.

Wu, J., Okada, T., Fukushima, T., Tsudzuki, T., Sugiura, M., & Yukawa, Y. (2012). A novel hypoxic stress-responsive long non-coding RNA transcribed by RNA polymerase III in Arabidopsis. RNA Biology, 9(3), 302-313.

Wyatt, H. D. M., Lobb, D. A., & Beattie, T. L. (2007). Characterization of physical and functional anchor site interactions in human telomerase. Molecular and Cellular Biology, 27(8), 3226-3240.

Wyatt, H. D. M., West, S. C., & Beattie, T. L. (2010). InTERTpreting telomerase structure and function. Nucleic Acids Research.

Xie, M., Mosig, A., Qi, X., Li, Y., Stadler, P. F., & Chen, J. J.-L. (2008). Structure and function of the smallest vertebrate telomerase RNA from teleost fish. Journal of Biological Chemistry, 283(4), 2049-2059.

Ye, J. Z.-S., & de Lange, T. (2004). TIN2 is a tankyrase 1 PARP modulator in the TRF1 telomere length control complex. Nature Genetics, 36(6), 618-623.

Yu, E. Y., Kojic, M., Holloman, W. K., & Lue, N. F. (2013). Brh2 and Rad51 promote telomere maintenance in Ustilago maydis, a new model system of DNA repair proteins at telomeres. DNA Repair, 12(7), 472-479.

Zakian, V. A. (2009). The ends have arrived. Cell, 139(6), 1038-1040.

Zappulla, D. C., Goodrich, K., & Cech, T. R. (2005). A miniature yeast telomerase RNA functions in vivo and reconstitutes activity in vitro. Nature Structural & Molecular Biology, 12(12), 1072-1077.

Zhang, Q., Kim, N. K., & Feigon, J. (2011). Architecture of human telomerase RNA. Proceedings of the National Academy of Sciences of the United States of America, 108(51), 20325-20332.

Zhang, Q., Kim, N. K., Peterson, R. D., Wang, Z., & Feigon, J. (2010). Structurally conserved five nucleotide bulge determines the overall topology of the core domain of human telomerase RNA. Proceedings of the National Academy of Sciences of the United States of America, 107(44), 18761-18768.

APPENDIX A

SPECIES IDENTIFICATION VIA PCR AMPLIFICATION OF RIBOSOMAL RNA GENE FRAGMENTS AND SANGER SEQUENCING


Species identification. Sequences of PCR-amplified rRNA gene fragments from S. kowalevskii and S. bromophenolosus used for species identification. Sequence variations are highlighted.

APPENDIX B

PLASMID MAP OF PCM955-3XFLAG-UMATERT


Plasmid map of pCM955-3xFLAG-UmaTERT. Plasmid sequence elements and their orientations are shown (cloning: Joshua Podlevsky).

APPENDIX C

READ COVERAGE OF U. MAYDIS CANDIDATE #2 LOCUS


Sequencing read coverage for candidate #2. Illumina sequencing reads were mapped with strand specificity to the genomic locus of candidate #2. Nucleotide positions are shown on the x-axis, with the position of the putative template shown in red below the coverage map. The y-axis indicates the number of reads.

APPENDIX D

UMATR α SEQUENCE AND TEMPLATE ANNOTATION


UmaTR α-isoform sequence. Nucleotides are grouped in blocks of 10 residues, with the start and end nucleotide positions shown at the beginning and end of each line. The template is shown in red.

APPENDIX E

MULTIPLE SEQUENCE ALIGNMENT OF UMAG_03168 HOMOLOGS


Multiple sequence alignment of UMAG_03168 homologs from Ustilaginales species. The length of each protein coding sequence is shown at the end of the alignment. Shading of 85% applied.

APPENDIX F

MULTIPLE SEQUENCE ALIGNMENT OF CAR2-ORNITHINE OXO-ACID TRANSAMINASE (OAT)


Multiple sequence alignment of CAR2-Ornithine oxo-acid transaminase (OAT) homologs. The OAT homologs identified via BLAST searches from Ustilaginales species were aligned to the experimentally verified Saccharomyces cerevisiae OAT sequence. The length of each protein coding sequence is shown at the end of the alignment. Shading of 85% applied.

APPENDIX G

EXPANDED PHYLOGENETIC TREE OF LAND PLANTS


Expanded phylogenetic tree of land plants showing major clades. Plant genera are shown to the right, with the number of identified TRs in parentheses (green). Branch lengths do not correspond to evolutionary distance.

APPENDIX H

CO-AUTHOR APPROVAL

I verify that the following co-authors have approved of my use of our publication in my dissertation:

Jiarui Song (Texas A&M University)

Claudia Castillo-Gonzalez (Texas A&M University)

Yang Li (Arizona State University)

Sreyashree Bose (Texas A&M University)

Behailu Aklilu (Texas A&M University)

Zeyang Ma (China Agricultural University)

Alexander Polkhovskiy (Texas A&M University, Skolkovo Institute of Science and Technology)

Julian Chen (Arizona State University)

Dorothy Shippen (Texas A&M University)
