Maintaining Genomic Stability: from DNA Polymerase Β to Telomerase

Maintaining genomic stability: from DNA polymerase β to telomerase

By Matthew A. Schaich B.S., B.A., University of Nebraska-Lincoln, 2014

Submitted to the graduate degree program in Biochemistry and Molecular Biology and the Graduate Faculty of the University of Kansas in partial fulfillment of the requirements for the degree of Doctor of Philosophy.

______Committee Chair: Dr. Bret Freudenthal

______Dr. Christy Hagan

______Dr. Ken Peterson

______Dr. Liskin Swint-Kruse

______Dr. Dan Welch

Date Defended: 9 July, 2020

The disertation committee for Matthew A. Schaich certifies that this is

the approved version of the following dissertation:

Maintaining genomic stability: from DNA polymerase β to telomerase

______Chair: Dr. Bret Freudenthal

______Graduate Director: Dr. Aron Fenton

Date Approved: 15 July, 2020

Abstract

One vital aspect of genomic stability is the maintenance of repetitive DNA sequences at

the end of eukaryotic chromosomes known as telomeres. Telomere DNA shortens with every eukaryotic cell division, increasing the difficulty of maintaining telomere sequence.

Telomerase reverse transcriptase (TERT) restores telomeres eroded by the end-

replication problem by copying the sequence from its RNA template (TR) into telomere

DNA. Unfortunately, the mechanism by which telomerase selects correct nucleotide

triphosphate substrates to regenerate telomeres is poorly understood. Therefore, in this dissertation, we uncovered how telomerase selects canonical matched, deoxyribonucleotides to faithfully extend telomere sequences. First, we determined structures of Tribolium castaneum TERT throughout its catalytic cycle and mapped the active site residues responsible for nucleoside selection, metal coordination, triphosphate binding, and RNA template stabilization. Next, kinetically characterized TERT to insert a mismatch or ribonucleotide ~1 in 20,000 and ~1 in 25,000 insertion events, respectively,

equivalent to ~1 mismatch and ~20 ribonucleotides per 10 kilobases at biological nucleotide concentrations. Human telomerase assays determined a conserved tyrosine steric gate regulates ribonucleotide insertion into telomeres. Finally, we used DNA polymerase β as a model for telomerase to investigate how telomere-targeting nucleotide analogs are inserted into telomeres. We found that they evade polymerase fidelity mechanisms by keeping their modification away from selective active site residues, and

likely work in similar ways in the active site of telomerase. Cumulatively, our work

provides insight into how telomerase selects the proper nucleotide to maintain telomere

iii integrity and indicate a specific means by which nucleotide selection by telomerase can be evaded for a positive therapeutic outcome.

Acknowledgements

First, I would like to thank my parents, Marlin and Sherrie, for helping to instill my scientific

curiosity from a young age. I look back fondly on my childhood as a time full of discovery

and learning. Certainly, their encouragement and the sense of values that I developed

early on has shaped me into who I am.

I also will be forever grateful to my wife, Laurel, who has been with me for every step of the process. Thank you for being there for me after tough days in the lab, celebrating my successes after great days in the lab, and for putting up with the late nights (and early mornings) that I worked to push projects forward.

Thanks also to every member of the Freudenthal lab, both past and present. I firmly believe that science should be a collaborative endeavor, and the group dynamics during my PhD work have far exceeded my hopes and expectations. There are innumerable

ways that I have been helped over the years, from scientific discussions about challenging

problems, to working together to fix broken pieces of equipment.

A big thank you also goes out to the entire Biochemistry and Molecular Biology

department, for teaching me so much over the course of classes, and for always being

willing to provide expertise to scientific discussions, as well as allowing me to use

instruments not present in our lab.

Thank you to the Madison and Lila Self Graduate Fellowship, not only for providing me with the funding for my research, but also for helping me to develop so many professional skills.

Finally, I want to thank my mentor, Dr. Bret Freudenthal, for his guidance through every step of the PhD process. Even outside of science, he has been an invaluable source of

v advice throughout many aspects of my life, and had a profound role on not just making me the person I am today.

We thank Jay Nix (Molecular Biology Consortium 4.2.2 beamline at Advanced Light

Source) for aid in remote data collection and help with data analysis. This research used resources of the Advanced Light Source, which is a Department of Energy Office of

Science user facility under Contract DE-AC02-05CH11231. We thank Amy Whitaker

(University of Kansas Medical Center) for helpful discussion and assistance with the preparations of both manuscripts in this dissertation, and Scott Lovell (University of

Kansas) for his advice and help with crystal optimization. This research was supported by the National Institute of General Medical Science [R35-GM128562], the National

Institute of Environmental Health Sciences [R35-ES030396], [R00ES024431], and the

Madison and Lila Self Graduate Fellowship.

No conflicts of interest are declared throughout this dissertation.

Table of Contents

Chapter 1: Introduction ...... 1

1.1 A biological clock present in cultured cells ...... 2

1.1.1 A molecular view of how a cell’s biological clock ticks ...... 3

1.1.2 Turning back time: the first characterization of an enzyme that lengthens

telomeres ...... 6

1.1.3 Innovative approaches to study telomerase: the TRAP assay and beyond .... 8

1.2 Telomerase and cancer: a “universal” cancer target? ...... 11

1.2.1 Creative approaches used to target telomerase ...... 12

1.2.2 Direct telomerase inhibitors ...... 15

1.3 Mechanistic studies of telomerase ...... 18

1.3.1 The identification of TERT ...... 20

1.3.2 Structural characterization of TERT ...... 22

1.3.3 Cryo-EM structures of the telomerase holoenzyme ...... 25

1.4 Testing the ability of telomerase to choose right from wrong: studies of

telomerase with noncanonical substrates ...... 26

1.4.2 Mechanistic insight on 8-oxoG as a chain terminator from DNA polymerase β

...... 30

1.4.3 Targeting 6-thioguanine to telomeres ...... 32

1.5 Dissertation Overview ...... 34

vii

Chapter 2: Structural Characterization of the Telomerase Active Site...... 36

2.1 Abstract ...... 38

2.2 Introduction ...... 39

2.2.1 Challenges of biochemical characterization of TERT ...... 39

2.2.2 Model systems used to study TERT ...... 40

2.2 Materials and Methods ...... 41

2.3.1 Nucleic acid sequences...... 41

2.3.2 Expression and purification of tcTERT ...... 41

2.3.3 Crystallization of tcTERT ...... 45

2.3.4 Data collection and refinement ...... 46

2.3.5 Data resources ...... 46

2.4 Results ...... 46

2.4.1 Pre-nucleotide binary complex ...... 48

2.4.2 Nucleotide bound ternary complex ...... 50

2.4.3 Product complex ...... 51

2.5 Discussion ...... 54

2.5.1 TERT conformations throughout its catalytic cycle ...... 54

2.5.2 Limitations of structural studies ...... 55

Chapter 3: Kinetic characterization of the telomerase active site ...... 56

3.1 Abstract ...... 58

viii

3.2 Introduction ...... 59

3.2.1 Kinetic characterization of telomerase ...... 59

3.3 Materials and Methods ...... 61

3.3.1 Nucleic acid sequences...... 61

3.3.2 Expression and purification of tcTERT ...... 61

3.3.3 Pre-steady-state kinetic characterization of tcTERT ...... 61

3.3.4 Human telomerase expression ...... 63

3.3.5 Human telomerase purification ...... 63

3.3.6 32P-end-labeling of DNA primers ...... 64

3.3.7 Telomerase activity assay ...... 64

3.4 Results ...... 65

3.4.1 Fidelity and sugar selectivity of TERT ...... 65

3.4.2 The steric gate of telomerase ...... 71

3.4.3 Ribonucleotide insertion by human telomerase ...... 73

3.5 Discussion ...... 76

3.5.1 Telomeric Ribonucleotides ...... 77

3.5.2 Telomerase prevents the insertion of aberrant nucleotide triphosphates ...... 79

3.5.3 Acknowledgements ...... Error! Bookmark not defined.

3.5.4 Competing interests ...... Error! Bookmark not defined.

Chapter 4: Therapeutic nucleotide insertions by DNA polymerase β ...... 81

4.1 Abstract ...... 84

4.2 Introduction ...... 86

4.2.1 Nucleotide analogs in the clinic ...... 87

4.3 Materials and Methods ...... 90

4.3.1 DNA sequences used in this study ...... 91

4.3.2 Expression and purification of DNA Polymerase β ...... 91

4.3.3 Pol β/DNA complex crystallization ...... 92

4.3.4 Crystallographic data collection and refinement ...... 92

4.4 Results ...... 93

4.4.1 Structures with incoming 5-Fluorouracil ...... 95

4.4.2 Structures with incoming 6-Thioguanine ...... 97

4.4.3 Structures with incoming 5-Formylcytosine ...... 98

4.4.4 Structures with incoming 5-Formyluracil ...... 100

4.5 Discussion ...... 102

4.6.1 Comparison between the structures of modified nucleotides and native

nucleotides ...... 104

4.5.2 Concealing base modifications by placing them in the major groove ...... 105

4.5.3 Applying results of DNA polymerase β to other nucleic acid enzymes ...... 107

4.5.4 Funding Sources ...... Error! Bookmark not defined.

4.5.5 Acknowledgements ...... Error! Bookmark not defined.

4.5.6 Abbreviations ...... Error! Bookmark not defined.

Chapter 5: Discussion and future directions ...... 109

5.1 Maintenance of telomere integrity by the telomerase active site...... 111

5.1.1 Self-proofreading by telomerase ...... 112

5.1.2 Advances in structural biology of telomerase ...... 117

5.2 Future directions and concluding remarks ...... 120

Appendix A: Full TERT enzymology parameters ...... 134

Appendix B: Molecular docking with AutoDock Vina ...... 138

B.1 Advantages of using molecular docking techniques ...... 138

B.2 Challenges associated with molecular docking ...... 139

B.3 Complementary techniques to molecular docking ...... 141

B.3.1 Complementary approaches that do not require protein purification ...... 141

B.3.2 Purified protein dependent complementary techniques ...... 142

B.4 Example Molecular Docking Protocol with Autodock Vina ...... 144

B.4.1 Context: Information that prompted molecular docking experiments ...... 144

B.4.2 Closing notes on interpretation ...... 152

List of Figures

Figure 1.1: The end replication problem drives telomere shortening with each cell division.

...... 4

Figure 1.2: The telomerase reverse transcriptase (TERT) catalytic cycle...... 10

Figure 1.3: Telomerase as a target for cancer therapy...... 14

Figure 1.4: Structural characterization of a telomere G-quadruplex ...... 16

Figure 1.5: Imetelstat inhibits the telomerase catalytic cycle...... 19

Figure 1.6: Vacuum electrostatics of the TEN domain...... 23

Figure 1.7: The C-terminal (CTE) domain of tcTERT (blue) and human TERT (green)

exhibit similar secondary structures...... 27

Figure 1.8: Two modified forms of guanine impacting telomere integrity...... 29

Figure 1.9: 8-oxoG at the primer terminus alters the insertion efficiency of pol β...... 33

Figure 2.1: Alignments of several telomerase reverse transcriptase homologs...... 42

Figure 2.2: Overlay of TERT from Tribolium castaneum with other TERT structures. ... 43

Figure 2.3: The prenucleotide binary structure of the telomerase catalytic cycle...... 49

Figure 2.4: TERT ternary structure...... 52

Figure 2.5: TERT product structure...... 53

Figure 3.1 TERT WT dGTP insertion kinetics, fit two different ways...... 67

Figure 3.2: TERT fidelity and sugar discrimination...... 70

Figure 3.3: The steric gate of TERT prevents insertion of rNTPs...... 72

Figure 3.4: The steric gate of human telomerase...... 75

Figure 3.5: A model for how telomerase preserves telomeric integrity and biological

implications of telomeric ribonucleotides...... 78

xii

Figure 4.1 Pol β ternary ground state with incoming 5-FdUTP base pairing to A...... 96

Figure 4.2: Structural features of the precatalytic pol β ternary ground state with incoming

6-TdGTP...... 99

Figure 4.3: Structure of incoming 5-FodCTP, primed for insertion by pol β...... 101

Figure 4.4: Structure of pol β bound to gapped DNA and 5-FodUTP...... 103

Figure 4.5: Other polymerases have minimal contacts with the major groove side of

incoming nucleotides...... 106

Figure 5.1 : Possible eventual fates of errors incorporated by telomerase...... 115

Figure 5.2: Advances in highest TERT resolutions over time...... 119

Figure A.1: Pre-steady-state kinetics of additional TERT active site mutants...... 134

Figure A.2: Complete graphs of selected TERT pre-steady-state kinetics ...... 135

Figure A.3: Residual analysis of (A) single exponential and (B) double exponential fits of

the kinetic data from WT TERT inserting dGTP across from rC...... 136

Figure B.1 :Selection of JNK1 structure...... 145

Figure B.2: Preparing a structure from the PDB for molecular docking...... 146

Figure B.3 Preparing the structure for a molecular docking experiment...... 147

Figure B.4: Setting up a Grid Box for the docking search into JNK1...... 148

Figure B.5: Example syntax for using AutoDock Vina with the command line...... 149

Figure B.6: Viewing the molecular docking results...... 150

Figure B.7: Orientations of published (green) vs docked (blue) quercetagen into JNK1.

...... 151

xiii

List of Tables

Table 1.1: Telomerase architecture of five different TERTs, including Homo sapiens,

Tetrahymena thermophila, Saccharomyces cerevisiae, and Tribolium castaneum.

...... 21

Table 2.1: The conservation and function of TERT active site residues...... 44

Table 2.2: Data collection and refinement statistics for tcTERT structures...... 47

Table 3.1: Kinetic parameters and errors for tcTERT pre-steady-state kinetics of single

nucleotide incorporation...... 68

Table 4.1: Selected nucleoside analogs and their clinical use...... 90

Table 4.2: Data Collection and Refinement Statistics of Ternary Pre-catalytic Pol β:DNA

Co-complexes with Incoming Nucleotide Analogues ...... 94

Table A.1: Full kinetic parameters of each curve used to generate Kobs and target

engagement values in tcTERT pre-steady-state kinetics ...... 137

xiv

List of Abbreviations

5-FU: 5-fluorouracil

5-FdUTP: 5-fluorodeoxyuridine triphosphate

6-TG: 6-thioguanine

6-TdGTP: 6-thiodeoxyguanidine triphosphate

5-FoC: 5-formylcytidine

5-FodCTP: 5-formyldeoxycytidine triphosphate

5-FoU: 5-formyluridine

5-FodUTP: 5-formyldeoxyuridine triphosphate pol β: DNA polymerase β dsDNA: double stranded deoxyribonucleic acid pol: polymerase nt: nucleotide

A: adenosine

C: cytosine

T: thymine

G: guanine

Chapter 1: Introduction

1.1 A biological clock present in cultured cells

For much of the 20th century, the scientific dogma stated that under ideal conditions, eukaryotic cells would divide indefinitely. In other words, aging was not thought to be relevant at the cellular level. Much of the evidence behind this belief stemmed from experiments conducted by Dr. Alexis Carrel, a premier surgeon and Nobel

Laureate 1. In these experiments, multiple cell types were isolated and monitored for a period of several months. Throughout the timeframe, many of the cell types did not appear to show a reduction in growth rate, which led to the prediction that perfect conditions would allow for permanent life for tissues outside of organisms. The conclusions of these experiments were widely accepted, and as further evidence of this “permanent life”, Dr.

Carrel famously kept a culture of chick heart embryo cells alive for a period of years 2.

Later, Drs. Hayflick and Moorehead published their seminal work, “The serial cultivation of human diploid cell strains”, refuting the established dogma 3. In contrast to the limitless divisions of Dr. Carrel’s cultured cells, every cell line used in this study reached a unique but reproducible number of divisions before stopping completely.

Shortly after a cell line stopped dividing, a death phase was observed that quickly eliminated viable cells in culture. This limit varied from as few as 5 subcultivations to over

50. Furthermore, cell lines used in these studies were subjected to storage at temperatures of -70° C; even after long periods of time at these temperatures, the limit of number of subcultivations did not alter. Therefore, the “biological clock” that was counting down on these cell cultures was dependent on the number of cell divisions, not the amount of time that the cells were kept alive. Although this finding may have been controversial at the time, it established the new dogma in modern cell biology that primary

2 cells have a fixed number of times they will divide, termed the Hayflick Limit after Leonard

Hayflick 4.

How did Dr. Carrel keep his famous chick heart embryo extracts alive for so long?

The passage of time has made it impossible to know exactly what allowed these cells to evade the Hayflick limit, but several hypotheses have been proposed. One possibility is that nutrients used to grow the cells were allowing the culture to grow indefinitely. Chick embryo extract was used as a nutrient source, and if any of the chick embryo cells were not fully lysed prior to addition to the culture, addition of nutrients could have reseeded the culture, effectively resetting the molecular age of the culture 2. Significant anecdotal evidence also points towards technicians in the lab, fearful of Dr. Carrel’s wrath, intentionally reseeding the culture with fresh chick heart embryo cells in order to keep the culture alive 2. One final possibility is that the cells spontaneously transformed, activating enzymes to turn back their biological clock, using the mechanisms discussed in following sections.

1.1.1 A molecular view of how a cell’s biological clock ticks

Drs. Hayflick and Moorehead discovered that cells possess a biological clock, but it took over a decade to understand the mechanism of how that clock worked. In the mid-

1970’s, findings in the field of DNA replication provided a clue known as the “end- replication problem” (Figure 1.1) 5-7. This phenomenon works as follows: The genomes of eukaryotic cells are composed of two complementary strands of DNA, that are hybridized to one another in a reversed orientation. For replication, both of these strands need to be successfully copied by DNA polymerases. For DNA polymerases to copy the DNA, they must have a short DNA primer hybridized upstream of the sequence that needs to be

Figure 1.1: The end replication problem drives telomere shortening with each cell division. Because the double stranded DNA anneals in an antiparallel direction, DNA replication of each strand requires different mechanisms. With the leading strand, DNA polymerases can reach the 3’ end of a DNA sequence because priming occurs on the 5’ side of the DNA polymerase (left). In the case of the lagging strand, the primer for replication (green) needs to anneal to the DNA in order to start DNA synthesis. When that is complete, primer removal leaves a portion of the lagging strand not fully replicated (right). This process results in blunt ending of the overhang, and chromosome shortening at the telomere ends5.

copied. In the case of the leading strand, the DNA polymerase can bind onto a primer far

upstream from the chromosomal end and continue replicating out to the last base of the

DNA, resulting in a full-length copy of the leading strand. In contrast, the primer for the

lagging strand binds to the end of the DNA, preventing the polymerase from completely

copying the sequence. Therefore, every time eukaryotic cells copy their genome, the DNA

shortens. It was further hypothesized that non-essential genes were placed at the termini of chromosomes, termed telogenes, and that after these telogenes were exhausted, cell division would begin to eliminate vital genetic content, causing cellular aging 8. Thus, cell

division was linked to loss of genetic content, providing a basis for how cellular aging

occurs. Telogenes only remained hypothetical for a short period of time, after which Drs.

Blackburn and Gall used Sanger sequencing to identify and term them “telomeres” in the

protozoal model organism Tetrahymena thermophila (T. thermophila) 9. They found that

the telomeres of T. thermophila consist of a hexanucleotide repeat of the sequence 5’ (G-

G-G-G-T-T)n 3’.

One key to their success is that T. thermophila possesses a unique nuclear

architecture (45 copies of 275 chromosomes and 9,000 copies of a single chromosome),

compared to human cells that have only two copies of 23 chromosomes in the diploid

state 10. Because T. thermophila have so many chromosomes, they possess many more

telomere ends than a typical human cell The higher abundance of telomeric DNA then

made sequencing approaches more feasible and telomeres easier to study. These results

confirmed the idea that non-coding repetitive sequences are found at the end of

chromosomes, protecting the vital coding genes from the dangerous effects of the end-

replication problem. The foundational discovery of telomeres still left lingering questions,

however. Because genomic DNA is passed in every organism for generation to

generation, wouldn’t shortened telomeres also be heritable? Therefore, system must exist that elongates telomeres, in order to prevent the erosion of vital DNA throughout generations.

1.1.2 Turning back time: the first characterization of an enzyme that lengthens telomeres

It took another decade to discover an enzyme that was able to elongate telomeres.

Using an extensive protein purification scheme that involved at least 5 types of chromatography, Drs. Blackburn and Greider isolated an enzyme that elongated telomere sequences, which they named telomerase 11. Telomerase activity was detected using assays that tested for DNA elongation with similar techniques to Sanger sequencing 11,12.

In these assays, telomerase was incubated with both a labeled DNA primer composed of

the T. thermophila telomere sequence and all four dNTPs. Running the products of this reaction on a denaturing polyacrylamide gel yielded a laddering pattern, with each band

indicating a nucleotide inserted by telomerase. The evident laddering pattern indicated

that telomerase was inserting hundreds of subsequent nucleotides, all with no additional

RNA primer, or even a templating DNA strand present in the solution. Upon treatment

with protein denaturing conditions or RNase, telomerase activity was lost, implying that

telomerase is dependent on both protein and intrinsic RNA components. These features

separate telomerase from any other known DNA polymerase or reverse transcriptase.

Furthermore, the specificity of telomerase for telomeric DNA could be tested by varying

the sequence of the labeled DNA primer. Telomerase only elongated the DNA sequences

that composed the T. thermophila telomeric repeat sequence, implying that the

mechanism of telomere elongation must involve telomere sequence recognition 11. The means by which telomerase recognized telomeres, however, still remained a pressing question in the field.

Unlike other DNA polymerases and reverse transcriptases, telomerase possesses the ability to extend DNA sequences without an extrinsic templating sequences or a primer. Similarly, telomerase was found to contain essential components made of both

RNA and protein, a feature not observed with other types of polymerases. Was it possible that the RNA component of telomerase was acting as an intrinsic templating sequence?

Shortly after the discovery of T. thermophila telomerase, its RNA component was

sequenced and found to contain the sequence 5’-CAACCCCAA-3’ 13. The sequence

found represents the reverse complement to the hexanucleotide repeat that constitutes

the telomeres of T. thermophila; however, instead of six nucleotides long, the template is

nine nucleotides. The extra length allows telomerase to hybridize to the telomeric

sequences with the first three bases, fill in the six nucleotides complementary to the

primer, and then undergo a process termed “translocation”. In the translocation step,

telomerase dissociates from the telomeric end, and then rehybridizes the three

homologous bases to the end of the telomere, allowing for six new bases to be inserted

(Figure 1.2) 13.

Due to translocation, a single copy of telomerase can add hundreds of bases to

the end of a telomere prior to dissociation. Greider et al. began testing this with trapping

experiments, wherein telomerase was preincubated with various DNA substrates prior to

initiating the reaction. When telomerase was preincubated with the canonical telomeric

repeat sequence before initiating the reaction with a competitor DNA strand present, the

7 banding pattern was consistent with elongation of the initial telomeric DNA sequence rather than that of the competitor DNA. Conversely, when telomerase was incubated with the competitor DNA prior to the reaction initiation and addition of a telomeric DNA sequence, the competitor DNA was elongated rather than the canonical DNA 13.

Therefore, evidence was shown for the first time that telomerase exhibits repeat addition processivity; in other words, one telomerase molecule can add hundreds of nucleotides to a telomere prior to dissociation.

1.1.3 Innovative approaches to study telomerase: the TRAP assay and beyond

Many of the early breakthroughs in telomerase biology were observed with the model system of T. thermophila, but unfortunately, studies of human telomerase proved much more challenging. One enormous challenge in transitioning between T. thermophila and human was the issue of telomerase abundance. Compared to the tens of thousands of chromosomes present in each T. thermophila cell, human cells have only 46 14. This translates to much fewer telomere ends that need maintenance, and therefore fewer copies of telomerase needed. Additionally, it has now been observed that the vast majority of cells actively repress the expression of telomerase, presumably. Cells that do express telomerase are estimated to only have 240 copies of the monomer per cell, which is barely one telomerase copy for each chromosome end 15. With expression levels this low, both the structural and biochemical characterization of human telomerase has proven exceedingly difficult because of low yields and purities. Furthermore, for biochemical assays, the concentration of an enzyme is required for calculating many kinetic parameters, including kpol, the maximal rate constant of nucleotide incorporation by a polymerase independent of nucleotide concentration and kcat, the turnover number

8 representing the apparent first-order rate constant with saturated substrate in a steady- state kinetic regime 16,17. To measure the telomerase concentration, the number of correctly assembled RNA:protein complexes that compose telomerase must be measurable, a task that has not been attainable.

Because of the inherent challenges with the biochemical characterization of telomerase, innovative techniques in cell and molecular biology were needed in order to study telomerase in humans. One groundbreaking technique, known as the Telomeric

Repeat Amplification Protocol (or TRAP) provided a first glimpse at which types of human cells express telomerase, and which do not 18. This assay was made possible by the invention of the Polymerase Chain Reaction (PCR) 19. To detect telomerase activity, a non-telomeric sequence is capped on the 3’ end by a telomeric sequence. This DNA sequence is then incubated with cell extract to be tested for telomerase expression. If telomerase is present, the sequence will be lengthened by the activity of telomerase. After that, the reverse telomere sequence is added and if telomerase created a lengthened sequence, they are annealed and amplified by Taq DNA polymerase. With subsequent rounds of the Polymerase Chain Reaction, the signal is amplified exponentially; hence, previously undetectable levels of telomerase activity would then be rendered detectable

18.

Cancer researchers were among the first to apply the TRAP assay. Because cancer cells divide more frequently than normal somatic cells, telomere shortening should also occur more rapidly in cancer cells. Therefore, to compensate for the rapid telomere

Figure 1.2: The telomerase reverse transcriptase (TERT) catalytic cycle. In order to elongate telomeres, telomerase (made up of TERT and its RNA template) must first bind to the 3’ end of the leading strand at end of telomere DNA (Fig. 1.1, red). Telomerase then anneals its template and leaving 6 unoccupied slots for insertion (State I). Then, nucleotide binding occurs, and the next nucleotide is positioned in the TERT active site across from its complementary RNA template (State II). Chemistry then occurs, covalently linking the bound nucleotide to the 5’ end of telomere DNA (State III). This cycle completes 6 times, until all RNA templating bases are filled (State IV), after which telomerase translocates to allow for six new insertion events.

loss, it was hypothesized that cancer cells are more dependent on mechanisms of

telomere maintenance, such as telomerase, than normal cells. To test this hypothesis,

the TRAP assay was used to test a variety of both normal and cancer cells by Kim et al.

18. First, using immortalized HEK293 cells as a model, the TRAP assay detected telomerase activity. When the cell extract was subjected to either heat or RNase treatment (which would denature the protein component or digest the RNA component, respectively), the signal from the TRAP assay disappeared, giving further credence towards the specificity of telomerase detection by TRAP assay.

Next, a wide variety of tumor and normal cells were tested for telomerase

expression from tissues including skin, breast, lung, cervix, kidney, bladder, colon, and

prostate. In nearly every tissue, the tumor cells tested positive for telomerase activity

whereas normal cells exhibited a conspicuous lack of telomerase activity 18. These

seminal experiments led to telomerase being called the “universal cancer target”,

because it seemed that if a therapeutic could kill cells based on telomerase expression,

cancer cells could be eradicated while leaving normal cells unharmed 20. By providing the

tools to study telomerase, along with a direct rationale for its clinical importance, the

number of publications in the field of telomerase soared; in just five years the number of

publications per year grew from ~50 to ~500 21. Despite this growth of the field, however,

no telomerase-targeting therapy has successfully reached the clinic as an intervention for

any human disease.

1.2 Telomerase and cancer: a “universal” cancer target?

In the nearly 25 years of research performed after the link between telomerase

and cancer was discovered, the role of telomere length as a defense against cancer has

11 been elucidated. In this context, the Hayflick Limit (which placed a fixed limit on the number of times a cell can potentially divide) shows a clear evolutionary benefit. In somatic tissues, where telomerase is repressed, every cell division causes telomere shortening. With successive divisions, telomeres will eventually become critically short and cause the cells to undergo either apoptosis or senescence 22. In either case, this results in the cell ceasing replication. This limitation in cell division is protective against cancer because telomere shortening limits immortalization. Even with several carcinogenic mutations, cells that do not reactivate mechanisms of telomere maintenance will not be able to progress to a malignant cancer 23. From the perspective of therapeutic development, the differential in telomerase activity between cancer cells and somatic tissues suggests a significant therapeutic window. If a treatment could specifically target only telomerase expressing cells, it would specifically eliminate cancer cells while not harming normal cells (Figure 1.3).

1.2.1 Creative approaches used to target telomerase

The discovery in the mid-1990’s of telomerase as a target for cancer therapy was juxtaposed against an environment in which minimal mechanistic information was known regarding the structure and function of telomerase. Undeterred, the appeal of making telomerase-targeting cancer therapeutics led researchers to use innovative and creative approaches at the frontiers of therapeutic design. One such therapeutic, known as GV-

1001, consists of a peptide from telomerase that triggers the immune system to eliminate any cell that expressed telomerase. Based on the observation that hTERT endogenous to cancer cells is processed and presented as an antigen, telomerase has been called a

“universal tumor-associated antigen” 24. Although the preclinical results looked promising,

several clinical trials using GV-1001 have been terminated due to lack of efficacy,

potentially due to the immunosuppressing environment of tumors 25. A phase III clinical

trial will start recruiting soon to evaluate the efficacy of GV-1001 in Benign Prostatic

Hyperplasia, but largely clinical trials for GV-1001 as a cancer therapy are stopping, and it does not seem likely that GV-1001 will reach the clinic to treat the broad-spectrum of cancers that express TERT. Interestingly, treatment with this small peptide from hTERT has had unintended effects, including its potential use as a cell-penetrating peptide (which

could be used to deliver molecules across cell membranes), and as a potential treatment

for Alzheimer’s Disease (with a planned phase II clinical trial not yet recruiting) 26,27.

Several other methodologies for the therapeutic targeting of telomerase are also

under investigation and also are largely independent of the structural and biochemical

characterization of telomerase. One category, gene therapies, often involves placing

cytotoxic elements under the control of the an extrinsic TERT promoter. Delivered to

many cells via adenovirus, only cells that express TERT will activate the cytotoxic

element, and thus the therapy should be specific to telomerase positive cancer cells 28,29.

One of these therapeutics, telemolysin, is under investigation in a phase II clinical trial for

treatment of esophagogastric adenocarcinoma 30. Alternately, telomeres have been

established to form secondary structure at telomere ends known as G-quadruplexes. In

these structures, guanines in the strand use both their Watson-Crick and Hoogsteen face

in order to attain four hydrogen bonds per G, rather than the three available with canonical

duplex base pairing (Figure 1.4). Telomestatin is a compound under current investigation

that wraps around these quadruplex structures, stabilizing them and preventing them from

unfolding. Because this prevents telomerase from accessing them and lengthening

Figure 1.3: Telomerase as a target for cancer therapy. In normal somatic cells, telomerase is absent. Therefore, with each round of cell division, telomere decreases due to the end replication problem. Eventually, a critically short length is reached that triggers senescence or apoptosis, which prevents future cell divisions. In the context of cancer cells, ~90 percent of cancers express telomerase, extending their telomeres and allowing for unlimited cell proliferation. Therefore, methods that inhibit telomerase have been hypothesized to inhibit cancer cell proliferation while leaving normal cells unaffected.

telomeres, this effectively inhibits telomerase in vitro 31. Unfortunately, in order to stabilize

G-quadruplex, molecules like telomestatin need to be both large enough and polar

enough to encircle the quadruplex and bind, which has limited their ability to permeate

the cell membrane. This lack of bioavailability has thus far limited G-quadruplex stabilizers

from entering clinical trials, although efforts at optimization continue 32,33.

1.2.2 Direct telomerase inhibitors

Even though the development of therapeutics to target telomerase has resulted in

the creative approaches outlined above, more traditional approaches have also been

applied. In this approach, the goal is to directly inhibit of the reverse transcriptase activity

of telomerase. Fortunately, researchers developing direct inhibitors to telomerase’s

activity had a distinct advantage in spite of minimal biochemical characterization: the

sequence of the RNA component of telomerase was known. Therefore, a complementary

nucleic acid substrate to that RNA strand could hypothetically be used to disrupt telomere

elongation. This was the idea behind GRN163, the oligonucleotide sequence that was

eventually named “imetelstat” 34. Several different DNA sequence candidates were tested

for inhibitory activity; GRN163 was perfectly complementary to the RNA component of

telomerase and exhibited extremely low IC50 values of 0.045 nM. Placing mismatches in

the sequence or introducing the complementary strand with the inhibitor candidate starkly

reduced the potency (to 80 and 8.9 nM, respectively). Putting in the strand with the same

sequence as the telomerase RNA component (TERC) only (i.e. the strand that should

34 not efficiently bind TERC) exhibited an IC50 of > 1000 nM . Therefore, GRN163 seemed

to specifically inhibit telomerase based in a sequence-dependent manner (Figure 1.5).

Figure 1.4: Structural characterization of a telomere G-quadruplex (A, B) Two angles of a G-quadruplex composed of 4 human telomere repeats. The G-rich region localizes + at the interior of the structure, coordinating three K ions (purple spheres). Nucleic acids are shown as green sticks. (C) The interaction network of a G-quartet, one layer of the a G-quadruplex. Each guanine has four hydrogen bonds to another guanine in the quartet, along with interacting with the K+ ion in the center. Structural information for all images taken from PDB code 1kf1.

GRN163 showed promising preclinical results both in vitro and in tumor xenograft models, and promising results depended on cancer cells expressing telomerase. Cells with shorter starting telomere length showed increased sensitivity to the compound, further supporting the mechanism of action of GRN163 34. The compound was subsequently modified by attaching a lipid tail to the oligonucleotide to increase its bioavailability 35. The modified version, GRN163L, has since entered ~20 clinical trials for a variety of cancer interventions and progressed to phase III for patients with myelodysplastic syndrome (NCT02598661). Unfortunately, several of those clinical trials were terminated, including a phase II clinical trial on patients with recurrent or refractory brain tumors (NCT01836549). One of the issues facing the clinical application of telomerase inhibitors like imetelstat is a long lag period between the treatment time and the therapeutic outcome (i.e., once the telomeres of cancer cells are shortened enough to trigger apoptosis) 36. Therefore, removal of the treatment for short amount of times to alleviate side effects gives cancer cells the opportunity to regenerate their telomere length and better resist telomeric erosion once the treatment resumes.

In summary, numerous innovative approaches have been attempted to target telomerase as a cancer intervention. Over the course of two decades, diverse approaches from suicide genes to enzymatic inhibitors have been attempted, with many of these approaches reaching phase III clinical trials. While promising on paper, none of these compounds has proven to be a panacea for patients with cancer. These methodologies have taken bold steps forward, but in every case, the development of the therapy was made prior to critical mechanistic knowledge of how telomerase works. No approach, even ones currently in late-stage clinical trials, has a published structure of the therapeutic

17 compound bound to its target. Similarly, minimal information is known about how any telomerase inhibitor alters basic kinetic parameters of telomerase. Dr. Jerry Shay, who significantly contributed to the development of imetelstat and the TRAP assay, commented on this critical need of telomerase biochemistry, stating “We're at this stage where there's still more basic science of telomerase we need to understand to be able to make better telomerase-specific therapeutics” 37. All of the compounds currently undergoing clinical trials were discovered with little mechanistic information known; imagine the impact of a telomerase-targeting therapy rooted in mechanistic information of how telomerase works.

1.3 Mechanistic studies of telomerase

In order to understand the mechanism by which telomerase elongates telomeres, one of the most important questions is: what components are necessary to comprise active telomerase? From the outset of the discovery of telomerase in 1985, evidence pointed towards the necessity of both an RNA component and a protein component 11,38.

Namely, treatment with RNase caused the activity of telomerase to disappear, and shortly after, the telomerase RNA component (TERC) was successfully sequenced 13. The protein sequence was identified using mass spectrometry; after purifying a protozoa telomerase from Euplotes aediculatus, a portion of the protein sequence was identified using mass spectrometry 39. The sequence was predicted to be a reverse transcriptase, similar to the retroviral reverse transcriptase Human Immunodeficiency Virus Reverse

Transcriptase (HIV RT). A homologous yeast gene was identified from that sequence, which had been named EST2 for “ever shorter telomeres”. This gene had previously been identified as relevant to telomere maintenance, as knocking out EST2 caused telomeres

in yeast to

Figure 1.5: Imetelstat inhibits the telomerase catalytic cycle. Imetelstat (green) contains the complementary sequence to the RNA component of telomerase (purple). Therefore, treatment with imetelstat binds telomerase (top) and prevents telomerase from engaging with telomeres (bottom), which in turn prevents telomere elongation.

19 progressively shorten 40,41. Based on the known structure and function of HIV RT, known catalytic amino acid residues were targeted and mutated to alanine, including residues homologous to catalytically vital aspartic acid residues 42-44. These single amino acid changes shortened the average telomere length, and caused a senescent phenotype in the mutant yeast. Similarly, human TERT was identified with homology searches, originally named hEST2, and linked to tumor cells and immortalization 45. Furthermore, expression of TERT mRNA correlated with telomerase activity, and telomerase activity could be reconstituted by mixing TERT with TERC, lending support to the hypothesis that

TERT was the catalytic component of human telomerase 46,47.

1.3.1 The identification of TERT

The reverse transcriptase responsible for telomeric extension is named TERT, for

TElomerase Reverse Transcriptase. Currently, TERT is divided into four domains: its N- terminal domain (TEN domain) thought to be involved in the translocation step of telomeric extension; its RNA binding domain named TRBD that interacts with TERC; its reverse transcriptase domain, RT, which contains the nucleotide binding pocket and catalytic residues; and its C-terminal domain (CTE),which interacts with the telomere terminus and assists with binding TERT to the DNA component (Table 1.1, data from 49-

53) 48. As with many other nucleotidyl transferases, the architecture of TERT has been compared to that of a hand. In this analogy, the CTE domain represents the thumb, the

RT represents the palm, and the TRBD domain acts as the fingers.

The first structural information regarding TERT was obtained using the same model system with which it was initially discovered: Tetrahymena thermophila. Although this structure did not contain the catalytic residues of TERT (because it was only one

TR1 TERT Sequence Length Domain Structure Mass Identity (nt) (kDa) TEN-TRBD-RT-IFD- H. Sapiens 451 127 100% CTE* TEN-TRBD-RT-IFD- M. musculus 431 128 62% CTE TEN-TRBD-RT-IFD- T. thermophila 159 133 18% CTE TEN-TRBD-RT-IFD- S. cerevisiae 1157 103 19% CTE T. castaneum unknown TRBD-RT-IFD2-CTE 70 23%

1TR = Telomerase RNA component *TEN = N-terminal domain *TRBD = Telomerase RNA binding domain *RT = Reverse transcriptase domain *CTE = C-terminal extension *IFD = Insertion in fingers domain 2 This IFD is truncated compared to other TERTs.

Table 1.1: Telomerase architecture of five different TERTs, including Homo sapiens, Tetrahymena thermophila, Saccharomyces cerevisiae, and Tribolium castaneum. Reprinted with permission from Schaich, M. A., Sanford, S. L., Welfer, G. A., Johnson, S. A., Khoang, T. H., Opresko, P. L., & Freudenthal, B. D. (2020). Mechanisms of nucleotide selection by telomerase. eLife. 9, e55438, doi:10.7554/eLife.55438.

domain of TERT), it represents a breakthrough in understanding the mechanism of TERT.

In this structure, a single domain of TERT was crystallized, the TEN domain, giving insight

to both the DNA binding and translocation mechanism of TERT 54. The TEN domain contains a positively charged groove composed of conserved residues, leading to the hypothesis that this was a site where TERT bound telomeric DNA. Site directed mutagenesis of some of these key sites in the positively charged cleft of the TEN domain further corroborated this hypothesis. Both DNA binding and overall telomerase activity were reduced when the positively charged cleft was disrupted. Unfortunately, as this structural characterization only included one domain of TERT, many lingering questions about the mechanism remained, including how the active site of TERT is positioned onto an RNA/DNA hybrid substrate, and how the four domains work together.

1.3.2 Structural characterization of TERT

Among the multiple obstacles in obtaining a structure of a catalytically active TERT, one of the most limiting was that of protein yields. For X-ray crystallographic studies, hundreds of milligrams of protein are often required; therefore, finding a means to overexpress and purify TERT at these scales was of paramount importance. With many other proteins, bacterial overexpression has proved transformative in mechanistic studies specifically due to improvements in yield and purity55. So, with no TERT overexpression protocol currently available, researchers attempted expressing multiple model TERT homologs from a variety of different species, including the Caenorhabditis elegans and the red flour beetle, Tribolium castaneum 56. Out of all the species tested, T. castaneum TERT allowed

for the greatest yield and was therefore the most amenable to X-ray crystallographic studies. Using this homolog of TERT, the first X-ray crystal structure of a

Figure 1.6: Vacuum electrostatics of the TEN domain. Adaptive Poisson-Boltzmann Solver tools 2.1 was used to generate vacuum electrostatics from the TEN domain of TERT, PDB code 2b2a. The positively charged groove (blue) is shown in the center of the protein groove. A scale of charge is shown at the bottom, in multiples of 25.85 mV (total range of -1.9 V to 1.9V).

23 catalytically active TERT was published 49. In this TERT structure, the way that multiple domains interact with each other can be seen for the first time. Although the T. castaneum homolog does not have a TEN domain, the other three TERT domains form a ring like structure, termed the TERT ring. The TRBD domain in this ring closely agreed with a previously published structure of an isolated TRBD from T. thermophila 57. Furthermore, from this structure, the nucleic acid substrate could be modeled into the TERT ring, based on previous data gained from the HIV RT structure. From this modeling, hypotheses could be made about which specific amino acids were involved in binding both the RNA and

DNA strands.

Soon after, T. castaneum TERT was crystallized bound to a nucleic acid substrate comprised of both DNA and RNA nucleotides in a hairpin configuration 58. In this structure, the interaction between TERT and telomeric DNA could finally be observed. Furthermore, the way that the TRBD domain interacted with a piece of the TERC was also visualized for the first time. This structure could be compared with the predicted binding orientation of TERT’s DNA:RNA substrate, and largely the experimental data agreed with the prediction. This structure also presented the first view of how catalytic TERT residues were positioned to insert nucleotides at the end of telomeric DNA. To probe this, nucleotides with a sulfur atom bonded to their alpha phosphate were utilized to reduce insertion efficiency 59. Even though the DNA substrate initially started with three overhanging templating ribonucleotide bases, once the structure was solved it was found that the modification of the nucleotide did not slow the reaction enough to capture an intermediate state. Instead, three separate nucleotide analogs had been covalently linked into the telomere strand. Therefore, no structure of TERT poised for catalysis was

obtained, but this structure did give valuable information about post-catalytic conformation

of TERT.

With much of the structural information of TERT now known from the T. castaneum

model, a clear future direction for telomerase research is transitioning from using this

model to using human TERT. In one of the most recent X-ray crystal structures of TERT,

the first atomic resolution structure of a single domain from human TERT was solved: the

CTE domain 60. This structure corroborated earlier structures of T. castaneum TERT;

even though the amino acid identity between the two TERTs is only ~25%, the overall

fold and secondary structure was nearly superimposable (Figure 1.7). Unfortunately,

challenges that initially led to problems in structural characterization of human TERT will

likely continue to persist, and at this time it seems unlikely that a crystal structure of full-

length human TERT will be published in the near future.

1.3.3 Cryo-EM structures of the telomerase holoenzyme

Cryo-electron microscopy (cryo-EM) structures of telomerase have also helped elucidate

its mechanism and hold several distinct advantages to crystallography. In addition to

needing less protein than X-ray crystallography, impurities such as partially assembled or

misfolded telomerase can be computationally removed 61. In 2018, a subnanometer resolution cryo-EM structure of human telomerase was published 51. Extensive flexibility

and structural heterogeneity of telomerase was observed, which limited the resolution to

~8 Å. At this resolution, protein secondary structures were observed along with density

for TERC, but positions of amino acids were less clear. Additionally, other proteins bound

to TERT and TERC were observed in their structure, including Dyskerin, TCAB1, and

NHP2. On comparing the cryo-EM density for human telomerase with that of earlier

tcTERT structures, the tcTERT model fit into the cryo-EM density, further corroborating

the conserved secondary structures between the two proteins. For more information on recent advances in cryo-EM structures of telomerase, see Chapter 5.

1.4 Testing the ability of telomerase to choose right from wrong: studies of telomerase with noncanonical substrates

Due to previously mentioned technical challenges, even simpler kinetics of telomerase such as steady-state characterization has proven challenging. In one case steady-state kinetics were performed, but key kinetic parameters such as the kcat and catalytic efficiency of the reaction were not calculated, due to unknown enzyme concentration and low purity 62. So far, this lack of mechanistic detail has limited the

understanding of the mechanism by which telomerase inserts native nucleotides, and

minimal work has been performed on biologically-relevant noncanonical nucleotides,

including: mismatches that would disrupt telomere sequence, oxidized nucleotides that

could inhibit telomere extension, or therapeutic nucleotides that are aimed to specifically

disrupt telomeres. See Figure 1.8 for the chemical structure of two of these key modified

nucleobases, 8-oxoguanine and 6-thioguanine. Fortunately, primer extension assays, (in

which telomerase and a labeled DNA primer are incubated together with all four

nucleotides to study telomere extension over time) give a semi-quantitative understanding

of how well telomerase functions. These primer extension assays can be combined with

cellular results, including assays of telomere shortening, telomeric fusions, fragile

telomeres, or cellular senescence to better understand how modified nucleotides impact

telomere biology 63,64.

Figure 1.7: The C-terminal (CTE) domain of tcTERT (blue) and human TERT (green) exhibit similar secondary structures. X-ray crystal structures of both CTE domains were aligned, and the positions of alpha helices contain large portions of overlap. PDB codes 3KYL and 5UGW were used to generate these models.

1.4.1 8-oxoG and telomerase: activator or inhibitor?

Because of the inherent difficulties in studying telomerase, the vast majority of

biochemical studies on telomerase have been performed in context of native nucleotides

that compose the genome (i.e., dATP, dTTP, dCTP, and dGTP). However, as outlined in

section 1.2, a promising therapeutic interest, in conjunction with a lack of fundamental

knowledge, is focused on disrupting telomerase with noncanonical substrates, including

modified and unnatural nucleotides. A study on the interplay between telomerase and 8-

oxoguanine (8-oxoG) represents a breakthrough in understanding how modified

nucleotides impact telomere elongation 65. 8-oxoG is a common form of DNA damage resulting from oxidation of guanine bases 66. In telomeres, their G-rich sequences further

increase the likelihood that guanines in telomeres will be oxidized 67. The formation of 8-

oxoG is not limited to DNA bases in the genome, however, and may also occur in the

context of the nucleotide pool, altering dGTP to 8-oxodGTP 66. In primer extension assays

by telomerase, when 8-oxodGTP was present in solution instead of dGTP, the 8-

oxodGTP was efficiently inserted. After its insertion, however, the next nucleotide in the

telomere sequence was not evident. In other words, the nucleotide form of 8-oxoG acted

as a chain terminator, and the insertion of one damaged nucleotide prevented telomere

elongation 65. In contrast, when 8-oxoG was present in the telomere strand prior to elongation, telomerase activity increased. To further investigate the mechanism of this effect, Fouquerel et al. attempted primer extension assays with 8-oxoG-containing primers of different lengths; activity increased only when there were at least four telomeric repeats present, the minimal number to form a G-quadruplex (see section 1.2.1 for more information on telomeric G-quadruplexes) 65.

Figure 1.8: Two modified forms of guanine impacting telomere integrity. Physiological reactive oxygen species can oxidize guanine to 8-oxoguanine (left), which has been shown to inhibit telomerase in its nucleotide triphosphate form, but activate telomerase once it is already in telomere sequences. 6-thioguanine (right) is a synthetic modification of guanine that causes telomere uncapping and apoptosis in telomerase positive cancer cells, potentially mediated via its insertion into telomere sequences by telomerase.

How does the presence of 8-oxoG change the fold and stability of G-

quadruplexes? Previous investigations of DNA damage and G-quadruplexes presented

evidence that, in the presence of 8-oxoG, quadruplexes still form, but exhibit a lower

thermal stability than quadruplexes without damage 68. Using single molecule approaches, the impact of a single 8-oxoG on the fold of G-quadruplexes was

investigated. Briefly, on one side of a G-quadruplex composed of four telomeric repeats, a Cy5 fluorophore was placed. On the other side, its Förster Resonance Energy Transfer

(FRET) pair Cy3 was placed, along with either normal guanine or 8-oxoG in the original

DNA sequence. The resulting FRET efficiency for undamaged substrate was significantly higher than that of the damaged G-quadruplex; furthermore, the histogram of FRET efficiencies for linear DNA (undamaged with no quadruplex) was even less than that of the G-quadruplex. Therefore, the state of the damaged G-quadruplex is at an intermediate position between the two. The implications of this single molecule experiment are that 8-oxoG partially unfolded of G-quadruplexes. Therefore, DNA damage in the telomeric repeat sequence may make the telomere end more accessible for telomerase by destabilizing G-quadruplexes, which in turn increases telomerase activity.

1.4.2 Mechanistic insight on 8-oxoG as a chain terminator from DNA polymerase β

Technical difficulties in working with telomerase have thus far prevented a mechanistic rationale for how 8-oxoG impacts telomerase. Specifically, why does 8-

oxoGTP terminate telomere elongation, even though it appears to be efficiently inserted?

Even computational approaches, like those presented in Appendix B, cannot explain this

phenomenon. In the absence of mechanistic insights from telomerase, a greater

understanding can be gained from insights with the model high-fidelity polymerase DNA

polymerase β (pol β). This DNA polymerase has previously been used to better

understand general nucleotide selection mechanisms by mammalian DNA polymerases

69-71. Furthermore, DNA pol β has known crystallization conditions, and structures have

been obtained at >2 Å resolution, making it amenable to advanced approaches outlined

below, such as time-lapse X-ray crystallography. That being said, there are key

differences that limit the application of pol β to study telomerase, including a fundamental

difference in the template strand (pol β uses a DNA template, whereas telomerase uses

an RNA template) and differences in amino acid sequences overall. However, both

enzymes perform SN2 reactions at the end of a primer terminus to extend DNA chains by

one nucleotide, both use nucleotide triphosphates as their substrates, and both enzymes

possess similarities in their active sites. These similarities include the presence of an

aspartic acid triad at the primer terminus to deprotonate the 3’-OH and prepare it for nucleophilic attack and a tyrosine residue positioned in the minor groove side of the DNA helix, just past the primer terminus.

Therefore, research in the Freudenthal laboratory studying how pol β extends from inserted 8-oxoG provides some understanding on what may happen in the telomerase active site when telomerase inserts 8-oxodGTP. Namely, how does 8-oxoG at the primer

terminus impact future insertion events? With 8-oxoG at the primer terminus across from a cytosine, the insertion efficiency for a matched dCTP was reduced ~50 fold compared to the same insertion event with an undamaged guanosine at the primer terminus 72.

While 50-fold is a stark reduction in efficiency, similar to the insertion efficiencies

observed by telomerase, it does not fully explain the chain termination observed.

Furthermore, in the case of 8-oxoG across from adenosine, the insertion efficiency is

nearly the same as a primer terminus composed of a G:C match 72. Time lapse-

crystallography revealed that during the catalytic cycle of pol β, a salt bridge between

R254 and D256 was disrupted in the case of 8-oxoG across from C, which may explain the reduction in catalytic efficiency (Figure 1.9). Unfortunately, the telomerase active site

does not contain a homologous arginine residue, so alternate mechanisms for extending

from 8-oxoG must occur 58. Therefore, although studies in pol β give insight into how 8-

oxoG terminates telomere elongation, much mechanistic information is still missing with regard to telomerase inserting and extending from damaged nucleotides.

1.4.3 Targeting 6-thioguanine to telomeres

Conceptually and structurally similar to 8-oxodGTP, the biological effects of 6- thiodeoxyguanosine triphosphate (6-TdGTP) have also been studied in context of telomeres and telomerase (Figure 1.8) Unlike 8-oxodGTP, however, 6-TdGTP is not

physiologically present; instead, it is a nucleotide formed upon the metabolic processing

of the established cancer therapeutic 6-thioguanine (6-TG) 73. 6-thioguanine contains

nearly identical base pairing properties to guanine because the only substitution is that

carbonyl group bonded to the 6 position of the pyrimidine in guanine is replaced with a

similarly electronegative thio group in 6-TG. In fact, these two bases are so similar that

treatment of patients with 6-TG results in between 0.01 and 0.1% of all genomic guanines

with the sulfur containing analog 74. Unfortunately, although 6-TG still remains the

standard of care for several types of leukemias, its metabolism produces several other

chemical species that cause profound biological effects. In particular, processing of 6-TG

Figure 1.9: 8-oxoG at the primer terminus alters the insertion efficiency of pol β. Overview comparing the structural changes observed during efficient, mutagenic extension from 8-oxodG:A (green) and inefficient, non-mutagenic extension from 8oxodG:C (yellow) terminal pairs. In the case of the non-mutagenic 8-oxoG:C primer terminus, structural rearrangement of R254 occurs during the catalytic cycle. The right panels show overlays of the ground (colored) and product (gray) states. Reprinted with permission from Whitaker, A. M., Smith, M. R., Schaich, M. A. & Freudenthal, B. D. Capturing a mammalian DNA polymerase extending from an oxidized nucleotide. Nucleic Acids Res 45, 6934-6944, doi:10.1093/nar/gkx293 (2017).

forms an active ribonucleotide version (6-TrGTP) that inhibits GTPase signaling in

addition to other biological effects 75,76.In an attempt to produce clinically beneficial

anticancer effects while minimizing side effects, the Shay research group hypothesized

that 6-TG could be modified to increase its specificity. More specifically, by adding a

deoxyribose sugar moiety to the compound to form 6-thiodeoxyriboguanosine (6-TdG),

they hypothesized that the metabolic processing would cause more genomic

incorporation while minimizing the production of the ribonucleotide version 77. Upon

treatment of 6-TdG, multiple different cancer cell lines demonstrated reduced viability,

specifically in the case of telomerase positive cancers. Other cell lines modeling 6-TdG

treatment resulted in induction of DNA damage response specific to telomere ends

(known as telomere damage induced foci, or TIF). These results contrast the treatment

with 6-TG, which was observed to induce DNA damage response throughout the genome, not localized to telomeres 77. These results in cell biology were recapitulated with mouse

tumor xenograph models, and 6-TdG is on track to enter phase 1 clinical trials as a

therapeutic for lung cancers and melanomas in early 2022. Additional studies have used

similar approaches with other forms of modified nucleotides: modifying the chemotherapy

5-fluorouracil into 5-fluorodeoxyuracil also promoted cancer cell death in a telomerase

dependent fashion 78. Hence, other possible nucleoside analogs that have not been

tested could have even better efficacy if they were target to telomeres.

1.5 Dissertation Overview

Cumulatively, telomere length maintenance by telomerase represents a widely implemented means of regulating cell division, and may potentially be targeted to

intervene against multiple different pathologies, including idiopathic pulmonary fibrosis

(IPF), aging, and cancer 18,79,80. Unfortunately, technical limitations have largely limited the molecular understanding of how telomerase extends telomeres; in turn, a limited molecular understanding of telomerase has slowed the progress of telomerase-targeting therapeutics from entering the clinic. To combat these difficulties, several different models of TERT have been used in this dissertation, including DNA polymerase β (which is less similar to TERT, but is more amenable to biochemical and structural analysis), and TERT from Tribolium castaneum (a homolog to human TERT). Using methods including pre- steady state kinetics, X-ray crystallography, and telomerase extension assays, we were able to answer foundational questions about telomerase. These questions include: what are the positions and function of telomerase active site residues throughout the full catalytic cycle, and are there large-scale structural changes required for nucleotide insertion? (addressed in Chapter 2); what mechanism does telomerase use to prevent the insertion of noncanonical nucleotides such as ribonucleotides and mismatches, and how effective are these mechanisms? (addressed in Chapter 3); what structural aspect of therapeutic nucleotides such as 6-thioguanine allows them to successfully evade DNA polymerase selectivity methods, a requirement for their insertion into telomeres by telomerase? (addressed in Chapter 4).

Chapter 2: Structural Characterization of the Telomerase Active Site

This chapter has previously been published as an open access article and is reprinted here with adaptations. Schaich, M. A., Sanford, S. L., Welfer, G. A., Johnson, S. A., Khoang, T. H., Opresko, P. L., & Freudenthal, B. D. (2020). Mechanisms of nucleotide selection by telomerase. eLife. 9, e55438, doi:10.7554/eLife.55438.

Author contributions: B.D.F. and M.A.S. designed the experiments. M.A.S., G.W., and T.H.K. performed the structural experiments, with M.A.S. determining the initial crystallization conditions and determining the prenucleotide binary, ternary, and product structures, and G.W. and T.H.K. determining the Y256A prenucleotide structure under the guidance and training of M.A.S. M.A.S. built the structures and prepared them for deposition into the Protein Data Bank. B.D.F. and M.A.S. wrote the manuscript and made figures with assistance from the other authors.

In this chapter, we used TERT from Tribolium castaneum, a homolog to human

TERT, to better understand the structures required for TERT to complete its catalytic cycle. We hypothesized the structural arrangements in TERT were similar to that Human

Influenza Virus Reverse Transcriptase (HIV RT). More specifically, that residues in the active site remain in similar positions throughout the catalytic cycle, but that a global

“open” to “closed” conformational change would happen upon nucleotide binding. The positions of catalytic residues did seem static throughout the catalytic cycle, and we did not observe a global conformational change as with HIV RT. These studies revealed the

TERT catalytic cycle for the first time, and generated future hypotheses about the roles of TERT active site residues.

2.1 Abstract

Telomerase extends telomere sequences at chromosomal ends to protect genomic DNA. During this process it must select the correct nucleotide from a pool of

nucleotides with various sugars and base pairing properties, which is critically important

for the proper capping of telomeric sequences by telomere binding proteins, known as

shelterin. The mechanism by which telomerase selects correct nucleotides is unknown.

Here, we determined structures of Tribolium castaneum telomerase reverse transcriptase

(TERT) throughout its catalytic cycle and mapped the active site residues responsible for

nucleoside selection, metal coordination, triphosphate binding, and RNA template

stabilization. We found that the TERT active site exhibits features similar to other DNA polymerases, including a triad of carboxylate residues that coordinate the 3’-OH of the primer terminus, preparing it for its nucleophilic attack. Furthermore, a tyrosine residue rests in the minor groove of the DNA:RNA duplex ~3 Å from the C2 position of the incoming nucleotide triphosphate, and may acta as a steric gate to prevent ribonucleotide insertion. Cumulatively, these structures provide insight into how telomerase selects the proper nucleotide to maintain telomere integrity.

2.2 Introduction

During every round of eukaryotic cell division, a small amount of DNA is lost from

the ends of each chromosome 81,82. Termed the end replication problem, this phenomenon is countered by two complementary adaptations. First, repetitive noncoding

DNA sequences, known as telomeres, are found at chromosomal ends, preventing the loss of vital genetic information during each cell division 9,83. Second, in select cells, the

ribonucleoprotein telomerase elongates shortened telomeres at chromosomal ends using

a reverse transcriptase activity 11. Without elongation, telomeres will eventually reach a

critically short length, causing cells to undergo apoptosis or become senescent 3,84.

Because telomerase plays such a fundamental role in the temporal regulation of cell

division, aberrations in telomerase are implicated in numerous human diseases. These

include premature aging, idiopathic pulmonary fibrosis (IPF), dyskeratosis congenita, and

cancer 18,79,80. In particular, ~90% of cancers upregulate telomerase to combat telomere shortening and enable unlimited cell division, as opposed to somatic cells where

telomerase is absent 23.

2.2.1 Challenges of biochemical characterization of TERT

The implication of telomerase in multiple human diseases underscores the

importance of understanding its catalytic cycle at the molecular level. Although the human

telomerase holoenzyme is composed of multiple accessory subunits, catalysis is localized

to the telomerase reverse transcriptase (TERT) subunit 51. Historically, biochemical

characterization of human TERT (hTERT) has proven challenging, in part because of

technical difficulties purifying and reconstituting large quantities of active telomerase 85,86.

As a result, many fundamental parameters describing the telomerase catalytic cycle have

remained undefined. These enzymatic constants are essential for understanding the roles

of active site residues, the selection of right from the wrong nucleotides (fidelity), and the

prevention of ribonucleotide triphosphate (rNTP) insertion (sugar discrimination). The

faithful extension of telomeres by telomerase is critical because aberrations in telomeric

sequences prevent shelterin proteins from capping telomeres, thus promoting genomic

instability 87,88. Structural studies of human telomerase have also been historically

challenging. Currently, the highest resolution structural snapshot of human telomerase is

a cryo-EM structure at 8 Å resolution 51. This structure represents a milestone in

telomerase structural biology, revealing details of the telomerase tertiary and secondary

structure. However, the positions of amino acids are difficult to distinguish at this

resolution, leaving many molecular details of the catalytic cycle ambiguous.

2.2.2 Model systems used to study TERT

To mitigate the difficulties inherent in the biochemical characterization of human

telomerase, several model systems have been established. These include models from

yeast, the protazoa Tetrahymena thermophila (with which a 5 Å cryo-EM structure was

recently determined), and the insect model Tribolium castaneum (sequence alignment

shown in Figure 2.1) 49,50,89. For biochemical characterization of TERT, we opted to use

T. castaneum TERT (tcTERT) for the following reasons: first, tcTERT readily fit into the

cryo-EM density of hTERT and aligns well with the recent cryo-EM structure from T.

thermophila, highlighting the conserved secondary structure (Figure 2.2, structures from

49-50); second, upon alignment with hTERT, the active site pocket of tcTERT exhibits a high degree of sequence identity (Table 2.1); third, we can readily obtain sufficient

quantities of isolated, active tcTERT for characterization of the telomerase catalytic cycle

40 by pre-steady-state kinetics and X-ray crystallography 49,51. Importantly, we also complemented the tcTERT system with human telomerase studies to characterize the catalytic cycle of telomerase (see Chapter 3). Using this combined approach, we have elucidated the role of conserved telomerase active site residues and determined the mechanisms of fidelity and rNTP discrimination.

2.2 Materials and Methods

2.3.1 Nucleic acid sequences

To generate crystal structures of the TERT catalytic cycle, the following DNA sequences were utilized for all crystallization experiments: DNA primer of 5’-

GGTCAGGTCAGGTCA-3’ and the RNA template sequence 5’- rCrUrGrArCrCrUrGACCUGACC-3’. In each case, the oligonucleotides were resuspended in molecular biology grade water, and the concentration was calculated from their absorbance at 260 nm as measured on a NanoDrop microvolume spectrophotometer.

Nucleic acid substrates for crystallography were annealed at an equimolar ratio. We used a thermocycler to anneal all nucleic acid substrates, heating them to 90°C for 2 minutes before cooling to 4°C at a rate of 0.1°C per second.

2.3.2 Expression and purification of tcTERT

We used previously published methods for tcTERT expression and purification, but implemented several modifications49. Briefly, we grew tcTERT in BL-21(DE3) pLysS cells using an Epiphyte3 LEX bioreactor at 37 °C until they reached an OD600 of 0.6-0.8, after which the temperature was dropped to 30 °C for 4-5 hours of protein production. Cells were harvested via centrifugation at 4000 x g until lysis. For TERT purification, we used buffers containing 0.75 M KCl and 10% glycerol for the capture step on Ni-NTA columns

Figure 2.1: Alignments of several telomerase reverse transcriptase homologs. A multiple sequence alignment is shown from several established TERT models, including Tribolium castaneum (tc), Homo sapiens (h), Tetrahymena thermophila (tt), and Saccharomyces cerevisiae (sc). The active site residues discussed are highly conserved throughout all four species. Colored arrows mark the positions of each group as in Figure 2.3, with residues coordinating the RNA terminus shown in gray, the primer terminus in yellow, the catalytic residues in blue, the nucleoside coordinating residues in cyan, and the triphosphate coordinating residues in green.

Figure 2.2: Overlay of TERT from Tribolium castaneum with other TERT structures. (A) An overlay of tcTERT with the density from a cryo-EM structure of human telomerase. tcTERT is shown as a green cartoon, density is a gray surface, and hTR is shown as a beige cartoon. (B) An overlay of tcTERT with a cryo-EM structure of TERT from Tetrahymena thermophila. tcTERT, the focus of this work is shown as a green cartoon, and a previously determed 5Å ttTERT structure is shown as a blue cartoon, PDB code 6D6V.

H. sapiens T. castaneum T. thermophila S. cerevisiae Function

D868 D343 D815 D670 Catalytic

D869 D344 D816 D671 Catalytic

D712 D251 D618 D530 Catalytic

R631 R194 R543 E447 Nucleoside binding

Q833 Q308 Q773 Q632 Nucleoside binding

Y717 Y256 Y623 Y535 Nucleoside binding

A716 A255 C622 C534 Triphosphate binding

K902 K372 K849 K704 Triphosphate binding

N899 N369 N846 N701 Triphosphate binding

T714 R253 K620 K532 Triphosphate binding

G715 D254 K621 S533 Triphosphate binding

V713 I252 I619 V531 Triphosphate binding Templating M636 I199 F548 A453 ribonucleotide binding Templating N635 S198 T547 I452 ribonucleotide binding Templating I633 I196 I546 R450 ribonucleotide binding Templating V634 V197 M546 I451 ribonucleotide binding Templating F568 L141 K483 S388 ribonucleotide binding Templating R622 N185 R534 R439 ribonucleotide binding Templating I624 I187 I536 I441 ribonucleotide binding Templating G834 G309 G774 G633 ribonucleotide binding L866 T341 L813 L668 Primer terminus binding

V867 V342 T814 A669 Primer terminus binding

Table 2.1: The conservation and function of TERT active site residues.

(GE Healthcare), and then further purified our sample with cation exchange on a POROS

HS column (Thermo Fisher), using a salt gradient of 0.5 M KCl to 1.5 M KCl. Then, we

cleaved the hexahistadine tag with Tobacco etch virus protease before purifying the cut

tag from the protein with another run on our Ni-NTA columns. Finally, we used a slightly

different buffer for size exclusion chromatography (Sephacryl S-200 16/60, GE

Helathcare), containing 50 mM Tris-HCl, pH 7.5, 10% glycerol, 0.8 M KCl and 1 mM

Tris(2-carboxyethyl)phosphine (TCEP). Resultant tcTERT was concentrated down to 18 mg mL-1 prior to crystallography, and stored at 4°C 49.

2.3.3 Crystallization of tcTERT

Prior to crystallization, we complexed tcTERT with its nucleic acid substrate by

mixing them at a 1:1.2 ratio of protein to DNA. To increase protein solubility, we included

520 mM KCl when preparing to mix tcTERT with its nucleic acid substrate. We then used

sitting drop vapor diffusion to grow binary complex crystals in conditions containing 11%

isopropanol, 0.1 M KCl, 25 mM MgCl2, and 50 mM sodium cacodylate pH 6.5. Volume

ratios for the optimal crystal growth were optimized to 2.3 µL tcTERT binary complex

crystals + 1.7 µL of our crystallization condition, to make 4 µL total. For the ternary

complex crystals, we used the same conditions, but included 0.69 mM dGpCpp (Jena

Biosciences), the next matched incoming nucleotide in the sequence. Finally, for the

product complex, we formed a DNA strand one nucleotide longer by incubating 2.5 mM dGTP with tcTERT and its nucleic acid substrate, allowing the reaction to occur at 22 °C for 30 minutes prior to setting up crystallization drops. In all cases, crystals were transferred to a cryosolution containing 80% reservoir solution and 20% 2-methyl-2,4-

pentanediol by volume before flash cooling them in liquid nitrogen.

2.3.4 Data collection and refinement

All datasets were collected at a wavelength of 1.00 Å, using the 4.2.2 synchrotron

beamline at the Advanced Light Source of the Ernest Orlando Lawrence Berkeley

National Laboratory. Datasets were indexed and scaled using XDS90,91. Initial models were generated using molecular replacement in PHENIX, using a previously published tcTERT structure with an alternate substrate, PDB code 3KYL 92,93. After a solution was found, all DNA and RNA bases were built in the pre-nucleotide complex, and the resultant structure was then used for further molecular replacements other structures. Model building was accomplished with Coot and validated with MolProbity 94,95. For structures ≥

3 Å resolution, both secondary structure restraints and torsional restraints from the

prenucleotide binary structure were used to prevent overmodeling. All refinements were

done using PHENIX, and figures were generated using PyMOL (Schrödinger LLC). For

each of the structures, Ramachandran analysis revealed a minimum of 100% of non-

glycine residues occupied allowed regions and at least 93% occupied favored regions.

2.3.5 Data resources

Accession numbers for models reported are PDB: 6USO, 6USP, 6USQ, and 6USR.

2.4 Results

The TERT subunit of telomerase elongates telomeric DNA using a conserved

catalytic cycle as outlined in Figure 1.2 and here. First, telomerase anneals its RNA

template to the end of telomeric DNA to form a binary complex (TERT:DNA, Figure 1.2,

state I). Next, the binary complex binds an incoming dNTP and samples for proper

Table 2.2: Data collection and refinement statistics for tcTERT structures.

Watson-Crick base pairing to the RNA template (Figure 1.2, state II). If the resulting ternary complex (TERT:DNA:dNTP) is in the proper orientation, TERT will catalyze the formation of a phosphodiester bond and extend the telomere by one nucleotide (Figure

1.2, state III). Following insertion of the incoming nucleotide, telomerase will shift registry to align the active site with the next templating base (returning back to state I). This core catalytic cycle repeats six times, until a new telomeric repeat is added (Figure 1.2, state

IV). At this point, telomerase will either (1) translocate and anneal the RNA component to the newly extended telomeric repeat, thus allowing for additional repeat addition; or (2) dissociate from the telomeric DNA.

2.4.1 Pre-nucleotide binary complex

We determined how TERT engages with telomeric DNA by co-crystallizing tcTERT with a 16-mer RNA strand hybridized to its complementary 15-mer DNA strand to form a binary complex. This substrate mimics the initial TERT:RNA complex bound to telomeric

DNA (Figure 1.2, state I). In this orientation, an unpaired 5’ cytosine (rC) of the RNA strand acts as the templating base and a 3’ adenosine (dA) of the DNA strand serves as the primer terminus (Figure 2.3). Crystals of this complex grew in a P3221 space group, diffracting to 2.5 Å resolution (Table 2.2). The resulting structure shows TERT bound as a ring around the end of the RNA:DNA complex, with its active site positioned at the terminus of the DNA strand (Figure 1.2).

Within the TERT active site, the templating RNA strand is stabilized by multiple conserved tcTERT residues (Table 2.1). Residues I196, V197, S198, G309, and R194 compose a pocket around the templating RNA base (Figure 1.2C). This pocket uses both polar and nonpolar interactions to stabilize the templating rC in a conformation that orients

Figure 2.3: The prenucleotide binary structure of the telomerase catalytic cycle. (A) The tcTERT prenucleotide binary complex (state I). tcTERT (pale orange cartoon and surface) encircles the DNA (white) and RNA (purple) substrate. (B) Active site pocket of the prenucleotide binary complex. rC binding (gray), dG binding residues (yellow), nucleoside residues (cyan), catalytic residues (blue), and triphosphate binding (green) are shown as sticks. (C) Closeup views of the rC binding residues, (D) terminal dG binding residues, (E) nucleoside binding residues, and (F) the tcTERT catalytic residues and triphosphate coordinating residues. A Mg2+ ion is shown as a purple sphere and DNA is presented as white sticks.

49 its Watson-Crick edge towards the incoming nucleotide binding site. On the opposite side, the 3’-OH of the primer terminal 3’-dA points towards the catalytic metal binding site, and

T341 and V342 coordinate the deoxyribose sugar moiety of the 3’-dA (Figure 1.2D and

Table 2.1). This binary TERT complex also has a cavity in the active site that forms the nucleotide binding pocket. These nucleotide pocket residues can be subdivided into three categories: nucleoside coordinating, catalytic, and triphosphate interacting residues

(Figure 1.2E, F). Notably, the residues that compose these three groups are 100% conserved between hTERT and tcTERT (Table 2.1). The nucleoside binding group is composed of residues R194, Y256, and Q308 (Figure 1.2E). These residues form a nucleoside shaped cleft directly upstream of the primer terminus of the telomeric DNA and are further characterized below. The catalytic residues include the catalytic triad:

D251, D343, and D344. These residues coordinate the divalent metal ions during catalysis (Figure 1.2F). Residues K189, A255, N369, and the backbone of K372 are in position to form interactions with the triphosphate of the incoming nucleotide (Figure

1.2F). Collectively, the active site of tcTERT is primed for nucleotide binding, and the residues involved in binding are highly conserved with human telomerase (Table 2.1).

2.4.2 Nucleotide bound ternary complex

We also determined the structure of TERT after binding an incoming nucleotide, but prior to catalysis (Figure 1.2). To capture the ternary complex, we utilized a non- hydrolyzable nucleotide analog 2'-deoxyguanosine-5'-[(α,β)-methyleno]triphosphate

(dGpCpp). dGpCpp is identical to dGTP, except the bridging oxygen between the α and

β phosphate is a carbon atom, which prevents catalysis 96,97. Crystals of this complex grew in the same P3221 space group and diffracted to 2.9 Å resolution (Table 2.2).

Comparing this ternary complex to the binary state (RMSD value of 1.52 Å, Figure 2.4A,

B) indicates minimal structural rearrangements are required for TERT to bind dGpCpp.

The active site residues that compose the nucleotide binding pocket of the pre-nucleotide binary complex coordinate the incoming dGpCpp, positioning its Watson-Crick face to hydrogen bond with the templating rC (Figure 2.4B, C). Two Mg+2 ions exhibit octahedral coordination to facilitate nucleotide binding. The catalytic metal coordinates residues

D251, D343, D344, the 3’-OH of the primer terminus, and the non-bridging oxygen of the dGpCpp α-phosphate (Figure 2.4D). The nucleotide metal coordinates the side chains of

D251 and D343, the backbone carbonyl of I252, and a non-bridging oxygen on the α,β, and γ phosphates of the incoming dGpCpp (Figure 2.4D). Nucleoside binding residue

R194 forms a network of contacts between the residue Q308 and the α phosphate of the incoming nucleotide. Y256 and Q308 coordinate the nucleoside component of the dGpCpp (Figure 2.4E). Overall, this ternary complex provides insight into nucleotide selection by TERT and the specific roles of active site residues during nucleotide binding.

2.4.3 Product complex

To capture a product complex of telomerase, we crystallized TERT after incubation with its nucleic acid substrate and dGTP, allowing TERT to insert the nucleotide and form the final product stage of the catalytic cycle (Figure 1.2, state IV). Both globally and within the active site, we observed minimal structural changes between the prenucleotide binary complex and the product structure (RMSD value of 1.04 Å, Figure 2.5A, B). Electron density of the inserted dG indicates its Watson-Crick face hydrogen bonds to the Watson-

Crick face of the templating rC (Figure 2.5C). Neither a registry shift nor a translocation step has occurred in this structure; the TERT active site remains aligned to the terminal

Figure 2.4: TERT ternary structure. (A) The tcTERT ternary structure, overlaid with the prenucleotide binary complex. DNA (white), RNA (purple), binary tcTERT (yellow cartoon), and ternary tcTERT (blue cartoon) are shown. (B) The tcTERT active site. Nucleoside residues (cyan), triphosphate residues (green), and catalytic residues (blue) are shown. The dGpCpp (yellow), DNA, and RNA are represented as sticks. (C) Closeup view of dGpCpp with a polder OMIT map contoured at σ=3.0 (green mesh). (D, E) dGpCpp contacts are shown with Mg2+ (purple), catalytic residues (marine) and nucleoside residues (cyan) indicated.

Figure 2.5: TERT product structure. (A) The tcTERT product structure (red cartoon), overlaid with the prenucleotide binary complex (yellow cartoon). (B) An active site view of the tcTERT product structure. (C) A display of a polder OMIT map contoured at σ=3.3 around the incoming dGpCpp (green mesh). (D) Catalytic residues (marine) coordinate the inserted dG (white).

DNA base (Figure 2.5D). Comparing this structure to structures of the previous stages in

the catalytic cycle, we observed that minimal global rearrangements are required to

proceed from the binary, ternary, and product states of the catalytic cycle.

2.5 Discussion

2.5.1 TERT conformations throughout its catalytic cycle

In this study, we characterized each step of the TERT catalytic cycle for single

nucleotide insertion. We found that, in terms of global protein structure, minimal

rearrangement is required to proceed through the catalytic cycle. This lack of

rearrangement contrasts with many other DNA polymerases and even HIV RT, which

undergo global shifts from an “open” to a “closed” state upon nucleotide binding 98-100.

Although it is unknown why the TERT catalytic core does not open and close, other

complexities necessary for telomerase function may limit the opening and closing, including the translocation step during repeat addition or the extensive interaction with its

RNA component. Within the active site, we observed residues that encompass a cavity for the incoming nucleotide prior to binding, which then adjust to coordinate the incoming nucleotide after binding, and continue to stabilize the newly inserted base after its insertion in the product state. Many of the active site residues involved in carrying out the catalytic cycle are in similar positions as other DNA polymerases; a triad of three carboxylate containing residues such as D251, D343, and D344 in tcTERT is conserved in many DNA polymerases 101. Interestingly, R194 and Q308 are in a similar structural

location to R61 and Q38 of human DNA polymerase η, and both have been shown to be

important in its catalytic cycle 102.

2.5.2 Limitations of structural studies

The structural characterization of the TERT catalytic cycle presented here represents a step forward in the mechanistic understanding of how telomerase extends telomeres. These structures represent snapshots of amino acid positions throughout the catalytic cycle, but cannot unequivocally state their roles in that cycle. Therefore functional readouts, in combination with the structural studies outlined above, would provide a better characterization of the specific roles of active site residues during the telomerase catalytic cycle.

Chapter 3: Kinetic characterization of the telomerase active site

Author contributions: B.D.F., P.L.O., M.A.S., and S.L.S. designed the experiments. M.A.S., G. A. W., and T.H.K. performed the single turnover kinetics experiments, with M.A.S. performing ~85% of the experiments independently, and training T.H.K. and G.A.W. to perform others. S.L.S. and S.A.J. performed the telomerase activity assays. B.D.F. and M.A.S. wrote the manuscript and made figures with assistance from the other authors.

In this chapter, we used the TERT structural data in Chapter 3 to inform functional studies probing the mechanism of nucleotide selection by TERT. Based on the biological error rates observed in telomeres, we hypothesized that TERT possessed moderately high nucleobase fidelity and ribonucleotide discrimination rates. Furthermore, we hypothesized that Y256 acted as a steric gate to prevent ribonucleotide insertion. Our pre-steady-state kinetics confirmed both of these hypotheses, and the results were similar with primer extension assays of human telomerase. We also uncovered that, in the context of multiple nucleotide insertion events, ribonucleotide insertions by telomerase disrupted the translocation step of the telomerase catalytic cycle, potentially providing a second level of nucleotide discrimination.

3.1 Abstract

Telomerase extends telomere sequences at chromosomal ends to protect genomic DNA. During this process telomerase must select the correct nucleotide from a

pool of nucleotides with various sugars and base pairing properties, which is critically

important for the proper capping of telomeric sequences by shelterin. Unfortunately, how

telomerase selects correct nucleotides is unknown. Here, we used pre-steady-state

kinetics to determine both the kpol and Kd for nucleotide insertion by telomerase reverse

transcriptase. Furthermore, we found that TERT will insert a mismatch or ribonucleotide

~1 in 20,000 and ~1 in 25,000 insertion events, respectively, equivalent to ~1 mismatch and ~20 ribonucleotides per 10 kilobases at biological nucleotide concentrations. Human telomerase assays determined a conserved tyrosine steric gate regulates ribonucleotide insertion into telomeres. Cumulatively, our work provides insight into how telomerase selects the proper nucleotide to maintain telomere integrity.

3.2 Introduction

For a broader introduction on the telomerase and its impacts on cell biology and

human health, please see section 2.2.

3.2.1 Kinetic characterization of telomerase

Largely due to difficulties in purifying and reconstituting telomerase at a scales and

purities amenable for kinetic studies, kinetic characterization of telomerase has proven challenging, historically. Although many fundamental kinetic parameters of telomerase

103 are not known, the Km was recently determined with a steady-state kinetic approach .

In this approach, the RNA component of telomerase was truncated to remove any of the

templating RNA bases. This form of telomerase was termed “template-free” telomerase.

The advantage of using this form of telomerase is that the RNA sequence could be pre-

annealed to a complementary DNA primer to study single nucleotide insertion kinetics of

defined primer lengths. One caveat for these experiments is that the amount of correctly assembled “template-free” telomerase was not separable from incorrect assembly, so the amount of enzyme used in the assays was unknown. Therefore, only Km values could be obtained with these experiments, instead of the Km and kcat values typically obtained with

steady-state kinetics. Still, valuable information was obtained regarding nucleotide binding in this study, including a large variation in Km values depending on the position of

the template that was being filled (ranging from 4 to 120 µM) 103.

Alternately, rather than compromising the telomerase by changing the RNA

component, recent advances in single-molecule microscopy have allowed for the imaging

of single telomerase molecules. Using single-molecule approaches, the problem of

obtaining large amounts of reconstituted telomerase can be avoided, because much

59 fewer enzyme is consumed in these techniques. In one single molecule approach, telomerase was tagged, and the tag linked to one optical tweezer. On the other side of telomerase, telomeric DNA was linked to a second optical tweezer. After the linkage occurred, telomerase activity was initiated by adding the four nucleotide triphosphates to solution. With each nucleotide added, the distance between the two optical tweezers increased, allowing a rate of extension to be calculated: a majority of traces inserted between 8 to 50 nucleotides per minute 104. Furthermore, readout of this assay was dependent on the length of the telomere strand, phenomena larger than insertion alone could be visualized. Instead of the distance increasing in a linear fashion, where each nucleotide added a small amount of length, the length of the telomere was altered in larger steps, in both the forwards and backwards direction. This phenomenon could be explained by the growing telomere strand folding into G-quadruplex structures and binding telomerase at an anchor site. Overall, this study points to a complex mechanism of multi-turnover nucleotide insertion and underscores the critical need of studying telomerase kinetics at the single nucleotide level. Therefore, in this chapter, we designed a new methodology to study telomerase extension at the single nucleotide level, similar to that of the template-free telomerase. By using known protein concentrations, and using a pre-steady-state regime of kinetics, we were able to the first time quantitate the Kd of nucleotide binding as well as the kpol of telomerase, the theoretical maximal turnover rate at saturating nucleotide concentrations.

3.3 Materials and Methods

3.3.1 Nucleic acid sequences

For kinetic studies, we utilized a DNA primer with a 5’ label of 6-carboxyfluorescein

(6-FAM), and the DNA sequence of 5’- CCAGCCAGGTCAG-3’. The RNA template used in kinetic reactions contained the sequence 5’- rUrGrArCrCrUrGrArCrCrUrGrGrCrUrGrG-

3’ and was not labeled. In each case, the oligonucleotides were resuspended in molecular

biology grade water, and the concentration was calculated from their absorbance at 260

nm as measured on a NanoDrop microvolume spectrophotometer. Oligonucleotides were

annealed at a 1:1.2 molar ratio of labeled to unlabeled primer to reduce labeled primers

that did not effectively anneal We used a thermocycler to anneal all nucleic acid

substrates, heating them to 90°C for 2 minutes before cooling to 4°C at a rate of 0.1°C

per second.

3.3.2 Expression and purification of tcTERT

tcTERT was expressed and purified as described in section 2.3.2

3.3.3 Pre-steady-state kinetic characterization of tcTERT

Pre-steady-state kinetic parameters of tcTERT were obtained using established

pre-steady-state kinetics protocols for DNA polymerases, also known as single turnover

kinetics 17,105. Briefly, we preincubated 2 μM tcTERT with 200 nM annealed DNA:RNA

hybrid substrate, with a 6-FAM label on the 5’ end of the DNA component. We then used

a KinTek RQF-3 (a rapid quench-flow instrument) to mix equal ratios of the incoming

nucleotide triphosphate of interest and 10 mM MgCl2 with the existing mix of tcTERT and

its DNA:RNA hybrid substrate. Reactions were run at 37°C and quenched at various

timepoints (ranging from 10 ms to 700 s) with 100 mM EDTA pH 7.5. In each case, the

61 conditions used for each reaction were: 25 mM TRIS pH 7.5, 0.05 mg mL-1 Bovine Serum

Albumin, 1 mM dithiothreitol, 10% glycerol, 200 mM KCl, 1 μM tcTERT, 100 nM annealed

DNA:RNA hybrid substrate, and varying concentrations of the nucleotide triphosphate of interest. The samples were transferred to a DNA gel loading buffer, containing 100 mM

EDTA, 80% deionized formamide, 0.25 mg ml−1 bromophenol blue and 0.25 mg ml−1 xylene cyanol. For the generation of data sets that had a minimum time point of 12s or greater, a LabDoctor™ heating block was used in lieu of the KinTec RQF-3, and quenching was accomplished using a solution of DNA gel loading buffer. These mixes were then incubated at 95°C for 5 mins and loaded onto a 21% denaturing polyacrylamide gel. These gels were run at 700 V, 60 A, and 30 W at 30°C in order to separate the reaction product from its substrate.

Gels were scanned and imaged using a GE Typhoon FLA 9500 imager, and the ratios of product to substrate were quantified using ImageJ106. Means and standard deviations were taken from at least three replicates were calculated and graphed using

KaleidaGraph. Plots of product formation over time were fit to the exponential Equation 1 to determine kobs values:

[P] = A(1 e ) −𝑘𝑘obs𝑡𝑡 in which [P] is the concentration of− the product, A is the target engagement

(amplitude), and t is the reaction time. After kobs values were determined for multiple nucleotide triphosphate concentrations, the data were replotted to compare kobs to concentration of nucleotide triphosphate, and fit to Equation 2:

k [NTP] k = K pol+ [NTP] obs d

with kpol representing the theoretical maximum value of kobs at saturating nucleotide concentration, [NTP] representing the concentration of the nucleotide triphosphate of interest, and Kd representing the dissociation constant of the incoming nucleotide

triphosphate

3.3.4 Human telomerase expression

HEK293T cells were used to overexpress hTR and 3×FLAG-tagged human

telomerase reverse transcriptase (hTERT) genes in pSUPER-hTR and pVan107, respectively. Cells were grown to 90% confluency in Dulbecco’s modified Eagle’s medium

(Gibco) supplemented with 10% High Quality FBS (Hyclone) and 1% penicillin-

streptomycin (Corning) at 37°C and 5% CO2. Cells were transfected with 10 μg of

pSUPER-hTR plasmid and 2.5 μg of pVan107 hTERT plasmid diluted in 625 μl of Opti-

MEM (Gibco) using 25 μl of Lipofectamine 2000 (ThermoFisher) diluted in 625 μl of Opti-

MEM. Cells were cultured for 48 hours post-transfection, and then were trypsinized and

washed with phosphate-buffered saline and lysed in CHAPS lysis buffer buffer (10 mM

Tris-HCl, 1 mM MgCl2, 1 mM EDTA, 0.5% CHAPS, 10% glycerol, 5 mM β-

mercaptoethanol, 120 U RNasin Plus (Promega), 1 μg/ml each of pepstatin, aprotinin,

leupeptin and chymostatin, and 1 mM AEBSF) for 30 min at 4°C. Cell lysate supernatant

was then flash frozen and stored at -80ºC.

3.3.5 Human telomerase purification

Telomerase was purified via the 3xFLAG tag on hTERT encoded pVan107 using

ANTI-FLAG M2 affinity gel agarose beads (Sigma Aldrich), as described previously with

some modification 65. An 80μL bead slurry (per T75 flask) was washed three times with

10 volumes of 1X human telomerase buffer in 30% glycerol with 1 min centrifugation steps at 3500 r.p.m. at 4ºC. The bead slurry was added to the lysate and nutated for 4-6 hours at 4ºC. The beads were harvest by 1 min centrifugation at 3500 r.p.m, and washed 3X with 1X human telomerase buffer with 30% glycerol. Telomerase was eluted from the beads using 2x the bead volume of 250 μg/mL 3X FLAG® peptide (Sigma Aldrich) in 1X telomerase buffer with 150 mM KCl. The bead slurry was nutated for 30 min at 4ºC. The eluted telomerase was collected using Mini Bio-Spin® Chromatography columns (Bio-

Rad). Samples were flash frozen and stored a -80ºC.

3.3.6 32P-end-labeling of DNA primers

50pmol of PAGE purified DNA primer GGTTAGGGTTAGGGTTAG (IDT) was labeled with γ-32P ATP (Perkin Elmer) using T4 polynucleotide kinase (NEB) in 1X PNK

Buffer (70mM Tris-HCl, pH 7.6, 10mM MgCL2, 5mM DTT) in a 20uL reaction volume. The reaction was incubated for 1 h at 37ºC followed by heat inactivation at 65ºC for 20 minutes. G-25 spin columns (GE Healthcare) were used to purify the end labeled primer.

3.3.7 Telomerase activity assay

The telomerase assay was as previously described. Reactions contained 1x human telomerase buffer, 5 nM of 32P-end-labeled primer and 50μM dNTP or rNTP mix as indicated in the figure legends. The reactions were started by the addition of 3 μL of immunopurified telomerase eluent, incubated at 37ºC for a specified time course, then terminated with 2 μL of 0.5mM EDTA and heat inactivated at 65ºC for 20 minutes. An equal volume of loading buffer (94% formamide, 0.1 × Tris-borate-EDTA [TBE], 0.1% bromophenol blue, 0.1% xylene cyanol) was added to the reaction eluent from the G-25 spin column. The samples were heat denatured for 10 min at 100°C and loaded onto a

14% denaturing acrylamide gel (7M urea, 1x TBE) and electrophoresed for 90 min at

constant 38W. Samples were imaged using a Typhoon phosphorimager (GE Healthcare).

Percent primer extension was quantitated using ImageQuant.

3.4 Results

3.4.1 Fidelity and sugar selectivity of TERT

Characterization of the telomerase catalytic mechanism was performed using pre-

steady-state kinetics of single nucleotide insertion by tcTERT (Figure 3.2; Table 3.1).

These experiments determined both the Kd of the incoming nucleotide and the kpol for nucleotide insertion by tcTERT, which have thus far proven unattainable for human telomerase (or any other homolog). TERT inserts the correctly matched dGTP across

-1 from a templating rC with a kpol of 1.98 s and a Kd for the incoming dGTP of 19.6 μM

(Figure 2.5A, D). Both of these values are comparable to other non-replicative DNA

103,107 polymerases and the TERT Km values obtained by steady-state kinetics . Of note, the data specifically for TERT WT kinetics do not fully fit the single-exponent equation

typically used for DNA polymerase single turnover kinetics, potentially implying a more

complex fit needed to model the reaction. One possibility is that there are multiple

populations of TERT in the reaction, and one of the populations undergoes catalysis more rapidly than the other. Because these two components would be contributing to the reaction with different first-order exponential rates (i.e., Equation 1), this scenario can be

modeled with a double-exponential equation:

[P] = A(1 e )+ B(1 e ) −kobs1𝑡𝑡 −kobs2𝑡𝑡 P represents the product formed,− two exponential− rates (kobs1 and kobs2), and the

amplitude by which each contributes (A and B, respectively). These double exponential

65 equations more accurately fit the data, and a comparison of both methods of fit is shown in Figure 3.1. Although the double exponential plot is somewhat unusual to see in the field of DNA polymerase pre-steady-state kinetics, this type of fit has been used where necessary to determine kpol and Kd values. For a more in-depth analysis of the two fits, see Appendix A for a plot of the residuals.

We further probed the role of tcTERT active site residues R194 and Q308 because their role during catalysis is not clear from the structures alone and both residues protrude into the nucleotide binding pocket (Figure 2.3 and Table 2.1). For TERT R194A, the kpol

-1 decreased by 54-fold to 0.0369 s , and the Kd for dGTP increased ~5-fold to 93 μM (Table

-1 3.1). With the Q308A variant, the kpol decreased ~6-fold to 0.30 s and the Kd for dGTP increased ~2-fold to 45 μM (Table 3.1). Therefore, R194 and Q308 primarily play a role in the chemistry step rather than the nucleotide binding. As the hTERT homolog to R194

(R631, Table 2.1) is implicated in IPF, we infer that mutations at R631 likely reduce

108,109 hTERT’s kpol, contributing to IPF pathologies .

During telomeric extension, telomerase must select between a variety of nucleic acid substrates in order to properly maintain telomeric integrity. To probe the fidelity of telomerase, we applied pre-steady-state kinetics, assessing the efficiency with which

TERT inserts nucleotides during telomeric elongation. Two separate types of nucleotide selection were examined: (1) the selection of a matched dGTP over a mismatched dATP, and (2) the selection of a matched deoxyribonucletide triphosphate (dNTP) over a matched rNTP (Figure 3.2B and C, Table 3.1). We observed that for the insertion of dATP opposite a templating rC, the catalytic efficiency starkly decreased compared to dGTP insertion, both at the nucleotide binding and chemistry step.

Figure 3.1 TERT WT dGTP insertion kinetics, fit two different ways. (A) When fit to a single exponential equation, the fit diverges from the data. (B) When fit to a double exponential, the data align more closely with the fit. (C) A replot of the majority (fast) kobs values from panel B, with the kpol and Kd shown. On panels A and B, error bars represent standard deviations of at least 3 technical replicates, and in panel C, the error bars represent the error of the fit. The resultant kpol was 1.98 ± 0.119 per second, and the Kd was 19.2 ± 7.53 µM.

-- Down Down Down Down Down Down 4 1 Fold change 259 1.16 1.36 15.5 dGTP insertion compared to WT 2.45 x 10 x 2.45 1.74 x 104 x 1.74 -3 -2 -5 -5 1 0.859 0.733 3.86 x 10 x 3.86 10 x 4.08 10 x 5.73 6.44 x 10 x 6.44 Efficiency compared compared Efficiency to WT dGTP insertion ) -2 -3 -3 -1 d /K mM 22 pol 40.9 1.53 6.66 Error Error -1 k (s 4.69 x 10 x 4.69 10 x 1.05 10 x 1.21 ) -1 -3 -3 -1 d /K mM 103 pol 6.64 88.5 75.5 -1 k 3.98 x 10 x 3.98 10 x 4.21 5.90 x 10 x 5.90 (s d 10 K 208 260 10.1 7.53 17.5 6.21 (μM) Error 3 (μM) 887 92.7 44.7 19.2 74.1 73.6 d K 1.38 x 10 x 1.38 ) -3 -2 -4 -1 -4 -1 -1 -1 pol k Error (s Error 1.19 x 10 x 1.19 10 x 1.67 10 x 1.58 10 x 3.19 10 x 6.71 10 x 5.19 10 x 1.43 -2 -1 -3 -3 ) -1 (s 1.98 6.56 5.56 pol k 3.69 x 10 x 3.69 10 x 2.97 10 x 3.73 10 x 8.14 rGTP rGTP dGTP dGTP dGTP dGTP Incoming nucleotide dATP (mismatch) dATP WT WT WT TERT TERT Y256A Y256A R194A Q308A variant tc

Table 3.1: Kinetic parameters and errors for tcTERT pre-steady-state kinetics of single nucleotide incorporation. Errors in the kpol and Kd error columns represent the error of fit for each parameter, calculated from the replot of the single turnover kinetics experiments. The kpol/Kd error was calculated via multiplicative error propagation of the two errors.

For the mismatched insertion, the kpol decreased 243-fold to 0.0081 s-1 and the Kd

increased 68-fold to 1.3 mM (Figure 3.2E). The resulting catalytic efficiencies (kpol /Kd) for

a matched versus mismatched nucleotide insertion indicate telomerase will insert the

wrong nucleotide ~1 in 20,000 nucleotide insertion events. This places telomerase at a

moderate fidelity of base selection compared to other DNA polymerases (Figure 3.2G,

114 data from ). For rNTP discrimination, the kpol for inserting a rGTP decreased 531-fold

to 0.0037 s-1 and the Kd increased 46-fold to 0.89 mM (Figure 3.2F). This results in a

nearly 25,000-fold decrease in the catalytic efficiency for the insertion of a rNTP

compared to a dNTP (i.e. sugar discrimination, Figure 3.2H, data from 111). Because the

cellular concentrations of rNTPs are around 50-fold higher on average than dNTPs, this

sugar discrimination indicates telomerase will insert a rNTP ~1 in 500 insertion events in a cellular context 110. For complete TERT enzymology parameters, see Appendix A.

Figure 3.2: TERT fidelity and sugar discrimination. (A) Pre-steady-state kinetics of WT tcTERT inserting dGTP opposite rC. Data were fit to Equation 1 (Table 3.1). Error bars represent the standard deviation of the mean. These experiments were also performed with WT tcTERT inserting dATP across from rC (B) and rGTP across from rC (C). Replots of the data and fits to Equation 2 were performed for: (D) dGTP across from rC, with a kpol was 1.98 ± 0.119 per second, and Kd of 19.2 ± 7.53 µM -3 -4 (E) dATP across from rC, with a kpol was 8.14 x 10 ± 6.71 x 10 per second, and Kd 1.38 ± 0.260 mM, and -3 -4 (F) rGTP across from rC. with a kpol was 3.73 x 10 ± 3.19 x 10 per second, and Kd of 887 ± 208 µM. (G) A comparison of TERT nucleobase fidelity (red line) compared to other DNA polymerase families. (H) TERT’s rNTP discrimination rates (red line) compared to select DNA polymerases is shown.

3.4.2 The steric gate of telomerase

The high cellular concentration of rNTPs has resulted in most DNA polymerases evolving a structurally conserved active site residue which provides sugar discrimination by reducing the rate of rNTP insertion 111-113. These residues are termed “steric gates”

because they clash with the 2’-OH of the incoming rNTP. Throughout the TERT catalytic

cycle, we observed that Y256 rests in the minor groove of the DNA and is in position to

clash with the 2’-OH of an incoming rNTP (Figure 2.4). Therefore, we hypothesized this

residue to be the steric gate in telomerase.

To test this hypothesis, we performed pre-steady-state kinetics of rNTP insertion

with the Y256A variant of TERT (Figure 3.3A, B). Compared to WT TERT, the insertion

-1 of a matched rGTP by Y256A showed a 1,490-fold increase in kpol to 5.5 s and a 12-fold

decrease in Kd to 73 μM (Figure 3.3C). The results for TERT Y256A inserting a matched

-1 dGTP were similar to that of the rGTP, with a kpol and Kd of 6.6 s and 74 μM, respectively

(Figure 3.3D). Thus, the sugar selectivity of TERT dropped from 25,000-fold between rGTP and dGTP for WT TERT to less than 2-fold for the Y256A TERT variant (Figure

3.3E). In other words, a single Y256A substitution increased rGTP insertion efficiency by

18,000-fold, abolishing almost all sugar discrimination. In a cellular environment, where rNTPs are at much higher concentrations than dNTPs, WT TERT would insert rGTP

~194-fold times less efficiently than dGTP. In contrast, TERT Y256A under cellular conditions would insert rGTP 77-fold times more efficiently than dGTP (Figure 3.3F)110.

We next verified that this ablation in sugar discrimination was due to specific changes in

the TERT active site rather than global rearrangements of the enzyme. To test for

structural rearrangements, we crystallized the pre-nucleotide binary state of TERT

Figure 3.3: The steric gate of TERT prevents insertion of rNTPs. Pre-steady-state kinetics of tcTERT with its steric gate removed (i.e. Y256A) for both (A) rGTP insertion and (B) dGTP insertion (Table 3.1). Error bars represent standard deviation of the mean, composed of at least three technical replicates. These graphs were replotted and fit to Equation 2 for (C) tcTERT Y256A inserting rGTP (the kpol was 5.56 ± 0.143 per second, and Kd was 73.6 ± 6.21 µM) and (D) tcTERT Y256A inserting dGTP (the kpol was 6.56 ± 0.519 per second, and Kd was 74.1 ± 17.5 µM). The error bars from these two plots represent the errors of the fits from the original plots. (E) A comparison of catalytic efficiencies (kpol/Kd) of both WT TERT and TERT Y256A for the insertion of dGTP (green) versus rGTP (red). (F) A comparison of TERT efficiencies adjusted for cellular nucleotide concentrations. (G) The prenucleotide binary complex of TERT Y256A (green) overlaid with the WT TERT prenucleotide binary complex structure (yellow). (H, I) The active site pocket of TERT Y256A, with DNA (white), RNA (purple), catalytic residues (blue), and nucleoside coordinating residues (cyan) shown. (J) The closest contacts to the C2 position of dGpCpp from the ternary structure (aligned and shown with yellow and green sticks) compared to Y256A tcTERT (cyan).

Y256A (state I), and saw minimal structural differences compared to the WT protein

(Figure 3.3G-J). The position of the primer terminus was not changed, and catalytic residues D251, D343, and D344 were in position to catalyze the nucleotidyl transferase reaction (Figure 3.3H). Upon closer examination of the active site pocket, the substitution of Y256 with alanine resulted in a more open nucleotide binding pocket (Figure 3.3H and

I). The distance from the C2 carbon of the dGpCpp to residue 256 shifted from 3.3 Å in

WT TERT to 5.7 Å in TERT Y256A, showing that the TERT Y256A has much more room to accommodate the 2’-OH (Figure 3.3J). Taken as a whole, these results indicate Y256 clashes with the 2’-OH of rNTPs to provide sugar discrimination and is the steric gate in telomerase.

3.4.3 Ribonucleotide insertion by human telomerase

To determine if human telomerase uses a similar mechanism to discriminate against rNTP insertion, we implemented human telomerase activity assays 15. In these assays, 1.5 telomeric repeats with the sequence 5’-TTAGGGTTAG-3’ were incubated with purified human telomerase, along with 50 µM of either all four dNTPs or all four rNTPs. We performed these tests with both WT telomerase and a telomerase Y717A variant, which is the homologous residue to tcTERT Y256. WT telomerase showed robust primer extension in the presence of dNTPs, with ~10% of the primer extended into product over the course of 30 minutes (Figure 3.4A, B). Similarly, with all four dNTPs, the Y717A variant reached ~7% primer extension in the same amount of time (Figure 3.4A, B). In contrast, when we incubated WT telomerase with all four rNTPs rather than dNTPs, we observed very low (<1%) primer extension after 30 minutes, suggesting that human telomerase discriminates against rNTPs, similar to tcTERT (Figure 3.4C, D).

When we performed the same telomeric extension assay with the steric gate variant (Y717A), we observed that the primer extension was increased 3-fold compared to WT hTERT in the presence of rNTPs (Figure 3.4C, D). These results indicate the conserved residue Y717 in hTERT (Table 2.1) is the steric gate in human telomerase.

Interestingly, in both WT and Y717A telomerase, minimal insertion past the first telomeric repeat was observed, which suggests the presence of rNTPs in telomere strands may inhibit the telomerase translocation step (Figure 3.4C).

We next assessed if more subtle alterations in the nucleotide mixture would also cause inhibition of telomerase extension. By substituting one rNTP into the nucleotide mix at a time, we could determine the effects of inserting one, two, or three rNTPs inserted per repeat (with rATP, rUTP, and rGTP, respectively). In each case, telomerase processivity was reduced, with no bands evident past the second telomeric repeat (Figure

3.4E). The inhibitory effect seemed to depend on the number of rNTPs inserted per repeat, with rGTP presence showing the greatest inhibition. For the steric gate Y717A mutant telomerase, the effect of rNTPs on telomerase’s processivity were much less pronounced. In the most extreme case of three rNTPs present per repeat, extension products are evident well into the second repeat, in contrast to the WT telomerase which had almost no insertion events (Figure 3.4E). Within these primer extension activity assays, the effects of rNTP insertions can also be observed at the single nucleotide level.

For rATP insertion with WT telomerase, we observed a buildup in substrates one nucleotide shorter than where the rNTP insertion would occur, composing ~60% of the total product formation (Figure 3.4F). This could be explained by the poor catalytic efficiency of telomerase for rATP insertion. With poor rATP insertion efficiency, the

Figure 3.4: The steric gate of human telomerase. (A) Timecourse of primer extension with all four dNTPs by WT (left) and Y717A (right) human telomerase. (B) Quantification of percent primer extension, with WT shown as a blue line and the Y717A mutant shown in red. (C) Timecourse of primer extension with all four rNTPs for WT telomerase (left) and Y717A telomerase (right). (D) Results from the gel in panel C, quantified with the mutant telomerase shown in red and WT telomerase shown in blue. (E) Primer extension from both WT (left) and Y717A telomerase. Each lane contains either all four dNTPs, or three dNTP and one dNTP marked in red. (F) A close-up view of the first two telomeric repeats from selected lanes in panel E. rNTPs present in the mix are shown in red. Quantifications of each band in terms of percent product are also shown and labeled by the position of the telomerase templating base. The base marked by an asterisk is complementary to the rNTP in the solution.

enzyme would stall directly before the rATP insertion event as shown on the gel. In

contrast. the Y717A variant did not stall at the rATP insertion event. Interestingly, rNTPs

inhibited translocation in both Y717A and WT telomerase, implying that telomerase

possesses rNTP discrimination mechanisms beyond the level of single nucleotide

incorporations (Figure 3.4E, F). The agreement between the results of the tcTERT and

hTERT points towards a universal mechanism of sugar discrimination by any telomerase

homolog with a tyrosine in this conserved position (Table 2.1).

3.5 Discussion

Our structural snapshots (Chapter 2) were complemented by kinetic studies,

allowing us to understand how telomerase chooses right from wrong nucleotides; that is,

selecting canonical dNTPs with correct base pairing compared to noncanonical rNTPs or

mismatched base pairing (Figure 3.5A). We found that telomerase fits in the moderate

range of base selection fidelity, similar to that of X-family polymerases involved in DNA

repair 114. Based on our kinetic values, we predict that telomerase nearly ~1 mismatch

per each 10 kb of telomere extension. Because telomerase does not have a proofreading

domain, misinsertions created by telomerase will be hidden from mismatch repair after

telomerase dissociates. Therefore, our fidelity measurement is predictive of cellular error

rates in telomeric sequences. Accordingly, our predicted error rate agrees with telomeric

error rates observed using telomere sequencing 115. While the downstream

consequences of telomeric mismatches are currently unclear, they likely would disrupt G- quadruplex stability and inhibit shelterin protein binding, as both of these phenomena are dependent on DNA sequence (Figure 3.5B) 87,116.

3.5.1 Telomeric Ribonucleotides

DNA polymerases insert millions of rNTPs into the genome during replication,

because of a large disparity in nucleotide concentrations; rNTPs are ~50-fold more

abundant in cells than dNTPs 110. Telomerase also must select against this disparity;

although telomerase is canonically thought to elongate telomeres with only dNTPs, our

kinetic results imply this is not the case. Instead, we predict that for every 10 kb of

telomere extension, telomerase inserts ~20 rNTPs, which represents selectivity

comparable to DNA polymerase β and DNA polymerase δ 111,113,117. However, it is

unknown whether ribonucleotides persist in telomeres or if they are addressed with

ribonucleotide excision repair (RER), similar to other genomic ribonucleotides 118. In our experiments with human telomerase, we found that even with increased rates of ribonucleotide insertion at the single nucleotide insertion level, telomere elongation was reduced via inhibition of the translocation step (Figure 3.4E, F). This reduction was

evident even with a single rNTP present in a telomeric repeat. It is possible that

telomerase pauses after inserting ribonucleotides to provide an opportunity for an

extrinsic proofreader or RER to remove the ribonucleotide before continuing telomeric

elongation. Telomerase has previously been observed to halt telomeric extension after

the insertion of other noncanonical nucleotides, particularly in the case of 8-

oxodeoxyguanine triphosphate (8-oxodGTP) insertion 65. Therefore, telomerase may stall

on noncanonical nucleotides, such as 8-oxodGTP, for similar proofreading reasons.

Telomerase discrimination against rNTP insertion is important because

ribonucleotides can cause multiple downstream problems for telomeric stability. These

Figure 3.5: A model for how telomerase preserves telomeric integrity and biological implications of telomeric ribonucleotides. (A) The telomerase catalytic cycle, with matched dNTP (green), mismatched dNTPs (orange), and matched rNTPs (red). For each nucleotide path, kinetic parameters are labeled for each step and the insertion probability. (B) Downstream consequences of mismatch insertion by telomerase may include evasion of mismatch repair, reduced binding affinity for shelterin proteins, and altered stability of telomeric G quadruplexes. (C) Eventual consequences of rNTP insertion by telomerase. After insertion into telomeres, ribonucleotides could cause harsh consequences via hydrolysis, or disrupting telomere capping and stability. Ribonucleotides may also be removed by ribonucleotide excision repair.

problems can arise from both the direct reduction of telomere length and from altering

telomere structure. Because RNA is more vulnerable to hydrolysis than DNA, inserted

ribonucleotides are more likely to hydrolyze, which would cause a telomeric single strand

break 87,119,120. If the strand break occurred before the complementary strand of the

telomere was copied and ligated to the genome, the elongating single strand would be

released, and the entire telomere elongation process would need to be restarted (Figure

3.5C). Alternately, the inserted ribonucleotide may not hydrolyze, but instead persist

within telomeres. The continued presence of ribonucleotides in telomeres may cause

several structural aberrations, including altering G-quadruplex stability and disrupting shelterin protein binding 121,122. Importantly, if shelterin proteins are not able to bind to

telomeres, telomere ends may be recognized as DNA damage, activating double strand

break repair and causing disastrous biological consequences.

3.5.2 Telomerase prevents the insertion of aberrant nucleotide triphosphates

Subtle alterations to the telomeric nucleotides have previously been shown to

cause telomeric disruption in a cellular context 36,65,123. Even changes of only one atom in

a nucleotide, including replacing an oxygen of a guanine with a sulfur to form therapeutic

6-thioguanine nucleotides or the adduction of an extra oxygen onto guanine to form 8-

oxoguanine nucleotides, have significant biological consequences in the context of

telomerase. Therefore, in order to prevent this telomeric disruption, the telomerase active

site appears to have evolved a high degree of stringency towards noncanonical

nucleotides, including both rNTPs and mismatched dNTPs. Using our structural

characterization of the telomerase catalytic cycle, we were able to efficiently modify the

stringent active site of telomerase, generating a human telomerase variant that readily

79 inserts rNTPs. The applications shown here highlight the potential of combining model

TERTs with complementary human telomerase studies to further probe the telomerase catalytic mechanism and screen future telomerase-targeting therapeutics.

Chapter 4: Therapeutic nucleotide insertions by DNA polymerase β

This chapter has previously been published in whole without any adaptations since publication and is reprinted here with permission. Schaich, M. A., Smith, M. R., Cloud, A. S., Holloran, S. M. & Freudenthal, B. D. Structures of a DNA Polymerase Inserting Therapeutic Nucleotide Analogues. Chem Res Toxicol 30, 1993-2001, doi:10.1021/acs.chemrestox.7b00173 (2017). Copyright © 2017 American Chemical Society.

Author contributions: B.D.F. designed the experiments, M.R.S., A.S.C., S. M. H., and M.A.S. performed the experiments, with A.S.C. and S.M.H. collecting structures under the supervision of M.R.S. and M.A.S.; M.R.S. prepared the data for deposition in the protein data bank, M.R.S. and M.A.S. created the figures, M.A.S. and B.D.F. wrote the manuscript, and the manuscript was edited by the other authors.

At the time of this publication, altering the established therapy of 6-thioguanine into 6-

thiodeoxyguanosine was previously shown to cause potent, telomerase dependent,

cancer cell death. Unfortunately, the mechanism by which 6-thioguanine was inserted

into telomeric DNA, or even any genomic DNA, was unknown, along with several other

nucleoside analogs currently in use in the clinic or under investigation for clinical use.

Therefore, we solved X-ray crystal structures of DNA polymerase β, a model mammalian

DNA polymerase, in the process of inserting four separate nucleotide analogs to

determine what structural features allowed them to be efficiently inserted into genomic

DNA. We hypothesized that modifications of the nucleobase portion of the analogs would

not significantly perturb the polymerase active site; in other words, nucleotide analogs

would be inserted with nearly the same mechanism to their unmodified counterparts. We

found that the base modifications were always present in the solvent-exposed major groove side of the DNA and did not contact any other active site residues.

Although we did not solve structures of TERT inserting 6-thiodeoxyguanine triphosphate, similarities in the active sites of the two enzymes allow some findings of this study to apply to TERT. These similarities including the position of the triad of aspartic acid residues at the primer terminus, and the contacts similar to Watson-Crick interactions made between the incoming nucleotide and the templating base. Based on these findings, we predict that 6-thioguanine nucleobases evade the nucleotide selection process of

TERT in a similar way as pol β: a modification in the major groove side would not clash

82 with any active site residue in TERT, similar to pol β. Unexpected structural rearrangements could also occur, however, and the structural characterization of 6-

TdGTP insertion by TERT remains an untapped future direction (see Chapter 5).

4.1 Abstract

Members of the nucleoside analog class of cancer therapeutics compete with canonical nucleotides to disrupt numerous cellular processes, including nucleotide homeostasis, DNA and RNA synthesis, and nucleotide metabolism. Nucleoside analogs

are triphosphorylated and subsequently inserted into genomic DNA, contributing to the

efficacy of therapeutic nucleosides in multiple ways. In some cases, the altered base acts

as a mutagen, altering the DNA sequence to promote cellular death; in others, insertion

of the altered nucleotide triggers DNA repair pathways, which produce lethal levels of

cytotoxic intermediates such as single and double stranded DNA breaks. As a

prerequisite to many of these biological outcomes, the modified nucleotide must be

accommodated in the DNA polymerase active site during nucleotide insertion. Currently,

the molecular contacts that mediate DNA polymerase insertion of modified nucleotides

remain unknown for multiple therapeutic compounds, despite decades of clinical use. To

determine how modified bases are inserted into duplex DNA, we used mammalian DNA

polymerase β (pol β) to visualize the structural conformations of four therapeutically

relevant modified nucleotides, 6-thio-2'-deoxyguanosine-5'-triphosphate (6-TdGTP), 5-

fluoro-2’-deoxyuridine-5’-triphosphate (5-FdUTP), 5-formyl-deoxycytosine-5’- triphosphate (5-FodCTP), and 5-formyl-deoxyuridine-5’-triphosphate (5-FodUTP).

Together, the structures reveal a pattern in which the modified nucleotides utilize Watson-

Crick base pairing interactions similar to that of unmodified nucleotides. The nucleotide modifications were consistently positioned in the major groove of duplex DNA, accommodated by an open cavity in pol β. These results provide novel information for the

84 rational design of new therapeutic nucleoside analogs, and a greater understanding of how modified nucleotides are tolerated by polymerases.

4.2 Introduction

DNA sequence conservation promotes genomic stability, which is vital for the

prevention of genomic mutations that can ultimately produce dysfunctional and/or

cytotoxic proteins. Reflective of its importance, many enzymatic pathways have evolved

to increase genome stability. These include free radical scavenging enzymes to reduce

cellular levels of reactive oxygen species (ROS), nucleotide metabolism enzymes to

balance nucleotide pools, and DNA repair enzymes to remove and replace DNA damage

from the genome124. When pathways that conserve genomic integrity are disrupted,

potent amounts of cellular DNA damage can result. In some scenarios, disruption of

genomic stability can have positive clinical impacts; one situation includes the specific

targeting of proliferating cancer cells that have compromised DNA damage responses125.

Different methodologies have been established to disrupt genome stability, including

radiotherapies, platinum-based antineoplasmics, and nucleotide analog cancer therapies,

all of which create cytotoxic effects by inducing high levels of DNA damage126,127. Recent

advancements toward increasing the amounts of DNA damage levels in cancer cells have

been made by both reducing the cellular capacity to tolerate damage and the generation

of additional DNA damage. Some examples of this include poly(ADP-ribose) polymerase

(PARP) inhibitors that restrict cells from repairing DNA damage, treatment with nucleoside analogs that cause mutagenesis and cytotoxicity, and promotion of DNA damage through inhibition of the nucleotide cleansing enzyme Mut T Homolog 1, which removes oxidatively damaged nucleotides before they can be inserted into the genome128,129.

4.2.1 Nucleotide analogs in the clinic

Nucleoside analogs have been used clinically as a method of introducing DNA

damage into cells for decades. This approach exploits DNA polymerases that will insert

modified nucleotides into the genome. These molecules must be structurally similar

enough to native nucleotides that DNA polymerases will still insert them, while at the

same time being structurally different enough from native nucleotides to cause mutagenesis and/or cytotoxicity. Currently, over 30 nucleoside analogs have been FDA approved for intervention against a variety of cancers, viral infections, and other diseases130. One of these compounds, 5-fluorouracil (5-FU), has been used in the clinic

for over 60 years and is still the treatment of choice for some cancers131,132. The

mechanism of action of 5-FU is complex; one important characterized cellular effect is its

action as an antimetabolite, inhibiting thymidine synthase. This starkly decreases the

synthesis of thymine nucleotides, causing imbalanced nucleotide pools and promoting

mutagenesis133. In addition to its role as a thymidine synthase inhibitor, 5-FU is

phosphorylated to 5-fluoro-2’-deoxyuridine-5’-triphosphate (5-FdUTP) and inserted into

the genome by DNA polymerases, directly generating DNA damage. Once inserted into

the genome, the toxicity of genomic 5-FU occurs during DNA repair when uracil-DNA

glycosylases (e.g., thymine DNA glycosylase) recognize and excise it, creating an

abundance of DNA breaks. This, in turn, causes cytotoxicity134. The complexity of 5-FU

toxicity is also mediated by its insertion into RNA by RNA polymerases135.

6-thioguanine (6-TG) is another clinically relevant nucleoside analog that has been

utilized since the 1950s, and yet its mechanism of action remains unclear136,137. With a

similar structure to guanine, one primary mode of action is its metabolic processing to 6-

thio-2'-deoxyguanosine-5'-triphosphate (6-TdGTP) and subsequent insertion into

genomic DNA by a DNA polymerase. While genomic incorporation alone is not by itself

particularly cytotoxic or mutagenic, the toxicity seems to be dependent on DNA repair

machinery, in a similar fashion to 5-FU. After insertion of 6-TG, the molecule is methylated

to 6-methylthioguanine, which base pairs with both T and C with similar efficiencies, and

targets mismatch repair to the lesion138. Due to the removal of the non-damaged C or T,

as opposed to the damage in the templating strand, repair is futile and eventually results

in cell cycle arrest138. Toxicity of the 6-TG was reduced by the ablation of the base

excision repair (BER) protein MUTYH (Mut Y DNA glycosylase), suggesting a role of BER

in mediating 6-TG response139. Additionally, insertion of 6-TG into the genome at

telomere regions was shown to cause telomere dysfunction, suggesting other complex

mechanisms of cytotoxicity still undiscovered77. Other cellular effects of 6-TG treatment have also been observed, such as a decrease in guanine nucleotide synthesis and disruption of Rac signaling, highlighting the complexity of 6-TG treatment140,141.

Importantly, many of these mechanisms of action are dependent on the insertion of 6-TG

into the genome via a DNA polymerase.

The treatment potential of additional nucleoside analogs is an active area of

investigation. 5-formylcytosine (5-FoC) treatment has been shown to specifically kill

cancer cells that are resistant to treatments with other pyrimidine analogs142. Unlike the

two compounds mentioned above, 5-FoC is produced in the cell under normal

physiological conditions when 5-methylcytosine is oxidized to 5-FoC, and is used as an

epigenetic signal143. The mechanism of 5-FoC cytotoxicity relies on cancer cells

overexpressing cytidine deaminase (CDA), which converts cytidine to uracil by removing

the amine group. Therefore, upon 5-FoC treatment, 5-FoC is converted to 5-FoU by CDA,

which is subsequently metabolically processed to 5-formyl-deoxyuracil-5’-triphosphate

(5-FodUTP) and inserted into the genome of cancer cells by DNA polymerases. Non-

cancerous cells do not overexpress CDA and therefore will not efficiently convert 5-FoC

to 5-FoU. Because 5-FoC is not phosphorylated to 5-formyl-deoxycytosine-5’-

triphosphate (5-FodCTP), 5-FoC treatment was published to be cancer cell specific via 5-

FodUTP insertion based toxicity142,144.

Although the mechanism of action for 5-FU, 6-TG, and 5-FoC treatments

universally disrupt genome stability, the structures of these nucleoside analogs are

diverse. Table 4.1 shows the molecular structure of these analogs and describes their

clinical uses. Although there is still much to learn about the way nucleoside analogs act,

it is known that for some (and in select cases, nearly all) of their cytotoxic and mutagenic

effects occur when they are inserted into the genome by a DNA polymerase73,142,145.

Therefore, understanding how therapeutic nucleosides are inserted into the genome is

an essential part of elucidating their mechanism of action and optimizing future

treatments. That being said, there is no reported structural information regarding the

molecular contacts required for 5-FdUTP, 6-TdGTP, 5-FodCTP, or 5-FodUTP being inserted by a DNA polymerase. Obtaining this information is vital to understand why these specific nucleoside analogs perform well therapeutically, and for the rational design of future therapeutic nucleoside analogs.

Structure Name Use

5-FU is used in the clinic for colon, rectal, 5-Fluorouracil pancreas, breast, stomach, and ovarian (5-FU) cancer interventions, and has been used in the clinic for around 60 years 132.

6-TG is used clinically for acute myeloid 6-Thioguanine leukemia, acute lymphatic leukemia, chromic myeloid leukemia, and also used as an (6-TG) immunosuppressant. 6-TG has also been used in the clinic for around 60 years 137.

5-FoC is currently under investigation for use 5- with cancers resistant to cytidine analog Formylcytidine treatment. It is thought that 5-FoC is converted (5-FoC) to 5-FoU before it can be inserted into the genome 142.

5-FoU is the deaminated form of 5-FoC, a 5- reaction is specific to cancer cells Formyluridine overexpressing cytidine deaminase. 5-FoC deamination presumably must occur before (5-FoU) phosphorylation and subsequent genomic 142 insertion .

Table 4.1: Selected nucleoside analogs and their clinical use.

4.3 Materials and Methods

4.3.1 DNA sequences used in this study

The following DNA sequences were used to generate the 16-mer DNA duplexes used in crystallization studies (the templating base is underlined): template, 5’-

CCGACGGCGCATCAGC-3’, 5’-CCGACAGCGCATCAGC-3’, 5’-

CCGACCGCGCATCAGC-3’; primer, 5’-GCTGATGCGC-3’; downstream, 5’-GTCGC-3’.

The downstream sequence was 5’-phosphorylated. Each oligonucleotide was suspended

in 10 mM Tris–HCl, pH 7.4 and 1 mM EDTA and the concentration was determined from

their ultraviolet absorbance at 260 nm. The annealing reactions were performed by

incubating a solution of primer with downstream and template oligonucleotides (1:1.2:1.2

molar ratio, respectively) at 95°C for 5 min, followed by 65°C for 30 min, and finally cooling

1°C min−1 to 10°C in a thermocycler.

4.3.2 Expression and purification of DNA Polymerase β

Human wild-type DNA pol β was overexpressed from a pET-28 vector in the BL21-

CodonPlus(DE3)-RP Escherichia coli strain. Purification of pol β was carried out as

described previously and briefly written here146. Cell lysate containing pol β was run over

GE HiTrap Heparin HP, GE Resource S, and HiPrep 16/60 Sephacryl S-200HR columns

and fractions containing pure pol β were concentrated and stored at –80°C in 20 mM

BisTris propane, pH 7.0 for crystallization. Pol β was determined to be pure by SDS page

and the final concentration was determined by A280 using a NanoDrop One UV-Vis

Spectrophotometer (ε = 23 380 M−1 cm−1).

4.3.3 Pol β/DNA complex crystallization

Pol β was complexed with 1-nt gapped DNA to form binary complex crystals containing the desired corresponding templating base. The binary complex crystals were grown as previously described in a solution containing 50 mM imidazole, pH 7.5, 13–19%

PEG3350 and 350 mM sodium acetate96. Binary pol β/DNA crystals were then soaked in

a cryosolution containing 25% ethylene glycol, 50 mM imidazole, pH 7.5, 19% PEG3350,

70 mM sodium acetate, 2 mM nucleoside analog (6-TdGTP, 5-FdUTP, 5-FodCTP, or 5-

FodUTP), and 50 mM CaCl2 for 20 min. This resulted in ground state ternary pol

β/DNA/nucleoside analog crystals for collection.

4.3.4 Crystallographic data collection and refinement

Data were collected at 100 K on a Rigaku MicroMax-007 HF rotating anode diffractometer equipped with a Dectris Pilatus3R 200K-A detector system at a wavelength of 1.54 Å . Data were processed and scaled using the HKL3000R software package147.

Initial models were determined using molecular replacement with the previously

determined closed (PDB: 2FMS) structure of pol β as a reference. All R-free flags were

taken from the starting model. Refinement was performed using PHENIX and model

building using Coot30,94. The metal ligand coordination restraints were generated by

ReadySet (PHENIX). The figures were prepared in PyMOL (Schrödinger LLC). For Figure

4.5, residue R61 (placed above the nucleotide triphosphate) in pol η is highly mobile,

sampling multiple rotamers with the same occupancy, therefore rotamer A is displayed97.

Ramachandran analysis determined 100% of non-glycine residues lie in the allowed

regions and at least 96% in favored regions.

4.4 Results

To determine the molecular mechanism by which 6-TdGTP, 5-FdUTP, 5-FodCTP,

and 5-FodUTP are inserted into the genome, we utilized human pol β. Pol β fills gaps in the DNA following DNA damage removal during BER, and is an established model for both kinetic and structural analysis of DNA synthesis71,148,149. Additionally, treatment with

any of these nucleoside analogs generates DNA damage, creating additional

opportunities for pol β to insert modified nucleotides during DNA repair. Pol β is also

upregulated in breast and colon adenocarcinomas; furthermore, cell lines with

overexpressed pol β exhibited increased sensitivity to nucleoside analog treatment150,151.

This suggests that pol β may be even more involved in inserting nucleotide analogs in

certain cancers and contribute to the selective toxicity of therapeutic nucleoside analogs.

Our experimental design involved crystallizing pol β in complex with a double

stranded DNA (dsDNA) substrate containing a 1-nt gap 152. This binary complex (pol

β/DNA) has been previously characterized with the pol β N-subdomain in an open position, generating a highly accessible active site for nucleotide binding. These binary crystals were transferred to a cryoprotectant solution containing the crystallization conditions, 25% ethylene glycol, CaCl2, and a nucleotide analog. Calcium ions allow for

nucleotide binding, but prevent nucleotide insertion

5-FdUTP 6-TdGTP 5-FodCTP 5-FodUTP Data collection Space group P21 P21 P21 P21 Cell dimensions a, b, c (Å) 50.5,79.7,55.6 50.2,79.2,55.6 50.6,79.6,55.6 50.7,79.6,55.3 α, β, γ (°) 90,107.2,90 90,107.4,90 90,107.2,90 90,107.3,90 Resolution (Å) 25-2.10 50-2.55 25-2.20 50-1.60 a Rsym or Rmerge 13.7 (45.9) 13.3 (76.6) 9.7 (43.0) 6.5 (74.0) (%) I/σI 15.3 (2.3) 15.1 (2.1) 12.7 (1.8) 20.6 (1.4) Completeness 99.8 (97.4) 99.9 (99.7) 98.0 (94.5) 99.9 (99.5) (%) Redundancy 4.5 (2.6) 7.0 (5.3) 3.4 (2.0) 4.2 (2.8)

Refinement Resolution (Å) 2.10 2.55 2.20 1.60 No. reflections 113357 96895 71630 232652 Rwork/ Rfree 17.1 (23.3) 20.6 (28.9) 18.6 (25.7) 20.3 (22.8) No. atoms 3536 3273 3378 3664 Protein 2626 2600 2618 2641 DNA 659 661 662 661 Water 243 7 90 354 B-factors (Å2) Protein 33.38 43.84 41.61 25.04 DNA/Analog 42.14/25.74 55.74/34.57 52.16/32.16 30.82/19.27 Water 38.78 32.93 42.01 32.58 R.m.s deviations Bond length 0.015 0.014 0.013 0.007 (Å) Bond angles 1.241 1.386 1.297 0.907 (º)

PDB ID 5WNY 5WNX 5WNZ 5WO0

Table 4.2: Data Collection and Refinement Statistics of Ternary Pre-catalytic Pol β:DNA Co-complexes with Incoming Nucleotide Analogues

a Highest resolution shell is shown in parentheses

by causing a minor change in the coordination distance between the primer terminus and

the α phosphate of the incoming nucleotide71,96. This allows the use of a natural primer terminus and nucleotide analog, rather than a dideoxy terminated primer or non-

hydrolyzable nucleotide analog version. For each modified nucleotide described below,

the soak promoted the N-subdomain shift from the open to closed conformation, placing

each analog in position for catalysis in a pre-catalytic ternary complex (pol

β/DNA/nucleotide)153.

4.4.1 Structures with incoming 5-Fluorouracil

To obtain a pol β pre-catalytic ternary complex with 5-FdUTP as the incoming

nucleotide opposite adenosine (A), we transferred a binary pol β crystal bound to 1

nucleotide (nt) gapped DNA into a cryosolution with 5-FdUTP and CaCl2. The crystal

diffracted to 2.10 Å (Table 4.2) the polymerase was in the closed conformation, and the

structure showed atomic contacts that were virtually identical to that of unmodified uracil.

Figure 4.1A shows a close up of the pol β active site with 5-FdUTP base pairing with a

templating A through its Watson-Crick face; with N1 of A acting as a proton acceptor for

N2 of 5-FdUTP, and the amine group branching off of N6 from A acting as a proton donor

for the carbonyl group at the C4 position of 5-FdUTP. This is identical to structures of unmodified uracil with the base pairing interaction remaining planar (Figure 4.1B). The

distances of the hydrogen bonding interactions (2.6 and 3.1 Å, see Figure 4.1C) and the

position of the pyrimidine ring itself are not altered between uracil and 5-FdUTP (see

Figure 4.1C). Both metal binding sites were occupied by Ca2+.

Figure 4.1 Pol β ternary ground state with incoming 5-FdUTP base pairing to A. (A) Key active site residues of pol β (gray) inserting 5-FdUTP (yellow) are shown in stick format with the N- subdomain in cartoon. Calcium ions are shown in orange. The fluoro group is shown in cyan. (B) An overlay of the 5- FdUTP (gray) with an incoming dUTP (salmon) nucleotide base pairing to A (PDB 2FMS). (C) A 90° rotation of B, showing the Watson–Crick edge of the overlay, with distances labeled in Å. H-bonds are shown as dashed lines. (D) The omit map (green), contoured at 2.0σ.

The fluoro group that modifies dUTP into 5-FdUTP was observed to be pointing into the major groove of the double stranded DNA helix. This placement puts it in an isolated position, with no direct atomic contacts to either the enzyme or dsDNA. Because the fluoro group is isolated to the major groove, it does not require any major structural rearrangements of either the protein, DNA, or its triphosphate group. Previous findings show that 5-FdUTP will be incorporated into DNA by pol β at similar rates to native nucleotides, and are consistent with our structural snapshot showing the enzyme in a position to support catalysis, if it were in the presence of magnesium154.

4.4.2 Structures with incoming 6-Thioguanine

To determine the structure of 6-TdGTP being inserted opposite a templating C, we soaked a binary pol β crystal bound to dsDNA with a 1-nt gap (templating C) in a cryosolution containing CaCl2 and 6-TdGTP. The crystal diffracted to 2.55 Å, and the polymerase was in the closed position with 6-TdGTP bound (Table 4.2). Figure 4.2A shows a close-up of the active site, with the incoming 6-TdGTP base pairing with C through the Watson-Crick face and the incoming nucleotide is poised for insertion. The structure obtained was nearly identical to that of pol β inserting unmodified dGTP opposite

C, as indicated by an overlay of the two nucleotides in Figure 4.2B, C. The major difference between the base pairing contacts of 6-TdGTP compared to unmodified dGTP is that the thio group is substituted for the carbonyl group at the C6 position. This sulfur substitution has the same number of valence electrons and can act as a hydrogen bond acceptor. However, the difference in bond lengths alters the position of the incoming nucleotide analog slightly155. The difference in length of the hydrogen bond between the modified thio group and the C2 carbonyl group of the templating C is 3.5 Å, which is

slightly longer than typical hydrogen bond distances (Figure 4.2C). For comparison, the

hydrogen bond from the oxygen on C6 of unmodified dGTP to the C2 carbonyl of C is

2.98 Å.

The C6 thio group also points towards the major groove of the DNA helix and is

located at the Watson-Crick face of the modified nucleotide. In contrast to 5-FdUTP,

where the modification is isolated from any atomic contacts, the thio modification was

within weak hydrogen bonding distance of the templating C (3.5 Å). That being said, this

structure also revealed no perturbation to the position of any residues of pol β (relative to

the insertion of an unmodified G) and the triphosphate of the incoming nucleotide (see

Figure 4.2). Additionally, both metal binding sites were occupied by calcium, with the

nucleotide binding in a catalytically competent position. This agrees with previous kinetic

results that suggest several human DNA polymerases insert 6-TdGTP at comparable

efficiency to dGTP, and that pol β also can insert it at a reduced efficiency 156,157.

4.4.3 Structures with incoming 5-Formylcytosine

To determine the structure of 5-FodCTP being inserted opposite a templating

guanine (G), we soaked a binary pol β crystal bound to 1-nt gap DNA with a templating

G in a cryosolution containing CaCl2 and 5-FodCTP. The crystal diffracted to 2.20 Å and pol β was in the closed pre-catalytic conformation, poised for insertion with two calcium ions bound (Figure 4.3A and Table 4.2). Because the adducted formyl group does not alter the Watson-Crick face of 5-FodCTP, hydrogen bonding occurs between the Watson-

Crick face of 5-FoC and G, with 5-FodCTP binding in a highly similar position to unmodified dUTP. Also, the pyrimidine ring is planar with respect to the templating G

(Figure 4.3B, C). Even though the atomic contacts involved in mediating the hydrogen

Figure 4.2: Structural features of the precatalytic pol β ternary ground state with incoming 6-TdGTP. (A) A close-up view of the pol β active site (gray), with the incoming 6-TdGTP shown in cyan and the sulfur modification in yellow. The N-subdomain is shown in cartoon, DNA in gray, and calcium ions in orange. (B) A superposition of 6-TdGTP with a structure of an incoming dGTP nucleotide base paring to C (in salmon, PDB 4UB4). (C) A view of the Watson–Crick edge of the incoming 6-TdGTP, with H-bonds shown as dashed lines and distances in Å. (D) An omit map (green) for the 6-TdGTP, contoured at 2.0σ.

bonding distances with the templating G are similar for both the dUTP and 5-FodCTP, the

oxygen of the adducted formyl group is positioned within hydrogen bonding distance of

the amine group at the C4 position on the pyrimidine ring (Figure 4.3C). This second

proton acceptor pulling from the amine group changes the hydrogen bonding character

along the Watson-Crick edge, and may act to reduce the stability of base pairing158. This interaction also places the formyl group in a position to prevent clashing with the nearest phosphate group, and is stabilizing enough that clear electron density (Figure 4.3D) was seen for this group with a B-factor of 34.8 Å2.

As with 5-FU and 6-TG, 5-FoC is positioned with its altered group in the major

groove of the DNA helix. This positioning allows for the group to be isolated from any

protein contacts, and because of its resemblance to a native nucleotide, no shifting was

observed for any amino acid residues in comparison to a native nucleotide. Previous

kinetic studies of other pyrimidine analogs with highly similar structures to 5-FodCTP,

such as 5-fluorodeoxycytidine triphosphate, 5-fluorodeoxyuracil triphosphate, and 5-

formyldeoxyuracil triphosphate, show that pyrimidines modified at the C5 position are

tolerated well by polymerases154,159. These previous kinetic results corroborate our

structural result that 5-FodCTP minimally disrupts the active site of pol β, implying that

insertion is expected to proceed efficiently.

4.4.4 Structures with incoming 5-Formyluracil

To acquire pre-catalytic structures of pol β inserting 5-FodUTP into dsDNA containing a

1-nt gap with A in the templating position we soaked binary pol β crystals in a cryosolution

with CaCl2 and 5-FodUTP. The resulting crystal diffracted to 1.60 Å with pol β in the

closed position, 5-FodUTP bound, and two Ca2+ ions present in the active site

100

Figure 4.3: Structure of incoming 5-FodCTP, primed for insertion by pol β. (A) 5-FodCTP (green) bound in the active site of pol β (gray), with key residues shown as sticks. DNA bases are labeled, with calcium ions shown in orange. The N-helix is shown in cartoon form, and the formyl modification is labeled. (B) Close-up of the incoming 5-FodCTP and its templating base, overlaid with a structure of an incoming pyrimidine base (in salmon, PDB 2FMS). (C) A 90° rotation of B, highlighting the Watson–Crick edge of the nucleotide, with distances displayed in Å. (D) A green omit map, contoured at 2.0σ for the incoming 5-FodCTP.

101

(Table 4.2 and Figure 4.4A). 5-FodUTP binds in a nearly identical fashion to its unmodified counterpart dUTP, with the pyrimidine ring planar and hydrogen bonding to the templating A through the Watson-Crick face (Figure 4.4B, C).

Although the formyl group of 5-FodUTP was placed in the major groove of the

dsDNA, its position is slightly changed compared to 5-FodCTP; this altered placement is

presumably driven by the change from a positively charged C4 amine in 5-FodCTP to a

negatively charged C4 carbonyl in 5-FodUTP, which would eliminate hydrogen bonding

potential with the proton accepting formyl modification. Even without this stabilization,

clear density was seen for the formyl group with a B-factor of 28.2 Å2. The formyl group

is placed in the major groove of the dsDNA in both cases. For 5-FodUTP, the formyl group

is isolated from any DNA or protein contacts, allowing the polymerase to adopt a

catalytically competent conformation to support efficient insertion into the DNA. This is

consistent with kinetic studies of other DNA polymerases (DNA polymerase I, T7 DNA

polymerase, and Taq DNA polymerase) that show efficient insertion of 5-FodUTP159.

4.5 Discussion

In this article, we present novel X-ray crystal structures of four nucleotide analogs

(5-FdUTP, 6-TdGTP, 5-FodCTP, and 5-FodUTP) base pairing to their respective

complementary base and poised for insertion by pol β. Structures representing each of

the four modified nucleotides reveal a trend that these therapeutic nucleotides are

inserted with the modified portion of the nucleobase placed in the major groove of the

DNA helix in the absence of an observed shift in the DNA or polymerase active site. Prior

102

Figure 4.4: Structure of pol β bound to gapped DNA and 5-FodUTP. (A) A close-up view of 5-FodUTP (purple) in the active site of pol β (gray). Calcium ions are shown in orange, and the N-subdomain is displayed as a cartoon. DNA bases (gray) are labeled, along with the adducted formyl group. (B) A structural overlay of 5-FodUTP with a structure of an incoming dUTP base in the active site of pol β (in salmon, PDB 2FMS). (C) Rotated view of B, showing the hydrogen bonding of the bases as dashed lines and selected distances shown in Å. (D) Omit map for the incoming 5-FodUTP in green, contoured at 2.0σ.

103

to this study, structural characterization of these modified nucleotides was largely limited

to structures of the analogs in dsDNA with no enzyme present, and never with any

polymerase143,160-166. Of note, all of these previous structures show identical base pairing

orientation as observed here, within the pol β active site. This implies that after catalysis

occurs, the structure of the modified nucleotide remains in the same conformation in

which a DNA polymerase inserted it. This is in contrast to previous reports of non-

canonical base pairs (such as mismatches) that adopt alternative conformations in DNA

alone and in protein/DNA complexes70. This underscores the importance of determining the molecular structures of insertion to understand how nucleotide analogs are accommodated in the genome and mediate their biological impact.

4.6.1 Comparison between the structures of modified nucleotides and native nucleotides

It is important to note that, although insertion of these modified nucleotides by pol

β is physiologically relevant during DNA repair, it is likely that many insertion events occur during DNA replication or translesion synthesis167,168. Additionally, some of the cytotoxic

effects of these compounds result from their insertion by RNA polymerases135. Therefore,

cytotoxicity may be mediated by insertion events from many polymerases with diverse

structures, which may interfere with the nucleobase modification. For example, other

polymerases may have additional domains or alternate conformations that result in

contacts with the DNA major groove. To probe this possibility, we compared our

structures with ternary (protein/DNA/nucleotide) crystal structures of polymerases

representing DNA replication (polymerase δ), translesion synthesis (polymerase η), and

transcription (N4 RNA polymerase), presented in Figure 4.5. This was done by overlaying

104

the incoming nucleotide with our modified analog and inspecting the major groove for

clashes. There was no obvious clashing that would interfere with the insertion of each

analog, suggesting that our results for pol β insertion of these modified nucleotides may

also be relevant during DNA replication, translesion synthesis, and transcription.

Consistent with this, recent structures of KlenTaq DNA polymerase inserting non-

physiological nucleobases, such as dyes, affinity tags, and other groups with high steric

demand were shown to be tolerated in the major groove 169.

4.5.2 Concealing base modifications by placing them in the major groove

Multiple rationales would explain why modification placement in the major groove

has been selected for in prevalent nucleoside analog therapies. One being placement of

modifications in the major groove (such as the C5 and N7 positions of pyrimidines and

purines, respectively) minimizes the structural perturbation to the DNA polymerase during

insertion. Large structural disruptions of the polymerase active site have been shown to

reduce activity, and placements of modifications in the major groove side evades this

potential problem152,170. A second rationale is that placement of modifications in the major

groove causes minimal disruption to Watson-Crick base pairing interactions, which act to stabilize the incoming nucleotide in the active site and increase insertion efficiency. A third rationale is that placement of modifications in other positions may cause structural disruptions. For example, a modification positioned near the nucleotide triphosphate may clash with catalytically key groups (e.g. α phosphate) and require additional structural rearrangements for insertion170. Alternately, large modifications in the DNA minor groove

105

Figure 4.5: Other polymerases have minimal contacts with the major groove side of incoming nucleotides. We modeled an incoming 5-FdUTP as sticks, the DNA as a cartoon, and each enzyme as a surface representation. A purple sphere is placed where the fluoro group at the C5 position of the ring would rest. (A) A white surface representation of a structure from this work of pol β with incoming 5-FdUTP. (B) A blue surface of N4 RNA polymerase, with less restriction along the major groove, and nothing contacting a modification at the C5 position (PDB 3Q23). (C) A green surface representation of DNA polymerase η, with no contacts to a modification in the major groove (PDB 3MR2). (D) A dark gray surface of the replicative DNA polymerase δ, also with minimal contacts along the major groove of DNA (PDB 3IAY).

106

would be sterically hindered by enzyme contacts, as DNA polymerases from many

families have been observed to contact the minor groove69,170.

Furthermore, many of these compounds, in the state in which they are inserted to

the genome, have minimal cytotoxicity and mutagenicity. The full cellular effects of

insertion do not occur until other enzymes recognize them and excise them, causing their

cytotoxicity155. The placement of the adducted group in the major groove could also aid

in the post-insertion processing. For example, 6-TG is not cytotoxic until the sulfur group

is methylated, because the methylated version of 6-TG base pairs with T, promoting

mutagenesis171. Thus, placement of the sulfur group in the major groove of the helix could make it more accessible to agents that methylate it, such as S-adenosylmethionine.

Additionally, it is possible that placing the modifications in the major groove not only

assists in the insertion of the genome, but also creates a more detectable lesion for DNA

repair machinery139. Increasing detectability of lesions could lead to higher rates of excision, and therefore would cause more cytotoxic DNA repair intermediates to be generated.

4.5.3 Applying results of DNA polymerase β to other nucleic acid enzymes

Here, we have reported the first atomic resolution structures of pol β inserting

multiple therapeutically relevant nucleotide analogs into dsDNA. The cellular impacts of

these therapeutic compounds are complex and there is still much to learn about their

biological mechanism of action172. That being said, much of the mechanism of action of

these compounds is dependent on their insertion into the cellular genome, and

understanding the mechanism of insertion is vital to understanding why these drugs are

effective. From drugs that have been in the clinic for six decades, to recently discovered

107 compounds currently under investigation for therapeutic relevance, similarities can be seen in the way these structures are inserted into DNA. These themes include Watson-

Crick base pairing similar to unmodified nucleotides, minimal alterations to the positioning of triphosphate groups, and positioning of the adducted modification into the major groove of the DNA during insertion. These themes can serve to guide rational optimization of future drug structures for novel nucleoside analog therapies.

108

Chapter 5: Discussion and future directions

109

Cumulatively, this work has provided a deeper understanding of how telomerase

maintains telomeric integrity using its stringent active site pocket. In Chapter 2, we offer the first complete structural analysis of the telomerase catalytic cycle, including TERT engagement at the telomere end, correct deoxyribonucleotide binding, and phosphodiester bond linkage of that nucleotide to the telomere end. These structural snapshots generated new hypotheses regarding how telomerase selects correct, matched, dNTP building blocks for telomere insertion, but structural information alone was not enough to test these hypotheses. Therefore, in Chapter 3, pre-steady-state enzymology of tcTERT was applied, allowing us to quantify for the first time the Kd of

nucleotide binding and kpol of nucleotide insertion by any TERT homolog. Furthermore,

combining structural and kinetic information, we determined a single nucleotide residue

responsible for preventing ribonucleotide insertion. With this identified, we determined

that human telomerase exhibits similar nucleotide selectivity as tcTERT alone, but

possessed additional stringency mechanisms at the translocation step of the catalytic

cycle. In Chapter 4, we used DNA polymerase β as a model polymerase for nucleotide

selection and determined that nucleoside analogs (currently under investigation to target

telomeres of cancer cells) evade active site residues by placing their modification in a

solvent-exposed portion of the nucleotide binding pocket. Because of the similarities

between the active sites of pol β and telomerase, we predict that the structural

observations in Chapter 4 apply to both enzymes. Cumulatively, work presented in this

dissertation provides a better understanding on how telomerase selects the correct

nucleotides to preserve telomere integrity, and gives rationale about how noncanonical

and modified nucleotides (including 6-thiodeoxyguanine) are inserted by telomerase.

110

5.1 Maintenance of telomere integrity by the telomerase active site

Many lessons about the maintenance of telomere integrity can be gleamed from the well-defined mechanisms of DNA replication. During DNA replication, both strands of genomic DNA are unwound to serve as a template for high fidelity DNA polymerases.

After the two complementary strands are generated by replicative polymerases, the nucleotides that were selected by the DNA polymerases remain base paired to the original template strand. Because the original genetic information persists within the newly synthesized double stranded DNA, any mistakes made by replicative polymerases can be proofread by mismatch repair. Other repair systems prevent DNA transversions resulting from DNA damage, such as the glycosylase MUTYH, which detects 8-oxoG base paired with adenine, and removes the adenine to give DNA repair systems another chance to insert a cytidine across from the damage 173. Thanks to the additional proofreading of DNA repair pathways, the error rate of replicative polymerases is reduced from ~1 out of every ten million to ~1 in every billion insertions 114.

In contrast to replication of genomic DNA by DNA polymerases, the newly synthesized telomeric ends by telomerase do not remain bound to a template strand. The

RNA template strand of telomerase stays bound with the enzyme through each translocation step; therefore, telomere synthesis is better described as the elongation of single stranded DNA. Any mistakes made by telomerase during single strand elongation cannot be detected by repair pathways such as mismatch repair (see Figure 3.5).

Therefore, the accuracy of telomerase must be high to maintain telomeric stability, because secondary systems will not correct any mistakes made by telomerase. Our findings agree with this perspective by demonstrating that telomerase possesses a

111 moderate fidelity to that of replicative DNA polymerases at ~1 in 10,000. Therefore, in a telomere extension involving around 10,000 bases, only ~1 mismatch is predicted to insert.

Sometimes, however, telomerase inserting errors can prove beneficial from a clinical perspective. Using nucleotide analogs to target the telomeres of cancer cells (as in Chapter 2) requires an evasion of the stringent telomerase active site. The ability of nucleotide analogs to evade active site selectivity is likely dependent on the placement of the modification. Based on the research performed within this dissertation, nucleotide analogs such as 6-thioguanosine triphosphate, would be expected to place the modification in solvent exposed regions (i.e. the major groove) rather than a position coordinated by the enzyme to allow for efficient insertion. By avoiding telomerase from recognizing the nucleotide analog substrates as different than their unmodified counterparts, regenerated telomeres can be heavily damaged specifically in telomerase positive cancer cells, offering an avenue for the development of new therapies.

5.1.1 Self-proofreading by telomerase

Although many of the DNA repair systems that ameliorate errors introduced by

DNA polymerases cannot act on telomerase, our findings point to a novel form of error recognition inherent in telomere extension. Namely, the RNA component of telomerase must reanneal to the most recently synthesized telomeric repeat in order for telomere elongation to continue. In situations where errors have been introduced to the telomeric sequence, the binding affinity for telomerase to the newly synthesized telomeric repeat will be starkly reduced, especially considering the minimal (as low as 3) number of nucleotides that are available to anneal after translocation. Therefore, if telomerase

112 makes errors, the efficiency of the translocation step will reduce (Figure 1.2). Certain nucleotides, including 8-oxoG, have previously been demonstrated to terminate telomere elongation, but the results in this dissertation point to even greater sensitivity by telomerase; ribonucleotides, which contain identical Watson-Crick face to their canonical deoxyribonucleotide counterparts, also starkly reduce translocation by telomerase

(Figure 3.4) 65. Therefore, evidence points to a unique mechanism by which telomerase maintains the telomere integrity by double checking the telomere sequence during each translocation step and halting telomeric elongation in scenarios where the templating

RNA component does not align with the growing telomere strand.

The eventual fate of telomerase stalled at the translocation step currently remains unknown (Figure 5.1). One possible outcome is that once telomerase stalls due to the insertion of an error, no more elongation by that molecule of telomerase occurs.

Eventually, that stalled molecule of telomerase would dissociate, leaving one end of a chromosome significantly shorter than other ends that had successful elongation by telomerase. After a single round of cell division, whatever error introduced by telomerase would erode from telomeres due to the end replication problem, leaving a correct telomere sequence in its place. With this system, telomere integrity would be maintained, but the rate at which telomeres could be regenerated would be reduced. After a stunted elongation, cells could undergo excessive telomere shortening, eventually causing error insertions by telomerase to cause cells to enter senescence and eventually undergo apoptosis

In situations where telomerase is disrupted by an error incorporation of telomerase, adaptations could also allow cells to recover, Potentially supporting the prevalence of this

113

scenario, telomerase expression has been previously shown to target the shortest

telomere in a cell 174. The next time that telomerase is expressed, that telomere will have

the highest priority in elongation, which could cause that telomere to regenerate to the

point of other telomere lengths. Based on the error rates found in this dissertation,

however, telomerase is expected to insert a ribonucleotide after approximately every 100

insertion events. Therefore, the differential recruitment in telomerase would likely have to

persist throughout multiple cell divisions in order for one chromosome to establish similar

telomere lengths to other chromosomes. Telomere length has been observed to be

heterogeneous, varying up to ~10 kb from chromosome to chromosome, which stochastic

stalling of telomerase caused by error incorporation may help explain.

Alternately, it is possible that telomerase recruits a nuclease to cut out the

erroneous insertion, enabling telomerase to further elongate the growing telomere strand.

An extrinsic nuclease could act in a similar fashion to previously observed exonuclease

domains of DNA polymerases, which have been observed to increase their accuracy by

several orders of magnitude 114. Supporting this hypothesis, a telomerase-associated

nuclease has been observed to co-purify with telomerase, and was demonstrated to

excise up to 13 nucleotides in a telomeric primer 175,176. With this hypothesis, telomerase would make a mistake, disengage from the telomeric end, allow for nucleases to trim back the telomeric end, and then start over without waiting for another round of cell division.

Unfortunately, although nuclease activity has been observed, the sequence and identity

of this nuclease is currently unknown. The ‘telomerase-associated nuclease’ could be a

previously unidentified telomerase-specific nuclease that is specialized to proofread

telomerase errors, or it could be a known nuclease repurposed from another role in the

114

Figure 5.1 : Possible eventual fates of errors incorporated by telomerase. Once errors have been incorporated by telomerase, elongation stalls at the translocation step. This stalled telomerase could potentially be resolved one of three ways: (Left) The stalled telomerase fails to continue, resulting in a telomere end with stunted elongation. A new telomerase molecule would be needed to reinitiate elongation, otherwise the apoptosis could occur (Middle) The error is tolerated by telomerase, allowing for telomere elongation to continue. This species results in errors being incorporated into telomeres, potentially being addressed by downstream DNA repair pathways. (Right) An extrinsic nuclease cuts back the single stranded DNA of the freshly elongated telomere, removing the damage before telomerase re-engages and continues telomere elongation

115 cell. One last possibility is that after a misinsertion, telomerase can perform its reverse nucleotidyl transferase reaction to remove the error, essentially acting as its own nuclease. However, telomerase has not been observed thus far to have this capability, and the telomerase associated nuclease activity is associated with lower purity levels of telomerase, so an extrinsic nuclease seems more likely 176.

5.1.2 Telomere error tolerance

Aside from telomerase correctly maintaining telomere sequence, another standing question in the field is: how many errors can be tolerated in telomeres before they negatively impact the biology of a cell? Unfortunately, one major technical limitation that has prevented probing this question is difficulties in sequencing telomeres. Because the sequence is so repetitive, traditional next generation sequencing approaches have not worked well for telomere sequencing, due to an issue known as “phasing” 115. Phasing occurs because DNA must be cleaved into short pieces before many sequencing approaches can be applied. This does not cause an issue with typical genomic DNA, because overlapping reads between non-repetitive DNA can be obtained and reassembled into the full sequence. With repetitive sequence such as telomeres, however, the reads are so short that it is difficult to identify whether the read comes from repeat 1-10 of a telomere or repeat 90-100, which in turn makes reassembly of the different reads difficult, if not impossible. Additionally, because every chromosome would contain telomere sequences, it is challenging to even know from which chromosomal end that the telomere sequence originated. Therefore, without an accurate readout on how many aberrances in telomere sequences, the amount of DNA damage or mutagenesis that can be tolerated by telomeres has been difficult to quantify.

116

One way to solve the phasing problem in telomere sequencing is to obtain reads long enough to span the entire length of telomeres. If the average read length of a sequencing technology was much longer than telomere length, then an accurate portrayal of both the sequence and length of a telomere could be obtained. Furthermore, if read length was long enough, genomic DNA adjacent to telomeres could be sequenced in addition to the telomeric DNA, potentially allowing for the differentiation between telomere length of each individual chromosome. Currently, alternate sequencing technologies have been developed that have the potential to sequence an entire telomere in a single read.

One of these technologies, known as nanopore sequencing, was recently optimized to generate “ultra-long reads” of up to ~1 Mb 177. Reads of this length are ~two orders of magnitude longer than the expected telomere length. With this optimization, full telomere sequences were obtained for several chromosomes, which varied from ~2 to 10 kb 177.

Harnessing nanopore technology to obtain accurate telomere sequences may allow future studies of telomerase fidelity at the cellular level. Furthermore, because nanopore translates the sequence using changes in current, small base modifications have already been detectable using nanopore technology due to their signature changes in current, including 8-oxoG and methylated cytosines 178,179. Because nanopore sequencing of telomeres is such a recent development, however, the levels at which these base modifications persist in telomeres remain to be seen.

5.1.2 Advances in structural biology of telomerase

For the reasons outlined in Chapter 1, structural characterization of telomerase has proven challenging, despite its foundational importance in molecular cell biology and potential clinical applications. To address the challenges in TERT structural biology, one

117

successful strategy has been to employ model systems, including Tribolium castaneum

and Tetrahymena thermophila (see Chapter 1). These model systems have proved

instrumental in gaining a molecular understanding of how telomerase extends telomeres.

They do not, however, fully substitute for studies of human telomerase. Recently,

technological improvements in the field of cryo-electron microscopy (including direct electron detectors, movie mode data acquisition, and advances in the software for obtaining cryo-EM structures) have enabled structure determination with small protein samples, lower purities, and even heterogeneity in structure 180,181. With these

improvements, atomic-resolution structures of the human telomerase holoenzyme seem

to be on the horizon; the resolution of telomerase structures improves with each

publication (Figure 5.2). For instance, the highest resolution structure of human

telomerase increased from ~30 Å to 8 Å in a blisteringly short 5-year period. If the

resolution of human telomerase follows similar trends as the cryo-EM structures of

Tetrahymena thermophila, it may be possible to achieve a sub-5 Å structure as early as

2022. It will likely be several years after that structure before human telomerase achieves

a high enough resolution to know the positions of amino acids in the protein with certainty.

Based on sequence homology and secondary structure motifs, we anticipate that our

results with Tribolium castaneum TERT will be recapitulated with human telomerase, but

those questions cannot directly be answered without an improved resolution of human

TERT.

118

Figure 5.2: Advances in highest TERT resolutions over time. Over time, Tribolium castaneum TERT (tcTERT, green) has remained at better than 3 Å resolution (dotted line). In contrast, cryo-EM structures of both Tetrahymena thermophila (red) and Homo sapiens (blue) telomerase have dramatically increased in resolution. If current trends continue, a human telomerase structure at 5 Å or better may be published by 2022.

119

5.2 Future directions and concluding remarks

Throughout this dissertation, model systems were used in place of human

telomerase, from Tribolium castaneum telomerase reverse transcriptase in Chapter 2 and

Chapter 3 to DNA polymerase β in Chapter 4. These studies revealed a deeper understanding of the specific mechanisms by which telomerase protects telomere integrity by preventing the insertion of noncanonical nucleotides. So far, we were able to specifically look at misincorporation of ribonucleotides, mismatches, and therapeutic nucleotides, but still many lingering questions remain. For instance, we uncovered the mechanism by which 6-thiodeoxyguanosine triphosphate evades selectivity by DNA polymerase β, but do not have a clear answer for how it is inserted in the context of TERT.

If that information were obtained, future studies towards structure-based drug design could be attainable; 6-thiodeoxyguansoine treatment already shows striking therapeutic potential. A structure of 6-thiodGTP poised for insertion in the TERT active site would give insight towards ways that the compound could be improved while maintaining efficient insertion. Of note, recent publications have already begun to use structural biology of tcTERT for optimization of telomerase targeting therapeutics 182.

Similarly, the relationship between 8-oxoguanosine triphosphate and telomerase

remains enigmatic; why does the insertion of 8-oxodGTP by telomerase cause telomere

elongation to terminate? Our work with 8-oxoG extension by DNA polymerase β revealed

different trends than what has previously been observed with telomerase, so it is unlikely

that this question will be answered until direct mechanistic insight has been obtained for

TERT inserting 8-oxodGTP. Uncovering the interplay between 8-oxoguanine and

telomerase is key to understanding the mechanisms by which chronic oxidative stress

120 causes aging phenotypes. Beyond 8-oxoG and telomerase, there are many other conserved TERT mutations that are associated with various diseases, structure-function studies with these mutants may provide answers to longstanding questions regarding why specific mutations cause disease phenotypes, and the further deepen the understanding of the roles of amino acids in the active site. In the end, we provided insight to many fundamental questions of how telomerase filters out noncanonical nucleotides to protect telomere integrity. However, our work also uncovered further mysteries that remain to be answered regarding how telomerase specifically protects telomere integrity, as outlined above. With powerful model systems, advances in structural biology, and other innovative approaches, I remain optimistic that these exciting questions about telomerase will soon be answered, and that a deeper understanding of how telomerase works will drive telomerase-targeting therapeutics to the clinic.

121

References

1 Carrel, A. ON THE PERMANENT LIFE OF TISSUES OUTSIDE OF THE ORGANISM. J Exp Med 15, 516-528, doi:10.1084/jem.15.5.516 (1912). 2 Witkowski, J. A. Dr. Carrel's immortal cells. Med Hist 24, 129-142, doi:10.1017/s0025727300040126 (1980). 3 Hayflick, L. & Moorhead, P. S. The serial cultivation of human diploid cell strains. Experimental cell research 25, 585-621 (1961). 4 Shay, J. W. & Wright, W. E. Hayflick, his limit, and cellular ageing. Nature reviews. Molecular cell biology 1, 72-76, doi:10.1038/35036093 (2000). 5 Wellinger, R. J. In the end, what's the problem? Molecular cell 53, 855-856, doi:10.1016/j.molcel.2014.03.008 (2014). 6 Olovnikov, A. M. [Principle of marginotomy in template synthesis of polynucleotides]. Doklady Akademii nauk SSSR 201, 1496-1499 (1971). 7 Watson, J. D. & Crick, F. H. Molecular structure of nucleic acids; a structure for deoxyribose nucleic acid. Nature 171, 737-738, doi:10.1038/171737a0 (1953). 8 Olovnikov, A. M. A theory of marginotomy. The incomplete copying of template margin in enzymic synthesis of polynucleotides and biological significance of the phenomenon. J Theor Biol 41, 181-190, doi:10.1016/0022-5193(73)90198-7 (1973). 9 Blackburn, E. H. & Gall, J. G. A tandemly repeated sequence at the termini of the extrachromosomal ribosomal RNA genes in Tetrahymena. Journal of molecular biology 120, 33-53, doi:10.1016/0022-2836(78)90294-2 (1978). 10 Collins, K. & Gorovsky, M. A. Tetrahymena thermophila. Current biology : CB 15, R317-318, doi:10.1016/j.cub.2005.04.039 (2005). 11 Greider, C. W. & Blackburn, E. H. The telomere terminal transferase of Tetrahymena is a ribonucleoprotein enzyme with two kinds of primer specificity. Cell 51, 887-898, doi:10.1016/0092-8674(87)90576-9 (1987). 12 Heather, J. M. & Chain, B. The sequence of sequencers: The history of sequencing DNA. Genomics 107, 1-8, doi:10.1016/j.ygeno.2015.11.003 (2016). 13 Greider, C. W. & Blackburn, E. H. A telomeric sequence in the RNA of Tetrahymena telomerase required for telomere repeat synthesis. Nature 337, 331- 337, doi:10.1038/337331a0 (1989). 14 Trask, B. J. Human cytogenetics: 46 chromosomes, 46 years and counting. Nature reviews. Genetics 3, 769-778, doi:10.1038/nrg905 (2002). 15 Xi, L. & Cech, T. R. Inventory of telomerase components in human cells reveals multiple subpopulations of hTR and hTERT. Nucleic Acids Res 42, 8565-8577, doi:10.1093/nar/gku560 (2014). 16 Lorsch, J. R. Practical steady-state enzyme kinetics. Methods in enzymology 536, 3-15, doi:10.1016/b978-0-12-420070-8.00001-5 (2014). 17 Powers, K. T. & Washington, M. T. Analyzing the Catalytic Activities and Interactions of Eukaryotic Translesion Synthesis Polymerases. Methods in enzymology 592, 329-356, doi:10.1016/bs.mie.2017.04.002 (2017). 18 Kim, N. W. et al. Specific association of human telomerase activity with immortal cells and cancer. Science (New York, N.Y.) 266, 2011-2015 (1994).

122

19 Saiki, R. K. et al. Enzymatic amplification of beta-globin genomic sequences and restriction site analysis for diagnosis of sickle cell anemia. Science (New York, N.Y.) 230, 1350-1354, doi:10.1126/science.2999980 (1985). 20 Harley, C. B. Telomerase and cancer therapeutics. Nature reviews. Cancer 8, 167- 179, doi:10.1038/nrc2275 (2008). 21 Corey, D. R. Telomeres and telomerase: from discovery to clinical trials. Chem Biol 16, 1219-1223, doi:10.1016/j.chembiol.2009.12.001 (2009). 22 Victorelli, S. & Passos, J. F. Telomeres and Cell Senescence - Size Matters Not. EBioMedicine 21, 14-20, doi:10.1016/j.ebiom.2017.03.027 (2017). 23 Jafri, M. A., Ansari, S. A., Alqahtani, M. H. & Shay, J. W. Roles of telomeres and telomerase in cancer, and advances in telomerase-targeted therapies. Genome Med 8, 69-69, doi:10.1186/s13073-016-0324-x (2016). 24 Vonderheide, R. H. Telomerase as a universal tumor-associated antigen for cancer immunotherapy. Oncogene 21, 674-679, doi:10.1038/sj.onc.1205074 (2002). 25 Park, J. K. et al. The anti-fibrotic effect of GV1001 combined with gemcitabine on treatment of pancreatic ductal adenocarcinoma. Oncotarget 7, 75081-75093, doi:10.18632/oncotarget.12057 (2016). 26 Park, H.-H. et al. The novel vaccine peptide GV1001 effectively blocks β-amyloid toxicity by mimicking the extra-telomeric functions of human telomerase reverse transcriptase. Neurobiology of Aging 35, 1255-1274, doi:https://doi.org/10.1016/j.neurobiolaging.2013.12.015 (2014). 27 Kim, H., Seo, E.-H., Lee, S.-H. & Kim, B.-J. The Telomerase-Derived Anticancer Peptide Vaccine GV1001 as an Extracellular Heat Shock Protein-Mediated Cell- Penetrating Peptide. Int J Mol Sci 17, 2054, doi:10.3390/ijms17122054 (2016). 28 Plumb, J. A. et al. Telomerase-specific suicide gene therapy vectors expressing bacterial nitroreductase sensitize human cancer cells to the pro-drug CB1954. Oncogene 20, 7797-7803, doi:10.1038/sj.onc.1204954 (2001). 29 Majumdar, A. S. et al. The telomerase reverse transcriptase promoter drives efficacious tumor suicide gene therapy while preventing hepatotoxicity encountered with constitutive promoters. Gene Therapy 8, 568-578, doi:10.1038/sj.gt.3301421 (2001). 30 Nemunaitis, J. et al. A phase I study of telomerase-specific replication competent oncolytic adenovirus (telomelysin) for various solid tumors. Molecular therapy : the journal of the American Society of Gene Therapy 18, 429-434, doi:10.1038/mt.2009.262 (2010). 31 Kim, M. Y., Vankayalapati, H., Shin-Ya, K., Wierzba, K. & Hurley, L. H. Telomestatin, a potent telomerase inhibitor that interacts quite specifically with the human telomeric intramolecular g-quadruplex. Journal of the American Chemical Society 124, 2098-2099, doi:10.1021/ja017308q (2002). 32 Gilbert-Girard, S. et al. Stabilization of Telomere G-Quadruplexes Interferes with Human Herpesvirus 6A Chromosomal Integration. J Virol 91, e00402-00417, doi:10.1128/JVI.00402-17 (2017). 33 Read, M. et al. Structure-based design of selective and potent G quadruplex- mediated telomerase inhibitors. Proceedings of the National Academy of Sciences

123

of the United States of America 98, 4844-4849, doi:10.1073/pnas.081560598 (2001). 34 Asai, A. et al. A novel telomerase template antagonist (GRN163) as a potential anticancer agent. Cancer research 63, 3931-3939 (2003). 35 Herbert, B. S. et al. Lipid modification of GRN163, an N3'-->P5' thio- phosphoramidate oligonucleotide, enhances the potency of telomerase inhibition. Oncogene 24, 5262-5268, doi:10.1038/sj.onc.1208760 (2005). 36 Mender, I., Gryaznov, S., Dikmen, Z. G., Wright, W. E. & Shay, J. W. Induction of telomere dysfunction mediated by the telomerase substrate precursor 6-thio-2'- deoxyguanosine. Cancer discovery 5, 82-95, doi:10.1158/2159-8290.CD-14-0609 (2015). 37 Williams, S. C. No end in sight for telomerase-targeted cancer drugs. Nature medicine 19, 6, doi:10.1038/nm0113-6 (2013). 38 Greider, C. W. & Blackburn, E. H. Identification of a specific telomere terminal transferase activity in tetrahymena extracts. Cell 43, 405-413, doi:https://doi.org/10.1016/0092-8674(85)90170-9 (1985). 39 Lingner, J. et al. Reverse Transcriptase Motifs in the Catalytic Subunit of Telomerase. Science (New York, N.Y.) 276, 561, doi:10.1126/science.276.5312.561 (1997). 40 Lendvay, T. S., Morris, D. K., Sah, J., Balasubramanian, B. & Lundblad, V. Senescence mutants of Saccharomyces cerevisiae with a defect in telomere replication identify three additional EST genes. Genetics 144, 1399-1412 (1996). 41 Johnston, M. et al. The nucleotide sequence of Saccharomyces cerevisiae chromosome XII. Nature 387, 87-90 (1997). 42 Kohlstaedt, L. A., Wang, J., Friedman, J. M., Rice, P. A. & Steitz, T. A. Crystal structure at 3.5 A resolution of HIV-1 reverse transcriptase complexed with an inhibitor. Science (New York, N.Y.) 256, 1783-1790, doi:10.1126/science.1377403 (1992). 43 Arnold, E. et al. Structure of HIV-1 reverse transcriptase/DNA complex at 7 Å resolution showing active site locations. Nature 357, 85-89, doi:10.1038/357085a0 (1992). 44 Jacobo-Molina, A. et al. Crystal structure of human immunodeficiency virus type 1 reverse transcriptase complexed with double-stranded DNA at 3.0 A resolution shows bent DNA. Proceedings of the National Academy of Sciences of the United States of America 90, 6320-6324, doi:10.1073/pnas.90.13.6320 (1993). 45 Meyerson, M. et al. hEST2, the putative human telomerase catalytic subunit gene, is up-regulated in tumor cells and during immortalization. Cell 90, 785-795, doi:10.1016/s0092-8674(00)80538-3 (1997). 46 Nakamura, T. M. et al. Telomerase catalytic subunit homologs from fission yeast and human. Science (New York, N.Y.) 277, 955-959, doi:10.1126/science.277.5328.955 (1997). 47 Weinrich, S. L. et al. Reconstitution of human telomerase with the template RNA component hTR and the catalytic protein subunit hTRT. Nature genetics 17, 498- 502, doi:10.1038/ng1297-498 (1997).

124

48 Gillis, A. J., Schuller, A. P. & Skordalakes, E. Structure of the Tribolium castaneum telomerase catalytic subunit TERT. Nature 455, 633-637, doi:10.1038/nature07283 (2008). 49 Jiang, J. et al. Structure of Telomerase with Telomeric DNA. Cell 173, 1179- 1190.e1113, doi:10.1016/j.cell.2018.04.038 (2018). 50 Nguyen, T. H. D. et al. Cryo-EM structure of substrate-bound human telomerase holoenzyme. Nature 557, 190-195, doi:10.1038/s41586-018-0062-x (2018). 51 Niederer, R. O. & Zappulla, D. C. Refined secondary-structure models of the core of yeast and human telomerase RNAs directed by SHAPE. RNA 21, 254-261, doi:10.1261/rna.048959.114 (2015). 52 Garforth, S. J., Wu, Y. Y. & Prasad, V. R. Structural features of mouse telomerase RNA are responsible for the lower activity of mouse telomerase versus human telomerase. Biochem J 397, 399-406, doi:10.1042/BJ20060456 (2006). 53 Wu, R. A., Upton, H. E., Vogan, J. M. & Collins, K. Telomerase Mechanism of Telomere Synthesis. Annu Rev Biochem 86, 439-460, doi:10.1146/annurev- biochem-061516-045019 (2017). 54 Jacobs, S. A., Podell, E. R. & Cech, T. R. Crystal structure of the essential N- terminal domain of telomerase reverse transcriptase. Nature structural & molecular biology 13, 218-225, doi:10.1038/nsmb1054 (2006). 55 Young, C. L., Britton, Z. T. & Robinson, A. S. Recombinant protein expression and purification: a comprehensive review of affinity tags and microbial applications. Biotechnology journal 7, 620-634, doi:10.1002/biot.201100155 (2012). 56 Hoffman, H. & Skordalakes, E. Crystallographic Studies of Telomerase. Methods in enzymology 573, 403-419, doi:10.1016/bs.mie.2016.04.006 (2016). 57 Rouda, S. & Skordalakes, E. Structure of the RNA-binding domain of telomerase: implications for RNA recognition and binding. Structure (London, England : 1993) 15, 1403-1412, doi:10.1016/j.str.2007.09.007 (2007). 58 Mitchell, M., Gillis, A., Futahashi, M., Fujiwara, H. & Skordalakes, E. Structural basis for telomerase catalytic subunit TERT binding to RNA template and telomeric DNA. Nature structural & molecular biology 17, 513-518, doi:10.1038/nsmb.1777 (2010). 59 Bakhtina, M. et al. Use of viscogens, dNTPalphaS, and rhodium(III) as probes in stopped-flow experiments to obtain new evidence for the mechanism of catalysis by DNA polymerase beta. Biochemistry 44, 5177-5187, doi:10.1021/bi047664w (2005). 60 Hoffman, H., Rice, C. & Skordalakes, E. Structural Analysis Reveals the Deleterious Effects of Telomerase Mutations in Bone Marrow Failure Syndromes. The Journal of biological chemistry 292, 4593-4601, doi:10.1074/jbc.M116.771204 (2017). 61 Czarnocki-Cieciura, M. & Nowotny, M. Introduction to high-resolution cryo-electron microscopy. Postepy biochemii 62, 383-394 (2016). 62 Peng, Y., Mian, I. S. & Lue, N. F. Analysis of Telomerase Processivity: Mechanistic Similarity to HIV-1 Reverse Transcriptase and Role in Telomere Maintenance. Molecular cell 7, 1201-1211, doi:https://doi.org/10.1016/S1097-2765(01)00268-4 (2001).

125

63 Sfeir, A. et al. Mammalian telomeres resemble fragile sites and require TRF1 for efficient replication. Cell 138, 90-103, doi:10.1016/j.cell.2009.06.021 (2009). 64 Noren Hooten, N. & Evans, M. K. Techniques to Induce and Quantify Cellular Senescence. J Vis Exp, 10.3791/55533, doi:10.3791/55533 (2017). 65 Fouquerel, E. et al. Oxidative guanine base damage regulates human telomerase activity. Nature structural & molecular biology 23, 1092-1100, doi:10.1038/nsmb.3319 (2016). 66 Nakabeppu, Y. Cellular levels of 8-oxoguanine in either DNA or the nucleotide pool play pivotal roles in carcinogenesis and survival of cancer cells. Int J Mol Sci 15, 12543-12557, doi:10.3390/ijms150712543 (2014). 67 Lee, Y. A., Durandin, A., Dedon, P. C., Geacintov, N. E. & Shafirovich, V. Oxidation of guanine in G, GG, and GGG sequence contexts by aromatic pyrenyl radical cations and carbonate radical anions: relationship between kinetics and distribution of alkali-labile lesions. J Phys Chem B 112, 1834-1844, doi:10.1021/jp076777x (2008). 68 Vorlickova, M., Tomasko, M., Sagi, A. J., Bednarova, K. & Sagi, J. 8-oxoguanine in a quadruplex of the human telomere DNA sequence. The FEBS journal 279, 29- 39, doi:10.1111/j.1742-4658.2011.08396.x (2012). 69 Beard, W. A. & Wilson, S. H. Structural insights into the origins of DNA polymerase fidelity. Structure (London, England : 1993) 11, 489-496 (2003). 70 Batra, V. K., Beard, W. A., Pedersen, L. C. & Wilson, S. H. Structures of DNA Polymerase Mispaired DNA Termini Transitioning to Pre-catalytic Complexes Support an Induced-Fit Fidelity Mechanism. Structure (London, England : 1993) 24, 1863-1875, doi:10.1016/j.str.2016.08.006 (2016). 71 Freudenthal, B. D., Beard, W. A., Shock, D. D. & Wilson, S. H. Observing a DNA polymerase choose right from wrong. Cell 154, 157-168, doi:10.1016/j.cell.2013.05.048 (2013). 72 Whitaker, A. M., Smith, M. R., Schaich, M. A. & Freudenthal, B. D. Capturing a mammalian DNA polymerase extending from an oxidized nucleotide. Nucleic Acids Res 45, 6934-6944, doi:10.1093/nar/gkx293 (2017). 73 Karran, P. & Attard, N. Thiopurines in current medical practice: molecular mechanisms and contributions to therapy-related cancer. Nature reviews. Cancer 8, 24-36, doi:10.1038/nrc2292 (2008). 74 Warren, D. J., Andersen, A. & Slordal, L. Quantitation of 6-thioguanine residues in peripheral blood leukocyte DNA obtained from patients receiving 6- mercaptopurine-based maintenance therapy. Cancer research 55, 1670-1674 (1995). 75 Neurath, M. Thiopurines in IBD: What Is Their Mechanism of Action? Gastroenterol Hepatol (N Y) 6, 435-436 (2010). 76 Nelson, J. A., Carpenter, J. W., Rose, L. M. & Adamson, D. J. Mechanisms of action of 6-thioguanine, 6-mercaptopurine, and 8-azaguanine. Cancer research 35, 2872-2878 (1975). 77 Mender, I., Gryaznov, S., Dikmen, Z. G., Wright, W. E. & Shay, J. W. Induction of telomere dysfunction mediated by the telomerase substrate precursor 6-thio-2'- deoxyguanosine. Cancer Discov 5, 82-95, doi:10.1158/2159-8290.Cd-14-0609 (2015).

126

78 Zeng, X. et al. Administration of a Nucleoside Analog Promotes Cancer Cell Death in a Telomerase-Dependent Manner. Cell Rep 23, 3031-3041, doi:10.1016/j.celrep.2018.05.020 (2018). 79 Blasco, M. A. Telomeres and human disease: ageing, cancer and beyond. Nature Reviews Genetics 6, 611-622, doi:10.1038/nrg1656 (2005). 80 Nelson, N. D. & Bertuch, A. A. Dyskeratosis congenita as a disorder of telomere maintenance. Mutation research 730, 43-51, doi:10.1016/j.mrfmmm.2011.06.008 (2012). 81 Olovnikov, A. M. A theory of marginotomy: The incomplete copying of template margin in enzymic synthesis of polynucleotides and biological significance of the phenomenon. Journal of Theoretical Biology 41, 181-190, doi:https://doi.org/10.1016/0022-5193(73)90198-7 (1973). 82 Watson, J. D. Origin of Concatemeric T7DNA. Nature New Biology 239, 197-201, doi:10.1038/newbio239197a0 (1972). 83 Moyzis, R. K. et al. A highly conserved repetitive DNA sequence, (TTAGGG)n, present at the telomeres of human chromosomes. Proceedings of the National Academy of Sciences 85, 6622, doi:10.1073/pnas.85.18.6622 (1988). 84 Meyerson, M. Telomerase enzyme activation and human cell immortalization. Toxicology letters 102-103, 41-45 (1998). 85 Ramakrishnan, S. et al. Characterization of human telomerase complex. Proceedings of the National Academy of Sciences of the United States of America 94, 10075-10079, doi:10.1073/pnas.94.19.10075 (1997). 86 Schmidt, J. C., Zaug, A. J. & Cech, T. R. Live Cell Imaging Reveals the Dynamics of Telomerase Recruitment to Telomeres. Cell 166, 1188-1197.e1189, doi:10.1016/j.cell.2016.07.033 (2016). 87 de Lange, T. Shelterin: the protein complex that shapes and safeguards human telomeres. Genes & development 19, 2100-2110, doi:10.1101/gad.1346005 (2005). 88 Nandakumar, J., Podell, E. R. & Cech, T. R. How telomeric protein POT1 avoids RNA to achieve specificity for single-stranded DNA. Proceedings of the National Academy of Sciences of the United States of America 107, 651-656, doi:10.1073/pnas.0911099107 (2010). 89 Petrova, O. A. et al. Structure and function of the N-terminal domain of the yeast telomerase reverse transcriptase. Nucleic Acids Res 46, 1525-1540, doi:10.1093/nar/gkx1275 (2018). 90 Kabsch, W. XDS. Acta crystallographica. Section D, Biological crystallography 66, 125-132, doi:10.1107/s0907444909047337 (2010). 91 Winn, M. D. et al. Overview of the CCP4 suite and current developments. Acta crystallographica. Section D, Biological crystallography 67, 235-242, doi:10.1107/s0907444910045749 (2011). 92 Mitchell, M., Gillis, A., Futahashi, M., Fujiwara, H. & Skordalakes, E. Structural basis for telomerase catalytic subunit TERT binding to RNA template and telomeric DNA. Nat Struct Mol Biol 17, 513-518, doi:http://www.nature.com/nsmb/journal/v17/n4/suppinfo/nsmb.1777_S1.html (2010).

127

93 Adams, P. D. et al. PHENIX: a comprehensive Python-based system for macromolecular structure solution. Acta crystallographica. Section D, Biological crystallography 66, 213-221, doi:10.1107/s0907444909052925 (2010). 94 Emsley, P. & Cowtan, K. Coot: model-building tools for molecular graphics. Acta crystallographica. Section D, Biological crystallography 60, 2126-2132, doi:10.1107/s0907444904019158 (2004). 95 Chen, V. B. et al. MolProbity: all-atom structure validation for macromolecular crystallography. Acta crystallographica. Section D, Biological crystallography 66, 12-21, doi:10.1107/S0907444909042073 (2010). 96 Batra, V. K. et al. Magnesium-induced assembly of a complete DNA polymerase catalytic complex. Structure (London, England : 1993) 14, 757-766, doi:10.1016/j.str.2006.01.011 (2006). 97 Gleghorn, M. L., Davydova, E. K., Basu, R., Rothman-Denes, L. B. & Murakami, K. S. X-ray crystal structures elucidate the nucleotidyl transfer reaction of transcript initiation using two nucleotides. Proceedings of the National Academy of Sciences of the United States of America 108, 3566-3571, doi:10.1073/pnas.1016691108 (2011). 98 Doublié, S., Sawaya, M. R. & Ellenberger, T. An open and closed case for all polymerases. Structure (London, England : 1993) 7, R31-R35, doi:https://doi.org/10.1016/S0969-2126(99)80017-3 (1999). 99 Sawaya, M. R., Pelletier, H., Kumar, A., Wilson, S. H. & Kraut, J. Crystal structure of rat DNA polymerase beta: evidence for a common polymerase mechanism. Science (New York, N.Y.) 264, 1930-1935, doi:10.1126/science.7516581 (1994). 100 Schmidt, T., Tian, L. & Clore, G. M. Probing Conformational States of the Finger and Thumb Subdomains of HIV-1 Reverse Transcriptase Using Double Electron- Electron Resonance Electron Paramagnetic Resonance Spectroscopy. Biochemistry 57, 489-493, doi:10.1021/acs.biochem.7b01035 (2018). 101 Steitz, T. A. DNA polymerases: structural diversity and common mechanisms. The Journal of biological chemistry 274, 17395-17398, doi:10.1074/jbc.274.25.17395 (1999). 102 Biertümpfel, C. et al. Structure and mechanism of human DNA polymerase eta. Nature 465, 1044-1048, doi:10.1038/nature09196 (2010). 103 Chen, Y., Podlevsky, J. D., Logeswaran, D. & Chen, J. J. L. A single nucleotide incorporation step limits human telomerase repeat addition activity. The EMBO Journal 37, e97953, doi:10.15252/embj.201797953 (2018). 104 Patrick, E. M., Slivka, J. D., Payne, B., Comstock, M. J. & Schmidt, J. C. Observation of processive telomerase catalysis using high-resolution optical tweezers. Nature Chemical Biology, doi:10.1038/s41589-020-0478-0 (2020). 105 Beard, W. A., Shock, D. D., Batra, V. K., Prasad, R. & Wilson, S. H. Substrate- induced DNA polymerase beta activation. The Journal of biological chemistry 289, 31411-31422, doi:10.1074/jbc.M114.607432 (2014). 106 Schneider, C. A., Rasband, W. S. & Eliceiri, K. W. NIH Image to ImageJ: 25 years of image analysis. Nature methods 9, 671-675 (2012). 107 Brown, J. A. et al. Pre-Steady-State Kinetic Analysis of Truncated and Full-Length Saccharomyces cerevisiae DNA Polymerase Eta. Journal of nucleic acids 2010, doi:10.4061/2010/871939 (2010).

128

108 Basel-Vanagaite, L. et al. Expanding the clinical phenotype of autosomal dominant dyskeratosis congenita caused by TERT mutations. Haematologica 93, 943-944, doi:10.3324/haematol.12317 (2008). 109 Diaz de Leon, A. et al. Telomere lengths, pulmonary fibrosis and telomerase (TERT) mutations. PloS one 5, e10680, doi:10.1371/journal.pone.0010680 (2010). 110 Traut, T. W. Physiological concentrations of purines and pyrimidines. Molecular and cellular biochemistry 140, 1-22, doi:10.1007/bf00928361 (1994). 111 Brown, J. A. & Suo, Z. Unlocking the sugar "steric gate" of DNA polymerases. Biochemistry 50, 1135-1142, doi:10.1021/bi101915z (2011). 112 Cavanaugh, N. A., Beard, W. A. & Wilson, S. H. DNA polymerase beta ribonucleotide discrimination: insertion, misinsertion, extension, and coding. The Journal of biological chemistry 285, 24457-24465, doi:10.1074/jbc.M110.132407 (2010). 113 Nick McElhinny, S. A. et al. Abundant ribonucleotide incorporation into DNA by yeast replicative polymerases. Proceedings of the National Academy of Sciences of the United States of America 107, 4949-4954, doi:10.1073/pnas.0914857107 (2010). 114 McCulloch, S. D. & Kunkel, T. A. The fidelity of DNA synthesis by eukaryotic replicative and translesion synthesis polymerases. Cell research 18, 148-161, doi:10.1038/cr.2008.4 (2008). 115 Lee, M. et al. Telomere sequence content can be used to determine ALT activity in tumours. Nucleic Acids Res 46, 4903-4918, doi:10.1093/nar/gky297 (2018). 116 Burge, S., Parkinson, G. N., Hazel, P., Todd, A. K. & Neidle, S. Quadruplex DNA: sequence, topology and structure. Nucleic Acids Res 34, 5402-5415, doi:10.1093/nar/gkl655 (2006). 117 Cavanaugh, N. A., Beard, W. A. & Wilson, S. H. DNA polymerase beta ribonucleotide discrimination: insertion, misinsertion, extension, and coding. The Journal of biological chemistry 285, 24457-24465, doi:10.1074/jbc.M110.132407 (2010). 118 Sparks, J. L. et al. RNase H2-initiated ribonucleotide excision repair. Molecular cell 47, 980-986, doi:10.1016/j.molcel.2012.06.035 (2012). 119 Li, Y. & Breaker, R. R. Kinetics of RNA Degradation by Specific Base Catalysis of Transesterification Involving the 2‘-Hydroxyl Group. Journal of the American Chemical Society 121, 5364-5372, doi:10.1021/ja990592p (1999). 120 Baumann, P. & Cech, T. R. Pot1, the Putative Telomere End-Binding Protein in Fission Yeast and Humans. Science (New York, N.Y.) 292, 1171, doi:10.1126/science.1060036 (2001). 121 Nowotny, M., Gaidamakov, S. A., Crouch, R. J. & Yang, W. Crystal structures of RNase H bound to an RNA/DNA hybrid: substrate specificity and metal-dependent catalysis. Cell 121, 1005-1016, doi:10.1016/j.cell.2005.04.024 (2005). 122 Fay, M. M., Lyons, S. M. & Ivanov, P. RNA G-Quadruplexes in Biology: Principles and Molecular Mechanisms. Journal of molecular biology 429, 2127-2147, doi:10.1016/j.jmb.2017.05.017 (2017). 123 Stefl, R., Spackova, N., Berger, I., Koca, J. & Sponer, J. Molecular dynamics of DNA quadruplex molecules containing inosine, 6-thioguanine and 6-thiopurine. Biophysical journal 80, 455-468, doi:10.1016/s0006-3495(01)76028-6 (2001).

129

124 Whitaker, A. M., Schaich, M. A., Smith, M. R., Flynn, T. S. & Freudenthal, B. D. Base excision repair of oxidative DNA damage: from mechanism to disease. Frontiers in bioscience (Landmark edition) 22, 1493-1522 (2017). 125 O'Connor, M. J. Targeting the DNA Damage Response in Cancer. Molecular cell 60, 547-560, doi:10.1016/j.molcel.2015.10.040 (2015). 126 Dilruba, S. & Kalayda, G. V. Platinum-based drugs: past, present and future. Cancer chemotherapy and pharmacology 77, 1103-1124, doi:10.1007/s00280- 016-2976-z (2016). 127 Martin, N. E. & D'Amico, A. V. Progress and controversies: Radiation therapy for prostate cancer. CA: a cancer journal for clinicians 64, 389-407, doi:10.3322/caac.21250 (2014). 128 Gad, H. et al. MTH1 inhibition eradicates cancer by preventing sanitation of the dNTP pool. Nature 508, 215-221, doi:10.1038/nature13181 (2014). 129 Plummer, R. et al. Phase I Study Of The Poly(ADP-Ribose) Polymerase Inhibitor, AG014699, In Combination With Temozolomide in Patients with Advanced Solid Tumors. Clinical Cancer Research 14, 7917-7923, doi:10.1158/1078-0432.CCR- 08-1223 (2008). 130 Jordheim, L. P., Durantel, D., Zoulim, F. & Dumontet, C. Advances in the development of nucleoside and nucleotide analogues for cancer and viral diseases. Nature reviews. Drug discovery 12, 447-464, doi:10.1038/nrd4010 (2013). 131 Heidelberger, C. et al. Fluorinated pyrimidines, a new class of tumour-inhibitory compounds. Nature 179, 663-666 (1957). 132 Wilson, P. M., Danenberg, P. V., Johnston, P. G., Lenz, H. J. & Ladner, R. D. Standing the test of time: targeting thymidylate biosynthesis in cancer therapy. Nature reviews. Clinical oncology 11, 282-298, doi:10.1038/nrclinonc.2014.51 (2014). 133 Longley, D. B., Harkin, D. P. & Johnston, P. G. 5-fluorouracil: mechanisms of action and clinical strategies. Nature reviews. Cancer 3, 330-338, doi:10.1038/nrc1074 (2003). 134 Kunz, C. et al. Base excision by thymine DNA glycosylase mediates DNA-directed cytotoxicity of 5-fluorouracil. PLoS biology 7, e91, doi:10.1371/journal.pbio.1000091 (2009). 135 Pritchard, D. M., Watson, A. J. M., Potten, C. S., Jackman, A. L. & Hickman, J. A. Inhibition by uridine but not thymidine of p53-dependent intestinal apoptosis initiated by 5-fluorouracil: Evidence for the involvement of RNA perturbation. Proceedings of the National Academy of Sciences of the United States of America 94, 1795-1799 (1997). 136 Elion, G. B. & Hitchings, G. H. The Synthesis of 6-Thioguanine. J. Am. Chem. Soc. 77, 1676-1676, doi:10.1021/ja01611a082 (1955). 137 Munshi, P. N., Lubin, M. & Bertino, J. R. 6-thioguanine: a drug with unrealized potential for cancer therapy. The oncologist 19, 760-765, doi:10.1634/theoncologist.2014-0178 (2014). 138 Swann, P. F. et al. Role of postreplicative DNA mismatch repair in the cytotoxic action of thioguanine. Science (New York, N.Y.) 273, 1109-1111 (1996).

130

139 Grasso, F. et al. MUTYH mediates the toxicity of combined DNA 6-thioguanine and UVA radiation. Oncotarget 6, 7481-7492, doi:10.18632/oncotarget.3037 (2015). 140 Seinen, M. L., van Nieuw Amerongen, G. P., de Boer, N. K. H. & van Bodegraven, A. A. Rac Attack: Modulation of the Small GTPase Rac in Inflammatory Bowel Disease and Thiopurine Therapy. Molecular Diagnosis & Therapy 20, 551-557, doi:10.1007/s40291-016-0232-1 (2016). 141 Mlinarič-Raščan, I., Milek, M. & Karas-Kuželički, N. Pharmacogenetics of thiopurine therapy: from thiopurine S-methyltransferase to S-adenosylmethionine. BMC Pharmacology 9, A66-A66, doi:10.1186/1471-2210-9-S2-A66 (2009). 142 Zauri, M. et al. CDA directs metabolism of epigenetic nucleosides revealing a therapeutic window in cancer. Nature 524, 114-118, doi:10.1038/nature14948 (2015). 143 Hu, L. et al. Structural insight into substrate preference for TET-mediated oxidation. Nature 527, 118-122, doi:10.1038/nature15713 (2015). 144 Helleday, T. Poisoning Cancer Cells with Oxidized Nucleosides. The New England journal of medicine 373, 1570-1571, doi:10.1056/NEJMcibr1510335 (2015). 145 Ingraham, H. A., Tseng, B. Y. & Goulian, M. Nucleotide levels and incorporation of 5-fluorouracil and uracil into DNA of cells treated with 5-fluorodeoxyuridine. Molecular pharmacology 21, 211-216 (1982). 146 Beard, W. A. & Wilson, S. H. Purification and domain-mapping of mammalian DNA polymerase beta. Methods in enzymology 262, 98-107 (1995). 147 Otwinowski, Z. & Minor, W. [20] Processing of X-ray diffraction data collected in oscillation mode. Methods in enzymology 276, 307-326, doi:10.1016/s0076- 6879(97)76066-x (1997). 148 Idriss, H. T., Al-Assar, O. & Wilson, S. H. DNA polymerase beta. The international journal of biochemistry & cell biology 34, 321-324 (2002). 149 Beard, W. A. & Wilson, S. H. Structure and mechanism of DNA polymerase beta. Biochemistry 53, 2768-2780, doi:10.1021/bi500139h (2014). 150 Srivastava, D. K., Husain, I., Arteaga, C. L. & Wilson, S. H. DNA polymerase beta expression differences in selected human tumors and cell lines. Carcinogenesis 20, 1049-1054 (1999). 151 Bergoglio, V. et al. Enhanced expression and activity of DNA polymerase beta in human ovarian tumor cells: impact on sensitivity towards antitumor agents. Oncogene 20, 6181-6187, doi:10.1038/sj.onc.1204743 (2001). 152 Whitaker, A. M., Smith, M. R., Schaich, M. A. & Freudenthal, B. D. Capturing a mammalian DNA polymerase extending from an oxidized nucleotide. Nucleic Acids Res., doi:10.1093/nar/gkx293 (2017). 153 Pelletier, H., Sawaya, M. R., Wolfle, W., Wilson, S. H. & Kraut, J. A structural basis for metal ion mutagenicity and nucleotide selectivity in human DNA polymerase beta. Biochemistry 35, 12762-12777, doi:10.1021/bi9529566 (1996). 154 Tanaka, M., Yoshida, S., Saneyoshi, M. & Yamaguchi, T. Utilization of 5-fluoro-2'- deoxyuridine triphosphate and 5-fluoro-2'-deoxycytidine triphosphate in DNA synthesis by DNA polymerases alpha and beta from calf thymus. Cancer research 41, 4132-4135 (1981).

131

155 Rappaport, H. P. Replication of the base pair 6-thioguanine/5-methyl-2-pyrimidine with the large Klenow fragment of Escherichia coli DNA polymerase I. Biochemistry 32, 3047-3057 (1993). 156 Ling, Y. H., Nelson, J. A., Cheng, Y. C., Anderson, R. S. & Beattie, K. L. 2'-Deoxy- 6-thioguanosine 5'-triphosphate as a substrate for purified human DNA polymerases and calf thymus terminal deoxynucleotidyltransferase in vitro. Molecular pharmacology 40, 508-514 (1991). 157 Yoshida, S., Yamada, M., Masaki, S. & Saneyoshi, M. Utilization of 2'-deoxy-6- thioguanosine 5'-triphosphate in DNA synthesis in vitro by DNA polymerase alpha from calf thymus. Cancer research 39, 3955-3958 (1979). 158 Dai, Q. et al. Weakened N3 Hydrogen Bonding by 5-Formylcytosine and 5- Carboxylcytosine Reduces Their Base-Pairing Stability. ACS chemical biology 11, 470-477, doi:10.1021/acschembio.5b00762 (2016). 159 Yoshida, M. et al. Substrate and mispairing properties of 5-formyl-2'-deoxyuridine 5'-triphosphate assessed by in vitro DNA polymerase reactions. Nucleic Acids Res 25, 1570-1577 (1997). 160 Sahasrabudhe, P. V., Pon, R. T. & Gmeiner, W. H. Solution structures of 5- fluorouracil-substituted DNA and RNA decamer duplexes. Biochemistry 35, 13597-13608, doi:10.1021/bi960535y (1996). 161 Buck, J. et al. Influence of ground-state structure and Mg2+ binding on folding kinetics of the guanine-sensing riboswitch aptamer domain. Nucleic Acids Res 39, 9768-9778, doi:10.1093/nar/gkr664 (2011). 162 Bohon, J. & de los Santos, C. R. Structural effect of the anticancer agent 6- thioguanine on duplex DNA. Nucleic Acids Res 31, 1331-1338 (2003). 163 Szulik, M. W. et al. Differential stabilities and sequence-dependent base pair opening dynamics of Watson-Crick base pairs with 5-hydroxymethylcytosine, 5- formylcytosine, or 5-carboxylcytosine. Biochemistry 54, 1294-1305, doi:10.1021/bi501534x (2015). 164 Tsunoda, M., Karino, N., Ueno, Y., Matsuda, A. & Takenaka, A. Crystallization and preliminary X-ray analysis of a DNA dodecamer containing 2'-deoxy-5- formyluridine; what is the role of magnesium cation in crystallization of Dickerson- type DNA dodecamers? Acta crystallographica. Section D, Biological crystallography 57, 345-348 (2001). 165 Hardwick, J. S. et al. 5-Formylcytosine does not change the global structure of DNA. Nature structural & molecular biology, doi:10.1038/nsmb.3411 (2017). 166 Cruse, W., Saludjian, P., Neuman, A. & Prange, T. Destabilizing effect of a fluorouracil extra base in a hybrid RNA duplex compared with bromo and chloro analogues. Acta crystallographica. Section D, Biological crystallography 57, 1609- 1613 (2001). 167 Prindle, M. J. & Loeb, L. A. DNA polymerase delta in DNA replication and genome maintenance. Environmental and molecular mutagenesis 53, 666-682, doi:10.1002/em.21745 (2012). 168 Swan, M. K., Johnson, R. E., Prakash, L., Prakash, S. & Aggarwal, A. K. Structural basis of high-fidelity DNA synthesis by yeast DNA polymerase delta. Nature structural & molecular biology 16, 979-986, doi:10.1038/nsmb.1663 (2009).

132

169 Hottin, A. & Marx, A. Structural Insights into the Processing of Nucleobase- Modified Nucleotides by DNA Polymerases. Acc. Chem. Res. 49, 418-427, doi:10.1021/acs.accounts.5b00544 (2016). 170 Freudenthal, B. D. et al. Uncovering the polymerase-induced cytotoxicity of an oxidized nucleotide. Nature 517, 635-639, doi:10.1038/nature13886 (2015). 171 You, C., Dai, X., Yuan, B. & Wang, Y. Effects of 6-Thioguanine and S(6)- Methylthioguanine on Transcription in Vitro and in Human Cells. Journal of Biological Chemistry 287, 40915-40923, doi:10.1074/jbc.M112.418681 (2012). 172 Seinen, M. L., van Nieuw Amerongen, G. P., de Boer, N. K. H. & van Bodegraven, A. A. Rac Attack: Modulation of the Small GTPase Rac in Inflammatory Bowel Disease and Thiopurine Therapy. Mol. Diag. Ther. 20, 551-557, doi:10.1007/s40291-016-0232-1 (2016). 173 Kashfi, S. M. H., Golmohammadi, M., Behboudi, F., Nazemalhosseini-Mojarad, E. & Zali, M. R. MUTYH the base excision repair gene family member associated with colorectal cancer polyposis. Gastroenterol Hepatol Bed Bench 6, S1-S10 (2013). 174 Fakhoury, J., Marie-Egyptienne, D. T., Londono-Vallejo, J. A. & Autexier, C. Telomeric function of mammalian telomerases at short telomeres. J Cell Sci 123, 1693-1704, doi:10.1242/jcs.063636 (2010). 175 Huard, S. & Autexier, C. Human telomerase catalyzes nucleolytic primer cleavage. Nucleic Acids Res 32, 2171-2180, doi:10.1093/nar/gkh546 (2004). 176 Oulton, R. & Harrington, L. A human telomerase-associated nuclease. Mol Biol Cell 15, 3244-3256, doi:10.1091/mbc.e04-03-0178 (2004). 177 Jain, M. et al. Nanopore sequencing and assembly of a human genome with ultra- long reads. Nat Biotechnol 36, 338-345, doi:10.1038/nbt.4060 (2018). 178 An, N., Fleming, A. M., White, H. S. & Burrows, C. J. Nanopore detection of 8- oxoguanine in the human telomere repeat sequence. ACS Nano 9, 4296-4307, doi:10.1021/acsnano.5b00722 (2015). 179 Liu, Q. et al. Detection of DNA base modifications by deep recurrent neural network on Oxford Nanopore sequencing data. Nature Communications 10, 2449, doi:10.1038/s41467-019-10168-2 (2019). 180 Bai, X. C., McMullan, G. & Scheres, S. H. How cryo-EM is revolutionizing structural biology. Trends Biochem Sci 40, 49-57, doi:10.1016/j.tibs.2014.10.005 (2015). 181 Punjani, A., Rubinstein, J. L., Fleet, D. J. & Brubaker, M. A. cryoSPARC: algorithms for rapid unsupervised cryo-EM structure determination. Nature methods 14, 290- 296, doi:10.1038/nmeth.4169 (2017). 182 Hernandez-Sanchez, W. et al. A non-natural nucleotide uses a specific pocket to selectively inhibit telomerase activity. PLoS biology 17, e3000204, doi:10.1371/journal.pbio.3000204 (2019).

133

Appendix A: Full TERT enzymology parameters

Figure A.1: Pre-steady-state kinetics of additional TERT active site mutants. (A) Pre-steady-state kinetics of tcTERT with an R194A substitution. Error bars represent the standard deviation of the mean. Data were fit to the exponential equation 1 to determine the kobs (rate) and target engagement. Pre-steady-state parameters were also calculated for (B) tcTERT with a Q308A substitution. (C) A replot of the fits from panel A, showing the kobs values obtained from each concentration of dGTP. -2 -3 Data were fit to Equation 2. The kpol was 3.69 x 10 ± 1.67 x 10 per second, and Kd was 92.7 ± 10.1 µM. -1 -2 (D) A replot of tcTERT Q308A. The kpol was 2.97 x 10 ± 1.58 x 10 per second, and Kd was 44.7 ± 10.0 µM.

134

Figure A.2: Complete graphs of selected TERT pre-steady-state kinetics Due to space considerations, only a zoomed in portion of pre-steady-state kinetic data were shown in certain portions of the text. Both the zoomed and complete curves were shown for (A) WT TERT inserting dGTP across from dC with a double exponential fit and (B) TERT R194A inserting dGTP across from dC. In both cases, error bars represent standard deviations of the at least three technical replicates.

135

Figure A.3: Residual analysis of (A) single exponential and (B) double exponential fits of the kinetic data from WT TERT inserting dGTP across from rC. Note that the residuals are smaller in each case for the double exponential, and that they form a more random distribution, whereas the single exponential forms an upside-down “u” pattern.

136

tcTERT variant Incoming nucleotide Nucleotide concentration (μM) Target engagement (nM) Target engagement error (nM) Kobs x 100 Kobs error x 100 1 55 (N/A) 7.56 1.61 5 55 (N/A) 34.7 2.67 50 55 (N/A) 105 9.58 100 55 (N/A) 170 7.18 300 55 (N/A) 154 7.65 400 55 (N/A) 197 19 WT dGTP 500 55 (N/A) 207 14 1 35 (N/A) 0.347 0.107 5 35 (N/A) 4.56 0.577 50 35 (N/A) 15.1 2.13 100 35 (N/A) 4.68 0.46 300 35 (N/A) 4.1 0.637 400 35 (N/A) 7.03 1.65 500 35 (N/A) 8.88 1.07 -2 -3 100 88.9 4.35 3.18 x 10 2.47 x 10 -1 -2 500 93.6 4.59 1.29 x 10 1.76 x 10 -1 -3 WT rGTP 1000 96.3 0.89 1.99 x 10 4.92 x 10 -1 -3 2000 90.6 0.62 2.77 x 10 5.25 x 10 -1 -3 3000 91.4 1.03 2.73 x 10 8.57 x 10 -2 -3 100 101 1.41 5.22 x 10 1.43 x 10 -1 -3 500 100 1.58 2.00 x 10 8.50 x 10 -1 -2 WT dATP 1000 99.7 1.63 3.45 x 10 1.65 x 10 -1 -2 2000 101 2.59 5.07 x 10 4.22 x 10 -1 -2 3000 101 1.52 5.39 x 10 2.70 x 10 -1 10 93.7 1.48 3.77 2.33 x 10 50 92.4 1.79 16.8 1.48 Q308A dGTP 250 93.9 1.82 25.2 2.3 500 97.3 1.96 27 2.57 -1 -2 10 101 3.02 3.69 x 10 3.36 x 10 -1 50 97.5 2.78 1.34 1.46 x 10 R194A dGTP -2 100 96.5 0.89 1.85 6.92 x 10 -1 250 96.9 1.34 2.71 1.66 x 10 10 85 3.24 91.5 11.6 50 87 3.28 231 33 Y256A dGTP 100 90.8 2.19 406 45.2 500 84.5 1.89 565 65.6 10 86 2.12 71.7 5.81 50 86.4 2.39 213 21.8 Y256A rGTP 100 88.6 1.98 332 31.7 250 87.2 1.98 428 45.6 500 88.5 2.47 484 66.3 *Values taken from slowest (minority) values of double-exponential plot, and not used for the data replot. Note that target engagement values were fixed for each experiment

Table A.1: Full kinetic parameters of each curve used to generate Kobs and target engagement values in tcTERT pre-steady-state kinetics

137

Appendix B: Molecular docking with AutoDock Vina

In addition to the approaches in structural and molecular biology presented in this dissertation, molecular docking has been demonstrated to be a valuable technique to understand how a ligand binds to a receptor. In this appendix, a brief overview on the advantages and disadvantages of using AutoDock Vina will be given, along with a protocol used to perform a docking experiment and commentary on the interpretation of docking results.

B.1 Advantages of using molecular docking techniques

Using computational techniques, including molecular docking experiments, is a valuable tool for understanding how and if a ligand binds a target receptor. Although there are hundreds of different docking programs available, the molecular docking technique discussed in this section will focus on AutoDock Vina, a recently developed program based on AutoDock 4.0, optimized for greater accuracy and faster processing time.

Compared to other structural techniques, molecular docking holds several distinct advantages. Firstly, compared to studies with structural techniques such as X-ray crystallography or cryo-EM, computational techniques are vastly less financially demanding. Instead of spending $10,000 - $50,000 preparing a sample for a structural study, modern molecular docking experiments can be performed with a traditional desktop computer, with no extra expenses required. Similarly, the result from molecular docking experiments are generated at a much more rapid pace than other biochemical experiments; upon starting a new molecular docking experiment, it is not uncommon for a researcher well-versed in computational techniques to generate results in less than one

138 day of effort. Results can be achieved rapidly and with low cost in large part because protein purification is not needed for computational studies.

Because molecular docking occurs so rapidly and low cost of resources, the barriers for collaboration are reduced. To establish a traditional collaboration, investigators often need to write a grant, purchase reagents, and designate personnel to work on the project before the work even begins. With collaborations in molecular docking, a brief meeting between collaborators explaining the problem and a survey of the protein data bank is often all of the preparation needed in order to initiate molecular docking experiments. Very quickly after that, computational results will dictate how the ligand of interest binds (both in terms of the binding pocket of the receptor chosen and in terms of the orientation and posing of the ligand), as well as how well it binds (with an estimation of the binding thermodynamics generated as part of the docking process).

Because so many enzymes have atomic-resolution structures publicly available in the

Protein Data Bank (PDB), a wealth of information can be used to determine these two parameters. Alternately, even without a direct hypothesis regarding a ligand binding a receptor, virtual screening can be performed, in which thousands of potential ligands are docked into a receptor of interest. With virtual screening, therapeutic candidates can be identified, and further optimized with other biochemical techniques.

B.2 Challenges associated with molecular docking

As outlined above, there are numerous potential benefits and applications for using molecular docking. As with any other technique, however, there are challenges associated with molecular docking that must be addressed for the correct design of a docking experiment, as well as an accurate interpretation of its results. One major issue

139 associated with molecular docking is the question of induced fit: upon binding, ligands can cause large-scale structural changes in a receptor. If a molecular docking experiment is performed using a receptor not bound to a ligand, the binding pocket may not be in the proper conformation to support binding. This problem can beparticularly computationally taxing, especially accounting for scenarios where secondary structure changes during the induced fit. One way to account for induced fit (used by several programs, including

AutoDock Vina) allow certain amino acid residues to be designated as flexible. With flexible residues, different rotamers can be sampled during the molecular docking experiment, allowing for further binding site optimization to occur. Flexible residues cannot fully address the challenges of induced fit in molecular docking, but at least offer a compromise between long calculation times and mutability of an enzyme’s structure during ligand binding.

Another challenge associated with molecular docking is the availability and selection of the proper receptor structure for docking. Firstly, a high-resolution structure of the receptor must be available. The higher the resolution, the more accurate the docking will be, and accuracy will particularly suffer if the locations of the amino acid side chains are not clear (> ~4 Å). Furthermore, if a structure is available, it is important to perform the docking experiment with the biologically relevant assembly (i.e. if the receptor is assembled as a hexamer in cells, docking will be the most accurate using the hexamer structure rather than a monomer structure). With structures containing multiple subunits, one other common pitfall that can result in false binding sites is domain swapping between subunits. In domain swapping, proteins adopt a more extended conformation, wherein a domain from one subunit switches positions with an equivalent domain in another subunit.

140

The swapped domain can create an artificial cavity if it is removed prior to performing the

docking calculation. Altogether, molecular docking is not a perfect technique for every

scenario, and several challenges must be overcome in order for an experiment to

accurately reflect a biological context. In many scenarios, it is difficult to tell whether

challenges are accurately addressed or not from molecular docking results alone;

therefore, molecular docking experiments nearly always require complementary

biological techniques to confirm the relevance of their findings.

B.3 Complementary techniques to molecular docking

For the reasons outlined above, it is advisable to perform complementary

biochemical techniques to confirm that the in silico results from a molecular docking experiment accurately reflect ligand binding in a cellular context. In many cases, complementary experiments have already been performed prior to docking to narrow down potential receptors for a ligand. If not, however, care must be taken when selecting complementary techniques, because many of the benefits offered by a molecular docking experiment (rapid results, low cost) can easily be outweighed by costly and time- consuming complementary approaches. Hence, the complementary approaches listed here have been ordered from the least intensive to most intensive.

B.3.1 Complementary approaches that do not require protein purification

As alluded to earlier, one major advantage towards performing molecular docking experiments is that costly and technically difficult protein purification protocols can be circumvented. Therefore, complementary methods that also do not require proteins to be purified are often techniques of choice to complement molecular docking. Genetic manipulations, such as using CRISPR-cas9 to disrupt the gene that hypothetically binds

141

the ligand, is one such way to test binding interactions. Unfortunately, these techniques

may knock out essential genes, or have other artifactual results from the knockout, so

they can be challenging to perform. An alternative technique, known as the Cellular

Thermal Shift Assay (CETSA), was recently developed to study ligand binding within a

cellular context 1. In this assay, briefly, both treated and untreated cells are heated to

various temperatures, from physiological temperatures to conditions high enough to

denature the protein of interest (the putative receptor of the ligand). If the ligand binds to

the protein, the temperature at which denaturation occurs will shift. Often, but not always,

if the ligand binds to the protein, it will stabilize the structure and increase the temperature

at which denaturation occurs. This technique is an excellent way to confirm target engagement of a ligand, and other than cell culture and the ligand of interest, only requires an antibody for the protein of interest. However, if the ligand to be tested interacts with the heat shock response of the cells; if heat shock response is disrupted or augmented, false negatives or positives may result.

B.3.2 Purified protein dependent complementary techniques

If the aforementioned techniques are not accessible or provide an unclear result, many other techniques can be utilized to characterize binding of a ligand to a receptor.

Fortunately, in order for molecular docking experiments to proceed, structural information must be known about the putative receptor. Protein purification is a prerequisite for structural characterization, so typically a purification protocol for the receptor of interest has already been published by the time structural information is released. After the receptor is purified, numerous established techniques can be applied to determine the Kd

of the ligand. These methods include (but are not limited to) fluorescence anisotropy,

142 isotherm titration calorimetry, biolayer interferometry, and Förster Resonance Energy

2 Transfer (FRET) techniques . All of these techniques can determine a Kd of binding for the ligand, but do not easily give information on the binding pocket of the receptor, or amino acids necessary for effective binding. If binding site information is needed, or more details desired to confirm the molecular docking, other lower resolution structural techniques can apply to validate an in silico study. These include Small Angle X-ray

Scattering (SAXS), hydrogen/deuterium exchange mass spectrometry, and low resolution cryo-EM 3,4. Structural techniques such as high resolution cryo-EM or X-ray crystallography can also provide answers to questions about the binding mode and orientation of a ligand, but results from these resource-intensive techniques will often supersede any in silico studies.

143

B.4 Example Molecular Docking Protocol with Autodock Vina

B.4.1 Context: Information that prompted molecular docking experiments

Researchers in the Jaeschke group at the University of Kansas Medical Center

were developing new treatments to counteract acetaminophen overdose, a common

cause of liver failure in the United States 5. They found that a compound, 4- methylpyrazole (4-MP), protected against acetaminophen hepatotoxicity, and prevented

Jun N-terminal kinase (JNK) activation, including JNK1 and JNK2. Furthermore, a CETSA assay revealed that the thermal stability of both JNK1 and JNK2 was significantly reduced upon treatment of 4-MP. The molecular mechanism by which 4-MP binds JNK, however, was currently not known. Therefore, a molecular docking experiment was performed, to see the binding orientation and thermodynamic energy of binding of 4-MP bound to JNK.

For details on every molecular docking experiment used in this study, the reader is encouraged to view them at the original Akakpo et al. publication 6. However, the specific

docking methodology used to generate those results are outlined below as a guide,

specifically for the control docking of a compound into JNK1. All other molecular docking

experiments used the same protocol, but with different receptors and ligands.

144

Figure B.1 :Selection of JNK1 structure. For molecular docking experiments this structure of JNK1 was used (PDB cod 3V3V). It was selected because it was a high resolution (2.7 Å) and was previously bound to a ligand, which could be used for a control docking experiment.

145

Figure B.2: Preparing a structure from the PDB for molecular docking. An example screenshot from PyMol during structure preparation. The structure was downloaded from the pdb using the “fetch 3v3v” command (bottom left). After fetching the structure, the “s” button (bottom right) was selected to display the protein sequence. After that, all solvent molecules and the ligand in the active site were removed via selecting them on the sequence, right clicking, and then clicking on “remove” to delete them. After this, the file should be saved as a .pdb file for use with docking. Similarly, if the ligand of interest is in the pdb, PyMol can be used to specifically save the coordinates of the ligand.

146

Figure B.3 Preparing the structure for a molecular docking experiment. For molecular docking experiments, force fields and other atomic parameters must be generated from the .pdb file via its conversion to .pdbqt format. First, within AutoDockTools, open the cleaned coordinates file generated in PyMol (no waters, ligands, etc), select the “Grid” tab, and then select “Macromolecule” and “choose”. This will give the option for naming the file into .pdbqt format, the file type necessary for use with AutoDock Vina. Make note of this file and its location. Similarly, with the ligand file that is to be docked, open the coordinates in AutoDockTools, select “Ligand”  “Input”  “Choose” and save the ligand of interest as a .pdbqt file as well.

147

Figure B.4: Setting up a Grid Box for the docking search into JNK1. Open AutoDockTools, and then load the molecular structure generated from PyMol. To determine the search area for the docking, first click on the “Grid” tab, and then select “Grid Box”. A dialog box like the one in the top left should pop up. Make sure to change the spacing to 1 angstrom, because those are the units used by AutoDock Vina in the command line. In this instance of molecular docking of JNK1, a Grid Box centered at coordinates (21, 45, 9) with side lengths of 60 x 60 x 60 angstrom was utilized. was utilized. The Grid Box should be large enough to envelop the receptor of interest, but also as small as possible.

148

Figure B.5: Example syntax for using AutoDock Vina with the command line. To perform the control docking experiment of quercetagetin, the command line was first directed to vina.exe using the path “\Program Files (x86)\The Scripps Research Institute\Vina\vina.exe”. Then the parameters for the Grid Box size, and center were given, the ligand file, the receptor file, the name of the log file, the output coordinates file, and the exhaustiveness parameter were explicitly given. After the search is completed, the default setting of of up to 9 binding modes were displayed, including their binding affinity in kcal/mol, and distance from best binding mode. The coordinate files that display each of these binding modes are displayed in the designated “JNKdockquer.pdbqt” file. Note that all files generated by AutoDock Vina will be placed in the path of the command line, in this case “C: \Docking\JNK docking\”

149

Figure B.6: Viewing the molecular docking results. After the calculations are finished on the molecular docking experiment, the results can be viewed using AutoDockTools. To do this, simply open the .pdbqt file generated in the docking experiment as the output. In this case, that file was named JNKdockquer.pdbqt . Upon opening the file, 9 different models will open simultaneously, with model 1 corresponding with the highest binding affinity (most negative value in kcal/mol), and model 9 corresponding with the 9th highest energy on the list. In this scenario, all 9 models docked to the active site of JNK1, and are shown overlayed in green, orange, cyan, pink, and red.

150

Figure B.7: Orientations of published (green) vs docked (blue) quercetagen into JNK1. To view the binding modes of the ligand in PyMol, select the specific models of interest in AutoDockTools and then save them as .pdb files. The .pdb file can then be opened in PyMol, and compared to the experimental structure. This control molecular docking experiment correctly identified the binding mode of quercetagen into JNK1 (gray surface), as indicated by the similar positions of the published ligand (green sticks) to that of the ligand position determined with AutoDock Vina (blue sticks). Two different views of the same data are shown, one with protein removed. Because these two orientations are similar (RMSD < 2 Å), this experiment is considered to have correctly identified the binding mode of quercetagen, validating JNK1 as a receptor amenable for molecular docking.

151

B.4.2 Closing notes on interpretation

With any molecular docking experiment, programs attempt to find the best binding mode. However, one must be conscientious that just because a binding mode is displayed on the output, that does not mean that it is a biologically relevant binding mode. Whether the ligand actually binds or not, free-energy algorithms optimization algorithms will always calculate a minimum; in some extreme cases, the best binding modes identified can have binding thermodynamics in the positive range (i.e. nonspontaneous, unfavorable binding).

Therefore, carrying out control experiments in addition to the experiments involving the ligand of interest can act as key information to interpret a molecular docking experiment.

These control docking experiments can either be used to validate the receptor structure or validate the ligand structure as appropriate for use with molecular docking. To validate the receptor structure, a structure of the receptor of interest bound to a small-molecule ligand must be available. By deleting the ligand out of the structure and attempting to find its binding mode computationally, potential false positive binding pockets can be identified, and the search parameters can also be verified, including size of the grid box and optimal exhaustiveness to use. Similarly, if a structure is available of the ligand of interest bound to an alternate receptor, attempting a molecular docking experiment with that known structure can give insight on the amenability of the ligand for molecular docking, as well as provide information about the thermodynamics of binding for that ligand. This informational can be critically important, especially when estimating how well a ligand binds to an unknown receptor.

The appendix presented here is intended to be utilized as a guide for molecular docking experiments with AutoDock Vina beyond the molecular docking of

152 quercatagen into JNK1. Because molecular docking aims to capitalize on the rich information provided by the multitude of structures available in the protein data bank, the possible uses of these programs is nearly endless. Proper use of molecular docking can lead to the rapid generation of information regarding a ligand’s binding mode and binding thermodynamics. Care must be taken, however, to make sure that the proper controls and parameters are utilized during a molecular docking experiment; in an ideal situation these would include validation of both the receptor and the ligand with control docking experiments, prior to the experiment with the unknown ligand or receptor. Unfortunately, this amount of information is often not available, particularly in the case of newly generated ligands that have no structural information available. Therefore, orthogonal approaches are often helpful, if not necessary, to validate the results of a molecular docking experiment. Even with these caveats, proper molecular docking is a valuable tool for better understanding how small molecules bind their target. With the binding mode of a compound known, not only is a compound’s mechanism of action better understood, but furthermore, improved therapeutics can then be designed with lower doses needed and fewer side effects, resulting in more successful clinical trials, and better patient outcomes.

153

B.4.3 References

1 Jafari, R. et al. The cellular thermal shift assay for evaluating drug target interactions in cells. Nature Protocols 9, 2100-2122, doi:10.1038/nprot.2014.138 (2014). 2 Ma, W., Yang, L. & He, L. Overview of the detection methods for equilibrium dissociation constant K(D) of drug-receptor interaction. J Pharm Anal 8, 147-152, doi:10.1016/j.jpha.2018.05.001 (2018). 3 Chalmers, M. J., Busby, S. A., Pascal, B. D., West, G. M. & Griffin, P. R. Differential hydrogen/deuterium exchange mass spectrometry analysis of protein-ligand interactions. Expert Rev Proteomics 8, 43-59, doi:10.1586/epr.10.109 (2011). 4 Tuukkanen, A. T. & Svergun, D. I. Weak protein-ligand interactions studied by small-angle X-ray scattering. The FEBS journal 281, 1974-1987, doi:10.1111/febs.12772 (2014). 5 Ye, H., Nelson, L. J., Gómez Del Moral, M., Martínez-Naves, E. & Cubero, F. J. Dissecting the molecular pathophysiology of drug-induced liver injury. World J Gastroenterol 24, 1373-1385, doi:10.3748/wjg.v24.i13.1373 (2018). 6 Akakpo, J. Y. et al. Delayed Treatment With 4-Methylpyrazole Protects Against Acetaminophen Hepatotoxicity in Mice by Inhibition of c-Jun n-Terminal Kinase. Toxicol Sci 170, 57-68, doi:10.1093/toxsci/kfz077 (2019).

154