THE ROLE OF SHELTERIN PROTEINS IN TELOMERE DNA

PROTECTION AND REGULATION

by

MENGYUAN XU

Submitted in partial fulfillment of the requirements for the degree of

Doctor of Philosophy

Dissertation Advisor: Dr. Derek Taylor

Department of Pharmacology

CASE WESTERN RESERVE UNIVERSITY

May, 2020

CASE WESTERN RESERVE UNIVERSITY

SCHOOL OF GRADUATE STUDIES

We hereby approve the thesis/dissertation of

Mengyuan Xu

candidate for the degree of Doctor of Philosophy *.

Committee Chair

Jason Mears, Ph.D.

Committee Member

Derek Taylor, Ph.D.

Committee Member

Eckhard Jankowsky, Ph.D.

Committee Member

Blanton Tolbert, Ph.D.

Committee Member

Marcin Golczak, Ph.D.

Date of Defense

February 13th, 2020

*We also certify that written approval has been obtained

for any proprietary material contained therein


Table of Contents

List of Tables ...... 1

List of Figures ...... 2

Acknowledgement ...... 4

Abstract ...... 6

Chapter 1 Background ...... 8

1.1 Telomere ...... 8

1.2 Telomerase ...... 10

1.3 Telomere DNA structure ...... 11

1.3.1 G-quadruplex ...... 11

1.3.2 T-loop ...... 15

1.4 Shelterin complex ...... 15

1.4.1 Double stranded DNA binding proteins ...... 15

1.4.2 Single stranded DNA binding proteins ...... 17

1.4.2.1 POT1-TPP1 functions in telomerase regulation ...... 20

1.4.2.2 Disease related mutations in POT1-TPP1 ...... 20

1.5 Telomere protection and telomerase regulation ...... 22

1.5.1 Telomere and DNA damage response ...... 22

1.5.2 Telomere length regulation ...... 26

1.6 Significance ...... 26

Chapter 2 POT1-TPP1 differentially regulates telomerase via POT1 His266 and as a function of single-stranded telomere DNA length ...... 28

Abstract ...... 29

Introduction ...... 30

Results ...... 32

Discussion ...... 60

Material and Methods ...... 65

Chapter 3 The telomere POT1-TPP1 complex promotes G-quadruplex destabilization using both active and passive mechanisms ...... 91

Abstract ...... 92

Introduction ...... 94

Results ...... 96

Discussion ...... 117

Material and Methods ...... 121

Chapter 4 Summary and future direction ...... 132

4.1 Summary ...... 132

4.2 Future direction ...... 135

4.2.1 Determination of POT1-TPP1-ssDNA complex structure ...... 135

4.2.2 Characterization of the kinetic properties of POT1-TPP1 destabilizing

hybrid G-quadruplexes ...... 136


4.2.3 Characterization of the role of POT1-TPP1 in telomerase regulation in

the presence of G-quadruplexes ...... 137

4.2.4 Determination of the cellular consequence of POT1 cancer-associated

mutation H266L in genome instability...... 139

Reference ...... 141


List of Tables

Table 2.1 Summary of hydroxyl radical modification rate constant for residues in

POT1-N, POT1-N-hT12 and POT1-N-hT72...... 77

Table 2.2 Summary of hydroxyl radical modification ratios for residues in comparison of POT1-N/POT1-N-hT12 and POT1-N-hT12/POT1-N-hT72...... 80

Table 2.3 Summary of hydroxyl radical modification rate constant for residues in

PT, PT-hT12, and PT-hT72...... 82

Table 2.4 Summary of hydroxyl radical modification ratios for residues in the comparison of PT/PT-hT12 and PT-hT12/PT-hT72...... 88

Table 3.1 Rate constants for POT1-TPP1 binding events to 6ThT22 in Li+ buffer as determined from SPR data...... 130

Table 3.2 Equilibrium dissociation constants determined for POT1-TPP1 and

6ThT22 in Li+ buffer using SPR analysis...... 130

Table 3.3 Rate constants for POT1-TPP1 binding events to 6ThT22 in K+ buffer as determined from SPR data...... 131

Table 3.4 Equilibrium dissociation constants determined for POT1-TPP1 and

6ThT22 in K+ buffer using SPR analysis...... 131


List of Figures

Figure 1.1 Three-dimensional and schematic structures of intramolecular G-quadruplexes formed by human telomere sequences...... 13

Figure 1.2 Shelterin complex components...... 19

Figure 1.3 Shelterin prevents multiple DNA-damage responses at the telomere.

...... 23

Figure 2.1 POT1-N and POT1-TPP1 coat long telomere ssDNA substrates...... 34

Figure 2.2 HRF identifies POT1-N and POT1-TPP1 residues undergoing

environmental changes upon binding of differing telomere ssDNA substrates. .. 36

Figure 2.3 Regions of POT1-N that exhibit alterations in hydroxyl radical modification rates upon telomere ssDNA binding and oligomerization on long

ssDNA...... 37

Figure 2.4 Oxidation rates of identified residues in POT1-N...... 39

Figure 2.5 HRF results of POT1-TPP1 complexes bound to different telomere

ssDNA substrates...... 43

Figure 2.6 POT1(H266L)-TPP1 mutant exhibits differential binding defects that are

length-dependent of ssDNA substrate...... 47

Figure 2.7 H266L mutation impairs the end-binding preference of POT1-TPP1. 48

Figure 2.8 H266L POT1 mutant abrogates POT1-TPP1 complex-mediated

inhibition of telomerase in a substrate length-dependent manner promoting

telomere extension...... 51

Figure 2.9 POT1-TPP1 wild type and mutant complexes enhance telomerase

processivity to extend telomere ssDNA substrates...... 53


Figure 2.10 Destabilization of hT72 formed G-quadruplex upon POT1-TPP1

binding...... 58

Figure 2.11 H266L POT1 mutant promotes telomere extension in cancer cell line.

...... 59

Figure 2.12 The proposed mechanism of POT1-TPP1 complex-mediated

regulation of telomerase...... 64

Figure 3.1 6ThT22 telomere ssDNA forms antiparallel GQ structures in K+...... 98

Figure 3.2 hT22 forms G-quadruplexes in K+ but not Li+ buffer conditions...... 100

Figure 3.3 POT1-TPP1 binds unfolded 6ThT22 with two distinct binding affinities.

...... 101

Figure 3.4 POT1-TPP1 binds and destabilizes 6ThT22 telomere GQs...... 104

Figure 3.5 Potential models for POT1-TPP1 destabilization of telomere G-

quadruplexes...... 106

Figure 3.6 POT1-TPP1 complex promotes G-quadruplex destabilization using both active and passive mechanisms...... 109

Figure 3.7 POT1-TPP1 Q94R and H266L mutants impair protein-ssDNA

interactions...... 112

Figure 3.8 POT1-TPP1 mutants impair both binding and destabilization of 6ThT22

telomere GQs...... 113

Figure 3.9 POT1-TPP1 pathogenic mutants differentially impair binding and

destabilization of telomere G-quadruplexes...... 115


Acknowledgement

During my journey toward the Ph.D. degree, I would first like to express my deepest gratitude to my mentor, Dr. Derek Taylor. He has been so helpful and patient in guiding me through my research projects. At the beginning of my Ph.D. study, he explained the projects to me in full detail, which allowed me to start them smoothly. The insightful discussions we had every week led to the final completion of the projects. I also greatly appreciate his trust in letting me take on the challenging second project, which was in a field I was not familiar with. He was so supportive and let me try every possible strategy to complete that project. Without his mentorship, I could not have completed any piece of my research work.

I want to thank all the past and present Taylor lab members. Thanks to Michael

Mullins, who was the initial lab member when I started my Ph.D. He was able to

notice my little mood swings and tried his best to help me. Dr. Magdalena Malgowska and I are more like close friends and sisters than just lab colleagues; I thank her for inspiring me with both her positive lifestyle and her diligent research attitude. I would like to thank Wilnelly Hernandez-Sanchez and Dr. Tawna Mangosh for the enjoyable discussions we had and for the immense help their research experience provided. The help from other lab members also means a lot to me; I would also like to thank

Dr. Malligarjunan Rajavel, Dr. Xuehuo Zeng, Maria de la Fuente, Dr. Wei Huang,

Dr. Harry Scott, Daniel Leonard, and Dr. Jagpreet Nanda.

I sincerely acknowledge my committee members, Drs. Jason Mears, Eckhard

Jankowsky, Blanton Tolbert, Philip Kiser, Michael Harris, and Marcin Golczak.


They have provided very helpful advice and suggestions for my research projects and guided me toward the right path in my research career.

I would like to thank all my collaborators, Armend Axhemi, Dr. Janna Kiselar,

Sukanya Srinivasan, and Dr. Yinghua Chen. Their efforts and collaboration made it possible to complete the research projects in this dissertation.

Lastly, I want to thank my parents, Mr. Cun Xu and Ms. Zhaomin Zeng, for their immense support as I pursued my Ph.D. at a place so far away from them.

Without the support from my friends in and out of Cleveland, I could not have made it through my Ph.D. life. Finally, I want to thank my husband, Dr. Lei Han, who brightens my life like sunshine.


The Role of Shelterin Proteins in Telomere DNA

Protection and Regulation

Abstract

by

MENGYUAN XU

Telomeres are specialized nucleoprotein complexes that cap the ends of all linear chromosomes. Telomere DNA is composed of hexameric tracts of guanine (G)-rich sequence (TTAGGG in mammals) that end in single-stranded DNA (ssDNA) overhangs. The G-rich repeats are capable of forming highly stable secondary structures called G-quadruplexes (GQs) that contribute to telomere maintenance.

POT1 and TPP1 form a heterodimeric complex that specifically binds the telomere ssDNA overhang with nanomolar affinity. The POT1-TPP1 complex is a critical regulator of telomere length, functioning both to protect telomeres from recognition by DNA damage response pathways and to recruit telomerase, a specialized enzyme that extends telomeres. The regulatory roles of the POT1-TPP1 complex in the presence of physiologically relevant telomere ssDNA lengths are yet to be understood. In my dissertation, we identified a telomerase inhibitory role of POT1-TPP1 when multiple complexes coat physiologically relevant lengths of telomere ssDNA, but not for short telomere ssDNA that contains only one POT1-TPP1 binding site. Furthermore, hydroxyl radical footprinting coupled with mass spectrometry was employed to identify, in an unbiased manner, structural environmental changes occurring at residue histidine 266 of POT1 that depend on telomere ssDNA length. Additionally, we determined that the chronic lymphocytic leukemia (CLL)-related POT1 H266L substitution impairs POT1-TPP1 binding to ssDNA substrates in a length-dependent manner. The POT1 H266L mutant also impairs the inhibitory role of POT1-TPP1 in telomerase regulation, leading to telomere overextension at the cellular level. In the second part of my dissertation, we present a detailed kinetic model that defines telomere GQ destabilization upon POT1-TPP1 binding, which depends upon protein concentration. At low POT1-TPP1 concentrations, the proteins capture the unfolded state of telomere ssDNA, which exists in dynamic equilibrium with its GQ structure. Conversely, at high POT1-TPP1 concentrations, protein binding actively opens GQ structures. Finally, we characterized two CLL-related mutations that differentially impair GQ binding and destabilization, providing insights into the influence of DNA topology on genomic stability and pathogenic mechanisms.


Chapter 1 Background

1.1 Telomere

The telomere, located at the very end of each eukaryotic chromosome, is a nucleoprotein complex. Telomere DNA extends for thousands of base pairs of double-stranded DNA (dsDNA) before ending with a G-rich single-stranded DNA (ssDNA) overhang that serves as the substrate for telomerase extension. Telomeres solve two main biological problems. The first is the end-protection problem: the ends of chromosomes need to be distinguished from double-strand breaks, which need to be repaired. Otherwise, the DNA damage response and DNA end-joining events would result in deleterious consequences such as end-to-end chromosomal fusions. To solve this problem, telomere-binding proteins bind specifically to the telomere and protect it from the DNA damage response. The second is the end-replication problem: because DNA polymerase synthesizes DNA only in the 5’ to 3’ direction, it cannot fill the gap left behind by the RNA primer at the 5’ end. To solve this problem, telomerase, a specialized and unique ribonucleoprotein (RNP) complex, extends the telomere sequence by using its own RNA component as a template.

Telomeres are able to absorb the loss of DNA caused by the end-replication problem and to prevent the loss of genomic information. As such, telomere length in healthy, adult somatic cells is somewhat heterogeneous among individuals and populations; however, telomere lengths tend to gradually become shorter as part of the natural aging process. On average, telomere length in healthy, human adult


cells ranges from 5-15 kb, with the 3’ overhang extending for an additional 50-200 nucleotides (1, 2). While telomere sequence is conserved among mammals

(TTAGGG)n, telomere length varies among species. As an example, mice have

extremely long telomeres (~30-150 kb), which has complicated the use of mouse

models to explore the processes involved in telomere regulation in humans.
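As a rough illustration of how the end-replication problem erodes telomeres when telomerase is absent, the sketch below counts how many cell divisions a telomere can absorb before it falls below a chosen threshold. The function name, the constant loss per division, and the threshold are hypothetical values chosen for illustration; only the 5-15 kb starting range comes from the text above.

```python
def divisions_until_threshold(initial_bp: int, loss_per_division_bp: int,
                              threshold_bp: int) -> int:
    """Count divisions before telomere length would drop below threshold_bp.

    Assumes a constant loss per division and no telomerase activity;
    both are simplifications made for illustration only.
    """
    if loss_per_division_bp <= 0:
        raise ValueError("loss_per_division_bp must be positive")
    length = initial_bp
    divisions = 0
    # Keep dividing while one more division would stay at or above the threshold.
    while length - loss_per_division_bp >= threshold_bp:
        length -= loss_per_division_bp
        divisions += 1
    return divisions


# Hypothetical numbers: a 10 kb telomere losing 100 bp per division,
# with an arbitrary 5 kb threshold.
print(divisions_until_threshold(10_000, 100, 5_000))  # prints 50
```

Real shortening rates vary between cell types and individuals, so the output is only meaningful relative to the assumed inputs.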

Telomere DNA is bound and protected by a core group of six proteins that is collectively referred to as the shelterin complex (3). Three shelterin proteins interact specifically with telomere DNA; Telomere Repeat Binding Factors 1 and 2

(TRF1 and TRF2) bind to the double-stranded region of the telomere and

Protection of Telomere 1 (POT1) binds to the telomere ssDNA. RAP1 (the human ortholog of yeast Repressor/Activator Protein 1) interacts directly with TRF2 to

modulate its function. The two remaining shelterin proteins, TIN2 (TERF1-

interacting nuclear factor 2) and TPP1 behave as a molecular conduit in the

shelterin complex, interacting with TRF2, TRF1 and POT1 to form a direct, protein-

mediated link between telomere dsDNA and single stranded DNA (ssDNA). The

removal of individual shelterin proteins induces a complex set of DNA damage

responses, which includes traditional and non-traditional repair mechanisms (4-6).

As such, a primary function of shelterin is to bind telomere DNA and protect it from

inappropriate recognition by the DNA damage response machinery. However, data are

emerging which suggest that the shelterin complex plays a much more versatile

role in telomere maintenance and cell signaling events.


1.2 Telomerase

Telomerase is responsible for maintaining telomere length homeostasis (7).

Telomerase is minimally composed of a catalytic subunit that contains the telomerase reverse transcriptase (TERT) activity and a telomerase RNA component (TR, or TERC), which serves as the template for telomere DNA synthesis (7). In addition to nucleotide addition, telomerase translocates its RNA template after six nucleotides are synthesized, so that it may be re-used as a template for the next set of six nucleotides to be synthesized. This mechanism, referred to as repeat addition processivity, is coordinated by multiple domains within TERT and TR to prevent dissociation from the telomere and to orchestrate realignment of the template RNA with the newly synthesized DNA product.

Whereas TERT is responsible for telomerase catalytic functions, TR provides the

RNA template needed to elongate telomeres. In addition to the nucleotides that are responsible for incorporation of the consensus sequence by Watson-Crick base pairing, the TR template contains additional nucleotides that are equally important for initial binding and for proper alignment. Upon recruitment, telomerase binds to the DNA 3’ end-flanking region by complementarity of those nucleotides that are adjacent to the coding template region. Nucleotides are then reverse- transcribed into telomere ssDNA until the end of the coding template is reached.

At this point, telomerase translocates the DNA strand to realign the template region with freshly synthesized telomere DNA to repeat the entire process without primer dissociation.
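The align, copy, and translocate cycle described above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not a description of the enzyme's kinetics: the 11-nucleotide RNA string is the commonly cited human TR template region (it is not given in this text and is included here as an assumption), and the helper names `templated_dna`, `best_register`, and `extend` are hypothetical.

```python
# RNA template base -> DNA base added opposite it.
RNA_TO_DNA = {"A": "T", "U": "A", "C": "G", "G": "C"}

# Assumed human TR template region (alignment + coding nucleotides), 5'->3'.
HTR_TEMPLATE = "CUAACCCUAAC"


def templated_dna(rna_5to3: str) -> str:
    """DNA (5'->3') specified by an RNA template that is read 3'->5'."""
    return "".join(RNA_TO_DNA[base] for base in reversed(rna_5to3))


def best_register(primer: str, product: str) -> int:
    """Index in `product` where copying resumes after the primer 3' end pairs.

    Picks the alignment with the longest run of matches to the primer 3' end,
    while leaving at least one template position still to be copied.
    """
    best_j, best_k = 0, 0
    for j in range(1, len(product)):
        k = 0
        while k < j and k < len(primer) and primer[-1 - k] == product[j - 1 - k]:
            k += 1
        if k > best_k:
            best_j, best_k = j, k
    if best_k == 0:
        raise ValueError("primer 3' end cannot pair with the template")
    return best_j


def extend(primer: str, template_rna: str, rounds: int) -> str:
    """Run `rounds` cycles of realignment followed by copying to the template end."""
    product = templated_dna(template_rna)
    for _ in range(rounds):
        j = best_register(primer, product)  # translocation / realignment
        primer += product[j:]               # nucleotide addition to template end
    return primer


print(extend("TTAGGG", HTR_TEMPLATE, 3))  # prints TTAGGGTTAGGGTTAGGGTTAG
```

In this model the first round adds only the nucleotides needed to reach the end of the template, and every later round adds a full six-nucleotide repeat, which mirrors the repeat addition cycle described in the text.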


1.3 Telomere DNA structure

Telomere DNA is comprised of repetitive G-rich sequence motifs oriented

5’ to 3’ towards the chromosome end. The length and sequence of repeats vary among different species. It is represented by 4.5 repeats of (T4G4) sequence in

the ciliate Oxytricha nova, 20~70 repeats of (T2G4) sequence in T. thermophila,

10-15 kb of (T2AG3) repeats in humans, and 20-50 kb in certain mouse and rat

species (8). Meanwhile, telomere DNA in S. cerevisiae comprises approximately

300 base-pairs of a somewhat heterogeneous (TG)1–4G2-3 repeating sequence. In

all organisms, telomere DNA is composed mostly of dsDNA followed by a ssDNA

overhang that serves as the substrate for telomerase-mediated extension.

1.3.1 G-quadruplex

Guanine-rich DNA is capable of forming very stable G-quadruplex (GQ) structures

(9, 10). GQs are best characterized by the arrangement of planar arrays formed by four guanine bases held together by hydrogen-bonded Hoogsteen base-pairing interactions. GQ structures can form both inter- and intramolecularly and

the morphology varies depending on several factors. For example, the

strands in the GQ can assemble in a parallel or antiparallel orientation, or as a

heterogeneous mix of both. Similarly, the associated metal ion that

stabilizes the GQ contributes to GQ topology and strand orientation. Finally, the

sequence of nucleotides flanking the GQ structure can influence topology and stability. Multiple GQ topologies assembled using DNA with human telomere sequence have been characterized in molecular detail (Figure 1.1). The K+-containing structure of d[AGGG(TTAGGG)3] determined by X-ray crystallography reveals a propeller-like structure with the strands oriented in a parallel fashion (11).

In this arrangement, the guanines are arranged into stacked G-quartets with the

K+ in the center and the TTA loops protruding away like the blades of a propeller.

The structure of the same DNA sequence determined by NMR provides a different

basket-type GQ conformation (12). In this topology, all strands reside in an

antiparallel orientation. While most of the current knowledge regarding GQ stability

and structural polymorphism stems from biophysical experiments using isolated

DNA, GQs have also been identified in the telomeres of human cells and in the

macronuclei of ciliates and in frog oocytes (13-15).
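Sequences with the potential to fold into an intramolecular GQ are often flagged with a simple pattern search: four runs of at least three guanines separated by short loops. The sketch below applies that widely used heuristic to telomere repeats. It is an illustration only; the loop-length limit of 1-7 nucleotides and the function name `find_putative_gq` are assumptions, and a match says nothing about which topology, if any, actually forms.

```python
import re

# Four runs of >=3 guanines separated by loops of 1-7 nucleotides.
PUTATIVE_GQ = re.compile(r"G{3,}(?:[ACGT]{1,7}G{3,}){3}")


def find_putative_gq(seq: str) -> list[tuple[int, int, str]]:
    """Return (start, end, matched sequence) for each putative GQ motif."""
    seq = seq.upper()
    return [(m.start(), m.end(), m.group()) for m in PUTATIVE_GQ.finditer(seq)]


# d[AGGG(TTAGGG)3], the human telomere sequence discussed above.
print(find_putative_gq("AGGGTTAGGGTTAGGGTTAGGG"))
# prints [(1, 22, 'GGGTTAGGGTTAGGGTTAGGG')]

# Only three G-tracts: too short to form an intramolecular GQ by this rule.
print(find_putative_gq("TTAGGGTTAGGGTTAGGG"))  # prints []
```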


Figure 1.1 Three-dimensional and schematic structures of intramolecular G-quadruplexes formed by human telomere sequences.

A. Parallel G-quadruplex observed for the sequence d[AGGG(TTAGGG)3] in a K+-containing crystal (PDB ID: 1KF1). B. Parallel G-quadruplex observed by NMR for the sequence d[TAGGG(TTAGGG)3] in a K+-containing crowded solution (PDB ID: 2LD8). C. Antiparallel G-quadruplex observed by NMR for the sequence d[AGGG(TTAGGG)3] in Na+-containing solution (PDB ID: 143D). D. Hybrid G-quadruplex with two G-tetrad layers observed by NMR for the sequence d[GGG(TTAGGG)3T] in K+-containing dilute solution (PDB ID: 2KF8). E. (3 + 1) form 1 hybrid G-quadruplex observed by NMR for the sequence d[TAGGG(TTAGGG)3] in K+-containing dilute solution (PDB ID: 2JSM). F. (3 + 1) form 2 G-quadruplex observed by NMR for the sequence d[TAGGG(TTAGGG)3TT] in K+-containing dilute solution (PDB ID: 2JSL). Guanines in an anti-configuration are shown in blue and those in a syn-configuration are colored yellow.


1.3.2 T-loop

Telomere loops (T-loop) describe another structure that has been characterized for telomere DNA. To form a T-loop, the ssDNA overhang is predicted to invade the telomere duplex DNA to form a lariat configuration. Due to

the elevated thermodynamic stability inherent to dsDNA, protein factors such as

TRF2 are a requisite for T-loop assembly (16).

1.4 Shelterin complex

1.4.1 Double stranded DNA binding proteins

Vertebrate telomeres are capped by a multi-protein complex called

shelterin (Figure 1.2) (3). Two shelterin components, TRF1 and TRF2, localize specifically at telomeres (Figure 1.2A). The two proteins are negative regulators of telomere length, as overexpression of either TRF1 or TRF2 leads to gradual telomere attrition in cancer cells (17, 18). TRF1 and TRF2 bind to the telomere dsDNA as preformed homodimers, which interact through their N-terminal, TRF-

homology (TRFH) domains (19). The structure of the TRFH domain has been

determined for both TRF1 and TRF2 and it resembles a twisted horseshoe-like

structure with unique interface features to prevent heterodimerization (19). For example, the amino acid sequence at the TRFH interface differs between TRF1 and TRF2 and the structures implicate these differences in inhibiting TRF1-TRF2 heterodimer formation. The protein-DNA interactions for TRF1 and TRF2 occur exclusively with telomere dsDNA and are orchestrated by conserved Myb domains that reside at the C-terminus of both proteins (20, 21). The tertiary structure of the

Myb domain of TRFs is represented by three helices (21-23). Notably, the third

helix recognizes the core TAGGG sequence that resides in the major groove of

the duplex, telomere DNA.

TIN2 is retained at the telomere through interactions that stabilize the TRF1

and TRF2 DNA binding ability (24). TIN2 comprises a central hub of the shelterin

complex that maintains interactions with TRF1, TRF2 and TPP1 (Figure 1.2A) (25-

27). Mutations in TIN2 that impair its binding with TRF1 or TRF2, destabilize

telomeres and induce a DNA damage response (28). Meanwhile, interactions that

reside between TIN2 and TPP1 are necessary for recruitment of the TPP1-POT1

heterodimer to the telomere to bind and protect the ssDNA overhang (29). The

removal of TIN2 protein in mice abrogates the localization of POT1-TPP1 protein

at the telomere, and triggers an ATR-mediated DNA damage response (30). These

data suggest that TRF1 and TRF2 recruit TIN2 to the shelterin complex, which in

turn recruits POT1-TPP1 to the telomere. In addition to forming interactions that

keep the shelterin complex intact, the TRF1-TIN2 interaction prevents SCFFbx4-mediated ubiquitination and degradation of the TRF1 protein (31).

RAP1 is the most highly conserved shelterin protein with the least understood role in telomere biology. RAP1 forms a complex specifically with TRF2 to enhance its DNA-binding specificity (Figure 1.2A) (32). The RAP1-TRF2 interaction has been shown to protect telomere DNA from nonhomologous end-joining (NHEJ) (33). The role of RAP1 in NHEJ has been controversial, however, as other data identify a role of RAP1 in suppressing homology-directed repair

(HDR) at telomeres and not NHEJ, at least in cell lines devoid of -KU80 signaling proteins (34). Structurally, the RAP1 C-terminal domain forms a

conserved module in proteins across species that guides interactions with TRF2 in humans, and is used to recruit SIR3 proteins to regulate gene silencing in budding yeast (35). Interestingly, the removal of RAP1 from human cell lines has no effect on the other shelterin components or on telomere length homeostasis

(36). These data suggest that RAP1 may play a more crucial role in regulating transcription as opposed to a direct role in telomere maintenance.

1.4.2 Single stranded DNA binding proteins

The ssDNA overhang at the 3’ end of mammalian telomeres is bound and protected by POT1 protein (Figure 1.2B) (37). POT1 was originally thought to behave exclusively as a negative regulator of telomerase activity, but this interpretation gets complicated when POT1 functions with other shelterin proteins.

For example, the POT1-TPP1 heterodimer increases telomerase activity on telomere DNA (38, 39). Although TPP1 is not known to interact with telomere DNA directly, it increases the affinity of POT1 for telomere DNA substrates (38, 39).

Furthermore, TPP1 helps POT1 to discriminate between ssDNA and RNA

substrates (40) and plays a central role in the shelterin complex by bridging TIN2,

and thus the double-stranded region of the telomere, with POT1 and the ssDNA

overhang (29, 41, 42).

Structurally, the N-terminal domain of POT1 folds into two

oligonucleotide/oligosaccharide binding (OB) domains, which interact intimately

with telomere DNA (43). In addition to POT1, a number of nucleic-acid binding

proteins, including replication protein A, are represented by OB-fold motifs,

indicating a universal role in the direct maintenance of genomic stability (44). The central domain of TPP1 also represents an OB-fold (38). However, instead of binding to nucleic acid, the OB-fold of TPP1 is responsible for interactions with telomerase (29, 45, 46). Recently, two groups determined the structure of the C-terminus of POT1 in complex with the POT1-binding domain of TPP1 by X-ray crystallography.

The C-terminus of POT1 contains a Holliday junction resolvase (HJR) domain inserted in the middle of the third OB-fold. Both structures revealed that the POT1-binding domain of TPP1 lies in the grooves formed by the third OB domain and the HJR domain of POT1, thereby localizing POT1 at the telomere. Binding of POT1-TPP1 to telomere ssDNA prevents the induction of DNA damage responses, including ATR kinase-dependent responses, and contributes to genome stability.

Despite poor sequence identity, POT1 and TPP1 are structurally related to the O. nova telomere end-binding α and β (TEBPα and TEBPβ) heterodimer (38).

TEBPα and TEBPβ were the first telomere-specific DNA-binding proteins to be identified, and the structure of the TEBPα-TEBPβ-DNA complex has been solved by X-ray diffraction (47). The structure reveals that TEBPα is represented as a series of three OB-fold domains and TEBPβ is comprised of a single OB-fold motif. The two proteins interact with one another to clamp down on telomere DNA that resides in a groove between them.


Figure 1.2 Shelterin complex components.

A. Domain diagrams of telomere double stranded DNA binding components. B.

Telomere single stranded DNA binding components. The top and bottom panels show POT1 and TPP1 domain diagrams, respectively. The middle panel shows the available structures of POT1 (green) and TPP1 (orange), including the DNA-binding domain of POT1 bound to telomere DNA (PDB ID: 1XJV), the OB-fold domain of TPP1 (PDB ID: 2I46), and the C-terminus of POT1 in complex with the POT1-binding domain of TPP1 (PDB ID: 5UN7 and 5H65).


1.4.2.1 POT1-TPP1 functions in telomerase regulation

As mentioned, a primary function of shelterin proteins is to protect telomere

DNA from illicit events, such as DNA degradation and end-to-end fusions of different chromosomes. However, the role of shelterin has expanded to include telomere-length maintenance as several of the proteins have been discovered to function in telomerase recruitment and regulation. For example, the POT1-TPP1 heterodimer enhances telomerase activity and processivity (38, 39). The enhancement of telomerase activity can be attributed to a direct protein-protein interaction that allows TPP1 to recruit telomerase to the telomere (29, 45, 46).

Independent of telomerase recruitment, the POT1-TPP1 heterodimer slows dissociation of telomerase from telomere DNA to assist translocation and enhance telomerase processivity (48). A number of studies have identified post- translational modifications to TPP1 that may provide a molecular switch between telomere protection and telomerase recruitment activities (49, 50).

Several studies suggest that TIN2 facilitates the localization of POT1-TPP1 to telomere ssDNA. The removal of TIN2 diminishes the amount of POT1 and

TPP1 that localizes at the telomere (30). Moreover, the depletion of TIN2 but not

POT1 results in the failure of TPP1-dependent telomerase recruitment (29).

1.4.2.2 Disease related mutations in POT1-TPP1

Recently, more than 300 single-nucleotide polymorphisms (SNPs) within the coding region of POT1 (cBioPortal) have been identified in patients with varying types of cancer (51, 52). Exome sequencing of familial glioma patients has identified inherited mutations in the POT1 gene that are associated with this type of cancer (p.G95C, p.E450X and p.D617Efs) (53). The POT1 G95C mutation is located within the DNA-binding groove and presumably disrupts interactions with telomere DNA. The POT1 E450X mutation introduces a premature stop codon, producing a truncated POT1 protein that is predicted to lack its TPP1-interacting domain. Similar inherited mutations have been reported in the POT1 gene of familial melanoma patients (54). Most of the identified mutations are localized in the POT1 DNA-binding domain, which emphasizes an important relationship between POT1-DNA interactions and the development of familial cancer. Somatic mutations of POT1 have also been detected in chronic lymphocytic leukemia (CLL)

(55). These studies suggest that mutant POT1 protein fails to localize at the telomere, leaving unprotected telomere ends that could lead to genome instability and tumorigenesis.

Besides POT1, mutations in the gene that codes for TPP1 have been identified in patients with bone marrow failure syndromes (56). A recent study on

Hoyeraal-Hreidarsson syndrome (HH), a clinically severe variant of dyskeratosis congenita (DC), revealed

ACD (which codes for TPP1) as a novel DC-related gene (56). In this study, two mutations

(ΔK170 and P491T) were identified at the protein level of TPP1, both at residues that are

highly conserved in mammals. The first mutation is a single amino acid deletion of

K170, which is located in a region of TPP1 that is responsible for conducting

interactions with telomerase (46). Interestingly, the ΔK170 mutation of TPP1 has

been identified in patients with aplastic anemia as well (57). The second mutation

is an amino acid substitution (P491T), which is located in the TIN2 interacting

domain of TPP1. Together, these data further demonstrate that highly intricate

protein-protein and protein-DNA interactions within the shelterin complex contribute to proper genome stability.

1.5 Telomere protection and telomerase regulation

1.5.1 Telomere and DNA damage response

When telomeres become critically short, their capping function is compromised

and a range of DNA damage-like responses are induced. In telomerase deficient

yeast cells, short telomeres are recognized as DNA damage and the cells arrest in

G2/M (58). Markers consistent with DNA damage response are also triggered in

human fibroblasts when telomeres reach a critically short threshold to invoke

senescence. These signaling events are remarkably similar to those in cells

bearing DNA double-stranded breaks and involve the activation of DNA damage

checkpoint kinases including CHK1 and CHK2. These findings, as well as others,

have provided a clear connection between telomere-initiated senescence and

innate DNA damage responses (Figure 1.3).


Figure 1.3 Shelterin prevents multiple DNA-damage responses at the telomere.

The mechanisms by which shelterin components repress DNA-damage responses are indicated by black arrows. The main telomerase components, TR and TERT, are shown in blue. Individual shelterin proteins and their role in preventing various

DNA-damage responses are labeled accordingly.


In normal telomeres, the shelterin complex collaborates to repress at least

six DNA damage pathways that include ATM and ATR signaling, classical and

alternative nonhomologous end joining (alt-NHEJ), homologous recombination, and resection (6). Single knockdown studies for each component of the shelterin complex have revealed similar and alternative mechanisms that explain how this protein complex functions to prevent telomeres from appearing as a DNA break in

need of repair. The deletion of a POT1-ortholog in mice results in telomere fusions

and -dependent senescence (59). POT1 knockdown experiments have

provided evidence that it functions, at least in part, to prevent replication protein A

(RPA) from binding to telomere ssDNA thereby preventing the RPA-induced

activation of ATR dependent DNA damage responses (4, 5, 59). Because of an

expanded role in DNA damage repair and DNA synthesis, RPA exists at a

concentration that is much higher in the cell than that of POT1. Both POT1 and

RPA display similar binding affinities for telomere DNA, yet physiological levels of

POT1 are sufficient to prevent RPA from binding to the telomere. One explanation

for this phenomenon is a shelterin-related enhancement of POT1

localization and function. Interactions between POT1 and TPP1 with the rest of the

shelterin complex effectively localize and concentrate POT1 protein exclusively at

the telomere. Furthermore, the inclusion of TPP1 increases the binding affinity of

POT1 to telomere DNA nearly ten times over that of POT1 alone (38). Highlighting

the importance of TPP1 in localizing POT1 to protect ssDNA at the telomere, knockdown experiments revealed that the loss of TPP1 activates ATR dependent

DNA damage response in a manner similar to that of POT1 removal (60).

Furthermore, TIN2 performs a similar role in recruitment, as its ablation prevents

POT1-TPP1 localization at the telomere, thus allowing RPA-binding and ATR signaling (30).

At the double-stranded region of the telomere, TRF1, TRF2, and RAP1 also shield telomere DNA from appearing as sites of breaks or damage. Knockdown studies of TRF1 revealed that it is essential for chromosome stability by limiting replicative stress. Mechanistically, TRF1 assists in the proper replication of telomeres by preventing ATR kinase activation and fork stalling (61). TRF1 may function by coordinated interactions with essential helicases, such as BLM and

RTEL1, to facilitate unwinding of G-quadruplex structures at the telomere and to avoid fork stalling. Mice deficient for TRF2 are early embryonic lethal. Knockdown of TRF2 in mouse embryonic fibroblast (MEF) cells causes telomere fusions and dsDNA break-like damage activation through MRN (MRE11, RAD50 and NBS1) recruitment and ATM activation (5). Other studies in cell culture show that the removal of TRF2 allows the KU70-KU80 complex to load onto telomeres to initiate

NHEJ DNA repair (62). Although the TRF2-binding protein, RAP1, is dispensable for ATM activation and NHEJ events, it appears to be critical for repressing HDR at telomeres (34). Strikingly, these studies demonstrate that HDR events in mouse embryonic fibroblasts lacking RAP1 occur at the telomere even in the absence of a DNA damage signal. Cumulatively, these knockdown studies reveal a critical function of shelterin proteins in protecting against a range of DNA damage response mechanisms.

1.5.2 Telomere length regulation

In addition to age, telomere length generally correlates with cell function.

Cells that exhibit a high proliferative rate (e.g. during embryogenesis, in adult

germline and proliferative cells of tissue renewal) express telomerase to maintain

longer telomeres and to prevent senescence (63). In healthy somatic cells

telomerase activity is below detection limits and progressive telomere shortening

is observed. Cells that express moderate amounts of telomerase, such as

hematopoietic stem cells (HSC), have the ability to maintain telomere length but not as efficiently as cells that constitutively express telomerase, as is the case in most cancer cells. Generally, cancer cells reactivate and/or upregulate telomerase to maintain telomeres, albeit at reduced lengths. Other evidence suggests a putative mechanism in which telomerase is activated in response to the detection of extremely short telomeres that are at a higher risk for inducing chromosome instability (64). In these cases, activation of telomerase is sufficient for avoiding

cell death mechanisms that would otherwise be initiated. The exact mechanism of

how telomere length and telomerase expression are regulated, particularly during

cancer progression, remains unclear. Nonetheless, there is a clear connection

between telomere length, telomerase activity, and genome stability in a wide range of

cell types.

1.6 Significance

Cellular phenotypes and molecular pathways have established that POT1-TPP1 must function properly to protect telomeres from illicit events and to regulate telomerase-based telomere extension; however, the underlying mechanism has remained controversial and elusive. POT1-TPP1 plays diverse roles in telomere protection and regulation in both positive and negative ways. For instance, the POT1-TPP1 heterodimer protects telomere ssDNA from telomerase and other enzymatic activity. Conversely, it also plays a role in recruiting telomerase to enhance telomerase-mediated extension of telomere DNA. In vivo, the length of telomere ssDNA is approximately 50-200 nt in humans, so the formation of G-quadruplex structures may affect POT1-TPP1 recognition and binding events. The stability of G-quadruplexes makes them a potential obstacle for telomerase-mediated elongation, which makes understanding how POT1-TPP1 engages physiological-length telomere ssDNA an interesting topic. Additionally, the molecular mechanism by which disease-related mutations in POT1 impair its function in telomere maintenance is yet to be understood.

In this dissertation, we investigated the functions of POT1-TPP1 in telomere maintenance and telomerase regulation in the presence of physiological-length telomere ssDNA. By employing a combination of biochemical, biophysical, and cell biological techniques, we probed two new regulatory roles of POT1-TPP1 in telomere length regulation. First, we identified that POT1-TPP1 differentially regulates telomerase as a function of telomere ssDNA length. Second, POT1-TPP1 is capable of destabilizing G-quadruplexes formed by telomere ssDNA by both passively and actively resolving their secondary structure. Additionally, CLL-related mutations disrupt the regulatory roles of POT1.

Chapter 2 POT1-TPP1 differentially regulates telomerase via

POT1 His266 and as a function of single-stranded telomere DNA length

The study in this chapter was published in:

Mengyuan Xu, Janna Kiselar, Tawna L. Whited, Wilnelly Hernandez-Sanchez, and Derek J. Taylor. POT1-TPP1 differentially regulates telomerase via POT1 His266 and as a function of single-stranded telomere DNA length. Proceedings of the National Academy of Sciences 116.47 (2019): 23527-23533.

Abstract

Telomeres cap the ends of linear chromosomes and terminate in a single stranded

DNA (ssDNA) overhang recognized by POT1-TPP1 heterodimers to help regulate

telomere length homeostasis. Here, hydroxyl radical footprinting coupled with

mass spectrometry was employed to probe protein-protein interactions and

conformational changes involved in the assembly of telomere ssDNA substrates

of differing lengths bound by POT1-TPP1 heterodimers. Our data identified

environmental changes surrounding residue histidine 266 of POT1 that were

dependent on telomere ssDNA substrate length. We further determined that the

chronic lymphocytic leukemia-associated H266L substitution significantly reduced

POT1-TPP1 binding to short ssDNA substrates; however, it only moderately impaired the heterodimer binding to long ssDNA substrates containing multiple protein binding sites. Additionally, we identified a telomerase inhibitory role when

several native POT1-TPP1 proteins coat physiologically relevant lengths of

telomere ssDNA. This POT1-TPP1 complex-mediated inhibition of telomerase is

abrogated in the context of the POT1 H266L mutation, which leads to telomere

overextension in a malignant cellular environment.

Introduction

Telomeres are specialized nucleoprotein complexes that cap the ends of linear chromosomes (65). After thousands of base-pairs of a repeating G-rich sequence, telomeres terminate in a short single-stranded DNA (ssDNA) overhang (66). In proliferating and transformed cells, telomerase, a reverse transcriptase ribonucleoprotein complex, base-pairs with the telomere ssDNA overhang to synthesize new telomere ssDNA (65). Telomere DNA is protected by a set of specialized proteins called shelterin that help regulate telomerase activity and prevent telomeres from being misidentified as sites of DNA damage (3). POT1 and

TPP1 are two shelterin proteins that interact with the ssDNA overhang to help maintain telomere integrity (3, 8). Whereas POT1 binds specifically to telomere ssDNA, TPP1 interacts with POT1 and other shelterin proteins to localize POT1 to telomeres. Recently, more than 300 single-nucleotide polymorphisms (SNPs) within the coding region of POT1 (cBioPortal) have been identified in patients with malignancies including chronic lymphocytic leukemia (CLL) (55), familial melanoma (67, 68), familial glioma (53), and cardiac angiosarcoma (69). Most cancer-associated SNPs result in mutations that localize to the two N-terminal oligonucleotide/oligosaccharide-binding (OB) folds of the POT1 protein, with many residing near the ssDNA binding cleft. Meanwhile, TPP1 helps to regulate interactions with telomerase for its recruitment to telomeres in a cell-cycle dependent manner (29, 45, 46, 70, 71). Therefore, the POT1-TPP1 heterodimer plays diverse roles in protecting the telomere ssDNA from degradation and repair, while also facilitating access of telomere ssDNA for telomerase-mediated extension (3, 8, 37, 38).

In yeast, telomere length homeostasis is regulated by a protein-counting mechanism, whereby the length of the telomere DNA and the relative number of telomere proteins bound to it differentially influence telomerase-mediated extension (72, 73). The length of the telomere ssDNA overhang is also an important regulator of telomere homeostasis in most eukaryotes, as its ability to adopt disparate secondary structures, including G-quadruplexes (9, 10) and T-loops (16), poses obstacles for telomere extension. The binding of telomere proteins alleviates secondary structures to promote telomerase accessibility and extension (74-77). In mammalian cells, the 50-100 copies of POT1 and TPP1 proteins per telomere are more than enough to fully coat the telomere ssDNA overhang (78, 79).

Together, these studies indicate that telomere ssDNA length, binding of shelterin

proteins, and telomere DNA structure collectively contribute to telomere

homeostasis. Despite these findings, the molecular switch governing whether

these regulatory elements contribute positively or negatively to telomere extension

is not well understood.

Here, we probed the functional influences of telomere ssDNA length and binding

of multiple POT1-TPP1 complexes on nucleoprotein-mediated regulation of

telomerase. Hydroxyl radical footprinting (HRF) coupled with mass spectrometry

was used to identify key alterations in POT1-TPP1 complex structural

environments as a function of telomere ssDNA length. Specifically, our data

present a model in which short POT1-TPP1-ssDNA complexes enhance

telomerase activity, while longer tracts of POT1-TPP1-ssDNA complexes negatively regulate telomerase-mediated extension. Furthermore, our biophysical and functional data highlight significant environmental changes surrounding histidine 266 of POT1. Indeed, the cancer-associated H266L POT1 mutant exhibits defects in telomere ssDNA binding and telomerase regulation that are dependent on the length of ssDNA and the number of POT1-TPP1 proteins bound to it. The observed defects of telomerase inhibition and telomere length regulation associated with the H266L mutant were further confirmed using CRISPR-Cas9 technology to introduce this substitution into malignant cells. Together, our data establish an elaborate model in which POT1-TPP1 binding to ssDNA differentially regulates telomerase in a manner that is dependent upon ssDNA length and degree of POT1-TPP1 saturation of that ssDNA. Additionally, our data identified the H266 residue of POT1 as playing a key role in transitioning the POT1-TPP1 complex from a positive to a negative processivity factor of telomerase.

Results

Hydroxyl radical footprinting of monomeric versus multimeric POT1-N

binding to different lengths of ssDNA highlights different structural

environments

The telomere ssDNA overhang is maintained at a physiological length

of 50-200 nucleotides in mammalian cells (1), and can accommodate binding of

several POT1 proteins or POT1-TPP1 complexes (79, 80). In order to explore the

structural environment changes induced by multiple protein binding events on

physiologically relevant lengths of ssDNA, we employed hydroxyl radical

footprinting (HRF) to characterize single and multiple POT1 binding events to

telomere ssDNA of differing lengths. Hydroxyl radicals generated from exposure

to X-ray beams oxidatively modify solvent-accessible amino acid side chains, providing high resolution information on protein-protein and protein-DNA interactions (81-83).

Initially, HRF experiments were performed with a splice variant of POT1 (POT1-N) that represents the DNA-binding domain of the full-length protein. The availability of its structure, solved by X-ray crystallography (43), provides a detailed guide for mapping and analyzing the individual residues identified in the footprinting studies.

HRF experiments were performed to probe POT1-N protein alone (POT1-N), a

single protein bound to the minimal 12-nt ssDNA substrate (POT1-N-hT12), and a complex assembled from six POT1-N proteins bound to a 72-nt, physiologically relevant length of telomere ssDNA (POT1-N-hT72) (Figure 2.1A and 2.1B). The modification ratio compares solvent accessible reaction rates for an individual amino acid measured in different sample environments, as previously described

(82, 84, 85). Thus, a normalized modification ratio equal to 1 indicates that the solvent accessibility of a residue is consistent across two different sample environments. Meanwhile, a normalized modification ratio value of less than 1 identifies residues that experience a gain in solvent accessibility in the state represented in the denominator of the ratio. Similarly, a normalized modification ratio that is greater than 1 highlights residues that are more protected from solvent in the state that is represented in the denominator of the ratio.
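
As a rough illustration of how these ratios can be read, the sketch below computes a ratio of modification rates for each residue and flags those falling outside the mean ± 2 SD band of all ratios. It is only a sketch under stated assumptions: the residue labels and rates are hypothetical, the function names are invented for illustration, and the normalization step applied to the ratios in this study is omitted because it is not detailed in this section.

```python
from statistics import mean, stdev

def modification_ratios(rates_numerator, rates_denominator):
    """Per-residue ratio of modification rates: numerator state / denominator state."""
    return {
        residue: rates_numerator[residue] / rates_denominator[residue]
        for residue in rates_numerator
        if residue in rates_denominator
    }

def classify(ratios, n_sd=2.0):
    """Label each residue by where its ratio falls relative to mean +/- n_sd * SD."""
    values = list(ratios.values())
    center = mean(values)
    spread = stdev(values)
    labels = {}
    for residue, ratio in ratios.items():
        if ratio > center + n_sd * spread:
            # Rate is lower in the denominator state: residue is more protected there.
            labels[residue] = "more protected in denominator state"
        elif ratio < center - n_sd * spread:
            # Rate is higher in the denominator state: residue is more accessible there.
            labels[residue] = "more accessible in denominator state"
        else:
            labels[residue] = "within range"
    return labels

# Hypothetical rates (s^-1) for ten made-up residues: free protein vs. DNA-bound.
free = {"r1": 2.0, "r2": 3.0, "r3": 1.0, "r4": 4.0, "r5": 2.0,
        "r6": 5.0, "r7": 1.0, "r8": 3.0, "r9": 2.0, "r10": 16.0}
bound = {"r1": 2.0, "r2": 3.0, "r3": 1.0, "r4": 4.0, "r5": 2.0,
         "r6": 5.0, "r7": 1.0, "r8": 3.0, "r9": 2.0, "r10": 2.0}

labels = classify(modification_ratios(free, bound))
for residue, label in labels.items():
    print(residue, label)
```

With only ten residues, a single extreme ratio inflates the SD enough that it cannot exceed roughly 2.8 SD from the mean, which is why this toy example uses the 2 SD band; a real data set with many more residues does not have that limitation.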

Figure 2.1 POT1-N and POT1-TPP1 coat long telomere ssDNA substrates.

A. Electrophoretic mobility shift assays under stoichiometric conditions of POT1-TPP1 coating different lengths of telomere ssDNA substrates. Molar ratios of POT1-TPP1 proteins to the number of binding sites in each oligo are indicated on the top of the gel. B. Size-exclusion chromatography of POT1-N with or without telomere ssDNA substrates. Profiles of the POT1-N, POT1-N-hT12, and POT1-N-hT72 complex are shown in black, blue, and red, respectively. Absorbance at 280 nm is shown as solid lines and the absorbance at 260 nm is shown as dashed lines. C. Size-exclusion chromatograms of POT1-TPP1 with or without telomere ssDNA substrates. Label coloring schemes are the same as in panel B.

For experimental data comparing POT1-N to POT1-N-hT12, the majority of modification ratios fall in the range of mean ± 2SD, indicating that the solvent accessibility of POT1-N protein is not significantly altered upon hT12 binding

(Figure 2.2A, 2.3A, and Table 2.1, 2.2). However, residue H266 exhibited a

modification ratio greater than the mean + 3SD upon hT12 binding. This finding

suggests that H266 is significantly more protected from solvent

in the presence of hT12 DNA. This conclusion is supported by the X-ray crystal

structure which describes intimate interactions between H266 and two separate

nucleotides (T8 and G10) in the short ssDNA substrate (Figure 2.3C) (43).

A comparison between POT1-N-hT12 and POT1-N-hT72 samples (Figure 2.2A and 2.3B) revealed W184 and H266 as the only residues with modification ratios outside the mean ± 2SD range. Specifically, W184 became significantly more protected in complexes assembled from six POT1-N proteins coating the physiologically relevant ssDNA substrate when compared to a single POT1-N protein interacting with the short ssDNA substrate. Meanwhile, H266 displayed a modification rate that was dramatically elevated with the protein-coated hT72 ssDNA sample (25.1 s⁻¹) compared to that of a single POT1-N protein bound to the hT12 ssDNA (1.5 s⁻¹) (Figure 2.4A). Together, these data highlight H266 as a

residue in POT1 protein that becomes significantly more solvent accessible in

samples containing protein-coated hT72 ssDNA versus the hT12-bound complex.
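
For context, modification rates of this kind are commonly obtained in footprinting experiments by fitting the loss of unmodified peptide against exposure time to a pseudo-first-order decay. The sketch below assumes that model, which is not spelled out in this section. It generates noise-free synthetic dose-response curves from the two H266 rates quoted above and recovers the rates with a log-linear fit; the exposure times and the helper name `fit_rate` are hypothetical.

```python
import math

def fit_rate(times_s, fraction_unmodified):
    """Least-squares slope of ln(fraction) vs. time through the origin; returns k in s^-1."""
    numerator = sum(t * math.log(f) for t, f in zip(times_s, fraction_unmodified))
    denominator = sum(t * t for t in times_s)
    return -numerator / denominator

# Hypothetical exposure times (s); real exposures depend on the beamline setup.
times = [0.0, 0.02, 0.04, 0.06, 0.08]

# H266 modification rates (s^-1) quoted in the text for the hT12 and hT72 samples.
k_true_hT12, k_true_hT72 = 1.5, 25.1

# Synthetic, noise-free curves assuming fraction unmodified = exp(-k * t).
curve_hT12 = [math.exp(-k_true_hT12 * t) for t in times]
curve_hT72 = [math.exp(-k_true_hT72 * t) for t in times]

k_hT12 = fit_rate(times, curve_hT12)
k_hT72 = fit_rate(times, curve_hT72)
ratio = k_hT12 / k_hT72  # un-normalized hT12/hT72 ratio; below 1 means more accessible on hT72

print(f"k(hT12) = {k_hT12:.2f} s^-1, k(hT72) = {k_hT72:.2f} s^-1, ratio = {ratio:.3f}")
```

The ratio computed here is the raw quotient of the two rates and is meant only to show the direction of the change, not to reproduce the normalized values plotted in the figures.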

Figure 2.2 HRF identifies POT1-N and POT1-TPP1 residues undergoing

environmental changes upon binding of differing telomere ssDNA

substrates.

A. Normalized modification ratio of POT1-N-hT12/POT1-N-hT72 versus normalized modification ratio of POT1-N/POT1-N-hT12 for all detectable POT1-N residues.

Gray squares indicate boundaries of mean ± 3 SD (solid) and mean ± 2 SD

(dashed). B. POT1 and C. TPP1 residue modification ratios of PT-hT12/PT-hT72 versus normalized modification ratio of PT/PT-hT12 in the context of POT1-TPP1 binding to ssDNA substrates. Data are plotted as described in A.

Figure 2.3 Regions of POT1-N that exhibit alterations in hydroxyl radical modification rates upon telomere ssDNA binding and oligomerization on long ssDNA.

A. The structure of an individual POT1-N protein in complex with ssDNA (PDB ID: 1XJV) (43) was used for interpretation of HRF data. The structure is colored based on normalized modification rate ratios of POT1-N/POT1-N-hT12 (an individual protein with and without the minimal hT12 ssDNA substrate). Those residues

identified with the most significant changes are highlighted in stick representations

and labeled accordingly. The color bar indicates relative solvent accessibility. B.

The data depicted are similar to those in panel A except that they are based on

normalized modification rate ratios of POT1-N-hT12/POT1-N-hT72 (single protein on a short ssDNA substrate versus six proteins coating a long ssDNA substrate).

The color bar indicates relative solvent accessibility. C. Structure of POT1-N (green) with hT10 (yellow). W184 and H266 residues highlighted in pink. Inset displays interactions between POT1 H266 and T8 and G10 of hT10 ssDNA. Hydrogen bond between H266 and G10 is indicated by dashed line.

Figure 2.4 Oxidation rates of identified residues in POT1-N.

Dose response curves for hydroxyl radical footprinting of A. H266 and B. Y36 of

POT1-N for different DNA bound states (black for protein only, blue for protein bound to hT12 and red for protein bound to hT72).

POT1 H266 is solvent accessible in POT1-TPP1 complexes coating

physiologically relevant ssDNA substrates.

To further explore the structural changes of telomere protein-DNA interactions in

an expanded physiological context, the POT1-TPP1 heterodimer was similarly assembled with differing lengths of telomere ssDNA (referred to as PT, PT-hT12, and PT-hT72; Figure 2.1C) and HRF experiments were conducted. Similar to

POT1-N, POT1-TPP1 exhibited a significantly reduced modification rate (~6-7 fold) for POT1 H266 upon binding to hT12 ssDNA substrates (Figure 2.2B and Figure 2.5A, and Table 2.3, 2.4). The modification rate ratios for residues Y73 and Y242 in the context of hT12 ssDNA binding were similarly determined to be beyond the mean ± 3SD range for this analysis (Figure 2.2B and 2.5A). Both Y73 and Y242 are located at the interface between the two OB-fold domains of POT1, and a hydrogen bond is formed between the backbone carbonyl group of Y73 and the side-chain hydroxyl group of Y242 in the previously solved X-ray crystal structure (Figure 2.5C) (43). The enhanced protection of these two tyrosine residues is indicative of a conformational change, upon hT12 ssDNA binding, that brings the two OB-folds of POT1 closer together to limit solvent accessibility. As these changes were not observed in the POT1-N experiments, it is likely that the inclusion of TPP1 helps to promote this conformational change of POT1, which could help explain the role of TPP1 in enhancing the binding affinity of POT1 for hT12 ssDNA by an order of magnitude (38).

A comparison of the modification rates between PT-hT12 and PT-hT72 identified few structural changes upon binding to telomere ssDNA of differing lengths, as the majority of ratios were within the mean ± 2SD range (Figure 2.2B and 2.5B). Consistent with POT1-N experiments, however, H266 exhibited a marked increase in modification rate when in complex with hT72 ssDNA and the lowest ratio of normalized modification rates when comparing PT bound to hT12 versus hT72 ssDNA (Figure 2.2B). Once again, these data reveal that H266 is more protected upon POT1-TPP1 binding to hT12 ssDNA but this protection is lost upon binding to the longer, physiologically relevant hT72 ssDNA substrate. Compared to POT1, TPP1 is less involved in the POT1-TPP1-DNA and inter-protein interactions that are inherent to multiple protein binding events. The modification rate ratios of TPP1 residues were primarily in the normalized range of mean ± 3SD, with the exception of C298, which exhibits greater solvent accessibility when bound to hT72 versus hT12 ssDNA (Figure 2.2C and 2.5B). These data suggest that the interface encompassing C298 of TPP1 undergoes a DNA-length dependent rearrangement; however, delineation of the role of such a conformational change in POT1-TPP1 function will require further investigation.

Figure 2.5 HRF results of POT1-TPP1 complexes bound to different telomere ssDNA substrates.

A. Model of POT1-TPP1 complex with hT10 was assembled using POT1-N-hT10 (PDB ID: 1XJV) (43), TPP1 OB-fold (PDB ID: 2I46) (38) and POT1-C and TPP1 POT1-binding domain (PDB ID: 5UN7) (86) based on the organization of the telomere end-binding complex from Oxytricha nova, TEBPα-TEBPβ-ssDNA (PDB ID: 1OTC) (47). Model is color coded to depict normalized modification rate ratios of PT/PT-hT12 (an individual protein complex with and without the minimal ssDNA substrate). Those residues identified with the most significant changes are highlighted in stick representations and labeled accordingly. B. The data depicted are similar to those in panel A except that they are based on normalized modification rate ratios of PT-hT12/PT-hT72 (single protein complex on a short ssDNA substrate versus six protein complexes coating a long ssDNA substrate). Color bars indicate relative normalized modification rate ratios. C. Structure of POT1-N with hT10. Y73 and Y242 residues highlighted in cyan. Inset displays the hydrogen bond between the backbone carbonyl group of Y73 and the side-chain hydroxyl group of Y242.

Cancer associated POT1 H266L mutant differentially regulates POT1-TPP1 binding to telomere ssDNA substrates in a length dependent manner

With the identification of the H266L POT1 mutation in patients diagnosed with CLL (55), we hypothesized that the pathology observed may be related to altered H266 function in coordinating interactions between POT1 and telomere ssDNA. As with H266, a single-nucleotide polymorphism resulting in a Y36 mutation (Y36N) at the POT1 protein level has been identified in CLL patients (55). Although both H266 and Y36 interact directly with telomere ssDNA in the X-ray crystal structure (43), Y36 did not exhibit significant changes in solvent accessibility in the HRF experiments described above (Figure 2.4B). One potential explanation for this observation is the different localization of H266 and Y36 in the POT1 DNA-binding domain; Y36 resides in OB1 whereas H266 localizes to OB2. Since both OB-folds cooperate in recognizing telomere ssDNA (43, 77), we first sought to determine whether CLL-associated mutations introduced at the Y36 or H266 position differentially alter ssDNA interactions. To do so, POT1-TPP1 heterodimeric proteins with an H266L or a Y36N mutation introduced in the POT1 protein were expressed and purified. The ability of each construct to interact with telomere ssDNA of different lengths (hT12 and hT72, identical to those used in HRF experiments) was quantitatively measured. Electrophoretic mobility shift assays (EMSAs) were performed to compare the equilibrium dissociation constant (KD) of wild type and mutant POT1-TPP1 proteins for ssDNA substrates. Initial experiments for a single protein bound to the minimal hT12 ssDNA substrate yielded a calculated KD value of 1.0 ± 0.5 nM for wild type POT1-TPP1, consistent with previous reports (38, 39, 43).

Similar experiments revealed POT1 Y36N moderately decreased the affinity of

POT1-TPP1 protein for hT12 ssDNA (KD = 2.2 ± 1.1 nM), whereas POT1 H266L

mutation significantly reduced DNA binding affinity by approximately 30-fold (KD =

29.1 ± 9.3 nM) (Figure 2.6A and B).
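
To make these affinities concrete, the sketch below evaluates a simple single-site binding isotherm, fraction bound = [P] / (KD + [P]), at a few protein concentrations using the hT12 KD values reported above. The single-site model with ssDNA at a concentration far below KD is an assumption made here for illustration and is not necessarily the fitting model used for the EMSA quantification; the function name and the chosen concentrations are hypothetical.

```python
def fraction_bound(protein_nM, kd_nM):
    """Single-site isotherm, assuming the ssDNA concentration is far below KD."""
    return protein_nM / (kd_nM + protein_nM)

# KD values (nM) for hT12 binding as reported in the text (central values only).
KD_HT12_NM = {"wild type": 1.0, "Y36N": 2.2, "H266L": 29.1}

# Hypothetical protein concentrations (nM), chosen only for illustration.
concentrations_nM = [1.0, 10.0, 100.0]

for construct, kd in KD_HT12_NM.items():
    row = ", ".join(
        f"{fraction_bound(p, kd):.2f} at {p:g} nM" for p in concentrations_nM
    )
    print(f"{construct}: {row}")
```

Under this model each construct is half-saturated when the protein concentration equals its KD, so a higher KD shifts the whole binding curve toward higher protein concentrations.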

EMSAs were then employed to characterize the ability of multiple POT1-TPP1 proteins to bind to the physiologically relevant hT72 ssDNA substrate. The apparent KD value (Kapp) was calculated to describe the efficiency of six POT1-TPP1 proteins to simultaneously bind to, thereby coating, the hT72 ssDNA

substrate. In these experiments, the Kapp was determined to be 6.9 ± 0.6 nM, 4.5

± 0.4 nM, and 11.1 ± 1.1 nM for wild type, Y36N, and H266L constructs,

respectively (Figure 2.6C and D). These data indicate that the CLL associated

Y36N or H266L mutations only modestly (less than 2-fold) alter the apparent

equilibrium dissociation constant of POT1-TPP1 to coat physiological lengths of

telomere ssDNA. This finding is in stark contrast to the 30-fold difference

determined for H266L POT1-TPP1 protein binding to short, hT12 ssDNA substrate.
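
The contrast between the two substrates can be checked with simple arithmetic. The short sketch below recomputes fold changes relative to wild type from the KD and Kapp values reported above (central values only); the dictionary layout and the helper name are illustrative choices, not part of the original analysis.

```python
# Dissociation constants (nM) reported in the text: KD for hT12, Kapp for hT72.
AFFINITY_NM = {
    "hT12": {"wild type": 1.0, "Y36N": 2.2, "H266L": 29.1},
    "hT72": {"wild type": 6.9, "Y36N": 4.5, "H266L": 11.1},
}

def fold_change(substrate, construct):
    """Magnitude of the change relative to wild type, always reported as >= 1."""
    ratio = AFFINITY_NM[substrate][construct] / AFFINITY_NM[substrate]["wild type"]
    return ratio if ratio >= 1.0 else 1.0 / ratio

for substrate in AFFINITY_NM:
    for construct in ("Y36N", "H266L"):
        print(f"{substrate} {construct}: {fold_change(substrate, construct):.1f}-fold")
```

Because the helper reports only the magnitude, it hides the direction of the change: on hT72 the Y36N construct has a lower Kapp than wild type, whereas H266L has a higher one.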

We next investigated whether the H266L mutant alters the preference for 3’ end

binding exhibited by native POT1-TPP1 protein (38). Accordingly, EMSAs were conducted with a3 and a5 ssDNA substrates that contain nucleotide substitutions

to impact POT1-TPP1 binding at either the 5’ or the 3’ ends of an 18-nt telomere

ssDNA substrate (38). Whereas wild-type protein exhibits a tenfold higher affinity for binding to the 3’ recognition motif, this preference is reduced to only 2.5-fold for the H266L mutant (Figure 2.7). Altogether, these data recapitulate the HRF results described above identifying H266 as being critical for ssDNA substrate binding and dictating the differential roles for POT1-TPP1 binding to short hT12 ssDNA versus the coating of multiple heterodimers on long hT72 ssDNA substrates.

Figure 2.6 POT1(H266L)-TPP1 mutant exhibits differential binding defects that depend on the length of the ssDNA substrate.

A. EMSAs were performed under equilibrium binding conditions to determine the effects CLL-associated POT1 mutations had on POT1-TPP1 binding to short telomere ssDNA substrate (hT12). Protein concentration ranged from 0 to 500 nM for wild type and Y36N and from 0 to 10 μM for H266L. B. Quantification of EMSA data for POT1-TPP1 and mutant protein binding to hT12. Error bars represent the mean ± SD (n = 3). C. EMSAs were performed to determine the effects of CLL-associated POT1 mutations on multiple POT1-TPP1 complexes coating long telomere ssDNA substrates (hT72). D. Quantification of EMSA data for POT1-TPP1 and mutant protein coating hT72. Error bars represent the mean ± SD (n = 3). Schematic models indicate POT1-TPP1 complexes bound to differing telomere ssDNA: hT12 in B and hT72 in D. POT1 is shown as a green ellipse with 2 OB domains labeled as 1 and 2. TPP1 is shown as an orange ellipse, and telomere ssDNA is depicted as black lines.

Figure 2.7 H266L mutation impairs the end-binding preference of POT1-TPP1.

A. EMSAs were performed to determine the effect of the H266L mutation on binding to a 3’ (a3) or 5’ (a5) recognition sequence in an 18-nt telomere ssDNA substrate. To obtain a reliable fitting curve, variable protein concentration ranges were used for different constructs and for different substrates, as indicated by the highest protein concentration labeled for each gel shift. B. Quantification of EMSA data for POT1-TPP1 and H266L mutant protein binding to a3 and a5 primers. Dissociation constants (KD) are indicated. Error bars represent the mean ± SD (n=3). Sequences of primers a3 and a5 are shown below. The underlined sequence indicates the native POT1 recognition sequence. Point mutations to the telomere ssDNA sequence are highlighted in red.

H266L POT1 mutant abrogates inhibitory role of multiple POT1-TPP1 binding events on telomerase activity

In addition to telomere ssDNA protection, the POT1-TPP1 heterodimer plays a critical role in regulating telomere length homeostasis (29, 39, 87). Though the

POT1-TPP1 complex is known to increase both activity and processivity of human telomerase (38, 39), it remains unclear what regulatory elements prevent telomerase from perpetually extending telomere ssDNA. To address the role of

POT1-TPP1 in this context, we employed an in vitro direct telomerase incorporation and extension assay to determine the role of POT1-TPP1 in regulating telomerase activity and processivity with ssDNA substrates of differing lengths, specifically hT12 and hT72 (Figure 2.8 and 2.9). Similar experiments using the CLL-associated POT1 mutants, Y36N and H266L, were carried out to determine if these mutations disrupt the ability of POT1-TPP1 complexes to properly regulate telomerase activity.

The introduction of either Y36N or H266L POT1 mutations did not significantly alter telomerase activity or processivity provided by POT1-TPP1 on the hT12 ssDNA

(Figure 2.9A and B). Because longer G-rich ssDNA substrates form complex secondary structures that affect telomerase activity (74, 76, 88), we reasoned that telomere length alone may be a determining feature in preventing telomerase- mediated over-extension of telomere ssDNA. Alternatively, longer ssDNA substrates sheathed in multiple POT1-TPP1 complexes may potentially impair the accessibility of ssDNA by telomerase, thereby limiting telomere extension. To test

50

Figure 2.8 H266L POT1 mutant abrogates POT1-TPP1 complex-mediated

inhibition of telomerase in a substrate length-dependent manner promoting

telomere extension.

A. Direct in vitro telomerase assay performed on hT72 ssDNA. Lane 1, no POT1-

TPP1 added. Lanes 2 to 4, stoichiometric concentration of POT1-TPP1 wild type,

Y36N, and H266L were added, respectively. Lanes 5 to 7, POT1-TPP1 wild type,

Y36N, and H266L were added to saturate all binding sites on hT72. LC, loading

control. The number of telomere repeats being added is indicated at left. B.

Quantification of lanes in A displaying normalized telomerase activity. Schematic

51 models indicate different binding states of POT1-TPP1 complexes bound to hT72.

Error bars represent the mean ± SD (n = 3). **P < 0.005; 0.005 < *P < 0.05. C.

TRF analysis of genomic DNA from parental and CRISPR-Cas9 edited HCT 116

cell lines containing either homozygous H266L (homo #1) or heterozygous H266L

(het #1) mutations at the indicated population doublings (PDs) in culture.

Quantification of TRF length is indicated by red arrows with corresponding values

shown below the gel.

52

Figure 2.9 POT1-TPP1 wild type and mutant complexes enhance telomerase processivity to extend telomere ssDNA substrates.

A. Direct in vitro telomerase assay performed on hT12 ssDNA. Lane 1: no POT1-

TPP1 added. Lane 2-4: stoichiometric concentration of POT1-TPP1 wildtype,

Y36N and H266L added, respectively. LC=Loading control. B. Quantification of lanes in panel A displaying normalized telomerase activity (upper panel) and processivity (lower panel). C. Quantification of data presented in Figure 2.8

displaying normalized telomerase processivity. Schematic models indicate the

53 different binding states of POT1-TPP1 complexes binding to telomere ssDNA.

Error bars represent the mean ± SD (n=3).

54 this, we conducted in vitro telomerase extension assays as described above but with hT72 ssDNA substrate (Figure 2.8A). These data revealed that, in the absence of POT1-TPP1 complexes, telomerase extends hT72 ssDNA substrates to a similar extent as hT12 (Figure 2.8A and 2.9A). When stoichiometric concentrations of POT1-TPP1 complexes were included (such that one protein binds to one ssDNA), telomerase activity was enhanced slightly while processivity was elevated by about 3-fold (Figure 2.9C). Under these conditions, it would be expected that a single POT1-TPP1 protein preferentially occupies the 3’ position of each hT72 substrate. In similar experiments, the inclusion of either Y36N or

H266L mutated POT1-TPP1 proteins regulated telomerase in a manner that was indistinguishable from that of wild type protein. Finally, the impact of multiple

POT1-TPP1 binding events on the regulation of telomerase activity and processivity was assessed in the presence of hT72 ssDNA substrates completely sheathed in protein complexes. In this scenario, coating of hT72 ssDNA substrate with wild type POT1-TPP1 complexes impaired telomerase activity, while no significant change in processivity was observed (Figure 2.8 and 2.9). POT1-TPP1 complexes with the Y36N mutation exhibited a similar effect to wild type protein on telomerase regulation. However, the introduction of the H266L mutation supported telomerase activity, thereby abrogating the inhibitory role of wild type POT1-TPP1 complexes on telomerase extension of physiologically relevant lengths of ssDNA.

Since native POT1-TPP1 binding destabilizes G-quadruplexes (77, 89), we asked whether the H266L mutation impairs this ability to contribute to unregulated telomere elongation via telomerase. To do so, we used circular dichroism (CD)

spectroscopy, which illustrates a positive band at 295nm and a negative band at

265nm, both of which are signatures associated with G-quadruplex structures (90,

91) (Figure 2.10A). POT1-TPP1 protein was mixed with hT72 ssDNA at a

concentration sufficient to saturate all binding sites using a stopped-flow device.

The gradual decrease in the circular dichroism spectra at 295 nm is consistent with

the G-quadruplex structures being resolved (90, 92) upon addition of POT1-TPP1

protein (Figure 2.10B). These data indicate that the H266L mutant POT1-TPP1
protein resolves G-quadruplex structures, but at rates (kobs1 = 0.27 ± 0.06 s⁻¹, kobs2 = 0.019 ± 0.002 s⁻¹) that are slower than those of wild type protein (kobs1 = 0.63 ± 0.27 s⁻¹, kobs2 = 0.030 ± 0.004 s⁻¹).

To further explore the biologic role of POT1 aberrations associated with the H266L mutation, the SNP coding for the appropriate amino acid change was introduced into the genome of HCT116 colon adenocarcinoma cancer cells using CRISPR-

Cas9. Two homozygous and two heterozygous cell lines with the successful introduction of the H266L substitution at the protein level were recovered and validated using next-generation and Sanger sequencing (Figure 2.11). Each of these four cell lines, as well as parental HCT116 cells, was passaged for 78 days and changes in telomere length were determined using telomere restriction fragment (TRF) analysis (Figure 2.8C and 2.11B). Parental cells displayed a slight shortening in telomere length with increasing population doublings. In contrast, the homozygous H266L mutant cells demonstrated robust telomere lengthening over the 78 days of cell growth. Heterozygous H266L mutant cells either demonstrated minimal telomere shortening (het #1) or maintained a relatively consistent

telomere length (het #2) over the 78-day time course. While our results help to define an important role of POT1 H266 in properly regulating telomerase-mediated extension, they are unlikely to fully explain the pathology of the H266L mutation, as patients with CLL are heterozygous for the mutation (55). POT1 mutations that are associated with CLL, including H266L, also correlate with a higher frequency of sister chromatid fusions and stalled replication forks at telomeres (55).

Nonetheless, consistent with the in vitro direct telomerase assay, our cellular data

further demonstrate that the H266 residue of POT1 plays a critical role in regulating

telomere maintenance and that mutation of this residue results in enhanced

telomere elongation.

Figure 2.10 Destabilization of hT72 formed G-quadruplex upon POT1-TPP1 binding.

A. CD spectra of hT72 before (grey line) and 300 seconds after mixing with POT1-

TPP1 wild type (black line) protein. The dashed line indicates 295 nm that was recorded over time. B. Time-course for the circular dichroism signal at 295 nm of hT72 without (grey dots) and with mixing POT1-TPP1 wild type (black) or H266L mutant (blue) protein. Solid lines indicate double-exponential fits of the protein mixing data.

Figure 2.11 H266L POT1 mutant promotes telomere extension in cancer cell line.

A. Validation of H266L substitution in HCT 116 cells. Cell lines edited by CRISPR-

Cas9 and used for telomere restriction fragment (TRF) analysis were validated by

NGS and Sanger sequencing to confirm the introduction of H266L substitution. B.

TRF analysis of genomic DNA from CRISPR-Cas9 edited cell lines containing

homozygous H266L (homo #2) and heterozygous H266L (het #2) POT1 mutations

at the indicated population doublings (PDs) in culture. Quantified TRF lengths are

indicated by red arrows and values shown below.

Discussion

The ability of telomerase to maintain telomere DNA at a constant length is a

fundamental, yet complex, process of physiology that involves many regulatory

elements (8). In this report, we show that the length of telomere ssDNA and the

number of POT1-TPP1 heterodimers bound to it are key contributing factors that

differentially influence telomerase-mediated extension. In the context of binding short hT12 ssDNA, the added protection of H266 demonstrated in our HRF experiments can be attributed to direct interactions that exist between H266 of

POT1 and T8 and G10 of bound telomere ssDNA (43). Some amino acid side-

chains exhibit lower intrinsic reactivity to hydroxyl radicals (93), which might

explain why more POT1 residues were not identified upon DNA binding in the HRF

experiments. However, DMS methylation and pyrrolidine cleavage similarly

highlighted only G10 in the DNA to be protected when bound to POT1 protein,

even though other residues interact with ssDNA in the crystal structure of the

protein-DNA complex (43). These independent results suggest that the presence

of ssDNA does not significantly alter solvent accessibility and/or chemical reactivity

for many of the protein-DNA interactions that were identified in the X-ray crystal

structure, at least for 10-12 nt substrates. Highlighting a dynamic nature between

POT1 and telomere ssDNA, our findings elucidate a critical and opposing role of

residue H266 in POT1 protein for binding telomere ssDNA of different lengths. In

our HRF experiments using POT1-TPP1 protein, the modification rate of POT1

H266 decreased ~6-fold when bound to hT12 ssDNA as compared to free

POT1-TPP1 protein (Table 2.2). However, a similar comparison of modification

rates for hT12- versus hT72-bound protein indicates an elevation in modification

rate of POT1 H266 by 5-fold. As the modification rate for the hT72-bound

complex represents the average rate of modification for the six proteins coating

the hT72 ssDNA, it is plausible that only one of the six proteins is more protected

by ssDNA binding while the others are solvent accessible. The enhanced binding

interaction for POT1-TPP1 interacting with the 3’ hydroxyl of ssDNA (43, 94), as

compared to the five proteins binding an internal ssDNA sequence on the hT72

substrate, could account for the difference in rates observed. Alternatively, the

arrangement of protein on the longer, more structured ssDNA might provide a

modest protection against solvent accessibility to result in a subtly reduced

modification rate as compared to free protein. In any event, the modification rate

of H266 in the context of six wild type proteins bound to hT72 ssDNA more closely

resembles that of free protein than the hT12-bound complex.

Another possible explanation for the discrepancy in POT1-TPP1 interactions with

long versus short telomere ssDNA may be related to different secondary structures

adopted by longer ssDNA substrates. The G-rich sequence makes telomere ssDNA prone to forming complex secondary structures such as T-loops and G-quadruplexes, which are dependent on the length of ssDNA (9, 10, 16). The binding of POT1 to telomere ssDNA normally helps to relieve such structures to properly regulate telomere maintenance (76, 77, 89). Therefore, it is conceivable that the first OB-fold domain of POT1 recognizes telomere ssDNA regardless of its secondary structure, while the second OB-fold of POT1 partially remains solvent accessible allowing for dynamic interactions with telomere ssDNA with potential to

form secondary structures. Our data demonstrate that the introduction of the

H266L mutation slows the rate at which G-quadruplex structures are relieved by

POT1-TPP1 protein. As G-quadruplex structures generally inhibit telomerase-mediated extension of telomere ssDNA (76, 88) and the POT1 H266L mutation impairs the ability of POT1-TPP1 to resolve them, this defect alone would be expected to associate with shorter telomeres. Because the opposite is observed, it is more likely that the weaker affinity and reduced preference for 3’ binding that are associated with the H266L

POT1 mutation negate some of the protective properties of the protein, thereby allowing telomerase to more readily access the 3’ end of telomeres to promote over-extension in the context of the H266L mutation (Figure 2.12). Building upon studies demonstrating that the POT1-TPP1 heterodimer acts as a telomerase processivity factor (38, 39), our investigation further defines this role as being conditional on telomere ssDNA length and the number of POT1-TPP1 proteins bound to it. Specifically, our data describe an opposing role of the telomere-binding heterodimer by which longer ssDNA products are coated with multiple proteins to inhibit telomerase activity. Taken together, these data present a scenario where

POT1-TPP1 binds short telomere ssDNA substrates to promote telomerase recruitment and activity (38, 39). However, once the ssDNA overhang reaches a specific threshold length sheathed by multiple POT1-TPP1 complexes, telomerase binding and subsequent extension is prevented. This inhibitory regulation of telomerase in the presence of long ssDNA overhangs is lost in the case of H266L

POT1 mutant protein indicating the H266 residue is critical for telomerase regulation. In this regard, and potentially separate from its role in resolving DNA

secondary structure, the mutant POT1 is unable to sufficiently protect the extreme

3’ end of telomere ssDNA promoting telomerase accessibility and unbridled telomere extension. While the average length of the ssDNA overhang of telomeres is usually in the range of 50-200 nts, this length is not always fixed and is subject to being regulated (95). Our results provide a potential regulatory event by which telomerase extends the ssDNA of telomeres in S-phase following telomere replication (96). After this initial extension event, our data indicate that the longer

G-rich products are coated with native POT1-TPP1 protein to prevent additional telomerase-mediated extension events. This balance would ensure that telomerase adds approximately the same number of nucleotides that are lost due to the end-replication problem for each cell cycle and would contribute to maintaining a relatively constant telomere length in telomerase-positive cells. In the case of the

POT1 H266L mutation, the ability to depress telomerase-mediated extension is lost, resulting in more telomerase-mediated extension events, and subsequently longer telomeres at each round of replication.

In summary, our results support a model in which POT1-TPP1 regulates telomere length homeostasis by coating long ssDNA substrates to render them inaccessible for telomerase extension. The presence of the CLL-associated H266L POT1 mutant impairs this ability of POT1-TPP1 to destabilize long telomere ssDNA, thereby leaving the telomere in a state that is accessible for telomerase-mediated extension, regardless of telomere ssDNA substrate length.

Figure 2.12 The proposed mechanism of POT1-TPP1 complex-mediated regulation of telomerase.

Telomere ssDNA is capable of forming higher ordered structures. POT1-TPP1 wild-type complexes bind to telomere ssDNA thereby forming a compact structure and protecting the 3’ end of the telomere ssDNA overhang from telomerase accessibility (Right). However, POT1-TPP1 complexes harboring the H266L POT1 mutant impair POT1-TPP1 complex function, leaving the 3’ end of the telomere ssDNA overhang accessible for telomerase extension (Left).

Material and Methods

Protein expression and purification

The N-terminal GST-tagged POT1-N (1-340) fusion construct was expressed using the recombinant baculovirus expression system to infect

Spodoptera frugiperda 9 (Sf9) insect cells as described previously (43). After 72 h

infection, cells were pelleted and stored at -80°C. Frozen cell pellets were

resuspended in lysis buffer containing 25 mM Hepes (pH 8.0), 200 mM NaCl, 100

mM Na2HPO4/NaH2PO4 (pH 8.0), 5 mM DTT, 5 mM benzamidine, 1 mM PMSF,

and 1 cOmplete ULTRA protease inhibitor cocktail tablet (Roche). Cells were then

lysed using sonication and incubated with 2 units/mL of RQ1 DNase I (Promega)

for 30 min before removal of cellular debris by ultracentrifugation at 36K rpm for 1

hour at 4°C. Following ultracentrifugation, supernatant was applied to a gravity

filtration column with buffer washed GST-beads (Invitrogen). Bead binding was

performed at 4°C, then rinsed with wash buffer containing 25 mM Hepes (pH 8.0),

150 mM NaCl and 50 mM Na2HPO4/NaH2PO4 (pH 8.0). Protein was eluted with

wash buffer containing 15 mM glutathione. PreScission protease (GE Healthcare)

was then added to remove the N-terminal GST tag followed by size-exclusion

chromatography (SEC) using a Superdex 200 HiLoad 16/60 chromatography

column on an AKTA Purifier10 FPLC system (GE Healthcare). Protein fractions

were pooled, concentrated with a Millipore Amicon Ultra 10K centrifugal column to

1 mg/mL and stored at -80°C. Full-length POT1 (1-634) with an N-terminal GST-

tag and N-terminal 6×His-tagged TPP1 (89-334) proteins were co-expressed and

co-purified following a strategy similar to that described above for POT1-N,

including affinity purification, followed by GST-tag removal and SEC purification.

DNA oligonucleotides and 5’ end-labeling

The hT12 (5’-GGTTAGGGTTAG-3’ and 5’-TTAGGGTTAGGG-3’) and hT72

(5’-(GGTTAG)12-3’) oligonucleotides were synthesized by IDT. 5’ end-labeling was

performed using 25 pmols of oligonucleotide reacted with radiolabeled ATP[γ-32P]

(Perkin Elmer) and T4 Polynucleotide Kinase (New England BioLabs) as previously described (76). Labeled products were purified from unreacted nucleotide using illustra MicroSpin G-25 columns (GE Healthcare).

Stoichiometric gel shift assay

Stoichiometric gel shift assays were performed to evaluate loading of POT1-

TPP1 proteins on hT12 and hT72 telomere ssDNA substrates. Each 10 μL reaction

contained 0.2 µM ssDNA with approximately 4% of it being 5’ end 32P-labeled.

POT1-TPP1 was added to each reaction at 0, 0.2, 0.4, 0.8 µM for hT12, and 0, 0.3,

0.6, 0.9, 1.2, 2.4, 3.6, and 4.8 µM for hT72 to reach the molar ratios indicated.

Reactions were conducted in buffer containing 60 mM Hepes (pH 8.0), 75 mM

NaCl, 5 mM DTT, 5 μg/mL BSA, 1.2 μg/mL tRNA and 6.25% glycerol. After 30

minutes of incubation on ice, reactions were loaded onto 4–20% Tris-borate non-

denaturing gel (Invitrogen) and run at 125 V for 1 hour and 20 minutes. Gels were

then dried, and scanned for densitometry using a Typhoon FLA 9500 biomolecular

imager (GE Healthcare).
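
The molar ratios probed in these titrations follow directly from the concentrations listed above. The short Python sketch below reproduces that arithmetic; the function name and the rounding step are illustrative choices and are not part of the original protocol.

```python
# Illustrative helper (not from the original protocol): convert protein
# concentrations into protein:ssDNA molar ratios at a fixed ssDNA concentration.
SSDNA_UM = 0.2  # ssDNA concentration in each 10 uL reaction, in micromolar


def molar_ratios(protein_um, ssdna_um=SSDNA_UM):
    """Return protein:ssDNA molar ratios, rounded to suppress float noise."""
    return [round(p / ssdna_um, 2) for p in protein_um]


# Protein concentrations (micromolar) taken from the text above.
ht12_ratios = molar_ratios([0, 0.2, 0.4, 0.8])
ht72_ratios = molar_ratios([0, 0.3, 0.6, 0.9, 1.2, 2.4, 3.6, 4.8])

print(ht12_ratios)  # [0.0, 1.0, 2.0, 4.0]
print(ht72_ratios)  # [0.0, 1.5, 3.0, 4.5, 6.0, 12.0, 18.0, 24.0]
```

A ratio of 6 corresponds to the six POT1-TPP1 proteins described in this chapter as coating hT72, so the higher ratios in the hT72 series probe protein excess.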

Assembly of protein-ssDNA complexes as monomeric and hexameric complexes

POT1-TPP1 protein was assembled with hT12 or hT72 ssDNA to form monomeric or hexameric species of nucleoprotein complexes. SEC was used to separate protein-ssDNA complex from unbound ssDNA. For PT-hT12 complex, hT12 was incubated with POT1-TPP1 in three-times molar excess to ensure that all POT1-TPP1 protein was in complex with hT12 ssDNA. PT-hT12 was then separated from unbound hT12 ssDNA by SEC. Conversely, for PT-hT72 complexes, hT72 ssDNA was incubated with 18-times molar excess of POT1-

TPP1 protein to saturate the hT72 ssDNA. The PT-hT72 hexamer was then separated from free POT1-TPP1 protein using SEC. Protein-ssDNA complex

fractions were pooled and concentrated with a Millipore Amicon Ultra 10K

centrifugal column to ~5 μM. Complexes containing POT1-N protein (POT1-N-hT12 and POT1-N-hT72) were assembled using a similar strategy.

Synchrotron radiolysis and POT1 proteolysis using pepsin

Radiolysis experiments were performed at beamline X28C of the National Synchrotron Light Source at Brookhaven National Laboratory and at beamline 5.3.1 of the Advanced Light Source at Lawrence Berkeley National Laboratory. The X-ray beam parameters were optimized using an Alexa-488 fluorophore assay. All samples were exposed for 0-20 ms (X28C) or 0-800 µs (5.3.1) at ambient temperature and immediately quenched with methionine amide at a final concentration of 10 mM to prevent secondary oxidation (97). All protein samples

were then reduced with 10 mM dithiothreitol (DTT) at 56°C for 45 minutes and

alkylated with 25 mM iodoacetamide at room temperature and in the dark for 45

minutes. Prior to digestion, formic acid (FA) was added to all samples at a final

concentration of 0.5% to achieve a pH between 1 and 2. Protein samples were digested with pepsin (Promega, Madison, WI) at 37°C for 3 h with an enzyme:protein molar ratio of 1:10. The digestion reaction was terminated by heating samples at 95°C for 2 minutes.

LC-MS Analysis

Liquid chromatography mass spectrometry (LC-MS) analysis of digested samples for hydroxyl radical footprinting experiments was performed using an

Orbitrap Elite mass spectrometer (Thermo Electron, San Jose, CA) interfaced with a Waters nanoAcquity UPLC system (Waters, Taunton, MA). Proteolytic peptides were desalted on a trap column (180 μm × 20 mm packed with C18 Symmetry, 5

μm, 100 Å (Waters, Taunton, MA)) and subsequently eluted on a reverse phase column (75 μm × 250 mm nano column, packed with C18 BEH130, 1.7 μm, 130 Å

(Waters, Taunton, MA)) using a gradient of 2 to 42% mobile phase B (100% acetonitrile/0.1% formic acid) vs. mobile phase A (100% water/0.1 % formic acid) over a period of 60 minutes at 37°C with a flow rate of 300 nl/min. A total of 250 ng of proteolytic peptides were loaded on column for each MS analysis. Peptides eluting from the column were introduced into the nano-electrospray source at a capillary voltage of 2.5 kV. All MS data were acquired in the positive ion mode. For

MS1 analysis, a full scan was recorded for eluted peptides (m/z range of 360–1600) in the Orbitrap mass analyzer with resolution of 120,000 followed by MS/MS of the

20 most intense peptide ions scanned in the ion trap mass analyzer. Selected ion currents for modified and unmodified peptic peptides in MS1 experiments were used to determine the extent of oxidation for each modified site. In HRF

experiments, the resulting MS/MS data were searched against a POT1-N or POT1 and TPP1 sequence database using Mass Matrix software to identify sites of modification. In particular, MS/MS spectra were searched for nonspecific peptides from POT1 and TPP1 protein sequences using mass accuracy values of 10 ppm and 0.7 Daltons for MS1 and MS2 scans, respectively, with allowed variable modifications including carbamidomethylation for cysteines and all known oxidative modifications previously documented for amino acid side chains. In addition, MS/MS spectra for each site of proposed modification were manually examined and verified.

Modification rate calculation for a specific site

The integrated peak areas of the unmodified peptide (Au), and of a peptide

in which a residue is modified (Am) derived from selected ion chromatograms, were

used to calculate the fraction unmodified for each specific modified species according to the formula: Fu=1- (Am/ (Au+∑Am)), where ∑Am is the sum of all

modified products for a particular peptide. Dose-response curves were generated by plotting the fraction unmodified for each specific site of modification versus X-ray exposure time. The fraction unmodified for each site of modification was fit to the equation Fu(t) = Fu(0)·e^(-kt), where Fu(0) and Fu(t) are the fractions unmodified at time 0 and time t, respectively, and k is a first-order rate constant. The peptide segments and the amino acid side chains in each segment for which rates were determined are provided in Table 2.1. A modification rate ratio comparing the protein-only state

(POT1-N) versus protein bound to short telomere ssDNA hT12 (POT1-N-hT12)

was calculated by dividing the modification rate for each modified site derived from

POT1-N protein samples by the modification rate for the same sites of modification derived from POT1-N-hT12. Similarly, the modification ratio comparing protein bound to hT12 (POT1-N-hT12) versus hT72 (POT1-N-hT72) was the quotient of modification rate of each modified site from POT1-N-hT12 sample against the modification rate of the same site from POT1-N-hT72 sample. For the residues identified from multiple peptides, the average values were used.
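
The calculation described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the function names are hypothetical, the peak areas and exposure times are invented, and the rate constant is estimated by a log-linear least-squares fit, which may differ from the fitting routine actually used in this work.

```python
import math


def fraction_unmodified(area_unmodified, areas_modified, index):
    """Fu = 1 - Am / (Au + sum(Am)) for the modified species at `index`."""
    total = area_unmodified + sum(areas_modified)
    return 1.0 - areas_modified[index] / total


def first_order_rate(times, fractions):
    """Estimate k in Fu(t) = Fu(0) * exp(-k t) from the slope of ln(Fu) vs t.

    Assumption: a log-linear least-squares fit stands in for the
    exponential fit described in the text.
    """
    logs = [math.log(f) for f in fractions]
    mean_t = sum(times) / len(times)
    mean_y = sum(logs) / len(logs)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, logs))
    sxx = sum((t - mean_t) ** 2 for t in times)
    return -sxy / sxx


def rate_ratio(k_numerator, k_denominator):
    """Ratio of two modification rates, e.g. free protein over DNA-bound."""
    return k_numerator / k_denominator


# Invented peak areas: one unmodified peptide and two modified products.
print(fraction_unmodified(90.0, [6.0, 4.0], 0))

# Invented dose-response data generated with k = 1.2 per second.
times_s = [0.0, 0.005, 0.010, 0.020]
fractions = [math.exp(-1.2 * t) for t in times_s]
print(f"{first_order_rate(times_s, fractions):.2f}")

# Unnormalized ratio for Y36 using the rates listed in Table 2.1
# (POT1-N: 1.2 per second, POT1-N-hT12: 0.90 per second).
print(f"{rate_ratio(1.2, 0.90):.2f}")
```

A ratio above 1 indicates that the residue is modified more slowly, and is therefore more protected, in the denominator state.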

The reported values were depicted as modification rate ratios following normalization as described previously (84). Briefly, the mean and median modification rate ratios were determined in log10 scale and averaged to provide a normalization factor used for individual data points. Normalized modification rate ratios were calculated by subtracting the normalization factor from the measured modification rate ratio for the same residue under different experimental conditions.

The normalization in log scale is inspired by the geometric mean, which computes the arithmetic mean of a set of the logarithm transformed values (98). Also, the normalization factor was calculated by averaging the mean and median values to better represent the data compiled. Normalized ratios and normalization factors were converted back to linear scale for presentation in Table 2.2. Normalized modification ratios of POT1-TPP1 samples were calculated following the same strategy as POT1-N samples (Table 2.3 and 2.4).
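
The normalization procedure can be sketched as follows. This is an illustrative reading of the description above, assuming that the subtraction is carried out on log10-transformed ratios before conversion back to linear scale; the function name and the example ratios are hypothetical.

```python
import math
import statistics


def normalize_ratios(ratios):
    """Normalize modification rate ratios in log10 scale.

    The normalization factor is the average of the mean and the median of
    the log10 ratios. It is subtracted from each log10 ratio, and both the
    normalized ratios and the factor are returned in linear scale.
    """
    logs = [math.log10(r) for r in ratios]
    factor = (statistics.mean(logs) + statistics.median(logs)) / 2.0
    normalized = [10 ** (value - factor) for value in logs]
    return normalized, 10 ** factor


# Invented ratios chosen so the arithmetic is easy to follow by hand.
normalized, linear_factor = normalize_ratios([1.0, 10.0, 100.0])
print(normalized)
print(linear_factor)
```

For these invented inputs the log10 values are 0, 1 and 2, so the mean and median are both 1, the linear normalization factor is 10, and the normalized ratios are approximately 0.1, 1 and 10.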

Electrophoretic mobility shift assays (EMSAs)

Protein-DNA binding reactions were performed in buffer containing 60 mM

HEPES (pH 8.0), 75 mM NaCl, 5 mM DTT, 5 μg/mL BSA, 1.2 μg/mL tRNA and

6.25% glycerol. Reactions were performed using 50 pM 32P-labeled DNA (hT12, hT72) or 4 nM IRD700 fluorescent dye labeled DNA (a3, a5) and variable concentrations of recombinant POT1-TPP1 protein. For hT12, wild type and Y36N mutant POT1-TPP1 were used over a concentration range of 0 to 500 nM, while H266L mutant POT1-TPP1 concentrations ranged from 0 to 10 μM. For hT72 binding reactions, both wild type and mutant (Y36N or H266L) POT1-TPP1 proteins were used at concentrations ranging from 0 to 1 μM. Binding reactions were incubated for 15 min at room temperature before 8 μL of the reaction was loaded onto a 5% Tris-borate non-denaturing gel (Invitrogen). Gels were run at

125 V for 30 min for hT12 and 55 min for hT72 ssDNA. Gels were dried and scanned using a Typhoon FLA 9500 biomolecular imager (GE Healthcare) and densitometry was performed using ImageQuant TL 1D v8.1 software (GE

Healthcare). For hT12, a3, and a5, the quadratic equation was used to fit and obtain equilibrium dissociation constants (KD) for each construct. For hT72, the highest molecular weight band representing six-proteins bound was quantified against the bottom band representing free DNA. The fraction of saturated complex was plotted against POT1-TPP1 protein concentration and subsequently fitted using the Hill equation with Origin8.0 software. Fitting was performed by allowing all parameters to float to obtain the best fit for apparent dissociation constants (Kapp) for each construct. In doing so, Hill coefficients were determined to be n = 3.9 ±

0.1, 2.9 ± 0.3, and 2.5 ± 0.2 for wild type, Y36N, and H266L constructs, respectively.

Reported values represent the average dissociation constants, with calculated standard errors, derived from three independent experiments for each condition.
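
The text names the quadratic and Hill equations without writing them out. The Python sketch below uses their commonly used textbook forms as an assumption; the exact parameterization applied in Origin is not stated here, and the constants in the example are hypothetical rather than values from this work.

```python
import math


def fraction_bound_quadratic(protein_nm, dna_nm, kd_nm):
    """Fraction of DNA bound for 1:1 binding, accounting for ligand depletion.

    Assumed form: ((P + D + Kd) - sqrt((P + D + Kd)^2 - 4 P D)) / (2 D).
    """
    b = protein_nm + dna_nm + kd_nm
    return (b - math.sqrt(b * b - 4.0 * protein_nm * dna_nm)) / (2.0 * dna_nm)


def fraction_saturated_hill(protein_nm, k_app_nm, n):
    """Assumed Hill form: P^n / (Kapp^n + P^n)."""
    return protein_nm ** n / (k_app_nm ** n + protein_nm ** n)


# Hypothetical constants for illustration only (not values from this work).
# 0.05 nM DNA corresponds to the 50 pM labeled DNA used in the binding reactions.
print(f"{fraction_bound_quadratic(10.0, 0.05, 10.0):.3f}")
print(f"{fraction_saturated_hill(100.0, 100.0, 3.9):.3f}")  # 0.500 when [P] = Kapp
```

In both forms the fraction bound is close to one half when the protein concentration equals the dissociation constant, which is why the midpoint of each titration gives a first estimate of KD or Kapp.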

Size-exclusion chromatography analysis

Size-exclusion chromatography analyses were performed using a

Superdex 75 column (increase 3.2/300, bed volume 2.4 mL) (GE Healthcare Life

Sciences) on a Shimadzu HPLC system in buffer containing 25 mM Hepes (pH 8.0) and 150 mM NaCl. POT1-TPP1 wildtype or H266L mutant (10 µM) was incubated with or without (TTAGGG)2 ssDNA (20 µM) for 30 min on ice, then loaded onto the column. The SEC profile was calibrated using gel-filtration protein standards (Bio-

Rad). The Log10 molecular weights of the standards were plotted against the ratio Ve/Vo (where Ve is the elution volume for each standard and Vo is the void volume). The standards were fit using the equation Log10 (Molecular weight) = -1.7135 * Ve/Vo + 7.2454, and sample molecular weights were determined from this calibration.
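
As a worked illustration of the calibration, the Python sketch below applies the fitted line to a sample elution volume. The slope and intercept are the values given above; the function name and the example volumes are hypothetical, and the result carries whatever mass unit was used for the standards.

```python
import math


def molecular_weight_from_elution(ve_ml, vo_ml, slope=-1.7135, intercept=7.2454):
    """Apply Log10(MW) = slope * Ve/Vo + intercept and return MW."""
    return 10 ** (slope * (ve_ml / vo_ml) + intercept)


# Hypothetical elution and void volumes, for illustration only.
example_mw = molecular_weight_from_elution(ve_ml=1.5, vo_ml=1.0)
print(f"{math.log10(example_mw):.3f}")
```

Because the slope is negative, species that elute later (larger Ve/Vo) are assigned smaller molecular weights.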

Multi-angle light scattering (MALS) analysis

MALS determines the molecular weight of molecules via a method that is independent of molecular mass reference standards, column calibration and assumptions of molecular shape (99). In addition, SEC-MALS separates mixtures of oligomers and measures the absolute molecular weight of an oligomer in elution fractions. Multi-angle light scattering (MALS) experiments were performed immediately following size-exclusion chromatography (SEC) by online measurement of static light scattering (mini DAWN TREOS, Wyatt Technology), differential refractive index (dRI, Optilab rEX, Wyatt Technology) at a wavelength of 658 nm and ultraviolet absorbance at a wavelength of 280 nm (Dionex ultimate

3000 variable wavelength detector). For this assay, a ProSEC 300S, 300 x 7.5 mm

SEC column was connected upstream of the MALS-RI detectors and used to

fractionate the injected sample. The SEC-MALS-RI system as a whole was validated using BSA (Sigma-Aldrich). Prior to sample injection (40 µL), the column was equilibrated at a flow rate of 0.3 mL/min in 50 mM HEPES pH 7.5, 150 mM

NaCl. The chromatogram and resultant molecular weight data were analyzed using the Astra 5.3 software from Wyatt Corporation.

Direct telomerase incorporation assay

Telomerase activity assays were performed by mixing 2 μL of hTR- and hTERT-transfected HEK 293T cell lysate into a 15 μL reaction mixture containing

35 mM Tris-HCl pH 8.0, 0.7 mM MgCl2, 1.8 mM β-mercaptoethanol, 0.7 mM

spermidine, 35 mM KCl, 500 μM dTTP, 500 μM dATP, 2.9 μM dGTP, 2 μL [α-32P]-

dGTP (10 μCi/μL, 3000 Ci/mmol, Perkin–Elmer) and 0.1 μM hT12 or hT72 primer.

Five microliters of purified POT1–TPP1 (wild type, Y36N and H266L) protein

complexes (at 0.8 μM or 4.8 μM) were added to reach the final concentration of

0.2 μM for hT12 or 0.2 μM and 1.2 μM for hT72. For reactions without POT1–TPP1

protein, an equivalent volume of the appropriate protein buffer was used instead.

The telomerase reaction was carried out for 30 min at 30°C and then quenched by

adding 100 μL of 3.6 M NH4OAc, 20 μg of glycogen, 4 μL of 10 mM EDTA. A 5΄-

32P-labeled 12 nt, 15 nt, or 53 nt oligo was used as loading control. The radioactivity

of the loading control was determined by liquid scintillation counting and 400 cpm

were loaded into each reaction mixture. All ssDNA products synthesized in the

assay were ethanol-precipitated and analyzed on a 12% polyacrylamide/7 M

urea/1× TBE denaturing gel. Gels were dried, digitized with a Typhoon FLA 9500 biomolecular imager (GE Healthcare), and quantified by densitometry using ImageQuant TL 1D v8.1 software (GE Healthcare).

Quantification of telomerase assay products was performed as described previously (46, 48). Briefly, relative intensities for each hexamer repeat were determined and normalized against the loading control for each lane. Total activity is reported as total lane counts, obtained by summing the relative intensities of all normalized bands within a lane. Repeat addition processivity was calculated by first correcting for the number of radiolabeled Gs incorporated within each hexamer repeat and then calculating the fraction left behind (FLB) for each repeat n by dividing the sum of intensities for rounds of extension 1 through n by the total lane count. The ln (1-FLB) was plotted against the repeat number of telomerase extension and a line was fitted to the linear portion of those data to obtain the slope.

Repeat addition processivity was defined as −0.693/slope.
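
The processivity calculation can be sketched in Python as shown below. This is a minimal illustration, assuming the band intensities have already been normalized to the loading control and corrected for radiolabeled G content; the function name, the `n_fit` parameter that selects the linear portion, and the synthetic lane are all hypothetical.

```python
import math


def repeat_addition_processivity(intensities, n_fit):
    """Estimate repeat addition processivity as -0.693 / slope.

    intensities: corrected band intensity for each repeat, in order.
    n_fit:       number of leading repeats treated as the linear portion.
    """
    total = sum(intensities)
    xs, ys = [], []
    running = 0.0
    for repeat, value in enumerate(intensities[:n_fit], start=1):
        running += value
        flb = running / total      # fraction left behind after this repeat
        if flb >= 1.0:             # ln(0) is undefined, so stop at the last band
            break
        xs.append(repeat)
        ys.append(math.log(1.0 - flb))
    # Ordinary least-squares slope of ln(1 - FLB) versus repeat number.
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return -0.693 / slope


# Synthetic lane in which each band is half as intense as the previous one.
lane = [0.5 ** n for n in range(1, 41)]
print(round(repeat_addition_processivity(lane, n_fit=10), 2))  # prints 1.0
```

The constant 0.693 approximates ln 2, so the returned value is the number of repeats after which half of the extended products have been left behind. In the synthetic lane, where half of the products stop at every repeat, that number is 1.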

Stopped-flow circular dichroism

The kinetics of G-quadruplex destabilization of hT72 ssDNA was monitored upon rapid mixing of ssDNA with POT1-TPP1 protein. hT72 oligonucleotides were first prepared by heating at 95 °C for 5 min followed by slow cooling to room temperature. 200 nM of hT72 and 2.4 μM of POT1-TPP1 (either wild type or H266L mutant), all prepared in 60 mM HEPES (pH 8.0) and 75 mM NaCl buffer, were mixed using the stopped-flow device equipped on a PiStar 180 spectrophotometer

(Applied Photophysics). Circular dichroism was conducted at 25°C with signal changes monitored at 295nm (path length 1 cm, bandwidth 4 nm) over time for

300 s at a rate of three measurements per second. Data were fitted to a double exponential equation: CD295 = A1·exp(–kobs1·t) + A2·exp(–kobs2·t) + c, where kobs1 and kobs2 represent the two independent rates used for fitting the data.
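
To make the two kinetic phases easier to compare, the Python sketch below evaluates the double-exponential model and converts each observed rate into a half-life, ln(2)/kobs. The rate constants are those reported in the Results of this chapter; the amplitudes and offset in the example call are hypothetical, and the function names are illustrative.

```python
import math


def cd295(t, a1, k_obs1, a2, k_obs2, c):
    """Double-exponential model: A1*exp(-kobs1*t) + A2*exp(-kobs2*t) + c."""
    return a1 * math.exp(-k_obs1 * t) + a2 * math.exp(-k_obs2 * t) + c


def half_life(k_obs):
    """Half-life in seconds of a first-order phase with rate k_obs (per second)."""
    return math.log(2.0) / k_obs


# Observed rates (per second) reported in this chapter: (kobs1, kobs2).
RATES = {"wild type": (0.63, 0.030), "H266L": (0.27, 0.019)}

for name, (k1, k2) in RATES.items():
    print(f"{name}: fast phase {half_life(k1):.1f} s, slow phase {half_life(k2):.1f} s")

# Hypothetical amplitudes and offset, used only to show the model's shape.
print(cd295(0.0, a1=1.0, k_obs1=0.63, a2=2.0, k_obs2=0.030, c=0.5))
```

On this scale the fast phase has a half-life of about 1.1 s for wild type and about 2.6 s for H266L, while the slow phase has a half-life of about 23 s for wild type and about 36 s for H266L.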

CRISPR-Cas9 editing cell lines and telomere restriction fragment (TRF) assay

The genome of HCT116 cells was edited using CRISPR-Cas9 techniques by Washington University’s Genome Engineering and iPSC Center (GEIC) as previously described (100). The cells were transfected with sgRNA sequence

(TTAGAGTTTCATCTTCATGGNGG) (Addgene #43860) and Cas9 pDNA

(Addgene #43945) via nucleofection (Lonza 4D-Nucleofector, X Unit). The edited cells were validated by next-generation sequencing (NGS) and Sanger sequencing. Parental and H266L mutated HCT116 cells were grown in Dulbecco’s

Modified Eagle Medium (DMEM) supplemented with 10% fetal bovine serum (FBS)

at 37°C with 5% CO2 for 78 population doublings (PDs). 5×10⁶ cells were harvested at 0, 28, 55, and 78 population doublings and genomic DNA was isolated (Sigma-Aldrich, Catalog# NA2110). TRF analysis was performed using a

commercial kit (TeloTAGGG Telomere Length Assay, Catalog# 12209136001,

Roche Diagnostics Corporation, Indianapolis, IN, USA). A total of 2 µg DNA was

digested overnight with Rsa I and Hinf I at 37°C and electrophoresed through 0.5%

agarose gels in 1×TBE at 5 V/cm for 6 hrs. Gels were denatured and neutralized

prior to capillary transfer overnight. Telomere DNA was transferred onto a

Hybond™-N+ membrane (GE Healthcare) using 20× SSC buffer. The transferred DNA was fixed by UV crosslinking. The cross-linked membrane was then hybridized with DIG-labeled telomere probes with the sequence (TTAGGG)4

overnight at 42°C. After hybridization, the membrane was washed with buffer 1 (2×

SSC, 0.1% SDS) at room temperature for 15 min and then washed twice with buffer

2 (0.5× SSC, 0.1% SDS) at 55°C for 15 min. Then, the membrane was incubated with Anti-DIG-AP antibody. Finally, the membrane was incubated with CDP-Star

solution (Roche) for detection and imaged with ODYSSEY imaging system (LI-

COR). Image quantification was performed using Image Studio software to

measure the intensity value of each telomere smear. The weighted telomere length

mean was determined by dividing each lane into 60 boxes and applying the

formula ΣODi/Σ(ODi/Li), where ODi is the intensity of box i and Li corresponds to

the length of the DNA in box i as determined using DNA markers and a standard

curve (101).
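
The weighted mean formula can be illustrated with a short Python sketch. The function name and the two-box example are hypothetical; in the actual analysis each lane was divided into 60 boxes.

```python
def weighted_mean_trf_length(od, lengths):
    """Weighted mean telomere length: sum(ODi) / sum(ODi / Li).

    od:      intensity of each box in a lane.
    lengths: DNA length assigned to each box from the marker standard curve.
    """
    return sum(od) / sum(o / l for o, l in zip(od, lengths))


# Invented two-box example: equal intensity at 2 kb and 4 kb.
print(f"{weighted_mean_trf_length([1.0, 1.0], [2.0, 4.0]):.2f}")  # 2.67
```

Dividing each intensity by its length compensates for longer fragments binding more probe, which is why the result (2.67 kb) falls below the simple average of 3 kb.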

Table 2.1 Summary of hydroxyl radical modification rate constants for residues in POT1-N, POT1-N-hT12 and POT1-N-hT72.

| Peptide | Sequence | Residue modified | Modification rate, s-1 (POT1-N) | Modification rate, s-1 (POT1-N-hT12) | Modification rate, s-1 (POT1-N-hT72) |
|---|---|---|---|---|---|
| [-5-8] | GPLGSMSLVPATN | M1 | 80.6 ± 3.04 | 104.9 ± 4.8 | 76.4 ± 0.97 |
| [-1-9] | SMSLVPATNY | M1 | 85.6 ± 5.04 | 109.6 ± 8.0 | 64.6 ± 2.2 |
| [10-25] | IYTPLNQLKGGTIVNV | I10 | 0.97 ± 0.03 | 1.27 ± 0.05 | 1.1 ± 0.02 |
| | | L17 | 0.46 ± 0.02 | 0.71 ± 0.06 | 0.55 ± 0.012 |
| | | K18 | 1.33 ± 0.05 | 1.7 ± 0.06 | 1.4 ± 0.04 |
| | | I22 | 1.30 ± 0.13 | 2.4 ± 0.18 | 1.1 ± 0.03 |
| [25-31] | VYGVVKF | no mod | - | - | - |
| [31-43] | FKPPYLSKGTDY | L37 | 0.47 ± 0.04 | 0.71 ± 0.02 | 0.29 ± 0.01 |
| | | K39 | 1.20 ± 0.03 | 1.55 ± 0.06 | 0.61 ± 0.03 |
| | | Y43 | 1.25 ± 0.04 | 1.43 ± 0.03 | 0.50 ± 0.015 |
| [31-42] | FKPPYLSKGTD | Y36 | 1.2 ± 0.027 | 0.90 ± 0.029 | 0.40 ± 0.008 |
| | | L37 | 0.58 ± 0.02 | 0.84 ± 0.029 | 0.33 ± 0.013 |
| | | K39 | 0.62 ± 0.032 | 0.67 ± 0.038 | 0.29 ± 0.005 |
| [49-57] | IVDQTNVKL | I49 or V50 | 0.21 ± 0.013 | 0.22 ± 0.013 | 0.20 ± 0.005 |
| | | V55 | 0.49 ± 0.028 | 0.51 ± 0.015 | 0.50 ± 0.011 |
| | | K56 | 0.75 ± 0.015 | 0.81 ± 0.0157 | 0.86 ± 0.017 |
| [60-66] | LLFSGNY | no mod | - | - | - |
| [62-69] | FSGNYEAL | no mod | - | - | - |
| [70-77] | PIIYKNGD | I71 | 0.27 ± 0.005 | 0.29 ± 0.010 | 0.14 ± 0.002 |
| | | Y73 | 0.63 ± 0.023 | 0.81 ± 0.017 | 0.54 ± 0.019 |
| [70-81] | PIIYKNGDIVRF | I71 | 0.22 ± 0.012 | 0.27 ± 0.005 | 0.14 ± 0.003 |
| | | Y73 | 0.67 ± 0.021 | 0.86 ± 0.023 | 0.54 ± 0.009 |
| [82-101] | HRLKIQVYKKETQGITSSGF | K90 | 8.32 ± 0.34 | 11.88 ± 0.57 | 9.33 ± 0.34 |
| | | I96 | 6.16 ± 0.21 | 7.49 ± 0.55 | 8.50 ± 0.29 |
| [89-101] | YKKETQGITSSGF | K90 | 2.58 ± 0.025 | 5.7 ± 0.12 | 2.56 ± 0.048 |
| [105-122] | TFEGTLGAPIIPRTSSKY | L110 | 0.82 ± 0.074 | 1.0 ± 0.09 | 1.03 ± 0.022 |
| | | I115 | 2.5 ± 0.08 | 3.1 ± 0.17 | 0.8 ± 0.043 |
| | | K121 | 1.6 ± 0.075 | 2.5 ± 0.11 | 2.8 ± 0.11 |
| [113-125] | PIIPRTSSKYFNF | I115 | 2.25 ± 0.08 | 3.25 ± 0.26 | 0.58 ± 0.018 |
| | | K121 | 1.76 ± 0.063 | 2.1 ± 0.061 | 1.6 ± 0.0423 |
| | | Y122 | 2.5 ± 0.082 | 3.6 ± 0.082 | 3.0 ± 0.13 |
| | | F125 | 0.69 ± 0.03 | 1.1 ± 0.078 | 0.63 ± 0.047 |
| [123-135] | FNFTTEDHKMVEA | M132 | 32.84 ± 1.35 | 43.34 ± 2.42 | 35.25 ± 1.89 |
| [139-148] | WASTHMSPSW | M144 | 25.43 ± 1.07 | 25.74 ± 1.03 | 12.14 ± 0.72 |
| [139-150] | WASTHMSPSWTL | M144 | 25.65 ± 1.02 | 26.34 ± 1.16 | 13.40 ± 0.74 |
| [151-161] | LKLCDVQPMQY | M159 | 36.83 ± 1.20 | 56.26 ± 3.71 | 37.05 ± 2.06 |
| [162-168] | FDLTCQL | no mod | - | - | - |
| [169-179] | LGKAEVDGASF | K171 | 0.35 ± 0.034 | 0.33 ± 0.019 | 0.17 ± 0.006 |
| | | E173 | 0.48 ± 0.046 | 0.40 ± 0.035 | 0.23 ± 0.017 |
| [169-180] | LGKAEVDGASFL | K171 | 0.32 ± 0.022 | 0.31 ± 0.014 | 0.16 ± 0.004 |
| | | E173 | 0.46 ± 0.010 | 0.45 ± 0.016 | 0.23 ± 0.005 |
| [181-194] | LKVWDGTRTPFPSW | R188 | 0.26 ± 0.010 | 0.24 ± 0.027 | 0.17 ± 0.027 |
| | | F191 | 3.66 ± 0.16 | 4.83 ± 0.18 | 1.94 ± 0.017 |
| | | W194/W184 | 7.48 ± 0.12 | 6.02 ± 0.46 | 0.43 ± 0.007 |
| | | W194 | 0.46 ± 0.036 | 0.48 ± 0.032 | 0.38 ± 0.039 |
| [195-200] | RVLIQD | no mod | - | - | - |
| [204-216] | EGDLSHIHRLQNL | L207 | 0.61 ± 0.028 | 0.61 ± 0.025 | 0.56 ± 0.013 |
| | | H209 | 0.49 ± 0.029 | 0.77 ± 0.059 | 0.39 ± 0.020 |
| | | H211 | 2.14 ± 0.049 | 3.01 ± 0.095 | 1.40 ± 0.018 |
| [222-232] | VYDNHVHVARS | Y223 | 1.48 ± 0.039 | 1.29 ± 0.078 | 0.48 ± 0.016 |
| | | H228 | 4.27 ± 0.14 | 4.46 ± 0.17 | 2.06 ± 0.12 |
| [222-238] | VYDNHVHVARSLKVGSF | Y223 | 0.91 ± 0.031 | 0.90 ± 0.046 | 0.31 ± 0.019 |
| | | H228 | 2.34 ± 0.14 | 3.01 ± 0.094 | 0.76 ± 0.089 |
| | | K234 | 1.38 ± 0.035 | 1.45 ± 0.059 | 0.85 ± 0.023 |
| | | F239 | 0.69 ± 0.045 | 1.39 ± 0.099 | 0.53 ± 0.039 |
| [239-244] | LRIYSL | Y242 | 0.51 ± 0.03 | 0.44 ± 0.022 | 0.12 ± 0.009 |
| [245-258] | HTKLQSMNSENQTM | M251 | 42.35 ± 1.86 | 44.59 ± 1.68 | 36.71 ± 2.04 |
| [263-274] | FHLHGGTSYGRG | H264 | 4.42 ± 0.33 | 5.59 ± 0.16 | 2.15 ± 0.13 |
| [264-275] | HLHGGTSYGRG | H266 | 5.34 ± 0.31 | 1.50 ± 0.21 | 25.39 ± 1.10 |
| [275-284] | IRVLPESNSD | no mod | - | - | - |
| [290-297] | VDQLKKDLESANL | K289/K290 | 1.26 ± 0.036 | 1.37 ± 0.042 | 1.34 ± 0.036 |
| | | L297 | 0.41 ± 0.019 | 0.53 ± 0.061 | 0.47 ± 0.012 |
| [298-305] | TANQHSDV | no mod | - | - | - |
| [308-318] | QSEPDDSFPNG | no mod | - | - | - |
| [319-327] | VSLRPPGWS | W326 | 11.00 ± 0.41 | 11.89 ± 0.34 | 11.76 ± 0.61 |


Table 2.2 Summary of hydroxyl radical modification ratios for residues in comparison of POT1-N/POT1-N-hT12 and POT1-N-hT12/POT1-N-hT72.

| Residue number | Residue | Normalized modification ratio POT1-N/POT1-N-hT12 | Normalized modification ratio POT1-N-hT12/POT1-N-hT72 |
|---|---|---|---|
| 1 | M | 0.92 | 0.89 |
| 10 | I | 0.91 | 0.67 |
| 17 | L | 0.77 | 0.75 |
| 18 | K | 0.93 | 0.71 |
| 22 | I | 0.65 | 1.27 |
| 36 | Y | 1.59 | 1.31 |
| 37 | L | 0.81 | 1.45 |
| 39 | K | 1.01 | 1.41 |
| 43 | Y | 1.04 | 1.66 |
| 49 | I | 1.14 | 0.64 |
| 50 | V | 1.14 | 0.64 |
| 55 | V | 1.15 | 0.59 |
| 56 | K | 1.11 | 0.55 |
| 71 | I | 1.04 | 1.16 |
| 73 | Y | 0.93 | 0.90 |
| 90 | K | 0.67 | 0.98 |
| 96 | I | 0.98 | 0.51 |
| 110 | L | 0.98 | 0.56 |
| 115 | I | 0.89 | 2.71 |
| 121 | K | 0.87 | 0.63 |
| 122 | Y | 0.83 | 0.70 |
| 125 | F | 0.75 | 1.01 |
| 132 | M | 0.90 | 0.71 |
| 144 | M | 1.17 | 1.19 |
| 159 | M | 0.78 | 0.88 |
| 171 | K | 1.25 | 1.13 |
| 173 | E | 1.32 | 1.07 |
| 184 | W | 1.48 | 8.14 |
| 188 | R | 1.29 | 0.82 |
| 191 | F | 0.90 | 1.45 |
| 194 | W | 1.30 | 2.44 |
| 207 | L | 1.19 | 0.63 |
| 209 | H | 0.76 | 1.15 |
| 211 | H | 0.85 | 1.25 |
| 223 | Y | 1.29 | 1.62 |
| 228 | H | 1.03 | 1.70 |
| 234 | K | 1.14 | 0.99 |
| 239 | L | 0.59 | 1.52 |
| 242 | Y | 1.38 | 2.13 |
| 251 | M | 1.13 | 0.71 |
| 264 | H | 0.94 | 1.51 |
| 266 | H | 4.25 | 0.03 |
| 289 | K | 1.10 | 0.59 |
| 290 | K | 1.10 | 0.59 |
| 297 | L | 0.92 | 0.66 |
| 326 | W | 1.09 | 0.63 |
| *normalization factor | | 0.84 | 1.72 |
| †mean-3SD | | 0.42 | 0.11 |
| †mean-2SD | | 0.56 | 0.22 |
| †mean | | 1.03 | 0.95 |
| †mean+2SD | | 1.89 | 4.04 |
| †mean+3SD | | 2.56 | 8.36 |

*The normalization factor is determined as the average of mean and median from all identified ratios of modification rates before normalization in log10 scale, then converted back to linear scale.


† mean and SD are also calculated based on normalized modification rate ratios in log10 scale, then mean±2SD and mean±3SD are determined, and finally converted back to linear scale.
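The two footnotes describe a log-scale normalization that can be sketched as follows. This is an illustrative reading of the footnotes, not the original analysis script: the function names are hypothetical, the sketch assumes that each raw ratio is divided by the normalization factor, and it assumes a sample standard deviation (the footnotes do not say whether sample or population SD was used).

```python
import math
import statistics


def normalization_factor(raw_ratios):
    """Average of the mean and the median of log10(ratio), returned on the linear scale."""
    logs = [math.log10(r) for r in raw_ratios]
    center = (statistics.mean(logs) + statistics.median(logs)) / 2.0
    return 10.0 ** center


def normalize(raw_ratios):
    """Divide every raw ratio by the normalization factor (assumed convention)."""
    factor = normalization_factor(raw_ratios)
    return [r / factor for r in raw_ratios]


def log_scale_bounds(normalized_ratios, n_sd):
    """Return (mean - n_sd*SD, mean, mean + n_sd*SD), computed in log10 and converted back."""
    logs = [math.log10(r) for r in normalized_ratios]
    mu = statistics.mean(logs)
    sd = statistics.stdev(logs)  # sample SD; an assumption of this sketch
    return 10.0 ** (mu - n_sd * sd), 10.0 ** mu, 10.0 ** (mu + n_sd * sd)


# Hypothetical raw ratios, used only to exercise the functions.
raw = [0.5, 1.0, 2.0, 4.0]
low, mid, high = log_scale_bounds(normalize(raw), n_sd=2)
```

Working in log10 space treats a twofold increase and a twofold decrease symmetrically, which is why the tabulated mean±2SD and mean±3SD bounds are asymmetric around the mean on the linear scale.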


Table 2.3 Summary of hydroxyl radical modification rate constants for residues in PT, PT-hT12, and PT-hT72.

| Protein | Peptide | Sequence | Residue modified | Modification rate, s-1 (PT) | Modification rate, s-1 (PT-hT12) | Modification rate, s-1 (PT-hT72) |
|---|---|---|---|---|---|---|
| POT1 | [1-9] | MSLVPATNY | M1 | 1240.0 ± 216.7 | 1510.0 ± 108.4 | 565.86 ± 34.11 |
| | [2-9] | SLVPATNY | L3 | 76.39 ± 3.67 | 78.29 ± 5.73 | 23.10 ± 1.85 |
| | | | Y9 | 76.0 ± 5.10 | 73.71 ± 3.80 | 23.0 ± 1.13 |
| | [9-17] | YIYTPLNQL | Y9 | 174.34 ± 8.29 | 175.34 ± 13.24 | 48.04 ± 3.37 |
| | | | I10 | 179.21 ± 5.50 | 215.76 ± 21.55 | 60.30 ± 4.16 |
| | [10-24] | IYTPLNQLKGGTIVN | K18 | 179.44 ± 9.39 | 214.58 ± 13.29 | 41.92 ± 2.49 |
| | [10-25] | IYTPLNQLKGGTIVNV | I10 | 53.42 ± 4.64 | 60.59 ± 4.27 | 16.24 ± 1.04 |
| | | | K18 | 164.83 ± 13.16 | 200.92 ± 16.13 | 38.99 ± 1.93 |
| | | | I22 | 104.43 ± 6.60 | 144.52 ± 14.29 | 39.79 ± 3.29 |
| | [30-36] | VYGVVKF | no mod | - | - | - |
| | [32-37] | FKPPYL | Y36 | 51.25 ± 1.86 | 53.42 ± 2.96 | 8.46 ± 0.56 |
| | [32-43] | FKPPYLSKGTDY | Y36 | 51.18 ± 2.72 | 62.38 ± 3.87 | 10.27 ± 0.64 |
| | | | L37 | 19.63 ± 1.66 | 35.19 ± 2.65 | 7.90 ± 0.44 |
| | | | K39 | 53.55 ± 2.07 | 80.46 ± 6.64 | 14.38 ± 1.57 |
| | [32-42] | FKPPYLSKGTD | Y36 | 40.93 ± 2.45 | 41.57 ± 2.83 | 7.25 ± 0.49 |
| | | | L37 | 25.78 ± 1.53 | 43.29 ± 2.54 | 8.51 ± 0.47 |
| | | | K39 | 36.26 ± 1.95 | 54.96 ± 2.58 | 9.39 ± 0.67 |
| | [49-57] | IVDQTNVKL | V55 | 322.96 ± 14.02 | 441.48 ± 23.22 | 74.75 ± 11.42 |
| | | | K56 | 213.87 ± 5.99 | 282.08 ± 19.76 | 68.38 ± 7.95 |
| | [61-66] | LFSGNY | Y66 | 137.23 ± 8.13 | 44.06 ± 2.85 | 9.21 ± 0.74 |
| | [70-77] | PIIYKNGD | I71 | 11.89 ± 0.55 | 15.57 ± 1.24 | 5.11 ± 0.25 |
| | | | Y73 | 69.09 ± 1.95 | 13.83 ± 0.90 | 3.66 ± 0.50 |
| | [82-88] | HRLKIQV | no mod | - | - | - |
| | [89-100] | YKKETQGITSSG | Y89/K90/K91 | 155.92 ± 4.99 | 208.71 ± 17.72 | 44.45 ± 3.15 |
| | [89-101] | YKKETQGITSSGF | Y89/K90 | 153.45 ± 4.47 | 216.38 ± 18.85 | 52.84 ± 3.13 |
| | [105-112] | TFEGTLGA | no mod | - | - | - |
| | [105-122] | TFEGTLGAPIIPRTSSKY | L110 | 1990.0 ± 159.4 | 1810.0 ± 194.2 | 491.33 ± 36.08 |
| | | | P113 | 106.06 ± 6.17 | 119.55 ± 11.08 | 66.19 ± 4.56 |
| | | | K121 | 84.31 ± 9.68 | 80.82 ± 4.57 | 61.43 ± 3.63 |
| | | | Y122 | 89.08 ± 6.28 | 100.87 ± 11.1 | 34.26 ± 4.99 |
| | [113-122] | PIIPRTSSKY | I115 | 14.89 ± 0.87 | 20.08 ± 1.42 | 7.46 ± 0.53 |
| | | | Y122 | 61.60 ± 3.45 | 64.19 ± 3.82 | 19.60 ± 0.74 |
| | [125-135] | FTTEDHKMVEA | M132 | 218.10 ± 12.51 | 286.79 ± 18.96 | 155.39 ± 2.10 |
| | [126-135] | TTEDHKMVEA | H130 | 33.31 ± 1.97 | 21.56 ± 1.68 | 6.47 ± 0.32 |
| | | | M132 | 34.30 ± 2.62 | 44.20 ± 1.46 | 33.44 ± 4.44 |
| | [139-150] | WASTHMSPSWTL | M144 | 627.06 ± 27.10 | 552.03 ± 42.68 | 85.27 ± 5.55 |
| | [143-150] | HMSPSWTL | H143 | 39.07 ± 1.44 | 41.01 ± 2.90 | 14.41 ± 1.02 |
| | | | M144 | 128.32 ± 6.61 | 101.88 ± 5.96 | 29.21 ± 2.34 |
| | [151-160] | LKLCDVQPMQ | M159 | 1840.0 ± 106.9 | 2180.0 ± 164.1 | 706.62 ± 46.6 |
| | [152-161] | KLCDVQPMQY | C154 | 713.72 ± 62.79 | 528.61 ± 55.62 | 173.46 ± 13.97 |
| | | | M159 | 510.78 ± 15.48 | 728.65 ± 67.89 | 182.36 ± 10.94 |
| | [162-168] | FDLTCQL | no mod | - | - | - |
| | [169-179] | LGKAEVDGASFL | L169 | 4.26 ± 0.14 | 2.37 ± 0.15 | 1.33 ± 0.21 |
| | | | K171 | 17.46 ± 0.76 | 13.57 ± 1.08 | 5.14 ± 0.32 |
| | | | E173 | 29.32 ± 0.49 | 26.42 ± 1.98 | 6.31 ± 0.45 |
| | | | V174 | 8.21 ± 0.26 | 8.54 ± 0.76 | 1.47 ± 0.063 |
| | | | A177 | 22.86 ± 0.60 | 26.56 ± 2.12 | 5.97 ± 0.28 |
| | [180-194] | LLKVWDGTRTPFPSW | W184 | 43.56 ± 3.53 | 53.41 ± 3.70 | 17.36 ± 1.38 |
| | | | P190 | 34.03 ± 2.0 | 32.22 ± 2.48 | 25.89 ± 2.09 |
| | | | F191 | 66.24 ± 4.89 | 105.55 ± 7.46 | 22.17 ± 1.65 |
| | | | W194 | 7.78 ± 0.35 | 7.16 ± 0.31 | 1.30 ± 0.083 |
| | | | W184/W194 | 28.9 ± 0.89 | 27.93 ± 2.69 | 11.49 ± 0.79 |
| | [181-194] | LKVWDGTRTPFPSW | W184 | 36.57 ± 3.45 | 47.07 ± 2.63 | 16.39 ± 1.23 |
| | | | P190 | 29.83 ± 2.21 | 39.19 ± 3.15 | 20.71 ± 1.36 |
| | | | F191 | 54.29 ± 0.32 | 93.77 ± 9.02 | 18.51 ± 1.49 |
| | | | W194 | 6.86 ± 0.32 | 5.42 ± 0.16 | 1.05 ± 0.084 |
| | | | W184/W194 | 23.82 ± 2.47 | 26.85 ± 1.59 | 8.43 ± 0.69 |
| | [195-200] | RVLIQD | L197 | 87.80 ± 4.91 | 83.95 ± 6.74 | 16.66 ± 0.61 |
| | [204-216] | EGDLSHIHRLQNL | H211 | 154.95 ± 12.75 | 161.81 ± 11.86 | 40.65 ± 1.21 |
| | [208-216] | SHIHRLQNL | H209 | 838.6 ± 38.1 | 886.7 ± 55.1 | 374.0 ± 26.8 |
| | [220-232] | ILVYDNHVHVARS | no mod | - | - | - |
| | [222-232] | VYDNHVHVARSLKVGS | Y223 | 66.39 ± 1.94 | 74.30 ± 4.85 | 12.67 ± 0.77 |
| | | | H228 | 284.85 ± 27.02 | 195.67 ± 11.27 | 37.43 ± 2.28 |
| | [223-232] | YDNHVHVARSLKVGS | H228 | 295.78 ± 21.45 | 190.29 ± 16.85 | 46.14 ± 3.34 |
| | [230-238] | ARSLKVGSF | L233 | 77.08 ± 5.37 | 42.87 ± 2.68 | 11.39 ± 1.54 |
| | | | K234 | 148.04 ± 6.58 | 125.88 ± 12.63 | 25.28 ± 3.16 |
| | [239-244] | LRIYSL | I241 | 9.82 ± 0.64 | 11.59 ± 0.62 | 3.18 ± 0.23 |
| | | | Y242 | 114.04 ± 3.45 | 11.4 ± 0.69 | 5.58 ± 0.62 |
| | [245-256] | HTKLQSMNSENQ | M251 | 645.13 ± 22.28 | 887.53 ± 69.15 | 253.41 ± 5.06 |
| | [264-274] | HLHGGTSYGRG | H266 | 232.16 ± 11.43 | 34.01 ± 2.11 | 59.55 ± 4.26 |
| | | | Y271 | 48.96 ± 1.07 | 32.38 ± 1.07 | 3.92 ± 0.31 |
| | [275-284] | IRVLPESNSD | V277 | 53.24 ± 1.75 | 26.31 ± 1.55 | 8.70 ± 0.25 |
| | | | P279 | 48.24 ± 1.55 | 19.23 ± 1.16 | 6.43 ± 0.83 |
| | | | E280 | 41.82 ± 1.39 | 29.82 ± 2.08 | 7.48 ± 0.64 |
| | [275-287] | IRVLPESNSDVDQ | V277 | 51.16 ± 2.57 | 27.33 ± 1.39 | 10.60 ± 0.57 |
| | | | P279 | 89.58 ± 6.13 | 36.28 ± 2.31 | 10.00 ± 0.14 |
| | | | E280 | 101.87 ± 5.07 | 74.98 ± 5.82 | 17.69 ± 1.29 |
| | [285-297] | VDQLKKDLESANL | K289/K290 | 219.79 ± 4.29 | 169.06 ± 10.24 | 54.71 ± 2.63 |
| | | | L297 | 40.63 ± 3.33 | 41.59 ± 1.21 | 15.25 ± 1.43 |
| | [298-307] | TANQHSDVIC | no mod | - | - | - |
| | [306-323] | ICQSEPDDSFPSSGSVSL | no mod | - | - | - |
| | [324-329] | YEVERC | no mod | - | - | - |
| | [335-343] | TILTDHQYL | D339 | 27.26 ± 1.46 | 26.40 ± 2.22 | 9.13 ± 0.65 |
| | | | Y342 | 23.78 ± 1.60 | 28.36 ± 2.38 | 8.07 ± 0.34 |
| | [342-350] | YLERTPLCA | Y342 | 53.52 ± 4.11 | 46.01 ± 3.31 | 14.83 ± 0.21 |
| | [351-359] | ILKQKAPQQ | Q354/K355 | 80.37 ± 2.57 | 79.58 ± 5.73 | 26.98 ± 1.74 |
| | [360-366] | YRIRAKL | Y360 | 35.50 ± 1.05 | 37.99 ± 2.0 | 7.49 ± 0.51 |
| | [367-376] | RSYKPRRLFQ | Y369/K370 | 95.29 ± 2.14 | 82.30 ± 4.67 | 21.61 ± 1.07 |
| | [377-387] | SVKLHCPKCHL | P383 | 22.17 ± 0.16 | 22.68 ± 0.22 | 2.52 ± 0.21 |
| | | | K384 | 10.68 ± 0.51 | 7.46 ± 0.61 | 6.33 ± 0.49 |
| | | | H386 | 14.87 ± 1.31 | 10.69 ± 1.04 | 8.61 ± 0.57 |
| | [388-397] | LQEVPHEGDL | P392 | 28.22 ± 2.78 | 23.04 ± 2.23 | 12.5 ± 0.83 |
| | | | H393 | 26.73 ± 1.60 | 22.51 ± 1.35 | 6.40 ± 0.48 |
| | [399-405] | IIFQDGA | no mod | - | - | - |
| | [419-438] | YDSKIWTTKNQKGRKVAVHF | K422 | 177.96 ± 9.90 | 226.15 ± 7.39 | 201.93 ± 11.22 |
| | | | K427 | 169.89 ± 9.19 | 216.39 ± 12.23 | 96.56 ± 4.55 |
| | [439-449] | VKNNGILPLSN | I444 | 35.59 ± 1.32 | 30.63 ± 2.32 | 18.50 ± 1.04 |
| | | | L445/L447 | 84.62 ± 3.79 | 73.50 ± 4.85 | 38.40 ± 2.16 |
| | [439-450] | VKNNGILPLSNE | I444 | 59.29 ± 2.31 | 55.23 ± 3.48 | 29.07 ± 1.74 |
| | | | L445/L447 | 106.63 ± 3.55 | 96.81 ± 5.92 | 52.51 ± 3.23 |
| | | | P449 | 13.33 ± 0.90 | 13.76 ± 0.98 | 7.27 ± 0.47 |
| | [461-470] | SEICKLSNKF | C464 | 2670.0 ± 303.9 | 2360.0 ± 185.8 | 532.32 ± 31.48 |
| | [471-481] | NSVIPVRSGHE | H480 | 487.73 ± 25.99 | 491.64 ± 37.01 | 156.32 ± 8.58 |
| | [471-485] | NSVIPVRSGHEDLEL | H480 | 72.93 ± 4.63 | 75.86 ± 4.65 | 25.18 ± 1.84 |
| | | | E481 | 35.77 ± 2.10 | 37.39 ± 2.67 | 22.80 ± 1.42 |
| | [486-493] | LDLSAPFL | L488 | 543.47 ± 44.40 | 490.03 ± 47.35 | 84.56 ± 12.66 |
| | | | S489 | 213.62 ± 19.95 | 193.08 ± 23.05 | 29.37 ± 2.73 |
| | [494-509] | IQGTIHHYGCKQCSSL | I498 | 49.28 ± 3.15 | 40.90 ± 3.39 | 30.98 ± 1.34 |
| | | | H499/H500 | 18.02 ± 1.92 | 14.68 ± 1.17 | 5.18 ± 0.23 |
| | | | Y501 | 64.75 ± 3.58 | 48.32 ± 4.58 | 25.91 ± 1.94 |
| | [516-528] | NSLVDKTSWIPSS | V519 | 2180.0 ± 205.3 | 2650.0 ± 228.9 | 470.13 ± 20.67 |
| | [516-530] | NSLVDKTSWIPSSVA | L518 | 196.44 ± 22.56 | 187.77 ± 17.0 | 54.87 ± 3.33 |
| | | | V519 | 1830.0 ± 208.0 | 2080.0 ± 239.58 | 431.08 ± 27.37 |
| | [531-539] | EALGIVPLQ | L533 | 145.49 ± 9.77 | 162.53 ± 12.92 | 35.25 ± 1.51 |
| | [533-539] | LGIVPLQ | V536 | 34.67 ± 1.15 | 29.16 ± 1.71 | 12.32 ± 0.38 |
| | [546-555] | FTLDDGTGVL | T552 | 7.67 ± 0.43 | 8.79 ± 0.62 | 2.88 ± 0.19 |
| | | | V554 | 30.73 ± 0.69 | 30.56 ± 2.26 | 9.90 ± 0.73 |
| | [558-565] | YLMDSDKF | Y558 | 126.35 ± 11.52 | 139.91 ± 14.25 | 67.91 ± 5.76 |
| | | | M560 | 204.26 ± 11.89 | 231.44 ± 14.00 | 66.33 ± 5.21 |
| | [566-572] | FQIPASE | Q567 | 10.09 ± 0.30 | 9.90 ± 0.69 | 3.02 ± 0.21 |
| | | | I568 | 9.32 ± 0.30 | 6.95 ± 0.57 | 3.56 ± 0.27 |
| | | | P569 | 32.33 ± 1.26 | 29.29 ± 1.89 | 10.75 ± 0.77 |
| | [573-579] | VLMDDDL | M575 | 1320.0 ± 48.37 | 1440.0 ± 91.68 | 806.17 ± 43.48 |
| | [585-589] | MIMDM | M585 | 1120.0 ± 73.47 | 1320.0 ± 88.49 | 693.45 ± 81.89 |
| | [589-599] | MFCPPGIKIDA | M589 | 414.79 ± 24.75 | 518.58 ± 47.82 | 272.45 ± 9.27 |
| | [590-599] | FCPPGIKIDA | P593 | 27.91 ± 1.04 | 27.22 ± 2.68 | 8.30 ± 0.71 |
| | | | K596 | 33.51 ± 2.06 | 34.86 ± 3.11 | 13.20 ± 0.86 |
| | [607-619] | IKSYNVTNGTDNQ | Y610 | 19.86 ± 0.57 | 17.05 ± 0.89 | 8.73 ± 0.35 |
| | | | Q619 | 29.57 ± 1.05 | 24.79 ± 1.23 | 11.33 ± 0.47 |
| | [607-621] | IKSYNVTNGTDNQIC | C621 | 250.81 ± 11.98 | 197.20 ± 12.60 | 92.76 ± 6.79 |
| | [623-628] | QIFDTT | F625 | 90.66 ± 3.79 | 69.55 ± 6.26 | 52.77 ± 4.70 |
| TPP1 | [94-101] | VLRPWIRE | L95 | 42.87 ± 3.80 | 38.52 ± 1.93 | 20.62 ± 1.09 |
| | | | P97 | 90.33 ± 2.92 | 66.44 ± 3.44 | 58.84 ± 3.45 |
| | | | W98 | 43.88 ± 3.73 | 48.60 ± 2.88 | 27.08 ± 2.58 |
| | [96-101] | RPWIRE | P97 | 67.96 ± 3.04 | 65.68 ± 5.50 | 54.93 ± 2.88 |
| | | | W98 | 43.90 ± 1.18 | 41.65 ± 3.04 | 22.76 ± 1.42 |
| | [102-117] | LILGSETPSSPRAGQL | L102 | 19.46 ± 1.61 | 19.83 ± 1.07 | 24.74 ± 2.94 |
| | | | L104 | 76.87 ± 6.18 | 75.48 ± 3.44 | 62.33 ± 7.14 |
| | | | P109 | 135.22 ± 8.02 | 128.90 ± 9.69 | 106.48 ± 5.25 |
| | [105-117] | GSETPSSPRAGQL | P109 | 142.54 ± 5.32 | 114.94 ± 8.16 | 93.42 ± 2.92 |
| | [127-141] | AVAGPSHAPDTSDVG | P131 | 96.28 ± 6.78 | 118.59 ± 12.13 | 107.39 ± 7.45 |
| | | | H133 | 44.02 ± 4.62 | 52.46 ± 4.45 | 25.65 ± 1.78 |
| | [145-155] | LVSDGTHSVRC | V146 | 28.94 ± 1.97 | 23.90 ± 1.79 | 15.77 ± 0.92 |
| | | | H151 | 32.14 ± 2.36 | 27.91 ± 2.09 | 9.98 ± 0.45 |
| | [172-181] | FGFRGTEGRL | F172 | 104.92 ± 5.81 | 89.90 ± 5.56 | 72.11 ± 3.32 |
| | [182-192] | LLQDCGVHVQ | no mod | - | - | - |
| | [202-206] | FYLQV | F202/Y203 | 157.92 ± 8.20 | 114.05 ± 9.38 | 67.06 ± 5.94 |
| | [203-209] | YLQVDRF | Y203 | 31.85 ± 1.94 | 26.89 ± 0.74 | 15.46 ± 0.51 |
| | | | D207 | 11.90 ± 1.03 | 9.03 ± 0.50 | 4.69 ± 0.40 |
| | | | R208 | 72.29 ± 3.75 | 53.47 ± 4.51 | 29.62 ± 1.39 |
| | [213-228] | PTEQPRLRVPGCNQDL | C224 | 284.60 ± 15.94 | 314.16 ± 19.57 | 171.21 ± 8.71 |
| | [229-238] | DVQKKLYDCL | Y235 | 78.0 ± 5.59 | 72.97 ± 4.91 | 65.26 ± 3.67 |
| | | | C237 | 77.63 ± 5.85 | 65.19 ± 6.33 | 27.28 ± 2.05 |
| | [239-254] | EEHLSESTSSNAGLSL | no mod | - | - | - |
| | [261-271] | MREDQEHQGAL | M261 | 331.3 ± 19.17 | 621.11 ± 40.87 | 238.09 ± 11.64 |
| | | | H267 | 35.76 ± 3.14 | 39.23 ± 1.53 | 7.83 ± 0.43 |
| | [280-293] | TLEGPCTAPPVTHW | L281 | 17.33 ± 1.34 | 17.38 ± 1.15 | 8.90 ± 0.74 |
| | | | P284 | 54.67 ± 2.64 | 65.45 ± 5.09 | 27.94 ± 2.63 |
| | | | C285 | 35.97 ± 2.50 | 38.80 ± 3.07 | 12.07 ± 1.27 |
| | | | P288 | 23.76 ± 2.16 | 27.77 ± 2.08 | 18.39 ± 1.39 |
| | | | H292 | 11.14 ± 0.81 | 11.70 ± 0.85 | 5.73 ± 0.57 |
| | [294-305] | AASRCKATGEAV | C298 | 1000.0 ± 46.59 | 650.46 ± 56.19 | 2300.0 ± 166.26 |
| | [306-312] | YTVPSSM | M312 | 117.07 ± 7.98 | 151.85 ± 6.21 | 60.27 ± 2.47 |
| | [306-313] | YTVPSSML | M312 | 1840.0 ± 55.04 | 2160.0 ± 162.0 | 1030.0 ± 57.59 |
| | [318-323] | NDQLIL | L321 | 290.62 ± 16.88 | 297.90 ± 16.76 | 111.32 ± 5.71 |
| | | | I322 | 263.50 ± 9.79 | 287.61 ± 20.89 | 111.87 ± 4.47 |
| | [324-334] | SSLGPCQRTQG | P328 | 15.13 ± 0.59 | 15.40 ± 0.44 | 5.33 ± 0.16 |
| | | | C329 | 35.28 ± 2.50 | 39.31 ± 2.67 | 21.69 ± 1.34 |
| | | | R331 | 44.98 ± 3.05 | 43.39 ± 3.56 | 21.80 ± 1.88 |


Table 2.4 Summary of hydroxyl radical modification ratios for residues in the comparison of PT/PT-hT12 and PT-hT12/PT-hT72.

| Protein | Residue number | Residue | Normalized modification ratio PT/PT-hT12 | Normalized modification ratio PT-hT12/PT-hT72 |
|---|---|---|---|---|
| POT1 | 1 | M | 0.80 | 0.94 |
| | 3 | L | 0.95 | 1.20 |
| | 9 | Y | 0.99 | 1.21 |
| | 10 | I | 0.83 | 1.29 |
| | 18 | K | 0.81 | 1.82 |
| | 22 | I | 0.70 | 1.29 |
| | 36 | Y | 0.89 | 2.14 |
| | 37 | L | 0.56 | 1.68 |
| | 39 | K | 0.65 | 2.02 |
| | 55 | V | 0.71 | 2.09 |
| | 56 | K | 0.74 | 1.46 |
| | 66 | Y | 3.03 | 1.69 |
| | 71 | I | 0.74 | 1.08 |
| | 73 | Y | 4.86 | 1.34 |
| | 89 | Y | 0.71 | 1.55 |
| | 90 | K | 0.71 | 1.55 |
| | 91 | K | 0.73 | 1.66 |
| | 110 | L | 1.07 | 1.30 |
| | 113 | P | 0.86 | 0.64 |
| | 115 | I | 0.72 | 0.95 |
| | 121 | K | 1.02 | 0.47 |
| | 122 | Y | 0.90 | 1.10 |
| | 130 | H | 1.50 | 1.18 |
| | 132 | M | 0.75 | 0.55 |
| | 143 | H | 0.93 | 1.01 |
| | 144 | M | 1.16 | 1.68 |
| | 154 | C | 1.31 | 1.08 |
| | 159 | M | 0.75 | 1.24 |
| | 169 | L | 1.75 | 0.63 |
| | 171 | K | 1.25 | 0.93 |
| | 173 | E | 1.08 | 1.48 |
| | 174 | V | 0.94 | 2.06 |
| | 177 | A | 0.84 | 1.57 |
| | 184 | W | 0.85 | 1.02 |
| | 190 | P | 0.87 | 0.54 |
| | 191 | F | 0.59 | 1.74 |
| | 194 | W | 1.03 | 1.36 |
| | 197 | L | 1.02 | 1.78 |
| | 209 | H | 0.92 | 0.84 |
| | 211 | H | 0.93 | 1.41 |
| | 223 | Y | 0.87 | 2.07 |
| | 228 | H | 1.46 | 1.64 |
| | 233 | L | 1.75 | 1.33 |
| | 234 | K | 1.15 | 1.76 |
| | 241 | I | 0.83 | 1.29 |
| | 242 | Y | 9.74 | 0.72 |
| | 251 | M | 0.71 | 1.23 |
| | 266 | H | 6.65 | 0.20 |
| | 271 | Y | 1.47 | 2.92 |
| | 277 | V | 1.90 | 0.99 |
| | 279 | P | 2.42 | 1.17 |
| | 280 | E | 1.34 | 1.45 |
| | 289 | K | 1.27 | 1.09 |
| | 290 | K | 1.27 | 1.09 |
| | 297 | L | 0.95 | 0.96 |
| | 339 | D | 1.01 | 1.02 |
| | 342 | Y | 0.96 | 1.17 |
| | 354 | Q | 0.98 | 1.04 |
| | 355 | K | 0.98 | 1.04 |
| | 360 | Y | 0.91 | 1.79 |
| | 369 | Y | 1.13 | 1.35 |
| | 370 | K | 1.13 | 1.35 |
| | 383 | P | 0.95 | 3.18 |
| | 384 | K | 1.39 | 0.42 |
| | 386 | H | 1.35 | 0.44 |
| | 392 | P | 1.19 | 0.65 |
| | 393 | H | 1.16 | 1.24 |
| | 422 | K | 0.77 | 0.40 |
| | 427 | K | 0.76 | 0.79 |
| | 444 | I | 1.09 | 0.63 |
| | 445 | L | 1.10 | 0.66 |
| | 447 | L | 1.10 | 0.66 |
| | 449 | N | 0.94 | 0.67 |
| | 464 | C | 1.10 | 1.57 |
| | 480 | H | 0.95 | 1.09 |
| | 481 | E | 0.93 | 0.58 |
| | 488 | L | 1.08 | 2.05 |
| | 489 | S | 1.08 | 2.33 |
| | 498 | I | 1.17 | 0.47 |
| | 499 | H | 1.20 | 1.00 |
| | 500 | H | 1.20 | 1.00 |
| | 501 | Y | 1.30 | 0.66 |
| | 518 | L | 1.02 | 1.21 |
| | 519 | V | 0.83 | 1.85 |
| | 533 | L | 0.87 | 1.63 |
| | 536 | V | 1.16 | 0.84 |
| | 552 | T | 0.85 | 1.08 |
| | 554 | V | 0.98 | 1.09 |
| | 558 | Y | 0.88 | 0.73 |
| | 560 | M | 0.86 | 1.23 |
| | 567 | Q | 0.99 | 1.16 |
| | 568 | I | 1.31 | 0.69 |
| | 569 | P | 1.07 | 0.96 |
| | 575 | M | 0.89 | 0.63 |
| | 585 | M | 0.83 | 0.67 |
| | 589 | M | 0.78 | 0.67 |
| | 593 | P | 1.00 | 1.16 |
| | 596 | K | 0.94 | 0.93 |
| | 610 | Y | 1.13 | 0.69 |
| | 619 | Q | 1.16 | 0.77 |
| | 621 | C | 1.24 | 0.75 |
| | 625 | F | 1.27 | 0.47 |
| TPP1 | 95 | L | 1.08 | 0.66 |
| | 97 | P | 1.15 | 0.41 |
| | 98 | W | 0.95 | 0.64 |
| | 102 | L | 0.96 | 0.28 |
| | 104 | L | 0.99 | 0.43 |
| | 109 | P | 1.11 | 0.43 |
| | 131 | P | 0.79 | 0.39 |
| | 133 | H | 0.82 | 0.72 |
| | 146 | V | 1.18 | 0.54 |
| | 151 | H | 1.12 | 0.99 |
| | 172 | F | 1.14 | 0.44 |
| | 202 | F | 1.35 | 0.60 |
| | 203 | Y | 1.25 | 0.61 |
| | 207 | D | 1.28 | 0.68 |
| | 208 | R | 1.32 | 0.64 |
| | 224 | C | 0.88 | 0.65 |
| | 235 | Y | 1.04 | 0.40 |
| | 237 | C | 1.16 | 0.85 |
| | 261 | M | 0.52 | 0.92 |
| | 267 | H | 0.89 | 1.77 |
| | 281 | L | 0.97 | 0.69 |
| | 284 | P | 0.81 | 0.83 |
| | 285 | C | 0.90 | 1.14 |
| | 288 | P | 0.83 | 0.53 |
| | 292 | H | 0.93 | 0.72 |
| | 298 | C | 1.50 | 0.10 |
| | 312 | M | 0.79 | 0.81 |
| | 321 | L | 0.95 | 0.95 |
| | 322 | I | 0.89 | 0.91 |
| | 328 | P | 0.96 | 1.02 |
| | 329 | C | 0.87 | 0.64 |
| | 331 | R | 1.01 | 0.70 |
| *normalization factor | | | 1.03 | 2.83 |
| †mean-3SD | | | 0.33 | 0.19 |
| †mean-2SD | | | 0.49 | 0.33 |
| †mean | | | 1.05 | 0.94 |
| †mean+2SD | | | 2.26 | 2.71 |
| †mean+3SD | | | 3.31 | 4.60 |

*The normalization factor is determined as the average of mean and median from all identified ratios of modification rates before normalization in log10 scale, then converted back to linear scale.


† mean and SD are also calculated based on normalized modification rate ratios in log10 scale, then mean±2SD and mean±3SD are determined, and finally converted back to linear scale.


Chapter 3 The telomere POT1-TPP1 complex promotes G-quadruplex destabilization using both active and passive mechanisms

The study in this chapter has been submitted for publication:

Mengyuan Xu, Armend Axhemi, Magdalena Malgowska, Yinghua Chen, Daniel Leonard, Sukanya Srinivasan, Eckhard Jankowsky, and Derek J. Taylor


Abstract

DNA topology and three-dimensional dynamics are critical regulatory elements of genome stability, genetic preservation, and transcriptional regulation. The guanine-rich sequences defining the protective repeats of telomere DNA are prone to forming highly stable secondary structures known as G-quadruplexes (GQs). While these highly stable GQ structures function to protect their DNA components, this inherent protection also limits access of telomere DNA to the central regulatory enzyme, telomerase, which presents a major conundrum based on our current understanding of telomere maintenance. POT1-TPP1 is a specialized heterodimeric complex that specifically interacts with telomere single-stranded DNA. While POT1-TPP1 is known to destabilize GQ structures, the molecular mechanism and kinetic properties dictating GQ destabilization upon POT1-TPP1 binding are poorly understood. We present a detailed kinetic model to demonstrate that GQ formation is in fact a dynamic process that is directly influenced by the local milieu of POT1-TPP1. In the presence of low concentrations of POT1-TPP1, GQ structures undergo passive disassembly followed by binding of POT1-TPP1 to the unfolded DNA. Alternatively, under high concentrations of POT1-TPP1, there is a dramatically rate-favored active opening process driven by the binding of two POT1-TPP1 heterodimers, which function to fully destabilize GQ structures. Furthermore, cancer-related mutations within the DNA binding interface of POT1-TPP1 impair and exploit the GQ binding and destabilization dynamics elucidated in our work. This insight into inherent GQ dynamics, along with a protein-regulated mechanism of GQ binding and destabilization, illustrates the mechanisms employed to regulate DNA topology and thus influence genomic stability, genetic preservation, and transcriptional activity.


Introduction

In addition to the finite message established by the genetic sequence, the structural topology and conformational dynamics of DNA play a critical role in regulating nearly all cellular processes (102). One regulatory topology found throughout the genome is a highly stable structure known as a G-quadruplex (GQ) (103-105). These GQ structures have been shown to regulate a variety of cellular processes including transcription (106-108), DNA recombination (109), and most notably, telomere maintenance (75, 88). Telomeres represent a specialized nucleoprotein complex found at the ends of all eukaryotic chromosomes (110). Telomere DNA, consisting of tandem G-rich repeats (d(TTAGGG) in mammals) that end in a single-stranded DNA (ssDNA) overhang, is vital for successive rounds of cellular replication and DNA preservation (8, 66). Importantly, the G-rich sequences found at telomeres are prone to form these highly stable GQ structures (91, 111-114). GQs are composed of multiple stacked G-tetrads, in which the guanines are held together by Hoogsteen hydrogen bonds (115, 116). The geometry and stability of GQ structures vary based on the nature and size of the intercalating cations that support the G-tetrads. For instance, lithium ions have destabilizing effects, whereas potassium ions promote the formation of highly thermodynamically stable GQ structures (117, 118).

The inherent stability of GQ structures found at telomeres presents a major hurdle for telomere elongation as they inhibit the binding of telomerase (75). Overcoming this hurdle, POT1-TPP1 is a specialized protein heterodimer that specifically binds to telomere ssDNA overhangs and destabilizes GQ structures (37, 39, 43, 76, 77).


Although POT1-TPP1 is known to destabilize GQ structures to make the ssDNA more accessible to enzymes such as telomerase (77, 88, 89), the kinetic properties of POT1-TPP1 in telomere GQ binding and destabilization have yet to be characterized. Additionally, more than 300 single-nucleotide polymorphisms (SNPs) within the POT1 gene have been identified in varying types of cancer (53, 55, 67, 69, 119). The impact of POT1-TPP1 on telomere maintenance and its apparent aberration in human diseases highlight the importance of detailed molecular characterization of how POT1-TPP1 functions to destabilize GQ structures. To understand the kinetic properties of POT1-TPP1 destabilization of GQs, we utilized a previously characterized telomere ssDNA sequence known to adopt a specific, homogeneous GQ topology (120). Through the use of biophysical techniques capable of investigating dynamic changes in real time, we determined that the folding and unfolding of telomere ssDNA and GQs is highly dynamic. Additionally, the kinetic binding parameters from surface plasmon resonance (SPR), coupled with stopped-flow measurements of GQ destabilization, allow us to present a detailed kinetic model of POT1-TPP1 actions on GQ complexes. This model illustrates how the binding of two POT1-TPP1 molecules is required to fully destabilize GQs and how both passive and active pathways contribute to GQ destabilization. Furthermore, the cancer-related Q94R and H266L POT1 mutants exhibit significant defects in telomere GQ binding and/or destabilization, leading to significant alterations in telomere ssDNA protection and telomere maintenance. This insight into the dynamic nature of GQ stability and regulation provides vital clues into how these structures function to protect telomeres while allowing selective maintenance and preservation by telomere nucleoprotein complexes such as POT1-TPP1.

Results

The GQs at telomeres form diverse and heterogeneous topologies that are

dependent on DNA sequence, overhang length, and monovalent cation

composition (112). To avoid complications associated with structural heterogeneity,

we used the hT22 telomere ssDNA (TAGGG(TTAGGG)2TTAGG) for our kinetic

investigations, as it is one of the most homogenous and well characterized

telomere GQ topologies (120). Circular dichroism (CD) spectra confirmed that in

buffer containing 90 mM K+ the hT22 DNA substrate maintains its antiparallel GQ

morphology (120), which is unaffected by the incorporation of a 5’ hexathymidine

linker (Figure 3.1A). The 5’ linker was introduced to prevent steric interactions

between the telomere ssDNA and the immobilization surface used for subsequent

protein-DNA binding kinetics. First, the folding of hexathymidine hT22 (6ThT22)

into its GQ structure was performed by rapidly mixing unfolded 6ThT22 with

increasing concentrations of potassium buffer in a stopped-flow device. 6ThT22

was maintained in an unfolded state in a solution devoid of monovalent cations

(117). KCl was titrated into the reaction to promote formation and stacking of G-

quartets, which was monitored over time by changes in CD ellipticity at 295 nm

(92, 121). These data were used to generate a model (Figure 3.1B-C) that

determined GQ folding and unfolding rate constants of 0.05245 ± 0.00005 mM-1 s-1 and 0.73902 ± 0.02103 s-1, respectively. As the 6ThT22 GQ structure has been best characterized under cation conditions containing 90 mM K+, we used our data and model to determine the 6ThT22 GQ folding rate at 90 mM K+ to be 4.72063 ± 0.00471 s-1 and maintained 90 mM K+ for all subsequent studies.
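The relationship between the fitted rate constants and the folding rate at 90 mM K+ can be checked with a short calculation. The sketch below assumes a simple two-state scheme in which folding is first order in K+ (consistent with the mM-1 s-1 units of the folding rate constant); the model in Figure 3.1D may contain additional detail, and the function names are illustrative only.

```python
K_FOLD = 0.05245    # mM^-1 s^-1, folding rate constant reported in the text
K_UNFOLD = 0.73902  # s^-1, unfolding rate constant reported in the text


def pseudo_first_order_folding_rate(potassium_mM):
    """Folding rate (s^-1) at a fixed K+ concentration: k_fold * [K+]."""
    return K_FOLD * potassium_mM


def fraction_folded_at_equilibrium(potassium_mM):
    """Equilibrium folded fraction under the assumed two-state scheme."""
    k_f = pseudo_first_order_folding_rate(potassium_mM)
    return k_f / (k_f + K_UNFOLD)


rate_90 = pseudo_first_order_folding_rate(90.0)      # about 4.72 s^-1
folded_90 = fraction_folded_at_equilibrium(90.0)     # roughly 0.86 in this scheme
```

With the rounded constants quoted in the text, the product is 4.7205 s-1, which agrees with the reported 4.72063 s-1 to within rounding. Under the two-state assumption, a minority of molecules remains unfolded at 90 mM K+, which is the population that a passive capture pathway would act on.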


Figure 3.1 6ThT22 telomere ssDNA forms antiparallel GQ structures in K+.

A. CD spectrum of 6ThT22 in K+ highlights signature peak (295 nm) and valley (260 nm) indicative of antiparallel GQ topology. B. Time traces of ellipticity at 295 nm to monitor the folding of 6ThT22 into GQ structures with increasing concentrations of K+. C. GQ formation of 6ThT22 structure, as monitored in changes at 295 nm ellipticity, as a function of potassium concentration. D. The model for 6ThT22 GQ formation upon mixing with potassium.


An individual POT1 protein or POT1-TPP1 heterodimer binds to a ten-nucleotide span of telomere ssDNA (43, 79, 94, 122). As such, the 6ThT22 substrate contains two consecutive POT1-TPP1 binding motifs. In order to distinguish protein binding from GQ unfolding, we first sought to determine binding affinities of the POT1-TPP1 heterodimer for the folded and unfolded 6ThT22. 6ThT22 samples were prepared in either K+- or Li+-based buffers and the secondary structure was confirmed by UV melting and size-exclusion chromatography analysis (Figure 3.2A-B). As is common for GQ-forming sequences (123), 6ThT22 folds into a GQ structure in K+ but remains unfolded in Li+ buffer. Surface plasmon resonance

(SPR) was employed to characterize the kinetic properties between POT1-TPP1

and 6ThT22 binding under conditions that prevent or promote GQ folding. The

sensorgrams recorded in Li+ supported a sequential, two-site protein binding

model (Figure 3.3). The kinetic data indicate that the first protein-binding event is

characterized by a relatively fast association rate and a slow dissociation rate,

whereas the opposite is true for the second binding event (Table 3.1). These

results demonstrate that the initial POT1-TPP1 binding has a high affinity for the

unfolded 6ThT22 whereas the second binding event has a much lower affinity.

These data were further corroborated by electrophoretic mobility shift assays

(EMSAs; Figure 3.2C and Table 3.2) as well as previously published studies (38, 76).
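The sequential two-site description above can be made concrete with a small equilibrium calculation. The rate constants in this sketch are hypothetical placeholders chosen only to mimic a tight first site and a weaker second site; they are not the fitted values of Table 3.1. The sketch also assumes that protein is in excess, so the free protein concentration is approximately the total.

```python
def dissociation_constant(k_on, k_off):
    """Equilibrium dissociation constant K_d = k_off / k_on (M)."""
    return k_off / k_on


def species_fractions(protein_nM, kd1_nM, kd2_nM):
    """Fractions of free, singly bound, and doubly bound DNA for sequential binding."""
    singly = protein_nM / kd1_nM
    doubly = singly * protein_nM / kd2_nM
    total = 1.0 + singly + doubly
    return 1.0 / total, singly / total, doubly / total


# Hypothetical rate constants (illustration only, not measured values).
KON1, KOFF1 = 1.0e6, 1.0e-3  # M^-1 s^-1 and s^-1: fast on, slow off
KON2, KOFF2 = 1.0e5, 1.0e-2  # M^-1 s^-1 and s^-1: slower on, faster off

kd1_nM = dissociation_constant(KON1, KOFF1) * 1e9  # about 1 nM
kd2_nM = dissociation_constant(KON2, KOFF2) * 1e9  # about 100 nM

free, single, double = species_fractions(10.0, kd1_nM, kd2_nM)
```

With these placeholder values, most DNA carries a single protein at 10 nM, and the doubly bound species only becomes dominant once the protein concentration exceeds the second dissociation constant.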


Figure 3.2 hT22 forms G-quadruplexes in K+ but not Li+ buffer conditions.

A. UV-melting profile of hT22. Absorbance was monitored at 295 nm in 90 mM Li+ or K+ buffer as a function of increasing temperature from 10 to 90 °C. B. Normalized chromatograms of hT22 and a 30-mer control oligo analyzed by size-exclusion HPLC in Li+ or K+ buffer. Unlike the 30-mer, which maintains the same shape in Li+ or K+ buffer, hT22 elutes at different volumes in the different cation-based buffers. This difference is due to changes in shape from unstructured (Li+) to GQ (K+). C. Quantification of EMSA data presented in Figure 3.4A (solid dots and line) compared to simulated data (dashed line).


Figure 3.3 POT1-TPP1 binds unfolded 6ThT22 with two distinct binding affinities.

A. EMSA for POT1-TPP1 binding to unfolded 6ThT22 under equilibrium binding conditions in Li+ buffer, using 1 nM 6ThT22 with 0-102 nM POT1-TPP1 protein. B. Sensorgrams describing the interactions of immobilized 6ThT22 with POT1-TPP1 proteins in Li+ buffer. C. Model used for fitting SPR data and for calculation of individual rate constants (as shown).


We next investigated the kinetic binding properties of POT1-TPP1 for the structured, GQ-folded 6ThT22 ssDNA substrate. EMSA and SPR analysis of POT1-TPP1 binding to pre-formed GQ structures, along with stopped-flow CD analysis (changes in ellipticity at 295 nm), were used to distinguish POT1-TPP1 binding from POT1-TPP1-induced GQ opening (Figure 3.4A-D). Interestingly, unlike the interactions with unfolded ssDNA (Figure 3.3B), the binding of two POT1-TPP1 proteins to preformed 6ThT22 GQ structures is characterized by a fast initial dissociation phase followed by a prolonged secondary dissociation phase. The loss of ellipticity at 295 nm indicates the opening of GQ upon POT1-TPP1 binding. Based on these data, we tested three potential models, only one of which fit all of the experimental data with sufficient correlation (Figures 3.4 and 3.5). This best-fit model (Figure 3.4E) accurately fits both the SPR binding data and the GQ opening data. Moreover, the molar ratio plot for GQ destabilization by POT1-TPP1 further indicates that two POT1-TPP1 molecules are required for complete opening of GQs (Figure 3.4C inset). Additionally, the equilibrium dissociation constants for POT1-TPP1 binding to unfolded DNA in K+ conditions are comparable to those determined in Li+ conditions, which further validates this model as the best-fit model (compare Table 3.2 and 3.4). The association rate constants of POT1-TPP1 binding to GQ (k4 = first binding event and k5 = second binding event) and unfolded DNA (k1 = first binding event and k2 = second binding event) demonstrate that POT1-TPP1 binds to the GQ state faster than it does to the unfolded state. More importantly, dissociation of POT1-TPP1 from GQ is significantly faster (k-1 vs. k-4 and k-2 vs. k-5), resulting in lower affinities of POT1-TPP1 for GQ-folded 6ThT22. Moreover, the rate constant for active opening of GQs with two bound POT1-TPP1 complexes (k6) is 550-fold higher than the reverse rate constant (k-6). This indicates that active opening of GQs by the two bound POT1-TPP1 complexes is highly energetically favorable, which further supports the notion of combined passive and active pathways in GQ opening.
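The 550-fold ratio between k6 and k-6 can be translated into an approximate free energy difference for the opening step using ΔG = -RT ln(k6/k-6). The sketch below is only an order-of-magnitude illustration: the temperature of 25 °C is an assumption of this example, and the function name is hypothetical.

```python
import math

GAS_CONSTANT_KCAL = 1.987204e-3  # kcal mol^-1 K^-1


def delta_g_from_rate_ratio(forward_over_reverse, temperature_k=298.15):
    """Free energy change (kcal/mol) of a step from its forward/reverse rate ratio."""
    return -GAS_CONSTANT_KCAL * temperature_k * math.log(forward_over_reverse)


# k6 / k-6 = 550 as stated in the text; 25 degrees C is assumed here.
delta_g_opening = delta_g_from_rate_ratio(550.0)  # roughly -3.7 kcal/mol
```

A ratio of 550 therefore corresponds to a few kcal/mol in favor of the opened, doubly bound state, which is the sense in which the active opening step is energetically favorable.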


Figure 3.4 POT1-TPP1 binds and destabilizes 6ThT22 telomere GQs.

A. Stoichiometric binding assays of POT1-TPP1 with 6ThT22 in K+ buffer reveal two binding events. B. Experimental and modeled sensorgrams to describe the interactions of immobilized 6ThT22 with POT1-TPP1 proteins in K+ buffer. C. Time course of changes in ellipticity for 6ThT22 with and without mixing with POT1-TPP1 protein. Inset plot of 295 nm amplitude changes measured against molar ratio of protein:oligo indicates that complete GQ unfolding of 6ThT22 requires two proteins. D. Correlation between EMSA experimental data and data simulated using the model. E. Model used for comprehensive fitting of protein binding and GQ unfolding data.


Figure 3.5 Potential models for POT1-TPP1 destabilization of telomere G-

quadruplexes.

A. The model represents GQ opened by POT1-TPP1 capturing the unstructured

DNA, driving the equilibrium from G4 to unstructured DNA. B. Experimental and

modeled (with model in panel A) sensorgrams to describe the interactions of

immobilized 6ThT22 with POT1-TPP1 proteins in K+ buffer. C. Time course for

changes of ellipticity at 295 nm of 6ThT22 without or mixing with POT1-TPP1. Solid

lines are simulated data using the model in panel A, which fails in fitting both the

SPR data and G4 opening data obtained from K+ conditions. D, E, and F are the same as A, B, and C but use the fitting model illustrated in panel D, in which one bound POT1-TPP1 is required for actively opening GQ. This model fails to simulate the GQ opening data. G. Quantitative evaluation of the different models. Chi-squared values obtained from global data simulations with Kintek Explorer, for the entire model or for the GQ opening assay subsets, were normalized to the best fit in each data subset. A value of 1 indicates the best fit.


In order to understand these kinetic parameters within the physiologically dynamic

protein concentration milieus of the nucleus, we next investigated the free energy

landscape for the POT1-TPP1-GQ association under varying POT1-TPP1

concentrations (Figure 3.6). At low protein concentrations (10 nM) the POT1-

TPP1-GQ bound states are energetically unfavorable which discourages POT1-

TPP1 binding to GQs. However, the dynamic equilibrium between unfolded DNA and GQ allows a small population of unfolded DNA to exist even in the presence of K+. The tight interaction between POT1-TPP1 and unfolded DNA leads to an energetically favorable stabilization of opened DNA upon POT1-TPP1 binding, thus partially opening GQs. Conversely, at high protein concentrations (500 nM), binding of POT1-TPP1 to the GQ becomes energetically favorable. The favorable

binding of POT1-TPP1 to the GQ complex promotes active deconstruction of the

GQ to ensue as demonstrated by the large free energy gap between the structured

and unfolded DNA states (3.72 kcal/mol) when POT1-TPP1 is bound. Thus, at high

concentrations of POT1-TPP1, the main driver for GQ opening becomes the

energetically favorable, active effects of POT1-TPP1 binding to GQ ssDNA.
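The concentration dependence described here can be illustrated with a simplified equilibrium calculation. This is only a sketch and not the Eyring-based procedure used to build Figure 3.6: it assumes the textbook relations ΔG = −RT ln([protein]/KD) for a binding step and ΔG = −RT ln(K) for a unimolecular step, and takes the wild-type KD3 and Kopen values from Table 3.4. The function names are hypothetical.

```python
import math

R_KCAL = 1.98e-3  # gas constant in kcal mol^-1 K^-1 (value used in this chapter)
T = 298.0         # temperature in K

KD3_NM = 32.0     # wild-type KD for the first POT1-TPP1 binding to GQ (Table 3.4)
K_OPEN = 546.0    # wild-type equilibrium constant for the opening step (Table 3.4)


def binding_dg(protein_nm, kd_nm):
    """Free energy change (kcal/mol) of a binding step at a given protein concentration."""
    return -R_KCAL * T * math.log(protein_nm / kd_nm)


def opening_gap(k_open):
    """Size (kcal/mol) of the free energy gap between folded and unfolded two-protein states."""
    return R_KCAL * T * math.log(k_open)


dg_low = binding_dg(10.0, KD3_NM)    # 10 nM protein: positive, i.e. unfavorable
dg_high = binding_dg(500.0, KD3_NM)  # 500 nM protein: negative, i.e. favorable
gap = opening_gap(K_OPEN)

print(f"GQ binding at 10 nM:  {dg_low:+.2f} kcal/mol")
print(f"GQ binding at 500 nM: {dg_high:+.2f} kcal/mol")
print(f"folded-to-unfolded gap with two proteins bound: {gap:.2f} kcal/mol")
```

Under these assumptions the sign of the binding term flips between 10 nM and 500 nM, and RT ln(Kopen) evaluates to about 3.72 kcal/mol, matching the gap quoted above.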


Figure 3.6 POT1-TPP1 complex promotes G-quadruplex destabilization using both active and passive mechanisms.

A. Free energy landscape of POT1-TPP1-DNA of different bound states at low protein concentration (10nM). The black arrow indicates the passive capture pathway. B. Free energy landscape of POT1-TPP1-DNA of different bound states at high protein concentration (500nM). The black arrow indicates the active destabilization pathway.


Interestingly, POT1 mutations have been observed in various human diseases,

including cancer. Given our kinetic observations, we next asked what effect

pathogenic POT1 mutations have in telomere GQ binding and destabilization. Two

highly suspicious POT1 mutations, Q94R and H266L (55), reside within the

tandem DNA binding folds (OB1 and OB2, respectively) of POT1 and occur at

residues that directly interact with a guanine of telomere DNA (Figure 3.7A) (43).

To investigate the effect of these mutations in POT1-TPP1-unfolded DNA binding

ability, we performed the previously described Li+ buffer EMSA and SPR assays

using these two mutants (Figure 3.7). The data suggest that both mutations greatly impair the binding properties of POT1-TPP1 with unfolded ssDNA (Table 3.1 and

3.2). With a clear picture of how these mutations affect POT1 function on unfolded ssDNA, we next set out to determine how they might impact POT1-TPP1 binding and destabilization of GQ structures. We again performed SPR and GQ opening experiments in K+ buffer conditions as described above, but now with the two POT1 mutants, Q94R and H266L (Figure 3.8). The kinetic parameters obtained from our global fitting indicate that the Q94R and H266L mutations impair most of the steps involved in GQ binding and destabilization by ~2-fold (Figure

3.9A, Table 3.3 and 3.4).

To better interpret these cumulative findings, we determined the effect of each mutation on the free energy landscapes of POT1-TPP1 with 6ThT22 ssDNA

(Figure 3.9B-C). These data show that binding of the Q94R POT1-TPP1 mutant was dramatically less favorable for folded ssDNA compared to the wild-type POT1-

TPP1. This conclusion is supported by the comparable ΔΔG of the wild-type and


Q94R. For instance, the ΔΔG of two protein-folded DNA bound state between wild-

type and Q94R is 0.9 kcal/mol, which is comparable to the ΔΔG of two protein-

unfolded DNA bound state at 1.1 kcal/mol. This suggests that the Q94R mutant

mainly impairs the POT1-TPP1 binding event to the GQ structure. In contrast, the

ΔΔG of the two protein-folded DNA bound state between wild-type and H266L is only 0.3 kcal/mol, while the ΔΔG of the two protein-unfolded DNA bound state is 0.9 kcal/mol, suggesting that the H266L mutation causes a less significant deficiency in binding the GQ structure compared to wild-type protein and instead mainly affects GQ opening after two POT1-TPP1 complexes have bound (Figure 3.9D).

Interestingly, lower energy barriers were observed at the transition state between the one protein-unfolded DNA and two protein-unfolded DNA states in both mutants; nevertheless, the less favorable two protein-unfolded DNA bound states indicate that either mutation results in less efficient GQ destabilization by POT1-TPP1.


Figure 3.7 POT1-TPP1 Q94R and H266L mutants impair protein-ssDNA interactions.

A. Structure of POT1-N with hT10 ssDNA (PDB code: 1XJV). Q94 and H266 residues highlighted in green and blue, respectively. Inset displays hydrogen bonding interactions that occur between Q94 and G4 (left) and H266 and G10

(right). B. EMSA for POT1-TPP1 Q94R (left) or H266L (right) mutants binding to unfolded 6ThT22 under equilibrium binding conditions in Li+ buffer. 1nM 6ThT22 with 0-800 nM POT1-TPP1 in the case of the Q94R mutant or 1-512 nM POT1-

TPP1 in the case of the H266L mutant protein. C. Sensorgrams describing the interactions of immobilized 6ThT22 with POT1-TPP1 Q94R mutant proteins in Li+ buffer. D. Quantification of EMSA data presented in panel B left (solid dots and line) and compared to simulated data (dashed line) for Q94R mutant binding to

6ThT22 ssDNA. E. Sensorgrams describing the interactions of immobilized

6ThT22 with POT1-TPP1 H266L mutant proteins in Li+ buffer. F. Quantification of EMSA data presented in panel B right (solid dots and line) and compared to simulated data (dashed line) for H266L mutant binding to 6ThT22 ssDNA.


Figure 3.8 POT1-TPP1 mutants impair both binding and destabilization of

6ThT22 telomere GQs.

A. Stoichiometric binding assays of POT1-TPP1 Q94R (left) or H266L (right)

mutants with 6ThT22 in K+ buffer. B. Experimental and modeled sensorgrams to describe the interactions of immobilized 6ThT22 with POT1-TPP1 Q94R mutant

proteins in K+ buffer. C. Time course of changes in ellipticity measured at 295 nm

for 6ThT22 with and without mixing with POT1-TPP1 Q94R mutant protein. D. Plot

of CD ellipticity amplitude changes at 295 nm measured against molar ratio of

protein:DNA for Q94R POT1-TPP1 mutant. E. Correlation between EMSA experimental data with simulated data for Q94R POT1-TPP1 protein binding to

6ThT22 GQ using the model in Figure 3.4E. F. Time course of changes in ellipticity measured at 295 nm for 6ThT22 with and without mixing with POT1-TPP1 H266L mutant protein. G. Plot of CD ellipticity amplitude changes at 295 nm measured against molar ratio of protein:DNA for H266L POT1-TPP1 mutant. H. Correlation between EMSA experimental data with simulated data for H266L POT1-TPP1 protein binding to 6ThT22 GQ using the model in Figure 3.4E.


Figure 3.9 POT1-TPP1 pathogenic mutants differentially impair binding and

destabilization of telomere G-quadruplexes.

A. The Q94R and H266L POT1-TPP1 mutations differentially alter rate constants in GQ binding and destabilization as compared to wild-type protein. The relative


change of individual rate constants for each mutation as compared to wild-type

protein is indicated for each step in the model. Changes in values are depicted

within the lengths and color range of the individual bars for each mutant. B and C.

Free energy landscape of ΔΔG between POT1-TPP1 protein and (B) Q94R mutant or (C) H266L mutant protein. D. Comparison of the ΔΔG between wild-type POT1-

TPP1 and each mutant protein for each bound and transition state. Finite values for each step are included in the table.


Discussion

We present a detailed kinetic model, built from multiple sets of kinetic experiments,

that demonstrates GQ destabilization is a function of both a passive and active

process. Furthermore, our results indicate that the folding and unfolding of GQs is

kinetically dynamic, and highly influenced by local concentrations of POT1-TPP1.

For example, in the setting of low concentrations of POT1-TPP1, the opening of

GQ into unfolded DNA relies on a rate favored, passive, spontaneous process. As

such, the main driver of GQ opening, in this setting, must then be the passive

trapping of unfolded GQ DNA by the POT1-TPP1 heterodimers. However, the

failure of global fitting to the model in Figure 3.5A indicates that this passive opening

pathway is insufficient to fully destabilize GQ structures. Instead, two POT1-TPP1

heterodimers are required to actively, and completely, destabilize the antiparallel

GQs in more physiologic conditions (K+ containing solutions), which complements

previous studies (89). With increasing POT1-TPP1 concentration, although the passive opening of GQs continues, the dramatically rate-favored active opening of

GQs by bound POT1-TPP1 becomes the main mechanism of GQ opening.

This shift from a passive to active process of GQ opening, based on POT1-TPP1 concentration, explains how POT1-TPP1 functions to assist telomerase in recognizing telomere DNA sequences and promoting overall telomerase processivity. For instance, the formation of GQs significantly diminishes the ability of telomerase to bind or extend telomere DNA, as demonstrated previously by the increased dissociation rate constant of telomerase from telomere GQ DNA (124).

The accumulation of POT1-TPP1 at telomere DNA, subsequent to numerous


cellular stimuli, results in increased telomere concentrations of POT1-TPP1 which,

based on the data presented, would function to actively destabilize telomere GQ

to promote the binding and stabilization of telomerase.

Furthermore, our data demonstrate that the cancer-related mutations, each

occurring on different DNA binding domains (OB1, OB2), differentially impair

POT1-TPP1 binding and/or opening of GQs. The H266L mutation functions by

reducing POT1-TPP1’s ability to open the GQ structure. Alternatively, the Q94R

mutation decreases the binding affinity of POT1-TPP1 for the telomere ssDNA; but

interestingly, retains the ability to destabilize the GQ structure if binding proceeds.

Taken together, these observations suggest that these cancer-related mutations result in defective POT1-TPP1 binding/regulation of telomere ssDNA, leading to dysfunction of telomere maintenance and genomic stability. While these investigations are focused on one specific GQ topology and its relationship with telomere nucleoprotein complexes, these data highlight the need for identification and characterization of GQ binding proteins and kinetic parameters of GQ topology.

For instance, GQs with different topologies exhibit different thermodynamic and kinetic properties (125), which may alter the rates of GQ destabilization.

Additionally, the stability or instability of GQ structures located throughout the genome, in particular at promoter regions (108), is implicated in the transcriptional regulation of downstream genes. Thus, dysregulation of GQ binding partners/enzymes in either direction could conceivably result in aberrant expression of pathologic proteins or, conversely, prevent the expression of cellular

“guardians”, checkpoints, and tumor suppressors. Our results also highlight the

importance of pairing both stable and dynamic measurements of DNA topology, particularly GQ structures, to assemble a biologically relevant, dynamic model of structural DNA regulation. For example, stable, thermodynamic measurements of

GQ stability alone have led previous reports to conclude that these GQ structures are highly stable and require a significant energy barrier to be disassembled. While our results also demonstrate the inherent stability of GQ structures, the dynamic experiments presented here also capture the relatively small fraction of DNA being opened during steady state equilibrium. Furthermore, DNA and GQ structures occur in the setting of diverse subcellular microenvironments with potential interactors and regulators. While previous reports demonstrated an ATP-dependent requirement for the active opening of GQ structures by helicases (126), our data reveal that this is not the case for all mechanisms of GQ unwinding.

Complementing these static models with our kinetic measurements reveals that the stability of GQs at telomere DNA sequences is highly context dependent, suggesting the same might be true for other GQ structures throughout the genome.

Taken together, our data allow us to generate a more complete, biologically relevant model of telomere GQ stability. This model integrates dynamic GQ stability measurements, along with DNA binding nucleoprotein complexes, to provide a kinetic explanation and molecular mechanism of telomere GQ opening following POT1-TPP1 binding. While providing a framework for investigating other telomere GQ topologies, this work highlights the importance of aiming to recapitulate additional variables present in biology that most likely impact GQ structures. Investigation of additional GQ binding partners, similar to POT1-TPP1, and their role in protecting or unwinding GQ structures is critical in understanding the cellular mechanisms of genomic maintenance and transcriptional regulation.

Exhaustive characterization of these GQ topologies and regulatory mechanisms could also provide unique targets, never before imagined, for therapeutic intervention.


Material and Methods

Oligo preparation

The oligonucleotides investigated in this study were all synthesized and HPLC

purified by Integrated DNA Technologies (IDT). Unmodified oligos were used for

circular dichroism spectroscopy, UV melting, size exclusion-HPLC, GQ formation

and destabilization assays. IRD-700 labeled oligos were used for electrophoretic mobility shift assays (EMSAs), and biotinylated labeled oligos were used for surface plasmon resonance (SPR) assays.

To pre-fold GQ, the oligos were prepared in potassium phosphate-KCl buffer containing 90 mM K+ (pH 7.0), heated to 95°C for 5 min, slowly cooled to 25°C, and allowed at least 4 hrs for GQ formation before proceeding to subsequent steps.

Circular dichroism spectroscopy

Circular dichroism spectra were obtained on a PiStar 180 spectrophotometer (Applied Photophysics) using a 1 cm path length with 500 nM GQ in potassium phosphate-KCl buffer containing 90 mM K+ (pH 7.0). Scans were performed at 25 °C over a range of 240−340 nm, and each final spectrum was the smoothed average of three scans taken with a 4-nm bandwidth, 1-nm step size, and 1-s

collection time per data point. A buffer only blank was subtracted from each

spectrum with all data zero corrected at 340 nm.
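The post-processing steps described above (averaging replicate scans, subtracting a buffer blank, and zeroing at 340 nm) can be sketched as follows. This is a minimal illustration on made-up numbers, not the instrument software's procedure; the smoothing step is omitted because its method is not specified here, and the function name `process_cd_spectrum` is hypothetical.

```python
def process_cd_spectrum(wavelengths_nm, scans, blank, zero_at_nm=340):
    """Average replicate scans, subtract the buffer blank, and zero at one wavelength."""
    n_scans = len(scans)
    # Point-by-point mean over the replicate scans.
    averaged = [sum(scan[i] for scan in scans) / n_scans
                for i in range(len(wavelengths_nm))]
    # Remove the buffer-only contribution.
    blanked = [a - b for a, b in zip(averaged, blank)]
    # Shift the whole spectrum so that the reference wavelength reads zero.
    offset = blanked[wavelengths_nm.index(zero_at_nm)]
    return [value - offset for value in blanked]


# Made-up ellipticity readings at four wavelengths, for illustration only.
wavelengths_nm = [240, 265, 295, 340]
scans = [
    [-1.0, 2.0, 5.0, 0.3],
    [-1.2, 2.2, 5.2, 0.1],
    [-0.8, 1.8, 4.8, 0.2],
]
blank = [0.1, 0.1, 0.1, 0.1]

spectrum = process_cd_spectrum(wavelengths_nm, scans, blank)
print(dict(zip(wavelengths_nm, spectrum)))
```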


Stopped flow measurement

The kinetics of GQ formation upon rapidly mixing with varying concentration of K+

were obtained by using a PiStar 180 spectrophotometer (Applied Photophysics).

6 μM of unfolded 6ThT22 was prepared in 5 mM Tris (pH 7.0) and rapidly mixed

1:1 with varying concentration of KCl in 5 mM Tris (pH7.0) to reach a final

concentration of oligos at 3 μM. GQ folding was monitored by ellipticity changes at

295 nm as a function of time (path length 1 cm, bandwidth 4 nm) for 10s with 4000

data points, and data were zero corrected to the first time point. A circulating water

bath was used to maintain constant temperature at 25 °C. The averaged readings

of eight mixing reactions were fitted by Kintek Explorer software to calculate rate

constants of GQ folding and unfolding. To calculate the extent of GQ being folded

upon K+ binding, the average readings were fitted to an exponential function (BoxLucas1 in Origin):

$$\Delta\theta_{295} = A\,\left(1 - e^{-k_{obs} t}\right)$$

where $\Delta\theta_{295}$ is the ellipticity change at 295 nm, and A and kobs represent the amplitude and observed rate constant of GQ folding. The amplitude values were plotted against potassium concentration and fitted to a hyperbolic function:

$$A = \frac{A_{max}\,[K^+]}{[K^+] + K_{1/2}} \qquad \text{(Eq 1)}$$

where Amax and K1/2 represent the maximal amplitude and the [K+] needed to fold half the oligonucleotide.
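To make Eq 1 concrete, the sketch below evaluates the hyperbola and recovers Amax and K1/2 from synthetic, noise-free amplitudes. The parameter values are invented for illustration, and the linearized (Hanes-Woolf style) regression is a stand-in chosen to keep the example dependency-free; it is not the fitting routine used in this work, and a nonlinear least-squares fit would be preferable for real, noisy data.

```python
def hyperbola(k_conc_mm, a_max, k_half_mm):
    """Eq 1: amplitude as a function of potassium concentration."""
    return a_max * k_conc_mm / (k_conc_mm + k_half_mm)


def fit_hanes_woolf(k_concs_mm, amplitudes):
    """Estimate (a_max, k_half) from a straight-line fit of [K+]/A against [K+].

    Rearranging Eq 1 gives [K+]/A = [K+]/a_max + k_half/a_max, so the slope is
    1/a_max and the intercept is k_half/a_max. Requires non-zero amplitudes.
    """
    xs = list(k_concs_mm)
    ys = [k / a for k, a in zip(k_concs_mm, amplitudes)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    a_max = 1.0 / slope
    return a_max, intercept * a_max


# Synthetic amplitudes generated from made-up parameters (not measured data).
TRUE_A_MAX, TRUE_K_HALF = 20.0, 15.0
k_concs_mm = [5.0, 10.0, 20.0, 45.0, 90.0]
amplitudes = [hyperbola(k, TRUE_A_MAX, TRUE_K_HALF) for k in k_concs_mm]

a_max_fit, k_half_fit = fit_hanes_woolf(k_concs_mm, amplitudes)
print(f"Amax = {a_max_fit:.3f}, K1/2 = {k_half_fit:.3f} mM")
```

By construction, the amplitude at [K+] = K1/2 is exactly half of Amax, which is the property the definition of K1/2 relies on.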

The pseudo-first-order GQ folding rate (kf90) at 90 mM K+ is determined with the following equation:

$$k_{f90} = k_f\,[K^+] \qquad \text{(Eq 2)}$$

where kf is the folding rate constant obtained from global fitting, 0.05 ± 0.01 mM-1 s-1, and [K+] is the potassium concentration, 90 mM.
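As a worked instance of Eq 2, the short sketch below evaluates kf90 from the stated values. Propagating the uncertainty by simple scaling assumes that the potassium concentration is known exactly, which is an assumption of this example rather than a statement from the analysis.

```python
KF = 0.05          # folding rate constant from global fitting, mM^-1 s^-1
KF_ERR = 0.01      # reported uncertainty on KF, mM^-1 s^-1
K_CONC_MM = 90.0   # potassium concentration, mM

# Eq 2: pseudo-first-order folding rate at a fixed potassium concentration.
kf90 = KF * K_CONC_MM
# Uncertainty scales linearly if [K+] is treated as exact (assumption).
kf90_err = KF_ERR * K_CONC_MM

print(f"kf90 = {kf90:.1f} +/- {kf90_err:.1f} s^-1")
```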

The kinetics of GQ destabilization upon binding with POT1-TPP1 (wild type and

mutants) proteins were obtained by rapidly mixing 1 μM pre-folded GQ with varying

concentrations of proteins as protein:oligo molar ratios at 0, 0.4, 0.8, 1.2, 2, 2.4.

Both GQ and POT1-TPP1 proteins were prepared in potassium phosphate-KCl

buffer containing 90 mM K+ (pH 7.0) and mixed 1:1 to reach a final GQ concentration of 500 nM. GQ destabilization was monitored by ellipticity changes at 295 nm as a function of time (path length 1 cm, bandwidth 4 nm) for 300 s with

900 data points at 25 °C, and data were zero corrected to the first time point. Three to eight (mostly more than five) mixing reactions were performed for each condition.

The average readings were used for globally fitting by Kintek Explorer software.
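Because the two syringes are mixed 1:1, every concentration is halved on mixing. The sketch below works through that arithmetic for the molar ratios listed above; the function and field names are hypothetical, and it assumes the protein:oligo ratio is defined relative to the 500 nM final GQ concentration.

```python
FINAL_GQ_NM = 500.0                         # GQ concentration after 1:1 mixing
MOLAR_RATIOS = [0, 0.4, 0.8, 1.2, 2, 2.4]   # protein:oligo ratios listed above


def mixing_plan(final_gq_nm, ratios, dilution=2.0):
    """Return final and pre-mixing (syringe) concentrations in nM for each ratio."""
    rows = []
    for ratio in ratios:
        final_protein = ratio * final_gq_nm
        rows.append({
            "ratio": ratio,
            "protein_final_nM": final_protein,
            "protein_syringe_nM": final_protein * dilution,
            "gq_syringe_nM": final_gq_nm * dilution,
        })
    return rows


plan = mixing_plan(FINAL_GQ_NM, MOLAR_RATIOS)
for row in plan:
    print(row)
```

The GQ syringe concentration evaluates to 1000 nM for every condition, consistent with the 1 μM pre-folded GQ stock described above.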

To calculate the extent of GQ being opened upon POT1-TPP1 binding, the average readings were fitted to an exponential function: $\Delta\theta_{295} = A\,e^{-k_{obs} t}$, where $\Delta\theta_{295}$ is the ellipticity change at 295 nm, and A and kobs represent the amplitude and observed rate constant of

GQ opening. The amplitude values were plotted against molar ratio of proteins to

oligos.
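The single-exponential form used for the GQ opening traces can be illustrated on synthetic data. In this sketch the amplitude and kobs are recovered by a straight-line fit to the logarithm of the signal, which is a simplification that only works for strictly positive, low-noise signals; it is a stand-in for a nonlinear fit, and the parameter values are invented.

```python
import math


def single_exponential(t, amplitude, k_obs):
    """Signal = A * exp(-k_obs * t), the form used for the opening traces."""
    return amplitude * math.exp(-k_obs * t)


def fit_log_linear(times, signal):
    """Estimate (amplitude, k_obs) from ln(signal) = ln(A) - k_obs * t."""
    logs = [math.log(s) for s in signal]
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(logs) / n
    stt = sum((t - mean_t) ** 2 for t in times)
    sty = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, logs))
    slope = sty / stt
    intercept = mean_y - slope * mean_t
    return math.exp(intercept), -slope


# Synthetic, noise-free trace over a 300 s window (made-up parameters).
TRUE_AMPLITUDE, TRUE_K_OBS = 8.0, 0.02
times = [30.0 * i for i in range(11)]
signal = [single_exponential(t, TRUE_AMPLITUDE, TRUE_K_OBS) for t in times]

amplitude_fit, k_obs_fit = fit_log_linear(times, signal)
print(f"A = {amplitude_fit:.3f}, kobs = {k_obs_fit:.4f} s^-1")
```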

Protein expression and purification


Full-length POT1 (wild type and mutants) with an N-terminal GST-tag and N-

terminal 6×His-tagged TPP1 (89-334) proteins were co-expressed using the

recombinant baculovirus expression system to infect Spodoptera frugiperda 9 (Sf9)

insect cells as described previously(43). Briefly, POT1-TPP1 protein complexes

were purified with affinity chromatography by GST-beads (Invitrogen) first.

PreScission protease (GE Healthcare) was then added to remove the N-terminal

GST tag followed by size-exclusion chromatography (SEC) using a Superdex 200

10/300 chromatography column on an AKTA Purifier10 FPLC system (GE

Healthcare) with either 20 mM Tris (pH 7.0), 90 mM LiCl or potassium phosphate-KCl buffer containing 90 mM K+ (pH 7.0). Proteins purified in Li+ buffer were

used for assays with unfolded DNA, i.e., EMSAs and SPR under Li+ condition.

Proteins purified in K+ buffer were used for assays with GQ formed DNA, i.e., GQ

destabilization assays by stopped flow and SPR under K+ buffer.

UV melting

The UV melting assays were carried out on a PiStar 180 spectrophotometer

(Applied Photophysics). hT22 (16 μM) was prepared in either 20 mM Tris (pH 7.0), 90 mM LiCl or potassium phosphate-KCl buffer containing 90 mM K+ (pH 7.0). The absorbance was monitored using a 2 mm path length while the

temperature was increased from 10°C to 90°C at 1°C/min. Absorbance changes

at 295 nm were plotted against temperature to obtain UV melting profile under K+

or Li+ conditions.

Size exclusion-HPLC


hT22 oligos were prepared at 10 μM in either Li+ or K+ buffer. For HPLC runs, 10

μL of DNA was injected into a Shimadzu HPLC and separated using a SEC-300

size-exclusion column (Thermo Scientific Acclaim). All SE-HPLC runs were

conducted at 25°C using a flow-rate of 0.2 ml/min and constant UV absorbance

monitoring at a wavelength of 260 nm.

Electrophoretic mobility shift assays (EMSAs)

Binding reactions of POT1-TPP1 with unfolded oligos were performed in buffer containing 20 mM Tris (pH 7.0), 90 mM LiCl, 5 mM DTT, 5 μg/mL BSA, 1.2 μg/mL tRNA and 6.25% glycerol. Reactions were performed using 1 nM IRD700-labeled 6ThT22 and variable concentrations of recombinant POT1-TPP1 protein. Wild-type POT1-TPP1 concentrations ranged from 0 to 102 nM, Q94R mutant concentrations ranged from 0 to 800 nM, and H266L mutant concentrations ranged from 0 to 512 nM. Binding reactions were incubated for 30 min at room temperature before 10 μL of the reaction was loaded onto a 5% Tris-borate non-denaturing gel (Bio-rad). Gels were run at 125 V for 35 min, then imaged with an ODYSSEY imaging system (Li-COR). Image quantification was performed using Image Studio software. The data from EMSA experiments were fitted to a hyperbolic function:

$$\text{bound fraction} = \frac{[\text{POT1-TPP1}]}{[\text{POT1-TPP1}] + K_{1/2}}$$

to obtain K1/2, which represents the [POT1-TPP1] needed to obtain half bound shift. Averages and standard deviations from three measurements were used to determine the values reported in Figures 3.2C, 3.7D and F.
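The hyperbolic binding function can be explored with a small least-squares scan over candidate K1/2 values. The titration points and the "true" K1/2 below are synthetic and purely illustrative, and the brute-force scan is a simple substitute for a proper nonlinear fit rather than the procedure used to generate the reported values.

```python
def bound_fraction(protein_nm, k_half_nm):
    """Fraction of labeled DNA shifted at a given POT1-TPP1 concentration."""
    return protein_nm / (protein_nm + k_half_nm)


def fit_k_half(proteins_nm, fractions, candidates_nm):
    """Return the candidate K1/2 with the smallest sum of squared residuals."""
    def sse(k_half_nm):
        return sum((bound_fraction(p, k_half_nm) - f) ** 2
                   for p, f in zip(proteins_nm, fractions))
    return min(candidates_nm, key=sse)


# Hypothetical titration (nM) and noise-free bound fractions for K1/2 = 2 nM.
proteins_nm = [0.1, 0.4, 1.6, 6.4, 25.6, 102.4]
TRUE_K_HALF_NM = 2.0
fractions = [bound_fraction(p, TRUE_K_HALF_NM) for p in proteins_nm]

# Scan K1/2 from 0.05 to 20 nM in 0.05 nM steps.
candidates_nm = [0.05 * i for i in range(1, 401)]
k_half_fit = fit_k_half(proteins_nm, fractions, candidates_nm)
print(f"K1/2 = {k_half_fit:.2f} nM")
```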


Binding reactions of POT1-TPP1 with pre-folded oligos were performed in the potassium phosphate-KCl buffer containing 90 mM K+ (pH 7.0), with 5 mM DTT, 5 μg/mL BSA, 1.2 μg/mL tRNA and 6.25% glycerol. Reactions were performed using 500 nM 6ThT22 with 4 nM IRD700-labeled 6ThT22 and variable concentrations of recombinant POT1-TPP1 protein to reach the protein:oligo ratio as indicated. Binding reactions were incubated for 30 min at room temperature before 10 μL of the reaction was loaded onto a 5% Tris-borate non-denaturing gel (Bio-rad). Gels were run at 125 V for 35 min, then imaged with an ODYSSEY imaging system (Li-COR). Image quantification was performed using Image Studio software. The unbound, one protein-bound and two protein-bound fractions were quantified and compared with the data simulated from global fitting under the same incubation time as shown in Figures 3.4A and 3.9A.

Surface plasmon resonance (SPR)

SPR analysis of the binding of 5’ end-biotinylated 6ThT22 with POT1-TPP1 wild-type and mutant proteins was performed using a BIAcore T200 (GE Healthcare). Oligos were captured on a flow cell of an S series SA sensor chip (GE Healthcare) at a low density of approximately 5 response units (RUs). To investigate POT1-TPP1 binding to unfolded oligos, binding reactions were performed in Li+ running buffer: 20 mM Tris (pH 7.0), 90 mM LiCl, 0.05% Tween-20. POT1-TPP1 proteins were prepared in the Li+ running buffer over a range of concentrations, 0.33 – 80 nM for wild-type and H266L mutant proteins, and 0.33 – 720 nM for Q94R mutant proteins. Concentration series of proteins were injected at 30 µL/min over the oligos. The surface was regenerated between injections using 15 s of 2 M guanidinium chloride followed by 60 s of Li+ running buffer. To investigate binding to GQ-folded oligos, binding reactions were performed in K+ running buffer: potassium phosphate-KCl buffer containing 90 mM K+ (pH 7.0), 5 mM MgCl2, 0.05% Tween-20. POT1-TPP1 proteins were prepared in the K+ running buffer over a range of concentrations, 0.5 – 40.5 nM for wild-type proteins, 2.9 – 720 nM for Q94R mutant proteins, and 0.1 – 240 nM for H266L mutant proteins. Concentration series of proteins were injected at 30 µL/min over the oligos. The surface was regenerated between injections using 30 s of water, 15 s of guanidinium chloride, 30 s of water, 60 s of 2 M KCl, and 30 s of K+ running buffer.

Global data fitting and kinetic modeling

Kintek Global Kinetic Explorer program, Version 8.0, Kintek Corp, Snow Shoe, PA

(127, 128) was used to fit: a) Kinetic data for GQ folding and unfolding, b) POT1-

TPP1 binding to unfolded telomere ssDNA (Li+ condition), and c) POT1-TPP1 binding and opening of GQ telomere ssDNA (K+ condition).

Kinetics of GQ folding/unfolding in absence of protein

GQ formation data were fit to the model shown in Figure 3.1D. Initial parameters were obtained using the Dynamic Simulation feature of Kintek, which allows variation of rate constants with continuous simulation. The global fit was performed multiple times for alternating combinations of folding and unfolding rate constants

(initial parameters) until an overall fit with the lowest possible χ2 value was reached.

The folding and unfolding rate constants were used and fixed for the following

POT1-TPP1 binding and opening GQ formed telomere ssDNA models.


POT1-TPP1 binding to unfolded telomere ssDNA (Li+ condition)

POT1-TPP1 binding to unfolded telomere ssDNA data were fit to the model shown in Figure 3.3C. The KD1 and KD2 obtained from EMSAs under Li+ conditions, which correspond to the ratio between dissociation rate constants and association rate constants for each step, were used as initial parameters. The global fit was performed multiple times for alternating combinations of fixed and floated variables until an overall fit with the lowest possible χ2 value was reached. The simulated sensorgrams calculated from obtained rate constants were plotted to compare with experimental sensorgrams in Figures 3.2C, 3.7D and F.

POT1-TPP1 binding and opening of GQ telomere DNA (K+ condition).

The models shown in Figure 3.4 E, Figure 3.5 B and E were used for fitting POT1-TPP1 binding to GQ-formed telomere ssDNA (SPR data under K+ conditions) and GQ destabilization upon POT1-TPP1 binding (stopped flow measurement). Values of the folding and unfolding rate constants obtained from GQ formation assays were used and fixed as k3 and k-3. Values of k1, k-1, k2, k-2 obtained from POT1-TPP1 binding to unfolded telomere ssDNA were used as initial parameters. Again, the global fitting was performed multiple times for alternating combinations of fixed and floated variables until an overall fit with the lowest possible χ2 value was reached. The simulated data points calculated from obtained rate constants were plotted to compare with experimental data points in Figure 3.4D, 3.8E and I.


Free energy calculation

To obtain the free energy landscapes shown in Figure 3.6 and Figure 3.9, the free energy for each state was determined according to a form derived from the Eyring-Polanyi equation:

$$\Delta G = RT \ln\!\left(\frac{k_B T}{k\,h}\right)$$

where ΔG is the free energy barrier for any given step, R is the gas constant (1.98 cal mol-1 K-1), T is the temperature in kelvin (298 K), kB and h are Boltzmann’s and Planck’s constants, respectively, and k is the rate constant for the step. For any second-order reaction, the association rate constant (k) was converted to a pseudo-first-order rate constant (kpse) at a given protein concentration according to kpse = k * [protein], and kpse was used for calculating free energy. ΔΔG values between wild-type POT1 and the mutants were determined as ΔΔG = ΔGmutant − ΔGwild-type for each mutant and given bound state.
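The free energy arithmetic described above can be sketched in a few lines. The example applies the Eyring-type expression to the first GQ-binding step (k+4 from Table 3.3) at the two protein concentrations used for Figure 3.6 and forms a barrier difference between Q94R and wild-type. It illustrates the calculation only; the printed numbers are not the values plotted in the figures, and the function names are hypothetical.

```python
import math

R_KCAL = 1.98e-3         # gas constant, kcal mol^-1 K^-1 (value used in this chapter)
T = 298.0                # temperature, K
K_B = 1.380649e-23       # Boltzmann constant, J K^-1
H = 6.62607015e-34       # Planck constant, J s


def eyring_barrier(rate_per_s):
    """Free energy barrier (kcal/mol) for a first-order or pseudo-first-order rate."""
    return R_KCAL * T * math.log(K_B * T / (rate_per_s * H))


def pseudo_first_order(k_assoc_per_nm_s, protein_nm):
    """Convert a second-order association constant to s^-1 at a fixed protein concentration."""
    return k_assoc_per_nm_s * protein_nm


K4_WT = 5.5e-3     # nM^-1 s^-1, wild-type k+4 from Table 3.3
K4_Q94R = 1.0e-3   # nM^-1 s^-1, Q94R k+4 from Table 3.3

barrier_low = eyring_barrier(pseudo_first_order(K4_WT, 10.0))    # 10 nM protein
barrier_high = eyring_barrier(pseudo_first_order(K4_WT, 500.0))  # 500 nM protein
ddg_q94r = eyring_barrier(pseudo_first_order(K4_Q94R, 500.0)) - barrier_high

print(f"wild-type barrier at 10 nM:  {barrier_low:.2f} kcal/mol")
print(f"wild-type barrier at 500 nM: {barrier_high:.2f} kcal/mol")
print(f"barrier difference (Q94R - wild-type) at 500 nM: {ddg_q94r:.2f} kcal/mol")
```

Because the barrier depends on the logarithm of the rate, raising the protein concentration 50-fold lowers the association barrier by RT ln(50), independent of the rate constant itself.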


Table 3.1 Rate constants for POT1-TPP1 binding events to 6ThT22 in Li+ buffer as determined from SPR data.

| Rate constant | wild-type | StdErr | Q94R | StdErr | H266L | StdErr |
|---|---|---|---|---|---|---|
| k+1 (nM-1 s-1) | 1.7 E-3 | 2.2 E-6 | 4.1 E-4 | 1.3 E-6 | 2.7 E-3 | 3.6 E-5 |
| k-1 (s-1) | 3.5 E-4 | 1.5 E-6 | 6.0 E-4 | 3.7 E-6 | 1.9 E-3 | 1.1 E-4 |
| k+2 (nM-1 s-1) | 8.2 E-5 | 2.8 E-7 | 5.4 E-6 | 1.1 E-7 | 1.7 E-4 | 1.7 E-5 |
| k-2 (s-1) | 2.4 E-3 | 2.9 E-5 | 4.2 E-3 | 4.0 E-4 | 1.7 E-2 | 6.4 E-4 |

Table 3.2 Equilibrium dissociation constants determined for POT1-TPP1 and

6ThT22 in Li+ buffer using SPR analysis.

| | wild-type | Q94R | H266L |
|---|---|---|---|
| KD1 (nM) | 0.21 | 1.47 | 0.73 |
| KD2 (nM) | 29 | 774 | 103 |


Table 3.3 Rate constants for POT1-TPP1 binding events to 6ThT22 in K+ buffer as determined from SPR data.

| Rate constant | wild-type | StdErr | Q94R | StdErr | H266L | StdErr |
|---|---|---|---|---|---|---|
| k+1 (nM-1 s-1) | 4.4E-03 | 2.9E-05 | 1.0E-03 | 1.5E-04 | 2.4E-03 | 7.3E-05 |
| k-1 (s-1) | 8.5E-04 | 5.0E-05 | 4.3E-04 | 7.2E-05 | 1.0E-03 | 6.5E-04 |
| k+2 (nM-1 s-1) | 6.7E-06 | 2.3E-06 | 7.9E-05 | 1.3E-05 | 2.3E-04 | 1.1E-04 |
| k-2 (s-1) | 1.9E-04 | 1.8E-04 | 7.3E-03 | 5.5E-04 | 1.4E-02 | 8.9E-04 |
| k+4 (nM-1 s-1) | 5.5E-03 | 7.5E-05 | 1.0E-03 | 2.6E-04 | 1.6E-03 | 7.5E-05 |
| k-4 (s-1) | 1.8E-01 | 1.2E-03 | 1.1E-01 | 1.8E-03 | 6.1E-02 | 1.4E-03 |
| k+5 (nM-1 s-1) | 3.5E-04 | 1.0E-05 | 1.0E-04 | 2.9E-05 | 1.2E-04 | 4.3E-05 |
| k-5 (s-1) | 2.0E-01 | 4.9E-02 | 8.4E-02 | 5.6E-02 | 9.9E-02 | 4.3E-02 |
| k+6 (s-1) | 2.5E+00 | 5.2E-01 | 5.8E+00 | 1.9E+00 | 3.1E+00 | 1.1E+00 |
| k-6 (s-1) | 4.6E-03 | 1.4E-03 | 1.7E-02 | 9.7E-03 | 1.7E-02 | 6.3E-03 |

Table 3.4 Equilibrium dissociation constants determined for POT1-TPP1 and

6ThT22 in K+ buffer using SPR analysis.

| | wild-type | Q94R | H266L |
|---|---|---|---|
| KD1 (nM) | 0.19 | 0.41 | 0.43 |
| KD2 (nM) | 28 | 93 | 60 |
| KD3 (nM) | 32 | 103 | 37 |
| KD4 (nM) | 586 | 829 | 807 |
| Kopen | 546 | 348 | 184 |


Chapter 4 Summary and future direction

4.1 Summary

The interactions between POT1-TPP1 and telomere ssDNA directly dictate and facilitate the function of shelterin complexes in telomere ssDNA protection and telomerase regulation. For instance, the high affinity of POT1-TPP1-ssDNA interactions prevents RPA-mediated telomere recognition and consequent ATR pathway activation. Additionally, POT1-TPP1 not only camouflages telomere ssDNA from recognition by other proteins but also recruits telomerase

to the telomere to facilitate telomere extension thus overcoming the end-replication

problem. The first study in this dissertation established both positive and negative

roles of POT1-TPP1 protein complexes in telomerase regulation in a telomere

ssDNA length-dependent manner. Interestingly, the POT1 His266 residue, which directly interacts with telomere DNA and undergoes significant structural environment changes upon binding events, plays a significant role in telomerase regulation. The CLL-related mutation H266L causes the loss of POT1-mediated telomerase inhibitory regulation, allowing for aberrant telomere extension despite employing a physiologically relevant telomere ssDNA substrate. Somatic cells will induce cellular senescence when telomere length is reduced below a minimum threshold length; however, with the POT1 H266L mutant and loss of POT1-


mediated inhibition of telomerase, cells may escape senescence and attain

replicative immortality.

Additionally, POT1-TPP1 facilitates the resolution of telomere GQ secondary

structures, which also contribute to the regulation of telomere maintenance. The

detailed kinetic model presented in Chapter 3 of this dissertation illustrates that the

process of GQ destabilization depends on POT1-TPP1 concentration. The highly

favored, active GQ opening pathway is the main driver for complete destabilization

of GQ by POT1-TPP1. With this detailed kinetic model, we are able to predict the

time scale and extent of GQ destabilization based on varying POT1-TPP1

concentrations. The destabilization of GQ at the telomere is critical for telomerase

regulation since the formation of GQs significantly diminishes the ability of

telomerase to access telomere DNA. Furthermore, the CLL-related mutations that disrupt telomere ssDNA binding also impair GQ binding and/or destabilization; therefore, appropriate GQ destabilization facilitated by POT1-TPP1 is an important

step in telomere maintenance and genomic stability.

In summary, POT1-TPP1 is capable of binding and resolving telomere GQ

structures and promotes telomere ssDNA accessibility for telomerase. However,

when the telomere ssDNA reaches a sufficient length, POT1-TPP1 and telomere

ssDNA form a compact, globular structure (79) making the 3’ end of telomere

ssDNA inaccessible to telomerase. The CLL-related mutants exhibit defects in

telomere ssDNA binding and GQ destabilization. In this scenario, defective telomere ssDNA binding and/or incomplete opening of GQs by POT1-TPP1 mutants makes telomere ssDNA available for unregulated telomerase-mediated extension, causing further genomic instability.

4.2 Future direction

As concluded by studies in this dissertation and others, POT1-TPP1 plays critical

roles in telomere maintenance and telomerase regulation. However, the molecular

details defining how POT1-TPP1 interacts with telomere DNA at physiologically

relevant lengths and how POT1-TPP1 regulates telomerase in the presence of

higher-order DNA structures have yet to be understood. Therefore, I propose the

following directions for future studies, together with an approach to each

question.

4.2.1 Determination of POT1-TPP1-ssDNA complex structure

Despite decades of effort to determine the structure of the POT1-TPP1-ssDNA complex, this goal has not yet been achieved. To date, three individual domains of the human complex have been solved independently by X-ray crystallography, including the POT1 DNA-binding domain with telomere ssDNA (43),

TPP1-OB domain (38), and POT1-TPP1 interacting domain (86, 129). A fully

assembled POT1-TPP1-ssDNA complex structure would be of immense value in

understanding molecular details of this interaction and for predicting molecular

consequences of individual pathogenic mutations within this complex.

Approach:

Thanks to rapid developments in cryo-EM and single-particle reconstruction techniques, the human POT1-TPP1-ssDNA complex, with a molecular mass of approximately 100 kDa, is now a feasible target for structural analysis by cryo-EM. We have established protocols for POT1-TPP1 protein co-expression and co-purification

with the minimal 12-nt telomere ssDNA substrate. After affinity and size-exclusion

purification, high purity (>95%) has been achieved. Using our in-house Tecnai F20

microscope equipped with a DE-20 direct electron detector, we have collected an

initial dataset of POT1-TPP1-DNA complexes. The 2D class averages, as well as

the 3D reconstruction, reveal a structurally dynamic complex that is comprised of

five distinct domains, each representing an individual OB-fold or HJR domain. With

the new Titan Krios microscope, a higher-resolution data set should be attainable.

The goal will be to characterize the dynamic nature of the complex and resolve

one conformation to better than 4 Å resolution. Moving forward, the available

crystal structures can be docked into the corresponding densities of the complex to

reveal the molecular details of the POT1-TPP1-ssDNA complex.

4.2.2 Characterization of the kinetic properties of POT1-TPP1 destabilizing

hybrid G-quadruplexes

In Chapter 3, I characterized the binding and destabilization of one antiparallel G-quadruplex by POT1-TPP1. However, it is well established that the topology of G-quadruplexes is affected by a number of factors, including the monovalent cations incorporated in the central cavity, the nucleotide sequence, and the length of the DNA, all of which may lead to a diverse array of topologies. It will be important to compare the kinetic properties of POT1-TPP1 destabilizing G-quadruplexes of different topologies, since the heterogeneity of G-quadruplexes is an inherent characteristic under physiological conditions. The molecular mechanism by which POT1-TPP1 destabilizes other G-quadruplex topologies will better illustrate the function of POT1-TPP1 in telomere maintenance and telomerase regulation within a cellular context.

Approach:

Besides the hT22 substrate characterized in Chapter 3, other G-quadruplexes with hybrid topologies have been characterized by NMR spectroscopy (130, 131).

Preliminary SPR and circular dichroism assays have been conducted to characterize the binding and G-quadruplex destabilization kinetic parameters of hybrid G-quadruplexes following POT1-TPP1 binding. The observed rate constants indicated that POT1-TPP1 proteins are less efficient at destabilizing hybrid G-quadruplexes than the antiparallel G-quadruplex substrate characterized in Chapter 3. Global fitting will reveal how the kinetics of G-quadruplex destabilization change with topology. Moreover, strategies similar to those in Section 4.2.1 could potentially be applied to resolve the structure of POT1-TPP1 bound to a longer telomere ssDNA (more than 22 nt) that is capable of forming G-quadruplexes. This structural information would reveal the molecular details of how POT1-TPP1 proteins destabilize pre-formed telomere G-quadruplexes in a more physiologically relevant context.

4.2.3 Characterization of the role of POT1-TPP1 in telomerase regulation in the presence of G-quadruplexes

In addition to telomere ssDNA protection, the POT1-TPP1 heterodimer plays a critical role in regulating telomerase-mediated telomere elongation, including telomerase recruitment and enhancement of telomerase processivity (38, 46). The formation of GQs regulates the kinetics of telomerase; specifically, it increases the rate constant for product dissociation from telomerase (124). Additionally, POT1-TPP1 enhances telomerase processivity by slowing primer dissociation (48). It is reasonable to hypothesize that the destabilization of telomere GQs upon POT1-TPP1 binding contributes to the enhancement of telomerase processivity. Although kinetic models of telomerase extension and of GQ destabilization by POT1-TPP1 have been presented by Stone and our group (124), the kinetic properties of GQ destabilization by POT1-TPP1 during telomerase extension have yet to be understood.

Approach:

A telomerase primer-extension assay (as shown in Figure 2.8) coupled with pulse-chase assays is a quantitative way to determine the kinetic rate constants of the telomerase extension mechanism, including the association, dissociation, and translocation steps. Performing telomerase assays under conditions that prevent (Li+) or promote (K+) GQ folding, in the absence and presence of POT1-TPP1, should provide the information needed to characterize the effect of POT1-TPP1 on telomerase regulation with a GQ-prone substrate. Moreover, coupled with the kinetic model of GQ destabilization characterized in Chapter 3, the molecular consequence of telomere repeat addition by telomerase can be more accurately predicted in the presence of POT1-TPP1 under physiological conditions.
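How dissociation and translocation rate constants translate into repeat addition processivity can be sketched with a simple kinetic-partitioning calculation. The sketch assumes that, after each repeat is added, the product either translocates for another round or dissociates irreversibly, as it would under chase conditions. This is a simplification for illustration and not the full telomerase mechanism. The function names and rate values are hypothetical.

```python
def repeat_addition_processivity(k_translocate, k_dissociate):
    """Probability that a bound product translocates rather than dissociates.

    Both rate constants are first-order (s^-1) and compete from the same state.
    """
    return k_translocate / (k_translocate + k_dissociate)


def fraction_reaching_repeat(n_repeats, processivity):
    """Fraction of extended products that go on to add n_repeats further repeats."""
    return processivity ** n_repeats


if __name__ == "__main__":
    # Hypothetical rate constants: the second case has a slower dissociation
    # step, standing in for the slowed primer dissociation attributed to
    # POT1-TPP1 (48).
    for label, k_off in (("faster dissociation", 0.05), ("slower dissociation", 0.01)):
        p = repeat_addition_processivity(k_translocate=0.1, k_dissociate=k_off)
        print(label, [round(fraction_reaching_repeat(n, p), 3) for n in range(5)])
```

In this scheme, any condition that raises the dissociation rate constant shortens the product ladder, and any condition that lowers it lengthens the ladder. Measured rate constants from the pulse-chase experiments would replace the placeholder values.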

4.2.4 Determination of the cellular consequence of POT1 cancer-associated

mutation H266L in genome instability

Through the studies included in this dissertation, we have characterized the functional deficiencies caused by the POT1 H266L mutant in telomere ssDNA binding, GQ destabilization, and telomerase regulation. Interestingly, POT1 H266L mutant cells generated by CRISPR-Cas9 genome editing demonstrated robust telomere lengthening over 78 days of cell growth. This is consistent with the observation that the POT1 H266L mutant abrogates the inhibitory role of multiple POT1-TPP1 binding events on telomerase activity, as shown in Chapter 2. However, given the observation that the POT1 H266L mutant impairs GQ destabilization upon POT1-TPP1 binding (Chapter 3), it would also be reasonable to hypothesize that the POT1 H266L mutation results in telomere shortening, since GQs normally behave as obstructions to telomerase extension. The molecular mechanism by which the H266L mutation impairs POT1-TPP1 function at the cellular level is still not understood.

Approach:

One possible explanation for the telomere lengthening caused by the H266L mutant is its weaker affinity for telomere ssDNA, which negates some of the protective properties of POT1 and makes the 3’ end of telomere ssDNA accessible for telomerase extension. A second possible consequence of POT1 dysfunction in telomere protection is that naked telomere ends trigger illicit activation of the ATR kinase-dependent DNA damage response, which can be detected by western blotting for phosphorylation of the downstream target Chk1. Additionally, mutated POT1 is expected to be mislocalized, which can be detected by immunofluorescence. The intensity of POT1 fluorescence will likely be reduced at, or found away from, the signal of the telomere-binding probe. Lastly, the weakened POT1-DNA interaction may allow competitive binding of RPA, which can induce the DNA damage response (38). RPA is much more abundant (approximately 1,000-fold) than POT1/TPP1 and has a similar affinity for telomere ssDNA (30), such that it can outcompete the diminished interaction between POT1/TPP1 and ssDNA when mutant POT1 is present.
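The competition argument can be made concrete with a minimal equilibrium sketch for two proteins competing for a single ssDNA site. It assumes both proteins are in excess over DNA, so that free and total concentrations are approximately equal. It also ignores tethering of POT1-TPP1 to the rest of shelterin and any other local-concentration effects, so only the relative change between wild type and mutant is meaningful. All concentrations and dissociation constants below are hypothetical.

```python
def competitive_occupancy(pot1_nM, kd_pot1_nM, rpa_nM, kd_rpa_nM):
    """Equilibrium occupancy of one ssDNA site by two mutually exclusive binders."""
    pot1_term = pot1_nM / kd_pot1_nM
    rpa_term = rpa_nM / kd_rpa_nM
    total = 1.0 + pot1_term + rpa_term
    return {"pot1_tpp1": pot1_term / total,
            "rpa": rpa_term / total,
            "free": 1.0 / total}


if __name__ == "__main__":
    # Hypothetical numbers: RPA at 1,000-fold the POT1-TPP1 concentration,
    # equal affinities for wild type, and a 10-fold weaker mutant affinity.
    wild_type = competitive_occupancy(1.0, 10.0, 1000.0, 10.0)
    mutant = competitive_occupancy(1.0, 100.0, 1000.0, 10.0)
    print("wild type:", wild_type)
    print("mutant:   ", mutant)
```

Under these assumptions, weakening the POT1 affinity lowers its share of the site almost in proportion to the change in dissociation constant, because the RPA term dominates the denominator.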

References

1. Makarov VL, Hirose Y, & Langmore JP (1997) Long G tails at both ends of human chromosomes suggest a C strand degradation mechanism for telomere shortening. Cell 88(5):657-666. 2. de Lange T, et al. (1990) Structure and variability of human chromosome ends. Molecular and cellular biology 10(2):518-527. 3. de Lange T (2005) Shelterin: the protein complex that shapes and safeguards human telomeres. Genes Dev. 19(18):2100-2110. 4. Guo X, et al. (2007) Dysfunctional telomeres activate an ATM-ATR-dependent DNA damage response to suppress tumorigenesis. EMBO J. 26(22):4709-4719. 5. Denchi EL & de Lange T (2007) Protection of telomeres through independent control of ATM and ATR by TRF2 and POT1. Nature 448(7157):1068-1071. 6. Sfeir A & de Lange T (2012) Removal of shelterin reveals the telomere end-protection problem. Science 336(6081):593-597. 7. Wyatt HD, West SC, & Beattie TL (2010) InTERTpreting telomerase structure and function. Nucleic Acids Res. 38(17):5609-5622. 8. Nandakumar J & Cech TR (2013) Finding the end: recruitment of telomerase to telomeres. Nat. Rev. Mol. Cell Biol. 14(2):69-82. 9. Williamson JR, Raghuraman MK, & Cech TR (1989) Monovalent cation-induced structure of telomeric DNA: the G-quartet model. Cell 59(5):871-880. 10. Sundquist WI & Klug A (1989) Telomeric DNA dimerizes by formation of guanine tetrads between hairpin loops. Nature 342(6251):825-829. 11. Parkinson GN, Lee MP, & Neidle S (2002) Crystal structure of parallel quadruplexes from human telomeric DNA. Nature 417(6891):876-880. 12. Wang Y & Patel DJ (1993) Solution structure of the human telomeric repeat d[AG3(T2AG3)3] G-tetraplex. Structure 1(4):263-282. 13. Biffi G, Tannahill D, McCafferty J, & Balasubramanian S (2013) Quantitative visualization of DNA G-quadruplex structures in human cells. Nat. Chem. 5(3):182-186. 14. Paeschke K, et al. (2008) Telomerase recruitment by the telomere end binding protein- beta facilitates G-quadruplex DNA unfolding in ciliates. 
Nat Struct Mol Biol 15(6):598-604. 15. Hansel R, et al. (2011) The parallel G-quadruplex structure of vertebrate telomeric repeat sequences is not the preferred folding topology under physiological conditions. Nucleic Acids Res 39(13):5768-5775. 16. Griffith JD, et al. (1999) Mammalian telomeres end in a large duplex loop. Cell 97(4):503- 514. 17. van Steensel B & de Lange T (1997) Control of telomere length by the human telomeric protein TRF1. Nature 385(6618):740-743. 18. Smogorzewska A, et al. (2000) Control of human telomere length by TRF1 and TRF2. Mol Cell Biol 20(5):1659-1668. 19. Fairall L, Chapman L, Moss H, de Lange T, & Rhodes D (2001) Structure of the TRFH dimerization domain of the human telomeric proteins TRF1 and TRF2. Mol. Cell 8(2):351- 361. 20. Broccoli D, Smogorzewska A, Chong L, & de Lange T (1997) Human telomeres contain two distinct Myb-related proteins, TRF1 and TRF2. Nat. Genet. 17(2):231-235.

21. Court R, Chapman L, Fairall L, & Rhodes D (2005) How the human telomeric proteins TRF1 and TRF2 recognize telomeric DNA: a view from high-resolution crystal structures. EMBO Rep 6(1):39-45. 22. Hanaoka S, Nagadoi A, & Nishimura Y (2005) Comparison between TRF2 and TRF1 of their telomeric DNA-bound structures and DNA-binding activities. Protein Sci. 14(1):119-130. 23. Nishikawa T, et al. (2001) Solution structure of a telomeric DNA complex of human TRF1. Structure 9(12):1237-1251. 24. Ye JZ, et al. (2004) TIN2 binds TRF1 and TRF2 simultaneously and stabilizes the TRF2 complex on telomeres. J. Biol. Chem. 279(45):47264-47271. 25. Ye JZ, et al. (2004) POT1-interacting protein PIP1: a telomere length regulator that recruits POT1 to the TIN2/TRF1 complex. Genes Dev. 18(14):1649-1654. 26. Liu D, O'Connor MS, Qin J, & Songyang Z (2004) Telosome, a mammalian telomere- associated complex formed by multiple telomeric proteins. J Biol Chem 279(49):51338- 51342. 27. Kim SH, Kaminker P, & Campisi J (1999) TIN2, a new regulator of telomere length in human cells. Nat. Genet. 23(4):405-412. 28. Kim SH, et al. (2004) TIN2 mediates functions of TRF2 at human telomeres. J Biol Chem 279(42):43799-43804. 29. Abreu E, et al. (2010) TIN2-tethered TPP1 recruits human telomerase to telomeres in vivo. Mol. Cell. Biol. 30(12):2971-2982. 30. Takai KK, Kibe T, Donigian JR, Frescas D, & de Lange T (2011) Telomere protection by TPP1/POT1 requires tethering to TIN2. Mol. Cell 44(4):647-659. 31. Zeng Z, et al. (2010) Structural basis of selective ubiquitination of TRF1 by SCFFbx4. Dev Cell 18(2):214-225. 32. Janouskova E, et al. (2015) Human Rap1 modulates TRF2 attraction to telomeric DNA. Nucleic Acids Res 43(5):2691-2700. 33. Bae NS & Baumann P (2007) A RAP1/TRF2 complex inhibits nonhomologous end-joining at human telomeric DNA ends. Mol Cell 26(3):323-334. 34. 
Sfeir A, Kabir S, van Overbeek M, Celli GB, & de Lange T (2010) Loss of Rap1 induces telomere recombination in the absence of NHEJ or a DNA damage signal. Science 327(5973):1657-1661. 35. Chen Y, et al. (2011) A conserved motif within RAP1 has diversified roles in telomere protection and regulation in different organisms. Nat Struct Mol Biol 18(2):213-221. 36. Zimmermann M, Kibe T, Kabir S, & de Lange T (2014) TRF1 negotiates TTAGGG repeat- associated replication problems by recruiting the BLM and the TPP1/POT1 repressor of ATR signaling. Genes & development 28(22):2477-2491. 37. Baumann P & Cech TR (2001) Pot1, the putative telomere end-binding protein in fission yeast and humans. Science 292(5519):1171-1175. 38. Wang F, et al. (2007) The POT1-TPP1 telomere complex is a telomerase processivity factor. Nature 445(7127):506-510. 39. Xin H, et al. (2007) TPP1 is a homologue of ciliate TEBP-beta and interacts with POT1 to recruit telomerase. Nature 445(7127):559-562. 40. Nandakumar J, Podell ER, & Cech TR (2010) How telomeric protein POT1 avoids RNA to achieve specificity for single-stranded DNA. Proc. Natl. Acad. Sci. U. S. A. 107(2):651-656. 41. O'Connor MS, Safari A, Xin H, Liu D, & Songyang Z (2006) A critical role for TPP1 and TIN2 interaction in high-order telomeric complex assembly. Proc. Natl. Acad. Sci. U. S. A. 103(32):11874-11879.

42. Houghtaling BR, Cuttonaro L, Chang W, & Smith S (2004) A dynamic molecular link between the telomere length regulator TRF1 and the chromosome end protector TRF2. Curr. Biol. 14(18):1621-1631. 43. Lei M, Podell ER, & Cech TR (2004) Structure of human POT1 bound to telomeric single- stranded DNA provides a model for chromosome end-protection. Nat. Struct. Mol. Biol. 11(12):1223-1229. 44. Rajavel M, Mullins MR, & Taylor DJ (2014) Multiple facets of TPP1 in telomere maintenance. Biochim Biophys Acta 1844(9):1550-1559. 45. Zhong FL, et al. (2012) TPP1 OB-fold domain controls telomere maintenance by recruiting telomerase to chromosome ends. Cell 150(3):481-494. 46. Nandakumar J, et al. (2012) The TEL patch of telomere protein TPP1 mediates telomerase recruitment and processivity. Nature 492(7428):285-289. 47. Horvath MP, Schweiker VL, Bevilacqua JM, Ruggles JA, & Schultz SC (1998) Crystal structure of the Oxytricha nova telomere end binding protein complexed with single strand DNA. Cell 95(7):963-974. 48. Latrick CM & Cech TR (2010) POT1-TPP1 enhances telomerase processivity by slowing primer dissociation and aiding translocation. EMBO J. 29(5):924-933. 49. Rai R, et al. (2011) The E3 ubiquitin ligase Rnf8 stabilizes Tpp1 to promote telomere end protection. Nat Struct Mol Biol 18(12):1400-1407. 50. Zhang Y, et al. (2013) Phosphorylation of TPP1 regulates cell cycle-dependent telomerase recruitment. Proc. Natl. Acad. Sci. U. S. A. 110(14):5457-5462. 51. Cerami E, et al. (2012) The cBio cancer genomics portal: an open platform for exploring multidimensional cancer genomics data. Cancer Discov. 2(5):401-404. 52. Gao J, et al. (2013) Integrative analysis of complex cancer genomics and clinical profiles using the cBioPortal. Sci Signal 6(269):pl1. 53. Bainbridge MN, et al. (2015) Germline mutations in shelterin complex genes are associated with familial glioma. J. Natl. Cancer Inst. 107(1):384. 54. Robles-Espinoza CD, et al. 
(2014) POT1 loss-of-function variants predispose to familial melanoma. Nat. Genet. 46(5):478-481. 55. Ramsay AJ, et al. (2013) POT1 mutations cause telomere dysfunction in chronic lymphocytic leukemia. Nat. Genet. 45(5):526-530. 56. Kocak H, et al. (2014) Hoyeraal-Hreidarsson syndrome caused by a germline mutation in the TEL patch of the telomere protein TPP1. Genes Dev. 28(19):2090-2102. 57. Sung LY, et al. (2014) Telomere elongation and naive pluripotent stem cells achieved from telomerase haplo-insufficient cells by somatic cell nuclear transfer. Cell reports 9(5):1603- 1609. 58. IJpma A & Greider CW (2003) Short telomeres induce a DNA damage response in Saccharomyces cerevisiae. Mol Biol Cell 14(3):987-1001. 59. Wu L, et al. (2006) Pot1 deficiency initiates DNA damage checkpoint activation and aberrant homologous recombination at telomeres. Cell 126(1):49-62. 60. Kibe T, Osawa GA, Keegan CE, & de Lange T (2010) Telomere protection by TPP1 is mediated by POT1a and POT1b. Molecular and cellular biology 30(4):1059-1066. 61. Sfeir A, et al. (2009) Mammalian telomeres resemble fragile sites and require TRF1 for efficient replication. Cell 138(1):90-103. 62. Doksani Y, Wu JY, de Lange T, & Zhuang X (2013) Super-resolution fluorescence imaging of telomeres reveals TRF2-dependent T-loop formation. Cell 155(2):345-356.

63. Bodnar AG, et al. (1998) Extension of life-span by introduction of telomerase into normal human cells. Science 279(5349):349-352. 64. Chin K, et al. (2004) In situ analyses of genome instability in breast cancer. Nature genetics 36(9):984-988. 65. Blackburn EH (2005) Telomeres and telomerase: their mechanisms of action and the effects of altering their functions. FEBS Lett 579(4):859-862. 66. Moyzis RK, et al. (1988) A highly conserved repetitive DNA sequence, (TTAGGG)n, present at the telomeres of human chromosomes. Proc. Natl. Acad. Sci. U. S. A. 85(18):6622-6626. 67. Shi J, et al. (2014) Rare missense variants in POT1 predispose to familial cutaneous malignant melanoma. Nat. Genet. 46(5):482-486. 68. Trigueros-Motos L (2014) Mutations in POT1 predispose to familial cutaneous malignant melanoma. Clin Genet 86(3):217-U109. 69. Calvete O, et al. (2015) A mutation in the POT1 gene is responsible for cardiac angiosarcoma in TP53-negative Li-Fraumeni-like families. Nat Commun 6:8383. 70. Sexton AN, Youmans DT, & Collins K (2012) Specificity requirements for human telomere protein interaction with telomerase holoenzyme. J. Biol. Chem. 287(41):34455-34464. 71. Schmidt JC, Zaug AJ, & Cech TR (2016) Live Cell Imaging Reveals the Dynamics of Telomerase Recruitment to Telomeres. Cell 166(5):1188-1197 e1189. 72. Marcand S, Gilson E, & Shore D (1997) A protein-counting mechanism for telomere length regulation in yeast. Science 275(5302):986-990. 73. Teixeira MT, Arneric M, Sperisen P, & Lingner J (2004) Telomere length homeostasis is achieved via a switch between telomerase- extendible and -nonextendible states. Cell 117(3):323-335. 74. Zahler AM, Williamson JR, Cech TR, & Prescott DM (1991) Inhibition of telomerase by G- quartet DNA structures. Nature 350(6320):718-720. 75. Wang Q, et al. (2011) G-quadruplex formation at the 3' end of telomere DNA inhibits its extension by telomerase, polymerase and unwinding by helicase. Nucleic Acids Res. 39(14):6229-6237. 76. 
Mullins MR, et al. (2016) POT1-TPP1 Binding and Unfolding of Telomere DNA Discriminates against Structural Polymorphism. J. Mol. Biol. 428(13):2695-2708. 77. Hwang H, Buncher N, Opresko PL, & Myong S (2012) POT1-TPP1 regulates telomeric overhang structural dynamics. Structure 20(11):1872-1880. 78. Takai KK, Hooper S, Blackwood S, Gandhi R, & de Lange T (2010) In vivo stoichiometry of shelterin components. J. Biol. Chem. 285(2):1457-1467. 79. Taylor DJ, Podell ER, Taatjes DJ, & Cech TR (2011) Multiple POT1-TPP1 proteins coat and compact long telomeric single-stranded DNA. J. Mol. Biol. 410(1):10-17. 80. Corriveau M, Mullins MR, Baus D, Harris ME, & Taylor DJ (2013) Coordinated interactions of multiple POT1-TPP1 proteins with telomere DNA. J. Biol. Chem. 288(23):16361-16370. 81. Jain SS & Tullius TD (2008) Footprinting protein-DNA complexes using the hydroxyl radical. Nat. Protoc. 3(6):1092-1100. 82. Kiselar J & Chance MR (2018) High-Resolution Hydroxyl Radical Protein Footprinting: Biophysics Tool for Drug Discovery. Annu Rev Biophys. 83. Wang L & Chance MR (2017) Protein Footprinting Comes of Age: Mass Spectrometry for Biophysical Structure Assessment. Mol. Cell. Proteomics 16(5):706-716. 84. Deperalta G, et al. (2013) Structural analysis of a therapeutic monoclonal antibody dimer by hydroxyl radical footprinting. MAbs 5(1):86-101.

85. Takamoto K & Chance MR (2006) Radiolytic protein footprinting with mass spectrometry to probe the structure of macromolecular complexes. Annu. Rev. Biophys. Biomol. Struct. 35:251-276. 86. Rice C, et al. (2017) Structural and functional analysis of the human POT1-TPP1 telomeric complex. Nat Commun 8:14928. 87. Zaug AJ, Podell ER, Nandakumar J, & Cech TR (2010) Functional interaction between telomere protein TPP1 and telomerase. Genes Dev. 24(6):613-622. 88. Zaug AJ, Podell ER, & Cech TR (2005) Human POT1 disrupts telomeric G-quadruplexes allowing telomerase extension in vitro. Proc. Natl. Acad. Sci. U. S. A. 102(31):10864-10869. 89. Ray S, Bandaria JN, Qureshi MH, Yildiz A, & Balci H (2014) G-quadruplex formation in telomeres enhances POT1/TPP1 protection against RPA binding. Proc. Natl. Acad. Sci. U. S. A. 111(8):2990-2995. 90. Dapic V, et al. (2003) Biophysical and biological properties of quadruplex oligodeoxyribonucleotides. Nucleic Acids Res. 31(8):2097-2107. 91. Ambrus A, et al. (2006) Human telomeric sequence forms a hybrid-type intramolecular G- quadruplex structure with mixed parallel/antiparallel strands in potassium solution. Nucleic Acids Res. 34(9):2723-2735. 92. Gray DM, et al. (2008) Measured and calculated CD spectra of G-quartets stacked with the same or opposite polarities. Chirality 20(3-4):431-440. 93. Huang W, Ravikumar KM, Chance MR, & Yang S (2015) Quantitative mapping of protein structure by hydroxyl radical footprinting-mediated structural mass spectrometry: a protection factor analysis. Biophys. J. 108(1):107-115. 94. Loayza D, Parsons H, Donigian J, Hoke K, & de Lange T (2004) DNA binding features of human POT1: a nonamer 5'-TAGGGTTAG-3' minimal binding site, sequence specificity, and internal binding to multimeric sites. J. Biol. Chem. 279(13):13241-13248. 95. Huffman KE, Levene SD, Tesmer VM, Shay JW, & Wright WE (2000) Telomere shortening is proportional to the size of the G-rich telomeric 3'-overhang. J. Biol. Chem. 
275(26):19719-19722. 96. Zhao Y, et al. (2009) Telomere extension occurs at most chromosome ends and is uncoupled from fill-in in human cancer cells. Cell 138(3):463-475. 97. Xu G, Kiselar J, He Q, & Chance MR (2005) Secondary reactions and strategies to improve quantitative protein footprinting. Anal. Chem. 77(10):3029-3037. 98. van den Berg RA, Hoefsloot HC, Westerhuis JA, Smilde AK, & van der Werf MJ (2006) Centering, scaling, and transformations: improving the biological information content of metabolomics data. BMC Genomics 7:142. 99. Wyatt PJ (1997) Multiangle light scattering: The basic tool for macromolecular characterization. Instrumentation Science & Technology 25(1):1-18. 100. Sentmanat MF, Peters ST, Florian CP, Connelly JP, & Pruett-Miller SM (2018) A Survey of Validation Strategies for CRISPR-Cas9 Editing. Sci. Rep. 8(1):888. 101. Harley CB, Futcher AB, & Greider CW (1990) Telomeres shorten during ageing of human fibroblasts. Nature 345(6274):458-460. 102. Travers A & Muskhelishvili G (2015) DNA structure and function. FEBS J 282(12):2279- 2295. 103. Gellert M, Lipsett MN, & Davies DR (1962) Helix formation by guanylic acid. Proc. Natl. Acad. Sci. U. S. A. 48:2013-2018. 104. Todd AK, Johnston M, & Neidle S (2005) Highly prevalent putative quadruplex sequence motifs in human DNA. Nucleic Acids Res. 33(9):2901-2907.

105. Huppert JL & Balasubramanian S (2005) Prevalence of quadruplexes in the human genome. Nucleic Acids Res. 33(9):2908-2916. 106. Siddiqui-Jain A, Grand CL, Bearss DJ, & Hurley LH (2002) Direct evidence for a G- quadruplex in a promoter region and its targeting with a small molecule to repress c-MYC transcription. Proc. Natl. Acad. Sci. U. S. A. 99(18):11593-11598. 107. Patel DJ, Phan AT, & Kuryavyi V (2007) Human telomere, oncogenic promoter and 5'-UTR G-quadruplexes: diverse higher order DNA and RNA targets for cancer therapeutics. Nucleic Acids Res. 35(22):7429-7455. 108. Huppert JL & Balasubramanian S (2007) G-quadruplexes in promoters throughout the human genome. Nucleic Acids Res. 35(2):406-413. 109. Shen W, Gao L, Balakrishnan M, & Bambara RA (2009) A recombination hot spot in HIV-1 contains guanosine runs that can form a G-quartet structure and promote strand transfer in vitro. J. Biol. Chem. 284(49):33883-33893. 110. Blackburn EH, Greider CW, & Szostak JW (2006) Telomeres and telomerase: the path from maize, Tetrahymena and yeast to human cancer and aging. Nat. Med. 12(10):1133-1138. 111. Lim KW, et al. (2009) Structure of the human telomere in K+ solution: a stable basket-type G-quadruplex with only two G-tetrad layers. J. Am. Chem. Soc. 131(12):4301-4309. 112. Luu KN, Phan AT, Kuryavyi V, Lacroix L, & Patel DJ (2006) Structure of the human telomere in K+ solution: an intramolecular (3 + 1) G-quadruplex scaffold. J. Am. Chem. Soc. 128(30):9963-9970. 113. Phan AT, Luu KN, & Patel DJ (2006) Different loop arrangements of intramolecular human telomeric (3+1) G-quadruplexes in K+ solution. Nucleic Acids Res. 34(19):5715-5719. 114. Phan AT & Patel DJ (2003) Two-repeat human telomeric d(TAGGGTTAGGGT) sequence forms interconverting parallel and antiparallel G-quadruplexes in solution: distinct topologies, thermodynamic properties, and folding/unfolding kinetics. J. Am. Chem. Soc. 125(49):15021-15027. 115. 
Gilbert DE & Feigon J (1999) Multistranded DNA structures. Curr. Opin. Struct. Biol. 9(3):305-314. 116. Simonsson T (2001) G-quadruplex DNA structures--variations on a theme. Biol. Chem. 382(4):621-628. 117. Bhattacharyya D, Mirihana Arachchilage G, & Basu S (2016) Metal Cations in G- Quadruplex Folding and Stability. Front Chem 4:38. 118. Hardin CC, Watson T, Corregan M, & Bailey C (1992) Cation-dependent transition between the quadruplex and Watson-Crick hairpin forms of d(CGCG3GCG). Biochemistry 31(3):833-841. 119. Trigueros-Motos L (2014) Mutations in POT1 predispose to familial cutaneous malignant melanoma. Clin. Genet. 86(3):217-218. 120. Galer P, Wang B, Sket P, & Plavec J (2016) Reversible pH Switch of Two-Quartet G- Quadruplexes Formed by Human Telomere. Angew. Chem. Int. Ed. Engl. 55(6):1993-1997. 121. Randazzo A, Spada GP, & da Silva MW (2013) Circular dichroism of quadruplex structures. Top. Curr. Chem. 330:67-86. 122. Lei M, Zaug AJ, Podell ER, & Cech TR (2005) Switching human telomerase on and off with hPOT1 protein in vitro. J. Biol. Chem. 280(21):20449-20456. 123. Hardin CC, Perry AG, & White K (2000) Thermodynamic and kinetic characterization of the dissociation and assembly of quadruplex nucleic acids. Biopolymers 56(3):147-194. 124. Jansson LI, et al. (2019) Telomere DNA G-quadruplex folding within actively extending human telomerase. Proc. Natl. Acad. Sci. U. S. A. 116(19):9350-9359.

125. Zhang AY & Balasubramanian S (2012) The kinetics and folding pathways of intramolecular G-quadruplex nucleic acids. J. Am. Chem. Soc. 134(46):19297-19308. 126. Mendoza O, Bourdoncle A, Boule JB, Brosh RM, Jr., & Mergny JL (2016) G-quadruplexes and helicases. Nucleic Acids Res. 44(5):1989-2006. 127. Johnson KA, Simpson ZB, & Blom T (2009) Global kinetic explorer: a new computer program for dynamic simulation and fitting of kinetic data. Anal. Biochem. 387(1):20-29. 128. Johnson KA, Simpson ZB, & Blom T (2009) FitSpace explorer: an algorithm to evaluate multidimensional parameter space in fitting kinetic data. Anal. Biochem. 387(1):30-41. 129. Chen C, et al. (2017) Structural insights into POT1-TPP1 interaction and POT1 C-terminal mutations in human cancer. Nat Commun 8:14929. 130. Phan AT, Kuryavyi V, Luu KN, & Patel DJ (2007) Structure of two intramolecular G- quadruplexes formed by natural human telomere sequences in K+ solution. Nucleic Acids Res. 35(19):6517-6525. 131. Dai J, et al. (2007) Structure of the intramolecular human telomeric G-quadruplex in potassium solution: a novel adenine triple formation. Nucleic Acids Res. 35(7):2440-2450.
