Mutagenic and Kinetic Effects of Various DNA Lesions on DNA Polymerization Catalyzed by Y-Family DNA Polymerases

DISSERTATION

Presented in Partial Fulfillment of the Requirements for the Degree Doctor of Philosophy in the Graduate School of The Ohio State University

By

Shanen Michelle Sherrer

Ohio State Biochemistry Program

The Ohio State University

2011

Dissertation Committee:

Zucai Suo, Adviser

Richard Swenson

Karin Musier-Forsyth

Irina Artsimovitch

Copyright by

Shanen Michelle Sherrer

2011

Abstract

Cell survival requires genomic stability through the conservation of DNA sequences. If DNA is damaged, most DNA polymerases will stall during DNA replication. The specialized Y-family DNA polymerases rescue stalled DNA replication sites and avoid cell death. Hence, gaining knowledge of the molecular basis of a DNA polymerase's nucleotide selectivity and fidelity during DNA lesion bypass improves our understanding of this process. However, the multiple bypass mechanisms that Y-family DNA polymerases employ are poorly understood. Sulfolobus solfataricus DNA Polymerase IV (Dpo4), a model Y-family member, has a fidelity range of 10⁻³ to 10⁻⁴ at 37 °C, which does not significantly change between 26 °C and 56 °C. However, the physiological temperature of S. solfataricus is approximately 80 °C. To determine the kinetic and structural relevance of data collected below 80 °C, we employed circular dichroism spectroscopy to observe the secondary structural changes of Dpo4 over a large temperature range. We discovered that Dpo4 displayed a three-state cooperative unfolding trend with a hyperthermophilic melting temperature, and exhibited secondary structural stability until 87 °C. We also established that the Dpo4 unfolding intermediate originated from ionic interactions between the linker region and Palm domain. These interactions are considered important for DNA binding during binary complex formation and possibly nucleotide incorporations.


Utilizing transient kinetics, we demonstrated that an active site mutation (Y12A) within Dpo4 caused an average 220-fold increase in matched ribonucleotide incorporation efficiency and an average 9-fold decrease in correct deoxyribonucleotide incorporation efficiency, leading to an average reduction of 2,000-fold in sugar selectivity. Therefore, the bulky side chain of Tyr12 is important for both ribonucleotide discrimination and efficient deoxyribonucleotide incorporation.
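The arithmetic behind the reported loss of sugar selectivity can be sketched as follows. This is a minimal illustration, not the dissertation's own analysis: it assumes sugar selectivity is defined as the ratio of dNTP to rNTP incorporation efficiency, and the wild-type efficiency values are hypothetical placeholders in arbitrary units. Only the 220-fold and 9-fold changes come from the text.

```python
def sugar_selectivity(dntp_efficiency, rntp_efficiency):
    """Assumed definition: dNTP incorporation efficiency divided by rNTP efficiency."""
    if dntp_efficiency <= 0 or rntp_efficiency <= 0:
        raise ValueError("efficiencies must be positive")
    return dntp_efficiency / rntp_efficiency


def selectivity_fold_reduction(rntp_fold_increase, dntp_fold_decrease):
    """Fold drop in selectivity when rNTP efficiency rises and dNTP efficiency falls."""
    return rntp_fold_increase * dntp_fold_decrease


# Hypothetical wild-type efficiencies (arbitrary units), chosen only for illustration.
wt_dntp, wt_rntp = 1.0, 1.0e-4

# Fold changes reported in the abstract for the Y12A mutant.
mut_dntp = wt_dntp / 9.0      # 9-fold decrease in correct dNTP efficiency
mut_rntp = wt_rntp * 220.0    # 220-fold increase in matched rNTP efficiency

wt_selectivity = sugar_selectivity(wt_dntp, wt_rntp)
mut_selectivity = sugar_selectivity(mut_dntp, mut_rntp)
fold_reduction = wt_selectivity / mut_selectivity

print(f"fold reduction in sugar selectivity: {fold_reduction:.0f}")
```

Under this assumed definition the two fold changes multiply to 220 × 9 = 1,980, which agrees with the roughly 2,000-fold average reduction quoted above. Because the reported values are averages over several nucleotides, the product of the averages only approximates the average of the per-nucleotide ratios.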

To examine mutagenic outcomes of lesion bypass, we employed a short oligonucleotide sequencing assay (SOSA) with all human Y-family members and DNA containing a non-informational abasic (AP) site. We observed complex mutagenic patterns of AP site bypass catalyzed by human Y-family enzymes, including rare mutagenic events. These data suggested that human DNA Polymerase η (hPolη) is the likely enzyme to bypass AP sites in vivo. Furthermore, our other SOSA studies using DNA substrates containing a cis-syn cyclobutane pyrimidine dimer (a product of UV exposure) or cisplatin-dGpG (a product of the anticancer drug cisplatin) showed that each Y-family DNA polymerase performs lesion bypass uniquely for each DNA lesion and differently from other Y-family members.

Both our pre-steady state kinetic and SOSA investigations on the bypass of N-(deoxyguanosin-8-yl)-1-aminopyrene (dGAP), a product of incomplete fuel combustion, confirmed this hypothesis as Dpo4 was the most error-free while hPolη was the most efficient during lesion bypass. Our work also indicated that all Y-family enzymes utilized different dGAP bypass kinetic mechanisms, and that the presence of dGAP decreased nucleotide incorporation efficiencies and accuracies upstream, downstream and opposite the lesion site. Beyond these findings, we elucidated the minimal dGAP bypass mechanism for Dpo4 and hPolη, as well as added more details to the kinetic mechanism employed by hPolη during DNA synthesis with undamaged DNA. Overall, our data contribute to the understanding of lesion bypass and potentially mutagenic outcomes of this vital biological process in vivo.


Dedication

This dissertation is dedicated to my family.


Acknowledgements

First, I would like to thank my adviser Dr. Zucai Suo. Without his support and guidance, most scientific accomplishments during my graduate years would not have materialized. Dr. Suo has helped me realize my true potential as a scientific researcher and has provided me with the tools to succeed beyond graduate school. I would also like to thank my graduate committee members Dr. Richard Swenson, Dr. Karin Musier-Forsyth and Dr. Irina Artsimovitch for their support and suggestions, which have assisted me in publishing manuscripts, winning awards and fellowships, and completing this dissertation.

I am very grateful for the interactions and collaborations that I had with current and past members of the Suo laboratory, especially David Beyer, Nikunj Bhatt, Dr. Jessica Brown, Wade Duym, Dr. Kevin Fiala, Dr. Jason Fowler, Dr. Sonja Fraas, Sean Newmister, Lindsey Pack, John Pryor, Laura Sanman, Dr. David Taggart, Cindy Xia, and Dr. Cuiling Xu. These individuals not only provided scientific training and support, but also created a research laboratory environment that was conducive to my productivity. I would like to especially acknowledge Dr. Jessica Brown for her feedback on my theories and ideas, and Dr. Jason Fowler for his technical support, which allowed me to develop and implement several of my research projects. Moreover, I would like to show my appreciation to all of the undergraduate students whom I have mentored over the past five years. Their research efforts allowed several of my published works to be completed quickly and contributed significantly to other research projects in Dr. Suo's laboratory.

I express my gratitude to Dr. Ashis Basu (University of Connecticut) and his laboratory for providing the Suo laboratory with such interesting DNA substrates at a larger-than-normal scale of production. I also thank Dr. John-Stephen Taylor (Washington University) for materials shared with the Suo laboratory and research collaborations. I sincerely thank Dr. Thomas Magliery for allowing the use of his RT-PCR instrument for one of my projects.

Additionally, I would like to extend my gratitude to my family, especially my father and mother, who were very patient with me through the years and supported my scientific endeavors. I also would like to show my appreciation to my long-term significant other who gave me unwavering support, and encouraged me to be my best in all that I pursue.

I would also like to thank current OSBP director Dr. Jill Rafael-Fortney, past OSBP director Dr. Ross Dalbey, the Biochemistry Department, respective staff, faculty and students for providing additional intellectual and social interactions that helped me stay grounded beyond my research endeavors over the years at The Ohio State University.

I would like to give a special thanks to Peter Sanders who helped me grow as an educator and provided me with other professional development opportunities.

My research was supported by grants awarded to my adviser from the National Science Foundation and the National Institutes of Health (NIH). I would also like to acknowledge the financial support provided to me during my graduate career from the Robert H. Edgerley Environmental Toxicology Fellowship at The Ohio State University, the NIH-sponsored Chemistry-Biology Interface Training Program fellowship, and the American Heart Association Great Rivers Affiliate Predoctoral Fellowship.


Vita

Education

2001-2005 ...... B.S. Biochemistry, Miami University
2001-2005 ...... Minor Statistic Methods, Miami University
2005 to present...... Ph.D. Biochemistry, The Ohio State University

Presentations

1. Shanen Sherrer, Amy Krans, Jenni Hoehn, and Dr. Scott Rogers. Life in Ancient Ice. Bowling Green State University, OH. (2003) REU/NSF Summer Conference. (abstract, poster and oral presentation)
2. Shanen Sherrer and Dr. Ann Hagerman. Detection and Characterization of Polyphenol Radicals via EPR Spectroscopy. The Ohio State University. (2004) Ohio Science and Engineering Alliance Student Research Forum. (abstract, poster and oral presentation)
3. Shanen Sherrer and Dr. Ann Hagerman. Polymeric Polyphenols as Dietary Antioxidants. Miami University, OH. (2005) The 11th Annual Undergraduate Research Forum. (poster)
4. Shanen Sherrer and Dr. Ann Hagerman. Polymeric Polyphenols as Dietary Antioxidants. San Diego, CA. (2005) Experimental Biology and International Union of Physiological Sciences annual conference. (abstract and poster)
5. Shanen Sherrer. Detection and Characterization of Polyphenolic Radicals via EPR Spectroscopy. Central State University, OH. (2005) Student Achievement in Research and Scholarship Statewide Student Research Conference. (abstract and oral presentation)
6. Shanen M. Sherrer, Jessica A. Brown, Lindsey R. Pack, Vijay P. Jasti, Jason D. Fowler, Ashis K. Basu and Zucai Suo. Mechanistic Studies of the Bypass of a Bulky Single-Base Lesion Catalyzed by a Y-Family DNA polymerase. The Ohio State University. (2009) NIH Chemistry-Biology Interface Training Program Symposium. (poster)
7. Shanen M. Sherrer, Jessica A. Brown, Lindsey R. Pack, Vijay P. Jasti, Jason D. Fowler, Ashis K. Basu and Zucai Suo. Mechanistic Studies of the Bypass of a Bulky Single-Base Lesion Catalyzed by a Y-Family DNA polymerase. The Ohio State University. (2009) IGP Symposium. (poster and oral presentation)
8. Shanen M. Sherrer, Jessica A. Brown, Lindsey R. Pack, Vijay P. Jasti, Jason D. Fowler, Ashis K. Basu and Zucai Suo. Mechanistic Studies of the Bypass of a Bulky Single-Base Lesion Catalyzed by a Y-Family DNA polymerase. University of New England, Maine. (2009) Gordon Research Conference on Nucleic Acid. (poster)
9. Shanen M. Sherrer. Investigation on the Various Lesion Bypass Mechanisms Utilized by Human Y-Family DNA Polymerases. The Ohio State University. (2010) NIH Chemistry-Biology Interface Program Seminar. (oral presentation)
10. Shanen M. Sherrer, Lindsey R. Pack, Jason D. Fowler, Sean A. Newmister, and Zucai Suo. A Look into the DNA Lesion Bypass Functions of Y-Family Human DNA Polymerases. The Ohio State University. (2010) NIH Chemistry-Biology Interface Training Program Symposium. (poster)
11. Shanen M. Sherrer, Lindsey R. Pack, Sean A. Newmister, Jason D. Fowler, and Zucai Suo. Consequences of Air Pollution on Human DNA Replication. The Ohio State University. (2010) IGP Symposium. (poster)
12. Shanen M. Sherrer, Lindsey R. Pack, Kevin A. Fiala, Jason D. Fowler, and Zucai Suo. Investigation of the Thermal Stability of a Y-Family DNA Polymerase. San Diego, CA. (2010) The 24th Symposium of the Protein Society. (poster)
13. Shanen M. Sherrer and Zucai Suo. Mutagenic Bulky Lesion Bypass by Human Y-Family DNA Polymerases. The Ohio State University. (2011) The 25th Annual Edward J. Hayes Graduate Research Forum. (poster)
14. Shanen M. Sherrer and Zucai Suo. Identification of a Thermal-Stable Unfolding Intermediate for a Y-Family DNA Polymerase. The Ohio State University. (2011) IGP Symposium. (poster)
15. Shanen M. Sherrer and Zucai Suo. Mutagenic Consequences of 1-Aminopyrene Adduct Bypass Catalyzed by Human Y-Family DNA Polymerases. The Ohio State University. (2011) NIH Chemistry-Biology Interface Training Program Symposium. (poster)

Invited Talks

Shanen M. Sherrer. Keynote Address: The Importance of Research as an Undergraduate Scholar. (2010) The 16th Annual Miami University Undergraduate Research Forum, OH.

Publications

1. Fiala, K.A., Sherrer, S.M., Brown, J.A., and Suo, Z. (2008) Mechanistic Consequences of Temperature on DNA Polymerization Catalyzed by a Y-family DNA Polymerase. Nucleic Acids Research 36, 1990–2001.
2. Sherrer, S.M., Brown, J.A., Pack, L.R., Jasti, V.P., Fowler, J.D., Basu, A.K., Suo, Z. (2009) Mechanistic Studies of the Bypass of a Bulky Single-Base Lesion Catalyzed by a Y-Family DNA Polymerase. Journal of Biological Chemistry 284, 6379–6388.
3. Brown, J.A., Fiala, K.A., Fowler, J.D., Sherrer, S.M., Newmister, S.A., Duym, W.W., Suo, Z. (2010) A Novel Mechanism of Sugar Selection Utilized by a Human X-family DNA Polymerase. Journal of Molecular Biology 395, 282–290.
4. Brown, J.A., Zhang, L., Sherrer, S.M., Taylor, J.S.A., Burgers, P.M.J., Suo, Z. (2010) Pre-Steady State Kinetic Analysis of Truncated and Full-Length Saccharomyces cerevisiae DNA Polymerase Eta. Journal of Nucleic Acids, doi:10.4061/2010/871939.
5. Brown, J.A., Pack, L.R., Sherrer, S.M., Kshetry, A., Newmister, S.A., Fowler, J.D., Taylor, J.S., Suo, Z. (2010) Identification of Critical Residues for the Tight Binding of Both Correct and Incorrect Nucleotides to Human DNA Polymerase λ. Journal of Molecular Biology 403, 505–515.
6. Sherrer, S.M., Beyer, D.C., Xia, C.X., Fowler, J.D., Suo, Z. (2010) Kinetic Basis of Sugar Selection by a Y-Family DNA Polymerase from Sulfolobus solfataricus P2. Biochemistry 49, 10179–10186.
7. Sherrer, S.M., Fiala, K.A., Fowler, J.D., Newmister, S.A., Pryor, J., Suo, Z. (2011) Quantitative Analysis of the Efficiency and Mutagenic Spectra of Abasic Lesion Bypass Catalyzed by Human Y-Family DNA Polymerases. Nucleic Acids Research 39, 609–622.

Fields of Study

Major Field: Ohio State Biochemistry Program


Table of Contents

Abstract ...... ii

Dedication ...... v

Acknowledgements ...... vi

Vita ...... ix

Table of Contents ...... xii

List of Tables ...... xvii

List of Figures ...... xxi

List of Schemes ...... xxvii

List of Abbreviations ...... xxviii

Chapter 1 : Introduction to Y-Family DNA Polymerases ...... 1

1.1 Introduction ...... 1

1.2 Mechanisms of DNA Polymerases ...... 2

1.3 DNA Polymerase Families ...... 3

Characteristics of Each DNA Polymerase Family ...... 4

Dpo4 and the Human Y-Family DNA Polymerases ...... 7

1.4 Focus of Dissertation ...... 10


1.5 Figures ...... 13

1.6 Tables ...... 19

1.7 Schemes ...... 21

Chapter 2 : Identification of an Unfolding Intermediate for a Thermal-Stable Y-Family DNA Polymerase ...... 23

2.1 Introduction ...... 23

2.2 Materials and Methods ...... 24

2.3 Results ...... 28

2.4 Discussion ...... 37

2.5 Figures ...... 44

2.6 Tables ...... 59

Chapter 3 : Kinetic Basis of Sugar Selection by a Y-Family DNA Polymerase from Sulfolobus solfataricus P2 ...... 63

3.1 Introduction ...... 63

3.2 Material and Methods...... 65

3.3 Results ...... 68

3.4 Discussion ...... 74

3.5 Figures ...... 79

3.6 Tables ...... 91


Chapter 4 : Quantitative Analysis of the Efficiency and Mutagenic Spectra of Abasic Lesion Bypass Catalyzed by Human Y-Family DNA Polymerases ...... 94

4.1 Introduction ...... 94

4.2 Material and Methods...... 96

4.3 Results ...... 99

4.4 Discussion ...... 113

4.5 Figures ...... 120

4.6 Tables ...... 141

4.7 Schemes ...... 144

Chapter 5 : Mechanistic Studies of the Bypass of a Bulky Single-Base Lesion Catalyzed by a Y-Family DNA Polymerase ...... 145

5.1 Introduction ...... 145

5.2 Material and Methods...... 148

5.3 Results ...... 151

5.4 Discussion ...... 156

5.5 Figures ...... 165

5.6 Tables ...... 174

5.7 Scheme ...... 179


Chapter 6 : Kinetic Analysis of the Bypass of a Bulky Lesion Catalyzed by Human Y-Family DNA Polymerases ...... 180

6.1 Introduction ...... 180

6.2 Material and Methods...... 182

6.3 Results ...... 185

6.4 Discussion ...... 192

6.5 Figures ...... 201

6.6 Tables ...... 207

6.7 Schemes ...... 218

Chapter 7 : Additional Results, Future Directions and Conclusion ...... 219

7.1 Mutagenic Analysis of dGAP Bypass Catalyzed by Y-family DNA Polymerases ...... 221

Results ...... 221

Future Directions ...... 230

7.2 Mutagenic Analysis on cis-syn Cyclobutane Thymine Dimer Bypass Catalyzed by Y-Family DNA Polymerases ...... 231

Results ...... 232

Future Directions ...... 239

7.3 Mutagenic Analysis on Cisplatin-dGpG Bypass Catalyzed by Y-Family DNA Polymerases ...... 240

Results ...... 241

Future Directions ...... 247

7.4 Transient Kinetic Investigation of Human DNA Polymerase η ...... 249

Results ...... 249

Future Directions ...... 256

7.5 Kinetic Investigation of Dpo4 Domain Interactions Involving the Linker Region ...... 258

Results ...... 258

Future Directions ...... 262

7.6 Conclusion ...... 264

7.7 Figures ...... 272

7.8 Tables ...... 329

7.9 Schemes ...... 344

References ...... 351


List of Tables

Table 1.1. Estimated kinetic constants of Dpo4 at 37 °C [65]...... 19

Table 1.2. Estimated kinetic constants of Dpo4 [66]...... 20

Table 2.1. Amino acid sequence of the linker region for specific Dpo4 mutants...... 59

Table 2.2. Apparent unfolding thermodynamic parameters determined by thermal denaturation at wavelength 222 nm...... 60

Table 2.3. Apparent unfolding thermodynamic parameters in the presence of 50 mM guanidine hydrochloride at wavelength 222 nm...... 61

Table 2.4. Apparent unfolding thermodynamic parameters determined by thermal denaturation at wavelength 222 nm...... 62

Table 3.1. DNA substrates...... 91

Table 3.2. Kinetic parameters of matched rNTP or dNTP incorporation into DNA catalyzed by the wild-type Dpo4 or its Y12A mutant at 37 °C...... 92

Table 3.3. Kinetic parameters of mismatched rNTP incorporation into a DNA substrate with 5′-dG (D-1) or 5′-dA (D-1′) from the templating base dA (Table 3.1) catalyzed by the Y12A Dpo4 mutant at 37 °C...... 93

Table 4.1. DNA primer and templates ...... 141

Table 4.2. The AP bypass efficiencies of the human Y-family DNA polymerases ...... 142

Table 4.3. Error rates of the four human Y-family DNA polymerases ...... 143


Table 5.1. Sequences of DNA oligonucleotides...... 174

Table 5.2. Binding affinity of Dpo4 to damaged and control DNA substrates at 23 °C. 175

Table 5.3. Kinetic parameters for single dNTP incorporation opposite template 26-mer-dGAP...... 176

Table 5.4. Kinetic parameters of dNTP incorporation into undamaged DNA...... 177

Table 5.5. Biphasic kinetic parameters for correct dNTP incorporation into 5′-[32P]-labeled DNA (30 nM) in the presence of a DNA trap (5 µM) at 37 °C...... 178

Table 6.1. DNA Substrates for dGAP...... 207

Table 6.2. Binding affinity of hPolη, hPolκ, hPolι and hRev1 to normal and damaged DNA at 23 °C...... 208

Table 6.3. Kinetic parameters of nucleotide incorporation into control DNA catalyzed by hPolη...... 209

Table 6.4. Kinetic parameters of nucleotide incorporation into damaged DNA catalyzed by hPolη...... 210

Table 6.5. Kinetic parameters of nucleotide incorporation into control DNA catalyzed by hPolκ...... 211

Table 6.6. Kinetic parameters of nucleotide incorporation into damaged DNA catalyzed by hPolκ...... 212

Table 6.7. Kinetic parameters of nucleotide incorporation into control DNA catalyzed by hPolι...... 213

Table 6.8. Kinetic parameters of nucleotide incorporation into damaged DNA catalyzed by hPolι...... 214

Table 6.9. Kinetic parameters of nucleotide incorporation into control DNA catalyzed by hRev1...... 215

Table 6.10. Kinetic parameters of nucleotide incorporation into damaged DNA catalyzed by hRev1...... 216

Table 6.11. Biphasic kinetic parameters of correct nucleotide incorporations catalyzed by hPolη...... 217

Table 7.1. DNA substrates...... 329

Table 7.2. The dGAP bypass efficiencies of the human Y-family DNA polymerases and Dpo4...... 331

Table 7.3. Error rates of the Y-family DNA polymerases using S-3a 17/73-mer and S-3b 17/73-mer-dGAP...... 332

Table 7.4. Calculated t50^bypass of T1, T2, and total lesion bypass by Y-family DNA polymerases...... 333

Table 7.5. The total error rate of Y-family DNA polymerases using S-4a 17/77-mer and S-4b 17/77-mer-CPD...... 334

Table 7.6. Calculated t50^bypass of G1, G2, and total lesion bypass by Y-family DNA polymerases...... 335

Table 7.7. The total error rate of Y-family DNA polymerases using S-6a 15/69-mer and S-6b 15/69-mer-DDP...... 336

Table 7.8. DNA binding affinity of hPolη for different DNA substrates at room temperature...... 337

Table 7.9. Pre-steady state kinetic parameters for single dNTP incorporations into 21/41-mers catalyzed by hPolη at 37 °C...... 338

Table 7.10. Pre-steady state kinetic parameters for single dNTP incorporations into blunt-end DNA substrate BE2 catalyzed by hPolη at 37 °C...... 339

Table 7.11. Biphasic kinetic parameters of single dNTP incorporations into different DNA substrates catalyzed by hPolη...... 340

Table 7.12. Estimated kinetic parameters of hPolη for normal DNA synthesis...... 341

Table 7.13. Linker region sequences and DNA-binding parameters for various Dpo4 mutants at 25 °C...... 342

Table 7.14. Kinetic parameters for nucleotide incorporation into DNA (D-1) catalyzed by Dpo4 at 37 °C...... 343


List of Figures

Figure 1.1. Simplistic representation of a DNA polymerase (red) catalyzing DNA replication...... 13

Figure 1.2. Estimated replication error rates...... 14

Figure 1.3. Crystal structure of Dpo4 in a ternary complex with DNA and incoming nucleotide...... 15

Figure 1.4. Structural domains of Y-family DNA polymerases...... 16

Figure 1.5. Temperature dependence of nucleotide incorporation fidelity for Dpo4 [66]...... 17

Figure 1.6. Crystal structures of Dpo4 in apo-state, binary and ternary complex forms [3]...... 18

Figure 2.1. Diagram of the wt Dpo4 and mutants...... 44

Figure 2.2. CD wavelength spectra of the wt Dpo4 and mutants...... 45

Figure 2.3. Far-UV region CD wavelength of the wt Dpo4 at various temperatures...... 46

Figure 2.4. Thermal denaturation plots of the wt Dpo4 and certain mutants monitored via CD spectroscopy...... 47

Figure 2.5. Thermal denaturation of the wt Dpo4 and the LF+ monitored via CD spectroscopy at fixed wavelength 209 nm...... 48

Figure 2.6. CD spectra of LF...... 49


Figure 2.7. FTS plots of wt Dpo4 and three mutants in buffer A...... 50

Figure 2.8. CD spectra of the wt Dpo4 and the P236A Dpo4 in the presence of 50 mM guanidine hydrochloride...... 51

Figure 2.9. CD spectra of the wt Dpo4 and linker mutants...... 53

Figure 2.10. Crystal structure of Dpo4 in apo-state...... 55

Figure 2.11. CD spectra of the wt Dpo4 and selected mutants...... 56

Figure 2.12. Sequence alignment of Y-family DNA polymerases...... 58

Figure 3.1. Sequence alignment of the Y-family DNA polymerases...... 79

Figure 3.2. Running start assays for the wild-type Dpo4 and the Y12A Dpo4 mutant at 37 °C...... 80

Figure 3.3. Alkaline degradation of full-length extension on D-1 21/41-mer catalyzed by Y12A Dpo4 at 37 °C...... 81

Figure 3.4. Single nucleotide incorporation assays at 37 °C...... 82

Figure 3.5. Matched rATP incorporation into DNA substrate D-7 (Table 3.1) catalyzed by the Y12A Dpo4 mutant at 37 °C...... 83

Figure 3.6. Comparison of sugar selectivity values (Table 3.2) between the wild-type Dpo4 (grey bar) and the Y12A Dpo4 mutant (black bar)...... 85

Figure 3.7. Mismatched rATP incorporation into DNA substrate D-1 (Table 3.1) catalyzed by the Y12A Dpo4 mutant at 37 °C...... 86

Figure 3.8. Blunt-end rATP additions onto DNA substrate BE2 (Table 3.1) catalyzed by the Y12A Dpo4 mutant at 37 °C...... 88

Figure 3.9. Magnification of the active site of the wild-type Dpo4...... 89


Figure 3.10. Model of an RNA/DNA primer/template duplex into the DNA binding cleft of the Y12A Dpo4 mutant...... 90

Figure 4.1. Running start assays for human Y-family DNA polymerases...... 120

Figure 4.2. Time-dependent AP bypass% during running start assays...... 122

Figure 4.3. Mutation spectra of DNA synthesis catalyzed by hPolη (A and B), hΔPolι (C and D) and hΔPolκ (E and F)...... 123

Figure 4.4. Comparison of preferred actions by human Y-family DNA polymerases opposite the AP site in the damaged template 51AP (A) or the corresponding template base dT in undamaged template 63CTL (B)...... 132

Figure 4.5. Histogram of relative error% as a function of template position...... 133

Figure 4.6. Relative error% as a function of template position from the AP site for hPolη- (A), hΔPolι- (B), and hΔPolκ-catalyzed (C) nucleotide incorporation events opposite template bases dTs...... 135

Figure 4.7. Relative error% as a function of template position from the AP site for hPolη- (A), hΔPolι- (B), and hΔPolκ-catalyzed (C) nucleotide incorporation events opposite template bases dGs...... 137

Figure 4.8. Relative error% as a function of template position from the AP site for hPolη- (A), hΔPolι- (B), and hΔPolκ-catalyzed (C) nucleotide incorporation events opposite template bases dAs...... 139

Figure 5.1. Running start assays...... 165

Figure 5.2. Measurement of Kd, DNA at the first pause site...... 166

Figure 5.3. Kinetics of dCTP incorporation into 22/26-mer-dGAP ...... 167


Figure 5.4. Quantitative effects of a dGAP lesion on correct dNTP incorporation catalyzed by Dpo4...... 168

Figure 5.5. Effectiveness of the DNA trap for biphasic kinetic assays...... 170

Figure 5.6. Biphasic kinetics of correct dNTP incorporation in the presence of a DNA trap...... 171

Figure 5.7. Proposed kinetic mechanism for the bypass of dGAP catalyzed by Dpo4. ... 173

Figure 6.1. Chemical structure of dGAP...... 201

Figure 6.2. Running start assay for hPolη (A and B), hPolκ (C and D), hPolι (E and F) and hRev1 (G and H) at 37 °C. (A, C, E and G) 17/26-mer; (B, D, F and H) 17/26-mer-dGAP...... 202

Figure 6.3. EMSA for hPolη using 5′-radiolabeled 20/26-mers...... 204

Figure 6.4. Effectiveness of the DNA trap for biphasic kinetic assays of correct dGTP incorporation into 20/26-mer catalyzed by hPolη...... 205

Figure 6.5. Biphasic kinetics of correct dGTP incorporation into 20/26-mers catalyzed by hPolη in the presence of a DNA trap...... 206

Figure 7.1. Running start assay for Sulfolobus solfataricus DNA Polymerase B1exo+ at 37 °C...... 272

Figure 7.2. Bypass of dGAP over time...... 273

Figure 7.3. Standing start assay for hRev1 at 37 °C...... 274

Figure 7.4. Mutation spectra of DNA synthesis catalyzed by Dpo4 (A and B), hPolη (C and D), hPolκ (E and F), and hPolι (G)...... 275

Figure 7.5. Comparison of preferred actions by Dpo4, hPolη and hPol opposite dGAP in DNA substrate S-3b 17/73-mer-dGAP or the corresponding template base dG in DNA substrate S-3a 17/73-mer...... 285

Figure 7.6. Histogram of relative error% as a function of template position...... 287

Figure 7.7. Crystal structure of cis-syn cyclobutane thymine dimer (cis-syn TT)...... 291

Figure 7.8. Running start assays for cis-syn TT bypass catalyzed by Y-family DNA polymerases...... 292

Figure 7.9. Bypass of cis-syn TT as a function of time...... 294

Figure 7.10. Mutation spectra of DNA synthesis catalyzed by Dpo4 (A and B), hPolη (C), hPolκ (D and E), and hPolι (F and G)...... 295

Figure 7.11. Comparison of preferred actions by Dpo4, hPolη, hPolκ and hPolι opposite cis-syn TT in DNA substrate S-4b 17/77-mer-CPD or the corresponding template bases TT in DNA substrate S-4a 17/77-mer...... 302

Figure 7.12. Histograms of relative error% as a function of template position...... 304

Figure 7.13. Running start assays for cisplatin-dGpG bypass catalyzed by hPolη and hPol...... 308

Figure 7.14. Bypass of cisplatin-dGpG as a function of time...... 309

Figure 7.15. Mutation spectra of DNA synthesis catalyzed by Dpo4 (A and B), and hPolκ (C and D)...... 310

Figure 7.16. Comparison of preferred actions by Dpo4 and hPol opposite cisplatin-dGpG in DNA substrate S-6b 15/69-mer-DDP or the corresponding template bases GG in DNA substrate S-6a 15/69-mer...... 314

Figure 7.17. Histogram of relative error% as a function of template position...... 316

Figure 7.18. Pre-steady state burst kinetics of dTTP incorporation into D-1 catalyzed by hPolη...... 319

Figure 7.19. EMSA for hPolη using different DNA substrates...... 320

Figure 7.20. Pre-steady state kinetics of dTTP incorporation into D-1 21/41-mer catalyzed by hPolη at 37 °C...... 321

Figure 7.21. Biphasic kinetics of hPolη incorporating dCTP and dATP into D-8 and BE2, respectively...... 322

Figure 7.22. Zoomed in view of binary complex Dpo4•DNA...... 324

Figure 7.23. Chemical structure of 2-aminopurine hydrogen bonding to thymine...... 325

Figure 7.24. Equilibrium dissociation constant for the wt Dpo4 using fluorescence titration assay at 25 °C...... 326

Figure 7.25. Denaturation of the wt Dpo4 by increasing guanidinium chloride (GDN) at 37 °C monitored by tyrosine fluorescence...... 327

Figure 7.26. Model of a DNA primer/template duplex containing a dGAP into the DNA binding cleft of Dpo4 mutant...... 328


List of Schemes

Scheme 1.1. General DNA polymerization reaction...... 21

Scheme 1.2. Minimal kinetic mechanism of Dpo4 [65]...... 22

Scheme 4.1. Short oligonucleotide sequencing assay...... 144

Scheme 5.1. Formation of N-(deoxyguanosin-8-yl)-1-aminopyrene (dGAP)...... 179

Scheme 6.1. Proposed kinetic mechanism for dGAP bypass catalyzed by human Polη...... 218

Scheme 7.1. SOSA scheme for dGAP bypass analysis...... 344

Scheme 7.2. SOSA scheme for cis-syn TT bypass analysis...... 345

Scheme 7.3. SOSA scheme for cisplatin-dGpG bypass analysis ...... 346

Scheme 7.4. General kinetic mechanism of DNA synthesis catalyzed by hPolη...... 347

Scheme 7.5. Folding and unfolding of Dpo4...... 348

Scheme 7.6. Proposed two-polymerase lesion bypass pathway...... 349

Scheme 7.7. Modified kinetic mechanism for DNA synthesis catalyzed by hPolη...... 350


List of Abbreviations

1-AP 1-aminopyrene

1-NP 1-nitropyrene

AAF N-acetyl-2-aminofluorene

AAF-dG N-acetyl-2-aminofluorene adduct at the C8 position of deoxyguanosine

BPDE Benzo[a]pyrene diol epoxide

BPDE-dA Benzo[a]pyrene 7,8-diol 9,10-epoxide-derived adduct at the N6 position of 2′-deoxyadenosine

BPDE-dG Benzo[a]pyrene 7,8-diol 9,10-epoxide-derived adduct at the N2 position of 2′-deoxyguanosine

BSA Bovine serum albumin

CD Circular Dichroism

cisplatin-dGpG cis-[Pt(NH3)2{d(GpG)-N7(1),-N7-(2)}]

cis-syn TT cis-syn cyclobutane thymine dimer

CPD cis-syn cyclobutane pyrimidine dimer

Core Thumb, Finger and Palm domains of Dpo4

DinB Damage-induced protein

dGAP N-(deoxyguanosin-8-yl)-1-aminopyrene

dNTP 5′-triphosphate deoxyribonucleotide

dPTP 5′-triphosphate deoxyribopyrene

Dpo4 Sulfolobus solfataricus P2 DNA Polymerase IV

EMSA Electrophoretic mobility shift assay

FTS Fluorescence-based thermal scanning

hPolη Human DNA Polymerase η

hPolι Human DNA Polymerase ι

hPolκ Human DNA Polymerase κ

hRev1 Human Rev1

LF Little Finger domain of Dpo4

LF+ Little Finger domain of Dpo4 with the linker region

NMR Nuclear magnetic resonance

NTP 5′-triphosphate nucleotide

PAGE Polyacrylamide gel electrophoresis

PAH Polyaromatic hydrocarbon

PCNA Proliferating cell nuclear antigen

rNTP 5′-triphosphate ribonucleotide

RPA Replication protein A

RT Reverse transcriptase

TBE Tris boric acid EDTA

wt Wild type


Chapter 1 : Introduction to Y-Family DNA Polymerases

1.1 Introduction

The conservation of genetic information is crucial for cellular survival in all organisms. Genetic information is conserved through DNA replication, which is carried out by replicative DNA polymerases. A simplified representation of a DNA polymerase performing DNA replication is shown in Figure 1.1. These specialized enzymes accurately replicate DNA by creating complementary DNA strands in which standard Watson-Crick base pairs are formed along the DNA strands in an anti-parallel direction. However, when DNA is damaged by endogenous sources (e.g. oxidative stress and cleavage by uracil glycosylases) or exogenous sources (e.g. UV radiation and chemotherapy), DNA replication is often stalled as replicative DNA polymerases cannot traverse DNA lesions. If DNA replication does not restart during mitosis, cellular division is suspended and apoptosis is subsequently initiated, leading to cell death. To avoid apoptosis, the cell will attempt to repair DNA lesions with specific enzymes and specialized gap-filling DNA polymerases. If a DNA lesion is not repaired by these enzymes, the cell will then employ lesion bypass pathways using a separate set of specialized DNA polymerases. This lesion bypass process is considered a last resort for cell survival as the lesion bypass pathway can be error-prone, which will corrupt genetic information through the introduction of mutations.

1.2 Mechanisms of DNA Polymerases

All DNA polymerases catalyze a DNA polymerization reaction as shown in Scheme 1.1. Structurally, all DNA polymerases resemble a right hand, and possess at least Palm, Thumb and Finger domains. These domains are used to bind DNA and the incoming 5′-triphosphate deoxyribonucleotide (dNTP), as well as to form the active site for DNA polymerization. The open-to-closed conformational change of DNA polymerases during catalysis is believed to be universal, and is induced by the formation of either the binary (enzyme•DNA) or the ternary (enzyme•DNA•dNTP) complex [1-3]. Accordingly, the overall DNA polymerization reaction rate is increased in the closed conformation of the DNA polymerase, when all of the essential components are in close proximity.

The DNA polymerase active site is located within the Palm domain. Three conserved catalytic carboxylate residues within the active site of each DNA polymerase coordinate the two divalent catalytic metal ions into position, and assist in the nucleophilic attack of the primer terminus 3′-hydroxyl group on the α-phosphate of the incoming dNTP (Scheme 1.1) [1]. One of these catalytic metal ions, usually Mg²⁺, lowers the affinity of the 3′-OH for its proton by stabilizing the 3′-O⁻ group. The second catalytic metal ion, also normally Mg²⁺, interacts with the β- and γ-phosphate groups of the incoming dNTP, and therefore stabilizes the pyrophosphate leaving group [1, 3]. Removal of any of the aforementioned reaction components will result in the inactivation of the DNA polymerase [2, 3].

However, each DNA polymerase incorporates dNTPs with varying accuracy and efficiency (Figure 1.2) [4, 5]. The dNTP binding affinity, the DNA binding affinity, and base stacking interactions have all been implicated in the observed differences in dNTP incorporation efficiency [4, 6]. During DNA replication, any of the following events results in no error being introduced: i) the DNA polymerase inserts the correct dNTP; ii) the DNA polymerase misincorporates a dNTP, followed by a 3′→5′ proofreading step; or iii) the DNA polymerase misincorporates a dNTP, followed by a DNA repair process [5]. Thus, high dNTP selectivity is considered the greatest contributor to DNA replication accuracy (fidelity). Excluding water molecules from the active site, possessing a fidelity-checking α-helix (usually named Helix O), containing a proofreading 3′→5′ exonuclease domain, having a 'snug' active site, forming Watson-Crick base pairs, and using geometric selection for correct Watson-Crick base pairing have all been implicated in the differences in dNTP incorporation fidelity [4, 5, 7]. Consequently, a major focus in the DNA polymerase field is the structure-function relationships between dNTP incorporation efficiency/fidelity and these factors.

1.3 DNA Polymerase Families

DNA polymerases are broadly classified into six families based on phylogenetic relationships: the A-, B-, C-, D-, X-, and Y-families. While all DNA polymerases share a similar overall organization of their respective structural domains, their active site amino acid compositions, sizes, rate constants for catalysis, and dNTP incorporation fidelities vary greatly. Besides the three conserved catalytic carboxylate residues within the active site, there is little primary sequence similarity between members of different DNA polymerase families, or even within the same family.

Characteristics of Each DNA Polymerase Family

A-family DNA polymerases are found in bacteria, metazoans, plants, mitochondria and viruses. These enzymes are characterized as highly accurate DNA template-dependent DNA polymerases that possess 3′ → 5′ exonuclease activity and possibly 5′ → 3′ exonuclease activity. The representative A-family member is Escherichia coli (E. coli) DNA Polymerase I, which possesses all of these characteristics and is involved in DNA repair [8]. This family also includes replicative DNA polymerases. For example, eukaryotic DNA polymerase gamma (γ) is a well-characterized polymerase that is solely responsible for the replication and repair of the mitochondrial genome [9-11].

B-family DNA polymerases can be found in Archaea, Eukaryota, and viruses [12]. These B-family enzymes include DNA polymerases alpha (α), delta (δ), epsilon (ε), and zeta (ζ). DNA Polymerases δ and ε are responsible for the synthesis of the leading and lagging DNA strands during DNA replication, and possess both 3′ → 5′ and 5′ → 3′ exonuclease activities [8]. DNA Polymerase α functions as a primase and possesses only 3′ → 5′ exonuclease activity [13]. Unusually for a B-family member, DNA polymerase ζ has a significantly lower dNTP incorporation fidelity due to its lack of 3′ → 5′ exonuclease activity [14]. This DNA polymerase is possibly involved in DNA lesion bypass pathways as the bypass product extender [14-16] and in somatic hypermutation, owing to its fairly efficient extension of mispaired primer termini [17-19] and its inability to incorporate dNTPs opposite DNA lesions.

The C-family DNA polymerases are found exclusively in bacteria [8]. These replicative DNA polymerases possess 3′ → 5′ exonuclease activity as well as high dNTP incorporation fidelity. E. coli DNA Polymerase III is the best-studied member of the C-family [20].

Found exclusively in Archaea, the replicative D-family DNA polymerases are enzymes with high dNTP incorporation fidelity. Members of this family possess proofreading 3′ → 5′ exonuclease activity. The most well-known D-family polymerase, Pyrococcus furiosus (Pfu) DNA polymerase, is widely used in PCR techniques [21]. However, D-family DNA polymerases share very little amino acid sequence homology with DNA polymerases from other families [8].

Found in Archaea, Eukaryota, bacteria and viruses, the X-family of DNA polymerases is a subdivision of a larger superfamily that can catalyze DNA template-independent nucleotidyl transfer [12]. Its members include DNA Polymerases beta (β), lambda (λ), mu (μ), sigma 1 (σ1), and sigma 2 (σ2) as well as African swine fever virus DNA Polymerase X (ASFV PolX) and terminal deoxynucleotidyl transferase (TdT). TdT catalyzes non-templated, random dNTP addition at V(D)J junctions [8]. Polβ has been shown to catalyze gap-filling synthesis after removing the 5′-deoxyribose phosphate moiety during base excision repair (BER) [22, 23]. Polλ is structurally similar to Polβ, but possesses two additional domains, a breast cancer susceptibility gene 1 C-terminal (BRCT) domain and a proline-rich domain [24-26].

The members of the Y-family are found in all domains of life. These Y-family DNA polymerases are classified as distributive enzymes that incorporate only a few dNTPs per DNA binding event before dissociating from the binary enzyme•DNA complex [27-32], and are all devoid of 3′ → 5′ exonuclease activity [16, 33-36]. It is hypothesized that the primary biological function of Y-family DNA polymerases is to perform translesion DNA synthesis (TLS). The importance of this role is evident as most organisms contain genomes that encode more than one Y-family DNA polymerase. For example, of the 16 identified human DNA polymerases, 4 are Y-family members: Rev1 and DNA Polymerases eta (Polη), iota (Polι), and kappa (Polκ) [37].

Sulfolobus solfataricus DNA Polymerase IV (Dpo4) serves as a model of the Y-family as it is the only Y-family member encoded by this thermophilic archaeon [38, 39], and it is one of the most studied Y-family DNA polymerases [3, 27, 40-59]. The first crystal structure of Dpo4 (Figure 1.3) depicts the general structure of Y-family DNA polymerases [60]. All Y-family enzymes possess smaller Finger and Thumb domains than those in other DNA polymerase families [37, 61]. In addition, all Y-family DNA polymerases possess a domain named the Little Finger (LF) domain or the Polymerase-Associated domain (PAD) that is connected to the Thumb domain at the C-terminus (Figure 1.4) [37, 61]. Together, these structural differences lead to a more solvent-accessible active site that can accommodate a modified DNA base. Previous studies have demonstrated that Y-family DNA polymerases can bypass specific DNA lesions, and thus can rescue stalled DNA replication forks [28, 32, 61-64].

Dpo4 and the Human Y-Family DNA Polymerases

Dpo4 is a homolog of E. coli DNA Polymerase IV (Pol IV) and the S. acidocaldarius Damage-induced B (DinB) homolog (Dbh) [37, 60]. By using pre-steady state kinetic methods, the dNTP incorporation fidelity of Dpo4 has been measured at 37 °C to be in the range of 10⁻³ to 10⁻⁴, or one error per 1,000 to 10,000 dNTP incorporations [27]. Taken together, our previous studies with Dpo4 have elucidated a minimal kinetic mechanism shown in Scheme 1.2 with corresponding kinetic parameters shown in Table 1.1 [27, 65]. Over a temperature range from 26 °C to 56 °C, the reaction rate constant increases exponentially for both correct and incorrect dNTP incorporations, so that the dNTP incorporation fidelity does not change significantly across this range (Figure 1.5 and Table 1.2) [66]. A recent structural study of Dpo4 revealed large conformational changes during the formation of the binary complex, followed by subtle conformational changes during the formation of the ternary complex (Figure 1.6) [3]. The concerted domain movements of Dpo4, as well as DNA translocation within the active site of Dpo4 during DNA replication, have also been demonstrated using fluorescence resonance energy transfer (FRET) stopped-flow kinetics coupled with pre-steady state kinetics [2]. Within the context of TLS, Dpo4 is the only lesion bypass DNA polymerase in S. solfataricus and consequently must be able to traverse a more diverse set of DNA lesions than its eukaryotic homologs [38, 39].

Amongst the four human Y-family DNA polymerases, the best-known enzyme is Polη, which is part of the Rad30 subfamily of Y-family DNA polymerases [63]. A series of studies have shown that the inactivation of Polη leads to a genetic condition known as Xeroderma Pigmentosum variant (XPV), which is characterized by a higher sensitivity to sunlight-induced skin cancer [34, 67-69]. Previous investigations have determined that UV light creates DNA damage, e.g. cyclobutane pyrimidine dimers (CPD), which is the direct cause of tumorigenesis [70, 71]. Interestingly, both in vitro and in vivo studies have demonstrated the error-free bypass of the double-base lesion cis-syn cyclobutane thymine dimer (cis-syn TT) catalyzed by Polη [62, 70, 72-77]. Due to this ability, it is theorized that Polη may be a source of resistance to the anticancer drug cisplatin [78-81]. However, Polη is error-prone when bypassing cis-[Pt(NH₃)₂{d(GpG)-N7(1),-N7(2)}] (cisplatin-dGpG) intrastrand adducts [62] as well as other lesions including apurinic/apyrimidinic (abasic, AP) sites [82], 7,8-dihydro-8-oxoguanine (8-oxoG) [36], (+)-trans-anti-benzo[a]pyrene-N²-dG ((+)BPDE-dG) [82], 1,N⁶-ethenodeoxyadenosine [35], O⁶-methylguanine [83], and N-2-acetylaminofluorene-dG (AAF-dG) [62].

Another member of the Rad30 subfamily is Polι, which is the product of the RAD30B gene [30, 63]. The phenotype of XPV patients caused by an increased frequency of mutations is thought to arise from Polι acting as the error-prone cis-syn TT bypass enzyme in vivo [28, 71, 84, 85]. Notably, the observation that Polι incorporates incorrect dGTP opposite a template base dT more efficiently than canonical dATP [30, 86] supports this hypothesis. In vitro, hPolι traverses AP sites [85-87], 8-oxoG [85], AAF-dG [85], cis-syn TT dimers [87], and (6-4) TT photoproducts [17, 85]. In addition to TLS, Polι and Polη have been implicated in somatic hypermutation [61].

The Y-family DNA polymerase Rev1 is classified as a dCTP transferase [32, 88, 89]. Rev1 efficiently incorporates dCTP opposite many DNA lesions including AP sites [89, 90], 8-oxoG, (+)BPDE-dG, (-)BPDE-dG, and 1,N⁶-ethenoadenine adducts [32]. Recent structural studies of Rev1 reveal a novel mechanism of DNA synthesis where an active site Arg from the N-terminal addition, or 'Digit domain', replaces the DNA template base and readily forms hydrogen bonds with the incoming dCTP [91, 92]. In addition to having a region that interacts with proliferating cell nuclear antigen (PCNA), Rev1 has a specific region that interacts with other Y-family DNA polymerases [93, 94]. Moreover, Rev1 is the only Y-family DNA polymerase that contains a BRCT domain, which is used to recruit enzymes to DNA lesion sites during DNA replication [95-97]. Thus, it has been proposed that Rev1 participates in TLS more as a scaffold protein than as a lesion bypass DNA polymerase in vivo [61, 97].

Polκ is a member of the DinB subfamily, and is a close structural relative of Dpo4 [31, 98]. Similar to Dpo4 and Polη, Polκ has been shown to bypass AP sites, 8-oxoG, AAF-dG, and (+)BPDE-dG [99]. Additionally, it has been demonstrated that Polκ can efficiently elongate mispaired primer termini [100]. Due to this characteristic, it is theorized that Polκ participates in lesion bypass pathways as the bypass product extender for many different DNA lesions. Notably, Polκ is highly expressed in cell lines enriched with steroids such as estrogen [98, 101]. The N-terminal Digit domain of Polκ is hypothesized to shield bulky hydrophobic adducts within the active site as it encircles the DNA [102].

Despite the known characteristics of human Y-family DNA polymerases, one of the prevalent challenges in TLS studies is to identify the in vivo lesion bypass specificity of each human Y-family DNA polymerase. Together, the studies mentioned in this section indicate that there is a significant overlap in their in vitro lesion bypass abilities. With the exception of the cis-syn TT dimer, it remains unclear which human Y-family DNA polymerase is responsible for the bypass of each particular DNA lesion and which bypass mechanism is utilized in vivo. It is hypothesized that the varying lesion bypass abilities of the human Y-family DNA polymerases are derived from the structural differences amongst these enzymes. Knowing the lesion specificity of each human Y-family DNA polymerase will help in designing drugs for diseases directly caused by DNA-damaging reagents.

1.4 Focus of Dissertation

The focus of this dissertation is to elucidate structure-function relationships between the specific lesion bypass capacity of each Y-family DNA polymerase and the structural differences amongst the Y-family members. Prior to initiating our investigation into structure-function relationships involving the human Y-family DNA polymerases, we answered two important questions regarding Dpo4, the model Y-family member from S. solfataricus. These questions were: i) are structural and kinetic studies conducted outside the endogenous temperature of S. solfataricus relevant to the native structure and activity of Dpo4; and ii) what nucleotide sugar selection mechanism does Dpo4 employ during DNA synthesis. As DNA replication occurs in vivo, both ribonucleotides (rNTPs) and deoxyribonucleotides (dNTPs) are available to be incorporated into DNA. However, high-fidelity DNA-dependent DNA polymerases have high NTP sugar selectivity against rNTPs. It is important to determine if error-prone Y-family DNA polymerases also have a high NTP sugar selectivity, possibly by using a novel mechanism. After answering these questions, we used the knowledge gained from studies with Dpo4 to begin our investigations with the human Y-family DNA polymerases. First, we applied the newly developed short oligonucleotide sequencing assay (SOSA) method in order to visualize the mutagenic potential of each human Y-family member during DNA replication, and to compare the abasic site bypass abilities of each human Y-family enzyme and Dpo4. The abasic site bypass outcome for each human Y-family DNA polymerase is extremely relevant as abasic sites arise from both endogenous and exogenous sources, e.g. oxidative stress, and are the most prevalent DNA lesions in vivo [103, 104]. We then applied the knowledge gained from this study to examine possible lesion bypass mechanisms utilized by each human Y-family DNA polymerase and Dpo4 in the presence of an air pollution derivative (N-(deoxyguanosin-8-yl)-1-aminopyrene, dGAP), a product of UV radiation (cis-syn TT), or a product of the chemotherapy drug cisplatin (cisplatin-dGpG). Data from these investigations not only added to the mechanistic details of how genetic mutations arise under different environmental circumstances, but also demonstrated the varying lesion bypass abilities of each enzyme within the context of each DNA lesion studied. Additionally, the work herein may be used to help design better chemotherapy drugs that Y-family DNA polymerases will not be able to bypass in an error-free manner, thereby reducing the frequency of drug resistance observed for anticancer drugs such as cisplatin. Our work will also help determine the Y-family DNA polymerases responsible for mutagenic phenotypes that lead to tumor formation in the presence of cis-syn TT or dGAP in vivo.

1.5 Figures

Figure 1.1. Simplified representation of a DNA polymerase (red) catalyzing DNA replication.

The site where the incoming dNTP hydrogen bonds to the DNA template is shaded in light blue.

The DNA backbone is in yellow.

Figure 1.2. Estimated replication error rates.

The ranges of error rates for single base substitutions (top) or deletions (bottom) are shown. The dashed lines denote that error rates could be equal to or lower than indicated.

This figure is originally from [4].

Figure 1.3. Crystal structure of Dpo4 in a ternary complex with DNA and incoming nucleotide.

The DNA and Dpo4 structures are drawn in ribbon cartoon while the incoming nucleotide is in stick form. The Finger, Palm, Thumb and LF domains are colored blue, red, green and purple, respectively. The DNA is colored in gold. This figure is originally from [60].

Figure 1.4. Structural domains of Y-family DNA polymerases.

The Finger, Palm, Thumb and LF domains are colored blue, red, green and purple, respectively. This figure is modified from [61].

Figure 1.5. Temperature dependence of nucleotide incorporation fidelity for Dpo4 [66].

Figure 1.6. Crystal structures of Dpo4 in apo-state, binary and ternary complex forms [3].

(a) Apo-Dpo4; the light blue dashed line shows the unstructured loop of the Finger domain in apo-Dpo4; (b) Dpo4•DNA binary complex; (c) Dpo4•DNA•dTTP ternary complex; and (d) superposition of apo-Dpo4 (blue), Dpo4•DNA binary complex (green), and Dpo4•DNA•dTTP ternary complex (red) in Cα traces. (a–c) Dpo4 structures are drawn in ribbon cartoon and colored with the same color scheme as in Figure 1.3.

1.6 Tables

Table 1.1. Estimated kinetic constants of Dpo4 at 37 °C [65].

a Results taken from [27, 65].

Table 1.2. Estimated kinetic constants of Dpo4 [66].

1.7 Schemes

Scheme 1.1. General DNA polymerization reaction.

Scheme 1.2. Minimal kinetic mechanism of Dpo4 [65].

Chapter 2 : Identification of an Unfolding Intermediate for a Thermal-Stable Y-Family DNA Polymerase

2.1 Introduction

Y-family DNA polymerases are known to have low fidelity and low processivity during DNA synthesis over undamaged DNA substrates. Importantly, there are numerous examples of Y-family DNA polymerases bypassing a variety of DNA lesions due to their relatively open active sites when compared to replicative DNA polymerases [4, 8, 60, 105]. The most studied Y-family DNA polymerase is Sulfolobus solfataricus DNA polymerase IV (Dpo4). Dpo4 is a thermal-stable enzyme that catalyzes DNA synthesis at 37 °C with nucleotide incorporation fidelity in the range of 10⁻³ to 10⁻⁴ based on pre-steady state kinetic analysis [27, 57, 65, 66]. Previous studies have shown Dpo4 bypassing an abasic site [46, 58, 59, 106], as well as modified DNA bases such as 7,8-dihydro-8-oxodeoxyguanosine [50], 1,N²-etheno(ε)deoxyguanosine [56], cis-syn cyclobutane pyrimidine thymine-thymine dimer [41, 44, 107], 1,2-cisplatinated deoxyguanosine [41, 108], deoxyadenosine with a benzo[a]pyrene diol epoxide adduct [109], N-(deoxyguanosin-8-yl)-1-aminopyrene [57], and N-(deoxyguanosin-8-yl)-2-acetylaminofluorene [41].

Recently, a series of Dpo4 crystal structures has shown a large tertiary conformational change when Dpo4 binds double-stranded DNA to form the binary complex Dpo4•DNA [3]. In contrast, only small local and secondary conformational changes occur as the binary complex Dpo4•DNA binds dNTP to form the ternary complex Dpo4•DNA•dNTP [3, 50, 60]. However, the reliability of Dpo4 structural information collected at lower-than-physiological temperatures, i.e. lower than 80 °C, has not been investigated. A biochemical study demonstrated that Dpo4 maintains significant activity at temperatures up to 95 °C [41]. Moreover, the nucleotide incorporation fidelity of Dpo4 remains relatively unchanged over a temperature range from 26 °C to 56 °C, suggesting that no substantial structural changes at the active site occur within the studied temperature range [66].

In this investigation, we provided circular dichroism (CD) spectroscopic evidence that the structural stability of Dpo4 was extremely high, with a melting temperature (Tm) well above the physiological temperature of S. solfataricus. The CD spectroscopic data also showed that Dpo4 unfolds in a three-state cooperative fashion. From these data, the apparent unfolding thermodynamic parameters of Dpo4 were calculated. The origin of thermal stability and the thermal-stable unfolding intermediate were also elucidated for Dpo4 via a series of CD spectroscopic analyses and fluorescence-based thermal scanning (FTS) of the wild-type Dpo4 and its mutants.

2.2 Materials and Methods

Protein Purification. Full-length, wild-type S. solfataricus Dpo4 (wt Dpo4) fused to a C-terminal His6-tag was overexpressed and purified as previously described [27]. The following truncated Dpo4 mutants were purified using the same protocol as for the wt Dpo4 [27]: the mutant containing only the Core (amino acid residues 1-230), the mutant containing the LF+ (amino acid residues 231-352), and the mutant containing only the LF (amino acid residues 246-352). The following full-length Dpo4 constructs were purified in the same manner: the Dpo4 mutant where the amino acid residues of the linker region [3, 60] were changed to all glycine residues (All-Gly linker Dpo4), the Dpo4 mutant where all positively charged amino acid residues within the linker region were changed to alanine residues (R/K-to-A linker Dpo4) or aspartic acid residues (R/K-to-D linker Dpo4), the Dpo4 mutant where E235 and R240 were changed to alanine residues (E235A/R240A Dpo4), the Dpo4 mutant where R240 was changed to alanine (R240A Dpo4), the Dpo4 mutant where E100, E235 and R240 were changed to alanine residues (E100A/E235A/R240A Dpo4), the Dpo4 mutant where K148, E235 and R240 were changed to alanine residues (K148A/E235A/R240A Dpo4), the Dpo4 mutant where E100, K148, E235 and R240 were changed to alanine residues (E100A/K148A/E235A/R240A Dpo4), and the Dpo4 mutant where P236 was changed to alanine (P236A Dpo4). The following full-length Dpo4 constructs with mutations only in the Palm domain were also purified: the Dpo4 mutant where E100 was changed to alanine (E100A Dpo4), the Dpo4 mutant where K148 was changed to alanine (K148A Dpo4), and the Dpo4 mutant where E100 and K148 were changed to alanine residues (E100A/K148A Dpo4). Prior to experiments, these proteins were exchanged into degassed buffers and the final concentrations were determined spectrophotometrically at 280 nm using the following extinction coefficients: 24058 M⁻¹ cm⁻¹ for the wt Dpo4, the R/K-to-A linker Dpo4, the R/K-to-D linker Dpo4, the E235A/R240A Dpo4, the R240A Dpo4, the E100A Dpo4, the K148A Dpo4, the E100A/K148A Dpo4, the E100A/E235A/R240A Dpo4, the K148A/E235A/R240A Dpo4, the E100A/K148A/E235A/R240A Dpo4, the P236A Dpo4, and the All-Gly linker Dpo4; 16608 M⁻¹ cm⁻¹ for the Core; 7668 M⁻¹ cm⁻¹ for the LF+; and 5960 M⁻¹ cm⁻¹ for the LF.
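The spectrophotometric concentration determination described above follows the Beer-Lambert law, c = A280/(ε·l). The following is a minimal Python sketch of that calculation. The extinction coefficients are the values listed in the text; the 1 cm default path length and the absorbance reading are illustrative assumptions that do not come from this work.

```python
# Molar extinction coefficients at 280 nm (M^-1 cm^-1), as listed in the text.
EXTINCTION_280 = {
    "wt Dpo4": 24058.0,  # the text uses the same value for all full-length mutants
    "Core": 16608.0,
    "LF+": 7668.0,
    "LF": 5960.0,
}


def molar_concentration(a280: float, construct: str, path_cm: float = 1.0) -> float:
    """Beer-Lambert law: c = A / (epsilon * l), returned in mol/L.

    The 1 cm default path length is an assumption, not a value from the text.
    """
    if a280 < 0 or path_cm <= 0:
        raise ValueError("absorbance must be >= 0 and path length must be > 0")
    return a280 / (EXTINCTION_280[construct] * path_cm)


# Hypothetical absorbance reading, for illustration only.
conc = molar_concentration(0.60, "wt Dpo4")
print(f"wt Dpo4: {conc * 1e6:.1f} uM")
```

Keeping the coefficients in one lookup table makes it explicit which constructs share the wild-type value and which truncations use their own.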

Circular Dichroism Spectroscopy. CD spectra were obtained using an AVIV CD spectrometer model 62A DS. Unless specified, CD wavelength spectra were acquired in a 1 mm quartz cuvette at 37 °C over a wavelength range from 200 nm to 270 nm using a step size of 1 nm and 5 s of signal averaging at each wavelength. Protein samples with a final concentration of approximately 1 mg·mL⁻¹ were dissolved into degassed buffer A (25 mM NaPO₄ pH 7.5, 50 mM NaCl, and 5 mM MgCl₂) and filtered through a 0.45 µm membrane to remove residual aggregates. CD wavelength spectra of buffer alone were obtained and subtracted from sample spectra. Ellipticities were converted to molar ellipticities (deg·cm²·dmol⁻¹), and were plotted as a function of wavelength.
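The buffer subtraction and unit conversion described above can be sketched as follows. The text does not spell out the conversion, so this sketch assumes the commonly used relation [θ] = θobs(mdeg)/(10·c·l), with c in mol/L and l in cm, which yields deg·cm²·dmol⁻¹. The readings and the molar concentration below are hypothetical.

```python
def subtract_buffer(sample_mdeg, buffer_mdeg):
    """Point-by-point buffer subtraction for spectra on the same wavelength grid."""
    if len(sample_mdeg) != len(buffer_mdeg):
        raise ValueError("spectra must share the same wavelength grid")
    return [s - b for s, b in zip(sample_mdeg, buffer_mdeg)]


def molar_ellipticity(theta_mdeg: float, conc_molar: float, path_cm: float) -> float:
    """Convert observed ellipticity (mdeg) to molar ellipticity (deg cm^2 dmol^-1).

    Assumed relation: [theta] = 100 * theta_deg / (c * l) = theta_mdeg / (10 * c * l).
    """
    if conc_molar <= 0 or path_cm <= 0:
        raise ValueError("concentration and path length must be positive")
    return theta_mdeg / (10.0 * conc_molar * path_cm)


# Hypothetical readings (mdeg) at two wavelengths; not data from this work.
sample = [-52.0, -48.5]
buffer_only = [-2.0, -1.5]
corrected = subtract_buffer(sample, buffer_only)

# 1 mm cuvette = 0.1 cm; the 25 uM concentration is an assumed placeholder.
molar = [molar_ellipticity(t, 25e-6, 0.1) for t in corrected]
print(molar)
```

Because the conversion divides by a molar concentration, the approximate 1 mg·mL⁻¹ figure has to be converted with the molecular weight of each construct before it is used here.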

CD spectroscopic thermal denaturation plots were acquired over a temperature range from 26 °C to 119 °C using a step size of 1.5 °C and a 1 mm pathlength quartz cuvette. The cuvette was stoppered to ensure that there was no loss in total sample volume. Protein samples were prepared as described above. Molar ellipticities (θm) were plotted as a function of temperature (°C) at a fixed wavelength of 222 nm or 209 nm. The thermal denaturation data were used to calculate the apparent equilibrium constant of unfolding at temperature T, K(T), using the following equation: $K(T) = [\theta_{m,\mathrm{obs}} - (y_f + m_f T)] / [(y_u + m_u T) - \theta_{m,\mathrm{obs}}]$, where $\theta_{m,\mathrm{obs}}$ is the observed θm at a fixed wavelength, T is the temperature (Kelvin), $y_f$ and $m_f$ are the y-intercept and slope of the linear baseline before the unfolding transition, respectively, and $y_u$ and $m_u$ are the y-intercept and slope of the linear baseline after the unfolding transition, respectively [110]. Based on the van't Hoff equation, apparent ΔHm and ΔSm were derived from the plot of ln K(T) versus 1/T.
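The baseline-corrected equilibrium constant and the van't Hoff analysis described above can be sketched in Python as follows. This is a minimal illustration, not the analysis code used in this work. It assumes ΔHm and ΔSm are constant across the transition, so that ln K(T) is linear in 1/T with slope −ΔHm/R and intercept ΔSm/R. The baselines and thermodynamic values in the check are synthetic.

```python
import math

R = 8.314462618  # molar gas constant, J mol^-1 K^-1


def unfolding_constant(theta_obs, temp_k, yf, mf, yu, mu):
    """K(T) = [theta_obs - (yf + mf*T)] / [(yu + mu*T) - theta_obs]."""
    folded = yf + mf * temp_k
    unfolded = yu + mu * temp_k
    return (theta_obs - folded) / (unfolded - theta_obs)


def vant_hoff_fit(temps_k, k_values):
    """Least-squares line through ln K versus 1/T; returns (dH, dS) in J/mol, J/(mol K)."""
    xs = [1.0 / t for t in temps_k]
    ys = [math.log(k) for k in k_values]
    n = len(xs)
    x_mean, y_mean = sum(xs) / n, sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return -slope * R, intercept * R


# Synthetic check with made-up parameters (not values from this work).
TRUE_DH, TRUE_TM = 400e3, 365.0
TRUE_DS = TRUE_DH / TRUE_TM
YF, MF, YU, MU = -20000.0, 10.0, -2000.0, 5.0  # hypothetical linear baselines

temps = [360.0 + 0.5 * i for i in range(21)]  # points inside the transition only
thetas = []
for t in temps:
    k = math.exp(-(TRUE_DH - t * TRUE_DS) / (R * t))
    fu = k / (1.0 + k)  # fraction unfolded
    thetas.append((1.0 - fu) * (YF + MF * t) + fu * (YU + MU * t))

ks = [unfolding_constant(th, t, YF, MF, YU, MU) for th, t in zip(thetas, temps)]
dh, ds = vant_hoff_fit(temps, ks)
print(f"dH = {dh / 1e3:.1f} kJ/mol, dS = {ds:.1f} J/(mol K)")
```

Only points inside a transition give a positive, finite K(T), so for a three-state curve the fit would be applied to each transition separately with its own pair of baselines.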

For thermal denaturations in the presence of 50 mM guanidine hydrochloride, protein samples were prepared as before in degassed buffer A. Next, a CD wavelength spectrum was obtained for each sample, and concentrated guanidine hydrochloride in buffer A was added to the protein sample. After incubation of the mixture for 10 min at 37 °C, another CD wavelength spectrum was obtained to ensure the protein was still folded. With the same protein sample, the thermal denaturation plot was then acquired as described above.

Fluorescence-Based Thermal Scanning (FTS) Assay. The FTS assays were performed using an iCycler iQ Real-Time Detection System (Bio-Rad) and methods similar to previously published protocols [111, 112]. Solutions of 2 µL of SYPRO® Orange (30X final concentration, Sigma-Aldrich) and 18 µL of protein (approximately 1 mg·mL⁻¹) in buffer A were added to each well of a 96-well thin-wall PCR plate (Bio-Rad) and each well was sealed with iCycler optical quality sealing tape (Bio-Rad). The plates were heated from 25 °C to 94.8 °C in 0.2 °C increments every 12 s, and the thermal denaturation data were acquired by monitoring the fluorescence intensities using the 490 nm excitation and the 575 nm emission wavelengths. Each data set was fit to the following Clarke & Fersht equation to determine the apparent Tm: $\mathrm{Signal} = [(\alpha_F + \beta_F T) + e^{m(T - T_m)}] / [1 + e^{m(T - T_m)}]$, where $\alpha_F$ and $\beta_F$ are the intercept and the slope, respectively, of the baseline for the folded state, T is the temperature, $T_m$ is the apparent melting temperature, and m is an exponential factor related to the slope of the transition at the apparent melting temperature [111]. The variables were determined using the non-linear regression software KaleidaGraph 4.0 (Synergy Software).
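To illustrate how an apparent Tm can be extracted with the equation above, the following Python sketch fits synthetic data. It is not the KaleidaGraph procedure used in this work: in place of general non-linear regression it uses a plain grid search over m and Tm, solving for αF and βF by linear least squares at each grid point (the model is linear in those two parameters once m and Tm are fixed). As the equation is written, the unfolded state has no baseline parameters, so the sketch assumes a signal scaled so that the unfolded plateau equals 1. All numerical values and grid ranges are hypothetical.

```python
import math


def clarke_fersht(temp, alpha_f, beta_f, m, tm):
    """Signal = [(aF + bF*T) + e^(m(T - Tm))] / [1 + e^(m(T - Tm))]."""
    e = math.exp(min(m * (temp - tm), 700.0))  # clamp to avoid overflow
    return ((alpha_f + beta_f * temp) + e) / (1.0 + e)


def _best_baseline(temps, signal, m, tm):
    """For fixed m and Tm, solve the 2x2 normal equations for aF and bF."""
    s11 = s12 = s22 = b1 = b2 = 0.0
    for t, y in zip(temps, signal):
        e = math.exp(min(m * (t - tm), 700.0))
        u = 1.0 / (1.0 + e)    # weight multiplying aF
        v = u * t              # weight multiplying bF
        r = y - e / (1.0 + e)  # signal minus the unfolded-state term
        s11 += u * u
        s12 += u * v
        s22 += v * v
        b1 += u * r
        b2 += v * r
    det = s11 * s22 - s12 * s12
    if det == 0.0:
        return math.inf, 0.0, 0.0
    alpha = (b1 * s22 - b2 * s12) / det
    beta = (s11 * b2 - s12 * b1) / det
    sse = sum((y - clarke_fersht(t, alpha, beta, m, tm)) ** 2
              for t, y in zip(temps, signal))
    return sse, alpha, beta


def fit_tm(temps, signal, tm_grid, m_grid):
    """Grid search over (Tm, m); returns (alpha_f, beta_f, m, tm) with the lowest SSE."""
    best = None
    for tm in tm_grid:
        for m in m_grid:
            sse, alpha, beta = _best_baseline(temps, signal, m, tm)
            if best is None or sse < best[0]:
                best = (sse, alpha, beta, m, tm)
    return best[1:]


# Synthetic, noise-free trace on the heating schedule from the text (25 to 94.8 C).
temps = [25.0 + 0.2 * i for i in range(350)]
signal = [clarke_fersht(t, 0.10, 0.002, 0.8, 80.0) for t in temps]

tm_grid = [60.0 + 0.5 * k for k in range(71)]  # 60.0 to 95.0 C, assumed search range
m_grid = [0.1 * k for k in range(1, 21)]       # 0.1 to 2.0, assumed search range
alpha_fit, beta_fit, m_fit, tm_fit = fit_tm(temps, signal, tm_grid, m_grid)
print(f"apparent Tm = {tm_fit:.1f} C, m = {m_fit:.1f}")
```

The grid resolution limits the precision of the recovered Tm, so a real analysis would refine the best grid point with a proper non-linear least-squares routine.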

2.3 Results

Secondary Structure of Dpo4. CD spectroscopy was employed to observe the thermal stability of Dpo4 over a large temperature range in solution. Each Dpo4 mutant is described in the Materials and Methods section of this chapter. Before each thermal denaturation, CD spectra of wt Dpo4 and its mutants (Figure 2.1 and Table 2.1) were obtained in the far-UV region to detect any significant changes in secondary structure due to mutations in Dpo4. In the CD spectral trace of the wt Dpo4 (Figure 2.2), the strong negative ellipticities observed at 209 nm and 222 nm indicated substantial α-helical content [113]. Figure 2.2 also indicated that all Dpo4 mutants were folded at 37 °C. Notably, the All-Gly linker Dpo4 and the P236A Dpo4 (Table 2.1) had CD spectra similar to that of the wt Dpo4, with slight differences attributable to experimental error. These two proteins were the only Dpo4 mutants that reproduced far-UV CD spectra identical to that of the wt Dpo4 (Figure 2.2), which suggested that the composition of the linker region between the Little Finger (LF) and the Thumb domains was not important for the secondary structural folding of Dpo4 at ambient temperatures, e.g. 37 °C.

Interestingly, the CD spectral trends of the wt Dpo4 and the Core (Figure 2.1) were also similar (Figure 2.2). However, the CD spectrum of the Core showed significantly stronger negative molar ellipticity (θm) at 222 nm and 209 nm, corresponding to a higher α-helical content than in the wt Dpo4. The CD spectrum of the LF domain with the linker region (LF+, Figure 2.1) did not reveal two distinct peaks at 222 nm and 209 nm, and deviated the most from the spectral trend of the wt Dpo4 (Figure 2.2). These data suggested that α-helices were not the dominant secondary structures of the LF+ in solution.

Thermal Stability of Dpo4. A series of CD spectra of the wt Dpo4 at various temperatures was obtained to monitor changes in secondary structure within a temperature range from 38 °C to 100 °C. A single sample of the wt Dpo4 was used for this series of CD spectra, and was equilibrated for 5 min at each temperature before the corresponding CD spectrum was acquired. As shown in Figure 2.3, only the intensity of the negative molar ellipticity changed with temperature. Markedly, there was a significant decrease in negative molar ellipticity from 80 °C to 90 °C when compared to other temperature increments, indicating that a large change in the folded state of the wt Dpo4 had occurred above 80 °C. Another significant decrease in negative molar ellipticity from 95 °C to 100 °C suggested that the wt Dpo4 unfolded into a different thermal-stable folded species between 90 °C and 95 °C before continuing to unfold into aggregated protein at higher temperatures.

Fresh wt Dpo4 was used for thermal denaturation via CD spectroscopy at a fixed wavelength of 222 nm. As shown in Figure 2.4, the wt Dpo4 possessed two thermal-stable folded species with a three-state cooperative unfolding trend. The first thermal-stable folded species of the wt Dpo4 existed from 26 °C to 87.5 °C and the second thermal-stable folded species of the wt Dpo4 existed from 92 °C to 96.5 °C. At 119 °C, approximately 0.3 percent of the wt Dpo4 retained secondary structures based on molar ellipticity. After heating to 100 °C or 119 °C, the wt Dpo4 neither refolded nor continued to denature as the temperature returned to ambient conditions (data not shown). Most of the wt Dpo4 was observed as aggregated protein within the sealed cuvette after thermal denaturation at 119 °C. Therefore, the observed denaturation of the wt Dpo4 was due to temperature changes, and was not due to the loss of volume.

Thermal denaturation of the wt Dpo4 via CD spectroscopy at a fixed wavelength of 209 nm was also performed. At this wavelength, the α-helical secondary structure is not the dominant signal [113]. The thermal denaturation plot of the wt Dpo4 at 209 nm resembled the thermal denaturation plot at 222 nm (Figures 2.4 and 2.5). These data corroborated the unfolding of the wt Dpo4 observed via CD spectra at varying temperatures: the secondary structural content of the wt Dpo4 at a low temperature, e.g. 37 °C, resembled the secondary structural content of the wt Dpo4 at a high temperature, e.g. 80 °C.

Based on the thermal denaturation plot in Figure 2.4, the unfolding equilibrium constant, KT, was calculated for the wt Dpo4. Using a van't Hoff plot, the change in melting enthalpy (ΔHm) and the change in melting entropy (ΔSm) were determined (Table 2.2). Using the assumption that the change in free energy for unfolding (ΔGm) equaled zero, the overall Tm for the wt Dpo4, the Tm for the first thermal-stable folded species and the Tm for the second thermal-stable folded species were determined (Table 2.2). Due to irreversible unfolding of the wt Dpo4, the corresponding thermodynamic parameters in Table 2.2 should be considered apparent values [114].
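The van't Hoff analysis described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the fitting procedure used in this work: it assumes a set of equilibrium constants is already available, fits ln K against 1/T by ordinary least squares according to ln K = -ΔHm/(RT) + ΔSm/R (footnote b of Table 2.2), and obtains Tm from ΔGm = ΔHm - TmΔSm = 0. The function name vant_hoff_fit and the synthetic data are hypothetical.

```python
import math

R = 1.987e-3  # gas constant in kcal mol^-1 K^-1

def vant_hoff_fit(temps_c, k_values):
    """Least-squares fit of ln K versus 1/T.

    Returns (dH in kcal/mol, dS in kcal/mol/K, Tm in deg C).
    """
    x = [1.0 / (t + 273.15) for t in temps_c]   # 1/T in K^-1
    y = [math.log(k) for k in k_values]         # ln K
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    d_h = -slope * R            # slope of the van't Hoff plot is -dH/R
    d_s = intercept * R         # intercept is dS/R
    tm_c = d_h / d_s - 273.15   # dG = dH - T*dS = 0 at T = Tm
    return d_h, d_s, tm_c

# Synthetic equilibrium constants generated from hypothetical parameters
# (illustrative values only, not measurements from this study).
true_dh, true_ds = 160.0, 0.43
temps = [98.0, 99.0, 100.0, 101.0, 102.0]
ks = [math.exp(-true_dh / (R * (t + 273.15)) + true_ds / R) for t in temps]

d_h, d_s, tm_c = vant_hoff_fit(temps, ks)
print(round(d_h, 1), round(d_s, 3), round(tm_c, 1))
```

Because the synthetic constants are generated from the same linear relation, the fit recovers the input parameters; with experimental data the scatter about the line would determine the reported uncertainties.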

The Unfolding of Dpo4. Since the wt Dpo4 possesses consecutively folded domains [60] as shown in Figure 2.1, we speculated that the first thermal-stable folded species was rigid and only increased in secondary structural mobility with increasing temperature. The second thermal-stable folded species was postulated to be an unraveled LF domain with intact Finger, Thumb and Palm domains (Core, Figure 2.1). To examine this hypothesis, thermal denaturation of the LF+, the Core and the All-Gly linker Dpo4 were also monitored via CD spectroscopy at fixed wavelength 222 nm.

As shown in Figure 2.4, the Dpo4 mutants exhibited cooperative unfolding and Tm higher than 80 °C. The All-Gly linker Dpo4 displayed a thermal denaturation trend with no apparent thermal-stable unfolding intermediate. Unlike the wt Dpo4, the All-Gly linker Dpo4 began to denature at 86 °C and was mostly denatured by 113.5 °C (Figure 2.4). Moreover, the Core did not possess a thermal-stable unfolding intermediate and displayed the highest thermal stability with Tm greater than the wt Dpo4 as a “whole” (Table 2.2). Importantly, the LF+ did contain a thermal-stable unfolding intermediate.

To establish that the absence of the thermal-stable unfolding intermediate for the All-Gly linker Dpo4 was not due to the LF being misfolded, thermal denaturation of the P236A Dpo4 was monitored via CD spectroscopy at fixed wavelength 222 nm. As shown in Figure 2.4, this Dpo4 mutant did not denature below 80 °C and contained a thermal-stable unfolding intermediate. Interestingly, the thermal denaturation trend of the P236A Dpo4 displayed a close resemblance to the thermal denaturation trend of the wt Dpo4, which began to unfold at 81.5 °C with a more gradual unfolding into the thermal-stable unfolding intermediate at 89 °C. This thermal-stable folded species existed over a larger temperature range than the second thermal-stable folded species of the wt Dpo4 (Figure 2.4).

These CD spectroscopic data suggested that the origin of the thermal-stable unfolding intermediate involved the linker region. Particularly, the thermal-stable unfolding intermediate existed during thermal denaturation of the LF+. Thus, the linker region was removed from the LF+ to create the LF mutant (Figure 2.1). Although the LF was fairly unstable in low salt solutions and temperatures, i.e. less than 400 mM NaCl at 23 °C, a far-UV CD spectrum of the LF at 37 °C was acquired (Figure 2.6A). The CD spectral trends, with and without high salt, were similar to the CD spectral trend of the LF+ at 37 °C, but displayed more negative molar ellipticity than the CD spectral trend of the LF+. These data indicated that the linker region did not contain any α-helices. With the LF sample in low salt, a thermal denaturation plot at fixed wavelength 222 nm was also acquired. The LF displayed a higher thermal stability than the LF+, and did not start unfolding until 95 °C (Figure 2.6B). In addition, the LF was not completely unfolded at 119 °C and displayed no plateau near 0 θm. Since there was no unfolded state observed for the LF, the unfolding thermodynamic parameters could not be calculated. Most importantly, the LF did not show a thermal-stable unfolding intermediate within the same temperature range as the LF+ (Figure 2.6B). These data provided evidence that the existence of the thermal-stable unfolding intermediate of Dpo4 was indeed due to the linker region. Moreover, the linker region may have induced structural instability of the LF domain.

After heating samples to 119 °C, approximately 3.9 percent of the Core, 7.9 percent of the LF+, 35.7 percent of the LF, 1.8 percent of the All-Gly linker Dpo4 and 2 percent of the P236A Dpo4 remained folded (Figure 2.4). All Dpo4 mutants, with the exception of the LF for aforementioned reasons, were assumed to irreversibly unfold to protein aggregates [115]. Due to instrumental limitations, thermal denaturation did not go beyond 119 °C.
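The residual folded percentages quoted above are based on molar ellipticity. A minimal sketch of how such a percentage could be computed is given below. It assumes, which the text does not state explicitly, that the 222 nm signal of the low-temperature baseline represents fully folded protein and that a signal near zero represents fully unfolded protein; the two-state equilibrium constant is included only for illustration. The function names and numerical values are hypothetical.

```python
def percent_folded(theta_obs, theta_folded, theta_unfolded=0.0):
    """Percentage of the folded signal retained, by linear interpolation
    between the folded and unfolded baselines (assumed, not from the text)."""
    if theta_folded == theta_unfolded:
        raise ValueError("folded and unfolded baselines must differ")
    fraction = (theta_obs - theta_unfolded) / (theta_folded - theta_unfolded)
    return 100.0 * fraction

def unfolding_constant(theta_obs, theta_folded, theta_unfolded=0.0):
    """Two-state equilibrium constant K = [unfolded]/[folded]."""
    f = percent_folded(theta_obs, theta_folded, theta_unfolded) / 100.0
    if not 0.0 < f < 1.0:
        raise ValueError("K is only defined inside the transition region")
    return (1.0 - f) / f

# Hypothetical molar ellipticities at 222 nm (deg cm^2 dmol^-1).
print(percent_folded(-70.0, -3500.0))        # signal remaining after heating
print(unfolding_constant(-1750.0, -3500.0))  # half of the signal lost
```

Equilibrium constants obtained this way across the transition region are the kind of input a van't Hoff plot requires.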

Using the data gathered from this series of thermal denaturations, the apparent unfolding thermodynamic parameters for the LF+, the Core, the P236A Dpo4 and the All-Gly linker Dpo4 were calculated, and are summarized in Table 2.2. The order of thermal stability based on initial unfolding was as follows: P236A Dpo4 < All-Gly linker Dpo4 < LF+ < wt Dpo4 < Core < LF. If only the highest Tm for each protein was considered, the thermal stability order was as follows: Core < All-Gly linker Dpo4 ≈ wt Dpo4 < P236A Dpo4 < LF+.

As the thermal-stable unfolding intermediates could be due to the formation of unfolded Dpo4 prior to aggregation, thermal denaturation was also monitored using FTS assays (Materials and Methods). During an FTS assay, the change in hydrophobicity of the solvent is indirectly observed through the change in fluorescence of a hydrophobic fluorophore, e.g. SYPRO® Orange. The corresponding thermal denaturation plots are shown in Figure 2.7A for the wt Dpo4, the All-Gly linker Dpo4, the LF+ and the LF. All proteins showed relative structural stability until approximately 80 °C. The thermal denaturation plots of the All-Gly linker Dpo4 and the wt Dpo4 displayed a maximum fluorescence at 87.4 °C and 86.2 °C, respectively. Up to 94.8 °C, the LF+ and the LF samples did not show a fluorescence maximum. Moreover, the LF appeared to be more thermally stable than the LF+ based on initial unfolding (Figure 2.7A). Experimental temperatures for FTS analysis did not go higher than 94.8 °C due to instrumental limitations. As only the wt Dpo4 and the All-Gly linker Dpo4 displayed a complete transition (i.e. a steep slope between baseline fluorescence and a maximum plateau fluorescence), only the y-axis of the thermal denaturation plots for the wt Dpo4 and the All-Gly linker Dpo4 was normalized in Figure 2.7B. Data for both the All-Gly linker Dpo4 and the wt Dpo4 were then fit to the Clarke & Fersht curve [111] to determine the apparent Tms, which were 81.46 ± 0.03 °C and 82.87 ± 0.03 °C, respectively. Data from the FTS analysis implied that the LF domain unfolded last and that the linker region was a source of structural instability.
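The normalization of an FTS trace and the extraction of a transition midpoint can be illustrated with a short sketch. It does not reproduce the Clarke & Fersht fit [111] used in this work. Instead it substitutes a simpler estimate: the temperature at which the min-max normalized fluorescence first crosses 0.5, located by linear interpolation between neighboring data points. The function names and the trace are hypothetical.

```python
def normalize(values):
    """Min-max normalize a fluorescence trace to the range 0..1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("trace has no dynamic range")
    return [(v - lo) / (hi - lo) for v in values]

def midpoint_temperature(temps, fluorescence, level=0.5):
    """Temperature at which the normalized trace first rises through `level`,
    by linear interpolation (a simple stand-in for a full curve fit)."""
    norm = normalize(fluorescence)
    for i in range(1, len(norm)):
        below, above = norm[i - 1], norm[i]
        if below < level <= above:
            frac = (level - below) / (above - below)
            return temps[i - 1] + frac * (temps[i] - temps[i - 1])
    raise ValueError("trace never crosses the requested level")

# Hypothetical trace (temperature in deg C, relative fluorescence units).
temps = [76.0, 78.0, 80.0, 82.0, 84.0, 86.0]
trace = [1000.0, 1200.0, 3000.0, 9000.0, 10800.0, 11000.0]
print(midpoint_temperature(temps, trace))
```

An interpolated midpoint of this kind requires a complete transition with both a baseline and a plateau, which is consistent with only the wt Dpo4 and the All-Gly linker Dpo4 traces being normalized in Figure 2.7B.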

Origin of the Thermal-Stable Unfolding Intermediate. Based on the aforementioned thermal denaturation data, we determined that the formation of the thermal-stable unfolding intermediate of Dpo4 was due to the existence of a functional linker region, i.e. all original amino acid residues. Inferring from the crystal structure of apo-Dpo4 [3], we deduced that surface ionic interactions within the linker region are of structural significance. To test the importance of these interactions, the wt Dpo4 and the P236A Dpo4 were exposed to 50 mM guanidine hydrochloride. As shown in Figure 2.8A, 50 mM guanidine hydrochloride did not denature these proteins, but did change the thermal denaturation trends of both the wt Dpo4 and the P236A Dpo4 (Figure 2.8B). For both of these proteins, the thermal-stable unfolding intermediate disappeared. The corresponding apparent unfolding thermodynamic parameters are displayed in Table 2.3 and were almost identical to each other. Moreover, the calculated apparent Tm values matched the corresponding overall apparent Tm values in Table 2.2, and the new apparent ΔHm and ΔSm values matched the apparent ΔHm and ΔSm values of the corresponding second thermal-stable folded species in Table 2.2. These data strongly suggested that the thermal-stable unfolding intermediates were truly derived from surface interactions of Dpo4.

To further investigate the possibility that the thermal-stable unfolding intermediate originated from surface interactions involving the linker region, mutations within the linker region were created to disrupt these interactions, i.e. the R/K-to-A linker Dpo4 and the R/K-to-D linker Dpo4 (Table 2.1). As shown in Figure 2.9A, these Dpo4 mutants produced identical CD spectral trends not only to each other, but also to the CD spectral trend of the wt Dpo4 (Figure 2.2). The thermal denaturation plots of these Dpo4 mutants displayed a wide range of structural stability, demonstrating the strong influence of the linker region on the overall folding stability of Dpo4 (Figure 2.9B). Notably, there were no distinct thermal-stable unfolding intermediates observed. Intriguingly, the most thermal-stable protein studied in this investigation was the R/K-to-D linker Dpo4 with an apparent Tm of 104.4 ± 0.4 °C (Table 2.4). These findings confirmed that positively charged amino acid residues within the linker region played an important role in the formation of the thermal-stable unfolding intermediate. These data also suggested that the positively charged amino acid residues were a source of structural instability for the wt Dpo4 (Figure 2.9B).

As seen in Figure 2.10, we hypothesized that the amino acid residues whose side chains form salt bridges between the Palm domain and the linker region were responsible for the thermal-stable unfolding intermediate: E100, K148, E235 and R240. These amino acids were mutated to Ala in order to verify the importance of these interactions in the formation of the thermal-stable unfolding intermediate. As shown in Figure 2.11A, the CD spectra of all Dpo4 mutants at 37 °C were similar to each other and to that of the wt Dpo4 at 37 °C (Figure 2.2). During thermal denaturation of these Dpo4 mutants, it was observed that the thermal-stable unfolding intermediate was greatly affected (Figure 2.11B). Moreover, altering a single amino acid residue, i.e. R240, K148 or E100, to a neutral amino acid residue, i.e. Ala, was sufficient to remove the thermal-stable unfolding intermediate (Figure 2.11B). This observation was substantiated by the identical apparent thermodynamic parameters for the R240A Dpo4 and the R/K-to-A linker Dpo4, and the identical apparent thermodynamic parameters for the E100A Dpo4 and the E100A/K148A/E235A/R240A Dpo4 (Table 2.4). The apparent thermodynamic parameters for two mutants, the E100A/K148A Dpo4 and the E235A/R240A Dpo4, were similar to each other but not similar to the R240A Dpo4, the E100A Dpo4 or the K148A Dpo4 (Table 2.4).

When comparing these apparent thermodynamic parameters (Table 2.4) to those in Tables 2.2 and 2.3, the calculated apparent Tms for the R240A Dpo4 and the R/K-to-A linker Dpo4 matched the overall apparent Tm for the wt Dpo4 and were similar to the apparent Tms calculated for the wt Dpo4 and the P236A Dpo4 in the presence of guanidine hydrochloride. Meanwhile, the apparent Tm for the R/K-to-D linker Dpo4 was similar to the apparent Tm for the second thermal-stable folded species of the wt Dpo4. Remarkably, the apparent Tm for the E100A/K148A Dpo4 matched the apparent Tm for the first thermal-stable folded species of the wt Dpo4. Intriguingly, the apparent ΔHm and ΔSm for the E100A/K148A Dpo4 and the E235A/R240A Dpo4 matched those for the first thermal-stable folded species of the P236A Dpo4. Therefore, the amino acid residues E100, K148, E235 and R240 were indeed involved in the formation of the thermal-stable unfolding intermediate during the thermal unfolding of Dpo4. Our data also suggested that the double mutations E100A/K148A and E235A/R240A have more impact on the structural stability than the single mutations E100A, K148A and R240A, which supported the hypothesis that these amino acids form salt bridges with each other (Figure 2.10). Finally, the data implied that interactions involving R240 could be replaced by the side chains of neighboring positively charged amino acid residues and/or the backbone of other amino acid residues.

2.4 Discussion

Conformational Flexibility of Dpo4. Crystal structures often agree with the corresponding solution structures, e.g. NMR studies [116]. In our investigation of Dpo4 structural stability, the far-UV region CD spectrum of the wt Dpo4 coincided with the calculated 41 percent α-helical content of a ternary complex Dpo4•DNA•dNTP crystal structure [60, 113]. Importantly, the folded truncated mutants of Dpo4 (Figure 2.2) indicated that inter-domain interactions were not critical to the overall Dpo4 structure. As the CD spectra were representative of the secondary structure ratios of each protein [113], the CD spectral changes observed in Figure 2.2 likely reflected changing secondary structural ratios amongst the Dpo4 mutants based on the full-length Dpo4 crystal structures [3, 40, 43, 50, 60, 106, 109, 117].

Structural Insights of Dpo4 in Solution. The thermal denaturation of the wt Dpo4 and its mutants revealed many aspects of the unfolding of Dpo4. First, the thermophilic and robust nature of Dpo4 was not derived from inter-domain interactions. Each mutant demonstrated extreme thermal stability with cooperative unfolding, with the Core and the LF+ being more thermal stable than the wt Dpo4 as a “whole” (Figure 2.4). Thus, it was likely that thermal stability of Dpo4 was derived from local and secondary structural interactions.

Second, the thermal denaturation plot of the wt Dpo4 displayed a three-state unfolding trend that suggested a conformational change at high temperatures. Previous studies have shown that unfolding intermediates can represent the sequential melting of domains, and these denaturation trends were recreated with the CD spectral combination of different domains [118, 119]. If the thermal-stable unfolding intermediate of Dpo4 was a sum of the LF domain unfolding before the Core domains, then the thermal denaturation of the LF+ and the Core combined should have equaled the thermal denaturation of the wt Dpo4. Although the combination of the LF+ and the Core thermal denaturation trends was similar to the thermal denaturation trend of the wt Dpo4, the molar ellipticity magnitude of the LF+ and the Core combination was not similar to the molar ellipticity magnitude of the wt Dpo4 (data not shown). Hence, the unfolding intermediate of Dpo4 was not derived from Core domains unfolding before the LF domain.

Additionally, the All-Gly linker Dpo4 was designed to mimic the covalent linkage of the LF to the Core without a functional linker region. However, the thermal denaturation trend of the All-Gly linker Dpo4 did not resemble the wt Dpo4 thermal denaturation trend (Figure 2.4). Notably, the LF+ and the wt Dpo4 had a functional linker region while the Core, the All-Gly linker Dpo4, the R/K-to-A linker Dpo4, the R/K-to-D linker Dpo4, the E235A/R240A Dpo4, the R240A Dpo4, the E100A/E235A/K148A Dpo4, the K148A/E235A/R240A Dpo4, the E100A/K148A/E235A/R240A Dpo4, the LF and the P236A Dpo4 did not possess such a linker region. Markedly, the removal of any kinks (i.e. the P236A Dpo4) stabilized the thermal-stable unfolding intermediate. Moreover, changing R240, E100 and/or K148 to Ala (i.e. the R/K-to-A linker Dpo4, the E100A Dpo4, the K148A Dpo4, the E100A/K148A Dpo4, the E100A/E235A/R240A Dpo4, the K148A/E235A/R240A Dpo4, the E235A/R240A Dpo4, the R240A Dpo4 and the All-Gly linker Dpo4) removed the thermal-stable unfolding intermediate (Figures 2.9 and 2.11).

These occurrences were explained by the observation that the functional linker region has many interactions with the surface of the Palm domain when the wt Dpo4 is in the apo-state (Figure 2.10) [3]. In the middle of the linker region, there are strong salt bridges amongst amino acid residues K148, E235, E100 and R240, which are conserved during the formation of the binary complex Dpo4•DNA [3]. These interactions may potentially reduce the entropic energy cost of the LF domain structurally and functionally as the LF movement would be restricted to a hinge-like motion pivoted around the Palm domain (Figure 2.10). Additionally, the importance of these interactions is inferred by the fact that they are conserved in E. coli DNA Polymerase IV and human Y-family DNA polymerases (Figure 2.12), and are found in the same regions of the human Y-family enzymes in the presence of DNA [77, 92, 102, 120]. Alteration of R240 into a neutral amino acid residue (i.e. Ala) in Dpo4 removed the strongest interaction with E100 and consequently the thermal-stable unfolding intermediate. Changing charged amino acid residues to neutral amino acids (i.e. Ala) or oppositely charged amino acid residues (i.e. Asp) in the linker region also disrupted these interactions. Mutations that removed kinks (i.e. Pro to Ala) did not disrupt existing interactions. With the addition of a chaotropic salt (i.e. guanidine hydrochloride), most surface polar interactions were disrupted while keeping intra-domain interactions intact. This event correlated well with the absence of the thermal-stable unfolding intermediate for the wt Dpo4 and the P236A Dpo4 in the presence of guanidine hydrochloride (Figures 2.4 and 2.8B).

During FTS assays, we observed that the maximum fluorescence temperatures were very similar to the temperatures where the wt Dpo4 and the All-Gly linker Dpo4 began to denature while being monitored during CD spectroscopy (Figures 2.4 and 2.7). Given that CD spectroscopic thermal denaturation was a direct method to monitor protein unfolding whereas FTS was an indirect method to monitor solution change from hydrophilic to hydrophobic conditions, it was assumed that FTS data displayed Dpo4 losing rigidity before secondary structural components became disordered. This event would allow the entrance of SYPRO® Orange into the hydrophobic regions of Dpo4 without Dpo4 unfolding. The quantum yield of SYPRO® Orange dye molecules increases with increasing hydrophobicity of the solution [112, 121, 122]. As the linker region contained strong salt bridges with other domains, such hydrophilic interactions cannot be detected by a hydrophobic dye, e.g. SYPRO® Orange.

The thermal-stable unfolding intermediate for the wt Dpo4 (Figure 2.4) most likely resulted from the loss of ionic and polar interactions within the linker region.

Unfolding intermediates have been observed for different proteins. For example, unfolding intermediates of protein disulfide isomerase have been stabilized by guanidine hydrochloride but not urea [123]. In addition, the three-state unfolding trend did not reflect the residual enzyme activity decline over the same range of guanidine hydrochloride concentrations for human placental alkaline phosphatase [124].

The second thermal-stable unfolding intermediate for Dpo4 (between 90 °C and 100 °C) also can be explained by large conformational changes that resulted in fewer α-helices than the first thermal-stable folded species. However, the thermal denaturation of the wt Dpo4 and the LF+ monitored at fixed wavelength 209 nm showed the same thermal-stable unfolding intermediates observed at wavelength 222 nm (Figure 2.5). Thus, the thermal-stable unfolding intermediates observed in our investigation were not derived from overall secondary structural changes. Instead, we postulated that the thermal-stable unfolding intermediate for the wt Dpo4 was either a disordered linker region with folded LF, thumb, palm and finger domains, or completely folded Dpo4 with the absence of interaction between the LF and the thumb domains. With either hypothesis, we predict that Dpo4 will catalyze the DNA polymerization at similar reaction amplitudes to wt Dpo4 at 37 °C, which has been demonstrated previously up to 95 °C [41].

Third, a thermodynamic correlation amongst the Dpo4 proteins was noted. For example, the first thermal-stable folded species of the LF+ possessed apparent thermodynamic parameters that were similar to those of the first thermal-stable folded species for the wt Dpo4 (Table 2.2). Also, the sums of the apparent ΔHm and ΔSm for the second thermal-stable folded species of the LF+ and the apparent ΔHm and ΔSm for the Core were similar to the apparent ΔHm and ΔSm, respectively, for the second thermal-stable folded species of the wt Dpo4. Thus, the unfolding Gibbs free energy of the wt Dpo4 can be dissected into thermal-stable folded species of individual domains. The linker region effect on the overall thermal stability of Dpo4 was also observed as the apparent Tm for the All-Gly linker Dpo4 equaled the apparent Tm for the second thermal-stable folded species of the wt Dpo4 (Table 2.2). Furthermore, these data hinted that the second thermal-stable folded species of wt Dpo4 has the same structural conformation as the All-Gly linker Dpo4.

Lastly, the unfolding order of Dpo4 was elucidated. Most Dpo4 proteins that contained a functional linker region, i.e. the LF+ and the wt Dpo4, also possessed a thermal-stable folded species with a relatively low apparent Tm (Table 2.2). Most importantly, the removal of the linker region from the LF+ mutant, i.e. the LF only, removed the thermal-stable unfolding intermediate and significantly increased the thermal stability of the LF domain (Figures 2.6 and 2.7). Based on these observations along with those previously discussed, the unfolding order of Dpo4 domains was inferred as follows: the linker region, then the Core domains, and finally the LF domain.

Our most important observation was that none of the Dpo4 proteins began to unfold until above 80 °C (Figures 2.4, 2.5, 2.6, 2.7, 2.8, 2.9, and 2.11). Therefore, the overall structure of Dpo4 at 37 °C is similar to the structure of Dpo4 at 80 °C, the physiological temperature of S. solfataricus. The similar CD spectra obtained for the wt Dpo4 at specific temperatures supported this notion (Figure 2.3). Accordingly, the kinetic studies of Dpo4 performed at ambient temperatures are indeed structurally relevant. Along with the temperature-dependent kinetic study of Dpo4 [66], the data herein provided evidence that the kinetic mechanism and structural integrity of Dpo4 are not compromised at lower-than-physiological temperatures.

2.5 Figures

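[Figure 2.1 diagram. wt Dpo4 and full-length Dpo4 mutants: residues 1-352, with labeled boundaries at residues 1, 11, 78, 167, 230, 245 and 352 (Fingers, Palm, Thumb, LF). Core: residues 1-230 (Fingers, Palm, Thumb). LF+: residues 230-352. LF: residues 245-352.]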

Figure 2.1. Diagram of the wt Dpo4 and mutants.

The diagram of the amino acid residue length and domain composition of the wt Dpo4 in comparison to its mutants used in this investigation. The Finger, the Palm, the Thumb, and the LF domains are colored blue, red, green and purple, respectively, while the linker region is colored in black.

[Figure 2.2 plot: molar ellipticity (deg cm^2 dmol^-1) versus wavelength (nm).]

Figure 2.2. CD wavelength spectra of the wt Dpo4 and mutants.

Far-UV region CD wavelength spectra of the wt Dpo4 (thick, solid black line), the Core (thin, solid blue line), the LF+ (long-dashed purple line), the P236A Dpo4 (dotted red line) and the All-Gly linker Dpo4 (short-dashed light blue line) at 37 °C.

[Figure 2.3 plot: molar ellipticity (deg cm^2 dmol^-1) versus wavelength (nm).]

Figure 2.3. Far-UV region CD wavelength spectra of the wt Dpo4 at various temperatures.

CD wavelength spectra of the wt Dpo4 at the following temperatures: 38 °C (thick, solid black line), 70 °C (short- and long-dashed purple line), 80 °C (thin, solid green line), 90 °C (short-dashed orange line), 95 °C (long-dashed blue line) and 100 °C (dotted and short-dashed red line).

[Figure 2.4 plot: molar ellipticity (deg cm^2 dmol^-1) versus temperature (°C).]

Figure 2.4. Thermal denaturation plots of the wt Dpo4 and certain mutants monitored via CD spectroscopy at fixed wavelength 222 nm.

Thermal denaturation plots of the wt Dpo4 (black ●), the All-Gly linker Dpo4 (light blue ■), the P236A Dpo4 (red ♦), the Core (blue ■) and the LF+ (purple ▲).

[Figure 2.5 plot: molar ellipticity (deg cm^2 dmol^-1) versus temperature (°C).]

Figure 2.5. Thermal denaturation of the wt Dpo4 and the LF+ monitored via CD spectroscopy at fixed wavelength 209 nm.

The wt Dpo4 (●) and the LF+ (□) were in the same experimental conditions as used during the CD spectroscopic thermal denaturations at fixed wavelength 222 nm.

[Figure 2.6 plots. Panel A: molar ellipticity (deg cm^2 dmol^-1) versus wavelength (nm). Panel B: molar ellipticity (deg cm^2 dmol^-1) versus temperature (°C).]

Figure 2.6. CD spectra of LF.

(A) Far-UV region CD wavelength spectra of the LF in the presence of 400 mM NaCl (dotted blue line), the LF in buffer A (short-dashed red line) and the LF+ in buffer A (solid black line, same as Figure 2.2) at 37 °C. (B) Thermal denaturation of the LF+ (black ●, same as Figure 2.4), and the LF in the presence of 400 mM NaCl (blue ▲) and in buffer A (red ■) monitored via CD spectroscopy at fixed wavelength 222 nm.

[Figure 2.7 plots. Panel A: relative fluorescence versus temperature (°C). Panel B: normalized fluorescence versus temperature (°C).]

Figure 2.7. FTS plots of wt Dpo4 and three mutants in buffer A.

(A) Non-normalized FTS traces for the wt Dpo4 (black ●), the All-Gly linker Dpo4 (orange ▲), the LF+ (blue ■) and the LF (green). (B) Normalized FTS traces for the wt Dpo4 (●) and the All-Gly linker Dpo4 (orange ▲).

[Figure 2.8 plots. Panel A: molar ellipticity (deg cm^2 dmol^-1) versus wavelength (nm). Panel B: molar ellipticity (deg cm^2 dmol^-1) versus temperature (°C).]

Figure 2.8. CD spectra of the wt Dpo4 and the P236A Dpo4 in the presence of 50 mM guanidine hydrochloride.

(A) Far-UV region CD wavelength spectra of the wt Dpo4 (thick, solid black line) and the P236A Dpo4 (dotted orange line) at 37 °C. (B) Thermal denaturation plots of the wt Dpo4 (black ●) and the P236A Dpo4 (orange ♦) at fixed wavelength 222 nm.

[Figure 2.9 plots. Panel A: molar ellipticity (deg cm^2 dmol^-1) versus wavelength (nm). Panel B: molar ellipticity (deg cm^2 dmol^-1) versus temperature (°C).]

Figure 2.9. CD spectra of the wt Dpo4 and linker mutants.

(A) Far-UV region CD wavelength spectra of the wt Dpo4 (thick, solid black line), the R/K-to-A linker Dpo4 (short-dashed green line) and the R/K-to-D linker Dpo4 (dashed purple line) at 37 °C. (B) Thermal denaturation plots of the wt Dpo4 (●, same as Figure 2.4), the R/K-to-A linker Dpo4 (green ♦) and the R/K-to-D linker Dpo4 (purple ▲) at fixed wavelength 222 nm.

Figure 2.10. Crystal structure of Dpo4 in apo-state.

The Finger, Palm, Thumb and LF domains are colored blue, red, green and purple, respectively. The linker region is colored black. The area of the linker region with the Palm domain in the background was magnified in the inset where the key amino acid residues are labeled for reference. The structure originated from Protein Data Bank ID 2RDI [3].

[Figure 2.11 plots. Panel A: molar ellipticity (deg cm^2 dmol^-1) versus wavelength (nm). Panel B: molar ellipticity (deg cm^2 dmol^-1) versus temperature (°C).]

Figure 2.11. CD spectra of the wt Dpo4 and selected mutants.

(A) Far-UV region CD wavelength spectra of the wt Dpo4 (thick, solid black line), the R240A Dpo4 (closely dotted purple line), the E235A/R240A Dpo4 (short-dashed and dotted light blue line), the E100A Dpo4 (loosely dotted green line), the K148A Dpo4 (solid red line), the E100A/K148A Dpo4 (long-dashed orange line), the E100A/E235A/R240A Dpo4 (short- and long-dashed grey line), the K148A/E235A/R240A Dpo4 (short-dashed blue line) and the E100A/K148A/E235A/R240A Dpo4 (dashed yellow line) at 37 °C. (B) Thermal denaturation plots of the wt Dpo4 (black ●, same as Figure 2.4), the R240A Dpo4 (purple ■), the E235A/R240A Dpo4 (light blue ♦), the E100A Dpo4 (green ■), the K148A Dpo4 (red ▲), the E100A/K148A Dpo4 (orange ♦), the E100A/E235A/R240A Dpo4 (grey ▲), the K148A/E235A/R240A Dpo4 (blue ●) and the E100A/K148A/E235A/R240A Dpo4 (yellow ■) at fixed wavelength 222 nm.

Dpo4     93  REYSEKIEIA SIDEAYLD  110
DinB     91  SRYTSRIEPL SLDEAYLD  108
hRev1   558  ASYTHNIEAV SCDEALVD  575
hPolη   104  SRFA-VIERA SIDEAYVD  120
hPolκ   186  ADYDPNFMAM SLDEAYLN  203
hPolι   114  EEFSPVVERL GFDENFVD  131

Dpo4    144  ISKNKVFAKI AADMAKPN  161
DinB    141  VAPVKFLAKI ASDMNKPN  158
hRev1   609  IGSNILLARM ATRKAKPD  627
hPolη   217  ISHNKVLAKL ACGLNKPN  233
hPolκ   276  IAPNTMLAKV CSDKNKPN  330
hPolι   179  VASNKLLAKL VSGVFKPN  216

Dpo4    230  RDEYNE-PIR TRVRKSIG  246
DinB    228  GIDERD-VNS ERLRKSVG  244
hRev1   696  RGLDDRPVRT EKERKSVS  713
hPolη   303  RGIEHDPVKP RQLPKTIG  320
hPolκ   400  LGLGSTHLTR DGERKSMS  417
hPolι   286  FGEDNSPVIL SGPPQSFS  303

Figure 2.12. Sequence alignment of Y-family DNA polymerases.

Amino acid sequences of S. solfataricus Dpo4, E. coli Pol IV (DinB), human Rev1 (hRev1), and human DNA polymerases η (hPolη), κ (hPolκ) and ι (hPolι) were compared. The conserved amino acid residues that may form ionic interactions within the Palm domain and linker region of each protein are highlighted in light blue. The non-conserved amino acid residues colored in red are hypothesized to interact with conserved amino acid residues highlighted in light blue.
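As an illustration of how conservation in such an alignment can be scanned, the sketch below takes the first alignment block of Figure 2.12 and reports the Dpo4 positions at which all six sequences carry the same residue. Strict identity across every sequence is an assumption made here for simplicity and is not necessarily the criterion used for the highlighting in the figure. The ASCII names hPol_eta, hPol_kappa and hPol_iota and the function conserved_positions are hypothetical.

```python
# First alignment block of Figure 2.12 (spaces removed, gaps kept as "-").
# Each entry: name -> (number of the first residue shown, aligned sequence).
BLOCK = {
    "Dpo4":       (93,  "REYSEKIEIASIDEAYLD"),
    "DinB":       (91,  "SRYTSRIEPLSLDEAYLD"),
    "hRev1":      (558, "ASYTHNIEAVSCDEALVD"),
    "hPol_eta":   (104, "SRFA-VIERASIDEAYVD"),
    "hPol_kappa": (186, "ADYDPNFMAMSLDEAYLN"),
    "hPol_iota":  (114, "EEFSPVVERLGFDENFVD"),
}

def conserved_positions(block, reference="Dpo4"):
    """Return (residue number, residue) pairs, in reference numbering,
    for alignment columns where every sequence has the same residue."""
    start, ref_seq = block[reference]
    seqs = [seq for _, seq in block.values()]
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("aligned sequences must have equal length")
    conserved = []
    residue_number = start
    for col, ref_res in enumerate(ref_seq):
        if ref_res == "-":
            continue  # a gap does not advance the reference numbering
        if {s[col] for s in seqs} == {ref_res}:
            conserved.append((residue_number, ref_res))
        residue_number += 1
    return conserved

print(conserved_positions(BLOCK))  # prints [(105, 'D'), (106, 'E')]
```

Under this strict criterion E100 of Dpo4 is not reported, because hPolκ carries a Met in that column, whereas the other five sequences carry Glu. A similarity-based criterion would be needed to capture such near-conserved positions.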

2.6 Tables

Enzyme                           Linker Region Sequence^a
wt Dpo4                          RDEYNEPIRTRVRK
LF+                              RDEYNEPIRTRVRK
E100A Dpo4                       RDEYNEPIRTRVRK
K148A Dpo4                       RDEYNEPIRTRVRK
E100A/K148A Dpo4                 RDEYNEPIRTRVRK
E100A/E235A/R240A Dpo4           RDEYNAPIRTAVRK
K148A/E235A/R240A Dpo4           RDEYNAPIRTAVRK
E100A/K148A/E235A/R240A Dpo4     RDEYNAPIRTAVRK
P236A Dpo4                       RDEYNEAIRTRVRK
R240A Dpo4                       RDEYNEPIRTAVRK
E235A/R240A Dpo4                 RDEYNAPIRTAVRK
R/K-to-A linker Dpo4             ADEYNEPIATAVAA
R/K-to-D linker Dpo4             DDEYNEPIDTDVDD
All-Gly linker Dpo4              GGGGGGGGGGGGGG

^a Site-specific mutations are in bold.

Table 2.1. Amino acid sequence of the linker region for specific Dpo4 mutants.
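The linker sequences in Table 2.1 can be reproduced from the wild-type sequence with a small sketch. It assumes that the linker region shown starts at residue 230, as the numbering of the Dpo4 sequence in Figure 2.12 suggests, and it treats substitutions outside the linker (e.g. E100A and K148A in the Palm domain) as leaving the linker sequence unchanged. The function name apply_mutations is hypothetical.

```python
LINKER_START = 230             # residue number of the first linker residue (assumed)
WT_LINKER = "RDEYNEPIRTRVRK"   # wt Dpo4 linker region from Table 2.1

def apply_mutations(mutations, seq=WT_LINKER, start=LINKER_START):
    """Apply point mutations written like 'R240A' to the linker sequence.

    Mutations at positions outside the linker are ignored here, since they
    do not alter the linker sequence itself.
    """
    residues = list(seq)
    for mutation in mutations:
        old, position, new = mutation[0], int(mutation[1:-1]), mutation[-1]
        index = position - start
        if not 0 <= index < len(residues):
            continue  # e.g. E100A or K148A: outside the linker region
        if residues[index] != old:
            raise ValueError(f"{mutation}: expected {old}, found {residues[index]}")
        residues[index] = new
    return "".join(residues)

print(apply_mutations(["P236A"]))
print(apply_mutations(["E100A", "K148A", "E235A", "R240A"]))
print(apply_mutations(["R230A", "R238A", "R240A", "R242A", "K243A"]))
```

The check on the original residue guards against numbering mistakes: a mutation label that does not match the wild-type residue at that position raises an error instead of silently producing a wrong sequence.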

59

T a ΔH b ΔS b Protein m m m (°C) (kcal*mol-1) (kcal*mol-1*K-1) wt Dpo4 Wholec 96 ± 1 67 ± 7 0.18 ± 0.02 1st species 89.3 ± 0.2 516 ± 66 1.4 ± 0.2 2nd species 102.6 ± 0.1 162 ± 6 0.43 ± 0.02 Core 101.1 ± 0.1 94 ± 2 0.25 ± 0.01 LF+ Wholec 98.1 ± 1.1 44 ± 7 0.12 ± 0.02 1st species 86.8 ± 0.2 449 ± 55 1.2 ± 0.2 2nd species 105.6 ± 0.1 115 ± 3 0.30 ± 0.01 All-Gly linker Dpo4 102.5 ± 0.1 73 ± 2 0.19 ± 0.01 P236A Dpo4 Wholec 97.6 ± 0.9 50 ± 6 0.13 ± 0.02 1st species 83.0 ± 0.2 130 ± 9 0.36 ± 0.03 2nd species 103.7 ± 0.1 190 ± 9 0.50 ± 0.02 a Calculated assuming ΔGm = 0. b Calculated using lnKm = -ΔHm/(RT) + ΔSm/R. cWhole referred to the calculations including the total temperature range of thermal denaturation.

Table 2.2. Apparent unfolding thermodynamic parameters determined by thermal denaturation at wavelength 222 nm.
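The relationship behind footnote (a) can be checked numerically: if ΔGm = ΔHm - Tm·ΔSm = 0 at the melting temperature, then ΔSm = ΔHm/Tm with Tm in kelvin. The short Python sketch below recomputes ΔSm for three rows of Table 2.2. The function and variable names are illustrative assumptions and are not part of the original analysis.

```python
# Illustrative check of footnote (a): at Tm, dG = dH - Tm*dS = 0, so dS = dH / Tm.
# Function and variable names are hypothetical; numbers are copied from Table 2.2.

def entropy_at_tm(delta_h_kcal_per_mol, tm_celsius):
    """Return the unfolding entropy (kcal/mol/K) implied by dG = 0 at Tm."""
    tm_kelvin = tm_celsius + 273.15
    return delta_h_kcal_per_mol / tm_kelvin

# (dH in kcal/mol, Tm in degrees Celsius) for wild-type Dpo4
rows = {
    "wt Dpo4, whole": (67.0, 96.0),
    "wt Dpo4, 1st species": (516.0, 89.3),
    "wt Dpo4, 2nd species": (162.0, 102.6),
}

for label, (delta_h, tm) in rows.items():
    print(f"{label}: dS = {entropy_at_tm(delta_h, tm):.2f} kcal/mol/K")
```

Within rounding, the recomputed values agree with the tabulated ΔSm entries (0.18, 1.4, and 0.43 kcal·mol⁻¹·K⁻¹).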


Protein        Tm(a) (°C)     ΔHm(b) (kcal·mol⁻¹)   ΔSm(b) (kcal·mol⁻¹·K⁻¹)
wt Dpo4        99.6 ± 0.1     152 ± 4               0.41 ± 0.01
P236A Dpo4     97.4 ± 0.1     174 ± 5               0.47 ± 0.01

(a) Calculated assuming ΔGm = 0.
(b) Calculated using lnKm = -ΔHm/(RT) + ΔSm/R.

Table 2.3. Apparent unfolding thermodynamic parameters in the presence of 50 mM guanidine hydrochloride at wavelength 222 nm.


Protein                          Tm(a) (°C)      ΔHm(b) (kcal·mol⁻¹)   ΔSm(b) (kcal·mol⁻¹·K⁻¹)
R/K-to-A linker Dpo4             96.4 ± 0.2      89 ± 4                0.24 ± 0.01
R/K-to-D linker Dpo4             104.4 ± 0.4     118 ± 8               0.31 ± 0.02
R240A Dpo4                       96.5 ± 0.3      93 ± 9                0.25 ± 0.03
E235A/R240A Dpo4                 90.3 ± 0.3      125 ± 6               0.34 ± 0.02
E100A Dpo4                       88.9 ± 0.2      163 ± 5               0.45 ± 0.01
K148A Dpo4                       92.8 ± 0.3      98 ± 6                0.27 ± 0.02
E100A/K148A Dpo4                 89.4 ± 0.2      135 ± 7               0.37 ± 0.02
E100A/E235A/R240A Dpo4           88.3 ± 0.1      193 ± 5               0.53 ± 0.01
K148A/E235A/R240A Dpo4           90.3 ± 0.3      172 ± 11              0.47 ± 0.03
E100A/K148A/E235A/R240A Dpo4     90.0 ± 0.2      163 ± 5               0.45 ± 0.01

(a) Calculated assuming ΔGm = 0.
(b) Calculated using lnKm = -ΔHm/(RT) + ΔSm/R.

Table 2.4. Apparent unfolding thermodynamic parameters determined by thermal denaturation at wavelength 222 nm.


Chapter 3: Kinetic Basis of Sugar Selection by a Y-Family DNA Polymerase from Sulfolobus solfataricus P2

3.1 Introduction

DNA polymerases play critical roles in genomic replication, DNA damage repair, and DNA lesion bypass in vivo. To maintain genomic stability, cellular DNA polymerases are required to select correct deoxynucleotides (dNTPs) and to discriminate against both incorrect dNTPs and ribonucleotides (rNTPs) during DNA synthesis. Incorporation of rNTPs leads to DNA strand breakage, genetic mutation, and cell death, while dNTP misincorporation results in mutation. Since the cellular concentration of rNTPs is at least 10-fold higher than that of dNTPs [125-127], DNA polymerases, which are phylogenetically grouped into A-, B-, C-, D-, X-, Y-, and reverse transcriptase (RT) families [8], possess efficient mechanisms to limit rNTP binding and incorporation during DNA polymerization. For example, most DNA polymerases and reverse transcriptases (RTs) discriminate against rNTPs by using the steric clash between a bulky side chain of an active site amino acid residue and the ribose 2′-OH of an incoming rNTP [128]. Through site-directed mutagenesis and kinetic studies, the bulky active site amino acid residue, or ‘steric gate’, has been identified to be either Glu for the A-family polymerases [129, 130] or Tyr (or Phe) for the B-, Y- and RT-family members [25, 130-136]. Consistently, X-ray crystal structures of the ternary complexes (E•DNA•dNTP) of several DNA polymerases have revealed that the side chain of the ‘steric gate’ residue and the ribose 2′ position of an incoming dNTP are too close to each other to tolerate the presence of a 2′-OH moiety [60, 77, 102]. If the incoming dNTP were replaced by an rNTP in those structures, the 2′-OH of the rNTP would clash with the side chain, or backbone, of the ‘steric gate’ residue. This structural prediction is supported by the fact that a DNA polymerase can incorporate rNTPs into DNA as efficiently as dNTPs after the ‘steric gate’ residue is mutated to a residue with a small side chain, e.g. Ala or Gly [102, 129, 131, 132, 136, 137]. Unfortunately, no E•DNA•rNTP ternary structure of a DNA polymerase or of its ‘steric gate’ mutant has been solved yet. Thus, the structural basis for rNTP discrimination by a DNA polymerase has not been unambiguously established.

In this paper, we chose to investigate the sugar selectivity of Sulfolobus solfataricus P2 DNA polymerase IV (Dpo4), a model Y-family DNA polymerase. The Y-family DNA polymerases, which have been identified in all three domains of life [138], bypass DNA lesions in an error-prone or error-free manner, lack intrinsic proofreading exonuclease activities, and catalyze DNA synthesis with low fidelity and poor processivity [4, 27, 57, 59, 65, 108, 117]. Structurally, each Y-family enzyme contains the typical Finger, Thumb, and Palm domains found in all structurally known DNA polymerases [3, 91, 102, 139-142]. Interestingly, a fourth domain, designated as the “Little Finger” domain, is tethered to the small Thumb domain via a linker and is only present in the Y-family enzymes [60]. In Dpo4, the Little Finger and Thumb domains bind to DNA and fit into its major and minor grooves, respectively. The Palm domain contains three highly conserved carboxylate residues that bind two Mg²⁺ ions at the active site. The small Finger domain contacts an incoming dNTP but lacks the O and O1 helices which perform fidelity checking in the high-fidelity DNA polymerases [143]. The side chains surrounding the nascent base pair are small and hydrophobic, rather than the large and positively charged side chains present in other DNA polymerases. The active site of Dpo4 is relatively loose and solvent accessible when compared to the active sites of replicative DNA polymerases in the presence of DNA and a nucleotide [60]. These structural features help Dpo4 accommodate bulky DNA lesions but contribute to its low fidelity when replicating both undamaged and damaged DNA [4, 27, 57, 59, 65, 108, 117]. Thus, it is possible that unfaithful Dpo4 could incorporate rNTPs efficiently. Here we utilized protein engineering and kinetic methods to investigate the sugar selectivity of Dpo4. Our results showed that Dpo4, like other Y-family members, uses a ‘steric gate’ residue to discriminate against rNTPs.

3.2 Materials and Methods

Materials. Reagents were purchased from the following companies: OptiKinase from USB; [γ-³²P]ATP from MP Biomedicals; rNTPs and dNTPs from GE Healthcare.


Protein Preparation. The mutation Y12A was introduced into the plasmid containing the full-length wild-type Dpo4 using the QuikChange site-directed mutagenesis kit (Stratagene). The full-length wild-type Dpo4 and the Y12A Dpo4 mutant were expressed in E. coli and purified as previously described [27].

Synthetic Oligonucleotides. The DNA substrates listed in Table 3.1 were purchased from Integrated DNA Technologies. All DNA substrates were purified by denaturing polyacrylamide gel electrophoresis (PAGE). The concentration of each DNA oligomer was determined by the UV absorbance at 260 nm. Each primer was 5′-[³²P]-labelled by incubating it with OptiKinase and [γ-³²P]ATP for 3 h at 37 °C. The 5′-[³²P]-labelled primer was annealed to an unlabeled DNA template at a molar ratio of 1.00:1.15. This mixture was first heat denatured at 95 °C for 2 min and then cooled slowly to room temperature over several hours.

Buffers. All pre-steady-state kinetic assays, if not specified, were performed in optimized reaction buffer R (50 mM HEPES, pH 7.5 at 37 °C, 5 mM MgCl₂, 50 mM NaCl, 0.1 mM EDTA, 5 mM DTT, 10% glycerol, and 0.1 mg/ml BSA) [27]. All given concentrations were final after mixing all solutions.

Primer Extension Assays. The 5′-[³²P]-labelled DNA substrate D-1 (30 nM) was preincubated with either the wild-type Dpo4 or the Y12A Dpo4 mutant (120 nM) and then reacted with either all four dNTPs or rNTPs (100 µM each) or individual dNTPs or rNTPs (100 µM) at 37 °C for 2 min (or for various times during running-start experiments) in buffer R. Reactions were terminated with 0.37 M EDTA and then analyzed by denaturing PAGE (17% polyacrylamide, 8 M urea). The gels were visualized using a Typhoon TRIO (GE Healthcare).

Measurement of Nucleotide Incorporation Efficiency and Fidelity. Single-turnover nucleotide incorporation assays were employed to obtain the kp and Kd, NTP as previously described [27]. Briefly, a preincubated solution of enzyme (120 nM) and 5′-radiolabeled DNA (30 nM) in buffer R was mixed with increasing concentrations of a single nucleotide. The reactions were terminated after various reaction times using 0.37 M EDTA. Reaction products were analyzed by denaturing PAGE (17% polyacrylamide, 8 M urea) and quantitated with a Typhoon TRIO (GE Healthcare). The time course of product formation at each nucleotide concentration was fit to a single-exponential equation (Eq 1):

[Product] = A[1 - exp(-kobs·t)] (Eq 1)

where kobs is the observed reaction rate constant and A is the reaction amplitude. Next, the plot of the kobs versus the nucleotide concentration was fit to a hyperbolic equation (Eq 2):

kobs = kp[NTP]/{[NTP] + Kd, NTP} (Eq 2)

where kp is the maximum nucleotide incorporation rate constant and the Kd, NTP is the equilibrium dissociation constant for the ternary complex (Dpo4•DNA•NTP). From these kinetic parameters, the ribonucleotide incorporation efficiency (kp/Kd, NTP) was calculated.
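As a minimal sketch of how Eq 1 and Eq 2 relate, the Python code below evaluates both equations as forward models using the kp and Kd, rATP reported in the Results for rATP incorporation into D-7 by the Y12A Dpo4 mutant. The function names are illustrative, and the amplitude is set to the DNA concentration purely as an assumption; in practice A is a fitted parameter. The code does not perform the fitting itself.

```python
import math

# Eq 1 and Eq 2 written as plain functions. Names are illustrative only.

def product_formed(t_s, amplitude_nM, k_obs):
    """Eq 1: [Product] = A[1 - exp(-kobs*t)]."""
    return amplitude_nM * (1.0 - math.exp(-k_obs * t_s))

def observed_rate(ntp_uM, k_p, k_d_uM):
    """Eq 2: kobs = kp[NTP] / ([NTP] + Kd)."""
    return k_p * ntp_uM / (ntp_uM + k_d_uM)

def incorporation_efficiency(k_p, k_d_uM):
    """Incorporation efficiency kp/Kd in uM^-1 s^-1."""
    return k_p / k_d_uM

K_P = 0.51        # s^-1, reported for rATP incorporation into D-7 (Y12A Dpo4)
K_D = 400.0       # uM, reported for the same reaction
AMPLITUDE = 30.0  # nM; ASSUMPTION: set equal to the DNA concentration for illustration

# rATP concentrations used in the time courses of Figure 3.5B
for conc in (50, 100, 200, 400, 600, 800, 1200):
    k_obs = observed_rate(conc, K_P, K_D)
    product = product_formed(10.0, AMPLITUDE, k_obs)
    print(f"[rATP] = {conc:>4} uM  kobs = {k_obs:.3f} s^-1  product at 10 s = {product:.1f} nM")

print(f"kp/Kd = {incorporation_efficiency(K_P, K_D):.2e} uM^-1 s^-1")
```

At [rATP] = Kd, rATP the hyperbola gives kobs = kp/2, which is a quick sanity check on any fitted pair of parameters.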


The ribonucleotide incorporation fidelity was also calculated using the following equation (Eq 3):

Fidelity = (kp/Kd, NTP)mismatched/[(kp/Kd, NTP)mismatched + (kp/Kd, NTP)matched] (Eq 3)

Determination of Sugar Selectivity. The sugar selectivity for a specific DNA substrate was determined for both the wild-type Dpo4 and the Y12A mutant using a nucleotide incorporation efficiency ratio (Eq 4):

Sugar selectivity = (kp/Kd)dNTP/(kp/Kd)rNTP (Eq 4)

where (kp/Kd)dNTP is the dNTP incorporation efficiency and (kp/Kd)rNTP is the rNTP incorporation efficiency [25].
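Eq 3 and Eq 4 are simple ratios of incorporation efficiencies, and the following Python sketch writes them out explicitly. The function names are illustrative. The sugar selectivity example uses the average Y12A efficiencies quoted in the Discussion; the efficiencies passed to the fidelity function are hypothetical placeholders chosen only to show the arithmetic and are not measured values.

```python
# Eq 3 and Eq 4 as plain functions. Names are illustrative only.

def fidelity(eff_mismatched, eff_matched):
    """Eq 3: mismatched efficiency over the sum of mismatched and matched efficiencies."""
    return eff_mismatched / (eff_mismatched + eff_matched)

def sugar_selectivity(eff_dntp, eff_rntp):
    """Eq 4: dNTP incorporation efficiency over rNTP incorporation efficiency."""
    return eff_dntp / eff_rntp

# Average efficiencies quoted in the Discussion for the Y12A Dpo4 mutant (uM^-1 s^-1)
avg_dntp_eff = 8.0e-3
avg_rntp_eff = 1.7e-3
print(f"average sugar selectivity (Y12A): {sugar_selectivity(avg_dntp_eff, avg_rntp_eff):.1f}")

# HYPOTHETICAL efficiencies, used only to illustrate Eq 3
matched_eff = 1.0e-3
mismatched_eff = 1.0e-6
print(f"illustrative fidelity: {fidelity(mismatched_eff, matched_eff):.1e}")
```

Because the matched efficiency dominates the denominator of Eq 3, the fidelity is close to the simple ratio of mismatched to matched efficiency whenever the two differ by orders of magnitude.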

3.3 Results

Reduced rNTP Discrimination by the Y12A Dpo4 Mutant. Sequence alignment of the Y-family DNA polymerases shows that Dpo4 possesses a potential ‘steric gate’ residue, Tyr12 (Figure 3.1). This putative ‘steric gate’ residue was mutated to Ala through site-directed mutagenesis, and the single-residue mutant protein was expressed and purified as described for the wild-type Dpo4 (Materials and Methods) [27]. To test the impact of the Y12A mutation on the ability of Dpo4 to discriminate against rNTPs, running-start assays using DNA substrate D-1 (Table 3.1) were individually performed with the wild-type Dpo4 and its Y12A mutant at 37 °C. In the presence of four dNTPs, the wild-type Dpo4 extended primer 21-mer to the 5′-end of DNA template 41-mer before 0.6 min (Figure 3.2). The Y12A Dpo4 mutant also synthesized full-length products before 4 min, suggesting that the efficiency of dNTP incorporation was reduced by about 7-fold (4 min divided by 0.6 min) relative to the dNTP incorporation efficiency of the wild-type Dpo4 (Figure 3.2). In contrast to dNTP incorporation, the wild-type Dpo4 incorporated only approximately five rNTPs after 4 min, with no full-length product observed within 1 hr, and was thus highly discriminatory against incoming rNTPs (Figure 3.2). Interestingly, the Y12A Dpo4 mutant was able to incorporate all four rNTPs and extended the DNA primer 21-mer to full-length products within 20 min (Figure 3.2). Moreover, the DNA-RNA hybrid products were reduced to the DNA primer 21-mer after alkaline degradation (Figure 3.3). Thus, the Y12A mutation caused Dpo4 to lose most of its ability to discriminate against rNTPs and to synthesize 20- and 21-nucleotide RNA oligomers with a reasonable velocity. Overall, the Y12A Dpo4 mutant was approximately 5-fold (20 min divided by 4 min) less efficient when incorporating rNTPs than dNTPs to form full-length products (Figure 3.2). The 42-mer products in Figure 3.2 were likely derived from blunt-end addition [117]. To examine if the wild-type Dpo4 and the Y12A mutant preferentially incorporated a matched or a mismatched rNTP opposite a DNA template base, single nucleotide incorporation assays were performed for 2 min at 37 °C in the presence of D-1 (Table 3.1). The resulting gel images are shown in Figure 3.4. Clearly, the Y12A Dpo4 mutant preferred to incorporate matched UTP or correct dTTP opposite the templating base dA with minimal incorporations of mismatched rNTPs or incorrect dNTPs (Figure 3.4). In comparison, the wild-type Dpo4 incorporated very small amounts of UTP and rGTP but incorporated all four dNTPs with a preference for correct dTTP. The latter observation is consistent with our previous kinetic results where Dpo4 displays low fidelity (10⁻³ to 10⁻⁴) when incorporating dNTPs into undamaged DNA at 37 °C [27].

Sugar Selectivity of the Wild-Type Dpo4 and the Y12A Dpo4 Mutant. To quantitatively analyze the sugar selectivity of a DNA polymerase, the incorporation efficiencies of each dNTP and its corresponding rNTP were determined. Under single-turnover reaction conditions, we have previously determined the incorporation efficiency of each of the four correct dNTP incorporations into the corresponding D-1, D-6, D-7 or D-8 DNA substrate (Table 3.1) catalyzed by the wild-type Dpo4 at 37 °C [27]. By employing the same kinetic assays, we individually determined the rNTP incorporation efficiencies with the wild-type Dpo4 as well as dNTP or rNTP incorporation efficiencies with the Y12A Dpo4 mutant. For example, a preincubated solution of the Y12A Dpo4 mutant (120 nM) and 5′-radiolabeled D-7 (Table 3.1) was rapidly mixed with rATP (50 to 1,200 µM) and these reactions were quenched by 0.37 M EDTA after various reaction times. The gel image of a time course of rATP incorporation is shown in Figure 3.5A. At each rATP concentration, the plot of product formation as a function of time was fit to Eq 1 (Materials and Methods) to obtain the kobs values (Figure 3.5B). Next, the kobs was plotted against the corresponding rATP concentration (Figure 3.5C) and the plot was fit to Eq 2 (Materials and Methods) to obtain the maximum nucleotide incorporation rate (kp = 0.51 ± 0.04 s⁻¹) and the equilibrium dissociation constant (Kd, rATP = 400 ± 70 µM) for rATP (Table 3.2 and Figure 3.5C). Recently, K. A. Johnson and coworkers have used stopped-flow and computer simulation methods to investigate dNTP incorporation catalyzed by T7 phage DNA polymerase and found that the Kd, dNTP value obtained from single-turnover kinetic assays does not represent the true nucleotide ground state binding affinity due to protein conformational changes prior to catalysis [144]. By monitoring real-time fluorescence resonance energy transfer signal changes, we have discovered that the four domains in Dpo4 undergo synchronized conformational changes induced by correct dNTP binding [2]. If rNTP binding induces protein conformational changes in Dpo4 similar to those induced by dNTP binding [2], the Kd, rATP measured above should be considered an apparent Kd, rATP.

Using the measured kp and Kd, rATP values, we then calculated rATP incorporation efficiency (kp/Kd, rATP = 1.3 × 10⁻³ µM⁻¹ s⁻¹) and sugar selectivity ((kp/Kd)dATP/(kp/Kd)rATP = 3) for the Y12A Dpo4 mutant (Table 3.2). Opposite template bases dA, dG, and dC, the sugar selectivity values were measured under the same single-turnover reaction conditions and were determined to be 30, 12, and 4, respectively. In comparison, the sugar selectivity values for the wild-type Dpo4 were also determined, and they are 20,500, 18,333, 13,448, and 5,500 (Table 3.2) for dTTP/UTP, dGTP/rGTP, dATP/rATP, and dCTP/rCTP, respectively. Comparing these sugar selectivity values in Table 3.2, the Y12A mutation significantly decreased the sugar selectivity of Dpo4 (Figure 3.6). The extremely low rNTP incorporation efficiencies relative to those of correct dNTPs with the wild-type Dpo4 (Table 3.2) are consistent with the product formation patterns observed in Figure 3.2: incorporation of rNTPs catalyzed by the wild-type Dpo4 was extremely inefficient while dNTPs were rapidly incorporated. When comparing the kinetic parameters (Table 3.2) of the wild-type Dpo4 and its Y12A mutant, the mutation caused moderate to significant changes in both kp and Kd, dNTP values for dNTP incorporation, which led to 3-, 32-, 10- and 19-fold decreases in dTTP, dGTP, dCTP, and dATP incorporation efficiencies, respectively. The difference in dNTP incorporation efficiencies correlated well with the slower formation of the full-length DNA products synthesized by the Y12A mutant than by the wild-type Dpo4 (Figure 3.2). In addition, the Y12A Dpo4 mutant incorporated rNTPs and dNTPs with very different kinetic parameters. Relative to dNTP incorporation, the kp values decreased by ~9-fold for all matched rNTP incorporations, the Kd, rNTP values increased by ~2-fold for pyrimidine nucleotides (i.e. UTP/dTTP, rCTP/dCTP), and the Kd, rNTP values decreased by about 2-fold for purine nucleotides (i.e. rGTP/dGTP, rATP/dATP). These changes in the kinetic parameters led to rNTP incorporation with 3- to 30-fold lower efficiencies than the corresponding dNTP incorporation (Table 3.2), resulting in the slower DNA primer extension in the presence of rNTPs relative to dNTPs (Figure 3.2).

Fidelity of DNA-Dependent RNA Polymerase Activity of the Y12A Dpo4 Mutant. Figure 3.2 shows that the Y12A Dpo4 mutant possesses a DNA-dependent RNA polymerase activity in addition to the DNA-dependent DNA polymerase activity. To determine the fidelity of this DNA-dependent RNA polymerase activity, the kinetic parameters (Table 3.3) of mismatched rNTP incorporations into DNA substrate D-1, e.g. rATP incorporation (Figure 3.7), were determined under single-turnover reaction conditions. Opposite templating base dA, both kp and kp/Kd, rNTP values of mismatched rNTPs are 2-3 orders of magnitude lower than those of matched UTP while the difference in Kd, rNTP values is within 2-fold (Table 3.3). The calculated fidelity for the DNA-dependent RNA polymerase activity of the Y12A Dpo4 mutant is in the range of 10⁻³ to 10⁻⁴ (Table 3.3). Notably, the Y12A Dpo4 mutant misincorporated rCTP 14- and 3-fold more efficiently than rATP and rGTP, respectively (Table 3.3). Since the 5′-base from the templating base dA in D-1 is dG (Table 3.1), the enhanced misincorporation of rCTP was probably due to rNTP-stabilized misalignment as observed with dNTP misincorporation catalyzed by the wild-type Dpo4 [27]. This hypothesis is supported by the fact that the rCTP misincorporation efficiency (1.1 × 10⁻⁷ µM⁻¹ s⁻¹) is 21-fold lower when the 5′-base dG (D-1) from the templating base dA was changed to dA (D-1′) (Table 3.3).

Blunt-End rNTP Addition Catalyzed by the Y12A Dpo4 Mutant. The 42-mer products in Figure 3.2 indicate that the Y12A Dpo4 mutant was able to catalyze a blunt-end rNTP addition onto an RNA/DNA duplex 41/41-mer as it incorporated a dNTP onto the blunt-end DNA/DNA duplex 41/41-mer. To examine how slow the blunt-end addition activity is, we measured the rate constant for rNTP incorporation onto BE2 (Table 3.1) catalyzed by the Y12A Dpo4 mutant. This mutant was able to incorporate a small amount of rATP (~2 nM) after 2 hrs (Figure 3.8A). The kobs was determined to be (2.9 ± 0.4) × 10⁻⁵ s⁻¹ when the rATP concentration was 600 µM (Figure 3.8B). This value is 120- to 207-fold lower than the kp values (3.5-6 × 10⁻³ s⁻¹) for the blunt-end dATP addition catalyzed by the wild-type Dpo4 [117]. Under the same reaction conditions as in Figure 3.8A, there were no detectable blunt-end additions of either rCTP, rGTP, or UTP (data not shown). Thus, blunt-end rNTP additions onto a DNA/DNA duplex are much less efficient than the corresponding blunt-end dNTP additions, and rATP, like dATP, was the preferred nucleotide for this activity due to strong intrahelical base stacking interactions between nucleobase adenine and blunt-end DNA [117].
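The fold difference quoted above follows directly from dividing the reported kp range for blunt-end dATP addition by the measured kobs for blunt-end rATP addition. The Python sketch below reproduces that arithmetic; the variable names are illustrative, and note that the comparison is between a kobs measured at 600 µM rATP and kp values, as in the text.

```python
# Ratio of blunt-end dATP addition rate constants (wild-type Dpo4) to the
# blunt-end rATP addition rate constant (Y12A Dpo4). Names are illustrative.

K_OBS_RATP = 2.9e-5                # s^-1, kobs at 600 uM rATP (Figure 3.8B)
KP_DATP_RANGE = (3.5e-3, 6.0e-3)   # s^-1, kp range for blunt-end dATP addition [117]

fold_differences = [kp / K_OBS_RATP for kp in KP_DATP_RANGE]

for kp, fold in zip(KP_DATP_RANGE, fold_differences):
    print(f"kp = {kp:.1e} s^-1 -> {fold:.1f}-fold faster than rATP addition")
```

The two ratios come out at roughly 120.7 and 206.9, matching the 120- to 207-fold range stated in the text.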

3.4 Discussion

Kinetic Basis for Sugar Selectivity of Dpo4. Sequence alignment of the Y-family DNA polymerases (Figure 3.1) indicates a conserved Tyr or Phe residue which likely serves as the ‘steric gate’ in the discrimination against rNTPs. Mutation of Phe12 of S. acidocaldarius Dbh to Ala, which possesses a much smaller side chain than Phe, decreases the sugar selectivity ((kp/Kd)dGTP/(kp/Kd)rGTP) of this Y-family member from 3,400 to 4 for rGTP/dGTP [132]. Similarly, the mutation of Phe13 of E. coli DinB to slightly less bulky Val has a modest effect on its sugar selectivity, which drops from 10⁵ to 10³ [145]. Since Dpo4 belongs to the same DinB subfamily, one can predict that the Y12A mutation will likely relax Dpo4's discrimination against rNTPs. This prediction was confirmed by the product formation patterns in Figure 3.4, which directly demonstrate that, opposite the templating base dA, UTP was as good a nucleotide substrate as dTTP for the Y12A Dpo4 mutant and a much better substrate for the Y12A mutant than for the wild-type Dpo4. This prediction was also confirmed by our kinetically determined sugar selectivity values, which show that the sugar selectivity decreased from the range of 5,500-20,500 for the wild-type Dpo4 to 3-30 for the Y12A Dpo4 mutant, depending on the identity of a nascent base pair (Table 3.2 and Figure 3.6). The kinetic basis for the significant decrease in sugar selectivity is that the Y12A mutation dramatically enhanced rNTP incorporation efficiency ((kp/Kd)rNTP) by an average of 221-fold while decreasing correct dNTP incorporation efficiency ((kp/Kd)dNTP) by an average of 9-fold, leading to a calculated reduction of sugar selectivity by 1,989-fold. Moreover, the difference between the average efficiency for dNTP (8.0 × 10⁻³ µM⁻¹ s⁻¹) and rNTP incorporation (1.7 × 10⁻³ µM⁻¹ s⁻¹) for the Y12A Dpo4 mutant is only 5-fold while it is 9,200-fold for the wild-type Dpo4. Thus, the small side chain of Ala12 allowed Dpo4 to incorporate rNTPs almost as efficiently as dNTPs. The 5-fold difference between dNTP and rNTP incorporation efficiency for the Y12A Dpo4 mutant was mainly contributed by the difference in kp values (9-fold) (Table 3.2). In contrast, the average kp value difference between dNTP and rNTP incorporations catalyzed by the wild-type Dpo4 was 1,925-fold while the average Kd difference is only 17-fold (Table 3.2). These kinetic data suggested that the Y12A mutation of Dpo4 mainly affected catalysis (kp), rather than the nucleotide binding step (Kd). The kinetic insights into the sugar selectivity of Dpo4 may shed light on how human Y-family DNA polymerases discriminate against ribonucleotides in vivo, especially when dNTP pools are low and rNTP pools are high [25], as their proposed ‘steric gate’ residues are either Phe or Tyr (Figure 3.1).
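The 1,989-fold figure above is the product of the two average fold changes, because sugar selectivity is a ratio: dividing the wild-type selectivity by the mutant selectivity gives (dNTP efficiency decrease) × (rNTP efficiency increase). The Python sketch below makes that step explicit. The function name and the baseline efficiencies are hypothetical and serve only to show that the identity holds for any starting values.

```python
# Sugar selectivity S = eff_dNTP / eff_rNTP, so
#   S_wt / S_mut = (eff_dNTP_wt / eff_dNTP_mut) * (eff_rNTP_mut / eff_rNTP_wt)
#                = (fold decrease in dNTP efficiency) * (fold increase in rNTP efficiency)

def selectivity_reduction(dntp_fold_decrease, rntp_fold_increase):
    """Fold reduction in sugar selectivity implied by the two fold changes."""
    return dntp_fold_decrease * rntp_fold_increase

print(selectivity_reduction(9, 221))  # average fold changes quoted in the text

# Cross-check with HYPOTHETICAL baseline efficiencies (arbitrary units)
dntp_wt, rntp_wt = 1.0, 1.0e-4
dntp_mut = dntp_wt / 9      # 9-fold lower dNTP efficiency
rntp_mut = rntp_wt * 221    # 221-fold higher rNTP efficiency
ratio = (dntp_wt / rntp_wt) / (dntp_mut / rntp_mut)
print(round(ratio))
```

Both calculations give 1,989, in line with the approximately 2,000-fold reduction summarized in the Abstract.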

To gain structural insight into how Dpo4 discriminates against an incoming rNTP, we modeled rATP into the binding site of dATP by swapping the ribose rings between Dpo4-bound dATP (PDB 2AGQ) and an N5-CAIR synthetase-bound rATP (PDB 3ETH) (Figure 3.9) [52, 146]. The adenine base and the triphosphate were not modified. The C1′, C4′ and C3′ atoms of the rATP (PDB 3ETH) occupied the same positions as those corresponding atoms in the Dpo4-bound dATP (PDB 2AGQ). In this model, the O2′ of the incoming rATP and the side chain of Tyr12 are at least 0.7 Å too close to avoid a steric clash based on van der Waals atomic radii (Figure 3.9B). In solution, the side chain of Tyr12 must move away from an incoming rATP in order for Dpo4 to bind and incorporate this rATP. This movement will require more energy and may explain why the matched rNTP incorporation efficiencies with the wild-type Dpo4 were significantly lower than the matched rNTP incorporation efficiencies with the Y12A Dpo4 mutant (Table 3.2).

DNA-Dependent RNA Polymerase Activity of the Y12A Dpo4 Mutant. In addition to an intrinsic DNA-dependent DNA polymerase activity, the Y12A Dpo4 mutant was able to incorporate at least 20 consecutive rNTPs into the DNA substrate D-1 21/41-mer (Figure 3.2) and displayed a DNA-dependent RNA polymerase activity. Interestingly, this activity did not reach its upper limit since longer RNA polymers were synthesized by the Y12A Dpo4 mutant with shorter DNA primers (data not shown). In comparison, the binary and ternary crystal structures of Dpo4 have shown that the DNA binding cleft of Dpo4 is about 8 base pairs in length and the bound DNA/DNA duplex is completely in the B-type conformation [3, 60]. Thus, the DNA binding cleft of the Y12A Dpo4 mutant can accommodate both DNA/DNA (B-type) and RNA/DNA (A-type) helices during polymerization. Similarly, the F12A Dbh mutant has been found to be capable of performing at least 10 successive rNTP insertions into a DNA/DNA duplex 13/23-mer [132], suggesting that Dbh, like its Y-family homolog Dpo4, can readily bind both A- and B-type helices at its flexible DNA binding cleft. This is not surprising considering that the Y-family DNA polymerases are known to accommodate DNA containing bulky and helix-distorting lesions during translesion DNA synthesis. The ternary crystal structures of both Dpo4 [3, 60] and Dbh [141] demonstrate that the DNA binding cleft of either Y-family enzyme is formed by a polymerase core (Palm, Thumb, and Finger domains) and a Little Finger domain; their physical connection is through a 14 amino acid residue linker. This linker likely increases the flexibility of the DNA binding cleft by facilitating the repositioning of the Little Finger domain relative to the polymerase core, depending on the conformation of bound DNA helix (A- or B-type). This hypothesis is supported by the modeling result revealed in Figure 3.10: only a portion (5 base pairs) of the B-type DNA/DNA duplex (8 base pairs) in the ternary structure of Dpo4 can be replaced by the same sized A-type RNA/DNA duplex without steric hindrance of the Little Finger domain. Interestingly, unlike the Y-family enzymes Dpo4 and Dbh, DNA polymerases from other families do not possess the Little Finger domain and the linker; therefore, their DNA binding clefts are predicted to be relatively stringent. A stringent DNA cleft may not tolerate pure A-type helices and may limit the ability of the polymerase to synthesize a long RNA polymer opposite a DNA template. Consistently, the ‘steric gate’ mutants of those non-Y-family DNA polymerases have been found to only add approximately 4 to 6 rNTPs to a DNA/DNA duplex [129, 131, 147-150]. In addition, these studies also indicate that the DNA binding clefts of those non-Y-family DNA polymerase mutants can simultaneously tolerate a stretch (4 to 6 base pairs) of RNA/DNA helix from the primer 3′-terminus and a stretch of DNA/DNA helix near the primer 5′-terminus. Interestingly, this covalently linked A-type and B-type duplex conformation has been observed in the ternary crystal structures of those non-Y-family DNA polymerases in complex with DNA and dNTP, which show that a portion (~4 base pairs) of the double-stranded DNA duplex next to the primer 3′-terminus is in an A-like conformation while the rest of the DNA duplex is in the B-type conformation [143, 151-153].

Interestingly, the fidelity of rNTP incorporation into D-1 (Table 3.1) catalyzed by the Y12A Dpo4 mutant was measured to be in the range of 10⁻³ to 10⁻⁴ (Table 3.3), which is identical to the fidelity range for the DNA-dependent DNA polymerase activity of the wild-type Dpo4 [27]. This suggests that the DNA-dependent RNA polymerase activity of the Y12A Dpo4 mutant is as error prone as its intrinsic DNA-dependent DNA polymerase activity. Although slow, the DNA-dependent RNA polymerase activity of the Y12A Dpo4 mutant catalyzed blunt-end rNTP and dNTP additions (Figure 3.2), and the predominant blunt-end rNTP addition event is single rATP incorporation (Figure 3.8). These observations mirror what we have observed with the blunt-end dNTP additions catalyzed by the wild-type Dpo4 [117]. Taken together, we conclude that the DNA-dependent RNA polymerase activity and DNA-dependent DNA polymerase activity of the Y12A Dpo4 mutant are mechanistically similar.


3.5 Figures

S.solf Dpo4      1  MIVLFVDFDYFYAQVEEVLNPS
S.acid Dbh       1  MIVIFVDFDYFFAQVEEVLNPQ
E.coli UmuC      1  MFALCDVNAFYASCETVFRPD
E.coli DinB      2  RKIIHVDMDCFFAAVEMRDNPA
H.sapi Polκ    101  NTIVHIDMDAFYAAVEMRDNPE
H.sapi Polη      7  RVVALVDMDCFFVQVEQRQNPH
S.cere Polη     24  ACIAHIDMNAFFAQVEQMRCGL
H.sapi Polι     28  RVIVHVDLDCFYAQVEMISNPE
H.sapi Rev1    417  SCIMHVDMDCFFVSVGIRNRPD

Figure 3.1. Sequence alignment of the Y-family DNA polymerases.

The conserved residue in bold-type is the putative ‘steric gate’ residue. Residue numbers are shown on the left side of the primary sequences.


Figure 3.2. Running start assays for the wild-type Dpo4 and the Y12A Dpo4 mutant at 37 °C.

A preincubated solution of enzyme (120 nM) and 5′-radiolabeled DNA substrate D-1 (30 nM) was rapidly mixed with all four rNTPs or dNTPs (100 µM each) for various reaction times before being quenched with 0.37 M EDTA. Sizes of important products are denoted on the right side of each image.


Figure 3.3. Alkaline degradation of full-length extension on D-1 21/41-mer catalyzed by Y12A Dpo4 at 37 °C.

Lane 1: A preincubated solution of the Y12A Dpo4 mutant (120 nM) and 5′-radiolabeled D-1 21/41-mer (30 nM) was rapidly mixed with all rNTPs (0.5 mM each). The reaction was quenched with 0.37 M EDTA after 30 min and the products are shown in lane 1. Lane 2: After the reaction in lane 1 was quenched, the solution was mixed with an equal volume of 0.5 M NaOH at 60 °C for 10 min before being diluted with more 0.37 M EDTA. The resulting ladder is shown in lane 2. Lane 3: A preincubated solution of only the Y12A Dpo4 mutant (30 nM) and 5′-radiolabeled D-1 21/41-mer (30 nM) in reaction buffer (no rNTPs).


Figure 3.4. Single nucleotide incorporation assays at 37 °C.

A preincubated solution of enzyme (120 nM) and 5′-radiolabeled DNA D-1 (30 nM) was rapidly reacted with the indicated nucleotide (100 µM) for 2 min before being quenched with 0.37 M EDTA. Some reactions were catalyzed by the Y12A Dpo4 mutant in the presence of an rNTP (A) or a dNTP (B) while others were catalyzed by the wild-type Dpo4 in the presence of an rNTP (C) or a dNTP (D). ‘B’ denotes that no enzyme was added to the reaction. The primer's size is denoted on the right side of each image.



Figure 3.5. Matched rATP incorporation into DNA substrate D-7 (Table 3.1) catalyzed by the Y12A Dpo4 mutant at 37 °C.

A preincubated solution of the Y12A Dpo4 mutant (120 nM) and 5′-radiolabeled D-7 (30 nM) was rapidly mixed with increasing concentrations of rATP before being quenched by 0.37 M EDTA at various reaction times. (A) Gel image of the time course of the incorporation of 50 µM rATP; (B) Plots of product concentration versus reaction time at specific rATP concentrations (50 µM, ●; 100 µM, ○; 200 µM, ♦; 400 µM, ◊; 600 µM, ▲; 800 µM, Δ; 1200 µM, ■). Each time course was fit to Eq 1 to obtain kobs (Materials and Methods). (C) Plot of kobs values as a function of rATP concentrations. The data were fit to Eq 2 (Materials and Methods) to obtain a kp of 0.51 ± 0.04 s⁻¹ and a Kd, rATP of 400 ± 70 µM (Table 3.2).

[Figure 3.5 continued. Panel B: [Product] (nM) versus reaction time (s). Panel C: kobs (s⁻¹) versus [rATP] (µM).]

[Figure 3.6: bar chart of sugar selectivity (log scale, 1 to 10⁵) for each nucleotide pair (dATP/ATP, dCTP/CTP, dGTP/GTP, dTTP/UTP).]

Figure 3.6. Comparison of sugar selectivity values (Table 3.2) between the wild-type Dpo4 (grey bar) and the Y12A Dpo4 mutant (black bar).


[Figure 3.7, panel A: [Product] (nM) versus reaction time (s).]

Figure 3.7. Mismatched rATP incorporation into DNA substrate D-1 (Table 3.1) catalyzed by the Y12A Dpo4 mutant at 37 °C.

(A) A preincubated solution of the Y12A Dpo4 mutant (120 nM) and 5′-radiolabeled DNA D-1 (30 nM) was rapidly mixed with increasing concentrations of rATP (50 µM, ◊; 100 µM, ■; 200 µM, □; 400 µM, ○; 800 µM, ♦; 1200 µM, Δ; 1500 µM, ●) before being quenched by 0.37 M EDTA after various reaction times. Each time course of product formation was fit to Eq 1 (Materials and Methods) to yield kobs values. (B) Plot of kobs values versus rATP concentrations was fit to Eq 2 (Materials and Methods) to obtain a kp of (2.2 ± 0.1) × 10⁻⁴ s⁻¹ and a Kd, rATP of 1,292 ± 154 µM (Table 3.3).

[Figure 3.7 continued. Panel B: kobs (s⁻¹) versus [rATP] (µM).]

[Panel A: gel image.]

[Panel B: plot of [Product] (nM) versus Reaction Time (s).]

Figure 3.8. Blunt-end rATP additions onto DNA substrate BE2 (Table 3.1) catalyzed by the Y12A Dpo4 mutant at 37 °C.

(A) Gel image of the time course of rATP (600 µM) incorporation. (B) Plot of product concentrations in (A) as a function of reaction times. The time course was fit to Eq 1 (Materials and Methods) to yield a kobs of (2.9 ± 0.4) x 10^-5 s^-1.

A

B

Figure 3.9. Magnification of the active site of the wild-type Dpo4.

The Dpo4-bound incoming nucleotide was dATP (A) or rATP (B). The ternary crystal structure of the wild-type Dpo4 (grey with Tyr12 in multiple colors), and dATP (multiple colors) in (A) is from PDB 2AGQ. The ribose ring in (B) is from the rATP in PDB 3ETH while the rest of the structure is identical to (A). The catalytic metal ions are in magenta.


Figure 3.10. Model of an RNA/DNA primer/template duplex into the DNA binding cleft of the Y12A Dpo4 mutant.

The 5-mer RNA/DNA duplex (cyan) was taken from the coordinates given in PDB 2QK9. The single-strand portion of the DNA template (orange) was taken from the coordinates given in PDB 2AGQ. The Y12A Dpo4 mutant (the polymerase core in grey, the Little Finger domain in green) was based on the coordinates from PDB 2AGQ. The incoming rATP (multiple colors) was modeled into the polymerase active site as in Figure 3.9B. The catalytic metal ions are in magenta. Residue Ala12 is shown in yellow. The distance from the 2′-OH of the rATP ribose ring to the side chain of Ala12 is 3.6 Å.

3.6 Tables

Substrate  Sequence

D-1        5′-CGCAGCCGTCCAACCAACTCA-3′
           3′-GCGTCGGCAGGTTGGTTGAGTAGCAGCTAGGTTACGGCAGG-5′

D-6        5′-CGCAGCCGTCCAACCAACTCA-3′
           3′-GCGTCGGCAGGTTGGTTGAGTGGCAGCTAGGTTACGGCAGG-5′

D-7        5′-CGCAGCCGTCCAACCAACTCA-3′
           3′-GCGTCGGCAGGTTGGTTGAGTTGCAGCTAGGTTACGGCAGG-5′

D-8        5′-CGCAGCCGTCCAACCAACTCA-3′
           3′-GCGTCGGCAGGTTGGTTGAGTCGCAGCTAGGTTACGGCAGG-5′

D-1′       5′-CGCAGCCGTCCAACCAACTCA-3′
           3′-GCGTCGGCAGGTTGGTTGAGTAACAGCTAGGTTACGGCAGG-5′

BE2        5′-TTGAGTTGCAACTCAA-3′
           3′-AACTCAACGTTGAGTT-5′

Table 3.1. DNA substrates.


Dpo4       DNA        Nucleotide  kp                    Kd           kp/Kd          Sugar
           Substrate              (s^-1)                (µM)         (µM^-1 s^-1)   Selectivity(a)

Wild-type  D-1        dTTP(b)     9.4 ± 0.3             230 ± 17     4.1 x 10^-2
           D-1        UTP         (4.0 ± 1.0) x 10^-3   2158 ± 779   2.0 x 10^-6    20500
           D-8        dGTP(b)     9.4 ± 0.2             171 ± 15     5.5 x 10^-2
           D-8        rGTP        (2.9 ± 0.2) x 10^-3   980 ± 166    3.0 x 10^-6    18333
           D-6        dCTP(b)     7.6 ± 0.2             70 ± 8       1.1 x 10^-1
           D-6        rCTP        (6.4 ± 0.6) x 10^-2   3111 ± 419   2.0 x 10^-5    5500
           D-7        dATP(b)     16 ± 0.9              206 ± 46     7.8 x 10^-2
           D-7        rATP        (8.1 ± 0.6) x 10^-3   1405 ± 172   5.8 x 10^-6    13448

Y12A       D-1        dTTP        4.5 ± 0.5             300 ± 90     1.5 x 10^-2
           D-1        UTP         (3.3 ± 0.4) x 10^-1   600 ± 200    5.5 x 10^-4    30
           D-8        dGTP        (9.4 ± 0.9) x 10^-1   540 ± 100    1.7 x 10^-3
           D-8        rGTP        (1.1 ± 0.1) x 10^-1   270 ± 40     4.1 x 10^-4    4
           D-6        dCTP        4.9 ± 0.2             460 ± 45     1.1 x 10^-2
           D-6        rCTP        0.7 ± 0.1             800 ± 300    8.8 x 10^-4    12
           D-7        dATP        3.7 ± 0.5             900 ± 200    4.1 x 10^-3
           D-7        rATP        (5.1 ± 0.4) x 10^-1   400 ± 70     1.3 x 10^-3    3

(a) Calculated as (kp/Kd)dNTP/(kp/Kd)rNTP.
(b) Kinetic parameters are from reference [27].

Table 3.2. Kinetic parameters of matched rNTP or dNTP incorporation into DNA catalyzed by the wild-type Dpo4 or its Y12A mutant at 37 °C.
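The sugar selectivity column can be reproduced directly from the tabulated efficiencies. The short sketch below applies the footnote definition, (kp/Kd)dNTP/(kp/Kd)rNTP, to the D-7 rows for the wild-type enzyme and the Y12A mutant; the function name is hypothetical and the inputs are the rounded kp/Kd values from Table 3.2.

```python
def sugar_selectivity(eff_dntp, eff_rntp):
    """Ratio of incorporation efficiencies: (kp/Kd)dNTP / (kp/Kd)rNTP."""
    return eff_dntp / eff_rntp


# kp/Kd values (uM^-1 s^-1) for substrate D-7 (dATP versus rATP), from Table 3.2
wild_type = sugar_selectivity(7.8e-2, 5.8e-6)
y12a = sugar_selectivity(4.1e-3, 1.3e-3)

print(f"wild-type Dpo4: {wild_type:.0f}")
print(f"Y12A Dpo4:      {y12a:.0f}")
```

Because the inputs are already rounded, values recomputed this way can differ slightly from tabulated entries that were derived from unrounded parameters.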


rNTP      kp                     Kd, rNTP     kp/Kd, rNTP     rNTP Incorporation
          (s^-1)                 (µM)         (µM^-1 s^-1)    Fidelity(b)

DNA substrate D-1
UTP(a)    (3.3 ± 0.4) x 10^-1    600 ± 200    5.5 x 10^-4     -
rATP      (2.2 ± 0.1) x 10^-4    1292 ± 154   1.7 x 10^-7     3.1 x 10^-4
rCTP      (11.0 ± 0.4) x 10^-4   489 ± 33     2.3 x 10^-6     4.2 x 10^-3
rGTP      (5.3 ± 0.5) x 10^-4    793 ± 152    6.7 x 10^-7     1.2 x 10^-3

DNA substrate D-1′
rCTP      (9.8 ± 1.9) x 10^-5    903 ± 344    1.1 x 10^-7     undetermined

(a) Kinetic parameters for matched UTP are from Table 3.2.
(b) Calculated as (kp/Kd)mismatched rNTP/[(kp/Kd)matched rNTP + (kp/Kd)mismatched rNTP].

Table 3.3. Kinetic parameters of mismatched rNTP incorporation into a DNA substrate with 5′-dG (D-1) or 5′-dA (D-1′) from the templating base dA (Table 3.1) catalyzed by the Y12A Dpo4 mutant at 37 °C.
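The fidelity values for substrate D-1 follow from the footnote definition, (kp/Kd)mismatched/[(kp/Kd)matched + (kp/Kd)mismatched]. The sketch below recomputes them from the tabulated efficiencies; the function and variable names are hypothetical.

```python
def rntp_fidelity(eff_mismatched, eff_matched):
    """Fidelity = (kp/Kd)mismatched / [(kp/Kd)matched + (kp/Kd)mismatched]."""
    return eff_mismatched / (eff_matched + eff_mismatched)


# kp/Kd values (uM^-1 s^-1) on DNA substrate D-1, from Table 3.3
matched_utp = 5.5e-4
mismatched = {"rATP": 1.7e-7, "rCTP": 2.3e-6, "rGTP": 6.7e-7}

fidelity = {name: rntp_fidelity(eff, matched_utp) for name, eff in mismatched.items()}
for name, value in fidelity.items():
    print(f"{name}: {value:.1e}")
```

No value is computed for D-1′ because the table lists no matched rNTP efficiency for that substrate.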


Chapter 4 : Quantitative Analysis of the Efficiency and Mutagenic Spectra of Abasic Lesion Bypass Catalyzed by Human Y-Family DNA Polymerases

4.1 Introduction

DNA polymerases are grouped into the A-, B-, C-, D-, X-, and Y-families. The Y-family DNA polymerases function primarily in the bypass of replication-stalling DNA lesions, a process which can ultimately decrease the possibility of invoking DNA damage-induced apoptosis. In humans, four of the 16 identified DNA polymerases are in the Y-family: DNA polymerases eta (hPolη), iota (hPolι), kappa (hPolκ), and Rev1 (hRev1). Numerous biochemical studies [75, 154, 155] have shown that these enzymes are capable of both error-free and error-prone lesion bypass, depending on the specific lesion. In vivo, hPolη is responsible for the error-free bypass of cis-syn thymine-thymine (TT) dimers [34, 156]. Moreover, the mutational inactivation of hPolη leads to Xeroderma Pigmentosum variant (XPV), which predisposes individuals to an increased incidence of sunlight-induced skin cancer [34]. hPolη also has been shown biochemically to bypass lesions including abasic sites (AP) [82], 7,8-dihydro-8-oxoguanine (8-oxoG) [36], (+)-trans-anti-benzo[a]pyrene-N2-dG ((+)BPDE-dG) [82], 1,N6-ethenodeoxyadenosine [35], O6-methylguanine [83], N-2-acetylaminofluorene-dG (AAF-dG) [62], and cisplatin-dGpG intrastrand adducts [62]. Interestingly, Polι has been shown to incorporate incorrect dGTP opposite a template base dT more efficiently than canonical dATP [30, 86]. In vitro, hPolι has been shown to traverse AP sites [85-87], 8-oxoG [85], AAF-dG [85], cis-syn TT [87], and (6-4) TT photoproducts [17, 85]. Polκ, which is a member of the DinB subfamily and a close relative of Y-family member Sulfolobus solfataricus DNA Polymerase IV (Dpo4), can efficiently elongate mispaired primer termini [100]. In addition, hPolκ has been shown to bypass AP sites, 8-oxoG, AAF-dG, and (+)BPDE-dG [99]. Rev1, which is the only Y-family DNA polymerase to contain a BRCT domain, is classified as a dCTP transferase [88]. hRev1 incorporates a dCTP efficiently opposite lesions including AP sites [89, 90], 8-oxoG, (+)BPDE-dG, (-)BPDE-dG, and 1,N6-ethenoadenine adducts [32]. Together, the aforementioned studies indicate that there is a significant overlap in the in vitro lesion bypass spectra of the four human Y-family DNA polymerases. Yet, with the exception of cis-syn TT dimers, it is unclear which human Y-family enzyme is responsible for the bypass of which lesion(s) in vivo.

One of the most challenging issues in the field of DNA lesion bypass is to identify the in vivo lesion bypass specificity of the Y-family DNA polymerases, especially for those Y-family enzymes that co-exist within the same organism. Knowledge of the in vitro lesion bypass efficiency and fidelity of a Y-family DNA polymerase may shed light on which lesions it bypasses in vivo. However, the aforementioned in vitro lesion bypass studies have been performed by different laboratories using different DNA substrates under different reaction conditions [72, 76, 157]. Thus, it is challenging to deduce which human Y-family enzyme bypasses specific lesions in vivo. In order to exclude these variables, we quantitatively assessed the AP lesion bypass abilities of the four human Y-family DNA polymerases using the same in vitro assay under the same reaction conditions. Notably, AP lesions are the most common DNA lesions found in mammalian cells with approximately 10,000 spontaneous AP sites generated in each cell every day [103, 104]. Herein, we have employed a recently developed short oligonucleotide sequencing assay (SOSA) to quantitatively determine the mutational spectra of these human Y-family enzymes in the vicinity of this non-coding DNA lesion. Our in vitro data suggest that hPolη is likely the Y-family DNA polymerase to bypass AP lesions in vivo.

4.2 Materials and Methods

Materials. Human AP endonuclease and Taq DNA polymerase were purchased from Trevigen and Invitrogen, respectively. [γ-32P]ATP was purchased from GE Healthcare Life Sciences. The oligodeoxynucleotides in Table 4.1 were purchased from Integrated DNA Technologies and were purified, labeled, and annealed as described previously [27].

Protein Purification. The gene encoding the full-length hPolη was cloned into the NdeI and XhoI sites of pET21B to generate a plasmid pET-21B-hPol. The C-terminal His6-tagged hPolη was induced and expressed in E. coli BL21(DE3) Rosetta cells at 16 °C. The protein was purified through a nickel affinity column, a heparin sepharose column, and a HiTrap SP column.

The gene encoding the N-terminal 420 amino acid residues of hPolι (hΔPolι) was inserted into the NcoI and XhoI sites of pGST-Parallel1 [158] to produce the plasmid pGST-iota. The N-terminal GST-tag on hPolι was used to increase the protein purification yield as hPolι proved to be largely insoluble in the cell lysate (data not shown). The GST-tagged hΔPolι was expressed in E. coli BL21(DE3) Rosetta cells at 16 °C. The fusion protein in the cleared lysate was bound to a GSTrap column. After washing, the fusion protein on the column was digested with tobacco etch viral protease (TEV) at 4 °C overnight. The free hΔPolι was then washed from the column with binding buffer and collected. The pooled fractions were passed through a DEAE column to remove any DNA from E. coli. Finally, hΔPolι was separated from TEV and other impurities using a heparin sepharose column.

The gene encoding the truncated fragment (residues 341-829) of hRev1 (hΔRev1) was cloned into plasmid pBAD-REV1S(341-829) [159]. The protein was induced with 1% L-(+)-arabinose and expressed in E. coli BL21(DE3) at 15 °C. The N-terminal His6-tagged hRev1 was purified using a nickel affinity column, a heparin sepharose column, and a HiTrap Q column.

The gene encoding the truncated fragment (residues 9-518) of hPolκ (hΔPolκ) was inserted into the NcoI and XhoI sites of the plasmid pHIS-Parallel1 [158] to create pHIS-hPolkappa-9-518. This plasmid was transformed into E. coli BL21(DE3) Rosetta cells. The N-terminal His6-tagged hPolκ was induced and expressed at 19 °C, and then was purified using a nickel affinity column, a heparin sepharose column, a HiTrap Q column, and a Sephacryl 200 gel filtration column.

The concentrations of the purified recombinant proteins hPolη, hPolι, hRev1 and hPolκ were determined spectrophotometrically at 280 nm using calculated molar extinction coefficients of 70,731, 14,080, 32,430 and 31,860 M^-1 cm^-1, respectively. As both the N- and C-termini are exposed to solvent and far away from their active sites [92, 102, 140], the His6-tag on either end should not affect the activities of hPolη, hΔPolκ, and hΔRev1. All experiments were conducted using the same preparation of each enzyme in order to eliminate variations between experiments.
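For readers unfamiliar with the spectrophotometric step, the sketch below shows how a protein concentration follows from an absorbance reading through the Beer-Lambert relationship, c = A280/(ε x l). The absorbance reading and the 1 cm path length are hypothetical illustration values; only the extinction coefficient (31,860 M^-1 cm^-1, the value listed above for the hPolκ fragment) comes from the text.

```python
def protein_concentration_uM(a280, extinction_coeff, path_cm=1.0):
    """Beer-Lambert law: c = A / (epsilon * l), returned in micromolar.

    extinction_coeff is in M^-1 cm^-1; path_cm is the cuvette path length (cm).
    """
    return a280 / (extinction_coeff * path_cm) * 1e6


# Hypothetical reading of A280 = 0.50 in an assumed 1 cm cuvette
conc = protein_concentration_uM(0.50, 31860)
print(f"{conc:.1f} uM")
```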

Reaction Buffer. The reaction buffer H contained 50 mM HEPES (pH 7.5 at 37 °C), 5 mM MgCl2, 50 mM NaCl, 5 mM DTT, 10% glycerol, 0.1 mM EDTA, and 0.1 mg/ml BSA. All reactions were performed at 37 °C.

Running Start Assay. Experiments were performed using a rapid chemical quench flow apparatus (KinTek) by rapidly mixing a solution containing 100 nM 5′-[32P]-labeled DNA (14-mer/51AP or 14-mer/51CTL, Table 4.1) and 100 nM of a human Y-family DNA polymerase preincubated in buffer H with a solution containing all four dNTPs (200 µM each) at 37 °C for times ranging from milliseconds to minutes, followed by quenching with 0.37 M EDTA. The nucleotide incorporation pattern was resolved by sequencing gel analysis.

Short Oligonucleotide Sequencing Assay. SOSA was performed as previously described [58] with the following modifications. The DNA substrate (14-mer/51AP, Table 4.1) contained the AP site analog tetrahydrofuran, which was located 22 nucleotides (nt) from the 3′-end of the DNA template. For control SOSA, the DNA substrate (14-mer/63CTL, Table 4.1) contained a dT in place of the AP site. The resulting full-length purified products were PCR amplified using the following primers: 16mer_AP_upstream, 5′-CACGCAGCCGTCCAAC-3′, and 16mer_AP_downstream, 5′-GCCCTGGACTCAGGAC-3′. The DNA plasmids containing the ligated full-length products were sequenced from bacterial colonies (Genewiz, Inc.). The method is summarized in Scheme 4.1.

4.3 Results

Our goal was to quantitatively elucidate the relative abilities of the four human Y-family DNA polymerases to replicate through an AP site. Previous studies have indicated that hPolη [82], hPolι [85-87], hPolκ [99], and hRev1 [88, 89] are able to bypass an AP site in vitro. However, these in vitro studies have not quantitatively evaluated the AP site bypass efficiency of the four human Y-family enzymes. Moreover, there have been no studies reporting a comprehensive nucleotide incorporation profile for these human enzymes that encompasses nucleotide incorporation events not only opposite the AP site, but also upstream and downstream from this non-coding lesion. Such studies are important because they bring us one step closer to determining which of the human Y-family DNA polymerases preferentially performs AP site bypass in vivo. To address these unresolved issues, running start assays were performed to determine the ability of each human Y-family enzyme to elongate a 14-mer/51AP substrate containing an AP site located 22 bases from the 3′-end of the DNA template 51AP (Table 4.1). The resulting nucleotide incorporation profiles were compared to those obtained with an undamaged DNA substrate 14-mer/51CTL (Table 4.1) in order to assess the effect of the AP site on DNA polymerase activity.

Running Start Assays. Recombinant hΔPolι, hΔPolκ, and hΔRev1 were used, rather than their respective full-length proteins, in the running start assays because these fragments could be expressed in and purified from E. coli as soluble and active proteins. Based on the domain structures [160] and the published X-ray crystal structures of these three enzymes [92, 102, 139], the purified fragments contain the DNA polymerase core domains. Importantly, hΔRev1 has been shown to possess intact dCTP transferase activity [90] while hΔPolκ and hΔPolι are as active as their full-length counterparts [102, 161].

The running start assays for each of the four human Y-family DNA polymerases were performed under the same reaction conditions (see Materials and Methods). These reaction conditions did not deviate significantly from the reaction conditions for individual human Y-family DNA polymerases used in other laboratories [30, 31, 62, 89, 162], and therefore should not significantly affect the activity of each enzyme. Overall, all four human DNA polymerases are dissociative in the elongation of both the 5′-[32P]-labeled 14-mer/51AP and the 5′-[32P]-labeled 14-mer/51CTL DNA substrates based on intermediate product accumulation patterns (Figure 4.1). A general trend was observed in the nucleotide incorporation profile for each DNA polymerase, with the exception of hRev1 (see below): the elongation of the 14-mer/51AP proceeded rapidly until the enzyme encountered the AP site, where the nucleotide incorporation profile indicated two consecutive strong pause sites (Figures 4.1B, 4.1D, and 4.1F). These consecutive pause sites corresponded to the nucleotide incorporation event directly opposite the AP site and the subsequent extension event, suggesting slow turnover at these sites when compared to the corresponding assays performed with the 14-mer/51CTL substrate (Figures 4.1A, 4.1C, and 4.1E). Nonetheless, the AP site was bypassed relatively efficiently by hPolη with the full-length product observed at 10 s (Figures 4.1A and 4.1B). The subsequent downstream incorporation was perturbed more significantly than the insertion opposite the AP site, indicating that the AP site modestly affected the elongation of the lesion bypass product catalyzed by hPolη (Figure 4.1).

Comparison of Figures 4.1D and 4.1B revealed that hPolι appeared to be more perturbed by the AP site than hPolη. There was a difference in the amount of time required for hΔPolι to generate the full-length product between the control (4 min, Figure 4.1C) and damaged (7 min, Figure 4.1D) DNA substrates. Analogous assays performed with hPolκ showed that this enzyme was also affected by the AP site as revealed by the elongation patterns of the damaged (Figure 4.1F) and the undamaged (Figure 4.1E) DNA substrates. Although the nucleotide incorporation patterns of hPolη, hPolι, and hPolκ shared the same two strong pause sites in the vicinity of the AP site, hPolκ stalled for longer periods of time at these sites than hPolη and hPolι, generating the full-length product 30 min after reaction initiation (Figure 4.1F). Further analysis of the nucleotide incorporation profiles indicated that hPolη (Figure 4.1B) and hPolι (Figure 4.1D) were relatively more efficient at nucleotide incorporation opposite the AP site while hPolκ (Figure 4.1F) was more efficient at catalyzing the subsequent extension step.

Notably, hRev1 was not able to generate the full-length product with either undamaged (Figure 4.1G) or damaged DNA (Figure 4.1H), even after a 20-h incubation period at 37 °C (data not shown). Additionally, this enzyme was only able to incorporate

~ 13 nucleotides on undamaged DNA (Figure 4.1G), a weak activity that has been reported previously for hRev1 [163]. Notably, hRev1 quickly catalyzed the first nucleotide incorporation, which was opposite template base dG, and stalled significantly at the next nucleotide incorporation opposite template base dT (Figures 4.1G and 4.1H).

This observation was expected as hRev1 has been found to function as a dCTP transferase in vitro [89, 90]. The AP site did not significantly inhibit the primer elongation, although more accumulation of intermediate 19-mer was observed with the

14-mer/51AP substrate than with the 14-mer/51CTL substrate, indicating that the AP site had some effect on the activity of hΔRev1 (Figure 4.1H). Since hΔRev1 failed to generate the full-length „AP bypass products‟ (Figure 4.1H), we could not quantitatively analyze its bypass specificity using SOSA. The inability to analyze the mutagenic spectrum for hRev1 should not diminish the veracity of our results (see Discussion).

Quantitative Analysis of the AP Bypass Efficiencies. We further analyzed the results of the running start assays for hPolη, hPolι and hPolκ by quantitatively determining their bypass efficiencies (AP bypass%). At a given reaction time t, this value was calculated as the fraction of the enzyme's encounters with the AP site that resulted in a lesion bypass event. An AP “encounter” was defined as an event where the AP lesion was located in the enzyme active site, regardless of whether or not a nucleotide was incorporated. The total AP bypass events (B) were calculated from the concentration of all intermediate products with sizes greater than or equal to the 22-mer in Figure 4.1. Therefore, at the reaction time t, the total AP “encounter” events (E) equaled the sum of the 21-mer concentration and the total AP bypass events (B). Finally, we defined the AP bypass% at reaction time t as the ratio of the bypass events to the encounter events:

AP bypass% = (B/E) x 100% = {B/([21-mer] + B)} x 100%
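The definition above can be sketched in a few lines of Python. The band concentrations below are hypothetical values for a single time point, chosen only to illustrate the arithmetic; the function name and the dictionary layout (primer length in nt mapped to concentration in nM) are likewise assumptions.

```python
def ap_bypass_percent(products_nM, encounter_len=21):
    """AP bypass% = B / ([21-mer] + B) x 100 for one reaction time point.

    products_nM maps product length (nt) to its concentration (nM).
    B sums every product at least one nucleotide longer than the 21-mer.
    """
    bypass = sum(conc for length, conc in products_nM.items() if length > encounter_len)
    encounters = products_nM.get(encounter_len, 0.0) + bypass
    return 100.0 * bypass / encounters if encounters else 0.0


# Hypothetical band concentrations (nM) quantified from one gel lane
lane = {14: 40.0, 20: 10.0, 21: 30.0, 22: 12.0, 23: 5.0, 51: 3.0}
print(ap_bypass_percent(lane))
```

Products shorter than the 21-mer have not yet reached the lesion, so they contribute to neither B nor E.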

Figure 4.2 shows the AP bypass% plotted as a function of the reaction time for hPolη, hPolι and hPolκ. This figure demonstrates that hPolη required the shortest time to bypass the AP site while hPolκ required the longest time to traverse the lesion. To quantitatively define the AP bypass efficiency, we defined t50^bypass as the time required to bypass 50% of the total AP lesions encountered. The t50^bypass values, estimated from Figure 4.2, were 4.6, 112 and 1,823 s for hPolη, hPolι and hPolκ, respectively (Table 4.2). Since hRev1 bypassed the AP lesion only after 184, 360, and 1,200 min (Figure 4.1H) with bypass% values of 14, 20 and 28%, respectively, we did not plot the time-dependent AP bypass% for this enzyme. However, the t50^bypass value of hRev1 was expected to be larger than 1,200 min and was estimated to be approximately 2,150 min based on these three time points. Based on the t50^bypass values, hPolη possessed the highest AP bypass efficiency and bypassed an AP site 24-, 396- and > 28,043-fold faster than hΔPolι, hΔPolκ and hRev1, respectively. The specific activity of the human enzymes to bypass an AP site can be described as the amount of AP site bypassed (nM) per second, and was calculated to be 1.1 x 10^1, 4.5 x 10^-1, 2.7 x 10^-2 and 3.9 x 10^-4 nM s^-1 for hPolη, hPolι, hPolκ and hRev1, respectively.
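The fold differences quoted above follow from ratios of the t50^bypass values. The sketch below recomputes them, and also recomputes the specific activities under the assumption that each equals 50 nM (half of the 100 nM DNA substrate) divided by t50^bypass. That relationship is an inference that happens to reproduce the quoted numbers, not a formula stated in the text, and the ASCII enzyme labels are used only for convenience.

```python
# t50^bypass values in seconds; the hRev1 estimate of 2,150 min is converted to s
t50_bypass_s = {
    "hPol eta": 4.6,
    "hPol iota": 112.0,
    "hPol kappa": 1823.0,
    "hRev1": 2150.0 * 60.0,
}

reference = t50_bypass_s["hPol eta"]
fold_slower = {name: t / reference for name, t in t50_bypass_s.items()}

# Assumption: specific activity = 50 nM of AP site bypassed per t50^bypass
HALF_DNA_NM = 50.0
specific_activity = {name: HALF_DNA_NM / t for name, t in t50_bypass_s.items()}

for name in t50_bypass_s:
    print(f"{name}: {fold_slower[name]:.0f}-fold, {specific_activity[name]:.1e} nM/s")
```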

For comparison, the time (t50) for each enzyme to create 50% of products that extended past a template base dT in the control template 63CTL, which is at the corresponding position of the AP site in the template 51AP, was estimated based on the running start analysis in Figure 4.1 and is listed in Table 4.2. The t50^bypass/t50 ratios were calculated to evaluate the inhibitory effect of the AP site on DNA synthesis catalyzed by the Y-family DNA polymerases (Table 4.2). The t50^bypass/t50 ratios indicate that the AP site slowed down hPolκ the most (87-fold) while it had almost no effect on hRev1. The inhibitory effects of an abasic site on hPolη (4.6-fold) and hPolι (1.5-fold) were small but notable.

Quantitative Analysis of the Mutation Spectra of AP Bypass. We performed SOSA to determine the precise sequences of AP lesion bypass products synthesized by each human Y-family DNA polymerase. Although the t50^bypass values for each enzyme varied, our assays allowed adequate time for each enzyme to generate the full-length AP bypass products. Compared to our previously published SOSA [58], we modified the assay (Scheme 4.1) by increasing the size of the sequencing window from 8 to 22 nucleotides in order to increase the number of incorporation events examined both upstream (7 nucleotides) and downstream (15 nucleotides) from the AP site in addition to increasing the reliability of the statistical analysis.

hPolη. To perform SOSA with hPolη and 14-mer/51AP, we sequenced 45 colonies, and the DNA sequences of the 45 AP lesion bypass products are summarized in Figure 4.3A. Our results showed that hPolη incorporated either dATP (30/45 colonies, 66.7%), dGTP (9/45 colonies, 20%), or dTTP (2/45 colonies, 4.4%) opposite the AP lesion (Figure 4.4A). The preference for dATP incorporation opposite the AP site was likely due to the well-established “A-rule” [164]. The remaining events opposite the AP site (4/45 colonies, 8.9%) were deletion mutations (Figure 4.4A). These dNTP incorporation frequencies were comparable to the error% measured by M13-based reverse mutation assays [46]. In addition, hPolη generated a significant number of substitutions, deletions, and mixed mutations at positions located both upstream and downstream from the AP site.
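The percentages reported for events opposite the AP site are simple colony tallies. As a minimal sketch, the snippet below rebuilds them from the counts given above (30 dATP, 9 dGTP, 2 dTTP and 4 deletions among 45 colonies); the function name and the event labels are hypothetical.

```python
from collections import Counter


def incorporation_percentages(events):
    """Percentage of sequenced colonies showing each event at one template position."""
    counts = Counter(events)
    total = sum(counts.values())
    return {event: round(100.0 * n / total, 1) for event, n in counts.items()}


# Events opposite the AP site among the 45 sequenced bypass products
events = ["dATP"] * 30 + ["dGTP"] * 9 + ["dTTP"] * 2 + ["deletion"] * 4
print(incorporation_percentages(events))
```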

To quantitatively compare mutation frequencies at the AP site and other template positions, we plotted relative error% as a function of template positions (Figure 4.5A). Each value of relative error% was calculated as the total number of mutations (insertions, deletions, and substitutions) divided by the total number of dNTP incorporations at a specific template position. To compare the contributions of different types of mutations to the value of relative error%, we further calculated the relative base insertion%, substitution%, and deletion% at each template position. For calculations at the AP site, since there was no correct dNTP incorporation, relative base substitution% and insertion% were not calculated. In order to compare relative deletion% at the AP site and other template positions, we only presented relative deletion% at the AP site (Figure 4.5A). Interestingly, Figure 4.5A indicates that dNTP incorporation was most significantly affected at Position 1 (one nucleotide downstream from the AP site) as evidenced by the highest relative error% (~22.2%) in the sequencing window. Position 1 was also the strongest pause site in Figure 4.1B. Surprisingly, the most error-free incorporation occurred at the template base position preceding the AP site (Position -1, Figure 4.5A). To further examine the sequence-dependence of the relative error%, we plotted the relative error% versus the positions of template bases dTs (Positions -6, -5, -1, 0, 6, 10 and 11) (Figure 4.6A). Clearly, the relative base deletion% increased after hPolη encountered the AP lesion. In contrast, the relative base substitution% was higher at positions further away from the AP site and dropped to 0% at Position -1. Opposite template bases dGs (Positions -7, -4, -2, 1, 4, 8, 9, 14 and 15), the opposite trends of relative base deletion% and substitution% were observed with hPolη (Figure 4.7A). While the relative base substitution% trend for template bases dAs (Positions -3, 3, 7 and 12) was similar to the one observed in Figure 4.6A, the relative base deletion% increased only in the vicinity of the AP site (Figure 4.8A). Notably, there were no upstream template bases dCs on 51AP. Based on these template base-dependent analyses, it is apparent that there is no clear pattern for relative error% during translesion DNA synthesis catalyzed by hPolη.
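The relative error% and its breakdown by mutation type can be illustrated with a short calculation. The tallies below are hypothetical (chosen so that 10 of 45 events are mutations, about 22.2%), and the sketch assumes that every sequenced product contributes one event per template position; neither the counts nor the function name come from the original data.

```python
def relative_error_breakdown(counts):
    """Relative error% and per-type percentages at one template position.

    counts holds tallies for 'correct', 'substitution', 'deletion' and 'insertion'.
    """
    total = sum(counts.values())
    mutations = total - counts["correct"]
    result = {"relative error%": 100.0 * mutations / total}
    for kind in ("substitution", "deletion", "insertion"):
        result[f"relative {kind}%"] = 100.0 * counts[kind] / total
    return result


# Hypothetical tallies at one template position among 45 sequenced products
position_counts = {"correct": 35, "substitution": 2, "deletion": 7, "insertion": 1}
summary = relative_error_breakdown(position_counts)
print({key: round(value, 1) for key, value in summary.items()})
```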

For further evaluation of the effect of the AP site on upstream and downstream nucleotide incorporations catalyzed by hPolη, we performed SOSA using a control DNA substrate 14-mer/63CTL which has a template base dT at the corresponding position of the AP site in 14-mer/51AP (Table 4.1). The control template 63CTL is 12 nucleotides longer than 51AP in order to facilitate PAGE separation of the full-length products from the control template which cannot be cleaved by the human AP endonuclease (Scheme 4.1). Based on the sequences gathered (Figure 4.3B), hPolη incorporated mostly correct dATP (85.4%) and occasionally misincorporated dGTP (14.6%) opposite a template base dT at Position 0 (Figure 4.4B). However, the mutations made around Position 0 changed from mostly multi-base deletions and single-base substitutions of dT and dA in the presence of the AP site to mostly single-base substitutions of dG and dT in the absence of the AP site (Figures 4.3A and 4.3B). As shown in Figure 4.5B, hPolη created mutations along template 63CTL with an average relative error% of ~6.7% at each template position. This value is similar to the average error% in Figure 4.5A. However, the only observed mutations that occurred with template 63CTL were base substitutions (Figures 4.5B and 4.3B). The relative base substitution error% with template bases dTs (6% to 21%) was sequence-dependent (Figure 4.6A). Similar trends were also observed with template dAs and dGs (Figures 4.7A and 4.8A). Overall, the base substitution frequency of hPolη was calculated to be 7.5 x 10^-2 with 14-mer/63CTL (Table 4.3). This value was comparable to the error rate of 7.1 x 10^-2 for combined dT:dG and dA:dC mutations measured by the M13-based forward mutation assays [165] and to 5.6 x 10^-2 measured by steady-state kinetic studies [166]. Interestingly, differences in the mutagenic data in Figures 4.5A and 4.5B suggest that the base deletions and insertions observed with template 51AP were most likely caused by the presence of the AP site. Both the 2.2% deletion error frequency at Position -7 and the 8.9% insertion error frequency at Position -6 (Figure 4.5A) may also suggest that hPolη could detect the presence of the AP site before the lesion entered its active site.

hPolι. To quantitatively analyze the mutagenic profiles of AP bypass catalyzed by hPolη, we sequenced 45 colonies and the SOSA results are shown in Figure 4.3C.

Interestingly, hPolη lacked strong preference when incorporating a dNTP directly opposite the AP site as indicated by the relative nucleotide incorporation percentages for dATP (8/45 colonies, 17.8%), dGTP (11/45 colonies, 24.4%), dCTP (1/45 colonies,

2.2%), and dTTP (17/45 colonies, 37.8%) in addition to the deletion events (8/45 colonies, 17.8%) (Figure 4.4A). Thus, hPolη did not follow the “A-rule” at the AP site.

These SOSA results were similar to other published kinetic studies [17, 85] with only one difference: they found that dGTP incorporation opposite an AP site is slightly more efficient than dTTP. In addition, our SOSA assay also identified base deletions opposite the AP site (Figure 4.4A). At the positions both upstream and downstream of the AP lesion, hPolη generated a plethora of base substitution, deletion, and insertion mutations that varied from 10% to 80% of the total flux opposite each template position in the sequencing window (Figure 4.5C). There were more mutagenic events opposite template bases dTs (Figure 4.6B) than opposite template bases dAs (Figure 4.8B). As reported previously [30, 167], we also observed the unusual ability of hPolη to efficiently incorporate dGTP opposite dT at Position -6 where 58% (26/45 colonies) of the total 108 dNTP incorporation events were dGTP substitution mutations while only 40% involved correct dATP incorporation (Figure 4.5C). Perhaps most intriguingly, as hΔPolη approached the AP site, there was a sequence-dependent increase in the number of deletion mutations, and a corresponding sequence-dependent decrease in deletion mutations as the polymerase proceeded downstream from the AP site with a maximum deletion mutation% (26/45 colonies, 58%) observed at Position 1 (Figures 4.5C, 4.6B).

This trend was inversely correlated to the number of substitution mutations generated by hΔPolη. Additionally, Positions 0 and 1 were also the strongest pause sites in Figure 4.1D.

Notably, the results for hΔPolη indicated an extraordinarily high probability of generating frameshift mutations during AP bypass with a probability of 13% (6/45 colonies) to 58%

(26/45 colonies) of the total dNTP incorporation events catalyzed within two template bases of the AP site (Figure 4.5C).

The effects of the AP site on DNA synthesis catalyzed by hPolι were more evident when comparing the results in Figure 4.3C with the results for the control template 63CTL (Figure 4.3D). While mostly multi-base deletions were observed in the immediate area of the AP site, hPolι created mostly multi-base substitutions with dGTP being the most frequently incorporated incorrect nucleotide on control template 63CTL. Opposite template bases dTs, the inversely correlated, sequence-dependent variation observed with template 51AP was not observed with template 63CTL (Figure 4.6B). The relative error% for base deletions was lower than that for base substitutions along template 63CTL and displayed no sequence bias. Similar trends were also observed upon comparison of incorporation events opposite template bases dAs and dGs (Figures 4.7B and 4.8B), although the overall error rate of hΔPolι opposite dA was relatively low. While relative error frequencies at specific positions on template 63CTL were as high as 76% (Figure 4.5D), hPolι synthesized one completely error-free product out of 44 sequenced products (Figure 4.3D). Among the 44 dNTP incorporation events opposite template base dT at Position 0, hPolι made 2 base deletions and incorporated dATP, dGTP, dTTP, and dCTP 23, 19, 0, and 0 times, respectively (Figure 4.4B). Again, the data demonstrated that hPolι formed the base pairs of dG:dT and dA:dT with similar efficiency [30, 167]. Overall, the base insertion, substitution, and deletion frequencies with undamaged template 63CTL were calculated to be 1.2 x 10^-2, 1.9 x 10^-1, and 5.1 x 10^-2, respectively (Table 4.3). The base substitution error rate is similar to 1.0 x 10^-1 measured by steady-state kinetic studies [168]. When comparing the relative error% along the templates 63CTL (Figure 4.5D) and 51AP (Figure 4.5C), the relative error% for hPolι in the presence of an AP site increased at almost every template position.

hPolκ. We sequenced 55 AP lesion bypass products formed by hPolθ and the SOSA results are shown in Figure 4.3E. Opposite the AP lesion (Figure 4.4A), 60% (33/55 colonies) of the nucleotide incorporation events were dATP incorporations, 8.5% (3/55 colonies) were dTTP incorporations, 34.5% (19/55 colonies) were no incorporations and

0% (0/55 colonies) were dCTP and dGTP incorporations, suggesting that hPolθ, like hPolε, followed the “A-rule” to select an incoming dNTP opposite the AP site. This conclusion is consistent with previously published results derived from single dNTP incorporation assays [99]. The strong pause sites observed in Figure 4.1F were the same 110 positions that created the most deletion mutations shown in Figure 4.5E. Notably, there were a significantly higher number of substitution and frameshift mutations at template positions downstream from the AP lesion (Figure 4.5E), which was not observed with hPolε (Figure 4.5A) and hPolη (Figure 4.5C). When following the dTs along template

51AP, the inversely correlated relative base substitution% and deletion% observed with hPolη was also observed with hPolθ (Figure 4.6C). Opposite template bases dGs and dAs, hPolη followed similar patterns of relative base substitution% and deletion%

(Figure 4.7C and 3C).
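The relative frequencies opposite the AP lesion follow directly from the colony counts quoted above. As a minimal sketch (the function name and dictionary layout are illustrative assumptions, not part of the SOSA protocol), the tallies can be converted to percentages as follows:

```python
def relative_frequencies(counts):
    """Convert event counts at one template position into percentages."""
    total = sum(counts.values())
    return {event: 100.0 * n / total for event, n in counts.items()}

# Colony counts opposite the AP lesion, as quoted in the text (55 products in total).
counts = {
    "dATP": 33,
    "dTTP": 3,
    "dCTP": 0,
    "dGTP": 0,
    "no incorporation": 19,
}

freqs = relative_frequencies(counts)
for event, pct in freqs.items():
    print(f"{event}: {pct:.1f}%")
```

Because every sequenced product contributes exactly one event at Position 0, the percentages sum to 100%.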

Our SOSA data with the undamaged control template 63CTL (Table 4.1; Figure 4.3F) indicated that hPolκ made significantly fewer errors with the control template (Figure 4.5F) than with damaged 14-mer/51AP (Figure 4.5E). Although hPolκ created multi-base deletions in the immediate vicinity of the AP site, mostly single-base substitutions were observed when the AP site was replaced with base dT (Figure 4.3F). For the 61 sequenced full-length products in Figure 4.3F, the base insertion, substitution, and deletion error rates were calculated to be 0, 1.9 × 10⁻², and 3.9 × 10⁻³, respectively (Table 4.3), and the base substitution error rate was similar to the value of 1.4 × 10⁻² measured by steady-state kinetic studies [99]. Among the 61 dNTP incorporation events opposite dT at Position 0, hPolκ only misincorporated one dGTP and one dTTP and did not make any other mutations (Figure 4.4B). Opposite template bases dTs in 63CTL, there were no base deletions and few base substitutions (Figure 4.6C). Similar trends were also observed when hPolκ incorporated dNTPs opposite template bases dGs and dAs (Figures 4.7C and 4.8C). When comparing Figures 4.5E and 4.5F, it was clear that the relative error% at each template position increased significantly in the presence of an AP site.
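To make the word "similar" concrete, the two substitution error rates measured by SOSA on the control template can be set against the steady-state values they are compared with in the text. The sketch below is only illustrative arithmetic on the four quoted numbers; the helper name is an assumption.

```python
def fold_difference(rate_a, rate_b):
    """Return how many fold the larger rate exceeds the smaller one."""
    high, low = max(rate_a, rate_b), min(rate_a, rate_b)
    return high / low

# (SOSA substitution error rate, steady-state value), both quoted in the text.
comparisons = {
    "1.9e-1 (SOSA) vs 1.0e-1 (steady-state, ref. 168)": (1.9e-1, 1.0e-1),
    "1.9e-2 (SOSA) vs 1.4e-2 (steady-state, ref. 99)": (1.9e-2, 1.4e-2),
}

for label, (sosa, steady) in comparisons.items():
    print(f"{label}: {fold_difference(sosa, steady):.2f}-fold")
```

Both pairs differ by less than 2-fold, which is the sense in which the SOSA and steady-state estimates agree.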

All Y-family human enzymes. Within the distinct characteristic fidelity of each of the three human Y-family enzymes, there were mutational hotspots observed for both templates 51AP and 63CTL. All three DNA polymerases created a substantial number of mutations at Positions 0 and 1 in the presence of the AP site, with deletion mutations having the highest frequency (Figure 4.5). Furthermore, these positions created the strongest pause sites during running start analysis (Figure 4.1). There were also common distal positions downstream from the AP site (Positions 10 and 11) where these enzymes created mostly base substitutions. At Positions 10 and 11, the most frequent mutation was a base substitution to either dT or dG for all enzymes (Figure 4.3). Similarly, both hPolη and hPolι displayed high frequencies of mutations at Positions -6 and -5 on both templates 51AP and 63CTL. At these positions, where slight enzyme pauses occurred during running start analysis (Figure 4.1), these two enzymes created mostly a base substitution to dG (Figure 4.3). Thus, the slight pause sites were directly correlated to mutational hotspots observed by SOSA. Notably, while downstream nucleotide incorporation events were not significantly affected, the observation of increased deletion and insertion mutations upstream from the AP site may be suggestive of a mechanism by which hPolη was affected by the AP site several nucleotides prior to encountering the lesion in its active site (Figure 4.5A). This upstream AP effect was also observed with both hPolι and hPolκ (Figure 4.5). In addition, the AP lesion affected downstream nucleotide incorporation catalyzed by hPolι and hPolκ (Figure 4.5).
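Mutational hotspots of this kind are found by comparing each sequenced product with the expected full-length sequence and tallying mismatches per position. The following is a minimal sketch under stated assumptions: it handles only products of the same length as the reference (substitutions, not insertions or deletions), it reports plain string indices rather than the Position numbering of Figure 4.5, and the function names are hypothetical. The reference and the three products are copied from the sequence listing of Figure 4.3F.

```python
from collections import Counter

REFERENCE = "CGCAGCCGTCCAACCAACTCAACGTCGATCCAATGCCGTCCTGAGTCCAGG"

# (product sequence, number of colonies) taken from the Figure 4.3F listing.
PRODUCTS = [
    ("CGCAGCCGTCCAACCAACTCAACGTCGATCCAATGCGGTCCTGAGTCCAGG", 2),
    ("CGCAGCCGTCCAACCAACTCAACGTCGCTCCAATGCCGTCCTGAGTCCAGG", 6),
    ("CGCAGCCGTCCAACCAACTCAACGTCGATCCAATGCAGTCCTGAGTCCAGG", 1),
]

def substitution_indices(reference, product):
    """Return the 0-based string indices at which product differs from reference."""
    if len(product) != len(reference):
        raise ValueError("only equal-length products are handled in this sketch")
    return [i for i, (r, p) in enumerate(zip(reference, product)) if r != p]

def hotspot_counts(reference, products):
    """Sum colony counts for every index that carries a substitution."""
    counts = Counter()
    for sequence, colonies in products:
        for index in substitution_indices(reference, sequence):
            counts[index] += colonies
    return counts

hotspots = hotspot_counts(REFERENCE, PRODUCTS)
for index, colonies in sorted(hotspots.items()):
    print(f"index {index}: {colonies} colonies with a substitution")
```

Dividing each tally by the total number of sequenced products gives a per-position substitution percentage in the spirit of the histograms in Figure 4.5; frameshift mutations would require a proper sequence alignment, which this sketch deliberately omits.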

4.4 Discussion

A mammalian genome is estimated to undergo about 100,000 modifications per day [74]. It is obvious that the four human Y-family DNA polymerases have to bypass different types of DNA lesions efficiently in order to rescue and facilitate DNA replication. While S. solfataricus Dpo4 has served as an excellent model for characterizing the Y-family, it has become apparent that DNA polymerases even within the same family possess structural and functional attributes that render them fundamentally different from one another [61]. Thus, it is extremely likely that the Y-family DNA polymerases, especially those from the same organism, possess distinct lesion bypass specificities in vivo. We hypothesized that one Y-family DNA polymerase may play a dominant role in the insertion or extension bypass of a specific lesion while other DNA polymerases may play a secondary role. In order to maintain genetic stability, we further hypothesized that the in vivo bypass of a widespread lesion such as the AP site requires a relatively efficient and faithful Y-family DNA polymerase that causes few frameshift mutations. In this paper, we quantitatively analyzed all four human Y-family enzymes for their abilities to perform AP bypass. The resulting analyses are discussed herein.

Human Rev1 Is Not the DNA Polymerase to Bypass AP Lesions in Vivo. For the following reasons, Rev1 does not appear to play an important DNA polymerase role in AP lesion bypass in vivo. First, previous studies have shown hRev1 to function as a dCTP transferase that uses a novel mechanism involving the side chain of Arg357 in the N-digit domain that specifically interacts with the base of dCTP through hydrogen bonds to direct dCTP incorporation [92]. This property is supported by our recently published kinetic studies on hΔRev1, which preferentially incorporated dCTP opposite all four possible template bases [169]. This low fidelity of hRev1 could severely compromise genomic stability. Second, among the four human Y-family DNA polymerases, hRev1 possesses the lowest AP bypass efficiency (see Results) and may not be able to bypass the numerous AP sites generated per cell per day in a timely manner [103, 104]. Third, due to the ability of Rev1 to specifically interact with DNA polymerases η, κ, and ι, but not with other DNA polymerases such as β or µ [93-95], coupled with the observation that Rev1 will switch its interaction partner based on the relative concentrations of the competing Y-family DNA polymerases [93], it has recently been hypothesized that Rev1 may have an important role in mediating the DNA polymerase switching process by serving as a protein scaffold. Finally, additional evidence for the scaffold role of Rev1 was provided by demonstrating that Rev1 is required for DNA-damage induced mutagenesis in yeast, but its dCTP transferase activity is not required for AP site bypass in vivo [18].

hPolη Possesses the Highest AP Bypass Efficiency in Vitro. Using a running start analysis similar to that performed previously for Dpo4 [59], our data demonstrated that hPolη, hPolι, and hPolκ were able to bypass a site-specifically placed AP site and subsequently generate the full-length bypass product. Based on the t50 values for hPolη, hPolι, and hPolκ estimated in Figure 4.2, hPolη possessed the highest AP bypass efficiency and bypassed an AP site 24- and 396-fold faster than hΔPolι and hΔPolκ, respectively. Relative to the control DNA substrate, the presence of an AP site had only a slight inhibitory effect on the polymerase activity of hPolη and hPolι but significantly slowed the hΔPolκ-catalyzed DNA synthesis. Moreover, further analysis of the nucleotide incorporation profile indicated that hPolη and hPolι were relatively more efficient at nucleotide incorporation opposite the AP site while hPolκ was more efficient at catalyzing the subsequent extension step. Such observations agree with previous reports that Polκ has a unique ability to extend mispaired and aberrant primer termini [100, 170]. This superior extension ability can be explained by the presence of a unique N-terminal extension of the palm domain, which helps stabilize the Polκ•DNA complex at the primer-terminus junction [102].
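The fold differences quoted above are ratios of t50 values read from the AP bypass% time courses. As a rough sketch, assuming that t50 denotes the time at which AP bypass% reaches 50% and that linear interpolation between neighboring time points is adequate, the calculation could look like the following. The time courses below are hypothetical placeholders, not the data of Figure 4.2, and the function name is an assumption.

```python
def time_to_half_bypass(times_s, bypass_percent, target=50.0):
    """Linearly interpolate the time at which bypass% first reaches the target."""
    points = list(zip(times_s, bypass_percent))
    for (t0, y0), (t1, y1) in zip(points, points[1:]):
        if y0 <= target <= y1 and y1 > y0:
            return t0 + (target - y0) * (t1 - t0) / (y1 - y0)
    raise ValueError("bypass% never reaches the target in this time course")

# Hypothetical time courses (seconds, AP bypass%) for a fast and a slow enzyme.
fast_t50 = time_to_half_bypass([0, 10, 20, 40], [0, 30, 60, 80])
slow_t50 = time_to_half_bypass([0, 100, 200, 400], [0, 20, 40, 60])

# A smaller t50 means faster bypass, so the fold difference is slow / fast.
fold_faster = slow_t50 / fast_t50
print(f"fast t50 = {fast_t50:.1f} s, slow t50 = {slow_t50:.1f} s, "
      f"fold difference = {fold_faster:.1f}")
```

A fitted kinetic model would give a more robust t50 than interpolation when the time points are sparse; the interpolation here only illustrates how a fold difference follows from two t50 estimates.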

hPolη Has a Lower Mutagenic Potential than hPolι and hPolκ During AP Bypass.

Analysis of the mutation spectra of AP bypass observed via SOSA revealed several general trends: i) opposite the AP site, all three enzymes made frameshift mutations with the following order of deletion frequency: hPolη < hPolι < hPolκ (Figure 4.4A); ii) at template positions other than Position 0 (Table 4.3), all three human Y-family enzymes had higher deletion and insertion frequencies with the damaged template 51AP than with the control template 63CTL (Figure 4.5); iii) the base deletion and insertion frequencies of each of the three human Y-family enzymes increased significantly after the enzyme encountered the AP site (Table 4.3); iv) with both templates 51AP and 63CTL, hPolι made more insertion, deletion, and substitution mutations than hPolη and hPolκ (Table 4.3); and v) hPolη was the least likely human Y-family polymerase to generate frameshift mutations in the vicinity of the AP site.

Sensing of a lesion by DNA polymerases. Figure 4.4A shows that hPolη preferentially incorporated dATP opposite the AP site, a preference that has been observed previously in some mammalian cell lines [61, 93, 94]. Additionally, while downstream nucleotide incorporation events were not significantly affected, the observation of increased deletion and insertion mutations upstream from the AP site (Figure 4.5A) may be suggestive of a mechanism by which hPolη is affected by the AP site several bases prior to encountering the lesion within its active site. This observation was also made for AP translesion DNA synthesis catalyzed by both hPolι (Figure 4.5C) and hPolκ (Figure 4.5E). This ability to sense DNA lesions has been reported previously for Pyrococcus furiosus (Pfu) DNA polymerase, which has been shown to sense uracil in a DNA template at least four bases upstream from the lesion [171]. In addition, S. solfataricus DNA polymerase B1 has been shown to be affected by uracil and hypoxanthine four bases upstream of these lesions, although the inhibitory effect is not as substantial as that reported for Pfu DNA polymerase [39, 171]. How a polymerase senses a DNA lesion is unclear. Based on crystal structures of eukaryotic Y-family DNA polymerases [81, 92, 172, 173], the 5′ single-stranded portion of the DNA template from the primer and template junction is threaded through the interface between the Finger domain and Polymerase-associated domain (PAD), or Little Finger domain, and interacts with amino acid residues in the interface. The presence of a damaged template base may affect these interactions, allowing the polymerase to sense the upstream lesion and causing it to make more mutations.

Potential deletion mechanism of hPolι. The results for hΔPolι indicated an extraordinarily high probability of generating frameshift mutations during AP bypass (Figure 4.5C). Interestingly, a recent crystal structure of hPolι with an AP-containing DNA substrate indicates that the protrusion of the AP site would be sterically forbidden in hPolι by the presence of Lys60 and Glu97 of the finger domain and Ser307 of the PAD [174]. Thus, it is possible that hΔPolι utilizes a different AP site bypass mechanism, such as a primer misalignment, to generate the large number of frameshift mutations observed within the sequencing window. In addition, the inactivation of hPolη, as in XPV cell lines, is known to cause a phenotype of hypermutability after exposure to DNA-damaging conditions [69]. The hypermutability that was observed for hPolι (Figure 4.5C), which was the second most efficient enzyme to bypass an AP site in vitro, suggests that hPolι may substitute for the defective hPolη during the bypass of AP sites in the XPV cell lines [175].

Impact of an AP lesion on hΔPolκ. hΔPolκ, which has been shown to generate -1 deletions even on undamaged DNA [33, 105], was intermediate to hPolη and hΔPolι, generating a moderate number of base deletion mutations immediately downstream from the AP site in a manner similar to that reported for its Y-family homolog Dpo4 [170]. Furthermore, the embedded AP site affected the accuracy of dNTP incorporation catalyzed by hΔPolκ for at least 15 bases downstream from the AP site (Figure 4.5E), which was similar to the pre-steady state kinetic data reported for Dpo4 [59]. Together, these data indicate that the mutagenic effects of an AP site on DNA synthesis catalyzed by Polκ and Dpo4 are similar.

Potential role of the little finger domain on a Y-family enzyme’s lesion bypass abilities.

The abovementioned differences in the AP bypass efficiency and mutagenic frequency among human Y-family DNA polymerases are a fundamental function of the ability of the enzymes to incorporate a dNTP directly opposite the AP site. This ability has been shown to be influenced by the PAD or little finger domain, which can modulate the activity of the enzyme by modifying the relative position of the DNA within the polymerase active site [50, 107]. As a proof of principle, the bypass specificity has been shown to be significantly influenced by the little finger domain in a study that swapped this domain in two Y-family polymerases and observed a switch of the bypass specificities [176]. Thus, our data collectively demonstrate the functional differences between DNA polymerases from the same phylogenetic family.

In summary, our study serves as an initial attempt to understand the precise mutagenic profile of each of the four human Y-family polymerases in response to an AP site. The data presented suggest that the most likely Y-family polymerase to bypass an AP site in vivo is hPolη, due to (i) the extremely high frameshift potential of hΔPolι, (ii) the higher potential of hΔPolκ to create frameshift mutations as opposed to directly incorporating dNTP at the AP site, (iii) the weak polymerase activity observed opposite the AP site by hΔRev1 coupled with its reported preference to function as a scaffolding protein, and (iv) the low frameshift potential and extremely high AP bypass efficiency of hPolη. Future studies with a variety of different DNA lesions are forthcoming and should provide useful information regarding lesion bypass specificity.


4.5 Figures


Figure 4.1. Running start assays for human Y-family DNA polymerases.

(A) and (B) hPolη; (C) and (D) hΔPolι; (E) and (F) hΔPolκ; (G) and (H) hΔRev1. A preincubated solution of a Y-family DNA polymerase (100 nM) and 5′-[³²P]-labeled DNA (100 nM) was mixed with all four dNTPs (200 µM each) for various reaction times before being quenched with 0.37 M EDTA. Reactions using the 14-mer/51CTL substrate are in (A), (C), (E), and (G) while reactions using the 14-mer/51AP substrate are in (B), (D), (F), and (H).

[Figure 4.2 plot: AP Bypass% versus Reaction Time (s).]

Figure 4.2. Time-dependent AP bypass% during running start assays.

The AP bypass% was plotted as a function of reaction time for hPolη (open square), hΔPolι (closed circle) and hΔPolκ (open circle).

Adenine Incorporation CGCAGCCGTCCAACCAACTCAACGTCGATCCAATGCCGTCCTGAGTCCAGG (6/45) CGCAGCCGTCCAACCAACTCAACGTCGATCCAATGCTGTCCTGAGTCCAGG (1/45) CGCAGCCGTCCAACCAACTCAACGTTGAACCAATGCAGTCCTGAGTCCAGG (1/45) CGCAGCCGTCCAACCAACTCAACGGTCGATCCAAGGCCGTCCTGAGTCCAGG (1/45)* CGCAGCCGTCCAACCAACTCAACGTCGCTCCAATGCCGTCCTGAGTCCAGG (1/45) CGCAGCCGTCCAACCAACTCAACGTCAATGACGTGCCGTCCTGAGTCCAGG (1/45) CGCAGCCGTCCAACCCACTTAACGTCGATCCAATGCCGTCCTGAGTCCAGG (1/45) CGCAGCCGTCCAACCAACTCAAAGTCGATGCAATGCCGTCCTGAGTCCAGG (1/45) CGCAGCCGTCCAACCAACTCAACGTCGATCCAATGCCGTCCTGAGTCCAGG (1/45) CGCAGCCGTCCAACCAGCTCAACGTCGATCCACTGTCGTCCTGAGTCCAGG (1/45) CGCAGCCGTCCAACCAACTTAACGTCGATCCAATGCCGTCCTGAGTCCAGG (2/45) CGCAGCCGTCCAACCAACTCAACGTCGATCCAATGCCGTCCTGAGTCCAGG (1/45)

A (continued)

Figure 4.3. Mutation spectra of DNA synthesis catalyzed by hPolη (A and B), hΔPolι (C and D) and hΔPolκ (E and F).

Results from SOSA are shown separately based on specific dNTP incorporations with the damaged template 51AP (A, C, and E) or the control template 63CTL (B, D, and F). Sequences corresponding to primers used for PCR amplification are shown in lowercase font while sequenced dNTPs are in uppercase font and underlined. Individual base substitutions (blue), deletions (red), and insertions (green) are written below the full-length 'product' while complex mutations are color-coded based on the specific mutation in order of occurrence. The boldfaced letter shaded in yellow corresponds to incorporation opposite the AP site (A, C, and E) or the corresponding template base dT (B, D, and F). Relative frequencies are shown at right in parentheses. * denotes that the smaller-than-standard font size of mutations started under the first base and included the mutations in the smaller-than-standard font size.

Figure 4.3 continued Adenine Incorporation CGCAGCCGTCCAACCAACTCAACGTCGATCCAATGCCGTCCTGAGTCCAGG (cont‟d) CGCAGCCGTCCAACCGACTCAAAGTCGATCCAATGCCGTCCTGAGTCCAGG (1/45) CGCAGCCGTCCAACCAACTCAACGTCGATCCGGTGCCGTCCTGAGTCCAGG (1/45) CGCAGCCGTCCAACCAACTTAACGTCGATCCCATGCCGTCCTGAGTCCAGG (1/45) CGCAGCCGTCCAACCAACTCAACGTCGATCCAACGCCGTCCTGAGTCCAGG (1/45) CGCAGCCGTCCAACCAACTCAACGTCGATCCACTGCCGTCCTGAGTCCAGG (1/45) CGCAGCCGTCCAACCAGCTCAACGTCGTAACAATGCCGTCCTGAGTCCAGG (1/45) CGCAGCCGTCCAACCAGCTCAACGTCGATCCAATGCCGTCCTGAGTCCAGG (1/45) CGCAGCCGTCCAACCAACTCAACGTCGATCCAATGTCGTCCTGAGTCCAGG (1/45) CGCAGCCGTCCAACCAACTCAAAGTCGATCCAATGCCGTCCTGAGTCCAGG (1/45) CGCAGCCGTCCAACCAACCAAATGTCGACCCAGCGCCGTCCTGAGTCCAGG (1/45) CGCAGCCGTCCAACCGACTCAACGTCGATCCAATGCCGTCCTGAGTCCAGG (2/45)

Guanine Incorporation CGCAGCCGTCCAACCAACTCAGCGTCGATCCAATGCCGTCCTGAGTCCAGG (2/45) CGCAGCCGTCCAACTAACTCAGCGTCGATCCAATGCCGTCCTGAGTCCAGG (1/45) CGCAGCCGTCCAACCAACTCAGCATCGGTCCAATGCCGTCCTGAGTCCAGG (1/45) CGCAGCCGTCCAACCAACTTAGCGTCGATCCAATTCCGTCCTGAGTCCAGG (1/45) CGCAGCCGTCCAACCAACTCAGCGTCGATCCAGTGCCGTCCTGAGTCCAGG (1/45) CGCAGCCGTCCAACCAACTCAACGTCGATCCAATGCGGTCCTGAGTCCAGG (1/45) CGCAGCCGTCCAACCAACTCAACGTCAATCCAGTGCCGTCCTGAGTCCAGG (1/45) CGCAGCCGTCCAACCTCACAACTCAGCGTCGATCCAACCGTCCTGAGTCCAGG (1/45)*

Thymine Incorporation CGCAGCCGTCCAACCAACTCATCGTCGATCCAATGCCGTCCTGAGTCCAGG (0/45) CGCAGCCGTCCAACCAGCTCATCGTCGATCCAATGCCGTCCTGAGTCCAGG (1/45) CGCAGCCGTCCAACCAGCTCAACGTCGATCCAAAGCCGTCCTGAGTCCAGG (1/45)

No Incorporation CGCAGCCGTCCAACCAACTCA-CGTCGATCCAATGCCGTCCTGAGTCCAGG (0/45) CGCAGCCGTCCAACCAACTTA-CGTCGATCCAATGCCGTCCTGAGTCCAGG (1/45) CGCAGCCGTCCAACCAACTCA-CGTCGATCCAATGCCGTCCTGAGTCCAGG (1/45) CGCAGCCGTCCAACCAACTCA-CGTCGATCGAACCGCTGTCCTGAGTCCAGG (1/45)* CGCAGCCGTCCAACCAATTCA-CGTCGATCCAATGCTGTCCTGAGTCCAGG (1/45)

A (continued)


Figure 4.3 continued Adenine Incorporation CGCAGCCGTCCAACCAACTCAACGTCGATCCAATGCCGTCCTGAGTCCAGG (9/48) CGCAGCCGTCCAACCAACTCAACGTAGATCCGATGCTGTCCTGAGTCCAGG (2/48) CGCAGCCGTCCAACCAACTCAACGTCGATCCAGAGCAGTCCTGAGTCCAGG (1/48) CGCAGCCGTCCAACTAACTCAATGTCAATCCAATGCCGTCCTGAGTCCAGG (1/48) CGCAGCCGTCCAACCAGCTCAGCGTCGATCCGATGCCGTCCTGAGTCCAGG (3/48) CGCAGCCGTCCAACCAAGTCAGCGTCGATCTAATGCCGTCCTGAGTCCAGG (1/48) CGCAGCCGTCCAACCGACTCAACGTCGATCCTATGCCGTCCTGAGTCCAGG (1/48) CGCAGCCGTCCAACCAACTCAACGTCGGTCCCTTGCCGTCCTGAGTCCAGG (1/48) CGCAGCCGTCCAACCAACTCAACGTCAATCCCATGCCGTCCTGAGTCCAGG (1/48) CGCAGCCGTCCAACCAGCTCAACGTCGATCCAATGCCGTCCTGAGTCCAGG (1/48) CGCAGCCGTCCAACCAACTCAACATCGATCCAGTGCCGTCCTGAGTCCAGG (1/48) CGCAGCCGTCCAACCAACTCAACGTCGATCCAATGGCGTCCTGAGTCCAGG (1/48) CGCAGCCGTCCAACCAAGTCAACGTCGATCCAATGCCGTCCTGAGTCCAGG (1/48) CGCAGCCGTCCAACCGACTCAACGTAGAACCAATGCCGTCCTGAGTCCAGG (1/48) CGCAGCCGTCCAACCAACTCAACGTCGATCCAAAGCCGTCCTGAGTCCAGG (1/48) CGCAGCCGTCCAACCAGCTCAATGTCGATCCAATGCGGTCCTGAGTCCAGG (1/48) CGCAGCCGTCCAACCAACTCAACGCTGATCCAATACCGTCCTGAGTCCAGG (2/48) CGCAGCCGTCCAACCAACTTAGCGTCGCTCCAACGCCGTCCTGAGTCCAGG (1/48) CGCAGCCGTCCAACCAGCTCAGCGTGGATCCAGTGCCGTCCTGAGTCCAGG (1/48) CGCAGCCGTCCAACCAAATCAACGTCGATCCAATGCCGTCCTGAGTCCAGG (1/48) CGCAGCCGTCCAACCAACTCAGCGTCGCTCCAACGCCGTCCTGAGTCCAGG (1/48) CGCAGCCGTCCAACCAACTCAACGTAGATCCAATGCCGTCCTGAGTCCAGG (1/48) CGCAGCCGTCCAACCAACTCGACGTCGATCCAATGCCGTCCTGAGTCCAGG (2/48) CGCAGCCGTCCAACCAACTCAACGTCAATCCAATGGCGTCCTGAGTCCAGG (1/48) CGCAGCCGTCCAACCAACTCAACGTCGATCCGATGCCGTCCTGAGTCCAGG (1/48) CGCAGCCGTCCAACCAACACAACGTCGATCCAATGCCGTCCTGAGTCCAGG (1/48) CGCAGCCGTCCAACCATCTCAACGTTGATCCAATGCCGTCCTGAGTCCAGG (1/48) CGCAGCCGTCCAACCAACTAGACGTCGATCCAATGCCGTCCTGAGTCCAGG (1/48) CGCAGCCGTCCAACCAACTCAACGTCGATCCAGCGTCGTCCTGAGTCCAGG (1/48) CGCAGCCGTCCAACCAACTCAACGCCGATCCAATGCCGTCCTGAGTCCAGG (1/48) CGCAGCCGTCCAACCAACTCAACATCGATCCAATGTCGTCCTGAGTCCAGG (1/48) CGCAGCCGTCCAACCAACTTAACGTCGATCCGATGCCGTCCTGAGTCCAGG (1/48) CGCAGCCGTCCAACCAACTCAACGTCGATCCAGTGCCGTCCTGAGTCCAGG (1/48) 
CGCAGCCGTCCAACCGTCTCAACGTCGGTCCAGTGCCGTCCTGAGTCCAGG (1/48) CGCAGCCGTCCAACGAACTCAACGTCGGTCCAATGCCGTCCTGAGTCCAGG (1/48)

B (continued)

Figure 4.3 continued Adenine Incorporation CGCAGCCGTCCAACCAACTCAACGTCGATCCAATGCCGTCCTGAGTCCAGG (1/45)

TG CT C TC TT T (1/45)

T GT T (1/45)

CG C GCC (1/45)

C G CG C A CAAT CC (1/45)

CG CGAT GT (1/45)

CG CGG TCTCC (1/45)

TG CT C G T TG TT (1/45)

Cytosine Incorporation CGCAGCCGTCCAACCAACTCACCGTCGATCCAATGCCGTCCTGAGTCCAGG (0/45)

GT C AT GCA GCC (1/45)

Guanine Incorporation CGCAGCCGTCCAACCAACTCAGCGTCGATCCAATGCCGTCCTGAGTCCAGG (1/45)

G T C GT GA AC T (1/45)

G A TC G T C (1/45)

T A CG T T T T (1/45)

CG CA GGTTC T (1/45)

G A CTT T (1/45)

G G A CG TGA GGAA T (1/45)

TG CGATGTT GT T (1/45)*

CG TTTTG AA TTTG (1/45)*

T T T G T (1/45)

G T T AAT (1/45)

C (continued)


Figure 4.3 continued Thymine Incorporation CGCAGCCGTCCAACCAACTCATCGTCGATCCAATGCCGTCCTGAGTCCAGG (0/45) T TAC A TT T (1/45) G G C CG G CTCT CC (1/45) TG G C T AT TTCTGCC (1/45)* G TG CATGCTT TAA GA (1/45) G A A C TA CCG T G (1/45) T TCA T T T (1/45) G C CGAT TG G (1/45) G CA C T G A (1/45) G CT AT T GCC (1/45)

G TG C GA CCTG T (1/45)

G A GTCTCC T (1/45)* T G C CG GTC CTTG (1/45) G T A C TCGA AA GT (1/45) TT G CG A CCAA (1/45) G C CA CGT GTAA G T (1/45) GG T CGT A TT T (1/45) G CTG C CGA CAA GCC (1/45)

No incorporation CGCAGCCGTCCAACCAACTCA-CGTCGATCCAATGCCGTCCTGAGTCCAGG (0/45) T T GT T (1/45) T G CA CG C TT (1/45) AAG A CG GT T (1/45) ACTG CG GT TT (2/45) GT A C T TT G GT (1/45)

GA TGT TC A AGTTCCTTCTCC (1/45)* T T C T GC (1/45)

C (continued)


Figure 4.3 continued Adenine Incorporation CGCAGCCGTCCAACCAACTCAACGTCGATCCAATGCCGTCCTGAGTCCAGG (1/44) CGCAGCCGTCCAACCGACTCAGCGTCGTTCCTTTGCCGTCCTGAGTCCAGG (1/44) CGCAGCCGTCCAACCGACTCGACGTTGTTCCTTTGTCGTCCTGAGTCCAGG (2/44) CGCAGCCGTCCAACCGACTCAACGTCGATCCTTTGCCGTCCTGAGTCCAGG (1/44) CGCAGCCGTCCAACCGACTCAGCTTCGTTCCAATGCCGTCCTGAGTCCAGG (1/44) CGCAGCCGTCCAACCGACTCGACGTCCATCCTTTGCCGTCCTGAGTCCAGG (1/44) CGCAGCCGTCCAACCGACTCAACGTCGTTCCGATGCCGTCCTGAGTCCAGG (1/44) CGCAGCCGTCCAACTAACTTGACGTCGGTCGAATGCCGTCCTGAGTCCAGG (1/44) CGCAGCCGTCCAACCAGCTCAGCGTGGGAGCGATGTCGTCCTGAGTCCAGG (1/44) CGCAGCCGTCCAACCGACTCAACGTCGTTCTATTGCCGTCCTGAGTCCAGG (1/44) CGCAGCCGTCCAACCGACTCGACGTTGGTCCGTTGCCGTCCTGAGTCCAGG (1/44) CGCAGCCGTCCAACCAACTTAACGTCTTTCCGATGCCGTCCTGAGTCCAGG (1/44) CGCAGCCGTCCAACCGACTCGGCGTCGGTCTAGTGCCGTCCTGAGTCCAGG (1/44) CGCAGCCGTCCAACCAACTCGACGTCGATCCTTTGCCGTCCTGAGTCCAGG (2/44) CGCAGCCGTCCAACCGATTCAGCGTCGATTCAATGCCGTCCTGAGTCCAGG (1/44) CGCAGCCGTCCAACCGACTCGACGTCGGTCCAATGCCGTCCTGAGTCCAGG (1/44) CGCAGCCGTCCAACCGAGTCAACGTCGGTCCGATGCCGTCCTGAGTCCAGG (1/44) CGCAGCCGTCCAACCGGCTCAGCATCTGTCCAATGTCGTCCTGAGTCCAGG (1/44) CGCAGCCGTCCAACCTACTCGGCGTCGGTCCATTGCCGTCCTGAGTCCAGG (1/44) CGCAGCCGTCCAACCAACTTAGCGTCGGTCCTTTGCCGTCCTGAGTCCAGG (1/44) CGCAGCCGTCCAACCAGCTTAGCGTCGGTCCGATGCCGTCCTGAGTCCAGG (1/44) CGCAGCCGTCCAACCGACTCAGCGTCGGTCTAGTGCCGTCCTGAGTCCAGG (2/44) CGCAGCCGTCCAACCAGCTCAGCGTCGATCCAGTGTCGTCCTGAGTCCAGG (1/44) CGCAGCCGTCCAACCGACTCAGCTTCATTCCGACGCTCGTCCTAGTCCAGG (1/44)* CGCAGCCGTCCAACCGACTCTACGTCGATTCTTTGTCGTCCTGAGTCCAGG (1/44) CGCAGCCGTCCAACCGACTCAACGTAGCTTCAAGGCCGTCCTGAGTCCAGG (1/44) CGCAGCCGTCCAACCGACTTAACGTCGGTCCGATGCCGTCCTGAGTCCAGG (1/44) CGCAGCCGTCCAACCGACTTAACGTCGTTTCTATGCCGTCCTGAGTCCAGG (1/44) CGCAGCCGTCCAACCGACTCGACATCGATCCGTTGCCGTCCTGAGTCCAGG (3/44) CGCAGCCGTCCAACCGACTCGGCGTCGATCCATTGCCGTCCTGAGTCCAGG (1/44) CGCAGCCGTCCAACCAACTCAACGTCGATCCAATGCCGTCCTGAGTCCAGG (1/44) CGCAGCCGTCCAACCGATTCTACGTCGTTGTATTGTCGTCCTGAGTCCAGG (1/44) CGCAGCCGTCCAACCGACTTAACGTCGTTCCAATGTTGTCCTGAGTCCAGG (2/44) 
CGCAGCCGTCCAACCGACTCGACGTCAATCCGATGCCGTCCTGAGTCCAGG (1/44) CGCAGCCGTCCAACCGACTCAGCGTCGTTCCGATGCCGTCCTGAGTCCAGG (1/44)

D (continued)

Figure 4.3 continued Adenine Incorporation CGCAGCCGTCCAACCAACTCAACGTCGATCCAATGCCGTCCTGAGTCCAGG (cont‟d) CGCAGCCGTCCAACCGATTCAGCGTCGATCCAATGCCGTCCTGAGTCCAGG (1/44) CGCAGCCGTCCAACCTCATCCGAGTCGATTCATTTGCCGTCCT AGTCCAGG (1/44)* CGCAGCCGTCCAACCAGCTCTGCGTCGGTCCGATTCCGTCCTGAGTCCAGG (1/44) CGCAGCCGTCCAACCAGCTCAGCGTCGGTTCGTTGTTGTCCTGAGTCCAGG (1/44)

D

Adenine Incorporation CGCAGCCGTCCAACCAACTCAACGTCGATCCAATGCCGTCCTGAGTCCAGG (5/55) CGCAGCCGTCCAACCAACTCAACGTCGATCCAATGCGGTCCTGAGTCCAGG (1/55) CGCAGCCGTCCAACCAACTCAACGTTGATCCAATACCGTCCTGAGTCCAGG (1/55) CGCAGCCGTCCAACCAACTCAACGTCGATCCAATGCGGTCCTGAGTCCAGG (1/55) CGCAGCCGTCCAACCAACTCAACGTTGATCCAATGCCGTCCTGAGTCCAGG (1/55) CGCAGCCGTCCAACCAACTCAACGTAGATCCAATACTGTCCTGAGTCCAGG (1/55) CGCAGCCGTCCAACCAACTCAACGTTGCTCCAATGCTGTCCTGAGTCCAGG (1/55) CGCAGCCGTCCAACCAACTCAACGTCGATCGAATGACGTCCTGAGTCCAGG (1/55) CGCAGCCGTCCAACCAACTCAACGTCGCTCCAAGGCCGTCCTGAGTCCAGG (1/55) CGCAGCCGTCCAACCAACTCAACGTCGATCCAATGCCGTCCTGAGTCCAGG (1/55) CGCAGCCGTCCAACCAACTCAACGTCGATCCAGTGCCGTCCTGAGTCCAGG (1/55) CGCAGCCGTCCAACCAACTCAACGTCGATCCAATGCGGTCCTGAGTCCAGG (2/55) CGCAGCCGTCCAACCAACTCAACGTCGATCCAATACCGTCCTGAGTCCAGG (1/55) CGCAGCCGTCCAACCAACTCAACGTCGAATCAATGCAGTCCTGAGTCCAGG (2/55) CGCAGCCGTCCAACCACCTCAACGTCGATCCAATGCCGTCCTGAGTCCAGG (1/55) CGCAGCCGTCCAACCAAGTCAACGTCGATCCAATACGGTCCTGAGTCCAGG (1/55) CGCAGCCGTCCAACCACCTCAACGTCGATCCAATGCGGTCCTGAGTCCAGG (1/55)

CGCAGCCGTCCAACCAACTTAACGTCGATCCCGCCGTCCTGAGTCAGGGCAGCAGTTGAGTCCAGG (1/55)* CGCAGCCGTCCAACCAACACAACGTCGATCCGATGTCGTCCTGAGTCCAGG (1/55) CGCAGCCGTCCAACCAAGTCAACGTCCATCGAATGCTGTCCTGAGTCCAGG (1/55) CGCAGCCGTCCAACCAACTCAACGTTGATCCAATGCCGTCCTGAGTCCAGG (2/55) CGCAGCCGTCCAACCAACTCAACGTCGATCCAATGCGGTCCTGAGTCCAGG (1/55) CGCAGCCGTCCAACCAACTCAACGTCGATCCACTGCCGTCCTGAGTCCAGG (1/55) CGCAGCCGTCCAACCAACTCAACGTTGATCCAACGCGGTCCTGAGTCCAGG (1/55) CGCAGCCGTCCAACCAACTCAACGTCGATCAAATGCCGTCCTGAGTCCAGG (1/55) CGCAGCCGTCCAACCAACTCAACGTTGAATCAATGTCGTCCTGAGTCCAGG (1/55) E (continued) 129

Figure 4.3 continued

Thymine Incorporation CGCAGCCGTCCAACCAACTCATCGTCGATCCAATGCCGTCCTGAGTCCAGG (0/55) CGCAGCCGTCCAACCAACTCATCGTCGATCCAATGACGTCCTGAGTCCAGG (1/55) CGCAGCCGTCCAACCAACTCATCGTCGATCCAATACCGTCCTGAGTCCAGG (1/55) CGCAGCCGTCCAACCAAATCATCGTCGATCCAATGCTGTCCTGAGTCCAGG (1/55)

No Incorporation CGCAGCCGTCCAACCAACTCA-CGTCGATCCAATGCCGTCCTGAGTCCAGG (0/55) CGCAGCCGTCCAACCAACTCA-CGTCGATCCAATTCCGTCCTGAGTCCAGG (1/55) CGCAGCCGTCCAACCAACTCA-CGTCGAATCAATGCTGTCCTGAGTCCAGG (1/55) CGCAGCCGTCCAACCGATTCA-CGTCGATCCAATTCCGTCCTGAGTCCAGG (1/55) CGCAGCCGTCCAACCAACTCA-CGTCGATCCAATGCCGTCCTGAGTCCAGG (1/55) CGCAGCCGTCCAACCAAATCA-CGTCGATCCGATGATGTCCTGAGTCCAGG (1/55) CGCAGCCGTCCAACCAACTCA-CGTCGATCCAATGCCGTCCTGAGTCCAGG (1/55) CGCAGCCGTCCAACCAAATCA-CGTCGATCCAATGCTGTCCTGAGTCCAGG (1/55) CGCAGCCGTCCAACTAACTCA-CGTCGATCCAATGCCGTCCTGAGTCCAGG (1/55) CGCAGCCGTCCAACCAACTCA-CGTCGATCAAATGCGGTCCTGAGTCCAGG (1/55) CGCAGCCGTCCAACCAACTCA-CGTCGATCCGATGCCGTCCTGAGTCCAGG (1/55) CGCAGCCGTCCAACCAACTCA-CGTCGATCCAATGCTGTCCTGAGTCCAGG (1/55) CGCAGCCGTCCAACCAACTCA-CGTCGATCCAATGCCGTCCTGAGTCCAGG (1/55) CGCAGCCGTCCAACCAACTCA-CGTCGATCCAATGCCGTCCTGAGTCCAGG (1/55) CGCAGCCGTCCAACCAACTTA-CGTCGATGCAATGCCGTCCTGAGTCCAGG (1/55) CGCAGCCGTCCAACTAACTTA-CGTTGATTTAATGTTGTCCTGAGTCCAGG (1/55) CGCAGCCGTCCAACCAACTCA-CGTCCATGAAATGCCGTCCTGAGTCCAGG (1/55) CGCAGCCGTCCAACCAACTGA-CGTCGATCCAATGCCGTCCTGAGTCCAGG (1/55) CGCAGCCGTCCAACTAACTCA-CGTCGATCCGATGCCGTCCTGAGTCCAGG (2/55)

E (continued)


Figure 4.3 continued Adenine Incorporation CGCAGCCGTCCAACCAACTCAACGTCGATCCAATGCCGTCCTGAGTCCAGG (34/61) CGCAGCCGTCCAACCAACTCAACGTCGATCCAATGCGGTCCTGAGTCCAGG (2/61) CGCAGCCGTCCAACCAACTCAATGTCGATCCAATGCCGTCCTGAGTCCAGG (1/61) CGCAGCCGTCCAACCAACTTAACGTCGATCCAATGCCGTCCTGAGTCCAGG (1/61) CGCAGCCGTCCAACCAACTCAACGTCGATCAAATGCCGTCCTGAGTCCAGG (2/61) CGCAGCCGTCCAACCAACTCGAAGTCGATCCAATGCCGTCCTGAGTCCAGG (1/61) CGCAGCCGTCCAACCAACTCAACGTCGCTCCAATGCCGTCCTGAGTCCAGG (6/61) CGCAGCCGTCCAACCAAACCAACGTCGATCCAATGCCGTCCTGAGTCCAGG (1/61) CGCAGCCGTCCAACCAACTCCGCGTCGATCCAATGCCGTCCTGAGTCCAGG (1/61) CGCAGCCGTCCAACCACCTCAACGTCGATCCAATGCCGTCCTGAGTCCAGG (3/61) CGCAGCCGTCCAACCAACTCAACGTCGATCCAATGCAGTCCTGAGTCCAGG (1/61) CGCAGCCGTCCAACCAACTCAACGTCGATCCAATGCCGTCCTGAGTCCAGG (1/61) CGCAGCCGTCCAACCAACTCAAGGTCGATCCAATGCCGTCCTGAGTCCAGG (2/61) CGCAGCCGTCCAACCAACTCAACGTCGATCAAATGCCGTCCTGAGTCCAGG (1/61) CGCAGCCGTCCAACCAACTCATCGTCTATCCAATGCCGTCCTGAGTCCAGG (1/61) CGCAGCCGTCCAACCAACTCAACGTCGATCCAATACCGTCCTGAGTCCAGG (1/61) CGCAGCCGTCCAACCAACTCAACGTCGATCCAAAGCCGTCCTGAGTCCAGG (1/61) CGCAGCCGTCCAACCAAATCAACGTCGATCCAATGCCGTCCTGAGTCCAGG (1/61)

F

[Figure 4.4 plots: Relative Frequency (%) for dA, dC, dG, dT, and Deletion. (A) Preferred Action Opposite Lesion Site; (B) Preferred Action Opposite dT.]

Figure 4.4. Comparison of preferred actions by human Y-family DNA polymerases opposite the AP site in the damaged template 51AP (A) or the corresponding template base dT in undamaged template 63CTL (B).

The results from SOSA were tallied for all events at template Position 0 for hPolη (white bar), hΔPolι (striped bar) and hΔPolκ (black bar).

[Figure 4.5 plots, panels A and B: Relative Error (%) versus Position from Abasic Site (A) and versus Relative Position Along Template 63CTL (B), for positions -7 to 15.]

Figure 4.5. Histogram of relative error% as a function of template position.

At each position along the DNA template, the relative base insertion% (white bar), substitution% (striped bar) and deletion% (black bar) are shown to reveal the total relative error% and the contribution of each type of mutation simultaneously. The AP site in 51AP is indicated as “AP” along the X-axis, and the corresponding template base dT in 63CTL is denoted as “0” along the X-axis. For values opposite the AP site, an error was scored only for those nucleotide incorporation events that resulted in a deletion. AP bypass analyses for hPolη (A), hΔPolι (C), and hΔPolκ (E) are shown. DNA synthesis with the control template 63CTL was also analyzed for hPolη (B), hΔPolι (D), and hΔPolκ (F).

[Figure 4.5 continued, panels C to F: Relative Error (%) versus Position from Abasic Site (template 51AP) or versus Relative Position Along Template 63CTL (control), for positions -7 to 15.]

[Figure 4.6 plot, panel A: Relative Error (%) versus Position from AP Site.]

Figure 4.6. Relative error% as a function of template position from the AP site for hPolη- (A), hΔPolι- (B), and hΔPolκ-catalyzed (C) nucleotide incorporation events opposite template bases dTs.

The plots show the relative error% for deletion (open square) and substitution mutations (open circle) as a function of the template position from the AP site. The relative error% for deletion (closed square) and substitution mutations (closed circle) as a function of the control template position in 63CTL is also shown for comparison. Only incorporations opposite template bases dTs were analyzed.

Figure 4.6 continued

[Panels B and C. Y-axis: Relative Error (%). X-axis: Position from AP Site.]

[Figure 4.7, panel A. Y-axis: Relative Error (%). X-axis: Position from AP Site.]

Figure 4.7. Relative error% as a function of template position from the AP site for hPolη- (A), hΔPolι- (B), and hΔPolκ-catalyzed (C) nucleotide incorporation events opposite template dG bases.

The plots show the relative error% for deletion (open square) and substitution mutations (open circle) as a function of the template position from the AP site. The relative error% for deletion (closed square) and substitution mutations (closed circle) as a function of the control template position in 63CTL is also shown for comparison. Only incorporations opposite template dG bases were analyzed.

Figure 4.7 continued

[Panels B and C. Y-axis: Relative Error (%). X-axis: Position from AP Site.]

[Figure 4.8, panel A. Y-axis: Relative Error (%). X-axis: Position from AP Site.]

Figure 4.8. Relative error% as a function of template position from the AP site for hPolη- (A), hΔPolι- (B), and hΔPolκ-catalyzed (C) nucleotide incorporation events opposite template dA bases.

The plots show the relative error% for deletion (open square) and substitution mutations (open circle) as a function of the template position from the AP site. The relative error% for deletion (closed square) and substitution mutations (closed circle) as a function of the control template position in 63CTL is also shown for comparison. Only incorporations opposite template dA bases were analyzed.

Figure 4.8 continued

[Panels B and C. Y-axis: Relative Error (%). X-axis: Position from AP Site.]

4.6 Tables

Primer
  14-mer   5’-CGCAGCCGTCCAAC-3’

Templates
  51AP     3’-GCGTCGGCAGGTTGGTTGAGTXGCAGCTAGGTTACGGCAGGACTCAGGTCC-5’
  51CTL    3’-GCGTCGGCAGGTTGGTTGAGTAGCAGCTAGGTTACGGCAGGACTCAGGTCC-5’
  63CTL    3’-TTCGTATGGGTAGCGTCGGCAGGTTGGTTGAGTTGCAGCTAGGTTACGGCAGGACTCAGGTCC-5’

^a X designates an AP site.

Table 4.1. DNA primer and templates^a

Enzyme    t50^bypass (s)^a    t50 (s)^b    t50^bypass/t50
hPolη     4.6                 1.0          4.6
hPolι     112                 75           1.5
hPolκ     1 823               21           87
hRev1     129 000             129 426      1.0

^a Calculated as the time required to bypass 50% of the AP sites (Position 0) in Figure 4.1.
^b Calculated as the time required to bypass 50% of the specific dT, or 21-mer (Position 0) in Figure 4.1.

Table 4.2. The AP bypass efficiencies of the human Y-family DNA polymerases

Enzyme  DNA       Event         Insertion     Insertion     Deletion      Deletion      Substitution  Substitution
                                Error^a       Error Ratio^b Error^c       Error Ratio^d Error^e       Error Ratio^f
hPolη   14/63CTL  Total^g       0             N/A           0             N/A           7.5 x 10-2    N/A
                  Upstream^h    0             N/A           0             N/A           6.5 x 10-2    N/A
                  Downstream^i  0             N/A           0             N/A           7.9 x 10-2    N/A
hPolη   14/51AP   Total^g       4.4 x 10-3    -             2.6 x 10-2    -             6.4 x 10-2    0.85
                  Upstream^h    0             -             3.5 x 10-3    -             5.9 x 10-2    0.91
                  Downstream^i  6.5 x 10-3    -             3.6 x 10-2    -             6.7 x 10-2    0.85
hPolι   14/63CTL  Total^g       1.2 x 10-2    N/A           5.1 x 10-2    N/A           1.9 x 10-1    N/A
                  Upstream^h    6.5 x 10-3    N/A           5.8 x 10-2    N/A           2.2 x 10-1    N/A
                  Downstream^i  1.5 x 10-2    N/A           4.7 x 10-2    N/A           1.8 x 10-1    N/A
hPolι   14/51AP   Total^g       7.0 x 10-2    5.8           1.8 x 10-1    3.5           1.7 x 10-1    0.89
                  Upstream^h    4.1 x 10-2    6.3           1.1 x 10-1    1.9           1.8 x 10-1    0.82
                  Downstream^i  8.3 x 10-2    5.5           2.2 x 10-1    4.7           1.7 x 10-1    0.94
hPolκ   14/63CTL  Total^g       0             N/A           3.9 x 10-3    N/A           1.9 x 10-2    N/A
                  Upstream^h    0             N/A           0             N/A           2.0 x 10-2    N/A
                  Downstream^i  0             N/A           5.7 x 10-3    N/A           1.8 x 10-2    N/A
hPolκ   14/51AP   Total^g       3.2 x 10-3    -             5.4 x 10-2    14            6.3 x 10-2    3.3
                  Upstream^h    0             -             1.5 x 10-2    -             3.1 x 10-2    1.6
                  Downstream^i  4.8 x 10-3    -             7.1 x 10-2    12            7.9 x 10-2    4.4

'-' means the ratio cannot be calculated because the denominator is 0. 'N/A' means not applicable.
^a Calculated as Σ(base insertions)/[(sample size)x(number of bases in a specific event)].
^b Calculated as {Σ(base insertions)/[(sample size)x(number of bases in a specific event)]}51AP/{Σ(base insertions)/[(sample size)x(number of bases in a specific event)]}63CTL.
^c Calculated as Σ(base deletions)/[(sample size)x(number of bases in a specific event)].
^d Calculated as {Σ(base deletions)/[(sample size)x(number of bases in a specific event)]}51AP/{Σ(base deletions)/[(sample size)x(number of bases in a specific event)]}63CTL.
^e Calculated as Σ(base substitutions)/[(sample size)x(number of bases in a specific event)].
^f Calculated as {Σ(base substitutions)/[(sample size)x(number of bases in a specific event)]}51AP/{Σ(base substitutions)/[(sample size)x(number of bases in a specific event)]}63CTL.
^g Total events include all dNTP incorporation events during DNA synthesis of full-length products except those that occurred at Position 0 in Figure 4.5.
^h Upstream events include all dNTP incorporation events during DNA synthesis of full-length products that occurred before an enzyme encountered Position 0 in Figure 4.5.
^i Downstream events include all dNTP incorporation events during DNA synthesis of full-length products that occurred after an enzyme traversed Position 0 in Figure 4.5.

Table 4.3. Error rates of the four human Y-family DNA polymerases
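The ratio columns of Table 4.3 are the 51AP error rate divided by the matching 63CTL error rate, as the table footnotes describe. The following is a minimal sketch of that arithmetic, assuming the rounded table entries as inputs rather than the raw sequencing counts; the helper name `error_ratio` is hypothetical and not part of the original analysis.

```python
def error_ratio(error_51ap, error_63ctl):
    """Ratio of the error rate on 51AP to that on 63CTL.

    Returns "-" when the control error rate is 0, mirroring the table legend.
    """
    if error_63ctl == 0:
        return "-"
    return error_51ap / error_63ctl


# Rounded "Total" entries copied from Table 4.3 as (51AP, 63CTL) pairs.
examples = {
    "insertion, control error of 0": (4.4e-3, 0.0),
    "insertion": (7.0e-2, 1.2e-2),
    "deletion": (5.4e-2, 3.9e-3),
    "substitution": (6.3e-2, 1.9e-2),
}

for label, (ap, ctl) in examples.items():
    ratio = error_ratio(ap, ctl)
    # Two significant figures, as the ratios are reported in the table.
    shown = ratio if ratio == "-" else f"{ratio:.2g}"
    print(f"{label}: {shown}")
```

Because the inputs are already rounded, a recomputed ratio can differ from a tabulated one in the last digit.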

4.7 Schemes

Scheme 4.1. Short oligonucleotide sequencing assay.

Chapter 5 : Mechanistic Studies of the Bypass of a Bulky Single-Base Lesion Catalyzed by a Y-Family DNA Polymerase

5.1 Introduction

Environmental pollutants have been shown to impact human health at the molecular level. One detrimental route is the modification of genomic DNA and nucleotides [177]. If DNA lesions are not recognized and removed by the cellular DNA repair machinery, they will stall replicative DNA polymerases [4, 53, 178-182]. To rescue DNA replication, cells employ lesion bypass DNA polymerases to traverse unrepaired lesions. Most of these enzymes belong to the Y-family of DNA polymerases.

The Y-family enzymes possess relatively flexible and solvent-accessible active sites in order to accommodate bulky DNA lesions [49, 60]. However, Y-family DNA polymerases catalyze DNA synthesis over undamaged DNA with low fidelity and poor processivity [4, 8, 60, 105]. The Y-family DNA polymerases have been identified in all three domains of life, e.g. four in humans (DNA polymerases η, ι, κ, and Rev1), two in Escherichia coli (DNA polymerases IV and V) and one in S. solfataricus (Dpo4).

Because Dpo4 can be expressed in E. coli and purified with a high yield, it has been extensively studied in vitro as a prototype Y-family enzyme. Dpo4 catalyzes DNA synthesis on an undamaged DNA template with a fidelity of one error per 1,000 to 10,000 nucleotide incorporations based on pre-steady-state kinetic analysis from 37 to 56 °C [27, 65, 66]. Dpo4 is capable of bypassing a myriad of DNA lesions including apurinic/apyrimidinic (abasic) sites [46, 58, 59, 106], 8-oxo-7,8-dihydro-2′-deoxyguanosine [50, 183], 1,N2-etheno(ε)guanosine [56], cis-syn thymine-thymine dimer [41, 44, 107], cisplatin-induced 1,2-intrastrand cross-links with adjacent deoxyguanosines (cisplatin-dGpG adducts) [41, 108], benzo[a]pyrene diol epoxide (BPDE) on deoxyguanosine (BPDE-dG) or deoxyadenosine (BPDE-dA) [44, 109], and N-2-acetylaminofluorene (AAF) on deoxyguanosine (AAF-dG) [41].

So far, there are no comprehensive in vitro studies of the bypass of 1-nitropyrene (1-NP)-induced DNA adducts catalyzed by a Y-family DNA polymerase. 1-NP, one of the most abundant polycyclic aromatic hydrocarbons (PAH), is a product of incomplete diesel and gasoline combustion [177, 184, 185]. There are two known pathways by which 1-NP is metabolized: nitro reduction (Scheme 5.1) and C-hydroxylation. When an aromatic ring of 1-NP is oxidized into non-DNA-reactive metabolites by P450 enzymes while in the gastrointestinal tract, respiratory system or skin, the organism can excrete the metabolites through a detoxification process [184-186]. However, in a gastrointestinal tract containing bacteria, such as Clostridium paraputrificum, Clostridium clostridiiforme, Eubacterium sp. and Clostridium leptum [187], the majority of 1-NP proceeds through nitro reduction, thereby leading to the production of DNA-reactive metabolites (Scheme 5.1) [186]. The intermediate metabolite, N-hydroxy-1-aminopyrene, is critical for creating an electrophilic nitrenium ion capable of reacting with DNA (Scheme 5.1). The major product formed from these reactive metabolites is N-(deoxyguanosin-8-yl)-1-aminopyrene (dGAP), which was shown to be mutagenic in bacterial and mammalian cells [185, 186, 188]. 1-NP is a potent mutagen and a carcinogen in rodents [189, 190], and the International Agency for Research on Cancer classifies 1-NP as a class 2B carcinogen [186, 188, 191, 192].

Dpo4 has not been shown to bypass the dGAP adduct in vivo for several reasons. First, S. solfataricus, a hyperthermophilic archaeon, would have to be able to take up 1-NP. Since this organism grows optimally at 80 °C and pH 2-4 [38], 1-NP may not be stable under these extreme conditions [193]. Second, S. solfataricus would have to encode the necessary enzymes, such as nitroreductase, to metabolize 1-NP and form dGAP [38]. We chose to study dGAP bypass catalyzed by Dpo4 because (i) the kinetic mechanism of nucleotide incorporation into undamaged DNA catalyzed by Dpo4 has been previously elucidated by our laboratory [65]; (ii) there are many published crystal structures of the Dpo4 ternary complexes which contain various DNA lesions [40, 42, 43, 50, 55, 56, 106, 107, 109, 183]; (iii) Dpo4 is the only Y-family DNA polymerase encoded by S. solfataricus and thus is responsible for most translesion synthesis events in that organism [38]; and (iv) we have established kinetic mechanisms and pathways for the bypass of an abasic site [59] and a cisplatin-dGpG adduct [108] catalyzed by Dpo4. In addition, both Dpo4 and eukaryotic DNA polymerase κ (Polκ) are DinB homologs [33, 99], and the bypass abilities of Dpo4 for a spectrum of DNA lesions are similar to those of eukaryotic DNA polymerase η (Polη) [34, 35, 41]. Thus, Dpo4 is a good model for those eukaryotic Y-family DNA polymerases, and our studies may shed light on how eukaryotic Y-family enzymes bypass dGAP. To better understand the mutagenic potential of 1-NP-induced DNA damage, a single-base lesion, dGAP, was placed specifically in a GC-rich region of a synthetic DNA template. Regions composed of repetitive DNA sequences, such as those in oncogenes, have been shown previously to induce more mutations than non-repetitive regions [188, 194, 195]. The mechanistic basis of the bypass of this bulky dGAP lesion catalyzed by Dpo4 was comprehensively investigated using pre-steady-state kinetic methods.

5.2 Materials and Methods

Materials. Reagents were purchased from the following companies: OptiKinase from United States Biochemical (Cleveland, OH), [γ-32P]ATP from GE Healthcare (Piscataway, NJ), and dNTPs from Gibco-BRL (Grand Island, NY). Full-length Dpo4 was expressed in E. coli and purified as previously described [27].

Synthetic Oligonucleotides. The DNA template 26-mer-dGAP (Table 5.1) was synthesized and purified as previously described [196]. The monoisotopic mass (M-H) of the purified 26-mer-dGAP measured by electrospray ionization, 8109.32, was consistent with the calculated mass of 8109.58. Other DNA substrates listed in Table 5.1 were purchased from Integrated DNA Technologies (Coralville, IA) and purified by denaturing polyacrylamide gel electrophoresis (PAGE). The concentration of each DNA oligomer was determined by the UV absorbance at 260 nm.

Labeling and Annealing of the DNA Substrates. Each primer was 5′-[32P]-labeled by incubating it with OptiKinase and [γ-32P]ATP for 3 h at 37 °C. The 5′-[32P]-labeled primer was annealed to the unlabeled 26-mer or 26-mer-dGAP at a molar ratio of 1.00:1.15. This mixture was first heat denatured at 75 °C for 2 min and then cooled slowly to room temperature over several hours.

Buffers. All pre-steady-state kinetic assays, if not specified, were performed in optimized reaction buffer R (50 mM HEPES, pH 7.5 at 37 °C, 5 mM MgCl2, 50 mM NaCl, 0.1 mM EDTA, 5 mM DTT, 10 % glycerol, and 0.1 mg/ml BSA) [27]. All electrophoretic mobility shift assays (EMSA) were performed in buffer S (50 mM Tris-Cl, pH 7.5 at 23 °C, 5 mM MgCl2, 50 mM NaCl, 5 mM DTT, 10 % glycerol, and 0.1 mg/ml BSA). All given concentrations were final after mixing all solutions.

Running Start Assay. The running start assay was performed as previously described [58, 59, 108]. Briefly, a preincubated solution of 5′-[32P]-labeled DNA (100 nM) and Dpo4 (100 nM) in buffer R was rapidly mixed with a solution containing all four dNTPs (200 µM each) at 37 °C via a rapid chemical-quench flow apparatus (KinTek). The reaction was quenched with 0.37 M EDTA after various times, and the reaction products were analyzed by denaturing PAGE (17 % polyacrylamide, 8 M urea).

EMSA. Dpo4 (0.5–80 nM) was titrated into a solution containing 5′-[32P]-labeled DNA (5 nM) in buffer S at 23 °C. To separate the binary complex from free DNA, native PAGE was conducted at a constant voltage of 70 V for 35 min at 23 °C using running buffer A (50 mM Tris-acetate, pH 7.5 at 23 °C, 0.5 mM EDTA, 5.5 mM Mg(OAc)2). After drying the gel, the bands were quantitated using a PhosphorImager 445 SI (Molecular Dynamics). The dependence of the concentration of the binary complex Dpo4•DNA on the Dpo4 concentration was fit to Eq 1 to yield Kd, DNA, the equilibrium dissociation constant for the binary complex (Dpo4•DNA) at 23 °C.

$$[\mathrm{Dpo4{\bullet}DNA}] = 0.5\,(K_{d,\mathrm{DNA}} + E_o + D_o) - 0.5\,\left[(K_{d,\mathrm{DNA}} + E_o + D_o)^2 - 4E_oD_o\right]^{1/2} \qquad \text{(Eq 1)}$$

In Eq 1, Eo is the active Dpo4 concentration and Do is the DNA concentration.
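To illustrate how Eq 1 behaves across an EMSA titration, the sketch below evaluates it at a few Dpo4 concentrations. It assumes 5 nM DNA (the EMSA condition) and a Kd, DNA of 1.0 nM (the value reported in the Results for 20/26-mer-dGAP); the function name `binary_complex` and the chosen titration points are illustrative and not part of the original analysis, which fit Eq 1 to gel data.

```python
import math


def binary_complex(e_total, d_total, kd):
    """Eq 1: concentration of the Dpo4-DNA binary complex (same units as inputs)."""
    s = kd + e_total + d_total
    return 0.5 * s - 0.5 * math.sqrt(s * s - 4.0 * e_total * d_total)


D_TOTAL = 5.0  # nM labeled DNA, as in the EMSA protocol
KD_DNA = 1.0   # nM, assumed for illustration

# A few points spanning the 0.5-80 nM Dpo4 titration range.
for e_total in (0.5, 2.0, 5.0, 20.0, 80.0):
    bound = binary_complex(e_total, D_TOTAL, KD_DNA)
    percent = 100.0 * bound / D_TOTAL
    print(f"Dpo4 = {e_total:5.1f} nM -> complex = {bound:.2f} nM ({percent:.0f}% of DNA)")
```

The quadratic form is used instead of a simple hyperbola because the DNA concentration (5 nM) is comparable to Kd, DNA, so depletion of free Dpo4 by binding cannot be neglected.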

Determination of Substrate Specificity. The dNTP incorporation efficiency (kp/Kd, dNTP) was calculated using the measured maximum dNTP incorporation rate (kp) and equilibrium dissociation constant (Kd, dNTP) of an incoming dNTP. Single-turnover dNTP incorporation assays were employed to obtain the kp and Kd, dNTP as previously described [27, 59, 65, 108]. Briefly, a preincubated solution of Dpo4 (120 nM) and 5′-[32P]-labeled DNA (30 nM) in buffer R was mixed with increasing concentrations of a dNTP. The reactions were terminated after various times using 0.37 M EDTA. Reaction products were analyzed by denaturing PAGE (17 % polyacrylamide, 8 M urea) and quantitated with a PhosphorImager 445 SI. The time course of product formation at each dNTP concentration was fit to a single-exponential equation (Eq 2):

$$[\mathrm{Product}] = A\,[1 - \exp(-k_{obs}t)] \qquad \text{(Eq 2)}$$

where kobs is the observed reaction rate constant and A is the reaction amplitude. Next, the plot of the kobs versus the dNTP concentration was fit to a hyperbolic equation (Eq 3):

$$k_{obs} = \frac{k_p[\mathrm{dNTP}]}{[\mathrm{dNTP}] + K_{d,\mathrm{dNTP}}} \qquad \text{(Eq 3)}$$

where kp is the maximum dNTP incorporation rate and Kd, dNTP is the equilibrium dissociation constant for the ternary complex (Dpo4•DNA•dNTP).
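As a rough illustration of how Eq 2 and Eq 3 relate to the incorporation efficiency, the sketch below evaluates both equations with the kp (6.3 s-1) and Kd, dCTP (682 µM) values reported in the Results for 22/26-mer-dGAP. The function names and the 30 nM amplitude are assumptions made for this example; the original work obtained these parameters by fitting gel data, which is not reproduced here.

```python
import math


def product(t, amplitude, k_obs):
    """Eq 2: single-exponential time course of product formation."""
    return amplitude * (1.0 - math.exp(-k_obs * t))


def k_observed(dntp, k_p, kd_dntp):
    """Eq 3: hyperbolic dependence of k_obs on the dNTP concentration."""
    return k_p * dntp / (dntp + kd_dntp)


K_P = 6.3        # s^-1, reported for dCTP incorporation into 22/26-mer-dGAP
KD_DNTP = 682.0  # µM, reported for the same substrate
AMPLITUDE = 30.0 # nM, assumed equal to the labeled DNA concentration

# Incorporation efficiency kp/Kd in µM^-1 s^-1.
efficiency = K_P / KD_DNTP
print(f"kp/Kd = {efficiency:.2e} µM^-1 s^-1")

# k_obs across the 25-1500 µM dCTP range used in the assay.
for dntp in (25.0, 200.0, 682.0, 1500.0):
    k_obs = k_observed(dntp, K_P, KD_DNTP)
    made = product(0.2, AMPLITUDE, k_obs)  # product after 0.2 s
    print(f"[dCTP] = {dntp:6.0f} µM -> k_obs = {k_obs:.2f} s^-1, product(0.2 s) = {made:.1f} nM")
```

At [dNTP] = Kd, dNTP the observed rate is exactly kp/2, which is a convenient sanity check on any fitted pair of parameters.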

Biphasic Kinetic Assay. A preincubated solution of Dpo4 (120 nM) and 5′-[32P]-labeled DNA (30 nM) in buffer R was rapidly mixed with 5 µM DNA trap D-1 (Table 5.1) [27] and 1.2 mM correct dNTP in buffer R for various times before being quenched with 0.37 M EDTA. Reaction products were resolved and quantitated as described above. The plot of the product concentration versus reaction time was fit to a double-exponential equation (Eq 4):

$$[\mathrm{Product}] = E_oA_1[1 - \exp(-k_1t)] + E_oA_2[1 - \exp(-k_2t)] \qquad \text{(Eq 4)}$$

where Eo is the active Dpo4 concentration, A1 and A2 are the reaction amplitudes of the first and second phases, respectively, and k1 and k2 are the rate constants of the first and second phases, respectively.
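The shape of the double-exponential time course in Eq 4 can be sketched as follows. The example uses the amplitudes and rates quoted in the Discussion for dCTP incorporation opposite dGAP (fast phase 6.7% at 11 s-1, slow phase 60% at 0.7 s-1). As a simplifying assumption, amplitudes are expressed as fractions of the labeled DNA, so the Eo prefactor of Eq 4 is dropped; the function name and time points are illustrative.

```python
import math


def biphasic_product(t, a1, k1, a2, k2):
    """Eq 4 with amplitudes given as fractions of labeled DNA (no Eo prefactor)."""
    fast = a1 * (1.0 - math.exp(-k1 * t))
    slow = a2 * (1.0 - math.exp(-k2 * t))
    return fast + slow


A1, K1 = 0.067, 11.0  # fast phase: amplitude (fraction) and rate (s^-1)
A2, K2 = 0.60, 0.7    # slow phase: amplitude (fraction) and rate (s^-1)

for t in (0.05, 0.2, 1.0, 5.0, 20.0):
    total = biphasic_product(t, A1, K1, A2, K2)
    fast_only = A1 * (1.0 - math.exp(-K1 * t))
    print(f"t = {t:5.2f} s: product = {100 * total:5.1f}% (fast phase {100 * fast_only:4.1f}%)")
```

The fast phase is essentially complete within a few tenths of a second, after which product formation is governed by the slow phase and plateaus at A1 + A2.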

5.3 Results

Bypass of a dGAP Lesion Catalyzed by Dpo4. A running start assay (Experimental Procedures) was performed to observe how Dpo4 responded to a 1-aminopyrene (1-AP) adduct in a DNA substrate (17/26-mer-dGAP). As described previously with other normal DNA substrates [59, 108], Dpo4 synthesized the full-length product 26-mer with an undamaged DNA substrate 17/26-mer (Table 5.1) within 10 s (Figure 5.1A). Despite the bulky size of dGAP, Dpo4 was able to bypass this lesion in 17/26-mer-dGAP (Table 5.1), with the full-length product observed after 180 s (Figure 5.1B). However, the accumulation of intermediate products 20-mer and 21-mer signaled that there were two consecutive strong polymerase pause sites. The 20-mer and 21-mer intermediates corresponded to dNTP incorporation opposite dGAP and extension of the lesion bypass product, respectively. In comparison, Dpo4 did not pause significantly at the comparable sites in Figure 5.1A. The product 27-mer in Figure 5.1A was likely formed through a blunt-end addition [117]. Furthermore, the accumulation of 24-mer and 25-mer in Figure 5.1A and 25-mer in Figure 5.1B was likely due to the dC-rich sequence at the 3′-terminus of the 26-mer template (Table 5.1), which caused polymerase 'slippage' via primer realignment. This possibility was supported by the observation that the addition of 5% DMSO to the reaction buffer diminished the accumulation of both the 24-mer and 25-mer (data not shown).

Effect of a dGAP Lesion on DNA Binding to Dpo4. The accumulation of intermediate products in Figure 5.1B suggested that the presence of the bulky lesion may weaken the binding of DNA to Dpo4, thereby reducing the extension of intermediates. To measure the binding affinity (1/Kd, DNA) of DNA to Dpo4, the EMSA was performed for DNA substrates containing either the control (26-mer) or damaged (26-mer-dGAP) DNA templates (Table 5.1). The binary complex (Dpo4•DNA) was separated from free DNA using native PAGE (Figure 5.2A). As a representative example, the plot of the concentration of Dpo4•20/26-mer-dGAP against the total concentration of Dpo4 (Figure 5.2B) was fit to Eq 1 (Experimental Procedures) to obtain a Kd, DNA of 1.0 ± 0.1 nM. EMSAs were repeated for other DNA substrates and the Kd, DNA values are listed in Table 5.2. As expected, Dpo4 bound to undamaged DNA substrates with similar affinity (3.1-4.0 nM). In comparison, Dpo4 bound to damaged DNA substrates with a larger range of Kd, DNA values (1.0-4.4 nM, Table 5.2). Interestingly, a 4-fold tighter binding affinity (Table 5.2) was observed with 20/26-mer-dGAP, the DNA substrate at the first pause site in Figure 5.1B. This suggested that the 1-AP moiety may interact directly with the active site residues of Dpo4. However, the tighter binding of 20/26-mer-dGAP to Dpo4 should facilitate processive polymerization and thus cannot be used to account for the accumulation of 20-mer at the first strong pause site (Figure 5.1B). Other kinetic studies have shown that the accumulation of intermediates in the vicinity of a DNA lesion is a strong indication that certain microscopic kinetic parameters, such as maximum dNTP incorporation rate (kp) and ground-state dNTP binding affinity (1/Kd, dNTP), were altered [59, 108, 197]. Thus, we suspected that alterations in kp and Kd, dNTP of an incoming dNTP by dGAP were the kinetic reasons for polymerase pausing in Figure 5.1B.

Effect of a dGAP Lesion on the Kinetics of dNTP Incorporation. To determine kp and Kd, dNTP, we performed single dNTP incorporation assays under single-turnover reaction conditions. A representative example is shown in Figure 5.3. First, a preincubated solution of Dpo4 (120 nM) and 5′-[32P]-labeled 22/26-mer-dGAP was rapidly mixed with dCTP (25-1500 µM) and quenched with 0.37 M EDTA at various times. The products were resolved by denaturing PAGE. The product concentration was plotted against time (Figure 5.3A), and the data were fit to Eq 2 (Experimental Procedures) to determine the observed reaction rate (kobs). The dependence of kobs on the dCTP concentration was plotted and fit to Eq 3 (Experimental Procedures), which yielded a kp of 6.3 ± 0.3 s-1 and a Kd, dCTP of 682 ± 80 µM (Figure 5.3B). This assay was repeated for the series of DNA substrates representing the progression of Dpo4 as it approached, encountered, and bypassed the dGAP lesion in template 26-mer-dGAP, and these kinetic data are listed in Table 5.3. For comparison, we used the same kinetic assay to determine the kinetic parameters (Table 5.4) for the corresponding dNTP incorporations with the control template 26-mer (Table 5.1). Although sequence dependent, these control kinetic parameters for both correct and incorrect dNTP incorporations were similar to our previously published results with a different undamaged template [27].

From the measured kp and Kd, dNTP values with 26-mer-dGAP, we further calculated dNTP incorporation efficiency (kp/Kd, dNTP), efficiency ratio (relative to undamaged DNA), fidelity, and probability (Table 5.3). At non-pause sites, the kp/Kd, dNTP values for correct dNTP incorporation were within 1-4 fold of those with control 26-mer (Table 5.4) and were 100-4,000-fold greater than those for misincorporations (Table 5.3). For 20/26-mer-dGAP and 21/26-mer-dGAP, the correct dNTP incorporation efficiencies decreased by 9- and 88-fold, respectively (Table 5.3 and Figure 5.4A), in comparison to those values with control 20/26-mer and 21/26-mer (Table 5.4). In addition, these catalytic efficiencies were up to 740-fold lower than those at non-pause sites (Table 5.3 and Figure 5.4A). Thus, the presence of dGAP unfavorably impacted correct dNTP incorporation at two discrete steps: incorporation opposite the lesion and extension of the bypass product.

In Table 5.3, the polymerase fidelity both upstream and downstream of the pause sites is in the range of 10-3 to 10-5, which is similar to the fidelity range obtained with control DNA (Table 5.4). In contrast, the fidelity at the pause sites (10-2 to 10-4), especially at the 2nd pause site, was lowered by 10-100 fold compared to non-pause sites (Table 5.3) and to control DNA (Table 5.4). Interestingly, the correct dNTP incorporation probability is above 98% at all sites tested except for the extension step, where it drops to 89% (Table 5.3). Based on the dNTP incorporation efficiency values in Table 5.3, Dpo4 catalyzed the insertion of dNTPs with the following selection preference: dCTP >> dATP > dTTP, dGTP at the 1st pause site and dGTP > dATP, dCTP > dTTP at the 2nd pause site.

Biphasic Kinetics of dNTP Incorporation at the Pause Sites. Our previous studies have shown that dNTP incorporation at pause sites follows biphasic kinetics [59, 108]. Such multiple-phase kinetics, which were hidden in the single-turnover dNTP incorporation assay, can be deconvoluted by including a DNA trap. For this assay, 5 µM of undamaged D-1 (Table 5.1) was used as the trap. The effectiveness of this trap was examined and confirmed to be sufficient (Figure 5.5). To investigate the kinetics of dNTP incorporation at pause sites, a preincubated solution of Dpo4 (120 nM) and 5′-[32P]-labeled DNA (30 nM) was rapidly mixed with a solution of correct dNTP (1.2 mM) and unlabeled D-1 (5 µM) for various times before termination with 0.37 M EDTA. The time courses (Figure 5.6) of correct dNTP incorporation into 20/26-mer-dGAP and 21/26-mer-dGAP were both biphasic and were fit to Eq 4 (Experimental Procedures), which yielded the biphasic kinetic parameters listed in Table 5.5. With both damaged substrates, the rate (k1) of the first phase was significantly faster than the k2 of the second phase, while the reaction amplitude of the first phase (A1) was much smaller than the amplitude of the second phase (A2). The total amplitudes (A1 + A2) were 66.7% with 20/26-mer-dGAP and 22% with 21/26-mer-dGAP, which were much less than 100%. In contrast, similar DNA trap assays with their control DNA substrates (20/26-mer and 21/26-mer) revealed only a single, fast phase (data not shown). Moreover, correct dGTP incorporation into damaged 19/26-mer-dGAP at an upstream, non-pause site and into the control substrate 19/26-mer in the presence of a DNA trap also exhibited monophasic kinetics (data not shown). The time courses of product formation were fit to Eq 2 (Experimental Procedures) to yield the kinetic parameters listed in Table 5.5. For the time courses exhibiting monophasic kinetics, the reaction amplitudes of dNTP incorporation were all about 90% while the reaction rates were 2.5-4.6 s-1 (Table 5.5). Taken together, these DNA trap assay experiments demonstrated that the dGAP lesion altered only the kinetics of dNTP incorporation at the two critical steps of translesion synthesis.

5.4 Discussion

The full-length product in Figure 5.1B indicates that Dpo4 was able to bypass a site-specifically placed dGAP. However, the initial formation of 26-mer was slower with the damaged template 26-mer-dGAP (180 s) than with the control template 26-mer (10 s) (Figure 5.1). Moreover, the accumulation of intermediates 20-mer and 21-mer indicated that Dpo4 paused significantly when incorporating dNTP opposite dGAP and extending the bypass product. Interestingly, the second pause site was stronger than the first, which has been reported previously with AAF-dG [41]. To mechanistically understand how a Y-family DNA polymerase traverses a single-base lesion like dGAP, we utilized EMSA and pre-steady-state kinetic methods to investigate the kinetic impact of this lesion on DNA binding to Dpo4 and dNTP incorporations at positions upstream, opposite, and downstream from the dGAP lesion.

A Kinetic Basis for the Pausing of Dpo4 Caused by a dGAP. Figure 5.1B shows that several intermediate products accumulated (i.e. 20-mer, 21-mer, and 25-mer). The accumulation of 25-mer was likely due to polymerase slippage at the dC-rich region (Results). A comparison of the catalytic efficiencies at the four other positions (Table 5.3) revealed the kinetic basis for the remaining incorporation profile. For example, correct dGTP incorporation into 21/26-mer-dGAP is 150-fold less efficient than correct dCTP incorporation into 20/26-mer-dGAP, while the latter is 4-fold less efficient than correct dGTP incorporation into 19/26-mer-dGAP. These inefficiencies led to the accumulation of 20-mer and 21-mer, with the latter accumulating more than the former in Figure 5.1B. In contrast, correct dCTP incorporation into 22/26-mer-dGAP was 220-fold more efficient than correct dGTP incorporation into 21/26-mer-dGAP but was 3-fold less efficient than correct dGTP incorporation into 23/26-mer-dGAP. These differences resulted in the non-accumulation of 22-mer and 23-mer. Taken together, these data suggest the following kinetic pattern: (i) if an intermediate accumulates at a polymerase pause site, its elongation is less efficient than its production, and the larger the difference in incorporation efficiency, the stronger the accumulation of the intermediate; (ii) the contrary is true for intermediate species which do not accumulate at a polymerase non-pause site. Consistently, this kinetic pattern has been observed in the bypass of an abasic site [59] and a cisplatin-dGpG adduct [108] catalyzed by Dpo4.
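The kinetic pattern described above can be checked against the fold differences quoted in the text. The sketch below is illustrative only: it normalizes the efficiency of correct incorporation into 19/26-mer-dGAP to 1 (an assumption made for convenience, since only ratios matter) and applies rule (i), that an n-mer accumulates when its elongation is less efficient than its production.

```python
# Relative efficiency of correct dNTP incorporation into each n/26-mer-dGAP substrate,
# built from the fold differences quoted in the text (19/26-mer-dGAP set to 1).
eff = {19: 1.0}
eff[20] = eff[19] / 4.0    # 20/26-mer-dGAP: 4-fold less efficient than 19/26-mer-dGAP
eff[21] = eff[20] / 150.0  # 21/26-mer-dGAP: 150-fold less efficient than 20/26-mer-dGAP
eff[22] = eff[21] * 220.0  # 22/26-mer-dGAP: 220-fold more efficient than 21/26-mer-dGAP
eff[23] = eff[22] * 3.0    # 23/26-mer-dGAP: 3-fold more efficient than 22/26-mer-dGAP


def accumulates(n):
    """Rule (i): the n-mer accumulates if elongating it is less efficient than making it.

    Production of the n-mer is the elongation of the (n-1)-mer.
    """
    return eff[n] < eff[n - 1]


for n in (20, 21, 22, 23):
    fold = eff[n - 1] / eff[n]  # production efficiency / elongation efficiency
    verdict = "accumulates" if accumulates(n) else "does not accumulate"
    print(f"{n}-mer: production/elongation = {fold:.3g} -> {verdict}")
```

Under these assumptions the production/elongation ratio is larger for the 21-mer than for the 20-mer, in line with the stronger second pause site.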

The analysis of efficiency ratios showed that dGAP significantly altered the kinetics of elongating the species at the strong pause sites in Figure 5.1B (Table 5.3 and Figure 5.4A). To determine which kinetic parameters were affected, the kp and Kd, dNTP ratios were plotted against the template positions (Figure 5.4). As displayed in Figure 5.4B, the kp was significantly affected only at the two pause sites (11-fold for 20/26-mer-dGAP and 58-fold for 21/26-mer-dGAP). In contrast, the Kd, dNTP ratios were within 5-fold, with the largest increase in Kd, dNTP occurring at a non-pause site (Figure 5.4C). Thus, the inefficient elongation of 20-mer and 21-mer in Figure 5.1B was primarily due to slow kp values for correct dNTP incorporation.

These slow kp values may be due to DNA being trapped in non-productive complexes with Dpo4. This hypothesis was supported by the biphasic kinetics of nucleotide incorporations in the presence of a DNA trap, whereby a small, fast phase (A1 and k1) preceded a large, slow phase (A2 and k2) (Figure 5.6 and Table 5.5). Opposite dGAP, the contribution of the fast phase [(11 s-1)x(6.7 % of reaction amplitude)] and the slow phase [(0.7 s-1)x(60%)] yielded an overall dCTP incorporation rate of 1.2 s-1. This

-1 value was close to the calculated kobs (0.9 s ) estimated using Eq 3, 1.2 mM dCTP in

158

Figure 5.6, and measured Kd, dNTP and kp values (Table 5.3). Similarly, analysis of the biphasic rates for dGTP incorporation into 21/26-mer-dGAP (Table 5.5) agreed with the single-turnover rate in Table 5.3. In contrast, the same DNA trap experiments with 19/26- mer-dGAP revealed only the fast phase kinetics of dGTP incorporation with reaction amplitude of ~90% (Table 5.5) while the slow phase was not observed. Despite the molar excess of Dpo4, the remaining 10% of 19/26-mer-dGAP was never elongated (data not shown) which was not caused by dGAP, for similar reaction amplitudes were also observed with the three control DNA substrates in the presence of the DNA trap (Table

5.5). Observing reaction amplitudes less than 100% could be due to experimental errors, incomplete annealing of the DNA duplexes, Dpo4 bound to DNA in an inactive mode, and Dpo4 binding at the blunt-end rather than the staggering end of DNA. Notably, correct nucleotide incorporation into 19/26-mer-dGAP and the three control DNA substrates all followed monophasic kinetics with similar kinetic parameters (Table 5.5).

This suggested that the dGAP lesion did not affect nucleotide incorporation at a non-pause

site and 19/26-mer-dGAP was bound by Dpo4 as a productive complex (E•DNAn^P) similar to control DNA. In comparison, the small fast phases with both 20/26-mer-dGAP and 21/26-mer-dGAP occurred with reaction rates (k1, Table 5.5) similar to the single-turnover rates observed with the control DNA substrates (Table 5.4) and likely represented the same kinetic process. Thus, small percentages (A1) of 20/26-mer-dGAP and 21/26-mer-dGAP were bound productively by Dpo4 (E•DNAn^P) in the fast phase. In the slow phase, large percentages (A2) of these two damaged substrates must be bound in a less catalytically competent mode by Dpo4 (E•DNAn^N), as they were elongated with much slower rates (k2) than in the fast phase (k1). Moreover, the elongation of E•DNAn^N occurred in a single binding event, suggesting a slow conversion of E•DNAn^N to E•DNAn^P (k2) prior to a rapid extension (k1). The total reaction amplitudes observed with both 20/26-mer-dGAP (66.7%) and 21/26-mer-dGAP (22%) were much smaller than the highest possible amplitudes (~90%) observed with 19/26-mer-dGAP and control DNA substrates (Table 5.5). The kinetic partitioning between the dissociation of E•DNAn^N and the conversion from E•DNAn^N to E•DNAn^P led to the reduction of A2 by a factor of k2/(k2 + koff), where koff is the rate of DNA dissociation from Dpo4•DNA. The koff of D-1 (Table 5.1) has been previously determined to be 0.02 s⁻¹ [65]. Because both 20/26-mer-dGAP and 21/26-mer-dGAP bound to Dpo4 with tighter or similar affinities as control DNA (Table 5.2), we assumed their koff to be 0.02 s⁻¹. On the basis of the koff, the k2, A1, and A2 values (Table 5.5), and the above factor, we further estimated that 62% of 20/26-mer-dGAP and 31% of 21/26-mer-dGAP are in the form of E•DNAn^N, and that 21.3% of 20/26-mer-dGAP and 56% of 21/26-mer-dGAP were never elongated. Moreover, only 1% of 20/26-mer-dGAP and 4.7% of 21/26-mer-dGAP were calculated to be free in solution based on Kd, DNA values in Table 5.2 and Dpo4 and DNA concentrations in Figure 5.6. Thus, significant amounts of 20/26-mer-dGAP and 21/26-mer-dGAP bound by Dpo4 were catalytically incompetent (E•DNAn^D). Together, these DNA trap experiments suggest a kinetic mechanism for bypassing dGAP as shown in Figure 5.7.
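The partitioning estimate above can be reproduced with a short calculation. The sketch below is illustrative only: it takes A1, A2 and k2 from Table 5.5, uses the assumed koff of 0.02 s⁻¹, and treats the ~90% maximum amplitude as the ceiling when computing the never-elongated fraction. That last step is one reading of the text that approximately reproduces the quoted values, not a documented procedure, and the function and variable names are hypothetical.

```python
def nonproductive_percent(a2_percent, k2, koff=0.02):
    """Percent of substrate initially bound in the non-productive (N) form.

    The observed slow-phase amplitude A2 is reduced by the partition
    factor k2 / (k2 + koff), so the initial N-form population is A2
    divided by that factor.
    """
    partition = k2 / (k2 + koff)
    return a2_percent / partition


# (A1 in %, A2 in %, k2 in s^-1), taken from Table 5.5
substrates = {
    "20/26-mer-dGAP": (6.7, 60.0, 0.7),
    "21/26-mer-dGAP": (3.0, 19.0, 0.031),
}

# Assumption: the ~90% amplitude of the control substrates is the ceiling.
MAX_AMPLITUDE = 90.0

results = {}
for name, (a1, a2, k2) in substrates.items():
    n_form = nonproductive_percent(a2, k2)
    never_elongated = MAX_AMPLITUDE - a1 - n_form
    results[name] = (n_form, never_elongated)
    print(f"{name}: N-form = {n_form:.0f}%, never elongated = {never_elongated:.1f}%")
```

Under these assumptions the N-form percentages round to the 62% and 31% quoted in the text, and the never-elongated fractions land within a few tenths of a percent of the quoted 21.3% and 56%. The small differences come from rounding the intermediate values.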

There is structural evidence to support this lesion bypass mechanism. For example, the combined NMR-molecular mechanics computational studies reveal that the 1-AP moiety of an embedded dGAP:dC base pair in an 11-mer duplex is only intercalated into the DNA helix between adjacent Watson-Crick base pairs [198]. Moreover, the sugar of the modified dG has a syn glycosidic torsion angle while both bases of the dGAP:dC base pair are displaced into the major groove [198]. If dGAP at the pause sites possesses the same conformation, then 1-AP will occupy the position of the incoming dNTP, thereby blocking catalysis. Such binary complexes would not be elongated without undergoing dramatic structural changes and likely represent the form of E•DNAn^D.

Interestingly, an energy minimization study suggests the presence of other conformers in which 1-AP is either quasi-intercalative or externally bound [196]. These minor conformers could be stabilized by the interactions between damaged DNA and the active site residues of Dpo4, as has been observed in comparisons of structures of PAHs without and with DNA polymerases [116]. The existence of these conformers is supported by a molecular modeling and simulation study [53] and two X-ray crystallographic studies [40, 109]. In the first study, the AAF of damaged DNA is situated in the Dpo4 major groove open pocket, and an anti glycosidic torsion with C1'-exo deoxyribose conformation allows AAF-dG to be Watson-Crick paired to dCTP with modest polymerase perturbation [53]. Such a conformation would hinder translocation of Dpo4's Little Finger (LF), based on a series of crystal structures of Dpo4 during DNA polymerization [3, 50, 60]. In the crystal ternary complexes of Dpo4, DNA containing BPDE, and a dNTP [40, 109], there are two conformations of the BPDE, one intercalated between base pairs and another flipped out of the DNA helix into a structural gap between the LF and core domains. Additionally, the distance between the 3′-OH of the primer and the α-phosphate of the incoming dNTP is 9.0 Å if the bulky BPDE is in the former conformation and 3.9 Å if it is in the latter conformation. The latter distance is close to 3.4 Å, the optimum catalytic distance [109, 199]. Thus, if 1-AP is flipped out like the aforementioned AAF and BPDE at the Dpo4 active site, those molecules of Dpo4•20/26-mer-dGAP will be in the form of E•DNAn^P and be rapidly elongated to Dpo4•21/26-mer. If 1-AP is in the quasi-intercalative conformation [196], the binary complex Dpo4•20/26-mer-dGAP requires subtle to mild structural changes for efficient catalysis and is likely in the form of E•DNAn^N. To verify these structural speculations, we are currently attempting to solve the X-ray crystal structure of Dpo4•DNA-dGAP.

Potential Origin of Enhanced DNA Binding. Notably, Table 5.2 shows that the binding of Dpo4•20/26-mer-dGAP is about 4-fold tighter than Dpo4•21/26-mer-dGAP, and Dpo4•20/26-mer is the only binary complex affected by 1-AP. This suggests that the 1-AP in Dpo4•20/26-mer-dGAP likely interacted with the residues in the LF domain of Dpo4, as depicted in structures of Dpo4 with BPDE-dG [40]. The binding effect of 1-AP is surprising because DNA lesions usually distort DNA structure and weaken the binding of DNA to a DNA polymerase [59, 108, 200]. The tighter binding caused by the interactions between 1-AP and Dpo4 likely promoted catalysis, as 20/26-mer-dGAP was elongated with ~150-fold higher efficiency than 21/26-mer-dGAP (Table 5.3). A similar tight binding effect has been observed with human Polε and a 3-ring PAH [201].

Kinetic Effect of dGAP on dNTP Incorporation at Adjacent Sites. Interestingly, Table 5.3 shows that dGAP did not kinetically affect dNTP incorporations at any downstream positions of the pause sites. In contrast, both an abasic site and a cisplatin-dGpG adduct kinetically affected six to seven downstream dNTP incorporations during TLS catalyzed by Dpo4 [59, 108]. These downstream effects have also been observed for the of E. coli DNA polymerase I replicating through AAF-dG and 8-oxo-dG lesions [202]. Such a difference may be a reflection of how each lesion distorts the DNA structure within a DNA polymerase active site.

General Kinetic Mechanism for DNA Lesion Bypass. Similar biphasic kinetics of dNTP incorporation at pause sites has been observed in the bypass of an abasic site catalyzed by Dpo4 [59] as well as a cisplatin-dGpG adduct catalyzed by Dpo4 [108] and HIV-1 reverse transcriptase [197]. Like the bypass of dGAP, the total reaction amplitude in each of these cases is much less than the reaction amplitude obtained with either control DNA or a DNA substrate at a non-pause site, indicating the existence of E•DNAn^D. Some of these dead-end binary complexes can bind a nucleotide and form dead-end ternary complexes. Previously, such dead-end ternary complexes have been proposed to form when HIV-1 reverse transcriptase and T7 DNA polymerase encounter N2-methylguanine [203], O6-benzylguanine [204] and O6-methylguanine [204]. Since replicative DNA polymerases have more stringent active sites, the conversion of E•DNAn^N to E•DNAn^P should be extremely difficult. Thus, the majority of E•DNAn likely exists in the form of E•DNAn^D. For example, the exonuclease-deficient T7 DNA polymerase inefficiently bypasses a cisplatin-dGpG adduct with total reaction amplitudes below 5% at each of the three consecutive pause sites [197]. Thus, Figure 5.7 is a general mechanism for DNA lesion bypass catalyzed by several DNA polymerases.

In summary, Dpo4 was shown to be capable of traversing a model single-base lesion dGAP in a kinetically inefficient and error-prone manner. The extension step rather than the insertion opposite dGAP was more challenging. A kinetic mechanism for the dGAP lesion bypass was established via pre-steady-state kinetic analysis.


5.5 Figures

Figure 5.1. Running start assays.

A preincubated solution of Dpo4 (100 nM) and 5′-[32P]-labeled DNA (100 nM) was rapidly mixed with all four dNTPs (200 µM each), and the reaction was quenched with 0.37 M EDTA at various time intervals. (A) 17/26-mer; (B) 17/26-mer-dGAP. Sizes of important products are indicated, and the 21st position marks the location of the dGAP lesion from the 3′-terminus of the DNA template.

[Figure 5.2 image: gel image and plot of [Complex] (nM) versus [Dpo4] (nM)]

Figure 5.2. Measurement of Kd, DNA at the first pause site.

Various amounts of Dpo4 (0.5–80 nM) were titrated into a solution containing 5′-[32P]-labeled 20/26-mer-dGAP (5 nM). The binary complex of Dpo4•DNA was separated from free DNA by native PAGE. (A) Gel image of the titration; (B) the plot of the binary complex's concentration versus the total concentration of Dpo4. The data were fit to Eq 1 (Materials and Methods), which yielded a Kd, DNA of 1.0 ± 0.1 nM.

[Figure 5.3 image: time courses and plot of kobs (s⁻¹) versus [dCTP] (µM)]

Figure 5.3. Kinetics of dCTP incorporation into 22/26-mer-dGAP

(A) A preincubated solution of Dpo4 (120 nM) and 5′-[32P]-labeled 22/26-mer-dGAP (30 nM) was rapidly mixed with increasing concentrations of dCTP (25 µM, ■; 50 µM, ○; 100 µM, ▲; 250 µM, ◊; 500 µM, ●; 1000 µM, □; 1500 µM, ♦) for various time intervals. Each time course was fitted to Eq 2 to yield a kobs; (B) the plot of kobs values against dCTP concentrations was fit to Eq 3 to produce a kp of 6.3 ± 0.3 s⁻¹ and a Kd, dCTP of 682 ± 80 µM.


Figure 5.4. Quantitative effects of a dGAP lesion on correct dNTP incorporation catalyzed by Dpo4.

Extracted kinetic parameters from Table 5.3 were plotted against DNA substrates. (A) the dNTP incorporation efficiency ratio; (B) the kp ratio; and (C) the Kd, dNTP ratio.

[Figure 5.4, panels B and C: kp ratio (normal/damaged) and Kd, dNTP ratio (damaged/normal), each plotted against the DNA primer (19-mer to 23-mer)]

Figure 5.5. Effectiveness of the DNA trap for biphasic kinetic assays.

A preincubated solution of Dpo4 (120 nM), 5′-[32P]-labeled 21/26-mer-dGAP (30 nM) and DNA trap (5 µM, 21/41-mer D-1) was rapidly mixed with dGTP (1.2 mM) and quenched at various time intervals with 0.37 M EDTA. The products were resolved on a 17% polyacrylamide gel with 8 M urea. The autoradiographed gel image revealed minimal product formation (22-mer) after 264 s. Thus, a molar ratio of 167:1 for the D-1 DNA trap to the radiolabeled damaged DNA substrate was effective at sequestering free Dpo4 that dissociated from the 5′-[32P]-labeled 21/26-mer-dGAP.

[Figure 5.6 image: plot of [Product] (nM) versus Time (s)]

Figure 5.6. Biphasic kinetics of correct dNTP incorporation in the presence of a DNA trap.

A preincubated solution of Dpo4 (120 nM) and 5′-[32P]-labeled 20/26-mer-dGAP (■, 30 nM) or 21/26-mer-dGAP (●, 30 nM) was mixed rapidly with 21/41-mer D-1 (5 µM) and dCTP (■, 1.2 mM), or dGTP (●, 1.2 mM). The reaction was quenched with 0.37 M EDTA after various times. The product concentration was plotted as a function of reaction time for each DNA substrate, which was then fit to Eq 4. For 20/26-mer-dGAP, the fast phase had a reaction amplitude of 2.0 ± 0.7 nM and a reaction rate of 11 ± 6 s⁻¹, while the slow phase had a reaction amplitude of 18 ± 1 nM and a reaction rate of 0.7 ± 0.1 s⁻¹. For 21/26-mer-dGAP, the fast phase had a reaction amplitude of 0.9 ± 0.1 nM and a reaction rate of 1.9 ± 0.3 s⁻¹, while the slow phase had a reaction amplitude of 5.7 ± 0.2 nM and a reaction rate of 0.031 ± 0.003 s⁻¹.

[Figure 5.6 inset: plot of [Product] (nM) versus Time (s) over the first 2 s]

Figure 5.7. Proposed kinetic mechanism for the bypass of dGAP catalyzed by Dpo4.

The kinetic parameters of the observed fast phase (A1, k1) and slow phase (A2, k2) in the presence of a DNA trap are labeled. E, polymerase; DNAn, DNA substrate; E•DNAn^D, dead-end binary complex; E•DNAn^N, non-productive binary complex; E•DNAn^P, productive binary complex; DNAn+1, DNA product extended by one base; and PPi, pyrophosphate.

5.6 Tables

Primers
  17-mer             5′-AACGACGGCCAGTGAAT-3′
  19-mer             5′-AACGACGGCCAGTGAATTC-3′
  20-mer             5′-AACGACGGCCAGTGAATTCG-3′
  21-mer             5′-AACGACGGCCAGTGAATTCGC-3′
  22-mer             5′-AACGACGGCCAGTGAATTCGCG-3′
  23-mer             5′-AACGACGGCCAGTGAATTCGCGC-3′
Templates
  26-mer             3′-TTGCTGCCGGTCACTTAAGCGCGCCC-5′
  26-mer-dGAP^a      3′-TTGCTGCCGGTCACTTAAGCGCGCCC-5′
DNA Trap
  D-1 (21/41-mer)    5′-CGCAGCCGTCCAACCAACTCA-3′
                     3′-GCGTCGGCAGGTTGGTTGAGTAGCAGCTAGGTTACGGCAGG-5′

^a G designates N-(deoxyguanosin-8-yl)-1-aminopyrene (dGAP).

Table 5.1. Sequences of DNA oligonucleotides.

DNA Substrate    Damaged DNA^a (nM)    Control DNA^b (nM)    Affinity Ratio^c
19/26-mer        2.7 ± 0.2             3.1 ± 0.5             1.2
20/26-mer        1.0 ± 0.1             4.0 ± 0.2             4.0
21/26-mer        4.4 ± 0.2             3.7 ± 0.2             0.8
22/26-mer        2.6 ± 0.2             3.8 ± 0.6             1.5
23/26-mer        4.0 ± 0.3             3.6 ± 0.5             0.9

^a Damaged DNA refers to those with template 26-mer-dGAP in Table 5.1.
^b Control DNA refers to those with template 26-mer in Table 5.1.
^c Values were calculated as (Kd, DNA)control/(Kd, DNA)damaged.

Table 5.2. Binding affinity of Dpo4 to damaged and control DNA substrates at 23 °C.

dNTP    Kd, dNTP (µM)    kp (s⁻¹)               (kp/Kd, dNTP)damaged (µM⁻¹ s⁻¹)    Efficiency Ratio^a,b    Fidelity^c     Probability^d

19/26-mer-dGAP
dGTP    187 ± 36         4.3 ± 0.3              2.3 × 10⁻²                         1.1                     -              99.8
dATP    373 ± 81         (6.3 ± 0.5) × 10⁻³     1.7 × 10⁻⁵                         0.8                     7.4 × 10⁻⁵     0.1
dCTP    227 ± 23         (6.4 ± 0.2) × 10⁻³     2.8 × 10⁻⁵                         5                       1.2 × 10⁻³     0.1
dTTP    1180 ± 190       (1.0 ± 0.1) × 10⁻²     8.7 × 10⁻⁶                         1.0                     3.8 × 10⁻⁴     0.0

*20/26-mer-dGAP
dCTP    167 ± 15         1.03 ± 0.03            6.2 × 10⁻³                         9.2                     -              98.4
dATP    856 ± 184        (7.6 ± 0.8) × 10⁻²     8.9 × 10⁻⁵                         0.2                     1.4 × 10⁻²     1.4
dGTP    955 ± 160        (2.0 ± 0.2) × 10⁻³     2.1 × 10⁻⁶                         21                      3.3 × 10⁻⁴     0.0
dTTP    557 ± 36         (5.5 ± 0.1) × 10⁻³     9.9 × 10⁻⁶                         8.4                     1.6 × 10⁻³     0.2

*21/26-mer-dGAP
dGTP    674 ± 231        (2.8 ± 0.3) × 10⁻²     4.2 × 10⁻⁵                         88                      -              88.6
dATP    886 ± 145        (2.7 ± 0.2) × 10⁻³     3.1 × 10⁻⁶                         0.8                     6.8 × 10⁻²     6.5
dCTP    328 ± 79         (6.8 ± 0.5) × 10⁻⁴     2.1 × 10⁻⁶                         15                      4.7 × 10⁻²     4.4
dTTP    2300 ± 461       (4.9 ± 0.7) × 10⁻⁴     2.2 × 10⁻⁷                         8.2                     5.1 × 10⁻³     0.5

22/26-mer-dGAP
dCTP    682 ± 80         6.3 ± 0.3              9.3 × 10⁻³                         3.7                     -              99.8
dATP    826 ± 85         (4.2 ± 0.2) × 10⁻³     5.0 × 10⁻⁶                         0.8                     5.4 × 10⁻⁴     0.1
dGTP    502 ± 98         (4.3 ± 0.3) × 10⁻³     8.6 × 10⁻⁶                         0.8                     9.3 × 10⁻⁴     0.1
dTTP    1540 ± 257       (4.7 ± 0.5) × 10⁻³     3.1 × 10⁻⁶                         3.6                     3.3 × 10⁻⁴     0.0

23/26-mer-dGAP
dGTP    62 ± 18          1.9 ± 0.1              3.1 × 10⁻²                         0.8                     -              98.9
dATP    668 ± 173        (3.7 ± 0.4) × 10⁻²     5.5 × 10⁻⁵                         0.3                     1.8 × 10⁻³     0.2
dCTP    1130 ± 161       (9.0 ± 0.7) × 10⁻³     7.9 × 10⁻⁶                         0.2                     2.6 × 10⁻⁴     0.0
dTTP    1290 ± 218       (3.4 ± 0.3) × 10⁻²     2.7 × 10⁻⁴                         0.01                    8.7 × 10⁻⁴     0.9

^a Calculated as (kp/Kd, dNTP)control/(kp/Kd, dNTP)damaged.
^b Values for (kp/Kd, dNTP)control are listed in Table 5.4.
^c Calculated as (kp/Kd, incorrect dNTP)damaged/[(kp/Kd, correct dNTP)damaged + (kp/Kd, incorrect dNTP)damaged].
^d Calculated as {(kp/Kd, dNTP)damaged/[Σ(kp/Kd, dNTP)damaged]} × 100.
*Denotes pause sites.

Table 5.3. Kinetic parameters for single dNTP incorporation opposite template 26-mer-dGAP.
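The probability column of Table 5.3 (footnote d) can be checked from the listed efficiencies. The following is a minimal sketch using the four (kp/Kd, dNTP)damaged values reported for 21/26-mer-dGAP; the function name is hypothetical and the inputs are the rounded table values, so other rows may differ in the last digit.

```python
def incorporation_probabilities(efficiencies):
    """Percent probability of each dNTP being incorporated at one site.

    Each efficiency (kp/Kd) is divided by the sum of the efficiencies
    of all four dNTPs and multiplied by 100, following footnote d.
    """
    total = sum(efficiencies.values())
    return {dntp: 100.0 * eff / total for dntp, eff in efficiencies.items()}


# (kp/Kd, dNTP)damaged for 21/26-mer-dGAP, in uM^-1 s^-1 (Table 5.3)
eff_21_dgap = {"dGTP": 4.2e-5, "dATP": 3.1e-6, "dCTP": 2.1e-6, "dTTP": 2.2e-7}

probs = incorporation_probabilities(eff_21_dgap)
for dntp, percent in probs.items():
    print(f"{dntp}: {percent:.1f}")
```

With these inputs the four values round to the 88.6, 6.5, 4.4 and 0.5 listed for this substrate.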

dNTP    Kd, dNTP (µM)    kp (s⁻¹)                 kp/Kd, dNTP (µM⁻¹ s⁻¹)    Fidelity^a

19/26-mer
dGTP    183 ± 54         4.6 ± 0.4                2.5 × 10⁻²                -
dATP    731 ± 175        (1.0 ± 0.1) × 10⁻²       1.4 × 10⁻⁵                5.6 × 10⁻⁴
dCTP    350 ± 107        (5.0 ± 0.5) × 10⁻²       1.4 × 10⁻⁴                5.6 × 10⁻³
dTTP    1440 ± 305       (1.3 ± 0.2) × 10⁻²       8.9 × 10⁻⁶                3.6 × 10⁻⁴

20/26-mer
dCTP    205 ± 64         11.6 ± 0.9               5.7 × 10⁻²                -
dATP    631 ± 136        (1.2 ± 0.1) × 10⁻²       1.9 × 10⁻⁵                3.3 × 10⁻⁴
dGTP    77 ± 16          (3.5 ± 0.2) × 10⁻³       4.5 × 10⁻⁵                7.9 × 10⁻⁴
dTTP    489 ± 52         (4.0 ± 0.2) × 10⁻²       8.3 × 10⁻⁵                1.5 × 10⁻³

21/26-mer
dGTP    437 ± 19         1.62 ± 0.03              3.7 × 10⁻³                -
dATP    859 ± 168        (2.0 ± 0.2) × 10⁻³       2.4 × 10⁻⁶                6.5 × 10⁻⁴
dCTP    701 ± 30         (2.14 ± 0.04) × 10⁻²     3.1 × 10⁻⁵                8.3 × 10⁻³
dTTP    1180 ± 144       (2.1 ± 0.2) × 10⁻³       1.8 × 10⁻⁶                4.9 × 10⁻⁴

22/26-mer
dCTP    129 ± 19         4.4 ± 0.2                3.4 × 10⁻²                -
dATP    1313 ± 154       (5.5 ± 0.4) × 10⁻³       4.2 × 10⁻⁶                1.2 × 10⁻⁴
dGTP    567 ± 95         (4.8 ± 0.4) × 10⁻³       8.5 × 10⁻⁶                2.5 × 10⁻⁴
dTTP    1340 ± 454       (1.5 ± 0.3) × 10⁻²       1.1 × 10⁻⁵                3.2 × 10⁻⁴

23/26-mer
dGTP    116 ± 24         2.8 ± 0.1                2.4 × 10⁻²                -
dATP    431 ± 25         (6.6 ± 0.1) × 10⁻³       1.5 × 10⁻⁵                6.2 × 10⁻⁴
dCTP    918 ± 102        (1.5 ± 0.1) × 10⁻³       1.6 × 10⁻⁶                6.7 × 10⁻⁵
dTTP    1220 ± 59        (3.8 ± 0.1) × 10⁻³       3.1 × 10⁻⁶                1.3 × 10⁻⁴

^a Calculated as (kp/Kd, dNTP)incorrect/[(kp/Kd, dNTP)correct + (kp/Kd, dNTP)incorrect].

Table 5.4. Kinetic parameters of dNTP incorporation into undamaged DNA.

DNA Substrate     Correct dNTP    A1 (nM)               k1 (s⁻¹)     A2 (nM)             k2 (s⁻¹)
19/26-mer         dGTP            27 ± 1 (91%)*         3.2 ± 0.4
20/26-mer         dCTP            26 ± 1 (86%)*         4.6 ± 0.5
21/26-mer         dGTP            27.0 ± 0.4 (88%)*     2.3 ± 0.1
19/26-mer-dGAP    dGTP            27.0 ± 0.4 (89%)*     2.5 ± 0.2
20/26-mer-dGAP    dCTP            2.0 ± 0.7 (6.7%)*     11 ± 6       18 ± 1 (60%)*       0.7 ± 0.1
21/26-mer-dGAP    dGTP            0.9 ± 0.1 (3%)*       1.9 ± 0.3    5.7 ± 0.2 (19%)*    0.031 ± 0.003

*Calculated as (reaction amplitude/30 nM) × 100.

Table 5.5. Biphasic kinetic parameters for correct dNTP incorporation into 5′-[32P]-labeled DNA (30 nM) in the presence of a DNA trap (5 µM) at 37 °C.

5.7 Scheme

Scheme 5.1. Formation of N-(deoxyguanosin-8-yl)-1-aminopyrene (dGAP).


Chapter 6: Kinetic Analysis of the Bypass of a Bulky Lesion Catalyzed by Human Y-Family DNA Polymerases

6.1 Introduction

DNA lesions can arise from both endogenous and exogenous sources. If these DNA lesions are not repaired by cellular DNA repair machinery, they will stall replicative DNA polymerases [4, 39, 180, 181]. To rescue stalled DNA replication, cellular DNA repair machinery may temporarily switch to Y-family DNA polymerases that function primarily in the bypass of DNA lesions. Out of the 16 identified human DNA polymerases, four enzymes belong to the Y-family: DNA polymerase eta (hPolη), kappa (hPolκ), iota (hPolι), and Rev1 (hRev1). These enzymes have been shown to catalyze both error-free and error-prone translesion DNA synthesis (TLS) in vitro and in vivo [75, 154, 155]. For example, hPolη catalyzes error-free bypass of cis-syn cyclobutane thymidine dimer (cis-syn TT) in vivo [34, 68] and its inactivation by mutations leads to Xeroderma Pigmentosum variant (XPV) disease that increases incidence of sunlight-induced skin cancer [34, 156]. Biochemically, hPolη has been shown to bypass DNA lesions including abasic sites [62, 82, 205], N-2-acetylaminofluorene-deoxyguanosine (AAF-dG) [62] and benzo[a]pyrene-N2-deoxyguanosine (BPDE-dG) [82]. hPolκ, a homolog of the well-studied Y-family member Sulfolobus solfataricus DNA Polymerase IV (Dpo4), is able to bypass abasic sites [205], AAF-dG and BPDE-dG [99] and to efficiently elongate mispaired primer termini [100]. hPolι has been shown to bypass abasic sites [205], cis-syn TT [87] and AAF-dG [85]. hRev1 preferentially incorporates dCTP opposite any template base [169] and DNA lesions, including abasic sites [205] and BPDE-dG [32]. These biochemical studies demonstrate that there is a significant overlap in lesion bypass abilities of the four human Y-family DNA polymerases. At present, it is unclear which human Y-family enzyme bypasses specific DNA lesion(s) in vivo.

Despite growing concerns about the effects of air pollution on human health, there are no comprehensive studies of the bypass of air pollutant-induced DNA adducts catalyzed by human Y-family DNA polymerases. The metabolites of 1-nitropyrene (1-NP), a product of incomplete gasoline combustion and one of the most abundant polycyclic aromatic hydrocarbons (PAH) [177, 184, 185], mostly react with DNA to form N-(deoxyguanosin-8-yl)-1-aminopyrene (dGAP) (Figure 6.1) [185]. 1-NP is a potent mutagen and a carcinogen in rodents [189], and is classified as a class 2B carcinogen [186, 188]. Moreover, the dGAP lesion has been shown to be mutagenic in bacterial and mammalian cells [186, 188]. In a previous study, we showed that nucleotide incorporation opposite dGAP and the first extension of the dGAP bypass product catalyzed by Dpo4 are significantly altered kinetically [57]. We also elucidated the minimum kinetic dGAP bypass mechanism utilized by Dpo4 using transient kinetic methods [57]. For the current study, we kinetically assessed the dGAP bypass abilities of the four human Y-family DNA polymerases to shed light on which human enzyme bypasses dGAP in vivo.

6.2 Material and Methods

Materials. Reagents were purchased from the following companies: OptiKinase from USB Corporation, [γ-32P]ATP from MP Biomedicals, and dNTPs from GE Healthcare. hPolη, hRev1, and hPol were expressed and purified as previously described [205]. hPol was expressed and purified as previously described with the following modification: the GST-tag was not removed in order to increase protein stability [205].

DNA Substrates. The DNA template 26-mer-dGAP (Table 6.1) was synthesized and purified as previously described [57]. Other DNA substrates listed in Table 6.1 were purchased from Integrated DNA Technologies. The radiolabeling of the primers and the annealing of a primer to a template were performed as described previously [57].

Buffers. All pre-steady-state kinetic assays were performed in reaction buffer R (50 mM

HEPES, pH 7.5 at 37 °C, 5 mM MgCl2, 50 mM NaCl, 0.1 mM EDTA, 5 mM DTT, 10 % glycerol, and 0.1 mg/ml BSA). All electrophoresis mobility shift assays (EMSA) were performed in buffer S (50 mM Tris-Cl, pH 7.5 at 23 °C, 5 mM MgCl2, 50 mM NaCl, 5 mM DTT, 10 % glycerol, and 0.1 mg/ml BSA). All listed concentrations were final after mixing solutions.

Running Start Assays. The running start assay was performed similarly to previous studies [57, 58, 108, 205, 206]. Briefly, a preincubated solution of 5′-[32P]-labeled DNA (100 nM) and a single human Y-family enzyme (1 µM) in buffer R was rapidly mixed with a solution containing all four dNTPs (200 µM each) at 37 °C via a rapid chemical-quench flow apparatus (KinTek). The reaction was quenched with 0.37 M EDTA after various times, and the reaction products were analyzed by denaturing polyacrylamide gel electrophoresis (PAGE, 20% polyacrylamide, 8 M urea).

Electrophoresis Mobility Shift Assays. hPolη (10-400 nM) or hPol (65-950 nM) or hRev1 (15-550 nM) was titrated into a solution containing 5′-[32P]-DNA (10 nM) in buffer S at 23 °C. Native PAGE was run to separate the enzyme•DNA binary complex from free DNA. After quantitation using a Typhoon Trio (GE Healthcare), the concentration of the binary complex was plotted as a function of the enzyme concentration. The data were fit to Eq 1 using Kaleidagraph (Synergy Software):

$$[\mathrm{E{\bullet}DNA}] = 0.5\,(K_{d,\,\mathrm{DNA}} + E_o + D_o) - 0.5\,[(K_{d,\,\mathrm{DNA}} + E_o + D_o)^2 - 4E_oD_o]^{1/2} \qquad \text{(Eq 1)}$$

where Eo and Do are the initial enzyme concentration and DNA concentration, respectively, and Kd, DNA is the equilibrium dissociation constant for E•DNA at 23 °C.
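As an illustration of how Eq 1 behaves across a titration, the sketch below evaluates the quadratic binding expression directly. The Kd of 50 nM and the enzyme concentrations are hypothetical inputs chosen only to show the shape of the curve, not measured values; the 10 nM DNA concentration matches the EMSA conditions above. The function name is also an assumption.

```python
import math


def binary_complex(kd, e0, d0):
    """Equilibrium concentration of the E-DNA complex from Eq 1 (nM).

    kd: equilibrium dissociation constant; e0, d0: initial enzyme and
    DNA concentrations. All three must share the same units.
    """
    total = kd + e0 + d0
    return 0.5 * total - 0.5 * math.sqrt(total**2 - 4.0 * e0 * d0)


KD_ASSUMED = 50.0  # nM, hypothetical value for illustration
D0 = 10.0          # nM labeled DNA, as in the EMSA

titration = {e0: binary_complex(KD_ASSUMED, e0, D0) for e0 in (10.0, 50.0, 100.0, 400.0)}

for e0, complex_nm in titration.items():
    print(f"E0 = {e0:5.0f} nM -> [E-DNA] = {complex_nm:.2f} nM")
```

The returned concentration never exceeds the limiting component (here the DNA), and it satisfies the mass-action relation (Eo − [E•DNA])(Do − [E•DNA])/[E•DNA] = Kd, which is a convenient sanity check when fitting real titration data.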

Active Site Titration Assays. A preincubated solution of 5′-[32P]-DNA (10-450 nM) and hPol (27.5 nM) was reacted with correct dNTP at saturating concentrations at 37 °C prior to being quenched with EDTA (0.37 M) after various time intervals. The reaction products were resolved and quantitated as described above. The burst reaction amplitude (A) was plotted as a function of DNA concentration and the data were then fit to Eq 2 to determine Kd, DNA.

$$A = 0.5\,(K_{d,\,\mathrm{DNA}} + [\mathrm{hPol}] + D_o) - 0.5\,[(K_{d,\,\mathrm{DNA}} + [\mathrm{hPol}] + D_o)^2 - 4[\mathrm{hPol}]D_o]^{1/2} \qquad \text{(Eq 2)}$$

Nucleotide Incorporation Efficiency and Fidelity Measurements. Single-turnover nucleotide (dNTP) incorporation assays were employed to obtain the kp (maximum dNTP incorporation rate) and Kd, dNTP (equilibrium dissociation constant for dNTP from enzyme•DNA•dNTP) as previously described [27, 57]. Briefly, a preincubated solution of 5′-[32P]-labeled DNA (20 nM) and hPolη (130 nM) or hRev1 (130 nM) or hPol (130 nM) in buffer R was mixed with increasing concentrations of a dNTP. For hPol, a preincubated solution of 5′-[32P]-labeled DNA (30 nM) and hPol (300 nM) in buffer R was mixed with increasing concentrations of a dNTP. The reactions were terminated after various time intervals by the addition of 0.37 M EDTA. Reaction products were resolved and quantitated as described above. The time course of product formation at each dNTP concentration was fit to Eq 3:

$$[\mathrm{Product}] = A\,[1 - \exp(-k_{obs}t)] \qquad \text{(Eq 3)}$$

where kobs is the observed reaction rate constant. Next, the plot of the kobs as a function of dNTP concentrations was fit to Eq 4:

$$k_{obs} = \frac{k_p[\mathrm{dNTP}]}{[\mathrm{dNTP}] + K_{d,\,\mathrm{dNTP}}} \qquad \text{(Eq 4)}$$

Next, the dNTP incorporation efficiency (kp/Kd, dNTP) was calculated for each enzyme. The dNTP incorporation fidelity was determined using Eq 5:

$$\mathrm{Fidelity} = \frac{(k_p/K_{d,\,\mathrm{dNTP}})_{incorrect}}{(k_p/K_{d,\,\mathrm{dNTP}})_{correct} + (k_p/K_{d,\,\mathrm{dNTP}})_{incorrect}} \qquad \text{(Eq 5)}$$
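The chain from Eq 4 to Eq 5 can be sketched in a few lines. The example below uses the Dpo4 values for 19/26-mer from Table 5.4 (dGTP as the correct and dATP as the incorrect nucleotide) purely as illustrative inputs; the function names are hypothetical. Because the inputs are rounded table values, the computed fidelity can differ slightly from the tabulated one.

```python
def k_obs(kp, kd_dntp, dntp):
    """Observed rate constant at a given dNTP concentration (Eq 4)."""
    return kp * dntp / (dntp + kd_dntp)


def efficiency(kp, kd_dntp):
    """dNTP incorporation efficiency, kp/Kd."""
    return kp / kd_dntp


def fidelity(eff_correct, eff_incorrect):
    """Fidelity as defined in Eq 5 (a misincorporation frequency)."""
    return eff_incorrect / (eff_correct + eff_incorrect)


# Illustrative inputs from Table 5.4, 19/26-mer (kp in s^-1, Kd in uM)
eff_correct = efficiency(kp=4.6, kd_dntp=183.0)       # dGTP
eff_incorrect = efficiency(kp=1.0e-2, kd_dntp=731.0)  # dATP

fid = fidelity(eff_correct, eff_incorrect)
half_max = k_obs(kp=4.6, kd_dntp=183.0, dntp=183.0)   # kobs at [dNTP] = Kd

print(f"efficiency (correct)   = {eff_correct:.2e} uM^-1 s^-1")
print(f"efficiency (incorrect) = {eff_incorrect:.2e} uM^-1 s^-1")
print(f"fidelity               = {fid:.1e}")
print(f"kobs at [dNTP] = Kd    = {half_max:.2f} s^-1")
```

Evaluating Eq 4 at a dNTP concentration equal to Kd, dNTP returns kp/2, which is a quick check that a fitted pair of parameters is consistent with the measured kobs values.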

Biphasic Kinetic Assays. A preincubated solution of hPolη (130 nM) and 5′-[32P]-labeled DNA (20 nM) in buffer R was rapidly mixed with a solution of 5 µM unlabeled DNA trap D-1 (Table 6.1) and correct dNTP at saturating concentrations in buffer R for various times before the reactions were quenched with 0.37 M EDTA. Reaction products were resolved and quantitated as described above. The plot of the product concentration as a function of reaction time was fit to Eq 6:

$$[\mathrm{Product}] = E_oA_1[1 - \exp(-k_1t)] + E_oA_2[1 - \exp(-k_2t)] \qquad \text{(Eq 6)}$$

where Eo is the active hPolη concentration, A1 and A2 are the reaction amplitudes of the first and second phases, respectively, and k1 and k2 are the rate constants of the first and second phases, respectively.
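To make the shape of Eq 6 concrete, the following sketch evaluates the double-exponential expression at a few time points. All parameter values are hypothetical and chosen only for illustration; in particular, A1 and A2 are treated here as fractional amplitudes multiplied by Eo, which is an assumption about how the equation is applied.

```python
import math


def biphasic_product(t, e0, a1, k1, a2, k2):
    """Product concentration at time t for a two-phase time course (Eq 6)."""
    fast = e0 * a1 * (1.0 - math.exp(-k1 * t))
    slow = e0 * a2 * (1.0 - math.exp(-k2 * t))
    return fast + slow


# Hypothetical parameters: e0 in nM, a1 and a2 as fractions, k1 and k2 in s^-1
params = dict(e0=20.0, a1=0.1, k1=5.0, a2=0.5, k2=0.05)

times = [0.0, 0.5, 2.0, 20.0, 120.0]
curve = [biphasic_product(t, **params) for t in times]

for t, product in zip(times, curve):
    print(f"t = {t:6.1f} s -> [Product] = {product:.2f} nM")
```

The fast phase is essentially complete within a few multiples of 1/k1, after which the curve is dominated by the slow phase and approaches the plateau Eo(A1 + A2).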

6.3 Results

dGAP Bypass Catalyzed by Human Y-Family DNA Polymerases. To determine if human Y-family DNA polymerases were able to bypass dGAP (Figure 6.1), we performed a running start assay for each enzyme in the presence and absence of dGAP. The gel images are shown in Figure 6.2. All human enzymes except for hRev1 were able to synthesize full-length products from DNA template 26-mer (Table 6.1) within 300 s (Figures 6.2A, 6.2C, 6.2E and 6.2G). Within 12 hrs, hRev1 was only able to incorporate five dNTPs (Figure 6.2G). Unlike hRev1, hPolη synthesized full-length products the most efficiently and within 3 s (Figure 6.2A), while hPol and hPol created full-length products within 15 s and 300 s, respectively. Likewise, Figure 6.2B shows that hPolη did not significantly pause while creating the full-length dGAP bypass products within 3 s. In contrast, hPol and hPol created full-length TLS products within 300 s and 7,200 s, respectively (Figures 6.2D and 6.2F). During dGAP bypass catalyzed by either hPol or hPol, there were significant accumulations of 21-mer, indicating pausing immediately downstream of dGAP. Such accumulations of intermediates normally indicate that the kinetic parameters of DNA synthesis are altered [57, 108]. Markedly, hRev1 did not create full-length dGAP bypass products (Figure 6.2H). In addition, hRev1 was able to incorporate a dNTP opposite the lesion but did not extend the bypass product.

Effects of dGAP on DNA Binding Affinities. The accumulation of intermediates during the TLS of dGAP (Figure 6.2) suggested that the presence of the bulky 1-AP adduct may have weakened the DNA binding to each human Y-family DNA polymerase. Therefore, we investigated the effect of the 1-AP adduct on the DNA binding affinity for each human Y-family enzyme via gel electrophoresis mobility shift assays (EMSA). An example gel image for EMSA using hPolη and 20/26-mer-dGAP (Table 6.1) is shown in Figure 6.3. The binary complex concentration was plotted against the hPolη concentration (Figure 6.3), and was fit to Eq 1 in order to determine the ground-state equilibrium dissociation constant of the binary complex (Kd, DNA). This assay was repeated for DNA substrate 21/26-mer-dGAP and the corresponding control DNA substrates (Table 6.1). Additionally, the same EMSA series was repeated for hPol and hRev1 (Table 6.2). For hPol, there was no mobility shift observed during EMSA, presumably due to a fast dissociation rate of the binary complex (hPol•DNA). Consequently, an active site titration assay was performed to determine the Kd, DNA for hPol (Table 6.2). The order of DNA binding affinities (1/Kd, DNA) for human Y-family DNA polymerases using both normal and dGAP-containing DNA substrates was as follows: hPolη > hPol > hRev1 >> hPol. When binding to either hPolη or hPol, the 20/26-mer-dGAP bound approximately 3-fold tighter than the 20/26-mer. When binding to either hRev1 or hPol, there was no significant Kd, DNA change observed in the presence of dGAP. Since these changes alone could not explain the observed pausing during TLS in Figure 6.2, we suspected that the nucleotide (dNTP) incorporation efficiencies of each human Y-family enzyme must also have been kinetically affected by the presence of the 1-AP adduct.

dGAP Effects on the Kinetics of dNTP Incorporation. To determine the dNTP incorporation efficiency changes in the presence of dGAP, we performed single dNTP incorporation assays with each human Y-family DNA polymerase under single-turnover conditions. For example, a preincubated solution of 5′-[32P]-labeled DNA (20 nM) and hPolη (130 nM) at 37 °C was rapidly mixed with a single concentration of a single dNTP for various reaction times before being quenched with EDTA (0.37 M). Next, the plot of the product formation over time at each dNTP concentration was fit to Eq 3 to obtain a rate constant (kobs) for each dNTP concentration. The plot of the kobs versus dNTP concentrations was then fit to Eq 4 in order to obtain the apparent maximum dNTP incorporation rate (kp) and the apparent equilibrium dissociation constant (Kd, dNTP) opposite dGAP and immediately downstream of the bulky lesion site. With these kinetic parameters, we calculated the correct dNTP incorporation efficiency (kp/Kd, dNTP). To determine if the dNTP incorporation fidelity was affected within the vicinity of the bulky 1-AP adduct, we also performed single dNTP misincorporation assays under single-turnover conditions as previously described with both normal and dGAP-containing DNA substrates (Table 6.1), and calculated the corresponding fidelities using Eq 5.

hPolη. At both sites, the correct dNTP incorporation efficiency for hPolη dropped significantly, with dGAP affecting DNA synthesis more at the position one nucleotide downstream than at the lesion site (48-fold versus 20-fold; Tables 6.3 and 6.4). However, the efficiency change during correct dNTP incorporation into 20/26-mer-dGAP versus 21/26-mer-dGAP was only 1.5-fold (Table 6.4). Thus, the dNTP incorporation efficiency of hPolη was only slightly perturbed during the immediate extension of dGAP bypass products, which was consistent with the running start assay (Figure 6.2B). Likewise, the dNTP incorporation fidelity dropped significantly more at the first position immediately downstream of dGAP (up to 37-fold) than at the 1-AP adduct site (up to 15-fold). The most likely misincorporation opposite the dGAP lesion was dTTP whereas the most likely misincorporation into the 21/26-mer-dGAP was dCTP (Table 6.4). Remarkably, hPolη was more efficient at misincorporating dCTP than at correctly incorporating dGTP into 21/26-mer-dGAP. These data hinted that hPolη may have created a +1 or -1 frameshift using either an upstream or a downstream template dG.

188 hPol. Similarly, the dGTP incorporation efficiency decreased more opposite the first position downstream the dGAP site than the dCTP incorporation efficiency opposite dGAP

(208-fold versus 5.7-fold; Tables 6.5 and 6.6). Additionally, the catalytic efficiency change during correct dNTP incorporation into 20/26-mer-dGAP versus 21/26-mer-dGAP dropped 123-fold. This correlated to the DNA polymerase pausing observed at 21st position (Figure 6.2D). Furthermore, the dNTP incorporation fidelity significantly decreased more opposite the first position immediately downstream of dGAP (up to 94- fold) than opposite dGAP (up to 11-fold). The most likely dNTP misincorporation opposite both sites was dATP, with more probability of dATP misincorporation opposite the 21/26-mer-dGAP site than opposite 20/26-mer-dGAP. This data suggested that the dGAP bypass catalyzed by hPol may involve a mechanism similar to the „A-rule‟ [164].

hPol. The catalytic efficiency of dGTP incorporation into 20/26-mer was approximately equal to values previously determined [206]. Notably, the catalytic efficiency of dGTP incorporation into 21/26-mer-dGAP decreased more than the catalytic efficiency in dCTP incorporation opposite dGAP (4,545-fold versus 7-fold; Tables 6.7 and 6.8). Interestingly, the change in correct dNTP incorporation efficiency from opposite dGAP to +1 position dropped 1,045-fold (Table 6.8), indicating a strong kinetic pause opposite 21/26-mer- dGAP. Such strong DNA polymerase pausing at this site was observed in Figure 6.2F.

Furthermore, the dNTP incorporation fidelity significantly dropped more opposite 21/26- mer-dGAP (up to 5.5-fold) than opposite dGAP (up to 1.4-fold). hPol was not able to misincorporate dATP or dCTP (Table 6.8) during the extension of the dGAP bypass

189 products, suggesting dGAP was not well accommodated downstream of the hPol active site.

hRev1. Consistent with our running start assays (Figure 6.2H), hRev1 could not incorporate any dNTP into 21/26-mer-dGAP within 3 hrs. The dCTP incorporation efficiency opposite dGAP decreased only 22-fold (Tables 6.9 and 6.10). Interestingly, the dNTP incorporation fidelity slightly increased for dTTP and dGTP misincorporations. This increased fidelity arose because the correct dCTP incorporation efficiency opposite dGAP was roughly three orders of magnitude higher than the dNTP misincorporation efficiencies, whereas the correct dCTP incorporation efficiency opposite dG was approximately one to two orders of magnitude higher than the dNTP misincorporation efficiencies. These data implied that hRev1 became slightly more error-free while incorporating dNTP opposite dGAP than opposite dG.

All human Y-family DNA polymerases. The correct dCTP incorporation efficiencies opposite dGAP for both hPol and hRev1 were identical. Although dCTP bound to hRev1•20/26-mer-dGAP 51-fold tighter than to hPol•20/26-mer-dGAP, hRev1 incorporated dCTP 51-fold slower than hPol. For both 20/26-mer-dGAP and 21/26-mer-dGAP, hPolη was the quickest during correct dNTP incorporations. Opposite the bulky moiety, hPolη was 4.3-, 122- and 4.3-fold more efficient at incorporating dCTP than hPol, hPol and hRev1, respectively. With 21/26-mer-dGAP, hPolη was 340- and 81,818-fold more efficient at incorporating dGTP than hPol and hPol, respectively. Consistently for all human enzymes, the catalytic efficiency of dGTP incorporation into 21/26-mer-dGAP decreased more significantly than the catalytic efficiency of dCTP incorporation opposite dGAP (Tables 6.4, 6.6, 6.8, and 6.10). These data suggested that the immediate extension of dGAP bypass products was more kinetically hindered than the dNTP incorporation opposite dGAP, and was the dominant cause of the observed accumulation of 21-mers in Figure 6.2.

Interestingly, hRev1 was the most accurate during dNTP incorporation opposite dGAP, while hPol was the least accurate during dNTP incorporation opposite the bulky lesion. Furthermore, hPol was the most accurate while hPolε was the least accurate during the first extension of dGAP bypass products. Most of the dNTP misincorporation fidelities for each human Y-family enzyme decreased significantly opposite the bulky lesion, and even more so immediately downstream of dGAP. These data implied that these Y-family DNA polymerases became more error-prone during TLS of dGAP than during normal DNA synthesis.

Biphasic Kinetics of dNTP Incorporation for hPolη. Previously, we have shown that Dpo4 bypassed dGAP following biphasic kinetics and that the amplitude of incompetent Dpo4•DNA complexes was greater during the first extension of dGAP bypass products than during the dNTP incorporation opposite dGAP [57]. Since hPolε was the most efficient during the first extension of dGAP bypass products (Table 6.4) and had the strongest DNA binding affinity (Table 6.2), we performed single dGTP incorporation into either 21/26-mer or 21/26-mer-dGAP under single-turnover conditions and in the presence of a DNA trap in large excess of 5′-radiolabeled DNA. First, we determined that our 5 µM DNA trap (D-1 21/41-mer, Table 6.1) was effective for up to 132 s (Figure 6.4). Next, after conducting the biphasic kinetic assay (Material and Methods), the concentration of products was plotted as a function of reaction time (Figure 6.5) and fit to Eq 6 (Table 6.11). Unexpectedly, hPolε followed biphasic kinetics for both normal and dGAP-containing DNA substrates. For correct dGTP incorporation into 21/26-mer, the amplitude of the first kinetic phase (A1) was smaller than the amplitude of the second kinetic phase (A2) while the reaction rate of the first kinetic phase (k1) was faster than the reaction rate of the second kinetic phase (k2). For correct dGTP incorporation into 21/26-mer-dGAP, A1 and k1 were larger than A2 and k2, respectively. The amplitude-weighted combination of reaction rates (A1k1 + A2k2, with each amplitude expressed as a fraction of the 20 nM DNA) for 21/26-mer (30 s⁻¹) roughly equaled the apparent kp in Table 6.3 (39 s⁻¹). Furthermore, the combination of reaction rates for 21/26-mer-dGAP (1.5 s⁻¹) approximately equaled the apparent kp in Table 6.4 (2.2 s⁻¹). Thus, the biphasic kinetics of dNTP incorporation was significantly affected by dGAP during bypass extension. These data also hinted that hPolε may utilize a different mechanism of dNTP incorporation during TLS of dGAP than during normal DNA synthesis. Overall, the presence of dGAP affected the dNTP incorporation kinetics of the human Y-family enzymes during dGAP bypass and even more so during the first extension of dGAP bypass products.
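The combined rates quoted above can be reproduced from the biphasic parameters in Table 6.11. The following is a minimal sketch, assuming that each amplitude is weighted as a fraction of the 20 nM radiolabeled DNA; the function name weighted_rate is illustrative only and does not come from the original analysis.

```python
# Biphasic parameters from Table 6.11 (amplitudes in nM, rates in s^-1).
TOTAL_DNA_NM = 20.0  # radiolabeled DNA concentration in the assay


def weighted_rate(a1_nm, k1, a2_nm, k2, total_nm=TOTAL_DNA_NM):
    """Return (A1/total)*k1 + (A2/total)*k2, the amplitude-weighted rate."""
    return (a1_nm / total_nm) * k1 + (a2_nm / total_nm) * k2


control = weighted_rate(7.8, 72.2, 9.5, 4.4)   # 21/26-mer
damaged = weighted_rate(2.9, 10.6, 1.1, 0.2)   # 21/26-mer-dGAP

print(f"21/26-mer:      {control:.1f} s^-1")   # about 30 s^-1
print(f"21/26-mer-dGAP: {damaged:.1f} s^-1")   # about 1.5 s^-1
```

Under this weighting, the fast phase dominates the combined rate for both substrates, which is why the combined values track the apparent kp.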

6.4 Discussion

In our investigation, the mutagenic potential of dGAP was examined kinetically for human Y-family DNA polymerases. Figure 6.2 indicated that all human Y-family DNA polymerases except for hRev1 (see Results) can bypass the dGAP lesion and create full-length products within 2 hrs. During the TLS of dGAP, there was only a significant accumulation of 21-mer for hPolθ and hPolη, which indicated that these enzymes paused drastically when extending the immediate dGAP bypass products. Moreover, these data indicated that the dGAP bypass extension was more problematic for these enzymes than the dNTP incorporation opposite the bulky lesion. Strong DNA polymerase pausing downstream of a bulky lesion site was observed for dGAP [57] and AAF-dG [41] bypasses catalyzed by Dpo4 as well as BPDE-dG [201] and N-(2′-deoxyguanosin-8-yl)-1-acetylaminopyrene [207] bypasses catalyzed by Polε. Previously, we elucidated the minimum dGAP bypass mechanism utilized by Dpo4 [57]. Dpo4 is considered a good model for eukaryotic Y-family DNA polymerases because both Dpo4 and eukaryotic Polθ are DinB homologs [33, 99], and the bypass abilities for a spectrum of DNA lesions are similar to those of eukaryotic Pol [34, 35, 41]. However, none of the running start assays performed with the human Y-family enzymes in the presence of the dGAP lesion was identical to the corresponding running start assay for Dpo4 [57]. To kinetically define how each human Y-family DNA polymerase traverses a single-base lesion like dGAP, EMSA and pre-steady state kinetic methods were utilized.

A Kinetic Basis for the Pausing of hPolι and hPolκ Caused by dGAP. For hPolη, Figure 6.2F showed a significant accumulation of 21-mer that was likely due to DNA polymerase pausing at the first extension of the dGAP bypass product. The kinetic basis for this pausing event was revealed in Tables 6.7 and 6.8, which showed a 4,545-fold decrease in catalytic efficiency for dGTP incorporation into 21/26-mer-dGAP versus 21/26-mer. The dNTP misincorporation efficiencies at this site were also reduced drastically. Notably, hPolη only weakly misincorporated dTTP into 21/26-mer-dGAP (Table 6.8). Further examination revealed that the kp had dropped approximately three orders of magnitude while the dGTP binding affinity (1/Kd, dGTP) decreased only 2.3-fold when comparing normal DNA synthesis to dGTP incorporation into 21/26-mer-dGAP. These changes in kinetic parameters led to the large drop in catalytic efficiency visualized in Figure 6.2F, with inefficient elongation primarily due to the slower kp value for correct dGTP incorporation.

For hPol, Figures 6.2C and 6.2D displayed a significant accumulation of 25-mer and 24-mer while extending 17/26-mer as well as a weak accumulation of 21-mer while extending 17/26-mer-dGAP. The accumulation of 25-mer was likely due to hPol creating

-1 frameshift mutations during DNA synthesis. Hence, the 25-mer would be the true full- length product. Moreover, the small detection of 26-mer would be the result of single dNTP blunt-end addition. This characteristic of hPol creating -1 frameshift mutations has been well noted in numerous studies [61, 99, 162, 205], and is most likely attributed to the ability to extend mispaired primer termini [99, 170]. The proposed -1 frameshift most likely occurred due to polymerase slippage at the dC-rich region marked by the accumulation of 24-mer (Figure 6.2C). The accumulation of 21-mer (Figure 6.2D) can be explained kinetically by the calculated decrease in correct dGTP incorporation efficiency opposite 21/26-mer-dGAP when compared to correct dGTP incorporation opposite 21/26- mer (Table 6.6). The decrease in catalytic efficiency at this site was due mostly to the

194

4,304-fold drop in kp. The 2-fold increase observed for the dGTP binding affinity likely promoted the extension of dGAP bypass product catalyzed by hPol.

Kinetic Basis for the Inefficient dGAP Bypass and Non-Extension of dGAP Bypass Products of hRev1. As demonstrated in Figures 6.2G and 6.2H, hRev1 was extremely distributive and did not extend 17/26-mer into full-length products within 12 hrs. Reactions were not followed beyond 12 hrs because longer reaction times are unlikely to be relevant in vivo. Notably, hRev1 incorporated the correct dCTP opposite dG and dGAP at least 5- and 100-fold more efficiently, respectively, than any incorrect dNTP. Such a dCTP incorporation preference was expected since hRev1 is considered a dCTP transferase [163] that uses Arg 357 to hydrogen bond with the incoming dCTP during DNA synthesis [92]. Given this DNA synthesis mechanism, it may be assumed that the bulky 1-AP adduct cannot fit within the active site during dGAP bypass product extension. The increased nucleotide incorporation fidelity further supported this hypothesis, as hRev1 would not have enough room in its active site for a misaligned primer to allow for dCTP misincorporation in the presence of a 1-AP adduct.

Kinetic Basis for Non-Pausing dGAP Bypass of hPolη. Interestingly, there were no distinct accumulations of intermediates during either normal DNA synthesis or dGAP bypass (Figures 6.2A and 6.2B). However, there were significant decreases in catalytic efficiencies for correct dNTP incorporations into both 20/26-mer-dGAP and 21/26-mer-dGAP, mostly due to drastically reduced kp at both sites when compared to normal DNA synthesis (Table 6.3 and Table 6.4). The slowed kp values may be due to DNA being trapped in non-productive complexes with hPolε. The biphasic kinetics of dNTP incorporations catalyzed by hPolε in the presence of a DNA trap, whereby a fast phase (k1) preceded a slow phase (k2), supported this theory (Figure 6.5 and Table 6.11). For dGTP incorporation into 21/26-mer-dGAP, the contributions of the fast phase (k1 × [A1/20]) and slow phase (k2 × [A2/20]) yielded an overall dGTP incorporation rate of 1.5 s⁻¹. This value was close to the calculated kobs (2.0 s⁻¹) estimated using Eq 4, 1.2 mM dGTP, and the measured Kd, dGTP and kp values (Table 6.4). Similar analysis of the biphasic rates for dGTP incorporation into 21/26-mer (Table 6.11) agreed with the single-turnover reaction rate in Table 6.3. However, the same DNA trap experiment with 21/26-mer displayed a reaction amplitude of 86% (Table 6.11). Despite the molar excess of hPolε, the remaining 14% of 21/26-mer was never elongated (data not shown). The total amplitude (A1 + A2) of 86% for 21/26-mer was close to the calculated value (83%) based on Kd, DNA (Table 6.2) and experimental conditions (see Material and Methods). In contrast, the total amplitude for 21/26-mer-dGAP was 19.3%, which is far from the ideal amplitude of 100% and from the calculated amplitude based on Kd, DNA (93%). Observing reaction amplitudes less than 100% could also be due to incomplete annealing of the DNA duplexes, or to hPolε bound to DNA in an inactive mode. The binding of hPolε at a non-specific site of DNA rather than the staggered end of DNA could also explain the biphasic kinetics observed during correct dGTP incorporation into 21/26-mer (Figure 6.5). This hypothesis was also supported by the observation of blunt-end addition products (27-mer) in Figure 6.2A. Although blunt-end addition by hPolε has been observed previously [62, 76, 157], the novel biphasic trend of normal DNA synthesis has not been previously observed and thus warrants more research.
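The calculated kobs and the amplitudes expected from Kd, DNA can be approximated with a short calculation. The sketch below rests on two assumptions that are not confirmed by this section: that Eq 4 is the standard hyperbolic dependence of kobs on dNTP concentration, and that the calculated amplitudes follow from the quadratic binding equation (as in Eq 1) at the 130 nM enzyme and 20 nM DNA given in the Figure 6.5 legend. The function names are illustrative.

```python
import math


def k_obs(kp, kd_dntp_um, dntp_um):
    """Hyperbolic rate (assumed form of Eq 4): kp*[dNTP] / (Kd + [dNTP])."""
    return kp * dntp_um / (kd_dntp_um + dntp_um)


def fraction_dna_bound(enzyme_nm, dna_nm, kd_dna_nm):
    """Fraction of DNA in E•DNA from the quadratic binding equation."""
    s = enzyme_nm + dna_nm + kd_dna_nm
    complex_nm = 0.5 * (s - math.sqrt(s * s - 4.0 * enzyme_nm * dna_nm))
    return complex_nm / dna_nm


# dGTP incorporation into 21/26-mer-dGAP (Table 6.4): kp = 2.2 s^-1, Kd = 125 uM
print(f"kobs at 1.2 mM dGTP: {k_obs(2.2, 125.0, 1200.0):.2f} s^-1")

# Expected amplitudes from Kd,DNA for the 21/26-mer substrates (Table 6.2)
for name, kd in (("21/26-mer", 26.0), ("21/26-mer-dGAP", 8.8)):
    bound = fraction_dna_bound(130.0, 20.0, kd)
    print(f"{name}: {100.0 * bound:.0f}% of DNA bound")
```

With these inputs the bound fractions come out within a few percent of the 83% and 93% quoted above; the exact values depend on which Kd, DNA and concentrations were used in the original calculation.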

Comparing k1 and k2 for both DNA substrates, the presence of a bulky 1-AP adduct within the active site reduced k1 by 6.8-fold and k2 by 22-fold. Moreover, A1 and A2 were reduced 2.7- and 8.6-fold, respectively, in the presence of the bulky dGAP lesion. Interestingly, the majority of the elongated 21/26-mer was in the non-productively bound complexes (hPolε•DNAₙᴺ) while the majority of the elongated 21/26-mer-dGAP was in the productively bound complexes (hPolε•DNAₙᴾ). These data suggested that 21/26-mer-dGAP bound more efficiently to hPolε than 21/26-mer, which was supported by the smaller Kd, DNA observed for dGAP-containing DNA than for control DNA (Table 6.2). Due to the design of our experiment, we know that the elongation of E•DNAₙᴺ occurred in a single binding event, suggesting a slow conversion of E•DNAₙᴺ to E•DNAₙᴾ (k2) prior to a rapid extension (k1). Increased DNA specificity in the presence of a bulky lesion for hPolε was also observed during cis-syn TT bypass in vitro [44, 157]. We estimated that 47% of 21/26-mer and 5.3% of 21/26-mer-dGAP were in the form of E•DNAₙᴺ, and that 14% of 21/26-mer and 80.7% of 21/26-mer-dGAP were never elongated. Additionally, less than 17% of 21/26-mer and 6.6% of 21/26-mer-dGAP were calculated to be free in solution based on Kd, DNA values in Table 6.2. Notably, the amount of free 21/26-mer approximately equaled the amount of non-elongated 21/26-mer, which suggested that there was no significant amount of hPolε•21/26-mer in dead-end binary complexes (E•DNAₙᴰ). However, a significant amount of 21/26-mer-dGAP bound by hPolε was catalytically incompetent (E•DNAₙᴰ). Therefore, we hypothesized that hPolε used the kinetic mechanism shown in Scheme 6.1 for bypassing dGAP.
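The partitioning of each DNA substrate among the species discussed above can be tallied with simple bookkeeping. This is a rough sketch, assuming that the dead-end fraction is whatever remains of the never-elongated DNA after subtracting the DNA calculated to be free in solution; the productive fractions (A1) are taken from Table 6.11, and the helper name is hypothetical.

```python
def dead_end_percent(never_elongated, free):
    """Estimate the dead-end binary complex as non-elongated minus free DNA (%)."""
    return max(0.0, never_elongated - free)


# Percentages quoted in the text and in Table 6.11
substrates = {
    "21/26-mer":      {"productive": 39.0, "nonproductive": 47.0,
                       "never_elongated": 14.0, "free": 17.0},  # free is an upper bound
    "21/26-mer-dGAP": {"productive": 14.0, "nonproductive": 5.3,
                       "never_elongated": 80.7, "free": 6.6},
}

for name, s in substrates.items():
    dead_end = dead_end_percent(s["never_elongated"], s["free"])
    print(f"{name}: productive {s['productive']}%, "
          f"nonproductive {s['nonproductive']}%, dead-end about {dead_end:.1f}%")
```

On this accounting, essentially none of the control DNA but roughly three quarters of the dGAP-containing DNA would sit in dead-end complexes, in line with the interpretation given in the text.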

The structural evidence to support this lesion bypass mechanism is as follows. An NMR study revealed that the hydrophobic 1-AP adduct of an embedded dGAP:dC base pair in a DNA duplex is intercalated into the DNA helix between adjacent Watson-Crick base pairs, with the sugar of the modified dG possessing a syn glycosidic torsion angle, and that both bases of the dGAP:dC base pair are displaced into the major groove [198]. With this conformation of dGAP within the active site of a DNA polymerase, the bulky 1-AP would occupy the position of the incoming dNTP, thereby blocking catalysis. Moreover, the binary complex of dGAP bypass products would not be elongated without undergoing dramatic structural changes and likely represents the form of E•DNAₙᴰ. We hypothesized that such a conformation would hinder translocation of the Little Finger of hPolε, based on a series of crystal structures of yeast Polε (yPolε) during TLS of AAF-dG [207]. In the crystal binary complexes of yPolε and DNA containing AAF [207], there are two conformations of the AAF: one where AAF stacks above the last primer-template base pair, and another where AAF is partially rotated out of the DNA helix but still blocks incoming dCTP. If 1-AP is rotated completely out of the DNA duplex, similar to the aforementioned AAF in the yPolε active site model, the molecules of hPolε•21/26-mer-dGAP will be in the E•DNAₙᴾ form and will be rapidly elongated. If 1-AP is in the quasi-intercalative conformation, the binary complex hPolε•20/26-mer-dGAP requires subtle to mild structural changes for efficient catalysis, as predicted by Schorr et al. [207], and is likely in the form of E•DNAₙᴺ.

Potential Function of Enhanced DNA Binding. Notably, Table 6.2 showed that the DNA binding affinity of E•20/26-mer-dGAP was about 4-fold tighter than that of E•21/26-mer-dGAP for both hPolε and hPol, while there were no significant changes in DNA binding for hPol and hRev1. Moreover, hPol•20/26-mer was the only binary complex affected by the 1-AP adduct while both hPolε•21/26-mer and hPolε•21/26-mer were affected by the bulky 1-AP adduct. This suggested that the hydrophobic 1-AP adduct in E•20/26-mer-dGAP and E•21/26-mer-dGAP likely interacted with the amino acid residues in the Little Finger subdomain of hPolε and hPol, as depicted in structures of Dpo4 with BPDE-dG [40] and yPolε with AAF-dG [207]. A similar tighter binding effect has been observed with hPolε and a different 3-ring PAH, N²-methyl(9-anthracenyl)-dG [201]. The tighter binding of dGAP-containing DNA correlated well with more efficient dNTP incorporation into 20/26-mer-dGAP and 21/26-mer-dGAP, whereas weaker or unchanged DNA binding affinity resulted in less efficient dGAP bypass and extension. Thus, the tighter binding of 1-AP-containing DNA during binary complex formation most likely promoted the catalysis of dGAP bypass.

A General Kinetic Mechanism for DNA Lesion Bypass. As with hPolε, biphasic kinetics of dNTP incorporation at pause sites has been observed in the dGAP bypass catalyzed by Dpo4 [57], where the total reaction amplitude is much less than the reaction amplitude obtained with either control DNA or a DNA substrate at a non-pause site. This difference in reaction amplitude indicates the existence of E•DNAₙᴰ in both cases. Hence, Scheme 6.1 can be used to describe a general mechanism for DNA lesion bypass catalyzed by several Y-family DNA polymerases.

In summary, we have demonstrated that hPolε, hPol, hPol and hRev1 were able to bypass the single-base lesion dGAP with kinetically different ranges of catalytic efficiencies and accuracies. The extension step rather than the insertion opposite dGAP was more challenging for all human Y-family DNA polymerases. In addition, a minimal kinetic mechanism for the dGAP lesion bypass of each human Y-family enzyme was established. Based on the kinetic data herein, hPolε was the most efficient at bypassing dGAP while hPol was the most accurate during the extension of dGAP bypass products. Interestingly, the most favored misincorporation opposite dGAP when catalyzed by hPol, dATP, would lead to a G → T transversion. Such G → T transversions are observed in human kidney cells [195], which implies that hPol is involved in dGAP bypass in vivo. Furthermore, the favored dCTP misincorporation over the correct dGTP incorporation into 21/26-mer-dGAP catalyzed by hPolε correlated well with the -1 frameshifts observed in the aforementioned study [195] and in a bacterial in vivo study [208]. Accordingly, both hPolε and hPol are viable candidates for bypassing dGAP adducts in vivo.

6.5 Figures

Figure 6.1. Chemical structure of dGAP.

Figure 6.2. Running start assay for hPolε (A and B), hPolθ (C and D), hPolη (E and F) and hRev1 (G and H) at 37 °C. (A, C, E and G) 17/26-mer; (B, D, F and H) 17/26-mer-dGAP. The 21st position marks the position of dGAP.

[Figure 6.3 plot: [Complex] (nM) versus [hPol] (nM).]

Figure 6.3. EMSA for hPolε using 5′-radiolabeled 20/26-mers.

Inset is the gel image for hPolε EMSA using 20/26-mer-dGAP. The plots for 20/26-mer (■) and 20/26-mer-dGAP (●) were fit to a quadratic curve (Eq 1). For hPolε, Kd, DNA was 23 ± 1.7 nM with 20/26-mer and 7.9 ± 0.3 nM with 20/26-mer-dGAP.

Figure 6.4. Effectiveness of the DNA trap for biphasic kinetic assays of correct dGTP incorporation into 21/26-mer catalyzed by hPolε.

A preincubated solution of hPolε (130 nM), radiolabeled 21/26-mer (20 nM) and DNA trap D-1 21/41-mer (5 µM) was mixed with dGTP (0.8 mM) and was quenched after various reaction times with 0.37 M EDTA. The gel image revealed minimal product formation (22-mer) after 132 s.

[Figure 6.5 plot: [Product] (nM) versus Time (sec).]

Figure 6.5. Biphasic kinetics of correct dGTP incorporation into 21/26-mers catalyzed by hPolε in the presence of a DNA trap.

A solution of hPolε (130 nM) preincubated with either radiolabeled 21/26-mer (20 nM, ■) or radiolabeled 21/26-mer-dGAP (20 nM, ●) was mixed with dGTP (0.8 mM, ■; 1.2 mM, ●) and DNA trap D-1 21/41-mer (5 µM) for various time intervals before being quenched with EDTA (0.37 M). The product concentration for each DNA substrate was plotted as a function of reaction time and was fit to Equation 6. For 21/26-mer, the fast phase had a reaction amplitude of 7.8 ± 0.3 nM and a reaction rate of 72.2 ± 4.9 s⁻¹ while the slow phase had a reaction amplitude of 9.5 ± 0.3 nM and a reaction rate of 4.4 ± 0.5 s⁻¹. For 21/26-mer-dGAP, the fast phase had a reaction amplitude of 2.9 ± 0.1 nM and a reaction rate of 10.6 ± 1.2 s⁻¹ while the slow phase had a reaction amplitude of 1.1 ± 0.2 nM and a reaction rate of 0.2 ± 0.02 s⁻¹.
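The fitted time courses can be regenerated from the parameters in this legend. The sketch below assumes that Equation 6 is the usual double-exponential form, [Product] = A1(1 − e^(−k1·t)) + A2(1 − e^(−k2·t)); that form is an assumption, since the equation itself is given elsewhere in the dissertation, and the function name is illustrative.

```python
import math


def product_nm(t_s, a1, k1, a2, k2):
    """Double-exponential product formation (assumed form of Equation 6)."""
    return a1 * (1.0 - math.exp(-k1 * t_s)) + a2 * (1.0 - math.exp(-k2 * t_s))


# Fitted parameters from the legend (amplitudes in nM, rates in s^-1)
fits = {
    "21/26-mer":      (7.8, 72.2, 9.5, 4.4),
    "21/26-mer-dGAP": (2.9, 10.6, 1.1, 0.2),
}

for name, params in fits.items():
    # Evaluate the curve over the time window of the plot (0 to 0.7 s)
    points = [(t / 10.0, product_nm(t / 10.0, *params)) for t in range(0, 8)]
    print(name, " ".join(f"{t:.1f}s:{p:.1f}nM" for t, p in points))
```

At long times each curve approaches A1 + A2, which is the total amplitude compared against the 20 nM DNA in Table 6.11.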

6.6 Tables

Primers
  17-mer            5′-AACGACGGCCAGTGAAT-3′
  20-mer            5′-AACGACGGCCAGTGAATTCG-3′
  21-mer            5′-AACGACGGCCAGTGAATTCGC-3′
Templates
  26-mer            3′-TTGCTGCCGGTCACTTAAGCGCGCCC-5′
  26-mer-dGAPᵃ      3′-TTGCTGCCGGTCACTTAAGCGCGCCC-5′
DNA Trap
  D-1 (21/41-mer)   5′-CGCAGCCGTCCAACCAACTCA-3′
                    3′-GCGTCGGCAGGTTGGTTGAGTAGCAGCTAGGTTACGGCAGG-5′

ᵃ G designates the 1-AP adduct on the C8 position of dG (dGAP).

Table 6.1. DNA Substrates for dGAP.

Enzyme    Oligomer     With Adductᵃ (nM)    Without Adductᵇ (nM)    Affinity Ratioᶜ
hPolε     20/26-mer    7.9 ± 0.3            23 ± 1.7                3.0
          21/26-mer    8.8 ± 0.5            26 ± 1.2                2.9
hPolθ     20/26-mer    303 ± 13             916 ± 8                 3.0
          21/26-mer    586 ± 36             955 ± 28                1.6
hPolηᵈ    20/26-mer    64 ± 8               56 ± 12                 0.9
          21/26-mer    100 ± 1.2            43 ± 4                  0.4
hRev1     20/26-mer    85 ± 5               118 ± 7                 1.4
          21/26-mer    ND                   ND                      ND

ᵃ With adduct refers to 26-mer-dGAP.
ᵇ Without adduct refers to 26-mer.
ᶜ Values are calculated as (Kd, DNA)Normal/(Kd, DNA)Damaged.
ᵈ Values were determined using active site titration assays.
"ND" denotes not determined.

Table 6.2. Binding affinity of hPolε, hPolθ, hPolη and hRev1 to normal and damaged DNA at 23 °C.

dNTP    Kd, dNTP (μM)    kp (s⁻¹)               kp/Kd, dNTP (μM⁻¹s⁻¹)    Fidelityᵃ
Template dG (20/26-mer)
dCTP    85 ± 11          48 ± 2                 5.6 × 10⁻¹               -
dATP    350 ± 36         (9.5 ± 0.4) × 10⁻¹     2.7 × 10⁻³               4.8 × 10⁻³
dGTP    37 ± 8           (4.0 ± 0.2) × 10⁻²     1.1 × 10⁻³               2.0 × 10⁻³
dTTP    494 ± 70         3.3 ± 0.2              6.6 × 10⁻³               1.2 × 10⁻²
Template dC (21/26-mer)
dGTP    46 ± 8           39 ± 2                 8.4 × 10⁻¹               -
dATP    242 ± 25         3.4 ± 0.1              1.4 × 10⁻²               1.7 × 10⁻²
dCTP    119 ± 21         2.2 ± 0.1              1.8 × 10⁻²               2.1 × 10⁻²
dTTP    712 ± 73         3.3 ± 0.2              4.6 × 10⁻³               5.4 × 10⁻³

ᵃ Values are calculated as (kp/Kd, dNTP)incorrect/[(kp/Kd, dNTP)correct + (kp/Kd, dNTP)incorrect].

Table 6.3. Kinetic parameters of nucleotide incorporation into control DNA catalyzed by hPolε.
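The fidelity column follows directly from the footnote formula. As a minimal sketch, the calculation below starts from the tabulated kp/Kd, dNTP values for template dG (20/26-mer) rather than from the raw kp and Kd, dNTP values, so it reproduces the table to its reported precision; the function name is illustrative.

```python
def fidelity(eff_incorrect, eff_correct):
    """Fidelity = eff_incorrect / (eff_correct + eff_incorrect)."""
    return eff_incorrect / (eff_correct + eff_incorrect)


# Tabulated kp/Kd values (uM^-1 s^-1) for template dG (20/26-mer), Table 6.3
correct_dctp = 5.6e-1
incorrect = {"dATP": 2.7e-3, "dGTP": 1.1e-3, "dTTP": 6.6e-3}

for dntp, eff in incorrect.items():
    print(f"{dntp}: fidelity = {fidelity(eff, correct_dctp):.1e}")
```

Because the correct incorporation efficiency dominates the denominator, the fidelity here is close to the simple ratio of incorrect to correct efficiencies.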

dNTP    Kd, dNTP (μM)    kp (s⁻¹)               kp/Kd, dNTP (μM⁻¹s⁻¹)    Efficiency Ratioᵃ,ᵇ    Fidelityᶜ      Fidelity Ratioᵇ,ᵈ    Probabilityᵉ
Template dGAP (20/26-mer-dGAP)
dCTP    287 ± 24         8.1 ± 0.2              2.8 × 10⁻²               20                     -              -                    77.3
dATP    634 ± 66         (8.7 ± 0.4) × 10⁻¹     1.4 × 10⁻³               1.9                    4.8 × 10⁻²     10                   3.9
dGTP    65 ± 7           (5.2 ± 0.2) × 10⁻²     8.1 × 10⁻⁴               1.4                    2.8 × 10⁻²     14                   2.2
dTTP    492 ± 67         3.0 ± 0.2              6.0 × 10⁻³               1.1                    1.8 × 10⁻¹     15                   16.6
Template dC (21/26-mer-dGAP)
dGTP    125 ± 18         2.2 ± 0.1              1.8 × 10⁻²               48                     -              -                    22.1
dATP    335 ± 34         (3.3 ± 0.1) × 10⁻¹     1.0 × 10⁻³               14                     5.3 × 10⁻²     3.1                  1.2
dCTP    119 ± 14         7.3 ± 0.2              6.1 × 10⁻²               0.3                    7.7 × 10⁻¹     37                   74.8
dTTP    632 ± 71         (9.5 ± 0.5) × 10⁻¹     1.5 × 10⁻³               3.1                    7.7 × 10⁻²     14                   1.8

ᵃ Values are calculated as (kp/Kd, dNTP)normal/(kp/Kd, dNTP)damaged.
ᵇ The values are from Table 6.3 using control 26-mer DNA template.
ᶜ Values are calculated as (kp/Kd, dNTP)incorrect/[(kp/Kd, dNTP)correct + (kp/Kd, dNTP)incorrect].
ᵈ Values are calculated as Fidelitydamaged/Fidelitynormal.
ᵉ Values are calculated as ((kp/Kd, dNTP)damaged/[Σ(kp/Kd, dNTP)damaged]) × 100.

Table 6.4. Kinetic parameters of nucleotide incorporation into damaged DNA catalyzed by hPolε.
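The efficiency ratio and probability columns can be checked against the footnote formulas. The sketch below uses the tabulated kp/Kd, dNTP values for incorporation opposite dGAP (20/26-mer-dGAP) and the matching control values from Table 6.3; small differences from the table in the last digit reflect rounding of the tabulated inputs, and the variable names are illustrative.

```python
# Tabulated kp/Kd values (uM^-1 s^-1)
normal = {"dCTP": 5.6e-1, "dATP": 2.7e-3, "dGTP": 1.1e-3, "dTTP": 6.6e-3}   # Table 6.3
damaged = {"dCTP": 2.8e-2, "dATP": 1.4e-3, "dGTP": 8.1e-4, "dTTP": 6.0e-3}  # Table 6.4

# Efficiency ratio: (kp/Kd)normal / (kp/Kd)damaged
efficiency_ratio = {d: normal[d] / damaged[d] for d in damaged}

# Probability: share of each dNTP in the summed efficiency on damaged DNA
total = sum(damaged.values())
probability = {d: 100.0 * damaged[d] / total for d in damaged}

for d in damaged:
    print(f"{d}: efficiency ratio {efficiency_ratio[d]:.1f}, "
          f"probability {probability[d]:.1f}%")
```

The probability column therefore sums to 100% within each template section and expresses how often each dNTP would be incorporated if all four competed at equal concentration.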

Kd, dNTP k kp/Kd, dNTP p Fidelitya (M) (s-1) (M-1s-1) Template dG (20/26-mer) dCTP 46 ± 6 1.7 ± 0.1 3.7 x 10-2 - dATP 539 ± 71 (3.2 ± 0.2) x 10-1 6.0 x 10-4 1.6 x 10-2 dGTP 388 ± 29 (4.1 ± 0.1) x 10-1 1.1x 10-3 2.9 x 10-2 dTTP 693 ± 66 (5.5 ± 0.3) x 10-1 8.0 x 10-4 2.1 x 10-2 Template dC (21/26-mer) dGTP 87 ± 9 (9.9 ± 0.3) x 10-1 1.1 x 10-2 - dATP 596 ± 121 (2.4 ± 0.2) x 10-2 4.0 x 10-5 3.6 x 10-3 dCTP 1343 ± 474 (9.1 ± 2.0) x 10-2 6.8 x 10-5 6.1x 10-3 dTTP 645 ± 84 (2.5 ± 0.2) x 10-2 3.9 x 10-5 3.5 x 10-3 a Values are calculated as (kp/Kd, dNTP)incorrect/[(kp/Kd, dNTP)correct + (kp/Kd, dNTP)incorrect].

Table 6.5. Kinetic parameters of nucleotide incorporation into control DNA catalyzed by hPolθ.

Kd, dNTP kp kp/Kd, dNTP Efficiency c Fidelity e Fidelity b,d Probability (M) (s-1) (M-1s-1) Ratioa,b Ratio Template dGAP (20/26-mer-dGAP) dCTP 267 ± 40 1.7 ± 0.1 6.5 x 10-3 5.7 - - 70.6 dATP 497 ± 98 (7.0 ± 0.6) x 10-1 1.4 x 10-3 0.4 1.8 x 10-1 11 15.2 dGTP 346 ± 80 (3.4 ± 0.3) x 10-1 9.9 x 10-4 1.1 1.3 x 10-1 4.5 10.7 dTTP 726 ± 237 (2.4 ± 0.4) x 10-1 3.2 x 10-4 2.5 4.7 x 10-2 2.2 3.5 Template dC – pause site (21/26-mer-dGAP) dGTP 43 ± 7 (23 ± 0.9) x 10-4 5.3 x 10-5 208 - - 60.0 dATP 203 ± 36 (5.5 ± 0.3) x 10-3 2.7x 10-5 1.48 3.4 x 10-1 94 30.5 dCTP 693 ± 45 (4.1 ± 0.1) x 10-3 5.9 x 10-6 12 1.0 x 10-1 17 6.7 dTTP 468 ± 90 (12 ± 0.8) x 10-4 2.5 x 10-6 16 4.5 x 10-2 13 2.8 a Values are calculated as (kp/Kd, dNTP)normal/(kp/Kd, dNTP)damaged. bThe values are from Table 6.5 using control 26mer DNA template. c Values are calculated as (kp/Kd, dNTP)incorrect/[(kp/Kd, dNTP)correct + (kp/Kd, dNTP)incorrect]. d Values are calculated as Fidelitydamaged/Fidelitynormal. e Values are calculated as ((kp/Kd, dNTP)damaged/[Σ(kp/Kd, dNTP)damaged]) x 100.

Table 6.6. Kinetic parameters of nucleotide incorporation into damaged DNA catalyzed by hPolθ.

Kd, dNTP k kp/Kd, dNTP p Fidelitya (M) (s-1) (M-1s-1) Template dG (20/26-mer) dCTP 133 ± 8 (2.1 ± 0.1) x 10-1 1.6 x 10-3 - dATP 667 ± 105 (1.0 ± 0.1) x 10-2 1.5 x 10-5 9.3 x 10-3 dGTP 323 ± 25 (5.3 ± 0.2) x 10-3 1.6 x 10-5 9.9 x 10-3 dTTP 447 ± 35 (9.1 ± 0.3) x 10-2 2.0 x 10-4 1.1 x 10-1 Template dC (21/26-mer) dGTP 117 ± 7 (19 ± 0.4) x 10-2 1.6 x 10-3 - ND ND - - - ND ND - - - dTTP 783 ± 44 (13 ± 0.4) x 10-2 1.7 x 10-4 9.6 x 10-2 a Values are calculated as (kp/Kd, dNTP)incorrect/[(kp/Kd, dNTP)correct + (kp/Kd, dNTP)incorrect]. „ND‟ denoted not determined.

Table 6.7. Kinetic parameters of nucleotide incorporation into control DNA catalyzed by hPol.

Kd, dNTP kp kp/Kd, dNTP Efficiency c Fidelity e Fidelity b,d Probability (M) (s-1) (M-1s-1) Ratioa,b Ratio Template dGAP (20/26-mer-dGAP) dCTP 380 ± 66 (8.8 ± 0.6) x 10-2 2.3 x 10-4 7.0 - - 97.1 dATP 338 ± 22 (10.3 ± 0.2) x 10-4 3.0 x 10-6 5.0 1.3 x 10-2 1.4 1.3 dGTP 252 ± 29 (5.3 ± 0.2) x 10-4 2.1 x 10-6 7.6 9.0 x 10-3 0.9 0.9 dTTP 762 ± 60 (13.1 ± 0.5) x 10-4 1.7 x 10-6 118 7.3 x 10-3 0.1 0.7 Template dC – pause site (21/26-mer-dGAP) dGTP 264 ± 76 (5.9 ± 0.6) x 10-5 2.2 x 10-7 4545 - - 46.8 dATP ND ND - - - - - dCTP ND ND - - - - - dTTP 573 ± 73 (14.1 ± 0.8) x 10-5 2.5 x 10-7 680 5.3 x 10-1 5.5 53.2 a Values are calculated as (kp/Kd, dNTP)normal/(kp/Kd, dNTP)damaged. bThe values are from Table 6.7 using control 26mer DNA template. c Values are calculated as (kp/Kd, dNTP)incorrect/[(kp/Kd, dNTP)correct + (kp/Kd, dNTP)incorrect]. d Values are calculated as Fidelitydamaged/Fidelitynormal. e Values are calculated as ((kp/Kd, dNTP)damaged/[Σ(kp/Kd, dNTP)damaged]) x 100. „ND‟ denoted not determined.

Table 6.8. Kinetic parameters of nucleotide incorporation into damaged DNA catalyzed

by hPol.

dNTP    Kd, dNTP (μM)    kp (s⁻¹)                  kp/Kd, dNTP (μM⁻¹s⁻¹)    Fidelityᵃ
Template dG (20/26-mer)
dCTP    5.4 ± 1.2        0.78 ± 0.07               1.4 × 10⁻¹               -
dTTP    80 ± 14          (2.1 ± 0.1) × 10⁻¹        2.6 × 10⁻²               1.6 × 10⁻¹
dATP    99 ± 17          (6.98 ± 0.04) × 10⁻³      7.1 × 10⁻⁵               5.1 × 10⁻⁴
dGTP    256 ± 14         1.37 ± 0.03               5.3 × 10⁻³               3.6 × 10⁻²

ᵃ Values are calculated as (kp/Kd, dNTP)incorrect/[(kp/Kd, dNTP)correct + (kp/Kd, dNTP)incorrect].

Table 6.9. Kinetic parameters of nucleotide incorporation into control DNA catalyzed by hRev1.

dNTP    Kd, dNTP (μM)    kp (s⁻¹)               kp/Kd, dNTP (μM⁻¹s⁻¹)    Efficiency Ratioᵃ,ᵇ    Fidelityᶜ      Fidelity Ratioᵇ,ᵈ    Probabilityᵉ
Template dGAP (20/26-mer-dGAP)
dCTP    5.2 ± 0.9        (3.3 ± 0.2) × 10⁻²     6.5 × 10⁻³               22                     -              -                    99.8
dTTP    78 ± 18          (2.6 ± 0.3) × 10⁻⁴     3.4 × 10⁻⁶               7647                   5.2 × 10⁻⁴     0.003                0.1
dATP    103 ± 25         (4.4 ± 0.4) × 10⁻⁴     4.2 × 10⁻⁶               17                     6.5 × 10⁻⁴     1.3                  0.1
dGTP    143 ± 34         (4.2 ± 0.3) × 10⁻⁴     3.0 × 10⁻⁶               1767                   4.6 × 10⁻⁴     0.1                  ~0

ᵃ Values are calculated as (kp/Kd, dNTP)normal/(kp/Kd, dNTP)damaged.
ᵇ The values are from Table 6.9 using control 26-mer DNA template.
ᶜ Values are calculated as (kp/Kd, dNTP)incorrect/[(kp/Kd, dNTP)correct + (kp/Kd, dNTP)incorrect].
ᵈ Values are calculated as Fidelitydamaged/Fidelitynormal.
ᵉ Values are calculated as ((kp/Kd, dNTP)damaged/[Σ(kp/Kd, dNTP)damaged]) × 100.

Table 6.10. Kinetic parameters of nucleotide incorporation into damaged DNA catalyzed by hRev1.

DNA Substrate (Primer/Template)    A1 (nM)              k1 (s⁻¹)      A2 (nM)               k2 (s⁻¹)
21/26-mer                          7.8 ± 0.3 (39%)ᵃ     72.2 ± 4.9    9.5 ± 0.3 (47%)ᵃ      4.4 ± 0.5
21/26-mer-dGAP                     2.9 ± 0.1 (14%)ᵃ     10.6 ± 1.2    1.1 ± 0.2 (5.3%)ᵃ     0.2 ± 0.02

ᵃ Calculated as (reaction amplitude/20 nM) × 100.

Table 6.11. Biphasic kinetic parameters of correct nucleotide incorporations catalyzed by hPolε.

6.7 Schemes

Scheme 6.1. Proposed kinetic mechanism for dGAP bypass catalyzed by human Polε.

DNAo, initial DNA concentration; A1, fast phase amplitude; k1, fast phase reaction rate; A2, slow phase amplitude; k2, slow phase reaction rate; E, DNA polymerase; DNAₙ, DNA substrate; E•DNAₙᴰ, dead-end binary complex; E•DNAₙᴺ, nonproductive binary complex; E•DNAₙᴾ, productive binary complex; E•DNAₙᴺ•dNTP, nonproductive ternary complex; E•DNAₙᴾ•dNTP, productive ternary complex; DNAₙ₊₁, DNA product extended by 1 base; PPi, pyrophosphate.

Chapter 7 : Additional Results, Future Directions and Conclusion

The focus of this dissertation was to elucidate structure-function relationships by relating the specific DNA lesion bypass capacities of hPolε, hPol, hPol, hRev1 and S. solfataricus Dpo4 to their structural differences. In continuation of the work presented in Chapter 6, we applied the newly developed short oligonucleotide sequencing assay (SOSA) method to quantitatively measure the mutagenic potential of the N-(deoxyguanosin-8-yl)-1-aminopyrene (dGAP) bypass when catalyzed by S. solfataricus Dpo4 and each human Y-family DNA polymerase. When our work is completed, we will be able to elucidate which human Y-family DNA polymerase performs TLS on air pollution-derived DNA lesions such as dGAP in vivo.

We also applied the SOSA method to quantitatively measure the mutagenic potential of double-base DNA lesion bypass when catalyzed by the same group of Y-family DNA polymerases. The two double-base DNA lesions most commonly studied are cis-syn TT and cisplatin-dGpG. We studied these two specific DNA lesions for the following reasons: i) the Y-family DNA polymerase that is responsible for the mutagenic genotype in XPV patients has not been identified; ii) resistance to the anticancer drug cisplatin is hypothesized to arise from DNA lesion bypass catalyzed by a Y-family DNA polymerase; and iii) the differences in mutagenic potential between the two double-base lesion bypasses catalyzed by a single Y-family DNA polymerase can be observed. Data from our SOSA studies will provide further mechanistic details on the origin of genetic mutations within a cell.

Our studies using SOSA and different DNA substrates indicated that hPolε may play an important role in multiple lesion bypass pathways in vivo. Thus, it is imperative to understand how hPolε is able to bypass each DNA lesion in vivo in order to understand the possible mutagenic outcomes. However, there is no comprehensive study on the mechanism of DNA polymerization catalyzed by hPolε. To address this gap in scientific knowledge, we provided essential insights into the hPolε kinetic mechanism during normal DNA synthesis. The information gained from our kinetic study of hPolε can be used to design drugs to treat specific diseases such as XPV.

While examining the relevance of Dpo4 kinetic and structural data collected at lower-than-physiological temperatures in Chapter 2, we discovered a novel thermostable unfolding intermediate. We also traced the origin of the unfolding intermediate to specific ionic interactions between the amino acid residues in the linker region and Palm domain. Since the ionic interactions of the linker region are conserved within the Y-family DNA polymerases, we hypothesized that these ionic interactions have important functional implications during DNA synthesis catalyzed by Y-family DNA polymerases. Consequently, we further explored the structure-function potential of the linker region interactions in Dpo4 using kinetic methods. As it has been demonstrated that proteins may unfold using different pathways based on the method of denaturation, we also examined the possible unfolding pathways of Dpo4 using salt denaturation assays. Although all of our findings within Chapter 7 are preliminary, the data obtained are significant enough to be noted in relation to the focus of this dissertation.

7.1 Mutagenic Analysis of dGAP Bypass Catalyzed by Y-family DNA Polymerases

Our long-term goal is to quantitatively determine the relative abilities of the four human Y-family DNA polymerases and Dpo4 to replicate past a single dGAP. To date, only our laboratory has studied this specific DNA lesion derived from air pollution in the context of enzymatic lesion bypass. Our previous studies have indicated that Dpo4 [57] as well as hPolε, hPolη, hPolθ, and hRev1 (Chapter 6) are able to bypass dGAP in vitro with different catalytic efficiencies. Although new details were gained for each dGAP bypass mechanism, we did not illustrate the full mutagenic potential of dGAP bypass catalyzed by each enzyme. Furthermore, we needed to identify the enzyme(s) most likely to perform dGAP bypass in vivo. It is challenging to deduce which human Y-family enzyme bypasses specific lesions in vivo, as most Y-family DNA polymerases have overlapping abilities with different lesion bypass efficiencies. To address some of these unanswered questions, we quantitatively assessed the dGAP bypass abilities of the four human Y-family DNA polymerases as well as Dpo4 via running start assays and short oligonucleotide sequencing assays (SOSAs).

Results

The bypass of dGAP is essential for cell survival. It is known that bulky lesions will stop the progression of the DNA replication machinery. To visualize such an event in vitro, we performed running start assays in the presence and absence of dGAP with a replicative DNA polymerase, Sulfolobus solfataricus DNA Polymerase B1 (S. solfataricus Pol B1). As expected, S. solfataricus Pol B1 stalled after a small amount of dNTP incorporation opposite the dGAP site (Figure 7.1). Thus, not all DNA polymerases can bypass lesions such as dGAP.

Quantitative Analysis of dGAP Bypass Efficiency. We first sought to quantitatively determine the dGAP bypass efficiencies of Dpo4 and all four human Y-family DNA polymerases using running start assays as described in Chapters 5 and 6 (Figures 5.1 and 6.2). We further analyzed the results of the running start assays for Dpo4, hPolη, hPolι and hPolκ by quantitatively determining their bypass efficiencies (dGAP bypass%) using the same equation as for AP site bypass (Chapter 4, Eq 1) [205]. The corresponding plot of dGAP bypass% as a function of time for each enzyme (Figure 7.2) demonstrated that hPolη required the shortest time to bypass dGAP while hPol required the longest time to bypass dGAP. We defined t50^bypass as the time required to bypass 50% of the total dGAP lesions encountered to quantitatively define dGAP bypass efficiency. The t50^bypass values, estimated from Figures 5.1 and 6.2, are shown in Table 7.2. Since hRev1 did not bypass dGAP after 720 min (Figure 6.2H), we did not plot the time-dependent dGAP bypass% for this enzyme. Based on the t50^bypass values, hPolη possessed the highest dGAP bypass efficiency with hPol being < 2-fold slower than hPolη.
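
As a rough illustration of how a t50 bypass value can be read off a bypass% time course, the sketch below linearly interpolates between the two time points that bracket 50%. The function name `estimate_t50` and the example time course are hypothetical and are not data from this work; the values in Table 7.2 may have been estimated differently.

```python
from typing import Optional, Sequence


def estimate_t50(times: Sequence[float], bypass_pct: Sequence[float],
                 target: float = 50.0) -> Optional[float]:
    """Return the interpolated time at which bypass% first reaches `target`.

    Returns None if the target is never reached, as for an enzyme that does
    not bypass the lesion within the reaction time.
    """
    if len(times) != len(bypass_pct):
        raise ValueError("times and bypass_pct must have the same length")
    points = sorted(zip(times, bypass_pct))
    prev_t, prev_y = points[0]
    if prev_y >= target:
        return prev_t
    for t, y in points[1:]:
        if y >= target:
            # Linear interpolation between the two bracketing time points.
            return prev_t + (target - prev_y) * (t - prev_t) / (y - prev_y)
        prev_t, prev_y = t, y
    return None


# Hypothetical time course (minutes, bypass%); not measured values.
example_times = [0, 10, 20, 40, 80]
example_bypass = [0.0, 20.0, 40.0, 60.0, 75.0]
print(estimate_t50(example_times, example_bypass))  # 30.0
```

A time course that never reaches 50% returns `None`, which corresponds to the situation described for hRev1, where no value could be assigned.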


Standing Start Analysis of hRev1. Since hRev1 failed to synthesize full-length product on S-1a 17/26-mer and S-1b 17/26-mer-dGAP, we hypothesized that hRev1 lost activity over time prior to reaching the DNA lesion site. Therefore, we performed a series of standing start assays with hRev1 in which the incoming dNTP was incorporated opposite dGAP (Figure 7.3). These data confirmed what was observed during our running start and pre-steady state kinetic single-turnover dNTP incorporation assays (Chapter 6). That is, hRev1 could incorporate dNTP opposite dGAP, but failed to extend the dGAP bypass products to full-length size. Similar to previous studies [169], hRev1 preferred to incorporate dCTP over other dNTPs opposite both dG and dGAP. This characteristic of hRev1 was expected as hRev1 is known to function as a dCTP transferase in vitro [89, 90, 169]. Interestingly, the dNTP incorporation fidelity of hRev1 increased in the presence of dGAP, which also correlated well with our kinetic dGAP study (Chapter 6). Since hRev1 failed to generate the full-length dGAP bypass products (Figure 6.2H), we could not quantitatively analyze its bypass specificity using SOSA (Scheme 7.1).

Quantitative Analysis of Mutational Spectra Derived from the dGAP Bypass. Building on previously published SOSAs [58, 205], we modified the SOSA method for lesion bypass analysis to separate the longer DNA template from the shorter dGAP bypass products by denaturing PAGE instead of DNA template cleavage by human AP endonuclease (Scheme 7.1). Our newly modified SOSA still allowed adequate time for each enzyme to generate the full-length dGAP bypass products.


Dpo4. For SOSA with Dpo4 and S-3b 17/73-mer-dGAP, we sequenced 52 colonies, which are summarized in Figure 7.4A. Our results showed that Dpo4 incorporated either dCTP (47/52 colonies, 90.4%), dATP (4/52 colonies, 7.7%), or no dNTP (1/52 colonies, 1.9%) opposite dGAP (Figure 7.5A). Thus, Dpo4 was mostly error-free while incorporating dNTP opposite the bulky dGAP lesion. These data supported our previous kinetic investigation of dGAP bypass catalyzed by Dpo4 [57]. The dATP misincorporation opposite the dGAP lesion was likely due to a mechanism similar to the well-established “A-rule” [164] where the lesion is looped out of the Dpo4 active site. In addition, Dpo4 generated a small number of substitutions, deletions, and mixed mutations at positions located both upstream and downstream from dGAP (Figure 7.4A).
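
The percentages quoted for events opposite dGAP follow directly from the colony counts. The short sketch below reproduces that arithmetic for the Dpo4 counts given above; the function name `incorporation_frequencies` is a hypothetical helper and is not part of the SOSA protocol.

```python
def incorporation_frequencies(counts):
    """Convert colony counts per event into percentages of all sequenced colonies."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no colonies were counted")
    return {event: round(100.0 * n / total, 1) for event, n in counts.items()}


# Colony counts for Dpo4 opposite dGAP, as reported in the text (52 colonies).
dpo4_counts = {"dCTP": 47, "dATP": 4, "no dNTP": 1}
print(incorporation_frequencies(dpo4_counts))
# {'dCTP': 90.4, 'dATP': 7.7, 'no dNTP': 1.9}
```

The same helper can be applied to the counts reported for the other enzymes, provided that every sequenced colony is assigned to exactly one event.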

To quantitatively compare mutation frequencies at the lesion site and other DNA template positions, we plotted relative error% as a function of template positions (Figure 7.6A) using the same method described for AP site bypass (Figure 4.5) [205]. Interestingly, Figure 7.6A indicates that dNTP incorporation was mostly affected at the lesion site and Position +1 (one nucleotide downstream from the dGAP site) as evidenced by the highest relative error (~9.6%) in the sequencing window. The slight differences in mutation frequencies at Positions 0 and +1 are displayed in Figures 7.5A and 7.5C.

For accurate evaluation of the dGAP effect on upstream and downstream nucleotide incorporations catalyzed by Dpo4, we performed SOSA using a control DNA substrate S-3a 17/73-mer which has a template base dG at the corresponding position of dGAP in S-3b (Table 7.1). Based on the 52 sequences gathered (Figure 7.4B), Dpo4 incorporated mostly correct dCTP (98.1%) and occasionally did not incorporate a dNTP (1.9%) opposite the template base dG at Position 0 (Figure 7.5B). The mutations created by Dpo4 in the vicinity of Position 0 only changed in relative frequency, being mostly single-base deletions and substitutions in the presence and absence of dGAP (Figures 7.6A and 7.6B). Overall, the base substitution error rate of Dpo4 was calculated to be 1.2 × 10^-2 with S-3b 17/73-mer-dGAP and 3.1 × 10^-3 with S-3a 17/73-mer (Table 7.3). These values were comparable to the pre-steady-state kinetic fidelities of 1.4 × 10^-2 and 3.7 × 10^-3 for DNA containing dGAP or dG, respectively, within the active site of Dpo4 [57]. Markedly, small differences in the mutagenic data in Figures 7.4A and 7.4B suggested that the base deletions and substitutions observed with S-3b (Table 7.1) were most likely caused by the presence of dGAP. The 1.9% deletion error frequency at Position -5 and 3.8% overall error frequency at Position -3 (Figure 7.6A) also hinted that Dpo4 was able to detect dGAP before the DNA lesion entered its active site.

hPolη. The 44 colonies containing SOSA products from hPolη and S-3b 17/73-mer-dGAP were sequenced and are summarized in Figure 7.4C. Our data showed that hPolη incorporated dATP (5/44 colonies, 11.4%), dCTP (16/44 colonies, 36.4%), dGTP (1/44 colonies, 2.3%), and dTTP (2/44 colonies, 4.5%) opposite the bulky dGAP lesion (Figure 7.5A). However, hPolη preferred to create a -1 frameshift, i.e. no dNTP incorporation opposite dGAP (20/44 colonies, 45.5%). Furthermore, the dATP misincorporation opposite dGAP by hPolη was likely due to a mechanism that utilized the base-stacking potential of dATP at the DNA lesion site. Therefore, hPolη was extremely error-prone during TLS of dGAP and was not likely to incorporate any dNTP opposite this bulky DNA lesion. Notably, the percentage of G→T mutations observed with hPolη (11.4% dATP incorporation opposite dGAP) was similar to the measured G→T transversion percentage (6.2%) within kidney cell lines in the presence of dGAP [195]. Most intriguingly, hPolη generated a significant number of base substitutions, deletions, and mixed mutations at DNA template positions located both upstream and downstream from dGAP, including a -8 frameshift within the DNA lesion site. Such large frameshifts were observed in the presence of dGAP within E. coli cells defective in the dnaQ gene, which encodes the 3′→5′ exonuclease (ε subunit) of DNA polymerase III [188].

Next, we plotted relative error% as a function of DNA template positions (Figure 7.6C). Markedly, dNTP incorporation fidelity of hPolη seemed to be the most affected at Position +1 (one nucleotide downstream from the dGAP site) with the highest relative error frequency (65.9%) in the sequencing window. Position 0 possessed the second highest relative error frequency (38.6%). Notably, the preference at both of these sites was no dNTP incorporation, with more deletions occurring opposite dGAP than Position +1 (Figure 7.6C). Additionally, the total relative error frequency along the damaged DNA template (S-3b, Table 7.1) appeared to oscillate every 3 base positions, possibly due to the dGAP presence as the DNA helix turned. Such a cyclic pattern was observed kinetically as Dpo4 bypassed a cisplatin-dGpG [108].

For further evaluation of the dGAP effect on upstream and downstream dNTP incorporations catalyzed by hPolη, we performed SOSA using control DNA substrate S-3a 17/73-mer (Table 7.1). The subsequent control sequences are displayed in Figure 7.4D. Surprisingly, hPolη incorporated only correct dCTP opposite a template base dG at Position 0 (Figures 7.5B and 7.6D). However, the mutations created within the vicinity of Position 0 changed from mostly multi-base deletions and single-base substitutions in the presence of dGAP to mostly single-base substitutions and few multi-base deletions in the absence of dGAP (Figures 7.6C and 7.6D). As shown in Figure 7.6D, hPolη created mutations with an average relative error% of 4.7% at each DNA template position. This value was approximately 2.5-fold lower than the average relative error% (11.9%) for S-3b 17/73-mer-dGAP (Figure 7.6C). Overall, the base substitution error rate of hPolη was calculated to be 6.1 × 10^-2 with S-3a 17/73-mer (Table 7.3). This value was comparable to the base substitution error rate of 7.5 × 10^-2 calculated using a different SOSA DNA substrate [205], to the dNTP incorporation fidelity of 5.6 × 10^-2 measured by steady-state kinetics [166], and to the dATP incorporation fidelity of 4.8 × 10^-2 to 7.7 × 10^-2 measured by pre-steady state kinetics (Chapter 6). Interestingly, differences in the mutagenic data in Figures 7.6C and 7.6D suggested that the base deletions observed with S-3b 17/73-mer-dGAP were most likely caused by the presence of the bulky lesion. Furthermore, the observed increase in base deletions from Position -8 to Position +11 also suggested that hPolη could detect the presence of dGAP before the DNA lesion entered its active site.

hPolκ. We sequenced 45 colonies that contained dGAP bypass products synthesized by hPolκ and the resulting data are shown in Figure 7.4E. Of the dNTP incorporation events opposite dGAP (Figure 7.5A), 37.8% (17/45 colonies) were dCTP incorporations, 26.7% (12/45 colonies) were dATP incorporations, 33.3% (15/45 colonies) were no dNTP incorporations, and 2.2% (1/45 colonies) were dTTP incorporations. These data indicated that hPolκ, unlike Dpo4 and hPolη, did not clearly follow any specific bypass mechanism. Markedly, the DNA template positions that represented the most base deletions shown in Figure 7.6E were the same strong pause sites observed in Figure 6.2D.

We further examined the dGAP effect on DNA synthesis by hPol by performing SOSA with control DNA substrate S-3a 17/73-mer (Figure 7.4F). Our SOSA data (Figure 7.4F) showed that hPolκ made significantly fewer errors with control S-3a 17/73-mer (Figure 7.6F) than with S-3b 17/73-mer-dGAP (Figure 7.6E). Although hPolκ created mostly single-base deletions in the immediate vicinity of the bulky 1-AP adduct, mostly single-base substitutions were observed with DNA S-3a (Figure 7.6F). Based on Figure 7.4F, the base insertion, substitution, and deletion error rates for normal DNA synthesis catalyzed by hPol were calculated to be 2.5 × 10^-3, 2.9 × 10^-2 and 7.4 × 10^-3, respectively (Table 7.3). The base substitution error rate was similar to 2.9 × 10^-2 measured by pre-steady state kinetic studies (Chapter 6). However, the dNTP incorporation fidelity in the presence of dGAP from our pre-steady state kinetic studies was an order of magnitude higher than the base substitution error rate from our SOSA study. This difference can be accounted for by the following factors. First, our pre-steady state kinetic study was not designed to include possible base deletion events in the dNTP incorporation fidelity value. As shown in Figures 7.5A and 7.5C, a large number of events were no dNTP incorporations opposite both dGAP and Position +1. Second, our pre-steady state kinetic study did not extend beyond the possible pause sites in Figure 6.2, and thus did not examine the well-known lesion bypass extension abilities of hPol [99, 100, 102, 135, 170]. With our SOSA data (Table 7.3), we calculated error rates that included more than one template position at a time, and monitored base deletion, insertion and substitution events separately. Lastly, if we re-calculate the error rate of hPol bypassing a 1-AP site for Position 0 only, the new base substitution error rate is 2.7 × 10^-1 for dATP misincorporations, which was very similar to the pre-steady state kinetic dATP misincorporation fidelity value of 1.8 × 10^-1 (Chapter 6). These data implied that hPol was more error-free during the dGAP bypass extension than during insertion opposite dGAP, which was observed in Figure 7.6E. Additionally, when comparing Figures 7.6E and 7.6F, it was clear that the relative error% at each DNA template position between Position -6 and Position +11 increased significantly in the presence of dGAP.

Summary of data. Human Polη and Pol were more error-prone than Dpo4 as they approached dGAP, inserted dNTPs opposite the DNA lesion, and extended their dGAP bypass products to the full-length 60-mer (Figure 7.4). Clearly, Polη possessed the highest mutation frequency at most DNA template positions while Dpo4 was the most error-free. Interestingly, all three enzymes had higher mutation frequencies at Positions 0 and +1 than at any other DNA template positions (Figure 7.6). When bypassing dGAP, both hPolη and hPol made more base deletions than dNTP misincorporations (Figure 7.5). Among the dNTP misincorporations, dATP had the highest incorporation frequency for each of the three enzymes. Moreover, none of the three enzymes made a +1 frameshift when bypassing the bulky DNA lesion. Taken together, the bypass of dGAP is highly mutagenic when catalyzed by hPolη and hPol. Our data also suggested that hPol is the most likely enzyme to incorporate a dNTP opposite dGAP in vivo. This theory is supported by the crystal structures that depict the N-terminal ‘Digit’ subdomain of hPol shielding the active site while providing more space for hydrophobic DNA lesions like dGAP within the active site [102, 172].

Future Directions

A mechanistic model of how each human Y-family DNA polymerase bypasses dGAP in vitro is being constructed. Presently, we are quantitating the running start gel images in Figure 6.2 to retrieve the t50 values and DNA synthesis efficiency comparisons as determined in Chapter 4. We are also collecting SOSA data for hPol in order to have a complete comparison of dGAP bypass abilities of all human Y-family members. Our initial sample group of 5 colonies containing the dGAP bypass products was recently analyzed and is shown in Figure 7.4G. Similar to the SOSA on abasic site bypass with hPol, there were no identical DNA sequences. These data also hinted that hPol may be the most error-prone during dGAP bypass. We do not plan to perform SOSA with hRev1 due to the limitations noted in Chapters 4 and 6. However, the inability to analyze the mutagenic spectrum for hRev1 will not lessen the significance of our results.

We plan to elucidate the structural changes of each human Y-family DNA polymerase during dGAP bypass. We are currently determining the crystal structure of hPol binding to a DNA substrate containing a site-specific dGAP. In addition, we will expand our SOSA to determine the long-range effects of each accessory protein on each Y-family DNA polymerase's DNA synthesis fidelity.

7.2 Mutagenic Analysis on Cis-Syn Thymine Dimer Bypass Catalyzed by Y-Family DNA Polymerases

DNA damage from UV radiation has been documented for decades as a source of skin cancer [70]. The most studied example of UV radiation-derived skin cancer is Xeroderma Pigmentosum (XP), which is characterized by extreme sensitivity to sunlight and increased incidence of skin cancer [34, 67, 69, 209, 210]. One of the most studied DNA lesions that arise from UV exposure is a cyclobutane pyrimidine dimer (CPD), the cis-syn thymine dimer (cis-syn TT, Figure 7.7) [211]. In a series of investigations, it was determined that the XP variant gene XPV was a mutation of POLH, and that hPolη (the POLH gene product) was responsible for error-free cis-syn TT bypass in vivo [34, 67-69]. Intriguingly, hPol incorporates dNTPs more efficiently in the vicinity of a cis-syn TT than during normal DNA synthesis [212], which is surprising given the 20-30° distortion in DNA structure induced by a cis-syn TT [213]. However, the observed increase in skin cancer incidence suggested that another Y-family DNA polymerase performs error-prone cis-syn TT bypass in vivo [71, 214]. One such enzyme may be hPol, which is known to be hypermutagenic during normal DNA synthesis [84, 205, 215, 216]. Previous studies have also identified hRev1 as a possible player in the cis-syn TT bypass pathway in vivo by functioning as a recruiter for other Y-family DNA polymerases [94, 217]. Thus, the enzymes responsible for error-prone cis-syn TT bypass in vivo have not been explicitly identified. Furthermore, the genotypic error rate of cis-syn TT bypass catalyzed by hPolη has not been determined. In order to obtain more details on cis-syn TT bypass in vitro, we quantitatively assessed the cis-syn TT bypass abilities of the four human Y-family DNA polymerases as well as Dpo4 via running start assays and SOSAs.

Results

Although there is a plethora of data on cis-syn TT bypass in vitro for Dpo4, hPolη and hPol [44, 68, 84, 107, 157], the cis-syn TT bypass abilities of hPol and hRev1 are not clearly defined [100, 102]. Therefore, we performed running start assays in the absence and presence of a cis-syn TT for all four human Y-family DNA polymerases and Dpo4 using the same conditions as previously described by our laboratory [205]. As shown in Figure 7.8, we observed that all enzymes bypassed a cis-syn TT with varying catalytic efficiencies. For comparison, we also performed running start assays on control S-4a 17/77-mer DNA substrate (Table 7.1) for each enzyme.

Running Start Analysis. In general, an accumulation of intermediate product during primer extension using S-4b 17/77-mer-CPD with no correlating accumulation of intermediate product for primer extension using S-4a 17/77-mer was considered enzymatic pausing at the specific template position [205]. As observed in Figure 7.8B, Dpo4 significantly paused only while incorporating dNTPs opposite the 3′-T of the cis-syn TT (Position T1), noted by an accumulation of 26-mer. Such pausing opposite the 3′-T of cis-syn TT was previously observed for Dpo4 at both 37 °C and 60 °C [44]. Notably, there was a stronger Dpo4 pause located upstream of the cis-syn TT site (20-mer) that was not observed in Figure 7.8A. These data indicated that: i) the presence of a cis-syn TT indeed perturbs the DNA synthesis efficiency of Dpo4; and ii) Dpo4 can ‘sense’ a bulky double-base lesion upstream of the active site.
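
The pausing criterion described above compares gel lanes qualitatively. One way it could be expressed programmatically is sketched below. This is only an illustration: the 10% threshold (`min_fraction`), the function name `find_pause_sites` and all band fractions are assumptions introduced here, chosen to mimic the qualitative Dpo4 pattern, and are not quantities measured in this work.

```python
def find_pause_sites(lesion_lane, control_lane, min_fraction=0.10):
    """Flag product lengths that accumulate with the damaged template only.

    Each lane maps product length (nt) to its fraction of the total lane
    signal. `min_fraction` is an arbitrary, assumed cutoff for calling an
    intermediate "accumulated".
    """
    pauses = []
    for length in sorted(lesion_lane):
        accumulates_with_lesion = lesion_lane[length] >= min_fraction
        accumulates_in_control = control_lane.get(length, 0.0) >= min_fraction
        if accumulates_with_lesion and not accumulates_in_control:
            pauses.append(length)
    return pauses


# Hypothetical band fractions at a single time point; not measured values.
lesion = {20: 0.25, 25: 0.03, 26: 0.18, 27: 0.04, 77: 0.50}
control = {20: 0.02, 25: 0.02, 26: 0.01, 27: 0.02, 77: 0.93}
print(find_pause_sites(lesion, control))  # [20, 26]
```

The full-length 77-mer is not flagged because it accumulates in both lanes, which matches the requirement that a pause site show no correlating accumulation on the control substrate.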

Unlike Dpo4, hPolη paused while incorporating opposite the 5′-T of the cis-syn TT (Position T2), marked by accumulation of 27-mer in the running start assay in Figure 7.8D. Based on the ternary crystal structure of yeast Polη [211], the 3′-T of the cis-syn TT was in an almost identical conformation to normal template dT. In contrast, the 5′-T of the cis-syn TT shifted subtly with respect to nearby amino acid side chains, accounting for the slight stalling with no observed change in error frequency [211]. Similar to Dpo4, hPolη paused longer at multiple sites upstream of the cis-syn TT site, i.e. the accumulation of 20-, 21-, 23-, and 24-mers.

As shown in Figure 7.8F, hPol was able to synthesize full-length cis-syn TT bypass products. This is the first evidence that hPol can bypass cis-syn TT in vitro. Previous studies under different reaction conditions depict hPol as unable to insert dNTPs opposite both sites of the lesion [64, 100, 102, 218, 219]. In contrast to Dpo4 and hPolη, hPol paused at both sites of the cis-syn TT as well as one base upstream, which was marked by the accumulation of 25-, 26- and 27-mers (Figure 7.8F). This enzyme pausing upon dNTP incorporation opposite both thymines of the DNA lesion implied that hPol may only contain one template base within its active site. Moreover, the pausing of hPol was much stronger upon dNTP incorporation opposite the 3′-T than the 5′-T of cis-syn TT. This observation could be explained by a recent crystallographic study of hPol. Interestingly, hPol is able to accommodate the 5′-T of cis-syn TT in the same conformation as an undamaged dT while incorporating dATP opposite this 5′-T of cis-syn TT [219]. Thus, hPol would have less pausing while incorporating dNTP opposite the 5′-T than the 3′-T of cis-syn TT. Additionally, hPol paused longer at sites upstream of cis-syn TT, as there were significant accumulations of 20- and 22-mer in Figure 7.8F.

The more distributive nature of hPol than hPolη and hPol was observed as there was significant pausing during the running start assay using DNA substrate S-4a 17/77-mer (Figure 7.8G). Furthermore, there was significantly less full-length product after 240 min with hPol (Figure 7.8G) than after 15 s and 60 s for hPolη (Figure 7.8C) and hPol (Figure 7.8E), respectively. During the cis-syn TT bypass, hPol also paused while incorporating dNTPs immediately upstream of the lesion site (25-mer accumulation) and during the bypass extension (28-mer accumulation). Notably, the same sites of pausing during normal DNA synthesis (Figure 7.8G) were also observed during cis-syn TT bypass (Figure 7.8H), but with less intensity. These data suggest that hPol ‘sensed’ the bulky lesion upstream and improved DNA synthesis efficiency in the presence of cis-syn TT. However, the reduction of full-length product and increased primer remaining after 240 min (Figure 7.8G versus Figure 7.8H) suggested that hPol could not extend most primers in the presence of this lesion.

Our data are the first evidence that Rev1 can indeed incorporate dNTPs opposite cis-syn TT, as there are products that extended well beyond the site of this lesion (Figure 7.8J). The very distributive nature of hRev1 was observed as there was a very strong pause after approximately 10 dNTP incorporations and very little detection of full-length product during normal DNA synthesis (Figure 7.8I). Interestingly, hRev1 appeared to progress farther on the S-4b 17/77-mer-CPD (Figure 7.8J) than on the S-4a 17/77-mer (Figure 7.8I), suggesting that the presence of cis-syn TT increased the processivity of hRev1.

We also determined the cis-syn TT bypass efficiency for each enzyme based on the corresponding running start assays in Figure 7.8. We determined the lesion bypass% of the 3′-T (T1) and 5′-T (T2) of the cis-syn TT site as well as the bypass% of the total lesion site at each time. The lesion bypass% was calculated as previously described [205] using Eq 1:

Lesion bypass% = (bypass events / encounter events) × 100%    (Eq 1)

The T1 bypass events were calculated by summing the concentrations of all intermediates and full-length products of size equal to or longer than 27-mer. The T2 bypass and total lesion bypass events encompassed all products that were equal to or longer than 28-mer. The T1 and total lesion encounter events included all products equal to or longer than 26-mer. The T2 encounter events consisted of all products equal to or longer than 27-mer. The corresponding plot for the total lesion bypass% is shown in Figure 7.9. From these lesion bypass% plots, we estimated the t50^bypass of the T1, T2 and total cis-syn TT bypass (Table 7.4). Based on these t50^bypass values, the bypass efficiencies were as follows: hPol > Dpo4 > hPol >> hPol >> hRev1. Although there was a t50^bypass for hRev1 at both T1 and T2, hRev1 did not bypass 50% of the total lesion site within the reaction time of 4 h. This is possible based on Eq 1, where we only account for the encounter events and longer products. Consequently, there are fewer T2 encounter events than T1 encounter events, particularly if the enzyme paused at T1.
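
The product-length cutoffs used with Eq 1 can be made concrete with a short sketch. The code below applies the stated definitions (encounter and bypass events as sums over products at or above 26-, 27- and 28-mer) to a single hypothetical time point. The function names and all concentrations are assumptions for illustration only and are not measured data.

```python
def sum_at_least(products, min_length):
    """Sum concentrations of all products of length >= min_length (nt)."""
    return sum(conc for length, conc in products.items() if length >= min_length)


def lesion_bypass_percent(products):
    """Apply Eq 1 to T1, T2 and the total cis-syn TT site.

    `products` maps product length (nt) to concentration at one time point.
    """
    at_least_26 = sum_at_least(products, 26)  # T1 and total encounter events
    at_least_27 = sum_at_least(products, 27)  # T1 bypass and T2 encounter events
    at_least_28 = sum_at_least(products, 28)  # T2 and total bypass events

    def pct(bypass, encounter):
        return 100.0 * bypass / encounter if encounter > 0 else float("nan")

    return {
        "T1": pct(at_least_27, at_least_26),
        "T2": pct(at_least_28, at_least_27),
        "total": pct(at_least_28, at_least_26),
    }


# Hypothetical product concentrations (nM) at one time point; not measured values.
example = {25: 30.0, 26: 40.0, 27: 24.0, 28: 16.0, 77: 20.0}
print(lesion_bypass_percent(example))
# {'T1': 60.0, 'T2': 60.0, 'total': 36.0}
```

Under these definitions the total bypass fraction is the product of the T1 and T2 fractions, so in this example both sites exceed 50% while the total stays at 36%. This is consistent with an enzyme having a defined t50 bypass value at T1 and at T2 without reaching 50% bypass of the total lesion site.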

Quantitative Analysis of Mutational Spectra Derived from the cis-syn TT Bypass. Next, we sought to visualize the mutagenic potential during cis-syn TT bypass catalyzed by each human Y-family DNA polymerase as well as the model Y-family member Dpo4. To obtain the mutagenic spectra of each enzyme, we performed SOSA (Scheme 7.2) using the same method as described in Section 7.1 with DNA substrate S-4b 17/77-mer-CPD. For further examination of the cis-syn TT effect on DNA synthesis for each enzyme, we also performed SOSA with S-4a 17/77-mer. Thus far, we only have a statistically relevant number of sequences (≥ 40) for SOSA with S-4a 17/77-mer for Dpo4 (Figure 7.10B).

Dpo4. While incorporating dNTPs opposite T1 and T2 on S-4a 17/77-mer, Dpo4 was always error-free (Figures 7.11A and 7.11C). Within the sequence window, Dpo4 only made 4 base substitutions (Figure 7.12B). Derived from these data, the Dpo4 overall error rate of 2 × 10^-3 (Table 7.5) was similar to the corresponding error rate for SOSA with S-3a 17/73-mer (Table 7.3) and dNTP incorporation fidelity values from our previous pre-steady state kinetic studies [27, 57].

At the time of this dissertation, we have sequenced 28 colonies with SOSA products from S-4b 17/77-mer-CPD and Dpo4. Opposite the 3′-T of cis-syn TT, Dpo4 marginally preferred correct dATP incorporation over no dNTP incorporation (Figure 7.11B). Opposite the 5′-T of cis-syn TT, Dpo4 preferred skipping the site over correct dATP incorporation or dCTP misincorporation (Figure 7.11D). When including upstream and downstream sites from the cis-syn TT (Figure 7.12A), the overall error rate increased to 6.1 × 10^-2 (Table 7.11). Although these data were only preliminary, they suggested a 30-fold increase of mutations in the presence of a cis-syn TT. Such an increase in mutagenic potential in the presence of a double-base lesion was previously observed during cisplatin-dGpG bypass catalyzed by Dpo4 [108].
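
The fold changes quoted in this chapter are simple ratios of the reported values. As a quick check of the arithmetic, the sketch below recomputes two of them from numbers given in the text; `fold_change` is a hypothetical helper name.

```python
def fold_change(value, reference):
    """Return the ratio of `value` to `reference`."""
    if reference == 0:
        raise ValueError("reference must be non-zero")
    return value / reference


# Dpo4 overall error rate with and without the cis-syn TT (values from the text).
print(round(fold_change(6.1e-2, 2e-3), 1))  # 30.5

# Average relative error% with and without dGAP for the Section 7.1 comparison.
print(round(fold_change(11.9, 4.7), 1))  # 2.5
```

Both ratios agree with the rounded values stated in the text (30-fold and approximately 2.5-fold).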

hPolη. Currently, 11 colonies have been sequenced with SOSA products from S-4a 17/77-mer and hPolη (Figure 7.10C). As expected, hPolη was more error-prone than Dpo4, with a few base deletions and dTTP misincorporations opposite T1 and T2 of S-4a (Figures 7.11A and 7.11C). Within the sequence window (Figure 7.12C), there were significant amounts of base substitutions, insertions and deletions that led to an overall error rate of 8.2 × 10^-2. This value was approximately the same as those determined from SOSA with S-3a 17/73-mer (Table 7.3) and SOSA with 14/63CTL [205].

hPol. We have sequenced 25 colonies from SOSA with S-4a 17/77-mer (Figure 7.10E) and 14 colonies from SOSA with S-4b 17/77-mer-CPD (Figure 7.10D) for hPol.

Opposite T1 and T2, hPol was more error-free than hPolε over DNA substrate S-4a

(Figures 7.11A and 7.11C). Notably, hPol was more likely to misincorporate dTTP or to skip dNTP incorporation than to correctly incorporate dATP opposite 3′-T of the cis-syn

TT (Figure 7.11B). There were no strong preferences of action opposite 5′-T of the cis-

237 syn TT (Figure 7.11D), as the relative error frequency for dATP, dCTP and dGTP incorporations and base deletions were similar. This data implied that upstream events dictated dNTP incorporation of this site for hPol. Within the sequence window (Figures

7.12D and 7.12E), hPol displayed an overall error rate of 5.6 x 10-2 for normal DNA synthesis and 1.5 x 10-1 for cis-syn TT bypass (Table 7.5). The error rate value for normal

DNA synthesis was similar to those determined in previous SOSA studies (Table 7.3)

[205].

hPol. We also sequenced 6 colonies from SOSA with S-4a 17/77-mer (Figure 7.10G) and 21 colonies from SOSA with S-4b 17/77-mer-CPD for hPol (Figure 7.10F).

As expected, hPol displayed the most errors not only opposite T1 and T2 (Figures 7.11A and 7.11C), but also throughout the sequence window (Figure 7.12G) with an overall error rate of 1.4x10-1 (Table 7.5). Opposite both 3′-T and 5′-T of cis-syn TT, hPol preferred to not incorporate a dNTP (Figures 7.11B and 7.11D). Thus, hPol was able to literally bypass the cis-syn TT site with low bypass efficiency as well as a high rate of

DNA polymerase error. Within the sequence window (Figure 7.12F), hPol displayed mutagenic events that were cyclic, i.e. increased mutations every 4 to 5 bases, upstream and downstream of the DNA lesion site. Such a cyclic trend of mutations downstream of a double-base DNA lesion was observe for cisplatin-dGpG bypass catalyzed by Dpo4

[108]. We hypothesize that this cyclic observation is due to an increased DNA structural distortion at each DNA helix turn in the vicinity of the lesion.


Future Directions

The goal of this study is to profile the mutagenic potential during cis-syn TT bypass when catalyzed by all human Y-family DNA polymerases as well as the model Y-family member Dpo4. So far, our study provided more evidence that hPol is the DNA polymerase that bypasses cis-syn TT in vivo due to its high efficiency of DNA lesion bypass. To further examine the double-base DNA lesion effect on DNA synthesis for each enzyme, we are currently quantitating running start assays using control DNA substrate S-4a (Figure 7.8) for each enzyme to estimate the corresponding t50 values of T1, T2 and the overall TT site. Comparison of t50 and t50^bypass for each site will quantitatively determine the DNA synthesis efficiency change in the presence of a cis-syn TT.

Our preliminary SOSA data revealed the mutagenic potential of each enzyme in the context of a cis-syn TT. However, to visualize true trends of each enzyme, we plan to sequence more colonies of each SOSA reaction that did not have a statistically relevant number of samples, i.e. all reactions except SOSA with S-4a 17/77-mer and Dpo4.

Although we had the surprising result that hRev1 had marginal cis-syn TT bypass ability, it did not reach 50% in product bypass during our running start assay (Figure 7.8J). Based on previous studies demonstrating Rev1 as a scaffold protein [93, 94, 217], we suspect that the main role of hRev1 in vivo is not to act as a dCTP transferase [32, 88-92, 163]. Due to the extremely low DNA polymerization efficiency and known dCTP transferase activity, we will not sequence hRev1 bypass products. However, we will perform a standing start assay using the same conditions as described in Section 7.1 and with DNA substrates that will position hRev1 directly opposite each base of the cis-syn TT. With this information, we can depict the cis-syn TT bypass behavior of hRev1 as a TLS enzyme in vivo. We will then start pre-steady state kinetic studies for each relevant Y-family enzyme to begin to establish the unique minimal kinetic mechanism utilized by each DNA polymerase during cis-syn TT bypass in vivo. With this information, we can design better treatment for patients with skin cancer that arose from the formation of cis-syn TT.

7.3 Mutagenic Analysis on Cisplatin-dGpG Bypass Catalyzed by Y-Family DNA Polymerases

A common anticancer drug, cis-diamminedichloroplatinum(II) (cisplatin), is potent against ovarian, head, neck, testicular and non-small cell lung cancers. It is an effective anti-cancer agent because it hinders cellular division, and thus slows tumor growth. Cisplatin targets cellular DNA to create covalent adducts on the N7 positions of purines with 65% of products formed being cis-[Pt(NH3)2{d(GpG)-N7(1),-N7-(2)}] (cisplatin-dGpG) intrastrand crosslinks [220, 221]. The appearance of these DNA adducts is used as a visual marker to detect the effectiveness of cisplatin [222]. Furthermore, these DNA adducts can inhibit DNA replication as they stall replicative DNA polymerases [223-225]. Unfortunately, the prevalence of drug resistance is a major limitation of cisplatin application. The cisplatin drug resistance observed in some tumor cell lines is hypothesized to arise from cellular DNA repair pathways and/or Y-family DNA polymerase(s) bypassing the cisplatin adducts in vivo [221]. Of the four human Y-family DNA polymerases, hPolη is the most likely candidate to catalyze cisplatin adduct bypass in vivo as it can efficiently bypass another double-base lesion, cis-syn TT, error-free in vivo (Section 7.2). Indeed, several in vitro and cell-based assays have demonstrated the bypass ability of hPolη for cisplatin adducts [79, 80, 226]. The cisplatin adduct bypass ability of other Y-family DNA polymerases is largely unknown due to the research focus on hPolη. In our current study, we quantitatively assessed the cisplatin-dGpG bypass abilities of the four human Y-family DNA polymerases as well as Dpo4 via running start assays and SOSAs in order to obtain more details.

Results

To elucidate the potential mutagenic profile of each Y-family DNA polymerase as it bypasses cisplatin-DNA adducts, we needed to first determine which Y-family enzymes could bypass cisplatin-dGpG. Based on previous studies using various reaction conditions, Dpo4 and hPolε are able to bypass cisplatin-dGpG with varying efficiencies in vitro [78, 108] while hPol and hPol are not able to incorporate any dNTP opposite the DNA lesion sites [31, 63, 64]. Therefore, it will be beneficial to confirm the cisplatin-dGpG bypass ability of hPol and Dpo4 under our reaction conditions, which will provide a comparison basis for other Y-family DNA polymerases. Using the same conditions as used for running start assays with DNA containing cis-syn TT (Section 7.2) or abasic sites [205], we planned to perform running start assays with DNA substrates containing cisplatin-dGpG for each of the four human Y-family DNA polymerases as well as Dpo4. Running start assays with shorter DNA substrates containing cisplatin-dGpG have been completed for Dpo4 in our previous kinetic study [108]. For comparison, we also intended to perform running start assays on control S-5a 15/54-mer DNA substrate (Table 7.1) for each enzyme.

Running Start Analysis. To date, we have completed the running start assays with DNA substrates S-5a 15/54-mer and S-5b 15/54-mer-DDP for hPolε and hPol (Figure 7.13) as well as for Dpo4 (data not shown). We observed that all enzymes bypassed cisplatin-dGpG with varying efficiencies. Notably, hPol generated full-length lesion bypass products after 10 s, whereas hPol and Dpo4 required approximately 90 s and 60 s, respectively. The accumulation of intermediates during primer extension of S-5b 15/54-mer-DDP, without corresponding accumulation of intermediates during primer extension of S-5a 15/54-mer, was attributed to enzyme pausing at specific DNA template positions [205]. During both previous running start assays [108] and our current running start assays with Dpo4 and S-5b 15/54-mer-DDP, we observed the accumulation of 23-mers and 24-mers, with more accumulation of 24-mers than of 23-mers. Thus, Dpo4 paused significantly while incorporating dNTPs opposite the 3′-G (G1) and even more so opposite the 5′-G (G2) of cisplatin-dGpG [108]. This hypothesis was supported kinetically, as the dNTP incorporation efficiency of Dpo4 decreased about 12-fold more opposite G2 than opposite G1 [108]. This data confirmed that the presence of a cisplatin-dGpG perturbs the DNA synthesis efficiency of Dpo4.

In contrast to Dpo4, hPolε paused during the immediate extension of 5′-G bypass products, as there were large accumulations of 25-mers, 26-mers and 27-mers (Figure 7.13B). There was also a slight accumulation of 24-mers, hinting that dNTP incorporations catalyzed by hPolε opposite G2 were more difficult than opposite G1 of cisplatin-dGpG. Such DNA lesion bypass difficulties for hPolε have been observed during the bypass of cis-syn TT [62, 157, 211] as well as in biochemical assays for cisplatin-dGpG bypass [62], and suggested that hPolε employs a similar mechanism to bypass cisplatin-dGpG as cis-syn TT. This data also hinted that in vivo hPolε may participate in cisplatin-dGpG bypass in partnership with another DNA polymerase that would extend bypass products created by hPolε.

During the primer extension of S-5b 15/54-mer-DDP catalyzed by hPol, there were significant accumulations of 22-mers and 23-mers (Figure 7.13D). This data implied that hPol paused one template base before the cisplatin-dGpG as well as opposite G1 of the cisplatin-dGpG. The absence of strong hPol pausing opposite G2, i.e. large accumulation of 24-mers, hinted that hPol incorporated a dNTP opposite G2 more efficiently than opposite G1. Such a dNTP incorporation pattern is observed during cis-syn TT bypass catalyzed by Pol [219]. This data also implied that hPol would participate in cisplatin-dGpG bypass in vivo as the DNA lesion bypass product extender. Notably, this is the first evidence that hPol is able to bypass cisplatin-dGpG in vitro. Previous studies under different reaction conditions suggested that hPol is unable to insert dNTPs opposite both sites of cisplatin-dGpG [31, 64].

We also calculated the cisplatin-dGpG bypass efficiency for each enzyme based on the corresponding running start assays in Figure 7.13 and Eq 1 (Section 7.2). The G1 bypass events were calculated by summing the concentrations of all intermediates and full-length products of sizes equal to or longer than 24-mers. The G2 bypass and total lesion bypass events encompassed all products that were equal to or longer than 25-mers. The G1 and total lesion encounter events included all products equal to or greater than 23-mers, and the G2 encounter events consisted of all products equal to or greater than 24-mers. We determined the lesion bypass% of the 3′-G (G1) and 5′-G (G2) of the cisplatin-dGpG site as well as the lesion bypass% of the total lesion site at each time point. The corresponding plot for the total lesion bypass% is shown in Figure 7.14. From these lesion bypass% plots, we were able to estimate the $t_{50}^{\mathrm{bypass}}$ of the G1, G2 and total cisplatin-dGpG bypass (Table 7.6). Based on $t_{50}^{\mathrm{bypass}}$ values, the bypass efficiencies were as follows: hPol >> hPol >> Dpo4. Interestingly, the G1 $t_{50}^{\mathrm{bypass}}$ values for all enzymes were significantly longer than the G2 $t_{50}^{\mathrm{bypass}}$ values, indicating again that the dNTP incorporations opposite 5′-G were less difficult than dNTP incorporations opposite 3′-G of cisplatin-dGpG.
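The bookkeeping described above can be sketched in a few lines of Python. This is only an illustration: Eq 1 is defined in Section 7.2 and is not reproduced here, so the sketch assumes that lesion bypass% is the ratio of bypass events to encounter events, and it assumes that the time to 50% bypass is read off by linear interpolation between time points. The function names and the band concentrations in the usage note are hypothetical.

```python
def lesion_bypass_percent(bands, bypass_min_len, encounter_min_len):
    """Return bypass% for one time point.

    bands maps product length (nt) to its concentration (e.g. nM).
    Assumption: bypass% = 100 * bypass events / encounter events.
    """
    bypass = sum(c for n, c in bands.items() if n >= bypass_min_len)
    encounter = sum(c for n, c in bands.items() if n >= encounter_min_len)
    return 100.0 * bypass / encounter if encounter > 0 else 0.0


def t50_bypass(times, percents):
    """Linearly interpolate the time at which bypass% first reaches 50."""
    if percents and percents[0] >= 50.0:
        return times[0]
    for i in range(1, len(times)):
        p0, p1 = percents[i - 1], percents[i]
        if p0 < 50.0 <= p1:
            t0, t1 = times[i - 1], times[i]
            return t0 + (50.0 - p0) * (t1 - t0) / (p1 - p0)
    return None  # 50% bypass was never reached in the time course
```

With the size cutoffs given in the text, G1 bypass% for a lane would be `lesion_bypass_percent(bands, 24, 23)`, G2 bypass% would be `lesion_bypass_percent(bands, 25, 24)`, and total lesion bypass% would be `lesion_bypass_percent(bands, 25, 23)`.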

Quantitative Analysis of Mutational Spectra Derived from the Cisplatin-dGpG Bypass. To visualize the mutagenic potential during cisplatin-dGpG bypass catalyzed by each human Y-family DNA polymerase as well as the model Y-family member Dpo4, we designed SOSA (Scheme 7.3) using the same method as described in Section 7.2. For further examination of the cisplatin-dGpG effect on DNA synthesis for each enzyme, we also performed SOSA with S-6a 15/69-mer. We extended the DNA template from the running start assays (S-5) to 69-mer (S-6a and S-6b, Table 7.1) so that full-length products will be at least 15 bases shorter than the DNA template, and can be separated by size using denaturing PAGE (Scheme 7.3). Thus far, we have sequences for SOSA with 15/69-mers (Table 7.1) for Dpo4 and hPol (Figure 7.15).

Dpo4. While incorporating dNTPs opposite G1 and G2 into S-6a 15/69-mer, Dpo4 was again error-free (Figures 7.16A and 7.16C). Within the sequence window, Dpo4 only created 2 base deletions and 3 base substitutions (Figure 7.17B). As a result, the Dpo4 overall error rate was calculated as 6.2 × 10⁻³ (Table 7.7), which was similar to previously determined error rates from SOSA (Tables 7.3 and 7.5) and dNTP incorporation fidelity values from our previous pre-steady state kinetic studies [27, 57, 108].

For mutagenic potential analysis of cisplatin-dGpG bypass catalyzed by Dpo4, we sequenced 49 colonies containing DNA lesion bypass products from SOSA with Dpo4 and the DNA substrate S-6b 15/69-mer-DDP (Figure 7.15A). Opposite G1, Dpo4 preferred to correctly incorporate dCTP (91.8%) over misincorporating dATP (2.0%) or skipping the G1 site (6.1%), as shown in Figure 7.16B. Opposite G2, Dpo4 preferred to correctly incorporate dCTP (85.7%) over misincorporating dATP (8.2%) or skipping the G2 site (6.1%). Therefore, Dpo4 was more error-prone at the G2 site than at G1. Examining the entire sequence window (Figure 7.17A) depicted Dpo4 becoming more error-prone by creating mostly base deletions from G1 and G2 to Position +6. Moreover, the cyclic trend of increasing base substitutions from the cisplatin adduct sites to downstream DNA template positions followed the trend observed kinetically [108]. We hypothesized that the observed cyclic event is due to an increased DNA structural distortion at each helix turn of the DNA in the vicinity of the cisplatin adduct. Markedly, we observed that Dpo4 was more error-prone, with a higher frequency of base deletions, at Positions +1 and +2 than at G1 (Figure 7.17A), an observation of an event not obtainable using kinetic methods. Thus, SOSA was a suitable method to estimate enzyme accuracy during TLS. When including upstream and downstream sites from the cisplatin-dGpG (Figure 7.17A), the overall error rate increased 5.4-fold to 3.4 × 10⁻² (Table 7.7). Such an increase in mutagenic potential in the presence of a double-base lesion was previously observed using SOSA to analyze cis-syn TT bypass with Dpo4 (Section 7.2) and kinetically during cisplatin-dGpG bypass catalyzed by Dpo4 [108].

hPol. We sequenced 64 colonies with SOSA products from S-6a 15/69-mer and hPol

(Figure 7.15D), and sequenced 49 colonies with SOSA products from S-6b 15/69-mer-

DDP and hPol (Figure 7.15C). Opposite G1 on the DNA substrate S-6a, hPol was mostly error-free, only misincorporating dATP or creating a base deletion 3.1% of the time (Figures 7.16A). Opposite G2 on the DNA substrate S-6a, hPol created more mutations by creating a base deletion 15.6% of the time (Figures 7.16C). This data suggested that hPol discriminates against base repeats, a useful characteristic when extending mismatched primer termini. Notably, hPol was most likely to insert correct dCTP opposite both G1 and G2 of cisplatin-dGpG (Figures 7.16B and 7.16D). However, hPol was more error-prone opposite G2 than G1 with more base deletions at G2 than at

G1. When we accounted for the greater change in mutation frequency at G1 versus at G2

(Figure 7.16), it became clear that hPol had more difficulty incorporating a dNTP opposite G1 than G2 of cisplatin-dGpG. This may be due to similar difficulties as those 246 discussed for hPol bypassing another double-base lesion, cis-syn TT (Section 7.2).

Furthermore, this data was corroborated by the location of pause sites in Figure 7.13D.

When examining the entire sequence window for S-6a 15/69-mer, hPol exhibited a random spectrum of mutations with the exception of a notable increase in base deletions at Position G2 (Figure 7.17D). Consequently, caution was given to any mutations observed at Position G2 in the presence of a cisplatin-dGpG. The presence of a cisplatin adduct had a dramatic effect on the error frequency of hPol, with a sharp increase in base deletions for a 9-base region surrounding the cisplatin-dGpG sites as well as emergence of base insertions upstream (Figure 7.17C). Thus, it was suggested that hPol sensed the bulky double-base lesion cisplatin-dGpG, which was corroborated by the observation of the upstream pausing of hPol during running start analysis (Figure 7.13D). Within the sequence window (Figures 7.17C and 7.17D), hPol displayed an overall error rate of 1.9 × 10⁻² for normal DNA replication and 6.0 × 10⁻² for cisplatin-dGpG bypass. The error rate value for normal DNA synthesis was similar to those determined in aforementioned SOSA studies (Tables 7.9 and 7.11) [205].

Future Directions

The profiling of the mutagenic potential during cisplatin-dGpG bypass when catalyzed by all human Y-family DNA polymerases as well as the model Y-family member Dpo4 is currently underway. So far, we have provided evidence that hPol is the DNA polymerase that bypasses cisplatin in vivo due to its high lesion bypass efficiency when compared to hPol. To further examine the cisplatin adduct effect on DNA replication for each enzyme, we are currently quantitating running start assays using control DNA substrate S-5a (Figure 7.13) for each enzyme to estimate the corresponding $t_{50}$ values of G1, G2 and overall GG. Comparison of $t_{50}$ and $t_{50}^{\mathrm{bypass}}$ for each site will quantitatively determine the DNA synthesis efficiency change in the presence of a cisplatin-dGpG. After producing more DNA substrate, we will also perform running start assays with DNA substrates S-5a and S-5b for hPol and hRev1 as well as estimate the corresponding $t_{50}$ and $t_{50}^{\mathrm{bypass}}$ values to make a complete comparison of cisplatin-dGpG bypass efficiencies for human Y-family DNA polymerases.

From the SOSA data collected to date, the mutagenic potential of Dpo4 and hPol in the context of a cisplatin-dGpG was visualized. To obtain a complete picture of the mutagenic potential of cisplatin-dGpG, we plan to sequence colonies containing DNA lesion bypass products from SOSAs with S-6a 15/69-mer or S-6b 15/69-mer-DDP and hPolε. Past studies have depicted hPol and hPol as incapable of bypassing cisplatin-dGpG in vitro, but our data depicted hPol as capable of bypassing this lesion. Thus, the running start assay is a necessity for the evaluation of lesion bypass efficiencies under the same reaction conditions. If the pending running start assays also reveal that hPol bypasses cisplatin-dGpG and creates the full-length 54-mer, we will perform SOSAs with DNA substrates S-6a and S-6b for hPol. Based on previous studies demonstrating Rev1 as a scaffold protein [93, 94, 217] and due to the extremely low DNA polymerization efficiency calculated in our previous running start assays (Figures 4.1, 6.2 and 7.8), we expect the same low DNA polymerase processivity for hRev1 in the presence of cisplatin-dGpG and do not plan to sequence cisplatin-dGpG bypass products created by hRev1. However, we will perform a standing start assay using the same conditions as described in Section 7.1 and DNA substrates that will position hRev1 directly opposite each base of the cisplatin-dGpG. We will also start pre-steady state kinetic studies for each relevant Y-family enzyme to establish the unique minimal kinetic mechanism utilized by each DNA polymerase during cisplatin-dGpG bypass in vivo. With this information, we can design better chemotherapy for patients with tumors initially resistant to cisplatin.

7.4 Transient Kinetic Investigation of Human DNA Polymerase ε

The biological function(s) of each human Y-family DNA polymerase have not been explicitly defined [5]. However, there are numerous studies that proved that these DNA polymerases are needed for cell survival in the presence of DNA-damaging conditions, e.g. oxidative stress and UV-damage [67, 69, 72, 73, 216]. By investigating the human TLS process in vitro, our group [205] and other groups [61, 72, 162, 227] have noted that hPolε is the most likely enzyme to bypass different DNA lesions in vivo. Moreover, the mechanistic details of DNA replication catalyzed by hPolε in the absence [72, 76, 165, 201] or presence of various DNA lesions [72, 76, 78, 201, 228, 229] have begun to surface. Here, we provide additional insights into the hPolε kinetic mechanism during normal DNA polymerization in the presence of undamaged DNA.

Results

Burst kinetics of hPolη. In order to compare the kinetic parameters of Dpo4 and hPolε, we performed all experiments using the same DNA substrates (Table 7.1) that were used to investigate the kinetic mechanism of Dpo4 [27, 65, 117]. After purifying full-length human Polε from E. coli [205], we determined the active concentration of hPolε as well as the burst rate constant via a burst kinetic assay. A preincubated solution of 5ʹ-[32P]-labeled D-8 (100 nM) and hPolε (19 nM, UV-determined) at 37 °C was rapidly mixed with Mg2+•dTTP (0.5 mM) and quenched after various time intervals with EDTA (0.37 M). The products were resolved using denaturing PAGE and quantitated using ImageQuant. The product formation versus time was plotted and fit to the burst equation (Eq 2):

$$[\mathrm{Product}] = E_0 A\,[1 - \exp(-k_1 t) + k_2 t] \qquad \text{(Eq 2)}$$

The corresponding plot of the burst kinetic assay is shown in Figure 7.18. The active concentration of hPolε was calculated to be 15 nM (79%), and the rate constants for the burst and linear phases were determined to be 46.7 ± 5.6 s⁻¹ and 0.126 ± 0.007 s⁻¹, respectively. The linear phase rate constant was approximately equal to the steady-state rate constant (0.07 to 0.3 s⁻¹) obtained from previous studies [165, 201, 229], and thus was assumed to be the steady-state rate constant here. From this point on, we used the active concentration of hPolε for all kinetic assays.
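As a rough illustration of how Eq 2 relates the fitted parameters to the reported active concentration, the sketch below evaluates the burst equation and computes the active fraction from the 15 nM and 19 nM values quoted above. It is not the fitting procedure used in this work; the function names are hypothetical, and the amplitude A is treated here as the active fraction of enzyme, which is an assumption.

```python
import math


def burst_product(t, e0, amplitude, k_burst, k_linear):
    """Eq 2: [Product] = E0 * A * (1 - exp(-k1 * t) + k2 * t).

    t in s, e0 in nM, amplitude as a fraction, rate constants in s^-1.
    """
    return e0 * amplitude * (1.0 - math.exp(-k_burst * t) + k_linear * t)


def active_fraction(active_conc, total_conc):
    """Fraction of the UV-determined enzyme that is catalytically active."""
    return active_conc / total_conc


if __name__ == "__main__":
    frac = active_fraction(15.0, 19.0)  # values reported in the text
    for t in (0.0, 0.05, 0.5, 5.0):
        p = burst_product(t, 19.0, frac, 46.7, 0.126)
        print(f"t = {t:5.2f} s  product = {p:6.2f} nM")
```

After the burst phase is complete, the exponential term vanishes and the curve becomes a straight line with slope E0·A·k2, which is why the linear phase is read as the steady-state rate.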

DNA Binding Affinity of hPolη. To date, the DNA binding affinity of hPolε has not been quantitatively determined for any DNA substrate except for those with staggered ends. We determined the equilibrium dissociation constant for the hPolε binary complex (Kd, DNA) using EMSA and different types of DNA substrate. For each DNA substrate of interest, we added hPolε (10-400 nM) to a solution containing 5ʹ-radiolabeled DNA (10 nM) at 23 °C. After 20 min incubation at 23 °C, the binary complex was separated from free DNA via native PAGE, and quantitated using ImageQuant. The corresponding plot (Figure 7.19) of the binary complex formation as a function of hPolε concentration was then fit to a quadratic equation (Eq 3):

$$[E \cdot \mathrm{DNA}] = 0.5\,(K_{d,\mathrm{DNA}} + E_0 + D_0) - 0.5\,[(K_{d,\mathrm{DNA}} + E_0 + D_0)^2 - 4E_0D_0]^{1/2} \qquad \text{(Eq 3)}$$
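The quadratic binding equation (Eq 3) can be evaluated directly. The following is a minimal sketch with a hypothetical function name; it assumes that all concentrations and Kd, DNA share the same units (nM here) and it is not the fitting routine used in this work.

```python
import math


def binary_complex(e0, d0, kd):
    """Eq 3: equilibrium concentration of the E.DNA binary complex.

    e0: total enzyme, d0: total DNA, kd: equilibrium dissociation constant.
    """
    s = kd + e0 + d0
    return 0.5 * s - 0.5 * math.sqrt(s * s - 4.0 * e0 * d0)


if __name__ == "__main__":
    # 10 nM DNA titrated with 10-400 nM enzyme, as in the EMSA described above;
    # 38 nM is the Kd, DNA value quoted in the text for D-8.
    for e0 in (10.0, 50.0, 100.0, 400.0):
        print(e0, round(binary_complex(e0, 10.0, 38.0), 2))
```

The quadratic form is needed because the DNA concentration (10 nM) is not negligible relative to Kd, DNA, so the free enzyme concentration cannot be approximated by the total enzyme concentration.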

The Kd, DNA values determined for various DNA substrates are shown in Table 7.8. Notably, the DNA binding affinity of hPolε was 4.4 to 5.4-fold tighter with staggered-end DNA substrate than with blunt-end DNA substrate. Similar discrimination in DNA binding affinities for the same DNA substrates was observed for Dpo4 [117]. The DNA binding affinity of hPolε for D-8 (38 nM) was similar to the DNA binding affinity of hPolε for 21/26-mer (26 nM, Chapter 6), demonstrating that the size of the DNA substrate is not a factor for weaker binding to DNA substrate BE2. This theory is supported by the observation that the DNA binding region of hPolε is in contact with approximately 8 base pairs of the DNA substrate [77].

DNA Dissociation Rate from the Binary Complex hPolη•DNA. We also directly measured the DNA dissociation rate from the binary complex hPolε•DNA. A preincubated solution of hPolε (20 nM) and 5ʹ-[32P]-DNA D-1 (130 nM) at 37 °C was mixed with unlabeled DNA trap D-1 (5 µM) for various time intervals. A solution of Mg2+•dTTP (100 µM) was then added to start the reaction for 15 s before being quenched by EDTA (0.37 M). The products were resolved as previously described. The plot of product formation over time was fit to an exponential equation (Eq 4):

$$[\mathrm{Product}] = A\exp(-kt) \qquad \text{(Eq 4)}$$

where A is the product concentration in the absence of the DNA trap, t is the time in seconds, and k is the DNA dissociation rate constant ($k_{off}^{\mathrm{DNA}}$, Scheme 7.4). The $k_{off}^{\mathrm{DNA}}$ value was calculated to be 0.0027 ± 0.0002 s⁻¹, which was only 1.5-fold slower than full-length yeast Polε [230]. Based on the Kd, DNA and $k_{off}^{\mathrm{DNA}}$ values, we calculated the apparent association rate constant ($k_{on}^{\mathrm{DNA}} = k_{off}^{\mathrm{DNA}}/K_{d,\mathrm{DNA}}$) to be 0.059 µM⁻¹ s⁻¹, which was approximately 3-fold higher than that of Dpo4 [65] and 10-fold lower than that of yeast full-length Polε [230].
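The relationship between the dissociation rate constant, the equilibrium dissociation constant and the apparent association rate constant can be checked with a short calculation. In the sketch below the function names are hypothetical, and the Kd, DNA of 0.046 µM is not taken from Table 7.8; it is simply the value implied by the reported koff and kon, used here as an assumption for illustration.

```python
import math


def apparent_k_on(k_off, kd_um):
    """k_on = k_off / Kd, with k_off in s^-1 and Kd in uM (result in uM^-1 s^-1)."""
    return k_off / kd_um


def trap_decay(t, amplitude, k_off):
    """Eq 4: product formed after t seconds of pre-mixing with DNA trap."""
    return amplitude * math.exp(-k_off * t)


def half_life(k_off):
    """Half-life (s) of the binary complex for a first-order dissociation."""
    return math.log(2.0) / k_off


if __name__ == "__main__":
    k_off = 0.0027      # s^-1, reported in the text
    kd_assumed = 0.046  # uM, back-calculated assumption (not from Table 7.8)
    print(round(apparent_k_on(k_off, kd_assumed), 3), "uM^-1 s^-1")
    print(round(half_life(k_off)), "s")
```

A koff of 0.0027 s⁻¹ corresponds to a half-life of roughly four minutes for the binary complex, which is why the trap incubation must extend over several hundred seconds to define the decay.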

Nucleotide Incorporation Efficiency and Fidelity of hPolη. In a previous study, we determined the dNTP incorporation efficiencies and fidelities of truncated yeast Polε [230]. However, hPolε only shares 19.6% amino acid identity with yeast Polε [34]. To determine the difference in the apparent kinetic parameters, the maximum dNTP incorporation rate (kp) and the ground-state equilibrium dissociation constant (Kd, dNTP), between yeast and human full-length Polε, we performed single-turnover dNTP incorporation assays with hPolε under pre-steady state kinetic conditions using the same DNA substrates as with yeast Polε [230]. For example, a preincubated solution of hPolε (130 nM) and 5ʹ-[32P]-DNA (20 nM) at 37 °C was rapidly mixed with dTTP (100 µM) for varying time intervals before being quenched with EDTA (0.37 M). The products were resolved using denaturing PAGE, and quantitated using ImageQuant. The formation of product over time was then plotted and fit to a single exponential equation (Eq 5):

$$[\mathrm{Product}] = A\,[1 - \exp(-k_{obs} t)] \qquad \text{(Eq 5)}$$

where A is the reaction amplitude, t is the reaction time, and kobs is the observed rate constant. Next, the plot of kobs versus the dNTP concentration was fit to a hyperbolic equation (Eq 6):

$$k_{obs} = \frac{k_p[\mathrm{dNTP}]}{[\mathrm{dNTP}] + K_{d,\mathrm{dNTP}}} \qquad \text{(Eq 6)}$$
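Eqs 5 and 6 can be written as two small functions. The sketch below is illustrative only: the function names and the kp and Kd, dNTP values in the usage lines are hypothetical, and the substrate specificity kp/Kd, dNTP is assumed to be the quantity referred to as incorporation efficiency.

```python
import math


def single_turnover_product(t, amplitude, k_obs):
    """Eq 5: product formed at time t under single-turnover conditions."""
    return amplitude * (1.0 - math.exp(-k_obs * t))


def observed_rate(dntp, kp, kd_dntp):
    """Eq 6: hyperbolic dependence of k_obs on dNTP concentration."""
    return kp * dntp / (dntp + kd_dntp)


def incorporation_efficiency(kp, kd_dntp):
    """Assumed definition of efficiency: kp / Kd,dNTP (uM^-1 s^-1)."""
    return kp / kd_dntp


if __name__ == "__main__":
    kp, kd = 50.0, 100.0  # hypothetical values: s^-1 and uM
    for conc in (10.0, 100.0, 1000.0):
        print(conc, round(observed_rate(conc, kp, kd), 2))
    print("efficiency:", incorporation_efficiency(kp, kd))
```

At a dNTP concentration equal to Kd, dNTP the observed rate is half of kp, and at concentrations far below Kd, dNTP the observed rate is approximately (kp/Kd, dNTP)·[dNTP], which is why the ratio serves as a second-order efficiency term.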

The corresponding plots are shown in Figure 7.20. The calculated kinetic parameters are displayed in Table 7.9, and represent the 16 possible dNTP incorporations involving dATP, dCTP, dGTP and dTTP. Interestingly, it was observed that hPolε was about 3-fold more efficient in forming a dG•dC than dA•dT. Furthermore, hPolε bound to dGTP tighter than other dNTPs during dNTP misincorporations. Notably, the dNTP incorporation fidelity was within the range of our work presented in Chapter 6 as well as the error rates determined using SOSA with normal DNA [205]. Moreover, the correct dNTP incorporation efficiencies in Table 7.9 were similar to those reported in Chapter 6.

Efficient Blunt-End Addition by hPolη. With the observation of single blunt-end additions during running start assays (Chapters 4 and 6) and a small discrepancy in hPolε binding to staggered-end versus blunt-end DNA substrates (Table 7.8), we deemed it necessary to measure the exact catalytic efficiencies of dNTP incorporation into blunt-end DNA by hPolε. Preliminary dNTP incorporation assays revealed hPolε incorporating dATP with the highest efficiency (data not shown). Subsequent pre-steady state single-turnover dATP incorporation assays at 37 °C determined the catalytic efficiency of the dATP incorporation into blunt-end DNA substrate BE2 (Table 7.10) to be 2.7 × 10⁻³, which was about 70-fold less efficient than the correct dATP incorporation into staggered-end DNA substrate D-7 (Table 7.9). Further examination revealed that the main contributor to this large reduction in dATP incorporation was the kp, which was 65-fold slower for blunt-end addition.

The Base-Stacking Contribution to the Binding Affinity of Incoming dNTP. To determine if base-stacking caused the preference for dATP incorporation over other dNTPs, we repeated the pre-steady state single-turnover dNTP incorporation assay at 37 °C using DNA substrate BE2 and nucleotide dPTP, a dNTP analog with the base replaced with a pyrene group [230]. As shown in Table 7.10, hPolε bound to dPTP ~17-fold tighter than dATP and incorporated dPTP approximately 4-fold faster than dATP into DNA substrate BE2. This led to 67-fold more efficient incorporation of dPTP than dATP into DNA substrate BE2 (Table 7.10). Thus, the dNTP binding free energy difference between dATP and dPTP was 1.7 kcal/mol, the same free energy difference determined for yeast truncated Polε at 23 °C [230]. Based on previous works [117, 230], it appeared that base-stacking was not the only contributing factor for the blunt-end addition catalyzed by hPolε. We hypothesize that van der Waals interactions may also play a role in the catalysis of blunt-end addition.
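The 1.7 kcal/mol figure follows from the ~17-fold difference in binding affinity if the free energy difference is taken as ΔΔG = RT ln(fold difference in Kd). That relationship is a standard thermodynamic identity rather than a formula stated in this section, so the sketch below should be read as a consistency check; the function name is hypothetical.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal mol^-1 K^-1


def binding_ddg(fold_difference, temp_celsius):
    """Free energy difference (kcal/mol) for a given fold change in Kd."""
    return R_KCAL * (temp_celsius + 273.15) * math.log(fold_difference)


if __name__ == "__main__":
    # ~17-fold tighter binding of dPTP than dATP at 37 C, as reported above
    print(round(binding_ddg(17.0, 37.0), 1), "kcal/mol")
```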

Biphasic Kinetics. As a follow-up to the biphasic kinetic observation during normal DNA synthesis (Figure 6.5 and Table 6.11), we performed the kinetic assay described in Chapter 6 with DNA substrate D-8 (Table 7.1) in order to determine if the biphasic kinetics of hPolε were due to the length of the DNA substrate. The corresponding biphasic kinetic plot shown in Figure 7.21 was fit to a double exponential equation (Eq 7):

$$[\mathrm{Product}] = E_0 A_1[1 - \exp(-k_1 t)] + E_0 A_2[1 - \exp(-k_2 t)] \qquad \text{(Eq 7)}$$
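For readers who want to see how the two phases of Eq 7 combine, the following sketch evaluates the double exponential and reports the share of the total amplitude carried by the fast phase. The function names, the amplitudes and the slow-phase rate constant in the usage lines are hypothetical; only the 73 s⁻¹ fast-phase rate constant for D-8 is taken from this section.

```python
import math


def biphasic_product(t, e0, a1, k1, a2, k2):
    """Eq 7: sum of a fast phase (a1, k1) and a slow phase (a2, k2)."""
    fast = e0 * a1 * (1.0 - math.exp(-k1 * t))
    slow = e0 * a2 * (1.0 - math.exp(-k2 * t))
    return fast + slow


def fast_phase_fraction(a1, a2):
    """Share of the total reaction amplitude in the fast phase."""
    return a1 / (a1 + a2)


if __name__ == "__main__":
    e0, a1, a2 = 20.0, 0.45, 0.40  # hypothetical: nM and fractional amplitudes
    k1, k2 = 73.0, 1.0             # k1 from the text (D-8); k2 hypothetical
    for t in (0.01, 0.1, 1.0, 10.0):
        print(t, round(biphasic_product(t, e0, a1, k1, a2, k2), 2))
    print("fast fraction:", round(fast_phase_fraction(a1, a2), 2))
```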

The subsequent kinetic parameters are shown in Table 7.11, and were similar to those for DNA substrate 21/26-mer in Chapter 6 with one exception. For DNA substrate D-8 21/41-mer, there was a slight majority in the fast phase of the reaction (Table 7.11), whereas for DNA substrate 21/26-mer, there was a slight majority in the slow phase of the reaction (Table 6.11). However, the fast phase reaction rates were similar for both DNA substrates, and the same held true for the slow phase reaction rates. Since reactions for DNA substrates D-8 and 21/26-mer involved correct dCTP incorporations, we hypothesized that the difference in the corresponding reaction amplitudes is due to the length of the DNA template. Upstream "sensing" in the form of increased upstream mutations in the presence of an AP site has been observed previously for hPolε [205]. However, this biphasic data suggested that the hPolε•DNA binary complex underwent conformational changes before catalysis. Since the DNA substrate contained no modified bases within the active site, we theorized that hPolε was non-specifically binding to DNA before the primer terminus was properly aligned within the DNA polymerase active site.

Two possible non-specific binding events could be hPolε binding to either the blunt-end of DNA or the single-strand section of DNA. The weak binding of hPolε to both blunt-end DNA and single-stranded DNA has been observed previously [212]. To further examine this hypothesis, we performed the biphasic kinetic assay using blunt-end DNA substrate BE2. As shown in Figure 7.21, hPolε also displayed biphasic kinetics while incorporating dATP into BE2. However, most of the binary complexes hPolε•BE2 were trapped in dead-end complexes (Scheme 6.1). Furthermore, a majority of hPolε•BE2 complexes that resulted in catalysis of DNA polymerization went through a slow conversion to productive complexes, i.e. the reaction amplitude of the slow phase was higher than the reaction amplitude of the fast phase. Although the reaction amplitudes for each phase were drastically reduced for BE2, the fast phase reaction rate (40 s⁻¹) for incorporation into BE2 was similar to the fast phase reaction rate (73 s⁻¹) for incorporation into D-8. Thus, we were able to rule out hPolε binding to the blunt-end of DNA as a possible non-specific binding site.

Future Directions

The kinetic study of DNA synthesis catalyzed by hPolε is ongoing in our laboratory. Here, we have provided kinetic parameters for the minimal DNA synthesis pathway utilized by hPolε (Table 7.12). Notably, we observed an additional step of DNA conformational change (Scheme 7.4). However, important questions remain. First, although the dPTP incorporation efficiency into blunt-end DNA substrate BE2 was approximately equal to the dATP incorporation efficiency into staggered-end DNA substrate D-1 (Tables 7.9 and 7.10), dPTP was bound 15-fold tighter than dATP, and dPTP was incorporated 16-fold slower than dATP. This tight dNTP binding may be correlated to the fact that hPolε preferably bound tighter to dGTP during dNTP misincorporations. To obtain more details on this characteristic of hPolε, we will perform pre-steady state single-turnover dPTP incorporation assays into D-1. This information will demonstrate the base-stacking factor for dNTP incorporation into staggered-end DNA. We will also attempt to determine the crystal structure of the hPolε catalytic domain bound to dGTP and DNA in a ternary complex for a misincorporation, along with the crystal structure of a ternary complex containing a blunt-end DNA substrate within the active site of hPolε. Crystal structures of such an hPolε construct for correct dATP incorporation into staggered-end DNA substrates have recently been determined [77]. Therefore, our goal is attainable. The information from these crystal structures will give more structural insights into the kinetic mechanism of hPolε.

Based on data from biphasic kinetic assays (Chapter 6 and Figure 7.21), we theorized that changes in the slow phase reflect DNA conformational changes involving the translocation of the primer terminus into the active site of hPolε. However, we have previously determined the translocation rate of Dpo4 to be >100 s⁻¹ [2], which is too fast to observe using current kinetic methods. To determine if translocation is part of the slow phase kinetics, we will perform these kinetic assays with modified DNA substrates that can chemically cross-link to hPolε during normal DNA synthesis, separate the products via PAGE or HPLC, and analyze the products via mass spectrometry [24]. Analysis of binary and ternary complexes by mass spectrometry has been previously performed by our laboratory. Additionally, attempts to cross-link other DNA polymerases to DNA during normal replication have been successful in our laboratory (unpublished data). Once this work is completed, a detailed kinetic mechanism for DNA synthesis utilized by hPolε will be revealed. The research focus will then expand to performing these experiments in the presence of various accessory proteins, e.g. RPA, RFC and PCNA, as well as other DNA polymerases to reconstruct the replication fork in the presence of a specific DNA lesion.

7.5 Kinetic Investigation of Dpo4 Domain Interactions Involving the Linker Region

In Chapter 2, we answered the long-standing question of Dpo4 relevance at lower-than-physiological temperatures, primarily via CD spectroscopic methods. In the process of answering this question, we discovered a novel thermally stable unfolding intermediate and traced its origin to specific ionic interactions between amino acid residues in the linker region and Palm domain. These interactions are conserved within the Y-family of DNA polymerases. Moreover, these interactions are observed in crystal structures from both binary and ternary complexes of Dpo4 (Figure 7.22). Here, we further explored the structure-function potential of these interactions in Dpo4 using fluorescent titration assays, pre-steady state kinetic methods, and salt denaturation assays.

Results

It is hypothesized that the linker region directly interacts with the backbone of DNA [60]. After we identified the key ionic interactions between amino acid residues within the linker region and Palm domain while Dpo4 was in the apo-state, we sought to determine if disruption of these interactions would change the DNA binding affinity of Dpo4. The DNA binding affinity is inversely related to the equilibrium dissociation constant of the Dpo4 binary complex, Kd, DNA. To determine the Kd, DNA values for Dpo4 binary complexes, we used Dpo4 titration into DNA substrate 21/41-mer-2AP (F-8, Table 7.1) that contained a fluorescent adenine analog, 2-aminopurine (2-AP, Figure 7.23). For example, 6.25 nM to 294.71 nM of the wt Dpo4 (Table 2.1) was titrated into a solution containing 25 nM of DNA F-8 in a quartz cuvette. For each titration point, the mixture was allowed to reach equilibrium for 5 min before exciting the DNA substrate F-8 at wavelength 310 nm and monitoring the corresponding emission peak at wavelength 371 nm. The intensity change in the emission peak as a function of wt Dpo4 concentration was plotted (Figure 7.24) and fit to a modified quadratic equation (Eq 8):

$$F_t = F_{max} + \frac{F_{min} - F_{max}}{2D_0} \times \{(K_{d,\mathrm{DNA}} + E_0 + D_0) - [(K_{d,\mathrm{DNA}} + E_0 + D_0)^2 - 4D_0E_0]^{1/2}\} \qquad \text{(Eq 8)}$$

where Ft is the fluorescence after titration t, and Fmax and Fmin are the fluorescent maximum and minimum, respectively. This fluorescence titration series was repeated for each Dpo4 mutant used in Chapter 2 and the corresponding Kd, DNA values are displayed in Table 7.13.
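The modified quadratic equation (Eq 8) can be sketched as a function of the total enzyme concentration. This sketch assumes the prefactor is (Fmin − Fmax)/(2Do), so that the signal starts at Fmax with no enzyme and approaches Fmin when all DNA is bound; the function name and the Kd, DNA and fluorescence values in the usage lines are hypothetical.

```python
import math


def titration_fluorescence(e0, d0, kd, f_max, f_min):
    """Eq 8: fluorescence at total enzyme e0 for total DNA d0 (same units as kd)."""
    s = kd + e0 + d0
    twice_bound = s - math.sqrt(s * s - 4.0 * d0 * e0)  # equals 2 * [E.DNA]
    return f_max + (f_min - f_max) / (2.0 * d0) * twice_bound


if __name__ == "__main__":
    # 25 nM DNA F-8 titrated over the 6.25-294.71 nM Dpo4 range from the text;
    # Kd (10 nM) and the fluorescence limits are hypothetical.
    for e0 in (6.25, 25.0, 100.0, 294.71):
        print(e0, round(titration_fluorescence(e0, 25.0, 10.0, 1.0, 0.4), 3))
```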

Interestingly, any mutation of Dpo4 that included amino acid residue R240 resulted in at least a 3-fold decrease in DNA binding affinity (Table 7.13). In contrast, the removal of the interactions only between K148 and E100 (E100A/K148A Dpo4) did not change the DNA binding affinity. Thus, our data suggested that the ionic interactions between K148 and E100 in Dpo4 are not as important as the ionic interactions requiring R240. As expected, the removal of all positively charged amino acid residues within the linker region of Dpo4 resulted in a drastic 7.0-fold decrease in DNA binding affinity. These data correlated well with the observed linker region interactions with the negatively charged backbone of bound DNA in Dpo4 crystal structures [60]. Furthermore, changing these positively charged amino acid residues to negatively charged amino acid residues resulted in an additional 5-fold decrease in DNA binding affinity. Unexpectedly, the DNA binding affinity for the All-Gly linker Dpo4 mutant decreased only 2.3-fold more than the DNA binding affinity for the R/K-to-D linker Dpo4 mutant. These data indicated that the large increase in entropic energy of the linker region, i.e. mutation of the linker region to all glycine residues, had a DNA binding ΔΔG of ~3.3 kcal/mol, whereas removal of all positively charged amino acid residues, i.e. when all Arg and Lys residues within the linker region were mutated to either Ala or Asp, had a DNA binding ΔΔG of ~1.8-2.8 kcal/mol. Notably, the DNA binding free energy difference between the wt Dpo4 and the R240A Dpo4 was 0.6 kcal/mol. As the R240A mutation was included in the DNA binding ΔΔG for the R/K-to-A linker Dpo4, the contribution of R240 to the overall DNA binding ΔG is larger than that of the other Arg and Lys residues within the linker region (Table 2.1). Therefore, R240 should be considered a key amino acid residue that is required for Dpo4 structural stability during the formation of Dpo4•DNA binary complexes (Figure 1.6).
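The link between a fold change in Kd, DNA and a binding free energy difference can be sketched as follows, assuming the standard relation ΔΔG = RT ln(Kd,mutant/Kd,wt). The function name and the default temperature are assumptions made for illustration, since the titration temperature is not restated in this section. Values computed from the rounded fold changes quoted above need not reproduce the reported ΔΔG values, which presumably derive from the measured Kd, DNA values in Table 7.13.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)

def ddg_binding(fold_change_in_kd, temp_c=37.0):
    """Return the binding free energy difference (kcal/mol) for a given
    fold increase in Kd relative to wt, using ddG = R*T*ln(fold change).

    temp_c is an assumed assay temperature in degrees Celsius.
    """
    temp_k = temp_c + 273.15
    return R_KCAL * temp_k * math.log(fold_change_in_kd)

# Illustrative only: a 3-fold weaker DNA binding affinity
print(round(ddg_binding(3.0), 2), "kcal/mol")
```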

Based on our preliminary data of the DNA binding affinities for various Dpo4 mutants, we began investigating the kinetic effects of these specific mutations on the apparent kinetic parameters kp and Kd, dNTP. We used the same pre-steady state kinetic conditions as described in [27]. The products were processed as previously described [27, 51, 57, 108]. The preliminary results are shown in Table 7.14. Removal of important interactions between the Palm domain and the linker region, i.e. E100, K148, E235 and R240, increased the dATP misincorporation into DNA D-1 (Table 7.1) by 2.6- to 2.8-fold. These data suggested that the space within the active site was affected by these mutations, potentially by providing more space within the DNA polymerase's active site to allow for bulky non-Watson-Crick base pairing. It is important to note that no dATP misincorporation catalyzed by the R/K-to-D linker Dpo4 was observed within 4 hrs (Table 7.14), so the dNTP incorporation fidelity could not be calculated accurately. Interestingly, the dNTP incorporation efficiency for correct dTTP incorporation into DNA substrate D-1 catalyzed by the All-Gly linker Dpo4 dropped 610-fold when compared to the dNTP incorporation efficiency for correct dTTP incorporation catalyzed by the wt Dpo4. These data suggested that the conformational change of the LF domain observed from apo-state to binary/ternary complexes for Dpo4 is limited by the linker region, as hypothesized in Chapter 2. The importance of charge distribution within the linker region was supported by the observation that the R/K-to-D linker Dpo4 possessed an 8,333-fold lower correct dTTP incorporation efficiency when compared to the wt Dpo4.
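The quantities compared in this paragraph can be sketched in a few lines of Python. The sketch assumes the definitions commonly used in pre-steady state kinetic work: incorporation efficiency as kp/Kd, dNTP, and fidelity as the incorrect efficiency divided by the sum of the correct and incorrect efficiencies. These definitions, the function names, and the numerical values are assumptions for illustration and are not taken from Table 7.14.

```python
def incorporation_efficiency(kp, kd_dntp):
    """Incorporation efficiency kp/Kd (e.g. s^-1 uM^-1)."""
    return kp / kd_dntp

def fidelity(eff_incorrect, eff_correct):
    """Fraction of incorporations that are incorrect (assumed definition)."""
    return eff_incorrect / (eff_correct + eff_incorrect)

def fold_change(eff_wt, eff_mutant):
    """How many fold lower the mutant efficiency is than the wt efficiency."""
    return eff_wt / eff_mutant

# Hypothetical kinetic parameters, for illustration only
correct = incorporation_efficiency(kp=10.0, kd_dntp=200.0)     # correct dTTP
incorrect = incorporation_efficiency(kp=0.01, kd_dntp=500.0)   # incorrect dATP
print(fidelity(incorrect, correct))
```

If no misincorporation is observed within the assay window, the incorrect efficiency has no measured value, which is why a fidelity cannot be calculated for such a mutant.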

In addition to determining the effects of linker region mutations on the kinetic parameters for dNTP incorporation, we also wanted to dissect the structural folding and unfolding kinetics of Dpo4 to gain more insights into the structure-function relationships of thermostable Y-family DNA polymerases. Information from Chapter 2 laid the foundation for structure-function investigations in solution. Next, we planned to observe the unfolding and subsequent refolding of Dpo4. We hypothesized that this event could be observed if the denaturation of Dpo4 was performed with chaotropic agents, e.g. urea and guanidinium chloride (GDN). Although Dpo4 does not contain any tryptophan residues, it contains multiple tyrosine residues that are excited at 274 nm. As these tyrosines are exposed to solvent, there is an increase in fluorescence emission at 306 nm [231]. As shown in Figure 7.25A, the wt Dpo4 was successfully denatured via titration of GDN at 37 °C, where each titration point reached equilibrium within 14 hrs. Interestingly, more than one unfolding intermediate of Dpo4 was observed during the titration. As shown in Figure 7.25A, the wt Dpo4 initially condensed before unfolding at higher GDN concentrations. It was suspected that Dpo4 became more compact at extremely low concentrations of GDN, as the decrease in unfolded fraction during the initial titration was observed regardless of the time interval. These data suggested that the structural stability of Dpo4 is dependent on the surrounding solvent.

The folding of Dpo4 increased until about 0.5 M GDN, then decreased at a steep slope to 50-60% of Dpo4 being unfolded until 3.75 M GDN, after which cooperative unfolding was observed until 100% of Dpo4 was unfolded at 7 M GDN. The second folded intermediate was more evident in Figure 7.25B. We also discovered that the unfolding of Dpo4 can be monitored for various concentrations of GDN as a function of time (Figure 7.25B). Thus, the overall steady-state unfolding of Dpo4 was visualized in Figure 7.25A.
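The unfolded fraction plotted in such a titration can be sketched with a simple baseline normalization of the tyrosine emission signal. This is a minimal sketch that assumes fixed native and fully unfolded baselines; the function name and the readings are hypothetical, and because Dpo4 shows unfolding intermediates the result is only an apparent fraction.

```python
def fraction_unfolded(signal, native_signal, unfolded_signal):
    """Apparent unfolded fraction from a fluorescence reading.

    native_signal: emission with no denaturant (assumed baseline)
    unfolded_signal: emission at the highest denaturant concentration
    """
    return (signal - native_signal) / (unfolded_signal - native_signal)

# Hypothetical emission readings at 306 nm (arbitrary units)
native, unfolded = 100.0, 300.0
for reading in (90.0, 100.0, 200.0, 300.0):
    print(reading, fraction_unfolded(reading, native, unfolded))
```

A reading below the native baseline gives a negative apparent fraction, which is one way an initial compaction at low denaturant concentration would appear in this normalization.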

Future Directions

We are currently determining the kp and Kd, dNTP for correct dTTP incorporations catalyzed by these Dpo4 linker mutants. For these studies, we will use the same conditions as those used for determining dNTP misincorporations catalyzed by these Dpo4 mutants. In addition, we will determine the crystal structures of specific Dpo4 linker mutants to identify the structural changes induced by these mutations during the formation of binary and ternary complexes. With this information, we can elucidate the contribution of linker region interactions to the overall dNTP incorporation efficiency and fidelity as well as the structural integrity of Dpo4.

The structure of Dpo4 in solution has been studied using different methods, including site-directed mutagenesis [51], kinetics [51, 66], stopped-flow FRET assays [2], and NMR [47, 48]. However, these investigations have primarily focused on the active site and structural domains, not the linker region. Based on thermal and guanidinium chloride denaturations performed by our laboratory, it was evident that Dpo4 unfolded via a pathway that involved the formation of a stable unfolding intermediate (Scheme 7.5). However, salt denaturation appeared to be more complex than thermal denaturation, i.e. more complex than Scheme 7.5. As a result, we do not plan to calculate any thermodynamic parameters corresponding to salt denaturation. Instead, we will continue salt denaturations only to discover the simplest unfolding pathway of each important Dpo4 mutant, e.g. one with a reduced number of unfolding intermediates during denaturation. We hypothesized that the K148A/E100A/E235A/R240A Dpo4 mutant may be one such protein since it did not have a thermally stable unfolding intermediate. If we obtain the exact same unfolding trend as shown in Figure 7.25A, we will then perform the guanidinium chloride titration with the LF+ and the Core separately to determine if the corresponding denaturation plots can be added to equal the guanidinium chloride denaturation plot of the wt Dpo4. We will also repeat these titrations using urea, as it has been demonstrated that guanidinium chloride stabilizes normally undetectable unfolding intermediates [123]. Additionally, once conditions are optimized, we will perform time-dependent GDN titrations to obtain the unfolding and folding rate constants for Dpo4 at different temperatures in order to extrapolate the overall Gibbs free energy of unfolding versus folding for Dpo4. These values will then be used in simulations to calculate microscopic parameters of each pathway. From this information, we can approximate the true unfolding parameters of the wt Dpo4 and gain more insights into the structural stability of Dpo4 in solution. Our work here can then be used to improve methods where a highly thermostable DNA polymerase is needed, e.g. PCR.

7.6 Conclusion

The overall goal of our work within this dissertation was to elucidate a structure-function relationship between the specific lesion bypass capacity of each Y-family DNA polymerase and the structural differences amongst the Y-family members. To start our investigation, we studied the solution structure of Dpo4, the model Y-family DNA polymerase from the aerobic hyperthermophilic archaeon Sulfolobus solfataricus. The crystal structure of Dpo4 has been previously determined for the apo-state as well as in binary and ternary complexes (see Chapter 1). However, the Dpo4 structure at endogenous temperatures, e.g. 80 °C, has not been determined due to instrumental limitations. The current concern is that any findings of a study with Dpo4 conducted outside of the endogenous conditions of Sulfolobus solfataricus will be an artificial and irrelevant characteristic of Dpo4. Our CD spectroscopic data demonstrated that at least the secondary structural composition of Dpo4 was the same from 26 to 87 °C. Thus, any previous structural or kinetic studies as well as future studies conducted with Dpo4 at temperatures lower than 80 °C, e.g. 37 °C, will have significant relevance to the native activity of Dpo4. We were also able to determine apparent unfolding thermodynamic parameters for Dpo4, which suggested that Dpo4 unfolding was entropically driven, and that the LF domain was the last to unfold (Table 2.2).

Importantly, we were the first to observe an unfolding intermediate during a thermal denaturation (Figure 2.4) that arose from ionic interactions between the Palm domain and the linker region of Dpo4 (Figure 2.10). The specific ionic interactions were determined to involve amino acid residues R240 (linker region), E235 (linker region), E100 (Palm domain), and K148 (Palm domain). Charge removal of amino acid residue R240 resulted in the largest change in the thermodynamic parameters of Dpo4 (Tables 2.2 and 2.4). Our preliminary kinetic studies with these same Dpo4 mutants demonstrated that the linker region interactions with the Palm domain are important, although linker region interactions with the backbone of the DNA contribute more to the DNA binding event.

Another question answered for the model Y-family member Dpo4 was the sugar selectivity mechanism used during NTP binding and DNA synthesis. Based on previous crystal structures of Dpo4, we believed that sugar selectivity was performed using the steric gate residue Y12, which filled the space needed for the 2′-OH on the ribose sugar of the incoming nucleotide (Figures 3.9 and 3.10). When the steric gate was removed, i.e. mutation Y12A, Dpo4 became a distributive DNA-dependent RNA polymerase (Figure 3.2) with a lowered dNTP incorporation efficiency and an rNTP incorporation fidelity similar to that of wt Dpo4 incorporating dNTP. This sugar selectivity mechanism is believed to be conserved within the Y-family of DNA polymerases, and this structural theory for Dpo4 sugar selectivity was later confirmed with a crystal structure of the Y12A Dpo4•DNA complex with an incoming rNTP [45]. With these questions answered, we continued our studies on Dpo4 with more kinetic and structural details at our disposal.

As a common approach in the DNA polymerase field, we also initiated a kinetic investigation of the bypass of a potential carcinogen, dGAP, by first studying the bypass mechanism for Dpo4, a well-characterized Y-family member. Our biochemical and kinetic data indicated that Dpo4 has more difficulty with dGAP bypass extension than with dNTP incorporation opposite the bulky DNA lesion. Based on the kinetic data, we hypothesized that DNA containing a dGAP possessed three different conformations within the active site. The first DNA conformation would have the bulky 1-AP adduct rotated out of the helix and interacting with a domain of Dpo4, possibly the LF domain (Figure 7.26A). In this conformation, dG would be able to form a normal Watson-Crick base pair with the incoming dNTP. This conformation would represent the productive binary complex observed during the biphasic kinetic analysis (Table 5.5), as depicted with Dpo4 binding to DNA containing a BPDE-dG [40]. The second DNA conformation may be similar to the NMR structure of DNA containing dGAP where the bulky hydrophobic 1-AP adduct intercalated within the double helix of the DNA and the base opposite the lesion was pushed out into the surrounding solvent (Figure 7.26B). This conformation would represent the dead-end binary complex observed during the biphasic kinetic assays as the bulky 1-AP would completely block the incoming dNTP and propel the dG of the lesion into the major groove. The third DNA conformation would be similar to the second DNA conformation, but the 1-AP adduct is perpendicular to the base pairs and parallel to the helix axis. This complex would be considered the non-productive complex as the bulky adduct is not blocking the incoming dNTP, but the affected guanine is slightly rotated out of plane of the base pairs. This conformation would also hinder translocation of Dpo4 as the 1-AP adduct will clash with the LF domain, as predicted with the BPDE-dA lesion and Dpo4 crystal structures [109].

Next, we continued the dGAP bypass investigation using less studied Y-family members, i.e. human Y-family DNA polymerases. We discovered that each of the human Y-family enzymes can insert dNTPs opposite dGAP, but with varying dNTP incorporation efficiencies and fidelities (Tables 6.3-6.10). Therefore, our results strongly suggested that the Y-family DNA polymerases utilize different dGAP bypass mechanisms. The only common factor was that all Y-family DNA polymerases had more difficulty extending dGAP bypass products than inserting a dNTP opposite dGAP. This dGAP bypass characteristic is explained by the fact that the 1-AP adduct is on the C8 position of dG. Thus the 1-AP adduct does not obstruct the Watson-Crick face of the base. However, this bulky adduct could potentially interfere on the minor or major groove face of the DNA as the Finger and LF domains interact with the minor and major grooves, respectively, of the DNA template base during the DNA binding event [60, 196].


We also used the newly developed SOSA method with both human Y-family DNA polymerases and Dpo4 to visualize and quantify the mutagenic potential of dGAP bypass. This method, along with quantification of running start assays, provided a vivid picture of dGAP bypass in the presence of all dNTPs and accounted for lesion effects that were upstream and downstream of the bulky lesion site. In these studies, we not only estimated the efficiencies and error rates of dGAP bypass for each enzyme, but also the relative efficiencies and error rates of normal DNA synthesis for each enzyme (Tables 7.2 and 7.3). Most importantly, these data allowed us to distinguish the mutations created in the presence of dGAP from mutations caused by the sequence bias of each Y-family DNA polymerase. Notably, the error rates for both normal DNA synthesis and dGAP bypass catalyzed by Dpo4 matched the corresponding dNTP incorporation fidelities for Dpo4 [57]. We observed an increase in mutations, mostly in the form of base deletions, opposite dGAP and during the first extension of dGAP bypass products (Figure 7.5). Unexpectedly, we also discovered that dGAP affected the dNTP incorporation fidelity of each human Y-family DNA polymerase, not only at the sites of pausing observed during the running start assays, but also at upstream and downstream template bases relative to the lesion site (Figure 7.6). We suspected that these upstream and downstream mutations were present due to DNA structural distortions induced by dGAP. More specifically, we predicted that dGAP structurally interfered with the Dpo4 translocation along the DNA template.
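Locating base substitutions in a sequenced bypass product can be sketched as a position-by-position comparison against the error-free product. The two sequences below are copied from panel D of Figure 7.4 (the 19/51 entry and one 1/51 entry); the function name is hypothetical, and this minimal sketch handles equal-length sequences only, so deletions and insertions would first require an alignment step that is not shown.

```python
def find_substitutions(reference, product):
    """Return (index, reference_base, product_base) for every mismatch.

    The index is 0-based along the displayed sequence, not the
    template position numbering used in Figure 7.6.
    """
    if len(reference) != len(product):
        raise ValueError("sequences must be the same length; align first")
    return [(i, r, p)
            for i, (r, p) in enumerate(zip(reference, product))
            if r != p]

REFERENCE = "CTACCTGAACGACGGCCAGTGAATTCGCGCGGGGACAGGACGGCTAGTGCAATGTTGACC"
PRODUCT = "CTACCTGAACGACGGCCAGTGAATTCGCGCGGGGAAAGGACGGCTAGTGCAATGTTGACC"

for index, ref_base, new_base in find_substitutions(REFERENCE, PRODUCT):
    print(index, ref_base, "->", new_base)
```

Tallying such tuples over all sequenced products, grouped by index, gives the kind of per-position substitution count that a relative error histogram summarizes.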

Additionally, we performed SOSA and running start assays with DNA containing an abasic site for all human Y-family DNA polymerases. As mentioned above, the pause sites identified during running start assays correlated well with mutational hot spots observed during corresponding assays. Unexpectedly, we also observed mutations upstream of the abasic site that were not present in the absence of an abasic site (Figure 4.5). Since multiple crystal structures of DNA containing an abasic site all depict no DNA distortion, we suspect that the DNA template bases upstream of the active site have key interactions with amino acids of the Y-family DNA polymerases, most likely with either the LF or Finger domain. Without these interactions with the DNA template, the DNA strand will likely misalign within the active site of the enzyme. Therefore, another structure-function relationship has been identified for further analysis.

When we conducted this type of study for two different double-base lesions, cis-syn TT and cisplatin-dGpG, our preliminary data became slightly more complex than those for the single-base lesions, abasic site and dGAP. As before, each Y-family DNA polymerase studied displayed varying efficiencies and mutagenic potential for each site of the double-base lesion as well as different mechanisms for each DNA lesion. In general, these enzymes demonstrated insertion preferences that were different for each modified base. Interestingly, our preliminary data supported the hypothesis that hPol is the DNA polymerase that substitutes for hPol in XPV patients since cis-syn TT bypass catalyzed by hPol produced the same type of mutations proportionally observed in XPV cell lines. We also observed the preference of hPol to extend mismatched primer termini in the vicinity of either cis-syn TT or cisplatin-dGpG, which correlated well with the proposed role of hPol in extending from initial DNA lesion bypass products. Furthermore, our running start assays indicated that hPolε slightly paused after dNTP incorporation opposite the 3′-T of cis-syn TT (Figure 7.8). As proposed by Vasquez-del Carpio et al. [219], hPolε only inserts dATP opposite the 3′-T of the dimer, leaving hPol to incorporate dATP opposite the 5′-T of the lesion and perform the cis-syn TT bypass extension. This hypothesis supported the current belief that DNA lesion bypass is performed through a concert of different Y-family DNA polymerases for each step of the TLS process (Scheme 7.6).

In all SOSA studies conducted thus far, our data have suggested that hPolε is the likely enzyme to perform DNA lesion bypass in vivo due to the highest lesion bypass efficiency and/or most error-free TLS. However, the general kinetic mechanism of DNA synthesis of hPolε is not well defined. These conclusions from SOSA and running start assays called for a detailed kinetic study into the general mechanism of DNA synthesis catalyzed by hPolε, which we began here. Notably, we were the first to observe the kinetically biphasic nature of hPolε during DNA replication over undamaged DNA substrates (Tables 6.11 and 7.11). We also noted the extremely tight binding of dGTP during misincorporations and efficient blunt-end additions. Based on elemental effect assays from other groups on the correct dNTP incorporation, the rate-limiting step of this pathway is not the chemistry step for hPolε, as the elemental effect value was 1.0 [201]. This research group also determined the koff^DNA to be 2.4 s⁻¹ [201], which was approximately 889-fold faster than the koff^DNA value we determined (0.0027 s⁻¹). It is important to note that their dissociation assay was designed to monitor the extension of labeled DNA as hPolε dissociated from unlabeled DNA whereas our dissociation assay monitored the elongation of labeled DNA before the dissociation of the binary hPolε•DNA complex. Thus, we conclude that their dissociation rate constant is for a different binary complex, such as the dissociation of a non-productive binary complex. When we compiled data from previous studies with our data, a kinetic mechanism (Scheme 7.7) with the corresponding kinetic parameters displayed in Table 7.12 emerged.
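The comparison of the two dissociation rate constants can be checked with a short calculation. The half-life conversion assumes simple first-order (single-exponential) dissociation, which is an assumption made here for illustration; the variable and function names are hypothetical.

```python
import math

k_off_reported = 2.4     # s^-1, value reported in [201]
k_off_measured = 0.0027  # s^-1, value determined in this work

# Fold difference between the two dissociation rate constants
fold_difference = k_off_reported / k_off_measured

def half_life(k_off):
    """Half-life (s) of a complex that dissociates with first-order rate k_off."""
    return math.log(2) / k_off

print(round(fold_difference))
print(round(half_life(k_off_reported), 2), "s versus",
      round(half_life(k_off_measured)), "s")
```

Expressed as half-lives, the two values correspond to a complex that falls apart in a fraction of a second versus one that persists for minutes, which is consistent with the two assays reporting on different binary complexes.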

In summary, this dissertation covered many fundamental aspects of the DNA polymerase field, including kinetic methodology and analysis of genetic mutation generation. Although most agree that all DNA polymerases evolved from a single ancestor, the data herein demonstrated the diversity of the DNA lesion bypass abilities and DNA synthesis efficiencies within a single DNA polymerase family. Consequently, even though the DNA lesion bypass function can be redundant for multitudes of DNA polymerases, the outcome of each DNA lesion bypass mechanism will be different for each DNA polymerase encountering each type of DNA lesion in the context of each DNA sequence.


7.7 Figures

Figure 7.1. Running start assay for Sulfolobus solfataricus DNA Polymerase B1exo+ at 37 °C.

A solution of preincubated DNA Polymerase B1exo+ (100 nM) and 5′-radiolabeled DNA substrate (A) S-1a 17/26-mer (100 nM) or (B) S-1b 17/26-mer-dGAP (100 nM) was rapidly mixed with all four dNTPs (200 µM each) for various time intervals before being quenched with EDTA (0.37 M).


[Plot: Lesion Bypass% (0-100) versus Time (sec) (0-60)]

Figure 7.2. Bypass of dGAP over time.

Quantification of running start assay gel images in Figures 5.1 and 6.2 for Dpo4 (●), hPolε (■), hPol (□), and hPol (○).


Figure 7.3. Standing start assay for hRev1 at 37 °C.

Each lane represented a 10 min reaction. “0”, “A”, “C”, “G”, “T” and “4” denoted no dNTP added, only dATP (200 µM) added, only dCTP (200 µM), only dGTP (200 µM), only dTTP (200 µM) and all four dNTPs (200 µM each) added, respectively. DNA substrates used were (A) S-2a 20/26-mer and (B) S-2b 20/26-mer-dGAP.


No Incorporation
CTACCTGAACGACGGCCAGTGAATTCG-GCGGGGACAGGACGGCTAGTGCAATGTTGACC (0/52)
C (1/52)

A (continued)

Figure 7.4. Mutation spectra of DNA synthesis catalyzed by Dpo4 (A and B), hPolε (C and D), hPolθ (E and F), and hPol (G).

Results from SOSA are shown separately based on specific dNTP incorporations with the damaged DNA substrate S-3b 17/73-mer-dGAP (A, C, E, and G) or the control DNA substrate S-3a 17/73-mer (B, D and F). Sequences corresponding to primers used for PCR amplification are shown in small case font while sequenced dNTPs are in large case font and underlined. Individual base substitutions (blue), deletions (red), and insertions (green) are written below the full-length 'product' while complex mutations are color-coded based on the specific mutation in order of occurrence. Boldfaced letter shaded in light blue corresponds to incorporation opposite the dGAP site (A, C, E, and G). Boldfaced letter corresponds to incorporation opposite the template base dG (B, D, and F). Relative frequencies are shown at right in parentheses. * denotes that the smaller-than-standard font size of mutations started under the first base and included the mutations in the smaller-than-standard font size.

Figure 7.4 continued

Cytosine Incorporations
CTACCTGAACGACGGCCAGTGAATTCGCGCGGGGACAGGACGGCTAGTGCAATGTTGACC (36/52)
C (1/52) A (2/52) C (1/52) C (1/52) A (1/52) T (1/52) A (1/52) A (1/52) A (1/52) T (1/52)

Adenine Incorporations
CTACCTGAACGACGGCCAGTGAATTCGAGCGGGGACAGGACGGCTAGTGCAATGTTGACC (2/52)
G (2/52)

A

(continued)


Figure 7.4 continued

CTACCTGAACGACGGCCAGTGAATTCGCGCGGGGACAGGACGGCTAGTGCAATGTTGACC (48/52)
C (1/52) C (1/52) G (1/52)
CTACCTGAACGACGGCCAGTGAATTCGCGCGGGGACAGGACGGCTAGTGCAATGTTGACC (1/52)

B

Adenine Incorporation
CTACCTGAACGACGGCCAGTGAATTCGAGCGGGGACAGGACGGCTAGTGCAATGTTGACC (1/44)
C GACG (1/44)
G G GACG (1/44) A ACAG C (1/44) G (1/44)

C (continued)


Figure 7.4 continued

Cytosine Incorporations
CTACCTGAACGACGGCCAGTGAATTCGCGCGGGGACAGGACGGCTAGTGCAATGTTGACC (5/44)
G T A (1/44) G (1/44) T (1/44) G (1/44) AC (1/44) A (1/44) A (1/44) G (1/44) G (1/44)
A (1/44) G C (1/44)

Thymine Incorporation
CTACCTGAACGACGGCCAGTGAATTCGTGCGGGGACAGGACGGCTAGTGCAATGTTGACC (0/44)
G C A (1/44) AT (1/44)

Guanine Incorporation
CTACCTGAACGACGGCCAGTGAATTCGGGCGGGGACAGGACGGCTAGTGCAATGTTGACC (1/44)

C (continued)

Figure 7.4 continued

No Incorporation
CTACCTGAACGACGGCCAGTGAATTCG-GCGGGGACAGGACGGCTAGTGCAATGTTGACC (2/44)
AA T (1/44) AAA (1/44) C (1/44) CG (1/44) G GGGG T (1/44) G ACAG (1/44) G (4/44) A G A (1/44) G A (1/44)
G CA G (1/44) AATTCG G T (1/44) A TTCG G (1/44) G G (1/44) A G (1/44) G G G (1/44)

C (continued)


Figure 7.4 continued

CTACCTGAACGACGGCCAGTGAATTCGCGCGGGGACAGGACGGCTAGTGCAATGTTGACC (19/51) CTACCTGAACGACGGCCAGTGAATTCGCGCGGGGAAAGGACGGCTAGTGCAATGTTGACC (1/51) CTACCTGAACGACGGCCAGTGAATTCGCGCGGGGACAAGACGGCTAGTGCAATGTTGACC (1/51) CTACCTGAACGACGGCCAGTGAACTCGCGTGGGGACAGGACGGCTAGTGCAATGTTGACC (1/51) CTACCTGAACGACGGCCAGTGAATTCGCGCGGGGACAGGACGGCTAGTGCAATGTTGACC (1/51) CTACCTGAACGACGGCCAGTGAATTCGCGCGGGGACAGGACGGCTAGTGCAATGTTGACC (1/51) CTACCTGAACGACGGCCGGTGAATTCGCACGGGGACAGGACGGCTAGTGCAATGTTGACC (2/51) CTACCTGAACGACGGCCAGAGAATTCGCGCGGGGACGGGACGGCTAGTGCAATGTTGACC (1/51) CTACCTGAACGACGGCCAGTGAATGCGCGCGGGGACAGGACAGCTAGTGCAATGTTGACC (1/51) CTACCTGAACGACGGCCAGTAAATTCGCGCGAGGACAGGACGGCTAGTGCAATGTTGACC (1/51)

280 CTACCTGAACGACGGCCAGTGAATCCGCGCGGGGCCAGGACGGCTAGTGCAATGTTGACC (1/51) CTACCTGAACGACGGCCAATGAATTCGCGCGGGGACAGGACGGCTAGTGCAATGTTGACC (1/51) CTACCTGAACGACGGCCAGTGAATTCGCGAGGGGACAGGACGGCTAGTGCAATGTTGACC (1/51) CTACCTGAACGACGGCCAGTGAATTCGCGTGGGGACAGGACGGCTAGTGCAATGTTGACC (1/51) CTACCTGAACGACGGCCAGTGAATTCTCGCGGGGACAGGACGGCTAGTGCAATGTTGACC (1/51) CTACCTGAACGACGGCCAGTGAATTCGCGTGGGGATAGGACGGCTAGTGCAATGTTGACC (1/51) CTACCTGAACGACGGCCAGTGAATTCGCGCGGGGACAGGGCGGCTAGTGCAATGTTGACC (1/51) CTACCTGAACGACGGCCAGTGAATTCGCGCGGGGACAAGACGGCTAGTGCAATGTTGACC (1/51) CTACCTGAACGACGGCCAGTGAGTTCGCGCGGGGACAGGACGGCTAGTGCAATGTTGACC (3/51)

D (continued)


Figure 7.4 continued

CTACCTGAACGACGGCCAGTGAATTCGCGCGGGGACAGGACGGCTAGTGCAATGTTGACC
CTACCTGAACGACGGCCAGTGACTTCGCGCGGGGACAGGACGGCTAGTGCAATGTTGACC (1/51)
CTACCTGAACGACGGCCAGTGAATTCGCGCGGGGACAGGACGGCTAGTGCAATGTTGACC (1/51)
CTACCTGAACGACGGCCCGTGAATTCGCGCGGGGACAGGACGGCTAGTGCAATGTTGACC (1/51)
CTACCTGAACGACGGCCAGTGAATCCGCGCGGGGCCAGGACGGCTAGTGCAATGTTGACC (1/51)
CTACCTGAACGACGGCCAGTGAACTCGCGTGGGGACAGGACGGCTAGTGCAATGTTGACC (1/51)
CTACCTGAACGACGGCCAGTAAATTCGCGCGGGGACAGGACGGCTAGTGCAATGTTGACC (1/51)
CTACCTGNACGACGGCCAGTGAATTCGCGCGGGGACAGGACGGCTAGTGCAATGTTGACC (2/51)
CTACCTGAACGACGGCCAGTGAAACCGCGCGGGGACAGGACGGCTAGTGCAATGTTGACC (3/51)


D

Cytosine Incorporations
CTACCTGAACGACGGCCAGTGAATTCGCGCGGGGACAGGACGGCTAGTGCAATGTTGACC (9/45)
C C (1/45) T (2/45) A GA ACAGG (1/45) T (1/45) GG (1/45) G T (1/45) AT G G (1/45)

E (continued)


Figure 7.4 continued

No Incorporations
CTACCTGAACGACGGCCAGTGAATTCG-GCGGGGACAGGACGGCTAGTGCAATGTTGACC (7/45)
G (3/45) G T (1/45) G (1/45) C (1/45) C (2/45)

Adenine Incorporations
CTACCTGAACGACGGCCAGTGAATTCGAGCGGGGACAGGACGGCTAGTGCAATGTTGACC (7/45)
(1/45) T A G (1/45)
G (1/45) A (1/45) G (1/45)

Thymine Incorporation
CTACCTGAACGACGGCCAGTGAATTCGTGCGGGGACAGGACGGCTAGTGCAATGTTGACC (0/45)
G (1/45)

E

(continued)


Figure 7.4 continued

CTACCTGAACGACGGCCAGTGAATTCGCGCGGGGACAGGACGGCTAGTGCAATGTTGACC (27/51) CTACCTGAACGACGGCCAGTGAATTCGCGCGGGGACAGGACGGCTAGTGCAATGTTGACC (7/51) CTACCTGAACGACGGCCAGTGAATTCGAGCGGGGACAGGACGGCTAGTGCAATGTTGACC (5/51) CTACCTGAACGACGGCCAGTGAATTCGCGCGGGGTCAGGACGGCTAGTGCAATGTTGACC (1/51) CTACCTGAACGACGGCCAGTGAATTCGCGCGGGGACAGGACGGCTAGTGCAATGTTGACC (1/51) CTACCTGAACGACGGCCAGTGAATTCGCGCGGCGACAGGACGGCTAGTGCAATGTTGACC (1/51) CTACCTGAACGACGGCCAGTGAATTCGCGCGCGGACAGGACGGCTAGTGCAATGTTGACC (2/51) CTACCTGAACGACGGCCAGTGAATTCGCGCGGGAACAGGACGGCTAGTGCAAGGTTGACC (1/51) CTACCTGAACGACGGCCAGTGAATTCGCGCGGGGACAGGATGGCTAGTGCAATGTTGACC (1/51) CTACCTGAACGACGGCCAGTGAATTCGCGCGGGGACAGGAAGGCTAGTGCAATGTTGACC (1/51) 283 CTACCTGAACGACGGCCAATGAATTAGCGCGGGGACAGGACGGCTAGTGCAATGTTGACC (1/51) CTACCTGAACGACGGCCAGTGAATTCGCGAGGGGACAGGACGGCTAGTGCAATGTTGACC (2/51) CTACCTGAACGACGGCCAGTGAATTCTCGCGGGGACAGGACGGCTAGTGNNNTGTTGACC (1/51)

F

(continued)


Figure 7.4 continued

Cytosine Incorporation
CTACCTGAACGACGGCCAGTGAATTCGCGCGGGGACAGGACGGCTAGTGCAATGTTGACC (0/5)
CTACCTGAACGACGGCCAGTGGTTTCTCGCGGGGACAGGGGCGCTAGTGCAATGTTGACC (1/5)
CTACCTGAACGACGGCCAGTTGAGTTCGCGCGGGGACAGGTCGGCTAGTGCAATGTTGACC (1/5)*
CTACCTGAACGACGGCCAGTGGTTTCGCGCGGGGACAGGGACGGCTAGTGCAATGTTGACC (1/5)*
CTACCTGAACGACGGCCAGTGGATTCGCGCGGGGACGGGTCGGCTAGTGCAATGtTgACC (1/5)

No Incorporation
CTACCTGAACGACGGCCAGTGAATTCG-GCGGGGACAGGACGGCTAGTGCAATGTTGACC (0/5)
CTACCTGAACGACGGCCAGTGAATTCGCGCGGGGACAGGACGGCTAGTGCAATGttgacc (1/5)

G


[Bar chart: Relative Frequency versus Preferred Action Opposite dGAP (dA, dC, deletion, dT, dG); legend: Dpo4, Pol Kappa, Pol Eta]

A

[Bar chart: Relative Frequency versus Preferred Action Opposite dG (dA, dC, deletion, dT, dG); legend: Dpo4, Pol Kappa, Pol Eta]

B (continued)

Figure 7.5. Comparison of preferred actions by Dpo4, hPolε and hPol opposite dGAP in DNA substrate S-3b 17/73-mer-dGAP or the corresponding template base dG in DNA substrate S-3a 17/73-mer.

The results from SOSA were tallied for all events at template Position 0 (A and B) and Position +1 (C and D) in the presence of dGAP (A and C) or dG (B and D) for Dpo4 (black bar), hPolε (striped bar), and hPolθ (white bar).


Figure 7.5 continued

[Bar chart: Relative Frequency versus Preferred Action Opposite +1 Site (dA, dG, deletion, dC, dT); legend: Dpo4, Pol Eta, Pol Kappa]

C

[Bar chart: Relative Frequency versus Preferred Action Opposite dC (dA, dG, deletion, dC, dT); legend: Dpo4, Pol Kappa, Pol Eta]

D


[Histogram: Relative Error (%) versus Position from dGAP (-10 to 14; lesion site marked X)]

A (continued)

Figure 7.6. Histogram of relative error% as a function of template position.

At each position along the DNA template, the relative base insertion% (striped bar), substitution% (white bar) and deletion% (black bar) are shown to reveal total relative error% and the contribution of each type of mutations simultaneously. The dGAP site is indicated as “X” along the X-axis, and the corresponding template base dG is denoted as “0” along the X-axis. The dGAP bypass analyses for Dpo4 (A), hPolε (C), and hPolθ (E) are shown. DNA synthesis with the control S-3a was also analyzed for Dpo4 (B), hPolε (D), and hPolθ (F).


Figure 7.6 continued

[Histogram: Relative Error (%) versus Position Along 73-mer (-10 to 14)]

B

[Histogram: Relative Error (%) versus Position from dGAP Site (-10 to 14; lesion site marked X)]

C (continued)


Figure 7.6 continued

[Histogram: Relative Error (%) versus Position Along 73-mer (-10 to 14)]

D

[Histogram: Relative Error (%) versus Position from dGAP Site (-10 to 14; lesion site marked X)]

E (continued)


Figure 7.6 continued

[Histogram: Relative Error (%) versus Position Along 73-mer (-10 to 14)]

F


Figure 7.7. Crystal structure of cis-syn cyclobutane thymine dimer (cis-syn TT).

Image is derived from PDB ID 3MFI [211].


(continued)

Figure 7.8. Running start assays for cis-syn TT bypass catalyzed by Y-family DNA polymerases.

A solution of preincubated 100 nM of Dpo4 (A and B), hPolε (C and D), hPolθ (E and F), hPolη (G and H) or hRev1 (I and J) and 5′-[32P]-labeled S-4a 17/77-mer (A, C, E, G and I) or S-4b 17/77-mer-CPD (B, D, F, H and J) at 37 °C was rapidly mixed with all four dNTPs (200 µM each) for various reaction times, and quenched with EDTA (0.37 M). The 17th, 27th, 28th, and 66th positions mark the primer, the 3′-T of cis-syn TT (T1), the 5′-T of cis-syn TT (T2) and full-length product, respectively.

[Plot for Figure 7.9: lesion bypass (%) versus time (s).]

Figure 7.9. Bypass of cis-syn TT as a function of time.

Quantification of the gel images from the running start assays in Figure 7.8 for Dpo4 (♦), hPolε (■), hPol (□), hPol (◊), and hRev1 (▲).

AA Incorporations CGGCATCAGCAATGTTGACCCAACTCAATGTCGATCCAATGGAGGCGTGCTGTGCGAGCGGATAGG (8/28) CGGCATCAGCAATGTTG C CGTGCTGTGCGAGCGGATAGG (1/28)

AC Incorporations CGGCATCAGCAATGTTGACCCAACTCACTGTCGATCCAATGGAGGCGTGCTGTGCGAGCGGATAGG (4/28) CGGCATCAGCAATGTTG C T CGTGCTGTGCGAGCGGATAGG (1/28)

A (continued)

Figure 7.10. Mutation spectra of DNA synthesis catalyzed by Dpo4 (A and B), hPolε (C), hPolθ (D and E), and hPol (F and G).

Results from SOSA are shown separately based on specific dNTP incorporations with the damaged DNA substrate S-4b 17/77-mer-CPD (A, C, E, and G) or the control DNA substrate S-4a 17/77-mer (B, D and F). Sequences corresponding to primers used for PCR amplification are shown in lowercase font while sequenced dNTPs are in uppercase font and underlined. Individual base substitutions (blue), deletions (red), and insertions (green) are written below the full-length “product” while complex mutations are color-coded based on the specific mutation in order of occurrence. A boldfaced letter shaded in light blue corresponds to incorporation opposite the cis-syn TT (A, D, and F). A boldfaced letter corresponds to incorporation opposite the template bases TT (B, C, E, and G). Relative frequencies are shown at right in parentheses. * denotes that the smaller-than-standard font size of mutations started under the first base and included the mutations in the smaller-than-standard font size.

Figure 7.10 continued

G- Incorporations CGGCATCAGCAATGTTGACCCAACTCG-TGTCGATCCAATGGAGGCGTGCTGTGCGAGCGGATAGG (4/28) CGGCATCAGCAATGTTG GA T A A CAT TGTGCTGTGCGAGCGGATAGG (1/28)

No Incorporations CGGCATCAGCAATGTTGACCCAACTC--TGTCGATCCAATGGAGGCGTGCTGTGCGAGCGGATAGG (9/28) CGGCATCAGCAATGTTG G AA CGTGCTGTGCGAGCGGATAGG (2/28) CGGCATCAGCAATGTTG CT AA CGTGCTGTGCGAGCGGATAGG (1/28) CGGCATCAGCAATGTTG AA C T CGTGCTGTGCGAGCGGATAGG (1/28) A

CGGCATCAGCAATGTTGACCCAACTCAATGTCGATCCAATGGAGGCGTGCTGTGCGAGCGGATAGG (61/65) CGGCATCAGCAATGTTG G CGTGCTGTGCGAGCGGATAGG (2/65) CGGCATCAGCAATGTTG G CGTGCTGTGCGAGCGGATAGG (1/65) CGGCATCAGCAATGTTG CCGTGCTGTGCGAGCGGATAGG (1/65) B (continued)

CGGCATCAGCAATGTTGACCCAACTCAATGTCGATCCAATGGAGGCGTGCTGTGCGAGCGGATAGG (2/11) CGGCATCAGCAATGTTG T CGTGCTGTGCGAGCGGATAGG (2/11) CGGCATCAGCAATGTTGA G CGTGCTGTGCGAGCGGATAGG (2/11) CGGCATCAGCAATGTTG T CGTGCTGTGCGAGCGGATAGG (1/11) CGGCATCAGCAATGTTG T G CGTGCTGTGCGAGCGGATAGG (1/11) CGGCATCAGCAATGTTC C C CGTGCTGTGCGAGCGGATAGG (1/11) CGGCATCAGCAATGA G CGTGCTGTGCGAGCGGATAGG (1/11) CGGCATCAGCAATGTTG G CGTGCTGTGCGAGCGGATAGG (1/11) CGGCATCAGCAATGTTA C AA T T CCAATGTGCTGTGCGAGCGGATAGG (1/11) C


AA Incorporations CGGCATCAGCAATGTTGACCCAACTCAATGTCGATCCAATGGAGGCGTGCTGTGCGAGCGGATAGG (0/14) CGGCATCAGCAATGTTG A CGTGCTGTGCGAGCGGATAGG (1/14) CGGCATCAGCAATGTTG A C T AGTGCTGTGCGAGCGGATAGG (1/14) CGGCATCAGCAATGTTG A CGTGCTGTGCGAGCGGATAGG (1/14)

TC Incorporations CGGCATCAGCAATGTTGACCCAACTCTCTGTCGATCCAATGGAGGCGTGCTGTGCGAGCGGATAGG (0/14) CGGCATCAGCAATGT TC C TA C C ACGTGCTGTGCGAGCGGATAGG (1/14) CGGCATCAGCAATGTTG TC C TT C CGTGCTGTGCGAGCGGATAGG (1/14)

D (continued)


TG Incorporations CGGCATCAGCAATGTTGACCCAACTCTGTGTCGATCCAATGGAGGCGTGCTGTGCGAGCGGATAGG (0/14) CGGCATCAGCAATGTTG G TG GT CGTGCTGTGCGAGCGGATAGG (4/14)

No Incorporations CGGCATCAGCAATGTTGACCCAACTC--TGTCGATCCAATGGAGGCGTGCTGTGCGAGCGGATAGG (0/14) CGGCATCAGCAATGTTG G AA CGTGCTGTGCGAGCGGATAGG (2/14) CGGCATCAGCAATGTTG AA A CGTGCTGTGCGAGCGGATAGG (1/14) CGGCATCAGCAATGTTG AA A CGTGCTGTGCGAGCGGATAGG (1/14) CGGCATCAGCAATGTTG AA CGGTGCTGTGCGAGCGGATAGG (1/14) D (continued)


CGGCATCAGCAATGTTGACCCAACTCAATGTCGATCCAATGGAGGCGTGCTGTGCGAGCGGATAGG (10/25) CGGCATCAGCAATGTTG CA CGTGCTGTGCGAGCGGATAGG (5/25) CGGCATCAGCAATGTTG C A A A CGTGCTGTGCGAGCGGATAGG (1/25) CGGCATCAGCAATGTTG C A A CGTGCTGTGCGAGCGGATAGG (1/25) CGGCATCAGCAATGTTG T CCGTGCTGTGCGAGCGGATAGG (1/25) CGGCATCAGCAATGTTG C CGTGCTGTGCGAGCGGATAGG (1/25) CGGCATCAGCAATGTTG G G CGTGCTGTGCGAGCGGATAGG (1/25) CGGCATCAGCAATGTTG C CGTGCTGTGCGAGCGGATAGG (1/25) CGGCATCAGCAATGTTG C C CGTGCTGTGCGAGCGGATAGG (1/25) CGGCATCAGCAATGTTG C A AGGCGTGCTGTGCGAGCGGATAGG (1/25)

CGGCATCAGCAATGTTG GAC A T GCGTGCTGTGCGAGCGGATAGG (1/25) CGGCATCAGCAATGTTG TTC CGTGCTGTGCGAGCGGATAGG (1/25)

E (continued)


TA Incorporations CGGCATCAGCAATGTTGACCCAACTCTATGTCGATCCAATGGAGGCGTGCTGTGCGAGCGGATAGG (0/21) TA CCCG CG GT (1/21)

No Incorporations CGGCATCAGCAATGTTGACCCAACTC--TGTCGATCCAATGGAGGCGTGCTGTGCGAGCGGATAGG (11/21) CGGCATCAGCAATGTTG G AA T TG T CGTGCTGTGCGAGCGGATAGG (2/21) CGGCATCAGCAATGTTGC AA T T TTTG ACC G G A A (1/21)* CGGCATCAGCAATGTTG G AA TG TG G T CGTGCTGTGCGAGCGGATAGG (1/21) CGGCATCAGCAATGTTG CT AA C A CGTGCTGTGCGAGCGGATAGG (1/21)

CGGCATCAGCAATGTTG AA C C (1/21)

CGGCATCAGCAATGTTG G AA T A T CGTGCTGTGCGAGCGGATAGG (1/21) CGGCATCAGCAATGTTG G AA TA TT G CGTGCTGTGCGAGCGGATAGG (1/21) CGGCATCAGCAATGTTG T AA T G CGTGCTGTGCGAGCGGATAGG (1/21) F (continued)


CGGCATCAGCAATGTTGACCCAACTCAATGTCGATCCAATGGAGGCGTGCTGTGCGAGCGGATAGG (0/6) CGGCATCAGCAATGTTG A T A A CGTGCTGTGCGAGCGGATAGG (1/6) CGGCATCAGCAATGTTG C A CGTGCTGTGCGAGCGGATAGG (1/6) CGGCATCAGCAATGTTG T C CGTGCTGTGCGAGCGGATAGG (1/6) CGGCATCAGCAATGTTG T GC C CA A A (1/6) CGGCATCAGCAATGTTG AA G A CGTGCTGTGCGAGCGGATAGG (1/6) CGGCATCAGCAATGTTG T C T A CGTGCTGTGCGAGCGGATAGG (1/6) G

[Panels A and B of Figure 7.11: relative frequency (0 to 1) of each preferred action (dA, dC, deletion, dT, dG) opposite T1 for the control (A) and the lesion (B).]

Figure 7.11. Comparison of preferred actions by Dpo4, hPolε, hPol and hPol opposite cis-syn TT in DNA substrate S-4b 17/77-mer-CPD or the corresponding template bases TT in DNA substrate S-4a 17/77-mer.

The results from SOSA were tallied for all events at template Position T1 (A and B) and Position T2 (C and D) in the presence of cis-syn TT (B and D) or TT (A and C) for Dpo4 (black bar), hPolε (white bar), hPol (red bar), and hPolθ (blue bar).

Figure 7.11 continued

[Panels C and D: relative frequency (0 to 1) of each preferred action (dA, dC, deletion, dT, dG) opposite T2 for the control (C) and the lesion (D).]

[Panel A of Figure 7.12: relative error (%) versus position from the CPD lesion.]

Figure 7.12. Histograms of relative error% as a function of template position.

At each position along the DNA template, the relative base insertion% (white bar), substitution% (grey bar) and deletion% (black bar) are shown to reveal the total relative error% and the contribution of each type of mutation simultaneously. The 3′-T and 5′-T of cis-syn TT are indicated as “T1” and “T2”, respectively, along the X-axis. The same notation was used for the corresponding control TT template bases. The cis-syn TT bypass analyses for Dpo4 (A), hPol (C), and hPol (E) are shown. DNA synthesis with the control S-4a was also analyzed for Dpo4 (B), hPol (D), hPol (F), and hPolε (G).

Figure 7.12 continued

[Panels B to G: relative error (%) versus position from TT on the 77-mer (B, D, F, and G) or position from the CPD lesion (C and E).]

Figure 7.13. Running start assays for cisplatin-dGpG bypass catalyzed by hPolε and hPol.

A preincubated solution of 100 nM hPolε (A and B) or hPol (C and D) and 5′-[³²P]-labeled S-5a 15/54-mer (A and C) or S-5b 15/54-mer-DDP (B and D) at 37 °C was rapidly mixed with all four dNTPs (200 µM each) for various reaction times, and quenched with EDTA (0.37 M). The 15th, 24th, 25th, and 54th positions mark the primer, the 3′-G of the lesion (G1), the 5′-G of the lesion (G2) and full-length product, respectively.

[Plot for Figure 7.14: lesion bypass (%) versus time (s).]

Figure 7.14. Bypass of cisplatin-dGpG as a function of time.

Quantification of the gel images from the running start assays in Figure 7.13 for hPol (♦) and hPolε (■) as well as Dpo4 (○) from our previous kinetic study [108].

CC Incorporations GTCCCTGTTCGGGCGCCAGGAGACCAGAGGCTAGTCTCGTGGTCGAGTCAGGTC (31/49) AG T (2/49) G A (1/49) G (4/49) A (1/49) G (1/49) C (1/49)

A (continued)

Figure 7.15. Mutation spectra of DNA synthesis catalyzed by Dpo4 (A and B), and hPolθ (C and D).

Results from SOSA are shown separately based on specific dNTP incorporations with the damaged DNA substrate S-6b 15/69-mer-DDP (A and C) or the control DNA substrate S-6a 15/69-mer (B and D). Sequences corresponding to primers used for PCR amplification are shown in lowercase font while sequenced dNTPs are in uppercase font and underlined. Individual base substitutions (blue), deletions (red), and insertions (green) are written below the full-length “product” while complex mutations are color-coded based on the specific mutation in order of occurrence. A boldfaced letter shaded in light blue corresponds to incorporation opposite the cisplatin-dGpG (A and C). A boldfaced letter corresponds to incorporation opposite the template bases GG (B and D). Relative frequencies are shown at right in parentheses.

Figure 7.15 continued

AC Incorporations GTCCCTGTTCGGGCGCCAGGAGAACAGAGGCTAGTCTCGTGGTCGAGTCAGGTC (1/49)

CA Incorporations GTCCCTGTTCGGGCGCCAGGAGACAAGAGGCTAGTCTCGTGGTCGAGTCAGGTC (2/49) G (1/49) C (1/49)

No Incorporations GTCCCTGTTCGGGCGCCAGGAGA--AGAGGCTAGTCTCGTGGTCGAGTCAGGTC (0/49) C A AGAG (1/49)

AGA (1/49) AGAT (1/49)

A

GTCCCTGTTCGGGCGCCAGGAGACCAGAGGCTAGTCTCGTGGTCGAGTCAGGTC (27/33) GTCCCTGTTCGGGCGCCAGGAGACCAGAGGCTAGTCTCGTGGTCGAGTCAGGTC (1/33) GTCCCTGTTCGGGCGCCAGGAGACCAGAGGCTAGTCTCGTGGTCGAGTCAGGTC (1/33) GTCCCTGTTCGGGCGCCATGAGACCAGAGGCTAGTCTCGTGGTCGAGTCAGGTC (1/33) GTCCCTGTTCGGGCGCCAGGAGACCAGAGGCTAGTTTCGTGGTCGAGTCAGGTC (1/33) GTCCCTGTTCGGGCGGCAGGAGACCAGAGGCTAGTCTCGTGGTCGAGTCAGGTC (1/33) GTCCCTGTTCGGGCGCCAGGAGACCAGCGGCTAGTCTCGTGGTCGAGTCAGGTC (1/33) B (continued) 311

Figure 7.15 continued

CC Incorporations GTCCCTGTTCGGGCGCCAGGAGACCAGAGGCTAGTCTCGTGGTCGAGTCAGGTC (27/49) A (1/49) T (1/49) G G (1/49) C (1/49) A G (1/49) G (3/49) C (2/49) A (1/49)

AC Incorporations

GTCCCTGTTCGGGCGCCAGGAGAACAGAGGCTAGTCTCGTGGTCGAGTCAGGTC (0/49) AGAGG (1/49)

C- Incorporations GTCCCTGTTCGGGCGCCAGGAGAC-AGAGGCTAGTCTCGTGGTCGAGTCAGGTC (1/49) G A AGAG (1/49) AGA (1/49)

C (continued)


No Incorporations GTCCCTGTTCGGGCGCCAGGAGA--AGAGGCTAGTCTCGTGGTCGAGTCAGGTC (0/49) GA G (1/49) AGA (1/49) G GA AGAGGCT (1/49) G A (2/49) G G A (1/49) A G G (1/49)

C


GTCCCTGTTCGGGCGCCAGGAGACCAGAGGCTAGTCTCGTGGTCGAGTCAGGTC (41/64) GTCCCTGTTCGGGCGCCAGGAGACCAGAGGCTAGTCTCGTGGTCGAGTCAGGtC (1/64) GTCCCTGTTCGGGCGCCAGGAGAcCAGAGGCTAGTCTCGTGGTCGAGTCAGGTC (9/64) GTCCCTGTTCGGGCGCCAGGAGACCAGCAGGTAGTCTCTGGGTCGAGTCAGGTC (1/64) GTCCCTGTTCGGGCGCCAGGAGAcCAGCGGCTAGTCTCGTGGTCCAGTCAGGTC (1/64) GTCCCTGTTCGGGCGCCAGGAGACCAGAAGCTAGTCTCGTGGTCGAGTCAGGGC (1/64) GTCCCTGTTCGGGCGCCAGGAGACCAGCGGCTAGTCTCGTGGTCGAGTCAGGTC (2/64) GTCCCTGTTCGGGCGCCAGGAGACCAGAGGCTAGTCTCGTGGTCGAGTCAGGTC (3/64) GTCCCTGTTCGGGCGCCAGGAGACCAGAGGCTAGTCTAGTGGTCGAGTCAGGTC (3/64) GTCCCTGTTCGGGCGCCAGGAGACCAGCGGCTAGTCTCGTGGTCGAGTCAGGTC (1/64) GTCCCTGTTCGGGCGCCAGGAGAACAGAGGCTAGTCTCGTGGTCGAGTCAGGTC (1/64)

D

[Panels A and B of Figure 7.16: relative frequency (0 to 100) of each preferred action (dA, dC, deletion) opposite G1 for the control (A) and the lesion (B).]

Figure 7.16. Comparison of preferred actions by Dpo4 and hPol opposite cisplatin-dGpG in DNA substrate S-6b 15/69-mer-DDP or the corresponding template bases GG in DNA substrate S-6a 15/69-mer.

The results from SOSA were tallied for all events at template position G1 (A and B) and position G2 (C and D) in the presence of cisplatin-dGpG (B and D) or GG (A and C) for Dpo4 (black bar) and hPol (white bar).

Figure 7.16 continued

[Panels C and D: relative frequency (0 to 100) of each preferred action (dA, dC, deletion) opposite G2 for the control (C) and the lesion (D).]

[Panel A of Figure 7.17: relative error (%) versus position from the cisplatin sites.]

Figure 7.17. Histogram of relative error% as a function of template position.

At each position along the DNA template, the relative base insertion% (striped bar), substitution% (white bar) and deletion% (black bar) are shown to reveal the total relative error% and the contribution of each type of mutation simultaneously. The 3′-G and 5′-G of the cisplatin-dGpG are indicated as “G1” and “G2”, respectively, along the X-axis. The same notation was used for the corresponding control GG template bases. The cisplatin-dGpG bypass analyses for Dpo4 (A) and hPol (C) are shown. DNA synthesis with the control S-6a was also analyzed for Dpo4 (B) and hPol (D).

Figure 7.17 continued

[Panels B to D: relative error (%) versus position along the 69-mer control (B and D) or position from the cisplatin sites (C).]

[Plot for Figure 7.18: product concentration (nM) versus time (s).]

Figure 7.18. Pre-steady state burst kinetics of dTTP incorporation into D-1 catalyzed by hPolε.

The product concentration was plotted as a function of time and fit to Eq 3 to obtain rate constants of 46.7 ± 5.6 s⁻¹ and 0.126 ± 0.007 s⁻¹ for the exponential phase and linear phase of the curve, respectively.

[Panels A and B of Figure 7.19: complex concentration (nM) versus enzyme concentration (nM).]

Figure 7.19. EMSA for hPolε using different DNA substrates.

(A) Corresponding plot of hPolε EMSA using D-8 21/41-mer. Kd, DNA was determined to be 38 ± 3 nM. (B) Corresponding plot of hPolε EMSA using BE2 16/16-mer. Kd, DNA was determined to be 204 ± 13 nM.

[Panels A and B of Figure 7.20: (A) product concentration (nM) versus time (s); (B) kobs (s⁻¹) versus dTTP concentration (µM).]

Figure 7.20. Pre-steady state kinetics of dTTP incorporation into D-1 21/41-mer catalyzed by hPolε at 37 °C.

(A) A preincubated solution of hPolε (130 nM) and 5′-radiolabeled 21/41-mer (20 nM) was rapidly mixed with increasing concentrations of dTTP (50 µM, ●; 100 µM, □; 200 µM, ▲; 400 µM, ◊; 650 µM, ■; 950 µM, ○; 1200 µM, ♦) for various time intervals. Each time course was fit to Eq 5 to yield a kobs. (B) The plot of kobs values against dTTP concentrations was fit to Eq 6 to determine a kp of 48 ± 2 s⁻¹ and a Kd, dTTP of 258 ± 36 µM.
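The dependence of kobs on dTTP concentration in panel B can be illustrated with a short Python sketch. Eq 6 is not reproduced in this section, so the hyperbolic form kobs = kp[dNTP]/(Kd + [dNTP]) used here is an assumption (it is the form commonly used for pre-steady state nucleotide titrations). The function name `k_obs` is hypothetical; kp and Kd are the fitted values quoted in the caption.

```python
# Sketch only: assumes Eq 6 has the hyperbolic form
# kobs = kp * [dNTP] / (Kd + [dNTP]); the equation itself is not shown here.

def k_obs(dntp_uM: float, kp: float = 48.0, kd_uM: float = 258.0) -> float:
    """Observed rate constant (s^-1) at a given dTTP concentration (uM)."""
    return kp * dntp_uM / (kd_uM + dntp_uM)

# dTTP concentrations listed in the Figure 7.20 caption (uM)
concentrations_uM = [50, 100, 200, 400, 650, 950, 1200]

for conc in concentrations_uM:
    print(f"{conc:>5} uM dTTP -> kobs = {k_obs(conc):.1f} s^-1")
```

Under this assumed form, kobs equals half of kp when the dTTP concentration equals Kd, and it approaches kp at saturating dTTP.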

[Plot for Figure 7.21: product concentration (nM) versus time (s).]

Figure 7.21. Biphasic kinetics of hPolε incorporating dCTP and dATP into D-8 and BE2, respectively.

A preincubated solution of hPolε (130 nM) and 5′-[³²P]-labeled D-8 21/41-mer (■, 20 nM) or BE2 16/16-mer (●, 20 nM) was mixed rapidly with DNA trap D-1 21/41-mer (5 µM) and dCTP (■, 0.8 mM), or dATP (●, 1.2 mM). After various time intervals, the reaction was quenched with 0.37 M EDTA. The product concentrations versus time were plotted for each DNA substrate and fit to Eq 7. For D-8 21/41-mer, the fast and slow phases had reaction amplitudes of 8.3 ± 0.5 nM and 6.5 ± 0.5 nM, respectively, and reaction rates of 73 ± 15 s⁻¹ and 1.2 ± 0.3 s⁻¹, respectively. For BE2 16/16-mer, the fast and slow phases had reaction amplitudes of 0.32 ± 0.01 nM and 0.77 ± 0.05 nM, respectively, and reaction rates of 40 ± 8 s⁻¹ and 0.0023 ± 0.0004 s⁻¹, respectively.
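The biphasic time courses described in this caption can be sketched in Python. Eq 7 is not reproduced in this section, so the double-exponential form [Product] = A1(1 - exp(-k1 t)) + A2(1 - exp(-k2 t)) is an assumption. The function name `biphasic_product` and the dictionary names are hypothetical; the amplitudes and rates are the fitted values quoted in the caption.

```python
import math

# Sketch only: assumes Eq 7 is a sum of two single-exponential phases.

def biphasic_product(t: float, a1: float, k1: float, a2: float, k2: float) -> float:
    """Product concentration (nM) at time t (s) for a fast and a slow phase."""
    fast = a1 * (1.0 - math.exp(-k1 * t))
    slow = a2 * (1.0 - math.exp(-k2 * t))
    return fast + slow

# Fitted amplitudes (nM) and rates (s^-1) from the Figure 7.21 caption
D8_FIT = {"a1": 8.3, "k1": 73.0, "a2": 6.5, "k2": 1.2}
BE2_FIT = {"a1": 0.32, "k1": 40.0, "a2": 0.77, "k2": 0.0023}

for t in (0.05, 0.5, 5.0, 25.0):
    print(f"t = {t:>5} s: D-8 product = {biphasic_product(t, **D8_FIT):.2f} nM")
```

With these parameters the D-8 curve plateaus near the sum of the two amplitudes, while the BE2 slow phase needs on the order of a thousand seconds to approach completion.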

Figure 7.21 continued

[Inset: product concentration (nM) versus time (s) over the first second. Zoom out: product concentration (nM) versus time (s) up to 1000 s.]

Figure 7.22. Zoomed in view of binary complex Dpo4•DNA.

The Finger and Palm domains are colored blue and red, respectively. The linker region is colored black. The key amino acid residues are labeled with distances for reference. The structure originated from Protein Data Bank ID 2RDJ [3].

Figure 7.23. Chemical structure of 2-aminopurine hydrogen bonding to thymine.

[Plot for Figure 7.24: fluorescence intensity (cps) versus wt Dpo4 concentration (nM).]

Figure 7.24. Equilibrium dissociation constant for the wt Dpo4 using fluorescence titration assay at 25 °C.

The fluorescence intensity versus the wt Dpo4 concentration plot was fit to Eq 8 to determine Kd, DNA (9.0 ± 1.2 nM).

[Panels A and B of Figure 7.25: unfolded ratio versus GDN concentration (M).]

Figure 7.25. Denaturation of the wt Dpo4 by increasing guanidinium chloride (GDN) at 37 °C monitored by tyrosine fluorescence.

(A) Each titration point was monitored after 14 hr incubation at 37 °C. (B) Each titration point was monitored after 5 min (●) or 10 min (□) incubation at 37 °C.

A

B

Figure 7.26. Model of a DNA primer/template duplex containing a dGAP into the DNA binding cleft of Dpo4 mutant.

The structure of Dpo4 (blue) bound to DNA duplex (multiple colors) and catalytic metal ions (magenta) is from PDB ID 1S0M [109]. The structure of BPDE-dG was replaced with the structure of dGAP from PDB ID 1AXU [198]. The DNA (multiple colors) is either in a productive (A) or dead-end (B) binary complex. The distances from the 1-AP adduct (red) to the nearest amino acid residue and base are shown in angstroms.

7.8 Tables

Substrate   Sequence (primer above template)a

D-1         5′-CGCAGCCGTCCAACCAACTCA-3′
            3′-GCGTCGGCAGGTTGGTTGAGTAGCAGCTAGGTTACGGCAGG-5′
D-6         5′-CGCAGCCGTCCAACCAACTCA-3′
            3′-GCGTCGGCAGGTTGGTTGAGTGGCAGCTAGGTTACGGCAGG-5′
D-7         5′-CGCAGCCGTCCAACCAACTCA-3′
            3′-GCGTCGGCAGGTTGGTTGAGTTGCAGCTAGGTTACGGCAGG-5′
D-8         5′-CGCAGCCGTCCAACCAACTCA-3′
            3′-GCGTCGGCAGGTTGGTTGAGTCGCAGCTAGGTTACGGCAGG-5′
F-8         5′-CGCAGCCGTCCAACCAACTCA-3′
            3′-GCGTCGGCAGGTTGGTTGAGTCACAGCTAGGTTACGGCAGG-5′
BE2         5′-TTGAGTTGCAACTCAA-3′
            3′-AACTCAACGTTGAGTT-5′
S-1a        5′-AACGACGGCCAGTGAAT-3′
            3′-TTGCTGCCGGTCACTTAAGCGCGCCC-5′
S-1b        5′-AACGACGGCCAGTGAAT-3′
            3′-TTGCTGCCGGTCACTTAAGCGCGCCC-5′
S-2a        5′-AACGACGGCCAGTGAATTCG-3′
            3′-TTGCTGCCGGTCACTTAAGCGCGCCC-5′
S-2b        5′-AACGACGGCCAGTGAATTCG-3′
            3′-TTGCTGCCGGTCACTTAAGCGCGCCC-5′

a A designates the 2-aminopurine, G designates the 1-AP adduct on C8 position of dG (dGAP), TT designates the cyclobutane TT dimer, and GG designates the cisplatin-dGpG.

(continued)

Table 7.1. DNA substrates.

Table 7.1 continued

Substrate   Sequence (primer above template)a

S-3a        5′-CTACCTGAACGACGGCC-3′
            3′-CTACTCAGCCGTTGATGGACTTGCTGCCGGTCACTTAAGCGCGCCCCTGTCCTGCCGATCACGTTACAACTGG-5′
S-3b        5′-CTACCTGAACGACGGCC-3′
            3′-CTACTCAGCCGTTGATGGACTTGCTGCCGGTCACTTAAGCGCGCCCCTGTCCTGCCGATCACGTTACAACTGG-5′
S-4a        5′-CGGCATCAGCAATGTTG-3′
            3′-CCTGCTGTCCTGCCGTAGTCGTTACAACTGGGTTGAGTTACAGCTAGGTTACCTCCGCACGACACGCTCGCCTATCC-5′
S-4b        5′-CGGCATCAGCAATGTTG-3′
            3′-CCTGCTGTCCTGCCGTAGTCGTTACAACTGGGTTGAGTTACAGCTAGGTTACCTCCGCACGACACGCTCGCCTATCC-5′
S-5a        5′-GTCCCTGTTCGGGCG-3′
            3′-CAGGGACAAGCCCGCGGTCCTCTGGTCTCCGATCAGAGCACCAGCTCAGTCCAG-5′
S-5b        5′-GTCCCTGTTCGGGCG-3′
            3′-CAGGGACAAGCCCGCGGTCCTCTGGTCTCCGATCAGAGCACCAGCTCAGTCCAG-5′
S-6a        5′-GTCCCTGTTCGGGCG-3′
            3′-GGAGTGTAGCAGCAGCAGGGACAAGCCCGCGGTCCTCTGGTCTCCGATCAGAGCACCAGCTCAGTCCAG-5′
S-6b        5′-GTCCCTGTTCGGGCG-3′
            3′-GGAGTGTAGCAGCAGCAGGGACAAGCCCGCGGTCCTCTGGTCTCCGATCAGAGCACCAGCTCAGTCCAG-5′

a A designates the 2-aminopurine, G designates the 1-AP adduct on C8 position of dG (dGAP), TT designates the cyclobutane TT dimer, and GG designates the cisplatin-dGpG.

Enzyme    t50 bypass (s)a
hPolε     2.2
hPol      4.1
hPol      106.5
Dpo4      15.9

a Calculated as the time required to bypass 50% of the dGAP sites (Position 0) in Figure 7.2.

Table 7.2. The dGAP bypass efficiencies of the human Y-family DNA polymerases and Dpo4.

Enzyme  DNA   Event         Insertion              Deletion               Substitution
                            Errorᵈ      Ratioᵉ     Errorᵈ      Ratioᵉ     Errorᵈ      Ratioᵉ

Dpo4    S-3a  Totalᵃ        0                      8.0 × 10⁻⁴             3.1 × 10⁻³
              Upstreamᵇ     0                      1.9 × 10⁻³             3.8 × 10⁻³
              Downstreamᶜ   0                      0                      0

Dpo4    S-3b  Totalᵃ        0           -          3.2 × 10⁻³  4.0        1.4 × 10⁻²  4.5
              Upstreamᵇ     0           -          3.8 × 10⁻³  2.0        9.6 × 10⁻³  2.5
              Downstreamᶜ   0           -          2.7 × 10⁻³  -          5.5 × 10⁻³  -

hPolθ   S-3a  Totalᵃ        2.5 × 10⁻³             7.4 × 10⁻³             2.9 × 10⁻²
              Upstreamᵇ     0                      7.8 × 10⁻³             2.0 × 10⁻²
              Downstreamᶜ   4.2 × 10⁻³             7.0 × 10⁻³             1.1 × 10⁻²

hPolθ   S-3b  Totalᵃ        2.8 × 10⁻³  1.1        1.8 × 10⁻²  2.4        2.9 × 10⁻²  1.0
              Upstreamᵇ     0           -          1.1 × 10⁻²  1.4        1.1 × 10⁻²  0.6
              Downstreamᶜ   4.8 × 10⁻³  1.1        2.2 × 10⁻²  3.1        1.4 × 10⁻²  1.3

hPolε   S-3a  Totalᵃ        2.5 × 10⁻³             1.6 × 10⁻²             6.1 × 10⁻²
              Upstreamᵇ     0                      5.9 × 10⁻³             4.3 × 10⁻²
              Downstreamᶜ   4.2 × 10⁻³             2.2 × 10⁻²             2.2 × 10⁻²

hPolε   S-3b  Totalᵃ        1.9 × 10⁻³  0.8        5.0 × 10⁻²  3.1        8.0 × 10⁻²  1.3
              Upstreamᵇ     0           -          3.2 × 10⁻²  1.5        4.1 × 10⁻²  1.0
              Downstreamᶜ   3.3 × 10⁻³  0.8        6.3 × 10⁻²  2.9        3.1 × 10⁻²  1.4

ᵃ Total events count all events except those that occurred at Position 0 in Figure 7.6.
ᵇ Upstream events include all events that occurred before an enzyme encountered Position 0 in Figure 7.6.
ᶜ Downstream events include all events that occurred after an enzyme traversed Position 0 in Figure 7.6.
ᵈ Error was calculated using Σ(specific mutation type)/[(number of samples) × (number of bases in event)].
ᵉ Error Ratio was calculated using {Σ(specific mutation type)/[(number of samples) × (number of bases in event)]}73GAP/{Σ(specific mutation type)/[(number of samples) × (number of bases in event)]}73CTL. “-” means the ratio cannot be calculated because the denominator is 0.

Table 7.3. Error rates of the Y-family DNA polymerases using S-3a 17/73-mer and S-3b 17/73-mer-dGAP.
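The two footnote formulas of Table 7.3 can be written out as a minimal Python sketch. The function names `error_frequency` and `error_ratio` are hypothetical, and the raw mutation counts passed to `error_frequency` in the example are illustrative placeholders, not data from this work. The ratio example uses the Dpo4 total deletion errors listed in the table.

```python
from typing import Optional

def error_frequency(n_mutations: int, n_samples: int, n_bases: int) -> float:
    """Error = sum(specific mutation type) / (samples x bases in the event)."""
    return n_mutations / (n_samples * n_bases)

def error_ratio(lesion_error: float, control_error: float) -> Optional[float]:
    """Lesion error divided by control error.

    Returns None when the control error is 0, which the table reports as "-".
    """
    if control_error == 0:
        return None
    return lesion_error / control_error

# Illustrative counts only (hypothetical): 2 deletions, 10 clones, 50 bases each
print(error_frequency(2, 10, 50))

# Dpo4 total deletion errors from Table 7.3: S-3b (lesion) versus S-3a (control)
dpo4_deletion_ratio = error_ratio(3.2e-3, 8.0e-4)
print(round(dpo4_deletion_ratio, 1))

# Dpo4 downstream deletion: the control error is 0, so no ratio is reported
print(error_ratio(2.7e-3, 0.0))
```

Returning `None` rather than raising on a zero denominator mirrors how the table marks undefined ratios.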

Enzyme    T1 bypass t50 (s)a    T2 bypass t50 (s)b    Total bypass t50 (s)c
hPol      0.77                  0.79                  16.79
hPol      14.6                  33.5                  304
hPol      70                    297                   1586
hRev1     5229                  7789                  not observed
Dpo4      0.8                   23.5                  52.1

a Calculated as the time required to bypass 50% of the 3′-T of cis-syn TT (Position T1) in Figure 7.8.
b Calculated as the time required to bypass 50% of the 5′-T of cis-syn TT (Position T2) in Figure 7.8.
c Calculated as the time required to bypass 50% of both the 3′-T and 5′-T of cis-syn TT (Positions T1 and T2) in Figure 7.8.

Table 7.4. Calculated t50 bypass of T1, T2, and total lesion bypass for Y-family DNA polymerases.

Enzyme    Total Error Frequency (Control)a    Total Error Frequency (cis-syn TT)a
hPol      5.6 × 10⁻²                          1.5 × 10⁻¹
hPol      8.4 × 10⁻²                          --
hPol      1.4 × 10⁻¹                          1.7 × 10⁻¹
Dpo4      2 × 10⁻³                            6.1 × 10⁻²

a Total error frequency = total number of mutations / number of bases sequenced.

Table 7.5. The total error rate of Y-family DNA polymerases using S-4a 17/77-mer and S-4b 17/77-mer-CPD.

Enzyme    G1 bypass t50 (s)a    G2 bypass t50 (s)b    Total bypass t50 (s)c
hPol      1.9                   1.9                   17.3
hPol      1251                  10.7                  2167
Dpo4      3775                  23.9                  4204

a Calculated as the time required to bypass 50% of the 3′-G of cisplatin-dGpG (Position G1) in Figure 7.13.
b Calculated as the time required to bypass 50% of the 5′-G of cisplatin-dGpG (Position G2) in Figure 7.13.
c Calculated as the time required to bypass 50% of both the 3′-G and 5′-G of cisplatin-dGpG (Positions G1 and G2) in Figure 7.13.

Table 7.6. Calculated t50 bypass of G1, G2, and total lesion bypass for Y-family DNA polymerases.

Enzyme    Total Error Frequency (control)a    Total Error Frequency (cisplatin-dGpG)a
hPol      1.9 × 10⁻²                          6.0 × 10⁻²
Dpo4      6.3 × 10⁻³                          3.4 × 10⁻²

a Total error frequency = total number of mutations / number of bases sequenced.

Table 7.7. The total error rate of Y-family DNA polymerases using S-6a 15/69-mer and S-6b 15/69-mer-DDP.

Oligomer          Kd, DNA (nM)    Affinity Ratioa
D-1 21/41-mer     46 ± 1.0        1
D-8 21/41-mer     38 ± 3.3        0.8
BE2 16/16-mer     204 ± 13        4.4

a Calculated as (Kd, DNA)/(Kd, DNA)D-1.

Table 7.8. DNA binding affinity of hPolε for different DNA substrates at room temperature.

dNTP     Kd, dNTP (µM)    kp (s⁻¹)              kp/Kd, dNTP (µM⁻¹ s⁻¹)    Fidelitya

Template dA (D-1)
dTTP     258 ± 36         48 ± 2                1.9 × 10⁻¹                -
dATP     165 ± 24         2.0 ± 0.1             1.2 × 10⁻²                5.9 × 10⁻²
dCTP     48 ± 4           (18 ± 0.3) × 10⁻²     3.8 × 10⁻³                2.0 × 10⁻²
dGTP     81 ± 9           (3.9 ± 0.1) × 10⁻¹    4.8 × 10⁻³                2.5 × 10⁻²

Template dG (D-6)
dCTP     91 ± 19          47 ± 3                5.2 × 10⁻¹                -
dATP     268 ± 58         1.1 ± 0.1             4.3 × 10⁻³                8.2 × 10⁻³
dGTP     35 ± 5           (8.0 ± 0.3) × 10⁻¹    2.3 × 10⁻²                4.2 × 10⁻²
dTTP     520 ± 71         7.2 ± 0.4             1.4 × 10⁻²                2.6 × 10⁻²

Template dT (D-7)
dATP     223 ± 27         43 ± 2                1.9 × 10⁻¹                -
dCTP     210 ± 47         (8.3 ± 0.7) × 10⁻¹    4.0 × 10⁻³                2.1 × 10⁻²
dGTP     42 ± 2           1.4 ± 0.1             3.4 × 10⁻²                1.5 × 10⁻¹
dTTP     344 ± 37         1.5 ± 0.1             4.2 × 10⁻³                2.2 × 10⁻²

Template dC (D-8)
dGTP     117 ± 20         74 ± 4                6.4 × 10⁻¹                -
dATP     201 ± 36         1.6 ± 0.1             7.9 × 10⁻³                1.2 × 10⁻²
dCTP     209 ± 35         (5.8 ± 0.4) × 10⁻¹    2.8 × 10⁻³                4.4 × 10⁻³
dTTP     412 ± 97         1.1 ± 0.1             2.6 × 10⁻³                4.0 × 10⁻³

a Calculated as (kp/Kd, dNTP)incorrect/[(kp/Kd, dNTP)correct + (kp/Kd, dNTP)incorrect].

Table 7.9. Pre-steady state kinetic parameters for single dNTP incorporations into 21/41-mers catalyzed by hPolε at 37 °C.
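The fidelity formula in the footnote of Table 7.9 can be checked with a brief Python sketch. The function names `efficiency` and `fidelity` are hypothetical. The example uses the rounded incorporation efficiencies listed for template dA (D-1), so small differences from values computed with unrounded kp and Kd are expected.

```python
def efficiency(kp: float, kd_uM: float) -> float:
    """Incorporation efficiency kp/Kd (uM^-1 s^-1)."""
    return kp / kd_uM

def fidelity(eff_incorrect: float, eff_correct: float) -> float:
    """Fidelity = eff_incorrect / (eff_correct + eff_incorrect)."""
    return eff_incorrect / (eff_correct + eff_incorrect)

# Template dA (D-1): correct dTTP versus incorrect dATP, efficiencies as tabulated
dttp_eff = 1.9e-1
datp_eff = 1.2e-2
print(f"fidelity (dATP opposite dA) = {fidelity(datp_eff, dttp_eff):.1e}")

# The efficiency column itself follows from kp and Kd (dTTP: 48 s^-1, 258 uM)
print(f"kp/Kd for dTTP = {efficiency(48.0, 258.0):.2f} uM^-1 s^-1")
```

A smaller fidelity value corresponds to a more accurate polymerase, because the incorrect nucleotide contributes a smaller share of the total incorporation efficiency.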

dNTP     Kd, dNTP (µM)    kp (s⁻¹)              kp/Kd, dNTP (µM⁻¹ s⁻¹)    Efficiency Ratioa
dATP     247 ± 28         (6.6 ± 0.2) × 10⁻¹    2.7 × 10⁻³                1.0
dPTP     15 ± 2           2.7 ± 0.1             1.8 × 10⁻¹                66.7

a Calculated as (kp/Kd, dNTP)dNTP/(kp/Kd, dNTP)dATP.

Table 7.10. Pre-steady state kinetic parameters for single dNTP incorporations into blunt-end DNA substrate BE2 catalyzed by hPolε at 37 °C.

DNA Substrate    A1 (nM)                        k1 (s⁻¹)       A2 (nM)                        k2 (s⁻¹)
D-8 21/41-mer    8.3 ± 0.5 (41%)a               72.7 ± 15.3    6.5 ± 0.5 (32%)a               1.2 ± 0.3
BE2              (3.2 ± 0.1) × 10⁻¹ (1.6%)a     40.3 ± 8.1     (7.7 ± 0.5) × 10⁻¹ (3.9%)a     (2.3 ± 0.4) × 10⁻³

a Calculated as (reaction amplitude/20 nM) × 100.

Table 7.11. Biphasic kinetic parameters of single dNTP incorporations into different DNA substrates catalyzed by hPolε.
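The percentages in parentheses in Table 7.11 follow from the footnote formula, which a few lines of Python can reproduce. The function name `amplitude_percent` is hypothetical; the 20 nM total is the DNA concentration in the footnote.

```python
def amplitude_percent(amplitude_nM: float, total_dna_nM: float = 20.0) -> float:
    """Reaction amplitude as a percentage of the total DNA (20 nM by default)."""
    return amplitude_nM / total_dna_nM * 100.0

# Amplitudes (nM) from Table 7.11
for label, amplitude in [("D-8 fast", 8.3), ("D-8 slow", 6.5),
                         ("BE2 fast", 0.32), ("BE2 slow", 0.77)]:
    print(f"{label}: {amplitude_percent(amplitude):.2f}%")
```

The unrounded results (41.5%, 32.5%, 1.6%, and 3.85%) agree with the tabulated 41%, 32%, 1.6%, and 3.9% to within rounding.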

Kinetic Parameter                 Value
Kd, DNA (nM)a                     38 - 46
kon, DNA (µM⁻¹ s⁻¹)               0.059
koff, DNA (s⁻¹)                   0.0027
k*off, DNA (s⁻¹)b                 2.4
Kd, incorrect dNTP (µM)           35 - 520
Kd, correct dNTP (µM)             91 - 258
kon, dNTP (µM⁻¹ s⁻¹)              100
koff, incorrect dNTP (s⁻¹)        ≤ 52,000
koff, correct dNTP (s⁻¹)          ≤ 25,800
kp, incorrect (s⁻¹)               0.18 - 7.2
kp, correct (s⁻¹)                 39 - 74

a Value for staggered primer/template DNA substrate.
b Value from [201].

Table 7.12. Estimated kinetic parameters of hPolε for normal DNA synthesis.
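The dissociation rate constants in Table 7.12 are numerically consistent with the one-step binding relation koff = Kd × kon. The Python sketch below assumes that relation (this section does not state how the estimates were derived), and the function name `k_off` is hypothetical.

```python
def k_off(kd_uM: float, k_on_per_uM_s: float) -> float:
    """Dissociation rate constant (s^-1), assuming one-step binding: koff = Kd * kon."""
    return kd_uM * k_on_per_uM_s

# DNA: upper Kd of 46 nM (0.046 uM) and kon of 0.059 uM^-1 s^-1
print(f"koff, DNA ~ {k_off(0.046, 0.059):.4f} s^-1")

# dNTP: largest tabulated Kd values with kon of 100 uM^-1 s^-1
print(f"koff, incorrect dNTP <= {k_off(520.0, 100.0):,.0f} s^-1")
print(f"koff, correct dNTP   <= {k_off(258.0, 100.0):,.0f} s^-1")
```

Using the largest Kd in each range gives an upper bound, which matches the "≤" entries in the table.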

Dpo4                         Kd (nM)           Binding Attenuationa
wt                           9.0 ± 1.8         1
E100A                        11.1 ± 4.1        1.2
K148A                        10.9 ± 3.6        1.2
E100A/K148A                  7.8 ± 2.1         ~1
R240A                        25.1 ± 2.4        2.8
E235A/R240A                  26.2 ± 5.0        2.9
E100A/E235A/R240A            26.3 ± 5.4        2.9
K148A/E235A/R240A            31.3 ± 7.9        3.5
K148A/E100A/E235A/R240A      33.4 ± 6.2        3.7
R/K-to-A linker              192.8 ± 47.1      21
R/K-to-D linker              953.1 ± 124.0     106
All-Gly linker               2233 ± 600        248

a Calculated as (Kd)mutant/(Kd)wt.

Table 7.13. Linker region sequences and DNA-binding parameters for various Dpo4 mutants at 25 °C.
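The binding attenuation column of Table 7.13 is the ratio of each mutant Kd to the wild-type Kd, which can be recomputed with a short Python sketch. The variable names are hypothetical; the Kd values are taken from the table for a subset of the mutants.

```python
KD_WT_NM = 9.0  # wt Dpo4 Kd (nM) from Table 7.13

# Mutant Kd values (nM) from Table 7.13
mutant_kd_nM = {
    "R240A": 25.1,
    "R/K-to-A linker": 192.8,
    "R/K-to-D linker": 953.1,
    "All-Gly linker": 2233.0,
}

# Binding attenuation = (Kd)mutant / (Kd)wt
attenuation = {name: kd / KD_WT_NM for name, kd in mutant_kd_nM.items()}

for name, ratio in attenuation.items():
    print(f"{name}: {ratio:.1f}-fold weaker DNA binding than wt")
```

The linker mutants stand out in this calculation, with attenuation values one to two orders of magnitude larger than the point mutants.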

dNTP     kp (s⁻¹)              Kd, dNTP (µM)    kp/Kd (µM⁻¹ s⁻¹)    Efficiency Ratioa    Fidelityb

wt Dpo4c
dTTP     9.4 ± 0.3             230 ± 17         4.1 × 10⁻²          -                    -
dATP     (6 ± 1) × 10⁻³        578 ± 188        9.9 × 10⁻⁶          -                    2.4 × 10⁻⁴

E100A/K148A Dpo4
dTTP     -                     -                -                   -                    -
dATP     (1.1 ± 0.1) × 10⁻²    408 ± 102        2.7 × 10⁻⁵          0.37

K148A/E235A/R240A Dpo4
dTTP     -                     -                -                   -                    -
dATP     (1.5 ± 0.2) × 10⁻²    587 ± 132        2.6 × 10⁻⁵          0.38

K148A/E100A/E235A/R240A Dpo4
dTTP     -                     -                -                   -                    -
dATP     (1.4 ± 0.1) × 10⁻²    500 ± 113        2.8 × 10⁻⁵          0.35                 -

R/K-to-A linker Dpo4
dTTP     -                     -                -                   -                    -
dATP     (9.2 ± 3.8) × 10⁻³    1138 ± 672.2     8.10 × 10⁻⁶         1.22

R/K-to-D linker Dpo4
dTTP     (1.3 ± 0.1) × 10⁻³    257 ± 71.1       4.99 × 10⁻⁶         8333                 -
dATP     ND                    ND               ND                  ND                   ND

All-Gly Linker Dpo4
dTTP     (8 ± 1) × 10⁻²        1198 ± 383       6.7 × 10⁻⁵          610                  -
dATP     (2.2 ± 0.2) × 10⁻⁴    448 ± 108        4.9 × 10⁻⁷          20                   7.3 × 10⁻³

a Calculated as (kp/Kd, dNTP)wt/(kp/Kd, dNTP)Mutant.
b Calculated as (kp/Kd, dNTP)incorrect/[(kp/Kd, dNTP)incorrect + (kp/Kd, dNTP)correct].
c The kp and Kd, dNTP values for wt Dpo4 are from previous work [27].

Table 7.14. Kinetic parameters for nucleotide incorporation into DNA (D-1) catalyzed by Dpo4 at 37 °C.

7.9 Schemes

Scheme 7.1. SOSA scheme for dGAP bypass analysis.

Scheme 7.2. SOSA scheme for cis-syn TT bypass analysis.

Scheme 7.3. SOSA scheme for cisplatin-dGpG bypass analysis.

Scheme 7.4. General kinetic mechanism of DNA synthesis catalyzed by hPolε.

Scheme 7.5. Folding and unfolding of Dpo4.

Scheme 7.6. Proposed two-polymerase lesion bypass pathway.

Scheme 7.7. Modified kinetic mechanism for DNA synthesis catalyzed by hPolε.

References

1. Steitz, T.A. (1998) A mechanism for all polymerases. Nature 391(6664), 231-2. 2. Xu, C., Maxwell, B.A., Brown, J.A., Zhang, L., and Suo, Z. (2009) Global conformational dynamics of a Y-family DNA polymerase during catalysis. PLoS Biol. 7(10), e1000225. 3. Wong, J.H., Fiala, K.A., Suo, Z., and Ling, H. (2008) Snapshots of a Y-family DNA polymerase in replication: substrate-induced conformational transitions and implications for fidelity of Dpo4. J. Mol. Biol. 379(2), 317-30. 4. Kunkel, T.A. (2004) DNA replication fidelity. J. Biol. Chem. 279(17), 16895-8. 5. McCulloch, S.D. and Kunkel, T.A. (2008) The fidelity of DNA synthesis by eukaryotic replicative and translesion synthesis polymerases. Cell Res. 18(1), 148- 61. 6. Johnson, R.E., Trincao, J., Aggarwal, A.K., Prakash, S., and Prakash, L. (2003) Deoxynucleotide triphosphate binding mode conserved in Y family DNA polymerases. Mol. Cell. Biol. 23(8), 3008-12. 7. Broyde, S., Wang, L., Rechkoblit, O., Geacintov, N.E., and Patel, D.J. (2008) Lesion processing: high-fidelity versus lesion-bypass DNA polymerases. Trends Biochem. Sci. 33(5), 209-19. 8. Fowler, J.D. and Suo, Z. (2006) Biochemical, structural, and physiological characterization of terminal deoxynucleotidyl transferase. Chem. Rev. 106(6), 2092-110. 9. Johnson, A.A., Tsai, Y., Graves, S.W., and Johnson, K.A. (2000) Human mitochondrial DNA polymerase holoenzyme: reconstitution and characterization. Biochemistry 39(7), 1702-8.

351

10. Graves, S.W., Johnson, A.A., and Johnson, K.A. (1998) Expression, purification, and initial kinetic characterization of the large subunit of the human mitochondrial DNA polymerase. Biochemistry 37(17), 6050-8. 11. Fowler, J.D., Brown, J.A., Johnson, K.A., and Suo, Z. (2008) Kinetic investigation of the inhibitory effect of gemcitabine on DNA polymerization catalyzed by human mitochondrial DNA polymerase. J. Biol. Chem. 283(22), 15339-48. 12. Filee, J., Forterre, P., Sen-Lin, T., and Laurent, J. (2002) Evolution of DNA polymerase families: evidences for multiple gene exchange between cellular and viral proteins. J. Mol. Evol. 54(6), 763-73. 13. Lehman, I.R. and Kaguni, L.S. (1989) DNA polymerase alpha. J. Biol. Chem. 264(8), 4265-8. 14. Nelson, J.R., Lawrence, C.W., and Hinkle, D.C. (1996) Thymine-thymine dimer bypass by yeast DNA polymerase zeta. Science 272(5268), 1646-9. 15. Guo, D., Wu, X., Rajpal, D.K., Taylor, J.S., and Wang, Z. (2001) Translesion synthesis by yeast DNA polymerase zeta from templates containing lesions of ultraviolet radiation and acetylaminofluorene. Nucleic Acids Res. 29(13), 2875- 83. 16. Burgers, P.M., Koonin, E.V., Bruford, E., Blanco, L., Burtis, K.C., Christman, M.F., Copeland, W.C., Friedberg, E.C., Hanaoka, F., Hinkle, D.C., Lawrence, C.W., Nakanishi, M., Ohmori, H., Prakash, L., Prakash, S., Reynaud, C.A., Sugino, A., Todo, T., Wang, Z., Weill, J.C., and Woodgate, R. (2001) Eukaryotic DNA polymerases: proposal for a revised nomenclature. J. Biol. Chem. 276(47), 43487-90. 17. Johnson, R.E., Washington, M.T., Haracska, L., Prakash, S., and Prakash, L. (2000) Eukaryotic polymerases iota and zeta act sequentially to bypass DNA lesions. Nature 406(6799), 1015-9.


18. Haracska, L., Unk, I., Johnson, R.E., Johansson, E., Burgers, P.M., Prakash, S., and Prakash, L. (2001) Roles of yeast DNA polymerases delta and zeta and of Rev1 in the bypass of abasic sites. Genes Dev. 15(8), 945-54.
19. Haracska, L., Prakash, S., and Prakash, L. (2003) Yeast DNA polymerase zeta is an efficient extender of primer ends opposite from 7,8-dihydro-8-Oxoguanine and O6-methylguanine. Mol. Cell. Biol. 23(4), 1453-9.
20. Rattray, A.J. and Strathern, J.N. (2003) Error-prone DNA polymerases: when making a mistake is the only way to get ahead. Annu. Rev. Genet. 37, 31-66.
21. Cann, I.K. and Ishino, Y. (1999) Archaeal DNA replication: identifying the pieces to solve a puzzle. Genetics 152(4), 1249-67.
22. Dianov, G.L., Prasad, R., Wilson, S.H., and Bohr, V.A. (1999) Role of DNA polymerase beta in the excision step of long patch mammalian base excision repair. J. Biol. Chem. 274(20), 13741-3.
23. Beard, W.A. and Wilson, S.H. (2000) Structural design of a eukaryotic DNA repair polymerase: DNA polymerase beta. Mutat. Res. 460(3-4), 231-44.
24. Fowler, J.D., Brown, J.A., Kvaratskhelia, M., and Suo, Z. (2009) Probing conformational changes of human DNA polymerase lambda using mass spectrometry-based protein footprinting. J. Mol. Biol. 390(3), 368-79.
25. Brown, J.A., Fiala, K.A., Fowler, J.D., Sherrer, S.M., Newmister, S.A., Duym, W.W., and Suo, Z. (2010) A Novel Mechanism of Sugar Selection Utilized by a Human X-Family DNA Polymerase. J. Mol. Biol. 395, 282-290.
26. Brown, J.A., Pack, L.R., Sherrer, S.M., Kshetry, A.K., Newmister, S.A., Fowler, J.D., Taylor, J.S., and Suo, Z. (2010) Identification of critical residues for the tight binding of both correct and incorrect nucleotides to human DNA polymerase lambda. J. Mol. Biol. 403(4), 505-15.
27. Fiala, K.A. and Suo, Z. (2004) Pre-Steady-State Kinetic Studies of the Fidelity of Sulfolobus solfataricus P2 DNA Polymerase IV. Biochemistry 43(7), 2106-15.


28. Kunkel, T.A., Pavlov, Y.I., and Bebenek, K. (2003) Functions of human DNA polymerases eta, kappa and iota suggested by their properties, including fidelity with undamaged DNA templates. DNA Repair (Amst) 2(2), 135-49.
29. Washington, M.T., Johnson, R.E., Prakash, S., and Prakash, L. (1999) Fidelity and processivity of Saccharomyces cerevisiae DNA polymerase eta. J. Biol. Chem. 274(52), 36835-8.
30. Tissier, A., McDonald, J.P., Frank, E.G., and Woodgate, R. (2000) poliota, a remarkably error-prone human DNA polymerase. Genes Dev. 14(13), 1642-50.
31. Gerlach, V.L., Feaver, W.J., Fischhaber, P.L., and Friedberg, E.C. (2001) Purification and characterization of pol kappa, a DNA polymerase encoded by the human DINB1 gene. J. Biol. Chem. 276(1), 92-8.
32. Zhang, Y., Wu, X., Rechkoblit, O., Geacintov, N.E., Taylor, J.S., and Wang, Z. (2002) Response of human REV1 to different DNA damage: preferential dCMP insertion opposite the lesion. Nucleic Acids Res. 30(7), 1630-8.
33. Ohashi, E., Bebenek, K., Matsuda, T., Feaver, W.J., Gerlach, V.L., Friedberg, E.C., Ohmori, H., and Kunkel, T.A. (2000) Fidelity and processivity of DNA synthesis by DNA polymerase kappa, the product of the human DINB1 gene. J. Biol. Chem. 275(50), 39678-84.
34. Masutani, C., Kusumoto, R., Yamada, A., Dohmae, N., Yokoi, M., Yuasa, M., Araki, M., Iwai, S., Takio, K., and Hanaoka, F. (1999) The XPV (xeroderma pigmentosum variant) gene encodes human DNA polymerase eta. Nature 399(6737), 700-4.
35. Levine, R.L., Miller, H., Grollman, A., Ohashi, E., Ohmori, H., Masutani, C., Hanaoka, F., and Moriya, M. (2001) Translesion DNA synthesis catalyzed by human pol eta and pol kappa across 1,N6-ethenodeoxyadenosine. J. Biol. Chem. 276(22), 18717-21.
36. Haracska, L., Yu, S.L., Johnson, R.E., Prakash, L., and Prakash, S. (2000) Efficient and accurate replication in the presence of 7,8-dihydro-8-oxoguanine by DNA polymerase eta. Nat. Genet. 25(4), 458-61.


37. Yang, W. (2005) Portraits of a Y-family DNA polymerase. FEBS Lett. 579(4), 868-72.
38. She, Q., Singh, R.K., Confalonieri, F., Zivanovic, Y., Allard, G., Awayez, M.J., Chan-Weiher, C.C., Clausen, I.G., Curtis, B.A., De Moors, A., Erauso, G., Fletcher, C., Gordon, P.M., Heikamp-de Jong, I., Jeffries, A.C., Kozera, C.J., Medina, N., Peng, X., Thi-Ngoc, H.P., Redder, P., Schenk, M.E., Theriault, C., Tolstrup, N., Charlebois, R.L., Doolittle, W.F., Duguet, M., Gaasterland, T., Garrett, R.A., Ragan, M.A., Sensen, C.W., and Van der Oost, J. (2001) The complete genome of the crenarchaeon Sulfolobus solfataricus P2. Proc. Natl. Acad. Sci. U. S. A. 98(14), 7835-40.
39. Gruz, P., Shimizu, M., Pisani, F.M., De Felice, M., Kanke, Y., and Nohmi, T. (2003) Processing of DNA lesions by archaeal DNA polymerases from Sulfolobus solfataricus. Nucleic Acids Res. 31(14), 4024-30.
40. Bauer, J., Xing, G., Yagi, H., Sayer, J.M., Jerina, D.M., and Ling, H. (2007) A structural gap in Dpo4 supports mutagenic bypass of a major benzo[a]pyrene dG adduct in DNA through template misalignment. Proc. Natl. Acad. Sci. U. S. A. 104(38), 14905-10.
41. Boudsocq, F., Iwai, S., Hanaoka, F., and Woodgate, R. (2001) Sulfolobus solfataricus P2 DNA polymerase IV (Dpo4): an archaeal DinB-like DNA polymerase with lesion-bypass properties akin to eukaryotic poleta. Nucleic Acids Res. 29(22), 4607-16.
42. Eoff, R.L., Angel, K.C., Egli, M., and Guengerich, F.P. (2007) Molecular basis of selectivity of nucleoside triphosphate incorporation opposite O6-benzylguanine by Sulfolobus solfataricus DNA polymerase Dpo4: steady-state and pre-steady-state kinetics and x-ray crystallography of correct and incorrect pairing. J. Biol. Chem. 282(18), 13573-84.
43. Eoff, R.L., Irimia, A., Egli, M., and Guengerich, F.P. (2007) Sulfolobus solfataricus DNA polymerase Dpo4 is partially inhibited by "wobble" pairing between O6-methylguanine and cytosine, but accurate bypass is preferred. J. Biol. Chem. 282(2), 1456-67.
44. Johnson, R.E., Prakash, L., and Prakash, S. (2005) Distinct mechanisms of cis-syn thymine dimer bypass by Dpo4 and DNA polymerase eta. Proc. Natl. Acad. Sci. U. S. A. 102(35), 12359-64.
45. Kirouac, K.N., Suo, Z., and Ling, H. (2011) Structural mechanism of ribonucleotide discrimination by a Y-family DNA polymerase. J. Mol. Biol. 407(3), 382-90.
46. Kokoska, R.J., McCulloch, S.D., and Kunkel, T.A. (2003) The efficiency and specificity of apurinic/apyrimidinic site bypass by human DNA polymerase eta and Sulfolobus solfataricus Dpo4. J. Biol. Chem. 278(50), 50537-45.
47. Ma, D., Fowler, J.D., and Suo, Z. (2011) Backbone assignment of the little finger domain of a Y-family DNA polymerase. Biomol. NMR Assign.
48. Ma, D., Fowler, J.D., Yuan, C., and Suo, Z. (2010) Backbone assignment of the catalytic core of a Y-family DNA polymerase. Biomol. NMR Assign. 4(2), 207-9.
49. Mizukami, S., Kim, T.W., Helquist, S.A., and Kool, E.T. (2006) Varying DNA base-pair size in subangstrom increments: evidence for a loose, not large, active site in low-fidelity Dpo4 polymerase. Biochemistry 45(9), 2772-8.
50. Rechkoblit, O., Malinina, L., Cheng, Y., Kuryavyi, V., Broyde, S., Geacintov, N.E., and Patel, D.J. (2006) Stepwise translocation of Dpo4 polymerase during error-free bypass of an oxoG lesion. PLoS Biol. 4(1), e11.
51. Sherrer, S.M., Beyer, D.C., Xia, C.X., Fowler, J.D., and Suo, Z. (2010) Kinetic basis of sugar selection by a Y-family DNA polymerase from Sulfolobus solfataricus P2. Biochemistry 49(47), 10179-86.
52. Vaisman, A., Ling, H., Woodgate, R., and Yang, W. (2005) Fidelity of Dpo4: effect of metal ions, nucleotide selection and pyrophosphorolysis. EMBO J. 24, 2957-2967.
53. Wang, L. and Broyde, S. (2006) A new anti conformation for N-(deoxyguanosin-8-yl)-2-acetylaminofluorene (AAF-dG) allows Watson-Crick pairing in the Sulfolobus solfataricus P2 DNA polymerase IV (Dpo4). Nucleic Acids Res. 34(3), 785-95.
54. Wong, J.H., Brown, J.A., Suo, Z., Blum, P., Nohmi, T., and Ling, H. (2010) Structural insight into dynamic bypass of the major cisplatin-DNA adduct by Y-family polymerase Dpo4. EMBO J. 29(12), 2059-69.
55. Zang, H., Chowdhury, G., Angel, K.C., Harris, T.M., and Guengerich, F.P. (2006) Translesion synthesis across polycyclic aromatic hydrocarbon diol epoxide adducts of deoxyadenosine by Sulfolobus solfataricus DNA polymerase Dpo4. Chem. Res. Toxicol. 19(6), 859-67.
56. Zang, H., Goodenough, A.K., Choi, J.Y., Irimia, A., Loukachevitch, L.V., Kozekov, I.D., Angel, K.C., Rizzo, C.J., Egli, M., and Guengerich, F.P. (2005) DNA adduct bypass polymerization by Sulfolobus solfataricus DNA polymerase Dpo4: analysis and crystal structures of multiple base pair substitution and frameshift products with the adduct 1,N2-ethenoguanine. J. Biol. Chem. 280(33), 29750-64.
57. Sherrer, S.M., Brown, J.A., Pack, L.R., Jasti, V.P., Fowler, J.D., Basu, A.K., and Suo, Z. (2009) Mechanistic studies of the bypass of a bulky single-base lesion catalyzed by a Y-family DNA polymerase. J. Biol. Chem. 284(10), 6379-88.
58. Fiala, K.A. and Suo, Z. (2007) Sloppy bypass of an abasic lesion catalyzed by a Y-family DNA polymerase. J. Biol. Chem. 282(11), 8199-206.
59. Fiala, K.A., Hypes, C.D., and Suo, Z. (2007) Mechanism of abasic lesion bypass catalyzed by a Y-family DNA polymerase. J. Biol. Chem. 282(11), 8188-98.
60. Ling, H., Boudsocq, F., Woodgate, R., and Yang, W. (2001) Crystal structure of a Y-family DNA polymerase in action: a mechanism for error-prone and lesion-bypass replication. Cell 107(1), 91-102.
61. Yang, W. and Woodgate, R. (2007) What a difference a decade makes: insights into translesion DNA synthesis. Proc. Natl. Acad. Sci. U. S. A. 104(40), 15591-8.


62. Masutani, C., Kusumoto, R., Iwai, S., and Hanaoka, F. (2000) Mechanisms of accurate translesion synthesis by human DNA polymerase eta. EMBO J. 19(12), 3100-9.
63. McDonald, J.P., Tissier, A., Frank, E.G., Iwai, S., Hanaoka, F., and Woodgate, R. (2001) DNA polymerase iota and related rad30-like enzymes. Philos. Trans. R. Soc. Lond. B Biol. Sci. 356(1405), 53-60.
64. Ohashi, E., Ogi, T., Kusumoto, R., Iwai, S., Masutani, C., Hanaoka, F., and Ohmori, H. (2000) Error-prone bypass of certain DNA lesions by the human DNA polymerase kappa. Genes Dev. 14(13), 1589-94.
65. Fiala, K.A. and Suo, Z. (2004) Mechanism of DNA Polymerization Catalyzed by Sulfolobus solfataricus P2 DNA Polymerase IV. Biochemistry 43(7), 2116-25.
66. Fiala, K.A., Sherrer, S.M., Brown, J.A., and Suo, Z. (2008) Mechanistic consequences of temperature on DNA polymerization catalyzed by a Y-family DNA polymerase. Nucleic Acids Res. 36(6), 1990-2001.
67. Yamada, A., Masutani, C., Iwai, S., and Hanaoka, F. (2000) Complementation of defective translesion synthesis and UV light sensitivity in xeroderma pigmentosum variant cells by human and mouse DNA polymerase eta. Nucleic Acids Res. 28(13), 2473-80.
68. Masutani, C., Araki, M., Yamada, A., Kusumoto, R., Nogimori, T., Maekawa, T., Iwai, S., and Hanaoka, F. (1999) Xeroderma pigmentosum variant (XP-V) correcting protein from HeLa cells has a thymine dimer bypass DNA polymerase activity. EMBO J. 18(12), 3491-501.
69. Yuasa, M., Masutani, C., Eki, T., and Hanaoka, F. (2000) Genomic structure, chromosomal localization and identification of mutations in the xeroderma pigmentosum variant (XPV) gene. Oncogene 19(41), 4721-8.
70. Taylor, J.S. (1994) Unraveling the Molecular Pathway from Sunlight to Skin Cancer. Acc. Chem. Res. 27(3), 76-82.
71. Hendel, A., Ziv, O., Gueranger, Q., Geacintov, N., and Livneh, Z. (2008) Reduced efficiency and increased mutagenicity of translesion DNA synthesis across a TT cyclobutane pyrimidine dimer, but not a TT 6-4 photoproduct, in human cells lacking DNA polymerase eta. DNA Repair (Amst) 7(10), 1636-46.
72. Washington, M.T., Johnson, R.E., Prakash, L., and Prakash, S. (2001) Accuracy of lesion bypass by yeast and human DNA polymerase eta. Proc. Natl. Acad. Sci. U. S. A. 98(15), 8355-60.
73. Yu, S.L., Johnson, R.E., Prakash, S., and Prakash, L. (2001) Requirement of DNA polymerase eta for error-free bypass of UV-induced CC and TC photoproducts. Mol. Cell. Biol. 21(1), 185-8.
74. Friedberg, E.C., Wagner, R., and Radman, M. (2002) Specialized DNA polymerases, cellular survival, and the genesis of mutations. Science 296(5573), 1627-30.
75. Lehmann, A.R. (2002) Replication of damaged DNA in mammalian cells: new solutions to an old problem. Mutat. Res. 509(1-2), 23-34.
76. Johnson, R.E., Washington, M.T., Prakash, S., and Prakash, L. (2000) Fidelity of human DNA polymerase eta. J. Biol. Chem. 275(11), 7447-50.
77. Biertumpfel, C., Zhao, Y., Kondo, Y., Ramon-Maiques, S., Gregory, M., Lee, J.Y., Masutani, C., Lehmann, A.R., Hanaoka, F., and Yang, W. (2010) Structure and mechanism of human DNA polymerase eta. Nature 465(7301), 1044-8.
78. Bassett, E., Vaisman, A., Havener, J.M., Masutani, C., Hanaoka, F., and Chaney, S.G. (2003) Efficiency of extension of mismatched primer termini across from cisplatin and oxaliplatin adducts by human DNA polymerases beta and eta in vitro. Biochemistry 42(48), 14197-206.
79. Bassett, E., King, N.M., Bryant, M.F., Hector, S., Pendyala, L., Chaney, S.G., and Cordeiro-Stone, M. (2004) The role of DNA polymerase eta in translesion synthesis past platinum-DNA adducts in human fibroblasts. Cancer Res. 64(18), 6469-75.
80. Albertella, M.R., Green, C.M., Lehmann, A.R., and O'Connor, M.J. (2005) A role for polymerase eta in the cellular tolerance to cisplatin-induced damage. Cancer Res. 65(21), 9799-806.


81. Alt, A., Lammens, K., Chiocchini, C., Lammens, A., Pieck, J.C., Kuch, D., Hopfner, K.P., and Carell, T. (2007) Bypass of DNA lesions generated during anticancer treatment with cisplatin by DNA polymerase eta. Science 318(5852), 967-70.
82. Zhang, Y., Yuan, F., Wu, X., Rechkoblit, O., Taylor, J.S., Geacintov, N.E., and Wang, Z. (2000) Error-prone lesion bypass by human DNA polymerase eta. Nucleic Acids Res. 28(23), 4717-24.
83. Haracska, L., Prakash, S., and Prakash, L. (2000) Replication past O(6)-methylguanine by yeast and human DNA polymerase eta. Mol. Cell. Biol. 20(21), 8001-7.
84. Tissier, A., Frank, E.G., McDonald, J.P., Iwai, S., Hanaoka, F., and Woodgate, R. (2000) Misinsertion and bypass of thymine-thymine dimers by human DNA polymerase iota. EMBO J. 19(19), 5259-66.
85. Zhang, Y., Yuan, F., Wu, X., Taylor, J.S., and Wang, Z. (2001) Response of human DNA polymerase iota to DNA lesions. Nucleic Acids Res. 29(4), 928-35.
86. Zhang, Y., Yuan, F., Wu, X., and Wang, Z. (2000) Preferential incorporation of G opposite template T by the low-fidelity human DNA polymerase iota. Mol. Cell. Biol. 20(19), 7099-108.
87. Haracska, L., Johnson, R.E., Unk, I., Phillips, B.B., Hurwitz, J., Prakash, L., and Prakash, S. (2001) Targeting of human DNA polymerase iota to the replication machinery via interaction with PCNA. Proc. Natl. Acad. Sci. U. S. A. 98(25), 14256-61.
88. Nelson, J.R., Lawrence, C.W., and Hinkle, D.C. (1996) Deoxycytidyl transferase activity of yeast REV1 protein. Nature 382(6593), 729-31.
89. Lin, W., Xin, H., Zhang, Y., Wu, X., Yuan, F., and Wang, Z. (1999) The human REV1 gene codes for a DNA template-dependent dCMP transferase. Nucleic Acids Res. 27(22), 4468-75.
90. Masuda, Y., Takahashi, M., Tsunekuni, N., Minami, T., Sumii, M., Miyagawa, K., and Kamiya, K. (2001) Deoxycytidyl transferase activity of the human REV1 protein is closely associated with the conserved polymerase domain. J. Biol. Chem. 276(18), 15051-8.
91. Nair, D.T., Johnson, R.E., Prakash, L., Prakash, S., and Aggarwal, A.K. (2005) Rev1 employs a novel mechanism of DNA synthesis using a protein template. Science 309, 2219-22.
92. Swan, M.K., Johnson, R.E., Prakash, L., Prakash, S., and Aggarwal, A.K. (2009) Structure of the human Rev1-DNA-dNTP ternary complex. J. Mol. Biol. 390(4), 699-709.
93. Guo, C., Fischhaber, P.L., Luk-Paszyc, M.J., Masuda, Y., Zhou, J., Kamiya, K., Kisker, C., and Friedberg, E.C. (2003) Mouse Rev1 protein interacts with multiple DNA polymerases involved in translesion DNA synthesis. EMBO J. 22(24), 6621-30.
94. Ohashi, E., Murakumo, Y., Kanjo, N., Akagi, J., Masutani, C., Hanaoka, F., and Ohmori, H. (2004) Interaction of hREV1 with three human Y-family DNA polymerases. Genes Cells 9(6), 523-31.
95. Tissier, A., Kannouche, P., Reck, M.P., Lehmann, A.R., Fuchs, R.P., and Cordonnier, A. (2004) Co-localization in replication foci and interaction of human Y-family members, DNA polymerase pol eta and REV1 protein. DNA Repair (Amst) 3(11), 1503-14.
96. Guo, C., Sonoda, E., Tang, T.S., Parker, J.L., Bielen, A.B., Takeda, S., Ulrich, H.D., and Friedberg, E.C. (2006) REV1 protein interacts with PCNA: significance of the REV1 BRCT domain in vitro and in vivo. Mol. Cell 23(2), 265-71.
97. Guo, C., Tang, T.S., Bienko, M., Parker, J.L., Bielen, A.B., Sonoda, E., Takeda, S., Ulrich, H.D., Dikic, I., and Friedberg, E.C. (2006) Ubiquitin-binding motifs in REV1 protein are required for its role in the tolerance of DNA damage. Mol. Cell. Biol. 26(23), 8892-900.
98. Ogi, T., Kato, T., Jr., Kato, T., and Ohmori, H. (1999) Mutation enhancement by DINB1, a mammalian homologue of the Escherichia coli mutagenesis protein dinB. Genes Cells 4(11), 607-18.


99. Zhang, Y., Yuan, F., Xin, H., Wu, X., Rajpal, D.K., Yang, D., and Wang, Z. (2000) Human DNA polymerase kappa synthesizes DNA with extraordinarily low fidelity. Nucleic Acids Res. 28(21), 4147-56.
100. Washington, M.T., Johnson, R.E., Prakash, L., and Prakash, S. (2002) Human DINB1-encoded DNA polymerase kappa is a promiscuous extender of mispaired primer termini. Proc. Natl. Acad. Sci. U. S. A. 99(4), 1910-4.
101. Suzuki, N., Itoh, S., Poon, K., Masutani, C., Hanaoka, F., Ohmori, H., Yoshizawa, I., and Shibutani, S. (2004) Translesion synthesis past estrogen-derived DNA adducts by human DNA polymerases eta and kappa. Biochemistry 43(20), 6304-11.
102. Lone, S., Townson, S.A., Uljon, S.N., Johnson, R.E., Brahma, A., Nair, D.T., Prakash, S., Prakash, L., and Aggarwal, A.K. (2007) Human DNA polymerase kappa encircles DNA: implications for mismatch extension and lesion bypass. Mol. Cell 25(4), 601-14.
103. Lindahl, T. and Andersson, A. (1972) Rate of chain breakage at apurinic sites in double-stranded deoxyribonucleic acid. Biochemistry 11, 3618-23.
104. Lindahl, T. and Nyberg, B. (1972) Rate of depurination of native deoxyribonucleic acid. Biochemistry 11, 3610-18.
105. Kokoska, R.J., Bebenek, K., Boudsocq, F., Woodgate, R., and Kunkel, T.A. (2002) Low fidelity DNA synthesis by a Y family DNA polymerase due to misalignment in the active site. J. Biol. Chem. 277(22), 19633-8.
106. Ling, H., Boudsocq, F., Woodgate, R., and Yang, W. (2004) Snapshots of replication through an abasic lesion; structural basis for base substitutions and frameshifts. Mol. Cell 13(5), 751-62.
107. Ling, H., Boudsocq, F., Plosky, B.S., Woodgate, R., and Yang, W. (2003) Replication of a cis-syn thymine dimer at atomic resolution. Nature 424(6952), 1083-7.


108. Brown, J.A., Newmister, S.A., Fiala, K.A., and Suo, Z. (2008) Mechanism of double-base lesion bypass catalyzed by a Y-family DNA polymerase. Nucleic Acids Res. 36(12), 3867-78.
109. Ling, H., Sayer, J.M., Plosky, B.S., Yagi, H., Boudsocq, F., Woodgate, R., Jerina, D.M., and Yang, W. (2004) Crystal structure of a benzo[a]pyrene diol epoxide adduct in a ternary complex with a DNA polymerase. Proc. Natl. Acad. Sci. U. S. A. 101(8), 2265-9.
110. Wong, K.B., Lee, C.F., Chan, S.H., Leung, T.Y., Chen, Y.W., and Bycroft, M. (2003) Solution structure and thermal stability of ribosomal protein L30e from hyperthermophilic archaeon Thermococcus celer. Protein Sci. 12(7), 1483-95.
111. Lavinder, J.J., Hari, S.B., Sullivan, B.J., and Magliery, T.J. (2009) High-throughput thermal scanning: a general, rapid dye-binding thermal shift screen for protein engineering. J. Am. Chem. Soc. 131(11), 3794-5.
112. Ericsson, U.B., Hallberg, B.M., Detitta, G.T., Dekker, N., and Nordlund, P. (2006) Thermofluor-based high-throughput stability optimization of proteins for structural studies. Anal. Biochem. 357(2), 289-98.
113. Greenfield, N. and Fasman, G.D. (1969) Computed circular dichroism spectra for the evaluation of protein conformation. Biochemistry 8(10), 4108-16.
114. Karantzeni, I., Ruiz, C., Liu, C.C., and Licata, V.J. (2003) Comparative thermal denaturation of Thermus aquaticus and Escherichia coli type 1 DNA polymerases. Biochem. J. 374(Pt 3), 785-92.
115. Quintero, D., Velasco, Z., Hurtado-Gomez, E., Neira, J.L., and Contreras, L.M. (2007) Isolation and characterization of a thermostable beta-xylosidase in the thermophilic bacterium Geobacillus pallidus. Biochim. Biophys. Acta 1774(4), 510-8.
116. Broyde, S., Wang, L., Zhang, L., Rechkoblit, O., Geacintov, N.E., and Patel, D.J. (2008) DNA adduct structure-function relationships: comparing solution with polymerase structures. Chem. Res. Toxicol. 21(1), 45-52.


117. Fiala, K.A., Brown, J.A., Ling, H., Kshetry, A.K., Zhang, J., Taylor, J.S., Yang, W., and Suo, Z. (2007) Mechanism of template-independent nucleotide incorporation catalyzed by a template-dependent DNA polymerase. J. Mol. Biol. 365(3), 590-602.
118. Foglia, F., Mandrich, L., Pezzullo, M., Graziano, G., Barone, G., Rossi, M., Manco, G., and Del Vecchio, P. (2007) Role of the N-terminal region for the conformational stability of esterase 2 from Alicyclobacillus acidocaldarius. Biophys. Chem. 127(1-2), 113-22.
119. Wetterau, J.R., Aggerbeck, L.P., Rall, S.C., Jr., and Weisgraber, K.H. (1988) Human apolipoprotein E3 in aqueous solution. I. Evidence for two structural domains. J. Biol. Chem. 263(13), 6240-8.
120. Wang, J. (2005) DNA polymerases: Hoogsteen base-pairing in DNA replication? Nature 437(7057), E6-7; discussion E7.
121. Pantoliano, M.W., Petrella, E.C., Kwasnoski, J.D., Lobanov, V.S., Myslik, J., Graf, E., Carver, T., Asel, E., Springer, B.A., Lane, P., and Salemme, F.R. (2001) High-density miniaturized thermal shift assays as a general strategy for drug discovery. J. Biomol. Screen 6(6), 429-40.
122. Lo, M.C., Aulabaugh, A., Jin, G., Cowling, R., Bard, J., Malamas, M., and Ellestad, G. (2004) Evaluation of fluorescence-based thermal shift assays for hit identification in drug discovery. Anal. Biochem. 332(1), 153-9.
123. Morjana, N.A., McKeone, B.J., and Gilbert, H.F. (1993) Guanidine hydrochloride stabilization of a partially unfolded intermediate during the reversible denaturation of protein disulfide isomerase. Proc. Natl. Acad. Sci. U. S. A. 90(6), 2107-11.
124. Hung, H.C. and Chang, G.G. (1998) Biphasic denaturation of human placental alkaline phosphatase in guanidinium chloride. Proteins 33(1), 49-61.
125. Traut, T.W. (1994) Physiological concentrations of purines and pyrimidines. Mol. Cell. Biochem. 140(1), 1-22.


126. Ferraro, P., Franzolin, E., Pontarin, G., Reichard, P., and Bianchi, V. (2010) Quantitation of cellular deoxynucleoside triphosphates. Nucleic Acids Res. 38(6), e85.
127. Nick McElhinny, S.A., Watts, B.E., Kumar, D., Watt, D.L., Lundstrom, E.B., Burgers, P.M., Johansson, E., Chabes, A., and Kunkel, T.A. (2010) Abundant ribonucleotide incorporation into DNA by yeast replicative polymerases. Proc. Natl. Acad. Sci. U. S. A. 107(11), 4949-54.
128. Joyce, C.M. (1997) Choosing the right sugar: how polymerases select a nucleotide substrate. Proc. Natl. Acad. Sci. U. S. A. 94(5), 1619-22.
129. Astatke, M., Ng, K., Grindley, N.D., and Joyce, C.M. (1998) A single side chain prevents Escherichia coli DNA polymerase I (Klenow fragment) from incorporating ribonucleotides. Proc. Natl. Acad. Sci. U. S. A. 95(7), 3402-7.
130. Patel, P.H. and Loeb, L.A. (2000) Multiple amino acid substitutions allow DNA polymerases to synthesize RNA. J. Biol. Chem. 275(51), 40266-72.
131. Bonnin, A., Lazaro, J.M., Blanco, L., and Salas, M. (1999) A single tyrosine prevents insertion of ribonucleotides in the eukaryotic-type phi29 DNA polymerase. J. Mol. Biol. 290(1), 241-51.
132. DeLucia, A.M., Grindley, N.D., and Joyce, C.M. (2003) An error-prone family Y DNA polymerase (DinB homolog from Sulfolobus solfataricus) uses a 'steric gate' residue for discrimination against ribonucleotides. Nucleic Acids Res. 31(14), 4129-37.
133. Gao, G., Orlova, M., Georgiadis, M.M., Hendrickson, W.A., and Goff, S.P. (1997) Conferring RNA polymerase activity to a DNA polymerase: a single residue in reverse transcriptase controls substrate selection. Proc. Natl. Acad. Sci. U. S. A. 94(2), 407-11.
134. Gardner, A.F., Joyce, C.M., and Jack, W.E. (2004) Comparative kinetics of nucleotide analog incorporation by vent DNA polymerase. J. Biol. Chem. 279(12), 11834-42.


135. Niimi, N., Sassa, A., Katafuchi, A., Gruz, P., Fujimoto, H., Bonala, R.R., Johnson, F., Ohta, T., and Nohmi, T. (2009) The steric gate amino acid tyrosine 112 is required for efficient mismatched-primer extension by human DNA polymerase kappa. Biochemistry 48(20), 4239-46.
136. Yang, G., Franklin, M., Li, J., Lin, T.C., and Konigsberg, W. (2002) A conserved Tyr residue is required for sugar selectivity in a Pol alpha DNA polymerase. Biochemistry 41(32), 10256-61.
137. Cases-Gonzalez, C.E., Gutierrez-Rivas, M., and Menendez-Arias, L. (2000) Coupling ribose selection to fidelity of DNA synthesis. The role of Tyr-115 of human immunodeficiency virus type 1 reverse transcriptase. J. Biol. Chem. 275(26), 19759-67.
138. Ohmori, H., Friedberg, E.C., Fuchs, R.P., Goodman, M.F., Hanaoka, F., Hinkle, D., Kunkel, T.A., Lawrence, C.W., Livneh, Z., Nohmi, T., Prakash, L., Prakash, S., Todo, T., Walker, G.C., Wang, Z., and Woodgate, R. (2001) The Y-family of DNA polymerases. Mol. Cell 8(1), 7-8.
139. Nair, D.T., Johnson, R.E., Prakash, S., Prakash, L., and Aggarwal, A.K. (2004) Replication by human DNA polymerase-iota occurs by Hoogsteen base-pairing. Nature 430(6997), 377-80.
140. Trincao, J., Johnson, R.E., Escalante, C.R., Prakash, S., Prakash, L., and Aggarwal, A.K. (2001) Structure of the catalytic core of S. cerevisiae DNA polymerase eta: implications for translesion DNA synthesis. Mol. Cell 8(2), 417-26.
141. Wilson, R.C. and Pata, J.D. (2008) Structural insights into the generation of single-base deletions by the Y family DNA polymerase dbh. Mol. Cell 29(6), 767-79.
142. Zhou, B.L., Pata, J.D., and Steitz, T.A. (2001) Crystal structure of a DinB lesion bypass DNA polymerase catalytic fragment reveals a classic polymerase catalytic domain. Mol. Cell 8(2), 427-37.


143. Doublie, S., Tabor, S., Long, A.M., Richardson, C.C., and Ellenberger, T. (1998) Crystal structure of a bacteriophage T7 DNA replication complex at 2.2 Å resolution. Nature 391(6664), 251-8.
144. Tsai, Y.C. and Johnson, K.A. (2006) A new paradigm for DNA polymerase specificity. Biochemistry 45(32), 9675-87.
145. Jarosz, D.F., Godoy, V.G., Delaney, J.C., Essigmann, J.M., and Walker, G.C. (2006) A single amino acid governs enhanced activity of DinB DNA polymerases on damaged templates. Nature 439(7073), 225-8.
146. Thoden, J.B., Holden, H.M., and Firestine, S.M. (2008) Structural analysis of the active site geometry of N5-carboxyaminoimidazole ribonucleotide synthetase from Escherichia coli. Biochemistry 47(50), 13346-53.
147. Boule, J.B., Rougeon, F., and Papanicolaou, C. (2001) Terminal deoxynucleotidyl transferase indiscriminately incorporates ribonucleotides and deoxyribonucleotides. J. Biol. Chem. 276(33), 31388-93.
148. Gardner, A.F. and Jack, W.E. (1999) Determinants of nucleotide sugar recognition in an archaeon DNA polymerase. Nucleic Acids Res. 27(12), 2545-53.
149. Liu, S., Goff, S.P., and Gao, G. (2006) Gln(84) of moloney murine leukemia virus reverse transcriptase regulates the incorporation rates of ribonucleotides and deoxyribonucleotides. FEBS Lett. 580(5), 1497-501.
150. Ruiz, J.F., Juarez, R., Garcia-Diaz, M., Terrados, G., Picher, A.J., Gonzalez-Barrera, S., Fernandez de Henestrosa, A.R., and Blanco, L. (2003) Lack of sugar discrimination by human Pol mu requires a single glycine residue. Nucleic Acids Res. 31(15), 4441-9.
151. Huang, H., Chopra, R., Verdine, G.L., and Harrison, S.C. (1998) Structure of a covalently trapped catalytic complex of HIV-1 reverse transcriptase: implications for drug resistance. Science 282(5394), 1669-75.
152. Johnson, S.J., Taylor, J.S., and Beese, L.S. (2003) Processive DNA synthesis observed in a polymerase crystal suggests a mechanism for the prevention of frameshift mutations. Proc. Natl. Acad. Sci. U. S. A. 100(7), 3895-900.


153. Li, Y., Korolev, S., and Waksman, G. (1998) Crystal structures of open and closed forms of binary and ternary complexes of the large fragment of Thermus aquaticus DNA polymerase I: structural basis for nucleotide incorporation. EMBO J. 17(24), 7514-25.
154. Boudsocq, F., Ling, H., Yang, W., and Woodgate, R. (2002) Structure-based interpretation of missense mutations in Y-family DNA polymerases and their implications for polymerase function and lesion bypass. DNA Repair (Amst) 1(5), 343-58.
155. Goodman, M.F. (2002) Error-prone repair DNA polymerases in prokaryotes and eukaryotes. Annu. Rev. Biochem. 71, 17-50.
156. Johnson, R.E., Kondratick, C.M., Prakash, S., and Prakash, L. (1999) hRAD30 mutations in the variant form of xeroderma pigmentosum. Science 285(5425), 263-5.
157. McCulloch, S.D., Kokoska, R.J., Masutani, C., Iwai, S., Hanaoka, F., and Kunkel, T.A. (2004) Preferential cis-syn thymine dimer bypass by DNA polymerase eta occurs with biased fidelity. Nature 428(6978), 97-100.
158. Sheffield, P., Garrard, S., and Derewenda, Z. (1999) Overcoming expression and purification problems of RhoGDI using a family of "parallel" expression vectors. Protein Expr. Purif. 15(1), 34-9.
159. Masuda, Y. and Kamiya, K. (2006) Role of single-stranded DNA in targeting REV1 to primer termini. J. Biol. Chem. 281(34), 24314-21.
160. Bienko, M., Green, C.M., Crosetto, N., Rudolf, F., Zapart, G., Coull, B., Kannouche, P., Wider, G., Peter, M., Lehmann, A.R., Hofmann, K., and Dikic, I. (2005) Ubiquitin-binding domains in Y-family polymerases regulate translesion synthesis. Science 310(5755), 1821-4.
161. Ishikawa, T., Uematsu, N., Mizukoshi, T., Iwai, S., Iwasaki, H., Masutani, C., Hanaoka, F., Ueda, R., Ohmori, H., and Todo, T. (2001) Mutagenic and nonmutagenic bypass of DNA lesions by Drosophila DNA polymerases dpoleta and dpoliota. J. Biol. Chem. 276(18), 15155-63.


162. Choi, J.Y., Chowdhury, G., Zang, H., Angel, K.C., Vu, C.C., Peterson, L.A., and Guengerich, F.P. (2006) Translesion synthesis across O6-alkylguanine DNA adducts by recombinant human DNA polymerases. J. Biol. Chem. 281(50), 38244-56.
163. Masuda, Y. and Kamiya, K. (2002) Biochemical properties of the human REV1 protein. FEBS Lett. 520(1-3), 88-92.
164. Strauss, B., Rabkin, S., Sagher, D., and Moore, P. (1982) The role of DNA polymerase in base substitution mutagenesis on non-instructional templates. Biochimie 64(8-9), 829-38.
165. Glick, E., Chau, J.S., Vigna, K.L., McCulloch, S.D., Adman, E.T., Kunkel, T.A., and Loeb, L.A. (2003) Amino acid substitutions at conserved tyrosine 52 alter fidelity and bypass efficiency of human DNA polymerase eta. J. Biol. Chem. 278(21), 19341-6.
166. Matsuda, T., Bebenek, K., Masutani, C., Hanaoka, F., and Kunkel, T.A. (2000) Low fidelity DNA synthesis by human DNA polymerase-eta. Nature 404(6781), 1011-3.
167. Washington, M.T., Johnson, R.E., Prakash, L., and Prakash, S. (2004) Human DNA polymerase iota utilizes different nucleotide incorporation mechanisms dependent upon the template base. Mol. Cell. Biol. 24(2), 936-43.
168. Johnson, R.E., Prakash, L., and Prakash, S. (2005) Biochemical evidence for the requirement of Hoogsteen base pairing for replication by human DNA polymerase iota. Proc. Natl. Acad. Sci. U. S. A. 102(30), 10466-71.
169. Brown, J.A., Fowler, J.D., and Suo, Z. (2010) Kinetic basis of nucleotide selection employed by a protein template-dependent DNA polymerase. Biochemistry 49(26), 5504-10.
170. Wolfle, W.T., Washington, M.T., Prakash, L., and Prakash, S. (2003) Human DNA polymerase kappa uses template-primer misalignment as a novel means for extending mispaired termini and for generating single-base deletions. Genes Dev. 17(17), 2191-9.

171. Fogg, M.J., Pearl, L.H., and Connolly, B.A. (2002) Structural basis for uracil recognition by archaeal family B DNA polymerases. Nat. Struct. Biol. 9(12), 922-7.
172. Vasquez-Del Carpio, R., Silverstein, T.D., Lone, S., Swan, M.K., Choudhury, J.R., Johnson, R.E., Prakash, S., Prakash, L., and Aggarwal, A.K. (2009) Structure of human DNA polymerase kappa inserting dATP opposite an 8-OxoG DNA lesion. PLoS One 4(6), e5766.
173. Jain, R., Nair, D.T., Johnson, R.E., Prakash, L., Prakash, S., and Aggarwal, A.K. (2009) Replication across template T/U by human DNA polymerase-iota. Structure 17(7), 974-80.
174. Nair, D.T., Johnson, R.E., Prakash, L., Prakash, S., and Aggarwal, A.K. (2009) DNA Synthesis across an Abasic Lesion by Human DNA Polymerase iota. Structure 17(4), 530-7.
175. Avkin, S., Adar, S., Blander, G., and Livneh, Z. (2002) Quantitative measurement of translesion replication in human cells: evidence for bypass of abasic sites by a replicative DNA polymerase. Proc. Natl. Acad. Sci. U. S. A. 99(6), 3764-9.
176. Boudsocq, F., Kokoska, R.J., Plosky, B.S., Vaisman, A., Ling, H., Kunkel, T.A., Yang, W., and Woodgate, R. (2004) Investigating the role of the little finger domain of Y-family DNA polymerases in low fidelity synthesis and translesion replication. J. Biol. Chem. 279(31), 32932-40.
177. Pohjola, S.K., Savela, K., Kuusimaki, L., Kanno, T., Kawanishi, M., and Weyand, E. (2004) Polycyclic Aromatic Hydrocarbons of Diesel and Gasoline Exhaust and DNA Adduct Detection in Calf Thymus DNA and Lymphocyte DNA of Workers Exposed to Diesel Exhaust. Polycyclic Aromatic Compounds 25(4-5), 451-465.
178. Liu, Y., Yang, Z., Utzat, C.D., Geacintov, N.E., Basu, A.K., and Zou, Y. (2005) Interactions of human replication protein A with single-stranded DNA adducts. Biochem. J. 385(Pt 2), 519-26.
179. Zou, Y., Shell, S.M., Utzat, C.D., Luo, C., Yang, Z., Geacintov, N.E., and Basu, A.K. (2003) Effects of DNA adduct structure and sequence context on strand opening of repair intermediates and incision by UvrABC nuclease. Biochemistry 42(43), 12654-61.
180. Duvauchelle, J.B., Blanco, L., Fuchs, R.P., and Cordonnier, A.M. (2002) Human DNA polymerase mu (Pol mu) exhibits an unusual replication slippage ability at AAF lesion. Nucleic Acids Res. 30(9), 2061-7.
181. Freisinger, E., Grollman, A.P., Miller, H., and Kisker, C. (2004) Lesion (in)tolerance reveals insights into DNA replication fidelity. EMBO J. 23(7), 1494-505.
182. Hogg, M., Wallace, S.S., and Doublie, S. (2004) Crystallographic snapshots of a replicative DNA polymerase encountering an abasic site. EMBO J. 23, 1483-93.
183. Zang, H., Irimia, A., Choi, J.Y., Angel, K.C., Loukachevitch, L.V., Egli, M., and Guengerich, F.P. (2006) Efficient and High Fidelity Incorporation of dCTP Opposite 7,8-Dihydro-8-oxodeoxyguanosine by Sulfolobus solfataricus DNA Polymerase Dpo4. J. Biol. Chem. 281(4), 2358-72.
184. Pohjola, S.K., Lappi, M., Honkanen, M., Rantanen, L., and Savela, K. (2003) DNA binding of polycyclic aromatic hydrocarbons in a human bronchial epithelial cell line treated with diesel and gasoline particulate extracts and benzo[a]pyrene. Mutagenesis 18(5), 429-38.
185. Mitchelmore, C.L., Livingstone, D.R., and Chipman, J.K. (1998) Conversion of 1-nitropyrene by Brown trout (Salmo trutta) and turbot (Scophtalamus maximus) to DNA adducts detected by 32P-postlabelling. Biomarkers 3(1), 21-33.
186. Sabbioni, G. and Jones, C.R. (2002) Biomonitoring of arylamines and nitroarenes. Biomarkers 7(5), 347-421.
187. Rafii, F., Franklin, W., Heflich, R.H., and Cerniglia, C.E. (1991) Reduction of nitroaromatic compounds by anaerobic bacteria isolated from the human gastrointestinal tract. Appl. Environ. Microbiol. 57(4), 962-8.
188. Malia, S.A., Vyas, R.R., and Basu, A.K. (1996) Site-specific frame-shift mutagenesis by the 1-nitropyrene-DNA adduct N-(deoxyguanosin-8-yl)-1-aminopyrene located in the (CG)3 sequence: effects of SOS, proofreading, and mismatch repair. Biochemistry 35(14), 4568-77.
189. Hirose, M., Lee, M.S., Wang, C.Y., and King, C.M. (1984) Induction of rat mammary gland tumors by 1-nitropyrene, a recently recognized environmental mutagen. Cancer Res. 44(3), 1158-62.
190. El-Bayoumy, K., Hecht, S.S., Sackl, T., and Stoner, G.D. (1984) Tumorigenicity and metabolism of 1-nitropyrene in A/J mice. Carcinogenesis 5(11), 1449-52.
191. Malia, S.A. and Basu, A.K. (1995) Mutagenic specificity of reductively activated 1-nitropyrene in Escherichia coli. Biochemistry 34(1), 96-104.
192. Hatanaka, N., Yamazaki, H., Oda, Y., Guengerich, F.P., Nakajima, M., and Yokoi, T. (2001) Metabolic activation of carcinogenic 1-nitropyrene by human cytochrome P450 1B1 in Salmonella typhimurium strain expressing an O-acetyltransferase in SOS/umu assay. Mutat. Res. 497(1-2), 223-33.
193. Chan, P. (1996) NTP technical report on the toxicity studies of 1-Nitropyrene (CAS No. 5522-43-0) Administered by Inhalation to F344/N Rats. Toxic Rep. Ser. 34, 1-D2.
194. Wooster, R., Cleton-Jansen, A.M., Collins, N., Mangion, J., Cornelis, R.S., Cooper, C.S., Gusterson, B.A., Ponder, B.A., von Deimling, A., Wiestler, O.D., et al. (1994) Instability of short tandem repeats (microsatellites) in human cancers. Nat. Genet. 6(2), 152-6.
195. Watt, D.L., Utzat, C.D., Hilario, P., and Basu, A.K. (2007) Mutagenicity of the 1-nitropyrene-DNA adduct N-(deoxyguanosin-8-yl)-1-aminopyrene in mammalian cells. Chem. Res. Toxicol. 20(11), 1658-64.
196. Nolan, S.J., Vyas, R.R., Hingerty, B.E., Ellis, S., Broyde, S., Shapiro, R., and Basu, A.K. (1996) Solution properties and computational analysis of an oligodeoxynucleotide containing N-(deoxyguanosin-8-yl)-1-aminopyrene. Carcinogenesis 17(1), 133-44.

197. Suo, Z., Lippard, S.J., and Johnson, K.A. (1999) Single d(GpG)/cis-diammineplatinum(II) adduct-induced inhibition of DNA polymerization. Biochemistry 38(2), 715-26.
198. Gu, Z., Gorin, A., Krishnasamy, R., Hingerty, B.E., Basu, A.K., Broyde, S., and Patel, D.J. (1999) Solution structure of the N-(deoxyguanosin-8-yl)-1-aminopyrene ([AP]dG) adduct opposite dA in a DNA duplex. Biochemistry 38(33), 10843-54.
199. Batra, V.K., Beard, W.A., Shock, D.D., Krahn, J.M., Pedersen, L.C., and Wilson, S.H. (2006) Magnesium-induced assembly of a complete DNA polymerase catalytic complex. Structure 14(4), 757-66.
200. Einolf, H.J. and Guengerich, F.P. (2001) Fidelity of nucleotide insertion at 8-oxo-7,8-dihydroguanine by mammalian DNA polymerase delta. Steady-state and pre-steady-state kinetic analysis. J. Biol. Chem. 276(6), 3764-71.
201. Choi, J.Y. and Guengerich, F.P. (2005) Adduct size limits efficient and error-free bypass across bulky N2-guanine DNA lesions by human DNA polymerase eta. J. Mol. Biol. 352(1), 72-90.
202. Miller, H. and Grollman, A.P. (1997) Kinetics of DNA polymerase I (Klenow fragment exo-) activity on damaged DNA templates: effect of proximal and distal template damage on DNA synthesis. Biochemistry 36(49), 15336-42.
203. Choi, J.Y. and Guengerich, F.P. (2004) Analysis of the effect of bulk at N2-alkylguanine DNA adducts on catalytic efficiency and fidelity of the processive DNA polymerases bacteriophage T7 exonuclease- and HIV-1 reverse transcriptase. J. Biol. Chem. 279(18), 19217-29.
204. Woodside, A.M. and Guengerich, F.P. (2002) Misincorporation and stalling at O(6)-methylguanine and O(6)-benzylguanine: evidence for inactive polymerase complexes. Biochemistry 41(3), 1039-50.
205. Sherrer, S.M., Fiala, K.A., Fowler, J.D., Newmister, S.A., Pryor, J.M., and Suo, Z. (2011) Quantitative analysis of the efficiency and mutagenic spectra of abasic lesion bypass catalyzed by human Y-family DNA polymerases. Nucleic Acids Res. 39(2), 609-22.
206. Brown, J.A., Pack, L.R., Fowler, J.D., and Suo, Z. (2011) Pre-steady-state kinetic analysis of the incorporation of anti-HIV nucleotide analogs catalyzed by human X- and Y-family DNA polymerases. Antimicrob. Agents Chemother. 55(1), 276-83.
207. Schorr, S., Schneider, S., Lammens, K., Hopfner, K.P., and Carell, T. (2010) Mechanism of replication blocking and bypass of Y-family polymerase eta by bulky acetylaminofluorene DNA adducts. Proc. Natl. Acad. Sci. U. S. A. 107(48), 20720-5.
208. Bacolod, M.D., Krishnasamy, R., and Basu, A.K. (2000) Mutagenicity of the 1-nitropyrene-DNA adduct N-(deoxyguanosin-8-yl)-1-aminopyrene in Escherichia coli located in a nonrepetitive CGC sequence. Chem. Res. Toxicol. 13(6), 523-8.
209. Maher, V.M., Ouellette, L.M., Curren, R.D., and McCormick, J.J. (1976) Frequency of ultraviolet light-induced mutations is higher in xeroderma pigmentosum variant cells than in normal human cells. Nature 261(5561), 593-5.
210. Misra, R.R. and Vos, J.M. (1993) Defective replication of psoralen adducts detected at the gene-specific level in xeroderma pigmentosum variant cells. Mol. Cell. Biol. 13(2), 1002-12.
211. Silverstein, T.D., Johnson, R.E., Jain, R., Prakash, L., Prakash, S., and Aggarwal, A.K. (2010) Structural basis for the suppression of skin cancers by DNA polymerase eta. Nature 465(7301), 1039-43.
212. Kusumoto, R., Masutani, C., Shimmyo, S., Iwai, S., and Hanaoka, F. (2004) DNA binding properties of human DNA polymerase eta: implications for fidelity and polymerase switching of translesion synthesis. Genes Cells 9(12), 1139-50.
213. Park, H., Zhang, K., Ren, Y., Nadji, S., Sinha, N., Taylor, J.S., and Kang, C. (2002) Crystal structure of a DNA decamer containing a cis-syn thymine dimer. Proc. Natl. Acad. Sci. U. S. A. 99(25), 15965-70.

214. Wang, Y.C., Maher, V.M., Mitchell, D.L., and McCormick, J.J. (1993) Evidence from mutation spectra that the UV hypermutability of xeroderma pigmentosum variant cells reflects abnormal, error-prone replication on a template containing photoproducts. Mol. Cell. Biol. 13(7), 4276-83.
215. Wang, Y., Woodgate, R., McManus, T.P., Mead, S., McCormick, J.J., and Maher, V.M. (2007) Evidence that in xeroderma pigmentosum variant cells, which lack DNA polymerase eta, DNA polymerase iota causes the very high frequency and unique spectrum of UV-induced mutations. Cancer Res. 67(7), 3018-26.
216. Guo, C., Kosarek-Stancel, J.N., Tang, T.S., and Friedberg, E.C. (2009) Y-family DNA polymerases in mammalian cells. Cell. Mol. Life Sci. 66(14), 2363-81.
217. Jansen, J.G., Tsaalbi-Shtylik, A., Hendriks, G., Gali, H., Hendel, A., Johansson, F., Erixon, K., Livneh, Z., Mullenders, L.H., Haracska, L., and de Wind, N. (2009) Separate domains of Rev1 mediate two modes of DNA damage bypass in mammalian cells. Mol. Cell. Biol. 29(11), 3113-23.
218. Zhang, Y., Yuan, F., Wu, X., Wang, M., Rechkoblit, O., Taylor, J.S., Geacintov, N.E., and Wang, Z. (2000) Error-free and error-prone lesion bypass by human DNA polymerase kappa in vitro. Nucleic Acids Res. 28(21), 4138-46.
219. Vasquez-Del Carpio, R., Silverstein, T.D., Lone, S., Johnson, R.E., Prakash, L., Prakash, S., and Aggarwal, A.K. (2011) Role of human DNA polymerase kappa in extension opposite from a cis-syn thymine dimer. J. Mol. Biol. 408(2), 252-61.
220. Sherman, S.E. and Lippard, S.J. (1987) Structural Aspects of Platinum Anticancer Drug Interactions with DNA. Chem. Rev. 87, 1153-1181.
221. Zlatanova, J., Yaneva, J., and Leuba, S.H. (1998) Proteins that specifically recognize cisplatin-damaged DNA: a clue to anticancer activity of cisplatin. FASEB J. 12(10), 791-9.
222. Gupta-Burt, S., Shamkhani, H., Reed, E., Tarone, R.E., Allegra, C.J., Pai, L.H., and Poirier, M.C. (1993) Relationship between patient response in ovarian and breast cancer and platinum drug-DNA adduct formation. Cancer Epidemiol. Biomarkers Prev. 2(3), 229-34.

223. Huang, L., Turchi, J.J., Wahl, A.F., and Bambara, R.A. (1993) Effects of the anticancer drug cis-diamminedichloroplatinum(II) on the activities of calf thymus DNA polymerase epsilon. Biochemistry 32(3), 841-8.
224. Hoffmann, J.S., Pillaire, M.J., Maga, G., Podust, V., Hubscher, U., and Villani, G. (1995) DNA polymerase beta bypasses in vitro a single d(GpG)-cisplatin adduct placed on codon 13 of the HRAS gene. Proc. Natl. Acad. Sci. U. S. A. 92(12), 5356-60.
225. Villani, G., Hubscher, U., and Butour, J.L. (1988) Sites of termination of in vitro DNA synthesis on cis-diamminedichloroplatinum(II) treated single-stranded DNA: a comparison between E. coli DNA polymerase I and eucaryotic DNA polymerases alpha. Nucleic Acids Res. 16(10), 4407-18.
226. Yamada, K., Takezawa, J., and Ezaki, O. (2003) Translesion replication in cisplatin-treated xeroderma pigmentosum variant cells is also caffeine-sensitive: features of the error-prone DNA polymerase(s) involved in UV-mutagenesis. DNA Repair (Amst) 2(8), 909-24.
227. Adar, S. and Livneh, Z. (2006) Translesion DNA synthesis across non-DNA segments in cultured human cells. DNA Repair (Amst) 5(4), 479-90.
228. Plosky, B.S., Frank, E.G., Berry, D.A., Vennall, G.P., McDonald, J.P., and Woodgate, R. (2008) Eukaryotic Y-family polymerases bypass a 3-methyl-2'-deoxyadenosine analog in vitro and methyl methanesulfonate-induced DNA damage in vivo. Nucleic Acids Res. 36(7), 2152-62.
229. Haracska, L., Washington, M.T., Prakash, S., and Prakash, L. (2001) Inefficient bypass of an abasic site by DNA polymerase eta. J. Biol. Chem. 276(9), 6861-6.
230. Brown, J.A., Zhang, L., Sherrer, S.M., Taylor, J.S., Burgers, P.M., and Suo, Z. (2010) Pre-Steady-State Kinetic Analysis of Truncated and Full-Length Saccharomyces cerevisiae DNA Polymerase Eta. J. Nucleic Acids 2010.
231. Douliez, J.P., Michon, T., and Marion, D. (2000) Steady-state tyrosine fluorescence to study the lipid-binding properties of a wheat non-specific lipid-transfer protein (nsLTP1). Biochim. Biophys. Acta 1467(1), 65-72.
