A Dissertation

entitled

Crystallographic Studies of DNA Replication and Repair

by

Stephen J. Tomanicek

Submitted as partial fulfillment of the requirements for

the Doctor of Philosophy in Chemistry

Advisor: Timothy C. Mueser, Ph.D.

Graduate School

The University of Toledo

May 2005

Copyright © 2005

This document is copyrighted material. Under copyright law, no parts of this document

may be reproduced without the expressed permission of the author.

An Abstract of

Crystallographic Studies of DNA Replication and Repair Proteins

Stephen J. Tomanicek

Submitted as partial fulfillment of the requirements for

the Doctor of Philosophy in Chemistry

The University of Toledo

May 2005

The duplication of genomic information is central to the survival of organisms in all kingdoms of life. DNA replication and repair processes are essential for maintaining the fidelity and genomic stability required for life. Many proteins are involved directly in a number of coordinated interactions to ensure the accurate and efficient replication and repair of DNA. However, a number of these coordinated interactions during the replication and repair of DNA remain uncharacterized. Therefore, studying the nature of the various protein-protein and protein-substrate interactions can provide a more comprehensive understanding of both DNA replication and repair in all forms of life.

Specifically, the fidelity of DNA replication is highly dependent on the function of the flap endonuclease (FEN-1) family of nucleases. The FEN-1 family of DNA replication associated DNA repair enzymes are structure specific 5’ to 3’ endonucleases that are members of the RAD2/RAD27 family of eukaryotic nucleases. The FEN-1 family of enzymes are also functionally related to both the bacteriophage and prokaryotic 5’ to 3’ exonucleases. Many of the enzymes in the RAD2/RAD27 family of nucleases are involved in the processing of Okazaki fragment primers during lagging-strand DNA synthesis and in processing strands displaced during DNA synthesis associated with repair. However, a comprehensive structural characterization of the structure-specific substrate recognition of the FEN-1 family of enzymes has not yet been completed.

This work was focused primarily on structural studies of the archaeal Aeropyrum pernix (Ape) FEN-1 and the T4 RNase H, a FEN-1 homologue in the bacteriophage T4. A number of X-ray crystallographic studies were focused on understanding the molecular basis of nucleic acid substrate recognition and the role of divalent metal ions in the catalytic mechanism of these enzymes. These structural studies have provided a more complete understanding of how catalysis is facilitated by the structure-specific substrate recognition of these essential enzymes in both DNA replication and repair.

Acknowledgements

I would like to thank my advisor Dr. Mueser for the opportunity to complete my

graduate studies in his laboratory. I am truly grateful for the amount of help and

encouragement that he provided me during my research. Through his teaching, guidance, and expertise I have developed a great interest and appreciation for scientific research. I would also like to thank my committee members, Dr. Gray, Dr. Hu, and Dr. Viola for

their helpful discussions and support throughout the completion of my graduate studies.

Many thanks to the University of Toledo Department of Chemistry and their staff,

especially Dr. Leif Hanson for his assistance during my graduate research. Thanks also

go to the American Heart Association Ohio Valley Affiliate for providing me a

predoctoral fellowship during the last two years of graduate research. Lastly, I would

like to thank Dr. Gloria Borgstahl for allowing me to begin my graduate studies in her

laboratory.

I am also very grateful and appreciative for the large amount of help and support

that my lab members, past and present, have provided me. I thank you all as colleagues

and most of all as friends. Many thanks also go to my family and friends for their

support. I will be forever thankful to my parents, Marge and Charlie, for their unconditional support and encouragement through all of my educational endeavors. In addition, I would like to thank my fiancée, Kristina, and her children, Ari and Antonio, for their love, support, encouragement, and patience throughout the completion of my graduate studies.

Table of contents

Acknowledgements...... v

Table of contents ...... vi

List of Tables ...... x

List of Figures...... xii

List of Abbreviations...... xviii

Preface...... xx

Chapter 1: Background ...... 1

1.1 Flap Endonuclease-1...... 1

1.1.1 Background ...... 1

1.1.2 Archaeal Flap Endonuclease (FEN-1)...... 16

1.2 Bacteriophage T4 RNase H...... 19

1.2.1 Background ...... 19

1.2.2 and proposed reaction ...... 27

1.3 Escherichia coli DNA-Binding Protein from Starved Cells...... 31

Chapter 2: Structure of the Metal-free Aeropyrum pernix (Ape)

Flap Endonuclease-1 (FEN-1) ...... 35

2.1 X-ray Diffraction Data Collection...... 35

2.2 Data Processing ...... 37

2.3 Structure Determination ...... 40

2.4 Model Building and Refinement...... 42

2.5 General Architecture of the Metal-free Ape FEN-1...... 57

2.6 Bridge Region and Active Site Structure...... 60

2.7 Related Enzyme Structure Comparison...... 62

Chapter 3: Flap Endonuclease-1 Divalent Metal and DNA

Substrate Structural Studies ...... 71

3.1 Divalent Metal Studies: Aeropyrum pernix (Ape) Flap

Endonuclease-1...... 71

3.1.1 Preface...... 71

3.1.2 Expression...... 71

3.1.3 Purification ...... 72

3.1.4 Co-crystallization with Divalent Metal Ions...... 84

3.1.5 X-ray Diffraction Data Collection and Data Processing ...... 99

3.1.6 Structure Determination and Refinement...... 103

3.1.7 Electron Density Interpretation and Analysis...... 107

3.2 DNA Substrate Studies: Archaeal Flap Endonuclease-1 ...... 112

3.2.1 Protein and DNA Substrate Preparation...... 112

3.2.2 Co-crystallization with Flap DNA Substrates ...... 115

3.2.3 X-ray Diffraction Data Collection ...... 123

3.2.4 Aeropyrum pernix FEN-1 DNA Binding...... 124

Chapter 4: Bacteriophage T4 RNase H Structural Studies ...... 138

4.1 Metal Free Bacteriophage T4 RNase H...... 138

4.1.1 Data Processing ...... 138

4.1.2 Structure Determination ...... 139

4.1.3 Model Building and Refinement...... 140

4.2 Metal Free D132N Bacteriophage T4 RNase H ...... 149

4.2.1 Data Processing ...... 149

4.2.2 Model Building and Refinement...... 150

4.3 Structure Analysis and Comparison...... 156

4.3.1 General Architecture of Metal-free D132N T4 RNase H...... 157

4.3.2 Central Groove and Large Subdomain Structure ...... 160

4.3.3 Active Site Structure...... 162

4.3.4 Related Enzyme Structure Comparison ...... 163

4.3.5 Related Enzyme Bridge Region Comparison and DNA Binding

Implications ...... 166

4.4 D132N Bacteriophage T4 RNase H with DNA Substrate ...... 170

4.4.1 X-ray Diffraction Data Collection ...... 171

4.4.2 Data Processing ...... 173

4.4.3 Structure Determination ...... 175

4.4.4 Initial Model Building and Refinement ...... 176

Chapter 5: Escherichia coli DNA-Binding Protein from Starved Cells...... 179

5.1 Expression and Purification...... 179

5.2 Dynamic Light Scattering (DLS)...... 182

5.3 Crystallization ...... 183

5.4 X-ray Diffraction Data Collection and Data Processing...... 185

5.5 Characterization ...... 188

5.5.1 N-terminal Protein Sequencing ...... 188

5.5.2 Mass Spectroscopy...... 190

5.6 Structure Determination ...... 190

References ...... 193

Appendices...... 204

Appendix I: American Heart Association Ohio Valley Affiliate

Predoctoral Fellowship Proposal...... 205

Appendix II: American Heart Association Predoctoral Fellowship

Progress Report for Renewal ...... 218

List of Tables

Table 2.1: Metal-free Aeropyrum pernix (Ape) flap endonuclease-1 (FEN-1)

crystallographic data ...... 40

Table 2.2: Metal-free Aeropyrum pernix (Ape) flap endonuclease-1 (FEN-1)

crystallographic refinement data...... 55

Table 3.1: X-ray diffraction data sets of Ape FEN-1 divalent metal soaking and growth

experiments using the metal-free Ape FEN-1 crystal form...... 87

Table 3.2: Commercially available and in-house crystal screening kits...... 89

Table 3.3: Known cryoprotectant solutions and protective concentrations...... 94

Table 3.4: Data sets of Ape FEN-1 in the presence of divalent metals (MgCl2, MnCl2,

CaCl2, SrCl2, and BaCl2) following crystallization rescreening and optimization in a citrate-free buffer ...... 102

Table 3.5: Summary of molecular replacement results for the data sets of Ape FEN-1 grown in the presence of different divalent metals...... 105

Table 3.6: Summary of Refmac5 refinement results for the data sets of Ape FEN-1 grown in the presence of different divalent metals ...... 106

Table 3.7: Summary of the superposition of the models of Ape FEN-1 grown in the presence of different divalent metals onto the metal-free Ape FEN-1 structure ...... 108

Table 3.8: Expansion tray index of Ape and Archaeoglobus veneficus FEN-1 with 7/14 flap DNA...... 119

Table 4.1: Metal-free T4 RNase H crystallographic data...... 139

Table 4.2: Metal-free T4 RNase H preliminary crystallographic refinement data...... 143

Table 4.3: Metal-free T4 RNase H crystallographic refinement data...... 147

Table 4.4: Metal-free D132N T4 RNase H crystallographic data ...... 150

Table 4.5: Metal-free D132N T4 RNase H crystallographic refinement data...... 154

Table 4.6: D132N T4 RNase H/fork DNA complex crystallographic data...... 174

Table 5.1: Dps/PexB data sets collected...... 187

Table 5.2: Data processing results for Dps Data set #1 ...... 188

List of Figures

Figure 1.1: Dna2/RPA/FEN-1 model for Okazaki fragment processing in eukaryotic cells ...... 5

Figure 1.2: FEN-1 only model for Okazaki fragment processing in eukaryotic cells ...... 7

Figure 1.3: RNase H1/FEN-1 model for Okazaki fragment processing in eukaryotic cells ...... 8

Figure 1.4: Flap DNA substrate of the FEN-1 enzymes...... 11

Figure 1.5: Sequence alignment of archaeal FEN-1 enzymes from various species and human FEN-1 showing absolutely conserved active site acidic residues...... 18

Figure 1.6: Bacteriophage T4 DNA Replication Fork...... 21

Figure 1.7: Model showing the 5’ to 3’ cleavage of the RNA pentamer primer and adjacent DNA by the T4 RNase H during T4 lagging strand DNA replication ...... 26

Figure 1.8: Active site of the native T4 RNase H bound to two magnesium ions...... 28

Figure 1.9: Proposed nuclease reaction of T4 RNase H ...... 30

Figure 1.10: Ribbon structure of Dps ...... 32

Figure 2.1: Metal-free Ape FEN-1 crystal and X-ray diffraction image ...... 36

Figure 2.2: Primary sequence alignment of Ape FEN-1 and Pfu FEN-1 ...... 41

Figure 2.3: Ribbon diagram of the Pyrococcus furiosus (Pfu) flap endonuclease-1

(FEN-1) structure...... 42

Figure 2.4: Composite omit electron density map quality...... 45

Figure 2.5: Ape FEN-1 model building and refinement example ...... 48

Figure 2.6: Extension of Ape FEN-1 resolution to 1.4 Å ...... 49

Figure 2.7: Electron density of bridge region following the first round of restrained refinement of metal-free Ape FEN-1...... 52

Figure 2.8: Final 2Fobs-Fcalc electron density map quality...... 53

Figure 2.9: Model building and refinement summary ...... 54

Figure 2.10: Summary of the structure determination process for metal-free

Ape FEN-1...... 56

Figure 2.11: Ribbon diagram of metal-free Ape FEN-1...... 57

Figure 2.12: Ribbon diagram of metal-free Ape FEN-1 with labeled secondary structural

features...... 58

Figure 2.13: Bridge region of the metal-free Ape FEN-1...... 60

Figure 2.14: Active site region of metal-free Ape FEN-1 ...... 62

Figure 2.15: Comparison of metal-free Ape FEN-1 to related FEN-1 structures in the

RAD2/RAD27 family of nucleases ...... 65

Figure 2.16: Comparison of the metal-free Ape FEN-1 active site and bridge region to

related structures in the RAD2/RAD27 family of nucleases...... 68

Figure 3.1: SDS-PAGE gel following initial purification of archaeal FEN-1 proteins by

Third Wave Technologies...... 74

Figure 3.2: Superdex 75 chromatogram for Ape FEN-1 purification...... 78

Figure 3.3: SDS-PAGE gel from Superdex 75 run of Ape FEN-1 purification...... 79

Figure 3.4: Poros HS cation exchange chromatogram for Ape FEN-1 purification...... 81

Figure 3.5: SDS-PAGE gel from Poros HS run of Ape FEN-1 purification...... 82

Figure 3.6: Summary of the Ape FEN-1 purification scheme...... 83

Figure 3.7: Unit cell packing of native Ape FEN-1 molecules in the crystal...... 84

Figure 3.8: Initial crystal screen hits of Ape FEN-1 with MgCl2 at 21 °C...... 92

Figure 3.9: Coarse gradient expansion crystals of Ape FEN-1 with MgCl2 at 21 °C ...... 93

Figure 3.10: Shallow gradient expansion crystals at 21 °C of Ape FEN-1 in the presence of divalent metal ions following heat incubation...... 98

Figure 3.11: Advanced Photon Source (APS) and in-house X-ray sources ...... 100

Figure 3.12: X-ray diffraction image of an Ape FEN-1 crystal grown in the presence of

MnCl2...... 101

Figure 3.13: Summary of the structure determination process for Ape FEN-1 grown in the

presence of various divalent metals (citrate-free buffer) ...... 107

Figure 3.14: Electron density map interpretation of the active site region of Ape FEN-1: a comparison between the models of Ape FEN-1 grown in the presence of divalent metals

(from data sets shown in Table 3.4) versus the final model of the metal-free

Ape FEN-1 (see Chapter 2)...... 111

Figure 3.15: Flap DNA substrate of the FEN-1 enzymes...... 112

Figure 3.16: Flap DNA substrates used for co-crystallization trials with the archaeal

FEN-1 enzymes...... 114

Figure 3.17: Initial crystal screen hits of Ape FEN-1 and 7/14 flap DNA ...... 116

Figure 3.18: Initial crystal screen hits of Archaeoglobus veneficus FEN-1 and 7/14 flap

DNA...... 117

Figure 3.19: X-ray diffraction image of an Ape FEN-1 and 7/14 flap DNA crystal (see

Figure 3.17A) grown from the initial co-crystallization trials...... 124

Figure 3.20: Ribbon structure of the metal-free Ape FEN-1 superimposed onto the ribbon structure of Archaeoglobus veneficus FEN-1 bound to 3’ flap DNA...... 127

Figure 3.21: Sequence alignment of the residues making up the 3’ flap DNA binding

pocket of the Archaeoglobus fulgidus (Afu) FEN-1 in comparison to other archaeal

FEN-1 enzymes and human FEN-1 (adapted from Friedrich-Heineken and Hubscher,

2004) ...... 128

Figure 3.22: Amino acid residues forming the 3’ flap DNA interface of the

Archaeoglobus fulgidus (Afu) FEN-1 in comparison to the proposed 3’ flap DNA binding

interface of Ape FEN-1...... 130

Figure 3.23: Proposed model for flap DNA substrate binding to the Ape FEN-1

enzyme ...... 134

Figure 3.24: Groove between the Ape FEN-1 antiparallel β strand and the large

subdomain of the enzyme ...... 135

Figure 4.1: Ribbon diagram of metal-bound T4 RNase H...... 140

Figure 4.2: Final electron density quality of metal-free T4 RNase H ...... 147

Figure 4.3: Summary of the structure determination process for metal-free

T4 RNase H...... 148

Figure 4.4: Ribbon diagram of metal-free T4 RNase H ...... 149

Figure 4.5: Final electron density quality of metal-free D132N T4 RNase H...... 154

Figure 4.6: Summary of the structure determination process for metal-free D132N

T4 RNase H...... 155

Figure 4.7: Ribbon diagram of metal-free D132N T4 RNase H...... 156

Figure 4.8: Ribbon diagrams of (A) metal-free D132N T4 RNase H and (B) metal-bound

native T4 RNase H (Mueser et al., 1996) ...... 157

Figure 4.9: Major structural differences between metal-free and metal-bound

T4 RNase H...... 159

Figure 4.10: Bridge region of metal-free and metal-bound T4 RNase H ...... 161

Figure 4.11: Active site region of metal-free and metal-bound T4 RNase H...... 163

Figure 4.12: Comparison of metal-free and metal-bound T4 RNase H to related structures in the RAD2 family of nucleases...... 165

Figure 4.13: Comparison of the metal-free and metal-bound T4 RNase H active site and bridge regions to related structures in the RAD2 family of nucleases ...... 168

Figure 4.14: Fork DNA substrate used for co-crystallization with the D132N

T4 RNase H...... 171

Figure 4.15: X-ray diffraction images of crystals of the D132N T4 RNase H in the

presence of fork DNA...... 172

Figure 5.1: SDS-PAGE gel following initial purification of archaeal FEN-1 proteins by

Third Wave Technologies...... 180

Figure 5.2: SDS-PAGE gel from Superdex 75 column purification of Ape FEN-1 and Dps impurity...... 181

Figure 5.3: Shallow gradient expansion crystals at 21 °C of the impurity (Dps) from both

the Ape and Tzi FEN-1 purifications...... 184

Figure 5.4: X-ray diffraction images of Dps crystals grown from material isolated from

Tzi FEN-1 ...... 186

Figure 5.5: Summary of the Bravais lattice autoindexing results for Dps

Data Set #1...... 187

Figure 5.6: Amino acid primary sequence of E. coli Dps/PexB...... 189

Figure 5.7: Dps molecular replacement search models ...... 191

List of Abbreviations

ADSC ……………. Area Detector Systems Corporation

Afu……………….. Archaeoglobus fulgidus

Ape……………….. Aeropyrum pernix

APS………………. Advanced Photon Source

Ave……………….. Archaeoglobus veneficus

Bis-Tris ………….. 2,2-Bis(hydroxymethyl)-2,2’,2’’-nitrilotriethanol

CC………………... Correlation coefficient

CCD ……………... Charge-Coupled Device

CCP4 ……………. Collaborative Computational Project, Number 4

CFE……………… Cell free extract

CNS……………… Crystallographic and NMR System

Cp…………………Polydispersity

DLS ……………... Dynamic Light Scattering

DNA…………….. Deoxyribonucleic acid

Dps……………… DNA-binding protein from starved cells

DT……………….. Translational diffusion coefficient

E. coli …………… Escherichia coli

EDTA ………….... Ethylenediaminetetraacetic acid

FEN-1……………. Flap endonuclease-1

HEPES ………...... 4-(2-hydroxyethyl)-1-piperazineethanesulfonic acid

HPLC ………….... High-Performance Liquid Chromatography

IPTG …………..... Isopropyl-β-D-thiogalactopyranoside

MES …………….. 4-morpholineethanesulfonic acid monohydrate

MPD …………….. 2-methyl-2,4-pentanediol

MW ……………… Molecular Weight

MWCO ………….. Molecular Weight Cut-Off

OMCC …………… Ohio Macromolecular Crystallography Consortium

PCNA……………. Proliferating Cell Nuclear Antigen

PDB ……………… Protein Data Bank

PEG ……………… Polyethylene glycol

PEI ………………. Polyethylenimine

Pfu……………….. Pyrococcus furiosus

pI ………………... Isoelectric point

PIPES …………… Piperazinebis(ethanesulfonic) acid

RH………………… Hydrodynamic radius of gyration

RNA…………….... Ribonucleic acid

SDS-PAGE ……… Sodium Dodecyl Sulphate-Polyacrylamide Gel Electrophoresis

Taq………………. Thermus aquaticus

Tm …………….... Melting temperature

TNR…………….. Trinucleotide repeat

Tris ……………… Tris(hydroxymethyl)aminomethane

Tzi……………….. Thermococcus zilligii

XPG……………... Xeroderma pigmentosum group G protein

Preface

The goal of my dissertation research was to gain knowledge in the fields of

biochemistry and macromolecular crystallography. I began my graduate research career

as a member of Dr. Gloria Borgstahl’s structural biology research laboratory starting in

the spring of 2000. My research in Dr. Borgstahl’s laboratory involved the design of expression vectors for attachment of a lanthanide-binding affinity peptide tag onto one of

the termini of any test protein. The goal of the project was to express, purify, crystallize,

and collect X-ray diffraction data of a recombinant test protein bound to a lanthanide ion.

The bound lanthanide ion would then be used to solve the crystallographic phase problem

eventually leading to structure determination of the test protein. Specifically, molecular

cloning techniques were used to design two expression vectors in which a

lanthanide-binding tag was fused to the N-terminus of recombinant lysozyme. Protein

expression and purification were attempted on the lanthanide-tagged lysozyme constructs

but were unsuccessful. In addition to using lysozyme for this research, a barnase

construct containing a C-terminal zinc finger lanthanide-binding tag was provided by Dr.

Paul Rosevear (University of Cincinnati, Cincinnati, OH). Protein expression,

purification, and crystallization trials were completed using this lanthanide-bound

barnase construct but no diffraction quality crystals were obtained. No further work on

these projects was completed following the relocation of the Borgstahl laboratory to

Omaha, Nebraska in the summer of 2002. Even though these projects were not

completed, I was able to learn a number of molecular biology, protein chemistry, and

crystallization techniques and methodologies that would be very useful in my future graduate studies.

In the summer of 2002, I joined Dr. Timothy C. Mueser’s research laboratory in order to complete my graduate studies in macromolecular crystallography. I was provided an opportunity to work on a number of projects that were focused on X-ray crystallographic structural studies of DNA replication and repair enzymes. Having learned techniques such as molecular cloning, protein expression and purification, and crystallization in the Borgstahl laboratory, my dissertation research in the Mueser laboratory was primarily directed toward my interests in the structure determination process. My first project was to complete the X-ray diffraction data collection, data processing, structure determination, model building, and refinement of the metal-free

Aeropyrum pernix (Ape) flap endonuclease-1 (FEN-1) (as discussed in Chapter 2). This project was one portion of a major project in the Mueser lab in which purified archaeal

FEN-1 protein from various species along with purified synthetic flap DNA substrates were provided, in kind, by Third Wave Technologies, Inc. (Madison, WI). The overall goal of the project was to obtain X-ray structures of these archaeal FEN-1 enzymes bound to flap DNA substrate. Therefore, following completion of the metal-free Ape

FEN-1 structure, I was involved in the co-crystallization screening and crystal optimization of a number of archaeal FEN-1 enzymes in the presence of synthetic DNA substrate (as discussed in Chapter 3). Prior to obtaining reproducible diffraction quality crystals of a FEN-1/DNA substrate, the collaboration with Third Wave Technologies Inc. was abruptly ended when their stock became public in 2003. No further archaeal FEN-1

protein or DNA substrate was then provided for completion of the co-crystallization

studies.

During my work on the co-crystallization studies with DNA substrate, I also

worked on a project aimed at solving the structure of the Ape FEN-1 in the presence

of catalytic divalent metal ions in order to study the structural binding characteristics of

these catalytic metal ions in the active site region of the enzyme (as discussed in

Chapter 3). While preparing archaeal FEN-1 protein for the co-crystallization studies with DNA substrate and the divalent metal ion studies of Ape FEN-1, we discovered that the last two large-scale fermentation preparations of archaeal FEN-1 enzymes by Third

Wave Technologies Inc. were heavily contaminated by an impurity that accounted for

approximately 50% of protein material that was present (as discussed in Chapter 5). The

impurity was isolated from various FEN-1 samples and was screened for crystallization

because it was initially believed that this impurity was a truncation product of each

respective archaeal FEN-1 enzyme. However, the impurity was later identified to be the

Escherichia coli (E. coli) DNA protection protein, Dps/PexB. Following initial crystal

screening, I was involved in crystal optimization, data collection, and structure

determination trials by molecular replacement of the Dps impurity.

In addition to the structural studies of the archaeal FEN-1 enzymes discussed

above, I was involved in the data processing, structure determination, model building,

and refinement of both the metal-free native and D132N mutant of the bacteriophage T4

RNase H, a related homologue to the FEN-1 enzymes (as discussed in Chapter 4). Also,

I was involved in the data collection, initial phasing, and model building and refinement

of the D132N T4 RNase H in the presence of bound fork DNA substrate (as discussed in

Chapter 4). These structural studies will be used in order to characterize the molecular basis of nucleic acid substrate recognition and the role of divalent metal ions in the catalytic mechanism of the T4 RNase H enzyme. Upon completion of the model of T4

RNase H bound to fork DNA substrate, this structural information will be used for constructing a model of the binding of flap DNA substrate to the metal-free Ape FEN-1 enzyme.

Throughout my research in the Mueser lab, I was involved in the training of other lab members in macromolecular crystallography. The projects in our lab were designed so that I was able to work with a number of other group members using a variety of techniques critical to the structure determination process. I also gained valuable experience in macromolecular crystallography by coordinating a number of synchrotron trips to the Advanced Photon Source (APS) in Chicago, IL for data collection. Overall, my research experiences in the Mueser lab have contributed greatly to my knowledge in the fields of both biochemistry and macromolecular crystallography. Through this research, I have gained both an appreciation and an understanding of the research process which will help prepare me for an independent career in research.

Chapter 1: Background

1.1 Flap Endonuclease-1

1.1.1 Background

DNA replication is both a semi-conservative and a semi-discontinuous process

due to the antiparallel structure of the double helix and the 5’ to 3’ directionality of DNA

synthesis. The leading strand is synthesized continuously in the direction of the opening

replication fork, whereas the lagging strand is synthesized in a discontinuous manner in

the opposite direction of the opening replication fork. The lagging strand is synthesized as a series of fragments called Okazaki fragments (reviewed in Kornberg and Baker,

1992). Before lagging strand synthesis can occur, each of the Okazaki fragments must be initiated by a short RNA primer. In eukaryotes, when synthesis of one Okazaki fragment encounters a previously synthesized or downstream fragment, the elongating DNA strand displaces the 5’ end of this downstream fragment resulting in a 5’ unannealed flap that

contains the initiator RNA primer along with adjacent DNA. The RNA primer and any

adjacent DNA must be removed prior to ligation of the Okazaki fragments in order to

complete lagging strand synthesis. The flap endonuclease (FEN-1) has been shown to be

an important nuclease in the final cleavage step of the 5’ displaced fragment prior to

ligation and completion of lagging strand synthesis (Kao and Bambara, 2003; Liu et al.,

2004).

The flap endonuclease (FEN-1) family of DNA replication associated DNA repair

enzymes are structure specific 5’ to 3’ endonucleases that are members of the

RAD2/RAD27 family of eukaryotic nucleases which include the Schizosaccharomyces

pombe RAD2, Saccharomyces cerevisiae RAD27, murine FEN-1, and human FEN-1 enzymes (Harrington and Lieber, 1994a; Liu et al., 2004; Shen et al., 1998). The FEN-1 family of enzymes are also functionally related to both the bacteriophage and prokaryotic

5’ to 3’ exonucleases (Liu et al., 2004). The bacteriophage exonucleases include the bacteriophage T4 RNase H (see section 1.2) and the smaller bacteriophage T5 D15 and

T7 gene 6 exonucleases (Mueser et al., 1996; Ceska et al., 1996). The prokaryotic related homologues include the 5’ to 3’ exonuclease domains of DNA repair polymerases from bacteria such as E. coli Pol I and Thermus aquaticus (Taq) polymerase (Kim et al., 1995). In addition, it has been shown that the FEN-1 enzymes are related to the larger eukaryotic nucleotide excision repair enzymes such as the human Xeroderma pigmentosum group G protein (XPG) (Harrington and Lieber, 1994b). Like FEN-1, many of these enzymes in the RAD2/RAD27 family are involved in removing RNA primer fragments during lagging strand DNA replication or damaged DNA fragments in various DNA repair pathways.

Amino acid sequence alignments have shown that the eukaryotic FEN-1 nucleases contain three domains: the N-terminal (N), intermediate or internal (I), and the C-terminal

(C) domains, whereas the functionally related bacteriophage and prokaryotic nuclease homologues only contain the N and I domains (Harrington and Lieber, 1994b). Both the

N and I domains show a relatively high sequence similarity among members of the archaeal and eukaryotic FEN-1 nucleases and the related FEN-1 homologues in both

bacteriophage and prokaryotic organisms (Mueser et al., 1996; Liu et al., 2004). The N and I domains of all members of the FEN-1 family of nucleases contain distinct highly conserved acidic amino acids that have been shown to be involved in both catalysis and

substrate binding (Mueser et al., 1996; Shen et al., 1998). The C domain, which is absent

in the bacteriophage and prokaryotic FEN-1 homologues, has been shown to be important

for FEN-1 interaction with the proliferating cell nuclear antigen (PCNA) during DNA

replication (Shen et al., 1998; Chapados et al., 2004; Sakurai et al., 2004).

The FEN-1 enzymes are structure specific 5’ to 3’ endonucleases that recognize

and act on three stranded substrates called flap DNA. Flap DNA is generated in vivo

during strand displacement synthesis, an event that occurs in both lagging strand DNA

replication and in various DNA repair pathways. Specifically in eukaryotic cells, lagging

strand DNA replication is initiated by the activity of the DNA polymerase

α/primase complex (pol α) which synthesizes an 8-12 nucleotide RNA primer. The DNA

synthetase activity then synthesizes approximately 20-30 nucleotides of adjacent DNA.

This initiator RNA and adjacent DNA must be removed prior to the joining of Okazaki

fragments to preserve the fidelity of replication because pol α lacks proofreading 3’ to 5’

exonuclease activity. As a result, the pol α-synthesized DNA adjacent to the RNA primer

may contain misincorporated bases which are potentially mutagenic. Following

initiation, the pol α complex is displaced from the RNA-DNA primer by the clamp loader

replication factor C (RF-C), which triggers the loading and binding of PCNA and DNA

polymerase δ (pol δ), respectively. Once the holoenzyme form of RF-C, PCNA, and pol

δ is assembled, highly processive synthesis continues until the 5’ end of the previous

Okazaki fragment is reached. On reaching the 5’ end of a downstream Okazaki fragment,

the pol δ complex begins strand displacement synthesis which generates a 5' flap DNA structure containing the RNA primer and pol α-synthesized DNA. The resulting flap

DNA structure is processed and the final cleavage step is performed by the FEN-1

enzyme creating a nick that can then be sealed by DNA ligase I (Kao and Bambara,

2003).

FEN-1 has been shown to have a preference for a short 5’ displaced flap that is

approximately 1 to 5 nucleotides in length. However, if the displaced flap becomes

longer, sufficient single stranded DNA is available to form a binding site for the

eukaryotic single-stranded binding protein, replication protein A (RPA). It has been

shown that the binding of RPA to the displaced flap limits the flap length to

approximately 30-35 nucleotides during displacement synthesis (Maga et al., 2001).

Previous studies have shown that an RPA coated 5’ flap intermediate inhibits FEN-1

activity but stimulates the activity of the multifunctional protein, Dna2. Dna2 is a multifunctional enzyme with 5’ to 3’ helicase, 5’ nuclease, 3’ nuclease, and ATPase activities (Budd and Campbell, 1997; Budd et al., 2000). It was also shown that the

binding of RPA to a displaced 5’ flap led to the recruitment of the Dna2 and the

formation of an RPA/Dna2/DNA ternary complex (Bae et al., 2001). A number of models have been proposed to explain the processing of the displaced 5’ flap during lagging strand DNA replication; three are discussed here. The results described above led to the proposal of the first of these, a Dna2/RPA/FEN-1 model for the processing and cleavage of 5’ displaced flaps during lagging strand synthesis. The Dna2/RPA/FEN-1 model for Okazaki fragment processing is illustrated in Figure 1.1.

Figure 1.1: Dna2/RPA/FEN-1 model for Okazaki fragment processing in eukaryotic cells. 5’ to 3’ elongation of the upstream Okazaki fragment by the pol δ complex (DNA polymerase δ and PCNA) results in strand displacement of the downstream RNA primer (shown in red) and some adjacent DNA. If a long flap is displaced, RPA will bind to the flap and inhibit FEN-1 activity. The binding of RPA recruits the binding of the Dna2 multifunctional protein. If the long flap forms a secondary structure such as a hairpin, the helicase activity of Dna2 is needed in order to process the flap. Once Dna2 is bound and any potential secondary structure is removed, the Dna2 endonuclease activity will cleave the RPA coated flap, generating a short flap substrate. This Dna2 processed flap is too short for RPA binding but is an ideal substrate for FEN-1 bound to PCNA. FEN-1 cleavage of the short flap structure one base pair into the downstream Okazaki fragment duplex results in a nick in the DNA. This nick is then sealed by the DNA ligase I bound to PCNA.

In this model, displacement synthesis begins as the pol δ complex encounters a previously synthesized Okazaki fragment. If the displaced flap becomes long enough,

RPA can bind to the single stranded flap, subsequently recruiting the Dna2 multifunctional protein. The helicase activity of the Dna2 may be required in order to process long flaps containing a repeat region that could potentially fold into secondary structures (such as a hairpin) which are completely resistant to FEN-1 cleavage

(Henricksen et al., 2000). Following any processing of a structured flap, the endonuclease activity of the Dna2 can cleave the long flap leaving only 5-7 nucleotides.

This cleavage step allows the dissociation of both the RPA and the Dna2. The remaining flap is a more favorable substrate for FEN-1 cleavage that results in a single nick that can be sealed by DNA ligase I (Bae et al., 2001).
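
The order of events in the Dna2/RPA/FEN-1 model can be summarized in a short Python sketch. It is illustrative only: the function name, the step labels, and in particular the RPA_MIN_BINDING_FLAP threshold are assumptions introduced here, since the text states only that RPA binds once the flap is long enough. The 5-7 nucleotide Dna2 product is taken from the text.

```python
from typing import List

# ASSUMED placeholder: the text gives no minimum flap length for RPA binding.
RPA_MIN_BINDING_FLAP = 20  # nucleotides (hypothetical value)


def process_flap(flap_length: int, structured: bool = False) -> List[str]:
    """Return the ordered processing steps for a displaced 5' flap.

    flap_length: length of the displaced flap in nucleotides.
    structured:  True if a long flap folds into secondary structure
                 (for example a hairpin) that resists FEN-1 cleavage.
    """
    if flap_length < 1:
        raise ValueError("flap_length must be at least 1 nucleotide")

    steps: List[str] = []
    if flap_length >= RPA_MIN_BINDING_FLAP:
        # Long flap: RPA coats the flap, blocks FEN-1, and recruits Dna2.
        steps.append("RPA binds the flap and inhibits FEN-1")
        steps.append("RPA recruits Dna2")
        if structured:
            steps.append("Dna2 helicase removes secondary structure")
        steps.append("Dna2 endonuclease shortens the flap to 5-7 nt")
    # Short flap (original or Dna2-processed): FEN-1 then ligase.
    steps.append("FEN-1 cleaves the short flap, leaving a nick")
    steps.append("DNA ligase I seals the nick")
    return steps
```

Calling process_flap(3) returns only the FEN-1 and ligase steps, whereas process_flap(30, structured=True) adds the RPA and Dna2 steps first. The sketch does not model structured flaps shorter than the assumed RPA threshold.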

In contrast to the Dna2/RPA/FEN-1 model, a FEN-1 only model has also been proposed. In this model, FEN-1 cleavage is assumed to be very efficient as the flap is generated during displacement synthesis. As the flap is displaced and before it is long enough for RPA to bind, the flap is removed by one or more FEN-1 cleavages until all of the pol α-synthesized RNA/DNA is removed. Studies in support of this model have shown that the processing of the lagging strands during strand displacement synthesis is not Dna2 dependent, and that FEN-1 alone is sufficient for nicked product formation prior to ligation (Ayyagari et al., 2003). The FEN-1 only model for Okazaki fragment processing is illustrated in Figure 1.2. However, if FEN-1 activity is somehow compromised or inhibited by the formation of longer flaps that are bound to RPA, it has been suggested that Dna2 may be needed for processing of the displaced flap (Ayyagari et al., 2003; Kao et al., 2004).

Figure 1.2: FEN-1 only model for Okazaki fragment processing in eukaryotic cells. 5’ to 3’ elongation of the upstream Okazaki fragment by the pol δ complex (DNA polymerase δ and PCNA) results in strand displacement of the downstream RNA primer (shown in red) and some adjacent DNA. The short flaps generated are then cleaved (one base pair into the downstream Okazaki fragment duplex) by the FEN-1 bound to PCNA which leaves a nick in the DNA. This nick is then sealed by the DNA ligase I bound to PCNA.

Lastly, an RNase H1/FEN-1 model has been proposed for the processing of the

RNA primers during lagging strand synthesis (Turchi et al., 1994). In this model, the

RNA primers are removed by the sequential action of both eukaryotic RNase H1 and the

FEN-1. Most of the initiator RNA is cleaved by the RNase H1 prior to the formation of the displaced flap. 5’ endonucleolytic cleavage of the RNA primer by the RNase H1 has been shown to leave only a single ribonucleotide that is subsequently displaced by the pol δ complex. This displaced flap can then be cleaved by FEN-1. Simultaneous strand displacement followed by FEN-1 cleavage may be required to remove any remaining pol α-synthesized DNA before ligation can take place. The RNase H1/FEN-1 model for Okazaki fragment processing is illustrated in Figure 1.3. However, additional genetic studies have shown that RNase H1 is not essential for Okazaki fragment processing, suggesting that the RNase H1/FEN-1 model is not the primary pathway for

RNA primer removal (Qiu et al., 1999).

Figure 1.3: RNase H1/FEN-1 model for Okazaki fragment processing in eukaryotic cells. Prior to completion of the 5’ to 3’ elongation of the upstream Okazaki fragment by the pol δ complex (DNA polymerase δ and PCNA), RNase H1 cleaves the RNA primer (shown in red) one nucleotide 5’ to the RNA/DNA junction. Following cleavage by RNase H1, the pol δ complex carries out strand displacement synthesis and displaces the remaining ribonucleotide along with some adjacent DNA. This short flap containing the remaining ribonucleotide is then cleaved by FEN-1 bound to PCNA which leaves a nick in the DNA. This nick is then sealed by the DNA ligase I bound to PCNA.

Based on the previously discussed models of Okazaki fragment processing,

FEN-1 is an essential 5’ nuclease that plays a critical role in the removal of RNA primers and adjacent DNA during lagging strand DNA synthesis. Thus, FEN-1 is a critical enzyme for maintaining genomic integrity in a number of organisms. Deletion mutants of the Schizosaccharomyces pombe RAD2 gene (the FEN-1 homologue in fission yeast) display a temperature sensitive growth defect and accumulate in S phase, suggesting that DNA replication is blocked (Sommers et al., 1995). Also, deletions of the Saccharomyces cerevisiae RAD27 gene have been shown to cause DNA replication defects such as chromosomal instability, an increased frequency of frame-shift mutations, and increased trinucleotide repeat (TNR) expansion (reviewed in Henneke et al., 2003). Specifically, deletion of the Saccharomyces cerevisiae RAD27 gene resulted in length-dependent destabilization of triplet CTG tracts and a significant increase in expansion frequency, indicating an inability to process the primers associated with Okazaki fragments correctly

(Tishkoff et al., 1997; Schweitzer and Livingston, 1998; Freudenreich et al., 1998).

These results indicate that incorrect processing of the Okazaki fragment primers during strand displacement synthesis by FEN-1 enzymes can lead to mutations that have been associated with human neurological and neuromuscular disorders such as recessive retinitis pigmentosa, Huntington’s disease, myotonic dystrophy, fragile X syndrome, and

Friedreich’s ataxia. These disorders are believed to be exclusively caused by trinucleotide repeat expansion (reviewed in Usdin and Grabczyk, 2000).

Because the loss of FEN-1 function has been associated with an increased rate of expansion, a sequence expansion model involving FEN-1 has been proposed and is referred to as the FEN-1 interference model (reviewed in Usdin and Grabczyk, 2000; Liu et al., 2004). As discussed previously, during lagging strand replication, the FEN-1 enzyme is able to cleave a short, displaced 5’ single-stranded flap prior to ligation of

adjacent Okazaki fragments. However, if this displaced flap becomes longer and

contains certain triplet base repeats (CTG-CAG, CGG-CCG, and GAA-TTC), the

formation of a structured flap, such as one with a self-complementary repeat sequence,

would inhibit cleavage by FEN-1 (Gordenin et al., 1997). If the Dna2/RPA/FEN-1 flap

processing model (shown in Figure 1.1) were to somehow become compromised or fail,

the self-complementary, structured flap (hairpin, triplex, tetraplex) could then equilibrate

into a bubble intermediate or internal loop conformation, caused by a misaligned

annealing to the template strand that could be ligated into an expanded strand (Liu and

Bambara, 2003). A number of repair pathways incorporate the unprocessed flap leading

to expansion (reviewed in Usdin and Grabczyk, 2000). Studies in support of the FEN-1

interference model have shown that TNR flaps form stable secondary structures that

significantly reduce FEN-1 binding (Henricksen et al., 2000). More recently, studies

have indicated that FEN-1 competes with DNA ligase I at the last step of Okazaki

fragment processing to inhibit sequence expansion (Henricksen et al., 2002). Without a

complete understanding of how expansion occurs and causes disease, it is clear from the

results discussed above that FEN-1 is involved in the prevention of TNR expansion.
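
As an illustration of how flaps at risk of forming such structures might be flagged, the following Python sketch scans a DNA sequence for uninterrupted tracts of the triplet repeats named above (CTG-CAG, CGG-CCG, and GAA-TTC). The function name and the min_repeats default are assumptions made for this example; the text does not specify a tract length at which FEN-1 cleavage becomes inhibited.

```python
import re
from typing import List, Tuple

# Repeat units named in the text, listed for both strands.
TNR_UNITS = ("CTG", "CAG", "CGG", "CCG", "GAA", "TTC")


def find_tnr_tracts(seq: str, min_repeats: int = 5) -> List[Tuple[str, int, int]]:
    """Find uninterrupted trinucleotide repeat tracts in a DNA sequence.

    Returns (unit, start_index, repeat_count) tuples sorted by start index.
    min_repeats is an ASSUMED cutoff chosen only for illustration.
    """
    seq = seq.upper()
    tracts = []
    for unit in TNR_UNITS:
        # Match the unit repeated at least min_repeats times in a row.
        pattern = re.compile(r"(?:%s){%d,}" % (unit, min_repeats))
        for match in pattern.finditer(seq):
            repeat_count = (match.end() - match.start()) // 3
            tracts.append((unit, match.start(), repeat_count))
    return sorted(tracts, key=lambda tract: tract[1])
```

For example, a flap sequence built as "AAA" + "CTG" * 6 + "TTT" yields a single CTG tract of six repeats starting at index 3. A tract reported by this scan indicates only the potential for secondary structure; it does not predict folding or FEN-1 activity.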

During lagging strand DNA synthesis, read-through replication generates an

upstream complementary strand that displaces a segment of the downstream duplex of a

previously synthesized Okazaki fragment; the resulting structure is called flap DNA (as shown in Figure 1.4).


Figure 1.4: Flap DNA substrate of the FEN-1 enzymes. Flap DNA substrate (also called a double flap substrate) has a displaced downstream 5’ flap along with an upstream 3’ flap overhang. The FEN-1 enzymes possess structure-specific endonuclease activity and cleave the 5’ flap one base pair into the downstream duplex DNA (shown with a red arrow).

Recognition by the FEN-1 enzymes is mediated by the structure of the DNA flap junction at the 3’ end of the short complementary strand. Studies have demonstrated that FEN-1 enzymes specifically recognize a short 5’ single-stranded arm, regardless of composition or sequence, near the junction where the two strands of duplex DNA separate. FEN-1 has been shown to function primarily as an endonuclease during DNA replication.

However, FEN-1 also has a low efficiency exonuclease activity that can cleave a nick, a gap, or a recessed 5’ end of double-stranded DNA (Harrington and Lieber, 1994a). As an endonuclease, FEN-1 is able to cleave a pseudo-Y-structure but is inert on single-stranded DNA, double-stranded DNA, a heterologous loop, a D loop, a Holliday junction, and either a 3’ or 5’ overhang (Harrington and Lieber, 1994a). Although FEN-1 can cleave a pseudo-Y-structure that lacks an upstream duplex region, the presence of an upstream duplex resulted in a significant increase in cleavage rate and cleavage accuracy

(Harrington and Lieber, 1994a; Kaiser et al., 1999). It has also been shown that an upstream one nucleotide 3’ overhang results in an enhanced endonuclease activity of approximately 3 orders of magnitude (Kaiser et al., 1999). Recent structural information


has identified a unique binding pocket for the upstream 3’ overhang base on an archaeal

FEN-1 enzyme, as discussed in Chapter 3, section 3.2.4 (Chapados et al., 2004).

Following flap DNA recognition, the enzyme then moves down the single-stranded arm

to the cleavage site that is located at the junction of the single- and double-stranded

nucleic acid where cleavage takes place exclusively one base pair into the downstream

duplex (Shen et al., 1998; Bornarth et al., 1999; Kaiser et al., 1999; Kao et al., 2002). If

the upstream duplex contains a 3’ one nucleotide overlap, cleavage by FEN-1 results in a

nick in the downstream duplex that can be sealed by DNA ligase. However, when the

upstream duplex abuts the downstream duplex with no 3’ overlap nucleotide, cleavage

one base pair into the downstream duplex leaves a gap which would need to be filled by

DNA polymerase prior to ligation (Kaiser et al., 1999). These results, along with

mutational studies of the Saccharomyces cerevisiae RAD27, suggest that a double flap

containing both a downstream 5’ displaced arm and an upstream 3’ overhang nucleotide

(see Figure 1.4) is the in vivo substrate during DNA replication (Kao et al., 2002). This

double flap structure is proposed to be created in vivo by a transient flap equilibration

following strand displacement synthesis.
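
The relationship between the upstream 3’ overlap and the cleavage product can be stated as simple arithmetic. The Python sketch below is a bookkeeping aid rather than a model of the enzyme; the function name and return strings are assumptions introduced for this example. It encodes only the two cases described above: cleavage one base pair into the downstream duplex, with either zero or one upstream 3’ overlap nucleotide.

```python
def cleavage_product(upstream_3prime_overlap: int) -> str:
    """Describe the product left after FEN-1 cleaves a flap substrate.

    FEN-1 cleaves one base pair into the downstream duplex, so one
    template position is vacated. A one-nucleotide upstream 3' overlap
    can occupy that position; without it the position stays empty.
    """
    if upstream_3prime_overlap not in (0, 1):
        raise ValueError("only 0 or 1 overlap nucleotides are described in the text")

    # One template position is vacated by cleavage; the overlap fills it.
    unfilled_positions = 1 - upstream_3prime_overlap
    if unfilled_positions == 0:
        return "nick (ligatable by DNA ligase)"
    return "1-nt gap (requires DNA polymerase before ligation)"
```

With one overlap nucleotide the function reports a nick, and with none it reports a one-nucleotide gap, matching the two outcomes reported by Kaiser et al. (1999).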

To date, the X-ray crystal structures of nine enzymes in this family have been

determined: one prokaryotic source, the 5’ to 3’ exonuclease domain of Thermus

aquaticus (Taq) polymerase (1TAQ, 2.40 Å resolution) (Kim et al., 1995); two from

bacteriophage, the T4 RNase H (1TFR, 2.1 Å resolution) (Mueser et al., 1996) and the

T5 5’ to 3’ exonuclease (1EXN, 2.50 Å resolution) (Ceska et al., 1996), (1XO1, 2.50 Å

resolution) (Garforth et al., 1999), (1UT5, 1UT8, 2.75 Å) (Feng et al., 2004); four from

Euryarchaeal organisms, the Pyrococcus furiosus (Pfu) flap endonuclease-1 (1B43,


2.00 Å resolution) (Hosfield et al., 1998b), the Methanococcus jannaschii flap

endonuclease-1 (1A76 and 1A77, 2.00 Å resolution) (Hwang et al., 1998), the

Pyrococcus horikoshii flap endonuclease-1 (1MC8, 3.10 Å resolution) (Matsui et al.,

2002), and the Archaeoglobus fulgidus (Afu) flap endonuclease-1 complexed to 3’ flap

DNA (1RXW, 2.00 Å) (Chapados et al., 2004); one from the Crenarchaeal organisms

presented here, the Aeropyrum pernix flap endonuclease-1 (1.4 Å) (see Chapter 2); and one eukaryotic source, the Human flap endonuclease-1 complexed to the homotrimeric human PCNA (1UL1, 2.90 Å) (Sakurai et al., 2004). The recent structural information of the Afu FEN-1 complexed to 3’ flap DNA has contributed to the understanding of the recognition of the FEN-1 family of enzymes to the upstream duplex region of the flap substrate; however, structural studies of this family of enzymes have yet to determine the complete substrate recognition. Also, structural studies of FEN-1 complexed to PCNA have provided details as to how the PCNA clamp may help to orient the FEN-1 enzyme in order to facilitate binding and cleavage of a flap DNA substrate during DNA replication (Chapados et al., 2004; Sakurai et al., 2004).

The FEN-1 family of enzymes can be identified by highly conserved acidic amino acids in the N and I domains which form magnesium ion coordination motifs that have been shown to be involved in both catalysis and substrate binding (Harrington and

Lieber, 1994a; Mueser et al., 1996; Shen et al., 1998). Structural studies have shown that the catalytic core of the FEN-1 family of enzymes is highly conserved and contains mainly the negatively charged acidic residues, aspartic and glutamic acid, to which two magnesium ions are shown bound in the active site of the enzyme (Mueser et al., 1996;

Hosfield et al., 1998b; Hwang et al., 1998; Sakurai et al., 2004). Coordination of divalent


cations is essential for activity and may be required both to neutralize this condensed

region of negative charge in the active site and to coordinate the inner sphere water molecule involved in the enzymatic cleavage reaction (Harrington and Lieber, 1994a;

Kaiser et al., 1999). Mutational studies have shown that one of the divalent metal ions

(Mg1) is involved in catalysis, whereas the other divalent metal ion (Mg2) is believed to

be involved in substrate binding (Bhagwat et al., 1997c; Shen et al., 1997). Interestingly,

the mutation of the residues that coordinate the proposed catalytic metal (Mg1) site in the

5’ to 3’ exonuclease domain of Taq polymerase greatly reduced but did not eliminate

cleavage (personal communication, Third Wave Technologies, Inc.).

A two-metal mechanism has been proposed for the FEN-1-mediated cleavage of

the scissile phosphate bond one base pair into the downstream duplex, analogous to the

two-metal mechanism that has been proposed for the 3’ to 5’ exonuclease domain of E. coli DNA polymerase I (Beese and Steitz, 1991). Based on this proposal, one of the metal ions (Mg1 in FEN-1) facilitates the formation of a hydroxide ion that can perform nucleophilic attack on the scissile phosphodiester bond one base pair into the downstream duplex. The other metal ion (Mg2 in FEN-1) stabilizes both the oxyanion of the leaving group and the pentavalent species formed during the transition state. This two-metal mechanism requires the distance between the metal ions to be less than 4 Å.

Interestingly, the crystal structures of the FEN-1 homologue T4 RNase H and the archaeal FEN-1 enzymes from Methanococcus jannaschii and Pyrococcus furiosus (Pfu) show that the two bound magnesium ions in the active site are ~7 Å, 5 Å, and 5 Å apart, respectively. This suggests that a conformational change upon substrate binding must occur in the FEN-1 active site region that would bring the two bound magnesium ions


closer together, supporting the two-metal mechanism. Without a co-crystal structure of

FEN-1 bound to the double flap DNA substrate, the nature of such a conformational

change in the active site region is unknown.
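
The distance criterion can be checked directly from metal ion coordinates. The Python sketch below is a minimal illustration, assuming Cartesian coordinates in Å such as those read from the HETATM records of a PDB file; the function names and dictionary are introduced here for illustration. The separations in the dictionary are the approximate values quoted above, not values recomputed from the deposited structures.

```python
import math
from typing import Sequence

# Upper limit on metal-metal separation for the two-metal mechanism (Å),
# as stated in the text.
TWO_METAL_MAX_SEPARATION = 4.0

# Approximate Mg1-Mg2 separations quoted in the text (Å).
REPORTED_SEPARATIONS = {
    "T4 RNase H": 7.0,
    "M. jannaschii FEN-1": 5.0,
    "P. furiosus FEN-1": 5.0,
}


def metal_separation(mg1: Sequence[float], mg2: Sequence[float]) -> float:
    """Euclidean distance in Å between two metal ions given (x, y, z)."""
    return math.dist(mg1, mg2)


def compatible_with_two_metal(separation: float) -> bool:
    """True if the separation satisfies the less-than-4 Å requirement."""
    return separation < TWO_METAL_MAX_SEPARATION


for enzyme, separation in REPORTED_SEPARATIONS.items():
    print(enzyme, separation, compatible_with_two_metal(separation))
```

None of the three quoted separations satisfies the criterion in the substrate-free structures, which is the observation motivating the proposed conformational change.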

The mutational studies discussed above also provide evidence that a two-metal mechanism may not describe the FEN-1 cleavage of the scissile phosphodiester bond.

Mutations of the residues that coordinate the Mg2 metal ion in the FEN-1 homologue, T4

RNase H, have been shown to retain activity, suggesting that the Mg2 metal ion is not required for catalysis (Bhagwat et al., 1997c). Following mutation of the residues coordinating Mg2 in T4 RNase H, it is not known whether the Mg2 metal ion can still remain bound to the enzyme. It is likely that Mg2 is a structural metal, possibly forming part of the binding site for the substrate through chelation to the phosphate backbone of the DNA substrate, as initially proposed by Mueser et al. (1996). It is also possible that the Mg2 metal ion might serve to promote a conformational change in the active site region that would indirectly facilitate the binding and cleavage of the bound substrate. In the human

FEN-1, mutations in the Mg2 metal site abolish enzyme activity due to an inability to bind flap DNA substrate and form an active complex (Shen et al., 1997). Therefore, this evidence suggests that a one-metal mechanism may explain the cleavage of flap DNA substrate by the FEN-1 enzymes. In a one-metal mechanism, an active site acidic amino acid acts as a general base and abstracts a proton from an adjacent water molecule to form a hydroxide ion that can then perform nucleophilic attack on the scissile phosphodiester bond. In this mechanism, the metal ion (Mg1 in FEN-1) stabilizes the pentavalent species formed during the transition state. An amino acid residue serving as a proton donor proximal to the pentavalent transition state species could then serve as a general acid catalyst facilitating the departure of the 3’ oxygen of the adjacent nucleotide.

Additional FEN-1/substrate co-crystallization information may provide additional evidence in support of either a one or two-metal mechanism to explain catalysis of cleavage by FEN-1. See section 1.2 for additional discussion of the nuclease reaction.

1.1.2 Archaeal Flap Endonuclease (FEN-1)

Living organisms are classified into the kingdoms: Prokaryota, Eukaryota, and

Archaea (Woese et al., 1990). The archaeal organisms are hyperthermophiles that normally exist in very extreme environments which subject them to high levels of salt, sulfur, temperature, or pressure. These extreme environments are located in deep water oceanic thermal vents, geysers, hot springs, and high salt lakes.

The kingdom Archaea is further separated into the Euryarchaea and the Crenarchaea.

The Euryarchaea are known as the thermophilic , whereas the Crenarchaea are known as the sulfur metabolizing . Interestingly, these organisms have evolved in order to thrive in an environment where most living organisms cannot survive.

These extreme archaeal organisms are of interest because they have a significant functional similarity to a number of eukaryotic systems (Woese et al., 1990).

Specifically, the kingdom Archaea has a DNA replication and repair system more closely related to the eukaryotic system than the bacterial system (Hosfield et al., 1998a). Thus, the archaeal organisms are a useful model system to study in order to more fully understand eukaryotic DNA replication and repair pathways.

In particular, the archaeal FEN-1 enzymes from both the Euryarchaea and the Crenarchaea are more closely related to the eukaryotic nucleases (murine FEN-1, human FEN-1, Schizosaccharomyces pombe RAD2, and Saccharomyces cerevisiae RAD27 enzymes) than to the prokaryotic homologues (Liu et al., 2004; Shen et al., 1998). In contrast to the prokaryotic FEN-1 homologues, the archaeal FEN-1 enzymes exist as independent proteins and share about

75% sequence similarity with the human FEN-1 enzyme (Shen et al., 1998; Hosfield et al., 1998a). Also, the active site acidic residues in the human FEN-1 enzyme are absolutely conserved in archaeal FEN-1 enzymes. A sequence alignment of the human

FEN-1 and archaeal FEN-1 enzymes from various species is shown in Figure 1.5. The archaeal FEN-1 enzymes show similar substrate binding and catalytic properties to those described in section 1.1.1. However, based on their thermophilic properties, the archaeal

FEN-1 enzymes displayed optimal catalytic activity between 70 and 85 °C (Kaiser et al.,

1999).

Chapters 2 and 3 of this thesis will primarily discuss the work that was done on the Crenarchaeal Aeropyrum pernix (Ape) FEN-1. Aeropyrum pernix is an aerobic hyperthermophilic ocean-vent organism with an optimal growth temperature approaching

100 °C (Sako et al., 1996). The Ape FEN-1 enzyme is approximately 40.1 kDa and has a theoretical pI of 7.75 calculated from the ExPASy ProtParam tool (Gill and von Hippel,

1989). The Ape FEN-1 contains 56 negatively charged amino acid residues (Asp and

Glu) and 57 positively charged amino acid residues (Arg and Lys). The Ape FEN-1 amino acid sequence is shown in Figure 1.5. The Ape FEN-1 enzyme was used in a number of crystallographic studies aimed at solving the structure of the native, metal-bound, and flap DNA substrate bound forms of the enzyme.
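
The charged residue counts quoted above can be reproduced from a one-letter amino acid sequence with a few lines of Python. This is a minimal sketch: the function names are introduced here for illustration, and the net charge estimate is a deliberately crude assumption that ignores histidine, the termini, and all pKa effects, so it is not a substitute for the ProtParam pI calculation.

```python
from typing import Tuple


def count_charged(sequence: str) -> Tuple[int, int]:
    """Count (negative, positive) residues in a one-letter protein sequence.

    Negative residues are Asp (D) and Glu (E); positive residues are
    Arg (R) and Lys (K). Alignment gap characters ('-') are ignored.
    """
    residues = sequence.upper().replace("-", "")
    negative = sum(residues.count(aa) for aa in "DE")
    positive = sum(residues.count(aa) for aa in "RK")
    return negative, positive


def net_charge_estimate(negative: int, positive: int) -> int:
    """Crude net charge: positive minus negative residue counts."""
    return positive - negative


# Counts quoted in the text for full-length Ape FEN-1.
print(net_charge_estimate(negative=56, positive=57))
```

With the quoted counts of 56 acidic and 57 basic residues the crude estimate is a net charge of +1, which is qualitatively consistent with a theoretical pI slightly above neutral.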

1 15 16 30 31 45 46 60 61 75 76 90
1 AfuFEN1 MGADIGDL-----FE ---REEVELEYFSGK KIAVDAFNTLYQFIS IIRQPDGTPLKDSQG RITSHLSGILYRVSN MVEVGIRPVFVFDGE 82
2 AveFEN1 MGADIGEL-----LE ---REEVELEYFSGR KIAIDAFNTLYQFIS IIRQPDGTPLKDSQG RMTSHLSGILYRVSN MIEVGMRPIFVFDGE 82
3 ApeFEN1 MGVNLREL-----IP PEARREVELRALSGY VLALDAYNMLYQFLT AIRQPDGTPLLDREG RVTSHLSGLFYRTIN LVEEGIKPVYVFDGK 85
4 PfuFEN1 MGVPIGEI-----IP ---RKEIELENLYGK KIAIDALNAIYQFLS TIRQKDGTPLMDSKG RITSHLSGLFYRTIN LMEAGIKPVYVFDGE 82
5 PhFEN-1 MGVPIGDL-----VP ---RKEIDLENLYGK KIAIDALNAIYQFLS TIRQRDGTPLMDSKG RITSHLSGLFYRTIN LMEAGIKPAYVFDGK 82
6 MjFEN-1 MGVQFGDF-----IP ---KNIISFEDLKGK KVAIDGMNALYQFLT SIRLRDGSPLRNRKG EITSAYNGVFYKTIH LLENDITPIWVFDGE 82
7 HumanFEN-1 MGIQGLAKLIADVAP -SAIRENDIKSYFGR KVAIDASMSIYQFLI AVRQ-GGDVLQNEEG ETTSHLMGMFYRTIR MMENGIKPVYVFDGK 88

91 105 106 120 121 135 136 150 151 165 166 180
1 AfuFEN1 PPEFKKAEIEERKKR RAEAEEMWIAALQAG D-KDAKKYAQAAGRV DEYIVDSAKTLLSYM GIPFVDAPSEGEAQA AYMAAKGDVEYTGSQ 171
2 AveFEN1 PPVFKQKEIEERKER RAEAEEKWIAAIERG E-KYAKKYAQAAARV DEYIVESSKKLLEYM GVPWVQAPSEGEAQA AYMAAKGDVDFTGSQ 171
3 ApeFEN1 PPEMKSREVEERLRR KAEAEARYRRAVEAG EVEEARKYAMMAARL TSDMVEESKELLDAM GMPWVQAPAEGEAQA AYMARKGDAWATGSQ 175
4 PfuFEN1 PPEFKKKELEKRREA REEAEEKWREALEKG EIEEARKYAQRATRV NEMLIEDAKKLLELM GIPIVQAPSEGEAQA AYMAAKGSVYASASQ 172
5 PhoFEN-1 PPEFKRKELEKRREA REEAELKWKEALAKG NLEEARKYAQRATKV NEMLIEDAKKLLQLM GIPIIQAPSEGEAQA AYMASKGDVYASASQ 172
6 MjFEN-1 PPKLKEKTRKVRREM KEKAELKMKEAIKKE DFEEAAKYAKRVSYL TPKMVENCKYLLSLM GIPYVEAPSEGEAQA SYMAKKGDVWAVVSQ 172
7 HumanFEN-1 PPQLKSGELAKRSER RAEAEKQLQQAQAAG AEQEVEKFTKRLVKV TKQHNDECKHLLSLM GIPYLDAPSEAEASC AALVKAGKVYAAATE 178

181 195 196 210 211 225 226 240 241 255 256 270
1 AfuFEN1 DYDSLLFGSPRLARN LAITGKRKLPGKNVY VDVKPEIIILESNLK RLGLTREQLIDIAIL VGTDYNEG-VKGVGV KKALNYIKTYGDIFR 260
2 AveFEN1 DYDSLLFGSPKLARN LAITGKRKLPGKNVY VEVKPEIIDLNGNLR RLGITREQLVDIALL VGTDYNEG-VKGVGV KKAYKYIKTYGDVFK 260
3 ApeFEN1 DYDSLLFGSPRLVRN LAITGRRKLPGRDQY VEIKPEIIELEPLLS KLGITREQLIAVGIL LGTDYNPGGVRGYGP KTALRLVKSLGDPMK 265
4 PfuFEN1 DYDSLLFGAPRLVRN LTITGKRKLPGKNVY VEIKPELIILEEVLK ELKLTREKLIELAIL VGTDYNPGGIKGIGL KKALEIVRHSKDPLA 262
5 PhFEN-1 DYDSLLFGAPRLIRN LTITGKRKMPGKDVY VEIKPELVVLDEVLK ELKITREKLIELAIL VGTDYNPGGVKGIGP KKALEIVRYSRDPLA 262
6 MjFEN-1 DYDALLYGAPRVVRN LTTT------KEMPELIELNEVLE DLRISLDDLIDIAIF MGTDYNPGGVKGIGF KRAYELVRSGVAKDV 250
7 HumanFEN-1 DMDCLTFGSPVLMRH LTASEAKKLP------IQEFHLSRILQ ELGLNQEQFVDLCIL LGSDYCES-IRGIGP KRAVDLIQKHKSIEE 258

271 285 286 300 301 315 316 330 331 345 346 360
1 AfuFEN1 ALKALKVN---IDH- --VEEIRNFFLNPPV TD--DYRIEFREPDF EKAIEFLCEEHDFSR ERVEKALEKLKA--- --LKSTQATLERWF- 336
2 AveFEN1 ALKALKVE---QEN- --IEEIRNFFLNPPV TN--NYSLHFGKPDD EKIIEFLCEEHDFSK DRVEKAVEKLKAG-- --MQASQSTLERWFS 338
3 ApeFEN1 VLASVPRGEYDPDY- --LRKVYEYFLNPPV TD--DYKIEFRKPDQ DKVREILVERHDFNP ERVERALERLGKAYR EKLRGRQSRLDMWFG 350
4 PfuFEN1 KFQKQSDVD------LYAIKEFFLNPPV TD--NYNLVWRDPDE EGILKFLCDEHDFSE ERVKNGLERLKKAIK ---SGKQSTLESWFK 339
5 PhFEN-1 KFQRQSDVD------LYAIKEFFLNPPV TN--EYSLSWKEPDE EGILKFLCDEHNFSE ERVKNGIERLKKAIK ---AGRQSTLESWFV 339
6 MjFEN-1 LKKEVEYYD------EIKRIFKEPKV TD--NYSLSLKLPDK EGIIKFLVDENDFNY DRVKKHVDKLYNLIA N--KTKQKTLDAWFK 326
7 HumanFEN-1 IVRRLDPNKYPVPEN WLHKEAHQLFLEPEV LDPESVELKWSEPNE EELIKFMCGEKQFSE ERIRSGVKRLSKSRQ ---GSTQGRLDDFFK 345

361 375 376 390 391
1 AfuFEN1 ------ 336
2 AveFEN1 ------ 338
3 ApeFEN1 ------ 350
4 PfuFEN1 R------ 340
5 PhFEN-1 KKKP------ 343
6 MjFEN-1 ------ 326
7 HumanFEN-1 VTGSLSSAKRKEPEP KGSTKKKAKTGAAGK FKRGK 380

Figure 1.5: Sequence alignment of archaeal FEN-1 enzymes from various species and human FEN-1 showing absolutely conserved active site acidic residues. Amino acid sequence alignment of the archaeal FEN-1 enzymes (Archaeoglobus fulgidus (Afu), Archaeoglobus veneficus (Ave), Aeropyrum pernix (Ape), Pyrococcus furiosus (Pfu), Pyrococcus horikoshii (Ph), and Methanococcus jannaschii (Mj) FEN-1) along with Human FEN-1 showing the conserved active site acidic residues in red. The ordered bridge region residues (88-130) of the metal-free Ape FEN-1 structure are shown highlighted in gray (discussed in Chapter 2).
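
A pairwise comparison of two rows of such an alignment can be sketched in Python as follows. The function name is hypothetical, and the measure computed is percent identity (exact matches only), which is stricter than the sequence similarity quoted in the text because similarity also credits conservative substitutions. The rows passed in must already be aligned, with '-' marking gaps.

```python
def percent_identity(row_a: str, row_b: str) -> float:
    """Percent identity between two rows of a sequence alignment.

    Columns where both rows have a gap are skipped. Columns where only
    one row has a gap count as compared but not as matches.
    """
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have equal length")

    compared = 0
    matches = 0
    for a, b in zip(row_a.upper(), row_b.upper()):
        if a == "-" and b == "-":
            continue  # shared gap column carries no information
        compared += 1
        if a == b:
            matches += 1

    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared
```

For instance, percent_identity("ACDE", "ACDF") evaluates to 75.0. To apply the function to Figure 1.5, the blocks of each row would first be concatenated with the spaces between blocks removed.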

1.2 Bacteriophage T4 RNase H

1.2.1 Background

As described in section 1.1.1, when DNA is replicated, the new strand is always synthesized in the 5’ to 3’ direction. The leading strand is synthesized in a continuous manner, but the lagging strand is synthesized in a discontinuous manner as a series of fragments called Okazaki fragments. Each of the Okazaki fragments must be initiated by a short RNA primer before lagging strand synthesis can occur.

Bacteriophage T4 is used as a model system for DNA replication and encodes all proteins required for DNA replication (Cha and Alberts, 1989). Following many years of characterizing the bacteriophage T4 system, it has been shown that many of the proteins play a significant or essential role in DNA replication. When all of the required T4 DNA replication proteins are reconstituted in vitro, these proteins catalyze the synthesis of the leading and lagging strands of a replication fork with a speed and accuracy that is comparable to that observed in vivo (Nossal, 1994).

The bacteriophage T4 encodes ten proteins that are involved in DNA replication and repair. These ten proteins in the T4 DNA replication system comprise two distinct but highly cooperative assemblies, the replisome and the primosome. The gene 43 DNA polymerase is the central component of the replisome. The gene 43 protein is a relatively nonprocessive polymerase that synthesizes the leading and lagging strands of the DNA and has 3’ to 5’ editing exonuclease activity (Cha and Alberts, 1989). There are three polymerase accessory proteins, the gene 45 sliding clamp and the gene 44/62 clamp loader proteins, that help increase the processivity of the gene 43 polymerase


during lagging strand synthesis. The polymerase accessory proteins are also required

during leading strand synthesis on nicked or forked duplex templates (Nossal, 1994).

The gene 45 sliding clamp holds the polymerase on the template strand while the gene

44/62 proteins load the clamp onto the DNA (reviewed in Jones et al., 2004). The

complex of the three accessory proteins and the gene 43 polymerase is referred to as the polymerase holoenzyme (Cha and Alberts, 1989). The gene 41 hexameric helicase is the primary component of the primosome and is responsible for unwinding duplex DNA

ahead of the progressing leading strand replisome. The gene 61 primase, a

non-processive RNA polymerase, is associated with the gene 41 helicase. The primase and helicase are required for synthesizing the RNA pentamer primers for lagging strand replication (Burke et al., 1985). The gene 32 single stranded DNA binding protein is required to protect the single-stranded lagging strand, while stimulating lagging strand synthesis and increasing the rate of primer synthesis (Jones et al., 2004). The gene 59 helicase assembly protein accelerates loading of the gene 41 helicase on DNA, specifically at the replication fork (Jones et al., 2000). T4 RNase H, a 5’ to 3’ exonuclease, is able to remove the pentamer primers from the 5’ end of the Okazaki fragments generated during lagging strand replication (Bhagwat et al., 1997b). Gaps left following the removal of the RNA primers by the RNase H are then filled by the gene 43

DNA polymerase. The gene 30 DNA ligase can repair nicks in the final step of the lagging strand synthesis (Nossal, 1994; Benkovic et al., 2001). See Figure 1.6 for a summary of the bacteriophage T4 DNA replication fork.

Figure 1.6: Bacteriophage T4 DNA Replication Fork. Labeled components (gene numbers in parentheses): ssDNA binding protein (32), ligase (30), RNase H, primase (61), polymerase (43), helicase (41), helicase loading protein (59), clamp (45), and clamp-loader (44/62), shown on the leading and lagging strands.

During lagging strand DNA replication in both prokaryotic and eukaryotic organisms, the discontinuous Okazaki fragments are initiated by short RNA primers that must subsequently be removed by a 5’ nuclease before the gap can be filled in by a DNA polymerase and then joined by a DNA ligase (Kornberg and Baker, 1992; Nossal,

1994). As discussed in section 1.1.1, eukaryotic lagging strand DNA synthesis is initiated by the DNA polymerase α/primase complex (pol α) which synthesizes short


RNA primers followed by a stretch of adjacent DNA. These RNA primers and adjacent

DNA then need to be removed following Okazaki fragment synthesis in order to maintain genomic integrity because the pol α complex lacks a 3’ to 5’ exonuclease activity. The lack of the 3’ to 5’ exonuclease activity results in the pol α being more error prone and therefore having a low fidelity. Likewise, in T4 lagging strand synthesis, it has been suggested that the accuracy of adding the first DNA nucleotides adjacent to the RNA pentamer primer is lower than subsequent additions (Bhagwat and Nossal, 2001). Thus, it has been proposed that processing of the 5’ termini of the Okazaki fragments must involve both the removal of the RNA and a small amount of the adjacent DNA in order to preserve the fidelity of replication (Kao and Bambara, 2003). Therefore, it is important for overall genomic integrity that the discontinuous fragments in lagging strand replication be processed correctly by a 5’ nuclease.

In bacteriophage T4 DNA replication, the RNA pentamer primers and adjacent

DNA are removed by a phage-encoded 5’ to 3’ nuclease, the T4 RNase H. This phage encoded nuclease is sufficient for T4 bacteriophage DNA replication because wild type

T4 RNase H does not require host enzymes for the processing of RNA primers on the lagging strand. T4 RNase H is a member of the RAD2 family of prokaryotic 5’ to 3’ exonucleases and eukaryotic FEN-1 related replication and repair nucleases, as discussed in section 1.1.1 (Mueser et al., 1996). The prokaryotic nucleases include both the smaller bacteriophage T5 D15 and T7 gene 6 exonucleases and the 5’ to 3’ exonuclease domains of DNA polymerases from bacteria such as E. coli and Thermus aquaticus (Taq). The enzymes in the RAD2 family of nucleases share significant amino acid sequence similarity with one another and with the larger eukaryotic nucleotide excision repair enzymes such as human XPG (Harrington and Lieber, 1994b).

Sequence alignments of the RAD2 nucleases show a high sequence similarity in both the

N-terminal (N) and intermediate (I) conserved regions with some variability in the C- terminal (C) region (Mueser et al., 1996; Liu et al., 2004). In particular, the N and I domains of all members of the RAD2 family of nucleases contain highly conserved acidic amino acids that have been shown to be involved in both catalysis and substrate binding (Mueser et al., 1996; Shen et al., 1998).

Like other members in the RAD2 family of nucleases, T4 RNase H possesses both a 5’ to 3’ exonuclease activity that degrades RNA-DNA and DNA-DNA duplexes,

giving short oligonucleotide products, and a 5’ to 3’ flap endonuclease activity that

cleaves close to the junction of single- and double-stranded DNA on fork and flap

substrates (Liu et al., 2004; Bhagwat et al., 1997a). The eukaryotic and archaeal FEN-1

enzymes primarily utilize their flap endonuclease activity, whereas the related

homologues in bacteriophage (T4 RNase H and the T5 5’ to 3’ exonuclease) and

prokaryotes (E. coli Pol I and Thermus aquaticus (Taq) 5’ to 3’ exonucleases) utilize

their exonuclease activity (Mueser et al., 1996; Ceska et al., 1996; Kim et al., 1995).

Based on these differences in nuclease activity between members of the RAD2 nuclease

family, the removal of RNA primers differs in both phage and prokaryotic systems versus

that in eukaryotic related systems (Bhagwat and Nossal, 2001).

In T4 lagging strand DNA replication, the T4 DNA polymerase (gene 43

protein)/clamp (gene 45 protein) complex does not typically catalyze strand displacement

synthesis. Instead, the T4 DNA polymerase stops and is rapidly released when it reaches

an annealed duplex on model lagging strand DNA templates (Hacker and Alberts, 1994;


Carver et al., 1997). In agreement with these results, studies have shown that the T4

RNase H acts as a 5’ to 3’ exonuclease and begins degradation of the RNA pentaprimer

before the synthesis of the upstream fragment is completed (Bhagwat and Nossal, 2001).

Following cleavage of the RNA primer by the T4 RNase H, the resulting gap can then be filled by the DNA polymerase that is elongating the upstream fragment. The remaining nick can then be sealed by the T4 DNA ligase (gene 30 protein). Thus, the primary mechanism by which RNA primers are processed and cleaved in the T4 system differs considerably from that of the eukaryotic FEN-1 enzymes discussed in section 1.1.1.

However, additional studies of a mutant T4 DNA polymerase lacking the 3’ to 5’ exonuclease domain have shown that if the processing polymerase is able to perform strand displacement synthesis upon reaching a downstream Okazaki fragment, the resulting flap is cleaved by the 5’ to 3’ endonuclease activity of the T4 RNase H

(Bhagwat and Nossal, 2001). Interestingly, the T4 RNase H cleaves a double flap substrate (hypothesized in vivo substrate of FEN-1) at a significantly slower rate than a substrate with a single flap of the same sequence (Gangisetty et al., 2005).

During lagging strand DNA synthesis, T4 RNase H by itself functions as a nonprocessive 5’ to 3’ exonuclease, removing a single oligonucleotide (1 to 4 nucleotides in length) each time it binds nucleic acid substrate. During multiple turnover reactions, the nonprocessive degradation of the DNA duplex from the 5’ end continues until a product of 8 to 11 nucleotides remains at the 3’ end (Bhagwat et al., 1997a).

However, when the single stranded DNA binding protein (gene 32 protein) can bind

behind the nuclease, the processivity of the T4 RNase H (functioning as a 5’ to 3’

exonuclease) is increased so that approximately 10-50 oligonucleotides are removed each


time it binds nucleic acid substrate. In the presence of the 32 protein when T4 RNase H

can bind to the substrate multiple times, the processive degradation of the DNA duplex

resulted in the same 8-11 nucleotide product as that discussed above for the

nonprocessive degradation. However, the binding of 32 protein to a single-stranded flap

substrate inhibits the T4 RNase H flap endonuclease activity in an analogous manner to the

inhibition of FEN-1 by RPA bound to a 5’ flap substrate (as discussed in section 1.1.1).

The increased processivity of the T4 RNase H in the presence of the 32 protein is advantageous during lagging strand synthesis because it guarantees that the exonuclease will remove not only the RNA pentamer primers, but also adjacent DNA. In addition to increasing the processivity of the T4 RNase H, the 32 protein has been shown to also increase the processivity of the elongating DNA polymerase (Nossal, 1994). With the 32 protein bound on the lagging strand template, the rate of processive polymerization is approximately 100-fold greater than the rate of processive degradation by the T4 RNase

H (Bhagwat and Nossal, 2001). During the time needed for one round of RNA/DNA degradation by the processive T4 RNase H exonuclease activity, the processive DNA polymerase fills in the resulting gap, displaces the 32 protein, and forms a nick that can be sealed by DNA ligase. Therefore, the extent of processive T4 RNase H degradation is then controlled by the difference in the rates of degradation and processive synthesis by the DNA polymerase (Bhagwat and Nossal, 2001). This control of degradation maintains the fidelity of replication by ensuring that the T4 RNase H does not remove excess adjacent DNA. See Figure 1.7 for a model showing the mechanism for the processing of

RNA pentamer primers and adjacent DNA during T4 lagging strand DNA replication.


Figure 1.7: Model showing the 5’ to 3’ exonuclease cleavage of the RNA pentamer primer and adjacent DNA by the T4 RNase H during T4 lagging strand DNA replication. Prior to completion of the 5’ to 3’ elongation of the upstream Okazaki fragment by the processive T4 DNA polymerase (43)/clamp (45) complex in the presence of single-stranded binding protein (32), the T4 RNase H begins 5’ to 3’ exonuclease degradation of the pentamer RNA primer (shown in red) along with adjacent downstream DNA. The T4 RNase H is a processive exonuclease when 32 protein is bound on the upstream template strand behind it. While the T4 RNase H is removing the RNA primer and approximately 30 adjacent nucleotides, the processing polymerase fills in the resulting gap, displaces the 32 protein from the remaining single stranded template, and creates a nick which can be sealed by T4 ligase.


Recently, studies have indicated that the T4 RNase H is stimulated by its replication sliding clamp (gene 45 protein) in a similar manner to the stimulation of

FEN-1 by the PCNA sliding clamp. The N-terminus of T4 RNase H interacts with the 45 clamp, whereas the C-terminus of FEN-1 interacts with PCNA. However, this stimulation of activity was not required for the normal processing of lagging strand fragments. Specifically, the binding of the 45 clamp did not increase the processivity of the 5’ to 3’ exonuclease activity of the T4 RNase H. Instead, it was shown that the interaction of T4 RNase H with the 32 protein was essential for the processing of lagging strand Okazaki fragments. In mutational studies, a 32 protein lacking its N-terminal domain did not stimulate the processivity of the T4 RNase H. The N-terminal domain of the 32 protein is thought to be important for the binding of adjacent 32 molecules to each other when cooperatively bound to DNA. These results suggest that the cooperative binding of the 32 protein on the single stranded template is required for the processivity of the 5’ to

3’ exonuclease activity of the T4 RNase H during lagging strand Okazaki fragment processing (Gangisetty et al., 2005).

1.2.2 Active site and proposed nuclease reaction

The X-ray crystal structure of the native T4 RNase H enzyme (as shown in

Chapter 4, Figure 4.1) was solved in the presence of two bound magnesium ions in the active site (Mueser et al., 1996). Like other members of the RAD2 family of nucleases, the T4 RNase H active site contained seven highly conserved acidic amino acids. The first magnesium ion (Mg1) was inner-sphere coordinated directly to the carboxylate oxygen atom of D132 and outer-sphere coordinated through an extensive network of water molecules to the side chains of D19, D71, and D155 (as shown in Figure 1.8). The

second magnesium ion (Mg2) was inner-sphere coordinated directly by six water molecules and was outer-sphere coordinated through these water molecules to the side chains of residues D132, D157, and D200 (as shown in Figure 1.8). The Mg1 and Mg2 magnesium ions were separated by approximately 7 Å.

Figure 1.8: Active site of the native T4 RNase H bound to two magnesium ions. The active site of native T4 RNase H is shown bound to two magnesium ions (Mg1 and Mg2 shown in yellow). Mg1 is inner-sphere coordinated to the carboxylate oxygen of D132 and outer-sphere coordinated through water molecules (shown in red) to the side chains of D19, D71, and D155. Mg2 is inner-sphere coordinated by six water molecules and outer-sphere coordinated through these water molecules to the side chains of D132, D157, and D200. PDB 1TFR, (Mueser et al., 1996).

Site-directed mutagenesis studies have shown that the Mg1 magnesium ion is the catalytic site of the T4 RNase H enzyme. The aspartic acid residues (D132, D19, D71, and D155) were each mutated to an asparagine (N) and their respective activities were studied. The mutants D132N, D19N, D71N, and D155N all completely lost both their exonuclease and flap endonuclease activities but still were able to bind DNA substrate.

In contrast, asparagine mutants of the residues that were outer-sphere coordinated to Mg2

(D157 and D200) maintained their catalytic activity. These results suggested that the

Mg2 metal ion could still remain bound in the absence of any one of the outer-sphere


coordinated residues or that the Mg2 metal ion was not required for catalysis (Bhagwat et

al., 1997c). As discussed in section 1.1.1, it has been hypothesized that Mg2 may stabilize the protein near the active site or form a portion of the substrate binding site.

However, without a co-crystal structure of a member of this family of enzymes bound to

DNA, the position and orientation of substrate in the active site is unknown.

Based on the mutagenesis studies of the residues coordinating Mg1 and Mg2, a one-metal mechanism can be proposed for the nuclease activity of T4 RNase H. In this proposed mechanism, an active site acidic amino acid (acting as a general base) abstracts a proton from one of the inner-sphere magnesium-coordinated water molecules, generating a magnesium-coordinated hydroxide ion. The bound magnesium ion can reduce the electrostatic repulsion between the coordinated hydroxide ion and the negatively charged phosphoryl group of the bound substrate promoting nucleophilic attack. Once positioned by the magnesium ion, this hydroxide ion then performs nucleophilic SN2 addition to the electrophilic phosphorus atom which generates a

pentacoordinate phosphorane intermediate. The pentacoordinate transition state

intermediate is stabilized by the magnesium ion. Once the transition state is formed, a

proton is abstracted from a proximal amino acid residue (acting as a general acid) by a

nucleophilic oxygen atom in the pentacoordinate transition state. This amino acid can

then assist in the departure of the 3’-oxygen of the adjacent nucleotide by abstracting the

proton from the phosphorane oxygen, facilitating the cleavage of the scissile

phosphodiester bond. The proposed nuclease reaction is shown in Figure 1.9. Until a

co-crystal structure of the T4 RNase H bound to substrate in the active site is available,

the identities of the specific amino acid residues (general base and general acid catalysts) participating in the nuclease reaction remain speculative.

[Reaction scheme with panels A, B, C, and D; see the caption of Figure 1.9.]

Figure 1.9: Proposed nuclease reaction of T4 RNase H. The bound magnesium ion represents Mg1 in the native T4 RNase H structure. A: Following abstraction of a proton (by an active site acidic amino acid acting as a general base) from one of the inner-sphere magnesium-coordinated water molecules, a magnesium-coordinated hydroxide ion is generated. The magnesium-coordinated hydroxide ion then acts as a nucleophile and attacks the electrophilic phosphorus atom which generates a pentacoordinate phosphorane intermediate. B: The pentacoordinate intermediate is stabilized by the magnesium ion. A proton is abstracted from a proximal amino acid residue (acting as a general acid) by a nucleophilic oxygen atom in the transition state. C and D: This amino acid residue can then assist in the departure of the 3’-oxygen of the adjacent nucleotide by abstracting the proton from the phosphorane transition state oxygen, which facilitates the cleavage of the scissile phosphodiester bond. This proposal is based on the magnesium-bound T4 RNase H X-ray crystal structure. PDB 1TFR, (Mueser et al., 1996).


Chapter 4 of this thesis will discuss the work that was done on the bacteriophage

T4 RNase H. The T4 RNase H enzyme is approximately 35.5 kDa and has a theoretical

pI of 8.61 calculated from the ExPASy ProtParam tool (Gill and von Hippel, 1989). The

T4 RNase H contains 43 negatively charged amino acid residues (Asp and Glu) and 47

positively charged amino acid residues (Arg and Lys). The T4 RNase H enzyme was

used in a number of crystallographic studies aimed at solving the structures of the

metal-free native, metal-free D132N mutant, and fork DNA substrate-bound forms of the

enzyme.
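The charged residue counts quoted above follow directly from the amino acid composition of the protein. A minimal sketch of that tally is given here, assuming a one-letter protein sequence as input. The function name and the short example peptide are hypothetical and are used only for illustration; the peptide is not the T4 RNase H sequence.

```python
from collections import Counter


def count_charged_residues(sequence: str) -> tuple[int, int]:
    """Return (negative, positive) residue counts for a one-letter sequence.

    Negative = Asp (D) + Glu (E); positive = Arg (R) + Lys (K),
    the same grouping used in the text.
    """
    counts = Counter(sequence.upper())
    negative = counts["D"] + counts["E"]
    positive = counts["R"] + counts["K"]
    return negative, positive


# Hypothetical peptide used only to illustrate the call.
example = "MDEKRKAADE"
negative, positive = count_charged_residues(example)
print(f"negative: {negative}, positive: {positive}")
```

An excess of positively charged residues, as reported for T4 RNase H, is consistent with a basic theoretical pI.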

1.3 Escherichia coli DNA-Binding Protein from Starved Cells

Bacterial cells are constantly faced with a number of changing environmental

conditions that can be detrimental to their survival. Environmental conditions such as

low pH, high osmolarity, nutrient deficiency, and oxidative stress require that bacterial

cells have adaptive mechanisms which enable them to survive. As a result, bacteria have

developed a number of regulatory mechanisms by which protein expression levels can be

rapidly adjusted in order to ensure successful adaptation to such environmentally stressful

conditions. In response to nutritional or oxidative stress, Escherichia coli (E. coli) cells express a non-specific DNA-binding protein called Dps (DNA-binding protein from

starved cells) or PexB (Grant et al., 1998). Dps has been shown to be important for the

protection of DNA from oxidative damage (Ceci et al., 2004). In addition, studies have

shown that Dps can also protect DNA against UV light and thermal shock (Martinez and

Kolter, 1997). A number of studies have confirmed that Dps induces a significant

compaction of the E. coli chromosomal DNA in an analogous manner to that of the class

of bacterial nucleoid-associated proteins which includes heat-unstable nucleoid protein


(HU), histone-like nucleoid structuring protein (H-NS), integration host factor (IHF), and

factor for inversion stimulation (Fis) (Azam and Ishihama, 1999).

The X-ray crystal structure of Dps revealed a spherical dodecamer that measures

approximately 90 Å in diameter with a hollow central cavity which is 45 Å in diameter

(PDB 1DPS, 1.6 Å resolution) (Grant et al., 1998). The dodecamer structure is assembled by the packing

of monomers which have a four helix bundle core. The ribbon structure of Dps is shown

in Figure 1.10.


Figure 1.10: Ribbon structure of Dps. A: Ribbon structure of one four helix bundle Dps monomer. B: Ribbon structure of the Dps dodecamer (12 monomers). PDB 1DPS, (Grant et al., 1998). This figure was generated with the programs MOLSCRIPT and RENDER using Raster3D (Kraulis, 1991; Merritt and Bacon, 1997).

Due to the dodecameric structure of Dps, it has been suggested that Dps is a divergent

member of the bacterioferritin/ferritin superfamily. Similar to Dps, a ferritin monomer

consists of a four helix bundle which is assembled in a roughly spherical 24mer cluster

with a diameter of 120 Å containing a central core which is 80 Å in diameter. The


ferritins are a group of iron storage proteins that regulate iron metabolism in living

organisms. Ferritins can store up to 4000 Fe3+ ions in their central core, and they have a

ferroxidase activity in which they can carry out the oxidation of Fe2+ into Fe3+ (Grant et al., 1998). Based on the structural homology between Dps and the ferritins, it has been suggested that Dps may protect DNA from oxidative damage by storing Fe2+ ions that

could potentially generate hydroxyl radicals leading to both DNA single and

double-strand breaks through oxidation of sugar or nucleotide base moieties (Ilari et al.,

2002; Ceci et al., 2004).

Although it has been demonstrated that Dps can bind non-specifically to

double-stranded DNA with a length of 40-64 base pairs with a dissociation constant

(Kd) of 172-178 nM, the actual mode of the Dps/DNA interaction is unknown (Azam and

Ishihama, 1999). The structure of the Dps dodecamer does not reveal an obvious DNA

binding site. Also, the surface of the dodecamer structure contains a significant amount

of negative charge. However, a DNA binding model was hypothesized based on the

packing of the dodecamers in the unit cell of the crystal. It has been suggested that the

N-terminal lysine residues are involved in binding to the DNA when the dodecamer Dps

structure is packed in a pseudo-hexagonal lattice. The hexagonal packing of the

dodecamer results in large solvent channels which are approximately the diameter of a

B-DNA helix. These solvent channels are believed to be occupied by a total of nine lysine residues (three from each dodecamer interface of the solvent channel) which could interact electrostatically with the phosphate backbone of the DNA as it threads through

the channels (Grant et al., 1998). This model was suggested to contribute to the

organization and condensation of the chromosome based on the protein-protein


interactions of Dps in the pseudo-hexagonal lattice. Recently, the N-terminus was shown

to promote the self-aggregation of Dps dodecamers. It was proposed that the N-terminal

lysine residues interact with the negatively charged surface residues of neighboring Dps

molecules thereby promoting self-aggregation. N-terminal deletion studies demonstrated

that the cooperative protein-protein interactions of the Dps dodecamer molecules are

essential to the formation of large Dps/DNA complexes in the presence of plasmid DNA

(Ceci et al., 2004). Additional structural studies are needed in order to characterize the mechanism by which Dps is able to bind DNA.

Chapter 5 of this thesis will discuss the work that was done on E. coli Dps. A Dps

monomer is approximately 18.5 kDa and has a theoretical pI of 5.72 calculated from the

ExPASy ProtParam tool (Gill and von Hippel, 1989). One monomer contains 24

negatively charged amino acid residues (Asp and Glu) and 20 positively charged amino

acid residues (Arg and Lys). A truncated form of E. coli Dps was isolated as an impurity

during the purification of various archaeal FEN-1 enzymes. Following characterization of the impurity by mass spectrometry, subsequent crystallographic studies were aimed at solving the structure of this truncated form of Dps.

Chapter 2: Structure of the Metal-free Aeropyrum pernix (Ape)

Flap Endonuclease-1 (FEN-1)

Protein purification, crystallization, and crystal harvesting of the metal-free

Aeropyrum pernix (Ape) flap endonuclease-1 (FEN-1) have been previously completed.

The goal of this project was to complete X-ray diffraction data collection, data processing, structure determination, model building, and refinement of the metal-free

Ape FEN-1.

2.1 X-ray Diffraction Data Collection

Ape FEN-1 crystals were soaked momentarily in a substitute mother liquor

containing a cryoprotectant and were then flash frozen in liquid nitrogen. The substitute

mother liquor and cryoprotectant solution consisted of the following: 25% (v/v)

2-methyl-2,4-pentanediol, 12% (w/v) polyethylene glycol (PEG) 4000, 25 mM Na PIPES

pH 6.5, 100 mM Na MES pH 5.6, 200 mM sodium formate, 50 mM disodium hydrogen

citrate, 50 mM KCl, and 50 mM NH4Cl. Multiple native data sets were collected at

BioCARS 14-BM-C (Argonne National Laboratory, Advanced Photon Source,

Chicago, IL, USA) using an ADSC Quantum 4 CCD detector. An initial 1.9 Å X-ray

diffraction data set had been collected and was used in molecular replacement for structure determination. The best data set that was collected diffracted to 1.4 Å

resolution (see Figure 2.1 for an X-ray diffraction image of a metal-free Ape FEN-1

crystal). When collecting X-ray diffraction data on a CCD detector, it is useful to collect


both a high and a low resolution data set to ensure that the data is complete at both high

and low resolution. High resolution data collection using a longer X-ray beam exposure

time can sometimes cause a CCD detector to become saturated depending on the flux of

the X-ray source. Thus, a second or third data collection pass can be completed using a

shorter exposure time to ensure that all low resolution data is collected with a high

completeness. The high resolution data were collected first using a 0.9 Å wavelength, a crystal-to-detector distance of 120 mm, 0.5° oscillations, 15 second exposures, for 150 frames at -173 °C followed by a low resolution data collection using a 0.9 Å wavelength, a crystal-to-detector distance of 170 mm, 1.0° oscillations, 10 second exposures, for 100 frames at -173 °C. During integration of the low resolution data set, a number of reflections were rejected due to oversaturation of the CCD detector. A third data set was then collected using a 0.9 Å wavelength, a crystal-to-detector distance of 170 mm, 1.0° oscillations, 2 second exposures, for 100 frames at -173 °C.
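The crystal-to-detector distance sets the highest resolution that can reach the detector face. The following sketch illustrates that relationship using Bragg's law for a flat detector centered on the direct beam. The 94 mm half-width of the detector face is an assumed value that is not stated in the text, so the numbers should be read as indicative only.

```python
import math


def edge_resolution(wavelength_A: float, distance_mm: float,
                    half_width_mm: float) -> float:
    """Resolution (in Å) at the detector edge for a flat, beam-centered detector.

    The scattering angle 2θ satisfies tan(2θ) = r / D, and Bragg's law
    gives d = λ / (2 sin θ).
    """
    two_theta = math.atan(half_width_mm / distance_mm)
    return wavelength_A / (2.0 * math.sin(two_theta / 2.0))


HALF_WIDTH_MM = 94.0  # assumption: half the edge length of the detector face

for distance in (120.0, 170.0):  # distances quoted in the text
    d = edge_resolution(0.9, distance, HALF_WIDTH_MM)
    print(f"{distance:.0f} mm -> {d:.2f} Å at the detector edge")
```

Under this assumption, the 120 mm distance places roughly 1.4 Å at the detector edge and the 170 mm distance roughly 1.8 Å, which is consistent with using the longer distance for the low resolution passes.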


Figure 2.1: Metal-free Ape FEN-1 crystal and X-ray diffraction image. A: Metal-free Ape FEN-1 diffraction quality crystal grown in 12% (w/v) polyethylene glycol (PEG) 4000, 100 mM Na MES pH 5.6, and 200 mM sodium formate. B: X-ray diffraction image of Ape FEN-1 crystal. Data were collected at BioCARS 14-BM-C (Argonne National Laboratory, Advanced Photon Source, Chicago, IL, USA).


Perhaps the most important factor in assessing the quality of X-ray diffraction data collection is the completeness. A data set is complete if the Ewald sphere has been crossed by all reflections (or symmetrically related reflections) in the asymmetric part of the reciprocal lattice (Dauter, 1999). Based on the indexing of two or three initial diffraction images, one can identify the crystal symmetry with some accuracy along with the orientation of the crystal relative to the X-ray beam and the detector. An appropriate rotation range for the data collection can then be determined so that all unique reflections are measured at least once. For the metal-free Ape FEN-1 crystals, 75 degrees of high resolution data were collected. Because of the hexagonal symmetry, a 60 degree rotation range was required in order to collect all unique reflections at least once; the additional 15 degrees of rotation provided multiple measurements of equivalent reflections along the principal axis. A higher amount of redundancy (multiple measurements of equivalent reflections) in an X-ray diffraction data set generally leads to more accurate data (Dauter, 1999).
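The rotation coverage described above can be checked with simple arithmetic. The short sketch below uses the frame count and oscillation width quoted for the high resolution pass in section 2.1 together with the 60 degree minimum stated in the text. The function name is illustrative only.

```python
def total_rotation(frames: int, oscillation_deg: float) -> float:
    """Total crystal rotation covered by one data collection pass, in degrees."""
    return frames * oscillation_deg


required = 60.0                        # minimum range stated in the text
collected = total_rotation(150, 0.5)   # high resolution pass
extra = collected - required
print(f"collected {collected:.0f} degrees, {extra:.0f} beyond the minimum")
```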

2.2 Data Processing

Data processing of X-ray diffraction data is composed of three stepwise procedures: indexing, integration, and scaling. Indexing involves the determination of the crystal lattice type, the crystal unit cell parameters, and the crystal orientation parameters from a single oscillation image. During indexing, Miller indices (h k l) are assigned for each reflection in order to determine the reciprocal space symmetry (Laue symmetry). The reciprocal space symmetry can then be analyzed to determine the real space or Bravais lattice of the crystal. Once the crystal lattice type and unit cell parameters are known, the intensities of all reflections are then integrated on each


respective oscillation image at an appropriate resolution range. Following integration,

scaling is then completed in order to merge all unique reflections throughout the

resolution range of the data collection.

An initial X-ray diffraction data set of the metal-free Ape FEN-1 had been

collected and was indexed, integrated, and scaled to 1.9 Å resolution. The data were

integrated using DENZO and merged using SCALEPACK from the HKL software

(Otwinowski and Minor, 1997). The metal-free Ape FEN-1 crystals belong to the

hexagonal space group P61 (systematic absences along the 00l axis consistent with a

sixfold screw), with unit cell dimensions a = 92.8, b = 92.8, c = 80.9 Å with α = β = 90°

and γ = 120°. Calculation of the Matthews’ coefficient (VM = 2.51 Å³ Da⁻¹) indicated

that there was one monomer per asymmetric unit in the crystal (Matthews, 1968). The

Matthews’ coefficient allows an approximate calculation of the solvent content in the

crystal, and is therefore useful for determining the expected number of molecules in the

asymmetric unit of the crystal. The data had an overall Rmerge of 3.6% with a

completeness of 98.8%. The Rmerge value for the highest resolution bin (1.9 Å) was

11.4% and had an intensity/error (I/σ) value of 12.7. The final scale file was then used to create a CNS data file (h k l, Fobs, and Sigma Fobs) containing an Rfree random data subset of 3% for use in the refinement software suite, Crystallography & NMR System (CNS)

(Brunger, 1992; Brunger et al., 1998).
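The Matthews' coefficient quoted above can be approximately reproduced from the unit cell parameters. The sketch below assumes a molecular weight of 40,000 Da for Ape FEN-1, a value that is not given in this section, and uses the six symmetry operations of space group P61. The solvent estimate uses the common approximation of 1 - 1.23/VM.

```python
import math


def matthews_coefficient(a: float, b: float, c: float, gamma_deg: float,
                         n_sym_ops: int, n_molecules: int,
                         mol_weight_da: float) -> float:
    """V_M in Å^3/Da for a unit cell with alpha = beta = 90 degrees."""
    volume = a * b * c * math.sin(math.radians(gamma_deg))
    return volume / (n_sym_ops * n_molecules * mol_weight_da)


def solvent_fraction(v_m: float) -> float:
    """Approximate solvent fraction of the crystal, 1 - 1.23 / V_M."""
    return 1.0 - 1.23 / v_m


# P61 has six symmetry operations; 40,000 Da is an assumed molecular weight.
v_m = matthews_coefficient(92.8, 92.8, 80.9, 120.0, 6, 1, 40000.0)
print(f"V_M = {v_m:.2f} Å^3/Da, solvent fraction = {solvent_fraction(v_m):.2f}")
```

With two molecules in the asymmetric unit the same calculation would give a value near 1.26 Å³ Da⁻¹, which corresponds to almost no solvent, so a single monomer is the more plausible choice.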

The best data set collected diffracted to 1.4 Å and was indexed, integrated, and scaled to that resolution. As before, the data were integrated using DENZO and merged using SCALEPACK from the HKL software (Otwinowski and Minor, 1997). These metal-free Ape FEN-1 data belong to the same hexagonal space group P61, with


identical unit cell dimensions a = 92.8, b = 92.8, c = 80.9 Å with α = β = 90° and

γ = 120°. The data had an overall Rmerge of 5.3% with a completeness of 93.9%. The

Rmerge value for the highest resolution bin (1.4 Å) was 18.5% and had an intensity/error

(I/σ) value of 4.4. See Table 2.1 for all metal-free Ape FEN-1 crystallographic data. The

final scale file was then used to create a CNS style and a CCP4 style data file (h k l, Fobs, and Sigma Fobs) both containing the same Rfree random data subset of 3% for use in the

refinement software suites Crystallography & NMR System (CNS) (Brunger, 1992;

Brunger et al., 1998) and CCP4 (Collaborative Computational Project,

1994), respectively. During the creation of the respective data files (CNS and CCP4), the

observed structure factors (Fobs) for each h k l are calculated by taking the square root of

the measured intensity for each individual reflection (Ihkl). The Rfree random data set is a

small fraction of the data that is not used in the refinement process; however, an Rvalue is still calculated for this small test set. It is then possible to compare the value of the conventional Rvalue with the Rfree value to determine if the model is being correctly

constructed. Because the reflections used for the calculation of the Rfree are not used in

the refinement process, the conventional Rvalue is expected to be slightly lower than the

Rfree value. Therefore, if the structure is an accurate model of the data, the Rvalue and the

Rfree value will have similar values because the structure factors used in the refinement

must be highly correlated to those not used (Brunger, 1997; Kleywegt and Jones, 1997).
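The relationship between the conventional Rvalue and the Rfree value can be illustrated with a short calculation. The sketch below is a simplified illustration and not the CNS or CCP4 implementation: it assumes that the calculated amplitudes are already on the same scale as the observed amplitudes, and the function names and example amplitudes are hypothetical.

```python
import random


def r_factor(f_obs, f_calc):
    """Conventional R: sum of ||Fobs| - |Fcalc|| divided by sum of |Fobs|."""
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator


def split_free_set(n_reflections, fraction=0.03, seed=0):
    """Randomly flag a fraction of reflection indices as the Rfree test set."""
    rng = random.Random(seed)
    n_free = max(1, round(n_reflections * fraction))
    free = set(rng.sample(range(n_reflections), n_free))
    work = [i for i in range(n_reflections) if i not in free]
    return work, sorted(free)


def r_work_and_free(f_obs, f_calc, work, free):
    """R computed separately over the working and the test reflections."""
    def pick(values, indices):
        return [values[i] for i in indices]
    return (r_factor(pick(f_obs, work), pick(f_calc, work)),
            r_factor(pick(f_obs, free), pick(f_calc, free)))


# Hypothetical amplitudes used only to illustrate the calculation.
f_obs = [100.0, 80.0, 60.0, 40.0]
f_calc = [90.0, 85.0, 60.0, 45.0]
print(f"R = {r_factor(f_obs, f_calc):.3f}")
```

Because the test reflections never enter the refinement target, a large gap between the two values is a warning that the model is being fitted to noise rather than to the data.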


Table 2.1: Metal-free Aeropyrum pernix (Ape) flap endonuclease-1 (FEN-1) crystallographic data.

Lattice type            Primitive hexagonal
Space group             P61
Asymmetric unit         One molecule
Cell dimensions         a = 92.8, b = 92.8, c = 80.9 Å, α = β = 90° and γ = 120°

                        Data Set #1     Data Set #2
Resolution              1.9 Å           1.4 Å
Rmerge (a)              3.6%            5.3%
Observed reflections    215,799         925,349
Unique reflections      31,105          74,361
Completeness            98.8%           93.9%

(a) $R_{merge} = 100 \times \left[ \sum_{hkl} \sum_{i=1}^{n} \left| F^{2}(hkl) - \overline{F^{2}(hkl)}_{i} \right| \right] / \sum_{hkl} \sum_{i=1}^{n} F^{2}(hkl)$, where $F^{2}(hkl)$ is the intensity of the hkl reflection and $\overline{F^{2}(hkl)}_{i}$ is the mean value of the i multiple measurements of the n equivalent reflections.
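The Rmerge statistic defined in the table footnote can be sketched in a few lines. The example below is a minimal illustration and not the SCALEPACK implementation: it assumes that symmetry-equivalent reflections have already been mapped to the same unique hkl index, and the intensities are hypothetical.

```python
from collections import defaultdict


def r_merge(observations):
    """Rmerge (%) from (hkl, intensity) pairs.

    Symmetry-equivalent reflections are assumed to share the same hkl key.
    """
    groups = defaultdict(list)
    for hkl, intensity in observations:
        groups[hkl].append(intensity)

    numerator = 0.0
    denominator = 0.0
    for intensities in groups.values():
        mean = sum(intensities) / len(intensities)
        numerator += sum(abs(i - mean) for i in intensities)
        denominator += sum(intensities)
    return 100.0 * numerator / denominator


# Hypothetical intensities for two unique reflections, each measured twice.
obs = [((1, 0, 0), 100.0), ((1, 0, 0), 110.0),
       ((0, 0, 6), 50.0), ((0, 0, 6), 40.0)]
print(f"Rmerge = {r_merge(obs):.1f}%")
```

Dividing the observed by the unique reflection counts in Table 2.1 gives an average multiplicity of roughly 6.9 for the 1.9 Å data set and roughly 12.4 for the 1.4 Å data set.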

2.3 Structure Determination

Using the 1.9 Å data set (described in section 2.2), the initial phasing was solved by molecular replacement using AMoRe (Navaza and Saludjian, 1997) with the structure of the Pyrococcus furiosus (Pfu) flap endonuclease-1 (FEN-1) (1B43, 2.00 Å resolution) as the search model (Hosfield et al., 1998b). The Pfu FEN-1 structure is shown in Figure

2.3. As a general guideline, molecular replacement tends to be more successful if there is at least a 30-35% sequence identity between the protein of interest and the chosen search model. An amino acid primary sequence alignment between the Ape and Pfu FEN-1 proteins is shown in Figure 2.2. During molecular replacement, the Pyrococcus furiosus flap endonuclease-1 model was positioned into the metal-free Ape FEN-1 crystal lattice using a rotation search followed by a translation search. Possible orientations of the search model are determined during the rotation search. The translation search is then used to place an oriented molecule into the unit cell. Lastly, a rigid body refinement is

completed to optimize the rotational orientation of the search model following the translation search. Once a molecular replacement solution is obtained from the rotation and translation search, phase information can be calculated from the model and combined with the experimentally observed amplitudes. The molecular replacement rotation and translation searches were done using one monomer from 15.0 to 3.0 Å resolution. Generally, a lower resolution search is done during molecular replacement because lower resolution reflections describe the overall features of the structure, such as secondary structural elements. Very low resolution reflections (10 Å or lower) are mainly influenced by the bulk solvent. The molecular replacement results for the space group P61 gave an Rfactor of

48.6% and a correlation coefficient (CC) of 41.8%. The P61 space group was confirmed by comparison of the molecular replacement results using space groups P6 (Rfactor of

54.2%, CC of 14.0%) and P65 (Rfactor of 54.4%, CC of 23.3%). Following molecular replacement, the Pfu FEN-1 model was transformed into the Ape FEN-1 sequence using a threading program from the Genemine software package

(http://www.bioinformatics.ucla.edu/genemine/) and an output coordinate model of the metal-free Ape FEN-1 was generated.
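The comparison of candidate space groups described above amounts to ranking the molecular replacement solutions by their agreement statistics. The sketch below shows a plain Pearson correlation between observed and calculated amplitudes and then ranks the three space groups using the values reported in the text. The scoring used internally by AMoRe may be defined or normalized differently, so the function is an illustration only.

```python
import math


def correlation_coefficient(f_obs, f_calc):
    """Pearson correlation between observed and calculated amplitudes."""
    n = len(f_obs)
    mean_o = sum(f_obs) / n
    mean_c = sum(f_calc) / n
    cov = sum((o - mean_o) * (c - mean_c) for o, c in zip(f_obs, f_calc))
    var_o = sum((o - mean_o) ** 2 for o in f_obs)
    var_c = sum((c - mean_c) ** 2 for c in f_calc)
    return cov / math.sqrt(var_o * var_c)


# Scores reported in the text for the three candidate space groups,
# given as (R-factor %, correlation coefficient %).
reported = {"P61": (48.6, 41.8), "P6": (54.2, 14.0), "P65": (54.4, 23.3)}

# Prefer the highest correlation coefficient; the lower R-factor breaks ties.
best = max(reported, key=lambda sg: (reported[sg][1], -reported[sg][0]))
print(best)
```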

Ape:   1 MGVNLRELIPPEARREVELRALSGYVLALDAYNMLYQFLTAIRQPDGTPLLDREGRVTSH  60
         MGV + E+IP   R+E+EL  L G  +A+DA N +YQFL+ IRQ DGTPL+D +GR+TSH
Pfu:   1 MGVPIGEIIP---RKEIELENLYGKKIAIDALNAIYQFLSTIRQKDGTPLMDSKGRITSH  57

Ape:  61 LSGLFYRTINLVEEGIKPVYVFDGKPPEMKSREVEERLRRKAEAEARYRRAVEAGEVEEA 120
         LSGLFYRTINL+E GIKPVYVFDG+PPE K +E+E R   + EAE + R A+E GE+EEA
Pfu:  58 LSGLFYRTINLMEAGIKPVYVFDGEPPEFKKKELEKRREAREEAEEKWREALEKGEIEEA 117

Ape: 121 RKYAMMAARLTSDMVEESKELLDAMGMPWVQAPAEGEAQAAYMARKGDAWATGSQDYDSL 180
         RKYA  A R+   ++E++K+LL+ MG+P VQAP+EGEAQAAYMA KG  +A+ SQDYDSL
Pfu: 118 RKYAQRATRVNEMLIEDAKKLLELMGIPIVQAPSEGEAQAAYMAAKGSVYASASQDYDSL 177

Ape: 181 LFGSPRLVRNLAITGRRKLPGRDQYVEIKPE-II-ELEPLLSKLGITREQLIAVGILLGTDY 240
         LFG+PRLVRNL ITG+RKLPG++ YVEIKPE II E      KL  TRE LI + IL+GTDY
Pfu: 178 LFGAPRLVRNLTITGKRKLPGKNVYVEIKPELIILEEVLKELKL--TREKLIELAILVGTDY 237

Ape: 241 NPGGVRGYGPKTALRLVKSLGDPMKVLASVPRGEYDPDYLRKVYEYFLNPPVTDDYKIEF 300
         NPGG++G G K AL +V+   DP+       + + D D L  + E+FLNPPVTD+Y + +
Pfu: 238 NPGGIKGIGLKKALEIVRHSKDPLAKF----QKQSDVD-LYAIKEFFLNPPVTDNYNLVW 292

Ape: 301 RKPDQDKVREILVERHDFNPERVERALERLGKAYREKLRGRQSRLDMWFG 350
         R PD++ + + L + HDF+ ERV+  LERL KA +    G+QS L+ WF
Pfu: 293 RDPDEEGILKFLCDEHDFSEERVKNGLERLKKAIKS---GKQSTLESWFKR 341

Figure 2.2: Primary sequence alignment of Ape FEN-1 and Pfu FEN-1. A total of 199 residues out of 350 (57%) were identical and are shown in blue. A total of 257 residues out of 350 (73%) were either identical or positive (conservative) substitutions.
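The identity and similarity percentages quoted for the alignment can be checked with simple arithmetic. The sketch below is illustrative only: the helper name `percent_identity` and the toy sequences are hypothetical, and it assumes that percentages are expressed relative to the 350-residue Ape FEN-1 sequence, as the counts in the caption suggest.

```python
def percent_identity(seq_a, seq_b, reference_length=None):
    """Count identical columns in two pre-aligned sequences ('-' marks a gap)."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    identical = sum(1 for a, b in zip(seq_a, seq_b) if a == b and a != "-")
    denom = reference_length if reference_length is not None else len(seq_a)
    return identical, 100.0 * identical / denom

# Counts quoted in the Figure 2.2 caption, relative to the 350-residue Ape FEN-1
identity_pct = 100.0 * 199 / 350   # rounds to 57%
positive_pct = 100.0 * 257 / 350   # rounds to 73%

# Toy aligned pair (hypothetical), four identical columns out of five
print(percent_identity("ACD-E", "ACDFE"))
```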


Figure 2.3: Ribbon diagram of the Pyrococcus furiosus (Pfu) flap endonuclease-1 (FEN-1) structure. The structure shown is that of Pfu FEN-1 in the presence of bound divalent cations in the active site. The N-terminus is shown in blue and can be traced to the C-terminus shown in red. The metal sites are reported in the manuscript but are not reported in the PDB coordinates (1B43, 2.00 Å resolution; Hosfield et al., 1998b). The Pfu FEN-1 was used as the search model in molecular replacement to solve the metal-free Ape FEN-1 structure. This figure was generated with the programs MOLSCRIPT and RENDER using Raster3D (Kraulis, 1991; Merritt and Bacon, 1997).

2.4 Model Building and Refinement

Model building and refinement are essential steps in the structure determination process. The goal of crystallographic refinement is to optimize the agreement of an atomic model with both the observed X-ray diffraction data and chemical restraints by minimizing the difference between observed and calculated structure factor amplitudes (Kleywegt and Jones, 1997; Brunger and Rice, 1997). The molecular replacement method of structure determination results in an initial model that is biased toward the search model, because the initial phases are calculated from that model. The initial model that is then built into an electron density map may therefore contain errors. In order to produce an accurate, unbiased model that reflects the experimentally observed data, model building and crystallographic refinement must be carried out as a cyclic process that leads to an improved model. Thus, the goal of model building and refinement is to build a model that accurately explains the experimental observations while making physical, chemical, and biological sense. An accurate model is one that fits the respective electron density map as well as the quality of the map allows while displaying correct stereochemistry of both main chain and side chain conformations (torsion and side chain rotamer angles) along with preferred environments of amino acid residues (Kleywegt and Jones, 1997; Jones and Kjeldgaard, 1997).

Following molecular replacement, the output coordinate file generated from the rigid body refinement following the rotation and translation search was used for the initial refinement of the 1.9 Å metal-free Ape FEN-1 data. Simulated annealing (CNS) was used initially in the refinement of the metal-free Ape FEN-1. CNS simulated annealing is a nonlinear refinement method that allows the model to explore a large range of conformations. By coupling a simulated high temperature with subsequent slow cooling, a sufficient amount of calculated kinetic energy is applied to the model so that amino acid side chain and backbone conformational changes can occur and energy barriers can be crossed. Because the model is allowed to cross energy barriers, it is more likely to reach a lower, or possibly the global, energy minimum (Brunger et al., 1997; Brunger and Rice, 1997). Simulated annealing was therefore an appropriate method of refinement for the metal-free Ape FEN-1, considering the differences between the amino acid primary sequences of the Ape FEN-1 and the Pfu FEN-1. During simulated annealing refinement, model bias toward the Pfu structure can be minimized by allowing the Ape FEN-1 model to explore a large number of conformations to account for any structural changes due to the differences in the primary sequences of Ape and Pfu FEN-1.
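The idea that a high starting temperature lets a model cross energy barriers can be illustrated with a toy calculation. The sketch below is not the CNS protocol: CNS simulated annealing is based on molecular dynamics, whereas this example substitutes a simple Metropolis Monte Carlo search on a hypothetical one-dimensional double-well energy function. The energy function, the scale factor `k`, the move size, and all function names are assumptions chosen for illustration.

```python
import math
import random

def energy(x):
    """Toy 1-D double well: local minimum near x = +1, deeper minimum near x = -1."""
    return (x * x - 1.0) ** 2 + 0.3 * x

def anneal(x0, t_start=3000.0, t_end=273.0, t_step=100.0, moves_per_temp=200, seed=0):
    """Metropolis annealing with a linear cooling schedule (toy temperature units)."""
    rng = random.Random(seed)
    k = 1.0e-3  # hypothetical factor converting "temperature" to toy energy units
    x, e = x0, energy(x0)
    best_x, best_e = x, e
    t = t_start
    # Cool in fixed steps; the loop stops at the last step at or above t_end
    while t >= t_end:
        for _ in range(moves_per_temp):
            x_new = x + rng.uniform(-0.2, 0.2)
            e_new = energy(x_new)
            # Downhill moves are always accepted; uphill moves with Boltzmann probability
            if e_new <= e or rng.random() < math.exp(-(e_new - e) / (k * t)):
                x, e = x_new, e_new
                if e < best_e:
                    best_x, best_e = x, e
        t -= t_step
    return best_x, best_e

# Start in the shallower well at x = +1 and record the lowest energy visited
best_x, best_e = anneal(1.0)
print(best_x, best_e)
```

At the high starting temperature the barrier between the two wells is small compared with the thermal energy, so the search is free to leave the starting well; as the temperature falls, uphill moves become rare and the model settles into whichever minimum it occupies.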

The first round of refinement was completed using CNS simulated annealing at 4000 K with a 100 K constant cooling to 273 K using all data from 45.0-1.9 Å resolution, yielding an Rvalue of 36.7% and an Rfree of 39.9%. During the simulated annealing refinement, a composite omit electron density map was calculated at 1.9 Å resolution along with Fobs-Fcalc and 2Fobs-Fcalc difference electron density maps. Electron density maps are calculated using fast Fourier transforms to evaluate the electron density (ρ) at any given position in space (eq. 1).

$$\rho(x, y, z) = \frac{1}{V} \sum_{hkl} |F_{hkl}| \cos\left[2\pi(hx + ky + lz) - \varphi_{hkl}\right] \qquad \text{(eq. 1)}$$
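Equation 1 can be evaluated directly as a sum over reflections. The sketch below is a slow, direct summation intended only to make the terms of eq. 1 concrete; production programs use fast Fourier transforms on a grid instead. The function name, the tuple layout of each reflection, and the single-reflection example are illustrative assumptions.

```python
import math

def electron_density(x, y, z, reflections, volume):
    """Evaluate eq. 1 at fractional coordinates (x, y, z).

    reflections: iterable of (h, k, l, amplitude, phase_in_radians).
    volume: unit cell volume V.
    """
    total = 0.0
    for h, k, l, amplitude, phase in reflections:
        total += amplitude * math.cos(2.0 * math.pi * (h * x + k * y + l * z) - phase)
    return total / volume

# Hypothetical single reflection (1 0 0) with unit amplitude and zero phase
toy = [(1, 0, 0, 1.0, 0.0)]
print(electron_density(0.0, 0.0, 0.0, toy, 1.0))   # density at the crest of the wave
print(electron_density(0.5, 0.0, 0.0, toy, 1.0))   # density at the trough of the wave
```

Changing the phase of the reflection shifts where the crest falls in the cell, which is why the calculated phases taken from a search model can bias the resulting map.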

The composite omit map was used initially in model building to reduce bias toward the Pfu FEN-1 structure. In order to remove errors in the atomic model from the molecular replacement, a composite omit electron density map covering the entire model was created. The composite omit map was calculated by omitting small regions (approximately 5-10%) of the model during simulated annealing refinement. A small map was then calculated for each respective omitted portion of the model until the entire model had been omitted. The small omit maps corresponding to all omitted regions in the model were then accumulated and written out as a continuous composite map that covered the whole model. If the quality of the composite omit electron density map is good and shows appropriate connectivity, the resulting electron density is most likely real and not due to model bias (Bhat, 1988; Kleywegt and Jones, 1997). Due to potentially large structural differences between the molecular replacement search molecule and the initial atomic model, difference electron density maps alone are not sufficient to remove all errors or phase bias. Difference electron density maps are created using calculated phase information that can potentially bias the refinement process toward the current atomic model even if questionable regions of the atomic model are removed from refinement (Brunger and Rice, 1997). Therefore, the electron density interpretation of the Ape FEN-1 model during the first model building round relied primarily on the simulated annealing composite omit electron density map at 1.9 Å resolution. The quality of the 1.9 Å composite omit electron density map is shown in Figure 2.4.
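The bookkeeping behind a composite omit map, in which each part of the model is left out exactly once, can be sketched as follows. This shows only the partitioning step under the assumption that omitted regions are consecutive runs of residues covering about 5% of the model each; how a given program actually defines the omitted regions and computes each partial map is not represented here, and the function name is hypothetical.

```python
import math

def omit_regions(residues, fraction=0.05):
    """Split residues into consecutive chunks, each about `fraction` of the model."""
    size = max(1, math.ceil(len(residues) * fraction))
    return [residues[i:i + size] for i in range(0, len(residues), size)]

residues = list(range(1, 351))          # Ape FEN-1 numbering, 350 residues
regions = omit_regions(residues, 0.05)  # each region would be omitted in turn

# Every residue should be omitted exactly once across the composite map
covered = [r for region in regions for r in region]
print(len(regions), covered == residues)
```

Because the density for each region is calculated while that region is absent from the model, density that reappears in the composite map is unlikely to be an artifact of the model itself.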

Figure 2.4: Composite omit electron density map quality. A region of the Ape FEN-1 model is shown in yellow to show the quality of the composite omit electron density map at 1.9 Å resolution.

The first round of model building was then completed with molecular graphics using the O program (Jones et al., 1991). Following refinement, the σ levels of all difference electron density maps were normalized using the MAPMAN program (Kleywegt and Jones, 1996). Fobs-Fcalc and 2Fobs-Fcalc electron density maps were used in all rounds of model building. The first round of modeling was completed with approximately 62% of the residues in place. A total of four rounds of CNS simulated annealing refinement (3000 K with a 100 K constant cooling to 273 K using all data from 45.0-1.9 Å resolution) and four rounds of model building were completed with a total of 296 out of 350 amino acid residues in place with an Rvalue of 30.9% and an Rfree of 34.0%. Several regions of the Ape FEN-1 model were not modeled due to disordered electron density. The two largest regions missing from the Ape FEN-1 model were amino acids 98-128 in the bridge region directly above the active site and amino acids 339-350 at the C-terminus. Several smaller disordered regions were seen between areas where the electron density was interpretable. To help in interpreting the missing segments in the model, a bulk solvent correction simulated annealing refinement was also completed in addition to both the third and fourth CNS simulated annealing refinements.

A bulk solvent correction imposes a scaling factor to account for the overall solvent content in the crystal. Protein crystals can contain between ~30% and ~70% solvent, most of which is disordered in the solvent channels between the protein molecules of the crystal lattice (Matthews, 1968). This disordered or bulk solvent must be accounted for because the electron density of protein molecules is surrounded by this continuous bulk solvent electron density. If no model for this continuous bulk solvent electron density is taken into consideration, atomic protein models are artificially placed in a "vacuum" environment, leading to a vast overestimation of the electron density contrast at the protein surface. This can lead to calculated structure factor amplitudes which are systematically much larger than the observed structure factor amplitudes at low resolution (d-spacings larger than ~5 Å). The deviation between calculated and observed structure factor amplitudes may lead to severe problems in electron density difference map calculations. The electron density of missing portions of the model surrounded by bulk solvent may have a low signal-to-noise ratio and may not be seen using Fobs-Fcalc and 2Fobs-Fcalc difference electron density maps. Thus, the most important benefit of a bulk solvent correction is the enhanced signal-to-noise ratio of electron density difference maps for missing parts of the model (Rossmann and Arnold, 2001). A bulk solvent correction can be applied during refinement only if the boundary that separates the protein and the solvent is known. As a result, it is suggested to use a bulk solvent correction only when a majority of the model is in place in the electron density. In the case of the Ape FEN-1 model, 272 amino acids (77%) were in place prior to the use of the bulk solvent correction. The difference electron density maps calculated from the bulk solvent corrections were used in addition to the non-bulk solvent difference electron density maps during model building.

An example of the cyclic process of model building and refinement, and how this process improved a missing region of the Ape FEN-1 model, is shown in Figure 2.5. Residues 8-11, corresponding to amino acids LIPP, could not be modeled into the initial simulated annealing composite omit electron density map. However, residues 8-11 were modeled with confidence following four rounds of model building and refinement. It is advantageous when model building to build outward from regions or segments of amino acids that are well ordered in electron density maps. When these additional residues are included in the next round of refinement, the resulting electron density will show improvement in the region where additional modeling occurred if a correct interpretation of the density was made. It is then possible to interpret regions of a model that were previously missing. Electron density improvement was seen at residues 8-11 following each round of refinement. The side chains of Leu-8 and Ile-9 as well as the rings of Pro-10 and Pro-11 possess well ordered electron density in the 2Fobs-Fcalc and bulk solvent correction 2Fobs-Fcalc electron density maps.
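The resolution dependence of the bulk solvent correction described above can be made concrete with the widely used flat (mask-based) solvent model, in which the total structure factor is F_protein + k_sol · exp(−B_sol · s²/4) · F_mask with s = 1/d. The sketch below assumes that model; the parameter values k_sol = 0.35 e/Å³ and B_sol = 50 Å² are typical illustrative values, not values taken from this work, and the function names are hypothetical.

```python
import math

def solvent_scale(d_spacing, k_sol=0.35, b_sol=50.0):
    """Weight k_sol * exp(-B_sol * s^2 / 4) on the solvent-mask term, with s = 1/d."""
    s_squared = 1.0 / (d_spacing ** 2)
    return k_sol * math.exp(-b_sol * s_squared / 4.0)

def corrected_structure_factor(f_protein, f_mask, d_spacing, k_sol=0.35, b_sol=50.0):
    """Flat-solvent model: F_total = F_protein + scale(d) * F_mask (complex values)."""
    return f_protein + solvent_scale(d_spacing, k_sol, b_sol) * f_mask

# The solvent contribution is largest for low resolution (large d) reflections
for d in (20.0, 10.0, 5.0, 2.0):
    print(f"d = {d:5.1f} A  solvent weight = {solvent_scale(d):.3f}")
```

The weight falls off rapidly toward high resolution, which is consistent with the observation that the uncorrected discrepancy between calculated and observed amplitudes is concentrated in the low resolution reflections.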

[Figure 2.5 panels, in order: Round 1, CNS simulated annealing refinement (1σ composite omit map) followed by model building; Round 2, CNS simulated annealing refinement (1σ 2Fobs-Fcalc map) followed by model building; Rounds 3 and 4, CNS simulated annealing refinement with and without bulk solvent correction (1σ 2Fobs-Fcalc and bulk solvent 1σ 2Fobs-Fcalc maps), each followed by model building.]

Figure 2.5: Ape FEN-1 model building and refinement example. Electron density map improvement of amino acid residues 8-11 (LeuIleProPro) in the Ape FEN-1 model after four rounds of CNS simulated annealing refinement and model building. Electron density maps (composite omit, 2Fobs-Fcalc, and Bulk solvent 2Fobs-Fcalc) contoured at 1.0σ are shown following the respective round of refinement. The Ape FEN-1 model showing the final coordinates of residues 8-11 is shown in yellow.


Upon completion of model building and refinement at 1.9 Å, the resolution of the native Ape FEN-1 structure was extended to 1.4 Å. Using the 1.4 Å Ape FEN-1 data, a fifth round of refinement was completed using CNS simulated annealing at 3000 K with a 100 K constant cooling to 273 K using all data from 45.0-1.4 Å resolution, yielding an Rvalue of 31.5% and an Rfree of 32.5%. The slight increase in the Rvalue following the extension of the resolution to 1.4 Å was an indication that the higher resolution data were contributing additional information in the electron density that still needed modeling. Additional difference electron density was present near the missing amino acids 98-128 of the bridge region following extension of the resolution (see Figure 2.6). The new electron density was interpreted and modeled as an extension of the α helical bridge region that extended from residue 128. Eight additional residues were modeled adjacent to residue 129 in this region during round five of model building. Three additional rounds (rounds six, seven, and eight) of CNS simulated annealing refinement and two additional rounds of model building (rounds six and seven) were then completed.

Figure 2.6: Extension of Ape FEN-1 resolution to 1.4 Å. Additional difference electron density (Fobs-Fcalc shown in green (3.0σ) and 2Fobs-Fcalc shown in light blue (1.0σ)) was present following extension of the resolution to 1.4 Å. The new density was present where residues 98-128 were missing in the Ape FEN-1 model. The Ape FEN-1 model following extension of the resolution is shown in yellow.


All simulated annealing refinements were completed at 3000 K with a 100 K constant cooling to 273 K using all data from 45.0-1.4 Å resolution. Following the fifth model building cycle, the CNS water_pick program was used prior to the sixth round of CNS simulated annealing refinement to automatically pick water molecule coordinate positions in the metal-free Ape FEN-1 model (Read, 1986; Kleywegt and Brunger, 1996).

As a general rule, the addition of solvent molecules should be postponed until the majority of the protein model is essentially complete and well refined in order to prevent the incorrect placement of solvent molecules into less-ordered amino acid side chain density or noise peaks (Kleywegt and Jones, 1997). Approximately 87% of the Ape FEN-1 model (304 residues) was in place in the electron density prior to the addition of solvent molecules. One additional bulk solvent correction was also completed in parallel with the seventh round of CNS simulated annealing refinement. With approximately 90.3% of the metal-free Ape FEN-1 model (316 residues) in place following eight rounds of CNS simulated annealing refinement and seven rounds of model building, the Ape FEN-1 model had an Rvalue of 21.2% and an Rfree of 22.3%. Following the seventh and eighth CNS refinements, it was clear that no further improvement to the model could be made using simulated annealing refinement. The electron density for 34 amino acids was still not present in the Ape FEN-1 model. There were 16 residues in the helical bridge region above the active site, 11 residues at the C-terminus, and three other small gaps in the structure that were still not interpretable in the electron density. Thus, Refmac5 restrained, positional refinement (CCP4) was chosen to complete the model building (Murshudov, 1997).


CCP4 Refmac5 restrained, positional refinement is an effective refinement technique once the majority of the model has been constructed. Large secondary structural and amino acid side chain conformational changes of a model are allowed to occur during CNS simulated annealing refinement. Once the overall conformation of the model has been determined and the majority of amino acids have been positioned, final refinement and model building cycles can be completed using a chemically restrained, positional method of refinement. Refmac5 restrained, positional refinement is a refinement technique that will not allow large conformational changes to occur. Restrained, positional refinement will instead refine individual atom coordinates based on chemical restraints such as bond lengths and bond angles and on a maximum likelihood phase probability target. The first round of refinement was completed using CCP4 Refmac5 restrained refinement from 45.0-1.5 Å resolution, giving an Rvalue of 17.9% and an Rfree of 20.0%. One round of water picking was also included with the first round of Refmac5 refinement using the automated ARP_waters program (Lamzin, 1993; Morris et al., 2002). Following all refinements, Fobs-Fcalc and 2Fobs-Fcalc difference electron density maps were calculated. The first round of model building following restrained refinement was then completed with molecular graphics using the O program (Jones et al., 1991). Prior to model building, the σ levels of all difference electron density maps were normalized using the MAPMAN program (Kleywegt and Jones, 1996). Fobs-Fcalc and 2Fobs-Fcalc electron density maps were used in all rounds of model building following restrained refinement. Examination of the difference electron density maps following refinement showed that a substantial amount of difference density was present in the Ape FEN-1 model in the region of the missing residues in the bridge region helices (residues 98-114 missing in the model). An example of the electron density that was present following the first round of restrained refinement is shown in Figure 2.7.

Figure 2.7: Electron density of bridge region following the first round of restrained refinement of metal-free Ape FEN-1. Additional difference electron density (Fobs-Fcalc shown in green (3.0σ) and 2Fobs-Fcalc shown in light blue (1.0σ)) was present following the first round of restrained refinement using CCP4 Refmac5. The new density was present where residues 98-114 were missing in the Ape FEN-1 model. An α carbon backbone trace of the Ape FEN-1 model following the first round of restrained refinement is shown in light yellow.

A total of ten rounds of CCP4 Refmac5 restrained refinement (from 45.0-1.4 Å) and nine rounds of model building were completed with a total of 335 out of 350 amino acid residues in place with a final Rvalue of 17.3% and an Rfree of 20.2%. The automated ARP_waters program was used during the first, third, fourth, and fifth rounds of Refmac5 restrained refinement. During model building rounds, incorrectly placed solvent molecules were removed from the model prior to refinement. Additional residues in the bridge region were built during the model building rounds following restrained refinements. A nine-residue polyalanine helix was positioned into the bridge region (residues 98-114) difference electron density that was generated during the restrained refinements (density as shown in Figure 2.7). After the polyalanine helix was positioned, the alanine residues were mutated to the corresponding residues in the Ape FEN-1 primary sequence. Adjustments were made to the bridge region residues during all model building rounds following restrained refinement. All but four residues in the bridge region of the Ape FEN-1 structure could be built; the remaining four could not be modeled because of disordered electron density. In addition, other small gaps in the structure that were previously not interpretable in the electron density could be built following restrained refinement. During the model building rounds following restrained refinement, a total of twenty additional residues were interpreted into the electron density. Disordered electron density was present at residue one (N-terminal methionine) and at the C-terminus of Ape FEN-1; therefore, residues 1 and 341-350 could not be modeled. The final 2Fobs-Fcalc electron density quality of the metal-free Ape FEN-1 at 1.4 Å resolution is shown in Figure 2.8. Overall, CCP4 Refmac5 restrained refinement was an effective refinement technique for the Ape FEN-1 model following CNS simulated annealing refinement. As a result of completing the model building and refinement process with restrained refinement, most missing or disordered regions of the Ape FEN-1 structure could be interpreted into electron density, whereas these same regions were not seen following extensive simulated annealing refinement. A summary of the model building and refinement process of the Ape FEN-1 structure is shown in Figure 2.9.

Figure 2.8: Final 2Fobs-Fcalc electron density map quality. A region of the Ape FEN-1 model is shown in yellow to show the quality of the final 2Fobs-Fcalc electron density map (1σ) at 1.4 Å resolution.

  1 -********* ********** ********** ********** **********  50
 51 ********** ********** ********** ********** ********** 100
101 ********-- --******** ********** ********** ********** 150
151 ********** ********** ********** ********** ********** 200
201 ********** ********** ********** ********** ********** 250
251 ********** ********** ********** ********** ********** 300
301 ********** ********** ********** ********** ---------- 350

*      Composite Omit Map: 216 residues built (61.7%), R value 35.2%, R free 37.4%
*/*    CNS Refinement (8 rounds): 316 residues built (90.3%), R value 21.2%, R free 22.3%
*/*/*  CCP4 Refmac5 Refinement (10 rounds): 335 residues built (95.7%), R value 17.3%, R free 20.2%

Figure 2.9: Model building and refinement summary. The model building and refinement summary is shown for the metal-free Ape FEN-1 structure. The primary sequence of Ape FEN-1 is shown from amino acid 1 to 350. A star (*) represents each amino acid in the primary sequence that could be modeled into electron density following refinement. Blue stars (*) represent amino acids that could be modeled into the initial simulated annealing composite omit electron density map. Red stars (*) represent the amino acids that could be modeled following a total of eight rounds of CNS simulated annealing refinement (seven model building rounds, one after each respective refinement). Green stars (*) represent amino acids that could be modeled following a total of ten rounds of CCP4 Refmac5 restrained refinement (nine model building rounds, one after each respective refinement). The N-terminal methionine along with four amino acids in the bridge region (109-112) and ten amino acids at the C-terminus (341-350) were disordered in the electron density and could not be modeled (denoted by a -). The corresponding Rvalue and Rfree values are shown in comparison to the number of residues completed in the Ape FEN-1 model.
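The completeness figures in the summary follow directly from the residue counts. The short sketch below recomputes them from the numbers given in the text and caption; the variable names are illustrative.

```python
TOTAL_RESIDUES = 350

# Residues reported as disordered in the final model: Met-1, 109-112, 341-350
missing = [1] + list(range(109, 113)) + list(range(341, 351))
built_final = TOTAL_RESIDUES - len(missing)

# Residues built at each stage, as reported in the Figure 2.9 legend
stages = {
    "composite omit map": 216,
    "CNS simulated annealing (8 rounds)": 316,
    "CCP4 Refmac5 (10 rounds)": built_final,
}
percent_built = {name: round(100.0 * n / TOTAL_RESIDUES, 1) for name, n in stages.items()}
print(built_final, percent_built)
```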


The stereochemistry of the final refined amino acid coordinate file of the metal-free Ape FEN-1 was then analyzed by PROCHECK (Laskowski et al., 1992). The Ramachandran plot showed that for the 335 observed amino acids in the model, 93.3% were in most favored stereochemical regions, 6.3% were in allowed regions, and 0.4% were in generously allowed regions (see Table 2.2). See Figure 2.10 for a summary of the metal-free Ape FEN-1 structure determination process. The ribbon diagram for metal-free Ape FEN-1 is shown in Figure 2.11.
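A Ramachandran plot classifies each residue by its backbone torsion angles φ (defined by atoms C(i-1), N, Cα, C) and ψ (defined by atoms N, Cα, C, N(i+1)). The sketch below shows how a single torsion angle can be computed from four atomic positions; it is a generic geometric calculation rather than PROCHECK's implementation, the helper names are hypothetical, and the coordinates in the example are made up to give easily checked angles.

```python
import math

def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral(p0, p1, p2, p3):
    """Torsion angle in degrees about the p1-p2 bond."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(c / norm for c in b1)
    # Project the outer bonds onto the plane perpendicular to the central bond
    v = _sub(b0, tuple(_dot(b0, b1) * c for c in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * c for c in b1))
    return math.degrees(math.atan2(_dot(_cross(b1, v), w), _dot(v, w)))

# Made-up coordinates: an eclipsed (cis) and an anti (trans) arrangement
cis = dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1))
trans = dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (-1, 0, 1))
print(cis, trans)
```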

Table 2.2: Metal-free Aeropyrum pernix (Ape) flap endonuclease-1 (FEN-1) crystallographic refinement data.

Refinement
  Resolution               45.0-1.4 Å
  Rvalue (a)               17.2%
  Rfree (a)                20.2%
  Average B factor         20.8 Å²
  RMS bonds (b)            0.015 Å
  RMS angles (b)           1.409°
  Atoms (nonhydrogen)      3124
  Solvent atoms            393
  Hetero atoms             4

Ramachandran plot (c)
  Most favored             93.3%
  Allowed                  6.4%
  Generously allowed       0.4%

a Rvalue = $\sum_{hkl} \left| |F_{obs}| - |F_{calc}| \right| / \sum_{hkl} |F_{obs}|$; Rfree is the free Rvalue (5% random data subset) (Brunger 1992).

b Root-mean-square deviations of bond lengths in Å and bond angles in degrees calculated with CNS/CCP4 Refmac5 (Collaborative Computational Project 1994).

c Ramachandran plot quality assessment using PROCHECK (Laskowski, MacArthur et al. 1992).
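The Rvalue and Rfree statistics in footnote a differ only in which reflections enter the sums. The sketch below assumes a simple random 5% selection of test reflections and amplitudes that are already on a common scale; the function names, the seed, and the toy amplitudes are hypothetical and are not data from this work.

```python
import random

def split_free_set(n_reflections, free_fraction=0.05, seed=0):
    """Return a list of booleans; True marks a reflection held out for Rfree."""
    rng = random.Random(seed)
    n_free = round(n_reflections * free_fraction)
    free_idx = set(rng.sample(range(n_reflections), n_free))
    return [i in free_idx for i in range(n_reflections)]

def r_value(f_obs, f_calc, flags, use_free):
    """R over the working set (use_free=False) or the free set (use_free=True)."""
    pairs = [(o, c) for o, c, f in zip(f_obs, f_calc, flags) if f == use_free]
    return sum(abs(o - c) for o, c in pairs) / sum(o for o, _ in pairs)

# Toy amplitudes: the first reflection is in the working set, the second is held out
f_obs, f_calc, flags = [10.0, 20.0], [12.0, 17.0], [False, True]
print(r_value(f_obs, f_calc, flags, use_free=False))  # working-set Rvalue
print(r_value(f_obs, f_calc, flags, use_free=True))   # Rfree
```

Because the free reflections are never used to drive refinement, Rfree provides a check against fitting the model to noise, and a large gap between Rvalue and Rfree is usually taken as a warning sign.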


Figure 2.10: Summary of the structure determination process for metal-free Ape FEN-1.

X-ray diffraction data collection: 1.9 Å and 1.4 Å

X-ray diffraction data processing using HKL: indexing, integration, and scaling

Molecular replacement using AMoRe: 1.9 Å data set, Pfu FEN-1 as search model

CNS simulated annealing refinement: 4 rounds

Model building with O: 4 rounds

Extension of resolution: 1.4 Å data set

CNS simulated annealing refinement: 4 rounds

CNS water pick: 1 round

Model building with O: 3 rounds

CCP4 Refmac5 restrained refinement: 10 rounds

CCP4 ARP waters: 4 rounds

Model building with O: 9 rounds

Stereochemical validation using PROCHECK


Figure 2.11: Ribbon diagram of metal-free Ape FEN-1. The N-terminus is shown in blue and can be traced to the C-terminus shown in red. The first amino acid at the N-terminus (Met-1 denoted with a dot), residues 109-112 in the bridge region, and the last ten amino acids of the C-terminus were not seen in the electron density and are not included in the ribbon diagram. This figure was generated with the programs MOLSCRIPT and RENDER using Raster3D (Kraulis, 1991; Merritt and Bacon, 1997).

2.5 General Architecture of the Metal-free Ape FEN-1

The three-dimensional ribbon model of the native Ape FEN-1 is represented in the absence of divalent cations (Figure 2.11). The overall fold of the enzyme was well ordered in the absence of active site divalent cations. The structural order in the metal-free Ape FEN-1 structure is similar to that of both the metal-free native and D132N bacteriophage T4 RNase H structures that are discussed in Chapter 4. See Figure 2.12 for a ribbon diagram of the labeled secondary structural features of metal-free Ape FEN-1.

The metal-free Ape FEN-1 structure was a single domain α/β protein that consisted of a central groove that was formed from a larger subdomain on the right and a smaller subdomain on the left. The larger subdomain on the right (colored blue, light blue, aqua, light green, and green) contained α/β structure along with an α-helical bundle which formed the right side of the groove (H1, H2, and H5). Two extended smaller loop regions connected H1 with H2 above the α-helical bundle.

Figure 2.12: Ribbon diagram of metal-free Ape FEN-1 with labeled secondary structural features. The ribbon diagram of metal-free Ape FEN-1 is shown. Helices H1-H14 and β strands S1-S8 are labeled from the N-terminus (shown in blue) to the C-terminus (shown in red). Disordered (missing) residues are indicated by dots: residue 1 at the N-terminus (one blue dot), residues 109-112 in the bridge region (four green dots), and residues 341-350 at the C-terminus (ten red dots). This figure was generated with the programs MOLSCRIPT and RENDER using Raster3D (Kraulis, 1991; Merritt and Bacon, 1997).

The core of the large subdomain contained a six-stranded β sheet (S1, S2, S3, S4, S5, and S6) which was packed on both sides with α helices (H2 and H6). Five of the six β strands were in a parallel orientation (S2, S3, S4, S5, and S6) while S1 was oriented perpendicular to the direction of the parallel β strands as a result of the N-terminus wrapping around from the central groove toward the larger subdomain. The N-terminal methionine residue was not observed in the electron density and is denoted by a blue dot in Figure 2.12. The core of the larger subdomain also contained an antiparallel pair of β strands (S7 and S8) that extended off of S6 and was oriented towards the front of the Ape FEN-1 molecule. The most striking feature of the large subdomain was the ordered antiparallel helix bundle (H3 and H4). This ordered helical bundle spans residues 88-130 and is referred to as the bridge region. Ordered electron density was present for all but four residues (109-112, denoted by green dots in Figure 2.12) in the two α helices that were positioned directly above the conserved acidic residues of the active site. Examination of the hexagonal P61 crystal packing of the metal-free Ape FEN-1 showed that the bridge region helices (H3 and H4) were not involved in any crystal lattice contacts that may have contributed to the structural order.

The smaller subdomain (left, colored light green, yellow, and orange) contained an α-helical bundle (H8, H9, H10, H11, and H12) in which S8 extended downward from the antiparallel β strand towards the lower portion of the central groove to H8 of the α-helical bundle. The α-helical bundle of the smaller subdomain formed the left side of the central groove. The base of the central groove was formed near the interface of the small subdomain helical bundle (H8 and H9), S8 of the antiparallel β ribbon, and the N-terminal portion of the structure. The core of the groove was formed from H6 and H7 between the large and small subdomains. The C-terminal region (colored dark orange and red) was formed by one long loop that extended from H12 across the back of the structure below the central groove and the large subdomain to form an antiparallel helix-loop-helix (H13 and H14) conformation. The C-terminal helices H13 and H14 were packed against portions of the outer surfaces of helices H2 and H5 of the large subdomain. Due to disordered electron density, the last ten residues (residues 341-350, denoted by red dots in Figure 2.12) at the C-terminus (H14) were not modeled in the Ape FEN-1 structure.


2.6 Bridge Region and Active Site Structure

Analysis of the central groove and the large subdomain of the metal-free Ape FEN-1 model showed overall structural order in the absence of divalent metals. The active site region of Ape FEN-1 was located in the central groove and was oriented directly below the bridge region (residues 88-130) of the large subdomain (see Figure 2.13). The active site region of the metal-free Ape FEN-1 consisted of a clustering of acidic residues at the base of the central groove.

Figure 2.13: Bridge region of the metal-free Ape FEN-1. Metal-free Ape FEN-1 structure of the ordered helical bridge region near the active site. All acidic residues are colored in red (active site acidic residues are not labeled here), all basic residues are colored in blue, and all hydrophobic residues are colored in green. This figure was generated with the programs MOLSCRIPT and RENDER using Raster3D (Kraulis, 1991; Merritt and Bacon, 1997).

The metal-free Ape FEN-1 structure contained an ordered helical bridge region which contained three basic amino acids that were oriented into or towards the active site. The ordered helical bridge region showed residues K90 and R97 extending down directly over the active site region (from H3 in Figure 2.12). In addition, K122 extended out over the active site from the adjacent bridge helix (from H4 in Figure 2.12). One other basic residue, K85, was observed to extend from the loop preceding H3 in Figure 2.12 toward the backside of the conserved active site. K85 was observed to be in a salt bridge interaction with residue E93. In addition, the N-terminus was located adjacent to the active site and contained basic residue R6 that was positioned toward the front of the active site. Hydrophobic residues were also observed proximal to the active site and could potentially be involved in hydrophobic stacking interactions with nucleic acid substrate. Residue Y123 extended downward from the bridge region and residues Y32 and Y36 formed a small hydrophobic region to the right of the active site that extended off of the helix bundle in the large subdomain (from H1 in Figure 2.12).

The active site of the Ape FEN-1 consisted of two clusters of acidic residues that, based on amino acid sequence alignments, are completely conserved in all known FEN-1 enzymes. The first cluster contained residues D30, D83, D155, and D157 and the second cluster contained residues D176, D178, and D239 (see Figure 2.14). The first cluster was the proposed catalytic metal site while the second cluster was the proposed substrate binding metal site. Mutational studies in both the bacteriophage T4 RNase H (a related homologue of the FEN-1 enzymes) and the human FEN-1 have shown that mutations of homologous residues in the catalytic metal site one (Mg1 or M-1) caused a loss of nuclease activity while nucleic acid substrate binding was maintained (Bhagwat et al., 1997c; Shen et al., 1997). Mutations at the second metal site (Mg2 or M-2) either retained nuclease activity or lost nuclease activity due to an inability of the enzyme to bind substrate. Therefore it has been speculated that metal site two is structural and involved in nucleic acid substrate binding.


Figure 2.14: Active site region of metal-free Ape FEN-1. Metal-free Ape FEN-1 structure of the residues in the conserved active site near the bridge region. All active site acidic residues are colored in red, basic residue K90 is colored in blue, and hydrophobic residue Y123 is colored in green. This figure was generated with the programs MOLSCRIPT and RENDER using Raster3D (Kraulis, 1991; Merritt and Bacon, 1997).

In the absence of bound divalent metals in the active site of Ape FEN-1, the ordered basic residues in the bridge region (K90 and R97) seemed to compensate for the abundance of negative charge contributed from the conserved acidic residues. The ordered bridge region resulted in a salt bridging interaction between a positively charged bridge residue and an acidic carboxylate residue of the active site. Specifically, bridge region residue K90 was observed to be in a salt bridge interaction with D83. The two observed salt bridging interactions (K85 with E93 and K90 with D83) could be contributing to the position and orientation of the bridge region helix bundle relative to the active site.
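The salt bridges described above are normally identified from the refined coordinates by a distance criterion. The following is a minimal sketch of such a check, assuming a cutoff of 4.0 Å between a side-chain nitrogen and a carboxylate oxygen; the cutoff, the function names, and the coordinates are illustrative assumptions and are not taken from the Ape FEN-1 model.

```python
import math

# Assumed cutoff (angstroms): a commonly used distance criterion, not a value
# reported in the text.
SALT_BRIDGE_CUTOFF_A = 4.0


def is_salt_bridge(basic_n_atoms, acidic_o_atoms, cutoff=SALT_BRIDGE_CUTOFF_A):
    """Return True if any basic nitrogen lies within `cutoff` of any carboxylate oxygen.

    Both arguments are lists of (x, y, z) coordinates in angstroms.
    """
    return any(
        math.dist(n, o) <= cutoff
        for n in basic_n_atoms
        for o in acidic_o_atoms
    )


# Hypothetical coordinates for illustration only; NOT from the deposited structure.
lys_nz = [(10.0, 5.0, 3.0)]                              # one Lys side-chain nitrogen
asp_carboxylate = [(12.5, 5.5, 3.2), (13.9, 6.8, 3.0)]   # two Asp carboxylate oxygens

print(is_salt_bridge(lys_nz, asp_carboxylate))
```

In practice the same test would be applied to the K85/E93 and K90/D83 atom pairs read from the coordinate file.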

2.7 Related Enzyme Structure Comparison

Ape FEN-1 is a member of the RAD2/RAD27 family of replication and repair structure-specific nucleases (Liu et al., 2004). The FEN-1 family of enzymes is functionally related to both the bacteriophage and prokaryotic 5’ to 3’ exonucleases. The bacteriophage exonucleases include the bacteriophage T4 RNase H and the smaller bacteriophage T5 D15 and T7 gene 6 exonucleases. The prokaryotic related homologues include the 5’ to 3’ exonuclease domains of DNA repair polymerases from bacteria such as E. coli Pol I and Thermus aquaticus (Taq). The archaeal flap endonuclease-1 (FEN-1) enzymes from both the Euryarchaea and the Crenarchaea are more closely related to the eukaryotic nucleases (murine FEN-1, human FEN-1, Schizosaccharomyces pombe RAD2, and Saccharomyces cerevisiae RAD27 enzymes) (Liu et al., 2004; Shen et al., 1998). Like Ape FEN-1, many of these enzymes in the RAD2/RAD27 family are involved in removing RNA primer fragments during lagging strand DNA replication. To date, the X-ray crystal structures of nine enzymes in this family have been determined: one prokaryotic source, the N-terminal 5’ to 3’ exonuclease domain of Thermus aquaticus (Taq) polymerase (1TAQ, 2.40 Å resolution) (Kim et al., 1995); two from bacteriophage, the T4 RNase H (1TFR, 2.1 Å resolution) (Mueser et al., 1996) and the T5 5’ to 3’ exonuclease (1EXN, 2.50 Å resolution) (Ceska et al., 1996), (1XO1, 2.50 Å resolution) (Garforth et al., 1999), (1UT5, 1UT8, 2.75 Å) (Feng et al., 2004); four from Euryarchaeal organisms, the Pyrococcus furiosus (Pfu) flap endonuclease-1 (1B43, 2.00 Å resolution) (Hosfield et al., 1998b), the Methanococcus jannaschii flap endonuclease-1 (1A76 and 1A77, 2.00 Å resolution) (Hwang et al., 1998), the Pyrococcus horikoshii flap endonuclease-1 (1MC8, 3.10 Å resolution) (Matsui et al., 2002), and the Archaeoglobus fulgidus (Afu) flap endonuclease-1 complexed to 3’ flap DNA (1RXW, 2.00 Å) (Chapados et al., 2004); and one eukaryotic source, the Human flap endonuclease-1 complexed to the homotrimeric human PCNA (1UL1, 2.90 Å) (Sakurai et al., 2004). The Aeropyrum pernix (Ape) FEN-1 is the first FEN-1 enzyme solved from a Crenarchaeal organism and is the highest resolution structure of a FEN-1 enzyme solved to date. It has been shown that Ape FEN-1 has a significant sequence similarity to the previously solved archaeal and eukaryotic FEN-1 enzymes (see Chapter 1, Figure 1.5). The X-ray crystal structure of the metal-free Ape FEN-1 is shown in comparison to the previously determined FEN-1 enzymes (the Pyrococcus furiosus (Pfu) FEN-1, the Methanococcus jannaschii FEN-1, the Pyrococcus horikoshii FEN-1, the Archaeoglobus fulgidus (Afu) FEN-1, and the Human FEN-1) in Figure 2.15.

All of the FEN-1 structures shown in Figure 2.15 display a very similar overall conformation in comparison to the Ape FEN-1. All of the previously solved FEN-1 enzymes possess a large subdomain containing the α/β secondary structural features of the Ape FEN-1. The large subdomain of each enzyme contained a helical bundle along with a multiple-strand β sheet core that formed the right portion of the central groove. Each enzyme had a central groove containing the active site residues that was located between the small and large subdomains. The smaller subdomain of each enzyme contained an α-helical bundle that formed the left portion of the central groove. Both the Ape FEN-1 and the Pfu FEN-1 displayed a prominent antiparallel β strand that extended toward the front of the large subdomain in the respective structures. This antiparallel β strand was not present in the rest of the FEN-1 structures. In contrast to the related bacteriophage and prokaryotic 5’ nuclease structures (see Figure 4.12) that have been determined, the N-terminal residues of all FEN-1 enzymes were well ordered and located directly adjacent to the conserved active site. This suggests that the N-terminus of the FEN-1 enzymes may play a role in substrate binding, catalysis, or both.


A: Aeropyrum pernix (Ape) FEN-1 B: Pyrococcus furiosus (Pfu) FEN-1

C: Methanococcus jannaschii FEN-1 D: Pyrococcus horikoshii FEN-1

E: Archaeoglobus fulgidus FEN-1 F: Human FEN-1

Figure 2.15: Comparison of metal-free Ape FEN-1 to related FEN-1 structures in the RAD2/RAD27 family of nucleases. A: Ribbon diagram of the metal-free Aeropyrum pernix (Ape) FEN-1 showing an ordered bridge or helical clamp conformation. Disordered or missing regions in the structure are labeled. B: Ribbon diagram of the Pyrococcus furiosus (Pfu) FEN-1 crystallized with divalent metals showing an ordered and more closed bridge structure. PDB 1B43, (Hosfield et al., 1998b). The metal sites are reported in the manuscript but are not reported in the PDB coordinates. C: Ribbon diagram of the Methanococcus jannaschii FEN-1 crystallized in the presence of bound magnesium ions (shown in gold) in the active site shows an ordered but open bridge structure. PDB 1A76 and 1A77, (Hwang et al., 1998). D: Ribbon diagram of the Pyrococcus horikoshii FEN-1 crystallized in the absence of divalent cations displays a flexible and open bridge structure loop. PDB 1MC8, (Matsui et al., 2002). E: Ribbon diagram of the Archaeoglobus fulgidus (Afu) FEN-1 crystallized in the presence of 3’ flap DNA (shown in purple) shows an ordered helical bridge structure. PDB 1RXW, (Chapados et al., 2004). F: Ribbon diagram of the Human FEN-1 (molecule X in manuscript) crystallized in the presence of PCNA (not shown here) and in the presence of bound magnesium ions (shown in gold) in the active site shows a disordered and open bridge structure. The disordered region in the structure is labeled. PDB 1UL1, (Sakurai et al., 2004). All ribbon diagrams can be traced from the N-terminus (blue) to the C-terminus (red). This figure was generated with the programs MOLSCRIPT and RENDER using Raster3D (Kraulis, 1991; Merritt and Bacon, 1997).


The most striking difference between all six of the FEN-1 enzymes was the wide variability of structural conformations of the large subdomain in the region directly over the active site referred to as the arch or bridge region. The Ape FEN-1 metal-free structure showed an ordered and closed bridge region conformation that directly spanned the active site. The Pfu FEN-1 structure displayed an ordered and closed bridge structure in the presence of divalent metal ions in the active site. However, both the Methanococcus jannaschii FEN-1 and the Human FEN-1 enzymes in the presence of bound divalent metal ions in the active site region showed very open and disordered bridge structures, respectively. The Pyrococcus horikoshii FEN-1 displayed a large and flexible loop structure in the absence of bound divalent metal ions in the active site. Lastly, the Archaeoglobus fulgidus (Afu) FEN-1 bound to 3’ flap DNA displayed an ordered but open antiparallel two-helix bundle bridge structure that was directly above the active site. The FEN-1 structures display very similar overall conformations in both the large and small subdomains, but show a wide variability in structural conformation in the bridge region directly over the active site.

The structural comparison of the FEN-1 enzymes showed that major structural differences were observed in the bridge region located directly above the active sites of the respective enzymes. These large structural differences in the bridge region may have important consequences for the binding of nucleic acid substrate at the active site. It has been proposed that basic amino acid residues contained within the various bridge structures of related 5’ nucleases are involved in nucleic acid substrate binding interactions with the phosphate backbone of the substrate (Mueser et al., 1996; Ceska et al., 1996; Hosfield et al., 1998b; Hwang et al., 1998; Chapados et al., 2004; Qiu et al., 2004). Mutational studies of basic amino acid residues within the various bridge structures have demonstrated their role in binding to nucleic acid substrate (Qiu et al., 2004). In addition, mutational evidence has shown that the basic residues within the bridge region may also have a role in catalysis (Qiu et al., 2004; Storici et al., 2002).

Recently, it has been shown that various aromatic and hydrophobic residues near the bridge region and proximal to the active site might position and orient the nucleic acid substrate for hydrolysis through stacking interactions (Matsui et al., 2004). It is assumed that the bridge structure would undergo a conformational change resulting in increased order upon binding a flap nucleic acid substrate. An ordered helical bridge structure was observed in the crystal structure of the Afu FEN-1 complexed to 3’ flap DNA (Chapados et al., 2004). Thus, it is speculated that bridge region basic and hydrophobic residues may help to position the 5’ flap so that the scissile phosphate bond near the flap junction of the substrate is correctly oriented over the active site residues bound to divalent metal ions.

The active site and bridge regions of all known FEN-1 structures are shown in Figure 2.16, highlighting potentially important acidic, basic, and hydrophobic amino acid residues involved in the binding and catalysis of nucleic acid substrate. All of the FEN-1 structures displayed a conserved active site region in which all of the conserved acidic residues were located in a similar region in the central groove. The Ape FEN-1 structure contained an ordered and closed bridge structure in which a number of basic and hydrophobic residues were seen extending toward the active site region. In the absence of divalent metals, the closed conformation could serve as an inactive form of the enzyme.


A: Aeropyrum pernix (Ape) FEN-1 B: Pyrococcus furiosus (Pfu) FEN-1

C: Methanococcus jannaschii FEN-1 D: Pyrococcus horikoshii FEN-1

E: Archaeoglobus fulgidus FEN-1 F: Human FEN-1

Figure 2.16: Comparison of the metal-free Ape FEN-1 active site and bridge region to related structures in the RAD2/RAD27 family of nucleases. A: Ribbon diagram of the metal-free Aeropyrum pernix (Ape) FEN-1. Active site residues D30, D83, D155, D157, D176, D178, and D239 are shown in red. Basic residues R6, K85, K90, R97, and K122 proximal to the active site are shown in blue. Hydrophobic residues Y32, Y36, and Y123 near the bridge region are shown in green. Residues 109-112 are not shown in the bridge region. B: Ribbon diagram of the Pyrococcus furiosus (Pfu) FEN-1. Active site residues D27, D79, E151, E153, D172, D174, and D235 are shown in red. Basic residues K86, K92, R93, R94, R97, R117, and R122 proximal to the active site are shown in blue. PDB 1B43, (Hosfield et al., 1998b). The metal sites are reported in the manuscript but are not reported in the PDB coordinates. C: Ribbon diagram of the Methanococcus jannaschii FEN-1. Two bound magnesium ions are shown in gold. Active site residues D27, D80, E152, E154, D173, D175, and D224 are shown in red. Basic residues R91, R95, K98, K104, K106, R123, and K192 proximal to the active site are shown in blue. Hydrophobic residue Y120 near the bridge region is shown in green. PDB 1A76 and 1A77, (Hwang et al., 1998). D: Ribbon diagram of the Pyrococcus horikoshii FEN-1. Active site residues D27, D80, E152, E154, D173, D175, and D236 are shown in red. Basic residues K82, K87, R94, R118, K119, and R123 proximal to the active site are shown in blue. Hydrophobic residues L29 and Y33 near the active site are shown in green. PDB 1MC8, (Matsui et al., 2002). E: Ribbon diagram of the Archaeoglobus fulgidus FEN-1 bound to 3’ flap DNA (shown in purple). Active site residues D27, D80, E151, E153, D172, D174, and D235 are shown in red. Basic residues K87, R94, R98, and R125 proximal to the active site are shown in blue. Hydrophobic residues F29 and Y33 near the active site are shown in green. PDB 1RXW, (Chapados et al., 2004). 
F: Ribbon diagram of the Human FEN-1 (molecule X in manuscript) crystallized in the presence of PCNA (not shown here). Two bound magnesium ions are shown in gold. Active site residues D34, D86, E158, E160, D179, D181, and D233 are shown in red. Residues 100-128 were disordered and are not shown in the bridge region. PDB 1UL1, (Sakurai et al., 2004). This figure was generated with the programs MOLSCRIPT and RENDER using Raster3D (Kraulis, 1991; Merritt and Bacon, 1997).


Upon the binding of divalent metal ions, Ape FEN-1 may undergo a conformational change in which the central groove and α-helical bridge structure open up in order to accommodate the binding of flap nucleic acid substrate. The structure of the Methanococcus jannaschii FEN-1 in the presence of bound divalent metal ions in the active site displayed an extended bridge structure and a more open cleft in the central groove. Basic and hydrophobic residues in the bridge region could potentially bind to the 5’ arm of the flap nucleic acid substrate, inducing a conformational change to a more closed bridge region. The Human FEN-1 structure also displayed a more open cleft in the presence of bound divalent metals in the active site. The disordered region (residues 100-128) of the bridge structure could become more structurally ordered upon binding the 5’ arm of the flap substrate. Both the Pfu FEN-1 (in the presence of bound divalent metal ions) and the Pyrococcus horikoshii FEN-1 (without divalent metal ions) contain extensive bridge structures with a number of basic residues positioned above the active site region. More helical secondary structure was seen in the Pfu FEN-1 bridge region in comparison to the large flexible loop that was present in the Pyrococcus horikoshii FEN-1 bridge region. In comparison to the bridge structures of the Methanococcus jannaschii FEN-1 and the Human FEN-1, the cleft under the bridge region was more closed or occluded in both the Pfu FEN-1 and the Pyrococcus horikoshii FEN-1 structures. The Afu FEN-1 bound to 3’ flap DNA displayed an ordered and open α-helical bridge structure. Upon binding of the upstream 3’ flap DNA, a number of basic and hydrophobic residues were seen extending downward from the ordered α-helical bridge structure over the active site region. These residues might be correctly positioned to allow binding of the downstream 5’ arm of the flap DNA. It has been proposed that the ordering of the helical bridge structure is coupled to FEN-1 conformational changes that are promoted by the structure specific binding of the 3’ flap of duplex DNA substrate. The DNA-dependent conformational ordering and subsequent closing of the helical bridge structure may facilitate precise cleavage of the 5’ arm of the flap DNA duplex (Chapados et al., 2004). It is therefore interesting that the Ape FEN-1 was the only FEN-1 structure that displayed a bridge region that exhibited the degree of structural order that was seen in the Afu FEN-1 structure. It can be hypothesized that the Ape FEN-1 ordered helical bridge structure would have to undergo a conformational change whereby the cleft over the active site was opened as a result of binding of 3’ flap of duplex DNA substrate (see Chapter 3, section 3.2.4 for discussion). Future co-crystallization trials of the Ape FEN-1 in the presence of flap DNA substrate would be needed to understand the significance of the extensive, ordered bridge region structure and its role in DNA binding.

Chapter 3: Flap Endonuclease-1 Divalent Metal and DNA Substrate Structural Studies

3.1 Divalent Metal Studies: Aeropyrum pernix (Ape) Flap Endonuclease-1

3.1.1 Preface

The metal-free structure of the Aeropyrum pernix (Ape) flap endonuclease-1 (FEN-1) has been completed to a resolution of 1.4 Å and a well ordered helical bridge region was observed in a closed conformation directly over the active site of the enzyme (see Figure 3.11). Having completed a metal-free structure of the Ape FEN-1 enzyme, it was proposed that the binding of divalent metal ions to the active site may result in a structural rearrangement of the residues in the active site. It was also hypothesized that the binding of divalent metal ions to the active site could cause a conformational change in the helical bridge region of the Ape FEN-1 resulting in a more open conformation of the enzyme. Thus, the goal of the divalent metal studies was to characterize the binding of these catalytic metal ions to the Ape FEN-1 enzyme by solving the structure of the Ape FEN-1 in the presence of divalent metal ions.

3.1.2 Expression

Archaeal flap endonuclease-1 (FEN-1) proteins from various thermophilic organisms (Archaeoglobus fulgidus (Afu), Archaeoglobus veneficus (Ave), Aeropyrum pernix (Ape), Pyrococcus furiosus (Pfu), and Thermococcus zilligii (Tzi) FEN-1) were expressed and partially purified by our collaborators at Third Wave Technologies, Inc. in Madison, Wisconsin. Most preparations of these respective FEN-1 proteins were produced using small-scale (4x 1 L shaker flasks) and large-scale (20 L fermentor) fermentation. Briefly, large scale expression of soluble archaeal FEN-1 protein was obtained from IPTG (isopropyl-beta-D-thiogalactopyranoside) induction of BL21(DE3) Escherichia coli (E. coli) bacterial host cells at 37 °C. Following induction, the cells were harvested by centrifugation in preparation for cell lysis.

3.1.3 Purification

All expression and initial purification of archaeal FEN-1 protein was completed by our collaborators at Third Wave Technologies. Following cell harvesting, bacterial cell pellets were resuspended in a lysis buffer containing 10 mM Tris-HCl pH 7.5, 100 mM NaCl, and 2 mM EDTA with 10 mg of hen egg-white lysozyme and incubated for 15 min at 4 °C. Hen egg-white lysozyme was added to the lysis buffer to rupture the bacterial cells in order to release all cellular contents that included the overexpressed FEN-1 protein along with endogenous proteins and cellular material from the host cells. 200 µL of a 10% (w/v) solution of deoxycholic acid was then added and the solution was sonicated for two minutes at 80% power. Sonication subjects the cell lysis suspension to ultrasonic sound waves, which burst the bacterial protoplasts and shear any DNA present in the sample. The sonicated sample was then centrifuged at 14,000 x g for 15 min at 4 °C. The supernatant (or cell free extract (CFE)) was decanted and then heated to 67 °C for one hour in order to denature endogenous E. coli proteins. 250 µL of a 10% (v/v) solution of polyethyleneimine (PEI) was then added to the supernatant following heat treatment and incubated for 30 min at 4 °C. PEI was used to precipitate any remaining cellular DNA that was present in the sample. The sample was then centrifuged at 14,000 x g for 15 min at 4 °C, with the FEN-1 protein remaining in the clarified supernatant. Denatured endogenous host protein along with PEI precipitated DNA was present in the pellet following centrifugation. The protein in the supernatant was then precipitated by the addition of 0.476 g/mL ammonium sulfate and the solution was incubated for 30 min at 4 °C and then centrifuged at 14,000 x g for 15 min at 4 °C. The pellet, containing the FEN-1 protein, was resuspended in 5 mL of 50 mM Tris-HCl pH 8.0 and 1 mM EDTA and further dialyzed overnight in the same buffer at 4 °C. The dialysis buffer (buffer A) was used to equilibrate a heparin HPLC column. The dialyzed protein was then run over the heparin column and was eluted with a linear salt gradient of increasing buffer B: 50 mM Tris-HCl, 1 M NaCl, and 1 mM EDTA. The peak (monitored by the absorbance at 280 nm) that eluted at approximately 0.5-0.8 M NaCl was collected and pooled together. Heparin is a naturally occurring glycosaminoglycan consisting of alternating hexuronic acid and D-glucosamine residues. The polymer is heavily sulfated, carrying sulfamino (N-sulfate) groups at C-2 of the glucosamine units as well as ester sulfate (O-sulfate) groups in various positions. Due to the large amount of anionic sulfate groups, heparin functions as a high capacity cation exchange resin and has been shown to be effective in the purification of DNA binding proteins. The pooled fractions were then dialyzed overnight against 50 mM Tris-HCl pH 8.0, 1 mM EDTA, 50% (v/v) glycerol and stored at -80 °C. Following initial purification, the archaeal FEN-1 proteins were shipped on dry ice from Third Wave Technologies. All initial archaeal FEN-1 protein preparations were expressed using small-scale shaker flasks and the protein was received at high purity. However, later preparations were expressed by large-scale fermentation and were heavily contaminated following initial purification by Third Wave Technologies. An SDS-PAGE (sodium dodecyl sulphate-polyacrylamide gel electrophoresis) gel (Invitrogen NuPage 4-12% SDS-PAGE gel) was run to assess the purity of the FEN-1 samples and is shown in Figure 3.1.

[Gel image: lanes 1-12; marker positions labeled at 36.5 kDa and 14.4 kDa; impurity band labeled at ~19 kDa.]

Figure 3.1: SDS-PAGE gel following initial purification of archaeal FEN-1 proteins by Third Wave Technologies. Lanes 1 and 8 are molecular weight markers. Lanes 2 and 3 are samples of Archaeoglobus fulgidus (Afu) FEN-1 (MW: 38 kDa), lanes 4 and 5 are samples of Archaeoglobus veneficus (Ave) FEN-1 (MW: 38 kDa), lanes 6 and 7 are samples of Aeropyrum pernix (Ape) FEN-1 (MW: 40 kDa), lanes 9 and 10 are samples of Pyrococcus furiosus (Pfu) FEN-1 (MW: 38 kDa), and lanes 11 and 12 are samples of Thermococcus zilligii (Tzi) FEN-1 (MW: 40 kDa). The samples of the archaeal FEN-1 proteins shown here were expressed using large-scale fermentation prior to initial purification.

The respective archaeal FEN-1 proteins (Archaeoglobus fulgidus (Afu), Archaeoglobus veneficus (Ave), Aeropyrum pernix (Ape), Pyrococcus furiosus (Pfu), and Thermococcus zilligii (Tzi) FEN-1) were seen on the SDS-PAGE gel from approximately 38-40 kDa. In addition, a large band at approximately 19 kDa was present in all but one of the respective FEN-1 samples. In the Ape and Thermococcus zilligii FEN-1 samples, the impurity accounted for ~50% of protein material that was present. It was initially believed that the impurity was a truncation product of each respective archaeal FEN-1 enzyme. The 19 kDa band was not previously seen in other smaller scale archaeal FEN-1 preparations provided by Third Wave Technologies. It was later determined by size-exclusion chromatography and dynamic light scattering (DLS) measurements that the impurity had a molecular weight of approximately 150 kDa in solution. The characterization of the impurity, identified by mass spectrometry as E. coli Dps/PexB, is discussed in Chapter 5. Based on the purity of the archaeal FEN-1 samples that were provided from Third Wave Technologies, additional purification of the FEN-1 enzymes was needed prior to crystallization studies in the presence of divalent metal ions or flap DNA substrate. The Ape FEN-1 enzyme was primarily used for the metal studies that will be described below.

Solubility optimization for the Ape FEN-1 enzyme had been previously completed by others in the group. These results were used to determine the optimum solubility buffer for the crystallization of the metal-free Ape FEN-1 (25 mM Na PIPES pH 6.5, 50 mM disodium hydrogen citrate, 50 mM KCl, and 50 mM NH4Cl). Results from solubility screening showed that the optimum pH was 6.5, the best anion was citrate, and the best cations were Mg2+ and Ca2+. A combination of the optimum buffer, anion, and cation was used to determine the maximum solubility of the Ape FEN-1. A combination of 50 mM dipotassium hydrogen citrate and 10 mM MgCl2 resulted in a maximum solubility of 106 mg/mL for Ape FEN-1 (Collins et al., 2004). Based on these solubility results, a buffer of 50 mM dipotassium hydrogen citrate pH 6.5 and 100 mM KCl was chosen initially for purification of the Ape FEN-1. A large amount of purified Ape FEN-1 was still needed for additional co-crystallization studies in the presence of divalent metals (see section 3.1.4) and for co-crystallization studies with flap DNA substrate (see Chapter 3: section 3.2.2). Thus, MgCl2 was eliminated from the initial purification buffer and throughout the protocol because it has been shown that divalent cations are essential for nuclease activity. Elimination of divalent metals during the purification of Ape FEN-1 would eliminate the possibility of inadvertent activation of nuclease activity of the Ape FEN-1 during co-crystallization experiments in the presence of flap DNA. Therefore, any divalent metals needed for co-crystallization trials with Ape FEN-1 would be added by dialysis following purification.

The impure Ape FEN-1 protein solution was dialyzed overnight at 21 °C using a Slide-A-Lyzer (Pierce) cassette with a molecular weight cutoff of 10,000 Da against 50 mM dipotassium hydrogen citrate pH 6.5 and 100 mM KCl. After dialysis, the protein solution was concentrated to 6 mL at 21 °C using an Amicon Ultra-15 Centrifugal Filter Unit with a molecular weight cutoff of 10,000 Da (Millipore). All Ape FEN-1 purification was completed at 21 °C using a BIO-RAD™ Biologic Duo-flow high-performance liquid chromatography system with a BIO-RAD™ Biologic Quad-tec UV-visible detector. A Superdex 75 (Amersham Biosciences 17-1044-10) size-exclusion HPLC column was then equilibrated with the above dialysis buffer. Superdex 75 media consists of a composite matrix of dextran and agarose. Size-exclusion chromatography separates proteins based on their molecular size using a non-interactive stationary phase media. The stationary phase bead particles have pores with a defined distribution of pore diameters. Proteins that are larger than the defined pore diameter of the bead are excluded from the internal volume of the pore and are eluted from the column first (void volume). If proteins are smaller than the defined pore size of the bead, they can access the internal volume of the pores and they will be retained longer in the column. The smallest proteins will be retained the longest and will elute last because more of the internal pore volume is accessible (reviewed in Huber, 2000). Superdex 75 media was chosen because Ape FEN-1 has a molecular weight of 40 kDa while the primary impurity was shown to be approximately 150 kDa in solution. The Ape FEN-1 protein was small enough to be retained by a portion of the internal volume of the media pores while the major impurity would be excluded from the internal pore volume and therefore would elute in the void volume.

1 mL injections of the impure Ape FEN-1 protein solution were loaded onto the Superdex 75 size-exclusion HPLC column (10/30 column) and were followed by a 2 column volume isocratic elution using a buffer of 50 mM dipotassium hydrogen citrate pH 6.5 and 100 mM KCl. A Superdex 75 chromatogram of a 1 mL injection of impure Ape FEN-1 protein solution is shown in Figure 3.2. Two peaks (one at fraction 10 and one at fractions 13, 14, and 15) were observed in the chromatogram by monitoring the absorbance at 280 nm. An SDS-PAGE gel was then run, stepping across the 280 nm peaks, to check the purity of the sample. Samples from fractions 10, 12, 13, 14, and 15 were prepared for an SDS-PAGE gel. 25 µL of each fraction was collected and 25 µL of 2x SDS buffer was added to each sample. The SDS-PAGE gel is shown below in Figure 3.3.

[Chromatogram image: fractions 1-25, with fractions 10, 12, 13, 14, and 15 marked X; axes: time (min, 10-40), absorbance (AU, -0.25 to 1.75), and % Buffer B (0-100).]

Figure 3.2: Superdex 75 chromatogram for Ape FEN-1 purification. The green line monitors the absorbance of protein at 280 nm, the purple line monitors the absorbance at 260 nm, and the red line monitors the conductivity during the run. Fractions 10, 12, 13, 14, and 15 are marked with an X and were checked on an SDS-PAGE gel to assess purity.


[Gel image: lanes 1-6; labels for the 36.5 kDa and 14.4 kDa markers and for the Ape FEN-1 band.]

Figure 3.3: SDS-PAGE gel from Superdex 75 run of Ape FEN-1 purification. Lane 1 is the molecular weight marker and lanes 2-6 are Superdex 75 fractions 10, 12, 13, 14, and 15, respectively. The denatured impurity (~19 kDa) was eluted in the void fraction 10 and was also seen in fraction 12. Ape FEN-1 (40 kDa) was eluted in the peak corresponding to fractions 13-15. See Figure 3.2 for the Superdex 75 chromatogram.

The large impurity was eluted in the void volume of the column in fractions 9, 10, and 11. The void volume fractions were known from previous standardization chromatograms of the Superdex 75 size-exclusion HPLC column. Ape FEN-1 was eluted from the column in fractions 12, 13, 14, and 15. The fractions corresponding to the ends of the 280 nm peak for Ape FEN-1, fractions 12 and 15, were pooled from all runs and loaded one last time over the Superdex 75 column. Fractions 13 and 14 containing Ape FEN-1 were pooled from all Superdex 75 column runs. The pooled fractions still contained some impurities, thus it was decided to further purify the protein by using a POROS 20 HS HPLC column (Applied Biosystems).

POROS 20 HS is a high resolution cation exchange perfusion medium which allows for much higher flow rates under which high capacity and resolution are achieved. The POROS HS media is composed of cross-linked organic polymer particles of poly(styrene-divinylbenzene). The media particles are functionalized with sulfopropyl groups (-CH2CH2CH2SO3-) that allow for electrostatic interactions with positively charged proteins while negatively charged proteins pass through in the flow-through of the column. Ape FEN-1 had a theoretical pI of 7.75 with almost an equal number of negatively (Asp and Glu = 56) and positively (Arg and Lys = 57) charged amino acid residues. The POROS HS cation exchange column was chosen because Ape FEN-1 was expected to be positively charged at pH 6.5 in the buffer following the Superdex 75 elution. Prior to loading the Ape FEN-1 on the POROS HS column (Pharmacia HR 16/10 column), the column was equilibrated with 25 mM Tris-HCl pH 7.5 and 100 mM NH4Cl (Buffer A). The conductivity of the pooled Superdex 75 fractions was checked to ensure that the Ape FEN-1 protein solution had a conductivity that was less than or equal to Buffer A. If the conductivity of a protein solution is greater than the conductivity of an equilibrated ion exchange chromatography column, the protein likely will not interact electrostatically with the media and will pass through the column in the flow-through.
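The expectation of a net positive charge below the pI can be illustrated with a rough Henderson-Hasselbalch estimate. The sketch below uses only the residue counts quoted above (56 Asp + Glu, 57 Arg + Lys) and assumes one generic pKa per class (4.0 for acidic and 11.0 for basic side chains); these pKa values are illustrative assumptions. Histidine, tyrosine, cysteine, and the chain termini are ignored, so the estimate will not reproduce the reported theoretical pI of 7.75.

```python
# Residue counts taken from the text.
N_ACIDIC = 56  # Asp + Glu
N_BASIC = 57   # Arg + Lys

# Assumed generic side-chain pKa values (illustrative, not from the source).
PKA_ACIDIC = 4.0
PKA_BASIC = 11.0


def net_charge(ph):
    """Approximate net charge from Asp/Glu and Arg/Lys side chains only."""
    # Fraction of acidic groups that are deprotonated (charge -1 each).
    negative = N_ACIDIC / (1.0 + 10 ** (PKA_ACIDIC - ph))
    # Fraction of basic groups that are protonated (charge +1 each).
    positive = N_BASIC / (1.0 + 10 ** (ph - PKA_BASIC))
    return positive - negative


for ph in (6.5, 7.5):
    print(f"pH {ph}: estimated net charge {net_charge(ph):+.2f}")
```

Even this simplified model gives a small positive net charge near neutral pH, consistent with the choice of a cation exchange medium.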

Pooled fractions 13 and 14 containing Ape FEN-1 from the Superdex 75 runs were loaded on the POROS HS column followed by a wash step using Buffer A. The

Ape FEN-1 was eluted from the column at a rate of 10 mL/min using a 5 column volume

(CV) linear salt gradient of 100% Buffer A (25 mM Tris-HCl pH 7.5 and 100 mM

NH4Cl) to 100% Buffer B (25 mM Tris HCl pH 7.5 and 1 M NH4Cl). A chromatogram from the Ape FEN-1 POROS HS column purification is shown in Figure 3.4. Fractions

17, 18, 19, 20, 21, 22, 23, and 24 were then checked on an SDS-PAGE gel as shown in

Figure 3.5.
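The linear salt gradient described above can be expressed numerically. The following is a minimal sketch, not the method file used on the HPLC system: the function names are illustrative, and the bed volume is an assumption based on nominal HR 16/10 dimensions (16 mm inner diameter, 10 cm bed height), which are not stated in the text.

```python
import math

# NH4Cl concentrations of the two buffers, taken from the text (mM).
BUFFER_A_MM = 100.0
BUFFER_B_MM = 1000.0


def column_volume_ml(inner_diameter_mm=16.0, bed_height_cm=10.0):
    """Geometric bed volume in mL. Default dimensions are an ASSUMED nominal HR 16/10 size."""
    radius_cm = inner_diameter_mm / 20.0  # mm diameter -> cm radius
    return math.pi * radius_cm ** 2 * bed_height_cm


def percent_b(t_min, flow_ml_min=10.0, gradient_cv=5.0, cv_ml=None):
    """% Buffer B at time t_min after the gradient starts (linear, clamped to 0-100)."""
    if cv_ml is None:
        cv_ml = column_volume_ml()
    gradient_min = gradient_cv * cv_ml / flow_ml_min
    fraction = min(max(t_min / gradient_min, 0.0), 1.0)
    return 100.0 * fraction


def nh4cl_mm(pct_b):
    """NH4Cl concentration (mM) delivered at a given % Buffer B."""
    return BUFFER_A_MM + (BUFFER_B_MM - BUFFER_A_MM) * pct_b / 100.0
```

Under these assumptions the bed volume is about 20 mL, so a 5 CV gradient at 10 mL/min takes about 10 min, and each 10% increase in Buffer B adds 90 mM NH4Cl.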

[Figure 3.4 chromatogram. Top axis: fractions 1-37, with fractions 17-24 marked X. Vertical axes: absorbance (AU), % Buffer B (0-100%), and conductivity (mS/cm). Horizontal axis: time (min).]

Figure 3.4: POROS HS cation exchange chromatogram for Ape FEN-1 purification. The green line monitors the absorbance of protein at 280 nm, the purple line monitors the absorbance at 260 nm, the red line monitors the conductivity, and the black line monitors the linear salt gradient during the run. Fractions 17, 18, 19, 20, 21, 22, 23, and 24 are marked with an X and were checked on an SDS-PAGE gel to assess purity.


[Figure 3.5 gel image. Lanes 1-9; band labeled Ape FEN-1, 36.5 kDa.]

Figure 3.5: SDS-PAGE gel from the POROS HS run of the Ape FEN-1 purification. Lanes 1-8 are POROS HS fractions 17, 18, 19, 20, 21, 22, 23, and 24, respectively. Lane 9 is a molecular weight marker. Ape FEN-1 (40 kDa) was eluted in the peak corresponding to fractions 18-21. See Figure 3.4 for the POROS HS chromatogram.

Ape FEN-1 was eluted from the POROS HS column in fractions 18-21 corresponding to the 280 nm absorbance peak in the chromatogram. Fractions 18-21 containing purified

Ape FEN-1 protein were collected and pooled. The Ape FEN-1 protein concentration was determined from the absorbance at 280 nm with a calculated extinction coefficient of 0.999 (Abs 0.1%, the absorbance of a 1 mg/mL solution) from the ExPASy

ProtParam tool (Gill and von Hippel, 1989) using an Agilent Technologies™ UV-visible spectrophotometer. The Ape FEN-1 protein (~4.6 mg/mL) was aliquoted and stored at

-80 °C in a 20% (v/v) glycerol solution for further crystallization experiments. A summary of the Ape FEN-1 purification protocol after initial purification by Third Wave

Technologies is shown in Figure 3.6.
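The concentration determination above follows the Beer-Lambert relation, c = A280 / (E x l), where E is the absorbance of a 1 mg/mL solution. The sketch below is illustrative only: the function name, the 1 cm path length, and the dilution factor are assumptions, since the text does not state how the sample was measured.

```python
def protein_conc_mg_per_ml(a280, abs_0_1_percent=0.999, path_cm=1.0, dilution=1.0):
    """Protein concentration (mg/mL) from A280 via Beer-Lambert.

    abs_0_1_percent: absorbance of a 1 mg/mL solution in a 1 cm cell
                     (0.999 for Ape FEN-1, from the text).
    path_cm, dilution: ASSUMED measurement parameters, not given in the text.
    """
    if abs_0_1_percent <= 0 or path_cm <= 0 or dilution <= 0:
        raise ValueError("extinction coefficient, path length, and dilution must be positive")
    return a280 * dilution / (abs_0_1_percent * path_cm)
```

For example, a hypothetical reading of 0.46 on a tenfold-diluted sample would correspond to roughly 4.6 mg/mL, the concentration reported for the pooled protein.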


Figure 3.6: Summary of the Ape FEN-1 purification protocol.

Archaeal FEN-1 proteins initially purified and then shipped by Third Wave Technologies, Inc.

Run SDS-PAGE gel to check purity of Ape FEN-1.

Dialysis of partially pure Ape FEN-1 into: 50 mM dipotassium hydrogen citrate pH 6.5 and 100 mM KCl.

Load dialyzed protein solution onto Superdex 75 size-exclusion HPLC column.

Collect fractions from Superdex 75 elution and check the purity of appropriate fractions on an SDS-PAGE gel.

Pool appropriate fractions of Ape FEN-1 based on the purity following the Superdex 75 column.

Equilibrate POROS HS high resolution cation exchange column with Buffer A: 25 mM Tris-HCl pH 7.5 and 100 mM NH4Cl.

Load pooled fractions of Ape FEN-1 from Superdex 75 column onto POROS HS cation exchange column and wash with Buffer A.

Run POROS HS method with a linear salt gradient for elution: 0-100% Buffer B: 25 mM Tris-HCl pH 7.5 and 1 M NH4Cl

Collect fractions of Ape FEN-1 from POROS HS column elution and check the purity of appropriate fractions on an SDS-PAGE gel.

Pool fractions of purified Ape FEN-1 and aliquot for storage at -80 °C.


3.1.4 Co-crystallization with Divalent Metal Ions

With the metal-free structure of the Ape FEN-1 enzyme completed at 1.4 Å resolution, the goal of these studies was to solve the X-ray structure of Ape FEN-1 in the presence of divalent metal ions. Initial metal studies were performed using native, diffraction quality Ape FEN-1 crystals (see Figure 2.1) in soaking experiments with divalent metal ions. The native Ape FEN-1 crystals belong to the hexagonal space group P61, with unit cell dimensions of a = b = 92.8 Å and c = 80.9 Å, α = β = 90°, and γ = 120° (see Figure 3.7). Due to the symmetrical packing of the Ape

FEN-1 molecule in the crystal, porous channels were present throughout the native Ape

FEN-1 crystals. It was also apparent that the bridge region is not involved in any crystal lattice contacts. It was therefore hypothesized that the porous channels in the native Ape

FEN-1 crystals should allow the access of metal ions to the active site during soaking experiments with divalent metal ions and also should accommodate any conformational changes in the protein.
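The presence of large solvent channels can be checked roughly from the unit cell itself by estimating the Matthews coefficient and solvent content (Matthews, 1968). The sketch below is a hedged illustration: the function names are hypothetical, and the molecular weight (36,500 Da) and one molecule per asymmetric unit are assumptions rather than values stated in this section. P61 has six general positions.

```python
import math


def unit_cell_volume(a, b, c, alpha_deg, beta_deg, gamma_deg):
    """General triclinic cell volume formula (Å^3); valid for any crystal system."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha_deg, beta_deg, gamma_deg))
    return a * b * c * math.sqrt(1.0 - ca * ca - cb * cb - cg * cg + 2.0 * ca * cb * cg)


def matthews(volume_a3, mw_da, n_symops, n_per_asu=1):
    """Return (Matthews coefficient in Å^3/Da, estimated solvent fraction).

    n_symops: number of general positions in the space group (6 for P61).
    mw_da, n_per_asu: ASSUMED values supplied by the caller.
    """
    vm = volume_a3 / (n_symops * n_per_asu * mw_da)
    solvent_fraction = 1.0 - 1.23 / vm  # Matthews (1968) approximation
    return vm, solvent_fraction


# Cell dimensions from the text; molecular weight and ASU content are assumptions.
volume = unit_cell_volume(92.8, 92.8, 80.9, 90.0, 90.0, 120.0)
vm, solvent = matthews(volume, mw_da=36500.0, n_symops=6, n_per_asu=1)
```

Under these assumptions the cell volume is about 6.0 x 10^5 Å^3, the Matthews coefficient is about 2.8 Å^3/Da, and the estimated solvent content is about 55%.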

Figure 3.7: Unit cell packing of native Ape FEN-1 molecules in the crystal. The Ape FEN-1 P61 unit cell is shown in red. The bridge region antiparallel helix bundle is shown exposed to the porous solvent channels in the crystal.


All crystal soaking experiments were completed using native Ape FEN-1 diffraction quality crystals. Various divalent metal salts (MgCl2, MnCl2, SrCl2, BaCl2, and ZnCl2) were used at concentrations of 6-25 mM and 200 mM, with soaking times of 5, 10, and 20 minutes and 24 and 48 hours.

Following the respective divalent metal soaking experiments, the Ape FEN-1 crystals were cryogenically preserved for X-ray diffraction data collection as was described in section 2.1.1. Several X-ray diffraction data sets were then collected on the soaked native Ape FEN-1 crystals (see Table 3.1). All data were collected at APS on the

BioCARS 14-BM-C beamline (0.9 Å wavelength) using an ADSC Quantum 4 CCD detector. Interestingly, the resolution of the metal soak data sets was generally lower compared to the native Ape FEN-1 1.4 Å resolution data set. It was initially speculated that the decrease in resolution of the metal-soaked Ape FEN-1 crystals may have been the result of a protein conformational change upon binding to these divalent metals at the active site. However, structure determination by molecular replacement using the metal-free Ape FEN-1 model and subsequent analysis of the respective difference electron density maps revealed that no divalent metals were bound in the Ape FEN-1 active region.

In addition to the crystal soaking experiments, co-crystallization experiments in the presence of 20 and 200 mM MgCl2 were completed using the native Ape FEN-1 solubility buffer conditions (25 mM Na PIPES pH 6.5, 50 mM disodium hydrogen citrate, 50 mM KCl, and 50 mM NH4Cl) and crystallization conditions (12% (w/v) polyethylene glycol (PEG) 4000, 100 mM Na MES pH 5.6, and 200 mM sodium formate). Another native Ape FEN-1 growth condition was also attempted using


19% PEG 4000, 200 mM MgCl2, and 100 mM Na MES pH 5.6. Following the respective crystal growth experiments, the Ape FEN-1 crystals were cryogenically preserved for

X-ray diffraction data collection as was described in section 2.1.1. X-ray diffraction data sets were then collected on the Ape FEN-1 crystals grown in the presence of MgCl2

(see Table 3.1). All data were collected at APS on the BioCARS 14-BM-C beamline

(0.9 Å wavelength) using an ADSC Quantum 4 CCD detector. These crystals also diffracted to a lower resolution than the native Ape FEN-1 crystals. Following structure determination by molecular replacement using the metal-free Ape FEN-1 model, analysis of the respective difference electron density maps revealed that no divalent metals were bound in the Ape FEN-1 active region. Structure determination by molecular replacement for both the Ape FEN-1 metal-soaking and the Ape FEN-1 co-crystallization experiments is the same as that described in section 3.1.6.

Interestingly, metal-soaking and crystal growth experiments in the presence of divalent metals using the native metal-free Ape FEN-1 crystal form have not produced evidence of metal binding in the active site of the Ape FEN-1 enzyme. The presence of

50 mM disodium hydrogen citrate in the optimized Ape FEN-1 solubility buffer could have contributed to a lack of binding of divalent metal ions to the active site. Citrate is a well-known chelator of divalent metal ions. Thus, the formation of magnesium citrate or manganese citrate could have interfered with the chelation of the divalent metal ions by the protein. However, even at high concentrations (200 mM) of both magnesium and manganese, no evidence of metal binding was observed in the Ape

FEN-1 active site. It is believed that the binding of divalent metal ions and/or DNA substrate to the FEN-1 enzyme family may involve a structural rearrangement of the

residues in the active site. This structural rearrangement may be caused by a conformational change in the bridge region resulting in a more open conformation upon binding of metal or DNA substrate. Based on this hypothesis, it was assumed that the native Ape FEN-1 crystal lattice packing may not have allowed the required conformational change to occur for the binding of divalent metals. Therefore, it was decided to rescreen for a new crystallization condition of Ape FEN-1 grown in a citrate-free magnesium-optimized solubility buffer.

Table 3.1: X-ray diffraction data sets of Ape FEN-1 divalent metal soaking and growth experiments using the metal-free Ape FEN-1 crystal form.

#    Crystal                                      Resolution   Source/Date
1    Ape FEN-1 + MgCl2 soaking                    1.9 Å        APS BioCARS 14-BM-C, July 2002
2    Ape FEN-1 + MnCl2 soaking                    1.8 Å        APS BioCARS 14-BM-C, July 2002
3    Ape FEN-1 + SrCl2 soaking                    2.5 Å        APS BioCARS 14-BM-C, July 2002
4    Ape FEN-1 + BaCl2 soaking                    1.8 Å        APS BioCARS 14-BM-C, July 2002
5    Ape FEN-1 + ZnCl2 soaking                    1.8 Å        APS BioCARS 14-BM-C, July 2002
6    Ape FEN-1 + MgCl2 soaking (#2)               1.8 Å        APS BioCARS 14-BM-C, Dec 2002
7    Ape FEN-1 + SrCl2 soaking (#2)               1.8 Å        APS BioCARS 14-BM-C, Dec 2002
8    Ape FEN-1 + SrCl2 soaking (#3)               1.7 Å        APS BioCARS 14-BM-C, Dec 2002
9    Ape FEN-1 + MnCl2 soaking (#2)               2.5 Å        APS BioCARS 14-BM-C, Apr 2003
10   Ape FEN-1 + MgCl2 (native growth condition)  1.9 Å        APS BioCARS 14-BM-C, Aug 2003
11   Ape FEN-1 + MgCl2 (native growth condition)  2.1 Å        APS BioCARS 14-BM-C, Aug 2003

Previous solubility screening for an optimal crystallization buffer for the Ape

FEN-1 enzyme had shown that the optimum pH was 6.5, the best anion was citrate, and the best cations were Mg2+ and Ca2+ (Collins et al., 2004). A citrate-free solubility


optimized buffer of 50 mM Bis-Tris pH 6.5, 100 mM NH4Cl, and 25 mM MgCl2 was chosen for crystallization trials to find a metal-bound form of Ape FEN-1. Purified Ape

FEN-1 protein (see section 3.1.3) was dialyzed overnight at 21 °C against 50 mM

Bis-Tris pH 6.5, 100 mM NH4Cl, and 25 mM MgCl2. After dialysis, the Ape FEN-1 protein solution was concentrated to ~15 mg/mL for crystallization trials. All protein samples were filtered prior to crystallization using Millipore Ultrafree-MC HV

Centrifugal Filters™ with a 0.45 µm filter.

Due to the large number of variables affecting crystallization, such as concentration, temperature, pH, ionic strength, precipitants, and additives, initial crystallization trials are usually simplified by using an incomplete factorial method or a sparse matrix method. The incomplete factorial method of crystallization screening involves sampling a very coarse matrix of crystallization conditions in which the results are used to construct finer grids around presumed or projected conditions of the incomplete factorial (Carter and Carter, 1979). In contrast, the sparse matrix method of crystallization screening involves sampling conditions that are heavily weighted or biased towards known or published crystallization conditions (Jancarik and Kim, 1991). The sparse matrix approach considers buffer pH, additives, and precipitating agents as important parameters in the formulation of a sparse matrix crystallization screen. If either the incomplete factorial or the sparse matrix method for initial crystal screening yields any potential “hit” crystallization conditions, a systematic grid screening approach can be utilized. Grid screens can be formulated to complete a systematic local search around a potential “hit” condition in order to optimize the crystal growth. Currently, a number of crystallization screening kits are commercially available. Many of these

commercially available screens are sparse matrix formulations that sample a wide range of parameters that affect the crystallization of macromolecules. Table 3.2 shows commonly used crystal screening kits, which together contain a total of 626 conditions that can be tested. The Ion Screen is the precursor to the Hampton Research PEG/Ion Screen and is prepared in-house (Cudney, B., personal communication; Mueser et al., 2000).

Table 3.2: Commercially available and in-house crystal screening kits.

Screen               Conditions   Type of Screen                                                                   Distributor
Crystal Screen I™    50           The original sparse matrix screen.                                               Hampton Research
Crystal Screen II™   48           Sparse matrix: samples salts, polymers, organics, and pH.                        Hampton Research
Natrix™              48           Sparse matrix: samples salts, polyols, organics, and pH.                         Hampton Research
SaltRX™              96           Sparse matrix: samples salt and pH.                                              Hampton Research
Index™               96           Sparse matrix, incomplete factorial, and grid screen format.                     Hampton Research
Wizard I™            48           Random sparse matrix.                                                            Emerald BioSystems
Wizard II™           48           Random sparse matrix.                                                            Emerald BioSystems
Cryo I™              48           Sparse matrix cryoprotectant screen.                                             Emerald BioSystems
Cryo II™             48           Sparse matrix cryoprotectant screen.                                             Emerald BioSystems
Ion Screen           96           Samples cations and anions versus pH at a constant precipitant concentration.    Prepared in-house

A large number of conditions can be screened with high-throughput crystallization plates such as the Corning 96-well™ or the Greiner 96-well™ single or multidrop (3 drops per well) plates. These high-throughput screening plates are set up for sitting-drop vapor diffusion crystallization experiments. Once positive crystal hits are identified, they can then be expanded and optimized using hanging-drop vapor diffusion VDX 24-well™,


Costar 24-well™, or Nextal 24-well™ expansion plates with the aim of producing single, diffraction quality crystals. The expansion trays can also be set up in a sitting-drop vapor diffusion format using 48-well and 96-well trays if a larger local search is needed. In the

Mueser lab, all expansion trays are set up using the A/B gradients method. In this method, shallow and coarse gradients are used to vary a single chemical parameter around the initial crystallization conditions (Senger and Mueser, 2004). Parameters that can be varied include, but are not limited to, buffer, PEG and salt concentration. With this method, two solutions are prepared, an initial solution (A) and a final solution (B), using standardized pipetting maps. Coarse and shallow step gradients can be prepared by adding decreasing amounts of A to consecutive wells followed by an increasing volume of B. Coarse gradients are used to further optimize the growth conditions and shallow gradients are used to produce diffraction quality crystals.
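The A/B gradient idea can be illustrated with a short calculation. This is only a sketch under stated assumptions: it uses evenly spaced linear steps and a hypothetical 1000 µL reservoir volume, and it does not reproduce the standardized pipetting maps of Senger and Mueser (2004), which are not given here.

```python
def ab_gradient(conc_a, conc_b, n_wells, well_volume_ul=1000.0):
    """Pipetting map for a linear A/B step gradient.

    conc_a, conc_b: concentration of the varied component in solutions A and B
                    (same units, e.g. % PEG).
    well_volume_ul: ASSUMED reservoir volume per well.
    Returns a list of (well number, µL of A, µL of B, resulting concentration).
    """
    if n_wells < 2:
        raise ValueError("a gradient needs at least two wells")
    rows = []
    for i in range(n_wells):
        fraction_b = i / (n_wells - 1)        # 0 in the first well, 1 in the last
        vol_b = well_volume_ul * fraction_b   # increasing volume of B
        vol_a = well_volume_ul - vol_b        # decreasing volume of A
        conc = (conc_a * vol_a + conc_b * vol_b) / well_volume_ul
        rows.append((i + 1, vol_a, vol_b, conc))
    return rows
```

For instance, `ab_gradient(1.0, 20.0, 6)` describes a coarse six-well gradient from 1% to 20% of a precipitant, while a narrower range over more wells, such as `ab_gradient(8.0, 14.0, 12)`, gives a shallow gradient.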

In vapor diffusion crystallization techniques, nucleation can occur when the protein concentration increases through the dehydration-driven reduction of solution volume caused by the equilibration of water vapor between the drop containing the protein and the more concentrated reservoir solution (Weber, 1997). Vapor diffusion promotes the transition of an undersaturated protein solution through the metastable region of the crystallization phase diagram to the labile region where the protein solution becomes supersaturated. If nucleation conditions do occur as the protein becomes supersaturated, the formation of ordered nuclei and subsequent crystal growth can take place. The rate at which equilibrium is reached in vapor diffusion determines whether crystal growth will proceed from the labile or the metastable region. Ideally, the goal of vapor diffusion techniques is to promote or induce the formation of nuclei at the lowest level of

supersaturation that is needed in the labile region of the phase diagram. As the nuclei start to grow, the system would slowly enter the metastable region of the phase diagram as the amount of solute (protein) in solution is depleted during crystal growth. Crystal growth would be slow and ordered at this ideal position in the phase diagram. The likelihood of competing nuclei or precipitate formation during crystal growth would be very low due to the decreasing amount of supersaturation (McPherson, 1999).
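The concentrating effect of vapor diffusion can be estimated with simple bookkeeping. The sketch below is an idealized model, not a description of the actual equilibration kinetics: it assumes that only water is exchanged, that the reservoir is large enough for its concentration to stay constant, and that the solutes in the protein buffer can be neglected. The function name and dictionary keys are illustrative.

```python
def drop_equilibration(protein_mg_ml, protein_ul, reservoir_ul):
    """Idealized start and end states of a vapor diffusion drop.

    The drop is made by mixing protein stock with reservoir solution.
    At equilibrium the drop precipitant concentration equals that of the
    reservoir, so the drop shrinks to the volume of reservoir solution added.
    """
    if protein_ul <= 0 or reservoir_ul <= 0:
        raise ValueError("volumes must be positive")
    initial_ul = protein_ul + reservoir_ul
    return {
        "initial_drop_ul": initial_ul,
        "initial_protein_mg_ml": protein_mg_ml * protein_ul / initial_ul,
        "initial_precipitant_fraction": reservoir_ul / initial_ul,
        "final_drop_ul": reservoir_ul,
        "final_protein_mg_ml": protein_mg_ml * protein_ul / reservoir_ul,
    }
```

In this model a 1:1 drop starts at half the stock protein concentration and half the reservoir precipitant concentration, and both double as the drop equilibrates.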

After the Ape FEN-1 protein was concentrated to ~15 mg/mL, sitting-drop vapor diffusion crystallization trials at 21 °C were completed using the following crystal screen kits: Crystal Screen I™, Crystal Screen II™, SaltRX™, Index™, Wizard I™,

Wizard II™, and the Ion Screen. Corning 96-well trays were used for the sitting-drop vapor diffusion crystal trials with 1 µL of reservoir solution and 1 µL of protein sample.

The trays were then taped and stored at 21 °C until they were examined at 7 and 10 days following initial setup. Crystal hits were observed in Crystal Screen I™, Crystal Screen

II™, and in the Wizard II™ screens. Izit Crystal Dye™ (Hampton Research) was added to each respective hit condition and the crystals turned blue. Izit dye was used to determine if the crystal hit was protein or salt. Crystals will turn blue if they are protein because the Izit dye can diffuse through the solvent channels of a protein crystal. The best crystal hits are shown in Figure 3.8. The most promising crystal screen hits were from Crystal Screen I™ (8% PEG 4000 and 100 mM Na acetate pH 4.6) and from

Crystal Screen II™ (10% PEG 8000, 8% (v/v) ethylene glycol, and 100 mM Na HEPES pH 7.5). The Wizard II™ hit condition (10% PEG 3000, 100 mM K citrate pH 4.2, and

200 mM NaCl) contained citrate and therefore was not optimized. Expansions of the

Crystal Screen I™ and Crystal Screen II™ hit conditions were set up using hanging-drop

vapor diffusion 24-well expansion crystal trays. All expansion trays were set up using

2 µL of reservoir solution and 2 µL of Ape FEN-1 protein solution at a concentration of

~15 mg/mL in the following dialysis buffer: 50 mM Bis-Tris pH 6.5, 100 mM NH4Cl, and 25 mM MgCl2. For the Crystal Screen I™ hit, a coarse gradient of 1-20% PEG 4000 and 100 mM Na acetate pH 4.6 was set up at 21 °C using a Costar 24-well™ tray. For the Crystal Screen II™ hit condition, a coarse gradient of 1-20% PEG 8000, 8% ethylene glycol, and 100 mM Na HEPES pH 7.5 was set up at 21 °C using a Costar 24-well™ tray.


Figure 3.8: Initial crystal screen hits of Ape FEN-1 with MgCl2 at 21 °C. A: Crystal Screen I™ hit condition: 8% PEG 4000 and 100 mM Na acetate pH 4.6. B: Wizard II™ hit condition: 10% PEG 3000, 100 mM K citrate pH 4.2, and 200 mM NaCl. C: Crystal Screen II™ hit condition: 10% PEG 8000, 8% (v/v) ethylene glycol, and 100 mM Na HEPES pH 7.5. All three hit conditions were dyed blue after the addition of Izit Crystal Dye™.


Crystals of Ape FEN-1 grown in the presence of MgCl2 that were obtained from the coarse gradient expansions of the Crystal Screen I™ and Crystal Screen II™ hits are shown in Figure 3.9. The Ape FEN-1 crystals in both of the respective expansions were approximately 0.5 mm × 0.4 mm.

Figure 3.9: Coarse gradient expansion crystals of Ape FEN-1 with MgCl2 at 21 °C. A: Crystal Screen I™ hit condition expansion: ~4% PEG 4000 and 100 mM Na acetate pH 4.6. B: Crystal Screen II™ hit condition expansion: ~5% PEG 8000, 8% (v/v) ethylene glycol, and 100 mM Na HEPES pH 7.5.

Following expansion and optimization of an initial crystal hit, the crystals must then be cryogenically flash frozen in liquid nitrogen prior to X-ray diffraction data collection. Protein crystals are typically very fragile due to the relatively loose packing of molecules in the crystal, and they can contain between ~30% and ~70% solvent, most of which is disordered in the solvent channels between the protein molecules of the crystal lattice (Matthews, 1968). During the freezing process, the large amount of solvent in the crystal can cause ice formation and fracture lines that can damage the protein crystal and affect the diffraction experiment. Therefore, care must be taken to preserve the crystal integrity when protein crystals are cryogenically frozen in liquid nitrogen. An appropriate cryoprotectant solution must be chosen so that the protein crystal is protected during the freezing process. Cryoprotectant solutions reduce the potential of water and thus

disrupt the hydrogen bond structure of the solvent by altering the dielectric constant and polarizability of the solvent. The presence of a cryoprotectant can then change the way in which water freezes in the crystal. Ideally, the cryoprotectant solution will freeze as a clear, transparent glass when the protein crystal is frozen in liquid nitrogen, resulting in little damage to the crystal. Some commonly used cryoprotectant solutions and their respective concentrations are shown in Table 3.3 (adapted from Rogers, 1997). There is no universal cryoprotectant solution available (McPherson, 2004); thus, an extensive search must sometimes be completed to find a suitable cryoprotectant solution in which the protein crystal is stable.

Table 3.3: Known cryoprotectant solutions and protective concentrations (adapted from Rogers, 1997).

Cryoprotectant                      Concentration (%)
Erythritol                          11 (w/v)
Ethylene glycol                     11-30 (v/v)
Glucose                             25 (w/v)
Glycerol                            13-30 (v/v)
2-methyl-2,4-pentanediol (MPD)      20-30 (v/v)
PEG 400                             25-35 (v/v)
Xylitol                             22 (w/v)
(2R,3R)-(-)-butane-2,3-diol         8 (v/v)

When a suitable cryoprotectant has been found, protein crystals can then be harvested and cryogenically frozen in liquid nitrogen for data collection. For low temperature data collection, crystals are mounted using small nylon loops (various sizes based on crystal size) that are inserted into the steel base of a crystal mounting pin. The crystal mounting pin can be magnetically attached to a crystal manipulation tool or to a goniometer head

for data collection. During mounting, a single crystal is picked up from the mother liquor containing a cryoprotectant solution and is directly flash frozen either in liquid nitrogen or in the cryostream at the beamline.

The Ape FEN-1 crystals that were grown in the presence of MgCl2

(see Figure 3.9) were soaked momentarily in a substitute mother liquor containing a cryoprotectant and were then flash frozen in liquid nitrogen. For the Ape FEN-1 crystal grown in ~5% PEG 8000, 8% ethylene glycol, and 100 mM Na HEPES pH 7.5, a substitute mother liquor solution of the following was made: 50 mM Bis-Tris pH 6.5,

100 mM NH4Cl, 25 mM MgCl2, ~5% PEG 8000, 100 mM Na HEPES pH 7.5, and 25% ethylene glycol. Ethylene glycol was chosen as a cryoprotectant because the crystal growth condition contained 8% ethylene glycol. It was therefore assumed that the crystals would be stable following an increase in the cryoprotectant concentration (8% to

25% ethylene glycol). The Ape FEN-1 crystals were stable after being soaked momentarily in the substitute mother liquor containing 25% ethylene glycol and were then flash frozen in liquid nitrogen and stored until X-ray diffraction data collection was attempted. For the Ape FEN-1 crystal grown in ~4% PEG 4000 and 100 mM Na acetate pH 4.6, the following substitute mother liquor was made: 50 mM Bis-Tris pH 6.5, 100 mM NH4Cl, 25 mM MgCl2, ~4% PEG 4000, 100 mM Na acetate pH 4.6, and 25% ethylene glycol. Ethylene glycol was chosen as a cryoprotectant because the previously described Ape FEN-1 crystal (Figure 3.9B) was stable when soaked in a substitute mother liquor containing 25% ethylene glycol. The Ape FEN-1 crystals grown in ~4%

PEG 4000 and 100 mM Na acetate pH 4.6 were also stable following a momentary soak in the substitute mother liquor containing 25% ethylene glycol and were then flash frozen

in liquid nitrogen and stored until X-ray diffraction data collection was attempted.
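Preparing a substitute mother liquor such as those described above is a dilution calculation (C1V1 = C2V2) for each component. The following is a minimal sketch: the target concentrations come from the text for the PEG 8000 condition, whereas every stock concentration and the 1000 µL final volume are hypothetical values chosen only for illustration.

```python
# Target composition from the text (mM for salts and buffers, % for PEG and ethylene glycol).
TARGETS = {
    "Bis-Tris pH 6.5": 50.0,
    "NH4Cl": 100.0,
    "MgCl2": 25.0,
    "PEG 8000": 5.0,
    "Na HEPES pH 7.5": 100.0,
    "ethylene glycol": 25.0,
}

# HYPOTHETICAL stock concentrations, in the same units as the targets.
STOCKS = {
    "Bis-Tris pH 6.5": 1000.0,
    "NH4Cl": 2000.0,
    "MgCl2": 1000.0,
    "PEG 8000": 50.0,
    "Na HEPES pH 7.5": 1000.0,
    "ethylene glycol": 100.0,
}


def recipe(targets, stocks, final_ul):
    """Volume (µL) of each stock, plus water, needed to reach the target composition."""
    volumes = {}
    for name, target in targets.items():
        stock = stocks[name]
        if target > stock:
            raise ValueError(f"stock of {name} is too dilute")
        volumes[name] = final_ul * target / stock  # C1V1 = C2V2
    water = final_ul - sum(volumes.values())
    if water < 0:
        raise ValueError("stock volumes exceed the final volume")
    volumes["water"] = water
    return volumes
```

With these assumed stocks, `recipe(TARGETS, STOCKS, 1000.0)` calls for 250 µL of ethylene glycol and leaves 425 µL to be made up with water.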

Having shown that the Ape FEN-1 crystals in Figure 3.9 were stable following a momentary soak in a respective substitute mother liquor containing 25% ethylene glycol, additional expansion trays were set up in which the Ape FEN-1 crystals were grown in the presence of varying concentrations of ethylene glycol. One 4x6 format (four experiments of six-well gradients in one 24-well tray) hanging-drop vapor diffusion expansion tray was set up at 21 °C with 6-16% PEG 4000 and 100 mM Na acetate pH 4.6 across each gradient and 5, 10, 15, and 20% ethylene glycol in each experiment. Another

4x6 hanging-drop vapor diffusion expansion tray was set up at 21 °C with 8-18% PEG

8000 and 100 mM Na HEPES pH 7.5 across each gradient and 5, 10, 15, and 20% ethylene glycol in each experiment. Both of the 4x6 expansion trays were set up using

2 µL of reservoir solution and 2 µL of Ape FEN-1 protein solution at a concentration of

~15 mg/mL in the following dialysis buffer: 50 mM Bis-Tris pH 6.5, 100 mM NH4Cl, and 25 mM MgCl2. In both of the 4x6 expansion trays, crystals were present in all concentrations (5-20%) of ethylene glycol. The best single crystals were observed towards the lower end of the PEG 4000 and PEG 8000 gradients, respectively. These experiments showed that it was possible to grow diffraction quality single crystals of Ape

FEN-1 in the presence of protective concentrations of cryoprotectant (20% ethylene glycol).

During the optimization of the crystal hits from the rescreening of Ape FEN-1 in the presence of MgCl2 (Figures 3.8 and 3.9), it was suggested that a heat incubation of

Ape FEN-1 in the presence of divalent metal ions prior to crystallization may promote the binding of divalent metals in the active site region (Petsko, G.A., personal

communication). It was hypothesized that an elevated temperature may allow the enzyme the needed conformational flexibility to bind divalent metals, because Ape

FEN-1 is an archaeal enzyme with optimal catalytic activity near 75 °C (Kaiser et al.,

1999). Heat incubation studies were then completed using purified Ape FEN-1 in the presence of various divalent metals (MgCl2, MnCl2, CaCl2, SrCl2, BaCl2, and ZnCl2).

Purified Ape FEN-1 protein at ~15 mg/mL (see section 3.1.3) was dialyzed overnight at

21 °C against 25 mM Bis-Tris pH 6.5, 100 mM NH4Cl, and 50 mM of one of the divalent metal salts (MgCl2, MnCl2, CaCl2, SrCl2, BaCl2, and ZnCl2). The Ape FEN-1 protein precipitated slightly following dialysis against ZnCl2. The dialyzed Ape FEN-1 protein in the presence of each divalent metal was then placed in a thermal cycler, incubated at 75 °C for 30 min, and slow-cooled in 5 °C increments until the samples reached 25 °C.

The Ape FEN-1 sample in the presence of ZnCl2 was completely precipitated following the heat incubation and slow-cool cycle, and was therefore not used in further crystallization studies. The heat incubated Ape FEN-1 protein samples in the presence of divalent metal ions (MgCl2, MnCl2, CaCl2, SrCl2, and BaCl2, respectively) were then centrifuged at 10,000 × g for 5 min to remove any slight precipitate that formed during the heat incubation. Any slight precipitation that was present could have been due to endogenous E. coli proteins that were copurified along with Ape FEN-1. The protein concentration of the Ape FEN-1 samples was then checked using the absorbance at 280 nm as described in section 3.1.3 and was shown to be ~15 mg/mL for all samples following heat incubation. The heat incubated Ape FEN-1 samples were then filtered through 0.45 µm spin filters in preparation for crystallization experiments.
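The slow-cool cycle described above can be written out as an explicit list of temperature set points. This is a small illustrative sketch with a hypothetical function name; the hold time at each step is not stated in the text and is therefore left out.

```python
def slow_cool_steps(start_c=75, end_c=25, step_c=5):
    """Temperature set points (°C) for a stepwise slow cool, including both end points."""
    if step_c <= 0 or start_c < end_c:
        raise ValueError("need a positive step and start_c >= end_c")
    temps = list(range(start_c, end_c - 1, -step_c))
    if temps[-1] != end_c:
        temps.append(end_c)  # finish exactly at the end temperature
    return temps
```

With the defaults taken from the text (75 °C to 25 °C in 5 °C increments), the schedule has eleven set points, ten of them cooling steps after the initial 30 min incubation at 75 °C.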


Several 2x12 format (two experiments of twelve-well gradients in one 24-well tray) hanging-drop vapor diffusion crystallization expansion trays were then set up at

21 °C. Shallow gradients of 8-14% PEG 8000, 100 mM Na HEPES pH 7.5, and 25% ethylene glycol were set up for Ape FEN-1 in the presence of MgCl2, MnCl2, and CaCl2, respectively, and 8-12% PEG 8000, 100 mM Na HEPES pH 7.5, and 25% ethylene glycol were set up for Ape FEN-1 in the presence of SrCl2 and BaCl2, respectively. All of the 2x12 expansion trays were set up using 2 µL of reservoir solution and 2 µL of Ape

FEN-1 protein solution at a concentration of ~15 mg/mL. After approximately 3 or 4 days, diffraction quality single crystals were present in all of the respective expansion trays. Examples of some of the Ape FEN-1 crystals that were grown in the presence of divalent metal ions following heat incubation are shown in Figure 3.10.


Figure 3.10: Shallow gradient expansion crystals at 21 °C of Ape FEN-1 in the presence of divalent metal ions following heat incubation. A: Ape FEN-1 crystal in the presence of MgCl2 grown at ~11.5% PEG 8000, 100 mM Na HEPES pH 7.5, and 25% (v/v) ethylene glycol. B: Ape FEN-1 crystal in the presence of MnCl2 grown at ~12% PEG 8000, 100 mM Na HEPES pH 7.5, and 25% (v/v) ethylene glycol. C: Ape FEN-1 crystal in the presence of SrCl2 grown at ~9.3% PEG 8000, 100 mM Na HEPES pH 7.5, and 25% ethylene glycol.


The PEG 8000, 100 mM Na HEPES pH 7.5, and 25% ethylene glycol growth condition was used for all heat incubated Ape FEN-1 and divalent metal crystallization studies because Ape FEN-1 crystals grown from the PEG 4000, 100 mM Na acetate pH 4.6, and

20% ethylene glycol growth condition (see Figure 3.9) diffracted poorly when first tested

(see section 3.1.5).

All of the Ape FEN-1 crystals grown in the presence of divalent metal ions were then harvested directly from the growth drop, flash frozen in liquid nitrogen, and stored for data collection. The growth mother liquor contained a cryoprotectant (25% (v/v) ethylene glycol) at a concentration high enough to be protective (see Table 3.3), so the crystals could be harvested and flash frozen directly in liquid nitrogen.

3.1.5 X-ray Diffraction Data Collection and Data Processing

All Ape FEN-1 crystals grown in the presence of different divalent metal ions

(with and without heat incubation prior to crystallization in the citrate-free buffer) were either soaked momentarily in a substitute mother liquor containing a cryoprotectant or directly flash frozen in liquid nitrogen as was described in section 3.1.4. Multiple data sets were collected both at BioCARS 14-BM-C (Argonne National Laboratory,

Advanced Photon Source, Chicago, IL, USA) using an ADSC Quantum 315 CCD detector and in-house (Ohio Macromolecular Crystallography Consortium (OMCC),

University of Toledo, Toledo, OH, USA) using a Rigaku FRE High Brilliance X-ray

Generator with an R-AXIS IV image plate detector and a Saturn 92 CCD detector (see

Figure 3.11).


Figure 3.11: Advanced Photon Source (APS) and in-house X-ray sources. A: BioCARS 14-BM-C beamline with an ADSC Quantum 315 CCD detector. B: In-house Rigaku FRE High Brilliance X-ray Generator with an R-AXIS IV image plate detector and a Saturn 92 CCD detector.

The Ape FEN-1 crystals grown in the presence of divalent metal ions that were optimized from the Crystal Screen I™ hit condition (8% PEG 4000 and 100 mM Na acetate pH 4.6) shown in Figures 3.8A and 3.9A were tested for X-ray diffraction using the in-house X-ray source. Examination of multiple diffraction images taken 90° apart showed that none of the crystals tested diffracted well. The diffraction patterns were streaked and did not extend to high resolution; thus, no data were collected on any of the Ape FEN-1 crystals optimized from the Crystal Screen I™ hit condition.

The Ape FEN-1 crystals grown in the presence of divalent metal ions (Figures 3.8C,

3.9B, and 3.10) that were optimized from the Crystal Screen II™ hit condition (10% PEG

8000, 8% (v/v) ethylene glycol, and 100 mM Na HEPES pH 7.5) were also tested for diffraction both at the synchrotron and in-house. Most of the Ape FEN-1 crystals diffracted to high resolution; therefore, multiple data sets were collected on crystals grown in the presence of different divalent metal ions (see Table 3.4). All data collected at the BioCARS 14-BM-C beamline were collected using a wavelength of 0.9 Å, and all

101 data collected in-house were collected using a wavelength of 1.54 Å. Each data set was collected at -173 °C and an appropriate rotation range for the data collection was determined so that all unique reflections were measured at least once in all of the respective data sets. For all of the data sets, 75-90 degrees of data were needed due to the hexagonal symmetry of the Ape FEN-1 crystals grown in the presence of divalent metals.

A 60 degree rotation range was required in order to collect all unique reflections at least once, however an additional 15-30 degrees of rotation was collected resulting in multiple measurements of equivalent reflections along the principle axis (see Chapter 2, section

2.1 and 2.2 for discussion). An X-ray diffraction image of an Ape FEN-1 crystal grown in the presence of MnCl2 following heat incubation is shown in Figure 3.12.
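The two wavelengths quoted above place a given resolution at quite different scattering angles. The short Python sketch below illustrates this with Bragg's law (λ = 2d sin θ); the d-spacings chosen (3.0 Å and 1.8 Å) are illustrative values only, and the function name is hypothetical rather than part of any software used in this work.

```python
import math

def two_theta_deg(wavelength_A, d_spacing_A):
    """Scattering angle 2-theta (degrees) from Bragg's law: lambda = 2 d sin(theta)."""
    return 2.0 * math.degrees(math.asin(wavelength_A / (2.0 * d_spacing_A)))

# Wavelengths quoted in the text (in Angstroms).
SOURCES = {"BioCARS 14-BM-C": 0.9, "in-house": 1.54}

for name, wavelength in SOURCES.items():
    for d in (3.0, 1.8):  # illustrative d-spacings, not taken from the data sets
        angle = two_theta_deg(wavelength, d)
        print(f"{name}: d = {d} A -> 2-theta = {angle:.1f} deg")
```

For the same d-spacing, the longer in-house wavelength gives the larger scattering angle, so the detector must cover a wider angular range to record the same resolution.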

Figure 3.12: X-ray diffraction image of an Ape FEN-1 crystal grown in the presence of MnCl2. The Ape FEN-1 protein was heat incubated in the presence of MnCl2 prior to crystallization. Data were collected in-house using a Rigaku FRE High Brilliance X-ray Generator with a Saturn 92 CCD detector.

Table 3.4: Data sets of Ape FEN-1 in the presence of divalent metals (MgCl2, MnCl2, CaCl2, SrCl2, and BaCl2) following crystallization rescreening and optimization in a citrate-free buffer. Ape FEN-1 protein was incubated at 75 °C for 30 min in the presence of divalent metals and slow-cooled to 25 °C prior to crystallization (indicated as "Heat incubated" for data sets 2-7 below). The data were indexed, integrated, and scaled at the respective resolutions shown. In data sets 1, 2, 3, and 6 the data were only useful to the resolution indicated in parentheses.

| | Data set 1 | Data set 2 | Data set 3 | Data set 4 | Data set 5 | Data set 6 | Data set 7 |
|---|---|---|---|---|---|---|---|
| Detector | ADSC Quantum 315 CCD | Rigaku Saturn 92 CCD | Rigaku Saturn 92 CCD | Rigaku R-Axis IV image plate | Rigaku Saturn 92 CCD | Rigaku Saturn 92 CCD | Rigaku Saturn 92 CCD |
| Crystal | Ape FEN-1 + MgCl2 | Ape FEN-1 + MgCl2, heat incubated | Ape FEN-1 + MnCl2, heat incubated | Ape FEN-1 + CaCl2, heat incubated | Ape FEN-1 + SrCl2, heat incubated | Ape FEN-1 + BaCl2, heat incubated | Ape FEN-1 + BaCl2, heat incubated |
| Lattice type | Primitive hexagonal | Primitive hexagonal | Primitive hexagonal | Primitive hexagonal | Primitive hexagonal | Primitive hexagonal | Primitive hexagonal |
| Space group | P61 | P61 | P61 | P61 | P61 | P61 | P61 |
| Asymmetric unit | One molecule | One molecule | One molecule | One molecule | One molecule | One molecule | One molecule |
| Cell dimensions a = b | 93.188 Å | 93.065 Å | 92.816 Å | 92.611 Å | 92.736 Å | 93.027 Å | 92.907 Å |
| Cell dimension c | 81.101 Å | 81.288 Å | 80.959 Å | 81.207 Å | 81.131 Å | 81.265 Å | 81.040 Å |
| Cell angles | α = β = 90°, γ = 120° | α = β = 90°, γ = 120° | α = β = 90°, γ = 120° | α = β = 90°, γ = 120° | α = β = 90°, γ = 120° | α = β = 90°, γ = 120° | α = β = 90°, γ = 120° |
| Resolution | 1.8 Å (useful to 2.0 Å) | 1.68 Å (useful to 1.95 Å) | 1.35 Å (useful to 1.8 Å) | 1.9 Å | 1.75 Å | 1.9 Å (useful to 1.97 Å) | 2.9 Å |
| Rmerge | 5.6% | 6.5% | 6.7% | 5.9% | 3.9% | 6.4% | 9.9% |
| Observed reflections | 291,517 | 146,419 | 564,726 | 232,239 | 190,172 | 124,231 | 49,653 |
| Unique reflections | 72,395 | 39,306 | 75,499 | 31,063 | 78,127 | 60,967 | 17,220 |
| Completeness | 99.2% | 85.6% | 87.2% | 99.5% | 99.3% | 98.5% | 99.5% |
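As a quick consistency check on Table 3.4, the average number of measurements per unique reflection can be estimated by dividing the observed by the unique reflection counts. The Python sketch below does only this arithmetic on the tabulated values; the function name is hypothetical, and the result is a simple ratio rather than the multiplicity statistic reported by SCALEPACK.

```python
# Observed and unique reflection counts copied from Table 3.4 (data sets 1-7).
OBSERVED = {1: 291_517, 2: 146_419, 3: 564_726, 4: 232_239,
            5: 190_172, 6: 124_231, 7: 49_653}
UNIQUE = {1: 72_395, 2: 39_306, 3: 75_499, 4: 31_063,
          5: 78_127, 6: 60_967, 7: 17_220}

def average_multiplicity(observed, unique):
    """Average number of measurements per unique reflection."""
    return observed / unique

for data_set in sorted(OBSERVED):
    ratio = average_multiplicity(OBSERVED[data_set], UNIQUE[data_set])
    print(f"Data set {data_set}: {ratio:.2f} observations per unique reflection")
```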


All of the data sets of Ape FEN-1 grown in the presence of different divalent metal ions were indexed, integrated, and scaled to the resolutions shown in Table 3.4. The data were integrated using DENZO and merged using SCALEPACK from the HKL2000 software (Otwinowski and Minor, 1997). In some data sets (1, 2, 3, and 6), the data were only useful to the resolution indicated in parentheses in Table 3.4 because of both high Rmerge values and low I/σ values in the higher resolution shells (see Chapter 4, section 4.1.1 for discussion). As a result, the resolution indicated in parentheses in Table 3.4 was used in all refinements of data sets 1, 2, 3, and 6. All of the Ape FEN-1 crystals grown in the presence of different divalent metal ions belonged to the hexagonal space group P61 (systematic absences along the 00l axis consistent with a sixfold screw), with unit cell dimensions as shown in Table 3.4. Calculation of the Matthews’ coefficient indicated that there was one monomer per asymmetric unit in each crystal (Matthews, 1968). The overall Rmerge and completeness for each data set of Ape FEN-1 in the presence of divalent metal ions are shown in Table 3.4. The completeness of the highest useful resolution shell for data set 2 (1.95 Å) and data set 3 (1.8 Å) was 99.9% and 99.8%, respectively. Following data processing, the final scale files were used to create a CCP4 style data file (h k l, Fobs, and Sigma Fobs) containing an Rfree random data subset of 3% for use in the refinement software suite, CCP4 (Collaborative Computational Project, 1994).
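The Matthews' coefficient calculation mentioned above can be sketched in a few lines of Python. The cell dimensions are those of data set 1 in Table 3.4 and the six symmetry operators follow from space group P61, but the molecular weight used here (40,000 Da) is an assumed placeholder, not a value taken from this chapter, and the function names are hypothetical. The solvent estimate uses the common approximation 1 - 1.23/V_M.

```python
import math

def hexagonal_cell_volume(a, c):
    """Unit cell volume (cubic Angstroms) for a = b, alpha = beta = 90, gamma = 120 degrees."""
    return a * a * c * math.sin(math.radians(120.0))

def matthews_coefficient(cell_volume, n_sym_ops, n_molecules, mw_da):
    """V_M in cubic Angstroms per Dalton."""
    return cell_volume / (n_sym_ops * n_molecules * mw_da)

def solvent_fraction(v_m):
    """Approximate solvent fraction from V_M (common 1.23 approximation)."""
    return 1.0 - 1.23 / v_m

ASSUMED_MW_DA = 40_000.0   # assumption: placeholder monomer mass, not from the text
N_SYM_OPS_P61 = 6          # six general positions in space group P61

volume = hexagonal_cell_volume(93.188, 81.101)   # data set 1, Table 3.4
for n_molecules in (1, 2):
    v_m = matthews_coefficient(volume, N_SYM_OPS_P61, n_molecules, ASSUMED_MW_DA)
    print(f"{n_molecules} molecule(s): V_M = {v_m:.2f} A^3/Da, "
          f"solvent ~ {solvent_fraction(v_m):.0%}")
```

Under the assumed mass, only one molecule per asymmetric unit gives a V_M within the range usually seen for protein crystals, which is consistent with the conclusion stated in the text.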

3.1.6 Structure Determination and Refinement

The initial phasing of the seven Ape FEN-1 data sets shown in Table 3.4 was completed using AMoRe (Navaza and Saludjian, 1997) with the structure of the metal-free Ape FEN-1 in the absence of the bridge region helices as the search model (see Chapter 2, Figure 2.12 for the metal-free ribbon model). The bridge region helices (residues 88-130: H3 and H4 in Figure 2.12) were removed because it was previously hypothesized that the binding of divalent metal ions to the active site may cause a conformational change in the helical bridge region of the Ape FEN-1, resulting in a more open conformation of the enzyme. With these helices omitted, any conformational change seen in this region in the difference electron density maps would reflect the observed data and not the search model. Molecular replacement was used for structure determination of the Ape FEN-1 data sets in Table 3.4 because initial attempts to solve the various data sets in Table 3.1 (crystal soaks and crystal growth experiments using the metal-free crystallization condition) by rigid body and subsequent restrained refinement were unsuccessful. It was speculated that, even though all of the crystals belonged to the hexagonal space group P61 (see section 3.1.4), the rigid body and restrained refinements were unsuccessful because a different, although symmetrically equivalent, molecule in the unit cell may have been refined. The rigid body refinement was most likely not robust enough to locate and minimize the difference between the observed data and the search molecule. As a result, molecular replacement was needed for structure determination.

For all data sets in Table 3.4, the molecular replacement rotation and translation search was done using one monomer of metal-free Ape FEN-1 (no bridge region) from 9.0-3.0 Å. Following the rotation and translation search, a rigid body refinement was completed to optimize the rotational orientation of the search model. The molecular replacement results for the space group P61 are summarized in Table 3.5.


Table 3.5: Summary of molecular replacement results for the data sets of Ape FEN-1 grown in the presence of different divalent metals.

| Data sets (from Table 3.4) | R factor | Correlation coefficient (CC)a |
|---|---|---|
| Data set 1: Ape FEN-1 + MgCl2 | 30.3% | 77.9% |
| Data set 2: Ape FEN-1 + MgCl2 (heat incubated) | 33.6% | 74.1% |
| Data set 3: Ape FEN-1 + MnCl2 (heat incubated) | 32.7% | 76.1% |
| Data set 4: Ape FEN-1 + CaCl2 (heat incubated) | 32.9% | 73.5% |
| Data set 5: Ape FEN-1 + SrCl2 (heat incubated) | 32.9% | 74.5% |
| Data set 6: Ape FEN-1 + BaCl2 (heat incubated) | 31.9% | 75.9% |
| Data set 7: Ape FEN-1 + BaCl2 (heat incubated) | 32.0% | 75.4% |

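a $\mathrm{CC} = \dfrac{\sum_{H} \Delta|F_H^{obs}|\,\Delta|F_H^{cal}|}{\left[\sum_{H} \left(\Delta|F_H^{obs}|\right)^{2}\,\sum_{H} \left(\Delta|F_H^{cal}|\right)^{2}\right]^{1/2}}$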
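The correlation coefficient defined in footnote a of Table 3.5 can be illustrated with a short Python sketch. It assumes that Δ denotes the deviation of each amplitude from the mean amplitude, which makes CC an ordinary linear correlation between observed and calculated structure factor amplitudes. The amplitudes below are made-up numbers used only to exercise the function; this is not the AMoRe implementation.

```python
import math

def amplitude_cc(f_obs, f_calc):
    """Linear correlation coefficient between two lists of amplitudes.

    Assumes Delta|F| = |F| - mean(|F|), summed over all reflections H.
    """
    if len(f_obs) != len(f_calc) or not f_obs:
        raise ValueError("amplitude lists must be non-empty and of equal length")
    mean_obs = sum(f_obs) / len(f_obs)
    mean_calc = sum(f_calc) / len(f_calc)
    d_obs = [f - mean_obs for f in f_obs]
    d_calc = [f - mean_calc for f in f_calc]
    numerator = sum(o * c for o, c in zip(d_obs, d_calc))
    denominator = math.sqrt(sum(o * o for o in d_obs) * sum(c * c for c in d_calc))
    return numerator / denominator

# Hypothetical amplitudes, for illustration only.
example_obs = [120.0, 45.0, 80.0, 15.0, 200.0]
example_calc = [110.0, 50.0, 70.0, 25.0, 180.0]
print(f"CC = {amplitude_cc(example_obs, example_calc):.3f}")
```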

Following molecular replacement, output coordinate files for each data set were generated from the rigid body refinement that followed the rotation and translation search. The coordinate files of the respective Ape FEN-1 data sets in Table 3.4 were then used for Refmac5 restrained, positional refinement (CCP4) (Murshudov, 1997). Refmac5 restrained, positional refinement was chosen because it was believed that the metal-bound Ape FEN-1 model would be similar in overall fold and conformation to the metal-free form of the enzyme (see Chapter 2) except at the active site and bridge region. It was hypothesized that the binding of divalent metal ions to the active site may cause a conformational change in the helical bridge region of the Ape FEN-1. The molecular replacement output coordinate files of the respective Ape FEN-1 data sets in the presence of divalent metals were therefore refined in the absence of the bridge region helices (residues 88-130). If examination of the difference electron density maps following Refmac5 refinement indicated a large conformational change in the bridge region, simulated annealing refinement (CNS) would then be completed.

Using the output coordinate files (no bridge region) of the respective Ape FEN-1 data sets from molecular replacement, Refmac5 restrained refinement was done. A summary of the Refmac5 refinements is shown in Table 3.6. See Figure 3.13 for a summary of the structure determination and refinement process of all the data sets of Ape FEN-1 grown in the presence of divalent metals.

Table 3.6: Summary of Refmac5 refinement results for the data sets of Ape FEN-1 grown in the presence of different divalent metals.

| Data sets (from Table 3.4) | Refinement resolution | R value | R free |
|---|---|---|---|
| Data set 1: Ape FEN-1 + MgCl2 | 50.0-2.0 Å | 23.8% | 28.0% |
| Data set 2: Ape FEN-1 + MgCl2 (heat incubated) | 25.5-1.95 Å | 26.5% | 29.6% |
| Data set 3: Ape FEN-1 + MnCl2 (heat incubated) | 36.0-1.8 Å | 27.2% | 31.0% |
| Data set 4: Ape FEN-1 + CaCl2 (heat incubated) | 30.0-1.9 Å | 26.9% | 30.0% |
| Data set 5: Ape FEN-1 + SrCl2 (heat incubated) | 30.0-1.75 Å | 26.5% | 29.1% |
| Data set 6: Ape FEN-1 + BaCl2 (heat incubated) | 30.0-1.97 Å | 25.3% | 30.1% |
| Data set 7: Ape FEN-1 + BaCl2 (heat incubated) | 35.0-2.9 Å | 21.9% | 31.2% |
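For reference, the crystallographic R value is conventionally defined as Σ||Fobs| - |Fcalc|| / Σ|Fobs|, with Rfree computed the same way over the reserved test reflections. The Python sketch below shows that definition on hypothetical amplitudes (assumed to be already on a common scale) and then tabulates the difference between Rfree and R for the values in Table 3.6. The function name is illustrative and this is not the Refmac5 implementation.

```python
def r_factor(f_obs, f_calc):
    """Conventional R value for amplitudes assumed to be on the same scale."""
    return sum(abs(o - c) for o, c in zip(f_obs, f_calc)) / sum(f_obs)

# (R value %, Rfree %) for data sets 1-7, copied from Table 3.6.
REFMAC5_RESULTS = {1: (23.8, 28.0), 2: (26.5, 29.6), 3: (27.2, 31.0),
                   4: (26.9, 30.0), 5: (26.5, 29.1), 6: (25.3, 30.1),
                   7: (21.9, 31.2)}

# Gap between Rfree and R, rounded to one decimal place.
R_GAPS = {k: round(r_free - r, 1) for k, (r, r_free) in REFMAC5_RESULTS.items()}

for data_set, gap in R_GAPS.items():
    print(f"Data set {data_set}: Rfree - R = {gap:.1f} percentage points")
```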


Figure 3.13: Summary of the structure determination process for Ape FEN-1 grown in the presence of various divalent metals (citrate-free buffer).

X-ray diffraction data collection: (see Table 3.4 for data sets)

X-ray diffraction data processing using HKL2000: indexing, integration, and scaling

Molecular replacement using AMoRe: Metal-free Ape FEN-1 (no bridge region helices) as search model

CCP4 Refmac5 restrained refinement: 1 round

Electron density interpretation and analysis with molecular graphics using O: 1 round

3.1.7 Electron Density Interpretation and Analysis

Following Refmac5 refinement (results shown in Table 3.6), Fobs-Fcalc and 2Fobs-Fcalc difference electron density maps were calculated (as described in Chapter 2, section 2.4). Prior to electron density map interpretation, the σ levels of all difference electron density maps were normalized using the MAPMAN program (Kleywegt and Jones, 1996). Electron density interpretation and analysis were then completed with molecular graphics using the O program (Jones et al., 1991). The difference electron density maps of each Ape FEN-1 model grown in the presence of different divalent metals were examined. For each model, extensive Fobs-Fcalc electron density representing the position of the bridge region helices was present at a contour level of 3σ. To determine if a bridge region conformational change occurred, the LSQMAN least squares superimposition program (Kleywegt and Jones, 1994) was used to superimpose the models of Ape FEN-1 grown in the presence of different divalent metals (no bridge region) onto the metal-free Ape FEN-1 structure. A summary of the RMS values for the superposition is shown in Table 3.7.

Table 3.7: Summary of the superposition of the models of Ape FEN-1 grown in the presence of different divalent metals onto the metal-free Ape FEN-1 structure.

| Models of Ape FEN-1 grown with metals | RMS |
|---|---|
| (Data set 1) Ape FEN-1 + MgCl2 | 0.182 Å |
| (Data set 2) Ape FEN-1 + MgCl2 (heat incubated) | 0.256 Å |
| (Data set 3) Ape FEN-1 + MnCl2 (heat incubated) | 0.183 Å |
| (Data set 4) Ape FEN-1 + CaCl2 (heat incubated) | 0.209 Å |
| (Data set 5) Ape FEN-1 + SrCl2 (heat incubated) | 0.217 Å |
| (Data set 6) Ape FEN-1 + BaCl2 (heat incubated) | 0.203 Å |
| (Data set 7) Ape FEN-1 + BaCl2 (heat incubated) | 0.189 Å |
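The RMS values in Table 3.7 summarize how far equivalent atoms lie from one another after superposition. The Python sketch below shows only the final RMS calculation on two sets of paired coordinates that are assumed to be already superimposed; it does not perform the least squares fit that LSQMAN carries out, and the coordinates are hypothetical.

```python
import math

def rms_deviation(coords_a, coords_b):
    """RMS distance (Angstroms) between paired, already superimposed atoms."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum((xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
                  for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b))
    return math.sqrt(squared / len(coords_a))

# Hypothetical C-alpha positions, for illustration only.
reference = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
shifted = [(x + 0.2, y, z) for x, y, z in reference]
print(f"RMS = {rms_deviation(reference, shifted):.3f} A")
```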

The superposition showed that all of the models of Ape FEN-1 grown in the presence of different divalent metals had a very similar overall structure in comparison to the metal-free structure. As a result, the coordinates of the metal-free Ape FEN-1 bridge region (residues 88-109 and 112-130) were used to model the position of the missing bridge regions in each respective model. Electron density map interpretation of the Fobs-Fcalc map contoured at 3σ showed that no bridge region conformational change occurred in any of the respective models grown in the presence of divalent metals.

In addition to the bridge region electron density map interpretation, the active sites of the respective models were examined to determine if any divalent metal ions were bound. Interpretation of the respective 2Fobs-Fcalc (1σ contour) difference electron density maps showed that all of the conserved acidic residues were well positioned in the electron density following Refmac5 positional refinement. The active site region was then examined for divalent metal ion binding by interpretation of the respective Fobs-Fcalc (3σ contour) difference electron density maps. Surprisingly, none of the models of Ape FEN-1 grown in the presence of divalent metals (in a citrate-free buffer) showed evidence of divalent metal ion binding in the active site. The Fobs-Fcalc electron density maps of the active site regions of six of the seven models of Ape FEN-1 grown in the presence of divalent metals are shown in comparison to the 2Fobs-Fcalc electron density map of the metal-free Ape FEN-1 active site in Figure 3.14. The Fobs-Fcalc electron density maps of these six models (Figure 3.14A: Fobs-Fcalc electron density corresponding to the models solved from data sets 1, 2, 3, and 6; Figure 3.14B: Fobs-Fcalc electron density corresponding to the models solved from data sets 4 and 5 in Table 3.4) do not show evidence of divalent metal ions bound in the active site. The positions of the superimposed Fobs-Fcalc electron density in both Figure 3.14A and 3.14B appear to be very similar to the locations of the solvent molecules labeled S1-S6 in the 2Fobs-Fcalc electron density map of the final model of the metal-free Ape FEN-1 (shown in Figure 3.14C). Therefore, well-ordered solvent molecules are present in the location where the divalent metals are expected to bind in both the metal-free (Chapter 2) and citrate-free (Chapter 3) hexagonal crystal forms of Ape FEN-1. A model of the expected binding sites of two divalent magnesium ions (Mg1 and Mg2) in the Ape FEN-1 active site is shown in Figure 3.14D. It is expected that the active site of the magnesium-bound form of the Ape FEN-1 enzyme would resemble the active site of the magnesium-bound bacteriophage T4 RNase H (as shown in Chapter 1, Figure 1.8), a homologue of the FEN-1 enzymes (Mueser et al., 1996). As was discussed in Chapter 2, section 2.6, the active site residues of all known FEN-1 enzymes are completely conserved. Based on amino acid sequence alignments (Shen et al., 1998) along with mutational and structural studies of Human FEN-1 (Shen et al., 1997; Sakurai et al., 2004) and the T4 RNase H (Bhagwat et al., 1997c), residues D30, D83, E155, and E157 in Ape FEN-1 are the proposed binding site for the catalytic metal (Mg1), whereas residues D176, D178, and D239 are the proposed binding site for the metal involved in nucleic acid substrate binding (Mg2).

Thus, it can be hypothesized that the lattice contacts of both the metal-free and citrate-free hexagonal crystal forms of the Ape FEN-1 may not allow a bridge region conformational change to occur, thereby preventing the binding of divalent metal ions to the active site. However, it is more likely that any large conformational change in the bridge region along with the binding of divalent metals to the active site is coupled or dependent on the binding of nucleic acid substrate to the enzyme. A more open bridge structure may be the result of a large conformational change of the enzyme induced by the binding of nucleic acid substrate, similar to that shown in the structure of the Archaeoglobus fulgidus FEN-1 bound to 3’ flap DNA as was discussed in Chapter 2, section 2.7 (Chapados et al., 2004). An open bridge structure may then result in the active site region being more accessible, therefore promoting the binding of divalent metal ions.


Figure 3.14: Electron density map interpretation of the active site region of Ape FEN-1: a comparison between the models of Ape FEN-1 grown in the presence of divalent metals (from data sets shown in Table 3.4) versus the final model of the metal-free Ape FEN-1 (see Chapter 2). A: Active site region of the models of Ape FEN-1 grown in the presence of Mg+2 (data sets 1 and 2), Mn+2 (data set 3), and Ba+2 (data set 6) and the corresponding superimposed Fobs-Fcalc electron density maps contoured at 3σ. Maps are colored as described: red (data set 1), green (data set 2), blue (data set 3), and magenta (data set 6). The respective models of Ape FEN-1 (from data sets 1, 2, 3, and 6) are superimposed and are shown in yellow. Active site residues D30, D83, E155, E157, D176, D178, and D239 are labeled in white. B: Active site region of the models of Ape FEN-1 grown in the presence of Ca+2 (data set 4) and Sr+2 (data set 5) and the corresponding superimposed Fobs-Fcalc electron density maps contoured at 3σ. Maps are colored as described: green (data set 4) and red (data set 5). The respective models of Ape FEN-1 (from data sets 4 and 5) are superimposed and are shown in yellow. Active site residues are labeled as in (A). C: Active site region of the metal-free Ape FEN-1 final model and the corresponding 2Fobs-Fcalc electron density map contoured at 1σ. The model of the metal-free Ape FEN-1 is shown in yellow. The active site residues D30, D83, E155, and E157 along with bridge region residue K90 are labeled in white. Six solvent molecules are labeled S1, S2, S3, S4, S5, and S6, respectively. D: Active site region of the metal-free Ape FEN-1 final model showing expected positions of bound divalent metal ions shown in gold and labeled Mg1 and Mg2, respectively. The model of the metal-free Ape FEN-1 is shown in yellow. Active site residues are labeled as in (A) and (B).


3.2 DNA Substrate Studies: Archaeal Flap Endonuclease-1

The overall goal of the DNA substrate studies was to solve the X-ray structure of an archaeal FEN-1 enzyme bound to flap DNA (shown in Figure 3.15). Previous structural studies of this family of enzymes have not yet revealed the complete basis of substrate recognition. Therefore, additional structural studies of the FEN-1 and flap DNA substrate complex were needed to fully characterize the molecular basis of this interaction.

Figure 3.15: Flap DNA substrate of the FEN-1 enzymes. Flap DNA substrate (also called a double flap substrate) has a displaced downstream 5’ flap along with an upstream 3’ flap overhang. The FEN-1 enzymes possess structure-specific endonuclease activity and cleave the 5’ flap one base pair into the downstream duplex DNA (shown with a red arrow).

3.2.1 Protein and DNA Substrate Preparation

Archaeal flap endonuclease-1 (FEN-1) protein from various thermophilic organisms (Archaeoglobus fulgidus (Afu), Archaeoglobus veneficus (Ave), Aeropyrum pernix (Ape), Pyrococcus furiosus (Pfu), and Thermococcus zilligii (Tzi) FEN-1) was expressed and partially purified by our collaborators at Third Wave Technologies, Inc. in Madison, Wisconsin (as discussed in sections 3.1.2 and 3.1.3). All additional purification of the respective FEN-1 enzymes was completed as described in section 3.1.3. Following purification, each of the FEN-1 proteins was dialyzed into its previously optimized solubility buffer (Collins, B. K., Masters Thesis 2003).


Oligonucleotide flap DNA substrates of various lengths were provided, in-kind, by Third Wave Technologies, Inc. for FEN-1-substrate co-crystallization trials. The flap DNA substrates were provided in an unannealed state consisting of three separate oligonucleotide fragments that are shown color-coded in Figure 3.16. Different variations of the flap DNA substrate were designed by altering the lengths of both the upstream and downstream duplex regions and were referred to as the 6/10, 7/11, and 7/14 substrates, respectively (see Figure 3.16). A two nucleotide 5’ arm that extended from the downstream duplex region was used to serve as the 5’ flap portion of each respective substrate. Each substrate also contained a one nucleotide overlap extending from the end of the upstream duplex region serving as the 3’ flap because studies have shown that the rate of cleavage is enhanced by a 3’ single base overlap from the upstream duplex (Kaiser et al., 1999). Therefore, it is believed that the overhang base is specifically recognized by the FEN-1 enzymes helping to position the substrate for cleavage. Recent structural studies have confirmed that the overlapping base on the upstream duplex (3’ flap) is recognized by the Archaeoglobus fulgidus (Afu) FEN-1 enzyme (Chapados et al., 2004). All substrates were designed to promote potential crystal lattice contacts by incorporating an unpaired 5’ C nucleotide and an unpaired 5’ G nucleotide extending from the upstream and downstream duplex regions, respectively. Studies were completed at Third Wave Technologies, Inc. and it was determined that these substrates were cleaved by the FEN-1 enzymes provided and were stable at room temperature (Kaiser, M.W., personal communication).


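6/10 Substrate (upstream duplex Tm = 42.6 °C; downstream duplex Tm = 46.8 °C)
Upstream strand: 5’-C GCGGCG C-3’ (unpaired 5’ C, 6 paired nucleotides, 1 nucleotide 3’ flap)
Downstream strand: 5’-AA CTCTGCTGTG-3’ (2 nucleotide 5’ flap, 10 paired nucleotides)
Template strand: 3’-CGCCGC GAGACGACAC G-5’ (unpaired 5’ G)

7/11 Substrate (upstream duplex Tm = 45.3 °C; downstream duplex Tm = 49.2 °C)
Upstream strand: 5’-C CGCACCG C-3’ (unpaired 5’ C, 7 paired nucleotides, 1 nucleotide 3’ flap)
Downstream strand: 5’-AA CTCTCCTACCG-3’ (2 nucleotide 5’ flap, 11 paired nucleotides)
Template strand: 3’-GCGTGGC GAGAGGATGGC G-5’ (unpaired 5’ G)

7/14 Substrate (upstream duplex Tm = 45.3 °C; downstream duplex Tm = 59.9 °C)
Upstream strand: 5’-C CGCACCG C-3’ (unpaired 5’ C, 7 paired nucleotides, 1 nucleotide 3’ flap)
Downstream strand: 5’-AA ACCAGACCAGACCA-3’ (2 nucleotide 5’ flap, 14 paired nucleotides)
Template strand: 3’-GCGTGGC TGGTCTGGTCTGGT G-5’ (unpaired 5’ G)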

Figure 3.16: Flap DNA substrates used for co-crystallization trials with the archaeal FEN-1 enzymes. The substrates 6/10, 7/11, and 7/14 are shown with the corresponding melting temperatures (Tm) of both the upstream (red and black) and downstream (blue and black) duplex regions. Substrate melting points have been determined using the Hyther “nearest neighbor” algorithm (http://ozone2.chem.wayne.edu/Hyther/hythermenu.html). All melting temperatures were calculated at 0.3 mM oligonucleotide concentration and 50 mM salt.

The flap DNA substrates 6/10, 7/11, and 7/14 (shown in Figure 3.16) were then annealed by combining equimolar amounts of the three oligonucleotide fragments (colored red, blue, and black). As an example, the 7/14 flap DNA substrate was annealed by combining equimolar amounts of the three fragments (1:1:1 ratio of ~500 nmol each) in an annealing buffer of 20 mM Tris-HCl, 100 mM NaCl, and 10 mM EDTA. A microtube containing the sample was then placed in a beaker of water and heated until the water boiled. The beaker was then placed in a styrofoam container and allowed to cool overnight until it reached ambient temperature. The styrofoam container acted as an insulator so that the DNA cooled slowly, giving the samples a chance to anneal properly. Following annealing, all flap DNA samples were stored at -20 °C until co-crystallization trials were attempted.

3.2.2 Co-crystallization with Flap DNA Substrates

All initial co-crystallization trials of the archaeal FEN-1 enzymes and flap DNA substrates were completed by others in the group. Initial sparse matrix co-crystallization trials were conducted on five archaeal FEN-1 enzymes (Ape, Afu, Ave, Pfu, and Tzi) in the presence of three different flap DNA substrates (7/11, 6/10, and 7/14). A 1:1 equimolar complex (based on ~10 mg/mL protein) of FEN-1:flap DNA substrate was used in all co-crystallization trials. Sitting-drop vapor diffusion crystallization trials at 4 °C and 21 °C were completed using the following crystal screen kits: Cryo I™, Natrix™, Wizard I™, and the Ion Screen. Greiner 96-well™ multidrop (3 drops per well) plates were used with 1 µL of reservoir solution and 1 µL of FEN-1/flap DNA complex. Low salt screens were chosen so that DNA substrate binding would not be adversely affected by the high amounts of salt present in other sparse matrix screens. The trays were then taped and stored at 4 °C and 21 °C, respectively, until they were examined for crystal hit conditions. Work for this thesis began by evaluating the initial sparse matrix co-crystallization trials and was followed by further optimization of the most promising crystal hit conditions (discussed below) from the initial screens.

Several crystal hits were obtained with the Ape and Ave FEN-1 enzymes in the presence of the 7/14 DNA flap substrate. The best crystal hits of Ape FEN-1 and 7/14 flap DNA were observed in the Cryo I™ (40% (v/v) 1,2-propanediol and 100 mM Na HEPES pH 7.5; 40% (v/v) 1,2-propanediol, 100 mM Na Acetate pH 4.5, and 50 mM Ca(OAc)2), Wizard I™ (35% (v/v) MPD and 100 mM Na/K phosphate pH 6.2), and Natrix™ (10% PEG 400, 50 mM Na HEPES pH 7.5, 100 mM KCl, and 10 mM CaCl2) screens, which are shown in Figure 3.17. Izit Crystal Dye™ was added to most of the hit conditions to verify that the crystals were protein.


Figure 3.17: Initial crystal screen hits of Ape FEN-1 and 7/14 flap DNA. A: Cryo I™ hit condition (4 °C): 40% (v/v) 1,2-propanediol and 100 mM Na HEPES pH 7.5. B: Wizard I™ hit condition (21 °C): 10% PEG 3000 and 100 mM Na CHES pH 9.5. C: Cryo I™ hit condition (21 °C): 40% (v/v) 1,2-propanediol, 100 mM Na Acetate pH 4.5, and 50 mM Ca(OAc)2. D: Wizard I™ hit condition (21 °C): 35% (v/v) MPD and 100 mM Na/K phosphate pH 6.2. E: Natrix™ hit condition (21 °C): 10% PEG 400, 50 mM Na HEPES pH 7.5, 100 mM KCl, and 10 mM CaCl2.

The best crystal hits of the Ave FEN-1 and 7/14 DNA were observed in the Ion Screen (20% PEG 4000, 100 mM Na PIPES pH 6.5, and 200 mM CaCl2; 20% PEG 4000, 100 mM Na HEPES pH 7.5, and 200 mM CaCl2; 20% PEG 4000, 100 mM Na MES pH 5.6, and 200 mM CaCl2) and Wizard I™ (20% PEG 3000 and 100 mM Na acetate pH 4.5) screens, which are shown in Figure 3.18. Izit Crystal Dye™ was added to all of the hit conditions to verify that the crystals were protein.


Figure 3.18: Initial crystal screen hits of Archaeoglobus veneficus FEN-1 and 7/14 flap DNA. A: Ion Screen hit condition (21 °C): 20% PEG 4000, 100 mM Na PIPES pH 6.5, and 200 mM CaCl2. B: Ion Screen hit condition (21 °C): 20% PEG 4000, 100 mM Na HEPES pH 7.5, and 200 mM CaCl2. C: Wizard I™ hit condition (21 °C): 20% PEG 3000 and 100 mM Na acetate pH 4.5. D: Ion Screen hit condition (4 °C): 20% PEG 4000, 100 mM Na MES pH 5.6, and 200 mM CaCl2.

Prior to setting up coarse gradient expansions of the hit conditions, purified Ape and Ave FEN-1 proteins were dialyzed overnight at 21 °C against their respective solubility optimized buffers in the absence of any divalent metal ions. The Ape FEN-1 was dialyzed against 25 mM Na PIPES pH 6.5, 50 mM disodium hydrogen citrate, 50 mM KCl, and 50 mM NH4Cl, and the Ave FEN-1 was dialyzed against 25 mM Na PIPES pH 6.2 and 1 mM DTT. Following dialysis, the FEN-1 samples were concentrated to ~15 mg/mL. The FEN-1 protein samples were then incubated in the presence of Chelex 100 resin (Bio-Rad) for 5 min at 21 °C in order to remove any trace divalent metal ions. Divalent metal ions were removed from all of the FEN-1 protein samples in order to prevent any inadvertent activation of FEN-1 nuclease activity during co-crystallization experiments in the presence of flap DNA. Chelex 100 resin is made from styrene divinylbenzene copolymers containing paired iminodiacetate ions, which act as chelating groups in binding polyvalent metal ions. The respective FEN-1 protein samples were then filtered through 0.45 µm spin filters and the protein concentration was checked using the absorbance at 280 nm in preparation for co-crystallization expansion experiments. A 1:1 equimolar complex (based on ~10 mg/mL protein) of FEN-1:flap DNA substrate (7/14) was used for all crystal hit expansion trays. After the DNA substrate was added to the protein sample, the sample was incubated for 30 min (at either 4 °C or 21 °C) to give the components of the complex a chance to bind. The FEN-1:flap DNA complex was made fresh each time expansion trays were set up so that any degradation of the DNA substrate resulting from long-term storage would be avoided.
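Preparing a 1:1 complex "based on ~10 mg/mL protein" amounts to converting the protein mass concentration to a molar concentration and adding the same number of moles of DNA. The Python sketch below walks through that arithmetic. The protein molecular weight (40,000 Da), the protein volume (100 µL), and the DNA stock concentration (1 mM) are all assumed placeholder values chosen for illustration; none of them is taken from this chapter.

```python
def molar_concentration_mM(mg_per_mL, mw_da):
    """Convert a mass concentration (mg/mL, i.e. g/L) to millimolar."""
    return mg_per_mL / mw_da * 1000.0

def equimolar_stock_volume_uL(protein_mM, protein_volume_uL, dna_stock_mM):
    """Volume of DNA stock containing as many moles as the protein aliquot."""
    return protein_mM * protein_volume_uL / dna_stock_mM

ASSUMED_MW_DA = 40_000.0       # assumption: placeholder protein mass
ASSUMED_PROTEIN_UL = 100.0     # assumption: illustrative aliquot volume
ASSUMED_DNA_STOCK_mM = 1.0     # assumption: illustrative annealed DNA stock

protein_mM = molar_concentration_mM(10.0, ASSUMED_MW_DA)
dna_uL = equimolar_stock_volume_uL(protein_mM, ASSUMED_PROTEIN_UL,
                                   ASSUMED_DNA_STOCK_mM)
print(f"Protein: {protein_mM:.2f} mM; add {dna_uL:.1f} uL of DNA stock for a 1:1 complex")
```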

Several coarse gradient expansions of the hit conditions shown in Figures 3.17 and 3.18 were then set up for Ape and Ave FEN-1 with 7/14 flap DNA, respectively. All expansions were set up using hanging-drop vapor diffusion in Costar 24-well™ trays with 1 + 1 µL or 2 + 2 µL (reservoir solution + FEN-1/flap DNA complex) at either 4 °C or 21 °C. A summary of the coarse gradient expansions is shown in Table 3.8. Interestingly, none of the initial crystal hits shown in Figures 3.17 and 3.18 could be reproduced in the expansion trays.

Table 3.8: Expansion tray index of Ape and Archaeoglobus veneficus FEN-1 with 7/14 flap DNA.

Ape FEN-1 with 7/14 DNA Crystal Hit Expansions (Figure 3.17)

| Temp °C | Tray Format | Drop Setup µL | Coarse Gradient Expansion Screen Conditions |
|---|---|---|---|
| 4 | 2 X 12 | 2 + 2 | 35-50% 1,2-propanediol, 100 mM Na HEPES pH 7.5 |
| 4 | 1 X 24 | 2 + 2 | 30-60% 1,2-propanediol, 100 mM Na HEPES pH 7.5 |
| 21 | 2 X 12 | 2 + 2 | 35-50% 1,2-propanediol, 100 mM Na HEPES pH 7.5 |
| 4 | 2 X 12 | 2 + 2 | 35-50% 1,2-propanediol, 100 mM Na HEPES pH 7.5 |
| 4 | 2 X 12 | 2 + 2 | 30-50% PEG 400, 100 mM Na HEPES pH 7.5 |
| 21 | 2 X 12 | 2 + 2 | 30-50% PEG 400, 100 mM Na HEPES pH 7.5 |
| 4 | 2 X 12 | 1 + 1 | 40-70% PEG 400, 100 mM Na HEPES pH 7.5 |
| 4 | 2 X 12 | 2 + 2 | 8-20% PEG 3350, 100 mM Na HEPES pH 7.5 |
| 21 | 2 X 12 | 2 + 2 | 8-20% PEG 3350, 100 mM Na HEPES pH 7.5 |
| 4 | 2 X 12 | 1 + 1 | 11-17% PEG 3350, 100 mM Na HEPES pH 7.5 |
| 21 | 2 X 12 | 2 + 2 | 5-20% PEG 3350, 100 mM Na CHES pH 9.5 |
| 4 | 2 X 12 | 2 + 2 | 5-20% PEG 3350, 100 mM Na CHES pH 9.5 |
| 21 | 2 X 12 | 2 + 2 | 30-50% 1,2-propanediol, 100 mM Acetate pH 4.6, 50 mM Ca(OAc)2 |
| 4 | 2 X 12 | 2 + 2 | 30-50% 1,2-propanediol, 100 mM Acetate pH 4.6, 50 mM Ca(OAc)2 |
| 4 | 2 X 12 | 2 + 2 | 30-50% MPD, 100 mM Na/K phosphate pH 6.2 |
| 21 | 2 X 12 | 2 + 2 | 30-50% MPD, 100 mM Na/K phosphate pH 6.2 |
| 4 | 2 X 12 | 2 + 2 | 30-50% MPD, 100 mM Na/K phosphate pH 6.2 |
| 4 | 1 X 24 | 1 + 1 | 1-30% PEG 400, 50 mM Na HEPES pH 7.5, 100 mM KCl, 10 mM CaCl2 |
| 21 | 1 X 24 | 1 + 1 | 1-30% PEG 400, 50 mM Na HEPES pH 7.5, 100 mM KCl, 10 mM CaCl2 |

Archaeoglobus veneficus FEN-1 with 7/14 DNA Crystal Hit Expansions (Figure 3.18)

| Temp °C | Tray Format | Drop Setup µL | Coarse Gradient Expansion Screen Conditions |
|---|---|---|---|
| 21 | 2 X 12 | 1 + 1 | 10-27% PEG 4000, 100 mM Na PIPES pH 6.5, 200 mM CaCl2 |
| 4 | 2 X 12 | 1 + 1 | 10-27% PEG 4000, 100 mM Na PIPES pH 6.5, 200 mM CaCl2 |
| 21 | 2 X 12 | 1 + 1 | 10-27% PEG 4000, 100 mM Na HEPES pH 7.5, 200 mM CaCl2 |
| 4 | 2 X 12 | 1 + 1 | 10-27% PEG 4000, 100 mM Na HEPES pH 7.5, 200 mM CaCl2 |
| 21 | 2 X 12 | 1 + 1 | 10-27% PEG 3350, 100 mM Na acetate pH 4.6 |
| 4 | 2 X 12 | 1 + 1 | 10-27% PEG 3350, 100 mM Na acetate pH 4.6 |
| 21 | 2 X 12 | 1 + 1 | 10-27% PEG 4000, 100 mM Na MES pH 5.6, 200 mM CaCl2 |
| 4 | 2 X 12 | 1 + 1 | 10-27% PEG 4000, 100 mM Na MES pH 5.6, 200 mM CaCl2 |
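A coarse gradient expansion such as "35-50% 1,2-propanediol" in a 2 X 12 tray can be laid out by spacing the precipitant concentration across the wells. The Python sketch below assumes that each gradient spans 12 wells with linear spacing; both the spacing and the interpretation of the tray format are assumptions made for illustration, since the exact well-by-well concentrations are not listed here.

```python
def linear_gradient(start_percent, stop_percent, n_wells):
    """Evenly spaced precipitant concentrations (percent) across n_wells."""
    if n_wells < 2:
        raise ValueError("a gradient needs at least two wells")
    step = (stop_percent - start_percent) / (n_wells - 1)
    return [round(start_percent + i * step, 2) for i in range(n_wells)]

# Example: the 35-50% 1,2-propanediol expansion, assumed to span 12 wells.
wells = linear_gradient(35.0, 50.0, 12)
for index, percent in enumerate(wells, start=1):
    print(f"Well {index:2d}: {percent:5.2f}% 1,2-propanediol, 100 mM Na HEPES pH 7.5")
```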


Approximately 40% of the drops precipitated (light or dark precipitate), which suggested that the protein and DNA concentrations were not too low. Also, a number of conditions were either clear or had a phase separation in the drop. A more complete screening at slightly higher concentrations of protein/DNA complex or perhaps the use of an additive screen may be needed in order to obtain reproducible crystals. Interestingly, a number of the crystal hits for both Ape and Ave FEN-1 with 7/14 flap DNA contained calcium, suggesting that the presence of divalent calcium may be necessary to stabilize the protein/DNA complex. Calcium has been previously used as a catalytically inert probe for metallonucleases. Specifically, divalent calcium has been shown to promote formation of a stable and specific complex between EcoRV and DNA containing the enzyme’s recognition site (Vipond and Halford, 1995). It has also recently been shown that bound divalent calcium ions enhance the substrate binding of the T5 5’ to 3’ exonuclease but do not support catalysis (Feng et al., 2004). Therefore, it was decided to perform co-crystallization screening on the Ape FEN-1 and 7/14 DNA substrate in the presence of the catalytically inert divalent metal salt CaCl2.

Purified Ape FEN-1 protein (see section 3.1.3) was dialyzed overnight against a buffer containing 25 mM Bis-Tris pH 6.5 and 100 mM NH4Cl. The protein solution was concentrated to ~15 mg/mL and then incubated in the presence of Chelex 100 resin (Bio-Rad) for 5 min at 21 °C in order to remove any trace divalent metal ions. Next, the Ape FEN-1 protein sample was filtered through a 0.45 µm spin filter and ~20 mM CaCl2 was added to the solution. The Ape FEN-1 protein sample in the presence of CaCl2 was then heat incubated as described in section 3.1.4. Following heat incubation, the Ape FEN-1 protein sample was centrifuged at 10,000 x g for 5 min and filtered through a 0.45 µm spin filter. The protein concentration was checked using the absorbance at 280 nm in preparation for co-crystallization screening. A 1:1 equimolar complex (based on ~10 mg/mL protein) of Ape FEN-1:flap DNA substrate (7/14) was used in all co-crystallization trials. Sitting-drop vapor diffusion crystallization trials at 21 °C were completed using the following crystal screen kits: Crystal Screen I™, Crystal Screen II™, Index™, Natrix™, and the Ion Screen. Corning 96-well™ plates were set up using the Honeybee™ high-throughput crystallization robot (Genomic Solutions) with 0.5 µL of reservoir solution and 0.5 µL of FEN-1/flap DNA complex. A limited screening was completed due to a small supply of flap DNA substrate. The trays were then taped and stored at 21 °C until they were examined for crystal hit conditions. No crystal hits were obtained in any of the screen conditions. However, many of the drops were clear, which indicated that a higher concentration of FEN-1/flap DNA complex might be needed for co-crystallization trials. An additional expansion tray of the FEN-1/flap DNA (7/14) complex was set up in the presence of divalent calcium based on the optimized shallow gradient expansions of the citrate-free Ape FEN-1 grown in the presence of different divalent metal ions (as discussed in section 3.1.4). Based on the optimized shallow gradient of Ape FEN-1 in the presence of calcium, one 2x12 format hanging-drop vapor diffusion crystallization expansion tray was set up at 21 °C with a gradient of 8-14% PEG 8000, 100 mM Na HEPES pH 7.5, and 25% ethylene glycol. The expansion tray was set up using 2 µL of reservoir solution and 2 µL of a 1:1 equimolar complex (based on ~15 mg/mL protein) of Ape FEN-1:flap DNA substrate (7/14). No crystal growth was observed and the majority of the drops were clear. The absence of crystal growth suggests that the 7/14 flap DNA substrate inhibits the growth of Ape FEN-1 crystals in conditions that were previously optimized for growth in the presence of divalent calcium (discussed in section 3.1.4). It can be inferred that the presence of divalent calcium may promote a stable complex between Ape FEN-1 and the 7/14 flap DNA substrate, thereby preventing the growth of the protein-only hexagonal crystals discussed in section 3.1.4. Once more substrate is available, a complete screening at a higher concentration (based on ~20-30 mg/mL of Ape FEN-1) of a 1:1 complex of Ape FEN-1/flap DNA in the presence of divalent calcium ions may be necessary to obtain crystals.
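The 1:1 complexes described above are defined by the protein mass concentration, so setting them up requires a conversion to molar units. The sketch below shows that arithmetic only. The molecular mass of Ape FEN-1 and the concentration of the flap DNA stock are not given in this section, so the values `ASSUMED_PROTEIN_DA` and `ASSUMED_DNA_STOCK_MM` are placeholders, and both function names are hypothetical.

```python
def molar_concentration_mM(mg_per_mL, mass_Da):
    """Convert a mass concentration (mg/mL) to millimolar.

    mg/mL is numerically equal to g/L; dividing by g/mol gives mol/L,
    and multiplying by 1000 gives mmol/L.
    """
    return mg_per_mL / mass_Da * 1000.0


def dna_stock_volume_uL(protein_mg_per_mL, protein_Da, protein_uL,
                        dna_stock_mM, dna_per_protein=1.0):
    """Volume of DNA stock (uL) needed for the requested DNA:protein molar ratio."""
    protein_mM = molar_concentration_mM(protein_mg_per_mL, protein_Da)
    protein_nmol = protein_mM * protein_uL      # mM x uL = nmol
    return dna_per_protein * protein_nmol / dna_stock_mM  # nmol / (nmol/uL)


# Placeholder values, NOT taken from the text
ASSUMED_PROTEIN_DA = 40_000.0    # hypothetical protein mass in Da
ASSUMED_DNA_STOCK_MM = 1.0       # hypothetical flap DNA stock in mM

protein_mM = molar_concentration_mM(10.0, ASSUMED_PROTEIN_DA)
volume = dna_stock_volume_uL(10.0, ASSUMED_PROTEIN_DA, 100.0, ASSUMED_DNA_STOCK_MM)
print(f"protein: {protein_mM:.3f} mM; DNA stock needed for 100 uL: {volume:.1f} uL")
```

Adding the DNA dilutes the protein, so the final complex concentration would need to be recomputed from the combined volume.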

Additional methods must also be considered to determine whether or not the native Ape FEN-1 enzyme exhibits nuclease activity during the incubation of the co-crystallization trials. Extensive dialysis against EDTA may be necessary to remove any divalent metal ions that could potentially activate nuclease activity. It must also be shown that incubation in the presence of Chelex 100 resin does not affect the folding of the native Ape FEN-1 enzyme prior to co-crystallization trials. Once more flap DNA substrate is available, electrophoretic mobility gel shift assays (EMSAs) would be useful to study the substrate binding and any potential trace nuclease activity of Ape FEN-1 in both the presence and absence of divalent calcium. Lastly, an active site mutant Ape FEN-1 (analogous to the D132N bacteriophage T4 RNase H discussed in Chapter 4, section 4.2) that does not support substrate cleavage may be needed for co-crystallization trials in the presence of flap DNA substrate. It is thought that one of the metals in the FEN-1 family of enzymes is required mainly for DNA binding and the other metal is required mainly for DNA catalysis. Therefore, by mutating residues so as to eliminate binding of the catalytic metal ion, DNA binding may be sustained while virtually eliminating DNA catalysis. Our collaborators at Third Wave Technologies, Inc. have attempted this technique with the 5’ to 3’ exonuclease domain of Taq polymerase, and were able to greatly reduce but not eliminate cleavage. It is possible that a combination of mutant enzymes, chelating agents, and low temperature will reduce cleavage enough to allow co-crystallization.

3.2.3 X-ray Diffraction Data Collection

Only one of the Ape FEN-1 and 7/14 flap DNA crystal hits (shown in Figure 3.17) could be mounted for X-ray diffraction data collection. The crystal of Ape FEN-1 and 7/14 flap DNA grown in 40% (v/v) 1,2-propanediol and 100 mM Na HEPES pH 7.5 (Figure 3.17A, Cryo I™ screen) was harvested directly from the growth drop and was flash frozen in liquid nitrogen. The crystal was harvested and flash frozen directly because the growth mother liquor already contained a cryoprotectant (40% (v/v) 1,2-propanediol) at a sufficiently protective concentration. The crystal was tested for X-ray diffraction using a 0.9 Å wavelength source at BioCARS 14-BM-C (Argonne National Laboratories, Advanced Photon Source, Chicago, IL, USA) with an ADSC Quantum 4 CCD detector. An image showing X-ray diffraction to approximately 3 Å is shown in Figure 3.19. No data were collected due to the large unit cell dimensions that were predicted from indexing (very close spot pattern of the reciprocal lattice shown in Figure 3.19), which were further complicated by a high mosaicity. The longest unit cell edge was predicted to be approximately 490 Å. More crystallization trials will have to be completed in order to find a condition from which diffraction quality crystals can be grown.
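To illustrate why a ~490 Å cell edge makes data collection difficult, the spacing between neighboring reflections along that axis can be estimated in the small-angle limit as (detector distance × wavelength) / cell edge. The wavelength (0.9 Å) and cell edge (490 Å) come from the text above. The crystal-to-detector distance and the pixel size are not given, so the values used below are hypothetical, and the function name is an assumption made for this sketch.

```python
def spot_separation_mm(cell_edge_A, wavelength_A, distance_mm):
    """Approximate spacing (mm) between adjacent reflections along one axis.

    Small-angle approximation: adjacent reciprocal lattice points are
    1/cell_edge apart, which maps to distance * wavelength / cell_edge
    on the detector.
    """
    return distance_mm * wavelength_A / cell_edge_A


WAVELENGTH_A = 0.9        # from the text
LONG_EDGE_A = 490.0       # predicted longest cell edge, from the text
ASSUMED_PIXEL_MM = 0.08   # hypothetical pixel size, not from the text

for distance_mm in (150.0, 200.0, 300.0):   # hypothetical detector distances
    sep = spot_separation_mm(LONG_EDGE_A, WAVELENGTH_A, distance_mm)
    print(f"distance {distance_mm:5.0f} mm: spots {sep:.2f} mm apart "
          f"(~{sep / ASSUMED_PIXEL_MM:.1f} pixels)")
```

With spots only a few pixels apart under these assumed settings, any broadening from high mosaicity would be expected to cause overlap, consistent with the difficulty described above.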


Figure 3.19: X-ray diffraction image of an Ape FEN-1 and 7/14 flap DNA crystal (see Figure 3.17A) grown from the initial co-crystallization trials. X-ray diffraction image was collected at BioCARS 14-BM-C (Argonne National Laboratories, Advanced Photon Source, Chicago, IL, USA) using an ADSC Quantum 4 CCD detector.

3.2.4 Aeropyrum pernix FEN-1 DNA Binding

Without direct structural information characterizing the structure-specific recognition of flap DNA substrate by the Ape FEN-1 enzyme, it is still possible to gain an understanding of this structure-specific interaction because many aspects of substrate recognition are unique to the FEN-1 family of enzymes. However, a comprehensive structural characterization of the structure-specific substrate recognition of the FEN-1 family of enzymes has not yet been completed.

Prior to the structural evidence characterizing the binding of the upstream 3’ flap region of the DNA substrate (discussed below), various studies led to the proposal that the FEN-1 enzymes contained a region or pocket that specifically recognized the base at the 3’ end of the upstream duplex. It has been shown that during eukaryotic lagging strand DNA replication, a flap DNA structure is generated by strand displacement as the processing DNA polymerase (pol δ) interacts with a downstream Okazaki fragment (discussed in Chapter 1, section 1.1). The resulting flap DNA structure is processed and the final cleavage step is performed by the FEN-1 enzyme, creating a nick that can then be sealed by DNA ligase I (Kao and Bambara, 2003). FEN-1 preferentially recognizes a flap DNA structure (also called a double flap structure) with both a downstream 5’ displaced arm and a 3’ one base overhang between the upstream and downstream duplex regions (Kao et al., 2002). Studies have confirmed that flap DNA substrates containing a one nucleotide overhang (3’ flap) on the upstream duplex were exclusively cleaved one nucleotide into the downstream duplex region and could subsequently be sealed with DNA ligase. In contrast, flap DNA substrates lacking the 3’ overhang nucleotide were not ligated by DNA ligase (Kaiser et al., 1999). Also, the cleavage rate of the FEN-1 enzymes on a flap DNA substrate with a one nucleotide overhang (3’ flap) on the upstream duplex has been shown to be approximately three orders of magnitude higher than for a substrate lacking the 3’ overlap (Hosfield et al., 1998a; Kaiser et al., 1999). The identity of the base (A, C, G, or T) occupying the 3’ overhang position did not affect the cleavage rate, which suggested that the FEN-1 enzymes may contain a region or pocket that specifically recognized the 3’ nucleotide in a structure-specific manner. This proposal was supported by experiments in which a dideoxyribose was substituted at the 3’ end of the upstream duplex containing the overhang base. The removal of the single oxygen atom (3’ OH) resulted in a significant decrease in FEN-1 activity on a flap DNA substrate containing a 3’ dideoxynucleotide overhang (Kaiser et al., 1999).


Recently, X-ray structural studies were completed of the Archaeoglobus fulgidus (Afu) FEN-1 bound to 3’ flap DNA (Chapados et al., 2004). Based on the structural similarities of both the metal-free Ape FEN-1 (discussed in Chapter 2) and the Afu FEN-1, implications for 3’ flap DNA binding to Ape FEN-1 can be inferred. The metal-free ribbon structure of Ape FEN-1 is shown superimposed on the ribbon structure of Afu FEN-1 bound to 3’ flap DNA in Figure 3.20. The overall structures of the Afu FEN-1 and Ape FEN-1 are very similar except in their respective bridge regions and in the location of the 3’ flap DNA binding interface as shown in Figure 3.20. It has been proposed that conformational changes in the H2-H3 loop region of the Afu FEN-1 are contributing to an increased structural ordering of the bridge region upon the binding of 3’ flap DNA. The specific packing interactions between the H2-H3 loop and the bridge region are thought to contribute to the structural order in the bridge region that was observed in the Afu FEN-1 (Chapados et al., 2004). In all of the crystal structures of the FEN-1 related enzymes (see Chapter 2, section 2.7 and Chapter 4, section 4.3.4), it is interesting that the metal-free Ape FEN-1 is the only structure that displayed an ordered helical bridge region in the absence of bound DNA. In order for the 3’ flap of the upstream duplex DNA to bind to the Ape FEN-1 enzyme, conformational changes in the H1-H2 loop region and in helical regions proximal to the 3’ flap binding interface would be required based on the structure of the 3’ flap DNA interface of the Afu FEN-1 (as shown in Figure 3.20B and 3.20C). A large bridge region conformational change would also be needed in order for the H1-H2 loop to interact with the bridge region in the Ape FEN-1 enzyme (Figure 3.20B and 3.20C). A conformational change of this magnitude would result in a more open form of the Ape FEN-1 enzyme, potentially allowing the 5’ flap region of the DNA substrate access to the active site.


Figure 3.20: Ribbon structure of the metal-free Ape FEN-1 superimposed onto the ribbon structure of Archaeoglobus fulgidus (Afu) FEN-1 bound to 3’ flap DNA. The upstream duplex (with a 3’ flap) DNA is shown in green and the downstream duplex (with a 5’ flap) DNA is shown in black adjacent to (A). A: The ribbon structure of the metal-free Ape FEN-1 (shown in blue), (Chapter 2) is superimposed on the ribbon structure of Archaeoglobus fulgidus FEN-1 (shown in red) bound to 3’ flap DNA (shown in green). PDB 1RXW, (Chapados et al., 2004). B: The ribbon structure of the 3’ flap DNA binding interface on the Afu FEN-1 is shown in red. The bound 3’ flap DNA is shown in green. The Ape FEN-1 ribbon structure is shown superimposed in blue. C: The ribbon structure of the Afu FEN-1 bridge region and 3’ flap DNA binding interface is shown in red in the absence of bound 3’ flap DNA. The ribbon structure of the Ape FEN-1 bridge region and inferred 3’ flap DNA binding interface is shown superimposed in blue. This figure was generated with the programs MOLSCRIPT and RENDER using Raster3D (Kraulis, 1991; Merritt and Bacon, 1997).


Based on the structure of the Afu FEN-1 bound to 3’ flap DNA, amino acid sequence alignments have shown that the residues that make up the 3’ flap DNA binding interface are highly conserved from archaeal FEN-1 enzymes to the Human FEN-1 (Friedrich-Heineken and Hubscher, 2004). A sequence alignment of the residues that form the 3’ flap DNA binding pocket of the Afu FEN-1 is shown in comparison to other archaeal FEN-1 enzymes (including Ape FEN-1) and the Human FEN-1 enzyme in Figure 3.21 (adapted from Friedrich-Heineken and Hubscher, 2004). Residues in the Afu FEN-1 structure that make specific contacts to the 3’ flap DNA are denoted by a black star in the sequence alignment.

                             *   **       **       *       **  *
1 AfuFEN1     23 KIAVDAFNTLYQFIS IIRQPDGTPLKDSQG RITSHLSGILYRVSN 67
2 AveFEN1     23 KIAIDAFNTLYQFIS IIRQPDGTPLKDSQG RMTSHLSGILYRVSN 67
3 ApeFEN1     26 VLALDAYNMLYQFLT AIRQPDGTPLLDREG RVTSHLSGLFYRTIN 70
4 PfuFEN1     23 KIAIDALNAIYQFLS TIRQKDGTPLMDSKG RITSHLSGLFYRTIN 67
5 PhFEN-1     23 KIAIDALNAIYQFLS TIRQRDGTPLMDSKG RITSHLSGLFYRTIN 67
6 MjFEN-1     23 KVAIDGMNALYQFLT SIRLRDGSPLRNRKG EITSAYNGVFYKTIH 67
7 HumanFEN-1  30 KVAIDASMSIYQFLI AVRQ-GGDVLQNEEG ETTSHLMGMFYRTIR 73
** *

                           * **   *  *   *
1 AfuFEN1    298 EKAIEFLCEEHDFSR ERVEKALEKLKA--- --LKSTQATLERWF- 336
2 AveFEN1    298 EKIIEFLCEEHDFSK DRVEKAVEKLKAG-- --MQASQSTLERWFS 338
3 ApeFEN1    306 DKVREILVERHDFNP ERVERALERLGKAYR EKLRGRQSRLDMWFG 350
4 PfuFEN1    298 EGILKFLCDEHDFSE ERVKNGLERLKKAIK ---SGKQSTLESWFK 339
5 PhFEN-1    298 EGILKFLCDEHNFSE ERVKNGIERLKKAIK ---AGRQSTLESWFV 339
6 MjFEN-1    284 EGIIKFLVDENDFNY DRVKKHVDKLYNLIA N--KTKQKTLDAWFK 326
7 HumanFEN-1 304 EELIKFMCGEKQFSE ERIRSGVKRLSKSRQ ---GSTQGRLDDFFK 345
* ** *

Figure 3.21: Sequence alignment of the residues making up the 3’ flap DNA binding pocket of the Archaeoglobus fulgidus (Afu) FEN-1 in comparison to other archaeal FEN-1 enzymes and Human FEN-1 (adapted from Friedrich-Heineken and Hubscher, 2004). The amino acid residues of the archaeal FEN-1 enzymes (Archaeoglobus fulgidus (Afu), Archaeoglobus veneficus (Ave), Aeropyrum pernix (Ape), Pyrococcus furiosus (Pfu), Pyrococcus horikoshii (Ph), and Methanococcus jannaschii (Mj) FEN-1) along with Human FEN-1 are aligned based on the residues shown to form the 3’ flap DNA binding pocket in the Afu FEN-1 structure. PDB 1RXW, (Chapados et al., 2004). The residues of the Afu FEN-1 that specifically interact with the 3’ flap DNA are denoted with a black star (*). Identical residues are shaded in black and similar residues are shaded in gray. Seven residues used in mutagenesis studies of Human FEN-1 (Friedrich-Heineken and Hubscher, 2004) are denoted with a red star (*) (discussed below).

The sequence alignment shows a number of completely conserved residues and a number of similar residues that provide evidence that there is a conserved 3’ flap DNA binding pocket in both archaeal and eukaryotic FEN-1 enzymes. In particular, residues making up the 3’ flap DNA binding interface in the Afu FEN-1 are highly conserved in the Ape FEN-1 sequence. Based on a similar sequence alignment to that shown in Figure 3.21, mutagenesis studies were completed on the Human FEN-1 enzyme in which seven conserved or similar residues (denoted by red stars in Figure 3.21) hypothesized to be involved in binding to 3’ flap DNA were mutated to alanine. The mutation of these seven residues resulted in a complete loss of 3’ flap DNA specificity (Friedrich-Heineken and Hubscher, 2004). The sequence alignment analysis and the mutational studies of the Human FEN-1 are in support of the structural evidence for a 3’ flap DNA binding pocket in the FEN-1 enzymes. Therefore, based on both structural (discussed above) and sequence alignment comparisons between the Ape FEN-1 and the Afu FEN-1, it is possible that the Ape FEN-1 might bind to 3’ flap DNA in a similar manner to that observed in the Afu FEN-1 structure.

The amino acid residues in the 3’ flap DNA interface of the Afu FEN-1 structure are shown in Figure 3.22A and 3.22B in both the presence and absence of bound 3’ flap DNA, respectively. The amino acid residues (labeled in Figure 3.22B) form specific contacts with the 3’ flap DNA substrate. Residues F35, I38, I39, and L47 form a surface-exposed “hydrophobic wedge” that interacts with the DNA substrate through hydrophobic packing interactions. Additional hydrophobic stacking interactions were observed between Y63, F310, and R314 and various deoxyribose sugars of the DNA backbone. A number of amino acid side chains (R64, N67, K317, and K321) and the backbone of S311 from adjacent α-helices are involved in hydrogen bonding interactions with the phosphate backbone of the double-stranded DNA substrate.


Figure 3.22: Amino acid residues forming the 3’ flap DNA interface of the Archaeoglobus fulgidus (Afu) FEN-1 in comparison to the proposed 3’ flap DNA binding interface of Ape FEN-1. A: The upstream duplex with a 3’ overhang base (DNA shown in brown) is shown bound to the ribbon structure of the Archaeoglobus fulgidus FEN-1 in the 3’ flap pocket of the enzyme. The amino acid residues are shown that make specific contact with the 3’ flap DNA. PDB 1RXW, (Chapados et al., 2004). B: The ribbon structure of the Afu FEN-1 3’ flap DNA binding pocket is shown in the absence of DNA with the corresponding residues labeled that are involved in specific contacts with the DNA substrate. The residues (F35, I38, I39, L47, K48, T55, Y63, R64, N67, H308, F310, S311, R314, K317, and K321) are denoted as black stars in the sequence alignment shown in Figure 3.21. C: The ribbon structure of the metal-free Ape FEN-1 (see Chapter 2) is shown with the corresponding residues of the proposed 3’ flap DNA binding pocket in the enzyme. The labeled residues (F38, A41, I42, L50, L51, T58, Y66, R67, N70, H316, F318, N319, R322, R325, and R329) are based on a sequence alignment (see Figure 3.21) with the residues of Afu FEN-1 shown in (B) that are involved in specific contacts with the 3’ flap DNA substrate. This figure was generated with the programs MOLSCRIPT and RENDER using Raster3D (Kraulis, 1991; Merritt and Bacon, 1997).


Interestingly, a number of hydrophobic and hydrogen bonding interactions were observed with the 3’ terminal deoxyribose sugar, but not with the associated 3’ overhang base. Specifically, the backbones of K48 and H308 along with the side chain of T55 were observed to be hydrogen bonded to the 3’-hydroxyl of the 3’ overhang nucleotide. These results are in agreement with previous studies that showed evidence of sequence-independent or structure-specific recognition of the 3’ overhang base by the FEN-1 enzymes (Kaiser et al., 1999). To further support the structural evidence for a sequence-independent 3’ flap DNA binding site in the Afu FEN-1, residue T55 was mutated to a Phe in order to disrupt the binding pocket and the hydrogen bonding to the 3’-hydroxyl of the 3’ overhang nucleotide (Chapados et al., 2004). A substantial decrease in activity was observed for the T55F mutant Afu FEN-1 in the presence of a flap DNA substrate with a one nucleotide 3’ overhang. These results were also in agreement with previous studies in which FEN-1 activity was significantly decreased by a flap DNA substrate containing a 3’ dideoxynucleotide (Kaiser et al., 1999).

Based on the structural and sequence comparisons between the 3’ flap DNA binding region of the Afu FEN-1 (Figure 3.22A and 3.22B) and the corresponding region of the metal-free Ape FEN-1 (Figure 3.22C), a similar mode of 3’ flap DNA binding can be inferred for the Ape FEN-1 enzyme. Ape FEN-1 residues F38, A41, I42, and L50 could potentially interact with the DNA substrate through hydrophobic packing interactions forming a similar “hydrophobic wedge” to that observed in the Afu FEN-1 structure. Additional hydrophobic stacking interactions could occur between Ape FEN-1 residues Y66, F318, and R322 and deoxyribose sugars of the DNA backbone. Residues R67, N70, N319, R325, and R329 could form side chain or backbone hydrogen bonding interactions with the phosphate backbone of the double-stranded DNA substrate. The 3’-hydroxyl of the 3’ overhang nucleotide could be specifically stabilized through hydrogen bonding interactions with the backbones of residues L51 and H316 as well as the side chain of T58, resulting in a sequence-independent recognition of the 3’ overhang nucleotide by the Ape FEN-1 enzyme. However, the binding of the 3’ flap DNA to the Ape FEN-1 enzyme would be dependent on a number of conformational changes, specifically in the H1-H2 loop and helical regions proximal to the 3’ flap DNA binding pocket (discussed above and shown in Figure 3.20).
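The correspondence between the Afu FEN-1 contact residues and the proposed Ape FEN-1 residues follows directly from the alignment in Figure 3.21. As a minimal sketch of that bookkeeping, the code below walks the Afu and Ape rows of the alignment (transcribed from Figure 3.21, with the three blocks of each row joined) and reports the Ape residue aligned to each Afu contact residue. The function and variable names are hypothetical and are not part of any analysis described in the text.

```python
# Afu and Ape rows transcribed from Figure 3.21 (blocks joined, gaps as "-").
ALIGNMENTS = [
    # (Afu start, Afu aligned sequence, Ape start, Ape aligned sequence)
    (23, "KIAVDAFNTLYQFISIIRQPDGTPLKDSQGRITSHLSGILYRVSN",
     26, "VLALDAYNMLYQFLTAIRQPDGTPLLDREGRVTSHLSGLFYRTIN"),
    (298, "EKAIEFLCEEHDFSRERVEKALEKLKA-----LKSTQATLERWF-",
     306, "DKVREILVERHDFNPERVERALERLGKAYREKLRGRQSRLDMWFG"),
]

# Afu residues contacting the 3' flap DNA (listed in the Figure 3.22 caption).
AFU_CONTACTS = [35, 38, 39, 47, 48, 55, 63, 64, 67,
                308, 310, 311, 314, 317, 321]


def map_residues(ref_start, ref_seq, tgt_start, tgt_seq):
    """Map reference residue numbers to (ref letter, target letter, target number)."""
    mapping = {}
    ref_num, tgt_num = ref_start, tgt_start
    for ref_aa, tgt_aa in zip(ref_seq, tgt_seq):
        if ref_aa != "-" and tgt_aa != "-":
            mapping[ref_num] = (ref_aa, tgt_aa, tgt_num)
        # Only advance a numbering counter when a real residue was consumed
        if ref_aa != "-":
            ref_num += 1
        if tgt_aa != "-":
            tgt_num += 1
    return mapping


def ape_equivalents(afu_residues):
    """Return Ape residue labels (e.g. 'F38') aligned to the given Afu residues."""
    combined = {}
    for block in ALIGNMENTS:
        combined.update(map_residues(*block))
    return [f"{combined[n][1]}{combined[n][2]}" for n in afu_residues]


for afu_number, ape_label in zip(AFU_CONTACTS, ape_equivalents(AFU_CONTACTS)):
    print(f"Afu {afu_number} -> Ape {ape_label}")
```

Sequence equivalence of this kind only identifies candidate residues; whether they make the same contacts depends on the conformational changes discussed above.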

Though valuable, the structural characterization of the binding of the upstream duplex 3’ flap DNA does not provide a complete understanding of how the FEN-1 family of enzymes is able to correctly bind and process a displaced 5’ RNA/DNA or DNA/DNA flap substrate that is generated during lagging-strand replication. Based on the structural model of the Afu FEN-1 bound to 3’ flap DNA, it has been proposed that the binding of the 3’ flap portion of the DNA substrate anchors the substrate in such a way that the scissile phosphate (located between the first and second downstream nucleotides) is positioned near the active site of the enzyme as a result of a large kink angle (approximately 90°-100°) in the DNA substrate. The kink, centered on the template strand at the phosphate opposite the junction of the 5’ arm and the 3’ overhang nucleotide, is proposed to position the downstream duplex on the enzyme so that the 5’ flap region is located at the active site (Chapados et al., 2004). As discussed previously, the binding of the 3’ flap portion of the substrate to the Afu FEN-1 is believed to promote the conformational ordering and closing of the bridge structure over the active site, thus helping to facilitate cleavage of the 5’ flap of the substrate (Chapados et al., 2004). The kinked flap DNA model of substrate recognition is supported by the fact that the active site is approximately 25 Å away from the 3’ flap DNA binding pocket in the Afu FEN-1. Therefore, the specific binding of the 3’ flap portion of the DNA substrate to the Afu FEN-1 enzyme likely disrupts the double helical topology of the substrate, promoting the large separation of the 5’ flap and associated downstream duplex from the upstream 3’ flap.

Based on the assumption that the Ape FEN-1 may bind to the 3’ flap of the upstream DNA substrate in a similar manner to that observed for the Afu FEN-1, a kinked flap DNA model may also provide an understanding of how the Ape FEN-1 is able to bind the downstream duplex and position the 5’ displaced flap near the active site for cleavage to occur. A model of a kinked flap DNA substrate bound to Ape FEN-1 is shown in Figure 3.23B with the 3’ flap bound in the binding pocket (analogous to the Afu FEN-1 structure, Figure 3.23A) and the downstream 5’ flap positioned near the active site. The location of the 3’ flap binding pocket and the kink in the substrate would likely disrupt the duplex nature of the DNA near the junction of the upstream and downstream duplex regions. The binding of the upstream 3’ flap may help to position the downstream 5’ flap in the active site of the Ape FEN-1 enzyme. A conformational change in the bridge region resulting in a more open bridge structure upon 3’ flap binding followed by a subsequent closing of the bridge region over a correctly positioned 5’ flap might facilitate cleavage of the phosphodiester bond one base pair into the downstream duplex.


Figure 3.23: Proposed model for flap DNA substrate binding to the Ape FEN-1 enzyme. A: The ribbon structure of Archaeoglobus fulgidus (Afu) FEN-1 bound to 3’ flap DNA (shown in green). PDB 1RXW, (Chapados et al., 2004). B: The ribbon model of the Ape FEN-1 bound to a flap DNA substrate (shown in green). The 5’ flap and associated downstream duplex is positioned over the conserved active site acidic residues (shown in red). The 3’ flap and associated upstream duplex is shown analogous to (A). The antiparallel β strand near the upstream and downstream junction is labeled. Heterotrimeric PCNA (shown in brown) is bound to the C-terminus of Ape FEN-1 encircling the upstream duplex region of the flap DNA substrate (discussed below in text). This figure was generated with the programs MOLSCRIPT and RENDER using Raster3D (Kraulis, 1991; Merritt and Bacon, 1997).


A kinked flap DNA substrate may be stabilized by the Ape FEN-1 enzyme due to the presence of a prominent antiparallel β strand that extends toward the front of the large subdomain (labeled in Figure 3.23). It can be hypothesized that the kinked portion of the substrate may be held or clamped between the large subdomain and the antiparallel β strand of the Ape FEN-1. This antiparallel β strand is shown in Figure 3.24 along with potential residues that could bind to the DNA substrate.

Figure 3.24: Groove between the Ape FEN-1 antiparallel β strand and the large subdomain of the enzyme. Ape FEN-1 ribbon diagram of the antiparallel β strand and the large subdomain showing the open groove that could potentially stabilize a kinked DNA substrate. Hydrophobic residues (A41, I193, L199, V206, and I208) are shown in green. Basic residues (R67 and R197) are shown in blue. Hydrophilic residues (Q37, T40, and Q175) are shown in red. Active site residues are shown in red (not labeled) towards the bottom of the figure. This figure was generated with the programs MOLSCRIPT and RENDER using Raster3D (Kraulis, 1991; Merritt and Bacon, 1997).

Residues proximal to the groove between the antiparallel β strand and the large subdomain could contribute hydrophobic, hydrogen bonding, and electrostatic interactions with the DNA substrate. In particular, residues A41, I193, L199, V206, and I208 could form hydrophobic stacking interactions with deoxyribose sugars or nucleotide bases if the duplex nature of the substrate was disrupted due to kinking. Residues Q37, T40, R67, Q175, and R197 could form hydrogen bonding and electrostatic interactions with the phosphate backbone of the substrate. Interestingly, only the Ape FEN-1 and the Pyrococcus furiosus (Pfu) FEN-1 structures contain such prominent antiparallel β strands (see Chapter 2, Figure 2.15). Sequence alignments have also revealed that the Pfu FEN-1 contains insertion residues corresponding to the antiparallel β strand that are not present in Human FEN-1 (Sakurai et al., 2004). It has been proposed that the presence of this antiparallel β strand is a structural adaptation that may constrain the orientation of the flap DNA substrate, allowing the thermophilic FEN-1 enzymes to function at elevated temperatures (Hosfield et al., 1998b). Further structural information will be needed in order to support the basis for the role of the antiparallel β strand in the stabilization of a kinked DNA substrate.

The understanding of how the FEN-1 family of enzymes is able to recognize a flap DNA substrate is further complicated by the evidence that FEN-1 activity is markedly stimulated by the proliferating cell nuclear antigen (PCNA) clamp (Liu et al., 2004). Previous studies have revealed a heterotrimeric PCNA in the archaeal Sulfolobus solfataricus that led to the proposal that the DNA polymerase, DNA ligase I, and the FEN-1 are each associated to a different monomer of the heterotrimeric PCNA complex. These results suggested the presence of a preassembled processive, processing complex during DNA synthesis and Okazaki fragment processing (Dionne et al., 2003). No structural evidence characterizing this large preassembled complex is currently available. However, recent structural evidence has confirmed that the C-terminus of the FEN-1 enzyme is involved in binding to the PCNA clamp (Chapados et al., 2004; Sakurai et al., 2004). Based on the structural information, it has been proposed that the interaction between the PCNA and the FEN-1 may enhance the FEN-1 binding to the DNA by restraining the orientation of the FEN-1 relative to the upstream duplex DNA which is encircled by PCNA (Chapados et al., 2004). It can be inferred that the PCNA may interact with the C-terminus of the Ape FEN-1, possibly resulting in an ordering of the last ten residues that were missing in the metal-free Ape FEN-1 model (see Chapter 2). This interaction with the PCNA clamp could then help to orient the Ape FEN-1 to facilitate binding of the flap DNA substrate for proper cleavage (as shown in Figure 3.23B). Additional enzyme/substrate co-crystallization studies will need to be completed in order to gain a comprehensive understanding of the unique structure-specific recognition of flap DNA by the FEN-1 family of enzymes. Also, co-crystallization studies of the FEN-1 family of enzymes in the presence of both flap DNA substrate and the PCNA clamp would be very valuable in understanding the molecular basis and role of this ternary complex in DNA replication.

Chapter 4: Bacteriophage T4 RNase H Structural Studies

4.1 Metal Free Bacteriophage T4 RNase H

Protein purification, crystallization, crystal harvesting, and X-ray diffraction data collection of the metal-free bacteriophage T4 RNase H have been previously completed. The goal of this project was to complete the data processing, structure determination, model building, and refinement of the native metal-free bacteriophage T4 RNase H.

4.1.1 Data Processing

The X-ray diffraction data were indexed, integrated, and scaled initially to 1.5 Å resolution. The data were integrated using DENZO and merged using SCALEPACK from the HKL software (Otwinowski and Minor, 1997). The metal-free T4 RNase H crystals belong to the monoclinic space group P21, with unit cell dimensions a = 36.3, b = 86.5, c = 57.8 Å with α = 90°, β = 92°, and γ = 90°. Calculation of the Matthews’ coefficient (VM = 2.5 Å³ Da⁻¹) indicated that there was one monomer per asymmetric unit in the crystal (Matthews, 1968). Analysis of the respective resolution bins in the initial scale file revealed that the data were only useful to 1.8 Å resolution. The Rmerge values for the highest resolution bins (1.5-1.75 Å) ranged from 29% to 48% and the intensity/error (I/σ) values were approximately 2.5-4.6. Ideally, data can be used with Rmerge values of under 20% and an I/σ value greater than four for each respective resolution bin. Therefore, the data were rescaled to 1.8 Å resolution and had an overall Rmerge of 4.5% with a completeness of 99.9%. The highest resolution bin had an Rmerge value of 18.5% and an I/σ of 7.4. See Table 4.1 for all metal-free T4 RNase H crystallographic data. The final scale file was then used to create a CNS style and a CCP4 style data file (h k l, Fobs, and Sigma Fobs) containing an Rfree random data subset of 3% for use in the refinement software suites Crystallography & NMR System (CNS) (Brunger, 1992; Brunger et al., 1998) and CCP4 (Collaborative Computational Project, 1994), respectively.
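The Matthews’ coefficient quoted above can be checked from the unit cell. The sketch below computes the monoclinic cell volume and divides it by the total protein mass in the cell. The cell dimensions come from the text, and P21 has two symmetry operators. The molecular mass of T4 RNase H is not stated in this section, so `ASSUMED_MASS_DA` (about 35.6 kDa) is an assumption, as are the function names. The solvent estimate uses the common approximation 1 - 1.23/VM.

```python
import math


def monoclinic_cell_volume(a, b, c, beta_deg):
    """Unit cell volume (cubic angstroms) for a monoclinic cell."""
    return a * b * c * math.sin(math.radians(beta_deg))


def matthews_coefficient(volume_A3, mass_Da, n_sym_ops, n_per_asu):
    """V_M in cubic angstroms per dalton."""
    return volume_A3 / (mass_Da * n_sym_ops * n_per_asu)


ASSUMED_MASS_DA = 35_600.0   # assumed monomer mass, not given in this section
P21_SYM_OPS = 2              # general positions in space group P21

volume = monoclinic_cell_volume(36.3, 86.5, 57.8, 92.0)
for n_monomers in (1, 2):
    vm = matthews_coefficient(volume, ASSUMED_MASS_DA, P21_SYM_OPS, n_monomers)
    solvent = 1.0 - 1.23 / vm   # approximate solvent fraction
    print(f"{n_monomers} per asymmetric unit: V_M = {vm:.2f} A^3/Da, "
          f"solvent ~ {solvent * 100:.0f}%")
```

Under the assumed mass, one monomer reproduces the reported value of about 2.5, whereas two monomers would give an approximate solvent fraction near zero, which is why a single monomer per asymmetric unit is the plausible choice.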

Table 4.1: Metal-free T4 RNase H crystallographic data.

Lattice type            Primitive monoclinic
Space group             P21
Asymmetric unit         One molecule
Cell dimensions         a = 36.3, b = 86.5, c = 57.8 Å, α = 90° β = 92° γ = 90°
Data
  Resolution            1.8 Å
  Rmerge^a              4.5%
  Observed reflections  134,973
  Unique reflections    33,128
  Completeness, %       99.9%

^a $R_{merge} = 100 \times \left[ \sum_{hkl} \sum_{i=1}^{n} \left| F^{2}(hkl) - F^{2}(hkl)_{i} \right| \right] \Big/ \sum_{hkl} \sum_{i=1}^{n} F^{2}(hkl)$, where $F^{2}(hkl)$ is the intensity of the hkl reflection and $F^{2}(hkl)_{i}$ is the mean value of i multiple measurements of the n equivalent reflections.
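As an illustration of the Rmerge statistic defined in the footnote to Table 4.1, the sketch below groups repeated measurements by reflection index and sums each measurement's absolute deviation from the mean of its group. This is a generic illustration, not the SCALEPACK implementation. The intensities in the example are invented, the function name is hypothetical, and symmetry-equivalent reflections are assumed to have already been reduced to a common index.

```python
from collections import defaultdict


def r_merge_percent(observations):
    """Rmerge (percent) from an iterable of ((h, k, l), intensity) pairs.

    Symmetry-equivalent measurements are assumed to share the same index.
    """
    groups = defaultdict(list)
    for hkl, intensity in observations:
        groups[hkl].append(intensity)

    numerator = 0.0
    denominator = 0.0
    for intensities in groups.values():
        mean_intensity = sum(intensities) / len(intensities)
        numerator += sum(abs(i - mean_intensity) for i in intensities)
        denominator += sum(intensities)
    return 100.0 * numerator / denominator


# Invented intensities for two reflections, each measured twice
example = [
    ((1, 0, 0), 100.0), ((1, 0, 0), 110.0),
    ((0, 1, 0), 50.0), ((0, 1, 0), 50.0),
]
print(f"Rmerge = {r_merge_percent(example):.2f}%")
```

In practice the statistic is reported per resolution bin as well as overall, which is how the 1.8 Å cutoff described in section 4.1.1 was chosen.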

4.1.2 Structure Determination

The initial phases were obtained by molecular replacement using AMoRe (Navaza and Saludjian, 1997), with the previously solved structure of T4 RNase H containing two magnesium ions in the active site as the search model (PDB code 1TFR, 2.1 Å; Mueser et al., 1996; see Figure 4.1: Ribbon diagram of metal-bound T4 RNase H). During molecular replacement, the metal-bound T4 RNase H model was positioned in the metal-free T4 RNase H crystal lattice using a rotation search followed by a translation search. The rotation and translation searches were done with one monomer using data from 15.0 to 3.0 Å resolution. Following the rotation and translation searches, a rigid body refinement was completed to optimize the rotational orientation of the search model. The molecular replacement solution in space group P21 gave an Rfactor of 49.9% and a correlation coefficient (CC) of 33.4%, and an output coordinate model of the metal-free T4 RNase H was generated.

Figure 4.1: Ribbon diagram of metal-bound T4 RNase H. Ribbon diagram of metal-bound T4 RNase H in the presence of two bound magnesium ions shown in gold. The N-terminus is shown in blue and can be traced to the C-terminus shown in red. The metal-bound structure is missing the first 11 amino acids at the N-terminus. The structure also has a disordered bridge region from amino acids 89-97 and a disordered turn at residues 181 and 182. The metal-bound T4 RNase H was used as the search model in molecular replacement to solve the metal-free form of the enzyme (PDB code 1TFR, 2.1 Å; Mueser et al., 1996). This figure was generated with the programs MOLSCRIPT and RENDER using Raster3D (Kraulis, 1991; Merritt and Bacon, 1997).

4.1.3 Model Building and Refinement

Following molecular replacement, the output coordinate file from the rigid body refinement was used for the initial refinement against the metal-free T4 RNase H data. Simulated annealing refinement in CNS was used first. Simulated annealing is a nonlinear refinement method that allows the model to explore a large range of conformations. By coupling a simulated high temperature followed by slow cooling to the refinement process, sufficient kinetic energy is applied to the model that amino acid side chain and backbone conformational changes can occur and energy barriers can be crossed. By allowing the model to cross energy barriers, it is more likely that the model will reach a lower, or possibly the global, energy minimum (Brunger et al., 1997; Brunger and Rice, 1997). Thus, it was hypothesized that simulated annealing would be an appropriate refinement method for the metal-free T4 RNase H, considering the substantial structural differences anticipated between it and the metal-bound form of T4 RNase H (see Figure 4.1: Ribbon diagram of metal-bound T4 RNase H). The first round of refinement was completed using CNS simulated annealing at 2000 K with a 100 K constant cooling to 273 K using all data from 20.0-1.8 Å resolution, yielding an Rvalue of 28.8% and an Rfree of 32.1%.
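
The cooling protocol described above can be written out as a temperature schedule. The sketch below is illustrative only and is not a CNS input file: it assumes that "100 K constant cooling" means the bath temperature is lowered by 100 K per stage, and that the last stage is clamped to the stated final temperature of 273 K. The function name and its parameters are hypothetical.

```python
def cooling_schedule(start_k=2000.0, step_k=100.0, final_k=273.0):
    """List of bath temperatures from start_k down to final_k.

    ASSUMPTION: the temperature drops by step_k at each stage and the
    final stage is run at final_k even if it is not a multiple of step_k.
    """
    temperatures = []
    current = start_k
    while current > final_k:
        temperatures.append(current)
        current -= step_k
    temperatures.append(final_k)
    return temperatures


schedule = cooling_schedule()
print(f"{len(schedule)} stages, from {schedule[0]:.0f} K to {schedule[-1]:.0f} K")
```

Under these assumptions the model spends most of the run well above room temperature, which is what allows the barrier crossing described in the text before the structure settles during the final stages.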

Following all refinements, Fobs-Fcalc and 2Fobs-Fcalc difference electron density maps were calculated. The first round of model building was then completed with molecular graphics using the O program (Jones et al., 1991). Prior to model building, the σ levels of all difference electron density maps were normalized using the MAPMAN program (Kleywegt and Jones, 1996). Fobs-Fcalc and 2Fobs-Fcalc electron density maps were used in all rounds of model building. A total of three rounds of CNS simulated annealing refinement and two rounds of model building were completed, with 297 of 305 amino acid residues in place, an Rvalue of 19.0%, and an Rfree of 24.0%. Due to disordered electron density, the first eight amino acids at the N-terminus could not be modeled with confidence. All simulated annealing refinements were done at 2000 K with a 100 K constant cooling to 273 K using all data from 20.0-1.8 Å resolution. Following the first and second model building cycles, the CNS water_pick program was used prior to CNS simulated annealing refinement to automatically pick water molecule coordinate positions in the metal-free T4 RNase H model (Read, 1986; Kleywegt and Brunger, 1996). A total of 445 solvent atoms were identified using two rounds of the CNS water_pick program. With 97% of the metal-free T4 RNase H amino acid residues in place following three rounds of CNS refinement and two rounds of model building, Refmac5 restrained, positional refinement (CCP4) was then used to complete the model building process (Murshudov, 1997).

Using the output coordinate file from the third round of CNS refinement, Refmac5 restrained refinement was done from 20.0-1.8 Å, giving an Rvalue of 17.7% and an Rfree of 21.1%. One round of water picking was also included with the Refmac5 refinement using the automated ARP_waters program (Lamzin, 1993; Morris et al., 2002). A third round of model building was then completed using molecular graphics. Moderate changes were made in the bridge region of the metal-free T4 RNase H (amino acid residues 92-96) because the CCP4 electron density maps (Fobs-Fcalc and 2Fobs-Fcalc) were more ordered than the electron density maps created by CNS refinement. Also, the electron density of the disordered N-terminal region (amino acids one through eight) was examined using both Fobs-Fcalc and 2Fobs-Fcalc electron density maps at a variety of σ contour levels. A total of 34 solvent atoms were identified and removed from the coordinate file. Thirty of these solvent atoms had been incorrectly placed in the electron density of the disordered N-terminal region by the CNS water_pick and CCP4 ARP_waters programs. A second and third round of Refmac5 restrained refinement without ARP_water picking were then completed at the same resolution as described previously. By removing the incorrectly placed water molecules in the N-terminal region, it was anticipated that some Fobs-Fcalc difference electron density would be present for the disordered residues one through eight in the metal-free T4 RNase H structure. Following the second and third Refmac5 refinement rounds, model building was used to analyze the electron density maps at both the N-terminus and in the bridge region (residues 89-97), where some repositioning of amino acid side chains was completed. Surprisingly, the difference electron density at the N-terminus was still too disordered for amino acids one through eight to be modeled with confidence. A final Refmac5 restrained refinement without ARP_water picking was then completed from 20.0-1.8 Å, giving an Rvalue of 16.2% and an Rfree of 20.7%. The stereochemistry of the final refined amino acid coordinate file of the metal-free T4 RNase H was then analyzed by PROCHECK (Laskowski et al., 1992). The Ramachandran plot showed that for the 297 observed amino acids in the model, 91.3% were in most favored stereochemical regions, 8.4% were in allowed regions, and 0.4% (one amino acid, E96) was in a generously allowed region (see Table 4.2: Metal-free T4 RNase H preliminary crystallographic refinement data).

Table 4.2: Metal-free T4 RNase H preliminary crystallographic refinement data.

Refinement
Resolution              20.0-1.8 Å
Rvalue (a)              16.2%
Rfree (a)               20.7%
Average B factor        16.9 Å²
RMS bonds (b)           0.021 Å
RMS angles (b)          1.637°
Atoms (nonhydrogen)     2925
Solvent atoms           484

Ramachandran plot (c)
Most favored            91.3%
Allowed                 8.4%
Generously allowed      0.4%

(a) $R_{value} = \sum_{hkl} \left| F_{obs} - F_{calc} \right| / \sum_{hkl} F_{obs}$; Rfree is the free Rvalue (3% random data subset) (Brunger, 1992).
(b) Root-mean-square deviations of bond lengths in Å and bond angles in degrees calculated with CCP4 Refmac5 (Collaborative Computational Project, 1994).
(c) Ramachandran plot quality assessment using PROCHECK (Laskowski et al., 1992).
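
The Rvalue and Rfree definitions in the footnote can be illustrated as follows. This is a sketch on hypothetical amplitudes, not output from CNS or Refmac5: it assumes Fobs and Fcalc are already on a common scale, and the helper `pick_free_set` is an illustrative stand-in for the random 3% selection described in the text.

```python
import random


def r_value(f_obs, f_calc):
    """R = sum|Fobs - Fcalc| / sum(Fobs) over the supplied reflections."""
    return sum(abs(o - c) for o, c in zip(f_obs, f_calc)) / sum(f_obs)


def pick_free_set(n_reflections, fraction=0.03, seed=0):
    """Randomly flag a fraction of reflection indices as the Rfree test set."""
    rng = random.Random(seed)
    n_free = round(n_reflections * fraction)
    return set(rng.sample(range(n_reflections), n_free))


# Hypothetical amplitudes, assumed to be on a common scale already.
f_obs = [100.0, 80.0, 60.0, 40.0]
f_calc = [90.0, 85.0, 66.0, 38.0]
r_toy = r_value(f_obs, f_calc)

# Size of a 3% test set drawn from the 33,128 unique reflections of Table 4.1.
free_set = pick_free_set(33128, fraction=0.03)

print(f"toy R = {100 * r_toy:.1f}%")
print(f"free-set size = {len(free_set)} of 33128")
```

The working Rvalue is obtained by applying `r_value` to the reflections outside the free set, and Rfree by applying it to the reflections inside the free set, which are never used to drive the refinement.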

Upon examination and comparison of the crystallographic refinement statistics of the metal-free T4 RNase H (1.8 Å, Rvalue of 16.2%) and the metal-free D132N T4 RNase H mutant (1.5 Å, Rvalue of 18.6%) (see Chapter 4.2, Table 4.5: Metal-free D132N T4 RNase H crystallographic refinement data), it was determined that the overall Rvalue of 16.2% for the metal-free T4 RNase H model may have been incorrectly forced lower by the presence of 484 solvent atoms in the refined structure. By comparison, the metal-free D132N T4 RNase H structure was refined with a total of 174 solvent atoms, approximately 300 fewer solvent atoms than were present in the native metal-free T4 RNase H model. When building the protein model, caution must be taken so that additional atoms (solvent water molecules) are not incorporated incorrectly into the model by automatic water picking programs such as CNS water_pick or CCP4 Refmac5 ARP_waters. Lower quality X-ray diffraction data can lead to difference electron density maps that contain Fourier artifacts at lower map σ contour levels. These less ordered electron density artifacts can be incorrectly identified as potential solvent molecules by automatic water picking programs. As a general guideline when evaluating an X-ray diffraction data set, the number of observed unique reflections should be at least four times the number of refined parameters (x, y, z, and temperature B factor for each atom) in order for these parameters to remain uncorrelated during refinement. If the ratio of observed unique reflections to refined parameters falls below four, the Rvalue can be artificially driven much lower as a result of the parameters becoming correlated. Similarly, any decrease in the ratio of unique reflections to refined parameters can cause the Rvalue to decrease slightly. The addition of large numbers of incorrectly placed solvent atoms increases the number of refined parameters, thus decreasing the unique reflection to refined parameter ratio along with the Rvalue. Because each atom contributes to all observed reflections, the decrease in the unique reflection to refined parameter ratio resulting from incorrectly placed solvent atoms can cause errors in electron density map calculations. These errors can contribute to flattening of missing protein electron density and hence to an incorrect model. Therefore, it was decided to remove all 484 solvent atoms from the metal-free T4 RNase H coordinate file and re-refine the data using CCP4 Refmac5. It was anticipated that some or all of the disordered or weak electron density at the N-terminus would then improve enough for the N-terminus to be modeled.
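
The reflection-to-parameter guideline above can be checked against the numbers reported in Tables 4.1 through 4.5. The sketch below is a rough estimate under stated assumptions: it counts four parameters per nonhydrogen atom, takes the atom totals in the tables to include solvent atoms (the totals minus the solvent counts agree for Tables 4.2 and 4.3), uses all unique reflections without subtracting the Rfree subset, and ignores the stereochemical restraints that also enter a restrained refinement.

```python
PARAMS_PER_ATOM = 4  # x, y, z, and one isotropic B factor per atom


def reflections_per_parameter(unique_reflections, n_atoms):
    """Ratio of unique reflections to refined parameters (4 per atom)."""
    return unique_reflections / (n_atoms * PARAMS_PER_ATOM)


# (unique reflections, nonhydrogen atoms including solvent) from the tables.
models = {
    "native, 484 solvent atoms (Table 4.2)": (33128, 2925),
    "native, 110 solvent atoms (Table 4.3)": (33128, 2551),
    "D132N, 174 solvent atoms (Table 4.5)": (52759, 2656),
}

ratios = {name: reflections_per_parameter(*counts) for name, counts in models.items()}

for name, ratio in ratios.items():
    status = "meets" if ratio >= 4.0 else "is below"
    print(f"{name}: {ratio:.2f} reflections per parameter ({status} the guideline of 4)")
```

Under these assumptions the native model containing 484 solvent atoms has the lowest ratio of the three, which is consistent with the concern that the large number of solvent atoms contributed to its low Rvalue.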

Using the current model of the metal-free T4 RNase H, in which 297 of 305 amino acids were modeled and all 484 solvent atoms had been removed, one round of CCP4 Refmac5 restrained refinement from 20.0-1.8 Å was completed without the ARP_waters program so that the electron density of the disordered N-terminal region could be examined in the absence of all solvent molecules. Next, this Refmac5 refined output model was put through three successive rounds of CCP4 Refmac5 refinement from 20.0-1.8 Å with ARP_water picking, giving an Rvalue of 20.1% and an Rfree of 24.6%. It is advantageous to use multiple rounds of the ARP_waters program because solvent atom coordinate positions are determined based on hydrogen bonding distances to adjacent solvent molecules and amino acid residues (peptide bonds and side chains). A complete model building walkthrough of the electron density was then completed at a contour level of 1.0σ for the 2Fobs-Fcalc map and 4.0σ for the Fobs-Fcalc map to determine whether solvent atoms had been correctly identified by the ARP_waters program. A contour level of 4.0σ for the Fobs-Fcalc map was used so that only well ordered solvent atom difference electron density was observed. The 4.0σ Fobs-Fcalc map contour level resulted in difference electron density with fewer map artifacts and a minimal number of incorrectly identified solvent molecules. Incorrectly identified solvent molecules were then deleted from the coordinate file of the model and another round of CCP4 Refmac5 refinement was completed without ARP_water picking. A comparison of both Fobs-Fcalc and 2Fobs-Fcalc electron density maps from the CCP4 Refmac5 refinements before (no solvent atoms in the model) and after four rounds of Refmac5 (three rounds of ARP_water picking of solvent molecules) showed no significant improvement in the order of the electron density of the first eight amino acids at the N-terminus of the metal-free T4 RNase H. Therefore, no further model building at the N-terminus was attempted. Lastly, two additional rounds of CCP4 Refmac5 refinement without ARP_water picking were used to correct the stereochemistry of the bridge region amino acid residues 89-97. The metal-free T4 RNase H preliminary crystallographic refinement data showed that one residue (E96) was in a generously allowed stereochemical conformation based on the Ramachandran plot analysis using PROCHECK (see Table 4.2). Repositioning of bridge region amino acids 89-97 was then completed prior to the final round of CCP4 Refmac5 refinement. The final round of CCP4 Refmac5 refinement gave a final Rvalue of 19.8% and an Rfree value of 21.8%. The final 2Fobs-Fcalc electron density quality of the metal-free T4 RNase H structure and bridge region is shown in Figure 4.2. The stereochemistry of the final refined amino acid coordinate file of the metal-free T4 RNase H was then analyzed by PROCHECK. The Ramachandran plot showed that for the 297 observed amino acids in the model, 91.3% were in most favored stereochemical regions and 8.7% were in allowed regions (see Table 4.3: Metal-free T4 RNase H crystallographic refinement data). See Figure 4.3 for a summary of the metal-free T4 RNase H structure determination process. The ribbon diagram for metal-free T4 RNase H is shown in Figure 4.4.


Table 4.3: Metal-free T4 RNase H crystallographic refinement data.

Refinement
Resolution              20.0-1.8 Å
Rvalue (a)              19.8%
Rfree (a)               21.8%
Average B factor        20.2 Å²
RMS bonds (b)           0.024 Å
RMS angles (b)          1.782°
Atoms (nonhydrogen)     2551
Solvent atoms           110

Ramachandran plot (c)
Most favored            91.3%
Allowed                 8.7%

(a) $R_{value} = \sum_{hkl} \left| F_{obs} - F_{calc} \right| / \sum_{hkl} F_{obs}$; Rfree is the free Rvalue (3% random data subset) (Brunger, 1992).
(b) Root-mean-square deviations of bond lengths in Å and bond angles in degrees calculated with CCP4 Refmac5 (Collaborative Computational Project, 1994).
(c) Ramachandran plot quality assessment using PROCHECK (Laskowski et al., 1992).


Figure 4.2: Final electron density quality of metal-free T4 RNase H. A: Final 2Fobs-Fcalc electron density map contoured at 1.0 σ following the final CCP4 Refmac5 round of refinement. The final model is shown in yellow. B: Final 2Fobs-Fcalc electron density map of the α-carbon trace of the bridge region (shown in light yellow) of metal-free T4 RNase H contoured at 1.0σ. Ordered side chain density corresponding to residues R90 and R94 (noted in white) is shown.


Figure 4.3: Summary of the structure determination process for metal-free T4 RNase H.

X-ray diffraction data processing using HKL: indexing, integration, and scaling

Molecular replacement using AMoRe: Metal-bound T4 RNase H as search model

CNS simulated annealing refinement: 3 rounds

CNS water pick: 2 rounds

Model building with O: 2 rounds

CCP4 Refmac5 restrained refinement: 4 rounds

CCP4 ARP waters: 1 round

Model building with O: 3 rounds

Removal of all solvent atoms

CCP4 Refmac5 restrained refinement: 7 rounds

CCP4 ARP waters: 3 rounds

Model building with O: 3 rounds

Stereochemical validation using PROCHECK


Figure 4.4: Ribbon diagram of metal-free T4 RNase H. The N-terminus is shown in blue and can be traced to the C-terminus shown in red. The first eight amino acids at the N-terminus were not seen in the electron density and are not included in the ribbon diagram. This figure was generated with the programs MOLSCRIPT and RENDER using Raster3D (Kraulis, 1991; Merritt and Bacon, 1997).

4.2 Metal Free D132N Bacteriophage T4 RNase H

Protein purification, crystallization, crystal harvesting, and X-ray diffraction data collection of the metal-free D132N bacteriophage T4 RNase H had been completed previously. The goal of this project was to complete the data processing, structure determination, model building, and refinement of the metal-free D132N bacteriophage T4 RNase H.

4.2.1 Data Processing

The X-ray diffraction data were indexed, integrated, and scaled to 1.5 Å resolution. The data were integrated using DENZO and merged using SCALEPACK from the HKL software (Otwinowski and Minor, 1997). The metal-free D132N T4 RNase H crystals belong to the monoclinic space group P21, with unit cell dimensions a = 36.0, b = 85.8, c = 57.3 Å and α = 90°, β = 92°, γ = 90°. Calculation of the Matthews coefficient (VM = 2.5 Å³ Da⁻¹) indicated that there was one monomer per asymmetric unit in the crystal (Matthews, 1968). The data had an overall Rmerge of 4.8% with a completeness of 95.2%. The Rmerge value for the highest resolution bin (1.5 Å) was 8.9%, with an intensity/error (I/σ) value of 10.0. See Table 4.4 for all metal-free D132N T4 RNase H crystallographic data. The final scale file was then used to create a CCP4 style data file containing an Rfree random data subset of 3.1% for use in CCP4 Refmac5 refinement (Collaborative Computational Project, 1994).

Table 4.4: Metal-free D132N T4 RNase H crystallographic data.

Lattice type            Primitive monoclinic
Space group             P21
Asymmetric unit         One molecule
Cell dimensions         a = 36.0, b = 85.8, c = 57.3 Å; α = 90°, β = 92°, γ = 90°

Data
Resolution              1.5 Å
Rmerge (a)              4.8%
Observed reflections    309,990
Unique reflections      52,759
Completeness            95.2%

(a) $R_{merge} = 100 \times \left[ \sum_{hkl} \sum_{i=1}^{n} \left| F^2(hkl)_i - \overline{F^2(hkl)} \right| \right] / \sum_{hkl} \sum_{i=1}^{n} F^2(hkl)_i$, where $F^2(hkl)_i$ is the intensity of the i-th measurement of the hkl reflection and $\overline{F^2(hkl)}$ is the mean value of the multiple measurements of the n equivalent reflections.

4.2.2 Model Building and Refinement

Molecular replacement was not needed to solve the structure of the metal-free D132N T4 RNase H. By the time the D132N mutant data had been processed, a partially completed model of the native metal-free P21 crystal form was available (see Chapter 4.1). At that time, the partially completed model of the native metal-free T4 RNase H had been refined with three rounds of CNS simulated annealing followed by three rounds of CCP4 Refmac5 restrained refinement. Because both the native and the D132N crystal forms were monoclinic P21 with very similar unit cell dimensions (see Tables 4.1 and 4.4), the partially completed model of the metal-free T4 RNase H was refined directly against the X-ray diffraction data of the D132N mutant (h k l, Fobs, and Sigma Fobs in the CCP4 style data file). It was hypothesized that the overall conformation and secondary structural characteristics of the D132N mutant T4 RNase H would be very similar to the native metal-free model; therefore, CCP4 Refmac5 restrained, positional refinement (Murshudov, 1997) was chosen. The first round of refinement was completed using CCP4 Refmac5 restrained refinement from 45.0-1.5 Å resolution, giving an Rvalue of 23.3% and an Rfree of 25.1%. Following all refinements, Fobs-Fcalc and 2Fobs-Fcalc difference electron density maps were calculated. The first round of model building was then completed with molecular graphics using the O program (Jones et al., 1991). Prior to model building, the σ levels of all difference electron density maps were normalized using the MAPMAN program (Kleywegt and Jones, 1996). Fobs-Fcalc and 2Fobs-Fcalc electron density maps were used in all rounds of model building. A total of six rounds of CCP4 Refmac5 restrained refinement and four rounds of model building were completed, with 297 of 305 amino acid residues in place, an Rvalue of 18.2%, and an Rfree of 20.6%. The automated ARP_waters program was used with the second, third, fourth, and fifth rounds of Refmac5 restrained refinement (Lamzin, 1993; Morris et al., 2002). Only moderate changes to amino acid conformation and torsion angles were made in the bridge region (amino acid residues 89-97) of the metal-free D132N T4 RNase H. Due to partially disordered electron density, the first eight amino acids at the N-terminus were not modeled prior to the addition of solvent atoms. However, it was important to attempt to build the N-terminus since it has been identified as the binding site of the T4 gene 45 clamp protein. During model building rounds, incorrectly placed solvent atoms were removed from the N-terminal region containing partially disordered electron density. A total of 236 correctly placed solvent atoms were identified following the four rounds of the CCP4 ARP_waters program.

Following the sixth round of CCP4 Refmac5 restrained refinement, a polyalanine α helix was positioned into the partially disordered electron density at the N-terminus. Residues two through eight were then mutated to the correct amino acids corresponding to the primary sequence of the metal-free D132N T4 RNase H. After the side chain torsion angles were adjusted to fit the electron density, another round of CCP4 Refmac5 restrained refinement was completed from 45.0-1.5 Å resolution without the ARP_waters program. Surprisingly, analysis of the resulting electron density difference maps (both Fobs-Fcalc and 2Fobs-Fcalc) at the N-terminus showed that the maps had been severely flattened down to noise levels. When atoms are modeled correctly into electron density, one would normally expect the resulting electron density maps to become more ordered or enhanced following a refinement cycle. Since each atom contributes to all observed reflections, it was concluded that the presence of the 236 solvent atoms in the structure caused errors in the calculation of the electron density at the N-terminus. Therefore, it was decided to repeat the CCP4 Refmac5 restrained refinement in the absence of all solvent atoms, with the temperature factors (B values) of all amino acid atoms reset to 20 Å². Examination of the resulting electron density difference maps showed more ordered 2Fobs-Fcalc density for residues four through eight at the N-terminus. Following the removal of all solvent atoms from the coordinate file, a total of four rounds of CCP4 Refmac5 restrained refinement and three rounds of model building were needed to position residues four through eight at the N-terminus. The electron density for residues one, two, and three was still disordered; therefore, they could not be modeled with confidence. Three consecutive rounds of CCP4 Refmac5 restrained refinement were then used along with the ARP_waters program to pick solvent atoms in the structure. One round of model building was completed following the three rounds of refinement and solvent atom picking. Incorrectly placed solvent atoms were identified and subsequently removed from the coordinate file. Lastly, one final round of CCP4 Refmac5 restrained refinement without ARP_water picking was completed from 45.0-1.5 Å resolution, giving a final Rvalue of 18.6% and an Rfree of 20.9%. The final 2Fobs-Fcalc electron density quality of the metal-free D132N T4 RNase H structure and bridge region is shown in Figure 4.5. The stereochemistry of the final refined amino acid coordinate file of the metal-free D132N T4 RNase H was then analyzed by PROCHECK (Laskowski et al., 1992). The Ramachandran plot showed that for the 302 observed amino acids in the model, 91.8% were in most favored stereochemical regions and 8.2% were in allowed regions (see Table 4.5: Metal-free D132N T4 RNase H crystallographic refinement data). See Figure 4.6 for a summary of the metal-free D132N T4 RNase H structure determination process. The ribbon diagram for metal-free D132N T4 RNase H is shown in Figure 4.7.


Table 4.5: Metal-free D132N T4 RNase H crystallographic refinement data.

Refinement
Resolution              45.0-1.5 Å
Rvalue (a)              18.6%
Rfree (a)               20.9%
Average B factor        17.4 Å²
RMS bonds (b)           0.015 Å
RMS angles (b)          1.450°
Atoms (nonhydrogen)     2656
Solvent atoms           174

Ramachandran plot (c)
Most favored            91.8%
Allowed                 8.2%

(a) $R_{value} = \sum_{hkl} \left| F_{obs} - F_{calc} \right| / \sum_{hkl} F_{obs}$; Rfree is the free Rvalue (3.1% random data subset) (Brunger, 1992).
(b) Root-mean-square deviations of bond lengths in Å and bond angles in degrees calculated with CCP4 Refmac5 (Collaborative Computational Project, 1994).
(c) Ramachandran plot quality assessment using PROCHECK (Laskowski et al., 1992).


Figure 4.5: Final electron density quality of metal-free D132N T4 RNase H. A: Final 2Fobs-Fcalc electron density map contoured at 1.0 σ following the final CCP4 Refmac5 round of refinement. The final model is shown in yellow. B: Final 2Fobs-Fcalc electron density map of the α-carbon trace of the bridge region (shown in light yellow) of metal-free D132N T4 RNase H contoured at 1.0σ. Ordered side chain density corresponding to residues R90 and R94 (noted in white) is shown.


Figure 4.6: Summary of the structure determination process for metal-free D132N T4 RNase H.

X-ray diffraction data processing using HKL: indexing, integration, and scaling

CCP4 Refmac5 restrained refinement: 1 round (data: metal-free D132N T4 RNase H CCP4 style data file; model: metal-free wild-type T4 RNase H)

Model building with O: 1 round

CCP4 Refmac5 restrained refinement: 5 rounds

CCP4 ARP waters: 4 rounds

Model building with O: 3 rounds

Removal of all solvent atoms

CCP4 Refmac5 restrained refinement: 8 rounds

CCP4 ARP waters: 3 rounds

Model building with O: 5 rounds

Stereochemical validation using PROCHECK


Figure 4.7: Ribbon diagram of metal-free D132N T4 RNase H. The N-terminus is shown in blue and can be traced to the C-terminus shown in red. The first three amino acids at the N-terminus were not seen in the electron density and are not included in the ribbon diagram. This figure was generated with the programs MOLSCRIPT and RENDER using Raster3D (Kraulis, 1991; Merritt and Bacon, 1997).

4.3 Structure Analysis and Comparison

The structure analysis was completed using the metal-free D132N mutant T4 RNase H structure. The LSQMAN least squares superimposition program (Kleywegt and Jones, 1994) was used to superimpose the metal-free D132N mutant T4 RNase H structure (see Figure 4.7) onto the metal-free native T4 RNase H structure (see Figure 4.4), giving an RMS value of 0.247 Å. As expected, the model of the metal-free D132N mutant RNase H has a very similar overall structure to the native metal-free structure. Structural comparisons were therefore made between the previously solved metal-bound native T4 RNase H (see Figure 4.1) and the metal-free D132N mutant T4 RNase H.
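
The RMS value reported by a superposition program is the root-mean-square distance between paired atoms once the two structures have been optimally fitted. The sketch below computes only that final quantity for coordinates that are assumed to be superimposed already; it does not perform the least squares fit itself, and the coordinates shown are hypothetical rather than taken from either structure.

```python
import math


def rms_distance(coords_a, coords_b):
    """Root-mean-square distance in Å between paired (x, y, z) coordinates."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Hypothetical, already superimposed atom positions (Å).
model_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
model_b = [(0.3, 0.0, 0.0), (1.0, 0.3, 0.0), (0.0, 1.0, 0.3)]

toy_rms = rms_distance(model_a, model_b)
print(f"RMS distance = {toy_rms:.3f} Å")
```

Which atoms are paired (for example, α-carbons only or all main chain atoms) changes the value obtained, so an RMS value is only comparable between analyses that use the same atom selection.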


4.3.1 General Architecture of Metal-free D132N T4 RNase H

The three-dimensional ribbon model of the mutant D132N T4 RNase H is represented in the absence of divalent cations (Figure 4.7). Surprisingly, the overall fold of the enzyme was more ordered in the absence of active site divalent cations. In contrast, the native metal-bound T4 RNase H showed regions of structural disorder. The regions that are disordered in the metal-bound native enzyme are visible in the electron density maps of the metal-free P21 crystal form. Generally, the binding of metal ions to a protein is associated with some form of increased structural order. Local structural order can be seen at the metal binding site or structural motif. In addition, structural order may occur on a larger scale if secondary and tertiary structural conformational changes occur in response to the binding of metal ions to a protein. Thus, it is unusual that the metal-free form of the T4 RNase H is more structurally ordered than the metal-bound form of the enzyme.


Figure 4.8: Ribbon diagrams of (A) metal-free D132N T4 RNase H and (B) metal-bound native T4 RNase H (Mueser et al., 1996). A: The ribbon diagram of metal-free D132N T4 RNase H. Helices H1-H13 and β strands S1-S7 are labeled from the N-terminus (shown in blue) to the C-terminus (shown in red). Three disordered (missing) residues at the N-terminus are denoted by dots, one dot per missing residue. B: The ribbon diagram of the metal-bound native T4 RNase H. Two bound magnesium ions (shown in gold) are labeled Mg1 and Mg2, respectively. Helices H1-H13 and β strands S1-S8 are labeled from the N-terminus (shown in blue) to the C-terminus (shown in red). Disordered (missing) residues 1-11, 89-97, and 181-182 are denoted by dots, one dot per missing residue. This figure was generated with the programs MOLSCRIPT and RENDER using Raster3D (Kraulis, 1991; Merritt and Bacon, 1997).


See Figure 4.8 for the ribbon diagrams showing labeled secondary structural features of both the metal-free D132N T4 RNase H and the magnesium-bound T4 RNase H (Mueser et al., 1996).

The metal-free D132N T4 RNase H structure contained the large central groove that is present in the native metal-bound T4 RNase H. The large central groove divided the metal-free molecule into a larger subdomain on the right and a smaller subdomain on the left (as viewed in Figure 4.8). The large subdomain on the right contained mainly the N-terminal portion of the sequence (colored blue, light blue, and green). This subdomain contained α/β structure along with a helix bundle which formed the right side of the groove (H1, H2, and H5). The core of the large subdomain contained a five-stranded parallel β sheet (S1, S2, S3, S4, and S5) which was packed on both sides with α helices (H2 and H6). The N-terminus of the metal-free D132N structure was more ordered than that of the metal-bound structure. Only the first three amino acids were not seen in the electron density in the metal-free structure. In the metal-bound structure, residue 12 was the first residue observed with good electron density. The missing 11 residues in the metal-bound structure are denoted by black dots in Figure 4.8B. The most striking feature of the large subdomain was the ordered helical bridge region at residues 89-97 (H4). Ordered electron density was present for this region in both the metal-free native and D132N mutant RNase H structures (see Figures 4.2B and 4.5B). In the metal-bound T4 RNase H structure, residues 89-97 showed disorder and could not be modeled accurately. The missing nine residues, denoted by black dots in Figure 4.8B, are shown spanning the central groove over the active site region. By examining the monoclinic P21 crystal packing of the metal-free D132N RNase H, it was observed that the bridge region helices (H4 and H5) were not involved in any crystal lattice contacts that may have contributed to the increased structural order.

The smaller subdomain (colored light green, yellow, and orange) contained an α-helical bundle (H7, H8, H9, and H10) and a pair of antiparallel β strands (S6 and S7) that formed the left side of the large central groove. As in the metal-bound structure of the enzyme, a well ordered hydrophobic pocket was present at the interface between the small subdomain and the large subdomain near the lower portion of the ordered bridge region (H3). The base of the large central groove was formed near the interface of the helical bundle (H7) of the small subdomain and the α/β core of the large subdomain (H6 and S5). Structural order was observed at the base of the large groove in the metal-free structure prior to H7. In the metal-bound structure, some disorder was observed between S6 and H7 (residues 181-182, denoted as black dots in Figure 4.8B) near the bottom of the large central groove. The C-terminal region (colored dark orange and red) was formed by one α helix (H11) that extended from H10 across the back of the structure (below H6 of the large subdomain) to form an antiparallel helix-loop-helix (H12 and H13) conformation that packed against the outer surfaces of helices H2 and H5. See Figure 4.9 for a summary of the major structural differences between the metal-free D132N and the metal-bound native T4 RNase H.

Figure 4.9: Major structural differences between metal-free and metal-bound T4 RNase H. A: Metal-free D132N T4 RNase H ribbon structure with the missing three residues at the N-Terminus. B: Metal-bound native T4 RNase H ribbon structure with labeled regions of structural disorder. This figure was generated with the programs MOLSCRIPT and RENDER using Raster3D (Kraulis, 1991; Merritt and Bacon, 1997).

4.3.2 Central Groove and Large Subdomain Structure

Structural analysis of the large central groove and the large subdomain of both the metal-bound native and the metal-free D132N mutant enzymes clearly showed more intrinsic structural order in the absence of divalent metals. The active site region of the

T4 RNase H is located in the large central groove and is oriented directly below the bridge region (residues 89-97) of the large subdomain. The active site of the metal-bound T4 RNase H consisted of a clustering of conserved acidic residues surrounded by two magnesium ions (Mg1 and Mg2 in Figure 4.8B) at the base of the central groove (Mueser et al., 1996). In the metal-bound structure, the strong negative charge contributed by the cluster of active site acidic residues was partially neutralized by the two magnesium ions. Interestingly, the large subdomain bridge region (residues

89-97) was disordered in the metal-bound T4 RNase H crystal structure (see Figure

4.10B). The bridge region residues 89-97 contained several positively charged basic residues (R90, K92, and R94) that have been proposed to be involved in binding to DNA substrates. In contrast to the metal-bound T4 RNase H structure, the metal-free D132N

T4 RNase H crystal structure contained an ordered helical bridge region (see Figure

4.10A). The ordered helical bridge region showed residues R90 and R94 extending down directly over the active site region. In the absence of bound divalent metals in the active site, the ordered basic residues in the bridge region seemed to compensate for the abundance of negative charge contributed from the conserved acidic residues. The increased structural order of the bridge region in the metal-free D132N T4 RNase H resulted in salt bridging interactions between positively charged bridge residues and active site carboxylate residues (see Figure 4.10A and B). In particular, bridge region residue R90 was observed to be in a salt bridge interaction with D200. In the

metal-bound structure, D200 was shown to be in outer sphere coordination through a water molecule to Mg2 in the active site. Also in the metal-free structure, residue K87 was shown to be involved in a salt bridge interaction with E130. In the metal-bound form of the enzyme, K87 was involved in a salt bridge interaction with D200. Lastly, the side chain of residue R94 was well ordered in the metal-free D132N structure and extended downward from the bridge region adjacent to W101 directly above the acidic residue D71 in the active site. The structural differences of the bridge region in the absence (ordered) or presence (disordered) of divalent metal ions may have important consequences for the binding of DNA substrate at the active site.

In addition to increased structural order of bridge region residues 89-97 in the large subdomain, the metal-free D132N T4 RNase H was also more ordered at a region below the active site.

Figure 4.10: Bridge region of metal-free and metal-bound T4 RNase H. A: Metal-free D132N T4 RNase H structure of the ordered helical bridge region near the active site. B: Metal-bound native T4 RNase H structure of the disordered bridge region near the active site. All active site acidic residues are colored in red (N132 colored in purple), all basic residues are colored in blue, and all hydrophobic residues are colored in green in the respective structures. This figure was generated with the programs MOLSCRIPT and RENDER using Raster3D (Kraulis, 1991; Merritt and Bacon, 1997).

A disordered turn was present at residues 181-182 in the metal-bound form of the T4

RNase H. The increased structural order below the active site in the metal-free structure may have been related to a reorientation of conserved acidic active site residues. In the absence of bound magnesium, the acidic residues D155 and D157 of the active site are repelled into a different position. The side chains of both D155 and D157 were repositioned from their orientation in the metal-bound structure in which they were outer sphere coordinated through water molecules to metals Mg1 and Mg2, respectively. In particular, D155 was involved in a salt bridging interaction with H174 (discussed below).

The repositioning of these acidic residues in the lower portion of the active site region in the metal-free D132N structure may have contributed to overall conformational stability in the loop region at the base of the large central groove.

4.3.3 Active Site Structure

Structural analysis of the metal-free D132N T4 RNase H active site region showed that one of the catalytic residues was reoriented in comparison to the metal-bound form of the enzyme. In the magnesium-bound form of the T4 RNase H enzyme, Mg1 was inner sphere coordinated to the side chain of D132 and outer-sphere coordinated through an extensive network of water molecules to the side chains of D19,

D71, and D155 (see Figure 4.11B). Site-directed mutagenesis studies have shown that the Mg1 magnesium ion is indeed the catalytic site of the T4 RNase H enzyme. The aspartic acid residues (D132, D19, D71, and D155) were each mutated to an asparagine (N) and their respective activities were studied. The mutants D132N, D19N,

D71N, and D155N completely lost their exonuclease activity (Bhagwat et al., 1997c).

Thus, in the absence of bound magnesium ions, the catalytic site residues remain

essentially in place with the exception of residue D155 (see Figure 4.11A). Specifically, residue D155 was shown to be reoriented and stabilized by the formation of a salt bridging interaction with H174. As was mentioned previously, repositioning of active site residues in the metal-free D132N structure may have contributed to conformational stability in the loop region at the base of the large central groove. It is possible that the formation of the salt bridging interaction between the catalytic residue D155 and H174 is contributing to the more ordered and rigid structure of the metal-free D132N T4 RNase H at the base of the large central groove.

Figure 4.11: Active site region of metal-free and metal-bound T4 RNase H. A: Metal-free D132N T4 RNase H structure of the residues at the catalytic active site. B: Metal-bound native T4 RNase H structure of the residues at the catalytic site (Mg1) in the presence of two bound magnesium ions (shown in gold). All active site acidic residues are colored in red (N132 colored in purple), and all basic residues are colored in blue in the respective structures. This figure was generated with the programs MOLSCRIPT and RENDER using Raster3D (Kraulis, 1991; Merritt and Bacon, 1997).

4.3.4 Related Enzyme Structure Comparison

T4 RNase H is a member of the RAD2 family of prokaryotic and eukaryotic replication and repair nucleases. The prokaryotic nucleases include both the smaller bacteriophage T5 D15 and T7 gene 6 exonucleases and the 5’ to 3’ exonuclease domains of DNA polymerases from bacteria such as E. coli and Thermus aquaticus (Taq). The

archaeal flap endonuclease-1 (FEN-1) enzymes from both the euryarchaea and the crenarchaea are more closely related to the eukaryotic nucleases (murine FEN-1, human FEN-1, Schizosaccharomyces pombe RAD2, and Saccharomyces cerevisiae RAD27 enzymes) (Liu et al., 2004). Like T4 RNase H, many of these enzymes in the RAD2 family are involved in removing RNA primer fragments during lagging strand DNA replication. To date, the X-ray crystal structures of nine enzymes in this family have been determined: one from a prokaryotic source, the N-terminal 5’ to 3’ exonuclease domain of Thermus aquaticus (Taq) polymerase (1TAQ, 2.40 Å resolution) (Kim et al., 1995); two from bacteriophage, the T4 RNase H (1TFR, 2.1 Å resolution) (Mueser et al., 1996) and the T5 5’ to 3’ exonuclease (1EXN, 2.50 Å resolution) (Ceska et al., 1996), (1XO1, 2.50 Å resolution) (Garforth et al., 1999), (1UT5, 1UT8, 2.75 Å) (Feng et al., 2004); four from euryarchaeal organisms, the Pyrococcus furiosus (Pfu) flap endonuclease-1 (1B43, 2.00 Å resolution) (Hosfield et al., 1998b), the Methanococcus jannaschii flap endonuclease-1 (1A76 and 1A77, 2.00 Å resolution) (Hwang et al., 1998), the Pyrococcus horikoshii flap endonuclease-1 (1MC8, 3.10 Å resolution) (Matsui et al., 2002), and the Archaeoglobus fulgidus flap endonuclease-1 complexed to 3’ flap DNA (1RXW, 2.00 Å) (Chapados et al., 2004); one from the crenarchaeal organisms, the Aeropyrum pernix flap endonuclease-1 (1.4 Å) (see Chapter 2); and one from a eukaryotic source, the human flap endonuclease-1 complexed to the homotrimeric human PCNA (1UL1, 2.90 Å) (Sakurai et al., 2004).

It has been shown that the T4 RNase H has a significant sequence similarity to the bacteriophage T5 D15 and T7 gene 6 exonucleases and to the N-terminal 5’ to 3’ nuclease domains of DNA Polymerase I from different bacterial species (Hollingsworth and Nossal, 1991; Mueser et al., 1996; Bhagwat et al., 1997c). The X-ray crystal

structures of both the metal-free D132N and metal-bound native T4 RNase H are shown in comparison to the 5’ to 3’ exonuclease domain (residues 10-291) of Thermus aquaticus (Taq) polymerase and the T5 5’ to 3’ exonuclease in Figure 4.12.

Figure 4.12: Comparison of metal-free and metal-bound T4 RNase H to related structures in the RAD2 family of nucleases. A: Ribbon diagram of the metal-free D132N T4 RNase H showing an ordered bridge conformation. B: Ribbon diagram of metal-bound native T4 RNase H showing a disordered bridge region and a region of disorder towards the bottom of the large central groove. Two bound magnesium ions are shown in gold. PDB 1TFR, (Mueser et al., 1996). C: Ribbon diagram of the N-terminal 5’ to 3’ exonuclease domain of Taq polymerase in the presence of a bound zinc ion (shown in gold) in the active site. A disordered bridge region (residues 69-85) is shown in the structure. PDB 1TAQ, (Kim et al., 1995). D: Ribbon diagram of the bacteriophage T5 5’ to 3’ exonuclease crystallized in the absence of divalent cations in the active site displays an ordered bridge region/helical arch. PDB 1EXN, (Ceska et al., 1996). All ribbon diagrams can be traced from the N-terminus (blue) to the C-terminus (red). This figure was generated with the programs MOLSCRIPT and RENDER using Raster3D (Kraulis, 1991; Merritt and Bacon, 1997).

Both the structure of the zinc bound 5’ to 3’ exonuclease domain of Thermus aquaticus

(Taq) polymerase and the metal-free T5 5’ to 3’ exonuclease display a very similar overall conformation in comparison to the T4 RNase H. Both related enzymes show a large subdomain that contains the α/β secondary structural features of the T4 RNase H structures. Also, the central groove containing the active site residues is located between a smaller subdomain on the left and a larger subdomain on the right in both of the related enzymes, as was observed in the metal-free and metal-bound T4 RNase H structures.

Interestingly, the 5’ to 3’ exonuclease domain of Taq polymerase in the presence of a bound zinc ion in the active site shows disorder in the bridge region. The disorder in the bridge region directly above the active site region is similar to the T4 RNase H structure in the presence of two divalent magnesium ions bound in the active site. However, the structure of the T5 5’ to 3’ exonuclease in the absence of divalent cations displays a more ordered bridge region or helical arch directly over the active site region. The ordered helical arch structure is very similar to the ordered bridge structure in the metal-free

D132N T4 RNase H structure. Overall, the structures display very similar active sites but show a wide variability in structural conformation in the bridge region directly over the active site. The structures show an open or more disordered bridge region in the presence of divalent metals and a more ordered or closed bridge region in the absence of bound divalent metals.

4.3.5 Related Enzyme Bridge Region Comparison and DNA Binding Implications

The structural comparison of the metal-free D132N and metal-bound native T4

RNase H with the closely related family of the RAD2 nucleases showed that major structural differences were observed in the bridge region located directly above the active

sites of the respective enzymes. A more disordered or open bridge structure was observed in the presence of bound divalent metals in the active site while a more ordered or closed bridge structure was observed in the absence of bound divalent metals. This structural variability in the bridge region above the active sites of the related enzymes in the RAD2 family of nucleases may be important for nucleic acid substrate binding in the active site. The D132N T4 RNase H and T5 5’ to 3’ exonuclease structures in the absence of bound divalent metal ions possessed ordered bridge regions characterized by basic amino acid side chains extending downward from the bridge toward the acidic active site. These ordered basic residues are most likely compensating for the abundance of negative charge contributed from the conserved acidic residues in the absence of bound metal ions. In contrast, bridge region basic amino acid side chains in the metal-bound structures of both the native T4 RNase H and the 5’ to 3’ exonuclease domain of Taq polymerase were either completely or partially disordered. The disordered region (residues 69-85) of the 5’ to 3’ exonuclease domain of Taq polymerase contained two lysine and two arginine residues that are most likely involved in substrate recognition based on mutational studies of the 5’ nuclease domain of E. coli DNA polymerase I (Xu et al., 2001). The basic and hydrophobic residues proximal to the active site residues of both the metal-free D132N and metal-bound native T4 RNase H are shown in comparison to those of the 5’ to 3’ exonuclease domain of Taq polymerase and the T5 5’ to 3’ exonuclease in Figure 4.13. Similar to what was observed in the metal-free D132N T4 RNase H, the T5 5’ to 3’ exonuclease bridge structure showed four ordered basic residues (K83, R86, K89, and R93) extended downward from the helical arch toward the active site residues. Like the T4 RNase H, the T5 5’ to 3’ exonuclease has both exonuclease and flap endonuclease activity (Feng et al., 2004).

Figure 4.13: Comparison of the metal-free and metal-bound T4 RNase H active site and bridge regions to related structures in the RAD2 family of nucleases. A: Ribbon diagram of the metal-free D132N T4 RNase H. Active site residues D19, D71, E130, D155, D157, and D200 are shown in red (except N132 in purple). Basic residues K87, R90, R94, H174, K192, and K199 proximal to the active site are shown in blue. Hydrophobic residues L25, L29, W99, W101, and F105 near the bridge region are shown in green. B: Ribbon diagram of metal-bound native T4 RNase H. Two bound magnesium ions are shown in gold. Active site residues are labeled as in (A) (except D132 in red). Basic residues K87, H174, K192, and K199 proximal to the active site are shown in blue. Hydrophobic residues L25, L29, W99, W101, and F105 near the bridge region are shown in green. PDB 1TFR, (Mueser et al., 1996). C: Ribbon diagram of the metal-bound N-terminal 5’ to 3’ exonuclease domain of Taq polymerase. A bound zinc ion is shown in gold. Active site residues D18, D67, E117, D119, D142, D144, D188, and E189 are shown in red. Basic residues H20, H21, R85, and R183 proximal to the active site are shown in blue. Hydrophobic residues Y24, F27, and F92 near the bridge region are shown in green. PDB 1TAQ, (Kim et al., 1995). D: Ribbon diagram of the metal-free bacteriophage T5 5’ to 3’ exonuclease. Active site residues D26, D68, E128, D130, D131, D153, D155, D201, and D204 are shown in red. Basic residues K83, R86, K89, R93, and K196 proximal to the active site are shown in blue. Hydrophobic residues F32, Y90, F104, and F105 near the bridge region are shown in green. PDB 1EXN, (Ceska et al., 1996). This figure was generated with the programs MOLSCRIPT and RENDER using Raster3D (Kraulis, 1991; Merritt and Bacon, 1997).

The ordered helical arch or bridge region structure suggests a possible structural mechanism for the recognition of flap substrates in which the flap threads through or under the ordered bridge region (Ceska et al., 1996; Bhagwat et al., 1997c). Figure 4.13 also shows that all of the related enzymes displayed a small hydrophobic region in the proximity of the bridge region. Together, electrostatic interactions (bridge region basic residues and the phosphate backbone of the DNA substrate) and nucleotide base stacking interactions (bridge region hydrophobic residues and the nucleotide bases of the DNA substrate) could potentially localize a flap DNA substrate over the active site residues in a proper orientation for cleavage.

It was initially speculated that the disordered bridge region (residues 89-97 containing several basic amino acids) in the metal-bound structure of the native T4 RNase H would adopt a more rigid conformation upon the binding of nucleic acid substrate (Mueser et al., 1996). It has since been shown that basic residues in the bridge region above the active site of T4 RNase H are involved in substrate binding and recognition of flap DNA (Bhagwat et al., 1997c). Thus, it is interesting that basic residues in the bridge region of the D132N T4 RNase H (R90 and R94) were more structurally ordered in the absence of both divalent cations and nucleic acid substrate. This ordered bridge region therefore suggests a possible conformation of the enzyme in the presence of bound fork or flap DNA. Both ordered bridge structures in the T4 and T5 enzymes could potentially accommodate a threaded single-stranded DNA substrate over the active site and under the bridge region. However, it appears that the central groove containing the active site residues is too narrow for the binding of an RNA-DNA or DNA-DNA duplex. It has been shown that the binding of divalent cations in the active site of T4 RNase H results in conformational changes in both the lower portion of the active site in the central groove and in the bridge region of the large subdomain.

Therefore, it can be hypothesized that the binding of divalent metal ions in the active site may result in large conformational changes allowing an opening or loosening of the overall structure of the enzyme. The central groove containing the active site may then be wide enough to incorporate nucleic acid substrate (both duplex and flap) so that the scissile phosphodiester bond is located close to the position of the catalytic magnesium ion (Mg1 in the native T4 RNase H). If a flap nucleic acid substrate were bound, basic residues in the bridge region could then bind to the backbone phosphates on the single-stranded flap. In addition, hydrophobic residues in the bridge region could participate in base stacking interactions, further helping to position and orient a single-stranded flap substrate for cleavage. It can be suggested that the above structural features of both the metal-free and metal-bound T4 RNase H structures contribute to substrate binding.

An ordered or closed bridge region (metal-free structure) in combination with a less ordered or open active site central groove (metal-bound structure) could potentially be important to the correct binding and orientation of nucleic acid substrate.

Co-crystallization studies of both enzyme and nucleic acid substrate (both duplex and flap) will be needed to gain a more complete understanding of the structure specific binding of nucleic acid substrate and its directionality in the active site central groove of the T4 RNase H.

4.4 D132N Bacteriophage T4 RNase H with DNA Substrate

Protein purification, crystallization, and crystal harvesting of the D132N bacteriophage T4 RNase H in the presence of a fork DNA substrate were previously completed by other students in the laboratory. The fork DNA substrate consisted of a 12 base pair duplex with an associated 6 nucleotide 5’ single stranded arm and a 12 nucleotide 3’ single stranded arm (as shown in Figure 4.14) and was used for co-crystallization with the catalytically inactive D132N T4 RNase H mutant (described in sections 4.2 and 4.3). The effort for this thesis work was to assist in the data collection, data processing, structure determination, and initial model building and refinement of the D132N bacteriophage T4 RNase H in the presence of a fork DNA substrate. The structure of the D132N T4 RNase H/fork DNA complex will be completed by others in the group.

Figure 4.14: Fork DNA substrate used for co-crystallization with the D132N T4 RNase H. The fork DNA substrate consisted of a 12 base pair duplex with an associated 6 nucleotide 5’ single stranded arm and a 12 nucleotide 3’ single stranded arm.

4.4.1 X-ray Diffraction Data Collection

Multiple data sets were collected in-house (Ohio Macromolecular Crystallography Consortium (OMCC), University of Toledo, Toledo, OH, USA) using a Rigaku FRE High Brilliance X-ray Generator with a Saturn 92 CCD detector. The first data set showed diffraction to 3.2 Å resolution (see Figure 4.15A for an X-ray diffraction image) and was collected using a 1.54 Å wavelength, a crystal-to-detector distance of 70 mm, 0.5° oscillations, and 10 second exposures for 360 frames at -173 °C. This 3.2 Å data set was initially used in molecular replacement for structure determination of the RNase H and fork DNA complex. The best data set diffracted to 3.0 Å resolution (see Figure 4.15B for an X-ray diffraction image) and was collected using a 1.54 Å wavelength, a crystal-to-detector distance of 60 mm, 0.5° oscillations, and 20 second exposures for 229 frames at -173 °C.

Figure 4.15: X-ray diffraction images of crystals of the D132N T4 RNase H in the presence of fork DNA. A: X-ray diffraction image of the 3.2 Å resolution data set. B: X-ray diffraction image of the 3.0 Å resolution data set. Data were collected in-house using a Rigaku FRE High Brilliance X-ray Generator with a Saturn 92 CCD detector.

Both of the data sets had a high mosaicity which was estimated to be ~1.4-2.0° following integration. Mosaicity is a measure of the degree of misorientation of the small mosaic blocks that compose a protein crystal. A high mosaicity will cause the diffraction by a particular reflection to be spread over a range of crystal rotation during data collection.

The lunes of the diffraction pattern become wider because there are more partial reflections measured in a diffraction image (Dauter, 1999). As a result, no clearly visible borders are present in the diffraction pattern as reflections fade out gradually across a lune, further complicating data processing.
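As a rough illustration of why a high mosaicity produces many partial reflections, the following minimal Python sketch estimates a lower bound on the number of consecutive frames over which a single reflection is spread. It assumes, as a simplification not made in the text, that the mosaicity alone sets the angular width of a reflection (beam divergence and other contributions are ignored). The function name is hypothetical.

```python
import math

def min_frames_spanned(mosaicity_deg: float, oscillation_deg: float) -> int:
    """Lower bound on the number of consecutive frames one reflection touches.

    Simplifying assumption: the reflection's angular width equals the mosaicity.
    """
    return math.ceil(mosaicity_deg / oscillation_deg)

# Mosaicity range estimated after integration, with the 0.5 degree oscillations used here.
for mosaicity in (1.4, 2.0):
    frames = min_frames_spanned(mosaicity, 0.5)
    print(f"mosaicity {mosaicity} deg -> at least {frames} frames per reflection")
```

Under this assumption every reflection is recorded as a partial on several adjacent images, which is consistent with the widened lunes described above.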

For each data set, an appropriate rotation range for the data collection was determined so that all unique reflections were measured at least once. Because of the orthorhombic symmetry of the D132N RNase H/fork DNA complex crystals, 90 degrees of data were needed to collect all unique reflections at least once. However, an additional 90 degrees (for the 3.2 Å data set) and 25 degrees (for the 3.0 Å data set) of rotation were collected, resulting in multiple measurements of equivalent reflections along the principal axis (see Chapter 2, sections 2.1 and 2.2 for discussion). A higher redundancy (multiple measurements of equivalent reflections) in an X-ray diffraction data set will lead to more accurate data (Dauter, 1999).
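The rotation ranges quoted above follow directly from the frame counts and oscillation width given in section 4.4.1. The short Python sketch below reproduces that arithmetic; the function and constant names are illustrative only.

```python
MIN_ROTATION_DEG = 90.0  # minimum range stated in the text for these orthorhombic crystals

def total_rotation(frames: int, oscillation_deg: float) -> float:
    """Total crystal rotation covered by a data set."""
    return frames * oscillation_deg

# Frame counts and oscillation width as reported for the two data sets.
for label, frames in (("3.2 A data set", 360), ("3.0 A data set", 229)):
    total = total_rotation(frames, 0.5)
    extra = total - MIN_ROTATION_DEG
    print(f"{label}: {total} deg collected, {extra} deg beyond the 90 deg minimum")
```

The 3.0 Å data set covers 114.5 degrees, so the additional rotation is 24.5 degrees, which the text rounds to 25 degrees.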

4.4.2 Data Processing

The first X-ray diffraction data set of the D132N RNase H with fork DNA was indexed, integrated, and scaled to 3.2 Å resolution. The data were integrated using DENZO and merged using SCALEPACK from the HKL2000 software (Otwinowski and Minor, 1997). The D132N RNase H/fork DNA complex crystals belong to the orthorhombic space group P212121 with unit cell dimensions a = 68.9, b = 85.8, c = 89.4 Å with α = β = γ = 90°. Calculation of the Matthews’ coefficient (VM = 3.72 Å³ Da⁻¹) indicated that there was one monomer per asymmetric unit in the crystal (Matthews, 1968). The data had an overall Rmerge of 9.0% with a completeness of 99.9%. The Rmerge value for the highest resolution bin (3.2 Å) was 21.4% with an intensity/error (I/σ) value of 10.1. The final scale file was then used to create a CNS style and a CCP4 style data file (h k l, Fobs, and Sigma Fobs), both containing the same Rfree random data subset of 5%, for use in the refinement software suites Crystallography & NMR System (CNS) (Brunger, 1992; Brunger et al., 1998) and CCP4 (Collaborative Computational Project, 1994), respectively.
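The Matthews coefficient calculation can be sketched in a few lines of Python. The unit cell dimensions and the reported VM are taken from the text; Z = 4 is the number of asymmetric units in a P212121 cell. The molecular mass used in the original calculation is not stated here, so the sketch back-calculates the mass implied by VM = 3.72 Å³ Da⁻¹ rather than assuming one. The solvent fraction uses the common approximation 1 - 1.23/VM, which is an addition for illustration.

```python
def orthorhombic_cell_volume(a: float, b: float, c: float) -> float:
    """Unit cell volume in cubic angstroms for alpha = beta = gamma = 90 degrees."""
    return a * b * c

def matthews_coefficient(cell_volume: float, z: int, mass_da: float) -> float:
    """V_M in A^3/Da: cell volume divided by the total molecular mass in the cell."""
    return cell_volume / (z * mass_da)

def solvent_fraction(vm: float) -> float:
    """Approximate solvent fraction from V_M (common 1.23 A^3/Da protein estimate)."""
    return 1.0 - 1.23 / vm

volume = orthorhombic_cell_volume(68.9, 85.8, 89.4)
Z = 4              # asymmetric units per P212121 unit cell, one monomer each
VM_REPORTED = 3.72  # value reported in the text

# Mass per asymmetric unit implied by the reported V_M (back-calculated, not from the text).
implied_mass = volume / (Z * VM_REPORTED)
print(f"cell volume: {volume:.0f} A^3")
print(f"implied mass per asymmetric unit: {implied_mass:.0f} Da")
print(f"approximate solvent fraction: {solvent_fraction(VM_REPORTED):.2f}")
```

The implied mass of roughly 35.5 kDa suggests that the reported coefficient was computed for one protein monomer per asymmetric unit.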

The best data set collected diffracted to 3.0 Å. The data were indexed, integrated, and scaled to 3.0 Å resolution. The data were integrated using DENZO and merged using SCALEPACK from the HKL2000 software (Otwinowski and Minor, 1997). These D132N RNase H/fork DNA complex data belong to the same orthorhombic space group P212121, with identical unit cell dimensions a = 68.9, b = 85.8, c = 89.4 Å with α = β = γ = 90°. The data had an overall Rmerge of 9.5% with a completeness of 99.3%. The Rmerge value for the highest resolution bin (3.0 Å) was 29.2% with an intensity/error (I/σ) value of 4.8. See Table 4.6 for all D132N RNase H/fork DNA complex crystallographic data. The final scale file was then used to create a CNS style and a CCP4 style data file (h k l, Fobs, and Sigma Fobs), both containing the same Rfree random data subset of 5%, for use in the refinement software suites Crystallography & NMR System (CNS) (Brunger, 1992; Brunger et al., 1998) and CCP4 (Collaborative Computational Project, 1994), respectively.

Table 4.6: D132N T4 RNase H/fork DNA complex crystallographic data.

Lattice type            Primitive orthorhombic
Space group             P212121
Asymmetric unit         One molecule
Cell dimensions         a = 68.9, b = 85.8, c = 89.4 Å, α = β = γ = 90°

                        Data Set #1     Data Set #2
Resolution              3.2 Å           3.0 Å
Rmerge (a)              9.0%            9.5%
Observed reflections    64,157          44,620
Unique reflections      9,174           11,112
Completeness            99.9%           99.3%

(a) $R_{merge} = 100 \times \left[ \sum_{hkl} \sum_{i=1}^{n} \left| F^{2}(hkl) - F^{2}(hkl)_{i} \right| \right] / \sum_{hkl} \sum_{i=1}^{n} F^{2}(hkl)$, where $F^{2}(hkl)$ is the intensity of the hkl reflection and $F^{2}(hkl)_{i}$ is the mean value of i multiple measurements of the n equivalent reflections.
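The Rmerge definition in the table footnote and the redundancy of each data set can be illustrated with a minimal Python sketch. The intensities in `toy` are invented purely to exercise the formula and are not measured data. The reflection counts are those listed in Table 4.6. Intensities are written as I rather than F² for brevity.

```python
from statistics import mean

def r_merge(measurements: dict) -> float:
    """R_merge in percent.

    measurements maps each unique hkl to the list of intensities measured
    for its symmetry-equivalent reflections.
    """
    numerator = 0.0
    denominator = 0.0
    for intensities in measurements.values():
        average = mean(intensities)
        numerator += sum(abs(i - average) for i in intensities)
        denominator += sum(intensities)
    return 100.0 * numerator / denominator

def redundancy(observed: int, unique: int) -> float:
    """Average number of measurements per unique reflection."""
    return observed / unique

# Hypothetical intensities, for illustration only.
toy = {(1, 0, 0): [100.0, 110.0], (0, 1, 0): [50.0, 50.0]}
print(f"toy R_merge: {r_merge(toy):.2f}%")

# Reflection counts from Table 4.6.
print(f"Data Set #1 redundancy: {redundancy(64157, 9174):.1f}")
print(f"Data Set #2 redundancy: {redundancy(44620, 11112):.1f}")
```

The redundancies of approximately 7 and 4 are consistent with the larger rotation range collected for the 3.2 Å data set.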

4.4.3 Structure Determination

Using the 3.2 Å data (Data Set #1 in Table 4.6), initial phases were obtained by molecular replacement using AMoRe (Navaza and Saludjian, 1997) with the structure of the metal-free D132N T4 RNase H lacking a portion of the bridge region as the search model (see Chapter 4, Figure 4.8A for the metal-free ribbon model of the D132N T4 RNase H). A portion of the ordered bridge region in the D132N RNase H (residues 87-101 located in H4 and H5 in Figure 4.8A) was removed because it was previously shown that basic residues in the bridge region above the active site of T4 RNase H are involved in substrate binding and recognition of DNA (Bhagwat et al., 1997c). Therefore, any conformation of this region seen in difference electron density maps after DNA substrate binding would reflect the observed data and not the search model.

The molecular replacement rotation and translation search was done using one monomer of D132N RNase H (without bridge residues 87-101) from 9.0 to 3.5 Å resolution. Following the rotation and translation search, a rigid body refinement was completed to optimize the rotational orientation of the search model. The molecular replacement results for the space group P212121 gave an Rfactor of 48.5% and a correlation coefficient (CC) of 29.7%. Following the rigid body refinement, an output coordinate model for the D132N RNase H/fork DNA complex was generated. This coordinate model was used initially in refinement because the 3.0 Å (Data Set #2) data had not yet been collected.

Once the 3.0 Å data had been collected, molecular replacement was also completed as was described above. The molecular replacement results for the space group P212121 gave an Rfactor of 50.5% and a correlation coefficient (CC) of 35.1%. Rigid body refinement was used to calculate the best fit for the output coordinate model.

4.4.4 Initial Model Building and Refinement

Following molecular replacement, the output coordinate file generated from the rigid body refinement was used in the initial refinement of the 3.2 Å D132N RNase H/fork DNA complex. Refinement was completed using Refmac5, which was chosen to create the initial maps to verify that the fork DNA was bound to the D132N RNase H. Refmac5 refinement was completed from 40.0-3.2 Å resolution giving an Rvalue of 39.1% and an Rfree of 46.7%. Following Refmac5 refinement, Fobs-Fcalc and 2Fobs-Fcalc difference electron density maps were calculated (as described in Chapter 2, section 2.4). Prior to electron density map interpretation, the σ levels of all difference electron density maps were normalized using the MAPMAN program (Kleywegt and Jones, 1996). Molecular graphics was then used to complete the electron density interpretation and analysis using the O program (Jones et al., 1991). The electron density difference maps were examined at a variety of σ contour levels. The Fobs-Fcalc difference electron density map at a 2σ contour level clearly showed the bound fork DNA substrate. Remarkably, molecular replacement (using a protein-only search model) was successful in the initial phasing of the large fork DNA substrate bound to the D132N RNase H. However, further examination of the coordinate model of the D132N RNase H revealed a number of main chain breaks, which suggested that restrained, positional refinement was not able to account for all conformational changes in the protein model that reflected the observed data. In order to accommodate these conformational changes, the molecular replacement output coordinate file was refined using simulated annealing (CNS) (Brunger et al., 1997; Brunger and Rice, 1997).
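The Rvalue and Rfree statistics quoted above share one formula and differ only in which reflections are summed: Rfree uses the 5% subset withheld from refinement. The following Python sketch illustrates that relationship under simplifying assumptions (amplitudes already on a common scale, a random test set drawn with a fixed seed). The amplitudes are invented for illustration, and the function names are not those of Refmac5 or CNS.

```python
import random

def r_factor(f_obs: list, f_calc: list) -> float:
    """R in percent: sum of |Fobs - Fcalc| over sum of Fobs (same scale assumed)."""
    residual = sum(abs(o - c) for o, c in zip(f_obs, f_calc))
    return 100.0 * residual / sum(f_obs)

def select_free_set(n_reflections: int, fraction: float = 0.05, seed: int = 0) -> set:
    """Indices of a random test set excluded from refinement (used only for Rfree)."""
    rng = random.Random(seed)
    n_free = round(n_reflections * fraction)
    return set(rng.sample(range(n_reflections), n_free))

# Hypothetical amplitudes, for illustration only.
f_obs = [100.0, 80.0, 60.0, 40.0]
f_calc = [90.0, 85.0, 50.0, 45.0]
print(f"toy R: {r_factor(f_obs, f_calc):.1f}%")

# A 5% test set for the 9,174 unique reflections of Data Set #1.
free = select_free_set(9174)
print(f"test set size: {len(free)} reflections")
```

Because the test set reflections never drive the refinement, a large gap between Rvalue and Rfree, as seen at this early stage, indicates that the model is still incomplete.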

The first round of CNS simulated annealing refinement was completed at 3000 K with a slow cooling to 273 K using all data from 40.0-3.2 Å resolution yielding an Rvalue of 37.0% and an Rfree of 42.5%. Following all CNS refinements, Fobs-Fcalc and 2Fobs-Fcalc difference electron density maps were calculated. Prior to model building, the σ levels of all difference electron density maps were normalized using the MAPMAN program (Kleywegt and Jones, 1996). Molecular graphics was then used to complete the electron density interpretation and analysis using the O program (Jones et al., 1991). Analysis of the D132N RNase H coordinate model following simulated annealing refinement showed that no main chain breaks were present. Also, the Fobs-Fcalc electron density maps were slightly improved in the regions where the fork DNA substrate was present. Examination of the individual amino acid temperature B factors following minimization (B factor minimization range set from 1-200 during simulated annealing) revealed that a number of the B factor values were close to 1. These individual amino acid B factors were much lower than would be expected for a coordinate model at a resolution of 3.2 Å. Typically, a lower B factor is associated with a more ordered amino acid side chain or solvent atom in the molecule. The majority of the B factors may have been driven substantially lower due to the large scattering contribution of the fork DNA, which was unaccounted for by the D132N RNase H model alone. Understanding that each atom contributes to all observed reflections (discussed in section 4.1.3), it was determined that the artificially low B factor values may have caused some errors in the calculation of the electron density throughout the model, resulting in flattened electron density maps.
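To give a sense of why B factors near 1 are implausible at this resolution, the sketch below evaluates the standard isotropic temperature factor term, exp(-B (sin θ/λ)²), at the resolution limit, where sin θ/λ = 1/(2d). The B values of 1 and 15 correspond to the lower limits used in the refinements described in this section. The value of 60 is a hypothetical figure chosen only for comparison and does not come from the refinement.

```python
import math

def temperature_factor(b_factor: float, d_spacing: float) -> float:
    """Attenuation of an atom's scattering amplitude at resolution d (angstroms).

    Uses exp(-B * (sin(theta)/lambda)^2) with sin(theta)/lambda = 1 / (2 d).
    """
    return math.exp(-b_factor / (4.0 * d_spacing ** 2))

# B = 1 and B = 15 are the lower limits used in refinement; B = 60 is hypothetical.
for b in (1.0, 15.0, 60.0):
    print(f"B = {b:5.1f} A^2 -> amplitude factor at 3.2 A: {temperature_factor(b, 3.2):.3f}")
```

With B close to 1 the atoms scatter almost as if they were stationary, so the protein model contributes more strongly at high resolution than an atom with a more typical B factor would, which is consistent with the interpretation that the low values were compensating for the unmodeled DNA.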

Following molecular replacement with the 3.0 Å data (Data Set #2 in Table 4.6), which was discussed above, CNS simulated annealing refinement was completed at 3000 K with slow cooling to 273 K using all data from 40.0-3.0 Å resolution, yielding an Rvalue of 40.7% and an Rfree of 45.9%. Based on the previous observation that individual amino acid B factors were very low following minimization, a lower limit threshold (B factor minimization range set from 15-200 during simulated annealing) was established during individual B factor minimization. Following refinement, a majority of the individual amino acid B factors were near the lower limit established during minimization. It was anticipated that as the fork DNA substrate was modeled into the electron density, these atoms would account for the large scattering contribution that was missing during the initial refinements. As a result, more appropriate B factor values reflecting a 3.0 Å model would be observed. The resulting difference electron density maps (both Fobs-Fcalc and 2Fobs-Fcalc) displayed more connectivity in regions corresponding to both D132N RNase H and the fork DNA substrate. Also, additional density was present for both the 5’ and 3’ single stranded arms of the fork DNA substrate. Model building and refinement rounds were then begun by others in the group. The initial model building and refinement rounds were focused on positioning a duplex DNA fragment into the corresponding electron density. Following positioning of the majority of the duplex DNA, improvements were made to the protein portion of the model.

When the majority of the D132N RNase H/fork DNA model is in place in the electron density, a simulated annealing composite omit map (see Chapter 2, section 2.4 for discussion) will be calculated in order to remove any model bias remaining from the molecular replacement search model. In addition, a bulk solvent correction will be used to enhance the signal-to-noise ratio of the electron density difference maps for any missing parts of the model (see Chapter 2, section 2.4 for discussion). Once the majority of the model has been constructed, CCP4 Refmac5 restrained, positional refinement will be used to complete the model building and refinement process.

Chapter 5: Escherichia coli DNA-Binding Protein from Starved Cells

5.1 Expression and Purification

Expression of the E. coli Dps/PexB (DNA-binding protein from starved cells) protein was upregulated during large-scale fermentation expression of recombinant archaeal flap endonuclease-1 (FEN-1) enzymes, and the protein was later isolated during additional purification of these FEN-1 enzymes. As was discussed in Chapter 3, section 3.1.2, archaeal FEN-1 proteins from various thermophilic organisms were expressed and partially purified by our collaborators at Third Wave Technologies, Inc. in Madison, Wisconsin. Most preparations of these FEN-1 proteins were produced using small-scale (4x 1 L shaker flasks) and large-scale (20 L fermentor) fermentation. All expression of soluble archaeal FEN-1 protein was obtained from IPTG induction of BL21(DE3) E. coli bacterial host cells at 37 °C. The initial purification protocol of the respective archaeal FEN-1 proteins is described in Chapter 3, section 3.1.3. Following initial purification, the archaeal FEN-1 proteins were shipped on dry ice from Third Wave Technologies. All initial archaeal FEN-1 protein preparations were expressed using small-scale shaker flasks and the protein was received at high purity. However, later preparations were expressed by large-scale fermentation and were heavily contaminated following initial purification by Third Wave Technologies. An SDS-PAGE gel was run to assess the purity of the FEN-1 samples and is shown in Figure 5.1.

[Figure 5.1 gel image: lanes 1-12; molecular weight markers labeled at 36.5 kDa and 14.4 kDa; impurity band labeled at ~19 kDa]

Figure 5.1: SDS-PAGE gel following initial purification of archaeal FEN-1 proteins by Third Wave Technologies. Lanes 1 and 8 are molecular weight markers. Lanes 2 and 3 are samples of Archaeoglobus fulgidus (Afu) FEN-1 (MW: 38 kDa), lanes 4 and 5 are samples of Archaeoglobus veneficus (Ave) FEN-1 (MW: 38 kDa), lanes 6 and 7 are samples of Aeropyrum pernix (Ape) FEN-1 (MW: 40 kDa), lanes 9 and 10 are samples of Pyrococcus furiosus (Pfu) FEN-1 (MW: 38 kDa), and lanes 11 and 12 are samples of Thermococcus zilligii (Tzi) FEN-1 (MW: 40 kDa). The samples of the archaeal FEN-1 proteins shown here were expressed using large-scale fermentation prior to initial purification. A large impurity was present at approximately 19 kDa in all but one of the archaeal FEN-1 samples.

The respective archaeal FEN-1 proteins (Archaeoglobus fulgidus (Afu), Archaeoglobus veneficus (Ave), Aeropyrum pernix (Ape), Pyrococcus furiosus (Pfu), and Thermococcus zilligii (Tzi) FEN-1) were seen on the SDS-PAGE gel from approximately 38-40 kDa. In addition, a large band at approximately 19 kDa was present in all but one of the respective FEN-1 samples. In the Ape and Tzi FEN-1 samples, the impurity accounted for ~50% of protein material that was present. Interestingly, the 19 kDa band was not previously seen in other smaller scale archaeal FEN-1 preparations provided by

Third Wave Technologies. There was concern that the impurity was a truncation product of each respective archaeal FEN-1 enzyme, and was therefore of interest to Third Wave

Technologies, Inc. because they use the archaeal FEN-1 enzymes in their clinical assays.


Having shown that the major impurity was approximately 19 kDa on an

SDS-PAGE gel, a Superdex 75 size-exclusion HPLC column was used first during the additional purification of the respective FEN-1 enzymes (as discussed in Chapter 3, section 3.1.3). In all impure archaeal FEN-1 samples that were loaded on the Superdex

75 column, the impurity was surprisingly found in the void volume during elution. This indicated that the impurity was larger than the defined pore diameter of the bead and was therefore much larger than 19 kDa. A Superdex 75 chromatogram of a 1 mL injection of impure Ape FEN-1 protein solution is shown in Chapter 3, Figure 3.2. Two peaks (one at fraction 10 corresponding to the impurity and one at fractions 13, 14, and 15 corresponding to Ape FEN-1) were observed in the chromatogram by monitoring the absorbance at 280 nm. An SDS-PAGE gel was then run, stepping across the 280 nm peaks, to check the purity of the samples and is shown in Figure 5.2.

[Figure 5.2 gel image: lanes 1-6; molecular weight markers labeled at 36.5 kDa and 14.4 kDa; bands labeled Ape FEN-1 and Dps (~19 kDa)]

Figure 5.2: SDS-PAGE gel from Superdex 75 column purification of Ape FEN-1 and the Dps impurity. Lane 1 is the molecular weight marker and lanes 2-6 are Superdex 75 fractions 10, 12, 13, 14, and 15, respectively, as shown in the Superdex 75 chromatogram in Chapter 3, Figure 3.2. The Dps impurity (~19 kDa) was eluted in the void fraction 10 and was also seen in fraction 12. Ape FEN-1 (40 kDa) was eluted in the peak corresponding to fractions 13-15.

As shown in the SDS-PAGE gel, the isolated impurity was highly pure following size-exclusion chromatography purification of an Ape FEN-1 protein sample. The fractions containing the impurity were pooled and stored separately following each of the respective archaeal FEN-1 purifications. In particular, a high yield of the impurity was isolated from the Ape, Ave, and the Tzi FEN-1 samples.

5.2 Dynamic Light Scattering (DLS)

Dynamic light scattering (DLS) measurements were performed on the impurity to obtain an estimate of the molecular weight in solution. DLS is used to determine the amount of size heterogeneity (polydispersity) present in a protein solution based on the translational diffusion coefficient (DT) of a macromolecule undergoing Brownian motion in solution. DLS measurements analyze the time scale of the fluctuations in scattered light intensity by a mathematical process called autocorrelation. The translational diffusion coefficient of the molecules in the sample cell is calculated from the autocorrelation of the scattered light intensity data by performing a nonlinear least-squares fit of the autocorrelation coefficients to an exponential decay. The hydrodynamic radius (RH) is then derived from the translational diffusion coefficient using the Stokes-Einstein equation. Molecular weight can then be estimated from a standard calibration curve of molecular weight versus RH obtained from various globular proteins of known mass (Ferre-D'Amare and Burley, 1997; Wilson, 2003). The polydispersity (Cp) of the molecules in solution is defined with respect to the RH and can be interpreted as the standard deviation of the Gaussian size distribution in nanometers. Solutions are defined as monodisperse when the Cp is less than 15% of the RH, moderately polydisperse when Cp is less than 30% of the RH, and significantly polydisperse when Cp is greater than 30% of the RH (Moradian-Oldak et al., 1998).


Prior to DLS measurements, a solution of approximately 1 mg/mL of purified Dps protein from the Tzi FEN-1 sample was filtered using a Millipore Ultrafree-MC HV Centrifugal Filter™ with a 0.1 µm filter. Approximately 55 µL of protein sample was then injected into the sample cell and DLS measurements were taken at 21 °C using a Dyna-Pro-801 DLS instrument (Protein Solutions, Inc.). The impurity had a hydrodynamic radius of 4.97 nm and a 20.3% polydispersity (1.01 nm). The estimated molecular weight was 141.6 kDa, suggesting that the material was a homogeneous 7mer or 8mer complex. The estimated molecular weight from the DLS measurements was consistent with the size exclusion chromatography elution of the Dps fraction in the void volume during purification.
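The Stokes-Einstein relation and the polydispersity thresholds described above can be sketched in Python as follows. The sketch applies RH = kB·T / (6π·η·DT) and the Cp/RH cutoffs quoted in the text. The function names, the solvent viscosity (taken as that of pure water near 21 °C, roughly 0.98 mPa·s), and the example diffusion coefficient are assumptions made for illustration only; they are not taken from the instrument software, and the calibration curve used to convert RH into molecular weight is not reproduced here.

```python
import math

BOLTZMANN = 1.380649e-23  # J/K

def hydrodynamic_radius_nm(d_t, temp_k, viscosity):
    """Stokes-Einstein: R_H = k_B * T / (6 * pi * eta * D_T).

    d_t       translational diffusion coefficient in m^2/s
    temp_k    absolute temperature in K
    viscosity solvent viscosity in Pa*s
    Returns R_H in nanometers.
    """
    r_m = BOLTZMANN * temp_k / (6.0 * math.pi * viscosity * d_t)
    return r_m * 1e9

def classify_polydispersity(cp_nm, rh_nm):
    """Classify a sample using the Cp/RH thresholds quoted in the text."""
    ratio = cp_nm / rh_nm
    if ratio < 0.15:
        return "monodisperse"
    if ratio < 0.30:
        return "moderately polydisperse"
    return "significantly polydisperse"

# Hypothetical diffusion coefficient, chosen only to exercise the function.
ETA_WATER_21C = 0.98e-3  # Pa*s, assumed viscosity of water near 21 °C
rh = hydrodynamic_radius_nm(d_t=4.4e-11, temp_k=294.15, viscosity=ETA_WATER_21C)
print(f"R_H = {rh:.2f} nm")

# Values reported in the text for the impurity: RH = 4.97 nm, Cp = 1.01 nm.
print(classify_polydispersity(1.01, 4.97))
```

With the reported values, Cp is about 20% of RH, so the sample falls in the moderately polydisperse range under these thresholds.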

5.3 Crystallization

Due to the high yield of the yet unidentified impurity that was isolated from the

Ape, Ave, and the Tzi FEN-1 samples, sparse matrix crystallization trials were attempted in parallel with the FEN-1 proteins. Prior to crystallization trials, the protein was dialyzed into a buffer containing 50 mM Tris-HCl pH 7.5, 100 mM NH4Cl, 150 mM

NaCl, and 10 mM MgCl2 and was then concentrated to ~19 mg/ml. Sitting-drop vapor diffusion crystallization trials at 21 °C were then completed using a number of crystal screen kits with 1 µL of reservoir solution and 1 µL of protein sample. Many small single crystals were obtained in the initial screens with the impurity that was isolated from both the Ape and Tzi FEN-1 samples. Hanging-drop vapor diffusion expansions of the best crystal hit conditions were then set up using coarse gradients of 20-40% PEG

400, 100 mM Na HEPES pH 7.5, and 200 mM MgCl2 and 10-30% PEG 1000, 100 mM

Na cacodylate pH 6.5, and 200 mM MgCl2 at 21 °C using Costar 24-well™ trays. Many very small crystals were obtained throughout the expansion trays following these coarse expansions, especially towards the higher end of the respective PEG gradients. Next, several 1x24 and 2x12 format shallow gradients of 10-17% PEG 400, 100 mM

Na HEPES pH 7.5, and 200 mM MgCl2 and 10-14% PEG 1000, 100 mM Na cacodylate pH 6.5, and 200 mM MgCl2 were set up at 21 °C. All of the respective expansion trays were set up using 2 µL of reservoir solution and 2 µL of protein solution at a concentration of ~19 mg/mL. Examples of the impurity (Dps) crystals that were grown from the shallow gradient expansion trays are shown in Figure 5.3.


Figure 5.3: Shallow gradient expansion crystals at 21 °C of the impurity (DPS) from both the Ape and Tzi FEN-1 purifications. A: Impurity (DPS) crystal grown at ~17% PEG 400, 100 mM Na HEPES pH 7.5, and 200 mM MgCl2. This crystallized material was isolated from the Tzi FEN-1 sample. B: Impurity (DPS) crystal grown at ~14% PEG 1000, 100 mM Na cacodylate pH 6.5, and 200 mM MgCl2. This crystallized impurity was isolated from the Ape FEN-1 sample.

The crystals that were grown at ~17% PEG 400, 100 mM Na HEPES pH 7.5, and

200 mM MgCl2 (as shown in Figure 5.3A) were soaked momentarily in a substitute mother liquor containing a cryoprotectant and were then flash frozen in liquid nitrogen.

A substitute mother liquor solution of the following was made: 50 mM Tris-HCl pH 7.5,

100 mM Na HEPES pH 7.5, 100 mM NH4Cl, 150 mM NaCl, and 210 mM MgCl2.

PEG 400 was chosen as a cryoprotectant because the crystal growth condition contained


~17% PEG 400. It was therefore assumed that the crystals would be stable following an increase in the cryoprotectant concentration (17% to 30% PEG 400). The crystals were stable after being soaked momentarily into the substitute mother liquor containing 30%

PEG 400 and were then flash frozen in liquid nitrogen and stored until X-ray diffraction data collection was attempted. The crystals that were grown at ~14% PEG 1000,

100 mM Na cacodylate pH 6.5, and 200 mM MgCl2 (as shown in Figure 5.3B) were also soaked momentarily in substitute mother liquor containing a cryoprotectant prior to being flash frozen in liquid nitrogen. A substitute mother liquor solution of the following was made: 50 mM Tris-HCl pH 7.5, 100 mM Na cacodylate pH 6.5, 100 mM NH4Cl,

150 mM NaCl, and 210 mM MgCl2. The crystals were then soaked momentarily into the substitute mother liquor containing either a 30% PEG 400 or a 25% ethylene glycol solution. These crystals were stable in both of the cryoprotectant solutions, therefore a number of crystals from each respective cryoprotectant solution were flash frozen in liquid nitrogen and stored until X-ray diffraction data collection was attempted.

5.4 X-ray Diffraction Data Collection and Data Processing

Multiple data sets of the crystals of the Dps impurity were collected at both

BioCARS 14-BM-C (Argonne National Laboratories, Advanced Photon Source,

Chicago, IL, USA) using an ADSC Quantum 4 and a Quantum 315 CCD detector and in-house (Ohio Macromolecular Crystallography Consortium (OMCC), University of

Toledo, Toledo, OH, USA) using a Rigaku FRE High Brilliance X-ray Generator with a

Saturn 92 CCD detector. All data collected at the BioCARS 14-BM-C beamline were collected using a wavelength of 0.9 Å, and all data collected in-house were collected using a wavelength of 1.54 Å. Dps crystals grown from material isolated from Ape


FEN-1 (as shown in Figure 5.3B) diffracted poorly, as judged from multiple diffraction images exposed 90° apart. The diffraction patterns were streaked and did not extend to high resolution, thus no data were collected on any of these crystals. However, the Dps crystals that were grown from material isolated from the Tzi

FEN-1 (as shown in Figure 5.3A) diffracted to approximately 2 Å resolution. X-ray diffraction images of Dps crystals are shown in Figure 5.4A and 5.4B.


Figure 5.4: X-ray diffraction images of Dps crystals grown from material isolated from Tzi FEN-1. A: X-ray diffraction image of a Dps crystal that diffracts to approximately 2 Å resolution but shows a significant amount of diffuse scatter and high mosaicity (Data set #3 in Table 5.1). B: X-ray diffraction image of a Dps crystal that diffracts to approximately 2.5 Å resolution and was used for molecular replacement trials (Data set #1 in Table 5.1). The X-ray diffraction data were collected at BioCARS 14-BM-C (Argonne National Laboratories, Advanced Photon Source, Chicago, IL, USA).

A total of four Dps data sets were collected and are shown in Table 5.1. All of the data sets were integrated using DENZO and merged using SCALEPACK from the HKL2000 software (Otwinowski and Minor, 1997). A summary of the autoindexing results for Data set #1 is shown in Figure 5.5.


Table 5.1: Dps/PexB data sets collected.

Data set   Crystal    Resolution   Source/Date                      Detector
1          Dps/PexB   2.5 Å        APS BioCARS 14-BM-C, Aug 2003    ADSC Quantum 4 CCD
2          Dps/PexB   2.0 Å        APS BioCARS 14-BM-C, Nov 2003    ADSC Quantum 4 CCD
3          Dps/PexB   2.0 Å        APS BioCARS 14-BM-C, Apr 2004    ADSC Quantum 315 CCD
4          Dps/PexB   2.0 Å        OMCC In-house, Aug 2004          Rigaku Saturn 92 CCD

Figure 5.5: Summary of the Bravais lattice autoindexing results for Dps Data set #1. The lowest metric tensor distortion index indicates possible solutions.


Data set #1 in Table 5.1 was initially processed with the highest symmetry Bravais lattice (I4), which had the lowest unit cell distortion index. Due to the high Rmerge value, a statistical measure of the agreement among symmetry related reflections, additional data processing using the other possible Bravais lattices was completed. The data processing results for Data set #1 are shown in Table 5.2, where the results indicate that the body centered orthorhombic space groups (I222 or I212121) are most likely.

Table 5.2: Data processing results for Dps Data set #1. The autoindexing results for the respective Bravais lattices are shown in Figure 5.5.

Space Group   Rmerge
I4            42%
I222          6.1%
I212121       6.1%
F222          43%
C2            43%
P1            4%
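To make the comparison in Table 5.2 concrete, the conventional definition of Rmerge is the sum over all unique reflections of the absolute deviations of each observation from the mean intensity of its symmetry related group, divided by the sum of all observed intensities. The following Python sketch implements that definition. It is a simplified illustration and not the SCALEPACK implementation, and the intensities below are made-up numbers, not data from this work.

```python
def r_merge(groups):
    """Compute Rmerge from groups of symmetry-equivalent intensities.

    groups: iterable of lists; each list holds the repeated or symmetry related
    intensity measurements assigned to one unique reflection.
    """
    numerator = 0.0
    denominator = 0.0
    for intensities in groups:
        if len(intensities) < 2:
            continue  # a single observation carries no agreement information
        mean_i = sum(intensities) / len(intensities)
        numerator += sum(abs(i - mean_i) for i in intensities)
        denominator += sum(intensities)
    if denominator == 0:
        raise ValueError("no reflections with multiple observations")
    return numerator / denominator

# Hypothetical intensities for illustration only.
# Grouping under a correct symmetry: equivalents agree closely.
consistent = [[100.0, 104.0, 96.0], [50.0, 52.0], [10.0, 9.0, 11.0]]
# Grouping under an incorrect symmetry: unrelated reflections are merged.
inconsistent = [[100.0, 20.0, 60.0], [50.0, 5.0], [10.0, 40.0, 1.0]]
print(f"correct symmetry:   Rmerge = {r_merge(consistent):.3f}")
print(f"incorrect symmetry: Rmerge = {r_merge(inconsistent):.3f}")
```

When reflections that are not truly equivalent are merged, as happens when too high a symmetry is imposed, the deviations from the group mean grow and Rmerge rises sharply, which is the behavior seen for I4, F222, and C2 in Table 5.2.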

5.5 Characterization

5.5.1 N-terminal Protein Sequencing

While crystallization trials were being attempted, the impurity that was isolated from the Tzi FEN-1 sample was sent for N-terminal sequencing (Midwest Analytical, Inc., St. Louis, MO). The N-terminal sequencing results showed that the first five amino acids of the impurity were KATNL. Amino acid primary sequence analysis showed that none of the respective archaeal FEN-1 proteins that were provided by Third Wave Technologies, Inc. contained the sequence KATNL, which confirmed that the impurity was not a FEN-1 truncation product. However, following an E. coli genome search, the sequencing results indicated that the impurity was possibly an E. coli DNA protection protein known as Dps/PexB that is produced during starvation or stress conditions. The amino acid sequence of Dps/PexB is shown in Figure 5.6. Following examination of the Dps amino acid sequence, the N-terminal sequencing results suggested that the impurity was a truncated form of Dps/PexB that was missing the first nine amino acids at the N-terminus.

  1 MSTAKLVKSK ATNLLYTRND VSDSEKKATV ELLNRQVIQF IDLSLITKQA HWNMRGANFI
 61 AVHEMLDAFR TALIDHLDTM AERAVQLGGV ALGTTQVINS KTPLKSYPLD IHNVQDHLKE
121 LADRYAIVAN DVRKAIGEAK DDDTADILTA ASRDLDKFLW FIESNIE

Figure 5.6: Amino acid primary sequence of E. coli Dps/PexB. The amino acid sequence of E. coli Dps/PexB (1-167) is shown with the N-terminal sequencing results (KATNL) highlighted in yellow.
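The mapping of the N-terminal sequencing result onto the Dps sequence can be reproduced with a short Python sketch. The sequence is copied from Figure 5.6; the function name is hypothetical, and the search simply reports the first occurrence of the observed peptide.

```python
# E. coli Dps/PexB sequence as given in Figure 5.6 (167 residues).
DPS_SEQUENCE = (
    "MSTAKLVKSKATNLLYTRNDVSDSEKKATVELLNRQVIQFIDLSLITKQAHWNMRGANFI"
    "AVHEMLDAFRTALIDHLDTMAERAVQLGGVALGTTQVINSKTPLKSYPLDIHNVQDHLKE"
    "LADRYAIVANDVRKAIGEAKDDDTADILTAASRDLDKFLWFIESNIE"
)

def locate_n_terminus(full_sequence, observed_n_term):
    """Return (residues_missing, truncated_sequence) for an observed N-terminus.

    Raises ValueError if the observed peptide is absent from the sequence.
    """
    start = full_sequence.find(observed_n_term)
    if start < 0:
        raise ValueError(f"{observed_n_term!r} not found in sequence")
    return start, full_sequence[start:]

missing, truncated = locate_n_terminus(DPS_SEQUENCE, "KATNL")
print(f"full length: {len(DPS_SEQUENCE)} residues")
print(f"residues missing from the N-terminus: {missing}")
print(f"truncated length: {len(truncated)} residues")
```

The peptide KATNL begins at residue 10 of the full length sequence, so nine residues are absent from the N-terminus and the truncated protein is 158 residues long.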

The molecular weight of a full length Dps monomer (amino acids 1-167) is 18.5 kDa, whereas the molecular weight of Dps without the first nine amino acids was calculated to be approximately 17.7 kDa using the ExPASy ProtParam tool (Gill and von Hippel, 1989). Thus, a truncated form of the Dps protein was in agreement with the molecular weight observed for the impurity on SDS-PAGE (as shown in Figure 5.2). As discussed in Chapter 1, section 1.3, the X-ray crystal structure of Dps showed that monomers were associated in a spherical dodecameric structure with a diameter of approximately 85 Å. The molecular weight of the Dps dodecamer is approximately 220 kDa and would be expected to be approximately 210 kDa if the dodecameric structure was formed from truncated monomers. A larger Dps structure was in agreement with the void volume elution of the isolated impurity that was discussed in section 5.1. Also, DLS measurements confirmed that the impurity in solution was much larger than a single monomer of Dps.

5.5.2 Mass Spectroscopy

In order to confirm the identity of the isolated impurity, samples of the impurity were sent for Mass Spectroscopy (MS) analysis (The Michigan Proteome Consortium,

The University of Michigan Medical School, Ann Arbor, MI). MALDI TOF-TOF mass spectroscopy was used to determine the masses of the tryptic peptide fragments of the impurity. Database searching of the tryptic masses was then completed which confirmed that the impurity was E. coli Dps. The database searching of both the MS and MS/MS results of the impurity showed that the Confidence of the protein Identification (C. I.%) was 100% for E. coli Dps.

5.6 Structure Determination

After the impurity was identified as Dps, initial phasing of Data set #1 (Table 5.1) was attempted using both AMoRe (Navaza and Saludjian, 1997) and MOLREP (Vagin and Teplyakov, 1997) with the previously solved structure of E. coli Dps (Grant et al., 1998). Initially, Data set #1 was processed as I222 and I212121, respectively (as shown in Table 5.2). Both space groups had a Matthews’ coefficient of ~2 Å³ Da⁻¹, which indicated that there were two monomers in the asymmetric unit. Using a Dps monomer (as shown in Figure 5.7A) as a search model, the molecular replacement results for the space groups I222 and I212121 gave an Rfactor of 60% (CC of 24) and 59% (CC of 24), respectively. Because neither I222 nor I212121 gave a correct molecular replacement solution, Data set #1 was reprocessed as P1 (as shown in Table 5.2). The Matthews’ coefficient for the P1 space group indicated that there was one Dps dodecamer in the asymmetric unit. Molecular replacement was then attempted using the Dps dodecamer (as shown in Figure 5.7B) as a search model. This molecular replacement search was also not able to find the correct solution. Lastly, one Dps monomer was used to search for twelve monomers in the P1 asymmetric unit, but only eight positions could be located. Further examination of the coordinates from this search indicated that the molecular packing of the monomers did not resemble a sphere as in the known Dps structure. Several attempts to refine this molecular replacement solution using Refmac5 were unsuccessful.
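The Matthews’ coefficient used above to estimate the contents of the asymmetric unit is the unit cell volume divided by the total protein mass in the cell, VM = V / (Z·n·M), where Z is the number of asymmetric units per cell, n is the number of molecules per asymmetric unit, and M is the molecular weight. The Python sketch below shows the calculation. The unit cell dimensions are hypothetical placeholders chosen only so that the output falls in a typical range; they are not the cell dimensions of the Dps crystals, which are not given in this section. The function names are likewise illustrative.

```python
import math

def cell_volume(a, b, c, alpha=90.0, beta=90.0, gamma=90.0):
    """Unit cell volume in Å^3 from cell lengths (Å) and angles (degrees)."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    return a * b * c * math.sqrt(1.0 - ca**2 - cb**2 - cg**2 + 2.0 * ca * cb * cg)

def matthews_coefficient(volume, n_asym_units, n_copies, mw_da):
    """V_M = V / (Z * n * M), in Å^3 per Da."""
    return volume / (n_asym_units * n_copies * mw_da)

# HYPOTHETICAL orthorhombic cell, for illustration only.
volume = cell_volume(80.0, 80.0, 90.0)
Z_I222 = 8            # asymmetric units per cell in a body centered 222 space group
MONOMER_DA = 17700.0  # truncated Dps monomer mass quoted in the text

for n in range(1, 5):
    vm = matthews_coefficient(volume, Z_I222, n, MONOMER_DA)
    print(f"{n} monomer(s) per asymmetric unit: V_M = {vm:.2f} Å^3/Da")
```

In practice the value of n is chosen so that VM falls within the range commonly observed for protein crystals, which is how an estimate such as two monomers per asymmetric unit is reached.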


Figure 5.7: Dps molecular replacement search models. A: Ribbon structure of one four helix bundle Dps monomer. B: Ribbon structure of the Dps dodecamer (12 monomers), PDB 1DPS (Grant et al., 1998). This figure was generated with the programs MOLSCRIPT and RENDER using Raster3D (Kraulis, 1991; Merritt and Bacon, 1997).


It was surprising that molecular replacement using the known X-ray structure of Dps was not successful at solving the initial phasing of the Dps data sets. These results suggest that the truncated form of the isolated E. coli Dps protein may have a structure that differs from that previously determined. Interestingly, the DLS results showed that the truncated Dps protein had an estimated molecular weight of 142 kDa, which was considerably less than would be expected for a dodecamer formed from twelve truncated monomers (210 kDa). In addition, a molecular replacement search using a Dps monomer model was able to locate eight positions in the asymmetric unit. The molecular weight of eight truncated Dps monomers is also approximately 142 kDa. Therefore, based on the above molecular replacement results, experimental phasing by heavy atom soaking methods might be required to solve the structure of this truncated form of Dps.
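The mass comparisons made in this chapter can be checked with simple arithmetic. The Python sketch below uses only the monomer masses and the DLS estimate quoted in the text; the function names are illustrative. Because the DLS mass is derived from a globular protein calibration curve, the agreement with an octamer should be read as suggestive rather than conclusive.

```python
FULL_LENGTH_KDA = 18.5    # full length Dps monomer (residues 1-167), from the text
TRUNCATED_KDA = 17.7      # monomer lacking the first nine residues, from the text
DLS_ESTIMATE_KDA = 141.6  # DLS molecular weight estimate, from the text

def oligomer_mass(monomer_kda, n_subunits):
    """Mass in kDa of an oligomer built from n identical subunits."""
    return monomer_kda * n_subunits

def implied_subunits(observed_kda, monomer_kda):
    """Number of subunits implied by an observed mass."""
    return observed_kda / monomer_kda

print(f"full length dodecamer: {oligomer_mass(FULL_LENGTH_KDA, 12):.1f} kDa")
print(f"truncated dodecamer:   {oligomer_mass(TRUNCATED_KDA, 12):.1f} kDa")
print(f"truncated octamer:     {oligomer_mass(TRUNCATED_KDA, 8):.1f} kDa")
print(f"subunits implied by DLS (truncated monomer):   "
      f"{implied_subunits(DLS_ESTIMATE_KDA, TRUNCATED_KDA):.1f}")
print(f"subunits implied by DLS (full length monomer): "
      f"{implied_subunits(DLS_ESTIMATE_KDA, FULL_LENGTH_KDA):.1f}")
```

The truncated dodecamer comes to about 212 kDa, whereas eight truncated monomers total 141.6 kDa, matching the DLS estimate and the eight positions located by molecular replacement.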


References

Ayyagari, R., Gomes, X.V., Gordenin, D.A. and Burgers, P.M. (2003) Okazaki fragment maturation in yeast. I. Distribution of functions between FEN1 AND DNA2. J Biol Chem, 278, 1618-1625. Azam, T.A. and Ishihama, A. (1999) Twelve species of the nucleoid-associated protein from Escherichia coli. Sequence recognition specificity and DNA binding affinity. J Biol Chem, 274, 33105-33113. Bae, S.H., Bae, K.H., Kim, J.A. and Seo, Y.S. (2001) RPA governs endonuclease switching during processing of Okazaki fragments in eukaryotes. Nature, 412, 456-461. Beese, L.S. and Steitz, T.A. (1991) Structural basis for the 3'-5' exonuclease activity of Escherichia coli DNA polymerase I: a two metal ion mechanism. Embo J, 10, 25- 33. Benkovic, S.J., Valentine, A.M. and Salinas, F. (2001) Replisome-mediated DNA replication. Annu Rev Biochem, 70, 181-208. Bhagwat, M., Hobbs, L.J. and Nossal, N.G. (1997a) The 5'-exonuclease activity of bacteriophage T4 RNase H is stimulated by the T4 gene 32 single-stranded DNA- binding protein, but its flap endonuclease is inhibited. J Biol Chem, 272, 28523- 28530. Bhagwat, M., Hobbs, L.J. and Nossal, N.G. (1997b) The 5'-exonuclease activity of bacteriophage T4 RNase H is stimulated by the T4 gene 32 single-stranded DNA- binding protein, but its flap endonuclease is inhibited. J Biol Chem, 272, 28523- 28530. Bhagwat, M., Meara, D. and Nossal, N.G. (1997c) Identification of residues of T4 RNase H required for catalysis and DNA binding. J Biol Chem, 272, 28531-28538. Bhagwat, M. and Nossal, N.G. (2001) Bacteriophage T4 RNase H removes both RNA primers and adjacent DNA from the 5' end of lagging strand fragments. J Biol Chem, 276, 28516-28524. Bhat, T.N. (1988) Calculation of an OMIT map. J. Appl. Cryst, 21, 279-281.


Bornarth, C.J., Ranalli, T.A., Henricksen, L.A., Wahl, A.F. and Bambara, R.A. (1999) Effect of flap modifications on human FEN1 cleavage. Biochemistry, 38, 13347- 13354. Brunger, A.T. (1992) X-PLOR Version 3.1 A system for X-ray crystallography and NMR. Yale University Press, New Haven, CT. Brunger, A.T. (1997) Free R Value: Cross-Validation in Crystallography. In Carter, C.W. and Sweet, R.M. (eds.), Methods Enzymol Academic Press, New York, Vol. 277, pp. 367-396. Brunger, A.T., Adams, P.D., Clore, G.M., DeLano, W.L., Gros, P., Grosse-Kunstleve, R.W., Jiang, J.S., Kuszewski, J., Nilges, M., Pannu, N.S., Read, R.J., Rice, L.M., Simonson, T. and Warren, G.L. (1998) Crystallography & NMR system: A new software suite for macromolecular structure determination. Acta Crystallogr D Biol Crystallogr, 54 (Pt 5), 905-921. Brunger, A.T., Adams, P.D. and Rice, L.M. (1997) New applications of simulated annealing in X-ray crystallography and solution NMR. Structure, 5, 325-336. Brunger, A.T. and Rice, L.M. (1997) Crystallographic Refinement by Simulated Annealing: Methods and Applications. In Carter, C.W. and Sweet, R.M. (eds.), Methods Enzymol Academic Press, New York, Vol. 277, pp. 243-269. Budd, M.E. and Campbell, J.L. (1997) A yeast replicative helicase, Dna2 helicase, interacts with yeast FEN-1 nuclease in carrying out its essential function. Mol Cell Biol, 17, 2136-2142. Budd, M.E., Choe, W. and Campbell, J.L. (2000) The nuclease activity of the yeast DNA2 protein, which is related to the RecB-like nucleases, is essential in vivo. J Biol Chem, 275, 16518-16529. Burke, R.L., Munn, M., Barry, J. and Alberts, B.M. (1985) Purification and properties of the bacteriophage T4 gene 61 RNA priming protein. J Biol Chem, 260, 1711- 1722. Carter, C.W., Jr. and Carter, C.W. (1979) Protein crystallization using incomplete factorial experiments. J Biol Chem, 254, 12219-12223.


Carver, T.E., Jr., Sexton, D.J. and Benkovic, S.J. (1997) Dissociation of bacteriophage T4 DNA polymerase and its processivity clamp after completion of Okazaki fragment synthesis. Biochemistry, 36, 14409-14417. Ceci, P., Cellai, S., Falvo, E., Rivetti, C., Rossi, G.L. and Chiancone, E. (2004) DNA condensation and self-aggregation of Escherichia coli Dps are coupled phenomena related to the properties of the N-terminus. Nucleic Acids Res, 32, 5935-5944. Ceska, T.A., Sayers, J.R., Stier, G. and Suck, D. (1996) A helical arch allowing single- stranded DNA to thread through T5 5'- exonuclease. Nature, 382, 90-93. Cha, T.A. and Alberts, B.M. (1989) The bacteriophage T4 DNA replication fork. Only DNA helicase is required for leading strand DNA synthesis by the DNA polymerase holoenzyme. J Biol Chem, 264, 12220-12225. Chapados, B.R., Hosfield, D.J., Han, S., Qiu, J., Yelent, B., Shen, B. and Tainer, J.A. (2004) Structural basis for FEN-1 substrate specificity and PCNA-mediated activation in DNA replication and repair. Cell, 116, 39-50. Collaborative Computational Project, N. (1994) The CCP4 suite: programs for protein crystallography. Acta Crystallogr D Biol Crystallogr, 50, 760-763. Collins, B.K., Tomanicek, S.J., Lyamicheva, N., Kaiser, M.W. and Mueser, T.C. (2004) A preliminary solubility screen used to improve crystallization trials: crystallization and preliminary X-ray structure determination of Aeropyrum pernix flap endonuclease-1. Acta Crystallogr D Biol Crystallogr, 60, 1674-1678. Dauter, Z. (1999) Data-collection strategies. Acta Crystallogr D Biol Crystallogr, 55 (Pt 10), 1703-1717. Dionne, I., Nookala, R.K., Jackson, S.P., Doherty, A.J. and Bell, S.D. (2003) A heterotrimeric PCNA in the hyperthermophilic archaeon Sulfolobus solfataricus. Mol Cell, 11, 275-282. Feng, M., Patel, D., Dervan, J.J., Ceska, T., Suck, D., Haq, I. and Sayers, J.R. (2004) Roles of divalent metal ions in flap endonuclease-substrate interactions. Nat Struct Mol Biol, 11, 450-456.


Ferre-D'Amare, A.R. and Burley, S.K. (1997) Dynamic Light Scattering in Evaluating Crystallizability of Macromolecules. In Carter, C.W. and Sweet, R.M. (eds.), Methods Enzymol Academic Press, New York, 276, 157-166. Freudenreich, C.H., Kantrow, S.M. and Zakian, V.A. (1998) Expansion and length- dependent fragility of CTG repeats in yeast. Science, 279, 853-856. Friedrich-Heineken, E. and Hubscher, U. (2004) The Fen1 extrahelical 3'-flap pocket is conserved from archaea to human and regulates DNA substrate specificity. Nucleic Acids Res, 32, 2520-2528. Gangisetty, O., Jones, C.E., Bhagwat, M. and Nossal, N.G. (2005) Maturation of Bacteriophage T4 Lagging Strand Fragments Depends on Interaction of T4 RNase H with T4 32 Protein Rather than the T4 Gene 45 Clamp. J Biol Chem, 280, 12876-12887. Garforth, S.J., Ceska, T.A., Suck, D. and Sayers, J.R. (1999) Mutagenesis of conserved lysine residues in bacteriophage T5 5'-3' exonuclease suggests separate mechanisms of endo-and exonucleolytic cleavage. Proc Natl Acad Sci U S A, 96, 38-43. Gill, S.C. and von Hippel, P.H. (1989) Calculation of protein extinction coefficients from amino acid sequence data. Anal Biochem, 182, 319-326. Gordenin, D.A., Kunkel, T.A. and Resnick, M.A. (1997) Repeat expansion--all in a flap? Nat Genet, 16, 116-118. Grant, R.A., Filman, D.J., Finkel, S.E., Kolter, R. and Hogle, J.M. (1998) The crystal structure of Dps, a ferritin homolog that binds and protects DNA. Nat Struct Biol, 5, 294-303. Hacker, K.J. and Alberts, B.M. (1994) The rapid dissociation of the T4 DNA polymerase holoenzyme when stopped by a DNA hairpin helix. A model for polymerase release following the termination of each Okazaki fragment. J Biol Chem, 269, 24221-24228. Harrington, J.J. and Lieber, M.R. (1994a) The characterization of a mammalian DNA structure-specific endonuclease. Embo J, 13, 1235-1246.


Harrington, J.J. and Lieber, M.R. (1994b) Functional domains within FEN-1 and RAD2 define a family of structure-specific endonucleases: implications for nucleotide excision repair. Genes Dev, 8, 1344-1355.

Henneke, G., Friedrich-Heineken, E. and Hubscher, U. (2003) Flap endonuclease 1: a novel tumour suppresser protein. Trends Biochem Sci, 28, 384-390.

Henricksen, L.A., Tom, S., Liu, Y. and Bambara, R.A. (2000) Inhibition of flap endonuclease 1 by flap secondary structure and relevance to repeat sequence expansion. J Biol Chem, 275, 16420-16427.

Henricksen, L.A., Veeraraghavan, J., Chafin, D.R. and Bambara, R.A. (2002) DNA ligase I competes with FEN1 to expand repetitive DNA sequences in vitro. J Biol Chem, 277, 22361-22369.

Hollingsworth, H.C. and Nossal, N.G. (1991) Bacteriophage T4 encodes an RNase H which removes RNA primers made by the T4 DNA replication system in vitro. J Biol Chem, 266, 1888-1897.

Hosfield, D.J., Frank, G., Weng, Y., Tainer, J.A. and Shen, B. (1998a) Newly discovered archaebacterial flap endonucleases show a structure-specific mechanism for DNA substrate binding and catalysis resembling human flap endonuclease-1. J Biol Chem, 273, 27154-27161.

Hosfield, D.J., Mol, C.D., Shen, B. and Tainer, J.A. (1998b) Structure of the DNA repair and replication endonuclease and exonuclease FEN-1: coupling DNA and PCNA binding to FEN-1 activity. Cell, 95, 135-146.

Huber, C.G. (2000) Biopolymer Chromatography. In Meyers, R.A. (ed.), Encyclopedia of Analytical Chemistry. John Wiley & Sons Ltd, Chichester, pp. 11250-11278.

Hwang, K.Y., Baek, K., Kim, H.Y. and Cho, Y. (1998) The crystal structure of flap endonuclease-1 from Methanococcus jannaschii. Nat Struct Biol, 5, 707-713.

Ilari, A., Ceci, P., Ferrari, D., Rossi, G.L. and Chiancone, E. (2002) Iron incorporation into Escherichia coli Dps gives rise to a ferritin-like microcrystalline core. J Biol Chem, 277, 37619-37623.

Jancarik, J. and Kim, S.-H. (1991) Sparse matrix sampling: a screening method for crystallization of proteins. J. Appl. Cryst., 24, 409-411.

Jones, C.E., Green, E.M., Stephens, J.A., Mueser, T.C. and Nossal, N.G. (2004) Mutations of bacteriophage T4 59 helicase loader defective in binding fork DNA and in interactions with T4 32 single-stranded DNA-binding protein. J Biol Chem, 279, 25721-25728.

Jones, C.E., Mueser, T.C. and Nossal, N.G. (2000) Interaction of the bacteriophage T4 gene 59 helicase loading protein and gene 41 helicase with each other and with fork, flap, and cruciform DNA. J Biol Chem, 275, 27145-27154.

Jones, T.A. and Kjeldgaard, M. (1997) Electron-Density Map Interpretation. In Carter, C.W. and Sweet, R.M. (eds.), Methods Enzymol. Academic Press, New York, Vol. 277, pp. 173-208.

Jones, T.A., Zou, J.Y., Cowan, S.W. and Kjeldgaard, M. (1991) Improved methods for building protein models in electron density maps and the location of errors in these models. Acta Cryst., A47, 110-119.

Kaiser, M.W., Lyamicheva, N., Ma, W., Miller, C., Neri, B., Fors, L. and Lyamichev, V.I. (1999) A comparison of eubacterial and archaeal structure-specific 5'-exonucleases. J Biol Chem, 274, 21387-21394.

Kao, H.I. and Bambara, R.A. (2003) The protein components and mechanism of eukaryotic Okazaki fragment maturation. Crit Rev Biochem Mol Biol, 38, 433-452.

Kao, H.I., Henricksen, L.A., Liu, Y. and Bambara, R.A. (2002) Cleavage specificity of Saccharomyces cerevisiae flap endonuclease 1 suggests a double-flap structure as the cellular substrate. J Biol Chem, 277, 14379-14389.

Kao, H.I., Veeraraghavan, J., Polaczek, P., Campbell, J.L. and Bambara, R.A. (2004) On the roles of Saccharomyces cerevisiae Dna2p and Flap endonuclease 1 in Okazaki fragment processing. J Biol Chem, 279, 15014-15024.

Kim, Y., Eom, S.H., Wang, J., Lee, D.S., Suh, S.W. and Steitz, T.A. (1995) Crystal structure of Thermus aquaticus DNA polymerase. Nature, 376, 612-616.

Kleywegt, G.J. and Brunger, A.T. (1996) Checking your imagination: applications of the free R value. Structure, 4, 897-904.

Kleywegt, G.J. and Jones, T.A. (1994) Halloween ... Masks and Bones. In From First Map to Final Model, pp. 59-66.

Kleywegt, G.J. and Jones, T.A. (1996) xdlMAPMAN and xdlDATAMAN - programs for reformatting, analysis and manipulation of biomacromolecular electron-density maps and reflection data sets. Acta Crystallogr D Biol Crystallogr, 52, 826-828.

Kleywegt, G.J. and Jones, T.A. (1997) Model Building and Refinement Practices. In Carter, C.W. and Sweet, R.M. (eds.), Methods Enzymol. Academic Press, New York, Vol. 277, pp. 208-230.

Kornberg, A. and Baker, T.A. (1992) DNA Replication, Second Edition. W. H. Freeman and Company, New York.

Kraulis, P.J. (1991) MOLSCRIPT - A program to produce both detailed and schematic plots of protein structures. J. Appl. Cryst., 24, 946-950.

Lamzin, V.S. (1993) Automated refinement of protein models. Acta Crystallogr D Biol Crystallogr, 49, 129-147.

Laskowski, R.A., MacArthur, M.W., Moss, D.S. and Thornton, J.M. (1992) PROCHECK: a program to check the stereochemical quality of protein structures. J. Appl. Cryst., 26, 283-291.

Liu, Y. and Bambara, R.A. (2003) Analysis of human flap endonuclease 1 mutants reveals a mechanism to prevent triplet repeat expansion. J Biol Chem, 278, 13728-13739.

Liu, Y., Kao, H.I. and Bambara, R.A. (2004) Flap endonuclease 1: a central component of DNA metabolism. Annu Rev Biochem, 73, 589-615.

Maga, G., Villani, G., Tillement, V., Stucki, M., Locatelli, G.A., Frouin, I., Spadari, S. and Hubscher, U. (2001) Okazaki fragment processing: modulation of the strand displacement activity of DNA polymerase delta by the concerted action of replication protein A, proliferating cell nuclear antigen, and flap endonuclease-1. Proc Natl Acad Sci U S A, 98, 14298-14303.

Martinez, A. and Kolter, R. (1997) Protection of DNA during oxidative stress by the nonspecific DNA-binding protein Dps. J Bacteriol, 179, 5188-5194.

Matsui, E., Abe, J., Yokoyama, H. and Matsui, I. (2004) Aromatic residues located close to the active center are essential for the catalytic reaction of flap endonuclease-1 from hyperthermophilic archaeon Pyrococcus horikoshii. J Biol Chem, 279, 16687-16696.

Matsui, E., Musti, K.V., Abe, J., Yamasaki, K., Matsui, I. and Harata, K. (2002) Molecular structure and novel DNA binding sites located in loops of flap endonuclease-1 from Pyrococcus horikoshii. J Biol Chem, 277, 37840-37847.

Matthews, B.W. (1968) Solvent content of protein crystals. J Mol Biol, 33, 491-497.

McPherson, A. (1999) Crystallization of Biological Macromolecules. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York.

McPherson, A. (2004) Protein crystallization in the structural genomics era. J Struct Funct Genomics, 5, 3-12.

Merritt, E.A. and Bacon, D.J. (1997) Raster3D: Photorealistic molecular graphics. In Carter, C.W. and Sweet, R.M. (eds.), Methods Enzymol. Academic Press, New York, Vol. 277, pp. 505-525.

Moradian-Oldak, J., Leung, W. and Fincham, A.G. (1998) Temperature and pH-dependent supramolecular self-assembly of amelogenin molecules: a dynamic light-scattering analysis. J Struct Biol, 122, 320-327.

Morris, R.J., Perrakis, A. and Lamzin, V.S. (2002) ARP/wARP's model-building algorithms. I. The main chain. Acta Crystallogr D Biol Crystallogr, 58, 968-975.

Mueser, T.C., Nossal, N.G. and Hyde, C.C. (1996) Structure of bacteriophage T4 RNase H, a 5' to 3' RNA-DNA and DNA-DNA exonuclease with sequence similarity to the RAD2 family of eukaryotic proteins. Cell, 85, 1101-1112.

Mueser, T.C., Rogers, P.H. and Arnone, A. (2000) Interface sliding as illustrated by the multiple quaternary structures of liganded hemoglobin. Biochemistry, 39, 15353-15364.

Murshudov, G.N. (1997) Refinement of macromolecular structures by the maximum-likelihood method. Acta Crystallogr D Biol Crystallogr, 53, 240-255.

Navaza, J. and Saludjian, P. (1997) aMoRe: An Automated Molecular Replacement Program Package. In Carter, C.W. and Sweet, R.M. (eds.), Methods Enzymol. Academic Press, New York, Vol. 276, pp. 581-594.

Nossal, N.G. (1994) The Bacteriophage T4 DNA Replication Fork. In Karam, J.D. (ed.), Molecular Biology of Bacteriophage T4. American Society of Microbiology, Washington D.C., pp. 43-53.

Otwinowski, Z. and Minor, W. (1997) Processing of X-ray diffraction data collected in oscillation mode. In Carter, C.W. and Sweet, R.M. (eds.), Methods Enzymol. Academic Press, New York, Vol. 276, pp. 307-326.

Qiu, J., Liu, R., Chapados, B.R., Sherman, M., Tainer, J.A. and Shen, B. (2004) Interaction interface of human flap endonuclease-1 with its DNA substrates. J Biol Chem, 279, 24394-24402.

Qiu, J., Qian, Y., Frank, P., Wintersberger, U. and Shen, B. (1999) Saccharomyces cerevisiae RNase H(35) functions in RNA primer removal during lagging-strand DNA synthesis, most efficiently in cooperation with Rad27 nuclease. Mol Cell Biol, 19, 8361-8371.

Read, R.J. (1986) Improved Fourier coefficients for maps using phases from partial structures with errors. Acta Cryst., A42, 140-149.

Rogers, D.W. (1997) Practical Cryocrystallography. In Carter, C.W. and Sweet, R.M. (eds.), Methods Enzymol. Academic Press, New York, Vol. 276, p. 183.

Rossmann, M.G. and Arnold, E. (2001) Solvent Flattening, Chapter 15. International Tables for Crystallography, Vol. F: Crystallography of Biological Macromolecules. IUCr.

Sako, Y., Nomura, N., Uchida, A., Ishida, Y., Morii, H., Koga, Y., Hoaki, T. and Maruyama, T. (1996) Aeropyrum pernix gen. nov., sp. nov., a novel aerobic hyperthermophilic archaeon growing at temperatures up to 100 degrees C. Int J Syst Bacteriol, 46, 1070-1077.

Sakurai, S., Kitano, K., Yamaguchi, H., Hamada, K., Okada, K., Fukuda, K., Uchida, M., Ohtsuka, E., Morioka, H. and Hakoshima, T. (2004) Structural basis for recruitment of human flap endonuclease 1 to PCNA. Embo J, 24, 683-693.

Schweitzer, J.K. and Livingston, D.M. (1998) Expansions of CAG repeat tracts are frequent in a yeast mutant defective in Okazaki fragment maturation. Hum Mol Genet, 7, 69-74.

Senger, A.B. and Mueser, T.C. (2004) Rapid preparation of custom grid screens for crystal growth optimization. J. Appl. Cryst., in press.

Shen, B., Nolan, J.P., Sklar, L.A. and Park, M.S. (1997) Functional analysis of point mutations in human flap endonuclease-1 active site. Nucleic Acids Res, 25, 3332-3338.

Shen, B., Qiu, J., Hosfield, D. and Tainer, J.A. (1998) Flap endonuclease homologs in archaebacteria exist as independent proteins. Trends Biochem Sci, 23, 171-173.

Sommers, C.H., Miller, E.J., Dujon, B., Prakash, S. and Prakash, L. (1995) Conditional lethality of null mutations in RTH1 that encodes the yeast counterpart of a mammalian 5'- to 3'-exonuclease required for lagging strand DNA synthesis in reconstituted systems. J Biol Chem, 270, 4193-4196.

Storici, F., Henneke, G., Ferrari, E., Gordenin, D.A., Hubscher, U. and Resnick, M.A. (2002) The flexible loop of human FEN1 endonuclease is required for flap cleavage during DNA replication and repair. Embo J, 21, 5930-5942.

Tishkoff, D.X., Filosi, N., Gaida, G.M. and Kolodner, R.D. (1997) A novel mutation avoidance mechanism dependent on S. cerevisiae RAD27 is distinct from DNA mismatch repair. Cell, 88, 253-263.

Turchi, J.J., Huang, L., Murante, R.S., Kim, Y. and Bambara, R.A. (1994) Enzymatic completion of mammalian lagging-strand DNA replication. Proc Natl Acad Sci U S A, 91, 9803-9807.

Usdin, K. and Grabczyk, E. (2000) DNA repeat expansions and human disease. Cell Mol Life Sci, 57, 914-931.

Vagin, A. and Teplyakov, A. (1997) MOLREP: an automated program for molecular replacement. J. Appl. Cryst., 30, 1022-1025.

Vipond, I.B. and Halford, S.E. (1995) Specific DNA recognition by EcoRV restriction endonuclease induced by calcium ions. Biochemistry, 34, 1113-1119.

Weber, P.C. (1997) Overview of Protein Crystallization Methods. In Carter, C.W. and Sweet, R.M. (eds.), Methods Enzymol. Academic Press, New York, Vol. 276, p. 13.

Wilson, W.W. (2003) Light scattering as a diagnostic for protein crystal growth--a practical approach. J Struct Biol, 142, 56-65.

Woese, C.R., Kandler, O. and Wheelis, M.L. (1990) Towards a natural system of organisms: proposal for the domains Archaea, Bacteria, and Eucarya. Proc Natl Acad Sci U S A, 87, 4576-4579.

Xu, Y., Potapova, O., Leschziner, A.E., Grindley, N.D. and Joyce, C.M. (2001) Contacts between the 5' nuclease of DNA polymerase I and its DNA substrate. J Biol Chem, 276, 30167-30177.

Appendices

Appendix I: American Heart Association Ohio Valley Affiliate

Predoctoral Fellowship Proposal

Appendix II: American Heart Association Predoctoral Fellowship

Progress Report for Renewal

Appendix I: American Heart Association Ohio Valley Affiliate

Predoctoral Fellowship Proposal

Specific Aims.

Aim 1. Model building and refinement of native Aeropyrum pernix (Ape) flap endonuclease enzyme (FEN-1).

A 1.9 Å X-ray diffraction data set of native Aeropyrum pernix (Ape) flap endonuclease enzyme (FEN-1) has been collected. The structure has been solved using molecular replacement. I am currently building the molecular model using computer graphics. The first round of modeling has been completed with approximately 65% of the residues in place. We have recently been able to collect additional high-resolution (1.4 Å) data of the native Ape FEN-1. I propose to complete the model building and refinement of the native Ape FEN-1 structure, during which the resolution will be extended to 1.4 Å.

Aim 2. Crystal soaking experiments and X-ray diffraction data collection of Aeropyrum pernix (Ape) flap endonuclease enzyme (FEN-1) with divalent metals.

Native, diffraction-quality Aeropyrum pernix (Ape) FEN-1 crystals will be used for crystal soaking experiments with divalent metals to understand how the catalytic divalent metal ions bind. Initial soaking experiments have been performed, and X-ray diffraction data have been collected on Ape FEN-1 crystals soaked with various divalent metal ions. These data sets will be analyzed by molecular replacement using the native Ape FEN-1 model that is being constructed in Specific Aim 1.

Aim 3. Crystallization trials of Archaeal flap endonuclease enzymes (FEN-1) with substrate.

Oligonucleotide flap DNA substrates will be provided, in-kind, by Third Wave Technologies, Inc. for enzyme-substrate crystallization trials. Several methods will be utilized to inhibit cleavage of the DNA substrates by the FEN-1 enzymes. Co-crystallization trials will be conducted and crystal optimization will be performed as needed. Diffraction data collection and structure refinement as discussed in Aims 1 and 2 will be performed on the complexes.

Background and Significance.

The flap endonuclease (FEN-1) family of DNA replication-associated DNA repair enzymes are structure-specific 5’ to 3’ endonucleases that recognize and act on three-stranded substrates called flap DNA. Flap DNA is generated in vivo during strand displacement synthesis, an event that occurs in both DNA replication and DNA repair. The read-through replication generates a complementary strand that displaces a segment of the duplex parent strand. Recognition by FEN-1 enzymes is mediated by the structure of the DNA flap junction at the 3’ end of the short complementary strand. Studies have demonstrated that FEN-1 enzymes specifically recognize a short 5’ single-stranded arm, regardless of composition or sequence, near the junction where the two strands of duplex DNA separate. The enzyme then moves down the single-stranded arm to the cleavage site that is located at the junction of the double- and single-stranded nucleic acid (Shen et al., 1998; Kaiser et al., 1999). Incorrect processing of the Okazaki fragment primers during strand displacement synthesis by FEN-1 enzymes can lead to mutations that have been associated with human disorders such as recessive retinitis pigmentosa, Huntington’s disease, fragile X syndrome, and Friedreich’s ataxia, which can lead to hypertrophic cardiomyopathy and possibly cardiac failure. In particular, Friedreich’s ataxia is believed to be a result of a trinucleotide repeat (TNR) expansion. FEN-1 has been associated with an increased rate of expansion (Usdin and Grabczyk, 2000).

To date, the X-ray crystal structures of six enzymes in this family have been determined: one from a prokaryotic source, the 5’ to 3’ exonuclease domain of Thermus aquaticus (Taq) polymerase (1TAQ, 2.40 Å resolution) (Kim et al., 1995); two from bacteriophage, the T4 RNase H (1TFR, 2.1 Å resolution) (Mueser et al., 1996) and the T5 5’ to 3’ exonuclease (1EXN, 2.50 Å resolution) (Ceska et al., 1996), (1XO1, 2.50 Å resolution) (Garforth et al., 1999); and three from Euryarchaeal organisms, Pyrococcus furiosus (Pfu) flap endonuclease-1 (1B43, 2.00 Å resolution) (Hosfield et al., 1998b), Methanococcus jannaschii flap endonuclease-1 (1A76 and 1A77, 2.00 Å resolution) (Hwang et al., 1998), and Pyrococcus horikoshii (Pho) flap endonuclease-1 (1MC8, 3.10 Å resolution) (Matsui et al., 2002). Structural studies of this family of enzymes have yet to determine the basis of substrate recognition. The research we propose in Specific Aim 1 will yield both the first FEN-1 structure solved from a Crenarchaeal organism and the highest-resolution (1.4 Å) structure of a FEN-1 enzyme solved to date.

The flap endonuclease family can be identified by a highly conserved, acidic magnesium ion coordination motif. Magnesium ions are bound in the active site of the enzyme and are required for catalysis. The catalytic core of the FEN-1 enzymes is highly conserved and contains mainly the negatively charged acidic residues, aspartic and glutamic acid. The acidic catalytic center of flap endonuclease enzymes coordinates two magnesium ions. Coordination of divalent cations is essential and may be required both to neutralize this condensed region of negative charge in the active site and to coordinate the inner-sphere water molecule involved in the enzymatic cleavage reaction. In addition, divalent cations may also be required to coordinate the phosphate backbone of the DNA substrate, stabilizing substrate binding to the enzyme in the proper orientation. The open packing in the native Aeropyrum pernix (Ape) FEN-1 crystal (see Specific Aim 2, Figure 3) should allow divalent metal ions access to the active site, so that the binding of the catalytic metals can be more fully understood.

Recent sequence information demonstrates that Archaeal FEN-1 enzymes share a high degree of sequence homology with the eukaryotic FEN-1 enzymes. Understanding the mechanisms of the Archaeal forms of these FEN-1 enzymes may therefore provide a greater understanding of eukaryotic DNA replication and repair pathways, because the kingdom Archaea has a replication system more closely related to the eukaryotic system than to the bacterial system. Thus, it would be beneficial to obtain additional structural data on the Archaeal FEN-1 enzymes to more fully characterize the substrate binding specificity and the role of divalent metal ions in the catalytic mechanism of these enzymes.

Research Design and Methods.

The goal of this proposed research is to understand the molecular basis of the interaction of the Aeropyrum pernix (Ape) flap endonuclease (FEN-1) enzyme with divalent metal ions and flap DNA substrates through the use of X-ray crystallographic studies. A major project in the lab began with six (6) thermostable flap endonuclease enzymes from the Archaeal organisms: Archaeoglobus fulgidus (Afu), Aeropyrum pernix (Ape), Archaeoglobus veneficus (Ave), Pyrococcus furiosus (Pfu), Pyrococcus horikoshii (Pho) and Thermococcus zilligii (Tzi). The project involves crystallization trials and biophysical studies of both native archaeal flap endonucleases and native and/or site-directed mutants of FEN-1 enzymes with flap DNA substrates of varying lengths. Additional crystallization trials will be conducted on the DNA substrates themselves with and without metals. The purified archaeal enzymes and flap DNA substrates of various lengths will be supplied, in-kind, by Third Wave Technologies, Inc.

Aim 1. Model building and refinement of native Aeropyrum pernix (Ape) flap endonuclease enzyme (FEN-1).

In the first phase of the study, five Archaeal FEN-1 enzymes (Afu, Ave, Ape, Pfu, and Tzi) were subjected to crystallization screening. The solubility parameters of the enzymes were determined using a common ion screen method, which allowed us to maximize the solubility for crystallization studies. Each protein was then concentrated to the appropriate level and subjected to various vapor diffusion crystallization screens. After initial crystallization trials were completed, greater quantities of the particular enzymes were provided for optimization of crystallization conditions to produce diffraction-quality crystals.

Each protein gave reasonable and promising results in the crystallization trials. In particular, the Ape FEN-1 initial crystal screens gave numerous hits. These conditions were optimized and diffraction-quality crystals were obtained (see Figure 1). The other four proteins have produced crystallization hits and are being investigated by others in the group. All initial protein crystallographic data collection is conducted using an R-Axis IV in-house X-ray source with cryogenic capabilities. High-resolution data are collected at the Advanced Photon Source (APS) at Argonne National Laboratory.

To date, a 1.9 Å X-ray diffraction data set of native Aeropyrum pernix (Ape) flap endonuclease enzyme (FEN-1) has been collected, and we have been able to solve the structure using the Pfu FEN-1 as a model in molecular replacement. The Pfu FEN-1 model, positioned in the Ape FEN-1 crystal lattice, was transformed into the Ape FEN-1 sequence using a threading program called LOOK (http://www.bioinformatics.ucla.edu/genemine/), and was refined using CNS to generate an initial composite omit electron density map at 1.9 Å resolution. The composite omit map is being used in model building to reduce bias. The first round of modeling has been completed with approximately 65% of the residues in place by using molecular graphics. A second CNS refinement was recently completed using the partially fitted model. An example of the electron density map after the second refinement can be seen in Figure 2. I am currently working on the second round of modeling of the Ape FEN-1 structure. We have recently been able to collect additional high-resolution (1.4 Å) data of the native Ape FEN-1. Upon completion of model building and refinement at 1.9 Å, the resolution of the native Ape FEN-1 structure will be extended to 1.4 Å.

Figure 1: Native Ape FEN-1 crystal

Figure 2: Native Ape FEN-1 electron density map (1.9 Å) after second refinement.

Aim 2. Crystal soaking experiments and X-ray diffraction data collection of Aeropyrum pernix (Ape) flap endonuclease enzyme (FEN-1) with divalent metals.

In the second phase of the study, various divalent metals are being used to determine binding characteristics of these metal ions to the Aeropyrum pernix (Ape) FEN-1 enzyme. These experiments initially consisted of a 20-minute soak in which a divalent metal ion solution was added, to a final concentration of 10 mM, to the crystallization drop containing native, diffraction-quality crystals of the Ape FEN-1 enzyme. The Ape FEN-1 crystals containing divalent metal ions were then cryogenically preserved for X-ray diffraction data collection. A more complete time-course metal soaking study may have to be explored to determine whether the divalent metal ions are actually incorporated into the active site of the enzyme. Further biophysical studies will include light scattering analysis to determine the aggregation state of the enzymes in the presence of various divalent metal ions, and isothermal titration calorimetric analysis to determine the binding constants of the metal ions.
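The soak described above is, in practice, a simple dilution into the crystallization drop. As a minimal sketch, assuming a hypothetical drop volume and metal stock concentration (neither is stated in this proposal), the volume of stock needed to reach the stated 10 mM final concentration could be estimated as follows. The function name and the example values are illustrative only.

```python
def stock_volume_for_soak(drop_volume_ul: float, stock_mm: float, final_mm: float) -> float:
    """Volume of metal stock (in microliters) to add to a drop so that the
    mixed drop reaches final_mm, from C_final = C_stock * V_add / (V_drop + V_add)."""
    if stock_mm <= final_mm:
        raise ValueError("stock must be more concentrated than the target")
    return final_mm * drop_volume_ul / (stock_mm - final_mm)


# Hypothetical example: a 4 uL drop and a 100 mM stock, targeting 10 mM.
added_ul = stock_volume_for_soak(drop_volume_ul=4.0, stock_mm=100.0, final_mm=10.0)
print(f"add {added_ul:.2f} uL of stock to the drop")
```

A more concentrated stock keeps the added volume small, which limits dilution of the precipitant around the crystal.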

To date, we have been able to collect several data sets of native Ape FEN-1 crystals with 20-minute soaks of the divalent cations Ba2+, Mg2+, Mn2+, Sr2+, and Zn2+ (see Table 1). These data sets will be analyzed completely when the 1.4 Å native Ape FEN-1 model has been completed. Interestingly, the resolution of the metal soak data sets is generally lower than that of the native Ape FEN-1 data set. This may be indicative of some protein conformational change upon binding of these divalent metals at the active site. The 1.9 Å native Ape FEN-1 model packs in the hexagonal space group P61 with unit cell dimensions of a = 92.8 Å, b = 92.8 Å, c = 80.9 Å, α = β = 90° and γ = 120° (see Figure 3). Due to the symmetrical packing of the Ape FEN-1 molecule in the crystal, porous channels are present throughout the native Ape FEN-1 crystals. It is also apparent that the side chains near the active site that are involved in metal binding are not involved in any crystal lattice contacts. We therefore hypothesize that these porous channels in the native Ape FEN-1 crystals should allow the access of divalent metal ions to the active site. If these crystal soak experiments are successful, structure determination will be completed on the Archaeal FEN-1 enzymes to further characterize the binding of the various divalent metals in the active site of these enzymes.
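The open packing noted above can be put in rough numbers from the reported unit cell. The sketch below computes the cell volume and a Matthews coefficient (Matthews, 1968) for the P61 cell, which has six asymmetric units. The molecular weight of 40 kDa and the choice of one molecule per asymmetric unit are assumptions made for illustration; neither value is given in this proposal.

```python
import math


def unit_cell_volume(a, b, c, alpha_deg, beta_deg, gamma_deg):
    """General (triclinic) unit cell volume in cubic angstroms."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha_deg, beta_deg, gamma_deg))
    return a * b * c * math.sqrt(1 - ca**2 - cb**2 - cg**2 + 2 * ca * cb * cg)


def matthews_coefficient(volume, n_asymmetric_units, molecules_per_asu, mw_da):
    """Crystal volume per unit of protein mass, in cubic angstroms per dalton."""
    return volume / (n_asymmetric_units * molecules_per_asu * mw_da)


def solvent_fraction(vm):
    """Approximate solvent fraction from the Matthews coefficient."""
    return 1.0 - 1.23 / vm


# Cell constants reported for the native Ape FEN-1 crystal (P61).
volume = unit_cell_volume(92.8, 92.8, 80.9, 90.0, 90.0, 120.0)

ASSUMED_MW_DA = 40000.0      # hypothetical molecular weight, for illustration
ASSUMED_MOLS_PER_ASU = 1     # hypothetical asymmetric unit content
vm = matthews_coefficient(volume, 6, ASSUMED_MOLS_PER_ASU, ASSUMED_MW_DA)

print(f"cell volume      : {volume:.0f} A^3")
print(f"Matthews coeff.  : {vm:.2f} A^3/Da")
print(f"solvent fraction : {solvent_fraction(vm):.2f}")
```

The cell volume comes to about 6.0 × 10^5 Å^3. Under the stated assumptions, roughly half of the crystal volume would be solvent, which is consistent with the channels described in the text, although the true figure depends on the actual molecular weight.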

Protein      Resolution   Comment
Ape FEN-1    1.4 Å        Native
Ape FEN-1    2.0 Å        Ba2+ soak
Ape FEN-1    1.7 Å        Mg2+ soak
Ape FEN-1    1.9 Å        Mn2+ soak
Ape FEN-1    1.8 Å        Sr2+ soak
Ape FEN-1    1.9 Å        Zn2+ soak

Table 1: Ape FEN-1 diffraction data sets.

Figure 3: Unit cell (shown in red) packing of native Ape FEN-1 molecules in the crystal.

Aim 3. Crystallization trials of Archaeal flap endonuclease enzymes (FEN-1) with substrate.

In the third phase of the study, we will attempt to define conditions for the crystallization of the six Archaeal FEN-1 enzymes in the presence of appropriate flap DNA substrates. Co-crystallization trials will be conducted and crystal optimization will be performed as needed. DNA substrate design was based on a frequently used substrate designed by the scientists at Third Wave Technologies, Inc. Because the enzymatic activity must be inhibited while substrate recognition is maintained, several methods will be utilized to inhibit cleavage of the flap DNA substrates by the FEN-1 enzymes. If any stable complex of enzyme and substrate can be crystallized, X-ray diffraction data collection and structure refinement will be conducted to more fully characterize the molecular basis of substrate recognition.

Additional biophysical studies will include light scattering analysis to determine the aggregation state of the enzymes in the presence of substrate, and, if possible, isothermal titration calorimetric analysis to determine the binding constants to the substrates.

To date, we have performed preliminary co-crystallization trials with the native Aeropyrum pernix (Ape) FEN-1 enzyme and a trial flap DNA substrate. The trial DNA substrate has been shown to inhibit growth of native Ape FEN-1 crystals under the previously determined crystallization conditions.

Inhibiting cleavage of DNA substrates by the FEN-1 enzymes.

Our first approach to inhibiting the enzymatic activity of FEN-1 will involve co-crystallization trials in the presence of DNA substrate but in the absence of divalent cations. Divalent cations are required for enzymatic activity, and their absence may prevent degradation of the substrate by the FEN-1 enzymes.

Our second approach is to substitute other cations, such as cobalt hexammine, in place of magnesium. It is known that certain cations inhibit cleavage by the FEN-1 enzymes; however, it is not yet known whether these cations inhibit binding, cleavage, or both.

Our third approach is to use mutant forms of the enzymes that do not support cleavage. The mutations that have the greatest effect on cleavage also reduce the binding of the divalent cations to the enzyme. However, because the enzymes have two metal binding sites, it is possible with this technique to eliminate the binding of only one metal. It is thought that one of the metals in this class of enzymes is required mainly for DNA binding and the other metal is required mainly for catalysis. By knocking out the correct metal binding site, DNA binding may be sustained while catalysis is virtually eliminated. Scientists at Third Wave Technologies, Inc. have attempted this technique with Taq polymerase and were able to greatly reduce, but not eliminate, cleavage. It is possible that a combination of mutant enzymes, chelating agents, and low temperature will reduce cleavage enough to allow co-crystallization.

Design of DNA substrates.

Artificial fork substrates are prepared by synthesizing DNA oligonucleotides that are complementary at one end and non-complementary at the other. Flap DNA substrates are constructed from the synthetic fork DNA. The addition of a short oligonucleotide complementary to the non-complementary segment of the 3’ to 5’ fork strand creates a second region of duplex DNA, referred to as the upstream duplex. The original duplex from the fork DNA substrate is referred to as the downstream duplex.
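The construction just described can be sketched in a few lines of code. This is a minimal illustration, not the Third Wave Technologies design: the three sequences are hypothetical placeholders, and the sketch covers only simple Watson-Crick pairing. It does not model the mismatch or single-base overlap at the junction.

```python
# Sketch of flap-substrate assembly from three oligonucleotides.
# All sequences are hypothetical placeholders chosen for illustration.

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def build_flap_substrate(upstream: str, downstream: str, flap: str) -> dict:
    """Assemble the three strands of a flap substrate (all written 5' to 3').

    upstream   : oligo that forms the upstream duplex with the template
    downstream : portion of the flap strand that forms the downstream duplex
    flap       : unpaired 5' arm attached to the downstream portion
    """
    upstream, downstream, flap = upstream.upper(), downstream.upper(), flap.upper()
    # The template pairs with the upstream oligo and, contiguously,
    # with the downstream portion of the flap strand.
    template = reverse_complement(upstream + downstream)
    return {
        "template": template,
        "upstream_oligo": upstream,
        "flap_strand": flap + downstream,
    }


substrate = build_flap_substrate(
    upstream="GCATCGAC",      # 8 bp upstream duplex (hypothetical)
    downstream="TGCAGGTCCA",  # 10 bp downstream duplex (hypothetical)
    flap="TTT",               # 3-nucleotide 5' flap (hypothetical)
)
for name, seq in substrate.items():
    print(f"{name:15s} 5'-{seq}-3'")
```

In a real design one would also confirm that the flap sequence cannot pair with the template, so that the 5' arm stays single-stranded.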

Because all three arms of the flap substrate are required for binding by FEN-1 enzymes, the lengths of the three arms affect the binding efficiency of the enzyme. Substrate design will be optimized around the shortest length necessary to maintain specific recognition. Several proposed substrates will be provided, in-kind, by Third Wave Technologies, Inc. (see Figure 4). The length of the upstream duplex (top left strand) will be varied from 6 to 10 base pairs. The length of the downstream duplex (top right strand) will be varied from 10 to 14 base pairs. The last base pair of the upstream duplex will contain a mismatch to lock the DNA in the proper conformation for binding to the enzyme. Studies have shown that the rate of cleavage is enhanced by a single base overlap from the upstream duplex. No overlap between duplex regions decreases the activity of the enzyme. This decrease suggests that active site mutants with diminished activity on overlapped substrates may have negligible activity on substrates without an overlap and may therefore be useful for co-crystallization studies. A 3-nucleotide 5' arm extending from the downstream duplex has been chosen to serve as the flap region; that length may be changed if necessary. Substrate melting points have been determined using the Hyther “nearest neighbor” algorithm (http://ozone2.chem.wayne.edu/Hyther/hythermenu.html). Studies undertaken at Third Wave Technologies, Inc. determined that these substrates are cleaved by the FEN-1 enzymes and are stable at room temperature.
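To illustrate what a nearest-neighbor melting point estimate involves, the sketch below sums stacking enthalpies and entropies along a duplex and converts them to a melting temperature. It is not the Hyther implementation. The parameter table is the widely used unified nearest-neighbor set of SantaLucia (1998) and should be verified against the primary source before use. The sketch assumes 1 M NaCl, a non-self-complementary duplex, and a hypothetical total strand concentration, and it ignores mismatches and dangling ends.

```python
import math

# Unified nearest-neighbor parameters: (dH in kcal/mol, dS in cal/(mol K)).
# Keys are 5'->3' dinucleotides on one strand; the remaining six
# dinucleotides are found through their reverse complements.
NN_PARAMS = {
    "AA": (-7.9, -22.2), "AT": (-7.2, -20.4), "TA": (-7.2, -21.3),
    "CA": (-8.5, -22.7), "GT": (-8.4, -22.4), "CT": (-7.8, -21.0),
    "GA": (-8.2, -22.2), "CG": (-10.6, -27.2), "GC": (-9.8, -24.4),
    "GG": (-8.0, -19.9),
}
INIT_GC = (0.1, -2.8)   # initiation term for a terminal G/C pair
INIT_AT = (2.3, 4.1)    # initiation term for a terminal A/T pair
R_GAS = 1.987           # gas constant, cal/(mol K)

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def _reverse_complement(seq: str) -> str:
    return seq.translate(_COMPLEMENT)[::-1]


def duplex_tm(seq: str, total_strand_molar: float = 1e-6) -> float:
    """Estimated melting temperature (deg C) of seq paired with its complement."""
    seq = seq.upper()
    dh = ds = 0.0
    # Initiation terms, one per duplex end.
    for end in (seq[0], seq[-1]):
        h, s = INIT_GC if end in "GC" else INIT_AT
        dh += h
        ds += s
    # Stacking terms for each adjacent pair of base pairs.
    for i in range(len(seq) - 1):
        step = seq[i:i + 2]
        h, s = NN_PARAMS.get(step) or NN_PARAMS[_reverse_complement(step)]
        dh += h
        ds += s
    # Two-state model for a non-self-complementary duplex.
    return dh * 1000.0 / (ds + R_GAS * math.log(total_strand_molar / 4.0)) - 273.15


# Hypothetical 10-mers, roughly the length of the proposed downstream duplex.
for name, seq in (("GC-rich", "GCCGTGCCAG"), ("AT-rich", "ATTATAATTA")):
    print(f"{name}: Tm ~ {duplex_tm(seq):.1f} C")
```

For duplexes as short as those proposed here, the estimate is sensitive to strand concentration and salt, so values like these are best read as a ranking of candidate substrates rather than as absolute melting points.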

Figure 4: Proposed flap DNA substrates

Ethical Aspects of the Proposed Research.

Appropriate procedures established by the University of Toledo Department of Chemistry will be followed for all disposal of chemical and biohazard waste associated with the proposed research. Specifically, all recombinant DNA, cellular debris, and any chemicals used or generated in the proposed research will be safely and appropriately disposed of without damage or harm to the environment.

References

Ceska, T.A., Sayers, J.R., Stier, G. and Suck, D. (1996) A helical arch allowing single-stranded DNA to thread through T5 5'-exonuclease. Nature, 382, 90-93.

Garforth, S.J., Ceska, T.A., Suck, D. and Sayers, J.R. (1999) Mutagenesis of conserved lysine residues in bacteriophage T5 5'-3' exonuclease suggests separate mechanisms of endo- and exonucleolytic cleavage. Proc Natl Acad Sci U S A, 96, 38-43.

Hosfield, D.J., Mol, C.D., Shen, B. and Tainer, J.A. (1998) Structure of the DNA repair and replication endonuclease and exonuclease FEN-1: coupling DNA and PCNA binding to FEN-1 activity. Cell, 95, 135-146.

Hwang, K.Y., Baek, K., Kim, H.Y. and Cho, Y. (1998) The crystal structure of flap endonuclease-1 from Methanococcus jannaschii. Nat Struct Biol, 5, 707-713.

Kaiser, M.W., Lyamicheva, N., Ma, W., Miller, C., Neri, B., Fors, L. and Lyamichev, V.I. (1999) A comparison of eubacterial and archaeal structure-specific 5'-exonucleases. J Biol Chem, 274, 21387-21394.

Kim, Y., Eom, S.H., Wang, J., Lee, D.S., Suh, S.W. and Steitz, T.A. (1995) Crystal structure of Thermus aquaticus DNA polymerase. Nature, 376, 612-616.

Matsui, E., Musti, K.V., Abe, J., Yamasaki, K., Matsui, I. and Harata, K. (2002) Molecular structure and novel DNA binding sites located in loops of flap endonuclease-1 from Pyrococcus horikoshii. J Biol Chem, 277, 37840-37847.

Mueser, T.C., Nossal, N.G. and Hyde, C.C. (1996) Structure of bacteriophage T4 RNase H, a 5' to 3' RNA-DNA and DNA-DNA exonuclease with sequence similarity to the RAD2 family of eukaryotic proteins. Cell, 85, 1101-1112.

Shen, B., Qiu, J., Hosfield, D. and Tainer, J.A. (1998) Flap endonuclease homologs in archaebacteria exist as independent proteins. Trends Biochem Sci, 23, 171-173.

Usdin, K. and Grabczyk, E. (2000) DNA repeat expansions and human disease. Cell Mol Life Sci, 57, 914-931.

Appendix II: American Heart Association Predoctoral Fellowship

Progress Report for Renewal

Specific Aim 1. Model building and refinement of native Aeropyrum pernix (Ape) flap endonuclease enzyme (FEN-1).

The model building and refinement of the native, metal-free Aeropyrum pernix (Ape) flap endonuclease-1 (FEN-1) enzyme has been completed with 96% of the amino acid residues in place using molecular graphics. In summary, a 1.9 Å X-ray diffraction data set of native Ape FEN-1 was collected, and we solved the structure by molecular replacement with AMoRe using Pfu FEN-1 (1B43, 2.00 Å resolution) as the search model (see Crystal #1 in Table 1). The Pfu FEN-1 model, positioned in the Ape FEN-1 crystal lattice, was transformed into the Ape FEN-1 sequence using the threading program GeneMine, and the resulting coordinates were refined using CNS to generate an initial composite omit electron density map at 1.9 Å resolution. This composite omit map was used in the first model building cycle to reduce model bias. Additional high-resolution (1.4 Å) data for native Ape FEN-1 have also been collected (see Crystal #2 in Table 1). Upon completion of model building and refinement at 1.9 Å, the native Ape FEN-1 structure was extended to 1.4 Å resolution. Model building was completed after a total of 14 molecular graphics and refinement cycles, consisting of six cycles of CNS simulated annealing followed by eight cycles of CCP4 Refmac5 restrained refinement, yielding a final R value of 17.3% and an Rfree of 20.2% (see Image 1D: Final 2Fo-Fc electron density map).
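For readers less familiar with the refinement statistics quoted above, the R value is conventionally the sum of the absolute differences between observed and calculated structure factor amplitudes divided by the sum of the observed amplitudes, and Rfree is the same quantity computed only over a test set of reflections excluded from refinement. The following is a minimal illustrative sketch of that standard definition, not the CNS or Refmac5 implementation; the function names and the toy amplitudes are hypothetical, and the calculated amplitudes are assumed to be already on the same scale as the observed ones.

```python
def r_factor(f_obs, f_calc):
    """Return sum(| |Fo| - |Fc| |) / sum(|Fo|) for paired amplitudes.

    Assumes f_calc has already been scaled to f_obs (no scale factor is fitted).
    """
    if len(f_obs) != len(f_calc):
        raise ValueError("f_obs and f_calc must have the same length")
    denominator = sum(abs(fo) for fo in f_obs)
    if denominator == 0:
        raise ValueError("sum of observed amplitudes is zero")
    numerator = sum(abs(abs(fo) - abs(fc)) for fo, fc in zip(f_obs, f_calc))
    return numerator / denominator


def r_work_and_free(f_obs, f_calc, free_flags):
    """Split reflections by a boolean test-set flag and return (R_work, R_free)."""
    work_obs = [fo for fo, free in zip(f_obs, free_flags) if not free]
    work_calc = [fc for fc, free in zip(f_calc, free_flags) if not free]
    free_obs = [fo for fo, free in zip(f_obs, free_flags) if free]
    free_calc = [fc for fc, free in zip(f_calc, free_flags) if free]
    return r_factor(work_obs, work_calc), r_factor(free_obs, free_calc)


if __name__ == "__main__":
    # Toy amplitudes for illustration only; these are not Ape FEN-1 data.
    f_obs = [100.0, 50.0, 80.0, 20.0]
    f_calc = [90.0, 55.0, 80.0, 25.0]
    flags = [False, False, False, True]  # last reflection is in the test set
    r_work, r_free = r_work_and_free(f_obs, f_calc, flags)
    print(f"R_work = {r_work:.3f}, R_free = {r_free:.3f}")
```

A gap of a few percentage points between the two values, as with the 17.3% and 20.2% reported here, is generally read as a sign that the model has not been overfitted to the working reflections.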

The X-ray crystal structures of six enzymes in this family were previously determined. Ape FEN-1 is the first FEN-1 enzyme solved from a Crenarchaeal organism and is the highest resolution (1.4 Å) structure of a FEN-1 enzyme solved to date (see Image 1C: Native Ape FEN-1 ribbon diagram). In the Ape FEN-1 model, we observe a well-ordered helical bridge region (amino acids 87-130) positioned over the catalytic core of the enzyme, which contains mainly negatively charged acidic residues (aspartic and glutamic acid). It is believed that the bridge region may be involved in DNA substrate binding resulting from a structural rearrangement of the active site upon binding of the catalytic divalent metal ions.

Table 1: Aeropyrum pernix (Ape) FEN-1 Crystal Data

#  Crystal                                                   Resolution  Rmerge  Source/Date
1  Ape FEN-1 Native                                          1.9 Å       3.6%    APS BioCARS July 2002
2  Ape FEN-1 Native                                          1.4 Å       5.3%    APS BioCARS Dec 2002
3  Ape FEN-1 + MgCl2 (native growth condition)               1.9 Å       7.1%    APS BioCARS Aug 2003
4  Ape FEN-1 + MgCl2 (new growth condition)                  1.8 Å       3.5%    APS BioCARS Apr 2003
5  Ape FEN-1 + MgCl2 (heat incubated, new growth condition)  1.68 Å      6.5%    ¹OMCC FR-E May 2004
6  Ape FEN-1 + MnCl2 (heat incubated, new growth condition)  1.35 Å      6.7%    ¹OMCC FR-E May 2004
7  Ape FEN-1 + 7/14 DNA Substrate                            ~3 Å        N/A     APS BioCARS Nov 2003

¹OMCC (Ohio Macromolecular Crystallography Consortium, University of Toledo)
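The Rmerge column in Table 1 measures the agreement among repeated measurements of the same reflection: the sum of the absolute deviations of each intensity from the mean for its reflection, divided by the sum of all intensities. The sketch below illustrates that conventional definition only. It is not the data-processing software used for these data sets, the function name and toy intensities are hypothetical, and skipping reflections measured only once is an assumption, since programs differ on that point.

```python
def r_merge(observations):
    """Return sum_hkl sum_i |I_i - <I>| / sum_hkl sum_i I_i.

    observations maps a reflection index (h, k, l) to its list of
    repeated intensity measurements. Reflections with fewer than two
    measurements are skipped (an assumption of this sketch).
    """
    numerator = 0.0
    denominator = 0.0
    for intensities in observations.values():
        if len(intensities) < 2:
            continue
        mean_intensity = sum(intensities) / len(intensities)
        numerator += sum(abs(i - mean_intensity) for i in intensities)
        denominator += sum(intensities)
    if denominator == 0:
        raise ValueError("no multiply measured reflections with nonzero intensity")
    return numerator / denominator


if __name__ == "__main__":
    # Toy intensities for illustration only; these are not Ape FEN-1 data.
    toy = {
        (1, 0, 0): [100.0, 110.0],
        (0, 1, 0): [50.0, 50.0, 56.0],
        (0, 0, 1): [75.0],  # measured once, so skipped
    }
    print(f"Rmerge = {100 * r_merge(toy):.1f}%")
```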

Specific Aim 2. Crystal soaking and growth experiments and X-ray diffraction data collection of Aeropyrum pernix (Ape) flap endonuclease enzyme (FEN-1) with divalent metals.

In the second phase of the study, various divalent metals are being used to determine the structural binding characteristics of these catalytic metal ions to the Aeropyrum pernix (Ape) FEN-1 enzyme. Crystal soaking experiments have been completed using native Ape FEN-1 crystals. It was hypothesized that the large solvent channels in the native Ape FEN-1 crystal should allow divalent metal ions access to the active site. Thus, various divalent metals (BaCl2, MgCl2, MnCl2, SrCl2, and ZnCl2) have been used at concentrations ranging from 6 to 20 mM for crystal soaking experiments at time increments of 5, 10, and 20 minutes and 24 and 48 hours. Several X-ray diffraction data sets were collected on the soaked native Ape FEN-1 crystals. In addition to the crystal soaking experiments, crystal growth experiments have been completed using the native Ape FEN-1 conditions in the presence of 20 or 200 mM MgCl2 or MnCl2. X-ray diffraction data were collected for Ape FEN-1 grown in the presence of 200 mM MgCl2 (see Crystal #3 in Table 1). Interestingly, structure determination by molecular replacement using the native Ape FEN-1 model (Specific Aim 1) and subsequent analysis of the respective difference electron density maps have shown that no divalent metal ions are bound in the active site of the Ape FEN-1 enzyme following crystal soaking or growth experiments using the native Ape FEN-1 crystallization condition.

It is believed that the binding of divalent metal ions and/or DNA substrate to the FEN-1 enzyme family may involve a structural rearrangement of the residues in the active site. This structural rearrangement may be caused by a conformational change in the bridge region resulting in a more open conformation upon binding of metal or DNA substrate. Based on this hypothesis, we believe that the native Ape FEN-1 crystal lattice packing may not allow the conformational change required for the binding of divalent metals. Therefore, we decided to screen for a new crystallization condition for Ape FEN-1 grown in the presence of MgCl2 or MnCl2. We have completed crystallization rescreening of the Ape FEN-1 enzyme in the presence of divalent metals with a variety of commercially available sparse matrix crystal screens. Crystal optimization and high-resolution data collection have been completed for this new crystallization condition of Ape FEN-1 grown in the presence of 25 mM MgCl2 (see Crystal #4 in Table 1). In addition, heat incubation of Ape FEN-1 prior to crystallization was explored using the new growth condition. Because Ape FEN-1 is an archaeal enzyme with optimal catalytic activity near 75°C, an elevated temperature may give the enzyme the conformational flexibility needed to bind divalent metals. Heat incubation studies have been completed on purified Ape FEN-1 in the presence of MgCl2 or MnCl2 at 75°C, followed by slow cooling prior to crystallization using the new condition. High-resolution data were collected on the heat-incubated Ape FEN-1 grown in the presence of 50 mM MgCl2 or MnCl2 (see Crystal #5 and #6 in Table 1, Image 1B: Ape FEN-1 with MgCl2 crystal, and Image 1E: Diffraction image of Ape FEN-1 with MnCl2 collected at OMCC FR-E). Structure determination by molecular replacement using the native Ape FEN-1 model has been completed, and analysis of the difference electron density maps is in progress.

Specific Aim 3. Crystallization trials of Archaeal flap endonuclease enzymes (FEN-1) with substrate.

In the third phase of the study, we are attempting to define conditions for the crystallization of five Archaeal FEN-1 enzymes [Archaeoglobus fulgidus (Afu), Aeropyrum pernix (Ape), Archaeoglobus veneficus (Ave), Pyrococcus furiosus (Pfu), and Thermococcus zilligii (Tzi)] in the presence of appropriate flap DNA substrates. Initial sparse matrix co-crystallization trials were conducted on all five Archaeal FEN-1 enzymes in the presence of three different types of flap DNA substrate (7/11, 6/10, and 7/14). Several crystal hits were obtained with the Ape and the Ave FEN-1 enzymes in the presence of the 7/14 DNA flap substrate. Many of the hit conditions were stained with Izit Crystal Dye (Hampton Research) to verify that the crystals were protein. In particular, a crystal of the Ape FEN-1 and 7/14 DNA complex has shown diffraction to approximately 3 Å (see Crystal #7 in Table 1 and Image 1A: Ape FEN-1 with 7/14 DNA substrate crystal). Crystal optimization is currently underway to produce more diffraction-quality crystals of the Ape and Ave FEN-1 and 7/14 complex. In addition, initial high-throughput, robotic co-crystallization screening was completed on the Ape FEN-1 and 7/14 DNA substrate in the presence of the catalytically inert divalent metal CaCl2. Analysis of the screening conditions is in progress.

Image 1