Structural Studies of RecE

Dissertation

Presented in Partial Fulfillment of the Requirements for the Degree Doctor of

Philosophy in the Graduate School of The Ohio State University

By

Jinjin Zhang, M. S.

The Ohio State Biochemistry Graduate Program

The Ohio State University

2009

Dissertation Committee:

Dr. Charles Bell, Adviser

Dr. Ross Dalbey

Dr. Dehua Pei

Dr. Jiyan Ma

ABSTRACT

The genome is subject to alterations by both normal metabolic

activities and environmental factors, resulting in many different types of DNA

damage. Double-stranded DNA breaks (DSBs) are a particularly deleterious form of

DNA damage. Mutations of the genes responsible for their repair can cause cancer

and related diseases. In Chapter 1, an overview of the three pathways of DSB repair is

given. One pathway for the repair of dsDNA breaks is single-strand annealing (SSA),

which is promoted by Rad52 in eukaryotes, and by the phage-encoded RecET and

Redαβ recombination systems in bacteria. RecET and Redαβ each consist of a

5’-3’ exonuclease, RecE or Redα (λ exonuclease), that resects the ends of the DNA created at the break to form long 3’-overhangs, and a second protein, RecT or Redβ (β protein), that loads onto the overhang to promote its annealing with a complementary strand of ssDNA. Due in large part to a lack of structural information, the proteins of these recombination systems are not well understood at the mechanistic level. The

biotechnology applications of the phage-based recombination systems in

manipulation and DNA sequencing are also introduced in this chapter.


In Chapter 2, the crystal structure of the C-terminal domain of RecE

exonuclease at 2.8 Å is presented. RecE forms a toroidal tetramer with a central tapered channel that is wide enough to bind dsDNA at one end, but is partially

plugged at the other end by the C-terminal segment of the protein. Four narrow tunnels, one within each subunit of the tetramer, lead from the central channel to the

four active sites, which lie about 15 Å from the channel. The structure suggests a

mechanism in which dsDNA enters through the open end of the central channel, the

5’-ended strand passes through a narrow tunnel to access one of the four active sites,

and the 3’-ended strand passes through the plugged end of the channel at the back of

the tetramer.

Based on the crystal structure and sequence comparisons, 24 amino acid

residues of RecE were mutated to alanine. The mutant proteins were purified, and

their nuclease and DNA-binding activities are presented in Chapter 3. The

structure-activity analysis, which targeted the six conserved motifs, the

disordered loop that is positioned to interact with an incoming DNA substrate, the

surface of the central channel, and the C-terminal plug, complements our structural

study of RecE and supports the proposed model for its mechanism of action.

In order to expand our understanding of the molecular mechanism of this class

of toroidal exonuclease enzymes, the crystallization of RecE and λ exonuclease in

complex with DNA was pursued and is still ongoing. In Chapter 4, we show that

several forms of promising crystals of protein-DNA complexes have been obtained

and the existence of DNA in the crystals was verified. We also discuss the strategies


to facilitate the complex co-crystallization as a future direction, with particular emphasis on the design of DNA substrates with varied length and ends.

The Src homology 2 (SH2) domain specifically recognizes a phosphorylated tyrosine (pTyr) residue. This process is fundamental in many signal transduction events. In Chapter 5, we study the structural basis of the specificity of the interactions between SH2 domains and pTyr-containing sequences selected from peptide library screening, a collaborative project with Dr. Dehua Pei in the Department of Chemistry at The Ohio State University. Crystal structures of the N-terminal SH2 domain of SHP-2 in complex with two different high affinity peptides,

FVP and AQLW, were determined. A striking feature of the SH2-FVP structure is that two copies of the FVP peptide, running antiparallel to one another, bind to the peptide-binding surface on the SH2 domain. All previous structures contain only one peptide bound per SH2 domain. The biological implications of this novel 1:2 SH2 domain-peptide complex are discussed.


Dedicated to My Parents and My Husband


ACKNOWLEDGMENTS

I would like to thank my advisor, Dr. Charles E. Bell, for his intellectual

guidance, support and encouragement throughout my graduate career. Without his

inspiration, the work that I have completed would be meaningless. His wisdom,

enthusiasm and friendliness have made my life at the Ohio State University enjoyable

and productive.

I would also like to extend my sincere gratitude to Dr. Ross Dalbey, Dr.

Dehua Pei, and Dr. Jiyan Ma for serving on my dissertation committee and for their thoughtful advice.

I gratefully acknowledge Dr. Scout Walsh for serving on my candidacy exam

committee and for his suggestions in making the SeMet RecE protein for the crystal

structure determination.

I wish to thank Dr. Andrew Herr, at the University of Cincinnati College of

Medicine, for providing the analytical ultracentrifugation data and for his stimulating

discussions during the manuscript preparation.

I also wish to express my appreciation to past and present members of

the Bell lab, Dr. Rakhi Rajan, Dr. Xu Xing, Dr. Dieudonné Ndjonka, Jame Wisler, Jinwei

Hu and Emily Story. This work would be impossible without their support.


I owe special thanks to my friends, Dr. Xinhe Wang, Xin Li and Ross Wilson, for their technical discussion and insightful comments.

Many thanks to other friends, Dr. Fei Wang, Min Tian, Yi Xiong and Yujie

Sun, for their valuable friendship.

Lastly, I would like to impart my most sincere gratitude toward my parents,

Gaohui Zhang and Yuejiao Chen, and my husband Wenlan Chen, for being so understanding and supportive through the years of my graduate studies, and for offering me heartfelt encouragement when I truly needed it most.


VITA

June 30, 1981 ………………………………………………..Born- Fujian, P. R. China

1999-2003 ……………………………………B. S. Shandong University, P. R. China

2004-present ……………Graduate Fellow, The Ohio State University, Columbus OH

PUBLICATIONS

1. Zhang, Jinjin; Xing, Xu; Herr, Andrew B; Bell, Charles E. “Crystal Structure of E. coli RecE Protein Reveals a Toroidal Tetramer for Processing Double Stranded DNA Breaks.” Structure, 17:690-702.

2. Pauff, Jame M; Zhang, Jinjin; Bell, Charles E; Hille, Ross. (2008). “Substrate Orientation in Xanthine Oxidase: Crystal Structure of Enzyme in Reaction with 2-hydroxy-6-methylpurine.” J. Biol. Chem., 283(8):4818-4824.

FIELDS OF STUDY

Major Field: Biochemistry


TABLE OF CONTENTS

Abstract
Dedication
Acknowledgements
Vita
Table of Contents
List of Figures
List of Tables

Chapters

1 Introduction
  1.1 Repair of Double-Stranded DNA Breaks
  1.2 Phage-Based Recombination Systems
  1.3 Protein Components of Phage-Based Recombination Systems
    1.3.1 Processive 5’–3’ Enzymes: RecE and λ Exonuclease
    1.3.2 Recombinase: RecT and β Protein
    1.3.3 Exonuclease-Recombinase Complexes
  1.4 Biotechnology Applications of RecET and Redαβ Systems
    1.4.1 Recombineering Technology
    1.4.2 Nanopore DNA Sequencing


2 Crystal Structure of Escherichia coli RecE Exonuclease Reveals a Toroidal Tetramer for Processing Double Stranded DNA Breaks
  2.1 Introduction
  2.2 Methods and Materials
    2.2.1 Expression and Purification of RecE proteins
    2.2.2 Analytical Ultracentrifugation
    2.2.3 Crystallization of RecE Fragments
    2.2.4 Expression and Purification of Selenomethionine RecE606
    2.2.5 X-ray Structure Determination of RecE606
  2.3 Results
    2.3.1 RecE Forms Stable Tetramers in Solution
    2.3.2 Crystallization and Structure Determination
    2.3.3 Structure of the RecE Monomer
    2.3.4 Structure of the RecE Active Site
    2.3.5 Six Conserved Sequence Motifs
    2.3.6 Structure of the RecE Tetramer
  2.4 Discussions
    2.4.1 Structural Comparison of RecE and λ Exonuclease Oligomers
    2.4.2 Mechanism for Processive Digestion of dsDNA
    2.4.3 Interaction with the Single Strand Annealing Protein

3 Structure-Activity Analysis of RecE
  3.1 Introduction
  3.2 Methods and Materials
    3.2.1 Expression and purification of RecE564 mutants
    3.2.2 Real-time DNase assay
      3.2.2.1 Reaction conditions
      3.2.2.2 Control reactions
      3.2.2.3 DNA calibration curve
      3.2.2.4 Data processing
    3.2.3 Fluorescence anisotropy binding assay
  3.3 Results
  3.4 Discussions

4 Crystallization of λ Exonuclease and RecE in Complex with DNA Substrates
  4.1 Introduction
  4.2 Methods and Materials
    4.2.1 Gel-shift DNA binding assay
    4.2.2 Crystallization
    4.2.3 Fluorescence detection of DNA in crystals
  4.3 Results
    4.3.1 RecE and λ exonuclease can bind dsDNA as short as 12 base pairs
    4.3.2 A collection of duplex DNA substrates
    4.3.3 Crystals of RecE and λ Exo in complex with DNA
    4.3.4 Detection of DNA in the crystals
  4.4 Discussions
  4.5 Future directions


5 Crystal Structures of N-terminal SH2 Domain of Tyrosine Phosphatase SHP-2 in Complex with High Affinity Peptides
  5.1 Introduction
    5.1.1 Src homology 2 domain
    5.1.2 Tyrosine phosphatase SHP-2
    5.1.3 Peptide specificity of SHP-2 N-SH2 domains
  5.2 Methods and materials
    5.2.1 Crystallization and data collection
    5.2.2 Structure determination
  5.3 Results
  5.4 Discussions

6 References


LIST OF FIGURES

1.1 Alternative pathways for double-strand break repair
1.2 Schematic view of homologous recombination mediated by RecET and Redαβ
1.3 Toroidal structure of λ exonuclease
1.4 EM-derived models of β protein
1.5 Crystal structure of the SSA domain of hRad52
1.6 A comparison of DNA manipulation by the traditional genetic engineering and the recombineering techniques
1.7 Nanopore DNA sequencing

2.1 Size exclusion chromatogram of RecE564
2.2 Analytical ultracentrifugation results showing that RecE protein forms a stable tetramer in solution
2.3 SDS-PAGE of purified RecE fragments and λ exonuclease
2.4 Crystals and diffractions of RecE proteins with different lengths
2.5 Exonuclease activities of RecE proteins
2.6 Electron density for the RecE crystal structure
2.7 RecE monomer structure and structure alignment of RecE to λ exonuclease and RecB
2.8 Close-up view of the active site of RecE
2.9 Close-up stereo view comparing the active sites of RecE and RecB


2.10 Alignment of five different RecE sequences showing six conserved sequence motifs
2.11 Structure of the RecE Tetramer
2.12 View of active site tunnels in the RecE tetramer
2.13 View of the central channel in the RecE tetramer
2.14 Structure-based sequence alignment of RecE, RecB and λ exonuclease
2.15 Monomers of RecE and λ exonuclease pack into their toroidal oligomers in opposite orientation

3.1 Proposed mechanism of processing double-stranded DNA ends by RecE tetramer
3.2 Six conserved active site motifs of RecE
3.3 Schematic representation of the ReDA reaction
3.4 A standard dsDNA calibration curve
3.5 An example of ReDA data processing
3.6 Raw fluorescence data of ReDA for RecE mutations
3.7 Fluorescence anisotropy data for 24 RecE mutations

4.1 Gel shift assays showing the binding of λ exonuclease and RecE to duplex DNA substrates with varied lengths
4.2 Crystal forms of λ exonuclease-DNA complexes
4.3 Crystals of λ exonuclease and RecE606 in complex with DNA
4.4 Crystals of λ exonuclease in complex with a 5’-Cy5-labeled 12-mer duplex
4.5 DNA substrates for co-crystallization


5.1 Schematic model for signal transduction through a transmembrane receptor
5.2 Structure of Src SH2 domain in complex with a high affinity phosphopeptide (pYEEI)
5.3 Structure of SHP-2 and mechanism of inhibition
5.4 Crystals of SHP-2 N-SH2 domain in complex with FVP peptide and a diffraction image
5.5 Crystal structure of N-SH2 domain of SHP-2 tyrosine phosphatase in complex with two copies of FVP peptide
5.6 Peptide FVP binding sites and interactions
5.7 Stereo view of structure alignment of SH2 domain alone and in complex with peptide substrates
5.8 Surface views of SH2 domains with peptides bound


LIST OF TABLES

2.1 A recipe for minimal media preparation
2.2 Summary of the crystallographic data
3.1 Effects of mutations on RecE exonuclease and DNA-binding activities
4.1 DNA sequences used for co-crystallization
5.1 Crystallographic and refinement statistics


CHAPTER 1

INTRODUCTION

The human genome is under constant assault from environmental factors that compromise its integrity. The cells of all organisms have developed efficient pathways for repair of the different types of DNA damage that can arise. The human genome project has already revealed more than 130 genes whose products participate in a variety of DNA repair mechanisms (Wood, et al., 2001). Mutations in these DNA repair genes can result in cancer and aging-related diseases. For example, BRCA1 and BRCA2 mutations confer a significantly increased risk of breast cancer, mutations in mismatch repair proteins can cause hereditary nonpolyposis colorectal cancer, and mutations in excision repair proteins are frequently associated with xeroderma pigmentosum. A thorough knowledge of the inner workings of these DNA repair systems is necessary to understand how mutations lead to cancer and related diseases.

1.1 Repair of Double-Stranded DNA Breaks.

DNA double-stranded breaks (DSB) in are one of most dangerous types of DNA damage. They can be induced by environmental mutagens,


such as ionizing radiation (Ward, 1988), and certain chemicals used in cancer chemotherapy, such as bleomycin (Steighner and Povirk, 1990). DSBs can also arise during DNA synthesis when a replication fork encounters an alteration to the DNA template and during meiosis by specific (Philips and Morgan, 1994).

Unrepaired DSBs can lead to genome rearrangements, such as deletions, translocations, and ultimately oncogenic transformation (Abella Columna, et al.,

1993). In general, cells can repair DSBs by one of three main pathways (Figure 1.1): non-homologous end joining (NHEJ) (Weaver, 1995), homologous recombination

(HR) (Szostak, et al., 1983; Shinohara and Ogawa, 1995), or single strand annealing

(SSA) (Kolodner, et al., 1994). Depending on the stage of the cell cycle, the relative

contribution of these DSB repair pathways will vary. HR is most efficient in the S and

G2 phases of the cell cycle due to the availability of sister chromatids as repair

templates, while NHEJ is mainly used in G1 phase in the absence of a sister

chromatid (Dik, et al., 2001).

NHEJ is the predominant DSB repair pathway in many organisms, including

higher eukaryotes such as human (Lieber, 2008) and mouse. It involves rejoining of

two exposed DNA ends and frequently results in minor changes in DNA sequence,

such as nucleotide loss and addition, at the rejoining site. The ends of the DNA

exposed at the break are first recognized by the Ku70/80 heterodimer, which prevents

further damage to the ends and also recruits other proteins. Binding of the catalytic

subunit of a DNA-dependent protein kinase (DNA-PKcs) to Ku at DNA ends activates

the protein kinase activity of DNA-PKcs, permitting it to phosphorylate itself and the


Artemis nuclease that cleans up 5’ or 3’ overhangs. Additional enzymes, such as

XRCC4 and DNA IV, are recruited by Ku. XRCC4 in turn recruits a polynucleotide kinase and to assist in end processing. Then ligase IV catalyzes the ligation of the two strands (Figure 1.1 A).

In yeast, DSB repair is dominated by the HR mechanism, which requires the

presence of a homologous chromosome to use for a DNA strand exchange reaction.

This reaction is promoted by an ATP-dependent recombinase, RecA in bacteria or

Rad51 in eukaryotes (Bell, 2005). As demonstrated in Figure 1.1 B, the DNA ends are first resected by a 5’-3’ exonuclease, which results in the formation of long

3’-overhangs. This step is achieved by the RecBCD helicase-nuclease complex in E. coli, and by a combination of activities in eukaryotes, including the MRN complex,

Sae2 endonuclease, Sgs1 helicase, and ExoI (Mimitou and Symington, 2008). The recombinase is then loaded onto the 3’-overhang to form a helical nucleoprotein filament that facilitates invasion of the 3’-overhang into a homologous duplex to pair with a complementary strand of DNA. The 3’-end of the ssDNA from the break serves as a primer for DNA synthesis using the homologous chromosome as a template, and the resulting intermediate, often a Holliday junction, is resolved by one of several possible routes (Paques and Haber, 1999). While RecA and Rad51 are the central players in HR, the entire process requires several other proteins, including members of the Rad52 epistasis group in eukaryotes (Symington, 2002).

If the break occurs between two regions on the DNA that share homology, it can be repaired by SSA, which involves 5’-3’ resection of the ends and annealing of


Figure 1.1 Alternative pathways for double-strand break repair.

(A) NHEJ: Ku70/80 heterodimers bind to and juxtapose the two ends for processing and ligation. Boxes indicate alterations to the sequence.

(B) HR: after RecA/Rad51 mediated strand-invasion and DNA synthesis, the SSA activity of Rad52 mediates second end capture or synthesis-dependent strand annealing. The vertical lines indicate regions annealed by Rad52. [Arrow heads indicate 3’-ends, and dashed red arrows indicate newly synthesized DNA.]

(C) SSA: the single stranded overhang produced by RecE or λ exonuclease anneals to a complementary strand, promoted by RecT or β protein. The hatched boxes indicate homologous regions.


the resulting 3’-overhangs (Figure 1.1 C). SSA is carried out by Rad52 in eukaryotes, and by phage-encoded proteins in bacteria, which bind to ssDNA and promote the annealing of complementary strands. Although SSA requires that the break occur between two homologous regions on the DNA, it is highly efficient, and can in fact account for a significant proportion of recombination events (Fishman-Lobell, et al.,

1992). Moreover, the SSA activity of Rad52 plays a central role in the repair of dsDNA breaks by HR (Figure 1.1 B). For example, after the 3’-overhang from one end of the break invades a homologous duplex to initiate repair DNA synthesis, Rad52 is required for the “second-end capture” step in which the 3’-overhang from the other end of the break is annealed to the displaced strand of the homologous duplex

(McIlwraith and West, 2008; Sugiyama, et al., 2006). Alternatively, in the synthesis-dependent strand annealing (SDSA) pathway, in which the extended 3’-overhang is displaced from the homologous duplex, Rad52 is required to anneal the displaced strand with the 3’-overhang of the other end of the break (Lao, et al., 2008). The SSA activity of Rad52 also mediates end-to-end chromosome fusions following loss of telomeres (Wang and Baumann, 2008). Despite the central role of Rad52 in several different DNA repair pathways, the mechanism by which it promotes the SSA reaction is not well understood.
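As a rough illustration of the outcome of SSA at a break flanked by two homologous regions, the following Python sketch treats a chromosome as a plain string carrying two direct repeats and returns the product expected after resection and annealing: one copy of the repeat is retained and the sequence between the repeats is lost. The function name, the toy sequences, and the string model itself are illustrative assumptions and are not part of the experimental work.

```python
def ssa_repair(seq: str, repeat: str, break_pos: int) -> str:
    """Toy model of single-strand annealing at a break flanked by direct repeats.

    Hypothetical helper for illustration only: resection is assumed to proceed
    far enough from both ends to expose both repeat copies as ssDNA.
    """
    left = seq.rfind(repeat, 0, break_pos)   # repeat copy upstream of the break
    right = seq.find(repeat, break_pos)      # repeat copy downstream of the break
    if left == -1 or right == -1:
        raise ValueError("break is not flanked by two copies of the repeat")
    # Annealing of the two exposed repeats leaves a single copy; the
    # intervening sequence (including the break site) is deleted.
    return seq[:left] + seq[right:]


# Toy chromosome: two copies of a 7-nt repeat separated by 8 nt.
chromosome = "AAAA" + "GATTACA" + "CCCC" + "TTTT" + "GATTACA" + "GGGG"
repaired = ssa_repair(chromosome, "GATTACA", break_pos=15)
print(repaired)  # AAAAGATTACAGGGG
```

In this toy case the repaired product is 15 nt shorter than the starting sequence, which reflects why SSA is efficient but not conservative: the repair always costs one repeat and everything between the two copies.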

1.2 Phage-Based Recombination Systems

The basic processes of DNA repair are highly conserved among eukaryotes, and even bacteriophages. This study primarily focuses on phage-based


recombination systems, including the RecET system encoded within a cryptic rac

prophage found in E. coli, and Redαβ of bacteriophage λ (Kolodner, et al., 1994; Stahl,

et al., 1997). These two recombination systems are functionally equivalent and each

consists of a 5’-3’ exonuclease, RecE or Redα (λ exonuclease), and a recombinase,

RecT or Redβ (β protein). When presented with a free end at a DSB, the exonuclease

binds to the end and resects the 5’-ended strand to form a 3’-ended ssDNA tail

(Figure 1.2). The recombinase then loads onto the ssDNA and promotes

recombination by one of two pathways: single-strand annealing or strand invasion. In

annealing, the ssDNA bound by the recombinase is paired with a complementary

single-stranded region of DNA. This situation occurs at DSBs within repeated DNA

sequences and at ssDNA exposed on the lagging strand of replication forks. In the

strand invasion pathway, the ssDNA-recombinase complex invades a complementary

region of a duplex DNA and promotes strand exchange. Although strand invasion in

vivo is thought to be mostly RecA-dependent (Stahl, et al., 1997), both RecT and β protein catalyze limited types of strand invasion reactions in vitro in the absence of

RecA (Li, et al., 1998; Noirot and Kolodner, 1998; Noirot, et al., 2003). The exonuclease and recombinase proteins of each system form a specific protein-protein interaction that may serve to load the recombinase onto the 3’-overhang as it is generated by the exonuclease (Muyrers, et al., 2000; Radding, et al., 1971). The physical interaction of a nuclease-recombinase pair, also observed for RecBCD and

RecA (Anderson and Kowalczykowski, 1997), is emerging as a common theme in homologous recombination.


Figure 1.2 Schematic view of homologous recombination mediated by RecET and

Redαβ.


1.3 Protein Components of Phage-Based Recombination Systems

1.3.1 Processive 5’–3’ Enzymes: RecE and λ Exonuclease.

RecE and λ exonuclease are ATP-independent enzymes that specifically bind to linear dsDNA ends and track along degrading the 5’-ended strand to generate

5’-mononucleotides and a long 3’-ended ssDNA tail (Joseph and Kolodner, 1983;

Little, 1967). They have no detectable activity on any type of circular DNA substrates and very low activity on single strand DNA substrates. Like many other nuclease enzymes, their nuclease activities require Mg2+ and are inhibited by other divalent

cations, such as Ca2+ (Joseph and Kolodner, 1983a). A 5’-phosphate group on the

DNA substrate is shown to be critical for λ exonuclease activities, while RecE can

efficiently degrade duplex DNA substrates with a 5’-hydroxyl group as well as ones

with 5’-phosphate (Joseph and Kolodner, 1983b). RecE and λ exonuclease are both

highly processive enzymes that can remain bound to a DNA substrate and

sequentially cleave thousands of nucleotides at a rate of ~ 10 nucleotides (nt) per

second.
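The resection reaction described above can be sketched in a few lines of Python. This is a simplified model under stated assumptions: DNA is represented as a string, the enzyme never dissociates, and the rate is fixed at the approximate value of 10 nt per second quoted in the text. The function names are hypothetical.

```python
RATE_NT_PER_S = 10.0  # approximate digestion rate quoted in the text (assumed constant)

_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the complementary strand, written 5'->3'."""
    return "".join(_COMPLEMENT[base] for base in reversed(seq))


def resect(top_strand: str, n: int):
    """Digest n nucleotides from the 5' end of top_strand (written 5'->3').

    Returns the undigested part of the top strand, the 3'-ended ssDNA tail
    exposed on the complementary strand (written 5'->3'), and the time in
    seconds the digestion would take at RATE_NT_PER_S.
    """
    n = max(0, min(n, len(top_strand)))
    remaining_top = top_strand[n:]
    bottom = reverse_complement(top_strand)
    # The 3' end of the bottom strand was paired with the digested 5' end.
    overhang = bottom[len(bottom) - n:]
    return remaining_top, overhang, n / RATE_NT_PER_S


remaining, tail, seconds = resect("ATGCCGTT", 3)
print(remaining, tail, seconds)
```

Under these assumptions, resecting 1,000 nt would take about 100 seconds, which gives a sense of how quickly a processive exonuclease can generate a long 3’-overhang.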

The crystal structure of λ exonuclease (Figure 1.3 A) revealed a toroidal trimer

with a central channel that is tapered from 30 Å on one side to about 15 Å on the other

(Kovall and Matthews, 1997). The funnel-shaped channel suggests a mode of action

in which the trimer tracks along the duplex with dsDNA entering on one side and

ssDNA exiting on the other (Figure 1.3 B). The three active sites of the trimer each

bind a Mg2+ and are located within a cleft on each subunit that is exposed to the

central channel. The portion of λ exonuclease that forms the active site exhibits a fold


Figure 1.3 Toroidal structure of λ exonuclease (Kovall, et al., 1997).

(A) Ribbon diagram of the trimer. The position of the active site Mg2+ is indicated by a magenta sphere, and the phosphate bound near the active site is shown with oxygen atoms as red spheres.

(B) Model of the λ exo-DNA complex. dsDNA enters the channel from the right, and ssDNA exits on the left.


dubbed the “restriction endonuclease-like” fold that is also seen in several type II

restriction endonucleases, the nuclease domain of the RecB subunit of the RecBCD helicase-nuclease complex, the MutH DNA repair endonuclease, and several other nucleases (Kovall and Matthews, 1998; Singleton, et al., 2004; Lee, et al., 2005). The proteins within this diverse group of nuclease enzymes likely share a common evolutionary origin and related catalytic mechanisms (Kovall and Matthews, 1999).

RecE is a much larger protein than λ exonuclease (866 vs. 226 amino acids), but limited proteolysis and genetic studies have identified a 303-residue C-terminal domain of RecE that has all of the activities of the full-length protein both in vivo and in vitro (Chang and Julin, 2001; Chu, et al., 1989; Muyrers, et al., 2000). This domain of RecE (residues 564-866) has essentially no detectable overall sequence similarity with λ exonuclease, but segments of RecE and RecB containing the presumed active site residues can be aligned to one another (Aravind, et al., 1999; Chang and Julin,

2001). Phylogenetic analyses indicate that RecB and λ exonuclease belong to distinct families within the restriction endonuclease-like superfamily, and that RecE is more closely related to the RecB family (Aravind, et al., 2000).

1.3.2 Recombinase: RecT and β Protein.

RecT, β protein, and Rad52 (the eukaryotic homolog) bind to ssDNA and promote the annealing of complementary strands (Hall, et al., 1993; Muniyappa &

Radding, 1986). In the electron microscope (EM), these proteins form a variety of structures, including oligomeric rings and helical filaments. Although there is no


general consensus for the roles of the different oligomers in promoting the SSA

reaction, a compelling model for SSA has been proposed for β protein (Passy, et al.,

1999). β protein binds weakly to ssDNA, hardly at all to dsDNA, but tightly to the

duplex product of annealing formed when two complementary oligonucleotides are

added to β protein sequentially (Karakousis, et al., 1998). In the EM, β protein forms oligomeric rings of about 12-15 subunits alone or in the presence of ssDNA, and a left-handed helical filament in the presence of complementary oligonucleotides or heat-denatured DNA (Figure 1.4). Based on these observations, it was proposed that

the oligomeric ring of β protein binds to ssDNA and presents it along its outer surface

for annealing with a second strand of ssDNA. Formation of an initial region of

annealed duplex nucleates the assembly of a helical filament of β protein, which

polymerizes on and sequesters the dsDNA product as it is formed, thus driving the

reaction forward. Interestingly, it has recently been shown by atomic force

microscopy that in the absence of DNA β protein actually forms an open lock washer instead of a closed ring (Erler, et al., 2009).

The crystal structure of the N-terminal DNA binding domain of hRad52 revealed an undecameric (11-mer) ring with a positively charged outer groove for binding ssDNA (Figure 1.5). In the EM, Rad52 forms oligomeric rings that induce the co-aggregation of single-stranded regions of DNA ends (Stasiak, et al., 2000;

Subramanian, et al., 2003). Rad52 also forms helical filaments, but a functional role

for the filaments has not yet been established. RecT binds much more tightly to

ssDNA than Rad52 or β protein, and EM studies suggest that it may use a different


Figure 1.4 EM-derived models of β protein (Passy, et al., 1999). The oligomeric ring on the left has ssDNA bound along its outer rim. The filament on the right has dsDNA bound along its inner surface.


Figure 1.5 Crystal structure of the SSA domain of hRad52 (Singleton, et al.,

2002).

(A) View of ssDNA modeled into the positively charged outer groove on the oligomer.

(B) Two monomers of the oligomer are shown with ssDNA modeled as in A.


mechanism for SSA (Thresher, et al., 1995). RecT forms oligomeric rings alone, and

helical filaments in the presence of ssDNA. Based on the observation that

RecT-ssDNA filaments tend to co-aggregate side-by-side with one another, it was proposed that SSA could occur via alignment of two RecT-ssDNA filaments.

1.3.3 Exonuclease-Recombinase Complexes.

RecE and λ exonuclease each form a protein-protein interaction with their

respective recombinase protein, RecT and β protein (Muyrers, et al., 2000; Radding,

et al., 1971). Although the interaction is essential for recombination in vivo, exactly

how it promotes recombination has not yet been established. It could serve to load the

SSA protein onto the 3’-overhang as it is generated by the exonuclease. This would be

precisely analogous to the loading of RecA onto 3’-overhangs generated by RecBCD

(Anderson & Kowalczykowski, 1997), and is also reminiscent of the loading of

Rad51 onto dsDNA-ssDNA junctions by BRCA2 (Yang, et al., 2005). It appears that

the physical linkage between an exonuclease (or an associated adaptor protein) and a

recombinase is a fundamental property of DNA repair by HR. Intriguingly, genetic

studies of Redαβ suggest another possible role for the protein-protein interaction

(Shulman, et al., 1971). Mutations of the gene encoding β protein can modulate the

apparent activity of λ exonuclease. It could be that the protein-protein interaction

serves not only to load β protein onto the 3’-overhang, but also to stop the

exonuclease in its tracks, once β protein has been assembled on the DNA. RecE and λ

exonuclease are so processive that if their activities were not attenuated in some way,

they could chew up excessive amounts of the DNA.


1.4 Biotechnology Applications of RecET and Redαβ Systems

1.4.1 Recombineering Technology.

In the post-genomics era, new tools are needed for manipulating large pieces

of DNA. The past decade has seen the development of novel in vivo recombinant

DNA technologies employing the RecET and Redαβ recombination systems that are well

suited for these purposes, due to their simplicity, high efficiency, and ability to work

at short regions of homology (Copeland, et al., 2001; Muyrers, et al., 2001). These

methods are referred to as “recombineering”, ET cloning, or in vivo cloning. Unlike classical in vitro cloning techniques that are based on ligation of DNA fragments generated by restriction enzyme digestion (Figure 1.6 A), recombineering relies solely on very short (~ 50 bp) homologous regions shared between two DNA molecules

(Figure 1.6 B). Its advantages include not only high efficiency but also the unique capacity to clone DNA fragments up to 100 kb in length. Therefore, this technique is particularly useful for cloning large DNA molecules, such as bacterial artificial

chromosomes (BAC), which are difficult to manipulate by traditional genetic

engineering techniques (Muyrers, et al., 2004). Recently, the recombineering method

has been widely used for fast and efficient construction of vectors for subsequent

manipulation of the mouse genome or for use in cell culture experiments. It can also

be used to generate insertions, deletions, or point mutations. One example of a

recombineering procedure is briefly depicted in Figure 1.6 B. A duplex DNA,

containing a gene of interest flanked by short regions homologous to the targeted

sequence in a BAC, is PCR amplified. Co-transformation of the duplex DNA


Figure 1.6 A comparison of DNA manipulation by the traditional genetic engineering and the recombineering techniques (Copeland, et al., 2001).

(A) Classic genetic engineering uses restriction enzymes and DNA ligase to cut and rejoin DNA fragments, followed by insertion of the cloned cassette into the target site of a BAC by homologous recombination.

(B) Recombineering directly places the PCR-amplified cassette, flanked by two short homologous regions, into the target site of a BAC.


with the BAC into E. coli cells expressing the Redαβ or RecET genes results in

insertion of the gene of interest into the target site of the BAC without the use of

restriction enzymes. Genes encoding the proteins of these recombination systems are

found in a wide variety of bacterial and phage genomes, and are currently being

exploited for genetic engineering in organisms such as Mycobacterium tuberculosis

(Van Kessel, et al., 2008) and Drosophila melanogaster (Vanken, 2006). This technique has already been demonstrated to be powerful and Red/ET Recombineering

Kits are now commercially available through Gene Bridges

(http://www.genebridges.com). However, as they are developed further, knowledge of the structure and mechanism of the proteins involved could facilitate realization of

their full potential.
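
The primer design implied by the procedure above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the argument layout, and the choice of a single insertion point are hypothetical and are not taken from the cited protocols. The only quantity drawn from the text is the ~50 bp homology arm length.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def design_recombineering_primers(target, site, cassette_fwd, cassette_rev, arm_len=50):
    """Build two PCR primers whose 5' ends carry short homology arms.

    Hypothetical helper for illustration only.
    target       : top-strand sequence around the target site (e.g. in a BAC)
    site         : 0-based index in `target` where the cassette is to be inserted
    cassette_fwd : priming sequence for the start of the cassette (top strand)
    cassette_rev : priming sequence for the end of the cassette,
                   already written 5'->3' as the bottom-strand primer
    arm_len      : homology arm length (about 50 bp in the text)
    """
    if site < arm_len or site + arm_len > len(target):
        raise ValueError("target too short for the requested homology arms")
    left_arm = target[site - arm_len:site]    # sequence upstream of the site
    right_arm = target[site:site + arm_len]   # sequence downstream of the site
    forward = left_arm + cassette_fwd
    # The reverse primer anneals to the top strand, so its arm is the
    # reverse complement of the downstream sequence.
    reverse = reverse_complement(right_arm) + cassette_rev
    return forward, reverse
```

PCR with such primers would yield a cassette flanked on each side by a short region homologous to the target, which is the duplex DNA substrate described for Figure 1.6 B.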

1.4.2 Nanopore DNA Sequencing.

To expand current understanding of the molecular basis of cancer and other

diseases through genomic analysis, the development of new technologies to reduce

both the cost and the time of sequencing will be required. A single molecule technique

was employed to read a DNA sequence as it is threaded through a nano-scale pore using ionic current blockage (Branton, et al., 2008). The original design of this system used a membrane-inserted α-hemolysin, which forms a remarkable pore whose inside diameter is barely as large as the diameter of a single nucleic acid strand. A reduction in ionic current can be readily detected when a molecule is translocating through the

α-hemolysin pore that is attached to a lipid bilayer (Figure 1.7 A). Recently, λ


exonuclease, due to its highly processive nature, was used to improve the existing

nano-pore sequencing technique (Branton, et al., 2008). In this device, λ exonuclease

is linked to the original system containing an aminocyclodextrin adaptor inside the

nanopore (Figure 1.7 B). When a duplex DNA molecule is captured by the λ

exonuclease enzyme, one strand is processively digested and the 5’-mononucleotides

released are driven through the α-hemolysin nanopore by the voltage across the lipid

bilayer. The identity of the four dNMPs, A, T, G or C, can be readily determined by

four distinguishable levels of the reduced ionic current. The potential advantage of this innovative technique is that it can sequence DNA directly from cells without amplification, modification or using expensive reagents such as fluorescent tags

(Sanderson, 2008). Currently, this approach is still in its infancy and faces many challenges, one of which will be to favorably integrate the exonuclease and the nanopore detection system. It is clear that our structural studies of RecE or λ exonuclease, and their complexes with DNA, will pave the way for further development of this technique. For example, a detailed understanding of exactly where the DNA binds on the exonuclease, and where the mononucleotide products are released from the exonuclease, will greatly benefit efforts to precisely orient the enzyme at the pore entrance.
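
The idea of identifying each released dNMP from one of four distinguishable current levels can be illustrated with a minimal nearest-level classifier. The current values below are placeholders invented for this sketch and are not measurements from Branton et al. (2008) or any other source. A real device would also need calibrated levels and a noise model.

```python
# Hypothetical residual-current levels (pA) for the four dNMPs.
# These numbers are placeholders for illustration, not experimental data.
REFERENCE_LEVELS_PA = {"A": 30.0, "C": 34.0, "G": 26.0, "T": 38.0}


def call_base(current_pa, levels=REFERENCE_LEVELS_PA):
    """Assign one blockage event to the base with the closest reference level."""
    return min(levels, key=lambda base: abs(levels[base] - current_pa))


def call_sequence(events_pa, levels=REFERENCE_LEVELS_PA):
    """Call a sequence from blockage events listed in their order of arrival."""
    return "".join(call_base(current, levels) for current in events_pa)
```

Because the mononucleotides are read only after they have been cleaved, the order in which they reach the pore must match the order in which they were released, which is one reason the placement of the exonuclease at the pore entrance matters.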


Figure 1.7 Nanopore DNA sequencing (Branton, et al., 2008).

(A) Standard sequencing using ionic current blockage. An α-hemolysin pore, which is just big enough to accommodate a single nucleic acid strand, is attached to a lipid bilayer. A reduced ionic current can be easily detected when a molecule is translocating through the nanopore.

(B) Exonuclease sequencing by modulation of the ionic current. An exonuclease is tethered on top of the standard nanopore device and the 5’-mononucleotides generated by the enzyme are drawn into the pore by the electric field. The DNA sequence could be determined from the characteristic change in current as each nucleotide crosses the adaptor.

CHAPTER 2

CRYSTAL STRUCTURE OF ESCHERICHIA COLI RECE EXONUCLEASE

REVEALS A TOROIDAL TETRAMER FOR PROCESSING DOUBLE

STRANDED DNA BREAKS

2.1 Introduction

The E. coli RecET and bacteriophage λ Redαβ systems promote homologous recombination reactions that are historically important and technologically useful, but not well understood mechanistically. RecE and Redα (λ exonuclease) are 5’-3’ exonucleases that bind to duplex DNA ends and processively digest the 5’-ended strand to produce a 3’-ended ssDNA tail. Through a direct physical interaction, the exonuclease then loads the corresponding single-strand annealing protein, RecT or

Redβ, onto the ssDNA to promote its annealing with a complementary strand of ssDNA. In the past decade, innovative “recombineering” technologies have been developed that implement these systems in new methods of genetic engineering. More recently, the exonucleases of these systems have been exploited in nano-pore DNA sequencing due to their highly processive nature. Knowledge of the structure and mechanism of these proteins could pave the way for further development of their applications in biotechnology.


The structure of λ exonuclease suggests a mode of action in which the trimeric

ring tracks along the duplex with dsDNA entering the channel on one side and ssDNA

exiting on the other as the 5’-ended strand is digested. Although RecE and λ

exonuclease are functionally equivalent, given the extensive differences in their

primary sequences, it would be of interest to see how similar their structures are,

particularly at the quaternary level. To gain further insight into the structure and

mechanisms of this class of processive exonuclease enzymes, we have determined the

crystal structure of the C-terminal nuclease domain of RecE. The structure of RecE

reveals a toroidal tetramer that forms a central channel similar in size and shape to

that seen in the λ exonuclease trimer. Comparison of the structures of RecE and λ exonuclease reveals common structural features that appear to be fundamental to their modes of action.

2.2 Methods and Materials

2.2.1 Expression and Purification of RecE proteins.

The gene for RecE564 was amplified by PCR from E. coli genomic DNA and

cloned into the NdeI and BamHI sites of a pET-14b vector (Novagen), which

expresses the protein with an N-terminal 6His-tag and an intervening sequence for

thrombin digestion. DNA sequencing of the cloned inserts revealed the presence of a

P658L mutation relative to the recE sequence from E. coli K12. The QuikChange

procedure (Stratagene) was used to construct plasmids for producing wild type

RecE564 with proline at amino acid position 658. The plasmids were transformed


into the E. coli BL21(AI) expression strain, in which expression is under the control of arabinose. Cell cultures

(6 × 1 L) were grown in 2.8 L broad-bottom flasks at 37 ℃ with shaking at 225 rpm. When the OD600 reached 0.8, expression was induced by addition of 0.1 %

arabinose followed by continued shaking for 4 hours at 37 ℃. Cells were harvested

by centrifugation at 10,000 × g for 10 minutes, and re-suspended in a buffer

containing 50 mM NaH2PO4, 300 mM NaCl, and 10 mM imidazole, pH 8.0. After

incubation with lysozyme (0.3 mg/ml) and protease inhibitors (1 μg/ml pepstatin, 1

μg/ml leupeptin, 1 mg/ml phenylmethylsulfonyl fluoride) on ice for 30 minutes, the

cells were lysed by repeated sonication on ice, and the soluble portion of the resulting

lysate was clarified by centrifugation twice at 40,000 × g for 20 minutes. The protein

was purified by nickel affinity chromatography (Qiagen), digested with thrombin

protease (GE Healthcare), and further purified by anion exchange chromatography on

Hi-Trap QHP (GE Healthcare). The purified protein was dialyzed into 20 mM Tris, 1

mM dithiothreitol (DTT), concentrated to 10-50 mg/ml, and stored at -80 ℃ in small

aliquots.

Several other longer or shorter C-terminal fragments of RecE, starting with

residues 554, 588, 602 or 606, were also constructed and purified. RecE554 and

RecE588 proteins were cloned, expressed and purified using the same procedures as

for RecE564. The expression and purification protocols for RecE602 and RecE606

proteins were modified, since their solubility was greatly decreased compared to other

longer RecE fragments. To express soluble RecE602 and RecE606 proteins, the

temperature of the cell culture was lowered to 18 ℃ after induction with


arabinose, followed by continued shaking for 16 hours at 18 ℃. A minimum of 150 mM NaCl and 10 % glycerol was used in all the buffers during protein purification, as well as in the final storage buffer.

2.2.2 Analytical Ultracentrifugation

Sedimentation experiments were performed at 20 ℃ in a Beckman XL-I analytical ultracentrifuge using absorbance optics. 400 μl samples of RecE564 in the buffer from gel filtration were spun at 40,000 rpm at 20 ℃ in double-sector charcoal-filled epon centerpieces. Data were analyzed using the c(s) and c(M) models in the program Sedfit (Schuck, 2000) to determine differential sedimentation coefficient or apparent mass distributions, respectively. In sedimentation equilibrium experiments, 100 μl samples at concentrations ranging from 1.2 to 12 μM were spun at 11,000, 14,000, and 19,000 rpm until equilibrium was reached. Data were truncated using WinReedit and globally fit using WinNonlin (http://www.rasmb.bbri.org/) (Herr, et al., 1997; Herr, et al., 2003). All the sedimentation experiments and data analysis were performed by Dr. Andrew B. Herr at University of Cincinnati.

2.2.3 Crystallization of RecE Fragments.

RecE564 proteins were crystallized in a condition containing 15 % PEG 3350,

0.2 M CaCl2, 0.1 M HEPES pH 7.0, 20 % glycerol. The crystals index in a tetragonal unit cell with dimensions a = b = 158 Å, c = 145 Å. RecE554 and RecE588 crystals grew in the same conditions as RecE564, but exhibited similar anisotropic diffraction


problems. RecE602 crystallized in two different tetragonal forms, which had 5 mM

MgCl2 and 150 mM CaCl2 in their crystallization conditions. The Mg-form RecE602 crystals were grown in 2 mM MgCl2, 0.15 M DL-Malic Acid, pH 7.0, 26 % glycerol,

and diffracted to ~6 Å with cell dimension a = b = 125 Å, c = 67 Å. The Ca-form

RecE602 crystals were grown in a similar condition, 150 mM CaCl2, 0.15 M DL-Malic

Acid, pH 7.0, 26 % glycerol, and diffracted to ~5 Å with cell dimension a = b = 246 Å,

c = 71 Å, space group I4. Crystals of RecE606 (P658L) were grown by the

hanging-drop vapor diffusion method at 22 ℃. The reservoir solution consisted of 2

mM MgCl2, 100 mM DL-malic acid, pH 7.0, 35-45% glycerol. Both the malic acid

and glycerol are required for crystal growth. The condition was originally identified

with PEG/malic acid, and we found that glycerol could be substituted for PEG. The

hanging drop was prepared by mixing 1 μl of 10 mg/ml RecE606 in storage buffer

plus 2 mM MgCl2, and 1 μl of reservoir solution. Crystals usually take 2-4 days to

grow and can be flash frozen in liquid nitrogen.

2.2.4 Expression and Purification of Selenomethionine RecE606.

A detailed recipe for SeMet incorporated protein expression by metabolic inhibition is listed in Table 2.1. This method relies on the supplementation of minimal media with six amino acids, Lys, Phe, Thr, Ile, Leu, and Val, which directly or indirectly inhibit methionine biosynthesis, coupled with supplementation of the relatively non-toxic SeMet. The general procedure to produce soluble SeMet

RecE606 is described as follows: 1) a 100 ml culture of E. coli BL21(AI) cells with


Table 2.1. A recipe for minimal media preparation.

5× stock of M9 salts (1 L)
    30 g Na2HPO4
    15 g KH2PO4
    5 g NH4Cl
    2.5 g NaCl

Vitamin stock (100 ml)
    100 mg Riboflavin
    100 mg Niacinamide
    100 mg Pyridoxine monohydrochloride
    100 mg Thiamine

Minimal media (1 L)
    200 ml 5× M9 stock (autoclaved)
    2 ml MgSO4 (sterile filtered)
    2 ml FeSO4 (sterile filtered, made fresh)
    0.8 ml 50% glycerol stock (sterile filtered)
    1 ml Vitamin stock (sterile filtered)
    0.5 ml Ampicillin 50 mg/ml stock (sterile filtered)
    794 ml tap water (autoclaved)

Amino acid supplements (1 L)
    100 mg L-Lysine
    100 mg L-Phenylalanine
    100 mg L-Threonine
    50 mg L-Isoleucine
    50 mg L-Leucine
    50 mg L-Valine
    50 mg L-Selenomethionine


the transformed recE gene (residues 606-866, containing the P658L mutation) is grown in

minimal medium overnight at 37 ℃. 2) 30 ml of the overnight culture is transferred into 2 L of minimal medium in a 2.8 L flat-bottom flask and grown with shaking at 120 rpm. 3) when the OD600 reaches 0.3, solid powders of the six amino acids, together with the SeMet, are added directly to the culture at the desired concentrations (Table 2.1).

4) the temperature is raised from 37 ℃ to 42 ℃ for 30 minutes to induce the expression of chaperone proteins and to begin inhibition of Met synthesis. 5) cultures

are chilled on ice for 15 minutes and induced with 0.1% arabinose, followed by continued

shaking at 120 rpm at 18 ℃ for 16 hours. Purification of the SeMet version

of RecE606 followed the same procedure as the native version, except for the

inclusion of 10 mM 2-mercaptoethanol in all buffers.
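
As a small worked example of step 3, the supplement amounts in Table 2.1 can be scaled to the 2 L culture used in step 2. The sketch below assumes that the "(1 L)" heading in the table means milligrams per litre of culture. The function name is hypothetical.

```python
# Amino acid supplements from Table 2.1, read here as mg per litre of culture
# (an assumption about how the table heading "(1 L)" should be interpreted).
SUPPLEMENTS_MG_PER_L = {
    "L-Lysine": 100,
    "L-Phenylalanine": 100,
    "L-Threonine": 100,
    "L-Isoleucine": 50,
    "L-Leucine": 50,
    "L-Valine": 50,
    "L-Selenomethionine": 50,
}


def supplement_masses_mg(culture_volume_l, per_litre=SUPPLEMENTS_MG_PER_L):
    """Return the mass (mg) of each supplement for a culture of the given volume."""
    if culture_volume_l <= 0:
        raise ValueError("culture volume must be positive")
    return {name: mg * culture_volume_l for name, mg in per_litre.items()}


# Amounts of solid powder for the 2 L culture described in step 2.
masses_for_2l = supplement_masses_mg(2)
```

Under this reading, a 2 L culture would receive 200 mg each of Lys, Phe, and Thr, and 100 mg each of Ile, Leu, Val, and SeMet.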

2.2.5 X-ray Structure Determination of RecE606.

X-ray diffraction data of crystals of the native and SeMet-version RecE606

(P658L) were collected at -180 ℃ at beamline 31-ID of the Advanced Photon Source at the Se absorption peak, λ = 0.97929 Å. Data were integrated and scaled with

MOSFLM and SCALA of the CCP4 suite (CCP4, 1994). The native data set was collected to 2.4 Å, and a cutoff of the data at 2.8 Å was chosen based on consideration of

I/σ and R-merge. While the R-merge of the highest resolution shell (2.80 – 2.95 Å) is quite high (0.70), the data are 23-fold redundant, and the I/σ in this shell is 3.3. The crystal structure was determined by the single wavelength anomalous diffraction (SAD) method, which is typically more successful for crystals that suffer rapid radiation damage. The SAD


phases were improved significantly by solvent flattening using the AutoSol feature of the PHENIX suite (Adams et al., 2002), which was particularly powerful due to the high solvent content (72%). The resulting electron density map allowed tracing of residues 612-634 and 704-861 of RecE606 using the program COOT (Emsley and

Cowtan, 2004). The structure was refined using the simulated annealing, minimization, and individual temperature factor refinement protocols of CNS

(Brünger et al., 1998) with a maximum likelihood target, and bulk solvent and anisotropic temperature factor correction options. Alternating rounds of model building and refinement yielded the final model, which consists of residues 612-664 and 699-864 of RecE. Side chains of residues 637-646, 700-701, and 704 of RecE were not resolved in the electron density and were truncated to alanine. The final structure was refined at 2.8 Å resolution to crystallographic R- and free R-factors of

28.9% and 31.1%, respectively (Table 2.2). Refinements with more relaxed stereochemical restraints (high weight on the x-ray term) and with other refinement programs, such as CCP4 Refmac, were also tried but did not yield significantly lower R-values. Structure figures were prepared using PYMOL (Delano Scientific

LLC).

2.3 Results

2.3.1 RecE forms stable tetramers in solution.

For structural studies of RecE we initially expressed and purified residues

564-866 (RecE564), which form the C-terminal nuclease domain identified by limited


proteolysis (Chang and Julin, 2001). The proteolytic data, together with in vivo deletion analysis (Chu, 1989), indicate that the N-terminal 563 residues of RecE are loosely structured and completely dispensable for protein activities. Previous biophysical analyses of full-length RecE protein revealed that it forms an oligomer, but the precise number of subunits was not established (Joseph and Kolodner, 1983a).

Our size exclusion chromatography result (Figure 2.1) showed that the RecE564 elution

volume corresponds to hexamers, or likely smaller oligomers if a non-spherical shape is

assumed, which is significantly larger than a λ exonuclease trimer. To definitively

determine the oligomeric state of RecE564, analytical ultracentrifugation experiments

(Figure 2.2) were performed in collaboration with Dr. Andrew B. Herr’s lab at the

University of Cincinnati. Sedimentation velocity indicated a single 6.3 S species with

an estimated mass of 144 ± 7 kDa (Figure 2.2 A), compared to the calculated mass of

138.0 kDa for a tetramer of RecE564. A frictional coefficient of 1.6 indicated a

non-spherical shape, consistent with a toroidal structure. To further verify the mass of this species, sedimentation equilibrium was performed at three different rotor speeds at sample concentrations ranging from 1.2 to 12 μM (Figure 2.2 B). A single-species fit was sufficient to model the observed data, and resulted in an experimental molecular weight of 137 ± 5 kDa, again consistent with a stable tetramer of RecE564.

These results establish that the oligomeric state of the C-terminal nuclease domain of

RecE, and by inference of the full-length protein, is a stable tetramer in solution. This

is in notable contrast to λ exonuclease, which forms a trimer.
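
The oligomeric-state assignment above comes down to dividing each measured mass by the monomer mass. The sketch below uses only numbers quoted in the text and in Figure 2.1: the calculated tetramer mass of 138.0 kDa, and the measured masses of 144, 137, and 218 kDa. The function name is hypothetical.

```python
# Calculated mass of a RecE564 tetramer quoted in the text (kDa).
TETRAMER_MASS_KDA = 138.0
MONOMER_MASS_KDA = TETRAMER_MASS_KDA / 4  # mass of one RecE564 subunit


def subunit_count(measured_kda, monomer_kda=MONOMER_MASS_KDA):
    """Return the measured mass expressed in units of the monomer mass."""
    return measured_kda / monomer_kda


# Measured masses quoted in the text and in the Figure 2.1 legend (kDa).
measurements = {
    "sedimentation velocity": 144.0,
    "sedimentation equilibrium": 137.0,
    "size exclusion chromatography": 218.0,
}

for method, mass in measurements.items():
    print(f"{method}: {subunit_count(mass):.2f} subunits")
```

Both ultracentrifugation values round to four subunits, whereas the size exclusion estimate rounds to six. The size exclusion estimate assumes a spherical particle, so the elongated shape implied by the frictional coefficient of 1.6 is a plausible source of the overestimate.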


Figure 2.1 Size exclusion chromatogram of RecE564. Samples of RecE564 used in ultracentrifugation experiments were purified by a previous graduate student, Xu Xing, on Superdex S-200 (GE Healthcare) with 150 mM NaCl, 20 mM Tris, pH 8.0 as the running buffer. A single sharp peak at 62.87 ml corresponds to a mass of 218 kDa, which indicates that RecE564 might aggregate to form hexamers (calculated MW =

207 kDa), or smaller if a non-spherical shape is assumed.


Figure 2.2 Analytical ultracentrifugation results showing that RecE protein forms a stable tetramer in solution.

(A) Sedimentation coefficient distribution plot from a sedimentation velocity run of

RecE564. A peak corresponding to a single 6.3 S species is observed. The MW of this species estimated from the velocity data is 144 ± 7 kDa, consistent with a tetramer of

RecE564 (calculated mass 138.0 kDa).

(B) Equilibrium sedimentation curves observed for RecE564 at three different rotor speeds. The upper panel shows the residuals of a fit of the data to a single species of mass 137 ± 5 kDa, again consistent with a tetramer.


2.3.2 Crystallization and Structure Determination.

We were able to crystallize RecE564, but the crystals diffracted x-rays to a

maximum resolution of only 3.5 Å at the synchrotron, with severe anisotropy (Figure

2.4 A). To look for alternative crystal forms, we constructed and purified several longer or shorter C-terminal fragments of RecE that start at residues 554, 588, 602 or

606 (Figure 2.3). The observation of two additional crystal forms of RecE602 (Figure

2.4 B and C) reinforced the tendency of the protein to crystallize in tetragonal symmetry, and confirmed the strategy of getting different crystal forms by varying the length of the protein. In genetic studies, a shorter C-terminal fragment of RecE,

beginning as far into the amino acid sequence as residue 606, exhibits full activity in

recombination assays in vivo (Muyrers, et al., 2000; Chu, et al., 1989). Remarkably,

further truncation by 4 additional residues from the N-terminal end (RecE606) resulted

in crystals that diffracted to a 2.8 Å resolution at the synchrotron (Figure 2.4 D).

During the course of this work, we discovered that the constructs we used to

express RecE fragment proteins all contained a P658L mutation. Fortuitously, this

mutation improves the diffraction of the RecE606 crystals by about 1 Å resolution

relative to crystals of RecE606 without the mutation. As measured using a

fluorescence-based exonuclease assay, the P658L mutation does not affect the enzymatic activity of either RecE564 or RecE606 (Figure 2.5). However, the shorter

RecE606 fragment, with or without the P658L mutation, exhibits significantly lower

exonuclease activity than RecE564 in this assay, as has been reported previously using a

gel-based assay (Muyrers, et al., 2000). Importantly, despite its lower exonuclease


activity in vitro, RecE606 actually exhibits 4-fold increased activity in recombination assays in vivo, as compared to longer fragments of the protein (Muyrers, et al., 2000).

Since RecE shares little sequence homology with any proteins whose structures are available, experimental phasing was needed to determine its crystal structure. Solving the phase problem turned out to be another big challenge, as standard soaking procedures failed to identify a useful heavy atom derivative after extensive attempts with a variety of heavy metals, and the SeMet protein expressed poorly and in an insoluble form. After trying several different expression protocols,

~10 mg soluble SeMet RecE606 (P658L) protein was obtained from 6 L cell cultures.

Fortunately, the SeMet derivative protein crystallized readily in conditions similar to

those that yielded the native crystals. Crystals of RecE606 belong to the tetragonal space group P4212, with cell dimensions of a = b = 123 Å, c = 67 Å, one monomer per asymmetric unit, and a relatively high solvent content of 72 %. The crystal structure was determined by single wavelength anomalous diffraction (SAD) from crystals of the selenomethionine protein. The x-ray diffraction data and the final refinement statistics are summarized in Table 2.2. A representative section of the unbiased experimental electron density map calculated with the SAD phases after solvent flattening is shown in Figure 2.6 A.

This map allowed fitting of most of RecE, including residues 612-664 and 699-864.

The final structure was refined to 2.8 Å resolution with crystallographic R- and free

R-factors of 28.9 % and 31.1 %. The somewhat high R-factors might be largely due to the anisotropic diffraction, as well as the presence of the 34-residue disordered loop for which there is some density that is not clear enough to model (Figure 2.6 B).
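
The high solvent content quoted above can be checked roughly with the Matthews coefficient. The sketch below rests on three assumptions that are not stated in the text. First, the monomer mass of RecE606 is estimated at about 29.7 kDa by scaling the 34.5 kDa RecE564 subunit (303 residues) to 261 residues. Second, space group P4212 is taken to have eight asymmetric units per unit cell. Third, the conventional approximation of solvent fraction as 1 - 1.23/V_M is used. The cell dimensions are those of the native crystal in Table 2.2.

```python
def matthews_coefficient(a, b, c, n_asu, mass_da):
    """V_M in cubic angstroms per dalton for a cell with all angles at 90 degrees."""
    cell_volume = a * b * c
    return cell_volume / (n_asu * mass_da)


def solvent_fraction(vm):
    """Approximate solvent fraction from V_M (conventional 1.23 constant)."""
    return 1.0 - 1.23 / vm


# Native crystal cell from Table 2.2 (angstroms).
A_CELL, B_CELL, C_CELL = 123.2, 123.2, 67.3
N_ASU = 8  # asymmetric units per cell assumed for P4212
# Assumed monomer mass: RecE564 subunit mass scaled by residue count.
RECE606_MASS_DA = 34500.0 * 261 / 303

vm = matthews_coefficient(A_CELL, B_CELL, C_CELL, N_ASU, RECE606_MASS_DA)
print(f"V_M = {vm:.2f} A^3/Da, solvent fraction = {solvent_fraction(vm):.2f}")
```

With these inputs the estimate comes out near 4.3 Å3/Da and a solvent fraction of about 0.71, which is in line with the reported value of 72 % for one monomer per asymmetric unit.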


Figure 2.3 SDS-PAGE of purified RecE fragments and λ exonuclease. From left to right, the lanes represent proteins 1) ladder, 2) λ exonuclease, 3) RecE554, 4) RecE564,

5) RecE588, 6) RecE602, and 7) RecE606.


Figure 2.4 Crystals and diffraction of RecE proteins of different lengths. Four forms of RecE crystals are shown from top to bottom: A. RecE564, B. RecE602 (Mg), C.

RecE602 (Ca), D. RecE606 (Mg). The scale bars next to the crystals are 100 microns.

From left to right, the figures show (1) an image of crystals under a microscope, (2) a diffraction image, and (3) data collection statistics and crystallization conditions.


Figure 2.5 Exonuclease activities of RecE proteins.

(A) The plot shows the decrease in fluorescence of PicoGreen due to the digestion of a linear dsDNA fragment upon addition of exonuclease enzymes. The different curves are for reactions with different enzymes, as indicated at right. Notice that the exonuclease activity of RecE564 is similar to λ exonuclease, and much higher than that

of RecE606. The P658L mutation, which dramatically improves the crystals of the

RecE606 protein, has no effect on the exonuclease activity of either RecE564 or RecE606.

(B) Close-up of the curves for the RecE606 fragments.


Table 2.2 Summary of the Crystallographic Data

                                RecE606 P658L (Native)    SeMet RecE606 P658L

X-ray Diffraction Data
Space Group                     P4212                     P4212
Unit Cell Dimensions
    a = b (Å)                   123.2                     123.4
    c (Å)                       67.3                      67.7
Resolution (Å)                  45.4-2.8 (2.95-2.80)      36.7-3.2 (3.37-3.20)
No. unique reflections          13,187                    9,072
Redundancy                      23.5 (23.4)               12.5 (12.9)
Completeness (%)                99.6 (100.0)              99.9 (100.0)
I/σ                             19.6 (3.3)                22.2 (6.7)
Rmerge                          0.080 (0.70)              0.099 (0.33)

Refinement Statistics
Resolution (Å)                  45.41-2.80
No. of reflections              13,172
Rwork/Rfree (%)                 28.9/31.1
No. of protein atoms            1,737
Mean B factor (Å2)              68.9
R.M.S.D bond length (Å)         0.0089
R.M.S.D bond angle (°)          1.36
Residues in Ramachandran plot
    Most favored regions        151 (77.8 %)
    Additional allowed regions  35 (18.0 %)
    Generously allowed regions  8 (4.1 %)
    Disallowed regions          0 (0 %)


Figure 2.6 Electron density for the RecE crystal structure.

(A) View of the experimental density map calculated at 3.2 Å with SAD phases after

solvent flattening. The map is contoured at 1 σ and superimposed on the final refined

model of RecE.

(B) Packing of tetramers in the RecE crystal structure. The side view of a tetramer

with one subunit highlighted in yellow. The RecE tetramers pack in layers in the a-b

plane of the crystal, with less extensive contacts along the c axis, which is in the

horizontal direction. The 2Fo-Fc electron density map, calculated at 2.8 Å with phases from the final model, is contoured at 1 σ in blue. Notice the bridging density between

the layers, which corresponds to the 34-residue segment that is not included in the

final structure.


2.3.3 Structure of the RecE Monomer.

RecE folds into a structure with a central five-stranded mixed β-sheet surrounded by nine α-helices and a small three-stranded anti-parallel β-sheet (Figure

2.7A). As expected, the C-terminal portion of RecE, residues 725-845, exhibits a fold

that is found in λ exonuclease, the RecB nuclease domain, and several other nuclease

enzymes (Kovall and Matthews, 1998; Singleton, et al., 2004). This common core fold

includes all eight β-strands and helices C, G, and H of RecE. In addition, helices A, B,

and I of RecE are also present in λ exonuclease, and helix E of RecE is also present in

RecB (Figure 2.7). Although the structural alignment between RecE and λ

exonuclease includes two more α helices than that between RecE and RecB, RecE

aligns more closely to RecB. For the core fold that forms the active site, the rmsd

between RecE and λ exonuclease is 2.7 Å for 98 pairs of Cα atoms, while the rmsd

between RecE and RecB is 1.6 Å for 82 pairs of Cα atoms. Interestingly, RecE and λ

exonuclease both have an extended loop that crosses over the active site to create a

hole of about 5-10 Å in diameter. This loop connects helices B and C in RecE and is

in a structurally equivalent position in λ exonuclease. Based on the structure of the

RecE tetramer described below, it appears that the 5’-ended strand of the DNA

substrate must be threaded through this hole in order to access the active site.

A 34-residue segment of RecE located between helices D and E, residues 665-698, is

not well resolved in the electron density and is not included in the final refined

structure. However, sparse electron density shows that it extends about 25 Å into the

solvent region to contact the next layer of molecules in the crystal (Figure 2.6 B).


Figure 2.7 RecE monomer structure and structure alignment of RecE to λ exonuclease and RecB.

(A) Ribbon diagram of the RecE monomer structure. Six conserved sequence motifs

are highlighted in different colors as defined in sequence alignment result (Figure

2.10). The position of the active site Mg2+ is indicated by a magenta sphere. Side

chains of highly conserved active site residues are shown in stick form.

(B) The λ exonuclease monomer, viewed in the same orientation as RecE, after

structural superposition. Regions of λ exonuclease that align to within 4 Å of equivalent Cα atoms of RecE are shaded in green. Active site residues are shown in magenta.

(C) The RecB monomer, viewed in the same orientation as RecE, after structural superposition. Coloring is the same as in panel B.


In the RecE tetramer, this extended segment is in position to interact with the

“downstream” portion of the incoming DNA substrate. The P658L mutation, which

dramatically improves the crystal diffraction but does not affect enzymatic activity, is

located at the tight turn between helices C and D, just preceding the 34-residue

extended segment (Figure 2.7A). The very C-terminal portion of RecE, residues

848-864, forms an extended segment that is also not present in RecB or λ exonuclease.

In the tetramer, the C-terminal segment packs against the adjacent subunit to form an

integral part of the subunit interface.
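
The rmsd values quoted in this section for the core fold (2.7 Å over 98 Cα pairs for λ exonuclease, 1.6 Å over 82 Cα pairs for RecB) are root-mean-square distances between equivalent atoms after superposition. A minimal sketch of that final calculation is given below. It assumes the two coordinate lists are already superimposed and paired, and it does not perform the least-squares fit itself. The function name is hypothetical and is not from any program used in this work.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square distance between paired, already superimposed atoms.

    coords_a, coords_b : equal-length sequences of (x, y, z) tuples in angstroms,
                         ordered so that entry i of each list is an equivalent atom.
    """
    if len(coords_a) != len(coords_b) or len(coords_a) == 0:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared_sum / len(coords_a))
```

Because the value depends on which atom pairs are included, an rmsd is only meaningful together with the number of pairs, which is why both figures are reported for each comparison.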

2.3.4 Structure of the RecE Active Site.

As expected based on sequence comparisons and mutational studies (Chang

& Julin, 2001), Asp-748, Asp-759, and Lys-761 of RecE come together to form the

presumed active site on the enzyme (Figure 2.8). These residues are located on the

loop preceding β-strand 6. Although Mg2+ is present at 2 mM in the crystals, electron

density at its expected location is not observed, possibly due to the presence of 100 mM malic acid, which was required for crystallization. To confirm the position of the active site Mg2+, a 3.7 Å data set was collected on a crystal soaked with 25 mM

MnCl2. The resulting difference Fourier map (FMn2+ - Fnat) showed an 8 σ peak, the

highest peak in the map, at the expected position for the Mn2+ (Figure 2.8). Similar to

λ exonuclease, the Mn2+ is coordinated by the carboxyl groups of Asp-748, Asp-759

and backbone carbonyl oxygen of Val-760. In RecE, the Mn2+ is also coordinated by a

fourth ligand, His-652, which is located on α-helix C. An equivalent residue (His-956) is


found in RecB, where it coordinates the Ca2+ ion bound in the active site (Singleton et

al., 2004). In λ exonuclease, a highly conserved glutamate (Glu-85) that could

potentially be a fourth ligand is found at the equivalent position (Kovall and Matthews,

1997; Aravind, et al., 2000). Thus, the active site of RecE, in addition to its fold, more

closely resembles RecB than λ exonuclease, despite its closer functional similarity to

the latter. Although the lower resolution of the Mn2+-soaked RecE structure does not

permit a detailed analysis of the coordination geometry, structural alignment with

RecB suggests that RecE uses the same octahedral coordination geometry observed for

RecB, with four sites on the metal occupied by protein ligands and two sites available

for potential interactions with the DNA substrate and/or a hydrolytic water molecule

(Figure 2.9).

In the structure of λ exonuclease, a phosphate ion acquired from the buffers

used in protein purification is bound to a pocket near the active site, where it is coordinated by Arg-28, Ser-35, Ser-117, and Gln-157 (residue numbering of λ exonuclease). Since λ exonuclease has much higher activity on DNA substrates that

contain a 5’-phosphate group (Little, 1967; Mitsis & Kwagh, 1999), the phosphate observed in the crystal structure has been proposed to mark the site on the protein for binding to the terminal 5’-phosphate of the dsDNA substrate (Subramanian, et al.,

2003). Electron density is not observed for such a phosphate in the RecE structure,

even though phosphate was present during protein purification. Although Gln-157 and

Ser-35 of λ exonuclease are conserved in RecE (as Gln-781 and Ser-617, respectively),

Arg-28 and Ser-117 are not. Consistent with these structural observations, the activity


Figure 2.8 Close-up view of the active site of RecE. The six conserved sequence motifs are highlighted in different colors as defined in sequence alignment result

(Figure 2.10). The grey cage of electron density shows an 8 σ difference Fourier map

(FMn2+ - Fnat) calculated at 3.7 Å from data collected on a crystal soaked with 25 mM

MnCl2.


Figure 2.9 Close-up stereo view comparing the active sites of RecE and RecB.

The structures of the RecB C-terminal domain (PDB code 1W36; Singleton, et al.,

2004) and RecE were aligned by least squares superposition (rms = 1.6 Å for 82 pairs of Cα atoms). The Ca2+ ion from the structure of RecB is shown as a magenta sphere.

Residues of RecB that coordinate the Ca2+ are shown in stick form with carbon yellow, nitrogen blue, and oxygen red. Dashed lines indicate direct bonds to the Ca2+ ion.

Residues of RecE that superimpose on the ligand residues of RecB are shown in stick

form with carbon in cyan. Labels indicate residues of RecE, with the corresponding

residue numbers of RecB in parentheses. Notice that the positions of the ligand

residues of RecE are consistent with the octahedral geometry observed for RecB if

slight reorientations of the side chains are allowed, especially for His-652.


of RecE is not sensitive to the presence of a 5’-phosphate group on the DNA substrate

(Joseph & Kolodner, 1983b).

2.3.5 Six Conserved Sequence Motifs.

The protein sequence of E. coli Rac prophage RecE564 was used for an

NCBI-BLAST search (http://blast.ncbi.nlm.nih.gov/Blast.cgi) and 108 hits were

found, most of which are RecE proteins from different organisms. Four relatively diverse RecE sequences (sharing about 50 % sequence identity with one another), from Salmonella enterica, Burkholderia vietnamiensis, Iodobacteriophage phiPLPE, and Sodalis glossinidius, were chosen and aligned with the one from the E. coli Rac prophage using the program ClustalW (http://www.clustal.org/). Interestingly, only residues 601-866 of E. coli RecE could be aligned, although alignment with full-length RecE was attempted, and six highly conserved sequence motifs emerged (Figure

2.10). Strikingly, when the six motifs are color-coded and mapped onto the RecE structure, they all cluster around the active site, within about 15 Å of the Mg2+. Three of these

motifs (Motifs II, III, and IV) are also conserved in RecB and λ exonuclease family members, as has been noted (Aravind, et al., 2000). The residues that coordinate the

Mg2+ in RecE come from Motif II (Asp-748), Motif III (Asp-759 and Val-760) and

Motif V (His-652). Several other residues of these motifs, though not directly

coordinated to the Mg2+, are also highly conserved. Lys-761 of Motif III, which is

essentially invariant among the entire superfamily of nuclease enzymes, is within 3 Å

of the Mg2+, and has been suggested to stabilize the negatively charged


pentacoordinate phosphate intermediate and/or activate the hydrolytic water

molecules (Kovall and Matthews, 1999). Also invariant among RecB and λ

exonuclease family members is Gln-781 of Motif IV, which is hydrogen bonded to the

catalytic Lys-761 residue and thus may play a role in precisely orienting it or in

modulating its reactivity. Glu-729 of Motif I is invariant among RecE and RecB

family members, but is not found in λ exonuclease. In the RecE structure Glu-729 is

more distant from the Mg2+ (8.8 Å), but is within 4 Å of the active site histidine

residue (His-652).

Other residues of RecE that are highly conserved include Ser-731 and Tyr-733 of Motif I, Arg-744 and Arg-746 of Motif II, Tyr-778 and Tyr-785 of Motif IV, and

Leu-656, Glu-657, and Pro-658 of Motif V (Figures 2.8 and 2.10). These residues are all within about 15 Å of the Mg2+, and are likely to play key roles in substrate binding.

Finally, a sixth motif that is highly conserved within the RecE family is located at the

very N-terminal end of the crystallized RecE606 construct, but is not resolved in the electron density. Motif VI includes a YHA sequence, residues 607-609, that is invariant among RecE sequences. Based on the structure, the YHA sequence would be positioned within or preceding β strand 1 (colored black in Figure), about 20 Å from the active site. Although the YHA sequence appears to be unique to RecE family members, Ser-615 and Ser-617 of Motif VI, which are located at the N-terminal end of helix A, are highly conserved in RecB and λ exonuclease family members and are located at similar positions in their structures.


Figure 2.10 Alignment of five different RecE sequences showing six conserved sequence motifs, which are numbered based on a scheme used previously for

RecB family members (Aravind, et al., 2000). The top line labeled

“Rac_prophage” is the sequence of the E. coli RecE protein. The multiple sequence alignment was performed with the ClustalW algorithm. Invariant (*), highly conserved (:), and conserved (.) residues are indicated by the symbols below the sequence. Residues that coordinate the Mg2+ are indicated with these symbols in

magenta. Secondary structures of RecE are indicated above the sequence.
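The conservation symbols described in this caption can be illustrated with a short sketch. It is not the ClustalW implementation: it marks a column with '*' when every sequence has the same residue and with ':' when all residues fall within one of the "strong" substitution groups listed in the ClustalW documentation (reproduced here from memory, so the group list should be treated as an assumption). The weaker '.' class is not computed. The toy sequences are invented for illustration and are not taken from the RecE alignment.

```python
# Strong conservation groups (assumed to match the ClustalW documentation).
STRONG_GROUPS = ["STA", "NEQK", "NHQK", "NDEQ", "QHRK",
                 "MILV", "MILF", "HY", "FYW"]


def conservation_line(aligned):
    """Return a ClustalW-style symbol line for equal-length aligned sequences.

    '*' = invariant column, ':' = all residues within one strong group,
    ' ' = anything else (including columns that contain a gap).
    """
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("aligned sequences must all have the same length")

    symbols = []
    for column in zip(*aligned):
        residues = set(column)
        if "-" in residues:
            symbols.append(" ")
        elif len(residues) == 1:
            symbols.append("*")
        elif any(residues <= set(group) for group in STRONG_GROUPS):
            symbols.append(":")
        else:
            symbols.append(" ")
    return "".join(symbols)


# Hypothetical toy alignment (not real RecE sequences).
toy_alignment = ["YHASPL", "YHATPI", "YHA-PV"]
print(conservation_line(toy_alignment))
```

In this toy case the first three columns and the fifth are invariant, the gapped column is left blank, and the final L/I/V column falls within a strong group.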


2.3.6 Structure of the RecE Tetramer.

In the crystal, RecE forms a P4-symmetric tetramer with overall dimensions of

95 × 95 × 45 Å (Figure 2.11). Since the molecular 4-fold axis of the tetramer is

coincident with the crystallographic 4-fold, the tetramer has perfect 4-fold symmetry.

As viewed perpendicular to the 4-fold axis (Figure 2.11B), the tetramer has a remarkably flat surface on one side (the left side), and a much more featured surface

on the other side. The 34-residue segment between helices D and E that is not

included in the final structure would project out about 25 Å to the right of the tetramer

to give it an even more featured appearance. A total of 2,760 Å2 of solvent-accessible

surface area is buried at each subunit interface, which is formed by a roughly equal

mixture of hydrophobic, ion pair, and hydrogen bonding interactions. Solvent

accessible surface area calculations were performed using the AREAIMOL feature of

CCP4 with a probe radius of 1.4 Å (CCP4, 1994). A prominent feature of the interface

is the extended C-terminal segment, residues 848-864, which packs against the

neighboring subunit in the tetramer, in part through parallel β-sheet hydrogen bonding

with β-strand 8 (Figure 2.11 C). Other regions of RecE that are located at the subunit

interface include helix B and the extended segment that follows it, which pack against

helix E and the β7-β8 hairpin of the neighboring subunit. In general, the residues

buried at the subunit interface are not highly conserved among RecE sequences,

making it difficult to predict if all RecE proteins will form a similar tetramer.

However, the RecE sequences are highly divergent, such that essentially all of the

highly conserved residues are near the active site. All of the RecE proteins end


Figure 2.11 Structure of the RecE Tetramer.

(A) View of the RecE tetramer looking down into the central channel from the open

end. The active site Mg2+ of each subunit is indicated by a magenta sphere.

(B) Side view of the RecE tetramer, from the left of panel A. The open end of the central

channel, to which dsDNA would bind, faces to the right.

(C) Rear view of the RecE tetramer, from the plugged end of the channel.


at about the same amino acid position (± 4 residues), suggesting that the extended

C-terminal segment that helps to form the tetramer in E. coli RecE may be a common feature among them.
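The buried surface area quoted above for the subunit interface can be understood through a minimal sketch of a solvent-accessible surface area calculation. The sketch below uses the Shrake-Rupley point-sampling method, which is a different algorithm from the one implemented in AREAIMOL, and it is shown only to make the quantity concrete. The function names, the number of sample points, and the toy atoms are assumptions introduced for illustration. Buried area is taken here as the sum over both partners, SASA(A) + SASA(B) - SASA(AB); whether a reported value refers to this sum or to half of it depends on the convention used.

```python
import math


def sphere_points(n):
    """Approximately uniform points on a unit sphere (golden-spiral layout)."""
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    points = []
    for k in range(n):
        z = 1.0 - 2.0 * (k + 0.5) / n
        r = math.sqrt(1.0 - z * z)
        phi = k * golden_angle
        points.append((r * math.cos(phi), r * math.sin(phi), z))
    return points


def sasa(atoms, probe=1.4, n_points=500):
    """Shrake-Rupley solvent-accessible surface area in square angstroms.

    atoms: list of (x, y, z, van_der_waals_radius) tuples in angstroms.
    """
    unit = sphere_points(n_points)
    total = 0.0
    for i, (xi, yi, zi, ri) in enumerate(atoms):
        big_r = ri + probe  # radius of the probe-expanded sphere
        exposed = 0
        for ux, uy, uz in unit:
            px, py, pz = xi + big_r * ux, yi + big_r * uy, zi + big_r * uz
            buried = False
            for j, (xj, yj, zj, rj) in enumerate(atoms):
                if j == i:
                    continue
                cut = rj + probe
                if (px - xj) ** 2 + (py - yj) ** 2 + (pz - zj) ** 2 < cut * cut:
                    buried = True
                    break
            if not buried:
                exposed += 1
        total += 4.0 * math.pi * big_r * big_r * exposed / n_points
    return total


def buried_area(chain_a, chain_b, probe=1.4, n_points=500):
    """Accessible area lost by both partners when the complex forms."""
    separate = sasa(chain_a, probe, n_points) + sasa(chain_b, probe, n_points)
    return separate - sasa(chain_a + chain_b, probe, n_points)


# Hypothetical two-atom "interface" (carbon-like radii), not RecE coordinates.
atom_a = [(0.0, 0.0, 0.0, 1.7)]
atom_b = [(3.0, 0.0, 0.0, 1.7)]
print(round(buried_area(atom_a, atom_b), 1))
```

For a real interface the atom lists would be read from the coordinate file of each subunit; the brute-force neighbor search here scales quadratically and is only suitable for small examples.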

Looking down the 4-fold axis (Figure 2.11 A), the RecE tetramer forms a central channel that is about 25-30 Å wide throughout its depth, except at the back, where the channel is partially plugged by the C-terminal residues 858-864 of each subunit (Figure 2.11 C). In particular, the side chains of Arg-858 and Trp-859 project towards the 4-fold axis of the tetramer to result in a narrowing of the channel to about

10 Å (Figure 2.12). The depth of the channel is approximately 40 Å, enough to accommodate one complete turn of B-form DNA (Figure 2.13 B). Somewhat surprisingly, the central channel has only a weakly positive electrostatic potential

(Figure 2.13 A). Side chains of RecE that line the surface of the channel and are thus in a position to interact with the DNA substrate through electrostatic or hydrogen bonding interactions include Glu-765, Gln-767, Arg-768, Thr-771, and Asp-775 of helix G, as well as Arg-858 and Trp-859 of the C-terminal plug (Figure 2.12). Though not included in the final refined model, the side chains of Lys-643 (of the loop

connecting helices B and C) and Lys-704 (at the beginning of helix E), as well as the

34-residue segment between helices D and E, are near the opening of the channel and

are thus in position to interact with the incoming dsDNA substrate.
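The statement that the channel can accommodate one complete turn of B-form DNA can be checked with simple arithmetic. The sketch below assumes textbook values for B-form DNA (an axial rise of about 3.4 Å per base pair, about 10.5 base pairs per turn, and a duplex diameter of about 20 Å); these values and the variable names are not taken from this work. The channel dimensions are those given in the text.

```python
# Textbook B-form DNA parameters (assumed values, not from this work).
RISE_PER_BP = 3.4        # angstroms of helix axis per base pair
BP_PER_TURN = 10.5       # base pairs per helical turn
DUPLEX_DIAMETER = 20.0   # angstroms

# Channel dimensions as described in the text.
CHANNEL_DEPTH = 40.0     # angstroms
CHANNEL_WIDTH_MIN = 25.0 # angstroms, open region (25-30 angstroms wide)
PLUG_WIDTH = 10.0        # angstroms, narrowest point at the plugged end


def bform_length(n_bp):
    """Axial length in angstroms of n_bp base pairs of B-form DNA."""
    return n_bp * RISE_PER_BP


one_turn = bform_length(BP_PER_TURN)
print(f"one helical turn: {one_turn:.1f} A")
print(f"base pairs spanning the channel depth: {CHANNEL_DEPTH / RISE_PER_BP:.1f}")
print("turn fits in depth:", one_turn <= CHANNEL_DEPTH)
print("duplex fits open end:", DUPLEX_DIAMETER <= CHANNEL_WIDTH_MIN)
print("duplex fits plugged end:", DUPLEX_DIAMETER <= PLUG_WIDTH)
```

Under these assumptions one helical turn spans about 36 Å, slightly less than the channel depth, and the duplex is narrower than the open end of the channel but twice as wide as the opening at the plugged end.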

The four active sites of the tetramer are located about 15 Å from the central

channel (28 Å from the 4-fold axis), and about midway (~20 Å) along its depth. The

active sites are exposed to the central channel by a ~10 Å wide tunnel through each


Figure 2.12 View of active site tunnels in the RecE tetramer. Stereo Cα trace of the

RecE tetramer. Side chains lining the central channel and the active site tunnels are highlighted in stick form with carbon yellow, oxygen red, and nitrogen blue. The rear of the central channel is plugged by the C-terminal segment of each subunit, which includes Arg-858 and Trp-859. Side chains of Lys-704 and Lys-643 are not included in the final refined model, but are shown here to indicate their positions on the backbone.


Figure 2.13 View of the central channel in the RecE Tetramer.

(A) Surface view of the RecE tetramer colored according to electrostatic potential, with blue strongly positive and red strongly negative based on the charge-smoothed potential calculated in PyMOL (DeLano Scientific LLC). Notice the weakly positive electrostatic potential of the central channel, except at the back where Arg-858 gives it a strong positive potential. The right panel is the view from the left of panel B, looking into the active site portal of the upper, front subunit. Notice the strong positive electrostatic potential around the rim of the active site portal.

(B) View of the RecE tetramer with B-form DNA modeled manually into the central channel. The right panel is the side view of the model for the RecE-DNA complex, with dsDNA entering the central channel at right, and ssDNA exiting the channel at left.

subunit, the roof of which is formed by the loop connecting helices B and C (Figure

2.12). The tunnel is wide enough to allow passage of one strand of a DNA substrate bound within the central channel, but not the entire duplex. A number of conserved residues, including Ser-615 and Ser-617 of Motif VI, and Tyr-776, Tyr-778, Gln-781, and Tyr-785 of Motif IV, line the surface of the active site tunnel (Figure 2.12), and are thus in position to interact with one strand of the duplex DNA substrate as it passes through the tunnel. Interestingly, the four residues of Motif IV, three of which are tyrosine, form a ladder in which the side chains point out from the same face of helix H and are spaced

~3.8 Å from one another. This “tyrosine ladder” would appear to be well suited for forming stacking interactions with the bases of ssDNA. Each active site tunnel forges all the way through its subunit to form a portal at the outer surface of the tetramer. A number of positively charged residues, including Arg-744 and Arg-746 of Motif II, line the rim of the portal to give it a positive electrostatic potential (Figure 2.13 A).

This positively charged region could be a site for binding the terminal 5’-phosphate group that is generated at each cycle of the exonuclease reaction, which would help to position the scissile bond of the DNA substrate correctly within the active site. The portal itself could facilitate release of the 5’-mononucleotides that are liberated from the 5’-ended strand of the DNA substrate as it is processively digested.

2.4 Discussion

We report the first crystal structure of RecE protein, a highly processive 5’-3’ exonuclease that is part of a two-component recombination system found in several


bacterial and phage genomes. We have determined the crystal structure not of full

length RecE protein, which is 866 amino acids, but rather of a C-terminal fragment of

the protein, residues 606-866. Although the RecE606 fragment has considerably lower

exonuclease activity than RecE564, the stable domain of RecE identified by limited

proteolysis, it is important to note that RecE606 is fully functional in recombination

assays in vivo (Muyrers, et al., 2000). We were able to crystallize the longer RecE564

fragment, but the crystals diffracted x-rays weakly and anisotropically. Thus, removal

of residues 564-605 was necessary to obtain crystals of sufficient quality to solve the

structure at atomic resolution (2.8 Å). In the crystal structure, the N-terminal residues

of RecE606 are involved in extensive crystal packing interactions that would clearly be

disrupted if the additional residues of RecE564 were present. This provides a possible explanation for the significantly improved diffraction of crystals of RecE606.

What is the reason for the reduced activity of RecE606 as compared to RecE564?

Since residues 564-605 are included in the stable fragment of RecE that is resistant to limited proteolysis, they are likely to be a folded part of the structure. However, this region in E. coli RecE is not conserved or even present in many other RecE proteins.

Based on the position of the most N-terminal residue (Pro-612) in the structure of

RecE606, residues 564-605 would be located at the outer surface of the tetramer, near

the active site portal. A highly conserved YHA sequence within Motif VI of RecE

(residues 608-610) is present at the N-terminal end of the crystallized RecE606

fragment, but is not resolved in the electron density. As will be discussed in Chapter 3,

mutation of Tyr-609 or His-610 to alanine reduces catalytic activity significantly,


suggesting that both of these residues play important roles in catalysis. Conceivably,

the absence of residues 564-605 could disrupt the folding of or the interactions with the

YHA motif in the RecE606 fragment, thereby explaining its reduced activity.

During the course of this work, we discovered that the constructs we used to

express the RecE proteins contained a P658L mutation. Therefore, we expressed and

purified RecE606 and RecE564 proteins without this mutation, and showed that the

mutation does not affect the exonuclease activities of either protein. Remarkably,

crystals of the RecE606 protein without the P658L mutation diffracted x-rays to only ~4

Å on our home x-ray source, a full 1 Å lower in resolution than crystals of RecE606 with the mutation. How could the P658L mutation lead to such a significant improvement in crystal diffraction? Pro-658 is a highly conserved residue within

Motif V of RecE, which also includes the histidine residue (His-652) that coordinates the Mg2+. We have determined the crystal structure of the RecE606 protein without the P658L mutation to 4.2 Å resolution, and at this resolution the two structures are

essentially indistinguishable (data not shown). In the structure of the P658L mutant,

Leu-658 is located in the tight turn between helices C and D, which is near the

flexible 34-residue segment that forms the only crystal contacts along the c axis of the

crystal (Figure 2.6 B). It is conceivable that the mutation somehow stabilizes this

region of the protein so that more rigid crystal contacts are formed. It is interesting to

note that although Pro-658 is highly conserved among RecE sequences, leucine is

actually found at this position in RecB and λ exonuclease (Figure 2.14).


Figure 2.14 Structure-based sequence alignment of RecE, RecB and λ exonuclease. The structures of λ exonuclease and RecB were superimposed with the structure of RecE and the resulting sequence alignments are shown above and below the sequence of RecE, respectively. The conserved sequence motifs of RecE are colored onto the sequences, as in Figure 2.10 (Motif I, green; Motif II, yellow; Motif

III, blue; Motif IV, red; Motif V, orange; Motif VI, black). The positions of secondary structures in RecE are indicated above the sequence alignment. Residues that are conserved in the three proteins are indicated below the sequence. Residues that coordinate the Mg2+ ion are shown with these symbols in red. The 34-residue

insertion in RecE that forms an extended loop that is poorly resolved in the structure

is shaded in gray.


2.4.1 Structural Comparison of RecE and λ exonuclease Oligomers.

Despite RecE having essentially the same function as λ exonuclease, the structure of the RecE monomer more closely resembles that of RecB, which functions as a single subunit in the RecBCD helicase-nuclease complex. This is evident from the significantly lower rmsd of superimposing the core region of RecE to RecB (1.6 Å) than to λ exonuclease (2.7 Å). Moreover, the conserved active site residues of RecE are more similar to those of RecB than those of λ exonuclease. For example, the histidine residue that coordinates the active site metal in RecE and RecB is replaced by a glutamate in λ exonuclease, and λ exonuclease does not appear to have the highly conserved glutamate of Motif I of RecE and RecB. Although RecE and λ exonuclease both form toroidal oligomers with central channels of similar size and shape, the regions of the two proteins that come into contact at their respective subunit interfaces are different. Moreover, and strikingly, if we assume that the DNA substrates bind to the open ends of the central channels in the RecE and λ exonuclease structures, then it is clear that the subunits of RecE and λ exonuclease are packed into their oligomeric rings in essentially opposite orientations relative to the incoming DNA substrates.

This can be seen by superimposing the monomers of RecE and λ exonuclease and then translating them into their respective oligomeric rings, as shown in Figure 2.15.

The closer similarity of RecE to RecB, together with the differences in subunit packing in the RecE and λ exonuclease oligomers, suggests that RecE and λ exonuclease have evolved separately, possibly from a common ancestor that was monomeric, to form oligomeric rings for processing DSBs.
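For reference, the rmsd values compared above measure the root-mean-square distance between paired Cα atoms after the two structures have been superimposed. The sketch below computes only that final quantity for coordinates that are already superimposed; the least-squares fitting step itself is omitted. The function name and the toy coordinates are assumptions made for illustration and do not come from the RecE or RecB structures.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between paired (x, y, z) coordinates.

    Assumes the two coordinate sets are already superimposed and that
    element i of coords_a is paired with element i of coords_b.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared / len(coords_a))


# Hypothetical example: 82 paired atoms, each displaced by 1.6 angstroms along x.
set_a = [(float(i), 0.0, 0.0) for i in range(82)]
set_b = [(x + 1.6, y, z) for x, y, z in set_a]
print(f"{rmsd(set_a, set_b):.2f}")
```

Because the deviation is averaged over the paired atoms only, an rmsd is meaningful only together with the number of atom pairs included in the superposition.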


Figure 2.15 Monomers of RecE and λ exonuclease pack into their toroidal oligomers in opposite orientations. In the center of the figure, the RecE (cyan) and λ exonuclease (red) monomers are superimposed and viewed looking into the active site, which is indicated by the magenta sphere for the Mg2+. The RecE and λ exonuclease oligomers are shown at right and left, respectively, with the colored subunit in the same orientation as in the superposition in the middle. This places the colored subunit at the back of the oligomer in each case, with its active site facing the central channel.

The DNA structures indicate the open end of the central channel on each structure, to which the dsDNA substrates would bind. Notice that the open end of the central channel of the RecE tetramer faces to the right, while that of the λ exonuclease trimer faces to the left.


Our conclusion that the subunits of RecE and λ exonuclease are packed into their oligomeric rings in opposite orientations relative to the incoming DNA substrates implies that the 5’-ended strands of the DNA substrates would approach the active sites of RecE and λ exonuclease from roughly opposite directions. This would seem to require that the active sites of RecE and λ exonuclease, which are evolutionarily related and contain several conserved residues, do chemistry on substrates oriented differently relative to their active sites. However, since it is the terminal nucleotide that is cleaved, there may be sufficient conformational flexibility of the 5’-ended strand within the active site, such that the scissile bond of the DNA substrate could sit down in similar orientations relative to the Mg2+ in the two structures. It is also important to note that the RecB C-terminal domain is able to cleave both the 5’- and 3’-ended strands of the DNA substrate, suggesting an inherent adaptability of this fold to accommodate different types or orientations of DNA substrates. If we assume that RecE and λ exonuclease have evolved separately to form oligomeric rings that perform the same function, it is interesting to note several common features of their overall structures that are likely to be fundamental to their modes of action.

First, the central channels formed by RecE and λ exonuclease are remarkably similar in size and shape, and are of appropriate dimension to bind to B-form dsDNA substrates. Both central channels are completely open at one end and partially plugged at the other. In λ exonuclease, the channel is plugged by the N-terminal end of an

α-helix, whereas in RecE, the channel is plugged by the very C-terminal segment of


the protein (residues 848-864). Thus, different regions of RecE and λ exonuclease

have apparently evolved to perform the same functional role. Interestingly, in both

structures, the side chain of a positively charged residue, Arg-858 in RecE, and

Lys-76 in λ exonuclease, projects toward the central axis of the oligomer at the

narrowest part of the channel. Conceivably, this positively charged residue, as well as

a nearby tryptophan residue that is present in both structures (Trp-859 in RecE and

Trp-80 in λ exonuclease), could be important for binding the 3’-ended strand of the

DNA substrate as it passes through the narrow end of the channel.

Second, RecE and λ exonuclease both have extended segments that project out

from the rim of the central channel, in a position to interact with the “downstream”

portion of the incoming dsDNA substrate. In λ exonuclease this extended segment, residues 43-50, is formed by a loop between two α helices that correspond to helices

A and B of RecE. This loop includes three positively charged residues, Arg-45, Lys-48,

and Lys-49, which could form favorable electrostatic interactions with the

sugar-phosphate backbone of the incoming DNA substrate. In RecE, the 34-residue

segment between helices D and E is located in a very similar position in the tetramer.

Although this segment is not well resolved in the crystal structure, sparse density

shows that it extends out about 25 Å from the rim of the central channel, in a position to interact with the incoming DNA substrate. Consistent with a possible role in

DNA-binding, this segment of RecE contains five positively charged residues (Arg-673,

Arg-674, Lys-679, Lys-683 and Lys-694), a conserved RFIVAP motif (residues

664-669), and an invariant threonine residue (Thr-775).


Third, despite the fact that the subunits of RecE and λ exonuclease are packed

into their respective oligomers in opposite orientations, the active sites of the two

proteins are located in very similar positions within the oligomers. In both structures, the active sites are located about 10-15 Å from the rim of the central channel, and are exposed to the channel via a narrow tunnel that forges through each subunit. In both cases, the tunnel that leads from the central channel to the active site is wide enough to allow the passage of ssDNA, but not dsDNA, suggesting that the DNA substrate is unwound by at least 2-3 base pairs prior to nucleolytic digestion. Moreover, in both structures, the tunnel forges all the way through each subunit of the oligomer to form a portal that could facilitate release of the 5’-mononucleotide that is liberated at each cycle of the reaction. In RecE, the portal is located at the outer edge of the side of the tetramer at which the dsDNA is presumed to enter, while in λ exonuclease the portal is located on the other side, the side at which the 3’-ended strand is presumed to exit.

2.4.2 Mechanism for Processive Digestion of dsDNA.

Based on the common structural features of RecE and λ exonuclease, which are likely to be fundamental to their modes of action, we propose a general mechanism for how this class of enzymes catalyzes processive digestion of dsDNA.

The proposed mechanism is similar to and builds on one proposed previously for λ

exonuclease (Kovall and Matthews, 1997). In this mechanism, the dsDNA substrate

enters the toroidal oligomer through the open end of the central channel, which is of

the appropriate dimension to accommodate one complete turn of B-form DNA. Upon


binding, the strands at the end of the DNA substrate are unwound, with the 5’-ended

strand passing through a narrow tunnel to engage with one of the active sites on the

oligomer, and the 3’-ended strand passing through the narrow opening at the back of

the tetramer, possibly interacting with Arg-858 and Trp-859 of RecE, or Lys-76 and

Trp-80 of λ exonuclease. Upon hydrolysis of the terminal nucleotide on the 5’-ended strand, the released 5’-mononucleotide diffuses out through the active site portal, and

the DNA substrate translocates through the channel to position the next nucleotide on

the 5’-ended strand within the active site.

Observations from single molecule studies of λ exonuclease support the notion

that the terminal base pairs of the dsDNA substrate are unwound prior to cleavage of

the 5’-ended strand. In one study, λ exonuclease was observed to digest the substrate at a fairly constant rate of 12 nucleotides per second, except at distinct sites at which the enzyme paused (Perkins, et al., 2003). The pause sites exhibited a directional dependence and occurred at a particular GGCGA sequence, which was proposed to interact as ssDNA with residues lining the inner surface of the channel, including

Trp-24. In a separate study, the rate of translocation of λ exonuclease along a DNA

substrate was shown to depend on the sequence of the DNA, with slower rates

correlating with GC-rich sequence (Van Oijen, et al., 2003). It was concluded that

melting of the terminal nucleotide (or nucleotides) was necessary for catalysis and

was the rate-limiting step at each cycle of the reaction. These types of experiments

have not yet been performed on RecE protein. It would be interesting to see if RecE

exhibits a similar behavior.


The structures of RecE and λ exonuclease and the resulting mechanism that is

proposed nicely account for the hallmark features of the nuclease activities of these

enzymes. For example, the total lack of activity on circular DNA substrates is due to

the location of the active site within a tunnel on the enzyme. Since in RecE the

5’-ended strand of the DNA substrate must be threaded through the tunnel formed by

the loop between helices B and C to access the active site, it is physically not possible

for double stranded DNA to gain access to the active site, even if RecE were a

monomer. As has been pointed out previously (Breyer and Matthews, 2001), the

highly processive nature of these enzymes can be accounted for by the fact that the

toroidal oligomer and the DNA substrate are topologically linked by the threading of the 3’-ended strand of the DNA through the central channel of the oligomer, like a bead on a string. The low activity on single-stranded DNA substrates can also be explained. The 5’-end of ssDNA could conceivably diffuse through the central channel to access the active site, consistent with the small amount of activity that is observed. However, cleavage of ssDNA would not be processive, and would therefore be much slower, since it is the other strand of the DNA substrate, the 3’-ended strand, that is responsible for topological linkage to the enzyme.

Based on this model for the action of RecE and λ exonuclease, some interesting questions emerge. Do the RecE and λ exonuclease oligomers use all of their active sites during processive digestion of a dsDNA substrate, such as in a sequential type of mechanism? Or does the 5’-ended strand engage with one of the active sites on the oligomer for multiple rounds of nucleolytic hydrolysis, such that in


principle only one active site per oligomer would be sufficient for processive nuclease

activity? The structures of RecE and λ exonuclease offer some insight into this question. In the RecE tetramer, the active sites on adjacent monomers are located ~40

Å from one another, and are accessed from the central channel through relatively

narrow tunnels on each subunit. In the λ exonuclease trimer, the three active sites are

arranged in a similar fashion. Thus, movement of the terminal nucleotide of the

5’-ended strand of the DNA substrate from one active site to the next on the toroidal

oligomer would require substantial conformational rearrangement, as well as breaking

of what are likely to be extensive interactions between the protein and DNA substrates.

Thus, the structures appear to support a mechanism in which the dsDNA substrate is

engaged with one of the active sites on the oligomer for multiple rounds of processive

dsDNA digestion.
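The ~40 Å separation between adjacent active sites is consistent with the geometry described in Chapter 2, where each active site lies about 28 Å from the 4-fold axis. A minimal sketch of that calculation follows. It assumes that the active sites are related by exact n-fold rotational symmetry and lie at the same height along the axis; the function name is introduced here for illustration only.

```python
import math


def adjacent_site_distance(radius, n_subunits):
    """Chord length between neighboring sites on a symmetric ring.

    radius: distance of each site from the symmetry axis (angstroms).
    n_subunits: order of the rotational symmetry (4 for the RecE tetramer).
    Assumes all sites lie in one plane perpendicular to the axis.
    """
    return 2.0 * radius * math.sin(math.pi / n_subunits)


# RecE tetramer: active sites about 28 angstroms from the 4-fold axis.
adjacent = adjacent_site_distance(28.0, 4)
opposite = 2.0 * 28.0  # sites related by a 180 degree rotation
print(f"adjacent sites: {adjacent:.1f} A, opposite sites: {opposite:.1f} A")
```

With a radius of 28 Å and 4-fold symmetry, the chord between adjacent sites is 2 × 28 × sin(45°), or about 39.6 Å, in agreement with the ~40 Å quoted above.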

Another interesting question will be what provides the driving force to push

the DNA substrates through the RecE and λ exonuclease oligomers, since these

reactions are ATP-independent. The structure of the RecBCD heterotrimer, which has

both nuclease and helicase domains, in complex with a DNA substrate (Singleton, et

al., 2004), sheds some light on answering this question. In this structure, an extensive

unwinding reaction on the dsDNA substrate was observed in the initial binding event,

in the absence of helicase activities. Conceivably, for RecE and λ exonuclease, which

have no known helicase activities, the initial unraveling of the duplex DNA ends

could be driven by the favorable binding energy of the two DNA strands to different

regions of the protein. Translocation of the DNA substrate along the central channel of


the oligomer could possibly be coupled to the energy released from hydrolysis of the phosphodiester bond at each step of the nuclease reaction.

2.4.3 Interaction with the Single Strand Annealing Protein

RecE and λ exonuclease each form a specific protein-protein interaction with their respective single strand annealing protein, RecT or β protein. The functional interaction with RecT is maintained in RecE606 (Muyrers, et al., 2000), indicating that the site on RecE for interaction with RecT resides within the crystallized fragment.

Although the role of the protein-protein interaction is not firmly established, it could be for loading of the single strand annealing protein directly onto the 3’-ended strand of the DNA substrate as it is generated by the exonuclease. Such a mechanism would be precisely analogous to the loading of RecA onto the 3’-ended strand of a DNA substrate during processing of a dsDNA break by the RecBCD complex (Anderson and Kowalczykowski, 1997). If this is indeed the role of the RecE-RecT interaction, one might expect the site on RecE for binding to RecT to be located on the side of the tetramer at which the 3’-ended strand of the DNA substrate is extruded, as is the case for RecBCD (Singleton, et al., 2004). Interestingly, however, this face of the RecE tetramer is remarkably flat, with no obvious loop or groove to facilitate a stable protein-protein interaction. The same is true for the λ exonuclease trimer. The structure of RecE does reveal an exposed hydrophobic patch on this side of the tetramer, formed by the side chains of Phe-754, Trp-756, Trp-720, Leu-717, Ile-851, and Met-819. Perhaps this hydrophobic patch could be a site for interaction with RecT.


CHAPTER 3

STRUCTURE-ACTIVITY ANALYSIS OF RECE

3.1 Introduction.

The crystal structure of RecE exonuclease has been described and discussed in detail in Chapter 2. RecE forms a toroidal tetramer with a tapered central channel that is wide enough for a double stranded DNA to enter at the wider end but only a single strand of the DNA to exit at the other end. Based on the structure, a mechanism of processive dsDNA digestion by RecE and related oligomeric exonuclease enzymes was proposed (Figure 3.1). A duplex DNA binds to the open end of the central channel. Upon binding, the terminal 3-5 base pairs of the DNA are unraveled, the

5’-ended strand is fed into a narrow tunnel to access one of the four active sites, and the 3’-ended single strand overhang passes through the central channel, exiting from the partially plugged end. The structure and the proposed mechanism suggest that at least three regions in addition to the active site on the RecE structure are potentially involved in protein-DNA interactions. First, the disordered 34-residue loop (residues

665-698) that projects out about 25 Å from the open end of the central channel is positioned to interact with the incoming duplex DNA substrate. This is evidenced by the bridging electron density between two layers of RecE molecules (Figure 2.6)


Figure 3.1 Proposed mechanism of processing double-stranded DNA ends by

RecE tetramer. The crystal structure of RecE tetramer with a toroidal central channel suggests a mechanism in which a duplex DNA binds to the open end of the channel, the terminal few base pairs are unwound, the 5’-ended strand passes through a tunnel to access one of the four active sites, and the 3’-ended strand passes through the narrow end of the central channel.


and the fact that this loop has five positively charged residues (Arg-673, Arg-674,

Lys-679, Lys-683, and Lys-694). It appears that without a DNA substrate, these extending loops are highly flexible, and so they are not clearly observed in the electron density maps. Second, if we assume that a duplex DNA enters and a single stranded overhang exits from the central channel, residues lining the surface of the central channel must play an essential role in protein-DNA contacts and might also be involved in DNA translocation. These residues include Glu-765, Gln-767, Arg-768,

Thr-771, Asp-775 and Tyr-776 of helix G, as well as Thr-807, Glu-809 and Cys-810 of the loop connecting strand β7 and β8. Third, the C-terminal segment (residues

858-866) that packs against the neighboring subunit and partially plugs the back of

the central channel is poised to interact with the 3’-ended strand as it passes through

the narrow end of the channel at the back of the tetramer. In particular, the side chains

of Arg-858 and Trp-859 project towards the 4-fold axis of the tetramer, to constrict the

channel to about 10 Å.

Mapping the six highly conserved sequence motifs onto the structure of RecE

monomer shows that all six of the motifs cluster around the active site Mg2+ ion

(Figure 3.2). Some residues, such as Asp-748, Asp-759, and Val-760 of Motif III, and

His-652 of Motif V, directly coordinate the Mg2+. All of the other conserved residues

are within ~15 Å of the active site. These residues include Ser-731 and Tyr-733 of

Motif I, Arg-744 and Arg-746 of Motif II, Tyr-778, Gln-781 and Tyr-785 of Motif IV,

Leu-656, Glu-657 and Pro-658 of Motif V, and Tyr-608, His-609, Ser-615 and

Ser-617 in Motif VI. Apparently, a large network of interactions between the DNA


Figure 3.2 Six conserved active site motifs of RecE.

(A) Six conserved sequence motifs are mapped in the RecE monomer, with Motif I

green, Motif II yellow, Motif III blue, Motif IV red, Motif V orange and Motif VI

black. The active site Mg2+ is indicated by the magenta sphere.

(B) Close-up view of RecE active site. Side chains of highly conserved residues are

shown in stick form. Notice that the motifs all cluster around the active site.


substrate and the residues lining the active site cleft is required for catalysis and

translocation of the enzyme along the DNA substrate.

In order to test the proposed model for RecE’s mechanism of action, we

introduced 24 mutations into RecE564, purified the mutant proteins, and measured their exonuclease and DNA-binding activities. The mutations targeted the six conserved active site motifs, the 34-residue disordered loop, residues lining the

surface of the central channel, and the C-terminal plug.

3.2 Methods and Materials

3.2.1 Expression and purification of RecE564 mutants

The QuikChange procedure (Stratagene) was used to construct plasmids for

producing the 24 RecE564 mutants, and all the plasmids were sequenced.

Each protein was expressed in E. coli BL21(AI) under the control of arabinose. Cell cultures (1 L) were inoculated at 37 ℃ and induced with 0.1% arabinose when the

OD600 reached 0.8, followed by continued shaking for 4 hours at 37 ℃. Cells were harvested and lysed in the same way as described in Chapter 2. Each protein was purified by nickel affinity chromatography (HiTrap Chelating HP, GE Healthcare) and gel filtration chromatography (HiLoad 16/60 Superdex 200 prep grade, GE

Healthcare). For the mutations that gave two separated peaks during gel filtration, fractions from the second peak (corresponding to a mass of ~ 220 kDa) were pooled.

Each protein was concentrated to 2-24 mg/ml, stored in 20 mM Tris (pH 7.5), 1 mM

DTT, and frozen at -80 ℃ in small aliquots.


3.2.2 Real-time DNase assay

3.2.2.1 Reaction conditions.

The real-time DNase assay (ReDA) was developed based on PicoGreen fluorescence (Tolun and Myers, 2003). The fluorescent dye PicoGreen has high fluorescence output only when bound to dsDNA. Hence, the nuclease activity can be measured in real time by the decrease in fluorescence intensity upon addition of an enzyme to a DNA substrate (Figure 3.3). Reactions (200 μl) consisting of 0.5 nM

BamHI linearized pUC19 DNA (1 nM ends) and PicoGreen (1/20,000 diluted,

Molecular Probes) in 20 mM Tris (pH 7.5), 10 mM MgCl2, 1 mM DTT were incubated in the dark for 3 minutes. Reactions were initiated by addition of a saturating amount of RecE or λ exonuclease (40-100 nM oligomer) at the desired temperature (37 ℃ or

25 ℃). The time-based fluorescence intensity was measured using a FluoroMax-3 spectrofluorometer (Horiba Jobin Yvon) at 484 nm excitation, 522 nm emission, 4 nm slit width, and 1 s⁻¹ sampling frequency.

3.2.2.2 Control reactions.

A reaction mixture prepared without addition of enzyme served as the negative control to account for a slight enzyme-independent decrease in signal, possibly due to photobleaching. A positive control was performed with 0.25 nM heat denatured pUC19 DNA (0.5 nM total ssDNA) to mimic completion of the reaction to

5’-mononucleotides and one strand of ssDNA. The heat denatured DNA was prepared by boiling the linear duplex DNA for 10 minutes, followed by rapid cooling on ice for

10 additional minutes.


Figure 3.3 Schematic representation of the ReDA reaction (Tolun and Myers,

2003). PicoGreen® (yellow rectangles) has low fluorescent output when not bound to

dsDNA. Upon binding to dsDNA, the fluorescence output of PicoGreen® increases

(distorted yellow rectangles). When λExo (red ellipses) is added in the presence of

Mg2+, it binds to the ends of the dsDNA and digests DNA in 5′ to 3′ direction, producing 3′ overhangs and free deoxyribonucleotide monophosphates (blue rectangles). During digestion, PicoGreen® is displaced and its fluorescence output decreases.


3.2.2.3 DNA calibration curve.

The concentration of PicoGreen used in this nuclease assay is far below what

is suggested by Molecular Probes. It was found that the fluorescence signal of

PicoGreen is not linearly dependent on concentration at this range, and thus a calibration curve was used to convert the measured fluorescence values to DNA concentrations (Tolun and Myers, 2003). To do this, linear double-stranded pUC19 DNA (0, 0.1, 0.2, 0.4, 0.6, 0.8, 1.0, 1.2 nM ends) was titrated into reaction mixtures without the enzyme and read by emission scanning at 484 nm excitation,

500-550 nm emission, 4 nm slit width, and 1 nm sampling interval. The fluorescence readings at 522 nm (X axis) were plotted against the increasing DNA concentration (Y axis) (Figure 3.4). A third order polynomial equation

($Y = 5\times10^{-13}X^{3} - 5\times10^{-8}X^{2} + 0.0037X + 0.1825$) was fit and used to correct and convert

the fluorescence intensity to DNA concentration using Microsoft Excel.
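
As an illustration of this conversion step, the calibration polynomial can be evaluated in a few lines of Python. This is only a sketch: the original analysis was carried out in Microsoft Excel, the names `CALIBRATION_COEFFS` and `fluorescence_to_dna_nM` are hypothetical, and the coefficients are the rounded values quoted in the text.

```python
# Coefficients of the third order calibration polynomial quoted in the text,
# highest power first. X is the fluorescence reading at 522 nm and Y is the
# dsDNA end concentration in nM. The values are rounded as printed.
CALIBRATION_COEFFS = (5e-13, -5e-08, 0.0037, 0.1825)


def fluorescence_to_dna_nM(x, coeffs=CALIBRATION_COEFFS):
    """Evaluate the calibration polynomial at fluorescence x (Horner's method)."""
    y = 0.0
    for c in coeffs:
        y = y * x + c
    return y
```

Applying the function to every time point of a raw fluorescence trace would give the DNA concentration trace used in the subsequent processing.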

3.2.2.4 Data processing.

In order to extract the rate of DNA digestion from the measured fluorescence

curves, the raw fluorescence data were manipulated as follows. First, fluorescence

reading at each time point was converted to DNA concentration using the calibration

curve as described above. Next, each data set was converted to percentage of DNA

digested by setting the negative control (no enzyme reaction) to 0% and the positive

control (0.5 nM heat-denatured ssDNA) to 100% (Figure 3.5 B). Finally, the slope of

the linear part of each data set, fit using Microsoft Excel (Figure 3.5 C), was multiplied by

2686 bp (the length of the pUC19 DNA substrate) to give the rate of reaction in terms


Figure 3.4 A standard dsDNA calibration curve. The data fit well to a third order

polynomial equation, with X equal to fluorescence intensity and Y equal to dsDNA end concentration.


Figure 3.5 An example of ReDA data processing.

(A) Raw fluorescence data for RecE564 and three mutations.

(B) Fluorescence intensity at each time point was converted to DNA concentration

using the calibration curve and the percentage of digestion was plotted versus time.

(C) The first 30 seconds of data were fit with a linear equation. The slopes were used

for kcat calculation.

of nucleotides digested per second. Since each DNA molecule has two ends that can

simultaneously be digested, half of this calculated rate gives the rate of digestion in

terms of nucleotides digested per second per DNA end. At least three identical

experiments were carried out for each protein.
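
The data processing steps described above can be sketched in Python as follows. This is an illustrative outline rather than the Excel workflow that was actually used: the function names are hypothetical, the two controls are assumed to be supplied as single DNA concentrations, and the fitted slope is assumed to be in percent digested per second, so it is divided by 100 before being multiplied by the substrate length.

```python
PUC19_LENGTH_BP = 2686  # length of the pUC19 substrate given in the text


def percent_digested(sample_nM, negative_nM, positive_nM):
    """Scale a DNA concentration so that the no-enzyme control is 0% and the
    heat-denatured control is 100%."""
    return 100.0 * (negative_nM - sample_nM) / (negative_nM - positive_nM)


def linear_slope(times_s, values):
    """Ordinary least-squares slope of values versus time."""
    n = len(times_s)
    mean_t = sum(times_s) / n
    mean_v = sum(values) / n
    num = sum((t - mean_t) * (v - mean_v) for t, v in zip(times_s, values))
    den = sum((t - mean_t) ** 2 for t in times_s)
    return num / den


def rate_per_end(times_s, percent, length_bp=PUC19_LENGTH_BP):
    """Nucleotides digested per second per DNA end.

    The slope (percent per second) is converted to a fraction, multiplied by
    the substrate length, and halved because both ends are digested at once.
    """
    fraction_per_s = linear_slope(times_s, percent) / 100.0
    return fraction_per_s * length_bp / 2.0
```

Only the linear portion of each trace (the first 30 seconds in Figure 3.5 C) would be passed to `rate_per_end`.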

3.2.3 Fluorescence anisotropy binding assay

An oligonucleotide labeled with a 5’-fluorescein, 5’-[FluorT]AGAGCTTAA

TTGCTGAATCTGGTG-3’ (HPLC purified from Integrated DNA Technologies), was

annealed with its complement, 5’-CACCAGATTCAGCAATTAAGCTCT-3’ to form a

25 bp duplex (F25). The complementary strands were annealed by heating both

strands at 95 ℃ for 5 minutes and then slow cooling to room temperature by turning

off the water bath. F25 (10 nM) was incubated with increasing concentrations

(10-1000 nM) of purified RecE564 (or mutant as indicated) for 20 minutes at 25 ℃ in

buffer containing 20 mM Tris (pH 7.5), 10 mM CaCl2, and 1 mM DTT. The fluorescence anisotropy of each equilibrated sample (20 μl) was measured at 25 ℃

with 490 nm excitation and 515 nm emission using a Spectra Max M5 Microplate

Reader (Molecular Devices). Two readings were taken for each sample and the

averaged anisotropy was plotted versus protein concentrations. Dissociation constants

(Kd) were determined from a fit of the data in each curve using KaleidaGraph

(Synergy Software) to the following equation, which assumes a 1:1 binding stoichiometry:

$$A = A_{min} + \left[(Y + S + K_d) - \left\{(Y + S + K_d)^{2} - 4YS\right\}^{1/2}\right] \frac{A_{max} - A_{min}}{2Y}$$


where A is the measured anisotropy, Amin is the minimum anisotropy, Amax is the

maximum anisotropy, S is the concentration of un-labeled protein and Y is the

concentration of labeled DNA. Standard deviations are based on values from three

independent experiments.
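
For readers who wish to reproduce the fit outside KaleidaGraph, the 1:1 binding equation can be written as a short Python function. This is a minimal sketch under stated assumptions: the function names are hypothetical, Amin and Amax are treated as known rather than fitted, and the coarse grid search is only a stand-in for the nonlinear least-squares fit performed by KaleidaGraph.

```python
import math


def anisotropy(S, Kd, Y, A_min, A_max):
    """Anisotropy predicted by the 1:1 binding equation in the text.

    S is the total protein concentration, Y the total labeled DNA
    concentration, and Kd the dissociation constant (same units, e.g. nM).
    """
    b = Y + S + Kd
    bound = (b - math.sqrt(b * b - 4.0 * Y * S)) / 2.0  # complex concentration
    return A_min + (A_max - A_min) * bound / Y


def fit_kd(S_values, A_values, Y, A_min, A_max, kd_grid):
    """Return the Kd on kd_grid with the smallest sum of squared residuals."""
    def sse(kd):
        return sum((anisotropy(s, kd, Y, A_min, A_max) - a) ** 2
                   for s, a in zip(S_values, A_values))
    return min(kd_grid, key=sse)
```

With Y set to the 10 nM concentration of F25 and S spanning the 10-1000 nM titration range, `fit_kd` returns the grid value that best reproduces a measured curve.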

3.3 Results

In order to probe the roles of the different regions of RecE in DNA-binding

and catalysis, 24 selected residues of RecE564 were mutated to alanine, and the

nuclease (kcat) and DNA-binding (Kd) activities of the purified proteins were

determined (Table 3.1). The nuclease activities of all RecE mutants were measured

at both 25 ℃ and 37 ℃. As expected, RecE proteins hydrolyze DNA

substrates at a faster rate at 37 ℃. However, the significant differences in enzyme

activities at these two temperatures were somewhat surprising. At 37 ℃, only four

mutations (E729A, K761A, Y778A and H652A) out of these 24 are completely dead,

while at 25 ℃ ten additional mutations (R744A, R746A, Y785A, E657A, Y608A,

H609A, S617A, R768A, W859A and Δ858-866) are totally inactive. Curiously, two

mutants that were essentially inactive at 25 ℃, R768A and W859A, appear to be more

active than wild type RecE564 at 37 ℃. The ten mutants that were active at 25 ℃ all

exhibit 5-15 fold higher nuclease activities at 37 ℃. The effect of temperature on λ

exonuclease seems to be less pronounced (a 2-fold increase in kcat at 37 ℃) than on the RecE proteins. Because 37 ℃ is close to physiological conditions, the enzyme activities at this temperature may better represent those in vivo. The


following discussions on catalysis affected by RecE mutations are based on the nuclease activity assay performed at 37 ℃.

Mutation of His-652 of Motif V to alanine completely abolishes activity, consistent with the notion that this residue, like Asp-748 and Asp-759 (Chang &

Julin, 2001), is critical for binding and positioning the active site Mg2+. Mutation of

several other residues of the conserved active site motifs, including Glu-729 of Motif

I, Arg-746 of Motif II, Lys-761 of Motif III, Tyr-778 and Tyr-785 of Motif IV,

Glu-657 of Motif V, and Tyr-608 of Motif VI, results in a >20-fold reduction in kcat, indicating that at least one residue from each of the six active site motifs is required for efficient catalysis. From these data it is apparent that a fairly large network of interactions between the DNA substrate and active site residues of RecE is important for precisely orienting the scissile bond of the DNA substrate relative to the active site and/or translocation of the enzyme along the 5’-ended strand of the duplex.

Surprisingly, mutation of Gln-781 of Motif IV, a residue that is very near the catalytic lysine (Lys-761) and is essentially invariant not only in RecE but also in the RecB and

λ exonuclease families, results in only a 3-fold reduction in kcat. Mutation of the

conserved residues of the active-site motifs has only a modest effect on DNA-binding,

resulting in at most a 2 to 3-fold increase in Kd.

Mutation of several conserved residues within the disordered loop that projects

out from the central channel of RecE (residues 665-698) has only a minimal effect on kcat, but mutation of the highly conserved Thr-675 of this loop reduces the affinity of

RecE for the dsDNA substrate by almost 10-fold, which is the greatest effect on DNA


binding seen for any of the 24 mutations. Thus, while the residues of the extended

loop do not appear to be critical for ongoing digestion of the dsDNA substrate once a

processive reaction gets started (as measured by the exonuclease assay with

saturating amounts of RecE), these data indicate that the extended loop is important

for initial recognition and binding of RecE to dsDNA ends. Similarly, mutation of

three positively charged residues that line the central channel, Lys-643, Lys-704, and

Arg-768, has a minimal effect on kcat, but weakens the affinity of RecE for dsDNA ends significantly (3 to 5-fold increases in Kd).

Lastly, deletion of the C-terminal residues of RecE (858-866) that plug the central channel at the back of the tetramer results in significant effects on both catalysis (3-fold reduction in kcat) and DNA binding (9-fold increase in Kd). These

data suggest that the residues of the C-terminal plug interact with the 3’-ended strand

of the DNA-substrate both in the pre-initiation complex and during ongoing catalysis.

The data for mutations of Trp-859 and Arg-858 of the C-terminal plug to alanine

indicate that the side chain of Trp-859 may play a particularly significant role in

binding, while that of Arg-858 is less significant. Interestingly, the R858A and

W859A single mutations actually result in a significant (~50%) increase in kcat. A similar effect is seen for other residues of RecE within or near the central channel, such as Lys-704, Thr-675, and Arg-768. These data suggest that weakening of the interactions with the dsDNA substrate disrupts the initial recognition of dsDNA ends, but may actually enhance the rate of ongoing digestions once they get started.


3.4 Discussions

As expected, the results of the mutational study largely support our model of

passing the DNA substrate through the central channel of the RecE tetramer. Eight out

of the 14 mutations targeting the six highly conserved active site motifs exhibit

a >20-fold reduction in kcat, indicating that most of these conserved residues are

required for efficient catalysis. Half of the 10 mutants that are proposed to be involved in

interacting with DNA substrates based on the structure of the enzyme do show

significantly (>5-fold) increased Kd values, implying their essential roles in DNA

binding.

Interestingly, these 24 selected mutations appear to be divided into two groups

that are involved in catalysis and DNA-binding independently. All 14 of the active site

mutations display binding affinity for the DNA similar to that of the wild type RecE564,

regardless of whether their kcat is affected or not. In contrast, all of the DNA-binding site mutations, except the deletion of the C-terminal plug, hydrolyze

DNA at a rate that is similar to or even higher than that of the wild type protein.


Table 3.1 Effects of mutations on RecE exonuclease and DNA-binding activities.

Protein or mutant of RecE564        kcat (sec⁻¹) 25 ℃   kcat (sec⁻¹) 37 ℃   Kd (nM)

WT (RecE564)                        3.57 ± 0.22         18.8 ± 1.9          70 ± 2
P658L (RecE606)                     0.42 ± 0.01                             110 ± 20
λ exonuclease                       4.51 ± 0.18         9.3 ± 0.4           161 ± 2

Motif I
  E729A                             -0.26 ± 0.06*       0.03 ± 0.08         57 ± 3

Motif II
  R744A                             -0.19 ± 0.01*       6.6 ± 0.7           95 ± 8
  R746A                             -0.16 ± 0.13*       0.3 ± 0.1           90 ± 10

Motif III
  K761A                             -0.30 ± 0.13*       0.16 ± 0.04         130 ± 10

Motif IV
  Y778A                             0.1 ± 0.03          0.2 ± 0.2           130 ± 20
  Q781A                             0.37 ± 0.06         6.2 ± 0.4           70 ± 20
  Y785A                             -0.01 ± 0.04*       0.4 ± 0.1           98 ± 9

Motif V
  H652A                             -0.12 ± 0.19*       0.02 ± 0.04         130 ± 10
  E657A                             -0.13 ± 0.16*       0.8 ± 0.1           90 ± 10
  P658L                             2.87 ± 0.03         14.4 ± 0.1          80 ± 10

Motif VI
  Y608A                             -0.15 ± 0.17*       0.6 ± 0.1           160 ± 10
  H609A                             -0.05 ± 0.17*       2.1 ± 0.1           50 ± 3
  S615A                             0.94 ± 0.13         14.6 ± 0.1          100 ± 4
  S617A                             -0.29 ± 0.33*       4.5 ± 0.3           150 ± 20

Disordered loop (residues 665-698)
  F661A                             1.32 ± 0.01         14.7 ± 0.6          118 ± 8
  F665A                             1.55 ± 0.11         16.4 ± 0.3          200 ± 20
  P669A                             3.49 ± 0.11         23.2 ± 1.3          109 ± 8
  T675A                             2.61 ± 0.21         23.6 ± 0.9          650 ± 30

Central channel
  K643A                             2.35 ± 0.15         22.3 ± 0.8          210 ± 6
  K704A                             1.72 ± 0.00         24.6 ± 2.2          330 ± 50
  R768A                             -0.16 ± 0.03*       25.5 ± 0.6          310 ± 30

C-terminal plug
  R858A                             3.37 ± 0.01         28.3 ± 1.9          77 ± 5
  W859A                             -0.07 ± 0.03*       27.8 ± 2.0          350 ± 20
  Δ858-866                          -0.04 ± 0.18*       6.4 ± 0.5           660 ± 80

* Due to the inactivity of the mutations and the sensitivity of the fluorescence, the data of the dead mutations tend to fit into a linear equation with an unreliable R²

(<50%), resulting in a slightly negative rate which can be regarded as zero.
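
The fold changes quoted in the Results can be recomputed directly from the values in Table 3.1. The short Python sketch below does this for a few representative mutants; the names `MUTANTS` and `fold_changes` are hypothetical, the numbers are copied from the 37 ℃ kcat and Kd columns of the table, and the deletion mutant is labeled `delta858-866` only to keep the key in plain ASCII.

```python
WT_KCAT_37 = 18.8  # sec^-1, RecE564 at 37 C (Table 3.1)
WT_KD = 70.0       # nM, RecE564 (Table 3.1)

# (kcat at 37 C in sec^-1, Kd in nM) for selected mutants, copied from Table 3.1
MUTANTS = {
    "Q781A": (6.2, 70.0),
    "T675A": (23.6, 650.0),
    "W859A": (27.8, 350.0),
    "delta858-866": (6.4, 660.0),
}


def fold_changes(kcat, kd):
    """Return (fold reduction in kcat, fold increase in Kd) versus wild type."""
    return WT_KCAT_37 / kcat, kd / WT_KD


for name, (kcat, kd) in MUTANTS.items():
    kcat_fold, kd_fold = fold_changes(kcat, kd)
    print(f"{name}: kcat reduced {kcat_fold:.1f}-fold, Kd increased {kd_fold:.1f}-fold")
```

A kcat fold reduction below 1 indicates a mutant that digests DNA faster than wild type RecE564 at 37 ℃.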


Figure 3.6 Raw fluorescence data of ReDA for RecE mutations.

For comparison, RecE nuclease activities measured at 37 ℃ and 25 ℃ are shown on left and right on each panel. Mutations are colored as indicated in the figure legends on the top of each figure. Notice the significantly increased enzyme activities at the higher temperature.


(Figure 3.6 Continued)


Figure 3.7 Fluorescence anisotropy data for 24 RecE mutations. The binding

affinities of RecE mutants to DNA were measured using 10 nM fluorescein-labeled

25-mer duplex DNA as a function of increasing concentrations of the RecE mutations.

The data were fit to the equation given in Materials and Methods. Mutations are colored as indicated in the figure legends above each figure. One representative data set for each mutation is shown here, but the Kd values and standard

deviations reported in Table 3.1 are based on three independent experiments.


(Figure 3.7 Continued)


CHAPTER 4

CRYSTALLIZATION OF λ EXONUCLEASE AND RECE IN COMPLEX

WITH DNA SUBSTRATES

4.1. Introduction

Processing a double-stranded DNA break is an essential step in DNA repair by

the single strand annealing pathway. Crystal structures of RecE and λ exonuclease, which

form a tapered central channel, suggest a mechanism in which dsDNA enters the

channel from the open end, upon binding the terminal few base pairs are unwound,

the 5’-ended strand is fed into a narrow tunnel to access one of the four active sites, and the 3’-ended strand exits from the narrow end at the back of the channel. Our structure-activity analysis of 24 selected mutations strongly supports this model. This

analysis has identified several conserved residues that are critical in catalysis,

some of which are as far as ~15 Å away from the Mg2+ site. Apparently, a large

network of interactions between the DNA substrate and residues lining the active site

cleft is required for nucleotide hydrolysis and translocation of the enzyme along the

DNA substrate. Although the crystal structures and the mutational studies have

revealed some intriguing features of this type of oligomeric exonuclease enzyme,

without a structure of the protein-DNA complex, the exact mechanism of processing


double stranded breaks and the roles of individual residues are unclear. Does the DNA

substrate bind within the central channels of RecE and λ exonuclease as depicted in

the model above? Are there significant conformational changes that occur in the

proteins upon binding to DNA? What are the roles of the conserved residues that line

the active site cleft? To answer these questions, we will determine crystal structures of

RecE and λ exonuclease in complex with DNA substrates. The structures will reveal the atomic details of the protein-DNA interactions, the conformation of the DNA

substrate, and any conformational changes that occur in the proteins upon DNA

binding. If the DNA is bound in a catalytically relevant conformation, the structure

will reveal the positioning of the scissile phosphodiester bond relative to the Mg2+

center, and the detailed interactions between the terminal nucleotides of the 5’-ended

strand and the conserved residues lining the active site cleft.

4.2 Methods and materials

4.2.1 Gel-shift DNA binding assay

All the DNA substrates used in the gel-shift assay are 32P-labeled. Each

labeling reaction (20 μl) contained 20 pmol ssDNA, 20 pmol [γ-32P]ATP and 10 units T4

polynucleotide kinase. The mixture was incubated at 37 ℃ for 30 minutes, followed by

two rounds of purification on Amersham G25 columns. The radioactivity of purified 32P-ssDNA (1 μl) was measured in an LS 6000 IC liquid scintillation counter (Beckman) with 2.5 ml scintillation fluid. The labeled ssDNA was then annealed to its complementary strand to form a

32P-labeled dsDNA by heating the mixture at 95 ℃ for 10 minutes, followed by slow cooling


to room temperature. The binding of RecE and λ exonuclease to duplex DNA substrates of various lengths was examined. In each 10 μl reaction, 32P-labeled dsDNA (3 or 20 nM) was titrated with increasing concentrations (0-10 μM) of the enzyme in buffer containing

20 mM Tris (pH 7.5), 10 mM CaCl2, and 1 mM DTT. The mixture was incubated at room temperature for 10 min and then mixed with 2 μl loading dye (20 % glycerol, 0.12 % bromophenol blue, and 0.12 % xylene cyanol FF). Samples were loaded onto a native TBE polyacrylamide gel (12-15%) and electrophoresed in TBE buffer at 150 V for 50 minutes at

4 ℃. The gel was then dried and autoradiographed. The TBE buffer consisted of 89 mM Tris, 89 mM boric acid, and 2 mM EDTA.

4.2.2 Crystallization

For crystallization, 10 mg/ml RecE or λ exonuclease was mixed with a slight

excess of DNA substrate at a molar ratio of 1.1 dsDNA per exonuclease

oligomer, in the presence of 5 mM CaCl2, which allows the enzyme to bind DNA but inhibits its nuclease activity. All DNA oligonucleotides were synthesized and HPLC-purified by

Integrated DNA Technologies. DNA concentrations were measured by UV absorbance at 260 nm using an extinction coefficient calculated from the sequence and the complementary strands were annealed for co-crystallization. The protein-DNA mixture was subjected to crystallization screening with at least 1,000 conditions by the hanging-drop vapor diffusion method at 22 ℃. Each reservoir contained a 500 μl commercial screen solution (Hampton Research) and the hanging drop was prepared

by mixing 1 μl of the protein-DNA complex and 1 μl of reservoir solution. A list of


detailed crystallization conditions for the seven crystal forms that we have obtained so

far is shown in Figure 4.2. RecE606 in complex with a 39-mer hairpin duplex was

crystallized in 0.3 M sodium succinate pH 7.0, 0.1 M BisTrisPropane pH 7.0.
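
The arithmetic behind preparing the protein-DNA mixture can be sketched as follows. This is an illustration only: the function names are hypothetical, and the extinction coefficient and oligomer mass used in any call must come from the actual DNA sequence and protein construct, since neither value is stated in this section.

```python
def concentration_uM(a260, extinction_per_M_cm, path_cm=1.0, dilution=1.0):
    """Beer-Lambert law, c = A / (epsilon * l), returned in micromolar.

    `dilution` is the factor by which the stock was diluted before the
    absorbance reading (1.0 means the stock was measured directly).
    """
    return a260 * dilution / (extinction_per_M_cm * path_cm) * 1e6


def dna_needed_uM(protein_mg_ml, oligomer_mass_kDa, ratio=1.1):
    """DNA concentration giving `ratio` duplexes per protein oligomer.

    mg/ml divided by kDa gives mmol/L, so the factor of 1000 converts to
    micromolar.
    """
    oligomer_uM = protein_mg_ml / oligomer_mass_kDa * 1000.0
    return ratio * oligomer_uM
```

For example, with a purely hypothetical oligomer mass of 100 kDa, a 10 mg/ml protein stock corresponds to 100 μM oligomer, and the default 1.1 ratio calls for 110 μM duplex in the mixture.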

4.2.3 Fluorescence detection of DNA in crystals

Crystals of exonuclease (RecE or λ Exo) alone and in complex with a duplex

DNA substrate grew in two separate drops. After gentle washing of crystals with the mother liquor solution, crystals were transferred directly into a 5 μl drop containing

SYBR-Gold dye (Invitrogen) diluted 1:5,000 into the mother liquor. After soaking for one hour, the drop was visualized using a fluorescence inverted microscope (Olympus

IX81) equipped with a FITC filter set. Digital images were recorded with a black-and-white camera (Olympus FVII).

4.3 Results

4.3.1 RecE and λ exonuclease can bind dsDNA as short as 12 base pairs.

In order to probe the minimal length of the duplex DNA to which RecE/λ

exonuclease can bind, we have examined the binding affinities of both enzymes to

DNA substrates ranging in length from 10 to 83 bp (Figure 4.1) using a

gel-shift assay. The results indicate that duplex DNA as short as 12 base pairs can be

bound by RecE/λ exonuclease with a Kd in the low μM or high nM range. Interestingly,

titration of an 83 mer dsDNA with RecE reveals two shifted bands, while binding of

RecE to DNA shorter than 40 bp exhibits a single shifted band. This suggests that

with a long dsDNA substrate an intermediate complex, which likely is the DNA with


one end bound by exonuclease and the other end free, is formed before both ends are

fully bound. In contrast, short DNA sequences (≤ 40 bp) do not allow two enzyme

oligomers to occupy both ends at the same time, likely due to steric overlap.

4.3.2 A collection of duplex DNA substrates.

Based on the results of the gel-shift assay, we designed a series of duplex

DNA substrates with different lengths and end structures. The dsDNA substrates in this collection

are 10 to 20 base pairs long and have three types of terminal structures: two

blunt ends, two 3’-tails or one blunt end with a hairpin loop. Duplexes with two blunt

ends that have an even number of nucleotides in each strand were designed to be

self-complementary in order to form a symmetric (palindromic) duplex, which

ensures that exactly the same complex is formed regardless of which end of the DNA

the protein binds to. A hairpin duplex is made by an oligonucleotide that anneals

intra-molecularly to form a duplex with one free end and a 5 nt hairpin at the other.

This type of substrate will have a lower propensity for end-to-end stacking that could interfere with crystallization of a mechanistically relevant complex. Indeed, this is the type of DNA that was used for co-crystallization of RecBCD (Singleton, et al., 2004),

which also binds specifically to DNA ends. Single molecule studies revealed that λ

exonuclease pauses when it encounters GGCGA sequences on the 5’-ended strand, and it was proposed that this sequence binds as ssDNA with enhanced affinity to the

active site cleft (Perkins, et al., 2003). We have also obtained 5 nt and 9 nt single

strand oligos that contain this sequence for λ exonuclease-DNA co-crystallization.
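
The requirement that a symmetric blunt-ended duplex be self-complementary can be checked programmatically. The sketch below is illustrative only: the function names are hypothetical, and the example sequence used in the note that follows is a generic self-complementary 12-mer, not one of the substrates listed in Table 4.1.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def is_self_complementary(seq):
    """True if two copies of the strand can anneal into a symmetric,
    blunt-ended duplex, that is, the strand equals its reverse complement.
    This can only hold for strands with an even number of nucleotides."""
    return seq.upper() == reverse_complement(seq)
```

For instance, `is_self_complementary("CGCGAATTCGCG")` holds, whereas the pause sequence GGCGA, having an odd length, cannot be self-complementary.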


Figure 4.1 Gel shift assays showing the binding of λ exonuclease and RecE to duplex DNA substrates of varied length. (A) 83-mer duplex, (B) 40-mer duplex,

(C) 25-mer duplex, (D) 16-mer duplex, (E) 33-mer hairpin duplex, (F) 14-mer duplex, and (G) 12-mer duplex.


(Figure 4.1 continued)


4.3.3 Crystals of RecE and λ Exo in complex with DNA.

Co-crystallization of λ exonuclease with 15 different DNA substrates

(1-15 in Table 4.1) was screened individually, and 4 distinct crystal forms, all in space group P3, have been identified so far. Detailed information on each crystal form, including crystallization condition, data collection statistics, crystal image, and diffraction pattern, is given in Figure 4.2. It is worthwhile to note that

DNA is required to grow these crystals and different types of DNA affect the types of crystal forms. Overall, λ exonuclease crystallizes readily with shorter duplexes (10-13 bp) and all crystal forms diffract very well to a high resolution (2-3 Å). The molecular replacement solutions were obtained using a λ exo monomer or trimer as a search model. However, we have so far been unable to resolve electron density for the DNA.

The possible reasons are discussed in the following sections.

Co-crystallization trials of the RecE-DNA complex have led to promising crystals of RecE606 in complex with a 39-mer hairpin duplex that diffract to ~ 5.0 Å,

but with significant anisotropy. This RecE-DNA complex crystallized in a new crystal

form (space group I4212), which was not observed for RecE protein alone (P43212),

supporting the idea that DNA is present in the crystals. We are currently working on

optimizing this crystal form, as well as discovering additional forms by varying DNA

length and ends.


Table 4.1 DNA sequences used for co-crystallization.

Name of the DNA DNA Sequence and Structure

1. 5 mer ssDNA

2. 9 mer ssDNA

3. symmetric 10 mer duplex

4. 11 mer duplex with 3’-tail

5. asymmetric 11 mer duplex

6. symmetric 12 mer duplex 5’-hydroxyl

7. symmetric 12 mer duplex 5’-phosphate

8. asymmetric 12 mer duplex

9. asymmetric 13 mer duplex

10. symmetric 14 mer duplex

11. symmetric 16 mer duplex GC-ending

12. symmetric 16 mer duplex AT-ending

13. symmetric 20 mer duplex

14. 29 mer hairpin duplex

15. 33 mer hairpin duplex

16. 39 mer hairpin duplex


Figure 4.2 Crystal forms of λ exonuclease-DNA complexes. Four forms have been identified so far. Crystallization conditions and data collection statistics, crystal images, and diffraction patterns are displayed from left to right.


4.3.4 Detection of DNA in the crystals.

Given the multiple crystal forms obtained from co-crystallization trials and

our inability to resolve electron density for DNA substrates, we performed a sensitive

assay employing a fluorescence dye to rapidly detect nucleic acids in crystals of

macromolecular complexes (Kettenberger & Cramer, 2006). As described in materials

and methods, crystals of the enzyme with (Crystal Form 1) and without DNA bound

were placed in a drop side by side, soaked with SYBR-Gold, a fluorescent dye that

specifically associates with nucleic acids, and then visualized by fluorescence microscopy. The nucleic acid-containing crystals can be easily and rapidly detected by strong fluorescence of the crystal stained by the dye, while nucleic acid-free crystals are barely visible. A comparison of crystal images of λExo with a 12 mer duplex under a regular transmitted-light microscope and a fluorescence microscope provides striking evidence that these crystals contain DNA. A similar test was performed and verified that the RecE606-hairpin 39 mer duplex crystals have DNA bound.

As a further test, λ Exo was co-crystallized with a 5’-Cy5 labeled 12 mer

duplex (Georgescu, et al., 2008) in the condition that grows Form 1 (Figure 4.3 1)

crystals. The intense blue color of the crystal, due to the Cy5 tag on the DNA

substrate, provides convincing evidence that these crystals contain the desired DNA

complex (Figure 4.4).


Figure 4.3 Crystals of λ exonuclease and RecE606 in complex with DNA. From left to right, the figures show (1) crystals soaked with SYBR Gold viewed under the visible microscope, (2) the same view under a fluorescence microscope, and (3) a diffraction image. The DNA complex crystals (C) were washed and soaked with

SYBR Gold. Protein-only crystals (P), grown in similar conditions but without DNA, were placed in the same drop as the co-crystals during the soaking procedure. Intense fluorescence of the DNA co-crystals but not the protein-only crystals provides strong evidence that the co-crystals contain the DNA.


Figure 4.4 Crystals of λ exonuclease in complex with a 5’-Cy5-labeled 12-mer duplex.


4.4 Discussions

Four crystal forms of λ exonuclease-DNA complexes have been identified

during co-crystallization trials. Apparently, the specific DNA substrate has a

significant influence on the crystal lattice that is observed, suggesting that the DNA is

indeed part of the crystal. In addition, two experiments, fluorescent staining of

co-crystals and co-crystallization of the protein with Cy5-labeled DNA, strongly

suggest that the Form 1 crystals (Figure 4.3 1) contain the desired DNA substrate.

Molecular replacement solutions were readily obtained when a monomer or trimer of

λ exonuclease was used as the search model. However, we have been unable to

resolve electron density for the DNA, likely due to two problems. First, four of the

crystal forms exhibit merohedral twinning. When the twinning fraction is 0.5, the

diffraction data scale with P6 symmetry, which is twice that of the actual symmetry

(P3). When the twinning fraction is 0.2-0.3, the data scale with the correct P3

symmetry, but the measured intensities are convoluted and a correction factor

(derived from the twinning fraction) must be applied. The second problem is that the

λ Exo-DNA complexes, which presumably contain only one DNA duplex per trimer, are oriented along the 3-fold axis of the crystal. Since the DNA itself is not 3-fold

symmetric, density for it may be partially obscured by the 3-fold symmetry of the data.

Attempts to merge and scale the diffraction data in lower symmetry (P1) and calculate

electron density maps have not yet resulted in clear density for the DNA, suggesting

that the complexes throughout the crystal may be statistically disordered among the

three possible orientations about the 3-fold axis.
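The intensity correction mentioned above for partially twinned data can be illustrated with the standard relation for hemihedral twinning, in which each observed intensity is a weighted sum of the true intensities of two twin-related reflections. The following is only a minimal sketch of that relation, not the procedure used in this work; the function names and the example intensities are hypothetical.

```python
def twin(i_true_1, i_true_2, alpha):
    """Simulate observed intensities of two twin-related reflections.

    alpha is the twin fraction (0 <= alpha <= 0.5).
    """
    i_obs_1 = (1.0 - alpha) * i_true_1 + alpha * i_true_2
    i_obs_2 = alpha * i_true_1 + (1.0 - alpha) * i_true_2
    return i_obs_1, i_obs_2


def detwin(i_obs_1, i_obs_2, alpha):
    """Recover the true intensities from a twin-related pair.

    The correction divides by (1 - 2*alpha), so it is undefined for a
    perfect twin (alpha = 0.5) and amplifies errors as alpha nears 0.5.
    """
    if not 0.0 <= alpha < 0.5:
        raise ValueError("twin fraction must satisfy 0 <= alpha < 0.5")
    denom = 1.0 - 2.0 * alpha
    i_true_1 = ((1.0 - alpha) * i_obs_1 - alpha * i_obs_2) / denom
    i_true_2 = ((1.0 - alpha) * i_obs_2 - alpha * i_obs_1) / denom
    return i_true_1, i_true_2


# Hypothetical example: true intensities 100 and 20, twin fraction 0.25
observed = twin(100.0, 20.0, 0.25)
recovered = detwin(observed[0], observed[1], 0.25)
print(observed, recovered)
```

Because the denominator vanishes at a twin fraction of 0.5, perfectly twinned data cannot be corrected in this way, which is consistent with such data scaling in the apparent higher symmetry.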


Due to the low resolution of the diffraction (~5.0 Å) and the anisotropy

observed for the RecE606-hp39mer co-crystals, we have not been able to collect a

complete data set for molecular replacement. Although these crystals are not yet good

enough to determine the structure at high resolution, we expect that manipulation of

the length and ends of the DNA substrate will result in new crystal forms with

improved diffraction properties.

4.5 Future Directions.

The crystal structure of RecE606 tetramer shows that the central channel is

about 40 Å deep and the sparse bridging electron density indicates that the 34-residue

missing loop extends out about 25 Å. Presumably, the path that tracks along the DNA

substrate is about 65 Å long, which corresponds to the length of a 19-basepair double

stranded DNA (assuming a helical pitch of 34 Å for a complete turn of 10 bp B-form

DNA). As proposed in the DNA processing mechanism, the terminal 3-5 base pairs

need to be unwound in order to allow the 5’-strand to access the active site and the

3’-strand to pass through the central channel. Hence, an ideal DNA intermediate is a

14-16 mer duplex with a 3-5 mer 3’-ss tail. Coincidentally, the RecE606-DNA co-crystals we have obtained contain a 39-mer hairpin duplex (17 bp + 5 nt loop), which is roughly the length predicted from the crystal structure.
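The length estimate in this paragraph can be checked with a few lines of arithmetic. The sketch below uses only the numbers quoted above (a 40 Å channel, a 25 Å loop extension, and a B-form pitch of 34 Å per 10 bp); the function and variable names are illustrative assumptions.

```python
PITCH_ANGSTROM = 34.0   # helical pitch of B-form DNA, as assumed in the text
BP_PER_TURN = 10        # base pairs per helical turn, as assumed in the text
RISE_PER_BP = PITCH_ANGSTROM / BP_PER_TURN  # 3.4 Å per base pair


def base_pairs_spanned(path_length_angstrom):
    """Number of B-form base pairs that span a given path length."""
    return path_length_angstrom / RISE_PER_BP


def hairpin_length(stem_bp, loop_nt):
    """Total nucleotides in a hairpin with a fully paired stem and a loop."""
    return 2 * stem_bp + loop_nt


channel_depth = 40.0    # Å, depth of the central channel
loop_extension = 25.0   # Å, estimated reach of the missing 34-residue loop
path = channel_depth + loop_extension

print(round(base_pairs_spanned(path)))  # about 19 bp for a 65 Å path
print(hairpin_length(17, 5))            # 39 nt for the hairpin used here
```

Subtracting the 3-5 terminal base pairs that are proposed to unwind from the roughly 19 bp path gives the 14-16 bp duplex quoted above.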

Our strategy to facilitate RecE-DNA co-crystallization is summarized in the following three parts. First, we will use DNA substrates with a variety of end structures, including (1) a blunt end, (2) a 3-5 bp non-complementary end, (3) a 3-5


mer 3’-sstailed end, or (4) a 2-3 bp non-complementary end with a 2-3 mer 3’-single

stranded tail. Two types of duplexes will be employed: a symmetric duplex with two

identical ends and a hairpin duplex with one end free and one end blocked (Figure 4.5

A and B). Second, an alternative approach will be to use DNA substrates that contain

an internal phosphorothioate, to block cleavage at a particular position in the duplex

(Figure 4.5 C). In the presence of Mg2+, the enzyme would bind to the end of the

duplex and digest the 5’-ended strand until it reached the phosphorothioate.

Crystallization of the resulting complex would likely reveal a mechanistically

informative intermediate. Phosphorothioates have been used in the crystallization of other exonuclease-DNA complexes (Brautigam and Steitz, 1998) and are known to block RecE digestion (Muyer, et al., 2000). Third, our structure-activity studies have identified several RecE mutants that bind DNA with an affinity similar to that of wild-type RecE but whose nuclease activity is completely abolished. These mutants, including E729A, R746A, K761A, Y778A, and H652A, are good candidates for co-crystallization with DNA in the presence of Mg2+.

Multiple crystal forms of λ Exo-DNA complexes have demonstrated our

ability to crystallize λ Exo in complex with different DNA substrates. Although these

crystals exhibit twinning defects that have created obstacles to structure determination,

it is clear that a DNA substrate plays a crucial role in crystal packing. We will

continue working on the existing four crystal forms with a hope that a less twinned

data set will be obtained to allow us to determine the structure of the complex.

Alternatively, we will employ the second and third strategies developed for the


RecE-DNA complex. DNA substrates with an internal phosphorothioate, or λ Exo mutants that bind tightly to DNA but lack nuclease activity, will be used for co-crystallization trials.


Figure 4.5 DNA substrates for co-crystallization.

(A) Symmetric duplexes with: (1) two blunt ends, (2) two non-complementary ends,

(3) two 3’-tailed ends, and (4) two non-complementary ends with a 3’-ss tail.

(B) Hairpin duplexes with a: (1) blunt end, (2) non-complementary end, (3) 3’-tailed end, and (4) non-complementary end with a 3’-ss tail.

(C) Phosphorothioate linkage.


CHAPTER 5

CRYSTAL STRUCTURES OF N-TERMINAL SH2 DOMAIN OF TYROSINE

PHOSPHATASE SHP-2 IN COMPLEX WITH HIGH AFFINITY PEPTIDES

5.1 Introduction

5.1.1 Src homology 2 domain

The Src homology 2 (SH2) domain, which was first identified in the oncoproteins Src and Fps (Sadowski, et al., 1986), is a regulatory protein module of about 100 amino acid residues that is found in many intracellular signal transduction proteins (Russell, et al., 1992). All SH2 domains specifically recognize a phosphorylated tyrosine (pTyr) residue within a target protein, often resulting in relocalization of the SH2 domain-containing proteins to a specific cellular location, such as the plasma membrane. This process constitutes the fundamental event of signal transduction through a transmembrane receptor. For example, in the case of receptor protein kinases (Figure 5.1), binding of an extracellular ligand to a cell-surface receptor triggers a conformational change in the receptor that propagates through the membrane. The signal is “sensed” by an intracellular domain of the receptor, whose tyrosine kinase activity is subsequently activated to phosphorylate itself and other proteins. The phosphorylated sites on the receptor then become targets


Figure 5.1 Schematic model for signal transduction through a transmembrane receptor. Binding of a ligand to the extracellular domain of a receptor protein kinase induces a conformational change that activates the receptor, which subsequently autophosphorylates. The pTyr residues of the receptor kinase attract SH2 domain-containing intracellular tyrosine kinases, such as Src and Syk protein kinases.

Upon binding, these kinases are activated to phosphorylate downstream targets, triggering a cascade of events that eventually leads to an altered cellular response.


of the SH2 domain-containing proteins, such as intracellular tyrosine kinases, leading to activation of these kinases and phosphorylation of downstream proteins. This cascade of events eventually results in altered cellular responses.

There are about 115 SH2-containing proteins that have been identified in the human genome (Liu, et al., 2006). All the SH2 domains share a common fold that consists of a large central three-stranded β-sheet (βB, βC, βD) flanked by two

α-helices (αA and αB), one on either side. Three loop regions connecting different secondary structure elements, BC, BG and EF, play an essential role in interacting with pTyr-containing sequences by forming a binding pocket. The phosphopeptide ligand binds to a surface of the SH2 domain that extends from the N-terminal helix

αA to the C-terminal helix αB, with the peptide oriented perpendicular to the plane of the central β-sheet. As an example, the Src SH2 domain, which was the first SH2 domain whose crystal structure was determined, is shown in Figure 5.2, in complex with a high affinity peptide pYEEI.

5.1.2 Tyrosine phosphatase SHP-2

If a protein is phosphorylated by a kinase, the phosphate group must eventually be removed by a phosphatase through hydrolysis. Otherwise, the phosphorylated protein would be permanently activated. The regulation of tyrosine phosphorylation, controlled by protein tyrosine kinases (PTKs) and protein tyrosine phosphatases (PTPs), plays a critical role in cellular communications (Neel and Tonks,

1997; Neel, et al., 2003). PTPs comprise a large superfamily (about 110 human


Figure 5.2 Structure of Src SH2 domain in complex with a high affinity phosphopeptide (pYEEI) (PDB code 1SPS; Waksman, et al., 1993). This view is looking at the peptide binding site. Notice the conserved SH2 domain architecture, a central β-sheet sandwiched by two α-helices, one at each side. In addition to these secondary structure elements, three loops, BC, EF, and BG, contribute to the formation of peptide-binding pocket.


members) which is defined by conserved amino acid sequences in the catalytic domain (Andersen, et al., 2001).

The SH2 domain containing phosphatase (SHP) is a subfamily of cytoplasmic

PTPs that contains two tandem N-terminal SH2 domains (N-SH2 and C-SH2), a classic phosphatase catalytic domain and a C-terminal tail (Figure 5.3 A). Two SHPs, known as SHP-1 (also named as SH-PTP1, PTP1C, HCP, and SHP) and SHP-2 (also named as SH-PTP2, PTP1D, Syp, PTP2C and SH-PTP3), have been frequently linked to several cellular activities (Chong and Maiese, 2007), such as progenitor cell development (Chan and Yoder, 2004), cellular growth (Yi, et al., 1993), tissue inflammation (Aoki, et al., 2000; Horvat, et al., 2001), cellular chemotaxis (Kim, et al.,

1999, Kim, et al., 2006) and cell survival (Rakesh and Agrawal, 2005, Maas, et al.,

2004). SHP-1 is predominantly present in hematopoietic cells (Yi, et al., 1992), while

SHP-2 is ubiquitously expressed in every tissue and transduces signals in cells that are activated by a variety of ligands, including growth factors, cytokines, hormones, and

MHC-antigen complexes (Hof, et al., 1998).

The crystal structure of SHP-2 shows that in the absence of a pTyr-containing binding partner, the N-SH2 domain is in a “closed” configuration that is bound to the phosphatase domain, directly blocking its active site (Hof, et al., 1998)(Figure 5.3 B).

The structure suggests a mechanism in which binding of the N-SH2 domain to a pTyr-containing ligand induces a conformational change that prevents its binding to the phosphatase catalytic domain, which is thus activated for catalysis (Figure 5.3 C).

The C-SH2 domain does not have a direct role in activation (Hof, et al., 1998).


Figure 5.3 Structure of SHP-2 and mechanism of inhibition.

(A) Domain structure of SHP phosphatase family. SHPs contain two tandem SH2 domains (N- and C-SH2), a phosphatase catalytic domain (PTP) and a C-terminal tail.

(B) The “closed” conformation of SHP-2 (PDB code 2SHP, Hof, et al., 1998). Notice the N-SH2 domain interacts with the PTP domain by insertion of its DE loop directly into the PTP catalytic cleft.

(C) Proposed mechanism of SHP-2 activation. An SHP-2 binding protein (BP) that has two pTyr sites binds to the SH2 domains of SHP-2, with one pTyr for each SH2 domain. The binding triggers a conformational change of the N-SH2 domain, leading to movement of the N-SH2 domain away from the PTP active site and activation of the phosphatase.


5.1.3 Peptide specificity of SHP-2 N-SH2 domains.

As introduced above, SH2-domains recognize pTyr-containing sites of target

proteins in a sequence-dependent manner. In other words, the affinity of a given SH2

domain for a given pTyr residue within a target protein depends on the amino acids

that surround it. Misreading of phosphorylated sites by SH2 domains would result in

recruitment of inappropriate SH2 domain-containing proteins and hence lead to

abnormal cellular signaling, which is largely associated with human pathologies

including cancer and many other diseases (Hirai and Varmus, 1990; Bromberg, et al.,

1999; Miled, et al., 2007). In particular, mutations in human SHP-2 are primary

causes of the inherited disorder Noonan syndrome (NS) (Tartaglia, et al., 2001;

Tartaglia, et al., 2002; Kosaki, et al., 2002). Most NS mutations are found in either the

N-SH2 or the phosphatase domain, and are involved in inhibition of PTP catalysis. In

addition, SHP-2 is implicated as being a key virulence determinant in pathogenesis by

Helicobacter pylori, the major cause of gastric ulcer and carcinoma (Ligashi, et al.,

2002). Recently, SHPs have become attractive drug targets for novel cancer therapies

(Irandoust, et al., 2009).

The affinity and specificity of the SH2-phosphopeptide interactions have been

extensively studied and abundant structural information on SH2-peptide complexes

is available. Although the sequence specificity of SH2 domains varies, it is generally

agreed that the three residues immediately C-terminal to the pTyr site on the peptide

substrate (at the +1, +2, and +3 positions) are crucial determinants for recognition by

a given SH2 domain. In the case of SH2 domains of SHP-1 and SHP-2, the two


residues N-terminal to the pTyr site (at the -1 and -2 positions) are also important.

Thus, in a study of the sequence specificity of SHP-2 N-SH2 domain, Pei’s group at the Ohio State University Department of Chemistry designed a combinatorial pTyr peptide library, H2N-TAXXpYXXX-LNBBRM-resin, where X represents any of the

18 natural amino acids (all 20 except Met and Cys) and B is β-alanine (Sweeney, et al.,

2005). The N-terminal TA is used to reduce potential bias caused by electrostatic interactions between the SH2 domain and the free N-terminus, and the C-terminal

LNBBRM sequence helps to improve the solubility and flexibility of the peptides.

The library was synthesized on TentaGel S NH2 resin. Binding of the biotinylated

SH2 domain to a resin-bound peptide library recruits a streptavidin-alkaline phosphatase conjugate to the surface of the beads. Addition of BCIP induces a series of chemical reactions so that beads that carry high-affinity SH2 peptides become colored. The positive beads are then isolated and subjected to sequencing by partial

Edman degradation and mass spectrometry (Sweeney, et al., 2005).
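As a rough illustration of the size of such a library, the theoretical number of distinct sequences follows directly from the template. The sketch below assumes, as stated above, that each X position is drawn independently from the same 18 amino acids; the variable names are illustrative only.

```python
# One-letter codes of the 20 standard amino acids
STANDARD_AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Building blocks allowed at each X position: all except Met (M) and Cys (C)
X_ALPHABET = [aa for aa in STANDARD_AMINO_ACIDS if aa not in "MC"]

# Variable portion of the library template (pY is phosphotyrosine)
TEMPLATE = "TAXXpYXXX"

n_random_positions = TEMPLATE.count("X")
theoretical_diversity = len(X_ALPHABET) ** n_random_positions

print(len(X_ALPHABET), n_random_positions, theoretical_diversity)
```

With five randomized positions and 18 choices at each, the template encodes 18^5, or about 1.9 million, possible sequences.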

Screening the N-SH2 domain of SHP-2 with this combinatorial peptide library has identified four classes of phosphopeptide sequences that exhibit high affinity:

Class I, (I/L/V/m)XpY(T/V/A)X(I/V/L/f), Class II, W(M/T/v)pY(y/r)(I/L)X, Class III,

(I/V)XpY(L/M/T)Y(A/P/T/S/g), and Class IV, (I/V/L)XpY(F/M)XP. Lower case letters in these sequences represent less frequently selected residues and X is any amino acid except for glycine and proline (Sweeney, et al., 2005). Class III and IV sequences exhibit remarkably higher binding affinity for the SH2 domain, although they are selected less frequently than Class I and II sequences.
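The four consensus motifs can be written as regular expressions to check which class a given phosphopeptide belongs to. The following sketch is illustrative only and makes two simplifying assumptions: lower-case (less frequently selected) residues are treated as allowed, and X is allowed to match any residue. The test sequences are the three peptides used for crystallization in Section 5.2.1; the function name is hypothetical.

```python
import re

X = "[A-Z]"  # simplification: any residue at an X position

# Residues are listed from position -2 to +3 around the pTyr ("pY")
CLASS_MOTIFS = {
    "I": re.compile(f"[ILVM]{X}pY[TVA]{X}[IVLF]"),
    "II": re.compile(f"W[MTV]pY[YR][IL]{X}"),
    "III": re.compile(f"[IV]{X}pY[LMT]Y[APTSG]"),
    "IV": re.compile(f"[IVL]{X}pY[FM]{X}P"),
}


def classify(peptide):
    """Return the names of all motif classes found in the peptide."""
    return [name for name, motif in CLASS_MOTIFS.items()
            if motif.search(peptide)]


for name, seq in [("FVP", "RVIpYFVPLNR"),
                  ("AQLW", "RLNpYAQLWHR"),
                  ("LYA", "RIHpYLYALNR")]:
    print(name, classify(seq))
```

Under these assumptions each of the three peptides matches exactly one motif, with AQLW falling in Class I and FVP in Class IV, as described in the text.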


In order to compare the binding interactions between the SHP-2 N-SH2 domain and the peptides from frequently and less frequently selected classes, two peptides, AQLW (Class I) and FVP (Class IV), were selected and individually subjected to structural studies by X-ray crystallography. Strikingly, the crystal structure of the SH2 domain in complex with the FVP peptide reveals a novel binding stoichiometry in which two copies of the peptide bind to one SH2 domain.

5.2 Methods and Materials

5.2.1 Crystallization and data collection

The purified N-terminal SH2 domain of SHP-2 was co-crystallized with

selected peptides, RVIpYFVPLNR (referred to as FVP), RLNpYAQLWHR (referred

to as AQLW), and RIHpYLYALNR (referred to as LYA) at room temperature by

hanging drop vapor diffusion. The SH2 domain (10 mg/ml) in 0.1 M Mes (pH 5.5), 50

mM KCl, was mixed with a slight molar excess (1 : 1.1) of peptide dissolved in water.

The hanging drops were prepared by mixing 2 μl complex and 2 μl reservoir solution,

which consisted of 20% PEG 3350, 0.1 M bis-Tris (pH 5.5), 0.2 M Li2SO4. Large

tetragonal crystals (Figure 5.4) grew within about three days. Crystals were

transferred to a cryo-protectant solution containing reservoir solution supplemented

with 20% glycerol and frozen in liquid nitrogen. The diffraction data were collected at

-180 ℃ at beamline 19BM of the Advanced Photon Source. Data were integrated

and scaled with DENZO/SCALEPACK (Otwinowski and Minor, 1997).


5.2.2 Structure determination

The structure was determined by molecular replacement with MOLREP of the

CCP4 suite (CCP4, 1994) using the structure of unbound SHP-2 N-SH2 domain (PDB

code 1AYD; Lee, et al., 1994) as the search model. The structures with the FVP and

AQLW peptides were refined to 1.80 Å and 2.05 Å, respectively, using Refmac of the

CCP4 suite (CCP4, 1994). After the first round of refinement, the electron density maps revealed strong density for the FVP and AQLW peptides. For LYA, no density corresponding to the peptide was observed. Several rounds of refinement with Refmac and model building using the program COOT (Emsley and Cowtan, 2004) led to final

models. The final refined structure of SH2-FVP consists of residues 4-103 of the SH2

domain (Chain A), residues (-2)-(+4) of the first peptide (Chain P) and residues

(-2)-(+3) of the second peptide (Chain Q). The final SH2-AQLW structure consists of

residues 5-103 of the protein and residues (-2)-(+4) of the peptide (Chain B). Side

chains of residues, Arg-4, Lys-BC1 and Lys-BG7 in the complex with FVP, and Arg-5,

Lys-BC1, Lys-BG5 and Lys-BG7 in the complex with AQLW, were not resolved in

the electron density and were truncated to alanine. The C-terminal peptide residues,

Arg(+6), Asn(+5) of each peptide and Leu(+4) of Peptide 2 in the complex with FVP,

and His(+5), Arg(+6) of the peptide in the complex with AQLW were not observed in

the electron density and were not included in the final structures. Data collection and

refinement statistics are presented in Table 5.1. Structural figures were prepared using

PYMOL (Delano Scientific LLC). The secondary structural elements are defined

using the Dali server (http://ekhidna.biocenter.helsinki.fi/dali_server) and named


Figure 5.4 Crystals of SHP-2 N-SH2 domain in complex with FVP peptide and a diffraction image. The crystals of the complex and a representative diffraction image (to 2.2 Å) are shown on the left and right, respectively.


Table 5.1 Crystallographic and Refinement Statistics.

                                   NSH2-FVP                NSH2-AQLW

X-ray Diffraction Data
  Space group                      P4₃2₁2                  P4₃2₁2
  Unit cell dimensions
    a = b (Å)                      62.9                    62.7
    c (Å)                          75.3                    75.3
  Resolution (Å)                   48.3-1.8 (1.89-1.80)    48.2-2.05 (2.13-2.05)
  No. unique reflections           15,827                  9,943
  Redundancy                       11.2 (11.6)             10.9 (11.3)
  Completeness (%)                 99.3 (100)              99.0 (100)
  I/σ                              61.5 (5.0)              46.5 (6.7)
  Rmerge                           0.030 (0.382)           0.064 (0.364)

Refinement Statistics
  Resolution (Å)                   48.3-1.80               24.1-2.05
  No. of reflections               13,727                  9,354
  Rwork/Rfree                      0.22/0.26               0.21/0.25
  No. of protein atoms             992                     860
  Mean B factor (Å²)               43.7                    42.3
  R.M.S.D bond length (Å)          0.016                   0.017
  R.M.S.D bond angle (°)           1.629                   1.547
  Residues in Ramachandran plot
    Favored regions                98 (97.0%)              95 (96.0%)
    Allowed regions                3 (3.0%)                4 (4.0%)


according to the nomenclature established previously for SH2 domains (Eck, et al., 1993 and

Lee, et al., 1994). Solvent accessible surface area calculation was performed using the

AREAIMOL feature of CCP4 with a probe radius of 1.4 Å (CCP4, 1994).

5.3 Results

In order to understand the structural basis of the specificity for the interactions

between SH2-domains and pTyr-containing sequences selected from the peptide

library screenings, crystal structures of the SHP-2 N-SH2 domain in complex with

LYA, AQLW and FVP were determined. In all the three structures, the protein exhibits a typical SH2 domain fold consisting of a central anti-parallel β sheet flanked by two

α helices, one on either side. Examination of the electron density map of the SH2-LYA complex revealed density for a phosphate ion in the expected pTyr binding pocket, but no density for the rest of the LYA peptide. Thus, the structure of the SH2 domain is nearly identical to the unbound SH2 domain structure (PDB code 1AYD) published earlier.

The electron density map for the SH2-AQLW complex allowed tracing of residues

LNpYAQLW of the peptide, but the side chains of residues Q(+2) and W(+4), which face out into solvent, were not resolved. The complex with the AQLW peptide is very similar to previous SH2-domain peptide complexes (PDB codes 1AYA, 1AYB, and

1AYC) (Figure 5.7), and will not be discussed in detail here. The electron density map for the complex with FVP peptide, however, is surprisingly different (Figure 5.6 A). In contrast to all previous SH2-domain peptide complexes, which contain a single peptide per SH2 domain, we observe two copies of the FVP peptide bound to the peptide-binding surface on the SHP-2 N-SH2 domain (Figure 5.5).


Figure 5.5 Crystal structure of N-SH2 domain of SHP-2 tyrosine phosphatase in complex with two copies of FVP peptide. Ribbon diagram of the SHP2 N-terminal

SH2 domain in complex with two identical FVP peptides, with the SH2 domain in light blue, Peptide 1 (P1) in magenta and Peptide 2 (P2) in green. Notice that the two peptides form an anti-parallel dimer that forms β-sheet hydrogen bonds with strand βD and with strand βH of the BG loop of the SH2 domain.


The first peptide, Peptide 1, binds in the canonical manner, as an extended

β-strand that pairs with the strand βD of the SH2-domain and inserts its pTyr residue into a positively charged pocket formed in part by Arg-βB5 and Lys-βD6. The second peptide, Peptide 2, binds in an extended conformation running anti-parallel to Peptide

1. Residues IpYF of Peptide 2 pair up with residues 88-90 of the BG loop of the SH2 domain, which form a short β-strand (referred to as βH). The pTyr residue of Peptide

2 is exposed on the surface of the structure, but forms potentially favorable electrostatic interactions with Lys-BG5, as evidenced by the ordering of the Lys-BG5 side chain in this structure, which was not resolved in the electron density for structures of unbound SH2 domain or the complex with peptide AQLW. Overall, the two peptides and βH of the BG loop of the SH2-domain form an anti-parallel three-stranded β-sheet, which adds onto the central five-stranded β-sheet (βA, βB, βC,

βD and βG) of the SH2 domain to form a twisted eight-stranded β-sheet that almost forms a closed barrel.

In addition to the hydrogen bonding interactions, binding of the two peptides is also stabilized by hydrophobic interactions with the SH2-domain and with one another (Figure 5.6 B). In particular, the side chain of Phe(+1) of Peptide 2 is anchored into a hydrophobic pocket on the SH2 domain formed by the side chains of

Ile-βD5, Leu-βE4, Leu-BG4, and Ile-BG12 of the SH2 domain (Figure 5.6 C). This hydrophobic pocket is the site to which apolar residues at the +3 position of other peptides have been observed to bind. The side chains of Ile(-1) of Peptide 2 and

Phe(+1) residue of Peptide 1 are also partially buried, although their interactions


Figure 5.6 Peptide FVP binding sites and interactions.

(A) Stereo view of the electron density map for the bound FVP peptides. The 2fo-fc map, calculated at 1.8 Å after refinement, is contoured at 1 σ and superimposed on the final refined model of the complex.

(B) Close-up view of hydrogen-bonding interactions among Peptide 1, Peptide 2 and

SH2 domain. Residue labels are colored the same as the carbon in their own chains.

(C) Close-up view of the hydrophobic pocket to which Phe(+1) of Peptide 2 is anchored. Residues involved in the hydrophobic interactions are labeled.

appear to be less optimal (Figure 5.6 C). Due to the anti-parallel configuration, the

Pro(+3) residue of Peptide 1 stacks closely over the Phe(+1) side chain of Peptide 2,

and vice versa, to form additional hydrophobic interactions that appear to stabilize the

complex. The side chains of the Val+2 residues of each peptide are exposed at the

surface of the complex, but could form stabilizing hydrophobic interactions with one

another. Based on the structure, it appears that hydrophobic interactions between the

FVP residues of the two peptides with one another and with the SH2 domain are in large part responsible for the specificity of FVP sequence for the SHP-2 N-SH2 domain.

As seen previously in structures of N-SH2 domain of SHP-2 (Lee, et al., 1994), the peptide-binding surface appears as an extended wide groove, lying on the surface that is perpendicular to the plane of the central β-sheet of the SH2 domain. In order to accommodate Peptide 2 of the FVP complex, the BG loop of the SH2-domain moves up and away from the peptide-binding site by about 4 Å, as compared to the structure of this domain alone (Figure 5.7). In addition, the residues C-terminal to the pTyr of

Peptide 1 are pushed down to create room for the C-terminal residues of Peptide 2

(Figure 5.7). Consistent with the wider peptide-binding groove, these movements result in a significantly increased binding surface in the FVP structure as compared to

the AQLW and previous SH2 domain structures (Figure 5.8). A total of 670.6 Å²

solvent-accessible surface area of the SH2 domain is buried at the SH2-FVP interface,

which is contributed approximately equally by the two peptides, while a total of 417.7

Å² surface area of the protein is buried at the SH2-AQLW interface.
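The difference between the two interfaces can be put in relative terms with a short calculation. The sketch below uses only the two buried-surface values reported above and assumes an exactly equal split between the two FVP peptides, which the text describes only as approximate; the variable names are illustrative.

```python
# Buried solvent-accessible surface area of the SH2 domain (Å^2), from the text
BURIED_FVP = 670.6
BURIED_AQLW = 417.7

# Fold increase of the FVP interface over the AQLW interface
fold_increase = BURIED_FVP / BURIED_AQLW

# Percentage by which the FVP interface exceeds the AQLW interface
percent_larger = (fold_increase - 1.0) * 100.0

# Approximate contribution of each FVP peptide, assuming an equal split
per_peptide = BURIED_FVP / 2.0

print(round(fold_increase, 2), round(percent_larger, 1), round(per_peptide, 1))
```

On these numbers the FVP interface buries roughly 1.6 times as much of the protein surface as the AQLW interface, while each individual FVP peptide buries somewhat less than the single AQLW peptide.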


Figure 5.7 Stereo view of structure alignment of SH2 domain alone and in complex with peptide substrates. Three structures, unbound SH2 domain (grey,

PDB code 1AYD, Lee, et al. 1994), SH2-AQLW (with protein salmon and peptide cyan) and SH2-FVP (colored as in Figure 5.5) complexes are superimposed and viewed as in Figure 5.5. Notice that the BC and BG loops move away from the peptide binding groove to accommodate the peptides. The BG loop moves even further in the FVP complex to accommodate the second peptide.


Figure 5.8 Surface views of SH2 domains with peptides bound. Surface views of the SHP-2 N-SH2 domain in complex with peptides FVP and AQLW are shown on the left and right, respectively, with colors as indicated in Figure 5.7. Notice that the peptide-binding groove opens up to accommodate the second peptide in the FVP complex.


5.4 Discussions

This is the first study to identify two copies of a peptide (FVP) bound at the peptide-binding surface of a single SH2 domain, which might lead to a paradigm shift in this field. The novel 2:1 peptide-SH2 domain stoichiometry suggests that two proteins could be recruited to the SHP-2 N-SH2 domain at the same time, creating a larger contact interface. More importantly, the interactions between these two proteins, in addition to those between each individual protein and the SH2 domain, might be critical for the regulation of cellular signaling.

One protein that has the FVP sequence, but is uncharacterized, has been identified by searching the protein database. In future studies, it will be interesting to solve the structure of the SH2 domain in complex with a peptide sequence from a naturally occurring protein, to check whether it binds in the same way as observed in this study.


REFERENCES

Abella Columna, E. et al. (1993). Analysis of restriction enzyme-induced chromosomal aberrations by fluorescence in situ hybridization. Environ Mol Mutagen 22(1): 26-33.

Adams, P.D. et al. (2002) PHENIX: building new software for automated crystallographic structure determination. Acta Cryst. D58, 1948-1954.

Andersen, J. N. et al. (2001). Structural and evolutionary relationships among protein tyrosine phosphatase domains. Mol Cell Biol 21(21): 7117-36.

Anderson, D. G. and Kowalczykowski, S. C. (1997). The translocating RecBCD enzyme stimulates recombination by directing RecA protein onto ssDNA in a chi-regulated manner. Cell 90(1): 77-86.

Aoki, Y. et al. (2000). Increased susceptibility to ischemia-induced brain damage in transgenic mice overexpressing a dominant negative form of SHP2. Faseb J 14(13): 1965-73.

Aravind, L., Walker, D.R., and Koonin, E.V. (1999) Conserved domains in DNA repair proteins and evolution of repair systems. Nucleic Acids Res. 27, 1223-1242.

Aravind, L., Makarova, K.S., and Koonin, E.V. (2000) Holliday junction resolvases and related nucleases: identification of new families, phyletic distribution and evolutionary trajectories. Nucleic Acids Res. 28, 3417-3432.


Barbour, S. D. et al. (1970). Biochemical and genetic studies of recombination proficiency in Escherichia coli. II. Rec+ revertants caused by indirect suppression of rec- mutations. Proc Natl Acad Sci 67(1): 128-35.

Baudin, A. et al. (1993). A simple and efficient method for direct gene deletion in Saccharomyces cerevisiae. Nucleic Acids Research 21(14): 3329-30.

Bell, C. E. (2005). Structure and mechanism of Escherichia coli RecA ATPase. Mol Microbiol 58(2): 358-66.

Bergfors, T. M. (1999). Protein crystallization: techniques, strategies, and tips: a laboratory manual, International University Line.

Boehmer, P. E. and Emmerson, P. T. (1991). Escherichia coli RecBCD enzyme: inducible overproduction and reconstitution of the ATP-dependent deoxyribonuclease from purified subunits. Gene 102(1): 1-6.

Branton, D. et al. (2008). The potential and challenges of nanopore sequencing. Nat Biotechnol 26(10): 1146-53.

Brautigam, C. A. and Steitz, T. A. (1998). Structural principles for the inhibition of the 3'-5' exonuclease activity of Escherichia coli DNA polymerase I by phosphorothioates. J Mol Biol 277(2): 363-77.

Breyer, W.A. and Matthews, B.W. (2001) A structural basis for processivity. Protein Sci. 10, 1699-1711.

Bromberg, J. F. et al. (1999). Stat3 as an oncogene. Cell 98(3): 295-303.


Brünger, A. T. et al. (1998). Crystallography & NMR system: A new software suite for macromolecular structure determination. Acta Crystallogr. D54, 905-921.

Buchhop, S. et al. (1997). Interaction of p53 with the human Rad51 protein. Nucleic acids research 25(19): 3868-74.

Camerini-Otero, R. D. and Hsieh, P. (1995). Homologous recombination proteins in prokaryotes and eukaryotes. Annual Review of Genetics 29: 509-52.

Carter, D. M. and Radding, C. M. (1971). The role of exonuclease and beta protein of phage lambda in genetic recombination. II. Substrate specificity and the mode of action of lambda exonuclease. J Biol Chem 246(8): 2502-12.

Cassuto, E. et al. (1971). Role of exonuclease and beta protein of phage lambda in genetic recombination. V. Recombination of lambda DNA in vitro. Proc Natl Acad Sci 68(7): 1639-43.

Cassuto, E. and Radding, C. M. (1971). Mechanism for the action of lambda exonuclease in genetic recombination. Nat New Biol 229(1): 13-6.

Chan, R. J. and Yoder, M. C. (2004). The multiple facets of hematopoietic stem cells. Curr Neurovasc Res 1(3): 197-206.

Chang, H. W. and Julin, D. A. (2001). Structure and function of the Escherichia coli RecE protein, a member of the RecB nuclease domain family. J Biol Chem 276(49): 46004-10.


Chong, Z. Z. and Maiese, K. (2007). The Src homology 2 domain tyrosine phosphatases SHP-1 and SHP-2: diversified control of cell growth, inflammation, and injury. Histol Histopathol 22(11): 1251-67.

Chu, C.C., Templin, A., and Clark, A.J. (1989) Suppression of a frameshift mutation in the recE gene of Escherichia coli K-12 occurs by gene fusion. J. Bacteriol. 171, 2101-2109.

Clark, A. J. (1973). Recombination deficient mutants of E. coli and other bacteria. Annu Rev Genet 7: 67-86.

Clark, A. J. (1974). Progress toward a metabolic interpretation of genetic recombination of Escherichia coli and bacteriophage lambda. Genetics 78(1): 259-71.

Clark, A. J. and Sandler, S. J. (1994). Homologous genetic recombination: the pieces begin to fall into place. Critical Reviews in Microbiology 20(2): 125-42.

Clark, A. J. et al. (1984). Genes of the RecE and RecF pathways of conjugational recombination in Escherichia coli. Cold Spring Harb Symp Quant Biol 49: 453-62.

Clark, A. J. et al. (1993). Genetic and molecular analyses of the C-terminal region of the recE gene from the Rac prophage of Escherichia coli K-12 reveal the recT gene. Journal of Bacteriology 175(23): 7673-82.

Cohen, A. et al. (1985). General genetic recombination of bacterial plasmids. Basic Life Sciences 30(Plasmids Bact.): 505-19.

Collaborative Computational Project, Number 4 (1994) The CCP4 Suite: Programs for Protein Crystallography. Acta Cryst. D50, 760-763.

Copeland, N. G. et al. (2001). Recombineering: a powerful new tool for mouse functional genomics. Nat Rev Genet 2(10): 769-79.

Court, D. L. et al. (2002). Genetic engineering using homologous recombination. Annual Review of Genetics 36: 361-388.

Cox, M. M. et al. (2000). The importance of repairing stalled replication forks. Nature 404(6773): 37-41.

Datta, S. et al. (2006). A set of recombineering plasmids for gram-negative bacteria. Gene 379: 109-115.

De Zutter, J. K. and Knight, K. L. (1999). The hRad51 and RecA proteins show significant differences in cooperative binding to single-stranded DNA. Journal of molecular biology 293(4): 769-80.

Derewenda, Z. S. (2004). Rational Protein Crystallization by Mutational Surface Engineering. Structure 12(4): 529-535.

Dillingham, M. S. et al. (2003). RecBCD enzyme is a bipolar DNA helicase. Nature 423(6942): 893-7.

Edelmann, W. and Kucherlapati, R. (1996). Role of recombination enzymes in mammalian cell survival. Proc Natl Acad Sci 93(13): 6225-7.

Ellis, H. M. et al. (2001). High efficiency mutagenesis, repair, and engineering of chromosomal DNA using single-stranded oligonucleotides. Proc Natl Acad Sci 98(12): 6742-6746.

Emsley, P., and Cowtan, K. (2004) Coot: model-building tools for molecular graphics. Acta Cryst. D60, 2126-2132.

Erler, A. et al. (2009). Conformational adaptability of Redbeta during DNA annealing and implications for its structural relationship with Rad52. J Mol Biol 391(3): 586-98.

Fishman-Lobell, J. et al. (1992). Two alternative pathways of double-strand break repair that are kinetically separable and independently modulated. Mol Cell Biol 12(3): 1292-303.

Georgescu, R. E. et al. (2008). Structure of a sliding clamp on DNA. Cell 132(1): 43-54.

Haber, J. E. (1999). DNA repair. Gatekeepers of recombination. Nature 398(6729): 665, 667.

Hall, S. D. (1993). Homologous pairing proteins involved in RecE pathway recombination of Escherichia coli: biochemical characterization of exoVIII and RecT recombination activities: 214 pp.

Hall, S.D., Kane, M.F., and Kolodner, R.D. (1993) Identification and characterization of the Escherichia coli RecT protein, a protein encoded by the recE region that promotes renaturation of homologous single-stranded DNA. J. Bacteriol. 175, 277-287.

Handa, N. et al. (1997). Chi-star, a chi-related 11-mer sequence partially active in an E. coli recC1004 strain. Genes Cells 2(8): 525-36.

Herr, A.B. et al. (1997) Heparin-induced self-association of fibroblast growth factor-2. Evidence for two oligomerization processes. J. Biol. Chem. 272, 16382-16389.

Herr, A.B., Ballister, E.R., and Bjorkman, P.J. (2003) Bivalent binding of IgA1 to FcalphaRI suggests a mechanism for cytokine activation of IgA phagocytosis. J. Mol. Biol. 327, 645-657.

Hefferin, M. L. and Tomkinson, A. E. (2005). Mechanism of DNA double-strand break repair by non-homologous end joining. DNA Repair (Amst) 4(6): 639-48.

Heras, B. et al. (2003). Dehydration converts DsbG crystal diffraction from low to high resolution. Structure 11(2): 139-145.

Higashi, H. et al. (2002). SHP-2 tyrosine phosphatase as an intracellular target of CagA protein. Science 295(5555): 683-6.

Hirai, H. and Varmus, H. E. (1990). Site-directed mutagenesis of the SH2- and SH3-coding domains of c-src produces varied phenotypes, including oncogenic activation of p60c-src. Mol Cell Biol 10(4): 1307-18.

Hof, P. et al. (1998). Crystal structure of the tyrosine phosphatase SHP-2. Cell 92(4): 441-50.

Horvat, A. et al. (2001). A novel role for protein tyrosine phosphatase shp1 in controlling glial activation in the normal and injured nervous system. J Neurosci 21(3): 865-74.

Irandoust, M. et al. (2009). Role of tyrosine phosphatase inhibitors in cancer treatment with emphasis on SH2 domain-containing tyrosine phosphatases (SHPs). Anticancer Agents Med Chem 9(2): 212-20.

Iyer, L. M. et al. (2002). Classification and evolutionary history of the single-strand annealing proteins, RecT, Redβ, ERF and RAD52. BMC Genomics 3: 8-19.

Joseph, J. W. and Kolodner, R. (1983). Exonuclease VIII of Escherichia coli. I. Purification and physical properties. J Biol Chem 258(17): 10411-7.

Joseph, J. W. and Kolodner, R. (1983). Exonuclease VIII of Escherichia coli. II. Mechanism of action. J Biol Chem 258(17): 10418-24.

Kanaar, R. et al. (1998). Molecular mechanisms of DNA double-strand break repair. Trends in Cell Biology 8(12): 483-489.

Karakousis, G. et al. (1998). The beta protein of phage lambda binds preferentially to an intermediate in DNA renaturation. J Mol Biol 276(4): 721-31.

Kettenberger, H. and Cramer, P. (2006). Fluorescence detection of nucleic acids and proteins in multi-component crystals. Acta Crystallogr D Biol Crystallogr 62(Pt 2): 146-50.

Kim, C. H. et al. (1999). Abnormal chemokine-induced responses of immature and mature hematopoietic cells from motheaten mice implicate the protein tyrosine phosphatase SHP-1 in chemokine responses. J Exp Med 190(5): 681-90.

Kim, H. Y. et al. (2006). Raft-mediated Src homology 2 domain-containing protein-tyrosine phosphatase 2 (SHP-2) regulation in microglia. J Biol Chem 281(17): 11872-8.

Kmiec, E. and Holloman, W.K. (1981) Beta protein of bacteriophage lambda promotes renaturation of DNA. J. Biol. Chem. 256, 12636-12639.

Kogoma, T. (1996). Recombination by replication. Cell 85(5): 625-627.

Kolodner, R. et al. (1994). Homologous pairing proteins encoded by the Escherichia coli recE and recT genes. Mol Microbiol 11(1): 23-30.

Kosaki, K. et al. (2002). PTPN11 (protein-tyrosine phosphatase, nonreceptor-type 11) mutations in seven Japanese patients with Noonan syndrome. J Clin Endocrinol Metab 87(8): 3529-33.

Kovaleski, B. J. et al. (2006). In vitro characterization of the interaction between HIV-1 Gag and human lysyl-tRNA synthetase. J Biol Chem 281(28): 19449-56.

Kovall, R. and Matthews, B. W. (1997). Toroidal structure of lambda-exonuclease. Science 277(5333): 1824-7.

Kovall, R. A. and Matthews, B. W. (1998). Structural, functional, and evolutionary relationships between lambda-exonuclease and the type II restriction endonucleases. Proc Natl Acad Sci 95(14): 7893-7.

Kovall, R. A. and Matthews, B. W. (1999). Type II restriction endonucleases: structural, functional and evolutionary relationships. Curr Opin Chem Biol 3(5): 578-83.

Kowalczykowski, S. C. et al. (1994). Biochemistry of homologous recombination in Escherichia coli. Microbiological Reviews 58(3): 401-65.

Kowalczykowski, S. C. and Eggleston, A. K. (1994). Homologous pairing and DNA strand-exchange proteins. Annual Review of Biochemistry 63: 991-1043.

Kusano, K. et al. (1994). DNA double-strand break repair: genetic determinants of flanking crossing-over. Proc Natl Acad Sci 91(3): 1173-7.

Kusano, K. et al. (1994). Involvement of RecE exonuclease and RecT annealing protein in DNA double-strand break repair by homologous recombination. Gene 138(1-2): 17-25.

Kushner, S. R. et al. (1974). Isolation of exonuclease VIII. Enzyme associated with the sbcA indirect suppressor. Proc Natl Acad Sci 71(9): 3593-7.

Kushner, S. R. et al. (1974). Isolation of the enzyme associated with sbcA indirect suppressor. Mech. Recomb., [Proc. Biol. Div. Res. Conf.], 27th: 137-43.

Kuzminov, A. (1999) Recombinational repair of DNA damage in Escherichia coli and bacteriophage λ. Microbiol. Mol. Biol. Rev. 63, 751-813.

Kvaratskhelia, M. et al. (2002). Identification of specific HIV-1 reverse transcriptase contacts to the viral RNA:tRNA complex by mass spectrometry and a primary amine selective reagent. Proc Natl Acad Sci 99(25): 15988-93.

Lao, J. P. et al. (2008). Rad52 promotes postinvasion steps of meiotic double-strand-break repair. Mol Cell 29(4): 517-24.

Lee, J.Y. et al. (2005) MutH complexed with hemi- and unmethylated DNAs: coupling of base recognition and DNA cleavage. Mol. Cell 20, 155-166.

Li, Z. et al. (1998). The beta protein of phage lambda promotes strand exchange. J Mol Biol 276(4): 733-44.

Liang, F. et al. (1998). Homology-directed repair is a major double-strand break repair pathway in mammalian cells. Proc Natl Acad Sci 95(9): 5172-7.

Little, J. W. et al. (1967). An exonuclease induced by bacteriophage lambda. I. Preparation of the crystalline enzyme. J Biol Chem 242(4): 672-8.

Little, J. W. (1967). An exonuclease induced by bacteriophage lambda. II. Nature of the enzymatic reaction. J Biol Chem 242(4): 679-86.

Liu, B. A. et al. (2006). The human and mouse complement of SH2 domain proteins-establishing the boundaries of phosphotyrosine signaling. Mol Cell 22(6): 851-68.

Lloyd, R. G. and Buckman, C. (1985). Identification and genetic analysis of sbcC mutations in commonly used recBC sbcB strains of Escherichia coli K-12. J Bacteriol 164(2): 836-44.

Luisi-DeLuca, C. et al. (1988). Analysis of the recE locus of Escherichia coli K-12 by use of polyclonal antibodies to exonuclease VIII. Journal of Bacteriology 170(12): 5797-805.

Luo, W. et al. (2008). Retraction: analysis of the TCR alpha and beta chain CDR3 spectratypes in the peripheral blood of patients with Systemic Lupus Erythematosus. J Autoimmune Dis 5: 5.

Lusetti, S.L. and Cox, M.M. (2002) The bacterial RecA protein and the recombinational DNA repair of stalled replication forks. Annu. Rev. Biochem. 71, 71-100.

Maas, M. et al. (2003). Reactive oxygen species induce reversible PECAM-1 tyrosine phosphorylation and SHP-2 binding. Am J Physiol Heart Circ Physiol 285(6): H2336-44.

Maat, A. et al. (2008). Cocaine is a Major Risk Factor for Antipsychotic Induced Akathisia, Parkinsonism and Dyskinesia. Psychopharmacol Bull 41(3): 5-10.

Malkov, V. A. and Camerini-Otero, R. D. (1995). Photocrosslinks between single-stranded DNA and Escherichia coli RecA protein map to loops L1 (amino acid residues 157-164) and L2 (amino acid residues 195-209). Journal of Biological Chemistry 270(50): 30230-33.

Mao, Z. et al. (2008). DNA repair by nonhomologous end joining and homologous recombination during cell cycle in human cells. Cell Cycle 7(18): 2902-6.

Matsuura, S. et al. (2001). Real-time observation of a single DNA digestion by λ exonuclease under a fluorescence microscope field. Nucleic Acids Research 29(16): e79/1-e79/5.

McIlwraith, M. J. and West, S. C. (2008). DNA repair synthesis facilitates RAD52-mediated second-end capture during DSB repair. Mol Cell 29(4): 510-6.

McRee, D. E. (1999). Practical Protein Crystallography, Academic Press.

Miled, N. et al. (2007). Mechanism of two classes of cancer mutations in the phosphoinositide 3-kinase catalytic subunit. Science 317(5835): 239-42.

Mimitou, E. P. and Symington, L. S. (2008). Sae2, Exo1 and Sgs1 collaborate in DNA double-strand break processing. Nature 455(7214): 770-4.

Mitsis, P.G., and Kwagh, J.G. (1999) Characterization of the interaction of lambda exonuclease with the ends of DNA. Nucleic Acids Res. 27, 3057-3063.

Morimatsu, K. et al. (2001). Interaction of tyrosine 65 of RecA protein with the first and second DNA strands. Journal of molecular biology 306(2): 189-99.

Morimatsu, K. and Horii, T. (1995). The DNA-binding site of the RecA protein. Photochemical cross-linking of Tyr103 to single-stranded DNA. FEBS 228(3): 772-8.

Muniyappa, K. and Radding, C. M. (1986). The homologous recombination system of phage λ. Pairing activities of β protein. Journal of Biological Chemistry 261(16): 7472-8.

Muyrers, J. P. et al. (2000). RecE/RecT and Redalpha/Redbeta initiate double-stranded break repair by specifically interacting with their respective partners. Genes Dev 14(15): 1971-82.

Muyrers, J. P. P. et al. (2004). ET recombination: DNA engineering using homologous recombination in E. coli. Methods in Molecular Biology 256(Bacterial Artificial Chromosomes, Volume 2): 107-121.

Muyrers, J. P. P. et al. (2000). ET-cloning: think recombination first. Genetic Engineering 22: 77-98.

Muyrers, J. P. P. et al. (2001). Techniques: Recombinogenic engineering-new options for cloning and manipulating DNA. Trends in Biochemical Sciences 26(5): 325-331.

Muyrers, J. P. P. et al. (2002). Introducing Red/ET recombination: DNA engineering for the 21st century. Gene Cloning and Expression Technologies: 87-96, A5-A7.

Muyrers, J. P. P. et al. (1999). Rapid modification of bacterial artificial chromosomes by ET-recombination. Nucleic Acids Research 27(6): 1555-1557.

Neel, B. G. et al. (2003). The 'Shp'ing news: SH2 domain-containing tyrosine phosphatases in cell signaling. Trends Biochem Sci 28(6): 284-93.

Neel, B. G. and Tonks, N. K. (1997). Protein tyrosine phosphatases in signal transduction. Curr Opin Cell Biol 9(2): 193-204.

Noirot, P. et al. (2003). Hallmarks of homology recognition by RecA-like recombinases are exhibited by the unrelated Escherichia coli RecT protein. Embo J 22(2): 324-34.

Noirot, P. and Kolodner, R. D. (1998). DNA strand invasion promoted by Escherichia coli RecT protein. J Biol Chem 273(20): 12274-80.

Nussbaum, A. et al. (1992). Restriction-stimulated homologous recombination of plasmids by the RecE pathway of Escherichia coli. Genetics 130(1): 37-49.

Oppenheim, A. B. et al. (2004). In vivo recombineering of bacteriophage λ by PCR fragments and single-strand oligonucleotides. Virology 319(2): 185-189.

Paques, F. and Haber, J. E. (1999). Multiple pathways of recombination induced by double-strand breaks in Saccharomyces cerevisiae. Microbiol Mol Biol Rev 63(2): 349-404.

Passy, S. I. et al. (1999). Rings and filaments of beta protein from bacteriophage lambda suggest a superfamily of recombination proteins. Proc Natl Acad Sci 96(8): 4279-84.

Pawson, T. (2004). Specificity in signal transduction: from phosphotyrosine-SH2 domain interactions to complex cellular systems. Cell 116(2): 191-203.

Perkins, T. T. et al. (2003). Sequence-dependent pausing of single lambda exonuclease molecules. Science 301(5641): 1914-8.

Pfeiffer, P. (1998). The mutagenic potential of DNA double-strand break repair. Toxicology letters 96-97: 119-29.

Phillips, J. W. and Morgan, W. F. (1994). Illegitimate recombination induced by DNA double-strand breaks in a mammalian chromosome. Mol Cell Biol 14(9): 5794-803.

Poteete, A. R. (2001). What makes the bacteriophage lambda Red system useful for genetic engineering: molecular mechanism and biological function. FEMS Microbiology Letters 201(1): 9-14.

Rajan, R. et al. (2006). Probing the DNA sequence specificity of Escherichia coli RECA protein. Nucleic Acids Research 34(8): 2463-2471.

Rakesh, K. and Agrawal, D. K. (2005). Controlling cytokine signaling by constitutive inhibitors. Biochem Pharmacol 70(5): 649-57.

Raschle, M. et al. (2004). Multiple interactions with the Rad51 recombinase govern the homologous recombination function of Rad54. The Journal of biological chemistry 279(50): 51973-80.

Roca, A. I. and Cox, M. M. (1997). RecA protein: structure, function, and role in recombinational DNA repair. Progress in Nucleic Acid Research and Molecular Biology 56: 129-223.

Russell, R. B. et al. (1992). Conservation analysis and structure prediction of the SH2 family of phosphotyrosine binding domains. FEBS Lett 304(1): 15-20.

Sadowski, I. et al. (1986). A noncatalytic domain conserved among cytoplasmic protein-tyrosine kinases modifies the kinase function and transforming activity of Fujinami sarcoma virus P130gag-fps. Mol Cell Biol 6(12): 4396-408.

Sanderson, K. (2008). Personal genomes: Standard and pores. Nature 456(7218): 23-5.

Schuck, P. (2000) Size-distribution analysis of macromolecules by sedimentation velocity ultracentrifugation and Lamm equation modeling. Biophys. J. 78, 1606-1619.

Shell, S. M. et al. (2005). Mass Spectrometric Identification of Lysines Involved in the Interaction of Human Replication Protein A with Single-Stranded DNA. Biochemistry 44(3): 971-978.

Shinohara, A. and Ogawa, T. (1995). Homologous recombination and the roles of double-strand breaks. Trends Biochem Sci 20(10): 387-91.

Shulman, M. J. et al. (1970). Properties of recombination-deficient mutants of bacteriophage lambda. J Mol Biol 52(3): 501-20.

Silberstein, Z. et al. (1995). Primary products of break-induced recombination by Escherichia coli RecE pathway. Journal of Bacteriology 177(7): 1692-8.

Singleton, M. R. et al. (2004). Crystal structure of RecBCD enzyme reveals a machine for processing DNA breaks. Nature 432(7014): 187-93.

Singleton, M. R. et al. (2002). Structure of the single-strand annealing domain of human RAD52 protein. Proc Natl Acad Sci 99(21): 13492-13497.

Skiba, M. C. et al. (1999). Intersubunit proximity of residues in the RecA protein as shown by engineered disulfide cross-links. Biochemistry 38(37): 11933-41.

Smith, G. R. (1988). Homologous recombination in procaryotes. Microbiol Rev 52(1): 1-28.

Stahl, F. (1996). Meiotic recombination in yeast: coronation of the double-strand-break repair model. Cell 87(6): 965-8.

Stahl, M. M. et al. (1997). Annealing vs. invasion in phage lambda recombination. Genetics 147(3): 961-77.

Stasiak, A. Z. et al. (2000). The human Rad52 protein exists as a heptameric ring. Curr Biol 10(6): 337-40.

Steighner, R. J. and Povirk, L. F. (1990). Bleomycin-induced DNA lesions at mutational hot spots: implications for the mechanism of double-strand cleavage. Proc Natl Acad Sci 87(21): 8350-4.

Subramanian, K. et al. (2003). The enzymatic basis of processivity in lambda exonuclease. Nucleic Acids Res 31(6): 1585-96.

Sugiyama, T. et al. (2006). Rad52-mediated DNA annealing after Rad51-mediated DNA strand exchange promotes second ssDNA capture. Embo J 25(23): 5539-48.

Sweeney, M. C. et al. (2005). Decoding protein-protein interactions through combinatorial chemistry: sequence specificity of SHP-1, SHP-2, and SHIP SH2 domains. Biochemistry 44(45): 14932-47.

Symington, L. S. (2002). Role of RAD52 epistasis group genes in homologous recombination and double-strand break repair. Microbiol Mol Biol Rev 66(4): 630-70.

Szostak, J. W. et al. (1983). The double-strand-break repair model for recombination. Cell 33(1): 25-35.

Takahashi, N. K. et al. (1993). Genetic analysis of double-strand break repair in Escherichia coli. Journal of bacteriology 175(16): 5176-85.

Tartaglia, M. et al. (2001). Mutations in PTPN11, encoding the protein tyrosine phosphatase SHP-2, cause Noonan syndrome. Nat Genet 29(4): 465-8.

Templin, A. et al. (1972). Genetic analysis of mutations indirectly suppressing recB and recC mutations. Genetics 72(2): 105-15.

Thomason, L. C. et al. (2005). Recombineering in prokaryotes. Phages: 383-399, 3 plates.

Thresher, R. J. et al. (1995). Electron microscopic visualization of RecT protein and its complexes with DNA. Journal of Molecular Biology 254(3): 364-71.

Tolun, G. and Myers, R. S. (2003). A real-time DNase assay (ReDA) based on PicoGreen fluorescence. Nucleic Acids Res 31(18): e111.

Tracy, R. B. et al. (1997). The preference for GT-rich DNA by the yeast Rad51 protein defines a set of universal pairing sequences. Genes & Development 11(24): 3423-3431.

Tracy, R. B. and Kowalczykowski, S. C. (1996). In vitro selection of preferred DNA pairing sequences by the Escherichia coli RecA protein. Genes & Development 10(15): 1890-1903.

Tullius, T. D. and Dombroski, B. A. (1986). Hydroxyl radical footprinting: high-resolution information about DNA-protein contacts and application to λ repressor and Cro protein. Proc Natl Acad Sci 83(15): 5469-73.

Van Kessel, J.C. and Hatfull, G.F. (2007) Recombineering in Mycobacterium tuberculosis. Nat. Methods 4, 147-152.

Van Gent, D. C. et al. (2001). Chromosomal stability and the DNA double-stranded break connection. Nat Rev Genet 2(3): 196-206.

Van Oijen, A.M. et al. (2003). Single-molecule kinetics of lambda exonuclease reveal base dependence and dynamic disorder. Science 301, 1235-1238.

Venken, K. J. et al. (2006). P[acman]: A BAC Transgenic Platform for Targeted Insertion of Large DNA Fragments in D. melanogaster. Science 314(5806): 1747-1751.

Volodin, A. A. and Camerini-Otero, R. D. (2002). Influence of DNA sequence on the positioning of RecA monomers in RecA-DNA cofilaments. Journal of Biological Chemistry 277(2): 1614-1618.

Volodin, A. A. et al. (1997). Periodicity in recA protein-DNA complexes. FEBS Letters 407(3): 325-328.

Waksman, G. et al. (1993). Binding of a high affinity phosphotyrosyl peptide to the Src SH2 domain: crystal structures of the complexed and peptide-free forms. Cell 72(5): 779-90.

Wang, X. and Baumann, P. (2008). Chromosome fusions following telomere loss are mediated by single-strand annealing. Mol Cell 31(4): 463-73.

Ward, J. F. (1988). DNA damage produced by ionizing radiation in mammalian cells: identities, mechanisms of formation, and reparability. Prog Nucleic Acid Res Mol Biol 35: 95-125.

Weaver, D. T. (1995). What to do at an end: DNA double-strand-break repair. Trends Genet 11(10): 388-92.

Wong, I. and Lohman, T. M. (1993). A double-filter method for nitrocellulose-filter binding: application to protein-nucleic acid interactions. Proc Natl Acad Sci 90(12): 5428-32.

Wood, R. D. et al. (2001). Human DNA repair genes. Science 291(5507): 1284-9.

Wu, Z. et al. (2006). Domain Structure and DNA Binding Regions of β Protein from Bacteriophage λ. Journal of Biological Chemistry 281(35): 25205-25214.

Yang, H. et al. (2005). The BRCA2 homologue Brh2 nucleates RAD51 filament formation at a dsDNA-ssDNA junction. Nature 433(7026): 653-7.

Yi, T. et al. (1993). Hematopoietic cell phosphatase associates with the interleukin-3 (IL-3) receptor beta chain and down-regulates IL-3-induced tyrosine phosphorylation and mitogenesis. Mol Cell Biol 13(12): 7577-86.

Yi, T. L. et al. (1992). Protein tyrosine phosphatase containing SH2 domains: characterization, preferential expression in hematopoietic cells, and localization to human chromosome 12p12-p13. Mol Cell Biol 12(2): 836-46.

Yu, D. et al. (2003). Recombineering with overlapping single-stranded DNA oligonucleotides: Testing a recombination intermediate. Proc Natl Acad Sci 100(12): 7207-7212.

Yu, M. et al. (1998). The 30-kDa C-terminal domain of the RecB protein is critical for the nuclease activity, but not the helicase activity, of the RecBCD enzyme from Escherichia coli. Proc Natl Acad Sci 95(3): 981-6.

Yu, Y. and Bradley, A. (2001). Mouse genomic technologies: Engineering chromosomal rearrangements in mice. Nature Reviews Genetics 2(10): 780-790.
