STRUCTURAL AND FUNCTIONAL STUDIES OF THE BACTERIAL RECA

DISSERTATION

Presented in Partial Fulfillment of the Requirements for

the Degree Doctor of Philosophy in the Graduate

School of The Ohio State University

By

Rakhi Rajan, M.S.

* * * * *

The Ohio State University 2007

Dissertation Committee:

Dr. Charles E. Bell, Advisor

Dr. Dehua Pei

Dr. Scott Walsh

Dr. Ross Dalbey

Approved by

Advisor, Biophysics Graduate Program

ABSTRACT

Double stranded (ds) DNA breaks are among the most detrimental types of DNA

damage. dsDNA breaks can be repaired in cells by a process called homologous

recombination. RecA is the key player that mediates the DNA strand exchange reaction

in the recombination process. The Gram-positive bacterium Deinococcus radiodurans

(Dr) is extremely resistant to high doses of ionizing radiation; it is therefore of great interest for studying biological DNA repair processes and of potential use in the bioremediation of

radioactive waste. The resistance of Dr to extreme doses of ionizing radiation depends on

its highly efficient capacity to repair dsDNA breaks. The Dr RecA protein promotes

DNA strand-exchange by an unprecedented inverse pathway, in which the presynaptic

filament is formed on dsDNA instead of ssDNA. To gain insight into the remarkable DNA repair capacity of Dr and the novel mechanistic features of its RecA protein, the x-ray crystal structure of Dr RecA in complex with ATPγS was determined at

2.5 Å resolution. Like RecA from E. coli, Dr RecA crystallizes as a helical filament that is closely related to its biologically relevant form, but with a more compressed pitch of

67 Å. Although the overall fold of Dr RecA is similar to that of E. coli RecA, there is a large reorientation of the C-terminal domain, which in E. coli RecA has a site for binding dsDNA. Compared to E. coli RecA, the inner surface along the central axis of the Dr

RecA filament has an increased positive electrostatic potential. Unique amino acid residues in Dr RecA cluster around a flexible β-hairpin that has also been implicated in

DNA binding. The details of Dr RecA structure are discussed in chapter 2.

RecA generally binds to any sequence of ssDNA but has a preference for GT-rich

sequences, as found in the recombination hot spot Chi (5’-GCTGGTGG-3’). When this

sequence is located within an oligonucleotide, binding of RecA is phased relative to it,

with a periodicity of three nucleotides. This implies that there are three separate

nucleotide-binding sites within a RecA monomer that may exhibit preferences for the

four different nucleotides. In chapter 3, a RecA coprotease assay was used to further

probe the ssDNA sequence specificity of E. coli RecA protein. The extent of

self-cleavage of a λ-repressor protein fragment in the presence of RecA, ADP-AlF4, and 64 different trinucleotide-repeating 15-mer oligonucleotides was determined. The coprotease activity of RecA is strongly dependent on the ssDNA sequence, with TGG-repeating sequences giving by far the highest coprotease activity, and GC- and AT-rich sequences the lowest. For selected trinucleotide-repeating sequences, the DNA-dependent ATPase and DNA-binding activities of RecA were also determined. The DNA-binding and coprotease activities of RecA have the same sequence dependence, which is essentially opposite to that of the ATPase activity of RecA. The implications with regard to the biological mechanism of RecA are discussed.

The inverse strand exchange pathway of Dr RecA was proposed based on in vitro strand exchange reactions, which give an indirect measurement of the RecA-DNA

interaction. The crystal structure of Dr RecA showed features consistent with the inverse

strand exchange mechanism. In chapter 4, a set of experiments was designed to directly

measure the interactions of Dr and Ec RecA with ssDNA and dsDNA substrates.

The experiments do not reveal any distinctive differences in the DNA-binding properties

of the two proteins that are consistent with the proposed model for the inverse strand

exchange pathway of Dr RecA.

Chapter 5 summarizes the work done in collaboration with Dr. Pei.

S-Ribosylhomocysteinase (LuxS) is the enzyme which catalyses the synthesis of the

precursor of type II bacterial quorum sensing molecule (AI-2). AI-2 is very important in

development, since it mediates inter- communication with in . A

catalytically inactive mutant (C84A) of Bacillus subtilis LuxS (BsLuxS) was co-crystallized

with the 2-ketone intermediate and the structure was determined to 1.8 Å

resolution. The structure reveals that the C2 carbonyl oxygen is directly coordinated with

the metal ion, providing strong support for the proposed Lewis acid function of the metal

ion during catalysis. A series of structural analogues of the substrate for LuxS were

designed and synthesized. Co-crystal structures of the wild type BsLuxS bound with two

of these compounds largely confirmed the design principles, i.e., the importance of both the homocysteine and ribose moieties in the high-affinity binding to LuxS active site.


Dedicated to my family

ACKNOWLEDGMENTS

I would like to thank my advisor, Dr. Charles E. Bell, for his support and

guidance throughout my graduate education. His guidance and insight were instrumental

in the success of the research presented in this dissertation.

I am grateful to my committee members Dr. Dehua Pei, Dr. Ross Dalbey, and Dr.

Scott Walsh for their valuable suggestions in the preparation of the thesis. I am thankful

to Dr. Russ Hille for serving on the committee for a major period and also for his input in

developing the dissertation. Special thanks are due to Dr. Pei and his lab members,

especially Dr. Zhu, who were involved in part of the work presented in this dissertation.

I would like to thank Dr. Scott Walsh for access to the BIAcore instrument and

also for his guidance in conducting and interpreting the SPR experiments. I also thank Dr.

Hille for the use of the gel documentation system in his laboratory. I am thankful to Dr.

Kalpana Ghoshal and Dr. Sarmila Majumder for help with the gel extraction and purification techniques. I am indebted to Dr. Bell’s lab members Jinjin, Xu, Jim, and Dr.

Ndjonka for the stimulating discussions and help throughout the research.

I am grateful to the Ohio State Biophysics Graduate Studies Program and Dr.

Thomas Clanton for accepting me into the Program. I extend my special thanks to Susan

Hauser for all her help during the graduate studies. I also thank the Department of

Molecular and Cellular Biochemistry, and the administrative members, Barbara Nesbitt,

Brenda Blanton, Ron Louters, and Eric Robbins for all their help during the course of the study.

I extend very special thanks to my husband Sudarshan Seshadri who provided moral support and encouragement throughout my doctoral work. I am proud of my daughter Meena for surviving through the difficult times throughout my graduate studies.

Finally, I would like to thank my parents, Rajasekharan Nair and Leelamony, and my sister Reshmi, who provided the strong foundation for all the achievements in my life.

This research was supported by funds available to Dr. Charles E. Bell and to research

collaborator Dr. Dehua Pei from the National Institutes of Health.

VITA

May 15, 1976 ...... Born, Trivandrum, India

1994-1998 ...... B.Sc. Agriculture, Kerala Agricultural University, Kerala, India

1998-2000 ...... M.Sc. Biotechnology, Tamil Nadu Agricultural University, Coimbatore, India

2000-2002 ...... Junior Research Fellow, Madurai Kamaraj University, Madurai, India

2002-present ...... Graduate Research Associate, The Ohio State University, Columbus, USA

PUBLICATIONS

1. Rajan, R., and Bell, C. E. (2004). Crystal structure of RecA from Deinococcus radiodurans: insights into the structural basis of extreme radioresistance. J. Mol. Biol. 344, 951-963.

2. Rajan, R., Zhu, J., Hu, X., Pei, D., and Bell, C. E. (2005). Crystal structure of S-ribosylhomocysteinase (LuxS) in complex with a catalytic 2-ketone intermediate. Biochemistry 44, 3745-3753.

3. Rajan, R., Wisler, J. W., and Bell, C. E. (2006). Probing the DNA sequence specificity of Escherichia coli RECA protein. Nucleic Acids Res. 34, 2463-2471.

4. Shen, G., Rajan, R., Zhu, J., Bell, C. E., and Pei, D. (2006). Design and synthesis of substrate and intermediate analogue inhibitors of S-ribosylhomocysteinase. J. Med. Chem. 49, 3003-3011.


FIELDS OF STUDY

Major Field: Biophysics

TABLE OF CONTENTS

Abstract
Dedication
Acknowledgments
Vita
List of tables
List of figures
Abbreviations

Chapters:

1. Introduction
1.1 Homologous Recombination
1.2 RecA
1.2.1 Discovery of RecA and its role in the HR process
1.2.2 Crystal structure of RecA
1.2.3 Extended conformation of RadA and Rad51 proteins
1.2.4 Coprotease activity of RecA
1.2.5 ATPase activity of RecA
1.2.5.1 Models for the role of ATPase activity of RecA
1.2.5.2 Waves of ATP hydrolysis
1.2.6 Nucleation and Cooperativity in RecA-DNA binding
1.2.7 Stoichiometry of RecA-DNA binding
1.2.8 Regulation of E. coli RecA in vivo
1.2.9 Proposed Mechanism for ATP hydrolysis by RecA
1.2.10 Differences between eukaryotic Rad51 and bacterial RecA proteins
1.2.11 Relevant new information on RecA
1.2.12 Importance of a RecA DNA structure
1.3 Deinococcus radiodurans RecA
1.4 X-ray crystallography
1.4.1 Obtaining diffraction quality crystals
1.4.2 X-ray diffraction data collection
1.4.2.1 Sources of x-ray radiation
1.4.2.2 The process of x-ray diffraction
1.4.2.3 Reciprocal lattice
1.4.2.4 Principles of X-ray crystallography
1.4.2.5 Conversion of diffraction data to structural information
1.4.3 Calculation of phases
1.4.3.1 Molecular Replacement
1.4.3.2 Multiple Isomorphous Replacement (MIR)
1.4.3.3 Multiwavelength Anomalous Diffraction (MAD)

2. Crystal Structure of RecA from Deinococcus radiodurans: Insights into the Structural Basis of Extreme Radioresistance
2.1 Introduction
2.2 Materials and Methods
2.2.1 Cloning, expression, and protein purification
2.2.2 Crystallization, x-ray data collection and structure determination
2.3 Results and Discussion
2.3.1 Overexpression and purification of Dr RecA
2.3.2 Overall structure
2.3.3 Domain movements
2.3.4 Monomer-monomer interface
2.3.5 ATP-binding pocket
2.3.6 Surface electrostatics
2.3.7 Amino acid positions in Dr RecA with residue types not seen in other RecA sequences
2.3.8 Structural Implications for the Inverse Mechanism of Strand Exchange
2.4 Conclusions

3. Probing the DNA Sequence Specificity of E. coli RecA Protein
3.1 Introduction
3.2 Materials and Methods
3.2.1 Materials
3.2.2 Protein Expression and Purification
3.2.3 RecA coprotease assay
3.2.4 ATPase assay
3.2.5 DNA-binding assay
3.3 Results and Discussion
3.3.1 Dependence of RecA coprotease activity on ssDNA sequence
3.3.2 Dependence of RecA coprotease activity on oligonucleotide length
3.3.3 Dependence of RecA coprotease activity on type of ATP cofactor
3.3.4 Detailed comparison of TGG, GGT, and GTG
3.3.5 Dependence of RecA ATPase activity on DNA sequence
3.3.6 Dependence of RecA DNA-binding on DNA sequence
3.3.7 Basis of sequence specificity of coprotease activity
3.3.8 Validity of using trinucleotide-repeating sequences
3.3.9 Basis of sequence specificity of binding
3.3.10 Contrasting preferences for ATPase and coprotease activities
3.3.11 Biological implications
3.4 Conclusions

4. Testing the Inverse Strand Exchange Mechanism of Deinococcus radiodurans RecA
4.1 Introduction
4.2 Materials and Methods
4.2.1 Purification of Deinococcus radiodurans RecA (native)
4.2.2 Purification of Escherichia coli RecA (native)
4.2.3 Purification of the isolated CTD of Dr RecA and Ec RecA
4.2.4 Nuclease Assay
4.2.5 Gel Shift Assay
4.2.6 Strand Exchange Assay
4.2.7 Double Filter binding assay
4.2.8 ATPase assay
4.2.9 Surface Plasmon Resonance
4.2.9.1 Preparation of DNA
4.2.9.2 Preparation of the sensor chip
4.2.9.3 SPR experimental setup
4.2.9.4 SPR data analysis
4.3 Results and Discussion
4.3.1 Full length Ec and Dr RecA
4.3.1.1 Protein purification and characterization
4.3.1.2 Filter Binding Assay
4.3.1.3 ATPase Assay
4.3.1.4 Surface Plasmon Resonance (SPR)
4.3.2 Isolated CTD of Ec and Dr RecA
4.3.2.1 Protein purification and characterization
4.3.2.2 Filter Binding Assay
4.4 Conclusions

5. Structural studies of Bacillus subtilis LuxS in complex with reaction intermediates and inhibitors
5.1 Introduction
5.2 Materials and Methods
5.2.1 Materials
5.2.2 Site-Directed Mutagenesis of LuxS
5.2.3 Purification of C84A BsLuxS (non-His-tag)
5.2.4 Purification of VhLuxS Mutants
5.2.5 LuxS Activity Assay
5.2.6 UV-VIS Spectroscopy
5.2.7 Crystallization and Structure Determination
5.3 Results and Discussion
5.3.1 Structure of LuxS complexed with catalytic 2-ketone intermediate 4
5.3.2 Site-Directed Mutagenesis
5.3.3 UV-VIS Spectroscopy
5.3.4 Mechanistic Implications
5.4 Co-crystallization of BsLuxS with various inhibitors
5.4.1 Introduction
5.4.2 Materials and Methods
5.4.2.1 Synthesis of the inhibitors
5.4.2.2 Crystallization and X-ray Diffraction
5.4.3 Results and Discussion
5.4.3.1 Structure of LuxS in complex with compounds 10 and 11
5.4.3.2 Structure of LuxS in complex with SA-I
5.5 Structure of BsLuxS bound to a citrate ion
5.5.1 Introduction
5.5.2 Methods
5.5.2.1 Crystallization and Structure Determination
5.5.2.2 UV-VIS spectra of BsLuxS and BsLuxSC84A in the presence of citric acid
5.5.3 Results and Discussion
5.5.3.1 Structure of BsLuxSC84A with citrate ion
5.5.3.2 Activity Assay for LuxS and LuxSC84A mutant in the presence of citrate ion
5.5.3.3 Inhibitor Design

List of References

LIST OF TABLES

Table

1.1 List of proteins which regulate the function of RecA
1.2 List of proteins which regulate the function of Rad51
2.1 Data collection and refinement statistics for Dr RecA structure
5.1 Data collection and refinement statistics for LuxS-2-ketone intermediate 4 structure
5.2 Catalytic activity of LuxS Mutants
5.3 Data collection and refinement statistics for inhibitors 10 and 11
5.4 List of atomic interactions between LuxS and Inhibitors 10 and 11
5.5 Data collection and refinement statistics for LuxS-SA-1 structure
5.6 List of atomic interactions between LuxS and SA-1
5.7 Data collection and refinement statistics for LuxS-citrate structure
5.8 List of important atomic interactions observed in the LuxS-citrate structure

LIST OF FIGURES

Figure

1.1 Schematic representation of the strand exchange reaction in E. coli
1.2 Conformation of the active and inactive filaments of RecA
1.3 Crystal structure of E. coli RecA
1.4 Coprotease activity of RecA
1.5 RecA has a motor activity which helps in the rotation of the DNA molecule
1.6 ATP hydrolysis occurs as waves throughout the RecA filament
1.7 Schematic representation of the nucleotide binding pocket in RecA
1.8 Schematic representation of the proposed mechanism of ATP hydrolysis by RecA
1.9 The proposed inverse strand exchange mechanism of Dr RecA
1.10 Schematic representation of the different stages in the x-ray crystallography structure determination process
1.11 Schematic diagram for the diffraction process
2.1 Dr RecA crystal and diffraction pattern
2.2 Structure of D. radiodurans RecA monomer and polymer
2.3 Reorientation of the C-terminal domain of Dr RecA relative to E. coli RecA
2.4 Structure of the Dr RecA ATP-binding pocket
2.5 Increased positive electrostatic potential on the surface of Dr RecA
2.6 Amino acid residues of Dr RecA with side-chains that are uncommonly seen in other RecA sequences
2.7 Comparison of the structures of the C-terminal domains of Dr RecA and E. coli RecA in the vicinity of the dsDNA-binding site of E. coli RecA
3.1 A schematic representation of the filter binding assembly
3.2 RecA coprotease activity is highly dependent on the ssDNA sequence
3.3 RecA coprotease activity in the presence of 64 trinucleotide-repeating 15-mers
3.4 DNA sequence specificity of RecA coprotease activity is not dependent on oligonucleotide length or type of ATP cofactor
3.5 Time course of RecA coprotease activity with three different TGG-repeating sequences and the top in vitro selected sequence
3.6 Dependence of RecA ATPase activity on the sequence of ssDNA
3.7 Binding of RecA to TGG and CCA-repeating 48-mer oligonucleotides
3.8 Correlation between RecA-DNA binding and coprotease activity for the 64 trinucleotides
4.1 Hypothetical model showing the strand exchange mechanism in E. coli and D. radiodurans
4.2 Gels for ssDNA and dsDNA exonuclease assay
4.3 Gel shift assay showing the binding of Dr and Ec full-length proteins to ss, ds, overhang, and hairpin DNA
4.4 Strand exchange reaction with Dr and Ec RecA
4.5 Binding of Ec and Dr full length proteins to 39-mer ss and dsDNA
4.6 Competitive DNA binding assay for the full length Dr and Ec RecA proteins
4.7 ATPase assay with full length Ec and Dr RecA
4.8 Schematic representation of an experimental setup for SPR
4.9 SPR curves for the binding of Dr and Ec RecA to ss and ds DNA
4.10 Kinetic analysis of the SPR data for Ec RecA-DNA binding
4.11 Hill analysis of the SPR data for Ec RecA
4.12 Gel showing the anomalous mobility of Dr RecA CTD on an SDS-PAGE gel
4.13 Binding of the isolated CTD of Dr RecA to ssDNA and dsDNA
5.1 Schematic representation of the different types of bacterial QS
5.2 Biosynthetic pathway of AI-2
5.3 Proposed catalytic mechanism of LuxS
5.4 Crystal structure of LuxS in complex with the 2-ketone intermediate 4
5.5 Schematic view of the interactions between LuxS and the 2-ketone intermediate
5.6 Comparison of the structure of LuxS bound to 2-ketone intermediate 4 with the structures of LuxS bound to SRH and uncomplexed LuxS
5.7 UV-Vis absorption spectra of C84A Co2+-BsLuxS
5.8 Structure of Co-BsLuxS in complex with inhibitors 10 and 11
5.9 Structure of LuxS in complex with SA-1
5.10 Crystal structure of LuxS in complex with citrate ion
5.11 Schematic representation of the citrate ion with numbers for each oxygen atom
5.12 UV-Vis spectra of C84A mutant and wild type LuxS

ABBREVIATIONS

α alpha, used for alpha-helix of proteins

β beta, used for beta-sheets of proteins

λ wavelength

Å Angstrom (10⁻¹⁰ m)

∆ delta, used for difference / change in two parameters

˚C degree centigrade

NaCl Sodium chloride

DTT Dithiothreitol

IPTG Isopropyl β-D-1-thiogalactopyranoside

EDTA Ethylene diamine tetra-acetic acid

uv Ultra violet

KCl Potassium chloride

SDS-PAGE Sodium dodecyl sulphate – polyacrylamide gel electrophoresis

HPLC High pressure liquid chromatography

ATP Adenosine Tri Phosphate

dATP deoxy Adenosine Tri Phosphate

Dr Deinococcus radiodurans

Ec Escherichia coli

RU Response Unit

ss single stranded

ds double stranded

m milli

M molar

μ micro

γ gamma

θ theta

Φ phi

∑ summation

σ sigma

% percentage

DNA Deoxyribo Nucleic Acid

CTD C-terminal domain

NTD N-terminal domain

OD Optical Density

L liter

rpm revolutions per minute

cm centimeter

LB Luria Bertani

Ni2+ nickel

NaH2PO4 Sodium dihydrogen phosphate

MgCl2 Magnesium chloride

P phosphate

BSA Bovine Serum Albumin

V volt

nt nucleotide

bp base pair

SSB Single Stranded DNA binding protein

cpm counts per minute

PEP Phospho Enol Pyruvate

NADH Nicotinamide adenine dinucleotide reduced

nm nanometer

s second(s)

min minute(s)

ka on rate

kd off rate

kcat turnover number

KA association constant

KD dissociation constant

KM Michaelis-Menten constant

NMR Nuclear Magnetic Resonance

NHS N-hydroxysuccinimide

EDC N-ethyl-N’-(dimethylaminopropyl)carbodiimide

Rmax maximum RU

n co-operativity value

nM nanomolar

NC nitrocellulose

SPR Surface Plasmon Resonance

CHAPTER 1

INTRODUCTION

1.1 Homologous Recombination

Homologous recombination (HR) is the process of exchange of genetic information between two homologous DNA molecules. This biological process is involved in the repair of double stranded (ds) DNA breaks, creation of genetic diversity, and maintenance of genomic integrity (Kuzminov, 1999). In cells, dsDNA breaks are formed by exposure to ionizing radiation, various mutagenic agents, and desiccation. In human cells, failure to accurately and efficiently repair dsDNA breaks can lead to cancer and aging-related diseases. Genetic diversity is created during the meiotic stage of the cell cycle, when crossovers occur between the homologous chromosomes. Since the crossovers are formed only between the homologous chromosomes, the genetic information is conserved.

In bacteria, there are mainly three different pathways of homologous recombination: 1)

RecBCD, 2) RecFOR and 3) RecET. RecBCD is a multi-subunit enzyme with helicase and nucleolytic activities that promotes the major pathway for recombination.

The RecFOR pathway usually becomes active in the absence of a functional RecBCD enzyme. Both RecBCD and RecFOR pathways require RecA protein in the later stages of the recombination process to carry out the strand exchange reaction between homologous

DNA molecules (Kuzminov, 1999). The RecET pathway is independent of RecA protein.

In this pathway, RecE in combination with RecT enzyme carries out homologous recombination by a pathway involving single strand annealing.

1.2 RecA

1.2.1 Discovery of RecA and its role in the HR process

RecA was discovered in mutational studies that aimed to isolate mutants

defective in the recombination process as well as in the repair of damaged DNA (Clark

and Margulies, 1965). This study identified RecA as the common enzyme involved in

both the recombination and DNA repair processes. It is believed that the primary role of

RecA is in the DNA repair process and it evolved over generations because of the

importance of prompt DNA repair in cell survival. Since its discovery, RecA has been a

subject of intense study to understand the mechanism of the recombination process. RecA

belongs to a family of proteins that couple the energy derived from ATP binding and

hydrolysis to carry out various mechanical processes. The other prominent members of

this family include DNA helicases, F1-ATP synthase, and ATP-binding cassette (ABC) membrane transporters (Bell, 2005).

RecA is a DNA-dependent ATPase that promotes the DNA strand exchange reaction, the central step of homologous recombination (Bianco et al. 1998,

Cox 2003, McGrew and Knight 2003). The homologue of RecA is the Rad51 protein in

eukaryotes and RadA in the archaea. In a typical reaction involving homologous

recombination, RecBCD binds to the dsDNA break and causes the unwinding and

degradation of the DNA. The degradation continues until RecBCD encounters a χ (chi)

sequence (5’-GCTGGTGG-3’), after which only the 5’-ended strand is degraded,

leading to the formation of a 3’-ended ssDNA tail. RecBCD then loads RecA onto the

ssDNA overhang produced at the dsDNA break. The ssDNA binding protein (SSB) prevents the formation of secondary structure in the ssDNA and thus assists the loading

of RecA onto the ssDNA. The RecA-ssDNA filament searches for an intact homologous

dsDNA mainly by random collisions between nucleoprotein filament and the intact

dsDNA molecules. Once the homologous dsDNA is located, the ssDNA pairs with the

complementary strand of the duplex DNA. This is followed by strand invasion and joint

molecule formation between the two DNA molecules. The reaction is then continued by a

repertoire of enzymes including topoisomerases, DNA polymerases, ligases etc., which

help in the synthesis of a new DNA molecule using the 3’-end as a primer and the intact

DNA strand as the template (Figure 1.1). At the end of the reaction, another set of

enzymes assists in the resolution of the cross-over structures leading to the release of two

intact dsDNA molecules (Kowalczykowski et al., 1994).
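
The role of the χ sequence in the pathway above can be illustrated with a short search routine. The following is a minimal sketch, not part of the original work: the function names are hypothetical, and it only reports where the octamer 5’-GCTGGTGG-3’ occurs on either strand of a DNA sequence given as its top strand. It makes no attempt to model the orientation dependence of χ recognition by RecBCD.

```python
from __future__ import annotations

CHI = "GCTGGTGG"  # chi sequence, written 5'->3' as in the text

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_chi_sites(top_strand: str) -> list[tuple[int, str]]:
    """Locate chi octamers on both strands of a duplex.

    Returns (position, strand) pairs, where position is the 0-based index on
    the top strand of the leftmost base of the match, and strand is "+" if
    chi reads 5'->3' on the top strand or "-" if it reads 5'->3' on the
    bottom strand.
    """
    seq = top_strand.upper()
    hits = []
    for motif, strand in ((CHI, "+"), (reverse_complement(CHI), "-")):
        start = seq.find(motif)
        while start != -1:
            hits.append((start, strand))
            start = seq.find(motif, start + 1)  # allow overlapping matches
    return sorted(hits)


if __name__ == "__main__":
    print(find_chi_sites("AAGCTGGTGGTT"))
```

A site reported on the "-" strand corresponds to the top-strand sequence 5’-CCACCAGC-3’, the reverse complement of χ.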

Under in vitro conditions, RecA can form either an “active” or “inactive”

nucleoprotein filament depending on the type of adenosine nucleotide that is

present (Figure 1.2). These two distinct forms of filaments were identified by electron microscopic studies involving RecA, DNA and different ATP analogues. In the presence of ATP or an ATP analogue, RecA binds to ss or dsDNA and forms a right-handed, helical nucleoprotein filament, within which the DNA is bound in a highly extended

conformation (Stasiak et al. 1981). This extended RecA-DNA filament is considered to be the active form and has a pitch of about 95 Å and an axial rise per nucleotide of 5.1 Å

(Egelman and Stasiak, 1986, and Egelman and Stasiak 1993). The hydrolysis of ATP is not required for RecA to bind to DNA, as the binding can occur in the presence of non-hydrolysable ATP analogues. RecA can also bind to DNA in the absence of ATP or ATP analogues, forming “collapsed” filaments with a pitch of 64 Å and an axial rise of 2.1

Å. In the presence of ADP, RecA also forms a compressed filament. This compressed filament, with a pitch of 68-80 Å, is referred to as the “inactive” form of RecA (Egelman and Stasiak, 1986, and Egelman and Stasiak 1993).

Each RecA monomer binds to three nucleotides of ssDNA or three base pairs of dsDNA (Takahashi et al., 1987) and there are approximately six RecA monomers per turn of the helical filament (DiCapua et al. 1982, Egelman and Stasiak 1986, Takahashi et al. 1989). The DNA is extended by 50-60% relative to its conformation in the B-form

DNA (Egelman, 2001). It is believed that this extended DNA conformation will facilitate the homology search and base pair switching between the two DNA molecules during the recombination process.
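
The filament parameters quoted above are mutually consistent, which a short calculation makes explicit. The sketch below is illustrative only: the function name is hypothetical, and the B-form DNA rise of 3.4 Å per base pair is a standard textbook value assumed here rather than taken from the text. From the pitch (95 Å) and axial rise per nucleotide (5.1 Å) of the active filament, it derives the number of nucleotides per turn, the number of RecA monomers per turn (at three nucleotides per monomer), and the DNA extension relative to B-form DNA.

```python
B_DNA_RISE = 3.4  # Å per base pair in B-form DNA (textbook value; an assumption, not from the text)


def filament_geometry(pitch, rise_per_nt, nt_per_monomer=3):
    """Derive per-turn quantities for a helical RecA-DNA filament.

    pitch and rise_per_nt are in Å. Returns a tuple of
    (nucleotides per turn, monomers per turn, fractional DNA extension).
    """
    nt_per_turn = pitch / rise_per_nt
    monomers_per_turn = nt_per_turn / nt_per_monomer
    extension = rise_per_nt / B_DNA_RISE - 1.0
    return nt_per_turn, monomers_per_turn, extension


if __name__ == "__main__":
    # Active (extended) filament: pitch ~95 Å, axial rise 5.1 Å per nucleotide.
    nt_per_turn, monomers_per_turn, extension = filament_geometry(95.0, 5.1)
    print(f"{nt_per_turn:.1f} nt per turn, "
          f"{monomers_per_turn:.1f} monomers per turn, "
          f"{extension:.0%} extension relative to B-DNA")
```

Under these assumptions the calculation gives about 18.6 nucleotides and 6.2 monomers per turn and an extension of about 50%, in line with the approximately six monomers per turn and the 50-60% extension cited above.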

1.2.2 Crystal structure of RecA

The crystal structure of Escherichia coli (Ec) RecA was solved by Story et al.

(Story et al., 1992) more than a decade ago. The structure represented a compressed

RecA filament with a pitch of 83 Å. The protein is organized into three domains: an

N-terminal domain (NTD, residues 6-33), a core domain (residues 34-269), and a

C-terminal domain (CTD, residues 270-328) (Figure 1.3). Residues 1-5 and 329-352 and the putative DNA binding loops were disordered in the crystal structure. The NTD is composed of an α-helix and a β-strand, which pack against the core domain of a

neighboring monomer in the filament. The core region is organized into a β-sheet flanked

by α-helices on either side. The ATP binding site (Walker A motif or P loop, residues 66-

73), the Mg2+ binding site (Walker B motif, residues 139-144), and the two putative DNA binding loops, L1 and L2, are located in the core domain. Based on the information from earlier biochemical studies and the crystal structure, the positions for the two DNA binding loops were assigned by Story et al. The loop L1, residues 157 – 164, is the

secondary DNA binding site and it is believed that L1 binds to dsDNA. The loop L2, residues 195-209, is the primary DNA binding loop and it is believed that L2 binds to

ssDNA. In the crystal structure, these two loops were disordered, but face the interior of

the filament where the DNA is known to bind. The P-loop is facing the interior of the

filament. The earlier electron microscopic study of RecA had shown that DNA lies within

the filament interior along the central axis (Egelman and Yu, 1989).
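
The residue ranges given in this section can be collected into a small lookup table, which is convenient when mapping a mutation or a contact onto the Ec RecA structure. This is a sketch for illustration only: the dictionary and function names are hypothetical, and the ranges are simply those stated above for the Story et al. structure.

```python
# Residue ranges for E. coli RecA as stated in the text (Story et al., 1992).
DOMAINS = {
    "N-terminal domain (NTD)": (6, 33),
    "core domain": (34, 269),
    "C-terminal domain (CTD)": (270, 328),
}
FEATURES = {
    "Walker A motif (P-loop)": (66, 73),
    "Walker B motif": (139, 144),
    "loop L1": (157, 164),
    "loop L2": (195, 209),
}
# Regions reported as disordered in the crystal structure:
# both termini and the two putative DNA binding loops.
DISORDERED = [(1, 5), (329, 352), FEATURES["loop L1"], FEATURES["loop L2"]]

SEQUENCE_LENGTH = 352  # last residue number mentioned in the text


def annotate(residue):
    """Return the domain, sequence features, and order state of a residue."""
    if not 1 <= residue <= SEQUENCE_LENGTH:
        raise ValueError(f"residue {residue} is outside 1-{SEQUENCE_LENGTH}")
    domain = next(
        (name for name, (lo, hi) in DOMAINS.items() if lo <= residue <= hi),
        None,  # termini that were not modeled in the crystal structure
    )
    features = [name for name, (lo, hi) in FEATURES.items() if lo <= residue <= hi]
    ordered = not any(lo <= residue <= hi for lo, hi in DISORDERED)
    return {"domain": domain, "features": features, "ordered": ordered}


if __name__ == "__main__":
    for res in (20, 70, 200, 340):
        print(res, annotate(res))
```

For example, residue 70 falls in the core domain within the P-loop, whereas residue 200 lies in loop L2 and is therefore flagged as disordered.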

The CTD protrudes to the outer surface of the filament and is composed of a

three-stranded β-sheet and three α-helices. The CTD of RecA (residues 270-352) is thought to

play a significant role in regulating the binding of the dsDNA to the RecA-ssDNA

filament. The last few residues of Ec RecA (329-352) were disordered in the Ec RecA

crystal structure and this region has a predominance of negatively charged residues.

These negatively charged residues have been suggested to form salt bridges with various

basic residues of the protein and form a closed structure that doesn’t allow the dsDNA to

enter the RecA-ssDNA filament. In the presence of 6-10 mM Mg2+, these salt bridges are

broken, thus allowing the entry of dsDNA into the nucleoprotein filament (Lusetti et al.

2003). An NMR study with a C-terminal fragment of RecA (residues 268-330) has

shown that there is a patch of residues in the CTD, including Trp-290 and Gly-301 to Asn-304,

which binds to dsDNA prior to the entry of the dsDNA into the RecA-ssDNA

filament (Aihara et al., 1997). The emerging idea is that the CTD acts as a “gateway” for the entry of dsDNA into the RecA-ssDNA filament.

The crystal structures of RecA from Mycobacterium tuberculosis (Mt) (Datta et

al., 2000, 2003a, b) and Deinococcus radiodurans (Dr) (Rajan and Bell, 2004) and also

different crystal forms of Ec RecA (Xing and Bell, 2004) have been determined. The

loops L1 and L2 are visible in the different structures of Mt RecA: L1 in the

MtRecA-(Mg2+-ATPγS) structure and L2 in the MtRecA-ATPγS structure. However, all of these structures represent compressed RecA filaments. A crystal structure of an extended RecA-DNA complex is crucial for understanding the mechanism of homologous recombination.

Crystal structures of the RecA homologs Rad51 and RadA in an extended conformation are now available. These extended structures do not have DNA bound to the protein, however, so the details of the protein-DNA interaction are missing.

Nevertheless, these structures show significant conformational differences compared to the compressed RecA filament.

1.2.3 Extended conformation of RadA and Rad51 proteins.

The RadA and Rad51 crystal structures in an extended conformation were solved

almost at the same time (Wu et al., 2004, Conway et al., 2004). Rad51 from

Saccharomyces cerevisiae crystallized as an active extended filament with a pitch of 130

Å (Conway et al., 2004). Even though the crystals were grown in the presence of ssDNA

and ATPγS, DNA and the nucleotide cofactor are missing from the structure. An

intriguing feature of this structure is that there are two different types of interaction at the ATP binding site in adjacent monomers, making the true symmetry of the filament P3₁ instead of the P6₁ symmetry observed in RecA filaments. In the first type, there is direct

contact between His-352 present at the ATP binding site of one protomer with the P-loop

of the adjacent protomer. In the second type, the His-352 of one protomer is occluded

from the ATP-binding pocket of the neighboring one by Phe-187 from the neighboring

subunit. The structure of RadA from Methanococcus voltae was also solved in an

extended conformation with a pitch of 106 Å (Wu et al., 2004). In this study, the authors

found that PEG can substitute for DNA and make the RecA filament extended even in the

absence of DNA or ATP analogue. The RadA structure was solved in the presence of 6%

PEG 3350. The loop 1 residues were ordered, while the loop 2 residues were disordered

in the structure. As in the Rad51 structure, the ATP binding pocket is located at the

subunit interface, as compared to its inward facing position in the Ec RecA crystal

structure. Each monomer has two half sites for binding ATP cofactor. Thus, comparing

the Ec RecA and the RadA and Rad51 structures it is clear that there is substantial

reorientation of the ATP binding pocket in the active and inactive forms of RecA

filament. The orientation of the ATP binding site in the extended conformation also sheds

light on the cooperativity of RecA in DNA binding (explained later).

1.2.4 Coprotease activity of RecA

In addition to the recombination reaction, RecA promotes a side reaction in which

RecA acts as a “coprotease”. The RecA-ssDNA-ATP complex binds to and activates the self-cleavage of LexA and related phage repressors such as cI (repressor protein) from bacteriophage λ (Figure 1.4) (Craig and Roberts 1980, Craig and Roberts 1981, Little

1984). LexA protein regulates the expression of SOS genes, a family of genes that are responsible for turning on the cellular SOS response following DNA damage. Similarly,

the λ repressor protein represses the expression of genes related to the lytic cycle in the λ

phage. When RecA-ssDNA filaments are formed in a cell in response to DNA

damage, LexA or the λ repressor protein binds to the active RecA filament. RecA is

thought to stabilize a conformation of LexA or λ repressor that can undergo self-

cleavage. The repressor proteins are thought to bind to a notch between two adjacent

RecA monomers in the filament (Story et al., 1992).

1.2.5 ATPase activity of RecA

1.2.5.1 Models for the role of ATPase activity of RecA

RecA is a DNA-dependent ATPase. The kcat for ATP hydrolysis by RecA is

around 30 min⁻¹ on ssDNA and 16-20 min⁻¹ on dsDNA at 37°C (Cox, 2003, and Roca and Cox, 1997). The hydrolysis of ATP is not required for the binding of RecA to DNA.

RecA can bind to DNA to form filaments and also carry out strand exchange reactions in the presence of non-hydrolysable ATP analogs (Kowalczykowski and Krupp, 1995, and

Menetski et al., 1990). However, ATP hydrolysis by RecA is essential for filament disassembly (Lindsley and Cox, 1990, and Shivashankar et al., 1999). The exact role of

ATP hydrolysis in the RecA mediated strand exchange reaction is still being elucidated.

There are two prominent hypotheses on the requirement of RecA ATPase activity for homologous recombination. They are i) the “RecA redistribution model”, and ii) the

“RecA motor: a facilitated DNA rotation model” (Cox, 2003). According to the “RecA

redistribution model”, the hydrolysis of ATP is required only for the release of monomers

from the 5’ end of the RecA-DNA filament that has finished the strand exchange

reaction. This is a mechanism for recycling of RecA monomers (Bianco et al., 1998). This model is supported by the observation that under in vitro reaction conditions, RecA

can form heteroduplex DNA (duplex DNA containing the ssDNA and one of the strands

from the homologous intact dsDNA) of 2.4 kb in less than 2.5 min in the presence of non-hydrolysable ATP analogs (Menetski et al., 1990). Another observation was that in the

presence of the non-hydrolysable ATPγS, the filament dissociation is greatly reduced

(Lindsley and Cox, 1990). ATPγS is a competitive inhibitor of ATP hydrolysis and even the presence of 10% of ATPγS can stabilize the RecA-DNA filament substantially

(Weinstock et al., 1981b). Interestingly, ATP hydrolysis occurs uniformly throughout the filament, but dissociation of monomers from the filament occurs predominantly at the

5’ end (Kowalczykowski and Krupp, 1987, Lindsley and Cox, 1990).

The second model, the “RecA motor: a facilitated DNA rotation model”, assigns distinct roles for the ATPase activity of RecA in the strand exchange reaction. According to this model, RecA has a motor function which is coupled to ATP hydrolysis. The

RecA-motor activity helps in bringing trapped DNA segments, which are outside the primary pairing event, into the nucleoprotein filament (Cox, 2003). When the strand exchange reaction involves longer DNA fragments, it is possible to have secondary pairing between the RecA-ssDNA filament and the complementary dsDNA (Figure 1.5).

When this occurs, a part of the dsDNA will be left outside the nucleoprotein filament.

The motor activity of RecA will help in the rotation of this external DNA segment, thus facilitating its entry into the RecA-DNA filament. The motor activity of RecA helps to maintain the directionality (unidirectional, in a 5’ to 3’ manner) of the strand exchange reaction (Cox and Lehman, 1981), helps RecA to overcome structural barriers present in the DNA, such as pyrimidine dimers and heterologous insertions (Das Gupta and

Radding, 1982, Livneh and Lehman, 1982), and helps to promote strand exchange

reactions involving four strands (Kim et al., 1992, Shan et al., 1996). Interestingly,

Rad51, the eukaryotic homolog of RecA, which has much lower ATPase activity compared to Ec RecA, has no directionality in in vitro strand exchange reactions, cannot

overcome structural barriers in DNA, and cannot promote reactions involving four DNA

strands (Namsaraev and Berg, 2000, Sung and Robberson, 1995).

1.2.5.2 Waves of ATP hydrolysis

As mentioned before, RecA-DNA forms a helical filament and each turn

of the filament is comprised of six RecA monomers. Each RecA monomer can bind to exactly three bases of ssDNA or three base pairs of dsDNA. Based on kinetics of ATP hydrolysis and filament dissociation, a facilitated rotation model for coupling waves of

ATPase to rotation of dsDNA into the filament during strand exchange has been developed. According to this model, when the dsDNA is rotated into the RecA-ssDNA filament (Figure 1.5), it is bound to every sixth protein monomer of the filament. ATP hydrolysis occurs simultaneously in these six monomers and as a result, the DNA is handed over to the next set of six RecA monomers present in the filament. The ATP hydrolysis occurs in waves which move along the RecA-DNA filament and a particular wave passes through every sixth monomer (Figure 1.6) (Cox et al., 2005). This model is supported by various observations, as described below. A study that examined the temperature dependence of branch migration (Arrhenius activation energy of 13.3 ± 1.1 kcal mol⁻¹) and ATP hydrolysis (Arrhenius activation energy of 14.4 ± 1.4 kcal mol⁻¹)

indicated that the two processes are coupled (Bedale and Cox, 1996). Another interesting

finding from this study is that the rate of branch migration (380 ± 20 bp min⁻¹) is related to the kcat for ATPase (20 min⁻¹ for dsDNA) by a factor of 18. This corresponds to the

number of bases within one turn of the RecA-DNA filament (six monomers bound to 18

nt of ssDNA or 18 bp of dsDNA). This correlation was proposed to be due to a facilitated

rotation of the DNA within the nucleoprotein filament, which will rotate DNA one

complete turn around the filament by the hydrolysis of ATP in six successive steps

involving each of the six monomers in one turn of the helix. As the ATP hydrolysis

proceeds, the RecA dissociation occurs at the 5’ end of the filament. It was documented

that the rate of dissociation of RecA from the 5’ end of the filament (koff: 123 ± 16

monomers per minute per filament end) is related to the kcat (20 min⁻¹ for dsDNA) by a

factor of 6 (Cox et al., 2005). This study proposed that waves of ATP hydrolysis move

unidirectionally from the 5’ to 3’ end of the filament and successive waves originate

every sixth monomer.
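
The two ratios quoted in this section can be checked with a few lines of arithmetic. The following is only an illustrative sketch: the numerical values are those cited in the text, while the variable names are assumptions introduced here and do not come from the cited studies.

```python
# Rates quoted in the text (Bedale and Cox, 1996; Cox et al., 2005).
BRANCH_MIGRATION_BP_PER_MIN = 380.0   # rate of branch migration
KCAT_DSDNA_PER_MIN = 20.0             # kcat for ATP hydrolysis on dsDNA
KOFF_MONOMERS_PER_MIN = 123.0         # RecA dissociation from the 5' end

MONOMERS_PER_TURN = 6                 # RecA monomers per helical turn
BASES_PER_MONOMER = 3                 # nt (or bp) bound per monomer

# Bases (or base pairs) covered by one helical turn of the filament.
bases_per_turn = MONOMERS_PER_TURN * BASES_PER_MONOMER

# Base pairs migrated, and monomers released, per catalytic cycle of a monomer.
bp_per_atp_cycle = BRANCH_MIGRATION_BP_PER_MIN / KCAT_DSDNA_PER_MIN
monomers_off_per_atp_cycle = KOFF_MONOMERS_PER_MIN / KCAT_DSDNA_PER_MIN

print(f"bases per turn: {bases_per_turn}")
print(f"branch migration / kcat: {bp_per_atp_cycle:.1f}")
print(f"koff / kcat: {monomers_off_per_atp_cycle:.2f}")
```

With the central values, the ratios come out at about 19 and 6.2. Given the quoted experimental uncertainties, these are close to the factors of 18 and 6 discussed above.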

1.2.6 Nucleation and Cooperativity in RecA-DNA binding

The rate-limiting step in the formation of RecA nucleoprotein filament is

nucleation. RecA can readily bind to ssDNA, but binding to dsDNA is very slow. This is

thought to arise from the difficulty in unwinding duplex DNA compared to ssDNA. In

both ss and dsDNA, binding of RecA causes extension of the DNA, such that axial rise

per nucleotide of the DNA is 5.1 Å instead of the 3.4 Å in normal B-form DNA. It is

interesting to note that RecA can bind faster to Z-form DNA than B-form DNA, which

may be due to the difference in conformation of Z-DNA (Blaho and Wells, 1987). RecA

binds to the sugar phosphate backbone of the DNA, with the bases oriented perpendicular to the filament axis. The rate of assembly of RecA is 30-40 monomers sec⁻¹ on ssDNA

and 20 sec⁻¹ on gapped duplex DNA. The rate of disassembly is 3 monomers sec⁻¹ at the 5’ end and this slower dissociation leads to a net filament growth in the 5’ to 3’ direction

(Roca and Cox, 1990). The rate of assembly at the 3’ end is at least one order of

magnitude higher than the rate of disassembly of RecA at the 5’ end.
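
The numbers in this paragraph can be combined in a short calculation. The sketch below is a simplified illustration, not part of the cited work. It assumes that net growth is simply assembly at the 3’ end minus disassembly at the 5’ end, and it uses the ssDNA assembly rates. All variable names are introduced here for the example.

```python
# Values quoted in the text; names are illustrative assumptions.
RISE_RECA_BOUND_A = 5.1    # axial rise per nucleotide in RecA-bound DNA (angstrom)
RISE_B_FORM_A = 3.4        # axial rise per base pair in B-form DNA (angstrom)
ASSEMBLY_SSDNA_PER_S = (30.0, 40.0)   # monomers added per second on ssDNA
DISASSEMBLY_5P_PER_S = 3.0            # monomers lost per second at the 5' end

# Fold extension of the DNA relative to B-form.
extension = RISE_RECA_BOUND_A / RISE_B_FORM_A

# Net growth: monomers gained at the 3' end minus those lost at the 5' end.
net_growth = tuple(rate - DISASSEMBLY_5P_PER_S for rate in ASSEMBLY_SSDNA_PER_S)

# Ratio of assembly to disassembly (the text states at least one order of magnitude).
fold_excess = tuple(rate / DISASSEMBLY_5P_PER_S for rate in ASSEMBLY_SSDNA_PER_S)

print(f"extension: {extension:.2f}-fold")
print(f"net growth: {net_growth[0]:.0f}-{net_growth[1]:.0f} monomers per second")
print(f"assembly/disassembly: {fold_excess[0]:.0f}- to {fold_excess[1]:.0f}-fold")
```

Under these assumptions, the DNA is extended about 1.5-fold relative to B-form, and assembly outpaces disassembly by a factor of ten or more, consistent with net growth toward the 3’ end.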

The binding of RecA to DNA is highly cooperative. The Hill coefficient of

binding of RecA to both ss and dsDNA is around 3 (Lee and Cox, 1990, Weinstock et al.,

1981c). The Hill coefficient is a measure of the degree of cooperativity in a binding

reaction, with a value greater than 1 indicating positive cooperativity. When there is cooperativity in a binding event, the affinity of the second

monomer to the substrate is greater than that of the first monomer to the substrate. This is

because the monomer-monomer interactions facilitate the binding of the second monomer

to the substrate. Since the RecA filament is formed by the interaction between monomers,

the conformational state of a particular monomer may affect that of its neighbor and thus propagate throughout the filament. The crystal structures of Rad51 and RadA show that there is close interaction between the ATP binding site of one monomer and that of the

adjacent one. The reduction in filament dissociation in the presence of ATPγS is further

evidence for the existence of cooperativity between RecA monomers. The blocking of ATP

hydrolysis in one of the monomers by ATPγS prevents the dissociation of the neighboring subunits, even if they are bound to ATP, because of the interactions existing between the

monomers in the filament.
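
To illustrate what a Hill coefficient of about 3 implies, the standard Hill equation can be evaluated for a non-cooperative case (n = 1) and a cooperative case (n = 3). This is a minimal sketch: the half-saturation constant is a hypothetical value in arbitrary units, and the function name is introduced here only for the example.

```python
def hill_fraction_bound(conc, k_half, n):
    """Fraction of sites occupied according to the Hill equation.

    conc   -- free protein concentration (any unit)
    k_half -- concentration giving half-maximal binding (same unit)
    n      -- Hill coefficient (n = 1 means no cooperativity)
    """
    return conc**n / (k_half**n + conc**n)

K_HALF = 1.0  # hypothetical half-saturation concentration, arbitrary units

for conc in (0.5, 1.0, 2.0):
    non_coop = hill_fraction_bound(conc, K_HALF, 1)
    coop = hill_fraction_bound(conc, K_HALF, 3)
    print(f"[RecA] = {conc:.1f}: n=1 -> {non_coop:.3f}, n=3 -> {coop:.3f}")
```

With n = 3, binding is lower below the half-saturation concentration and higher above it than with n = 1. The transition from unbound to bound is therefore much sharper, which is the signature of cooperative binding.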

1.2.7 Stoichiometry of RecA-DNA binding

The stoichiometry of RecA-DNA binding is one RecA monomer to three bases of

ssDNA or three base pairs of dsDNA (Takahashi et al., 1987). This means that there are

three nucleotide binding pockets within a RecA monomer. However, there is no evident

repetition in the sequence or structure of the monomer. This means that each of the three nucleotide binding pockets is formed by different amino acids and

there are inherent differences between these three separate sites (Figure 1.7). This will

lead to a preferential positioning or phasing of RecA monomers on repetitive DNA

sequences (Volodin et al., 1997). The chi sequence (5’-GCTGGTGG-3’), which is a

recombination hot spot, enhances the phasing of RecA monomers on the DNA (Volodin

and Camerini-Otero, 2002). The removal of G or T bases in the chi sequence or a change

in their positions reduced the phasing of RecA monomers (Volodin et al., 2003). This

shows that each nucleotide binding pocket prefers a particular base and that there is likely

to be a particular combination of three bases to which a RecA monomer binds

preferentially.
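
As a simple illustration of the three-nucleotide register discussed here, the sketch below locates chi sequences in a DNA string and reports which of the three possible triplet frames each site starts in. It is only bookkeeping, not a model of RecA binding preferences. The test sequence is made up for the example and the function name is an assumption introduced here.

```python
CHI = "GCTGGTGG"  # chi sequence as quoted in the text (5' to 3')

def find_chi_sites(seq, motif=CHI):
    """Return (start_index, frame) for every occurrence of the motif.

    start_index is 0-based; frame is start_index % 3, i.e. which of the
    three possible triplet registers the site begins in.
    """
    seq = seq.upper()
    sites = []
    start = seq.find(motif)
    while start != -1:
        sites.append((start, start % 3))
        start = seq.find(motif, start + 1)
    return sites

# Hypothetical test sequence, made up for this example.
example = "AAGCTGGTGGTTTTGCTGGTGGA"
print(find_chi_sites(example))
```

In this made-up sequence the two chi sites are 12 nucleotides apart, a multiple of three, so both fall in the same triplet frame.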

1.2.8 Regulation of E. coli RecA in vivo

The expression of RecA is tightly regulated in bacterial cells. Under normal

growth conditions (in the absence of DNA damage), there is a basal level of expression of

RecA in the cells, less than 10,000 monomers per cell. Following DNA damage, the

expression is increased at least ten fold (Cox, 2003). RecA expression and activity are

controlled in at least three ways: i) RecA gene expression is repressed by the LexA

protein, ii) autoregulation by RecA itself, and iii) interactions of RecA with a number

of other proteins. LexA protein binds to the consensus sequence in the promoter region of

recA and prevents the expression of RecA protein. Following DNA damage, LexA

undergoes self-cleavage (coprotease activity of RecA) and turns on the expression of

RecA (Fernandez de Henestrosa et al., 2000). The C-terminal tail of RecA forms a closed structure preventing the binding to DNA, especially dsDNA. This regulation is removed

in vitro in the presence of 6-10 mM MgCl2, which helps in the formation of an open structure of RecA capable of promoting the strand exchange reaction (Lusetti et al.,

2003).

There is a repertoire of proteins that bind to RecA and regulate its activity (Table

1.1). The SSB (ssDNA-binding) protein binds to ssDNA and thereby prevents formation of secondary structures in it. This enhances RecA binding to ssDNA (Kowalczykowski and Krupp, 1987). SSB can also compete with RecA for binding to ssDNA. This is especially true when SSB is added to ssDNA prior to RecA under in vitro conditions.

The RecBCD complex processes dsDNA breaks to form ssDNA overhangs and directly assists in the loading of RecA onto the ssDNA overhang (Kowalczykowski, 2000). RecF, RecO and RecR (RecFOR) promote the binding of RecA to ssDNA by relieving SSB inhibition through direct interactions between RecO and SSB (Umezu and Kolodner, 1994). The activity of RecA is inhibited by the RecX protein through a filament capping mechanism, in which RecX binds to the end of the RecA filament and prevents further growth of the filament (Drees et al., 2004). The DinI protein is antagonistic to RecX in that DinI enhances the stability of the

RecA filament; when DinI is present in excess relative to RecA, RecA filament dissociation is inhibited (Lusetti et al., 2004). There are several other proteins, including PsiB, RdgC, and UvrD, that are involved in regulating RecA function, but their exact modes of regulation of RecA are not yet established (Cox, 2007).

1.2.9 Proposed Mechanism for ATP hydrolysis by RecA

Based on the crystal structure of RecA, the information from biochemical studies and comparison with structures of other NTPases, a mechanism for ATP hydrolysis has been proposed for RecA (Figure 1.8) (Story and Steitz, 1992). The position of ADP is known from the Ec RecA-ADP crystal structure (Story and Steitz, 1992). The position of the γ-phosphate was modeled into the above structure based on the structure of p21 bound to

GTP (Pai et al., 1990). The γ-phosphate is positioned near three residues outside the P-

loop, Glu 96, Asp 144, and Gln 194. Glu 96 acts as a general base that activates a water

molecule for attack on the γ-phosphate of ATP, thus initiating the ATP hydrolysis. The

mutation of Glu 96 to Asp produced a RecA protein that bound to DNA very tightly. It

was apparent from biochemical studies with this E96D mutant that the tight binding

phenotype is due to the failure of ATP hydrolysis which leads to very slow dissociation

rate in the E96D-DNA complex compared to the wild type RecA (Campbell and Davis,

1999b). The shorter length of the Asp side chain compared to the Glu side chain will lead

to a reduction in the amount of hydrolysis. Asp 144 interacts with Mg2+. The binding of RecA to DNA is dependent on the concentration of Mg2+ (4-10 mM). The

Mg2+ is coordinated by the β and γ phosphates of ATP, Thr 72 and a water molecule that is hydrogen bonded to Asp 144.

Biochemical studies have revealed that Gln 194 is essential for the catalytic

activity of RecA as it cannot be mutated to any other amino acid. Gln 194 is proposed to

be an “allosteric switch” for mediating the ATP induced conformational changes in RecA

(Kelley and Knight, 1997, and Hortnagel et al., 1999). The proposed reaction mechanism

for allostery is as follows. The Mg2+ ion interacts with the β and γ phosphates of ATP

such that γ phosphate is held in position to interact with Gln 194. Gln 194 conveys the

information of ATP binding to L2 and helix G, two key ssDNA-binding elements of the

protein. Upon binding to ATP, the L2 becomes activated to bind to ssDNA. This

represents the extended “active” conformation of RecA. This is followed by the

activation of the water molecule by Glu 96, which in turn attacks the γ phosphate of ATP,

leading to the hydrolysis of ATP. Once ADP is formed, the allosteric switch, residue Gln

194, cannot hold the conformations of L2 and helix G, such that the RecA filament

compresses to the “inactive” conformation. This is followed by the release of ADP and Pi

(inorganic phosphate) from the RecA monomer. Once the products are released, RecA can bind to another molecule of ATP and start a new cycle of the reaction.
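
The order of events in the proposed cycle can be summarized as a small state machine. This is a schematic sketch only: the state names and the division into four discrete states are simplifications introduced here for illustration and are not part of the proposed mechanism itself.

```python
from enum import Enum

class RecAState(Enum):
    APO = "no nucleotide bound"
    ATP_BOUND = "ATP bound: L2 and helix G activated, extended 'active' filament"
    HYDROLYSIS = "Glu 96 activates water, which attacks the gamma-phosphate"
    ADP_BOUND = "ADP + Pi bound: filament compresses to the 'inactive' form"

# Order of steps as described in the text; the cycle then repeats.
CYCLE = [RecAState.APO, RecAState.ATP_BOUND, RecAState.HYDROLYSIS, RecAState.ADP_BOUND]

def next_state(state):
    """Return the state that follows `state` in the proposed cycle."""
    idx = CYCLE.index(state)
    return CYCLE[(idx + 1) % len(CYCLE)]

state = RecAState.APO
for _ in range(len(CYCLE)):
    print(f"{state.name}: {state.value}")
    state = next_state(state)
```

Release of ADP and Pi corresponds to the step from the ADP-bound state back to the nucleotide-free state, after which a new ATP molecule can bind.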

Another important residue to note in E. coli RecA is Phe 217. Phe 217 is proposed

to be involved in transferring the ATP mediated conformational change to the

neighboring subunit. The F217Y mutation increased the cooperativity of DNA binding and thereby enhanced filament formation. Phe 217 inserts into a pocket in a neighboring

subunit and thus transmits the allosteric information throughout the filament (Kelley et

al., 2001). The authors propose that the interaction of Phe 217 with the neighboring

subunit is not dependent on the binding to DNA and hence they expect this interaction to

be present both in active and inactive filaments of RecA. Interestingly, the Rad51

structure in the extended conformation showed two types of subunit interfaces; one

interface where the residue equivalent to Phe 217 of RecA, His 352, contacts the P-loop

of the neighboring monomer, whereas in the second type of interface, His 352 is occluded

from the P-loop by Phe 187 of the adjacent monomer (Conway et al., 2004). Based on

these observations, it is conceivable that Phe 217 of Ec RecA also has different

conformations within consecutive subunits of the filament.

1.2.10 Differences between eukaryotic Rad51 and bacterial RecA proteins

There are important differences in the properties of RecA and Rad51 proteins.

RecA can form active filaments only in the presence of an ATP analogue, whereas Rad51

can form active filaments in the absence of ATP. Surface Plasmon Resonance (SPR) studies comparing the DNA-binding properties of Ec RecA and Rad51 from humans (hu) revealed that hu Rad51 activity is independent of ATP, whereas Ec RecA binds to DNA

only in the presence of an ATP analogue (Kelley De Zutter and Knight, 1999). The kcat for the ATPase activity of Rad51 is at least 30 times lower than that of Ec RecA (Sung,

1994). There is an N-terminal extension of about 100 amino acids in Rad51 protein compared to RecA protein. This extended NTD of Rad51 has DNA binding properties and is also involved in protein-protein interactions (Aihara et al., 1999, Krejci et al.,

2001). It is clear now that the NTD of Rad51 corresponds to the CTD of RecA (Aihara et al., 1999, Conway et al., 2004). As mentioned above, Rad51 has no directionality in the filament formation, cannot bypass structural barriers encountered in the DNA, and cannot promote four-strand exchange reactions. An efficient strand exchange reaction mediated by Rad51 involves a lot of protein-protein interactions (Table 1.2). Rad51-Rad52 interaction is required for loading of Rad51 onto the ssDNA overhang on the dsDNA.

Rad51 cannot displace RPA (equivalent to the SSB protein of E. coli) without the help of

Rad52. The Rad55/57 proteins increase the ssDNA affinity of Rad51 and promote the cooperativity of Rad51 in binding to DNA. The Rad51-Rad54 interaction helps in unwinding the dsDNA and subsequent pairing of ss and dsDNA (Symington, 2002). All these processes are carried out by RecA without the involvement of any other protein. Even though the overall folds of RecA and Rad51 are similar, it is apparent that there are interesting differences in the mechanisms by which each of them promotes the strand exchange reaction. RecA appears to be highly efficient in that it can carry out the strand exchange reaction, with extra features not present in Rad51, with minimal interactions with other proteins.

1.2.11 Relevant new information on RecA

A recent study has found an unrelated function for E. coli RecA protein. Gomez-

Gomez and coworkers found that RecA is required for the swarming motion of E. coli

cells. Swarming is a collective movement involving various cells on semi-solid surfaces.

The absence of RecA leads to the formation of a flagellum with defective propulsion

function (Gomez-Gomez et al., 2007). This finding is interesting because of the fact that

it relates RecA directly to a motor function. DNA damage and nucleoprotein filament

formation are not required for the swarming motion. This study gives RecA two entirely

different functions: its role in DNA repair and its role in cell motility.

The direct visualization of RecA-DNA assembly at the single molecule level has

been carried out by fluorescence studies. The study of the assembly of fluorescently

labeled fully active RecA on dsDNA revealed that nucleation is the rate limiting step in

filament formation, while filament extension is a faster process that can occur

bidirectionally on the DNA. There are multiple sites of nucleation on the DNA, and faster

nucleation occurred in the presence of ATPγS than with ATP. Binding of a minimum of

4-5 RecA-ATP monomers is required to form a stable nucleus (Galletto et al., 2006).

Single-molecule FRET (Fluorescence Resonance Energy Transfer) analysis was carried

out to study the dynamics of RecA-DNA assembly on ssDNA (Joo et al., 2006). This

study revealed that a minimum number of five monomers and ssDNA length of 17 bases are sufficient for nucleoprotein filament formation. Another important observation from

this study which was contradictory to the existing knowledge is that RecA filament grows

and shrinks on both the 5’ and 3’ ends. But a higher rate of assembly at the 3’ end leads

to a net filament formation in the 3’ direction.

An archaeal RadA from Sulfolobus solfataricus (Sso) has been crystallized in a left-handed helical filament (Chen et al., 2007). All of the other RecA, Rad51 and RadA structures so far determined form a right-handed helical filament. In this work, the authors

claim that the left-handed helical filament is an intermediate in the strand exchange reaction and that RecA-like proteins have motor function. A hinge region located

immediately after the helix α5 is designated as the “subunit rotation motif (SRM)”. The

superposition of various RecA-like proteins in closed ring, right-handed filament,

overwound right-handed filament, and left-handed filament forms revealed sequential

conformational changes in the hinge region, suggesting the existence of a subunit rotation

during the strand exchange reaction. They compare this rotary motion of RecA-like

proteins to that of flagellin, another helical protein filament present in flagellum, and

propose that subunit rotation may be required for the assembly of protein helical

filaments. Thus more and more studies are revealing the existence of a RecA motor

activity.

Calorimetric analysis of RecA-DNA binding provided insights into the

mechanisms of homology recognition by RecA-ssDNA (Takahashi et al., 2007). A large

enthalpy change (ΔH) was observed for the binding of RecA to the first DNA molecule,

whereas the ΔH for the binding of the second DNA molecule was much lower. The

authors propose that this ΔH occurs due to the organization of disordered loops and this

conformational change was confirmed by the circular dichroism (CD) spectrum. There

was no substantial change in the CD spectrum for the binding of a second DNA molecule

to the RecA-ss DNA complex. From this study, it was proposed that RecA binds to and

extends the first DNA molecule such that the bases are exposed within the RecA filament

for a faster homology search with the dsDNA. Thus, RecA acts as a scaffold for two

DNA molecules to search for homology and the pairing occurs due to the switching of

hydrogen bonds between the two DNA molecules. The protonation (0.5 proton per RecA

monomer) of RecA and ATP analogue are necessary for the required conformational

change to bind to DNA. This study also suggested that a RecA dimer may be the basic unit that binds to DNA. Along these lines, it is interesting to note that a dimeric RecA in which two RecA monomers were fused head to tail, had similar functions as monomeric

RecA both in vivo and in vitro (Forget et al., 2006).

1.2.12 Importance of a RecA-DNA structure

The RecA protein has been the subject of rigorous study since its discovery. There

remain several questions about RecA that are still not answered. The exact residues of

RecA that interact with ss and dsDNA are not known yet. The process of homology

search between the two DNA molecules, RecA’s role in base pair switching, the exact

role of ATP hydrolysis in the strand exchange process, and the details of motor activity of

RecA remain to be fully elucidated. A RecA-DNA structure is crucial for understanding

the mechanism by which RecA promotes the strand exchange reaction. Even though

many groups are working on this problem, none has yet succeeded. It is an

extremely difficult task because RecA is a very dynamic system and its interaction with

DNA is also complex. Obtaining a stable RecA-DNA complex that is compatible with

crystallization is therefore challenging. More complex techniques such as covalent

linking of protein-DNA complex, co-crystallizing with an antibody engineered to

recognize only the protein-DNA complex, developing RecA mutants that can form the

most stable protein-DNA complex, deletion mutants of RecA, especially CTD deletion mutants, exploring the possibilities of different types of DNA substrates that can be used

for crystallization, co-crystallizing with other proteins with which RecA interacts (e.g., λ

repressor protein) etc. can be attempted to obtain an active RecA filament with DNA.

It would also be interesting to find out why RecA is so powerful in promoting the strand

exchange reaction, mostly by itself, while the more developed eukaryotic proteins need

assistance from a large number of protein interactions.

1.3 Deinococcus radiodurans RecA

Deinococcus radiodurans (Dr) is a gram-positive bacterium well known for its ability

to survive extreme doses of ionizing radiation (Minton 1996, Battista 1997). It was first

isolated from canned food which was sterilized by irradiation. Though Dr is highly

resistant to a broad spectrum of DNA damaging agents, it can recover from particularly

high doses of ionizing radiation, which is known for producing dsDNA breaks. Dr can

survive high doses of radiation, 150-200 dsDNA breaks per haploid genome which can

produce up to 1000 dsDNA breaks in a typical Dr polyploid cell (Daly et al. 1994). By

contrast, E. coli can survive only a few dsDNA breaks at a time (Krasin and Hutchinson

1977). The radioresistance of Dr is largely due to its highly proficient mechanisms for

DNA repair. Dr has a typical prokaryotic repertoire of DNA repair enzymes, though nearly one third of its genes encode proteins of unknown function that are not known to exist in other organisms (White et al. 1999, Makarova et al. 2001).

Homologous recombinational repair of dsDNA breaks by RecA protein is central

to the extreme radioresistance of Dr, and several features of Dr RecA are unique. Dr recA mutants that are defective in recombination are highly sensitive to ionizing radiation

(Gutman et al. 1994, Daly and Minton 1995). Whereas Dr UvrA and DNA polymerase I can be complemented by their respective E. coli genes without loss of radioresistance

(Agostini et al. 1996, Gutman et al. 1993) this is not the case for Dr RecA (Carroll et al.

1996), indicating it has unique functional properties that are essential. The amino acid sequences of E. coli and Dr RecA are 56% identical, but Dr RecA is toxic to E. coli unless its expression is sufficiently repressed (Gutman et al. 1994). In Dr, the expression of

RecA is tightly controlled and is not detectable in undamaged cells (Carroll et al., 1996).

It is expressed in response to radiation damage suggesting that Dr RecA is toxic to Dr cells and is tolerated only during the radiation damage (Satoh et al., 2002). While RecA expression in E. coli is regulated by LexA repressor as part of the SOS response, Dr

RecA expression is instead controlled by IrrE (also called PprI), a repressor protein unique to Dr that keeps the non-induced level of RecA unusually low (Earl et al. 2002,

Hua et al. 2003).

Dr contains 4 to 10 genomic copies per cell (Hansen 1978), and this multiplicity

is thought to greatly facilitate recombinational repair (Minton and Daly 1995). The Dr

genome has an unusual toroidal shape that may also facilitate dsDNA break repair by

holding the two ends created at a break near one another (Levin-Zaidman et al. 2003). It

is thought that the multiplicity of the chromosomes and their unusual toroidal shape result

in a faster homology search and location of the complementary DNA molecule. Dr has a

RecA-independent pathway of dsDNA break repair that functions at the early stages of

recovery (Daly and Minton 1996). However, this pathway, which is thought to be some

form of single stranded annealing, accounts for only up to one third of the dsDNA breaks encountered. It has been suggested that the coprotease activity of Dr RecA may be even

more important to radioresistance than its recombinase activity (Satoh et al. 2002).

Dr RecA has been overexpressed and purified from E. coli and characterized biochemically, revealing unusual features that parallel the genetic characteristics (Kim et al. 2002). Like E. coli RecA, Dr RecA forms right-handed helical filaments on DNA, hydrolyzes ATP and dATP in a DNA-dependent fashion, and promotes similar in vitro strand exchange reactions. Importantly, however, Dr RecA promotes the strand-exchange reaction via an inverse pathway (Kim and Cox 2002). Whereas E. coli RecA and all other known recombinase enzymes, including human Rad51, polymerize on ssDNA and then incorporate dsDNA into the filament, Dr RecA first polymerizes on dsDNA. It is thought that this nucleoprotein filament (Dr RecA-dsDNA) then searches for and locates a complementary region of ssDNA (Figure 1.9). Consistent with this and also unusual, Dr does not contain genes for RecB and RecC, which along with RecD in

E. coli process a dsDNA break into a long 3'-ended ssDNA tail that is a substrate for

RecA (Anderson and Kowalczykowski 1997).

Compared to E. coli RecA, Dr RecA exhibits some interesting differences in ATP and dATP hydrolysis, and in the coupling of NTP hydrolysis to strand exchange (Kim et al., 2002). Dr RecA hydrolyzes dATP faster than ATP over a wide range of pH. Compared to E. coli RecA, Dr RecA can hydrolyze ATP only at lower pH (pH range 6-7). Strand exchange assays with Dr RecA show that strand exchange is limited in the presence of dATP, while there is more extensive strand exchange in the presence of ATP. Another observation is that there is a lag phase for Dr RecA to bind to dsDNA, the length of which was strongly dependent on pH. It was also noted that Dr RecA bound to dsDNA readily in the presence of a mixture of ds and ssDNA.

1.4 X-ray crystallography

X-ray crystallography is a powerful technique for elucidating the atomic organization of macromolecules, especially the structure of biological macromolecules such as proteins, DNA, RNA, and sugars. In this method, x-rays are allowed to pass through a single crystal of the molecule under study. The x-rays are diffracted by the different atoms present in the molecule and the diffraction pattern can be used to solve the structure of the molecule. X-rays are used for this purpose because the wavelength of x-rays (around 1 Å) is on the same order as the distance between two bonded atoms in a molecule (the usual bond length between two atoms is 1-2 Å). The electrons of each atom of the molecule scatter the x-rays incident on the molecule. Since the diffraction from a single molecule is very weak, the molecule is crystallized. In a crystal, the individual molecules are arranged in an orderly manner and this results in the amplification of the diffraction of the incident x-rays. The diffraction pattern depends on the internal organization of the molecule and also on how the molecule is arranged within the crystal. The intensity and phase of each of the spots in the diffraction pattern contain information about the molecular structure. This information is deduced from the intensity, phase, and the distribution of spots using mathematical calculations to solve the structure of the molecule (Figure 1.10).
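
The amplification produced by an ordered arrangement of scatterers can be illustrated with a toy calculation. The sketch below sums the waves scattered by identical point scatterers on a one-dimensional lattice. It is a deliberately simplified model, not a real diffraction calculation: the 50 Å repeat distance, the function name, and the choice of scattering vectors are assumptions made for this example.

```python
import cmath
import math

def scattered_intensity(n_scatterers, spacing, q):
    """Intensity from n identical point scatterers on a 1-D lattice.

    Each scatterer at position j * spacing contributes a wave with phase
    q * j * spacing; the intensity is the squared modulus of the sum.
    """
    amplitude = sum(cmath.exp(1j * q * j * spacing) for j in range(n_scatterers))
    return abs(amplitude) ** 2

SPACING = 50.0                       # hypothetical lattice repeat, in angstrom
q_bragg = 2 * math.pi / SPACING      # waves from all scatterers in phase
q_off = 1.5 * q_bragg                # neighbouring waves exactly out of phase

for n in (1, 10, 100):
    on_peak = scattered_intensity(n, SPACING, q_bragg)
    off_peak = scattered_intensity(n, SPACING, q_off)
    print(f"N = {n:3d}: on-peak {on_peak:10.1f}, off-peak {off_peak:6.1f}")
```

When all scatterers are in phase, the intensity grows as the square of the number of scatterers, whereas away from this condition the contributions largely cancel. This is why a crystal gives sharp, measurable diffraction spots while a single molecule does not.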

1.4.1 Obtaining diffraction quality crystals

The first step in solving the structure of a molecule is to obtain crystals of the molecule that diffract x-rays to high resolution. There are several methods for obtaining crystals, including i) hanging drop, ii) sitting drop, iii) dialysis or gel, and iv) microbatch. The principle behind the hanging and sitting drop methods is vapor diffusion. The protein solution is allowed to equilibrate with a solution (the reservoir solution) of a precipitating agent such as ammonium sulfate or polyethylene glycol. When the concentration of the precipitant is optimal, large single crystals form in the drop. In the hanging drop method, the protein drop hangs over the reservoir solution, while in the sitting drop method, it sits on a stage over the reservoir solution.

The dialysis method is a liquid equilibration process: the precipitant concentration to which the protein is exposed equilibrates by liquid diffusion. Here the protein mixed with reservoir solution is placed inside small dialysis buttons, which are placed in reservoir solution to reach the optimal precipitant concentration. The dialysis method is not useful for screening a large number of crystallization conditions, but it is useful for obtaining larger crystals from an already known condition. The microbatch technique is a modification of the traditional batch method of protein crystallization, in which protein is mixed with the precipitant solution at the final concentration. The batch method requires large amounts of protein and is usually a good way to obtain bigger crystals from an already known crystal growth condition. In the microbatch method, the volume of protein is considerably reduced, and the protein is mixed with a precipitant solution and covered by oil (Chayen, 1998). It is the preferred screening method for highly automated robotic crystallization screens.

1.4.2 X-ray diffraction data collection

1.4.2.1 Sources of x-ray radiation

Once good quality crystals are obtained, the next step is the measurement of the x-ray diffraction intensities. The crystals are mounted on nylon loops and flash frozen in liquid nitrogen. There are two types of x-ray sources for data collection. The traditional source is the rotating anode x-ray generator, in which electrons emitted by a hot cathode filament are accelerated by an electric field towards a charged copper plate. When these electrons hit the copper plate, electrons are ejected from the K-shell of the copper atoms. Electrons from the L-shell then fall into the lower energy K-shell, and the excess energy is emitted as x-rays that are filtered and focused onto the crystal for the x-ray diffraction experiment. The L to K transition produces an x-ray beam of 1.54 Å wavelength. The second source of x-rays is the synchrotron. Synchrotrons are becoming increasingly popular because the high intensity of the x-ray beam makes smaller crystals sufficient for data collection. In addition, with charge-coupled device (CCD) detectors, data collection is exceedingly fast. Another advantage is the tunability of the wavelength at synchrotrons. In the synchrotron, electrons are accelerated through large storage rings at close to the speed of light. When the path of these fast electrons is bent by a magnetic field, x-rays are emitted, which are focused and used for diffraction from crystals.
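The relation between x-ray wavelength and photon energy, which underlies the tunability mentioned above, can be sketched as follows. This is only an illustrative calculation based on E = hc/λ; the function names are hypothetical, and the selenium K-edge energy used in the example (about 12.658 keV) is a commonly tabulated value that does not come from this text.

```python
# Product of the Planck constant and the speed of light, in keV * angstrom
HC_KEV_ANGSTROM = 12.39842

def photon_energy_kev(wavelength_angstrom):
    """Photon energy (keV) for a given x-ray wavelength (angstrom), E = hc / lambda."""
    return HC_KEV_ANGSTROM / wavelength_angstrom

def wavelength_angstrom(energy_kev):
    """X-ray wavelength (angstrom) for a given photon energy (keV)."""
    return HC_KEV_ANGSTROM / energy_kev

# Copper K-alpha radiation from a rotating anode (1.54 angstrom, as stated in the text)
print(f"Cu K-alpha: {photon_energy_kev(1.54):.2f} keV")

# Assumed selenium K-edge energy, relevant to synchrotron experiments
print(f"Se K-edge:  {wavelength_angstrom(12.658):.4f} angstrom")
```

A rotating anode is fixed at the characteristic wavelength of its metal target, whereas a synchrotron beamline can be set to whichever energy the experiment requires.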

1.4.2.2 The process of x-ray diffraction

When x-rays fall on the molecule, the incident x-rays are elastically scattered by the electron cloud of each atom in the molecule. The scattered x-rays are waves with specific amplitudes and phases. The atoms scatter x-rays in all directions, and these waves can undergo constructive or destructive interference. Waves whose path difference is an integral number of wavelengths undergo constructive interference, which occurs only in discrete directions that depend on the molecular organization and the arrangement of molecules in the crystal. After constructive interference, the diffracted x-rays can be observed as spots on a detection screen (Figure 1.11A). The intensity of each spot depends on the degree of constructive interference. The intensity is a very important measurement in x-ray crystallography, as it contains information on the internal arrangement of atoms in the molecule. At a particular orientation of the crystal in the x-ray beam, only a subset of the possible reflections can be observed, and thus images at several different orientations of the crystal are needed to collect a full data set (Rhodes, 1993).

1.4.2.3 Reciprocal lattice

The crystal is an ordered, repeating array of molecules. The smallest unit from

which the entire crystal can be generated by translations is called the unit cell. The unit

cell is defined by three axes (a, b, and c) and three angles (α, β, and γ) (Figure 1.11B).

The smallest part of the unit cell from which the whole unit cell can be generated by

applying the symmetry operations is called the asymmetric unit. The collection of

symmetry operations by which the asymmetric unit can be converted to the unit cell is

called the space group. The space group defines the collection of symmetry elements of

the crystal such as rotation axes, mirror planes, screw axes etc.

The crystal lattice is the collection of points such that the view from any one point, in any direction, is identical to that from any other lattice point. There is one lattice point at each corner of the unit cell, and in some cases lattice points are also located at the faces or in the center of the cell. There are 14 different lattices (called Bravais lattices) and 65 different space groups possible for protein crystals. A set of lattice planes is a set of parallel planes that pass through all of the lattice points of the crystal. The indices of a given set of lattice planes depend on how many times the planes intersect the unit cell axes a, b, and c (Figure 1.11C). Each set of planes gives rise to a particular reflection of the diffraction pattern. The lattice formed by the diffraction spots is termed the “reciprocal lattice”, because there is an inverse relation between the spacing of the spots observed on the film (detection device) and the spacing of the lattice points in the crystal. Shorter repeat distances in the crystal (i.e. smaller unit cell dimensions) result in larger spacings in the diffraction pattern, and vice versa. This is because lattice points that are far apart in the crystal produce constructive interference at a small angle between the diffracted rays, whereas lattice points that are closer together require a larger angle.
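The inverse relation between unit cell size and spot spacing can be illustrated with a short calculation. The sketch below is not taken from any data processing program; it assumes a flat detector perpendicular to the beam, uses Bragg's law (Equation 1.1) for successive orders of reflection along one cell axis, and all function names and the crystal-to-detector distance of 150 mm are hypothetical choices made for illustration.

```python
import math

def spot_position_mm(order, cell_edge, wavelength=1.54, distance_mm=150.0):
    """Distance from the direct beam (mm) of the n-th order reflection along one axis.

    Uses sin(theta) = n * lambda / (2 * a) and the fact that the diffracted
    ray leaves at an angle of 2 * theta to the direct beam.
    cell_edge and wavelength are in angstroms.
    """
    theta = math.asin(order * wavelength / (2.0 * cell_edge))
    return distance_mm * math.tan(2.0 * theta)

def spot_spacing_mm(cell_edge, wavelength=1.54, distance_mm=150.0):
    """Spacing on the detector between the first- and second-order spots."""
    return (spot_position_mm(2, cell_edge, wavelength, distance_mm)
            - spot_position_mm(1, cell_edge, wavelength, distance_mm))

for a in (50.0, 100.0, 200.0):
    print(f"a = {a:5.1f} A  ->  spot spacing ~ {spot_spacing_mm(a):.2f} mm")
```

Doubling the cell edge roughly halves the spacing between neighboring spots, which is why crystals with large unit cells give densely packed diffraction patterns.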

1.4.2.4 Principles of X-ray crystallography

The main principle of x-ray crystallography is that constructive interference occurs when the difference in path length between two diffracted rays is an integral number of wavelengths of the x-ray. Bragg’s law states that a set of parallel planes (indexed hkl, the Miller indices) separated by a distance $d_{hkl}$ produces constructive interference when x-rays of wavelength λ that are incident on the planes at an angle θ emerge from the crystal at the same angle θ. This happens only when θ satisfies the condition

$2 d_{hkl} \sin\theta = n\lambda$, (Equation 1.1)

where n is an integer (Figure 1.11D). Each set of parallel planes, hkl, gives rise to a spot on the film, and the intensity of each spot depends on the number of atoms that lie on the set of planes. Large unit cells require smaller angles to satisfy Bragg’s law, and hence more sets of planes reflect x-rays at a particular orientation of the crystal. This results in more spots on the film compared to a crystal with a smaller unit cell (Rhodes, 1993).
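Equation 1.1 can be turned into a small calculation. The following is a minimal sketch, with hypothetical function names and the Cu Kα wavelength of 1.54 Å as the default, of how the Bragg angle follows from the plane spacing and how the maximum recorded angle limits the resolution.

```python
import math

def bragg_angle_deg(d_spacing, wavelength=1.54, order=1):
    """Bragg angle theta (degrees) satisfying 2 * d * sin(theta) = n * lambda.

    d_spacing and wavelength are in angstroms. Returns None when the
    condition cannot be met (n * lambda > 2 * d).
    """
    s = order * wavelength / (2.0 * d_spacing)
    if s > 1.0:
        return None
    return math.degrees(math.asin(s))

def resolution_limit(theta_max_deg, wavelength=1.54):
    """Smallest plane spacing (angstrom) observable at a maximum Bragg angle, n = 1."""
    return wavelength / (2.0 * math.sin(math.radians(theta_max_deg)))

for d in (50.0, 10.0, 2.5):
    print(f"d = {d:5.1f} A  ->  theta = {bragg_angle_deg(d):.2f} deg")
```

Widely spaced planes reflect at small angles, whereas the closely spaced planes that carry high resolution information reflect at large angles, towards the edge of the detector.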

1.4.2.5 Conversion of diffraction data to structural information

The intensity and phase of each reflection are the two important pieces of

information needed for structure determination by x-ray crystallography. The intensity of

each spot is observed directly in the diffraction experiment. The phase information is not

directly observed, but can be obtained by other means, as described below. Each

reflection in the diffraction pattern is a structure factor (F(hkl)), which has both magnitude

and phase.

$F(hkl) = |F(hkl)|\, e^{i\alpha(hkl)}$, (Equation 1.2)

where $|F(hkl)|$ is the magnitude and $\alpha(hkl)$ is the phase.

$|F(hkl)| = [c\, I(hkl)]^{1/2}$, (Equation 1.3)

where $I(hkl)$ is the intensity and $c$ is a scaling factor.

All atoms in the unit cell contribute to each F(hkl) reflection.
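The way in which every atom contributes to each reflection can be sketched numerically. The example below uses the standard expression F(hkl) = Σ f_j exp[2πi(hx_j + ky_j + lz_j)], which is not written out in this section and is included here as an assumption consistent with the sign convention of Equation 1.4. The three-atom cell and its scattering weights are hypothetical and do not correspond to any real structure.

```python
import cmath
import math

def structure_factor(atoms, h, k, l):
    """Sum the contributions of all atoms to the reflection (h, k, l).

    atoms: list of (f, x, y, z), where f is a scattering weight (roughly the
    number of electrons) and x, y, z are fractional unit cell coordinates.
    """
    return sum(f * cmath.exp(2j * math.pi * (h * x + k * y + l * z))
               for f, x, y, z in atoms)

# Hypothetical cell containing one carbon-, one nitrogen-, and one oxygen-like atom
atoms = [(6, 0.10, 0.20, 0.30), (7, 0.40, 0.15, 0.70), (8, 0.75, 0.60, 0.25)]

F = structure_factor(atoms, 1, 2, 0)
amplitude = abs(F)           # |F(hkl)|, the quantity obtained from the intensity
phase = cmath.phase(F)       # alpha(hkl) in radians, lost in the experiment
intensity = amplitude ** 2   # I(hkl), up to the scaling factor c

print(f"|F| = {amplitude:.2f}, phase = {math.degrees(phase):.1f} deg, I = {intensity:.1f}")
```

Moving any single atom changes the amplitude and phase of every reflection, which is the sense in which each reflection carries information about the whole unit cell.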

The structure factors are converted to an electron density map by means of a

Fourier synthesis.

$\rho(x,y,z) = \frac{1}{V} \sum_{hkl} |F(hkl)|\, e^{i\alpha(hkl)}\, e^{-2\pi i(hx + ky + lz)}$, (Equation 1.4)

where ρ(x,y,z) is the electron density at any point (x,y,z) in the unit cell, V is the volume of the unit cell, |F(hkl)| is the amplitude (square root of intensity) of the particular hkl reflection in the diffraction pattern, and α(hkl) is the phase of the reflection. The Fourier synthesis is a summation over all of the measured reflections. Each reflection (hkl) contains information about the electron density at each point (x,y,z) in the crystal. Once the electron density map is obtained, a model of the molecule is constructed to fit within the contours of the electron density, as viewed on a 3D graphics computer. This is followed by several stages of crystallographic refinement, a computational process in which the position of each atom in the molecule is optimized by reducing the difference between the intensity of each reflection measured in the diffraction experiment and the intensity calculated from the positions of the atoms in the model. Several other parameters, such as bond lengths, bond angles, and van der Waals distances, are also optimized towards their ideal values during this refinement process. A key parameter in the refinement process is the R factor.

$R = \sum_{hkl} \big|\, |F_{obs}| - |F_{calc}| \,\big| \,/\, \sum_{hkl} |F_{obs}|$, (Equation 1.5)

where $F_{obs}$ is the observed amplitude from the diffraction data and $F_{calc}$ is the amplitude calculated from the model. For a well refined protein structure, the R-factor ranges from 0.15 to 0.25, and a lower R-factor represents better agreement between the model and the measured diffraction data. A set of reflections (5-10%) is excluded from the refinement process, and the difference between the measured intensities of these reflections and the intensities calculated from the refined model gives the free R-factor (Rhodes, 1993).
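Equation 1.5 and the idea of a test set can be sketched in a few lines. This is a simplified illustration and not the procedure of any particular refinement program: the function names are hypothetical, the amplitudes are made-up numbers, and a real calculation would also apply a scale factor between observed and calculated amplitudes.

```python
import random

def r_factor(f_obs, f_calc):
    """R = sum(| |Fobs| - |Fcalc| |) / sum(|Fobs|) over the reflections supplied."""
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator

def split_work_free(indices, free_fraction=0.05, seed=0):
    """Randomly set aside a fraction of reflections that is never used in refinement."""
    shuffled = list(indices)
    random.Random(seed).shuffle(shuffled)
    n_free = max(1, round(free_fraction * len(shuffled)))
    return sorted(shuffled[n_free:]), sorted(shuffled[:n_free])

# Made-up amplitudes for three reflections, for illustration only
f_obs = [100.0, 50.0, 25.0]
f_calc = [90.0, 55.0, 25.0]
print(f"R = {r_factor(f_obs, f_calc):.3f}")

work, free = split_work_free(range(100), free_fraction=0.10)
print(len(work), "working reflections,", len(free), "free reflections")
```

The R-factor is evaluated over the working reflections and the free R-factor over the reflections that were set aside, so the latter gives a check on the model that the refinement itself cannot bias.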

Usually, a Patterson function is used in the initial stages of structure determination. The Patterson function is a Fourier series without the phase information and hence it can be calculated directly from the x-ray diffraction experiment.

$P(u,v,w) = \frac{1}{V} \sum_{hkl} |F(hkl)|^{2}\, e^{-2\pi i(hu + kv + lw)}$, (Equation 1.6)

where (u,v,w) locates a point in the Patterson map. The Patterson map shows the positions of vectors between each pair of atoms in the crystal. The Patterson function helps in orienting the molecule in the unit cell, which in turn helps in calculating the initial phases for all the reflections.
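The statement that the Patterson map shows interatomic vectors can be checked with a one-dimensional analogue of Equation 1.6. The sketch below assumes a hypothetical cell of unit length containing two atoms at fractional positions 0.10 and 0.40; only the squared amplitudes enter the sum, so no phase information is used.

```python
import cmath
import math

def structure_factors_1d(atoms, h_max):
    """F(h) for h = -h_max..h_max in a 1D cell; atoms = [(weight, x_fractional), ...]."""
    return {h: sum(f * cmath.exp(2j * math.pi * h * x) for f, x in atoms)
            for h in range(-h_max, h_max + 1)}

def patterson_1d(F, u):
    """P(u) = sum over h of |F(h)|^2 * exp(-2*pi*i*h*u), with the cell length set to 1."""
    total = sum(abs(Fh) ** 2 * cmath.exp(-2j * math.pi * h * u) for h, Fh in F.items())
    return total.real  # imaginary parts cancel because |F(h)| = |F(-h)|

atoms = [(6, 0.10), (8, 0.40)]            # hypothetical two-atom cell
F = structure_factors_1d(atoms, h_max=10)

grid = [i / 100 for i in range(100)]
P = [patterson_1d(F, u) for u in grid]

# Ignore the origin peak, which arises from every atom paired with itself
peak_value, peak_u = max((p, u) for p, u in zip(P, grid) if 0.1 <= u <= 0.5)
print(f"strongest non-origin peak at u = {peak_u:.2f}")
```

The strongest peak away from the origin should fall at u = 0.30, the separation between the two atoms, even though the positions of the atoms themselves do not appear in the map.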

1.4.3 Calculation of phases

Unfortunately, the phase information for each reflection cannot be observed

directly from the diffraction experiment. There are various methods to obtain the phases of the spots in the diffraction pattern. They are described below.

1.4.3.1 Molecular Replacement

This method of phase calculation is possible if the structure of a related protein is available. It is usually applied to mutant versions of a protein whose structure is already known, to the same protein from a different organism, or to proteins belonging to the same family. Using vector maps (Patterson maps) calculated from the search model and from the measured diffraction data of the crystal, the available structure is oriented in the unit cell of the new crystal. The phases are calculated from this oriented structure, and an electron density map is calculated using this phase information and the amplitudes measured from the diffraction data. In a successful molecular replacement, the electron density map shows features of the new crystal, and the model can be rebuilt to fit the differences according to the electron density map. As a general guideline, there should be at least 30% sequence similarity between the known and the new protein for molecular replacement to be applied.

1.4.3.2 Multiple Isomorphous Replacement (MIR)

MIR is used to calculate the phases of reflections for a new structure. This method requires a native data set and one or more data sets from heavy atom derivatized crystals. Once a good native crystal that diffracts to a reasonable resolution is obtained, the next step is to obtain heavy atom derivatives by soaking the crystal in different heavy metal solutions. The heavy metal should bind to one or more equivalent positions in all the protein molecules in the crystal. The binding of the heavy atom should not alter the structure in other ways; that is, the native and heavy metal derivatized crystals should be isomorphous. The principle is that addition of the heavy metal at a few locations within the protein gives rise to measurable differences in the intensities of the reflections when compared to the native data set. A heavy metal is used for MIR because its larger number of electrons amplifies the difference in intensities between the native and derivative data sets. The positions of the heavy atoms are located using a difference Patterson map. Once the heavy atoms are located, the complete structure factor of the heavy metal ($F_H$), including both phase and amplitude, can be calculated.

The structure factor of the protein can now be calculated using the new information.

$F_P = F_{PH} - F_H$, (Equation 1.7)

where $F_P$ represents the structure factor of the protein alone, $F_{PH}$ is that of the derivative, and $F_H$ is that of the heavy metal alone. In most cases, more than one derivative is required for the unambiguous determination of the phase values (Rhodes, 1993).
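The need for more than one derivative can be illustrated with a short calculation based on Equation 1.7. Because only the amplitudes of $F_P$ and $F_{PH}$ are measured, the relation $|F_{PH}|^2 = |F_P|^2 + |F_H|^2 + 2|F_P||F_H|\cos(\alpha_P - \alpha_H)$ leaves two possible values for the protein phase. The sketch below uses hypothetical structure factors for a single reflection; the "true" protein phase of 50° is used only to simulate the measured amplitudes.

```python
import cmath
import math

def phase_candidates(fp_amp, fph_amp, f_h):
    """Two protein phases (radians) consistent with |F_P|, |F_PH| and a known complex F_H."""
    fh_amp = abs(f_h)
    alpha_h = cmath.phase(f_h)
    cos_term = (fph_amp ** 2 - fp_amp ** 2 - fh_amp ** 2) / (2.0 * fp_amp * fh_amp)
    cos_term = max(-1.0, min(1.0, cos_term))  # guard against rounding or measurement error
    delta = math.acos(cos_term)
    return (alpha_h + delta, alpha_h - delta)

def wrap_deg(angle_rad):
    """Express an angle in degrees within [0, 360)."""
    return math.degrees(angle_rad) % 360.0

# Hypothetical reflection with two different heavy atom derivatives
f_p_true = cmath.rect(100.0, math.radians(50.0))
f_h1 = cmath.rect(20.0, math.radians(120.0))
f_h2 = cmath.rect(15.0, math.radians(-30.0))

cand1 = phase_candidates(abs(f_p_true), abs(f_p_true + f_h1), f_h1)
cand2 = phase_candidates(abs(f_p_true), abs(f_p_true + f_h2), f_h2)

print("derivative 1:", [round(wrap_deg(a), 1) for a in cand1])
print("derivative 2:", [round(wrap_deg(a), 1) for a in cand2])
```

Each derivative on its own yields two candidate phases, and only the candidate shared by both derivatives (here the one near 50°) is consistent with all of the measurements.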

1.4.3.3 Multiwavelength Anomalous Diffraction (MAD)

This method takes advantage of the fact that when a heavy atom derivative is exposed to x-rays, the heavy metal absorbs a fraction of the radiation. Because of this, Friedel’s law, which states that the reflections hkl and -h-k-l have equal intensity but opposite phases, is broken. This inequality in the intensities of centrosymmetrically related reflections is called anomalous dispersion. Since the MAD experiment needs at least two different x-ray wavelengths, this method is possible only at a synchrotron with tunable wavelength. Usually, data are collected at two or more different wavelengths, one near the absorption edge of the heavy metal and one or more far from the absorption edge. The structure factor of the heavy atom (FHP) is different at these wavelengths because of the absorption of x-rays by the heavy metal at the absorption edge. The differences in the measured intensities are used to locate the heavy atoms in the crystal using a difference Patterson function. The heavy atom positions are refined, and then an initial estimate of the phases of all reflections is made. Advances in producing selenomethionine substituted protein have revolutionized the use of the MAD method, as they remove the difficulty of producing a good heavy metal derivative crystal (Rhodes, 1993).
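The breakdown of Friedel's law can be demonstrated numerically. In the sketch below, absorption is modeled in the conventional way by giving the heavy atom a complex scattering factor, f = f0 + f' + if''. This formalism is not spelled out in the text above, and the numerical values (f0 = 34, f' = -8, f'' = 4) as well as the atomic positions are hypothetical values chosen only for illustration.

```python
import cmath
import math

def structure_factor(atoms, h, k, l):
    """F(hkl) as a sum over atoms (f, x, y, z); f may be complex for an absorbing atom."""
    return sum(f * cmath.exp(2j * math.pi * (h * x + k * y + l * z))
               for f, x, y, z in atoms)

# Hypothetical cell: three light atoms plus one heavy atom
light_atoms = [(6, 0.10, 0.20, 0.30), (7, 0.40, 0.15, 0.70), (8, 0.75, 0.60, 0.25)]
heavy_xyz = (0.25, 0.35, 0.55)

def friedel_amplitudes(heavy_f, hkl=(1, 0, 0)):
    """Return |F(hkl)| and |F(-h-k-l)| for a given heavy atom scattering factor."""
    atoms = light_atoms + [(heavy_f, *heavy_xyz)]
    h, k, l = hkl
    return (abs(structure_factor(atoms, h, k, l)),
            abs(structure_factor(atoms, -h, -k, -l)))

far_from_edge = friedel_amplitudes(34.0)               # no absorption: f is real
near_edge = friedel_amplitudes(34.0 - 8.0 + 4.0j)      # assumed f' = -8, f'' = 4

print("far from edge: |F+| = %.3f, |F-| = %.3f" % far_from_edge)
print("near the edge: |F+| = %.3f, |F-| = %.3f" % near_edge)
```

With a real scattering factor the two amplitudes are identical, whereas the imaginary component introduced near the absorption edge makes them differ, and it is this small difference that is measured in a MAD experiment.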

X-ray crystallography is still the leading method for solving the structures of macromolecules such as proteins. The size of the molecule has been shown not to be a problem for crystallography (e.g. the crystal structure of the ribosome complex, with several RNAs and proteins and a molecular weight in the MDa range) (Yusupov et al., 2001a, Yusupov et al., 2001b). With the availability of powerful synchrotron sources, improvements leading to engineered proteins amenable to crystallization, and the availability of software for solving complicated equations to obtain the phase information, for model building, etc., the potential of x-ray crystallography is still far reaching.


Figure 1.1: Schematic representation of the strand exchange reaction in E. coli. The broken strand is blue, the homologous intact dsDNA is red and RecA monomers are green. RecBCD constitutes the helicase and nuclease complex in E. coli, which processes dsDNA break to form ss overhangs. RecA is loaded on to the ssDNA overhang to form nucleoprotein filaments. This is followed by homology search and location of homologous dsDNA, and strand invasion between the ds and ssDNA. The intact complementary strand from the dsDNA is used as a template for new DNA synthesis by DNA polymerase, which is followed by resolution of intermediate structures, and ligation of the DNA break. At the end of the process, two intact dsDNA molecules are formed.


Figure 1.2: Conformation of the active and inactive filaments of RecA (Adapted from Singleton and Xiao, 2002). I represents the incoming DNA (ss DNA formed from the processing of the ds break), C represents the strand which is complementary to the strand I, and O represents the outgoing strand after the strand exchange reaction. RecA-triple stranded extended filament is thought to be a reaction intermediate in the strand exchange pathway. Note the difference in the pitch for protein-only filament and the nucleoprotein filaments.


Figure 1.3: Crystal structure of E. coli RecA, made using the coordinates of 2REB (Story et al., 1992). A) Each monomer is organized into three domains, the NTD, the core domain, and the CTD. The core domain has the ATP binding site and the two putative DNA binding loops, which are disordered in the structure (dotted lines in the figure). The NTD is involved in RecA polymerization, the core domain has the catalytic core which drives the strand exchange reaction that is coupled to ATP hydrolysis, and the CTD brings in the dsDNA for homologous pairing. B) A RecA filament with alternating monomers shown in cyan and gold. Three full turns are shown in the figure. One turn is composed of six RecA monomers. C) A view from the top of the filament showing a central canal where DNA is thought to bind.


Figure 1.4: Coprotease activity of RecA. During the normal cell cycle, the expression of the SOS genes (genes responsible for carrying out DNA repair following enormous DNA damage) is repressed by LexA (magenta and yellow circles), which binds to the operator of these genes. Following UV irradiation of the cell, ssDNA is produced. RecA (green spheres) binds to ssDNA to form active nucleoprotein filaments. The monomeric LexA binds to the nucleoprotein filament and undergoes self cleavage. The cleavage occurs at a region between the NTD and the CTD of LexA. It is thought that the binding of LexA to RecA-ssDNA stabilizes a conformation of LexA which can easily undergo self cleavage. The cleavage reduces the amount of free LexA available to bind the operator, and thus the SOS regulon is turned on. A similar type of reaction occurs in the case of the phage λ repressor protein, which prevents the expression of the proteins for the lytic cycle of the phage. Once the λ repressor undergoes self cleavage, the phage switches to the lytic cycle.


Figure 1.5: RecA has a motor activity which helps in the rotation of the DNA molecule (Figure from Cox, 2003, Reprinted, with permission, from the Annual Review of Microbiology, Volume 57 (c) 2003 by Annual Reviews www.annualreviews.org). The orange spheres represent the RecA polymerized on the ssDNA (green). The complementary dsDNA is shown as blue strands. Position A represents the primary pairing event between RecA-ssDNA and the dsDNA. D represents the secondary pairing, and the part C is left outside the nucleoprotein filament. Since RecA has a motor activity which is coupled to ATP hydrolysis, it can rotate C and bring it into the filament for a full pairing event. The rotation occurs in the direction of the white arrows, and there is migration of the paired DNA in the direction of the black arrow to involve the whole dsDNA in the pairing reaction.


Figure 1.6: ATP hydrolysis occurs as waves throughout the RecA filament (Figure from Cox, 2003, Reprinted, with permission, from the Annual Review of Microbiology, Volume 57 (c) 2003 by Annual Reviews www.annualreviews.org). The black circles represent every sixth RecA monomer in the filament. ATP hydrolysis occurs through these black circles at a particular time point. The DNA is handed over to the next set of black monomers after one set of ATP hydrolysis, which carries out the next cycle of ATP hydrolysis. Thus ATP hydrolysis travels as waves passing through every sixth RecA monomer.


Figure 1.7: Schematic representation of the nucleotide binding pocket in RecA. Three RecA monomers are shown (pink and cyan). Each RecA monomer has three nucleotide binding pockets, which bind DNA by the sugar phosphate backbone (green squares). The bases are exposed for homology search with the dsDNA. Each binding pocket is formed by a different constellation of amino acids, and hence RecA would bind with higher affinity to a particular triplet repeat of ssDNA.


Protein    Function

RecBCD     - processes the dsDNA break and produces 3’ tailed ssDNA overhangs
           - assists in the loading of RecA onto the ssDNA overhang

SSB        - binds to ssDNA and prevents secondary structure formation
           - competes with RecA for loading onto the ssDNA

RecFOR     - RecO binds to SSB and thus assists the loading of RecA onto the ssDNA
           - RecFOR assists RecA when a functional RecBCD is absent in the cell

RecX       - caps the RecA filament and destabilizes filament formation, and thus prevents the strand exchange reaction

DinI       - stabilizes RecA filament formation and thus promotes the strand exchange reaction

Table 1.1: List of proteins which regulate the function of RecA.


Figure 1.8: Schematic representation of the proposed mechanism of ATP hydrolysis by RecA (Adapted from Story and Steitz, 1992). There are three main regions in the active site: the ATP binding site, the Mg2+ binding site, and the DNA binding site. In this picture, the loop L2 and helix G, which are involved in binding the primary DNA, are shown. Mg2+ is held in the active site by interactions with the β and γ phosphates of ATP, Thr 73, and a water molecule hydrogen bonded to Asp 144. Thus Mg2+ helps in orienting the γ phosphate to interact with Gln 194. The interaction of the γ phosphate of ATP with Gln 194 stabilizes L2 and helix G, which then bind to ssDNA. The ssDNA is held with the bases exposed (colored rectangles) to facilitate homology search with the dsDNA. Glu 96 activates the water molecule (blue circle), which starts an in-line attack on the γ phosphate of ATP, leading to the hydrolysis of ATP. Once ADP is formed, the ATP-Gln 194 interaction is broken, leading to the disordering of the DNA binding loop.


Protein    Function

Rad52      - interacts with Rad51 to displace RPA (equivalent to the SSB protein), thus assisting the loading of Rad51 onto ssDNA

RPA        - binds to ssDNA and prevents secondary structure formation

Rad55/57   - increases the ssDNA affinity of Rad51 by enhancing cooperative DNA binding

Rad54      - unwinds the dsDNA for the pairing reaction

Table 1.2: List of proteins which regulate the function of Rad51.


Figure 1.9: The proposed inverse strand exchange mechanism of Dr RecA (Adapted from Kim and Cox, 2002). In typical RecA-mediated strand-exchange reactions, as seen in Ec RecA, the protein polymerizes on ssDNA first, and then incorporates dsDNA into the filament for strand-exchange. But Dr RecA uses an unprecedented mechanism in which it polymerizes on dsDNA first and then incorporates the complementary ssDNA for the strand-exchange reaction


Figure 1.10: Schematic representation of the different stages in the x-ray crystallography structure determination process. A) The first stage is obtaining good diffraction quality crystals. These crystals may be derivatized with heavy metals depending on the method of phase determination. B) A typical diffraction pattern, showing the distribution of the reflection spots. The higher resolution spots are towards the edge of the film. The intensity of each spot is measured and the phase is determined by MR, MIR, or MAD. C) The electron density map is produced from the structure factors by Fourier synthesis. A model is built to fit in the electron density map. In the figure, the blue cages show the electron density for the protein, and the red cage shows the electron density for the ATPγS co-factor, which was not present in the starting model. It is possible to fit ATPγS into the electron density, as it shows clear features of the ATPγS molecule. The model building passes through several rounds of fitting and refining. D) The final structure which came out of the model building process.


Figure 1.11: A) Schematic diagram for the diffraction process (Adapted from Rhodes, 1993). The internal arrangement of the crystal is enlarged and shown. The x-rays are diffracted by the electron clouds of the atoms and after constructive interference appear as spots on the film. B) Representation of a unit cell. The lengths of the axes are represented by a, b, and c and the angles made by these axes by α, β, and γ. C) A typical representation of the different lattice planes in a crystal. The lattice planes are represented by Miller indices, which depend on how many times the plane cuts each axis. D) Schematic representation of the conditions satisfying Bragg’s equation. Two different planes of atoms (black circles), which are spaced at a distance dhkl, are shown. Two rays are shown (R1 and R2), which are incident at an angle θ on the plane of atoms. Constructive interference occurs when the path difference between the two rays (2BC) is an integral number of wavelengths, i.e., 2dsinθ = nλ, where n is an integer.

CHAPTER 2

CRYSTAL STRUCTURE OF RECA FROM DEINOCOCCUS

RADIODURANS: INSIGHTS INTO THE STRUCTURAL

BASIS OF EXTREME RADIORESISTANCE*

2.1 Introduction

Since its discovery (Anderson et al. 1956), Dr has been of keen interest for studying

mechanisms of DNA repair for medical purposes as well as for processing toxic waste.

As mentioned in chapter 1, it is proposed that Dr has an inverse strand exchange pathway

and that Dr RecA is indispensable for the survival of Dr under extreme radiation. Even though there is 56% sequence similarity between Ec RecA and Dr RecA, Ec RecA cannot substitute for Dr RecA, and the expression of Dr RecA is toxic to E. coli cells.

Although the principles that underlie extreme radioresistance in Dr are multifaceted and poorly understood (Battista et al. 1999, Narumi 2003), it is clear that repair of dsDNA breaks by RecA is central to radioresistance, and that Dr RecA has many novel mechanistic features. In the present study, the x-ray crystal structure of Dr RecA has been determined at 2.5 Å resolution in complex with ATPγS. The crystal structure reveals a significant reorientation of the dsDNA-binding C-terminal domain, an increased positive electrostatic potential along the inner surface of the filament, and structural changes in the flexible β6−β7 hairpin that has also been implicated in DNA binding (Rehrauer and Kowalczykowski 1996). The structure thus suggests that these sites on the protein may play a role in dictating the inverse pathway of recombination and extreme radioresistance.

* This work has been published in the Journal of Molecular Biology (2004), Volume 344, 951-963.

2.2 Materials and Methods

2.2.1 Cloning, expression, and protein purification

Deinococcus radiodurans genomic DNA was isolated from strain R1 (ATCC number 13939) and the recA gene was PCR amplified and ligated into pET-14b

(Novagen) using NdeI and BamHI restriction sites. This vector produces Dr RecA

protein with an N terminal six-His tag and an intervening thrombin cleavage site. The

correctness of the gene was confirmed by DNA sequencing. The vector was transformed into BL21-AI cells (Invitrogen), and six 1 L cultures were grown at 37°C in LB broth. Cultures were induced at OD600 = 0.6 with 0.2% arabinose, incubated an additional four hours at 37°C, harvested by centrifugation, and frozen at –80 °C. Thawed cells were suspended in sonication buffer (300 mM NaCl, 50 mM NaH2PO4, 10 mM imidazole, pH 8.0); 0.1 mg/ml PMSF, 1 μg/ml each of leupeptin and pepstatin A, and 1 mg/ml lysozyme were added; and the mixture was incubated for one hour on ice prior to sonication. After sonication and centrifugation, the clarified lysate was loaded onto a 15 ml Ni-NTA fast flow column (Qiagen). After extensive washing overnight with

sonication buffer containing 30 mM imidazole, the protein was eluted with a linear

gradient of 30-500 mM imidazole. Fractions containing Dr RecA were pooled and dialyzed into thrombin cleavage buffer (20 mM NaH2PO4, 150 mM NaCl, pH 7.4). The

Dr RecA-6 His fusion protein was digested with 50 units of thrombin (Amersham

Biosciences) at 22 °C for ~24 hours and then loaded back onto the Ni-NTA column to

remove any uncleaved Dr RecA and remaining impurities. The flow-through fractions

were pooled and dialyzed into 20 mM Tris pH 8.0, loaded onto a 25 ml HiTrap Q HP anion exchange column (Amersham Biosciences), and eluted with a linear gradient of 0-1 M NaCl. Pooled fractions were dialyzed into 20 mM Tris pH 8.0, 1 mM DTT, concentrated to 44 mg/ml, aliquoted, and stored at –80 °C. The final Dr RecA protein contains the extra N-terminal sequence Gly-Ser-His. Sequence numbering follows the convention of Narumi et al. (1999), assuming a 363-residue protein beginning with Met1. Protein concentration was determined by OD280 using an extinction coefficient of 10,240 M⁻¹ cm⁻¹, which was calculated from the amino acid

sequence (Gill and von Hippel, 1989). Protein concentration was also checked by

Bradford assay, which gave similar values.
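The concentration determination described above follows the Beer-Lambert law, A = εcl. The sketch below is only an illustration of that arithmetic: the extinction coefficient is the value given in the text, whereas the molecular weight of 38,000 Da, the absorbance reading, and the 50-fold dilution are hypothetical values assumed for the example, and the function names are not from any particular software.

```python
EXTINCTION_COEFF = 10240.0   # M^-1 cm^-1, value stated in the text for Dr RecA

def molar_concentration(a280, extinction_coeff, path_cm=1.0, dilution=1.0):
    """Molar concentration of the undiluted sample from its absorbance at 280 nm."""
    return dilution * a280 / (extinction_coeff * path_cm)

def mg_per_ml(molar, molecular_weight_da):
    """Convert mol/L to mg/ml (mol/L * g/mol = g/L = mg/ml)."""
    return molar * molecular_weight_da

# Hypothetical reading of 0.237 on a 50-fold diluted sample, assumed MW of 38,000 Da
conc_molar = molar_concentration(0.237, EXTINCTION_COEFF, dilution=50.0)
print(f"{conc_molar * 1e3:.2f} mM, {mg_per_ml(conc_molar, 38000.0):.1f} mg/ml")
```

Because the extinction coefficient is relatively low, a concentrated stock gives an absorbance outside the linear range of most spectrophotometers, which is why a dilution factor is included in the sketch.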

2.2.2 Crystallization, x-ray data collection, and structure determination

Crystals were grown at 22°C by hanging drop vapor diffusion by mixing 2 μl of

10 mg/ml Dr RecA, 2 mM ATPγS, 4 mM magnesium chloride, 20 mM Tris pH 8.0, 1

mM DTT, with 2 μl of reservoir solution containing 50 mM MES pH 6.5, 200 mM

ammonium acetate, 50 mM calcium chloride, and 9.7% PEG 4000. Crystals were transferred to artificial mother liquor solution supplemented with 30% sucrose, and frozen in liquid nitrogen. X-ray diffraction data were collected using a rotating anode

generator and an R-AXIS IV++ image plate (Rigaku), and processed with Crystal Clear

software (Molecular Structure Corporation). The structure was initially solved using a

3.0Å data set, and later refined using an improved 2.5Å data set (Table 2.1). A homology

model of Dr RecA was constructed from the E. coli RecA structure (PDB code 2REB)

(Story et al., 1992) using the Swiss Model server, and used as the search model for molecular replacement by the AMoRe program (Navaza 1994) of CCP4 suite

(Collaborative Computational Project Number 4 1994). The structure was refined using rigid body, torsion angle simulated annealing, and individual temperature factor protocols of Crystallography and NMR System (CNS) (Brünger et al. 1998). After the first round of refinement, it was clear that the C-terminal domain was positioned incorrectly, and the density corresponding to the C-terminal domain was not interpretable. The solvent flipping procedure in CNS dramatically improved the electron density for the C-terminal domain, which was manually positioned as a rigid body into the density using the program O (Jones et al. 1991). The entire model was rebuilt to fit within the density, and at later stages complete density was observed for the ATPγS molecule, which was added to the model, along with 27 water molecules. Due to a lack of electron density, residues 1-14 (plus the extra three residues from the expression vector), 168-175, 208-222, 245-250, and 342-363 were not included in the model. Figures were prepared using MOLSCRIPT

(Kraulis 1991), RASTER3D (Merritt and Bacon 1997), and GRASP (Nicholls et al.

1991).
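The crystallization drop described at the start of this section is made from equal volumes of protein and reservoir solution, so every component starts at half its stock concentration. The following is a minimal sketch of that arithmetic, assuming ideal mixing and ignoring the subsequent concentration of the drop by vapor diffusion; the function name and dictionary keys are illustrative and not part of the original protocol.

```python
# Sketch: initial composition of a hanging drop made by mixing protein and
# reservoir solutions. Concentrations are taken from the text; the function
# name and dictionary keys are hypothetical, chosen only for illustration.

def initial_drop(protein, reservoir, v_protein=2.0, v_reservoir=2.0):
    """Concentration of every component immediately after mixing.

    Volumes are in microliters; concentrations keep their input units.
    """
    total = v_protein + v_reservoir
    drop = {}
    for solution, volume in ((protein, v_protein), (reservoir, v_reservoir)):
        for name, conc in solution.items():
            drop[name] = drop.get(name, 0.0) + conc * volume / total
    return drop


protein_solution = {
    "Dr RecA (mg/ml)": 10.0,
    "ATPgammaS (mM)": 2.0,
    "MgCl2 (mM)": 4.0,
    "Tris pH 8.0 (mM)": 20.0,
    "DTT (mM)": 1.0,
}
reservoir_solution = {
    "MES pH 6.5 (mM)": 50.0,
    "ammonium acetate (mM)": 200.0,
    "CaCl2 (mM)": 50.0,
    "PEG 4000 (%)": 9.7,
}

drop = initial_drop(protein_solution, reservoir_solution)
for name, conc in drop.items():
    print(f"{name}: {conc:g}")
```

Under these assumptions the drop starts at 5 mg/ml Dr RecA and 4.85% PEG 4000, and then equilibrates toward the reservoir concentrations.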

2.3 Results and Discussion

2.3.1 Overexpression and purification of Dr RecA

Dr RecA was expressed as an N-terminal 6-His fusion protein in E. coli. As has

been observed in some instances by other groups (Gutman et al. 1994, Kim et al. 2002),

Dr RecA was toxic to E. coli, to the extent in our case that the expression vector could

not be transformed into BL21 (DE3) pLysS cells. Such toxicity, which has also been

observed in certain E. coli RecA mutants, may be due to increased ssDNA or dsDNA

binding that prevents proper segregation (Campbell and Davis 1999b). This

problem was overcome by using a BL21-AI expression host in which T7 RNA

polymerase is tightly regulated by the PBAD promoter (Guzman et al. 1995). Prior to

induction with 0.2% arabinose, expression of Dr RecA is not detectable by SDS-PAGE

and toxicity does not prohibit transformation or cell growth. The protein was purified by

nickel affinity and anion exchange chromatography, and after thrombin cleavage,

contains the extra residues Gly-Ser-His at the N-terminus. Although the presence of extra

N-terminal residues does not affect the coprotease or ssDNA-dependent ATPase

activities of E. coli RecA, it was not possible to demonstrate ATPase or DNA strand exchange activity for this preparation of Dr RecA based on procedures reported for Dr

RecA with a native sequence (Kim et al. 2002). Nonetheless, this Dr RecA protein crystallized in a typical filamentous form in complex with ATPγS and these crystals were pursued for the present study (Figure 2.1). As described below, the extra N-terminal

residues, as well as the first 14 residues of the native Dr RecA sequence are disordered in

the crystal structure, and thus they do not appear to contribute to the folded structure or

the interactions at the monomer-monomer interface. It is conceivable that the N-terminus of Dr RecA, which is extended by 12 residues compared to E. coli RecA, may have a role

such as in DNA-binding that is disrupted by the presence of the extra N-terminal

residues. However, the observation that the Dr RecA protein with the N-terminal His tag

is toxic towards E. coli suggests that this protein is binding to DNA in vivo, since toxicity

is thought to arise from increased DNA-binding (Campbell and Davis 1999b). It has

previously been observed that an extra glycine residue at the C-terminus of Dr RecA

alters its activities somewhat, though the protein does catalyze DNA strand-exchange

reactions (Kim et al. 2002).

2.3.2 Overall structure

Like E. coli and mycobacterial RecA proteins (Story et al. 1992, Datta et al. 2000,

Datta et al. 2003a, Datta et al. 2003b), Dr RecA crystallizes in the absence of DNA as a

6₁-symmetric filament (Figure 2.2) that is similar to the biologically relevant, DNA-

bound form of the protein. The helical pitch of the Dr RecA filament is 67Å, making it the most compressed of any RecA crystal structure seen to date. RecA-DNA filaments

observed by electron microscopy generally are seen in two conformational states

depending on the nucleotide cofactor: ATP or non-hydrolyzable ATP analogs produce an

extended ‘active’ filament with a helical pitch of ~95Å, while ADP or no nucleotide

produces a more compressed ‘inactive’ filament with a pitch of ~70Å (Egelman and

Stasiak 1993). The available RecA crystal structures, all in the absence of DNA and

ranging in pitch from 74-83Å, are generally thought to represent the compressed, inactive

conformation of the protein. By contrast, recent crystal structures of yeast Rad51 and

Archaeal RadA are in an extended, ‘active’ conformation in which the ATP binding site

has moved to the subunit interface (Conway et al. 2004, Wu et al. 2004). Thus, despite the binding of ATPγS in the present structure, the compressed pitch, absence of DNA,

and overall similarity to the previous RecA structures support the notion that the Dr RecA crystal structure represents the inactive conformation of the protein filament. Although

ATP promotes the extended conformation of RecA filaments, structures of Mycobacterial

(Datta et al. 2000, Datta et al. 2003a, Datta et al. 2003b) and E. coli RecA (Xing and Bell

2004) in complex with ATP analogs are also seen in the compressed, inactive form. On

the other hand, the structure of yeast Rad51 is in an extended form, despite the absence of

bound nucleotide (Conway et al. 2004). Thus, for RecA and related recombinase

enzymes, it seems that the crystal packing forces or solution parameters can influence the

conformation (extended or compressed) of the filament observed in the crystal, perhaps

to an even greater degree than the bound cofactor. It should be pointed out that although

the compressed filaments are referred to as ‘inactive’, the structures of the RecA

protomer within them likely represent relevant conformations that occur during the

catalytic cycle of the strand-exchange reaction, which is coordinated by ATP binding and

hydrolysis.
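The pitch values quoted above can be converted into a per-subunit rise and twist. The sketch below assumes exactly six subunits per turn, which holds for the crystallographic 6₁ filaments but is only an approximation for the filaments seen by electron microscopy; the function name and dictionary labels are illustrative.

```python
# Sketch: axial rise and rotation per subunit for a helical filament,
# given its pitch. Six subunits per turn is exact for a 6(1) screw axis
# and an assumption for the EM filaments.

def helical_parameters(pitch, subunits_per_turn=6.0):
    """Return (rise per subunit in Angstrom, twist per subunit in degrees)."""
    rise = pitch / subunits_per_turn
    twist = 360.0 / subunits_per_turn
    return rise, twist


# Pitch values (Angstrom) as quoted in the text.
filaments = {
    "Dr RecA crystal": 67.0,
    "compressed EM filament": 70.0,
    "extended EM filament": 95.0,
}

for label, pitch in filaments.items():
    rise, twist = helical_parameters(pitch)
    print(f"{label}: pitch {pitch:.0f} A, rise {rise:.1f} A, twist {twist:.0f} deg")
```

Under this assumption the Dr RecA crystal filament rises about 11 Å per subunit, compared with about 16 Å per subunit for an extended filament with a 95 Å pitch.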

The overall structure of the Dr RecA monomer, which is similar to that of E. coli

RecA (Story et al. 1992), is organized into three domains (Figure 2.2): an N-terminal

domain that forms much of the monomer-monomer interface, a central core domain that

in E. coli RecA has the ATP and principal DNA-binding sites (Malkov and Camerini-

Otero 1995), and a C-terminal domain that, in the case of E. coli RecA, has been shown

to bind to dsDNA (Aihara et al. 1997). The core domains of E. coli and Dr RecA

superimpose to an rmsd of 0.71 Å for 197 pairs of Cα atoms. As in E. coli RecA

structures, electron density is absent for residues at the N- and C-termini (residues 1-14 and 342-363 in Dr RecA), and for the two presumed DNA-binding loops, Dr RecA

residues 168-175 and 208-222, all of which are not included in the refined model. It

should be noted that in Dr RecA the disordered region at the N-terminus is extended by

12 residues compared to E. coli RecA. In addition, residues 245-250 at the turn of the

extended β6−β7 hairpin of Dr RecA are not modeled due to disorder. In E. coli RecA,

the residues at the turn of the β6−β7 hairpin are present in the model, but flexible

compared to the rest of the structure.
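For reference, the rmsd quoted for the superimposed core domains is the root-mean-square distance over matched Cα pairs. The following is a minimal sketch of that calculation, assuming the two coordinate sets have already been superimposed and paired; the coordinates are toy values, not taken from either structure.

```python
# Sketch: root-mean-square deviation between paired, pre-superimposed
# coordinates. The optimal superposition step itself is not shown.
import math


def rmsd(coords_a, coords_b):
    """RMSD in the units of the input coordinates (Angstrom for PDB files)."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Toy example: three atom pairs, each displaced by 0.5 A along one axis.
core_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
core_b = [(0.5, 0.0, 0.0), (3.8, 0.5, 0.0), (7.6, 0.0, 0.5)]
print(f"rmsd = {rmsd(core_a, core_b):.2f} A")
```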

2.3.3 Domain movements

When the Dr and E. coli RecA crystal structures are superimposed by their core

domains (Figure 2.3a), movements of the N- and C-terminal domains are evident. The

N-terminal domain packs against the adjacent monomer in the filament, and thus the

slight change in its position in Dr RecA results from the subtle re-packing of monomers

in the more compressed helical filament. The movement of the C-terminal domain is a

much more dramatic 14° rotation about the axis shown in Figure 2.3b. The C-terminal

domain is of particular interest because it contains a dsDNA-binding site involving a

patch of conserved residues, including Lys302, Trp290, and Gly301 in E. coli RecA

(Aihara et al. 1997). In addition, EM studies of E. coli RecA-DNA filaments indicate that

ATP-mediated conformational changes involve significant movements of the C-terminal domain (Yu et al. 2001, VanLoock et al. 2003). The emerging view is that after E. coli

RecA has polymerized on ssDNA, the C-terminal domain functions as a ‘gateway’ for entry of dsDNA into the filament and that this process is coordinated by ATP-mediated conformational changes.

The rotation of the C-terminal domain seen here in the Dr RecA structure

displaces the supposed dsDNA-binding site at the tip of the C-terminal domain by

approximately 7Å in a direction that is away from the filament axis and towards the 5’

end of the filament (Figure 2.3b). It would seem quite likely that this repositioning of the

C-terminal domain in Dr RecA is related in some way to its unique DNA-binding

properties. Similar, though less dramatic C-terminal domain movements have been seen

in different E. coli RecA crystal structures with pitches ranging from 74-83Å (Story et al.

1992, Xing and Bell 2004). Thus, it is possible that the rotation of the C-terminal domain

seen here for Dr RecA arises from a coupling between the filament extension and C-

terminal domain position that is a more general feature of RecA. Mycobacterial RecA

structures, which have a pitch of ~73Å, show some slight variation in the C-terminal

domain position, but are much closer to that observed for E. coli (Datta et al. 2000, Datta

et al. 2003a, Datta et al. 2003b). Though the conformation of the C-terminal domain

observed here for Dr RecA could be influenced by the bound ATPγS, a wide variety of

nucleotides bound to Mycobacterial RecA structures do not significantly alter the

position of the C-terminal domain. In all RecA crystal structures the C-terminal domain

is intimately involved in the crystal packing interactions, which also could influence its

observed orientation. Analysis of multiple crystal forms is thus necessary to fully

understand the meaning of the conformational differences. Nonetheless, the more dramatic nature of the C-terminal domain reorientation observed here for Dr RecA is suggestive of the C-terminal domain having some role in dictating its specialized DNA binding properties.
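The relation between the 14° rotation and the approximately 7 Å displacement of the putative dsDNA-binding site can be checked with simple geometry: a point at distance r from a rotation axis moves by a chord of length 2r·sin(θ/2). The sketch below uses only the two numbers given in the text; the implied distance of the site from the rotation axis is an inference from that relation, not a value measured from the coordinates.

```python
# Sketch: displacement of a point rotated about an axis, and the distance
# from the axis implied by a given displacement. Function names are
# illustrative; only the 14 degree angle and 7 A shift come from the text.
import math


def chord_displacement(radius, angle_deg):
    """Straight-line distance moved by a point at `radius` from the axis."""
    return 2.0 * radius * math.sin(math.radians(angle_deg) / 2.0)


def implied_radius(displacement, angle_deg):
    """Distance from the axis that yields `displacement` for this rotation."""
    return displacement / (2.0 * math.sin(math.radians(angle_deg) / 2.0))


def rotate_about_z(point, angle_deg):
    """Rotate a 3D point about the z axis (used here as the rotation axis)."""
    x, y, z = point
    a = math.radians(angle_deg)
    return (x * math.cos(a) - y * math.sin(a),
            x * math.sin(a) + y * math.cos(a),
            z)


r = implied_radius(7.0, 14.0)
start = (r, 0.0, 0.0)
end = rotate_about_z(start, 14.0)
moved = math.dist(start, end)  # explicit rotation, as a cross-check
print(f"implied distance from axis: {r:.1f} A; displacement: {moved:.1f} A")
```

On this estimate the binding site would lie a little under 30 Å from the rotation axis, which is plausible for a residue at the tip of a domain of this size.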

2.3.4 Monomer-monomer interface

The functional differences between E. coli and Dr RecA do not appear to be due

to any obvious differences in the packing of monomers to form the filament. The

monomer-monomer interaction in the Dr RecA filament, which buries a total of 2,650 Å² of solvent accessible surface area, is highly similar to that seen in the E. coli RecA filament (Story et al. 1992), which buries 2,770 Å². Of the 61 amino acid residues buried

at the interface, 45 (74%) have side chains that are identical in E. coli and Dr RecA,

making the subunit interface more conserved than the rest of the protein. Nearly all of

the hydrogen bonds (8/9) formed at the interface in Dr RecA are also seen in E. coli

RecA, as is one of the two ion pairs, that between Lys263 and Asp132. A second ion pair at

the subunit interface in Dr RecA between Lys40 and Glu125 is not seen in E. coli RecA,

but is compensated for by a different ion pair between Lys6 and Asp139 of E. coli RecA. In

both proteins, the monomer-monomer interaction involves the head-to-tail packing of a

positively charged surface on one monomer with a negatively charged surface on the

neighboring monomer. In both cases, most of the charged residues do not form direct ion

pairs. Rather, more long-range electrostatic interactions are formed between two

complementarily charged surfaces.

Although the N-terminal helix plays a key role in forming the monomer-monomer

interface (Figure 2.2), the first N-terminal residue visualized in the crystal structure,

Ala15, is ~10Å away from the neighboring monomer. Thus, it does not appear likely that

the extra N-terminal Gly-Ser-His residues of the Dr RecA protein used for this study

interfere with or alter the monomer-monomer interactions that build up the filament,

though this possibility cannot be excluded. Rather, the N-terminal residues 1-14 of Dr RecA, along with the extra Gly-Ser-His residues, are located on the outer surface of the filament where they most likely extend out into solvent.

2.3.5 ATP-binding pocket

Electron density for ATPγS in the active site of Dr RecA was very clear (Figure

2.4a), allowing for accurate placement of the entire cofactor. At 2.5Å resolution, this is

the most detailed view for any RecA-nucleotide complex that has been reported. The

density for the γ-thiophosphate group, though complete, was somewhat weaker than for

the rest of the ATPγS molecule, indicating it has increased mobility. This is consistent with the idea that the crystallized complex represents an ‘inactive’ conformation in which the protein is not fully engaged with the γ-thiophosphate group. The γ-thiophosphate of

ATPγS does make close contacts with the side chains of Gln206 and Thr85 of Dr RecA

(Figure 2.4b). In E. coli RecA, Gln194, which corresponds to Gln206 of Dr RecA, is thought to be a key mediator of the allosteric effect of ATP (Kelley and Knight 1997), serving as a link between the γ-phosphate and the loop 2 residues 195-211 that are thought to be the primary DNA-binding site. Despite the interaction of Gln206 of Dr

RecA with the γ-thiophosphate of ATPγS observed here, the loop 2 residues remain disordered, again consistent with the complex representing an ‘inactive’ conformation.

Curiously, although magnesium ion is included in the crystallization mixture, there is no density for magnesium in the active site, as is also seen in some nucleotide complexes of

M. tuberculosis RecA (Datta et al. 2000, Datta et al. 2003a).

Of the residues of Dr RecA that interact with ATPγS (Figure 2.4b), only two,

Lys239 and Gly82, have different side chains than the corresponding residues in E. coli

and Mycobacterial RecA molecules. Lys239 in Dr RecA makes a close contact with the

O3’ hydroxyl of ATPγS. The corresponding residue in M. tuberculosis RecA, Arg228,

can form a similar interaction with bound nucleotides (Datta et al. 2000, Datta et al.

2003a), although in most of the reported complexes it does not. Conceivably Lys239 in

Dr RecA could be important in relaying information from the ATP-binding site to C-terminal regions of the protein. Gly82, which lies within the highly conserved P-loop motif of Dr RecA (residues 78-85), is of particular interest because this amino acid position, which is Ser70 in E. coli RecA, is nearly invariant across 63 eubacterial RecA sequences (Karlin and Brocchiere 1996). Only in Dr and in closely related Thermus species, where glycine is found, is serine not seen at this position. In Dr RecA, Gly82 does not appear to lead to any significant change in the overall conformation of the P-loop, and its backbone amide forms a hydrogen bond to the β-phosphate of ATPγS.

While Gly82 in Dr RecA may allow for backbone conformations not accessible in other

RecA molecules, the phi-psi torsion angles of Gly82 in the present structure are the same as in E. coli RecA, which has serine at this position. As is seen in structures of

Mycobacterial RecA in complex with ATP analogs, the P-loop of Dr RecA in the complex with ATPγS is slightly ‘open’ relative to its conformation in uncomplexed E. coli RecA.

The Ser70 side chain in E. coli RecA is in a position to hydrogen bond to the β-phosphate of ATP and to the Lys72 ε-amino group. Since in E. coli RecA Lys72, along with Glu96, is one of the two key residues thought to participate directly in the ATP-hydrolysis reaction (Rehrauer and Kowalczykowski 1993, Campbell and Davis 1999a), the absence of the serine side chain in Dr RecA could conceivably affect the ATPase reaction or the relay of allosteric information. Compared to E. coli RecA, Dr RecA does

exhibit some interesting differences in the rates of hydrolysis of ATP and dATP and the

extent to which they are coupled to the DNA strand-exchange reaction (Kim et al. 2002).

In particular, Dr RecA hydrolyzes dATP more efficiently than ATP, and Dr RecA

displaces SSB from ssDNA more efficiently in the presence of dATP than with ATP.

However, in contrast with ATP, the hydrolysis of dATP is poorly coupled to strand

exchange. It would seem likely that the presence of Gly82 instead of serine at this

position within the P-loop could at least in part account for these differences. Along

these lines, a Ser69Gly substitution at the neighboring position in E. coli RecA leads to

differential rates of hydrolysis of ATP, dATP, and ddATP, which are coupled to the rate

of strand exchange (Nayak and Bryant 1999). Amino acid residues of Dr RecA near the

hydroxyl groups of the bound ATPγS, such as Lys239 (Figure 2.4b) could also play a role

in the differential coupling of dATP and ATP hydrolysis to strand-exchange observed for

Dr RecA.

One final residue of note is Tyr277, a nearly invariant residue that in E. coli RecA

(Tyr264) is well known for photo-crosslinking to 8-azido-ATP (Knight and McEntee

1985). In Dr RecA this side chain packs against the adenine ring of ATPγS (Figure 2.4b).

In all other RecA structures this tyrosine side chain is either disordered or rotated to close

over the ribose moiety of bound nucleotides (Datta et al. 2000, Datta et al. 2003a, Datta et al. 2003b). Due to its proximity to the ribose moiety, this residue could conceivably be

partly responsible for dictating the differential binding and hydrolysis of dATP and ATP

by Dr RecA and other RecA proteins.

2.3.6 Surface electrostatics

There are some differences in the positions of charged amino acids within the

sequences of E. coli and Dr RecA that do not alter the polypeptide backbone structure,

but nonetheless have a substantial effect on the electrostatic surface potential (Figure

2.5), which could in turn influence the interactions with DNA substrates. While both E.

coli and Dr RecA filaments exhibit a significant dipole, with the 5’ end positively

charged and the 3’ end negatively charged, the detailed electrostatic features are quite

different. In particular, the surface of the Dr RecA filament facing the central axis and

the 5’ end is significantly more positively charged than in the E. coli RecA filament.

This appears to be due in part to the presence of two additional positively charged residues in Dr RecA, Arg225 and Arg341, as well as the absence of two negatively

charged residues, Glu86 and Glu38, that are present in E. coli RecA (Figure 2.5). The Dr

RecA amino acid sequence has a predicted net charge that is 4 units more positive than that of E. coli RecA,

although the estimated pI of Dr RecA of 5.5 is very close to the average measured value

of 5.6 for E. coli RecA (Roca and Cox 1997). Conceivably, the increased positive charge

along the central cavity of the Dr RecA filament could have a role in dictating its

observed preference for polymerizing on dsDNA instead of ssDNA.

It should be pointed out that several disordered residues are absent from the two

structures and thus not included in the electrostatic calculations. These include the two primary DNA binding loops that point towards the filament axis, and several residues at the N- and C-termini. Within the disordered DNA-binding loop 2 of Dr RecA (residues

208-222), there is a glutamate at position 209 that is a methionine in E. coli RecA, although glutamate is commonly seen at this position in other RecA sequences. The E. coli RecA sequence has a high density of negatively charged residues near its C-terminus, a feature that, though commonly seen in other RecA proteins, is less prevalent in Dr

RecA. Instead, Dr RecA has an unusual Ala and Pro-rich sequence at its C-terminus, the function of which is unknown. E. coli RecA molecules missing the C-terminal ~20 residues have enhanced DNA-binding properties, including an increased rate of binding to dsDNA (Benedict and Kowalczykowski 1988, Tateishi et al. 1992, Lusetti et al. 2003).

Thus, the C-terminus of Dr RecA could quite likely play a role in modulating its DNA binding properties. Finally, the N-terminus of Dr RecA is extended by 12 residues compared to E. coli RecA, and this segment has two negatively and two positively charged residues.

2.3.7 Amino acid positions in Dr RecA with residue types not seen in other

RecA sequences

It is conceivable that the unusual DNA-binding properties of Dr RecA could be

conferred on the protein through a small number of critical amino acids. Along these

lines there are ~24 amino acid positions within Dr RecA that have side chains that are

seen exclusively in Dr RecA (9/24) or uncommonly seen in other RecA sequences, based

on an alignment of 63 eubacterial RecA sequences (Karlin and Brocchiere 1996). These

are mapped onto the structure in Figure 2.6. Surprisingly, none of these amino acid

positions are located within either of the two presumed DNA-binding loops of Dr RecA,

the sequences of which are nearly identical in Dr and E. coli RecA. Instead, many of the

residues with different side chains in Dr RecA are located in the region of the structure

between the C-terminal domain and the ATP-binding pocket. In particular, the region

around the β6−β7 hairpin of Dr RecA has a high density of side chains that are not seen or uncommonly seen in other RecA sequences. For example, Ala252 of Dr RecA, which

is located at the beginning of β strand 7, is a glycine residue in all of the other 62 RecA

sequences examined. The structural consequence of this substitution in Dr RecA, which

adds a methyl group, is to twist the β6−β7 hairpin away from β-strand 8 (Figures 2.2a,

2.5). Similarly, the nearby Gln242 and Pro243 on β-strand 6 of Dr RecA are, respectively, an insertion and an amino acid type not seen at this position in other RecA sequences; together they produce a bulge in β-strand 6, causing a further distortion of the β6−β7 hairpin. Val246 is another amino acid within this β-hairpin of Dr RecA that is rarely seen at this position in other

RecA sequences, although residues 245-250 are disordered and not included in the model. The β6−β7 hairpin of E. coli RecA is a site of UV-crosslinking to ssDNA

(Malkov and Camerini-Otero 1995) and mutations of Arg243 and Lys245 of this hairpin disrupt binding of E. coli RecA to DNA (Kurumizaka et al. 1999). Thus, it would seem likely that the β6−β7 hairpin is a particular region of Dr RecA that could play a role in dictating its unique DNA-binding properties.

Other highly conserved amino acid positions within RecA that have a different side chain in Dr RecA include Phe303 and Gly289, both within the C-terminal domain.

Phe303, which is a highly conserved tryptophan in other RecA sequences (Trp290 in E. coli RecA), is located on a surface of the C-terminal domain near several other highly conserved residues that form a binding site for dsDNA in E. coli RecA, as determined by nuclear magnetic resonance studies of its isolated C-terminal domain (Aihara et al. 1997). Gly289 of Dr RecA, which occurs within helix H of the C-terminal domain, is an

Asp, Glu, Lys, or Asn in all other RecA sequences. One final amino acid position of note, Thr259 in Dr RecA, is either Ile or Val in all of the other 62 RecA sequences. This residue is at the top of β7 and facing the interior of the monomer. In Dr RecA, the side

chain of Thr259 interacts with a buried and very well-ordered water molecule or possible ion (but modeled as a water) that is also bonded to the side chain of Asn262 and the backbone carbonyl groups of Ala62 and Pro266.

2.3.8 Structural Implications for the Inverse Mechanism of Strand

Exchange

In strand-exchange reactions promoted by E. coli RecA, the protein first

polymerizes on ssDNA and then incorporates dsDNA into the RecA-ssDNA filament. Dr

RecA catalyzes DNA strand exchange reactions via an unprecedented inverse mechanism in which the protein polymerizes first on dsDNA, and then incorporates ssDNA into the filament. In the case of E. coli RecA, the ssDNA on which the protein first polymerizes

is bound along the central axis of the filament, through interactions most likely involving

the disordered loops L1 and L2 (Malkov and Camerini-Otero 1995). If E. coli RecA is

added to dsDNA instead of ssDNA, then the protein will polymerize on dsDNA, with

dsDNA bound along the central axis of the filament. According to NMR studies (Aihara et al. 1997), the isolated C-terminal domain of E. coli RecA binds to dsDNA, but not to ssDNA, and a “gateway” model has been proposed in which the C-terminal domain coordinates the entry of dsDNA to the central axis of the filament where the ssDNA is bound. Thus, the first DNA (ssDNA in normal reactions) binds along the central axis of the E. coli RecA filament, and the C-terminal domain associates with the second DNA molecule (dsDNA) to facilitate its pairing with the first DNA. Since Dr RecA catalyzes

DNA strand-exchange by an inverse mechanism, it would seem likely that its C-terminal

domain may bind specifically to ssDNA instead of dsDNA, coordinating the entry of ssDNA to the central axis of the filament to pair with dsDNA.

As mentioned before, the increased positive charge along the central axis of the

Dr RecA filament may provide a structural basis for the observed preference of Dr RecA for polymerizing on dsDNA instead of ssDNA. With this model in mind, it is interesting to note structural differences between the C-terminal domains of Dr RecA and E. coli

RecA that would dictate a possible preference for ssDNA vs. dsDNA. Although the positioning of the C-terminal domain is different in the E. coli RecA and Dr RecA structures, the overall fold is quite similar: the two structures can be superimposed to an rmsd of 0.72 Å for all Cα atoms. However, there are a few key amino acid changes in Dr

RecA that are in the vicinity of the observed dsDNA-binding site on the E. coli RecA C-terminal domain (Figure 2.7). The site on the C-terminal domain of E. coli RecA that interacts with dsDNA, as determined by chemical shift perturbation of backbone ¹H and ¹⁵N, includes Gly301, Lys302, Ala303, Asn304, Ile298, and Trp290, many of which are highly conserved. These residues form a patch on the surface of the C-terminal domain at the N-terminal end of helix I, which could interact favorably through a helix dipole interaction with the negatively charged phosphates of DNA. It should be noted that since the NMR study only looked at backbone atoms, the study gives the general region of the binding site, but not all of the interacting residues. One potentially key amino acid difference in Dr RecA is Phe303. This residue, which is Trp290 in E. coli RecA, is a tryptophan in 61 of 63 RecA sequences. Given the proximity of this residue to the dsDNA binding site, its high conservation in RecA sequences, and its unique occurrence as phenylalanine in Dr RecA, it would seem to be a likely candidate in dictating a possible specificity for binding to ssDNA. Another residue change at the supposed

DNA-binding site is Glu316 in Dr RecA, which is Ala303 in E. coli RecA. In Dr RecA the

Glu316 side chain contacts the backbone, and thus neutralizes the positive charge at the

N-terminal end of helix I. Thus, having glutamate at this position could potentially favor

binding to ssDNA instead of dsDNA. It should be noted, however, that glutamate is the

consensus amino acid type at this residue position, and thus it is not unique to Dr RecA.

Somewhat further away from the N-terminal end of helix I, Lys280 and Lys282 of E. coli

RecA are replaced by Asp293 and Asp295 of Dr RecA. The replacement of the

positively charged residues with negatively charged residues in Dr RecA at these

positions could also dictate a preference for ssDNA, although there are other amino acid

positions on the surface of the C-terminal domain of Dr RecA, such as Lys298 and

Lys317, where positively charged residues are substituted for negatively charged or

neutral amino acids in E. coli RecA.

2.4 Conclusions

Recombinational repair of dsDNA breaks by RecA protein is central to the extreme radioresistance of Dr, and accordingly, Dr RecA catalyzes DNA strand-exchange via an unprecedented, inverse reaction pathway. The structure of Dr RecA provides a three-dimensional framework that will be useful in designing mutational experiments to unravel the mechanistic basis for the unique biochemical features that contribute to radioresistance. While the overall structures of Dr and E. coli RecA are similar, key differences include a large rotation of the C-terminal domain that interacts with dsDNA, an increased positive electrostatic surface potential along the central groove of the filament, and an altered conformation of the β6−β7 hairpin. The structural consequences of amino acid positions that have unique side chains in Dr RecA provide further clues to understanding the unique biochemical properties that lead to radioresistance. A more detailed understanding of the unique DNA-binding properties of Dr RecA, and the mechanism of homologous recombination in general, is significantly limited by the absence of a structure of a RecA-DNA complex for any RecA molecule. Moreover, parts of Dr RecA not resolved in the structure, in particular the DNA binding loops and residues at the N- and C-termini, may play significant roles in dictating the DNA-binding properties. Nonetheless, the structure of the ATPγS complex reported here offers valuable insights that should serve as a useful starting point. Understanding the structural and functional differences between E. coli and Dr RecA should shed light on the biochemical mechanism of homologous recombination in general, a process that is still poorly understood in many respects.


Figure 2.1 Dr RecA crystal and diffraction pattern. A) Hexagonal crystals of Dr RecA protein. B) The diffraction pattern from the Dr RecA crystals. The crystals diffracted to 2.5 Å.


Data Collection Statistics

  Space group                              P6₁
  a = b (Å)ᵃ                               111.4
  c (Å)                                    67.5
  Resolution (Å)                           20.0-2.5
  No. of reflections                       82,286
  No. of unique reflections                16,634
  Completeness (%)                         99.9 (99.6)ᵇ
  Redundancy                               5.0 (4.9)
  Rmergeᶜ                                  8.4 (41.1)
  I/σ                                      12.2 (3.6)

Refinement Statistics

  Resolution (Å)                           20.0-2.5
  No. of reflections (working/free)        14,877/1,705
  Completeness (%)                         99.7
  RMS Deviation from Ideal Geometry
    Bonds (Å)                              0.007
    Angles (°)                             1.4
  R-factor/free R-factor (%)ᵈ              22.7/26.1
  No. of waters                            27

ᵃFor comparison the unit cell dimensions of the E. coli RecA crystal structure are a = b = 102.4 Å, c = 82.7 Å, and the resolution is 2.3 Å.
ᵇNumbers in parentheses refer to the highest resolution shell only.
ᶜRmerge = ∑|Ih − ‹I›h| / ∑Ih, where ‹I›h is average intensity over symmetry equivalents.
ᵈR-factor = ∑|Fobs − Fcalc| / ∑Fobs. The free R-factor is calculated from 10% of the reflections that are omitted from the

Table 2.1 Data collection and refinement statistics for Dr RecA structure.
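The reflection counts in Table 2.1 can be cross-checked against the derived statistics. The sketch below is an illustrative consistency check only; the variable names are hypothetical, and the ratio of refined to unique reflections is simply compared with the reported refinement completeness rather than assumed to be how that value was computed.

```python
# Sketch: simple consistency checks on the counts reported in Table 2.1.
n_observed = 82286   # total reflections measured
n_unique = 16634     # unique reflections after merging
n_working = 14877    # reflections in the working set
n_free = 1705        # reflections in the free (test) set

redundancy = n_observed / n_unique        # average measurements per unique reflection
n_refined = n_working + n_free            # reflections used in refinement
free_fraction = n_free / n_refined        # share set aside for the free R-factor
used_fraction = n_refined / n_unique      # share of unique reflections refined against

print(f"redundancy: {redundancy:.2f}")
print(f"free set: {100 * free_fraction:.1f}% of refined reflections")
print(f"refined/unique: {100 * used_fraction:.1f}%")
```

The computed redundancy is close to the tabulated 5.0, and the free set amounts to roughly 10% of the reflections used in refinement, in line with the table footnote.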


Figure 2.2 Structure of D. radiodurans RecA monomer and polymer. a) Ribbon representation of a monomer, colored blue to red from N- to C-terminus. The orientation is roughly the same as for the top monomer of the filament shown in panel b. ATPγS is in orange ball-and-stick with carbon black, oxygen red, nitrogen blue, phosphorus magenta, and sulfur yellow. Three disordered loops are indicated by the colored spheres and residue numbers. The N-terminal (residues 15-45), core (residues 46-282) and C-terminal (residues 283-341) domains are indicated. α helices and β-strands are labeled according to E. coli RecA (Story et al. 1992) and the beginning and ending residues of each secondary structure element are indicated. b) Two complete turns of the 6₁-symmetric Dr RecA filament are shown with alternating subunits in green and magenta. The view is perpendicular to the 6₁ axis (crystallographic c axis), which is shown as a black line. The 5’ to 3’ polarity (Stasiak et al. 1988) and helical pitch of the filament are indicated.


Figure 2.3 Reorientation of the C-terminal domain of Dr RecA relative to E. coli RecA. a) Stereo view of superimposed Cα-traces of Dr RecA (magenta) and E. coli RecA (yellow) monomers. The superposition is based on the Cα atoms of the core domains only (residues 34-269 and 46-282 of E. coli and Dr RecA, respectively). ATPγS from the Dr RecA structure is shown in ball-and-stick with green bonds. Notice that the largest structural differences are in the positions of the N-terminal helix and the C-terminal domain. The coordinates for E. coli RecA are from PDB code 2REB (Story et al. 1992). b) Movement of the Dr RecA C-terminal domain in the context of the E. coli RecA filament. The superposition from panel a is rotated ~90° about the central axis to give the view from the front of the filament, and neighboring subunits from the E. coli RecA filament are shown in cyan. The black line shows the filament axis with the 5’-3’ polarity (Stasiak et al. 1988) indicated, and the green line shows the axis about which the 14° rotation of the C-terminal domain occurs. c) Close up view of the reorientation of the C-terminal domain. The view is the same as in panel b. The side chains of two residues of the E. coli RecA C-terminal domain that interact with dsDNA, Trp290 and Lys302, are shown in ball-and-stick. The corresponding residues in Dr RecA, also shown in ball-and-stick, are Phe303 and Lys315.


Figure 2.4 Structure of the Dr RecA ATP-binding pocket. a) Stereo view of electron density in the active-site region. ATPγS (green) and Dr RecA (gold) are shown in ball-and-stick. The blue cage is the final 2.5Å 2Fo-Fc electron density map contoured at +1σ, and the red cage is a 2.5Å Fo-Fc simulated annealing omit map, calculated with ATPγS omitted and contoured at +3σ. The orientation is similar to Figures 2.1a and 2.2a, but rotated ~60° from left to right about the vertical axis. b) Stereo view of interactions between ATPγS and active site residues of Dr RecA. All potential hydrogen bonds within 3.5Å are shown as black dotted lines. Notice that six consecutive amide groups of the P-loop form potential hydrogen bonds to the phosphates.



Figure 2.5 Increased positive electrostatic potential on the surface of Dr RecA. a) Electrostatic surface potential of the Dr RecA filament. The molecular surface of one complete turn of the Dr RecA filament is shown roughly perpendicular to the filament axis, with the 5’ end at the top. The surface is colored according to electrostatic potential at ±7 kT, with positively charged regions blue and negatively charged regions red. b) Electrostatic surface potential of the E. coli RecA filament. The orientation and coloring are the same as in panel a. Notice the increased positive charge along the upper (5’ end) and inner surface of the Dr RecA filament (left), as compared to the E. coli RecA filament (right). The large blue patch at the very top of each filament is covered by the next subunit in the polymer, and thus not exposed to solvent except at the very end of the filament.


[Figure 2.6, panels a (sequence alignment of Dr and Ec RecA, with secondary structure elements αA-αJ and β1-β11 marked above the Dr sequence) and b (stereo view of the Dr RecA monomer)]

Figure 2.6 Amino acid residues of Dr RecA with side chains that are uncommonly seen in other RecA sequences. a) Structure-based sequence alignment of E. coli and Dr RecA. The portions of the sequences not included in the structures are shaded in gray. Secondary structures in Dr RecA are indicated above the alignment. Charged residues are colored in red letters for negative and blue for positive. The green shaded boxes indicate residue positions that are highly conserved among 63 RecA sequences (Karlin and Brocchieri 1996), but have a different side chain in Dr RecA. The blue shaded boxes indicate residue positions that are moderately conserved and have a different side chain in Dr RecA. The symbols below the alignment indicate the degree of conservation between Dr and E. coli RecA: identical (*), conserved (:), and semiconserved (.), as defined by ClustalW (Higgins et al. 1994). b) Mapping of amino acid positions with distinctly different side chains onto the structure of Dr RecA. A stereo view of the Dr RecA monomer is shown as a magenta Cα-trace in the same orientation as Figures 2.1a and 2.2a. Amino acid side chains of residues shaded green or blue in panel a are shown in green or blue ball-and-stick. Notice that many of the side chains that are different in Dr RecA cluster together in the region of the structure between the C-terminal domain and the ATP-binding pocket, particularly around the β6-β7 hairpin, near residues 243-252.



Figure 2.7 Comparison of the structures of the C-terminal domains of Dr RecA and E. coli RecA in the vicinity of the dsDNA-binding site of E. coli RecA. a) Ribbon drawing of the E. coli RecA C-terminal domain (PDB code 2REB) (Story et al. 1992), with selected amino acids shown in ball-and-stick. The dsDNA-binding site determined by chemical shift perturbation of backbone 1H and 15N involves residues Gly301, Lys302, and Ala303 (Aihara et al. 1997). The view is the same as in Figure 2.2a. b) Ribbon drawing of the Dr RecA C-terminal domain with selected amino acids shown in ball-and-stick. Notice that Glu316 of Dr RecA caps the N-terminal end of helix I, which is the observed binding site of the E. coli RecA C-terminal domain to dsDNA. Also notice that Lys280 and Lys282 of E. coli RecA are replaced with Asp293 and Asp295 of Dr RecA. Gln300 and Gly301 of E. coli RecA are invariant among RecA enzymes, and Lys302 is almost always lysine or arginine.

CHAPTER 3

PROBING THE DNA SEQUENCE SPECIFICITY OF E. COLI RECA PROTEIN*

3.1 Introduction

RecA binds to all sequences of ssDNA, primarily along the sugar phosphate

backbone with the bases exposed for homology recognition (Leahy and Radding 1986).

Nonetheless, RecA does bind preferentially to certain types of DNA sequences. In early

studies it was noted that RecA binds preferentially to poly(dT), due to its lack of

secondary structure (McEntee et al. 1981, Amaratunga and Benight 1988). In vitro selection for sequences of DNA that bind optimally to E. coli RecA (Tracy and Kowalczykowski 1996) pulled out GT-rich sequences that bear a striking resemblance to the recombination hotspot, Chi (5’-GCTGGTGG-3’). The same in vitro selection experiment using yeast Rad51, a eukaryotic homolog of RecA, pulled out a very similar set of GT-rich sequences (Tracy et al. 1997a), indicating that the sequence specificity is conserved and possibly inherent to some aspect of DNA structure, such as its ability to be extended. Interestingly, similar GT-rich sequences are present in highly recombinogenic regions of DNA in higher eukaryotes, including microsatellites, Alu repeat elements, and the constant regions of immunoglobulin heavy chains (Tracy and Kowalczykowski 1996, Tracy et al. 1997a).

* This work has been published in the journal Nucleic Acids Research (2006), Volume 34, 2463-2471.

The binding of RecA to different sequences of ssDNA has been examined directly by various biophysical techniques. A comparison by isothermal titration calorimetry (ITC) of the binding of RecA to poly(dT), poly(dA) and poly(dC) revealed that RecA binds with a substantially more favorable enthalpy to poly(dT) (Wittung et al. 1997). A comparison by surface plasmon resonance (SPR) of the binding of RecA and Rad51 to all possible dinucleotide-repeating 39-mers revealed that both proteins bind strongly to CT, GT and CA-repeating sequences and weakly to GA, AT and GC-repeats (Biet et al. 1999, Dutreix 1997). These differences were attributed largely to secondary structure of the ssDNA. A fluorescence anisotropy study of the binding of RecA to different trinucleotide-repeating 39-mers revealed that RecA binds tightly to TTT, CCC, TCC and TAC-repeating sequences and weakly to GAA, AAA, CGG and CAG repeats (Bar-Ziv and Libchaber 2001). It was concluded that un-stacking of purines is an energetic barrier to RecA binding such that pyrimidine-rich sequences bind tightly to RecA. The differences in binding could be correlated with the calculated folding energies of the ssDNA (Zuker 2003).

Two lines of evidence suggest that in the active state of the filament each RecA monomer binds exactly three nucleotides of DNA. First, alignment of naturally occurring sequences of E. coli DNA that surround Chi sites reveals a striking TGG-repeating consensus sequence extending to at least 230 bp on either side of Chi (Tracy et al. 1997b). Second, DMS reactivity of Chi-containing oligonucleotides bound by RecA-ATPγS exhibits a phasing of precisely three nucleotides (Volodin and Camerini-Otero 2002). These observations suggest that each monomer of RecA has three separate nucleotide binding sites, each formed by a different constellation of amino acid residues (Figure 1.7). Such an arrangement would give rise to distinct preferences for different nucleotides at each of the three binding sites.

To systematically test the ssDNA sequence specificity of E. coli RecA protein, a coprotease assay was used to compare the binding of RecA to all 64 possible trinucleotide-repeating 15-mer oligonucleotides. Since λ repressor cleavage is specific for the RecA-ssDNA-ATP complex, the extent of binding of RecA to a particular DNA sequence can be inferred from the rate of repressor cleavage. This assay is simple and rapid enough to compare large numbers of different sequences, and sensitive enough to show significant differences among them. By using 15-mers as opposed to longer oligonucleotides, the inhibition of RecA-DNA binding due to secondary structure formation or intermolecular pairing of the DNA was likely minimized. This in turn would allow the effects of the direct interaction between RecA and the ssDNA to be accentuated.

3.2 Materials and Methods

3.2.1 Materials

All oligonucleotides were purchased from Integrated DNA Technologies and

dissolved in ddH2O. 48-mer oligonucleotides were purified by ion-exchange HPLC.

Oligonucleotide concentrations are expressed in nucleotides and were measured by O.D. at 260 nm using extinction coefficients calculated from the sequences. M13 ssDNA was from New England Biolabs. ADP, ATP, NADH, phosphoenolpyruvate (PEP), pyruvate kinase type VII, and lactic dehydrogenase type XXXIX were from Sigma-Aldrich. E. coli

single-stranded DNA binding protein (SSB) was from USB. All other chemicals were

Fisher ACS certified grade.

3.2.2 Protein Expression and Purification

E. coli RecA was expressed and purified as described previously (Xing and Bell

2004). Briefly, RecA was expressed as a 6-His fusion protein from pET14b (Novagen) in

E. coli BL21(DE3)pLysS cells. The protein was purified by Ni2+-affinity and anion

exchange chromatography. The final RecA protein has the extra sequence Gly-Ser-His-

Met at the N-terminus, but the protein has ssDNA-dependent coprotease and ATPase

activities that are essentially indistinguishable from native RecA protein based on

experiments with several different oligonucleotide sequences (data not shown). The concentration of the protein was determined by O.D. at 280 nm using the extinction coefficient of 20,340 M-1 cm-1 calculated from its amino acid sequence.

The λ repressor fragments were purified for the cleavage assay. These fragments had coprotease activity similar to or better than that of the full-length λ repressor and hence were used for the assay. Residues 93-236 of λ repressor (cI93-236) were expressed and purified by a procedure similar to that described previously for cI132-236 (Bell et al. 2000). The fragment was cloned into the pET14b vector and expressed in BL21AI cells. The protein was purified on a Ni-NTA column, the His-tag was cleaved, and a second Ni-NTA column was run to remove the uncleaved protein and other impurities. A hypercleavable form of λ repressor (cIhc), consisting of residues 101-229 and bearing the mutations P158T and A152T, was constructed using the QuikChange (Stratagene) procedure and purified using the same procedure as for cI93-236 (the λ repressor fragments were purified by Dr. Dieudonné Ndjonka). The concentrations of the λ repressor fragments were determined by O.D. at 280 nm using extinction coefficients calculated from their amino acid sequences: 21,095 M-1 cm-1 for cI93-236, and 15,220 M-1 cm-1 for cIhc.

3.2.3 RecA coprotease assay

The self-cleavage of a fragment of λ repressor (cI93-236) was measured in the

presence of RecA, ADP-AlF4 and 64 different trinucleotide-repeating 15-mer

oligonucleotides. Reactions (50 μl) included 10 μM RecA, 30 μM oligonucleotide, 1 mM

ADP, 2 mM aluminum nitrate, 10 mM NaF, 20 mM Tris pH 7.4, 2 mM MgCl2 and 50 mM NaCl. The above components were mixed and incubated at 25 ºC for 30 minutes, after which 10 μM cI93-236 was added and the mixture was incubated at 25 ºC for 20

minutes. The reaction was quenched by adding 0.25 volumes of 5X SDS–PAGE loading

buffer and immediately heating to 95 ºC for 5 minutes. Samples were run on a 13.5% SDS-PAGE gel and stained with Coomassie Brilliant Blue. Dried gels were digitally scanned and the intensity of each band was integrated using Kodak Digital Science 1D

image analysis software. The % cleavage was calculated from the net intensities of the

bands corresponding to cleaved and uncleaved cI93-236 (residues 112-236 and 93-236,

respectively). The net intensity of cleaved cI93-236 was multiplied by 1.16 (145/125) to

account for the smaller size of the cleaved product relative to the uncleaved substrate.

Reactions were done in triplicate and the mean and standard deviation are reported.
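
The quantification described above can be sketched in a few lines of Python. This is only an illustration: the text does not state the exact formula, so the form (corrected cleaved intensity) / (corrected cleaved intensity + uncleaved intensity) is an assumption, and the function names are hypothetical.

```python
from statistics import mean, stdev

# Size correction quoted in the text: 145/125 = 1.16.
SIZE_CORRECTION = 145 / 125


def percent_cleavage(cleaved, uncleaved, correction=SIZE_CORRECTION):
    """Percent cleavage from net band intensities (arbitrary units).

    Assumed formula (not stated explicitly in the text): the cleaved band
    is scaled by the size correction, then divided by the total signal.
    """
    corrected = cleaved * correction
    total = corrected + uncleaved
    if total <= 0:
        raise ValueError("band intensities must sum to a positive value")
    return 100.0 * corrected / total


def summarize_triplicate(values):
    """Mean and sample standard deviation of replicate % cleavage values."""
    return mean(values), stdev(values)
```

With this form, a lane in which the size-corrected cleaved band equals the uncleaved band, for instance percent_cleavage(125, 145), comes out at approximately 50%.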

3.2.4 ATPase assay

The DNA-dependent ATPase activity of RecA in the presence of selected 48-mer

trinucleotide-repeating sequences was measured by a coupled spectrophotometric assay

(25). The 500 μL reaction mixture contained 25 mM Tris pH 7.4, 10 mM MgCl2, 1 mM

DTT, 1 mM ATP, 2 mM PEP, 0.5 mM NADH, 30 units/mL each of pyruvate kinase and lactic dehydrogenase, 0.8 μM RecA, and 3 μM oligonucleotide. The reactions were

carried out at 37 ºC with an electrically heated cell holder. All reaction components except RecA were pre-mixed, and reactions were initiated by mixing in RecA. The O.D. at 340 nm was monitored at one-minute intervals using an Ultrospec 2100 pro UV/VIS

spectrophotometer (Amersham Biosciences). In the reaction with 3 μM M13 ssDNA, 0.8

μM E. coli single-stranded binding protein (SSB) was added after RecA. The ATPase rate (min-1) was calculated using the formula: (-dA/dt)(1/6300 M-1cm-1)(reaction volume)

/ [RecA], where dA/dt is the slope in the linear region of the plot of OD340 vs. time (min).

Each reaction was done in triplicate and the mean and standard deviation are reported.
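
The rate calculation can be illustrated with a short Python sketch. It is a sketch under stated assumptions rather than the exact procedure used: it works entirely in concentrations and assumes a 1 cm path length, so the reaction-volume term of the printed formula does not appear, and it assumes one NADH is oxidized per ATP hydrolyzed, as in the standard coupled assay. The function names and the example trace are hypothetical.

```python
# Extinction coefficient of NADH at 340 nm as quoted in the text (M^-1 cm^-1).
NADH_EXT_COEFF = 6300.0


def linear_slope(x, y):
    """Ordinary least-squares slope of y versus x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


def atpase_rate(times_min, od340, reca_molar, path_cm=1.0):
    """ATP hydrolyzed per RecA per minute (min^-1).

    Assumes a 1:1 coupling between NADH oxidation and ATP hydrolysis and
    that only the linear region of the trace is passed in.
    """
    dA_dt = linear_slope(times_min, od340)                 # OD per minute
    atp_per_min = -dA_dt / (NADH_EXT_COEFF * path_cm)      # M per minute
    return atp_per_min / reca_molar


# Synthetic, perfectly linear trace (not measured data): OD340 falling by
# 0.0504 per minute with 0.8 uM RecA corresponds to 10 min^-1.
example_times = [0, 1, 2, 3, 4, 5]
example_od = [2.0 - 0.0504 * t for t in example_times]
example_rate = atpase_rate(example_times, example_od, reca_molar=0.8e-6)
```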

3.2.5 DNA-binding assay

This experiment was done by Jim Wisler. Binding of RecA to 48-mer

oligonucleotides was measured with the double filter-binding method (Wong and

Lohman 1993). Binding reactions (50 μl) were in buffer B (25 mM Tris-acetate pH 7.5, 4

mM Mg-acetate, 10 mM KCl, and 1 mM DTT) and contained 1 mM ATPγS, about 10

nM 32P end-labeled 48-mer oligonucleotide, and 0-2 μM RecA. The reactions were

equilibrated at 37 °C for 30 minutes and then loaded onto a Minifold I 48-well slot-blot

apparatus (Schleicher & Schuell) containing nitrocellulose (BioRad) and DEAE

(Whatman) filters, which were pre-treated as previously described (Tracy and

Kowalczykowski 1996, Wong and Lohman 1993) and equilibrated in buffer B prior to

use. Samples were loaded into the wells, pulled through under vacuum, and washed with

1 ml of buffer B (Figure 3.1). The radioactivity on the filters was measured using a Storm

860 phosphorimager (Amersham Biosciences) and Image Quant 5.2 software (Molecular

Dynamics), and the % bound for each sample was calculated from the net intensities of

the bound (nitrocellulose) and unbound oligonucleotide (DEAE).
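
As an illustration of this calculation, the % bound can be written as the nitrocellulose signal divided by the sum of the two filter signals. The Python sketch below assumes that form; the function name and the optional background arguments are hypothetical and are not taken from the text.

```python
def percent_bound(nitrocellulose, deae, nc_background=0.0, deae_background=0.0):
    """Percent of oligonucleotide bound in a double filter-binding assay.

    nitrocellulose: signal retained with protein (bound DNA)
    deae: signal trapped on the DEAE filter (unbound DNA)
    Background values are optional and default to zero (an assumption).
    """
    bound = nitrocellulose - nc_background
    free = deae - deae_background
    total = bound + free
    if total <= 0:
        raise ValueError("net intensities must sum to a positive value")
    return 100.0 * bound / total
```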

3.3 Results and Discussion

3.3.1 Dependence of RecA coprotease activity on ssDNA sequence

The cleavage of a λ repressor fragment (residues 93-236; cI93-236) in the presence

of RecA, ADP-AlF4, and 64 different trinucleotide-repeating 15-mer oligonucleotides

was measured by SDS-PAGE. This fragment of λ repressor undergoes RecA-mediated cleavage in a similar manner as the full-length protein (Sauer et al. 1982). The

temperature of 25 ºC and time of 20 minutes were chosen in order to maximally distinguish the cleavage efficiencies in the presence of the 64 different trinucleotide-repeating 15-mers. Figure 3.2 shows an example of the coprotease reaction for five different sequences, and Figure 3.3 shows the % cleavage in the presence of all 64 trinucleotide-repeating 15-mers. The observed coprotease activity is highly dependent on the sequence of DNA. Under the conditions of the assay, the % cleavage values ranged from 91% to 7%. The values for 15-mers containing different permutations of the same trinucleotide repeat (e.g. TGG, GTG, and GGT) were observed to be very similar. Accordingly, mean values for each such group are shown in the set of 24 non-redundant trinucleotides in Figure 3.3B.
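
The counts of 64 sequences and 24 non-redundant groups follow from grouping trinucleotides that are cyclic permutations of one another: 4 homopolymers plus 60/3 = 20 groups of three. A minimal Python sketch of this bookkeeping is given below; the helper names are hypothetical, and labeling each group by its alphabetically first member is an arbitrary choice made here for illustration.

```python
from itertools import product

BASES = "ACGT"


def repeat_oligo(trinucleotide, length=15):
    """Build a trinucleotide-repeating oligonucleotide of the given length."""
    repeats = -(-length // 3)  # ceiling division
    return (trinucleotide * repeats)[:length]


def cyclic_class(trinucleotide):
    """Label shared by all cyclic permutations of a trinucleotide."""
    rotations = {trinucleotide[i:] + trinucleotide[:i] for i in range(3)}
    return min(rotations)


# All 64 trinucleotides and their 15-mer repeats.
trinucleotides = ["".join(p) for p in product(BASES, repeat=3)]
oligos = {t: repeat_oligo(t) for t in trinucleotides}

# Group by cyclic permutation, e.g. TGG, GTG and GGT fall in one class.
classes = {}
for t in trinucleotides:
    classes.setdefault(cyclic_class(t), []).append(t)


def class_means(pct_cleavage):
    """Mean % cleavage per class, given a dict of trinucleotide -> value."""
    return {
        label: sum(pct_cleavage[t] for t in members) / len(members)
        for label, members in classes.items()
    }
```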

By far the highest coprotease activity was observed with TGG-repeating sequences, which gave 90% cleavage in the 20-minute reaction. The next best trinucleotides were AGG and TTG, which gave ~60% cleavage. These are similar to TGG, but with the substitutions of T to A and G to T, respectively. Trinucleotide-repeating sequences that gave the lowest coprotease activity (~10% cleavage) tended to have combinations of GC or AT. Among the homopolymeric sequences, cleavage was highest for GGG (49%), followed by CCC (39%), TTT (38%), and AAA (13%).

From examination of Figure 3.3, it is clear that it is the sequence of nucleotides within a trinucleotide, and not simply the composition, that determines the ability to promote RecA coprotease activity. This is most evident in comparing the cleavage for

TCG (45%) and CTG (23%), which differ in sequence but not in composition. Similarly, it is not simply the pattern of pyrimidines and purines within a trinucleotide that is important, since TGG gives the highest cleavage (90%), whereas CAG (grouped with AGC and GCA) gives the lowest cleavage (8%). Thus, individual bases have specific properties that give rise to a strong nucleotide sequence-dependence of RecA coprotease activity.

3.3.2 Dependence of RecA coprotease activity on oligonucleotide length

For the reasons described above, 15-mer trinucleotide-repeating oligonucleotides were chosen for the exhaustive comparison. To test if the same sequence-dependence of

RecA coprotease activity is observed for longer oligonucleotides, cleavage reactions for five selected trinucleotide-repeats (TGG, TTG, TTT, TCA, and CCA) were compared for

15-mer and 48-mer oligonucleotides. For these experiments the ratio of RecA to DNA was fixed at three nucleotides per RecA monomer so that the reactions with 48-mer oligonucleotides contained fewer molecules of ssDNA, but the same number of nucleotides. The results show definitively that the length of the oligonucleotide does not change the sequence-specificity of RecA coprotease activity (Figure 3.4A). Interestingly, slightly but significantly lower coprotease activity was observed for all of the 48-mer sequences, which is somewhat counter-intuitive since the binding of RecA to ssDNA is known to be highly cooperative (Menetski and Kowalczykowski 1985, Takahashi et al.,

1986).
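
The statement that the 48-mer reactions contain fewer DNA molecules but the same number of nucleotides can be made concrete with a little arithmetic. The sketch below assumes the concentrations given in section 3.2.3 (30 μM oligonucleotide, expressed in nucleotides) also apply to the 48-mer reactions; the function names are hypothetical.

```python
def strand_concentration(nucleotide_conc_uM, oligo_length):
    """Concentration of oligonucleotide molecules (uM) from nucleotide conc."""
    return nucleotide_conc_uM / oligo_length


def monomers_per_strand(oligo_length, nt_per_monomer=3):
    """RecA monomers needed to coat one strand at 3 nucleotides per monomer."""
    return oligo_length / nt_per_monomer


# 30 uM nucleotides corresponds to 2 uM of 15-mer but only 0.625 uM of 48-mer.
conc_15mer = strand_concentration(30, 15)
conc_48mer = strand_concentration(30, 48)
```

At three nucleotides per monomer, a 15-mer can accommodate 5 RecA monomers and a 48-mer can accommodate 16.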

3.3.3 Dependence of RecA coprotease activity on type of ATP cofactor

The coprotease assays described above were carried out in the presence of ADP-

AlF4, which gives the highest coprotease activity of any ATP analog tested. In order to determine if the DNA sequence dependence of RecA coprotease activity was the same under conditions of ATP hydrolysis, reactions in the presence of ATP were attempted.

However, with ATP as the cofactor cleavage is much less efficient, and there was no detectable cleavage of cI93-236 for reactions with 15-mer or 48-mer oligonucleotides, even

after several hours at 37 °C. In order to overcome this problem, coprotease reactions with

ATP were performed with a hypercleavable form of λ repressor consisting of residues

101-229 and bearing the mutations P158T and A152T (hereafter referred to as cIhc). In

Figure 3.4B, the coprotease activity of RecA in the presence of ADP-AlF4 (with cI93-236) and ATP (with cIhc) is compared for five selected 48-mer oligonucleotides under

otherwise identical conditions. The DNA sequence-dependences of RecA coprotease

activity in the presence of ADP-AlF4 and ATP are essentially the same. With TCA and

CCA-repeating 48-mers, there was no detectable cleavage of cIhc in the 20-minute

reactions. Thus it can be concluded that although the repressor cleavage reaction is

dramatically slower under conditions of ATP hydrolysis, the dependence of the reaction

on the DNA sequence is the same as with ADP-AlF4.

3.3.4 Detailed comparison of TGG, GGT, and GTG

The initial experiments showed that GGT, GTG, and TGG-repeating 15-mers

gave by far the highest RecA coprotease activity. These three oligonucleotides differ only

in the point at which they start. In order to determine if there is a single best trinucleotide

repeat, these three 15-mers were examined more closely by doing a time-course for the

cleavage reactions (Figure 3.5). The rationale for doing this experiment is as follows. If,

for example, the optimal trinucleotide is TGG, then a TGG-repeating 15-mer will have five ideal sites, whereas GTG and GGT-repeating sequences will only have four. One might expect this effect to be accentuated for 9-mer sequences as compared to 15-mers since the difference in the number of ideal sites would be more significant (2 vs. 3 as compared to 4 vs. 5).

The results of the time-course show that GTG and GGT-repeating sequences give significantly higher rates of cleavage than TGG-repeats, both for the 15-mers and for the 9-mers. At earlier time-points GTG is consistently better than GGT, although the differences are not statistically significant at any single time-point. The differences do not appear to be accentuated for 9-mers as compared to 15-mers, as was hypothesized above. These results indicate that the ideal trinucleotide for RecA coprotease activity is GTG, with GGT a very close second and TGG a more distant third. As a final test, a GTG-repeating 18-mer and the top 18-mer sequence from in vitro selection, 5’-GCGTGTGTGGTGGTGTGC-3’ (14), were compared in a time course for the coprotease reaction (Figure 3.5C). The GTG-repeating sequence gives significantly higher coprotease activity, particularly at the earlier time points.
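
The site-counting rationale behind the time-course experiment (five versus four sites for 15-mers, three versus two for 9-mers) can be checked by counting how many complete copies of a candidate trinucleotide each oligonucleotide contains. The Python sketch below does this; the function names are hypothetical, and counting every complete occurrence regardless of frame is the assumption made here.

```python
def repeat_oligo(trinucleotide, length):
    """Trinucleotide-repeating oligonucleotide truncated to the given length."""
    return (trinucleotide * (length // 3 + 1))[:length]


def count_sites(oligo, site):
    """Number of complete occurrences of `site` within `oligo`."""
    width = len(site)
    return sum(
        1 for i in range(len(oligo) - width + 1) if oligo[i:i + width] == site
    )


# Number of complete TGG sites in each permutation, for 15-mers and 9-mers.
tgg_sites = {
    length: {
        start: count_sites(repeat_oligo(start, length), "TGG")
        for start in ("TGG", "GTG", "GGT")
    }
    for length in (15, 9)
}
```

If TGG were the ideal site, the TGG-repeating oligonucleotide would carry one more site than either of the other two permutations at both lengths, which is the 4 vs. 5 and 2 vs. 3 comparison described in the text.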

3.3.5 Dependence of RecA ATPase activity on DNA sequence

As RecA is a DNA-dependent ATPase, measurements of ATPase rate have commonly been used to indirectly assess the binding of RecA to DNA. As a next step, ATPase assays were carried out to determine if the dependence of RecA coprotease activity on DNA sequence that was observed above was also seen for RecA ATPase activity. A coupled spectrophotometric ATPase assay was used to compare the RecA ATPase activity in the presence of five different trinucleotide-repeating sequences that gave a wide range of % cleavage values (TGG, TTG, TTT, TCA, and CCA). Whereas 15-mer sequences did not give any detectable ATPase activity (data not shown), 48-mer sequences gave ATPase rates approaching that of M13 ssDNA (Figure 3.6), as has been observed previously (Bianco and Weinstock 1996). Surprisingly, TGG produced by far the lowest ATPase rate (9 min-1) of the five trinucleotide-repeating sequences examined, while CCA gave the highest rate (18 min-1). Thus, the sequence with the highest coprotease activity gives the lowest ATPase activity, and vice versa. In general, such an inverse correlation was observed, but did not hold exactly for all of the five sequences tested; TCA gives very low coprotease activity but only moderate ATPase activity. Thus, it can be concluded that both the ATPase and coprotease activities of RecA are highly dependent on the sequence of DNA to which RecA is bound, but that the preferred sequences for the two activities are remarkably different.

3.3.6 Dependence of RecA DNA-binding on DNA sequence

In order to directly examine the binding of RecA to different sequences of

ssDNA, a double filter-binding assay was employed (Wong and Lohman 1993). Briefly,

RecA and 32P end-labeled oligonucleotide were incubated and passed through two filters.

First a nitrocellulose filter retains protein and protein-bound oligonucleotide. The

unbound oligonucleotide that passes through is then trapped on a DEAE filter. The

oligonucleotide on each filter is quantified by phosphorimaging, and the % of DNA bound can be determined. 48-mer oligonucleotides of TGG and CCA-repeats were selected for this experiment due to their contrasting effects on the coprotease and ATPase activities of RecA. The binding experiments were done at ~10 nM oligonucleotide and 0-2 μM RecA. The resulting binding curves (Figure 3.7) show that RecA bound substantially better to the TGG-repeating 48-mer than to CCA. Whereas a maximum of about 83% of the TGG repeat was retained on the nitrocellulose filter, only about 35% of the CCA repeat was maximally retained. Moreover, the apparent KD estimated from these measurements is 91 nM for the TGG-repeating 48-mer and 246 nM for the CCA-repeating 48-mer. Similar attempts to obtain the binding curves for TTT, TCA, and TTG-repeating 48-mers were not successful due to the high level of binding of these oligonucleotides to the nitrocellulose filter in the absence of RecA.
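
The text does not say how the apparent KD and maximum % bound were extracted from the binding curves. One simple possibility, shown here purely as an assumption, is a hyperbolic model, % bound = Bmax x [RecA] / (KD + [RecA]), with free RecA approximated by total RecA. Because RecA binding is cooperative, a Hill-type model might describe real data better. The Python sketch below fits the hyperbolic model by scanning KD on a grid and solving for Bmax in closed form; all names are hypothetical, and the example points are synthetic values generated from the model, not measurements.

```python
def fit_hyperbola(reca_nM, pct_bound, kd_grid_nM=range(1, 2001)):
    """Least-squares fit of pct = Bmax * c / (KD + c) over a grid of KD values.

    For each trial KD the best Bmax has a closed form (linear least squares
    through the origin); the KD with the smallest squared error is returned.
    """
    best = None
    for kd in kd_grid_nM:
        x = [c / (kd + c) for c in reca_nM]
        sxx = sum(v * v for v in x)
        if sxx == 0:
            continue
        bmax = sum(y * v for y, v in zip(pct_bound, x)) / sxx
        sse = sum((y - bmax * v) ** 2 for y, v in zip(pct_bound, x))
        if best is None or sse < best[0]:
            best = (sse, kd, bmax)
    if best is None:
        raise ValueError("at least one nonzero RecA concentration is required")
    return best[1], best[2]


# Synthetic check only: points generated from KD = 91 nM and Bmax = 83%.
synthetic_conc = [0, 25, 50, 100, 200, 500, 1000, 2000]
synthetic_pct = [83.0 * c / (91.0 + c) for c in synthetic_conc]
fitted_kd, fitted_bmax = fit_hyperbola(synthetic_conc, synthetic_pct)
```

On noise-free synthetic input the scan recovers the KD and Bmax used to generate the points, which only confirms that the fitting routine behaves as intended.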

In conclusion, the double filter-binding assay shows that RecA binds preferentially to TGG-repeating sequences over CCA repeats. Thus, based on this

limited comparison, the sequence-dependence of RecA-DNA binding is apparently similar to that of RecA coprotease activity, while that of RecA ATPase activity is

different.

3.3.7 Basis of sequence specificity of coprotease activity

In the present study, a coprotease assay was used to systematically compare the

binding of RecA to all possible trinucleotide-repeating 15-mer oligonucleotides. The

coprotease activity of RecA is highly dependent on the DNA sequence: the % cleavage for each sequence was reproducible and varied from 90% to below 10%. Thus, the coprotease assay is a highly efficient and sensitive method for comparing the binding of

RecA to different sequences of ssDNA. The most significant result from this analysis is

that TGG-repeating sequences (including GGT and GTG) stand out as giving by far the

highest coprotease activity.

What is the physical basis for the strong preference for TGG-repeating sequences in promoting RecA coprotease activity? There are two possible reasons. First, TGG- repeating sequences could induce a particular conformation of the RecA filament, such as a precise degree of extension, that is just right for binding of the repressor and promoting

the self-cleavage reaction. Alternatively, the higher coprotease activity may simply

reflect the tighter binding of RecA to TGG-repeating sequences than to other sequences,

giving rise to a higher percent of RecA bound under the conditions of the assay. While the first possibility is intriguing, there is very little information regarding how different

sequences of ssDNA affect the structure of RecA. On the other hand, there is evidence

supporting the second possibility. First, in the binding experiment of Figure 3.7, the

TGG-repeating 48-mer, which gave the highest coprotease activity, clearly binds to RecA

better than the CCA-repeat, which gave low coprotease activity. Second, based on an in

vitro selection experiment (Tracy and Kowalczykowski 1996), TGG, GTG, and GGT

were among the most frequently occurring trinucleotides within the selected sequences.

In fact, for all trinucleotides there is a general correlation between the ability to activate

RecA coprotease activity and the frequency of occurrence in the selected sequences

(Figure 3.8). The in vitro selection experiment is based on binding: a limited amount of protein is mixed with an excess of DNA containing a mixture of different sequences, and the protein binds preferentially to certain sequences over others. The preferred sequences are amplified by PCR and subjected to another round of selection, so that after several cycles only the most preferred sequences remain in the mixture. It can therefore be concluded that the observed differences in coprotease activity seen for the sequences are very likely to be largely if not entirely due to differences in the extent of RecA binding.

Based on the time course of the coprotease reactions for the three different

permutations of TGG-repeats, it was determined that the best trinucleotide is GTG,

closely followed by GGT and more distantly TGG. Interestingly, in the in vitro selection

experiments, GTG was found to be the most frequently selected trinucleotide, both for E.

coli RecA (Tracy and Kowalczykowski 1996), in which case it was tied with TGG, and especially in the case of yeast Rad51, in which case it was significantly higher than GGT and TGG (Tracy et al. 1997a). Thus, together the three studies point to GTG as being the

most preferred trinucleotide.

3.3.8 Validity of using trinucleotide-repeating sequences

In the coprotease assay, we focused on comparing all possible trinucleotide-repeating sequences, for the reasons described above. How appropriate was this choice? Although it seems clear that one monomer of RecA binds to precisely three nucleotides of ssDNA (Figure 1.7, Tracy et al. 1997b, Volodin and Camerini-Otero 2002), it is conceivable that neighboring monomers within a filament interact with one another in such a way as to influence one another’s sequence specificity, thus giving rise to sequence preferences extending beyond the context of a trinucleotide. In the in vitro selection experiments, which selected an 18 nucleotide sequence flanked by two 18 nucleotide fixed sequences, precisely the same sequence, 5’-GCGTGTGTGGTGGTGTGC-3’, was clearly the most frequently selected, both for E. coli RecA and yeast Rad51 (Tracy and Kowalczykowski 1996, Tracy et al. 1997a). This would tend to support the idea that the specificity of the RecA-DNA interaction extends beyond the trinucleotide. It is possible, however, that the fixed 18 nucleotide flanking regions, which are needed for PCR amplification, somehow influenced the preferred nucleotides at positions within the selected region, particularly at the ends where non-GTG trinucleotides were found. Consistent with this latter interpretation, the GTG-repeating 18-mer gives slightly higher coprotease activity than an 18-mer corresponding to the top sequence from in vitro selection (Figure 3.5C), suggesting that the GTG-repeating 18-mer binds to RecA more tightly.

Another point to consider is that while it has been observed that RecA binds in a

phased manner with a period of three nucleotides to oligonucleotides containing the Chi

sequence (Volodin and Camerini-Otero 2002), the phasing of RecA on the sequences

used in this study has not been examined. Thus, it is possible, especially for the weaker

binding sequences, that RecA is not bound in homogeneous alignment for all molecules

within a given sample. Nonetheless, with these caveats in mind, since the trinucleotide is

the basic unit of ssDNA to which a monomer of RecA binds, it seems clear that

trinucleotide-repeating sequences are the most appropriate choice for determining the

sequence specificity of RecA in a comprehensive, systematic manner.

3.3.9 Basis of sequence specificity of binding

If the observed differences in coprotease activity are due to differences in binding,

then what is the physical basis for this? The binding of RecA to a particular sequence of

DNA could be influenced by (1) the intermolecular interactions between the amino acid residues within the three binding sites on RecA and the four different nucleotides of

DNA, and/or (2) some intrinsic property of the DNA itself, such as its propensity to be extended, form intramolecular secondary structure (folding) or intermolecular pairing interactions. The observation from in vitro selection experiments that both E. coli RecA and yeast Rad51 exhibit remarkably similar sequence preferences would tend to suggest that they are due to some intrinsic property of the DNA itself, although it is conceivable that both proteins have evolved to form interactions that favor the same sequences.

The sequence preferences seen in Figure 3.3 would tend to suggest that both intermolecular interactions between RecA and ssDNA as well as intrinsic properties of the DNA are at play. On the one hand, the fact that the sequences that give the lowest coprotease activity tend to have consecutive GC or AT pairs indicates that secondary

structure or intermolecular pairing does to some extent play a role in dictating binding affinity, at least for those sequences. On the other hand, the differences in coprotease activity cannot simply be accounted for by differences in secondary structure formation of the DNA or ability to be extended. For example, TTT-repeating sequences, which form minimal secondary structure and stacking interactions, are clearly not the most preferred sequences for RecA binding and coprotease activity. In addition, purine bases are known to resist the unstacking required for forming the extended conformation. Yet, the optimal sequences are not pyrimidine-rich, but instead contain two consecutive guanosines. In fact, GGG gave significantly higher coprotease activity than TTT or CCC, indicating that clearly the penalty for unstacking the purine bases does not play a dominant role in dictating the sequence specificity. Thus, it would appear that the atomic interactions between the amino acid residues of the binding sites on RecA and the individual nucleotides of ssDNA play a significant role in dictating the sequence specificity.

3.3.10 Contrasting preferences for ATPase and coprotease activities

Coprotease activity is high for TGG-repeating sequences and low for CCA-repeating sequences (90% and 17% cleavage, respectively). By contrast, the ATPase rate is highest for the CCA-repeating 48-mer, and lowest by far for the TGG-repeat (18 min⁻¹ and 9 min⁻¹, respectively). Since the coprotease activity correlates well with binding, as

seen in Figures 3.7 and 3.8, it is clear that ATPase rate does not. Thus, while the ATPase

activity of RecA is dependent on its being bound to DNA, apparently sequences that bind to RecA optimally actually prevent maximal ATPase activity. Conceivably, this could be due to an effect on release of the products of ATP hydrolysis (ADP and Pi) as opposed to

the hydrolysis event per se. In the active state of the filament, which is required for ATP

hydrolysis, ATP is bound at the interface between neighboring subunits (Conway et al.

2004, VanLoock et al. 2003, Wu et al. 2004). Release of ADP and Pi would require a

local subunit reorientation similar to that seen in the inactive compressed state (Story et

al. 1992), in which the ATP site is exposed toward the central axis of the filament.

Perhaps this conformational transition is somewhat restricted when RecA is bound to

TGG-repeating sequences.

In addition to the differences in sequence preference, the oligonucleotide length has contrasting effects on the ATPase and coprotease activities. Whereas coprotease activity is higher on 15-mers as compared to 48-mers, ATPase activity is high on 48-mers and undetectable on 15-mers. The dependence of ATPase activity on longer sequences has been well established (Bianco and Weinstock, 1996), and is generally attributed to the highly cooperative behavior of the DNA-dependent ATPase activity of RecA (Weinstock et al. 1981a). The observation that coprotease activity is actually higher on shorter sequences is less well documented. In the coprotease assay, the ratio of RecA to nucleotides of ssDNA was fixed at 1:3. Therefore, assuming complete binding, with 15-mer oligonucleotides there will be a greater number of filaments than with 48-mers. If the binding of repressor to a particular site on a filament prevents binding of additional repressor molecules to neighboring subunits, as is in fact suggested by EM studies of a

RecA-LexA complex (Yu and Egelman 1993), then a greater number of shorter filaments would therefore give rise to higher coprotease activity, as is observed. An alternative explanation, as has been noted (Wittung et al. 1997), is that longer DNA sequences

possess more nucleation sites for RecA, and thus have a higher potential to form

discontinuous filaments. This would be especially problematic in the presence of ATPγS

or ADP-AlF4, where there is less dissociation and redistribution of RecA monomers than

with ATP.
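The filament-counting argument above can be made concrete with a little arithmetic. The following Python sketch assumes complete binding at the stated 1:3 ratio of RecA to nucleotides; the total of 720 nucleotides and the function name are purely illustrative assumptions, not values from the assay.

```python
NT_PER_MONOMER = 3  # one RecA monomer per three nucleotides (ratio fixed in the assay)

def filament_stats(total_nt, oligo_len):
    """Return (number of filaments, monomers per filament), assuming complete binding."""
    n_filaments = total_nt // oligo_len          # one filament per oligonucleotide
    monomers_per_filament = oligo_len // NT_PER_MONOMER
    return n_filaments, monomers_per_filament

total_nt = 720  # hypothetical pool of nucleotides, divisible by both 15 and 48
short = filament_stats(total_nt, 15)
long_ = filament_stats(total_nt, 48)
print("15-mers:", short)
print("48-mers:", long_)
```

With the same amount of RecA and DNA, the 15-mers give roughly three times as many filaments as the 48-mers, which is the premise of the first explanation offered above.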

3.3.11 Biological implications

It has previously been shown that RecA binds preferentially to TGG-rich

sequences, that a TGG-repeating consensus pattern is found around Chi sites in the E.

coli genome, and that TGG-repeating sequences are more recombinogenic than other sequences (Tracy and Kowalczykowski 1996, Tracy et al. 1997a). The implications with regard to similar GT-rich sequences found in higher eukaryotes, such as Alu repeat elements, microsatellites, and the constant regions of immunoglobulin heavy chains have been discussed (Tracy and Kowalczykowski 1996, Tracy et al. 1997a). Two new observations that come out of our study are that (1) RecA coprotease activity is highest on TGG-repeating sequences, and (2) RecA ATPase activity is significantly lower on

TGG-repeating sequences than on other sequences.

Biologically, it makes sense that the coprotease activity should be highest on the

GT-rich sequences found around Chi sites where RecA filaments often originate, in order to provide a rapid and robust signal for LexA destruction and initiation of the SOS response. Similarly, high coprotease activity on short RecA filaments might allow for rapid LexA destruction, even for relatively short ssDNA regions formed at single strand gaps. It also makes sense that ATPase activity should be low on the TGG-rich sequences where filaments often originate, since ATP hydrolysis is not required for filament

formation and is in fact coupled to release of RecA monomers from the 5’-end of the filaments (Rosselli and Stasiak 1990, Bork et al. 2001). Thus, the low ATPase activity on

the TGG-rich sequences that was observed may be important for stabilizing initial

filament formation at Chi sites. It has previously been shown that RecA does in fact

dissociate more slowly from a GT-rich, in vitro-selected 18-mer sequence than its

complement, though this was not attributed to differences in ATPase rates (Tracy and

Kowalczykowski 1996).

ATP hydrolysis is necessary, however, for later stages of strand-exchange such as branch migration past regions of heterology (Rosselli and Stasiak 1991, Kim et al. 1992),

strand-exchange beyond about 3000 nucleotides or in a unidirectional 5’-3’ manner (Jain

et al. 1994), and recycling of RecA monomers at the end of strand exchange (Rosselli

and Stasiak 1990). All of these activities presumably occur on long filaments, away from

the GT-rich sites where filaments tend to originate. Thus, it appears that the sequence

preferences of the DNA-binding (Tracy and Kowalczykowski 1996), coprotease, and

ATPase activities of RecA may be finely tuned, together with a sequence bias at

recombination hotspots (Tracy et al. 1997b), to maximize efficiency of recombination in

vivo.

3.4 Conclusions

RecA coprotease activity is an effective means of probing the sequence specificity of the RecA-ssDNA interaction. TGG-repeating sequences, including GTG and GGT, induce particularly high coprotease activity. Based on RecA-ssDNA binding measurements and comparison to previous in vitro selection experiments, the enhanced ability of these sequences to induce RecA coprotease activity is attributed largely to their

efficient binding, as opposed to their stabilizing a particular conformational state. Contrasting sequence and length preferences are seen for RecA ATPase and coprotease activities, which can be rationalized in terms of their biological roles. Overall this study highlights properties of RecA-DNA filaments that are mechanistically informative.


Figure 3.1 A schematic representation of the filter binding assembly. The arrows indicate the path of the samples. The samples are blotted onto the slots. The plate with slots is kept on a nitrocellulose (NC) membrane, which is kept on a DEAE membrane. The protein, and hence the protein-DNA complex, will bind to the NC membrane, while the free DNA will bind to the DEAE membrane. For easy movement of the samples, vacuum is applied after loading the samples and each blot is washed with buffer to ensure the passage of all the sample into the slot. A typical blot from a filter binding assay is shown on the right. Notice that, as the protein concentration increases, the intensity of the band on the NC membrane increases and that on the DEAE membrane decreases. The % bound is calculated as (intensity of band on NC / (intensity of band on NC + intensity of band on DEAE)) x 100.
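The % bound calculation given in the caption can be written as a short function. This is a minimal Python sketch; the function name and the example intensities are hypothetical and are not taken from any blot in this work.

```python
def percent_bound(nc_intensity, deae_intensity):
    """% bound = NC / (NC + DEAE) x 100, as defined in the Figure 3.1 caption."""
    total = nc_intensity + deae_intensity
    if total <= 0:
        raise ValueError("total band intensity must be positive")
    return 100.0 * nc_intensity / total

# Hypothetical band intensities (arbitrary units) for one slot.
print(percent_bound(30.0, 70.0))
```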


Figure 3.2 RecA coprotease activity is highly dependent on the ssDNA sequence. The SDS-PAGE gel shows the self-cleavage of λcI93-236 to produce λcI112-236 in the presence of RecA, ADP-AlF4, and different trinucleotide-repeating 15-mers, for a 20-minute reaction as described in Materials and Methods. The first three lanes show that no cleavage is detected if RecA, ssDNA, or ADP is omitted from the reaction. Notice that cleavage is particularly high for GTG.


Figure 3.3 RecA coprotease activity in the presence of 64 trinucleotide-repeating 15-mers. The value in the table is the % of λcI93-236 cleaved in a 20 minute reaction at 25 °C, as described in Materials and Methods. Each reaction was performed in triplicate, and the mean and standard deviation are reported. (A) % Cleavage values for all 64 possible trinucleotide-repeating 15-mers, listed in order from high to low. (B) Mean % cleavage for the set of 24 non-redundant trinucleotides. The value in panel B is the average of those for the three different permutations (i.e. TGG, GTG, and GGT) of each trinucleotide from panel A. Notice that the TGG-repeating sequences give by far the highest coprotease activity.
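The grouping used in panel B can be reproduced by collecting the 64 trinucleotides into classes of cyclic permutations, which yields the 24 non-redundant sets. The Python sketch below is illustrative only: the function names and the choice of the alphabetically first rotation as class label are assumptions, and no measured values are included.

```python
from itertools import product

def cyclic_class(tri):
    """Label a trinucleotide by the alphabetically first of its three rotations."""
    return min(tri[i:] + tri[:i] for i in range(3))

# Group all 64 trinucleotides by cyclic-permutation class.
classes = {}
for bases in product("ACGT", repeat=3):
    tri = "".join(bases)
    classes.setdefault(cyclic_class(tri), set()).add(tri)

def mean_by_class(cleavage):
    """Average % cleavage over the members of each class (panel B style).

    cleavage: dict mapping each trinucleotide to its % cleavage (panel A style).
    """
    return {
        label: sum(cleavage[t] for t in members) / len(members)
        for label, members in classes.items()
    }

print(len(classes))            # number of non-redundant classes
print(sorted(classes["GGT"]))  # the TGG/GTG/GGT family
```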


Figure 3.4 DNA sequence specificity of RecA coprotease activity is not dependent on oligonucleotide length or type of ATP cofactor. (A) The % cleavage of cI93-236 in the presence of five trinucleotide-repeating sequences was determined for 15-mer and 48-mer oligonucleotides under otherwise identical conditions (as described in Figure 3.2). In all reactions there are 3 nucleotides of ssDNA per monomer of RecA. The error bars show the standard deviation from three separate reactions. The 15-mers (open bars) give slightly but significantly higher coprotease activity than the 48-mers (filled bars), but the sequence-dependence of RecA coprotease activity is the same. (B) The RecA coprotease activity is compared for reactions in the presence of ADP-AlF4 (open bars) and ATP (filled bars). All reactions are with 48-mer oligonucleotides. Since the cleavage is normally dramatically slower with ATP, reactions in the presence of ATP used a hypercleavable form of λ repressor (see text), but under otherwise identical conditions as reactions with ADP-AlF4.


Figure 3.5 Time course of RecA coprotease activity with three different TGG-repeating sequences and the top in vitro selected sequence. The percent cleavage of cI93-236 in the presence of TGG, GTG, and GGT-repeating sequences is plotted vs. time for (A) 15-mer oligonucleotides, and (B) 9-mer oligonucleotides. GTG and GGT give significantly higher coprotease activity than TGG. At early time points, GTG gives slightly higher coprotease activity than GGT. (C) Comparison of a GTG-repeating 18-mer and the top in vitro selected sequence, 5’-GCGTGTGTGGTGGTGTGC-3’ (SKBT18-mer; Tracy and Kowalczykowski 1996, Tracy et al. 1997a). The value at each time-point is the average of three separate reactions and the error bars show the standard deviations.


Figure 3.6 Dependence of RecA ATPase activity on the sequence of ssDNA. (A) Schematic diagram showing the coupling of oxidation of NADH to ATP hydrolysis by RecA. PK represents pyruvate kinase, LDH is lactate dehydrogenase and Pi is inorganic phosphate. (B) The rate of RecA ATP hydrolysis was determined in the presence of five different trinucleotide-repeating 48-mer oligonucleotides using a coupled spectrophotometric assay, as described in Materials and Methods. The ATPase rate is shown together with the coprotease activity from Figure 3.3. Notice that the coprotease activity is highest for TGG and lowest for CCA, while the opposite is true for the ATPase activity.


Figure 3.7 Binding of RecA to TGG and CCA-repeating 48-mer oligonucleotides. The amount of 32P end-labeled 48-mer bound to RecA as a function of RecA concentration was determined by the double filter-binding method (Wong and Lohman 1993). The % bound values reported are the average of three experiments and error bars indicate the standard deviation. Notice that RecA binds much more tightly to the TGG repeat than to the CCA repeat.


Figure 3.8 Correlation between RecA-DNA binding and coprotease activity for the 64 trinucleotides. “Frequency Selected” is the number of times each trinucleotide occurs within 24 in vitro-selected 18-mer sequences, divided by the total number of trinucleotide occurrences (taken from Table 3 of ref. Tracy and Kowalczykowski 1996). % Cleavage is the value for each trinucleotide-repeating 15-mer reported in Figure 3.3. Each point in the plot represents a single trinucleotide. Notice that there is a general correlation between trinucleotides that give high coprotease activity and those that are frequently selected.
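The "Frequency Selected" quantity can be illustrated by counting trinucleotides within a set of sequences and normalizing by the total. The Python sketch below assumes a sliding window of one nucleotide, which may differ from the tabulation in the cited reference, and it uses only the single top selected sequence quoted in this chapter rather than the full set of 24.

```python
from collections import Counter

def trinucleotide_frequencies(sequences):
    """Fraction of all trinucleotide windows accounted for by each trinucleotide."""
    counts = Counter()
    for seq in sequences:
        for i in range(len(seq) - 2):      # assumed: overlapping windows, step of 1
            counts[seq[i:i + 3]] += 1
    total = sum(counts.values())
    return {tri: n / total for tri, n in counts.items()}

top_selected = "GCGTGTGTGGTGGTGTGC"  # top in vitro selected 18-mer quoted in the text
freqs = trinucleotide_frequencies([top_selected])
for tri, f in sorted(freqs.items(), key=lambda kv: -kv[1]):
    print(tri, round(f, 3))
```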

CHAPTER 4

TESTING THE INVERSE STRAND EXCHANGE

MECHANISM OF DEINOCOCCUS RADIODURANS RECA

4.1 Introduction

Deinococcus radiodurans (Dr) is a gram-positive bacterium that has the remarkable ability to recover from extensive DNA damage caused by extreme doses of ionizing radiation and other mutagenic agents (Minton, 1994). Dr is thus of great interest for studying biological DNA repair mechanisms, which when defective can lead to cancer and aging-related diseases such as Bloom’s and Werner’s syndromes. Dr is also of significant interest for bioremediation of radioactive waste because of its innate ability to live in extreme environments.

dsDNA breaks are among the most detrimental types of DNA damage caused by radiation exposure, and homologous recombination is one of the ways by which cells repair dsDNA breaks (Figure 1.1). RecA promotes the central DNA strand exchange step of homologous recombination. In all the organisms characterized to date, strand exchange follows a particular order: RecA monomers polymerize on the ssDNA produced from the dsDNA break and form a nucleoprotein filament. This nucleoprotein filament locates a homologous dsDNA and carries out the strand exchange reaction

(Figure 1.1). The CTD of EcRecA protein is thought to play a key role in the strand exchange reaction by binding to the dsDNA and facilitating its entry into the initial

RecA-ssDNA filament (Kurumizaka et al., 1996). In the eukaryotic protein Rad51, the NTD, which is analogous to the CTD of E. coli RecA, binds to dsDNA (Aihara et al.,

1999).

The mechanism by which Dr repairs extensive DNA damage is multifaceted and poorly understood. It is documented that RecA is essential for the survival of Dr after radiation exposure (Zahradka et al., 2006). The RecA protein from Dr differs from that of other organisms in that it promotes the strand exchange reaction by an unprecedented inverse pathway: Dr RecA first forms a filament on duplex DNA and then brings in a separate ssDNA for homologous recombination (Figure 1.9; Kim and Cox, 2002). It is thought that the inverse strand exchange mechanism plays an important role in helping Dr survive extreme radiation doses. It is interesting to note that Ec RecA, which follows the normal strand exchange pathway, cannot complement the RecA activity in a Dr recA- strain, while other DNA repair proteins such as UvrA and DNA polymerase I can be complemented by the Ec versions in the corresponding mutant strains of Dr (Makarova et al., 2001).

The x-ray crystal structure of Dr RecA had been determined in order to learn more about its unusual mechanistic features and its key role in an extremely efficient DNA repair system (Rajan and Bell, 2004). Although the overall structure of Dr RecA is similar to that of Ec RecA, there are two key differences. First, in Dr RecA, there is an increased accumulation of positive charge along the central axis of the filament, which is the binding site for the first DNA molecule (Figure 2.5). This is consistent with the

observed preference of Dr RecA for polymerizing on dsDNA, which has twice the

negative charge. The increased positive charge noticed along the central axis of the Dr

RecA filament could enhance its binding to dsDNA compared to ssDNA. Second, there is

a large reorientation of the CTD of Dr RecA compared to that of Ec RecA. There are

important differences near the DNA binding patch in the CTD while comparing Ec and

Dr RecA (Figure 2.7, Rajan and Bell, 2004). This latter observation is interesting in light

of the proposed role of the CTD in binding to the second DNA substrate and bringing it

in to the nucleoprotein filament for pairing and strand exchange.

Based on these facts, a model has been proposed for the strand exchange reaction in Dr RecA. According to this model, Dr RecA monomers will polymerize preferentially on dsDNA and the dsDNA will bind along the central axis of the filament. The CTD will bind to the complementary ssDNA and bring it into the Dr RecA-dsDNA filament for the strand exchange reaction (Figure 4.1). This model makes several predictions about the

DNA binding properties of Dr RecA as compared to Ec RecA. These predictions include

(1) full length Dr RecA binds preferentially to dsDNA instead of ssDNA, and (2) the isolated CTD of Dr RecA binds preferentially to ssDNA.

4.2 Materials and Methods

4.2.1 Purification of Deinococcus radiodurans RecA (native)

For purifying the native form of Dr RecA, the gene was cloned into pET9a vector

in between NdeI and BamHI restriction sites and expressed using BL21AI cells. Six 1 L cultures were grown at 37 °C in LB broth, induced at OD600 = 0.6 with 0.2% arabinose, and incubated for an additional four hours at 37 °C. The cells were pelleted and stored at –80 °C. The cells were thawed and resuspended in sucrose lysis buffer (50 mM Tris, 25% sucrose, pH 7.5), which contained 0.1 mg/ml PMSF, 1 μg/ml each of leupeptin and

pepstatin A, and 1 mg/ml lysozyme and incubated on ice for one hour. This was followed

by sonication and the cell lysate was centrifuged at 18,000 rpm for 30 minutes at 4 °C.

Solid ammonium sulfate (0.37g/ml) was added to the supernatant, stirred at 4 °C for 30

minutes, and spun at 15,000g for 30 minutes. The pellet was solubilized in 20 mM Tris,

100 mM NaCl pH 8.0 (Buffer A) and dialyzed against the same buffer overnight. The

next day, the solution was centrifuged and the clear supernatant was loaded on to the

DEAE anion exchange column and eluted with a salt gradient of 0.1 to 1 M NaCl. The

fractions containing Dr RecA were dialyzed into buffer A, loaded onto QHP column, and

eluted with a similar salt gradient. The fractions from QHP were subsequently purified by heparin affinity and resource Q columns using buffer A and the same salt gradient.

Finally, Dr RecA was dialyzed into the storage buffer (20 mM Tris, 100 mM NaCl, 1 mM DTT, pH 8.0), concentrated, aliquoted, and stored at -80 °C. The concentration of the protein was determined by O.D. at 280 nm using an extinction coefficient of 11,920 M⁻¹ cm⁻¹, which was calculated from the amino acid sequence. Nuclease assays were carried out

to make sure the protein was free of ds and ss exonucleases.
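Extinction coefficients of this kind are commonly estimated from amino acid composition. The Python sketch below assumes one widely used set of per-residue values at 280 nm (5,500 for Trp, 1,490 for Tyr, and 125 per cystine, in M⁻¹ cm⁻¹). The text does not state which values were used here, and published tables differ slightly, so the function name, the defaults, and the toy sequence are illustrative assumptions only.

```python
def extinction_coefficient_280(sequence, n_cystines=0,
                               eps_trp=5500, eps_tyr=1490, eps_cystine=125):
    """Estimate the molar extinction coefficient at 280 nm (M^-1 cm^-1).

    Sums assumed per-residue contributions of Trp (W), Tyr (Y), and
    disulfide-bonded cystines; other residues are taken to contribute nothing.
    """
    seq = sequence.upper()
    return (eps_trp * seq.count("W")
            + eps_tyr * seq.count("Y")
            + eps_cystine * n_cystines)

# Toy peptide, not a RecA sequence: one Trp and two Tyr.
print(extinction_coefficient_280("MAWGYKLYE"))
```

The protein concentration then follows from the Beer-Lambert law, c = A280 / (ε × path length).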

4.2.2 Purification of Escherichia coli RecA (native)

The Ec recA gene was cloned into pET9a vector using NdeI and BamHI

restriction sites and expressed using BL21AI cells. Six 1 L cultures were grown at 37 °C in

LB broth, induced at OD600 = 0.6 with 0.2% arabinose, and incubated for an additional

four hours at 37°C. The cells were pelleted and stored at –80 °C. The cells were thawed

and resuspended in sucrose lysis buffer (50 mM Tris, 25% sucrose pH 7.5), which

contained 0.1 mg/ml PMSF, 1 μg/ml each of leupeptin and pepstatin A, and 1 mg/ml lysozyme, and incubated on ice for one hour. This was followed by addition of 25 mM EDTA and sonication, and the cell lysate was centrifuged at 18,000 rpm for 30 minutes at 4 °C. The supernatant was precipitated by the addition of 0.5% polyethyleneimine at pH 7.5 and stirred at 4 °C for 30 minutes. The solution was centrifuged at 12,000 g for 20 minutes. The pellet, which contained the required protein, was washed with R buffer (20 mM Tris, 0.1 mM EDTA, 10% glycerol, pH 7.5) containing 50 mM ammonium sulfate and then extracted with R buffer containing 300 mM ammonium sulfate. This solution was centrifuged at 12,000 g for 20 minutes and the resultant supernatant contained RecA.

Solid ammonium sulfate (0.37g/ml) was added to the supernatant, stirred at 4 °C for 30 minutes, and spun at 15,000g for 30 minutes. The resulting pellet was washed twice with

R buffer containing 2.1M ammonium sulfate and finally suspended in R buffer with 100 mM KCl and dialyzed overnight in the same buffer. The next day, the solution was centrifuged and the clear supernatant was loaded on to the QHP anion exchange column and eluted with a salt gradient of 0.1 to 1 M KCl. The fractions containing RecA were dialyzed into buffer A (20 mM Tris, 0.1 mM EDTA, 10% glycerol, 100 mM NaCl, pH

7.5) and purified subsequently by heparin affinity, resource Q and Mono Q columns with a salt gradient of 0.1 to 1M NaCl. Finally, the fractions containing Ec RecA were

dialyzed into the storage buffer (20 mM Tris, 100 mM NaCl, 1 mM DTT, pH 7.5),

concentrated, aliquoted, and stored at -80 °C. The concentration of the protein was

determined by O.D. at 280 nm using an extinction coefficient of 20,340 M⁻¹ cm⁻¹, which was

calculated from the amino acid sequence. Nuclease assays were carried out to make sure

the protein was free of ds and ss exonucleases.

4.2.3 Purification of the isolated CTD of Dr RecA and Ec RecA

The CTD of Ec RecA (residues 268-330) and the corresponding region of Dr

RecA (residues 281-344) were cloned into pET9a vector using NdeI and BamHI restriction sites and expressed using BL21AI cells. The cells were grown and harvested as described for the native Dr RecA protein. The cells were resuspended in sucrose lysis buffer (50 mM Tris, 25% sucrose, 5 mM β-mercaptoethanol, 5 mM EDTA) at pH 7.5 for

DrRecA CTD and at pH 6.5 for EcRecA CTD. Protease inhibitors (0.1 mg/ml PMSF, 1

μg/ml each of leupeptin and pepstatin A) and 1 mg/ml lysozyme were added to the sucrose lysis buffer, and incubated on ice for one hour. The cells were sonicated and the cell lysate was centrifuged at 18,000 rpm for 30 minutes at 4 °C. The supernatant was precipitated by the addition of 0.5% polyethyleneimine at pH 7.5, stirred at 4 °C for 30 minutes. The solution was centrifuged at 12,000 g for 20 minutes. The protein was present in the supernatant. The supernatant was dialyzed into 20 mM Tris, 100 mM NaCl, pH 7.4 and loaded on to DEAE column (anion exchange). The protein did not bind to the

DEAE column. The flow through from the DEAE column was loaded onto heparin and eluted with a salt gradient of 0.1-1M NaCl. After the heparin column, the protein was dialyzed into 20 mM Tris, 100 mM NaCl, pH 7.4 and loaded onto MonoS cation exchange column. The protein bound to MonoS and was eluted with a linear salt gradient of 0.1-1M NaCl. Finally, the protein was dialyzed into the storage buffer (20 mM Tris,

100 mM NaCl, 1 mM DTT, pH 7.4), concentrated, aliquoted, and stored at -80 °C. For the purification of EcRecA CTD the following columns were used sequentially; QHP

(anion exchange), SPHP (cation exchange), heparin and MonoS (cation exchange). The protein did not bind to QHP or heparin, and hence the flow-through was loaded onto the subsequent column. The concentrations of the proteins were determined by O.D. at 280 nm using extinction coefficients of 4,470 M⁻¹ cm⁻¹ for DrRecA CTD and 15,470 M⁻¹ cm⁻¹ for EcRecA CTD, which were calculated from their amino acid sequences. Nuclease assays showed that there was ds and ss exonuclease contamination in both protein preparations.

4.2.4 Nuclease Assay

The nuclease assay was carried out to detect the presence of ss or ds exonuclease

in the protein preparation. The reaction buffer included 20 mM Tris pH 7.5, 10 mM

MgCl2, 0.05 mg/ml BSA, 1 mM DTT, pH 7.5. 15 μM protein was incubated with 40 μM

nt DNA in the above buffer at 37 °C for 2 hours. A 39-mer oligonucleotide was used for

the ssDNA exonuclease assay, while pUC19 digested with BamHI was used for the

dsDNA exonuclease assay. The reaction was stopped by adding 1% SDS to the samples

and incubating at 37 °C for 10 minutes. The ssDNA samples were electrophoresed on a

15% non-denaturing polyacrylamide gel using 0.5 x TBE (44.5 mM Tris, 44.5 mM boric

acid, 1 mM EDTA) with 3 mM MgCl2. The bands were visualized by staining with

SYBR gold. The dsDNA samples were electrophoresed on a 0.8% agarose gel and

visualized by ethidium bromide staining. The nuclease activity was observed as a

reduction in the intensity of the band corresponding to the full length dsDNA or ssDNA

(Figure 4.2).

4.2.5 Gel Shift Assay

The 20 μl reaction mixture contained 0.5 μM molecules of 32P labeled DNA, 20

mM Tris pH 7.4, 10 mM KCl, 10 mM MgCl2, 0.5 mM EDTA, 0.5 mM ATPγS, 1 mM

DTT, 0.1 mg/ml BSA. The protein concentrations used were 0, 3, 6, 9, 14, and 20 μM.

The reaction mixture was incubated at 37 °C for 30 minutes, after which 1/6th volume of loading dye (20% glycerol, 0.12% bromophenol blue, and 0.12% xylene cyanol) was

added. The samples were electrophoresed on a 1.2% agarose gel using 0.5 x TBE (44.5

mM Tris, 44.5 mM boric acid, 1 mM EDTA) buffer supplemented with 3 mM MgCl2 and

run at 60 V for 3 hours. The gel was then dried and autoradiographed.

4.2.6 Strand Exchange Assay

The reaction mixture contained 25 mM Tris acetate pH 7.5, 10 mM Mg acetate, 5

% glycerol, 1 mM DTT, 0.1 mg/ml BSA, 3 mM potassium glutamate, 30 U/ml creatine

phosphokinase, 10 mM phosphocreatine, 3mM dATP, 21 μM nt of ΦX174 ssDNA, 21

μM nt of ΦX174 dsDNA (ΦX174 dsDNA was linearized by digesting with PstI), 7 μM

RecA, 2.1 μM Ec SSB. The reaction components except dATP, SSB and dsDNA were

mixed first. This was followed by the addition of dATP and SSB and incubation at 37 °C

for 10 minutes. The strand exchange reaction was then initiated by the addition of

dsDNA. 10 μl aliquots were taken at 0, 10, 20, 40, and 60 minutes time points, and

quenched by adding 1/3rd volume of stop buffer (60 mM EDTA, 5% SDS, 6.25%

glycerol, 0.05% bromophenol blue). The samples were incubated at 42 °C for 30 minutes

and electrophoresed on 0.5% agarose gel. The gel was run overnight at 2V/cm in 1x TAE

(40 mM Tris, 20 mM Acetic acid, 1 mM EDTA pH 8.0) buffer at 4 °C. The bands were

visualized by staining the gel in 1x TAE buffer containing 0.1% ethidium bromide for 30

minutes, followed by washing in water for 2 hours.

4.2.7 Double Filter binding assay

The experimental setup is the same as described in Chapter 3. The reactions were

carried out in 20 mM Tris pH 7.4, 10 mM KCl, 10 mM MgCl2, 0.5 mM EDTA, 0.5 mM

ATPγS, 5 mM DTT, 0.1 mg/ml BSA. For the isolated CTDs, the same buffer without ATPγS was used. For preparing radiolabeled dsDNA for the experiments, a control

annealing reaction was done where 1 μl of the radiolabeled ssDNA at 50,000 cpm count

was titrated with increasing amounts of reverse strand. This mixture was heated at 95 °C

for 15 minutes and cooled overnight. The reaction mixture was loaded on a gel and the

exact concentration of reverse strand required for shifting the position of the radioactive

band (dsDNA will migrate slower in the gel than the ssDNA) was noted. The required

amount of radiolabeled dsDNA was prepared based on the exact amount of forward

(radiolabeled) and reverse strand concentrations obtained from this experiment. When higher concentrations of radiolabeled DNA were required, cold DNA was mixed with hot

DNA and then this mixture was used for the reaction.

4.2.8 ATPase assay

The DNA-dependent ATPase activity of Ec RecA in the presence of selected 48-mer trinucleotide-repeating sequences was measured by a coupled spectrophotometric

assay (Morrical et al. 1986). The 500 μL reaction mixture contained 25 mM Tris pH 7.4,

10 mM MgCl2, 1 mM DTT, 2 mM dATP, 2 mM PEP, 0.5 mM NADH, 30 units/mL each

of pyruvate kinase and lactic dehydrogenase, 0.8 μM RecA, and 3 μM oligonucleotide.

The reactions were carried out at 37 ºC with an electrically heated cell holder. All

reaction components except RecA were pre-mixed, and reactions were initiated by

mixing in RecA. The O.D. at 340 nm was monitored at one-minute intervals using an

Ultrospec 2100 pro UV/VIS spectrophotometer (Amersham Biosciences). The ATPase

rate (min⁻¹) was calculated using the formula: (-dA/dt) × (1/6300 M⁻¹ cm⁻¹) × (reaction volume) / [RecA], where dA/dt is the slope in the linear region of the plot of OD340 vs. time (min).
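The rate calculation can be sketched in Python as a least-squares slope followed by a unit conversion. This sketch assumes a 1 cm path length and that RecA is expressed as a molar concentration, in which case the reaction-volume factor in the formula above cancels. The function name and the absorbance readings are synthetic and are not data from this work.

```python
def atpase_rate_per_min(times_min, a340, reca_molar, epsilon=6300.0, path_cm=1.0):
    """Turnover (min^-1) from the decrease in A340 in a coupled NADH assay.

    times_min, a340: readings from the linear region of the trace.
    reca_molar: RecA concentration in mol/L (assumed interpretation of [RecA]).
    epsilon: NADH extinction coefficient used in the text (M^-1 cm^-1).
    """
    n = len(times_min)
    mean_t = sum(times_min) / n
    mean_a = sum(a340) / n
    sxx = sum((t - mean_t) ** 2 for t in times_min)
    sxy = sum((t - mean_t) * (a - mean_a) for t, a in zip(times_min, a340))
    slope = sxy / sxx                          # dA/dt in absorbance units per minute
    nadh_rate = -slope / (epsilon * path_cm)   # mol/L of NADH oxidized per minute
    return nadh_rate / reca_molar              # ATP hydrolyzed per RecA per minute

# Synthetic trace: A340 falls by 0.0504 per minute with 0.8 uM RecA.
times = [0, 1, 2, 3, 4]
readings = [3.15 - 0.0504 * t for t in times]
rate = atpase_rate_per_min(times, readings, reca_molar=0.8e-6)
print(round(rate, 2))
```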

4.2.9 Surface Plasmon Resonance

The SPR experiments were carried out in the Biacore 2000 instrument. The

various steps involved in the experiment are mentioned below.

4.2.9.1 Preparation of DNA

83-mer DNA (5’-biotin-CTACTGCGCCAGAACGCGCCAGGGCGTCACAGATTTCCAGTGCCTGCTCGCTGTCACACTGGGAGCACAGCAGGTTGTCGATA-3’) whose sequence corresponded to that

used in earlier SPR studies (Kelley De Zutter and Knight, 1999) was used for the present

SPR experiments. For making dsDNA, a non-biotin labeled reverse strand of the 83-mer

strand mentioned above was purchased. Annealing was carried out by mixing a slight

excess (5-10%) of reverse strand, incubation at 75 °C for 15 minutes, followed by

overnight cooling. The excess reverse strand ensures that all the biotin labeled DNA is

double stranded and thus prevents the binding of ssDNA to the flowcell intended for

dsDNA labeling. 10 μM stock of ssDNA and 5 μM stock of dsDNA were prepared and

spin filtered through amicon ultrafree-MC spin filter (10,000 g for 2 minutes). The

concentration of the DNA was checked at O.D. 260 after spin filtering. 1000 ng/ml of

ssDNA and 2000 ng/ml dsDNA were prepared in HBS buffer (10 mM HEPES, 150 mM

NaCl, 3 mM EDTA, pH 7.4) for labeling the biosensor chip.
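The sequence of the unlabeled reverse strand follows directly from the biotinylated 83-mer given above. The Python sketch below derives it as the reverse complement and checks the length; the variable and function names are illustrative, and the forward strand is split into three pieces only for readability.

```python
# Forward (biotinylated) strand, written 5' to 3', as given in the text.
FORWARD = (
    "CTACTGCGCCAGAACGCGCCAGGGC"
    "GTCACAGATTTCCAGTGCCTGCTCGCTGTCACACTGGG"
    "AGCACAGCAGGTTGTCGATA"
)

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence, written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]

reverse_strand = reverse_complement(FORWARD)
print(len(FORWARD))
print(reverse_strand)
```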

4.2.9.2 Preparation of the sensor chip

The biosensor chip CM5 was purchased from Biacore. The chip has a

carboxymethylated dextran matrix covalently attached to a gold film. There are four individual cells, the flow cells, in the chip. The labeling was done using the HBS buffer

containing 0.005% Tween 20. The buffer was filtered through a 0.2 μm filter and degassed thoroughly before use. The chip was activated by passing 25 μl of 50 mM N-hydroxysuccinimide (NHS) and 200 mM N-ethyl-N’-(dimethylaminopropyl)carbodiimide

(EDC) mixture at a flow rate of 5 μl/min. NHS and EDC were mixed just before use. This

was followed by the injection of 40 μl of 100 μg/ml of streptavidin in 10 mM sodium

acetate pH 4.5 at a flow rate of 5 μl/min. The streptavidin binds to the activated chip

surface. Any unlabeled chip surface was blocked by passing 30 μl of 3 M ethanolamine at

pH 9.0. The biotin labeled DNA was then passed (25 – 40 μl depending on the density

required) at a slow flow rate of 2 μl/min to ensure good binding to streptavidin. If the

required label density (200 – 400 RU) was not achieved, the DNA was injected once

more. The lifetime of one chip was 2-3 weeks, after which the DNA dissociated from the

chip. Out of the four flow cells, one was unlabeled (the reference cell), two were labeled

with two different concentrations of ssDNA, and the last one was labeled with dsDNA.

For the preparation of the reference cell, NHS-EDC, streptavidin, and ethanolamine were

injected, but not the DNA.

4.2.9.3 SPR experimental setup

The flow buffer (20 mM Tris, 138 mM NaCl, 27 mM KCl, 5 mM MgCl2,

0.005% Tween 20, pH 7.5) was filtered (0.2 μm) and degassed before use. The protein samples were prepared in the flow buffer to which 0.5 mM ATPγS was added.

The protein was prepared at 50 μM concentration (100 μl), spin filtered, and the concentration was checked by absorbance at 280 nm. This protein stock was diluted with the sample buffer to get various concentrations (5, 10, 20, 30, 40, 60, 80, 100, 150,

200, and 250 nM), which were then injected through the flow cells. A few samples without the protein were also included in the experiment to account for the non-specific

binding of the buffer components to the chip. A total volume of 250 μl sample was

injected at a flow rate of 10 μl/min. After injecting the protein, the chip was washed with

the flow buffer to observe the dissociation of the bound protein. The chip surface was then regenerated by injecting 10 μl of 2 M MgCl2.
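
The injection volumes and flow rates quoted in sections 4.2.9.2 and 4.2.9.3 determine how long each solution is in contact with the chip surface. The Python sketch below works out these contact times by dividing volume by flow rate. The helper name and the step labels are illustrative, and the numbers are the ones given in the text.

```python
def injection_minutes(volume_ul, flow_ul_per_min):
    """Contact time in minutes for an injection of the given volume and flow rate."""
    if flow_ul_per_min <= 0:
        raise ValueError("flow rate must be positive")
    return volume_ul / flow_ul_per_min


# (volume in ul, flow rate in ul/min) as quoted in sections 4.2.9.2 and 4.2.9.3
injections = {
    "NHS/EDC activation": (25, 5),
    "streptavidin": (40, 5),
    "biotinylated DNA, lower bound": (25, 2),
    "biotinylated DNA, upper bound": (40, 2),
    "RecA sample": (250, 10),
}

contact_times = {
    name: injection_minutes(volume, flow)
    for name, (volume, flow) in injections.items()
}
for name, minutes in contact_times.items():
    print(f"{name}: {minutes:g} min")
```

The 25 minute association phase for each RecA sample is consistent with the slow flow rate having been chosen to bring binding as close to saturation as possible.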

4.2.9.4 SPR data analysis

The BIAevaluation 3.2 RC1 software was used for analyzing the binding data.

The data were subjected to both kinetic and steady state analysis. For kinetic analysis, both

the association and the dissociation phases were fit simultaneously. This fit gave values

for the on (ka) and off (kd) rates, and the association (KA) and the dissociation (KD) constants

were calculated from the on and off rates. For steady state

analysis, the maximum RU (Rmax) for the association phase was plotted against the

protein concentration and fit using the Hill equation, $(R_{max} \times [\text{protein}]^n) / (K_A + [\text{protein}]^n)$,

where [protein] is the protein concentration and n is the Hill coefficient.
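
The behavior of the Hill equation used for the steady state analysis can be illustrated with a short Python sketch. The protein concentrations are those listed in section 4.2.9.3. The values of Rmax, KA, and n are hypothetical, chosen only to show the shape of the curve, and are not fitted parameters from this study.

```python
def hill_response(protein, r_max, k_a, n):
    """Steady state response from the Hill equation as written in the text."""
    p_n = protein ** n
    return r_max * p_n / (k_a + p_n)


def half_saturation(k_a, n):
    """Protein concentration at which the response equals r_max / 2."""
    return k_a ** (1.0 / n)


CONCENTRATIONS_NM = [5, 10, 20, 30, 40, 60, 80, 100, 150, 200, 250]

# Hypothetical parameters for illustration only.
R_MAX = 1000.0      # RU at saturation
N = 2.0             # Hill coefficient
K_A = 100.0 ** N    # constant in the denominator, in nM^n

responses = [hill_response(c, R_MAX, K_A, N) for c in CONCENTRATIONS_NM]
for conc, ru in zip(CONCENTRATIONS_NM, responses):
    print(f"{conc:>4} nM -> {ru:7.1f} RU")
```

With n greater than 1 the curve is sigmoidal, which is the signature of cooperative binding that the Hill coefficient is meant to capture.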

4.3 Results and Discussion

4.3.1 Full length Ec and Dr RecA

4.3.1.1 Protein purification and characterization

Both Dr and Ec full length proteins were purified in their native form as

mentioned in the materials and methods section. The nuclease assays revealed that both

Dr and Ec RecA preparations were free of nuclease contamination. The purified proteins were characterized biochemically by the electrophoretic mobility shift assay (EMSA/gel shift assay) and the DNA strand exchange assay. EMSA was carried out to analyze the

pattern of binding of Ec and Dr RecA to different types of DNA, such as ss, ds,

hairpin, and ds with overhang DNA. A previous study from the laboratory has shown that

Ec RecA formed tighter complexes with hairpin DNA when compared to ss or dsDNA

(unpublished data). The present study showed that, for both Ec and Dr RecA, the protein-

DNA binding reached saturation in the presence of hairpin or overhang DNA. This is

evident from the absence of the band corresponding to the free oligonucleotide. Ec

RecA also formed comparatively more stable protein-DNA complexes than Dr RecA, as shown by the

smaller extent of streaking in the different lanes (Figure 4.3).

An in vitro DNA strand exchange reaction was carried out using Dr and Ec RecA.

When RecA is incubated with ΦX174 ss and ΦX174 dsDNA, RecA will promote the

formation of the joint molecule between the ss and ds DNA. This is followed by the

exchange of the complementary strand of the dsDNA to the ssDNA leading to the

formation of the nicked circular dsDNA product (Figure 4.4). The joint molecule and

nicked circular dsDNA have lower mobility on a gel, when compared to ds or ssDNA

(Figure 4.4). It is evident from the figure that both Dr and Ec RecA can promote the

strand exchange reaction. It appears that, under the particular reaction conditions, Dr

RecA was able to form the products at earlier time points and also to a fuller extent, when

compared to Ec RecA.

4.3.1.2 Filter Binding Assay

The filter binding assay was used for a direct analysis of the interaction between

protein and DNA. RecA was incubated with 32P labeled DNA in the presence of ATPγS at 37 °C for one hour and the reaction mixture was blotted on the filter membrane. The

amount of protein-DNA complex formed was higher with ssDNA for both Ec and Dr

RecA. At lower protein concentrations, Dr RecA formed higher amounts of protein-DNA complex with ss and ds DNA compared to Ec RecA (Figure 4.5).

A competitive filter binding assay was designed to analyze the DNA binding of

the full length Dr and Ec RecA. For this, the proteins were incubated with 100 nM of 32P

labeled ss or dsDNA for 30 minutes, and cold competitor DNA was then added to the reaction mix. For example, a mixture of RecA and labeled ssDNA was competed with cold dsDNA.

The aim was to quantify the amount of labeled DNA that dissociates from the initial protein-labeled DNA complex upon addition of increasing concentrations of the cold competitor. This dissociation is seen as a reduction in the intensity of the radioactive band obtained from the initial protein-DNA complex. The competitive binding assay showed that both Ec and Dr RecA form tighter complexes with ssDNA (Figure 4.6): less labeled ssDNA dissociated from the RecA-labeled ssDNA complex upon addition of cold dsDNA than labeled dsDNA dissociated from the RecA-labeled dsDNA complex upon addition of cold ssDNA.

Another interesting observation is that, in the presence of a low concentration of cold ssDNA, the binding of RecA to labeled dsDNA is enhanced for both Ec and Dr RecA. It has been documented that the binding of Dr RecA to dsDNA increases if low concentrations of ssDNA are present in the reaction mixture (Kim et al., 2002).

The lag phase present for the binding of Dr RecA to dsDNA is reduced greatly if ssDNA is present in the reaction mixture.

Thus, the filter binding experiments do not reveal enhanced binding of Dr RecA to dsDNA as was predicted based on the inverse strand exchange pathway. Dr RecA is

essentially identical to Ec RecA in its DNA binding properties. Taken together, the

filter binding and gel shift assays show that both Ec and Dr RecA form tighter

complexes with ssDNA than with dsDNA.

4.3.1.3 ATPase assay

Like all the RecA proteins characterized, Dr RecA has a DNA dependent ATPase activity. Dr RecA can hydrolyze dATP faster than ATP (Makarova et al., 2001). The

ATPase assay was carried out using Dr and Ec RecA and the rate of hydrolysis of dATP

was measured under different experimental conditions. The rate of dATP hydrolysis by

Dr and Ec full length RecA in the presence of ΦX174 ss and dsDNA was measured. Both

Dr and Ec RecA had higher ATPase rates in the presence of ssDNA than with dsDNA

(Figure 4.7 A and B). Previously, the ATPase assay was conducted with various 48-mer

triplet repeating ssDNA sequences to analyze the sequence dependence of ATPase

activity for Ec RecA (Chapter 3). A similar set of experiments was conducted using Dr

RecA and three selected 48-mer triplet repeating sequences (Figure 4.7 C and D). It is

interesting to note that Dr RecA exhibited a similar sequence preference as Ec RecA for

the ATPase activity. The TGG repeating sequence gave the lowest ATPase rate, while

CCA and TCA repeating sequences gave higher ATPase rates. Earlier (Chapter 3), it was

concluded that ssDNA sequences which bound tighter to RecA (based on the coprotease

and filter binding assays) had lower ATPase activity and vice versa. So it can be inferred

that, like Ec RecA, Dr RecA also binds preferentially to TGG repeating sequences.

4.3.1.4 Surface Plasmon Resonance (SPR)

SPR is a powerful technique for analyzing the interactions between two

molecules. SPR is useful for determining the kinetic rate constants and also the equilibrium constants of a binding reaction. It is a continuous flow system, where one of the substrates (termed the ligand) is immobilized on a gold coated sensor chip, while the other substrate (termed the analyte) is passed through the chip in a buffer (Figure 4.8).

During the flow, the analyte binds to the ligand with kinetics that depend on the rate of the interaction. The refractive index of the gold surface varies when the analyte binds to and dissociates from the ligand and this change in refractive index is recorded as the response unit (RU). The value of RU increases when the analyte binds to the ligand

(association phase), while it decreases when the analyte dissociates from the ligand

(dissociation phase).

SPR experiments were carried out using Ec and Dr full length RecA. The biosensor chip had four flow cells, two of which were labeled with two different concentrations of ssDNA (83-mer), one was labeled with dsDNA (83-mer), and the other one was without any DNA (the reference cell). The RU from the reference cell is subtracted from the RU of the other cells to account for any non-specific binding of the protein or the buffer components to the chip surface. The buffer contained 0.5 mM ATPγS, as the optimal binding of RecA to DNA requires ATP or an ATP analogue. The protein concentrations used for the experiments varied from 0-250 nM. This low protein concentration was required to observe the cooperativity of RecA binding to DNA, which is most evident at low protein concentrations. In the case of Ec RecA, the extent of protein bound to ssDNA was higher than to dsDNA (Figure 4.9 A and B). There was no substantial binding of Dr RecA to ss or dsDNA (Figure 4.9 C and D) and hence no attempts were made to determine the kinetic and equilibrium parameters.
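
The reference subtraction described above amounts to a point-by-point difference between the sensorgram of a DNA-labeled flow cell and that of the reference cell. A minimal Python sketch is given below. The response values are hypothetical and serve only to show the operation, and the function name is illustrative rather than part of the instrument software.

```python
def subtract_reference(sample_ru, reference_ru):
    """Return the reference-corrected sensorgram, one value per time point."""
    if len(sample_ru) != len(reference_ru):
        raise ValueError("traces must cover the same time points")
    return [s - r for s, r in zip(sample_ru, reference_ru)]


# Hypothetical responses (RU) sampled at the same time points.
ssdna_cell = [10, 60, 120, 90]
reference_cell = [10, 15, 20, 12]

corrected = subtract_reference(ssdna_cell, reference_cell)
print(corrected)
```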

The binding of RecA to DNA did not reach saturation under the particular

reaction conditions used. This was noted in earlier SPR experiments with Ec RecA and

Rad51 (Kelley De Zutter and Knight, 1999, Biet et al., 1999). The binding experiments were carried out at a slow flow rate of 10 μl/min and a total volume of 250 μl was

injected for each reaction, in order to maximize the saturation of RecA to DNA at each

RecA concentration.

The data analysis was done using kinetic and steady state parameters, using the

BIAevaluation software (version 3.2) supplied by Biacore. For the kinetic data analysis, the association and dissociation phases were globally fit for all the curves at different protein concentrations, to get the on (ka) and off (kd) rates (Figure 4.10).

The rate of dissociation (kd) of Ec RecA from dsDNA ($3.6 \times 10^{-7}$ M$^{-1}$ s$^{-1}$) is significantly lower

than that from ssDNA ($6.9 \times 10^{-4}$ M$^{-1}$ s$^{-1}$). As a result, the dissociation constant (KD) was

lower for Ec RecA-dsDNA binding ($3.5 \times 10^{-11}$ M), while the KD was $2.8 \times 10^{-7}$ M for the Ec RecA-ssDNA

interaction. Since the kinetic fit does not consider the effect of cooperativity in the

RecA-DNA binding event, the KD reported from this fit is an apparent KD. The data were

also analyzed by a Hill plot to account for the cooperativity in the RecA-DNA binding

(Figure 4.11). The maximum RU (Rmax) for the association phase was plotted against the

protein concentration and fit to the Hill equation (see materials and methods). This

analysis gave similar KD values for ss and dsDNA ($10^{-4}$ M). The Hill coefficient

was higher for ssDNA (2.03) and lower for dsDNA (1.71). Since the protein-DNA

binding did not reach saturation under the experimental conditions, the KD reported here

might not reflect the exact KD. Nevertheless, it is the best estimate of the KD which can

be calculated using a reasonable experimental set up.
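
For a 1:1 binding model the dissociation constant is the ratio of the off rate to the on rate, KD = kd/ka. The Python sketch below applies this relation to the kd and apparent KD values quoted above, to compare the two substrates and to back-calculate the on rates they imply. The on rates printed here are derived under that assumption and are not values reported in the text. The numbers are used as quoted, without any unit conversion.

```python
def dissociation_constant(k_off, k_on):
    """Apparent K_D for a 1:1 interaction."""
    return k_off / k_on


def implied_on_rate(k_off, k_d):
    """On rate implied by a reported off rate and K_D, assuming K_D = k_off / k_on."""
    return k_off / k_d


# Off rates and apparent K_D values for Ec RecA as quoted in the text.
reported = {
    "dsDNA": {"k_off": 3.6e-7, "K_D": 3.5e-11},
    "ssDNA": {"k_off": 6.9e-4, "K_D": 2.8e-7},
}

implied = {
    dna: implied_on_rate(values["k_off"], values["K_D"])
    for dna, values in reported.items()
}
fold_tighter = reported["ssDNA"]["K_D"] / reported["dsDNA"]["K_D"]

for dna, k_on in implied.items():
    print(f"{dna}: implied on rate {k_on:.3g}")
print(f"apparent K_D is {fold_tighter:.0f}-fold lower for dsDNA")
```

The comparison makes explicit that the difference in apparent KD between the two substrates is driven mainly by the off rate, since the implied on rates differ by much less than the off rates do.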

The binding of RecA to DNA is a complex process. The DNA-protein interaction

and the interactions between the different monomers affect the binding of protein to

DNA. Thus the affinity of the first RecA monomer for DNA is different from that of the

second monomer to DNA because of the cooperativity between the two monomers and

also due to the influence of DNA-protein interaction of the first monomer. A complex

model that separates the intrinsic DNA binding affinity of a monomer from the

cooperative protein-protein interactions was described by McGhee and von Hippel

(McGhee and von Hippel, 1974). This model has been used in several analyses, including

those of RecA and Rad51 proteins, to analyze the DNA binding properties of the particular proteins (Kelley De Zutter and Knight, 1999, Kelley De Zutter et al., 2001). For example,

the Hill analysis of similar SPR data for the F217Y mutant of Ec RecA showed that the

mutant is essentially the same as the wild type in its DNA binding properties. But the

analysis using the McGhee and von Hippel model showed that the mutant had about a 250 fold

increase in the cooperativity parameter ω (the relative affinity of an incoming monomer

for a uniformly protein coated DNA vs. an isolated site in the DNA) when compared to

the wild type. This was confirmed by electron microscopy which showed longer

filaments for the mutant than for the wild type (Kelley De Zutter et al., 2001). The

McGhee and von Hippel model could be used for the analysis of the data from the present

study to see in more detail the differences in the interaction of Ec RecA to ss and dsDNA.
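
To make the role of the cooperativity parameter ω concrete, the Python sketch below evaluates the McGhee and von Hippel binding isotherm in the form in which it is usually written, giving ν/L (binding density divided by free protein concentration) as a function of the binding density ν, the intrinsic association constant K, the site size n, and ω. This is a sketch under stated assumptions and not the analysis used in this study. The parameter values (K, a site size of 3 nucleotides, and ω) are hypothetical.

```python
import math


def scatchard_noncooperative(nu, k, n):
    """nu/L for non-cooperative binding of a ligand covering n lattice residues."""
    free = 1.0 - n * nu
    return k * free * (free / (1.0 - (n - 1) * nu)) ** (n - 1)


def scatchard_cooperative(nu, k, n, omega):
    """nu/L for cooperative binding; omega must differ from 1."""
    free = 1.0 - n * nu
    r = math.sqrt((1.0 - (n + 1) * nu) ** 2 + 4.0 * omega * nu * free)
    term1 = ((2.0 * omega - 1.0) * free + nu - r) / (2.0 * (omega - 1.0) * free)
    term2 = (1.0 - (n + 1) * nu + r) / (2.0 * free)
    return k * free * term1 ** (n - 1) * term2 ** 2


# Hypothetical parameters: K in M^-1, site size in nucleotides, binding density
# in protein per nucleotide (must lie between 0 and 1/n).
K_INTRINSIC = 1.0e5
SITE_SIZE = 3
NU = 0.1

weak = scatchard_cooperative(NU, K_INTRINSIC, SITE_SIZE, omega=1.000001)
strong = scatchard_cooperative(NU, K_INTRINSIC, SITE_SIZE, omega=250.0)
independent = scatchard_noncooperative(NU, K_INTRINSIC, SITE_SIZE)
print(f"non-cooperative: {independent:.4g}")
print(f"omega near 1:    {weak:.4g}")
print(f"omega = 250:     {strong:.4g}")
```

As ω approaches 1 the cooperative expression reduces to the non-cooperative one, while a large ω raises ν/L at the same binding density, meaning that less free protein is needed to reach that density. This is how the model separates filament-forming cooperativity from the intrinsic affinity K.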

It was not possible to analyze the DNA binding properties of Dr RecA using SPR.

This could be due to various reasons. The Dr RecA-DNA binding is highly dependent on

the pH of the reaction conditions. There is a substantial increase in the Dr RecA-DNA

binding at a lower pH (pH 6). The present SPR experiments were carried out at pH 7.5.

It has been observed that Dr RecA exhibits differential binding to different nucleotide

cofactors. dATP is the preferred nucleotide cofactor for Dr RecA. The SPR experiments

were carried out using ATPγS as the nucleotide cofactor. The other DNA binding

experiments such as the filter binding, gel shift, and the strand exchange assays, showed

that the Dr RecA protein is functional. The inability of Dr RecA to produce a substantial

binding in the SPR experiments could be due to the unfavorable reaction conditions.

Further standardization using different pH and different nucleotide cofactors might be necessary for the SPR experiments with Dr RecA.

4.3.2 Isolated CTD of Ec and Dr RecA

4.3.2.1 Protein purification and characterization

The design of the isolated CTD of Dr RecA was based on a previous NMR study

which characterized the DNA binding properties of the isolated CTD of Ec RecA (Aihara

et al., 1997). In this study, the CTD, consisting of residues 268 to 330 of Ec RecA

(EcRecA CTD) was purified and its binding to 12-mer ss and dsDNA molecules was

analyzed by NMR. The corresponding region of Dr RecA, residues 281-344 (DrRecA

CTD) was cloned into the pET9a vector in order to purify the protein in its native form.

The isolated CTD of Ec and Dr RecA showed a difference in mobility when run on a

SDS-PAGE gel (Figure 4.12). The DrRecA CTD ran at a size corresponding to a dimer,

while EcRecA CTD ran at a size corresponding to the monomer on the SDS-PAGE gel.

The identities of the purified proteins were confirmed by mass spectrometric (MS)

analysis, which showed the expected mass for both the proteins. The nuclease assay

showed that there is slight nuclease contamination in both the CTD protein

preparations.

4.3.2.2 Filter Binding Assay

The CTDs of Dr and Ec RecA were used for filter binding experiments with ss and dsDNA. DrRecA CTD showed binding to both ss and dsDNA, while there was no visible binding of EcRecA CTD to ss or dsDNA. Oligonucleotides of different lengths were used for the filter binding assay. There was no detectable binding of DrRecA CTD to a 12-mer DNA. Experiments with 39-mer and 83-mer DNA showed that the DrRecA

CTD bound to both the oligomers in a similar manner. The binding curves of DrRecA

CTD for both ss and dsDNA are sigmoidal, showing the existence of cooperativity in the binding event (Figure 4.13 A and C). The percentage of protein bound to DNA was plotted against the protein concentration and the points were fit using the Hill equation (Figure 4.13 B and D). This gave a Hill coefficient of 4.15 for ssDNA and 2.57 for dsDNA, indicating strong cooperativity in the interaction of the DrRecA CTD with DNA.

The inability of EcRecA CTD to bind to ss or dsDNA could be due to various reasons. It is possible that the DNA used in the filter binding experiments is degraded by the trace amounts of nuclease present in the protein preparation. Another reason could be that the protein and DNA concentrations used in the filter binding experiments are very low when compared to those used in the NMR experiments with the isolated

EcRecA CTD. In the NMR experiments, a protein concentration of 0.2 mM and a DNA concentration of 0.75 and 1.5 molar equivalents (when compared to the protein concentration) of ds and ssDNA respectively were used (Aihara et al., 1997). In the present study, the DNA concentration was 100 nM and the protein concentration varied from 0 to 1 μM for the filter binding experiments. A very high protein and DNA

concentration similar to that in the NMR experiments is not possible because the binding

capacity of the NC membrane is limited.

In conclusion, the experiments involving the isolated CTDs of Dr and Ec RecA

require further standardization of the protein purification and experimental conditions. A

His-tagged version of the isolated CTDs of Ec and Dr RecA was also purified. Even though these protein preparations were free of nuclease contamination, they did not bind to either ss or dsDNA as observed by nitrocellulose filter binding and gel shift assays. The crystal structure of the isolated Dr RecA CTD was solved, and the structure showed that the Dr

RecA CTD had the same fold as that of the CTD in the full length Dr RecA structure thus demonstrating the integrity of the purified protein (data not shown).

4.4 Conclusions

The research was aimed at analyzing the inverse strand exchange mechanism of

Dr RecA. The filter binding, gel shift, and ATPase assays show that the full length Dr

RecA has preferences for ss and ds DNA similar to those observed for the full length Ec

RecA. Both Ec RecA and Dr RecA formed a higher amount of stable protein-DNA complex with ssDNA. The ATPase assay showed that both Ec and Dr RecA have a similar sequence preference for the ATPase activity. From the filter binding assay with the Dr

RecA CTD, it can be concluded that DrRecA CTD can bind to both ss and dsDNA equally well. Thus, the current results are not in accordance with the predictions made from the model (Figure 4.1). This could be due to the particular pH and nucleotide requirements of Dr RecA (Kim et al., 2002), which need to be standardized in the current experiments. The exact mechanisms by which Ec RecA and Dr RecA bind to DNA appear to be different. This could be a reason for the observed lower level of response in the SPR experiments involving Dr RecA. It is reasonable to think that there are proteins yet to be identified that confer the special DNA repair properties of Dr and its RecA protein.

Figure 4.1 Hypothetical model showing the strand exchange mechanism in E. coli and D. radiodurans. The protein is organized into three domains, the NTD (blue), the core domain (green), and the CTD (orange). The core and NTD interact with each other leading to the formation of the RecA filament. The first DNA molecule (ssDNA, cyan in Ec and dsDNA, red in Dr) binds along the central axis of the RecA filament. The CTD, which protrudes outside the filament, will bind to the second DNA molecule (dsDNA in Ec and ssDNA in Dr) and bring it into the filament for pairing with the first DNA molecule.

Figure 4.2 Gels for ssDNA (A) and dsDNA (B) exonuclease assay. A) The first lane is the control with no protein, the second lane contains λ exonuclease, and the rest of the lanes are different protein preparations of Ec RecA wild type and mutant proteins. Note the disappearance of the DNA band in lane 3 due to the nuclease contamination. B) The first lane is the DNA ladder, the second lane is without any protein, and the 8th lane is the positive control with λ exonuclease, which is a dsDNA exonuclease. The DNA is fully digested in this lane. The rest of the lanes represent different preparations of wild type and mutant Ec RecA. Lane 4 has dsDNA exonuclease contamination because the intensity of the band for the full length dsDNA is less than in lane 2.

Figure 4.3 Gel shift assay showing the binding of Dr and Ec full-length proteins to ss, ds, overhang (OH), and hairpin (HP) DNA. A) Schematic representation of the different types of DNA used in the experiment. The concentration of protein used was 0, 3, 6, 9, 14, and 20 μM, while the DNA concentration was 500 nM nt for all the experiments. B) The binding of Dr RecA to ss and dsDNA. The first six lanes are for ssDNA (48-mer) and the last six are for dsDNA (39-mer). C) The binding of Ec RecA to ss and dsDNA. The order of loading is the same as in B. D) The binding of Dr RecA to overhang (made by annealing a 36-mer and a 30-mer DNA) and hairpin DNA (49-mer). The first six lanes are for overhang DNA and the last six lanes are for hairpin DNA. E) The binding of Ec RecA to overhang and hairpin DNA. The order of loading is the same as in D. The hairpin DNA gave the most stable complex for Ec RecA, as shown by the reduced streaking in the lanes corresponding to the hairpin DNA. In the case of Dr RecA, all the different types of DNA gave rise to weaker complexes (streaking in all the lanes). The protein-DNA complex migrates slower for Dr RecA when compared to Ec RecA.

Figure 4.4 Strand exchange reaction with Dr and Ec RecA. A) Schematic representation of the strand exchange process. css DNA is the closed ssDNA, ncds DNA is the nicked circular dsDNA. B represents the strand exchange reaction with Dr RecA, while C represents that with Ec RecA. The reaction was carried out in the presence of 3 mM dATP. The time points used for the reaction were 0, 10, 20, 40, and 60 minutes. The first lane in B is ΦX174 ssDNA and the last lane in C is ΦX174 dsDNA. The joint molecule is a complex of ss and dsDNA and it is a slowly migrating intermediate in the reaction. The ncds DNA is the product of the reaction. Note that Dr RecA forms more ncds DNA at earlier time points.

Figure 4.5 Binding of Ec and Dr full length proteins to 39-mer ss and dsDNA. Graphical analysis of the filter binding data for the binding of Dr and Ec full length RecA to ss and dsDNA. The amount of RecA-DNA complex formed was higher with ssDNA for both Dr and Ec RecA. Dr RecA formed more complex with both ss and ds DNA compared to Ec RecA at earlier time points.

Figure 4.6 Competitive DNA binding assay for the full length Dr and Ec RecA proteins. The legend is labeled according to the order in which the DNA was added to the reaction. A protein concentration of 500 nM and DNA concentration of 4 μM nt (for the first DNA) was used for the reaction. An excess concentration of the first DNA was used in order to ensure the binding of all the protein molecules to the first DNA. The concentration of the cold competing DNA was 0, 2, 4, 8, and 16 μM nt. The graph shows that ssDNA displaces dsDNA from the RecA-dsDNA complex more easily than dsDNA displaces ssDNA from the RecA-ssDNA complex. 39-mer ss and ds DNA were used for the experiment.

Figure 4.7 ATPase assay with full length Ec and Dr RecA. A) The curves for dATP hydrolysis by Ec and Dr RecA in the presence of long ss and dsDNA. B) The rate of hydrolysis of dATP by Ec and Dr RecA in the presence of different DNA. The rate is higher in the presence of ssDNA than dsDNA. C) The curves for dATP hydrolysis by Dr RecA in the presence of different trinucleotide repeats (48-mer long). D) The rate of dATP hydrolysis by Dr RecA in the presence of different triplet repeating 48-mer sequences. As noted for Ec RecA (chapter 3), Dr RecA has lower ATPase rate in the presence of TGG-repeating sequences than with CCA or TCA repeats.

Figure 4.8 Schematic representation of an experimental set up for SPR. The DNA molecules (red) are immobilized on the sensor chip, and RecA (green) is passed in buffer through the flow channel. RecA binds to the DNA while flowing in the buffer. Once RecA binds to the DNA, there is an increase in the refractive index at the surface of the sensor chip, which is measured by the detector.

Figure 4.9 SPR curves for the binding of Ec and Dr RecA to ss and dsDNA. Under the same experimental conditions, Ec RecA gives greater extent of binding to both ss and ds DNA compared to Dr RecA. The dissociation of Ec RecA-dsDNA complex is significantly less when compared to Ec RecA-ssDNA complex.

Figure 4.10 Kinetic analysis of the SPR data for Ec RecA-DNA binding. A) The simultaneous fit of the association and dissociation phases based on a 1:1 Langmuir binding. The black lines represent the curves from the fit and the colored lines represent the actual data. Note that both the curves overlap, indicating a very good fit of the data to the model. B) Values obtained for different parameters from the kinetic analysis. “Higher” indicates the flow cell with the higher concentration of ssDNA and “lower” indicates the one with the lower concentration of ssDNA. The KD obtained from this fit is an apparent KD since it does not consider the cooperativity in the DNA binding event. The KD is lower for Ec RecA-dsDNA ($3.5 \times 10^{-11}$ M) compared to that for Ec RecA-ssDNA ($10^{-7}$ M), showing a very tight binding of Ec RecA to dsDNA. This is mainly due to the lower level of dissociation (kd) of Ec RecA from the dsDNA.

Figure 4.11 Hill analysis of the SPR data for Ec RecA. A) Hill plot for the Rmax values at different protein concentrations for Ec RecA-ssDNA binding. Note that the curve has a sigmoidal shape, indicating cooperativity in the binding mechanism. B) Values obtained for different parameters from the Hill analysis. “Higher” indicates the flow cell with the higher concentration of ssDNA and “lower” indicates the one with the lower concentration of ssDNA. The Hill coefficient is higher for Ec RecA-ssDNA (2.03) than for Ec RecA-dsDNA (1.71). This means Ec RecA binds more cooperatively to ssDNA than to dsDNA. The Hill analysis gives an estimate of the true KD (the intrinsic affinity of a monomer to DNA), which is similar for both ss and dsDNA, while the kinetic analysis shows a significant difference in the KD values for the interaction of Ec RecA with ss and ds DNA (Figure 4.10).

Figure 4.12 A) Gel showing the anomalous mobility of the DrRecA CTD on an SDS-PAGE gel. DrRecA CTD runs at a size of 14 kDa, while EcRecA CTD runs at 7 kDa. B) MS results show the existence of a single peak in the protein purifications. The molecular weight corresponding to the major peak is shown.

Figure 4.13 Binding of the isolated CTD of Dr RecA to ssDNA and dsDNA. 39-mer long ss and ds DNA were used for this experiment. Note that the binding curves are sigmoidal, representing cooperativity in the binding of Dr RecA CTD to ss and ds DNA (A and C). The Hill analysis (B and D) gave the cooperativity value, which is 4.15 for Dr RecA CTD-ssDNA binding and 2.57 for Dr RecA CTD-dsDNA binding. There was no detectable binding of Ec RecA CTD to ss or dsDNA under the particular experimental conditions.

CHAPTER 5

STRUCTURAL STUDIES OF BACILLUS SUBTILIS LUXS

IN COMPLEX WITH REACTION INTERMEDIATES AND

INHIBITORS*

5.1 Introduction

Quorum sensing (QS) is a process by which bacterial cells communicate with one another to regulate their gene expression in response to the cell density (Miller and

Bassler, 2001). QS is mediated by the production and release of small signaling molecules called autoinducers (AI) into the extracellular environment. QS includes both intraspecies (type I) and interspecies (type II) communication. Three different types of QS circuits have been characterized (Xavier and Bassler, 2003) (Figure 5.1), one in gram negative bacteria, a second in gram positive bacteria and a third in Vibrio harveyi, which shows features of the first two QS pathways.

* Parts of this work have been published in Biochemistry (2005), Volume 44, 3745-3753, and Journal of Medicinal Chemistry (2006), Volume 49, 3003-3011.

In type I QS, each species of bacterium produces a unique AI-1 molecule that is recognized by a species-specific receptor, thus preventing interspecies cross talk. The

AI-1 molecule for gram negative bacteria is acyl homoserine lactone (AHL), which is freely diffused into the surrounding medium. AHL is produced by the protein LuxI. Once the concentration of AHL reaches a threshold, it binds to LuxR, and the AHL-LuxR complex binds to specific DNA sequences and turns on the expression of several genes.

In the case of gram positive bacteria, the AI-1 consists of short oligopeptides called autoinducing peptides (AIP). AIPs are typically 5 to 17 amino acids long and sometimes have special side chain modifications (Federle and Bassler, 2003). Unlike AHL, which freely diffuses out of the cell, AIPs require specific cell surface oligopeptide transporters to transport them to the outside environment. The detection of these peptides involves a sensor kinase protein and a response regulator protein. The sensor kinase is phosphorylated at an invariant His residue, while the response regulator is phosphorylated at an Asp residue (Federle and Bassler, 2003).

The third type of QS pathway shows features of the pathways found in both gram positive and gram negative bacteria. This was first identified in Vibrio harveyi, a shrimp pathogen (Federle and Bassler, 2003, Bassler et al., 1993). In this case the bacterium produces two different AI molecules, AI-1 and AI-2. AI-1 is produced by LuxLM and it freely diffuses to the outside environment. But the detection of this AI-1 molecule by LuxN involves phosphorylation of a sensor kinase-response regulator pair, as noted for the gram positive bacteria. The AI-1 molecule is involved in the intra-species communication of V. harveyi. In addition to this, V. harveyi also produces the type II QS molecule, AI-2. AI-2 is produced by the enzyme LuxS and it is secreted outside the cell.

The periplasmic protein LuxP binds to AI-2 and the LuxP-AI-2 complex activates LuxQ, which has both the sensor kinase and response regulator sites. The signals from both

LuxN and LuxQ turn on the luxCDABE gene cassette, which again involves a series of

His and Asp phosphorylations (Figure 5.1).

It is thought that type II QS helps bacteria to identify the presence and density of

other species in a community (Xavier and Bassler, 2003). This is because luxS has been

identified in various bacterial species (both gram positive and gram negative). The AI-2

molecule produced by these different bacterial species was detected by a reporter strain

of V. harveyi, which was developed to detect AI-2 signal (Federle and Bassler, 2003).

The chemical identity of AI-2 in V. harveyi is furanosyl borate diester, which was first revealed from the crystal structure of LuxP in complex with AI-2 (Chen et al., 2002). The

same team showed that active AI-2 in Salmonella typhimurium is (2R,4S)-2-methyl-

2,3,3,4-tetrahydroxytetrahydrofuran (R-THMF) (Miller et al., 2004).

S-Ribosylhomocysteinase (LuxS) is the key enzyme in the biosynthetic pathway

of AI-2. The biosynthesis starts from S-adenosylhomocysteine (SAH), which is formed as a byproduct of many S-adenosylmethionine-dependent methyltransferase reactions

(Figure 5.2). SAH is hydrolyzed by a nucleosidase Pfs to adenine and S-

ribosylhomocysteine (SRH). Next, LuxS cleaves the thioether bond in SRH to produce L-

homocysteine (Hcys) and 4,5-dihydroxy-2,3-pentanedione (DPD) (Miller and Duerre,

1968, Surette et al., 1999, Schauder et al., 2001). DPD spontaneously cyclizes to form a

furanone as the active form of AI-2 molecule in Salmonella typhimurium (Miller et al.,

2004). In Vibrio harveyi the furanone is further complexed with borate to produce a

furanosylborate diester as the active AI-2 (Chen et al., 2002). AI-2 signaling is

responsible for various functions in bacteria such as virulence, biofilm formation,

motility, toxin production etc. (Federle and Bassler, 2003).

Although the overall reaction catalyzed by LuxS is analogous to the reaction

catalyzed by SAH hydrolase, which hydrolytically cleaves SAH into Hcys and adenosine,

LuxS does not contain the essential NAD+ cofactor found in SAH hydrolase (Palmer and

Abeles, 1979). Instead, the high-resolution X-ray crystal structures of LuxS from Bacillus

subtilis, , , and Deinococcus radiodurans

revealed that LuxS contains a tetrahedrally bound divalent metal in the active site (Lewis et al., 2001, Hilgers and Ludwig, 2001, Ruzheinikov et al., 2001). Biochemical studies have shown that the divalent metal ion is Fe2+ in LuxS from B. subtilis (BsLuxS). The

substitution of Fe2+ by Co2+ retains full catalytic activity of the enzyme (Zhu et al.,

2003a). The Co2+ substituted LuxS is more stable than the enzyme coordinated to Fe2+.

This is because the Fe2+ ion is easily oxidized to Fe3+, thus reducing the activity of the

enzyme. LuxS exists as a homodimer in which two identical active sites are formed at the

dimer interface. Each active site contains a Fe2+ ion, coordinated by three conserved

residues (His-54, His-58, and Cys-126 in BsLuxS) as well as a water molecule.

A catalytic mechanism has been proposed for the LuxS-catalyzed reaction, as

shown in Figure 5.3 (Pei and Zhu, 2004, Zhu et al., 2003a, Zhu et al., 2003b, Zhu et al.,

2004). In an aqueous solution, the ribose ring of SRH is in equilibrium with the open-

chain aldehyde form of the substrate. LuxS may either preferentially bind the open-chain

form and shift the equilibrium towards the open chain form, or bind the ribose form and catalyze its ring opening. In the productive E•S complex 1, the aldehyde carbonyl binds to the metal ion, displacing the bound water that coordinates the metal in the free enzyme. Coordination to the metal increases the acidity of the C2 proton, which is abstracted by a general base (likely Cys-84 in BsLuxS). The cis-enediolate 2 formed

undergoes ligand exchange, shifting the metal from C1 to the C2 oxygen atom,

presumably assisted by a second base/acid (likely Glu-57) and via a 5-membered-ring

transition state, to give enediolate 3. Reprotonation at C1 position by Cys-84 and

tautomerism back to the keto form generates the 2-keto intermediate 4. Repetition of the

above sequence shifts the carbonyl group to C3 position to give a 3-keto intermediate 7.

Subsequent β-elimination, probably catalyzed by Glu-57 (or possibly a third acid/base), results in the release of Hcys and the formation of DPD from the enol form 9. DPD spontaneously tautomerizes to the keto form, either on its way off or after leaving the active site.

The above mechanism is supported by several lines of evidence. When the LuxS reaction was carried out in D2O, deuterium was incorporated into C1, C2, and C5

positions of DPD. This confirms the involvement of sequential proton transfers involving the various C positions as mentioned in the reaction scheme (Zhu et al., 2003a). Both the regiochemistry and stereochemistry of the proton transfer steps in Figure 5.3 have recently been confirmed experimentally by using specifically deuterated SRH substrates

(Zhu et al., 2004). The ketone intermediates 4 and 7 have been observed in real time by

13C NMR spectroscopy (Zhu et al., 2003b). Furthermore, the 2-keto intermediate has

been chemically synthesized and demonstrated to be a chemically and kinetically

competent intermediate on the catalytic pathway (Zhu et al., 2003b).

Despite the significant progress made during the past few years, certain details needed for a complete picture of the LuxS catalytic mechanism are still lacking. One issue is

the role of the metal ion during catalysis. The absorption spectra of the Co2+-substituted

LuxS showed significant spectral shifts coupled to the catalytic cycle. This suggests that

the metal ion is directly coordinated with the substrate/intermediates (Zhu et al., 2003a).

However, in the co-crystal structure of LuxS bound to SRH, the ribose hydroxyl groups

were 3.0 Å and 3.2 Å away from the metal ion, ruling out direct coordination of the metal

ion by the hydroxyl groups (Ruzheinikov et al., 2001). Another unresolved issue is the

function of Ser-6, His-11, and Arg-39, which are in close proximity to the substrate and

are highly conserved residues. Interpretation of the reported LuxS/SRH structure was

potentially complicated by several factors. First, the solved structure contained a Zn2+ ion as the metal cofactor and the Zn2+-LuxS is an order of magnitude less active than the

native enzyme. Second, Cys-84 in the structure was oxidized into cysteic acid, whose

larger size undoubtedly affects substrate binding and catalysis. Indeed, oxidation of Cys-

84 to cysteic acid inactivates LuxS (Zhu et al., 2003a). Third, the reported structure had

relatively high B-factors for the metal ion and SRH, apparently arising from their partial

occupancy in the active site, as well as an alternative conformation for the O1 ribose hydroxyl group (β-conformer) of the substrate. Finally, the ribose form of SRH is not the

“active” form of the substrate and, therefore, the structure does not reflect that of the

productive E•S complex.

This project was aimed at answering several unresolved issues about the catalytic

mechanism of LuxS. The first part was to understand the role of the metal ion in the

catalytic mechanism of LuxS. This was approached by co-crystallizing LuxS with a 2-

ketone intermediate 4 in the reaction pathway. A catalytically inactive Cys84Ala mutant

form of BsLuxS was used to prevent turnover of the intermediate by LuxS. The enzyme

was substituted with Co2+ to ensure the stability of the enzyme during the crystallization

trials. The structure gave insight into the role of the metal ion in the catalytic cycle and

also the importance of several invariant residues around the active site.

5.2 Materials and Methods

5.2.1 Materials

Oligonucleotides were purchased from Integrated DNA Technologies (Coralville,

IA). Talon resin was from Clontech Laboratories (Palo Alto, CA). The 2-ketone

intermediate 4 was synthesized as previously described (Zhu et al., 2003b). The

chemicals used in crystal growth and handling were Fisher Scientific certified A.C.S. grade. All other chemicals were purchased from Sigma-Aldrich (St. Louis, MO).

5.2.2 Site-Directed Mutagenesis of LuxS

The site-directed mutagenesis, protein purification, activity assays, UV-VIS spectroscopy measurements, and synthesis of various compounds were done by members of Dr. Pei’s group (Zhu, J., Shen, G., and Hu, X.). I performed the crystallography and structure determination part of the project.

Site-directed mutagenesis was carried out on the plasmids pET22b-luxS (non-His-tag) for BsLuxS (Zhu et al., 2003a) and pET22b-luxS-HT for V. harveyi LuxS (VhLuxS) (Zhu et al., 2003b) using the QuikChange mutagenesis kit (Stratagene, CA).

The primers used were as follows: BsLuxS-C84A, 5’-GATATTTCTCCAATGGGCGCCCAAACAGGCTATTATC-3’; VhLuxS-S6A, 5’-ATGCCTTTATTAGACGCCTTTACCGTA-3’; VhLuxS-H11Q, 5’-AGCTTTACCGTAGACCAAACGCGTAT-3’; VhLuxS-R39M, 5’-ACGGTATTCGACCTAATGTTCACTGC-3’; and VhLuxS-R39K, 5’-ACGGTATTCGACCTAAAATTCACTGC-3’. The identity of all DNA constructs was confirmed by DNA sequencing.
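Only one strand is listed for each mutagenic primer. QuikChange-style mutagenesis generally uses a pair of complementary primers, so the partner strand can be derived as the reverse complement of the listed sequence. The following is a minimal sketch under that assumption; the function and variable names are illustrative and the partner sequences actually ordered are not given in this work.

```python
# Sketch: derive the complementary partner of a mutagenic primer.
# Assumption: the partner primer is the exact reverse complement of the listed strand.

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


# BsLuxS-C84A primer as listed in the text (5' to 3').
bs_c84a_forward = "GATATTTCTCCAATGGGCGCCCAAACAGGCTATTATC"
bs_c84a_partner = reverse_complement(bs_c84a_forward)
print(bs_c84a_partner)
```

Applying the function twice returns the original sequence, which is a quick way to check a primer pair for transcription errors.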

5.2.3 Purification of C84A BsLuxS (non-His-tag)

E. coli BL21(DE3) cells (4 L) carrying the plasmid pET22b-luxS-C84A were

grown in minimal media supplemented with 75 mg/L ampicillin, 0.25% D-glucose, 2

μg/mL thiamin, 1 μg/mL D-biotin, 0.1% (NH4)2SO4, and a metal salt mixture (0.5 mM

MgSO4, 0.5 μM H3BO3, 0.1 μM MnCl2, 0.5 μM CaCl2, 10 nM CuSO4, 1 nM ammonium

molybdate) at 37 °C to an OD600 of 0.6. The cells were induced by the addition of 100

μM isopropyl β-D-thiogalactoside, and continued to grow at 30 °C for an additional 15 h.

For the preparation of Co2+-substituted enzymes, 100 μM CoCl2 was added to the growth

media at the time of induction. Cells were harvested by centrifugation and resuspended in

70 mL of lysis buffer containing 25 mM Tris (pH 7.6), 20 mM NaCl, 1% Triton X-100,

0.5% protamine sulfate, 40 μg/mL p-methylbenzenesulfonyl fluoride and 70 μg/mL

chicken egg white lysozyme. The cells were lysed by stirring for 20 min at 4 °C,

followed by brief sonication and centrifugation. The supernatant was loaded on a Q-

Sepharose Fast-Flow column (2.5 x 13 cm; Amersham Pharmacia Biotech AB) pre-

equilibrated with 25 mM Tris (pH 7.6) and 20 mM NaCl. The column was washed with

300 mL of the equilibrating buffer and eluted with a NaCl gradient (20-500 mM) in the

above buffer. Purple fractions containing significant amount of LuxS protein (as analyzed

by SDS-PAGE) were pooled (80 mL), concentrated in an Amicon apparatus (Millipore)

to 33 mL and adjusted to 1 M (NH4)2SO4. The protein solution was then loaded on a

Phenyl-Sepharose Fast-Flow column (2.5 x 16 cm; Amersham Pharmacia Biotech AB),

washed with 80 mL of equilibrating buffer containing 25 mM Tris (pH 7.8), 20 mM NaCl

and 1 M (NH4)2SO4, and eluted with a reverse gradient of 1-0 M (NH4)2SO4. Fractions were analyzed by 15% SDS-PAGE, concentrated to 20 mL, quickly frozen in isopropanol dry ice bath and stored at -80 °C. Protein concentration was determined by the Bradford method using bovine serum albumin as standard and by measuring the thiol content using

5,5’-dithio-bis-(2-nitrobenzoic acid) (DTNB) (Ellman, 1959). Metal analysis was performed by inductively coupled plasma emission spectrometry (ICP-ES) at the

Chemical Analysis Laboratory of the University of Georgia. C84A LuxS contained 0.89

Co2+ and 0.02 Zn2+ per polypeptide. The results showed that the Bradford method

overestimates the LuxS concentration by a factor of 2.5, a fact that was corroborated by the metal analysis results (ref. Zhu et al., 2003a and this work).

5.2.4 Purification of VhLuxS Mutants

E. coli BL21(DE3) cells (4 L) carrying the appropriate plasmid were

grown in the above media at 37 °C to an OD600 of 0.9. Cells were induced in the same

manner for 5 h at 30 °C in the presence of 100 μM CoCl2. After centrifugation, cells were

lysed in 70 mL of lysis buffer containing 20 mM Tris (pH 8.0), 0.5 M NaCl, 5 mM

imidazole, 1% Triton X-100, 0.5% protamine sulfate, and 70 μg/mL chicken egg white

lysozyme with stirring and brief sonication. After spinning down the crude lysate, the

supernatant was loaded on a Talon metal affinity column (Clontech, 2.5 x 2.5 cm)

equilibrated in 20 mM Tris (pH 8.0), 0.5 M NaCl, and 5 mM imidazole. The column was

eluted with the above buffer containing 60 mM imidazole. Fractions were collected,

analyzed and stored in the same manner as described above. Protein concentration was

determined by the Bradford method and corrected by a factor of 0.5, which is based on the metal analysis results of V. harveyi Co2+-LuxS.

5.2.5 LuxS Activity Assay

All LuxS activity assays were performed in a buffer containing 50 mM HEPES

(pH 7.0), 150 mM NaCl, 150 μM 5,5’-dithio-bis-(2-nitrobenzoic acid) (DTNB) (Ellman,

1959) and various concentrations of SRH (0-200 μM) or 2-ketone 4 (0-130 μM) at room

temperature. The reactions were initiated by the addition of Co2+-LuxS (final

concentration 0.5-1.7 μM) and monitored continuously at 412 nm in a Perkin-Elmer λ25

UV-VIS spectrophotometer. The initial rates recorded from the early regions of the

progress curves were fitted into the Michaelis-Menten equation using KaleidaGraph 3.5

to obtain the kcat and KM values.
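The fits in this work were done in KaleidaGraph. For readers who wish to reproduce the analysis, the following is a minimal sketch of an equivalent least-squares fit of the Michaelis-Menten equation using only the Python standard library. It exploits the fact that the model is linear in Vmax for a fixed KM, so only KM needs to be searched. The function names, search bounds, and the data in the usage lines are assumptions for illustration and are not measurements from this work.

```python
# Sketch only: least-squares fit of v = Vmax*[S]/(KM + [S]) using the standard library.
# The data in the usage example are synthetic, not measurements from this work.

def best_vmax(substrate, rates, km):
    """Closed-form least-squares Vmax for a fixed KM (the model is linear in Vmax)."""
    basis = [s / (km + s) for s in substrate]
    return sum(b * v for b, v in zip(basis, rates)) / sum(b * b for b in basis)


def residual_ss(substrate, rates, km):
    """Sum of squared residuals at a given KM, using the best Vmax for that KM."""
    vmax = best_vmax(substrate, rates, km)
    return sum((v - vmax * s / (km + s)) ** 2 for s, v in zip(substrate, rates))


def fit_michaelis_menten(substrate, rates, km_lo=0.1, km_hi=1e4, grid=400, iters=100):
    """Return (Vmax, KM). KM is searched between km_lo and km_hi (same units as [S])."""
    # Coarse scan on a logarithmic grid to bracket the minimum.
    step = (km_hi / km_lo) ** (1.0 / (grid - 1))
    kms = [km_lo * step ** i for i in range(grid)]
    best = min(range(grid), key=lambda i: residual_ss(substrate, rates, kms[i]))
    a, b = kms[max(best - 1, 0)], kms[min(best + 1, grid - 1)]
    # Golden-section refinement inside the bracket.
    ratio = (5 ** 0.5 - 1) / 2
    for _ in range(iters):
        c, d = b - ratio * (b - a), a + ratio * (b - a)
        if residual_ss(substrate, rates, c) < residual_ss(substrate, rates, d):
            b = d
        else:
            a = c
    km = (a + b) / 2
    return best_vmax(substrate, rates, km), km


# Usage with synthetic, noise-free rates (Vmax = 0.68 uM/s and KM = 45 uM are made up).
srh_um = [5, 10, 25, 50, 100, 200]
rates_um_per_s = [0.68 * s / (45 + s) for s in srh_um]
vmax, km = fit_michaelis_menten(srh_um, rates_um_per_s)
kcat = vmax / 1.7  # kcat = Vmax / [E], with a hypothetical enzyme concentration of 1.7 uM
```

With real progress-curve data the initial rates would carry noise, and the fitted parameters should be reported with their uncertainties, which this sketch does not estimate.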

5.2.6 UV-VIS Spectroscopy

C84A BsLuxS (non-His-tag) was diluted in a buffer containing 50 mM HEPES

(pH 7.0) to give a final concentration of 220 μM, and incubated with 370 μM of 2-ketone

4 at room temperature. Absorption spectra were recorded on a Perkin-Elmer λ25 UV-VIS

spectrophotometer at various times (0–90 min).

5.2.7 Crystallization and Structure Determination

The C84A variant of BsLuxS was cocrystallized with 2-ketone intermediate 4 by

hanging drop vapor diffusion, where the well solution consisted of 2.2 M ammonium

sulfate, 0.1 M HEPES pH 7.0, and the hanging drop was prepared by mixing 2 μl of well

solution with 2 μl of 35 mg/ml (1.9 mM) LuxS, 3 mM intermediate 4, 25 mM Tris pH

8.0, 100 mM sodium chloride. Prior to data collection crystals were transferred to a

solution of 15% glycerol, 2.5 M ammonium sulfate, 0.1 M HEPES pH 7.0, 3 mM

intermediate 4, mounted in nylon loops (Hampton Research), and frozen by plunging in liquid nitrogen. X-ray diffraction data were collected at -180 °C using a Rigaku

RUH3RHB rotating anode generator and an R-AXIS IV++ image plate detector. The data

were processed with CrystalClear software (Molecular Structure Corporation).

Crystallographic refinement using CNS (Brünger et al., 1998) began with the structure of

the BsLuxS SRH complex (PDB code 1JVI) (Ruzheinikov et al., 2001) with all water

molecules and SRH omitted and Cys-84 truncated to alanine. The conjugate gradient

minimization and individual temperature factor protocols were used with a maximum

likelihood target and overall anisotropic temperature factor and bulk solvent corrections

to the data. The 2-ketone intermediate was added to the model at the later stages of

refinement after all of the protein atoms and several water molecules were positioned.

Model building used the program O (Jones et al., 1991), and the geometrical parameters

for refinement of the 2-ketone intermediate were generated using the Dundee PRODRG2

server (Schuettelkopf and van Aalten, 2004). The distances between the Co2+ and its protein and substrate ligands were not restrained. Figures were generated using O,

MOLSCRIPT (Kraulis, 1991), RASTER3D (Merritt and Bacon, 1997), and LIGPLOT

(Wallace et al., 1995).
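As a quick consistency check on the crystallization setup described in this section, the sketch below computes the initial composition of the hanging drop after mixing equal volumes of well and protein solutions, and the subunit molar mass implied by the paired values 35 mg/ml and 1.9 mM. The helper names are illustrative. The calculation gives only the starting concentrations; during vapor diffusion the drop equilibrates toward the well solution, which is not modeled here.

```python
# Sketch: initial hanging-drop composition and the molar mass implied by 35 mg/ml = 1.9 mM.

def implied_molar_mass(mg_per_ml, millimolar):
    """Molar mass (g/mol) implied by a paired mass and molar concentration."""
    return mg_per_ml / millimolar * 1000.0


def mix(parts):
    """Initial concentrations after mixing; parts is a list of (volume, {component: conc})."""
    total = sum(volume for volume, _ in parts)
    names = sorted({name for _, solution in parts for name in solution})
    return {
        name: sum(volume * solution.get(name, 0.0) for volume, solution in parts) / total
        for name in names
    }


# Concentrations in mM and volumes in microlitres, as quoted in Section 5.2.7.
well = {"ammonium sulfate": 2200.0, "HEPES": 100.0}
protein = {"LuxS": 1.9, "intermediate 4": 3.0, "Tris": 25.0, "NaCl": 100.0}
drop = mix([(2.0, well), (2.0, protein)])
mass = implied_molar_mass(35.0, 1.9)  # about 18.4 kDa per LuxS subunit

for name, conc in drop.items():
    print(f"{name}: {conc:.2f} mM")
```

Under these numbers the drop starts at roughly 1.1 M ammonium sulfate, 0.95 mM LuxS, and 1.5 mM intermediate 4, so the intermediate remains in molar excess over the enzyme.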

5.3 Results and Discussion

5.3.1 Structure of LuxS complexed with catalytic 2-ketone intermediate 4

The catalytically inactive C84A variant of BsLuxS (Zhu et al., 2003a, Zhu et al.,

2003b) was co-crystallized with 2-ketone intermediate 4 from solutions of ammonium

sulfate at pH 7.0. The crystals are the same form as previously reported for BsLuxS

(Hilgers and Ludwig, 2001, Ruzheinikov et al., 2001), space group P6522 with one LuxS

subunit and one 2-ketone intermediate per asymmetric unit. The two subunits of the LuxS

molecular dimer are related by a crystallographic 2-fold axis of symmetry with two

identical active sites formed at the dimer interface (Figure 5.4A). The structure was

refined at 1.8 Å resolution to an R-factor and free R-factor of 22.5% and 25.6%,

respectively (Table 5.1). The model includes all residues of LuxS except for residues 1-3

at the N-terminus, for which there was no observed electron density. The estimated

atomic coordinate error of the structure is 0.20-0.35Å.
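For reference, the conventional crystallographic residual behind the R-factor values quoted above can be written as follows. This is the standard textbook definition, given here as an assumption about how the statistics in Table 5.1 were computed; the scale factor k and the summation notation are not taken from this work.

```latex
R = \frac{\sum_{hkl} \bigl|\, |F_{\mathrm{obs}}(hkl)| - k\,|F_{\mathrm{calc}}(hkl)| \,\bigr|}
         {\sum_{hkl} |F_{\mathrm{obs}}(hkl)|}
```

The free R-factor uses the same expression evaluated only over a test set of reflections that were excluded from refinement, which is why it is normally somewhat higher than the working R-factor.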

The 1.8 Å resolution diffraction limit of the crystals of the complex reported here,

which is the best that could be obtained after several trials, is somewhat lower than the

1.2 Å (PDB code 1J98) or 1.6 Å (PDB code 1IEO) diffraction limit for crystals of

uncomplexed BsLuxS (Hilgers and Ludwig, 2001, Ruzheinikov et al., 2001), but higher

than the 2.2 Å diffraction limit for crystals of BsLuxS in complex with SRH (PDB code

1JVI) (Ruzheinikov et al., 2001). For all four structure determinations, the diffraction

data were collected on similar rotating anode sources, thus the diffraction limit likely

reflects an increase in flexibility and/or conformational heterogeneity of the protein upon

substrate binding, which is more prevalent in the complex with SRH than in the complex with the 2-ketone intermediate. Accordingly, the mean temperature factor value of 29 Å2 for the structure of the complex with the 2-ketone intermediate lies between those of the structures of uncomplexed BsLuxS (19 Å2) and BsLuxS bound to SRH (37 Å2)

(Ruzheinikov et al., 2001).

The presence of the bound 2-ketone intermediate causes only very slight structural changes in the LuxS dimer, which can be superimposed on the 1.2 Å structure of the uncomplexed LuxS dimer (Ruzheinikov et al., 2001; PDB code 1J98) to an rmsd of

0.23 Å for all Cα atoms. Despite the fact that the active sites are located at the interface between the two subunits of the dimer, binding of the 2-ketone intermediate does not cause any significant changes in the protein-protein interactions at the interface. The largest differences between the two structures are on the order of ~1 Å and occur at Arg-

65 and Pro-96. Arg-65, which is at a kink in helix α1, interacts with the carboxylate of

the homocysteine moiety of the 2-ketone intermediate and is pulled toward the active site

by about 1.2 Å in the structure of the complex reported here as well as in the complex

with SRH (Ruzheinikov et al., 2001). Pro-96 of LuxS is modeled as Thr in the 1.2 Å

structure of uncomplexed BsLuxS as well as in the complex with SRH (Ruzheinikov et

al., 2001), but is clearly present as Pro in the current structure, resulting in ~1.2 Å shifts of the backbone atoms.

The electron density for the 2-ketone intermediate is complete (Figure 5.4B), allowing for placement of all of its constituent atoms. Figure 5.4C shows a close up view of the interactions between LuxS and the 2-ketone intermediate, and a schematic view indicating the distances between interacting atoms in the refined model is shown in

Figure 5.5. The refined temperature factors for the 2-ketone intermediate (40-50 Å2) are only slightly higher than for the protein atoms (20-40 Å2), indicating full or nearly full

occupancy of the compound. The Co2+ ion refines to a temperature factor of 32 Å2. Like

SRH, the 2-ketone intermediate is deeply buried at the dimer interface of LuxS, forming close, specific interactions with both subunits of the dimer. The homocysteine group of

the 2-ketone intermediate is bound in essentially the same way to LuxS as observed for bound SRH (Ruzheinikov et al., 2001), with the amino and carboxylate groups neutralized by interactions with the side chains of Asp-78(A) and Arg-65(A), respectively. The (A) following the residue number designates the subunit of the active site with the bound Co2+ ion, whereas residues of the other subunit are designated (B).

By contrast, the ribulose moiety of the 2-ketone intermediate is bound in a very different way to the enzyme than is seen for the substrate SRH (Ruzheinikov et al.,

2001). Instead of the closed ribose ring of SRH (Figure 5.6A), the 2-ketone intermediate is clearly in an open, extended conformation, with O4 separated from C1 by 3.5 Å. At the end of the 2-ketone intermediate, the O1 atom is anchored to the active site of LuxS through close interactions with Arg-39(B), His-11(B), and Ser-6(B), three residues that are highly conserved among all known LuxS sequences. The interactions involving His-

11(B) and Arg-39(B) are not seen in the complex with SRH, in which the O1 atom is positioned differently by about 3.4 Å. The O1 atom of the 2-ketone intermediate is also within 3.6 Å of the hydroxyl group of Tyr-89(B), another highly conserved residue of

LuxS.

The electron density for Ala-84(B), the site of the inactivating C84A substitution, is consistent with Ala (Figure 5.4B), providing further confirmation of the presence of the substitution. In structures of wild type LuxS, the cysteine residue at this position is oxidized to sulfinic/sulfonic acid, which inactivates the enzyme (Lewis et al., 2001,

Hilgers and Ludwig 2001, Ruzheinikov et al., 2001). The Cβ atom of Ala-84 is within van der Waals contact (3.4 Å) of the C1 and C2 atoms of the 2-ketone intermediate.

Thus, based on the structure of the 2-ketone intermediate bound to the C84A protein, the

–SH group of Cys-84 in the wild type enzyme would be positioned to potentially interact

with the C1, C2, C3, or O3 atoms of the intermediate, consistent with its proposed

general acid/base roles during catalysis (Figure 5.3).

Moving along the 2-ketone intermediate, the O2 atom of the ketone group is

oriented to form a close interaction (2.2 Å) with the Co2+ ion, which is also liganded by

His-54(A), His-58(A), and Cys-126(A), with approximate tetrahedral coordination

geometry maintained. The positions of the Co2+ ion and its protein ligands are not

significantly altered by the binding of the 2-ketone intermediate, as compared to the 1.2

Å structure of uncomplexed BsLuxS with Zn2+ (Ruzheinikov et al., 2001). The O2 atom

of the 2-ketone group occupies approximately the same position as the water molecule

that is the fourth ligand to the Zn2+ in the structure of uncomplexed BsLuxS (Figure

5.6B). The O2 atom of the 2-ketone group of the intermediate also interacts with the backbone amide of Gly-127(B) and a well-ordered water molecule. The O3 atom of the

2-ketone intermediate is also near the cobalt ion (2.9 Å) although it is not oriented directly toward it. The O3 atom also interacts with the carboxylate of Glu-57(A), and the backbone amide of Ala-84(B). On the basis of the retained tetrahedral geometry of the

metal center and minimal spectral changes of the Co2+ ion upon binding of the intermediate, it is concluded that the O3 atom of intermediate 4 is not directly coordinated with the metal ion. The O4 atom of the 2-ketone intermediate points away from the Co2+ ion, interacting with the side chain of Ser-6(B) and a water molecule.

Interestingly, a well-ordered water molecule (Wat-200) occupies the same

position as the O1 atom of the SRH substrate bound to LuxS. This water forms close

interactions with the side chain of Ser-6(B), the backbone carbonyl oxygen of Gln-125(A), and the O1 and O4 atoms of the 2-ketone intermediate. This water molecule is

shielded from bulk solvent by the side chain of Phe-7(B), which is invariant.

Conceivably, this water molecule could have a role in shuttling protons among the

various groups of the enzyme and substrate during the LuxS reaction.

5.3.2 Site-Directed Mutagenesis

The potential roles of Ser-6, His-11 and Arg-39 in substrate/intermediate binding

and catalysis were further investigated by site-directed mutagenesis. VhLuxS was chosen for the study because its higher kcat value of 0.40 s-1, compared to 0.03 s-1 for BsLuxS, facilitates the activity measurements. The S6A mutant showed ~14-fold lower activity toward SRH, primarily due to a decrease in kcat value (0.031 s-1); the KM value (45 μM) was essentially unchanged (Table 5.2). The other mutants (H11Q, R39M, and R39K) had ~1,000-fold reduction in activity (kcat/KM = 8–11 M-1 s-1). However, their kcat and KM values could not be accurately determined due to the low activities. The activity of these mutants toward the 2-ketone intermediate 4 was also assessed and compared to that of the wild-type enzyme (Table 5.2). Mutation of Ser-6 resulted in ~9-fold reduction in activity (kcat = 0.10 s-1), whereas the H11Q and R39M/K mutations reduced the activity by ~50 fold (kcat ~ 0.009 s-1). Again, the reduction in activity is primarily due to a decrease in kcat values. Thus, Ser-6, His-11, and Arg-39 are all involved in substrate/intermediate binding and/or catalysis. Furthermore, the different magnitude in activity reduction toward SRH vs. the 2-ketone intermediate indicates that these residues play important roles both in the formation and the decay of the 2-ketone intermediate.
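The fold reductions toward SRH quoted above can be checked with a short calculation of catalytic efficiency (kcat/KM). The sketch below is illustrative only. It assumes a wild-type KM of about 45 μM, inferred from the statement that the S6A KM value was essentially unchanged; the measured wild-type value is in Table 5.2 and may differ slightly.

```python
# Sketch: catalytic efficiencies toward SRH and the resulting fold reductions.
# Assumption: wild-type KM is about 45 uM (inferred from the text, not read from Table 5.2).

def catalytic_efficiency(kcat_per_s, km_um):
    """Return kcat/KM in M^-1 s^-1 from kcat (s^-1) and KM (micromolar)."""
    return kcat_per_s / (km_um * 1e-6)


wild_type = catalytic_efficiency(0.40, 45.0)  # roughly 8.9e3 M^-1 s^-1
s6a = catalytic_efficiency(0.031, 45.0)       # roughly 6.9e2 M^-1 s^-1

fold_s6a = wild_type / s6a    # about 13-fold, in line with the ~14-fold quoted
fold_low = wild_type / 11.0   # mutants at kcat/KM = 11 M^-1 s^-1
fold_high = wild_type / 8.0   # mutants at kcat/KM = 8 M^-1 s^-1
print(round(fold_s6a, 1), round(fold_low), round(fold_high))
```

Under this assumption the H11Q and R39M/K mutants fall in the range of roughly 800- to 1,100-fold reduction, consistent with the ~1,000-fold figure in the text.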

5.3.3 UV-VIS Spectroscopy

The interaction of the 2-ketone intermediate with the LuxS active site was further

investigated by absorption spectroscopy of Co2+-substituted C84A LuxS. The spectrum

of Co2+ ion is highly sensitive to its ligand environment, both the number and identity of

the ligands as well as the ligand geometry (Maret and Vallee, 1993). In the absence of the

2-ketone at pH 7.0, C84A LuxS exhibited three d-d transition bands at 660, 630, and 556

nm. The extinction coefficient for the 660 nm band, which is the strongest absorption

band, is ~430 M-1 cm-1, consistent with a tetrahedral ligand environment (Figure 5.7).

Note that the spectrum is somewhat different than that of wild-type or E57A mutant LuxS

(Zhu et al., 2003a), likely due to the removal of the negative charge associated with the thiolate ion of Cys-84. Upon the addition of 1.7 equivalents of the 2-ketone intermediate, the spectra underwent immediate changes. First, three d-d transition bands showed a decrease in intensity, with a maximum extinction coefficient of ~340 M-1 cm-1 at 660 nm.

Second, the band at 556 nm was blue shifted to 548 nm. These results are consistent with

the observations from the co-crystal structure that the metal ion remains tetrahedrally

coordinated with His-54, His-58, Cys-126, and the carbonyl oxygen of the 2-ketone

intermediate, which replaces the water in the free enzyme as the fourth ligand.
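The extinction coefficients above relate to the measured absorbance through the Beer-Lambert law, A = ε·c·l. The sketch below shows the absorbance expected at 660 nm for the 220 μM enzyme sample before and after addition of the 2-ketone intermediate, and the ligand-to-enzyme ratio. A 1 cm path length is assumed, since the cuvette path length is not stated in this work.

```python
# Sketch: Beer-Lambert estimate of the 660 nm absorbance of Co2+-substituted C84A LuxS.
# Assumption: 1 cm optical path length (not stated in the text).

def absorbance(epsilon_per_m_cm, conc_um, path_cm=1.0):
    """Absorbance from an extinction coefficient (M^-1 cm^-1) and a concentration (uM)."""
    return epsilon_per_m_cm * conc_um * 1e-6 * path_cm


a_free = absorbance(430.0, 220.0)   # enzyme alone, about 0.095
a_bound = absorbance(340.0, 220.0)  # with 2-ketone intermediate, about 0.075
equivalents = 370.0 / 220.0         # about 1.7 equivalents of intermediate
print(round(a_free, 4), round(a_bound, 4), round(equivalents, 1))
```

Extinction coefficients of a few hundred M-1 cm-1 give absorbances below 0.1 at this concentration, which is why a relatively concentrated enzyme sample is needed to observe the d-d bands.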

5.3.4 Mechanistic Implications

Central to the proposed mechanism of LuxS reaction is the Lewis acid function

of the metal ion (Figure 5.3). Previous observation that wild-type Co2+-LuxS (but not the

C84A mutant) underwent dramatic spectral changes during the catalytic cycle implies

that the metal ion is directly coordinated with the substrate and intermediates (Zhu et al.,

2003a). However, the co-crystal structure of LuxS bound with SRH showed that the

metal was not directly ligated to SRH (Ruzheinikov et al., 2001). In this work, co-

crystallization and absorption spectroscopy both show that a catalytic intermediate is

directly coordinated with the metal center. It is reasonable to suggest that other catalytic

intermediates such as the 3-ketone intermediate and the putative enediolates are also

directly coordinated with the metal ion during catalysis, although further experimentation

is needed to confirm this. The lack of direct binding of SRH to the metal ion can be

reconciled by the hypothesis that the ribose ring form of SRH, while the dominant form under physiological conditions, is not catalytically active; rather, the open, free aldehyde form is the reactive species. The monodentate interaction between the ribulose

moiety and the metal is somewhat surprising but reasonable. The enediolate formed after

the abstraction of the C3 proton by Cys-84 would be expected to bind the metal ion in a

bidentate fashion, as proposed for the aldose-ketose isomerases (Collyer et al., 1990).

Perhaps the LuxS active site is designed to favor the monodentate binding so that

catalysis would not be trapped in the enediolate stage.

The structure and the mutagenesis studies provide additional insight into the

function of active-site residues. Cys-84 has been proposed to act as the general base/acid,

catalyzing the proton transfers along C1-C3 positions (Zhu et al., 2003a, Zhu et al.,

2003b). The co-crystal structure of the Cys-84Ala mutant shows that the side chain of

Cys-84, based on the position of Ala-84, would indeed be properly positioned for such a

role. It has been previously shown that mutation of Cys-84 to Ala inactivates the enzyme,

whereas the C84D and C84S mutants retain residual activities (Table 5.2) (Zhu et al.,

2003b). Thus, both structural and kinetic data are consistent with the proposed function

for Cys-84. Glu-57 was proposed to catalyze the transfer of the C2 (and C3) hydroxyl

proton to the C1 (and C2) oxyanion as well as serving as the general base for the final β-

elimination reaction. The position of the Glu-57 side chain relative to the bound

intermediate is consistent with this latter role, which is also supported by the observation that mutation of Glu-57 into Ala results in the accumulation of the 3-ketone intermediate

7 (Zhu et al., 2003b). Since its side chain carboxylate is hydrogen bonded to the C3

hydroxyl group, it can act as a general base to abstract the hydroxyl proton to form the

enediolate intermediate. However, its carboxylate group is clearly too far from the C2

oxygen atom to allow a direct transfer of the proton originally derived from the C3-OH to

the C2 oxyanion. Then, where does the C2 oxyanion pick up the needed proton to

become a hydroxyl group? There are several other residues near the bound intermediate,

Ser-6, His-11, Arg-39, and Tyr-89, all of which are highly conserved residues. Ser-6 is involved in substrate/intermediate binding; its mutation results in moderate reduction in catalytic activity. His-11 and Arg-39 are hydrogen bonded to the C1 hydroxyl group of the 2-ketone intermediate. However, the ~1000-fold reduction in catalytic activity upon their mutation suggests that these two residues may have roles in addition to substrate/intermediate binding. It is conceivable that His-11 acts as the general acid donating a proton to the C2 oxyanion as it departs from the metal ion. To regenerate the active enzyme, Glu-57 needs to be deprotonated, while His-11 must be protonated. We tentatively propose that Arg-39, Tyr-89, and/or several structured water molecules

(Figure 5.4B) may serve as the proton shuttle to transfer the proton from Glu-57 to His-

11.

In conclusion, the present study has provided direct evidence for the previously proposed Lewis acid function of the metal ion in LuxS-catalyzed reaction. It also provides new insights into the catalytic function of active site residues. The results should facilitate further mechanistic investigation of the enzyme and the design of selective

LuxS inhibitors.

Figure 5.1 Schematic representation of the different types of bacterial QS (Figure adapted from Federle and Bassler, 2003). A) QS in gram negative bacteria. AHLs (red triangles) are synthesized by the enzyme LuxI. AHLs freely diffuse outside the cell and once the concentration reaches a threshold, AHLs bind to LuxR and the LuxR-AHL complex turns on various genes in the bacteria. B) QS in gram positive bacteria. Short peptides (red wavy lines) are transported outside the cell by transporters. These peptides bind to a sensor kinase-response regulator pair, which undergoes phosphorylation at specific His and Asp residues, respectively. This in turn activates various genes of the bacteria. C) QS in V. harveyi. LuxLM produces AI-1 for intraspecies communication. The AI-1 molecules bind to LuxN, which undergoes phosphorylation. LuxS produces AI-2, which helps in interspecies communication. LuxP and LuxQ are essential for the detection of AI-2. The LuxN-AI-1 and LuxPQ-AI-2 complexes activate LuxU and LuxO, which turn on other genes in the cell.

Figure 5.2 Biosynthetic pathway of AI-2.

Figure 5.3 Proposed catalytic mechanism of LuxS.

Figure 5.4 Crystal structure of LuxS in complex with the 2-ketone intermediate 4: (A) stereo ribbon drawing of the LuxS dimer bound to the 2-ketone intermediate. The two intermediate compounds are bound at the interface and shown in ball-and-stick with green bonds. The Co2+ ions of each active site are shown as purple spheres. The N- and C-terminal residues of the model (Val-4 and Gly-157) are indicated; (B) stereo view of electron density in the active site region. The view is approximately the same as for the active site on the right of the dimer in panel A. The blue cage is the 1.8 Å 2Fobs-Fcalc electron density map contoured at 1 σ. The red cage is a 1.8 Å Fobs-Fcalc simulated annealing omit map contoured at 3.5 σ, for which the atoms of the 2-ketone intermediate were omitted for the refinement and map calculation; (C) stereo view of the atomic interactions between LuxS and the 2-ketone intermediate. Amino acid residues of subunit A of the dimer, which has the bound Co2+ ion for this active site, are shown with gold bonds. Amino acid residues of the other subunit (B) are shown with cyan bonds. Atom types are colored as follows: oxygen, red; carbon, black; nitrogen, blue; sulfur, yellow; cobalt, purple. All potential hydrogen bonds within 3.5 Å are shown as dotted lines. Notice that the oxygen atom of the 2-ketone position of the intermediate is coordinated to the cobalt ion. Also notice that the Ala-84 Cβ atom of the Cys-84Ala LuxS protein is positioned near the C2 and C3 atoms of the 2-ketone intermediate.


Figure 5.5 Schematic view of the interactions between LuxS and the 2-ketone intermediate. Potential hydrogen bonding interactions within 3.5 Å are shown as dashed lines with each distance indicated. Residues within van der Waals contact are also indicated. Notice that the O2 atom of 2-ketone group of the intermediate is within 2.2 Å of the Co2+ ion. The mean coordinate error of the crystallographic model is estimated to be 0.2-0.35 Å. The figure was prepared using LigPlot (Wallace et al., 1995).


Figure 5.6 Comparison of the structure of LuxS bound to 2-ketone intermediate 4 with the structures of LuxS bound to SRH and uncomplexed LuxS: (A) the structure of LuxS bound to 2-ketone intermediate 4 (ball-and-stick with green bonds) is superimposed on SRH (magenta bonds) from the structure of the BsLuxS bound to SRH (PDB code 1JVI) (Ruzheinikov et al., 2001). The superposition is based on all Cα atoms of the LuxS dimer for each structure. The view is approximately the same as for figures 5.4B and 5.4C. Notice that the difference in the conformation of the substrates is greatest for the O1 atom; (B) superposition of the metal centers in the structures of LuxS with Co2+ bound to 2-ketone intermediate 4 (protein ligands in gold bonds) and uncomplexed LuxS with Zn2+ (protein ligands in magenta bonds). The water molecule (red sphere) is from the structure of uncomplexed LuxS (PDB code 1J98) (Ruzheinikov et al., 2001).


Figure 5.7 UV-Vis absorption spectra of C84A Co2+-BsLuxS (220 μM) in the absence and presence of 2-ketone intermediate 4 (370 μM). The LuxS and ketone 4 were rapidly mixed and the spectra were recorded at 0, 30, 60, and 90 min.


Data Collection Statistics

  Space group                           P6522
  a = b (Å)                             62.5
  c (Å)                                 149.7
  Resolution (Å)                        16.1-1.8
  No. of reflections                    205,678
  No. of unique reflections             17,419
  Completeness (%)                      99.6 (97.3)ᵃ
  Redundancy                            11.8 (8.9)
  Rmergeᵇ                               9.9 (53.4)
  I/σᶜ                                  12.7 (4.1)

Refinement Statistics

  Resolution (Å)                        16.1-1.8
  No. of reflections (working/free)     15,098/1,638
  Completeness (%)                      99.7
  Mean B-factor (Å2)                    29.0
  Estimated Coordinate Errorᵈ           0.35

RMS Deviation from Ideal Geometry

  Bonds (Å)                             0.006
  Angles (°)                            1.2
  R-factor                              22.5 (42.1)
  Rfree (%)ᵉ                            25.6 (42.4)
  No. of waters                         103

ᵃ Numbers in parentheses refer to the highest resolution shell only.
ᵇ Rmerge = ∑|Ih - ‹I›h|/∑Ih, where ‹I›h is the average intensity over symmetry equivalents.
ᶜ I/σ is the mean of the intensity/sigma of the unique, averaged reflections.
ᵈ The estimated coordinate error is the value from the cross-validated ∑ plot.
ᵉ R-factor = ∑|Fobs - Fcalc|/∑Fobs. Rfree is calculated from 10% of the reflections that are omitted from the refinement.

Table 5.1 Data collection and refinement statistics for LuxS-2 ketone intermediate 4 structure.
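The statistics defined in the footnotes to Table 5.1 can be illustrated with a minimal Python sketch that evaluates Rmerge and the R-factor directly from their definitions. The function names and the toy intensities and amplitudes below are hypothetical and are not data from this study; the values reported in the tables were produced by the processing and refinement programs cited in the text, not by this code.

```python
def r_merge(observations):
    """Rmerge = sum|I_h - <I>_h| / sum(I_h).

    `observations` maps each unique reflection (h, k, l) to the list of
    intensities measured for its symmetry equivalents (hypothetical layout).
    """
    numerator = 0.0
    denominator = 0.0
    for intensities in observations.values():
        mean_intensity = sum(intensities) / len(intensities)
        numerator += sum(abs(i - mean_intensity) for i in intensities)
        denominator += sum(intensities)
    return numerator / denominator


def r_factor(f_obs, f_calc):
    """R-factor = sum|Fobs - Fcalc| / sum(Fobs) over paired amplitudes."""
    return sum(abs(fo - fc) for fo, fc in zip(f_obs, f_calc)) / sum(f_obs)


# Toy data, invented for illustration only.
toy_observations = {
    (1, 0, 0): [100.0, 110.0, 90.0],
    (0, 1, 1): [50.0, 50.0],
}
toy_f_obs = [10.0, 20.0, 30.0]
toy_f_calc = [9.0, 22.0, 30.0]

print(f"Rmerge   = {100 * r_merge(toy_observations):.1f}%")
print(f"R-factor = {100 * r_factor(toy_f_obs, toy_f_calc):.1f}%")
```

Rfree uses the same expression as the R-factor, evaluated only over the reflections set aside from refinement (10% in the structures reported here).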


Table 5.2: Catalytic Activity of LuxS Mutants. (a Data from Zhu et al., 2003b).


5.4 Co-crystallization of BsLuxS with various inhibitors

5.4.1 Introduction

Since QS controls many bacterial behaviors including biofilm formation and

bacterial virulence, proteins involved in quorum sensing are being explored as novel

targets for antibacterial drug design (Federle and Bassler, 2003, Lyon and Muir, 2003,

Suga and Smith, 2003). Halogenated furanone derivatives have been shown to act as AI-1

antagonists and they inhibited the expression of virulence factors by Pseudomonas

aeruginosa and increased bacterial susceptibility to the antibiotic tobramycin (Hentzer et al.,

2002, Ren et al., 2002, Hentzer et al., 2003). In a mouse pulmonary infection model, the

drug inhibited quorum sensing of the infecting bacteria and promoted their clearance by

the mouse immune response (Hentzer et al., 2003). Suga and coworkers synthesized a

series of AHL analogs, some of which acted as antagonists of quorum sensing and

interfered with virulence expression and biofilm formation by P. aeruginosa (Smith et

al., 2003a, Smith et al., 2003b). LuxS/AI-2 also regulates a host of bacterial behaviors

including virulence, biofilm formation, motility, toxin and antibiotic production,

luminescence, and ABC transporter expression (Federle and Bassler, 2003). Further, the

luxS gene is present in the majority of Gram-positive and Gram-negative bacteria and is

highly conserved (Federle and Bassler, 2003). Thus, AI-2 antagonists and LuxS inhibitors

have potential as a class of unconventional, broad-spectrum antibacterial agents. Two

SRH analogues have recently been reported as weak inhibitors of LuxS (IC50 ~1 mM)

(Alfaro et al., 2004), but potent LuxS inhibitors are still lacking.

In order to develop a potent LuxS inhibitor, several substrate and intermediate analogues have been synthesized by Dr. Pei’s group. Those compounds that gave higher inhibition of the enzyme activity were chosen and co-crystallized with LuxS. The structures of LuxS with two inhibitors, 10 and 11, have been determined, and the difference in their stereochemistry has been confirmed by the crystal structures. The structures also show how these compounds are bound to the active site of the enzyme. Since the structures with these inhibitors were determined using native BsLuxS, it was possible to analyze the role of Cys84 in the catalytic mechanism of LuxS.

BsLuxS was also co-crystallized with a substrate analog, S-anhydroribosyl-L-homocysteine (SA-1). SA-1 is an inhibitor of LuxS that blocks the initial stage of the LuxS catalytic pathway, the aldose-ketose isomerization (Alfaro et al., 2004). The earlier crystal structure of LuxS with a substrate had the catalytic Cys84 oxidized, and hence detailed interactions involving Cys84 were not visible (Hilgers and Ludwig, 2001, Ruzheinikov et al., 2001). In the present study, the structure of the SA-1 complex was determined with the wild type enzyme, which has an active Cys84. Hence it showed the details of the interactions of the substrate analog in the active site of LuxS.

5.4.2 Materials and Methods

5.4.2.1 Synthesis of the inhibitors

(2S)-2-Amino-4-[(2R,3S)-2,3-dihydroxy-3-N-hydroxycarbamoylpropylmercapto]butyric acid (10). A solution of 1 M BCl3 in CH2Cl2 (0.50 mL, 0.50 mmol) was charged into a 10 mL round bottom flask and cooled to –78 ºC, and compound 21 (53 mg, 0.080 mmol) in CH2Cl2 (0.72 mL) was added dropwise at –78 ºC. The yellow reaction mixture was stirred at –78 ºC for 3 h. MeOH/CH2Cl2 (2:1, 1.08 mL) was then added at –78 ºC and the mixture was stirred for another 5 min. The solvent was removed by rotary evaporation, 2 mL of MeOH was added, and the solvent was again removed by rotary evaporation. The

residue was purified by silica gel chromatography (30% H2O in CH3CN) to give product

4 as a white gel (15 mg, 71% yield, Rf = 0.34). The product was further purified by

affinity chromatography with Affi-Gel® boronate gel (Bio-Rad), which was eluted with

30 mM Na2HPO4-NaH2PO4 buffers of the following pH’s: 8.5, 8.1, 7.6, 7.0, 6.75, 6.5,

6.25, 6.0, 5.5, 5.0, and 4.5. Compound 4 was collected over the pH range of 6.5-6.0. 1H

NMR (400 MHz, D2O): δ 4.09 (d, J = 4.4 Hz, 1H), 3.86 (quintet, J = 4.4 Hz, 1H), 3.68 (t,

J = 6.0 Hz, 1H), 2.65 (dd, J = 14.0 Hz, 4.0 Hz, 1H), 2.58-2.51 (m, 3H), 2.02-1.92 (m,

2H). 13C NMR (100 MHz, D2O): δ 174.2, 170.1, 72.8, 71.1, 53.9, 33.0, 30.3, 27.4.

HRESI-MS: calcd. for C8H15N2O6S⁻ (M – H)⁻ 267.0651; found: 267.0656.

(2S)-2-Amino-4-[(2R,3R)-2,3-dihydroxy-3-N-hydroxycarbamoylpropylmercapto]butyric acid (11). Compound 26 (42 mg, 0.09 mmol) was dissolved in 1 mL of TFA and the solution was stirred at RT for 1 h. TFA was removed by rotary evaporation and the remaining residue was treated with 1 N HCl for 4 h at RT. The mixture was concentrated and the crude product was purified on a silica gel column eluted with 30% H2O in CH3CN to produce a white solid (24 mg, quantitative yield, Rf = 0.34). 1H NMR (400 MHz, D2O): δ 4.25 (d, J = 2.8 Hz, 1H), 4.02-3.98 (m, 1H), 3.79-3.76 (m, 1H), 2.75-2.67 (m, 2H), 2.64 (t, J = 7.6 Hz, 2H), 2.15-2.00 (m, 2H). 13C NMR (100 MHz, D2O): δ 174.1, 170.9, 72.0, 70.6, 53.8, 33.6, 30.3, 27.2. HRESI-MS: calcd. for C8H15N2O6S⁻ (M – H⁺) 267.0651; found 267.0628.

S-anhydroribosyl-L-homocysteine (SA-1). This compound was a gift from Dr. Zhou’s group and it was synthesized according to the protocol mentioned in Alfaro et al. 2004.
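The calculated HRESI-MS value quoted for compounds 10 and 11 can be checked with a short Python sketch that sums monoisotopic atomic masses for the stated ion formula, C8H15N2O6S. The atomic masses are standard values for the most abundant isotopes; the function names are hypothetical, and the parts-per-million comparison is added here only for illustration and is not part of the original analysis.

```python
# Monoisotopic masses (u) of the most abundant isotopes.
MONOISOTOPIC_MASS = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.00307401,
    "O": 15.99491462,
    "S": 31.97207117,
}


def monoisotopic_mass(composition):
    """Sum of monoisotopic atomic masses for a {element: count} composition."""
    return sum(MONOISOTOPIC_MASS[el] * n for el, n in composition.items())


def ppm_error(found, calculated):
    """Deviation of a measured m/z from the calculated value, in ppm."""
    return (found - calculated) / calculated * 1e6


# Ion formula given in the text for both diastereomers (10 and 11).
ion = {"C": 8, "H": 15, "N": 2, "O": 6, "S": 1}
calculated = monoisotopic_mass(ion)

print(f"sum of atomic masses for C8H15N2O6S: {calculated:.4f}")
for compound, found in [("10", 267.0656), ("11", 267.0628)]:
    print(f"compound {compound}: found {found:.4f}, "
          f"deviation {ppm_error(found, 267.0651):+.1f} ppm")
```

The sum of atomic masses rounds to 267.0651, the calculated value quoted above; this simple sum does not include the small correction for the mass of the extra electron on the anion.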


5.4.2.2 Crystallization and X-ray Diffraction

Co-BsLuxS was co-crystallized with inhibitors (compounds 10 and 11) by the

hanging drop vapor diffusion method. The well solution consisted of 0.1 M HEPES pH

7.0 and 2.2 M ammonium sulfate. LuxS protein (10 mg/mL) was mixed with 5.4 mM

inhibitor, 25 mM Tris-HCl (pH 8.0), and 100 mM NaCl. The crystals were transferred to a cryoprotectant solution consisting of 0.1 M HEPES (pH 7.0), 2.5 M ammonium sulfate,

15% sucrose, and 5.4 mM inhibitor, mounted in nylon loops, and frozen in liquid nitrogen.

X-ray diffraction data were collected at -180 °C using a Rigaku RUH3RHB rotating anode

generator and an R-AXIS IV++ image plate detector. The data were processed with

CrystalClear software (Molecular Structure Corporation). Crystallographic refinement

using CNS (Brünger et al., 1998) began with the structure of BsLuxS 2-ketone intermediate complex (PDB code 1YCL) (Rajan et al., 2005) after removing all water molecules, Co2+, and the 2-ketone intermediate. The conjugate gradient minimization and

individual temperature factor protocols were used with a maximum likelihood target and

overall anisotropic temperature factor and bulk solvent corrections to the data. The

inhibitors were added to the model at the later stages of refinement after all of the protein

atoms, Co2+, and several water molecules were positioned. Stereochemical parameters for

refinement of the inhibitor were generated using the Dundee PRODRG2 server

(Schuettelkopf and Van Aalten, 2004). Model building used the program O (Jones et al.,

1991), and figures were generated with MOLSCRIPT (Kraulis, 1991) and RASTER3D

(Merritt and Bacon, 1997).


5.4.3 Results and Discussion

5.4.3.1 Structure of LuxS in complex with compounds 10 and 11

Of the various compounds synthesized and tested for inhibition of LuxS,

compounds 10 and 11 were the most potent, competitive inhibitors with KI values of 0.72

and 0.37 μM, respectively. To gain insight into the structural basis of LuxS inhibition by

the above compounds, Co-BsLuxS was co-crystallized with the inhibitors 10 and 11 from

a solution of ammonium sulfate at pH 7.0. Crystals of both LuxS complexes are

isomorphous with those of the 2-ketone intermediate (4) complex (PDB code 1YCL)

(Rajan et al., 2005), and contain a LuxS dimer oriented along a 2-fold crystallographic

axis, with one monomer per asymmetric unit. The active site is formed at the dimer

interface. The structure with inhibitor 10 was refined at 1.8 Å resolution to an R-factor of

19.4% and a free R-factor of 22.6% (Table 5.3). The final model includes residues 4 to

157 of LuxS, one Co2+ ion, 138 water molecules, two sulfate ions and one inhibitor

molecule. The temperature factors for the protein atoms range from 11-45 Å2, while those

for the inhibitor range from 19-24 Å2, indicating full occupancy of the inhibitor. The Cα

atoms of the structures of LuxS bound to 10 and the 2-ketone intermediate can be superimposed to an rmsd of 0.2 Å. There are no significant conformational changes in

LuxS, even for side chains of residues at the active site.

Clear electron density for inhibitor 10 allowed for it to be fit into the model unambiguously (Figure 5.8A). Inhibitor 10 is bound to LuxS in very much the same way as the 2-ketone intermediate (Figure 5.8D), forming a very similar set of interactions

(Figure 5.8B and Table 5.4). The oxygen atoms of the 2-ketone and 3-hydroxyl groups of inhibitor 10 interact closely with the Co2+ atom, while the oxygen atom of the

hydroxamate group (O1) is hydrogen bonded to the side chains of Arg-39 and His-11.

The O1 atom does not however interact directly with Ser-6, as is seen for the 2-ketone, due to a change in the N1-C2 torsion angle imposed by its partial double bond character.

The -SH group of the catalytic Cys-84 residue of LuxS, which is replaced by alanine in the complex with the 2-ketone inhibitor, is close to the C2 (3.5 Å), N1 (3.4 Å), and C3

(4.1 Å) atoms of the inhibitor (Table 5.4). This is consistent with Cys-84 serving as a proton donor or acceptor to (or from) these atoms during the course of the reaction, as outlined in Figure 5.3. The atoms of the homocysteine portion of the inhibitor are bound essentially the same as observed for the 2-ketone complex. Interestingly, a well-ordered water molecule, near the O2 atom of inhibitor 10 and forming hydrogen bonds with Ser-6 and His-11, is observed in both structures. Given the proximity of this water to the reactive groups of the protein and the inhibitor, it could conceivably have an important role in the reaction.

The structure of Co-BsLuxS was also determined in complex with inhibitor 11, and refined at 1.9 Å resolution to an R-factor of 19.1% and a free R-factor of 22.2% (Table

5.3 and Figure 5.8C). The electron density allowed for all atoms of inhibitor 11 to be placed in the structure unambiguously. Inhibitor 11 differs from inhibitor 10 only in having an inverted stereochemical configuration at the C4 position. This change is evident in the electron density map, although the density for the C4 and O4 atoms is slightly weaker than for the rest of the compound, indicating a higher degree of flexibility. Due to the change in configuration, the O4 atom of inhibitor 11 points down

(as viewed in Figure 5.8C), and is closer to the Co2+ atom (3.5 Å) than is the case for the

O4 atom of inhibitor 10 (5.2 Å). Thus, the structure of the complex with inhibitor 11


demonstrates that LuxS binds to substrate-related compounds with some degree of

conformational variation at the C4 position. In comparing the structures of Co-BsLuxS

bound to inhibitors 10, 11, and the 2-ketone intermediate, it is apparent that LuxS binds

to substrate-related compounds with a virtually fixed conformation for the homocysteine

portion, and a more adaptable conformation for the functional groups of the ribosyl

portion, as would be predicted from the reaction scheme in Figure 5.3.

5.4.3.2 Structure of LuxS in complex with SA-1

SA-1 is a substrate analog which can arrest the catalysis of the wild type LuxS

enzyme. The crystal structure of LuxS with SA-1 would help in analyzing the interactions

of the native enzyme with the substrate. In the earlier structure of LuxS with substrate,

the active site Cys (Cys-84) was oxidized, and hence a detailed analysis of the

interactions was impossible (Ruzheinikov et al., 2001). In the present structure of LuxS with SA-1, the Cys-84 of BsLuxS is in the active form. The structure was refined at 1.6 Å resolution to an R-factor of 20.7% and a free R-factor of 23.3% (Table 5.5). There was

clear electron density for SA-1 and the presence of the compound in the ring form was

evident from the density (Figure 5.9A). SA-1 differs from the 2-ketone intermediate 4 in that the O1 atom is missing (Figure 5.9C), because of the ring closure at this position. In the LuxS-SA-1 complex, the O3 atom of SA-1 is closer to Co2+ (1.9 Å) than the O2 atom (2.4 Å) (Figure 5.9B, Table 5.6). In the 2-ketone intermediate 4 structure, the O2 atom was closer to the metal (2.2 Å) than the O3 atom (2.9 Å). The O3-Co2+ distance in the SA-1 structure is the shortest metal-O2/O3 distance observed among the structures solved in this project. Another important difference is the altered conformation of the homocysteine portion of the compounds (Figure 5.9C). This shows that the homocysteine portion can adopt different conformations to fit into the active site. The water molecule that was noted in the active site in the other structures is absent in the LuxS-SA-1 complex.


Figure 5.8 Structure of Co-BsLuxS in complex with inhibitors 10 and 11. The two subunits of LuxS are colored gold and cyan, and the Co2+ ion is colored magenta. (A) Stereo view of the electron density for inhibitor 10. The blue cage is the 1.8 Å 2Fobs – Fcalc electron density map contoured at 1 σ. The red cage is a Fobs – Fcalc map, contoured at 2.5 σ, calculated before the inhibitor was added to the model. The inhibitor is shown with green bonds. (B) Stereo view showing hydrogen-bonding interactions between LuxS and inhibitor 10 as dotted lines. (C) Stereo view showing the hydrogen-bonding interactions in the structure of Co-BsLuxS in complex with inhibitor 11 (grey bonds). Notice that the stereochemical configuration at the C4 position is inverted relative to inhibitor 10 in panel B, causing the O4 atom to point down, in the direction of the Co2+ atom. (D) Stereo view of a superposition of the structures of inhibitors 10 (green), 11 (grey), and the 2-ketone intermediate (magenta) in complex with Co-BsLuxS. The superposition is based on the protein atoms, which are shown only for the structure of LuxS in complex with inhibitor 10, but are virtually identical in the other two structures. Notice that the O2 and O3 atoms form very similar interactions with the Co2+ ion in the three structures.


Data Collection Statistics              Inhibitor 10      Inhibitor 11

  Space group                           P6522             P6522
  a = b (Å)                             62.9              62.8
  c (Å)                                 148.6             148.6
  Resolution (Å)                        20.4-1.8          27.2-1.9
  No. of reflections                    89,103            97,262
  No. of unique reflections             15,675            14,925
  Completeness (%)                      91.4 (54.5)ᵃ      98.6 (90.6)
  Redundancy                            5.5 (1.4)         5.9 (1.2)
  Rmergeᵇ                               6.2 (20.3)        5.2 (18.4)
  I/σᶜ                                  17.2 (3.9)        22 (5.1)

Refinement Statistics

  Resolution (Å)                        20.4-1.8          27.2-1.9
  No. of reflections (working/free)     14,054/1,568      13,382/1,500
  Completeness (%)                      91.1              98.6
  Mean B-factor (Å2)                    21.1              19.6
  Estimated Coordinate Errorᵈ           0.17              0.16

RMS Deviation from Ideal Geometry

  Bonds (Å)                             0.004             0.005
  Angles (°)                            1.2               1.2
  R-factor                              19.4 (26.8)       19.1 (22.1)
  R-free (%)ᵉ                           22.6 (32.2)       22.1 (26.1)
  No. of waters                         138               130

ᵃ Numbers in parentheses refer to the highest resolution shell only.
ᵇ Rmerge = ∑|Ih - ‹I›h|/∑Ih, where ‹I›h is the average intensity over symmetry equivalents.
ᶜ I/σ is the mean of the intensity/sigma of the unique, averaged reflections.
ᵈ The estimated coordinate error is the value from the cross-validated ∑ plot.
ᵉ R-factor = ∑|Fobs - Fcalc|/∑Fobs. Rfree is calculated from 10% of the reflections that are omitted from the refinement.

Table 5.3 Data collection and refinement statistics for inhibitors 10 and 11.


Inhibitor atom    LuxS atom       Inhibitor 10      Inhibitor 11
                                  distance (Å)      distance (Å)

N1                Cys-84 SG       3.4               3.5
O1                His-11 NE2      2.7               2.6
O1                Arg-39 NH1      3.2               3.4
O1                Arg-39 NH2      2.8               3.1
O1                Gly-127 N       3.5               3.4
C2                Cys-84 SG       3.5               3.6
O2                Co2+            2.2               2.1
O2                Waterᵃ          3.1               3.2
O3                Co2+            2.3               2.3
O3                Glu-57 OE1      2.6               2.7
O3                His-58 NE2      3.0               3.0
O4                Ser-6 OG        2.5               3.7
N                 Asp-78 OD1      2.9               2.9
N                 Ile-79 O        2.8               2.8
N                 Ser-80 OG       3.3               3.3
OXT               Asp-78 OD1      3.2               3.3
OXT               Ile-79 N        3.0               3.1

Table 5.4 List of atomic interactions between LuxS and Inhibitors 10 and 11. a In LuxS/10 complex, the number of the water molecule was 469 and in LuxS/11 it was 448.
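The comparison of the two inhibitor complexes in Table 5.4 can be made explicit with a small Python sketch that tabulates the listed distances and reports which contacts differ most between inhibitors 10 and 11. The distances are copied from Table 5.4; the 0.5 Å threshold for a notable shift and the function names are assumptions made only for this illustration, while the 3.5 Å cutoff is the one used for potential hydrogen bonds in the figure legends.

```python
CUTOFF = 3.5  # Å, cutoff used in the figure legends for potential hydrogen bonds

# (inhibitor atom, LuxS atom, distance in complex 10, distance in complex 11), in Å
CONTACTS = [
    ("N1", "Cys-84 SG", 3.4, 3.5),
    ("O1", "His-11 NE2", 2.7, 2.6),
    ("O1", "Arg-39 NH1", 3.2, 3.4),
    ("O1", "Arg-39 NH2", 2.8, 3.1),
    ("O1", "Gly-127 N", 3.5, 3.4),
    ("C2", "Cys-84 SG", 3.5, 3.6),
    ("O2", "Co2+", 2.2, 2.1),
    ("O2", "Water", 3.1, 3.2),
    ("O3", "Co2+", 2.3, 2.3),
    ("O3", "Glu-57 OE1", 2.6, 2.7),
    ("O3", "His-58 NE2", 3.0, 3.0),
    ("O4", "Ser-6 OG", 2.5, 3.7),
    ("N", "Asp-78 OD1", 2.9, 2.9),
    ("N", "Ile-79 O", 2.8, 2.8),
    ("N", "Ser-80 OG", 3.3, 3.3),
    ("OXT", "Asp-78 OD1", 3.2, 3.3),
    ("OXT", "Ile-79 N", 3.0, 3.1),
]


def shifted_contacts(contacts, min_shift=0.5):
    """Contacts whose distance differs by more than `min_shift` Å between complexes."""
    return [c for c in contacts if abs(c[3] - c[2]) > min_shift]


def count_within(contacts, column, cutoff=CUTOFF):
    """Number of listed contacts at or below the cutoff (column 2 = 10, column 3 = 11)."""
    return sum(1 for c in contacts if c[column] <= cutoff)


for atom, partner, d10, d11 in shifted_contacts(CONTACTS):
    print(f"{atom} to {partner}: {d10} Å (10) vs {d11} Å (11)")
print("contacts within cutoff:", count_within(CONTACTS, 2), "(10),",
      count_within(CONTACTS, 3), "(11)")
```

With these values the only contact that shifts by more than 0.5 Å is O4 to Ser-6 OG, in line with the inverted configuration at C4 of inhibitor 11 described in the text.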


Figure 5.9 Structure of LuxS in complex with SA-1. The two subunits of LuxS are colored gold and cyan, SA-1 is colored green, and the Co2+ ion is colored magenta. (A) Stereo view of the electron density for SA-1. The blue cage is the 1.6 Å 2Fobs – Fcalc electron density map contoured at 1 σ. The red cage is a Fobs – Fcalc map, contoured at 2.5 σ, calculated before SA-1 was added to the model. (B) Stereo view showing hydrogen-bonding interactions as dotted lines. (C) Overlay of the LuxS-bound 2-ketone intermediate 4 (purple) onto the structure of LuxS in complex with SA-1. The superposition is based on the protein atoms of the two structures. The protein atoms of LuxS from the structure of the complex with the 2-ketone intermediate 4 are not shown. The superposition shows the altered conformation of the homocysteine portion of the two compounds. Also note the open ring conformation in the 2-ketone intermediate 4 structure and the closed ring in the SA-1 structure.


Data Collection Statistics

  Space group                           P6522
  a = b (Å)                             62.6
  c (Å)                                 149.0
  Resolution (Å)                        17.6-1.6
  No. of reflections                    181,057
  No. of unique reflections             23,846
  Completeness (%)                      98.4 (87.8)ᵃ
  Redundancy                            7.2 (1.4)
  Rmergeᵇ                               4.3 (29.9)
  I/σᶜ                                  22.3 (3.3)

Refinement Statistics

  Resolution (Å)                        17.6-1.6
  No. of reflections (working/free)     23,672/2,336
  Completeness (%)                      98.4
  Mean B-factor (Å2)                    24.0
  Estimated Coordinate Errorᵈ           0.24

RMS Deviation from Ideal Geometry

  Bonds (Å)                             0.005
  Angles (°)                            1.3
  R-factor                              20.7 (34.3)
  R-free (%)ᵉ                           23.3 (35.5)
  No. of waters                         141

ᵃ Numbers in parentheses refer to the highest resolution shell only.
ᵇ Rmerge = ∑|Ih - ‹I›h|/∑Ih, where ‹I›h is the average intensity over symmetry equivalents.
ᶜ I/σ is the mean of the intensity/sigma of the unique, averaged reflections.
ᵈ The estimated coordinate error is the value from the cross-validated ∑ plot.
ᵉ R-factor = ∑|Fobs - Fcalc|/∑Fobs. Rfree is calculated from 10% of the reflections that are omitted from the refinement.

Table 5.5 Data collection and refinement statistics for LuxS-SA-1 structure.


SA-1 atom    LuxS atom       Distance (Å)

O2           Co2+            2.4
O2           Gln-125 O       2.8
O3           Co2+            1.9
O3           His-58 NE2      2.9
O4           Ser-6 OG        2.6
N            Asp-78 OD1      2.5
N            Ile-79 O        2.9
N            Ser-80 OG       3.1
O            Asp-78 OD1      3.2
O            Ile-79 N        3.0

Table 5.6 List of atomic interactions between LuxS and SA-1.
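The metal-ligand distances discussed in Section 5.4.3 can be collected in one place with a brief Python sketch. The O2 and O3 to Co2+ distances are taken from Tables 5.4 and 5.6 and, for the 2-ketone intermediate 4, from the values quoted in the text; the dictionary layout and function names are hypothetical and serve only to make the comparison explicit.

```python
# O2/O3 to Co2+ distances in Å for each complex.
METAL_DISTANCES = {
    "2-ketone intermediate 4": {"O2": 2.2, "O3": 2.9},
    "inhibitor 10": {"O2": 2.2, "O3": 2.3},
    "inhibitor 11": {"O2": 2.1, "O3": 2.3},
    "SA-1": {"O2": 2.4, "O3": 1.9},
}


def shortest_contact(distances):
    """Return (ligand, atom, distance) for the shortest metal-oxygen distance."""
    return min(
        ((ligand, atom, d)
         for ligand, atoms in distances.items()
         for atom, d in atoms.items()),
        key=lambda item: item[2],
    )


def closer_oxygen(distances):
    """For each ligand, name the oxygen atom that lies closer to the metal."""
    return {ligand: min(atoms, key=atoms.get) for ligand, atoms in distances.items()}


ligand, atom, distance = shortest_contact(METAL_DISTANCES)
print(f"shortest Co2+ contact: {atom} of {ligand} at {distance} Å")
for name, oxygen in closer_oxygen(METAL_DISTANCES).items():
    print(f"{name}: {oxygen} is closer to Co2+")
```

In this tabulation SA-1 is the only ligand for which O3, rather than O2, is the oxygen closer to the metal.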


5.5 Structure of BsLuxS bound to a citrate ion

5.5.1 Introduction

LuxS has become a major focus for the development of antibacterial drugs. Any compound that prevents LuxS from synthesizing AI-2 could serve as a broad-spectrum antibacterial agent, because the luxS gene is conserved across a wide range of bacterial species. During efforts to crystallize a catalytically inactive mutant of LuxS (C84A) with the 2-ketone intermediate 4, a structure with citrate bound in the active site was obtained. The LuxS-citrate complex provides information on how compounds unrelated to the actual substrate bind to the active site of LuxS. The citrate ion occupies the position of the ribosyl part of the 2-ketone intermediate 4 (Rajan et al., 2005) in the active site. In contrast to the 2-ketone bound structure, the N-terminus is ordered in the citrate bound LuxS structure, and the N-terminus occupies the binding site for the amino acid portion of SRH. This particular structure of the enzyme could represent an intermediate conformation in the reaction pathway of LuxS and might give insights into the development of a potent LuxS inhibitor.

5.5.2 Methods

5.5.2.1 Crystallization and Structure Determination

The crystallization solution contained 100 mM sodium citrate (pH 5.6) and varying concentrations of ammonium sulfate (1.8-2.2 M). The crystals were obtained by hanging drop vapor diffusion. The hanging drop was prepared by mixing 2 μL of well solution with 2 μL of protein solution (0.54 mM LuxSC84A, 5.4 mM 2-ketone intermediate 4, 25 mM Tris pH 8.0, and 100 mM NaCl). Freezing, data collection, and structure determination were as reported earlier (Rajan et al., 2005). The BsLuxS SRH

complex (PDB code 1JVI) was used as the starting model after removing the water

molecules and SRH. Cys84 was truncated to Ala in the starting model. The model and

geometric parameters for the citrate molecule were obtained from the HIC-Up database

(Kleywegt and Jones, 1998). Figures were generated using O (Jones et al., 1991),

PYMOL (DeLano, 2003), MOLSCRIPT (Kraulis, 1991), and RASTER3D (Merritt and

Bacon, 1997).

5.5.2.2 UV-VIS spectra of BsLuxS and BsLuxSC84A in the presence of citric acid

BsLuxS (wild type) and BsLuxSC84A mutant were dialyzed against 80 mM

HEPES buffer (pH 8.0) overnight with three changes of the buffer. Various amounts of

citric acid, up to a concentration of 10 mM, were added to 100-150 μM protein solution

and the UV-VIS spectra were collected at 800-300 nm. The addition of citric acid

gradually decreased the pH of the solution from 8.0 (0 mM acid) to 7.1 (10 mM acid). As

a control, the UV-VIS spectra of both proteins exhibited no change in the absence of

citric acid at pH 7.1-8.0.

5.5.3 Results and Discussion

5.5.3.1 Structure of BsLuxSC84A with citrate ion

The primary goal of the experiment was to determine the structure of

BsLuxSC84A with the 2-ketone intermediate 4. To crystallize BsLuxSC84A with the 2-ketone intermediate 4, drops were set up with sodium citrate at pH 5.6 in varying concentrations of ammonium sulfate. The concentration of sodium citrate (100 mM) was approximately 19 times that of the 2-ketone intermediate 4 (5.4 mM). Interestingly, the crystals obtained from this condition did not have the 2-ketone intermediate 4 bound to the active site; instead there was a citrate ion in the active site. These crystals belong to the same space group, P6522, as was previously reported for LuxS (Hilgers and Ludwig, 2001, Ruzheinikov et al., 2001, Rajan et al., 2005). The asymmetric unit consisted of one monomer, one citrate ion, one cobalt ion, two sulfate ions, and 131 water molecules. The functional enzyme is a dimer, with the active site at the interface of the dimer. Residues from both subunits contribute to the formation of the active site. In the present structure, a citrate ion was bound at each of the active sites

(Figure 5.10A). The structure was refined at 1.7 Å resolution and the R-factor and free R-

factor were 17.9% and 20.8%, respectively (Table 5.7). The temperature factors for

the protein atoms ranged from 6 to 38 Å2, and the temperature factor for the citrate ion

varied from 8 to 10 Å2, which shows that citrate is bound in the active site with full

occupancy. The electron density clearly showed the presence of Ala as the 84th residue, in

agreement with the Cys84Ala mutation. The clear electron density helped in positioning

citrate in the active site (Figure 5.10B).

The structure of BsLuxSC84A was solved earlier with the 2 ketone intermediate 4

in the presence of HEPES buffer (Rajan et al., 2005). The structure clearly showed the mode of binding of the intermediate to the active site. The ribosyl part of the 2-ketone intermediate 4 interacts with His-11, Arg-39, Glu-57, Gly-83, Tyr-89, and Gly-127, while the amino acid part is bound to the N-terminus of the other monomer in the dimer. The first residue visible in this structure is Val-4.


In the present structure of BsLuxSC84A with a bound citrate, citrate ion occupies

the position of the ribosyl part of the 2-ketone intermediate 4 (Rajan et al., 2005) and it

interacts with the same residues as was seen for the 2-ketone intermediate 4 structure

(Figure 5.10C). The numbering of oxygen atoms of citrate is as shown in Figure 5.11A.

The O1 atom of citrate is hydrogen bonded to Tyr-89 OH, and O2 interacts with Arg-39

and His-11. The O3 atom interacts with Cys-84 N, whereas the O4 atom of citrate

interacts with Glu-57 and Gly-83 N. Co2+ is bound in a bidentate manner by the O5 and

O6 atoms of the citrate ion. The O6 atom of citrate is closer to Co2+ (2.0 Å) than the O5 atom (2.6 Å).

The O7 atom of citrate ion interacts with Arg-39. The important interactions and the

corresponding distances of interaction are given in Table 5.8.

Surprisingly, the N-terminus of LuxS (the N-terminus of the monomer without

the bound Co2+) becomes well ordered when citrate is bound. The second and third

residues of LuxS, which were disordered in the previous structures (Shen et al., 2006,

Rajan et al., 2005), were visible in the present structure (Figure 5.11B). The N-terminus of the protein occupies the position taken by Hcys in the 2-ketone intermediate 4 structure. The amino and carboxylate groups of the Hcys part of the 2-ketone intermediate 4 were neutralized by Asp-78 and Arg-65, respectively (Rajan et al., 2005). In the present structure, Asp-78 and Ile-79 interact with the N atom of Pro-2, whereas Arg-65 and Ile-79 interact with the O atom of Pro-2. Thus, Pro-2 replaces the amino and carboxylate groups of the Hcys.

Pro-2 fills this part of the active site, but does not participate in the interactions with the citrate ion. It appears that the N-terminus of the protein closed this end of the active site.

The temperature factor for Pro-2 is 25 Å2, which shows that the Pro-2 is very well

ordered, considering it is the N-terminus of the protein.
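To give a sense of scale for the temperature factors quoted above, the following Python sketch converts an isotropic B-factor into a root-mean-square atomic displacement using the standard relation B = 8π²⟨u²⟩. This conversion is not part of the original analysis; it assumes purely isotropic, harmonic displacement, and the function name is hypothetical.

```python
import math


def rms_displacement(b_factor):
    """Return sqrt(<u^2>) in Å for an isotropic B-factor in Å^2 (B = 8*pi^2*<u^2>)."""
    return math.sqrt(b_factor / (8.0 * math.pi ** 2))


# B-factors quoted in the text for the citrate complex.
examples = [
    ("Pro-2", 25.0),
    ("citrate ion, lowest", 8.0),
    ("citrate ion, highest", 10.0),
]
for label, b in examples:
    print(f"{label}: B = {b:.0f} Å^2, rms displacement = {rms_displacement(b):.2f} Å")
```

Under these assumptions a B-factor of 25 Å2 corresponds to an rms displacement of about 0.56 Å, compared with roughly 0.32-0.36 Å for the citrate ion.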


Ruzheinikov and coworkers reported the structures of free LuxS and of LuxS bound

to RHC (ribosyl homocysteine) and HCys (Ruzheinikov et al., 2001). They showed

that, in the free enzyme and in LuxS bound to RHC, the N-terminus was well ordered, forming a closed conformation, whereas in LuxS bound to HCys, the first four

residues were disordered, forming an open conformation. They suggested that the closed

conformation holds the substrate tighter while the open conformation may be necessary

for the substrate to access the active site. Based on the above observations, the present

structure should represent a closed conformation of LuxS bound tightly to citrate. The

citrate could competitively inhibit LuxS from binding to the actual substrate.

The binding of the N-terminus to the active site suggests a possible role for the N-

terminus in controlling substrate binding and release. The structure showed a definite role

for Pro-2 in forming a closed conformation of LuxS through the interactions of the N and O

atoms of Pro-2 with Asp-78, Ile-79, and Arg-65. To check whether Pro-2 had some role in

the catalytic cycle of LuxS, a mutant was made in which Pro-2 was replaced with Ala.

The activity of this mutant was compared with the wild type enzyme both in citrate and

HEPES buffers. The assay showed no difference in activity between the wild type and mutant LuxS (data not shown). This could be because the N and O atoms of the main chain of any amino acid can form the same hydrogen bonding as observed in the wild type. A mutant protein with the first three amino acids deleted could explain whether there is a role for the N-terminus in the catalytic cycle of LuxS.

The binding of citrate to the active site was first observed in the Cys84Ala mutant of BsLuxS. To see if citrate can bind to the wild type enzyme, LuxS was crystallized in sodium citrate buffer at pH 5.6. The wild type BsLuxS did not have citrate in the active site when crystallized under the same conditions. The absence of citrate could be due to a steric clash between the thiolate group of Cys-84 and the citrate ion and/or to electrostatic repulsion by the deprotonated side chain of Cys-84. An overlay of the citrate-bound structure with that of LuxS bound to inhibitor 10 (Shen et al., 2006), for which wild type LuxS was used for crystallization (Figure 5.11B), shows that the O3 (2.6 Å) and O7 (3.0 Å) atoms of citrate would be within hydrogen bonding distance of the Cys-84 SG atom.

5.5.3.2 Activity assay for LuxS and LuxSC84A mutant in the presence of

citrate ion

The UV-VIS spectra of LuxSC84A showed immediate concentration-dependent

changes upon addition of citric acid (Figure 5.12). The intensity of LMCT band (347 nm)

and two d-d transition peaks (660 nm and 640 nm) continuously decreased. Clear spectral

change was observed with 1 mM citric acid. Two new peaks started to appear at 560 nm

and 520 nm at 6 mM citric acid. However, these differences could be also due to slight

pH variations. Interestingly the wild type LuxS did not show any spectral change even up

to 10 mM citric acid (Figure 5.12) even after incubation on ice for a few hours. This is

consistent with the crystallographic study which showed no citrate ion in the wild type

LuxS crystallized with citrate buffer. To avoid pH bias, citric acid was adjusted to pH 7.6

with NaOH and the above experiments were repeated with higher concentration of the

compounds. The binding affinity of citrate to LuxSC84A decreased significantly after

adjusting the pH. This could mean that the low pH is necessary for the binding of citrate

in the active site of LuxS. The absence of a spectral change in the wild type LuxS


indicates that citrate does not bind to the active site, most probably due to steric clashes or charge repulsion with the thiolate group of the active site Cys-84 residue.
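The pH dependence discussed above can be put in rough quantitative terms by estimating the protonation state of free citrate at the two pH values used (5.6 for crystallization, 7.6 for the adjusted titration). The following is only a minimal sketch: the pKa values are textbook values for citric acid in water at 25 °C and were not measured in this study, the function names are illustrative, and the active site environment may shift the actual pKa values.

```python
# Textbook pKa values for free citric acid in water at 25 °C (an assumption;
# not measured in this study, and an enzyme active site may shift them).
CITRIC_ACID_PKAS = (3.13, 4.76, 6.40)


def citrate_fractions(ph, pkas=CITRIC_ACID_PKAS):
    """Return the fractions of H3Cit, H2Cit-, HCit2- and Cit3- at a given pH."""
    h = 10.0 ** (-ph)
    k1, k2, k3 = (10.0 ** (-pk) for pk in pkas)
    # Standard distribution terms for a triprotic acid.
    terms = [h ** 3, k1 * h ** 2, k1 * k2 * h, k1 * k2 * k3]
    total = sum(terms)
    return [t / total for t in terms]


def mean_charge(ph, pkas=CITRIC_ACID_PKAS):
    """Average net charge of citrate at a given pH (species n carries charge -n)."""
    return -sum(n * f for n, f in enumerate(citrate_fractions(ph, pkas)))


for ph in (5.6, 7.6):
    fractions = [round(f, 3) for f in citrate_fractions(ph)]
    print(f"pH {ph}: fractions={fractions}, mean charge={mean_charge(ph):.2f}")
```

Under these assumptions the mean charge of citrate is roughly -2.0 at pH 5.6 and roughly -2.9 at pH 7.6, so the fully deprotonated form dominates only at the higher pH. This is consistent with, but does not by itself establish, an electrostatic contribution to the weaker binding observed after pH adjustment.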

5.5.3.3 Inhibitor design

The present study reports the first structure of LuxS bound to a non-substrate-like molecule. This structure therefore provides insights for designing new inhibitors that are not substrate related. The citrate ion forms several hydrogen bonds with residues that are important for binding the substrate and intermediates and that are thought to have a prominent role in the catalytic cycle (His-11, Arg-39, Glu-57, and Tyr-89) (Rajan et al., 2005). There are previous reports of buffer components such as citrate and Tris inhibiting enzymes (Harrison et al., 1994; Wang et al., 2005). The structure of human aldose reductase bound to citrate, which inhibits the enzyme, revealed an anion binding site and helped in understanding its mechanism of action (Harrison et al., 1994). The structures of human B-type phosphoglycerate mutase bound to citrate (Wang et al., 2005) and of an aminopeptidase bound to Tris (Desmarais et al., 2002) have likewise helped in studying those enzyme mechanisms in detail. These studies suggest that the information obtained from the LuxS mutant bound to citrate can be used for designing an inhibitor of LuxS. However, citrate is bound to the LuxSC84A mutant and not to the wild type enzyme. An inhibitor for the wild type enzyme should therefore avoid both the close contacts that the citrate atoms would make with the Cys-84 side chain and the possible electrostatic repulsion between that side chain and citrate. The citrate ion is a good starting model for inhibitor design because it is bound to the same


site of LuxS as that of the actual substrate, but with significantly different atomic

positions.

In the present study, the interactions made by a citrate ion in the active site of

the BsLuxSC84A mutant are reported. The citrate ion is bound in the active site in essentially the same way as the ribose part of SRH (Ruzheinikov et al., 2001) or the 2-ketone intermediate 4 (Rajan et al., 2005). We propose that citrate can be a good starting model

for developing an inhibitor for LuxS.



Figure 5.10 Crystal structure of LuxS in complex with citrate ion. (A) Stereo ribbon diagram of the LuxS dimer bound to citrate ion. An active dimer is shown with one citrate ion (green bonds) in each active site. The monomer bound to Co2+ (magenta) is shown in gold and the other monomer in cyan. The N- and C-termini (Pro-2 and Gly-157, respectively) are labeled for each subunit. Notice that the N-terminus wraps around the citrate ion and closes that end of the active site. (B) Stereoview of the electron density of citrate in the active site of LuxSC84A. The blue cage represents


the 2Fo-Fc map contoured at 1σ and the red density represents the Fo-Fc map contoured at 2.5σ. The backbone of two monomers of LuxSC84A with citrate is shown. (C) The stereo view of the atomic interactions between LuxS and citrate in the active site. The interactions are shown as black dotted lines.



Figure 5.11 (A) A schematic representation of the citrate ion with the numbering of each oxygen atom. (B) An overlay of the structure of LuxS bound to citrate (green) with the structure of LuxS bound to inhibitor 10 (purple) (Shen et al., 2006). In the citrate-bound LuxS, the N-terminus is well ordered, whereas in the structure with inhibitor 10 the first visible residue is Val-4. Notice that Pro-2 occupies the position of the HCys part of inhibitor 10. Note also the position of Cys-84 in the structure with inhibitor 10 and of Ala-84 in the citrate-bound structure. In the overlay, the O3 and O7 atoms of citrate come into close contact with Cys-84.


Figure 5.12 UV-Vis spectra of C84A mutant and wild type LuxS. The C84A mutant of BsLuxS shows significant spectral changes upon the addition of citric acid, while the wild type does not show considerable spectral changes on the addition of citric acid.


Data Collection Statistics

    Space group                            P6522
    a = b (Å)                              62.1
    c (Å)                                  150.7
    Resolution (Å)                         26.9-1.7
    No. of reflections                     120,639
    No. of unique reflections              28,625
    Completeness (%)                       83.5 (18.3)ᵃ
    Redundancy                             4.2 (1.2)
    Rmergeᵇ                                6.5 (17.6)
    I/σᶜ                                   17.4 (2.9)

Refinement Statistics

    Resolution (Å)                         26.9-1.7
    No. of reflections (working/free)      14,778/1,619
    Completeness (%)                       87.6
    Mean B-factor (Å²)                     15.8
    Estimated coordinate errorᵈ            0.18
    RMS deviation from ideal geometry
        Bonds (Å)                          0.005
        Angles (°)                         1.2
    R-factor                               17.9 (25)
    R-free (%)ᵉ                            20.8 (25.6)
    No. of waters                          131

ᵃ Numbers in parentheses refer to the highest resolution shell only.
ᵇ Rmerge = ∑|Ih - ‹I›h|/∑Ih, where ‹I›h is the average intensity over symmetry equivalents.
ᶜ I/σ is the mean of the intensity/sigma of the unique, averaged reflections.
ᵈ The estimated coordinate error is the value from the cross-validated ∑ plot.
ᵉ R-factor = ∑|Fobs - Fcalc|/∑Fobs. Rfree is calculated from 10% of the reflections that are omitted from the refinement.

Table 5.7 Data collection and refinement statistics for LuxS-citrate structure.
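The quality indicators defined in the footnotes of Table 5.7 can be written out as short functions. The sketch below is illustrative only: the function names and the toy intensities and structure factor amplitudes are hypothetical and are not data from this study. Only the reflection counts passed to the redundancy calculation are taken from Table 5.7.

```python
def r_merge(observations):
    """Rmerge = sum|I_h - <I>_h| / sum(I_h) over all measurements.

    `observations` maps each unique reflection to the list of intensities
    measured for its symmetry equivalents.
    """
    numerator = 0.0
    denominator = 0.0
    for intensities in observations.values():
        mean_i = sum(intensities) / len(intensities)
        numerator += sum(abs(i - mean_i) for i in intensities)
        denominator += sum(intensities)
    return numerator / denominator


def r_factor(f_obs, f_calc):
    """R = sum|Fobs - Fcalc| / sum(Fobs) over the supplied reflections."""
    return sum(abs(o - c) for o, c in zip(f_obs, f_calc)) / sum(f_obs)


def redundancy(n_measured, n_unique):
    """Average number of measurements per unique reflection."""
    return n_measured / n_unique


# Hypothetical toy data, used only to exercise the formulas.
toy_observations = {"h1": [100.0, 110.0], "h2": [50.0, 50.0, 56.0]}
print(f"Rmerge (toy data): {r_merge(toy_observations):.3f}")
print(f"R-factor (toy data): {r_factor([10.0, 20.0], [9.0, 22.0]):.3f}")

# Reflection counts from Table 5.7.
print(f"Redundancy: {redundancy(120639, 28625):.1f}")
```

Applying the same `r_factor` function to the working set gives the R-factor, and applying it to the reflections omitted from refinement gives Rfree. The redundancy computed from the reflection counts agrees with the overall value of 4.2 listed in the table.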


    Atom A        Atom B         Distance (Å)

    Citrate O1    Tyr-89 OH      2.6
    Citrate O2    His-11 NE2     2.6
    Citrate O2    Arg-39 NH1     2.8
    Citrate O2    Arg-39 NH2     3.5
    Citrate O3    Gly-83 N       3.2
    Citrate O3    Ala-84 N       2.8
    Citrate O4    Glu-57 OE1     3.2
    Citrate O4    Glu-57 OE2     2.0
    Citrate O5    Co2+           2.6
    Citrate O5    His-54 NE2     3.3
    Citrate O5    His-58 NE2     3.2
    Citrate O6    Co2+           2.0
    Citrate O6    Cys-126 SG     3.4
    Citrate O6    Gly-127 N      2.8
    Citrate O7    Arg-39 NH2     3.2
    Pro-2 N       Asp-78 OD1     2.4
    Pro-2 N       Ile-79 O       3.1
    Pro-2 O       Ile-79 N       3.2
    Ser-3 N       Ser-6 OG       2.9
    Co2+          His-54 NE2     2.1
    Co2+          His-58 NE2     2.1
    Co2+          Cys-126 SG     2.3

Table 5.8 List of important atomic interactions observed in the LuxS-citrate structure.
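For inhibitor design it can be useful to see which citrate oxygens carry the shortest contacts. The sketch below groups the citrate contacts of Table 5.8 by oxygen atom and keeps those within a distance cutoff. The distances are copied from the table, while the 3.2 Å cutoff and the function name are assumptions chosen for illustration, not criteria used in this study.

```python
# Citrate contacts copied from Table 5.8:
# (citrate atom, partner atom, distance in Å).
CITRATE_CONTACTS = [
    ("O1", "Tyr-89 OH", 2.6),
    ("O2", "His-11 NE2", 2.6),
    ("O2", "Arg-39 NH1", 2.8),
    ("O2", "Arg-39 NH2", 3.5),
    ("O3", "Gly-83 N", 3.2),
    ("O3", "Ala-84 N", 2.8),
    ("O4", "Glu-57 OE1", 3.2),
    ("O4", "Glu-57 OE2", 2.0),
    ("O5", "Co2+", 2.6),
    ("O5", "His-54 NE2", 3.3),
    ("O5", "His-58 NE2", 3.2),
    ("O6", "Co2+", 2.0),
    ("O6", "Cys-126 SG", 3.4),
    ("O6", "Gly-127 N", 2.8),
    ("O7", "Arg-39 NH2", 3.2),
]

# Assumed, illustrative cutoff in Å; not a value taken from this study.
CONTACT_CUTOFF = 3.2


def group_by_citrate_atom(contacts, cutoff=CONTACT_CUTOFF):
    """Group contacts at or below `cutoff` by citrate oxygen atom."""
    grouped = {}
    for atom, partner, distance in contacts:
        if distance <= cutoff:
            grouped.setdefault(atom, []).append((partner, distance))
    return grouped


for atom, partners in group_by_citrate_atom(CITRATE_CONTACTS).items():
    listing = ", ".join(f"{name} ({dist} Å)" for name, dist in partners)
    print(f"{atom}: {listing}")
```

With this cutoff every citrate oxygen retains at least one contact, and O5 and O6 are the two oxygens in contact with the Co2+ ion. A different cutoff would change which of the longer contacts (for example Arg-39 NH2 at 3.5 Å) are retained.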


LIST OF REFERENCES

Agostini, H.J., Carroll, J.D. and Minton, K.W. (1996). Identification and characterization of uvrA, a DNA repair gene of Deinococcus radiodurans. J. Bacteriol. 178, 6759-6765.

Aihara, H., Ito, Y., Kurumizaka, H., Terada, T., Yokoyama, S. and Shibata, T. (1997). An interaction between a specified surface of the C-terminal domain of RecA protein and double-stranded DNA for homologous pairing. J. Mol. Biol. 274, 213-221.

Aihara, H., Ito, Y., Kurumizaka, H., Yokoyama, S., and Shibata, T. (1999). The N-terminal domain of the human Rad51 protein binds DNA: structure and a DNA binding surface as revealed by NMR. J. Mol. Biol. 290, 495-504.

Alfaro, J. F.; Zhang, T.; Wynn, D. P.; Karschner, E. L.; Zhou, Z. S. (2004). Synthesis of LuxS inhibitors targeting bacterial cell-cell communication. Org. Lett. 6, 3043-3046.

Amaratunga, M. and Benight, A.S. (1988) DNA sequence dependence of ATP hydrolysis by RecA protein. Biochem. Biophys. Res. Commun., 157, 127-133.

Anderson, A., Nordan, H., Cain, R., Parrish, G. and Duggan. (1956). Studies on a radioresistant micrococcus. I. Isolation, morphology, cultural characteristics, and resistance to gamma radiation. Food Technol. 10, 575-578.

Anderson, D.G. and Kowalczykowski, S.C. (1997). The translocating RecBCD enzyme stimulates recombination by directing RecA protein onto ssDNA in a chi-regulated manner. Cell 90, 77-86.

Bar-Ziv, R. and Libchaber, A. (2001) Effects of DNA sequence and structure on binding of RecA to single-stranded DNA. Proc. Natl Acad. Sci. U S A, 98, 9068-9073.

Bassler, B. L.; Wright, M.; Showalter, R. E.; Silverman, M. R. (1993). Intercellular signaling in Vibrio harveyi: Sequence and function of genes regulating expression of luminescence. Mol. Microbiol., 9, 773-786.

Battista, J.R. (1997). Against all odds: the survival strategies of Deinococcus radiodurans. Annu. Rev. Microbiol. 51, 203-224.


Battista, J.R., Earl, A.M. and Park, M.-J. (1999). Why is Deinococcus radiodurans so resistant to ionizing radiation? Trends Microbiol. 7, 362-365.

Bedale, W.A., and Cox, M. (1996). Evidence for the coupling of ATP hydrolysis to the final (extension) phase of RecA protein-mediated DNA strand exchange. J Biol Chem. 271, 5725-5732.

Bell, C.E. (2005). Structure and mechanism of Escherichia coli RecA ATPase. Mol. Microbiol. 58, 358-366.

Bell, C.E., Frescura, P., Hochschild, A. and Lewis, M. (2000) Crystal structure of the lambda repressor C-terminal domain provides a model for cooperative operator binding. Cell, 101, 801-811.

Benedict, R.C. and Kowalczykowski, S.C. (1988). Increase of DNA strand assimilation activity of recA protein by removal of the C terminus and structure-function studies of the resulting protein fragment. J. Biol. Chem. 263, 15513-15520.

Bianco, P.R. and Weinstock, G.M. (1996) Interaction of the RecA protein of Escherichia coli with single-stranded oligodeoxyribonucleotides. Nucleic Acids Res., 24, 4933-4939.

Bianco, P.R., Tracy, R.B. and Kowalczykowski, S.C. (1998) DNA strand exchange proteins: a biochemical and physical comparison. Front Biosci., 3, 570-603.

Biet, E., Sun, J. and Dutreix, M. (1999) Conserved sequence preference in DNA binding among recombination proteins: an effect of ssDNA secondary structure. Nucleic Acids Res., 27, 596-600.

Blaho, J.A., and Wells, R.D. (1987). Left-handed Z-DNA binding by the RecA protein of Escherichia coli. J. Biol. Chem. 262, 6082-6088.

Bork, J.M., Cox, M.M., and Inman, R.B. (2001) RecA protein filaments disassemble in the 5’ to 3’ direction on single-stranded DNA. J. Biol. Chem. 276, 45740-45743.

Brünger, A. T.; Adams, P. D.; Clore, G. M.; DeLano, W. L.; Gros, P.; Grosse-Kunstleve, R. W.; Jiang, J. S.; Kuszewski, J.; Nilges, M.; Pannu, N. S.; Read, R. J.; Rice, L. M.; Simonson, T.; Warren, G. L. (1998). Crystallography & NMR system: A new software suite for macromolecular structure determination. Acta Crystallogr. D 54, 905-921.

Campbell, M.J. and Davis, R.W. (1999a). On the in vivo function of the RecA ATPase. J. Mol. Biol. 286, 437-445.

Campbell, M.J. and Davis, R.W. (1999b). Toxic mutations in the recA gene of E. coli prevent proper chromosome segregation. J. Mol. Biol. 286, 417-435.


Carroll, J.D., Daly, M.J. and Minton, K.W. (1996). Expression of recA in Deinococcus radiodurans. J. Bacteriol. 178, 130-135.

Chayen, N.E. (1998). Comparative studies of protein crystallization by vapour-diffusion and microbatch techniques. Acta Crystallogr D Biol Crystallogr. 54, 8-15.

Chen, L.T., Ko, T.P., Chang, Y.C., Lin, K.A., Chang, C.S., Wang, A.H.J., and Wang, T.F. (2007). Crystal structure of the left-handed archaeal RadA helical filament: identification of a functional motif for controlling quaternary structures and enzymatic functions of RecA family proteins. Nucl. Acids Res. 35, 1787-1801.

Chen, X.; Schauder, S.; Potier, N.; Van Dorsselaer, A.; Pelczer, I.; Bassler, B. L.; Hughson, F. M. (2002). Structural identification of a bacterial quorum-sensing signal containing boron. Nature 415, 545-549.

Clark, A.J., and Margulies, A.D. (1965). Isolation and characterization of recombination-deficient mutants of Escherichia coli K12. Proc. Natl. Acad. Sci. U S A. 53, 451-459.

Collaborative Computational Project Number 4. (1994). The CCP4 suite: programs for protein crystallography. Acta Crystallogr. D 50, 760-763.

Collyer, C. A.; Henrick, K.; Blow, D. M. (1990) Mechanism for aldose-ketose interconversion by D-xylose isomerase involving ring opening followed by 1,2-hydride shift, J. Mol. Biol. 212, 211-235.

Conway, A.B., Lynch, T.W., Zhang, Y., Fortin, G.S., Fung, C.W., Symington, L.S. and Rice, P.A. (2004) Crystal structure of a Rad51 filament. Nat. Struct. Mol. Biol., 11, 791-796.

Cox, J.M., Tsodikov, O.V. and Cox, M.M. (2005) Organized unidirectional waves of ATP hydrolysis within a RecA filament. PLoS Biol., 3 (2), e52.

Cox, M.M. (2003) The bacterial RecA protein as a motor protein. Annu. Rev. Microbiol., 57, 551-577.

Cox, M.M. (2007) Regulation of bacterial RecA protein function. Crit. Rev. Biochem. Mol. Biol., 42, 41-63.

Cox, M.M., and Lehman, I.R. (1981). Directionality and polarity in RecA protein-promoted branch migration. Proc. Natl. Acad. Sci. USA, 78: 6018-6022.

Craig, N.L. and Roberts, J.W. (1980) E. coli recA protein-directed cleavage of phage lambda repressor requires polynucleotide. Nature, 283, 26-30.


Craig, N.L. and Roberts, J.W. (1981) Function of nucleoside triphosphate and polynucleotide in Escherichia coli recA protein-directed cleavage of phage lambda repressor. J. Biol. Chem., 256, 8039-8044.

Daly, M.J. and Minton, K.W. (1995). Interchromosomal recombination in the extremely radioresistant bacterium Deinococcus radiodurans. J. Bacteriol. 177, 5495-5505.

Daly, M.J. and Minton, K.W. (1996). An alternative pathway of recombination of chromosomal fragments precedes recA-dependent recombination in the radioresistant bacterium Deinococcus radiodurans. J. Bacteriol. 178, 4461-4471.

Daly, M.J., Ouyang, L., Fuchs, P., and Minton, K.W. (1994). In vivo damage and recA-dependent repair of plasmid and chromosomal DNA in the radiation-resistant bacterium Deinococcus radiodurans. J. Bacteriol. 86, 294-298.

Das Gupta, C., and Radding, C.M. (1982). Polar branch migration promoted by recA protein: effect of mismatched base pairs. Proc. Natl. Acad. Sci. USA, 79: 762-766.

Datta, S., Ganesh, N., Chandra, N.R., Muniappa, K. and Vijayan, M. (2003a). Structural studies on MTRecA-nucleotide complexes: insights into DNA and nucleotide binding and the structural signature of NTP recognition. Prot. Struct. Funct. Gen. 50, 474-485.

Datta, S., Krishna, R., Ganesh, N., Chandra, N.R., Muniyappa, K. and Vijayan, M. (2003b). Crystal structures of Mycobacterium smegmatis RecA and its nucleotide complexes. J. Bacteriol. 185, 4280-4284.

Datta, S., Prabu, M.M., Vaze, M.B., Ganesh, N., Chandra, N.R., Muniyappa, K. and Vijayan, M. (2000). Crystal structures of Mycobacterium tuberculosis RecA and its complex with ADP-AlF4: implications for decreased ATPase activity and molecular aggregation. Nucleic Acids Res. 28, 4964-4973.

DeLano, W. L. (2003) PyMOL Reference Manual, DeLano Scientific LLC, San Carlos, CA.

Desmarais, W. T., Bienvenue, D. L., Bzymek, K. P., Holz, R. C., Petsko, G. A., and Ringe, D. (2002) The 1.20 A resolution crystal structure of the aminopeptidase from Aeromonas proteolytica complexed with tris: a tale of buffer inhibition, Structure 10, 1063-1072.

DiCapua, E., Engel, A., Stasiak, A. and Koller, T. (1982) Characterization of complexes between recA protein and duplex DNA by electron microscopy. J. Mol. Biol., 157, 87-103.

Drees, J.C., Lusetti, S.L., Chitteni-Pattu, S., Inman, R.B., and Cox, M.M. (2004). A RecA filament capping mechanism for RecX protein. Mol. Cell. 15, 789-798.


Dutreix, M. (1997) (GT)n repetitive tracts affect several stages of RecA-promoted recombination. J. Mol. Biol., 273, 105-113.

Earl, A.M., Mohundro, M.M., Mian, I.S. and Battista, J.R. (2002). The IrrE protein of Deinococcus radiodurans R1 is a novel regulator of recA expression. J. Bacteriol. 184, 6216-6224.

Egelman, E.H. (2001). Does a stretched DNA structure dictate the helical geometry of RecA-like filaments? J Mol Biol. 309, 539-542.

Egelman, E.H. and Stasiak, A. (1993). Electron microscopy of RecA-DNA complexes: Two different states, their functional significance and relation to the solved crystal structure. Micron 24, 309-324.

Egelman, E.H. and Stasiak, A. (1986). Structure of helical RecA-DNA complexes. Complexes formed in the presence of ATP-gamma-S or ATP. J. Mol. Biol. 191, 677-697.

Egelman, E.H. and Yu, X. (1989). The location of DNA in RecA-DNA helical filaments. Science. 245, 404-407.

Ellman, G. L. (1959) Tissue sulfhydryl groups, Arch. Biochem. Biophys. 82, 70-77.

Federle, M. J., and Bassler, B. L. (2003) Interspecies communication in bacteria, J. Clin. Invest. 112, 1291-1299.

Fernandez De Henestrosa, A.R., Ogi, T., Aoyagi, S., Chafin, D., Hayes, J.J., Ohmori, H., and Woodgate, R. (2000). Identification of additional genes belonging to the LexA regulon in Escherichia coli. Mol. Microbiol. 35, 1560-1572.

Forget, A.L., Kudron, M.M., McGrew, D.A., Calmann, M.A., Schiffer, C.A., and Knight, K.L. (2006). RecA dimers serve as a functional unit for assembly of active nucleoprotein filaments. Biochemistry 45, 13537-13542.

Galletto, R., Amitani, I., Baskin, R.J., and Kowalczykowski, S.C. (2006). Direct observation of individual RecA filaments assembling on single DNA molecules. Nature 443, 875-878.

Gill, S.C. and von Hippel, P.H. (1989). Calculation of protein extinction coefficients from amino acid sequence data. Anal. Biochem. 182, 319-326.

Gomez-Gomez, J., Manfredi, C., Alonso, J.C., and Blazquez, J. (2007). A novel role for RecA under non-stress: promotion of swarming motility in Escherichia coli K-12. BMC Biol. 5, e14.


Gutman, P.D., Carroll, J.D., Masters, C.I. and Minton, K.W. (1994). Sequencing, targeted mutagenesis and expression of a recA gene required for the extreme radioresistance of Deinococcus radiodurans. Gene 141, 31-37.

Gutman, P.D., Fuchs, L., Ouyang, L. and Minton, K.W. (1993). Identification, sequencing, and targeted mutagenesis of a DNA polymerase gene required for the extreme radioresistance of Deinococcus radiodurans. J. Bacteriol. 175, 3581-3590.

Guzman, L.-M., Belin, D., Carson, M. and Beckwith, J. (1995). Tight regulation, modulation, and high-level expression by vectors containing the arabinose PBAD promoter. J. Bacteriol. 177, 4121-4130.

Hansen, M.T. (1978). Multiplicity of genome equivalents in the radiation resistant bacterium Micrococcus radiodurans. J. Bacteriol. 134, 71-75.

Harrison, D. H., Bohren, K. M., Ringe, D., Petsko, G. A., and Gabbay, K. H. (1994). An anion binding site in human aldose reductase: mechanistic implications for the binding of citrate, cacodylate, and glucose 6-phosphate, Biochemistry 33, 2011-2020.

Hentzer, M.; Riedel, K.; Rasmussen, T. B.; Heydorn, A.; Andersen, J. B.; Parsek, M. R.; Rice, S. A.; Eberl, L.; Molin, S.; Høiby, N.; Kjelleberg, S.; Givskov, M. (2002). Inhibition of quorum sensing in Pseudomonas aeruginosa biofilm bacteria by a halogenated furanone compound. Microbiol. 148, 87-102.

Hentzer, M.; Wu, H.; Andersen, J. B.; Riedel, K.; Rasmussen, T. B.; Bagge, N.; Kumar, N.; Schembri, M. A.; Song, Z.; Kristoffersen, P.; Manefield, M.; Costerton, J. W.; Molin, S.; Eberl, L.; Steinberg, P.; Kjelleberg, S.; Høiby, N.; Givskov, M. (2003). Attenuation of Pseudomonas aeruginosa virulence by quorum sensing inhibitors. EMBO J. 22, 3803-3815.

Higgins, D., Thompson, J., Gibson, T., Thompson, J.D., Higgins, D.G., Gibson, T.J. (1994). CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 22, 4673-4680.

Hilgers, M. T., and Ludwig, M. L. (2001) Crystal structure of the quorum-sensing protein LuxS reveals a catalytic metal site, Proc. Natl. Acad. Sci. U.S.A. 98, 11169-11174.

Hortnagel, K., Voloshin, O.N., Kinal, H.H., Ma, N., Schaffer-Judge, C., and Camerini-Otero, R.D. (1999). Saturation mutagenesis of the E. coli RecA loop L2 homologous DNA pairing region reveals residues essential for recombination and recombinational repair. J. Mol. Biol. 286, 1097-1106.


Hua, Y., Narumi, I., Gao, G., Tian, B., Satoh, K., Kitayama, S. and Shen, B. (2003). PprI: a general switch responsible for extreme radioresistance of Deinococcus radiodurans. Biochem. Biophys. Res. Comm. 306, 354-360.

Jain, S.K., Cox, M.M., and Inman, R.B. (1994) On the role of ATP hydrolysis in RecA protein-mediated DNA strand exchange III. Unidirectional branch migration and extensive hybrid DNA formation. J. Biol. Chem., 269, 20653-20661.

Jones, T. A., Zhou, J. Y., Cowan, S. W., and Kjeldgaard, M. (1991) Improved methods for building protein models in electron density maps and the location of errors in these models, Acta Crystallogr. A 47, 110-119.


Joo, C., McKinney, S.A., Nakamura, M., Rasnik, I., Myong, S., and Ha, T. (2006). Real-time observation of RecA filament dynamics with single monomer resolution. Cell 126, 515-527.

Karlin, S. and Brocchieri, L. (1996). Evolutionary conservation of RecA genes in relation to protein structure and function. J. Bacteriol. 178, 1881-1894.

Kelley De Zutter, J., and Knight, K.L. (1999). The hRad51 and RecA proteins show significant differences in cooperative binding to single-stranded DNA. J. Mol. Biol. 293, 769-780.

Kelley De Zutter, J., Forget, A.L., Logan, K.M., and Knight, K.L. (2001). Phe217 regulates the transfer of allosteric information across the subunit interface of the RecA protein filament. Structure 9, 47-55.

Kelley, J.A. and Knight, K.L. (1997). Allosteric regulation of RecA protein function is mediated by Gln194. J. Biol. Chem. 272, 25778-25782.

Kim, J.I., Cox, M.M., and Inman, R.B. (1992) On the role of ATP hydrolysis in RecA protein-mediated DNA strand exchange I. Bypassing a short heterologous insert in one DNA substrate. J. Biol. Chem., 267, 16438–16443.

Kim, J.I., Sharma, A.K., Abbott, S.N., Wood, E.A., Dwyer, D.W., Jambura, A., Minton, K.W., Inman, R.B., Daly, M.J. and Cox, M.M. (2002). RecA protein from the extremely radioresistant bacterium Deinococcus radiodurans: expression, purification and characterization. J. Bacteriol. 184, 1649-1660.


Kim, J.-I. and Cox, M.M. (2002). The RecA proteins of Deinococcus radiodurans and Escherichia coli promote DNA strand exchange via inverse pathways. Proc. Natl. Acad. Sci. U.S.A. 99, 7917-7921.

Kleywegt, G.J. and Jones, T.A. (1998) Databases in protein crystallography, Acta Cryst D54, 1119-1131.

Knight, K.L. and McEntee, K. (1985). Tyrosine 264 in the recA protein from Escherichia coli is the site of modification by the photoaffinity label 8-azidoadenosine 5’-triphosphate. J. Biol. Chem. 260, 10185-10191.

Kowalczykowski, S.C. (2000). Initiation of genetic recombination and recombination-dependent replication. Trends Biochem Sci. 25, 156-165.

Kowalczykowski, S.C., and Krupp, R.A. (1987). Effects of Escherichia coli SSB protein on the single-stranded DNA-dependent ATPase activity of Escherichia coli RecA protein. Evidence that SSB protein facilitates the binding of RecA protein to regions of secondary structure within single-stranded DNA. J. Mol. Biol. 193: 97-113.

Kowalczykowski, S.C., and Krupp, R.A. (1995). DNA-strand exchange promoted by RecA protein in the absence of ATP: Implications for the mechanism of energy transduction in protein-promoted nucleic acid transactions. Proc. Natl. Acad. Sci. USA 92: 3478-3482.

Kowalczykowski, S.C., Dixon, D.A., Eggleston, A.K., Lauder, S.D., and Rehrauer, W.M. (1994). Biochemistry of homologous recombination in Escherichia coli. Microbiol. Rev. 58, 401-465.

Krasin, F. and Hutchinson, R. (1977). Repair of DNA double-strand breaks in Escherichia coli, which requires recA function and the presence of a duplicate genome. J. Mol. Biol. 116, 81-98.

Kraulis, P. (1991) MOLSCRIPT: a program to produce both detailed and schematic plots of protein structures, J. Appl. Crystallogr. 24, 946-950.

Krejci, L., Damborsky, J., Thomsen, B., Duno, M., and Bendixen, C. (2001). Molecular dissection of interactions between Rad51 and members of the recombination-repair group. Mol. Cell. Biol. 21, 966-976.

Kurumizaka, H., Ikawa, S., Sarai, A., Shibata, T. (1999). The mutant RecA proteins, RecAR243Q and RecAK245N, exhibit defective DNA binding in homologous pairing. Arch. Biochem. Biophys. 365, 83-91.

Kuzminov, A. (1999). Recombinational repair of DNA damage in Escherichia coli and bacteriophage λ. Microb. Mol. Biol. Rev. 63, 751-813.


Leahy, M.C. and Radding, C.M. (1986) Topography of the interaction of recA protein with single-stranded deoxyoligonucleotides. J. Biol. Chem., 261, 6954-6960.

Lee, J., and Cox, M.M. (1990). Inhibition of RecA protein-promoted ATP hydrolysis. I. ATPγS and ADP are antagonistic inhibitors. Biochemistry 29: 7666-7676.

Levin-Zaidman, S., Englander, J., Shimoni, E., Sharma, A.K., Minton, K.W. and Minsky, A. (2003). Ringlike structure of the Deinococcus radiodurans genome: a key to radioresistance? Science 299, 254-256.

Lewis, H. A., Furlong, E. B., Laubert, B., Eroshkina, G. A., Batiyenko, Y., Adams, J., et al. (2001) A structural genomics approach to the study of quorum sensing: crystal structures of three LuxS orthologs, Structure 9, 527-537.

Lindsley, J.E., and Cox, M.M. (1990). Assembly and disassembly of RecA protein filaments occurs at opposite filament ends: Relationship to DNA strand exchange. J. Biol. Chem. 265: 9043-9054.

Little, J.W. (1984) Autodigestion of lexA and phage lambda repressors. Proc. Natl Acad. Sci. U S A, 81, 1375-1379.

Livneh, Z., and Lehman, I.R. (1982). Recombinational bypass of pyrimidine dimers promoted by the RecA protein of Escherichia coli. Proc. Natl. Acad. Sci. USA, 79: 3171-3175.

Lusetti, S.L., Voloshin, O.N., Inman, R.B., Camerini-Otero, R.D., and Cox, M.M. (2004). The DinI protein stabilizes RecA protein filaments. J. Biol. Chem. 279, 30037-30046.

Lusetti, S.L., Wood, E.A., Fleming, C.D., Modica, M.J., Korth, J., Abbott, L., Dwyer, D.W., Roca, A.I., Inman, R.B., and Cox, M.M. (2003). C-terminal deletions of the Escherichia coli RecA protein: characterization of in vivo and in vitro effects. J. Biol. Chem. 278, 16372-16380.

Lyon, G. J.; Muir, T. W. (2003). Chemical signaling among bacteria and its inhibition. Chem. Biol. 10, 1007-1021.

Makarova, K.S., Aravind, L., Wolf, Y.I., Tatusov, R.L., Minton, K.W., Koonin, E.V. and Daly, M.J. (2001). Genome of the extremely radiation-resistant bacterium Deinococcus radiodurans viewed from the perspective of comparative genomics. Microbiol. Mol. Biol. Rev. 65, 44-79.

Malkov, V.A. and Camerini-Otero, R.D. (1995). Photo cross-links between single-stranded DNA and Escherichia coli RecA protein map to loops L1 (amino acid residues 157-164) and L2 (amino acid residues 195-209). J. Biol. Chem. 270, 30230-30233.


Maret, W.; Vallee, B. (1993) Cobalt as probe and label of proteins, Methods Enzymol. 226, 52-71.

Marshall, J. A.; Seletsky, B. M.; Luke, G. P. (1994). Synthesis of protected carbohydrate derivatives through homologation of threose and erythrose derivatives with chiral γ-alkoxy allylic stannanes. J. Org. Chem., 59, 3413-3420.

McEntee, K., Weinstock, G.M. and Lehman, I.R. (1981) Binding of the recA protein of Escherichia coli to single- and double-stranded DNA. J. Biol. Chem., 256, 8835-8844.

McGhee, J.D., and von Hippel, P.H. (1974). Theoretical aspects of DNA-protein interactions: co-operative and non-co-operative binding of large ligands to a one-dimensional homogeneous lattice. J. Mol. Biol. 86, 469-489.

McGrew, D.A. and Knight, K.L. (2003) Molecular design and functional organization of the RecA protein. Crit. Rev. Biochem. Mol. Biol., 38, 385-432.

Menetski, J.P. and Kowalczykowski, S.C. (1985) Interaction of recA protein with single-stranded DNA. Quantitative aspects of binding affinity modulation by nucleotide cofactors. J. Mol. Biol., 181, 281-295.

Menetski, J.P., Bear, D.G., and Kowalczykowski, S.C. (1990) Stable DNA heteroduplex formation catalyzed by the Escherichia coli RecA protein in the absence of ATP hydrolysis. Proc Natl Acad Sci USA, 87: 21-25.

Merritt, E.A. and Bacon, D.J. (1997). Raster3D: photorealistic molecular graphics. Meth. Enzymol., 277, 505-524.

Miller, C. H., and Duerre, J. A. (1968) S-Ribosylhomocysteine cleavage enzyme from Escherichia coli, J. Biol. Chem. 243, 92-97.

Miller, M. B.; Bassler, B. L. (2001). Quorum sensing in bacteria. Annu. Rev. Microbiol. 55, 165-199.

Miller, S. T.; Xavier, K. B.; Campagna, S. R.; Taga, M. E.; Semmelhack, M. F.; Bassler, B. L.; Hughson, F. M. (2004) Salmonella typhimurium recognizes a chemically distinct form of the bacterial quorum-sensing signal AI-2, Mol. Cell 15, 677-687.

Minton, K.W. (1994). DNA repair in the extremely radioresistant bacterium Deinococcus radiodurans. Mol Microbiol. 13, 9-15.

Minton, K.W. (1996). Repair of ionizing-radiation damage in the radiation resistant bacterium Deinococcus radiodurans. Mutat. Res. 363, 1-7.


Minton, K.W. and Daly, M.J. (1995). A model for repair of radiation-induced DNA double strand breaks in the extreme radiophile Deinococcus radiodurans. BioEssays 17, 457-464.

MolBioch 840 (2006) Practical protein crystallography class notes.

Moreau, P.L. and Carlier, M.F. (1989) RecA protein-promoted cleavage of LexA repressor in the presence of ADP and structural analogues of inorganic phosphate, the fluoride complexes of aluminum and beryllium. J. Biol. Chem., 264, 2302-2306.

Morrical, S.W., Lee, J. and Cox, M.M. (1986) Continuous association of Escherichia coli single-stranded DNA binding protein with stable complexes of recA protein and single-stranded DNA. Biochemistry, 25, 1482-1494.

Namsaraev, E.A. and Berg, P. (2000) Rad51 uses one mechanism to drive DNA strand exchange in both directions. J. Biol. Chem. 275: 3970-3976.

Narumi, I. (2003). Unlocking radiation resistance mechanisms: still a long way to go. Trends Microbiol. 11, 422-425.

Narumi, I., Satoh, K., Kikuchi, M., Funayama, T., Kitayama, S., Yanagisawa, T., Watanabe, H. and Yamamoto, K. (1999). Molecular analysis of the Deinococcus radiodurans recA locus and identification of a mutation site in a DNA repair-deficient mutant, rec30. Mutat. Res. 435, 233-243.

Navaza, J. (1994). AMoRe: an automated package for molecular replacement. Acta Crystallogr. A 50, 157-163.

Nayak, S. and Bryant, F.R. (1999). Differential rates of NTP hydrolysis by the mutant [S69G]RecA protein: evidence for a coupling of NTP turnover to DNA strand exchange. J. Biol. Chem. 274, 25979-25982.

Nicholls, A., Sharp, K. and Honig, B. (1991). Protein folding and association: insights from the interfacial and thermodynamic properties of hydrocarbons. Proteins 11, 281-296.

Pai, E.F., Krengel, U., Petsko, G.A., Goody, R.S., Kabsch, W., and Wittinghofer, A. (1990). Refined crystal structure of the triphosphate conformation of H-ras p21 at 1.35 A resolution: implications for the mechanism of GTP hydrolysis. EMBO J. 9, 2351-2359.

Palmer, J, and Abeles, R. (1979) The mechanism of action of S-adenosylhomocysteinase, J. Biol. Chem. 254, 1217-1226.

Pei, D.; Zhu, J. (2004). Mechanism of S-ribosylhomocysteinase (LuxS). Curr. Opin. Chem. Biol. 8, 492-497.


Rajan, R., and Bell, C.E. (2004). Crystal structure of RecA from Deinococcus radiodurans: insights into the structural basis of extreme radioresistance. J Mol Biol. 344, 951-963.

Rajan, R.; Zhu, J.; Hu, X.; Pei, D.; Bell, C. E. (2005). Crystal structure of S-ribosylhomocysteinase (LuxS) in complex with a catalytic 2-ketone intermediate. Biochemistry 44, 3745-3753.

Rehrauer, W.M. and Kowalczykowski S.C. (1993). Alteration of the nucleoside triphosphate (NTP) catalytic domain within Escherichia coli recA protein attenuates NTP hydrolysis but not joint molecule formation. J. Biol. Chem. 268, 1292-1297.

Rehrauer, W.M. and Kowalczykowski, S.C. (1996). The DNA binding site(s) of the Escherichia coli RecA protein. J. Biol. Chem. 271, 11996-12002.

Ren, D.; Sims, J. J.; Wood, T. K. (2002). Inhibition of biofilm formation and swarming of Bacillus subtilis by (5Z)-4-bromo-5-(bromomethylene)-3-butyl-2(5H)-furanone. Lett. Appl. Microbiol. 34, 293-299.

Rhodes, G. (1993). Crystallography made crystal clear. 2nd edition. Published by Academic Press, San Diego.

Roca, A.I. and Cox, M.M. (1990). The RecA protein: Structure and Function. Crit. Rev. Biochem. Mol. Biol. 25(6), 415-456.

Roca, A.I. and Cox, M.M. (1997). RecA protein: structure, function, and role in recombinational DNA repair. Prog. Nucl. Acid. Res. 56, 129-223.

Rosselli, W., and Stasiak, A. (1990) Energetics of RecA-mediated recombination reactions. Without ATP hydrolysis RecA can mediate polar strand exchange but is unable to recycle. J. Mol. Biol., 216, 335-352.

Rosselli, W., and Stasiak, A. (1991) The ATPase activity of RecA is needed to push the DNA strand exchange through heterologous regions. EMBO J., 10, 4391-4396.

Ruzheinikov, S. N., Das, S. K., Sedelnikova, S. E., Hartley, A., Foster, S. J., Horsburgh, M. J., et al. (2001) The 1.2 Å structure of a novel quorum-sensing protein, Bacillus subtilis LuxS, J. Mol. Biol. 313, 111-122.

Satoh, K., Narumi, I., Kikuchi, M., Kitayama, S., Yanagisawa, T., Yamamoto, K. and Watanabe, H. (2002). Characterization of RecA424 and RecA670 proteins from Deinococcus radiodurans. J. Biochem. 131, 121-129.

Sauer, R.T., Ross, M.J. and Ptashne, M. (1982) Cleavage of the lambda and P22 repressors by recA protein. J. Biol. Chem., 257, 4458-4462.

Schauder, S., Shokat, K., Surette, M. G., and Bassler, B. L. (2001) The LuxS family of bacterial autoinducers: biosynthesis of a novel quorum-sensing signal molecule, Mol. Microbiol. 41, 463-476.

Schuettelkopf, A. W., and van Aalten, D. M. F. (2004) PRODRG - a tool for high-throughput crystallography of protein-ligand complexes, Acta Cryst. D60, 1355-1363.

Shan, Q., Cox, M.M., and Inman, R.B. (1996). DNA strand exchange promoted by RecA K72R. Two reaction phases with different Mg2+ requirements. J Biol Chem. 271, 5712-24.

Shen, G., Rajan, R., Zhu, J., Bell, C. E., and Pei, D. (2006) Design and synthesis of substrate and intermediate analogue inhibitors of S-ribosylhomocysteinase, J Med Chem 49, 3003-3011.

Shivashankar, G.V., Feingold, M., Krichevsky, O., and Libchaber, A. (1999). RecA polymerization on double-stranded DNA by using single-molecule manipulation: The role of ATP hydrolysis. Proc. Natl. Acad. Sci. USA 96: 7916-7921.

Singleton, S.F., and Xiao, J. (2001-2002). The stretched DNA geometry of recombination and repair nucleoprotein filaments. Biopolymers. 61, 145-158.

Smith, K. M.; Bu, Y.; Suga, H. (2003a). Library screening for synthetic agonists and antagonists of a Pseudomonas aeruginosa autoinducer. Chem. Biol. 10, 563-571.

Smith, K. M.; Bu, Y.; Suga, H. (2003b). Induction and Inhibition of Pseudomonas aeruginosa quorum sensing by synthetic autoinducer analogs. Chem. Biol. 10, 81-89.

Stasiak, A., DiCapua, E. and Koller, T. (1981) Elongation of duplex DNA by recA protein. J. Mol. Biol., 151, 557-564.

Stasiak, A., Egelman, E.H., and Howard-Flanders, P. (1988). Structure of helical RecA-DNA complexes III. The structural polarity of RecA filaments and functional polarity in the RecA-mediated strand exchange reaction. J. Mol. Biol. 202, 659-662.

Story, R.M., and Steitz, T.A. (1992). Structure of the recA protein-ADP complex. Nature 355, 374-376.

Story, R.M., Weber, I.T. and Steitz, T.A. (1992). The structure of the E. coli recA protein monomer and polymer. Nature 355, 318-325.

Suga, H.; Smith, K. M. (2003). Molecular mechanisms of bacterial quorum sensing as a new drug target. Curr. Opin. Chem. Biol., 7, 586-591.

Sung, P. (1994) Catalysis of ATP-dependent homologous DNA pairing and strand exchange by yeast RAD51 protein. Science 265:1241-1243.

Sung, P. and Robberson, D.L. (1995) DNA strand exchange mediated by a RAD51-ssDNA nucleoprotein filament with polarity opposite to that of RecA. Cell 82:453-461.

Surette, M. G., Miller, M. B., and Bassler, B. L. (1999) Quorum sensing in Escherichia coli, Salmonella typhimurium, and Vibrio harveyi: a new family of genes responsible for autoinducer production, Proc. Natl. Acad. Sci. U.S.A. 96, 1639-1644.

Symington, L.S. (2002). Role of RAD52 epistasis group genes in homologous recombination and double-strand break repair. Microbiol Mol Biol Rev. 66, 630-70.

Takahashi, M., Kubista, M. and Nordén, B. (1987) Linear dichroism study of RecA-DNA complexes. Structural evidence and binding stoichiometries. J. Biol. Chem., 262, 8109-8111.

Takahashi, M., Kubista, M. and Nordén, B. (1989) Binding stoichiometry and structure of RecA-DNA complexes studied by flow linear dichroism and fluorescence spectroscopy. Evidence for multiple heterogeneous DNA co-ordination. J. Mol. Biol., 205, 137-147.

Takahashi, M., Maraboeuf, F., Morimatsu, K., Selmane, T., Fleury, F., and Nordén, B. (2007). Calorimetric analysis of binding of two consecutive DNA strands to RecA protein illuminates mechanism for recognition of homology. J Mol Biol. 365, 603-611.

Takahashi, M., Strazielle, C., Pouyet, J. and Daune, M. (1986) Co-operativity value of DNA RecA protein interaction. Influence of the protein quaternary structure on the binding analysis. J. Mol. Biol., 189, 711-714.

Tateishi, S., Horii, T., Ogawa, T. and Ogawa, H. (1992). C-terminal truncated Escherichia coli RecA protein RecA5327 has enhanced binding affinities to single- and double-stranded DNAs. J. Mol. Biol. 223, 115-129.

Tracy, R.B. and Kowalczykowski, S.C. (1996) In vitro selection of preferred DNA pairing sequences by the Escherichia coli RecA protein. Genes Dev., 10, 1890-1903.

Tracy, R.B., Baumohl, J.K. and Kowalczykowski, S.C. (1997a) The preference for GT-rich DNA by the yeast Rad51 protein defines a set of universal pairing sequences. Genes Dev., 11, 3423-3431.

Tracy, R.B., Chedin, F. and Kowalczykowski, S.C. (1997b) The recombination hot spot Chi is embedded within islands of preferred DNA pairing sequences in the E. coli genome. Cell, 90, 205-206.

Umezu, K., and Kolodner, R.D. (1994). Protein interactions in genetic recombination in Escherichia coli. Interactions involving RecO and RecR overcome the inhibition of RecA by single-stranded DNA-binding protein. J. Biol. Chem. 269, 30005-30013.

VanLoock, M.S., Yu, X., Yang, S., Lai, A.L., Low, C., Campbell, M.J. and Egelman, E.H. (2003). ATP-mediated conformational changes in the RecA filament. Structure 11, 187-196.

Volodin, A.A., and Camerini-Otero, R.D. (2002) Influence of DNA sequence on the positioning of RecA monomers in RecA-DNA cofilaments. J. Biol. Chem., 277, 1614-1618.

Volodin, A.A., Smirnova, H.A., and Bocharova, T.N. (1997) Periodicity in recA protein-DNA complexes. FEBS Lett., 407, 325-328.

Volodin, A.A., Smirnova, H.A., Bocharova, T.N., and Camerini-Otero, R.D. (2003) Phasing of RecA monomers on quasi-random DNA sequences. FEBS Lett., 546, 203-208.

Wallace, A. C., Laskowski, R. A., and Thornton, J. M. (1995) LIGPLOT: A program to generate schematic diagrams of protein-ligand interactions, Prot. Eng. 8, 127-134.

Wang, Y., Wei, Z., Liu, L., Cheng, Z., Lin, Y., Ji, F., and Gong, W. (2005) Crystal structure of human B-type phosphoglycerate mutase bound with citrate, Biochem Biophys Res Commun 331, 1207-1215.

Weinstock, G.M., McEntee, K. and Lehman, I.R. (1981a) Hydrolysis of nucleoside triphosphates catalyzed by the recA protein of Escherichia coli. Steady state kinetic analysis of ATP hydrolysis. J. Biol. Chem., 256, 8845-8849.

Weinstock, G.M., McEntee, K. and Lehman, I.R. (1981b) Interaction of the RecA protein of Escherichia coli with adenosine 5’-O-(3-thiotriphosphate). J. Biol. Chem., 256, 8850-8855.

Weinstock, G.M., McEntee, K. and Lehman, I.R. (1981c) Hydrolysis of nucleoside triphosphates catalyzed by the RecA protein of Escherichia coli. Characterization of ATP hydrolysis. J. Biol. Chem., 256, 8829-8834.

White, O., et al. (1999). Genome sequence of the radioresistant bacterium Deinococcus radiodurans R1. Science 286, 1571-1577.

Wittung, P., Ellouze, C., Maraboeuf, F., Takahashi, M. and Nordén, B. (1997) Thermochemical and kinetic evidence for nucleotide-sequence-dependent RecA-DNA interactions. Eur. J. Biochem., 245, 715-719.

Wong, I. and Lohman, T.M. (1993) A double-filter method for nitrocellulose-filter binding: application to protein-nucleic acid interactions. Proc. Natl. Acad. Sci. USA, 90, 5428-5432.

Wu, Y., He, Y., Moya, I.A., Qian, X. and Luo, Y. (2004). Crystal structure of archaeal recombinase RadA: a snapshot of its extended conformation. Mol. Cell 15, 423-435.

Xavier, K. B., and Bassler, B. L. (2003) LuxS quorum sensing: more than just a numbers game, Curr. Opin. Microbiol. 6, 191-197.

Xing, X. and Bell, C.E. (2004). Crystal structure of Escherichia coli RecA in a compressed helical filament. J. Mol. Biol. 342, 1471-1485.

Yu, X., and Egelman, E.H. (1993) The LexA repressor binds within the deep helical groove of the activated RecA filament. J. Mol. Biol., 231, 29-40.

Yu, X., Jacobs, S.A., West, S.C., Ogawa, T. and Egelman, E.H. (2001). Domain structure and dynamics in the helical filaments formed by RecA and Rad51 on DNA. Proc. Natl. Acad. Sci. USA, 98, 8419-8424.

Yusupova, G.Z., Yusupov, M.M., Cate, J.H.D., and Noller, H.F. (2001b). The path of messenger RNA through the ribosome. Cell 106, 233-241.

Yusupov, M.M., Yusupova, G.Z., Baucom, A., Lieberman, K., Earnest, T.N., Cate, J.H.D., and Noller, H.F. (2001a). Crystal structure of the ribosome at 5.5 Å resolution. Science 292, 883-896.

Zahradka, K., Slade, D., Bailone, A., Sommer, S., Averbeck, D., Petranovic, M., Lindner, A.B., Radman, M. (2006). Reassembly of shattered chromosomes in Deinococcus radiodurans. Nature. 443, 569-573.

Zhu, J., Dizin, E., Hu, X., Wavreille, A., Park, J., and Pei, D. (2003a) S-Ribosylhomocysteinase (LuxS) is a mononuclear iron protein, Biochemistry 42, 4717-4726.

Zhu, J., Hu, X., Dizin, E., and Pei, D. (2003b) Catalytic mechanism of S-ribosylhomocysteinase (LuxS): direct observation of ketone intermediates by 13C NMR spectroscopy, J. Am. Chem. Soc. 125, 13379-13381.

Zhu, J., Patel, R., and Pei, D. (2004) Catalytic mechanism of S-ribosylhomocysteinase (LuxS): stereochemical course and kinetic isotope effect of proton transfer reactions, Biochemistry 43, 10166-10172.

Zuker, M. (2003) Mfold web server for nucleic acid folding and hybridization prediction. Nucleic Acids Res., 31, 3406-3415.
