Characterization of the MoxR Family of AAA+ ATPases

by

Jamie Donald Snider

A thesis submitted in conformity with the requirements

for the Degree of Doctor of Philosophy

Department of Biochemistry

University of Toronto

© Copyright by Jamie Donald Snider (2007)

ISBN: 978-0-494-39719-0

Characterization of the MoxR Family of AAA+ ATPases

Doctor of Philosophy (2007)

Jamie Donald Snider

Department of Biochemistry, University of Toronto

Abstract

The MoxR family of AAA+ proteins is widespread throughout bacteria and archaea, but surprisingly little is known about their function. Here I present experimental characterization, bioinformatics analysis and an in-depth literature review of the MoxR family. My study reveals at least seven distinct MoxR lineages (subfamilies) including the MoxR Proper (MRP), TM0930, RavA, CGN, APE2220, PA2707 and YehL, and supports a role for these proteins in a variety of different systems. Despite their diversity, however, MoxR proteins appear to share a common method of action, namely an involvement in the assembly of multimeric complexes and a possible role in metal insertion. My gene neighbourhood analysis reveals a clear association between MoxR AAA+ and Von Willebrand Factor Type A (VWA) protein-encoding genes, in addition to a number of subfamily-specific associations. I also present an in-depth analysis of the RavA subfamily, including a phylogenetic profiling study across a subset of RavA-containing organisms and extensive experimental characterization of a representative RavA protein from Escherichia coli K12 MG1655. My experimental work includes biochemical, structural, and gene expression analysis. I demonstrate that RavA is a functional ATPase, which forms hexameric rings in the presence of nucleotide. Expression of RavA is under the control of the σS promoter, suggesting that it is important under stress response conditions. In addition, I provide evidence supporting an interaction between RavA and its corresponding VWA protein, ViaA, suggesting the two proteins function as a system. RavA also interacts with the inducible lysine decarboxylase, LdcI, forming a large cage-like structure which may be important in the regulation of RavA activity. The results of my profiling study and a microarray analysis using RavA/ViaA deletion and overexpression strains identify a number of candidate substrates, interaction partners and systems with which RavA and ViaA appear to be involved. My work provides invaluable information regarding the MoxR proteins and presents a useful groundwork upon which to base future studies.

Acknowledgements

There are so many people who have helped make my graduate school experience a remarkable one that it would be impossible to express the full extent of my gratitude here, but I would be remiss if I didn’t at least attempt to mention some of them. First and foremost, I would like to thank my wife Hannah, for supporting me in my decision to return to school, as well as providing me with much needed encouragement (and displaying remarkable patience) over the many years I’ve spent in the pursuit of my degree. Walid Houry, my supervisor, who has proven to be an exceptional mentor and has contributed greatly to my development as a scientist. I will miss our frequent (and sometimes heated) scientific discussions. My committee members, Russell Bishop and Jacqueline Segall, both of whom have easily gone beyond their expected ‘duties’, taking time out of their busy schedules, on more than one occasion, to discuss both my project and future goals with me, as well as bring me articles of interest. Usheer Kanjee, who joined my project early on and who has proven to be both a good friend and colleague. Guillaume Thibault, for his advice, support and plain willingness to put up with me in general. All those who have worked on the RavA project over the years, including my ‘number one guy’ Keith Wong, crazy Asad Merchant and his crazier accomplice Yaying Shen, ‘sweet and innocent’ Shaliny Ramachandran (always ready to keep me in my place with a razor-sharp remark), Michelle Lin (one of the hardest working fourth-year students I’ve known), the ever-enthusiastic Bharat Sharma and Dr. Sabulal Baby. I also want to thank all other members of the Houry lab, past and present, including Anna Gribun, Rongmin Zhao, Yoshito Kakihara, Angela Yu, Jen Huen, Andre Pow, Philip Wong, Toni Davidson, Yulia Tsitrin, Urszula Wojtyra (in many ways an inspiration to me) and the many undergraduate students who have been part of the lab over the years. And finally, a special thanks to all of those others whose names I haven’t mentioned, including the office staff, fellow graduate students and some remarkable collaborators. Thanks and good luck to you all!

Table of Contents

Abstract
Acknowledgements
Table of Contents
List of Tables
List of Figures
List of Abbreviations
1. General Introduction: The AAA+ Superfamily - Similarity in Folds and Mechanism of Action, Diversity in Function
1.1 The P-Loop NTPases
1.2 The AAA+ ATPases
1.2.1 General Structure
1.2.2 Evolution, Classification, and Diverse Functions of AAA+ Proteins
1.2.2.1 The Extended AAA Group
1.2.2.1A Classical AAA Clade
1.2.2.1B Other Extended AAA Families
1.2.2.2 The HEC Group
1.2.2.2A Clamp Loader Clade
1.2.2.2B Initiation Clade
1.2.2.2C Other HEC Families
1.2.2.3 ExeA Group
1.2.2.4 STAND Group
1.2.2.5 PACTT Group
1.2.2.5A HCL Clade
1.2.2.5B Helix-2 Insert Clade
1.2.2.5C Other PACTT Families
1.2.3 Thesis Rationale: The MoxR Family
2. MoxR AAA+ ATPases: A Novel Family of Molecular Chaperones
2.1 Summary
2.2 Introduction
2.3 Materials and Methods
2.3.1 MoxR Phylogenetic Analysis
2.3.2 MoxR Gene Neighbourhood Analysis
2.4 Results and Discussion
2.4.1 MRP (MoxR Proper) Subfamily
2.4.2 TM0930 Subfamily
2.4.3 RavA Subfamily
2.4.4 CGN Subfamily
2.4.4.1 NirQ/NorQ-type members
2.4.4.2 CbbQ-type members
2.4.4.3 GvpN-type members
2.4.4.4 DnaJ/BolA Associated (DBA) members
2.4.5 APE2220 Subfamily
2.4.6 PA2707 Subfamily
2.4.7 YehL Subfamily
2.5 Conclusions
3. Formation of a Distinctive Complex Between the Inducible Bacterial Lysine Decarboxylase and a Novel AAA+ ATPase
3.1 Summary
3.2 Introduction
3.3 Materials and Methods
3.3.1 Bioinformatics
3.3.2 Bacterial Strains
3.3.3 Gene Cloning
3.3.4 Protein Expression and Purification
3.3.5 Measurement of RavA ATPase Activity
3.3.6 Preparation of Cell Extracts for LdcI Activity Assays
3.3.7 Measurement of LdcI Lysine Decarboxylase Activity
3.3.8 Growth in LdcI Indicator Media
3.3.9 Extreme Acid Shock Assays
3.3.10 Northern Blot Analysis
3.3.11 RT-PCR
3.3.12 Western Blot Analysis
3.3.13 Subcellular Fractionation of E. coli
3.3.14 Size Exclusion Chromatography
3.3.15 Analytical Ultracentrifugation Analysis of RavA
3.3.16 Mapping of RavA Domains
3.3.17 Electron Microscopy
3.3.18 Image Analysis
3.4 Results
3.4.1 RavA Represents a Distinct Subfamily Within the MoxR AAA+ Family
3.4.2 RavA is a Soluble Cytoplasmic ATPase Consisting of Two Domains
3.4.3 ravA and viaA genes form an operon under the direct control of σS
3.4.4 RavA Forms a Hexameric Oligomer
3.4.5 RavA Binds to the Inducible Lysine Decarboxylase LdcI/CadA
3.4.6 RavA-LdcI Form a Very Large Distinctive Cage-Like Complex as Visualized by Negative Stain Electron Microscopy
3.4.7 The Binding of RavA to LdcI Affects RavA Activity but Not LdcI Activity
3.5 Discussion
3.5.1 RavA-ViaA Might Play a Role in Metal Insertion
3.5.2 Possible Implications of LdcI Binding to RavA
3.5.3 Implications of the RavA-LdcI complex structure
4. Towards Understanding the Function of the RavA-ViaA Chaperone System: A Phylogenetic Profiling and Microarray Study
4.1 Summary
4.2 Introduction
4.3 Materials and Methods
4.3.1 Phylogenetic Profiling Analysis
4.3.2 Bacterial Strains
4.3.3 Gene Cloning
4.3.4 Western Blot Analysis
4.3.5 Microarray Experiments
4.3.6 Analysis of Microarray Data
4.4 Results and Discussion
4.4.1 Phylogenetic Profiling Analysis
4.4.2 Microarray Analysis
4.4.2.1 Results of the Microarray Analysis of RavA-ViaA Overexpression Strain
4.4.2.2 Examining the Microarray Analysis Results of the RavA-ViaA Overexpression Strain
4.4.2.3 Results of the Microarray Analysis of RavA Overexpression Strain
4.4.2.4 Examining the Microarray Analysis Results of the RavA Overexpression Strain
4.4.2.5 Results of the Microarray Analysis of ΔravA::cat Strain
4.4.2.6 Examining the Microarray Analysis Results of ΔravA::cat Strain
4.4.3 The Putative Functions of RavA-ViaA
5. General Conclusion and Future Directions
5.1 The MoxR Family
5.2 The RavA Subfamily
5.2.1 Phenotype Screening
5.2.2 Characterization of RavA-ViaA
5.2.3 Characterization of RavA-LdcI
5.2.4 Alternate Substrates, Interaction Partners and Systems
5.3 Conclusion
6. References
7. Appendix

List of Tables

TABLE 1. GI numbers and organism distribution of MoxR AAA+ proteins organized according to subfamilies.
TABLE 2. Significantly enriched genes detected by profiling analysis.
TABLE 3. Genes whose transcript levels increase in MG1655 pRV cells relative to those in wildtype cells.
TABLE 4. Genes whose transcript levels decrease in MG1655 pRV cells relative to those in wildtype cells.
TABLE 5. Genes whose transcript levels increase in MG1655 pR cells relative to those in wildtype cells.
TABLE 6. Genes whose transcript levels decrease in MG1655 pR cells relative to those in wildtype cells.
TABLE 7. Genes whose transcript levels increase in MG1655 ΔravA::cat cells relative to those in wildtype cells.
TABLE 8. Genes whose transcript levels decrease in MG1655 ΔravA::cat cells relative to those in wildtype cells.
SUPPLEMENTARY TABLE 1. Organisms used in profiling analysis.
SUPPLEMENTARY TABLE 2. Complete phylogenetic profiling results.
SUPPLEMENTARY TABLE 3. General listing of additional experiments.

List of Figures

FIGURE 1. The P-Loop NTPases: major divisions and representative structures from the P-loop NTPase Kinase-GTPase (KG) and Additional Strand Catalytic E (ASCE) structural groups.
FIGURE 2. Structure of the AAA+ module of Saccharomyces cerevisiae RFC1.
FIGURE 3. Classification of AAA+ proteins.
FIGURE 4. Selected AAA+ modules and oligomeric structures from the Extended AAA Group.
FIGURE 5. Schematic diagrams illustrating the domain structure of select AAA+ and associated proteins.
FIGURE 6. Selected AAA+ modules and oligomeric structures from the HEC Group.
FIGURE 7. Holliday junctions and the RuvAB motor complex.
FIGURE 8. Selected AAA+ modules and oligomeric structures from the PACTT Group.
FIGURE 9. EM image of -c.
FIGURE 10. Phylogenetic tree of the MoxR AAA+ family.
FIGURE 11. Consensus sequences and alignment of MoxR subfamily members.
FIGURE 12. Distribution of MoxR AAA+ proteins across organisms.
FIGURE 13. General gene structure for each MoxR AAA+ subfamily.
FIGURE 14. PQQ Structure and MDH Gene Region.
FIGURE 15. Pathway of microbial denitrification.
FIGURE 16. Phylogenetic analysis of MoxR AAA+ proteins.
FIGURE 17. Assignment of highly conserved residues in RavA AAA+ proteins.
FIGURE 18. Domain mapping, localization, and ATPase activity of RavA.
FIGURE 19. ravA-viaA gene organization in E. coli.
FIGURE 20. Analysis of RavA oligomerization.
FIGURE 21. RavA interacts with the inducible lysine decarboxylase (LdcI/CadA).
FIGURE 22. EM analysis of LdcI and RavA-LdcI complex.
FIGURE 23. The effect of RavA binding to LdcI on LdcI and RavA activities.
FIGURE 24. Genomic environment of ravA in E. coli K12 MG1655.
FIGURE 25. RavA overexpression.
FIGURE 26. Analysis of RavA expression during cell growth.
FIGURE 27. COG class distribution of genes undergoing significant changes in expression as obtained by microarray analysis.
FIGURE 28. Genomic organization of selected genes observed to undergo expression changes in the microarray analysis.
FIGURE 29. Venn diagram showing the overlap in gene expression changes among the different microarray conditions.

List of Abbreviations

AAA(+) ATPases associated with a variety of cellular activities (plus)
ADP Adenosine 5'-diphosphate
AMP Adenosine 5'-monophosphate
AMP-PCP Adenosine 5'-[β,γ-methylene]triphosphate
ANS 1-anilinonaphthalene-8-sulphonate
ASCE Additional strand catalytic E
ATP Adenosine 5'-triphosphate
ATPγS Adenosine 5'-[γ-thio]triphosphate
CDD Conserved domain database
CGN CbbQ/GvpN/NorQ
COG Clusters of orthologous groups
CRP Cyclic-AMP receptor protein
CTF Contrast transfer function
CTP Cytidine 5'-triphosphate
DHC Dynein heavy chain
DNA Deoxyribonucleic acid
DUF Domain of unknown function
EBP Enhancer binding protein
EM Electron microscopy
ERAD Endoplasmic reticulum associated degradation
FSC Fourier shell correlation
GTP Guanosine 5'-triphosphate
HCL HslU/Clp/Lon
HEC Helicases and clamp loaders
IHF Integration host factor
KG Kinase-GTPase
MALDI Matrix assisted laser desorption ionization
MCM Minichromosome maintenance
MIDAS Metal-ion dependent adhesion site
mRNA Messenger ribonucleic acid
MRP MoxR Proper
MSA Multivariate statistical analysis
MUSCLE Multiple sequence comparison by log-expectation
NAS N-acetylserine
NCBI National Center for Biotechnology Information
NSF N-ethylmaleimide sensitive factor
NTP Nucleoside triphosphate
OAS O-acetylserine
OMP Outer membrane protein
PAAA Proteasomal ATPases
PACTT Protease, chelatase, transcriptional activator and transport
PCR Polymerase chain reaction
PHYLIP Phylogeny inference package
PLP Pyridoxal-5'-phosphate
ppGpp Guanosine 3',5'-bispyrophosphate
PQQ Pyrroloquinoline quinone
PS1βH Pre-sensor 1 β-hairpin
PVDF Polyvinylidene difluoride
RavA Regulatory ATPase variant A
RFC Replication factor C
RNA Ribonucleic acid
RT-PCR Reverse transcription polymerase chain reaction
RuBisCO Ribulose-1,5-bisphosphate carboxylase/oxygenase
SAM S-adenosyl-methionine
SDS-PAGE Sodium dodecyl sulfate polyacrylamide gel electrophoresis
SNARE Soluble NSF attachment receptor
snoRNP Small nucleolar riboprotein
STAND Signal transduction ATPases with numerous domains
TNBS Trinitrobenzenesulfonic acid
TPR Tetratricopeptide repeat
UTP Uridine 5'-triphosphate
ViaA VWA interacting with AAA+ ATPase
VWA Von Willebrand factor type A
WHIP Werner helicase interacting protein

1. General Introduction: The AAA+ Superfamily - Similarity in Folds and Mechanism of Action, Diversity in Function

In the following chapter I provide a brief description of the P-loop NTPase class of proteins, followed by a detailed overview of one of the major lineages of this class, the AAA+ superfamily. I present a coordinated look at the results of numerous studies on the AAA+ proteins, describing their defining structural and mechanistic characteristics, as well as the unique features and distinct functional roles evolved by different members of the superfamily. This chapter provides context and useful background information for my thesis work, which describes the characterization of a largely unstudied family of AAA+ proteins, known as MoxR.

1.1 The P-Loop NTPases

The energy obtained from the hydrolysis of nucleotides is fundamental to a myriad of biological processes, and indeed to life itself. Cells have evolved various mechanisms for harnessing this energy and directing it towards useful work. One class of proteins using such a mechanism is the P-loop NTPases, an abundant class of nucleotide binding/hydrolyzing proteins, which play critical roles in a vast array of cellular functions.

These proteins are found in all three major superkingdoms of life, including the prokaryotic (i.e. lacking a true nucleus) Archaea and Bacteria, as well as the Eucarya (i.e. eukaryotes, possessing a true nucleus and membrane-bound organelles derived from bacterial endosymbionts) (Iyer et al. 2004; Martin and Embley 2004). In fact, roughly 5-10% of the proteins encoded by the prokaryotic and eukaryotic genomes fully sequenced thus far are predicted to contain a P-loop NTPase domain (Koonin et al. 2000). P-loop NTPases are defined by the presence of the nominal P-loop, a conserved nucleotide phosphate-binding motif, also referred to as the Walker A motif (Gx4GK[S/T], where ‘x’ is any residue and variant residues are enclosed in square brackets), and a second, more variable region, called the Walker B motif (hhhh[D/E], where ‘h’ is a hydrophobic residue and variant residues are enclosed in square brackets). Both the Walker A and Walker B motifs are important for binding/interaction with nucleotides, which are typically ATP or GTP, and Mg2+ (Gorbalenya and Koonin 1989; Saraste et al. 1990; Walker et al. 1982). The P-loop NTPases share a common αβα core domain structure, consisting of a parallel β-sheet sandwiched between two sets of α-helices (Milner-White et al. 1991). Although not universally true, P-loop NTPases most commonly catalyze the hydrolysis of the β-γ bond of the bound nucleotide, utilizing the energy released from this reaction to direct conformational changes in other molecules (Leipe et al. 2004).
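As an illustration only, the Walker A and Walker B consensus patterns given above can be written as regular expressions and scanned against a protein sequence. The sketch below is a minimal Python example, not a method used in this thesis. The function name `find_walker_motifs`, the toy sequence, and the set of residues counted as hydrophobic (A, V, L, I, M, F, W, C) are assumptions made for the example. Short patterns of this kind match many unrelated sequences by chance, so a hit is at best a hint and not evidence of a P-loop NTPase domain.

```python
import re

# Walker A (P-loop): G, any four residues, G, K, then S or T -> Gx4GK[S/T]
WALKER_A = re.compile(r"G.{4}GK[ST]")

# Walker B: four hydrophobic residues followed by D or E -> hhhh[D/E]
# Assumption: 'h' is taken here to mean any of A, V, L, I, M, F, W or C.
WALKER_B = re.compile(r"[AVLIMFWC]{4}[DE]")


def find_walker_motifs(sequence):
    """Return (motif name, 0-based start, matched text) tuples sorted by start."""
    hits = []
    for name, pattern in (("Walker A", WALKER_A), ("Walker B", WALKER_B)):
        for match in pattern.finditer(sequence.upper()):
            hits.append((name, match.start(), match.group()))
    return sorted(hits, key=lambda hit: hit[1])


# Hypothetical toy sequence, not a real protein.
toy = "MAAGPPGSGKTTLARRRLVLIDEADKK"
for name, start, text in find_walker_motifs(toy):
    print(name, start, text)
```

In practice, domain assignment relies on profile-based searches rather than on bare consensus patterns; the example is only meant to make the bracket and ‘x’/‘h’ notation concrete.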

The P-loop NTPases can be divided into two major structural groups, the Kinase-GTPase (KG) group and the Additional Strand Catalytic E (ASCE) group. These groups, and representative members illustrating the core structural features of each, are shown in Figure 1. The KG group is characterized by the adjacency of the Walker B strand and the strand connected to the P-loop, with a characteristic strand order of 54132 in their core β-sheet (Fig. 1, lower left panel). The ASCE group contains an additional strand inserted between the strand connected to the P-loop and the Walker B strand, giving a core β-sheet strand order of 51432, as well as a catalytically important, conserved glutamate residue within the Walker B motif (Fig. 1, lower right panel) (Leipe et al. 2003). Within these two groups the P-loop NTPases can be divided into numerous distinct lineages (Fig. 1) (Leipe et al. 2003; Leipe et al. 2004; Leipe et al. 2002).
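The strand-order criterion described above can be expressed as a small check. The following Python sketch is illustrative only: the function name is hypothetical, and it assumes that strand 1 is the strand connected to the P-loop and strand 3 is the Walker B strand in both groups, in line with the numbering given for RFC1 in this chapter. It tests the strand-order criterion alone and ignores the conserved catalytic glutamate that also characterizes the ASCE group.

```python
def classify_core_sheet(strand_order):
    """Classify a core beta-sheet as 'KG' or 'ASCE' from its strand order.

    strand_order is a string such as "54132", giving the spatial order of the
    strands, each numbered by its position in the primary sequence.
    Assumption: strand 1 follows into the P-loop and strand 3 carries Walker B.
    """
    order = [int(ch) for ch in strand_order]
    p_loop_pos = order.index(1)
    walker_b_pos = order.index(3)
    # Number of strands lying between the P-loop strand and the Walker B strand.
    strands_between = abs(p_loop_pos - walker_b_pos) - 1
    if strands_between == 0:
        return "KG"            # the two strands are adjacent
    if strands_between == 1:
        return "ASCE"          # one additional strand is inserted between them
    return "unclassified"


print(classify_core_sheet("54132"))
print(classify_core_sheet("51432"))
```

With the two strand orders quoted in the text, the first call reports the KG arrangement and the second the ASCE arrangement, because strand 4 sits between strands 1 and 3 only in the latter.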

The following sections give a detailed look at one of these major lineages, the AAA+ proteins of the ASCE structural group. A review of the current AAA+ literature is provided, addressing the structural, mechanistic and functional similarities and differences that have evolved between the numerous families of this diverse group of P-loop NTPases.

1.2 The AAA+ ATPases

1.2.1 General Structure

AAA stands for ‘ATPases Associated with diverse cellular Activities’, and, as the name implies, was first used to describe a class of ATP hydrolyzing enzymes with a range of functional roles (Kunau et al. 1993). Among other processes, AAA proteins were found to be involved in protein degradation, vesicular fusion, peroxisome biogenesis, and the assembly of membrane complexes (Iyer et al. 2004). Subsequent work showed that AAA proteins are actually a subset of a much larger superfamily of ATPases, now referred to as AAA+ (Neuwald et al. 1999).

P-Loop NTPases. Kinase-GTPase (KG) Group: TRAFAC, SIMIBI GTPases. Additional Strand Catalytic E (ASCE) Group: AAA+ ATPases, KAP ATPases, RecA/F1 ATPases, ABC ATPases, SF1/2 Helicases, VirD/PilT ATPases.

FIGURE 1. The P-Loop NTPases: major divisions and representative structures from the P-loop NTPase Kinase-GTPase (KG) and Additional Strand Catalytic E (ASCE) structural groups. The top two panels show the two major structural groups (KG and ASCE) and major classes/superfamilies which lie within them. The lower left and right panels show representative NTPase core domains of the KG and ASCE groups, respectively. β-strands are represented by arrows and are shown in green. The strands of the conserved ‘core’ β-sheet structure are numbered in order of occurrence in the primary sequence. Helices are represented by cylinders and are shown in grey. The Walker A and B motifs are shown in red and blue, respectively. Lower left: P-loop NTPase core domain from Thermus aquaticus Ffh protein (KG structural group) (Freymann et al. 1999). Lower right: P-loop NTPase core domain from Saccharomyces cerevisiae RFC1 clamp loader protein (ASCE structural group) (Bowman et al. 2004). The additional strand between the Walker A and Walker B associated β-strands is marked with a ‘*’. The position of the catalytic glutamic acid residue is shown in pink.

In addition to the conserved αβα core domain structure, and the Walker A and B motifs of the P-loop NTPases, the AAA+ proteins contain a number of other conserved, distinguishing features. All these features are found within a 200-250 amino acid region referred to as the AAA+ ‘module’ (Neuwald et al. 1999; Ogura and Wilkinson 2001). Figure 2 shows a representative AAA+ module, from the RFC1 Clamp Loader protein of Saccharomyces cerevisiae (Bowman et al. 2004). The structure of the AAA+ module of this protein represents the ‘basic’ core structure of AAA+ modules without insertions or modifications. Various major features are marked in Figure 2. Like many other members of the ASCE structural group, AAA+ proteins function as oligomeric rings, with a hexameric arrangement being most common (see below) (Iyer et al. 2004).

Notably, the AAA+ module consists of two discrete domains. The first corresponds to the P-loop NTPase αβα nucleotide-binding core domain, and consists of a 5-stranded parallel β-sheet, with a 51432 strand order, sandwiched between α-helices (Fig. 2A, Green) (Ogura and Wilkinson 2001). The second domain, unique to AAA+ proteins, consists of a bundle of four α-helices (Fig. 2A, Purple). The sequence of this domain is much less conserved across AAA+ proteins than the nucleotide binding domain, but all appear to share a common core fold consisting of two helical hairpins arranged in a left-handed, superhelical structure (Ammelburg et al. 2006; Ogura and Wilkinson 2001). Nucleotide bound by the αβα domain is sandwiched between these two domains. In oligomeric structures, the nucleotide also faces the αβα domain of the neighbouring subunit (Ogura et al. 2004; Ogura and Wilkinson 2001) and, hence, in some AAA+ proteins, the nucleotide is required for proper oligomerization.

Figure 2 panel labels: A) Nucleotide Binding Domain, α-Helical Domain; B) ATPγS, Mg2+, inset: Box II, Walker A, Walker B, Sensor 1, Arg Finger, Sensor 2; C) ATPγS, Mg2+, Lys359, Thr360, Asp424, Glu425, Asn456, Arg516.

FIGURE 2. Structure of the AAA+ module of Saccharomyces cerevisiae RFC1. A) Overall view of the RFC1 AAA+ module from Saccharomyces cerevisiae RFC1 (Bowman et al. 2004). The nucleotide binding domain is shown in green. The C-terminal α-helical domain is shown in purple. β-strands and α-helices are labeled (β1 to β5 and α0 to α8, respectively). B) Overall view of the AAA+ module showing major motifs (coloured and labeled as described in the inset) and bound ATPγS (yellow sticks) and Mg2+ (grey sphere). C) Close up view of nucleotide binding/catalytic site. Side chains of key residues are labeled and coloured as described in the inset.

The AAA+ module contains several signature motifs. The first motif is Box II (Fig. 2B and C, Pink), which maps to a putative alpha helical structure in an extended N-terminal region (α0), and is a defining feature of the AAA+ protein class (Iyer et al. 2004; Neuwald et al. 1999). Residues in this motif, based upon their proximity to the bound nucleotide, have been proposed to play a role in adenine recognition although, notably, this motif is not always conserved (Neuwald et al. 1999).

The next conserved feature (Fig. 2B and C, Red) is the Walker A motif, which maps to the region between strand β1 and helix α1. The conserved lysine and threonine/serine residues of this motif are proposed to be important in binding the β- and γ-phosphates of bound ATP and the Mg2+ ion, respectively. The lysine residue is also believed to be important in maintaining the proper conformation of the P-loop, via hydrogen bonding with the main chain oxygens of the loop (Saraste et al. 1990; Walker et al. 1982). The Walker A motif is followed by two subtle motifs, referred to as Box IV and IV’ (not shown) (Neuwald et al. 1999).

The Walker B motif is associated with the β3 strand (Fig. 2B and C, Blue) and has the consensus sequence hhhhDE (h = a hydrophobic residue), where E is the conserved catalytic glutamate residue characteristic of the ASCE group of P-loop NTPases. The carboxylate of this residue is believed to act as a catalytic base, abstracting a proton from a molecule of water, thereby priming it for a nucleophilic attack on the γ-phosphate of bound ATP (Leipe et al. 2003; Ogura and Wilkinson 2001). The conserved aspartate residue is involved in the co-ordination of the Mg2+ ion (Gorbalenya and Koonin 1989; Walker et al. 1982).

Immediately after the Walker B motif, the Sensor 1 motif is present on strand β4 (Fig. 2B and C, Cyan). The Sensor 1 motif contains a conserved polar residue, which is asparagine in the RFC1 structure, although threonine, serine or histidine is sometimes observed. This residue has been shown to be functionally important, and may interact with the γ-phosphate of ATP either directly, potentially acting as a ‘sensor’ of nucleotide binding/hydrolysis, or indirectly via a water molecule, possibly helping to properly orient the water molecule for nucleophilic attack on the substrate (Guenther et al. 1997; Karata et al. 1999). It is important to note that the Sensor 1 motif is not strictly unique to the AAA+ proteins, but is also found in certain other divisions of the ASCE structural group of P-loop NTPases (Iyer et al. 2004).

After Sensor 1 is Box VII, which contains a conserved arginine residue near the N-terminus of strand β5 (Fig. 2B, Brown) (Neuwald et al. 1999). In most AAA+ proteins this residue is oriented towards the ATP-containing site of a neighbouring subunit and is proposed to act as an ‘arginine finger’, interacting with the γ-phosphate of the nucleotide in the neighbouring subunit. The arginine finger has been shown to be necessary for ATP hydrolysis though not binding, playing an important role in intersubunit communication (Davey et al. 2003; Johnson and O'Donnell 2003; Karata et al. 1999; Rombel et al. 1999). This arginine finger is also found in some other ASCE group members, particularly those which form rings (Lupas and Martin 2002).

Box VII is followed by two subtle motifs, Box VII´ and VII´´ (not shown), and then the Sensor 2 motif (Fig. 2B and C, Orange). All of these motifs are in the second, helical bundle domain of the AAA+ module. The Sensor 2 motif, which is on the third helix of the second domain (α7), contains a conserved arginine residue that interacts with the γ-phosphate of the bound ATP substrate. The function of this arginine residue appears to be somewhat divergent, and it has been implicated in a variety of roles including ATP binding, hydrolysis, sensing and inter-subunit interaction. It is also believed to be involved in mediating movement of the C-domain relative to the N-domain of the AAA+ module during ATP hydrolysis (Neuwald et al. 1999; Ogura et al. 2004; Ogura and Wilkinson 2001). Motion between these two domains is proposed to be important in the generation of mechanical force which affects substrate molecules and functional partners (Ogura and Wilkinson 2001).

1.2.2 Evolution, Classification, and Diverse Functions of AAA+ Proteins

Sequence and structural analyses indicate that the AAA+ superfamily is very ancient, and underwent considerable divergence prior to the appearance of the last common ancestor of the Eucarya, Bacteria, and Archaea domains of life (Iyer et al. 2004; Neuwald et al. 1999). Recent phylogenetic studies using sequence and structural information have shown that the AAA+ superfamily can be divided into numerous smaller families (Ammelburg et al. 2006; Beyer 1997; Frickey and Lupas 2004; Iyer et al. 2004; Neuwald et al. 1999). These studies clearly demonstrate that, beyond the core features defined in the previous section, many of these AAA+ lineages have evolved their own unique functional sequences and structural elements.

Figure 3 shows some of the major groups, clades, and families of AAA+ proteins and represents the integrated results of two recent classification studies (Ammelburg et al. 2006; Iyer et al. 2004) with the inclusion of new families and reassignment of certain families in accordance with the more recent work (Ammelburg et al. 2006). The major groups are the Extended AAA Group, the HEC (Helicases and Clamp Loaders) Group, the PACTT (Protease, Chelatase, Transcriptional Activators and Transport) Group, the ExeA Group and the STAND Group. The STAND group is shown in yellow due to uncertainty as to its classification within the AAA+ lineage (see below).

As can be seen, AAA+ proteins are involved in a vast array of different functions, in which they typically use the power of ATP hydrolysis to mediate molecular remodeling events. Specific cellular functions are defined mainly by the fusion of different domains to the AAA+ modules and by the association of the proteins with various functional partners. The following section provides an overview of the AAA+ proteins, giving a brief description of each of the major AAA+ families, in the context of their higher order classification, and illustrates the remarkable functional diversity of these proteins as a whole.

1.2.2.1 The Extended AAA Group

The Extended AAA Group contains the Classical AAA clade defined by the earlier classification study (Iyer et al. 2004), in addition to several related families from the more recent work, listed in the ‘Other’ box under the Extended AAA classification (Fig. 3) (Ammelburg et al. 2006).

Extended AAA Group
Classical AAA Clade: 1. FtsH [P]; 2. Katanin [P]; 3. NSF/CDC48 [P]; 4. Pex1/6 [P]; 5. Bcs1p [P]; 6. Proteasomal ATPases [P]
Other: 1. RuBisCO Activase [P]; 2. Rvb [P/N]; 3. ClpAB-D1 [P]; 4. SpoVK [U]; 5. Ycf2 [U]; 6. AFG1 [U]; 7. Viral Helicases [N]

HEC Group
Clamp Loader Clade, Initiation Clade, Other (entries in row order): 1. HolB/DnaX [P/N], 1. DnaA/DnaC [P/N], 1. RuvB [N]; 2. RFC [P/N], 2. CDC6/ORC [P/N], 2. IstB [P/N]; 3. WHIP [P/N], 3. HolA [P/N]

ExeA Group

STAND Group

PACTT Group (Pre-sensor 1 β-Hairpin Superclade)
HCL Clade: 1. HslU/ClpX [P]; 2. LonA [P/N]; 3. ClpAB-D2/Torsin [P]
Helix-2 Insert Clade: 1. Chelatase [S]; 2. McrB/Unc53 [P/N]; 3. Midasin [P/N]; 4. MCM [N]; 5. σ54 Activator [P/N]; 6. YifB [U]; 7. MoxR [P]; 8. ComM [U]
Other: 1. Dynein Heavy Chain (DHC) [P/N]; 2. LonB [P/N]

FIGURE 3. Classification of AAA+ proteins. Division of the AAA+ proteins into groups (green boxes), clades (blue or grey boxes) and families. The STAND group is coloured yellow to reflect uncertainty as to its classification within the AAA+ lineage. The figure is based upon two recently published AAA+ classification efforts, performed using both sequence and structural information (Ammelburg et al. 2006; Iyer et al. 2004). The major known or putative target(s) – Protein (P), Nucleic acid (N), Small Molecule (S) or Unknown (U) – are listed in square brackets beside the family name.

1.2.2.1A Classical AAA Clade

This clade includes the FtsH, Katanin, CDC48/NSF, Pex1/6, Bcs1p, and Proteasomal ATPases families (Fig. 3). The originally identified AAA proteins all fall within this clade. Structurally it is defined by the presence of an additional, small helix downstream of strand β2 (Fig. 4A,B) and a conserved glycine N-terminal to the arginine finger (Iyer et al. 2004).

FtsH Family

FtsH is an ATP-dependent, zinc metalloprotease that is essential for bacterial cell growth (Jayasekera et al. 2000; Tomoyasu et al. 1995). The protein is anchored to the inner membrane via two helices at its N-terminal transmembrane domain (Tomoyasu et al. 1993a; Tomoyasu et al. 1993b). The remainder of the protein, located in the cytosol, consists of three major domains, including the two domains of the AAA+ module (core and α-helical), as well as a C-terminal protease domain (Krzywda et al. 2002; Niwa et al. 2002; Suno et al. 2006; Tomoyasu et al. 1993b) (Fig. 4A). FtsH is responsible for the degradation of both misassembled/misfolded membrane protein complexes and certain cytosolic regulatory proteins (Ito and Akiyama 2005).

FtsH functions as a homohexamer and is proposed to consist of a combination of alternately ‘open’ (active) and ‘closed’ (inactive) subunits, which cycle in a nucleotide-dependent manner, driving the translocation of substrates to the protease active sites. It has been proposed that FtsH substrates pass via a tunnel along a ‘closed’ subunit into the protease active site of an adjacent ‘open’ subunit. Such a mechanism is consistent with the location of the proteolytic active sites at the periphery of the hexameric ring (Suno et al. 2006). In addition to bacteria, FtsH family members are also found in certain eukaryotic organelles (e.g. mitochondria, chloroplasts), but are noticeably absent from archaea.

[Figure 4 image. Panels: A) FtsH (strands β1-β5, N and C termini, extra helix; top and side insets); B) Rvb1 (strands β1-β5, N and C termini, extra helix, β-sheet rich insert; top and side insets); C) ClpB (strands β1-β5, N and C termini, coiled-coil ‘propeller’ domain).]

FIGURE 4. Selected AAA+ modules and oligomeric structures from the Extended AAA Group. Each panel shows an individual AAA+ module. Nucleotide binding ‘core’ domains are shown in green, small α-helical domains are shown in purple. All other major features are as marked. For FtsH and Rvb1, top and side views of the oligomeric structures of a given protein are shown in insets, with the AAA+ modules coloured as in the larger figure. Any accessory domains external to the AAA+ module are shown in grey. Bound nucleotide (ADP) is shown as yellow spheres. A) AAA+ module from Thermus thermophilus FtsH as a representative member of the Classical AAA Clade. Insets: FtsH cytoplasmic domain hexamer. Protease domains are shown in grey (Suno et al. 2006). B) AAA+ Module from human Rvb1 protein as a representative member of the Rvb family. Insets: Rvb1 hexamer (Matias et al. 2006). C) N-terminal AAA+ Module (D1) from Thermus thermophilus ClpB showing the propeller domain (Lee et al. 2003a).


Katanin Family

Members of the katanin family are found in eukaryotes (Iyer et al. 2004). Katanin, named after the katana, the Japanese samurai sword, is involved in the severing and disassembly of microtubules, processes which contribute to the functionally important, dynamic nature of these structures (Baas et al. 2005; McNally and Vale 1993). Microtubules are polar polymers of tubulin dimers, and play diverse roles in cells, including acting as tracks for the intracellular movement/organization of vesicles and other cellular components, and serving as the key structural component of the mitotic/meiotic spindle during chromosome segregation (Walczak 2000). Katanin functions as a heterodimer consisting of the enzymatically active, AAA+ subunit (P60) and an enzymatically inactive subunit (P80) which serves to regulate P60 activity and target the complex to the centrosome (Hartman et al. 1998; McNally and Vale 1993). Studies have shown that the katanin P60 subunit, in its ATP-bound state, forms hexameric complexes around microtubules. Binding of katanin to microtubules stimulates its ATPase activity, leading to nucleotide hydrolysis and a conformational change in the katanin ring, resulting in destabilization of the tubulin contacts and filament severing (Hartman and Vale 1999).

NSF/CDC48 Family

The NSF/CDC48 family is found in both eukaryotes and archaea (Iyer et al. 2004). NSF (N-ethylmaleimide sensitive factor) was originally identified as a protein that was inactivated by N-ethylmaleimide (Malhotra et al. 1988). It plays an important role in membrane fusion events, working in conjunction with an adaptor protein called SNAP to disassemble soluble NSF attachment receptor (SNARE) complexes, using the power of ATP hydrolysis (Sollner et al. 1993; Whiteheart and Matveeva 2004). SNAREs are integral membrane proteins which form a stable, coiled-coil complex with their cytoplasmic domains that bridges membranes, promoting fusion (Ernst and Brunger 2003; Sutton et al. 1998; Weber et al. 1998). By disassembling these complexes, NSF effectively recycles them for use in future fusion events. The NSF protein consists of an N-terminal domain, involved in substrate binding, and two AAA+ modules (D1 and D2) (Nagiec et al. 1995; Tagaya et al. 1993). D1 possesses the majority of the ATPase activity and is essential for complex dissociation. D2 mediates oligomerization of the protein into its functional hexameric state (Fleming et al. 1998; Nagiec et al. 1995; Whiteheart et al. 1994). There has also been some evidence that NSF may act as a chaperone for proteins involved in processes other than membrane fusion (Whiteheart and Matveeva 2004).

Cdc48 (also known as p97 and VCP, or VAT in archaea) is essential and highly abundant, comprising approximately 1% of total cellular protein. It plays an important role in the ubiquitin system, and is associated with a range of diverse functions including transcriptional regulation, membrane fusion, B and T-cell activation, cell cycle regulation, stress response, programmed cell death, endoplasmic reticulum associated degradation (ERAD), and protein degradation (Wang et al. 2004). One of the major functions of Cdc48 is to act as a ‘segregase’, using the power of ATP hydrolysis to extract ubiquitinated substrates from protein complexes and membranes, and direct them to the proteasome for degradation (Jentsch and Rumpf 2006; Wang et al. 2004). It is also involved in controlling the degree of ubiquitination of various substrates, either via the inhibition/promotion of ubiquitination or through deubiquitination (Richly et al. 2005; Rumpf and Jentsch 2006). The various roles of Cdc48 are mediated by interactions with a variety of different cofactors/adaptors. The range of functional roles has led to the proposal that Cdc48 acts as a molecular ‘gearbox’, playing a key role in controlling the fate of various protein substrates (Jentsch and Rumpf 2006). Cdc48 consists of two AAA+ modules (D1 and D2) in addition to an N-terminal domain and C-terminal extension (DeLaBarre and Brunger 2003; Zhang et al. 2000). It forms a homohexameric oligomer consisting of two stacked, ring-shaped layers formed from the D1 and D2 modules, which is clearly demonstrated in the crystal structure of the full length molecule from mouse (DeLaBarre and Brunger 2003). The D1 domain is the major domain responsible for directing oligomerization, and does so in a manner that does not require, though is enhanced by, bound nucleotide (Wang et al. 2003). The D2 domain, on the other hand, appears to be responsible for the bulk of the ATPase activity and has recently been shown to play a role in substrate binding (DeLaBarre et al. 2006; Song et al. 2003). The N-terminal domain is responsible for interacting with substrates and cofactors, and works in conjunction with the D1 and D2 domains to direct nucleotide-induced conformational changes (Rouiller et al. 2002; Wang et al. 2004).

Pex1/6 Family

The Pex1/6 family is another AAA+ family found only in eukaryotes (Iyer et al. 2004). Members of this family play an important role in the process of peroxisome biogenesis (Erdmann et al. 1991; Voorn-Brouwer et al. 1993). Peroxisomes are essential organelles, belonging to the microbody family, which consist of a proteinaceous matrix surrounded by a single membrane (Wanders and Waterham 2006). They are involved in numerous metabolic processes associated with reactive oxygen species, and contain a wide variety of proteins, including catalase, superoxide dismutase, peroxidases and enzymes required for the O2-dependent oxidation of substrates, among others. Virtually all peroxisomes are involved in the β-oxidation of fatty acids, as well as additional processes, which may vary from organism to organism (Michels et al. 2005). The Pex1 and Pex6 proteins play important roles in peroxisome fusion and in receptor recycling during the process of matrix protein uptake (Thoms and Erdmann 2006). Both Pex1 and Pex6 contain two C-terminal AAA+ modules (D1 and D2), and have been shown to interact functionally in an ATP-dependent manner (Birschmann et al. 2005; Faber et al. 1998; Kiel et al. 1999; Tamura et al. 1998). The D1 modules appear to contain the major site for interaction between Pex1 and Pex6, while ATP binding and hydrolysis by the D2 modules appear to be essential for proper function (Birschmann et al. 2005).

Bcs1p Family

The Bcs1p family is found exclusively in eukaryotes (Iyer et al. 2004). Bcs1p is a mitochondrial inner membrane protein, which functions as a chaperone and is required for the proper assembly of the cytochrome bc1 complex (complex III), an important component of the respiratory chain (Cruciat et al. 1999; Nobrega et al. 1992). It is specifically involved in the late stage incorporation of the Rieske FeS and Qcr10p subunits, in a process which is dependent upon binding and hydrolysis of ATP. Bcs1p is capable of forming an oligomeric species within the mitochondrial membrane; however, whether or not oligomerization is essential for its function is not yet known (Cruciat et al. 1999). Based upon studies on other AAA+ proteins, it nevertheless seems reasonable to conclude that Bcs1p likely functions in a multimeric (possibly hexameric) state.

Proteasomal ATPase Family

The proteasomal ATPase family (sometimes referred to as PAAA) is found throughout the archaeal and eukaryotic lineages (Iyer et al. 2004). Members of this family are found as components of proteasomes - large, ATP-dependent proteolytic complexes, found in archaea and eukaryotes, as well as in some actinobacteria. These complexes catalyze the degradation of ubiquitinated (in eukaryotes) and non-ubiquitinated (in archaea and bacteria, as well as in eukaryotes to a limited degree) polypeptide substrates (Lupas et al. 1994; Pearce et al. 2006; Smith et al. 2006; Zhang et al. 2004). The eukaryotic and archaeal proteasomes are the best studied, and are structurally and functionally similar, although the archaeal proteasomes are simpler in design and composition (Smith et al. 2006). They consist of two major elements, a 20S ‘core’ structure and a ‘cap’ structure (19S in eukaryotes and PAN in archaea) (Smith et al. 2006; Zwickl et al. 1999). The core proteasomal particle consists of two outer and two inner seven-membered rings, and contains a central proteolytic chamber (Benaroudj et al. 2003; Groll et al. 1997; Lowe et al. 1995). Entry into the chamber is controlled via gated channels on either end of the complex (Groll et al. 2000; Smith et al. 2005).

PAAA family members are found within the cap structures, which bind the termini of the cylindrical core complex. The caps control entry of substrate into the core via opening and closing of the gated channel, in a manner dependent upon nucleotide binding to the cap, but not hydrolysis (Smith et al. 2005). The caps are also responsible for binding protein substrates and unfolding them using the power of ATP hydrolysis (Benaroudj and Goldberg 2000; Smith et al. 2006). Unfolded substrates are then translocated through the open channels into the proteolytic core via a proposed passive, unidirectional diffusion method, which requires ATP binding but not hydrolysis (Smith et al. 2005). The core structure of the actinobacterial proteasome is similar to that of archaea and eukaryotes, also forming a cylindrical particle of four seven-membered rings containing a central proteolytic chamber. Like the archaeal proteasome it is less complex than that of eukaryotes, and consists of fewer distinct types of subunits (Hu et al. 2006; Kwon et al. 2004b; Tamura et al. 1995). Its mechanism of function is not well understood and only a limited number of substrates have been identified (Pearce et al. 2006). No distinct cap structure has been observed for these proteasomes, although proteins related to the PAAA family have been detected in actinobacteria and appear to be important for proteasome function in at least some cases (Darwin et al. 2005; Pearce et al. 2006; Wolf et al. 1998; Zhang et al. 2004).

1.2.2.1B Other Extended AAA Families

A number of other protein families are also part of the Extended AAA Group, but fall outside of the Classical AAA Clade defined in the earlier study (Iyer et al. 2004). All have been designated as ‘Other’ in Figure 3 (Ammelburg et al. 2006; Iyer et al. 2004).

RuBisCO Activase Family

Members of the RuBisCO (ribulose-1,5-bisphosphate carboxylase/oxygenase) activase family are found in plants and green algae. As the name implies, they act as activators of RuBisCO. RuBisCO is the key enzyme of the Calvin Cycle and is responsible for the fixation of CO2, catalyzing the carboxylation of ribulose-1,5-bisphosphate to form two molecules of 3-phosphoglycerate (Shively et al. 1998). RuBisCO activase works on RuBisCO during its catalytic cycle, using the power of ATP hydrolysis to affect the enzyme’s conformation and promote the removal of various inhibitory sugar phosphate molecules/reaction byproducts from the enzyme’s active site, thereby maintaining its function (Portis 2003). The activase allows RuBisCO to function maximally at suboptimal levels of CO2 and may also enhance the enzyme’s catalytic activity in a manner independent of inhibitor removal (He et al. 1997; Portis et al. 1986).

Rvb Family

The Rvb family is found in both archaea and eukaryotes, with archaea containing a single representative member and eukaryotes containing two orthologous members (referred to as Rvb1 and Rvb2) (Iyer et al. 2004). Rvb1/Rvb2 are also referred to as Tih1/Tih2, TIP49/TIP48 (or TIP48/TIP49), TAP54α/TAP54β, Pontin52/Reptin52, and RuvBL1/RuvBL2. Members of this family are characterized by the presence of a β-sheet rich insert (Fig. 4B, pink) upstream of the Walker B motif (Iyer et al. 2004; Matias et al. 2006). In addition, they possess a small helix downstream of the β2 strand (Fig. 4B, yellow), similar to members of the Classical AAA Clade to which they are closely related. The eukaryotic members have been studied extensively and have been shown to be essential for viability in a number of organisms; however, their exact function is still unknown (Bauer et al. 2000; Kanemaki et al. 1999; Qiu et al. 1998). Rvb1 and Rvb2 are found in large nuclear complexes, including small nucleolar ribonucleoprotein (snoRNP) complexes, chromatin-modifying complexes, and transcription-activating complexes, as well as in association with mitotic elements during mitosis. As such, they are involved in a diverse set of functions, including chromatin remodeling, transcriptional regulation, DNA repair, apoptosis, mitosis, and the assembly, nucleolar localization, and trafficking of snoRNPs (Bauer et al. 2000; Bauer et al. 1998; Cho et al. 2001; Fuchs et al. 2001; Gartner et al. 2003; Ikura et al. 2000; Jonsson et al. 2001; King et al. 2001; Kobor et al. 2004; Krogan et al. 2003; Lim et al. 2000; Newman et al. 2000; Shen et al. 2000; Sigala et al. 2005; Wood et al. 2000). Human Rvb1 and Rvb2 interact with one another, forming a complex resembling two stacked, hexameric rings, reminiscent of the complexes formed by proteins containing two linked AAA+ modules. The formation of this complex does not require nucleotide (Puri et al. 2006). Currently it is unclear whether the individual rings are homo- or hetero-oligomers. Both Rvb1 and Rvb2 in the complex are equally capable of binding nucleotide, and the complex possesses significantly greater ATPase activity than either Rvb1 or Rvb2 alone (Ikura et al. 2000; Puri et al. 2006). Both proteins, however, must be capable of hydrolyzing ATP in order for this enhancement of activity to occur (Puri et al. 2006). The formation of such a double ring hexamer might not be true for Rvb1/2 from other species.

Although the proteins are often part of nucleic acid-associated complexes, the ATPase activity of the Rvb1/Rvb2 complex or the individual proteins is not enhanced in the presence of various forms of DNA and RNA, and recent studies by several groups have shown that the proteins themselves do not appear to possess any helicase or branch migration activity (Ikura et al. 2000; Matias et al. 2006; Puri et al. 2006; Qiu et al. 1998). There is still some uncertainty regarding this issue, however, as these results contradict earlier work which reported detecting both DNA-stimulated ATPase activity and helicase activity for purified Rvb1 and Rvb2 proteins (Kanemaki et al. 1999; Makino et al. 1999). The β-sheet rich insert has been shown to bind both DNA and RNA; however, this binding is sequence independent, and its exact functional significance is not yet clear (Matias et al. 2006). Although their exact role is still unknown, based upon current knowledge, it is reasonable to hypothesize that the Rvb1 and Rvb2 proteins act as molecular motors both in complex with one another and, in some cases, individually, and are important for the remodeling, assembly and function of various protein-protein, protein-DNA, and protein-RNA complexes.


ClpA/B Family

The ClpA/B family is found predominantly in bacteria and eukaryotes, although it has been detected in archaea, apparently as the result of horizontal transfer from bacteria. Members of this family contain an N-terminal domain followed by two AAA+ modules (Fig. 5A), each belonging to a distinct AAA+ family, suggesting that they resulted from an ancient gene fusion event rather than a duplication event (Iyer et al. 2004). The N-terminal AAA+ module (ClpA-D1/ClpB-D1) belongs to the Extended AAA Group (Fig. 3), while the C-terminal AAA+ module (ClpA-D2/ClpB-D2) is a member of the HCL clade of the PACTT Group (Fig. 3) (Ammelburg et al. 2006).

ClpA is part of a protein degradation machine, similar in design to the proteasome. It forms hexameric rings which associate with the ends of a tetradecameric complex of the cylindrical protease ClpP (Kessel et al. 1995). While present in both bacteria and eukaryotes, in eukaryotes ClpAP complexes are found in mitochondria and chloroplasts, rather than the cytosol.

The role of ClpAP in eukaryotes is not well understood. However, the system has been studied extensively in bacteria. ClpA is responsible for the ATP-driven unfolding and translocation of various protein substrates through the central pore of its oligomeric ring into the core proteolytic chamber of the ClpP complex, effectively acting as a ‘gate’ for the protease, in a manner that is reminiscent of the proteasome cap proteins (Gottesman 2003). ClpA recognizes substrates containing recognition signals (typically N- or C-terminal), such as N-end rule substrates or SsrA-tagged proteins (Gottesman et al. 1998; Tobias et al. 1991). The SsrA tag is an 11-amino-acid tag that is added to the end of truncated proteins derived from mRNAs that are not completely translated; the tag serves as a prokaryotic degradation marker to direct elimination of these incomplete products (Keiler et al. 1996). Degradation of N-end rule substrates is mediated through the small adapter/substrate modulator protein, ClpS, which specifically binds and delivers them to ClpAP, while inhibiting the degradation of other types of substrates (Dougan et al. 2002; Erbse et al. 2006). The ClpA hexamer resembles that of similar, double AAA+ module-containing proteins (e.g. Cdc48), consisting of two stacked hexameric rings formed by N-D1 and D2 (Guo et al. 2002; Kessel et al. 1995). Substrate binding and translocation is mediated by the flexible N-terminal domain and channel-lining loops in the D1 and D2 domains, while docking of the ClpA hexamer to the ClpP protease is mediated by distal surface loops in D2 (Hinnerwisch et al. 2005a; Hinnerwisch et al. 2005b; Kim et al. 2001; Singh et al. 2001). D1 is responsible for hexamer formation, which requires nucleotide binding but not hydrolysis, while D2 is essential for ATP hydrolysis (Seol et al. 1995).

[Figure 5 image. Panels: A) ClpA: N-Domain, AAA+ (D1), AAA+ (D2); ClpB: N-Domain, AAA+ (D1), AAA+ (D2), with Propeller Domain Insertion; ClpX: N-Domain, AAA+; HslU: N-Domain, AAA+, with I-Domain Insertion. B) CobS: AAA+; CobT: VWA; BchI/ChlI: AAA+; BchD/ChlD: Degenerate AAA+, VWA. C) Midasin: N-Domain, AAA+ 1 to AAA+ 6, Long Linker, D/E Rich, VWA. D) Dynein Heavy Chain (DHC): N-Terminal Region (Stem Forming), AAA+ 1 to AAA+ 6 (Head Forming), Insertion (Stalk Forming), C-Domain.]

FIGURE 5. Schematic diagrams illustrating the domain structure of select AAA+ and associated proteins. Diagrams are not to scale. A) ClpA, ClpB, HslU and ClpX, illustrating the evolutionary relationship of their AAA+ modules. Blue coloured AAA+ modules belong to the Extended AAA Group. Green coloured AAA+ modules belong to the PACTT Group (HCL Clade). B) AAA+ and VWA containing subunits belonging to or associated with the Chelatase Family of the PACTT Group (Helix-2 Insert Clade). C) Domain structure of the Midasin protein of the PACTT Group (Helix-2 Insert Clade). D) Domain structure of a dynein heavy chain (DHC) protein of the PACTT Group.

ClpB family members are found in prokaryotes (ClpB) and the organelles of some eukaryotes (Hsp78), as well as in the cytosol of yeast (Hsp104) and plants (Hsp101) (Bosl et al. 2006; Zolkiewski 2006). Unlike ClpA, ClpB proteins do not associate with a protease, and are instead involved in the disaggregation and reactivation of protein aggregates, working in conjunction with the DnaK (Hsp70) and DnaJ (Hsp40) chaperones (Bosl et al. 2006; Krzewska et al. 2001b; Tkach and Glover 2006; Zhang and Guy 2005; Zolkiewski 2006). Expression of ClpB is induced in response to a variety of stress conditions, and ClpB plays an important role in the cellular stress response, particularly thermotolerance (Bosl et al. 2006; Hong and Vierling 2000; Leidhold et al. 2006; Leonhardt et al. 1993; Queitsch et al. 2000; Schmitt et al. 1996; Tkach and Glover 2006; Young et al. 2001; Zolkiewski 2006). ClpB, Hsp78, Hsp101 and Hsp104 all hexamerize in the presence of nucleotide, and EM studies of ClpB and Hsp104 have shown formation of a stacked ring structure reminiscent of ClpA. Intriguingly, however, ClpB hexamer formation is stimulated only by ATP and analogues such as ATPγS, while Hsp78, Hsp101 and Hsp104 will also hexamerize in the presence of ADP (Gallie et al. 2002; Krzewska et al. 2001a; Lee et al. 2003a; Parsell et al. 1994; Zolkiewski et al. 1999). Recent work has also suggested that Hsp78 forms trimeric complexes in vivo, and that this might represent a relevant functional state (Leidhold et al. 2006). The D2 AAA+ module is responsible for hexamerization of Hsp78 and Hsp104, while ATP hydrolysis is largely dominated by D1 (Hattendorf and Lindquist 2002; Krzewska et al. 2001a; Parsell et al. 1994; Schirmer et al. 1998). The situation is different in ClpB, where the D1 module is critical for hexamerization, and where both D1 and D2 make a significant contribution to ATPase activity (Kim et al. 2000; Mogk et al. 2003; Watanabe et al. 2002). Studies have shown that, similar to Hsp104, the Hsp101 D2 module is required for proper oligomerization, while mutations in the D1 module have no significant effect (Gallie et al. 2002). The role of the N-terminal domain of ClpB family proteins is uncertain, and the subject of some debate, but studies have shown that it is not required for proper function (Beinker et al. 2002; Mogk et al. 2003; Tkach and Glover 2006).

ClpB family members also contain a long, leucine-rich coiled-coil domain inserted into the small α-helical domain of D1 (Fig. 4C), resembling a two-bladed propeller at the surface of the oligomeric ring (Lee et al. 2003a). This domain has been shown to be important for function, and is proposed to play an active role in protein disaggregation and interdomain communication (Cashikar et al. 2002; Lee et al. 2003a). There are currently two major models of protein disaggregation by ClpB family members. In the first model they act as a ‘molecular crowbar’, prying apart surfaces in aggregates (possibly using the propeller domain) and producing smaller non-native structures that could be refolded by other chaperones such as the DnaK (Hsp70)/DnaJ (Hsp40) system (Glover and Tkach 2001; Lee et al. 2003a). There is also considerable recent evidence for a ‘molecular ratchet’ model, wherein proteins are extracted from aggregates and passed through the central channel of the ClpB family oligomers, and then interact with other chaperones to mediate their proper refolding (Tkach and Glover 2006; Zolkiewski 2006). It is tempting to speculate that ClpB proteins can function via both models of disaggregation, acting as either a crowbar or a ratchet, as the situation requires. There is also recent evidence that the DnaK (Hsp70)/DnaJ (Hsp40) system may function upstream of ClpB proteins, perhaps helping to prepare aggregates for interaction with the ClpB family chaperones (Zietkiewicz et al. 2004).


SpoVK, Ycf2, AFG1, and Viral Helicase Families

A number of poorly understood AAA+ families are also included in this group, including SpoVK, Ycf2 and AFG1. The SpoVK family is found in certain prokaryotes. Studies of SpoVK from Bacillus subtilis have shown that it is important in the sporulation process, and that disruption of the gene results in sporulation-defective cells that no longer acquire resistance to organic solvent, heat, and lysozyme (Fan et al. 1992). The exact function of SpoVK, however, is unknown. Members of the Ycf2 family are found encoded in the chloroplast genomes of a wide range of land plants. The function of Ycf2 is unknown, although deletion/insertion studies have shown that it is essential in tobacco, and that it does not appear to be involved in either photosynthesis or gene expression (Drescher et al. 2000). The Ycf2 protein is quite large, with a general predicted size of 230-270 kDa. The expressed protein appears to be processed to a smaller form of ~170 kDa, and contains a single AAA+ module towards the C-terminus of the molecule (Glick and Sears 1993; Richards et al. 1994). The AFG1 family is found throughout eukaryotes and in select bacteria (Iyer et al. 2004). AFG1 is poorly characterized and its function is currently unknown (Lee and Wickner 1992).

In addition, a number of different viral helicases are also associated with the Extended AAA Group. These viral helicases are notably not Superfamily III helicases, which, although originally identified as a distinctive AAA+ clade (Iyer et al. 2004), have been removed from the AAA+ classification in the more recent analysis due to their lack of the distinctive small, α-helical domain characteristic of the AAA+ module (Ammelburg et al. 2006).

1.2.2.2 The HEC Group

The HEC (Helicases and Clamp Loaders) group encompasses the Clamp Loader and Initiation Clades defined in the earlier study (Iyer et al. 2004), in addition to several related families included in the more recent analysis, which are listed in the box marked ‘Other’ under the HEC classification (Fig. 3) (Ammelburg et al. 2006).


1.2.2.2A Clamp Loader Clade

There are three major families within the Clamp Loader Clade: the bacterial clamp loaders (HolB/DnaX family), the eukaryotic/archaeal clamp loaders (RFC family), and the WHIP family (Iyer et al. 2004). For the most part the members of this clade possess basic AAA+ modules, with little to no modification in their core structure, although, notably, the HolB AAA+ modules do contain a zinc-binding module inserted downstream of the Walker A-associated helix (Guenther et al. 1997; Iyer et al. 2004). Figure 2 shows the structure of the AAA+ module of the RFC1 protein, which typifies this ‘basic’ core structure.

HolB/DnaX, RFC, and WHIP Families

Members of the HolB/DnaX and RFC families have been well studied and are key components of the multi-component ‘replisome’ complex, which is responsible for the replication of DNA. The major function of these proteins is to load ‘sliding clamp’ structures onto DNA (Johnson and O'Donnell 2005). These sliding clamps encircle DNA strands, and serve as mobile structures to which DNA polymerase core enzyme can be mounted during the process of DNA replication (Kong et al. 1992; Krishna et al. 1994; Stukenberg et al. 1991; Yao et al. 1996). This serves to anchor the polymerase to its DNA template, and greatly increases the DNA synthesis rate and processivity (i.e. the length of fragment which can be replicated) of the enzyme (Johnson and O'Donnell 2005; Stukenberg et al. 1991). The clamp loader AAA+ proteins form a pentameric, ring-like structure, which is notably distinct from the hexameric rings typically formed by AAA+ proteins (Fig. 6A) (Bowman et al. 2004; Jeruzalmi et al. 2001). This structure associates with sliding clamps only in the presence of ATP. This clamp loader/sliding clamp complex then binds to DNA template, specifically recognizing ‘primed’ sites. The association with DNA stimulates the ATPase activity of the clamp loader complex, leading to ATP hydrolysis and subsequent release of the clamp loader, leaving the sliding clamp ‘loaded’ on the DNA template and ready for association with polymerase (Gomes et al. 2001; Turner et al. 1999). The clamp loader complex also forms key interactions with the replication helicase, which is responsible for template unwinding, and the two DNA polymerase enzymes involved in the synthesis of the leading and lagging strands (Gao and McHenry 2001a; Gao and McHenry 2001b). Clamp loader function is required at frequent regular intervals during synthesis of the lagging strand, and is also believed to be important in mediating transfer of DNA polymerase from a site of completed synthesis to the clamp at a newly primed site (Leu et al. 2003). In addition, clamp loader proteins and sliding clamps have been implicated in other processes associated with DNA repair and metabolism. Notably, the RFC family contains ‘alternate’ RFC-like clamp loader subunits which serve to alter the specificity/function of clamp loader complexes (Iyer et al. 2004; Johnson and O'Donnell 2005). RFC1-5 assemble to form a heteropentameric ring structure, with the AAA+ modules forming a slight spiral (Fig. 6A).

[Figure 6 image. Panels: A) RFC Pentamer; B) DnaA (strands β1-β5, N and C termini, two equally sized helices; side-view inset); C) RuvB (strands β1-β5, N and C termini, PS1βH).]

FIGURE 6. Selected AAA+ modules and oligomeric structures from the HEC Group. A) Heteropentameric RFC1-5 oligomer from Saccharomyces cerevisiae (Bowman et al. 2004). RFC1-5 assemble to form a pentameric ring-like structure, with the AAA+ modules forming a slight spiral. Domains I and II of RFC1-5, which comprise the AAA+ module of each protein, are shown in colour as follows; RFC1 (Red), RFC2 (Cyan), RFC3 (Blue), RFC4 (Green), RFC5 (Orange). Domains III of RFC1-5, which pack together to form a cylindrical ‘collar’ structure, and domain IV of RFC1, which connects the first and last subunits, are shown in grey. The PCNA clamp complex, which was bound to the RFC oligomer in the structure, has been omitted for clarity. Bound ATPγS is shown in yellow. B) AAA+ module of DnaA from Aquifex aeolicus as a representative member of the Initiation Clade. The nucleotide binding ‘core’ domain is shown in green, and the small α-helical domain is shown in purple. All other major features are as marked. Inset: Side view of oligomeric spiral structure. AAA+ modules are coloured as in the larger figure. DNA-binding domains are shown in grey. Bound nucleotide (AMP-PCP) is shown as yellow spheres (Erzberger et al. 2006). C) AAA+ module of RuvB from Thermus thermophilus HB8 as a representative member of the RuvB family. The structure is coloured as in panel B) (Yamada et al. 2001).

The exact role of members of the WHIP family is not well understood, although, like the other families in this clade, they interact with DNA polymerase and sliding clamps, and have been shown to play a role in DNA replication/repair (Branzei et al. 2002; Hishida et al. 2006).

1.2.2.2B Initiation Clade

The initiation clade contains the bacterial DnaA/DnaC family and the archaeal/eukaryotic CDC6/ORC family. Members of this clade possess a unique structural feature, consisting of two equally sized helices, after strand β2 of the ATPase core domain (Fig. 6B, Pink) (Iyer et al. 2004).

DnaA/DnaC Family

These proteins play important roles in the initiation of DNA replication at replication origins. The DnaA/DnaC proteins function in bacterial initiation. Bacterial origins of replication contain multiple copies of short DNA sequences which serve as recognition sequences for initiator proteins (Schaeffer et al. 2005). DnaA recognizes and binds as a monomer to multiple ‘DnaA-box’ motifs at the origin (Carr and Kaguni 2001; Fuller et al. 1984; Schaper and Messer 1995). The DnaA protein acts as a nucleotide-dependent switch, which, in the ATP-bound state, is responsible for the local unwinding of the origin leading to the formation of a single stranded ‘bubble’ or ‘open complex’ (Bramhill and Kornberg 1988; Speck et al. 1999). Recent structural studies have suggested that the ATP-bound DnaA forms a complex resembling a filamentous, open-ended, helical spiral (Fig. 6B, Inset), which, once again, differs from the typical hexameric ring structure formed by most members of the AAA+ superfamily (Erzberger et al. 2006). Formation of the ‘open complex’ is followed by the loading of two molecules of hexameric DnaB helicase.

Hexameric DnaB, when not bound to DNA, is associated with six molecules of the DnaC helicase loader, which acts to inhibit the helicase in the presence of ATP (Kobori and Kornberg 1982; Wahle et al. 1989; Wickner and Hurwitz 1975). Proper loading of DnaB is dependent upon DnaC, which acts to expand the ssDNA ‘bubble’ to facilitate DnaB binding, and, in conjunction with DnaA, directs proper loading. DnaC then hydrolyzes ATP and dissociates from the origin, thereby relieving inhibition of the helicase (Davey et al. 2002a; Marszalek and Kaguni 1994; Schaeffer et al. 2005). Subsequent helicase activity and the recruitment of additional proteins lead to the formation of the functional replisome and the associated replication forks. Notably, DnaA also has a role as a transcriptional regulator, serving to either activate or repress the expression of various genes, including itself (Messer and Weigel 1997).

CDC6/ORC Family

Initiation of replication in eukaryotes involves members of the CDC6/ORC family. ORC proteins are part of the heterohexameric ORC complex, which acts as a replication initiator, and serves as a ‘platform’ onto which other proteins, such as CDC6, can bind (Bell and Stillman 1992; Davey et al. 2002b). The ORC complex and CDC6 work in conjunction with another protein, Cdt1, to load the ‘minichromosome maintenance’ (MCM) replicative helicase onto DNA, in a manner reminiscent of the bacterial loading of the DnaB helicase (Donovan et al. 1997; Maiorano et al. 2000).

1.2.2.2C Other HEC Families

RuvB Family

Members of the RuvB family are found throughout bacteria (Iyer et al. 2004), and play a key role in the later stages of homologous recombination. Homologous recombination is critically important in the maintenance of genome stability and repair of DNA damage, as well as in the generation of biological diversity. One important intermediate of this process is the Holliday junction, a DNA structure consisting of two homologous duplex DNA molecules associated via a single-stranded crossover (Fig. 7). RuvB works in conjunction with the RuvA and RuvC proteins to process Holliday junctions into mature recombinant DNA molecules (West 1997). The RuvA protein forms tetrameric complexes which bind Holliday junctions with high affinity (Iwasaki et al. 1992; Parsons et al. 1992; Tsaneva et al. 1992a). Structural studies have revealed the presence of either one or two bound tetramers, and this association converts the Holliday junction from the X- to the square planar conformation (Fig. 7C) (Ariyoshi et al. 2000; Hargreaves et al. 1998; Parsons et al. 1995; Roe et al. 1998). The RuvA protein interacts directly with the RuvB protein and facilitates its loading onto DNA (Parsons and West 1993). RuvB proteins, which contain a single AAA+ module that is followed by a C-terminal, winged-helix DNA-binding domain, have been shown by EM studies to bind to the Holliday junction as hexameric rings, contacting the bound RuvA on two opposite sides (Parsons et al. 1995; Yamada et al. 2001; Yu et al. 1997). Together, the RuvA and RuvB proteins function as an ATP-dependent motor, promoting the branch migration of the Holliday junction, the process by which the junction moves along the DNA (Iwasaki et al. 1992; Tsaneva et al. 1992b). RuvB molecules are proposed to act as ATP-driven pumps, driving helical rotation of dsDNA, pulling it through the RuvA core and thereby promoting the branch migration process and increasing the formation of heteroduplex DNA (Fig. 7A and B) (Parsons et al. 1995; Yamada et al. 2004; Yu et al. 1997). After branch migration, the junction is resolved by the action of the RuvC endonuclease, generating two recombinant DNA duplexes (West 1997).

[Figure 7 image: A) Holliday junction (X-conformation) undergoing branch migration; B) RuvAB motor complex at a Holliday junction (square-planar conformation); C) conversion between the X-conformation and the square-planar conformation by rotating the bottom strands 180 degrees and rotating the arms in plane. DNA strand ends are labelled 5′ and 3′.]

FIGURE 7. Holliday junctions and the RuvAB motor complex. A) Holliday junction in the X-conformation illustrating the process of branch migration, leading to lengthening of heteroduplex DNA. B) RuvAB motor complex bound at a Holliday junction in the square-planar conformation directing branch migration. This is believed to be the relevant physiological state in which branch migration occurs. C) Schematic diagram illustrating the relationship / conversion between the Holliday junction X-conformation and square-planar conformation.

Members of the RuvB family possess a β-hairpin insert between Sensor 1 and its preceding helix, a feature characteristic of the Pre-sensor 1 β-hairpin superclade (Fig. 6C). Notably, this structure is important for the interaction of RuvB with RuvA (Han et al. 2001; Yamada et al. 2002). The RuvB family does not, however, possess any features that allow it to be placed into either the HCL or Helix-2 Insert Clades (Iyer et al. 2004). In addition, more recent work suggests that the RuvB family is most closely related to the clamp loaders, and as such was placed into the HEC group (Ammelburg et al. 2006). The evolutionary history and significance of the PS1βH structural motif in RuvB are thus not presently clear.

IstB and HolA Families

The two other remaining families of the HEC group are IstB and HolA. IstB, encoded by one of two genes in the istAB operon of the bacterial IS21 transposable element, is closest to the DnaA/DnaC family (Ammelburg et al. 2006; Iyer et al. 2004). It functions as a ‘helper protein’, working in conjunction with IstA to mediate IS21 transposition (Schmid et al. 1998). The HolA AAA+ protein has diverged significantly from the bacterial clamp loaders, yet still functions as an essential component of the clamp loader complex, acting as a ‘wrench’ in the opening of sliding clamps prior to their loading onto DNA template (Ammelburg et al. 2006; Johnson and O'Donnell 2005; Stewart et al. 2001).

1.2.2.3 ExeA Group

The ExeA group, not originally described in the AAA+ classification of Iyer et al. (2004), appeared in the more recent study by Ammelburg et al. (2006) and consists of a core of ExeA proteins loosely associated with a number of phage proteins involved in DNA transposition (Ammelburg et al. 2006). The ExeA proteins are found in various bacterial species, where they act as a component of type II secretion systems (Schoenhofen et al. 2005). Type II secretion systems play an important role in the excretion of hydrolytic enzymes and toxins, many of which contribute to organism pathogenicity. The secretion pathway consists of two major steps, wherein substrates containing N-terminal signal peptides are first transported across the cytoplasmic membrane to the periplasm, and, after undergoing any necessary folding and modification, such as removal of a signal peptide sequence, disulfide-bond formation, subunit assembly, and others, are then transferred to the type II secretion apparatus for translocation across the outer membrane (Sandkvist 2001). The type II secretion apparatus is a large, cell envelope-spanning, multi-component complex. exeA genes typically form an operon with exeB genes, both of which have been shown to be required for type II secretion of proteins in Aeromonas hydrophila and E. coli (Francetic et al. 2000; Jahagirdar and Howard 1994). Work with Aeromonas hydrophila ExeA and ExeB has revealed that they are associated with the inner membrane, and span the membrane a single time. Each possesses a large domain localized in the periplasm, while ExeA also possesses a large cytoplasmic domain containing the AAA+ module (Howard et al. 1996). ExeB does not appear to contain any domains or motifs of recognizable function. The ExeA and ExeB proteins have been shown to form a large, hetero-oligomeric inner membrane complex (Schoenhofen et al. 1998). Gel filtration studies identified two major species of this complex. Assuming the presence of stoichiometric amounts of ExeA and ExeB, one complex is predicted to consist of two ExeA and two ExeB protomers, while the other complex is expected to consist of roughly six of each protein (Schoenhofen et al. 2005). Unfortunately, more accurate information about the structure and stoichiometry of the complexes is not currently available; however, based upon the behaviour of other AAA+ proteins, it is tempting to speculate that the largest complex may consist of an ExeA hexamer associated with approximately six ExeB subunits. ExeA has been shown to be a functional ATPase, consistent with the presence of a AAA+ module, and can also utilize GTP, CTP, and UTP, although it binds these nucleotides with lower affinity (Schoenhofen et al. 2005). The exact roles of ExeA and ExeB have not been definitively established. However, in A. hydrophila, they have been shown to be important for directing both the proper localization of the ExeD translocation channel-forming protein from the inner to the outer membrane, as well as its assembly into a multimeric complex (Ast et al. 2002). ExeA from this organism has also been shown to bind peptidoglycan via a region in its C-terminal periplasmic domain, and this interaction is critical for the localization/assembly of ExeD and, therefore, functional secretion (Howard et al. 2006). Work in E. coli has suggested that ExeA and ExeB may also play an important role in the regulation or stability of other type II secretory components; however, similar results were not obtained in A. hydrophila (Ast et al. 2002; Francetic et al. 2000). Thus, there may be some differences in specific functions of the proteins between species. Generally, however, it seems likely that ExeA functions as a chaperone, working in conjunction with ExeB and utilizing the power of ATP hydrolysis, to direct the assembly and localization of secretory components by manipulating the structure of peptidoglycan to facilitate these processes.

1.2.2.4 STAND Group

The STAND (Signal Transduction ATPases with Numerous Domains) group was originally described as a distinct superfamily of P-loop NTPases belonging to the ASCE class. Members of this class are typically large, multidomain proteins, defined by the presence of various distinct sequence and structural motifs. These motifs include a unique highly helical, globular domain, following the NTPase module, which itself consists of a core NTPase domain and a small α-helical domain, similar to that of the AAA+ proteins. As their name implies, many of the characterized members participate in signaling processes. More specifically, the class includes proteins involved in the regulation of apoptosis (e.g. APAF1/CED4), plant pathogen and stress resistance proteins, a number of bacterial transcriptional regulators (e.g. GutR, AsfR), animal disease-response proteins (e.g. CARD4), telomerase subunits (e.g. TP1), a number of archaeal NTPase families, as well as a variety of uncharacterized proteins from a range of organisms (Leipe et al. 2004).

Based upon a cluster analysis (Ammelburg et al. 2006) and the presence of the small α-helical domain, similar to that of the AAA+ proteins, between the NTPase core and the second highly helical domain, it was suggested that the STAND proteins may actually represent a family of AAA+ proteins. Notably, however, although the STAND group is more closely related to the AAA+ ATPases than any of the other groups in the ASCE class, STAND group members are still only loosely connected with other AAA+ proteins (Ammelburg et al. 2006; Leipe et al. 2004). This, coupled with the considerable size of the STAND group and its unique sequence and structural features, makes it a matter of debate whether the STAND group actually represents a unique, rapidly diverged AAA+ family or a distinctly separate lineage of P-loop NTPases. As a result of this ambiguity, and the fact that the STAND group has been described in detail elsewhere (Leipe et al. 2004), it is not further discussed here.

1.2.2.5 PACTT Group

The PACTT (Protease, Chelatase, Transcriptional Activators and Transport) Group defined by Ammelburg et al. (2006) corresponds largely to the Pre-Sensor 1 β-Hairpin Superclade of Iyer et al. (Fig. 4) (Ammelburg et al. 2006; Iyer et al. 2004). Members of the superclade possess a characteristic β-Hairpin insert between the Sensor 1 strand and its preceding helix (shown in blue in Fig. 8). The superclade can be mainly divided into two separate clades, the HCL Clade and the Helix-2 Insert Clade.

1.2.2.5A HCL Clade

Members of the HCL (HslU/Clp/Lon) Clade contain an extended loop between Strand β2 and the following helix (Fig. 8A, Pink) (Iyer et al. 2004). Analysis of the available structures of HslU, Lon, ClpX, and ClpA-D2 also reveals that these proteins possess a disrupted small α-helical domain, containing a two- or three-stranded β-sheet structure (Fig. 8A, Yellow).

[Figure 8 image: A) ClpA-D2; B) HslU, with inset top and side views; C) BchI. Structural labels: N and C termini, strands β1 to β5, β-sheet, extended loop, PS1βH, I-Domain, Helix-2 insert, long helical insert, chelatase insert, displaced α-helical domain.]

FIGURE 8. Selected AAA+ modules and oligomeric structures from the PACTT Group. Nucleotide binding ‘core’ domains are shown in green, small α-helical domains are shown in purple. Pre-Sensor 1 β-Hairpins are shown in blue. All other major features are as marked. A) AAA+ module D2 from ClpA as a representative member of the HCL Clade (Guo et al. 2002). B) HslU from Escherichia coli, illustrating the I-Domain insertion. Dotted lines represent portions of I-Domain not visible in the structure. Inset: Top and side views of the HslU hexameric ring. AAA+ modules are coloured as in the larger figure. Bound nucleotide (dADP) is shown as yellow spheres (Wang et al. 2001). C) AAA+ module from Rhodobacter capsulatus BchI as a representative member of the Helix-2-Insert Clade. The dotted line replaces a small stretch of residues not clearly resolved in the structure (Fodje et al. 2001).

HslU/ClpX Family

HslU/ClpX family members are widespread throughout bacteria, and are also found in the mitochondria and plastids of certain eukaryotes (Iyer et al. 2004). Like ClpA, these proteins are an important part of the cellular quality control machinery, functioning independently as molecular chaperones, or as ATP-dependent regulatory components of protease complexes, in a manner analogous to the proteasome cap by utilizing the power of ATP hydrolysis to unfold specific protein substrates and direct them through their central pore and into proteolytic chambers for degradation (Zolkiewski 2006). Unlike ClpA, however, HslU and ClpX contain only a single AAA+ module (Fig. 5A).

ClpX, like ClpA, binds the ClpP serine protease, forming hexameric caps on the ends of the protease tetradecamer, resulting in a ‘symmetry mismatch’ (Grimaud et al. 1998; Ortega et al. 2000). A number of ClpX substrates have been identified, and are recognized by the protein mainly through signals at either their N- or C-terminus. Some examples include the λO phage replication protein (N-terminal recognition), and the MuA transposase (C-terminal recognition) (Gottesman 1996). Numerous endogenous E. coli substrates have also been identified via a proteomics approach (Flynn et al. 2003). ClpX also recognizes SsrA-tagged substrates, similar to ClpA; however, the ClpXP complex appears to be the system primarily responsible for the degradation of these substrates in vivo (Gottesman et al. 1998). ClpX activity is modulated by certain specificity factors including RssB, involved in directing the degradation of the σS transcription factor, and UmuD and SspB, which enhance the recognition/degradation of the UmuD′ DNA polymerase subunit and SsrA-tagged substrates, respectively (Levchenko et al. 2000; Neher et al. 2003; Wah et al. 2002; Zhou et al. 2001). In addition to its AAA+ module, ClpX contains an N-terminal zinc binding domain (ZBD), which has been shown to be important in specificity factor binding and in interaction with certain substrates, and has recently been proposed to modulate substrate translocation through ClpX and into ClpP (Singh et al. 2001; Thibault et al. 2006; Wojtyra et al. 2003).

HslU does not interact with ClpP, but rather with its own protease, HslV. The HslV threonine protease is similar in design to ClpP, but consists of two stacked hexameric, rather than heptameric, rings, thereby forming a dodecameric structure. HslU forms hexameric ring structures containing a central pore and associates with the ends of the protease complex (Sousa et al. 2000). Thus, the HslU/HslV complex does not display the ‘symmetry mismatch’ characteristic of the ClpAP and ClpXP complexes. Association of HslU with HslV leads to a distinct stimulation of the HslV proteolytic activity, as well as a marked increase in the rate of ATP hydrolysis by HslU (Seol et al. 1997; Yoo et al. 1996). Interestingly, activation of the HslV protease by HslU appears to be due to allosteric changes in the active sites of the protease, rather than simply an effect on accessibility to the protease core as is known to be the case in some other ATP-dependent proteases, such as the proteasome (Groll et al. 2005; Ramachandran et al. 2002; Sousa et al. 2002). HslU proteins do not typically possess additional N-terminal or C-terminal domains, but rather contain a unique insertion of ~130 amino acids, called the I-Domain, in their AAA+ module (Fig. 8B, Pink), which resembles outwardly extending tentacles in the oligomeric ring structure (Sousa et al. 2000). This domain is proposed to play a role in substrate recognition and unfolding (Groll et al. 2005). HslU substrate recognition is poorly understood, and only a limited number of natural substrates have been identified. Examples of E. coli substrates include the SulA cell-division inhibitor, as well as the heat shock factor σ32, which, notably, is also a target of other ATP-dependent cellular proteases (Kanemori et al. 1997; Nishii and Takahashi 2003; Seong et al. 1999). The phage P22 Arc repressor is also readily degraded by HslUV, apparently through a recognition sequence in its N-terminus (Burton et al. 2005; Kwon et al. 2004a). Studies in E. coli have shown that HslUV is induced in response to heat shock, and plays a role in cellular stress response (Chuang and Blattner 1993; Kanemori et al. 1997; Missiakas et al. 1996; Wu et al. 1999).
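The ring arithmetic behind the ‘symmetry mismatch’ can be made explicit with a short sketch. The Python fragment below is illustrative only and is not drawn from any of the cited studies; the dictionary layout and the function name `describe` are hypothetical, and the subunit counts are simply those stated in the text above (hexameric ClpA, ClpX and HslU caps; two heptameric rings in ClpP; two hexameric rings in HslV).

```python
# Subunits per ring, as stated in the text (not measured values).
COMPLEXES = {
    "ClpAP": {"cap_ring": 6, "protease_ring": 7},
    "ClpXP": {"cap_ring": 6, "protease_ring": 7},
    "HslUV": {"cap_ring": 6, "protease_ring": 6},
}


def describe(name):
    """Summarize the protease size and cap/protease symmetry of a complex."""
    rings = COMPLEXES[name]
    return {
        # Each protease is built from two stacked rings.
        "protease_subunits": 2 * rings["protease_ring"],
        # A mismatch exists when cap and protease rings differ in subunit number.
        "symmetry_mismatch": rings["cap_ring"] != rings["protease_ring"],
    }


for name in COMPLEXES:
    print(name, describe(name))
```

Under these assumptions the sketch reports a 14-subunit protease with a mismatch for ClpAP and ClpXP, and a 12-subunit protease without a mismatch for HslUV, in line with the description above.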

LonA/LonB Families

There are two major divisions of Lon AAA+ proteins, LonA, which is found mainly in bacteria and certain eukaryotic organelles, and LonB, which is found in archaea. LonA and LonB differ in domain structure and certain characteristic sequences; however, both consist of a AAA+ module fused to a C-terminal proteolytic domain containing a Ser-Lys catalytic dyad active site. LonA proteins possess an additional N-terminal domain, while LonB proteins possess transmembrane helix insertions in the AAA+ module between the Walker A and Walker B motifs and appear to be membrane associated (Besche et al. 2004; Fukui et al. 2002; Iyer et al. 2004; Rotanova et al. 2004). In the AAA+ analysis performed by Iyer et al. (2004), both LonA and LonB were combined into a single family belonging to the HCL clade. The subsequent analysis by Ammelburg et al. (2006), however, suggested that the AAA+ module of LonB was actually closer to that of YifB of the Helix-2 Insert clade than to that of LonA, thus dividing Lon proteins into two distinct AAA+ families. A recently solved crystal structure of the small, α-helical domain of the AAA+ module from E. coli LonA reveals the presence of the distinctive β-sheet structure characteristic of other members of the HCL clade, consistent with both the Iyer and Ammelburg classifications (Botos et al. 2004). Unfortunately, the structure of a LonB AAA+ module has not yet been obtained, and as such is not available to help determine the proper classification of this family.

Lon proteases play a key role in protein quality control, the maintenance of cellular homeostasis and regulatory events by directing the unfolding and degradation of abnormal, mutant, and unstable proteins using the power of ATP hydrolysis (Rotanova et al. 2006). Lon proteins have also been implicated in ATP-dependent protein refolding, and have been shown to bind DNA and RNA (Fu and Markovitz 1998; Fu et al. 1997; Liu et al. 2004; Lu et al. 2003; Rep et al. 1996). Lon is generally not essential for viability, although deletions have been observed to produce cellular defects in some bacterial and eukaryotic species (Tsilibaris et al. 2006). Like other AAA+ proteins, Lon proteases function as oligomeric assemblies, although the functional state has not been definitively established, and may vary between species. Electron microscopy studies have detected heptameric rings (from Saccharomyces cerevisiae) and hexameric rings (from E. coli); however, other studies have suggested the possible existence of dimers, tetramers and octamers (Park et al. 2006; Rotanova et al. 2006; Rudyak et al. 2001; Stahlberg et al. 1999). Some or all of these states may be functionally relevant. Electron microscopy images of hexameric E. coli LonA revealed a double-ring structure, suggesting that the AAA+ modules and protease domains each form a distinct layer (Park et al. 2006). Oligomerization requires Mg2+, but not nucleotide, although nucleotide does appear to induce conformational changes and have a stabilizing effect (Park et al. 2006; Rudyak et al. 2001; Stahlberg et al. 1999; Vasilyeva et al. 2002). Substrate recognition by Lon proteases is not well understood. It has been proposed that Lon recognizes structural motifs/elements exposed upon unfolding of substrate proteins. In addition, the N- and C-terminal elements of particular substrates have also been shown to be preferentially recognized by Lon, although no consensus sequence has been determined (Tsilibaris et al. 2006). Deletion and mutagenesis studies suggest that the N-terminal domain of LonA proteases may play a role in the recognition of certain substrates; however, this has not yet been clearly established (Rotanova et al. 2006).

ClpAB-D2/Torsin Family

The ClpAB-D2/Torsin family contains the second AAA+ module of the ClpA and ClpB proteins (described previously) as well as the eukaryotic Torsin proteins found in animals (Iyer et al. 2004). Mutations in the best studied of the torsins, TorsinA, are associated with early-onset dystonia, a neurological disease characterized by sustained muscle contractions causing twisting and uncontrolled movement (Breakefield et al. 2001; Gerace 2004). The disease is clearly a result of defects within the central nervous system; however, no gross defects in neuron structure are apparent (Gerace 2004). Torsins are found mainly in the endoplasmic reticulum (ER) of eukaryotic cells, and studies on TorsinA have shown that it is membrane associated and faces into the ER lumen (Hewett et al. 2000; Hewett et al. 2003; Kustedjo et al. 2000; Liu et al. 2003). Interestingly, torsin family members do not possess any significant domains in addition to the AAA+ module; however, studies on TorsinA have shown that it contains a 40 amino acid hydrophobic stretch responsible for ER localization and membrane association (Breakefield et al. 2001; Liu et al. 2003). TorsinA does not appear to be an integral membrane protein, however, and thus the association with the membrane is likely mediated via interaction with other proteins (Callan et al. 2007). Recent work has shown that TorsinA is, as predicted, a functional ATPase which, consistent with other AAA+ proteins, forms a large oligomeric species in vitro. The stoichiometry and structure of this complex have not yet been determined (Pham et al. 2006).

Experiments have shown that TorsinA can function as a molecular chaperone suppressing protein aggregation, and can protect cells against oxidative stress (Caldwell et al. 2003; Kuner et al. 2003; McLean et al. 2002). In addition, TorsinA is important for proper morphology of the nuclear envelope in neurons and the deletion of the gene encoding TorsinA was shown to be lethal in mice shortly after birth (Gerace 2004; Goodchild et al. 2005). TorsinA has also been shown to interact with the nuclear envelope-localized LAP1 protein, a lamin-binding protein of poorly understood function, and with the homologous ER-localized LULL1 protein, whose function is also unknown. TorsinA also interacts with kinesin light chain 1 (KLC1), which is part of the kinesin-I motor protein complex involved in the transport of membranous organelles along microtubules (Goodchild and Dauer 2005; Kamm et al. 2004). The functional significance of these interactions is not clear, however, and the specific role of the torsins in cells remains unresolved.

1.2.2.5B Helix-2 Insert Clade

Members of the Helix-2 Insert Clade are defined by the presence of an insert within the second helix of the AAA+ core domain. This insert consists of two β-strands flanking a small helix (Fig. 8C, Pink). Many of these family members also possess an insert of a long helix between strand β5 of the core domain and the small α-helical domain of the AAA+ module (Fig. 8C, Yellow). Notably, this insert appears to significantly displace the small α-helical domain from its typical position relative to the core domain (Iyer et al. 2004).

Chelatase Family

Members of the metal chelatase family are found in archaea, bacteria, and plants (Iyer et al. 2004). These proteins utilize the power of ATP hydrolysis to mediate the insertion of Mg2+ or Co2+ into porphyrin rings as part of the synthesis of (bacterio)chlorophyll or cobalamin (vitamin B12), respectively (Fodje et al. 2001). They work in conjunction with proteins containing Von Willebrand Factor Type A (VWA) domains, which are metal binding domains often involved in mediating protein-protein interactions (Whittaker and Hynes 2002). Although well studied in eukaryotes, the role of VWA proteins in prokaryotes is much less understood, and the metal chelatase family is one of only a small number of AAA+ families known to function directly with them. The Mg2+ and Co2+ chelatases both function as three-subunit enzymes, of which the former is better studied. Functional Co2+ chelatase consists of a single AAA+ module-containing subunit, CobS, as well as the CobN and CobT (VWA domain-containing) subunits (Fodje et al. 2001). Functional Mg2+ chelatase consists of a AAA+ module-containing subunit, BchI/ChlI, and two other subunits, BchD/ChlD (VWA-containing) and BchH/ChlH (homologous to CobN) (Walker and Willows 1997). The BchD/ChlD proteins contain an N-terminal region similar to BchI/ChlI in addition to a C-terminal VWA domain (Fodje et al. 2001). The N-terminal region appears to represent a AAA+ module; however, in many organisms key motifs are absent or disrupted, and the subunit has been shown to lack any independent ATPase activity (Fodje et al. 2001; Hansson and Kannangara 1997; Jensen et al. 1999; Petersen et al. 1999; Sirijovski et al. 2006; Walker and Willows 1997). A schematic diagram of the AAA+ and VWA-containing subunits of the chelatases is shown in Figure 5B.

Electron microscopy studies of BchI/ChlI from different organisms have revealed that they form either hexameric or heptameric rings in the presence of nucleotide, consistent with the oligomeric ring structures observed for many other AAA+ proteins (Fodje et al. 2001; Reid et al. 2003; Willows et al. 2004). It has also recently been reported that BchD can form hexameric rings, which upon association with BchI form a two-tiered hexameric complex; however, the electron microscopy images supporting this claim have not yet been published (Sirijovski et al. 2006).

Although the mechanism of the chelatase enzymes is not well understood, the H subunits have been shown to be responsible for porphyrin binding and to contain the chelatase active site, while the I and D components are proposed to power the chelation reaction and direct any necessary remodeling events (Reid and Hunter 2002). The reaction itself is proposed to consist of two major steps, the first involving the association of the I and D subunits into an initial complex in a manner dependent upon ATP binding, and the second involving transient association with the porphyrin-bound H subunit, ATP hydrolysis and metal insertion (Reid and Hunter 2002; Sirijovski et al. 2006).

McrB/Unc53 Family

The McrB/Unc53 family is found sporadically throughout bacteria and archaea, as well as in animals (Iyer et al. 2004). Bacteria and archaea contain ‘classical’ McrB, which is a component of the McrBC restriction endonuclease complex. The complex recognizes two ‘half-sites’ each consisting of a purine followed by a methylated/hydroxymethylated cytosine. The optimal separation of these half-sites is approximately 55 to 103 basepairs, although half-sites with up to 2000 basepairs between them can still be recognized (Stewart and Raleigh 1998; Sutherland et al. 1992). Binding to methylated-DNA is carried out by the McrB protein, with the specific DNA-binding site located in the N-terminal domain of the protein outside of the AAA+ module (Gast et al. 1997; Pieper et al. 1999). McrB, unlike other AAA+ proteins, utilizes GTP as a substrate, rather than ATP (Sutherland et al. 1992). The restriction endonuclease site is contained within the McrC subunit, and cleavage of DNA is coupled to the hydrolysis of nucleotide by McrB (Pieper and Pingoud 2002; Sutherland et al. 1992).

McrB forms oligomeric species and binds to appropriate target DNA in a manner stimulated and stabilized by the presence of GTP nucleotides (Gast et al. 1997; Kruger et al. 1995; Panne et al. 2001; Pieper et al. 2002; Pieper et al. 1999; Stewart et al. 2000). This oligomerization is dependent upon the AAA+ module, in a manner consistent with other AAA+ proteins (Pieper et al. 2002; Pieper et al. 1999). Electron microscopy studies of McrB-GTPγS in the absence of DNA have shown that it forms heptameric rings, suggesting that this may be the form of the in vivo oligomer (Panne et al. 2001). Addition of McrC promotes dimerization of these rings to form a tetradecameric species, consistent with other observations that the addition of McrC to McrB-DNA in the presence of nucleotide leads to the formation of higher order complexes in gel-shift and surface plasmon resonance assays (Gast et al. 1997; Panne et al. 2001; Pieper et al. 2002; Stewart et al. 2000). A current model for McrBC function involves the binding of an McrB heptamer and an associated McrC molecule(s) to each recognition half-site. These complexes then remain bound to these recognition sites, while using the power of GTP hydrolysis to translocate the DNA between the sites possibly through the central pores in the McrB rings, thereby leading to the formation of loops. Eventually the two complexes ‘collide’ and form the active tetradecameric complex mediated by their respective McrC molecules, which then cleave the DNA in the formed loops (Bourniquel and Bickle 2002; Panne et al. 1999; Pieper et al. 2002).

McrB-like AAA+ proteins are also found in certain higher organisms; however, their function appears to have diverged from that of classical McrB (Iyer et al. 2004). These proteins, which we generally refer to as Unc53/Unc53-like, exist in a variety of different forms and have been implicated in processes associated with cellular outgrowth and migration, and appear to be particularly important in neuronal development. Different Unc53 proteins have been shown to possess various properties, including binding, helicase activity and endonuclease activity; however, their exact cellular roles are not well-characterized (Hekimi and Kershaw 1993; Ishiguro et al. 2002; Martinez-Lopez et al. 2005; Peeters et al. 2004; Stringham et al. 2002).

Midasin Family

The midasin family appears to be present in most, if not all, eukaryotes, but not in bacteria or archaea (Iyer et al. 2004). Midasin proteins are strikingly large (~560 kDa in yeast), consisting of a poorly conserved N-terminal domain of variable length, followed, in order, by 6 tandem AAA+ modules, a large ‘linker’ domain, an acidic domain rich in aspartate and glutamate residues, and a C-terminal VWA domain (Fig. 5C). Of the six AAA+ modules, modules 1 and 6 may be nonfunctional in ATP binding and/or hydrolysis in many organisms due to various mutations in the Walker B, Sensor I, arginine finger and/or Sensor II motifs. The AAA+ modules within a midasin molecule are distinct from one another and can be divided into two major evolutionary groups, with one group being comprised of the even numbered modules and the other comprised of the odd. The even numbered modules share a common structural feature consisting of an extended loop insertion between the Box IV and IV´ motifs (Garbarino and Gibbons 2002). The presence of the VWA domain is reminiscent of the Metal Chelatase family of AAA+ proteins, which are known to function with these metal-binding domains, and, in the case of the ChlD/BchD protein, contain a AAA+ module and VWA domain fused together on a single polypeptide. Although the specific role of the VWA domain in the metal chelatases is not clear, these domains are known to be important in mediating protein-protein interactions (Whittaker and Hynes 2002). Thus the VWA domain might play a role in mediating interaction of midasin with or substrate proteins. This domain has been reported to be necessary for midasin structure and/or function; however, the data supporting this have not yet been published (Galani et al. 2004).


Localization and gene deletion studies in yeast have revealed that midasin is localized to the nucleus, and is essential for viability (Galani et al. 2004; Garbarino and Gibbons 2002; Winzeler et al. 1999). Midasin has been shown to associate with the Rix1 particle, a late stage 60S pre-ribosomal particle composed of a Rix1-Ipi1-Ipi3 protein complex bound to the developing 60S subunit, and is critical for the proper maturation and nuclear export of the 60S ribosomal subunit to the cytoplasm (Galani et al. 2004). Regrettably, the large size of midasin proteins has made biochemical analysis difficult, and detailed functional and structural information is not yet available. The presence of six tandem AAA+ modules does strongly suggest, however, that the protein functions as a hexamer and, like other AAA+ molecules, uses the power of ATP binding and hydrolysis to direct remodeling events critical to its maturation and export functions.

MCM Family

The MCM (Minichromosome Maintenance) family of AAA+ proteins is found throughout archaea and eukaryotes. These proteins contain a single AAA+ module in addition to an N-terminal region containing a zinc binding motif, important for complex assembly and ATPase activity, and a C-terminal stretch containing a DNA binding domain (Forsburg 2004; Iyer et al. 2004). In eukaryotes, eight known MCM variants have been identified (MCM2-9). Of these, six (MCM2-7) oligomerize to form a heterohexameric complex, believed to be the dominant species in vivo, as well as a number of subcomplexes (Forsburg 2004; Maiorano et al. 2006). These MCM proteins function as a DNA helicase, playing a key role in the initiation of DNA replication, and may also function as the primary eukaryotic replicative helicase, although this is still subject to debate (Davey et al. 2002b; Maiorano et al. 2006). The proteins are recruited to origins of replication by the ORC/CDC6/Cdt1 complex in a manner which requires ORC-CDC6 ATP hydrolysis in order to proceed efficiently (Bowers et al. 2004; Speck et al. 2005). All six MCM subunits are required for proper elongation to occur; however, the mechanism of MCM2-7 helicase function is not clearly understood (Labib et al. 2000). The heterohexameric complex of all six subunits does not display helicase activity in vitro, whereas a subcomplex consisting of only MCM4, MCM6, and MCM7 has been shown to possess ATP-dependent helicase activity, suggesting that these subunits may form the catalytic core and that the role of other subunits may be regulatory (Ishimi 1997; Lee and Hurwitz 2000). Whether there is an in vivo rearrangement of the heterohexamer to form the MCM4/6/7 complex, or whether additional modifications/interactions activate helicase activity within the heterohexameric form, is uncertain (Forsburg 2004).

The MCM8 and MCM9 proteins were more recently discovered. The MCM8 protein is only found in higher multicellular organisms (Lutzmann et al. 2005). It does not appear to associate with the MCM2-7 complex, and has been shown to possess both ATPase and DNA helicase activity in vitro (Gozuacik et al. 2003; Maiorano et al. 2005). It is believed to act as a helicase during the elongation process, and is required for processive DNA synthesis to occur at a maximal rate, but is not absolutely essential (Maiorano et al. 2005). Its exact role in DNA replication is currently unclear. MCM9 is the largest known member of the AAA+ containing MCM proteins, and appears to exist only in vertebrates. Unlike MCM2-8, it contains a substantially larger C-terminal region, in which no recognizable motifs can be found. The function of MCM9 in DNA replication is unknown at present (Lutzmann et al. 2005). The archaeal MCM complex is much simpler than its eukaryotic counterparts, generally consisting of only a single MCM molecule forming a homooligomeric species, and displays both ATPase and DNA helicase activity (Costa et al. 2006; Kelman and Hurwitz 2003).

σ54 Activator Family

The σ54 Activator family is represented only in bacteria (Iyer et al. 2004). The members of this family are also referred to as Enhancer Binding Proteins (EBPs), and a diverse range of them are known to exist. In addition to a single AAA+ module, members of this family typically contain an N-terminal regulatory domain, which regulates the activity of the protein in response to specific environmental signals, and a C-terminal DNA-binding domain. They are involved in the activation of transcription of genes under the control of σ54-dependent promoters, which function in transcription of genes associated with a range of specific stress responses and metabolic/transport processes such as nitrogen metabolism, zinc tolerance, acetoacetate metabolism, and others.

An RNA polymerase-σ54 complex binds to a promoter in an inactive or ‘closed’ conformation, and requires remodeling before it can direct transcription. The EBPs bind to enhancer sequences, usually located ~80 to 150 basepairs upstream, and in rare instances downstream, of transcriptional start sites. An EBP interacts with a ‘closed’ RNA polymerase-σ54 complex bound at the promoter via a DNA-bending event involving the integration host factor (IHF), and utilizes the power of ATP hydrolysis to remodel the closed RNA polymerase-σ54-DNA complex into an active ‘open’ conformation, allowing transcription to proceed (Buck et al. 2006; Reitzer and Schneider 2001; Schumacher et al. 2006; Zhang et al. 2002). Interaction with σ54 is transient, varying throughout the course of the ATP hydrolysis cycle, and occurs directly through two loop structures: one containing the crucially important GAFTGA signature sequence, and mapping to the Helix-2-Insert, and another corresponding to the Pre-Sensor 1 β-Hairpin (Buck et al. 2006; Rappas et al. 2005).

Studies have shown that, like other AAA+ proteins, members of the σ54 activator family appear to function as oligomers; EM and crystal structure analyses of the EBPs PspF, NtrC, and ZraR have shown the formation of distinct hexameric rings (De Carlo et al. 2006; Rappas et al. 2005; Sallai and Tucker 2005). Structural studies of the EBP NtrC1, however, have revealed the formation of heptameric rings, suggesting that the functional oligomeric state of different EBPs may vary (Lee et al. 2003b).


YifB and ComM Families

Members of the YifB family are found in certain prokaryotes. In addition to the Helix-2 insert, they also contain a putative Zn2+-binding cluster insertion downstream of strand β4 in the AAA+ nucleotide binding domain (Iyer et al. 2004). Iyer et al. (2004) reported that members of the YifB family contain a single AAA+ module fused to a Lon N-terminal protease domain. Our analysis of available sequence information, however, has failed to detect the presence of such a protease domain. No functional characterization of YifB has been performed, and thus the role of these proteins is unclear.

The ComM family was not described in the AAA+ classification by Iyer et al. (2004), but is closely related to members of the Helix-2-Insert Clade (Ammelburg et al. 2006). As such, I have tentatively assigned it to this clade (Fig. 3, separated from other Helix-2-Insert Clade members by a grey line). The function of ComM proteins is unknown. However, disruption of comM expression in Haemophilus influenzae resulted in a drastic reduction of natural competence development, and ComM was proposed to be important during recombination of donor DNA into the chromosome (Gwinn et al. 1998).

1.2.2.5C Other PACTT Families

These include the LonB family (discussed above) and the Dynein Heavy Chain (DHC) Family.

Dynein Heavy Chain (DHC) Family

The dynein heavy chain (DHC) proteins, like the midasin proteins, consist of 6 AAA+ modules fused together in tandem on a single polypeptide and are found throughout eukaryotes. Both DHC and midasin were originally classified into a single family by Iyer et al. (2004); however, subsequent analysis has suggested that the dynein AAA+ modules are less closely related to those of midasin than previously thought, and are, in fact, only peripherally related to other members of the PACTT group (Ammelburg et al. 2006). DHCs serve as components of multimeric dynein complexes, where they function as molecular motors, using the hydrolysis of ATP as a source of energy for directing molecular motion and conformational changes. Different forms of DHCs and dynein complexes exist, including those associated with driving the motion of cilia and flagella, as well as cytoplasmic variants involved in the trafficking of various forms of ‘cargo’ molecules, such as vesicles, organelles, and chromosomes along cytoskeletal filaments. Cytoplasmic dynein plays an important role in a variety of cellular processes including mitosis, nuclear envelope breakdown, retrograde vesicle transport, and maintenance of the Golgi apparatus, and has been shown to be essential for viability (Sakato and King 2004).

In addition to the six AAA+ modules, DHC proteins possess a large N-terminal stretch, an insertion between AAA+4 and AAA+5, and a C-terminal domain (Mocz and Gibbons 2001; Sakato and King 2004). Of the six AAA+ modules, AAA+5 and AAA+6 contain degenerate P-loop motifs, and only the first four AAA+ modules are believed to be capable of binding ATP (Gibbons et al. 1991; Mocz and Gibbons 1996; Mocz and Gibbons 2001; Sakato and King 2004). The AAA+1 module is the major site of ATP hydrolysis, and has been shown to be essential for the molecule’s function, while binding at the other sites is believed to serve a regulatory role (Eshel 1995; Gibbons et al. 1987; King and Witman 1987). Nucleotide binding at AAA+3 has also been shown to be essential and necessary for mediating nucleotide-dependent dynein release from microtubules, suggesting that it is responsible for coupling the ATP hydrolysis at AAA+1 to this process (Silvanovich et al. 2003). A schematic diagram of a DHC is shown in Figure 5D.

Electron microscopy studies have shown that DHC proteins fold into a globular ‘head’ structure with two extensions, referred to as the stalk and the stem (Burgess et al. 2003; Burgess et al. 2004; Goodenough and Heuser 1984; Goodenough et al. 1987). A composite image from one of these EM studies is shown in Figure 9 (Burgess et al. 2003). The stem is formed by the N-terminal region of the protein, and is responsible for protein-protein interaction/cargo binding (Sakato and King 2004). The stalk is responsible for microtubule binding, via a domain located at the tip of the structure (Gee et al. 1997).

FIGURE 9. EM image of Dynein-c. Composite class average image of Chlamydomonas reinhardtii Dynein-c, comprised of 1 dynein heavy chain and 3 stem-associated light chains, in the presence of ADP-Vanadate, obtained by negative stain electron microscopy (Burgess et al. 2003). Labeled features: Stalk, Head, Stem. Reprinted by permission from Macmillan Publishers Ltd: Nature, Burgess et al., Vol. 421 (13), pages 715-718. Copyright 2003.

The DHC heads resemble heptameric ring structures, formed from the six AAA+ modules and an additional region, possibly corresponding to the C-terminal domain, which recent work has suggested plays a role in regulating ATPase activity (Hook et al. 2005; Samso and Koonce 2004). The stalk and stem structures of the DHC molecules are highly flexible in nature. Nucleotide binding and possibly hydrolysis bring about conformational changes in the globular head structure, as well as generating swing-like motions of the stem relative to the head and stalk. Although the mechanism of DHC function is not fully understood, it has been proposed that these swing-like motions of the stem effectively act as ‘power-strokes’, which allow translocation of dynein relative to microtubules associated with the stalk (Burgess et al. 2003; Burgess et al. 2004).

1.2.3 Thesis Rationale: The MoxR Family

The above sections clearly illustrate how the common architecture of the AAA+ proteins has been modified and adapted to carry out an astounding number of different functions. By the introduction of new domains and motifs to a common AAA+ core module, nature has created an array of molecular machines capable of harnessing the power of nucleotide binding and hydrolysis and utilizing it to carry out the remodeling of a remarkably diverse range of substrates. It is also evident that there is still much to learn. In most cases, our understanding of the various mechanisms employed by specific AAA+ families is limited at best. Obtaining a thorough understanding of these mechanisms, and the commonalities and differences employed between families, will help us develop an integrated view of AAA+ systems, and a better understanding of the fundamental workings of these machines which are so crucial to life. In addition, as these proteins are so functionally diverse, learning about them will provide us with invaluable knowledge of a host of different biological systems.

Notably, there are still a number of AAA+ families about which we know remarkably little, and acquiring knowledge of the function of these families is essential. One such family, and the focus of the work presented in my thesis, is the MoxR family, which belongs to the Helix-2 Insert Clade of the PACTT Group (Ammelburg et al. 2006; Iyer et al. 2004). These proteins are widespread throughout bacteria and archaea, but are notably absent from eukaryotes (Iyer et al. 2004).

A review of the literature suggested that these proteins might be involved in the assembly of oligomeric complexes; however, the available information is scattered and incomplete. The broad distribution of this family is notable, suggesting a fundamental importance for these proteins, while the remarkable lack of knowledge regarding them hints at an involvement in biological systems that are not well understood. This, coupled with the fact that the family appears to be limited to the Bacteria and Archaea superkingdoms, makes it a particularly interesting target for research. Obtaining knowledge of the MoxR proteins might therefore not only further our understanding of the AAA+ superfamily, but also provide us with valuable information about systems unique to the lower kingdoms of life. Such information in turn may be useful in health-related research, particularly in the development of methods to combat non-eukaryotic pathogens.

The work herein describes my research efforts into the MoxR AAA+ family, providing an in-depth analysis, including computational studies of the entire family and biochemical analyses of a representative member from Escherichia coli K12 MG1655.


2. MoxR AAA+ ATPases: A Novel Family of Molecular Chaperones

Publication Details: Snider J, Houry WA. (2006). MoxR AAA+ ATPases: A Novel Family of Molecular Chaperones? J Struct Biol. 156(1), 200-209.


2.1 Summary

The MoxR AAA+ family is a large, diverse group of ATPases that, so far, has been poorly studied. Members of this family are found throughout the Bacteria and Archaea superkingdoms, but have not yet been detected in Eucarya. The limited experimental data available to date suggest that members of this family might have chaperone-like activities. Here we present an extensive phylogenetic analysis which builds upon our previously published work (Snider et al. 2006), described in Chapter 3, and reveals that the MoxR family can be divided into at least seven subfamilies, including MoxR Proper (MRP), TM0930, RavA, CGN, APE2220, PA2707, and YehL. We also include a comprehensive overview and gene context analysis for each of these subfamilies. Our data reveal distinct conserved associations of certain MoxR family members with specific genes, including further support for our previously reported observation that many members of the MoxR AAA+ family are found near Von Willebrand Factor Type A (VWA) proteins and are likely to function with them. We propose, based on bioinformatic analyses and the available literature, that the MoxR AAA+ proteins function with VWA domain-containing proteins to form a chaperone system that is important for the folding/activation of proteins and protein complexes by primarily mediating the insertion of metal cofactors into the substrate molecules.

2.2 Introduction

ATPases Associated with various cellular Activities, or AAA+ proteins, are a large superfamily of P-loop NTPases, whose members display a remarkable range of functional diversity. These proteins are involved in processes ranging from protein refolding and degradation to DNA repair and replication (Iyer et al. 2004). Regardless of their specific function, however, AAA+ proteins display a general role in molecular remodeling events, using the power of ATP hydrolysis to bring about conformational changes. AAA+ proteins are defined by the presence of the AAA+ module, a 200-250 amino acid region responsible for ATP binding and hydrolysis. These modules are comprised of two distinct structural subdomains: an N-terminal α/β-core subdomain and a smaller, C-terminal α-helical subdomain (Ogura and Wilkinson 2001). AAA+ modules contain a variety of conserved sequence motifs responsible for ATP sensing and hydrolysis. Major motifs include the Walker A motif (GxxGxGKT), the Walker B motif (hhhhDE), Sensor I, and Sensor II (Neuwald et al. 1999). AAA+ proteins generally function as oligomers, often forming hexameric rings (Hanson and Whiteheart 2005).
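The consensus strings quoted above can be turned into simple pattern searches. The following is a minimal sketch, assuming that "x" stands for any residue and that "h" stands for a hydrophobic residue drawn from the set A, V, L, I, M, F, W, C; that residue set, the function name find_motifs, and the toy sequence are illustrative assumptions and are not taken from the cited work.

```python
import re

# Residues accepted for "h" in hhhhDE. This particular set is an assumption
# made for illustration; other definitions of "hydrophobic" are possible.
HYDROPHOBIC = "AVLIMFWC"

# Patterns transcribed from the consensus strings in the text (x = any residue).
WALKER_A = re.compile(r"G..G.GKT")
WALKER_B = re.compile(rf"[{HYDROPHOBIC}]{{4}}DE")


def find_motifs(sequence):
    """Return 0-based start positions of Walker A and Walker B matches."""
    seq = sequence.upper()
    return {
        "walker_a": [m.start() for m in WALKER_A.finditer(seq)],
        "walker_b": [m.start() for m in WALKER_B.finditer(seq)],
    }


# Made-up toy sequence, not a real protein.
toy = "MSTGPPGTGKTALAKQWERTYLLVIDEINRA"
hits = find_motifs(toy)
print(hits)
```

A strict pattern match of this kind will miss degenerate motifs, so it is best regarded as a first-pass screen rather than a substitute for alignment-based motif detection.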

In our effort to discover novel molecular chaperones in bacteria, we were intrigued by several reports in the literature of AAA+ proteins that were required for the proper maturation of specific proteins or complexes. Upon further analysis of these AAA+ proteins, we discovered that they belonged to the MoxR family. This family contains a large number of poorly characterized AAA+ proteins. It is remarkably widespread throughout bacteria and archaea, with members being represented in all major lineages (Frickey and Lupas 2004; Iyer et al. 2004; Neuwald et al. 1999); however, no eukaryotic members have yet been identified (Iyer et al. 2004). Though the exact functional role of MoxR proteins is unclear at present, research performed to date suggests a chaperone-like role in the assembly and activation of specific protein complexes. Phylogenetic analysis of the MoxR AAA+ family shows that it can be divided into at least seven smaller subfamilies (see below). These include MoxR Proper (MRP), TM0930 (formerly APE0892), RavA, CGN, APE2220, PA2707, and YehL. Of these, only members of the MRP, RavA, and CGN subfamilies have been studied experimentally. Here, we provide an overview of the MoxR AAA+ proteins based on bioinformatic analyses and the available literature, highlighting the novel chaperone-like activity of members of this family.

2.3 Materials and Methods

2.3.1 MoxR Phylogenetic Analysis

Three MoxR AAA+ amino acid sequences were selected from each MoxR subfamily based on a previously published phylogenetic analysis (Snider et al. 2006). Each of the 18 total sequences was subjected to BlastP analysis against 275 of the microbial genomes available in the NCBI database using default parameters (Altschul et al. 1990). All homologous protein sequences matching with an Expect value ≤ 0.05 were selected, compiled into a single list and filtered to eliminate redundancy. Each sequence was then compared against the Clusters of Orthologous Groups (COG) database using the COGNITOR program and all those identified as belonging to the MoxR COG0714 were selected. The COG groups consist of orthologous and/or paralogous protein sequences, and each COG represents an ancient conserved domain, with COG0714 corresponding to proteins containing a MoxR-type AAA+ domain. The COG groupings were developed by comparing protein sequences in complete genomes, as described elsewhere (Tatusov et al. 2000). The sequences we identified were then subjected to preliminary alignments using the MUSCLE (Edgar 2004) and CLUSTALW (Thompson et al. 1994) programs with default parameters, and then the sequences were shortened to include only the AAA+ modules. All sequences with incomplete AAA+ modules were removed and not included in the subsequent analysis.
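The hit selection step described above (Expect value ≤ 0.05, followed by compilation into a single non-redundant list) can be sketched as follows. This is a minimal illustration and not the procedure actually used: the input is assumed to be a list of (query, subject, Expect value) tuples, redundancy is assumed to mean repeated subject identifiers, and the function name select_hits and the example identifiers are hypothetical.

```python
def select_hits(hits, max_evalue=0.05):
    """Keep subjects with Expect value <= max_evalue, each once, in first-seen order.

    `hits` is an iterable of (query_id, subject_id, evalue) tuples; this input
    layout is an assumption made for the sketch.
    """
    seen = set()
    selected = []
    for query_id, subject_id, evalue in hits:
        if evalue <= max_evalue and subject_id not in seen:
            seen.add(subject_id)
            selected.append(subject_id)
    return selected


# Hypothetical hits from two query sequences.
example_hits = [
    ("query1", "protA", 1e-40),
    ("query1", "protB", 0.2),    # rejected: Expect value above the cutoff
    ("query2", "protA", 3e-12),  # duplicate subject, listed only once
    ("query2", "protC", 0.05),   # kept: the cutoff is inclusive
]
nonredundant = select_hits(example_hits)
print(nonredundant)
```

Because a permissive cutoff such as 0.05 admits distant and possibly spurious matches, the later COG0714 assignment is what restricts the list to MoxR-type sequences.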

The remaining 596 sequences were aligned once again using the MUSCLE program. The phylogenetic tree was constructed from this alignment via the neighbor-joining method using the PROTDIST and NEIGHBOR programs available in the PHYLIP package (Felsenstein 1996). A PMB model of substitution (Veerassamy et al. 2003) and a coefficient of variation of 0.7 were used.
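For readers more used to a gamma shape parameter than to a coefficient of variation, the two can be interconverted. The sketch below assumes that the coefficient of variation refers to gamma-distributed substitution rates among sites with a mean of 1, in which case CV = 1/sqrt(alpha); this interpretation is an assumption, as it is not stated explicitly above, and the function name is hypothetical.

```python
def gamma_shape_from_cv(cv):
    """Return the gamma shape parameter alpha for a given coefficient of variation.

    For a gamma distribution the coefficient of variation is 1 / sqrt(alpha),
    so alpha = 1 / cv**2.
    """
    if cv <= 0:
        raise ValueError("coefficient of variation must be positive")
    return 1.0 / (cv * cv)


alpha = gamma_shape_from_cv(0.7)
print(round(alpha, 2))
```

Under this assumption a coefficient of variation of 0.7 corresponds to a shape parameter of roughly 2, i.e. moderate rate heterogeneity among sites.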

2.3.2 MoxR Gene Neighbourhood Analysis

For each of the 596 sequences included in the analysis, the protein sequences encoded by the 10 genes surrounding each MoxR (5 on both sides, where available) were extracted from the NCBI database. A total of 6366 sequences were identified and compared against the Conserved Domain Database (CDD) (Marchler-Bauer et al. 2005), available through the NCBI website, to detect putative domains. Sequences from each of the major subfamilies identified in our phylogenetic tree were grouped, and the total occurrence of individual domains in a given subfamily was examined. Domains occurring in high frequency were selected, and the proteins containing those domains were manually examined. This allowed the identification of proteins whose genes occur in the neighborhood of MoxR genes with high frequency.
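The neighbourhood tally described above can be sketched in a few lines. This is an illustrative reconstruction under stated assumptions, not the scripts actually used: each genome is represented as an ordered list of gene identifiers, domain assignments are supplied as a dictionary, each domain is counted once per neighbourhood, and all names (neighbours, domain_frequencies, the gene and domain labels in the example) are hypothetical.

```python
from collections import Counter


def neighbours(gene_order, target, flank=5):
    """Return up to `flank` genes on each side of `target` in an ordered gene list."""
    i = gene_order.index(target)
    return gene_order[max(0, i - flank):i] + gene_order[i + 1:i + 1 + flank]


def domain_frequencies(genomes, domains_by_gene, flank=5):
    """Count the number of MoxR neighbourhoods in which each domain occurs.

    `genomes` is a list of (gene_order, moxr_gene) pairs and `domains_by_gene`
    maps a gene identifier to the domains detected in its product.
    """
    counts = Counter()
    for gene_order, moxr_gene in genomes:
        found = set()
        for gene in neighbours(gene_order, moxr_gene, flank):
            found.update(domains_by_gene.get(gene, ()))
        counts.update(found)  # each domain counted once per neighbourhood
    return counts


# Toy data: two short gene orders, each with one MoxR gene.
genomes = [
    (["g1", "g2", "moxR1", "g3", "g4"], "moxR1"),
    (["h1", "moxR2", "h2"], "moxR2"),
]
domains_by_gene = {"g1": ["VWA"], "g3": ["VWA"], "g4": ["TPR"], "h2": ["VWA"]}
counts = domain_frequencies(genomes, domains_by_gene)
print(counts.most_common())
```

The sketch treats each gene list as linear, so a MoxR gene near the end of a list simply has fewer neighbours, in keeping with the "where available" qualification above; circular chromosomes would need wrap-around indexing.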

2.4 Results and Discussion

To obtain a global view of the MoxR AAA+ family, we performed a large scale phylogenetic analysis, building and expanding upon our previously published work (Snider et al. 2006). 596 members of the MoxR COG0714 were identified by BlastP analysis with 18 representative MoxR sequences selected from our previous analysis (Snider et al. 2006), against 275 microbial genomes available in the NCBI database. These sequences were aligned and used in the construction of a phylogenetic tree as described in Materials and Methods (Fig. 10 and Table 1). In our previous work, in which 156 MoxR proteins from 94 organisms were analyzed, we detected only six distinct subfamilies. Our current analysis is more comprehensive and covers a much broader spectrum of organisms, resulting in the detection of an additional major MoxR subfamily (APE2220, which branched from the previously assigned PA2707 subfamily). This brings the total subfamily count to seven (Fig. 10): MoxR Proper (MRP), TM0930, RavA, CGN, APE2220, PA2707, and YehL. The signature sequences characterizing each subfamily are given in Fig. 11.

In an effort to learn more about the function and role of the various MoxR AAA+ subfamily members, we also conducted an analysis of the genes surrounding each MoxR gene. By identifying genes which occur in close proximity to MoxR genes, it was hoped that potential functional partners could be identified. Where sequencing information was available, the protein products of five genes on either side of a moxR gene were examined for putative domains using the Conserved Domain Database (CDD). The sequences were grouped based upon the individual MoxR subfamilies identified in the phylogenetic tree, and then analyzed to identify


FIGURE 10. Phylogenetic tree of the MoxR AAA+ family. The tree was constructed from 596 MoxR AAA+ (COG0714) sequences as described in the Materials and Methods. Each subfamily is represented by a distinct colour. Small subgroups within the CGN branch are labeled. Branch labels in the figure: YehL, MRP, APE0892, PA2707, TM0930, APE2220, RavA, and CGN with subgroups (a) NirQ/NorQ/CbbQ, (b) GvpN, (c) DBA.


TABLE 1. GI numbers and organism distribution of MoxR AAA+ proteins organized according to subfamilies.


MRP
No.  GI number  Organism
1  50085313  Acinetobacter sp. ADP1
2  14600469  Aeropyrum pernix K1
3  14601331  Aeropyrum pernix K1
4  17936136  Agrobacterium tumefaciens str. C58
5  53763879  Anabaena variabilis ATCC 29413
6  53764603  Anabaena variabilis ATCC 29413
7  66854311  Anaeromyxobacter dehalogenans 2CP-C
8  15605787  Aquifex aeolicus VF5
9  11500001  Archaeoglobus fulgidus DSM 4304
10  67156348  Azotobacter vinelandii AvOP
11  67158507  Azotobacter vinelandii AvOP
12  30262143  Bacillus anthracis str. Ames
13  30262970  Bacillus anthracis str. Ames
14  30262992  Bacillus anthracis str. Ames
15  52142071  Bacillus cereus E33L
16  52142483  Bacillus cereus E33L
17  52142507  Bacillus cereus E33L
18  52143308  Bacillus cereus E33L
19  56962707  Bacillus clausii KSM-K16
20  15613167  Bacillus halodurans C-125
21  15613294  Bacillus halodurans C-125
22  52079118  Bacillus licheniformis ATCC 14580
23  52784485  Bacillus licheniformis ATCC 14580
24  16077700  Bacillus subtilis subsp. subtilis str. 168
25  49478050  Bacillus thuringiensis serovar konkukian str. 97-27
26  49479029  Bacillus thuringiensis serovar konkukian str. 97-27
27  49479279  Bacillus thuringiensis serovar konkukian str. 97-27
28  49481547  Bacillus thuringiensis serovar konkukian str. 97-27
29  53712010  Bacteroides fragilis YCH46
30  53713713  Bacteroides fragilis YCH46
31  29346320  Bacteroides thetaiotaomicron VPI-5482
32  29347622  Bacteroides thetaiotaomicron VPI-5482
33  49475915  Bartonella henselae str. Houston-1
34  49474495  Bartonella quintana str. Toulouse
35  42524207  Bdellovibrio bacteriovorus HD100
36  42525206  Bdellovibrio bacteriovorus HD100
37  23465766  Bifidobacterium longum NCC2705
38  23466355  Bifidobacterium longum NCC2705
39  33601705  Bordetella bronchiseptica RB50
40  33592723  Bordetella pertussis Tohama I
41  15594521  Borrelia burgdorferi B31
42  51598437  Borrelia garinii PBi
43  27377576  Bradyrhizobium japonicum USDA 110
44  62423394  Brevibacterium linens BL2
45  62424921  Brevibacterium linens BL2
46  62425852  Brevibacterium linens BL2
47  62290440  Brucella abortus biovar 1 str. 9-941
48  17986743  Brucella melitensis 16M
49  23502420  Brucella suis 1330
50  48784037  Burkholderia fungorum LB400
51  48787308  Burkholderia fungorum LB400
52  46447516  Candidatus Protochlamydia amoebophila UWE25
53  16124820  Caulobacter crescentus CB15
54  16127058  Caulobacter crescentus CB15
55  78188094  Chlorobium chlorochromatii CaD3
56  67919324  Chlorobium limicola DSM 245
57  67935554  Chlorobium phaeobacteroides DSM 266
58  21672904  Chlorobium tepidum TLS
59  76258656  Chloroflexus aurantiacus J-10-fl
60  76260545  Chloroflexus aurantiacus J-10-fl
61  76261568  Chloroflexus aurantiacus J-10-fl
62  76261674  Chloroflexus aurantiacus J-10-fl
63  76261722  Chloroflexus aurantiacus J-10-fl
64  34499140  Chromobacterium violaceum ATCC 12472
65  28210117  Clostridium tetani E88
66  28210482  Clostridium tetani E88
67  67873390  Clostridium thermocellum ATCC 27405
68  67874366  Clostridium thermocellum ATCC 27405
69  67876439  Clostridium thermocellum ATCC 27405
70  71278888  Colwellia psychrerythraea 34H
71  71280875  Colwellia psychrerythraea 34H
72  71281272  Colwellia psychrerythraea 34H
73  67920426  Crocosphaera watsonii WH 8501
74  67921791  Crocosphaera watsonii WH 8501
75  48855605  Cytophaga hutchinsonii
76  48855640  Cytophaga hutchinsonii
77  48856710  Cytophaga hutchinsonii
78  41724654  Dechloromonas aromatica RCB
79  66798984  Deinococcus geothermalis DSM 11300
80  15805648  Deinococcus radiodurans R1
81  15805942  Deinococcus radiodurans R1
82  51244487  Desulfotalea psychrophila LSv54
83  68177217  Desulfuromonas acetoxidans DSM 684
84  68179359  Desulfuromonas acetoxidans DSM 684
85  68194677  Enterococcus faecium DO
86  61100258  Erythrobacter litoralis HTCC2594
87  68053913  Exiguobacterium sp. 255-15
88  68055029  Exiguobacterium sp. 255-15
89  68055559  Exiguobacterium sp. 255-15
90  68140100  Ferroplasma acidarmanus Fer1
91  56707444  Francisella tularensis subsp. tularensis SCHU S4
92  56418786  Geobacillus kaustophilus HTA426
93  78222841  Geobacter metallireducens GS-15
94  39996796  Geobacter sulfurreducens PCA
95  58038813  Gluconobacter oxydans 621H
96  55377779  Haloarcula marismortui ATCC 43049
97  55377846  Haloarcula marismortui ATCC 43049
98  55378022  Haloarcula marismortui ATCC 43049
99  55378460  Haloarcula marismortui ATCC 43049
100  55379107  Haloarcula marismortui ATCC 43049
101  55379636  Haloarcula marismortui ATCC 43049
102  15789525  Halobacterium sp. NRC-1
103  56460103  Idiomarina loihiensis L2TR
104  67984590  Kineococcus radiotolerans SRS30216
105  67987585  Kineococcus radiotolerans SRS30216
106  62512932  Lactobacillus casei ATCC 334
107  54297819  Legionella pneumophila str. Paris
108  54298850  Legionella pneumophila str. Paris
109  50954869  Leifsonia xyli subsp. xyli str. CTCB07
110  24217112  Leptospira interrogans serovar Lai str. 56601
111  16799547  Listeria innocua Clip11262
112  16802498  Listeria monocytogenes EGD-e
113  23012052  Magnetospirillum magnetotacticum MS-1
114  46201069  Magnetospirillum magnetotacticum MS-1
115  13472437  Mesorhizobium loti MAFF303099
116  13476883  Mesorhizobium loti MAFF303099
117  68191785  Mesorhizobium sp. BNC1
118  68210783  Methanococcoides burtonii DSM 6242
119  45357926  Methanococcus maripaludis S2
120  20091666  Methanosarcina acetivorans C2A
121  20091966  Methanosarcina acetivorans C2A
122  68133958  Methanosarcina barkeri str. fusaro
123  68134740  Methanosarcina barkeri str. fusaro
124  68134813  Methanosarcina barkeri str. fusaro
125  21226474  Methanosarcina mazei Go1
126  21228769  Methanosarcina mazei Go1
127  68212855  Methylobacillus flagellatus KT
128  68212878  Methylobacillus flagellatus KT
129  68213168  Methylobacillus flagellatus KT
130  68214143  Methylobacillus flagellatus KT
131  1771259  Methylobacterium extorquens
132  53802302  Methylococcus capsulatus str. Bath
133  53803126  Methylococcus capsulatus str. Bath
134  53804191  Methylococcus capsulatus str. Bath
135  53804871  Methylococcus capsulatus str. Bath
136  48862097  Microbulbifer degradans 2-40
137  48863191  Microbulbifer degradans 2-40
138  48864144  Microbulbifer degradans 2-40
139  41406457  Mycobacterium avium subsp. paratuberculosis K-10
140  41407303  Mycobacterium avium subsp. paratuberculosis K-10
141  41409316  Mycobacterium avium subsp. paratuberculosis K-10
142  31792674  Mycobacterium bovis AF2122/97
143  31794341  Mycobacterium bovis AF2122/97
144  31794863  Mycobacterium bovis AF2122/97
145  15827968  Mycobacterium leprae TN
146  15610300  Mycobacterium tuberculosis H37Rv
147  41615028  Nanoarchaeum equitans Kin4-M
148  76802957  Natronomonas pharaonis DSM 2160
149  69926200  Nitrobacter hamburgensis X14
150  75676810  Nitrobacter winogradskyi Nb-255
151  54022056  Nocardia farcinica IFM 10152
152  54022423  Nocardia farcinica IFM 10152
153  54025450  Nocardia farcinica IFM 10152
154  71367302  Nocardioides sp. JS614
155  71368303  Nocardioides sp. JS614
156  71370008  Nocardioides sp. JS614
157  23128076  Nostoc punctiforme PCC 73102
158  53687148  Nostoc punctiforme PCC 73102
159  17232341  Nostoc sp. PCC 7120
160  17232526  Nostoc sp. 
PCC 7120 161 79039527 Novosphingobium aromaticivorans DSM 12444 162 23098168 Oceanobacillus iheyensis HTE831 163 266551 Paracoccus denitrificans 164 69933296 Paracoccus denitrificans PD1222 165 69936932 Paracoccus denitrificans PD1222 166 77918077 Pelobacter carbinolicus DSM 2380 167 77919650 Pelobacter carbinolicus DSM 2380 168 68551151 Pelodictyon phaeoclathratiforme BU-1 169 54303499 Photobacterium profundum SS9 170 67910148 Polaromonas sp. JS666 171 67911168 Polaromonas sp. JS666 172 34541231 Porphyromonas gingivalis W83 173 50842235 Propionibacterium acnes KPA171202 174 50842459 Propionibacterium acnes KPA171202 175 68553513 Prosthecochloris aestuarii DSM 271 176 77359911 Pseudoalteromonas haloplanktis TAC125 177 77360442 Pseudoalteromonas haloplanktis TAC125 178 15598071 Pseudomonas aeruginosa PAO1 179 15598266 Pseudomonas aeruginosa PAO1 180 15599518 Pseudomonas aeruginosa PAO1 181 77459430 Pseudomonas fluorescens PfO-1 182 77460664 Pseudomonas fluorescens PfO-1 183 26988757 Pseudomonas putida KT2440 184 71734709 Pseudomonas syringae pv. phaseolicola 1448A 185 71736770 Pseudomonas syringae pv. phaseolicola 1448A

68 186 71066231 Psychrobacter arcticus 273-4 187 71364084 Psychrobacter cryohalolentis K5 188 18313052 Pyrobaculum aerophilum str. IM2 189 14521486 Pyrococcus abyssi GE5 190 14521800 Pyrococcus abyssi GE5 191 14521830 Pyrococcus abyssi GE5 192 18976777 Pyrococcus furiosus DSM 3638 193 18976790 Pyrococcus furiosus DSM 3638 194 18976889 Pyrococcus furiosus DSM 3638 195 18977176 Pyrococcus furiosus DSM 3638 196 14590245 Pyrococcus horikoshii OT3 197 14590294 Pyrococcus horikoshii OT3 198 14590644 Pyrococcus horikoshii OT3 199 45516415 Ralstonia eutropha JMP134 200 46131252 Ralstonia eutropha JMP134 201 53762666 Ralstonia eutropha JMP134 202 68556434 Ralstonia metallidurans CH34 203 68558258 Ralstonia metallidurans CH34 204 17548330 Ralstonia solanacearum GMI1000 205 77464605 Rhodobacter sphaeroides 2.4.1 206 77465281 Rhodobacter sphaeroides 2.4.1 207 74023979 Rhodoferax ferrireducens DSM 15236 208 32470950 Rhodopirellula baltica SH 1 209 32471453 Rhodopirellula baltica SH 1 210 32471601 Rhodopirellula baltica SH 1 211 32472894 Rhodopirellula baltica SH 1 212 32474049 Rhodopirellula baltica SH 1 213 32474362 Rhodopirellula baltica SH 1 214 32474854 Rhodopirellula baltica SH 1 215 32475230 Rhodopirellula baltica SH 1 216 32475538 Rhodopirellula baltica SH 1 217 32475657 Rhodopirellula baltica SH 1 218 32475717 Rhodopirellula baltica SH 1 219 32475922 Rhodopirellula baltica SH 1 220 32476470 Rhodopirellula baltica SH 1 221 32477222 Rhodopirellula baltica SH 1 222 32477836 Rhodopirellula baltica SH 1 223 39934260 Rhodopseudomonas palustris CGA009 224 47573127 Rubrivivax gelatinosus PM1 225 47575072 Rubrivivax gelatinosus PM1 226 47575492 Rubrivivax gelatinosus PM1 227 68560646 Rubrobacter xylanophilus DSM 9941 228 68520048 Shewanella baltica OS155 229 68543476 Shewanella baltica OS155 230 69157931 Shewanella denitrificans OS217 231 69158439 Shewanella denitrificans OS217 232 69951503 Shewanella frigidimarina NCIMB 400 233 69953057 Shewanella frigidimarina NCIMB 400 234 
24373711 Shewanella oneidensis MR-1 235 24374610 Shewanella oneidensis MR-1 236 56695064 Silicibacter pomeroyi DSS-3 237 69299360 Silicibacter sp. TM1040 238 15966118 Sinorhizobium meliloti 1021 239 67927619 Solibacter usitatus Ellin6076 240 67928730 Solibacter usitatus Ellin6076 241 67932857 Solibacter usitatus Ellin6076 242 29831601 Streptomyces avermitilis MA-4680 243 29832654 Streptomyces avermitilis MA-4680 244 21220574 Streptomyces coelicolor A3(2) 245 21221461 Streptomyces coelicolor A3(2) 246 21224461 Streptomyces coelicolor A3(2) 247 51891601 Symbiobacterium thermophilum IAM 14863

69 248 51892908 Symbiobacterium thermophilum IAM 14863 249 51893691 Symbiobacterium thermophilum IAM 14863 250 56752224 Synechococcus elongatus PCC 6301 251 71544612 Syntrophobacter fumaroxidans MPOB 252 71541958 Syntrophomonas wolfei str. Goettingen 253 20806880 Thermoanaerobacter tengcongensis MB4 254 20808592 Thermoanaerobacter tengcongensis MB4 255 72161504 Thermobifida fusca YX 256 72162077 Thermobifida fusca YX 257 72162899 Thermobifida fusca YX 258 57641781 Thermococcus kodakarensis KOD1 259 16081968 Thermoplasma acidophilum DSM 1728 260 13541896 Thermoplasma volcanium GSS1 261 22299098 Thermosynechococcus elongatus BP-1 262 22299624 Thermosynechococcus elongatus BP-1 263 15643889 Thermotoga maritima MSB8 264 78484422 Thiomicrospira crunogena XCL-2 265 78484440 Thiomicrospira crunogena XCL-2 266 71151760 Thiomicrospira denitrificans ATCC 33889 267 42525835 Treponema denticola ATCC 35405 268 42526098 Treponema denticola ATCC 35405 269 71674546 Trichodesmium erythraeum IMS101 270 15600945 Vibrio cholerae O1 biovar eltor str. N16961 271 59712032 Vibrio fischeri ES114 272 59713861 Vibrio fischeri ES114 273 59714084 Vibrio fischeri ES114 274 28900540 Vibrio parahaemolyticus RIMD 2210633 275 28901306 Vibrio parahaemolyticus RIMD 2210633 276 27366610 Vibrio vulnificus CMCP6 277 27367912 Vibrio vulnificus CMCP6 278 34558275 Wolinella succinogenes DSM 1740 279 21240892 Xanthomonas axonopodis pv. citri str. 306 280 21241871 Xanthomonas axonopodis pv. citri str. 306 281 21244104 Xanthomonas axonopodis pv. citri str. 306 282 66766442 Xanthomonas campestris pv. campestris str. 8004 283 66767268 Xanthomonas campestris pv. campestris str. 8004 284 66769539 Xanthomonas campestris pv. campestris str. 8004 285 58580694 Xanthomonas oryzae pv. oryzae KACC10331 286 58580790 Xanthomonas oryzae pv. oryzae KACC10331 287 28198949 Xylella fastidiosa Temecula1

No.  TM0930  Organism
1  67543580  Burkholderia vietnamiensis G4
2  15895133  Clostridium acetobutylicum ATCC 824
3  18309638  Clostridium perfringens str. 13
4  28211018  Clostridium tetani E88
5  67923912  Crocosphaera watsonii WH 8501 (Synechocystis sp. WH 8501)
6  66796771  Deinococcus geothermalis DSM 11300
7  15806190  Deinococcus radiodurans R1
8  68205090  Desulfitobacterium hafniense DCB-2
9  68207662  Desulfitobacterium hafniense DCB-2
10  78355530  Desulfovibrio desulfuricans G20
11  67985081  Kineococcus radiotolerans SRS30216
12  28379781  Lactobacillus plantarum WCFS1
13  23024780  Leuconostoc mesenteroides subsp. mesenteroides ATCC 8293
14  77166411  Nitrosococcus oceani ATCC 19707
15  77918137  Pelobacter carbinolicus DSM 2380
16  77919199  Pelobacter carbinolicus DSM 2380
17  77463720  Rhodobacter sphaeroides 2.4.1
18  32475223  Rhodopirellula baltica SH 1
19  39934285  Rhodopseudomonas palustris CGA009
20  21219813  Streptomyces coelicolor A3(2)
21  21224831  Streptomyces coelicolor A3(2)
22  15643692  Thermotoga maritima MSB8
23  52006472  Thiobacillus denitrificans ATCC 25259
24  71150636  Thiomicrospira denitrificans ATCC 33889

No.  RavA  Organism
1  14601148  Aeropyrum pernix K1
2  66779440  Anaeromyxobacter dehalogenans 2CP-C
3  53713430  Bacteroides fragilis YCH46
4  29345745  Bacteroides thetaiotaomicron VPI-5482
5  57241650  Campylobacter lari RM2100
6  34498193  Chromobacterium violaceum ATCC 12472
7  48853798  Cytophaga hutchinsonii
8  50118970  Erwinia carotovora subsp. atroseptica SCRI1043
9  3183569  Escherichia coli K12
10  68140080  Ferroplasma acidarmanus Fer1
11  15668250  Methanocaldococcus jannaschii DSM 2661
12  15668999  Methanocaldococcus jannaschii DSM 2661
13  20089152  Methanosarcina acetivorans C2A
14  20091041  Methanosarcina acetivorans C2A
15  68134507  Methanosarcina barkeri str. fusaro
16  68132708  Methanosarcina barkeri str. fusaro
17  21227633  Methanosarcina mazei Go1
18  21228941  Methanosarcina mazei Go1
19  26553847  Mycoplasma penetrans HF-2
20  54027784  Nocardia farcinica IFM 10152
21  37524086  Photorhabdus luminescens subsp. laumondii TTO1
22  48478302  Picrophilus torridus DSM 9790
23  18313673  Pyrobaculum aerophilum str. IM2
24  47571823  Rubrivivax gelatinosus PM1
25  29143935  Salmonella enterica subsp. enterica serovar Typhi Ty2
26  16767163  Salmonella typhimurium LT2
27  75177684  Shigella boydii BS512
28  24115049  Shigella flexneri 2a str. 301
29  74314228  Shigella sonnei Ss046
30  21225846  Streptomyces coelicolor A3(2)
31  70606761  Sulfolobus acidocaldarius DSM 639
32  15899119  Sulfolobus solfataricus P2
33  15920718  Sulfolobus tokodaii str. 7
34  15601518  Vibrio cholerae O1 biovar eltor str. N16961
35  59713681  Vibrio fischeri ES114
36  28900864  Vibrio parahaemolyticus RIMD 2210633
37  27366515  Vibrio vulnificus CMCP6
38  22123927  Yersinia pestis KIM
39  51594364  Yersinia pseudotuberculosis IP 32953

No.  CGN  Organism
1  17937329  Agrobacterium tumefaciens str. C58
2  15890596  Agrobacterium tumefaciens str. C58
3  17938076  Agrobacterium tumefaciens str. C58
4  53765439  Anabaena variabilis ATCC 29413 (Anabaena flos-aquae UTEX 1444)
5  15606998  Aquifex aeolicus VF5
6  56476473  Azoarcus sp. EbN1
7  58616208  Azoarcus sp. EbN1
8  56475540  Azoarcus sp. EbN1
9  52786006  Bacillus licheniformis ATCC 14580 (DSM 13)
10  16078999  Bacillus subtilis subsp. subtilis str. 168
11  49480901  Bacillus thuringiensis serovar konkukian str. 97-27
12  49476243  Bartonella henselae str. Houston-1
13  49474765  Bartonella quintana str. Toulouse
14  42524020  Bdellovibrio bacteriovorus HD100
15  27378327  Bradyrhizobium japonicum USDA 110
16  27375291  Bradyrhizobium japonicum USDA 110
17  62290867  Brucella abortus biovar 1 str. 9-941
18  62317823  Brucella abortus biovar 1 str. 9-941
19  17986333  Brucella melitensis 16M
20  23500007  Brucella suis 1330
21  23502870  Brucella suis 1330
22  48785894  Burkholderia fungorum LB400
23  48783367  Burkholderia fungorum LB400
24  78060332  Burkholderia sp. 383
25  78062846  Burkholderia sp. 383
26  78062694  Burkholderia sp. 383
27  67546348  Burkholderia vietnamiensis G4
28  71083155  Candidatus Pelagibacter ubique HTCC1062
29  46446495  Candidatus Protochlamydia amoebophila UWE25
30  16127251  Caulobacter crescentus CB15
31  71280340  Colwellia psychrerythraea 34H
32  19552962  Corynebacterium glutamicum ATCC 13032
33  41722782  Dechloromonas aromatica RCB
34  68206339  Desulfitobacterium hafniense DCB-2
35  78357962  Desulfovibrio desulfuricans G20
36  78358472  Desulfovibrio desulfuricans G20
37  46580444  Desulfovibrio vulgaris subsp. vulgaris str. Hildenborough
38  61101800  Erythrobacter litoralis HTCC2594
39  56420269  Geobacillus kaustophilus HTA426
40  78221278  Geobacter metallireducens GS-15
41  78222749  Geobacter metallireducens GS-15
42  78221249  Geobacter metallireducens GS-15
43  39997256  Geobacter sulfurreducens PCA
44  39998398  Geobacter sulfurreducens PCA
45  58038539  Gluconobacter oxydans 621H
46  10803574  Halobacterium sp. NRC-1
47  16120174  Halobacterium sp. NRC-1
48  3182933  Hydrogenophilus thermoluteolus
49  24213995  Leptospira interrogans serovar Lai str. 56601
50  68246620  Magnetococcus sp. MC-1
51  23013670  Magnetospirillum magnetotacticum MS-1
52  46202988  Magnetospirillum magnetotacticum MS-1
53  13473076  Mesorhizobium loti MAFF303099
54  68193348  Mesorhizobium sp. BNC1
55  20093916  Methanopyrus kandleri AV19
56  68133744  Methanosarcina barkeri str. fusaro
57  21229090  Methanosarcina mazei Go1
58  53803114  Methylococcus capsulatus str. Bath
59  41407889  Mycobacterium avium subsp. paratuberculosis K-10
60  41408751  Mycobacterium avium subsp. paratuberculosis K-10
61  69931525  Nitrobacter hamburgensis X14
62  69927889  Nitrobacter hamburgensis X14
63  75676166  Nitrobacter winogradskyi Nb-255
64  75674601  Nitrobacter winogradskyi Nb-255
65  77165320  Nitrosococcus oceani ATCC 19707
66  30249950  Nitrosomonas europaea ATCC 19718
67  30249871  Nitrosomonas europaea ATCC 19718
68  53688547  Nostoc punctiforme PCC 73102 (Nostoc punctiforme ATCC 29133)
69  17229743  Nostoc sp. PCC 7120
70  79041724  Novosphingobium aromaticivorans DSM 12444
71  23098298  Oceanobacillus iheyensis HTE831
72  69937461  Paracoccus denitrificans PD1222
73  78186570  Pelodictyon luteolum DSM 273
74  68550032  Pelodictyon phaeoclathratiforme BU-1
75  54302736  Photobacterium profundum SS9
76  67908648  Polaromonas sp. JS666
77  67910463  Polaromonas sp. JS666
78  15595717  Pseudomonas aeruginosa PAO1
79  581477  Pseudomonas stutzeri
80  18313204  Pyrobaculum aerophilum str. IM2
81  53761178  Ralstonia eutropha JMP134
82  68560128  Ralstonia metallidurans CH34
83  77465583  Rhodobacter sphaeroides 2.4.1
84  77463895  Rhodobacter sphaeroides 2.4.1
85  77462524  Rhodobacter sphaeroides 2.4.1
86  74024877  Rhodoferax ferrireducens DSM 15236
87  32474017  Rhodopirellula baltica SH 1
88  39934531  Rhodopseudomonas palustris CGA009
89  39933575  Rhodopseudomonas palustris CGA009
90  47572614  Rubrivivax gelatinosus PM1
91  47571930  Rubrivivax gelatinosus PM1
92  69157765  Shewanella denitrificans OS217
93  56708999  Silicibacter pomeroyi DSS-3
94  56697691  Silicibacter pomeroyi DSS-3
95  69298074  Silicibacter sp. TM1040
96  16263146  Sinorhizobium meliloti 1021
97  15966425  Sinorhizobium meliloti 1021
98  67931130  Solibacter usitatus Ellin6076
99  21283028  Staphylococcus aureus subsp. aureus MW2
100  27468010  Staphylococcus epidermidis ATCC 12228
101  70726501  Staphylococcus haemolyticus JCSC1435
102  73662650  Staphylococcus saprophyticus subsp. saprophyticus ATCC 15305
103  21225156  Streptomyces coelicolor A3(2)
104  52008575  Thiobacillus denitrificans ATCC 25259
105  52006401  Thiobacillus denitrificans ATCC 25259
106  52005988  Thiobacillus denitrificans ATCC 25259
107  52007166  Thiobacillus denitrificans ATCC 25259
108  52007150  Thiobacillus denitrificans ATCC 25259
109  78484768  Thiomicrospira crunogena XCL-2
110  78484774  Thiomicrospira crunogena XCL-2
111  71675175  Trichodesmium erythraeum IMS101
112  71675179  Trichodesmium erythraeum IMS101
113  31795433  Yersinia pestis KIM
114  56552440  Zymomonas mobilis subsp. mobilis ZM4

No.  APE2220  Organism
1  14601921  Aeropyrum pernix K1
2  66857871  Anaeromyxobacter dehalogenans 2CP-C
3  27378484  Bradyrhizobium japonicum USDA 110
4  27381249  Bradyrhizobium japonicum USDA 110
5  27380774  Bradyrhizobium japonicum USDA 110
6  48784025  Burkholderia fungorum LB400
7  48785460  Burkholderia fungorum LB400
8  48786361  Burkholderia fungorum LB400
9  78044986  Carboxydothermus hydrogenoformans Z-2901
10  78044536  Carboxydothermus hydrogenoformans Z-2901
11  76259385  Chloroflexus aurantiacus J-10-fl
12  76259870  Chloroflexus aurantiacus J-10-fl
13  66796930  Deinococcus geothermalis DSM 11300
14  68206487  Desulfitobacterium hafniense DCB-2
15  68208667  Desulfitobacterium hafniense DCB-2
16  68207165  Desulfitobacterium hafniense DCB-2
17  13471664  Mesorhizobium loti MAFF303099
18  13470392  Mesorhizobium loti MAFF303099
19  68191435  Mesorhizobium sp. BNC1
20  15679802  Methanothermobacter thermautotrophicus str. Delta H
21  68268910  Moorella thermoacetica ATCC 39073
22  68270693  Moorella thermoacetica ATCC 39073
23  41408344  Mycobacterium avium subsp. paratuberculosis K-10
24  31793606  Mycobacterium bovis AF2122/97
25  31791547  Mycobacterium bovis AF2122/97
26  15609563  Mycobacterium tuberculosis H37Rv
27  15607511  Mycobacterium tuberculosis H37Rv
28  69926851  Nitrobacter hamburgensis X14
29  75676388  Nitrobacter winogradskyi Nb-255
30  54024194  Nocardia farcinica IFM 10152
31  54023339  Nocardia farcinica IFM 10152
32  71366759  Nocardioides sp. JS614
33  71366769  Nocardioides sp. JS614
34  71369136  Nocardioides sp. JS614
35  71366713  Nocardioides sp. JS614
36  67910922  Polaromonas sp. JS666
37  46132875  Ralstonia eutropha JMP134
38  68555060  Ralstonia metallidurans CH34
39  17546185  Ralstonia solanacearum GMI1000
40  77462478  Rhodobacter sphaeroides 2.4.1
41  39933818  Rhodopseudomonas palustris CGA009
42  39936862  Rhodopseudomonas palustris CGA009
43  68560845  Rubrobacter xylanophilus DSM 9941
44  56697250  Silicibacter pomeroyi DSS-3
45  56697488  Silicibacter pomeroyi DSS-3
46  69302237  Silicibacter sp. TM1040
47  15965505  Sinorhizobium meliloti 1021
48  29827989  Streptomyces avermitilis MA-4680
49  51891237  Symbiobacterium thermophilum IAM 14863

No.  PA2707  Organism
1  50085403  Acinetobacter sp. ADP1
2  56478206  Azoarcus sp. EbN1
3  67153916  Azotobacter vinelandii AvOP
4  33597996  Bordetella parapertussis 12822
5  33593476  Bordetella pertussis Tohama I
6  27381805  Bradyrhizobium japonicum USDA 110
7  74019658  Burkholderia ambifaria AMMD
8  67661933  Burkholderia cenocepacia HI2424
9  48784327  Burkholderia fungorum LB400
10  53725874  Burkholderia mallei ATCC 23344
11  76811951  Burkholderia pseudomallei 1710b
12  78067143  Burkholderia sp. 383
13  67533743  Burkholderia vietnamiensis G4
14  16125704  Caulobacter crescentus CB15
15  67935400  Chlorobium phaeobacteroides DSM 266
16  34497828  Chromobacterium violaceum ATCC 12472
17  71279839  Colwellia psychrerythraea 34H
18  53729816  Dechloromonas aromatica RCB
19  61101138  Erythrobacter litoralis HTCC2594
20  78221335  Geobacter metallireducens GS-15
21  56460554  Idiomarina loihiensis L2TR
22  24214888  Leptospira interrogans serovar Lai str. 56601
23  46204825  Magnetospirillum magnetotacticum MS-1
24  23016600  Magnetospirillum magnetotacticum MS-1
25  13473377  Mesorhizobium loti MAFF303099
26  23129560  Nostoc punctiforme PCC 73102 (Nostoc punctiforme ATCC 29133)
27  53688517  Nostoc punctiforme PCC 73102 (Nostoc punctiforme ATCC 29133)
28  23124671  Nostoc punctiforme PCC 73102 (Nostoc punctiforme ATCC 29133)
29  17230983  Nostoc sp. PCC 7120
30  79042368  Novosphingobium aromaticivorans DSM 12444
31  69935186  Paracoccus denitrificans PD1222
32  77918052  Pelobacter carbinolicus DSM 2380
33  68549067  Pelodictyon phaeoclathratiforme BU-1
34  67848172  Polaromonas sp. JS666
35  77359631  Pseudoalteromonas haloplanktis TAC125
36  15597903  Pseudomonas aeruginosa PAO1
37  77457636  Pseudomonas fluorescens PfO-1
38  26991257  Pseudomonas putida KT2440
39  71065362  Psychrobacter arcticus 273-4
40  71363791  Psychrobacter cryohalolentis K5
41  46132462  Ralstonia eutropha JMP134
42  68555665  Ralstonia metallidurans CH34
43  17545717  Ralstonia solanacearum GMI1000
44  77463205  Rhodobacter sphaeroides 2.4.1
45  74022410  Rhodoferax ferrireducens DSM 15236
46  39936680  Rhodopseudomonas palustris CGA009
47  47572573  Rubrivivax gelatinosus PM1
48  69952182  Shewanella frigidimarina NCIMB 400
49  56697359  Silicibacter pomeroyi DSS-3
50  69298300  Silicibacter sp. TM1040
51  29828879  Streptomyces avermitilis MA-4680
52  21220913  Streptomyces coelicolor A3(2)
53  21223637  Streptomyces coelicolor A3(2)
54  71545112  Syntrophobacter fumaroxidans MPOB
55  72160806  Thermobifida fusca YX
56  71674763  Trichodesmium erythraeum IMS101
57  71674360  Trichodesmium erythraeum IMS101
58  71677375  Trichodesmium erythraeum IMS101
59  71673995  Trichodesmium erythraeum IMS101
60  71674671  Trichodesmium erythraeum IMS101
61  71674199  Trichodesmium erythraeum IMS101
62  71674292  Trichodesmium erythraeum IMS101
63  71674347  Trichodesmium erythraeum IMS101
64  71676219  Trichodesmium erythraeum IMS101

No.  YehL  Organism
1  14600668  Aeropyrum pernix K1
2  29349316  Bacteroides thetaiotaomicron VPI-5482
3  78062405  Burkholderia sp. 383
4  16130057  Escherichia coli K12
5  67988697  Kineococcus radiotolerans SRS30216
6  24213400  Leptospira interrogans serovar Lai str. 56601
7  54026623  Nocardia farcinica IFM 10152
8  32473269  Rhodopirellula baltica SH 1
9  56480039  Shigella flexneri 2a str. 301
10  29827526  Streptomyces avermitilis MA-4680
11  29830000  Streptomyces avermitilis MA-4680
12  21223181  Streptomyces coelicolor A3(2)
13  21219694  Streptomyces coelicolor A3(2)
14  72161863  Thermobifida fusca YX

No.  Other  Organism
1  14601056  Aeropyrum pernix K1
2  54023165  Nocardia farcinica IFM 10152
3  68560206  Rubrobacter xylanophilus DSM 9941
4  67926321  Solibacter usitatus Ellin6076
5  55981468  Thermus thermophilus HB8

[Figure 11: aligned consensus sequences for each MoxR subfamily, with the Walker A, Walker B, and Sensor I regions labeled.]

FIGURE 11. Consensus sequences and alignment of MoxR subfamily members.

All residues shown are at least 80% conserved within their given subfamily. Conservation is based on the following amino acid groupings: ILVM, KR, DE, ST, YFW and NQ denoted as i, k, d, s, y, and n, respectively. Ala, Cys, Pro, His, and Gly are not grouped. Members of the ILVM group are always represented by their letter ‘i’; all other conserved positions are represented by their group letter only if no single residue is present in more than 80% of sequences. An ‘x’ denotes that a residue is present in a given position in at least 50% of the sequences in a subfamily but is not conserved. The Walker A, Walker B, and sensor 1 regions are indicated.
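The notation rules in this legend can be sketched in code. The following is a minimal illustration, not the procedure actually used to build the figure: the function names, the gap character, and the choice to measure every fraction against the total number of sequences in the column are assumptions, and the 80% cutoff is applied as "at least 80%" throughout.

```python
from collections import Counter

# Residue groupings and their group symbols, as listed in the legend.
GROUPS = {"ILVM": "i", "KR": "k", "DE": "d", "ST": "s", "YFW": "y", "NQ": "n"}
CONSERVED = 0.80  # conservation cutoff stated in the legend
OCCUPIED = 0.50   # occupancy cutoff for the 'x' symbol


def consensus_symbol(column, gap="-"):
    """Return the consensus symbol for one alignment column (a string).

    Assumption: fractions are taken over all sequences, gaps included.
    """
    total = len(column)
    counts = Counter(c for c in column.upper() if c != gap)
    if total == 0 or not counts:
        return gap

    # Fraction of sequences whose residue falls in each group.
    group_frac = {
        symbol: sum(counts[r] for r in members) / total
        for members, symbol in GROUPS.items()
    }
    # ILVM positions are always written 'i', even if one residue dominates.
    if group_frac["i"] >= CONSERVED:
        return "i"
    # A single dominant residue is written as itself.
    residue, n = counts.most_common(1)[0]
    if n / total >= CONSERVED:
        return residue
    # Otherwise use the group letter when the group as a whole is conserved.
    for symbol, frac in group_frac.items():
        if frac >= CONSERVED:
            return symbol
    # Position is occupied in enough sequences but is not conserved.
    if sum(counts.values()) / total >= OCCUPIED:
        return "x"
    return gap


def consensus(alignment):
    """Consensus string for a list of equal-length aligned sequences."""
    return "".join(consensus_symbol("".join(col)) for col in zip(*alignment))
```

For a hypothetical five-sequence alignment `["GKT", "GRS", "GKT", "GRS", "GKS"]`, the first column is an invariant Gly, while the second and third columns are conserved only at the group level (KR and ST), so they are reported with group letters.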

neighboring genes. We selected this particular approach because it is highly specific and allowed us to examine the different MoxR subfamilies in detail with a minimum of background noise.

2.4.1 MRP (MoxR Proper) Subfamily

Of all the MoxR subfamilies identified, MRP appears to be the largest and most diverse. Of the 596 MoxR AAA+ ATPases identified in our study, 287 (48.2%) belong to the MRP subfamily (Fig. 10 and Table 1). These are distributed across 155 of the 275 organisms included in our analysis, and most of these organisms contain multiple MRP proteins. Rhodopirellula baltica SH 1 is the most extreme example, with a genome encoding an astounding 15 MRP subfamily members (Table 1). The range of organisms is remarkably diverse, including members of all major subdivisions of the Proteobacteria, as well as the Acidobacteria, Actinobacteria, Aquificae, Bacteroidetes, Chlamydiae, Chlorobi, Chloroflexi, Cyanobacteria, Deinococcus-Thermus, Firmicutes, Planctomycetes, Spirochaetes, and Thermotogae phyla. Representatives are also found in the Archaea superkingdom (Fig. 12). Notably, no members of the MRP subfamily are present in Escherichia coli K12 or any of the other enterobacteria included in our analysis.
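As a quick consistency check on these figures, the subfamily sizes can be tallied from the number of entries listed for each subfamily in Table 1. The sketch below is illustrative only; the dictionary and function names are assumptions.

```python
# Number of entries listed per subfamily in Table 1.
SUBFAMILY_COUNTS = {
    "MRP": 287,
    "TM0930": 24,
    "RavA": 39,
    "CGN": 114,
    "APE2220": 49,
    "PA2707": 64,
    "YehL": 14,
    "Other": 5,
}


def subfamily_shares(counts):
    """Return each subfamily's share of all proteins, in percent (1 decimal)."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


if __name__ == "__main__":
    print("total proteins:", sum(SUBFAMILY_COUNTS.values()))
    for name, share in subfamily_shares(SUBFAMILY_COUNTS).items():
        print(f"{name:8s} {share:5.1f}%")
```

The entry counts sum to the 596 proteins quoted in the text, and the MRP share rounds to the stated 48.2%.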

Our analysis of MRP-encoding genes shows that they are closely associated with genes encoding proteins of unknown function belonging to COG1721 (Fig. 13A). Of the 287 MRP members, 258 (90%) are close to genes encoding COG1721 proteins, and 250 of these (97%) are immediately adjacent to COG1721 genes. In addition, three MRP genes are close to two copies of COG1721-encoding genes. Most of the COG1721 proteins (94%) also contain a PFAM DUF58 domain (Domain of Unknown Function 58). A small subset of the COG1721 proteins (~13%) also contain a Von Willebrand Factor Type A (VWA) domain. The functions of COG1721 and DUF58 proteins are currently unknown. Based upon the remarkably high

[Figure 12: presence/absence matrix]
Columns (MoxR AAA+ subfamilies): MRP, TM0930, RavA, CGN, APE2220, PA2707, YehL
Rows (bacterial phyla): Acidobacteria (1), Actinobacteria (21), Aquificae (1), Bacteroidetes (4), Chlamydiae (6), Chlorobi (8), Chloroflexi (3), Cyanobacteria (10), Deinococcus-Thermus (3), Firmicutes (55), Fusobacteria (1), Planctomycetes (1), Proteobacteria (129), Spirochaetes (5), Thermotogae (1)
Rows (archaeal phyla): Crenarchaeota (5), Euryarchaeota (20), Nanoarchaeota (1)

FIGURE 12. Distribution of MoxR AAA+ proteins across organisms.

Organisms are sorted according to phyla. A filled box indicates that a particular subfamily is present in at least one member of a given phylum. The number of organisms from each phylum included in our analysis is shown in parenthesis.
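The "filled box" rule in this legend amounts to an any-member test per phylum and subfamily. The sketch below illustrates it on only three organisms whose subfamily memberships are taken from Table 1; it is not the full 275-organism data set, so absences in its output say nothing about the real figure. The function and variable names are assumptions.

```python
# Three organisms from Table 1 with their phyla (a deliberately tiny subset).
PHYLUM_OF = {
    "Rhodopirellula baltica SH 1": "Planctomycetes",
    "Escherichia coli K12": "Proteobacteria",
    "Aeropyrum pernix K1": "Crenarchaeota",
}

# Subfamily membership of those three organisms, as listed in Table 1.
MEMBERS = {
    "MRP": {"Rhodopirellula baltica SH 1"},
    "TM0930": {"Rhodopirellula baltica SH 1"},
    "RavA": {"Escherichia coli K12", "Aeropyrum pernix K1"},
    "CGN": {"Rhodopirellula baltica SH 1"},
    "APE2220": {"Aeropyrum pernix K1"},
    "PA2707": set(),
    "YehL": set(PHYLUM_OF),  # all three organisms encode a YehL member
}


def phylum_presence(phylum_of, members):
    """Map (phylum, subfamily) to True if any organism of that phylum has it."""
    phyla = sorted(set(phylum_of.values()))
    return {
        (phylum, sub): any(phylum_of[org] == phylum for org in orgs)
        for phylum in phyla
        for sub, orgs in members.items()
    }


def render(presence, members):
    """Return the matrix as text lines: '#' for a filled box, '.' for empty."""
    phyla = sorted({phylum for phylum, _ in presence})
    lines = []
    for phylum in phyla:
        row = " ".join("#" if presence[(phylum, s)] else "." for s in members)
        lines.append(f"{phylum:15s} {row}")
    return lines


if __name__ == "__main__":
    print("\n".join(render(phylum_presence(PHYLUM_OF, MEMBERS), MEMBERS)))
```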

[Figure 13 panel labels]
A) MRP Subfamily: MRP-AAA+; COG1721 & Duf58; COG1721 & Duf58 & VWA; VWA; TPR; Transglutaminase
B) TM0930 Subfamily: TM0930 AAA+; COG3864; COG3864 VWA
C) RavA Subfamily: RavA-AAA+; VWA
D) CGN Subfamily: CGN-AAA+; VWA & COG4548 NorD; NO reductase; RuBisCO; Gas Vesicle (no VWA)
E) APE2220 Subfamily: APE2220 AAA+; COG3552 CoxE VWA (51% associated with Cox/Xdh gene cluster)
F) PA2707 Subfamily: PA2707 AAA+; COG3552 CoxE VWA & COG3825
G) YehL Subfamily: YehL-AAA+; YehM VWA; DnaJ/BolA associated; Other
The annotation "(30% not associated with VWA)" appears beside panels C and F.

FIGURE 13. General gene structure for each MoxR AAA+ subfamily.

Each panel shows the major conserved gene organization present around the MoxR genes for each subfamily. Note that the gene structure for MRP subfamily members associated with VWA proteins is quite diverse (panel A, line 2). The gene structure shown is the most frequently occurring (> 42% of cases). In panel F, the double slash separating the genes indicates that the positions of the VWA genes with respect to the PA2707 AAA+ genes are highly variable.

conservation of position between these genes and the MRP genes, however, it seems reasonable to assume that they function together.
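The distinction drawn above between genes that are "close to" and "immediately adjacent to" an MRP gene can be sketched as a simple neighborhood scan. This is a minimal illustration, not the pipeline used in this study: the gene representation (an ordered list of annotation sets), the function names, and the five-gene window are all assumptions.

```python
from collections import Counter


def classify_neighbourhood(genes, query_index, target, window=5):
    """Classify how close the nearest `target`-annotated gene is to a query gene.

    genes: list of annotation sets, ordered by position on the chromosome.
    window: number of genes searched on each side (hypothetical cutoff).
    Returns "adjacent", "near", or "none".
    """
    lo = max(0, query_index - window)
    hi = min(len(genes), query_index + window + 1)
    distances = [
        abs(j - query_index)
        for j in range(lo, hi)
        if j != query_index and target in genes[j]
    ]
    if not distances:
        return "none"
    return "adjacent" if min(distances) == 1 else "near"


def summarise(genes, query, target, window=5):
    """Count neighbourhood classes over every gene annotated as `query`."""
    return Counter(
        classify_neighbourhood(genes, i, target, window)
        for i, annotations in enumerate(genes)
        if query in annotations
    )


if __name__ == "__main__":
    # Hypothetical contig: three MoxR genes in different genomic contexts.
    contig = [
        {"TPR"}, {"MoxR"}, {"COG1721", "DUF58"}, {"VWA"}, set(),
        {"MoxR"}, set(), {"COG1721"}, set(), set(),
        set(), set(), set(), set(), {"MoxR"},
    ]
    print(summarise(contig, "MoxR", "COG1721"))
```

Applied to real annotations, the "adjacent" and "near" counts together would correspond to the 258 MRP genes reported as close to COG1721 genes, and "adjacent" alone to the 250 that are immediately neighboring.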

Of the entire MRP subfamily, 105 of the 287 members (~37%) were found near VWA-encoding genes. The VWA domain is a metal-binding domain often involved in mediating protein-protein interactions (Whittaker and Hynes 2002). Binding of metal, which is typically magnesium, occurs through a non-contiguous metal ion-dependent adhesion site (MIDAS) that is important for binding to protein ligands (Xiong et al. 2002). Although these proteins have been well studied in eukaryotic organisms, where they are known to be involved in a wide range of processes including cell adhesion, transport, the complement system, proteolysis, transcription, DNA repair, and ribosome biogenesis, their roles in bacteria and archaea are much less well understood. Research has implicated them, however, in bacterial surface adhesion, fibrinogen binding, serum opacity, and metal insertion (Kachlany et al. 2000; Katerov et al. 2000; Willows 2003). For 68 of the 287 MRP subfamily members, the MRP gene is close to more than one VWA-encoding gene (generally two; see Fig. 13A). Virtually all of the MRP genes associated with VWA-encoding genes are also near COG1721 genes, and, interestingly, 21% of all of the VWA genes are fused to a COG1721 gene. Close to half of these MRP genes are also closely associated with genes encoding proteins containing tetratricopeptide repeat (TPR) domains, which are known to be important in mediating protein-protein interactions (Blatch and Lassle 1999). In some instances, the TPR and VWA domains are on a single polypeptide chain.

Another 26% of MRP genes are in close genomic proximity to genes encoding proteins with a putative transglutaminase domain (CDD 22804) (Fig. 13A). None of these MRP genes are associated with VWA genes, although virtually all of them are associated with COG1721 genes. This relationship makes it tempting to speculate that these MRP proteins may interact with the transglutaminases, perhaps utilizing them as substrates.

Limited research has been conducted on MRP genes located near methanol dehydrogenase gene clusters in several organisms. The first moxR gene described, and from which the entire family derives its name, was an MRP subfamily member from Paracoccus denitrificans. Paracoccus denitrificans is a Gram-negative soil bacterium capable of growing methylotrophically on one-carbon compounds such as methanol or methylamine (Van Spanning et al. 1991). Oxidation of methanol is carried out by the enzyme methanol dehydrogenase (MDH), a tetrameric complex consisting of two identical large (α) and two identical small (β) subunits. MDH utilizes pyrroloquinoline quinone (PQQ, Fig. 14A) as a cofactor and cytochrome cL as an electron acceptor. Activity of MDH is also dependent upon bound Ca2+ ions, which are believed to be important for PQQ coordination and activation (Anthony 2004). Studies in Paracoccus denitrificans showed that cells containing an insertion mutant of the moxR gene were unable to grow on minimal media using methanol as a carbon source. Growth on methylamine or succinate, however, was unaffected. Examination of the cells showed that MDH, PQQ, and cytochrome c levels were all normal. However, the expressed MDH was inactive. Thus, the authors concluded that the moxR gene appears to be important for activation of methanol dehydrogenase, possibly via modification and activation of one of the subunits of MDH, or via regulation of genes encoding the MDH activators (Van Spanning et al. 1991).

Similar results were obtained in Methylobacterium extorquens, where cells containing an insertion mutant of an MRP gene located near the MDH gene cluster also lost the ability to grow on methanol (Toyama et al. 1998). The moxR gene is located adjacent to another gene, moxS, which encodes a protein of unknown function (Fig. 14B). Interestingly, the moxRS genes studied in Methylobacterium extorquens are located between two MDH gene clusters: the first (moxFJGI) encodes the MDH structural genes and the second (moxACKLD) encodes proteins known to be required for Ca2+ insertion (Amaratunga et al. 1997; Richardson and Anthony 1992) (Fig. 14B). Disruption of genes in the latter region was observed to result in MDH lacking Ca2+ and containing an abnormally bound PQQ cofactor (Richardson and Anthony 1992). Intriguingly, the moxC and moxL genes encode proteins containing VWA domains. Such an

[Figure 14A: chemical structure of pyrroloquinoline quinone (PQQ)]

[Figure 14B: gene order and annotated functions]
MoxF  MDH Large Subunit
MoxJ  Putative Chaperone
MoxG  Cyt. cL
MoxI  MDH Small Subunit
MoxR  AAA+
MoxS  Unknown
MoxA, MoxC, MoxK, MoxL, MoxD  Ca2+ insertion

FIGURE 14. PQQ Structure and MDH Gene Region.

(A) Structure of Pyrroloquinoline quinone (PQQ) (Anthony 2004). (B) Shown is the organization of the Methylobacterium extorquens methanol dehydrogenase (MDH) gene region.

arrangement is reminiscent of the large subset of MRP genes found in close proximity to genes encoding VWA-containing proteins. This, in conjunction with the tendency to find members of other MoxR subfamilies in association with VWA-encoding genes, may suggest an association between the moxR gene product and those of the moxACKLD genes, thereby implicating MoxR in metal insertion into MDH. Such an involvement in metal insertion is reminiscent of the metal chelatase enzymes, to which the MoxR family is closely related (Iyer et al. 2004). These enzymes utilize both AAA+ and VWA domains to mediate insertion of metal ions into porphyrin rings as part of the synthesis of cobalamin or (bacterio)chlorophyll. In the case of Mg chelatase, one of the subunits, BchD, actually contains an AAA+ and a VWA domain fused together on a single polypeptide (Fodje et al. 2001).

It is noteworthy that the moxR gene in Methylobacterium extorquens does not appear to be in the vicinity of a COG1721-encoding gene, while in Paracoccus denitrificans, which has a similar gene arrangement in this area, the moxR gene is adjacent to a COG1721 gene. This may suggest that the COG1721 protein is not absolutely required for MoxR function in MDH activation. Alternatively, the COG1721 gene in Methylobacterium extorquens may simply have not yet been properly identified or may exist elsewhere in the genome.

Since only a very small subset of the MRP subfamily members are found in organisms that contain the methanol dehydrogenase gene cluster, it is clear that the function of these proteins is more general in nature. It is tempting to speculate that the MRP, and indeed MoxR proteins as a whole, may be involved in general metal insertion processes, possibly working as molecular chaperones helping to mediate proper insertion of metal ions/metal cofactors into substrate proteins.

2.4.2 TM0930 Subfamily

The TM0930 subfamily contains 24 of the 596 MoxR AAA+ proteins analyzed (Fig. 10 and Table 1). TM0930 genes are present in 21 different organisms, including members of the Actinobacteria, Cyanobacteria, Deinococcus-Thermus, Firmicutes, Planctomycetes, Proteobacteria (Beta, Delta/Epsilon and Alpha subdivisions), and Thermotogae phyla (Fig. 12). No archaeal members were identified. TM0930 genes are all found near (typically immediately adjacent to) genes encoding proteins belonging to COG3864 (Fig. 13B). This COG represents a group of bacterial proteins of unknown function. A subset of these COG3864 gene products (37.5%) also contains a C-terminal VWA domain. Members of this subfamily have not been studied experimentally, and, thus, nothing is known about their specific function. The conservation of gene position between the TM0930 and COG3864/VWA-encoding genes suggests that they may function together.

It should be noted that our current analysis suggests that APE0892, which was previously grouped with some members of the now expanded TM0930 branch (Snider et al. 2006), is actually distinct and seems to represent a minor branch (Fig. 1 and Table 1). This conclusion is further supported by our gene neighborhood analysis, since APE0892 is not found in the neighborhood of a COG3864-encoding gene.

2.4.3 RavA Subfamily

The RavA subfamily consists of AAA+ proteins composed of an N-terminal MoxR-type AAA+ module, followed by a poorly conserved C-terminal region of variable length. Of the 596 MoxR AAA+ proteins we identified, 39 belong to the RavA subfamily (Fig. 10 and Table 1). These are distributed across 35 of the 275 organisms analyzed (Table 1). RavA proteins are found in a range of bacteria belonging to the Actinobacteria, Bacteroidetes, Firmicutes, and Proteobacteria (Beta, Delta/Epsilon, and Gamma subdivisions) phyla, as well as in members of the Crenarchaeota and Euryarchaeota phyla of the Archaea superkingdom (Fig. 12). Most organisms contain only a single copy of the ravA gene; however, certain archaea, particularly Methanocaldococcus jannaschii DSM 2661 and the three members of the Methanosarcina subdivision, each contain two copies (Table 1).

The genes encoding RavA proteins are almost always found adjacent to genes encoding proteins containing a VWA domain (Snider et al. 2006) (Fig. 13C). The RavA genes of Cytophaga hutchinsonii and Chromobacterium violaceum are exceptions, with no nearby VWA-encoding gene being detected. Our analysis did not detect any other genes co-occurring with members of the RavA subfamily as a whole.

Of the numerous RavA-type MoxR AAA+ proteins identified, only Escherichia coli K12 MG1655 RavA (YieN_ECOLI) has been characterized in detail (Snider et al. 2006). Studies on WT MG1655 cells grown in rich and minimal media at 37°C have shown that induction of RavA is maximal towards late log and early stationary phase. The ravA promoter region contains a nearly perfect σS consensus sequence, and induction has been shown to be eliminated in strains deficient in rpoS, suggesting that this stationary phase sigma factor is responsible, at least in part, for ravA regulation. Northern blotting and RT-PCR studies have demonstrated that the ravA gene forms an operon with the adjacent VWA protein-encoding gene, viaA (YieM_ECOLI). Considering the near-invariant adjacency of ravA and viaA genes across the various organisms studied (Fig. 13C), it is likely that this operon arrangement is conserved. The RavA protein itself has also been purified and characterized. Studies have shown that it is a functional ATPase which forms hexameric rings in the presence of nucleotide. The enzyme also hydrolyzes GTP, although not as effectively as ATP. E. coli K12 RavA consists of three discrete fragments, including the AAA+ core and α-helical subdomains, as well as a poorly conserved C-terminal domain of unknown function. ravA deletion mutants do not possess any obvious phenotype under a wide range of conditions, so the exact function of the protein in the cell is still elusive (Snider et al. 2006).

The RavA protein was found to interact strongly with the inducible lysine decarboxylase enzyme, LdcI, encoded by the cadA gene (Snider et al. 2006). LdcI is a pyridoxal phosphate (PLP, a vitamin B6 derivative) dependent decameric enzyme (Sabo et al. 1974; Sabo and Fischer 1974) that plays a major role in the acid stress response in bacteria (Merrell and Camilli 1999; Park et al. 1996; Soksawatmaekhin et al. 2004). The RavA-LdcI complex has been visualized by negative stain electron microscopy, and has been shown to form a remarkable ‘cage-like’ structure composed of two LdcI decamers linked by up to five RavA oligomers. RavA does not appear to be important for the expression, biogenesis, or function of the LdcI enzyme, so the exact role of the RavA-LdcI complex is still unclear. Interestingly, the RavA protein does not interact with the constitutive lysine decarboxylase, LdcC, even though LdcC is 69% identical and 84% similar to LdcI. This suggests that the interaction between RavA and LdcI is highly specific. It has been proposed that the RavA-LdcI complex may be involved in the regulation of RavA activity under low pH conditions. The complex may be important, for instance, in sequestering RavA from, or enhancing RavA activity towards, its substrates. Alternatively, complex formation may direct RavA towards an entirely new set of substrates. The exact role of RavA in the cell is under investigation (Snider et al. 2006).

2.4.4 CGN Subfamily

The CGN (CbbQ/GvpN/NorQ) subfamily is a large, highly diversified branch of the MoxR AAA+ phylogenetic tree. From our analysis, 114 of 596 MoxR AAA+ proteins belong to this subfamily (Fig. 10 and Table 1). These are found in 81 of the organisms incorporated in our study, including members of all major subdivisions of the Proteobacteria, as well as of the Acidobacteria, Actinobacteria, Aquificae, Chlamydiae, Chlorobi, Cyanobacteria, Firmicutes, Planctomycetes and Spirochaetes phyla. CGN proteins are also found in members of the Euryarchaeota and Crenarchaeota phyla of the Archaea superkingdom (Fig. 12).

A global analysis of the family shows that 83 of the CGN genes co-occur with VWA-encoding genes (73%) (Fig. 13D). All of these genes co-occur with a single VWA gene, with the exception of the CGN found in Shewanella denitrificans OS217, which occurs in proximity to two VWA genes. Approximately 80% of the CGN genes are immediately adjacent to the VWA gene, and, in the case of one CGN found in Desulfitobacterium hafniense DCB-2, the VWA and AAA+ are encoded by the same gene. Of these VWA proteins, 86% belong to COG4548 (NorD) (Fig. 13D). Members of this COG are similar to the NorD protein, which is associated with the activation of the nitric oxide reductase enzyme (see below).

The extreme diversity of this subfamily makes a general gene-structure analysis difficult, and there do not appear to be any genes, other than the VWA, that are strongly associated with members of this branch as a whole. The CGN subfamily is perhaps the best experimentally studied of all of the MoxR subfamilies, however, and some functional information is available.

Experimental work has been performed on the NirQ/NorQ-, CbbQ- and GvpN-type members.

2.4.4.1 NirQ/NorQ-type members

Our analysis detects 22 CGN genes associated with nitric oxide reductase-encoding genes (Fig. 10). Almost all of these genes also appear to co-occur with VWA-encoding genes. This group includes the NirQ/NorQ enzymes, which have been studied experimentally in a variety of organisms. Mutagenesis experiments have been performed in Pseudomonas stutzeri, Paracoccus denitrificans, Rhodobacter sphaeroides 2.4.3, and Pseudomonas aeruginosa. These genes are found in association with the nitrite reductase/nitric oxide reductase gene clusters and appear to play an important role in bacterial denitrification. Denitrification is a microbial respiratory process involving the use of oxidized nitrogen compounds as alternative electron acceptors. The entire pathway consists of four reduction reaction steps, leading from nitrate to dinitrogen. The four enzymes involved in this process are nitrate reductase, nitrite reductase, nitric oxide reductase, and nitrous oxide reductase (Philippot 2002) (Fig. 15). Research to date suggests that the NirQ/NorQ proteins may play a role in the activation of the respiratory, short-chain nitric oxide reductase, which acts at the third step in this pathway.

Enzyme activity studies on P. aeruginosa and P. stutzeri mutants lacking the nirQ/norQ genes revealed the loss of nitric oxide reductase activity, although induction from the nitric oxide reductase promoter was not compromised (Arai et al. 1999; Jungst and Zumft 1992). Thus it appears that the nirQ/norQ gene products may be important for regulating the activity of nitric oxide reductase at the enzyme level (Jungst and Zumft 1992). Deletion or insertion mutagenesis of the nirQ/norQ genes in P. aeruginosa, P. denitrificans, and R. sphaeroides 2.4.3 resulted in strains incapable of anaerobic growth on minimal medium using nitrate or nitrite as a sole electron acceptor (Arai et al. 1998; Bartnikas et al. 1997; de Boer et al. 1996). In addition, in P. denitrificans and R. sphaeroides, insertion mutagenesis of the neighboring norD genes, which encode VWA proteins, also produced a similar phenotype (Bartnikas et al. 1997; de Boer et al. 1996). Mutant phenotypes could be partially complemented by introduction of plasmids containing the nirQ/norQ or norD genes (Arai et al. 1999; Bartnikas et al. 1997; de Boer et al. 1996; Jungst and Zumft 1992).

FIGURE 15. Pathway of microbial denitrification.

Nitrate (NO3-) is reduced to nitrite (NO2-) by nitrate reductase, nitrite to nitric oxide (NO) by nitrite reductase, nitric oxide to nitrous oxide (N2O) by nitric oxide reductase, and nitrous oxide to N2 by nitrous oxide reductase.

The nirQ/norQ genes are often found closely associated with nirO/norE and nirP/norF genes, and, in P. aeruginosa, the three genes have been shown to comprise an operon (nirQOP) (Arai et al. 1994). Our analysis detects nirO/norE genes near 16 of the 22 nirQ/norQ-type CGN genes identified. nirO/norE and nirP/norF encode small transmembrane proteins, and NirO is homologous to subunit III of bacterial and mitochondrial cytochrome oxidases (Arai et al. 1994). The exact role of these proteins is unclear, however, and there have been conflicting reports as to whether or not deletion of these genes produces an effect on nitric oxide reductase activity (Arai et al. 1998; de Boer et al. 1996).

2.4.4.2 CbbQ-type members

The CbbQ-type proteins are closely related to the NirQ/NorQ proteins. Our study detected 8 CbbQ-type CGN genes (Fig. 10) co-occurring with genes encoding bacterial ribulose-1,5-bisphosphate carboxylase/oxygenase (RuBisCO); all but one also co-occur with a VWA gene. The cbbQ genes have been shown to be important in the activation of RuBisCO. This enzyme is responsible for the fixation of CO2, catalyzing the carboxylation of ribulose-1,5-bisphosphate (RuBP) to form two molecules of 3-phosphoglycerate. Two major forms of RuBisCO are found in chemoautotrophic bacteria: form I is a hexadecameric enzyme composed of 8 large and 8 small subunits, and form II is a dimeric, tetrameric, or octameric complex composed of only large subunits (Shively et al. 1998). cbbQ genes have been found in association with both types of RuBisCO genes.

Both form I and form II RuBisCO enzymes can be stimulated by CbbQ and its associated VWA domain-containing protein, CbbO (Hayashi et al. 1997; Hayashi et al. 1999). Studies have shown that overexpression of Hydrogenophilus thermoluteolus CbbQ and/or CbbO with form I RuBisCO from the same organism in an E. coli system results in the production of a more active form of RuBisCO than when the enzyme is overexpressed alone. RuBisCO purified from the cells upon co-expression of CbbQ and/or CbbO possessed a Vmax almost two-fold greater than enzyme expressed without CbbQ/O. The Km values of the enzyme for RuBP and Mg2+ were also elevated, while the Km for CO2 was unchanged. The conformational state of the co-expressed RuBisCO was also observed to be different from that of RuBisCO expressed alone, as determined using both ANS staining and circular dichroism spectroscopy (Hayashi et al. 1997). Form II RuBisCO from Hydrogenovibrio marinus overexpressed with its CbbQ was also observed to undergo activation and a change in conformation (Hayashi et al. 1999).

Although the mechanism of action of CbbQ on RuBisCO enzymes is unclear, the ability of CbbQ to activate both forms of RuBisCO suggests that it interacts with the large, rather than the small subunit (which is not present in form II) (Hayashi et al. 1999). Analysis of purified recombinant CbbQ showed that it is a functional ATPase with activity dependent upon the presence of Mg2+ ions (Hayashi and Igarashi 2002).

2.4.4.3 GvpN-type members

The GvpN-type CGN MoxR proteins (Fig. 10) have also been studied experimentally and have been shown to have an important role in gas vesicle formation. Our analysis detected 11 CGN protein-encoding genes located in close proximity to gas-vesicle biosynthesis genes (Fig. 13D). None of these are associated with VWA protein-encoding genes. Gas vesicles are intracellular, hollow, gas-filled structures composed solely of protein. They are found in aquatic microorganisms where they provide buoyancy, allowing cells to control their depth in water, thus optimizing exposure to oxygen and light. The major component of gas vesicles is the small, hydrophobic GvpA protein, which forms the ribs of the main structure. GvpC, a larger hydrophilic protein, is a minor component located on the outer surface of the gas vesicles where it serves to stabilize the entire structure (Walsby 1994). Mutations in gvpN produce varying effects on gas vesicle formation. Insertion mutagenesis of Halobacterium salinarium gvpN resulted in cells which formed a large number of gas vesicles of unusually small size compared with WT cells. Removal of the bulk of the insertion cassette did not affect the phenotype, suggesting that it is specific to GvpN and not a result of a polar effect (DasSarma et al. 1994).

Different results were obtained in another study using Haloferax volcanii cells. When the gvp gene cluster was deleted in these cells and the cells were transformed with the Halobacterium salinarium gvp gene cluster containing a mutant form of gvpN, the cells formed a very low number of gas vesicles compared to cells transformed with the WT gene cluster. Northern blotting revealed that the other three genes in the same operon as gvpN were expressed at normal levels, indicating that the phenotype was not a result of a polar effect and was not due to alterations in the expression level of these genes (Offner et al. 1996).

Thus, the exact role of GvpN is unclear and may differ slightly between different organisms. Based upon the studies on other MoxR AAA+ proteins, however, it is tempting to speculate that GvpN may have a general, chaperone-like function, possibly acting on GvpA and/or GvpC to ensure that they are properly folded and successfully assembled into the gas vesicle structure.

2.4.4.4 DnaJ/BolA-Associated (DBA) members

A small subset of CGN genes is found near genes encoding a DnaJ-type molecular chaperone and a BolA-type protein (Fig. 10). These CGN genes, which we have called the DnaJ/BolA-Associated (DBA) subgroup, are also associated with a VWA-encoding gene. DnaJ has been well studied in E. coli, where it works with the DnaK/Hsp70 molecular chaperone and the nucleotide exchange factor GrpE to promote protein folding and the oligomerization/dissociation of protein complexes (Houry 2001). No Hsp70- or GrpE-encoding genes were detected in any of the 10 genes surrounding CGN. BolA is a transcription factor and has also been studied in E. coli, where it has been shown to play a role in cell morphology, cell division, and cell stress response (Santos et al. 2002). Whether the co-occurrence of these CGN genes with dnaJ and bolA genes has any functional significance is unclear, and will need to be determined experimentally. It is interesting to postulate, for example, that the CGN and VWA proteins may work with DnaJ in order to promote the proper folding of substrate proteins.

With the lack of experimental work on any of the remaining CGN members, and the absence of any larger-scale gene conservation, no conclusions as to the function of the remaining members can yet be drawn.

2.4.5 APE2220 Subfamily

The APE2220 subfamily contains 49 of the 596 MoxR proteins included in our analysis (Fig. 10 and Table 1). These are found distributed across 31 organisms, including members of the Actinobacteria, Chloroflexi, Deinococcus-Thermus, Firmicutes, and Proteobacteria (Alpha, Beta and Delta/Epsilon subphyla) phyla, as well as two members of the Archaea superkingdom (Fig. 12).

Examination of the gene structure of this family reveals that 46 of the 49 APE2220 genes are in close proximity (typically adjacent) to genes encoding VWA proteins belonging to the CoxE COG3552 (Fig. 13E). Members of this COG are of unknown function, but their genes are often found as part of a carbon monoxide dehydrogenase (Cox) gene cluster. Indeed, 25 of the 46 APE2220 genes found near these VWA proteins appear to be part of a carbon monoxide/xanthine dehydrogenase-type gene cluster, with the specific gene arrangement and composition varying between organisms. The remaining 21 APE2220 members have a varied genomic environment.

Although no experimental work has been done on members of the APE2220 subfamily, it seems reasonable to assume that those proteins whose genes are found in association with the dehydrogenase cluster are likely to be involved in the biogenesis and/or activation of these enzymes in a manner analogous to that of other MoxR AAA+ proteins. The carbon monoxide and xanthine dehydrogenase enzymes are both members of the molybdenum hydroxylase family, and are dependent upon metal cofactors for their activity (Hille 2005). It is tempting to speculate that the APE2220 and their associated VWA proteins may be important for proper insertion of these cofactors into these enzymes.

2.4.6 PA2707 Subfamily

Of the 596 MoxR AAA+ proteins used in our analysis, 64 belong to the PA2707 subfamily (Fig. 10 and Table 1). These can be found in a total of 52 of the 275 organisms studied, including members of the Actinobacteria, Chlorobi, Cyanobacteria, Proteobacteria (all major subphyla), and Spirochaetes phyla. No archaeal members were detected (Fig. 12).

Of the 64 PA2707 genes, 45 (70%) are located in close proximity to genes encoding VWA proteins (Fig. 13F). The position of the VWA genes with respect to the PA2707 genes varies between organisms, although they are adjacent in about 60% of the cases. No PA2707 gene is in proximity to more than a single VWA-encoding gene. All of the VWA proteins are identified as CoxE type, with all but 3 belonging to COG3552 CoxE VWA, similar to the VWA proteins near APE2220 subfamily members. All of the VWA proteins also belong to COG3825, an uncharacterized class of bacterial proteins whose function is not known. Interestingly, and in contrast to members of the APE2220 subfamily, these genes do not appear to be part of a cox gene cluster, although a small subset (~23%) do appear to be close to genes encoding a cytochrome c. It is interesting to note that Trichodesmium erythraeum IMS101, a filamentous cyanobacterium found in nutrient-poor tropical and subtropical ocean waters, contains 9 PA2707 subfamily genes, two of which are adjacent to VWA-encoding genes. No research has been performed to date on any of the members of the PA2707 subfamily.

2.4.7 YehL Subfamily

The YehL subfamily is the smallest MoxR subfamily detected in our study. Of the 596 MoxR AAA+ ATPases we identified, only 14 (2.3%) belong to the YehL subfamily (Fig. 10 and Table 1). These are distributed across 12 of the 275 organisms in our analysis (Fig. 12). These organisms include members of the Actinobacteria and Proteobacteria (Beta and Gamma subdivisions), as well as single members from the Bacteroidetes, Planctomycetes, and Spirochaetes phyla. A lone archaeal YehL was also identified in Aeropyrum pernix K1. The Actinobacteria Streptomyces avermitilis MA-4680 and Streptomyces coelicolor A3(2) each contain two different yehL genes.

Examination of yehL gene structure (Fig. 13G) reveals that these genes are also found in close proximity to genes encoding VWA proteins, once again suggesting the possibility that these genes function together. Of the 14 yehL genes, only the representative from Kineococcus radiotolerans is not in close proximity to a VWA gene. Since the sequencing around the yehL gene in this organism is incomplete, however, the possibility that an as yet undetected VWA encoding gene may be nearby cannot be dismissed.

Twelve of the 14 yehL genes are also adjacent to a moderately conserved gene encoding a long protein of unknown function. This gene, which we will refer to as yehM after the Escherichia coli K12 homologue, encodes a product with no recognizable domains or motifs. The yehM gene is not detected in Aeropyrum pernix and Kineococcus radiotolerans, although once again the latter may be due to incomplete sequencing of the gene region. In Rhodopirellula baltica SH1, the yehM and VWA genes are fused together, encoding a predicted product of ~1300 amino acids in length with a C-terminal VWA domain. The same is observed for one of the two yehM genes detected in Streptomyces coelicolor A3(2). In all but one of the remaining cases, the yehM gene is located between the yehL and VWA genes. In all of these cases it encodes a protein of ~700-800 amino acids in length.

The conserved association of the yehL, yehM, and VWA encoding genes strongly suggests that they may function together. To date, no experimental work has been performed on members of the YehL MoxR subfamily, so information about their roles in the cell is not available. The extremely limited distribution of these proteins, however, may suggest a highly specialized role, perhaps acting as molecular chaperones for a single or very narrow range of substrate proteins.

2.5 Conclusions

The large size and remarkable diversity of the MoxR AAA+ family emphasize the importance of this particular class of proteins. Surprisingly, however, relatively little is known about these enzymes, and the specific functions of the MoxR AAA+ proteins remain elusive.

Our phylogenetic analysis has identified at least seven distinct MoxR AAA+ subfamilies. Analysis of adjacent genes reveals distinctly different patterns of neighboring genes for each of these subfamilies, although one major commonality between all of the subfamilies is a tendency to be near genes encoding proteins containing VWA domains.
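The subfamily sizes and VWA co-occurrence counts quoted in Sections 2.4.2 to 2.4.7 can be cross-checked with a few lines of arithmetic. The following is a minimal sketch in Python and is not part of the original analysis pipeline. The counts are copied from the text of those sections; the YehL value of 13 is inferred from the statement that only one of the 14 yehL genes lacks a nearby VWA gene. The names SUBFAMILY_COUNTS, VWA_NEIGHBOURS, percent, subfamily_share and vwa_fraction are hypothetical and introduced only for illustration.

```python
# Counts taken from the text of Sections 2.4.2-2.4.7 (this is a tally, not new data).
TOTAL_MOXR = 596  # MoxR AAA+ proteins included in the analysis

# Number of proteins assigned to each subfamily described in those sections.
SUBFAMILY_COUNTS = {
    "TM0930": 24,
    "RavA": 39,
    "CGN": 114,
    "APE2220": 49,
    "PA2707": 64,
    "YehL": 14,
}

# Number of genes in each subfamily reported near a VWA-encoding gene
# (only subfamilies for which the text gives an explicit count).
VWA_NEIGHBOURS = {
    "CGN": 83,
    "APE2220": 46,
    "PA2707": 45,
    "YehL": 13,  # inferred: 14 genes minus the single exception
}


def percent(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


def subfamily_share(name):
    """Percentage of all MoxR proteins that fall in the named subfamily."""
    return percent(SUBFAMILY_COUNTS[name], TOTAL_MOXR)


def vwa_fraction(name):
    """Percentage of a subfamily's genes found near a VWA-encoding gene."""
    return percent(VWA_NEIGHBOURS[name], SUBFAMILY_COUNTS[name])


if __name__ == "__main__":
    for name in SUBFAMILY_COUNTS:
        line = f"{name}: {subfamily_share(name)}% of MoxR proteins"
        if name in VWA_NEIGHBOURS:
            line += f", {vwa_fraction(name)}% near a VWA gene"
        print(line)
```

Rounded to the precision used in the text, these values agree with the percentages reported for the CGN (73%), PA2707 (70%) and YehL (2.3%) subfamilies.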

Analysis of the limited experimental research available on MoxR AAA+ suggests a role in the assembly and activation of protein complexes. Exactly how these proteins function is not clear, although a role in metal insertion seems a reasonable hypothesis. The similarity between the MoxR AAA+ proteins and the metal chelatase enzymes is notable, particularly in light of the fact that the latter enzymes utilize a VWA domain in conjunction with a AAA+ module to carry out the insertion of Co2+ and Mg2+ into porphyrin rings (Fodje et al. 2001). MoxR AAA+ proteins in conjunction with VWA domain-containing proteins might insert metals directly into proteins rather than into porphyrin rings or other related structures. This is supported by the experimental data demonstrating that the activities of MDH, NO reductases, and RuBisCO are dependent on the presence of metal cofactors, and that these activities are disrupted or reduced upon deletion of the respective MoxR AAA+ and/or associated VWA proteins. Also, many members of the APE2220 AAA+ subfamily appear to associate with carbon monoxide dehydrogenase/xanthine dehydrogenase-type gene clusters, which encode enzymes dependent upon metal cofactors for their function.

The possibility that some MoxR proteins may have different chaperone-like functions not involving metal insertion must also be considered. Gas vesicles, whose assembly appears to be dependent upon the CGN-type MoxR AAA+ protein, GvpN, are not reported to incorporate metal ions into their structure. Notably, none of the GvpN-encoding genes appears to occur in proximity to a VWA-encoding gene. Thus, it is possible that MoxR-type proteins not found in association with VWA proteins do not play a role in metal insertion events, but rather have different chaperone-type functions.

Considering the size and diversity of the MoxR AAA+ protein family, it is remarkable that so little is known about it. Continued research in this area is essential and is sure to provide fascinating insights into the function of what appears to be a novel class of molecular chaperones.

3. Formation of a Distinctive Complex Between the Inducible Bacterial Lysine Decarboxylase and a Novel AAA+ ATPase

Publication Details: Snider J, Gutsche I, Lin M, Baby S, Cox B, Butland G, Greenblatt J, Emili A, Houry WA. (2006). Formation of a Distinctive Complex Between the Inducible Bacterial Lysine Decarboxylase and a Novel AAA+ ATPase. J Biol Chem. 281(3), 1532-1546.

Please note that the material in this chapter was published prior to that presented in Chapter 2, and that the bioinformatic analysis of the MoxR AAA+ proteins presented here is a precursor to the more thorough analysis of Chapter 2.

Data Attribution: I performed the majority of the experiments in this chapter, with the following exceptions. The electron microscopy work was performed by Dr. Irina Gutsche¹ (Fig. 22). The domain mapping experiment was conducted by Dr. Sabulal Baby² and Mr. Brian Cox³ (Fig. 18A, B). The western blot in the presence/absence of acid shock was performed by Ms. Michelle Lin² (Fig. 23E). The RavA-TAP pulldown data were provided by Dr. Gareth Butland⁴, Dr. Andrew Emili⁴ and Dr. Jack Greenblatt⁴ (Fig. 21A).

¹ Laboratoire de Virologie Moléculaire et Structurale, UMR 2472/1157, CNRS-Institut National de la Recherche Agronomique and Institut Fédératif de Recherche 115, 1 Avenue de la Terrasse, 91198 Gif-sur-Yvette Cedex, France.

² Department of Biochemistry, University of Toronto, Toronto, Ontario M5S 1A8, Canada.

³ Samuel Lunenfeld Research Institute, Mount Sinai Hospital and Department of Medical Genetics and Microbiology, University of Toronto, Toronto, Ontario M5S 1A8, Canada.

⁴ Banting and Best Department of Medical Research and Department of Medical Genetics and Microbiology, University of Toronto, Toronto, Ontario M5S 1A8, Canada.

3.1 Summary

AAA+ ATPases are ubiquitous proteins that employ the energy obtained from ATP hydrolysis to remodel proteins, DNA, or RNA. The MoxR family of AAA+ proteins is widespread throughout bacteria and archaea, but is largely uncharacterized. Limited work with specific members has suggested a potential role as molecular chaperones involved in the assembly of protein complexes. As part of an effort aimed at determining the function of novel AAA+ chaperones in Escherichia coli, we report the characterization of a representative member of the MoxR family, YieN, which we have renamed RavA (Regulatory ATPase Variant A). We show that the ravA gene forms an operon with a second gene encoding YieM, a protein of unknown function that contains a Von Willebrand Factor Type A (VWA) domain. RavA expression is under the control of the σS transcription factor, and its levels increase towards late log/early stationary phase, consistent with its possible role as a general stress response protein. RavA functions as an ATPase and forms hexameric oligomers. Importantly, we demonstrate that RavA interacts strongly with inducible lysine decarboxylase (LdcI or CadA), forming a large cage-like structure consisting of two LdcI decamers linked by a maximum of five RavA oligomers. Surprisingly, the activity of LdcI does not appear to be affected by binding to RavA in a number of in vitro and in vivo assays; however, complex formation results in the stimulation of RavA ATPase activity. Data obtained suggest that the RavA-LdcI interaction may be important for the regulation of RavA activity against its targets.

3.2 Introduction

The AAA+ (ATPases Associated with various cellular Activities) superfamily of proteins comprises a large and functionally diverse class of P-loop NTPases. Members of this superfamily are involved in a wide range of cellular processes, including protein refolding/degradation, transcriptional regulation, ribosome and organelle biogenesis, DNA repair and replication, and molecular transport events (Iyer et al. 2004; Neuwald et al. 1999). Proteins belonging to this superfamily contain one or more copies of the AAA+ module, which is a 200-250 amino acid region containing several distinct conserved motifs (Kunau et al. 1993; Neuwald et al. 1999; Patel and Latterich 1998) including the Walker A and Walker B consensus sequences. AAA+ modules are responsible for ATP binding and hydrolysis, the energy of which is then harnessed for use in molecular rearrangement events (Ogura and Wilkinson 2001). AAA+ proteins oligomerize, typically forming ring-shaped hexamers (Hanson and Whiteheart 2005).

Recent phylogenetic analyses using sequence and structural information have identified numerous families of AAA+ proteins (Beyer 1997; Frickey and Lupas 2004; Iyer et al. 2004; Neuwald et al. 1999). One major but poorly characterized family is the MoxR family, which is widespread throughout bacteria and archaea, with members being represented in all major lineages (Frickey and Lupas 2004; Iyer et al. 2004; Neuwald et al. 1999). The exact functional role of MoxR proteins is unclear, although limited work with members of this family suggests a chaperone-like role in the assembly/activation of specific protein complexes. The first characterized member of this family is MoxR in Paracoccus denitrificans, which was shown to be important in the biogenesis of methanol dehydrogenase (MDH). Cells in which the moxR gene was disrupted produced wild-type levels of methanol dehydrogenase, but, surprisingly, the enzyme was not functional (Van Spanning et al. 1991). Another MoxR-related protein, NorQ/NirQ, was shown to be important in the biogenesis of nitric oxide reductase. As with MoxR and MDH, disruptions of the norQ/nirQ gene resulted in the production of non-functional nitric oxide reductase enzyme in Pseudomonas stutzeri, Paracoccus denitrificans, and Rhodobacter sphaeroides (Bartnikas et al. 1997; de Boer et al. 1996; Jungst and Zumft 1992). Also, the cbbQ gene product, a MoxR-related protein, was shown to alter RuBisCO conformation and stimulate its activity upon overexpression of recombinant forms of the proteins together in an Escherichia coli system (Hayashi et al. 1997; Hayashi et al. 1999). Yet another MoxR family member, GvpN, has been shown to be important in gas vesicle biogenesis in certain organisms (DasSarma et al. 1994; Englert et al. 1992; Mlouka et al. 2004; Offner et al. 1996). The exact role and mechanism of function of MoxR proteins in all of these processes are not clear.

Genes encoding MoxR proteins frequently co-occur in close proximity to genes encoding proteins containing Von Willebrand Factor Type A (VWA) domains, suggesting that these gene products may be functionally linked. For example, the MoxR family members, norQ and cbbQ genes, are encoded on the same operon as, and immediately upstream of, the norD and cbbO genes, respectively, which encode VWA proteins. Both NorD and CbbO were shown to be important in the activation of functional nitric oxide reductase and RuBisCO, respectively

(Bartnikas et al. 1997; de Boer et al. 1996; Hayashi et al. 1997). The VWA domain is a metal-binding domain that often mediates protein-protein interactions (Whittaker and Hynes 2002). The metal is typically magnesium. Metal binding occurs through a non-contiguous metal ion-dependent adhesion site (MIDAS), which is important for binding protein ligands (Xiong et al.

2002). VWA proteins have been well-studied in eukaryotes, where they are involved in a range of processes including cell-adhesion, transport, the complement system, proteolysis, transcription, DNA repair, and ribosome biogenesis. Bacterial and archaeal VWA proteins are not as well characterized, though limited work has identified members involved in surface adhesion, serum opacity, fibrinogen binding, and metal insertion into protoporphyrin IX

(Kachlany et al. 2000; Katerov et al. 2000; Willows 2003). Thus, in order to carry out their varied remodeling functions, many MoxR AAA+ proteins probably work in conjunction with

VWA domain-containing proteins.

As part of a research project aimed at identifying novel AAA+ chaperones in

Escherichia coli, we became interested in the uncharacterized protein YieN (SwissProt number

P31473), which defines a MoxR subfamily (Iyer et al. 2004). We have renamed this protein

RavA (Regulatory ATPase Variant A). Here, we show that RavA is a cytoplasmic, hexameric

AAA+ protein that functions with a VWA domain-containing protein, YieM (SwissProt number

P03818), which we termed ViaA (VWA Interacting with AAA+ ATPase). Detailed information is presented on RavA gene organization, expression patterns, protein subcellular localization, enzymatic activity, and oligomerization properties. Furthermore, we show that RavA strongly interacts with the inducible lysine decarboxylase (LdcI or CadA, SwissProt number P0A9H3), an enzyme important in bacterial acid stress response, to form a distinctive five-fold symmetric cage-like complex as determined by electron microscopy.

3.3 Materials and Methods

3.3.1 Bioinformatics

A total of 156 proteins belonging to the MoxR AAA+ family, as defined by the Clusters of Orthologous Groups (MoxR COG0714), were selected for analysis (Tatusov et al. 2003).

These sequences included those currently identified in the COG database as belonging to

COG0714, as well as additional sequences extracted using the BLASTP program from 427 microbial genomes in the NCBI sequence database (Altschul et al. 1990). All additional sequences were analyzed using the COGNITOR program to ensure that they belonged to

COG0714 (Tatusov et al. 2000). A multiple sequence alignment was then constructed using the

CLUSTAL W program (Thompson et al. 1994). The phylogenetic tree was generated using the

PROTDIST and FITCH programs included in the PHYLIP package (Felsenstein 1996). A PMB model of substitution (Veerassamy et al. 2003) and global rearrangement were used. Default settings were used for all other parameters.

36 sequences identified as belonging to the RavA subfamily of MoxR AAA+ were selected and aligned using CLUSTAL W. This alignment was then used to identify residues conserved in >90% of sequences. Conservation was based upon the following amino acid groupings – ILVM, KR, DE, ST, YFW, NQ. Ungrouped amino acids include A, C, P, H and G.
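The conservation rule described above can be sketched as follows. This is a minimal illustration rather than the procedure actually used: the function name, the handling of gaps (counted against conservation), and the toy alignment are assumptions introduced here, while the residue groupings and the >90% cut-off come from the text.

```python
from collections import Counter

# Residue groupings taken from the text; A, C, P, H and G stay ungrouped.
GROUPS = ["ILVM", "KR", "DE", "ST", "YFW", "NQ"]
GROUP_OF = {aa: group for group in GROUPS for aa in group}


def conserved_columns(alignment, threshold=0.9):
    """Return 0-based columns where more than `threshold` of all sequences
    fall in the same residue group. Gaps never count as conserved
    (an assumption of this sketch)."""
    n_seqs = len(alignment)
    conserved = []
    for col in range(len(alignment[0])):
        counts = Counter(
            GROUP_OF.get(seq[col], seq[col])  # ungrouped residues map to themselves
            for seq in alignment
            if seq[col] != "-"
        )
        if counts and max(counts.values()) / n_seqs > threshold:
            conserved.append(col)
    return conserved


# Hypothetical toy alignment (not thesis data).
toy_alignment = ["GIAKS", "GVAKT", "GLARS", "GMGKS"]
print(conserved_columns(toy_alignment))
```

In the toy alignment the third column (A, A, A, G) is shared by only three of four sequences and is therefore not reported, whereas the I/V/L/M column is, because all four residues belong to the ILVM group.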

3.3.2 Bacterial Strains

Wild-type MG1655 strain was purchased from ATCC (ATCC Number 47076). The

MG1655 ΔravA and ΔviaA strains were constructed from WT MG1655 using the Lambda Red system as described (Datsenko and Wanner 2000; Yu et al. 2000). Briefly, a knockout cassette containing the chloramphenicol resistance marker was amplified by PCR from pKD3 plasmid using primers RKO_forward (5´- agaaacgtctatactcgcaatttacgcagaacttttgacgaaagggtgtaggctggagctgcttc-3´) and RKO_reverse (5´- aatctgtgcaccgacatcctgtaggctggcttcaatgcgacctaacatatgaatatcctccttag-3´) to construct the ravA deletion strain, or VKO_forward (5´- ctgaaggcggcaatcactgatgatgttccccgctggcgtgaaggtgtaggctggagctgcttc-3´) and VKO_reverse (5´- gcgagagcgtcccttctctgctgtaataatttatcgccgccagcgcatatgaatatcctccttag-3´) to construct the viaA

(yieM_ECOLI) deletion strain. DY330 cells were inoculated from single colonies into 50 mL

LB media and grown at 30˚C with shaking to an OD600 = 0.6 – 0.8. 25 mL culture was heated in a 42˚C water bath with shaking for 15 minutes to induce expression of the Lambda Red genes exo, bet and gam. Cells were immediately cooled on ice for 10 minutes. Cells were made electrocompetent and then transformed with 1 – 2 μL of linear PCR product using a BioRad

Gene Pulser. Transformed cells were diluted into 1 mL LB and grown at 30˚C for 1 hour. Cells were then plated onto LB agar containing 25 μg/mL chloramphenicol and grown overnight at

30˚C. Resistant colonies were selected and screened for successful deletion of the gene using

PCR. The deletion cassette was then transferred from DY330 to MG1655 using P1Δdam rev6 transduction as described (Sternberg and Maurer 1991). Verification of the deletion strains was performed using PCR, DNA sequencing, Northern, and Western blot analyses. It should be noted that deletion of ravA resulted in a strain that also cannot transcribe viaA (see Fig. 13A,B).
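The layout of the knockout primers listed above can be checked with a short script. In this sketch, each primer is split into a 5´ gene-specific homology arm and a 3´ segment shared by all forward (or all reverse) primers; treating that shared segment as the part that anneals to the pKD3 template is an assumption made here for illustration, and the dictionary and function names are hypothetical.

```python
# Shared 3' ends of the primers listed in the text; assumed here to be the
# segments that prime on the pKD3 template.
PRIMING_SITES = {
    "forward": "gtgtaggctggagctgcttc",
    "reverse": "catatgaatatcctccttag",
}

# Primer sequences copied from the text (5' to 3').
KNOCKOUT_PRIMERS = {
    "RKO_forward": "agaaacgtctatactcgcaatttacgcagaacttttgacgaaagggtgtaggctggagctgcttc",
    "RKO_reverse": "aatctgtgcaccgacatcctgtaggctggcttcaatgcgacctaacatatgaatatcctccttag",
    "VKO_forward": "ctgaaggcggcaatcactgatgatgttccccgctggcgtgaaggtgtaggctggagctgcttc",
    "VKO_reverse": "gcgagagcgtcccttctctgctgtaataatttatcgccgccagcgcatatgaatatcctccttag",
}


def split_primer(name):
    """Split a knockout primer into (homology_arm, priming_site)."""
    primer = KNOCKOUT_PRIMERS[name]
    site = PRIMING_SITES[name.split("_")[1]]
    if not primer.endswith(site):
        raise ValueError(f"{name} does not end with the expected priming site")
    return primer[: -len(site)], site


for primer_name in KNOCKOUT_PRIMERS:
    arm, site = split_primer(primer_name)
    print(primer_name, len(arm), "nt homology arm +", len(site), "nt priming site")
```

The homology arms direct recombination to the ravA or viaA locus, so a check of this kind is a quick way to catch a truncated or mistyped primer before ordering.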

The RavA-TAP-encoding strain was constructed using the Lambda Red system in a manner similar to that described for the construction of the knock-out strains. Specific details on strain construction, as well as on the pull-down and identification of bands by mass

spectrometry are provided elsewhere (Butland et al. 2005).

MG1655 ΔrpoS and ΔcadB strains were obtained from the E. coli genome project at the

University of Wisconsin, Madison, USA.

3.3.3 Gene Cloning

ravA (yieN_ECOLI), ldcI (cadA/dcly_ECOLI), and ldcC (dclz_ECOLI) genes were cloned from E. coli K12 MG1655, while viaA was cloned from E. coli O157:H7. It should be noted that the amino acid sequences of ViaA from the E. coli K12 MG1655 and O157:H7 strains are identical. Primers used were as follows: RAVA_forward (5´-atcacgatcatatggctcaccctcatttattag-

3´), RAVA_reverse (5´-tacgtaggatccttagcattgttgtgcctggcg-3´), VIAA_forward (5´- aactcgatcatatgctaacgctggatacg-3´), VIAA_reverse (5´-ctatggatccttatcgccgccagcgtctgagc-3´),

LDCI_forward (5´-atgctaccatatgaacgttattgcaatattg-3´), LDCI_reverse (5´- ctggatccttatttcttgctttcttctttcaatacc-3´), LDCC_forward (5´-tctaccatgggcatgaacatcattgccattatg-3´), and LDCC_reverse (5´-cacctcgagttatcccgccatttttaggac-3´).

ravA was cloned into p11 plasmid (Zhang et al. 2001), a modified pET15b which adds

His6 followed by a tobacco etch virus (TEV) cut site to the N-terminus of the expressed protein

(HV-tag). viaA and ldcI were cloned into pET3a (Novagen) which expresses untagged proteins. ldcC was cloned into pET16b such that it was expressed untagged, with the exception of additional N-terminal Gly and Met residues. All constructs were verified by DNA sequencing.

3.3.4 Protein Expression and Purification

Proteins were expressed in BL21(DE3) pLysS (Stratagene). RavA, ViaA, and LdcI protein expression was induced by addition of 0.4 mM IPTG to cultures grown to midlog phase.

Cells expressing RavA were grown overnight at 30˚C in the presence of inducer, and those expressing ViaA were induced at 18˚C overnight. Cells expressing LdcI were induced for 3 hours at 37˚C. LdcC-expressing cells were grown at 37˚C for 24 hours without the addition of

IPTG. Following induction, cells were harvested by centrifugation at 3,000 x g for 20 minutes and stored at -80˚C pending purification.

RavA was purified using Qiagen Ni-NTA agarose beads according to manufacturer’s protocols. The HV-tag was removed by digestion with TEV protease at 1:10 molar ratio of TEV to protein for 5 hours at 4˚C. The protein was further purified using a MonoS HR 5/5 column.

ViaA was precipitated from cell lysate using 30% ammonium sulfate. Pellets were resuspended and exchanged into a no salt buffer using PD10 desalting columns. The protein was further purified on a Heparin HP5 column.

Cells overexpressing LdcI were lysed in buffer LI1 (25 mM TrisHCl, pH 7.5, 300 mM

NaCl, 0.1 mM Pyridoxal-5´-Phosphate, 5% Glycerol, and 1 mM DTT). Cell debris was removed by centrifugation. The supernatant was heated at 70˚C for 5 minutes, which precipitated the bulk of E. coli proteins while most of the LdcI remained soluble (Sabo et al.

1974). Precipitate was removed by centrifugation at 30,000 x g for 15 minutes. LdcI in the supernatant was further purified by ion exchange chromatography using a MonoQ HR 5/5 column and then by size exclusion chromatography using a Superdex 200 HR 10/30 column.

LdcC was precipitated from cell lysate using 30% ammonium sulfate and the pellets were resuspended and exchanged into low salt buffer by dialysis. The protein was further purified using a MonoQ HR 5/5 column.

The final yields of purified proteins per 1 L of culture were as follows: 6 mg for RavA, 2 mg for ViaA, 10 mg for LdcI, and 5 mg for LdcC. Proteins were judged to be more than 95% pure by SDS-PAGE analysis. All chromatography columns used were from Amersham

Biosciences. Protein concentration was determined using BioRad Protein Assay.

3.3.5 Measurement of RavA ATPase Activity

RavA ATPase activity was measured by determining the free phosphate released after

ATP hydrolysis using a colorimetric assay similar to that reported previously (Lanzetta et al.

1979). Briefly, RavA was placed in buffer RA1 (5 mM MgCl2, 0.02% Triton X-100, and 1 mM

DTT) containing various concentrations of ATP substrate. RA1 solutions of different pH were generated using 10 mM concentrations of citrate, acetate, MES, HEPES or TAPS buffer. Where applicable, BSA, ViaA, LdcI, and LdcC were included in the buffer. Assays were performed in

100 μL volumes at 37˚C in eppendorf tubes or in a quartz plate. At different time points, 10 μL aliquots of the assay mixture were removed and added to 200 μL RA2 (1 mM malachite green,

8.5 mM ammonium molybdate, and 1 M HCl). Color development was allowed to proceed for 1 minute, and then was stopped by the addition of 25 μL of 37% citric acid. Absorbance was measured at 660 nm and converted to moles of phosphate produced using a KH2PO4 standard curve.
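The conversion from absorbance to phosphate can be illustrated with a linear standard curve. The sketch below is an assumption-laden example, not the analysis actually performed: the standard values and readings are hypothetical, the curve is assumed to be linear over the range used, and the function names are introduced here.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical KH2PO4 standards: nmol phosphate per aliquot vs. A660.
standard_nmol = [0.0, 1.0, 2.0, 4.0, 8.0]
standard_a660 = [0.05, 0.15, 0.25, 0.45, 0.85]
slope, intercept = fit_line(standard_nmol, standard_a660)


def phosphate_nmol(a660):
    """Convert an A660 reading to nmol phosphate via the standard curve."""
    return (a660 - intercept) / slope


def hydrolysis_rate(times_min, a660_readings):
    """Slope of phosphate (nmol per aliquot) against time (min)."""
    rate, _ = fit_line(times_min, [phosphate_nmol(a) for a in a660_readings])
    return rate


# Hypothetical time course of aliquots taken at 0, 5 and 10 minutes.
print(hydrolysis_rate([0.0, 5.0, 10.0], [0.05, 0.15, 0.25]))
```

Dividing the fitted rate by the amount of RavA in each aliquot would give a specific activity; that step is left out because it depends on the protein concentration used in a given experiment.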

3.3.6 Preparation of Cell Extracts for LdcI Activity Assays

WT and ΔravA cells were grown in LB media containing 0.4% glucose to an OD600 of

3.0 to 4.0 at 37°C. Cells were pelleted by centrifugation, resuspended in lysis buffer LT1 (25 mM HEPES, pH 7.0, 300 mM NaCl, 10% Glycerol, 5 mM β-mercaptoethanol) and lysed by sonication. The lysates were cleared of debris by centrifugation, flash frozen on liquid N2 and stored at -80°C until use in assays.

3.3.7 Measurement of LdcI Lysine Decarboxylase Activity

The ability of LdcI to generate cadaverine from lysine was measured as previously described with slight modification (Phan et al. 1982). Purified LdcI was dissolved in buffer LA1

(2.5 mM MgCl2, 1 mM ATP, 0.1 mM PLP, and 4 mM lysine). LA1 solutions of different pHs were generated using acetate, HEPES or TAPS buffers at a concentration of 50-100 mM. Total cell protein (25 μg/assay) was dissolved in buffer LA2 (25 mM MES, pH 6.0, 0.1 mM PLP, and

4 mM lysine). Assays were conducted in 100 μL volumes at 37˚C using either a quartz plate or eppendorf tubes. At specific time points, 15 μL aliquots were removed and added to an equal

volume of 1 M Na2CO3 stop solution. Subsequently, 15 μL of 10 mM 2,4,6-trinitrobenzenesulfonic acid (TNBS) was added and samples were heated at 40˚C for 5 minutes to allow the formation of cadaverine-TNBS and lysine-TNBS adducts. Cadaverine-TNBS

(N,N´-bistrinitrophenylcadaverine) was extracted using 500 μL toluene and the absorbance at

340 nm was measured. An extinction coefficient of 2.5 × 10⁴ M⁻¹ cm⁻¹ was used (Phan et al. 1982) to convert absorbance into moles of cadaverine produced.
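The conversion from absorbance to cadaverine follows the Beer-Lambert law, A = ε·c·l. The sketch below assumes a 1 cm path length and assumes that all of the cadaverine-TNBS adduct is recovered in the 500 μL toluene phase; neither assumption is stated in the text, and the function name is hypothetical.

```python
EXTINCTION_M_CM = 2.5e4      # M^-1 cm^-1, value quoted in the text
TOLUENE_VOLUME_L = 500e-6    # 500 uL toluene extraction volume from the text


def cadaverine_nmol(a340, path_cm=1.0, volume_l=TOLUENE_VOLUME_L):
    """Convert A340 of the toluene phase to nmol of cadaverine.

    Assumes a 1 cm path length and complete extraction of the adduct
    into the toluene phase (both are assumptions of this sketch).
    """
    concentration_m = a340 / (EXTINCTION_M_CM * path_cm)  # Beer-Lambert law
    return concentration_m * volume_l * 1e9               # mol -> nmol


# Hypothetical reading: an A340 of 0.5 corresponds to 10 nmol under these assumptions.
print(cadaverine_nmol(0.5))
```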

3.3.8 Growth in LdcI Indicator Media

E. coli K12 MG1655 WT, ΔravA, and ΔcadB cells were grown in lysine decarboxylase indicator media DI1 (0.3% yeast extract, 0.1% glucose, 0.5% lysine, and 0.0016% bromocresol purple pH indicator) at 37°C. Samples were taken at selected time points, cells were pelleted by centrifugation, and the absorbance at 590 nm of the bromocresol purple indicator in the cleared lysates was measured.

3.3.9 Extreme Acid Shock Assays

Extreme acid shock assays were adapted from previously described protocols (Iyer et al.

2003). Briefly, E. coli K12 MG1655 WT and ΔravA cells were grown in LB media containing

2% glucose at 37°C for 18 hours. Cells were diluted 1/100 into pH 2.5 acid shock media AS1

(40 mM KCl, 80 mM KH2PO4, 33 mM H3PO4, 1.7 mM sodium citrate, and 20 mM glucose) with and without 1 mM lysine. Samples were incubated at 37°C for 2 hours and then serially diluted in LB media in a 96-well plate. Diluted cells were then spotted onto an LB agar plate and grown up at 37°C overnight. Growth on plates was examined to determine cell viability.

3.3.10 Northern Blot Analysis

E. coli MG1655 cultures were grown in LB media at 37˚C to an OD600 of 0.6-0.8. Total

RNA was isolated using the single step acid guanidinium thiocyanate-phenol-chloroform

extraction method as described (Chomczynski and Sacchi 1987). Briefly, 1.5 mL samples of culture were centrifuged and pelleted cells were resuspended in 500 μL buffer NB1 (4 M guanidinium thiocyanate, 25 mM sodium citrate, 0.5% sodium lauryl sarcosinate, and 0.1 M β-mercaptoethanol). Samples were then passed four times through a 25 gauge needle to facilitate lysis. Next, 0.05 mL of 2 M sodium acetate (pH 4.0), 0.5 mL phenol, and 0.1 mL chloroform-isoamyl alcohol were added sequentially to the sample. After each addition, the tube was capped and mixed thoroughly by inversion. Samples were then vortexed briefly, incubated on ice for 15 minutes and centrifuged at 10,000 x g for 20 minutes at 4˚C. The upper aqueous phase was transferred to a fresh tube and an equal volume of isopropanol was added. The solution was mixed thoroughly and incubated at -20˚C for 1 hour. RNA precipitate was collected by centrifugation at 10,000 x g for 30 minutes at 4˚C. The isopropanol was carefully decanted and the pellet was resuspended in 0.15 mL buffer NB1. After thorough mixing, the sample was precipitated again using isopropanol. After centrifugation, the RNA pellet was washed twice with 75% ethanol and the pellet was resuspended in 50 μL DEPC water. Samples were stored at

-80˚C.

20 μg of total RNA were separated on a 1.2% agarose gel containing 6.7% formaldehyde.

RNA was transferred to Hybond N+ Nylon membrane (Amersham) and was covalently cross-linked to the membrane by UV irradiation using a Stratalinker UV Crosslinker (Stratagene).

Probe generation and Northern blotting were carried out using the AlkPhos Direct Labeling and

Detection System (Amersham). For RavA, the DNA probe used was complementary to bp 623-

1497 of the ravA gene, while for ViaA, the DNA probe used was complementary to bp 719-

1452 of the viaA gene.

3.3.11 RT-PCR

RT-PCR analysis was performed using the Qiagen OneStep RT-PCR kit as described in the manufacturer’s instructions. 2 μg of E. coli K12 MG1655 total RNA was used as a template.

Amplification was carried out using primers RAVA_forward2 (5´- tatcacggcgccatggctcaccctcatttattag-3´) and VIAA_reverse and primers RAVA_internal_forward

(5´-gcgaattccatatgcaacaaattgatgtattgatgaccg-3´) and VIAA_internal_reverse (5´- cgttcgatcactttttcacg-3´).

3.3.12 Western Blot Analysis

Typically, cell pellets were lysed by sonication and lysates were centrifuged to remove cellular debris. Supernatant was transferred to a fresh tube and protein concentration was determined using the BioRad Protein Assay. A total of 5 to 40 μg of protein was separated on a

10% SDS-PAGE gel and transferred to Biotrace PVDF membrane using a BioRad semidry transfer apparatus as per manufacturer’s instructions. Membranes were then incubated with rabbit polyclonal anti-RavA or anti-LdcI antibodies generated at the Division of Comparative

Medicine, University of Toronto. Membranes were then washed and incubated with protein LA-peroxidase. ECL detection reagent (Amersham Biosciences) was used to visualize protein bands on a Bioflex chemiluminescence detection film (Clonex).

3.3.13 Subcellular Fractionation of E. coli

A volume of 200 mL of cells grown at 37˚C until an OD600 of 2.5 was divided into 2 x

100 mL and centrifuged at 4,000 x g for 10 min at 4˚C. Cell pellets were washed once with buffer SF1 (10 mM TrisHCl, pH 7.1, and 30 mM NaCl) and then centrifuged again. One pellet was used for the isolation of periplasmic and cytoplasmic proteins, and the other pellet was used for the isolation of membrane proteins. Periplasmic and cytoplasmic proteins were isolated using a protocol based on a previously described osmotic shock method (Lee and Ahn 2000), while total membrane proteins were isolated as previously described (Lemire and Weiner 1986).

3.3.14 Size Exclusion Chromatography

A calibrated Superose 6 HR 10/30 column connected to an AKTA FPLC system

(Amersham Biosciences) was used for size exclusion chromatography. Samples were typically run in buffer SE1 (25 mM TrisHCl, pH 7.5, 300 mM NaCl, 2.5 mM MgCl2, and 1 mM DTT) in the presence or absence of 1 mM ATP. 0.1 mM PLP was added to the buffer when LdcI or

LdcC were included in the sample. The column was calibrated using molecular weight standards

(Sigma): thyroglobulin (669 kDa), apoferritin (443 kDa), β-amylase (200 kDa), bovine serum albumin (66 kDa), carbonic anhydrase (29 kDa) and cytochrome C (12.4 kDa). All experiments were performed at 10ºC, and absorbance was monitored at 280 nm.
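A column calibration of this kind is commonly used by fitting log10(molecular weight) against elution volume and reading apparent masses off the fitted line. The sketch below illustrates the idea; the molecular weights are those of the standards listed above, but the elution volumes are hypothetical placeholders (none are given in the text), and the linear log-mass model is an assumption of this example.

```python
import math

# Molecular weights (kDa) of the standards listed in the text.
STANDARDS_KDA = {
    "thyroglobulin": 669.0,
    "apoferritin": 443.0,
    "beta-amylase": 200.0,
    "bovine serum albumin": 66.0,
    "carbonic anhydrase": 29.0,
    "cytochrome C": 12.4,
}

# Hypothetical elution volumes (mL); real values depend on the column run.
ELUTION_ML = {
    "thyroglobulin": 12.5,
    "apoferritin": 13.1,
    "beta-amylase": 14.3,
    "bovine serum albumin": 16.0,
    "carbonic anhydrase": 17.3,
    "cytochrome C": 18.6,
}


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


names = list(STANDARDS_KDA)
cal_slope, cal_intercept = fit_line(
    [ELUTION_ML[n] for n in names],
    [math.log10(STANDARDS_KDA[n]) for n in names],
)


def apparent_mass_kda(elution_ml):
    """Apparent molecular weight (kDa) at a given elution volume."""
    return 10 ** (cal_slope * elution_ml + cal_intercept)


print(round(apparent_mass_kda(14.0)))  # apparent mass for a hypothetical peak
```

Apparent masses from size exclusion depend on molecular shape as well as mass, which is one reason an orthogonal method such as sedimentation equilibrium is useful alongside it.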

3.3.15 Analytical Ultracentrifugation Analysis of RavA

Sedimentation equilibrium experiments on RavA were carried out at the

Ultracentrifugation Service Facility in the Department of Biochemistry at the University of

Toronto. 20 μM RavA protein in Buffer AU1 (25 mM Tris, pH 7.5, 300 mM NaCl, 1 mM DTT,

10% glycerol) containing 0.2 mM ATP was spun at 5,000 and 8,000 rpm, at 4°C, in an An-60 Ti

Rotor in a Beckman Optima XL-A analytical ultracentrifuge. Absorbance was recorded at 230 and 280 nm. Data analysis was performed using the Origin Microcal XL-A/CL-I Data Analysis

Software Package Version 4.0.

3.3.16 Mapping of RavA Domains

7.5 μg purified RavA protein was dissolved in 140 μL buffer RD1 (10 mM Tris, pH 7.5,

100 mM KCl, 10 mM MgCl2, 2.5 mM DTT, 5 mM ATP) and digested with 0.075 μg trypsin

(Roche, 1418475) at 4°C. At specific time points aliquots of digested protein were removed and separated using SDS-PAGE. Sample preparation of bands for mass spectrometry was adapted from a previously described method (Shevchenko et al. 1996). Briefly, the samples were processed in 500 μL PCR grade tubes. The gel band was cubed and washed sequentially in

HPLC grade water (VWR, EM-WX0004-1) and 100 mM ammonium bicarbonate (Sigma,

A6141). The gel cubes were dehydrated between the following steps with 100% acetonitrile

(VWR, EM-AX0151-1): gel cubes were reduced with 1 mM DTT (Sigma D9163) at 50°C for 30 minutes and alkylated with 55 mM iodoacetamide (Sigma, I1149) at room temperature for 20 minutes in the dark. Gel cubes were dehydrated with acetonitrile washes and speedvac treated to remove acetonitrile. Protein was digested by swelling the cubes with 50 μL of 0.1 ng/μL of trypsin in 50 mM ammonium carbonate and 0.1% calcium chloride overnight. The tryptic peptides were extracted from the gel slices by shaking with two washes of 150 μL 100 mM ammonium carbonate buffer at 37°C for 45 minutes. Liquid from the overnight digestion and two extractions were pooled on ice and acidified with 2 μL of 1% glacial acetic acid (VWR EM-

AX0073-6). Peptides were desalted and concentrated using C18 resin (Sigma, H8261) by a batch method. Final volume was 10 μL of extracted tryptic peptides in 65% acetonitrile and 1% acetic acid. A 1 μL aliquot of tryptic peptide sample was mixed with 1 μL of a saturated α-

CHCA (Sigma, C2020) solution in 30% acetonitrile and 1% acetic acid and spotted onto a stainless steel MALDI sample plate (Bruker Daltonics). For high mass bands, the samples were concentrated and desalted by a batch method similar to tryptic peptide clean up, but with the exception that C4 resin (Sigma) was used. The final volume was 15 μL in 65% acetonitrile and

1% acetic acid. 1 μL aliquot of partial proteolysis sample was mixed with 1 μL of a saturated sinapinic acid (Sigma, D7927) solution in 30% acetonitrile and 1% acetic acid and spotted onto a stainless steel MALDI sample plate.

High mass samples were analyzed in a Bruker Reflex III (Bruker Daltonics) in high mass linear mode with a high mass detector. Carbonic anhydrase (Sigma, C3934) and lysozyme

(Sigma, L7651) were used as external calibrants. Tryptic peptides were analyzed in a Bruker

Reflex III (Bruker Daltonics) in reflector mode. Samples were internally calibrated using trypsin peptide peaks. Mass mapping was done using PAWS (Genomic Solutions) against the known sequence of the protein. The tryptic peptide finger print of a gel band was used to determine an

approximate region for the partial proteolysis fragment from the full sequence. The observed high mass was then used to define the exact cut sites for the partial proteolysis fragments.
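The mass-mapping step can be illustrated with an in silico tryptic digest. The sketch below is not the PAWS procedure itself: it applies the usual rule of cleavage after Lys or Arg except before Pro, assumes no missed cleavages, treats cysteines as carbamidomethylated (consistent with the iodoacetamide step above), and uses only the first 70 residues of the E. coli RavA sequence shown in Fig. 17. Function names are hypothetical.

```python
# Monoisotopic residue masses (Da).
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056
PROTON = 1.00728
CARBAMIDOMETHYL = 57.02146  # added to Cys by iodoacetamide alkylation

# Residues 1-70 of E. coli K12 RavA, copied from Fig. 17.
RAVA_1_70 = "MAHPHLLAERISRLSSSLEKGLYERSHAIRLCLLAALSGESVFLLGPPGIAKSLIARRLKFAFQNARAFE"


def tryptic_digest(seq):
    """Cleave after K or R unless the next residue is P (no missed cleavages)."""
    peptides, start = [], 0
    for i, aa in enumerate(seq):
        next_aa = seq[i + 1] if i + 1 < len(seq) else ""
        if aa in "KR" and next_aa != "P":
            peptides.append(seq[start:i + 1])
            start = i + 1
    if start < len(seq):
        peptides.append(seq[start:])
    return peptides


def mh_plus(peptide):
    """Monoisotopic [M+H]+ of a peptide with carbamidomethylated cysteines."""
    mass = sum(MONO[aa] for aa in peptide) + WATER + PROTON
    return mass + CARBAMIDOMETHYL * peptide.count("C")


for pep in tryptic_digest(RAVA_1_70):
    print(f"{pep:25s} {mh_plus(pep):10.4f}")
```

Matching observed peaks against such a list within an instrument-appropriate tolerance is what localizes a gel band to a region of the sequence; the intact-fragment mass then narrows down the cut sites.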

3.3.17 Electron Microscopy

For preparation of negatively stained protein samples, 3 μL of protein solutions were applied to glow discharged continuous carbon copper grids and left for one minute before washing with two drops of 1% (w/v) uranyl acetate. The grids were observed with a Philips CM

12 transmission electron microscope with LaB6 filament at 80 kV. Images were recorded on

Kodak SO-163 films at 35,000x magnification in the case of RavA, LdcI, and ATPγS-RavA-

LdcI preparations and at 22,000x for the ADP-RavA-LdcI preparation. Negatives were digitized on a Zeiss SCAI scanner at a pixel size of 7 μm, corresponding, at the specimen level, to 0.2 nm and 0.318 nm at 35,000x and at 22,000x, respectively. Image processing was carried out on a

Linux workstation using the EM (Hegerl 1996; Hegerl and Altbauer 1982), EMAN (Ludtke et al.

1999), BSOFT (Heymann 2001), and PFT2 (Baker and Cheng 1996; Belmap 2004) software packages. Images were binned to 0.4 nm or to 0.636 nm at the specimen level.
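The pixel sizes quoted above follow directly from the scanner step size, the magnification, and the binning factor. A minimal check is sketched below; the function name is hypothetical, and a binning factor of 2 is inferred from the stated values rather than given explicitly in the text.

```python
def specimen_pixel_nm(scanner_pixel_um, magnification, binning=1):
    """Pixel size at the specimen level in nm.

    scanner_pixel_um: scanner step size in micrometres
    magnification:    nominal microscope magnification
    binning:          linear binning factor applied after digitization
    """
    return scanner_pixel_um * 1000.0 / magnification * binning


for mag in (35000, 22000):
    unbinned = specimen_pixel_nm(7.0, mag)
    binned = specimen_pixel_nm(7.0, mag, binning=2)  # binning factor assumed
    print(f"{mag}x: {unbinned:.3f} nm/pixel, binned {binned:.3f} nm/pixel")
```

These values use the nominal magnification; a calibrated magnification would shift the pixel size proportionally.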

3.3.18 Image Analysis

For LdcI, 3631 subframes of 88*88 pixels containing single particles were extracted interactively from several micrographs at a similar defocus of approximately 600 nm, and low-pass-filtered at 17 Å resolution in order to remain within the first zero of the contrast transfer function (CTF). This data set was translationally but not rotationally aligned relative to the rotationally averaged total sum of the individual images. The aligned data set was subjected to multivariate statistical analysis (MSA/EM), which clearly demonstrated the 5-fold symmetry of the LdcI particle. Characteristic class averages were then used as a set of references for multi-reference alignment (MRA/EM) followed by MSA and classification. The resulting class averages were similar to those obtained by reference free alignment and classification routine in

EMAN. Five characteristic views were selected to generate an initial 3D-model of the LdcI particle by cross common lines. Refinement of the 3D-model was, therefore, undertaken with a

D5-symmetry imposed (EMAN and PFT2/BSOFT). After convergence, the symmetry was relaxed and the absence of divergence was verified. The resolution of the reconstruction was determined via Fourier shell correlation to be around 20 Å according to the 0.5 criterion. The handedness was not determined.
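The choice of a 17 Å low-pass filter for the LdcI images can be checked against the position of the first CTF zero. The sketch below uses a simplified estimate that neglects spherical aberration and amplitude contrast, in which the first zero falls at a spacing of d = sqrt(λ·Δz); this simplification and the function names are assumptions of the example. The 80 kV accelerating voltage and approximately 600 nm defocus are taken from the text.

```python
import math

# Physical constants (SI units).
PLANCK = 6.62607015e-34
ELECTRON_MASS = 9.1093837015e-31
ELEMENTARY_CHARGE = 1.602176634e-19
LIGHT_SPEED = 299792458.0


def electron_wavelength_nm(kilovolts):
    """Relativistic electron wavelength in nm for a given accelerating voltage."""
    energy = ELEMENTARY_CHARGE * kilovolts * 1e3
    momentum = math.sqrt(
        2.0 * ELECTRON_MASS * energy
        * (1.0 + energy / (2.0 * ELECTRON_MASS * LIGHT_SPEED ** 2))
    )
    return PLANCK / momentum * 1e9


def first_ctf_zero_angstrom(defocus_nm, kilovolts):
    """Spacing (in angstroms) of the first CTF zero.

    Simplified estimate: spherical aberration and amplitude contrast are
    neglected, so the first zero lies at d = sqrt(wavelength * defocus).
    """
    return math.sqrt(electron_wavelength_nm(kilovolts) * defocus_nm) * 10.0


print(f"{first_ctf_zero_angstrom(600.0, 80.0):.1f} A")
```

Under these assumptions the first zero lies at roughly 16 Å, so filtering at 17 Å keeps the retained information on the low-resolution side of it, where the contrast has a single sign.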

For ADP-RavA-LdcI and ATPγS-RavA-LdcI complexes, the procedure was similar to the one described above for LdcI alone. 2902 subframes of 80*80 pixels (6.36 Å/pix) were extracted from the images of ADP-RavA-LdcI and 2629 subframes of 112*112 pixels (4 Å/pix) from the images of ATPγS-RavA-LdcI. The data were low-pass-filtered at 18 Å, which is again within the first zero of the CTF. In this case, the majority of particles were present in a side view orientation or as tilted views. Given the appearance of the side views it was obvious that the

RavA-LdcI complex looked like a sandwich of two LdcI oligomers bridged by densities that might be attributed to RavA. Therefore, the reconstruction was performed with 5-fold symmetry imposed on the complex although, in contrast to the situation with LdcI alone and owing to the under-representation of top views, this symmetry could not be revealed by MSA.

Once again, classification allowed selection of five characteristic views used for generation of an initial 5-fold symmetric model by cross common lines. In addition, two alternative 3D-models were created by using the same five characteristic views supplemented by a simulated top view. The two simulated top views were composed of the previously determined top view of

LdcI alone surrounded by five symmetrically distributed blobs of density positioned either as a radial extension of each LdcI subunit or in the middle between two adjacent subunits, so as to reach the dimensions compatible with the observed side views of the RavA-LdcI complexes.

After the refinement in EMAN and PFT2/BSOFT with a D5-symmetry imposed, all the models converged to the same final structure, which remained stable upon symmetry relaxation. The reconstructions of the two complexes, ADP-RavA-LdcI and ATPγS-RavA-LdcI, looked

indistinguishable from each other and had a similar final resolution. The resolution of the reconstructions was estimated to be around 30 Å via FSC, but was most probably asymmetric due to the preferential adsorption of the particles on the carbon surface in a side view orientation.

Moreover, if some of the nucleotide-RavA-LdcI complexes were built by LdcI molecules linked not by five but by four or three RavAs, for example, such heterogeneity of the preparation would also lead to a deterioration of the achievable resolution.

3.4 Results

3.4.1 RavA Represents a Distinct Subfamily Within the MoxR AAA+ Family

Initially, phylogenetic analysis of the MoxR AAA+ family was carried out to determine the major subfamilies within this family. A phylogenetic tree was generated using 156 MoxR

AAA+ sequences which belong to MoxR Clusters of Orthologous Groups 0714 (COG0714)

(National Center for Biotechnology Information, NIH) (Tatusov et al. 2003). The tree reveals the presence of six distinct subfamilies: MoxR-Proper, CGN, RavA, YehL, APE0892 and

PA2707 (Fig. 16A). The first five of these groups are consistent with those reported earlier (Iyer et al. 2004). None of the proteins of the RavA, YehL, APE0892, and PA2707 subfamilies have yet been characterized; however, limited work with members of the MoxR-Proper and CGN subfamilies has suggested that the MoxR AAA+ proteins function as molecular chaperones.

Closer examination of the 36 genes in the RavA subfamily reveals that 33 of these occur immediately adjacent to genes encoding proteins containing VWA domains (Fig. 16B). The co-occurrence of MoxR AAA+ containing genes with those encoding proteins containing VWA domains is a common feature within the MoxR family as a whole (data not shown), but appears with particularly high frequency among the RavA subfamily. This suggests a functional linkage between the RavA AAA+ proteins and their VWA-containing counterparts. In the case of E. coli RavA, the VWA-containing protein is YieM, which we have renamed ViaA.

Examination of a CLUSTAL W (Thompson et al. 1994) alignment of all 36 RavA

A) [Phylogenetic tree; labelled clades: YehL, PA2707, CGN, APE0892, RavA, MoxR Proper]

B) RavA Subfamily
[Column "Adjacent VWA domain-containing protein": presence/absence marks not preserved]

#    GI Number   Organism
1    14601148    Aeropyrum pernix
2    66858434    Anaeromyxobacter dehalogenans 2CP-C
3    60681677    Bacteroides fragilis NCTC 9343
4    29345745    Bacteroides thetaiotaomicron VPI-5482
5    57241650    Campylobacter lari RM2100
6    34498193    Chromobacterium violaceum ATCC 12472
7    48853798    Cytophaga hutchinsonii
8    16131614    Escherichia coli K12
9    50118970    Erwinia carotovora subsp. atroseptica SCRI1043
10   68140080    Ferroplasma acidarmanus Fer1
11   47092621    Listeria monocytogenes str. 4b H7858
12   2496128     Methanocaldococcus jannaschii DSM
13   2495785     Methanocaldococcus jannaschii DSM
14   20091041    Methanosarcina acetivorans C2A
15   20089152    Methanosarcina acetivorans C2A
16   68134507    Methanosarcina barkeri str. fusaro
17   68132708    Methanosarcina barkeri str. fusaro
18   21228941    Methanosarcina mazei Go1
19   21227633    Methanosarcina mazei Go1
20   26553847    Mycoplasma penetrans HF-2
21   54027784    Nocardia farcinica IFM 10152
22   54302311    Photobacterium profundum SS9
23   37524086    Photorhabdus luminescens
24   48478302    Picrophilus torridus DSM 9790
25   18313673    Pyrobaculum aerophilum str. IM2
26   47571823    Rubrivivax gelatinosus PM1
27   16767163    Salmonella typhimurium
28   24115049    Shigella flexneri 2a str. 301
29   21225846    Streptomyces coelicolor A3
30   70606761    Sulfolobus acidocaldarius DSM 639
31   15899119    Sulfolobus solfataricus P2
32   15920718    Sulfolobus tokodaii str. 7
33   15601518    Vibrio cholerae
34   59713681    Vibrio fischeri ES114
35   28900864    Vibrio parahaemolyticus RIMD 2210633
36   16120358    Yersinia pestis

FIGURE 16. Phylogenetic analysis of MoxR AAA+ proteins.

A) Phylogenetic tree generated from 156 MoxR AAA+ proteins belonging to COG0714 as described in the Materials and Methods. Six major subfamilies were identified, including RavA, MoxR Proper, YehL, CGN, APE0892, and PA2707. B) 36 members of the RavA subfamily from 32 organisms are listed. The presence or absence of an adjacent VWA domain-containing protein is indicated.

sequences revealed the presence of a number of highly conserved signature sequences in the E. coli K12 MG1655 RavA sequence (marked in red in Fig. 17). Upon comparison to common

AAA+ signature sequences (Neuwald et al. 1999), it is observed that many of these conserved residues distinguish RavA proteins from other AAA+ proteins. At the N-terminus, the Box II motif contains two highly conserved residues, an Arg and hydrophobic residue (Leu in the

RavA sequence). The Walker A motif contains a number of highly conserved residues, but most of these are common to all AAA+ proteins. One feature of note, however, is that members of the

RavA subfamily favor an alanine residue in place of the more commonly observed glycine of the classic ‘GK(S/T)’ motif. A series of highly conserved residues also exist in the Box IV and

Box IV΄ motifs. Based upon comparison to published alignments, at least part of the sequence in this region may lie on a small ‘insert’ in the second helix of the AAA+ core module (Iyer et al.

2004). The Walker B motif also contains a number of highly conserved residues. Most of these are common to all AAA+ proteins; however, the conserved tryptophan/phenylalanine residue is unique to RavA proteins and may be functionally important.

A unique, highly conserved LP motif also exists immediately prior to the Walker B motif.

Within Box VI, which follows the Walker B, members of the MoxR family possess a highly conserved LLxxΦxE motif, with members of the RavA subfamily possessing a slightly extended version IxNxLLxxΦNE(R/K) motif, where Φ corresponds to I,L,V, or M. The significance of this conserved region is unclear. The Sensor 1 region contains a highly conserved PΦx5ASNxΦP motif. The conserved Asn residue appears to correspond to the AAA+

Sensor 1 residue (5), responsible for interacting with the gamma phosphate of bound ATP. The conserved Ser is also very common in AAA+ in general. The rest of this conserved sequence, however, appears to be specific to RavA/MoxR proteins.

A conserved LxAxYDRxxxR motif exists in Box VII-Box VII΄ region. This region contains the conserved arginine finger of AAA+ proteins, believed to be responsible for mediating intersubunit communication in response to ATP hydrolysis


[Figure 17 image: RavA sequence, 70 residues per row; the AAA+ motif labels drawn over each row are listed in brackets.]

  1 MAHPHLLAERISRLSSSLEKGLYERSHAIRLCLLAALSGESVFLLGPPGIAKSLIARRLKFAFQNARAFE  70 [Box II, Walker A]
 71 YLMTRFSTPEEVFGPLSIQALKDEGRYERLTSGYLPEAEIVFLDEIWKAGPAILNTLLTAINERQFRNGA 140 [Box IV, Box IV’, Walker B, Box VI]
141 HVEKIPMRLLVAASNELPEADSSLEALYDRMLIRLWLDKVQDKANFRSMLTSQQDENDNPVPDALQVTDE 210 [Sensor 1, Box VII, Box VII’, Box VII’’]
211 EYERWQKEIGEITLPDHVFELIFMLRQQLDKLPDAPYVSDRRWKKAIRLLQASAFFSGRSAVAPVDLILL 280 [Sensor 2]
281 KDCLWYDAQSLNLIQQQIDVLMTGHAWQQQGMLTRLGAIVQRHLQLQQQQSDKTALTVIRLGGIFSRRQQ 350
351 YQLPVNVTASTLTLLLQKPLKLHDMEVVHISFERSALEQWLSKGGEIRGKLNGIGFAQKLNLEVDSAQHL 420
421 VVRDVSLQGSTLALPGSSAEGLPGEIKQQLEELESDWRKQHALFSEQQKCLFIPGDWLGRIEASLQDVGA 490
491 QIRQAQQC 498

FIGURE 17. Assignment of highly conserved residues in RavA AAA+ proteins.

The amino acid sequence of RavA from E. coli K12 is shown. Major features of the AAA+ module were obtained from published alignments and are shown boxed (Iyer et al. 2004; Neuwald et al. 1999). Sites of residues conserved in >90% of 36 Clustal W-aligned RavA subfamily members are colored red. Conservation is based on the following amino acid groupings: ILVM, KR, DE, ST, YFW and NQ. A, C, P, H and G are not grouped. In almost all cases a specific residue highlighted in the RavA sequence is one of the most frequently occurring throughout the RavA subfamily. Variant conserved residues present in at least 20% of subfamily members are indicated in single-letter code below the RavA sequence. In the case of the ILVM group, a Φ below a residue is used to indicate that at least 20% of subfamily members contain one of these residues. All glutamine residues after the Sensor II motif are shown in green.

(Ogura and Wilkinson 2001). The first R residue in the conserved motif may correspond to the arginine finger. Some additional conserved hydrophobic residues also exist in Box VII΄ and in the sequence following Box VII΄΄. The Sensor 2 region contains a short stretch of conserved residues, SDRR. One of the arginines is expected to correspond to the conserved Sensor 2 residue present throughout most of the AAA+ superfamily. This residue is believed to be responsible for sensing ATP and mediating movements between the AAA+ core domain and small α-helical domain (Ogura and Wilkinson 2001). The Ser residue and first Arg are common to many MoxR proteins in general; however, the Asp and second Arg residues appear to be unique to RavA subfamily members and may be important for a specific function in these proteins. Some additional conserved hydrophobic residues also exist in this region; however, there is little conservation in the RavA sequences after Sensor 2.
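The motifs described above can be located in the E. coli RavA sequence with a simple pattern search. The following is a minimal sketch, not the procedure used in this work: the regular expressions are my own translation of the motif notation (x as any residue, Φ as one of I, L, V or M), the dictionary keys and the function name `find_motifs` are hypothetical, and the sequence is transcribed from Fig. 17.

```python
import re

# E. coli K12 MG1655 RavA sequence as shown in Fig. 17 (498 residues).
RAVA = (
    "MAHPHLLAERISRLSSSLEKGLYERSHAIRLCLLAALSGESVFLLGPPGIAKSLIARRLKFAFQNARAFE"
    "YLMTRFSTPEEVFGPLSIQALKDEGRYERLTSGYLPEAEIVFLDEIWKAGPAILNTLLTAINERQFRNGA"
    "HVEKIPMRLLVAASNELPEADSSLEALYDRMLIRLWLDKVQDKANFRSMLTSQQDENDNPVPDALQVTDE"
    "EYERWQKEIGEITLPDHVFELIFMLRQQLDKLPDAPYVSDRRWKKAIRLLQASAFFSGRSAVAPVDLILL"
    "KDCLWYDAQSLNLIQQQIDVLMTGHAWQQQGMLTRLGAIVQRHLQLQQQQSDKTALTVIRLGGIFSRRQQ"
    "YQLPVNVTASTLTLLLQKPLKLHDMEVVHISFERSALEQWLSKGGEIRGKLNGIGFAQKLNLEVDSAQHL"
    "VVRDVSLQGSTLALPGSSAEGLPGEIKQQLEELESDWRKQHALFSEQQKCLFIPGDWLGRIEASLQDVGA"
    "QIRQAQQC"
)

# Assumed regex translations of the motifs named in the text:
# x -> "." (any residue), Phi -> "[ILVM]", (R/K) -> "[RK]".
MOTIFS = {
    "Box VI (MoxR family)": r"LL..[ILVM].E",               # LLxxPhixE
    "Box VI (RavA subfamily)": r"I.N.LL..[ILVM]NE[RK]",    # IxNxLLxxPhiNE(R/K)
    "Sensor 1": r"P[ILVM].{5}ASN.[ILVM]P",                 # PPhix5ASNxPhiP
    "Box VII": r"L.A.YDR...R",                             # LxAxYDRxxxR
    "Sensor 2": r"SDRR",
}


def find_motifs(sequence):
    """Return {motif name: (1-based start position, matched text)} for the first hit."""
    hits = {}
    for name, pattern in MOTIFS.items():
        match = re.search(pattern, sequence)
        if match:
            hits[name] = (match.start() + 1, match.group())
    return hits


for name, (start, text) in find_motifs(RAVA).items():
    print(f"{name}: residue {start}, {text}")
```

Because only the first match of each pattern is reported, a looser pattern such as the family-wide Box VI motif could in principle hit elsewhere in other sequences; for subfamily-wide searches the patterns would need to be checked against the alignment.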

3.4.2 RavA is a Soluble Cytoplasmic ATPase Consisting of Two Domains

In order to experimentally determine the domains present in the RavA structure, we expressed and purified the protein from E. coli. The purified protein was then treated with limited amounts of sequencing grade trypsin for varying amounts of time (Fig. 18A). The boundaries of the fragments generated were then determined using mass spectrometry (Fig. 18B). Based on this analysis, the protein can be divided into two main domains: a highly conserved N-terminal domain (M1 – R347) that corresponds to the AAA+ module, and a poorly conserved C-terminal domain (R348 – C498) that is predicted to be predominantly α helical. As is typical for AAA+ modules (Neuwald et al. 1999), the N-terminal domain can be separated into two smaller fragments (Fig. 18A,B), which appear to correspond to the AAA+ αβ subdomain (M1 – R187) and an α helical subdomain (S188 – R347). The C-terminal domain of RavA has no known motifs or signature sequences and appears to represent a novel domain of unknown function. The sequence of this domain bears little homology to any other protein sequence in NCBI (Wheeler et al. 2005) and is poorly conserved in RavA proteins outside of the


[Figure 18 image. A) SDS-PAGE of the RavA tryptic digest time course (uncut, 1, 5, 15, 30, 60 and 90 min) with molecular weight markers at 116, 66, 45, 35, 25, 18 and 14 kDa; bands RavA and A to D are labeled. B) Fragment map of RavA with boundaries M1, R187/S188, R347/R348 and C498, showing the AAA+ module and fragments A (39.6 kDa), B (21.2 kDa), C (18.4 kDa) and D (16.8 kDa). C) Localization of RavA in cells grown to early stationary phase: αRavA blots of membrane, cytoplasmic and periplasmic fractions from MG1655 and MG1655 ΔravA. D) RavA ATPase initial rate (pmol min-1 µg-1) versus pH (4 to 9); inset at pH 7.5: kcat = 1.4 (0.2) sec-1, Km = 7.9 x 10^-4 (2.5 x 10^-4) M, kcat/Km = 1800 (600) M-1 sec-1.]

FIGURE 18. Domain mapping, localization, and ATPase activity of RavA.

A) SDS-PAGE analysis of the RavA tryptic digest at different time points is shown. Major bands are labeled. B) Schematic diagram of the trypsin cleavage sites and proposed tryptic domains of RavA, as identified by mass spectrometry, is shown. The four discrete fragments, labeled A to D, refer to the bands indicated in A. C) Localization of RavA in WT and ΔravA E. coli cells grown to stationary phase in LB media. Cytoplasmic, periplasmic, and membrane fractions are shown. RavA protein is localized to the cytoplasm. A faint band present in the membrane fraction is nonspecific as it is present in both WT and ΔravA strains. D) ATPase activity of purified RavA protein at various pH values. kcat, Km and kcat/Km values are given for pH 7.5 (inset).

enterobacteria subdivision. It is interesting to note that the E. coli RavA sequence after residue A288 is highly enriched in glutamines (Fig. 17).
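The glutamine enrichment noted above can be checked directly from the sequence in Fig. 17. The sketch below is illustrative only; the function name `glutamine_fraction` is hypothetical, and the split at residue 288 simply follows the text.

```python
# E. coli K12 MG1655 RavA sequence as shown in Fig. 17 (498 residues).
RAVA = (
    "MAHPHLLAERISRLSSSLEKGLYERSHAIRLCLLAALSGESVFLLGPPGIAKSLIARRLKFAFQNARAFE"
    "YLMTRFSTPEEVFGPLSIQALKDEGRYERLTSGYLPEAEIVFLDEIWKAGPAILNTLLTAINERQFRNGA"
    "HVEKIPMRLLVAASNELPEADSSLEALYDRMLIRLWLDKVQDKANFRSMLTSQQDENDNPVPDALQVTDE"
    "EYERWQKEIGEITLPDHVFELIFMLRQQLDKLPDAPYVSDRRWKKAIRLLQASAFFSGRSAVAPVDLILL"
    "KDCLWYDAQSLNLIQQQIDVLMTGHAWQQQGMLTRLGAIVQRHLQLQQQQSDKTALTVIRLGGIFSRRQQ"
    "YQLPVNVTASTLTLLLQKPLKLHDMEVVHISFERSALEQWLSKGGEIRGKLNGIGFAQKLNLEVDSAQHL"
    "VVRDVSLQGSTLALPGSSAEGLPGEIKQQLEELESDWRKQHALFSEQQKCLFIPGDWLGRIEASLQDVGA"
    "QIRQAQQC"
)


def glutamine_fraction(sequence, start, end):
    """Fraction of Gln (Q) among residues start..end (1-based, inclusive)."""
    segment = sequence[start - 1:end]
    return segment.count("Q") / len(segment)


# Compare the region up to A288 with the region after it.
before = glutamine_fraction(RAVA, 1, 288)
after = glutamine_fraction(RAVA, 289, len(RAVA))
print(f"Gln content, residues 1-288:   {before:.1%}")
print(f"Gln content, residues 289-498: {after:.1%}")
```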

Subsequently, the subcellular localization of RavA was determined. Wild-type and ΔravA cells grown at 37˚C to early stationary phase were lysed and separated into cytoplasmic, periplasmic, and crude membrane fractions. Western blot analysis of these fractions using αRavA antibodies (Fig. 18C) clearly showed that RavA is predominantly cytoplasmic. This is consistent with the lack of any predicted signal sequences or transmembrane regions in RavA. A faint band observed in the membrane fractions appears to be non-specific, as it appears in both the WT and ΔravA strains.

As shown in Fig. 18D, purified RavA at 0.4 μM is most active as an ATPase at neutral to alkaline pH, with maximal activity around pH 7.5. Below pH 7.5, the activity drops rapidly and is minimal below pH 6.0. Complete Michaelis-Menten kinetic analysis was performed at pH 7.5 (Fig. 18D, inset). The kcat and Km values for RavA ATPase are in a similar range to those obtained for other AAA+ proteins such as HslU (Yoo et al. 1996), NtrC (Austin and Dixon 1992) and Lon (Starkova et al. 1998). Furthermore, RavA also has GTPase activity. GTP hydrolysis appears to be much slower than ATP hydrolysis at lower enzyme/substrate concentrations but reaches comparable levels with increasing amounts of enzyme/substrate (data not shown). This effect may be the result of a lower ability of GTP versus ATP to promote RavA oligomerization (data not shown).
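As a worked check on the kinetic parameters reported in the inset of Fig. 18D, the catalytic efficiency can be recomputed from kcat and Km. The sketch below assumes simple Michaelis-Menten behavior; the function name and the choice of example substrate concentration are illustrative and not taken from the assay protocol.

```python
import math

KCAT = 1.4          # s^-1, from the inset of Fig. 18D
KM = 7.9e-4         # M, from the inset of Fig. 18D
ENZYME = 0.4e-6     # M, RavA concentration quoted in the text


def michaelis_menten_rate(kcat, km, enzyme_conc, substrate_conc):
    """Initial rate v = kcat * [E] * [S] / (Km + [S]); concentrations in M, rate in M/s."""
    return kcat * enzyme_conc * substrate_conc / (km + substrate_conc)


# Catalytic efficiency; compare with the reported kcat/Km of 1800 M^-1 s^-1.
efficiency = KCAT / KM
print(f"kcat/Km = {efficiency:.0f} M^-1 s^-1")

# At [S] = Km the predicted rate is half of Vmax = kcat * [E].
half_max = michaelis_menten_rate(KCAT, KM, ENZYME, KM)
print(f"rate at [ATP] = Km: {half_max:.2e} M/s")
```

The recomputed efficiency of roughly 1770 M-1 sec-1 agrees with the reported value of 1800 M-1 sec-1 once rounded to two significant figures.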

3.4.3 ravA and viaA genes form an operon under the direct control of σS

The ravA and viaA gene sequences encode predicted proteins of similar length consisting of 498 and 483 amino acid residues, respectively. The genes reside on the E. coli chromosome close to the origin of replication (OriC) (Blattner et al. 1997), about 2200 base pairs clockwise from OriC, and the genes are transcribed in the counter-clockwise direction. Interestingly, the 5´ end of the ViaA coding region overlaps the 3´ end of the RavA coding sequence by seven bases (Fig. 19A). Such an overlap might hint at the presence of an ancient gene whose sequence was a fusion of the ravA and viaA gene sequences.
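The seven-base overlap can be verified from the terminal sequences shown in Fig. 19A. The following sketch is illustrative; the function names are hypothetical, and only the twelve terminal nucleotides of each ORF given in the figure are used.

```python
RAVA_3PRIME = "CAACAATGCTAA"   # last 12 nt of the ravA ORF (Fig. 19A), ending in the TAA stop
VIAA_5PRIME = "ATGCTAACGCTG"   # first 12 nt of the viaA ORF (Fig. 19A), starting at ATG


def suffix_prefix_overlap(upstream, downstream):
    """Length of the longest suffix of `upstream` that is also a prefix of `downstream`."""
    for length in range(min(len(upstream), len(downstream)), 0, -1):
        if upstream[-length:] == downstream[:length]:
            return length
    return 0


def orf_length_bp(n_residues):
    """ORF length in base pairs for a protein of n_residues, including the stop codon."""
    return 3 * (n_residues + 1)


overlap = suffix_prefix_overlap(RAVA_3PRIME, VIAA_5PRIME)
print(f"overlap: {overlap} nt ({VIAA_5PRIME[:overlap]})")
print(f"ravA ORF: {orf_length_bp(498)} bp, viaA ORF: {orf_length_bp(483)} bp")
```

Since seven is not a multiple of three, the viaA start codon lies in a different reading frame from the end of ravA.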

In order to determine whether ravA and viaA are on the same operon, Northern blot analysis was carried out on E. coli MG1655 wild-type (WT), MG1655 ΔravA, and MG1655 ΔviaA cells. The ΔravA cells have most of the RavA ORF deleted: from 13 base pairs upstream of the ‘atg’ start codon to base pair 1431 (Fig. 19B). This deletion does not disrupt the ViaA ORF. The ΔviaA cells contain a deletion of base pairs 166 to 1437 of the ViaA ORF. The cells were grown in LB media at 37˚C to mid-log phase; subsequently, total RNA was isolated and probed with alkaline phosphatase-labeled DNA probe complementary to bp 623-1497 of the ravA gene or complementary to bp 719-1452 of the viaA gene (Fig. 19A,B). Wild-type cells clearly show the presence of two different ravA mRNA transcripts of approximately 1900 and 1100 bases in size (Fig. 19C, bands 2 and 3, respectively, in the first lane of panel I). Low levels of an even larger transcript of approximately 3100 bases are also detected (Fig. 19C, band 1 in the first lane of panel I). RT-PCR analysis using forward and reverse primers complementary to the 5´ end of the ravA ORF and the 3´ end of the viaA ORF, respectively, confirms the existence of the larger transcript (Fig. 19D, band 1). In addition, RT-PCR using a forward primer internal to the ravA ORF and a reverse primer internal to the viaA ORF also produces a product (Fig. 19D, band 2), further supporting the existence of these two genes on a single transcript. viaA transcript is detected as distinct bands of approximately 2000 and 1200 bases (Fig. 19C, bands 1 and 2, respectively, in first lane of panel II). A larger transcript is not detected in the viaA blot, likely a reflection of lower probe sensitivity.

These data suggest that ravA and viaA exist as a single operon, consistent with the lack of a predicted viaA promoter sequence. The full length ~3100 base transcript appears to be inherently unstable, however, and undergoes at least two distinct internal cleavage events shown schematically in Fig. 19E. The first of these cleavages produces the ~1900 base fragment observed in the ravA blot (Fig. 19C, band 2 in the first lane of panel I) and the ~1200 base


[Figure 19 image. A) -10 region: aacgTCTATACTCGCaatt. RavA: 1 ATGGCTCAC --- CAACAATGCTAA 1497 bp (M A H --- Q Q C *, 498 aa). ViaA: 1 ATGCTAACGCTG --- CGCTGGCGGCGATAA 1452 bp (M L T L --- R W R R *, 483 aa). B) Schematic showing OriC (~2200 bp away), the ravA and viaA genes, their overlap, the ravA and viaA probes, and the CAT insertion cassettes. C) ravA Northerns (Panel I) and viaA Northerns (Panel II); lanes WT, ΔravA and ΔviaA; size markers in bases; bands numbered. D) RT-PCR; size markers in base pairs; bands 1 and 2. E) ravA-viaA transcript (~3.1 kb) with cleavage products of ~1.9 kb and ~1.2 kb, and of ~1.1 kb and ~2.0 kb. F) αRavA blots of MG1655 (OD600 0.5, 1.0, 2.2, 3.8, 5.4) and MG1655 ΔrpoS (OD600 0.5, 1.0, 1.8, 2.7, 4.0) grown in LB at 37°C, pH 7.5.]

FIGURE 19. ravA-viaA gene organization in E. coli.

A) The overlap between the RavA and ViaA ORFs is shown. The σS consensus sequence in the predicted RavA -10 promoter region is indicated in upper case, with bolded residues exactly matching the consensus sequence (Weber et al. 2005). B) Scaled schematic showing the areas of the ravA and viaA genes which hybridize to the probes used in the Northern blots and the regions replaced by the chloramphenicol acetyltransferase (CAT) resistance cassettes in the deletion strains. Sites of internal primers used in RT-PCR, indicated by small arrows, are also shown. C) Northern analysis of RavA and ViaA in WT, ΔravA and ΔviaA strains. Observed bands are numbered (see text). D) RT-PCR confirmation of ravA-viaA-containing transcript. Band 1 was amplified using RavA and ViaA terminal primers. Band 2 was amplified using RavA and ViaA internal primers. E) Scaled schematic showing estimated cleavage sites within the ~3.1 kb ravA-viaA-containing transcript and predicted product sizes. F) Western blot analysis of RavA protein levels in WT and ΔrpoS cells grown in LB at 37°C. RavA protein expression is significantly induced towards late log/early stationary phase in WT cells. This induction is not observed in the ΔrpoS cells, consistent with ravA being regulated by σS.

fragment observed in the viaA blot (Fig. 19C, band 2 in the first lane of panel II). The second cleavage produces the ~1100 base fragment observed in the ravA blot (Fig. 19C, band 3 in the first lane of panel I) and the ~2000 base fragment observed in the viaA blot (Fig. 19C, band 1 in the first lane of panel II). Whether or not these cleavage events have a regulatory role is currently under investigation.
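The proposed cleavage scheme can be checked with simple arithmetic: each pair of fragments should add up to the full-length transcript. The sketch below uses the approximate sizes estimated from the Northern blots; the 10% tolerance and the function name are arbitrary assumptions meant only to reflect the uncertainty of gel-based size estimates.

```python
FULL_LENGTH = 3100  # bases, approximate size of the ravA-viaA transcript

# Approximate fragment sizes (bases) as (ravA-probed, viaA-probed) pairs.
CLEAVAGES = {
    "first cleavage": (1900, 1200),
    "second cleavage": (1100, 2000),
}


def is_consistent(full_length, fragments, tolerance=0.10):
    """True if the fragment sizes sum to the full length within a relative tolerance."""
    return abs(sum(fragments) - full_length) <= tolerance * full_length


for name, fragments in CLEAVAGES.items():
    total = sum(fragments)
    print(f"{name}: {fragments[0]} + {fragments[1]} = {total} bases, "
          f"consistent = {is_consistent(FULL_LENGTH, fragments)}")
```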

Northern blot analysis of total RNA from MG1655 ΔravA cells shows that the deletion of the ravA gene results in the loss of all mRNA species (Fig. 19C, lane 2 in both panels). On the other hand, Northern blot analysis of MG1655 ΔviaA using the probe for ravA shows the presence of an ~1700 base transcript and the previously observed ~1100 base transcript (Fig. 19C, bands 4 and 5, respectively, in lane 3 of panel I). The ~1700 base transcript is slightly smaller than the ~1900 base fragment observed in wild-type cells (Fig. 19C, band 2 in lane 1 of panel I), probably resulting from earlier termination of transcription due to the presence of the deletion cassette (Fig. 19B). These results confirm that the ravA and viaA genes form a single operon.

Analysis of the sequence immediately upstream of the ravA gene reveals the presence of a highly conserved σS promoter consensus sequence (Fig. 19A) (Weber et al. 2005). This consensus sequence has been shown to be present in genes that are upregulated in stationary phase. Consistent with this, western blot analysis of MG1655 cells shows a significant induction of RavA levels as cells approach late log/early stationary phase (Fig. 19F, left panel). Such an induction does not occur in an MG1655 ΔrpoS strain, although constant levels of RavA protein appear to be present at all times (Fig. 19F, right panel). The data suggest that RavA induction is partially dependent on the stationary phase sigma factor σS, likely in a direct manner. The presence of low, constitutive levels of RavA protein in the ΔrpoS strain suggests that the ravA promoter may display conditional selectivity allowing its expression to be regulated by other sigma factor(s) (e.g. σ70). Such conditional promoter selectivity has been reported for other E. coli genes, for instance dps and yaiA, whose expression may be regulated by either σS or σ70, depending on cellular conditions (Weber et al. 2005).

3.4.4 RavA Forms a Hexameric Oligomer

Many AAA+ proteins function as oligomeric rings (Hanson and Whiteheart 2005); therefore, the nature of the oligomeric state of RavA was investigated using size exclusion chromatography and analytical ultracentrifugation. Size exclusion chromatography analysis using 3 μM RavA revealed that the protein forms oligomeric species, and that this oligomerization is enhanced in the presence of nucleotide (Fig. 20A). The largest species formed runs near the 443 kDa marker. In order to determine the size of this species more accurately, analytical ultracentrifugation sedimentation equilibrium studies were performed in the presence of ATP. The best fit to the data set corresponds to a monodisperse species of approximately 310 kDa in size (Fig. 20B). This molecular weight corresponds closely with that expected of a hexameric species. Thus we propose that the functional state of RavA is a hexameric oligomer in the presence of nucleotides. Sedimentation equilibrium experiments in the absence of nucleotides revealed the existence of a range of smaller species (data not shown), consistent with our size exclusion results.

Negative stain electron microscopy of RavA in the presence of ADP, ATP or non-hydrolyzable ATP analogues invariably showed a mixture of different oligomeric states of the protein, ranging from monomers to larger oligomers. The top view appearance of the largest oligomers clearly favored their hexameric composition (data not shown). However, due to the inhomogeneity of RavA oligomeric states on the grid and the resulting difficulty in unambiguous identification of corresponding side views, no attempts have been made to obtain a three dimensional reconstruction of the RavA hexamer (however, see below).

3.4.5 RavA Binds to the Inducible Lysine Decarboxylase LdcI/CadA

In an effort to determine the function of RavA, experiments were carried out in an


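[Figure 20 image. A) Size exclusion chromatogram: absorbance (mAU) versus elution volume (10 to 18 mL) for +ATP and -ATP samples, with molecular weight markers at 669, 443, 200 and 66 kDa. B) Sedimentation equilibrium: absorbance at 280 nm (AU) versus radius (5.9 to 6.1 cm), with fit residuals plotted above.]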

FIGURE 20. Analysis of RavA oligomerization.

A) size exclusion chromatography analysis, using a Superose 6 HR 10/30 column, of 3 µM RavA in the presence and absence of 1 mM ATP. In the absence of ATP, RavA migrates mainly as a small, apparently monomeric species. In the presence of ATP, RavA forms a larger species near the 443 kDa marker. Similar results were obtained when using ADP. B) sedimentation equilibrium analysis of 20 µM RavA in the presence of 0.2 mM ATP. RavA exists as a monodisperse species with an average molecular weight of 310 kDa, roughly the size expected of a hexameric oligomer.

attempt to identify its interacting partners. E. coli DY330 strain (Datsenko and Wanner 2000; Yu et al. 2000), in which a TAP-tag (Rigaut et al. 1999) was fused C-terminally to the endogenous ravA chromosomal gene, was used for pulldown experiments (Butland et al. 2005). When RavA-TAP was isolated from cells grown to stationary phase in TB rich media at 37˚C, the inducible cytoplasmic lysine decarboxylase (LdcI) was found to be strongly bound to RavA (Fig. 21A). LdcI (also termed CadA, SwissProt name DCLY_ECOLI, SwissProt number P0A9H3) is a pyridoxal phosphate (PLP, a vitamin B6 derivative) dependent decameric enzyme (Sabo et al. 1974; Sabo and Fischer 1974) that plays a major role in the acid-stress response in bacteria (Merrell and Camilli 1999; Park et al. 1996; Soksawatmaekhin et al. 2004). LdcI decarboxylates lysine to produce the polyamine cadaverine, which is then pumped out of the cell by an inner membrane lysine/cadaverine antiporter protein, CadB (Meng and Bennett 1992; Soksawatmaekhin et al. 2004). This process removes a proton from the cell cytoplasm and reduces the acidification of the intracellular environment. The genes cadB and ldcI/cadA are on the same operon (Meng and Bennett 1992; Watson et al. 1992), and the operon is activated by an inner membrane anchored transcription activator, CadC, that binds a Pcad element in the promoter region of cadBA (Watson et al. 1992). The regulation of cadBA transcription is rather complex. The operon is induced by low external pH, by the presence of exogenous lysine, and under anaerobic conditions (Auger and Bennett 1989; Sabo et al. 1974). The binding of RavA to LdcI leads us to speculate that RavA might also be involved in the bacterial stress response or that the interaction between the two proteins is regulatory in nature.

Analysis of the binding of RavA to LdcI was carried out using size exclusion chromatography (Fig. 21B,C). All experiments shown in Fig. 21B were performed in the presence of 1 mM ATP. 18 μM RavA protein migrates as a single species with an apparent molecular weight of ~440 kDa representing the hexameric species (Fig. 21B, panel 1 – top panel). 10 μM LdcI (Fig. 21B, panel 2) also elutes as a single species, with an apparent molecular weight of 600 kDa probably representing the decameric species that is known to be


[Figure 21 image. A) RavA-TAP pull-down gel with molecular weight markers at 97, 66, 45 and 31 kDa; bands labeled LdcI (81 kDa) and RavA-CBP (60 kDa). B) Size exclusion fractions 1 to 9 with markers at 669, 443, 200 and 66 kDa. Panels: 1) 18 µM RavA; 2) 10 µM LdcI; 3) 18 µM RavA + 10 µM LdcI; 4) 18 µM RavA + 40 µM LdcI; 5) 10 µM LdcC; 6) 18 µM RavA + 10 µM LdcC. C) Size exclusion fractions 1 to 10 with markers at 669, 443, 200 and 66 kDa. Panels: 1) 1.5 µM LdcI + ATP; 2) 3 µM RavA - ATP; 3) 3 µM RavA + ATP; 4) 3 µM RavA + 1.5 µM LdcI - ATP; 5) 3 µM RavA + 1.5 µM LdcI + ATP.]

FIGURE 21. RavA interacts with the inducible lysine decarboxylase (LdcI/CadA).

A) Pull-down of endogenous, C-terminally TAP-tagged RavA. Identification of the bands by mass spectrometry reveals that RavA interacts with the inducible lysine decarboxylase enzyme, LdcI (upper band). The middle band corresponds to RavA-CBP (calmodulin binding peptide, which remains after TEV cleavage). The band below RavA is a truncation product. B) Analysis of the interaction of purified RavA, LdcI, and LdcC using size exclusion chromatography in the presence of 1 mM ATP. Boxes indicate major peaks. C) Interaction of RavA and LdcI, at lower concentrations, in the presence and absence of 1 mM ATP. Note that LdcI alone runs identically in both the presence (Panel 1) and absence (not shown) of nucleotide.

formed by LdcI (also see below) (Sabo et al. 1974; Sabo and Fischer 1974). A mixture of 10 μM LdcI with excess RavA (18 μM) results in a large shift in the migration of both species indicating a strong interaction (Fig. 21B, panel 3). A range of higher order oligomers is formed, with the predominant complex (boxed) migrating at or near the void volume of the column, suggesting a size of approximately 50 MDa or greater. The predominant complex formed changes if 18 μM RavA is incubated with approximately 2-fold excess of LdcI (40 μM). The formed complex elutes within the column volume, and its molecular weight is 3 – 5 MDa as estimated by sedimentation equilibrium experiments (Fig. 21B, left box in panel 4 and data not shown). We term this complex the fraction three complex (FTC), which, as discussed below, might be the physiologically relevant RavA-LdcI complex.

E. coli contains another protein which is a very close paralogue of LdcI. This protein is the constitutive lysine decarboxylase LdcC (SwissProt name DCLZ_ECOLI, SwissProt number P52095). LdcI and LdcC are 69% identical and 84% similar as determined using pairwise BLASTP analysis. LdcC is expressed in E. coli at very low levels and its exact role in the cell remains unclear (Kikuchi et al. 1997; Lemonnier and Lane 1998). Like LdcI, LdcC is also a PLP-dependent enzyme that decarboxylates lysine and possibly assembles into a decamer (Kikuchi et al. 1997; Lemonnier and Lane 1998; Fig. 21B, panel 5). However, the enzymatic activity profile and stability of LdcC differ from those of LdcI (Kikuchi et al. 1997; Lemonnier and Lane 1998). Due to the high sequence similarity between LdcC and LdcI, we expected that LdcC might also bind RavA; however, we were surprised to find that there is no significant interaction between RavA and LdcC (Fig. 21B, panel 6). The specific interaction of RavA with LdcI and not LdcC possibly indicates that RavA is a stress response protein.

The size of the LdcI oligomer (decamer) is unaffected by the presence or absence of ATP (Fig. 21C, panel 1 – upper panel) or by LdcI concentration (compare elution profile of 10 μM LdcI in Fig. 21B, panel 2, to the elution profile of 1.5 μM LdcI in Fig. 21C, panel 1). As shown in Fig. 21C, the size of the RavA-LdcI complex is dependent upon RavA oligomerization. 3 μM RavA in the absence of ATP predominantly elutes at ~60 kDa representing a monomeric species, while, in the presence of ATP, 3 μM RavA elutes as the larger hexameric species (Fig. 20A and Fig. 21C, panels 2 & 3). LdcI binds to 3 μM RavA in the absence and presence of ATP (Fig. 21C, panels 4 & 5); however, the size of the complex is smaller in the absence of ATP as compared to when ATP is present. These data indicate that LdcI probably does not promote the oligomerization of RavA in the absence of nucleotide. In Fig. 21C, panel 5, the fraction three complex (FTC) is also observed.

Similar interactions were observed between RavA and LdcI when ADP was added to the buffer rather than ATP. Furthermore, the inclusion of lysine or cadaverine or the removal of the LdcI cofactor PLP from the buffer did not affect the interaction between RavA and LdcI (data not shown). These observations clearly indicate that the interaction between RavA and LdcI is highly robust and functionally significant.

3.4.6 RavA-LdcI Form a Very Large Distinctive Cage-Like Complex as Visualized by Negative Stain Electron Microscopy

A 20 Å resolution structure of the D5-symmetric LdcI oligomer (Fig. 22A-C) was obtained as described in the Materials and Methods. The existence of a five-fold symmetry axis in the particle was corroborated by multivariate statistical analysis. Fig. 22B shows the pronounced five-fold symmetrical shape of the top view class average and the two-fold symmetry of the side view class average. Some dimeric dissociation products could always be detected in the background of the micrographs (Fig. 22A), in agreement with the early publication of Sabo et al. (Sabo et al. 1974). The random orientation of single particles on the carbon film made this specimen particularly suitable for a three dimensional reconstruction by the angular reconstitution approach (Fig. 22C). The decamer of LdcI revealed itself as two pentameric rings, tightly stacked together back-to-back and turned about 35 degrees with respect to each other. Each monomer has a globular shape. The relative rotation of the two rings confers


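[Figure 22 image. Panels A) to C) show LdcI and panels D) to F) show RavA-ADP-LdcI. The isosurface views in C) and F) are labeled top, tilted and side. Scale bars: 15 nm.]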

FIGURE 22. EM analysis of LdcI and RavA-LdcI complex.

A) negatively stained LdcI particles. B) five characteristic views of LdcI used to calculate an initial 3D model of the particle. C) isosurface representations of the 3D reconstruction of LdcI. D) negatively stained RavA-ADP-LdcI particles. E) two hypothetical five-fold symmetrical top views of RavA-ADP-LdcI created as described in the text and five characteristic views of RavA-ADP-LdcI used to calculate an initial 3D model of the particle. F) isosurface representations of the 3D reconstruction of RavA-ADP-LdcI.


a pronounced handedness to the molecule responsible for the “plait-like” appearance of the side view.

Grids prepared by mixing ADP- or ATPγS-RavA and LdcI in equimolar amounts (1.8 µM RavA to 1.8 µM LdcI) looked indistinguishable from preparations of LdcI alone except for a higher protein background attributed to RavA. Remarkably, however, at about two-fold excess of RavA (3.24 µM RavA and 1.8 µM LdcI in the presence of 3 mM ADP or ATPγS), the background became clean, individual LdcI views virtually disappeared, and most of the protein was present as an amazingly large and apparently homogeneous RavA-LdcI complex. Analysis of the data revealed that the ADP-RavA-LdcI complex contained not one, but two LdcI decamers sandwiched together and glued by additional densities that certainly arise from RavA (Fig. 22D). In this case, however, the particles had a strongly preferred orientation on the grid, and although different class averages could be easily identified as side views and various tilted views, none could be attributed to a true five-fold symmetrical top view. Nevertheless, even without a top view class average, an initial model could still be constructed from several characteristic tilted views (Fig. 22E) by the cross common lines technique, provided that five-fold symmetry is imposed. A refinement with such a model resulted in the reconstruction shown in Fig. 22F. The validity of this approach was confirmed by creating two of the possible top views of the complex, composed of the experimentally observed top view of LdcI alone artificially surrounded by five blobs of density. This density was arbitrarily positioned either in the middle between two adjacent LdcI subunits or as a radial extension of each LdcI subunit (see the first two panels in Fig. 22E). The only restrictions were the five-fold symmetry of the top view and the experimental dimensions of the ADP-RavA-LdcI complex side view. These two additional sets of views gave rise to two different initial 3D-models of the complex, which converged during the refinement process to the same final reconstruction (Fig. 22F), identical to the one obtained without any artificial top view.


Hence, the RavA-LdcI complex is a unique complex in which five RavA oligomers (at most) bridge two LdcI decamers resulting in a distinctive cage-like structure. Although the volume of the densities attributed to RavA would be consistent with its inclusion in the complex as a hexameric species, the current quality of the reconstruction does not allow us to claim this with absolute certainty. Remarkably, no differences could be observed between the RavA-LdcI complex in ADP and the one in ATPγS (data not shown), at least at the current resolution estimated to be around 30 Å. We believe that the complex seen in Fig. 22 corresponds to the fraction three complex (FTC) in Fig. 21.
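As a rough plausibility check on the correspondence between the cage-like complex and the FTC, the mass of a fully occupied cage can be estimated from values reported elsewhere in this chapter. The sketch below is an estimate under stated assumptions rather than a measurement: it assumes that each RavA oligomer is a hexamer of about 310 kDa (the sedimentation equilibrium value), that all five RavA positions are occupied, and that the LdcI monomer is 81 kDa (Fig. 21A). The function name is hypothetical.

```python
RAVA_HEXAMER_KDA = 310.0   # assumed mass of one RavA oligomer (sedimentation equilibrium, Fig. 20B)
LDCI_MONOMER_KDA = 81.0    # LdcI monomer mass as labeled in Fig. 21A
LDCI_SUBUNITS = 10         # LdcI is a decamer


def cage_mass_kda(n_rava_oligomers=5, n_ldci_decamers=2):
    """Estimated mass (kDa) of a cage with the given numbers of RavA oligomers and LdcI decamers."""
    rava_mass = n_rava_oligomers * RAVA_HEXAMER_KDA
    ldci_mass = n_ldci_decamers * LDCI_SUBUNITS * LDCI_MONOMER_KDA
    return rava_mass + ldci_mass


mass = cage_mass_kda()
print(f"estimated cage mass: {mass:.0f} kDa ({mass / 1000:.2f} MDa)")
```

Under these assumptions the estimate falls within the 3 to 5 MDa range obtained for the FTC by sedimentation equilibrium; a cage with fewer than five RavA oligomers would be correspondingly lighter.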

3.4.7 The Binding of RavA to LdcI Affects RavA Activity but Not LdcI Activity

The formation of a tight complex between RavA and LdcI suggests that RavA affects LdcI function or vice versa. Since RavA is an AAA+ protein, we expected that it would remodel LdcI in a nucleotide-dependent manner and, hence, affect the activity of the enzyme. LdcI activity was measured at a range of different pH values in the presence and absence of RavA protein (Fig. 23A). LdcI activity is greater at neutral pH than at alkaline pH, with highest activities observed at pH 6.0 and 7.0. At pH 8.0 and 9.0, LdcI activity is virtually undetectable. Surprisingly, the activity of the LdcI enzyme was unaffected by the presence of RavA under any of the conditions tested, even at higher concentrations of RavA and LdcI or under different incubation conditions (data not shown). The inclusion of ViaA protein in the assay also failed to produce any alteration in the activity of the LdcI enzyme (data not shown).

Furthermore, LdcI activity in whole cell extracts, in which the enzyme was induced by growth of cells in unbuffered LB media containing excess glucose, was also not affected by deletion of the ravA gene (Fig. 23B). The utilization of glucose as a carbon source results in a decrease in the pH of the media as cells ferment the glucose and produce lactic and acetic acids (Clark 1989). The addition of purified RavA and nucleotides to the extract also did not affect LdcI activity (data not shown). The lack of effect of RavA on LdcI activity in vivo was further


[Figure 23 image. A) LdcI activity: initial rate versus pH (4 to 9) for 25 nM LdcI, 25 nM LdcI + 250 nM RavA, 250 nM LdcI, and 250 nM LdcI + 250 nM RavA. B) LdcI activity in cell extracts: initial rate for WT and ΔravA. C) Growth in lysine decarboxylase media: absorbance at 590 nm versus time (0 to 500 min) for WT, ΔravA and ΔcadB, with pH 6.1 and pH 5.0 indicated. D) Extreme acid challenge, rows 1 to 6: WT no shock, WT shock, WT shock/Lys, ΔravA no shock, ΔravA shock, ΔravA shock/Lys. E) αRavA and αLdcI blots of MG1655 at 0, 10, 20, 40, 60 and 120 min for control and acid shock samples. F) RavA ATPase initial rate (pmol min-1 µg-1) for columns 1 to 10, combining 0.25 µM RavA, 0.25 µM ViaA, 0.5 µM LdcI, 0.5 µM LdcC and 0.5 µM BSA.]

FIGURE 23. The effect of RavA binding to LdcI on LdcI and RavA activities.

A, enzymatic activity of purified LdcI at various pH’s in the presence (empty circles/diamond) and absence (filled circles/diamond) of RavA. B, measurement of LdcI activity in soluble cell extracts prepared from WT and ΔravA cells grown in LB in the presence of glucose. Activity is given per μg of total soluble protein. No difference in activity is observed between strains. C, growth of WT, ΔravA, and ΔcadB cells in lysine decarboxylase indicator medium containing lysine, glucose as a carbon source, and bromocresol purple as a pH indicator. At various time points, samples were taken and cleared of cells by centrifugation. The pH of the supernatant was recorded by monitoring the change in absorbance at 590 nm of the bromocresol purple indicator. Growth in the presence of glucose results in a decrease in the pH of the medium from pH 6.1 initially to pH 5.0. D, extreme acid shock survival assay. The susceptibility of WT and ΔravA strains to extreme acid shock (pH 2.5 for 2 hours) was assessed in the presence and absence of lysine to monitor the effectiveness of the CadBA acid stress response system. Both strains suffer a significant decrease in survival in the absence of lysine (compare row 2 to 1 and row 5 to 4) but are unaffected by the shock if lysine is present (compare row 3 to 1 and row 6 to 4). The CadBA system appears to be equally effective in both strains. E, expression of RavA and LdcI in response to low pH. WT cells were grown in LB media at pH 7.0 to mid-log phase and then shifted to pH 5.0 by addition of low pH media, or kept at pH 7.0 as a control by adding pH 7.0 media. Samples were withdrawn at various time points after addition and the levels of RavA and LdcI were monitored by Western blotting. F, effect of BSA, LdcI, LdcC, and ViaA on ATPase activity of purified RavA at pH 7.5, 37°C.

confirmed by comparing the ability of WT and ΔravA cells to recover from a decrease in pH using the CadBA system by growing cells in special indicator media (Fig. 23C). This medium contains excess glucose, bromocresol purple as a pH indicator, and lysine for use by the CadBA system. The change in the pH of the medium over time can be tracked by monitoring the change in the absorbance of the indicator at 590 nm. As shown in Fig. 23C, both WT and ΔravA cells experience an initial decrease in pH from 6.1 to about 5.0, but then recover, eventually raising the pH of the medium back to pH 6.1. On the other hand, a ΔcadB mutant, which is also deficient in LdcI/CadA production (data not shown), cannot restore the pH of the medium, showing that pH recovery is dependent upon a functional CadBA system. WT and ΔravA strains have equal recovery rates, showing that the lysine-dependent response to low pH does not appear to be compromised in the knockout strain.

The ability of WT and ΔravA cells to respond to a more extreme acid challenge was also examined. Cells were grown to stationary phase in LB media and then exposed to minimal media at pH 2.5, with and without added lysine, for 2 hours. Cells were then diluted in LB media and plated on LB agar plates to measure cell viability (Fig. 23D). Both WT and ΔravA cells appear to withstand this extreme acid stress equally well in the presence of lysine, providing further evidence that the CadBA system does not appear to be compromised in the ravA deletion strain. Finally, while LdcI is strongly induced when cells are shifted from neutral pH to pH 5.5, no such induction is observed for RavA (Fig. 23E). RavA is also not induced by shift to high pH media (data not shown). The induction of LdcI or its levels were also not affected in ravA deletion or overexpression strains (data not shown). All the data presented above seem to strongly suggest that the RavA-LdcI complex does not function to regulate LdcI activity, but, rather, the purpose of the complex is to regulate the activity of the RavA protein.

The effect of LdcI on RavA ATPase activity was then measured. RavA ATPase activity was assessed using 1 mM ATP as substrate in the presence of ViaA, LdcI, LdcC and BSA (Fig. 23F). None of the added proteins appears to possess significant ATPase activity of its own (columns 1 through 4). The activity of 0.25 μM RavA alone is shown in column 5. Addition of 0.5 μM BSA or 0.5 μM LdcC does not have any significant effect on RavA ATPase activity (columns 6 and 7), consistent with the lack of binding of BSA or LdcC to RavA. Addition of 0.5 μM LdcI results in a 1.4-fold stimulation of RavA ATPase activity (column 8). Increasing the concentration of LdcI did not result in further stimulation of RavA ATPase activity (data not shown). This observed increase in activity can be explained in several ways. One possibility is that RavA may exist in a more active form upon binding LdcI; alternatively, LdcI may help stimulate RavA hexamer formation in the presence of nucleotide by acting as a scaffold, with a concomitant increase in the overall ATPase activity of the population. Addition of ViaA results in a 1.9-fold stimulation of RavA ATPase activity (column 9). RavA activity in the presence of both 0.5 μM LdcI and 0.25 μM ViaA was stimulated 2.4-fold (column 10). Since the fold stimulation obtained is not an average of 1.4 and 1.9, this might indicate the formation of a ternary RavA-ViaA-LdcI complex. Alternatively, in the presence of LdcI and ViaA, RavA might exist in a different conformation or oligomeric state than in the presence of LdcI or ViaA alone. It should be noted that the stimulations observed in the above experiments are reproducible with different preparations of proteins.

Based on the above data, our current speculation is that LdcI functions to regulate RavA and that the RavA-LdcI complex is formed under certain conditions to modulate RavA function. Although we have identified a putative regulator of RavA, efforts are currently underway to determine the exact role of RavA and ViaA in the cell and the substrates targeted by this AAA+ and VWA system.

3.5 Discussion

3.5.1 RavA-ViaA Might Play a Role in Metal Insertion

Our analysis of RavA reveals that it is a member of a distinct subfamily of the MoxR AAA+ proteins whose function remains poorly characterized. We were able to divide the MoxR proteins into six subfamilies (Fig. 16A). Intriguingly, despite the distant relationship between members of the MoxR-proper and CGN subfamilies, there appears to be a remarkable similarity in the nature of their function. The parallels between the importance of the CGN family members NorQ/NirQ and CbbQ for the production/activation of nitric oxide reductase and RuBisCO, respectively, and the importance of MoxR for the production of active methanol dehydrogenase are striking. Even the GvpN protein, although not important for the biogenesis of an enzyme, appears to be crucial for the production of the large proteinaceous structure that comprises gas vesicles in certain cyanobacteria and halophilic archaea. Thus, it does not seem unreasonable to presume that RavA proteins, and indeed members of the other uncharacterized MoxR subfamilies, may play similar roles.

The co-occurrence of genes encoding proteins with VWA domains with MoxR AAA+ genes is also notable. We have demonstrated that the genes encoding RavA and its corresponding VWA protein, ViaA, comprise an operon (Fig. 19). This is consistent with results reported for the NorQ/NirQ and CbbQ proteins, whose genes are also part of an operon containing immediately adjacent VWA protein-encoding genes. These VWA domain-containing proteins were also shown to be important in the biogenesis of active nitric oxide reductase and RuBisCO. Although the exact role of ViaA is unclear, one possibility is that ViaA and RavA function in a role similar to that of the metal-chelatases, a family of AAA+ proteins to which the MoxR family has been shown to be closely related (Iyer et al. 2004; Wolf et al. 2001). In these metal-chelatases, the AAA+ components work in conjunction with VWA proteins to mediate the insertion of Co2+ or Mg2+ into porphyrin rings as part of the synthesis of cobalamin or (bacterio)chlorophyll, respectively (Fodje et al. 2001). In the case of Mg-chelatase, one of the subunits, BchD, actually contains an AAA+ and a VWA domain fused together on a single polypeptide, the only example of such an arrangement known in bacteria and archaea (Whittaker and Hynes 2002). It is intriguing to speculate that RavA and ViaA may, therefore, play a role in metal insertion, a process which may be important in the activation of their substrate(s). Interestingly, both the nitric oxide reductase for which NorQ/NirQ are important and the RuBisCO for which CbbQ is important depend upon metal ions for their function. Efforts are currently underway to explore this possibility.

3.5.2 Possible Implications of LdcI Binding to RavA

The interaction of RavA with LdcI is interesting, but its exact role is puzzling. RavA does not appear to exert any effect upon the activity of LdcI under a range of different in vitro conditions (Fig. 23A). In addition, cells in which the ravA gene has been disrupted display normal LdcI induction and activity, and do not appear to be compromised in their response to a variety of acid stresses (Fig. 23B-D). RavA levels are also not induced by acid, and the protein is present under growth conditions in which the LdcI enzyme is not present (Fig. 23E). The interaction of LdcI with RavA does, however, result in an increase in RavA ATPase activity, suggesting that the complex is indeed functionally important. A further increase is observed in the presence of ViaA, suggesting ternary complex formation. The interaction also appears to be highly specific to LdcI, since RavA does not interact with the closely related LdcC protein. One possibility is that the complex formed between RavA and LdcI represents a means of regulating RavA activity in response to acid stress conditions. Thus RavA can be envisioned to carry out an independent function of some form, possibly helping to mediate the folding/assembly of particular enzyme complexes during stationary phase. Upon exposure to acid stress conditions, levels of LdcI increase dramatically, leading to the formation of the RavA-LdcI complex. This complex may represent a way of sequestering RavA protein from its natural substrates in response to acid stress. Alternatively, the complex may provide a means of maintaining RavA function under acid stress conditions, something more in line with the observed stimulation of RavA activity. By acting as a ‘scaffold’, LdcI may stabilize the RavA oligomer under acid stress conditions. It is also possible that the complex formed may lead to a ‘gain of function’, directing RavA towards entirely new substrate proteins via localization and/or conformational changes.


3.5.3 Implications of the RavA-LdcI complex structure

EM analysis has provided us with an exceptional view of the LdcI enzyme both alone and in complex with RavA. The structure of the LdcI particle, deduced from the presented electron microscopy data (Fig. 22A-C), confirms the suggestion of Sabo et al. (1974) that the LdcI oligomer is composed of five stable dimers. The characterisation of LdcI in that work had indeed indicated that the protein was dimeric at pH 8.0 and at low ionic strength, whereas lowering of the pH to 7.0 and a concomitant increase in ionic strength favoured an association of dimers into decamers. The dimer-decamer equilibrium was shown to be dependent on protein concentration. These observations explain the presence of some dimers under our experimental conditions. It should be noted that, based on a visual analysis of individual side views, Sabo et al. (1974) proposed the orientation of the dimer to be roughly perpendicular to the plane of the ring. In contrast, our 3D reconstruction unambiguously demonstrates a pronounced rotation of the monomer inside the dimeric building block of the LdcI double-toroid.

The cage-like structure of the imaged RavA-LdcI complex makes it tempting to speculate that it may provide a space in which another protein or proteins might be confined. Proteins held in such a position might then be subject to remodelling by the RavA oligomers. Such a complex could thus represent a novel chaperone structure assembled under acid stress conditions when levels of LdcI enzyme increase dramatically.

In summary, considerable information has been gathered on the RavA protein and its interaction with ViaA and LdcI. Its regulation by σS clearly suggests an importance of this protein in stationary phase and general stress responses. However, the substrate targets of RavA and the significance of its interaction with LdcI are currently under investigation.


4. Towards Understanding the Function of the RavA-ViaA Chaperone System: A Phylogenetic Profiling and Microarray Study


4.1 Summary

The MoxR family of AAA+ proteins is a diverse group of ATPases, widespread throughout bacteria and archaea. Bioinformatics and experimental data accumulated to date reveal that these proteins can be divided into at least seven distinct subfamilies, and suggest a general role as molecular chaperones involved in the assembly of multimeric complexes, possibly mediating metal insertion events. Despite this information, however, very little is known about the specific functional roles and mechanistic aspects of these proteins. We have previously reported an extensive experimental analysis of the RavA subfamily of MoxR proteins, using a representative member from Escherichia coli. This information provided us with valuable biophysical data regarding RavA, and identified the VWA-containing protein ViaA as a putative functional partner. In addition, we identified the inducible lysine decarboxylase enzyme, LdcI, as a RavA interactor and potential regulator of RavA activity. The functional role of RavA, and its associated ViaA protein, however, has still not been determined.

Here we present the results of a genomic profiling study and microarray analysis experiments, performed in an effort to identify putative interaction partners/substrates and systems associated with RavA-ViaA. The results of this work identify a list of candidate interaction partners and systems, and suggest a role for RavA family members in a number of metabolic pathways including the cellular envelope, metal/sulfur metabolism and stress response.

4.2 Introduction

The AAA+ proteins, or ATPases Associated with various cellular Activities, are a large superfamily of P-loop NTPases. Members of this superfamily display remarkable functional diversity, and have been shown to be involved in processes including protein refolding and degradation, DNA repair and replication, transcriptional regulation, ribosome and organelle biogenesis, and molecular transport processes (Iyer et al. 2004; Neuwald et al. 1999). Regardless of their exact function, however, AAA+ proteins display a general role in molecular remodeling events, utilizing the power of ATP hydrolysis to drive conformational changes.

Proteins belonging to the AAA+ superfamily contain one or more copies of the AAA+ module, a 200-250 amino acid region responsible for ATP binding and hydrolysis. These modules consist of two distinct structural subdomains: an N-terminal α/β-core subdomain and a smaller, C-terminal α-helical subdomain (Ogura and Wilkinson 2001). AAA+ modules contain a number of conserved sequence motifs responsible for ATP sensing and hydrolysis, including the Walker A motif (GxxGxGKT), the Walker B motif (hhhhDE), Sensor I (containing a conserved Asn, Thr, Ser, or His residue), and Sensor II (containing a conserved Arg residue) (Neuwald et al. 1999). In general, AAA+ proteins function as oligomeric species, with hexameric rings being most frequently observed (Hanson and Whiteheart 2005). A number of recent phylogenetic analyses using sequence and structural information have shown that the AAA+ superfamily can be subdivided into numerous smaller families (Ammelburg et al. 2006; Beyer 1997; Frickey and Lupas 2004; Iyer et al. 2004; Neuwald et al. 1999). One major, but poorly characterized, family is the MoxR family, whose members are widespread throughout bacteria and archaea (Ammelburg et al. 2006; Frickey and Lupas 2004; Iyer et al. 2004; Neuwald et al. 1999).

Previously, we described a detailed phylogenetic, gene neighbourhood, and literature analysis of the MoxR AAA+ proteins (Snider and Houry 2006; see Chapter 2). Our study defined a number of distinct MoxR subfamilies, including MoxR Proper (MRP), TM0930, CGN, APE2220, PA2707, YehL, and RavA, and various subfamily-specific gene associations, suggesting the existence of potential gene clusters and operons. Although the specific function of MoxR AAA+ proteins is still unknown, our work revealed a possible role as molecular chaperones involved in metal insertion and the assembly of multimeric protein complexes.

Much of the recent work in our group has focused on understanding the role of the RavA subfamily. To this end, we have performed extensive experimental characterization using Escherichia coli K12 RavA as our model protein. Our gene neighbourhood analysis (Snider and Houry 2006) found that ravA genes, like most other MoxR AAA+ genes, co-occur with genes encoding Von Willebrand Factor Type A (VWA) domain-containing proteins. The VWA domain is a metal-binding domain often involved in protein-protein interactions (Whittaker and Hynes 2002). We have named the VWA gene associated with ravA subfamily members viaA. We have shown experimentally that the E. coli ravA and viaA genes comprise an operon (Fig. 24), and that the interaction of RavA and ViaA leads to the stimulation of RavA ATPase activity (Snider et al. 2006). The functional significance of this interaction is, however, unclear. We have also found that RavA tightly binds the inducible lysine decarboxylase (LdcI) to form a large cage-like structure that we elucidated by electron microscopy (Snider et al. 2006). The lysine decarboxylase is involved in the bacterial acid stress response (Foster 2004). The interaction between RavA and LdcI seems to be regulatory in nature, resulting in the enhancement of RavA ATPase activity; the activity of LdcI, however, is not affected by RavA binding.

In an effort to obtain further information about the function of the RavA-ViaA chaperone system and to determine the putative substrates of this system, a genomic profiling study and microarray analysis experiments were conducted. The results of these studies provided us with a list of candidate interaction partners and systems, suggesting potential roles for RavA family members in a number of metabolic pathways including the cellular envelope, metal/sulfur metabolism, and stress response.

4.3 Materials and Methods

4.3.1 Phylogenetic Profiling Analysis

An all-against-all BlastP analysis of 14 RavA-containing organisms belonging to the gammaproteobacteria and with completely sequenced genomes was performed using default parameters. The BlastP results were then analyzed to develop genomic profiles using Escherichia coli K12 as our reference dataset. The reference set served as a source of query sequences whose presence or absence was assessed in each of the other 13 genomes as follows. The best-hit match for each protein in the reference set was identified in each of the other 13 genomes. Any best-hit which had an Expect value below a cutoff of 10^-10 was considered to be Present in a target genome. Best-hits which failed to meet the Expect cutoff were analyzed further to determine if they produced a reciprocal best-hit with the reference set query protein. If such a reciprocal best-hit was detected, the sequence was also called Present in a given genome. Any sequences which failed to meet the Expect cutoff and did not produce a reciprocal best-hit were considered Absent. We then performed a secondary profiling analysis using genes which were 100% conserved across the RavA-containing organisms. These genes were analyzed using an arbitrary set of 25 gammaproteobacteria which were shown in our previous analysis not to contain a ravA gene. This analysis was used in an attempt to identify genes which were specifically enriched in the RavA-containing organisms.

FIGURE 24. Genomic environment of ravA in E. coli K12 MG1655. The figure is drawn to scale. [Figure not reproduced. Gene labels: gidA, mioC, asnC, asnA, viaA, ravA, kup, rbsD, rbsA. Position markers: 3922.8 kb, 3928.0 kb, 3928.6 kb.]

The reliability of the profiling method was established by examining the distribution of RavA proteins. The current analysis successfully detected RavA sequences in all 14 organisms of the ‘RavA’ set and in none of the 25 organisms of the ‘non-RavA’ set, consistent with our previous analysis (Snider and Houry 2006). Manual examination of the best-hits revealed that all matches were to those of RavA sequences, and no spurious matches to other MoxR AAA+ proteins had contributed to a Present call. Thus, the profiling method was robust.
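The Present/Absent calling rule and the enrichment criterion described in this section can be illustrated with a short Python sketch. This is not the software used in this work; the record type BestHit, the function names, and the dictionary layout of the profiles are hypothetical and were chosen only for illustration. The sketch assumes that the best hit of each reference protein in each target genome, and the reciprocal best hit of that target protein, have already been extracted from the BlastP output.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Set

E_VALUE_CUTOFF = 1e-10  # Expect-value cutoff quoted in the text


@dataclass
class BestHit:
    """Best BlastP hit of one reference protein in one target genome (hypothetical record)."""
    target_id: str           # identifier of the best-hit protein in the target genome
    evalue: float            # Expect value of that hit
    reciprocal_best_id: str  # best hit of target_id when searched back against the reference set


def call_presence(query_id: str, hit: Optional[BestHit]) -> str:
    """Return 'Present' or 'Absent' for one reference protein in one target genome."""
    if hit is None:
        return "Absent"                   # no hit at all in the target genome
    if hit.evalue < E_VALUE_CUTOFF:
        return "Present"                  # best hit passes the Expect cutoff
    if hit.reciprocal_best_id == query_id:
        return "Present"                  # weak hit rescued by a reciprocal best hit
    return "Absent"


def enriched_genes(
    rava_profile: Dict[str, Dict[str, str]],
    non_rava_profile: Dict[str, Dict[str, str]],
) -> Set[str]:
    """Genes Present in every RavA-containing genome and Absent from every other genome.

    Both profiles map gene name -> {genome name -> 'Present' or 'Absent'}.
    A gene with no entry in non_rava_profile is treated as Absent everywhere.
    """
    result = set()
    for gene, calls in rava_profile.items():
        in_all_rava = all(c == "Present" for c in calls.values())
        other_calls = non_rava_profile.get(gene, {})
        in_no_other = all(c == "Absent" for c in other_calls.values())
        if in_all_rava and in_no_other:
            result.add(gene)
    return result
```

In this sketch the reciprocal best-hit test is only consulted when the Expect cutoff is not met, mirroring the order of the steps given above.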

4.3.2 Bacterial Strains

Wild-type MG1655 strain was purchased from ATCC (ATCC Number 47076). MG1655 ΔravA::cat was constructed from WT MG1655 using the Lambda Red system as previously described (Snider et al. 2006). The chloramphenicol resistance cassette was flanked by FRT sites and was therefore removed using FLP recombinase expressed from the pCP20 plasmid, which displays both temperature-inducible expression of FLP and temperature-sensitive replication, allowing the cells to be cured of the pCP20 plasmid (Cherepanov and Wackernagel 1995; Datsenko and Wanner 2000). This allowed the generation of strain MG1655 ΔravA with no marker. Strains overexpressing RavA and RavA-ViaA under their natural promoters were constructed by transforming MG1655 with plasmids pR and pRV (see below). MG1655 cells overexpressing RavA under the control of a T7 promoter were produced by first transforming cells with the pT7POL26 plasmid, which expresses T7 RNA polymerase under the control of an IPTG-inducible promoter (Casadaban and Cohen 1980), followed by pET22b-RavA (see below). Control cells were transformed with empty pET22b vector in place of pET22b-RavA.

4.3.3 Gene Cloning

ravA and ravA-viaA genes under the control of their own promoters were cloned from Escherichia coli K12 MG1655 and Escherichia coli O157:H7, respectively. The ravA-viaA promoter regions and the ViaA amino acid sequences are identical between the two strains, while the RavA amino acid sequences are 99% identical and are expected to behave identically. ravA alone and the ravA-viaA operon were amplified by PCR together with 206 bp of the promoter region. The promoter-ravA fragment was amplified using primers RAVA2_forward (5´-gtggatccgaaatgtgtgcttagtcccttg-3´) and RAVA2_reverse (5´-atggatccgactgcgcacgcttcacg-3´), while the promoter-ravA-viaA fragment was amplified using primers RAVA2_forward and VIAA_reverse (5´-ctatggatccttatcgccgccagcgtctgagc-3´). The amplified products were then placed in the p11 plasmid (Zhang et al. 2001) using the BglII and BamHI sites. Use of these cut sites removes the p11 T7 promoter sequence. For overexpression of untagged RavA under the control of a T7 promoter, the ravA open reading frame from our previously prepared p11 construct (Snider et al. 2006) was subcloned into pET22b using the NdeI and BamHI restriction sites.


4.3.4 Western Blot Analysis

50 mL of MG1655 cells containing p11, pR, or pRV plasmid were grown at 37°C in LB containing 100 μg/mL ampicillin. At OD600 ~ 0.5, 2.6, and 5.0, 10 mL aliquots were removed, pelleted by centrifugation, resuspended in lysis buffer LY1 (25 mM Tris-HCl, pH 7.5, 300 mM NaCl, 0.1 mM, 5% glycerol, and 1 mM DTT), and lysed by sonication. Samples were centrifuged to remove cellular debris and the supernatant was transferred to a fresh tube. Protein concentration was determined using the BioRad Protein Assay. 20 μg of protein was separated on a 10% SDS-PAGE gel and transferred to Biotrace PVDF membrane (PALL) using a BioRad semidry transfer apparatus as per the manufacturer’s instructions. Membranes were then incubated with rabbit polyclonal αRavA antibodies generated at the Division of Comparative Medicine, University of Toronto. Membranes were washed and incubated with protein LA-peroxidase (Sigma). ECL detection reagent (GE Healthcare) was used to visualize protein bands on a Bioflex chemiluminescence detection film (Clonex).

4.3.5 Microarray Experiments

MG1655, MG1655 ΔravA::cat, MG1655 p11, MG1655 pR, and MG1655 pRV were grown in LB media at 37°C from an initial OD600 of ~0.025. Cells were harvested when their OD600 reached ~3 and total RNA was isolated from 500 μL volumes using the Qiagen RNeasy Mini Kit with RNAprotect Bacteria Reagent, as per the manufacturer’s instructions. Samples were stored at -80°C. Total RNA quality was assessed using the Agilent 2100 Bioanalyzer (Agilent Technologies).

Microarray analysis was performed at The Centre for Applied Genomics Microarray Facility, Hospital for Sick Children, Toronto, Ontario, Canada, using Affymetrix GeneChip E. coli Genome 2.0 Arrays. Sample preparation and array processing were performed following standard protocols as outlined in Section 3a of the Affymetrix GeneChip Expression Analysis Technical Manual, available for download from the Affymetrix company website. In brief, cDNA synthesis was performed with Invitrogen Superscript II enzyme using random primers and 10 μg total RNA template. RNA template was subsequently degraded using NaOH, followed by cDNA clean-up using Qiagen MinElute PCR Purification Columns. cDNA was then fragmented using DNaseI (GE Healthcare) and labeled with biotin at the 3´ end using GeneChip DNA Labeling Reagent (Affymetrix) and Promega Terminal Deoxynucleotidyl Transferase. Between 2 and 5 μg of biotin-labeled cDNA was used in subsequent hybridization to the E. coli Genome 2.0 Arrays. Hybridization, washing, and staining were performed in the Affymetrix GeneChip Hybridization Oven 640 and Fluidics Station 450. Arrays were scanned using the Affymetrix GeneChip Scanner 3000.

4.3.6 Analysis of Microarray Data

Single array data analysis was performed using the GeneChip Operating Software (GCOS). Array signal intensities were globally scaled using an All Probe Sets Scaling strategy and a target signal of 150. Default E. coli Genome 2.0 Array parameters were used in the analysis for the determination of the presence and absence of genes. Any genes which produced an ‘Absent’ call were automatically assigned a signal intensity of ‘0’. Details on the statistical methods employed can be found in the Affymetrix GeneChip Expression Analysis manual (Data Analysis Fundamentals, available on the Affymetrix company website). Comparison analysis was performed for WT vs. ΔravA::cat, WT p11 vs. WT pR and WT p11 vs. WT pRV, using a bootstrapping approach for unpaired data. All analyses were performed using in-house software written using ActivePerl version 5.8.0.806 (ActiveState).

Bootstrap datasets were generated by random resampling from the original datasets, with replacement. In a comparison, 10,000 bootstrap datasets were generated per gene. These random datasets were used in the generation of a t-distribution, to which the t-statistic of the original dataset for each gene was compared. An empirical p-value was calculated from the proportion of bootstrap t-statistics which were more extreme than the t-statistic calculated from the original dataset. Changes in gene expression levels having p-values less than 0.05 were selected as being significant and the signal log2 ratios of these changes were calculated. Only significant changes with absolute signal log2 ratios of 0.6 (~1.5 fold absolute change in transcript level) or greater were selected for further analysis.
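The bootstrap test and the selection thresholds described above can be sketched in Python as follows. This is an illustration only and not the in-house Perl software used for the analysis. Two details are assumptions that the text does not specify: resampling is done from the pooled values of both conditions (which simulates the null hypothesis of no difference), and the t-statistic is the unequal-variance (Welch) form. The function names are hypothetical.

```python
import math
import random
import statistics
from typing import Sequence


def t_statistic(a: Sequence[float], b: Sequence[float]) -> float:
    """Unpaired t-statistic with unequal variances (Welch form, an assumption)."""
    se = math.sqrt(statistics.variance(a) / len(a) + statistics.variance(b) / len(b))
    diff = statistics.mean(a) - statistics.mean(b)
    if se == 0.0:
        # Degenerate resample with no spread: treat a non-zero difference as infinitely extreme.
        return 0.0 if diff == 0.0 else math.copysign(math.inf, diff)
    return diff / se


def bootstrap_p_value(a: Sequence[float], b: Sequence[float],
                      n_boot: int = 10_000, seed: int = 0) -> float:
    """Empirical two-sided p-value for one gene from n_boot bootstrap datasets."""
    rng = random.Random(seed)
    t_obs = abs(t_statistic(a, b))
    pooled = list(a) + list(b)  # assumption: resample from the pooled data
    extreme = 0
    for _ in range(n_boot):
        ra = [rng.choice(pooled) for _ in a]  # resampling with replacement
        rb = [rng.choice(pooled) for _ in b]
        if abs(t_statistic(ra, rb)) >= t_obs:
            extreme += 1
    return extreme / n_boot


def is_selected(p_value: float, signal_log2_ratio: float) -> bool:
    """Apply the thresholds from the text: p < 0.05 and |log2 ratio| >= 0.6."""
    return p_value < 0.05 and abs(signal_log2_ratio) >= 0.6


# A log2 ratio of 0.6 corresponds to a fold change of 2 ** 0.6, roughly 1.5.
```

Each group must contain at least two values for the variance to be defined. The default of 10,000 resamples per gene matches the number stated above.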

A manual review of the change in gene levels was then performed. All remaining genes were examined using the data currently available in the EcoCyc Encyclopedia of Escherichia coli K-12 Genes and Metabolism Database (http://ecocyc.org/) and were grouped together into operons where possible (Keseler et al. 2005). As a final screen, any genes whose levels changed according to the above criteria and which were part of a recognized transcriptional unit, but for which the majority of genes in the transcriptional unit did not change, were considered spurious and discarded. In order to provide general functional classification, the predicted products of all differentially expressed genes were also submitted for Clusters of Orthologous Groups (COG) analysis using the COGNITOR program (Tatusov et al. 2000) available through the NCBI website.
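The final transcriptional-unit screen described above was carried out by manual review. Purely as an illustration of the rule, it can be written as a small Python function. The function name, the data layout, and the gene and unit names in the usage note are hypothetical. The sketch assumes that each gene belongs to at most one transcriptional unit and that "majority" means strictly more than half of the unit's genes.

```python
from typing import Dict, Set


def filter_by_transcription_unit(changed: Set[str],
                                 units: Dict[str, Set[str]]) -> Set[str]:
    """Discard changed genes whose transcriptional unit is mostly unchanged.

    changed: genes that passed the p-value and log2-ratio thresholds.
    units:   unit name -> set of member genes (each gene in at most one unit).
    """
    gene_to_unit = {gene: name for name, genes in units.items() for gene in genes}
    kept = set()
    for gene in changed:
        unit = gene_to_unit.get(gene)
        if unit is None:
            kept.add(gene)          # not part of a recognized unit: keep
            continue
        members = units[unit]
        n_unchanged = len(members - changed)
        if 2 * n_unchanged > len(members):
            continue                # majority of the unit did not change: spurious
        kept.add(gene)
    return kept
```

For example, with units {"op1": {"a", "b", "c"}, "op2": {"d", "e"}} and changed genes {"a", "d", "e", "x"}, gene a is discarded because two of the three genes in op1 did not change, while d, e and the unassigned gene x are retained.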

4.4 Results and Discussion

4.4.1 Phylogenetic Profiling Analysis

In order to identify potential RavA interaction partners or substrates, we carried out phylogenetic profiling analysis using 14 gammaproteobacteria previously shown to contain RavA subfamily members (Snider and Houry 2006) and for which complete genome sequence information was available. Our approach involved using an all-against-all BlastP analysis of these 14 organisms, identifying the best-hits and the corresponding Expect values of these matches as described in Materials and Methods. The profile was developed using Escherichia coli K12 as the reference set. From this profile, a total of 1918 ORFs conserved across all 14 RavA-containing organisms were identified. In order to identify ORFs specifically enriched in these organisms, all 1918 of these ORFs were subjected to further profiling analysis using 25 gammaproteobacteria which were shown in our previous bioinformatics analysis (Snider and Houry 2006) not to contain RavA. The list of organisms used in our profiling analysis is given in Supplementary Table 1. The complete results of the profiling analysis, covering all E. coli ORFs, are given in Supplementary Table 2.

The genes that were found to be present in all 14 RavA-containing gammaproteobacteria, but absent in the set of 25 organisms lacking RavA, are listed in Table 2. One of these highly enriched genes is viaA. This is in agreement with our experimental data, which showed a functional interaction between RavA and ViaA resulting in the enhancement of RavA ATPase activity (Snider et al. 2006). It is also in agreement with our earlier gene neighbourhood analysis, which revealed that viaA genes are often found in close proximity to ravA genes, and our observation that the ravA and viaA genes in Escherichia coli K12 form an operon (Snider et al. 2006; Snider and Houry 2006).

In addition to ravA and viaA, there are 18 other genes that are enriched in RavA-containing organisms (Table 2). It is tempting to speculate that one or more of these proteins may be involved with the RavA-ViaA system, acting as functional partners or substrates. Notably, a number of the enriched genes encode membrane-associated proteins. These proteins are involved in various processes, including transport (ChbB, YbaE, TolA), flagellar export/biosynthesis (FliO), biofilm formation (YggN), cell division (FtsN) and stress response (UspB). In addition, there are a number of poorly characterized proteins whose exact roles are not well understood (YicH, YijD, YhhL, FxsA).


TABLE 2. Significantly enriched genes detected by profiling analysis. List of genes present in all 14 RavA-containing gamma proteobacteria, but absent from all 25 non-RavA-containing gamma proteobacteria. The subcellular localization of the product of each gene is also shown. Genes are separated into two major groups, cytoplasmic vs. non-cytoplasmic, and are sorted alphabetically by name within each.

No.  GI Number  Name  Description

Cytoplasmic
1    16131945   melA  Alpha-galactosidase, NAD(P)-binding.
2    90111646   ravA  Putative AAA+ chaperone.
3    16129069   thiK  Thiamine kinase. Involved in thiamine salvage.
4    90111677   ubiC  Chorismate pyruvate lyase. Role in ubiquinone biosynthesis.
5    49176398   viaA  Predicted von Willebrand factor containing protein.
6    16128156   yaeH  Putative structural protein.
7    16128225   yafA  Fermentation/respiration switch protein. Esterase activity.
8    16128450   ybaM  Small hypothetical protein.
9    90111146   ybcJ  Small, predicted RNA-binding protein.

Non-cytoplasmic
10   16129692   chbB  N,N'-diacetylchitobiose-specific enzyme IIB component of PTS.
11   90111358   fliO  Component of FliOPQR-FlhAB flagellar export channel.
12   16131771   ftsN  Essential cell division protein.
13   90111691   fxsA  Unknown function. FxsA overproduction inhibits F exclusion of bacteriophage T7.
14   16128714   tolA  Component of Tol-Pal complex. Role in colicin & DNA uptake / cell envelope integrity maintenance.
15   16131366   uspB  Universal stress protein UspB.
16   16128430   ybaE  Predicted transporter subunit: periplasmic-binding component of ABC.
17   16130859   yggN  Involved in biofilm formation.
18   16131338   yhhL  Conserved inner membrane protein. Unknown function.
19   16131526   yicH  Unknown function. May contain a β-barrel.
20   16131802   yijD  Unknown function. Causes mutator phenotype when overexpressed.

Enriched genes encoding cytoplasmic proteins include a number whose function is unknown (YbcJ, YbaM and YaeH), as well as several enzymes involved in metabolic/biosynthetic processes. The latter include the α-galactosidase MelA, a dimeric, Mn2+-dependent enzyme required by E. coli for the use of α-galactosides as a carbon source (Nagao et al. 1988), the thiamine kinase ThiK, a Mg2+-dependent enzyme involved in thiamine salvage (Iwashima et al. 1972; Melnick et al. 2004), the chorismate pyruvate lyase UbiC, involved in the aerobic biosynthesis of the electron carrier ubiquinone (Nichols and Green 1992; Siebert et al. 1994), and the YafA esterase, which is proposed to play a role in regulating flux between respiration and fermentation pathways (Koo et al. 2004; Kuznetsova et al. 2005).

Thus, our profiling results suggest a link between RavA and the cell envelope, as well as a possible association with several metabolic pathways that include metal-dependent enzymes. This observation is consistent with our previous work suggesting an involvement of the RavA-ViaA system in metal insertion (Snider and Houry 2006). The small pool of enriched genes may suggest that RavA activity is required for specific rather than general cellular functions or pathways. Nevertheless, we cannot rule out an involvement of RavA with highly conserved systems or proteins, since genes present in nearly all organisms cannot appear enriched in a phylogenetic profile.

4.4.2 Microarray Analysis

In order to experimentally identify potential cellular pathways in which RavA and RavA-ViaA activity is required, and to provide experimental evidence to complement our phylogenetic profile analysis, we conducted a microarray study on Escherichia coli K12 MG1655 WT and ΔravA::cat cells, as well as on WT cells containing plasmids carrying the ravA gene (pR) or ravA-viaA operon (pRV) under the control of their endogenous promoters, constructed as described in Materials and Methods. RavA levels in cells carrying pR and pRV are substantially increased with respect to levels in the WT strain carrying the empty plasmid (Fig. 25). This increase is likely due to the presence of multiple copies of the plasmid-borne genes, and might be indicative of titration of an as yet unidentified transcriptional repressor. It should be noted that the overexpression of RavA, when placed under a T7 promoter, results in cell death (data not shown); hence, excess RavA might perturb specific metabolic pathways.

Perturbing the wild-type levels of the RavA and ViaA proteins in cells should alter the expression levels of genes directly or indirectly associated with, or dependent on, the RavA-ViaA system, thereby providing functional clues as to the cellular role of RavA-ViaA. The study was carried out on cells at early stationary phase (OD600 ~3, Fig. 26) since RavA expression is dependent on the stationary phase sigma factor, σS, and is highest upon entry into stationary phase (Snider et al. 2006). Total RNA was harvested from the cells and used in microarray analysis with Affymetrix E. coli Genome 2.0 Arrays. Gene expression profiles were then analyzed as described in Materials and Methods in order to identify differentially expressed genes. These genes were examined using the data currently available in the EcoCyc Encyclopedia of Escherichia coli K-12 Genes and Metabolism (Keseler et al. 2005) and were grouped together into operons where possible. In order to provide general functional classification, the protein products of all differentially expressed genes were also submitted for Clusters of Orthologous Groups (COG) analysis using the COGNITOR program (Tatusov et al. 2000). The overall results of the COG analysis are shown in Fig. 27.

4.4.2.1 Results of the Microarray Analysis of RavA-ViaA Overexpression Strain

The microarray data show that the presence of the pRV plasmid results in 6.4- and 12.2-fold increases in the levels of the ravA and viaA transcripts, respectively (Table 3). This increase in transcript levels is consistent with the results of our western blot analysis (Fig. 25A).
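As an illustration of how the fold changes and the ‘++’/‘--’ notation in Tables 3 to 6 can be read, the following minimal Python sketch formats a change from two signal values. The function name, the signal values and the detection flags are hypothetical; the actual normalization and significance testing are those described in Materials and Methods and are not reproduced here.

```python
def fold_change(wt_signal, test_signal, wt_detected=True, test_detected=True):
    """Format an expression change in the style of Tables 3 to 6.

    All inputs are hypothetical illustration values, not measured data.
    """
    if not wt_detected and not test_detected:
        return "NA"  # transcript absent in both strains: nothing to compare
    if not wt_detected:
        return "++"  # absent in wild type: a fold-increase cannot be calculated
    if not test_detected:
        return "--"  # absent in test strain: a fold-decrease cannot be calculated
    if test_signal >= wt_signal:
        return f"{test_signal / wt_signal:.1f}"
    # Decreases are written as a negative fold value, as in Tables 4 and 6.
    return f"-{wt_signal / test_signal:.1f}"


print(fold_change(100.0, 640.0))
print(fold_change(430.0, 100.0))
print(fold_change(0.0, 250.0, wt_detected=False))
```

With these made-up signals, the three calls give an increase, a decrease written with a minus sign, and a ‘++’ entry for a transcript that was not detected in the wild-type strain.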

The overexpression of both RavA and ViaA in cells resulted in a number of gene

[Figure 25: western blot probed with αRavA antibodies. Rows: MG1655 p11, MG1655 pR and MG1655 pRV. Lanes are labelled by OD600 (0.5, 2.6 and 5.0).]

FIGURE 25. RavA overexpression. Western blot analysis of E. coli K12 MG1655 cells containing p11, pR, or pRV plasmid using αRavA antibodies. RavA levels are significantly higher in strains containing pR (row 2) and pRV (row 3) than those containing p11 (row 1).

[Figure 26: growth curve plotting OD600 (0 to 6) against time (0 to 1600 min), with an αRavA western blot of samples taken at OD600 0.1, 0.5, 1.0, 2.1, 3.8 and 5.4.]

FIGURE 26. Analysis of RavA expression during cell growth. Growth curve for E. coli MG1655 cells under conditions identical to those used in the preparation of microarray samples (aerobic growth in LB media at 37°C). RavA induction is clearly increased towards late log/early stationary phase, and levels are particularly high at the point at which samples were selected for RNA preparation and microarray analysis (OD600 ~ 3.0).

[Figure 27: bar charts (0 to 45%) comparing the pRV, pR and ∆ravA conditions with all E. coli genes, by COG class. Panel A: % genes where levels increase. Panel B: % genes where levels decrease.]

COG class key:
[C] Energy Production and Conversion
[D] Cell Cycle Control / Cell Division
[E] Amino Acid Transport / Metabolism
[F] Nucleotide Transport / Metabolism
[G] Carbohydrate Transport / Metabolism
[H] Coenzyme Transport and Metabolism
[K] Transcription
[L] Replication, Recombination, and Repair
[M] Cell wall / membrane / envelope biogenesis
[N] Cell Motility
[O] Posttranslational Mod., Protein Turnover, Chaperones
[P] Inorganic Ion Transport / Metabolism
[RS] General Func. Prediction / Unknown Function
[T] Signal Transduction Mechanisms
[X] No Assigned COG

FIGURE 27. COG class distribution of genes undergoing significant changes in expression as obtained by microarray analysis. The distributions of genes whose levels significantly increase (A) or decrease (B) under each of the microarray conditions used are shown with respect to the distribution of COG classes across all E. coli genes.

TABLE 3. Genes whose transcript levels increase in MG1655 pRV cells relative to those in wild-type cells. Genes are grouped together into operons (where possible) and then listed alphabetically. ‘++’ indicates that transcript was not detected in wild-type cells, thus a ‘fold-increase’ could not be calculated.

#  Gene Name  b Number  Description  COG Class  Fold Change
1  asnA  b3744  Aspartate-ammonia ligase  E  2.0

2  asnC  b3743  AsnC transcriptional repressor  K  1.9

3  betT  b0314  BetT choline BCCT transporter  M  1.9

4  cirA  b2155  Outer membrane ferric siderophore, colicin receptor  P  4.9

5  edd  b1851  Phosphogluconate dehydratase  EG  1.7

6  glcC  b2980  GlcC transcriptional dual regulator  K  ++

7  purL  b2557  Phosphoribosylformylglycinamide synthase  F  3.8

8  ravA  b3746  Putative AAA+ chaperone  S  6.4
9  viaA  b3745  Predicted von Willebrand factor containing protein  R  12.2

10  yajR  b0427  YajR putative MFS transporter  GEPR  2.6

11  yccS  b0960  Hypothetical protein  S  1.5

12  ycjS  b1315  Putative galactose 1-dehydrogenase  R  1.7

13  ydcD  b1457  Hypothetical protein  X  1.6

14  ydhX  b1671  Putative , Fe-S subunit  C  2.1

15  ydjJ  b1774  Predicted oxidoreductase, Zn-dependent and NAD(P)-binding  ER  1.7

16  yedA  b1959  Putative transmembrane subunit  GER  1.5

17  yghJ  b2973  Predicted inner membrane lipoprotein  X  1.6

18  yhaM  b3108  Conserved protein of unknown function  X  2.6
19  yhaO  b3110  YhaO putative STP transporter  E  3.4

20  yhcN  b3238  Conserved hypothetical protein  X  2.1

21  yhfO  b3372  Fructoselysine 3-epimerase  G  1.6

22  yigJ  b3823  RhtC threonine Rht Transporter  E  1.7

23  yohL  b2105  Conserved hypothetical protein  S  ++

24  yrfD  b3395  Protein involved in utilization of DNA as a carbon source  X  4.3

25  IG_1390915_1391229-f  NA  Intergenic Region    2.0
26  IG_1223131_1223501-f  NA  Intergenic Region    ++

expression changes. A total of 26 significant increases in gene expression were observed, 24 of which were in defined, protein-encoding genes and 2 of which matched intergenic regions (Table 3). COG classification results are shown in Fig. 27A. Of the protein-encoding genes, 25.0% encoded products not belonging to an identifiable COG, while a further 29.2% encoded proteins belonging to COGs which were poorly characterized or of unknown function. A further 29.2% of proteins appear to be involved in amino acid transport and metabolism, 16.7% in carbohydrate transport and metabolism, and 8.3% in inorganic ion transport and metabolism. Another 2 proteins (8.3%) play a role in transcription. Note that of the proteins in the above groups, 4 belong to COGs corresponding to multiple classes (Table 3).
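The percentages above can be reproduced from the COG Class column of Table 3 with a short tally, sketched here in Python. It assumes that a gene assigned to several classes (for example, GEPR) is counted once in each of them and that the denominator is the 24 protein-encoding genes. Only the E, G, P and K classes are printed, since the pooling used for the poorly characterized and unassigned classes cannot be recovered from the table alone.

```python
from collections import Counter

# COG class strings for the 24 protein-encoding genes in Table 3.
table3_cog = {
    "asnA": "E", "asnC": "K", "betT": "M", "cirA": "P", "edd": "EG",
    "glcC": "K", "purL": "F", "ravA": "S", "viaA": "R", "yajR": "GEPR",
    "yccS": "S", "ycjS": "R", "ydcD": "X", "ydhX": "C", "ydjJ": "ER",
    "yedA": "GER", "yghJ": "X", "yhaM": "X", "yhaO": "E", "yhcN": "X",
    "yhfO": "G", "yigJ": "E", "yohL": "S", "yrfD": "X",
}


def class_percentages(cog_by_gene):
    """Percentage of genes per COG letter; multi-class genes count in each class."""
    counts = Counter(letter for classes in cog_by_gene.values() for letter in classes)
    total = len(cog_by_gene)
    return {letter: round(100 * n / total, 1) for letter, n in counts.items()}


pct = class_percentages(table3_cog)
multi_class = sorted(g for g, classes in table3_cog.items() if len(classes) > 1)

print({letter: pct[letter] for letter in "EGPK"})
print(multi_class)
```

Under these assumptions the tally gives 29.2% (E), 16.7% (G), 8.3% (P) and 8.3% (K), with four genes (edd, yajR, ydjJ and yedA) assigned to multiple classes.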

Overexpression of both RavA and ViaA also resulted in a decrease in the expression of some genes. A total of 24 significant decreases in transcript levels were detected, all of which corresponded to defined, protein-encoding genes (Table 4). COG functional analysis (Fig. 27B) showed that the largest COG classification group represented is inorganic ion transport and metabolism. Proteins not belonging to a recognized COG, or belonging to a COG of unknown function, comprise the next largest group, while the third largest group of proteins is involved in amino acid transport and metabolism; two of these are also predicted to be involved in coenzyme transport and metabolism.

Thus, in general, the largest number of changes appears to be in the transcript levels of genes involved in inorganic ion transport and metabolism and in amino acid transport and metabolism, as well as in genes of unknown function. These changes largely differ from the general distribution of COG classes across all known E. coli proteins (Fig. 27), suggesting an important bias rather than a random distribution.

4.4.2.2 Examining the Microarray Analysis Results of the RavA-ViaA Overexpression Strain

Examining some of the gene changes more closely reveals some interesting features. The most dramatic of these is in the cysteine regulon (Table 4 and Fig. 28), including the cysK,

TABLE 4. Genes whose transcript levels decrease in MG1655 pRV cells relative to those in wild-type cells. Genes are grouped together into operons (where possible) and then listed alphabetically. ‘--’ indicates that transcript was not detected in pRV cells, thus a ‘fold-decrease’ could not be calculated.

#  Gene Name  b Number  Description  COG Class  Fold Change
1  cysK  b2414  Cysteine synthase  E  -2.3

2  cysM  b2421  Cysteine synthase B  E  -2.1
3  cysA  b2422  Sulfate ABC transporter  P  -4.3
4  cysW  b2423  Sulfate ABC transporter  P  -2.8
5  cysU  b2424  Sulfate ABC transporter  P  -5.1
6  cysP  b2425  Thiosulfate ABC transporter  P  -4.8

7  cysC  b2750  Adenylylsulfate kinase  P  -3.2
8  cysN  b2751  Sulfate adenylyltransferase  P  -3.7
9  cysD  b2752  Sulfate adenylyltransferase  EH  -5.9

10  cysH  b2762  3'-phospho-adenylylsulfate reductase  EH  -4.4
11  cysI  b2763  Sulfite reductase hemoprotein subunit  P  -3.7
12  cysJ  b2764  Sulfite reductase flavoprotein subunit  P  -5.0

13  feoA  b3408  Ferrous iron transport protein A  P  -2.1
14  feoB  b3409  FeoB ferrous iron transporter  P  -1.8
15  feoC  b3410  Putative transcriptional regulator  X  -1.6

16  lacA  b0342  Galactoside O-acetyltransferase monomer  R  -5.5
17  lacZ  b0344  β-galactosidase monomer  G  --

18  metK  b2942  MetK S-adenosylmethionine synthetase monomer  H  -1.6

19  yciE  b1257  Conserved protein associated with stress response  X  -1.7

20  yciW  b1287  Predicted oxidoreductase  X  -3.7

21  yddV  b1490  Diguanylate cyclase  T  -2.1

22  ydjN  b1729  Predicted transporter  R  -2.5

23  yeeD  b2012  Conserved hypothetical protein  O  -4.0
24  yeeE  b2013  Putative transport system permease protein  R  -4.6

[Figure 28: schematic with panels for the CysB regulon, the NsrR regulon, the feoABC operon, the hyaABCDEF operon, the nap-ccm operon and the threonine degradation operons, each annotated with its transcriptional regulators.]

FIGURE 28. Genomic Organization of Selected Genes Observed to Undergo Expression Changes in the Microarray Analysis. Graphical representation of the gene arrangement of selected genes/operons/regulons of interest. Transcriptional regulation of certain genes/operons is indicated by boxes containing the name of the transcriptional regulator and a ‘+’, ‘-’ or ‘+-’ sign indicating activation, repression or both, respectively. The figure is based upon information from the EcoCyc database.

cysMAWUP, cysCND and cysHIJ operons, all of which undergo a 2- to 6-fold decrease in expression upon overexpression of RavA-ViaA (Table 4). These genes fall mainly into the amino acid transport and metabolism or inorganic ion transport and metabolism COG classes. The cys genes are involved in the assimilation of inorganic sulfate and the biosynthesis of cysteine (Kredich 1996). The expression of all of these genes is under the control of the CysB LysR-type transcriptional regulator.

The CysB protein is regarded as the master regulator of sulfur assimilation in E. coli, controlling the expression of genes involved in the assimilation of aliphatic sulfonates (tauABCD, ssuEADCB, cbl), as well as those of the cysteine regulon (van der Ploeg et al. 2001). CysB works in conjunction with the co-inducer N-acetylserine (NAS) in order to activate transcription of the cys genes. NAS is non-enzymatically derived from O-acetylserine (OAS), the immediate precursor of cysteine and the product of the CysE serine acetyltransferase (Kredich 1996). The reduced expression of the cysteine regulon strongly suggests that the RavA-ViaA system plays an as yet undefined role related to the assimilation of sulfur.

Perhaps the simplest explanation for the observed reduction is that the CysB transcriptional regulator is unfolded or inactivated, or that its levels are reduced, by the RavA-ViaA system. However, CysB is known to autoregulate, acting as a repressor of its own transcription (Bielinska and Hulanicka 1986); a drop in the level of active CysB protein would, therefore, be expected to alleviate this repression, with a corresponding increase in the amount of cysB transcript. According to our microarray analysis, however, cysB transcript levels were unchanged in the RavA-ViaA overexpression strain; hence, RavA-ViaA are unlikely to act directly on CysB. Furthermore, the ssuEADCB operon is repressed by CysB (Bykowski et al. 2002), but no increase in the transcript levels of these genes is observed. The tau operon and cbl gene, both of which are subject to CysB activation, were not observed to decrease in our microarray; however, this may be because the levels of these genes in our wild-type strain were relatively low to begin with.


Another possible explanation is that the overexpression of RavA-ViaA results in a reduction in the levels of NAS, or in an increase in the levels of sulfide and/or thiosulfate. Reduced production of OAS by CysE would lead to a corresponding reduction in NAS, and a drop in the level of expression of the cys genes. Similarly, sulfide and thiosulfate compete with NAS for CysB (van der Ploeg et al. 2001). Either of these would affect the ability of CysB to function as an activator, but would leave the protein intact and capable of functioning as a repressor. Reduced CysE activity could be a direct result of interaction with the RavA-ViaA system, or an indirect result of an increase in intracellular cysteine, which acts as an inhibitor of the CysE enzyme (Kredich 1996). CysE is universally conserved across all the RavA-containing organisms we examined; however, it is not significantly enriched (Supplementary Table 2). It should also be noted that most of the cys genes are conserved in all 14 RavA-containing organisms, and are also highly conserved across the gammaproteobacteria in general (Supplementary Table 2).

Another interesting transcriptional change observed was the 1.6- to 2.1-fold decrease in expression of the genes of the feoABC operon (Table 4 and Fig. 28). FeoA and FeoB belong to the inorganic ion transport and metabolism COG class, while FeoC does not belong to a recognized COG. The Feo system plays an important role in the uptake of ferrous iron (Fe2+) (Kammler et al. 1993). Ferrous iron is highly soluble relative to ferric iron (Fe3+) and can, thus, be transported directly as a free metal. Due to its relatively low stability, however, ferrous iron is only likely to predominate over ferric iron under anaerobic or low pH conditions. Thus, although ferrous iron transport occurs both aerobically and anaerobically, it is likely to play a greater role under conditions of low oxygen and/or reduced pH (Cartron et al. 2006; Hantke 1987). The feoABC operon is repressed by the Fur transcriptional regulator, which acts as a general regulator of iron metabolism, and activated by the Fnr transcriptional regulator, which is important in regulating the transcription of genes involved in fermentation and anaerobic respiration (Hantke 1987; Kammler et al. 1993). How overexpression of RavA and ViaA leads to repression of this operon is unclear. Our profiling study shows that the Feo system is well represented across the RavA-containing organisms, more so than in the non-RavA-containing organisms (Supplementary Table 2), suggesting that the Feo proteins may be direct or indirect substrates of RavA.

Intriguingly, we also observed a strong change in the transcriptional level (~4.9-fold increase) of another iron transport gene belonging to the inorganic ion transport and metabolism COG class, cirA (Table 3). CirA is an outer membrane protein, which acts as a receptor in the TonB-dependent uptake of Fe3+-dihydroxybenzoylserine (Hantke 1990). It can also act as a receptor for certain bactericidal colicins (Braun et al. 2002). Transcription of the cirA gene is controlled by two overlapping promoters (Griggs et al. 1987). Transcription is repressed by Fur and activated by cAMP-CRP (Griggs et al. 1990; Griggs and Konisky 1989). This is an interesting result, in light of the fact that transcription of the feo operon, also regulated by Fur, was observed to decrease in the overexpression strain. Although increased levels of intracellular cAMP might contribute to the cirA induction, perhaps helping to offset iron-mediated repression, no global change in cAMP-CRP regulated genes is observed. This may suggest that the opposing changes observed in the transcription of the feo and cir genes are directed by an alternate regulatory mechanism.

Other changes worthy of note are the nearly 2-fold increases in the asnA and asnC gene transcripts (Table 3). Both of these genes are located immediately downstream of the viaA gene on the E. coli chromosome, with asnA being oriented on the opposite strand (Fig. 24). asnA encodes the enzyme asparagine synthetase I, which is responsible for the ATP-dependent synthesis of asparagine from aspartate and ammonia (Cedar and Schwartz 1969; Nakamura et al. 1981). AsnC acts as a transcriptional activator of asnA, as well as a repressor of its own transcription (Kolling and Lother 1985). Its ability to activate asnA transcription is repressed by asparagine; however, its repressor activity remains unaffected. In addition, asnC expression is under the control of the Nac transcriptional regulator, which acts to repress asnC transcription under conditions of low nitrogen availability (Kolling and Lother 1985; Poggio et al. 2002). It seems likely that the increased level of asnA transcription observed is a result of the observed increase in asnC levels. Although the nature of an involvement of the RavA-ViaA system is uncertain, the genomic proximity of the asnA and asnC genes to the ravA and viaA genes is intriguing. It is tempting to speculate that RavA and ViaA might work to enhance or repress the activity of AsnA, which is known to be dependent upon Mg2+ for proper function (Cedar and Schwartz 1969), or possibly affect the structure of AsnC. Excess amounts of RavA and ViaA may thus lead to a disruption in the normal activities of AsnA and/or AsnC, and the increase in transcription observed may represent a cellular effort to compensate for this. Although our genomic neighbourhood analysis did not detect a conserved association between the asnA/asnC genes and the ravA/viaA genes (Snider and Houry 2006), our profiler does reveal that an AsnC-like protein is found in all of our RavA-containing organisms, but in only 16 of the non-RavA gammaproteobacteria (Supplementary Table 2). AsnA, on the other hand, is much more poorly conserved and is only detected in 10 out of 14 RavA organisms (Supplementary Table 2), suggesting that if it is indeed a substrate for RavA-ViaA, it would only be so in a subset of RavA organisms. It is currently not known if AsnC regulates the transcription of other genes.

A mild decrease of ~1.6-fold was also observed in expression of the metK gene, which encodes S-adenosyl-methionine (SAM) synthetase, the enzyme responsible for the synthesis of SAM from methionine and ATP (Cantoni 1951; Hunter et al. 1975). SAM acts as a major methyl donor in metabolism and the metK gene has been shown to be essential in E. coli (Wei and Newman 2002). Studies on metK have shown that it is induced upon biofilm formation (Ren et al. 2004), and is repressed by the MetJ transcription factor in complex with SAM (LaMonte and Hughes 2006; Su and Greene 1971). The significance of this decrease is uncertain; however, MetK is involved in sulfur metabolism, and, therefore, the changes in its transcript levels are consistent with some of the other changes observed.

The RavA-ViaA overexpression strain also showed a distinct decrease in the expression of the lacZ and lacA genes of the lacZYA operon (Table 4). This observed decrease is likely a background effect due to overexpression of the LacI repressor protein encoded on the pRV plasmid.

In summary, an examination of a number of changes observed in the RavA-ViaA overexpression strain suggests that the cells may be experiencing changes in the levels of intracellular iron and sulfur-based compounds, possibly suggesting a potential role for the RavA-ViaA system in iron/sulfur metabolism. This is partly consistent with our phylogenetic profiling analysis, and previous work (Snider and Houry 2006), which suggested that RavA-ViaA might be involved in metal insertion. In addition, the change in the levels of the neighbouring asnA and asnC genes may suggest an involvement in asparagine metabolism.

4.4.2.3 Results of the Microarray Analysis of RavA Overexpression Strain

The microarray data show that the presence of the pR plasmid results in a 6.3-fold increase in the level of ravA transcript (Table 5). Once again, this increase in transcript levels is consistent with the observed increase in protein levels as detected by western blot analysis (Fig. 25A).

A total of 20 significant increases were observed in the pR overexpression strain. Of these changes, 14 corresponded to defined protein-encoding genes, while the remainder matched intergenic regions (Table 5). A total of 19 significant decreases were also detected (Table 6). All 19 of these changes correspond to defined genes, two of which encode small RNAs. The results of the COG functional analysis (Fig. 27) reveal that the changes cover a diverse range of COG classes, with genes encoding proteins of unknown/poorly defined function, or which do not belong to a recognizable COG, making up a substantial portion of the genes showing significant changes in their levels. As such, it is difficult to draw conclusions from the observed general trends. However, examining specific changes more closely does reveal interesting aspects.

TABLE 5. Genes whose transcript levels increase in MG1655 pR cells relative to those in wild-type cells. Genes are listed alphabetically. ‘++’ indicates that transcript was not detected in wild-type cells, thus a ‘fold-increase’ could not be calculated.

#  Gene Name  b Number  Description  COG Class  Fold Change
1  cirA  b2155  Outer membrane ferric siderophore, colicin receptor  P  2.3
2  glcC  b2980  GlcC transcriptional dual regulator  K  ++
3  hcp  b0873  Hybrid-cluster protein / hydroxylamine reductase  C  2.1
4  hmpA  b2552  Nitric oxide dioxygenase / dihydropteridine reductase 2  C  1.8
5  pitB  b2987  Phosphate transporter  P  ++
6  ravA  b3746  Putative AAA+ chaperone  R  6.3
7  talC  b3946  Fructose 6-phosphate aldolase 2  G  1.5
8  yccM  b0992  Hypothetical protein  C  1.7
9  ygbA  b2732  Predicted protein  X  1.6
10  ygbN  b2740  YgbN Gnt transporter  GE  2.0
11  ygeH  b2852  Predicted transcriptional regulator  TK  ++
12  yhjE  b3523  YhjE MFS transporter  GEPR  5.6
13  yjiG  b4329  Putative membrane protein  S  1.5
14  ytfE  b4209  Iron metabolism protein  D  1.9

15  IG_2175231_2175531-r  NA  Intergenic Region    1.6
16  IG_1584511_1584843-r  NA  Intergenic Region    1.7
17  IG_11787_12162-f  NA  Intergenic Region    1.8
18  IG_706981_707556-r  NA  Intergenic Region    3.4
19  IG_1314117_1314439-r  NA  Intergenic Region    ++
20  IG_1160775_1161107-r  NA  Intergenic Region    ++

TABLE 6. Genes whose transcript levels decrease in MG1655 pR cells relative to those in wild-type cells. Genes are listed alphabetically. ‘--’ indicates that transcript was not detected in pR cells, thus a ‘fold-decrease’ could not be calculated.

#  Gene Name  b Number  Description  COG Class  Fold Change
1  ebgA  b3076  Evolved β-D-galactosidase, α subunit; cryptic gene  G  -3.3
2  gspG  b3328  Putative protein secretion protein for export  N  -4.7
3  nohB  b0560  DLP12 prophage; DNA packaging protein  X  -4.3
4  ordL  b1301  γ-glutamylputrescine oxidase  E  -2.0
5  recC  b2822  Component of the RecBCD (Exo V) helicase/nuclease complex  L  -3.1
6  rhsE  b1456  RhsE protein in rhs element  M  -1.7
7  rspB  b1580  Putative dehydrogenase RspB  ER  -2.2
8  rybB  b4417  Small RNA that interacts with Hfq    -1.6
9  ryjA  b4459  Small RNA    -1.7
10  yqiG  b3046  Putative membrane protein  N  --
11  ybfB  b0702  Conserved protein  X  -2.1
12  yciE  b1257  Conserved protein  X  -1.7
13  ycfZ  b1121  Predicted inner membrane protein  X  -5.1
14  yebO  b1825  Predicted protein  X  -1.5
15  ygdE  b2806  Putative RNA 2'-O-ribose methyltransferase  R  -3.6
16  ygiZ  b3027  Conserved inner membrane protein  X  -3.4
17  yhfS  b3376  Putative biotin metabolism protein  X  -3.5
18  yibG  b3596  Conserved protein  R  -2.4
19  yjiR  b4340  Putative regulator  KE  -4.2

4.4.2.4 Examining the Microarray Analysis Results of the RavA Overexpression Strain

Intriguingly, there is little overlap in the datasets obtained for the RavA (Tables 5 and 6) and RavA-ViaA (Tables 3 and 4) overexpression strains (Fig. 29). There is only one gene whose transcript level decreases upon both RavA and RavA-ViaA overexpression, yciE. The function of yciE is unknown; however, its protein levels have been observed to increase under both aerobic and anaerobic osmotic stress conditions (Weber et al. 2006). There are only two genes whose transcription increases upon both RavA and RavA-ViaA overexpression, namely the cirA ferric siderophore receptor gene, possibly indicative of a disruption of cellular iron concentration, and glcC. GlcC acts as a transcriptional regulator and activates the transcription of the glcDEFGBA operon, which plays a role in glycolate utilization (Pellicer et al. 1996; Pellicer et al. 1999).
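The overlap described above can be checked directly from the gene names listed in Tables 3 to 6. The following Python sketch is illustrative only: it omits intergenic regions and excludes the plasmid-borne ravA and viaA genes, which is an assumption about how the shared genes were counted.

```python
# Gene names as listed in Tables 3 to 6 (intergenic regions omitted).
prv_up = {
    "asnA", "asnC", "betT", "cirA", "edd", "glcC", "purL", "ravA", "viaA",
    "yajR", "yccS", "ycjS", "ydcD", "ydhX", "ydjJ", "yedA", "yghJ", "yhaM",
    "yhaO", "yhcN", "yhfO", "yigJ", "yohL", "yrfD",
}
prv_down = {
    "cysK", "cysM", "cysA", "cysW", "cysU", "cysP", "cysC", "cysN", "cysD",
    "cysH", "cysI", "cysJ", "feoA", "feoB", "feoC", "lacA", "lacZ", "metK",
    "yciE", "yciW", "yddV", "ydjN", "yeeD", "yeeE",
}
pr_up = {
    "cirA", "glcC", "hcp", "hmpA", "pitB", "ravA", "talC", "yccM", "ygbA",
    "ygbN", "ygeH", "yhjE", "yjiG", "ytfE",
}
pr_down = {
    "ebgA", "gspG", "nohB", "ordL", "recC", "rhsE", "rspB", "rybB", "ryjA",
    "yqiG", "ybfB", "yciE", "ycfZ", "yebO", "ygdE", "ygiZ", "yhfS", "yibG",
    "yjiR",
}

# Assumption: genes carried on the plasmids are overexpressed by design,
# so they are not treated as shared responses.
plasmid_borne = {"ravA", "viaA"}

shared_up = (prv_up & pr_up) - plasmid_borne
shared_down = prv_down & pr_down

print(sorted(shared_up))
print(sorted(shared_down))
```

The intersections recover cirA and glcC as the shared increases and yciE as the only shared decrease.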

The absence of a significant number of common genes whose expression levels change in both RavA and RavA-ViaA overexpression strains is consistent with the fact that ravA and viaA form an operon (Snider et al. 2006), and suggests RavA does not function independently of ViaA in the cell. Nevertheless, determining the effect of overexpressing RavA alone might shed some light on the function of the RavA-ViaA system.

One particularly striking observation is that cells overexpressing RavA alone appear to be undergoing a mild nitrosative or iron limitation stress response, as evidenced by increases in expression of the hmpA, ygbA, and ytfE genes (Table 5). All three of the genes are under the control of the NsrR repressor protein (Fig. 28), and are induced upon exposure to nitric oxide, nitrite, nitrate, and iron limitation (Bodenmiller and Spiro 2006). hmpA encodes a flavohemoglobin with roles in nitric oxide detoxification and reduction of dihydropteridine and ferric siderophores (Andrews et al. 1992; Gardner and Gardner 2002; Vasudevan et al. 1991). ytfE is believed to play an important role in resistance to nitric oxide stress, and possibly in the assembly/maintenance of [Fe-S] cluster proteins (Justino et al. 2006; Justino et al. 2005). The exact function of ygbA is unknown. In addition, there is an increase in the transcript levels of the hcp gene of the hcp-hcr operon, although a corresponding increase in hcr was not observed

[Figure 29: Venn diagram of the three conditions. Exclusive regions: ∆ravA::cat, 356 changes; pRV, 43; pR, 34. Overlap regions: ∆ravA::cat and pRV (4): asnA, cysCD, feoABC, metK; ∆ravA::cat and pR (2): rybB, ytfE; pRV and pR (3): cirA (+ +), glcC (+ +), yciE (-- --).]

FIGURE 29. Venn diagram showing the overlap in gene expression changes among the different microarray conditions. Each microarray condition is represented by a colored circle: ∆ravA::cat in red, RavA overexpressor (pR) in green, and RavA/ViaA overexpressor (pRV) in blue. Increases (+) and decreases (-) in gene expression in the different cells with respect to those in wild-type cells are marked beside each gene name and are colour coded accordingly. The number of gene expression changes corresponding to each region of the diagram is indicated in a small circle.

(Table 5). Hcp is a hydroxylamine reductase, and is proposed to have a role in the detoxification of potentially harmful by-products of nitrate metabolism (Wolfe et al. 2002). Hcp is induced by nitrate, nitrite, nitrosating agents and anaerobic growth conditions, and is under the control of the FNR, NarL, and NarP transcriptional regulators (Filenko et al. 2005; Flatley et al. 2005; van den Berg et al. 2000).

Thus, there is strong evidence that cells overexpressing RavA undergo a form of nitrosative, and possibly iron limitation, stress. The reason for this is unclear, but the potential association with iron metabolism is once again intriguing. The nitrosative stress response might indicate that the RavA and ViaA system recognizes an enzyme involved in nitrate/nitrite metabolism as a substrate. Overexpression of RavA alone may lead to an incomplete processing/maturation of the target protein(s), leading to the observed stress effect. The presence of the ViaA protein presumably leads to a complete processing/maturation of substrate(s), and a corresponding prevention of the stress.

4.4.2.5 Results of the Microarray Analysis of ΔravA::cat Strain

It should first be emphasized that the ΔravA::cat strain fails to express viaA, as determined by Northern blotting, consistent with ravA and viaA comprising an operon (Snider et al. 2006), and thus can effectively be considered a double deletion strain. Microarray analysis of this strain revealed a surprisingly large number of changes in gene expression levels as cells approached stationary phase.

Deletion of the ravA-viaA genes resulted in a total of 220 significant increases. Of these, 147 corresponded to defined genes (Table 7), while the remaining 73 changes were in intergenic stretches. Five of the genes encode small regulatory RNAs. A total of 142 significant decreases were also detected (Table 8). Of these, 112 corresponded to defined genes, while the remaining 30 changes map to intergenic regions. Two of the genes, rprA and oxyS, encode small regulatory RNAs. COG functional classification of the genes that showed significant changes revealed a substantial number of decreases in genes encoding amino acid transport/metabolism and carbohydrate transport/metabolism proteins, as well as in those encoding proteins not belonging to recognized COGs (Fig. 27). A significant number of increases in genes whose function is poorly characterized/unknown, or whose products do not belong to a defined COG, were also observed (Fig. 27). The percentages of genes in these categories are largely distinct from the general distribution of COG classes across all known E. coli proteins, and thus show an interesting bias (Fig. 27). The number of genes without a clearly defined role is particularly striking and may indicate involvement of RavA-ViaA with systems whose roles are not yet well understood.
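As a consistency check, the totals reported for each strain can be compared with the region counts in Fig. 29. This Python sketch assumes that the small circled numbers in the figure (3, 4 and 2) are pairwise overlaps between pRV/pR, pRV/ΔravA::cat and pR/ΔravA::cat, respectively, and that no change is shared by all three conditions; both are readings of the figure rather than statements from the text.

```python
# Significant changes per condition (increases + decreases), as reported in the text.
totals = {
    "pRV": 26 + 24,
    "pR": 20 + 19,
    "dravA": 220 + 142,  # the ΔravA::cat strain
}

# Pairwise overlaps as read from Fig. 29 (assumed; no three-way overlap).
shared = {
    frozenset({"pRV", "pR"}): 3,
    frozenset({"pRV", "dravA"}): 4,
    frozenset({"pR", "dravA"}): 2,
}


def exclusive(condition):
    """Changes seen only in one condition: its total minus its pairwise overlaps."""
    overlap = sum(n for pair, n in shared.items() if condition in pair)
    return totals[condition] - overlap


exclusive_counts = {condition: exclusive(condition) for condition in totals}
print(exclusive_counts)
```

Under these assumptions, the exclusive counts come out as 43 (pRV), 34 (pR) and 356 (ΔravA::cat), matching the numbers shown in Fig. 29.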

4.4.2.6 Examining the Microarray Analysis Results of ΔravA::cat Strain

A caveat in analyzing the data from this strain is that our microarray analysis shows a significant, ~17 fold, increase in the transcript levels of kup (Table 7), which encodes a potassium uptake protein active under hyperosmotic stress conditions at low pH (Trchounian and Kobayashi 1999). The position of this gene next to the ravA gene (Fig. 24) suggests that the observed increase in kup transcript levels might be a polar effect. The ravA deletion strain contains a chloramphenicol resistance cassette in place of ravA, and this resistance cassette is under the control of its own promoter. The cassette is encoded on the same strand, and in the same orientation, as the kup gene. Consequently, read-through transcription from the cassette could explain the elevated kup transcript levels observed, although the transporter might not be active under the neutral pH conditions of our experiments (Trchounian and Kobayashi 1999). Nevertheless, it is possible that some of the observed changes in gene expression in the deletion strain result indirectly from an increase in the levels of this transporter.

Several of the changes observed in the deletion strain parallel those from the RavA or RavA-ViaA overexpression strains (Fig. 29). For instance, as in the RavA-ViaA overexpression strain, we once again observed a decrease in the levels of the cysC and cysD transcripts (Table 8). Although this effect is much less pronounced in the ravA deletion strain than for the RavA-ViaA

TABLE 7. Genes whose transcript levels increase in MG1655 ΔravA::cat cells relative to those in wild-type cells. Genes are grouped together into operons (where possible) and then listed alphabetically. ‘++’ indicates that transcript was not detected in wild-type cells, thus a ‘fold-increase’ could not be calculated.


Gene Name b Number Description COG Class Fold Change

1 acpT b3475 Holo-[acyl carrier protein] synthase 2 Q 1.5

2 appA b0980 6-phytase / pH 2.5 acid phosphatase X 1.7

3 arnB b2253 UDP-L-Ara4O C-4" transaminase M 1.8
4 arnC b2254 Undecaprenyl phosphate-L-Ara4FN transferase M 1.6
5 arnA b2255 UDP-L-Ara4N formyltransferase / UDP-GlcA C-4"-decarboxylase MG 1.7

6 aroA b0908 3-phosphoshikimate-1-carboxyvinyltransferase E 1.6

7 arpB b1721 Hypothetical protein X 2.4

8 ccmH b2194 Cytochrome c biogenesis protein CcmH R 1.8
9 ccmG b2195 Thioredoxin-like protein, cytochrome c biogenesis OC 1.9
10 ccmF b2196 Cytochrome c-type biogenesis protein O 2.1
11 ccmE b2197 CcmABCDEFGH cytochrome c biogenesis system O 1.7
12 ccmD b2198 CcmABCDEFGH cytochrome c biogenesis system N 1.5
13 ccmC b2199 CcmABCDEFGH cytochrome c biogenesis system O 2.1
14 ccmB b2200 CcmABCDEFGH cytochrome c biogenesis system O 1.8
15 ccmA b2201 CcmABCDEFGH cytochrome c biogenesis system Q 1.8
16 napC b2202 Cytochrome c protein C 1.6
17 napB b2203 Small subunit of periplasmic nitrate reductase, cytochrome c550 protein C 1.5
18 napH b2204 Ferredoxin-type protein C 1.8

19 dctA b3528 DctA dicarboxylate DAACS transporter C 1.6

20 dsdX b2365 DsdX Gnt transporter GE 1.5

21 ecnA b4410 entericidin A, antidote to lipoprotein entericidin B S 1.7

22 evgS b2370 Sensor kinase T 1.6

23 exuT b3093 ExuT hexuronate MFS transporter GEP 1.6

24 fhuE b1102 Outer membrane receptor for ferric iron uptake P 1.6

25 G6553 Pha b1052 Probable phantom gene X 2.1

26 gadA b3517 Glutamate decarboxylase A subunit E 2.2

27 gapC b1416 Interrupted glyceraldehyde-3-dehydrogenase G 1.6

28 glmS b3729 L-glutamine:D-fructose-6-phosphate aminotransferase M 1.5

29 glnL b3869 Kinase-phosphotransferase nitrogen regulator II T 1.7

30 gspC b3324 Putative protein secretion protein for export X 1.8

31 guaA b2507 GMP synthase / GMP synthase (ammonia dependent) F 1.7
32 guaB b2508 IMP dehydrogenase F 1.7

33 hokC b4412 HokC, Gef toxin; interferes with membrane function when in excess S 1.5

34 hokD b1562 Polypeptide destructive to membrane potential X 2.5

35 hscA b2526 Chaperone, member of Hsp70 protein family O 1.9
36 hscB b2527 Hsc20 co-chaperone that acts with Hsc66 in IscU iron-sulfur cluster assembly O 1.7

37 htgA b0012 Predicted protein X ++

38 hyaA b0972 Hydrogenase 1, small subunit C 1.7
39 hyaB b0973 Hydrogenase 1, large subunit C 1.8
40 hyaC b0974 Hydrogenase 1, b-type cytochrome subunit C 2.7
41 hyaD b0975 Protein involved in processing of HyaA and HyaB proteins C 1.9
42 hyaF b0977 Protein involved in nickel incorporation into hydrogenase 1 proteins X 2.0

43 iscS b2530 Cysteine desulfurase monomer K 2.0
44 iscR b2531 IscR transcriptional regulator E 1.9

45 kup b3747 TrkD potassium KUP transporter P 16.8

46 lysA b2838 Diaminopimelate decarboxylase E ++

47 metK b2942 MetK S-adenosylmethionine synthetase monomer H 2.2

48 mhpT b0353 MhpT MFS transporter GEP 1.5

49 mtr b3161 Mtr tryptophan ArAAP transporter E 4.6

50 narV b1465 Nitrate reductase Z, γ-subunit C 1.8

51 nhaA b0019 Sodium/proton NhaA transporter P 1.7

52 nupC b2393 NupC nucleoside NUP transporter F 2.4

53 nusB b0416 Transcription antitermination protein NusB K 1.6

54 phoB b0399 PhoB-Phosphorylated transcriptional dual regulator TK 1.6

55 phrB b0708 Deoxyribodipyrimidine (photoreactivation) L 1.6

56 polB b0060 DNA polymerase II L 1.7

57 potE b0692 Putrescine/proton symporter: putrescine/ornithine antiporter E 3.3

58 rdlA b4420 Antisense RNA, trans-acting regulator of ldrA 1.6

59 rdlB b4422 Antisense RNA, trans-acting regulator of ldrB 1.8

60 rdlC b4424 Antisense RNA, trans-acting regulator of ldrC 2.1

61 rpmF b1089 50S ribosomal subunit protein L32 J 1.9

62 rstB b1609 Sensor kinase-phosphotransferase T 2.5

63 rybA b4416 Small RNA 1.7

64 rybB b4417 Small RNA that interacts with Hfq 1.7

65 sdiA b1916 SdiA transcriptional activator K 1.5

66 secE b3981 Sec Protein Secretion Complex N 1.5

67 sieB b1353 Rac prophage; phage superinfection exclusion protein X 2.0

68 speB b2937 Agmatinase E 1.7

69 tfaS b2353 Hypothetical protein X 1.8

70 thiJ b0424 Conserved protein R 1.6

71 trpE b1264 Anthranilate synthase component I EH 3.9
72 trpL b1265 Trp operon leader peptide X 7.7

73 umuD b1183 SOS mutagenesis and repair L 1.6
74 umuC b1184 SOS mutagenesis and repair KT 2.5

75 viaA b3745 Predicted von Willebrand factor containing protein S 3.7

76 xdhD b2881 Putative oxidoreductase; possible selenate reductase / role in purine salvage C 1.7

77 yadC b0135 Putative fimbrial-like protein X 4.5

78 yadL b0137 Putative adhesin-like protein X 1.5

79 yagS b0285 Putative oxidoreductase, FAD-binding domain C 1.8

80 yagV b0289 Conserved protein X 1.8

81 ybaW b0443 Conserved hypothetical protein R 2.1

82 ybaY b0453 Predicted outer membrane lipoprotein S 1.5

83 ybcM b0546 Putative ARAC-type regulatory protein K 2.0

84 ybdK b0581 γ-glutamyl:cysteine ligase YbdK S 1.8

85 ybdL b0600 Methionine aminotransferase, PLP-dependent E 1.9

86 ybdO b0603 Predicted DNA-binding transcriptional regulator LYSR-type K 2.8

87 ybeB b0637 Predicted protein S 1.5

88 ybeU b0648 Predicted tRNA ligase X 4.1

89 ybfP b0689 Predicted protein X ++

90 ybhB b0773 Predicted kinase inhibitor R 1.7

91 ycbV b0943 Predicted fimbrial-like adhesin protein X ++
92 yccV b0966 Heat shock protein, hemimethylated DNA-binding protein X 1.7

93 ycgG b1168 Conserved protein X 1.6

94 ychE b1242 Predicted inner membrane protein S 2.1

95 yciR b1285 RNase II modulator T 1.5

96 ydaE b4526 Rac prophage; zinc-binding protein X 4.8

97 ydaT b1358 Hypothetical protein X 1.6

98 ydaU b1359 Hypothetical protein X 1.6

99 ydeI b1536 Conserved protein S 1.5

100 ydeK b1510 Predicted lipoprotein X 1.6

101 ydfZ b1541 Conserved protein X 2.4

102 ydhC b1660 YdhC drug MFS transporter X 1.6

103 ydiH b1685 Hypothetical protein X 1.6

104 ydiK b1688 Hypothetical protein; transcription may be purine regulated R 1.6

105 ydiV b1707 Conserved protein T 1.6

106 yeaY b1806 YeaY-outer membrane lipoprotein M 1.8

107 yebA b1856 Conserved protein M 1.6

108 yedL b1932 Predicted acyltransferase K 1.7

109 yedQ b1956 Conserved protein T 1.6

110 yedS b1964 Putative outer membrane protein X 1.7
111 yedS 2 b1965 Putative outer membrane protein X 1.8

112 yegS b2086 Conserved protein R 1.6

113 yfcO b2332 Conserved hypothetical protein X 1.9

114 yfdF b2345 Hypothetical protein X ++

115 yfiL b2602 Predicted protein X 1.6

116 yfjM b2629 CP4-57 prophage; predicted protein X 2.2

117 ygiA b3036 Predicted protein X 1.7

118 yhcG b3220 Conserved hypothetical protein X 1.8

119 yhiD b3508 Predicted Mg(2+) transport ATPase S 1.9

120 yhiU b3513 MdtEF multidrug transporter Q 2.7
121 yhiV b3514 MdtEF multidrug transporter Q 2.1

122 yhjG b3524 Conserved protein X 1.9

123 yhjJ b3527 Putative peptidase R 4.1

124 yieG b3714 Putative membrane / transport protein R ++

125 yieJ b3717 Conserved hypothetical protein S 3.5

126 yigL b3826 Conserved protein with a phosphatase-like domain R 1.7

127 yjgB b4269 Putative oxidoreductase R 1.7

128 yjgW b4274 KpLE2 phage-like element X 3.7

129 yjgX b4275 KpLE2 phage-like element; putative transmembrane protein R 1.7

130 yjhS b4309 Conserved protein X 3.2

131 ylaB b0457 Conserved protein T 1.7

132 ymfJ b1144 e14 prophage; predicted protein X 1.5

133 ynaI b1330 YnaI M 1.7

134 ynfF b1588 Oxidoreductase subunit paralog of DmsA C 1.5
135 ynfG b1589 Oxidoreductase, Fe-S subunit paralog of DmsB C 1.5

136 yncD b1451 Probable TonB-dependent receptor P 1.6

137 yoaA b1808 Conserved protein L 1.6

138 yoaD b1815 Predicted phosphodiesterase T 1.8

139 yobB b1843 Conserved protein R 1.7

140 yodA b1973 Cadmium-induced metal binding protein X 1.8

141 yojI b2211 YojI Q 1.6

142 yqaA b2689 Conserved inner membrane protein S 1.5

143 yqgD b2941 Hypothetical protein X 1.6

144 yqjB b3096 Conserved hypothetical protein with possible extracytoplasmic function X 1.5

145 yqjE b3099 Conserved inner membrane protein X 1.9

146 yrfB b3393 Protein involved in utilization of DNA as a carbon source X 3.4

147 zitB b0752 ZitB zinc CDF transporter P 1.7

148 IG_58180_58473-f NA Intergenic Region 1.5
149 IG_239085_239418-r NA Intergenic Region. Antisense to yafF. 1.9
150 IG_248135_248357-r NA Intergenic Region 3.5
151 IG_268188_268512-f NA Intergenic Region 2.0
152 IG_296321_296604-f NA Intergenic Region 3.3
153 IG_338968_339388-r NA Intergenic Region. Antisense to yahH 1.7
154 IG_376536_376758-r NA Intergenic Region 2.1
155 IG_410277_410520-r NA Intergenic Region 1.6
156 IG_479933_480477-r NA Intergenic Region 1.9
157 IG_510604_510864-r NA Intergenic Region 3.7
158 IG_545588_545903-r NA Intergenic Region 4.6
159 IG_557978_558196-f NA Intergenic Region 4.1
160 IG_576109_576620-f NA Intergenic Region 1.8
161 IG_638732_638945-f NA Intergenic Region 4.3
162 IG_688237_688565-r NA Intergenic Region 2.6
163 IG_698401_698796-r NA Intergenic Region 2.0
164 IG_752019_752407-f NA Intergenic Region ++
165 IG_769835_770677-f NA Intergenic Region 1.8
166 IG_849321_849672-f NA Intergenic Region 1.7
167 IG_856779_857018-f NA Intergenic Region 2.0
168 IG_892657_893006-r NA Intergenic Region 3.1
169 IG_899799_900088-r NA Intergenic Region 1.6
170 IG_902958_903174-r NA Intergenic Region 1.6
171 IG_921814_922135-r NA Intergenic Region 1.8
172 IG_925195_925447-f NA Intergenic Region 5.1
173 IG_1020143_1020360-r NA Intergenic Region 1.8
174 IG_1030936_1031361-f NA Intergenic Region 2.5
175 IG_1049754_1050185-r NA Intergenic Region 1.9
176 IG_1062999_1063258-f NA Intergenic Region 2.0
177 IG_1164909_1165307-f NA Intergenic Region 1.7
178 IG_1166613_1166821-f NA Intergenic Region 1.8
179 IG_1246600_1246918-r NA Intergenic Region 4.0
180 IG_1271073_1271341-r NA Intergenic Region ++
181 IG_1293368_1293648-f NA Intergenic Region 1.7
182 IG_1308294_1308592-r NA Intergenic Region 1.6
183 IG_1494656_1494879-f NA Intergenic Region 1.7
184 IG_1494656_1494879-r NA Intergenic Region ++
185 IG_1500180_1500480-f NA Intergenic Region 1.6
186 IG_1596111_1596640-r NA Intergenic Region 5.1
187 IG_1695077_1695296-r NA Intergenic Region 1.6
188 IG_1899610_1900071-f NA Intergenic Region ++
189 IG_2039141_2039396-f NA Intergenic Region 1.8
190 IG_2039141_2039396-r NA Intergenic Region ++
191 IG_2217502_2217711-r NA Intergenic Region 1.6
192 IG_2226860_2227457-f NA Intergenic Region 1.8
193 IG_2228406_2228643-r NA Intergenic Region 4.2
194 IG_2234521_2234762-f NA Intergenic Region 1.8
195 IG_2289168_2289377-r NA Intergenic Region 1.8
196 IG_2302414_2303127-f NA Intergenic Region 1.6
197 IG_2342190_2342884-f NA Intergenic Region 1.6
198 IG_2404662_2405580-r NA Intergenic Region 1.7
199 IG_2411153_2411489-f NA Intergenic Region 1.6
200 IG_2460667_2461031-f NA Intergenic Region 4.4
201 IG_2475650_2475866-r NA Intergenic Region 4.8
202 IG_2481360_2481774-r NA Intergenic Region ++
203 IG_2492423_2492717-r NA Intergenic Region 1.5
204 IG_2561138_2561596-f NA Intergenic Region ++
205 IG_2714470_2714773-r NA Intergenic Region ++
206 IG_2815526_2815805-r NA Intergenic Region 5.5
207 IG_2922538_2922756-r NA Intergenic Region 1.7
208 IG_2943865_2944102-f NA Intergenic Region 1.6
209 IG_2974408_2974620-f NA Intergenic Region 2.2
210 IG_3071712_3071995-r NA Intergenic Region 1.8
211 IG_3181340_3181828-f NA Intergenic Region 2.3
212 IG_3483456_3483756-f NA Intergenic Region 2.0
213 IG_3483456_3483756-r NA Intergenic Region 2.1
214 IG_3655198_3655995-r NA Intergenic Region 2.1
215 IG_3656524_3656861-f NA Intergenic Region 2.3
216 IG_3661158_3661519-r NA Intergenic Region 2.1
217 IG_3663441_3663809-r NA Intergenic Region 2.9
218 IG_3672006_3672415-f NA Intergenic Region 1.8
219 IG_3679571_3679790-r NA Intergenic Region 1.6
220 IG_3717398_3717677-f NA Intergenic Region 1.6


TABLE 8. Genes whose transcript levels decrease in MG1655 ΔravA::cat cells relative to those in wild-type cells. Genes are grouped together into operons (where possible) and then listed alphabetically. ‘--’ indicates that transcript was not detected in ΔravA::cat cells, thus a ‘fold-decrease’ could not be calculated.


Gene Name b Number Description COG Class Fold Change

1 aldA b1415 Aldehyde dehydrogenase A C -1.6

2 asnA b3744 Aspartate-ammonia ligase E -11.2

3 asnB b0674 Asparagine synthetase B E -3.2

4 atoC b2220 AtoC-Phosphorylated transcriptional activator T -1.6

5 carA b0032 Carbamoyl phosphate synthetase EF -1.7
6 carB b0033 Carbamoyl phosphate synthetase EF -1.5

7 cdd b2143 Cytidine deaminase F -1.7

8 cdsA b0175 CDP-diglyceride synthetase I -1.5

9 clpB b2592 ClpB chaperone O -1.9

10 copA b0484 YbaR X -2.2

11 cpdB b4213 2',3'-cyclic nucleotide 2'-phosphodiesterase / 3'-nucleotidase F -1.8

12 cpsB b2049 Mannose-1-phosphate guanylyltransferase (GDP) M -3.6

13 cysC b2750 Adenylylsulfate kinase P -1.8
14 cysD b2752 Sulfate adenylyltransferase EH -2.5

15 dcuC b0621 DcuC dicarboxylate transporter C -1.5

16 dgoT b3691 YidT galactonate MFS transporter GEP -4.6

17 dnaK b0014 Chaperone Hsp70; DNA biosynthesis; autoregulated heat shock proteins O -1.6

18 exbD b3005 ExbD uptake of enterochelin; tonB-dependent uptake of B colicins N -1.5
19 exbB b3006 ExbB protein; uptake of enterochelin; tonB-dependent uptake of B colicins N -1.6
20 G7562 Pha b3007 Probable phantom gene X -2.5

21 fadB b3846 Multifunctional enzyme involved in fatty acid oxidation I -1.7

22 fadE b0221 Acyl-CoA dehydrogenase I -1.5

23 feoA b3408 Ferrous iron transport protein A P -2.1
24 feoB b3409 FeoB ferrous iron transporter P -3.2
25 feoC b3410 Putative transcriptional regulator X -2.8

26 fucO b2799 Propanediol oxidoreductase monomer C -1.9
27 fucA b2800 L-fuculose-phosphate aldolase G -1.8

28 fumA b1612 Fumarase A monomer C -1.5

29 fumC b1611 Fumarase C monomer C -1.7

30 glnA b3870 Adenylyl-[glutamine synthetase] E -1.7

31 glpF b3927 GlpF - glycerol MIP channel G -1.9

32 gltB b3212 Glutamate synthase (NADPH) large chain precursor E -1.7
33 gltD b3213 Glutamate synthase (NADPH) small chain E -1.6

34 gltP b4077 GltP glutamate/aspartate DAACS transporter C -1.6

35 gntP b4321 GntP Gluconate Gnt transporter GE -1.7

36 hipA b1507 HipA transcriptional activator R -1.5
37 hipB b1508 HipB transcriptional activator K -1.7

38 htpG b0473 HtpG monomer O -1.8

39 hyuA b2873 Phenylhydantoinase monomer F -1.6

40 malG b4032 Maltose/Maltodextrin Transport System G -2.1
41 malF b4033 Maltose/Maltodextrin Transport System G -2.1
42 malE b4034 Maltose/Maltodextrin Transport System G -2.2

43 malK b4035 Maltose/Maltodextrin Transport System X -2.2

44 melA b4119 α-galactosidase monomer G -1.6

45 narK b1223 NarK nitrite MFS transporter P -4.3

46 ninE b0548 DLP12 prophage, conserved protein similar to phage 82 and lambda proteins X -6.1

47 oxyS b4458 OxyS RNA; oxidative stress regulator -2.1

48 pepE b4021 Peptidase E, a dipeptidase where amino-terminal residue is aspartate E -1.6

49 polA b3863 DNA polymerase I L -1.5

50 pssR b3763 HdfR transcriptional regulator K -1.6

51 ptsA b3947 PEP-protein phosphotransferase system enzyme I G -2.3

52 putA b1014 PutA bifunctional enzyme and transcriptional regulator C -1.5

53 ravA b3746 Putative AAA+ chaperone R --

54 rprA b4431 Small RNA regulator of RpoS -3.4

55 sdaR b0162 SdaR transcriptional regulator KT -1.6

56 sgcC b4304 Putative PTS permease component X -1.9

57 sodA b3908 Superoxide dismutase (Mn) P -2.3

58 ssuC b0934 YcbE/YcbM ABC transporter P -1.5

59 tdcG b3112 L-serine deaminase 3 E -1.6
60 tdcF b3113 Hypothetical protein J -1.6
61 tdcE b3114 2-ketobutyrate formate-lyase / pyruvate formate-lyase 4 C -2.3
62 tdcD b3115 Propionate kinase / acetate kinase C C -2.5
63 tdcC b3116 TdcC threonine STP transporter E -2.2
64 tdcB b3117 Threonine dehydratase (catabolic) E -1.8
65 tdcA b3118 TdcA transcriptional activator K -2.1

66 tdh b3616 Threonine dehydrogenase E -1.6
67 kbl b3617 2-amino-3-ketobutyrate CoA ligase H -1.6

68 tnaC b3707 tna operon leader peptide X -1.9
69 tnaB b3709 TnaB tryptophan ArAAP transporter E -1.7

70 uidB b1616 UidB glucuronides GPH transporter G -2.4
71 uidA b1617 β-glucuronidase G -6.5

72 uxaC b3092 D-glucuronate / D-galacturonate isomerase G -1.8

73 uxuB b4323 Mannonate oxidoreductase G -2.1

74 wzxC b2046 Uncharacterized polysaccharide transporter R -1.6

75 yaaI b0013 Predicted protein X -4.2

76 yaiS b0364 Conserved protein X -1.8

77 ybeF b0629 Predicted DNA-binding transcriptional regulator, LYSR-type K -1.5

78 ybiC b0801 Predicted dehydrogenase C -2.0

79 ycaD b0898 YcaD MFS transporter GEP -1.5

80 ycjF b1322 Putative membrane protein X -1.6

81 ycjX b1321 Conserved protein R -4.3

82 ydfA b1571 Hypothetical protein X -1.8

83 ydjH b1772 Predicted kinase X -1.6

84 yebB b1862 Conserved hypothetical protein X -4.7

85 yeeF b2014 YeeF amino acid APC transporter E -1.6

86 yehK b4541 Predicted protein X -1.9

87 yeiQ b2172 Putative oxidoreductase G -1.6

88 yfiD b2579 Stress-induced alternate pyruvate formate-lyase subunit R -1.8

89 ygaY b2681 Predicted transporter GEP -1.6

90 ygeW b2870 Conserved protein E -1.7

91 ygeX b2871 2,3-diaminopropionate ammonia-lyase monomer E -1.6

92 ygfT b2887 Fused predicted oxidoreductase, Fe-S subunit and nucleotide-binding subunit E -1.6

93 ygfU b2888 YgfU NCS2 transporter F -1.7

94 ygiP b3060 Predicted DNA-binding transcriptional regulator LYSR-type K -1.9

95 ygjH b3074 Conserved protein R -1.6

96 yibL b3602 Conserved protein X -1.6

97 yicI b3656 α-xylosidase G -1.5

98 yihT b3881 Putative aldolase X -1.6

99 yjbM b4048 Conserved hypothetical protein X -2.3

100 yjfM b4185 Conserved hypothetical protein X -4.1

101 yjgK b4252 Conserved hypothetical protein S -1.6

102 yjjM b4357 Hypothetical protein K -1.6

103 yjjN b4358 Predicted oxidoreductase, Zn-dependent and NAD(P)-binding E --

104 yjjW b4379 Putative activating enzyme O -1.8

105 ykfH b4504 Predicted protein X --

106 yneI b1525 Predicted NAD-dependent succinate semialdehyde dehydrogenase C -1.7

107 ypdH b2387 Predicted enzyme IIB component of PTS G -1.7

108 yqeB b2875 Conserved protein with NAD(P)-binding Rossmann fold O -1.6

109 yqeJ b2848 Predicted protein X -1.6

110 ytfE b4209 Iron metabolism protein X -1.5

111 zntA b3469 Zinc, cobalt and lead efflux system P -1.7

112 zraP b4002 Zinc homeostasis protein X -3.1

113 IG_66551_66834-r NA Intergenic Region -1.6
114 IG_230882_231121-f NA Intergenic Region -1.8
115 IG_266192_266407-f NA Intergenic Region -2.3
116 IG_507784_508098-r NA Intergenic Region -2.8
117 IG_696357_696735-r NA Intergenic Region -1.5
118 IG_915271_915695-r NA Intergenic Region -1.5
119 IG_939944_940268-r NA Intergenic Region. Antisense to portion of dmsA gene -3.1
120 IG_944781_945093-r NA Intergenic Region -4.5
121 IG_1197461_1197917-f NA Intergenic Region -1.7
122 IG_1272823_1273147-r NA Intergenic Region -1.7
123 IG_1357266_1357513-r NA Intergenic Region -1.5
124 IG_1554368_1554648-f NA Intergenic Region -1.8
125 IG_1619063_1619355-f NA Intergenic Region. Antisense to portion of ydeD gene -5.5
126 IG_1741267_1741480-f NA Intergenic Region -1.5
127 IG_1755135_1755444-r NA Intergenic Region -2.0
128 IG_1785137_1785468-f NA Intergenic Region --
129 IG_2085091_2085350-r NA Intergenic Region -1.8
130 IG_2238369_2238647-r NA Intergenic Region -1.7
131 IG_2301518_2301924-r NA Intergenic Region -1.6
132 IG_2493313_2493598-f NA Intergenic Region -1.8
133 IG_2558919_2559387-f NA Intergenic Region -4.3
134 IG_2689361_2689675-r NA Intergenic Region -2.1
135 IG_2986191_2986523-f NA Intergenic Region -2.7
136 IG_3010415_3010634-r NA Intergenic Region -1.8
137 IG_3119296_3119649-r NA Intergenic Region -1.6
138 IG_3320147_3320373-r NA Intergenic Region -1.5
139 IG_3372125_3372503-r NA Intergenic Region -4.2
140 IG_3416249_3416785-f NA Intergenic Region -3.2
141 IG_3489935_3490204-f NA Intergenic Region -3.4
142 IG_3597290_3597559-r NA Intergenic Region -1.7

overexpression strain, it is still interesting and once again suggests a possible alteration in the levels of intracellular sulfur or cysteine. A decrease in the expression of the feoABC operon was also observed (2.1 to 3.2 fold, Table 8 and Fig. 29), which suggests a possible alteration in the levels of intracellular iron. The fact that both deletion of ravA and overexpression of RavA and ViaA lead to decreases in the cys and feo genes suggests that maintaining the proper levels of RavA and ViaA may be important for iron/sulfur balance.

The deletion strain also shows a remarkable decrease of approximately 11.2 fold in the levels of asnA (Table 8), which was observed to increase in the RavA-ViaA overexpression strain (Fig. 29), again suggesting a role for RavA-ViaA in asparagine synthesis. In addition, a drop of approximately 3.2 fold was observed in the transcript of the asnB gene (Table 8), which encodes the glutamine-hydrolyzing asparagine synthetase (Humbert and Simoni 1980; Scofield et al. 1990). This gene is located in a completely different genomic environment from asnA and, unlike asnA, does not appear to be subject to regulation by AsnC. Thus, an association of RavA-ViaA with the asparagine synthetases seems a strong possibility. Notably, AsnB is conserved across all 14 RavA-containing organisms, while not being completely conserved across non-RavA organisms (Supplementary Table 2). As such, it is tempting to speculate that AsnB, which is both multimeric and metal-dependent (Reitzer and Magasanik 1982), may be a substrate for the RavA-ViaA system. Alternatively, the observed effect may be more indirect. One possible explanation is an increase in intracellular asparagine concentration, which has been shown to repress the expression of both enzymes (Cedar and Schwartz 1969; Kolling and Lother 1985; Reitzer and Magasanik 1982).

Finally, the deletion strain also showed an increase (2.2 fold) in the levels of metK (Table 7), which was observed to decrease in the overexpressor (Fig. 29), providing more support for an involvement of the RavA-ViaA system in sulfur metabolism and possibly in MetK biogenesis/folding.

We also see overlap between the deletion strain and the RavA overexpression strain (Fig. 29). Most notably, the expression of the ytfE gene, which was observed to increase upon overexpression of RavA (Table 5), undergoes a mild decrease upon deletion of ravA (~1.5 fold, Table 8). This may be indicative of an alteration in the level of iron, or possibly a relief of a basal level of nitrosative stress. We also observe an increase (1.7 fold) in the rybB regulatory RNA (Table 7), which was observed to drop in the RavA overexpressor (Table 6). rybB transcription is under the control of the membrane stress factor σE, and the RNA plays an important role in accelerating the decay of multiple outer membrane protein (OMP) mRNAs during the envelope stress response. It also plays a role in bacterial envelope homeostasis under standard growth conditions, where the small amount of rybB expressed is proposed to help limit the synthesis of OMPs (Papenfort et al. 2006). The induction of rybB in the deletion strain is not part of a general σE response, as there are few changes in known σE-dependent genes (Rhodius et al. 2006). As such, the nature of the relationship between rybB and RavA is not known, but the observation is in agreement with a role for RavA in stress response, and may suggest a functional association of RavA and/or RavA substrates with the bacterial envelope.
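The overlaps discussed above can be summarized by comparing only the direction of each change. The sketch below is an illustration rather than part of the original analysis: the directions are taken from the text (+1 for an increase, -1 for a decrease), the function name `compare_directions` is hypothetical, and, as a simplification, the RavA-ViaA overexpression strain (cysC, cysD, feoABC, asnA, metK) and the RavA overexpression strain (ytfE, rybB) are pooled into a single dictionary.

```python
# Direction of change as described in the text: +1 = increase, -1 = decrease.
# Magnitudes are deliberately omitted; they are listed in the tables.
DELETION = {
    "cysC": -1, "cysD": -1, "feoABC": -1, "asnA": -1,
    "metK": +1, "ytfE": -1, "rybB": +1,
}
# Simplification: RavA-ViaA and RavA-only overexpression strains are pooled.
OVEREXPRESSION = {
    "cysC": -1, "cysD": -1, "feoABC": -1, "asnA": +1,
    "metK": -1, "ytfE": +1, "rybB": -1,
}


def compare_directions(deletion, overexpression):
    """Split the genes present in both strains by whether they change
    in the same or in opposite directions."""
    same, opposite = [], []
    for gene in sorted(set(deletion) & set(overexpression)):
        if deletion[gene] == overexpression[gene]:
            same.append(gene)
        else:
            opposite.append(gene)
    return {"same": same, "opposite": opposite}


result = compare_directions(DELETION, OVEREXPRESSION)
print("Same direction:", ", ".join(result["same"]))
print("Opposite direction:", ", ".join(result["opposite"]))
```

Genes falling in the "opposite" group (asnA, metK, ytfE and rybB) respond inversely to loss and excess of RavA, whereas the cys and feo genes decrease in both cases.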

These highlighted overlaps are consistent with a role for the RavA/ViaA system in iron/sulfur metabolism. This proposal is further supported by some other changes observed in the ravA deletion strain, which were not observed in either of the overexpression strains. For instance, we observed an approximately 1.6 fold decrease in the exbBD operon (Table 8), which encodes the components of a TonB-associated energy-transducing system involved in the uptake of heme, iron and Fe3+-siderophores, and vitamin B12 across the outer membrane (Braun and Braun 2002). Such a drop might be consistent with an increase in the level of intracellular iron. We also observed increases from 1.7 to 2.0 fold in iscS and iscR of the iscRSUA operon, and in hscA and hscB of the hscAB-fdx operon (Table 7). Both of these operons are involved in the assembly of [Fe-S] clusters (Nakamura et al. 1999; Takahashi and Nakamura 1999). IscS, a cysteine desulfurase, has also been shown to play a role in [Fe-S] cluster repair, and in the synthesis of thiamine, biotin, molybdopterin, and thiolated tRNA (Djaman et al. 2004; Lauhon and Kambampati 2000; Leimkuhler and Rajagopalan 2001; Mueller et al. 2001; Skovran and Downs 2000). It is tempting to speculate that the disruption of [Fe-S] clusters in the deletion strain might be leading to an increase in the levels of intracellular iron and/or sulfur.

We also observed changes in the levels of several other genes associated with metal metabolism (Tables 7 and 8). These include increases in the transcripts of yodA, which encodes a cadmium/oxidative stress response protein of unknown function (Ferianc et al. 1998), and fhuE, which encodes a ferric-coprogen and ferric-rhodotorulic acid uptake receptor (Hantke 1983), and decreases in copA, which encodes a copper efflux ATPase involved in maintaining copper homeostasis (Rensing et al. 2000), zraP, which encodes a periplasmic zinc-binding protein associated with zinc resistance (Noll et al. 1998), and zntA, which encodes a Zn2+/Cd2+ resistance P-type ATPase involved in Zn2+ and Cd2+ efflux from cells (Rensing et al. 1997). The changes in the levels of these genes suggest a disruption of the metal balance within the deletion strain.

We observed increases in two large operons associated with respiratory processes. First, a notable increase of ~1.7 to 2.7 fold in the hyaABCD and hyaF genes of the hya operon was observed (Table 7). Only the hyaE gene did not show a change, which might be a false negative result. The hya operon encodes the components of hydrogenase 1, as well as proteins involved in its maturation (Menon et al. 1990; Menon et al. 1991). E. coli hydrogenase 1 is a membrane-bound, heterooligomeric enzyme containing an [Fe-S] cluster and nickel (DerVartanian et al. 1996; Sawers and Boxer 1986). It is involved in the uptake and oxidation of hydrogen, using electron acceptors of high redox potential. It is believed to be important under microaerophilic conditions, serving as an energy provider and/or in a protective role. Its protective role may involve reducing the concentration of O2 under low oxygen conditions to levels favourable for the action of more stringent anaerobic enzymes (Laurinavichene et al. 2002). Transcriptional regulation of the operon is very complex.

An increase was also observed in all of the ccm genes (ccmABCDEFGH), with changes ranging from 1.5 fold to 2.1 fold (Table 7). The ccm genes encode a maturation system for cytochrome c proteins, which play an important role in respiratory electron transport chains (Thony-Meyer et al. 1995). The ccm (also referred to as Type I) maturation system is one of at least three major cytochrome c maturation systems known to exist, and is found in certain prokaryotes, as well as in the mitochondria of plants and protozoa. The proteins are believed to form a membrane-protein complex, which mediates the stereospecific insertion of heme into selected apocytochrome c molecules (Thony-Meyer 2002). In addition, Ccm proteins have proposed roles in numerous other processes, including iron acquisition (siderophore dependent and independent) and heme biosynthesis, among others (Cianciotto et al. 2005). The potential association with iron acquisition is particularly intriguing in light of our observed changes in other metal transport/stress related genes mentioned above. The ccm genes are part of the larger nap-ccm operon (Figure 28), containing a total of 15 genes: 7 nap genes encoding a functional nitrate reductase followed by the 8 ccm genes (Grove et al. 1996). The operon is induced anaerobically in the presence of nitrate or nitrite, and is under the control of the Fnr, NarP, and NarL transcriptional regulators (Choe and Reznikoff 1993; Rabin and Stewart 1993). In addition, transcription can also be directed from promoters within the downstream nap and ccm genes (Grove et al. 1996; Tanapongpipat et al. 1998). In our microarray analysis, we detected changes in the napHBC genes, as well as in the ccm genes (Table 7). The association of nap-ccm genes with both anaerobic respiratory processes and iron uptake fits with the general observations of our microarray results from the RavA and RavA-ViaA overexpressors.

Another interesting change is a decrease in two operons involved in the degradation of threonine, tdcABCDEFG and kbl-tdh (Figure 28). The tdc operon encodes proteins involved in threonine/serine transport and degradation, and is primarily activated under anaerobic conditions, providing a source of metabolic energy (Sawers 1998). Regulation of this operon is complex, involving the operon-specific transcriptional regulators TdcA and TdcR, as well as integration host factor (IHF) and cyclic-AMP receptor protein (CRP) (Ganduri et al. 1993; Schweizer and Datta 1989; Wu et al. 1992; Wu and Datta 1992). The Fnr and ArcA transcriptional regulators, involved in coordinated expression of aerobic and anaerobic genes, are also important for the expression of the tdc operon; however, their regulation is believed to be indirect (Chattopadhyay et al. 1997). The kbl and tdh genes comprise an alternate threonine degradation pathway, encoding two enzymes which work to convert threonine into glycine and are important in the utilization of this amino acid as a carbon source (Boylan and Dekker 1983). Transcription of this operon is upregulated in rich media and by leucine, and is downregulated in minimal media and by the Lrp transcriptional regulator (Boylan and Dekker 1983; Ernsting et al. 1992). Although both operons encode threonine degradation systems, the regulation of these genes differs. As such, the cause of the decrease in transcription of these two operons in the ravA deletion strain is not immediately clear. One possibility is that the two pathways share a common regulatory element which has yet to be identified, and that this element is disrupted in the ravA deletion. In our profiling analysis, kbl and tdh-like genes are detected in all of the RavA-containing organisms and are not significantly enriched with respect to non-RavA organisms (Supplementary Table 2). Furthermore, the Tdh enzyme has been shown to be both multimeric and metal-dependent (Boylan and Dekker 1981; Epperly and Dekker 1991).

One other notable change is a mild decrease (~1.6 fold) in expression of the melA gene, encoding an α-galactosidase. MelA was observed only in RavA-containing organisms in our profiling analysis, and the change in level in the microarray provides further support for a functional interaction between MelA and the RavA-ViaA system.

Numerous other changes were also observed; however, the diverse nature of these genes makes it difficult to identify patterns and trends. In addition, many of the genes are not well characterized and, therefore, currently provide little useful information. The substantial number of changes in intergenic regions is also intriguing, and it is tempting to speculate that they might encode regulatory RNA molecules or small peptides.

4.4.3 The Putative Functions of RavA-ViaA

The MoxR family of AAA+ proteins is a diverse group of ATPases, widespread throughout bacteria and archaea, which, surprisingly, is still poorly characterized. The family comprises a number of smaller subfamilies which we have previously described in detail (Snider and Houry 2006). This work suggested a potential role for MoxR AAA+ proteins as molecular chaperones involved in the assembly of multimeric protein complexes and possibly mediating metal insertion events.

Much of our earlier experimental work has focused on the RavA subfamily of MoxR AAA+ proteins. We have performed both bioinformatics analyses and experimental characterization, using Escherichia coli K12 MG1655 RavA as our model (Snider et al. 2006).

Despite considerable work, the function of RavA remains elusive. In this report we have described a profiling and microarray study undertaken in an effort to further our understanding of the function of RavA. Our profiling approach served as an extension of our previously described gene-neighbourhood analysis, attempting to identify potential RavA interaction partners based upon conservation. The method successfully identified ViaA, consistent with our previous results, as well as 18 other potential candidates. Many of these genes are associated with the cell envelope, providing evidence that RavA-ViaA targets may be membrane associated and/or periplasmic. This observation is consistent with the recent finding that Tn5-mediated disruption of the viaA gene of Escherichia coli caused an increase in the production of outer membrane vesicles (vesiculation) (McBroom et al. 2006; McBroom and Kuehn 2006). Vesiculation serves multiple roles, including acting as an envelope stress-response system. The association of the phenotype with the RavA-ViaA system may suggest that this system is important for the vesiculation process, or that disruption of the system results in an envelope stress triggering the vesiculation response. RavA also profiled with a number of genes predicted to be cytoplasmic, suggesting that RavA targets may not be associated solely with the cell envelope. Also, the possibility of the involvement of RavA-ViaA with proteins which are highly conserved across gamma proteobacteria in general cannot be ruled out.
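The notion of a gene being "enriched" in RavA-containing organisms can be illustrated with a minimal sketch. This is not the procedure used in our profiling analysis; it assumes a simple presence/absence table and a one-sided Fisher's exact test, and the function name and all organism counts below are hypothetical values chosen only for illustration.

```python
from math import comb


def enrichment_p_value(with_both, rava_only, gene_only, neither):
    """One-sided Fisher's exact test (hypergeometric upper tail).

    Hypothetical helper, not the thesis method. Arguments are organism counts:
      with_both - organisms containing RavA and the candidate gene
      rava_only - organisms containing RavA but not the candidate gene
      gene_only - organisms containing the candidate gene but not RavA
      neither   - organisms containing neither
    Returns the probability of seeing at least `with_both` co-occurrences
    if the candidate gene were distributed independently of RavA.
    """
    n_rava = with_both + rava_only      # RavA-containing organisms
    n_gene = with_both + gene_only      # organisms carrying the candidate gene
    total = n_rava + gene_only + neither
    upper = min(n_rava, n_gene)
    tail = sum(
        comb(n_rava, k) * comb(total - n_rava, n_gene - k)
        for k in range(with_both, upper + 1)
    )
    return tail / comb(total, n_gene)


# Illustrative counts only (not data from this work):
# a gene found in every RavA organism and in no non-RavA organism
print(enrichment_p_value(10, 0, 0, 10))
# a gene found in every organism surveyed, with or without RavA
print(enrichment_p_value(10, 0, 10, 0))
```

Under these assumptions, a gene restricted to RavA-containing organisms gives a very small p-value, whereas a gene present in every organism surveyed gives a p-value of 1: it is detected in all RavA-containing organisms yet is not enriched, which is the situation described for proteins that are highly conserved in general.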

Our microarray analyses were performed on cells which had altered levels of RavA or RavA-ViaA. By disrupting the wild-type levels of these proteins we hoped to generate alterations in the transcriptional levels of various pathways associated with RavA-ViaA function, and, thereby, obtain clues as to the role of RavA-ViaA in the cell. While the information obtained from the microarray is interesting, the use of this method to elucidate the function of genes which are not transcription factors clearly presents unique challenges. Notably, by attempting to examine both mild changes, as well as stronger changes, we are forced to contend with potentially high levels of background, which serve to complicate the interpretation of our results. In addition, while disruptions in specific pathways are clearly observed, the exact role of RavA/ViaA in these pathways, and whether or not the observed changes are due to direct or indirect effects, cannot be established. Specific interaction partners for RavA/ViaA are also not necessarily apparent using this method, as said targets may themselves not undergo expression changes at the transcriptional level. Nevertheless, the technique can serve as a useful guide, and the information it provides should help in the development of further experimentation, as well as prove valuable in the interpretation of results obtained using different experimental approaches.

An overview of the microarray data suggests a possible involvement of the RavA and ViaA system in metal/sulfur metabolism/insertion. Disruptions in a number of metal and sulfur transport/metabolism systems, as well as changes in operons associated with the synthesis and repair of [Fe-S] clusters were observed. In addition, cells overexpressing RavA alone appeared to undergo a nitrosative stress response. One tempting hypothesis is that RavA/ViaA might work on an [Fe-S] cluster containing protein(s) involved in nitrogen metabolism. Numerous other changes were observed, however, including a large number of genes of unknown function, suggesting possible involvement of RavA/ViaA in systems which have not yet been well characterized. The large number of changes observed, in a

The only gene that is common between our profiling analysis and the microarray study is melA. In our profiling analysis, melA was found to be highly enriched in the RavA-containing organisms (Table 2), while our microarray shows that melA levels are reduced in the ΔravA::cat strain relative to those in the wild-type strain (Table 8). MelA is a metal-dependent α-galactosidase that functions as a homodimer. It binds one NAD and one manganese ion per subunit. This may suggest that the melA gene product serves as a RavA substrate, consistent with the suggested role of RavA-ViaA in metal insertion.

In conclusion, our analyses have provided us with a number of potential interaction partners of the RavA-ViaA system, and have suggested a range of systems in which RavA and ViaA might function. Future characterization and phenotypic analyses associated with these candidate proteins and systems should validate some of these results and more clearly define the cellular role of the RavA family. This in turn should prove invaluable in furthering our understanding of the MoxR AAA+ proteins in general, helping us to unlock the secrets of a large and intriguing family of bacterial and archaeal ATPases.

5. General Conclusion and Future Directions

5.1 The MoxR Family

The MoxR family of AAA+ proteins is widespread throughout the Bacteria and Archaea superkingdoms, and a number of distinct lineages have evolved. Our bioinformatics analysis, experimental work, and careful examination of the available literature have provided us with important information regarding these proteins, and have laid the groundwork upon which to base future studies. The work presented herein supports a role for these proteins in a variety of different systems, but suggests a common ‘theme’ of action, specifically an involvement in the assembly of multimeric protein complexes, and a possible involvement in metal insertion events. We have also observed a clear association between MoxR AAA+ proteins and proteins containing the Von Willebrand Factor Type A (VWA) domain, something which may suggest that they work in a manner analogous to the metal chelatase enzymes, which also make use of both AAA+ and VWA domains to mediate metal insertion events. Our experimental work using the RavA protein of Escherichia coli K12, a representative member of the RavA subfamily of MoxR AAA+ proteins, has shown directly that the RavA-associated VWA protein, ViaA, is capable of stimulating RavA ATPase activity, and that the genes encoding the two proteins comprise an operon. Thus it seems likely that a similar trend can be expected for members of other MoxR subfamilies. The exact role of these VWA proteins has yet to be established, however. Our gene neighbourhood analysis has also identified other genes potentially associated with specific MoxR family members, and these may represent either other functional partners or potential substrates. Known substrates, supported by limited experimental data, include methanol dehydrogenase, nitric oxide reductase, RuBisCO and gas vesicles. For the vast majority of MoxR proteins, however, clear substrates/interaction partners have yet to be identified, and several of the associated genes we have detected have no known function.

One major avenue of future work, therefore, is the experimental investigation of members of each of the various MoxR subfamilies, using representatives from key organisms.

For the MoxR proteins for which potential substrates have been identified, the individual MoxR proteins and their interaction partners/substrates need to be biochemically characterized and their mechanisms of action identified. Whether or not these MoxR proteins recognize other substrates, or are involved in additional systems, also needs to be determined. With regard to completely uncharacterized MoxR proteins, even more basic research needs to be carried out. In addition to biochemical characterization of these MoxR proteins, deletion studies and phenotype screening will be necessary, in order to identify systems and partners with which these proteins are involved. Interaction studies, such as pull-downs with the MoxR ATPases, also need to be performed. In some cases our gene neighbourhood analysis has identified potentially associated proteins, and these should facilitate analysis. Characterization of these proteins, examination of their interaction with their respective MoxR ATPases, and identification of other proteins with which they interact should prove informative.

5.2 The RavA Subfamily

We have already begun characterization of the RavA subfamily of MoxR AAA+ proteins, using a representative member from Escherichia coli K12 MG1655 as our model, and this work has been described herein. Although we have obtained extensive information about RavA, its specific function remains elusive and much more research is needed.

5.2.1 Phenotype Screening

To date, our general phenotypic screening of the ΔravA strain, including growth on a range of different nutrients and exposure to a variety of stresses, has not yet produced a conclusive phenotype (see Supplementary Table 3). Intriguingly, however, recent work from one of our collaborators has revealed that Salmonella enterica strains carrying a ravA deletion display several distinct phenotypes, including reduced growth in low pH minimal media under conditions of limited Mg2+, formation of unusually small colonies on plates, greater susceptibility to oxidative stress, reduced survival in macrophages and a reduction in the secretion of certain key proteins associated with virulence (Yamamoto et al., unpublished results). This is a strong contrast to the lack of phenotype observed in our E. coli K12 MG1655 mutants, and may suggest an important role for RavA in virulent bacterial strains.

Notably, however, efforts to complement these phenotypes by reintroduction of the ravA gene on a plasmid have not been successful (Yamamoto et al., unpublished results). One of our current projects, therefore, is focused on attempting to reproduce their results by generating ravA deletions in a variety of Salmonella strains, and, if successful, to attempt to complement them.

In addition, a recent study has reported that Escherichia coli cells containing a Tn5-mediated disruption of the viaA gene have an increase in the production of outer membrane vesicles (McBroom et al. 2006; McBroom and Kuehn 2006), a process which serves multiple roles including acting as an envelope stress-response system. These results may be consistent with those of our profiler, where many of the detected candidates were observed to be involved in the bacterial envelope. We are currently attempting to confirm this phenotype using our ΔravA and ΔviaA strains.

5.2.2 Characterization of RavA-ViaA

While we demonstrated a clear association between RavA and ViaA, suggesting they represent a functional system, we are still not sure exactly how these proteins interact, or what their exact role is. Thus characterization of ViaA, and its interaction with the RavA protein, represents a major avenue of future research. Basic biochemical characterization of the ViaA protein will be important, as will further, more direct demonstrations of a physical interaction between RavA and ViaA. The nature of this physical interaction, and the specific domains/sequence regions involved, should be examined, for instance using mutagenesis and truncation studies. In addition, experiments to identify other specific proteins with which ViaA might interact (e.g. pull-down studies) should prove invaluable, and might provide further clues as to the function of the RavA-ViaA system.

5.2.3 Characterization of RavA-LdcI

The interaction between RavA and LdcI also needs to be further studied. While the formation of the cage-like structure is particularly fascinating, its role is still unknown. Our results have clearly demonstrated that LdcI does not represent a RavA substrate, but that the complex does appear to result in stabilization of the RavA oligomer in the presence of nucleotide, and a corresponding increase in its ATPase activity. Thus we speculate that the function of the complex may be to regulate RavA, possibly in response to acid stress conditions.

As the interaction leads to an increase in RavA activity, we propose that the complex may result in a stabilization of RavA under stress conditions, and/or may possibly be involved in redirection of RavA activity towards an alternative set of substrates. Future research is necessary, however, in order to validate this hypothesis. Useful experiments may include in vitro studies, using purified proteins, to measure RavA ATPase activity under low pH in the presence of LdcI (and possibly ViaA as well) or similar assays performed in whole-cell lysates after co-expression of the proteins. Measurement of RavA stability in WT vs ΔldcI strains at low pH (e.g. using pulse-chase) might also prove informative. Attempts to identify interactors which specifically recognize the complex should also be performed, in an effort to detect cage-specific substrates or alternative functional partners. Structural studies of the LdcI-RavA interaction are also important, and are currently underway. Efforts to determine the crystal structure of the co-complex, and mutagenesis/truncation studies to identify key interaction surfaces are also in progress, and this work is greatly facilitated by the recent determination of the crystal structure of the LdcI decamer in the absence of RavA. In addition, this structural study has identified ppGpp (guanosine 3’,5’-bispyrophosphate), a molecule involved in the bacterial stringent response, as a binding factor of LdcI, and further work has suggested that the molecule acts to regulate LdcI activity (Kanjee et al., unpublished results). The stringent response represents an adaptive bacterial stress response to conditions of nutritional stress (Jain et al. 2006), and it is notable that the presence of this molecule may be consistent with the observation that LdcI is important for E. coli acid stress response under conditions of anaerobic phosphate starvation (Moreau 2007). Whether or not the RavA-LdcI complex relates to the regulation of LdcI by ppGpp is another important avenue of investigation.

It is important to note that interaction with the LdcI protein cannot represent a general feature of all RavA family members, as profiling analysis does not detect an LdcI homologue in all organisms known to contain RavA (data not shown), although the enzyme is detected in all RavA-containing gammaproteobacteria (Supplementary Table 2). This suggests that the interaction is confined to a particular subset of RavA-containing organisms, and an investigation of why only this specific subset displays the interaction also represents an important avenue of future research.

5.2.4 Alternate Substrates, Interaction Partners and Systems

Our profiling analysis and microarray experiments have also identified a number of potential interaction partners and systems in which RavA may be involved. It will be important to characterize these candidate proteins and systems in order to determine if their function is in fact related to RavA. For instance, the functioning of specific proteins and systems (e.g. sulfur assimilation, metal transport, specific metabolic pathways etc.) should be assessed in ΔravA and RavA overexpression strains, to determine if any alteration in their activity is observed. Notably, however, for many of the observed changes in the microarray studies, further verification of said changes using Northern blotting and quantitative PCR should be performed prior to a more in-depth analysis. In some cases, such as the profiling analysis where only a limited number of candidates were identified, or for selected candidates identified in our microarray analysis (e.g. FeoABC, AsnABC), it may also be feasible to clone, purify and characterize many of these proteins, and directly assess their interaction with purified RavA in vitro, or to analyze their interaction with RavA using a co-expression approach.

5.3 Conclusion

Thus, it is apparent that, while we have learned a good deal about the MoxR AAA+ family, and the RavA subfamily in particular, there is still much to learn. The wide distribution of this family implies that these proteins play an important role, while their restriction to the Bacteria and Archaea superkingdoms of life suggests that an understanding of these systems may not only help us understand the physiology of lower organisms, but may also be invaluable in helping us develop treatments against certain non-eukaryotic pathogens. The importance of conducting research into the MoxR proteins cannot be denied, and future work in this area should prove both rewarding and insightful.

6. References

Altschul, S. F., Gish, W., Miller, W., Myers, E. W., and Lipman, D. J. (1990). "Basic local alignment search tool." J Mol Biol, 215(3), 403-10.

Amaratunga, K., Goodwin, P. M., O'Connor, C. D., and Anthony, C. (1997). "The methanol oxidation genes mxaFJGIR(S)ACKLD in Methylobacterium extorquens." FEMS Microbiol Lett, 146(1), 31-8.

Ammelburg, M., Frickey, T., and Lupas, A. N. (2006). "Classification of AAA+ proteins." J Struct Biol, 156(1), 2-11.

Andrews, S. C., Shipley, D., Keen, J. N., Findlay, J. B., Harrison, P. M., and Guest, J. R. (1992). "The haemoglobin-like protein (HMP) of Escherichia coli has ferrisiderophore reductase activity and its C-terminal domain shares homology with ferredoxin NADP+ reductases." FEBS Lett, 302(3), 247-52.

Anthony, C. (2004). "The quinoprotein dehydrogenases for methanol and glucose." Arch Biochem Biophys, 428(1), 2-9.

Arai, H., Igarashi, Y., and Kodama, T. (1994). "Structure and ANR-dependent transcription of the nir genes for denitrification from Pseudomonas aeruginosa." Biosci Biotechnol Biochem, 58(7), 1286-91.

Arai, H., Kodama, T., and Igarashi, Y. (1998). "The role of the nirQOP genes in energy conservation during anaerobic growth of Pseudomonas aeruginosa." Biosci Biotechnol Biochem, 62(10), 1995-9.

Arai, H., Kodama, T., and Igarashi, Y. (1999). "Effect of nitrogen oxides on expression of the nir and nor genes for denitrification in Pseudomonas aeruginosa." FEMS Microbiol Lett, 170(1), 19-24.

Ariyoshi, M., Nishino, T., Iwasaki, H., Shinagawa, H., and Morikawa, K. (2000). "Crystal structure of the holliday junction DNA in complex with a single RuvA tetramer." Proc Natl Acad Sci U S A, 97(15), 8257-62.

Ast, V. M., Schoenhofen, I. C., Langen, G. R., Stratilo, C. W., Chamberlain, M. D., and Howard, S. P. (2002). "Expression of the ExeAB complex of Aeromonas hydrophila is required for the localization and assembly of the ExeD secretion port multimer." Mol Microbiol, 44(1), 217-31.

Auger, E. A., and Bennett, G. N. (1989). "Regulation of lysine decarboxylase activity in Escherichia coli K-12." Arch Microbiol, 151(5), 466-8.

Austin, S., and Dixon, R. (1992). "The prokaryotic enhancer binding protein NTRC has an ATPase activity which is phosphorylation and DNA dependent." Embo J, 11(6), 2219-28.

Baas, P. W., Karabay, A., and Qiang, L. (2005). "Microtubules cut and run." Trends Cell Biol, 15(10), 518-24.

Baker, T. S., and Cheng, R. H. (1996). "A model-based approach for determining orientations of biological macromolecules imaged by cryoelectron microscopy." J Struct Biol, 116(1), 120-30.

Bartnikas, T. B., Tosques, I. E., Laratta, W. P., Shi, J., and Shapleigh, J. P. (1997). "Characterization of the nitric oxide reductase-encoding region in Rhodobacter sphaeroides 2.4.3." J Bacteriol, 179(11), 3534-40.

Bauer, A., Chauvet, S., Huber, O., Usseglio, F., Rothbacher, U., Aragnol, D., Kemler, R., and Pradel, J. (2000). "Pontin52 and reptin52 function as antagonistic regulators of beta-catenin signalling activity." Embo J, 19(22), 6121-30.

Bauer, A., Huber, O., and Kemler, R. (1998). "Pontin52, an interaction partner of beta-catenin, binds to the TATA box binding protein." Proc Natl Acad Sci U S A, 95(25), 14787-92.

Beinker, P., Schlee, S., Groemping, Y., Seidel, R., and Reinstein, J. (2002). "The N terminus of ClpB from Thermus thermophilus is not essential for the chaperone activity." J Biol Chem, 277(49), 47160-6.

Bell, S. P., and Stillman, B. (1992). "ATP-dependent recognition of eukaryotic origins of DNA replication by a multiprotein complex." Nature, 357(6374), 128-34.

Belmap, D. M., Conway, J.F., Heymann, J.B. (2004). "PFT2 and EM3DR2." http://www.niams.nih.gov/rcn/labbranch/lsbr/software/pft2_em3dr2/pft2_em3dr2.htm.

Benaroudj, N., and Goldberg, A. L. (2000). "PAN, the proteasome-activating nucleotidase from archaebacteria, is a protein-unfolding molecular chaperone." Nat Cell Biol, 2(11), 833-9.

Benaroudj, N., Zwickl, P., Seemuller, E., Baumeister, W., and Goldberg, A. L. (2003). "ATP hydrolysis by the proteasome regulatory complex PAN serves multiple functions in protein degradation." Mol Cell, 11(1), 69-78.

Besche, H., Tamura, N., Tamura, T., and Zwickl, P. (2004). "Mutational analysis of conserved AAA+ residues in the archaeal Lon protease from Thermoplasma acidophilum." FEBS Lett, 574(1-3), 161-6.

Beyer, A. (1997). "Sequence analysis of the AAA protein family." Protein Sci, 6(10), 2043-58.

Bielinska, A., and Hulanicka, D. (1986). "Regulation of the cysB gene expression in Escherichia coli." Acta Biochim Pol, 33(2), 133-7.

Birschmann, I., Rosenkranz, K., Erdmann, R., and Kunau, W. H. (2005). "Structural and functional analysis of the interaction of the AAA-peroxins Pex1p and Pex6p." Febs J, 272(1), 47-58.

Blatch, G. L., and Lassle, M. (1999). "The tetratricopeptide repeat: a structural motif mediating protein-protein interactions." Bioessays, 21(11), 932-9.

Blattner, F. R., Plunkett, G., 3rd, Bloch, C. A., Perna, N. T., Burland, V., Riley, M., Collado-Vides, J., Glasner, J. D., Rode, C. K., Mayhew, G. F., Gregor, J., Davis, N. W., Kirkpatrick, H. A., Goeden, M. A., Rose, D. J., Mau, B., and Shao, Y. (1997). "The complete genome sequence of Escherichia coli K-12." Science, 277(5331), 1453-74.

Bodenmiller, D. M., and Spiro, S. (2006). "The yjeB (nsrR) gene of Escherichia coli encodes a nitric oxide-sensitive transcriptional regulator." J Bacteriol, 188(3), 874-81.

Bosl, B., Grimminger, V., and Walter, S. (2006). "The molecular chaperone Hsp104 - A molecular machine for protein disaggregation." J Struct Biol, 156(1), 139-148.

Botos, I., Melnikov, E. E., Cherry, S., Khalatova, A. G., Rasulova, F. S., Tropea, J. E., Maurizi, M. R., Rotanova, T. V., Gustchina, A., and Wlodawer, A. (2004). "Crystal structure of the AAA+ alpha domain of E. coli Lon protease at 1.9A resolution." J Struct Biol, 146(1-2), 113-22.

Bourniquel, A. A., and Bickle, T. A. (2002). "Complex restriction enzymes: NTP-driven molecular motors." Biochimie, 84(11), 1047-59.

Bowers, J. L., Randell, J. C., Chen, S., and Bell, S. P. (2004). "ATP hydrolysis by ORC catalyzes reiterative Mcm2-7 assembly at a defined origin of replication." Mol Cell, 16(6), 967-78.

Bowman, G. D., O'Donnell, M., and Kuriyan, J. (2004). "Structural analysis of a eukaryotic sliding DNA clamp-clamp loader complex." Nature, 429(6993), 724-30.

Boylan, S. A., and Dekker, E. E. (1981). "L-threonine dehydrogenase. Purification and properties of the homogeneous enzyme from Escherichia coli K-12." J Biol Chem, 256(4), 1809-15.

Boylan, S. A., and Dekker, E. E. (1983). "Growth, enzyme levels, and some metabolic properties of an Escherichia coli mutant grown on L-threonine as the sole carbon source." J Bacteriol, 156(1), 273-80.

Bramhill, D., and Kornberg, A. (1988). "Duplex opening by dnaA protein at novel sequences in initiation of replication at the origin of the E. coli chromosome." Cell, 52(5), 743-55.

Branzei, D., Seki, M., Onoda, F., and Enomoto, T. (2002). "The product of Saccharomyces cerevisiae WHIP/MGS1, a gene related to replication factor C genes, interacts functionally with DNA polymerase delta." Mol Genet Genomics, 268(3), 371-86.

Braun, V., and Braun, M. (2002). "Active transport of iron and siderophore antibiotics." Curr Opin Microbiol, 5(2), 194-201.

Braun, V., Patzer, S. I., and Hantke, K. (2002). "Ton-dependent colicins and microcins: modular design and evolution." Biochimie, 84(5-6), 365-80.

Breakefield, X. O., Kamm, C., and Hanson, P. I. (2001). "TorsinA: movement at many levels." Neuron, 31(1), 9-12.

Buck, M., Bose, D., Burrows, P., Cannon, W., Joly, N., Pape, T., Rappas, M., Schumacher, J., Wigneshweraraj, S., and Zhang, X. (2006). "A second paradigm for gene activation in bacteria." Biochem Soc Trans, 34(Pt 6), 1067-71.

Burgess, S. A., Walker, M. L., Sakakibara, H., Knight, P. J., and Oiwa, K. (2003). "Dynein structure and power stroke." Nature, 421(6924), 715-8.

Burgess, S. A., Walker, M. L., Sakakibara, H., Oiwa, K., and Knight, P. J. (2004). "The structure of dynein-c by negative stain electron microscopy." J Struct Biol, 146(1-2), 205-16.

Burton, R. E., Baker, T. A., and Sauer, R. T. (2005). "Nucleotide-dependent substrate recognition by the AAA+ HslUV protease." Nat Struct Mol Biol, 12(3), 245-51.

Butland, G., Peregrin-Alvarez, J. M., Li, J., Yang, W., Yang, X., Canadien, V., Starostine, A., Richards, D., Beattie, B., Krogan, N., Davey, M., Parkinson, J., Greenblatt, J., and Emili, A. (2005). "Interaction network containing conserved and essential protein complexes in Escherichia coli." Nature, 433(7025), 531-7.

Bykowski, T., van der Ploeg, J. R., Iwanicka-Nowicka, R., and Hryniewicz, M. M. (2002). "The switch from inorganic to organic sulphur assimilation in Escherichia coli: adenosine 5'-phosphosulphate (APS) as a signalling molecule for sulphate excess." Mol Microbiol, 43(5), 1347-58.

Caldwell, G. A., Cao, S., Sexton, E. G., Gelwix, C. C., Bevel, J. P., and Caldwell, K. A. (2003). "Suppression of polyglutamine-induced protein aggregation in Caenorhabditis elegans by torsin proteins." Hum Mol Genet, 12(3), 307-19.

Callan, A. C., Bunning, S., Jones, O. T., High, S., and Swanton, E. (2007). "Biosynthesis of the dystonia-associated AAA+ ATPase torsinA at the endoplasmic reticulum." Biochem J, 401(2), 607-12.

Cantoni, G. L. (1951). "Activation of methionine for transmethylation." J Biol Chem, 189(2), 745-54.

Carr, K. M., and Kaguni, J. M. (2001). "Stoichiometry of DnaA and DnaB protein in initiation at the Escherichia coli chromosomal origin." J Biol Chem, 276(48), 44919-25.

Cartron, M. L., Maddocks, S., Gillingham, P., Craven, C. J., and Andrews, S. C. (2006). "Feo--transport of ferrous iron into bacteria." Biometals, 19(2), 143-57.

Casadaban, M. J., and Cohen, S. N. (1980). "Analysis of gene control signals by DNA fusion and cloning in Escherichia coli." J Mol Biol, 138(2), 179-207.

Cashikar, A. G., Schirmer, E. C., Hattendorf, D. A., Glover, J. R., Ramakrishnan, M. S., Ware, D. M., and Lindquist, S. L. (2002). "Defining a pathway of communication from the C-terminal peptide binding domain to the N-terminal ATPase domain in a AAA protein." Mol Cell, 9(4), 751-60.

Cedar, H., and Schwartz, J. H. (1969). "The asparagine synthetase of Escherichia coli. I. Biosynthetic role of the enzyme, purification, and characterization of the reaction products." J Biol Chem, 244(15), 4112-21.

Chattopadhyay, S., Wu, Y., and Datta, P. (1997). "Involvement of Fnr and ArcA in anaerobic expression of the tdc operon of Escherichia coli." J Bacteriol, 179(15), 4868-73.

Cherepanov, P. P., and Wackernagel, W. (1995). "Gene disruption in Escherichia coli: TcR and KmR cassettes with the option of Flp-catalyzed excision of the antibiotic-resistance determinant." Gene, 158(1), 9-14.

Cho, S. G., Bhoumik, A., Broday, L., Ivanov, V., Rosenstein, B., and Ronai, Z. (2001). "TIP49b, a regulator of activating transcription factor 2 response to stress and DNA damage." Mol Cell Biol, 21(24), 8398-413.

Choe, M., and Reznikoff, W. S. (1993). "Identification of the regulatory sequence of anaerobically expressed locus aeg-46.5." J Bacteriol, 175(4), 1165-72.

Chomczynski, P., and Sacchi, N. (1987). "Single-step method of RNA isolation by acid guanidinium thiocyanate-phenol-chloroform extraction." Anal Biochem, 162(1), 156-9.

Chuang, S. E., and Blattner, F. R. (1993). "Characterization of twenty-six new heat shock genes of Escherichia coli." J Bacteriol, 175(16), 5242-52.

Cianciotto, N. P., Cornelis, P., and Baysse, C. (2005). "Impact of the bacterial type I cytochrome c maturation system on different biological processes." Mol Microbiol, 56(6), 1408-15.

Clark, D. P. (1989). "The fermentation pathways of Escherichia coli." FEMS Microbiol Rev, 5(3), 223-34.

Costa, A., Pape, T., van Heel, M., Brick, P., Patwardhan, A., and Onesti, S. (2006). "Structural studies of the archaeal MCM complex in different functional states." J Struct Biol, 156(1), 210-9.

Cruciat, C. M., Hell, K., Folsch, H., Neupert, W., and Stuart, R. A. (1999). "Bcs1p, an AAA-family member, is a chaperone for the assembly of the cytochrome bc(1) complex." Embo J, 18(19), 5226-33.

Darwin, K. H., Lin, G., Chen, Z., Li, H., and Nathan, C. F. (2005). "Characterization of a Mycobacterium tuberculosis proteasomal ATPase homologue." Mol Microbiol, 55(2), 561-71.

DasSarma, S., Arora, P., Lin, F., Molinari, E., and Yin, L. R. (1994). "Wild-type gas vesicle formation requires at least ten genes in the gvp gene cluster of Halobacterium halobium plasmid pNRC100." J Bacteriol, 176(24), 7646-52.

Datsenko, K. A., and Wanner, B. L. (2000). "One-step inactivation of chromosomal genes in Escherichia coli K-12 using PCR products." Proc Natl Acad Sci U S A, 97(12), 6640-5.

Davey, M. J., Fang, L., McInerney, P., Georgescu, R. E., and O'Donnell, M. (2002a). "The DnaC helicase loader is a dual ATP/ADP switch protein." Embo J, 21(12), 3148-59.

Davey, M. J., Indiani, C., and O'Donnell, M. (2003). "Reconstitution of the Mcm2-7p heterohexamer, subunit arrangement, and ATP site architecture." J Biol Chem, 278(7), 4491-9.

Davey, M. J., Jeruzalmi, D., Kuriyan, J., and O'Donnell, M. (2002b). "Motors and switches: AAA+ machines within the replisome." Nat Rev Mol Cell Biol, 3(11), 826-35.

de Boer, A. P., van der Oost, J., Reijnders, W. N., Westerhoff, H. V., Stouthamer, A. H., and van Spanning, R. J. (1996). "Mutational analysis of the nor gene cluster which encodes nitric-oxide reductase from Paracoccus denitrificans." Eur J Biochem, 242(3), 592-600.

De Carlo, S., Chen, B., Hoover, T. R., Kondrashkina, E., Nogales, E., and Nixon, B. T. (2006). "The structural basis for regulated assembly and function of the transcriptional activator NtrC." Genes Dev, 20(11), 1485-95.

DeLaBarre, B., and Brunger, A. T. (2003). "Complete structure of p97/valosin-containing protein reveals communication between nucleotide domains." Nat Struct Biol, 10(10), 856-63.

DeLaBarre, B., Christianson, J. C., Kopito, R. R., and Brunger, A. T. (2006). "Central pore residues mediate the p97/VCP activity required for ERAD." Mol Cell, 22(4), 451-62.

DerVartanian, M. E., Menon, N. K., Przybyla, A. E., Peck, H. D., Jr., and DerVartanian, D. V. (1996). "Electron paramagnetic resonance (EPR) studies on hydrogenase-1 (HYD1) purified from a mutant strain (AP6) of Escherichia coli enhanced in HYD1." Biochem Biophys Res Commun, 227(1), 211-5.

Djaman, O., Outten, F. W., and Imlay, J. A. (2004). "Repair of oxidized iron-sulfur clusters in Escherichia coli." J Biol Chem, 279(43), 44590-9.

Donovan, S., Harwood, J., Drury, L. S., and Diffley, J. F. (1997). "Cdc6p-dependent loading of Mcm proteins onto pre-replicative chromatin in budding yeast." Proc Natl Acad Sci U S A, 94(11), 5611-6.

Dougan, D. A., Reid, B. G., Horwich, A. L., and Bukau, B. (2002). "ClpS, a substrate modulator of the ClpAP machine." Mol Cell, 9(3), 673-83.

Drescher, A., Ruf, S., Calsa, T., Jr., Carrer, H., and Bock, R. (2000). "The two largest chloroplast genome-encoded open reading frames of higher plants are essential genes." Plant J, 22(2), 97-104.

Edgar, R. C. (2004). "MUSCLE: multiple sequence alignment with high accuracy and high throughput." Nucleic Acids Res, 32(5), 1792-7.

Englert, C., Wanner, G., and Pfeifer, F. (1992). "Functional analysis of the gas vesicle gene cluster of the halophilic archaeon Haloferax mediterranei defines the vac-region boundary and suggests a regulatory role for the gvpD gene or its product." Mol Microbiol, 6(23), 3543-50.

Epperly, B. R., and Dekker, E. E. (1991). "L-threonine dehydrogenase from Escherichia coli. Identification of an active site cysteine residue and metal ion studies." J Biol Chem, 266(10), 6086-92.

Erbse, A., Schmidt, R., Bornemann, T., Schneider-Mergener, J., Mogk, A., Zahn, R., Dougan, D. A., and Bukau, B. (2006). "ClpS is an essential component of the N-end rule pathway in Escherichia coli." Nature, 439(7077), 753-6. Erdmann, R., Wiebel, F. F., Flessau, A., Rytka, J., Beyer, A., Frohlich, K. U., and Kunau, W. H. (1991). "PAS1, a yeast gene required for peroxisome biogenesis, encodes a member of a novel family of putative ATPases." Cell, 64(3), 499-510. Ernst, J. A., and Brunger, A. T. (2003). "High resolution structure, stability, and synaptotagmin binding of a truncated neuronal SNARE complex." J Biol Chem, 278(10), 8630-6. Ernsting, B. R., Atkinson, M. R., Ninfa, A. J., and Matthews, R. G. (1992). "Characterization of the regulon controlled by the leucine-responsive regulatory protein in Escherichia coli." J Bacteriol, 174(4), 1109-18. Erzberger, J. P., Mott, M. L., and Berger, J. M. (2006). "Structural basis for ATP-dependent DnaA assembly and replication-origin remodeling." Nat Struct Mol Biol, 13(8), 676-83. Eshel, D. (1995). "Functional dissection of the dynein motor domain." Cell Motil Cytoskeleton, 32(2), 133-5. Faber, K. N., Heyman, J. A., and Subramani, S. (1998). "Two AAA family peroxins, PpPex1p and PpPex6p, interact with each other in an ATP-dependent manner and are associated with different subcellular membranous structures distinct from peroxisomes." Mol Cell Biol, 18(2), 936-43. Fan, N., Cutting, S., and Losick, R. (1992). "Characterization of the Bacillus subtilis sporulation gene spoVK." J Bacteriol, 174(3), 1053-4. Felsenstein, J. (1996). "Inferring phylogenies from protein sequences by parsimony, distance, and likelihood methods." Methods Enzymol, 266, 418-27. Ferianc, P., Farewell, A., and Nystrom, T. (1998). "The cadmium-stress stimulon of Escherichia coli K-12." Microbiology, 144 ( Pt 4), 1045-50. Filenko, N. A., Browning, D. F., and Cole, J. A. (2005). "Transcriptional regulation of a hybrid cluster (prismane) protein." 
Biochem Soc Trans, 33(Pt 1), 195-7. Flatley, J., Barrett, J., Pullan, S. T., Hughes, M. N., Green, J., and Poole, R. K. (2005). "Transcriptional responses of Escherichia coli to S-nitrosoglutathione under defined chemostat conditions reveal major changes in methionine biosynthesis." J Biol Chem, 280(11), 10065-72. Fleming, K. G., Hohl, T. M., Yu, R. C., Muller, S. A., Wolpensinger, B., Engel, A., Engelhardt, H., Brunger, A. T., Sollner, T. H., and Hanson, P. I. (1998). "A revised model for the oligomeric state of the N-ethylmaleimide-sensitive fusion protein, NSF." J Biol Chem, 273(25), 15675-81.

Flynn, J. M., Neher, S. B., Kim, Y. I., Sauer, R. T., and Baker, T. A. (2003). "Proteomic discovery of cellular substrates of the ClpXP protease reveals five classes of ClpX-recognition signals." Mol Cell, 11(3), 671-83. Fodje, M. N., Hansson, A., Hansson, M., Olsen, J. G., Gough, S., Willows, R. D., and Al-Karadaghi, S. (2001). "Interplay between an AAA module and an integrin I domain may regulate the function of magnesium chelatase." J Mol Biol, 311(1), 111-22. Forsburg, S. L. (2004). "Eukaryotic MCM proteins: beyond replication initiation." Microbiol Mol Biol Rev, 68(1), 109-31. Foster, J. W. (2004). "Escherichia coli acid resistance: tales of an amateur acidophile." Nat Rev Microbiol, 2(11), 898-907. Francetic, O., Belin, D., Badaut, C., and Pugsley, A. P. (2000). "Expression of the endogenous type II secretion pathway in Escherichia coli leads to chitinase secretion." Embo J, 19(24), 6697-703. Freymann, D. M., Keenan, R. J., Stroud, R. M., and Walter, P. (1999). "Functional changes in the structure of the SRP GTPase on binding GDP and Mg2+GDP." Nat Struct Biol, 6(8), 793-801. Frickey, T., and Lupas, A. N. (2004). "Phylogenetic analysis of AAA proteins." J Struct Biol, 146(1-2), 2-10. Fu, G. K., and Markovitz, D. M. (1998). "The human LON protease binds to mitochondrial promoters in a single-stranded, site-specific, strand-specific manner." Biochemistry, 37(7), 1905-9. Fu, G. K., Smith, M. J., and Markovitz, D. M. (1997). "Bacterial protease Lon is a site-specific DNA-binding protein." J Biol Chem, 272(1), 534-8. Fuchs, M., Gerber, J., Drapkin, R., Sif, S., Ikura, T., Ogryzko, V., Lane, W. S., Nakatani, Y., and Livingston, D. M. (2001). "The p400 complex is an essential E1A transformation target." Cell, 106(3), 297-307. Fukui, T., Eguchi, T., Atomi, H., and Imanaka, T. (2002). "A membrane-bound archaeal Lon protease displays ATP-independent proteolytic activity towards unfolded proteins and ATP-dependent activity for folded proteins."
J Bacteriol, 184(13), 3689-98. Fuller, R. S., Funnell, B. E., and Kornberg, A. (1984). "The dnaA protein complex with the E. coli chromosomal replication origin (oriC) and other DNA sites." Cell, 38(3), 889-900. Galani, K., Nissan, T. A., Petfalski, E., Tollervey, D., and Hurt, E. (2004). "Rea1, a dynein-related nuclear AAA-ATPase, is involved in late rRNA processing and nuclear export of 60 S subunits." J Biol Chem, 279(53), 55411-8. Gallie, D. R., Fortner, D., Peng, J., and Puthoff, D. (2002). "ATP-dependent hexameric assembly of the heat shock protein Hsp101 involves multiple interaction domains and a functional C-proximal nucleotide-binding domain." J Biol Chem, 277(42), 39617-26.

Ganduri, Y. L., Sadda, S. R., Datta, M. W., Jambukeswaran, R. K., and Datta, P. (1993). "TdcA, a transcriptional activator of the tdcABC operon of Escherichia coli, is a member of the LysR family of proteins." Mol Gen Genet, 240(3), 395-402. Gao, D., and McHenry, C. S. (2001a). "tau binds and organizes Escherichia coli replication proteins through distinct domains. Domain IV, located within the unique C terminus of tau, binds the replication fork, helicase, DnaB." J Biol Chem, 276(6), 4441-6. Gao, D., and McHenry, C. S. (2001b). "tau binds and organizes Escherichia coli replication through distinct domains. Partial proteolysis of terminally tagged tau to determine candidate domains and to assign domain V as the alpha binding domain." J Biol Chem, 276(6), 4433-40. Garbarino, J. E., and Gibbons, I. R. (2002). "Expression and genomic analysis of midasin, a novel and highly conserved AAA protein distantly related to dynein." BMC Genomics, 3(1), 18. Gardner, A. M., and Gardner, P. R. (2002). "Flavohemoglobin detoxifies nitric oxide in aerobic, but not anaerobic, Escherichia coli. Evidence for a novel inducible anaerobic nitric oxide-scavenging activity." J Biol Chem, 277(10), 8166-71. Gartner, W., Rossbacher, J., Zierhut, B., Daneva, T., Base, W., Weissel, M., Waldhausl, W., Pasternack, M. S., and Wagner, L. (2003). "The ATP-dependent helicase RUVBL1/TIP49a associates with tubulin during mitosis." Cell Motil Cytoskeleton, 56(2), 79-93. Gast, F. U., Brinkmann, T., Pieper, U., Kruger, T., Noyer-Weidner, M., and Pingoud, A. (1997). "The recognition of methylated DNA by the GTP-dependent restriction endonuclease McrBC resides in the N-terminal domain of McrB." Biol Chem, 378(9), 975-82. Gee, M. A., Heuser, J. E., and Vallee, R. B. (1997). "An extended microtubule-binding structure within the dynein motor domain." Nature, 390(6660), 636-9. Gerace, L. (2004). "TorsinA and torsion dystonia: Unraveling the architecture of the nuclear envelope." 
Proc Natl Acad Sci U S A, 101(24), 8839-40. Gibbons, I. R., Gibbons, B. H., Mocz, G., and Asai, D. J. (1991). "Multiple nucleotide-binding sites in the sequence of dynein beta heavy chain." Nature, 352(6336), 640-3. Gibbons, I. R., Lee-Eiford, A., Mocz, G., Phillipson, C. A., Tang, W. J., and Gibbons, B. H. (1987). "Photosensitized cleavage of dynein heavy chains. Cleavage at the "V1 site" by irradiation at 365 nm in the presence of ATP and vanadate." J Biol Chem, 262(6), 2780-6. Glick, R. E., and Sears, B. B. (1993). "Large unidentified open reading frame in plastid DNA (ORF2280) is expressed in chloroplasts." Plant Mol Biol, 21(1), 99-108. Glover, J. R., and Tkach, J. M. (2001). "Crowbars and ratchets: hsp100 chaperones as tools in reversing protein aggregation." Biochem Cell Biol, 79(5), 557-68.

Gomes, X. V., Schmidt, S. L., and Burgers, P. M. (2001). "ATP utilization by yeast replication factor C. II. Multiple stepwise ATP binding events are required to load proliferating cell nuclear antigen onto primed DNA." J Biol Chem, 276(37), 34776-83. Goodchild, R. E., and Dauer, W. T. (2005). "The AAA+ protein torsinA interacts with a conserved domain present in LAP1 and a novel ER protein." J Cell Biol, 168(6), 855-62. Goodchild, R. E., Kim, C. E., and Dauer, W. T. (2005). "Loss of the dystonia-associated protein torsinA selectively disrupts the neuronal nuclear envelope." Neuron, 48(6), 923-32. Goodenough, U., and Heuser, J. (1984). "Structural comparison of purified dynein proteins with in situ dynein arms." J Mol Biol, 180(4), 1083-118. Goodenough, U. W., Gebhart, B., Mermall, V., Mitchell, D. R., and Heuser, J. E. (1987). "High-pressure liquid chromatography fractionation of Chlamydomonas dynein extracts and characterization of inner-arm dynein subunits." J Mol Biol, 194(3), 481-94. Gorbalenya, A. E., and Koonin, E. V. (1989). "Viral proteins containing the purine NTP-binding sequence pattern." Nucleic Acids Res, 17(21), 8413-40. Gottesman, S. (1996). "Proteases and their targets in Escherichia coli." Annu Rev Genet, 30, 465-506. Gottesman, S. (2003). "Proteolysis in bacterial regulatory circuits." Annu Rev Cell Dev Biol, 19, 565-87. Gottesman, S., Roche, E., Zhou, Y., and Sauer, R. T. (1998). "The ClpXP and ClpAP proteases degrade proteins with carboxy-terminal peptide tails added by the SsrA-tagging system." Genes Dev, 12(9), 1338-47. Gozuacik, D., Chami, M., Lagorce, D., Faivre, J., Murakami, Y., Poch, O., Biermann, E., Knippers, R., Brechot, C., and Paterlini-Brechot, P. (2003). "Identification and functional characterization of a new member of the human Mcm protein family: hMcm8." Nucleic Acids Res, 31(2), 570-9. Griggs, D. W., Kafka, K., Nau, C. D., and Konisky, J. (1990).
"Activation of expression of the Escherichia coli cir gene by an iron-independent regulatory mechanism involving cyclic AMP-cyclic AMP receptor protein complex." J Bacteriol, 172(6), 3529-33. Griggs, D. W., and Konisky, J. (1989). "Mechanism for iron-regulated transcription of the Escherichia coli cir gene: metal-dependent binding of fur protein to the promoters." J Bacteriol, 171(2), 1048-54. Griggs, D. W., Tharp, B. B., and Konisky, J. (1987). "Cloning and promoter identification of the iron-regulated cir gene of Escherichia coli." J Bacteriol, 169(12), 5343-52. Grimaud, R., Kessel, M., Beuron, F., Steven, A. C., and Maurizi, M. R. (1998). "Enzymatic and structural similarities between the Escherichia coli ATP-dependent proteases, ClpXP and ClpAP." J Biol Chem, 273(20), 12476-81.

Groll, M., Bajorek, M., Kohler, A., Moroder, L., Rubin, D. M., Huber, R., Glickman, M. H., and Finley, D. (2000). "A gated channel into the proteasome core particle." Nat Struct Biol, 7(11), 1062-7. Groll, M., Bochtler, M., Brandstetter, H., Clausen, T., and Huber, R. (2005). "Molecular machines for protein degradation." Chembiochem, 6(2), 222-56. Groll, M., Ditzel, L., Lowe, J., Stock, D., Bochtler, M., Bartunik, H. D., and Huber, R. (1997). "Structure of 20S proteasome from yeast at 2.4 A resolution." Nature, 386(6624), 463-71. Grove, J., Tanapongpipat, S., Thomas, G., Griffiths, L., Crooke, H., and Cole, J. (1996). "Escherichia coli K-12 genes essential for the synthesis of c-type cytochromes and a third nitrate reductase located in the periplasm." Mol Microbiol, 19(3), 467-81. Guenther, B., Onrust, R., Sali, A., O'Donnell, M., and Kuriyan, J. (1997). "Crystal structure of the delta' subunit of the clamp-loader complex of E. coli DNA polymerase III." Cell, 91(3), 335-45. Guo, F., Maurizi, M. R., Esser, L., and Xia, D. (2002). "Crystal structure of ClpA, an Hsp100 chaperone and regulator of ClpAP protease." J Biol Chem, 277(48), 46743-52. Gwinn, M. L., Ramanathan, R., Smith, H. O., and Tomb, J. F. (1998). "A new transformation-deficient mutant of Haemophilus influenzae Rd with normal DNA uptake." J Bacteriol, 180(3), 746-8. Han, Y. W., Iwasaki, H., Miyata, T., Mayanagi, K., Yamada, K., Morikawa, K., and Shinagawa, H. (2001). "A unique beta-hairpin protruding from AAA+ ATPase domain of RuvB motor protein is involved in the interaction with RuvA DNA recognition protein for branch migration of Holliday junctions." J Biol Chem, 276(37), 35024-8. Hanson, P. I., and Whiteheart, S. W. (2005). "AAA+ proteins: have engine, will work." Nat Rev Mol Cell Biol, 6(7), 519-29. Hansson, M., and Kannangara, C. G. (1997). "ATPases and phosphate exchange activities in magnesium chelatase subunits of Rhodobacter sphaeroides." Proc Natl Acad Sci U S A, 94(24), 13351-6. Hantke, K.
(1983). "Identification of an iron uptake system specific for coprogen and rhodotorulic acid in Escherichia coli K12." Mol Gen Genet, 191(2), 301-6. Hantke, K. (1987). "Ferrous iron transport mutants in Escherichia coli K12." FEMS Microbiol Lett, 44, 53-57. Hantke, K. (1990). "Dihydroxybenzoylserine--a siderophore for E. coli." FEMS Microbiol Lett, 55(1-2), 5-8. Hargreaves, D., Rice, D. W., Sedelnikova, S. E., Artymiuk, P. J., Lloyd, R. G., and Rafferty, J. B. (1998). "Crystal structure of E.coli RuvA with bound DNA Holliday junction at 6 A resolution." Nat Struct Biol, 5(6), 441-6.

Hartman, J. J., Mahr, J., McNally, K., Okawa, K., Iwamatsu, A., Thomas, S., Cheesman, S., Heuser, J., Vale, R. D., and McNally, F. J. (1998). "Katanin, a microtubule-severing protein, is a novel AAA ATPase that targets to the centrosome using a WD40-containing subunit." Cell, 93(2), 277-87. Hartman, J. J., and Vale, R. D. (1999). "Microtubule disassembly by ATP-dependent oligomerization of the AAA enzyme katanin." Science, 286(5440), 782-5. Hattendorf, D. A., and Lindquist, S. L. (2002). "Cooperative kinetics of both Hsp104 ATPase domains and interdomain communication revealed by AAA sensor-1 mutants." Embo J, 21(1-2), 12-21. Hayashi, N. R., Arai, H., Kodama, T., and Igarashi, Y. (1997). "The novel genes, cbbQ and cbbO, located downstream from the RubisCO genes of Pseudomonas hydrogenothermophila, affect the conformational states and activity of RubisCO." Biochem Biophys Res Commun, 241(2), 565-9. Hayashi, N. R., Arai, H., Kodama, T., and Igarashi, Y. (1999). "The cbbQ genes, located downstream of the form I and form II RubisCO genes, affect the activity of both RubisCOs." Biochem Biophys Res Commun, 265(1), 177-83. Hayashi, N. R., and Igarashi, Y. (2002). "ATP binding and hydrolysis and autophosphorylation of CbbQ encoded by the gene located downstream of RubisCO genes." Biochem Biophys Res Commun, 290(5), 1434-40. He, Z., von Caemmerer, S., Hudson, G. S., Price, G. D., Badger, M. R., and Andrews, T. J. (1997). "Ribulose-1,5-bisphosphate carboxylase/oxygenase activase deficiency delays senescence of ribulose-1,5-bisphosphate carboxylase/oxygenase but progressively impairs its catalysis during tobacco leaf development." Plant Physiol, 115(4), 1569-80. Hegerl, R. (1996). "The EM Program Package: A Platform for Image Processing in Biological Electron Microscopy." J Struct Biol, 116(1), 30-4. Hegerl, R., and Altbauer, A. (1982). "The "EM" program system." Ultramicroscopy, 9(1-2), 109-16. Hekimi, S., and Kershaw, D. (1993). 
"Axonal guidance defects in a Caenorhabditis elegans mutant reveal cell-extrinsic determinants of neuronal morphology." J Neurosci, 13(10), 4254-71. Hewett, J., Gonzalez-Agosti, C., Slater, D., Ziefer, P., Li, S., Bergeron, D., Jacoby, D. J., Ozelius, L. J., Ramesh, V., and Breakefield, X. O. (2000). "Mutant torsinA, responsible for early-onset torsion dystonia, forms membrane inclusions in cultured neural cells." Hum Mol Genet, 9(9), 1403-13. Hewett, J., Ziefer, P., Bergeron, D., Naismith, T., Boston, H., Slater, D., Wilbur, J., Schuback, D., Kamm, C., Smith, N., Camp, S., Ozelius, L. J., Ramesh, V., Hanson, P. I., and Breakefield, X. O. (2003). "TorsinA in PC12 cells: localization in the endoplasmic reticulum and response to stress." J Neurosci Res, 72(2), 158-68. Heymann, J. B. (2001). "Bsoft: image and molecular processing in electron microscopy." J Struct Biol, 133(2-3), 156-69. Hille, R. (2005). "Molybdenum-containing hydroxylases." Arch Biochem Biophys, 433(1), 107-16. Hinnerwisch, J., Fenton, W. A., Furtak, K. J., Farr, G. W., and Horwich, A. L. (2005a). "Loops in the central channel of ClpA chaperone mediate protein binding, unfolding, and translocation." Cell, 121(7), 1029-41. Hinnerwisch, J., Reid, B. G., Fenton, W. A., and Horwich, A. L. (2005b). "Roles of the N-domains of the ClpA unfoldase in binding substrate proteins and in stable complex formation with the ClpP protease." J Biol Chem, 280(49), 40838-44. Hishida, T., Ohya, T., Kubota, Y., Kamada, Y., and Shinagawa, H. (2006). "Functional and physical interaction of yeast Mgs1 with PCNA: impact on RAD6-dependent DNA damage tolerance." Mol Cell Biol, 26(14), 5509-17. Hong, S. W., and Vierling, E. (2000). "Mutants of Arabidopsis thaliana defective in the acquisition of tolerance to high temperature stress." Proc Natl Acad Sci U S A, 97(8), 4392-7. Hook, P., Mikami, A., Shafer, B., Chait, B. T., Rosenfeld, S. S., and Vallee, R. B. (2005). "Long range allosteric control of cytoplasmic dynein ATPase activity by the stalk and C-terminal domains." J Biol Chem, 280(38), 33045-54. Houry, W. A. (2001). "Chaperone-assisted protein folding in the cell cytoplasm." Curr Protein Pept Sci, 2(3), 227-44. Howard, S. P., Gebhart, C., Langen, G. R., Li, G., and Strozen, T. G. (2006). "Interactions between peptidoglycan and the ExeAB complex during assembly of the type II secretin of Aeromonas hydrophila." Mol Microbiol, 59(3), 1062-72. Howard, S. P., Meiklejohn, H. G., Shivak, D., and Jahagirdar, R. (1996). "A TonB-like protein and a novel membrane protein containing an ATP-binding cassette function together in exotoxin secretion."
Mol Microbiol, 22(4), 595-604. Hu, G., Lin, G., Wang, M., Dick, L., Xu, R. M., Nathan, C., and Li, H. (2006). "Structure of the Mycobacterium tuberculosis proteasome and mechanism of inhibition by a peptidyl boronate." Mol Microbiol, 59(5), 1417-28. Humbert, R., and Simoni, R. D. (1980). "Genetic and biomedical studies demonstrating a second gene coding for asparagine synthetase in Escherichia coli." J Bacteriol, 142(1), 212-20. Hunter, J. S., Greene, R. C., and Su, C. H. (1975). "Genetic characterization of the metK locus in Escherichia coli K-12." J Bacteriol, 122(3), 1144-52.

Ikura, T., Ogryzko, V. V., Grigoriev, M., Groisman, R., Wang, J., Horikoshi, M., Scully, R., Qin, J., and Nakatani, Y. (2000). "Involvement of the TIP60 histone acetylase complex in DNA repair and apoptosis." Cell, 102(4), 463-73. Ishiguro, H., Shimokawa, T., Tsunoda, T., Tanaka, T., Fujii, Y., Nakamura, Y., and Furukawa, Y. (2002). "Isolation of HELAD1, a novel human helicase gene up-regulated in colorectal carcinomas." Oncogene, 21(41), 6387-94. Ishimi, Y. (1997). "A DNA helicase activity is associated with an MCM4, -6, and -7 protein complex." J Biol Chem, 272(39), 24508-13. Ito, K., and Akiyama, Y. (2005). "Cellular functions, mechanism of action, and regulation of FtsH protease." Annu Rev Microbiol, 59, 211-31. Iwasaki, H., Takahagi, M., Nakata, A., and Shinagawa, H. (1992). "Escherichia coli RuvA and RuvB proteins specifically interact with Holliday junctions and promote branch migration." Genes Dev, 6(11), 2214-20. Iwashima, A., Nishino, H., and Nose, Y. (1972). "Conversion of thiamine to thiamine monophosphate by cell-free extracts of Escherichia coli." Biochim Biophys Acta, 258(1), 333-6. Iyer, L. M., Leipe, D. D., Koonin, E. V., and Aravind, L. (2004). "Evolutionary history and higher order classification of AAA+ ATPases." J Struct Biol, 146(1-2), 11-31. Iyer, R., Williams, C., and Miller, C. (2003). "Arginine-agmatine antiporter in extreme acid resistance in Escherichia coli." J Bacteriol, 185(22), 6556-61. Jahagirdar, R., and Howard, S. P. (1994). "Isolation and characterization of a second exe operon required for extracellular protein secretion in Aeromonas hydrophila." J Bacteriol, 176(22), 6819-26. Jain, V., Kumar, M., and Chatterji, D. (2006). "ppGpp: stringent response and survival." J Microbiol, 44(1), 1-10. Jayasekera, M. M., Foltin, S. K., Olson, E. R., and Holler, T. P. (2000). "Escherichia coli requires the protease activity of FtsH for growth." Arch Biochem Biophys, 380(1), 103-7. Jensen, P. E., Gibson, L. C., and Hunter, C. N. (1999). 
"ATPase activity associated with the magnesium-protoporphyrin IX chelatase enzyme of Synechocystis PCC6803: evidence for ATP hydrolysis during Mg2+ insertion, and the MgATP-dependent interaction of the ChlI and ChlD subunits." Biochem J, 339 ( Pt 1), 127-34. Jentsch, S., and Rumpf, S. (2006). "Cdc48 (p97): a 'molecular gearbox' in the ubiquitin pathway?" Trends Biochem Sci. Jeruzalmi, D., O'Donnell, M., and Kuriyan, J. (2001). "Crystal structure of the processivity clamp loader gamma (gamma) complex of E. coli DNA polymerase III." Cell, 106(4), 429-41.

Johnson, A., and O'Donnell, M. (2003). "Ordered ATP hydrolysis in the gamma complex clamp loader AAA+ machine." J Biol Chem, 278(16), 14406-13. Johnson, A., and O'Donnell, M. (2005). "Cellular DNA replicases: components and dynamics at the replication fork." Annu Rev Biochem, 74, 283-315. Jonsson, Z. O., Dhar, S. K., Narlikar, G. J., Auty, R., Wagle, N., Pellman, D., Pratt, R. E., Kingston, R., and Dutta, A. (2001). "Rvb1p and Rvb2p are essential components of a chromatin remodeling complex that regulates transcription of over 5% of yeast genes." J Biol Chem, 276(19), 16279-88. Jungst, A., and Zumft, W. G. (1992). "Interdependence of respiratory NO reduction and nitrite reduction revealed by mutagenesis of nirQ, a novel gene in the denitrification gene cluster of Pseudomonas stutzeri." FEBS Lett, 314(3), 308-14. Justino, M. C., Almeida, C. C., Goncalves, V. L., Teixeira, M., and Saraiva, L. M. (2006). "Escherichia coli YtfE is a di-iron protein with an important function in assembly of iron-sulphur clusters." FEMS Microbiol Lett, 257(2), 278-84. Justino, M. C., Vicente, J. B., Teixeira, M., and Saraiva, L. M. (2005). "New genes implicated in the protection of anaerobically grown Escherichia coli against nitric oxide." J Biol Chem, 280(4), 2636-43. Kachlany, S. C., Planet, P. J., Bhattacharjee, M. K., Kollia, E., DeSalle, R., Fine, D. H., and Figurski, D. H. (2000). "Nonspecific adherence by Actinobacillus actinomycetemcomitans requires genes widespread in bacteria and archaea." J Bacteriol, 182(21), 6169-76. Kamm, C., Boston, H., Hewett, J., Wilbur, J., Corey, D. P., Hanson, P. I., Ramesh, V., and Breakefield, X. O. (2004). "The early onset dystonia protein torsinA interacts with kinesin light chain 1." J Biol Chem, 279(19), 19882-92. Kammler, M., Schon, C., and Hantke, K. (1993). "Characterization of the ferrous iron uptake system of Escherichia coli." J Bacteriol, 175(19), 6212-9. 
Kanemaki, M., Kurokawa, Y., Matsu-ura, T., Makino, Y., Masani, A., Okazaki, K., Morishita, T., and Tamura, T. A. (1999). "TIP49b, a new RuvB-like DNA helicase, is included in a complex together with another RuvB-like DNA helicase, TIP49a." J Biol Chem, 274(32), 22437-44. Kanemori, M., Nishihara, K., Yanagi, H., and Yura, T. (1997). "Synergistic roles of HslVU and other ATP-dependent proteases in controlling in vivo turnover of sigma32 and abnormal proteins in Escherichia coli." J Bacteriol, 179(23), 7219-25. Karata, K., Inagawa, T., Wilkinson, A. J., Tatsuta, T., and Ogura, T. (1999). "Dissecting the role of a conserved motif (the second region of homology) in the AAA family of ATPases. Site-directed mutagenesis of the ATP-dependent protease FtsH." J Biol Chem, 274(37), 26225-32.

Katerov, V., Lindgren, P. E., Totolian, A. A., and Schalen, C. (2000). "Streptococcal opacity factor: a family of bifunctional proteins with lipoproteinase and fibronectin-binding activities." Curr Microbiol, 40(3), 149-56. Keiler, K. C., Waller, P. R., and Sauer, R. T. (1996). "Role of a peptide tagging system in degradation of proteins synthesized from damaged messenger RNA." Science, 271(5251), 990-3. Kelman, Z., and Hurwitz, J. (2003). "Structural lessons in DNA replication from the third domain of life." Nat Struct Biol, 10(3), 148-50. Keseler, I. M., Collado-Vides, J., Gama-Castro, S., Ingraham, J., Paley, S., Paulsen, I. T., Peralta-Gil, M., and Karp, P. D. (2005). "EcoCyc: a comprehensive database resource for Escherichia coli." Nucleic Acids Res, 33(Database issue), D334-7. Kessel, M., Maurizi, M. R., Kim, B., Kocsis, E., Trus, B. L., Singh, S. K., and Steven, A. C. (1995). "Homology in structural organization between E. coli ClpAP protease and the eukaryotic 26 S proteasome." J Mol Biol, 250(5), 587-94. Kiel, J. A., Hilbrands, R. E., van der Klei, I. J., Rasmussen, S. W., Salomons, F. A., van der Heide, M., Faber, K. N., Cregg, J. M., and Veenhuis, M. (1999). "Hansenula polymorpha Pex1p and Pex6p are peroxisome-associated AAA proteins that functionally and physically interact." Yeast, 15(11), 1059-78. Kikuchi, Y., Kojima, H., Tanaka, T., Takatsuka, Y., and Kamio, Y. (1997). "Characterization of a second lysine decarboxylase isolated from Escherichia coli." J Bacteriol, 179(14), 4486-92. Kim, K. I., Cheong, G. W., Park, S. C., Ha, J. S., Woo, K. M., Choi, S. J., and Chung, C. H. (2000). "Heptameric ring structure of the heat-shock protein ClpB, a protein-activated ATPase in Escherichia coli." J Mol Biol, 303(5), 655-66. Kim, Y. I., Levchenko, I., Fraczkowska, K., Woodruff, R. V., Sauer, R. T., and Baker, T. A. (2001). "Molecular determinants of complex formation between Clp/Hsp100 ATPases and the ClpP peptidase." Nat Struct Biol, 8(3), 230-3. King, S. 
M., and Witman, G. B. (1987). "Structure of the alpha and beta heavy chains of the outer arm dynein from Chlamydomonas flagella. Masses of chains and sites of ultraviolet-induced vanadate-dependent cleavage." J Biol Chem, 262(36), 17596-604. King, T. H., Decatur, W. A., Bertrand, E., Maxwell, E. S., and Fournier, M. J. (2001). "A well-connected and conserved nucleoplasmic helicase is required for production of box C/D and H/ACA snoRNAs and localization of snoRNP proteins." Mol Cell Biol, 21(22), 7731-46. Kobor, M. S., Venkatasubrahmanyam, S., Meneghini, M. D., Gin, J. W., Jennings, J. L., Link, A. J., Madhani, H. D., and Rine, J. (2004). "A protein complex containing the conserved Swi2/Snf2-related ATPase Swr1p deposits histone variant H2A.Z into euchromatin." PLoS Biol, 2(5), E131. Kobori, J. A., and Kornberg, A. (1982). "The Escherichia coli dnaC gene product. III. Properties of the dnaB-dnaC protein complex." J Biol Chem, 257(22), 13770-5. Kolling, R., and Lother, H. (1985). "AsnC: an autogenously regulated activator of asparagine synthetase A transcription in Escherichia coli." J Bacteriol, 164(1), 310-5. Kong, X. P., Onrust, R., O'Donnell, M., and Kuriyan, J. (1992). "Three-dimensional structure of the beta subunit of E. coli DNA polymerase III holoenzyme: a sliding DNA clamp." Cell, 69(3), 425-37. Koo, B. M., Yoon, M. J., Lee, C. R., Nam, T. W., Choe, Y. J., Jaffe, H., Peterkofsky, A., and Seok, Y. J. (2004). "A novel fermentation/respiration switch protein regulated by enzyme IIAGlc in Escherichia coli." J Biol Chem, 279(30), 31613-21. Koonin, E. V., Wolf, Y. I., and Aravind, L. (2000). "Protein fold recognition using sequence profiles and its application in structural genomics." Adv Protein Chem, 54, 245-75. Kredich, N. M. (1996). "Biosynthesis of Cysteine." Escherichia coli and Salmonella 2nd Ed., F. C. Neidhardt, R. Curtiss, J. L. Ingraham, and E. C. C. Lin, eds., ASM, Washington DC, 514-524. Krishna, T. S., Kong, X. P., Gary, S., Burgers, P. M., and Kuriyan, J. (1994). "Crystal structure of the eukaryotic DNA polymerase processivity factor PCNA." Cell, 79(7), 1233-43. Krogan, N. J., Keogh, M. C., Datta, N., Sawa, C., Ryan, O. W., Ding, H., Haw, R. A., Pootoolal, J., Tong, A., Canadien, V., Richards, D. P., Wu, X., Emili, A., Hughes, T. R., Buratowski, S., and Greenblatt, J. F. (2003). "A Snf2 family ATPase complex required for recruitment of the histone H2A variant Htz1." Mol Cell, 12(6), 1565-76. Kruger, T., Wild, C., and Noyer-Weidner, M. (1995). "McrB: a prokaryotic protein specifically recognizing DNA containing modified cytosine residues." Embo J, 14(11), 2661-9. Krzewska, J., Konopa, G., and Liberek, K. (2001a).
"Importance of two ATP-binding sites for oligomerization, ATPase activity and chaperone function of mitochondrial Hsp78 protein." J Mol Biol, 314(4), 901-10. Krzewska, J., Langer, T., and Liberek, K. (2001b). "Mitochondrial Hsp78, a member of the Clp/Hsp100 family in Saccharomyces cerevisiae, cooperates with Hsp70 in protein refolding." FEBS Lett, 489(1), 92-6. Krzywda, S., Brzozowski, A. M., Verma, C., Karata, K., Ogura, T., and Wilkinson, A. J. (2002). "The crystal structure of the AAA domain of the ATP-dependent protease FtsH of Escherichia coli at 1.5 A resolution." Structure, 10(8), 1073-83. Kunau, W. H., Beyer, A., Franken, T., Gotte, K., Marzioch, M., Saidowsky, J., Skaletz-Rorowski, A., and Wiebel, F. F. (1993). "Two complementary approaches to study peroxisome biogenesis in Saccharomyces cerevisiae: forward and reversed genetics." Biochimie, 75(3-4), 209-24.

Kuner, R., Teismann, P., Trutzel, A., Naim, J., Richter, A., Schmidt, N., von Ahsen, O., Bach, A., Ferger, B., and Schneider, A. (2003). "TorsinA protects against oxidative stress in COS-1 and PC12 cells." Neurosci Lett, 350(3), 153-6. Kustedjo, K., Bracey, M. H., and Cravatt, B. F. (2000). "Torsin A and its torsion dystonia-associated mutant forms are lumenal glycoproteins that exhibit distinct subcellular localizations." J Biol Chem, 275(36), 27933-9. Kuznetsova, E., Proudfoot, M., Sanders, S. A., Reinking, J., Savchenko, A., Arrowsmith, C. H., Edwards, A. M., and Yakunin, A. F. (2005). "Enzyme genomics: Application of general enzymatic screens to discover new enzymes." FEMS Microbiol Rev, 29(2), 263-79. Kwon, A. R., Trame, C. B., and McKay, D. B. (2004a). "Kinetics of protein substrate degradation by HslUV." J Struct Biol, 146(1-2), 141-7. Kwon, Y. D., Nagy, I., Adams, P. D., Baumeister, W., and Jap, B. K. (2004b). "Crystal structures of the Rhodococcus proteasome with and without its pro-peptides: implications for the role of the pro-peptide in proteasome assembly." J Mol Biol, 335(1), 233-45. Labib, K., Tercero, J. A., and Diffley, J. F. (2000). "Uninterrupted MCM2-7 function required for DNA replication fork progression." Science, 288(5471), 1643-7. LaMonte, B. L., and Hughes, J. A. (2006). "In vivo hydrolysis of S-adenosylmethionine induces the met regulon of Escherichia coli." Microbiology, 152(Pt 5), 1451-9. Lanzetta, P. A., Alvarez, L. J., Reinach, P. S., and Candia, O. A. (1979). "An improved assay for nanomole amounts of inorganic phosphate." Anal Biochem, 100(1), 95-7. Lauhon, C. T., and Kambampati, R. (2000). "The iscS gene in Escherichia coli is required for the biosynthesis of 4-thiouridine, thiamin, and NAD." J Biol Chem, 275(26), 20096-103. Laurinavichene, T. V., Zorin, N. A., and Tsygankov, A. A. (2002). "Effect of redox potential on activity of hydrogenase 1 and hydrogenase 2 in Escherichia coli." Arch Microbiol, 178(6), 437-42. Lee, J.
E., and Ahn, T. I. (2000). "Periplasmic localization of a GroES homologue in Escherichia coli transformed with groESx cloned from Legionella-like endosymbionts in Amoeba proteus." Res Microbiol, 151(8), 605-18. Lee, J. K., and Hurwitz, J. (2000). "Isolation and characterization of various complexes of the minichromosome maintenance proteins of Schizosaccharomyces pombe." J Biol Chem, 275(25), 18871-8. Lee, S., Sowa, M. E., Watanabe, Y. H., Sigler, P. B., Chiu, W., Yoshida, M., and Tsai, F. T. (2003a). "The structure of ClpB: a molecular chaperone that rescues proteins from an aggregated state." Cell, 115(2), 229-40.


Lee, S. Y., De La Torre, A., Yan, D., Kustu, S., Nixon, B. T., and Wemmer, D. E. (2003b). "Regulation of the transcriptional activator NtrC1: structural studies of the regulatory and AAA+ ATPase domains." Genes Dev, 17(20), 2552-63. Lee, Y. J., and Wickner, R. B. (1992). "AFG1, a new member of the SEC18-NSF, PAS1, CDC48-VCP, TBP family of ATPases." Yeast, 8(9), 787-90. Leidhold, C., von Janowsky, B., Becker, D., Bender, T., and Voos, W. (2006). "Structure and function of Hsp78, the mitochondrial ClpB homolog." J Struct Biol, 156(1), 149-64. Leimkuhler, S., and Rajagopalan, K. V. (2001). "A sulfurtransferase is required in the transfer of cysteine sulfur in the in vitro synthesis of molybdopterin from precursor Z in Escherichia coli." J Biol Chem, 276(25), 22024-31. Leipe, D. D., Koonin, E. V., and Aravind, L. (2003). "Evolution and classification of P-loop kinases and related proteins." J Mol Biol, 333(4), 781-815. Leipe, D. D., Koonin, E. V., and Aravind, L. (2004). "STAND, a class of P-loop NTPases including animal and plant regulators of programmed cell death: multiple, complex domain architectures, unusual phyletic patterns, and evolution by horizontal gene transfer." J Mol Biol, 343(1), 1-28. Leipe, D. D., Wolf, Y. I., Koonin, E. V., and Aravind, L. (2002). "Classification and evolution of P-loop GTPases and related ATPases." J Mol Biol, 317(1), 41-72. Lemire, B. D., and Weiner, J. H. (1986). "Fumarate reductase of Escherichia coli." Methods Enzymol, 126, 377-86. Lemonnier, M., and Lane, D. (1998). "Expression of the second lysine decarboxylase gene of Escherichia coli." Microbiology, 144 ( Pt 3), 751-60. Leonhardt, S. A., Fearson, K., Danese, P. N., and Mason, T. L. (1993). "HSP78 encodes a yeast mitochondrial heat shock protein in the Clp family of ATP-dependent proteases." Mol Cell Biol, 13(10), 6304-13. Leu, F. P., Georgescu, R., and O'Donnell, M. (2003). "Mechanism of the E. coli tau processivity switch during lagging-strand synthesis." 
Mol Cell, 11(2), 315-27. Levchenko, I., Seidel, M., Sauer, R. T., and Baker, T. A. (2000). "A specificity-enhancing factor for the ClpXP degradation machine." Science, 289(5488), 2354-6. Lim, C. R., Kimata, Y., Ohdate, H., Kokubo, T., Kikuchi, N., Horigome, T., and Kohno, K. (2000). "The Saccharomyces cerevisiae RuvB-like protein, Tih2p, is required for cell cycle progression and RNA polymerase II-directed transcription." J Biol Chem, 275(29), 22409-17. Liu, T., Lu, B., Lee, I., Ondrovicova, G., Kutejova, E., and Suzuki, C. K. (2004). "DNA and RNA binding by the mitochondrial lon protease is regulated by nucleotide and protein substrate." J Biol Chem, 279(14), 13902-10.


Liu, Z., Zolkiewska, A., and Zolkiewski, M. (2003). "Characterization of human torsinA and its dystonia-associated mutant form." Biochem J, 374(Pt 1), 117-22. Lowe, J., Stock, D., Jap, B., Zwickl, P., Baumeister, W., and Huber, R. (1995). "Crystal structure of the 20S proteasome from the archaeon T. acidophilum at 3.4 A resolution." Science, 268(5210), 533-9. Lu, B., Liu, T., Crosby, J. A., Thomas-Wohlever, J., Lee, I., and Suzuki, C. K. (2003). "The ATP-dependent Lon protease of Mus musculus is a DNA-binding protein that is functionally conserved between yeast and mammals." Gene, 306, 45-55. Ludtke, S. J., Baldwin, P. R., and Chiu, W. (1999). "EMAN: semiautomated software for high-resolution single-particle reconstructions." J Struct Biol, 128(1), 82-97. Lupas, A., Zwickl, P., and Baumeister, W. (1994). "Proteasome sequences in eubacteria." Trends Biochem Sci, 19(12), 533-4. Lupas, A. N., and Martin, J. (2002). "AAA proteins." Curr Opin Struct Biol, 12(6), 746-53. Lutzmann, M., Maiorano, D., and Mechali, M. (2005). "Identification of full genes and proteins of MCM9, a novel, vertebrate-specific member of the MCM2-8 protein family." Gene, 362, 51-6. Maiorano, D., Cuvier, O., Danis, E., and Mechali, M. (2005). "MCM8 is an MCM2-7-related protein that functions as a DNA helicase during replication elongation and not initiation." Cell, 120(3), 315-28. Maiorano, D., Lutzmann, M., and Mechali, M. (2006). "MCM proteins and DNA replication." Curr Opin Cell Biol, 18(2), 130-6. Maiorano, D., Moreau, J., and Mechali, M. (2000). "XCDT1 is required for the assembly of pre-replicative complexes in Xenopus laevis." Nature, 404(6778), 622-5. Makino, Y., Kanemaki, M., Kurokawa, Y., Koji, T., and Tamura, T. (1999). "A rat RuvB-like protein, TIP49a, is a germ cell-enriched novel DNA helicase." J Biol Chem, 274(22), 15329-35. Malhotra, V., Orci, L., Glick, B. S., Block, M. R., and Rothman, J. E. (1988).
"Role of an N-ethylmaleimide-sensitive transport component in promoting fusion of transport vesicles with cisternae of the Golgi stack." Cell, 54(2), 221-7. Marchler-Bauer, A., Anderson, J. B., Cherukuri, P. F., DeWeese-Scott, C., Geer, L. Y., Gwadz, M., He, S., Hurwitz, D. I., Jackson, J. D., Ke, Z., Lanczycki, C. J., Liebert, C. A., Liu, C., Lu, F., Marchler, G. H., Mullokandov, M., Shoemaker, B. A., Simonyan, V., Song, J. S., Thiessen, P. A., Yamashita, R. A., Yin, J. J., Zhang, D., and Bryant, S. H. (2005). "CDD: a Conserved Domain Database for protein classification." Nucleic Acids Res, 33(Database issue), D192-6. Marszalek, J., and Kaguni, J. M. (1994). "DnaA protein directs the binding of DnaB protein in initiation of DNA replication in Escherichia coli." J Biol Chem, 269(7), 4883-90.


Martin, W., and Embley, T. M. (2004). "Evolutionary biology: Early evolution comes full circle." Nature, 431(7005), 134-7. Martinez-Lopez, M. J., Alcantara, S., Mascaro, C., Perez-Branguli, F., Ruiz-Lozano, P., Maes, T., Soriano, E., and Buesa, C. (2005). "Mouse neuron navigator 1, a novel microtubule-associated protein involved in neuronal migration." Mol Cell Neurosci, 28(4), 599-612. Matias, P. M., Gorynia, S., Donner, P., and Carrondo, M. A. (2006). "Crystal Structure of the Human AAA+ Protein RuvBL1." J Biol Chem, 281(50), 38918-29. McBroom, A. J., Johnson, A. P., Vemulapalli, S., and Kuehn, M. J. (2006). "Outer membrane vesicle production by Escherichia coli is independent of membrane instability." J Bacteriol, 188(15), 5385-92. McBroom, A. J., and Kuehn, M. J. (2006). "Release of outer membrane vesicles by Gram-negative bacteria is a novel envelope stress response." Mol Microbiol. McLean, P. J., Kawamata, H., Shariff, S., Hewett, J., Sharma, N., Ueda, K., Breakefield, X. O., and Hyman, B. T. (2002). "TorsinA and heat shock proteins act as molecular chaperones: suppression of alpha-synuclein aggregation." J Neurochem, 83(4), 846-54. McNally, F. J., and Vale, R. D. (1993). "Identification of katanin, an ATPase that severs and disassembles stable microtubules." Cell, 75(3), 419-29. Melnick, J., Lis, E., Park, J. H., Kinsland, C., Mori, H., Baba, T., Perkins, J., Schyns, G., Vassieva, O., Osterman, A., and Begley, T. P. (2004). "Identification of the two missing bacterial genes involved in thiamine salvage: thiamine pyrophosphokinase and thiamine kinase." J Bacteriol, 186(11), 3660-2. Meng, S. Y., and Bennett, G. N. (1992). "Nucleotide sequence of the Escherichia coli cad operon: a system for neutralization of low extracellular pH." J Bacteriol, 174(8), 2659-69. Menon, N. K., Robbins, J., Peck, H. D., Jr., Chatelus, C. Y., Choi, E. S., and Przybyla, A. E. (1990).
"Cloning and sequencing of a putative Escherichia coli [NiFe] hydrogenase-1 operon containing six open reading frames." J Bacteriol, 172(4), 1969-77. Menon, N. K., Robbins, J., Wendt, J. C., Shanmugam, K. T., and Przybyla, A. E. (1991). "Mutational analysis and characterization of the Escherichia coli hya operon, which encodes [NiFe] hydrogenase 1." J Bacteriol, 173(15), 4851-61. Merrell, D. S., and Camilli, A. (1999). "The cadA gene of Vibrio cholerae is induced during infection and plays a role in acid tolerance." Mol Microbiol, 34(4), 836-49. Messer, W., and Weigel, C. (1997). "DnaA initiator--also a transcription factor." Mol Microbiol, 24(1), 1-6. Michels, P. A., Moyersoen, J., Krazy, H., Galland, N., Herman, M., and Hannaert, V. (2005). "Peroxisomes, glyoxysomes and glycosomes (review)." Mol Membr Biol, 22(1-2), 133-45.


Milner-White, E. J., Coggins, J. R., and Anton, I. A. (1991). "Evidence for an ancestral core structure in nucleotide-binding proteins with the type A motif." J Mol Biol, 221(3), 751-4. Missiakas, D., Schwager, F., Betton, J. M., Georgopoulos, C., and Raina, S. (1996). "Identification and characterization of HslV HslU (ClpQ ClpY) proteins involved in overall proteolysis of misfolded proteins in Escherichia coli." Embo J, 15(24), 6899-909. Mlouka, A., Comte, K., Castets, A. M., Bouchier, C., and Tandeau de Marsac, N. (2004). "The gas vesicle gene cluster from Microcystis aeruginosa and DNA rearrangements that lead to loss of cell buoyancy." J Bacteriol, 186(8), 2355-65. Mocz, G., and Gibbons, I. R. (1996). "Phase partition analysis of nucleotide binding to axonemal dynein." Biochemistry, 35(28), 9204-11. Mocz, G., and Gibbons, I. R. (2001). "Model for the motor component of dynein heavy chain based on homology to the AAA family of oligomeric ATPases." Structure, 9(2), 93-103. Mogk, A., Schlieker, C., Strub, C., Rist, W., Weibezahn, J., and Bukau, B. (2003). "Roles of individual domains and conserved motifs of the AAA+ chaperone ClpB in oligomerization, ATP hydrolysis, and chaperone activity." J Biol Chem, 278(20), 17615-24. Moreau, P. L. (2007). "The lysine decarboxylase CadA protects Escherichia coli starved of phosphate against fermentation acids." J Bacteriol, 189(6), 2249-61. Mueller, E. G., Palenchar, P. M., and Buck, C. J. (2001). "The role of the cysteine residues of ThiI in the generation of 4-thiouridine in tRNA." J Biol Chem, 276(36), 33588-95. Nagao, Y., Nakada, T., Imoto, M., Shimamoto, T., Sakai, S., Tsuda, M., and Tsuchiya, T. (1988). "Purification and analysis of the structure of alpha-galactosidase from Escherichia coli." Biochem Biophys Res Commun, 151(1), 236-41. Nagiec, E. E., Bernstein, A., and Whiteheart, S. W. (1995). "Each domain of the N-ethylmaleimide-sensitive fusion protein contributes to its transport activity."
J Biol Chem, 270(49), 29182-8. Nakamura, M., Saeki, K., and Takahashi, Y. (1999). "Hyperproduction of recombinant ferredoxins in Escherichia coli by coexpression of the ORF1-ORF2-iscS-iscU-iscA-hscB-hscA-fdx-ORF3 gene cluster." J Biochem (Tokyo), 126(1), 10-8. Nakamura, M., Yamada, M., Hirota, Y., Sugimoto, K., Oka, A., and Takanami, M. (1981). "Nucleotide sequence of the asnA gene coding for asparagine synthetase of E. coli K-12." Nucleic Acids Res, 9(18), 4669-76. Neher, S. B., Sauer, R. T., and Baker, T. A. (2003). "Distinct peptide signals in the UmuD and UmuD' subunits of UmuD/D' mediate tethering and substrate processing by the ClpXP protease." Proc Natl Acad Sci U S A, 100(23), 13219-24.


Neuwald, A. F., Aravind, L., Spouge, J. L., and Koonin, E. V. (1999). "AAA+: A class of chaperone-like ATPases associated with the assembly, operation, and disassembly of protein complexes." Genome Res, 9(1), 27-43. Newman, D. R., Kuhn, J. F., Shanab, G. M., and Maxwell, E. S. (2000). "Box C/D snoRNA-associated proteins: two pairs of evolutionarily ancient proteins and possible links to replication and transcription." Rna, 6(6), 861-79. Nichols, B. P., and Green, J. M. (1992). "Cloning and sequencing of Escherichia coli ubiC and purification of chorismate lyase." J Bacteriol, 174(16), 5309-16. Nishii, W., and Takahashi, K. (2003). "Determination of the cleavage sites in SulA, a cell division inhibitor, by the ATP-dependent HslVU protease from Escherichia coli." FEBS Lett, 553(3), 351-4. Niwa, H., Tsuchiya, D., Makyio, H., Yoshida, M., and Morikawa, K. (2002). "Hexameric ring structure of the ATPase domain of the membrane-integrated metalloprotease FtsH from Thermus thermophilus HB8." Structure, 10(10), 1415-23. Nobrega, F. G., Nobrega, M. P., and Tzagoloff, A. (1992). "BCS1, a novel gene required for the expression of functional Rieske iron-sulfur protein in Saccharomyces cerevisiae." Embo J, 11(11), 3821-9. Noll, M., Petrukhin, K., and Lutsenko, S. (1998). "Identification of a novel transcription regulator from Proteus mirabilis, PMTR, revealed a possible role of YJAI protein in balancing zinc in Escherichia coli." J Biol Chem, 273(33), 21393-401. Offner, S., Wanner, G., and Pfeifer, F. (1996). "Functional studies of the gvpACNO operon of Halobacterium salinarium reveal that the GvpC protein shapes gas vesicles." J Bacteriol, 178(7), 2071-8. Ogura, T., Whiteheart, S. W., and Wilkinson, A. J. (2004). "Conserved arginine residues implicated in ATP hydrolysis, nucleotide-sensing, and inter-subunit interactions in AAA and AAA+ ATPases." J Struct Biol, 146(1-2), 106-12. Ogura, T., and Wilkinson, A. J. (2001).
"AAA+ superfamily ATPases: common structure--diverse function." Genes Cells, 6(7), 575-97. Ortega, J., Singh, S. K., Ishikawa, T., Maurizi, M. R., and Steven, A. C. (2000). "Visualization of substrate binding and translocation by the ATP-dependent protease, ClpXP." Mol Cell, 6(6), 1515-21. Panne, D., Muller, S. A., Wirtz, S., Engel, A., and Bickle, T. A. (2001). "The McrBC restriction endonuclease assembles into a ring structure in the presence of G nucleotides." Embo J, 20(12), 3210-7. Panne, D., Raleigh, E. A., and Bickle, T. A. (1999). "The McrBC endonuclease translocates DNA in a reaction dependent on GTP hydrolysis." J Mol Biol, 290(1), 49-60.


Papenfort, K., Pfeiffer, V., Mika, F., Lucchini, S., Hinton, J. C., and Vogel, J. (2006). "sigma(E)-dependent small RNAs of Salmonella respond to membrane stress by accelerating global omp mRNA decay." Mol Microbiol, 62(6), 1674-88. Park, S. C., Jia, B., Yang, J. K., Van, D. L., Shao, Y. G., Han, S. W., Jeon, Y. J., Chung, C. H., and Cheong, G. W. (2006). "Oligomeric structure of the ATP-dependent protease La (Lon) of Escherichia coli." Mol Cells, 21(1), 129-34. Park, Y. K., Bearson, B., Bang, S. H., Bang, I. S., and Foster, J. W. (1996). "Internal pH crisis, lysine decarboxylase and the acid tolerance response of Salmonella typhimurium." Mol Microbiol, 20(3), 605-11. Parsell, D. A., Kowal, A. S., and Lindquist, S. (1994). "Saccharomyces cerevisiae Hsp104 protein. Purification and characterization of ATP-induced structural changes." J Biol Chem, 269(6), 4480-7. Parsons, C. A., Stasiak, A., Bennett, R. J., and West, S. C. (1995). "Structure of a multisubunit complex that promotes DNA branch migration." Nature, 374(6520), 375-8. Parsons, C. A., Tsaneva, I., Lloyd, R. G., and West, S. C. (1992). "Interaction of Escherichia coli RuvA and RuvB proteins with synthetic Holliday junctions." Proc Natl Acad Sci U S A, 89(12), 5452-6. Parsons, C. A., and West, S. C. (1993). "Formation of a RuvAB-Holliday junction complex in vitro." J Mol Biol, 232(2), 397-405. Patel, S., and Latterich, M. (1998). "The AAA team: related ATPases with diverse functions." Trends Cell Biol, 8(2), 65-71. Pearce, M. J., Arora, P., Festa, R. A., Butler-Wu, S. M., Gokhale, R. S., and Darwin, K. H. (2006). "Identification of substrates of the Mycobacterium tuberculosis proteasome." Embo J, 25(22), 5423-32. Peeters, P. J., Baker, A., Goris, I., Daneels, G., Verhasselt, P., Luyten, W. H., Geysen, J. J., Kass, S. U., and Moechars, D. W. (2004). "Sensory deficits in mice hypomorphic for a mammalian homologue of unc-53." Brain Res Dev Brain Res, 150(2), 89-101. Pellicer, M. 
T., Badia, J., Aguilar, J., and Baldoma, L. (1996). "glc locus of Escherichia coli: characterization of genes encoding the subunits of glycolate oxidase and the glc regulator protein." J Bacteriol, 178(7), 2051-9. Pellicer, M. T., Fernandez, C., Badia, J., Aguilar, J., Lin, E. C., and Baldoma, L. (1999). "Cross-induction of glc and ace operons of Escherichia coli attributable to pathway intersection. Characterization of the glc promoter." J Biol Chem, 274(3), 1745-52. Petersen, B. L., Kannangara, C. G., and Henningsen, K. W. (1999). "Distribution of ATPase and ATP-to-ADP phosphate exchange activities in magnesium chelatase subunits of Chlorobium vibrioforme and Synechocystis PCC6803." Arch Microbiol, 171(3), 146-50.


Pham, P., Frei, K. P., Woo, W., and Truong, D. D. (2006). "Molecular defects of the dystonia-causing torsinA mutation." Neuroreport, 17(16), 1725-8. Phan, A. P., Ngo, T. T., and Lenhoff, H. M. (1982). "Spectrophotometric assay for lysine decarboxylase." Anal Biochem, 120(1), 193-7. Philippot, L. (2002). "Denitrifying genes in bacterial and Archaeal genomes." Biochim Biophys Acta, 1577(3), 355-76. Pieper, U., Groll, D. H., Wunsch, S., Gast, F. U., Speck, C., Mucke, N., and Pingoud, A. (2002). "The GTP-dependent restriction enzyme McrBC from Escherichia coli forms high-molecular mass complexes with DNA and produces a cleavage pattern with a characteristic 10-base pair repeat." Biochemistry, 41(16), 5245-54. Pieper, U., and Pingoud, A. (2002). "A mutational analysis of the PD...D/EXK motif suggests that McrC harbors the catalytic center for DNA cleavage by the GTP-dependent restriction enzyme McrBC from Escherichia coli." Biochemistry, 41(16), 5236-44. Pieper, U., Schweitzer, T., Groll, D. H., and Pingoud, A. (1999). "Defining the location and function of domains of McrB by deletion mutagenesis." Biol Chem, 380(10), 1225-30. Poggio, S., Domeinzain, C., Osorio, A., and Camarena, L. (2002). "The nitrogen assimilation control (Nac) protein represses asnC and asnA transcription in Escherichia coli." FEMS Microbiol Lett, 206(2), 151-6. Portis, A. R., Jr. (2003). "Rubisco activase - Rubisco's catalytic chaperone." Photosynth Res, 75(1), 11-27. Portis, A. R., Salvucci, M. E., and Ogren, W. L. (1986). "Activation of Ribulosebisphosphate Carboxylase/Oxygenase at Physiological CO(2) and Ribulosebisphosphate Concentrations by Rubisco Activase." Plant Physiol, 82(4), 967-971. Puri, T., Wendler, P., Sigala, B., Saibil, H., and Tsaneva, I. R. (2006). "Dodecameric Structure and ATPase Activity of the Human TIP48/TIP49 Complex." J Mol Biol. Qiu, X. B., Lin, Y. L., Thome, K. C., Pian, P., Schlegel, B. P., Weremowicz, S., Parvin, J. D., and Dutta, A. (1998).
"An eukaryotic RuvB-like protein (RUVBL1) essential for growth." J Biol Chem, 273(43), 27786-93. Queitsch, C., Hong, S. W., Vierling, E., and Lindquist, S. (2000). "Heat shock protein 101 plays a crucial role in thermotolerance in Arabidopsis." Plant Cell, 12(4), 479-92. Rabin, R. S., and Stewart, V. (1993). "Dual response regulators (NarL and NarP) interact with dual sensors (NarX and NarQ) to control nitrate- and nitrite-regulated gene expression in Escherichia coli K-12." J Bacteriol, 175(11), 3259-68. Ramachandran, R., Hartmann, C., Song, H. K., Huber, R., and Bochtler, M. (2002). "Functional interactions of HslV (ClpQ) with the ATPase HslU (ClpY)." Proc Natl Acad Sci U S A, 99(11), 7396-401.


Rappas, M., Schumacher, J., Beuron, F., Niwa, H., Bordes, P., Wigneshweraraj, S., Keetch, C. A., Robinson, C. V., Buck, M., and Zhang, X. (2005). "Structural insights into the activity of enhancer-binding proteins." Science, 307(5717), 1972-5. Reid, J. D., and Hunter, C. N. (2002). "Current understanding of the function of magnesium chelatase." Biochem Soc Trans, 30(4), 643-5. Reid, J. D., Siebert, C. A., Bullough, P. A., and Hunter, C. N. (2003). "The ATPase activity of the ChlI subunit of magnesium chelatase and formation of a heptameric AAA+ ring." Biochemistry, 42(22), 6912-20. Reitzer, L., and Schneider, B. L. (2001). "Metabolic context and possible physiological themes of sigma(54)-dependent genes in Escherichia coli." Microbiol Mol Biol Rev, 65(3), 422-44, table of contents. Reitzer, L. J., and Magasanik, B. (1982). "Asparagine synthetases of Klebsiella aerogenes: properties and regulation of synthesis." J Bacteriol, 151(3), 1299-313. Ren, D., Bedzyk, L. A., Thomas, S. M., Ye, R. W., and Wood, T. K. (2004). "Gene expression in Escherichia coli biofilms." Appl Microbiol Biotechnol, 64(4), 515-24. Rensing, C., Fan, B., Sharma, R., Mitra, B., and Rosen, B. P. (2000). "CopA: An Escherichia coli Cu(I)-translocating P-type ATPase." Proc Natl Acad Sci U S A, 97(2), 652-6. Rensing, C., Mitra, B., and Rosen, B. P. (1997). "The zntA gene of Escherichia coli encodes a Zn(II)-translocating P-type ATPase." Proc Natl Acad Sci U S A, 94(26), 14326-31. Rep, M., van Dijl, J. M., Suda, K., Schatz, G., Grivell, L. A., and Suzuki, C. K. (1996). "Promotion of mitochondrial membrane complex assembly by a proteolytically inactive yeast Lon." Science, 274(5284), 103-6. Rhodius, V. A., Suh, W. C., Nonaka, G., West, J., and Gross, C. A. (2006). "Conserved and variable functions of the sigmaE stress response in related genomes." PLoS Biol, 4(1), e2. Richards, C. M., Hardison, R. C., and Boyer, C. D. (1994).
"Expression of the large plastid gene, ORF2280, in tomato fruits and flowers." Curr Genet, 26(5-6), 494-6. Richardson, I. W., and Anthony, C. (1992). "Characterization of mutant forms of the quinoprotein methanol dehydrogenase lacking an essential calcium ion." Biochem J, 287 ( Pt 3), 709-15. Richly, H., Rape, M., Braun, S., Rumpf, S., Hoege, C., and Jentsch, S. (2005). "A series of ubiquitin binding factors connects CDC48/p97 to substrate multiubiquitylation and proteasomal targeting." Cell, 120(1), 73-84. Rigaut, G., Shevchenko, A., Rutz, B., Wilm, M., Mann, M., and Seraphin, B. (1999). "A generic protein purification method for protein complex characterization and proteome exploration." Nat Biotechnol, 17(10), 1030-2.


Roe, S. M., Barlow, T., Brown, T., Oram, M., Keeley, A., Tsaneva, I. R., and Pearl, L. H. (1998). "Crystal structure of an octameric RuvA-Holliday junction complex." Mol Cell, 2(3), 361-72. Rombel, I., Peters-Wendisch, P., Mesecar, A., Thorgeirsson, T., Shin, Y. K., and Kustu, S. (1999). "MgATP binding and hydrolysis determinants of NtrC, a bacterial enhancer-binding protein." J Bacteriol, 181(15), 4628-38. Rotanova, T. V., Botos, I., Melnikov, E. E., Rasulova, F., Gustchina, A., Maurizi, M. R., and Wlodawer, A. (2006). "Slicing a protease: structural features of the ATP-dependent Lon proteases gleaned from investigations of isolated domains." Protein Sci, 15(8), 1815-28. Rotanova, T. V., Melnikov, E. E., Khalatova, A. G., Makhovskaya, O. V., Botos, I., Wlodawer, A., and Gustchina, A. (2004). "Classification of ATP-dependent proteases Lon and comparison of the active sites of their proteolytic domains." Eur J Biochem, 271(23-24), 4865-71. Rouiller, I., DeLaBarre, B., May, A. P., Weis, W. I., Brunger, A. T., Milligan, R. A., and Wilson-Kubalek, E. M. (2002). "Conformational changes of the multifunction p97 AAA ATPase during its ATPase cycle." Nat Struct Biol, 9(12), 950-7. Rudyak, S. G., Brenowitz, M., and Shrader, T. E. (2001). "Mg2+-linked oligomerization modulates the catalytic activity of the Lon (La) protease from Mycobacterium smegmatis." Biochemistry, 40(31), 9317-23. Rumpf, S., and Jentsch, S. (2006). "Functional division of substrate processing cofactors of the ubiquitin-selective Cdc48 chaperone." Mol Cell, 21(2), 261-9. Sabo, D. L., Boeker, E. A., Byers, B., Waron, H., and Fischer, E. H. (1974). "Purification and physical properties of inducible Escherichia coli lysine decarboxylase." Biochemistry, 13(4), 662-70. Sabo, D. L., and Fischer, E. H. (1974). "Chemical properties of Escherichia coli lysine decarboxylase including a segment of its pyridoxal 5'-phosphate binding site." Biochemistry, 13(4), 670-6. Sakato, M., and King, S. M. (2004).
"Design and regulation of the AAA+ microtubule motor dynein." J Struct Biol, 146(1-2), 58-71. Sallai, L., and Tucker, P. A. (2005). "Crystal structure of the central and C-terminal domain of the sigma(54)-activator ZraR." J Struct Biol, 151(2), 160-70. Samso, M., and Koonce, M. P. (2004). "25 Angstrom resolution structure of a cytoplasmic dynein motor reveals a seven-member planar ring." J Mol Biol, 340(5), 1059-72. Sandkvist, M. (2001). "Biology of type II secretion." Mol Microbiol, 40(2), 271-83. Santos, J. M., Lobo, M., Matos, A. P., De Pedro, M. A., and Arraiano, C. M. (2002). "The gene bolA regulates dacA (PBP5), dacC (PBP6) and ampC (AmpC), promoting normal morphology in Escherichia coli." Mol Microbiol, 45(6), 1729-40.


Saraste, M., Sibbald, P. R., and Wittinghofer, A. (1990). "The P-loop--a common motif in ATP- and GTP-binding proteins." Trends Biochem Sci, 15(11), 430-4. Sawers, G. (1998). "The anaerobic degradation of L-serine and L-threonine in enterobacteria: networks of pathways and regulatory signals." Arch Microbiol, 171(1), 1-5. Sawers, R. G., and Boxer, D. H. (1986). "Purification and properties of membrane-bound hydrogenase isoenzyme 1 from anaerobically grown Escherichia coli K12." Eur J Biochem, 156(2), 265-75. Schaeffer, P. M., Headlam, M. J., and Dixon, N. E. (2005). "Protein--protein interactions in the eubacterial replisome." IUBMB Life, 57(1), 5-12. Schaper, S., and Messer, W. (1995). "Interaction of the initiator protein DnaA of Escherichia coli with its DNA target." J Biol Chem, 270(29), 17622-6. Schirmer, E. C., Queitsch, C., Kowal, A. S., Parsell, D. A., and Lindquist, S. (1998). "The ATPase activity of Hsp104, effects of environmental conditions and mutations." J Biol Chem, 273(25), 15546-52. Schmid, S., Seitz, T., and Haas, D. (1998). "Cointegrase, a naturally occurring, truncated form of IS21 transposase, catalyzes replicon fusion rather than simple insertion of IS21." J Mol Biol, 282(3), 571-83. Schmitt, M., Neupert, W., and Langer, T. (1996). "The molecular chaperone Hsp78 confers compartment-specific thermotolerance to mitochondria." J Cell Biol, 134(6), 1375-86. Schoenhofen, I. C., Li, G., Strozen, T. G., and Howard, S. P. (2005). "Purification and characterization of the N-terminal domain of ExeA: a novel ATPase involved in the type II secretion pathway of Aeromonas hydrophila." J Bacteriol, 187(18), 6370-8. Schoenhofen, I. C., Stratilo, C., and Howard, S. P. (1998). "An ExeAB complex in the type II secretion pathway of Aeromonas hydrophila: effect of ATP-binding cassette mutations on complex formation and function." Mol Microbiol, 29(5), 1237-47. Schumacher, J., Joly, N., Rappas, M., Zhang, X., and Buck, M. (2006). 
"Structures and organisation of AAA+ enhancer binding proteins in transcriptional activation." J Struct Biol, 156(1), 190-9. Schweizer, H. P., and Datta, P. (1989). "Identification and DNA sequence of tdcR, a positive regulatory gene of the tdc operon of Escherichia coli." Mol Gen Genet, 218(3), 516-22. Scofield, M. A., Lewis, W. S., and Schuster, S. M. (1990). "Nucleotide sequence of Escherichia coli asnB and deduced amino acid sequence of asparagine synthetase B." J Biol Chem, 265(22), 12895-902. Seol, J. H., Baek, S. H., Kang, M. S., Ha, D. B., and Chung, C. H. (1995). "Distinctive roles of the two ATP-binding sites in ClpA, the ATPase component of protease Ti in Escherichia coli." J Biol Chem, 270(14), 8087-92.


Seol, J. H., Yoo, S. J., Shin, D. H., Shim, Y. K., Kang, M. S., Goldberg, A. L., and Chung, C. H. (1997). "The heat-shock protein HslVU from Escherichia coli is a protein-activated ATPase as well as an ATP-dependent proteinase." Eur J Biochem, 247(3), 1143-50. Seong, I. S., Oh, J. Y., Yoo, S. J., Seol, J. H., and Chung, C. H. (1999). "ATP-dependent degradation of SulA, a cell division inhibitor, by the HslVU protease in Escherichia coli." FEBS Lett, 456(1), 211-4. Shen, X., Mizuguchi, G., Hamiche, A., and Wu, C. (2000). "A chromatin remodelling complex involved in transcription and DNA processing." Nature, 406(6795), 541-4. Shevchenko, A., Wilm, M., Vorm, O., and Mann, M. (1996). "Mass spectrometric sequencing of proteins silver-stained polyacrylamide gels." Anal Chem, 68(5), 850-8. Shively, J. M., van Keulen, G., and Meijer, W. G. (1998). "Something from almost nothing: carbon dioxide fixation in chemoautotrophs." Annu Rev Microbiol, 52, 191-230. Siebert, M., Severin, K., and Heide, L. (1994). "Formation of 4-hydroxybenzoate in Escherichia coli: characterization of the ubiC gene and its encoded enzyme chorismate pyruvate-lyase." Microbiology, 140 (Pt 4), 897-904. Sigala, B., Edwards, M., Puri, T., and Tsaneva, I. R. (2005). "Relocalization of human chromatin remodeling cofactor TIP48 in mitosis." Exp Cell Res, 310(2), 357-69. Silvanovich, A., Li, M. G., Serr, M., Mische, S., and Hays, T. S. (2003). "The third P-loop domain in cytoplasmic dynein heavy chain is essential for dynein motor function and ATP-sensitive microtubule binding." Mol Biol Cell, 14(4), 1355-65. Singh, S. K., Rozycki, J., Ortega, J., Ishikawa, T., Lo, J., Steven, A. C., and Maurizi, M. R. (2001). "Functional domains of the ClpA and ClpX molecular chaperones identified by limited proteolysis and deletion analysis." J Biol Chem, 276(31), 29420-9. Sirijovski, N., Olsson, U., Lundqvist, J., Al-Karadaghi, S., Willows, R. D., and Hansson, M. (2006).
"ATPase activity associated with the magnesium chelatase H-subunit of the chlorophyll biosynthetic pathway is an artefact." Biochem J, 400(3), 477-84. Skovran, E., and Downs, D. M. (2000). "Metabolic defects caused by mutations in the isc gene cluster in Salmonella enterica serovar typhimurium: implications for thiamine synthesis." J Bacteriol, 182(14), 3896-903. Smith, D. M., Benaroudj, N., and Goldberg, A. (2006). "Proteasomes and their associated ATPases: a destructive combination." J Struct Biol, 156(1), 72-83. Smith, D. M., Kafri, G., Cheng, Y., Ng, D., Walz, T., and Goldberg, A. L. (2005). "ATP binding to PAN or the 26S ATPases causes association with the 20S proteasome, gate opening, and translocation of unfolded proteins." Mol Cell, 20(5), 687-98. Snider, J., Gutsche, I., Lin, M., Baby, S., Cox, B., Butland, G., Greenblatt, J., Emili, A., and Houry, W. A. (2006). "Formation of a distinctive complex between the inducible bacterial lysine decarboxylase and a novel AAA+ ATPase." J Biol Chem, 281(3), 1532-46. Snider, J., and Houry, W. A. (2006). "MoxR AAA+ ATPases: A novel family of molecular chaperones?" J Struct Biol, 156(1), 200-9. Soksawatmaekhin, W., Kuraishi, A., Sakata, K., Kashiwagi, K., and Igarashi, K. (2004). "Excretion and uptake of cadaverine by CadB and its physiological functions in Escherichia coli." Mol Microbiol, 51(5), 1401-12. Sollner, T., Bennett, M. K., Whiteheart, S. W., Scheller, R. H., and Rothman, J. E. (1993). "A protein assembly-disassembly pathway in vitro that may correspond to sequential steps of synaptic vesicle docking, activation, and fusion." Cell, 75(3), 409-18. Song, C., Wang, Q., and Li, C. C. (2003). "ATPase activity of p97-valosin-containing protein (VCP). D2 mediates the major enzyme activity, and D1 contributes to the heat-induced activity." J Biol Chem, 278(6), 3648-55. Sousa, M. C., Kessler, B. M., Overkleeft, H. S., and McKay, D. B. (2002). "Crystal structure of HslUV complexed with a vinyl sulfone inhibitor: corroboration of a proposed mechanism of allosteric activation of HslV by HslU." J Mol Biol, 318(3), 779-85. Sousa, M. C., Trame, C. B., Tsuruta, H., Wilbanks, S. M., Reddy, V. S., and McKay, D. B. (2000). "Crystal and solution structures of an HslUV protease-chaperone complex." Cell, 103(4), 633-43. Speck, C., Chen, Z., Li, H., and Stillman, B. (2005). "ATPase-dependent cooperative binding of ORC and Cdc6 to origin DNA." Nat Struct Mol Biol, 12(11), 965-71. Speck, C., Weigel, C., and Messer, W. (1999). "ATP- and ADP-dnaA protein, a molecular switch in gene regulation." Embo J, 18(21), 6169-76. Stahlberg, H., Kutejova, E., Suda, K., Wolpensinger, B., Lustig, A., Schatz, G., Engel, A., and Suzuki, C. K. (1999). "Mitochondrial Lon of Saccharomyces cerevisiae is a ring-shaped protease with seven flexible subunits." Proc Natl Acad Sci U S A, 96(12), 6787-90. Starkova, N. N., Koroleva, E. P., Rumsh, L. D., Ginodman, L. M., and Rotanova, T.
V. (1998). "Mutations in the proteolytic domain of Escherichia coli protease Lon impair the ATPase activity of the enzyme." FEBS Lett, 422(2), 218-20. Sternberg, N. L., and Maurer, R. (1991). "Bacteriophage-mediated generalized transduction in Escherichia coli and Salmonella typhimurium." Methods Enzymol, 204, 18-43. Stewart, F. J., Panne, D., Bickle, T. A., and Raleigh, E. A. (2000). "Methyl-specific DNA binding by McrBC, a modification-dependent restriction enzyme." J Mol Biol, 298(4), 611-22. Stewart, F. J., and Raleigh, E. A. (1998). "Dependence of McrBC cleavage on distance between recognition elements." Biol Chem, 379(4-5), 611-6.

239

Stewart, J., Hingorani, M. M., Kelman, Z., and O'Donnell, M. (2001). "Mechanism of beta clamp opening by the delta subunit of Escherichia coli DNA polymerase III holoenzyme." J Biol Chem, 276(22), 19182-9. Stringham, E., Pujol, N., Vandekerckhove, J., and Bogaert, T. (2002). "unc-53 controls longitudinal migration in C. elegans." Development, 129(14), 3367-79. Stukenberg, P. T., Studwell-Vaughan, P. S., and O'Donnell, M. (1991). "Mechanism of the sliding beta-clamp of DNA polymerase III holoenzyme." J Biol Chem, 266(17), 11328- 34. Su, C. H., and Greene, R. C. (1971). "Regulation of methionine biosynthesis in Escherichia coli: mapping of the metJ locus and properties of a metJ plus-metJ minus diploid." Proc Natl Acad Sci U S A, 68(2), 367-71. Suno, R., Niwa, H., Tsuchiya, D., Zhang, X., Yoshida, M., and Morikawa, K. (2006). "Structure of the whole cytosolic region of ATP-dependent protease FtsH." Mol Cell, 22(5), 575-85. Sutherland, E., Coe, L., and Raleigh, E. A. (1992). "McrBC: a multisubunit GTP-dependent restriction endonuclease." J Mol Biol, 225(2), 327-48. Sutton, R. B., Fasshauer, D., Jahn, R., and Brunger, A. T. (1998). "Crystal structure of a SNARE complex involved in synaptic exocytosis at 2.4 A resolution." Nature, 395(6700), 347-53. Tagaya, M., Wilson, D. W., Brunner, M., Arango, N., and Rothman, J. E. (1993). "Domain structure of an N-ethylmaleimide-sensitive fusion protein involved in vesicular transport." J Biol Chem, 268(4), 2662-6. Takahashi, Y., and Nakamura, M. (1999). "Functional assignment of the ORF2-iscS-iscU-iscA- hscB-hscA-fdx-ORF3 gene cluster involved in the assembly of Fe-S clusters in Escherichia coli." J Biochem (Tokyo), 126(5), 917-26. Tamura, S., Shimozawa, N., Suzuki, Y., Tsukamoto, T., Osumi, T., and Fujiki, Y. (1998). "A cytoplasmic AAA family peroxin, Pex1p, interacts with Pex6p." Biochem Biophys Res Commun, 245(3), 883-6. 
Tamura, T., Nagy, I., Lupas, A., Lottspeich, F., Cejka, Z., Schoofs, G., Tanaka, K., De Mot, R., and Baumeister, W. (1995). "The first characterization of a eubacterial proteasome: the 20S complex of Rhodococcus." Curr Biol, 5(7), 766-74. Tanapongpipat, S., Reid, E., Cole, J. A., and Crooke, H. (1998). "Transcriptional control and essential roles of the Escherichia coli ccm gene products in formate-dependent nitrite reduction and cytochrome c synthesis." Biochem J, 334 ( Pt 2), 355-65. Tatusov, R. L., Fedorova, N. D., Jackson, J. D., Jacobs, A. R., Kiryutin, B., Koonin, E. V., Krylov, D. M., Mazumder, R., Mekhedov, S. L., Nikolskaya, A. N., Rao, B. S., Smirnov, S., Sverdlov, A. V., Vasudevan, S., Wolf, Y. I., Yin, J. J., and Natale, D. A. (2003). "The COG database: an updated version includes eukaryotes." BMC Bioinformatics, 4, 41.

240

Tatusov, R. L., Galperin, M. Y., Natale, D. A., and Koonin, E. V. (2000). "The COG database: a tool for genome-scale analysis of protein functions and evolution." Nucleic Acids Res, 28(1), 33-6. Thibault, G., Tsitrin, Y., Davidson, T., Gribun, A., and Houry, W. A. (2006). "Large nucleotide- dependent movement of the N-terminal domain of the ClpX chaperone." Embo J, 25(14), 3367-76. Thompson, J. D., Higgins, D. G., and Gibson, T. J. (1994). "CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice." Nucleic Acids Res, 22(22), 4673-80. Thoms, S., and Erdmann, R. (2006). "Peroxisomal matrix protein receptor ubiquitination and recycling." Biochim Biophys Acta. Thony-Meyer, L. (2002). "Cytochrome c maturation: a complex pathway for a simple task?" Biochem Soc Trans, 30(4), 633-8. Thony-Meyer, L., Fischer, F., Kunzler, P., Ritz, D., and Hennecke, H. (1995). "Escherichia coli genes required for cytochrome c maturation." J Bacteriol, 177(15), 4321-6. Tkach, J. M., and Glover, J. R. (2006). "Hsp104p: a protein disaggregase." Chaperones, M. M. a. I. Braakman, ed., Springer, Goteborg, 65-84. Tobias, J. W., Shrader, T. E., Rocap, G., and Varshavsky, A. (1991). "The N-end rule in bacteria." Science, 254(5036), 1374-7. Tomoyasu, T., Gamer, J., Bukau, B., Kanemori, M., Mori, H., Rutman, A. J., Oppenheim, A. B., Yura, T., Yamanaka, K., Niki, H., and et al. (1995). "Escherichia coli FtsH is a membrane-bound, ATP-dependent protease which degrades the heat-shock transcription factor sigma 32." Embo J, 14(11), 2551-60. Tomoyasu, T., Yamanaka, K., Murata, K., Suzaki, T., Bouloc, P., Kato, A., Niki, H., Hiraga, S., and Ogura, T. (1993a). "Topology and subcellular localization of FtsH protein in Escherichia coli." J Bacteriol, 175(5), 1352-7. Tomoyasu, T., Yuki, T., Morimura, S., Mori, H., Yamanaka, K., Niki, H., Hiraga, S., and Ogura, T. (1993b). 
"The Escherichia coli FtsH protein is a prokaryotic member of a protein family of putative ATPases involved in membrane functions, cell cycle control, and gene expression." J Bacteriol, 175(5), 1344-51. Toyama, H., Anthony, C., and Lidstrom, M. E. (1998). "Construction of insertion and deletion mxa mutants of Methylobacterium extorquens AM1 by electroporation." FEMS Microbiol Lett, 166(1), 1-7. Trchounian, A., and Kobayashi, H. (1999). "Kup is the major K+ uptake system in Escherichia coli upon hyper-osmotic stress at a low pH." FEBS Lett, 447(2-3), 144-8.

241

Tsaneva, I. R., Illing, G., Lloyd, R. G., and West, S. C. (1992a). "Purification and properties of the RuvA and RuvB proteins of Escherichia coli." Mol Gen Genet, 235(1), 1-10. Tsaneva, I. R., Muller, B., and West, S. C. (1992b). "ATP-dependent branch migration of Holliday junctions promoted by the RuvA and RuvB proteins of E. coli." Cell, 69(7), 1171-80. Tsilibaris, V., Maenhaut-Michel, G., and Van Melderen, L. (2006). "Biological roles of the Lon ATP-dependent protease." Res Microbiol, 157(8), 701-13. Turner, J., Hingorani, M. M., Kelman, Z., and O'Donnell, M. (1999). "The internal workings of a DNA polymerase clamp-loading machine." Embo J, 18(3), 771-83. van den Berg, W. A., Hagen, W. R., and van Dongen, W. M. (2000). "The hybrid-cluster protein ('prismane protein') from Escherichia coli. Characterization of the hybrid-cluster protein, redox properties of the [2Fe-2S] and [4Fe-2S-2O] clusters and identification of an associated NADH oxidoreductase containing FAD and [2Fe-2S]." Eur J Biochem, 267(3), 666-76. van der Ploeg, J. R., Eichhorn, E., and Leisinger, T. (2001). "Sulfonate-sulfur metabolism and its regulation in Escherichia coli." Arch Microbiol, 176(1-2), 1-8. Van Spanning, R. J., Wansell, C. W., De Boer, T., Hazelaar, M. J., Anazawa, H., Harms, N., Oltmann, L. F., and Stouthamer, A. H. (1991). "Isolation and characterization of the moxJ, moxG, moxI, and moxR genes of Paracoccus denitrificans: inactivation of moxJ, moxG, and moxR and the resultant effect on methylotrophic growth." J Bacteriol, 173(21), 6948-61. Vasilyeva, O. V., Kolygo, K. B., Leonova, Y. F., Potapenko, N. A., and Ovchinnikova, T. V. (2002). "Domain structure and ATP-induced conformational changes in Escherichia coli protease Lon revealed by limited proteolysis and autolysis." FEBS Lett, 526(1-3), 66-70. Vasudevan, S. G., Armarego, W. L., Shaw, D. C., Lilley, P. E., Dixon, N. E., and Poole, R. K. (1991). 
"Isolation and nucleotide sequence of the hmp gene that encodes a haemoglobin- like protein in Escherichia coli K-12." Mol Gen Genet, 226(1-2), 49-58. Veerassamy, S., Smith, A., and Tillier, E. R. (2003). "A transition probability model for amino acid substitutions from blocks." J Comput Biol, 10(6), 997-1010. Voorn-Brouwer, T., van der Leij, I., Hemrika, W., Distel, B., and Tabak, H. F. (1993). "Sequence of the PAS8 gene, the product of which is essential for biogenesis of peroxisomes in Saccharomyces cerevisiae." Biochim Biophys Acta, 1216(2), 325-8. Wah, D. A., Levchenko, I., Baker, T. A., and Sauer, R. T. (2002). "Characterization of a specificity factor for an AAA+ ATPase: assembly of SspB dimers with ssrA-tagged proteins and the ClpX hexamer." Chem Biol, 9(11), 1237-45.

242

Wahle, E., Lasken, R. S., and Kornberg, A. (1989). "The dnaB-dnaC replication protein complex of Escherichia coli. II. Role of the complex in mobilizing dnaB functions." J Biol Chem, 264(5), 2469-75. Walczak, C. E. (2000). "Microtubule dynamics and tubulin interacting proteins." Curr Opin Cell Biol, 12(1), 52-6. Walker, C. J., and Willows, R. D. (1997). "Mechanism and regulation of Mg-chelatase." Biochem J, 327 ( Pt 2), 321-33. Walker, J. E., Saraste, M., Runswick, M. J., and Gay, N. J. (1982). "Distantly related sequences in the alpha- and beta-subunits of ATP synthase, , kinases and other ATP- requiring enzymes and a common nucleotide binding fold." Embo J, 1(8), 945-51. Walsby, A. E. (1994). "Gas vesicles." Microbiol Rev, 58(1), 94-144. Wanders, R. J., and Waterham, H. R. (2006). "Biochemistry of mammalian peroxisomes revisited." Annu Rev Biochem, 75, 295-332. Wang, J., Song, J. J., Franklin, M. C., Kamtekar, S., Im, Y. J., Rho, S. H., Seong, I. S., Lee, C. S., Chung, C. H., and Eom, S. H. (2001). "Crystal structures of the HslVU peptidase- ATPase complex reveal an ATP-dependent proteolysis mechanism." Structure, 9(2), 177-84. Wang, Q., Song, C., and Li, C. C. (2003). "Hexamerization of p97-VCP is promoted by ATP binding to the D1 domain and required for ATPase and biological activities." Biochem Biophys Res Commun, 300(2), 253-60. Wang, Q., Song, C., and Li, C. C. (2004). "Molecular perspectives on p97-VCP: progress in understanding its structure and diverse biological functions." J Struct Biol, 146(1-2), 44- 57. Watanabe, Y. H., Motohashi, K., and Yoshida, M. (2002). "Roles of the two ATP binding sites of ClpB from Thermus thermophilus." J Biol Chem, 277(8), 5804-9. Watson, N., Dunyak, D. S., Rosey, E. L., Slonczewski, J. L., and Olson, E. R. (1992). "Identification of elements involved in transcriptional regulation of the Escherichia coli cad operon by external pH." J Bacteriol, 174(2), 530-40. Weber, A., Kogl, S. A., and Jung, K. (2006). 
"Time-dependent proteome alterations under osmotic stress during aerobic and anaerobic growth in Escherichia coli." J Bacteriol, 188(20), 7165-75. Weber, H., Polen, T., Heuveling, J., Wendisch, V. F., and Hengge, R. (2005). "Genome-Wide Analysis of the General Stress Response Network in Escherichia coli: {sigma}S- Dependent Genes, Promoters, and Sigma Factor Selectivity." J Bacteriol, 187(5), 1591- 603.

243

Weber, T., Zemelman, B. V., McNew, J. A., Westermann, B., Gmachl, M., Parlati, F., Sollner, T. H., and Rothman, J. E. (1998). "SNAREpins: minimal machinery for membrane fusion." Cell, 92(6), 759-72. Wei, Y., and Newman, E. B. (2002). "Studies on the role of the metK gene product of Escherichia coli K-12." Mol Microbiol, 43(6), 1651-6. West, S. C. (1997). "Processing of recombination intermediates by the RuvABC proteins." Annu Rev Genet, 31, 213-44. Wheeler, D. L., Barrett, T., Benson, D. A., Bryant, S. H., Canese, K., Church, D. M., DiCuccio, M., Edgar, R., Federhen, S., Helmberg, W., Kenton, D. L., Khovayko, O., Lipman, D. J., Madden, T. L., Maglott, D. R., Ostell, J., Pontius, J. U., Pruitt, K. D., Schuler, G. D., Schriml, L. M., Sequeira, E., Sherry, S. T., Sirotkin, K., Starchenko, G., Suzek, T. O., Tatusov, R., Tatusova, T. A., Wagner, L., and Yaschenko, E. (2005). "Database resources of the National Center for Biotechnology Information." Nucleic Acids Res, 33 Database Issue, D39-45. Whiteheart, S. W., and Matveeva, E. A. (2004). "Multiple binding proteins suggest diverse functions for the N-ethylmaleimide sensitive factor." J Struct Biol, 146(1-2), 32-43. Whiteheart, S. W., Rossnagel, K., Buhrow, S. A., Brunner, M., Jaenicke, R., and Rothman, J. E. (1994). "N-ethylmaleimide-sensitive fusion protein: a trimeric ATPase whose hydrolysis of ATP is required for membrane fusion." J Cell Biol, 126(4), 945-54. Whittaker, C. A., and Hynes, R. O. (2002). "Distribution and evolution of von Willebrand/integrin A domains: widely dispersed domains with roles in cell adhesion and elsewhere." Mol Biol Cell, 13(10), 3369-87. Wickner, S., and Hurwitz, J. (1975). "Interaction of Escherichia coli dnaB and dnaC(D) gene products in vitro." Proc Natl Acad Sci U S A, 72(3), 921-5. Willows, R. D. (2003). "Biosynthesis of chlorophylls from protoporphyrin IX." Nat Prod Rep, 20(3), 327-41. Willows, R. D., Hansson, A., Birch, D., Al-Karadaghi, S., and Hansson, M. (2004). 
"EM single particle analysis of the ATP-dependent BchI complex of magnesium chelatase: an AAA+ hexamer." J Struct Biol, 146(1-2), 227-33. Winzeler, E. A., Shoemaker, D. D., Astromoff, A., Liang, H., Anderson, K., Andre, B., Bangham, R., Benito, R., Boeke, J. D., Bussey, H., Chu, A. M., Connelly, C., Davis, K., Dietrich, F., Dow, S. W., El Bakkoury, M., Foury, F., Friend, S. H., Gentalen, E., Giaever, G., Hegemann, J. H., Jones, T., Laub, M., Liao, H., Liebundguth, N., Lockhart, D. J., Lucau-Danila, A., Lussier, M., M'Rabet, N., Menard, P., Mittmann, M., Pai, C., Rebischung, C., Revuelta, J. L., Riles, L., Roberts, C. J., Ross-MacDonald, P., Scherens, B., Snyder, M., Sookhai-Mahadeo, S., Storms, R. K., Veronneau, S., Voet, M., Volckaert, G., Ward, T. R., Wysocki, R., Yen, G. S., Yu, K., Zimmermann, K.,

244

Philippsen, P., Johnston, M., and Davis, R. W. (1999). "Functional characterization of the S. cerevisiae genome by gene deletion and parallel analysis." Science, 285(5429), 901-6. Wojtyra, U. A., Thibault, G., Tuite, A., and Houry, W. A. (2003). "The N-terminal zinc binding domain of ClpX is a dimerization domain that modulates the chaperone function." J Biol Chem, 278(49), 48981-90. Wolf, S., Nagy, I., Lupas, A., Pfeifer, G., Cejka, Z., Muller, S. A., Engel, A., De Mot, R., and Baumeister, W. (1998). "Characterization of ARC, a divergent member of the AAA ATPase family from Rhodococcus erythropolis." J Mol Biol, 277(1), 13-25. Wolf, Y. I., Rogozin, I. B., Kondrashov, A. S., and Koonin, E. V. (2001). "Genome alignment, evolution of prokaryotic genome organization, and prediction of gene function using genomic context." Genome Res, 11(3), 356-72. Wolfe, M. T., Heo, J., Garavelli, J. S., and Ludden, P. W. (2002). "Hydroxylamine reductase activity of the hybrid cluster protein from Escherichia coli." J Bacteriol, 184(21), 5898- 902. Wood, M. A., McMahon, S. B., and Cole, M. D. (2000). "An ATPase/helicase complex is an essential cofactor for oncogenic transformation by c-Myc." Mol Cell, 5(2), 321-30. Wu, W. F., Zhou, Y., and Gottesman, S. (1999). "Redundant in vivo proteolytic activities of Escherichia coli Lon and the ClpYQ (HslUV) protease." J Bacteriol, 181(12), 3681-7. Wu, Y., Patil, R. V., and Datta, P. (1992). "Catabolite gene activator protein and integration host factor act in concert to regulate tdc operon expression in Escherichia coli." J Bacteriol, 174(21), 6918-27. Wu, Y. F., and Datta, P. (1992). "Integration host factor is required for positive regulation of the tdc operon of Escherichia coli." J Bacteriol, 174(1), 233-40. Xiong, J. P., Stehle, T., Zhang, R., Joachimiak, A., Frech, M., Goodman, S. L., and Arnaout, M. A. (2002). "Crystal structure of the extracellular segment of integrin alpha Vbeta3 in complex with an Arg-Gly-Asp ligand." 
Science, 296(5565), 151-5. Yamada, K., Ariyoshi, M., and Morikawa, K. (2004). "Three-dimensional structural views of branch migration and resolution in DNA homologous recombination." Curr Opin Struct Biol, 14(2), 130-7. Yamada, K., Kunishima, N., Mayanagi, K., Ohnishi, T., Nishino, T., Iwasaki, H., Shinagawa, H., and Morikawa, K. (2001). "Crystal structure of the Holliday junction migration motor protein RuvB from Thermus thermophilus HB8." Proc Natl Acad Sci U S A, 98(4), 1442-7. Yamada, K., Miyata, T., Tsuchiya, D., Oyama, T., Fujiwara, Y., Ohnishi, T., Iwasaki, H., Shinagawa, H., Ariyoshi, M., Mayanagi, K., and Morikawa, K. (2002). "Crystal structure

245

of the RuvA-RuvB complex: a structural basis for the Holliday junction migrating motor machinery." Mol Cell, 10(3), 671-81. Yao, N., Turner, J., Kelman, Z., Stukenberg, P. T., Dean, F., Shechter, D., Pan, Z. Q., Hurwitz, J., and O'Donnell, M. (1996). "Clamp loading, unloading and intrinsic stability of the PCNA, beta and gp45 sliding clamps of human, E. coli and T4 replicases." Genes Cells, 1(1), 101-13. Yoo, S. J., Seol, J. H., Shin, D. H., Rohrwild, M., Kang, M. S., Tanaka, K., Goldberg, A. L., and Chung, C. H. (1996). "Purification and characterization of the heat shock proteins HslV and HslU that form a new ATP-dependent protease in Escherichia coli." J Biol Chem, 271(24), 14035-40. Young, T. E., Ling, J., Geisler-Lee, C. J., Tanguay, R. L., Caldwell, C., and Gallie, D. R. (2001). "Developmental and thermal regulation of the maize heat shock protein, HSP101." Plant Physiol, 127(3), 777-91. Yu, D., Ellis, H. M., Lee, E. C., Jenkins, N. A., Copeland, N. G., and Court, D. L. (2000). "An efficient recombination system for chromosome engineering in Escherichia coli." Proc Natl Acad Sci U S A, 97(11), 5978-83. Yu, X., West, S. C., and Egelman, E. H. (1997). "Structure and subunit composition of the RuvAB-Holliday junction complex." J Mol Biol, 266(2), 217-22. Zhang, C., and Guy, C. L. (2005). "Co-immunoprecipitation of Hsp101 with cytosolic Hsc70." Plant Physiol Biochem, 43(1), 13-8. Zhang, R. G., Skarina, T., Katz, J. E., Beasley, S., Khachatryan, A., Vyas, S., Arrowsmith, C. H., Clarke, S., Edwards, A., Joachimiak, A., and Savchenko, A. (2001). "Structure of Thermotoga maritima stationary phase survival protein SurE: a novel acid phosphatase." Structure (Camb), 9(11), 1095-106. Zhang, X., Chaney, M., Wigneshweraraj, S. R., Schumacher, J., Bordes, P., Cannon, W., and Buck, M. (2002). "Mechanochemical ATPases and transcriptional activation." Mol Microbiol, 45(4), 895-903. Zhang, X., Shaw, A., Bates, P. A., Newman, R. H., Gowen, B., Orlova, E., Gorman, M. 
A., Kondo, H., Dokurno, P., Lally, J., Leonard, G., Meyer, H., van Heel, M., and Freemont, P. S. (2000). "Structure of the AAA ATPase p97." Mol Cell, 6(6), 1473-84. Zhang, X., Stoffels, K., Wurzbacher, S., Schoofs, G., Pfeifer, G., Banerjee, T., Parret, A. H., Baumeister, W., De Mot, R., and Zwickl, P. (2004). "The N-terminal coiled coil of the Rhodococcus erythropolis ARC AAA ATPase is neither necessary for oligomerization nor nucleotide hydrolysis." J Struct Biol, 146(1-2), 155-65. Zhou, Y., Gottesman, S., Hoskins, J. R., Maurizi, M. R., and Wickner, S. (2001). "The RssB response regulator directly targets sigma(S) for degradation by ClpXP." Genes Dev, 15(5), 627-37.

246

Zietkiewicz, S., Krzewska, J., and Liberek, K. (2004). "Successive and synergistic action of the Hsp70 and Hsp100 chaperones in protein disaggregation." J Biol Chem, 279(43), 44376- 83. Zolkiewski, M. (2006). "A camel passes through the eye of a needle: protein unfolding of the Clp ATPases." Mol Microbiol, 61(5), 1094-1100. Zolkiewski, M., Kessel, M., Ginsburg, A., and Maurizi, M. R. (1999). "Nucleotide-dependent oligomerization of ClpB from Escherichia coli." Protein Sci, 8(9), 1899-903. Zwickl, P., Ng, D., Woo, K. M., Klenk, H. P., and Goldberg, A. L. (1999). "An archaebacterial ATPase, homologous to ATPases in the eukaryotic 26 S proteasome, activates protein breakdown by 20 S proteasomes." J Biol Chem, 274(37), 26008-14.

247

7. Appendix

SUPPLEMENTARY TABLE 1. Organisms used in profiling analysis.

RavA Containing
1 Erwinia carotovora subsp. atroseptica SCRI1043
2 Escherichia coli K12
3 Photorhabdus luminescens subsp. laumondii TTO1
4 Shigella boydii Sb227
5 Shigella flexneri 2a str. 301
6 Shigella sonnei Ss046
7 Salmonella enterica subsp. enterica serovar Typhi Ty2
8 Salmonella typhimurium LT2
9 Vibrio cholerae O1 biovar eltor str. N16961
10 Vibrio fischeri ES114
11 Vibrio parahaemolyticus RIMD 2210633
12 Vibrio vulnificus CMCP6
13 Yersinia pestis KIM
14 Yersinia pseudotuberculosis IP 32953

Non-RavA Containing
1 Acinetobacter sp. ADP1
2 Buchnera aphidicola str. Bp (Baizongia pistaciae)
3 Coxiella burnetii RSA 493
4 Colwellia psychrerythraea 34H
5 Francisella tularensis subsp. tularensis SCHU S4
6 Haemophilus ducreyi 35000HP
7 Haemophilus influenzae Rd KW20
8 Idiomarina loihiensis L2TR
9 Legionella pneumophila str. Paris
10 Methylococcus capsulatus str. Bath
11 Mannheimia succiniciproducens MBEL55E
12 Nitrosococcus oceani ATCC 19707
13 Pseudomonas aeruginosa PAO1
14 Psychrobacter arcticus 273-4
15 Psychrobacter cryohalolentis K5
16 Pseudomonas fluorescens Pf-5
17 Pseudoalteromonas haloplanktis TAC125
18 Pasteurella multocida subsp. multocida str. Pm70
19 Pseudomonas putida KT2440
20 Pseudomonas syringae pv. phaseolicola 1448A
21 Shewanella oneidensis MR-1
22 Thiomicrospira crunogena XCL-2
23 Wigglesworthia glossinidia endosymbiont of Glossina brevipalpis
24 Xanthomonas campestris pv. campestris str. 8004
25 Xanthomonas oryzae pv. oryzae KACC10331

SUPPLEMENTARY TABLE 2. Complete phylogenetic profiling results. For each gene, the GI number, gene name, a general description and the frequency of occurrence in RavA-containing organisms (‘RavA’ column) and non-RavA-containing organisms (‘Non-RavA’ column) are listed. Genes are divided into groups, based upon their frequency of occurrence in RavA-containing organisms. The group of the most frequently occurring genes is listed first, followed by all remaining groups in descending order. Within each group the genes are sorted in reverse alphabetical order by name.
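The ordering described in this legend can be reproduced with a short script. The following is only a minimal Python sketch, assuming each table row is available as a tuple of GI number, gene name, description and the two frequency counts; the names `ProfileRow` and `table_order` are hypothetical and are not taken from the analysis pipeline used in this work. The first three example rows are copied from the table, while the last one (`hypA`) is an invented placeholder that stands in for a lower-frequency group.

```python
from typing import List, NamedTuple


class ProfileRow(NamedTuple):
    gi: int           # GI number of the E. coli gene
    name: str         # gene name
    description: str  # general description
    rava: int         # occurrences among the 14 RavA-containing organisms
    non_rava: int     # occurrences among the 25 non-RavA-containing organisms


def table_order(rows: List[ProfileRow]) -> List[ProfileRow]:
    """Order rows as in the legend: groups by descending RavA frequency,
    reverse alphabetical by gene name within each group."""
    # Python's sort is stable, so sort on the secondary key first ...
    by_name = sorted(rows, key=lambda r: r.name, reverse=True)
    # ... then on the primary key; ties keep the reverse-alphabetical order.
    return sorted(by_name, key=lambda r: r.rava, reverse=True)


rows = [
    ProfileRow(16130338, "zipA", "cell division protein ZipA", 14, 18),
    # Hypothetical row, invented only to illustrate a second frequency group.
    ProfileRow(0, "hypA", "placeholder row for illustration", 13, 2),
    ProfileRow(90111646, "yieN",
               "fused predicted transcriptional regulator: sigma54 activator", 14, 0),
    ProfileRow(16131341, "zntA", "zinc, cobalt and lead efflux system", 14, 22),
]

ordered = table_order(rows)
for row in ordered:
    print(row.gi, row.name, row.rava, row.non_rava)
```

Plain string comparison is used for the gene names here, which is an assumption about how the reverse alphabetical order was defined.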

GI Name Description RavA Non-RavA
16131833 zraS sensory in two-component regulatory system with 14 22
16131834 zraR fused DNA-binding response regulator in two-component regulatory 14 23
16129811 znuC high-affinity zinc transporter ATPase 14 25
16129812 znuB high-affinity zinc transporter membrane component 14 16
90111346 znuA high-affinity zinc transporter periplasmic component 14 13
16131171 zntR zinc-responsive transcriptional regulator 14 16
16131341 zntA zinc, cobalt and lead efflux system 14 22
16130338 zipA cell division protein ZipA 14 18
16130812 zapA protein that localizes to the cytokinetic ring 14 10
16132204 ytjB hypothetical protein b4387 14 3
90111709 ytfT predicted sugar transporter subunit: membrane component of ABC 14 7
49176475 ytfR predicted sugar transporter subunit: ATP-binding component of ABC 14 25
16132049 ytfQ predicted sugar transporter subunit: periplasmic-binding component 14 7
16132043 ytfN hypothetical protein b4221 14 21
16132042 ytfM predicted outer membrane protein and surface antigen 14 23
16132040 ytfL predicted inner membrane protein 14 25
16132038 ytfJ predicted transcriptional regulator 14 2
16131163 yrdC predicted ribosome maturation factor 14 25
90111568 yrdA hypothetical protein b3279 14 17
16131089 yrbK hypothetical protein b3199 14 9
16131086 yrbG predicted calcium/sodium:proton antiporter 14 14
16131085 yrbF predicted toluene transporter subunit: ATP-binding component of ABC 14 24
16131084 yrbE predicted toluene transporter subunit: membrane component of ABC 14 21
16131083 yrbD predicted ABC-type organic solvent transporter 14 19
16131082 yrbC predicted ABC-type organic solvent transporter 14 16
90111555 yrbA predicted DNA-binding transcriptional regulator 14 17
16131042 yraP hypothetical protein b3150 14 19
16131041 yraO DnaA initiator-associating factor for replication initiation 14 18
16131040 yraN hypothetical protein b3148 14 23
16131039 yraM hypothetical protein b3147 14 18
16131038 yraL predicted methyltransferase 14 25
90111530 yqiC hypothetical protein b3042 14 12
16130927 yqiA predicted esterase 14 12
16130909 yqhD alcohol dehydrogenase, NAD(P)-dependent 14 15
16130850 yqgF Holliday junction resolvase-like protein 14 25
90111516 yqgE hypothetical protein b2948 14 25
16130749 yqeG predicted transporter 14 11
90111494 yqeF acetyl-CoA acetyltransferase 14 18
16130701 yqcD hypothetical protein b2794 14 20
16130699 yqcC hypothetical protein b2792 14 6
16130698 yqcB tRNA pseudouridine synthase 14 23
16130697 yqcA flavodoxin 14 14
16130602 yqaB predicted 14 19
49176245 ypjD predicted inner membrane protein 14 17
90111457 yphH predicted DNA-binding transcriptional regulator 14 3
16130472 yphE fused predicted sugar transporter subunits of ABC superfamily: 14 25
16130471 yphD predicted sugar transporter subunit: membrane component of ABC 14 7
90111455 yphC predicted oxidoreductase, Zn-dependent and NAD(P)-binding 14 20
16130399 ypfI predicted hydrolase 14 9
16130318 ypdG predicted enzyme IIC component of PTS 14 9
16130317 ypdF predicted peptidase 14 24
16130315 ypdD fused predicted PTS enzymes: Hpr component/enzyme I 14 24
16130313 ypdB predicted response regulator in two-component system with YpdA 14 13
16130151 yojL predicted thiamine biosynthesis lipoprotein 14 19
16130148 yojI fused predicted multidrug transport subunits of ABC superfamily: 14 25
16130080 yohK predicted inner membrane protein 14 16
16130079 yohJ hypothetical protein b2141 14 2

16130075 yohF predicted oxidoreductase with NAD(P)-binding Rossmann-fold domain 14 25
16129765 yoaH hypothetical protein b1811 14 4
16129770 yoaE fused predicted membrane protein/conserved protein 14 25
16129762 yoaA conserved protein with nucleoside triphosphate hydrolase domain 14 18
16129710 ynjD predicted transporter subunit: ATP-binding component of ABC 14 25
16129681 yniC predicted hydrolase 14 16
16129679 yniA predicted phosphotransferase/kinase 14 5
16129554 ynfM predicted transporter 14 21
16129553 ynfL predicted DNA-binding transcriptional regulator 14 23
90111304 ynfK predicted dethiobiotin synthetase 14 23
16129547 ynfG oxidoreductase, Fe-S subunit 14 10
90111301 ynfF oxidoreductase subunit 14 15
16129545 ynfE oxidoreductase subunit 14 16
16129485 yneJ predicted DNA-binding transcriptional regulator 14 16
90111288 yneI predicted aldehyde dehydrogenase 14 22
16129410 yncD predicted iron outer membrane transporter 14 16
16129370 ynbB predicted CDP-diglyceride synthase 14 24
16129291 ynaI conserved inner membrane protein 14 19
90111219 ymfC 23S rRNA pseudouridine synthase 14 22
16128803 yliG predicted SAM-dependent methyltransferase 14 24
16128800 yliD predicted peptide transporter subunit: membrane component of ABC 14 20
16128799 yliC predicted peptide transporter subunit: membrane component of ABC 14 20
16128798 yliB predicted peptide transporter subunit: periplasmic-binding 14 17
16128797 yliA fused predicted peptide transport subunits of ABC superfamily: 14 25
90111112 ykgC pyridine nucleotide-disulfide oxidoreductase 14 25
16132220 yjtD predicted rRNA methyltransferase 14 22
16132196 yjjW predicted pyruvate formate lyase activating enzyme 14 5
90111745 yjjV predicted DNase 14 25
90111740 yjjN predicted oxidoreductase, Zn-dependent and NAD(P)-binding 14 20
16132208 yjjK fused predicted transporter subunits of ABC superfamily: 14 25
16132177 yjiZ predicted transporter 14 18
16132158 yjiO multidrug efflux system protein 14 21
90111724 yjhU KpLE2 phage-like element; predicted DNA-binding transcriptional 14 2
90111725 yjhH KpLE2 phage-like element; predicted lyase/synthase 14 24
16132118 yjhG KpLE2 phage-like element; predicted dehydratase 14 21
90111715 yjgQ conserved inner membrane protein 14 24
16132083 yjgP conserved inner membrane protein 14 22
16132071 yjgI predicted oxidoreductase with NAD(P)-binding Rossmann-fold domain 14 25
90111711 yjgF ketoacid-binding protein 14 24
16132077 yjgD hypothetical protein b4255 14 8
16132056 yjgA hypothetical protein b4234 14 19
90111710 yjfF predicted sugar transporter subunit: membrane component of ABC 14 7
16131988 yjeS predicted Fe-S electron transport protein 14 15
90111696 yjeQ ribosome-associated GTPase 14 21
16131984 yjeP predicted mechanosensitive channel 14 23
16131971 yjeK predicted lysine aminomutase 14 19
16131990 yjeE ATPase with strong ADP affinity 14 23
16132000 yjeB predicted DNA-binding transcriptional regulator 14 14
90111694 yjeA lysyl-tRNA synthetase 14 25
16131956 yjdL predicted transporter 14 11
16131908 yjcR predicted membrane fusion protein of efflux pump 14 20
16131891 yjcE predicted cation/proton antiporter 14 17
16131890 yjcD predicted permease 14 17
16131875 yjbN tRNA-dihydrouridine synthase A 14 23
16131848 yjbC 23S rRNA pseudouridine synthase 14 23
16131829 yjaG hypothetical protein b3999 14 8
16131802 yijD conserved inner membrane protein 14 0
16131766 yiiU hypothetical protein b3928 14 5

16131761 yiiT stress-induced protein 14 5
16131728 yiiD predicted acetyltransferase 14 8
90111660 yihW predicted DNA-binding transcriptional regulator 14 13
16131722 yihU predicted oxidoreductase with NAD(P)-binding Rossmann-fold domain 14 20
16131706 yihI hypothetical protein b3866 14 4
16131700 yihE predicted kinase 14 8
90111656 yihA GTP-binding protein 14 25
90111654 yigZ predicted 14 19
16131683 yigP hypothetical protein b3834 14 9
49176424 yigL predicted hydrolase 14 7
16131624 yifE hypothetical protein b3764 14 7
90111646 yieN fused predicted transcriptional regulator: sigma54 activator 14 0
49176398 yieM predicted von Willebrand factor containing protein 14 0
16131582 yieG predicted inner membrane protein 14 17
16131549 yidK predicted transporter 14 15
16131573 yidC inner membrane protein component YidC 14 25
16131565 yidA predicted hydrolase 14 8
90111631 yicO predicted xanthine/uracil permease 14 17
16131532 yicM predicted transporter 14 16
16131526 yicH hypothetical protein b3655 14 0
90111627 yicG conserved inner membrane protein 14 18
90111628 yicF NAD-dependent DNA ligase LigB 14 25
16131525 yicE predicted transporter 14 15
16131515 yicC hypothetical protein b3644 14 20
16131482 yibN predicted rhodanese-related sulfurtransferase 14 25
16131477 yibK predicted rRNA methylase 14 23
16131468 yibH hypothetical protein b3597 14 18
49176377 yiaY predicted Fe-containing alcohol dehydrogenase 14 15
16131456 yiaU predicted DNA-binding transcriptional regulator 14 18
16131444 yiaI predicted hydrogenase, 4Fe-4S ferredoxin-type component 14 8
49176370 yiaD predicted outer membrane lipoprotein 14 22
16131411 yhjV predicted transporter 14 11
90111607 yhjK predicted diguanylate cyclase 14 16
16131399 yhjJ predicted zinc-dependent peptidase 14 3
16131393 yhjC predicted DNA-binding transcriptional regulator 14 22
16131371 yhiR predicted DNA (exogenous) processing protein 14 20
16131369 yhiQ predicted SAM-dependent methyltransferase 14 20
16131368 yhiP predicted transporter 14 11
16131366 yhiO universal stress protein UspB 14 0
16131364 yhiN predicted oxidoreductase with FAD/NAD(P)-binding domain 14 20
16131311 yhhW hypothetical protein b3439 14 19
90111597 yhhT predicted inner membrane protein 14 21
16131343 yhhQ conserved inner membrane protein 14 10
16131338 yhhL conserved inner membrane protein 14 0
16131337 yhhF predicted methyltransferase 14 25
16131308 yhgN predicted antibiotic transporter 14 15
90111588 yhgF predicted transcriptional accessory protein 14 25
16131258 yhfW predicted mutase 14 7
90111574 yhfK conserved inner membrane protein 14 11
16131235 yhfA hypothetical protein b3356 14 13
16131233 yheU hypothetical protein b3354 14 6
16131231 yheS fused predicted transporter subunits of ABC superfamily: 14 25
90111573 yheO predicted DNA-binding transcriptional regulator 14 8
16131224 yheN hypothetical protein b3345 14 14
16131223 yheM hypothetical protein b3344 14 6
16131159 yhdZ predicted amino-acid transporter subunit 14 25
90111567 yhdY predicted amino-acid transporter subunit 14 15
49176333 yhdX predicted amino-acid transporter subunit 14 15

16131156 yhdW predicted amino-acid transporter subunit 14 8
49176330 yhdP conserved membrane protein, predicted transporter 14 15
16131141 yhdH predicted oxidoreductase, Zn-dependent and NAD(P)-binding 14 8
16131122 yhcM conserved protein with nucleoside triphosphate hydrolase domain 14 17
90111560 yhcB hypothetical protein b3233 14 4
16131070 yhbY predicted RNA-binding protein 14 22
90111552 yhbX predicted hydrolase, inner membrane 14 17
16131050 yhbU predicted peptidase (collagenase-like) 14 15
16131049 yhbT predicted lipid carrier protein 14 4
16131048 yhbS predicted acyltransferase with acyl-CoA N-acyltransferase domain 14 5
16131047 yhbQ hypothetical protein b3155 14 10
16131090 yhbN predicted transporter subunit: periplasmic-binding component of ABC 14 20
16131095 yhbJ hypothetical protein b3205 14 19
16131093 yhbH predicted ribosome-associated, sigma 54 modulation protein 14 20
16131091 yhbG predicted transporter subunit: ATP-binding component of ABC 14 25
90111551 yhbC hypothetical protein b3170 14 23
90111541 yhaO predicted transporter 14 11
16131000 yhaJ predicted DNA-binding transcriptional regulator 14 15
16130981 ygjQ predicted thioredoxin-like 14 4
90111535 ygjO predicted methyltransferase small domain 14 19
90111534 ygjG putrescine:2-oxoglutaric acid aminotransferase, PLP-dependent 14 25
16130960 ygjD O-sialoglycoprotein endopeptidase 14 25
16130920 ygiW hypothetical protein b3024 14 7
16130916 ygiS predicted transporter subunit: periplasmic-binding component of ABC 14 17
16130956 ygiP predicted DNA-binding transcriptional regulator 14 22
16130955 ygiH hypothetical protein b3059 14 18
16130950 ygiF predicted adenylate cyclase 14 15
90111524 yghU predicted S-transferase 14 15
16130901 yghA oxidoreductase 14 25
16130863 yggX hypothetical protein b2962 14 25
16130856 yggW coproporphyrinogen III oxidase 14 25
16130855 yggV putative deoxyribonucleotide triphosphate pyrophosphatase 14 24
16130853 yggT predicted inner membrane protein 14 17
16130852 yggS predicted enzyme 14 25
90111517 yggR predicted transporter 14 23
16130859 yggN hypothetical protein b2958 14 0
90111515 yggJ hypothetical protein b2946 14 24
16130823 yggE hypothetical protein b2922 14 5
16130800 ygfZ putative global regulator 14 22
16130799 ygfY hypothetical protein b2897 14 15
90111510 ygfU predicted transporter 14 15
90111509 ygfT fused predicted oxidoreductase: Fe-S subunit/nucleotide-binding 14 20
90111508 ygfS predicted oxidoreductase, 4Fe-4S ferredoxin-type subunit 14 8
49176279 ygfQ predicted transporter 14 17
90111507 ygfO predicted transporter 14 15
16130780 ygfK predicted oxidoreductase, Fe-S subunit 14 17
16130804 ygfF predicted NAD(P)-binding oxidoreductase with NAD(P)-binding 14 25
90111511 ygfB hypothetical protein b2909 14 15
49176282 ygfA predicted ligase 14 22
16130774 ygeY hypothetical protein b2872 14 18
49176276 ygeW hypothetical protein b2870 14 19
16130771 ygeV predicted DNA-binding transcriptional regulator 14 21
90111503 ygeR Tetratricopeptide repeat transcriptional regulator 14 24
90111499 ygeK predicted DNA-binding transcriptional regulator 14 16
16130739 ygeD predicted inner membrane protein 14 9
16130734 ygdP dinucleoside polyphosphate hydrolase 14 23
16130719 ygdL hypothetical protein b2812 14 21
16130718 ygdK predicted Fe-S metabolism protein 14 17

16130702 ygdH hypothetical protein b2795 14 14
16130714 ygdD conserved inner membrane protein 14 16
90111489 ygcW predicted deoxygluconate dehydrogenase 14 25
16130672 ygcM 6-pyruvoyl tetrahydrobiopterin synthase (PTPS) 14 8
16130684 ygcF hypothetical protein b2777 14 22
16130683 ygcE predicted kinase 14 18
16130652 ygbO tRNA pseudouridine synthase D 14 19
16130647 ygbN predicted transporter 14 11
16130643 ygbJ predicted dehydrogenase, with NAD(P)-binding Rossmann-fold domain 14 20
16130642 ygbI predicted DNA-binding transcriptional regulator 14 13
16130618 ygbD nitric oxide reductase 14 17
16130599 ygaG S-ribosylhomocysteinase 14 6
90111481 ygaA anaerobic nitric oxide reductase transcription regulator 14 21
16130538 yfjG hypothetical protein b2619 14 19
90111469 yfjF hypothetical protein b2618 14 19
90111467 yfjD predicted inner membrane protein 14 24
16130509 yfiQ fused predicted acyl-CoA synthetase: NAD(P)-binding 14 10
16130516 yfiO predicted lipoprotein 14 24
16130503 yfiK neutral amino-acid efflux system 14 9
16130514 yfiH hypothetical protein b2593 14 23
16130506 yfiF predicted methyltransferase 14 23
90111462 yfiE predicted DNA-binding transcriptional regulator 14 22
16130504 yfiD pyruvate formate lyase subunit 14 5
90111461 yfiC predicted S-adenosyl-L-methionine-dependent methyltransferase 14 6
16130518 yfiA cold shock protein associated with 30S ribosomal subunit 14 16
16130457 yfhQ predicted methyltransferase 14 23
16130487 yfhL predicted 4Fe-4S cluster-containing protein 14 22
90111458 yfhK predicted sensory kinase in two-component system 14 23
16130449 yfhJ hypothetical protein b2524 14 13
90111460 yfhH predicted DNA-binding transcriptional regulator 14 10
16130483 yfhD predicted transglycosylase 14 17
16130479 yfhA predicted DNA-binding response regulator in two-component system 14 21
16130438 yfgM hypothetical protein b2513 14 20
16130437 yfgL protein assembly complex, lipoprotein component 14 18
90111450 yfgK GTP-binding protein EngA 14 25
16130428 yfgF predicted inner membrane protein 14 17
16130420 yfgD predicted oxidoreductase 14 20
16130419 yfgC predicted peptidase 14 19
16130442 yfgB predicted enzyme 14 25
16130441 yfgA hypothetical protein b2516 14 14
16130392 yffH predicted NUDIX hydrolase 14 15
16130396 yffB hypothetical protein b2471 14 19
16130354 yfeV N-acetylmuramic acid phosphotransfer permease 14 3
16130353 yfeU N-acetylmuramic acid-6-phosphate etherase 14 5
16130352 yfeT predicted DNA-binding transcriptional regulator 14 10
16130335 yfeR predicted DNA-binding transcriptional regulator 14 23
16130311 yfdZ hypothetical protein b2379 14 22
16130305 yfdU hypothetical protein b2373 14 21
16130275 yfcY acetyl-CoA acetyltransferase 14 18
16130274 yfcX fused enoyl-CoA hydratase and epimerase and 14 17
16130266 yfcN hypothetical protein b2331 14 23
16130260 yfcL hypothetical protein b2325 14 1
90111418 yfcK hypothetical protein b2324 14 18
16130237 yfcG predicted glutathione S-transferase 14 16
16130262 yfcA conserved inner membrane protein 14 17
16130230 yfbV hypothetical protein b2295 14 8
16130227 yfbS predicted transporter 14 13
16130225 yfbQ aspartate aminotransferase 14 22

16130190 yfbG hypothetical protein b2255 14 25
90111408 yfbE uridine 5'-(beta-1-threo-pentapyranosyl-4-ulose diphosphate) 14 22
16130198 yfbB predicted peptidase 14 5
16130171 yfaE predicted 2Fe-2S cluster-containing protein 14 10
16130126 yejM predicted hydrolase, inner membrane 14 11
16130125 yejL hypothetical protein b2187 14 7
16130124 yejK nucleoid-associated protein NdpA 14 11
16130122 yejH predicted ATP-dependent helicase 14 7
16130118 yejF fused predicted oligopeptide transporter subunits of ABC 14 25
16130117 yejE predicted oligopeptide transporter subunit 14 17
16130116 yejB predicted oligopeptide transporter subunit 14 20
16130084 yeiT predicted oxidoreductase 14 17
90111399 yeiP elongation factor P 14 24
16130102 yeiM predicted nucleoside transporter 14 11
16130099 yeiJ predicted nucleoside transporter 14 11
16130098 yeiI predicted kinase 14 9
16130095 yeiE predicted DNA-binding transcriptional regulator 14 22
16130104 yeiC predicted kinase 14 9
16130067 yehX predicted transporter subunit: ATP-binding component of ABC 14 25
90111392 yehT predicted response regulator in two-component system with YehU 14 18
16130021 yegQ predicted peptidase 14 15
90111378 yegH fused predicted membrane protein/predicted membrane protein 14 25
90111380 yegD predicted chaperone 14 23
16129957 yeeZ predicted epimerase, with NAD(P)-binding Rossmann-fold domain 14 6
90111371 yeeY predicted DNA-binding transcriptional regulator 14 17
49176178 yeeX hypothetical protein b2007 14 4
16129927 yeeN hypothetical protein b1983 14 23
90111365 yeeJ adhesin 14 9
90111362 yedW predicted DNA-binding response regulator in two-component system 14 24
16129914 yedV predicted sensory kinase in two-component regulatory system with 14 23
16129865 yecS predicted transporter subunit: membrane component of ABC 14 15
16129824 yecP predicted S-adenosyl-L-methionine-dependent methyltransferase 14 16
16129823 yecO predicted methyltransferase 14 17
90111352 yecM predicted metal-binding enzyme 14 5
16129852 yecI predicted ferritin-like protein 14 9
16129847 yecG universal stress protein 14 5
16129864 yecC predicted transporter subunit: ATP-binding component of ABC 14 25
90111341 yebU predicted methyltransferase 14 23
90111340 yebT hypothetical protein b1834 14 15
16129786 yebS conserved inner membrane protein 14 10
49176157 yebR hypothetical protein b1832 14 11
90111339 yebQ predicted transporter 14 20
16129806 yebK predicted DNA-binding transcriptional regulator 14 10
16129817 yebC hypothetical protein b1864 14 23
90111345 yebA predicted peptidase 14 21
16129761 yeaZ predicted peptidase 14 24
16129760 yeaY predicted lipoprotein 14 7
16129757 yeaX predicted oxidoreductase 14 11
16129754 yeaU predicted dehydrogenase 14 22
90111335 yeaT predicted DNA-binding transcriptional regulator 14 22
16129739 yeaI predicted diguanylate cyclase 14 16
16129735 yeaE predicted oxidoreductase 14 16
90111329 yeaC hypothetical protein b1777 14 8
16129767 yeaB predicted NUDIX hydrolase 14 13
16129732 yeaA methionine sulfoxide reductase B 14 22
16129683 ydjN predicted transporter 14 19
16129730 ydjL predicted oxidoreductase, Zn-dependent and NAD(P)-binding 14 20
16129728 ydjJ predicted oxidoreductase, Zn-dependent and NAD(P)-binding 14 20

16129727 ydjI predicted aldolase 14 22
90111328 ydjH predicted kinase 14 16
16129725 ydjG predicted oxidoreductase 14 16
16129724 ydjF predicted DNA-binding transcriptional regulator 14 13
16129644 ydiK predicted inner membrane protein 14 15
16129643 ydiJ predicted FAD-linked oxidoreductase 14 20
16129642 ydiI hypothetical protein b1686 14 12
49176136 ydiD hypothetical protein b1701 14 23
16129648 ydiB quinate/shikimate 5-dehydrogenase, NAD(P)-binding 14 23
16129659 ydiA hypothetical protein b1703 14 14
90111314 ydhX predicted 4Fe-4S ferredoxin-type protein 14 10
90111311 ydhJ undecaprenyl pyrophosphate phosphatase 14 23
16129598 ydhH anhydro-N-acetylmuramic acid kinase 14 20
49176131 ydhF predicted oxidoreductase 14 11
16129612 ydhD hypothetical protein b1654 14 25
49176132 ydhC predicted transporter 14 21
16129617 ydhB predicted DNA-binding transcriptional regulator 14 16
16129592 ydgR putative tripeptide transporter permease 14 11
16129499 ydfH predicted DNA-binding transcriptional regulator 14 11
16129498 ydfG L-allo-threonine dehydrogenase, NAD(P)-binding 14 25
16129403 ydcW medium chain aldehyde dehydrogenase 14 22
16129401 ydcU predicted spermidine/putrescine transporter subunit 14 19
16129400 ydcT predicted spermidine/putrescine transporter subunit 14 24
16129398 ydcR fused predicted DNA-binding transcriptional regulator/predicted 14 18
90111273 ydcP predicted peptidase 14 15
90111269 ydcI predicted DNA-binding transcriptional regulator 14 23
16129305 ydaO predicted C32 tRNA thiolase 14 23
16129289 ycjZ predicted DNA-binding transcriptional regulator 14 20
16129282 ycjX conserved protein with nucleoside triphosphate hydrolase domain 14 8
16129281 ycjW predicted DNA-binding transcriptional regulator 14 14
16129278 ycjU predicted beta-phosphoglucomutase 14 14
16129227 yciV hypothetical protein b1266 14 19
16129245 yciT predicted DNA-binding transcriptional regulator 14 13
16129240 yciS conserved inner membrane protein 14 5
90111240 yciO hypothetical protein b1267 14 24
16129241 yciM hypothetical protein b1280 14 14
16129232 yciK short chain dehydrogenase 14 25
90111241 yciH translation initiation factor Sui1 14 12
16129214 yciA predicted hydrolase 14 18
90111235 ychM predicted transporter 14 20
16129166 ychF translation-associated GTPase 14 25
16129203 ychE predicted inner membrane protein 14 18
90111228 ycgN hypothetical protein b1181 14 19
16129143 ycgM predicted isomerase/hydrolase 14 15
49176083 ycgL hypothetical protein b1179 14 15
16129069 ycfN thiamin kinase 14 0
16129068 ycfM predicted outer membrane lipoprotein 14 1
16129063 ycfH predicted metallodependent hydrolase 14 25
16129095 ycfC hypothetical protein b1132 14 19
16129060 yceG predicted aminodeoxychorismate lyase 14 23
16129051 yceD hypothetical protein b1088 14 18
90111205 ycdW 2-ketoacid reductase 14 18
16128976 ycdK hypothetical protein b1010 14 16
90111201 ycdG predicted transporter 14 14
16128935 yccX predicted acylphosphatase 14 9
49176066 yccW predicted methyltransferase 14 20
90111194 yccS predicted inner membrane protein 14 11
90111197 yccK predicted sulfite reductase subunit 14 18

16128922 ycbZ predicted peptidase 14 17
16128915 ycbY predicted methyltransferase 14 20
16128914 ycbX predicted 2Fe-2S cluster-containing protein 14 4
90111193 ycbW hypothetical protein b0946 14 2
16128894 ycbL predicted metal-binding enzyme 14 24
16128893 ycbK hypothetical protein b0926 14 5
16128923 ycbG hypothetical protein b0956 14 4
16128867 ycaN predicted DNA-binding transcriptional regulator 14 22
16128859 ycaJ recombination protein 14 24
16128865 ycaD putative MFS family transporter protein 14 13
16128835 ybjR predicted amidase and lipoprotein 14 22
90111176 ybjI hypothetical protein b0844 14 6
16128827 ybjF 23S rRNA m(5)U747-methyltransferase 14 22
90111181 ybjE predicted transporter 14 6
16128792 ybiY predicted pyruvate formate lyase activating enzyme 14 5
16128791 ybiW predicted pyruvate formate lyase 14 5
16128790 ybiV predicted hydrolase 14 5
16128788 ybiT fused predicted transporter subunits of ABC superfamily: 14 25
16128757 ybhO cardiolipin synthase 2 14 16
16128748 ybhK predicted transferase with NAD(P)-binding Rossmann-fold domain 14 5
16128738 ybhI predicted transporter 14 10
90111168 ybhF fused predicted transporter subunits of ABC superfamily: 14 25
49176046 ybhD predicted DNA-binding transcriptional regulator 14 23
16128734 ybhA predicted hydrolase 14 7
16128688 ybgL hypothetical protein b0713 14 12
16128686 ybgJ predicted enzyme subunit 14 11
16128685 ybgI conserved metal-binding protein 14 20
16128684 ybgH predicted transporter 14 11
16128717 ybgF hypothetical protein b0742 14 16
16128711 ybgC predicted acyl-CoA thioesterase 14 18
16128662 ybfF hypothetical protein b0686 14 11
16128643 ybeZ predicted protein with nucleoside triphosphate hydrolase domain 14 21
16128642 ybeY hypothetical protein b0659 14 25
16128641 ybeX predicted ion transport 14 24
90111158 ybeQ hypothetical protein b0644 14 18
90111155 ybeF predicted DNA-binding transcriptional regulator 14 15
16128614 ybeD hypothetical protein b0631 14 9
90111157 ybeB hypothetical protein b0637 14 23
16128619 ybeA hypothetical protein b0636 14 21
16128591 ybdR predicted oxidoreductase, Zn-dependent and NAD(P)-binding 14 20
16128586 ybdO predicted DNA-binding transcriptional regulator 14 14
16128583 ybdL putative aminotransferase 14 22
16128580 ybdB hypothetical protein b0597 14 12
90111146 ybcJ predicted RNA-binding protein 14 0
16128488 ybbS DNA-binding transcriptional activator of the allD operon 14 18
16128477 ybbO short chain dehydrogenase 14 25
90111142 ybbN predicted thioredoxin domain-containing protein 14 22
16128474 ybbL predicted transporter subunit: ATP-binding component of ABC 14 25
16128473 ybbK predicted protease, membrane anchored 14 24
16128472 ybbJ conserved inner membrane protein 14 5
16128508 ybbF UDP-2,3-diacylglucosamine hydrolase 14 24
16128479 ybbA predicted transporter subunit: ATP-binding component of ABC 14 25
49176025 ybaY predicted outer membrane lipoprotein 14 3
16128429 ybaX predicted aluminum resistance protein 14 22
16128427 ybaV hypothetical protein b0442 14 12
90111137 ybaO predicted DNA-binding transcriptional regulator 14 19
16128450 ybaM hypothetical protein b0466 14 0
16128462 ybaL predicted transporter with NAD(P)-binding Rossmann-fold domain 14 22

16128465 ybaK hypothetical protein b0481 14 16
16128430 ybaE predicted transporter subunit: periplasmic-binding component of ABC 14 0
16128398 ybaD hypothetical protein b0413 14 22
16128455 ybaB hypothetical protein b0471 14 23
90111133 yajR predicted transporter 14 23
90111132 yajQ nucleotide-binding protein 14 18
90111130 yajO predicted oxidoreductase, NAD(P)-binding 14 16
90111128 yajF 14 7
16128392 yajC preprotein translocase subunit YajC 14 25
16128310 yahK predicted oxidoreductase, Zn-dependent and NAD(P)-binding 14 18
16128301 yahB predicted DNA-binding transcriptional regulator 14 19
16128254 yagF CP4-6 prophage; predicted dehydratase 14 21
16128253 yagE CP4-6 prophage; predicted lyase/synthase 14 24
16128205 yafV predicted C-N hydrolase family amidase, NAD(P)-binding 14 15
90111099 yafS predicted S-adenosyl-L-methionine-dependent methyltransferase 14 14
16128209 yafJ predicted amidotransferase 14 15
16128196 yafD hypothetical protein b0209 14 4
16128195 yafC predicted DNA-binding transcriptional regulator 14 23
16128225 yafA fermentation/respiration switch protein 14 0
16128170 yaeT hypothetical protein b0177 14 24
16128169 yaeL zinc metallopeptidase 14 23
16128156 yaeH hypothetical protein b0163 14 0
16128188 yaeB hypothetical protein b0195 14 15
16128151 yadT vitamin B12-transporter protein BtuF 14 8
16128150 yadS conserved inner membrane protein 14 17
16128149 yadR hypothetical protein b0156 14 25
16128121 yadH predicted transporter subunit: membrane component of ABC 14 23
16128120 yadG predicted transporter subunit: ATP-binding component of ABC 14 25
16128137 yadB glutamyl-Q tRNA(Asp) synthetase 14 25
90111086 yacL hypothetical protein b0119 14 6
16128094 yacG zinc-binding protein 14 16
16128095 yacF hypothetical protein b0102 14 5
16128004 yaaH conserved inner membrane protein associated with acetate transport 14 2
16128000 yaaA hypothetical protein b0006 14 17
16131439 xylH D-xylose transporter subunit 14 7
16131438 xylG fused D-xylose transporter subunits of ABC superfamily: ATP-binding 14 25
16131437 xylF D-xylose transporter subunit 14 7
16131435 xylB xylulokinase 14 18
16129703 xthA exonuclease III 14 24
16128407 xseB exodeoxyribonuclease VII small subunit 14 16
16130434 xseA exodeoxyribonuclease VII large subunit 14 21
49176270 xni exonuclease IX 14 25
16130796 xerD site-specific tyrosine recombinase XerD 14 21
16131663 xerC site-specific tyrosine recombinase XerC 14 20
16130331 xapR DNA-binding transcriptional activator 14 23
16129998 wcaB predicted acyl transferase 14 19
16130808 visC hypothetical protein b2906 14 22
16132080 valS valyl-tRNA synthetase 14 25
16130279 vacJ predicted lipoprotein 14 20
16132145 uxuR DNA-binding transcriptional repressor 14 11
16129861 uvrY response regulator 14 22
16131665 uvrD DNA-dependent ATPase I and helicase II 14 25
90111354 uvrC excinuclease ABC subunit C 14 23
16128747 uvrB excinuclease ABC subunit B 14 23
16131884 uvrA excinuclease ABC subunit A 14 22
16128916 uup fused predicted transporter subunits of ABC superfamily: 14 25
16129294 uspE stress-induced protein 14 12
16131367 uspA universal stress global response regulator 14 6

16128464 ushA UDP-sugar hydrolase 14 12
16130254 usg hypothetical protein b2319 14 19
16130422 uraA uracil transporter 14 14
16130953 uppP undecaprenyl pyrophosphate phosphatase 14 18
90111448 upp uracil phosphoribosyltransferase 14 18
16130505 ung uracil-DNA glycosylase 14 22
16132013 ulaR DNA-binding transcriptional dual regulator 14 12
16132017 ulaC L-ascorbate-specific enzyme IIA component of PTS 14 3
16131536 uhpT sugar phosphate antiporter 14 10
90111632 uhpC membrane protein regulates uhpT expression 14 15
90111633 uhpB sensory histidine kinase in two-component regulatory system with 14 9
16131539 uhpA DNA-binding response regulator in two-component regulatory system 14 22
16131321 ugpQ cytoplasmic glycerophosphodiester phosphodiesterase 14 12
90111593 ugpC glycerol-3-phosphate transporter subunit 14 25
16129969 ugd UDP-glucose 6-dehydrogenase 14 22
16131680 udp uridine phosphorylase 14 12
90111379 udk uridine kinase 14 12
90111670 udhA soluble pyridine nucleotide transhydrogenase 14 25
90111431 ucpA short chain dehydrogenase 14 25
16130246 ubiX 3-octaprenyl-4-hydroxybenzoate carboxy-lyase 14 14
16130809 ubiH 2-octaprenyl-6-methoxyphenyl hydroxylase 14 22
16130167 ubiG 3-demethylubiquinone-9 3-methyltransferase 14 22
16128645 ubiF 2-octaprenyl-3-methyl-6-methoxy-1,4-benzoquinol hydroxylase 14 22
49176426 ubiE ubiquinone/menaquinone biosynthesis methyltransferase 14 22
16131689 ubiD 3-octaprenyl-4-hydroxybenzoate decarboxylase 14 14
90111677 ubiC chorismate pyruvate lyase 14 0
16131684 ubiB putative ubiquinone biosynthesis protein UbiB 14 21
16131866 ubiA 4-hydroxybenzoate octaprenyltransferase 14 22
16129595 tyrS tyrosyl-tRNA synthetase 14 25
16129284 tyrR DNA-binding transcriptional dual regulator, tyrosine-binding 14 21
16129857 tyrP tyrosine transporter 14 10
16131880 tyrB tyrosine aminotransferase, tyrosine-repressible, PLP-dependent 14 17
16130521 tyrA fused chorismate mutase T/prephenate dehydrogenase 14 9
16131810 tufB protein chain elongation factor EF-Tu (duplicate of tufA) 14 25
16131218 tufA protein chain elongation factor EF-Tu (duplicate of tufB) 14 25
16132176 tsr methyl-accepting chemotaxis protein I, serine sensor receptor 14 15
16128163 tsf elongation factor Ts 14 25
16130507 trxC thioredoxin 2 14 25
16128855 trxB thioredoxin reductase, FAD/NAD(P)-binding 14 25
67005950 trxA thioredoxin 14 25
16131058 truB tRNA pseudouridine synthase B 14 23
16130253 truA tRNA pseudouridine synthase A 14 23
16131262 trpS tryptophanyl-tRNA synthetase 14 25
16132210 trpR Trp operon repressor 14 4
16129225 trpE anthranilate synthase component I 14 24
16129224 trpD bifunctional indole-3-glycerol-phosphate synthase/anthranilate 14 24
90111239 trpC bifunctional indole-3-glycerol phosphate 14 23
16129222 trpB tryptophan synthase subunit beta 14 23
16129221 trpA tryptophan synthase subunit alpha 14 23
90111218 trmU tRNA (5-methylaminomethyl-2-thiouridylate)-methyltransferase 14 25
16131522 trmH tRNA (Guanosine-2'-O-)-methyltransferase 14 19
16131574 trmE tRNA modification GTPase 14 25
16130528 trmD tRNA (guanine-N(1)-)-methyltransferase 14 24
16131803 trmA tRNA (uracil-5-)-methyltransferase 14 22
49176432 trkH potassium transporter 14 17
16129324 trkG Rac prophage; potassium transporter subunit 14 17
16131169 trkA potassium transporter peripheral membrane component 14 17
16129380 trg methyl-accepting chemotaxis protein III, ribose and galactose 14 14

16132063 treR trehalose repressor 14 13
16132061 treC trehalose-6-P hydrolase 14 13
16132062 treB fused trehalose(maltose)-specific PTS enzyme: IIB component/IIC 14 3
16131757 tpiA triosephosphate isomerase 14 25
90111350 torZ trimethylamine N-oxide reductase system III, catalytic subunit 14 12
90111200 torS hybrid sensory histidine kinase in two-component regulatory system 14 23
16128961 torR DNA-binding response regulator in two-component regulatory system 14 24
16128963 torA trimethylamine N-oxide (TMAO) reductase I, catalytic subunit 14 11
16129717 topB DNA topoisomerase III 14 23
16129235 topA DNA topoisomerase I 14 23
16129213 tonB membrane spanning protein in TonB-ExbB-ExbD complex 14 2
16128713 tolR membrane spanning protein in TolA-TolQ-TolR complex 14 19
16128712 tolQ membrane spanning protein in TolA-TolQ-TolR complex 14 24
90111528 tolC outer membrane channel precursor protein 14 17
16128715 tolB translocation protein TolB precursor 14 24
16128714 tolA cell envelope integrity inner membrane protein TolA 14 0
16131577 tnaB tryptophan transporter of low affinity 14 10
16129061 tmk thymidylate kinase 14 24
16131134 tldD predicted peptidase 14 18
16130390 tktB transketolase 2, thiamin-binding 14 25
49176286 tktA transketolase 1, thiamin-binding 14 25
90111614 tkrA 2-ketoaldonate reductase/glyoxylate reductase B 14 23
16128181 tilS tRNA(Ile)-lysidine synthetase 14 25
16128421 tig trigger factor 14 24
16130731 thyA thymidylate synthase 14 25
16129675 thrS threonyl-tRNA synthetase 14 25
16127998 thrC threonine synthase 14 18
16127997 thrB homoserine kinase 14 9
16127996 thrA bifunctional aspartokinase I/homoserine dehydrogenase I 14 24
16128060 thiQ thiamin transporter subunit 14 25
16128061 thiP thiamin ABC transporter membrane protein 14 15
16128402 thiL thiamine monophosphate kinase 14 23
16128408 thiI thiamine biosynthesis protein ThiI 14 13
16131820 thiH thiamine biosynthesis protein ThiH 14 3
33347812 thiG thiazole synthase 14 19
90111672 thiF thiamine biosynthesis protein ThiF 14 22
16131823 thiE thiamine-phosphate pyrophosphorylase 14 19
16130041 thiD phosphomethylpyrimidine kinase 14 22
16131824 thiC thiamine biosynthesis protein ThiC 14 18
16128391 tgt queuine tRNA-ribosyltransferase 14 23
16128437 tesB acyl-CoA thioesterase II 14 16
16128478 tesA multifunctional acyl-CoA thioesterase I and protease I and 14 17
16129199 tdk 14 13
16131487 tdh L-threonine 3-dehydrogenase 14 20
49176314 tdcG L-serine dehydratase 3 14 19
90111542 tdcF predicted L-PSP (mRNA) endoribonuclease 14 24
49176316 tdcE pyruvate formate-lyase 4/2-ketobutyrate formate-lyase 14 5
16131008 tdcD propionate kinase/acetate kinase C, anaerobic 14 16
16131009 tdcC L-threonine/L-serine transporter 14 11
16131010 tdcB threonine dehydratase 14 19
16131011 tdcA DNA-binding transcriptional activator 14 23
16128062 tbpA thiamin transporter subunit 14 4
16128351 tauB taurine transporter subunit 14 25
49176429 tatD DNase, magnesium-dependent 14 25
16131687 tatC TatABCE protein translocation system subunit 14 21
49176428 tatB sec-independent translocase 14 11
90111653 tatA twin arginine translocase protein A 14 8
16129838 tar methyl-accepting chemotaxis protein II 14 15

16129837 tap methyl-accepting protein IV 14 15
16128002 talB transaldolase B 14 20
16130389 talA transaldolase A 14 21
16131420 tag 3-methyl-adenine DNA glycosylase I, constitutive 14 17
16130484 tadA tRNA-specific adenosine deaminase 14 25
16130700 syd SecY interacting protein Syd 14 4
16128047 surA peptidyl-prolyl cis-trans isomerase (PPIase) 14 24
16130458 suhB inositol monophosphatase 14 25
16129636 sufS selenocysteine lyase 14 25
16129635 sufE cysteine desulfuration protein SufE 14 17
16129638 sufC cysteine desulfurase ATPase component 14 25
16129640 sufA iron-sulfur cluster assembly scaffold protein 14 25
16128704 sucD succinyl-CoA synthetase subunit alpha 14 23
16128703 sucC succinyl-CoA synthetase subunit beta 14 23
16128702 sucB dihydrolipoamide acetyltransferase 14 25
16128701 sucA alpha-ketoglutarate decarboxylase 14 24
16130583 stpA DNA binding protein, nucleoid-associated 14 6
16128900 ssuB alkanesulfonate transporter subunit 14 25
16130984 sstT sodium:serine/threonine symporter 14 17
16131118 sspB ClpXP protease specificity-enhancing factor 14 23
16131119 sspA stringent starvation protein A 14 23
16131885 ssb single-strand DNA-binding protein 14 25
16130501 srmB ATP-dependent RNA helicase 14 25
16130614 srlR DNA-binding transcriptional repressor 14 13
16130612 srlD 3-ketoacyl-(acyl-carrier-protein) reductase 14 25
16130845 sprT hypothetical protein b2944 14 9
16130113 spr predicted peptidase, outer membrane lipoprotein 14 12
16129720 sppA protease IV (signal peptide peptidase) 14 25
16131521 spoT bifunctional (p)ppGpp synthetase II/ guanosine-3',5'-bis 14 23
16128669 speF isozyme, inducible 14 8
90111522 speC ornithine decarboxylase, constitutive 14 8
16130839 speA 14 17
16131888 soxS DNA-binding transcriptional dual regulator 14 4
16129233 sohB predicted inner membrane peptidase 14 25
16129614 sodB superoxide dismutase, Fe 14 25
49176442 sodA superoxide dismutase, Mn 14 25
16128888 smtA predicted S-adenosyl-L-methionine-dependent methyltransferase 14 9
16130539 smpB SsrA-binding protein 14 25
90111468 smpA small membrane lipoprotein 14 17
49176336 smf hypothetical protein b4473 14 22
16131227 slyX hypothetical protein b3348 14 5
16131228 slyD FKBP-type peptidyl prolyl cis-trans isomerase (rotamase) 14 21
90111310 slyA transcriptional regulator SlyA 14 10
90111747 slt lytic murein transglycosylase, soluble 14 20
90111603 slp outer membrane lipoprotein 14 7
90111625 slmA nucleoid occlusion protein 14 8
16130273 sixA phosphohistidine phosphatase 14 13
16131342 sirA cell developmental protein SirA 14 17
16128063 sgrR DNA-binding transcriptional regulator 14 1
16132121 sgcR KpLE2 phage-like element; predicted DNA-binding transcriptional 14 12
16132122 sgcE ribulose-phosphate 3-epimerase 14 25
16132123 sgcA KpLE2 phage-like element; predicted phosphotransferase enzyme IIA 14 3
16128139 sfsA sugar fermentation stimulation protein A 14 12
90111281 sfcA malate dehydrogenase, (decarboxylating, NAD-requiring) (malic 14 17
16128860 serS seryl-tRNA synthetase 14 25
16128874 serC phosphoserine aminotransferase 14 25
16132205 serB 3-phosphoserine phosphatase 14 17
16130814 serA D-3-phosphoglycerate dehydrogenase 14 23

16128663 seqA regulatory protein for replication initiation 14 8
16131461 selB selenocysteinyl-tRNA-specific translation factor 14 25
16131065 secG protein-export membrane protein 14 21
16128394 secF protein export protein SecF 14 23
16131811 secE translocase 14 21
16128393 secD protein export protein SecD 14 23
16131480 secB export protein SecB 14 23
16128091 secA translocase 14 25
16128697 sdhD succinate dehydrogenase cytochrome b556 small membrane subunit 14 15
16128696 sdhC succinate dehydrogenase cytochrome b556 large membrane subunit 14 17
16128699 sdhB succinate dehydrogenase, FeS subunit 14 24
16128698 sdhA succinate dehydrogenase flavoprotein subunit 14 24
16130703 sdaC predicted serine transporter 14 11
16130704 sdaB L-serine deaminase II 14 19
16129768 sdaA L-serine deaminase I 14 19
16131755 sbp sulfate transporter subunit 14 12
16128383 sbcD exonuclease, dsDNA, ATP-dependent 14 11
16128382 sbcC exonuclease, dsDNA, ATP-dependent 14 11
16129952 sbcB exonuclease I 14 18
16129251 sapF predicted antimicrobial peptide transporter subunit 14 25
16129252 sapD predicted antimicrobial peptide transporter subunit 14 25
16129253 sapC predicted antimicrobial peptide transporter subunit 14 17
16129254 sapB predicted antimicrobial peptide transporter subunit 14 20
16129255 sapA predicted antimicrobial peptide transporter subunit 14 16
16130082 sanA hypothetical protein b2144 14 4
16129816 ruvC Holliday junction resolvase 14 24
16129813 ruvB Holliday junction DNA helicase B 14 23
16129814 ruvA Holliday junction DNA helicase motor protein 14 23
16130692 rumA 23S rRNA (uracil-5-)-methyltransferase 14 22
16131296 rtcR sigma 54-dependent transcriptional regulator of rtcBA expression 14 21
16130121 rsuA 16S rRNA pseudouridylate 516 synthase 14 23
16129567 rstB sensory histidine kinase in two-component regulatory system with 14 23
16129566 rstA DNA-binding response regulator in two-component regulatory system 14 24
16129196 rssB response regulator of RpoS 14 23
16129538 rspB predicted oxidoreductase, Zn-dependent and NAD(P)-binding 14 20
16132189 rsmC 16S ribosomal RNA m2G1207 methyltransferase 14 16
16131168 rsmB 16S rRNA m5C967 methyltransferase, 14 23
16130495 rseC RseC protein involved in reduction of the SoxR iron-sulfur cluster 14 3
16130496 rseB periplasmic negative regulator of sigmaE 14 13
16130497 rseA anti-sigma factor 14 8
16131069 rrmJ 23S rRNA methyltransferase 14 25
16129776 rrmA 23S rRNA m1G745 methyltransferase 14 11
16130961 rpsU 30S ribosomal protein S21 14 24
16128017 rpsT 30S ribosomal protein S20 14 25
16132024 rpsR 30S ribosomal protein S18 14 25
16130530 rpsP 30S ribosomal protein S16 14 25
16131057 rpsO 30S ribosomal protein S15 14 25
16131186 rpsN 30S ribosomal protein S14 14 25
16131177 rpsM 30S ribosomal protein S13 14 25
16131221 rpsL 30S ribosomal protein S12 14 25
16131176 rpsK 30S ribosomal protein S11 14 24
16131200 rpsJ 30S ribosomal protein S10 14 24
16131120 rpsI 30S ribosomal protein S9 14 25
16131185 rpsH 30S ribosomal protein S8 14 24
16131220 rpsG 30S ribosomal protein S7 14 24
16132022 rpsF 30S ribosomal protein S6 14 25
16131182 rpsE 30S ribosomal protein S5 14 25
16131175 rpsD 30S ribosomal protein S4 14 25

16131193 rpsC 30S ribosomal protein S3 14 25
16128162 rpsB 30S ribosomal protein S2 14 25
16128878 rpsA 30S ribosomal protein S1 14 25
16131520 rpoZ DNA-directed RNA polymerase subunit omega 14 23
16130648 rpoS RNA polymerase sigma factor 14 25
16131092 rpoN DNA-directed RNA polymerase subunit N 14 17
16131333 rpoH RNA polymerase sigma factor 14 25
16130498 rpoE RNA polymerase, sigma 24 (sigma E) factor 14 17
16130963 rpoD RNA polymerase sigma factor 14 25
16131818 rpoC DNA-directed RNA polymerase subunit beta' 14 25
16131817 rpoB DNA-directed RNA polymerase subunit beta 14 25
16131174 rpoA DNA-directed RNA polymerase subunit alpha 14 25
16131571 rpmH 50S ribosomal protein L34 14 14
16131507 rpmG 50S ribosomal protein L33 14 24
16129052 rpmF 50S ribosomal protein L32 14 24
16131774 rpmE 50S ribosomal subunit protein L31 14 20
16131181 rpmD 50S ribosomal protein L30 14 22
16131191 rpmC 50S ribosomal protein L29 14 13
16131075 rpmA 50S ribosomal protein L27 14 25
16130123 rplY 50S ribosomal protein L25 14 24
16131188 rplX 50S ribosomal protein L24 14 25
16131197 rplW 50S ribosomal protein L23 14 24
16131194 rplV 50S ribosomal protein L22 14 25
16131076 rplU 50S ribosomal protein L21 14 25
16129672 rplT 50S ribosomal protein L20 14 25
16130527 rplS 50S ribosomal protein L19 14 25
16131183 rplR 50S ribosomal protein L18 14 25
16131173 rplQ 50S ribosomal protein L17 14 25
16131192 rplP 50S ribosomal protein L16 14 25
16131180 rplO 50S ribosomal protein L15 14 25
16131189 rplN 50S ribosomal protein L14 14 24
16131121 rplM 50S ribosomal protein L13 14 25
16131816 rplL 50S ribosomal protein L7/L12 14 25
16131813 rplK 50S ribosomal protein L11 14 25
16131815 rplJ 50S ribosomal protein L10 14 25
16132025 rplI 50S ribosomal protein L9 14 25
16131184 rplF 50S ribosomal protein L6 14 25
16131187 rplE 50S ribosomal protein L5 14 25
16131198 rplD 50S ribosomal protein L4 14 25
16131199 rplC 50S ribosomal protein L3 14 25
16131196 rplB 50S ribosomal protein L2 14 25
16131814 rplA 50S ribosomal protein L1 14 25
90111685 rpiR DNA-binding transcriptional repressor 14 10
16130815 rpiA ribose-5-phosphate isomerase A 14 25
16131514 rph ribonuclease PH 14 22
16131264 rpe ribulose-phosphate 3-epimerase 14 25
16129610 rnt ribonuclease T 14 23
90111698 rnr R, RNase R 14 24
16131572 rnpA ribonuclease P 14 22
16128176 rnhB ribonuclease HII 14 23
16128201 rnhA ribonuclease H 14 25
90111563 rng ribonuclease G 14 25
16129589 rnfG electron transport complex protein RnfG 14 15
16129590 rnfE NADH-ubiquinone oxidoreductase 14 15
16129588 rnfD electron transport complex protein RnfD 14 17
16129587 rnfC electron transport complex protein RnfC 14 13
16129586 rnfB electron transport complex protein RnfB 14 23
16129585 rnfA Na(+)-translocating NADH-quinone reductase subunit E 14 17

265 16129047 rne fused ribonucleaseE: endoribonuclease/RNA-binding protein/RNA 14 25 16129758 ribonuclease D 14 20 16130492 rncS ribonuclease III 14 25 16129247 rnb exoribonuclease II 14 24 16131681 rmuC predicted recombination limiting protein 14 18 16130515 rluD 23S rRNA pseudouridine synthase 14 23 16129049 rluC 23S rRNA pseudouridylate synthase 14 23 16129230 rluB 23S rRNA pseudouridylate synthase 14 22 16128052 rluA pseudouridine synthase for 23S rRNA (position 746) and 14 23 16128624 rlpB minor lipoprotein 14 8 16128616 rlpA minor lipoprotein 14 21 16132002 rlmB 23S rRNA (Gm2251)-methyltransferase 14 24 90111466 rimM 16S rRNA-processing protein 14 24 16129029 rimJ ribosomal-protein-S5-alanine N-acetyltransferase 14 4 16132191 rimI ribosomal-protein-alanine N-acetyltransferase 14 23 16128400 ribH riboflavin synthase subunit beta 14 25 16128019 ribF hypothetical protein b0025 14 24 16128399 ribD fused diaminohydroxyphosphoribosylaminopyrimidine deaminase and 14 25 16129620 ribC riboflavin synthase subunit alpha 14 25 16130937 ribB 3,4-dihydroxy-2-butanone 4-phosphate synthase 14 25 16129238 ribA GTP cyclohydrolase II protein 14 24 49176421 rhtC threonine efflux system 14 15 49176422 rhtB neutral amino-acid efflux system 14 13 16131639 rho transcription termination factor Rho 14 25 16128765 rhlE RNA helicase 14 25 16131636 rhlB ATP-dependent RNA helicase 14 25 16131745 rhaS DNA-binding transcriptional activator, L-rhamnose-binding 14 12 16131746 rhaR DNA-binding transcriptional activator, L-rhamnose-binding 14 4 16131645 rffH glucose-1-phosphate thymidylyltransferase 14 23 49176411 rffG dTDP-glucose 4,6-dehydratase 14 23 49176409 rffE UDP-N-acetyl glucosamine-2-epimerase 14 13 16131647 rffA TDP-4-oxo-6-deoxy-D-glucose transaminase 14 22 16129981 rfbB dTDP-glucose 4,6 dehydratase, NAD(P)-binding 14 23 16129979 rfbA glucose-1-phosphate thymidylyltransferase 14 22 16131503 rfaQ lipopolysaccharide core biosynthesis protein 14 12 16131688 rfaH transcriptional 
activator RfaH 14 7 16131491 rfaF ADP-heptose:LPS heptosyltransferase II 14 11 16131490 rfaD ADP-L-glycero-D-mannoheptose-6-epimerase, NAD(P)-binding 14 18 49176407 rep DNA helicase and single-stranded DNA-dependent ATPase 14 25 16130691 relA (p)ppGpp synthetase I/GTP pyrophosphokinase 14 23 16130605 recX RecA regulator RecX 14 19 16128456 recR recombination protein RecR 14 23 49176420 recQ ATP-dependent DNA helicase 14 24 16130490 recO DNA repair protein RecO 14 22 49176247 recN recombination and repair protein 14 23 16130794 recJ ssDNA exonuclease, 5' --> 3'-specific 14 24 16131523 recG ATP-dependent DNA helicase 14 25 16131568 recF recombination protein F 14 23 16130723 recD exonuclease V (RecBCD complex), alpha chain 14 22 16130726 recC exonuclease V (RecBCD complex), gamma chain 14 21 16130724 recB exonuclease V (RecBCD complex), beta subunit 14 25 16130606 recA recombinase A 14 24 16128378 rdgC recombination associated protein 14 17 16130155 rcsC hybrid sensory kinase in two-component regulatory system with RcsB 14 23 16130154 rcsB DNA-binding response regulator in two-component regulatory system 14 19 16131621 rbsR DNA-binding transcriptional repressor of ribose metabolism 14 14 16131620 rbsK ribokinase 14 16 90111647 rbsD predicted cytoplasmic sugar-binding protein 14 5

16131618 rbsC ribose ABC transporter permease protein 14 7 16131619 rbsB D-ribose transporter subunit 14 11 16131617 rbsA fused D-ribose transporter subunits of ABC superfamily: ATP-binding 14 25 16131059 rbfA ribosome-binding factor A 14 25 49176363 rbbA fused ribosome-associated ATPase: ATP-binding protein/ATP-binding 14 25 49176418 rarD predicted chloramphenicol resistance permease 14 16 16132206 radA predicted repair protein 14 23 16128390 queA S-adenosylmethionine:tRNA ribosyltransferase-isomerase 14 23 16130922 qseC sensory histidine kinase in two-component regulatory system with 14 23 16130921 qseB DNA-binding response regulator in two-component regulatory system 14 24 16131877 qor quinone oxidoreductase, NADPH-dependent 14 18 16132066 pyrI aspartate carbamoyltransferase regulatory subunit 14 1 16128164 pyrH uridylate kinase 14 25 16130687 pyrG CTP synthetase 14 25 16129242 pyrF orotidine 5'-phosphate decarboxylase 14 25 16131513 pyrE orotate phosphoribosyltransferase 14 24 16128912 pyrD dihydroorotate dehydrogenase 14 21 16129025 pyrC dihydroorotase 14 16 16132067 pyrB aspartate carbamoyltransferase catalytic subunit 14 25 16129632 pykF 14 22 16129807 pykA pyruvate kinase 14 22 16129263 puuE GABA aminotransferase, PLP-dependent 14 25 16129261 puuC gamma-Glu-gamma-aminobutyraldehyde dehydrogenase, NAD(P)H-dependent 14 22 90111244 puuA gamma-Glu-putrescine synthase 14 22 16128981 putP proline:sodium symporter 14 20 16128980 putA fused DNA-binding transcriptional regulator/proline 14 22 16129193 purU formyltetrahydrofolate deformylase 14 22 16129616 purR DNA-binding transcriptional repressor, hypoxanthine-binding 14 14 16130425 purN phosphoribosylglycinamide formyltransferase 14 22 16130424 purM phosphoribosylaminoimidazole synthetase 14 24 49176239 purL phosphoribosylformylglycinamidine synthase 14 23 16128506 purK phosphoribosylaminoimidazole carboxylase 14 22 16131836 purH bifunctional phosphoribosylaminoimidazolecarboxamide 14 25 16130247 purF 
amidophosphoribosyltransferase 14 25 16128507 purE phosphoribosylaminoimidazole carboxylase catalytic subunit 14 24 16131835 purD phosphoribosylamine--glycine ligase 14 23 16129094 purB adenylosuccinate lyase 14 25 16131999 purA adenylosuccinate synthetase 14 25 16130733 ptsP fused PTS enzyme: PEP-protein phosphotransferase (enzyme I)/GAF 14 24 16131094 ptsN sugar-specific enzyme IIA component of PTS 14 15 16130342 ptsI PEP-protein phosphotransferase of PTS system (enzyme I) 14 24 16130341 ptsH phosphohistidinoprotein-hexose phosphotransferase component of PTS 14 10 16129064 ptsG fused glucose-specific PTS enzymes: IIB component/IIC component 14 5 49176446 ptsA fused predicted PTS enzymes: Hpr component/enzyme I 14 22 16130725 ptr protease III 14 15 16129167 pth peptidyl-tRNA hydrolase 14 25 16130232 pta phosphate acetyltransferase 14 19 16131596 pstS phosphate transporter subunit 14 8 16131595 pstC phosphate transporter subunit 14 18 16131593 pstB phosphate transporter subunit 14 25 16131594 pstA phosphate transporter subunit 14 19 90111464 pssA phosphatidylserine synthase 14 10 90111246 pspF DNA-binding transcriptional activator 14 21 16129267 pspC DNA-binding transcriptional activator 14 3 16129266 pspB phage shock protein B 14 4 16129265 pspA regulatory protein for phage-shock-protein operon 14 4 16131985 psd phosphatidylserine decarboxylase 14 23 16129170 prsA ribose-phosphate pyrophosphokinase 14 25

267 16128315 prpR DNA-binding transcriptional activator 14 21 16128320 prpE predicted propionyl-CoA synthetase with ATPase domain 14 22 16128318 prpC 2-methylcitrate synthase 14 21 16130592 proW glycine betaine transporter subunit 14 16 16130591 proV glycine betaine transporter subunit 14 25 16128187 proS prolyl-tRNA synthetase 14 25 49176156 proQ putative solute/DNA competence effector 14 9 16128371 proC pyrroline-5-carboxylate reductase 14 23 16128228 proB gamma-glutamyl kinase 14 20 16128229 proA gamma-glutamyl phosphate reductase 14 20 90111419 prmB N5-glutamine methyltransferase 14 25 16131147 prmA ribosomal protein L11 methyltransferase 14 22 16131370 prlC oligopeptidase A 14 24 16131179 prlA protein translocase subunit SecY 14 25 16131234 prkB predicted phosphoribulokinase 14 7 16131773 priA primosome assembly protein PriA 14 23 16128222 prfH peptide chain release factor 2 14 25 16132193 prfC peptide chain release factor 3 14 25 16130793 prfB peptide chain release factor 2 14 25 16129174 prfA peptide chain release factor 1 14 25 16129784 prc carboxy-terminal protease for penicillin-binding protein 3 14 22 16129453 pqqL predicted peptidase 14 21 16128918 pqiB paraquat-inducible protein B 14 15 16128917 pqiA paraquat-inducible membrane protein A 14 11 16130427 ppx exopolyphosphatase 14 21 16129658 ppsA phosphoenolpyruvate synthase 14 23 16130534 ppnK inorganic polyphosphate/ATP-NAD kinase 14 24 16128426 ppiD peptidyl-prolyl cis-trans isomerase (rotamase D) 14 25 16128509 ppiB peptidyl-prolyl cis-trans isomerase B (rotamase B) 14 21 16131242 ppiA peptidyl-prolyl cis-trans isomerase A (rotamase A) 14 21 16131794 ppc phosphoenolpyruvate carboxylase 14 17 16132048 ppa inorganic pyrophosphatase 14 22 16128839 poxB pyruvate dehydrogenase 14 21 16128825 potI putrescine transporter subunit: membrane component of ABC 14 16 16128824 potH putrescine transporter subunit: membrane component of ABC 14 19 90111177 potG putrescine transporter subunit: ATP-binding component of 
ABC 14 25 16128668 potE putrescine/proton symporter: putrescine/ornithine antiporter 14 18 16129086 potD spermidine/putrescine ABC transporter periplasmic substrate-binding 14 15 16129087 potC spermidine/putrescine ABC transporter membrane protein 14 19 16129088 potB spermidine/putrescine ABC transporter membrane protein 14 20 16129089 potA putrescine/spermidine ABC transporter ATPase protein 14 25 16131704 polA DNA polymerase I 14 25 16129560 pntB pyridine nucleotide transhydrogenase 14 18 16129561 pntA NAD(P) transhydrogenase subunit alpha 14 19 49176320 pnp polynucleotide phosphorylase/polyadenylase 14 25 16132057 pmbA predicted peptidase required for the maturation and secretion of 14 18 90111212 plsX fatty acid/phospholipid synthesis protein 14 17 16130914 plsC 1-acyl-sn-glycerol-3-phosphate acyltransferase 14 11 90111678 plsB glycerol-3-phosphate acyltransferase 14 18 49176423 pldB lysophospholipase L(2) 14 7 16131365 pitA phosphate transporter, low-affinity 14 20 16128683 phr deoxyribodipyrimidine photolyase, FAD-binding 14 19 16131592 phoU negative regulator of PhoR/PhoB two-component regulator 14 16 16128385 phoR sensory histidine kinase in two-component regulatory system with 14 23 16129092 phoQ sensory histidine kinase in two-component regulatory system with 14 22 16129093 phoP DNA-binding response regulator in two-component regulatory system 14 24 16128984 phoH conserved protein with nucleoside triphosphate hydrolase domain 14 21 16128227 phoE outer membrane phosphoporin protein E 14 3

268 16128384 phoB DNA-binding response regulator in two-component regulatory system 14 24 16131922 phnL carbon-phosphorus lyase complex subunit 14 24 49176459 phnK carbon-phosphorus lyase complex subunit 14 25 16131932 phnC phosphonate/organophosphate ester transporter subunit 14 25 16131934 phnA hypothetical protein b4108 14 20 16129669 pheT phenylalanyl-tRNA synthetase beta subunit 14 25 16129670 pheS phenylalanyl-tRNA synthetase alpha subunit 14 25 16129859 pgsA phosphatidylglycerophosphate synthetase 14 24 16128403 pgpA phosphatidylglycerophosphatase A 14 19 16128664 pgm phosphoglucomutase 14 16 16130827 pgk 14 25 16131851 pgi glucose-6-phosphate isomerase 14 25 16128152 pfs 5'-methylthioadenosine/S-adenosylhomocysteine nucleosidase 14 12 16131789 pflD predicted formate acetyltransferase 2 (pyruvate formate lyase II) 14 5 49176447 pflC pyruvate formate lyase II activase 14 5 16128870 pflB pyruvate formate lyase I 14 5 16128869 pflA pyruvate formate lyase activating enzyme 1 14 5 49176138 pfkB 6- II 14 11 16131754 pfkA 6-phosphofructokinase 14 12 16128239 perR CP4-6 prophage; predicted DNA-binding transcriptional regulator 14 21 16129090 pepT peptidase T 14 3 16128899 pepN aminopeptidase N 14 23 16128223 pepD aminoacyl-histidine dipeptidase (peptidase D) 14 7 90111453 pepB aminopeptidase B 14 25 16132082 pepA leucyl aminopeptidase 14 25 16130489 pdxJ pyridoxal phosphate biosynthetic protein 14 19 16129596 pdxH pyridoxamine 5'-phosphate oxidase 14 20 16130255 pdxB erythronate-4-phosphate dehydrogenase 14 23 16128046 pdxA 4-hydroxythreonine-4-phosphate dehydrogenase 14 21 16128106 pdhR transcriptional regulator of pyruvate dehydrogenase complex 14 11 90111090 pcnB poly(A) polymerase I 14 23 16130650 pcm protein-L-isoaspartate O-methyltransferase 14 14 16131280 pckA phosphoenolpyruvate carboxykinase 14 14 90111393 pbpG D-alanyl-D-alanine endopeptidase 14 24 16130444 pbpC fused transglycosylase/transpeptidase 14 24 16130926 parE DNA topoisomerase IV subunit B 14 25 
16130915 parC DNA topoisomerase IV subunit A 14 25 90111565 panF sodium/pantothenate symporter 14 19 16128126 panC pantoate--beta-alanine ligase 14 20 16128127 panB 3-methyl-2-oxobutanoate hydroxymethyltransferase 14 20 16128716 pal peptidoglycan-associated outer membrane lipoprotein 14 24 16129059 pabC 4-amino-4-deoxychorismate lyase 14 16 16129766 pabB para-aminobenzoate synthase component I 14 24 16131239 pabA para-aminobenzoate synthase component II 14 24 16129361 paaY predicted hexapeptide repeat acetyltransferase 14 17 16129358 paaJ acetyl-CoA acetyltransferase 14 18 16129356 paaH 3-hydroxybutyryl-CoA dehydrogenase 14 17 16129354 paaF enoyl-CoA hydratase-isomerase 14 20 16129353 paaE predicted multicomponent oxygenase/reductase subunit for 14 13 16131799 oxyR DNA-binding transcriptional dual regulator 14 23 90111697 orn oligoribonuclease 14 25 16129208 oppF oligopeptide transporter subunit 14 25 49176090 oppD oligopeptide transporter ATP-binding component 14 25 16129206 oppC oligopeptide transporter subunit 14 20 16129205 oppB oligopeptide permease ABC transporter membrane protein 14 20 16129204 oppA oligopeptide transporter subunit 14 18 16131282 ompR osmolarity response regulator 14 24 16129338 ompN outer membrane pore protein N, non-specific 14 2

16128896 ompF outer membrane porin 1a (Ia;b;F) 14 1 16130152 ompC outer membrane porin protein C 14 1 16128924 ompA outer membrane protein A (3a;II*;G;d) 14 17 16129296 ogt O-6-alkylguanine-DNA:cysteine-protein methyltransferase 14 22 16131073 obgE GTPase involved in cell partitioning and DNA repair 14 25 16131812 nusG transcription antitermination protein NusG 14 25 16128401 nusB transcription antitermination protein NusB 14 24 16131061 nusA transcription elongation factor NusA 14 25 16130325 nupC nucleoside (except guanosine) transporter 14 11 16129713 nudG pyrimidine (deoxy)nucleoside triphosphate pyrophosphohydrolase 14 17 16130930 nudF ADP-ribose pyrophosphatase 14 15 16131274 nudE ADP-ribose diphosphatase 14 17 49176450 nudC NADH pyrophosphatase 14 16 16129591 nth DNA glycosylase and apyrimidinic (AP) lyase (endonuclease III) 14 25 16131900 nrfE heme lyase (NrfEFG) for insertion of heme into c552, subunit NrfE 14 19 16131898 nrfC formate-dependent nitrite reductase, 4Fe4S subunit 14 10 16132059 nrdG anaerobic ribonucleotide reductase activating protein 14 6 16130590 nrdF ribonucleotide-diphosphate reductase beta subunit 14 13 16130589 nrdE ribonucleotide-diphosphate reductase alpha subunit 14 25 16132060 nrdD anaerobic ribonucleoside triphosphate reductase 14 5 16130170 nrdB ribonucleotide-diphosphate reductase beta subunit 14 24 16130169 nrdA ribonucleotide-diphosphate reductase alpha subunit 14 25 16131096 npr phosphohistidinoprotein-hexose phosphotransferase component of 14 10 16130488 acpS 4'-phosphopantetheinyl transferase 14 6 16128536 nmpC DLP12 prophage; truncated outer membrane porin (pseudogene) 14 1 16131055 nlpI hypothetical protein b3163 14 8 16130649 nlpD predicted outer membrane lipoprotein 14 23 16129664 nlpC predicted lipoprotein 14 12 90111442 nlpB lipoprotein 14 3 16131531 nlpA cytoplasmic membrane lipoprotein-28 14 16 90111575 nirC nitrite transporter 14 7 16131244 nirB nitrite reductase, large subunit, NAD(P)H-binding 14 12 16131352 
nikE nickel transporter subunit 14 25 16131351 nikD nickel transporter subunit 14 25 16131350 nikC nickel transporter subunit 14 20 16131349 nikB nickel transporter subunit 14 20 16131348 nikA nickel transporter subunit 14 16 16128014 nhaR DNA-binding transcriptional activator 14 7 16128013 nhaA pH-dependent sodium/proton antiporter 14 13 16130097 nfo endonuclease IV 14 2 16129608 nemA N-ethylmaleimide reductase, FMN-linked 14 18 16130443 ndk nucleoside diphosphate kinase 14 24 16129072 ndh respiratory NADH dehydrogenase 2/cupric reductase 14 19 16129185 narX sensory histidine kinase in two-component regulatory system with 14 15 16130394 narQ sensory histidine kinase in two-component regulatory system with 14 10 16130130 narP DNA-binding response regulator in two-component regulatory system 14 24 16129184 narL DNA-binding response regulator in two-component regulatory system 14 22 16130143 napA nitrate reductase, periplasmic, large subunit 14 17 49176329 nanR transcriptional regulator NanR 14 10 16131115 nanA N-acetylneuraminate lyase 14 24 16129070 nagZ beta-hexosaminidase 14 23 16128655 nagE fused N-acetyl glucosamine specific PTS enzyme: IIC, IIB , and IIA 14 8 16128652 nagC DNA-binding transcriptional dual regulator, repressor of 14 10 16128654 nagB glucosamine-6-phosphate deaminase 14 6 16128653 nagA N-acetylglucosamine-6-phosphate deacetylase 14 12 16128622 nadD nicotinic acid mononucleotide adenyltransferase 14 15 16128102 nadC nicotinate-nucleotide pyrophosphorylase 14 20 16130499 nadB L-aspartate oxidase 14 24

16129930 nac DNA-binding transcriptional dual regulator of nitrogen assimilation 14 23 16129032 mviN predicted inner membrane protein 14 25 16130862 mutY adenine DNA glycosylase 14 23 16128092 mutT nucleoside triphosphate pyrophosphohydrolase, marked preference for 14 23 16130640 mutS DNA mismatch repair protein 14 24 16131506 mutM formamidopyrimidine-DNA glycosylase 14 22 16131992 mutL DNA mismatch repair protein 14 24 16130735 mutH DNA mismatch repair protein 14 11 90111671 murI glutamate racemase 14 20 16128083 murG N-acetylglucosaminyl transferase 14 25 16128079 murF UDP-N-acetylmuramoyl-tripeptide:D-alanyl-D-alanine ligase 14 25 16128078 murE UDP-N-acetylmuramoylalanyl-D-glutamate--2,6-diaminopimelate ligase 14 25 16128081 murD UDP-N-acetylmuramoyl-L-alanyl-D-glutamate synthetase 14 25 16128084 murC UDP-N-acetylmuramate--L-alanine ligase 14 25 16131806 murB UDP-N-acetylenolpyruvoylglucosamine reductase 14 24 16131079 murA UDP-N-acetylglucosamine 1-carboxyvinyltransferase 14 25 16128889 mukF condensin subunit F 14 4 16128890 mukE condensin subunit E 14 4 16128891 mukB cell division protein MukB 14 4 16131053 mtr tryptophan transporter of high affinity 14 10 16131470 mtlA fused mannitol-specific PTS enzymes: IIA components/IIB 14 11 16131098 mtgA monofunctional biosynthetic peptidoglycan transglycosylase 14 24 16132041 msrA methionine sulfoxide reductase A 14 21 16130825 mscS mechanosensitive channel 14 23 16129808 msbB lipid A biosynthesis (KDO)2-(lauroyl)-lipid IVA acyltransferase 14 19 16128881 msbA fused lipid transporter subunits of ABC superfamily: membrane 14 25 90111388 mrp antiporter inner membrane protein 14 23 16131137 mreD cell wall structural complex MreBCD transmembrane component MreD 14 18 16131138 mreC cell wall structural complex MreBCD transmembrane component MreC 14 21 90111564 mreB cell wall structural complex MreBCD, actin-like component MreB 14 23 16128617 mrdB cell wall shape-determining protein 14 25 16128618 mrdA transpeptidase 
involved in peptidoglycan synthesis 14 25 16128142 mrcB penicillin-binding protein 1b 14 24 90111584 mrcA fused penicillin-binding protein 1a: murein transglycosylase/murein 14 24 16128080 mraY phospho-N-acetylmuramoyl-pentapeptide-transferase 14 25 16128075 mraW S-adenosyl-methyltransferase 14 25 90111250 mppA murein tripeptide (L-ala-gamma-D-glutamyl-meso-DAP) transporter 14 16 16132055 mpl UDP-N-acetylmuramate:L-alanyl-gamma-D-glutamyl-meso-diaminopimelate 14 25 16128794 moeB molybdopterin biosynthesis protein MoeB 14 22 16128795 moeA molybdopterin biosynthesis protein 14 17 16128728 modF fused molybdate transporter subunits of ABC superfamily: 14 25 16128733 modC molybdate transporter subunit 14 24 16128732 modB molybdate ABC transporter permease protein 14 18 16128731 modA molybdate transporter subunit 14 16 16131697 mobB molybdopterin-guanine dinucleotide biosynthesis protein B 14 6 16128753 moaE molybdopterin synthase, large subunit 14 17 16128752 moaD molybdopterin synthase, small subunit 14 11 16128751 moaC molybdenum cofactor biosynthesis protein C 14 17 16128749 moaA molybdenum cofactor biosynthesis protein A 14 17 16128705 mngR DNA-binding transcriptional dual regulator, fatty-acyl-binding 14 12 16128706 mngA fused 2-O-a-mannosyl-D-glycerate specific PTS enzymes: IIA 14 9 16128198 mltD predicted membrane-bound lytic murein transglycosylase D 14 18 90111520 mltC membrane-bound lytic murein transglycosylase C 14 12 16130720 mltA membrane-bound lytic murein transglycosylase A 14 11 16131610 mioC flavodoxin 14 15 16129137 minE cell division topological specificity factor MinE 14 15 16129138 minD membrane ATPase of the MinC-MinD-MinE system 14 19 16129139 minC septum formation inhibitor 14 20

271 16128644 miaB isopentenyl-adenosine A37 tRNA methylthiolase 14 24 16131993 miaA tRNA delta(2)-isopentenylpyrophosphate transferase 14 25 16132064 mgtA magnesium transporter 14 22 90111195 mgsA methylglyoxal synthase 14 6 16130086 mglC beta-methylgalactoside transporter inner membrane component 14 7 16130088 mglB methyl-galactoside transporter subunit 14 5 16130087 mglA fused methyl-galactoside transporter subunits of ABC superfamily: 14 25 16129077 mfd transcription-repair coupling factor 14 25 16131677 metR DNA-binding transcriptional activator, homocysteine-binding 14 19 16128190 metQ DL-methionine transporter subunit 14 16 16128192 metN DL-methionine transporter subunit 14 25 16131778 metL bifunctional II/homoserine dehydrogenase II 14 24 16130843 metK S-adenosylmethionine synthetase 14 24 16131776 metJ transcriptional repressor protein MetJ 14 8 16128191 metI DL-methionine transporter subunit 14 16 16131845 metH B12-dependent methionine synthase 14 16 16130052 metG methionyl-tRNA synthetase 14 25 16131779 metF 5,10-methylenetetrahydrofolate reductase 14 18 16131678 metE 5-methyltetrahydropteroyltriglutamate--homocysteine 14 19 16130906 metC cystathionine beta-lyase 14 21 16131777 metB cystathionine gamma-synthase 14 21 16131767 menG ribonuclease activity regulator protein RraA 14 18 90111411 menF menaquinone-specific isochorismate synthase 14 24 16130195 menE O-succinylbenzoic acid--CoA ligase 14 22 16130199 menD 2-succinyl-6-hydroxy-2,4-cyclohexadiene-1-carboxylate synthase 14 5 16130196 menC O-succinylbenzoate synthase 14 5 16130197 menB naphthoate synthase 14 20 16131768 menA 1,4-dihydroxy-2-naphthoate octaprenyltransferase 14 5 16131945 melA alpha-galactosidase, NAD(P)-binding 14 0 16131578 mdtL multidrug efflux system protein 14 23 16131386 mdtF multidrug transporter, RpoS-dependent 14 23 16131385 mdtE multidrug resistance efflux transporter 14 23 16130017 mdtD multidrug efflux system protein 14 18 16130016 mdtC multidrug efflux system, subunit C 14 23 
16130015 mdtB multidrug efflux system, subunit B 14 23 90111381 mdtA multidrug efflux system, subunit A 14 23 16128434 mdlB fused predicted multidrug transporter subunits of ABC superfamily: 14 25 16128433 mdlA fused predicted multidrug transporter subunits of ABC superfamily: 14 25 16131126 mdh malate dehydrogenase 14 12 16130688 mazG nucleoside triphosphate pyrophosphohydrolase 14 19 16129488 marC predicted transporter 14 18 16128161 map methionine aminopeptidase 14 25 16129348 maoC fused aldehyde dehydrogenase/enoyl-CoA hydratase 14 22 49176125 manA mannose-6-phosphate isomerase 14 3 16128388 malZ maltodextrin glucosidase 14 13 16129579 malX fused maltose and glucose-specific PTS enzymes: IIB component -! 14 5 16131442 malS periplasmic alpha-amylase precursor 14 7 16131292 malQ 4-alpha-glucanotransferase (amylomaltase) 14 14 49176351 malP maltodextrin phosphorylase 14 12 16131861 malK fused maltose transport subunit, ATP-binding component of ABC 14 25 16129578 malI DNA-binding transcriptional repressor 14 13 16131136 maf Maf-like protein 14 19 90111211 maf Maf-like protein 14 19 16130388 maeB malic enzyme 14 24 16128847 macB fused macrolide transporter subunits of ABC superfamily: 14 24 90111182 macA macrolide transporter subunit, membrane fusion protein (MFP) 14 23 16131451 lyxK L-xylulose kinase 14 18 16131955 lysU lysine tRNA synthetase, inducible 14 25

272 16130792 lysS lysine tRNA synthetase, constitutive 14 25 16130743 lysR DNA-binding transcriptional dual regulator 14 19 16131850 lysC aspartate kinase III 14 24 16130742 lysA diaminopimelate decarboxylase, PLP-binding 14 23 49176118 lsrK autoinducer-2 (AI-2) kinase 14 17 16129474 lsrD AI2 transporter 14 5 16129473 lsrC AI2 transporter 14 7 16129472 lsrA fused AI2 transporter subunits of ABC superfamily: ATP-binding 14 25 16128021 lspA signal peptidase II 14 25 16128856 lrp DNA-binding transcriptional dual regulator, leucine-binding 14 19 16130224 lrhA DNA-binding transcriptional repressor of flagellar, motility and 14 9 16128882 lpxK tetraacyldisaccharide 4'-kinase 14 24 16128172 lpxD UDP-3-O-3-hydroxymyristoyl] glucosamine N-acyltransferase 14 24 16128089 lpxC UDP-3-O-3-hydroxymyristoyl] N-acetylglucosamine deacetylase 14 24 16128175 lpxB lipid-A-disaccharide synthase 14 24 16128174 lpxA UDP-N-acetylglucosamine acyltransferase 14 24 16128109 lpdA dihydrolipoamide dehydrogenase 14 25 16128208 lpcA phosphoheptose isomerase 14 17 16128424 lon DNA-binding ATP-dependent protease La 14 25 16129081 lolE outer membrane-specific lipoprotein transporter subunit 14 24 90111215 lolD outer membrane-specific lipoprotein transporter subunit 14 25 16129079 lolC outer membrane-specific lipoprotein transporter subunit 14 24 16129172 lolB outer membrane lipoprotein LolB precursor 14 16 16128858 lolA outer-membrane lipoprotein carrier protein precursor 14 23 16128640 lnt apolipoprotein N-acyltransferase 14 22 16131475 lldR DNA-binding transcriptional repressor 14 11 16131327 livG leucine/isoleucine/valine transporter subunit 14 25 90111594 livF leucine/isoleucine/valine transporter subunit 14 25 90111156 lipB lipoyltransferase 14 25 16128611 lipA lipoyl synthase 14 25 16130337 ligA NAD-dependent DNA ligase LigA 14 25 16130732 lgt prolipoprotein diacylglyceryl transferase 14 25 16131869 lexA LexA repressor 14 18 16128625 leuS leucyl-tRNA synthetase 14 25 90111083 leuO leucine 
transcriptional activator 14 14 16128065 leuD isopropylmalate isomerase small subunit 14 19 16128066 leuC isopropylmalate isomerase large subunit 14 23 90111082 leuB 3-isopropylmalate dehydrogenase 14 22 16128068 leuA 2-isopropylmalate synthase 14 21 16130493 lepB leader peptidase (signal peptidase I) 14 25 16130494 lepA GTP-binding protein LepA 14 25 16129341 ldhA D-lactate dehydrogenase 14 23 16128179 ldcC lysine decarboxylase 2, constitutive 14 8 49176012 lacI lac repressor 14 14 16128045 ksgA dimethyladenosine transferase 14 25 16128041 kefC glutathione-regulated potassium-efflux system protein 14 22 16131229 kefB glutathione-regulated potassium-efflux system protein 14 22 16128449 kefA fused conserved protein 14 24 16130746 kduD 2-deoxy-D-gluconate 3-dehydrogenase 14 25 16131504 kdtA 3-deoxy-D-manno-octulosonic-acid transferase 14 24 16131087 kdsD D-arabinose 5-phosphate isomerase 14 24 16128885 kdsB 3-deoxy-manno-octulosonate cytidylyltransferase 14 24 16129178 kdsA 2-dehydro-3-deoxyphosphooctonate aldolase 14 24 16128670 kdpE DNA-binding response regulator in two-component regulatory system 14 24 16128671 kdpD fused sensory histidine kinase in two-component regulatory system 14 21 16128673 kdpB potassium-transporting ATPase subunit B 14 22 90111606 kdgK ketodeoxygluconokinase 14 13 16131488 kbl 2-amino-3-ketobutyrate coenzyme A ligase 14 24

273 16131029 kbaY tagatose 6-phosphate aldolase 1, kbaY subunit 14 22 16129215 ispZ intracellular septation protein A 14 21 16128167 ispU undecaprenyl pyrophosphate synthase 14 25 16128023 ispH 4-hydroxy-3-methylbut-2-enyl diphosphate reductase 14 22 16130440 ispG 4-hydroxy-3-methylbut-2-en-1-yl diphosphate synthase 14 21 16130653 ispF 2-C-methyl-D-erythritol 2,4-cyclodiphosphate synthase 14 21 16129171 ispE 4-diphosphocytidyl-2-C-methyl-D-erythritol kinase 14 21 16130654 ispD 2-C-methyl-D-erythritol 4-phosphate cytidylyltransferase 14 21 16128166 ispC 1-deoxy-D-xylulose 5-phosphate reductoisomerase 14 21 16131077 ispB octaprenyl diphosphate synthase 14 24 16128406 ispA geranyltranstransferase 14 24 16130454 iscU scaffold protein 14 18 49176235 iscS cysteine desulfurase 14 25 16130456 iscR DNA-binding transcriptional repressor 14 19 16130453 iscA iron-sulfur cluster assembly protein 14 25 16130367 intZ CPZ-55 prophage; predicted integrase 14 14 16130281 intS CPS-53 (KpLE1) prophage; predicted prophage CPS-53 integrase 14 14 16132092 intB KpLE2 phage-like element; predicted integrase 14 15 16130540 intA CP4-57 prophage; integrase 14 15 16131060 infB translation initiation factor IF-2 14 25 16128851 infA translation initiation factor IF-1 14 25 16128048 imp organic solvent tolerance protein precursor 14 23 16131631 ilvY DNA-binding transcriptional dual regulator 14 23 90111084 ilvI acetolactate synthase III large subunit 14 21 16128071 ilvH acetolactate synthase small subunit 14 17 49176403 ilvE branched-chain amino acid aminotransferase 14 22 49176404 ilvD dihydroxy-acid dehydratase 14 21 16131632 ilvC ketol-acid reductoisomerase 14 19 16131541 ilvB acetolactate synthase large subunit 14 22 16131630 ilvA threonine dehydratase 14 19 16128020 ileS isoleucyl-tRNA synthetase 14 25 16132086 idnR DNA-binding transcriptional repressor, 5-gluconate-binding 14 14 16132088 idnO gluconate 5-dehydrogenase 14 25 16132089 idnD L-idonate 5-dehydrogenase, NAD-binding 14 19 16129099 
icdA isocitrate dehydrogenase 14 22 90111447 hyfR DNA-binding transcriptional activator, formate sensing 14 21 16130620 hydN formate dehydrogenase-H, 4Fe-4S] ferredoxin subunit 14 9 16130896 hybA hydrogenase 2 4Fe-4S ferredoxin-type component 14 10 16128425 hupB HU, DNA-binding transcriptional regulator, beta subunit 14 25 16131830 hupA HU, DNA-binding transcriptional regulator, alpha subunit 14 25 16130951 htrG predicted signal transduction protein (SH3 domain) 14 10 16129017 htrB lipid A biosynthesis lauroyl acyltransferase 14 20 16129783 htpX heat shock protein HtpX 14 22 16128457 htpG heat shock protein 90 14 24 16131622 hsrA predicted multidrug or homocysteine efflux system 14 18 16131770 hslV ATP-dependent protease peptidase subunit 14 22 16131769 hslU ATP-dependent protease ATP-binding subunit 14 25 16131277 hslR ribosome-associated heat shock protein Hsp15 14 21 90111586 hslO Hsp33-like chaperonin 14 21 16128633 hscC Hsp70 family chaperone Hsc62, binds to RpoD and inhibits 14 25 16130452 hscB co-chaperone HscB 14 16 16130451 hscA chaperone protein HscA 14 25 90111092 hrpB predicted ATP-dependent helicase 14 20 49176106 hrpA ATP-dependent helicase 14 20 90111088 hpt hypoxanthine-guanine phosphoribosyltransferase 14 16 16132190 holD DNA polymerase III subunit psi 14 3 16132081 holC DNA polymerase III subunit chi 14 13 16129062 holB DNA polymerase III subunit delta' 14 20

16128623 holA DNA polymerase III subunit delta 14 23
16131268 hofQ predicted fimbrial transporter 14 23
16128099 hofC assembly protein in type IV pilin biogenesis, transmembrane protein 14 23
16128100 hofB conserved protein with nucleoside triphosphate hydrolase domain 14 23
16129198 hns global DNA-binding transcriptional dual regulator H-NS 14 6
16130477 hmp fused nitric oxide dioxygenase/dihydropteridine reductase 2 14 15
16128171 hlpA periplasmic chaperone 14 12
16130439 hisS histidyl-tRNA synthetase 14 25
16130243 hisQ histidine/lysine/arginine/ornithine transporter subunit 14 14
16130241 hisP histidine/lysine/arginine/ornithine transporter subunit 14 25
16130242 hisM histidine/lysine/arginine/ornithine transporter subunit 14 14
16130244 hisJ histidine/lysine/arginine/ornithine transporter subunit 14 15
16129967 hisI bifunctional phosphoribosyl-AMP cyclohydrolase/phosphoribosyl-ATP 14 21
16129964 hisH imidazole glycerol phosphate synthase subunit HisH 14 21
16129960 hisG ATP phosphoribosyltransferase 14 21
16129966 hisF imidazole glycerol phosphate synthase subunit HisF 14 21
16129961 hisD histidinol dehydrogenase 14 21
16129962 hisC histidinol-phosphate aminotransferase 14 22
90111373 hisB imidazole glycerol-phosphate dehydratase/histidinol phosphatase 14 23
90111374 hisA 1-(5-phosphoribosyl)-5-[(5-phosphoribosylamino)methylideneamino] 14 21
49176077 hinT purine nucleoside phosphoramidase 14 25
16128879 himD integration host factor subunit beta 14 24
16129668 himA integration host factor subunit alpha 14 25
16131994 hfq RNA-binding protein Hfq 14 23
16131995 hflX predicted GTPase 14 24
16131996 hflK modulator for HflB protease specific for phage lambda cII repressor 14 25
16131997 hflC modulator for HflB protease specific for phage lambda cII repressor 14 21
16128053 hepA ATP-dependent helicase HepA 14 19
16131654 hemY predicted protoheme IX synthesis protein 14 16
16131655 hemX predicted uroporphyrinogen III methylase 14 16
90111657 hemN coproporphyrinogen III oxidase 14 24
16128147 hemL glutamate-1-semialdehyde aminotransferase 14 25
16129175 hemK N5-glutamine S-adenosyl-L-methionine-dependent methyltransferase 14 25
16128459 hemH ferrochelatase 14 23
16131696 hemG protoporphyrin oxidase, flavoprotein 14 4
16130361 hemF coproporphyrinogen III oxidase 14 20
16131827 hemE uroporphyrinogen decarboxylase 14 22
16131656 hemD uroporphyrinogen-III synthetase 14 17
49176416 hemC porphobilinogen deaminase 14 22
90111123 hemB delta-aminolevulinic acid dehydratase 14 22
16129173 hemA glutamyl-tRNA reductase 14 22
16128929 helD DNA helicase IV 14 24
16129577 hdhA 7-alpha-hydroxysteroid dehydrogenase 14 25
49176402 hdfR transcriptional regulator HdfR 14 16
16130421 hda DNA replication initiation factor 14 22
16130461 hcaT predicted 3-phenylpropionic transporter 14 14
16130462 hcaR DNA-binding transcriptional activator of 3-phenylpropionic acid 14 23
16130467 hcaD phenylpropionate dioxygenase, ferredoxin reductase subunit 14 12
16130466 hcaB 2,3-dihydroxy-2,3-dihydrophenylpropionate dehydrogenase 14 18
49176395 gyrB DNA gyrase subunit B 14 25
16130166 gyrA DNA gyrase subunit A 14 25
90111480 gutQ predicted phosphosugar-binding protein 14 24
16130696 gudP predicted D-glucarate transporter 14 16
16128097 guaC guanosine 5'-monophosphate oxidoreductase 14 25
16130433 guaB inositol-5-monophosphate dehydrogenase 14 25
16130432 guaA bifunctional GMP synthase/glutamine amidotransferase protein 14 24
16131206 gspF general secretory pathway component, cryptic 14 23
16131205 gspE general secretory pathway component, cryptic 14 23

90111569 gspD general secretory pathway component, cryptic 14 23
16130848 gshB glutathione synthetase 14 20
16130600 gshA glutamate--cysteine ligase 14 18
16128817 grxA glutaredoxin 1, redox coenzyme for ribonucleotide reductase (RNR1a) 14 8
16130533 grpE heat shock protein 14 25
16131967 groES co-chaperonin GroES 14 25
16131968 groEL chaperonin GroEL 14 25
90111587 greB transcription elongation factor GreB 14 25
90111554 greA transcription elongation factor GreA 14 25
16128224 gpt xanthine phosphoribosyltransferase 14 6
16131479 gpsA NAD(P)H-dependent glycerol-3-phosphate dehydrogenase 14 24
49176408 gpp guanosine pentaphosphatase/exopolyphosphatase 14 21
16131263 gph phosphoglycolate phosphatase 14 22
16131372 gor glutathione reductase 14 25
16131290 gntY predicted gluconate transport associated protein 14 19
90111589 gntX gluconate periplasmic binding protein with 14 18
49176350 gntT gluconate transporter, high-affinity GNT I system 14 11
49176356 gntR DNA-binding transcriptional repressor 14 14
16132142 gntP fructuronate transporter 14 11
16129970 gnd 6-phosphogluconate dehydrogenase 14 18
16131519 gmk 14 25
16128193 gmhB hypothetical protein b0200 14 20
16131430 glyS glycyl-tRNA synthetase subunit beta 14 25
16131431 glyQ glycyl-tRNA synthetase subunit alpha 14 25
16130476 glyA serine hydroxymethyltransferase 14 24
16128493 glxR tartronate semialdehyde reductase, NADH-dependent 14 21
16128498 glxK glycerate kinase II 14 14
90111635 glvC arbutin specific enzyme IIC component of PTS 14 5
49176386 glvB arbutin specific enzyme IIB component of PTS 14 5
16130330 gltX glutamyl-tRNA synthetase 14 25
16131903 gltP glutamate/aspartate:proton symporter 14 22
16128635 gltL glutamate and aspartate transporter subunit 14 25
16128636 gltK glutamate and aspartate transporter subunit 14 15
16128637 gltJ glutamate and aspartate transporter subunit 14 15
16128638 gltI glutamate and aspartate transporter subunit 14 9
16131103 gltD glutamate synthase, 4Fe-4S protein, small subunit 14 17
16131102 gltB glutamate synthase, large subunit 14 18
16128695 gltA citrate synthase 14 21
16130175 glpT sn-glycerol-3-phosphate transporter 14 12
16131297 glpR DNA-binding transcriptional repressor 14 13
16131764 glpK 14 17
49176353 glpG predicted intramembrane serine protease 14 11
16131299 glpE thiosulfate sulfurtransferase 14 14
16131300 glpD sn-glycerol-3-phosphate dehydrogenase, aerobic, FAD/NAD(P)-binding 14 16
16130176 glpA sn-glycerol-3-phosphate dehydrogenase (anaerobic), large subunit, 14 15
16128199 gloB predicted hydroxyacylglutathione hydrolase 14 25
16129609 gloA glyoxalase I, Ni-dependent 14 22
16128656 glnS glutaminyl-tRNA synthetase 14 24
16128777 glnQ glutamine ABC transporter ATP-binding protein 14 25
16128778 glnP glutamine ABC transporter permease protein 14 15
16131709 glnL sensory histidine kinase in two-component regulatory system with 14 19
16128435 glnK nitrogen assimilation regulatory protein for GlnL, GlnE, and AmtB 14 21
16128779 glnH glutamine ABC transporter periplasmic protein 14 15
16131708 glnG fused DNA-binding response regulator in two-component regulatory 14 24
16130949 glnE fused deadenylyltransferase/adenylyltransferase for glutamine 14 21
16128160 glnD PII uridylyl-transferase 14 20
16130478 glnB regulatory protein P-II for glutamine synthetase 14 21
16131710 glnA glutamine synthetase 14 22

16131598 glmU bifunctional N-acetylglucosamine-1-phosphate 14 25
16131597 glmS D-fructose-6-phosphate amidotransferase 14 25
16131066 glmM phosphoglucosamine mutase 14 25
16131305 glgX glycogen debranching enzyme 14 15
16131302 glgP glycogen phosphorylase 14 12
16130879 glcD glycolate oxidase subunit, FAD-linked 14 20
16130880 glcC DNA-binding transcriptional dual regulator, glycolate-binding 14 12
16131608 gidB glucose-inhibited division protein B 14 23
16131609 gidA glucose-inhibited division protein A 14 25
90111095 geneA CDP-diglyceride synthase 14 24
16130806 gcvH glycine cleavage system protein H 14 19
16130715 gcvA DNA-binding transcriptional dual regulator 14 23
16128491 gcl glyoxylate carboligase 14 21
90111384 gatY D-tagatose 1,6-bisphosphate aldolase 2, catalytic subunit 14 22
16130029 gatD galactitol-1-phosphate dehydrogenase, Zn-dependent and 14 19
90111545 garR tartronate semialdehyde reductase 14 21
16131019 garP predicted (D)-galactarate transporter 14 15
49176317 garK glycerate kinase I 14 14
16129733 gapA glyceraldehyde-3-phosphate dehydrogenase 14 25
16129197 galU glucose-1-phosphate uridylyltransferase 14 24
16128726 galT galactose-1-phosphate uridylyltransferase 14 5
16130089 galS DNA-binding transcriptional repressor 14 14
16130741 galR DNA-binding transcriptional repressor 14 13
16128724 galM galactose-1-epimerase (mutarotase) 14 8
16128725 galK 14 6
16129982 galF predicted subunit with GalU 14 24
49176045 galE UDP-galactose-4-epimerase 14 23
16130576 gabT 4-aminobutyrate aminotransferase 14 25
16130575 gabD succinate-semialdehyde dehydrogenase I, NADP-dependent 14 22
90111691 fxsA inner membrane protein 14 0
16131219 fusA elongation factor EF-2 14 25
16128659 fur ferric uptake regulator 14 23
16129569 fumC fumarate hydratase 14 23
16130712 fucR DNA-binding transcriptional activator 14 11
16130706 fucO L-1,2-propanediol oxidoreductase 14 15
16130710 fucK L-fuculokinase 14 18
16131336 ftsY fused Signal Recognition Particle (SRP) receptor: membrane binding 14 25
16131334 ftsX predicted transporter subunit: membrane component of ABC 14 17
16128082 ftsW integral membrane protein involved in stabilizing FtsZ ring during 14 25
16128086 ftsQ membrane anchored protein involved in growth of wall at septum 14 23
16131771 ftsN essential cell division protein 14 0
16128076 ftsL membrane bound cell division protein at septum containing leucine 14 4
16128857 ftsK DNA-binding membrane protein required for chromosome resolution and 14 23
16128077 ftsI transpeptidase involved in septal peptidoglycan synthesis 14 25
16131068 ftsH protease, ATP-dependent zinc-metallo 14 25
16131335 ftsE predicted transporter subunit: ATP-binding component of ABC 14 25
16130655 ftsB cell division protein FtsB 14 20
16128087 ftsA cell division protein 14 25
16129855 ftn ferritin iron storage protein (cytoplasmic) 14 9
16131791 frwD predicted enzyme IIB component of PTS 14 10
16131787 frwC predicted enzyme IIC component of PTS 14 9
16131788 frwB predicted enzyme IIB component of PTS 14 10
90111664 frvB fused predicted PTS enzymes: IIB component/IIC component 14 10
16128073 fruR DNA-binding transcriptional dual regulator 14 12
16130106 fruK 1-phosphofructokinase 14 10
16130107 fruB fused fructose-specific PTS enzymes: IIA component/HPr component 14 11
16130105 fruA fused fructose-specific PTS enzymes: IIB component/IIC components 14 10
16128165 frr ribosome releasing factor 14 25

16128341 frmA alcohol dehydrogenase class III/glutathione-dependent formaldehyde 14 20
90111578 frlR predicted DNA-binding transcriptional regulator 14 12
90111576 frlA predicted fructoselysine transporter 14 21
16131690 fre NAD(P)H-flavin reductase 14 12
16131976 frdD fumarate reductase subunit D 14 4
16131977 frdC fumarate reductase subunit C 14 4
16131978 frdB fumarate reductase (anaerobic), Fe-S subunit 14 24
16131979 frdA fumarate reductase 14 24
90111553 folP 7,8-dihydropteroate synthase 14 23
16128135 folK 2-amino-4-hydroxy-6-hydroxymethyldihydropteridine pyrophosphokinase 14 24
16130091 folE GTP cyclohydrolase I 14 22
16128513 folD bifunctional 5,10-methylene-tetrahydrofolate dehydrogenase/ 14 24
16130250 folC bifunctional folylpolyglutamate synthase/dihydrofolate synthase 14 25
90111533 folB bifunctional dihydroneopterin aldolase/dihydroneopterin 14 21
16128042 folA dihydrofolate reductase 14 25
16130417 focB predicted formate transporter 14 7
16128871 focA formate transporter 14 7
16129295 fnr DNA-binding transcriptional dual regulator, global regulator of 14 16
16131167 fmt methionyl-tRNA formyltransferase 14 25
16129867 fliY cystine transporter subunit 14 19
16129872 fliS flagellar protein FliS 14 14
16129897 fliR flagellar biosynthesis protein R 14 15
16129896 fliQ flagellar biosynthesis protein Q 14 8
90111358 fliO flagellar biosynthesis protein 14 0
16129893 fliN flagellar motor switch protein 14 15
16129892 fliM flagellar motor switch protein M 14 15
16129888 fliI flagellum-specific ATP synthase 14 25
16129884 fliE flagellar hook-basal body protein FliE 14 8
16129871 fliD flagellar capping protein 14 14
16129870 fliC flagellin 14 14
49176170 fliA flagellar biosynthesis sigma factor 14 25
16129832 flhB flagellar biosynthesis protein B 14 15
16129831 flhA flagellar biosynthesis protein A 14 15
49176076 flgI flagellar P-ring protein precursor I 14 15
16129042 flgH flagellar L-ring protein precursor H 14 15
16129041 flgG flagellar component of cell-distal portion of basal-body rod 14 15
16129040 flgF flagellar component of cell-proximal portion of basal-body rod 14 15
16129039 flgE flagellar hook protein E 14 15
16129038 flgD flagellar basal body rod modification protein D 14 14
16129036 flgB flagellar basal-body rod protein B 14 15
16130797 fldB flavodoxin 2 14 12
16128660 fldA flavodoxin 1 14 12
16128022 fkpB FKBP-type peptidyl-prolyl cis-trans isomerase (rotamase) 14 21
16131226 fkpA FKBP-type peptidyl-prolyl cis-trans isomerase (rotamase) 14 24
90111705 fklB FKBP-type peptidyl-prolyl cis-trans isomerase (rotamase) 14 24
16128773 fiu predicted iron outer membrane transporter 14 14
16131149 fis DNA-binding protein Fis 14 22
16128519 fimZ predicted DNA-binding transcriptional regulator 14 22
16132134 fimE tyrosine recombinase/inversion of on/off regulator of fimA 14 20
16132133 fimB tyrosine recombinase/inversion of on/off regulator of fimA 14 20
16129065 fhuE ferric-rhodotorulic acid outer membrane transporter 14 15
16128144 fhuC iron-hydroxamate transporter subunit 14 25
16128146 fhuB fused iron-hydroxamate transporter subunits of ABC superfamily: 14 15
16128143 fhuA ferrichrome outer membrane transporter 14 15
16130638 fhlA DNA-binding transcriptional activator 14 21
90111101 fhiA flagellar system protein, promoterless fragment (pseudogene) 14 15
16130531 ffh Signal Recognition Particle (SRP) component with 4.5S RNA (ffs) 14 25
16128572 fepG iron-enterobactin transporter subunit 14 15

16128573 fepD iron-enterobactin transporter subunit 14 15
16128571 fepC iron-enterobactin transporter subunit 14 25
16128575 fepB iron-enterobactin transporter subunit 14 3
16128567 fepA iron-enterobactin outer membrane transporter 14 15
16132108 fecE KpLE2 phage-like element; iron-dicitrate transporter subunit 14 25
16132109 fecD KpLE2 phage-like element; iron-dicitrate transporter subunit 14 15
16132110 fecC KpLE2 phage-like element; iron-dicitrate transporter subunit 14 15
90111723 fecB KpLE2 phage-like element; iron-dicitrate transporter subunit 14 3
16132112 fecA KpLE2 phage-like element; ferric citrate outer membrane transporter 14 16
90111264 feaB phenylacetaldehyde dehydrogenase 14 22
16130450 fdx [2Fe-2S] ferredoxin 14 17
16131733 fdoH formate dehydrogenase-O, Fe-S subunit 14 9
16131734 fdoG formate dehydrogenase-O, large subunit 14 16
16129435 fdnI formate dehydrogenase-N, cytochrome B556 (gamma) subunit, 14 8
16129434 fdnH formate dehydrogenase-N, Fe-S (beta) subunit, nitrate-inducible 14 10
16129433 fdnG formate dehydrogenase-N, alpha subunit, nitrate-inducible 14 16
16131905 fdhF formate dehydrogenase-H, selenopolypeptide subunit 14 21
16131735 fdhD formate dehydrogenase accessory protein 14 16
16132054 fbp fructose-1,6-bisphosphatase 14 20
16130826 fbaA fructose-bisphosphate aldolase 14 22
16129150 fadR fatty acid metabolism regulator 14 8
16130277 fadL long-chain fatty acid outer membrane transporter 14 16
16130976 fadH 2,4-dienoyl-CoA reductase, NADH and FMN-linked 14 18
16129759 fadD acyl-CoA synthase 14 23
16131692 fadB fused 3-hydroxybutyryl-CoA 14 17
49176430 fadA acetyl-CoA acetyltransferase 14 18
16128173 fabZ (3R)-hydroxymyristoyl ACP dehydratase 14 24
16131801 fabR DNA-binding transcriptional repressor 14 17
16129054 fabH 3-oxoacyl-(acyl carrier protein) synthase 14 23
16129056 fabG 3-oxoacyl-[acyl-carrier-protein] reductase 14 25
16129058 fabF 3-oxoacyl-(acyl carrier protein) synthase 14 25
16129055 fabD acyl carrier protein S-malonyltransferase 14 25
16130258 fabB 3-oxoacyl-(acyl carrier protein) synthase 14 25
16128921 fabA 3-hydroxydecanoyl-ACP dehydratase 14 19
16130903 exbD membrane spanning protein in TonB-ExbB-ExbD complex 14 22
16130904 exbB membrane spanning protein in TonB-ExbB-ExbD complex 14 24
16130302 evgS hybrid sensory histidine kinase in two-component regulatory system 14 22
16130301 evgA DNA-binding response regulator in two-component regulatory system 14 15
16130383 eutI predicted phosphotransacetylase subunit 14 19
90111438 eutG predicted alcohol dehydrogenase in ethanolamine utilization 14 15
16130380 eutE predicted aldehyde dehydrogenase, ethanolamine utilization protein 14 5
16130491 era GTP-binding protein Era 14 25
90111611 eptB predicted metal dependent hydrolase 14 17
90111688 eptA predicted metal dependent hydrolase 14 17
16130828 epd D-erythrose 4-phosphate dehydrogenase 14 25
16131281 envZ osmolarity sensor protein 14 22
90111620 envC protease with a role in cell division 14 19
16128569 entF enterobactin synthase multienzyme complex component, ATP-dependent 14 21
16128577 entE 2,3-dihydroxybenzoate-AMP ligase component of enterobactin synthase 14 22
16128576 entC isochorismate synthase 14 23
16128579 entA 2,3-dihydroxybenzoate-2,3-dehydrogenase 14 25
16130686 eno phosphopyruvate hydratase 14 25
16130846 endA DNA-specific endonuclease I 14 7
90111231 emtA lytic murein endotransglycosylase E 14 9
16130299 emrY predicted multidrug efflux system 14 18
16130300 emrK EmrKY-TolC multidrug resistance efflux pump, membrane fusion 14 19
90111634 emrD multidrug efflux system protein 14 19
16130598 emrB multidrug efflux system protein 14 18

16130597 emrA multidrug efflux system 14 22
16131972 efp elongation factor P 14 25
16129804 edd phosphogluconate dehydratase 14 21
16129803 eda keto-hydroxyglutarate-aldolase/keto-deoxy-phosphogluconate aldolase 14 16
16130970 ebgR DNA-binding transcriptional repressor 14 14
16128405 dxs 1-deoxy-D-xylulose-5-phosphate synthase 14 21
16130078 dusC tRNA-dihydrouridine synthase C 14 23
16131148 dusB tRNA-dihydrouridine synthase B 14 23
16131727 dtd D-tyrosyl-tRNA deacylase 14 22
16130297 dsdX predicted transporter 14 11
16130296 dsdC DNA-binding transcriptional dual regulator 14 22
16131961 dsbD thiol:disulfide interchange protein precursor 14 21
16130795 dsbC protein disulfide isomerase II 14 20
49176085 dsbB disulfide bond formation protein B 14 9
16131701 dsbA periplasmic protein disulfide isomerase I 14 21
16131412 dppF dipeptide transporter 14 25
16131413 dppD dipeptide transporter 14 24
16131414 dppC dipeptide transporter 14 20
16131415 dppB dipeptide transporter 14 20
16131416 dppA dipeptide transporter 14 19
16128454 dnaX DNA polymerase III subunits gamma and tau 14 25
16128202 dnaQ DNA polymerase III subunit epsilon 14 25
16131569 dnaN DNA polymerase III subunit beta 14 25
16128008 dnaK molecular chaperone DnaK 14 25
16128009 dnaJ chaperone Hsp40, co-chaperone with DnaK 14 25
16130962 dnaG DNA 14 25
16128177 dnaE DNA polymerase III subunit alpha 14 25
16131878 dnaB replicative DNA helicase 14 25
16131570 dnaA chromosomal replication initiation protein 14 24
16128862 dmsB dimethyl sulfoxide reductase, anaerobic, subunit B 14 10
90111183 dmsA dimethyl sulfoxide reductase, anaerobic, subunit A 14 16
16128138 dksA DNA-binding transcriptional regulator of rRNA transcription, DnaK 14 23
16128049 djlA Dna-J like membrane chaperone protein 14 16
16128767 dinG ATP-dependent DNA helicase 14 18
90111234 dhaR predicted DNA-binding transcriptional regulator, dihydroxyacetone 14 21
16129552 dgsA DNA-binding transcriptional repressor 14 10
90111641 dgoT D-galactonate transporter 14 16
49176392 dgoR predicted DNA-binding transcriptional regulator 14 11
16131868 dgkA 14 19
90111624 dfp bifunctional phosphopantothenoylcysteine 14 24
16128808 deoR DNA-binding transcriptional repressor 14 11
16132201 deoD purine nucleoside phosphorylase 14 11
16132198 deoC deoxyribose-phosphate aldolase 14 12
16132200 deoB phosphopentomutase 14 7
16131125 degS serine endoprotease, periplasmic 14 24
16131124 degQ serine endoprotease, periplasmic 14 24
16128154 degP serine endoprotease (protease Do), membrane-associated 14 24
16131166 def peptide deformylase 14 25
90111417 dedD hypothetical protein b2314 14 4
90111550 deaD ATP-dependent RNA helicase 14 25
16129442 ddpF D-ala-D-ala transporter subunit 14 25
16129443 ddpD D-ala-D-ala transporter subunit 14 25
16129444 ddpC D-ala-D-ala transporter subunit 14 20
16129445 ddpB D-ala-D-ala transporter subunit 14 19
16129446 ddpA D-ala-D-ala transporter subunit 14 16
16128085 ddl D-alanylalanine synthetase 14 25
16128366 ddl D-alanylalanine synthetase 14 25
90111426 ddg lipid A biosynthesis palmitoleoyl acyltransferase 14 19

16131949 dcuB C4-dicarboxylate antiporter 14 5
16131963 dcuA C4-dicarboxylate antiporter 14 5
16131400 dctA C4-dicarboxylate transport protein 14 22
16129497 dcp dipeptidyl carboxypeptidase II 14 24
16129304 dbpA ATP-dependent RNA helicase, specific for 23S rRNA 14 25
90111650 dapF diaminopimelate epimerase 14 24
16130397 dapE succinyl-diaminopimelate desuccinylase 14 24
16128025 dapB dihydrodipicolinate reductase 14 23
16130403 dapA dihydrodipicolinate synthase 14 24
16131266 damX hypothetical protein b3388 14 1
16131265 dam DNA adenine methylase 14 10
90111369 dacD D-alanyl-D-alanine carboxypeptidase (penicillin-binding protein 6b) 14 24
16128807 dacC D-alanyl-D-alanine carboxypeptidase (penicillin-binding protein 6a) 14 24
16131072 dacB D-alanyl-D-alanine carboxypeptidase 14 13
16128615 dacA D-alanyl-D-alanine carboxypeptidase (penicillin-binding protein 5) 14 24
16131772 cytR DNA-binding transcriptional dual regulator 14 14
16130339 cysZ putative sulfate transport protein CysZ 14 14
90111430 cysW sulfate/thiosulfate transporter subunit 14 20
16130349 cysU sulfate/thiosulfate transporter subunit 14 18
16128510 cysS cysteinyl-tRNA synthetase 14 25
16132036 cysQ PAPS (adenosine 3'-phosphate 5'-phosphosulfate) 14 20
16130658 cysN sulfate adenylyltransferase subunit 1 14 25
16130347 cysM cysteine synthase B (O-acetylserine sulfhydrolase B) 14 23
16130340 cysK cysteine synthase A, O-acetylserine sulfhydrolase A subunit 14 23
16130671 cysJ sulfite reductase, alpha subunit, flavoprotein 14 16
16130670 cysI sulfite reductase, beta subunit, NAD(P)-binding, heme-binding 14 9
16131246 cysG fused siroheme synthase 1,3-dimethyluroporphyrinogen III 14 16
16131478 cysE serine acetyltransferase 14 20
16130659 cysD sulfate adenylyltransferase subunit 2 14 13
16130657 cysC adenylylsulfate kinase 14 12
16129236 cysB DNA-binding transcriptional dual regulator, 14 23
16130348 cysA sulfate/thiosulfate transporter subunit 14 25
16128324 cynT carbonic anhydrase 14 22
16128323 cynR DNA-binding transcriptional dual regulator 14 23
16128854 cydD fused cysteine transporter subunits of ABC superfamily: membrane 14 25
16128853 cydC fused cysteine transporter subunits of ABC superfamily: membrane 14 25
16128709 cydB cytochrome d terminal oxidase, subunit II 14 21
90111166 cydA cytochrome d terminal oxidase, subunit I 14 21
16131659 cyaY frataxin-like protein 14 11
16131658 cyaA adenylate cyclase 14 9
16130248 cvpA membrane protein required for colicin V production 14 20
90111351 cutC copper homeostasis protein 14 4
16128553 cusS sensory histidine kinase in two-component regulatory system with 14 23
16128554 cusR DNA-binding response regulator in two-component regulatory system 14 24
16128558 cusA copper/silver efflux system, membrane component 14 23
16128471 cueR DNA-binding transcriptional activator of copper-responsive regulon 14 15
16128581 cstA carbon starvation protein 14 16
16130603 csrA carbon storage regulator 14 23
16129511 cspI Qin prophage; cold shock protein 14 24
16128956 cspG DNA-binding transcriptional regulator 14 24
16128606 cspE cold shock protein E 14 24
16128848 cspD cold shock protein homolog 14 24
16129777 cspC stress protein, member of the CspA-family 14 24
16129516 cspB Qin prophage; cold shock protein 14 24
16131427 cspA major cold shock protein 14 24
16130717 csdA cysteine sulfinate desulfinase 14 25
16130343 crr glucose-specific PTS system enzyme IIA component 14 8
16131236 crp DNA-binding transcriptional dual regulator 14 15

16132216 creC sensory histidine kinase in two-component regulatory system with 14 23
16132215 creB DNA-binding response regulator in two-component regulatory system 14 24
16132214 creA hypothetical protein b4397 14 7
16131752 cpxR DNA-binding response regulator in two-component regulatory system 14 24
16131751 cpxA two-component sensor protein 14 23
16129988 cpsG phosphomannomutase 14 25
16132035 cpdB bifunctional 2',3'-cyclic nucleotide 14 9
16130928 cpdA cyclic 3',5'-adenosine monophosphate phosphodiesterase 14 17
16128468 copA copper transporter 14 22
90111136 cof thiamin pyrimidine pyrophosphate hydrolase 14 7
16129083 cobB NAD-dependent deacetylase 14 12
16128096 coaE dephospho-CoA kinase 14 24
16131505 coaD phosphopantetheine adenylyltransferase 14 24
16130835 cmtB predicted mannitol-specific enzyme IIA component of PTS 14 3
16128810 cmr multidrug efflux system protein 14 20
16128877 cmk cytidylate kinase 14 25
16129210 cls cardiolipin synthetase 14 18
16128423 clpX ATP-dependent protease ATP-binding subunit 14 25
16128849 clpS ATP-dependent Clp protease adaptor protein ClpS 14 17
16128422 clpP ATP-dependent Clp protease proteolytic subunit 14 25
16130513 clpB protein disaggregation chaperone 14 23
16128850 clpA ATPase and specificity subunit of ClpA-ClpP ATP-dependent serine 14 23
16128595 citT citrate:succinate antiporter 14 11
16130093 cirA ferric iron-catecholate outer membrane transporter 14 18
16129833 cheZ chemotaxis regulator, protein phosphatase for CheY 14 8
16129834 cheY chemotaxis regulator transmitting signal to flagellar motor 14 16
16129836 cheR chemotaxis regulator, protein-glutamate methyltransferase 14 13
16129835 cheB chemotaxis-specific methylesterase 14 16
16129692 chbB N,N'-diacetylchitobiose-specific enzyme IIB component of PTS 14 0
16129619 cfa cyclopropane fatty acyl phospholipid synthase 14 15
16130081 cdd cytidine deaminase 14 6
16130952 cca fused tRNA nucleotidyl transferase/2'3'-cyclic 14 25
16128966 cbpA curved DNA-binding protein, DnaJ homologue that functions as a 14 25
16129929 cbl DNA-binding transcriptional activator of cysteine biosynthesis 14 23
16128027 carB carbamoyl-phosphate synthase large subunit 14 25
16128026 carA carbamoyl-phosphate synthase small subunit 14 24
16128119 can carbonic anhydrase 14 17
90111080 caiE predicted acyl transferase 14 17
16128030 caiD carnitinyl-CoA dehydratase 14 20
49175993 caiC crotonobetaine/carnitine-CoA ligase 14 23
16131958 cadB predicted lysine/cadaverine transporter 14 17
16131957 cadA lysine decarboxylase 1 14 8
16129231 btuR cob(I)yrinic acid a,c-diamide adenosyltransferase 14 11
16129665 btuD vitamin B12-transporter ATPase 14 25
16129667 btuC vitamin B12-transporter permease 14 15
16131804 btuB vitamin B12/cobalamin outer membrane transporter 14 18
16128386 brnQ predicted branched chain amino acid transporter (LIV-II) 14 11
90111135 bolA regulator of penicillin binding proteins and beta lactamase 14 21
90111613 bisC biotin sulfoxide reductase 14 14
16131807 birA biotin--protein ligase 14 24
49176434 bipA GTP-binding protein 14 25
16131288 bioH carboxylesterase of pimeloyl-CoA synthesis 14 18
16128744 bioF 8-amino-7-oxononanoate synthase 14 24
16128746 bioD dethiobiotin synthetase 14 25
16128745 bioC predicted methyltransferase, enzyme of biotin synthesis 14 17
16128742 bioA adenosylmethionine--8-amino-7-oxononanoate transaminase 14 25
16131590 bglF fused beta-glucoside-specific PTS enzymes: IIA component/IIB 14 8
16131589 bglB cryptic phospho-beta-glucosidase B 14 2

16130803 bglA 6-phospho-beta-glucosidase A 14 2
16128297 betB betaine aldehyde dehydrogenase, NAD-dependent 14 22
16130120 bcr bicyclomycin/multidrug efflux system 14 20
16130405 bcp thioredoxin-dependent thiol peroxidase 14 22
16131938 basS sensory histidine kinase in two-component regulatory system with 14 23
16131939 basR DNA-binding response regulator in two-component regulatory system 14 24
16130693 barA hybrid sensory histidine kinase, in two-component regulatory system 14 23
16130018 baeS sensory histidine kinase in two-component regulatory system with 14 23
16130019 baeR DNA-binding response regulator in two-component regulatory system 14 24
16131603 atpH F0F1 ATP synthase subunit delta 14 25
16131601 atpG F0F1 ATP synthase subunit gamma 14 25
16131604 atpF F0F1 ATP synthase subunit B 14 25
16131605 atpE F0F1 ATP synthase subunit C 14 13
16131600 atpD F0F1 ATP synthase subunit beta 14 25
16131599 atpC F0F1 ATP synthase subunit epsilon 14 25
16131602 atpA F0F1 ATP synthase subunit alpha 14 25
16130156 atoS sensory histidine kinase in two-component regulatory system with 14 23
16130157 atoC fused response regulator of ato operon, in two-component system 14 24
16130161 atoB acetyl-CoA acetyltransferase 14 18
16129700 astD succinylglutamic semialdehyde dehydrogenase 14 22
16129702 astC succinylornithine transaminase, PLP-dependent 14 25
16129819 aspS aspartyl-tRNA synthetase 14 25
16128895 aspC aspartate aminotransferase, PLP-dependent 14 17
90111690 aspA aspartate ammonia-lyase 14 23
16128897 asnS asparaginyl-tRNA synthetase 14 20
16131611 asnC DNA-binding transcriptional dual regulator 14 16
16128650 asnB asparagine synthetase B 14 18
16130004 asmA predicted assembly protein 14 4
16131307 asd aspartate-semialdehyde dehydrogenase 14 25
16130621 ascG DNA-binding transcriptional repressor 14 14
49176263 ascF fused cellobiose/arbutin/salicin-specific PTS enzymes: IIB 14 3
16130623 ascB cryptic 6-phospho-beta-glucosidase 14 2
16128830 artQ arginine transporter subunit 14 14
16128832 artP arginine transporter subunit 14 25
16128829 artM arginine transporter subunit 14 14
16128828 artJ arginine transporter subunit 14 15
16128831 artI arginine transporter subunit 14 16
16131375 arsC arsenate reductase 14 20
16128373 aroL II 14 25
90111581 aroK shikimate kinase I 14 25
16129660 aroH 3-deoxy-D-arabino-heptulosonate-7-phosphate synthase, tryptophan 14 23
16128722 aroG 3-deoxy-D-arabino-heptulosonate-7-phosphate synthase, phenylalanine 14 23
16130522 aroF 3-deoxy-D-arabino-heptulosonate-7-phosphate synthase, 14 23
16131162 aroE dehydroshikimate reductase, NAD(P)-binding 14 23
16130264 aroC chorismate synthase 14 24
16131267 aroB 3-dehydroquinate synthase 14 24
16128875 aroA 3-phosphoshikimate 1-carboxyvinyltransferase 14 24
16130245 argT lysine/arginine/ornithine transporter subunit 14 16
16129828 argS arginyl-tRNA synthetase 14 25
16131127 argR arginine repressor 14 9
16132076 argI ornithine carbamoyltransferase 1 14 23
16131798 argH argininosuccinate lyase 14 22
16131063 argG argininosuccinate synthase 14 22
16128258 argF CP4-6 prophage; ornithine carbamoyltransferase 2, chain F 14 23
16131795 argE acetylornithine deacetylase 14 22
16131238 argD bifunctional acetylornithine aminotransferase/ 14 25
16131796 argC N-acetyl-gamma-glutamyl-phosphate reductase 14 19
90111669 argB acetylglutamate kinase 14 19

16130722 argA N-acetylglutamate synthase 14 18
49176326 arcB hybrid sensory histidine kinase in two-component regulatory system 14 23
16132218 arcA DNA-binding response regulator in two-component regulatory system 14 24
49176167 araH fused L-arabinose transporter subunits of ABC superfamily: membrane 14 7
16129850 araG fused L-arabinose transporter subunits of ABC superfamily: 14 25
16128453 apt adenine phosphoribosyltransferase 14 20
16128944 appC cytochrome bd-II oxidase, subunit I 14 21
16128945 appB cytochrome bd-II oxidase, subunit II 14 21
16128410 apbA 2-dehydropantoate 2-reductase 14 8
16128043 apaH diadenosinetetraphosphatase 14 25
16130858 ansB periplasmic L-asparaginase II 14 12
16129721 ansA cytoplasmic asparaginase I 14 16
16128418 ampG muropeptide transporter 14 22
16128103 ampD N-acetyl-anhydromuramyl-L-alanine amidase 14 22
90111492 amiC N-acetylmuramoyl-L-alanine amidase 14 21
16131991 amiB N-acetylmuramoyl-l-alanine amidase II 14 21
16130360 amiA N-acetylmuramoyl-l-alanine amidase I 14 21
49176311 alx predicted inner membrane protein, part of terminus 14 11
16131910 alsK D-allose kinase 14 7
16131911 alsE allulose-6-phosphate 3-epimerase 14 25
16131912 alsC D-allose transporter subunit 14 7
16131914 alsB D-allose transporter subunit 14 7
16131913 alsA fused D-allose transporter subunits of ABC superfamily: ATP-binding 14 25
16129153 alr alanine racemase 14 24
16131879 alr alanine racemase 14 24
90111619 aldB aldehyde dehydrogenase B 14 22
16129376 aldA aldehyde dehydrogenase A, NAD-linked 14 22
16130604 alaS alanyl-tRNA synthetase 14 25
90111152 ahpF alkyl hydroperoxide reductase, F52a subunit, FAD/NAD(P)-binding 14 25
16128588 ahpC alkyl hydroperoxide reductase, C22 subunit 14 22
16131023 agaR DNA-binding transcriptional dual regulator 14 13
49176319 agaA predicted truncated N-acetylgalactosamine-6-phosphate deacetylase 14 12
90111106 afuC CP4-6 prophage; predicted ferric transporter subunit 14 25
16130967 aer fused signal transducer for aerotaxis sensory component/methyl 14 17
16130393 aegA fused predicted oxidoreductase: FeS binding subunit/NAD/FAD-binding 14 20
16128458 adk 14 25
16131941 adiC arginine:agmatin 14 18
16131943 adiA biodegradative arginine decarboxylase 14 8
90111280 adhP alcohol dehydrogenase 14 20
16129202 adhE fused acetaldehyde-CoA dehydrogenase/iron-dependent alcohol 14 14
16130150 ada fused DNA-binding transcriptional dual 14 20
16131893 actP acetate permease 14 20
16131895 acs acetyl-coenzyme A synthetase 14 22
16131154 acrF multidrug efflux system protein 14 23
16131153 acrE cytoplasmic membrane lipoprotein 14 21
16130395 acrD aminoglycoside/multidrug efflux system 14 23
16128446 acrB multidrug efflux system protein 14 23
16128447 acrA multidrug efflux system 14 19
16129057 acpP acyl carrier protein 14 25
16128111 acnB aconitate hydratase 14 19
16129237 acnA aconitate hydratase 14 23
16130231 ackA acetate kinase 14 16
16128108 aceF dihydrolipoamide acetyltransferase 14 25
16128107 aceE pyruvate dehydrogenase subunit E1 14 24
16130251 accD acetyl-CoA carboxylase subunit beta 14 23
16131144 accC acetyl-CoA carboxylase 14 24
16131143 accB acetyl-CoA carboxylase 14 24
16128178 accA acetyl-CoA carboxylase subunit alpha 14 23

284 16129300 abgR predicted DNA-binding transcriptional regulator 14 23 16128852 aat leucyl/phenylalanyl-tRNA--protein transferase 14 17 16130740 aas 2-acyl-glycerophospho-ethanolamine acyltransferase 14 23 16131133 aaeR predicted DNA-binding transcriptional regulator, efflux system 14 22 16131131 aaeA p-hydroxybenzoic acid efflux system component 14 21 16129805 zwf glucose-6-phosphate 1-dehydrogenase 13 18 90111679 zur DNA-binding transcriptional repressor, Zn(II)-binding 13 16 16132044 ytfP hypothetical protein b4222 13 0 90111707 ytfK hypothetical protein b4217 13 0 90111704 ytfB predicted cell envelope opacity-associated protein 13 0 49176335 yrdD predicted DNA topoisomerase 13 12 16131088 yrbI 3-deoxy-D-manno-octulosonate 8-phosphate phosphatase 13 23 90111556 yrbB hypothetical protein b3191 13 0 16130990 yqjA conserved inner membrane protein 13 14 16130929 yqiB predicted dehydrogenase 13 8 90111526 yqhC predicted DNA-binding transcriptional regulator 13 11 16130801 yqfA predicted oxidoreductase, inner membrane subunit 13 13 16130601 yqaA conserved inner membrane protein 13 15 94541121 ypeB hypothetical protein b4546 13 7 90111435 ypeA putative acetyltransferase 13 4 16130319 ypdH predicted enzyme IIB component of PTS 13 6 16130314 ypdC predicted DNA-binding protein 13 2 16130312 ypdA predicted sensory kinase in two-component system with YpdB 13 10 90111337 yoaD predicted phosphodiesterase 13 16 90111336 yoaB hypothetical protein b1809 13 15 16129634 ynhG hypothetical protein b1678 13 13 16129483 yneH predicted glutaminase 13 11 16129481 yneF predicted diguanylate cyclase 13 16 90111207 ymdC predicted hydrolase 13 19 16128802 yliF predicted diguanylate cyclase 13 15 16128801 yliE conserved inner membrane protein 13 16 90111138 ylaB conserved inner membrane protein 13 17 16128232 ykfG CP4-6 prophage; predicted DNA repair protein 13 20 16132195 yjjU predicted esterase 13 6 90111743 yjjP predicted inner membrane protein 13 6 16132192 yjjG nucleotidase 13 9 
16132184 yjjB conserved inner membrane protein 13 6 90111738 yjiA predicted GTPase 13 11 16132117 yjhF KpLE2 phage-like element; predicted transporter 13 11 49176479 yjgM predicted acetyltransferase 13 2 90111692 yjeI hypothetical protein b4144 13 0 16131887 yjcC predicted signal transduction protein (EAL domain containing 13 16 16131831 yjaH hypothetical protein b4001 13 0 16131793 yijP conserved inner membrane protein 13 11 16131760 yiiS hypothetical protein b3922 13 4 16131699 yihD hypothetical protein b3858 13 4 16131676 yigM predicted inner membrane protein 13 7 16131664 yigB predicted hydrolase 13 14 16131662 yigA hypothetical protein b3810 13 9 16131552 yidP predicted DNA-binding transcriptional regulator 13 12 90111636 yidE hypothetical protein b3685 13 5 16131473 yibL hypothetical protein b3602 13 1 16131457 yiaV membrane fusion protein (MFP) component of efflux pump, signal 13 17 16131455 yiaT hypothetical protein b3584 13 5 16131432 yiaH conserved inner membrane protein 13 0 16131340 yhhN conserved inner membrane protein 13 0 16131339 yhhM hypothetical protein b3467 13 0 90111599 yhhJ predicted transporter subunit: membrane component of ABC 13 13

285 94541125 yheV hypothetical protein b4551 13 3 16131232 yheT predicted hydrolase 13 10 16131230 yheR glutathione-regulated potassium-efflux system ancillary protein 13 4 16131140 yhdA conserved inner membrane protein 13 15 16131101 yhcC predicted Fe-S oxidoreductase 13 1 90111549 yhbV predicted protease 13 4 90111536 ygjP predicted metal dependent hydrolase 13 11 16130899 yghZ aldo-keto reductase 13 16 90111518 yggU hypothetical protein b2953 13 7 90111519 yggL hypothetical protein b2959 13 10 16130861 yggH tRNA(m7G46)-methyltransferase 13 22 16130831 yggF predicted hexoseP phosphatase 13 4 16130824 yggA arginine exporter protein 13 14 90111512 ygfG methylmalonyl-CoA decarboxylase, biotin-independent 13 20 16130713 ygdE predicted methyltransferase 13 16 16130594 ygaZ predicted transporter 13 11 16130584 ygaW predicted inner membrane protein 13 0 16130607 ygaD competence damage-inducible protein A 13 15 16130559 yfjY CP4-57 prophage; predicted DNA repair protein 13 20 90111463 yfiP hypothetical protein b2583 13 7 16130525 yfiN predicted diguanylate cyclase 13 15 16130418 yfgO predicted inner membrane protein 13 16 90111433 yfeX hypothetical protein b2431 13 11 90111428 yfeA predicted diguanylate cyclase 13 16 16130261 yfcM hypothetical protein b2326 13 12 16130239 yfcH conserved protein with NAD(P)-binding Rossmann-fold domain 13 20 49176208 yfcE phosphodiesterase 13 0 90111415 yfbU hypothetical protein b2294 13 2 16130184 yfaY competence damage-inducible protein A 13 4 90111401 yejA predicted oligopeptide transporter subunit 13 11 16130111 yeiR predicted enzyme 13 11 16130110 yeiQ predicted dehydrogenase, NAD-dependent 13 8 16130096 yeiH conserved inner membrane protein 13 11 16130066 yehW predicted transporter subunit: membrane component of ABC 13 11 16130064 yehU predicted sensory kinase in two-component system with YehT 13 10 16130039 yegW predicted DNA-binding transcriptional regulator 13 8 16130007 yegE predicted diguanylate cyclase, GGDEF domain signalling 
protein 13 17 16129943 yeeS CP4-44 prophage; predicted DNA repair protein 13 19 16129928 yeeO predicted multidrug efflux system 13 15 90111360 yedQ predicted diguanylate cyclase 13 16 16129755 yeaV predicted transporter 13 16 16129752 yeaS neutral amino-acid efflux system 13 17 90111334 yeaP predicted diguanylate cyclase 13 15 16129744 yeaM predicted DNA-binding transcriptional regulator 13 11 90111331 yeaJ predicted diguanylate cyclase 13 16 16129719 ydjA predicted oxidoreductase 13 17 16129630 ydhY predicted 4Fe-4S ferredoxin-type protein 13 9 16129613 ydhO predicted lipoprotein 13 12 90111312 ydhL hypothetical protein b1648 13 5 16129563 ydgI predicted arginine/ornithine antiporter transporter 13 14 16129501 ydfI predicted mannonate dehydrogenase 13 8 16129496 ydeJ competence damage-inducible protein A 13 13 16129494 ydeH hypothetical protein b1535 13 16 90111284 yddV predicted diguanylate cyclase 13 16 16129455 yddA fused predicted multidrug transporter subunits of ABC superfamily: 13 10 16129402 ydcV predicted spermidine/putrescine transporter subunit 13 13 16129303 ydaN zinc transporter 13 8 90111253 ydaM predicted diguanylate cyclase, GGDEF domain signalling protein 13 16

286 16129276 ycjS predicted oxidoreductase, NADH-binding 13 10 16129273 ycjP predicted sugar transporter subunit: membrane component of ABC 13 13 16129272 ycjO predicted sugar transporter subunit: membrane component of ABC 13 11 16129283 ycjF hypothetical protein b1322 13 8 90111236 yciU dsDNA-mimic protein 13 5 16129234 yciN hypothetical protein b1273 13 0 90111237 yciI predicted enzyme 13 21 16129176 ychQ predicted transcriptional regulator 13 4 16129194 ychJ hypothetical protein b1233 13 18 16129177 ychA predicted transcriptional regulator 13 12 90111225 ycgG conserved inner membrane protein 13 16 16129151 ycgB hypothetical protein b1188 13 9 16129076 ycfS hypothetical protein b1113 13 11 90111213 ycfP hypothetical protein b1108 13 3 16129067 ycfL hypothetical protein b1104 13 0 16129020 yceJ predicted cytochrome b561 13 16 16129030 yceH hypothetical protein b1067 13 9 16128997 ycdX hypothetical protein b1034 13 1 16128989 ycdT predicted diguanylate cyclase 13 16 16128931 yccT hypothetical protein b0964 13 4 16128926 yccR hypothetical protein b0959 13 3 16128928 yccF conserved inner membrane protein 13 3 16128937 yccA inner membrane protein 13 17 16128892 ycbB predicted carboxypeptidase 13 6 90111187 ycaI conserved inner membrane protein 13 18 90111179 ybjT conserved protein with NAD(P)-binding Rossmann-fold domain 13 4 16128815 ybjL hypothetical protein b0847 13 5 16128844 ybjD conserved protein with nucleoside triphosphate hydrolase domain 13 2 16128787 ybiS hypothetical protein b0819 13 11 90111167 ybhJ predicted hydratase 13 22 16128735 ybhE 6-phosphogluconolactonase 13 5 16128687 ybgK predicted enzyme subunit 13 13 16128710 ybgE conserved inner membrane protein 13 1 16128682 ybgA hypothetical protein b0707 13 10 16128626 ybeL hypothetical protein b0643 13 4 16128560 ybdG predicted mechanosensitive channel 13 12 16128480 ybbP predicted inner membrane protein 13 16 16128439 ybaZ predicted methyltransferase 13 9 16128428 ybaW hypothetical protein b0443 13 0 
16128469 ybaS predicted glutaminase 13 11 16128466 ybaP hypothetical protein b0482 13 3 90111131 yajL hypothetical protein b0424 13 3 90111134 yajG predicted lipoprotein 13 4 16128389 yajB hypothetical protein b0404 13 5 90111127 yaiI hypothetical protein b0387 13 14 16128376 yaiE hypothetical protein b0391 13 12 16128370 yaiC predicted diguanylate cyclase 13 16 16128313 yahN neutral amino-acid efflux system 13 13 16128300 yahA predicted DNA-binding transcriptional regulator 13 16 16128213 yafL predicted lipoprotein and C40 family peptidase 13 11 16128210 yafK hypothetical protein b0224 13 3 16128040 yabF glutathione-regulated potassium-efflux system ancillary protein 13 6 16132144 uxuB D-mannonate oxidoreductase, NAD-binding 13 8 16132143 uxuA mannonate dehydratase 13 2 90111702 ulaG predicted L-ascorbate 6-phosphate lactonase 13 3 16132016 ulaB L-ascorbate-specific enzyme IIB component of PTS 13 3 90111703 ulaA ascorbate-specific PTS system enzyme IIC 13 3 16131324 ugpA glycerol-3-phosphate transporter subunit 13 12

287 16130958 ttdB L(+)-tartrate dehydratase 13 12 16129826 torY TMAO reductase III (TorYZ), cytochrome c-type subunit 13 8 16128962 torC trimethylamine N-oxide (TMAO) reductase I, cytochrome c-type 13 8 16129389 tehB predicted S-adenosyl-L-methionine-dependent methyltransferase 13 5 16128352 tauC taurine transporter subunit 13 14 16130738 tas predicted oxidoreductase, NADP(H)-dependent aldo-keto reductase 13 15 16130651 surE acid phosphatase 13 22 90111451 sseA 3-mercaptopyruvate sulfurtransferase 13 13 16131165 smg hypothetical protein b3284 13 9 16130114 rtn hypothetical protein b2176 13 16 16131825 rsd stationary phase protein, binds sigma 70 RNA polymerase subunit 13 7 16131195 rpsS 30S ribosomal protein S19 13 25 16131190 rpsQ 30S ribosomal protein S17 13 25 49176137 rpmI 50S ribosomal protein L35 13 5 16131508 rpmB 50S ribosomal protein L28 13 24 90111097 rof modulator of Rho-dependent transcription termination 13 0 16132213 rob DNA-binding transcriptional activator 13 7 16128593 rnk nucleoside diphosphate kinase regulator 13 10 49176361 rhsB rhsB element core protein RshB 13 7 49176410 rffD UDP-N-acetyl-D-mannosaminuronic acid dehydrogenase 13 22 16131502 rfaG glucosyltransferase I 13 8 16130948 rfaE fused heptose 7-phosphate kinase/heptose 1-phosphate 13 14 90111623 radC DNA repair protein RadC 13 20 16129802 purT phosphoribosylglycinamide formyltransferase 2 13 22 16130401 purC phosphoribosylaminoimidazole-succinocarboxamide synthase 13 14 16129798 ptrB protease II 13 10 16128316 prpB 2-methylisocitrate lyase 13 16 16130593 proX glycine betaine transporter subunit 13 5 16128451 priC primosomal replication protein N'' 13 0 16130426 ppk polyphosphate kinase 13 14 16131633 ppiC peptidyl-prolyl cis-trans isomerase C (rotamase C) 13 17 16128101 ppdD predicted major pilin subunit 13 7 16128822 potF putrescine transporter subunit: periplasmic-binding component of 13 15 16128054 polB DNA polymerase II 13 7 16128719 pnuC predicted nicotinamide mononucleotide 
transporter 13 17 16128898 pncB nicotinate phosphoribosyltransferase 13 12 90111327 pncA nicotinamidase/pyrazinamidase 13 8 16131671 pldA outer membrane phospholipase A 13 11 16129335 pinR Rac prophage; predicted site-specific recombinase 13 8 16130520 pheA fused chorismate mutase P/prephenate dehydratase 13 23 16129239 pgpB phosphatidylglycerophosphatase B 13 5 16129594 pdxY pyridoxine kinase 13 10 16130344 pdxK pyridoxine kinase 13 10 16129355 paaG enoyl-CoA hydratase 13 20 49176206 nuoG NADH dehydrogenase subunit G 13 19 16130587 nrdH glutaredoxin-like protein 13 0 49176133 norM multidrug efflux protein NorM 13 20 16129149 nhaB sodium/proton antiporter 13 8 16129426 narY nitrate reductase 2 (NRZ), beta subunit 13 9 16129188 narH nitrate reductase 1, beta (Fe-S) subunit 13 9 16130142 napG quinol dehydrogenase periplasmic component 13 5 16130145 napF ferredoxin-type protein, predicted role in electron transfer to 13 7 16130144 napD assembly protein for periplasmic nitrate reductase 13 5 16130139 napC nitrate reductase, cytochrome c-type, periplasmic 13 8 90111403 napB nitrate reductase, small, cytochrome C550 subunit, periplasmic 13 7 16129082 nagK N-acetyl-D-glucosamine kinase 13 6 16129694 nadE NAD synthetase 13 10 16128718 nadA quinolinate synthetase 13 18

288 16129031 mviM predicted oxidoreductase with NAD(P)-binding Rossmann-fold domain 13 0 16131471 mtlD mannitol-1-phosphate 5-dehydrogenase 13 9 16129287 mpaA murein peptide amidase A 13 1 16129841 motB flagellar motor protein MotB 13 12 16131698 mobA molybdopterin-guanine dinucleotide biosynthesis protein A 13 12 16130608 mltB membrane-bound lytic murein transglycosylase B 13 20 16129736 mipA scaffolding protein for murein synthesizing machinery 13 5 16131839 metA homoserine O-succinyltransferase 13 6 90111290 marA DNA-binding transcriptional dual activator of multiple antibiotic 13 2 16131858 malG maltose transporter subunit 13 13 16131859 malF maltose transporter subunit 13 10 16131860 malE maltose ABC transporter periplasmic protein 13 1 16131862 lamB maltoporin precursor 13 4 16129674 infC translation initiation factor IF-3 13 24 16131627 ilvM acetolactate synthase II, small subunit 13 2 16132087 idnT L-idonate and D-gluconate transporter 13 11 16132090 idnK D-gluconate kinase, thermosensitive 13 11 90111675 iclR DNA-binding transcriptional repressor 13 14 16130817 iciA chromosome replication initiation inhibitor protein 13 8 90111637 ibpB heat shock chaperone 13 11 16131555 ibpA heat shock chaperone 13 11 90111444 hyfA hydrogenase 4, 4Fe-4S subunit 13 5 16130631 hycB hydrogenase 3, Fe-S subunit 13 5 16128840 hcr HCP oxidoreductase, NADH-dependent 13 14 90111180 hcp hydroxylamine reductase 13 2 16130888 gss fused glutathionylspermidine amidase/glutathionylspermidine 13 7 16128461 gsk inosine/guanosine kinase 13 5 49176355 gntU gluconate transporter, low affinity GNT 1 system 13 11 90111591 gntK gluconate kinase 2 13 11 16129246 gmr modulator of Rnase II stability 13 17 16129993 gmd GDP-D-mannose dehydratase, NAD(P)-binding 13 22 16131551 glvG predicted 6-phospho-beta-glucosidase (pseudogene) 13 0 16131524 gltS glutamate transporter 13 11 16131763 glpX fructose 1,6-bisphosphatase II 13 4 16130174 glpQ periplasmic glycerophosphodiester phosphodiesterase 13 10 
16131765 glpF glycerol facilitator 13 13 16130177 glpB anaerobic glycerol-3-phosphate dehydrogenase subunit B 13 4 16131304 glgC glucose-1-phosphate adenylyltransferase 13 7 16131306 glgB glycogen branching enzyme 13 14 16131303 glgA glycogen synthase 13 14 49176294 glcF glycolate oxidase iron-sulfur subunit 13 14 16131319 ggt gamma-glutamyltranspeptidase periplasmic precursor 13 18 16130807 gcvT glycine cleavage system aminomethyltransferase T 13 19 90111443 gcvR DNA-binding transcriptional repressor, regulatory protein accessory 13 8 16130805 gcvP glycine dehydrogenase 13 17 16131948 fumB anaerobic class I fumarate hydratase (fumarase B) 13 12 16129570 fumA fumarate hydratase (fumarase A), aerobic Class I 13 12 16128088 ftsZ cell division protein FtsZ 13 25 16131762 fpr ferredoxin-NADP reductase 13 16 16129895 fliP flagellar biosynthesis protein P 13 15 16129891 fliL flagellar basal body-associated protein FliL 13 1 16129890 fliK flagellar hook-length control protein 13 7 90111357 fliH flagellar assembly protein H 13 6 16129886 fliG flagellar motor switch protein G 13 15 16129885 fliF flagellar M-ring protein 13 15 16129034 flgM anti-sigma factor for FliA (sigma 28) 13 0 16129046 flgL flagellar hook-associated protein L 13 14 16129045 flgK flagellar hook-associated protein K 13 14

289 16129044 flgJ flagellar biosynthesis protein FlgJ 13 14 16129037 flgC flagellar basal-body rod protein C 13 15 16131753 fieF ferrous iron efflux protein F 13 21 16128145 fhuD iron-hydroxamate transporter subunit 13 1 16131285 feoB ferrous iron transport protein B 13 12 16131284 feoA ferrous iron transport protein A 13 2 16131732 fdoI formate dehydrogenase-O, cytochrome b556 subunit 13 8 90111100 fadE acyl-CoA dehydrogenase 13 17 90111537 exuR DNA-binding transcriptional repressor 13 11 16129931 erfK conserved protein with NAD(P)-binding Rossmann-fold domain 13 11 16131152 envR DNA-binding transcriptional regulator 13 5 16128526 emrE multidrug efflux protein 13 16 90111557 elbB isoprenoid biosynthesis protein with amidotransferase-like domain 13 9 90111291 eamA cysteine and O-acetyl-L-serine efflux system 13 11 90111283 dos cAMP phosphodiesterase, heme-regulated 13 17 16128217 dinB DNA polymerase IV 13 17 16128153 dgt deoxyguanosinetriphosphate triphosphohydrolase 13 17 16130252 dedA conserved inner membrane protein 13 20 16131951 dcuS sensory histidine kinase in two-component regulatory system with 13 8 16129152 dadA D-amino acid dehydrogenase small subunit 13 12 16130350 cysP thiosulfate transporter subunit 13 12 16130669 cysH phosphoadenosine phosphosulfate reductase 13 15 16128415 cyoC cytochrome o ubiquinol oxidase subunit III 13 18 16128416 cyoB cytochrome o ubiquinol oxidase subunit I 13 18 16132030 cycA D-alanine/D-serine/glycine transporter 13 14 49176443 cpxP periplasmic protein combats stress 13 1 16131808 coaA 13 8 16130834 cmtA predicted fused mannitol-specific PTS enzymes: IIB component/IIC 13 3 16129839 cheW purine-binding chemotaxis protein 13 11 16129840 cheA fused chemotactic sensory histidine kinase in two-component 13 16 16129688 chbF cryptic phospho-beta-glucosidase, NAD(P)-binding 13 0 16129690 chbA N,N'-diacetylchitobiose-specific enzyme IIA component of PTS 13 0 16128607 ccrB camphor resistance protein CrcB 13 12 16130132 ccmG periplasmic 
thioredoxin of cytochrome c-type biogenesis 13 19 16130133 ccmF heme lyase, CcmF subunit 13 19 16130134 ccmE periplasmic heme chaperone 13 19 16130136 ccmC heme exporter subunit 13 19 16130137 ccmB heme exporter subunit 13 19 90111402 ccmA heme exporter subunit 13 20 16128034 caiT L-carnitine/gamma-butyrobetaine antiporter 13 16 16128743 bioB biotin synthase 13 25 16131588 bglH carbohydrate-specific outer membrane porin, cryptic 13 2 16128299 betT choline transporter of high affinity 13 17 49176374 avtA valine--pyruvate transaminase 13 6 90111645 atpI F0F1 ATP synthase subunit I 13 0 16131606 atpB F0F1 ATP synthase subunit A 13 25 16129698 astE succinylglutamate desuccinylase 13 9 49176415 aslB predicted regulator of arylsulfatase activity 13 3 16128105 aroP aromatic amino acid transporter 13 16 16128044 apaG hypothetical protein b0050 13 11 90111277 ansP L-asparagine transporter 13 16 16128436 amtB ammonium transporter 13 15 16130008 alkA 3-methyl-adenine DNA glycosylase II 13 7 16131033 agaI galactosamine-6-phosphate isomerase 13 6 16129581 add adenosine deaminase 13 14 16131840 aceB malate synthase 13 3 16131841 aceA 13 16 16131161 yrdB hypothetical protein b3280 12 1

290 90111547 yraR predicted nucleoside-diphosphate-sugar epimerase 12 11 16130997 yqjG predicted S-transferase 12 11 16130473 yphF predicted sugar transporter subunit: periplasmic-binding component 12 7 49176237 yphA predicted inner membrane protein 12 4 16129714 ynjH hypothetical protein b1760 12 0 90111303 ynfJ putative voltage-gated ClC-type chloride channel ClcB 12 7 16128919 ymbA hypothetical protein b0952 12 1 16128483 ylbH conserved protein, rhs-like 12 7 16128291 ykgE predicted oxidoreductase 12 14 90111109 ykgA predicted DNA-binding transcriptional regulator 12 6 90111748 yjjX NTPase 12 2 16132161 yjiR fused predicted DNA-binding transcriptional regulator/predicted 12 18 16132120 yjhI KpLE2 phage-like element; predicted DNA-binding transcriptional 12 9 90111713 yjgK hypothetical protein b4252 12 4 16132008 yjfC predicted synthetase/amidase 12 7 16131998 yjeT conserved inner membrane protein 12 0 16131966 yjeH predicted transporter 12 4 16131989 yjeF predicted carbohydrate kinase 12 19 90111680 yjbO phage shock protein G 12 0 16131838 yjaB predicted acetyltransferase 12 4 16131792 yijO predicted DNA-binding transcriptional regulator 12 7 16131759 yiiR conserved inner membrane protein 12 1 90111661 yihX phosphatase 12 0 90111659 yihV predicted sugar kinase 12 7 16131583 yieH predicted hydrolase 12 8 16131527 yicI predicted alpha-glucosidase 12 4 16131483 yibO phosphoglyceromutase 12 17 49176378 yibJ predicted Rhs-family protein 12 7 49176375 yiaN predicted transporter 12 14 16131448 yiaM predicted transporter 12 3 16131447 yiaL hypothetical protein b3576 12 5 16131445 yiaJ predicted DNA-binding transcriptional repressor 12 11 16131390 yhjA predicted cytochrome C peroxidase 12 11 16131312 yhhX predicted oxidoreductase with NAD(P)-binding Rossmann-fold domain 12 6 16131222 yheL predicted intracellular sulfur oxidation protein 12 1 94541124 yhdL hypothetical protein b4550 12 3 90111566 yhdJ predicted methyltransferase 12 4 16131111 yhcH hypothetical protein 
b3221 12 5 16130959 ygjE predicted tartrate:succinate antiporter 12 4 49176302 ygiQ hypothetical protein b4469 12 11 16130934 ygiC predicted enzyme 12 7 90111529 ygiB conserved outer membrane protein 12 1 16130907 yghB conserved inner membrane protein 12 10 16130736 ygdQ predicted inner membrane protein 12 14 16130304 yfdV predicted transporter 12 0 49176212 yfcU predicted export usher protein 12 6 16130226 yfbR hypothetical protein b2291 12 4 16130189 yfbF undecaprenyl phosphate-L-Ara4FN transferase 12 16 16130183 yfaX predicted DNA-binding transcriptional regulator 12 13 90111400 yeiU undecaprenyl pyrophosphate phosphatase 12 3 16130092 yeiG predicted esterase 12 16 16130047 yehB predicted outer membrane protein 12 9 90111349 yecN predicted inner membrane protein 12 3 16129738 yeaH hypothetical protein b1784 12 9 16129737 yeaG conserved protein with nucleoside triphosphate hydrolase domain 12 9 90111330 yeaD hypothetical protein b1780 12 14 16129676 ydiY hypothetical protein b1722 12 7 16129662 ydiU hypothetical protein b1706 12 17

291 90111307 ydgJ predicted oxidoreductase 12 11 16129559 ydgG predicted inner membrane protein 12 15 16129460 ydeP predicted oxidoreductase 12 14 90111286 ydeN hypothetical protein b1498 12 8 90111285 ydeM hypothetical protein b1497 12 4 90111279 yddG predicted methyl viologen efflux pump 12 1 16129399 ydcS predicted spermidine/putrescine transporter subunit 12 12 90111214 ycfQ predicted DNA-binding transcriptional regulator 12 10 16129019 yceI hypothetical protein b1056 12 15 16129018 yceA hypothetical protein b1055 12 18 16128949 yccZ predicted exopolysaccharide export protein 12 15 16128947 yccC cryptic autophosphorylating protein Etk 12 13 16128887 ycbC conserved inner membrane protein 12 5 16128884 ycaR hypothetical protein b0917 12 16 90111186 ycaL predicted peptidase with chaperone function 12 13 90111172 ybiN predicted SAM-dependent methyltransferase 12 9 16128771 ybiI hypothetical protein b0803 12 7 16128763 ybhG hypothetical protein b0795 12 15 16128678 ybfO conserved protein, rhs-like 12 7 16128529 ybcM DLP12 prophage; predicted DNA-binding transcriptional regulator 12 3 16128452 ybaN conserved inner membrane protein 12 7 16128257 yagI CP4-6 prophage; predicted DNA-binding transcriptional regulator 12 12 16128255 yagG CP4-6 prophage; predicted sugar transporter 12 4 90111096 yaeR predicted lyase 12 6 16128183 yaeQ hypothetical protein b0190 12 11 49176004 yaeP hypothetical protein b4406 12 0 16128148 yadQ chloride channel protein 12 11 90111377 wzc protein-tyrosine kinase 12 11 16130001 wzb protein-tyrosine phosphatase 12 14 16130002 wza lipoprotein required for capsular polysaccharide translocation 12 14 16128970 wrbA TrpR binding protein WrbA 12 16 16129987 wcaJ predicted UDP-glucose lipid carrier transferase 12 20 16130987 uxaC glucuronate isomerase 12 3 49176119 uxaB tagaturonate reductase 12 7 16128590 uspG universal stress protein UP12 12 0 90111262 uspF stress-induced protein, ATP-binding protein 12 2 16129147 umuC DNA polymerase V subunit UmuC 
12 17 16132020 ulaF L-ribulose 5-phosphate 4-epimerase 12 7 16129285 tpx thiol peroxidase 12 9 49176448 thiS sulfur carrier protein ThiS 12 0 16129604 sodC superoxide dismutase, Cu, Zn 12 10 90111618 sgbU predicted L-xylulose 5-phosphate 3-epimerase 12 4 16131454 sgbE L-ribulose-5-phosphate 4-epimerase 12 4 16128516 sfmD predicted outer membrane export usher protein 12 9 16131529 setC predicted sugar efflux system 12 2 16130108 setB lactose/glucose efflux system 12 2 49175994 setA broad specificity sugar efflux system 12 3 16129386 rimL ribosomal-protein-L7/L12-serine acetyltransferase 12 3 16129415 rhsE rhsE element core protein RshE 12 7 16128481 rhsD rhsD element protein 12 7 16128676 rhsC rhsC element core protein RshC 12 7 16131464 rhsA rhsA element core protein RshA 12 7 16131492 rfaC ADP-heptose:LPS heptosyl transferase I 12 12 16130153 rcsD phosphotransfer intermediate protein in two-component regulatory 12 0 16131726 rbn ribonuclease BN 12 20 16129504 pinQ Qin prophage; predicted site-specific recombinase 12 8 16129217 ompW outer membrane protein W 12 15 16131901 nrfF heme lyase (NrfEFG) for insertion of heme into c552, subunit NrfF 12 17

292 16128058 araC DNA-binding transcriptional dual regulator 12 3 16128185 nlpE lipoprotein involved with copper homeostasis and adhesion 12 2 16131245 nirD nitrite reductase small subunit 12 6 16131472 mtlR DNA-binding repressor 12 2 16130596 mprA DNA-binding transcriptional repressor of microcin B17 synthesis and 12 3 16129842 motA flagellar motor protein MotA 12 11 16130324 mntH manganese transport protein MntH 12 10 16129558 mdtJ multidrug efflux system transporter 12 13 49176154 manZ mannose-specific enzyme IID component of PTS 12 3 16129772 manY mannose-specific enzyme IIC component of PTS 12 3 16129771 manX fused mannose-specific PTS enzymes: IIA component/IIB component 12 3 16129580 malY bifunctional beta-cystathionase, PLP-dependent/ regulator of 12 14 16131294 malT transcriptional regulator MalT 12 6 16131863 malM maltose regulon periplasmic protein 12 1 16128838 ltaE L-allo-, PLP-dependent 12 8 16129471 lsrR lsr operon transcriptional repressor 12 2 16131476 lldD L-lactate dehydrogenase, FMN-linked 12 12 16131329 livH leucine/isoleucine/valine transporter subunit 12 7 16130747 kduI 5-keto-4-deoxyuronate isomerase 12 2 16129781 kdgR predicted DNA-binding transcriptional regulator 12 14 16129211 kch voltage-gated potassium channel 12 2 16131780 katG catalase/hydroperoxidase HPI(I) 12 9 49176140 katE hydroperoxidase HPII(III) (catalase) 12 16 16130411 hyfF NADH dehydrogenase subunit N 12 20 16128132 htrE predicted outer membrane usher protein 12 7 16129340 hslJ heat-inducible protein 12 1 16132169 hsdS specificity determinant for hsdM and hsdR 12 7 16132170 hsdM DNA methylase M 12 18 16131214 gspO bifunctional prepilin leader peptidase/ methylase 12 19 16131481 grxC glutaredoxin 3 12 21 16132212 gpmB phosphoglycerate mutase 12 7 16130178 glpC sn-glycerol-3-phosphate dehydrogenase (anaerobic), small subunit 12 15 16130844 galP D-galactose transporter 12 17 16130577 gabP gamma-aminobutyrate transporter 12 18 16130707 fucA L-fuculose phosphate aldolase 12 8 
16128340 frmB predicted esterase 12 17 16129889 fliJ flagellar biosynthesis chaperone 12 7 16132138 fimD outer membrane usher protein, type 1 fimbrial synthesis 12 9 90111198 etp phosphotyrosine-protein phosphatase 12 12 16128566 entD phosphopantetheinyltransferase component of enterobactin synthase 12 4 16128780 dps DNA protection during starvation conditions 12 8 90111302 dmsD twin-arginine leader-binding protein for DmsA and TorA 12 4 16130071 dld D-lactate dehydrogenase, FAD-binding, NADH independent 12 2 16128194 dkgB 2,5-diketo-D-gluconate reductase B 12 12 16132199 deoA thymidine phosphorylase 12 5 16128604 dcuC anaerobic C4-dicarboxylate transport 12 2 16128413 cyoE protoheme IX farnesyltransferase 12 18 16128226 crl DNA-binding transcriptional regulator 12 0 16129691 chbC N,N'-diacetylchitobiose-specific enzyme IIC component of PTS 12 1 90111093 cdaR DNA-binding transcriptional activator 12 8 16130131 ccmH heme lyase, CcmH subunit 12 17 16130135 ccmD cytochrome c biogenesis protein 12 0 16131215 bfr bacterioferritin, iron storage and detoxification protein 12 16 16129701 astA arginine succinyltransferase 12 10 16128055 araD L-ribulose-5-phosphate 4-epimerase 12 5 16128057 araB ribulokinase 12 0 16131975 ampC beta-lactamase/D-alanine carboxypeptidase 12 13 16128490 allR DNA-binding transcriptional repressor 12 13

90111546 agaV N-acetylgalactosamine-specific enzyme IIB component of PTS 12 3 16131032 agaD N-acetylgalactosamine-specific enzyme IID component of PTS 12 3 16131031 agaC N-acetylgalactosamine-specific enzyme IIC component of PTS 12 3 16131030 agaB N-acetylgalactosamine-specific enzyme IIB component of PTS 12 3 16128460 aes acetyl esterase 12 10 16129373 acpD acyl carrier protein phosphodiesterase 12 10 49176425 ysgA predicted hydrolase 11 9 16131036 yraJ predicted outer membrane protein 11 10 16131035 yraI predicted periplasmic pilin chaperone 11 6 16130966 yqjI predicted transcriptional regulator 11 5 16130965 yqjH predicted siderophore interacting protein 11 5 90111539 yqjF predicted quinol oxidase subunit 11 3 49176305 yqiG predicted outer membrane usher protein 11 9 16130802 yqfB hypothetical protein b2900 11 4 16130400 ypfJ hypothetical protein b2475 11 11 90111363 yodB predicted cytochrome 11 17 16129097 ymfB bifunctional thiamin pyrimidine pyrophosphate hydrolase/ thiamin 11 12 90111175 yliJ predicted glutathione S-transferase 11 9 16128281 ykgM 50S ribosomal protein L31 11 10 90111739 yjiY predicted inner membrane protein 11 15 16132160 yjiQ predicted transposase 11 0 90111718 yjhB KpLE2 phage-like element; predicted transporter 11 20 49176480 yjgN conserved inner membrane protein 11 0 90111689 yjdC predicted transcriptional regulator 11 3 16131855 yjbH predicted porin 11 2 49176436 yihP predicted transporter 11 4 49176435 yihO predicted transporter 11 4 16131625 yifB predicted bifunctional enzyme and transcriptional regulator 11 20 49176401 yieP predicted transcriptional regulator 11 9 16131579 yidZ predicted DNA-binding transcriptional regulator 11 14 90111638 yidQ conserved outer membrane protein 11 0 90111629 yicJ predicted transporter 11 4 16131486 yibD predicted glycosyl transferase 11 14 16131450 yiaO predicted transporter 11 12 16131421 yiaC predicted acyltransferase with acyl-CoA N-acyltransferase domain 11 4 16131418 yhjX predicted transporter
11 9 90111605 yhjH EAL domain containing protein involved in flagellar function 11 5 16131392 yhjB predicted DNA-binding response regulator in two-component 11 12 16131313 yhhY predicted acetyltransferase 11 3 16131106 yhcD predicted outer membrane protein 11 8 16131052 yhbW predicted enzyme 11 12 16131046 yhbP hypothetical protein b3154 11 2 16131001 yhaK predicted pirin-related protein 11 16 16130982 ygjR predicted NAD(P)-binding dehydrogenase 11 5 49176298 yghX predicted hydrolase (pseudogene) 11 5 16130875 yghK glycolate transporter 11 13 16130581 ygaV predicted DNA-binding transcriptional regulator 11 4 49176257 ygaM hypothetical protein b2672 11 0 16130526 yfiB predicted outer membrane lipoprotein 11 13 90111449 yfgJ hypothetical protein b2510 11 0 16130357 yfeY hypothetical protein b2432 11 0 16130271 yfcS predicted periplasmic pilus chaperone 11 6 16130062 yehS hypothetical protein b2124 11 9 90111355 yedO D-cysteine desulfhydrase 11 7 16129905 yedA predicted inner membrane protein 11 7 16129801 yebG conserved protein regulated by LexA 11 3 16129723 ydjE predicted transporter 11 18 16129687 ydjC hypothetical protein b1733 11 2

294 16129565 ydgC conserved inner membrane protein associated with alginate 11 5 16129572 ydgA hypothetical protein b1614 11 7 16129463 ydeS predicted fimbrial-like adhesin protein 11 0 16129431 yddL predicted lipoprotein 11 0 16129126 ycgF predicted FAD-binding phosphodiesterase 11 16 90111217 ycfD hypothetical protein b1128 11 19 16129026 yceB predicted lipoprotein 11 3 16128998 ycdY hypothetical protein b1035 11 4 90111202 ycdH predicted oxidoreductase, flavin:NADH component 11 10 16128907 ycbS predicted outer membrane usher protein 11 6 90111185 ycaO hypothetical protein b0905 11 11 16128845 ybjX hypothetical protein b0877 11 5 16128768 ybiB hypothetical protein b0800 11 7 16128761 ybhS predicted transporter subunit: membrane component of ABC 11 13 16128760 ybhR predicted transporter subunit: membrane component of ABC 11 13 16128754 ybhL predicted inner membrane protein 11 5 16128741 ybhB predicted kinase inhibitor 11 10 94541102 ybgT hypothetical protein b4515 11 0 16128720 ybgR zinc transporter ZitB 11 17 90111165 ybgQ predicted outer membrane protein 11 9 16128675 ybfA hypothetical protein b0699 11 0 16128582 ybdH predicted oxidoreductase 11 0 16128528 ybcL DLP12 prophage; predicted kinase inhibitor 11 10 16128122 yadI predicted PTS Enzyme IIA 11 0 16128123 yadE predicted polysaccharide deacetylase lipoprotein 11 5 16128059 yabI conserved inner membrane protein 11 13 16131648 wzxE O-antigen translocase 11 2 16131650 wecG putative UDP-N-acetyl-D-mannosaminuronic acid transferase 11 5 16131649 wecF putative enterobacterial common antigen polymerase 11 1 16129984 wcaL predicted glycosyl transferase 11 15 16129574 uidB glucuronide transporter 11 4 16129575 uidA beta-D-glucuronidase 11 3 16131323 ugpE glycerol-3-phosphate transporter subunit 11 11 16128396 tsx nucleoside channel, receptor of phage T6 and colicin K 11 4 16129637 sufD component of SufBCD complex 11 10 90111261 stfR Rac prophage; predicted tail fiber protein 11 0 90111189 ssuA alkanesulfonate 
transporter subunit 11 6 16128114 speE spermidine synthase 11 11 16130838 speB agmatinase 11 11 16129487 sotB sugar efflux transporter 11 14 49176129 slyB outer membrane lipoprotein 11 4 16129863 sdiA DNA-binding transcriptional activator 11 4 16129195 rssA hypothetical protein b1234 11 14 16131178 rpmJ 50S ribosomal protein L36 11 20 16128920 rmf ribosome modulation factor 11 6 16128820 rimK ribosomal protein S6 modification protein 11 18 16128781 rhtA threonine and homoserine efflux system 11 6 16131640 rfe UDP-GlcNAc:undecaprenylphosphate GlcNAc-1-phosphate transferase 11 15 90111243 puuP putrescine importer 11 18 16132023 priB primosomal replication protein N 11 8 90111523 pppA bifunctional prepilin leader peptidase/ methylase 11 21 16130810 pepP proline aminopeptidase P II 11 21 16132194 osmY periplasmic protein 11 12 16129441 osmC osmotically inducible, stress-inducible membrane protein 11 8 16130211 nuoN NADH dehydrogenase subunit N 11 20 16130212 nuoM NADH dehydrogenase subunit M 11 20 16130213 nuoL NADH dehydrogenase subunit L 11 20 16130214 nuoK NADH dehydrogenase subunit K 11 11

295 16130216 nuoI NADH dehydrogenase subunit I 11 17 16129818 ntpA dATP pyrophosphohydrolase 11 4 16130617 norV anaerobic nitric oxide reductase flavorubredoxin 11 5 16128284 insF-1 IS3 element protein InsF 11 15 16128357 insF-2 IS3 element protein InsF 11 15 16128524 insF-3 DLP12 prophage; IS3 element protein InsF 11 15 16128990 insF-4 IS3 element protein InsF 11 15 16130028 insF-5 IS3 element protein InsF 11 15 16132106 insO-2 KpLE2 phage-like element; partial transposase of insertion element 11 15 16128819 nfsA nitroreductase A, NADPH-dependent, FMN-dependent 11 3 16129427 narZ nitrate reductase 2 (NRZ), alpha subunit 11 4 16129187 narG nitrate reductase 1, alpha subunit 11 4 90111558 nanK N-acetylmannosamine kinase 11 3 16131113 nanE predicted N-acetylmannosamine-6-P epimerase 11 4 16128003 mogA molybdenum cofactor biosynthesis protein 11 10 16128729 modE DNA-binding transcriptional dual regulator 11 10 16130263 mepA penicillin-insensitive murein endopeptidase 11 5 16131946 melB melibiose:sodium symporter 11 4 16129557 mdtI multidrug efflux system transporter 11 9 90111209 mdtH predicted drug efflux system 11 1 16129011 mdoG glucan biosynthesis protein, periplasmic 11 9 90111741 mdoB phosphoglycerol transferase I 11 11 16129477 lsrG autoinducer-2 (AI-2) modifying protein LsrG 11 1 16129633 lpp murein lipoprotein 11 1 16131328 livM leucine/isoleucine/valine transporter subunit 11 5 49176358 livJ leucine/isoleucine/valine transporter subunit 11 4 16128329 lacZ beta-D-galactosidase 11 3 16128327 lacA galactoside O-acetyltransferase 11 11 16131429 insK IS150 conserved protein InsB 11 15 16130634 hypB GTP hydrolase involved in nickel liganding into hydrogenases 11 6 16130413 hyfH hydrogenase 4, Fe-S subunit 11 0 16130409 hyfD hydrogenase 4 membrane subunit 11 20 16130407 hyfB NADH dehydrogenase subunit N 11 20 16130630 hycC NADH dehydrogenase subunit N 11 20 16129027 grxB glutaredoxin 2 (Grx2) 11 3 90111668 gldA glycerol dehydrogenase 11 1 16130238 folX 
D-erythro-7,8-dihydroneopterin triphosphate 2'-epimerase and 11 9 16130256 flk predicted flagella assembly protein 11 0 16129035 flgA flagellar basal body P-ring biosynthesis protein A 11 1 90111729 fimI fimbrial protein involved in type 1 pilus biosynthesis 11 1 16129249 fabI enoyl-(acyl carrier protein) reductase 11 24 16129797 exoX exodeoxyribonuclease X 11 8 16130202 elaA predicted acyltransferase with acyl-CoA N-acyltransferase domain 11 14 16128282 eaeH attaching and effacing protein, pathogenesis factor 11 0 16128863 dmsC dimethyl sulfoxide reductase, anaerobic, subunit C 11 3 16129024 dinI DNA damage-inducible protein I 11 0 16129529 dicA Qin prophage; predicted regulator for DicB 11 0 16131117 dcuD predicted transporter 11 3 16128417 cyoA cytochrome o ubiquinol oxidase subunit II 11 16 49176476 cybC cytochrome b562, truncated (pseudogene) 11 1 16128116 cueO multicopper oxidase (laccase) 11 12 90111478 csiR DNA-binding transcriptional dual regulator 11 8 16129989 cpsB mannose-1-phosphate guanyltransferase 11 15 16128322 codA cytosine deaminase 11 6 16129932 cobT nicotinate-nucleotide--dimethylbenzimidazole 11 11 16128621 cobC predicted alpha-ribazole-5'-P phosphatase 11 13 16129179 chaA calcium/sodium:proton antiporter 11 4 16131756 cdh CDP-diacylglycerol pyrophosphatase 11 1

296 16128033 caiA crotonobetaine reductase subunit II, FAD-binding 11 16 16129666 btuE predicted glutathione peroxidase 11 13 16131974 blc outer membrane lipoprotein (lipocalin) 11 14 16130070 bglX beta-D-glucoside glucohydrolase, periplasmic 11 7 16131653 aslA acrylsulfatase-like enzyme 11 6 16130192 arnT 4-amino-4-deoxy-L-arabinose transferase 11 13 16128448 acrR DNA-binding transcriptional repressor 11 5 16130935 zupT predicted dioxygenase 10 13 90111706 ytfH predicted transcriptional regulator 10 9 16131275 yrfF predicted inner membrane protein 10 0 16130995 yqjK hypothetical protein b3100 10 0 90111532 yqiH predicted periplasmic pilin chaperone 10 6 90111474 ypjA adhesin-like autotransporter 10 8 90111396 yohG predicted outer membrane protein 10 18 16129774 yobD hypothetical protein b1820 10 2 16129407 yncA predicted acyltransferase with acyl-CoA N-acyltransferase domain 10 9 16129343 ynbE predicted lipoprotein 10 4 16128950 ymcA hypothetical protein b0984 10 2 16132159 yjiP predicted transposase (pseudogene) 10 1 16132148 yjiE predicted DNA-binding transcriptional regulator 10 11 90111727 yjhT hypothetical protein b4310 10 2 16132012 yjfP predicted hydrolase 10 0 16131906 yjcP predicted outer membrane factor of efflux pump 10 19 16131892 yjcF hypothetical protein b4066 10 0 16131882 yjbQ hypothetical protein b4056 10 11 90111676 yjbF predicted lipoprotein 10 2 16131758 yiiQ hypothetical protein b3920 10 0 16131718 yihQ alpha-glucosidase 10 4 90111655 yihF hypothetical protein b3861 10 6 90111652 yigI hypothetical protein b3820 10 5 49176414 yifK predicted transporter 10 12 16131548 yidJ predicted sulfatase/phosphatase 10 9 16131394 yhjD conserved inner membrane protein 10 0 16131380 yhiD predicted Mg(2+) transport ATPase inner membrane protein 10 11 16131314 yhhZ hypothetical protein b3442 10 2 16131287 yhgA predicted transposase 10 1 16131145 yhdT conserved inner membrane protein 10 0 16130939 ygiL predicted fimbrial-like adhesin protein 10 1 90111514 yggG 
predicted peptidase 10 14 16130830 yggD predicted DNA-binding transcriptional regulator 10 0 16130668 ygcB conserved protein, member of DEAD box family 10 1 16130646 ygbM hypothetical protein b2739 10 8 16130639 ygbA hypothetical protein b2732 10 1 90111479 ygaY predicted transporter (pseudogene) 10 6 49176258 ygaX predicted transporter 10 5 90111459 yfhB hypothetical protein b2560 10 0 90111434 yfeZ predicted inner membrane protein 10 0 16130336 yfeH predicted inner membrane protein 10 7 90111429 yfeC predicted DNA-binding transcriptional regulator 10 0 16130283 yfdH CPS-53 (KpLE1) prophage; bactoprenol glucosyl transferase 10 15 49176211 yfcT predicted outer membrane export usher protein 10 4 16130240 yfcI hypothetical protein b2305 10 1 16130234 yfcD predicted NUDIX hydrolase 10 4 16130191 yfbH hypothetical protein b2256 10 4 90111407 yfaZ predicted outer membrane porin protein 10 0 90111406 yfaV predicted transporter 10 8 16130180 yfaU predicted 2,4-dihydroxyhept-2-ene-1,7-dioic acid aldolase 10 7 16130090 yeiB conserved inner membrane protein 10 5

297 16130048 yehC predicted periplasmic pilin chaperone 10 6 90111372 yefM antitoxin of the YoeB-YefM toxin-antitoxin system 10 5 90111370 yeeF predicted amino-acid transporter 10 17 90111319 ydjM predicted inner membrane protein regulated by LexA 10 2 16129663 ydiV hypothetical protein b1707 10 0 90111318 ydiO predicted acyl-CoA dehydrogenase 10 16 16129615 ydhP predicted transporter 10 15 90111313 ydhM predicted DNA-binding transcriptional regulator 10 11 16129562 ydgH hypothetical protein b1604 10 0 16129556 ydgD predicted peptidase 10 0 16129502 ydfJ predicted transporter 10 17 16129464 ydeT hypothetical protein b1505 10 6 16129493 ydeE predicted transporter 10 4 16129339 ydbK fused predicted pyruvate-flavodoxin oxidoreductase: conserved 10 1 16129342 ydbH hypothetical protein b1381 10 2 16129367 ydbC predicted oxidoreductase, NAD(P)-binding 10 13 90111249 ycjG L-Ala-D/L-Glu epimerase 10 6 16129183 ychP predicted invasin 10 0 16129165 ycgV predicted adhesin 10 3 16129157 ycgR protein involved in flagellar function 10 0 90111230 ycgO potassium/proton antiporter 10 12 16129075 ycfR hypothetical protein b1112 10 0 16129073 ycfJ hypothetical protein b1110 10 7 90111206 ycdZ predicted inner membrane protein 10 1 16128987 ycdR predicted enzyme associated with biofilm formation 10 7 16128979 ycdC predicted DNA-binding transcriptional regulator 10 9 90111196 yccU predicted CoA-binding protein with NAD(P)-binding Rossmann-fold 10 1 16128958 yccM predicted 4Fe-4S membrane protein 10 8 16128906 ycbR predicted periplasmic pilin chaperone 10 6 90111190 ycbQ predicted fimbrial-like adhesin protein 10 1 90111178 ybjS predicted NAD(P)H-binding oxidoreductase with NAD(P)-binding 10 10 16128834 ybjQ hypothetical protein b0866 10 11 16128821 ybjN predicted oxidoreductase 10 0 16128813 ybjJ predicted transporter 10 4 16128770 ybiJ hypothetical protein b0802 10 0 16128740 ybhC predicted pectinesterase 10 2 16128692 ybgP predicted assembly protein 10 6 16128694 ybgD predicted 
fimbrial-like adhesin protein 10 0 16128562 ybdF hypothetical protein b0579 10 3 16128395 yajD hypothetical protein b0410 10 7 16128125 yadD predicted transposase 10 1 90111087 yacC hypothetical protein b0122 10 0 16128074 yabB hypothetical protein b0081 10 18 16128039 yaaU predicted transporter 10 18 16131440 xylR DNA-binding transcriptional activator, xylose-binding 10 6 16129994 wcaF predicted acyl transferase 10 8 16132019 ulaE L-xylulose 5-phosphate 3-epimerase 10 4 16132018 ulaD 3-keto-L-gulonate 6-phosphate decarboxylase 10 5 16129568 tus DNA replication terminus site-binding protein 10 1 16130957 ttdA tartrate dehydratase 10 12 49176400 trkD potassium transporter 10 10 16128960 torT periplasmic sensory protein associated with the TorRS two-component 10 2 90111150 tfaD DLP12 prophage; predicted tail fiber assembly protein (pseudogene) 10 1 16129388 tehA potassium-tellurite ethidium and proflavin transporter 10 3 90111693 sugE multidrug efflux system protein 10 10 16130913 sufI repressor protein for FtsI 10 7 90111315 sufB component of SufBCD complex 10 10 90111188 ssuC alkanesulfonate transporter subunit 10 11

298 90111452 sseB rhodanase-like enzyme, sulfur transfer from thiosulfate 10 0 16129542 speG spermidine N1-acetyltransferase 10 3 16128113 speD S-adenosylmethionine decarboxylase proenzyme 10 8 16131889 soxR DNA-binding transcriptional dual regulator, Fe-S center for 10 7 16131078 sfsB DNA-binding transcriptional activator of maltose metabolism 10 4 16128515 sfmC pilin chaperone, periplasmic 10 6 16131462 selA selenocysteine synthase 10 7 16129539 rspA predicted dehydratase 10 9 16131916 rpiB ribose-5-phosphate isomerase B 10 3 49176413 rffT 4-alpha-L-fucosyltransferase 10 1 49176412 rffC TDP-fucosamine acetyltransferase 10 1 16129262 puuB gamma-Glu-putrescine oxidase, FAD/NAD(P)-binding 10 15 16129269 pspE thiosulfate:cyanide sulfurtransferase (rhodanese) 10 4 16128387 proY predicted cryptic proline transporter 10 13 16131937 proP proline/glycine betaine transporter 10 19 16130730 ppdA hypothetical protein b2826 10 0 16130887 pitB phosphate transporter 10 21 16129121 pin e14 prophage; site-specific DNA recombinase 10 8 49176017 phoA bacterial alkaline phosphatase 10 8 16131928 phnF predicted DNA-binding transcriptional regulator of phosphonate 10 8 16128559 pheP phenylalanine transporter 10 13 16131693 pepQ proline dipeptidase 10 20 16128124 panD aspartate 1-decarboxylase precursor 10 12 16128782 ompX outer membrane protein X 10 0 16130215 nuoJ NADH dehydrogenase subunit J 10 11 16130217 nuoH NADH dehydrogenase subunit H 10 17 16130219 nuoF NADH:ubiquinone oxidoreductase, chain F 10 18 16130220 nuoE NADH dehydrogenase subunit E 10 18 16130221 nuoC NADH:ubiquinone oxidoreductase, chain C,D 10 17 16130222 nuoB NADH dehydrogenase subunit B 10 17 49176207 nuoA NADH dehydrogenase subunit A 10 16 16131902 nrfG heme lyase (NrfEFG) for insertion of heme into c552, subunit NrfG 10 8 16131899 nrfD formate-dependent nitrite reductase, membrane subunit 10 5 90111682 nrfB nitrite reductase, formate-dependent, penta-heme cytochrome c 10 5 16131896 nrfA nitrite reductase, 
formate-dependent, cytochrome 10 5 16130179 yfaD hypothetical protein b2244 10 1 90111559 nanT sialic acid transporter 10 21 16128750 moaB molybdopterin biosynthesis protein B 10 15 90111105 mmuP CP4-6 prophage; predicted S-methylmethionine transporter 10 17 16128246 mmuM homocysteine methyltransferase 10 10 16128331 mhpR DNA-binding transcriptional activator, 3HPP-binding 10 10 16128337 mhpE 4-hydroxy-2-ketovalerate aldolase 10 13 16128332 mhpA 3-(3-hydroxyphenyl)propionate hydroxylase 10 6 16131944 melR DNA-binding transcriptional dual regulator 10 5 90111270 mdoD glucan biosynthesis protein, periplasmic 10 9 90111289 marR DNA-binding transcriptional repressor of multiple antibiotic 10 4 16128443 maa maltose O-acetyltransferase 10 6 16130094 lysP lysine transporter 10 17 16131474 lldP L-lactate permease 10 12 16131330 livK leucine transporter subunit 10 4 90111091 ligT 2'-5' RNA ligase 10 5 16128672 kdpC potassium-transporting ATPase subunit C 10 9 16128674 kdpA potassium-transporting ATPase subunit A 10 8 16128796 iaaA L-asparaginase 10 7 16130414 hyfI hydrogenase 4, Fe-S subunit 10 17 16130412 hyfG hydrogenase 4, subunit 10 17 16130626 hycG hydrogenase 3 and formate hydrogenase complex, HycG subunit 10 17 16130627 hycF formate hydrogenlyase complex iron-sulfur protein 10 11

299 16130628 hycE hydrogenase 3, large subunit 10 17 16130629 hycD hydrogenase 3, membrane subunit 10 13 16130891 hybF protein involved with the maturation of hydrogenases 1 and 2 10 5 16132171 hsdR endonuclease R 10 4 16128444 hha modulator of gene expression, with H-NS 10 1 16128723 gpmA phosphoglyceromutase 10 9 16130320 glk 10 13 16129715 gdhA glutamate dehydrogenase 10 19 16131018 garL alpha-dehydro-beta-deoxy-D-glucarate aldolase 10 7 16131387 gadW DNA-binding transcriptional activator 10 1 16131784 fsaB fructose-6-phosphate aldolase 2 10 0 90111174 fsaA fructose-6-phosphate aldolase 1 10 0 90111356 fliZ predicted regulator of FliA activity 10 0 16129033 flgN export chaperone for FlgK and FlgL 10 0 16128037 fixC predicted oxidoreductase with FAD/NAD(P)-binding domain 10 16 49176488 fimC chaperone, periplasmic 10 6 16132135 fimA major type 1 subunit fimbrin (pilin) 10 1 16131240 fic stationary-phase protein, cell division 10 3 16131731 fdhE formate dehydrogenase accessory protein FdhE 10 7 16129992 fcl bifunctional GDP-fucose synthetase: GDP-4-dehydro-6-deoxy-D-mannose 10 15 16130988 exuT hexuronate transporter 10 10 16128578 entB isochorismatase 10 3 16128133 ecpD predicted periplasmic pilin chaperone 10 6 49176308 ebgA cryptic beta-D-galactosidase, alpha subunit 10 3 16131511 dut deoxyuridine 5'-triphosphate nucleotidohydrolase 10 23 90111527 dkgA 2,5-diketo-D-gluconate reductase A 10 12 16131870 dinF DNA-damage-inducible SOS response protein 10 9 16131950 dcuR DNA-binding response regulator in two-component regulatory system 10 0 90111595 dcrB periplasmic protein 10 0 16130005 dcd deoxycytidine triphosphate deaminase 10 14 16128159 dapD 2,3,4,5-tetrahydropyridine-2-carboxylate N-succinyltransferase 10 20 16128414 cyoD cytochrome o ubiquinol oxidase subunit IV 10 12 16128555 cusC copper/silver efflux system, outer membrane component 10 20 16128955 cspH stress protein, member of the CspA-family 10 2 16129003 csgD DNA-binding transcriptional activator in 
two-component regulatory 10 5 16132217 creD inner membrane protein 10 4 16128605 crcA palmitoyl transferase for Lipid A 10 1 16131666 corA magnesium/nickel/cobalt transporter 10 13 16129933 cobS cobalamin synthase 10 11 90111375 cld regulator of length of O-antigen component of lipopolysaccharide 10 1 90111153 citE citrate lyase, citryl-ACP lyase (beta) subunit 10 6 16128603 citB DNA-binding response regulator in two-component regulatory system 10 5 16128602 citA sensory histidine kinase in two-component regulatory system with 10 3 16128032 caiB crotonobetainyl-CoA:carnitineCoA-transferase 10 9 16131410 bcsG predicted inner membrane protein 10 2 16129699 astB succinylarginine dihydrolase 10 9 16131612 asnA asparagine synthetase AsnA 10 6 16131373 arsR DNA-binding transcriptional repressor 10 10 49176018 araJ predicted transporter 10 15 16130745 araE arabinose transporter 10 14 16128547 appY DLP12 prophage; DNA-binding transcriptional activator 10 3 16128946 appA phosphoanhydride phosphorylase 10 2 16128104 ampE predicted inner membrane protein 10 4 16130542 alpA CP4-57 prophage; DNA-binding transcriptional activator 10 1 90111699 aidB isovaleryl CoA dehydrogenase 10 13 16131130 aaeB p-hydroxybenzoic acid efflux system component 10 8 16132033 ytfG NAD(P)H:quinone oxidoreductase 9 5 16132031 ytfE predicted regulator of cell morphogenesis and cell wall metabolism 9 4

90111585 yrfG predicted hydrolase 9 10 16130391 ypfG hypothetical protein b2466 9 4 16129794 yobA hypothetical protein b1841 9 1 16129747 yoaF conserved outer membrane protein 9 1 90111324 ynjE predicted thiosulfate sulfur transferase 9 10 16129707 ynjA hypothetical protein b1753 9 1 16129413 yncG predicted enzyme 9 4 90111276 yncC predicted DNA-binding transcriptional regulator 9 3 16129372 ynbD predicted phosphatase, inner membrane protein 9 7 16129158 ymgE predicted inner membrane protein 9 0 16132150 yjiG conserved inner membrane protein 9 0 90111728 yjhA N-acetylneuraminic acid outer membrane channel protein 9 0 16132085 yjgR predicted ATPase 9 12 90111712 yjgJ predicted transcriptional regulator 9 7 90111716 yjgB predicted alcohol dehydrogenase, Zn-dependent and NAD(P)-binding 9 15 90111695 yjeM predicted transporter 9 3 16131970 yjeJ hypothetical protein b4145 9 0 16131894 yjcH conserved inner membrane protein involved in acetate transport 9 6 16131883 yjbR hypothetical protein b4057 9 1 16131854 yjbG hypothetical protein b4028 9 1 90111666 yiiM hypothetical protein b3910 9 10 16131741 yiiL L-rhamnose mutarotase 9 1 16131581 yieF chromate reductase, Class I, flavoprotein 9 10 90111639 yidR hypothetical protein b3689 9 0 90111621 yibQ predicted polysaccharide deacetylase 9 11 16131396 yhjG predicted outer membrane biogenesis protein 9 5 16131395 yhjE predicted transporter 9 17 16131359 yhiI predicted HlyD family secretion protein 9 14 49176309 ygjI predicted transporter 9 4 16130755 ygeG predicted chaperone 9 0 16130744 ygeA predicted racemase 9 7 16130645 ygbL hypothetical protein b2738 9 5 16130644 ygbK hypothetical protein b2737 9 5 16130579 ygaU hypothetical protein b2665 9 10 90111477 ygaF predicted enzyme 9 8 90111414 yfbT predicted hydrolase or phosphatase 9 7 16130168 yfaL adhesin 9 7 16130127 yejO predicted autotransporter outer membrane protein 9 3 16130036 yegT predicted nucleoside transporter 9 1 16130026 yegS hypothetical protein b2086 9 7 
90111364 yeeI hypothetical protein b1976 9 15 16129918 yedZ hypothetical protein b1972 9 10 16129917 yedY hypothetical protein b1971 9 11 16129821 yecE hypothetical protein b1868 9 8 90111338 yebN conserved inner membrane protein 9 9 16129756 yeaW predicted 2Fe-2S cluster-containing protein 9 9 16129749 yeaQ conserved inner membrane protein 9 0 16129729 ydjK predicted transporter 9 15 16129603 ydhK conserved inner membrane protein 9 5 90111309 ydhA predicted lipoprotein 9 0 16129468 ydeU conserved protein, predicted pseudogene 9 4 16129430 yddK hypothetical protein b1471 9 1 16129423 yddE hypothetical protein b1464 9 13 16129375 ydcF hypothetical protein b1414 9 0 16129321 ydaV Rac prophage; predicted DNA replication protein 9 2 16129301 ydaL hypothetical protein b1340 9 16 90111192 ycbF predicted periplasmic pilin chaperone 9 4 90111184 ycaM predicted transporter 9 4

301 16128864 ycaC predicted hydrolase 9 8 16128809 ybjG undecaprenyl pyrophosphate phosphatase 9 2 16128789 ybiU hypothetical protein b0821 9 0 16128783 ybiP predicted hydrolase, inner membrane 9 9 16128657 ybfM predicted outer membrane porin 9 2 16128630 ybeT conserved outer membrane protein 9 13 16128584 ybdM hypothetical protein b0601 9 1 16128511 ybcI conserved inner membrane protein 9 3 90111141 ybbM predicted inner membrane protein 9 4 16128445 ybaJ hypothetical protein b0461 9 0 16128309 yahJ deaminase 9 4 16128218 yafN predicted antitoxin of the YafO-YafN toxin-antitoxin system 9 1 16128197 yafE predicted S-adenosyl-L-methionine-dependent methyltransferase 9 9 16128184 yaeJ hypothetical protein b0191 9 16 16131857 xylE D-xylose transporter 9 10 16131436 xylA xylose isomerase 9 5 16130332 xapB xanthosine transporter 9 1 16129990 wcaI predicted glycosyl transferase 9 5 16129146 umuD DNA polymerase V, subunit D 9 7 16131325 ugpB glycerol-3-phosphate transporter subunit 9 2 16130042 thiM hydroxyethylthiazole kinase 9 3 16128353 tauD taurine dioxygenase 9 8 16129925 shiA shikimate transporter 9 17 90111147 sfmA predicted fimbrial-like adhesin protein 9 1 16129718 selD selenophosphate synthetase 9 11 16131295 rtcB hypothetical protein b3421 9 6 16128634 rihA ribonucleoside hydrolase 1 9 9 16131747 rhaT L-rhamnose:proton symporter 9 1 16131744 rhaB rhamnulokinase 9 5 49176441 rhaA L-rhamnose isomerase 9 1 16131496 rfaY lipopolysaccharide core biosynthesis protein 9 0 16131501 rfaP kinase that phosphorylates core heptose of lipopolysaccharide 9 4 16131497 rfaJ UDP-D-glucose:(galactosyl)lipopolysaccharide glucosyltransferase 9 4 16129522 relE Qin prophage; toxin of the RelE-RelB toxin-antitoxin system 9 5 16131919 phnO predicted acyltransferase with acyl-CoA N-acyltransferase domain 9 0 16131931 phnD phosphonate/organophosphate ester transporter subunit 9 5 16131847 pepE peptidase E 9 3 16128548 ompT DLP12 prophage; outer membrane protease VII (outer membrane 
protein 9 1 16130941 insD-5 IS2 insertion element transposase InsAB' 9 2 16128561 nfnB dihydropteridine reductase, NAD(P)H-dependent, oxygen-insensitive 9 9 90111673 nfi endonuclease V 9 4 16130141 napH quinol dehydrogenase membrane component 9 5 90111746 nadR nicotinamide-nucleotide adenylyltransferase 9 4 16131170 mscL large-conductance mechanosensitive channel 9 15 90111116 mhpD 2-keto-4-pentenoate hydratase 9 4 16129155 ldcA L,D-carboxypeptidase A 9 7 16130512 kgtP alpha-ketoglutarate transporter 9 16 16129537 intQ Qin prophage; predicted defective integrase 9 8 16128520 intD DLP12 prophage; predicted integrase 9 5 16131540 ilvN acetolactate synthase small subunit 9 1 16130791 idi isopentenyl-diphosphate delta-isomerase 9 2 16128492 hyi hydroxypyruvate isomerase 9 8 16129593 gst glutathione S-transferase 9 10 16131207 gspG pseudopilin, cryptic, general secretion pathway 9 16 16130033 gatZ D-tagatose 1,6-bisphosphate aldolase 2, subunit 9 0 16131020 garD (D)-galactarate dehydrogenase 9 5 16129451 gadC predicted glutamate:gamma-aminobutyric acid antiporter 9 10 16129452 gadB glutamate decarboxylase B, PLP-dependent 9 1

16128463 fsr predicted fosmidomycin efflux system 9 4 16131740 frvA predicted enzyme IIA component of PTS 9 7 16129843 flhC DNA-binding transcriptional dual regulator with FlhD 9 1 16132139 fimF minor component of type 1 fimbriae 9 0 16132188 fhuF ferric iron reductase involved in ferric hydroxamate transport 9 1 16130298 dsdA D-serine dehydratase 9 3 90111232 dhaH fused predicted dihydroxyacetone-specific PTS enzymes: HPr 9 10 16129907 dcm DNA cytosine methylase 9 6 16131962 cutA copper binding protein, copper sensitivity 9 8 16129934 cobU adenosylcobinamide kinase/adenosylcobinamide-phosphate 9 10 16128598 citF citrate lyase, citrate-ACP transferase (alpha) subunit 9 2 16131959 cadC DNA-binding transcriptional activator 9 5 16131403 bcsZ endo-1,4-D-glucanase 9 2 16131408 bcsE hypothetical protein b3536 9 2 16128056 araA L-arabinose isomerase 9 1 16128361 ampH beta-lactamase/D-alanine carboxypeptidase 9 6 16128496 allB allantoinase 9 14 16128968 agp glucose-1-phosphatase/inositol phosphatase 9 2 90111251 abgT predicted cryptic aminobenzoyl-glutamate transporter 9 4 16130777 yqeB conserved protein with NAD(P)-binding Rossmann fold 8 7 16130776 yqeA predicted amino acid kinase 8 4 16130044 yohM membrane protein conferring nickel and cobalt resistance 8 2 90111323 ynjB hypothetical protein b1754 8 3 90111275 yncB predicted oxidoreductase, Zn-dependent and NAD(P)-binding 8 10 16128805 yliI predicted dehydrogenase 8 12 16128290 ykgD predicted DNA-binding transcriptional regulator 8 11 90111684 yjcS predicted alkyl sulfatase 8 4 16131846 yjbB predicted transporter 8 5 90111667 yijE predicted permease 8 9 16131446 yiaK 2,3-diketo-L-gulonate dehydrogenase, NADH-dependent 8 8 90111596 yhhS predicted transporter 8 5 16131074 yhbE conserved inner membrane protein 8 10 49176312 yhaM hypothetical protein b4470 8 2 16130998 yhaH predicted inner membrane protein 8 5 16130969 ygjH hypothetical protein b3074 8 14 16130821 ygfH propionyl-CoA:succinate-CoA transferase 8 12 
49176267 ygcU predicted FAD containing dehydrogenase 8 13 90111488 ygcS predicted transporter 8 5 90111486 ygcQ predicted flavoprotein 8 15 90111484 ygcN predicted oxidoreductase with FAD/NAD(P)-binding domain 8 15 49176236 yfhR predicted peptidase 8 6 16130445 yfhM hypothetical protein b2520 8 9 16130334 yfeN conserved outer membrane protein 8 0 16130205 yfbK hypothetical protein b2270 8 4 16130069 yehZ predicted transporter subunit: periplasmic-binding component of ABC 8 8 16130068 yehY predicted transporter subunit: membrane component of ABC 8 4 16130038 yegV predicted kinase 8 2 16129954 yeeE predicted inner membrane protein 8 9 16129908 yedJ predicted phosphohydrolase 8 4 16129877 yedF hypothetical protein b1930 8 2 16129876 yedE predicted inner membrane protein 8 3 16129706 ydjZ conserved inner membrane protein 8 7 90111321 ydjX predicted inner membrane protein 8 0 16129655 ydiS predicted oxidoreductase with FAD/NAD(P)-binding domain 8 15 16129654 ydiR predicted electron transfer flavoprotein, FAD-binding 8 15 16129652 ydiP predicted DNA-binding transcriptional regulator 8 4 16129390 ydcL predicted lipoprotein 8 2 16129288 ycjY predicted hydrolase 8 4

303 16129271 ycjN predicted sugar transporter subunit: periplasmic-binding component 8 2 16129182 ychN hypothetical protein b1219 8 3 16129125 ycgE predicted DNA-binding transcriptional regulator 8 6 16128988 ycdS predicted outer membrane protein 8 3 16128814 ybjK predicted DNA-binding transcriptional regulator 8 2 16128769 ybiC predicted dehydrogenase 8 8 16128564 ybdK gamma-glutamyl:cysteine ligase 8 4 16128538 ybcS DLP12 prophage; predicted lysozyme 8 3 16128001 yaaJ predicted transporter 8 17 16130783 xdhD fused predicted xanthine/hypoxanthine oxidase: 8 8 16130986 uxaA altronate hydrolase 8 5 16131391 treF cytoplasmic trehalase 8 5 90111122 tauA taurine transporter subunit 8 5 16132125 sgcC KpLE2 phage-like element; predicted phosphotransferase enzyme IIC 8 0 16131452 sgbH 3-keto-L-gulonate 6-phosphate decarboxylase 8 5 16128362 sbmA predicted transporter 8 7 16128024 rihC ribonucleoside hydrolase 3 8 9 16130100 rihB ribonucleoside hydrolase 2 8 9 16129978 rfbC dTDP-4-deoxyrhamnose-3,5-epimerase 8 15 16131498 rfaI UDP-D-galactose:(glucosyl)lipopolysaccharide-alpha-1, 8 3 16128975 rarA predicted hydrolase 8 3 16129260 puuR DNA-binding transcriptional repressor 8 9 16131715 ompL predicted outer membrane porin L 8 0 90111521 nupG nucleoside transporter 8 2 16130588 nrdI hypothetical protein b2674 8 1 16129846 insA-5 IS1 repressor protein InsA 8 1 16129012 mdoH glucosyltransferase MdoH 8 10 16132203 lplA lipoate-protein ligase A 8 4 16131024 kbaZ tagatose 6-phosphate aldolase 1, kbaZ subunit 8 0 16129306 intR Rac prophage; integrase 8 7 90111445 hyfC hydrogenase 4, membrane subunit 8 1 16130897 hybO hydrogenase 2, small subunit 8 4 16128938 hyaA hydrogenase 1, small subunit 8 4 16129467 hipB DNA-binding transcriptional regulator 8 1 90111577 frlB fructoselysine-6-P-deglycase 8 17 16128036 fixB predicted electron transfer flavoprotein, NAD/FAD-binding domain 8 15 16128568 fes enterobactin/ferric enterobactin esterase 8 2 90111412 elaC ribonuclease Z 8 4 49176391 
dgoA 2-dehydro-3-deoxy-6-phosphogalactonate aldolase 8 3 16128326 cynX predicted cyanate transporter 8 10 90111454 csiE stationary phase inducible protein 8 0 16129689 chbR DNA-binding transcriptional dual regulator 8 3 16131404 bcsB regulator of cellulose synthase, cyclic di-GMP binding 8 2 16128843 aqpZ aquaporin Z 8 6 16129926 amn AMP nucleosidase 8 4 16128501 allD ureidoglycolate dehydrogenase 8 8 16130149 alkB oxidative demethylase of N1-methyladenine or N3-methylcytosine DNA 8 4 16129476 yneB hypothetical protein b1517 7 1 90111267 ynbA predicted inner membrane protein 7 6 16129008 ymdB hypothetical protein b1045 7 5 90111104 ykfC CP4-6 prophage; conserved protein 7 4 90111719 yjhC KpLE2 phage-like element; predicted oxidoreductase 7 9 16131953 yjdJ predicted acyltransferase with acyl-CoA N-acyltransferase domain 7 1 94541129 yidD hypothetical protein b4557 7 21 90111612 yhjY hypothetical protein b3548 7 0 90111548 yhbO predicted intracellular protease 7 13 49176303 ygiV predicted transcriptional regulator 7 4 16130936 ygiE zinc transporter ZupT 7 8

16130756 ygeH predicted transcriptional regulator 7 0 16130582 ygaP predicted inner membrane protein with hydrolase activity 7 3 90111432 yfeW hypothetical protein b2430 7 10 16130306 yfdW formyl-coenzyme A transferase 7 9 16130286 yfdK CPS-53 (KpLE1) prophage; conserved protein 7 1 90111425 yfdE predicted CoA-transferase, NAD(P)-binding 7 10 16130257 yfcJ predicted transporter 7 5 16130182 yfaW predicted enolase 7 9 90111348 yecD predicted hydrolase 7 5 90111333 yeaO hypothetical protein b1792 7 5 16129629 ydhV predicted oxidoreductase 7 0 16129450 yddW predicted lipoprotein 7 3 16129393 ydcN predicted DNA-binding transcriptional regulator 7 8 16129391 ydcM predicted transposase 7 3 16129382 ydcJ hypothetical protein b1423 7 6 16128986 ycdQ predicted glycosyl transferase 7 9 90111203 ycdL predicted enzyme 7 4 16128967 yccE hypothetical protein b1001 7 0 16128505 ybcF predicted carbamate kinase 7 4 16128308 yahI predicted carbamate kinase-like protein 7 4 16128303 yahD predicted transcriptional regulator with ankyrin domain 7 5 16128252 yagA CP4-6 prophage; predicted DNA-binding transcriptional regulator 7 5 16129995 wcaE predicted glycosyl transferase 7 2 16129999 wcaA predicted glycosyl transferase 7 11 16129906 vsr DNA mismatch endonuclease of very short patch repair 7 4 16129160 treA periplasmic trehalase 7 5 90111506 ssnA putative chlorohydrolase/aminohydrolase 7 11 16129980 rfbD dTDP-4-dehydrorhamnose reductase subunit, NAD(P)-binding, of 7 13 16131499 rfaB UDP-D-galactose:(glucosyl)lipopolysaccharide-1, 7 9 16131257 php predicted hydrolase 7 0 16131353 nikR nickel responsive regulator 7 1 16128689 nei endonuclease VIII/ 5-formyluracil/5-hydroxymethyluracil DNA 7 5 16129424 narV nitrate reductase 2 (NRZ), gamma subunit 7 2 16129428 narU nitrate/nitrite transporter 7 5 16129186 narK nitrate/nitrite transporter 7 3 16129189 narJ molybdenum-cofactor-assembly chaperone subunit (delta subunit) of 7 2 16129190 narI nitrate reductase 1, gamma (cytochrome b(NR)) 
subunit 7 2 16128651 nagD UMP phosphatase 7 4 90111117 mhpT predicted 3-hydroxyphenylpropionic transporter 7 14 16129016 mdtG predicted drug efflux system 7 4 16130619 hypF carbamoyl phosphate phosphatase and maturation protein for [NiFe] 7 5 16130637 hypE carbamoyl phosphate phosphatase, hydrogenase 3 maturation protein 7 6 16130636 hypD protein required for maturation of hydrogenases 7 5 16130635 hypC protein required for maturation of hydrogenases 1 and 3 7 1 16130893 hybD predicted maturation element for hydrogenase 2 7 4 16130894 hybC hydrogenase 2, large subunit 7 4 16128941 hyaD protein involved in processing of HyaA and HyaB proteins 7 3 16128939 hyaB hydrogenase 1, large subunit 7 4 16129466 hipA regulator with hipB 7 5 16131383 hdeD acid-resistance membrane protein 7 0 16130465 hcaC 3-phenylpropionate dioxygenase, predicted ferredoxin subunit 7 4 16130694 gudD (D)-glucarate dehydratase 1 7 5 49176177 flu CP4-44 prophage; antigen 43 (Ag43) phase-variable biofilm formation 7 9 90111081 fixA predicted electron transfer flavoprotein subunit, ETFP adenine 7 15 90111385 fbaB fructose-bisphosphate aldolase 7 0 90111151 dsbG periplasmic disulfide isomerase/thiol-disulphide oxidase 7 4 49176390 dgoD galactonate dehydratase 7 12 16128557 cusB copper/silver efflux system, membrane fusion protein 7 1

16128321 codB cytosine transporter 7 6
16128596 citG triphosphoribosyl-dephospho-CoA transferase 7 9
90111608 bcsC cellulose synthase subunit 7 2
90111609 bcsA cellulose synthase, catalytic subunit 7 7
16130158 atoD acetyl-CoA:acetoacetyl-CoA transferase, alpha subunit 7 13
90111602 arsB arsenite/antimonite transporter 7 6
16128500 allC N-carbamoyl-L-amino acid amidohydrolase 7 10
16130944 yqiI hypothetical protein b3048 6 0
16130316 ypdE predicted peptidase 6 4
90111326 ynjI predicted inner membrane protein 6 0
16129326 ynaK Rac prophage; conserved protein 6 1
90111113 ykgG predicted transporter 6 4
16132128 yjhQ KpLE2 phage-like element; predicted acetyltransferase 6 2
90111658 yihS predicted glucosamine isomerase 6 6
16131546 yidH conserved inner membrane protein 6 1
16131279 yhgE predicted inner membrane protein 6 1
16130975 ygjK predicted glycosyl hydrolase 6 0
16130964 ygjF G/U mismatch-specific DNA glycosylase 6 0
90111504 ygeZ dihydropyrimidinase 6 10
90111485 ygcO predicted 4Fe-4S cluster-containing protein 6 0
49176171 yedK hypothetical protein b1931 6 6
16129695 ydjQ endonuclease of nucleotide excision repair 6 16
16128883 ycaQ hypothetical protein b0916 6 3
16128868 ycaK hypothetical protein b0901 6 6
16128786 ybiR predicted transporter 6 2
16128305 yahF predicted acyl-CoA synthetase with NAD(P)-binding domain and 6 5
16129986 wzxC colanic acid exporter 6 5
16128904 ssuE NAD(P)H-dependent FMN reductase 6 6
16128902 ssuD alkanesulfonate monooxygenase 6 5
90111726 sgcX KpLE2 phage-like element; predicted endoglucanase with Zn-dependent 6 4
16131493 rfaL O-antigen ligase 6 0
16131918 phnP carbon-phosphorus lyase complex accessory protein 6 3
16131933 phnB hypothetical protein b4107 6 5
16128954 insB-4 IS1 transposase InsAB' 6 1
16128785 mntR DNA-binding transcriptional regulator of mntH 6 3
16131738 frvX predicted endo-1,4-beta-glucanase 6 3
49176341 frlC predicted isomerase 6 0
16129649 aroD 3-dehydroquinate dehydratase 6 2
16131043 yraQ predicted permease 5 2
16129411 yncE hypothetical protein b1452 5 1
16128292 ykgF predicted amino acid dehydrogenase with NAD(P)-binding domain and 5 7
16132129 yjhR KpLE2 phage-like element; predicted frameshift suppressor 5 0
16131530 yicL predicted inner membrane protein 5 0
16131110 yhcG hypothetical protein b3220 5 3
49176293 yghJ predicted inner membrane lipoprotein 5 0
16130037 yegU predicted hydrolase 5 3
16129653 ydiQ hypothetical protein b1697 5 6
16129397 ydcQ predicted DNA-binding transcriptional regulator 5 1
49176071 ycdM predicted monooxygenase 5 5
16128758 ybhP predicted DNase 5 10
16128756 ybhN conserved inner membrane protein 5 7
16130333 xapA purine nucleoside phosphorylase 5 6
16129972 wbbK lipopolysaccharide biosynthesis protein 5 0
16129576 uidR DNA-binding transcriptional repressor 5 1
16129478 tam trans-aconitate 2-methyltransferase 5 4
49176352 rtcA RNA 3'-terminal-phosphate cyclase 5 3
16128319 prpD 2-methylcitrate dehydratase 5 10
16129849 otsB trehalose-6-phosphate phosphatase, biosynthetic 5 5

16132172 mrr methylated adenine and cytosine restriction protein 5 0
16129122 mcrA e14 prophage; 5-methylcytosine-specific restriction endonuclease B 5 1
16130410 hyfE hydrogenase 4, membrane subunit 5 1
16130785 guaD guanine deaminase 5 13
16131389 gadA glutamate decarboxylase A, PLP-dependent 5 1
16128629 djlB predicted chaperone 5 0
16130819 argK arginine/ornithine transport system ATPase 5 1
16131535 ade cryptic adenine deaminase 5 0
94541135 ytjA hypothetical protein b4568 4 0
16132027 ytfA predicted transcriptional regulator 4 0
90111332 yoaI hypothetical protein b1788 4 0
94541112 ynfO Qin prophage; predicted protein 4 0
90111227 ymgG hypothetical protein b1172 4 0
94541090 ymdF hypothetical protein b4518 4 3
90111199 ymcD hypothetical protein b0987 4 0
16128546 ylcE DLP12 prophage; predicted protein 4 0
16128377 ykiA hypothetical protein b0392 4 0
16128221 ykfJ hypothetical protein b0235 4 2
90111736 yjiT hypothetical protein b4342 4 0
94541138 yjhX hypothetical protein b4566 4 1
16131873 yjbL hypothetical protein b4047 4 0
16131782 yijF hypothetical protein b3944 4 2
16131379 yhiF predicted DNA-binding transcriptional regulator 4 0
90111525 yghY predicted dienlactone hydrolase (pseudogene) 4 3
16130558 yfjX CP4-57 prophage; predicted antirestriction protein 4 1
16130272 yfcV predicted fimbrial-like adhesin protein 4 0
16130207 yfbM hypothetical protein b2272 4 0
90111398 yeiA dihydropyrimidine dehydrogenase 4 6
16130050 yehE hypothetical protein b2112 4 0
94541093 yecJ hypothetical protein b4537 4 0
90111278 yddH hypothetical protein b1462 4 4
16129327 ydaY Rac prophage; predicted protein 4 0
16129277 ycjT predicted hydrolase 4 2
90111238 yciG hypothetical protein b1259 4 2
16129136 ycgI hypothetical protein b1173 4 0
16128811 ybjH hypothetical protein b0843 4 0
16128550 ybcH hypothetical protein b0567 4 1
16128470 ybaT predicted transporter 4 5
16128271 yagT predicted xanthine dehydrogenase, 2Fe-2S subunit 4 10
16128269 yagR predicted oxidoreductase with molybdenum-binding domain 4 9
90111094 yaeI predicted phosphatase 4 5
16127999 yaaX hypothetical protein b0005 4 0
16130770 xdhC xanthine dehydrogenase, Fe-S binding subunit 4 10
16130769 xdhB xanthine dehydrogenase, FAD-binding subunit 4 8
16130768 xdhA xanthine dehydrogenase, molybdenum binding subunit 4 8
90111543 tdcR DNA-binding transcriptional activator 4 0
16131638 rhoL rho operon leader peptide 4 0
16129359 paaK phenylacetyl-CoA ligase 4 4
16128345 insC-1 IS2 insertion element repressor InsA 4 0
16129364 insC-2 IS2 insertion element repressor InsA 4 0
16129536 insD-7 Qin prophage; IS2 insertion element transposase InsAB', C-ter 4 0
16132093 insC-6 KpLE2 phage-like element; IS2 insertion element repressor InsA 4 0
49175991 NA toxic membrane protein, small 4 0
49176176 insC-3 CP4-44 prophage; IS2 insertion element repressor InsA 4 0
90111501 insC-4 IS2 insertion element repressor InsA 4 0
16128707 mngB alpha-mannosidase 4 0
16129611 lhr predicted ATP-dependent helicase 4 7
90111732 kptA RNA 2'-phosphotransferase-like protein 4 3

94541132 insM transposase 4 1
16131104 gltF periplasmic protein 4 0
90111295 essQ Qin prophage; predicted S lysis protein 4 0
16128537 essD DLP12 prophage; predicted phage lysis protein 4 0
16128556 cusF periplasmic copper-binding protein 4 0
16130689 chpA toxin of the ChpA-ChpR toxin-antitoxin system, endoribonuclease 4 0
90111252 abgA predicted peptidase, aminobenzoyl-glutamate utilization protein 4 16
16131660 yzcX hypothetical protein b3808 3 0
90111592 yrhA hypothetical protein b3443 3 0
16130842 yqgD predicted inner membrane protein 3 0
16130841 yqgC hypothetical protein b2940 3 0
16130572 yqaD hypothetical protein b2658 3 0
94541113 yniD hypothetical protein b4535 3 0
16129418 yncM hypothetical protein b1459 3 1
90111222 ymfT e14 prophage; predicted DNA-binding transcriptional regulator 3 0
94541097 ylcG DLP12 prophage; predicted protein 3 0
16128234 ykfF CP4-6 prophage; predicted protein 3 0
16132162 yjiS hypothetical protein b4341 3 0
16132107 yjhV KpLE2 phage-like element; predicted protein 3 1
16132098 yjgZ KpLE2 phage-like element; predicted protein 3 0
49176458 yjdP hypothetical protein b4487 3 0
90111687 yjcZ hypothetical protein b4110 3 0
16131584 yieI predicted inner membrane protein 3 0
16131550 yidL predicted DNA-binding transcriptional regulator 3 0
16131467 yibG hypothetical protein b3596 3 0
16131022 yhaV hypothetical protein b3130 3 1
16130932 ygiA hypothetical protein b3036 3 0
16130886 yghT predicted protein with nucleoside triphosphate hydrolase domain 3 0
16130885 yghS predicted protein with nucleoside triphosphate hydrolase domain 3 0
49176296 yghQ predicted inner membrane protein 3 0
16130765 ygeP hypothetical protein b2862 3 0
16130675 ygcP predicted anti-terminator regulatory protein 3 0
90111423 yfdM CPS-53 (KpLE1) prophage; predicted methyltransferase 3 1
90111404 yfaT hypothetical protein b2229 3 2
94541136 yehK hypothetical protein b4541 3 0
16129879 yedL predicted acyltransferase 3 2
16129684 ydjO hypothetical protein b1730 3 0
16129514 ydfR Qin prophage; predicted protein 3 0
16129512 ydfP Qin prophage; conserved protein 3 0
16129535 ydfE Qin prophage; predicted protein 3 0
16129530 ydfA Qin prophage; predicted protein 3 0
16129378 ydcA hypothetical protein b1419 3 0
94541091 ydaF Rac prophage; predicted protein 3 0
94541108 yciX_2 hypothetical protein b4523 3 0
94541107 yciX_1 hypothetical protein b4522 3 0
16128993 ycdU predicted inner membrane protein 3 0
16128985 ycdP predicted inner membrane protein 3 0
16128532 ybcO DLP12 prophage; predicted protein 3 1
90111145 ybbY predicted uracil/xanthine transporter 3 3
16128314 yahO hypothetical protein b0329 3 0
16128311 yahL hypothetical protein b0326 3 0
90111108 yagV hypothetical protein b0289 3 0
16128233 yafX CP4-6 prophage; predicted protein 3 1
16128203 yafT predicted aminopeptidase 3 0
16128219 yafO predicted toxin of the YafO-YafN toxin-antitoxin system 3 1
90111089 yadM predicted fimbrial-like adhesin protein 3 0
90111422 tfaS CPS-53 (KpLE1) prophage; tail fiber assembly protein fragment 3 0
16130818 sbm methylmalonyl-CoA mutase 3 1

90111148 renD DLP12 prophage; predicted protein 3 0
16129520 rem Qin prophage; predicted protein 3 0
16129312 racC Rac prophage; predicted protein 3 0
16130563 pinH predicted invertase fragment (pseudogene) 3 0
16131930 phnE_1 phosphonate/organophosphate ester transporter subunit 3 3
90111265 paaD predicted multicomponent oxygenase/reductase subunit for 3 2
16129349 paaA predicted multicomponent oxygenase/reductase subunit for 3 2
16130918 NA predicted cyanide hydratase 3 2
90111115 mhpC 2-hydroxy-6-ketonona-2,4-dienedioic acid hydrolase 3 7
49176368 ldrD toxic polypeptide, small 3 0
49176089 ldrC toxic polypeptide, small 3 0
49176088 ldrB toxic polypeptide, small 3 0
49176087 ldrA toxic polypeptide, small 3 0
90111255 kil Rac prophage; inhibitor of ftsZ, killing protein 3 0
49176371 hokA toxic polypeptide, small 3 0
90111229 hlyE hemolysin E 3 0
16129976 glf UDP-galactopyranose mutase, FAD/NAD(P)-binding 3 3
49176295 glcE glycolate oxidase FAD binding subunit 3 10
16129525 flxA Qin prophage; predicted protein 3 0
90111640 cbrA predicted oxidoreductase with FAD/NAD(P)-binding domain 3 0
16131034 yraH predicted fimbrial-like adhesin protein 2 0
90111497 yqeK hypothetical protein b2849 2 0
94541123 ypjJ hypothetical protein b4548 2 0
16130564 ypjB hypothetical protein b2649 2 0
90111397 yohH hypothetical protein b2139 2 0
94541092 yncN hypothetical protein b4532 2 0
16129414 yncH hypothetical protein b1455 2 0
94541106 ymgH hypothetical protein b4521 2 0
94541105 ymgF hypothetical protein b4520 2 0
16129128 ymgA hypothetical protein b1165 2 0
16129113 ymfR e14 prophage; predicted protein 2 0
90111221 ymfJ e14 prophage; predicted protein 2 0
90111220 ymfI e14 prophage; predicted protein 2 0
16128784 yliL hypothetical protein b0816 2 0
90111124 ykiB hypothetical protein b0370 2 0
94541094 ykfH hypothetical protein b4504 2 0
16132095 yjgW KpLE2 phage-like element; predicted protein 2 0
16131982 yjeN hypothetical protein b4157 2 0
16131954 yjdK hypothetical protein b4128 2 0
90111662 yiiE predicted transcriptional regulator 2 0
90111598 yhhH hypothetical protein b3483 2 0
16131109 yhcF predicted transcriptional regulator 2 0
16130923 ygiZ conserved inner membrane protein 2 0
90111500 ygeN hypothetical protein b2858 2 0
16130754 ygeF hypothetical protein b2850 2 0
16130568 ygaQ hypothetical protein b2654 2 0
16130557 yfjW CP4-57 prophage; predicted inner membrane protein 2 0
16130552 yfjR CP4-57 prophage; predicted DNA-binding transcriptional regulator 2 1
16130548 yfjN CP4-57 prophage; RNase LS 2 0
16130375 yffS CPZ-55 prophage; predicted protein 2 0
16130295 yfdT CPS-53 (KpLE1) prophage; predicted protein 2 0
16130294 yfdS CPS-53 (KpLE1) prophage; predicted protein 2 0
16130292 yfdQ CPS-53 (KpLE1) prophage; predicted protein 2 0
16130209 yfbO hypothetical protein b2274 2 0
16129944 yeeT CP4-44 prophage; predicted protein 2 0
16129882 yedM hypothetical protein b1935 2 0
16129429 yddJ hypothetical protein b1470 2 0
90111257 ydaG Rac prophage; predicted protein 2 0

94541109 ydaE Rac prophage; conserved protein 2 0
16129308 ydaC Rac prophage; predicted protein 2 0
16129159 ycgY hypothetical protein b1196 2 0
16128665 ybfP hypothetical protein b0689 2 0
16128679 ybfC hypothetical protein b0704 2 0
16128494 ybbV hypothetical protein b0510 2 0
90111114 yahM hypothetical protein b0327 2 0
94541096 yahH hypothetical protein b0322 2 0
16128304 yahE hypothetical protein b0319 2 0
16128278 yagZ hypothetical protein b0293 2 0
16128277 yagY hypothetical protein b0292 2 0
16128263 yagL CP4-6 prophage; DNA-binding protein 2 0
16128204 yafU predicted inner membrane protein 2 0
16131575 tnaC leader peptide 2 0
16131494 rfaK lipopolysaccharide core biosynthesis 2 2
16129310 recT Rac prophage; recombination and repair protein 2 0
16131201 pioO part of gsp divergon involved in type II protein secretion 2 0
16129280 ompG outer membrane porin 2 0
16131108 insH-10 IS5 transposase and trans-activator 2 5
16128244 insH-1 CP4-6 prophage; IS5 transposase and trans-activator 2 5
16128535 insH-2 DLP12 prophage; IS5 transposase and trans-activator 2 5
16128639 insH-3 IS5 transposase and trans-activator 2 5
16129292 insH-4 IS5 transposase and trans-activator 2 5
16129331 insH-5 Rac prophage; IS5 transposase and trans-activator 2 5
16129935 insH-6 CP4-44 prophage; IS5 transposase and trans-activator 2 5
16129971 NA IS5 transposase and trans-activator 2 5
16130129 insH-8 IS5 transposase and trans-activator 2 5
16130882 insH-9 IS5 transposase and trans-activator 2 5
16131377 insH-11 IS5 transposase and trans-activator 2 5
90111427 insL-3 IS186/IS421 transposase 2 0
94541098 NA DLP12 prophage; predicted lipoprotein 2 0
16128010 insL-1 IS186/IS421 transposase 2 0
16128531 ninE DLP12 prophage; conserved protein 2 0
16128552 nfrB bacteriophage N4 receptor, inner membrane subunit 2 2
16129379 mokB regulatory peptide 2 0
16132166 mcrC 5-methylcytosine-specific restriction enzyme McrBC, subunit McrC 2 1
16128266 intF CP4-6 prophage; predicted phage integrase 2 1
16131819 htrC heat shock protein 2 0
90111078 htgA hypothetical protein b0012 2 0
16129528 dicC Qin prophage; DNA-binding transcriptional regulator for DicB 2 0
16128325 cynS cyanate hydratase 2 4
16128540 borD DLP12 prophage; predicted lipoprotein 2 0
16130160 atoE short chain fatty acid transporter 2 9
16132045 yzfA hypothetical protein b4223 1 0
16131318 yrhB hypothetical protein b3446 1 0
16130816 yqfE hypothetical protein b2915 1 0
90111473 ypjM CP4-57 prophage; predicted protein (pseudogene) 1 0
90111471 ypjK CP4-57 prophage; predicted inner membrane protein 1 0
90111294 ynfN Qin prophage; predicted protein 1 0
16129465 yneL predicted transcriptional regulator 1 0
16129336 ynaE Rac prophage; predicted DNA-binding transcriptional regulator 1 0
16129130 ymgC hypothetical protein b1167 1 0
16129111 ymfM e14 prophage; predicted protein 1 0
16129105 ymfH e14 prophage; predicted protein 1 0
90111216 ymfA predicted inner membrane protein 1 0
90111204 ymdE hypothetical protein b1028 1 0
16128280 ykgL hypothetical protein b0295 1 0
16128235 ykfB CP4-6 prophage; predicted protein 1 0

90111721 yjhE KpLE2 phage-like element; predicted membrane protein (pseudogene) 1 0
90111720 yjhD KpLE2 phage-like element; predicted protein (pseudogene) 1 0
49176417 yigE hypothetical protein b4482 1 2
49176405 yifO hypothetical protein b3776 1 2
90111648 yifN conserved protein (pseudogene) 1 2
90111544 yhaB hypothetical protein b3120 1 0
16130760 ygeL hypothetical protein b2856 1 0
16130757 ygeI hypothetical protein b2853 1 0
16130666 ygcK hypothetical protein b2759 1 0
16130554 yfjT CP4-57 prophage; predicted protein 1 0
90111472 yfjS CP4-57 prophage; predicted protein 1 0
90111470 yfjO CP4-57 prophage; predicted protein 1 0
16130547 yfjM CP4-57 prophage; predicted protein 1 0
16130545 yfjK CP4-57 prophage; conserved protein 1 0
16130374 yffR CPZ-55 prophage; predicted protein 1 0
16130373 yffQ CPZ-55 prophage; predicted protein 1 0
16130372 yffP CPZ-55 prophage; predicted protein 1 0
16130371 yffO CPZ-55 prophage; predicted protein 1 0
16130370 yffN CPZ-55 prophage; predicted protein 1 0
16130369 yffM CPZ-55 prophage; predicted protein 1 0
16130368 yffL CPZ-55 prophage; predicted protein 1 0
90111424 yfdR CPS-53 (KpLE1) prophage; conserved protein 1 0
49176214 yfdP CPS-53 (KpLE1) prophage; predicted protein 1 0
16130290 yfdO CPS-53 (KpLE1) prophage; predicted defective phage replication 1 1
16130289 yfdN CPS-53 (KpLE1) prophage; predicted protein 1 0
16130011 yegJ hypothetical protein b2071 1 0
16129527 ydfX Qin prophage; predicted protein 1 0
16129526 ydfW Qin prophage; predicted protein 1 0
16129524 ydfV Qin prophage; predicted protein 1 0
16129503 ydfK Qin prophage; predicted DNA-binding transcriptional regulator 1 0
16129534 ydfD Qin prophage; predicted protein 1 0
16129532 ydfC Qin prophage; predicted protein 1 0
90111298 ydfB Qin prophage; predicted protein 1 0
16129191 ychS hypothetical protein b1228 1 0
16129127 ycgZ hypothetical protein b1164 1 0
16129084 ycfZ predicted inner membrane protein 1 0
90111161 ybfH hypothetical protein b0691 1 1
16128677 ybfB predicted inner membrane protein 1 0
16128542 ybcW DLP12 prophage; predicted protein 1 0
94541089 ybcD DLP12 prophage; predicted replication protein fragment (pseudogene) 1 0
16128482 ybbC hypothetical protein b0498 1 0
16128265 yagN CP4-6 prophage; predicted protein 1 0
16128261 yagJ CP4-6 prophage; predicted protein 1 0
49176006 yafY CP4-6 prophage; predicted DNA-binding transcriptional regulator 1 0
16128051 yabQ hypothetical protein b0057 1 0
16128050 yabP hypothetical protein b0056 1 0
16129975 wbbH O-antigen polymerase 1 0
16127995 thrL thr operon leader peptide 1 0
90111256 sieB Rac prophage; phage superinfection exclusion protein 1 0
16129317 racR Rac prophage; predicted DNA-binding transcriptional regulator 1 0
16130519 pheL pheA gene leader peptide 1 0
16129350 paaB predicted multicomponent oxygenase/reductase subunit for 1 2
16129192 NA predicted protamine-like protein 1 0
94541110 NA Rac prophage; predicted lipoprotein 1 0
16129102 lit e14 prophage; cell death peptidase, inhibitor of T4 late gene 1 0
94541101 kdpF potassium ion accessory transporter subunit 1 0
16129883 intG predicted defective phage integrase (pseudogene) 1 0
49176035 hokE toxic polypeptide, small 1 0

49176107 hokB toxic polypeptide, small 1 0
16129959 hisL his operon leader peptide 1 0
90111572 gspM general secretory pathway component, cryptic 1 0
16131203 gspC general secretory pathway component, cryptic 1 1
90111293 gnsB Qin prophage; predicted protein 1 0
16129533 dicB Qin prophage; cell division inhibition protein 1 0
16131585 cbrC hypothetical protein b3717 1 2
49176127 blr beta-lactam resistance membrane protein 1 0
90111305 asr acid shock protein precursor 1 0


SUPPLEMENTARY TABLE 3. General listing of additional experiments. A list of additional experiments that were performed. These experiments generally produced negative or weak preliminary results, or provided us with minor, sometimes supplemental, information. As such they are not presented in the body of the thesis.


Overexpression Phenotype Studies

Objective Test if overexpression of His-RavA adversely affects cell growth.

Experimental Description Grew BL21 cells (LB media) expressing His-RavA from a leaky pProEX HTb plasmid vs. cells containing plasmid alone or a ClpX-pProEX control.

General Result RavA-containing cells took longer to reach stationary phase and contained aggregate-like formations at completion of growth.

Objective Test if overexpression of His-RavA affects general cell morphology.

Experimental Description Gram stain and microscopic examination of BL21 cells grown in LB +/- leaky expression of His-RavA from pProEX HTb vs. vector alone and ClpX controls.

General Result No strong difference in cell morphology was observed between strains, although RavA-expressing cells may have been slightly more filamentous in some cases.

Objective Test if overexpression of His-RavA causes cells to aggregate.

Experimental Description Incubated BL21 cells containing His-ClpX or His-RavA pProEX HTb in LB and monitored the rate of aggregation/settling by OD600.

General Result The His-RavA overexpression strain did not aggregate/settle more quickly than the His-ClpX overexpression strain.

Objective Test toxicity of RavA overexpression in liquid culture.

Experimental Description Grew MG1655 cells (LB media) containing IPTG-inducible T7 RNA polymerase +/- plasmid containing RavA under the control of a T7 promoter and constructed a growth curve.

General Result IPTG-induced expression of RavA caused a significant reduction in cell growth compared to cells containing plasmid alone.
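
Growth curves such as those described in this section can be compared quantitatively by estimating the exponential-phase growth rate from OD600 readings. The following is a minimal sketch and not the analysis used in this thesis: it assumes all readings come from exponential phase, fits ln(OD600) against time by ordinary least squares, and uses invented example values. The function names `growth_rate` and `doubling_time` are hypothetical.

```python
import math


def growth_rate(times_h, od600):
    """Return the least-squares slope of ln(OD600) versus time (per hour)."""
    logs = [math.log(od) for od in od600]
    n = len(times_h)
    mean_t = sum(times_h) / n
    mean_y = sum(logs) / n
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_h, logs))
    sxx = sum((t - mean_t) ** 2 for t in times_h)
    return sxy / sxx


def doubling_time(rate_per_h):
    """Convert a specific growth rate (per hour) into a doubling time (hours)."""
    return math.log(2) / rate_per_h


# Invented exponential-phase readings (NOT data from this thesis).
times = [0.0, 0.5, 1.0, 1.5, 2.0]                  # hours
control = [0.05 * 2 ** (t / 0.5) for t in times]   # doubles every 0.5 h
induced = [0.05 * 2 ** (t / 1.0) for t in times]   # doubles every 1.0 h

for label, series in (("control", control), ("induced", induced)):
    mu = growth_rate(times, series)
    print(f"{label}: rate = {mu:.3f} per h, doubling time = {doubling_time(mu):.2f} h")
```

A lower fitted rate for the induced culture than for the plasmid-only control would correspond to the kind of growth reduction reported above.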

Deletion Phenotype Studies

Objective Test if ΔravA::cat and ΔviaA::cat grow abnormally in LB and M9 media.

Experimental Description Carried out growth curve analysis of WT and both deletion strains using LB and M9 media.

General Result No significant difference in growth rate was observed.


Objective Test if ΔravA::cat and ΔviaA::cat are more susceptible than WT to oxidative stress.

Experimental Description Grew cells in LB media to log phase/stationary phase, pelleted cells and incubated in minimal media containing hydrogen peroxide. Determined survival by colony count.

General Result No significant differences in survival were observed.

Objective Test if ΔravA::cat and ΔviaA::cat are more susceptible than WT to osmotic stress.

Experimental Description Spotted dilutions of WT and deletion strains on LB-agar plates containing increasing levels of NaCl. Also shocked cells with salt and determined survival by colony count.

General Result No differences in viability were observed.

Objective Test if ΔravA::cat and ΔviaA::cat are more susceptible than WT to pH stress.

Experimental Description Spotted dilutions of WT and deletion strains on LB-agar plates at different pH values (5 to 9). Also shocked cells with LB at various pH and determined survival by colony count.

General Result No differences in viability were observed.

Objective Test if ΔravA::cat and ΔviaA::cat are more susceptible than WT to heat shock.

Experimental Description Grew cells in LB media, heat shocked at 50 °C for 30 minutes, serially diluted, and spot plated onto LB agar. Also used a colony count approach.

General Result No differences in viability were observed.

Objective Test if ΔravA::cat and ΔviaA::cat are more susceptible than WT to UV stress (DNA damage).

Experimental Description Exposed deletion strains and WT cells to UV light for varying lengths of time, serially diluted and spot plated on LB-agar.

General Result No differences in viability were observed.


Objective Test if ΔravA::cat cells display abnormal growth using different carbon sources.

Experimental Description Screened deletion strain using the Biolog high-throughput metabolic screen. Examined cells for ability to grow on different carbon sources.

General Result Deletion cells appeared to utilize L-asparagine, fructose, L-alaninamide and β-D-glucuronic acid as carbon sources better than WT cells.

Objective Follow up A: verify growth phenotype observed with L-asparagine.

Experimental Description Grew deletion strain in minimal media aerobically using asparagine as the sole carbon source.

General Result Neither strain grew successfully overnight, but after an additional 8 hours both strains appeared to grow equally well.

Objective Follow up B: verify growth phenotype observed with L-asparagine.

Experimental Description Grew cells in minimal media containing Biolog indicator (w/ and w/o shaking).

General Result After 1 week of growth, the deletion strain, but not WT, appeared to utilize Asn (shaking culture). It is unclear whether this could be due to contamination (extended growth).

Objective Follow up C: verify growth phenotype observed with L-asparagine. Also examine ΔviaA::cat.

Experimental Description Grew deletion strains in minimal media aerobically using asparagine as the sole carbon source. Performed detailed growth curve analysis.

General Result All strains grew equally well. No phenotype apparent.

Objective Test if ΔravA::cat cells display abnormal growth using different nitrogen, phosphate and sulfur sources.

Experimental Description Screened deletion strain using the Biolog high-throughput metabolic screen. Examined cells for ability to grow on different nitrogen, phosphate and sulfur sources.

General Result Deletion strain utilized L-cysteine as a nitrogen source earlier than WT (but both were equally capable of using it by the completion of growth).


Objective Test if ΔravA::cat and ΔviaA::cat are more susceptible than WT to metal stress.

Experimental Description Spotted dilutions of WT and deletion strains on plates containing high concentrations of metals or EDTA metal chelator and examined ability to grow.

General Result ΔravA strain appeared to be less viable on plates containing high levels of Mn.

Objective Follow up: Verify Mn stress result.

Experimental Description Repeated experiment with Mn at pH values ranging from 5 to 9.

General Result ΔravA strain was less viable than WT and ΔviaA strains. Effect was more pronounced at lower pH.

Objective Follow up: General metal stress at lower pH

Experimental Description Repeated general metal stress screen, with selected metals, using LB-agar plates buffered to pH 5.0 and 6.0 with acetate and MES, respectively.

General Result No phenotype observed (even with Mn) at pH 6.0. No survival observed at pH 5.0.

Objective Follow up: Further examination of Mn stress result/complementation attempt.

Experimental Description Repeated Mn stress screen on ΔravA::cat strain using pH 5.0 (unbuffered media) and attempted to complement with pR plasmid. Also examined a ravA::Tn5 strain (FB21842).

General Result The phenotype was observed only in the deletion strain and was not successfully complemented by pR. The ravA::Tn5 strain did not show the phenotype. The phenotype may be due to a polar effect.

Objective Test ability of ΔravA::cat and ΔviaA::cat cells to grow under nitrogen-limiting conditions.

Experimental Description Carried out growth curve analysis of WT and deletion strains in minimal media containing glutamine as a nitrogen source.

General Result No difference in growth observed between strains.


Objective Compare ability of ΔravA and WT cells to grow anaerobically using various electron acceptors.

Experimental Description Grew cells anaerobically in minimal MOPS media, containing glycerol as a carbon source, in the presence of various electron acceptors and constructed growth curves.

General Result WT and ΔravA cells grew comparably under all conditions.

Objective Test if ΔravA cells are more susceptible to NO stress than WT.

Experimental Description Grew WT and ΔravA cells under a variety of conditions in the presence of different NO donor compounds. Constructed growth curves and compared growth.

General Result No consistent difference in growth was observed between WT and ΔravA strains.

Objective Compare growth of WT and ΔravA strain in the presence of YniC substrate.

Experimental Description Grew cells in MOPS minimal media containing 2-deoxy-glucose-6-phosphate (artificial YniC substrate). yniC deletion strains do not grow well in the presence of this compound.

General Result Both WT and ΔravA strains grew equally well in the media, suggesting that YniC activity is not appreciably decreased in the ΔravA strain.

Objective Compare ability of WT and ΔravA, ΔyehL and ΔravAΔyehL strains to grow anaerobically at low pH.

Experimental Description Grew cells in moderately rich media +/- excess glucose. Compared growth.

General Result All strains grew equally well under both standard and low pH conditions.

Objective Compare ability of WT and ΔravA cells to grow in the presence of toxic levels of adenine.

Experimental Description Grew cells in minimal W-salts media +/- adenine. Generated growth curve and compared.

General Result Both strains were affected equally by the presence of toxic levels of adenine.


Objective Test if ΔravA cells are compromised in their ability to reduce selenate.

Experimental Description Grew cells on LB plates +/- selenate. Examined cells for production of reddish colour (indicating selenate reduction).

General Result Both WT and ΔravA cells reduced selenate with equal effectiveness.

Objective Compare growth of WT and ΔravA strains in N-minimal media.

Experimental Description Grew strains in N-minimal media, which simulates the internal environment of macrophages, and generated growth curves.

General Result Both WT and ΔravA cells grew equally well.

Objective Compare ability of WT and ΔravA strains to form biofilms.

Experimental Description Grew cells, using supplemented LB media, in 96 well plates for 24 hours. Discarded culture, washed and stained with crystal violet in order to detect biofilms formed on plate.

General Result No significant difference in biofilm formation was observed between strains.

Objective Compare motility of WT and ravA::Tn5 strains.

Experimental Description Grew strains on minimal galactose media (inoculated into center of plates) and determined the extent of migration from the center.

General Result ΔravA strain appeared to be slightly less motile than WT strain. Difference is statistically significant.
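
Several of the stress assays in this section determine survival by colony count. The arithmetic behind such a count can be sketched as follows. This is an illustrative sketch only, not the procedure used in this thesis: the function names `cfu_per_ml` and `percent_survival` are hypothetical, the dilution is expressed as a fraction of the original culture (for example 1e-5), and all numbers are invented.

```python
def cfu_per_ml(colonies, dilution, volume_plated_ml):
    """Colony-forming units per mL of the undiluted culture.

    colonies         -- colonies counted on the plate
    dilution         -- fraction of the original culture in the plated sample (e.g. 1e-5)
    volume_plated_ml -- volume spread or spotted on the plate, in mL
    """
    if dilution <= 0 or volume_plated_ml <= 0:
        raise ValueError("dilution and plated volume must be positive")
    return colonies / (dilution * volume_plated_ml)


def percent_survival(cfu_treated, cfu_untreated):
    """Survival of the treated culture relative to the untreated culture."""
    return 100.0 * cfu_treated / cfu_untreated


# Invented counts (NOT data from this thesis).
untreated = cfu_per_ml(colonies=150, dilution=1e-5, volume_plated_ml=0.1)
treated = cfu_per_ml(colonies=30, dilution=1e-4, volume_plated_ml=0.1)
survival = percent_survival(treated, untreated)

print(f"untreated: {untreated:.2e} CFU/mL")
print(f"treated:   {treated:.2e} CFU/mL")
print(f"survival:  {survival:.1f} %")
```

Comparing the survival percentage of a deletion strain with that of WT under the same treatment gives the kind of comparison summarized in the results above.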

Biochemical Studies

Objective Test if His-RavA can direct the unfolding of GFP-ssrA.

Experimental Description Carried out degradation assays of GFP-ssrA using RavA protein and ClpP protease (Fluorometry study).

General Result GFP-ssrA was not degraded by ClpP in the presence or absence of RavA.


Objective Test if His-RavA has proteolytic activity.

Experimental Description Carried out in vitro degradation assays of unfolded GFP and BSA and analyzed by SDS-PAGE.

General Result No degradation of the unfolded proteins was observed.

Objective Test if His-RavA undergoes autocatalytic cleavage driven by ATP.

Experimental Description Incubated His-RavA in the presence of ATP and analyzed aliquots by SDS-PAGE at various timepoints.

General Result No evidence of autocatalytic cleavage was observed.

Objective Test if His-RavA can degrade peptides.

Experimental Description Carried out a degradation assay using His-RavA and a fluorogenic peptide (Fluorometry study).

General Result No peptide degradation was observed.

Objective Test if RavA can disaggregate proteins.

Experimental Description Provided protein to Ms. Sylvia Ho who conducted a luciferase disaggregation assay using RavA and unknowingly truncated ViaA.

General Result No disaggregation activity was observed for either protein alone or for the two together.

Objective Test if RavA can bind DNA (General).

Experimental Description Tested for general DNA binding to RavA. Mixed protein +/- ATP with DNA ladder, incubated and ran on an agarose gel. Examined for fragment shift indicative of binding.

General Result Very mild smearing of DNA was observed in the presence of protein +/- nucleotide. No strong shifts were observed, however.


Objective Test if RavA can bind its own upstream DNA region.

Experimental Description Amplified the regions upstream of RavA and of ClpX (as a control) by PCR. Incubated DNA +/- nucleotide & RavA, ran on an agarose gel and looked for band shift.

General Result No shifting of DNA bands was evident in the presence of RavA protein +/- nucleotide.

Objective Generate chromosomal RavA-FLAG construct (expression/pull-down studies).

Experimental Description Prepared construct in E. coli K12 MG1655. This construct has been verified by PCR.

General Result The single FLAG tag was not sufficient for detection/pull-down of RavA protein using anti-FLAG antibody.

Objective Test if purified RavA is phosphorylated.

Experimental Description Incubated purified RavA +/- shrimp alkaline phosphatase (SAP) and analyzed samples on an SDS-PAGE gel to determine if there was a difference in migration.

General Result No difference in migration was observed between SAP-treated and untreated samples. There is therefore no evidence of phosphorylation.

Objective Test if DNA/RNA has an effect on RavA ATPase activity.

Experimental Description Measured RavA ATPase using colorimetric and coupled enzyme assays. Analyzed +/- ssDNA and dsDNA oligos, and E. coli total genomic DNA and total RNA.

General Result No stimulation of RavA activity by DNA or RNA was observed.

Objective Follow up: Test DNA stimulation in the presence of LdcI or ViaA protein.

Experimental Description Measured activity of RavA using colorimetric ATPase assay in the presence of DNA plus LdcI or ViaA.

General Result Even in the presence of LdcI and ViaA, no DNA stimulation of RavA ATPase activity was observed.

Objective Examine the secondary structure of RavA by circular dichroism.

Experimental Description Used circular dichroism to determine the secondary structure of RavA protein and the RavA C-terminal domain.

General Result RavA protein is largely alpha-helical. The C-terminal region, however, is predominantly beta-sheet/random coil.

Objective Test if LdcI is metal dependent.

Experimental Description Carried out LdcI assay in the presence of EDTA to determine if it was dependent on metal for its activity.

General Result EDTA failed to inhibit LdcI, suggesting it is not metal-dependent or that the EDTA was not successful in extracting bound metal from the protein.

Objective Test if RavA possesses GTPase activity.

Experimental Description Carried out colorimetric malachite green assay using GTP as the RavA substrate.

General Result RavA utilized GTP but only at higher concentrations of substrate (non-Michaelis-Menten). This may suggest that GTP does not promote RavA oligomerization as readily as ATP.

Objective Test if RavA ATPase is stimulated by the FucU protein.

Experimental Description Carried out colorimetric malachite green assay on RavA in the presence/absence of FucU protein. Also included ViaA and LdcI proteins.

General Result The presence of FucU protein did not affect RavA ATPase activity.

Objective Examine RavA expression in minimal media.

Experimental Description Carried out growth curve and western blot analysis to study RavA expression when grown in minimal M9 media.

General Result Observed induction towards late log/stationary phase, the same induction pattern observed for cells grown in rich LB media.

Objective Test if LdcI stabilizes RavA at low pH.

Experimental Description Incubated RavA protein at pH 3.0 +/- LdcI and then measured its ATPase activity using the malachite green assay at neutral pH.

General Result The presence of LdcI resulted in a moderate reduction in RavA stability as determined by ATPase activity. This may have been a result of protein precipitation.

Objective Test if RavA can disaggregate/refold denatured GFP.

Experimental Description Used a fluorometric assay to determine if RavA +/- LdcI and ViaA could direct the disaggregation/refolding of thermally inactivated GFP protein.

General Result No disaggregation or refolding activity was observed.

Objective Test if RavA and ViaA interact by gel filtration.

Experimental Description Ran RavA/ViaA proteins alone and together (+/- ATP) on size exclusion column and looked for peak shifts indicative of interaction. Also analyzed fractions by SDS-PAGE.

General Result Although a slight shift of the RavA and ViaA peaks was observed, no strong evidence of an interaction was detectable using this method.

Objective Test if LdcI/LdcC activity is affected by the presence of metals.

Experimental Description Carried out standard LdcI assay in the presence of a variety of metals and compared to activity in the absence of metal.

General Result Found that certain metals caused a reduction in LdcI/LdcC activity (elimination in the case of Zn). Cadaverine controls showed that the effect was specific to the enzyme reaction.

Objective Test additional RavA interactions by gel filtration.

Experimental Description Carried out gel filtration analysis to determine if RavA forms a ternary complex with ViaA and LdcI, and also checked if RavA and FucU interact.

General Result No evidence of a ternary complex or an interaction between RavA and FucU was detected by this method.

Objective Test ATPase activity of RavA at higher salt concentrations.

Experimental Description Carried out colorimetric malachite green ATPase assay on RavA in the presence/absence of 300 mM NaCl or KCl.

General Result Found that RavA ATPase activity was reduced 3- to 5-fold at this salt concentration. Salt controls showed that the effect was specific to the enzyme reaction itself.

Objective Test LdcI expression in a CadB mutant strain.

Experimental Description Carried out LdcI western blot analysis on WT and cadB::Tn5 cells grown under LdcI inducing conditions.

General Result No LdcI expression was evident in the cadB::Tn5 strain. This is consistent with the two genes forming an operon.

Objective Test ATPase activity of RavA at various protein concentrations. (Ensure it is oligomeric under std conditions).

Experimental Description Carried out colorimetric malachite green ATPase assay using different ATP and RavA concentrations.

General Result Increasing concentrations of RavA did not result in an increase in ATPase activity, suggesting that RavA is fully oligomerized under standard assay conditions.

Objective Test GTPase activity of RavA at various protein concentrations. (See if oligomerization is less effective in GTP)

Experimental Description Carried out colorimetric malachite green ATPase assay using GTP and different concentrations of RavA.

General Result Increasing concentrations of RavA resulted in a marked increase in GTPase activity, suggesting that this nucleotide does not promote oligomerization as strongly as ATP.

Objective Follow up: Test GTPase of RavA at different concentrations in presence of LdcI (Does LdcI promote oligomerization?).

Experimental Description Carried out colorimetric malachite green ATPase assay using GTP in the presence/absence of LdcI.

General Result Found that LdcI enhances the GTPase activity of RavA markedly, but only at lower RavA concentrations. This suggests that LdcI may help RavA assemble (i.e. act as a scaffold).

Objective Test if RavA ATPase activity is stimulated by YniC.

Experimental Description Carried out colorimetric malachite green ATPase assay in the presence/absence of YniC and/or ViaA.

General Result YniC did not lead to stimulation of RavA ATPase activity.

Objective Test if YniC phosphatase activity is enhanced by RavA.

Experimental Description Carried out colorimetric malachite green assay to measure phosphatase activity of YniC in the presence/absence of RavA and ViaA.

General Result No effect on YniC activity was observed in the presence of RavA and/or ViaA.

RavA Crystallization Studies

Extensive efforts to obtain RavA crystals for structure determination have been made by myself and another graduate student in the lab, Mr. Usheer Kanjee. To date, sizeable, diffracting crystals have only been obtained for RavA in the absence of nucleotide. These crystals are hexagonal in appearance and develop in solutions containing malonate or sodium citrate/formate (depending on the RavA construct used). Regrettably, they only diffract to 8 Å at best, and efforts to refine them have been unsuccessful. Usheer is currently pursuing this avenue of the project and will be responsible for publishing details of these experiments.
