A Dissertation

Entitled

Expanded Functionality of the Bacterial Global Regulator Lrp

by

Benjamin R. Hart

Submitted to the Graduate Faculty as partial fulfillment of the requirements for the

Doctor of Philosophy Degree in Biomedical Science

______Dr. Robert Blumenthal, Major Advisor

______Dr. John David Dignam, Committee Member

______Dr. Ivana de la Serna, Committee Member

______Dr. R. Mark Wooten, Committee Member

______Dr. Isabel Novella, Committee Member

______Dr. Patricia Komuniecki,

Dean of the College of Graduate Studies

The University of Toledo

August 2010

Copyright 2010, Benjamin R. Hart. This document is copyrighted material. Under copyright law, no parts of this document may be reproduced without the express permission of the author.

Abstract

Predicting gene regulation from genome sequences is an important technique for understanding bacteria that cannot currently be grown in the laboratory. This approach involves extrapolation from a well-characterized bacterium. Several assumptions are made when using this technique; key among these is that sequence-conserved transcription factors, target genes, and binding sites for these transcription factors upstream of the target genes together imply conservation of the regulation. However, the level of conservation necessary for accurate predictions has not been defined.

Previous studies have illustrated that the Leucine Responsive Regulatory Protein (Lrp) orthologs from Escherichia coli and Proteus mirabilis have only partially conserved regulatory effects despite 98% overall amino acid sequence identity and complete conservation of the DNA-binding helix-turn-helix domain. Studies described here reveal that these regulatory differences are associated with previously unappreciated but fundamental functional differences between the Lrp orthologs. These studies are particularly important for predicting regulation from genome sequence, as Lrp is a global regulator that, in E. coli, directly controls over 200 genes.

The first manuscript, prepared for the Journal of Bacteriology, focuses on the amino acid coregulators of Lrp, which for E. coli were previously known to include only leucine and alanine. This study revealed that methionine, isoleucine, histidine, and threonine also have significant coregulatory effects. In addition, modest differences between Lrp orthologs were observed in response to some amino acids.

The second manuscript focuses on the role of the N-terminal tail of the Lrp protein. This unstructured tail contains many of the sequence differences between Lrp orthologs. Through the generation of hybrid proteins, this study demonstrates that the N-terminal tails contribute to differences in transcription and DNA binding between E. coli and P. mirabilis Lrp.

Together, these results suggest that even overall sequence identity of 98% is insufficient to allow regulatory extrapolation in the absence of fairly detailed understanding of the regulatory protein.

To Greg Hart and Susan Habegger. Their loving support has always been appreciated, and I thank them for all their encouragement.

Acknowledgments

This dissertation would not have been possible without the love, support, and help of my family and friends who have helped me along the way. It was through their support that I was able to persist through the challenging times.

I wish to thank my major advisor, Dr. Robert Blumenthal, for taking me into his lab and providing me the opportunity to work under his supervision. His guidance has really helped me develop over the last few years.

I would like to extend my gratitude to my committee members, Dr. John David Dignam, Dr. Isabel Novella, Dr. R. Mark Wooten, and Dr. Ivana de la Serna, for all the time and advice they have offered.

I am especially grateful to Dr. John David Dignam for helping me purify protein and for the lengthy discussions regarding protein assembly and the biochemistry involved in this work.

I would like to thank my collaborators Dr. Jennifer Hinerman and Dr. Andrew Herr for their work with analytical centrifugation studies.

I would like to extend my appreciation to Dr. Ronald Viola and Buenafe Arachea for help with, and access to, the instrument for the dynamic light scattering experiments.

I would like to thank the members of the Blumenthal Lab: Dr. Iwona Mruk, Dr. Robert Lintner, Dr. Pankaj Mishra, Dr. Kristen Williams, and Xaiochen Zhao for all their suggestions and advice.

I would like to extend my thanks to the secretaries of the Medical Microbiology and Immunology Department, especially Sue Payne, Sharron Ellard, Tracy McDaniel, and Tamara Chaimberlin.

Table of Contents

Acknowledgements
Table of Contents
1 Literature
1.1 Prediction of transcriptional regulation among bacteria is currently difficult
1.1.1 Most regulatory predictions from genome sequences use extrapolation from well-studied bacteria
1.1.2 One of the challenges for regulatory prediction is determining the extent to which conserved sequence indicates conserved function, both among transcription factors and target genes
1.1.3 An even greater challenge for prediction involves identifying binding sites for transcription factors in the DNA upstream of genes
1.1.4 Challenges to making regulatory predictions are present at the regulatory network and cellular level
1.1.5 This thesis focuses on the conservation of function in a model transcription factor
1.2 Lrp is a good model transcription factor for studying conservation of function
1.2.1 Lrp is a well-studied global regulator in E. coli
1.2.2 In E. coli, Lrp can have a variety of regulatory effects
1.2.3 Lrp is widespread, and particularly highly conserved among Enterobacteriaceae
1.2.4 Lrp has been structurally characterized
1.2.5 binding changes the Lrp oligomeric state
1.2.6 Lrp from E. coli and P. mirabilis have significant functional differences despite 98% sequence identity
2 Unexpected coregulator range and ortholog-specific differences in the global regulator Lrp of Escherichia coli and Proteus mirabilis
2.1 Abstract
2.2 Introduction
2.3 Materials and methods
2.4 Results
2.5 Discussion
2.6 References
3 Recognition of DNA by the Helix-Turn-Helix Global Regulatory Protein Lrp is Modulated by the Amino Terminus
3.1 Abstract
3.2 Introduction
3.3 Results
3.4 Discussion
3.5 Methods
3.6 References
3.7 Supplemental Figures
4 Discussion
4.1 Lrp orthologs from closely related species have distinct regulatory effects
4.1.1 The N-terminus of Lrp is responsible for some of the differences between Lrp function
4.1.2 Comparisons of the N-terminus of Lrp to established regulatory regions offer insights into Lrp function
4.1.3 N-terminus of Lrp is a possible region for fine tuning Lrp regulation without sacrificing important regulatory connections
4.2 Lrp has a broader range of coregulators than was known
4.2.1 Amino acid sensitivity is broader than previously expected
4.2.2 Differences within the Lrp RAM domain are associated with differences in Lrp sensitivity to co-regulators
4.2.3 Evolutionary dynamics of Lrp. What can we learn from the N and C domains and substitutions that have appeared within these regions?
4.3 Summary
5 References

1 Literature

1.1 Prediction of transcriptional regulation among bacteria is currently difficult.

Prediction of gene regulation across species is a complex task, requiring an in-depth knowledge of evolutionary conservation of transcriptional regulators, binding sites within promoters of target genes and the target genes themselves (Madan Babu, Teichmann et al. 2006; Lintner, Mishra et al. 2008; Lintner, Mishra et al. 2008; Baumbach, Rahmann et al. 2009), as well as structure-function relationships in the regulatory proteins. The basic assumption is that if the target genes, transcription regulators and binding sites are conserved, the regulation of the target gene by the regulator should also be conserved. Determining the presence of a target gene and regulator is relatively straightforward, since bacterial “parts lists” (consisting of genes, open reading frames, and regulatory elements) can readily be determined (VanBogelen, Greis et al. 1999; Mao, Su et al. 2006; Powell and Hutchison 2006). Major challenges remain, however, especially in making the connections between parts in the regulatory network (Kreimer, Borenstein et al. 2008).

1.1.1 Most regulatory predictions from genome sequences use extrapolation from well-studied bacteria.

The availability of over a thousand bacterial genomes, many of which are from organisms that have not been grown in the laboratory, generates a strong impetus to understand the regulatory systems in these organisms based on their DNA sequences (Janga and Collado-Vides 2007; Kreimer, Borenstein et al. 2008). To predict regulatory networks, the most commonly used approach is to compare sequences to those of a well-studied organism (Babu, Luscombe et al. 2004; Espinosa, Gonzalez et al. 2005; Madan Babu, Teichmann et al. 2006; Jothi, Przytycka et al. 2007). However, there are surprisingly few bacteria that have been studied in great depth at the physiological and regulatory levels. Most often, the reference organism is the best-studied organism to date, E. coli K12 (Edwards, Ibarra et al. 2001; Martinez-Antonio and Collado-Vides 2003; Espinosa, Gonzalez et al. 2005; Balaji and Aravind 2007; Janga and Collado-Vides 2007; Price, Dehal et al. 2007; Karimpour-Fard, Leach et al. 2008; Seshasayee, Fraser et al. 2009).

This model is not without limitations: E. coli K12 itself is still not completely understood, and only limited predictions about its metabolism are now being made (Edwards, Ibarra et al. 2001; Martinez-Antonio and Collado-Vides 2003; Kreimer, Borenstein et al. 2008). It is important to understand that, while computationally based predictions are alluring and still being improved, their results are frequently unreliable, so there is still room for traditional biochemical methodology (VanBogelen, Greis et al. 1999), especially for organisms that can be studied in the lab. The use of these predictions is limited to short evolutionary distances (Babu, Luscombe et al. 2004; Balaji and Aravind 2007), since evolutionary divergence makes predictions in distant organisms unreliable (Janga and Collado-Vides 2007). Regulatory networks are evolutionarily flexible, and even closely related bacteria that live in distinct ecological niches may have substantially different regulatory needs (Madan Babu and Teichmann 2003; Madan Babu, Teichmann et al. 2006; Balaji and Aravind 2007). Despite these concerns, both because of the increasing use of extrapolatory methods and because of the intrinsic importance of understanding the evolution of regulatory architecture, it is essential to have laboratory tests of the assumptions (of conservation of binding sites, transcriptional regulators, and target genes) underlying this approach to predicting regulation.

1.1.2 One of the challenges for regulatory prediction is determining the extent to which conserved sequence indicates conserved function, both among transcription factors and target genes.

Understanding the limits of evolutionary conservation is key to making accurate predictions across species (Price, Dehal et al. 2007; Baumbach, Rahmann et al. 2009). However, the exact level of conservation required is unknown, meaning it is difficult to state how similar genes need to be to have predictive value (Tian and Skolnick 2003; Jothi, Przytycka et al. 2007; Price, Dehal et al. 2007; Loewenstein, Raimondo et al. 2009). To illustrate that high sequence identity does not necessarily imply conserved function, synthetic gene pairs with 88% sequence identity have been constructed that have completely different structures and functions (Alexander, He et al. 2007). Further, examples exist where single mutations can radically change protein function. For example, a change in the tetracycline repressor can change the role of the inducer to that of a co-repressor (Kamionka, Bogdanska-Urbaniak et al. 2004), and mutations in cyclic adenosine monophosphate (cAMP) receptor protein (CRP) can eliminate the need for its co-activator (Lin, Kovac et al. 2002). This flexibility is true of target genes as well as regulators: a single amino acid substitution in benzophenone synthase can convert it into a phenylpyrone synthase, and change both its and its (Klundt, Bocola et al. 2009). As these cases illustrate, it is necessary to know not only the percent identity of a protein but also where within the protein differences may have substantial functional consequences (Ng and Henikoff 2001; Ng and Henikoff 2003; Wu, Mao et al. 2007; Karimpour-Fard, Leach et al. 2008; Loewenstein, Raimondo et al. 2009). Currently, the standard method for identifying orthologs is sequence comparison, yielding a percent identity score and using substitution matrices to assess how “conservative” a substitution might be (Margelevicius and Venclovas ; Al-Shahib, Breitling et al. 2005; Price, Dehal et al. 2007; Wu, Mao et al. 2007). Supplemental methods have been developed to predict physical and chemical properties, and even putative post-translational modifications (Jensen, Ussery et al. 2003). Homology-based approaches to predicting protein function and annotation nevertheless have serious limitations (Ng and Henikoff 2001; Ng and Henikoff 2003; Wu, Mao et al. 2007; Karimpour-Fard, Leach et al. 2008; Loewenstein, Raimondo et al. 2009). When used in conjunction with structural information, the predictions improve but are still far from perfect (Loewenstein, Raimondo et al. 2009). Breaking proteins down into structural or functional domains can provide additional information content and therefore better predictions (Devos and Valencia 2000). Additional information, such as contextual information regarding gene clustering (Wu, Mao et al. 2007), where target genes are closely linked to their transcription factors, can help to predict functional homology. However, very little information regarding essential functional regions is built into these methods, making single important amino acid differences a significant concern when predicting gene function. To address this issue, techniques have been developed that can predict functional and deleterious sites on proteins from sequence data (Ng and Henikoff 2001; Ng and Henikoff 2003). These intrinsic protein properties are one level at which technical improvements are being made for predicting gene function.

Thus far, I have only considered factors intrinsic to the regulatory or target protein itself. However, extrinsic protein properties also play an important role in regulatory prediction. For example, protein levels must be similar between species for regulatory patterns to be conserved (Seshasayee, Fraser et al. 2009). In one study, several orthologous genes from E. coli and Shewanella were found to have different expression patterns (Price, Dehal et al. 2007). In addition, phenotypic and expression differences have been observed among Streptococcus pneumoniae strains (Hendriksen, Silva et al. 2007). Other extrinsic factors include pool size variation of coregulatory molecules.
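The distinction between how much two proteins differ and where they differ can be made concrete with a minimal sketch. The two peptides below are invented for illustration (they are not Lrp sequences), the function names are hypothetical, and the sequences are assumed to be already aligned to equal length; a real comparison would start from an alignment program and a substitution matrix.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent identity over gap-free columns of two pre-aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be pre-aligned to equal length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a, seq_b):
        if a == "-" or b == "-":
            continue  # skip alignment gaps
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared


def differing_positions(seq_a: str, seq_b: str) -> list[int]:
    """1-based alignment columns at which the two sequences differ."""
    return [i for i, (a, b) in enumerate(zip(seq_a, seq_b), start=1) if a != b]


# Hypothetical toy peptides, invented for this example only.
toy_a = "MKTLDRIDNKLLAELQKDGR"
toy_b = "MKSLDRIDNKILAELQKDGR"

identity = percent_identity(toy_a, toy_b)
positions = differing_positions(toy_a, toy_b)
print(f"{identity:.1f}% identical; differences at columns {positions}")
```

The single identity score says nothing about whether the differing columns fall in a DNA-binding helix, a coregulator-binding pocket, or a flexible tail, which is why the list of positions is reported alongside it.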

1.1.3 An even greater challenge for prediction involves identifying binding sites for transcription factors in the DNA upstream of genes.

The most challenging aspect of predicting regulation is the identification of transcription factor binding sites on the DNA (Sandve and Drablos 2006; Doniger and Fay 2007; Wei and Yu 2007; Harari, del Val et al. 2009). This is partially due to the short and often-degenerate nature of such binding sites (Xie, Pan et al. ; Lapidot, Mizrahi-Man et al. 2008; Harari, del Val et al. 2009). Despite these complications, methods continue to be developed to identify and predict transcription factor binding sites. In 2000, Stormo reviewed major developments from the first identification of motifs for the -10 region of promoters (Pribnow 1975) to many of the more complex approaches that provide the backbone for current techniques (Stormo 2000). The basic method of predicting binding sites consists of four steps: data collection, model building, generation of position weight matrices, and ultimately expressing the information content as a sequence logo, in which frequency information about positions in alignments is expressed as a bit score (Wasserman and Sandelin 2004). One of three basic methods is typically used for the generation of weight matrices: Expectation Maximization (EM), Gibbs sampling (a stochastic version of EM), or phylogenetic footprinting (Janga and Collado-Vides 2007; Hawkins, Grant et al. 2009; van Hijum, Medema et al. 2009).

Expectation Maximization is built around position weight matrices. These algorithms store sequence data as a matrix of probabilities for each position in a sequence. The probabilistic matrix is then improved through repeated iterations, incorporating new sequences and positions until the matrix is optimized (Lawrence and Reilly 1990; Cardon and Stormo 1992; Stormo 2000). Simply speaking, this method fits a short window within a promoter and, by repeatedly shifting this window, identifies an optimized alignment for a proposed binding site from information stored in a matrix. The advantage of storing sequence data in this format is that, instead of a single sequence result, a motif is defined by base probabilities at each position. The Expectation Maximization method seeks to optimize the position weight matrix, fitting the sequences the algorithm is trained with, while Gibbs sampling fits a series of sequence alignments that best define a common matrix (Siddharthan, Siggia et al. 2005). Gibbs sampling, like Expectation Maximization, stores binding site motif information in a matrix. It differs, however, by randomizing the positioning of various sites within the alignments and, throughout the motif identification process, probabilistically determining whether the motif can be improved; it therefore frequently runs faster than EM by eliminating the thorough systematic search through the sequences (Lawrence, Altschul et al. 1993; van Hijum, Medema et al. 2009). Phylogenetic footprinting approaches consist of aligning known transcription factor binding sites of one specific gene from multiple species, or of aligning all known sites for a transcription factor within one organism (Wang and Stormo 2003).

Three assumptions inherent to determining binding sites with position weight matrices are: 1. that interdependence between individual bases does not exist (i.e., each position contributes to overall binding energy independently of all other positions in the transcription factor binding site); 2. that spacing is inflexible; and 3. that binding site positions are fixed within the motif (Wasserman and Sandelin 2004).
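The matrix representation described above can be illustrated with a minimal sketch. It is not an implementation of Expectation Maximization or Gibbs sampling; it only shows the data structure those methods optimize, the per-position information content (in bits) that a sequence logo displays, and a sliding-window scan. The aligned sites, the pseudocount, the uniform background of 0.25 per base, and all function names are assumptions made for illustration.

```python
import math

BASES = "ACGT"


def position_frequency_matrix(sites, pseudocount=1.0):
    """Per-position base probabilities from equal-length aligned sites."""
    width = len(sites[0])
    if any(len(site) != width for site in sites):
        raise ValueError("all sites must have the same length")
    matrix = []
    for col in range(width):
        counts = {base: pseudocount for base in BASES}
        for site in sites:
            counts[site[col]] += 1
        total = sum(counts.values())
        matrix.append({base: counts[base] / total for base in BASES})
    return matrix


def information_content(matrix):
    """Bits per position: 2 + sum(p * log2 p), assuming a uniform background."""
    return [
        2.0 + sum(p * math.log2(p) for p in column.values() if p > 0)
        for column in matrix
    ]


def log_odds_score(matrix, window, background=0.25):
    """Sum of per-position log-odds; positions are treated as independent."""
    return sum(math.log2(col[base] / background) for col, base in zip(matrix, window))


def best_window(matrix, sequence):
    """Slide a motif-width window along the sequence; return (score, start)."""
    width = len(matrix)
    return max(
        (log_odds_score(matrix, sequence[i:i + width]), i)
        for i in range(len(sequence) - width + 1)
    )


# Hypothetical aligned sites, invented for this example only.
toy_sites = ["TTGACA", "TTGACT", "TTTACA", "TTGATA", "CTGACA"]
pfm = position_frequency_matrix(toy_sites)
bits = information_content(pfm)
score, start = best_window(pfm, "GGGGTTGACAGGGG")
print([round(b, 2) for b in bits], start)
```

Because the score is a simple sum over columns, the sketch embodies the first assumption listed above (independent positions) and, with its fixed window width, the assumptions of inflexible spacing and fixed positions as well.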

These techniques, while useful, suffer from three primary problems. The first complication is known as the “Futility theorem”, according to which no methods beyond conventional tools are able to validate predictions, making predictions hard to confirm or falsify (Wasserman and Sandelin 2004; Sandve and Drablos 2006; Wei and Yu 2007; Lapidot, Mizrahi-Man et al. 2008; Perez and Groisman 2009). The second problem, frequently observed with high-throughput laboratory techniques, is a low signal-to-noise ratio, in which noise can mask weak signals (Rhodius and Mutalik ; Doniger and Fay 2007; Wei and Yu 2007; Ernst, Beg et al. 2008; Lapidot, Mizrahi-Man et al. 2008; Nagaraj, O'Flanagan et al. 2008; Harari, del Val et al. 2009). The third complication is that many of the methods are imprinted with contextual information based on the test genome, normally through weight matrices, which can leave an inherent bias in searching techniques (Stormo 2000; van Hijum, Medema et al. 2009). These biases include base composition and general patterns that can appear (Stormo 2000). When many of these programs were assessed, the computational tools often worked much better on datasets designed for testing these algorithms than on datasets consisting of new sequences (Tompa, Li et al. 2005). It is now common practice to use multiple methods in order to reduce false positives and improve results (Sandve and Drablos 2006; van Hijum, Medema et al. 2009).

Incorporating our understanding of evolution into these methods is an area of current focus. Transcription factor binding sites are more evolutionarily conserved than the surrounding non-coding DNA, though variability across species is still seen (Moses, Chiang et al. 2003). As with prediction of protein function, the biggest limitation to detecting transcription factor binding sites is still evolutionary distance. Within the γ-proteobacteria, binding sites are conserved over short evolutionary distances; for example, E. coli and Salmonella typhimurium contain comparable binding sites for many genes and transcription factors, as analyzed by a bioinformatic approach (Espinosa, Gonzalez et al. 2005). However, when the comparison was extended to Haemophilus influenzae or Vibrio cholerae, sites were less frequently conserved (Espinosa, Gonzalez et al. 2005). Another complication arises, as in predicting protein function, in that substitutions vary in the strength of their effects. A single mutation in the upstream region of E. coli gal can inactivate both of its promoters (Bingham, Ponnambalam et al. 1986). Single base mutations in tetPA can result in several hundredfold decreases in expression (Daniels and Bertrand 1985). Even when the regulator and binding site are both present, the nature of the regulation they produce is still flexible. For example, the general transcriptional activator ComK in Bacillus subtilis, when placed into Lactococcus lactis, acts as a general repressor (Susanna, den Hengst et al. 2006). Some general rules apply: activators are nearly always upstream of the start site, whereas repressors tend to function through mechanisms that typically could exist on either side of the transcriptional start site (Madan Babu and Teichmann 2003; Janga and Collado-Vides 2007). These findings have provided some general ground rules and, with additional information, we may acquire the ability to make accurate transcription factor binding site predictions. However, at present transcription factor binding site prediction is still difficult.

1.1.4 Challenges to making regulatory predictions are present at the regulatory network and cellular level.

Complications at both the protein and genomic levels clearly make regulatory predictions challenging. Important issues that require attention include transcription factor levels (Lintner, Mishra et al. 2008; Seshasayee, Fraser et al. 2009) and the ability to identify co-evolution between a transcription factor binding site and its transcription factor (Gelfand 2006). Another complication is differential rates of evolution, with fixation of mutations in repressors occurring more frequently than in activators (Balaji and Aravind 2007), and transcription factors in general acquiring fixed mutations more frequently than the genes they regulate when observed across a range of species (Janga and Collado-Vides 2007; Perez and Groisman 2009). In some cases the challenges are reduced; for example, genes that have been introduced via horizontal gene transfer tend to bring neighboring regulators and other related genes along with them in close proximity, making prediction easier (Gelfand 2006; Seshasayee, Fraser et al. 2009). However, a combination of gene duplication and horizontal gene transfer can lead to rapid functional divergence (Price, Dehal et al. 2007). Transcription factor binding sites can be highly plastic, and are subject to frequent loss and gain of function. In four Saccharomyces species as much as 31% of transcription factor binding sites involved some level of turnover (be it loss or replacement) (Doniger and Fay 2007), indicative of continuous low-level regulatory rewiring. Rewiring experiments in bacteria, where promoters and genes were mismatched and combined in a plasmid-borne construct, revealed that bacterial regulatory networks are highly plastic and that rewired duplication events were not generally detrimental to cellular growth but in many cases actually improved fitness (Isalan, Lemerle et al. 2008).

Currently, limited extrapolations are possible. Lintner et al. showed that despite introducing transcription factors with nearly identical sequences into the same bacterial background, the resulting regulatory patterns were significantly different (Lintner, Mishra et al. 2008). As we gain increasing insight and expand our grasp of the key factors, we will be able to improve our predictive methods and more successfully extrapolate regulation across species. One of the keys to making accurate predictions will involve systematic large-scale studies of protein-DNA interactions. These studies will elucidate the key binding and specificity constraints of transcription factor binding sites and transcription factors (Gelfand 2006), helping to expand our insights into many of these complications. These studies would involve a mutagenic approach to studying both transcription factors and their binding sites, so that binding sites can be predicted for variants of transcription factors. To predict regulation accurately, more information than transcription factors and transcription factor binding sites should be considered. The roles and plasticity of protein-protein interactions, small RNAs, RNA binding proteins, riboswitches, supercoiling, and sigma factors indicate that evolution of regulatory networks is complex (Martinez-Nunez, Perez-Rueda et al.).

1.1.5 This thesis focuses on functional differences in a model transcription factor.

Lintner et al. took a systematic approach to test the accuracy of the extrapolation model for predicting regulation (Lintner, Mishra et al. 2008). They focused on a well-characterized global transcription factor for several reasons. Global transcription factors are important for general regulation of many physiologically relevant characteristics. In addition, the ability to understand and predict regulation from one of these top-tier regulatory genes is far more important to understanding a cell’s physiology than focusing on transcription factors targeting a single gene. This thesis follows up on some surprising differences, observed by Lintner et al., between closely related orthologs of a global regulator.

1.2 Lrp is a good model transcription factor for studying conservation of function.

Our studies focus on the E. coli global regulator Leucine Responsive Regulatory Protein (Lrp). Lrp is one of the top-tier regulators in E. coli, capable of regulating ~400 genes. The top 7 global regulators in E. coli together regulate nearly half of the E. coli genome (Martinez-Antonio and Collado-Vides 2003), making these regulators important proteins to understand if adequate predictions are to be made in additional organisms. Lrp can also be found in a broad range of species, from bacteria to archaea, where its functions vary by species but frequently involve sensing the organism’s environment and regulating metabolism and virulence (Newman and Lin 1995; Brinkman, Ettema et al. 2003; Yokoyama, Ishijima et al. 2006; Baek, Wang et al. 2009).

1.2.1 Lrp is a well-studied global regulator in E. coli.

Global regulators are transcription factors in bacteria that control over 100 members within their regulon (Gottesman 1984). The top 7 global regulators in E. coli directly regulate over half of the genome (Martinez-Antonio and Collado-Vides 2003). Global regulators have broad effects on bacterial physiology, often sensing the environment broadly and producing robust cellular responses to environmental changes (Martinez-Antonio and Collado-Vides 2003).

Lrp is a global regulator in the γ-proteobacteria that can regulate ~10% of the E. coli genome and is identified as a feast or famine protein (Tani, Khodursky et al. 2002). Based on analysis of the results of chromatin immunoprecipitation microarrays, Lrp directly regulates approximately 200 genes (Cho, Barrett et al. 2008), and about 400 overall (Lintner, Mishra et al. 2008). Lrp is nonessential in E. coli, though Lrp- strains are on a metabolic precipice and are sensitive to mutations that would have minimal effects in Lrp+ strains (Ambartsoumian, D'Ari et al. 1994). Broadly speaking, Lrp regulates a large and diverse set of genes across many physiological classes, though its generalized role is to help the cell adjust between rich and minimal environments (Newman, D'Ari et al. 1992). As a feast or famine protein, Lrp regulates many genes involved in bacterial metabolism by upregulating biosynthetic genes and downregulating transporters and catabolic genes (Newman, D'Ari et al. 1992; Newman and Lin 1995). Lrp directly regulates many genes involved in amino acid synthesis and metabolism, including livKHMGF, which codes for the high-affinity branched-chain amino acid transporters (Haney, Platko et al. 1992; Bhagwat, Rice et al. 1997); ilvIH, coding for acetohydroxyacid synthase III (Platko, Willins et al. 1990; Willins, Ryan et al. 1991; Willins and Calvo 1992; Marasco, Varcamonti et al. 1994); ilvGMEDA, which encodes proteins involved in the synthesis of branched-chain amino acids (Rhee, Parekh et al. 1996); and gltBD, specifying glutamate synthase (Ernsting, Denninger et al. 1993; Borst, Blumenthal et al. 1996; Wiese, Ernsting et al. 1997). These genes are mentioned here in part because they provide some insights into the regulatory scope of Lrp, and also because some are used in experiments discussed in later chapters.

Additionally, Lrp regulates complex phenotypes such as swarming in P. mirabilis (Hay, Tipper et al. 1997), and it is involved in the regulation of fimbriae and virulence in Salmonella enterica serovar Typhimurium as well as phase variation regulating production of various fimbriae in E. coli (Lahooti, Roesch et al. 2005; McFarland, Lucchini et al. 2008; Corcoran and Dorman 2009). The important global role for Lrp in bacterial physiology, and its conservation across bacterial species, make it an attractive model for testing regulatory extrapolation.

1.2.2 In E. coli, Lrp can have a variety of regulatory effects.

Lrp without a coregulator can cause activation (Platko, Willins et al. 1990; Ernsting, Denninger et al. 1993) or repression (Wang, Wu et al. 1994), depending on the promoter it binds to. However, upon binding of a coregulator (e.g. leucine), the Lrp effect can change to enhance activation of the promoter, repress (or more strongly repress) the target promoter, or switch from activation to repression of the promoter (Calvo and Matthews 1994; Bhagwat, Rice et al. 1997; Lintner, Mishra et al. 2008).

This regulation is dependent not only on the presence of leucine but also on the

Lrp level (Bhagwat, Rice et al. 1997). Understanding how lrp itself is regulated is therefore necessary to understand its role in the E. coli regulatory network. In rich media the Lrp level increases as the cells approach stationary phase, reaching levels 1.3x higher in stationary phase than in log phase. Cells grown in minimal media have 3-4x higher levels of Lrp than cells grown in rich media (Landgraf, Wu et al. 1996). Lrp is moderately abundant, with maximal levels of ~6000 subunits per cell (Cui, Midkiff et al. 1996), reaching concentrations as high as 15 µM (Chen, Hao et al. 2001). Lrp represses its own promoter, forming a negative feedback loop (Wang, Wu et al. 1994). In addition to autoregulation and growth phase effects, lrp is regulated by ppGpp, which is formed when ribosomes encounter uncharged tRNAs, further linking lrp expression to growth rate (Landgraf, Wu et al.

1996).
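As a rough consistency check on the abundance figures quoted above, a copy number per cell can be converted to a molar concentration with c = N / (N_A V). The following is a minimal sketch of that arithmetic; the 1 fL cell volume is an illustrative assumption that does not come from the cited studies, and the function names are hypothetical.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def molar_concentration_uM(copies_per_cell, cell_volume_fL):
    """Convert copies per cell to micromolar for a cell volume given in femtoliters."""
    volume_L = cell_volume_fL * 1e-15
    return copies_per_cell / (AVOGADRO * volume_L) * 1e6


def implied_volume_fL(copies_per_cell, concentration_uM):
    """Cell volume (fL) at which a copy number corresponds to a given concentration."""
    moles = copies_per_cell / AVOGADRO
    return moles / (concentration_uM * 1e-6) * 1e15


subunits = 6000           # ~6000 Lrp subunits per cell, as quoted in the text
assumed_volume_fL = 1.0   # ASSUMPTION: illustrative cell volume, not from the source

conc = molar_concentration_uM(subunits, assumed_volume_fL)
vol = implied_volume_fL(subunits, 15.0)   # 15 uM is the upper value quoted in the text
octamers = subunits // 8                  # if every subunit were assembled into octamers

print(f"{conc:.1f} uM at {assumed_volume_fL} fL; 15 uM implies {vol:.2f} fL; {octamers} octamers")
```

Under this assumed volume the two quoted figures agree to within a factor of about 1.5, which is reasonable given that they come from different studies and growth conditions.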

To understand and predict Lrp’s regulatory roles we need to understand its binding sites. The variability of Lrp binding sequences and cooperative effects of adjacent binding sites make this a significant challenge. However, an Lrp consensus site was derived from multiple binding-site sequences: AGAATTTTATTCT (Cui, Wang et al.

1995; Yokoyama, Ishijima et al. 2006). In many cases, this consensus did not provide adequate information for site prediction. Using systematic evolution of ligands by exponential enrichment (SELEX), a second consensus site was determined:

YAGHAWATTWTDCTR where Y = C/T, H = not G, W = A/T, D = not C, and R = A/G

(Cui, Wang et al. 1995; Cui, Midkiff et al. 1996). These sites are consistent with the previous consensus, and provide additional predictive flexibility. However, these sites alone did not provide the strong binding affinity observed experimentally unless flanking

DNA consisting of a minimum of 3-4 bases of single stranded DNA surrounding the consensus was included (Cui, Midkiff et al. 1996). The addition of this flanking DNA is consistent with a possible role for the N-terminus of Lrp in DNA binding (de los Rios and

Perona 2007). In comparisons with natural binding sites, differences were still observed when compared to the Lrp binding site logo sequence (Shultzaberger and Schneider

1999). The Lrp binding site logo implies that individual Lrp binding sites were most likely monomeric in nature, reflecting the asymmetric nature of the sites, and not dimeric

as supported by the imperfect palindrome uncovered from SELEX experiments (Cui,

Wang et al. 1995; Shultzaberger and Schneider 1999). It was not possible to distinguish between activation and repression based solely on the binding site sequences

(Shultzaberger and Schneider 1999), implying that multimerization and/or positioning on the promoter likely define Lrp’s role at a given promoter.
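To make the degenerate SELEX consensus concrete, the sketch below translates its IUPAC codes into a regular expression and scans a sequence for matches. This is only an illustration of how such a consensus can be searched: the test sequence is invented, only the given strand is scanned, and, as the text notes, a consensus match alone neither guarantees strong binding nor indicates the direction of regulation.

```python
import re

# IUPAC codes used in the SELEX consensus, as defined in the text
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "Y": "[CT]",   # C or T
    "R": "[AG]",   # A or G
    "W": "[AT]",   # A or T
    "H": "[ACT]",  # not G
    "D": "[AGT]",  # not C
}

SELEX_CONSENSUS = "YAGHAWATTWTDCTR"


def consensus_to_regex(consensus):
    """Translate an IUPAC consensus string into a regular expression."""
    return "".join(IUPAC[base] for base in consensus)


def find_sites(sequence, consensus=SELEX_CONSENSUS):
    """Return (start, matched_text) for every match, including overlapping ones."""
    # The lookahead lets overlapping candidate sites be reported
    pattern = re.compile("(?=(" + consensus_to_regex(consensus) + "))")
    return [(m.start(), m.group(1)) for m in pattern.finditer(sequence.upper())]


# HYPOTHETICAL test sequence: one consensus-matching site embedded in filler
toy_sequence = "GGCC" + "CAGAATATTATTCTG" + "TTGACA"
print(find_sites(toy_sequence))
```

A position weight matrix or sequence logo, as used by Shultzaberger and Schneider, captures graded base preferences that a yes/no pattern match of this kind cannot.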

Contextual clues within the sequence can provide potential indications of Lrp’s transcriptional role. Positioning of binding sites provides a strong indication of the sign of transcriptional regulation (Madan Babu and Teichmann 2003). This holds true for Lrp.

Lrp binding far upstream of the promoter is typically associated with activation. This can be seen with the gltBDF promoter where Lrp binds ~150 bp and more upstream

(Ernsting, Denninger et al. 1993; Paul, Blumenthal et al. 2001; Paul, Mishra et al. 2007), ilvIH where Lrp binds at sites ranging from -260 to -190 and -150 to -40 (Wang and

Calvo 1993; Wang and Calvo 1993; Jafri, Chen et al. 2002), ilvGMEDA -226 (Rhee,

Parekh et al. 1996), serA where it binds at -37 to -133 bases from P1 (Yang, Lin et al.

2002), and gcvTHP at -92 to -229 (Stauffer and Stauffer 1994; Stauffer and Stauffer

1999). In some of these cases it appears that DNA bending by integration host factor

(IHF) plays an essential role in bringing Lrp into contact with the RNA polymerase, or through promoting DNA structures capable of initiating transcription (Sacco, Ricca et al.

1993; Paul, Blumenthal et al. 2001; Paul, Mishra et al. 2007). It seems likely, since Lrp-RNA polymerase contact has yet to be clearly observed, that DNA bending and a structural role of Lrp may be relevant. DNA bending is supported by an experiment in which integration host factor sites replaced Lrp sites, resulting in similar transcription

(Stauffer and Stauffer 1999). Similar positional trends can be observed for repression

where Lrp occludes the core promoter region or can be found downstream of the start site; this is observed in the cases of serA P2 (Yang, Lin et al. 2002), rrnB P1 (Pul, Lux et al. 2008), aidB (Landini, Hajec et al. 1996), ompC, and ompF (Ferrario, Ernsting et al.

1995). With the dadAX promoter, where Lrp acts as a dual regulator, both these trends are observed (Zhi, Mathew et al. 1999).

Lrp also appears to play a secondary role as a regulator of DNA supercoiling

(Beloin, Jeusset et al. 2003; Kelly, Conway et al. 2006; Pul, Wurm et al. 2007; Corcoran and Dorman 2009). Lrp’s role in supercoiling, and the associated indirect effects on transcription, are less clear than its role as a direct activator or repressor, but this role clearly plays a part in phase variation of fimbrial production (Kelly, Conway et al. 2006; Corcoran and

Dorman 2009), and may play roles in control of other genes.

1.2.3 Lrp is widespread, and particularly highly conserved among Enterobacteriaceae.

Lrp is a widespread transcriptional regulator with family members spread broadly across eubacteria and archaea (Kyrpides and Ouzounis 1995; Kawashima, Aramaki et al.

2008). Within enteric bacteria, Lrp is highly conserved (Friedberg, Platko et al. 1995;

Lintner, Mishra et al. 2008). The few Lrp amino acid changes within enteric bacteria, and their clustering, indicate that there is strong evolutionary pressure to maintain sequence identity of Lrp, suggesting that much of the protein has functional importance

(Friedberg, Platko et al. 1995). The 164-amino acid Lrp monomer of E. coli has very high percent identities to the amino acid sequences of orthologs from Salmonella enterica

Typhimurium (100%), Proteus mirabilis (98%), and even Vibrio cholerae from outside the Enterobacteriaceae (92%).
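The percent identities quoted here follow from simple counting over an alignment. As a minimal sketch, the function below computes identity for two pre-aligned, equal-length sequences, and the arithmetic that follows shows how the four EcoLrp/PmiLrp substitutions described later in this document correspond to the rounded 98% figure for a 164-residue protein. The short peptide strings are invented placeholders, not Lrp sequences.

```python
def percent_identity(seq_a, seq_b):
    """Percent identical positions between two pre-aligned, equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


# HYPOTHETICAL toy peptides, used only to exercise the function
toy_identity = percent_identity("MVDSKK", "MVDNKK")

# Figures from the text: 164-residue monomer, four substitutions between orthologs
length, differences = 164, 4
lrp_identity = 100.0 * (length - differences) / length

print(f"toy: {toy_identity:.1f}%  EcoLrp vs PmiLrp: {lrp_identity:.1f}%")
```

For real orthologs of unequal length or with insertions, an alignment step would be needed first, and the choice of denominator (alignment length versus shorter sequence) changes the reported value slightly.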

1.2.4 Lrp has been structurally characterized.

The structure of Lrp orthologs has been determined from species ranging from enteric bacteria to archaea. The structures provide insights into Lrp function even in cases where sequence similarity differs significantly. Generally speaking, the 18.8 kDa

Lrp subunit consists of two domains. One of these is a C-terminal domain, containing a

Regulation of Amino acid Metabolism (RAM) domain responsible for the responsiveness of Lrp to coregulatory amino acids (Platko and Calvo 1993; Ettema, Brinkman et al.

2002). This part of Lrp also contains an Aspartokinase, Chorismate mutase, and TyrA

(ACT) domain (Brinkman, Ettema et al. 2003; Reddy, Gokulan et al. 2008). The C-terminal domain is characterized by a structural βαββαβ motif that is consistent in all crystal structures of Lrp orthologs obtained to date (Leonard, Smits et al. 2001; Peeters,

Willaert et al. 2006; de los Rios and Perona 2007; Ren, Sainsbury et al. 2007; Shrivastava and Ramachandran 2007; Yokoyama, Ishijima et al. 2007; Reddy, Gokulan et al. 2008).

A crystal structure is shown on page 67.

The Lrp N-terminal domain contains a Helix-Turn-Helix motif that is responsible for DNA binding (Platko and Calvo 1993; Enoru-Eta, Gigot et al. 2000; de los Rios and

Perona 2007). These domains are connected by one or two flexible linker regions, depending on the species (Reddy, Gokulan et al. 2008). The multimeric structure contains the C-terminal domains in the central region, while the Helix-Turn-Helix domains are on the outer edges where they can make DNA contacts (Leonard, Smits et al. 2001; Peeters,

Willaert et al. 2006; de los Rios and Perona 2007; Ren, Sainsbury et al. 2007; Shrivastava and Ramachandran 2007; Yokoyama, Ishijima et al. 2007; Reddy, Gokulan et al. 2008).

The unstructured N-terminus of Lrp extends out from the Helix-Turn-Helix and its removal greatly diminishes DNA binding (de los Rios and Perona 2007). Therefore the

N-terminal tail is suspected to make additional DNA contacts. A cocrystal structure of E. coli Lrp (EcoLrp) with DNA is available (de los Rios and Perona 2007); it reveals that the DNA wraps around the outer edge of the Lrp multimer, and confirms that the Helix-Turn-Helix motifs play central roles in DNA binding.

E. coli Lrp crystallizes into an octameric ring structure (chapter 3) that consists of four dimers (de los Rios and Perona 2007). Upon DNA binding the octameric ring opens slightly at a point between adjacent dimers (de los Rios and Perona 2007). Octameric structures are also observed in Mycobacterium tuberculosis (Shrivastava and

Ramachandran 2007; Reddy, Gokulan et al. 2008), Neisseria meningitidis (Ren, Sainsbury et al. 2007), Sulfolobus solfataricus (Peeters, Hoa et al. 2005) and in the archaeal species Pyrococcus OT3 and Pyrococcus furiosus (Leonard, Smits et al. 2001;

Yokoyama, Ishijima et al. 2007). These orthologs conserve the tetramer of dimers structure (Leonard, Smits et al. 2001; Yokoyama, Ishijima et al. 2007). Lrp from E. coli,

M. tuberculosis, S. solfataricus, and N. meningitidis exists as dimers, tetramers and octamers, whereas Pyrococcus Lrp is found predominantly in dimeric and octameric conformations

(Yokoyama, Ishijima et al. 2006).

The interactions between subunits are driven by hydrophobic forces, primarily between the β sheets of the C-terminal domain (Reddy, Gokulan et al. 2008). The interactions between Lrp and its amino acid coregulators have been observed structurally in the cases of Neisseria meningitidis (Ren, Sainsbury et al. 2007) and Mycobacterium tuberculosis (Shrivastava and Ramachandran 2007), and the sites involved in leucine responsiveness have been mapped genetically in other orthologs (Platko and Calvo 1993;

Leonard, Smits et al. 2001; Reddy, Gokulan et al. 2008) to the dimer interfaces between subunits, within the central C-terminal pocket of the dimer, and to sites where dimers interact to form tetramers.

1.2.5 Cofactor binding changes the Lrp oligomeric state.

As the name implies Lrp is sensitive to leucine, but in some cases can respond to other amino acids as coregulators. Lrp orthologs have been reported to interact with the amino acids Leu, Ala, Arg, Gln, His, Lys, Met, Phe, Pro, Thr, Trp, Tyr and Val (Platko and Calvo 1993; Ambartsoumian, D'Ari et al. 1994; Chen, Hao et al. 2001; Chen, Rosner et al. 2001; Chen and Calvo 2002; Berthiaume, Crost et al. 2004; de los Rios and Perona

2007; Ren, Sainsbury et al. 2007; Shrivastava and Ramachandran 2007; Boulette,

Baynham et al. 2009). Not all of these have been shown to involve direct interactions.

The functioning of these additional coregulators will be discussed in detail in a later chapter.

Lrp multimerizes to form higher order complexes including dimers, tetramers, octamers, and possibly hexadecamers by stacking of two octamers (Chen, Rosner et al.

2001; Chen and Calvo 2002). The mechanism of leucine modifying regulation by Lrp has been suggested to involve modulation of Lrp multimerization (Chen, Rosner et al. 2001;

Chen and Calvo 2002). Leucine is currently thought to stabilize the Lrp octamer and favor dissociation of Lrp hexadecamers, thus modifying Lrp-DNA interactions, binding affinities, and interactions with RNA polymerase (Chen, Rosner et al. 2001; Chen and Calvo 2002).

An octamer-hexadecamer equilibrium is consistent with conformational changes observed in Mycobacterium tuberculosis Lrp upon ligand binding to the second of two ligand-binding sites. Coregulator binding to this site could prevent contacts necessary to forming hexadecamers (Shrivastava and Ramachandran 2007). Multiple coregulator binding sites preventing hexadecamer formation is consistent with the identification of two different-affinity leucine binding sites detected by leucine titrations of E. coli Lrp

(Chen and Calvo 2002). The existence of hexadecameric Lrp is still debatable, since it has only been detected by dynamic light scattering and chemical crosslinking utilizing a

His6 Lrp (Chen, Rosner et al. 2001; Chen and Calvo 2002); other methods were less conclusive and there is no crystal structure of the hexadecamer. In support of Lrp hexadecamers, Lrp tubes have been identified in crystals of the Lrp homolog FL11 from

Pyrococcus OT3 (Yokoyama, Ishijima et al. 2006).

Coregulators have been primarily identified as destabilizing factors in Lrp multimer assembly (Brinkman, Ettema et al. 2003). However, leucine and alanine have very different effects via Lrp on clp phase variation: Ala locks phase shifting off, while

Leu merely represses the promoter (Crost, Garrivier et al. 2003). Multiple differential functions of coregulators could indicate that modulation of Lrp activity is more sophisticated than has been appreciated to date. Obviously, such features and their conservation must be elucidated before meaningful regulatory predictions can be made.

Lrp response to coregulators depends on the target promoter and additional regulatory proteins that may interact with the promoter. Many of the promoters controlled

by global regulators are controlled by two or more of them (Janga and Collado-Vides

2007; Mendoza-Vargas, Olvera et al. 2009). For example, PgltB is controlled by Lrp,

IHF, Crp, and the more local regulator ArgR (Paul, Mishra et al. 2007). This complex level of regulation illustrates the function of Lrp as a master regulator that is capable of carefully regulating many promoters with varied levels of expression.

1.2.6 Lrp from E. coli and P. mirabilis have significant functional differences despite 98% sequence identity.

In a previous study, which led to the work described here, Lintner et al. interchanged E. coli Lrp (EcoLrp) and P. mirabilis Lrp (PmiLrp) in an lrp null strain of E. coli, and observed substantial differences in regulation (Lintner, Mishra et al. 2008). In this experimental approach, the extrapolation is simplified by having identical target genes and transcription factor binding sites. Although the strains differed only at the lrp gene, whose products are 98% identical at the amino acid level, the two strains shared only about half of the regulated genes. This was surprising given that the proteins had identical Helix-Turn-Helix regions.

Lintner et al. also noted that the lrp gene itself was expressed differently in the two native species, with P. mirabilis Lrp being expressed at substantially higher levels than Lrp in E. coli. Differences such as these raised questions about possible differences in Lrp function despite the sequence similarity (pages 25, 68). Due to the location of amino acid substitutions between the Lrp orthologs, with one in the Lrp RAM domain

(Ettema, Brinkman et al. 2002), two in the disordered N-terminal tail, and one in the linker region between the N and C domains, several questions came to mind. First, in E.

coli, only Leu and Ala are known coregulators; might the RAM domain substitution make P. mirabilis Lrp sensitive to different amino acid coregulators?

The second question addressed the molecular basis for the partially distinct sets of promoters regulated by the Proteus and Escherichia Lrp orthologs. Lrp is an HTH protein, and PmiLrp and EcoLrp have identical HTH sequences; they would therefore be expected to have equivalent regulatory effects, but they do not. The N-terminal domain has been implicated in DNA binding, though only through complete deletion of the first ten amino acids (de los Rios and Perona 2007), and perhaps this region plays some role in the transcriptional differences between Lrp orthologs. In this thesis, I seek to address these two questions.


2 Unexpected coregulator range and ortholog-specific differences in the global regulator Lrp of Escherichia coli and Proteus mirabilis

Benjamin R. Hart1 and Robert M. Blumenthal1,2*

Department of Medical Microbiology & Immunology, and Program in Infection,

Immunity & Transplantation, The University of Toledo College of Medicine, Toledo,

Ohio,1 and Program in Bioinformatics & Proteomics/Genomics, The University of

Toledo, Toledo, Ohio2

*Corresponding author. Mailing address: Department of Medical Microbiology &

Immunology, University of Toledo College of Medicine, 3000 Arlington Avenue,

Toledo, OH 43614-2598. Phone: (419) 383-5422. Fax: (419) 383-3002. E-mail: [email protected]

Running title: Lrp coregulator range and ortholog differences

Abstract

The Lrp/AsnC family of transcription factors links bacterial and archaeal gene regulation to metabolism. Members of this family, collectively, respond to a range of amino acids as coregulators. In Escherichia coli, Lrp regulates over 200 genes directly, and is well known to respond to leucine and, to a somewhat lesser extent, alanine. We focused on Lrp from Proteus mirabilis and E. coli, orthologs with 98% identity overall and identical helix-turn-helix motifs, for which a previous study nevertheless found functional differences. Differences between these orthologs, within and adjacent to the amino acid-responsive RAM domain, led us to investigate whether the orthologs differ in their sensitivities to amino acids. We found modest but significant differences in their sensitivities to some amino acids. More strikingly, via both in vivo reporter fusion assays and in vitro electrophoretic mobility shift experiments, we found that E. coli Lrp itself responded to a broader range of amino acids than previously appreciated. In particular, for both the E. coli and P. mirabilis orthologs, Lrp responsiveness to methionine was similar in magnitude to that of leucine. Both Lrp orthologs are also fairly sensitive to Ile, His and Thr. These observations have substantial implications for understanding E. coli physiology and for attempts to predict or model regulatory architecture.

Introduction

The Lrp/AsnC family of transcription factors is broadly distributed, and frequently ties bacterial metabolism to environmental signals, mediating transitions between “feast and famine” (Newman, D'Ari et al. 1992; Calvo and Matthews 1994).

Family members are present in archaea as well as bacteria (Charlier, Roovers et al. 1997).

Lrp/AsnC proteins include an N-terminal domain with a helix-turn-helix motif that interacts with DNA, and a C-terminal RAM (regulation of amino acid metabolism) domain that responds to amino acid coregulators (Platko and Calvo 1993; Ettema,

Brinkman et al. 2002; de los Rios and Perona 2007; Ren, Sainsbury et al. 2007;

Shrivastava and Ramachandran 2007). The C-terminal domain also mediates the formation of multimers (dimers, octamers, and hexadecamers), providing additional regulatory complexity (Chen, Rosner et al. 2001; Chen and Calvo 2002). The Leucine

Responsive regulatory Protein (Lrp) is a global regulatory protein that regulates ~400 genes in E. coli, of which ~130 involve direct interactions (Tani, Khodursky et al. 2002;

Cho, Barrett et al. 2008). This regulation appears to help E. coli adapt between two major environments: “gut and gutter” (Calvo and Matthews 1994). Lrp from E. coli (EcoLrp) is the most extensively studied protein in this family, and its structure has been determined

(de los Rios and Perona 2007) (see Fig. 1C).


Figure 1. PlivK model system. A. Plasmid diagrams. pEcLrp and pPmLrp are pCC1- based plasmids containing (respectively) E. coli lrp or P. mirabilis lrp inserted into the

BamHI site under control of PlacUV5 and an artificial consensus ribosome-binding site. pRHLiv2 is derived from pKK223-3 (GenBank accession #M77749.1) and contains a

PlivK-lacZ fusion transcriptionally isolated from the rest of the plasmid by strong bidirectional terminators (red boxes). The cat gene of pRHLiv2 was inactivated to allow independent selection for each plasmid. B. LacZ (β-galactosidase) activity is plotted vs. culture optical density; a straight line indicates steady-state growth and the slopes reflect relative levels of expression. In the key, “+” (red symbols) indicates presence of 10 mM L-Leu in the MOPS-glucose medium, and “–” (blue symbols) indicates its absence.

“Vec” is the pCC1 vector control (no Lrp), while “Eco” and “Pmi” respectively refer to

EcoLrp and PmiLrp. C. Sequences of EcoLrp (upper) and PmiLrp (lower) in single-letter amino acid code. The four sequence differences are highlighted. Indications of secondary structure (cylinders for helices and arrows for strands) are derived from the crystal structure of EcoLrp (de los Rios and Perona 2007); the helix-turn-helix motif, primarily responsible for DNA sequence recognition, is shown. Positions of conserved residues from the RAM domain (regulation of amino acid metabolism (Ettema, Brinkman et al.

2002)) are indicated.

Table 1. Coregulators of Lrp orthologs.

Species | Amino acid (a) | Method (b) | References (c)
Escherichia coli | Leu | indirect (in vivo) | e.g., (Yang, Lin et al. 2002; Berthiaume, Crost et al. 2004; Lintner, Mishra et al. 2008)
Escherichia coli | Leu | mutational analysis | (Platko and Calvo 1993)
Escherichia coli | Leu | EMSA | (Ernsting, Denninger et al. 1993; Bhagwat, Rice et al. 1997; Zhi, Mathew et al. 1999)
Escherichia coli | Leu | DNAse I footprints | (Marasco, Varcamonti et al. 1994; Wiese, Ernsting et al. 1997)
Escherichia coli | Leu | dynamic light scattering | (Chen, Rosner et al. 2001; Chen and Calvo 2002)
Escherichia coli | Ala | indirect (in vivo) | (Mathew, Zhi et al. 1996; Berthiaume, Crost et al. 2004)
Escherichia coli | Ala | EMSA | (Mathew, Zhi et al. 1996; Zhi, Mathew et al. 1999)
Escherichia coli | Ala | DNAse I footprints | (Zhi, Mathew et al. 1999)
Escherichia coli | Butyrate | indirect (in vivo) | (Nakanishi, Tashiro et al. 2009)
Actinobacillus pleuropneumoniae | Ile/Leu/Val mix | indirect (in vivo) | (McFarland and Dorman 2008)
Klebsiella aerogenes | Ala | indirect (in vivo) | (Janes and Bender 1999)
Mycobacterium tuberculosis | Phe, Tyr, Met, His, Lys, Arg, Pro, Thr, Gln | crystallography | (Shrivastava and Ramachandran 2007)
Neisseria meningitidis | Leu, Met | crystallography | (Ren, Sainsbury et al. 2007)
Proteus mirabilis | Leu | indirect (in vivo) | (Lintner, Mishra et al. 2008)
Pseudomonas aeruginosa | D/L-Ala, Val | indirect (in vivo) | (Boulette, Baynham et al. 2009)
Salmonella enterica serovar Typhimurium | Leu | indirect (in vivo) | (Hecht, Zhang et al. 1996; Marshall, Sheehan et al. 1999; McFarland and Dorman 2008; Baek, Wang et al. 2009)
Vibrio cholerae | Leu | indirect (in vivo) | (Lintner, Mishra et al. 2008)

a – L-isomer, unless otherwise indicated

b – “Indirect (in vivo)” refers to methods such as response of a lacZ reporter fusion when the amino acid is added to the growth medium.

c – Where many references report a particular type of observation, a representative subset is listed.

Lrp orthologs from several genera are collectively responsive to a variety of amino acids, including Leu, Ala, Arg, Gln, His, Lys, Met, Phe, Pro, Thr, Trp, Tyr and Val

(though this has not yet been shown to involve direct effects in all cases). In contrast, as shown in Table 1, the well-studied E. coli Lrp ortholog (EcoLrp) has only been reported to respond to Leu (Willins, Ryan et al. 1991; Haney, Platko et al. 1992; Platko and Calvo

1993; Roesch and Blomfield 1998; Chen, Hao et al. 2001; Chen, Rosner et al. 2001;

Chen and Calvo 2002) and Ala (Martin 1996; Mathew, Zhi et al. 1996; Zhi, Mathew et al.

1998; Zhi, Mathew et al. 1999; Berthiaume, Crost et al. 2004). At least in the case of

EcoLrp and Leu, their interaction modulates multimerization with associated effects on transcription (Chen, Rosner et al. 2001; Chen and Calvo 2002).

It would be very useful, for bioinformatic prediction of cell physiology (or understanding its evolution), to assume conserved regulatory properties for conserved regulatory proteins. However, the appropriate limits for such extrapolation between species have not yet been well defined, and orthologous regulators do not always play the same roles (Hershberg and Margalit 2006; Lozada-Chavez, Janga et al. 2006; Madan

Babu, Teichmann et al. 2006; Price, Dehal et al. 2007; Janga and Perez-Rueda 2009). In a previous study, we found distinct regulatory differences between Lrp orthologs from V. cholerae, P. mirabilis, and E. coli, even when they were expressed in the same

background from the same expression sequences (Lintner, Mishra et al. 2008). These differences were seen despite the orthologs’ very high sequence identity and completely conserved helix-turn-helix motifs. One of the few sequence differences between these orthologs lay within a region involved in coregulator interactions (Platko and Calvo

1993; Ettema, Brinkman et al. 2002) (Fig. 1C). We explore here the possibility that differences within the coregulator binding domains could help explain differences in Lrp regulatory behavior.

Materials and Methods

Bacterial strains and growth conditions. Most bacterial strains used in this study were based on the E. coli BE10.2 background (Matthews, Cui et al. 2000), and contained pCC1-based plasmids pVec (vector control), pEcLrp (E. coli lrp), or pPmLrp

(P. mirabilis lrp), together with compatible reporter plasmids pRHLiv2 (PlivK-lacZ) or pPM2005 (PgltB-lacZ) (Lintner, Mishra et al. 2008) (see Fig. 1A). The lrp-bearing plasmids contain the respective lrp ORFs, with a consensus Shine-Dalgarno ribosome binding site, downstream of the vector’s PlacUV5 promoter. The reporter plasmids are derived from pBH403, which in turn is derived from pKK223-3 (GenBank accession

#M77749.1). E. coli strain SPB107 carries two chromosomal fusions: PlivK-lacZ as the result of a λplacMu insertion, and PlacUV5-lrp from background strain AAEC546

(Blomfield, Calie et al. 1993; Bhagwat, Rice et al. 1997). Cells were grown in baffled flasks with gyrotory shaking at 37 °C, except for the 20-amino acid screening experiment, in which they were grown at 37 °C in 5 ml capped polypropylene culture tubes on a rotator. These cultures were grown in Morpholinopropane sulfonate (MOPS)

glucose minimal media (Neidhardt, Bloch et al. 1974) from Teknova (Hollister, CA).

Background expression from PlacUV5 is sufficient (Lintner, Mishra et al. 2008), so no

IPTG inducer was used. For protein purification, cells were grown with aeration in STG medium (LB containing 0.2% glycerol and 50 mM potassium phosphate at pH 7.4)

(Matthews, Cui et al. 2000). Antibiotics were used where indicated: ampicillin (100 µg/ml), tetracycline (10 µg/ml), and chloramphenicol (15 µg/ml). Media were supplemented with amino acids as indicated at 10 mM, except for tryptophan at 5 mM.

Overnight cultures were inoculated from M9-glucose agar plates (Sambrook and Russell

2001), streaked the previous day from frozen stocks. Dilution series were made so one of these starter cultures would still be in late exponential phase. These cultures, with an

OD600nm of 0.4 – 0.8, were used to inoculate fresh media at 1:50.

β-galactosidase assays. Experimental cultures were inoculated 1:250 from overnight ones. Between OD600nm values of 0.1 and 0.8, 1 ml culture samples were collected and lysed by vortex mixing 30 s with 50 µl of chloroform and 25 µl of 10% (w:v) SDS. To determine β-galactosidase levels, ONPG hydrolysis was plotted against culture absorbance and fitted by linear regression to yield β-galactosidase activity

(Platko, Willins et al. 1990; Lintner, Mishra et al. 2008).
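The regression step described above can be sketched as follows: ONPG hydrolysis is plotted against culture absorbance and the slope of the least-squares line is taken as the activity. This is a minimal illustration in plain Python, not the analysis code used in this work; the data points and variable names are invented, and the units of the slope depend on how ONPG hydrolysis is normalized.

```python
def least_squares_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# HYPOTHETICAL data: culture OD600 and ONPG hydrolysis (arbitrary units per ml)
od600 = [0.1, 0.2, 0.4, 0.8]
onpg_minus_leu = [50.0, 100.0, 200.0, 400.0]   # no leucine added
onpg_plus_leu = [6.0, 12.5, 25.0, 50.5]        # 10 mM leucine added

slope_minus, _ = least_squares_line(od600, onpg_minus_leu)
slope_plus, _ = least_squares_line(od600, onpg_plus_leu)

# The ratio of slopes gives the relative expression under the two conditions
print(f"-Leu slope {slope_minus:.1f}, +Leu slope {slope_plus:.1f}, "
      f"ratio {slope_minus / slope_plus:.1f}")
```

Using the slope across several samples, rather than a single time point, makes the measurement insensitive to the exact density at which a culture was sampled, provided the culture is in steady-state growth.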

Western blot analysis. Equal volumes of cell cultures were centrifuged at 13K x g for 2 min, and pellets were suspended in SDS buffer and boiled for 10 min. Protein concentrations were determined by the Lowry-based RC DC protocol (BioRad, Hercules,

CA). Equal amounts of protein were loaded onto 10% polyacrylamide gels, electrophoresed at 110V in 1x Tris-glycine SDS buffer (Sambrook and Russell 2001), and electroblotted to PVDF using an Xcell apparatus (Invitrogen Carlsbad, CA). The

blotted membrane was blocked with 5% powdered milk in PBST (137 mM sodium chloride, 2 mM potassium chloride, 10 mM dibasic sodium phosphate, 1.7 mM monobasic potassium phosphate, 0.05% Tween 20, pH 7.4), and probed with a 1:10,000 dilution of rabbit anti-EcoLrp polyclonal antiserum and a 1:25,000 dilution of HRP-conjugated goat anti-rabbit IgG (gift of Dr. Darren Sledjeski). Detection made use of

ECLplus reagents (GE Health Science, Piscataway, NJ) per the manufacturer’s instructions. Protein bands were visualized on an UltraLum 16vS imaging system

(Omega, Claremont, CA) and densitometry was performed using ImageJ (Rasband

2010).

Protein purification. Native Lrp protein was purified as previously reported

(Matthews, Cui et al. 2000). In short, E. coli JWD3-1 cells were grown in 500 ml STG medium and induced with 0.5 mM IPTG when the culture reached an OD600nm of 1.0-1.5.

Cells were grown for 2 h post induction, and were then pelleted and frozen until purification. For purification, cells were sonicated in (3 ml/g cells) TG10ED buffer (10 mM Tris pH 8.0, 10% glycerol, 0.1 mM EDTA, 0.2 M NaCl and 0.1 mM DTT) with 100 µl 1.14 M phenylmethylsulfonyl fluoride (PMSF) per 500 ml cells grown. Sonication was in a cup horn probe (Ultrasonics, Plainview, NY) at maximum power for five rounds of 1 min, separated by 2 min on ice. The lysate was centrifuged 30 min at 15K x g, and the resulting supernatant was loaded onto a 1 x 12 cm BioRex70 cation exchange column

(BioRad, Hercules, CA) equilibrated with TG10ED. Proteins were eluted with a 0.2 – 1.0

M NaCl gradient, and fractions were analyzed by examining stained SDS polyacrylamide gels for the Lrp 18.9 kDa band. Fractions containing Lrp were pooled and concentrated with VivaSpin concentrators having a 10,000 MW cutoff (Sartorius stedim, Dusseldorf,

Germany). Concentrated Lrp fractions were then loaded onto a 1 x 28 cm Superose12 column (GE Healthcare, Uppsala, Sweden) equilibrated with TG10ED buffer. Fractions containing highly purified Lrp were concentrated and dialyzed into MES buffer (10 mM

N-morpholinoethane sulfonate, pH 6.25, 0.1 mM EDTA, and 0.2 M KCl). For stability of Lrp in sensitive assays of binding or multimerization, we have found that transfer to the MES buffer must occur within 96 h of lysis.

Electrophoretic mobility shift assays. Purified Lrp was mixed with 23 nM DNA in a solution containing 40 mM Tris pH 7.4, 60 mM KCl, 0.1 mM EDTA, 5% glycerol,

80 mM NaCl, and 1 mM DTT, with a given L-amino acid at 10 mM as indicated. The samples were incubated at 23 °C for 20 min prior to the addition of 1 µl Novex high density TBE sample buffer (Invitrogen, Carlsbad, CA) and immediately loaded onto a 1.5 mm 4% acrylamide TBE gel in an Xcell apparatus (Invitrogen). Samples were electrophoresed at 110V until they entered the gel, and then resolved at 80V at room temperature. The gel was stained with 0.5 µg ethidium bromide / ml and visualized with an UltraLum Imager (Omega, Claremont, CA); densitometry was performed with ImageJ

(Rasband 2010).

Results

PlivK as a sensitive measure of Lrp response to coregulators. To investigate potential functional differences between Lrp orthologs, we used the E. coli livKHMGF promoter (PlivK) fused to the reporter gene lacZ. PlivK was chosen because it is both activated by Lrp in the absence of leucine, and repressed by Lrp in the presence of leucine (Haney, Platko et al. 1992; Bhagwat, Rice et al. 1997; Lintner, Mishra et al.

2008), thus giving a wide dynamic range. Lrp orthologs from E. coli and P. mirabilis were introduced into an lrp-Tn10 strain of E. coli, on the plasmids pEcLrp and pPmLrp

(Lintner, Mishra et al. 2008) (Fig. 1A). These plasmids, based on the low copy pCC1BAC vector (Wild and Szybalski 2004), have identical expression sequences upstream of both orthologs, and have been used in previous studies (Lintner, Mishra et al.

2008). The compatible reporter plasmid carries a PlivK-lacZ transcriptional fusion, isolated from the rest of the plasmid by flanking terminators. As shown in Fig. 1B, Lrp from E. coli or P. mirabilis has essentially identical effects on PlivK-lacZ, with ~8x activation in the absence of Leu, and repression to nearly background levels in the presence of Leu.

Figure 2. Full amino acid screen. Strains with plasmids shown in Fig. 1A were grown in MOPS-glucose medium with the indicated L-amino acid present at 10 mM (except for Trp at 5 mM, and Val or Cys not shown as there was complete growth inhibition). The bars indicate mean LacZ activity from triplicate single-time assays (mid-logarithmic growth), with standard errors shown, and bar colors represent vector control (no Lrp, white), EcoLrp (gray), and PmiLrp (black). Shading of amino acid names distinguishes those having <25% effect on PlivK-lacZ (white), 25-75% effect (light gray), and >75% effect (dark gray). The dotted line indicates the approximate level of activity for both EcoLrp and PmiLrp when no amino acid is added to the medium (bars marked “MOPS”).

Several amino acids elicit differential expression of PlivK. Lrp is well known to respond to leucine (hence its name). However, other amino acids affect Lrp behavior, even in E. coli, yet are rarely considered (Table 1). We examined the effects of all 20 amino acids on PlivK-lacZ to functionally compare the E. coli and P. mirabilis orthologs of Lrp (EcoLrp and PmiLrp). For this screening, we grew triplicate cultures in MOPS minimal glucose medium each with one amino acid at 10 mM (except for Trp at 5 mM), in shaken capped tubes. Samples were collected when the OD600nm was between 0.4 and 0.8, and β-galactosidase levels were measured (Fig. 2). The amino acids (AA) could be divided into four classes. The first three classes are AA having little effect (<25% reduction in expression relative to no added AA), those having intermediate effects (25-75%), and those with strong effects (>75%). The strongly-effective AAs included Leu, as expected, but also Met, His, Thr, and Ile, which have not been reported to affect EcoLrp. Interestingly, the known EcoLrp coregulator Ala had only intermediate effects in this assay. The fourth group, comprising Ser, Val, and Cys, had toxic or strong effects on growth. We therefore carried out no further studies with Val and Cys; some additional studies were carried out with Ser as it had ortholog-specific effects. The growth effects were expected based on the Val sensitivity of E. coli K12 (De Felice, Squires et al. 1977; Ambartsoumian, D'Ari et al. 1994), the conditional auxotrophy for Ile in the presence of excess Ser (Daniel and Danchin 1979), and the inhibition of threonine deaminase by cysteine (Harris 1981).
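The three effect classes can be written as a simple threshold rule. The sketch below is illustrative only: the function name `classify_effect` is hypothetical, the treatment of values falling exactly on 25% or 75% is an assumption (the text does not specify it), and the example ratios are the EcoLrp values from the Fig. 2 column of Table 2 (expression with the amino acid divided by expression with no added amino acid).

```python
def classify_effect(relative_expression):
    """Classify an amino acid by its fractional reduction of PlivK-lacZ expression.

    relative_expression: expression with the amino acid divided by expression
    with no added amino acid (1.0 means no change).
    """
    reduction = 1.0 - relative_expression  # fraction of expression lost
    if reduction < 0.25:
        return "little"
    if reduction <= 0.75:  # assumption: boundary values count as intermediate
        return "intermediate"
    return "strong"


# EcoLrp ratios taken from the Fig. 2 column of Table 2
fig2_ratios = {"Gln": 1.04, "Ala": 0.54, "Lys": 0.40, "His": 0.24, "Leu": 0.03}
classes = {aa: classify_effect(ratio) for aa, ratio in fig2_ratios.items()}
print(classes)
```

With these inputs, Gln falls in the little-effect class, Ala and Lys in the intermediate class, and His and Leu in the strong class, matching the grouping described in the text.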

Figure 3. Correlogram of PmiLrp vs. EcoLrp for selected amino acids. MOPS-glucose cultures with plasmids as in Fig. 1A included no added amino acids (“None”) or 10 mM Ile (strongly depressive group), Lys (moderately depressive), Gln and Pro (weakly depressive), and Ser (like Pro, appearing in Fig. 2 to have differential effects on EcoLrp and PmiLrp). A. Growth rates. Only Ser yielded a vastly different doubling time. B. LacZ activity slopes from experiments such as those shown in Fig. 1B, plotted for EcoLrp vs. PmiLrp (Lrp– vector control is plotted on both axes as open square). The solid line indicates expected results if EcoLrp and PmiLrp had identical effects; the dashed line fits the data (excluding the vector control and Ser). Means of triplicate cultures are shown, along with standard errors; where error bars are not visible, they were smaller than the symbols.

Some coregulators have differential effects on PlivK regulation by Lrp orthologs. The most striking finding from Fig. 2 is that amino acids aside from Leu strongly affect EcoLrp (and PmiLrp). However, some amino acids with more limited effects appeared to affect EcoLrp and PmiLrp differentially, which would justify their further investigation despite their relatively modest effects. Accordingly, we repeated the Fig. 2 experiment in more detail, growing larger cultures in well-aerated flasks and plotting β-galactosidase activity versus culture density (as in Fig. 1B). Linearity of this plot indicates relatively steady state growth, and the slopes accurately reflect relative levels of gene expression. The resulting slopes for EcoLrp were plotted against those for PmiLrp in a correlogram (Fig. 3B). Ile and Lys, included as controls (Fig. 2), again yielded minimal difference between the Lrp orthologs (Fig. 3B). Pro and Gln gave lacZ activity slopes that varied more between the Lrp orthologs. In fact, the responses to Lys, Gln, Pro, and no added AA are all consistent with EcoLrp having ~70% of the activity of PmiLrp (dashed line in Fig. 3B). This differential effect is amino acid specific, as it was not seen with the strongly effective class (next section), whose members affected PmiLrp and EcoLrp equally. It also differs from the pattern in Fig. 2, where Gln and Pro yielded higher PlivK-lacZ expression with EcoLrp than with PmiLrp under the lower aeration conditions.
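The slope-based measure of expression can be illustrated with an ordinary least-squares fit of LacZ activity against culture density. This is a minimal sketch rather than the analysis actually used: the helper `fit_slope` and all sample readings are hypothetical, chosen only to show how a slope ratio between two orthologs would be obtained.

```python
def fit_slope(densities, activities):
    """Least-squares slope and intercept of activity versus culture density."""
    n = len(densities)
    if n < 2 or n != len(activities):
        raise ValueError("need at least two paired measurements")
    mean_x = sum(densities) / n
    mean_y = sum(activities) / n
    sxx = sum((x - mean_x) ** 2 for x in densities)
    if sxx == 0:
        raise ValueError("culture densities must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(densities, activities))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical readings: OD600nm and LacZ activity for two cultures
od = [0.2, 0.4, 0.6, 0.8]
eco_activity = [70.0, 140.0, 210.0, 280.0]
pmi_activity = [100.0, 200.0, 300.0, 400.0]

eco_slope, _ = fit_slope(od, eco_activity)
pmi_slope, _ = fit_slope(od, pmi_activity)
ratio = eco_slope / pmi_slope  # relative activity of the two orthologs
print(round(eco_slope, 1), round(pmi_slope, 1), round(ratio, 2))
```

Plotting one ortholog's slope against the other's for each amino acid gives the correlogram; points on a line through the origin share a constant activity ratio.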

Serine showed the most profound ortholog-specific effects (Fig. 3B). Relative to the absence of exogenous AA, Ser increased PlivK-lacZ activity by ~70% when PmiLrp was present, but decreased it by ~80% when EcoLrp was present. This effect might be explained, at least in part, by the parallel effects on growth rate in each case (Fig. 3A). Whether this effect is direct or not, it represents a surprisingly large difference between cells differing only in Lrp orthologs that are 98% identical (Fig. 1C).

Comparing these results to the vector control (Lrp–), as expected both Lrp orthologs substantially activated PlivK in the absence of exogenous AA (Fig. 3B, “None”). Ile reduced this effect almost to vector levels with both orthologs. Pro, Gln, and Lys, while having different magnitudes of effects with EcoLrp and PmiLrp, consistently if modestly increased the extent of apparent activation.

Figure 4. Electrophoretic mobility shift analysis (EMSA) of Pro, Gln and Asn. PlivK DNA (23 nM) was incubated with purified EcoLrp or PmiLrp as described in Materials and Methods, prior to resolution on a nondenaturing gel. The indicated AA were included in the loading buffer and the gel, to maintain their concentration during electrophoresis. Concentrations of Lrp protein used, calculated as the monomer, were 0, 125, 250, 375, and 500 nM. The 500 nM concentration corresponds to 250 nM as dimers, 62 nM as octamers, and 31 nM as hexadecamers. A. Gel images (negative image, stained after electrophoresis with ethidium bromide and viewed under UV illumination). B. Densitometric analysis of unshifted bands in gel images, normalized at each concentration to the result with no added AA.
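The monomer-to-oligomer conversion quoted in the caption is a division by the number of subunits per assembly. A short sketch follows; the function name is illustrative, and it assumes that all of the protein is in a single oligomeric form.

```python
def oligomer_concentration(monomer_nm, subunits):
    """Molar concentration (nM) if all monomers assemble into oligomers of one size."""
    if subunits < 1:
        raise ValueError("subunits must be a positive integer")
    return monomer_nm / subunits


# 500 nM monomer expressed as dimers, octamers, and hexadecamers
for name, size in [("dimer", 2), ("octamer", 8), ("hexadecamer", 16)]:
    print(name, oligomer_concentration(500, size))
```

For 500 nM monomer this gives 250, 62.5, and 31.25 nM, which the caption reports as 250, 62, and 31 nM.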

We next carried out electrophoretic mobility shift analyses (EMSAs), to help determine whether the effects of Pro and Gln on EcoLrp and PmiLrp are direct. Based on the results in Fig. 2, we also tested Asn since it had differential effects between Lrp orthologs. Lrp binding to PlivK has not previously been demonstrated via EMSA, though chromatin immunoprecipitation analyses have detected it (Cho, Barrett et al. 2008; A. Khodursky, pers. commun.). Fig. 4A shows results from experiments in which the respective AA were included in the loading buffer and incorporated into the gel, to maintain coregulator levels at 10 mM during electrophoresis. Fig. 4B shows the results of densitometry of the unshifted bands, normalized (for each Lrp protein) to the results when no AA were added; this corrects for possible differences in the fraction of active protein in each of the two preparations, as well as for different intrinsic affinities for PlivK. The results suggest that Gln stimulates binding of both Lrp orthologs to PlivK, while Asn and Pro stimulate binding of PmiLrp but have little effect on EcoLrp.
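The normalization applied to the densitometry can be sketched as a per-concentration ratio of unshifted-band intensity with an amino acid to the intensity without one. The function name and all intensities below are hypothetical (arbitrary densitometry units); only the form of the calculation is taken from the text.

```python
def normalize_to_control(with_aa, no_aa):
    """Divide unshifted-band intensity with an amino acid by the no-AA value
    measured at the same Lrp concentration."""
    normalized = {}
    for lrp_nm, intensity in with_aa.items():
        control = no_aa[lrp_nm]
        if control <= 0:
            raise ValueError(f"no-AA intensity at {lrp_nm} nM must be positive")
        normalized[lrp_nm] = intensity / control
    return normalized


# Hypothetical intensities keyed by Lrp monomer concentration (nM)
no_aa = {125: 800.0, 250: 600.0, 375: 400.0, 500: 200.0}
with_gln = {125: 600.0, 250: 300.0, 375: 100.0, 500: 50.0}

relative = normalize_to_control(with_gln, no_aa)
print(relative)
```

Values below 1 mean less unshifted DNA than in the no-AA control at that Lrp concentration, which would be read as stimulated binding; values above 1 would indicate weakened binding.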

Figure 5. Correlograms for the strong AA coregulators of Lrp. A. Growth rates of EcoLrp vs. PmiLrp cells bearing PlivK-lacZ. The strong co-regulators were present at 10 mM. B. LacZ activities of EcoLrp vs. PmiLrp. Means of triplicate cultures are shown, along with standard errors; where error bars are not visible, they were smaller than the symbols. Points below the vector control (open circle) suggest repression, while points above vector but below “none” (no amino acid added) suggest decreased activation. The inset shows the apparent repression on an expanded scale.

Several amino acids have strong coregulatory effects on PlivK. To refine the Fig. 2 results for the class of strong coregulators (Leu, Ile, Met, His, Thr), we repeated the experiment in more detail and plotted the β-galactosidase activity versus culture density. Again, the resulting EcoLrp slopes are shown versus those for PmiLrp (Fig. 5B). Both with respect to growth rates (Fig. 5A) and PlivK-lacZ expression, the points all fall on or very close to the correlogram lines, indicating very similar effects for both Lrp orthologs. As shown in Fig. 5A, some of the effects of Thr and His may result from their (ortholog-independent) slowing of growth. Comparison to the absence of Lrp (vector control) reveals that Met and Leu are associated with actual repression (inset, Fig. 5B), while Ile, Thr, and His reduce the extent of activation but do not cause repression.

Lrp levels are not substantially altered by the strong coregulators. If the addition of AA to the media resulted in altered amounts of Lrp, this might explain differences in PlivK-lacZ expression. This is especially a concern as Plrp responds to the nutrient environment and ppGpp: Lrp levels are lower in rich media (Landgraf, Wu et al. 1996; Chen, Lan et al. 1997), so individual AA might reduce the Lrp concentration. The lrp gene is also autogenously regulated by Lrp (Wang, Wu et al. 1994), so coregulators can normally affect Lrp levels in that way. Our results were not affected by this autoregulation, because in these experiments the lrp genes are controlled by PlacUV5 (see Methods); this promoter is also relatively insensitive to ppGpp (Primakoff and Artz 1979; Primakoff 1981). Nevertheless, to determine the levels of Lrp, we collected samples from mid-exponential cultures (grown as in the experiment described in Fig. 5) and performed western blot analysis with a polyclonal anti-Lrp antiserum. Lrp levels were not reduced by the presence of the strong Lrp coregulators (Fig. 6). If anything, Leu, Met, Ile, and Thr appear to have increased Lrp levels slightly (30-50%), while His and Arg had no effect.

Figure 6. Western blot analysis of EcoLrp levels in the presence of the strong amino acid coregulators. Cells were grown under the same conditions as used for the LacZ assays, and samples were collected in mid-exponential phase at a culture OD600nm of ~0.5. Equal amounts of protein were loaded onto SDS gels, blotted, and probed with polyclonal anti-Lrp antiserum as described in Materials and Methods. Arg, which has no apparent effect on PlivK-lacZ expression, was included as a control. A. Image of one of the triplicate blots. B. Images such as shown in (A) were quantified using ImageJ, and values were normalized to those from cells grown in the absence of amino acids. Error bars indicate standard errors from the three experiments.
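The quantification described in the caption (normalization to the no-amino-acid sample, then mean and standard error over three experiments) can be sketched as follows. The band intensities are hypothetical, and normalizing each blot to its own no-amino-acid lane before averaging is an assumption about the order of operations, not a detail given in the text.

```python
import math
import statistics


def mean_and_sem(values):
    """Mean and standard error of the mean for replicate measurements."""
    if len(values) < 2:
        raise ValueError("need at least two replicates")
    mean = statistics.mean(values)
    sem = statistics.stdev(values) / math.sqrt(len(values))
    return mean, sem


# Hypothetical Lrp band intensities from three blots (arbitrary units)
no_aa_blots = [100.0, 120.0, 80.0]
leu_blots = [130.0, 168.0, 96.0]

# Assumption: each blot is normalized to its own no-amino-acid lane
leu_relative = [leu / ref for leu, ref in zip(leu_blots, no_aa_blots)]
leu_mean, leu_sem = mean_and_sem(leu_relative)
print(round(leu_mean, 2), round(leu_sem, 2))
```

A normalized mean above 1 would correspond to a higher Lrp level than in cells grown without added amino acids.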

Strong coregulators affect Lrp binding to target promoters. To determine if the amino acids in the strong regulator class directly interact with Lrp, we performed electrophoretic mobility shift assays (EMSA) in which 10 mM AA was added to the binding reaction. As no difference was seen between EcoLrp and PmiLrp for this class of AA (Fig. 5B), these experiments only used EcoLrp. As before, the respective AAs were also cast into the gels and included in the loading buffer at 10 mM, to maintain their concentration during electrophoresis. We examined effects on two promoters. PlivK was used to allow comparison to our other results (Fig. 7, A-C), though as Lrp both activates and represses this promoter we suspected that the coregulators might not greatly affect binding per se. However, weakening of EcoLrp binding to PlivK was seen with all five AA, as reflected in the rate of disappearance of unshifted promoter DNA with respect to protein concentration (Fig. 7 B,C).

Figure 7. Mobility shift assays with EcoLrp and strong amino acid co-regulators. Two promoter targets were used (both at 23 nM): PlivK (A-C) and PgltBD (D-F). The indicated AA were included in the loading buffer and the gel, to maintain their concentration during electrophoresis. Concentrations of Lrp protein used, calculated as the monomer, were 0, 125, 250, 375, and 500 nM. The 500 nM concentration corresponds to 250 nM as dimers, 62 nM as octamers, and 31 nM as hexadecamers. A, D. Negative image of ethidium-stained gel photographed under UV. A composite image is shown, with the first five lanes showing EcoLrp titration in the absence of AA, while the remainder are the 375 nM EcoLrp lanes from five separate gels with the indicated AA cast into the gel. Arrows indicate observed DNA bands in the absence of AA. B, E. Quantitative densitometry of representative image set, highlighting comparison between control (no AA added) and Leu (known coregulator). C, F. Quantitative densitometry of triplicate EMSAs (including the set shown in B and E) was carried out using NIH ImageJ. Standard errors are shown.

We also examined PgltBD (Fig. 7, D-F). Lrp activates PgltBD, and Leu reduces Lrp binding, though the in vivo effects of Leu on gltBD transcription are minimal, possibly due to increased activation efficiency (Ernsting, Denninger et al. 1993; Borst, Blumenthal et al. 1996; Wiese, Ernsting et al. 1997). Only His and Leu detectably weakened DNA binding by EcoLrp, as judged by residual unshifted DNA. The Leu results are consistent with those in an earlier study (Ernsting, Denninger et al. 1993); our results are smaller in magnitude, probably reflecting our use of a lower coregulator concentration (10 mM vs. 30 mM). However, differences were observed in the distribution of shifted bands, indicating that some AAs affected EcoLrp-DNA complexes even if they only minimally affected the fraction of unbound DNA. The AA yielding this phenomenon were Leu, Ile, and Met (Fig. 7D). Lrp alone yielded two distinct shifted bands. In the presence of Met, Lrp yielded only one shifted band. Leu and Ile resulted in a triplet of shifted bands. Addition of Thr resulted in a diffuse upper band. As these differences were not seen with PlivK DNA, or in the absence of Lrp, they do not appear to be electrophoretic artifacts due to the presence of the various AA.

Use of a plasmid-independent method to assess EcoLrp-dependent coregulator effects. To test further the apparent breadth of Lrp-coregulator interactions revealed by our in vivo and in vitro studies, we used a strain system independent of that used for Figs. 2, 3 and 5. This system has an independent PlivK-lacZ fusion that is chromosomal and results from λplacMu integration, in an independent host background: E. coli strain AAEC546 (which carries a chromosomal Plac-lrp fusion) (Blomfield, Calie et al. 1993; Bhagwat, Rice et al. 1997). This strain was grown in the absence and presence of IPTG, which yield respectively Lrp levels undetectable via western blot, and 0.35 ng per ml of cell extract, roughly equivalent to an intracellular monomer concentration of 1.5 µM (Borst, Blumenthal et al. 1996). These two conditions were used with MOPS glucose medium having no added AA, or 10 mM levels of Leu, Met, Ala, Gln, His, Ile, Lys, Pro, or Thr. The results are shown in Fig. 8, and confirmed the roles of Leu, Met, Ile, Lys, His, and Thr as strong co-regulators. In particular, Met as well as Leu was associated with frank repression, while the others antagonized activation. Additionally, Ala showed strong antagonism of activation in this experimental system, consistent with previous results of others (Mathew, Zhi et al. 1996; Zhi, Mathew et al. 1999; Berthiaume, Crost et al. 2004; Crost, Harel et al. 2004).

Figure 8. In vivo coregulator assays using a chromosomal reporter. A. An example of LacZ assays that were carried out in E. coli strain SPB107, which carries a chromosomal Plac-lrp fusion and a PlivK-lacZ fusion. LacZ activity is plotted against culture density (OD600nm). The cultures were grown in MOPS-glucose medium, ±10 mM of each of the indicated amino acids, and ±0.4 mM IPTG. B. Histogram of the slopes resulting from (A). Slopes from experiments in the absence of IPTG (low Lrp) are shown as grey bars, while filled bars are from cultures containing IPTG (Lrp+). Shading below the graph relates to shading in Fig. 2, with light grey indicating AA that had an intermediate effect in that experiment, and dark grey indicating a strong co-regulator.

Discussion

Global regulators are key to understanding or predicting the regulatory architecture of bacterial cells. The transcriptional regulators of E. coli, and probably most bacteria, follow a power law distribution with respect to number of promoters controlled; the top seven together control about half of all E. coli genes (Martinez-Antonio and Collado-Vides 2003). As one of these top seven global regulators, Lrp plays a central role in E. coli cell physiology (Tani, Khodursky et al. 2002), and we have studied it as a representative of this class (Ernsting, Denninger et al. 1993; Borst, Blumenthal et al. 1996; Bhagwat, Rice et al. 1997; Wiese, Ernsting et al. 1997; Paul, Blumenthal et al. 2001; Tani, Khodursky et al. 2002; Paul, Mishra et al. 2007; Lintner, Mishra et al. 2008).

Predicting regulation via extrapolation from genome sequences is particularly important for bacteria that cannot yet be grown easily in the laboratory, when microarray and other transcriptional data may not be available. Such approaches assume a fairly complete understanding of the conserved regulators [e.g., (Baumbach, Rahmann et al. 2009)]. However, when we used microarray analysis to compare E. coli strains differing only in producing the native EcoLrp or the orthologous PmiLrp from P. mirabilis, substantial differences in gene expression patterns were seen (Lintner, Mishra et al. 2008). This was striking, given that EcoLrp and PmiLrp differ at only 4/164 AA positions (Fig. 1C), and led us to explore here whether Lrp interactions with coregulators might explain part of the unexpected behavior. We have found, through effects on both in vivo reporter fusions and in vitro DNA-binding assays, that a wider range of AA appear to affect EcoLrp and PmiLrp than had previously been appreciated. In this respect, EcoLrp and PmiLrp appear to resemble some other Lrp orthologs (Table 1).

We used PlivKHMGF (PlivK) as our primary model of a Lrp-regulated promoter. It is activated by Lrp (Bhagwat, Rice et al. 1997; Hung, Baldi et al. 2002; Tani, Khodursky et al. 2002), and in the presence of exogenous Leu it is repressed by Lrp (Bhagwat, Rice et al. 1997; Tani, Khodursky et al. 2002; Cho, Barrett et al. 2008). This promoter thus provides a relatively sensitive readout of Lrp-dependent interactions with the coregulator Leu. Global analyses suggest that PlivK also binds Crp (Grainger, Hurd et al. 2005) and Ihf (Grainger, Hurd et al. 2006). Effects of Crp are unlikely in our experiments, as the cultures were all grown with glucose as the carbon source (Tagami and Aiba 1995; Tagami, Inada et al. 1995; Takahashi, Inada et al. 1998). Similarly, except for the experiments involving Ser (and possibly His and Thr; Figs. 3A and 5A), cultures were in unrestricted logarithmic growth so Ihf levels should be fairly constant (Aviv, Giladi et al. 1994; Weglenska, Jacob et al. 1996). Excluding Glu, the tested AA gave remarkably constant expression of PlivK-lacZ in Lrp– cultures (Fig. 2). In contrast, in the presence of Lrp, the coregulators had a range of transcriptional effects on PlivK (Fig. 9).

Figure 9. Summary of coregulator effects on Lrp regulation of PlivK. Relative level of expression is on a unitless vertical scale. Repression is indicated by levels lower than the Lrp– vector control baseline. Depressed or enhanced activation are inferred relative to the expression level in Lrp+ cells with no added AA.

Met acts as a corepressor of PlivK. Leu is a Lrp corepressor of the promoter for livKHMGF, an operon specifying a high-affinity uptake system for branched-chain AAs (Haney, Platko et al. 1992; Bhagwat, Rice et al. 1997; Lintner, Mishra et al. 2008). In this study, Met was found to have strong effects on Lrp-dependent regulation of PlivK similar to those shown by Leu (Figs. 2, and 5B inset). This effect of Met on the Escherichia and Proteus Lrp orthologs has not been reported previously, and has significant physiological implications, but has precedent in other species (Table 1). In Neisseria meningitidis, Met-Lrp interactions appear to adapt metabolism to nutrient-poor environments (Ren, Sainsbury et al. 2007), while for Mycobacterium tuberculosis, Met was shown to bind Lrp (Shrivastava and Ramachandran 2007). As Lrp activates PlivK while Lrp•Leu or Lrp•Met repress it, one would expect Lrp to bind PlivK whether or not the coregulator is present. In fact, EMSA revealed that Met, like Leu, only mildly affects binding of EcoLrp or PmiLrp to PlivK (Fig. 7, A-C), so presumably Met and Leu alter the protein-DNA complex such that it interferes with transcription. It is noteworthy that, at least in E. coli, the AdoMet synthetase gene (metK) is repressed by Lrp and induced by Leu [(Newman, Budman et al. 1998), and Fig. S3 of (Cho, Barrett et al. 2008)], and the same pattern was observed for the Met transporter genes metNIQ (Cho, Barrett et al. 2008); our results suggest that the MetNIQ and MetK substrate Met probably also acts as an inducer via Lrp.

Modulating Lrp activation of PlivK. Of the AA tested in depth (Figs. 3, 5 and 8) only Met and Leu reduced PlivK-lacZ expression to less than that seen in Lrp+ cells when no AA were added. The other AA having substantial effects all appeared to modulate the extent of activation, in that PlivK-lacZ expression levels were higher than in the Lrp– vector control, but were either similar to or lower than in Lrp+ cells with no added AA (Fig. 8). The results for some AA were consistent across all of our experiments (Table 2). This group included the two AA yielding strong repression (Leu and Met), and five others substantially decreasing the extent of activation (Ala, Ser, His, Thr, and Ile). The others (Gln, Pro, and Lys) appear to interact with Lrp in vitro, but to have complex in vivo effects due to the plasmid vs. chromosomal location of the genes, the culture conditions (such as level of aeration), or both.

Table 2. Comparison of relative amino acid effects

AA added    Fig. 2 (a,b)    Figs. 3, 5    Fig. 8
MOPS        (1.00)          (1.00)        (1.00)
Gln         1.04            1.25          0.76
Pro         0.78            1.47          1.02
Ala         0.54            ND (c)        0.26
Ser         0.53            0.20          ND
Lys         0.40            1.29          0.76
His         0.24            0.48          0.49
Thr         0.15            0.78          0.43
Ile         0.12            0.38, 0.30    0.41
Leu         0.03            0.01          0.01
Met         0.004           0.03          0.03

a – Data for EcoLrp in indicated figures, divided by the value with no added AA.

b – Experiment for Fig. 2 used triplicate culture tube-grown cells with single-time assay using plasmid-based system, for Figs. 3 and 5 used shaken flasks with multiple samples and linear fit and plasmid-based system, for Fig. 8 used shaken flasks and linear fit but with chromosomal genes.

c – ND, not determined.

Ile, His and Thr substantially reduced the extent of PlivK activation relative to when no AA were added (Figs. 2, 5B), and did so equally with EcoLrp and PmiLrp. Mycobacterium tuberculosis Lrp binds His and, with substantially lower affinity, Thr (Shrivastava and Ramachandran 2007). In vivo studies indicate that Val affects Pseudomonas aeruginosa Lrp regulation (Boulette, Baynham et al. 2009); given the roles of Leu, Ile and Met, we suspect Val might also interact with EcoLrp, but we did not test this due to growth inhibition. Decreased activation could result from reduced Lrp binding, reduced activation efficiency, or both. The EMSA results indicate that these three AA affect PlivK binding at least as much as Leu does (Fig. 7, A-C), though His is the only one that has as great an effect as Leu on PgltBD binding (Fig. 7, D-F). However, these are not strong effects, suggesting that conformational changes in the Lrp-DNA complex are primarily responsible for the reduced PlivK transcription. With respect to the regulatory logic of these AA serving as coregulators, global analysis suggests that EcoLrp regulates the HisJQMP uptake system, in addition to the Thr transporters YgjU, TdcC and SdaC (Cho, Barrett et al. 2008).

In contrast to Ile, His and Thr, the AA Lys, Gln and Pro increased PlivK activation relative to when no AA were added (Figs. 3B, 9); though as noted above and in Table 2, the results varied somewhat with the experimental system. Neither Gln nor Pro has been reported to affect EcoLrp, though the proline dehydrogenase of R. capsulatus is controlled by a Lrp-like protein in response to Pro (Keuntje, Masepohl et al. 1995). The Sulfolobus Lrp-like protein regulates in response to Lys (Brinkman, Bell et al. 2002). Lys also binds MtbLrp (Shrivastava and Ramachandran 2007), and directly regulates expression of a Lys-ε-aminotransferase (Reddy, Gokulan et al. 2008). Lys serving as a coregulator of EcoLrp would strengthen the logical basis for EcoLrp regulation of a gene for the lysyl-tRNA synthetase lysU (Gazeau, Delort et al. 1992; Lin, Ernsting et al. 1992).

This group of AA appeared to have differential effects on EcoLrp and PmiLrp, though the effect was moderate. The enhanced activation could result from increased binding, increased activation efficiency, or both. Our EMSA results seem inconsistent with increased binding, and changes in migration of the bound complexes are suggestive of conformational differences in those complexes (Fig. 7, A-C).

Possible basis for varied effects of different coregulators. The additional coregulator AA have varied effects on the EcoLrp regulatory pattern of PlivK. Similar to a model for Lrp suggested earlier (Wiese, Ernsting et al. 1997), coregulator binding may reduce Lrp affinity to DNA but at the same time increase the ability of the remaining bound Lrp to activate the promoter. According to this view, the result would depend on the affinity of the given target DNA sequence and the relative effects on Lrp DNA affinity and activation efficiency. Decreased binding and increased activation efficiency may both result from the Lrp octamer-hexadecamer equilibrium, which is driven towards the smaller form by (at least) Leu (Chen, Rosner et al. 2001; Chen and Calvo 2002). AraC, which like EcoLrp or PmiLrp at PlivK can switch between activation and repression, may serve as a useful model to understand Lrp-AA regulation (Dirla, Chien et al. 2009; Rodgers, Holder et al. 2009). However, it is not yet clear how any Lrp protein activates transcription.

Another possible basis for regulatory flexibility in Lrp proteins may involve coregulator binding. It is interesting that the AA-sensing RAM domain of Lrp (Ettema, Brinkman et al. 2002) contains one of the four differences between EcoLrp and PmiLrp (Fig. 1C). Mycobacterium tuberculosis Lrp structures of co-crystals indicate the simultaneous binding of multiple AAs per monomer, in two distinct binding pockets (Shrivastava and Ramachandran 2007). Occupancy of one of these sites, analogous to the Leu-binding site in EcoLrp (de los Rios and Perona 2007), caused a conformational change that seemed likely to influence multimerization, while occupancy of the other site had distinct effects that seemed more likely to alter interactions with RNA polymerase or other transcription factors. Different AAs were selectively bound at the two sites.

Our observations, revealing unexpected coregulator breadth for one of the global regulators at the apex of the γ-proteobacterial “operating system” (Yan, Fang et al. ; Martinez-Antonio and Collado-Vides 2003), have major physiological consequences. They also reveal that extrapolatory predictions of regulatory architecture may require greater depth of knowledge of the transcription factors than has been assumed.

ACKNOWLEDGMENTS

We thank Drs. Robert Lintner and Pankaj Mishra for generating some plasmids used, Dr. Darren Sledjeski for providing antiserum, Dr. Arkady Khodursky for sharing unpublished results, and Dr. J. David Dignam for sharing his thoughts and helping with advice and equipment on Lrp purifications. We also thank Drs. Ivana de la Serna, R. Mark Wooten, and Isabel Novella for advice and comments on the manuscript. This work was supported by funds from NIH grant R01 AI54716, and a Research Challenge award from the University of Toledo, to RMB. BRH was also supported, in part, by a graduate fellowship from the University of Toledo Health Science Campus.

References

Ambartsoumian, G., R. D'Ari, et al. (1994). "Altered amino acid metabolism in lrp

mutants of Escherichia coli K12 and their derivatives." Microbiology 140 ( Pt 7):

1737-44.

Aviv, M., H. Giladi, et al. (1994). "Expression of the genes coding for the Escherichia

coli integration host factor are controlled by growth phase, rpoS, ppGpp and by

autoregulation." Mol Microbiol 14(5): 1021-31.

Baek, C. H., S. Wang, et al. (2009). "Leucine-responsive regulatory protein (Lrp) acts as

a virulence repressor in Salmonella enterica serovar Typhimurium." J Bacteriol

191(4): 1278-92.

Baumbach, J., S. Rahmann, et al. (2009). "Reliable transfer of transcriptional gene

regulatory networks between taxonomically related organisms." BMC Syst Biol

3: 8.

Berthiaume, F., C. Crost, et al. (2004). "Influence of L-leucine and L-alanine on Lrp

regulation of foo, coding for F1651, a Pap homologue." J Bacteriol 186(24):

8537-41.

Bhagwat, S. P., M. R. Rice, et al. (1997). "Use of an inducible regulatory protein to

identify members of a regulon: application to the regulon controlled by the

leucine-responsive regulatory protein (Lrp) in Escherichia coli." J Bacteriol

179(20): 6254-63.

Blomfield, I. C., P. J. Calie, et al. (1993). "Lrp stimulates phase variation of type 1

fimbriation in Escherichia coli K-12." J Bacteriol 175(1): 27-36.

Borst, D. W., R. M. Blumenthal, et al. (1996). "Use of an in vivo titration method to

study a global regulator: effect of varying Lrp levels on expression of gltBDF in

Escherichia coli." J Bacteriol 178(23): 6904-12.

Boulette, M. L., P. J. Baynham, et al. (2009). "Characterization of alanine catabolism in

Pseudomonas aeruginosa and its importance for proliferation in vivo." J Bacteriol

191(20): 6329-34.

Brinkman, A. B., S. D. Bell, et al. (2002). "The Sulfolobus solfataricus Lrp-like protein

LysM regulates lysine biosynthesis in response to lysine availability." J Biol

Chem 277(33): 29537-49.

Calvo, J. M. and R. G. Matthews (1994). "The leucine-responsive regulatory protein, a

global regulator of metabolism in Escherichia coli." Microbiol Rev 58(3): 466-90.

Charlier, D., M. Roovers, et al. (1997). "Cloning and identification of the Sulfolobus

solfataricus lrp gene encoding an archaeal homologue of the eubacterial leucine-

responsive global transcriptional regulator Lrp." Gene 201(1-2): 63-8.

Chen, C. F., J. Lan, et al. (1997). "Metabolic regulation of lrp gene expression in

Escherichia coli K-12." Microbiology 143 ( Pt 6): 2079-84.

Chen, S. and J. M. Calvo (2002). "Leucine-induced dissociation of Escherichia coli Lrp

hexadecamers to octamers." J Mol Biol 318(4): 1031-42.

Chen, S., Z. Hao, et al. (2001). "Modulation of Lrp action in Escherichia coli by leucine:

effects on non-specific binding of Lrp to DNA." J Mol Biol 314(5): 1067-75.

Chen, S., M. H. Rosner, et al. (2001). "Leucine-regulated self-association of leucine-

responsive regulatory protein (Lrp) from Escherichia coli." J Mol Biol 312(4):

625-35.

Cho, B. K., C. L. Barrett, et al. (2008). "Genome-scale reconstruction of the Lrp

regulatory network in Escherichia coli." Proc Natl Acad Sci U S A 105(49):

19462-7.

Crost, C., J. Harel, et al. (2004). "Influence of environmental cues on transcriptional

regulation of foo and clp coding for F165(1) and CS31A adhesins in Escherichia

coli." Res Microbiol 155(6): 475-82.

Daniel, J. and A. Danchin (1979). "Involvement of cyclic AMP and its receptor protein in

the sensitivity of Escherichia coli K 12 toward serine: excretion of 2-ketobutyrate,

a precursor of isoleucine." Mol Gen Genet 176(3): 343-50.

De Felice, M., C. Squires, et al. (1977). "Growth inhibition of Escherichia coli K-12 by L-valine: a consequence of a regulatory pattern." Mol Gen Genet 156(1): 1-7.

de los Rios, S. and J. J. Perona (2007). "Structure of the Escherichia coli leucine-responsive regulatory protein Lrp reveals a novel octameric assembly." J Mol Biol 366(5): 1589-602.

Dirla, S., J. Y. Chien, et al. (2009). "Constitutive mutations in the Escherichia coli AraC

protein." J Bacteriol 191(8): 2668-74.

Ernsting, B. R., J. W. Denninger, et al. (1993). "Regulation of the gltBDF operon of

Escherichia coli: how is a leucine-insensitive operon regulated by the leucine-

responsive regulatory protein?" J Bacteriol 175(22): 7160-9.

Ettema, T. J., A. B. Brinkman, et al. (2002). "A novel ligand-binding domain involved in

regulation of amino acid metabolism in prokaryotes." J Biol Chem 277(40):

37464-8.

Gazeau, M., F. Delort, et al. (1992). "Escherichia coli leucine-responsive regulatory

protein (Lrp) controls lysyl-tRNA synthetase expression." FEBS Lett 300(3): 254-

8.

Grainger, D. C., D. Hurd, et al. (2006). "Association of nucleoid proteins with coding and

non-coding segments of the Escherichia coli genome." Nucleic Acids Res 34(16):

4642-52.

Grainger, D. C., D. Hurd, et al. (2005). "Studies of the distribution of Escherichia coli

cAMP-receptor protein and RNA polymerase along the E. coli chromosome."

Proc Natl Acad Sci U S A 102(49): 17693-8.

Haney, S. A., J. V. Platko, et al. (1992). "Lrp, a leucine-responsive protein, regulates

branched-chain amino acid transport genes in Escherichia coli." J Bacteriol

174(1): 108-15.

Harris, C. L. (1981). "Cysteine and growth inhibition of Escherichia coli: threonine

deaminase as the target." J Bacteriol 145(2): 1031-5.

Hecht, K., S. Zhang, et al. (1996). "D-histidine utilization in Salmonella typhimurium is

controlled by the leucine-responsive regulatory protein (Lrp)." J Bacteriol 178(2):

327-31.

Hershberg, R. and H. Margalit (2006). "Co-evolution of transcription factors and their

targets depends on mode of regulation." Genome Biol 7(7): R62.

Hung, S. P., P. Baldi, et al. (2002). "Global gene expression profiling in Escherichia coli

K12. The effects of leucine-responsive regulatory protein." J Biol Chem 277(43):

40309-23.

Janes, B. K. and R. A. Bender (1999). "Two roles for the leucine-responsive regulatory

protein in expression of the alanine catabolic operon (dadAB) in Klebsiella

aerogenes." J Bacteriol 181(3): 1054-8.

Janga, S. C. and E. Perez-Rueda (2009). "Plasticity of transcriptional machinery in

bacteria is increased by the repertoire of regulatory families." Comput Biol Chem

33(4): 261-8.

Keuntje, B., B. Masepohl, et al. (1995). "Expression of the putA gene encoding proline

dehydrogenase from Rhodobacter capsulatus is independent of NtrC regulation

but requires an Lrp-like activator protein." J Bacteriol 177(22): 6432-9.

Landgraf, J. R., J. Wu, et al. (1996). "Effects of nutrition and growth rate on Lrp levels in

Escherichia coli." J Bacteriol 178(23): 6930-6.

Lin, R., B. Ernsting, et al. (1992). "The lrp gene product regulates expression of lysU in

Escherichia coli K-12." J Bacteriol 174(9): 2779-84.

Lintner, R. E., P. K. Mishra, et al. (2008). "Limited functional conservation of a global

regulator among related bacterial genera: Lrp in Escherichia, Proteus and Vibrio."

BMC Microbiol 8: 60.

Lozada-Chavez, I., S. C. Janga, et al. (2006). "Bacterial regulatory networks are

extremely flexible in evolution." Nucleic Acids Res 34(12): 3434-45.

Madan Babu, M., S. A. Teichmann, et al. (2006). "Evolutionary dynamics of prokaryotic

transcriptional regulatory networks." J Mol Biol 358(2): 614-33.

Marasco, R., M. Varcamonti, et al. (1994). "In vivo footprinting analysis of Lrp binding

to the ilvIH promoter region of Escherichia coli." J Bacteriol 176(17): 5197-201.

Marshall, D. G., B. J. Sheehan, et al. (1999). "A role for the leucine-responsive regulatory

protein and integration host factor in the regulation of the Salmonella plasmid

virulence (spv) locus in Salmonella typhimurium." Mol Microbiol 34(1): 134-45.

Martin, C. (1996). "The clp (CS31A) operon is negatively controlled by Lrp, ClpB, and

L-alanine at the transcriptional level." Mol Microbiol 21(2): 281-92.

Martinez-Antonio, A. and J. Collado-Vides (2003). "Identifying global regulators in

transcriptional regulatory networks in bacteria." Curr Opin Microbiol 6(5): 482-9.

Mathew, E., J. Zhi, et al. (1996). "Lrp is a direct repressor of the dad operon in

Escherichia coli." J Bacteriol 178(24): 7234-40.

Matthews, R. G., Y. Cui, et al. (2000). "Wild-type and hexahistidine-tagged derivatives

of leucine-responsive regulatory protein from Escherichia coli." Methods

Enzymol 324: 322-9.

McFarland, K. A. and C. J. Dorman (2008). "Autoregulated expression of the gene

coding for the leucine-responsive protein, Lrp, a global regulator in Salmonella

enterica serovar Typhimurium." Microbiology 154(Pt 7): 2008-16.

Nakanishi, N., K. Tashiro, et al. (2009). "Regulation of virulence by butyrate sensing in

enterohaemorrhagic Escherichia coli." Microbiology 155(Pt 2): 521-30.

Neidhardt, F. C., P. L. Bloch, et al. (1974). "Culture medium for enterobacteria." J

Bacteriol 119(3): 736-47.

Newman, E. B., L. I. Budman, et al. (1998). "Lack of S-adenosylmethionine results in a

cell division defect in Escherichia coli." J Bacteriol 180(14): 3614-9.

Newman, E. B., R. D'Ari, et al. (1992). "The leucine-Lrp regulon in E. coli: a global

response in search of a raison d'etre." Cell 68(4): 617-9.

Paul, L., R. M. Blumenthal, et al. (2001). "Activation from a distance: roles of Lrp and

integration host factor in transcriptional activation of gltBDF." J Bacteriol

183(13): 3910-8.

Paul, L., P. K. Mishra, et al. (2007). "Integration of regulatory signals through

involvement of multiple global regulators: control of the Escherichia coli gltBDF

operon by Lrp, IHF, Crp, and ArgR." BMC Microbiol 7: 2.

Platko, J. V. and J. M. Calvo (1993). "Mutations affecting the ability of Escherichia coli

Lrp to bind DNA, activate transcription, or respond to leucine." J Bacteriol

175(4): 1110-7.

Platko, J. V., D. A. Willins, et al. (1990). "The ilvIH operon of Escherichia coli is

positively regulated." J Bacteriol 172(8): 4563-70.

Price, M. N., P. S. Dehal, et al. (2007). "Orthologous transcription factors in bacteria

have different functions and regulate different genes." PLoS Comput Biol 3(9):

1739-50.

Primakoff, P. (1981). "In vivo role of the relA+ gene in regulation of the lac operon." J

Bacteriol 145(1): 410-6.

Primakoff, P. and S. W. Artz (1979). "Positive control of lac operon expression in vitro

by guanosine 5'-diphosphate 3'-diphosphate." Proc Natl Acad Sci U S A 76(4):

1726-30.

Rasband, W. (2010). "ImageJ." http://rsb.info.nih.gov/ij/docs/index.html.

Reddy, M. C., K. Gokulan, et al. (2008). "Crystal structure of Mycobacterium

tuberculosis LrpA, a leucine-responsive global regulator associated with

starvation response." Protein Sci 17(1): 159-70.

Ren, J., S. Sainsbury, et al. (2007). "The structure and transcriptional analysis of a global

regulator from Neisseria meningitidis." J Biol Chem 282(19): 14655-64.

Rodgers, M. E., N. D. Holder, et al. (2009). "Functional modes of the regulatory arm of

AraC." Proteins 74(1): 81-91.

Roesch, P. L. and I. C. Blomfield (1998). "Leucine alters the interaction of the leucine-

responsive regulatory protein (Lrp) with the fim switch to stimulate site-specific

recombination in Escherichia coli." Mol Microbiol 27(4): 751-61.

Sambrook, J. and D. W. Russell (2001). Molecular cloning : a laboratory manual. Cold

Spring Harbor, N.Y., Cold Spring Harbor Laboratory Press.

Shrivastava, T. and R. Ramachandran (2007). "Mechanistic insights from the crystal

structures of a feast/famine regulatory protein from Mycobacterium tuberculosis

H37Rv." Nucleic Acids Res 35(21): 7324-35.

Tagami, H. and H. Aiba (1995). "Role of CRP in transcription activation at Escherichia

coli lac promoter: CRP is dispensable after the formation of open complex."

Nucleic Acids Res 23(4): 599-605.

Tagami, H., T. Inada, et al. (1995). "Glucose lowers CRP* levels resulting in repression

of the lac operon in cells lacking cAMP." Mol Microbiol 17(2): 251-8.

Takahashi, H., T. Inada, et al. (1998). "CRP down-regulates adenylate cyclase activity by

reducing the level of phosphorylated IIA(Glc), the glucose-specific

phosphotransferase protein, in Escherichia coli." Mol Gen Genet 259(3): 317-26.

Tani, T. H., A. Khodursky, et al. (2002). "Adaptation to famine: a family of stationary-phase genes revealed by microarray analysis." Proc Natl Acad Sci U S A 99(21): 13471-6.

Wang, Q., J. Wu, et al. (1994). "Regulation of the Escherichia coli lrp gene." J Bacteriol

176(7): 1831-9.

Weglenska, A., B. Jacob, et al. (1996). "Transcriptional pattern of Escherichia coli ihfB

(himD) gene expression." Gene 181(1-2): 85-8.

Wiese, D. E., 2nd, B. R. Ernsting, et al. (1997). "A nucleoprotein activation complex

between the leucine-responsive regulatory protein and DNA upstream of the

gltBDF operon in Escherichia coli." J Mol Biol 270(2): 152-68.

Wild, J. and W. Szybalski (2004). "Copy-control pBAC/oriV vectors for genomic

cloning." Methods Mol Biol 267: 145-54.

Willins, D. A., C. W. Ryan, et al. (1991). "Characterization of Lrp, an Escherichia coli

regulatory protein that mediates a global response to leucine." J Biol Chem

266(17): 10768-74.

Yan, K. K., G. Fang, et al. "Comparing genomes to computer operating systems in terms

of the topology and evolution of their regulatory control networks." Proc Natl

Acad Sci U S A.

Yang, L., R. T. Lin, et al. (2002). "Structure of the Lrp-regulated serA promoter of

Escherichia coli K-12." Mol Microbiol 43(2): 323-33.

Zhi, J., E. Mathew, et al. (1998). "In vitro and in vivo characterization of three major

dadAX promoters in Escherichia coli that are regulated by cyclic AMP-CRP and

Lrp." Mol Gen Genet 258(4): 442-7.

Zhi, J., E. Mathew, et al. (1999). "Lrp binds to two regions in the dadAX promoter region

of Escherichia coli to repress and activate transcription directly." Mol Microbiol

32(1): 29-40.


3 Recognition of DNA by the Helix-Turn-Helix

Global Regulatory Protein Lrp is Modulated by the

Amino Terminus

Benjamin R. Hart1, Pankaj K. Mishra1a, Robert E. Lintner1b, Jennifer M.

Hinerman2, Andrew B. Herr2, and Robert M. Blumenthal1,3*

Department of Medical Microbiology & Immunology, and Program in Infection,

Immunity & Transplantation, The University of Toledo College of Medicine,

Toledo, Ohio, 48104 1

Department of Molecular Genetics, Biochemistry, & Microbiology,

University of Cincinnati College of Medicine, Cincinnati, Ohio

45267 2

Program in Bioinformatics & Proteomics/Genomics, The

University of Toledo, Toledo, Ohio 48104 3

a - current address: Department of Medicine / The Centre for Immunity and Inflammation, UMDNJ - New Jersey Medical School, Newark, NJ 07103

b - current address: Broad Institute of MIT and Harvard University, Cambridge, MA 02142

Running title: Recognition of DNA by the global regulator Lrp

*Corresponding author. Mailing address: Department of Medical Microbiology &

Immunology, University of Toledo College of Medicine, 3000 Arlington Avenue,

Toledo, OH 43614-1021. Phone: (419) 383-5422. Fax: (419) 383-3002. E-mail: [email protected]

Running title: Recognition of DNA by the Helix-Turn-Helix Global Regulatory Protein

Lrp is Modulated by the Amino Terminus

Abstract

The AsnC/Lrp family of regulatory proteins links bacterial and archaeal transcriptional machinery to metabolism. In E. coli, Lrp regulates approximately 400 genes, ~200 of them directly. In earlier studies, Lrp orthologs from V. cholerae, P. mirabilis, and E. coli yielded significantly different regulons when introduced into the same background. These differences arose despite amino acid sequence identities to E. coli Lrp of 92% (V. cholerae) and 98% (P. mirabilis), and despite complete conservation of the helix-turn-helix motifs.

The N-terminal region contains the majority of the sequence differences among these Lrp orthologs, which led us to investigate its role. Through the generation of hybrid proteins, we found that the N-terminal differences are responsible for some of the differences between orthologs in terms of DNA binding (as revealed by mobility shift assays) and multimerization (as revealed by gel filtration, dynamic light scattering, and analytical ultracentrifugation). These observations suggest that the N-terminal tail of Lrp, as with a number of other regulatory proteins, plays a significant modulatory role.

Introduction

A key question in bioinformatics and molecular biology is the extent to which the degree of conservation between proteins is related to conservation of their function. This is a highly nonlinear relationship, as some residues play key structural or functional roles, while others matter very little or not at all. As an example, two proteins that both act as S-adenosyl-L-methionine-dependent methyltransferases can have less than 10% identity between equivalent positions (Fauman, Blumenthal et al. 1999), while it is possible for two proteins to be 88% identical yet have distinct structures and ligand specificities (Alexander, He et al. 2007).

With the flood of bacterial genome sequences becoming available, including some from metagenomic analyses of bacteria that cannot yet be grown in the laboratory (Chen, Yu et al. ; Ellrott, Jaroszewski et al.), there is great interest in predicting the regulatory architecture of these organisms from their DNA sequences. This is relevant even for the emerging field of synthetic biology, where bacterial genomes can now be designed (Gibson, Glass et al.), as regulatory networks add complexity and increased robustness to a genome (Mazurie, Bonchev et al.). Such prediction generally involves regulatory extrapolation from well-studied bacteria, making the assumption that a conserved transcription factor, a conserved target gene for the regulator, and a predicted binding site for the regulator upstream of the target gene together imply a conserved regulatory link (Espinosa, Gonzalez et al. 2005; Madan Babu, Teichmann et al. 2006; Ravcheev, Gerasimova et al. 2007). However, there is evidence that regulatory proteins are more likely than most to exhibit significant functional diversification even over short evolutionary distances (Price, Dehal et al. 2007), perhaps reflecting the needs of bacteria for rapid adaptation of regulatory architecture to new niches (Mazurie, Bonchev et al. ; Lozada-Chavez, Janga et al. 2006).

Figure 1.

Fig. 1. Structure of Lrp octamer. The structure (de los Rios and Perona 2007) was obtained from the Protein Data Bank (http://www.rcsb.org/pdb). The open conformation of the octameric ring is seen when the protein is complexed with DNA (not shown). One of the DNA-binding helix-turn-helix (HTH) motifs is circled. Sequence alignments show the N-terminal half of the protein, with secondary structure indicated above the sequence. The recognition helix of the HTH motif is highlighted in blue. The variable N-terminal region is expanded into a sequence logo. GenBank accession numbers are indicated for each sequence.


The global regulator Lrp (Leucine-responsive regulatory protein) affects the expression of ~200 E. coli genes directly and many more indirectly (D'Ari, Lin et al. 1993; Calvo and Matthews 1994; Newman and Lin 1995; Brinkman, Ettema et al. 2003; Yokoyama, Ishijima et al. 2006; Cho, Barrett et al. 2008). Lrp binds regulatory DNA via a helix-turn-helix (HTH) motif (Fig. 1), and alterations to this motif strongly influence DNA binding (Platko and Calvo 1993; Leonard, Smits et al. 2001; de los Rios and Perona 2007). Results from X-ray diffraction of an Lrp-DNA complex support the role of its HTH in DNA sequence recognition (de los Rios and Perona 2007). Lrp orthologs from many different genera bear a perfectly conserved HTH motif (Fig. 1), so we were surprised to find that even the highly conserved Lrp proteins from Proteus mirabilis (98% identical to E. coli Lrp) and Vibrio cholerae (92% identical) have significant functional differences (Lintner, Mishra et al. 2008).

In other HTH proteins it has been demonstrated that amino acid side chains projecting from the DNA-facing side of the downstream (recognition) helix of the HTH motif influence DNA sequence specificity. For example, exchanging these amino acids between two related phage repressors resulted in exchange of their distinct DNA sequence specificities (Wharton and Ptashne 1985; Hollis, Valenzuela et al. 1988). We suspect that, as in the above cases, the N-terminal region of Lrp (Fig. 1) may be responsible for some of the observed functional differences.

Figure 2.

Fig. 2. Amino-terminal sequences of Lrp orthologs. A. Amino acid sequence alignment of the N-termini of the Lrp alleles used in this study. The alignment also shows the similarity to the N-terminus of the CI activator/repressor from λ phage. The “actives” are substitutions in this region of λCI that retained the ability to maintain lysogeny (Eliason, Weiss et al. 1985; Clarke, Beamer et al. 1991; Kim and Hu 1995). B. Crystal structure of the λCI activator/repressor bound to DNA, showing its N-terminal tails making contacts to the major groove (http://www.rcsb.org/pdb) (Clarke, Beamer et al. 1991).

Several lines of evidence led us to suspect that the N-terminal region of Lrp (Fig. 1) might be responsible for some of the functional differences we observed. First, of the four amino acid differences between P. mirabilis Lrp and that of E. coli, two are located within the N-terminal 10 amino acids (for V. cholerae Lrp, three of the 13 total differences are in this region). Figure 1 shows the amino termini and HTH motifs of representative Lrp protein sequences from the Enterobacteriaceae and Vibrionaceae; the strong conservation of their HTH regions is clear. The logo in Fig. 1 shows that, while the N-terminal 13 aa are more variable than the HTH, this variation is quite limited in scope among the Lrp orthologs sharing identical HTH motifs, which makes these few N-terminal differences candidates for explaining some of the functional differences. Second, when the 10 N-terminal-most amino acids were deleted from E. coli Lrp, DNA binding activity was substantially reduced (de los Rios and Perona 2007). Finally, the N-terminal portion of Lrp has sequence similarity to that of the phage λ repressor/activator CI (Fig. 2A). The N-terminal 6 amino acid arm of CI is required for proper DNA binding by this HTH protein; deleting this arm reduces operator binding by ~8000-fold (Pabo, Krovatin et al. 1982; Eliason, Weiss et al. 1985; Benson, Adams et al. 1992; Kim and Hu 1995), and there is structural information in support of its role (Fig. 2B).
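The difference counts and identity percentages quoted above can be cross-checked with simple arithmetic. The following is a minimal sketch, not part of the original analysis: it assumes an Lrp length of 164 residues (an assumption, not stated in this chapter), and the function names and the toy aligned strings are hypothetical.

```python
def percent_identity(seq_a, seq_b):
    """Percent of aligned positions carrying identical residues (no gaps)."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be pre-aligned to equal length")
    matches = sum(1 for a, b in zip(seq_a, seq_b) if a == b)
    return 100.0 * matches / len(seq_a)


def identity_from_differences(length, n_differences):
    """Percent identity implied by a count of substituted positions."""
    return 100.0 * (length - n_differences) / length


# Assumption: E. coli Lrp is taken to be 164 residues long.
LRP_LENGTH = 164

# Difference counts are those quoted in the text (4 and 13).
pmi_identity = identity_from_differences(LRP_LENGTH, 4)
vch_identity = identity_from_differences(LRP_LENGTH, 13)
print(f"P. mirabilis vs. E. coli: {pmi_identity:.1f}%")
print(f"V. cholerae vs. E. coli: {vch_identity:.1f}%")

# Toy aligned strings (hypothetical, not real Lrp sequence) with one mismatch.
print(percent_identity("AAKKA", "AAQKA"))
```

Under the assumed length, four and 13 differences round to the 98% and 92% identities cited for the P. mirabilis and V. cholerae orthologs.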

We report that, as previously demonstrated for phage repressors, the N-terminal arm of Lrp significantly affects its specificity and function, and we discuss the effects of subtle differences on global regulation and their implications for sequence-based regulatory prediction.

Results

Comparative effects of N-terminal tail mutations in Lrp and λCI

To explore the possible functional activity of the Lrp N-terminal region, we first generated mutants in which the well-conserved Lys at position 6 (Fig. 1) was replaced by Gln (Fig. 2A). This substitution was chosen on the basis of the apparent similarity between the N-termini of Lrp and the CI repressor-activator of lambda phage (Fig. 2A). In the low-temperature crystal structure of λCI bound to its DNA target, a Lys (K4, where numbering does not include the removed fMet; Fig. 2A) forms H-bonds in the major groove to the O6 positions of two consecutive Gs in the DNA. Any substitution at K4 (e.g. K4Q of λCI) greatly reduces binding despite the fact that the HTH motif is unchanged (Clarke, Beamer et al. 1991; Benson, Adams et al. 1992).

We tested whether a similar effect resulted from the corresponding change in Lrp. One test promoter was PgltB, which is activated 20-30 fold by Lrp (Ernsting, Atkinson et al. 1992; Borst, Blumenthal et al. 1996; Wiese, Ernsting et al. 1997; Paul, Blumenthal et al. 2001; Paul, Mishra et al. 2007). We used a PgltB-lacZ fusion, with either WT or K6Q EcoLrp supplied from a plasmid. As expected, WT EcoLrp gave nearly 30-fold activation (relative to the vector control; Fig. 3). The K6Q mutant gave ~60% of the WT EcoLrp activation level of PgltB. The effect of the K6Q substitution on Lrp activation of PgltB is significant, though substantially smaller than the effect of the equivalent substitution on CI activation of λ PRM.

Figure 3.

Fig. 3. Effects of K6Q mutation on EcoLrp. LacZ activity is plotted vs. culture density, and the slopes indicate relative activity. Three promoters were fused to the lacZ reporter gene. Vector control (no Lrp) is shown as open circles, while WT EcoLrp is filled circles, and K6QEcoLrp is grey circles.
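The slope-based comparison described in this caption can be sketched as follows. This is an illustration under stated assumptions rather than the analysis actually used: the readings are hypothetical, chosen only to mirror the roughly 30-fold activation and ~60% K6Q level described in the text, and `ols_slope` is a hypothetical helper name.

```python
def ols_slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need two or more paired observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("culture densities must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Hypothetical readings: culture density and LacZ activity (arbitrary units).
density = [0.1, 0.2, 0.3, 0.4]
lacz = {
    "vector": [1.0, 2.0, 3.0, 4.0],
    "WT EcoLrp": [30.0, 60.0, 90.0, 120.0],
    "K6Q EcoLrp": [18.0, 36.0, 54.0, 72.0],
}

# Each slope is the relative expression of the reporter fusion.
slopes = {name: ols_slope(density, ys) for name, ys in lacz.items()}

# Fold activation is the slope with Lrp relative to the vector control.
fold_wt = slopes["WT EcoLrp"] / slopes["vector"]
k6q_fraction = slopes["K6Q EcoLrp"] / slopes["WT EcoLrp"]
print(f"WT fold activation: {fold_wt:.1f}")
print(f"K6Q as fraction of WT: {k6q_fraction:.2f}")
```

For a repressed promoter such as Plrp, the same ratio falls below 1, and its reciprocal gives the fold repression.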

Two other tested promoters showed essentially no difference between the WT and K6Q Lrp proteins (Fig. 3). One, PlivK, is activated by Lrp in the absence of coregulator leucine (Haney, Platko et al. 1992; Bhagwat, Rice et al. 1997; Tani, Khodursky et al. 2002; Cho, Barrett et al. 2008), and yielded the expected activation by both Lrp proteins. Another tested promoter was Plrp from E. coli. Plrp is autogenously repressed about three-fold by Lrp (Wang, Wu et al. 1994). Compared to the vector control, WT EcoLrp repressed Plrp-lacZ expression nearly three-fold (Fig. 3), and the K6Q substitution had no significant effect.

Differences in the amino-terminal tail contribute to some of the regulatory differences among Lrp orthologs.

The K6Q EcoLrp results indicate that the responsiveness of an Lrp-sensitive promoter to regulatory input can be changed by substitution of a single amino acid that resides outside the known DNA recognition helix. While the K6Q effects were modest, the substitution tests only the role of the basic region in the N-terminal arm, and does not rule out effects of other positions in that arm. To test this possibility, we made substitutions in E. coli Lrp that yielded the N-terminal 13 amino acid sequences from P. mirabilis (PmiEcoLrp) or from V. cholerae (VchEcoLrp; Fig. 2A). We also used the WT proteins from the three species (EcoLrp, PmiLrp, and VchLrp; Fig. 1).

Figure 4.

Fig. 4. Effects of Lrp orthologs and hybrids on two promoters. The gltB and lrp promoters were fused to lacZ, and LacZ activities were plotted vs. culture density. Only those fusions and conditions yielding significant differences between Lrp orthologs are shown.

We performed β-galactosidase assays using lacZ fusions to two promoters that were described in the previous section: PgltB and Plrp. The PgltB-lacZ fusion yielded the expected strong activation by EcoLrp, with modestly decreased activation by PmiLrp and VchLrp (Fig. 4). The hybrid Lrp proteins gave intermediate levels of activation, though the differences were limited in scale.

PlivK showed no differences among the orthologs, so the hybrids were not tested on it. PgltB showed less activation with PmiLrp or VchLrp than with EcoLrp; the hybrids were intermediate. Plrp showed hyperrepression with PmiLrp, as previously reported (Lintner, Mishra et al. 2008); the VchEcoLrp hybrid acted like EcoLrp, while the PmiEcoLrp hybrid acted like PmiLrp.

We next tested Plrp, which in our previous study showed 2-3 fold greater repression by PmiLrp than by EcoLrp or VchLrp (Lintner, Mishra et al. 2008). We obtained the same result (Fig. 4). The hybrids behaved comparably: the PmiEcoLrp hybrid elicited enhanced repression, while the response to VchEcoLrp was indistinguishable from that to EcoLrp, suggesting a clear role for the N-terminal region in regulating Plrp (Fig. 4). These results suggest that the two N-terminal amino acid differences between PmiLrp and EcoLrp are responsible for their Plrp regulatory difference.

Figure 5.

Fig. 5. Effects of EcoLrp and PmiLrp on PompT. A plasmid containing a PompT:LacZ fusion was transformed into Be10.2 cells containing plasmids carrying EcoLrp, PmiLrp, or PmiEcoLrp. LacZ activity was plotted vs. culture density. A. (Left Panel) Cells were grown in media containing Ile, Val, and thiamine. B. (Right Panel) Cells were grown in media containing Ala, Leu, Ile, Val, and thiamine. C. EMSA of 23 nM PompT DNA: lane 1, no Lrp added; lanes 2 and 3, 250 and 500 nM EcoLrp; lanes 4 and 5, PmiLrp. D. EMSA of PompT as in C, with the addition of 10 mM Leu.


The third promoter we tested was chosen based on our previously-reported microarray analysis of an E. coli K-12 lrp-Tn10 strain producing either EcoLrp or PmiLrp ectopically (Lintner, Mishra et al. 2008). Those data indicated that the gene for the export processing protease OmpT exhibited one of the strongest differential effects of the two lrp alleles. Further, published and unpublished chromatin immunoprecipitation data show that EcoLrp binds to PompT in vivo (Cho, Barrett et al. 2008, A. Khodursky, pers. commun.). In the absence of the coregulator leucine, we saw significant and indistinguishable activation with all tested lrp alleles (Fig. 5). In the presence of Leu and Ala, EcoLrp was associated with repression, while PmiLrp yielded activation. The hybrid PmiEcoLrp had no effect relative to the vector control, which is intermediate between the effects of PmiLrp and EcoLrp. The control of PompT is direct, with both PmiLrp and EcoLrp binding the promoter in the presence and absence of 10 mM Leu (Fig. 5C, D).


Figure 6.

Fig. 6. Mobility shift assays of Plrp DNA. The DNA including Plrp was 335 bp, and present at 23 nM. Lrp concentrations in order of lane (nM) are: 0, 1, 37.5, 75, 112.5, 150, 187.5, 225, 262.5, and 300. Images were quantified via ImageJ, and dissociation constants were calculated by plotting the amount of unshifted DNA versus Lrp concentration and determining, from a linear fit, the concentration of Lrp necessary for 50% of the DNA to be shifted (Fig. S2).
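The half-shift estimate described in this caption can be sketched as a linear fit solved for the 50% point. This is a minimal illustration, not the procedure as actually run: the band fractions below are hypothetical, and `linear_fit` and `half_shift_concentration` are hypothetical helper names. Because the DNA is present at 23 nM, a value obtained this way is best read as an apparent constant.

```python
def linear_fit(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need two or more paired observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("concentrations must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def half_shift_concentration(lrp_nM, fraction_unshifted):
    """Lrp concentration (nM) at which the fitted unshifted fraction is 0.5."""
    slope, intercept = linear_fit(lrp_nM, fraction_unshifted)
    if slope >= 0:
        raise ValueError("unshifted fraction should fall as Lrp increases")
    return (0.5 - intercept) / slope


# Hypothetical densitometry: fraction of DNA left unshifted in each lane.
lrp_nM = [0.0, 75.0, 150.0, 225.0, 300.0]
fraction_unshifted = [1.0, 0.8, 0.6, 0.4, 0.2]

apparent_kd = half_shift_concentration(lrp_nM, fraction_unshifted)
print(f"Apparent half-shift concentration: {apparent_kd:.1f} nM")
```

A linear fit is only a reasonable approximation over the roughly linear part of the titration; a binding isotherm fit would be the usual alternative where the data allow it.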


Lrp amino-terminal tail affects DNA binding.

To determine if the stronger Plrp repression by PmiLrp than by EcoLrp was due to differences in DNA binding, we performed mobility shift assays using Plrp DNA and the native EcoLrp and PmiLrp proteins. We did not use VchLrp, VchEcoLrp, or K6Q-EcoLrp in these studies because we purified untagged proteins, and these three did not bind to the cation exchange column (see Methods). Interestingly, all of the nonbinders contain a substitution for a lysine residue in the N-terminus (Fig. 2). The results indicated, surprisingly, that PmiLrp bound Plrp with much lower affinity than did EcoLrp (Fig. 6). The general affinities were too weak to study robustly using EMSAs, though apparent dissociation constants could be calculated by measuring the concentration required to shift 50% of the DNA band (Supplemental Fig. 2). To test whether the N-terminus was responsible for this difference in affinity, we assayed the PmiEcoLrp hybrid. A sample titration from triplicate EMSA gels is shown in Fig. 7A, and the results are consistent with the N-terminal tail being responsible for the differences in the affinity to Plrp (Fig. 7B).

Figure 7.

Fig 7. Single-gel mobility shift assay of Plrp DNA. A. We carried out side-by-side

EMSAs on one gel, with Lrp concentrations of 30, 225, and 400 nM. B. Densitometric quantitation of results with 400 nM Lrp (±SE) from gel in (A) and two other replicate gels not shown.
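The mean ± SE summary across replicate gels can be computed as in the sketch below. The densitometry values are hypothetical placeholders for three gels, and `mean_and_se` is a hypothetical helper name; only the standard-library `statistics` and `math` modules are used.

```python
import math
import statistics


def mean_and_se(values):
    """Mean and standard error of the mean (sample SD divided by sqrt(n))."""
    if len(values) < 2:
        raise ValueError("need two or more replicates to estimate SE")
    mean = statistics.mean(values)
    se = statistics.stdev(values) / math.sqrt(len(values))
    return mean, se


# Hypothetical fraction of DNA shifted at 400 nM Lrp on three replicate gels.
shifted_fraction = {
    "EcoLrp": [0.62, 0.58, 0.60],
    "PmiLrp": [0.21, 0.25, 0.20],
    "PmiEcoLrp": [0.24, 0.22, 0.26],
}

for protein, replicates in shifted_fraction.items():
    mean, se = mean_and_se(replicates)
    print(f"{protein}: {mean:.2f} ± {se:.2f}")
```

With only three gels per protein the SE is itself a rough estimate, so small differences between bars deserve cautious interpretation.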

Lrp multimerization is affected by the amino-terminal tail.

Lrp forms octamers and hexadecamers, and the shift between these states may be responsible for regulatory changes in response to the coregulator leucine (Chen, Rosner et al. 2001; Chen and Calvo 2002; de los Rios and Perona 2007). We therefore tested whether differences at the N-terminus could affect the Lrp multimeric state.

In purifying the native Lrp proteins, we noticed salt-dependent effects on the multimerization states in the final gel filtration step (not shown). Octamers and monomers were inferred to be the predominant peaks at 0.2 M NaCl, eluting at 25 and 38 min respectively; the 38 min peak eluted slightly faster than the myoglobin standard (16.7 kDa). In the case of EcoLrp, the predominant gel filtration peak was at 38 min in the presence of 0.2 M NaCl, but shifted to 25 min upon increasing the salt to 1 M.

Figure 8.

Fig. 8. Multimerization of Lrp as revealed by dynamic light scattering. Given the shape of Lrp (see Fig. 1), mass is not expected to vary linearly with multimeric state; however, for comparison, the expected masses of an Lrp tetramer, octamer, and hexadecamer are approximately 75, 150, and 300 kDa, respectively. Open circles indicate samples in the absence of Leu; closed circles indicate samples in the presence of 10 mM Leu. A. EcoLrp. B. PmiLrp. C. PmiEcoLrp. Error bars in A and C are the SE of 30 data points collected at each concentration. In B, the means are shown from 30 individual assays.
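The reference masses in this caption follow from multiplying a subunit mass by the number of subunits. The short sketch below back-calculates the subunit mass from the approximate octamer value given in the caption, which is an assumption made for illustration; the names are hypothetical. As the caption cautions, masses estimated by light scattering are not expected to scale this cleanly, so these figures are reference points only.

```python
# Assumption: subunit mass back-calculated from the ~150 kDa octamer value.
OCTAMER_KDA = 150.0
MONOMER_KDA = OCTAMER_KDA / 8


def expected_mass_kda(n_subunits):
    """Expected mass (kDa) of an Lrp multimer with n_subunits subunits."""
    return n_subunits * MONOMER_KDA


for name, n in (("monomer", 1), ("tetramer", 4), ("octamer", 8), ("hexadecamer", 16)):
    print(f"{name:12s} {expected_mass_kda(n):6.1f} kDa")
```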

Using dynamic light scattering (DLS) (Harding 1994), we observed in the presence of leucine that EcoLrp, PmiLrp, and PmiEcoLrp produced octamers, consistent with previous findings (Fig. 8). However, in the absence of leucine only PmiLrp and PmiEcoLrp produced forms consistent with hexadecamers, in a concentration-dependent manner at concentrations above 10 µM (Fig. 8).

Figure 9.

Fig. 9. Velocity sedimentation of Lrp. A. EcoLrp at ~20 µM with 0.2 M KCl (dark blue) and no salt added (gray). Figures are plotted with the sedimentation coefficient (s*) versus the sedimentation coefficient distribution (C(S)). Peaks are consistent with the forms as labeled: s* 12, 16mer; s* 7, octamer; s* 5, tetramer; and s* 2, monomer. The peak centered around 0.25 is likely degradation products. B. PmiLrp at ~20 µM with 0.2 M KCl (dark blue) and no salt added (gray). C. PmiEcoLrp hybrid at ~20 µM with 0.2 M KCl (dark blue) and no salt added (gray).

We sought to carry out a second test using analytical ultracentrifugation (AUC) (Stafford 2009). All three Lrp proteins yielded forms consistent with monomers, tetramers, and octamers. However, except for PmiEcoLrp, hexadecamers were only observed in the absence of 0.2 M KCl (Fig. 9). This apparent absence of hexadecamers in 0.2 M KCl was consistent with our gel filtration results (see above).

In the absence of 0.2 M KCl, putative 16mers appeared in the analytical ultracentrifuge traces (Fig. 9). The appearance of hexadecamers is consistent with previously published data for His-tagged EcoLrp (Chen, Rosner et al. 2001; Chen and Calvo 2002). It is not clear why EcoLrp 16mers were seen via ultracentrifugation but not via DLS. The salt sensitivity of 16mer appearance suggests that the 16mer results predominantly from electrostatic interactions.

The PmiEcoLrp hybrid behaved differently, with hexadecamers appearing in the presence of 0.2 M KCl (Fig. 9C) but not in the absence of salt.

One possible explanation for this difference is that PmiEcoLrp has a tendency to aggregate in low concentrations of KCl. It is possible that this protein does form 16mers without added salt, but that we were unable to observe them because of aggregation and the resulting low concentration of measurable protein in the centrifugation experiments.

We sought to reconcile our data with the previously-reported EcoLrp hexadecamers in the presence of 0.2 M KCl (Chen, Rosner et al. 2001; Chen and Calvo 2002). We noted that those studies used a version of EcoLrp carrying an amino-terminal 6xHis tag. Given our observations on the effects of the N-terminal tail on Lrp multimerization, we generated and purified the His-tagged EcoLrp. Under our conditions, the tagged protein formed polydisperse aggregates when observed by DLS, yielding a very high sum of squares (SOS; data not shown). This was done under the same conditions as used by Chen et al. In addition, we attempted to reduce the noise by adding 0.2 M KCl or 10 mM Leu, neither of which significantly reduced the SOS. Irrespective of the possible effects of an amino-terminal His tag, our other dynamic light scattering results suggest that the native EcoLrp and PmiLrp proteins differ in their multimerization properties, and that the PmiEcoLrp hybrid behaves like PmiLrp.

Discussion

Role of amino-terminal tails in DNA binding by HTH proteins

A variety of proteins that use helix-turn-helix motifs for sequence-specific DNA binding also rely on flexible amino-terminal tails (sometimes referred to as “arms”) for DNA binding (Aravind, Anantharaman et al. 2005). These tails generally contain basic amino acids, and examples include λCI (Fig. 2) (Pabo, Krovatin et al. 1982), the widespread bacterial nickel-responsive regulator NikR (Benanti and Chivers 2007), and homeodomain proteins in the Animalia (Dragan, Li et al. 2006). The effect of these tails can be profound, such as the ~8000-fold reduction in binding when the tail is deleted from λCI (Eliason, Weiss et al. 1985). Many other HTH proteins do not rely on amino-terminal tails for DNA binding, and compensatory HTH mutations can in some cases restore binding to tail mutants (Benson, Adams et al. 1992). One possible interpretation is that supplementing HTH-DNA interactions with a dependence on an amino-terminal tail allows greater evolutionary flexibility in fine-tuning relative affinities for different binding sites. This strategy would be particularly important for a global regulatory protein such as Lrp, which directly controls ~200 genes in E. coli (A. Khodursky, pers. commun.; Cho, Barrett et al. 2008).

Effects of the amino-terminal tail on Lrp

Our results suggest that the amino-terminal tail of Lrp proteins plays significant roles in promoter regulation, DNA binding, and multimerization. This interpretation is consistent with the profound effects on DNA binding of deleting the first 10 amino acids of EcoLrp (de los Rios and Perona 2007), though not surprisingly the effects reported here of 1-3 substitutions are more limited than those of complete tail removal. In addition, Lrp is active as an octameric ring with DNA wrapping around the outside edge, where eight HTH motifs and eight tails are present (Chen, Rosner et al. 2001; Chen and Calvo 2002; de los Rios and Perona 2007), which is consistent with the extended DNase I footprints generated by Lrp on target binding sites (Wiese, Ernsting et al. 1997; Zhi, Mathew et al. 1999; Jafri, Chen et al. 2002). This implies strong cooperativity that would tend to reduce the importance of individual Lrp-DNA contacts. This may also explain the striking pattern of sequence divergence among Enterobacteriaceae and Vibrionaceae (Fig. 1), where Lrp orthologs have completely conserved HTH motifs and >90% overall identity, but show substantial variation in their amino-terminal tails.

Table 1. Relative effects of EcoLrp and its derivatives on transcriptional fusions

Lrp                     PlivK-lacZ   PgltB-lacZ   Plrp-lacZ   PompT-lacZ
                        -Leu         -Leu         +Leu        +Leu
WT (Eco)                (1.00)a      (1.00)       (1.00)      (1.00)
K6Q                     1.18         0.60         0.90        NDb
V2I,S4N (PmiEco)        ND           0.76         0.33        4.29
K5Y,R7K,G9S (VchEco)    ND           0.69         0.96        ND

a, Ratio of slopes from the plots of β-galactosidase activity vs. culture density, mutant / WT.
b, ND, not determined.


Table 1 compares the effects on transcriptional fusions of just EcoLrp and its derivatives (K6Q-EcoLrp, PmiEcoLrp and VchEcoLrp; Fig. 2A), to eliminate effects of the substitutions outside of the amino-terminal tail in PmiLrp and VchLrp. The results are consistent with the hypothesis that the Lrp tail plays a promoter-specific fine-tuning role: no single promoter or EcoLrp variant consistently yields the largest effects. For example, PmiEcoLrp has the two greatest effects relative to EcoLrp, threefold on Plrp and fourfold on PompT, but has the smallest effect of the tested variants on PgltB.

Our results suggest that the Lrp amino-terminal tails affect both multimerization and DNA binding. EMSA reveals that PmiLrp binds Plrp DNA with lower affinity than EcoLrp, and that the hybrid PmiEcoLrp behaves in this respect like PmiLrp (Figs. 6, 7). Similarly, DLS reveals that PmiLrp has a greater propensity than EcoLrp to form multimers larger than octamers, and that the hybrid PmiEcoLrp behaves like PmiLrp (Fig. 8). In this respect, it is worth noting that the carboxyl-terminal 10 amino acids of Lrp also appear to play a role in multimerization (Chen, Rosner et al. 2001), though EcoLrp and PmiLrp are identical there. The AUC data, however, indicate that hexadecamers likely result from electrostatic interactions (Fig. 9). PmiEcoLrp was more prone to aggregation, which could explain the apparently reduced protein concentration and reduced signal in the AUC experiments (Fig. 9C). This aggregation may also have contributed to the lack of an obvious hexadecamer in this solution. It appears that altering native Lrp proteins through changes within the N-terminus can result in unexpected behavior. The same applies to observations made with His6Lrp.


The effects of the amino-terminal tail on in vivo and in vitro behavior of EcoLrp appear to be significant, and some of the implications of this are discussed below. Nevertheless, it is not yet obvious how the in vivo and in vitro effects fit together. Comparing just EcoLrp and PmiEcoLrp and their interactions with Plrp, EcoLrp represses to a lesser extent in vivo, yet has higher affinity for the promoter DNA in vitro. One possible explanation is that the effective concentration of EcoLrp in the cell is lower than that of PmiEcoLrp, perhaps due to differing affinities for nonspecific DNA (Chen, Hao et al. 2001; Peterson, Dahlquist et al. 2007). Given the variability and high content of charged amino acids in the amino-terminal tail (Fig. 1), differences in Lrp sequestration on nonspecific DNA seem likely. Furthermore, by analogy with the charged amino-terminal tails of histones (Munshi, Shafi et al. 2009; Cheng and Blumenthal 2010), it is even possible that binding of proteins such as Lrp is modulated by post-translational modifications.

Implications for bioinformatic prediction of regulation

Predicting transcriptional regulatory interactions, based on a conserved transcription factor, conserved target gene, and conserved binding site for the regulator upstream of the target gene, poses significant bioinformatic challenges at each level. Binding site prediction is probably the most difficult of these, but that element of the predictive approach can be obviated for testing purposes by placing different orthologous regulators into the same bacterial background. When this was done with Lrp in E. coli, unexpectedly large differences were seen (Lintner, Mishra et al. 2008). That result suggested that the 4/164 substitutions between PmiLrp and EcoLrp, and the 12/164 between VchLrp and EcoLrp, had significant functional effects despite the fact that none were within the helix-turn-helix domain. Our results, in establishing that the amino-terminal tail of EcoLrp affects regulatory function, are consistent with other studies indicating particular evolutionary flexibility of transcription factors (Lozada-Chavez, Janga et al. 2006; Madan Babu, Teichmann et al. 2006; Price, Dehal et al. 2007; Janga and Perez-Rueda 2009).
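As a quick check on the substitution counts quoted above, the following Python sketch converts them to percent identity. It assumes that all three orthologs are compared over the same 164 aligned positions with no gaps, which the counts imply but the text does not state. The function name is illustrative only.

```python
def percent_identity(substitutions: int, length: int) -> float:
    """Percent of aligned positions that are identical (ungapped alignment assumed)."""
    if length <= 0 or not 0 <= substitutions <= length:
        raise ValueError("need length > 0 and 0 <= substitutions <= length")
    return 100.0 * (length - substitutions) / length


LRP_LENGTH = 164  # aligned residues, taken from the 4/164 and 12/164 counts

pmi_vs_eco = percent_identity(4, LRP_LENGTH)
vch_vs_eco = percent_identity(12, LRP_LENGTH)

print(f"PmiLrp vs EcoLrp: {pmi_vs_eco:.1f}% identical")
print(f"VchLrp vs EcoLrp: {vch_vs_eco:.1f}% identical")
```

Under that assumption the values come to about 97.6% and 92.7%, in line with the ~98% and >90% identities cited elsewhere in this work.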

Methods

Bacterial strains, media, and growth conditions

Most bacterial strains used in this study were based on the Be10.2 background and carried pCC1-based plasmids encoding EcoLrp (E. coli Lrp), PmiLrp (Proteus mirabilis Lrp), VchLrp (Vibrio cholerae Lrp), hybrid Lrps in which the N-terminal tail of EcoLrp was substituted to match that of PmiLrp or VchLrp, or the K6Q-EcoLrp mutant, in which the K at position 6 was changed to Q.

In all cases cells were grown in baffled flasks shaken at 37°C. For lacZ assays, cells were grown on LB plates, then transferred to M9 plates before being moved to morpholinopropane sulfonic acid (MOPS) glucose minimal medium (Teknova, Hollister, CA). For Plrp-lacZ analyses, additional amino acids and supplements were used at the following final concentrations where listed: L-alanine 10 mM L-leucine, 0.4 mM L-isoleucine and 0.4 mM L-valine, thiamine. Antibiotics were used, where indicated, as follows: 100 µg ampicillin/ml, 15 µg chloramphenicol/ml, 100 µg kanamycin/ml, and 10 µg tetracycline/ml. Cells were maintained in log phase for at least 10 generations before being diluted 1:50 for experimental cultures. For protein purification cells were grown in STG media.

β-galactosidase assays

Strains were grown to exponential phase in glucose minimal MOPS medium. Samples were taken at 25 min intervals throughout the growth period. Levels of β-galactosidase were determined by o-nitrophenyl-β-D-galactoside (ONPG) hydrolysis. β-galactosidase levels were plotted against culture absorbance, and points were fitted via linear regression. The resulting slope was taken as the β-galactosidase activity.
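The slope-based activity measure described above, and the mutant / WT ratio reported in Table 1, can be illustrated with a minimal Python sketch. The readings below are hypothetical values chosen for illustration, not data from this study, and the function name is an assumption. The fit is an ordinary least-squares line, which is one common reading of "linear regression" here.

```python
def least_squares_line(x, y):
    """Return (slope, intercept) of the ordinary least-squares line y = slope*x + intercept."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical readings (culture absorbance vs. beta-galactosidase level, arbitrary units).
absorbance = [0.1, 0.2, 0.3, 0.4, 0.5]
wt_levels = [12.0, 22.0, 32.0, 42.0, 52.0]
mutant_levels = [5.0, 8.0, 11.0, 14.0, 17.0]

wt_slope, _ = least_squares_line(absorbance, wt_levels)
mutant_slope, _ = least_squares_line(absorbance, mutant_levels)

# Relative effect expressed as the ratio of slopes, mutant / WT.
relative_effect = mutant_slope / wt_slope
print(f"WT slope {wt_slope:.1f}, mutant slope {mutant_slope:.1f}, ratio {relative_effect:.2f}")
```

Using the slope rather than a single time point makes the measure independent of the culture density at which any one sample happened to be taken.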

Generation of hybrid Lrps

Lrp hybrids were generated using the quick-change method (Epicentre, Madison WI).

Purification of Lrps

Native Lrp protein was purified as previously reported. For EcoLrp, JWD3-1 cells were used; for PmiLrp, Be10.2 cells containing PmiLrp inserted into pTRC99a were used; and PmiEcoLrp was inserted into pBad24 and transformed into Be10.2 cells. Cells were grown in STG media until an OD600 of between 1.0 and 1.5 was reached, at which point they were induced for 2 h with 0.5 mM IPTG, or with 0.2% arabinose for PmiEcoLrp. Cells were pelleted and frozen until purification. For purification, cells were sonicated in TG10ED buffer (10 mM Tris pH 8.0, 10% glycerol, 0.1 mM EDTA, 0.2 M NaCl and 0.1 mM DTT) with 100 µl of 0.2 g/ml phenylmethylsulfonyl fluoride (PMSF) per 500 ml cell suspension. Cells were sonicated in a cup horn probe (Ultrasonics, Plainview, NY) at maximum power for five rounds of 1 min, separated by 2 min on ice. The lysate was centrifuged for 30 min at 15k x g, and the resulting supernatant was loaded onto a 1 x 12 cm BioRex70 cation exchange column (BioRad, Hercules, CA) equilibrated with TG10ED. Proteins were eluted with a 0.2 – 1.0 M NaCl gradient, and fractions were analyzed by examining Coomassie-stained SDS polyacrylamide gels for the Lrp 18.9 kDa band. Fractions containing Lrp were pooled and concentrated with VivaSpin concentrators having a 10,000 MW cutoff (Sartorius Stedim, Dusseldorf, Germany). Concentrated Lrp fractions were then loaded onto a 1 x 28 cm Superose12 column (GE Healthcare, Uppsala, Sweden) equilibrated with TG10ED buffer. Fractions containing highly purified Lrp were concentrated and dialyzed into MES buffer (10 mM N-morpholinoethane sulfonate, pH 6.25, 0.1 mM EDTA, and 0.2 M KCl).

For purification of His6Lrp, EcoLrp was cloned into pET-45B(+) between the PshAI and SalI sites. Cells were grown to an OD600 of between 1.0 and 1.5, then pelleted and frozen until purification. Cells were sonicated in 10 mM MES buffer pH 6.5 (3 ml/g pelleted cells) with 10 mM imidazole, 5 mM β-mercaptoethanol, 0.5 M KCl, and 100 µl/500 ml cultured cells of 20 mg/ml EtOH, using a horn cup probe (Ultrasonics, Plainview, NY) for 6 x 10 s with 10 s breaks on ice. Cells were spun at 10k x g for 30 min at 4°C, and the supernatant was incubated with Ni-NTA (Qiagen) at a ratio of 1 ml beads / 4 ml supernatant for 1 h at 4°C. The slurry was then loaded onto a 10 cm x 0.5 cm column and the beads were washed 3 x with 5 ml lysis buffer / ml of beads. Four 0.5 ml fractions were eluted with lysis buffer containing 200 mM imidazole and analyzed via SDS-PAGE. Fractions containing Lrp were pooled and concentrated with VivaSpin concentrators having a 10,000 MW cutoff (Sartorius Stedim, Dusseldorf, Germany). Concentrated Lrp fractions were then loaded onto a 1 x 28 cm Superose12 column (GE Healthcare, Uppsala, Sweden) equilibrated with TG10ED buffer. Fractions containing highly purified Lrp were concentrated and dialyzed into MES buffer (10 mM N-morpholinoethane sulfonate, pH 6.25, 0.1 mM EDTA, and 0.2 M KCl).

Dynamic Light Scattering

Dynamic light scattering was performed using a Dynapro Titan instrument from Wyatt Technologies using Dynamics V6 software. The 30 µl samples were spun at 15k rpm for 15 minutes at 4°C, after which the top 16 µl were removed (to avoid dust) and added to the 12 µl 3-window cuvette (Wyatt Technologies). Readings were collected 30 times for 30 seconds.

Mobility Shift assay

Purified Lrp was mixed with 23 nM of a 375 bp fragment of Plrp DNA or a 575 bp fragment of PompT DNA, as indicated, in a solution containing 40 mM Tris pH 7.4, 60 mM KCl, 0.1 mM EDTA, 5% glycerol, 80 mM NaCl, and 1 mM DTT, with leucine added at 30 mM as indicated. The samples were incubated at 23°C for 20 min prior to the addition of 1 µl Novex high density TBE sample buffer (Invitrogen, Carlsbad, CA), with 30 mM leucine added where indicated, and immediately loaded onto a 1.5 mm 4% acrylamide TBE gel in an Xcell apparatus (Invitrogen), with 30 mM leucine cast into the gel where indicated. Samples were electrophoresed at 110 V until they entered the gel, and then resolved at 80 V at room temperature. The gel was stained with 0.5 µg ethidium bromide / ml and visualized with an UltraLum Imager (Omega, Claremont, CA); densitometry was performed with ImageJ (Rasband 2010).

Analytical Ultracentrifugation

Sedimentation velocity experiments were used to determine the size distribution for each Lrp sample (EcoLrp, PmiLrp, and PmiEcoLrp) at 25°C using a Beckman XL-I analytical ultracentrifuge. AUC experiments for each protein sample were performed at 30,000 rpm, with data collected at 230 nm and 238 nm using an absorbance optics system. Data were deconvoluted to determine sedimentation coefficient distributions using the c(s) analysis routine in the program SEDFIT (Schuck 2000).
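When interpreting peaks as monomers, tetramers, octamers, or hexadecamers, it can help to have the expected masses of each assembly at hand. The short Python sketch below multiplies the 18.9 kDa monomer mass quoted in the purification section by the subunit count. It is bookkeeping only: a sedimentation coefficient also depends on shape and hydration, so this is not a conversion from s* to mass, and the names used are illustrative.

```python
MONOMER_KDA = 18.9  # Lrp monomer mass quoted for the SDS gel band

# Assemblies discussed in the text and their subunit counts.
OLIGOMERS = {"monomer": 1, "tetramer": 4, "octamer": 8, "hexadecamer": 16}


def expected_mass_kda(subunits: int, monomer_kda: float = MONOMER_KDA) -> float:
    """Expected mass of a homo-oligomer, ignoring bound ligands and counterions."""
    if subunits < 1:
        raise ValueError("subunits must be a positive integer")
    return subunits * monomer_kda


expected = {name: expected_mass_kda(n) for name, n in OLIGOMERS.items()}
for name, mass in expected.items():
    print(f"{name:<12} {mass:6.1f} kDa")
```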

ACKNOWLEDGEMENTS

We would like to thank Dr. Ronald Viola and Ms. Buenafe Arachea for the use of and help with the dynamic light scattering instrument, Dr. Darren Sledjeski for providing antiserum, Dr. Arkady Khodursky for sharing unpublished results, and Dr. J. David Dignam for sharing his thoughts and helping with advice and equipment on Lrp purifications. We thank Drs. Ivana de la Serna, R. Mark Wooten, and Isabel Novella for advice and comments on the manuscript. This work was supported by funds from NIH grant R01 AI54716, and a Research Challenge award from the University of Toledo, to RMB. BRH was also supported, in part, by a graduate fellowship from the University of Toledo Health Science Campus.


References

Alexander, P. A., Y. He, et al. (2007). "The design and characterization of two proteins

with 88% sequence identity but different structure and function." Proc Natl Acad

Sci U S A 104(29): 11963-8.

Aravind, L., V. Anantharaman, et al. (2005). "The many faces of the helix-turn-helix

domain: transcription regulation and beyond." FEMS Microbiol Rev 29(2): 231-

62.

Benanti, E. L. and P. T. Chivers (2007). "The N-terminal arm of the Helicobacter pylori

Ni2+-dependent transcription factor NikR is required for specific DNA binding."

J Biol Chem 282(28): 20365-75.

Benson, N., C. Adams, et al. (1992). "Mutant lambda repressors with increased operator

affinities reveal new, specific protein-DNA contacts." Genetics 130(1): 17-26.

Bhagwat, S. P., M. R. Rice, et al. (1997). "Use of an inducible regulatory protein to

identify members of a regulon: application to the regulon controlled by the

leucine-responsive regulatory protein (Lrp) in Escherichia coli." J Bacteriol

179(20): 6254-63.

Borst, D. W., R. M. Blumenthal, et al. (1996). "Use of an in vivo titration method to

study a global regulator: effect of varying Lrp levels on expression of gltBDF in

Escherichia coli." J Bacteriol 178(23): 6904-12.

Brinkman, A. B., T. J. Ettema, et al. (2003). "The Lrp family of transcriptional

regulators." Mol Microbiol 48(2): 287-94.

Calvo, J. M. and R. G. Matthews (1994). "The leucine-responsive regulatory protein, a

global regulator of metabolism in Escherichia coli." Microbiol Rev 58(3): 466-90.

Chen, S. and J. M. Calvo (2002). "Leucine-induced dissociation of Escherichia coli Lrp

hexadecamers to octamers." J Mol Biol 318(4): 1031-42.

Chen, S., Z. Hao, et al. (2001). "Modulation of Lrp action in Escherichia coli by leucine:

effects on non-specific binding of Lrp to DNA." J Mol Biol 314(5): 1067-75.

Chen, S., M. H. Rosner, et al. (2001). "Leucine-regulated self-association of leucine-

responsive regulatory protein (Lrp) from Escherichia coli." J Mol Biol 312(4):

625-35.

Chen, T., W. H. Yu, et al. (2010). "The Human Oral Microbiome Database: a web accessible

resource for investigating oral microbe taxonomic and genomic information."

Database (Oxford) 2010: baq013.

Cheng, X. and R. M. Blumenthal (2010). "Coordinated chromatin control: structural and

functional linkage of DNA and histone methylation." Biochemistry 49(14): 2999-

3008.

Cho, B. K., C. L. Barrett, et al. (2008). "Genome-scale reconstruction of the Lrp

regulatory network in Escherichia coli." Proc Natl Acad Sci U S A 105(49):

19462-7.

Clarke, N. D., L. J. Beamer, et al. (1991). "The DNA binding arm of lambda repressor:

critical contacts from a flexible region." Science 254(5029): 267-70.

D'Ari, R., R. T. Lin, et al. (1993). "The leucine-responsive regulatory protein: more than

a regulator?" Trends Biochem Sci 18(7): 260-3.

de los Rios, S. and J. J. Perona (2007). "Structure of the Escherichia coli leucine-

responsive regulatory protein Lrp reveals a novel octameric assembly." J Mol

Biol 366(5): 1589-602.

Dragan, A. I., Z. Li, et al. (2006). "Forces driving the binding of homeodomains to

DNA." Biochemistry 45(1): 141-51.

Eliason, J. L., M. A. Weiss, et al. (1985). "NH2-terminal arm of phage lambda repressor

contributes energy and specificity to repressor binding and determines the effects

of operator mutations." Proc Natl Acad Sci U S A 82(8): 2339-43.

Ellrott, K., L. Jaroszewski, et al. (2010). "Expansion of the protein repertoire in newly explored

environments: human gut microbiome specific protein families." PLoS Comput

Biol 6(6): e1000798.

Ernsting, B. R., M. R. Atkinson, et al. (1992). "Characterization of the regulon controlled

by the leucine-responsive regulatory protein in Escherichia coli." J Bacteriol

174(4): 1109-18.

Espinosa, V., A. D. Gonzalez, et al. (2005). "Comparative studies of transcriptional

regulation mechanisms in a group of eight gamma-proteobacterial genomes." J

Mol Biol 354(1): 184-99.

Fauman, E. B., R. M. Blumenthal, et al. (1999). "Structure and evolution of AdoMet-

dependent methyltransferases." 1-38.

Gibson, D. G., J. I. Glass, et al. (2010). "Creation of a bacterial cell controlled by a chemically

synthesized genome." Science 329(5987): 52-6.

Haney, S. A., J. V. Platko, et al. (1992). "Lrp, a leucine-responsive protein, regulates

branched-chain amino acid transport genes in Escherichia coli." J Bacteriol

174(1): 108-15.

Harding, S. E. (1994). "Determination of diffusion coefficients of biological

macromolecules by dynamic light scattering." Methods Mol Biol 22: 97-108.

Hollis, M., D. Valenzuela, et al. (1988). "A repressor heterodimer binds to a chimeric

operator." Proc Natl Acad Sci U S A 85(16): 5834-8.

Jafri, S., S. Chen, et al. (2002). "ilvIH operon expression in Escherichia coli requires Lrp

binding to two distinct regions of DNA." J Bacteriol 184(19): 5293-300.

Janga, S. C. and E. Perez-Rueda (2009). "Plasticity of transcriptional machinery in

bacteria is increased by the repertoire of regulatory families." Comput Biol Chem

33(4): 261-8.

Kim, Y. I. and J. C. Hu (1995). "Operator binding by lambda repressor heterodimers with

one or two N-terminal arms." Proc Natl Acad Sci U S A 92(16): 7510-4.

Leonard, P. M., S. H. Smits, et al. (2001). "Crystal structure of the Lrp-like

transcriptional regulator from the archaeon Pyrococcus furiosus." Embo J 20(5):

990-7.

Lintner, R. E., P. K. Mishra, et al. (2008). "Limited functional conservation of a global

regulator among related bacterial genera: Lrp in Escherichia, Proteus and Vibrio."

BMC Microbiol 8: 60.

Lozada-Chavez, I., S. C. Janga, et al. (2006). "Bacterial regulatory networks are

extremely flexible in evolution." Nucleic Acids Res 34(12): 3434-45.

Madan Babu, M., S. A. Teichmann, et al. (2006). "Evolutionary dynamics of prokaryotic

transcriptional regulatory networks." J Mol Biol 358(2): 614-33.

Mazurie, A., D. Bonchev, et al. (2010). "Evolution of metabolic network organization." BMC

Syst Biol 4: 59.

Munshi, A., G. Shafi, et al. (2009). "Histone modifications dictate specific biological

readouts." J Genet Genomics 36(2): 75-88.

Newman, E. B. and R. Lin (1995). "Leucine-responsive regulatory protein: a global

regulator of gene expression in E. coli." Annu Rev Microbiol 49: 747-75.

Pabo, C. O., W. Krovatin, et al. (1982). "The N-terminal arms of lambda repressor wrap

around the operator DNA." Nature 298(5873): 441-3.

Paul, L., R. M. Blumenthal, et al. (2001). "Activation from a distance: roles of Lrp and

integration host factor in transcriptional activation of gltBDF." J Bacteriol

183(13): 3910-8.

Paul, L., P. K. Mishra, et al. (2007). "Integration of regulatory signals through

involvement of multiple global regulators: control of the Escherichia coli gltBDF

operon by Lrp, IHF, Crp, and ArgR." BMC Microbiol 7: 2.

Peterson, S. N., F. W. Dahlquist, et al. (2007). "The role of high affinity non-specific

DNA binding by Lrp in transcriptional regulation and DNA organization." J Mol

Biol 369(5): 1307-17.

Platko, J. V. and J. M. Calvo (1993). "Mutations affecting the ability of Escherichia coli

Lrp to bind DNA, activate transcription, or respond to leucine." J Bacteriol

175(4): 1110-7.

Price, M. N., P. S. Dehal, et al. (2007). "Orthologous transcription factors in bacteria

have different functions and regulate different genes." PLoS Comput Biol 3(9):

1739-50.

Rasband, W. (2010). "ImageJ." http://rsb.info.nih.gov/ij/docs/index.html.

Ravcheev, D. A., A. V. Gerasimova, et al. (2007). "Comparative genomic analysis of

regulation of anaerobic respiration in ten genomes from three families of gamma-

proteobacteria (Enterobacteriaceae, Pasteurellaceae, Vibrionaceae)." BMC

Genomics 8: 54.

Schuck, P. (2000). "Size-distribution analysis of macromolecules by sedimentation

velocity ultracentrifugation and lamm equation modeling." Biophys J 78(3):

1606-19.

Stafford, W. F., 3rd (2009). "Protein-protein and ligand-protein interactions studied by

analytical ultracentrifugation." Methods Mol Biol 490: 83-113.

Tani, T. H., A. Khodursky, et al. (2002). "Adaptation to famine: a family of stationary-

phase genes revealed by microarray analysis." Proc Natl Acad Sci U S A 99(21):

13471-6.

Wang, Q., J. Wu, et al. (1994). "Regulation of the Escherichia coli lrp gene." J Bacteriol

176(7): 1831-9.

Wharton, R. P. and M. Ptashne (1985). "Changing the binding specificity of a repressor

by redesigning an alpha-helix." Nature 316(6029): 601-5.

Wiese, D. E., 2nd, B. R. Ernsting, et al. (1997). "A nucleoprotein activation complex

between the leucine-responsive regulatory protein and DNA upstream of the

gltBDF operon in Escherichia coli." J Mol Biol 270(2): 152-68.

Yokoyama, K., S. A. Ishijima, et al. (2006). "Feast/famine regulatory proteins (FFRPs):

Escherichia coli Lrp, AsnC and related archaeal transcription factors." FEMS

Microbiol Rev 30(1): 89-108.

Zhi, J., E. Mathew, et al. (1999). "Lrp binds to two regions in the dadAX promoter region

of Escherichia coli to repress and activate transcription directly." Mol Microbiol

32(1): 29-40.


Supplemental Figures

Supplemental Figure 1.

Fig. S1. Velocity sedimentation of Lrp orthologs at A. ~50 µM and B. ~20 µM. Color indicates the ortholog: EcoLrp (blue), PmiLrp (red) and PmiEcoLrp (green). Peaks at s* of 7 indicate octamer, s* of 5 is consistent with tetramer, and s* of 2 is probably monomer. The PmiEcoLrp peak at 11 is likely contamination, as it did not vary in a concentration-dependent manner. The peak centered around 0.25 is likely a degradation product.

Fig S2.

Figure S2. Linear fits for calculation of apparent association constants for Fig. 6. Percentage of shifted DNA was plotted against Lrp concentration and fit with a linear function. EcoLrp (dashed line, open circles) (Fig. 6A), PmiLrp (maroon line, filled circles) (Fig. 6B), EcoLrp + Leu (red line, open squares) (Fig. 6C). Constants were calculated for 50% shifted DNA.
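The calculation summarized in this legend can be sketched in Python as follows: fit a straight line to percent shifted DNA versus Lrp concentration, then solve the fitted line for the concentration at which 50% of the DNA is shifted. The data points are hypothetical and the function names are illustrative. How the half-shift concentration is converted into an apparent association constant is not spelled out in the legend, so the sketch stops at the concentration itself.

```python
def least_squares_line(x, y):
    """Return (slope, intercept) of the ordinary least-squares line through (x, y)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def concentration_at_half_shift(conc, pct_shifted):
    """Concentration at which the fitted line crosses 50% shifted DNA."""
    slope, intercept = least_squares_line(conc, pct_shifted)
    if slope <= 0:
        raise ValueError("expected percent shifted to increase with concentration")
    return (50.0 - intercept) / slope


# Hypothetical titration (Lrp concentration in nM vs. percent of DNA shifted).
lrp_nm = [50.0, 100.0, 150.0, 200.0]
pct_shifted = [15.0, 35.0, 55.0, 75.0]

c50 = concentration_at_half_shift(lrp_nm, pct_shifted)
print(f"50% of the DNA is shifted at about {c50:.1f} nM Lrp")
```

A linear fit is only a local approximation to a binding curve, so it is most defensible when the points used bracket the 50% mark, as they do in this example.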


4 Discussion

4.1 Lrp orthologs from closely related species have distinct regulatory effects.

Our studies into the functional differences between the closely-related Lrp orthologs from Proteus mirabilis and Escherichia coli have revealed that even a few changes in protein sequence can have important effects. These changes have occurred outside of regions previously identified as being functionally important, such as the DNA binding helix-turn-helix motif or amino acid coregulator interacting domains. Therefore, it is possible that these mutations have been selected for and confer an advantage in their native background. The Lrp orthologs tested were 98% identical, and would be expected to behave in a functionally similar manner for the purposes of predicting regulation by extrapolation (Janga and Collado-Vides 2007; Kreimer, Borenstein et al. 2008). Earlier studies with these Lrp orthologs revealed that this is not entirely the case, with the orthologs yielding only partially-overlapping regulatory patterns when moved into the same background (Lintner, Mishra et al. 2008). The differences in regulatory patterns, DNA binding affinity, and multimerization can be attributed to these few amino acid differences. This is despite complete conservation of the DNA-recognizing helix-turn-helix domains. These differences led us to investigate the amino acid substitutions within these orthologs to determine their implications.

In order to make more accurate predictions of regulation across species, the nature of the relevant regulatory proteins should be well understood. Our data suggest that even the extensively-studied global regulator Lrp has not been characterized sufficiently. Specifically, the N-terminus of Lrp plays a substantial role in its function, and Lrp responds to a broader range of coregulators than previously appreciated.

4.1.1 The N-terminus of Lrp is responsible for some of the differences between Lrp function.

Lrp is highly conserved over nearly its full length among γ-proteobacteria, suggesting that much of the protein is functionally important (Friedberg, Platko et al. 1995). The N-terminal tail is one of the most variable regions of Lrp, which by the standards of Friedberg et al. would indicate that it may not be functionally important. Others showed that deletion of the first ten amino acids profoundly reduces in vitro DNA binding by EcoLrp, though the basis for this effect was not determined (de los Rios and Perona 2007). Our data indicate that the N-terminus is responsible for some of the regulatory differences observed between PmiLrp and EcoLrp, possibly due to effects on multimerization.

Others have shown that EcoLrp forms octamers and hexadecamers (16mers), with leucine favoring the octameric state (Chen, Rosner et al. 2001; Chen and Calvo 2002). In our studies, dynamic light scattering revealed that PmiLrp and PmiEcoLrp form 16mers, while EcoLrp remains in an octameric state over the tested concentration range. However, our analytical ultracentrifugation data indicated the formation of 16mers by both EcoLrp and PmiLrp. These differences likely have functional consequences, as there is evidence that 16mers of EcoLrp have a higher DNA binding affinity than the octameric form (Chen and Calvo 2002). It appears that both Lrp orthologs form 16mers, but that this formation is very sensitive to the concentrations of both Lrp and salt. The relatively small amount of total Lrp in the hexadecameric state at ~20 µM suggests that Lrp is predominantly in the octameric state in solution, and possibly in the cell, and that leucine further stabilizes this octameric structure. The role of the N-terminal region in multimerization is unclear, though the data suggest that the N-terminus of PmiLrp predisposes the Lrp orthologs to formation of 16mers and to aggregation.

4.1.2 Comparisons of the N-terminus of Lrp to established regulatory regions offer insights into Lrp function.

The use of DNA-binding tails has been seen in regulatory proteins from eukaryotes, archaea, and bacteria. The variability within this region suggests a mechanism by which fine-tuning of transcription factor interactions with diverse sites can occur.

Lrp forms a circular structure, with DNA wrapped around the outside. The unstructured tails extend out to the DNA beyond the helix-turn-helix (de los Rios and Perona 2007). This is reminiscent of histones. The H2a, H2b, H3 and H4 nucleosome proteins each have positively-charged and unstructured N-terminal tails that play important roles in DNA binding (Arya and Schlick 2006).

Further inquiry into this analogy was prompted by the lysine-rich region in the N-terminal tail of Lrp. Histones have a 15-30 residue tail that, like that of Lrp, is lysine rich, and is a site for post-translational modifications (Kimura, Matsubara et al. 2005). Post-translational modifications of lysines in prokaryotes include methylation (Polevoda and Sherman 2007), acetylation (Zhang, Sprung et al. 2009), and a ubiquitin-like modification known as pupylation (Burns, Liu et al. 2009). Acetylation and other post-translational modifications are central to the ability of nucleosomal histones to modulate transcription (Kimura, Matsubara et al. 2005). Given the apparently analogous basic tails on histones and Lrp, I experimented briefly to explore this similarity. Many regulatory proteins and metabolic enzymes in E. coli are acetylated (Yu, Kim et al. 2008; Zhang, Sprung et al. 2009), adding support to the possibility of Lrp acetylation. The results were inconclusive, as background levels with the acetyl-lysine antibody were extremely high. Attempts to reduce this background by immunoprecipitation resulted in bands on two parallel gels that appeared to overlap, with Lrp and acetyl-lysine bands appearing at the same position on the gel. However, the protein A beads still generated a high level of background. A major complication in these experiments was a limitation of the available antibodies, as both the anti-acetyllysine and the anti-Lrp antibodies were generated in rabbit, making re-probing experiments difficult. It remains to be determined whether Lrp activity can be modulated in a manner similar to that of the histones, which would add a further layer to transcriptional regulation in bacteria.

Homeodomain proteins contain a helix-turn-helix motif next to either an N- or C-terminal tail (Dragan, Li et al. 2006). As in Lrp, these tails are unstructured and typically contain lysine or arginine (Dragan, Li et al. 2006). The tails of homeodomain proteins make essential contacts with the bases of the minor groove (Otting, Qian et al. 1990; Dragan, Li et al. 2006). These tails contribute to binding of the homeodomain to DNA, and are linked to multimerization of homeodomain proteins into dimers (Fraenkel, Rould et al. 1998; Dragan, Li et al. 2006). The function of the tails of homeodomain proteins is thus similar to the function of the tails of Lrp.

N-terminal tails play similar roles not only in helix-turn-helix proteins but also in ribbon-helix-helix proteins such as ParG. In ParG, a small (~10 kDa) transcription factor that regulates the parGF operon, a similar disordered tail of 30 residues contributes to DNA binding and specificity (Carmelo, Barilla et al. 2005). The tail of ParG, like that of Lrp, contains positively-charged regions.

4.1.3 N-terminus of Lrp is a possible region for fine tuning Lrp regulation without sacrificing important regulatory connections.

N-terminal plasticity within highly similar Lrp orthologs suggests that, even in closely related organisms, regulatory networks are subject to broad changes at the global regulator, with additional fine tuning at individual promoters. My data indicate that the differences within the N-terminus result in plasticity in DNA binding and multimerization. A possible explanation for this plasticity is that there are fewer evolutionary constraints on the flexible N-terminal tail than on the defined 3D structure of the helix-turn-helix. Changes within the HTH motif of a global regulator such as Lrp could result in radically altered binding, with the potential for disrupting regulation at a global scale. Therefore, between closely related species, it seems likely that modifications would occur within regions where they result in small adjustments to the regulatory network rather than broad, strong effects.

4.2 Lrp has a broader range of coregulators than was known.

Several additional amino acid coregulators were identified for EcoLrp. These additional interactions have implications for the depth to which Lrp senses the cell’s physiological status. They also raise the question of whether this is another feature that differs among Lrp orthologs. Other differences between Lrp orthologs were observed in their sensitivity to amino acids. The region of Lrp responsible for sensitivity to amino acids is the regulation of prokaryotic amino acid metabolism (RAM) domain at the C-terminal end of the protein (Ettema, Brinkman et al. 2002). Four amino acid differences between PmiLrp and EcoLrp result in differences in sensitivity. This functional alteration associated with a very small number of amino acid changes is, again, inconsistent with a process of genetic drift. Further, two of the four changes are in the N-terminal tail (see above), while the other two occur in regions that were not known to be important for coregulator binding: one is within the linker region between the N- and C-terminal domains, and the other is within the RAM domain, though distant from the coregulator binding sites. These differences suggest that much of the Lrp protein is subject to selective constraints, as suggested by Friedberg et al. (Friedberg, Platko et al. 1995), and that differences between the orthologs are possibly a mechanism for fine tuning a complex regulatory network.

4.2.1 Amino acid sensitivity is broader than previously expected.

Most Lrp orthologs have a limited number of known coregulator sensitivities (Yokoyama, Ishijima et al. 2006). There are a few exceptions. For example, Lrp from Mycobacterium tuberculosis binds at least five amino acids (of ten tested, five had stronger affinity than the others) (Shrivastava and Ramachandran 2007). Leucine and alanine were originally the only identified co-regulators of EcoLrp function (Mathew, Zhi et al. 1996; Zhi, Mathew et al. 1999; Berthiaume, Crost et al. 2004; Crost, Harel et al. 2004). We found that EcoLrp activity is also regulated by methionine, isoleucine, histidine, and threonine.

Typically, Lrp co-regulators are metabolic intermediates or amino acid products of pathways regulated by Lrp (Brinkman, Ettema et al. 2003), generating feedback loops. This appears to be the case for the strong amino acid coregulators of EcoLrp (namely, Leu, Ala, and Met). Among genes involved in the production of alanine, dadAX (alanine racemase) (Zhi, Mathew et al. 1998; Zhi, Mathew et al. 1999) and gltBD (glutamate synthase; glutamate is an essential substrate for production of many amino acids) are regulated by Lrp (Ernsting, Denninger et al. 1993). A similar case occurs for leucine: while the leu genes are not directly regulated by Lrp (Landgraf, Boxer et al. 1999), ilvGMEDA, which produces branched chain amino acids, is directly regulated by Lrp (Rhee, Parekh et al. 1996), and Lrp also regulates ilvIH (Jafri, Chen et al. 2002). Histidine biosynthesis is affected, with hisGDCBHAFI being regulated by Lrp (Lintner, Mishra et al. 2008). Methionine biosynthesis is also within the Lrp regulon, with metE and metH both regulated (Lintner, Mishra et al. 2008), though this may be an indirect regulatory effect. The thrA gene is also within the EcoLrp regulon, so threonine is likewise involved in a feedback loop regulating its own production via Lrp (Lintner, Mishra et al. 2008). Therefore, Lrp in E. coli participates in feedback loops by regulating many of the genes that produce its amino acid coregulators.

With the exception of the hisGDCBHAFI operon, metE, and metH, promoters for all of these genes have been pulled down in a chromatin immunoprecipitation (ChIP-chip) analysis using an antibody to epitope-tagged Lrp (Cho, Barrett et al. 2008). The level of regulation in expression microarray analyses varied from 2- to 19-fold (Lintner, Mishra et al. 2008).

Another question is what else these six coregulators might have in common, such as positions within amino acid metabolism, intracellular concentrations, and physical properties. For comparison, the Pyrococcus OT3 Lrp ortholog DM1 responds to a wide variety of amino acids that affect its stability, and their shared feature is hydrophobicity (Michiyo Sakuma 2005). In contrast, Mycobacterium tuberculosis Lrp responds principally to aromatic amino acids (Shrivastava and Ramachandran 2007). Preferential coregulators of EcoLrp include branched chain amino acids (with the possible exception of valine, which was not tested due to toxicity issues). Hydrophobicity is a feature of most EcoLrp co-regulators; however, not all hydrophobic amino acids act as coregulators. His is the only charged co-regulator, and size does not appear important.

Table 3 summarizes features of the strong coregulators of EcoLrp. There are two basic questions about these amino acids. First, is there an obvious regulatory logic for this choice of coregulators? Second, is there an obvious physical basis for the selection? For the second question, as discussed below, it is important that Lrp appears to have two distinct sets of amino acid binding pockets (Chen and Calvo 2002; Shrivastava and Ramachandran 2007), and they probably have distinct preferences.

Table 3. Properties of strong coregulators

Amino acid   Terminal (a)   MW (b)   pI (b)    Volume, Å3 (b)
Leu          yes            131.2    6.0       166.7
Met          no             149.2    5.7       162.9
Ala          no             89.1     6.1       88.6
Ile          yes            131.2    6.0       166.7
His          yes            155.2    7.6       153.2
Lys          yes            146.2    9.7       168.6
Thr          no             119.1    5.6 (c)   116.1

a - terminal amino acids are a final step in the biosynthesis pathway, from (Neidhardt 1987)

b - molecular weight, isoelectric point, and volume in cubic angstroms. Information from (http://www.imb-jena.de/IMAGE_AA.html) except as indicated below.

c - information from (http://www.geneinfinity.org/sp_aaprops.html)

Terminal amino acids within their pathways would represent final products and thus be useful metabolites for sensing the overall amino acid pool. Some of the strong coregulators of EcoLrp, including Leu, His, and Ile, are terminal amino acids within their biosynthetic pathways (Table 3). Others, such as Met, Thr, and Ala, are fed into pathways generating downstream amino acids (Neidhardt and Curtiss 1996). For example, Met can be further processed into Cys, and Ala can be converted into Val through transaminase C (Neidhardt and Curtiss 1996). In addition, Lrp seems to be insensitive to other terminal amino acids such as Phe or Tyr (Neidhardt and Curtiss 1996).

Another possible commonality among Lrp coregulators is their concentrations within bacterial pools. Amino acid pools in E. coli contain relatively low levels of Met, Ile, and Leu; however, these are not the lowest levels of amino acids (Raunio and Rosenqvist 1970; Raunio and Leppavirta 1975). Comparisons with the stoichiometric content of amino acids within E. coli indicate that Leu is relatively low in comparison to many amino acids, including Ile and Thr, both of which have effects on Lrp (Neidhardt 1987). Amino acid pools in Aerobacter aerogenes are comparable to those of E. coli, with Ile and Leu below detectable concentrations while Ala concentrations are significantly higher (Tempest, Meers et al. 1970). These levels vary with growth rate, such that concentrations increase during rapid growth (Tempest, Meers et al. 1970). Because the amino acid pools are largest during periods of rapid growth, EcoLrp could use this signal to help modulate transcription during these periods; when nutritional levels drop, EcoLrp could then help prepare the cells for stationary phase. In any case, it does not appear that the set of EcoLrp coregulators reflects selection for responsiveness to either the lowest or highest intracellular pool concentrations. Thus no single property obviously explains the set of amino acids that co-regulate EcoLrp.
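
The comparison of physical properties in Table 3 can be illustrated with a short script. The following Python sketch is illustrative only and was not part of the analyses reported here. It uses only the values transcribed from Table 3, and the helper names (property_range, ranges_overlap) are hypothetical. It splits the strong coregulators into terminal and non-terminal amino acids and asks whether the two groups occupy overlapping ranges of molecular weight, isoelectric point, and volume.

```python
# Values transcribed from Table 3 (no new data are introduced here).
TABLE_3 = {
    "Leu": {"terminal": True,  "mw": 131.2, "pi": 6.0, "volume": 166.7},
    "Met": {"terminal": False, "mw": 149.2, "pi": 5.7, "volume": 162.9},
    "Ala": {"terminal": False, "mw": 89.1,  "pi": 6.1, "volume": 88.6},
    "Ile": {"terminal": True,  "mw": 131.2, "pi": 6.0, "volume": 166.7},
    "His": {"terminal": True,  "mw": 155.2, "pi": 7.6, "volume": 153.2},
    "Lys": {"terminal": True,  "mw": 146.2, "pi": 9.7, "volume": 168.6},
    "Thr": {"terminal": False, "mw": 119.1, "pi": 5.6, "volume": 116.1},
}


def property_range(rows, key):
    """Return (min, max) of one property across an iterable of table rows."""
    values = [row[key] for row in rows]
    return min(values), max(values)


def ranges_overlap(a, b):
    """True if two closed (min, max) intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


# Hypothetical grouping: terminal vs. non-terminal amino acids, per Table 3.
terminal = {aa: row for aa, row in TABLE_3.items() if row["terminal"]}
non_terminal = {aa: row for aa, row in TABLE_3.items() if not row["terminal"]}

overlaps = {}
for key in ("mw", "pi", "volume"):
    t_range = property_range(terminal.values(), key)
    n_range = property_range(non_terminal.values(), key)
    overlaps[key] = ranges_overlap(t_range, n_range)
    print(f"{key}: terminal {t_range}, non-terminal {n_range}, "
          f"overlap={overlaps[key]}")
```

With the Table 3 values, the ranges of the two groups overlap for all three properties, which is in keeping with the observation that neither pathway position nor any one physical property cleanly separates the coregulators.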

From a protein perspective, several residues within Lrp coregulator binding sites on the external edges of dimers have been used to make limited predictions of the nature of the set of coregulators (Kawashima, Aramaki et al. 2008). It would be useful to better understand the structures and functions of the pair of binding pockets identified in EcoLrp and MtbLrp, and whether each pocket has a distinct set of properties with respect to coregulator binding and its regulatory consequences. Understanding the kinetics and dynamics of these sites with the strong coregulators would also be useful.

Interestingly, one Lrp ortholog (Grp from Sulfolobus tokodaii strain 7) responds to Gln, which binds to the N-terminal side of the HTH recognition helix (Kumarevel, Nakano et al. 2008). Coregulator binding to this region has not been reported in any other Lrp ortholog, and it provides another example of the functional possibilities that need to be explored in order to make adequate predictions of regulation across species.

4.2.2 Differences within the Lrp RAM domain are associated with differences in Lrp sensitivity to co-regulators.

Lrp contains a RAM domain located in the C-terminal half of the protein (Ettema, Brinkman et al. 2002) (see Fig. 1 of chapter 2 page 26). The RAM domain contains the residues shown to interact with at least some of the amino acid coregulators (Platko and Calvo 1993; Reddy, Gokulan et al. 2008). This domain contains multiple binding sites. EcoLrp, for example, binds Leu at two sites (Chen and Calvo 2002), while Mycobacterium tuberculosis Lrp binds different amino acids at the two distinct pockets (Shrivastava and Ramachandran 2007).

One sequence difference between the RAM domains of PmiLrp and EcoLrp is close to the dimerization interface. This difference is not at previously-identified coregulator binding sites (Platko and Calvo 1993; Leonard, Smits et al. 2001; Reddy, Gokulan et al. 2008). However, effects on coregulator-responsive conformational changes are possible. In Mycobacterium tuberculosis Lrp, conformational changes associated with different amino acids depend on the coregulator binding site used (Shrivastava and Ramachandran 2007). The differences between EcoLrp and PmiLrp may therefore result from conformational constraints imposed by the T to M substitution within the RAM domain, the Q to S substitution in the linker domain (bridging the N-terminal DNA binding domain and the C-terminal effector domain), or both. The exact nature of these amino acid substitutions, and the functional differences due to them, have yet to be explored.

4.2.3 Evolutionary dynamics of Lrp. What can we learn from the N and C domains and substitutions that have appeared within these regions?

Lrp orthologs are found across a wide range of species from archaea to bacteria (Kawashima, Aramaki et al. 2008). Lrp appears to be a global regulator in Vibrionaceae (Lintner, Mishra et al. 2008) and Enterobacteriaceae (Brinkman, Ettema et al. 2003) and has a varying role as a local or global regulator in more distant species (Yokoyama, Ishijima et al. 2006). Several species contain multiple Lrp paralogs (Kawashima, Aramaki et al. 2008). E. coli, for example, contains three Lrp paralogs: Lrp, AsnC (which regulates asparagine biosynthesis), and YbaO (of unknown function) (Yokoyama, Ishijima et al. 2006), though these paralogs are phylogenetically distant; AsnC, for example, has only 25% identity when compared to Lrp (Kawashima, Aramaki et al. 2008). A phylogenetic tree of Lrp orthologs indicates that AsnC is more closely related to archaeal Lrp orthologs than EcoLrp (Kawashima, Aramaki et al. 2008). Lrp was therefore more likely a local regulator in the ancestral strain, and through duplication events some orthologs have independently become global regulators, as in Enterobacteriaceae, Vibrionaceae, and Pyrococcus OT3. The same is likely true for the response of Lrp to multiple coregulators. Bacterial Lrp orthologs vary greatly in their response to amino acids, while EcoLrp responds to a range of amino acids, as does Mycobacterium tuberculosis Lrp (Shrivastava and Ramachandran 2007). It is therefore reasonable to think that Lrp orthologs from Enterobacteriaceae can be compared functionally. However, even then, extrapolations about regulatory networks involving Lrp need to be made very carefully, taking into consideration not only the overall percent identity and HTH motif, but also the N-terminal tails and coregulator binding regions. It would not be shocking if Lrp, and even EcoLrp, held additional surprises in terms of its structure-function relationships and regulatory roles.

4.3 Summary

For the purposes of predicting regulation across sequenced bacterial species by extrapolating from a better-studied organism, several assumptions are generally made. A key assumption is that relatively high sequence conservation among orthologous transcription factors implies that they are, at least in essence, functionally equivalent. A previous test of this assumption revealed surprising differences in Lrp function despite 98% overall amino acid sequence identity and a completely conserved helix-turn-helix motif. These differences suggested that, despite decades of study, Lrp is not sufficiently understood to support regulatory extrapolations. This in turn suggests that very few transcription factors are well-enough characterized for this purpose.

In particular, the studies reported here revealed unexpected properties regarding two key functions of the global regulator Lrp: its recognition of DNA and its control by coregulatory amino acids. The results indicate that the unstructured N-terminal tails of Lrp orthologs, which account for a substantial fraction of sequence variability among these proteins, are involved in binding to target DNAs; and that Lrp (at least from Escherichia coli and Proteus mirabilis) is responsive to three times as many coregulators as had previously been reported. The data reported here suggest that some of the transcriptional differences between Lrp orthologs are due to changes near the N-terminus. The findings that methionine, isoleucine, histidine, lysine, and threonine have effects on the level of Lrp-mediated transcription suggest that Lrp is more deeply tied into E. coli metabolism than previously reported. The strong co-regulators of Lrp could be separated into two groups by their regulatory effects. Specifically, Leu and Met elicited the strongest effects (a switch from activation to repression of PlivK), while Ile, Ala, His, Lys, and Thr decreased activation but did not cause repression. These differential effects suggest that Lrp responds to coregulators in different ways that are not yet understood.

These findings reveal that there are important functions of regulatory proteins that need to be better understood to allow accurate predictions of regulation to be made from genome sequences. In particular, more information than just the principal DNA binding domain and coregulator binding domains is necessary for accurate predictions. More knowledge is needed, for example, about the various coregulator binding pockets and how their occupancy is translated into changes in interaction with DNA or with RNA polymerase. These results also suggest that, even for transcription factors with high levels of conservation, evolutionary constraints may have allowed key changes to occur in seemingly unimportant regions, limiting substantially deleterious mutations while allowing for fine tuning of regulatory systems.

5 References

Al-Shahib, A., R. Breitling, et al. (2005). "Feature selection and the class imbalance

problem in predicting protein function from sequence." Appl Bioinformatics 4(3):

195-203.

Alexander, P. A., Y. He, et al. (2007). "The design and characterization of two proteins

with 88% sequence identity but different structure and function." Proc Natl Acad

Sci U S A 104(29): 11963-8.

Ambartsoumian, G., R. D'Ari, et al. (1994). "Altered amino acid metabolism in lrp

mutants of Escherichia coli K12 and their derivatives." Microbiology 140 ( Pt 7):

1737-44.

Arya, G. and T. Schlick (2006). "Role of histone tails in chromatin folding revealed by a

mesoscopic oligonucleosome model." Proc Natl Acad Sci U S A 103(44): 16236-

41.

Babu, M. M., N. M. Luscombe, et al. (2004). "Structure and evolution of transcriptional

regulatory networks." Curr Opin Struct Biol 14(3): 283-91.

Baek, C. H., S. Wang, et al. (2009). "Leucine-responsive regulatory protein (Lrp) acts as

a virulence repressor in Salmonella enterica serovar Typhimurium." J Bacteriol

191(4): 1278-92.

Balaji, S. and L. Aravind (2007). "The two faces of short-range evolutionary dynamics of

regulatory modes in bacterial transcriptional regulatory networks." Bioessays

29(7): 625-9.

Baumbach, J., S. Rahmann, et al. (2009). "Reliable transfer of transcriptional gene

regulatory networks between taxonomically related organisms." BMC Syst Biol

3: 8.

Beloin, C., J. Jeusset, et al. (2003). "Contribution of DNA conformation and topology in

right-handed DNA wrapping by the Bacillus subtilis LrpC protein." J Biol Chem

278(7): 5333-42.

Berthiaume, F., C. Crost, et al. (2004). "Influence of L-leucine and L-alanine on Lrp

regulation of foo, coding for F1651, a Pap homologue." J Bacteriol 186(24):

8537-41.

Bhagwat, S. P., M. R. Rice, et al. (1997). "Use of an inducible regulatory protein to

identify members of a regulon: application to the regulon controlled by the

leucine-responsive regulatory protein (Lrp) in Escherichia coli." J Bacteriol

179(20): 6254-63.

Bingham, A. H., S. Ponnambalam, et al. (1986). "Mutations that reduce expression from

the P2 promoter of the Escherichia coli galactose operon." Gene 41(1): 67-74.

Borst, D. W., R. M. Blumenthal, et al. (1996). "Use of an in vivo titration method to

study a global regulator: effect of varying Lrp levels on expression of gltBDF in

Escherichia coli." J Bacteriol 178(23): 6904-12.

Boulette, M. L., P. J. Baynham, et al. (2009). "Characterization of alanine catabolism in

Pseudomonas aeruginosa and its importance for proliferation in vivo." J Bacteriol

191(20): 6329-34.

Brinkman, A. B., T. J. Ettema, et al. (2003). "The Lrp family of transcriptional

regulators." Mol Microbiol 48(2): 287-94.

Burns, K. E., W. T. Liu, et al. (2009). "Proteasomal protein degradation in Mycobacteria

is dependent upon a prokaryotic ubiquitin-like protein." J Biol Chem 284(5):

3069-75.

Calvo, J. M. and R. G. Matthews (1994). "The leucine-responsive regulatory protein, a

global regulator of metabolism in Escherichia coli." Microbiol Rev 58(3): 466-90.

Cardon, L. R. and G. D. Stormo (1992). "Expectation maximization algorithm for

identifying protein-binding sites with variable lengths from unaligned DNA

fragments." J Mol Biol 223(1): 159-70.

Carmelo, E., D. Barilla, et al. (2005). "The unstructured N-terminal tail of ParG

modulates assembly of a quaternary nucleoprotein complex in transcription

repression." J Biol Chem 280(31): 28683-91.

Chen, S. and J. M. Calvo (2002). "Leucine-induced dissociation of Escherichia coli Lrp

hexadecamers to octamers." J Mol Biol 318(4): 1031-42.

Chen, S., Z. Hao, et al. (2001). "Modulation of Lrp action in Escherichia coli by leucine:

effects on non-specific binding of Lrp to DNA." J Mol Biol 314(5): 1067-75.

Chen, S., M. H. Rosner, et al. (2001). "Leucine-regulated self-association of leucine-

responsive regulatory protein (Lrp) from Escherichia coli." J Mol Biol 312(4):

625-35.

Cho, B. K., C. L. Barrett, et al. (2008). "Genome-scale reconstruction of the Lrp

regulatory network in Escherichia coli." Proc Natl Acad Sci U S A 105(49):

19462-7.

Corcoran, C. P. and C. J. Dorman (2009). "DNA relaxation-dependent phase biasing of

the fim genetic switch in Escherichia coli depends on the interplay of H-NS, IHF

and LRP." Mol Microbiol 74(5): 1071-82.

Crost, C., A. Garrivier, et al. (2003). "Leucine-responsive regulatory protein-mediated

repression of clp (encoding CS31A) expression by L-leucine and L-alanine in

Escherichia coli." J Bacteriol 185(6): 1886-94.

Crost, C., J. Harel, et al. (2004). "Influence of environmental cues on transcriptional

regulation of foo and clp coding for F165(1) and CS31A adhesins in Escherichia

coli." Res Microbiol 155(6): 475-82.

Cui, Y., M. A. Midkiff, et al. (1996). "The leucine-responsive regulatory protein (Lrp)

from Escherichia coli. Stoichiometry and minimal requirements for binding to

DNA." J Biol Chem 271(12): 6611-7.

Cui, Y., Q. Wang, et al. (1995). "A consensus sequence for binding of Lrp to DNA." J

Bacteriol 177(17): 4872-80.

Daniels, D. W. and K. P. Bertrand (1985). "Promoter mutations affecting divergent

transcription in the Tn10 tetracycline resistance determinant." J Mol Biol 184(4):

599-610.

de los Rios, S. and J. J. Perona (2007). "Structure of the Escherichia coli leucine-

responsive regulatory protein Lrp reveals a novel octameric assembly." J Mol

Biol 366(5): 1589-602.

Devos, D. and A. Valencia (2000). "Practical limits of function prediction." Proteins

41(1): 98-107.

Doniger, S. W. and J. C. Fay (2007). "Frequent gain and loss of functional transcription

factor binding sites." PLoS Comput Biol 3(5): e99.

Dragan, A. I., Z. Li, et al. (2006). "Forces driving the binding of homeodomains to

DNA." Biochemistry 45(1): 141-51.

Edwards, J. S., R. U. Ibarra, et al. (2001). "In silico predictions of Escherichia coli

metabolic capabilities are consistent with experimental data." Nat Biotechnol

19(2): 125-30.

Enoru-Eta, J., D. Gigot, et al. (2000). "Purification and characterization of Sa-lrp, a

DNA-binding protein from the extreme thermoacidophilic archaeon Sulfolobus

acidocaldarius homologous to the bacterial global transcriptional regulator Lrp." J

Bacteriol 182(13): 3661-72.

Ernst, J., Q. K. Beg, et al. (2008). "A semi-supervised method for predicting transcription

factor-gene interactions in Escherichia coli." PLoS Comput Biol 4(3): e1000044.

Ernsting, B. R., J. W. Denninger, et al. (1993). "Regulation of the gltBDF operon of

Escherichia coli: how is a leucine-insensitive operon regulated by the leucine-

responsive regulatory protein?" J Bacteriol 175(22): 7160-9.

Espinosa, V., A. D. Gonzalez, et al. (2005). "Comparative studies of transcriptional

regulation mechanisms in a group of eight gamma-proteobacterial genomes." J

Mol Biol 354(1): 184-99.

Ettema, T. J., A. B. Brinkman, et al. (2002). "A novel ligand-binding domain involved in

regulation of amino acid metabolism in prokaryotes." J Biol Chem 277(40):

37464-8.

Ferrario, M., B. R. Ernsting, et al. (1995). "The leucine-responsive regulatory protein of

Escherichia coli negatively regulates transcription of ompC and micF and

positively regulates translation of ompF." J Bacteriol 177(1): 103-13.

Fraenkel, E., M. A. Rould, et al. (1998). "Engrailed homeodomain-DNA complex at 2.2

A resolution: a detailed view of the interface and comparison with other engrailed

structures." J Mol Biol 284(2): 351-61.

Friedberg, D., J. V. Platko, et al. (1995). "The amino acid sequence of Lrp is highly

conserved in four enteric microorganisms." J Bacteriol 177(6): 1624-6.

Gelfand, M. S. (2006). "Evolution of transcriptional regulatory networks in microbial

genomes." Curr Opin Struct Biol 16(3): 420-9.

Gottesman, S. (1984). "Bacterial regulation: global regulatory networks." Annu Rev

Genet 18: 415-41.

Haney, S. A., J. V. Platko, et al. (1992). "Lrp, a leucine-responsive protein, regulates

branched-chain amino acid transport genes in Escherichia coli." J Bacteriol

174(1): 108-15.

Harari, O., C. del Val, et al. (2009). "Identifying promoter features of co-regulated genes

with similar network motifs." BMC Bioinformatics 10 Suppl 4: S1.

Hawkins, J., C. Grant, et al. (2009). "Assessing phylogenetic motif models for predicting

transcription factor binding sites." Bioinformatics 25(12): i339-47.

Hay, N. A., D. J. Tipper, et al. (1997). "A nonswarming mutant of Proteus mirabilis lacks

the Lrp global transcriptional regulator." J Bacteriol 179(15): 4741-6.

Hecht, K., S. Zhang, et al. (1996). "D-histidine utilization in Salmonella typhimurium is

controlled by the leucine-responsive regulatory protein (Lrp)." J Bacteriol 178(2):

327-31.

Hendriksen, W. T., N. Silva, et al. (2007). "Regulation of gene expression in

Streptococcus pneumoniae by response regulator 09 is strain dependent." J

Bacteriol 189(4): 1382-9.

http://www.geneinfinity.org/sp_aaprops.html.

http://www.imb-jena.de/IMAGE_AA.html.

Isalan, M., C. Lemerle, et al. (2008). "Evolvability and hierarchy in rewired bacterial

gene networks." Nature 452(7189): 840-5.

Jafri, S., S. Chen, et al. (2002). "ilvIH operon expression in Escherichia coli requires Lrp

binding to two distinct regions of DNA." J Bacteriol 184(19): 5293-300.

Janes, B. K. and R. A. Bender (1999). "Two roles for the leucine-responsive regulatory

protein in expression of the alanine catabolic operon (dadAB) in Klebsiella

aerogenes." J Bacteriol 181(3): 1054-8.

Janga, S. C. and J. Collado-Vides (2007). "Structure and evolution of gene regulatory

networks in microbial genomes." Res Microbiol 158(10): 787-94.

Jensen, L. J., D. W. Ussery, et al. (2003). "Functionality of system components:

conservation of protein function in protein feature space." Genome Res 13(11):

2444-9.

Jothi, R., T. M. Przytycka, et al. (2007). "Discovering functional linkages and

uncharacterized cellular pathways using phylogenetic profile comparisons: a

comprehensive assessment." BMC Bioinformatics 8: 173.

Kamionka, A., J. Bogdanska-Urbaniak, et al. (2004). "Two mutations in the tetracycline

repressor change the inducer anhydrotetracycline to a corepressor." Nucleic Acids

Res 32(2): 842-7.

Karimpour-Fard, A., S. M. Leach, et al. (2008). "The topology of the bacterial co-

conserved protein network and its implications for predicting protein function."

BMC Genomics 9: 313.

Kawashima, T., H. Aramaki, et al. (2008). "Transcription regulation by feast/famine

regulatory proteins, FFRPs, in archaea and eubacteria." Biol Pharm Bull 31(2):

173-86.

Kelly, A., C. Conway, et al. (2006). "DNA supercoiling and the Lrp protein determine the

directionality of fim switch DNA inversion in Escherichia coli K-12." J Bacteriol

188(15): 5356-63.

Kimura, A., K. Matsubara, et al. (2005). "A decade of histone acetylation: marking

eukaryotic chromosomes with specific codes." J Biochem 138(6): 647-62.

Klundt, T., M. Bocola, et al. (2009). "A single amino acid substitution converts

benzophenone synthase into phenylpyrone synthase." J Biol Chem 284(45):

30957-64.

Kreimer, A., E. Borenstein, et al. (2008). "The evolution of modularity in bacterial

metabolic networks." Proc Natl Acad Sci U S A 105(19): 6976-81.

Kumarevel, T., N. Nakano, et al. (2008). "Crystal structure of glutamine receptor protein

from Sulfolobus tokodaii strain 7 in complex with its effector L-glutamine:

implications of effector binding in molecular association and DNA binding."

Nucleic Acids Res 36(14): 4808-20.

Kyrpides, N. C. and C. A. Ouzounis (1995). "The eubacterial transcriptional activator Lrp

is present in the archaeon Pyrococcus furiosus." Trends Biochem Sci 20(4): 140-

1.

Lahooti, M., P. L. Roesch, et al. (2005). "Modulation of the sensitivity of FimB

recombination to branched-chain amino acids and alanine in Escherichia coli K-

12." J Bacteriol 187(18): 6273-80.

Landgraf, J. R., J. A. Boxer, et al. (1999). "Escherichia coli Lrp (leucine-responsive

regulatory protein) does not directly regulate expression of the leu operon

promoter." J Bacteriol 181(20): 6547-51.

Landgraf, J. R., J. Wu, et al. (1996). "Effects of nutrition and growth rate on Lrp levels in

Escherichia coli." J Bacteriol 178(23): 6930-6.

Landini, P., L. I. Hajec, et al. (1996). "The leucine-responsive regulatory protein (Lrp)

acts as a specific repressor for sigma s-dependent transcription of the Escherichia

coli aidB gene." Mol Microbiol 20(5): 947-55.

Lapidot, M., O. Mizrahi-Man, et al. (2008). "Functional characterization of variations on

regulatory motifs." PLoS Genet 4(3): e1000018.

Lawrence, C. E., S. F. Altschul, et al. (1993). "Detecting subtle sequence signals: a Gibbs

sampling strategy for multiple alignment." Science 262(5131): 208-14.

Lawrence, C. E. and A. A. Reilly (1990). "An expectation maximization (EM) algorithm

for the identification and characterization of common sites in unaligned

biopolymer sequences." Proteins 7(1): 41-51.

Leonard, P. M., S. H. Smits, et al. (2001). "Crystal structure of the Lrp-like

transcriptional regulator from the archaeon Pyrococcus furiosus." Embo J 20(5):

990-7.

Lin, S. H., L. Kovac, et al. (2002). "Ability of E. coli cyclic AMP receptor protein to

differentiate cyclic nucelotides: effects of single site mutations." Biochemistry

41(9): 2946-55.

Lintner, R. E., P. K. Mishra, et al. (2008). "Limited functional conservation of a global

regulator among related bacterial genera: Lrp in Escherichia, Proteus and Vibrio."

BMC Microbiol 8: 60.

Loewenstein, Y., D. Raimondo, et al. (2009). "Protein function annotation by homology-

based inference." Genome Biol 10(2): 207.

Madan Babu, M. and S. A. Teichmann (2003). "Functional determinants of transcription

factors in Escherichia coli: protein families and binding sites." Trends Genet

19(2): 75-9.

Madan Babu, M., S. A. Teichmann, et al. (2006). "Evolutionary dynamics of prokaryotic

transcriptional regulatory networks." J Mol Biol 358(2): 614-33.

Mao, F., Z. Su, et al. (2006). "Mapping of orthologous genes in the context of biological

pathways: An application of integer programming." Proc Natl Acad Sci U S A

103(1): 129-34.

Marasco, R., M. Varcamonti, et al. (1994). "In vivo footprinting analysis of Lrp binding

to the ilvIH promoter region of Escherichia coli." J Bacteriol 176(17): 5197-201.

Margelevicius, M. and C. Venclovas (2010). "Detection of distant evolutionary relationships

between protein families using theory of sequence profile-profile comparison."

BMC Bioinformatics 11: 89.

Marshall, D. G., B. J. Sheehan, et al. (1999). "A role for the leucine-responsive regulatory

protein and integration host factor in the regulation of the Salmonella plasmid

virulence (spv) locus in Salmonella typhimurium." Mol Microbiol 34(1): 134-45.

Martinez-Antonio, A. and J. Collado-Vides (2003). "Identifying global regulators in

transcriptional regulatory networks in bacteria." Curr Opin Microbiol 6(5): 482-9.

Martinez-Nunez, M. A., E. Perez-Rueda, et al. (2010). "New insights into the regulatory

networks of paralogous genes in bacteria." Microbiology 156(Pt 1): 14-22.

Mathew, E., J. Zhi, et al. (1996). "Lrp is a direct repressor of the dad operon in

Escherichia coli." J Bacteriol 178(24): 7234-40.

McFarland, K. A. and C. J. Dorman (2008). "Autoregulated expression of the gene

coding for the leucine-responsive protein, Lrp, a global regulator in Salmonella

enterica serovar Typhimurium." Microbiology 154(Pt 7): 2008-16.

McFarland, K. A., S. Lucchini, et al. (2008). "The leucine-responsive regulatory protein,

Lrp, activates transcription of the fim operon in Salmonella enterica serovar

Typhimurium via the fimZ regulatory gene." J Bacteriol 190(2): 602-12.

Mendoza-Vargas, A., L. Olvera, et al. (2009). "Genome-wide identification of

transcription start sites, promoters and transcription factor binding sites in E.

coli." PLoS One 4(10): e7526.

Michiyo Sakuma, H. K., and Masashi Suzuki (2005). "Effects of amino acids on

assembling of an archaeal feast/famine regulatory protein, DM1 (pot1216151)."

Proc Japan Acad 81(Ser B).

Moses, A. M., D. Y. Chiang, et al. (2003). "Position specific variation in the rate of

evolution in transcription factor binding sites." BMC Evol Biol 3: 19.

Nagaraj, V. H., R. A. O'Flanagan, et al. (2008). "Better estimation of protein-DNA

interaction parameters improve prediction of functional sites." BMC Biotechnol

8: 94.

Nakanishi, N., K. Tashiro, et al. (2009). "Regulation of virulence by butyrate sensing in

enterohaemorrhagic Escherichia coli." Microbiology 155(Pt 2): 521-30.

Neidhardt, F. C. (1987). Escherichia coli and Salmonella typhimurium: cellular and

molecular biology. Washington, D.C., American Society for Microbiology.

Neidhardt, F. C. and R. Curtiss (1996). Escherichia coli and Salmonella: cellular and

molecular biology. Washington, D.C., ASM Press.

Newman, E. B., R. D'Ari, et al. (1992). "The leucine-Lrp regulon in E. coli: a global

response in search of a raison d'etre." Cell 68(4): 617-9.

Newman, E. B. and R. Lin (1995). "Leucine-responsive regulatory protein: a global

regulator of gene expression in E. coli." Annu Rev Microbiol 49: 747-75.

Ng, P. C. and S. Henikoff (2001). "Predicting deleterious amino acid substitutions."

Genome Res 11(5): 863-74.

Ng, P. C. and S. Henikoff (2003). "SIFT: Predicting amino acid changes that affect

protein function." Nucleic Acids Res 31(13): 3812-4.

Otting, G., Y. Q. Qian, et al. (1990). "Protein-DNA contacts in the structure of a

homeodomain-DNA complex determined by nuclear magnetic resonance

spectroscopy in solution." Embo J 9(10): 3085-92.

Paul, L., R. M. Blumenthal, et al. (2001). "Activation from a distance: roles of Lrp and

integration host factor in transcriptional activation of gltBDF." J Bacteriol

183(13): 3910-8.

Paul, L., P. K. Mishra, et al. (2007). "Integration of regulatory signals through

involvement of multiple global regulators: control of the Escherichia coli gltBDF

operon by Lrp, IHF, Crp, and ArgR." BMC Microbiol 7: 2.

Peeters, E., B. T. Hoa, et al. (2005). "Overexpression, purification, crystallization and

preliminary X-ray diffraction analysis of the C-terminal domain of Ss-LrpB, a

transcription regulator from Sulfolobus solfataricus." Acta Crystallogr Sect F

Struct Biol Cryst Commun 61(Pt 11): 985-8.

Peeters, E., R. Willaert, et al. (2006). "Ss-LrpB from Sulfolobus solfataricus condenses

about 100 base pairs of its own operator DNA into globular nucleoprotein

complexes." J Biol Chem 281(17): 11721-8.

Perez, J. C. and E. A. Groisman (2009). "Transcription factor function and promoter

architecture govern the evolution of bacterial regulons." Proc Natl Acad Sci U S

A 106(11): 4319-24.

Platko, J. V. and J. M. Calvo (1993). "Mutations affecting the ability of Escherichia coli

Lrp to bind DNA, activate transcription, or respond to leucine." J Bacteriol

175(4): 1110-7.

Platko, J. V., D. A. Willins, et al. (1990). "The ilvIH operon of Escherichia coli is

positively regulated." J Bacteriol 172(8): 4563-70.

Polevoda, B. and F. Sherman (2007). "Methylation of proteins involved in translation."

Mol Microbiol 65(3): 590-606.

Powell, B. C. and C. A. Hutchison, 3rd (2006). "Similarity-based gene detection: using

COGs to find evolutionarily-conserved ORFs." BMC Bioinformatics 7: 31.

Pribnow, D. (1975). "Nucleotide sequence of an RNA polymerase binding site at an early

T7 promoter." Proc Natl Acad Sci U S A 72(3): 784-8.

Price, M. N., P. S. Dehal, et al. (2007). "Orthologous Transcription Factors in Bacteria

Have Different Functions and Regulate Different Genes." PLoS Comput Biol

3(9): e175.

Pul, U., B. Lux, et al. (2008). "Effect of upstream curvature and transcription factors H-

NS and LRP on the efficiency of Escherichia coli rRNA promoters P1 and P2 - a

phasing analysis." Microbiology 154(Pt 9): 2546-58.

Pul, U., R. Wurm, et al. (2007). "The role of LRP and H-NS in transcription regulation:

involvement of synergism, allostery and macromolecular crowding." J Mol Biol

366(3): 900-15.

Raunio, R. and H. Rosenqvist (1970). "Amino acid pool of Escherichia coli during the

different phases of growth." Acta Chem Scand 24(8): 2737-44.

Raunio, R. P. and M. Leppavirta (1975). "The effect of culture age, chloramphenicol and

B6 inhibitors on intra- and extracellular keto and amino acids of Escherichia coli

B." J Gen Microbiol 87(1): 141-9.

Reddy, M. C., K. Gokulan, et al. (2008). "Crystal structure of Mycobacterium

tuberculosis LrpA, a leucine-responsive global regulator associated with

starvation response." Protein Sci 17(1): 159-70.

Ren, J., S. Sainsbury, et al. (2007). "The structure and transcriptional analysis of a global

regulator from Neisseria meningitidis." J Biol Chem 282(19): 14655-64.

Rhee, K. Y., B. S. Parekh, et al. (1996). "Leucine-responsive regulatory protein-DNA

interactions in the leader region of the ilvGMEDA operon of Escherichia coli." J

Biol Chem 271(43): 26499-507.

Rhodius, V. A. and V. K. Mutalik (2010). "Predicting strength and function for promoters of the

Escherichia coli alternative sigma factor, sigmaE." Proc Natl Acad Sci U S A

107(7): 2854-9.

Sacco, M., E. Ricca, et al. (1993). "A stereospecific alignment between the promoter and

the cis-acting sequence is required for Lrp-dependent activation of ilvIH

transcription in Escherichia coli." FEMS Microbiol Lett 107(2-3): 331-6.

Sandve, G. K. and F. Drablos (2006). "A survey of motif discovery methods in an

integrated framework." Biol Direct 1: 11.

Seshasayee, A. S., G. M. Fraser, et al. (2009). "Principles of transcriptional regulation

and evolution of the metabolic system in E. coli." Genome Res 19(1): 79-91.

Shrivastava, T. and R. Ramachandran (2007). "Mechanistic insights from the crystal

structures of a feast/famine regulatory protein from Mycobacterium tuberculosis

H37Rv." Nucleic Acids Res 35(21): 7324-35.

Shultzaberger, R. K. and T. D. Schneider (1999). "Using sequence logos and information

analysis of Lrp DNA binding sites to investigate discrepancies between natural

selection and SELEX." Nucleic Acids Res 27(3): 882-7.

Siddharthan, R., E. D. Siggia, et al. (2005). "PhyloGibbs: a Gibbs sampling motif finder

that incorporates phylogeny." PLoS Comput Biol 1(7): e67.

Stauffer, L. T. and G. V. Stauffer (1994). "Characterization of the gcv control region

from Escherichia coli." J Bacteriol 176(20): 6159-64.

Stauffer, L. T. and G. V. Stauffer (1999). "Role for the leucine-responsive regulatory

protein (Lrp) as a structural protein in regulating the Escherichia coli gcvTHP

operon." Microbiology 145(Pt 3): 569-76.

Stormo, G. D. (2000). "DNA binding sites: representation and discovery." Bioinformatics

16(1): 16-23.

Susanna, K. A., C. D. den Hengst, et al. (2006). "Expression of transcription activator

ComK of Bacillus subtilis in the heterologous host Lactococcus lactis leads to a

genome-wide repression pattern: a case study of horizontal gene transfer." Appl

Environ Microbiol 72(1): 404-11.

Tani, T. H., A. Khodursky, et al. (2002). "Adaptation to famine: a family of stationary-

phase genes revealed by microarray analysis." Proc Natl Acad Sci U S A 99(21):

13471-6.

Tempest, D. W., J. L. Meers, et al. (1970). "Influence of environment on the content and

composition of microbial free amino acid pools." J Gen Microbiol 64(2): 171-85.

Tian, W. and J. Skolnick (2003). "How well is enzyme function conserved as a function

of pairwise sequence identity?" J Mol Biol 333(4): 863-82.

Tompa, M., N. Li, et al. (2005). "Assessing computational tools for the discovery of

transcription factor binding sites." Nat Biotechnol 23(1): 137-44.

van Hijum, S. A., M. H. Medema, et al. (2009). "Mechanisms and evolution of control

logic in prokaryotic transcriptional regulation." Microbiol Mol Biol Rev 73(3):

481-509.

VanBogelen, R. A., K. D. Greis, et al. (1999). "Mapping regulatory networks in

microbial cells." Trends Microbiol 7(8): 320-8.

Wang, Q. and J. M. Calvo (1993). "Lrp, a global regulatory protein of Escherichia coli,

binds co-operatively to multiple sites and activates transcription of ilvIH." J Mol

Biol 229(2): 306-18.

Wang, Q. and J. M. Calvo (1993). "Lrp, a major regulatory protein in Escherichia coli,

bends DNA and can organize the assembly of a higher-order nucleoprotein

structure." Embo J 12(6): 2495-501.

Wang, Q., J. Wu, et al. (1994). "Regulation of the Escherichia coli lrp gene." J Bacteriol

176(7): 1831-9.

Wang, T. and G. D. Stormo (2003). "Combining phylogenetic data with co-regulated

genes to identify regulatory motifs." Bioinformatics 19(18): 2369-80.

Wasserman, W. W. and A. Sandelin (2004). "Applied bioinformatics for the

identification of regulatory elements." Nat Rev Genet 5(4): 276-87.

Wei, W. and X. D. Yu (2007). "Comparative analysis of regulatory motif discovery tools

for transcription factor binding sites." Genomics Proteomics Bioinformatics 5(2):

131-42.

Wiese, D. E., 2nd, B. R. Ernsting, et al. (1997). "A nucleoprotein activation complex

between the leucine-responsive regulatory protein and DNA upstream of the

gltBDF operon in Escherichia coli." J Mol Biol 270(2): 152-68.

Willins, D. A. and J. M. Calvo (1992). "In vitro transcription from the Escherichia coli

ilvIH promoter." J Bacteriol 174(23): 7648-55.

Willins, D. A., C. W. Ryan, et al. (1991). "Characterization of Lrp, an Escherichia coli

regulatory protein that mediates a global response to leucine." J Biol Chem

266(17): 10768-74.

Wu, H., F. Mao, et al. (2007). "Hierarchical classification of functionally equivalent

genes in prokaryotes." Nucleic Acids Res 35(7): 2125-40.

Xie, Y., W. Pan, et al. (2010). "A Bayesian approach to joint modeling of protein-DNA binding,

gene expression and sequence data." Stat Med 29(4): 489-503.

Yang, L., R. T. Lin, et al. (2002). "Structure of the Lrp-regulated serA promoter of

Escherichia coli K-12." Mol Microbiol 43(2): 323-33.

Yokoyama, K., S. A. Ishijima, et al. (2006). "Feast/famine regulatory proteins (FFRPs):

Escherichia coli Lrp, AsnC and related archaeal transcription factors." FEMS

Microbiol Rev 30(1): 89-108.

Yokoyama, K., S. A. Ishijima, et al. (2007). "Feast/famine regulation by transcription

factor FL11 for the survival of the hyperthermophilic archaeon Pyrococcus OT3."

Structure 15(12): 1542-54.

Yu, B. J., J. A. Kim, et al. (2008). "The diversity of lysine-acetylated proteins in

Escherichia coli." J Microbiol Biotechnol 18(9): 1529-36.

Zhang, J., R. Sprung, et al. (2009). "Lysine acetylation is a highly abundant and

evolutionarily conserved modification in Escherichia coli." Mol Cell Proteomics

8(2): 215-25.

Zhi, J., E. Mathew, et al. (1998). "In vitro and in vivo characterization of three major

dadAX promoters in Escherichia coli that are regulated by cyclic AMP-CRP and

Lrp." Mol Gen Genet 258(4): 442-7.

Zhi, J., E. Mathew, et al. (1999). "Lrp binds to two regions in the dadAX promoter region

of Escherichia coli to repress and activate transcription directly." Mol Microbiol

32(1): 29-40.
