Corrections

BIOPHYSICS AND COMPUTATIONAL BIOLOGY
Correction for “Crosstalk and the evolution of specificity in two-component signaling,” by Michael A. Rowland and Eric J. Deeds, which appeared in issue 15, April 15, 2014, of Proc Natl Acad Sci USA (111:5550–5555; first published March 31, 2014; 10.1073/pnas.1317178111).
The authors note that ref. 4, “Huynh TN, Stewart V (2011) Negative control in two-component signal transduction by transmitter phosphatase activity. Mol Microbiol 82(2):275–286.” should instead appear as “Ray JC, Igoshin OA (2010) Adaptable functionality of transcriptional feedback in bacterial two-component systems. PLoS Comput Biol 6(2):e1000676.”
www.pnas.org/cgi/doi/10.1073/pnas.1408294111

PLANT BIOLOGY
Correction for “Differential processing of Arabidopsis ubiquitin-like Atg8 autophagy proteins by Atg4 cysteine proteases,” by Jongchan Woo, Eunsook Park, and S. P. Dinesh-Kumar, which appeared in issue 2, January 14, 2014, of Proc Natl Acad Sci USA (111:863–868; first published December 30, 2013; 10.1073/pnas.1318207111).
The authors note that the following statement should be added to the Acknowledgments: “National Science Foundation Grant NSF-IOS-1258135 funded to Dr. Georgia Drakakaki supported Eunsook Park.”
www.pnas.org/cgi/doi/10.1073/pnas.1409947111

www.pnas.org PNAS | June 24, 2014 | vol. 111 | no. 25 | 9325

Crosstalk and the evolution of specificity in two-component signaling

Michael A. Rowland^a and Eric J. Deeds^a,b,1
^a Center for Bioinformatics and ^b Department of Molecular Biosciences, University of Kansas, Lawrence, KS 66047

Edited by Thomas J. Silhavy, Princeton University, Princeton, NJ, and approved March 7, 2014 (received for review September 11, 2013)

Two-component signaling (TCS) serves as the dominant signaling modality in bacteria. A typical pathway includes a sensor histidine kinase (HK) that phosphorylates a response regulator (RR), modulating its activity in response to an incoming signal. Most HKs are bifunctional, acting as both kinase and phosphatase for their substrates. Unlike eukaryotic signaling networks, there is very little crosstalk between bacterial TCS pathways; indeed, adding crosstalk to a pathway can have disastrous consequences for fitness. It is currently unclear exactly what feature of TCS necessitates this degree of pathway isolation. In this work we used mathematical models to show that, in the case of bifunctional HKs, adding a competing substrate to a TCS pathway will always reduce the response of that pathway to incoming signals. We found that the pressure to maintain cognate signaling is sufficient to explain the experimentally observed “kinetic preference” of HKs for their cognate RRs. These findings imply a barrier to the evolution of new HK–RR pairs, because crosstalk is unavoidable immediately after the duplication of an existing pathway. We characterized a set of “near-neutral” evolutionary trajectories that minimize the impact of crosstalk on the function of the parental pathway. These trajectories predicted that crosstalk interactions should be removed before new input/output functionalities evolve. Analysis of HK sequences in bacterial genomes provided evidence that the selective pressures on the HK–RR interface are different from those experienced by the input domain immediately after duplication. This work thus provides a unifying explanation for the evolution of specificity in TCS networks.

bacterial signaling | network evolution | signal specificity

Significance
The global architectures of signaling networks in bacteria and eukaryotes are remarkably different: crosstalk between pathways is very common in eukaryotes but is very limited in bacteria. Bacteria use two-component signaling (TCS) to transduce information, relying on a single enzyme to act as both kinase and phosphatase for targets. We used mathematical models to show that introducing “crosstalk” in TCS always decreases system performance. This indicates that the large-scale differences between eukaryotic and bacterial networks likely derive from differences in the dynamics of the fundamental motifs from which the networks themselves are constructed. We further demonstrated that the pressure to avoid crosstalk has influenced the evolution of new TCS pairs, driving rapid sequence divergence in interaction interfaces immediately postduplication.

Author contributions: M.A.R. and E.J.D. designed research; M.A.R. performed research; M.A.R. and E.J.D. analyzed data; and M.A.R. and E.J.D. wrote the paper.
The authors declare no conflict of interest.
This article is a PNAS Direct Submission.
1 To whom correspondence should be addressed. E-mail: [email protected].
This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1317178111/-/DCSupplemental.
www.pnas.org/cgi/doi/10.1073/pnas.1317178111 PNAS Early Edition | 1 of 6

Two-component signaling (TCS) represents the primary signaling modality in bacteria (1). The prototypical TCS pathway includes a membrane-bound sensor histidine kinase (HK) that autophosphorylates upon receiving an input signal. The HK then binds and transfers its phosphoryl group to a response regulator (RR), which often functions directly as a transcription factor, regulating gene expression patterns in response to the signal (1, 2). Many HKs are bifunctional, acting as both the kinase and phosphatase for their RR; the ratio of kinase to phosphatase activity, and thus the phosphorylation state of the RR, is controlled by the input (1–8).

Signaling networks in eukaryotes display extensive crosstalk, with individual kinases acting on large numbers of targets: the kinase Cdk1, for instance, has hundreds of substrates in yeast (9–11). Bacterial TCS networks show a remarkably different topology: HKs usually act on a single target (12–17). Intensive experimental study over the past 10 years has revealed the biochemical and biophysical basis for this lack of promiscuity. In general, HKs demonstrate a strong “kinetic preference” for their cognate substrates, preferentially phosphorylating them on short timescales (7, 15, 16, 18–21). A relatively small number of residues in the protein–protein interaction interface between HKs and RRs is responsible for maintaining this specificity (14–16, 20–23). Recently, Capra et al. (23) demonstrated that making just two mutations in this interface could introduce an interaction between an HK (PhoR) and a noncognate RR (NtrX) in Caulobacter crescentus. This exogenous interaction decreased phosphate starvation signaling, leading to profound decreases in growth rate and fitness in mutant cells grown under phosphate-limiting conditions. It has been shown that adding crosstalk to TCS can reduce information transfer efficiency under certain conditions (24), but it remains unclear exactly why TCS pathways are constrained from evolving crosstalk.

One of the most common motifs in eukaryotic signaling networks is a pair of enzymes (e.g., a kinase and a phosphatase) acting on a shared substrate (Fig. 1A) (25, 26). Using mathematical models, we recently showed that adding multiple competing substrates to this type of Goldbeter–Koshland (GK) loop would tend to induce an ultrasensitive, switch-like behavior in the system, which could easily have positive phenotypic consequences for the cell (26–29). In the work described here, we performed a similar analysis, extending a well-studied and validated mathematical model of bifunctional HKs (Fig. 1B) to the case of multiple substrates (3, 4). We found that, because the HK acts both as the kinase and the phosphatase in these systems, the addition of competing interactions with multiple RRs always decreases the response of the cognate RR. This is consistent with the findings of Capra et al. (23), who showed that the phenotypic effects of their crosstalk mutant were not due to the misregulation of NtrX targets, but rather a direct result of a decrease in phosphate starvation signaling.

The pressure to maintain cognate signaling suggests the existence of a barrier in the evolution of new TCS pathways. New HK–RR pairs can arise from the duplication of existing HK–RR genes, which subsequently diverge into a new pathway (21, 30). There is unavoidable crosstalk immediately postduplication, which can attenuate the response to the original signal. Using our models, we characterized a set of “near-neutral” evolutionary trajectories that minimize the impact of the new pair on the signaling of the parent pathway. All of these trajectories involved insulating the two pathways from one another before establishing new input and output functionalities. To test this prediction, we separately aligned multiple HK and input domain sequences from fully sequenced bacterial genomes. Analysis of the KA/KS ratios of the most recently diverged domains revealed that the interaction interface of the HK is under strong positive selection immediately after duplication, likely owing to the pressure to insulate interactions between the parent and duplicate pairs (7, 15, 16, 18–21). Input domains in HKs often evolve through “domain swapping,” whereby a new pathway picks up input functionality by wholesale exchange of domains with other proteins in the genome (21, 30). Analysis of KS values indicates that these swapping events generally occur only after the HK interfaces have had sufficient time to evolve interaction specificity. These findings suggest that the majority of HK–RR duplications follow the near-neutral evolutionary paths we predicted. Overall, our work indicates that the bifunctional nature of HKs has likely been a major driving force in the evolution of insulated topologies in bacterial signaling networks (21).

Results
Response to Changes in RR Concentration. To understand the impact of crosstalk on signaling, it is helpful to consider the response of the system to changes in the concentration of a single substrate (26). As mentioned above, eukaryotic signaling networks are formed largely from a motif in which one enzyme (e.g., a kinase K) modifies a substrate and a second enzyme (e.g., a phosphatase P) removes the modification (Fig. 1A). Goldbeter and Koshland first characterized the behavior of this system over 30 years ago, finding that the response of the system at steady-state followed:

\[
S^{*} = \frac{(r-1) - (K_K + rK_P) + \sqrt{\bigl((r-1) - (K_K + rK_P)\bigr)^{2} + 4(r-1)\,rK_P}}{2(r-1)}, \tag{1}
\]

where S* ≡ [S*]/[S]0 is the mole fraction of phosphorylated substrate, KK ≡ KM,K/[S]0 and KP ≡ KM,P/[S]0 are the Michaelis constants divided by the total concentration of substrate, and r ≡ kcat,K[K]0/kcat,P[P]0 is the ratio of the maximum velocities of the enzymes (25). Because protein concentrations (and thus the saturation parameters) remain constant over short timescales (31), r represents the dominant response parameter. In Fig. 1C, we considered a model of a GK loop in which an explicit input molecule binds and activates the kinase, thus modulating r (SI Appendix). At unsaturating concentrations of substrate, substrate phosphorylation increases hyperbolically (Fig. 1C). At saturating concentrations, however, the system displays a switch-like behavior known as “0th-order ultrasensitivity.” When r < 1, phosphatase activity dominates and the addition of substrate decreases S*; we call this the “phosphatase regime.” When r > 1, kinase activity dominates and the addition of substrate increases S*; this is the “kinase regime.” These two opposing trends lead to an increasingly ultrasensitive response as total substrate concentration increases (Fig. 1C) (25–29).

Fig. 1. TCS pathways vs. Goldbeter–Koshland loops. (A) Diagram of a Goldbeter–Koshland loop. An input activates a kinase K, which phosphorylates the substrate. The phosphatase P is a separate enzyme that undoes this modification. (B) Diagram of a TCS pathway. An input causes the autophosphorylation of an HK, which transfers its phosphoryl group to the RR. The unphosphorylated HK also serves as the phosphatase. (C) The fraction of phosphorylated substrate S as a function of input concentration (on a log scale) for two total concentrations of S ([S]0 = 100 nM, black, and [S]0 = 10 μM, red). The phosphatase regime and kinase regime defined in the main text are shaded pink and green, respectively. Note that the addition of substrate makes the response more switch-like (25). (D) The fraction of phosphorylated response regulator RR as a function of input concentration for two total concentrations of RR ([RR]0 = 100 nM, black, and [RR]0 = 10 μM, red). As discussed in the text, HKs are always in the phosphatase regime, so the entire plot is shaded pink. Note that increasing total substrate concentration in this case reduces the response efficiency of the RR.

A major difference between eukaryotic GK loops and bacterial TCS is the fact that the HK often acts as both kinase and phosphatase for its substrate RR (Fig. 1B). Ten years ago, Batchelor and Goulian (3) developed an approximate analytical solution of a mathematical model of TCS signaling and demonstrated that the concentration of phosphorylated RR ([RR*]) was insensitive to changes in total RR concentration ([RR]0). To study crosstalk in TCS, we constructed a model very similar to that of Batchelor and Goulian and other authors (see SI Appendix for the details of the model) (3, 4). We were able to obtain a complete analytical solution in this case and found that the steady-state response of the system follows:

\[
RR^{*} = \frac{(r\beta - \beta') - (\varepsilon' K_K + \varepsilon r K_P) + \sqrt{\bigl((r\beta - \beta') - (\varepsilon' K_K + \varepsilon r K_P)\bigr)^{2} + 4(r\beta - \beta')\,\varepsilon r K_P}}{2(r\beta - \beta')}, \tag{2}
\]

where RR* ≡ [RR*]/[RR]0 is the fraction of phosphorylated response regulator, KK and KP are as previously defined, and r ≡ kcat,K/kcat,P becomes a constant ratio between the catalytic rates of the kinase and phosphatase reactions. In this case, the dominant response parameters to changes in input are the new β and ε terms, which are dependent upon the autophosphorylation and autodephosphorylation rates of the HK and thus the input signal (see SI Appendix for derivation and details). We compared the predictions of this solution to previous experimental results by Batchelor and Goulian (3) in which the concentrations of the HK and RR were varied and found that Eq. 2 reproduces their data (SI Appendix).

As with the GK loop, we considered a case in which an explicit input molecule binds and activates the HK (Fig. 1B). Because bacterial TCS are well studied, experimental values are available for both total concentrations and kinetic parameters in this model (SI Appendix) (2, 3, 13, 32). Using those parameters for the purpose of display, we found a dramatic decrease in RR* when total RR concentration is high (Fig. 1D). Note that this is the fraction of phosphorylated RR; the total concentration of active RR molecules does not depend on [RR]0 when the RR is at saturating concentrations, as previously noted (SI Appendix) (3). Thus, although the response of the system is robust to changes in total RR in this regime, it also becomes inefficient; that is, increasing the expression of the RR does not increase the response capacity of the system.
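The two regimes of the Goldbeter–Koshland response in Eq. 1 can be checked numerically. The following is a minimal sketch, not the authors' code: the function names and the sample parameter values are illustrative assumptions. Because Eq. 1 is the physical root of (r − 1)S² − bS − rK_P = 0 with b = (r − 1) − (K_K + rK_P), the sketch also evaluates an algebraically equivalent form that stays finite at exactly r = 1.

```python
import math


def gk_fraction_eq1(r: float, K_K: float, K_P: float) -> float:
    """Literal transcription of Eq. 1; undefined (0/0) at exactly r = 1."""
    b = (r - 1.0) - (K_K + r * K_P)
    disc = b * b + 4.0 * (r - 1.0) * r * K_P
    return (b + math.sqrt(disc)) / (2.0 * (r - 1.0))


def gk_fraction(r: float, K_K: float, K_P: float) -> float:
    """Same root as Eq. 1, rewritten as 2 r K_P / (sqrt(disc) - b).

    This rearrangement (an assumption of this sketch, not a formula from
    the paper) follows from the product of the two quadratic roots and
    gives the limit K_P / (K_K + K_P) at r = 1.
    """
    b = (r - 1.0) - (K_K + r * K_P)
    disc = max(b * b + 4.0 * (r - 1.0) * r * K_P, 0.0)  # guard against rounding
    return 2.0 * r * K_P / (math.sqrt(disc) - b)


if __name__ == "__main__":
    # Smaller K_K, K_P means more total substrate (K = K_M / [S]0).
    for r in (0.5, 2.0):
        for K in (10.0, 0.01):  # hypothetical unsaturated vs. saturated values
            print(f"r={r:<4} K_K=K_P={K:<5} S*={gk_fraction(r, K, K):.4f}")
```

With these illustrative values, raising the substrate concentration (lowering K_K and K_P) pushes S* down when r < 1 and up when r > 1, which is the phosphatase-regime versus kinase-regime behavior described in the text.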

In the GK loop (Fig. 1 A and C), the separation of the kinase and phosphatase regimes depends upon the term (r – 1) in the denominator of Eq. 1, which is negative in the phosphatase regime and positive in the kinase regime. Eq. 2 contains a similar term, (rβ – β′), and this term is always negative. TCS loops are thus always in the phosphatase regime, and the general trend in Fig. 1D does not depend on specific values of kinetic parameters (SI Appendix). This behavior ultimately arises from the fact that both the kinase and phosphatase reactions produce unphosphorylated HK, which itself is a phosphatase, keeping the system in the phosphatase regime.

Competition in TCS. To consider crosstalk in TCS, we added a single competing RR to the system diagrammed in Fig. 1B. We denote the cognate RR as RR1, the noncognate interaction partner as RR2, and define the ratio of their total concentrations to be R ≡ [RR2]0/[RR1]0. To account for the impact of crosstalk on RR1 function, we also added explicit output molecules (O1 binding phosphorylated RR1 and O2 binding phosphorylated RR2) to the model. To study the responses of this system to a competing RR when the HK displays no kinetic preference for either substrate, we set the kinetic parameters of RR2 to be the same as those for RR1 (7, 15, 16, 18–21). We found that adding RR2 at the same total concentration as the cognate substrate results in a decrease in output activity of the system, which we defined as the fraction of O1 molecules bound by phosphorylated RR1 (Fig. 2A). As the total concentration of RR2 is increased, the impact on RR1 activity becomes even more significant. The situation is similar to that in Fig. 1D, but in this case the total concentration of RR1 is constant, so both the fraction and concentration of phosphorylated RR1 decrease, leading to a decrease in output activity. Our results thus indicate that the type of crosstalk introduced experimentally by Capra et al. (23) into bacterial cells would likely decrease the performance of the PhoR/PhoB signaling system, providing an explanation for the lower fitness of crosstalk mutants in phosphate-limiting conditions.

Bacterial genomes can encode 5–200 HK–RR pairs, depending on the species in question (21). To consider the impact of crosstalk in such cases, we expanded the model to include N HKs and RRs (Fig. 2B). In this model each HK can interact with each RR; in principle, every pair in this case has an independent association affinity (i.e., KD). To simplify the problem, we assigned every cognate pair in the system (HKi-RRi, e.g., HK1 interacting with RR1) the same affinity KD,C, and every noncognate pair (HKi-RRj, i ≠ j, e.g., HK1 interacting with RR2) the same affinity KD,NC (Fig. 2B). We fixed the cognate interaction to the value observed experimentally (KD,C ≈ 1 μM) (32). We then varied the ratio between noncognate and cognate KD’s (KD ratio ≡ KD,NC/KD,C) in networks of various sizes N and measured the output activity of RR1 in response to the activation of HK1 (Fig. 2C). We find that output activity is heavily attenuated for all systems when the noncognate KD’s are relatively strong. However, when the noncognate KD’s are weaker than the cognate’s by approximately three to four orders of magnitude, the activity of the single active pathway is essentially unaffected for N = 5 to N = 50.

To determine whether KD ratios in this range provide “kinetic preferencing” similar to that observed by Skerker et al. (18), we replicated their in vitro experiment using our model. This involved mixing either a cognate or noncognate RR with a fully phosphorylated HK at equal concentrations. When the HK acts on a cognate substrate, the phosphorylation of the RR peaks at 10 s, and after 1 h both the HK and RR are completely dephosphorylated. In contrast, a noncognate substrate with a KD ratio of either 10^3 or 10^4 exhibits no phosphorylation at 10 s but considerable response after 1 h (Fig. 2D), directly recapitulating the findings of Skerker et al. (18). A KD ratio of ∼10^4 also gives cognate catalytic efficiencies (kcat/KM) that are 10^4 higher than noncognate efficiencies, consistent with other experimental findings (14). Our results thus indicate that the observed kinetic preference of HKs for their cognate substrates can be explained simply by the need to maintain cognate responses (Fig. 2C) in the presence of competing substrates, rather than an explicit pressure to prevent misregulation of noncognate targets (23).

Fig. 2. Effects of competition on TCS signaling. (A) Fraction of active output as a function of input concentration in response to competition between the cognate RR1 and noncognate RR2. The ratio R ≡ [RR2]0/[RR1]0 is varied as indicated. (B) Diagram of a TCS network with N HKs and N RRs. Each HKi interacts with its cognate RRi with KD,C (black arrows) and with noncognate RRj with KD,NC (gray arrows). (C) Fraction of active output as a function of KD ratio = KD,NC/KD,C (noncognate/cognate) in TCS networks of varying size N. The input concentration was set at a concentration that produces 50% phosphorylation for an isolated HK–RR pair. (D) Concentration of HK* (dashed lines) and RR* (solid lines) as a function of time for a cognate substrate and two noncognate substrates with KD ratios of 10^3 and 10^4. These models start with 2.5 μM HK* and 2.5 μM RR, exactly replicating the in vitro experiments of Skerker et al. (18). The two time points investigated experimentally in that work are highlighted, 10 s (pink vertical line) and 1 h (orange vertical line).

Evolutionary Trajectories. New TCS pathways can arise through the duplication and divergence of existing HK–RR pairs (21, 30). The duplication event itself produces two HK–RR pairs that are identical (Fig. 3A, steps 0 to 1). This effectively increases both the total concentration of the substrate and the concentration of the HK, both of which can decrease the response of the “parent” signaling pathway (Fig. 2 and SI Appendix). Because such decreases could strongly affect the fitness of cells in which the duplication occurs (23), the unavoidable crosstalk that occurs immediately postduplication could present a barrier to the evolution of new TCS pathways. Subsequent evolutionary events, such as the evolution of one duplicate RR that cannot activate the original output genes but still competes with the original RR for phosphorylation by the HKs, could easily exacerbate this problem (Fig. 2).

We thus determined whether there were any “evolutionary trajectories” that could minimize the effect of crosstalk on the parent signaling pathway. To do this, we developed a simple model of the evolution of HK–RR pairs postduplication. In this model, we defined two types of evolutionary steps: the removal of an interaction, meaning that the kinetic parameters of the interaction are set so weak that binding of the two molecules becomes very unlikely, and the addition of an interaction, meaning that the kinetic parameters for the binding of two molecules are made stronger. There are thus six specific events that can occur in our evolutionary trajectories: (A) removing the HK2–RR1 interaction, (B) removing the HK1–RR2 interaction, (C) adding the I2–HK2 interaction, (D) removing the I1–HK2

interaction, (E) removing the RR2–O1 interaction, and (F) adding the RR2–O2 interaction. This provides a model with 64 possible states, depending upon the existence of these six interactions, and 720 possible trajectories (e.g., A,B,C,D,E,F or E,A,D,F,C,B). An example of one such trajectory is diagrammed in Fig. 3A. Each trajectory was then analyzed at each step for the activation of both outputs in the presence of either input. The neutrality of the trajectories was measured based upon a single criterion: having minimal impact on parental signaling, which we defined using the total concentration of active O1 in the presence of saturating concentrations of I1, summed across all of the “steps” in the trajectory.

We obtained 24 “near-neutral” trajectories that minimize impact on parental signaling equally well across all steps (Fig. 3B); the example trajectory in Fig. 3A is a member of that set. In all of these trajectories, the crosstalk interactions between the HKs and RRs are removed before HK2 and RR2 lose their capacity to interact with I1 and O1. This prevents inactive HK2 from acting as a phosphatase for RR1, and avoids reductions in O1 activation owing to competition between RR1 and RR2 for phosphorylation by HK1 (Fig. 2A). The red curves in Fig. 3B represent the four trajectories with the maximal total impact on parental signaling. These trajectories all exhibit the opposite order of events: in those cases, input/output functionality is always altered before the HK–RR crosstalk is removed.

Fig. 3. Evolutionary trajectories. (A) An example of an evolutionary trajectory starting with a single TCS pathway in which the input I1 activates HK1, HK1 phosphorylates RR1, and RR1 activates the output O1. The HK–RR pair is duplicated in the first step (step 0 → 1), introducing crosstalk. The new HK–RR pair is modified through a series of coarse-grained events. Each step corresponds to a discrete change in the interaction capabilities of the molecules in question (events A–F described in the text). Any alternative ordering of these events constitutes a unique evolutionary trajectory. (B) Fraction of active output O1 in response to saturating I1 at each evolutionary step for the 24 trajectories that displayed the least impact on parental signaling (black) and the 4 trajectories that displayed the largest impact on parental signaling (red). Multiple trajectories can exhibit the same trends in parental signaling; hence, there are only a few visible curves. (C) Diagram of an HK containing two domains: an input domain I (PAS domain) and the kinase domain K.

Evidence for Near-Neutral Trajectories. HK proteins generally contain a distinct “kinase” (K) domain, which interacts with the RR and is involved in the phosphotransfer reaction, as well as an “input” domain (I) that recognizes external signals and modulates HK function (Figs. 1B and 3C) (1, 16, 21, 30, 33). Our model predicts a pressure to eliminate crosstalk relatively early in the evolutionary trajectory of a given sequence pair, before changes occur in the input domain. In evolutionary terms, this pressure would manifest itself as a set of amino acid changes in the HK–RR interaction interfaces of the duplicate pairs to insulate the two pathways from one another (14, 16, 20, 22, 23).

To test these predictions, we obtained the amino acid and DNA sequences of HKs from bacterial genomes in the Kyoto Encyclopedia of Genes and Genomes (KEGG) database (34). Using available Pfam annotations (35), we restricted our analysis to sequences that contain a PAS domain, because this is the most common and well-studied input domain for HK proteins (30, 33). We retained only those genomes where we could identify five or more such sequences, resulting in a total of 352 bacterial genomes. To identify putative recent duplication events, we performed multiple sequence alignments of the K domains from each genome separately and focused only on those pairs that were nearest neighbors in the phylogenetic trees obtained from those alignments.

Although duplication and divergence are common in the evolution of HKs, new pathways can also enter a lineage through horizontal gene transfer (HGT) (21, 30). To remove HGT pairs from our analysis, we followed the approach of Alm et al. (30) exactly, constructing a “phylogenetic profile” for each HK gene in our dataset based on its presence or absence across the phylogenetic tree of our bacterial genomes. Of the 2,243 closely related nearest-neighbor pairs we identified, 342 of them (∼15%) represented recent HGT events. We thus obtained a total of 1,901 pairs that represented bona fide duplication events, at least according to this analysis. Further details regarding the sequences we obtained and the HGT analysis can be found in Materials and Methods and SI Appendix.

We used these data to calculate the rate of synonymous substitutions (KS) and the rate of nonsynonymous substitutions (KA) for our sequences (36). The ratio of these two parameters, KA/KS, provides an estimate of the relative strength of selection on the coding sequence of the protein. A value of KA/KS > 1 indicates positive selection for changes at the protein level, whereas a KA/KS < 1 indicates “purifying” selection to maintain the sequence of the protein unchanged (36). We included the sequence of Spo0B in our K domain alignments, using the available cocrystal structure between Spo0B and Spo0F (its RR) to determine which residues in each HK sequence were likely to participate in this interface (16, 37). Using the alignment for each non-HGT pair, we calculated KA and KS values for the interfacial residues of the K domain and the noninterfacial residues of the K domain.

In Fig. 4A, we plot the value of KA/KS as a function of KS (a rough estimator of time since duplication) for non-HGT sequence pairs based on all residues in either the K domain interface or the noninterface region. We found that the strength of selection on these subsets of residues was quite different: for one, the average KA/KS in the interfacial residues is higher overall (SI Appendix, Fig. S9, P = 4.73 × 10^−9). We also found a strong power-law dependence of KA/KS on KS for the interface, whereas noninterface residues showed a statistically distinct and much weaker dependence (P < 2 × 10^−16, SI Appendix, section 5.3).

To test whether the size of the subset of residues considered might influence the calculation of KA and KS, we generated random subsets of noninterface residues with the same total number of residues as the interface. We also used the Spo0B structure to generate similar random subsets of noninterface surface residues, to control for the fact that surface residues (such as those on the HK/RR interface) might experience relaxed

evolutionary pressures. In both cases, the trends were the same as those in Fig. 4A (SI Appendix, Figs. S6 and S7). Using a second available HK/RR structure to determine the interface residues [HK853/RR468 from Thermotoga maritima (7)] also gives similar results (SI Appendix, Fig. S8). Finally, the raw substitution rates (i.e., the total number of amino acid changes between two sequences) show much higher values for interface positions compared with other positions in the sequence, regardless of whether these positions are on the surface or not (SI Appendix, Fig. S10). The difference in substitution rates can be readily seen

Fig. 4. Sequence analysis. (A) The KA/KS values as a function of KS for non-HGT HK sequence pairs for both the K domain interface residues (green circles) and noninterface residues (orange circles). The black and red lines correspond to power-law regressions of the interface and noninterface data, respectively. (Inset) The same data and fits, plotted on a log-log scale. (B) A plot similar to that in A, but for the interface and noninterface residues of RR proteins. (C) The KS value for each HK domain pair is plotted against the KS value for the corresponding PAS domains. Of the 1,300 points in this plot, 951 are above the diagonal (the black line, P < 2 × 10^−16). (D) A plot of the distribution of substitution rates for all of the K domains in C (red) and just those K domains from very recent duplications (KS < 1) where the PAS domain is younger (the blue triangle in C corresponds to the points used to make the blue line). The number of amino acid substitutions in the interface positions (solid lines) is compared with the number obtained from random subsets of noninterface positions of the same size (dashed lines).

duplication of the HK gene. There are two possible scenarios in this case: in scenario A, an HK gene is duplicated and subsequently picks up a “new” PAS domain from some other protein in the genome. In scenario B, a protein with a PAS domain is duplicated, and later picks up a new K domain through domain swapping. Our model predicts that scenario A should be more common in HK evolution, because input changes should occur relatively later in the evolutionary trajectory (Fig. 3).

To test this prediction, we took each of our domain-swapped non-HGT HK pairs and compared the KS of the K domain from that pair with the KS of the closest PAS domain. Of the 1,300 cases for which we could obtain the relevant KS values, 951 of them had a larger KS value for the K domain than for the PAS domain (Fig. 4C, P < 2 × 10^−16), as our model predicted. This statistical bias is present if we consider only those cases where the PAS domain that is swapped originates only from other HK proteins, or only from non-HK proteins (P < 2 × 10^−16 in both cases, SI Appendix, Fig. S12). Even in cases of very recent duplications where scenario B seems more likely (i.e., the blue triangle in Fig. 4C, with KS for both domains < 1), we see significant pressure to mutate interface residues. In particular, the average number of substitutions in the interface for those sequence pairs is approximately eight, similar to that observed for all pairs in the dataset (Fig. 4D). The average substitution rate in this case is much larger than that observed for random noninterface subsets of the same size (P < 10^−5, permutation test). This indicates a near-universal pressure to diversify the interface residues of newly evolved HK/RR pairs.

Discussion
The results described above indicate that the vast global differences in topology between eukaryotic and bacterial signaling networks are likely the result of differences in the atomic “motifs” from which the networks themselves are constructed. In particular, the kinase–phosphatase pairs that are typically found in eukaryotic networks become more ultrasensitive as they become more saturated, a behavior that allows these loops to couple the responses of multiple downstream targets in interesting and potentially adaptive ways (Fig. 1C) (25, 26). In contrast, the two-component architecture of bacterial signaling motifs makes them inherently less efficient as they become saturated, ultimately driving down total system response as competitive substrates are added (Fig. 1D). This behavior likely underlies the fitness cost of crosstalk observed in vivo, resulting in a natural evolutionary pressure to maintain isolated cognate signaling pathways (23).
Indeed, our models indicate that a re- in an example alignment for a recently diverged pair of K do- quirement to maintain cognate responses is sufficient to obtain mains from the bacterium Halococcus turkmenicus (SI Appendix, the degree of kinetic preference that HKs show for their sub- Fig. S11). Using a similar analysis for RR proteins, we found strates (Fig. 2) (18). essentially the same trends when comparing interface to non- Although our models indicate that crosstalk in TCS generally interface residues for those proteins (Figs. 4B and SI Appendix, decreases response, this does not imply that such systems abso- Figs. S8 and S9). Overall, these findings indicate that the inter- lutely cannot tolerate the presence of more than one interaction face residues of both the HK and RR proteins tend to diversify partner. Indeed, there are known examples of HKs that act ef- after duplication to prevent crosstalk, consistent with our predic- ficiently on more than one RR (e.g., the bacterial chemotaxis tions (Fig. 3). pathway) (38). Although introducing crosstalk does decrease We also considered the evolution of input functionality in HK response, the system can compensate by increasing the total proteins. In our alignments, we found only 67 cases out of the expression level of that particular RR to maintain a particular 2,243 nearest-neighbors in the K domain alignment where the concentration of active RR* (SI Appendix) (3). Of course, such PAS domains for those two proteins were also nearest neighbors an increase comes with its own fitness cost: the bacterium must in the PAS domain alignment for that genome. In other words, invest more energy in protein synthesis to obtain the same level we found that PAS domains tend to display extensive domain of signaling performance. 
In some cases, the phenotypic benefits swapping, where new input functionality evolves not through of crosstalk may outweigh this cost, resulting in HKs with more divergence of the ancestral input domain, but rather the re- than one target. As the number of targets increases, however, the placement of the original function through wholesale introduction cost of maintaining the response becomes larger (Fig. 2). This is of the input domain from another, unrelated protein. This is con- likely the reason that even the few bifunctional HKs that have sistent with earlier findings on PAS domain evolution in HKs (30). more than one target rarely act on more than two or three Because the evolution of input functionality is dominated by downstream RRs (34, 38). Monofunctional HKs, however, should domain swapping, we could not perform a robust KA/KS analysis act more like kinases in GK loops (Fig. 1), and so proteins like the similar to that in Fig. 4 A and B. Instead, we focused on under- chemotaxis kinases may experience a considerably relaxed con- standing the timing of the domain-swapping event relative to the straint against evolving crosstalk.

The evolution of new signaling pathways in bacteria often involves the duplication and subsequent divergence of an existing TCS pair (Fig. 3A) (21). Our findings indicate that the impact of crosstalk on HK signaling likely shapes the evolutionary landscapes of these duplicate pairs. Specifically, the fitness costs of crosstalk generate significant evolutionary pressures that result in rapid diversification of the HK–RR interface, insulating the protein interactions and allowing the subsequent evolution of new input and output functionalities (Figs. 3 and 4). It is currently possible to engineer both HK and PAS domain sequences to introduce a wide variety of HK–RR and HK–I interactions. It would thus be straightforward to create a number of the intermediate evolutionary “states” considered by our model (e.g., Fig. 3A) and assess their relative fitness costs in vivo. One such case has already been investigated experimentally (23); the investigation of systems with related topologies would provide detailed tests of our predictions. The combination of these experimental efforts with more detailed phylogenetic analyses of recent duplication events (30) would ultimately result in a definitive characterization of the evolutionary trajectories of new TCS pathways.

The reliance of bacterial signaling systems on only two components results in signaling dynamics that cannot easily admit competitive interactions. Our work indicates that this inherent feature of TCS dynamics underlies a diverse array of observations, including kinetic preferencing (18), the evolution of protein interaction interfaces (Fig. 4) (14–16, 18–23), and the deleterious effects of crosstalk in vivo (23). The constraint against crosstalk may limit the types of information processing available to bacterial signaling networks, with more involved computations occurring at the level of the complex gene regulatory networks downstream of RRs (39, 40).

Materials and Methods

Our model of TCS dynamics, and the corresponding systems of ordinary differential equations (ODEs), is described in SI Appendix, section 1. We used the CVODE package from SUNDIALS (41) to numerically integrate the system of ODEs. Nucleic acid and amino acid sequences of HKs were obtained from the KEGG database (34), and domain boundaries were obtained from Pfam annotations (35). The amino acid sequences of the domains within each genome were aligned using CLUSTALW (42). The nucleic acid sequences were then mapped to the amino acid multiple sequence alignments. KA and KS values were obtained using the seqinR library in the R statistical computing platform (43). Further details on our simulations and analyses can be found in SI Appendix.

ACKNOWLEDGMENTS. We thank Justin Blumenstiel, Tom Kolokotrones, Walter Fontana, and Michael Laub for many helpful discussions regarding this work.
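The text does not name the test behind the P value quoted for Fig. 4C. Assuming it is an exact two-sided binomial (sign) test against a 50:50 null, the reported counts (951 of 1,300 pairs with a larger KS for the K domain) can be checked with the short sketch below. The function name `sign_test_p_value` is illustrative only and is not part of the authors' pipeline, which used R.

```python
from fractions import Fraction
from math import comb


def sign_test_p_value(successes: int, trials: int) -> float:
    """Exact two-sided binomial p-value under the null hypothesis p = 0.5.

    Assumption: this is a sign test, one plausible reading of the Fig. 4C
    statistic; the source does not state which test was used.
    """
    if not 0 <= successes <= trials:
        raise ValueError("successes must lie between 0 and trials")
    # With p = 0.5 the distribution is symmetric, so the two-sided p-value
    # is twice the more extreme tail, capped at 1.
    k = max(successes, trials - successes)
    upper_tail = sum(comb(trials, i) for i in range(k, trials + 1))
    # Exact rational arithmetic avoids underflow for very small tails.
    p = Fraction(2 * upper_tail, 2 ** trials)
    return float(min(p, Fraction(1)))


if __name__ == "__main__":
    above, total = 951, 1300  # counts reported for Fig. 4C
    print(f"fraction above diagonal: {above / total:.3f}")
    print(f"two-sided sign-test p-value: {sign_test_p_value(above, total):.3g}")
```

Under this assumption the p-value should fall well below the reported bound of 2 × 10^−16, which is close to the smallest value that R prints by default for such tests.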

1. Stock AM, Robinson VL, Goudreau PN (2000) Two-component signal transduction. Annu Rev Biochem 69:183–215.
2. Qin L, Yoshida T, Inouye M (2001) The critical role of DNA in the equilibrium between OmpR and phosphorylated OmpR mediated by EnvZ in Escherichia coli. Proc Natl Acad Sci USA 98(3):908–913.
3. Batchelor E, Goulian M (2003) Robustness and the cycle of phosphorylation and dephosphorylation in a two-component regulatory system. Proc Natl Acad Sci USA 100(2):691–696.
4. Huynh TN, Stewart V (2011) Negative control in two-component signal transduction by transmitter phosphatase activity. Mol Microbiol 82(2):275–286.
5. Lois AF, Weinstein M, Ditta GS, Helinski DR (1993) Autophosphorylation and phosphatase activities of the oxygen-sensing protein FixL of Rhizobium meliloti are coordinately regulated by oxygen. J Biol Chem 268(6):4370–4375.
6. Keener J, Kustu S (1988) Protein kinase and phosphoprotein phosphatase activities of nitrogen regulatory proteins NTRB and NTRC of enteric bacteria: Roles of the conserved amino-terminal domain of NTRC. Proc Natl Acad Sci USA 85(14):4976–4980.
7. Casino P, Rubio V, Marina A (2009) Structural insight into partner specificity and phosphoryl transfer in two-component signal transduction. Cell 139(2):325–336.
8. Mizuno T (1997) Compilation of all genes encoding two-component phosphotransfer signal transducers in the genome of Escherichia coli. DNA Res 4(2):161–168.
9. Ptacek J, et al. (2005) Global analysis of protein phosphorylation in yeast. Nature 438(7068):679–684.
10. Ubersax JA, et al. (2003) Targets of the cyclin-dependent kinase Cdk1. Nature 425(6960):859–864.
11. Hill SM (1998) Receptor crosstalk: Communication through cell signaling pathways. Anat Rec 253(2):42–48.
12. Ninfa AJ, et al. (1988) Crosstalk between bacterial chemotaxis signal transduction proteins and regulators of transcription of the Ntr regulon: Evidence that nitrogen assimilation and chemotaxis are controlled by a common phosphotransfer mechanism. Proc Natl Acad Sci USA 85(15):5492–5496.
13. Fisher SL, Kim SK, Wanner BL, Walsh CT (1996) Kinetic comparison of the specificity of the vancomycin resistance VanS for two response regulators, VanR and PhoB. Biochemistry 35(15):4732–4740.
14. Skerker JM, et al. (2008) Rewiring the specificity of two-component signal transduction systems. Cell 133(6):1043–1054.
15. Siryaporn A, Goulian M (2008) Cross-talk suppression between the CpxA-CpxR and EnvZ-OmpR two-component systems in E. coli. Mol Microbiol 70(2):494–506.
16. Laub MT, Goulian M (2007) Specificity in two-component signal transduction pathways. Annu Rev Genet 41:121–145.
17. Fisher SL, Jiang W, Wanner BL, Walsh CT (1995) Cross-talk between the histidine protein kinase VanS and the response regulator PhoB. Characterization and identification of a VanS domain that inhibits activation of PhoB. J Biol Chem 270(39):23143–23149.
18. Skerker JM, Prasol MS, Perchuk BS, Biondi EG, Laub MT (2005) Two-component signal transduction pathways regulating growth and cell cycle progression in a bacterium: A system-level analysis. PLoS Biol 3(10):e334.
19. Groban ES, Clarke EJ, Salis HM, Miller SM, Voigt CA (2009) Kinetic buffering of cross talk between bacterial two-component sensors. J Mol Biol 390(3):380–393.
20. Siryaporn A, Perchuk BS, Laub MT, Goulian M (2010) Evolving a robust signal transduction pathway from weak cross-talk. Mol Syst Biol 6:452.
21. Capra EJ, Laub MT (2012) Evolution of two-component signal transduction systems. Annu Rev Microbiol 66:325–347.
22. Capra EJ, et al. (2010) Systematic dissection and trajectory-scanning mutagenesis of the molecular interface that ensures specificity of two-component signaling pathways. PLoS Genet 6(11):e1001220.
23. Capra EJ, Perchuk BS, Skerker JM, Laub MT (2012) Adaptive mutations that prevent crosstalk enable the expansion of paralogous signaling protein families. Cell 150(1):222–232.
24. Lyons SM, Prasad A (2012) Cross-talk and information transfer in mammalian and bacterial signaling. PLoS ONE 7(4):e34488.
25. Goldbeter A, Koshland DE, Jr. (1981) An amplified sensitivity arising from covalent modification in biological systems. Proc Natl Acad Sci USA 78(11):6840–6844.
26. Rowland MA, Fontana W, Deeds EJ (2012) Crosstalk and competition in signaling networks. Biophys J 103(11):2389–2398.
27. Huang CY, Ferrell JE, Jr. (1996) Ultrasensitivity in the mitogen-activated protein kinase cascade. Proc Natl Acad Sci USA 93(19):10078–10083.
28. Kim SY, Ferrell JE, Jr. (2007) Substrate competition as a source of ultrasensitivity in the inactivation of Wee1. Cell 128(6):1133–1145.
29. Bagowski CP, Besser J, Frey CR, Ferrell JE, Jr. (2003) The JNK cascade as a biochemical switch in mammalian cells: Ultrasensitive and all-or-none responses. Curr Biol 13(4):315–320.
30. Alm E, Huang K, Arkin A (2006) The evolution of two-component systems in bacteria reveals different strategies for niche adaptation. PLoS Comput Biol 2(11):e143.
31. Belle A, Tanay A, Bitincka L, Shamir R, O’Shea EK (2006) Quantification of protein half-lives in the budding yeast proteome. Proc Natl Acad Sci USA 103(35):13004–13009.
32. Cai SJ, Inouye M (2002) EnvZ-OmpR interaction and osmoregulation in Escherichia coli. J Biol Chem 277(27):24155–24161.
33. Lee J, et al. (2008) Changes at the KinA PAS-A dimerization interface influence histidine kinase function. Biochemistry 47(13):4051–4064.
34. Kanehisa M, Goto S, Kawashima S, Okuno Y, Hattori M (2004) The KEGG resource for deciphering the genome. Nucleic Acids Res 32(Database issue):D277–D280.
35. Punta M, et al. (2012) The Pfam protein families database. Nucleic Acids Res 40(Database issue):D290–D301.
36. Hurst LD (2002) The Ka/Ks ratio: Diagnosing the form of sequence evolution. Trends Genet 18(9):486.
37. Zapf J, Sen U, Madhusudan, Hoch JA, Varughese KI (2000) A transient interaction between two phosphorelay proteins trapped in a crystal lattice reveals the mechanism of molecular recognition and phosphotransfer in signal transduction. Structure 8(8):851–862.
38. Porter SL, Wadhams GH, Armitage JP (2011) Signal processing in complex chemotaxis pathways. Nat Rev Microbiol 9(3):153–165.
39. Milo R, et al. (2002) Network motifs: Simple building blocks of complex networks. Science 298(5594):824–827.
40. Shen-Orr SS, Milo R, Mangan S, Alon U (2002) Network motifs in the transcriptional regulation network of Escherichia coli. Nat Genet 31(1):64–68.
41. Hindmarsh AC, et al. (2005) SUNDIALS: Suite of nonlinear and differential/algebraic equation solvers. ACM T Math Software 31(3):363–396.
42. Larkin MA, et al. (2007) Clustal W and Clustal X version 2.0. Bioinformatics 23(21):2947–2948.
43. Charif D, Lobry J (2007) SeqinR 1.0-2: A contributed package to the R project for statistical computing devoted to biological sequences retrieval and analysis. Structural Approaches to Sequences Evolutions: Molecules, Networks, Populations, eds Bastolla U, Porto M, Roman E, Vendruscolo M (Springer, New York), pp 207–232.
