
Presence/absence polymorphism for alternative pathogenicity islands in Pseudomonas viridiflava, a pathogen of Arabidopsis

Hitoshi Araki†‡, Dacheng Tian§, Erica M. Goss†, Katrin Jakob†, Solveig S. Halldorsdottir†, Martin Kreitman†, and Joy Bergelson†¶

†Department of Ecology and , , Chicago, IL 60637; and §Department of Biology, Nanjing University, Nanjing 210093, Republic of China

GENETICS

Communicated by Tomoko Ohta, National Institute of Genetics, Mishima, Japan, March 1, 2006 (received for review January 25, 2006)

The contribution of arms race dynamics to plant–pathogen coevolution has been called into question by the presence of balanced polymorphisms in resistance genes of Arabidopsis thaliana, but less is known about the pathogen side of the interaction. Here we investigate structural polymorphism in pathogenicity islands (PAIs) in Pseudomonas viridiflava, a prevalent bacterial pathogen of A. thaliana. PAIs encode the type III secretion system along with its effectors and are essential for pathogen recognition in plants. P. viridiflava harbors two structurally distinct and highly diverged PAI paralogs (T- and S-PAI) that are integrated in different chromosome locations in the P. viridiflava genome. Both PAIs are segregating as presence/absence polymorphisms such that only one PAI ([T-PAI, ∇S-PAI] and [∇T-PAI, S-PAI]) is present in any individual cell. A worldwide population survey identified no isolate with neither or both PAIs. T-PAI and S-PAI genotypes exhibit virulence differences and a host-specificity tradeoff. Orthologs of each PAI can be found in conserved syntenic locations in other Pseudomonas species, indicating vertical phylogenetic transmission in this genus. Molecular evolutionary analysis of PAI sequences also argues against "recent" horizontal transfer. Spikes in nucleotide divergence in flanking regions of PAI and ∇-PAI alleles suggest that the dual PAI polymorphism has been maintained in this species under some form of balancing selection. Virulence differences and host specificities are hypothesized to be responsible for the maintenance of the dual PAI system in this bacterial pathogen.

balancing selection | plant–pathogen interaction | arms race | horizontal gene transfer

Conflict of interest statement: No conflicts declared.
Abbreviations: HGT, horizontal gene transfer; HR, hypersensitive response; PAI, pathogenicity island; TTSS, Type III secretion system.
Data deposition: The sequences reported in this paper have been deposited in the GenBank database (accession nos. AY597274–AY597283, AY859111, AY859112, AY859115, AY859128, AY859131, AY859183, AY859184, AY859351, AY859355, AY859358, DQ158500–158855, DQ168848, and DQ220702).
‡Present address: Department of , Oregon State University, Corvallis, OR 97331.
¶To whom correspondence should be addressed. E-mail: [email protected].
© 2006 by The National Academy of Sciences of the USA

Arms race dynamics were once thought to dominate plant–pathogen coevolution through the rapid substitution of adaptive mutations on both sides of the interaction (1). However, the presence of balanced polymorphisms in some resistance (R) genes in the plant host Arabidopsis thaliana suggests a different type of coevolutionary dynamic (2–5). For example, diversifying selection, rather than directional selection, is observed in the extremely polymorphic A. thaliana R-gene RPP13 and the downy mildew avirulence gene, ATR13, which triggers RPP13-mediated resistance (6). In bacteria, effector proteins such as ATR13 are transported into host cells via the Type III secretion system (TTSS), which is found in both animal and plant pathogens (7). Effectors delivered by the TTSS are essential for causing disease in susceptible hosts and for eliciting defense responses in resistant hosts. In a variety of Gram-negative bacterial pathogens, the genes encoding the TTSS and its effectors comprise a physical gene cluster called a pathogenicity island (PAI) (8).

Pathogenicity-related genes, including entire PAIs, are often introduced into bacterial species by horizontal (or lateral) gene transfer (HGT) (8–10), which is believed to be important because it allows the recipient pathogen to immediately use already-evolved pathogenicity strategies and, thereby, accelerate the pace of pathogen evolution (11). In fact, several human pathogens are defined and differentiated from close relatives by horizontally acquired virulence factors (12). However, a survey of effectors in Pseudomonas syringae finds effectors that have been acquired recently and others that have been transmitted predominantly by descent, indicating that pathogenicity may evolve in both genomic contexts (13).

In this study, we investigated PAIs in P. viridiflava, which is a prevalent bacterial pathogen of wild A. thaliana populations (14). P. viridiflava is in the P. syringae group (15). Although P. syringae is intensively studied as a bacterial plant pathogen (13, 16–18), little is known about the genetic basis of pathogenicity in P. viridiflava. Here we report a previously undescribed arrangement of PAIs and a long-term presence of an unusual dual PAI polymorphism in this bacterial pathogen. This polymorphism is not caused by recent HGT but rather is analogous to polymorphism in host defense genes: evolutionarily long-lived polymorphism for two paralogous PAIs in a single pathogen species.

Results

We first examined the structure and DNA sequence of PAIs in five P. viridiflava strains (LP23.1a, PNA3.3a, ME3.1b, RMX23.1a, and RMX3.1b) that were collected from naturally occurring A. thaliana plants in populations in the Midwest United States, and their entire PAIs and flanking sequences were isolated. LP23.1a and PNA3.3a possess a PAI similar in structure to that found in the congener P. syringae (16). It has a tripartite mosaic structure composed of a gene cluster encoding the Type III protein-secretion apparatus (hrp/hrc gene cluster), the 5′ effector loci (exchangeable effector loci or EEL), and the 3′ effector loci (conserved effector loci or CEL). We designate this structural form as the T- (tripartite) PAI (Fig. 1A; see also Table 1, which is published as supporting information on the PNAS web site). The other three isolates (ME3.1b, RMX23.1a, and RMX3.1b) possess a PAI with a single component hrp/hrc cluster (Fig. 1B; see also Table 2, which is published as supporting information on the PNAS web site), a structure previously not reported in any pathogenic bacteria. We designate this type as the S- (single) PAI.

The T-PAIs of LP23.1a and PNA3.3a are 47 kb and 43 kb, respectively, differing primarily by an insertion/deletion (indel)

www.pnas.org/cgi/doi/10.1073/pnas.0601431103 PNAS | April 11, 2006 | vol. 103 | no. 15 | 5887–5892

Fig. 1. Two PAIs in P. viridiflava. Gene compositions of Region 1 (A) and Region 2 (B), locations of the T- and S-PAIs, respectively, in P. viridiflava. Boxes represent ORFs, and numbers above or below boxes are ORF numbers corresponding to Tables 1 and 2. The clade (19) of each isolate and which PAI it contains is indicated.

difference in the EEL region (Fig. 1A). Gene compositions in the hrp/hrc gene cluster and in the CEL region are otherwise identical to one another and are nearly identical to the T-PAI of P. syringae pv. tomato (Pto) DC3000 (16). The gene compositions in the EEL region are nearly identical in the two P. viridiflava isolates but different from that of Pto DC3000. This region is known to be hypervariable among P. syringae pathovars (16–17). The T-PAI contains three known avirulence (avr) gene homologs, hopPsyA, avrE, and avrF (18, 20, 21) and two effector gene candidates (hopPtoA1 and hopPtoM).

The S-PAIs in ME3.1b, RMX23.1a, and RMX3.1b are ≈30 kb in length and contain a 10-kb-long insertion in the middle of the hrp/hrc cluster relative to the T-PAI (Fig. 1B). The S-PAI contains only two avr gene homologs (avrE and avrF), and these homologs are located in the 10-kb insertion. The T- and S-PAIs share 25 gene homologs and many operon structures (Tables 1 and 2), and yet are distinct in gene composition and order, especially for effector gene loci. The sequences of these gene homologs are also highly diverged between the two PAIs. Nucleotide divergence (22, 23) between the 25 shared genes averages 0.701 across all sites and 1.44 for synonymous sites.

In a previous study (23) we investigated nucleotide polymorphism at five genomic regions in a worldwide collection of P. viridiflava isolates. Both T-PAI and S-PAI bearing isolates are present in this sample. Total and synonymous site divergence between T- and S-PAI genotypes at these five loci (0.042 and 0.182, respectively) is 16.9 and 7.7 times lower, respectively, than the corresponding divergence between genes located in the S- and T-PAI. This observation suggests a paralogous (rather than an allelic) relationship of the two PAIs. Analysis of genomic sequences flanking their boundaries confirms that they are indeed located in different genomic regions (T-PAI in Region 1 and S-PAI in Region 2; Fig. 1). Given this lack of allelism, we were surprised to discover that isolates with one type of PAI did not contain the other: the presence of one PAI was perfectly associated with the absence of the other. In a larger sample of 286 P. viridiflava isolates collected from worldwide populations of A. thaliana, four of seven populations are polymorphic for the PAI type (Table 3, which is published as supporting information on the PNAS web site). In total, 10% contain a T-PAI and 90% contain an S-PAI. Each isolate harbors one, and only one, PAI. Isolates containing a T-PAI in Region 1 have a 200-bp sequence in Region 2, instead of an S-PAI ([T-PAI, ∇S-PAI]). Similarly, isolates containing an S-PAI in Region 2 have a 3-kb-long sequence in Region 1 instead of a T-PAI ([∇T-PAI, S-PAI]). This sequence is similar to part of the EEL sequence in the T-PAI but lacks any known effector gene homologs. Note that we defined the 5′ end of the T-PAI as being immediately downstream of tgt, queA, and tRNALeu following a definition of PAIs in P. syringae (16). This genomic region does not necessarily represent a unit that shares the same evolutionary history in P. viridiflava.

We have reported previously the presence of two diverged (and nonrecombining) clades in P. viridiflava, which likely represent two distinct subspecies (19). However, these clades do not correspond to the PAI haplotypes: in a survey of 96 isolates, the two PAI haplotypes coexist within clade A (10 AT and 57 AS), whereas clade B is fixed for [∇T-PAI, S-PAI] (29 BS). Recombination between AT and AS isolates is clearly evident at other loci spread around the genome (19), so divergent clades cannot explain this disassociation between the S- and T-PAIs.

Other possible explanations for the absence of recombinant PAI genotypes include tight physical linkage and/or natural selection. According to the similarities of the flanking regions (Tables 1 and 2), Region 1 and Region 2 are 2.1–2.4 Mb apart in three divergent genomes of P. syringae (6.1–6.3 Mb circular genomes), suggesting that these regions may not be physically tightly linked in P. viridiflava (24–26). Strong selection on this association implies that genotypes with both or no PAI are at a selective disadvantage relative to the genotypes with only one PAI. The biological importance of PAIs in pathogenic bacteria is understood; the disadvantage of carrying two PAIs, on the other hand, awaits experimental analysis.

The fact that the two PAI haplotypes reside in a single species means that both of the PAI indels are true polymorphisms. Two issues related to the possible adaptive significance of these polymorphisms can be addressed: (i) the age of origination of each PAI in the lineage leading to (or represented by) P. viridiflava and (ii) the age of each PAI indel polymorphism. For clarity, we analyze data for the T-PAI first and in greater detail, and then we show that the same data patterns and arguments hold for the S-PAI.

Phylogenetic analysis of the PAIs revealed that the time of the most recent common ancestor of T-PAI and S-PAI predates the split of P. viridiflava from other Pseudomonas species (Fig. 2A). This genealogical relationship means that one of the PAIs

cannot have originated as a recent duplication event of the other. As discussed above, HGT is thought to be a common mechanism for bacterial pathogens to acquire distinct PAIs. Is the PAI indel polymorphism a consequence of a recent HGT in P. viridiflava? Four lines of evidence argue against such a HGT event being recent. First, the structures of the T-PAIs in P. viridiflava and P. syringae Pto DC3000 are similar, and the genes flanking the PAIs are orthologous, indicating synteny. Second, there are no obvious indicators of recent HGT, like heterogeneity of G+C contents, integration-related fragments, and presence of tRNAs around this PAI (9–11). The EEL region of the T-PAI does have a tRNA in its 5′ flanking region (which are sometimes associated with HGT integration) and has slightly lower G+C content (52.8% on average) compared with other regions (58–61%). This tRNA is conserved between Pseudomonas species and may mark the original insertion point of a more ancient HGT of this PAI in an ancestor of the Pseudomonas plant pathogens (16). Alternatively, these features may reflect the exchangeability of the EEL region (16–17). Third, a phylogeny of T-PAI sequences is consistent in both depth (i.e., divergence) and branching order with a corresponding phylogeny based on other resident genes in these species (Fig. 2 A and B). Finally, the two T-PAI alleles in P. viridiflava (of LP23.1a and PNA3.3a) differ at ≈200 segregating sites within the T-PAI, which argues against a very recent HGT event. Indeed, the level of genetic variation of several genes on the T-PAI is indistinguishable from that on the background of T-PAI containing isolates from natural populations (H.A., M.K., and J.B., unpublished data). All of these results suggest that a recent HGT cannot explain the presence/absence polymorphism of the T-PAI.

Fig. 2. Genealogical relationships of PAI and non-PAI genes. Neighbor joining trees of 25 PAI genes (A), non-PAI genes (16S and gyrB; B) and the PAI gene avrE (C) for P. viridiflava, Pto DC3000, and P. cichorii 83-1 are shown. The third positions of codons in the ORFs were used for PAI genes (7.4 kb) and avrE (1.7 kb), and total sequences were used for 16S and gyrB (2.0 kb in total). The 25 PAI genes in A include ORFs R1-ORF45-51, 53–68, 72, 73, 76, and R2-ORF8-14, 16-20, 24-36 (Tables 1 and 2). Shown are bootstrap probabilities with 1,000 replications. The clade and PAI [clade, PAI] are given for each P. viridiflava isolate, as in Fig. 1. PAIs in P. syringae DC3000 and in P. cichorii 83-1 are orthologous to T-PAI and S-PAI, respectively.

The age of the T-PAI indel polymorphism can provide clues about the type of selection acting on this polymorphism. Following established methods (3–4), we investigated the genetic divergence in the regions immediately flanking the site of the T-PAI indel polymorphism between [T-PAI, ∇S-PAI] and [∇T-PAI, S-PAI] isolates. Window plot analysis reveals a clear spike of genetic variation in the junction region (nucleotide diversity approaching 30%), almost all of which can be accounted for by the divergence between [T-PAI, ∇S-PAI] and [∇T-PAI, S-PAI] isolates (Fig. 3A). We do not believe this divergence is the result of mutational processes associated with insertion or excision of the PAI but rather represents the accumulation of substitutions between the two alleles over time. First, nucleotide divergence is nearly equally divided between groups when compared with Pto DC3000. Specifically, nucleotide divergence for synonymous sites in the ORFs that neighbor the T-PAI indel junction (R1-ORF41 and R1-ORF79) is 1.24 for [T-PAI, ∇S-PAI] isolates and 1.26 for [∇T-PAI, S-PAI] isolates. Second, the level of polymorphism in the flanking region within each allele class is similar to that of the background loci in this species. E[θ] (expected genetic diversity conditioned on allele frequency; ref. 28) of the 1-kb-long flanking regions centered on the T-PAI junction is 0.032 [T-PAI, ∇S-PAI] and 0.020 [∇T-PAI, S-PAI] for isolates within clade A and 0.014 [∇T-PAI, S-PAI] for isolates within clade B; these values are comparable to 0.022 (clade A) and 0.009 (clade B) for the background loci (19). Therefore, both with respect to the mutational differences that have accumulated between the T-PAI and ∇T-PAI alleles and the mutational polymorphism that has accumulated within each allelic class, we find no evidence for unequal mutation rates.

Window plot analysis of AT and AS sequences revealed significant departures from selective neutrality (D* and F*) (27) around the indel-junction (Fig. 3A), a reflection of the large relative divergence between the two alleles. Such a departure from neutrality is consistent with balancing selection acting on the T-PAI polymorphism. The genealogical relationship of the sequence flanking the T-PAI (Region 1) indicates that the deletion of the T-PAI predated the most recent common ancestor of the A and B clades. This finding is significant because, based on the divergence of the two clades (Ks = 0.334; ref. 19), we estimate that these clades split off from each other 43–49 million years ago (29). There is virtually no chance that a polymorphism can remain segregating for this length of time by genetic drift alone but rather requires some form of balancing selection.

The same logic allows us to conclude that the S-PAI has also been maintained as an ancient balanced polymorphism in P. viridiflava. We did not detect the presence of a homolog of the S-PAI in P. syringae, but we were able to identify a homolog in P. cichorii 83-1, another relative of P. viridiflava (30). The S-PAIs in P. viridiflava and P. cichorii share similarity in their flanking sequence (Table 2) and show no indication of HGT. In addition, the genealogy of the S-PAI is consistent with the phylogenic relationship of the species (Fig. 2 A and B), including the divergence between species and divergence between the A and

Fig. 3. Genetic divergence in flanking regions of T- and S-PAIs. Average genetic diversity of all samples (π) and between isolates containing T- and S-PAIs (Dxy) by using the Jukes and Cantor correction (22, 23) were plotted for the flanking regions of the T-PAI (A, Region 1) and of the S-PAI (B, Region 2) for the five P. viridiflava isolates in Figs. 1 and 2. Window size was 100 bp with 25-bp steps. The locations of ORFs are indicated by boxes; insertions found in only one isolate and the locations of PAIs are represented by open and closed triangles, respectively. Orthologous ORFs found in the Pto DC3000 genome (24) are indicated by filled bars with their accession ID. The PAI in Pto DC3000 was defined from Pspto1411 to Pspto1367 (16). Sharp, clear peaks of genetic divergence are observed around the indel-junction both of T-PAI and S-PAI. Fu and Li's test of selective neutrality (D* and F*) (27) detected significant departure from neutrality around the peaks (P < 0.02, indicated by asterisks) based on a sample of 96 isolates.
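The window plot described in the Fig. 3 legend (100-bp windows moved in 25-bp steps, with π over all samples and Dxy between T-PAI and S-PAI isolates, both Jukes and Cantor corrected) can be illustrated with a minimal Python sketch. This is not the DnaSP implementation used in the paper. It assumes pre-aligned sequences of equal length, applies the Jukes and Cantor correction to each pairwise comparison, and skips gapped or ambiguous sites. The function names (`p_distance`, `jukes_cantor`, `window_scan`) and the toy sequences are hypothetical and serve only as illustration.

```python
import math
from itertools import combinations


def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap or an ambiguous base are skipped.
    """
    compared = 0
    diffs = 0
    for x, y in zip(a, b):
        if x in "ACGT" and y in "ACGT":
            compared += 1
            diffs += x != y
    return diffs / compared if compared else float("nan")


def jukes_cantor(p):
    """Jukes-Cantor corrected distance d = -(3/4) ln(1 - 4p/3).

    Returns NaN when the correction is undefined (p >= 0.75).
    """
    if math.isnan(p) or p >= 0.75:
        return float("nan")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def mean_distance(pairs):
    """Average corrected distance over sequence pairs, ignoring undefined values."""
    vals = [jukes_cantor(p_distance(a, b)) for a, b in pairs]
    vals = [v for v in vals if not math.isnan(v)]
    return sum(vals) / len(vals) if vals else float("nan")


def window_scan(group_t, group_s, window=100, step=25):
    """Yield (start, pi, dxy) for each sliding window along the alignment.

    pi  : mean pairwise distance over all sequences in both groups
    dxy : mean pairwise distance between group_t and group_s only
    """
    all_seqs = group_t + group_s
    length = len(all_seqs[0])
    for start in range(0, length - window + 1, step):
        sl = slice(start, start + window)
        pi = mean_distance([(a[sl], b[sl]) for a, b in combinations(all_seqs, 2)])
        dxy = mean_distance([(a[sl], b[sl]) for a in group_t for b in group_s])
        yield start, pi, dxy


if __name__ == "__main__":
    # Toy alignment: one sequence per group, differing in a 10-bp block.
    t_allele = ["A" * 200]
    s_allele = ["A" * 100 + "C" * 10 + "A" * 90]
    for start, pi, dxy in window_scan(t_allele, s_allele):
        print(start, round(pi, 4), round(dxy, 4))
```

A peak in Dxy that is not matched by within-group diversity is the pattern the legend describes around the indel junctions. With real data, each group would hold all isolates of one PAI genotype rather than a single toy sequence.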

B clades of P. viridiflava (19). The same branching pattern was further confirmed by other isolates collected from wild A. thaliana populations, based on the third codon position of avrE in the PAIs (Fig. 2C). Finally, window plot analysis for the flanking regions of the S-PAI reveals a clear spike of genetic variation when [T-PAI, ∇S-PAI] and [∇T-PAI, S-PAI] isolates are compared (Fig. 3B). Again, these polymorphisms are not associated with the insertion or excision of the PAIs: nucleotide divergence (compared with Pto DC3000) for synonymous sites in the ORFs that neighbor the S-PAI indel-junction (R2-ORF6 and R2-ORF40) is 0.78 and 0.75 for [T-PAI, ∇S-PAI] and [∇T-PAI, S-PAI] isolates, respectively. In the flanking regions of the S-PAI (1 kb centered on the junction) among the 96 isolates, E[θ] = 0.037 ([T-PAI, ∇S-PAI]) (27) and 0.024 ([∇T-PAI, S-PAI]) for isolates in clade A and 0.008 for isolates within clade B, which again are close to those in the background loci (E[θ] = 0.022 for clade A and 0.009 for clade B). Window plot analysis between [T-PAI, ∇S-PAI] and [∇T-PAI, S-PAI] isolates in clade A revealed significant departures from selective neutrality (D* and F*) (27) around the S-PAI junction (Fig. 3B), consistent with balancing selection on the S-PAI presence/absence polymorphism. The genealogical relationship of these flanking sequences was similar to that of flanking sequences of the T-PAI (data not shown), suggesting that the indel of the S-PAI also had occurred before the split of the clades.

Fig. 4. Virulence phenotype variations in P. viridiflava. Virulence phenotypes of P. viridiflava isolates measured by HR tests. Time until isolates are recognized (causing HR) by host plants was plotted (Materials and Methods). Two host plants were used, A. thaliana Col-0 (x axis) and Tobacco cv. Burley (y axis). To minimize the effect of their genetic background, only isolates in clade A (19) were selected. [AT], LP23.1a, PNA3.3a, LU5.1a, LU9.1e, LU13.1a, LU18.1a, LU19.1a, SP3.1a, ME210.1b, and PT220.1a; [AS], RMX23.1a, ME3.1a, KNOX3.4a, BOG1.1a, BOG4.2a, BOR1.3d, KY5.1a, KY7.1d, SP1.1a, and SP12.1a.

The long-term presence of this unusual dual polymorphism, together with the absence of recombinant haplotypes, is a strong indicator of selection maintaining alternative PAIs in this pathogen species. The biology of the two PAIs must differ in some way relevant to that selection. PAIs encode effectors that, when delivered to plant cells by the TTSS, may be recognized by the host and elicit a rapid defense response known as the hypersensitive response (HR). To test this hypothesis, we conducted experiments of [T-PAI, ∇S-PAI] and [∇T-PAI, S-PAI] isolates in A. thaliana and in tobacco (Fig. 4). These isolates were selected from within clade A, so that the effects of genetic background are minimized. [T-PAI, ∇S-PAI] isolates elicit an HR in A. thaliana Col-0 significantly slower than

[∇T-PAI, S-PAI] isolates ([T-PAI, ∇S-PAI] mean = 21.8 h, [∇T-PAI, S-PAI] mean = 14.8 h; P < 0.0001). However, in tobacco, we observed the opposite pattern; the same [T-PAI, ∇S-PAI] isolates elicit an HR significantly more rapidly ([T-PAI, ∇S-PAI] mean = 10.0 h, [∇T-PAI, S-PAI] mean = 14.9 h; P < 0.0001) and produce significantly larger lesions ([T-PAI, ∇S-PAI] mean = 100%, [∇T-PAI, S-PAI] mean = 85.6%; P < 0.0002) than [∇T-PAI, S-PAI] isolates. These virulence and host-specificity differences raise the possibility that the two distinct PAIs are maintained by selection as alternative means of interacting with different hosts. P. viridiflava is known to have a broad host range and does appear to infect other plant species co-occurring with A. thaliana in Midwest populations (19).

Discussion

Multiple functionally distinct PAIs previously have been identified in bacterial pathogen species and are generally associated with HGT (31). The two-PAI system in P. viridiflava is unusual in several respects. First, individual isolates possess only one PAI rather than both. Second, although we cannot exclude an ancient HGT event for the origination of the two PAIs in the lineages leading to P. viridiflava, recent HGT can be effectively ruled out. Third, the pattern of genetic variation observed in and around the PAIs, and the phylogeny of the two alleles, suggests that balancing selection has been responsible for the maintenance of the two-PAI system. This evidence shows balancing selection on PAIs of bacterial pathogens.

The conventional view of pathogen evolution has been that a single optimal level of virulence should evolve to balance the costs and benefits of virulence (growth and competition vs. host survival and transmission). However, long-term polymorphism in pathogen populations can be obtained in models in which levels of virulence vary, particularly in the face of host heterogeneity, superinfection, and spatial structure (32–35). Such polymorphisms have not been reported yet for any pathogen, although evidence of balancing selection in bacteria has been accumulating (6, 36–37). We predict that population genetic analysis of virulence factors in other pathogens will provide additional examples of evolutionarily stable polymorphisms. Why the entire PAI, rather than a particular effector gene, has been selected as a unit remains to be addressed.

Stable polymorphism for alternative forms of a critical component of virulence in P. viridiflava is strikingly similar to stable polymorphism for resistance and susceptibility alleles in certain defense genes in their plant host, A. thaliana (2–4). This study indicates the potential for evolutionarily stable polymorphism on both sides of the plant–pathogen interaction. We do not yet know whether plant–pathogen polymorphism is mechanistically coupled in a "trench warfare"; theoretically, such coevolutionary dynamics should be a viable possibility (4).

Materials and Methods

Sample Materials. Five P. viridiflava samples, from which entire PAIs were sequenced, were collected from naturally occurring A. thaliana plants in populations in the Midwest United States (14). Samples from worldwide populations were collected in the same manner (19). P. cichorii 83-1 was kindly provided by J. Greenberg (University of Chicago).

PAI Isolation and Sequencing. We constructed genomic libraries of all five strains of P. viridiflava employing a Lambda FIX II vector kit in Escherichia coli XL-1 Blue MRA-P2 (Stratagene) and using standard molecular cloning procedures (38). Probes were designed based on partial hrpS sequences (14) or by amplifying HrcN sequence based on Pto DC3000 (16). PCR products were purified and radioactively labeled with Ready-To-Go DNA Labeling Beads (Amersham Pharmacia Biosciences). Hybridization was performed at 50–65°C overnight in 6× SSC/5× Denhardt's/0.5% SDS solution. Membranes were washed with 2× SSC and 1% SDS solution twice at 50°C for 5 min and then once for 15 min. Positive phage clones were partially digested by MboI and subcloned into a plasmid by using a Zero Background/Kan Cloning Kit (Invitrogen). PCR products from the inserts were sequenced by using a CEQ8000 capillary sequencer (Beckman Coulter). To cover larger regions of the PAIs in LP23.1a and PNA3.3a, secondary screening was performed by using primer sets LP40222f (5′-GATGATGAAACCACGGGCTGTA-3′)-LP40589r (5′-TCCTCACGCAATGGCACCGTTA-3′) and LP38131f (5′-CGAGCAACTGAAAAACCTCGGGC-3′)-LP37891r (5′-TTGAGCGTTACCAGATCAAGCG-3′). All PCR reactions used Takara Taq HS (Takara Bio, Shiga, Japan) with 25 cycles of 95°C for 30 sec, 55–60°C for 40 sec, and 72°C for 2 min.

Because of the large sizes of the target sequences, we also constructed Fosmid libraries for ME3.1b, RMX23.1a, RMX3.1b, and P. cichorii 83-1, employing CopyControl Fosmid Library Production Kit (Epicentre Technologies, Madison, WI), and screened them by PCR. We used the EZ::TN <KAN-2> Insertion Kit (Epicentre Technologies) to sequence positive clones. To obtain a sequence where the S-PAI was deleted in LP23.1a, the primers GR10kF (5′-GGGCTGGGCTACAGCATCGTGCCAC-3′) and GR11kR (5′-GTGGCTTCAGCAGCACGAATGATTG-3′) were used, and positive clones were isolated. To obtain this sequence in PNA3.3a, we directly sequenced PCR products by using primer sets based on conserved sites. To obtain a sequence where the T-PAI was deleted in ME3.1b and RMX23.1a, the primers LP80.5F (5′-GTCGTGCCACCGCCGTCACCTTCGC-3′)-LP83.3R (5′-GACACGGATGAAGTGAGTCTTCTCG-3′) were used and positive clones were isolated. To obtain this sequence in RMX3.1b, we again sequenced PCR products generated by using primers based on conserved sites. Partial sequences of the PAIs in the other isolates of P. viridiflava were obtained in the same manner (PCR), based on the conserved sequences among the isolates above. Corresponding GenBank accession nos. to these sequences are AY597274–AY597283, AY859111, AY859112, AY859115, AY859128, AY859131, AY859183, AY859184, AY859351, AY859355, AY859358, and DQ158500–158855. The entire PAI sequences from P. cichorii 83-1 were cloned from a Fosmid library in the same manner as above. Partial GyrB sequence also was obtained from this species by PCR with primers Gyr-F (5′-CMGGCGGYAAGTTCGATGACAAYTC-3′) and Gyr-R (5′-TRATBKCAGTCARACCTTCRCGSGC-3′). Corresponding GenBank accession nos. are DQ168848 and DQ220702.

Genotyping PAIs from Worldwide A. thaliana Populations. We genotyped the PAIs in 298 P. viridiflava isolates from the Midwest and other worldwide populations. For 93 of these isolates, the clade was known from Goss et al. (19). We determined the clade of three additional samples, two from Lund, Sweden (LU1.1a and LU18.1a), and one from Kyoto (KY12.1d). The remaining 202 isolates were genotyped for PAIs only. PAI genotyping was done by PCR. The primer set to identify the presence of the T-PAI included shcAf1 (5′-GGCGCACTTAACCCTCTGKTCAATGA-3′) and hoppsyAr1 (5′-CYGGCGTATGATTGATAAACGCATCG-3′). The primer set to identify the presence of the S-PAI included RMXPAIf6 (5′-TGGTCGAGCTGTTCACTCACCTGT-3′) and RMXPAIr7 (5′-TTGAACTGGTTGATCGGGTTCAGG-3′). There were three primer sets used to identify the absence of the T-PAI: (i) RMXnPAIf5 (5′-CCGTGCTGTGGTCATTGTCCTGAT-3′) and MEnPAIr2u (5′-GTAACAAGCCRTGACACAAACCTAC-3′), (ii) MEnPAIf5 (5′-TGCCATCGTTCATATTGAAGCTCAG-3′) and MEnPAIr4 (5′-CGCACGGCATCGCCAACCTTGAATG-3′), and (iii) MEnPAIf7 (5′-ACCTCAATCAATACTCTGGAGATCA-3′) and MEnPAIr6 (5′-

Araki et al. PNAS ͉ April 11, 2006 ͉ vol. 103 ͉ no. 15 ͉ 5891 Downloaded by guest on September 23, 2021 CGAYTCACTGACCATCAACTGCCTG-3Ј). Finally, to iden- diluted in the morning, and grown to an optical density of 0.7–1.0 tify the absence of the S-PAI, we used the primers LPnPAIf2 colony-forming units (cfu)͞ml at 600 nm. For infection, bacteria (5Ј-GTCCGGTCTGCTACCAGAACCTGGC-3Ј) and LPn- were diluted to 2 ϫ 108 cfu͞ml in 10 mM MgSO buffer. Two Ј Ј 4 PAIr2 (5 -TGCGCACCAGCGGCAGGTATTGCGG-3 ). leaves on each of two A. thaliana Col-0 plants and one leaf on each of three tobacco cv. Burley plants were infected with each Statistical Tests of Polymorphism Levels. Sequences were edited by strain of P. viridiflava by using a blunt-end syringe. For both host SEQUENCHER 4.1.2 (Gene Codes, Ann Arbor, MI). Homologs species, 10 isolates with each PAI were assayed. Appearance of were searched by BLASTX (using the E value threshold Ͼ0.001) for the bacterial databases for the National Center for Biotech- HR was scored hourly after infection. For tobacco, percent nology Information (www.ncbi.nlm.nih.gov) and TIGR (www.ti- available leaf area (area within the leaf veins) showing necrosis gr.org) and aligned by CLUSTAL X (39) with minor manual was estimated visually after 24 h. corrections. Polymorphism and divergence surveys were per- formed by using DNASP 3.53 and DNASP 4.00 (40) and PROSEQ 2.91 We thank J. Dangl, R. F. Doolittle, G. Dwyer, M. Nordborg, and H. (41). Fu and Li’s test was performed by using DNASP 4.00 (40) with Ochman for helpful discussions; J. T. Greenberg for a sample of P. the window size 100 bp and the step size 25 bp (Fig. 3). cichorii 83-1; and J. Gladstone, M. Ludwig, and E. Bakker for technical Neighbor-joining trees were constructed by MEGA2.1 (42). help. This work was supported by grants from the National Institutes of Health and National Science Foundation and fellowships from HR Tests. P. 
viridiflava strains were incubated on KB plates at 28°C the Japan Society for the Promotion of Science and the Dropkin for 48 h. Single colonies were grown in liquid KB overnight, Foundation.
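Two of the primers listed in the methods above, MEnPAIr2u and MEnPAIr6, contain degenerate IUPAC bases (R = A or G; Y = C or T), so each name stands for a pair of oligonucleotides. The Python sketch below is an illustration only and is not part of the original methods. It expands the two primers into their unambiguous variants. The helper name `expand_degenerate` and the lookup table are hypothetical, and the sequences assume that the hyphens at line breaks in the printed primers are not part of the sequence.

```python
from itertools import product

# IUPAC nucleotide codes needed here; only R and Y occur in the listed primers.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG",  # purine: A or G
    "Y": "CT",  # pyrimidine: C or T
}


def expand_degenerate(primer):
    """Return every unambiguous sequence encoded by a degenerate primer."""
    choices = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]


# Primer sequences copied from the text (5' to 3'); hyphenation removed (assumption).
PRIMERS = {
    "MEnPAIr2u": "GTAACAAGCCRTGACACAAACCTAC",
    "MEnPAIr6": "CGAYTCACTGACCATCAACTGCCTG",
}

for name, seq in PRIMERS.items():
    for variant in expand_degenerate(seq):
        print(name, variant)
```

Each of the two primers carries a single degenerate position, so each expands to two sequences; the remaining primers in the list contain no ambiguity codes and expand to themselves.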

1. Dawkins, R. & Krebs, J. R. (1979) Proc. R. Soc. London Ser. B 205, 489–511.
2. Bergelson, J., Kreitman, M., Stahl, E. A. & Tian, D. (2001) Science 292, 2281–2285.
3. Tian, D., Araki, H., Stahl, E. A., Bergelson, J. & Kreitman, M. (2002) Proc. Natl. Acad. Sci. USA 99, 11525–11530.
4. Stahl, E. A., Dwyer, G., Mauricio, R., Kreitman, M. & Bergelson, J. (1999) Nature 400, 667–671.
5. Holub, E. B. (2001) Nat. Rev. Genet. 2, 516–527.
6. Allen, R. L., Bittner-Eddy, P. D., Grenville-Briggs, L. J., Meitz, J. C., Rehmany, A. P., Rose, L. E. & Beynon, J. L. (2004) Science 306, 1957–1960.
7. Staskawicz, B. J., Mudgett, M. B., Dangl, J. L. & Galan, J. E. (2001) Science 292, 2285–2289.
8. Hacker, J. & Kaper, J. B. (2000) Annu. Rev. Microbiol. 54, 641–679.
9. Ochman, H., Lawrence, G. J. & Groisman, E. A. (2000) Nature 405, 299–304.
10. Brown, J. R. (2003) Nat. Rev. Genet. 4, 121–132.
11. Ochman, H., Lerat, E. & Daubin, V. (2005) Proc. Natl. Acad. Sci. USA 102, 6595–6599.
12. Lan, R. & Reeves, P. R. (2001) Trends Microbiol. 9, 419–424.
13. Rohmer, L., Guttman, D. S. & Dangl, J. L. (2004) Genetics 167, 1341–1360.
14. Jakob, K., Goss, E. M., Araki, H., Van, T., Kreitman, M. & Bergelson, J. (2002) Mol. Plant–Microbe Interact. 15, 1195–1203.
15. Gardan, L., Shafik, H., Belouin, S., , R., Grimont, F. & Grimont, P. A. D. (1999) Int. J. Syst. Bacteriol. 49, 469–478.
16. Alfano, J. R., Charkowski, A. O., Deng, W.-L., Badel, J. L., Petnicki-Ocwieja, T., van Dijk, K. & Collmer, A. (2000) Proc. Natl. Acad. Sci. USA 97, 4856–4861.
17. Charity, J. C., Pak, K., Delwiche, C. F. & Hutcheson, S. W. (2003) Mol. Plant–Microbe Interact. 16, 495–507.
18. Heu, S. & Hutcheson, S. W. (1993) Mol. Plant–Microbe Interact. 6, 553–564.
19. Goss, E. M., Kreitman, M. & Bergelson, J. (2005) Genetics 169, 21–35.
20. van Dijk, K., Tam, V. C., Records, A. R., Petnicki-Ocwieja, T. & Alfano, J. R. (2002) Mol. Microbiol. 44, 1469–1481.
21. Lorang, J. M. & Keen, N. T. (1995) Mol. Plant–Microbe Interact. 8, 49–57.
22. Nei, M. (1987) Molecular Evolutionary Genetics (Columbia Univ. Press, New York).
23. Jukes, T. H. & Cantor, C. R. (1969) in Mammalian Protein Metabolism, eds. Munro, H. N. (Academic, New York), pp. 21–132.
24. Buell, C. R., Joardar, V., Lindeberg, M., Selengut, J., Paulsen, I. T., Gwinn, M. L., Dodson, R. J., Deboy, R. T., Durkin, A. S., Kolonay, J. F., et al. (2003) Proc. Natl. Acad. Sci. USA 100, 10181–10186.
25. Feil, H., Feil, W. S., Chain, P., Larimer, F., DiBartolo, G., Capeland, A., Lykidis, A., Trong, S., Nolan, M., Goltsman, E., et al. (2005) Proc. Natl. Acad. Sci. USA 102, 11064–11069.
26. Joardar, V., Lindeberg, M., Jackson, R. W., Selengut, J., Dodson, R., Brinkac, L. M., Daugherty, S. C., Deboy, R., Durkin, A. S., Giglio, M. G., et al. (2005) J. Bacteriol. 187, 6488–6498.
27. Fu, Y. X. & Li, W. H. (1993) Genetics 133, 693–709.
28. Innan, H. & Tajima, F. (1997) Genetics 147, 1431–1444.
29. Ochman, H. & Wilson, A. C. (1987) J. Mol. Evol. 26, 74–86.
30. Yamamoto, S., Kasai, H., Arnold, D. L., Jackson, R. W., Vivian, A. & Harayama, S. (2000) Microbiology 146, 2385–2394.
31. Marcus, S. L., Brumell, J. H., Pfeifer, C. G. & Brett Finlay, B. (2000) Microbes Infect. 2, 145–156.
32. Bergelson, J., Dwyer, G. & Emerson, J. J. (2001) Ann. Rev. Genet. 35, 469–499.
33. Boots, M., Hudson, P. J. & Sasaki, A. (2004) Science 303, 842–844.
34. Sasaki, A. & Godfray, H. C. (1999) Proc. R. Soc. London Ser. B 266, 455–463.
35. Frank, S. A. (1993) Evol. Ecol. 7, 45–75.
36. Brisson, D. & Dykhuizen, D. E. (2004) Genetics 168, 713–722.
37. Qiu, W.-G., Dykhuizen, D. E., Acosta, M. S. & Luft, B. J. (2002) Genetics 160, 833–849.
38. Sambrook, J., Fritsch, E. F. & Maniatis, R. (1989) Molecular Cloning: A Laboratory Manual (Cold Spring Harbor Lab. Press, Plainview, NY), 2nd Ed.
39. Thompson, J. D., Gibson, T. J., Plewniak, F., Jeanmougin, F. & Higgins, D. G. (1997) Nucleic Acids Res. 25, 4876–4882.
40. Rozas, J. & Rozas, R. (1999) Bioinformatics 15, 174–175.
41. Filatov, D. (2002) Mol. Ecol. Notes 2, 621–624.
42. Kumar, S., Tamura, K., Jakobsen, I. B. & Nei, M. (2001) Bioinformatics 17, 1244–1245.

5892 | www.pnas.org/cgi/doi/10.1073/pnas.0601431103 | Araki et al.