Copyright © 2007 by the Genetics Society of America DOI: 10.1534/genetics.107.077925

Molecular Evolution of Pathogenicity-Island Genes in Pseudomonas viridiflava

Hitoshi Araki,*,†,1 Hideki Innan,‡ Martin Kreitman* and Joy Bergelson*
*Department of Ecology and Evolution, , Chicago, Illinois 60637, †Department of Biology, Nanjing University, Nanjing 210093, China and ‡Graduate University of Advanced Studies, Hayama 240-0193, Japan
Manuscript received June 20, 2007
Accepted for publication August 10, 2007

ABSTRACT
The bacterial pathogen Pseudomonas viridiflava possesses two pathogenicity islands (PAIs) that share many gene homologs, but are structurally and phenotypically differentiated (T-PAI and S-PAI). These PAIs are paralogous, but only one is present in each isolate. While this dual presence/absence polymorphism has been shown to be maintained by balancing selection, little is known about the molecular evolution of individual genes on the PAIs. Here we investigate genetic variation of 12 PAI gene loci (7 on T-PAI and 5 on S-PAI) in 96 worldwide isolates of P. viridiflava. These genes include avirulence genes (hopPsyA and avrE), their putative chaperones (shcA and avrF), and genes encoding the type III outer proteins (hrpA, hrpZ, and hrpW). Average nucleotide diversities in these genes (π = 0.004–0.020) were close to those in the genetic background. Large numbers of recombination events were found within PAIs and a sign of positive selection was detected in avrE. These results suggest that the PAI genes are evolving relatively freely from each other on the PAIs, rather than as a single unit under balancing selection. Evolutionarily stable PAIs may be preferable in this species because preexisting genetic variation enables P. viridiflava to respond rapidly to natural selection.

Sequence data from this article have been deposited with the GenBank Data Library under accession nos. AY859094–AY859371.
1Corresponding author: Department of , Oregon State University, 3029 Cordley Hall, Corvallis, OR 97331. E-mail: [email protected]

GENE-for-gene interactions between plants and plant pathogens involve specific resistance (R) gene products in plants that are responsible for the recognition of elicitors encoded by avirulence (Avr) genes in pathogens (Flor 1971). Plant genomes possess many R gene loci (Arabidopsis Genome Initiative 2000); annotation of the genomic sequence, in particular, identifies 150, most of which occur in clusters of 2–9 loci (Meyers et al. 2003). It is clear that selection has acted to diversify paralogous R gene family members in the Arabidopsis genome (Meyers et al. 2003). However, many R gene polymorphisms in A. thaliana have been shown to be subject to balancing or transient balancing selection, rather than directional selection (Stahl et al. 1999; Bergelson et al. 2001; Tian et al. 2002; Bakker et al. 2006; Shen et al. 2006), in contrast to expectations under the simplest arms race model (Dawkins and Krebs 1979). In fact, there is little evidence for direct R–Avr protein interaction (e.g., Dodds et al. 2006). These facts may be explained by the "guard hypothesis", in which R gene products are assumed to recognize Avr gene products only indirectly via modifications of host proteins that are targets of the Avr gene products (Dangl and Jones 2001).
Much less is known about the molecular evolution of genes encoding bacterial effectors. Like Arabidopsis R genes, however, the Phytophthora infestans scr74 gene family of effectors may be subject to diversifying selection (Liu et al. 2005). Limited evidence of balancing selection acting upon virulence-related genes is found in Fusarium graminearum (Ward et al. 2002) and in Borrelia burgdorferi (Qiu et al. 2004). A comprehensive survey of Pseudomonas syringae effectors identified an additional seven gene families with domains subject to diversifying selection, although the majority of gene families were shaped by purifying selection (Rohmer et al. 2004). Rohmer et al. (2004) also showed that 9 of 24 effector genes were acquired via horizontal gene transfer (HGT), indicating that HGT is one of the important factors in bacterial effector gene evolution.
In a previous study, we identified two paralogous PAIs in P. viridiflava (Araki et al. 2006), a prevalent natural pathogen of A. thaliana (Jakob et al. 2002). Although these PAIs (T- and S-PAI, respectively) share 25 gene homologs, they are structurally distinct. The T-PAI is composed of three parts: a gene cluster encoding the type III protein-secretion apparatus (hrp/hrc gene cluster), the 5′-effector locus (exchangeable effector locus, EEL), and the 3′-effector locus (conserved effector locus, CEL). The S-PAI, on the other hand, has a single component hrp/hrc cluster with a 10 kb-long insertion. The two PAIs are integrated at different locations in the P. viridiflava genome, and both islands are segregating for alleles in which the entire island is deleted (Figure 1). Extensive surveys of P. viridiflava isolates from worldwide collections have identified only two combinations, each containing one of the two islands (i.e., [T-PAI, ∇S-PAI] and [∇T-PAI, S-PAI]). Several lines of

evidence indicate that this dual presence/absence polymorphism has been maintained by balancing selection. Furthermore, the [T-PAI, ∇S-PAI] and [∇T-PAI, S-PAI] isolates exhibit virulence differences (Araki et al. 2006). [T-PAI, ∇S-PAI] isolates elicit a rapid defense response known as the hypersensitive response (HR) in A. thaliana significantly more slowly than [∇T-PAI, S-PAI] isolates, whereas they elicit an HR in tobacco significantly more quickly than the [∇T-PAI, S-PAI] isolates. This functional differentiation of PAIs is required for selection to maintain this unusual dual-haplotype configuration as a polymorphism. Why the entire PAI, rather than a particular effector gene on it, has been selected as a unit has been unclear and how individual genes on the PAI evolve remains unanswered.
To address these questions, we investigate molecular evolution in 12 PAI gene loci (7 on T-PAI and 5 on S-PAI) in P. viridiflava. These loci include five genes shared by the two PAIs (avrE, avrF, hrpA, hrpZ, and hrpW, designated as avrE[S], avrF[S], hrpA[S], hrpZ[S], and hrpW[S] on S-PAI and avrE[T], avrF[T], hrpA[T], hrpZ[T], and hrpW[T] on T-PAI) as well as two T-PAI-specific genes (hopPsyA[T] and shcA[T]) (Figure 1). hopPsyA and shcA encode an Avr protein that elicits a HR on tobacco and its putative chaperone, respectively (Alfano and Collmer 1997; Van Dijk et al. 2002). avrE encodes a known Avr protein that elicits HR on tobacco and soybean plants (Lorang and Keen 1995). avrF is putatively an AvrE-specific chaperone (Bogdanove et al. 1998) and might be required for efficient delivery of AvrE (Ham et al. 2006). A potential role of AvrE as a suppressor of basal immunity and promotion of host cell death is reported in Debroy et al. (2004). hrpA, hrpZ, and hrpW encode outer proteins involved in the type-III secretion system (TTSS). hrpA encodes the pilus, which plays a key role in the secretion of Avr and other type-III effector proteins and is subject to natural selection in P. syringae (Jin and He 2001; Guttman et al. 2006). hrpZ and hrpW encode harpin proteins (Huang et al. 1995; Preston et al. 1995), which also elicit HR in the apoplast (in contrast to Avr proteins that elicit HR in plant cells; Wei et al. 1992; He et al. 1993). These genes, therefore, are likely to be involved in the Arabidopsis–P. viridiflava interaction, and hence subject to natural selection. Considering the range of putative functions of these gene products, however, the types and degrees of selective pressures on these genes might vary. Thus, one might expect that these PAI genes evolve in an independent fashion subject to different selective pressures (e.g., negative or positive selection). An alternative hypothesis is that the balancing selection on the presence/absence polymorphisms for entire PAIs overcomes the selective pressure on each PAI gene. In this case, one would expect that these genes evolve in concert.
We used 96 bacterial isolates collected mostly from A. thaliana leaves in worldwide populations, including the same isolates as previously investigated (Goss et al. 2005; Araki et al. 2006). P. viridiflava consists of two genetically differentiated clades, A and B (Goss et al. 2005). Both T- and S-PAI isolates are present in clade A, but only S-PAI isolates have been identified in clade B. We therefore analyzed genetic variation and molecular evolution of the 12 loci for each of the three groupings: 7 loci on [AT] (indicating isolates with the T-PAI in clade A) and 5 on [AS] (indicating isolates with the S-PAI in clade A) and [BS] (indicating isolates with the S-PAI in clade B) (Figure 1). Our results suggest that these loci have evolved relatively freely from other genes on the same PAI, despite the fact that they are located on genetic islands and that entire PAIs are under balancing selection.

Genetics 177: 1031–1041 (October 2007)

MATERIALS AND METHODS
Samples: Ninety-three isolates out of the 96 investigated here are described in Goss et al. (2005). Eighty-three of them were isolated from A. thaliana plants and the remaining 10 were isolated from other weedy plant species alongside A. thaliana. Three additional isolates included in this study, LU1.1a and LU18.1a (from Lund, Sweden) and KY12.1d (from Kyoto, Japan), were isolated from A. thaliana plants. The PAI presence/absence genotype was identified by PCR (see Araki et al. 2006 for details). In this manner, 10 isolates were identified as [AT], 57 were identified as [AS] and 29 were identified as [BS].
Sequence data analyses and the tests of selective neutrality: Sequences of the seven PAI genes were obtained by direct PCR and sequencing using the same conditions as those described in Araki et al. (2006). Primers for the PCR are listed in supplemental Table 1 at http://www.genetics.org/supplemental/. Primer annealing temperatures were between 55° and 60°. Sequences of hrpW[T] from three isolates (LU5.1a, LU9.1e, and PT220.1a) and hrpW[S] from four isolates (KNOX230.1a, ME751a, ME753a, and ME756a) were eliminated from the following analyses because we could obtain only partial sequences. Sequences were edited by Sequencher v.4.1.2 (Gene Codes, Ann Arbor, MI). After editing, sequences were aligned by CLUSTAL X (Thompson et al. 1997) with minor manual adjustments. Reference sequences for alignments were obtained from the GenBank online database as follows: P. viridiflava LP23.1a, PNA29.1a, and P. syringae pv. tomato (Pto) DC3000 for T-PAI and P. viridiflava ME3.1b, RMX23.1a, RMX3.1b, and P. cichorii 83-1 for S-PAI (Araki et al. 2006). Polymorphism and divergence surveys, selective neutrality tests, and a permutation test for genetic differentiation with 10,000 replicates were performed using DnaSP v4.00 (Rozas et al. 2003). Neighbor-joining (NJ) trees (Saitou and Nei 1987) were constructed by MEGA2.1 (Kumar et al. 2001) using Kimura's two-parameter model (Kimura 1980). The multilocus HKA test was conducted using the HKA program (available from http://lifesci.rutgers.edu/heylab/HeylabSoftware.htm#HKA) with 10,000 simulations. The minimum-recombination number was estimated utilizing methods in Hudson and Kaplan (1985) and Myers and Griffiths (2003). Frequent recombination was further tested by a runs test using GENECONV v. 1.81 (based on a global permutation test after multiple-comparison correction; see Sawyer 1989).

RESULTS
Physical organization and structural evolution: Figure 1 depicts the physical location of the 12 loci on the two

paralogous PAIs (T-PAI in chromosomal region 1 and S-PAI in chromosomal region 2). T- and S-PAI genotypes were both represented among the 67 clade A isolates but only S-PAI genotypes were present in the 29 clade B isolates. Thus, our 96 samples of P. viridiflava consist of 10 [AT], 57 [AS], and 29 [BS] isolates. An Avr gene homolog, hopPsyA[T], and its putative chaperone, shcA[T], were located in the EEL region in the T-PAI, but are not present in the S-PAI. The avrE[S], avrF[S], and hrpW[S] were located in the 10-kb insertion in both [AS] and [BS], whereas homologs of these genes (avrE[T], avrF[T], and hrpW[T]) were found in the CEL region in [AT] isolates. Both hrpA and hrpZ occupied adjacent locations in the hrp/hrc cluster in both T-PAI (hrpA[T] and hrpZ[T]) and S-PAI (hrpA[S] and hrpZ[S]).

Figure 1. Schematics for locations of the PAI genes investigated in this study. T-PAI is located in region 1 (left) and S-PAI is located in region 2 (right). Physical distance of the two regions is unknown in P. viridiflava. [AT], [AS], and [BS] represent [clade, PAI-type] of the isolates. In all the samples, the presence of T-PAI in region 1 was perfectly associated with the absence of S-PAI in region 2, and vice versa. Arrowheads represent locations where at least one recombination event was detected by Hudson and Kaplan's Rm (1985), and the number below the arrowhead represents Rm found within or between loci.

In the EEL region on the T-PAI, hopPsyA[T] and shcA[T] were located adjacent to one another and shared the same orientation in all 10 [AT] isolates. The consistent gene composition and gene orientation of these genes among P. viridiflava isolates contrasts to those in P. syringae. Among P. syringae pathovars, the EEL region is hypervariable both in gene composition and in gene orientation, which suggests frequent HGTs and genetic exchanges (Alfano et al. 2000; Charity et al. 2003; Deng et al. 2003). Indeed, phylogenetic analysis of hopPsyA and shcA among P. viridiflava and P. syringae isolates revealed that the genes in P. syringae isolates are split into two distantly related clades (groups X and Y in Figure 2A). The nucleotide sequences from P. viridiflava cluster together and are closely related to the P. syringae group X. This pattern was confirmed using several phylogenetic methods including maximum parsimony and the unweighted pair group method with arithmetic mean (UPGMA) (data not shown). Interestingly, all gene homologs in P. syringae group X share the same gene orientation as those in P. viridiflava, whereas those in P. syringae group Y have the opposite gene orientation. These results are consistent with the HGT of these two members of the EEL; horizontal transfer of the EEL region is further supported by the presence of tRNALeu in its 5′ end and its low G + C% in Pseudomonas (Charity et al. 2003; Araki et al. 2006). The data do not allow us to ascertain the direction or source of the HGT event, however. In the hrp/hrc cluster, on the other hand, no sign of HGT was found. Even the genealogy of partial sequences (175 bp) in hrpK[T], a gene in the hrp/hrc cluster next to hopPsyA[T] and shcA[T] in P. viridiflava, was consistent with the phylogeny of these species (Figure 2B). This result suggests that the hrp/hrc cluster is not included in the potential HGT of the EEL region, confirming previous reports in P. syringae (Sawada et al. 1999; Alfano et al. 2000; Charity et al. 2003).
The hrp/hrc gene cluster is composed of genes encoding the type-III protein-secretion apparatus. In the hrp/hrc gene cluster, hrpZ and hrpW encode harpin proteins, defined as glycine-rich, cysteine-lacking proteins that are secreted by the TTSS and that possess heat-stable HR elicitor activity (Wei et al. 1992; Huang et al. 1995; Charkowski et al. 1998). HrpZ in P. viridiflava shares these features of amino acid sequences, containing no cysteine-coding codons and 42–46 glycine-coding codons out of 314–337 codons. HrpW in P. syringae is composed of two domains: the harpin domain in the 5′ half and the pectate lyase domain in the 3′ half (Charkowski et al. 1998). This structure was also found in P. viridiflava, and no cysteine-coding codon was found in hrpW in this species. However, the harpin domain is quite variable among groups in P. viridiflava (average π within species is 0.27 and 0.11 for synonymous and nonsynonymous sites, respectively), and the number of glycine-coding codons varies substantially even among isolates in the same group (supplemental Figure 1 at http://www.genetics.org/supplemental/). The average number (and standard deviation, SD) of glycine-coding codons in the harpin domain is 37.9 (SD = 0.1) for [AT], 116.1 (SD = 6.7) for

Figure 2. Genealogical relationships of hopPsyA and shcA (A) and a part of hrpK (B). The neighbor-joining trees were constructed on the basis of the nucleotide variations in the third codon position in the combined coding sequences of hopPsyA + shcA (1704 bp) and the partial sequences of hrpK (175 bp). The sequences in P. syringae were obtained from either Charity et al. (2003) or Deng et al. (2003). Bootstrap probabilities (%) with 1000 replicates, which were >80%, were represented on or below the major branch. The partial hrpK sequences in P. viridiflava were only available in LP23.1a and PNA3.3a (Araki et al. 2006).

[AS], and 115.2 (SD = 7.7) for [BS]. This compares to 40 glycine-coding codons in the harpin domain of Pto DC3000 and 119 glycine-coding codons in the harpin domain of P. cichorii 83-1. The pectate lyase domain in hrpW of P. viridiflava, on the other hand, is less polymorphic and amino acid sequence is well conserved both between species and between T-PAI and S-PAI in this domain (average π within species is 0.21 and 0.04 for synonymous and nonsynonymous sites, respectively). While the harpin domain in HrpW is known to be under positive selection in P. syringae (Rohmer et al. 2004), the functional and evolutionary significance of variable numbers of glycine-coding codons in this domain needs to be further addressed in P. viridiflava.
Level of polymorphism within lineages: Synonymous nucleotide diversity (πsyn, Nei 1987) was estimated for each gene within each of the three PAI-clade groups ([AT], [AS], and [BS], Table 1). There are three notable observations in Table 1:
1. πsyn varies substantially among loci and among groups. This result suggests the possibility of different evolutionary histories among the PAI genes.
2. hrpA shows almost no genetic variation within each group. No variation was found in hrpA[T] among [AT] and in hrpA[S] among [BS] isolates. hrpA[S] among [AS] was nearly invariant.
3. No clear difference in the levels of genetic variation was found between T- and S-PAI and between the PAI genes and their genetic background.
The average nucleotide diversity observed in the PAI genes is low in [AT] (π = 0.004), which is only 18 and 27% of that in [AS] and [BS] (Table 1). However, the low level of genetic variation in [AT] was expected a priori because [AT] isolates occur at 10% frequency in the worldwide sample. Under neutral equilibrium model assumptions, estimates of the neutral parameter θ = 2Nμ (N is an effective population size and μ is the mutation rate) were very close for [AT] (0.024) and [AS] (0.023) after adjusting for the PAI frequency difference in clade A (Innan and Tajima 1997) (Table 1). Those values of θ and that in [BS] (0.013, S-PAI is fixed in clade B) were also very similar to the levels of polymorphism found in the background loci for these isolates (π = 0.022 in clade A and 0.009 in clade B; Goss et al. 2005).
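As an illustration of the statistic reported in this section, nucleotide diversity (Nei 1987) is the average number of pairwise differences per site among aligned sequences. The following minimal Python sketch computes it for a toy alignment. The function name and the toy sequences are hypothetical, and the sketch assumes gap-free sequences of equal length; it is not the DnaSP implementation used in the study.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """Average pairwise differences per site (pi) for aligned sequences.

    Assumption for this sketch: all sequences have equal length and
    contain no gaps or ambiguity codes.
    """
    n = len(seqs)
    length = len(seqs[0])
    if n < 2 or any(len(s) != length for s in seqs):
        raise ValueError("need >= 2 aligned sequences of equal length")
    # Count mismatched positions over every unordered pair of sequences.
    total_diffs = sum(
        sum(a != b for a, b in zip(s1, s2))
        for s1, s2 in combinations(seqs, 2)
    )
    n_pairs = n * (n - 1) // 2
    return total_diffs / (n_pairs * length)


# Toy alignment (hypothetical data, 4 sequences x 10 sites).
toy = ["ACGTACGTAC", "ACGTACGTAC", "ACGAACGTAC", "ACGAACGTTC"]
pi = nucleotide_diversity(toy)
print(f"pi = {pi:.4f}")
```

Restricting the input to synonymous sites only would give the analogue of πsyn; identifying those sites requires a codon-aware method that is outside the scope of this sketch.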

TABLE 1
Genetic variations of the PAI genes

                                  [AT]                     [AS]                     [BS]
Gene               Size (bp)   n   π (πsyn)a            n   π (πsyn)a            n   π (πsyn)a
shcA[T]               378     10   0.0025 (0.0106)      0   -                    0   -
hopPsyA[T]           1146     10   0.0010 (0.0033)      0   -                    0   -
avrE ([T] or [S])    5165     10   0.0041 (0.0101)     57   0.0233 (0.0714)     29   0.0130 (0.0425)
avrF ([T] or [S])     395     10   0.0135 (0.0508)     57   0.0120 (0.0434)     29   0.0079 (0.0306)
hrpA ([T] or [S])     217     10   0.0000 (0.0000)     57   0.0013 (0.0054)     29   0.0000 (0.0000)
hrpZ ([T] or [S])     952     10   0.0005 (0.0020)     57   0.0104 (0.0373)     29   0.0048 (0.0195)
hrpW ([T] or [S])    1474      7   0.0022 (0.0040)     54   0.0185 (0.0559)     28   0.0229 (0.0739)
Averageb               -       -   0.0036 (0.0096)      -   0.0199 (0.0616)      -   0.0133 (0.0441)
E[θ]c                  -       -   0.0241 (0.0646)      -   0.0233 (0.0725)      -   -

a Average nucleotide diversity (Nei 1987). Numbers in parentheses represent the average nucleotide diversity of synonymous sites.
b Average over the five PAI genes shared between T- and S-PAI (avrE, avrF, hrpA, hrpZ, and hrpW).
c Estimate of θ = 2Nμ, where N is an effective population size of clade A and μ is the mutation rate, was calculated from the average nucleotide diversities of [AT] and [AS] on the basis of Innan and Tajima (1997), assuming random mating and 10/67 and 57/67 of [AT] and [AS] frequencies in the clade A population, respectively.
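The E[θ] row of Table 1 can be checked approximately with simple arithmetic. The sketch below assumes, as a simplification for illustration rather than the full calculation of Innan and Tajima (1997), that the expected diversity within an allelic class scales linearly with the class frequency f, so that θ ≈ π / f. The function name is hypothetical; the inputs are the Table 1 averages and the class frequencies given in footnote c.

```python
def theta_from_class_diversity(pi_within, class_freq):
    """Rescale within-class diversity by class frequency.

    Assumption (illustrative only): E[pi within class] ~= f * theta,
    so theta ~= pi / f. This is a simplified reading of footnote c.
    """
    if not 0 < class_freq <= 1:
        raise ValueError("class frequency must be in (0, 1]")
    return pi_within / class_freq


# Frequencies of the two PAI classes within clade A (Table 1, footnote c).
f_AT, f_AS = 10 / 67, 57 / 67

# Average nucleotide diversities from Table 1.
theta_AT = theta_from_class_diversity(0.0036, f_AT)
theta_AS = theta_from_class_diversity(0.0199, f_AS)

print(f"theta from [AT]: {theta_AT:.4f}")
print(f"theta from [AS]: {theta_AS:.4f}")
```

Under this assumption the two classes give nearly the same θ, which matches the point made in the text: the low diversity in [AT] is what one would expect from its low frequency alone.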

Genetic divergence: Genetic divergence of the PAI gene homologs among groups in P. viridiflava and between species (P. syringae for T-PAI and P. cichorii for S-PAI, Araki et al. 2006) is shown in Table 2. T-PAI and S-PAI gene homologs are extensively diverged (Dxy = 0.624–1.860, Nei 1987), confirming that the most recent common ancestor between T- and S-PAI predates the split of P. syringae and P. cichorii from their most recent common ancestors with P. viridiflava (Figure 3) (see also Araki et al. 2006).
Comparisons between [AS] and [BS] reveal a moderate level of genetic divergence (Dxy = 0.014–0.086), similar to that of the background loci in these isolates (Dxy = 0.037–0.087, Goss et al. 2005). Levels of genetic divergence between [AS] and P. cichorii were very similar to those between [BS] and P. cichorii (Table 2), indicating similar rates of divergence of the two P. viridiflava clades. These patterns of divergence are consistent with a model of vertical phylogenetic transmission of the S-PAI loci within species.
Comparisons between [AT] isolates in P. viridiflava and Pto DC3000, on the other hand, revealed that synonymous divergence of the orthologous PAI genes (Dxy = 1.1–2.9, average 1.6) is consistently greater than for the background loci (Dxy = 0.49–0.73, average 0.56, Goss et al. 2005). While the cause of the large divergence is not clear, the genealogy of the third codon sequences from the five genes shared between T- and S-PAI (Figure 3) indicates a higher evolutionary rate of these genes in [AT] relative to that in Pto DC3000. According to the genetic distance between these lineages (Figure 3), the evolutionary rates of the PAI genes

TABLE 2
Genetic diversities of the PAI genes between lineages (Dxy)

Comparison               shcA            hopPsyA         avrE            avrF            hrpA            hrpZ            hrpW
Within P. viridiflava (orthologous)
  [AS] vs. [BS]          -               -               0.086 (0.284)   0.079 (0.326)   0.014 (0.064)   0.051 (0.217)   0.093 (0.367)
Within P. viridiflava (paralogous)
  [AT] vs. [AS]          -               -               0.739 (1.545)   0.659 (N.A.)    1.744 (1.833)   1.046 (2.544)   0.811 (1.592)
  [AT] vs. [BS]          -               -               0.735 (1.462)   0.624 (3.730)   1.860 (2.713)   1.044 (2.280)   0.822 (1.727)
Between species (orthologous)
  [AT] vs. Pto DC3000    0.498 (1.565)   0.801 (2.915)   0.431 (1.657)   0.338 (2.941)   0.929 (1.113)   0.523 (1.654)   0.491 (1.667)
  [AS] vs. Pcic 83-1     -               -               0.568 (1.431)   0.282 (0.917)   0.140 (0.426)   0.244 (0.862)   0.309 (1.037)
  [BS] vs. Pcic 83-1     -               -               0.569 (1.381)   0.292 (0.970)   0.134 (0.408)   0.260 (1.052)   0.299 (0.997)

Genetic diversities (Dxy) were calculated on the basis of Nei (1987) with Jukes and Cantor (1969) correction. Values within parentheses represent Dxy in synonymous sites. Pto DC3000 and Pcic 83-1 represent P. syringae DC3000 and P. cichorii 83-1, which are outgroups with the T-PAI and S-PAI homolog, respectively (Araki et al. 2006). Dxy in the synonymous sites of avrF could not be obtained because this value was beyond the level of the Jukes and Cantor (1969) correction.
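The "N.A." entry in Table 2 follows from the form of the Jukes and Cantor (1969) correction, d = −(3/4) ln(1 − 4p/3), where p is the observed proportion of differing sites. The logarithm is undefined once p reaches 3/4, the level expected between unrelated sequences. A minimal Python sketch (the function name is hypothetical, and returning None for saturated comparisons is a choice made for this illustration):

```python
import math


def jukes_cantor_distance(p):
    """Jukes-Cantor corrected distance from the observed proportion p.

    Returns None when p >= 0.75, where the correction is undefined
    (saturation); this convention is an assumption of the sketch.
    """
    if p < 0:
        raise ValueError("p must be non-negative")
    if p >= 0.75:
        return None
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


for p in (0.05, 0.30, 0.60, 0.74, 0.80):
    d = jukes_cantor_distance(p)
    label = "undefined (saturated)" if d is None else f"{d:.3f}"
    print(f"p = {p:.2f} -> d = {label}")
```

The corrected distance grows without bound as p approaches 0.75, which is why corrected values above 1, such as the synonymous Dxy values between T- and S-PAI, carry large uncertainty.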

TABLE 3
Detected number of recombination events within each group

Method                                  [AT]   [AS]   [BS]
Hudson and Kaplan's (1985) Rm             5     59     42
Myers and Griffiths's (2003) Rm           5     86     82
Sawyer's (1989) run test                 10    166     35
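The Rm statistic in the first row of Table 3 rests on the four-gamete test: under an infinite-sites model, two biallelic sites that show all four haplotype combinations must be separated by at least one recombination event. Rm is then the largest number of such site pairs whose intervals do not overlap. The Python sketch below is a generic illustration of that idea, with hypothetical function names and toy haplotypes; it is not the DnaSP code used in the study.

```python
from itertools import combinations


def four_gametes(col_a, col_b):
    """True if two biallelic sites show all four allele combinations."""
    return len(set(zip(col_a, col_b))) == 4


def hudson_kaplan_rm(haplotypes):
    """Minimum number of recombination events (Rm) by interval selection.

    Assumption for this sketch: haplotypes are equal-length strings;
    only biallelic sites are considered.
    """
    n_sites = len(haplotypes[0])
    columns = [[h[i] for h in haplotypes] for i in range(n_sites)]
    biallelic = [i for i, col in enumerate(columns) if len(set(col)) == 2]
    # Every incompatible site pair defines an interval (left, right).
    intervals = [
        (i, j)
        for i, j in combinations(biallelic, 2)
        if four_gametes(columns[i], columns[j])
    ]
    # Greedy selection by right endpoint gives the maximum number of
    # intervals that share no interior; touching endpoints are allowed.
    intervals.sort(key=lambda ij: ij[1])
    rm, last_right = 0, -1
    for left, right in intervals:
        if left >= last_right:
            rm += 1
            last_right = right
    return rm


# Toy haplotypes (hypothetical data): every pair of sites is incompatible.
toy = ["000", "011", "101", "110"]
print("Rm =", hudson_kaplan_rm(toy))
```

Because only events that leave a four-gamete signature are counted, Rm is a lower bound, which is consistent with the description of the method in the text as very conservative.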

among the [AT] lineages are on average 48.3% higher than those in Pto DC3000 after the split of these lineages. Although Tajima's relative-rate test (Tajima 1993) did not detect a statistical significance of the difference in the evolutionary rates between them (P = 0.36–0.47, P. cichorii as an outgroup), this result may be due to lack of statistical power because P. cichorii was highly divergent from these lineages. Thus, our data are basically consistent with the increased evolutionary rate of the PAI genes among [AT] lineages rather than in P. syringae.

Figure 3. Genealogical relationships of five shared genes between T-PAI and S-PAI. The neighbor-joining trees were constructed on the basis of the nucleotide variations in the third codon position in the combined coding sequences of the five genes (avrE, avrF, hrpA, hrpZ, and hrpW, 6720 bp comparable in total) after the Jukes–Cantor correction (Kumar et al. 2001). Bootstrap probabilities (percentage) with 1000 replicates were represented on or below the major branches.

Recombination within each PAI: Within each PAI, patterns of genetic variation provided clear signs of recombination, despite the tight physical linkage of the PAI gene loci. We estimated the minimum number of recombination events (Rm) between sequences within [AT], [AS], and [BS] using the very conservative method of Hudson and Kaplan (1985) (Table 3 and Figure 1). The estimated Rm in S-PAI is very large (Rm[AS] = 59 and Rm[BS] = 42), whereas that in T-PAI is moderate (Rm[AT] = 5). The small number of recombination events detected in [AT] must be, at least in part, a consequence of small sample size (10 isolates) and low genetic variation in [AT] (Table 1). Regardless, these are impressively large numbers, especially given the fact that this conservative method identifies only a small fraction of the actual recombination events in the history of a sample (Hudson and Kaplan 1985). The majority of the recombination events was detected within avrE in both PAIs, the largest gene locus in each PAI (>5 kb, Figure 1). Qualitatively similar results were also obtained by different methods (Sawyer 1989; Myers and Griffiths 2003) (Table 3). These results are consistent with those of Goss et al. (2005), in which a high rate of recombination is reported among genomewide, background loci in this species. The large numbers of recombination events within the small genomic islands (<30 kb each) found in this study provide further evidence of the high recombination rate in this species.
Between T-PAI and S-PAI, on the other hand, the large genetic divergence (Table 2, Figure 3) suggests that paralogous recombination (i.e., gene conversion) is inefficient in exchanging the nucleotide sequences between the two.
Tests of selective neutrality: We investigated the compatibility of polymorphism and divergence patterns with selective neutrality by applying statistical tests on the basis of (1) allele frequencies (Tajima 1989; Fu and Li 1993), (2) polymorphism level differences among loci (HKA test) (Hudson et al. 1987), (3) the distribution of synonymous and nonsynonymous polymorphism and divergence (MK test) (McDonald and Kreitman 1991), and (4) the ratio of nonsynonymous:synonymous divergence (Ka/Ks).
Strongly negative (or positive) values of Tajima's D and Fu and Li's D* and F* statistics at individual loci can be interpreted as evidence for positive directional (or balancing) selection. No significant deviation from selective neutrality was detected by these tests for any of the 12 loci (data not shown).
The HKA test is a conservative test of neutrality that asks whether polymorphism levels across loci are compatible with neutral evolution, taking into account possible functional constraint or mutation-rate differences between the loci. This test, and the MK test below, is best applied using divergence data between species (or clades) that are closely related to avoid complications due to multiple substitutions at individual nucleotide sites. In our data, generally large genetic diversities between species and between T- and S-PAI (Dxy > 1, Table 2) suggested that these data are not suitable for this neutrality test. Therefore, we restricted application of the HKA test to the comparison of polymorphism

TABLE 4 Distribution of substitutions between AS and BS

Substitutions Locus No. of codon analyzed Fix-syn. Fix-rep. Pol-syn. Pol-rep. P

avrE[S]   1715   130   485   72   186   0.035
avrF[S]    132    10    35    2     6   1.000
hrpA[S]     68     2     2    0     0   -
hrpZ[S]    314    23    58    3    10   0.755
hrpW[S]    471    36   166    6    54   0.166

The P-value was calculated from the MK test (McDonald and Kreitman 1991) using Fisher’s exact test. None of them was statistically significant after Bonferroni correction for multiple comparisons. Fix-syn., Fix-rep., Pol-syn., and Pol-rep.: fixed synonymous, fixed replacement, polymorphic synonymous, and polymorphic replacement substitutions, respectively.

levels among the five loci in [AS] and [BS]. For these genes, neither the multilocus HKA test (P = 0.86) nor the pairwise HKA test (P = 0.22 [hrpA[S]-hrpW[S]]–0.94 [hrpZ[S]-avrE[S]]) detected departure from selective neutrality.

The ratio of synonymous and replacement substitutions between species (or clades) can be compared with the same ratio for polymorphisms within a species (or a clade) as a test for adaptive protein evolution (McDonald and Kreitman 1991). As above, we restricted the application of this test to protein evolution within and between [AS] and [BS] PAI genes (Table 4). Of the four genes that could be tested (avrE[S], avrF[S], hrpZ[S], hrpW[S]), the MK test detected a significant departure from selective neutrality (in the direction of adaptive protein evolution) for avrE[S] (P = 0.035). The avrF[S] has a shorter coding region and many fewer substitutions than avrE[S] and showed no sign of positive selection between [AS] and [BS] (P = 1.0). Note that the hrpA[S] coding region is very short (68 codons) and is nearly invariant both within and between [AS] and [BS] clades. Thus we could not carry out a MK test for this gene.

To further investigate the distribution of replacement substitutions between [AS] and [BS] in avrE[S], we performed a sliding window analysis (Figure 4). A high level of genetic divergence between [AS] and [BS] was found in the first 1 kb of avrE[S]. Similar distributions of replacement substitutions were found among other comparisons of avrE (between species and between T-PAI and S-PAI) and even between [BS] isolates from different geographic regions (data not shown). A less conserved amino acid sequence of the first half of avrE is reported between P. syringae and Erwinia amylovora (Bogdanove et al. 1998), indicating that the highly variable N-terminal of AvrE protein is common in plant pathogens. Whether the elevated genetic divergence is due to positive selection or an absence of selective constraint in this region is not clear, but Fu and Li’s D* and F* showed significant deviation from the selective neutrality in a window of 392–503 bp (from the start codon, D* = −2.35, F* = −2.46, P < 0.05) when we examined avrE[S] in [BS] with an outgroup from [AS]. This window was one of three windows that showed significant departure from neutrality both by D* and F* (all negative, Figure 4) and no significant departure was detected by these statistics when we examined avrE[S] in [AS] with an outgroup from [BS]. These results indicate a possibility of positive selection on the N-terminal of AvrE.

Although we could not carry out a MK test for hrpA, this gene is highly differentiated between [AT] and Pto

Figure 4. Window plot analysis of nucleotide divergence between [AS] and [BS] in avrE. The genetic divergences in replacement sites (solid line) and in synonymous sites (dashed line) are plotted with relative position of the start position of the coding region in avrE. The window size and step size were 100 and 25 bp, respectively. An asterisk represents a midpoint of the windows that showed statistical significance (P < 0.05) by both Fu and Li’s D* and F* in [BS] with a sequence from [AS] as an outgroup. D* = −2.35 and F* = −2.46 in a window of 392–503 (bp from the start codon), D* = −2.35 and F* = −2.46 in a window of 2493–2592, and D* = −2.41 and F* = −2.77 in a window of 2718–2817.
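As an illustration of the McDonald-Kreitman comparison reported in Table 4, the following Python sketch recomputes two-sided Fisher's exact P-values from the four substitution counts listed for each gene. It assumes that the counts in each row pair up, in the order listed, as the two rows of a 2 × 2 table; because Fisher's exact test is unchanged when a table is transposed, the P-value does not depend on whether those rows are read as fixed vs. polymorphic or as synonymous vs. replacement. The function and variable names are illustrative and are not taken from the software used in the study.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact P-value for the 2 x 2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    p = sum(q for q in map(prob, range(lo, hi + 1)) if q <= p_obs * (1 + 1e-7))
    return min(1.0, p)


# Counts copied from Table 4 in the order listed (assumed pairing, see text).
# hrpA[S] is omitted because it has too few substitutions to test.
table4 = {
    "avrE[S]": (130, 485, 72, 186),
    "avrF[S]": (10, 35, 2, 6),
    "hrpZ[S]": (23, 58, 3, 10),
    "hrpW[S]": (36, 166, 6, 54),
}

p_values = {gene: fisher_exact_two_sided(*counts) for gene, counts in table4.items()}

# Bonferroni correction: multiply each P-value by the number of tests, cap at 1.
bonferroni = {gene: min(1.0, p * len(p_values)) for gene, p in p_values.items()}

for gene in table4:
    print(f"{gene}: P = {p_values[gene]:.3f}, Bonferroni P = {bonferroni[gene]:.3f}")
```

If the assumed pairing is correct, the avrE[S] value should fall close to the reported P = 0.035, and multiplying by the four tests moves it above 0.05, in line with the statement that no locus remains significant after Bonferroni correction.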

TABLE 5

Ka/Ks ratio among PAI genes

Comparison                            shcA    hopPsyA  avrE    avrF    hrpA    hrpZ    hrpW
Within P. viridiflava (orthologous)
  [AS] vs. [BS]                       -       -        0.115   0.041   0.000   0.034   0.052
Within P. viridiflava (paralogous)
  [AT] vs. [AS]                       -       -        0.387   -       0.938   0.345   0.419
  [AT] vs. [BS]                       -       -        0.411   0.112   0.635   0.389   0.386
Between species (orthologous)
  [AT] vs. Pto DC3000                 0.257   0.208    0.157   0.047   0.792   0.219   0.198
  [AS] vs. Pcic 83-1                  -       -        0.295   0.168   0.165   0.148   0.161
  [BS] vs. Pcic 83-1                  -       -        0.309   0.163   0.164   0.121   0.156

Missing data (-) are due either to no sample (no shcA and hopPsyA in [AS] and [BS]) or to failure of the Jukes and Cantor (1969) correction (avrF between [AT] and [AS]).
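The note to Table 5 attributes one missing value to failure of the Jukes and Cantor (1969) correction. A minimal sketch of why the correction can fail: the corrected distance is d = -(3/4) ln(1 - 4p/3), where p is the observed proportion of differing sites, and it is undefined once p reaches 3/4. The function names and the example proportions below are hypothetical and are not values from this study.

```python
import math


def jukes_cantor(p):
    """Jukes-Cantor corrected distance for an observed proportion p of differences.

    Returns None when p >= 0.75, where the logarithm is undefined (saturation).
    """
    if p >= 0.75:
        return None
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def ka_ks(p_replacement, p_synonymous):
    """Ratio of corrected replacement to synonymous distances, or None if undefined."""
    ka = jukes_cantor(p_replacement)
    ks = jukes_cantor(p_synonymous)
    if ka is None or ks is None or ks == 0:
        return None
    return ka / ks


# Hypothetical proportions: moderate divergence gives a ratio,
# saturated synonymous sites do not.
print(ka_ks(0.05, 0.30))
print(ka_ks(0.05, 0.80))
```

Under this sketch, a comparison is reported as missing whenever either class of sites is saturated, which is one way a single cell of a Ka/Ks table can be empty while its neighbors are filled.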

DC3000, and between T- and S-PAI (Table 2). The Ka/Ks ratio of hrpA in these comparisons is also moderately high (see below).

The leucine-rich repeat domain of R proteins in A. thaliana and other plant species, which is generally believed to encode protein-recognition sites, can evolve rapidly, with nonsynonymous:synonymous substitution ratios (Ka/Ks) exceeding one (reviewed in Bergelson et al. 2001). To examine whether Avr proteins have similarly fast rates of evolution, we calculated the Ka/Ks ratios for the seven genes (Table 5). None of them produced Ka/Ks > 1. The Ka/Ks ratio in hrpA was moderately high between [AT] and Pto DC3000 (Ka/Ks = 0.79) and between T-PAI and S-PAI (Ka/Ks = 0.64–0.94). Although there remains a possibility that only a part of the coding sequence is subject to positive selection and hence has Ka/Ks > 1, window plot analyses for the Ka/Ks ratio generally did not detect regional Ka/Ks > 1 in the seven genes. A single exception was 600 bp at the 5′ end of avrE, which showed high Ka/Ks between species (e.g., between [AS] and P. cichorii 83-1, Ka/Ks = 0.3–4.2, average 1.2).

DISCUSSION

The main goal of this study was to explore how genes involved in pathogenicity have evolved on two PAIs that have already been determined to be under balancing selection in P. viridiflava. Three principal results emerged from our analyses: (1) the levels of genetic variation varied among the PAI genes, but were in a range of genetic variation observed in the background, (2) substantial numbers of recombination events were detected within each PAI, and (3) there was no statistical evidence for positive selection in most of the PAI genes, with the exception of avrE. Our results, considered jointly with those previously reported in Araki et al. (2006), suggest that two different levels of selection need to be considered. At the PAI level as a whole, two distinct systems of presumably coevolved loci (T- and S-PAI) are maintained by balancing selection, most likely because of their virulence differences and host specificities (Araki et al. 2006). On the other hand, the extensive reshuffling of polymorphism within each PAI by homologous recombination indicates that the entire PAI is not evolving as a single linkage block in P. viridiflava. Rather, there appears to be considerable independence in the genealogical histories of individual loci and ample opportunity for simultaneous adaptive evolution of constituent PAI genes. That the components of the enterocyte effacement island in Escherichia coli are, at best, weakly coupled in their evolution, has similarly been reported (Castillo et al. 2005).

Functional significance of the PAI genes: The functional significance of the genes investigated is largely unknown in P. viridiflava. However, several lines of evidence indicate that the PAI genes investigated here are functional. First, no frameshift or null mutations were observed in any of the coding sequences of these genes in any of the sample of 96 isolates. Second, nonsynonymous rates of evolution are uniformly lower than the corresponding synonymous rates across all within- and between-species comparisons, and are therefore entirely compatible with selective constraints on amino acid substitutions. Together, the lack of frameshifts and the relatively low nonsynonymous-substitution rates are solid indicators of protein functionality.

In addition, a TTSS-specific promoter domain, the hrp box, was found to be conserved among the PAIs. The hrp box was originally defined as GGAACC-N(15 or 16)-CCACNNA in P. syringae (Xiao and Hutcheson 1994). The hrp box has been found in the 5′-flanking regions of many TTSS-related genes in a broad range of plant pathogens, allowing some redundancies (Arnold et al. 2001; Rantakari et al. 2001). In P. viridiflava, we identified 12 hrp boxes in the T-PAI and 9 in the S-PAI (supplemental Table 2 at http://www.genetics.org/supplemental/). Hrp-box sequences were identified for 7 of the 12 loci (avrE[T], avrE[S], avrF[S], hrpA[T], hrpA[S], hrpW[T], and hrpW[S]) in all the isolates investigated in P. viridiflava. For the other 5 loci (avrF[T], hrpZ[T], hrpZ[S], shcA[T], and hopPsyA[T]), we did not find hrp boxes, but these genes may belong to operons in which the promoters are located some distance from individual genes (as reported in P. syringae; Alfano et al. 2000). The identified hrp boxes (supplemental Table 2 at http://www.genetics.org/supplemental/) show that the hrp box is relatively well conserved among genes and lineages, in contrast to the high levels of genetic divergence in the coding regions between the paralogous PAIs. This selective constraint on promoter sequences is another good indicator of functional expression of these genes.

Finally, we have shown that both T-PAI isolates and S-PAI isolates are capable of producing pili when grown on inducing medium and induce the salicylic-acid-mediated host defense pathway when grown in Arabidopsis (Jakob et al. 2007). While not definitive proof, this does suggest that the type-III secretion system in these isolates is operational in this species.

Evolution of PAI genes in P. viridiflava: We found a potential HGT of hopPsyA and shcA between P. syringae and P. viridiflava. We cannot determine a specific route of the HGT by this genealogy alone because there are many possible ways to end up with the observed genealogy (Figure 2A). This is especially true if we take possible HGTs from uninvestigated species into consideration. If we assume a single HGT between P. viridiflava and P. syringae, however, Figure 2A suggests that the HGT did not occur very recently because the level of genetic divergence in these genes is high even between P. viridiflava and P. syringae group X (D[P. viridiflava-group X] = 0.227). The extensive variation in the gene repertoire in the EEL region in P. syringae (Alfano et al. 2000; Charity et al. 2003; Deng et al. 2003), which contrasts with the uniform EEL gene composition in P. viridiflava, suggests that HGTs are more likely to occur in P. syringae than in P. viridiflava. It is still possible that the [AT] isolates acquired the observed EEL region relatively recently from an unknown species, but the moderate levels of genetic variation in hopPsyA and shcA, comparable to those in the other genes among [AT] (Table 1), suggest the evolutionary stability of hopPsyA and shcA in P. viridiflava. More importantly, the genealogy of hrpK (Figure 2B) suggests no influence of HGT on the hrp/hrc cluster. Thus, the extent of HGT in P. viridiflava, if any, must be limited to the EEL region.

We found a sign of positive selection in avrE, which is one of two Avr genes investigated in this study. One possible cause of selection on this gene is its virulence function, suggested by Venisse et al. (2003) and Debroy et al. (2004). According to Debroy et al. (2004), avrE in P. syringae functions as a suppressor of salicylic-acid-mediated basal defense, and amino acid changes in AvrE might affect this function. Our window plot analysis indicates that the N-terminal region of AvrE is a good candidate for the target of natural selection.

hrpA[T] and hrpA[S] were nearly invariant within each locus but highly divergent between these paralogous loci (Tables 1 and 2). Within each locus, the low levels of polymorphism in hrpA indicate strong selective or mutational constraints. Strong selective constraints for conserving the protein structure of HrpA might be reasonable, considering that hrpA encodes a pilus that functions as a conduit for Avr protein delivery (Jin and He 2001). However, hrpA in P. syringae is reported to be under diversifying selection (Guttman et al. 2006), suggesting that modifications in protein structure of HrpA can be adaptive in Pseudomonas. Indeed, unlike within each locus, hrpA is highly divergent between hrpA[T] and hrpA[S] and between species (Table 2). While our data did not allow us to effectively test the selective neutrality of hrpA, this pattern of genetic variation in hrpA (nearly monomorphic within each locus and highly divergent between loci) may be caused by positive selection for functional differentiation of HrpA between loci, followed by negative selection for conserving HrpA within each locus in P. viridiflava.

Evolutionary significance of phylogenetically stable PAIs: Comparable levels of polymorphism in the PAI genes and the extensive shuffling of variation among orthologous genes are hallmarks of vertical gene transmission. Horizontal transmission of foreign DNA into the PAI would introduce linked mutations on the allele in which the foreign sequence landed, but such an event would produce only one new haplotype. This would not be counted as a recombination event in the Hudson and Kaplan (1985) method, which requires the presence of all four possible combinations of two segregating sites (four haplotypes) to identify a recombination event. Thus, the results reported here are very consistent with the vertical gene transmission of the PAI genes in this species, as was reported in Araki et al. (2006).

While there is sound evidence that the evolution of pathogenesis involves the acquisition of virulence-related genes (including entire PAIs) via HGT (Reid et al. 2000; Bukhalid et al. 2002), some studies reported that PAIs have been retained within pathogen species over evolutionary time (e.g., Escobar-Paramo et al. 2003; Gressmann et al. 2005; Rohmer et al. 2004; Nallapareddy et al. 2005). We do not know why some PAIs are evolutionarily stable whereas others are not, but our results provide one possible explanation: Unlike a PAI acquired via recent HGT, an evolutionarily stable PAI can work as a large reservoir of preexisting genetic variation in this highly polymorphic species. It guarantees an immediate response to selection because the population does not have to wait for the next occurrence of a potentially adaptive mutation. Thus, the stable and polymorphic PAI may be favorable when rapid adaptive response is required. In addition, large population size (evidenced by the large standing crop of single nucleotide polymorphism) also increases the chance occurrence of a novel adaptive mutation in the population.
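The four-haplotype criterion invoked above for the Hudson and Kaplan (1985) method can be sketched in Python as follows. This is only the pairwise four-gamete check, not the full estimate of the minimum number of recombination events, and the function name and toy sequences are hypothetical rather than data from this study.

```python
from itertools import combinations


def four_gamete_pairs(haplotypes):
    """Return pairs of site indices at which all four two-site haplotypes occur.

    `haplotypes` is a list of equal-length aligned sequences. Only sites with
    exactly two alleles in the sample are considered.
    """
    n_sites = len(haplotypes[0])
    biallelic = [i for i in range(n_sites)
                 if len({h[i] for h in haplotypes}) == 2]
    return [(i, j) for i, j in combinations(biallelic, 2)
            if len({(h[i], h[j]) for h in haplotypes}) == 4]


# Toy sample in which two sites are found in all four combinations.
recombinant_sample = ["AC", "AT", "GC", "GT"]
print(four_gamete_pairs(recombinant_sample))

# Toy sample in which one divergent haplotype ("TGCA") carries several linked
# differences, as an imported tract would: no pair of sites shows four gametes.
imported_sample = ["ACGT", "ACGT", "ACGA", "TGCA"]
print(four_gamete_pairs(imported_sample))
```

The second sample illustrates the argument in the text: a single introduced haplotype adds linked variants but never completes the set of four combinations, so it does not register as recombination under this criterion.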

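Returning to the hrp box consensus cited in the Discussion (GGAACC, a spacer of 15 or 16 nucleotides, then CCACNNA; Xiao and Hutcheson 1994), a search for exact matches to that consensus can be sketched in Python as below. The sketch assumes exact matching only, whereas the text notes that some redundancy is allowed in practice; the function name and the test sequence are hypothetical and this is not the procedure used to build supplemental Table 2.

```python
import re

# Exact consensus: GGAACC, 15 or 16 arbitrary bases, then CCAC, two bases, A.
HRP_BOX = re.compile(r"GGAACC[ACGT]{15,16}CCAC[ACGT]{2}A")


def find_hrp_boxes(sequence):
    """Return (start, matched_text) for each non-overlapping exact hrp-box match."""
    seq = sequence.upper()
    return [(m.start(), m.group()) for m in HRP_BOX.finditer(seq)]


# Hypothetical upstream region containing one box with a 15-nucleotide spacer.
upstream = "ttt" + "GGAACC" + "A" * 15 + "CCACGTA" + "tt"
print(find_hrp_boxes(upstream))
```

Allowing mismatches would require a scoring scheme (for example, a position weight matrix), which is beyond this sketch.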
We thank D. Begun and two anonymous reviewers for helpful suggestions. This work was supported by NIH grant GM-62504 to J.B. and M.K. and by a fellowship from the Japan Society for the Promotion of Science to H.A.

LITERATURE CITED

Alfano, J. R., and A. Collmer, 1997 The type III (Hrp) secretion pathway of plant pathogenic bacteria: trafficking harpins, Avr proteins, and death. J. Bacteriol. 179: 5655–5662.
Alfano, J. R., A. O. Charkowski, W. L. Deng, J. L. Badel, T. Petnicki-Ocwieja et al., 2000 The Pseudomonas syringae Hrp pathogenicity island has a tripartite mosaic structure composed of a cluster of type III secretion genes bounded by exchangeable effector and conserved effector loci that contribute to parasitic fitness and pathogenicity in plants. Proc. Natl. Acad. Sci. USA 97: 4856–4861.
Arabidopsis Genome Initiative, 2000 Analysis of the genome sequence of the flowering plant Arabidopsis thaliana. Nature 408: 796–815.
Araki, H., D. Tian, E. M. Goss, K. Jakob, S. S. Halldorsdottir et al., 2006 Presence/absence polymorphism for alternative pathogenicity islands in Pseudomonas viridiflava, a pathogen of Arabidopsis. Proc. Natl. Acad. Sci. USA 103: 5887–5892.
Arnold, D. L., R. W. Jackson, A. J. Fillingham, S. C. Goss, J. D. Taylor et al., 2001 Highly conserved sequences flank avirulence genes: isolation of novel avirulence genes from Pseudomonas syringae pv. pisi. Microbiology 147: 1171–1182.
Bakker, E. G., C. Toomajian, M. Kreitman and J. Bergelson, 2006 A genome-wide survey of R gene polymorphisms in Arabidopsis. Plant Cell 18: 1803–1818.
Bergelson, J., M. Kreitman, E. A. Stahl and D. Tian, 2001 Evolutionary dynamics of plant R-genes. Science 292: 2281–2285.
Bogdanove, A. J., J. F. Kim, Z. Wei, P. Kolchinsky, A. O. Charkowski et al., 1998 Homology and functional similarity of an hrp-linked pathogenicity locus, dspEF, of Erwinia amylovora and the avirulence locus avrE of Pseudomonas syringae pathovar tomato. Proc. Natl. Acad. Sci. USA 95: 1325–1330.
Bukhalid, R. A., T. Takeuchi, D. Labeda and R. Loria, 2002 Horizontal transfer of the plant virulence gene, nec1, and flanking sequences among genetically distinct Streptomyces strains in the diastatochromogenes cluster. Appl. Environ. Microbiol. 68: 738–744.
Castillo, A., L. Equiarte and V. Souza, 2005 A genomic population genetics analysis of the pathogenic enterocyte effacement island in Escherichia coli: the search for the unit of selection. Proc. Natl. Acad. Sci. USA 102: 1542–1547.
Charity, J. C., K. Pak, C. F. Delwiche and S. W. Hutcheson, 2003 Novel exchangeable effector loci associated with the Pseudomonas syringae hrp pathogenicity island: evidence for integron-like assembly from transposed gene cassettes. Mol. Plant-Microbe Interact. 16: 495–507.
Charkowski, A. O., J. R. Alfano, G. Preston, J. Yuan, S. Y. He et al., 1998 The Pseudomonas syringae pv. tomato HrpW protein has domains similar to harpins and pectate lyases and can elicit the plant hypersensitive response and bind to pectate. J. Bacteriol. 180: 5211–5217.
Dangl, J. L., and J. D. Jones, 2001 Plant pathogens and integrated defense responses to infection. Nature 411: 826–833.
Dawkins, R., and J. R. Krebs, 1979 Arms race between and within species. Proc. R. Soc. Lond. Ser. B Biol. Sci. 205: 489–511.
Debroy, S., R. Thilmony, Y. B. Kwack, K. Nomura and S. Y. He, 2004 A family of conserved bacterial effectors inhibits salicylic acid-mediated basal immunity and promotes disease necrosis in plants. Proc. Natl. Acad. Sci. USA 101: 9927–9932.
Deng, W. L., A. H. Rehm, A. O. Charkowski, C. M. Rojas and A. Collmer, 2003 Pseudomonas syringae exchangeable effector loci: sequence diversity in representative pathovars and virulence function in P. syringae pv. syringae B728a. J. Bacteriol. 185: 2592–2602.
Dodds, P. N., G. J. Lawrence, A. M. Catanzariti, T. Teh, C. I. Wang et al., 2006 Direct protein interaction underlies gene-for-gene specificity and coevolution of the flax resistance genes and flax rust avirulence genes. Proc. Natl. Acad. Sci. USA 103: 8888–8893.
Escobar-Paramo, P., C. Giudicelli, C. Parsot and E. Denamur, 2003 The evolutionary history of shigella and enteroinvasive Escherichia coli revisited. J. Mol. Evol. 57: 140–148.
Flor, H. H., 1971 Current status of the gene-for-gene concept. Annu. Rev. Phytopathol. 9: 275–296.
Fu, Y.-X., and W.-H. Li, 1993 Statistical tests of neutrality of mutations. Genetics 133: 693–709.
Goss, E. M., M. E. Kreitman and J. Bergelson, 2005 Genetic diversity, recombination, and cryptic clades in Pseudomonas viridiflava infecting natural populations of Arabidopsis thaliana. Genetics 169: 21–35.
Gressmann, H., B. Linz, R. Ghai, K. P. Pleissner, R. Schlapbach et al., 2005 Gain and loss of multiple genes during the evolution of Helicobacter pylori. PLoS Genet. 1: 419–428.
Guttman, D. S., S. J. Gropp, R. L. Morgan and P. W. Wang, 2006 Diversifying selection drives the evolution of the type III secretion system pilus of Pseudomonas syringae. Mol. Biol. Evol. 23: 2342–2354.
Ham, J. H., D. R. Majerczak, A. S. Arroyo-Rodriguez, D. M. Mackey and D. L. Coplin, 2006 WtsE, an AvrE-family effector protein from Pantoea stewartii subsp. stewartii, causes disease-associated cell death in corn and requires a chaperone protein for stability. Mol. Plant-Microbe Interact. 19: 1092–1102.
He, S. Y., H. C. Huang and A. Collmer, 1993 Pseudomonas syringae pv. syringae harpinPss: a protein that is secreted via the Hrp pathway and elicits the hypersensitive response in plants. Cell 73: 1255–1266.
Huang, H. C., R. H. Lin, C. J. Chang, A. Collmer and W. L. Deng, 1995 The complete hrp gene cluster of Pseudomonas syringae pv. syringae 61 includes two blocks of genes required for harpinPss secretion that are arranged colinearly with Yersinia ysc homologs. Mol. Plant-Microbe Interact. 8: 733–746.
Hudson, R. R., and N. L. Kaplan, 1985 Statistical properties of the number of recombination events in the history of a sample of DNA sequences. Genetics 111: 147–164.
Hudson, R. R., M. Kreitman and M. Aguadé, 1987 A test of neutral molecular evolution based on nucleotide data. Genetics 116: 153–159.
Innan, H., and F. Tajima, 1997 The amounts of nucleotide variation within and between allelic classes and the reconstruction of the common ancestral sequence in a population. Genetics 147: 1431–1444.
Jakob, K., E. M. Goss, H. Araki, T. Van, M. Kreitman et al., 2002 Pseudomonas viridiflava and P. syringae–natural pathogens of Arabidopsis thaliana. Mol. Plant-Microbe Interact. 15: 1195–1203.
Jakob, K., J. Kniskern and J. Bergelson, 2007 The role of pectate lyase and the jasmonic acid defense response in Pseudomonas viridiflava virulence. Mol. Plant-Microbe Interact. 20: 146–158.
Jin, Q., and S. Y. He, 2001 Role of the Hrp pilus in type III protein secretion in Pseudomonas syringae. Science 294: 2556–2558.
Jukes, T. H., and C. R. Cantor, 1969 Evolution of protein molecules, pp. 21–132 in Mammalian Protein Metabolism. Academic Press, New York.
Kimura, M., 1980 A simple method for estimating evolutionary rates of base substitutions through comparative studies of nucleotide sequences. J. Mol. Evol. 16: 111–120.
Kumar, S., K. Tamura, I. B. Jakobsen and M. Nei, 2001 MEGA2: molecular evolutionary genetics analysis software. Bioinformatics 17: 1244–1245.
Liu, Z. Y., J. I. Bos, M. Armstrong, S. C. Whisson, L. Da Cunha et al., 2005 Patterns of diversifying selection in the phytotoxin-like scr74 gene family of Phytophthora infestans. Mol. Biol. Evol. 22: 659–672.
Lorang, J. M., and N. T. Keen, 1995 Characterization of avrE from Pseudomonas syringae pv. tomato: a hrp-linked avirulence locus consisting of at least two transcriptional units. Mol. Plant-Microbe Interact. 8: 49–57.
McDonald, J. H., and M. Kreitman, 1991 Adaptive protein evolution at the Adh locus in Drosophila. Nature 351: 652–654.
Myers, S. R., and R. C. Griffiths, 2003 Bounds on the minimum number of recombination events in a sample history. Genetics 163: 375–394.
Meyers, B. C., A. Kozik, A. Griego, H. Kuang and R. W. Michelmore, 2003 Genome-wide analysis of NBS-LRR-encoding genes in Arabidopsis. Plant Cell 15: 809–834.
Nallapareddy, S. R., W. X. Huang, G. M. Weinstock and B. E. Murray, 2005 Molecular characterization of a widespread, pathogenic, and antibiotic resistance-receptive Enterococcus faecalis lineage and dissemination of its putative pathogenicity islands. J. Bacteriol. 187: 5709–5718.
Nei, M., 1987 Molecular Evolutionary Genetics. Columbia University Press, New York.
Preston, G., H. C. Huang, S. Y. He and A. Collmer, 1995 The HrpZ proteins of Pseudomonas syringae pvs. syringae, glycinea, and tomato are encoded by an operon containing Yersinia ysc homologs and elicit the hypersensitive response in tomato but not soybean. Mol. Plant-Microbe Interact. 8: 717–732.
Qiu, W. G., S. E. Schutzer, J. F. Bruno, O. Attie, X. Yun et al., 2004 Genetic exchange and plasmid transfer in Borrelia burgdorferi sensu stricto revealed by three-way genome comparisons and multilocus sequence typing. Proc. Natl. Acad. Sci. USA 101: 14150–14155.
Rantakari, A., O. Virtaharju, S. Vahamiko, S. Taira, E. T. Palva et al., 2001 Type III secretion contributes to the pathogenesis of the soft-rot pathogen Erwinia carotovora: partial characterization of the hrp gene cluster. Mol. Plant-Microbe Interact. 14: 962–968.
Reid, S. D., C. J. Herbelin, A. C. Bumbaugh, R. K. Selander and T. S. Whittam, 2000 Parallel evolution of virulence in pathogenic Escherichia coli. Nature 406: 64–67.
Rohmer, L., D. S. Guttman and J. L. Dangl, 2004 Diverse evolutionary mechanisms shape the type III effector virulence factor repertoire in the plant pathogen Pseudomonas syringae. Genetics 167: 1341–1360.
Rozas, J., J. C. Sanchez-Delbarrio, X. Messeguer and R. Rozas, 2003 DnaSP, DNA polymorphism analyses by the coalescent and other methods. Bioinformatics 19: 2496–2497.
Saitou, N., and M. Nei, 1987 The neighbor-joining method: a new method for reconstructing phylogenetic trees. Mol. Biol. Evol. 4: 406–425.
Sawada, H., F. Suzuki, I. Matsuda and N. Saitou, 1999 Phylogenetic analysis of Pseudomonas syringae pathovars suggests the horizontal gene transfer of argK and the evolutionary stability of hrp gene cluster. J. Mol. Evol. 49: 627–644.
Sawyer, S. A., 1989 Statistical tests for detecting gene conversion. Mol. Biol. Evol. 6: 526–538.
Shen, J., H. Araki, X. Sun, J.-Q. Chen and D. Tian, 2006 Unique evolutionary mechanism in R-genes under the presence/absence polymorphism in Arabidopsis thaliana. Genetics 172: 1243–1250.
Stahl, E. A., G. Dwyer, R. Mauricio, M. Kreitman and J. Bergelson, 1999 Dynamics of disease resistance polymorphism at the Rpm1 locus of Arabidopsis. Nature 400: 667–671.
Tajima, F., 1993 Simple methods for testing the molecular evolutionary clock hypothesis. Genetics 135: 599–607.
Tajima, F., 1989 Statistical method for testing the neutral mutation hypothesis by DNA polymorphism. Genetics 123: 585–595.
Thompson, J. D., T. J. Gibson, F. Plewniak, F. Jeanmougin and D. G. Higgins, 1997 The CLUSTAL X windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Res. 25: 4876–4882.
Tian, D., H. Araki, E. Stahl, J. Bergelson and M. Kreitman, 2002 Signature of balancing selection in Arabidopsis. Proc. Natl. Acad. Sci. USA 99: 11525–11530.
Van Dijk, K., V. C. Tam, A. R. Records, T. Petnicki-Ocwieja and J. R. Alfano, 2002 The ShcA protein is a molecular chaperone that assists in the secretion of the HopPsyA effector from the type III (Hrp) protein secretion system of Pseudomonas syringae. Mol. Microbiol. 44: 1469–1481.
Venisse, J. S., M. A. Barny, J. P. Paulin and M. N. Brisset, 2003 Involvement of three pathogenicity factors of Erwinia amylovora in the oxidative stress associated with compatible interaction in pear. FEBS Lett. 537: 198–202.
Ward, T. J., J. P. Bielawsku, H. C. Kisstler, E. Sullivan and K. O’Donnell, 2002 Ancestral polymorphism and adaptive evolution in the trichothecene mycotoxin gene cluster of phytopathogenic Fusarium. Proc. Natl. Acad. Sci. USA 99: 9278–9283.
Wei, Z. M., R. J. Laby, C. H. Zumoff, D. W. Bauer, S. Y. He et al., 1992 Harpin, elicitor of the hypersensitive response produced by the plant pathogen Erwinia amylovora. Science 257: 85–88.
Xiao, Y., and S. W. Hutcheson, 1994 A single promoter sequence recognized by a newly identified alternate sigma factor directs expression of pathogenicity and host range determinants in Pseudomonas syringae. J. Bacteriol. 176: 3089–3091.

Communicating editor: D. Begun