The crystal structure of Pseudomonas avirulence protein AvrPphB: A -like fold with a distinct substrate-

Minfeng Zhu*, Feng Shao†, Roger W. Innes‡, Jack E. Dixon†, and Zhaohui Xu*§

*Department of Biological Chemistry, Medical School and Life Sciences Institute, University of Michigan, Ann Arbor, MI 48109; †Departments of Pharmacology and Cellular and Molecular Medicine, School of Medicine, and Department of Chemistry and Biochemistry, University of California at San Diego, La Jolla, CA 92093; and ‡Department of Biology, Indiana University, Bloomington, IN 47405

Contributed by Jack E. Dixon, October 9, 2003 AvrPphB is an avirulence (Avr) protein from the plant pathogen ification of the host target(s) also contributes to pathogen Pseudomonas syringae that can trigger a disease-resistance re- virulence and promotes disease in susceptible hosts that lack the sponse in a number of host plants including Arabidopsis. AvrPphB corresponding R (1, 12). Therefore, it is expected that belongs to a novel family of cysteine with the charter biochemical studies of Avr proteins, aimed at identifying their member of this family being the Yersinia effector protein YopT. host targets and revealing the nature of target modification, will AvrPphB has a very stringent substrate specificity, catalyzing a facilitate our understanding of the molecular mechanisms con- single proteolytic cleavage in the Arabidopsis serine͞threonine ferring plant disease resistance and pathogen virulence. kinase PBS1. We have determined the crystal structure of AvrPphB We have used the AvrPphB protein from the plant pathogen by x-ray crystallography at 1.35-Å resolution. The structure is Pseudomonas syringae as a model to understand disease- composed of a central antiparallel ␤-sheet, with ␣-helices packing resistance mechanisms. AvrPphB is an effector protein secreted on both sides of the sheet to form a two-lobe structure. The core by the Pseudomonas type III secretion system into the plant host of this structure resembles the papain-like cysteine proteases. The (13, 14). AvrPphB induces the HR in a number of host plants, similarity includes the AvrPphB of Cys-98, including Arabidopsis (15). The PBS1 and RPS5 of Ara- His-212, and Asp-227 and the residue Asn-93. Based bidopsis are specifically required for AvrPphB-induced HR on analogy with inhibitor complexes of the papain-like proteases, (16–18). RPS5 is a member of the NB-LRR class of plant R we propose a model for the substrate-binding mechanism of proteins, and PBS1 is a protein serine͞threonine kinase. Our AvrPphB. A deep and positively charged pocket (S2) and a neigh- biochemical studies have established that AvrPphB belongs to an boring shallow surface (S3) likely bind to aspartic acid and glycine additional family of cysteine proteases (15). The charter member residues in the substrate located two (P2) and three (P3) residues of this family is the virulence factor YopT from the pathogenic N terminal to the cleavage site, respectively. Further implications Yersinia species. about the specificity of plant pathogen recognition are also AvrPphB undergoes an autoproteolytic processing event, discussed. cleaving itself between Lys-62 and Gly-63 (15, 19). The catalytic triad in AvrPphB (Cys-98, His-212, and Asp-227) is required for lants activate a highly coordinated disease-resistance re- autoproteolytic cleavage and AvrPphB-induced disease resis- Psponse upon recognition of a pathogen. Failure of this tance (15). Recently, we have shown that PBS1 is a proteolytic recognition event or suppression of the resistance response by substrate for AvrPphB (10). The cleavage of PBS1 appears to be the pathogen results in plant disease. Pathogen recognition required for AvrPphB͞RPS5-mediated disease resistance. requires the simultaneous presence of an avirulence (Avr) gene Therefore, we propose that AvrPphB is recognized indirectly via product from the pathogen and a cognate plant resistance (R) its enzymatic activity (cleavage of PBS1) by the RPS5 protein in protein (1). The genetic interaction between an R gene and an Arabidopsis (10). This could represent a widespread resistance Avr gene leads to the activation of plant disease-resistance mechanism, because several Avr proteins are now thought to signaling cascades, culminating in the plant hypersensitive re- function as specific proteases (10, 15, 20–26). sponse (HR) (1). The HR refers to the rapid, localized pro- To understand disease-resistance mechanisms at the molec- grammed cell death response at the site of pathogen attack that ular level and provide a rationale for engineering this process for correlates with the restriction of pathogen growth. The predom- the benefit of crop protection, a structural understanding of the inant class of R proteins is known as NB-LRR proteins, because proteins mediating pathogen recognition is needed. In the they contain a nucleotide-binding site followed by leucine-rich present study, we determined the crystal structure of the auto- repeats (1, 2). Interestingly, the domain architecture of the plant processed form of AvrPphB. The overall structure of AvrPphB, NB-LRR proteins resembles that of the mammalian Nod pro- along with a detailed analysis of its catalytic site, is described. teins, which also play a role in the innate immune response (3). Our work confirms the previous prediction that AvrPphB be- The presence of Ϸ175 NB-LRR proteins encoded in the Arabi- longs to the papain superfamily of cysteine proteases, with the dopsis genome is believed to provide the specificity and diversity structurally conserved active site catalytic triad and the oxyanion required for plant pathogen recognition (4, 5). hole (15). Comparison of the AvrPphB structure with other Extensive genetic studies in the past 10 years have identified cysteine family members provides insight into the Ͼ30 pairs of Avr-R genes (2). The biochemical mechanisms interactions that most likely occur between AvrPphB and its underlying the Avr-R genetic interaction are just beginning to be substrate, PBS1. unraveled (6). The ‘‘gene-for-gene’’ interaction model proposed Ͼ30 years ago by Flor suggested that Avr proteins act as ligands and R proteins serve as the corresponding receptors (7). This is Abbreviation: Avr, avirulence. likely to be an oversimplified model. Instead, emerging biochem- Data deposition: The atomic coordinates and structure factors for AvrPphB have been ical evidence suggests that Avr proteins specifically modify host deposited in the Protein Data Bank, www.rcsb.org (PDB ID code 1UKF). targets, thus generating signals detected by R proteins in resis- §To whom correspondence should be addressed. E-mail: [email protected]. tant plants (8–11). According to the ‘‘guard’’ hypothesis, mod- © 2003 by The National Academy of Sciences of the USA

302–307 ͉ PNAS ͉ January 6, 2004 ͉ vol. 101 ͉ no. 1 www.pnas.org͞cgi͞doi͞10.1073͞pnas.2036536100 Downloaded by guest on October 2, 2021 Materials and Methods Protein Expression and Purification. Full-length AvrPphB from P. syringae pv. phaseolicola was cloned into the NdeI͞SalI sites of pET21a (Novagen) and transformed into Escherichia coli strain BL21 (DE3) to be expressed as a recombinant protein with a hexahistidine tag at its C terminus. The E. coli strain harboring this plasmid was grown in M9 minimal media and induced by 0.4 mM isopropyl ␤-D-thiogalactoside at 20°C for 20 h. AvrPphB was purified by Ni2ϩ-NTA affinity chromatography (Qiagen, Chats- worth, CA) with a subsequent size-exclusion Superdex 200 column (Amersham Pharmacia). A selenomethionine form of the protein was expressed and purified following the same Fig. 1. A stereo ribbon diagram of AvrPphB. ␣-Helices are drawn as magenta protocol with E. coli strain B834 (DE3). coils and labeled A–F; ␤-strands are drawn as green arrows and labeled 1–7; and other structural elements are drawn as thick gray lines. The catalytic triad Crystallization and Data Collection. Purified proteins were buffer- residues at the active site (Cys-98, His-212, and Asp-227) and the exchanged into 20 mM Tris⅐HCl (pH 7.5), 10 mM NaCl, and 1 residue forming the oxyanion hole (Asn-93) are drawn as ball-and-stick mod- mM Tri(2-carboxyethyl)phosphine hydrochloride by using mi- els. Image was prepared with the program MOLSCRIPT (51). crofiltration (Centricon-10, Amicon) and further concentrated to 17 mg͞ml. Both AvrPphB and selenomethionine AvrPphB cules. The final model includes residues 81–267 and a total of 209 crystals were grown by the sitting drop vapor diffusion method water molecules. under the optimal condition of 1.95 M (NH4)2SO4, 100 mM Pipes (pH 6.7) at room temperature. Crystals appeared 3–5 days after Results and Discussion initial set-up and grew to optimal size (0.2 ϫ 0.3 ϫ 0.3 mm3)in Overall Structure of AvrPphB. AvrPphB overexpressed and purified 7–10 days. Crystals were transferred in three steps into a from E. coli undergoes autoproteolytic processing (15, 19), thus cryo-protecting solution containing 2.1 M (NH4)2SO4, 100 mM ͞ our purified AvrPphB protein was 28 kDa and began with Pipes (pH 6.7), and 30% (wt vol) glucose and then flash-cooled Gly-63, a potential myristoylation site (15, 31). The first 18 with liquid nitrogen. Cryo-protected crystals were used for data residues (63–80) as well as the C-terminal hexa-histidine tag MICROBIOLOGY collection under a nitrogen stream at 110 K. Both native and were not visible in the electron density map, and thus were not four-wavelength multiple-wavelength anomalous diffraction included in our structure. The structure was determined to data were collected at the Structural Biology Center Beamline near-atomic resolution of 1.35 Å with an R factor of 20.3% and ID-19 at the Advanced Photon Source and processed with the an Rfree of 21.8%. The rms deviations from ideal values in bond HKL2000 package (27). Crystals belong to the orthorhombic space ϭ ϭ lengths and bond angles are 0.004 Å and 1.3°, respectively. All of group P212121 with unit cell constants a 43.6 Å, b 49.6 Å, ϭ the residues are located in favored regions of the Ramachandran c 75.7 Å (Table 1, which is published as supporting informa- plot. tion on the PNAS web site). There is one AvrPphB molecule in The structure of AvrPphB has an ␣͞␤ fold, with seven the crystallographic asymmetric unit. ␤-strands (␤1–␤7) and six ␣-helices (␣A–␣F) (Fig. 1). It is organized around a central seven-stranded antiparallel ␤-sheet, Structure Determination and Refinement. Experimental phases with ␣-helices packing on both sides of the sheet to form a were obtained by the multiple anomalous diffraction method two-lobe structure. The top lobe, as viewed in Fig. 1, contains the (28) (Table 2, which is published as supporting information on ␤-sheet and two helices. The order of the strands in the sheet the PNAS web site). All structure determination and refinement follows ␤1, ␤6, ␤5, ␤4, ␤3, ␤7, and ␤2. His-212 and Asp-227, procedures were carried out with the CNS program (29) and which have both been implicated in catalysis, are located at the model building used the O program (30). Two selenium sites N-terminal and C-terminal ends of strand ␤4 and strand ␤5, expected for one AvrPphB molecule were found by using the respectively. The bottom lobe consists of helices ␣A–␣D. The Patterson method. Subsequently, heavy atom parameters were active site nucleophile Cys-98 is located at the N terminus of refined, and multiple-wavelength anomalous diffraction phases helix ␣A. Right above the cysteine slightly to the left in the top were calculated and improved by solvent flipping to 1.85-Å lobe is His-212. A potential substrate-binding site of AvrPphB is resolution. The high quality of the experimental electron density located between the molecule’s two lobes, within a crevice that maps and the known selenium sites allowed unambiguous tracing is open and shallow on the left side but narrow and deep on the of the protein main chain and most of the side chains. A random right side. As discussed below, the catalytic nucleophile Cys-98, selection of 10% of the reflections was set aside as the test set the general base His-212, and Asp-227 form a catalytic triad that, for cross-validation during the refinement (Table 3, which is along with other structural features, resembles the well-known published as supporting information on the PNAS web site). superfamily of papain-like cysteine proteases. Initial refinement consisted of several iterations of conjugated The catalytic cysteine residue is solvent exposed and appears gradient minimization, torsion angle dynamics-simulated an- to have extra electron density surrounding its sulfur atom. The nealing by the maximum-likelihood target function with exper- extra density could correspond to sulfinic acid, which results imental phases as a prior phase distribution, grouped B-factor from oxidation of the catalytic cysteine. This is possibly caused refinement, and model rebuilding (29). These were performed by the higher reactivity and corresponding lowered pKa for against the native data set initially at 2.0-Å resolution and Cys-98, because the other two cysteines in the AvrPphB structure gradually extended to 1.35 Å. Later rounds of refinement also are free of such modification. This finding was confirmed by included selecting chemically reasonable water molecules and treatment with 10 mM DTT, which resulted in the disappearance ␴ Խ ԽϪԽ Խ model rebuilding in phase-combined A-weighted 2 Fo Fc of this extra density. Similar oxidation of active site sulfhydryl Խ ԽϪԽ Խ and Fo Fc maps and individual atomic B-factor refinement. group has been observed for other papain family members (32). Water molecules were modeled into isolated electron density Despite the relatively low sequence identity among members ␴ Խ ԽϪԽ Խ Խ ԽϪԽ Խ that appear in both A-weighted 2 Fo Fc and Fo Fc maps of the YopT family of cysteine proteases, all members have the and form hydrogen bonds with AvrPphB or other water mole- same predicted secondary structure assignment as determined

Zhu et al. PNAS ͉ January 6, 2004 ͉ vol. 101 ͉ no. 1 ͉ 303 Downloaded by guest on October 2, 2021 Fig. 3. Electron density map of AvrPphB. Stereo pair of a ␴A-weighed 2ԽFoԽ ϪԽFcԽ simulated-annealing omit electron density map (1.35 Å, contoured at 1.2 ␴) calculated with the final refined coordinates. Shown here is a region near the catalytic triad. The region includes highly conserved residues Trp-105, Asp-227 (one of the catalytic triad residues), Pro-228, Asn-229, Gly-231, Glu- 232, and Phe-233. Notably, the rings of Trp-105, Phe-226, and Pro-228 are stacked on each other. The image was prepared with the program MOLSCRIPT. Fig. 2. Multiple sequence alignment of the representative members of the YopT family. Y4zC and Q9amw4 are putative type III effec- tors from plant symbiotic bacteria Rhizobium and Bradyrhizobium, respec- as indicated by a search performed with the Dali algorithm (42). tively; Q9rbw5 is from P. syringae pv. phaseolicola and is also known as ORF4; The rms deviation value between AvrPphB and Staphopain for 110 ␣ AvrPphB is an Avr protein from P. syringae pv. phaseolicola; and YopT is an selected C atoms is 3.7 Å. effector protein from human pathogen Yersinia pestis. All of the sequences Secondary structure elements of AvrPphB that have structural were aligned with the program CLUSTALW (52). Invariant residues are shaded, equivalents in papain-like proteases include most of the central and the catalytic triad residues are indicated with bold red letters. Secondary antiparallel ␤-sheet (␤2-␤7) and helix ␣A. The major differences structure elements are indicated underneath the sequence: ␣-helices, rectan- gles; ␤-strands, arrows; other elements, solid lines; structurally unobserved among these structures lie in the connections between secondary structure elements (Fig. 4A). For example, AvrPphB contains residues at the N terminus, dashed lines. Color assignment is the same as in Fig. ␣ ␤ 1. For the purpose of clarity, sequences in non-AvrPphB proteins, where no three helices between A and 2 with short loops connecting the homology is identified with AvrPphB, have been replaced by bracketed num- helices. This gives AvrPphB a somewhat more compact and rigid bers indicating the number of residues omitted from the display. bottom lobe. Other members including papain and B have a long winding loop at the bottom of the lobe. On the top lobe, AvrPphB has a tight ␤-turn between ␤3 and ␤4 (which by the neural network program of the PREDICTPROTEIN server contains the active site residue His-212). At the corresponding (33) (Fig. 2). This finding suggests that all members of the YopT location, papain has a 23-residue insertion that loops around the family are likely to adopt a common 3D structure represented by top part of the molecule whereas staphopain has an eight-residue that of AvrPphB. There are a number of highly conserved loop (Fig. 4A). This region contributes to the formation of the residues within the family in addition to the invariant catalytic substrate-binding S2 site in papain-like proteases and has the triad residues as highlighted in Fig. 2. They are clustered mostly highest B factor in the AvrPphB structure (average B factor 28.7 around the catalytic triad, in particular, Asp-227. Three con- Å2 as compared with average B factor 15.5 Å2 for the entire served residues, Trp-105, Phe-226, and Pro-228, stack on each protein). As discussed below, based on the location and structure other via Van del Waals interactions (Fig. 3). Trp-105 also forms flexibility, we hypothesize that this region plays an important a hydrogen bond with the conserved Glu-232. These interactions role in shaping the substrate-binding S2 site in AvrPphB. presumably bring helix ␣A near strands ␤4 and ␤5 to help form Overlap of the AvrPphB catalytic triad (Cys-98, His-212, and and stabilize the catalytic triad. Consistent with this analysis, Asp-227) with that of the papain-like (Cys, His, and replacement of the corresponding tryptophan (Trp-146) in Asn) yields rms deviation values on the three C␣ atoms of YopT with alanine abolished the cytotoxic effect of YopT in between 0.18 and 0.36 Å for 17 papain-like structures listed in the yeast (15) and the cytoskeleton disruption phenotype in trans- SCOP database (43). In the superimposed structures of AvrPphB fected mammalian cells (data not shown). and papain, the central helix and the catalytic cysteine and histidine align very well, with the third catalytic residue Asp-227 Structural Homologs of AvrPphB and the Mechanism of Catalysis. The in AvrPphB corresponding to Asn-175 in papain (Fig. 4B). In overall architecture of AvrPphB bears similarity to that seen for papain, the main-chain amide group of Cys-25 and the side-chain members of papain-like cysteine protease superfamily (34), includ- amide group of Gln-19 define the so-called oxyanion hole, which ing papain (35), actinidin (36), glycyl (37), cruzain binds to the main-chain carbonyl group of the P1 residue of the (38), (39), bleomycin (40), staphopain,¶ and substrate (defined as the first residue on the N-terminal side of ubiquitin C-terminal hydrolase (41). The structural features of the the cleavage site). In AvrPphB, Asn-93 occupies an analogous papain superfamily consist of the common core of one ␣-helix and position to Gln-19 in papain, which indicates that it may also four strands of antiparallel ␤-sheet, which are also present in the participate in the formation of the oxyanion hole (44, 45). AvrPphB structure. Members differ by insertion into and circular Overlap of all four of these AvrPphB active site residues on papain-like proteases yields rms deviation values that range from permutation of the catalytic core. Of the papain-like proteases, ␣ staphopain has the highest overall structural similarity to AvrPphB, 0.60 to 0.88 Å for C atoms as illustrated in Fig. 4B. These structural similarities suggest that the catalytic mech- anism of AvrPphB will mirror that of the papain-like proteases ¶Hofmann, B., Schomburg, D. & Hecht, H. J. (1993) Acta Crystallogr. A 49, Suppl., 102. (46). Thus, it is likely that AvrPphB Cys-98 and His-212 form a

304 ͉ www.pnas.org͞cgi͞doi͞10.1073͞pnas.2036536100 Zhu et al. Downloaded by guest on October 2, 2021 Fig. 4. Structure comparison of AvrPphB with papain-like cysteine proteases. (A) Ribbon diagrams of AvrPphB (Left), papain (PDB ID code 1ppn) (Center), and staphopain (PDB ID code 1cv8) (Right). The structures are aligned based on least-squares superimposition on the active site residue C␣ atoms. The active site residues are shown as ball-and-stick models. The structurally conserved helix ␣A and central ␤-sheets are colored magenta and green, respectively. The loop between strands ␤3 and ␤4 are colored red. (B) Comparison of AvrPphB and papain-like active sites. Active site residues of AvrPphB, Asn-93, Cys-98, His-212, and Asp-227 are shown as green stick models. A representative collection of five papain-like cysteine protease active sites are shown as gray stick models after least-squares superimposition on the active site residue C␣ atoms. The papain-like structures shown have PDB ID codes 2act, 1cv8, 1gec, 1me4, and 1ppn. Other papain-like structures used in structural comparisons in this article are PDB ID codes 1meg, 1huc, 1yal, 1cqd, 1k3b, 8pch, 1atk, 1mhw, 1gho, 1fh0, 1ef7, 2cb5, and 1dki. Refer to the Protein Data Bank for primary references to these structures, which are not included here because of space limits. Images were prepared with the program MOLSCRIPT.

thiolate-imidazolium ion pair, Asp-227 functions to orient the its main chain located in the S1 site in a similar fashion to that enzyme’s active site and perhaps to stabilize the protonated form of papain, whereas its side chain points out toward the solvent. of His-212, and Asn-93 contributes to formation of the oxyanion For members of the papain family, an acidic residue is often

hole. These essential roles in catalysis are supported by exper- MICROBIOLOGY imental data, which have shown that mutation of any of the Cys, His, and Asp residues in both AvrPphB and YopT completely abrogate their proteolytic activities as well as biological functions (15, 47). E-64 (N-[N-(L-3-trans-carboxirane-2-carbonyl)-L- leucyl]-agmatine), a well-studied inhibitor for papain-like cys- teine proteases, is known to specifically recognize the catalytic cysteine and covalently modify its sulfhydrl group (48). In the case of AvrPphB, proteolytic cleavage of its substrate PBS1 protein is largely abolished when the enzyme is mixed with E-64 (data not shown). In addition, E-64 is a potent and efficient inhibitor of the YopT cleavage of its substrate prenylated Rho GTPases (15). These data suggest that AvrPphB and all other members of the YopT family share a common catalytic mech- anism with that of papain protease.

Insights into Substrate Binding Specificity. AvrPphB cleaves itself between Lys-62 and Gly-63, and it cleaves PBS1 between Lys-243 and Ser-244 (10, 15). A common Gly-Asp-Lys motif was found to occupy the P3, P2, and P1 sites in both substrates (Fig. 5A). A likely mode of substrate binding to AvrPphB is suggested by analogy with complexes of papain-like proteases, in which bound inhibitors occupy the substrate binding sites S or SЈ. Although the loops connected by a disulfide bond near the active site in the bottom lobe of papain are not present in AvrPphB, the corre- sponding space is filled by the aromatic side chain of Tyr-138. This is likely to be the S1 substrate-binding site in AvrPphB (Fig. 5 B and C). In papain, the P1 residue main-chain carbonyl group is placed into the oxyanion hole, whereas its side chain points Fig. 5. The structural basis of AvrPphB substrate specificity. (A) Sequence comparison of AvrPphB cleavage sites in its precursor and substrate PBS1 protein. upward into bulk solvent. This renders little substrate specificity A common Gly-Asp-Lys motif preceding the cleavage sites is highlighted in red, at the P1 residue. Mutational analysis of AvrPphB and PBS1 and the arrow indicates the cleavage sites. (B and C) Active site clefts of papain- suggested that AvrPphB has strong preferences for the P3 and P2 like enzyme and AvrPphB. Orientation is the same as in Fig. 4A. B shows the residues (see below), but less selectivity for the P1 residue. molecular surface of cruzain (PDB ID code 2aim), and C shows that of AvrPphB. Although a Lys residue is present at the P1 site in both the The structure of cruzain was determined with the inhibitor benzoyl-Arg-Ala- AvrPphB precursor and PBS1 proteins (Fig. 5A), substitution of fluoromethyl ketone, which occupies the S3, S2, and S1 sites and is shown in B Left Lys-62 with alanine at the P1 position in AvrPphB had no effect as a CPK representation. C Right is a zoom-in view of the proposed active site of AvrPphB. The proposed S2 site residue (Arg-205) and four catalytically important on AvrPphB autoproteolytic cleavage (data not shown). Simi- residues are drawn underneath the molecular surface. Note the positive charac- larly, mutation of the corresponding Lys-243 in PBS1 showed ter of S2 and shallowness of S3 at the substrate-binding site. All surfaces are only a minor effect on its cleavage by AvrPphB (10). Therefore, colored based on the electrostatic potential of the molecule (ranging from Ϫ23 it is quite possible that the P1 residue for AvrPphB may also have to ϩ23 kT). Images were prepared with the program GRASP (53).

Zhu et al. PNAS ͉ January 6, 2004 ͉ vol. 101 ͉ no. 1 ͉ 305 Downloaded by guest on October 2, 2021 found within the S2 site, which interacts with basic residues in the syringae and Ralstonia and three from plant symbiotic bacteria P2 position of substrates. For example, a glutamic acid residue Rhizobium and Bradyrhizobium. The list will continue to grow occupies the S2 site in cathepsin B and cruzain (two papain with the ongoing efforts of microbial genome sequencing. In family members), and its negative side chain interacts with a addition, multiple Avr proteases of the YopT family can be found positively charged arginine at the P2 site of their respective in a single Pseudomonas strain. For example, HopPtoN and substrates (38, 49) (Fig. 5B). When the AvrPphB structure is HopPtoC are both YopT family members and are found in P. superimposed onto the structure of cruzain bound to its sub- syringae pv. tomato strain DC3000 (50). Our structural analysis strate-mimicking inhibitor (benzoyl-Arg-Ala-fluoromethyl ke- suggests that these putative Avr proteases will likely target tone), Arg-205 is found to occupy the S2 site (Fig. 5C). The different substrates in the plant host, or possibly cleave the same putative S2 site in AvrPphB is rather deep and has an overall substrates at different positions, generating different cleavage positive charge. This finding is consistent with our observation signals or products. Detection of these different proteases would that AvrPphB prefers an aspartic acid at the substrate P2 site; thus be expected to require distinct R proteins. The large mutation of Asp-242 in PBS1 (the P2 residue) drastically in- number of YopT-like proteins found in plant pathogens may creases its resistance to cleavage by AvrPphB (10). reflect coevolutionary pressures in which the evolution of new R The S3 site of AvrPphB has a strong preference for a glycine proteins in the host that detect the cleavage products of a given residue as mutation of Gly-241 in the cleavage site of PBS1 (the protease selects for pathogens with new protease variants. P3 residue) to alanine almost completely blocks its cleavage (10). In summary, we have shown that AvrPphB has an overall The structural comparison between AvrPphB and cruzain, as protein fold that places it within the papain-like cysteine pro- shown in Fig. 5 B and C, suggests that Leu-142, Tyr-138, and tease superfamily, although it clearly differs from the typical Phe-154 form the S3 site. All three residues have bulky hydro- papain fold. This structural similarity includes the three cata- phobic side chains, which makes the S3 site a very shallow lytically important residues with a structural arrangement that is surface. Such a shallow and hydrophobic surface may not allow superimposable to that of papain-like proteases. It is thus binding of any amino acids other than glycine. expected that AvrPphB uses a similar mechanism in catalyzing Although the proteolytic triad is invariant in the entire YopT the proteolysis. The similarity between AvrPphB and other family of proteases, residues corresponding to the substrate- papain-like proteases also suggests a likely substrate-binding site binding sites in AvrPphB, particularly the S2 and S3 sites, appear that might explain the distinct substrate specificity of AvrPphB to be divergent among the family members (Fig. 2). This finding among the YopT family members. Our structural analysis of the suggests that different YopT family member proteases have specificity of Avr proteases provides a partial explanation for the different substrate specificities and thus target different host one-to-one relationship between Avr genes and corresponding R proteins. Indeed, YopT from Yersinia specifically cleaves the genes, a hallmark of the plant disease resistance mechanism. Our prenylated Rho GTPases. A polybasic sequence upstream of the study also begins to advance our understanding of plant disease cleavage site, as well as an isoprenoid moiety in the substrate, resistance mechanisms to the structural level. plays a crucial role in dictating the cleavage specificity (47). Moreover, Y4zC from Rhizobium fails to cleave PBS1, although We thank J. Stuckey for maintaining the x-ray facility at the University it also undergoes autoproteolytic cleavage (F.S. and J.E.D., of Michigan Medical School; D. Peisach for help with data collection and unpublished data). Consistent with this observation, the site of crystallography; and R. Zhang, S. Ginell, and A. Joachimiak for access autoproteolytic cleavage in Y4zC differs from AvrPphB, with the and help at the Advanced Photon Source Structural Biology Center (Argonne, IL) beamline. F.S. was supported by the Anthony and Lillian P1, P2, and P3 sites being occupied by Met, Lys, and Asp, Lu graduate student fellowship. This work was supported by National respectively, rather than Lys, Asp, and Gly (F.S. and J.E.D., Institutes of Health Grants GM46451 (to R.W.I.), DK18849 (to J.E.D.), unpublished data). and GM60997 (to Z.X.), the Walther Cancer Institute (J.E.D.), and the A recent PSI-BLAST search identifies seven Avr genes of the Ellison Medical Foundation (J.E.D.). Z.X. is a University of Michigan YopT family of cysteine proteases from the plant pathogens P. Biological Sciences Scholar and a Pew Scholar in Biomedical Sciences.

1. Dangl, J. L. & Jones, J. D. (2001) Nature 411, 826–833. 19. Puri, N., Jenner, C., Bennett, M., Stewart, R., Mansfield, J., Lyons, N. & Taylor, 2. Martin, G. B., Bogdanove, A. J. & Sessa, G. (2003) Annu. Rev. Plant Biol. 54, J. (1997) Mol. Plant Microbe Interact. 10, 247–256. 23–61. 20. Innes, R. (2003) Mol. Microbiol. 50, 363–365. 3. Inohara, N. & Nunez, G. (2003) Nat. Rev. Immunol. 3, 371–382. 21. Hotson, A., Chosed, R., Shu, H., Orth, K. & Mudgett, M. B. (2003) Mol. 4. Arabidopsis Genome Initiative (2000) Nature 408, 796–815. Microbiol. 50, 377–389. 5. Meyers, B. C., Kozik, A., Griego, A., Kuang, H. & Michelmore, R. W. (2003) 22. Axtell, M. J., Chisholm, S. T., Dahlbeck, D. & Staskawicz, B. J. (2003) Mol. Plant Cell 15, 809–834. Microbiol. 49, 1537–1546. 6. Schneider, D. S. (2002) Cell 109, 537–540. 23. Kruger, J., Thomas, C. M., Golstein, C., Dixon, M. S., Smoker, M., Tang, S., 7. Flor, H. H. (1971) Annu. Rev. Phytopathol. 9, 275–296. Mulder, L. & Jones, J. D. (2002) Science 296, 744–747. 8. Axtell, M. J. & Staskawicz, B. J. (2003) Cell 112, 369–377. 24. Lavie, M., Shillington, E., Eguiluz, C., Grimsley, N. & Boucher, C. (2002) Mol. 9. Mackey, D., Belkhadir, Y., Alonso, J. M., Ecker, J. R. & Dangl, J. L. (2003) Cell Plant Microbe Interact. 15, 1058–1068. 112, 379–389. 25. Mestre, P., Brigneti, G. & Baulcombe, D. C. (2000) Plant J. 23, 653–661. 10. Shao, F., Golstein, C., Ade, J., Stoutemyer, M., Dixon, J. E. & Innes, R. W. 26. Orth, K., Palmer, L. E., Bao, Z. Q., Stewart, S., Rudolph, A. E., Bliska, J. B. (2003) Science 301, 1230–1233. & Dixon, J. E. (1999) Science 285, 1920–1923. 11. Bonas, U. & Lahaye, T. (2002) Curr. Opin. Microbiol. 5, 44–50. 27. Otwinowski, Z. & Minor, W. (1997) Methods Enzymol. 276, 307–326. 12. Van der Biezen, E. A. & Jones, J. D. (1998) Trends Biochem. Sci. 23, 454–456. 28. Hendrickson, W. A. & Ogata, C. M. (1997) Methods Enzymol. 276, 494–523. 13. Jenner, C., Hitchin, E., Mansfield, J., Walters, K., Betteridge, P., Teverson, D. 29. Brunger, A. T., Adams, P. D., Clore, G. M., DeLano, W. L., Gros, P., & Taylor, J. (1991) Mol. Plant Microbe Interact. 4, 553–562. Grosse-Kunstleve, R. W., Jiang, J. S., Kuszewski, J., Nilges, M., Pannu, N. S., 14. Tampakaki, A. P., Bastaki, M., Mansfield, J. W. & Panopoulos, N. J. (2002) et al. (1998) Acta Crystallogr. D 54, 905–921. Mol. Plant Microbe Interact. 15, 292–300. 30. Jones, T. A., Zou, J. Y., Cowan, S. W. & Kjeldgaard, M. (1991) Acta Crystallogr. 15. Shao, F., Merritt, P. M., Bao, Z., Innes, R. W. & Dixon, J. E. (2002) Cell 109, A 47, 110–119. 575–588. 31. Nimchuk, Z., Marois, E., Kjemtrup, S., Leister, R. T., Katagiri, F. & Dangl, J. L. 16. Warren, R. F., Merritt, P. M., Holub, E. & Innes, R. W. (1999) Genetics 152, (2000) Cell 101, 353–363. 401–412. 32. Jia, Z., Hasnain, S., Hirama, T., Lee, X., Mort, J. S., To, R. & Huber, C. P. 17. Warren, R. F., Henk, A., Mowery, P., Holub, E. & Innes, R. W. (1998) Plant (1995) J. Biol. Chem. 270, 5527–5533. Cell 10, 1439–1452. 33. Rost, B. & Sander, C. (1993) Proc. Natl. Acad. Sci. USA 90, 7558–7562. 18. Swiderski, M. R. & Innes, R. W. (2001) Plant J. 26, 101–112. 34. Rawlings, N. D. & Barrett, A. J. (1994) Methods Enzymol. 244, 461–486.

306 ͉ www.pnas.org͞cgi͞doi͞10.1073͞pnas.2036536100 Zhu et al. Downloaded by guest on October 2, 2021 35. Kamphuis, I. G., Kalk, K. H., Swarte, M. B. & Drenth, J. (1984) J. Mol. Biol. 45. Menard, R., Carriere, J., Laflamme, P., Plouffe, C., Khouri, H. E., Vernet, T., 179, 233–256. Tessier, D. C., Thomas, D. Y. & Storer, A. C. (1991) Biochemistry 30, 36. Baker, E. N. (1980) J. Mol. Biol. 141, 441–484. 8924–8928. 37. O’Hara, B. P., Hemmings, A. M., Buttle, D. J. & Pearl, L. H. (1995) 46. Storer, A. C. & Menard, R. (1994) Methods Enzymol. 244, 486–500. Biochemistry 34, 13190–13195. 47. Shao, F., Vacratsis, P. O., Bao, Z., Bowers, K. E., Fierke, C. A. & Dixon, J. E. 38. Gillmor, S. A., Craik, C. S. & Fletterick, R. J. (1997) Protein Sci. 6, 1603–1611. (2003) Proc. Natl. Acad. Sci. USA 100, 904–909. 39. Musil, D., Zucic, D., Turk, D., Engh, R. A., Mayr, I., Huber, R., Popovic, T., 48. Varughese, K. I., Ahmed, F. R., Carey, P. R., Hasnain, S., Huber, C. P. & Turk, V., Towatari, T., Katunuma, N., et al. (1991) EMBO J. 10, 2321–2330. Storer, A. C. (1989) Biochemistry 28, 1330–1332. 40. Joshua-Tor, L., Xu, H. E., Johnston, S. A. & Rees, D. C. (1995) Science 269, 49. Kirschke, H., Barrett, A. J. & Rawlings, N. D. (1995) Protein Profile 2, 945–950. 1581–1643. 41. Johnston, S. C., Larsen, C. N., Cook, W. J., Wilkinson, K. D. & Hill, C. P. (1997) 50. Buell, C. R., Joardar, V., Lindeberg, M., Selengut, J., Paulsen, I. T., Gwinn, EMBO J. 16, 3787–3796. M. L., Dodson, R. J., Deboy, R. T., Durkin, A. S., Kolonay, J. F., et al. (2003) 42. Holm, L. & Sander, C. (1993) J. Mol. Biol. 233, 123–138. Proc. Natl. Acad. Sci. USA 100, 10181–10186. 43. Murzin, A. G., Brenner, S. E., Hubbard, T. & Chothia, C. (1995) J. Mol. Biol. 51. Kraulis, P. J. (1991) J. Appl. Crystallogr. 24, 946–950. 247, 536–540. 52. Thompson, J. D., Higgins, D. G. & Gibson, T. J. (1994) Nucleic Acids Res. 22, 44. Schroder, E., Phillips, C., Garman, E., Harlos, K. & Crawford, C. (1993) FEBS 4673–4680. Lett. 315, 38–42. 53. Nicholls, A., Sharp, K. A. & Honig, B. (1991) Proteins 11, 281–296. MICROBIOLOGY

Zhu et al. PNAS ͉ January 6, 2004 ͉ vol. 101 ͉ no. 1 ͉ 307 Downloaded by guest on October 2, 2021