Colloquium

A framework for interpreting the leucine-rich repeats of the Listeria internalins

Michael Marino*, Laurence Braun†, Pascale Cossart†, and Partho Ghosh*‡

*Department of Chemistry and Biochemistry, University of California, San Diego, 9500 Gilman Drive, La Jolla, CA 92093-0314; and †Institut Pasteur, Unite´des Interactions Bacte´ries-Cellules, 28 rue du Dr. Roux, 75015 Paris, France The surface InlB of the bacterial pathogen Listeria mono- Internalin Family cytogenes is required for inducing phagocytosis in various non- InlA and InlB belong to the internalin family of Listeria , phagocytic mammalian cell types in vitro. InlB causes tyrosine which includes at least seven additional members in L. mono- phosphorylation of host cell adaptor proteins, activation of phos- cytogenes (InlC, InlC2, InlD, InlE, InlF, InlG, and InlH) (4, phoinositide 3-kinase, and rearrangements of the actin cytoskel- 15–18). Internalin proteins are also found in Listeria ivanovii,a eton. These events lead to phagocytic uptake of the bacterium by pathogen of ruminants (19, 20), and also in nonpathogenic the host cell. InlB belongs to the internalin family of Listeria species of the genus Listeria. Except for InlC, which is a soluble proteins, which also includes InlA, another surface protein involved and secreted protein, the L. monocytogenes internalins have in host cell invasion. The internalins are the largest class of bacterial sequence motifs that are consistent with bacterial cell surface proteins containing leucine-rich repeats (LRR), a motif associated localization. In contrast to InlB, which has been shown to be with protein–protein interactions. The LRR motif is found in a important to proliferation in hepatocytes in vivo (21, 22), the role functionally diverse array of proteins, including those involved in of InlA in virulence has not been clearly demonstrated (23, 24). the plant immune system and in the mammalian innate immune The biochemical functions and exact virulence properties of the response. Structural and functional interpretations of the se- other L. monocytogenes internalins remain to be determined. quences of internalin family members are presented in light of the Proteins of the internalin family share several sequence motifs, recently determined x-ray crystal structure of the InlB LRR domain. the most distinctive of these being leucine-rich repeats (LRR). The LRR motif is associated with protein–protein interactions and is found in both bacterial and eukaryotic proteins (25–27). number of intracellular pathogens have the ability to invade The majority of bacterial LRR proteins are found in Listeria and mammalian cells that are normally nonphagocytic by in- A belong to the internalin family, but LRR proteins have also been ducing phagocytic behavior (1). The cellular machinery required identified in Yersinia spp. (28), Shigella flexneri (29), Salmonella for phagocytosis appears to be present in many types of non- typhimurium (30), and Burkholderia (Pseudomonas) solanacea- phagocytic cells but in a constitutively unassembled or inactive rum (31). Among eukaryotes, the LRR motif is found in proteins state. Certain pathogens possess means by which to induce encoded by the disease resistance (R) genes of the plant immune assembly of the phagocytic machinery necessary to bring about system (32–34) and by the toll and toll-like genes of Drosophila their own uptake. Among these pathogens is the facultative and mammals, which are involved in innate immune responses intracellular bacterium Listeria monocytogenes, a cause of men- (35, 36). The LRR motif is not limited to virulence- or immunity- ingitis and abortion in humans (2). L. monocytogenes induces related proteins but is instead found in a large group of proteins its own phagocytosis in a large number of nonphagocytic cell with highly diverse functions and cellular localizations (25). types in vitro through the actions of the bacterial surface The LRR regions of InlA and InlB play critical roles in proteins InlA and InlB (also called internalin and internalin B, interactions with mammalian cells. The LRR region of InlA is respectively) (3). necessary to promote invasion of permissive cells in vitro (37). The related proteins InlA and InlB act on different sets of cell Likewise, the LRR region of InlB is necessary for the activation types. Although InlA is required for inducing phagocytosis in the of phosphoinositide 3-kinase, rearrangement of the actin cy- enterocyte-like epithelial cell line Caco-2 (4), InlB is required for toskeleton, and invasion of permissive cells in vitro (38). The inducing phagocytosis in some hepatocyte, endothelial, epithe- recently determined x-ray crystal structure of the InlB LRR lial, or fibroblast-like cell lines (e.g., Vero, HeLa, HEp-2, domain (39) provides a framework for the interpretation of Chinese hamster ovary) (5–10). The different cell type specific- sequences of internalin family members. ities likely reflect the fact that InlA and InlB have different receptor specificities. The receptor for InlA is E-cadherin (11), Internalin LRR and, whereas the receptor for InlB has not been identified, it is The internalin LRR regions are highly regular, with almost all clear that it is not E-cadherin. Nevertheless, InlB has been shown repeats being 22 residues in length (Fig. 1). The LRRs are to have potent effects on host cell signaling events. InlB pro- tandemly repeated from 6 times in InlG to 16 times in InlA. The motes activation of host cell phosphoinositide (PI) 3-kinase and 7.5 LRRs present in InlB give it its elongated and curved shape tyrosine phosphorylation of certain adaptor proteins (Gab1, Shc, (Fig. 2). The repeats are similar to those observed in the LRRs and Cbl) implicated in the membrane localization of phospho- of porcine and human ribonuclease inhibitors (RI) (40–42), the inositide 3-kinase (PI 3-kinase) (12, 13). PI 3-kinase is known to

have effects on actin polymerization (14), a process that is This paper was presented at the National Academy of Sciences colloquium ‘‘Virulence and required for L. monocytogenes invasion of host cells (12). Defense in Host–Pathogen Interactions: Common Features Between Plants and Animals,’’ Reorganization of the actin cytoskeleton presumably leads to held December 9–11, 1999, at the Arnold and Mabel Beckman Center in Irvine, CA. changes in the membrane that cause phagocytosis of the Abbreviations: LRR, leucine-rich repeat; RI, ribonuclease inhibitor. bacterium. ‡To whom reprint requests should be addressed. E-mail: [email protected].

8784–8788 ͉ PNAS ͉ August 1, 2000 ͉ vol. 97 ͉ no. 16 Downloaded by guest on September 27, 2021 COLLOQUIUM

Fig. 1. Sequence alignments of L. monocytogenes internalin LRR regions. Stars above the sequences denote conserved internalin LRR residues, and bars denote the position of ␤-strands and extent of 310-helices observed in the InlB structure (39). Residues predicted to form the concave or convex face of these structures are enclosed in boxes. Hydrophobic, negatively charged, and positively charged residues predicted to be surface exposed are highlighted in yellow, red, and cyan, respectively. Sequences for InlA, InlB, InlC, InlC2, InlD, InlE, InlF, InlG, and InlH are shown. The repeats are 22 residues in length except for repeat 4 of InlC and repeat 6 of InlA, which are 21 residues, and repeat 5 of InlF, which is 23 residues. These small deletions and insertions are predicted to occur at loop regions.

U2LRR fragment of U2 small nuclear ribonucleoprotein (43), repeat of U2LRR contains a second ␤-strand instead of a . and the Ran GTP-ase activating protein rna1p (44). As in these In InlB, the helical segments are short (three to five residues) and ␤ structures, each repeat alternates between a short -strand and are exclusively composed of 310-helices. U2LRR possesses 23- ␤ an opposing antiparallel helical segment; the -strands and and 25-residue repeats that also form 310-helices. This contrasts helices are connected to each other by coils. The main chain with RI and rna1p, which have long (10–14 residues) ␣-helices wraps around in a right-handed sense, with the ␤-strands forming and also have longer repeat lengths (28 residues or greater). ␤ a parallel -sheet and the helices stacking on each other, giving Apparently the more tightly wound 310-helix accommodates rise to the elongated shape of the LRR. formation of short repeats more favorably than does the ␣-helix. Although the ␤-strands are a highly conserved structural The of the LRR region arises from the ␤-strands of feature of LRR proteins, the helical segments are more diver- adjacent repeats being packed more closely together than op- gent. The ␤-strands are almost always composed of three resi- posing helices. The ␤-strands are within interrepeat hydrogen dues that are in precise register. On the other hand, the helical bonding distance, whereas the helices form intrarepeat hydrogen segments are variable in length, register, and type. In fact, one bonds. The ␤-strands form the concave face and the helices the

Marino et al. PNAS ͉ August 1, 2000 ͉ vol. 97 ͉ no. 16 ͉ 8785 Downloaded by guest on September 27, 2021 segment of Ϸ40 residues termed the ‘‘N-terminal cap’’ in InlB, because it provides a hydrophilic cap on the hydrophobic core of the first LRR (39). The hydrophobic core would be exposed to solvent at either end of the LRR were it not for other portions of the protein. The N-terminal cap forms numerous polar and hydrophobic interactions with the first LRR. The most promi- nent of these is a hydrophobic interaction with Tyr at position 20 of the first LRR, a highly conserved residue in the internalin family (Fig. 1). In addition to its structural role, the N-terminal cap has been suggested to be involved in function based on the observation of two highly solvent-exposed calcium ions that bind to this region (39). The calciums are not required for formation of protein structure and may instead act as metal ion bridges between InlB and mammalian cell surface receptors or binding proteins. The N-terminal cap and LRR regions of InlB are sufficient to elicit mammalian cell effects and to induce phagocytosis (38). At the C terminus of the LRR region is a conserved sequence of Ϸ100 residues termed the interrepeat (IR) region (17). The IR region is dispensable for the mammalian cell signaling activities of InlB but is required in InlA for inducing phagoctyosis (37). The IR region has not yet been visualized, but crystals of intact InlB have been obtained that should elucidate the role of this as well as other regions of internalins (M.M. and P.G., unpublished data). At least a portion of the IR region is likely to provide a hydrophilic cap on the last LRR. LRR Pattern: Structural Residues A total of 10 positions of the 22-residue internalin LRR are highly conserved and serve structural rather than functional roles. Seven positions (2, 5, 7, 12, 15, 18, and 21) of the repeat Fig. 2. Structure of the InlB LRR region (residues 77–242). The right-handed contain mainly Leus or Iles that point inward and compose the coil of the LRR alternates between ␤-strands and 310-helices. The ␤-strands hydrophobic core of the protein (Fig. 3). Other hydrophobic form the concave face of the molecule and have a superhelical twist not residues are accommodated at these positions, including Val, observed in other LRR proteins. This figure and Fig. 4 were generated with Phe, Met, and Ala (Fig. 1). Position 10 also faces inward and MOLSCRIPT (48) and rendered with RASTER3D (49). usually contains Asns or Glns, which form the so-called ‘‘Asn ladder’’ (45). The Asn hydrogen bonds to main-chain in convex face of the molecule (Fig. 2). This curvature would its own repeat and to those in the preceding repeat. This explains appear to limit the possible number of tandem LRRs, because ␤ why position 10 in the first repeat is not constrained to be an Asn a closed circle would be formed. However, the InlB -strands are or Gln. twisted and give the entire structure a right-handed superhelical The seven hydrophobic positions along with the Asn-ladder twist, thereby placing no obvious limits on the possible number ␤ position define the internalin LRR sequence motif (Fig. 4). This of tandem repeats (39). The -strands of RI, U2LRR, and rna1p motif corresponds almost identically to the 22-residue repeat are not as extensively twisted and indeed place a limit on the number of possible tandem repeats. It is not clear at present what gives rise to this difference. ␤ The coil regions connecting -strands and 310-helices consti- tute the great majority (14–17 residues) of the 22-residue InlB LRR. Even though these regions lack conventional secondary structure, they form highly regular structures (rms deviation of 0.32ÅinC␣ positions). This regularity may in part be caused by the presence of water molecules that act as extensions of secondary structure. These waters are intercalated between repeats and form bridging hydrogen bonds between main-chain atoms of adjacent repeats. Structural determination of InlB to high resolution (1.86 Å) allowed visualization of three distinct spines of water molecules that run the length of the repeat region (39). Some of these -mediating waters are also observed in the 2.0-Å resolution structure of human RI (42). A regular pattern of hydrogen bond-mediating waters is expected to be present in other LRR proteins and to be observed in high-resolution structures of other LRR proteins. LRR Flanking Sequences Fig. 3. Structure of a single InlB LRR. Positions 2, 5, 7, 10, 12, 14, 15, 17, 18, The LRR regions are flanked by sequences that are conserved and 21 are conserved for structural reasons. The remaining positions are in the internalin family and that most likely play a role in solvent exposed and variable. The register of the InlB LRR is that described for providing stability. At the N terminus of the LRR region is a porcine RI (40).

8786 ͉ www.pnas.org Marino et al. Downloaded by guest on September 27, 2021 LRR Pattern: Functional Residues More than half the residues of the internalin LRR face outward, are variable, and can serve to define protein–protein interaction surfaces specific to each internalin. As with other LRR proteins, the interaction surfaces of the internalins are likely to be formed by the ␤-strand regions that constitute the concave faces of these molecules. The concave face has been observed to form the major binding surface in complexes of RI and U2LRR with their target proteins (41, 43) and is inferred to form the binding surface for rna1p (44). The sides of RI adjacent to the concave face are also involved in binding. In InlB, the concave face contains separate patches of hydrophobic and negatively charged residues that could constitute protein–protein interaction sur- faces (39). Sequence characteristics of the internalins also point to the concave face as being important to protein–protein interactions. Fig. 4. Consensus motifs of the internalin LRR and the ␤-helix repeat. The Three-quarters of the hydrophobic residues predicted to be two differ primarily at position 19. Black bars above the sequences represent surface exposed in the L. monocytogenes internalins are located ␤ -strands, and the gray bar represents the 310-helix. Black arrows in the on the concave face (positions 3, 4, 6, 8, and 9) (Fig. 1). Likewise, cartoons represent ␤-strands and the gray zigzag the 310-helix. The register for ␤ greater than three-quarters of the negatively charged residues the -helix repeat has been adjusted for purposes of comparison. predicted to be surface exposed are also located on the concave motif identified for proteins of the right-handed ␤-helix family face (Fig. 1). In general, the concave faces of the internalins (Fig. 4) (46, 47). This correspondence led to a suggestion that the possess the hydrophobic and negative charge characteristics internalins would form a ␤-helix structure rather than the observed for InlB. The sides adjacent to the concave face observed LRR structure (47). The ␤-helix structure is found in (positions 11 and 13 on one side, and positions 21 and 1 on the pectate lyases and forms a distinctive L-shaped repeat contain- other) are highly enriched in positively charged residues, as also observed in InlB. In contrast to the distinctive pattern of residues ing three ␤-strands. The greatest difference between the two found on the concave face and sides, the convex face (positions repeat motifs occurs at position 19, which faces outward and is 16, 19, and 20) of the internalins appears to be relatively variable in the internalin LRR but is occupied by an inward- featureless (Fig. 1). The exception to this is InlC, for which the facing conserved hydrophobic residue in the ␤-helix motif. The convex face is predicted to contain a small hydrophobic patch. close correspondence of these two motifs suggests a means by Although the high sequence conservation of the internalin LRR which to probe the basis for their formation. allows strong predictions about protein–protein interaction sites Two structural positions of the InlB LRR are unique to the based on the InlB structure, it does not lead as easily to internalin family. Position 14 is outward facing and is usually predictions about target specificity. occupied by an Asp that hydrogen bonds to main-chain atoms at The challenge for the future will be in identifying binding the beginning of 310-helices (Fig. 3). Other residues capable of partners for individual internalins and in determining how the hydrogen bonding, including Ser, Thr, Asn, and Gln, are also LRR structural motif is used to generate a variety of specific found at this position (Fig. 1). Position 17 is generally occupied functional interactions. by a small , usually Gly, Pro, Ala, or Ser. It is located on the 310-helix, and a larger residue at this position would M.M. was supported by National Institutes of Health Training Grant sterically clash with the preceding 310-helix. These two positions GM07240-24. This work was in part supported by a W. M. Keck add to the eight inward-facing ones to yield the 10 positions Distinguished Young Scholar award and an American Heart Association

conserved for structural reasons. grant (9930135N) to P.G. COLLOQUIUM

1. Finlay, B. B. & Cossart, P. (1997) Science 276, 718–725. 17. Dramsi, S., Dehoux, P., Lebrun, M., Goossens, P. L. & Cossart, P. (1997) Infect. 2. Lorber, B. (1997) Clin. Infect. Dis. 24, 1–9. Immun. 65, 1615–1625. 3. Cossart, P. & Lecuit, M. (1998) EMBO J. 17, 3797–3806. 18. Raffelsbauer, D., Bubert, A., Engelbrecht, F., Scheinpflug, J., Simm, A., Hess, 4. Gaillard, J. L., Berche, P., Frehel, C., Gouin, E. & Cossart, P. (1991) Cell 65, J., Kaufmann, S. H. & Goebel, W. (1998) Mol. Gen. Genet. 260, 144–158. 1127–1141. 19. Engelbrecht, F., Domainguez-Bernal, G., Hess, J., Dickneite, C., Greiffenberg, 5. Dramsi, S., Biswas, I., Maguin, E., Braun, L., Mastroeni, P. & Cossart, P. (1995) L., Lampidis, R., Raffelsbauer, D., Daniels, J. J., Kreft, J., Kaufmann, S. H., Mol. Microbiol. 16, 251–261. et al. (1998) Mol. Microbiol. 30, 405–417. 6. Lingnau, A., Domann, E., Hudel, M., Bock, M., Nichterlein, T., Wehland, J. & 20. Engelbrecht, F., Dickneite, C., Lampidis, R., Go¨tz, M., DasGupta, U. & Chakraborty, T. (1995) Infect. Immun. 63, 3896–3903. Goebel, W. (1998) Mol. Gen. Genet. 257, 186–197. 7. Braun, L., Ohayon, H. & Cossart, P. (1998) Mol. Microbiol. 27, 1077–1087. 21. Gaillard, J. L., Jaubert, F. & Berche, P. (1996) J. Exp. Med. 183, 359–369. 8. Greiffenberg, L., Goebel, W., Kim, K. S., Weiglein, I., Bubert, A., Engelbrecht, 22. Gregory, S. H., Sagnimeni, A. J. & Wing, E. J. (1997) Infect. Immun. 65, F., Stins, M. & Kuhn, M. (1998) Infect. Immun. 66, 5260–5267. 5137–5141. 9. Mu¨ller, S., Hain, T., Pashalidis, P., Lingnau, A., Domann, E., Chakraborty, T. 23. Jonquie`res, R., Bierne, H., Mengaud, J. & Cossart, P. (1998) Infect. Immun. 66, & Wehland, J. (1998) Infect. Immun. 66, 3128–3133. 3420–3422. 10. Parida, S. K., Domann, E., Rohde, M., Muller, S., Darji, A., Hain, T., Wehland, 24. Lecuit, M., Dramsi, S., Gottardi, C., Fedor-Chaiken, M., Gumbiner, B. & J. & Chakrabory, T. (1998) Mol. Microbiol. 28, 81–93. Cossart, P. (1999) EMBO J. 18, 3956–3963. 11. Mengaud, J., Ohayon, H., Gounon, P., Mege, R. M. & Cossart, P. (1996) Cell 25. Kobe, B. & Deisenhofer, J. (1994) Trends Biochem. Sci. 19, 415–421. 84, 923–932. 26. Buchanan, S. G. & Gay, N. J. (1996) Prog. Biophys. Mol. Biol. 65, 1–44. 12. Ireton, K., Payrastre, B., Chap, H., Ogawa, W., Sakaue, H., Kasuga, M. & 27. Kajava, A. V. (1998) J. Mol. Biol. 277, 519–527. Cossart, P. (1996) Science 274, 780–782. 28. Leung, K. Y. & Straley, S. C. (1989) J. Bacteriol. 171, 4623–4632. 13. Ireton, K., Payrastre, B. & Cossart, P. (1999) J. Biol. Chem. 274, 17025– 29. Hartman, A. B., Venkatesan, M., Oaks, E. V. & Buysse, J. M. (1990) J. 17032. Bacteriol. 172, 1905–1915. 14. Carpenter, C. L. & Cantley, L. C. (1996) Curr. Opin. Cell. Biol. 8, 153–158. 30. Miao, E. A., Scherer, C. A., Tsolis, R. M., Kingsley, R. A., Adams, L. G., 15. Engelbrecht, F., Chun, S. K., Ochs, C., Hess, J., Lottspeich, F., Goebel, W. & Ba¨umler,A. J. & Miller, S. I. (1999) Mol. Microbiol. 34, 850–864. Sokolovic, Z. (1996) Mol. Microbiol. 21, 823–837. 31. Van Gijsegem, F., Gough, C., Zischek, C., Niqueux, E., Arlat, M., Genin, S., 16. Domann, E., Zechel, S., Lingnau, A., Hain, T., Darji, A., Nichterlein, T., Barberis, P., German, S., Castello, P. & Boucher, C. (1995) Mol. Microbiol. 15, Wehland, J. & Chakraborty, T. (1997) Infect. Immun. 65, 101–109. 1095–1114.

Marino et al. PNAS ͉ August 1, 2000 ͉ vol. 97 ͉ no. 16 ͉ 8787 Downloaded by guest on September 27, 2021 32. Dixon, M. S., Jones, D. A., Keddie, J. S., Thomas, C. M., Harrison, K. & Jones, 40. Kobe, B. & Deisenhofer, J. (1993) (London) 366, 751–756. J. D. (1996) Cell 84, 451–459. 41. Kobe, B. & Deisenhofer, J. (1995) Nature (London) 374, 183–186. 33. Boyes, D. C., Nam, J. & Dangl, J. L. (1998) Proc. Natl. Acad. Sci. USA 95, 42. Papageorgiou, A. C., Shapiro, R. & Acharya, K. R. (1997) EMBO J. 16, 15849–15854. 5162–5177. 34. Gassmann, W., Hinsch, M. E. & Staskawicz, B. J. (1999) Plant J. 20, 265–277. 43. Price, S. R., Evans, P. R. & Nagai, K. (1998) Nature (London) 394, 645–650. 35. Belvin, M. P. & Anderson, K. V. (1996) Annu. Rev. Cell. Dev. Biol. 12, 393–416. 44. Hillig, R. C., Renault, L., Vetter, I. R., Drell, T., IV, Wittinghofer, A. & Becker, 36. Hoffmann, J. A., Kafatos, F. C., Janeway, C. A. & Ezekowitz, R. A. (1999) J. (1999) Mol. Cell 3, 781–791. Science 284, 1313–1318. 45. Kobe, B. & Deisenhofer, J. (1995) Curr. Opin. Struct. Biol. 5, 409–416. 37. Lecuit, M., Ohayon, H., Braun, L., Mengaud, J. & Cossart, P. (1997) Infect. 46. Yoder, M. D., Keen, N. T. & Jurnak, F. (1993) Science 260, 1503–1507. Immun. 65, 5309–5319. 47. Heffron, S., Moe, G. R., Sieber, V., Mengaud, J., Cossart, P., Vitali, J. & 38. Braun, L., Nato, F., Payrastre, B., Mazie´, J. C. & Cossart, P. (1999) Mol. Jurnak, F. (1998) J. Struct. Biol. 122, 223–235. Microbiol. 34, 10–23. 48. Kraulis, P. (1991) J. Appl. Crystallogr. 24, 946–950. 39. Marino, M., Braun, L., Cossart, P. & Ghosh, P. (1999) Mol. Cell 4, 1063–1072. 49. Merrit, E. A. & Bacon, D. J. (1997) Methods Enzymol. 277, 505–524.

8788 ͉ www.pnas.org Marino et al. Downloaded by guest on September 27, 2021