The crystal structure of the rhomboid peptidase from Haemophilus influenzae provides insight into intramembrane

M. Joanne Lemieux*, Sarah J. Fischer, Maia M. Cherney, Katherine S. Bateman, and Michael N. G. James*

Group in Structure and Function, Department of Biochemistry, University of Alberta, Edmonton, AB, Canada T6G 2H7

Communicated by Robert M. Stroud, University of California, San Francisco, CA, November 10, 2006 (received for review October 24, 2006) Rhomboid peptidases are members of a family of regulated in- brane antigen-1 (AMA1), known to play a role in the tramembrane peptidases that cleave the transmembrane segments merozoite stage of falciparum (15). In the patho- of integral membrane . Rhomboid peptidases have been genic Providencia stuartii, rhomboids have been shown to cleave shown to play a major role in developmental processes in Dro- and release a quorum-sensing factor (16, 17). sophila and in mitochondrial maintenance in yeast. Most recently, the function of rhomboid peptidases has been directly linked to Results apoptosis. We have solved the structure of the rhomboid peptidase Haemophilus influenzae GlpG (hiGlpG) Structure. We have overex- from Haemophilus influenzae (hiGlpG) to 2.2-Å resolution. The pressed various prokaryotic homologs of rhomboid peptidase for phasing for the crystals of hiGlpG was provided mainly by molec- structural analysis, namely, hiGlpG, GlpG ular replacement, by using the coordinates of the Escherichia coli (ecGlpG), and subtilis YqgP. Crystals were obtained for rhomboid (ecGlpG). The structural results on these rhomboid pep- both hiGlpG and ecGlpG; those of hiGlpG diffracted to a higher tidases have allowed us to speculate on the catalytic mechanism of resolution. The structure of hiGlpG was solved by molecular cleavage in a membranous environment. We have iden- replacement by using the atomic coordinates from ecGlpG (3). tified the relative disposition of the nucleophilic to the The overall structure of hiGlpG consists of a six-␣-helical general base/acid function of the conserved histidine. Modeling a bundle (H1 to H6) (Fig. 1) with residues for the tetrapeptide substrate in the context of the rhomboid structure located in a central cavity between H1 and H3 near the periplas- reveals an comprising the side chain of a second mic surface of the protein. A long conserved loop (L1) is located conserved histidine and the main-chain NH of the nucleophilic between the first and second transmembrane helices, H1 and H2, serine residue. In both hiGlpG and ecGlpG structures, a water at the periplasmic side of the protein membrane, thus blocking molecule occupies this oxyanion hole. access to the active site. The L1 loop has a prominent amphi- pathic nature (3); it extends beyond the 6-helical bundle into the intramembrane peptidase ͉ membrane protein ͉ rhomboid ͉ hydrophobic environment of the . A short two- x-ray crystallography residue loop connects H2 and H3 at the cytoplasmic end of the molecule. H3 and H4 are connected by a loop, L3, that extends homboid peptidases belong to family S54 of intramembrane as a coil into the internal cavity in the center of the peptidase. Rserine peptidases (1); these carry out proteolysis of At the catalytic serine residue, this random coil becomes ␣- single transmembrane substrates within the environment of the helical like in bacterial . H5 is unusual in that it is not lipid bilayer. Rhomboid peptidases cleave their substrates in the fully ␣-helical but rather deformed. H5 does interact through outer leaflet of the lipid bilayer, thereby releasing an exocellular hydrophobic interactions with H2 and H4. H6 has the catalytic signal that can, in turn, play a role in signaling (for histidine near its N terminus, and it extends deep into the review, see ref. 2). Although the catalytic mechanism for soluble cytoplasmic space at its C terminus. serine peptidases has been well characterized, both biochemically and structurally, we are just beginning to understand the mecha- Bound Lipid Molecules. The detergent mixture present through- nism of serine peptidases in the membrane environment (3). out the crystallization procedure was C E (AnaPOE–C E ) Although the rhomboids are a newly discovered family of 12 8 12 8 with lower amounts of dodecylmaltoside (DDM) remaining peptidases, they have been identified in all kingdoms (4, 5). Their after detergent exchange. The electron density for two mole- diverse functions are being revealed through genetic screens and cules of C E could be identified in the electron density maps. developmental and cell biology studies (6). Rhomboid pepti- 12 8 Both molecules pack between two hiGlpG molecules (Fig. 1 A dases have been shown to play a role in releasing EGF from the membrane environment. In melanogaster, where and B). The first C12E8 interacts with helices H3 and H6 on one rhomboids were first discovered (7), they are expressed in a of the hiGlpG molecules and with H6 on the other. The second temporal and a tissue specific manner to cleave three different C12E8 interacts with helices H1 and H2 on one hiGlpG substrates: spitz, gurkin, and keren. Rhomboid cleavage of these substrates occurs in the membrane-spanning helix, and the Author contributions: M.J.L., S.J.F., and M.M.C. designed research; M.J.L., S.J.F., M.M.C., and released peptide then interacts with the EGF receptor (8, 9). K.S.B. performed research; M.J.L., K.S.B., and M.N.G.J. analyzed data; and M.J.L., K.S.B., and More recently, the function of a mitochondrial rhomboid M.N.G.J. wrote the paper. peptidase, -associated rhomboid-like (PARL) pepti- The authors declare no conflict of interest. dase, has been associated with cytochrome c release from the Abbreviations: hiGlpG, H. influenzae GlpG; ecGlpG, E. coli GlpG; PA, phosphatidic acid. mitochondria, thereby affecting apoptosis (10). In addition, Data deposition: The coordinates for the H. influenzae GlpG structure have been deposited mitochondrial PARL has been linked to insulin resistance and in the , www.pdb.org (PDB ID code 2NR9). Type II (11). The role of rhomboid peptidases in *To whom correspondence may be addressed. E-mail: [email protected] or mitochondrial maintenance has been studied in detail in yeast [email protected]. (12, 13). In , rhomboid peptidases have been shown This article contains supporting information online at www.pnas.org/cgi/content/full/ to be important for host cell invasion by (14). 0609981104/DC1. In addition, rhomboids have been shown to cleave apical mem- © 2007 by The National Academy of Sciences of the USA

750–754 ͉ PNAS ͉ January 16, 2007 ͉ vol. 104 ͉ no. 3 www.pnas.org͞cgi͞doi͞10.1073͞pnas.0609981104 Downloaded by guest on September 24, 2021 Fig. 1. Structure of hiGlpG. (A) Cartoon representation of hiGlpG colored in a rainbow fashion, with the N terminus being dark blue and the C terminus being red. Three lipids, PA (cyan) and two detergent molecules, C12E8 (yellow), can be seen associated with hiGlpG. (B) View A rotated 90°.

molecule and with H2 on the other. There are three lipid areas that appear to have variability are the L1 loop, located molecules visible in the electron density map that we have between H1 and H2, and the H5 helix. In fact, we see relatively presently interpreted as phosphatidic acid (PA); the phospho- weak electron density for H5, suggesting some conformational ryl group has significantly higher electron density than the variability for this helix. On the other hand, we see density for other atoms in the glycerol backbone and acyl chains. (Figs. 1 the C terminus of the molecule that has allowed us to build H6 A and B and 2). PA has been shown to provide a source of the further into the cytoplasm than that for ecGlpG. The C terminus signaling lipid diacylglycerol (18). It is possible that these lipids of hiGlpG actually extends beyond the lipid bilayer boundary, may be cardiolipin (CL), phosphatidylserine (PS) or phos- similar to that seen with the GlpT structure (19). phatidylethanolamine (PE), the main lipids found in the E. coli lipid bilayer. Two PA molecules can be seen flanking the L1 Rhomboid Active Site. Most serine peptidases are characterized by ␥ loop, a loop that is proposed to be flexible in order for a consisting of the nucleophilic O atom of the substrate to bind (see below and ref. 3). Another PA is located serine, a general acid/base (the imidazole ring of a histidine) that near H6. It is unclear at present whether these lipids merely assists in the deprotonation of the nucleophile and an aspartic play a structural role, acting as chaperones, or whether they acid carboxylate that helps to maintain an optimal position for play a role in rhomboid function. the imidazole ring of the histidine during catalysis. Importantly, and in addition to these groups, there is an oxyanion hole, a feature that provides electrophilic assistance to the nucleophilic Comparison of hiGlpG and ecGlpG Structures. Superimposition of ␥ the C␣ atoms of the hiGlpG and ecGlpG structures reveals that attack by the serine O on the carbonyl carbon atom of the hiGlpG and ecGlpG share a common fold as predicted, with an scissile bond. Rhomboid has these features as well, but the molecular details are different. rmsd of 1.09 Å for 137 equivalent C␣ atom pairs. An overlay of In the native unbound form of rhomboid, the residues com- the structures demonstrates this agreement (Fig. 3). The only prising the catalytic triad and the oxyanion hole are buried and inaccessible to solvent (Fig. 4). Three loops, L1, L3 and L5, have to move substantially in order that the substrate can gain access to the active site (3). The B-factors for these loops are high, ranging from 60 Å2 to 70 Å2 for L1, 50 Å2 for L3, and 60 Å2 for L5, supporting this hypothesis. The overall B-factor for hiGlpG is 35.5 Å2. A possible sequence of events could include: the destabilized area of the transmembrane helix of the substrate (20) docks to rhomboid displacing the L1 gate (Gly-29 to Ser-55 in hiGlpG). The L3 loop (Gly-109 to Gly-114) and the L5 loop (Gly-161 to Gly-165) are then displaced. The beginning and ending residues of these loops are defined principally by glycine residues that have large associated conformational flexibility, which facilitates moving these three loops to provide access for the substrate to the active site (Fig. 4).

Discussion Rhomboid Mechanism. We have explored the binding of a short segment of a substrate polypeptide, spitz, by superimposing the side chain of His-169 and O␥ of Ser-116 from hiGlpG onto the structure of complexed to the turkey ovomucoid third domain, OMTKY3 (1CHO.pdb) (21). This superposition, although not very

Fig. 2. Electron density for lipid. 2FoϪFc electron density for a lipid molecule, accurate, gives a reasonable position for the segment of substrate BIOCHEMISTRY PA, bound between loop L1 and helix H3. polypeptide from P2 to P2Ј [nomenclature of Schechter and Berger

Lemieux et al. PNAS ͉ January 16, 2007 ͉ vol. 104 ͉ no. 3 ͉ 751 Downloaded by guest on September 24, 2021 Fig. 3. Superimposition of hiGlpG and ecGlpG rhomboid structures. Wall-eyed stereoview of structural superimposition of rhomboid structures. hiGlpG is shown in magenta, and ecGlpG is shown in green. Structural alignment was carried out with ALIGN (27). See supporting information (SI) Fig. 5 for sequence alignment.

(22)], fitting into the rhomboid active site (Fig. 4). The substrate has precludes the binding of water or the side chain of Asn-166. A been modeled as four residues surrounding the cleavage site of more likely candidate for a third member of the catalytic triad spitz, Ala-Ser-Gly-Ala (20). The cleavage specificities of several is the packing of the side chain of Tyr-120 against the imidazole rhomboids indicate a preference for residues, having small side ring of His-169, thereby ensuring some stability in the position chains, such as glycine and alanine (3). of the general base (3). The water molecule in our structure, W40 (Fig. 4A), which It has not escaped our attention that the nucleophilic Ser-116 bridges Ser116NH and His65N␧2H and forms H bonds to is at the beginning of helix H4 and that the NH of Ser-116 would Phe113O and Val162O, is located in the oxyanion hole; this form a good hydrogen bond to the oxyanion of the tetrahedral water molecule would be displaced by the carbonyl-oxygen atom intermediate. Therefore, additional stabilization of the oxyanion of the P1 residue of the substrate, thus placing the P1 carbonyl- will come from the partial positive charge on the helix dipole of carbon atom in an ideal position for nucleophilic attack by helix H4. This situation is analogous to the peptide dipole Ser116O␥. The nucleophilic attack is facilitated by the transfer stabilization provided by the helix bearing the Ser-221 in bac- of the proton on O␥ to the N␧2 atom of the imidazole ring of terial subtilisin (24). Subtilisin has the side-chain amide from His-169. A good hydrogen bond from Ser116O␥...N␧2 His-169 Asn-155 that contributes to the formation of the oxyanion hole. is already established in the native , thereby defining the Rhomboid is similar to subtilisin, but it uses a histidine side chain pathway for this proton transfer. The oxyanion hole in rhomboid rather than an asparagine carboxamide. hiGlpG comprises the main-chain NH of Ser-116 and the We have solved the structure of a full-length rhomboid protonated N␧2 of the imidazole of His-65. This favorable peptidase, namely hiGlpG. The superimposed structures of tautomeric form for the imidazole ring of His-65 is ensured by hiGlpG and ecGlpG are similar; however, their active sites have the main-chain hydrogen bond from Leu61NH to N␦1 of His-65 subtle differences and our structure has allowed us to identify the (Fig. 4). Not only does this oxyanion hole stabilize the developing oxyanion hole, an essential feature for serine peptidases that is negative charge on the carbonyl-oxygen atom in the tetrahedral required to propose confidently a sound catalytic mechanism. intermediate, but it also assists the nucleophilic attack by Perhaps the most unexpected result of our structural analysis is electrophilically enhancing the polarization of the carbonyl to find the catalytic machinery of a soluble globular serine and (C—O) bond of the substrate. Mutagenesis experiments have the general peptidase like ␣-chymotrypsin or subtilisin Ϸ9Å demonstrated that, in ecGlpG and YqgP, the residues equivalent below the surface of the bilayer. It is also surprising to find a to His-169 and Ser-116 from hiGlpG are essential for activity catalytic site in hiGlpG resembling that of bacterial subtilisin (23). In addition, mutagenesis in YqgP of the residue equivalent with the nucleophilic serine and general base histidine residues to His 65 in hiGlpG, which is also highly conserved, resulted in near the N termini of two separate ␣-helices. Similarly, the markedly reduced activity. oxyanion hole is formed by the side chain of an asparagine in In hiGlpG there is no direct equivalent to the third member subtilisin. There is clearly an evolutionary relationship between 2Ϫ of the catalytic triad, i.e., an AspCOO , or an AsnCONH2.In these two families of serine peptidases. ecGlpG, the side-chain amide of Asn-251 forms a hydrogen bond to a water molecule, and the water, in turn, forms a Materials and Methods hydrogen bond to N␦1 of His-254, the general base in that Cloning and Expression. H. influenzae DNA was purchased from enzyme. hiGlpG has no equivalent water molecule, and the American Type Culture Collection (Manassas, VA). With the close contact of the imidazole ring of His-169 to Phe-155 likely use of restriction digestion, PCR products were then ligated

752 ͉ www.pnas.org͞cgi͞doi͞10.1073͞pnas.0609981104 Lemieux et al. Downloaded by guest on September 24, 2021 Fig. 4. Mechanism of H. influenzae GlpG. (A) Top view from the periplasmic space of hiGlpG. Residues in the active site are labeled in gray. Water (WAT40) located in the active site between Ser-116 and His-65 is shown in red. (B) Similar view to A but with loops L1, L3, and L5 removed. A model of the D. melanogaster rhomboid substrate spitz (magenta) has been docked manually into the active site. (C) Wall-eyed stereoview of spitz docked into the active site of hiGlpG. The hydroxyl oxygen of S116 is hydrogen boding with H169. H169 is stabilized by Y120. The carbonyl oxygen from the P1 residue is stabilized by H65 and the backbone amide of S116, forming the oxyanion hole. (D) Proposed catalytic mechanism of hiGlpG (ChemDraw).

into pBAD-MycHisA vector (Invitrogen, Burlington, Canada). inhibitor mixture (NEB, Beverly, MA), 1 mM PMSF, and 0.1 Expression was carried out in Top10 cells (Invitrogen) in mg/ml DNase and lysed by using an TEmulsiFlex-C3T, (Avestin, Luria–Bertani medium supplemented with ampicillin. Induc- Ottawa, Canada). Unbroken cells were pelleted in a JA17 rotor tion of the various expression constructs were carried out as at 10,000 ϫ g for 20 min. Membrane fractions were collected by described below with arabinose. ultracentrifugation in a L8–80 ultracentrifuge at 100,000 ϫ g in a 45Ti rotor (Beckman). Membrane Fraction Isolation. Cells were grown to an OD600 of 0.4 and induced with 0.0002% arabinose at 24°C for 16 h. Cells were Protein Purification. Membrane fractions were homogenized in 50 harvested at 12,227 ϫ g for 10 min by using an AvantiJ1.8000 mM Tris, 300 mM NaCl, 30 mM imidazole, 20% glycerol, and 1%

rotor (Beckman, Fullerton, CA). Cells were resuspended in four DDM (pH 8.0). The solution was stirred for 30 min, followed by BIOCHEMISTRY volumes of TBS supplemented with an EDTA-free peptidase- ultracentrifugation for 30 min at 110,000 ϫ g in a 45Ti rotor

Lemieux et al. PNAS ͉ January 16, 2007 ͉ vol. 104 ͉ no. 3 ͉ 753 Downloaded by guest on September 24, 2021 (Beckman, USA). The supernatant was incubated with Ni-NTA had one molecule in the AU. Molecular replacement was carried resin (Qiagen, Ontario, Canada) for 2 h. The resin was then out by using MolRep (25) with the ecGlpG coordinates. Re- collected and washed with 20 column volumes (CV) of 50 mM finement was carried out with Refmac5 (26). See SI Table 1 for Tris, 300 mM NaCl, 30 mM imidazole, 20% glycerol, and 0.1% statistical results. DDM (pH 8.0), followed by 20 CV of the above stated buffer with 35 mM imidazole. Protein fractions were eluted in a We thank Dr. Ya Ha for generously making available the E. coli GlpG step-wise manner with 3 ϫ 2 CV of the above-described buffer coordinates (2IC8.pdb); Drs. James Holden (ALS, BL 8.3.1), Ernst containing 250, 500, and 1,000 mM imidazole. Protein was then Bergmann, and Jonathan Parrish for assistance with data collection; all concentrated by using a 30K centrifugal filter (Millipore, Bed- members of the M.N.G.J. laboratory for their support; and Wendy ford, MA) and subjected to detergent exchange on a Spe- Kasinec for administrative assistance. X-ray diffraction data were col- hadex200 (16/60) column (Amersham, Piscataway, NJ) in 50 mM lected at beam line 8.3.1 of the Advanced Light Source (ALS) at the Lawrence Berkeley National Laboratory, Berkeley, CA, under an agree- Tris (pH 8.0), 0.05% C12E8, 20 mM NaCl, and 10% glycerol (see SI Fig. 6). ment with the Alberta Synchrotron Institute (ASI). The ALS is operated by the Department of Energy and supported by the National Institutes Crystallographic Analysis. Protein for crystallization was concen- of Health (Bethesda, MD). Beam line 8.3.1 was funded by the National trated by using Millipore Ultrafree centrifugal concentrators, 30 Science Foundation, the University of California, and Henry Wheeler. The ASI synchrotron access program is supported by grants from the kDa molecular mass cutoff to a concentration of 5 mg/ml. Alberta Science and Research Authority (ASRA) and the Alberta Crystals of hiGlpg were obtained in 25% PEG 4000, 0.1 M citrate Heritage Foundation for Medical Research (AHFMR). This work has (pH 6.0), 1 M NaCl, 3% ethanol, and 15% glycerol and grew to been supported by the Canadian Institute for Health Research (CIHR). ␮ ϫ ␮ ϫ ␮ dimensions of 100 m 50 m 20 m in 1–2 weeks. Crystals M.J.L is supported by a fellowship scholarships from CHIR, CIHR were directly flash-cooled in liquid nitrogen. Data were collected strategic training program for membrane proteins and cardiovascular at the Advanced Light Source beam line 8.3.1. Two different disease, and Alberta Heritage Foundation for Medical Research. space groups were obtained: monoclinic C2 and orthorhombic M.N.G.J. acknowledges support from the Canada Research Chair’s P212121. The monoclinic data diffracted to higher resolution and Program.

1. Rawlings ND, Morton FR., Barrett AJ (2006) Nucleic Acids Res 34:D270– 15. Howell SA, Hackett F, Jongco AM, Withers-Martinez C, Kim K, Carruthers D272. VB, Blackman MJ (2005) Mol Microbiol 57:1342–1356. 2. Urban S, Freeman M (2002) Curr Opin Genet Dev 12:512–518. 16. Urban S, Schlieper D, Freeman M (2002) Curr Biol 12:1507–1512. 3. Wang Y, Zhang Y, Ha Y (2006) Nature 444:153–155. 17. Rather PN, Ding X, Baca-DeLancey RR, Siddiqui S (1999) J Bacteriol 4. Wasserman JD, Urban S, Freeman M (2000) Genes Dev 14:1651–1663. 181:7185–7191. 5. Pascall JC, Brown KD (1998) FEBS Lett 429:337–340. 18. Dillon DA, Wu WI, Riedel B, Wissing JB, Dowhan W, Carman GM (1996) 6. Freeman M (2004) Nat Rev Mol Cell Biol 5:188–197. J Biol Chem 271:30548–30553. 7. Urban S, Lee JR, Freeman M (2001) Cell 107:173–182. 19. Huang Y, Lemieux MJ, Song J, Auer M, Wang DN (2003) Science 301:616–620. 8. Lee JR, Urban S, Garvey CF, Freeman M (2001) Cell 107:161–171. 20. Urban S, Freeman M (2003) Mol Cell 11:1425–1434. 9. Urban S, Lee JR, Freeman M (2002) EMBO J 21:4277–4286. 21. Fujinaga M, Sielecki AR, Read RJ, Ardelt W, Laskowski M, Jr, James MN 10. Cipolat S, Rudka T, Hartmann D, Costa V, Serneels L, Craessaerts K, Metzger (1987) J Mol Biol 195:397–418. K, Frezza C, Annaert W, D’Adamio L, et al. (2006) Cell 126:163–175. 22. Schechter I, Berger A (1967) Biochem Biophys Res Commun 27:157–162. 11. Walder K, Kerr-Bayles L, Civitarese A, Jowett J, Curran J, Elliott K, Trevaskis 23. Lemberg MK, Menendez J, Misik A, Garcia M, Koth CM, Freeman M (2005) J, Bishara N, Zimmet P, Mandarino L, et al. (2005) Diabetologia 48:459–468. EMBO J 24:464–472. 12. Herlan M, Vogel F, Bornhovd C, Neupert W, Reichert AS (2003) J Biol Chem 24. Hol WG, van Duijnen PT, Berendsen HJ (1978) Nature 273:443–446. 278:27781–27788. 25. Vagin A, Teplyakov A (1997) J Appl Crystallogr 30:1022–1025. 13. McQuibban GA, Saurya S, Freeman M (2003) Nature 423:537–541. 26. Steiner RA, Lebedev AA, Murshudov GN (2003) Acta Crystallogr D 59(Pt 14. Brossier F, Jewett TJ, Sibley LD, Urban S (2005) Proc Natl Acad Sci USA 12):2114–2124. 102:4146–4151. 27. Cohen GH (1997) J Appl Crystallogr 30:1160–1161.

754 ͉ www.pnas.org͞cgi͞doi͞10.1073͞pnas.0609981104 Lemieux et al. Downloaded by guest on September 24, 2021