st8308.qxd 03/22/2000 11:36 Page 305

Research Article 305

Crystal structure of cathepsin X: a flip–flop of the ring of His23 allows carboxy-monopeptidase and carboxy-dipeptidase activity of the protease

Gregor Gunčar1, Ivica Klemenčič1, Boris Turk1, Vito Turk1, Adriana Karaoglanovic-Carmona2, Luiz Juliano2 and Dušan Turk1*

Background: Cathepsin X is a widespread, abundantly expressed papain-like mammalian lysosomal cysteine protease. It exhibits carboxy-monopeptidase as well as carboxy-dipeptidase activity and shares a similar activity profile with cathepsin B. The latter has been implicated in normal physiological events as well as in various pathological states such as rheumatoid arthritis, Alzheimer’s disease and cancer progression. Thus the question is raised as to which of the two activities has actually been monitored.

Results: The crystal structure of human cathepsin X has been determined at 2.67 Å resolution. The structure shares the common features of a papain-like enzyme fold, but with a unique active site. The most pronounced feature of the cathepsin X structure is the mini-loop that includes a short three-residue insertion protruding into the active site of the protease. The residue Tyr27 on one side of the loop forms the surface of the S1 substrate-binding site, and His23 on the other side modulates both the carboxy-monopeptidase and the carboxy-dipeptidase activity of the enzyme by binding the C-terminal carboxyl group of a substrate in two different sidechain conformations.

Addresses: 1Department of Biochemistry and Molecular Biology, Jozef Stefan Institute, Jamova 39, 1000 Ljubljana, Slovenia and 2Departamento de Biofisica, Escola Paulista de Medicina, Rua Tres de Maio 100, 04044-020 Sao Paulo, Brazil.

*Corresponding author. E-mail: [email protected]

Key words: Alzheimer’s disease, cathepsin B, cathepsin X, papain-like cysteine protease

Received: 1 November 1999
Revisions requested: 8 December 1999
Revisions received: 6 January 2000
Accepted: 7 January 2000
Published: 29 February 2000

Structure 2000, 8:305–313

Conclusions: The structure of cathepsin X exhibits a binding surface that will assist in the design of specific inhibitors of cathepsin X as well as of cathepsin B and thereby help to clarify the physiological roles of both proteases.

0969-2126/00/$ – see front matter © 2000 Elsevier Science Ltd. All rights reserved.

Introduction

Cathepsin X belongs to the group of lysosomal cysteine proteases, which constitute a numerous and important part of the family of papain-like cysteine proteases (clan CA; reviewed in [1]). The research interest in this field is increasing rapidly, together with the number of identified enzymes, as it becomes evident that lysosomal cysteine proteases are, besides being involved in intracellular catabolism, also involved in a number of important cellular processes [2,3]. Currently 11 human cathepsins are known at the sequence level (B, C, F, H, K, L, O, S, V, W and X). Some of them are expressed in a wide variety of cells (B, C, F, H, L, O and X), whereas others (S, K, V and W) are tissue-specific and have been shown to be or proposed to be involved in more specialized processes (reviewed in [3]). They are involved in invariant chain processing and antigen presentation [4–7], in bone remodeling [8–10] and in proteolytic processing of various proteins [11], among these the most important being thyroglobulin [12–14] and granzyme B [15]. Whereas the former is an important source of thyroid hormones, the latter is involved in the activation of the cell death proteases, the caspases (reviewed in [3]). Besides their normal physiological roles, lysosomal cysteine proteases are involved in a number of pathological processes such as tumor malignancy and progression (reviewed in [16–18]), muscular dystrophy [19], rheumatoid arthritis [20], osteoporosis and other bone disorders [2,10,21] and Alzheimer’s disease [22].

Cathepsin X was discovered only recently [23–25] and its biochemical characterization is just emerging [25–27]. It is the only known lysosomal cysteine protease exhibiting carboxy-monopeptidase activity. Besides carboxy-monopeptidase activity cathepsin X also exhibits carboxy-dipeptidase activity and is readily inhibited by CA074, previously known as the specific inhibitor of cathepsin B, which is the only known carboxy-dipeptidase among lysosomal cysteine proteases (I.K. et al., unpublished observations). The inhibitor has been used as a standard tool to assess functions of cathepsin B in numerous in vivo and in vitro studies, for example in relation with exocytosis [28] and cancer (reviewed in [16,18,29]). Similarly, the implication of cathepsin X in the development of Alzheimer’s disease is based on the use of inhibitors [27] and therefore requires confirmation.

A structural investigation of the active site of cathepsin X is, therefore, a mandatory step for understanding the specific

306 Structure 2000, Vol 8 No 3

features of cathepsin X in order to establish the structural basis of the specificity of similar cathepsins, in particular cathepsin B. This will enable us to design substrates and inhibitors that will address specifically only the desired enzyme. It has already been suggested, from sequence data and a substrate study [26], that a short insertion of residues 24–26, which constitutes the tip of the mini-loop placed in the immediate vicinity of the reactive site, must be involved in interaction with substrate. It was also suggested [26] that the mini-loop residue His23 is involved in recognition and binding of the C-terminal carboxylic group of a bound substrate. The crystal structure of cathepsin X presented here has enabled us to elucidate the structural basis of the mechanism by which cathepsin X exhibits both carboxy-monopeptidase and carboxy-dipeptidase activity.

Results

Overall structure

The structure of human cathepsin X comprises the sequence of the whole mature single-chain protein of 242 residues, which are numbered consecutively from Leu1 to Val242. The asymmetric unit of orthorhombic crystals contains two molecules. The three-dimensional structures of the two molecules are almost identical. The complete chains of both molecules, with the exception of a few sidechains, are well resolved in the electron-density maps. The superimposed structures show a root mean square deviation (rmsd) of 0.55 Å for all Cα atoms; when only residues included into noncrystallographic symmetry (NCS) constraints are used for Cα atom superposition (235 residues), the resulting rms fitting procedure yields a considerably lower value (0.26 Å).

The structure possesses the typical fold of a papain-like cysteine protease, a two domain structure [30,31]. The predominant secondary structural features of the N-terminal domain on the left, or L domain, are α helices, whereas the C-terminal domain, or R domain, has a β-barrel fold (Figure 1). The two domains interact along an extended interface, reminiscent of a book with the spine in front. At the top the two domains separate and form a ‘V-shaped’ active-site cleft with the reactive site residues, Cys31 and His180, each coming from a different domain.

The structure of cathepsin X, as judged from a number of superimposed pairs, is rather similar to the structures of human , rat cathepsin B, zymogen of cathepsin L and papain (yielding respectively 145, 148, 146 and 141 equipositioned Cα pairs at a threshold distance of 1.5 Å). It is less similar to , actinidin, glycyl-endopeptidase I and (yielding 122, 116, 128 and 130 pairs) and least similar to human cathepsin B and porcine cathepsin H (with 89 and 73 pairs; Figure 2). Although by numerical criteria the structure of cathepsin X does not superimpose on the structure of cathepsin B as well as on those of other related enzymes, a visual inspection reveals that the structures of cathepsin B, and to a smaller extent cathepsin H, are the most similar in the position and architecture of the surface loops to the structure of cathepsin X. Some outstanding loops of the cathepsin X structure (110–123 and 189–193) appear at similar positions to some cathepsin B loops (132–139 and 208–212). Particularly striking is the exposed position of the ring of Phe115, placed at the tip of the 110–123 loop. The short three-residue insertion (Ile24, Pro25 and Gln26) appears to be of particular importance for the specificity of cathepsin X. It is located within the mini-loop (Figure 3), an otherwise conserved loop between Gln19 and Cys25 (using papain numbering), forming the wall of the S2′ substrate-binding site [32]. An exception is the loop of glycyl-endopeptidase, in which a tyrosine residue replaces the conserved Ser21 (papain numbering).

The mature form of human cathepsin X contains 11 cysteine residues. Ten of them are involved in disulfide bonds (28–71, 65–103, 93–109, 112–118 and 153–235). The only

Figure 1

Structure of cathepsin X. (a) Ribbon diagram of cathepsin X (blue ribbons) in the standard view with the mini-loop shown in green. The His23, Tyr27 mini-loop sidechains are shown in green and the Cys31 catalytic residue sidechain in yellow. Disulfide bridges are shown as orange sticks. The figure was prepared using the program RIBBONS [52]. (b) A stereo plot of the Cα trace of cathepsin X. The dotted lines represent disulfide bridges.

Research Article Crystal structure of human cathepsin X Gunčar et al. 307

Figure 2

Structure-based alignment of the sequences of cathepsin X (catx) with cathepsin B (1huc [33]), cathepsin H (8pch [34]), cathepsin L (1icf [53]) and papain (9pap [54]). Identical residues are shaded in black and similar residues are in gray.

sulfhydryl group of the enzyme that is accessible to solvent is the active site Cys31, which appears to be blocked by a small unidentified group as revealed by the final difference electron-density maps (2Fo–Fc; Figure 3). The disulfides 65–103 and 28–71 are well conserved among the papain-like enzyme structures. Like cathepsin B, cathepsin X does not have the disulfide 153–200 (using papain numbering) to fix the upper loops defining the S2 and S1′ substrate binding sites. Instead there is a disulfide 153–235 much lower in the core of the β barrel domain, which is absent in the related structures. The disulfide 93–109 has an equivalent only in the cathepsin B structure (100–132), whereas the cathepsin X disulfide 112–118 appears to be unique within the family of papain-like enzymes.

Two cis prolines, Pro25 and Pro53, have been resolved in the structure. Pro25 is positioned on the tip of the mini-loop.

Active-site cleft

The active-site cleft of cathepsin X is reminiscent of that of related enzymes. It is a V-shaped canyon with no features similar to the occluding loop of carboxy-dipeptidase cathepsin B [33] or the mini-chain of amino-peptidase cathepsin H [34], which block access to a substrate. A closer inspection, taking into account the sidechains, shows that the positions of the otherwise conserved residues (Gly20 and Ser21 using papain numbering) are occupied by His23 and Tyr27. The Gly20 and Ser21 in all

Figure 3

A stereoview of the final 2Fo–Fc electron-density map of the cathepsin X active site contoured at 1.3 σ (blue) superimposed on a stick model of cathepsin X. The cathepsin X molecule is shown in orange and the mini-loop is in green. The oxygen, nitrogen and sulfur atoms are represented as small spheres in red, blue and yellow, respectively. Water molecules are drawn as pink spheres. The figure was prepared with the program MAIN [49].
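
Contouring a map "at 1.3 σ" is commonly understood as drawing the surface at a density value 1.3 standard deviations above the map mean. The following Python sketch illustrates that convention only; the function name and the toy grid values are hypothetical, and it is not a description of how MAIN computes contour levels.

```python
import statistics

def sigma_contour_level(density_values, n_sigma=1.3):
    """Return the absolute density value that corresponds to n_sigma.

    Assumption: the level is mean + n_sigma * (population standard
    deviation) taken over all grid points of the map.
    """
    mean = statistics.fmean(density_values)
    sigma = statistics.pstdev(density_values)
    return mean + n_sigma * sigma

# Toy grid values (arbitrary units), purely illustrative.
toy_map = [0.0, 0.0, 2.0, 2.0]
level = sigma_contour_level(toy_map, n_sigma=1.3)
print(f"contour level: {level:.2f}")
```

For a map that has already been normalized to zero mean and unit variance, the returned level is simply the σ multiple itself.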


other known structures of the papain-like enzymes form a short loop with the serine sidechain pointing away from the active-site cleft, whereas the three residue insertion (Ile24, Pro25 and Gln26) forces the His23 and Tyr27 into an extended conformation with their sidechains intruding into the active site cleft of the enzyme. The imidazole ring of His23 attaches to the surface of the mini-loop (Gln22–Cys31; Gln19–Cys25, using papain numbering) and actually occupies the place, which in equivalent structures is the S2′ substrate-binding site. The arrangement is similar to that in cathepsin H, in which Thr83P fills the S2 substrate-binding site. The difference is that in cathepsin H the C-terminal carboxylic group binds the N terminus of a bound substrate, whereas in cathepsin X the carboxylic group of a bound substrate is attracted by the positively charged sidechain of His23.

The active-site cleft in cathepsin X thus comprises substrate-binding sites S2, S1 and S1′ [31,32]. The conserved catalytic residues, Cys31 and His180, and the functional groups of residues involved in interaction with a substrate mainchain are observed at the expected positions. The amide proton on Nε1 of Trp202 is in a position to bind the carbonyl of the P1′ residue. The oxyanion hole between the Gln22 sidechain and the amide of Cys31 positioned at the N terminus of the central α helix from the L domain can bind the carbonyl of the P1 residue. Gly74 (Gly66 in papain) provides its amide proton as well as carbonyl oxygen to form a short antiparallel β-ladder hydrogen-bonded arrangement with the mainchain carbonyl and amide group of a bound P2 residue. The interaction between the carbonyl and the electrostatic positive potential at the edge of the highly conserved Trp32 is evident [32].

In the related papain-like proteases the sidechain of a bound substrate P1 residue is in contact with the shallow groove of the S1 binding site formed between Ser21, Cys22 and Gly23 (papain numbering) and Cys65, Asn64, Gly66 on the other side bridged by the disulfide Cys22–Cys65 (papain numbering). (An exception is glycyl-endopeptidase, in which the Glu23 of the enzyme fills the space that is normally occupied by the moiety of the sidechain.) In the cathepsin X structure, however, the groove is much deeper because of the Tyr27 phenol ring, which points downwards towards the catalytic cysteine residue, demarcating the region between the S1 and S2′ binding site. The Glu72 sidechain, which points upwards on the other side of the S1 pocket, provides a negative charge, which is not unique for a papain-like enzyme. A similar negative charge is provided by Asp72 in actinidin, whereas the occluding loop residue Glu122 in cathepsin B enters the S1 binding site from the opposite side (the side of Tyr27 in cathepsin X). The interactions between the P1 residue and the S1 binding site of cathepsin X thus appear to be tighter and also more selective than in the other related proteases.

The place where the P2 residue sidechain binds to cathepsin X is, as in other related proteases, the only substrate site that can be called a pocket [32]. In most papain-like cysteine proteases this site is hydrophobic. The surface of the pocket in cathepsin X also has predominantly hydrophobic character with Val181 at the bottom and Gly154 and Ile178 on the right. Ser205 (papain numbering) is in general quite conserved, exceptions being cathepsin B and cruzipain with a glutamic acid. Out of those related enzymes for which structures are known, cathepsin X is the only one with a histidine residue (His234) at this position. In addition, the specific character of the S2 pocket is modulated in cathepsin X by Asp76, which is a hydrophobic residue in other related structures, usually methionine or . The Asp76–His234 arrangement presumably enables cathepsin X to process substrates with long sidechains with a hydrophilic tail at position P2.

With the extended mainchain of Ala157 and Thr158 forming a substantial part of the surface of the S1′ site, cathepsin X joins the group of cathepsins B and H, in contrast to other related enzymes (cathepsins K, L and S, papain, actinidin and others), in which an additional residue twists the chain-trace into a short curve. In cathepsin B the S1′ binding site is enclosed from the top by the unique trace of the loop between Phe174 and His199, whereas cathepsin H has a valine residue at the position of Ala157. The S1′ binding site of cathepsin X is thus the largest among the related cathepsins of which the structures are known.

Figure 4

Chemical structures of CA074 and CA030 inhibitors.
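
The pair counts and rmsd values quoted in the Overall structure section can be illustrated with a minimal Python sketch. It assumes that the two Cα traces have already been superimposed and matched residue by residue; the function name `ca_pair_stats` and the toy coordinates are hypothetical, and the procedure actually used by the authors' software is not described in the text.

```python
import math

def ca_pair_stats(coords_a, coords_b, threshold=1.5):
    """Count equipositioned C-alpha pairs and compute their rmsd.

    coords_a and coords_b are index-matched lists of (x, y, z) tuples
    in angstroms, assumed to be superimposed beforehand. A pair counts
    as equipositioned when its distance is at most `threshold`.
    """
    close = []
    for p, q in zip(coords_a, coords_b):
        d = math.dist(p, q)
        if d <= threshold:
            close.append(d)
    if not close:
        return 0, 0.0
    rmsd = math.sqrt(sum(d * d for d in close) / len(close))
    return len(close), rmsd

# Toy coordinates (not taken from any deposited structure).
reference = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
mobile = [(0.0, 0.3, 0.0), (3.8, 0.0, 0.4), (7.6, 2.0, 0.0)]

n_pairs, rmsd = ca_pair_stats(reference, mobile, threshold=1.5)
print(n_pairs, round(rmsd, 3))
```

In this toy case the third pair lies 2.0 Å apart and is excluded, so only two pairs contribute to the rmsd. A stricter or looser threshold changes the pair count, which is why the text states the 1.5 Å cut-off alongside the numbers.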


Figure 5

The structure of cathepsin B (thin lines) in complex with the CA030 inhibitor (thick lines) superimposed on the structure of cathepsin X (medium lines). The His23 ring that can be rotated to the cathepsin-B-like position is shown in bold dashed lines.

Discussion

Carboxy-monopeptidase and carboxy-dipeptidase activities of cathepsin X

A kinetic study has shown that cathepsin X can act not only as a carboxy-monopeptidase as shown by Nägler et al. [26] but also as carboxy-dipeptidase (I.K. et al., unpublished observations). Of the substrates DnpGFFFW-OH, DnpGRFFW-OH, DnpGFRFW-OH and DnpGFFRW-OH (sequences in single-letter amino acid code), it has been found that cathepsin X cleaves the last two C-terminal residues from DnpGFRFW-OH. In addition, cathepsin X can be readily inhibited by CA074 (I.K. et al., unpublished observations), which was previously believed to be specific for cathepsin B. CA074 is an epoxysuccinyl-based inhibitor (Figure 4) with its isoleucine–proline dipeptide mimicking substrate binding in the S1′ and S2′ binding sites [35–37]. We therefore used the crystal structure of CA030 in complex with cathepsin B and the proposed substrate model [37] as a template for the modeling of the geometry of substrates bound into the active cleft of cathepsin X (Figure 5). Models of AFFW and AFRFW peptide sequences were used to investigate the mechanisms of carboxy-monopeptidase and carboxy-dipeptidase activities. The results of the modeling study suggest that the proton on Nε2 of His23, together with the amide proton on Nε1 of Trp202, can form the anchor for a carbonyl as well as for a carboxyl group of the C-terminal P1′ substrate residue (Figures 6 and 7). This confirms the proposal based on a homology model of cathepsin X [26]. In order to create an open S2′ substrate-binding site, the His23 ring has to move out of the place occupied in the crystal structure. Simple rotation about the χ1 and χ2 angles of the histidine can bring the imidazole ring into a position equivalent to that occupied by His110 in cathepsin B, where the two

Figure 6

Carboxy-peptidase activity of cathepsin X. (a) The surface of the active-site cleft region of cathepsin X is colored according to electrostatic potential, whereas the surface of the catalytic Cys31 residue is shown in yellow. The modeled substrate AFFW is shown in ball-and-stick representation. Atoms are represented as small spheres, color-coded as in Figure 3. This figure was prepared with WebLab ViewerLite (Molecular Simulations Incorporated, San Diego, CA). (b) Schematic representation of the substrate binding. The substrate (thick line) has been modeled into the active-site cleft of cathepsin X (thin lines). Hydrogen bonds and disulfide bridges are shown as dashed lines.


Figure 7

Carboxy-dipeptidase activity of cathepsin X. His23 in the cathepsin X active site has been rotated to the cathepsin-B-like position. (a) The same color coding as in Figure 6 is used. The model of substrate AFRFW is shown in ball-and-stick representation. (b) Schematic representation of the substrate binding. The substrate (thick line) has been modeled into the active-site cleft of cathepsin X (thin lines). Hydrogen bonds and disulfide bridges are shown as dashed lines.
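
The χ1 and χ2 rotation of His23 described in the text can be made concrete by computing sidechain dihedral angles. The sketch below, in Python, assumes the usual definitions χ1 = N–Cα–Cβ–Cγ and χ2 = Cα–Cβ–Cγ–Nδ1 for histidine; the helper names and the coordinates are made up for illustration and are not taken from the deposited structure.

```python
import math

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees for four points.

    Positive values mean a clockwise rotation from the p1->p0 bond to
    the p2->p3 bond when looking along the central p1->p2 bond.
    """
    front = _sub(p1, p0)
    axis = _sub(p2, p1)
    back = _sub(p3, p2)
    n1 = _cross(front, axis)   # normal of the plane p0, p1, p2
    n2 = _cross(axis, back)    # normal of the plane p1, p2, p3
    axis_len = math.sqrt(_dot(axis, axis))
    x = _dot(n1, n2)
    y = _dot(_cross(n1, n2), axis) / axis_len
    return math.degrees(math.atan2(y, x))

# Made-up histidine atom positions in angstroms (idealized right angles).
his = {
    "N": (1.0, 0.0, 0.0),
    "CA": (0.0, 0.0, 0.0),
    "CB": (0.0, 0.0, 1.5),
    "CG": (0.0, 1.4, 1.5),
    "ND1": (-1.4, 1.4, 1.5),
}

chi1 = dihedral(his["N"], his["CA"], his["CB"], his["CG"])
chi2 = dihedral(his["CA"], his["CB"], his["CG"], his["ND1"])
print(round(chi1, 1), round(chi2, 1))
```

Comparing the χ1 and χ2 values of the crystal conformation with those of a modeled cathepsin-B-like conformation would quantify how far the ring has to rotate.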

histidines His110 and His111 [33,37] bind the carboxyl group of a substrate (Figure 5). Cathepsin X has no equivalent residue to His111 of cathepsin B and therefore interacts with only one carboxyl group oxygen atom of a substrate. This explains why cathepsin X does not differentiate between the substrates with blocked and free C-terminal carboxylic group (I.K. et al., unpublished observations).

The flip-flop of the His23 sidechain between the conformation observed in the crystal structure and the cathepsin B-like conformation is most probably a substrate-induced mechanism. The carboxy-dipeptidase activity appears to be defined by the affinity of cathepsin X for the residue binding into the S1 binding site. Arginine is preferred over phenylalanine to such an extent that, instead of accepting an arginine residue in the S2 binding site, the enzyme modifies the binding pattern by shifting the arginine into the P1 position (I.K. et al., unpublished observations). Thereby two binding sites on the primed side are made available and carboxy-dipeptidase activity is exhibited. From our modeling study it is not evident whether a bound P1 residue sidechain results in a shift of the Tyr27 ring, which then assists in displacement of the His23 sidechain. This is an issue that requires further investigation.

Binding of protein inhibitors

In contrast to the initial report [26], cathepsin X can be inhibited with various cystatin-type protease inhibitors such as cystatin C and stefin A with Ki values in the nanomolar range (I.K. et al., unpublished observations), similar to those obtained for the inhibition of cathepsin B [30]. A modeling study based on the papain–stefin B complex [38] has shown that the sidechains of His23 and Tyr27 clash with residues Val55 and Ala56 in the QVVAG region (data not shown). A similar situation occurs in cathepsin B, in which a cystatin-type inhibitor competes with occluding loop residues for binding into the active-site cleft [39,40]. The obvious solution appears to be that the sidechains of His23 and Tyr27 are displaced when a complex with a cystatin inhibitor is formed. Our modeling study suggests that the ring of His23 flips away into its cathepsin B-like position and the ring of Tyr27 rotates into the S1 substrate-binding site.

Specific inhibitor design

The primary goal of the design of specific inhibitors of lysosomal cysteine proteases is to establish the biological role of a particular enzyme. Cross reactivity is likely to lead to wrong conclusions. The relatively wide tissue distribution and abundance of cathepsin X [23,24] are similar to those of cathepsin B. Because CA074 has been shown to interact similarly with cathepsins X and B, it is evident that physiological studies based on the use of this inhibitor have to be repeated in order to clarify whether cathepsin B or cathepsin X, or perhaps both enzymes, are implicated.

The structure of cathepsin X in a putative complex with CA030, which was generated by superposition of the structure of cathepsin B complexed with CA030 [37] on the structure of cathepsin X, provides the first hints for the design of inhibitors able to differentiate between the two enzymes (Figure 5). CA030 is a very close analog of CA074 (Figure 4): the inhibitors do not differ in the parts that interact with the primed binding sites of the enzymes. The only difference is in the nonprimed side, where an ester-tail instead of an amide-tail points towards the S2 binding site. In order to convert CA074 into a selective inhibitor of cathepsin X, the following modifications appear evident. Firstly, a compound with a single amino acid residue instead of two could be attached to an


epoxysuccinyl reactive group and bind to His23 of cathepsin X and thus be too short to reach the His110 and His111 residues on the surface of cathepsin B. Secondly, a bulkier residue than isoleucine of CA074 could fill the S1′ site of cathepsin X and simultaneously not be able to bind the smaller S1′ site of cathepsin B. Thirdly, the carboxyl group of CA074 could be modified to interfere with the binding to cathepsin B residue His111, while binding to cathepsin X would remain unaffected.

The reverse case, the design of an inhibitor that will inhibit only cathepsin B and not cathepsin X (and others), seems a more difficult problem. One solution might be to enlarge the cathepsin B inhibitor by expanding the interaction region to the nonprimed binding sites. It is known that the specificity of papain-like cysteine proteases is particularly encoded in the S2 and S1′ substrate binding sites [31,32]. A large number of small organic compounds react chemically with the reactive site sulfhydryl of a cysteine protease [31]. Most of these inhibitors interact only with one half of the active-site cleft on the ‘nonprimed’ side. A selective inhibitor of cathepsin B should probably span the active site from the S2 to the S2′ binding site, as has already been suggested [37]. Such a setup has been shown to be quite successful in the cases of lysosomal cysteine endopeptidases, cathepsins L, S and K [41,42], whereas the potent double-headed inhibitors of cathepsin B might now require additional improvements [43,44]. The selective cathepsin B inhibitor should, therefore, exploit the ‘primed’ substrate binding sites S1′ and S2′ as well as reach into the S2 binding pocket, in which it should optimally utilize the fit against the cathepsin B residues Pro76 and Glu245 and at the same time repel the equivalent cathepsin X residues Asp76 and His234.

Biological implications

Lysosomal cysteine proteases are involved in a number of physiological and pathological events. Although similar in sequence and fold, they exhibit diversity in localization, substrate and inhibitor specificity and stability. With the discovery of new members of the group, the perception that the role of lysosomal cysteine proteases is limited to the nonselective degradation of proteins in lysosomes is changing. There is a growing body of evidence that they work in concert. For example, they appear to be involved in the strictly regulated maturation of major histocompatibility complex (MHC) class II molecules and antigen presentation [2].

It is not surprising that the discovery of each new enzyme might require reconsideration of current knowledge. In the case of cathepsin X it is evident that new inhibitors should be developed in order to re-evaluate the role(s) of cathepsin B. The crystal structure of cathepsin X presented here reveals the differences and similarities between the structures of the active sites of the two enzymes. The structure of cathepsin X with a rotated His23 ring, superimposed on the structure of the complex of CA030 with cathepsin B and models of substrates AFFW and AFRFW, explains the dual capability of cathepsin X to exhibit carboxy-monopeptidase and carboxy-dipeptidase activities and thus provides the basis for design of selective inhibitors of cathepsin B.

Although cathepsin X was unknown until recently, it has already been involved in an industrial application [25,27]. It is held responsible for the cleavage of β-amyloid peptide, which can, when secreted from the cells and accumulated in amyloid plaques, result in Alzheimer’s disease. Inhibition of its activity was proposed to slow down the progress of the disease and, if diagnosed in time, even prevent its manifestation. It should be noted that this role of cathepsin X is based on an inhibition study and might thus require additional clarification.

Materials and methods

Crystallization and data collection

Cathepsin X was isolated from human liver (I.K. et al., unpublished observations). The purified cathepsin X was concentrated to 8.8 mg/ml in a spin concentrator (Centricon; Amicon). Crystals were grown using the sitting-drop vapor diffusion method. The reservoir contained 1 ml 14.4% PEG 4000, 7.2% 2-propanol and 0.072 M Na-HEPES adjusted to pH 8.0. The drop was composed of 2.2 µl reservoir solution and 2.8 µl protein (8.8 mg/ml) in 20 mM Na-acetate and 1 mM EDTA pH 5.1.

Diffraction data were collected from a single crystal using Cu Kα radiation from a Rigaku Ru200 rotating-anode X-ray generator and recorded on a 345 mm MAR Research image-plate detector. Although crystals diffracted to a higher resolution at synchrotron, we did not use synchrotron data sets because of increased mosaicity of the frozen crystals. Auto indexing and scaling were done using the programs MOSFLM and SCALA [45,46]. The crystal diffracted to 2.67 Å resolution and belonged to the orthorhombic space group C222₁ with cell dimensions a = 62.3 Å, b = 92.5 Å, c = 209.9 Å. The asymmetric unit contained two molecules.

Structure determination and refinement

The structure of cathepsin X was solved using the molecular replacement method implemented in the EPMR program [47]. The crystal structure of actinidin [48] was used as the search model. Two solutions with a correlation factor of 0.311 and an R value of 0.54 were found using data in the 15–4 Å resolution range.

In the subsequent structure determination, the program MAIN [49] was used for density modifications, model building and refinement. The first electron-density map was calculated with two actinidin molecules [48]. The noninterpreted regions of the electron density were masked with atoms, superimposed and averaged. Density of the solvent region was flipped with the factor 0.3. After completing the model using data to 3.0 Å resolution, the resolution of the data was gradually expanded to the final range (10.0–2.67 Å) in subsequent cycles that involved model rebuilding, positional and B-value refinement, density averaging and solvent generation. Standard parameters for protein [50] were used for geometry regularization. Averaged kicked omit-maps were used throughout the structure determination process to reveal ambiguous parts of the structure. The averaged kicked omit-maps were calculated by adding eight difference omit-maps, each resulting from a corresponding set of randomly displaced atoms, up to 0.8 Å along each


Table 1

Crystallographic data and refinement statistics.

Data collection
  Space group                                        C222₁
  Cell parameters                                    a = 62.3 Å, b = 92.5 Å, c = 209.9 Å, α = β = γ = 90°
  Molecules in asymmetric unit                       2
  Limiting resolution (Å)                            2.67
  Measured reflections                               152,861
  Independent reflections                            16,732
  Rsym (99–2.67 Å)                                   0.062
  Completeness (99–2.67 Å)                           0.95
  Completeness in the highest shell (2.83–2.67 Å)    0.87
  I/σ (99–2.67 Å)                                    11.9
  I/σ in the highest shell (2.83–2.67 Å)             3.6

Final refinement parameters
  No. scattering protein atoms                       3816
  No. solvent molecules                              118
  Resolution range in refinement (Å)                 10.0–2.67
  Reflections used in refinement                     16,433
  R factor                                           0.183
  Rfree                                              0.226

Geometry of the final model
  Rmsd of bond distances (Å)                         0.012
  Rmsd of bond angles (°)                            1.6

B factor value (rms by bonds)
  Overall and (rmsd) (Å²)                            25.3 (2.0)

coordinate. After the crystallographic R value dropped below 0.25 at 2.7 Å resolution, temperature-factor refinement was applied. Water molecules were generated using an automatic procedure in MAIN and then corrected manually.

The final refinement included all reflections (cutoff 1σ) in the resolution range 10–2.67 Å and gave a crystallographic R value of 0.183. The geometry of the final model was inspected with MAIN and PROCHECK [51]. All residues lie in the allowed regions of the Ramachandran plot, with 82.5% of residues in the most favored regions and 17.5% in additional allowed regions. Other structural parameters for the structure refined at this resolution show no significant deviations from the expected values (the overall G factor is 0.2, which is better than expected for 2.67 Å resolution). The crystallographic data and refinement statistics are summarized in Table 1.

Accession numbers
The coordinates have been deposited in the Protein Data Bank with accession code 1EF7 and will be on hold for one year.

Acknowledgements
We thank CNR for providing us beam time at synchrotron Elettra and the CNR team for help with data collection. We thank Roger H Pain for thoughtful reading of the manuscript. This work was supported by the Slovenian Ministry of Science and Technology.

References
1. Barrett, A.J., Rawlings, N.D. & Woessner, J.F. Jr. (1998). Handbook of proteolytic enzymes. (Barrett, A.J., Rawlings, N.D. & Woessner, J.F. Jr., eds) Academic Press Ltd., London.
2. Chapman, H.A., Riese, R.J. & Shi, G.P. (1997). Emerging roles for cysteine proteases in human biology. Annu. Rev. Physiol. 59, 63-88.
3. Turk, B., Turk, D. & Turk, V. Lysosomal cysteine proteases: more than scavengers. Biochim. Biophys. Acta, in press.
4. Cresswell, P. (1996). Invariant chain structure and MHC class II function. Cell 84, 505-507.
5. Nakagawa, T., et al., & Rudensky, A.Y. (1998). Cathepsin L: critical role in Ii degradation and CD4 T cell selection in the thymus. Science 280, 450-453.
6. Nakagawa, T.Y., et al., & Rudensky, A.Y. (1999). Impaired invariant chain degradation and antigen presentation and diminished collagen-induced arthritis in cathepsin S null mice. Immunity 10, 207-217.
7. Shi, G.P., et al., & Chapman, H.A. (1999). Cathepsin S required for normal MHC class II peptide loading and germinal center development. Immunity 10, 197-206.
8. Drake, F.H., et al., & Gowen, M. (1996). Cathepsin K, but not cathepsins B, L, or S, is abundantly expressed in human osteoclasts. J. Biol. Chem. 271, 12511-12516.
9. Brömme, D., Okamoto, K., Wang, B.B. & Biroc, S. (1996). Human cathepsin O2, a matrix protein-degrading cysteine protease expressed in osteoclasts. Functional expression of human cathepsin O2 in Spodoptera frugiperda and characterization of the enzyme. J. Biol. Chem. 271, 2126-2132.
10. Saftig, P., et al., & von Figura, K. (1998). Impaired osteoclastic bone resorption leads to osteopetrosis in cathepsin-K-deficient mice. Proc. Natl Acad. Sci. USA 95, 13453-13458.
11. Wang, P.H., et al., & Hsueh, W.A. (1991). Identification of renal cathepsin B as a human prorenin-processing enzyme. J. Biol. Chem. 266, 12633-12638.
12. Dunn, A.D., Crutchfield, H.E. & Dunn, J.T. (1991). Thyroglobulin processing by thyroidal proteases. Major sites of cleavage by cathepsins B, D and L. J. Biol. Chem. 266, 20198-20204.
13. Brix, K., Lemansky, P. & Herzog, V. (1996). Evidence for extracellularly acting cathepsins mediating thyroid hormone liberation in thyroid epithelial cells. Endocrinology 137, 1963-1974.
14. Nagaya, T., et al., & Seo, H. (1998). Intracellular proteolytic cleavage of 9-cis-retinoic acid receptor alpha by cathepsin L-type protease is a potential mechanism for modulating thyroid hormone action. J. Biol. Chem. 273, 33166-33173.
15. Pham, C.T. & Ley, T.J. (1999). Dipeptidyl peptidase I is required for the processing and activation of granzymes A and B in vivo. Proc. Natl Acad. Sci. USA 96, 8627-8632.
16. Sloane, B.F., Moin, K. & Lah, T. (1994). Biochemical and molecular aspects of selected cancers. (Pretlow, T.G. and Pretlow, T.P., eds). Academic Press, San Diego.
17. Kos, J. & Lah, T.T. (1998). Cysteine proteinases and their endogenous inhibitors: target proteins for prognosis, diagnosis and therapy in cancer. Oncol. Rep. 5, 1349-1361.
18. Michaud, S. & Gour, B.J. (1998). Cathepsin B inhibitors as potential anti-metastatic agents. Exp. Opin. Ther. Patents 8, 645-672.
19. Kominami, E., Kunio, I. & Katunuma, N. (1987). Activation of the intramyofibral autophagic-lysosomal system in muscular dystrophy. Am. J. Pathol. 127, 461-466.
20. Mort, J.S., Recklies, A.D. & Poole, A.R. (1984). Extracellular presence of the lysosomal proteinase cathepsin B in rheumatoid synovium and its activity at neutral pH. Arthritis Rheum. 27, 509-515.
21. Gelb, B.D., Shi, G.P., Chapman, H.A. & Desnick, R.J. (1996). Pycnodysostosis, a lysosomal disease caused by cathepsin K deficiency. Science 273, 1236-1238.
22. Cataldo, A.M. & Nixon, R.A. (1990). Enzymatically active lysosomal proteases are associated with amyloid deposits in Alzheimer brain. Proc. Natl Acad. Sci. USA 87, 3861-3865.
23. Santamaria, I., Velasco, G., Cazorla, M., Fueyo, A., Campo, E. & López-Otín, C. (1998). Cathepsin L2, a novel human cysteine proteinase produced by breast and colorectal carcinomas. Cancer Res. 58, 1624-1630.
24. Nägler, D.K. & Ménard, R. (1998). Human cathepsin X: a novel cysteine protease of the papain family with a very short proregion and unique insertions. FEBS Lett. 434, 135-139.
25. Tung, J.S., Sinha, S., McConlogue, L., & Semko, C.M.F. Cathepsin and methods and compositions for inhibition thereof. Athena Neurosciences, Inc. 469362(5849711). 12-15-1998. South San Francisco, CA. US Patent.
26. Nägler, D.K., Zhang, R., Tam, W., Sulea, T., Purisima, E.O. & Ménard, R. (1999). Human cathepsin X: a cysteine protease with unique carboxypeptidase activity. Biochemistry 38, 12648-12654.
27. Tung, J.S., Sinha, S., & Semko, C.M.F. Cathepsin and methods and compositions for inhibition thereof. Athena Neurosciences, Inc. 850392(5858982). 1-12-1999. South San Francisco, CA. 5-2-1997. US Patent.
28. Linebaugh, B.E. (1999). Exocytosis of active cathepsin B enzyme activity at pH 7.0, inhibition and molecular mass. Eur. J. Biochem. 264, 100-109.
29. Coulibaly, S., et al., & Mach, L. (1999). Modulation of invasive properties of murine squamous carcinoma cells by heterologous
expression of cathepsin B and cystatin C. Int. J. Cancer 83, 526-531.
30. Turk, B., Turk, V. & Turk, D. (1997). Structural and functional aspects of papain-like cysteine proteinases and their protein inhibitors. Biol. Chem. 378, 141-150.
31. McGrath, M.E. (1999). The lysosomal cysteine proteases. Annu. Rev. Biophys. Biomol. Struct. 28, 181-204.
32. Turk, D., Guncar, G., Podobnik, M. & Turk, B. (1998). Revised definition of substrate binding sites of papain-like cysteine proteases. Biol. Chem. 379, 137-147.
33. Musil, D., et al., & Bode, W. (1991). The refined 2.15 Å X-ray crystal structure of human liver cathepsin B: the structural basis for its specificity. EMBO J. 10, 2321-2330.
34. Gunčar, G., Podobnik, M., Pungerčar, J., Štrukelj, B., Turk, V. & Turk, D. (1998). Crystal structure of porcine cathepsin H determined at 2.1 Å resolution: location of the mini-chain C-terminal carboxyl group defines cathepsin H function. Structure 6, 51-61.
35. Towatari, T., et al., & Katunuma, N. (1991). Novel epoxysuccinyl peptides. A selective inhibitor of cathepsin B, in vivo. FEBS Lett. 280, 311-315.
36. Buttle, D.J., Murata, M., Knight, C.G. & Barrett, A.J. (1992). CA074 methyl ester: a proinhibitor for intracellular cathepsin B. Arch. Biochem. Biophys. 299, 377-380.
37. Turk, D., et al., & Turk, V. (1995). Crystal structure of cathepsin B inhibited with CA030 at 2.0-Å resolution: a basis for the design of specific epoxysuccinyl inhibitors. Biochemistry 34, 4791-4797.
38. Stubbs, M.T., et al., & Turk, V. (1990). The refined 2.4 Å X-ray crystal structure of recombinant human stefin B in complex with the cysteine proteinase papain: a novel type of proteinase inhibitor interaction. EMBO J. 9, 1939-1947.
39. Illy, C., Quraishi, O., Wang, J., Purisima, E., Vernet, T. & Mort, J.S. (1997). Role of the occluding loop in cathepsin B activity. J. Biol. Chem. 272, 1197-1202.
40. Podobnik, M., Kuhelj, R., Turk, V. & Turk, D. (1997). Crystal structure of the wild-type human procathepsin B at 2.5 Å resolution reveals the native active site of a papain-like cysteine protease zymogen. J. Mol. Biol. 271, 774-788.
41. Thompson, S.K., Halbert, S.M., Bossard, M.J., Tomaszek, T.A., Levy, M.A. & Veber, D.F. (1997). Design of potent and selective human cathepsin K inhibitors that span the active site. Proc. Natl Acad. Sci. USA 94, 14249-14254.
42. Katunuma, N., et al., & Asao, T. (1999). Structure based development of novel specific inhibitors for cathepsin L and cathepsin S in vitro and in vivo. FEBS Lett. 458, 6-10.
43. Schaschke, N., Assfalg-Machleidt, I., Machleidt, W., Turk, D. & Moroder, L. (1997). E-64 analogues as inhibitors of cathepsin B. On the role of the absolute configuration of the epoxysuccinyl group. Bioorg. Med. Chem. 5, 1789-1797.
44. Schaschke, N., Assfalg-Machleidt, I., Machleidt, W. & Moroder, L. (1998). Substrate/propeptide-derived endo-epoxysuccinyl peptides as highly potent and selective cathepsin B inhibitors. FEBS Lett. 421, 80-82.
45. Leslie, A.G.W. (1992). Joint CCP4 and ESF-EACMB. Newsletter on Protein Crystallography, 26, SERC Daresbury Laboratory, Warrington, UK.
46. CCP4 (1994). Collaborative Computational Project Number 4, The CCP4 suite: programs for protein crystallography. Acta Crystallogr. D 50, 760-763.
47. Kissinger, C.R., Gehlhaar, D.K. & Fogel, D.B. (1999). Rapid automated molecular replacement by evolutionary search. Acta Crystallogr. D Biol. Crystallogr. 55 (Pt 2), 484-491.
48. Baker, E.N. (1980). Structure of actinidin, after refinement at 1.7 Å resolution. J. Mol. Biol. 141, 441-484.
49. Turk, D. (1992). Further Development of a Program for Molecular Graphics and Electron Density Manipulation and its Application in Various Protein Structure Determinations. PhD Thesis, Technische Universität München, Germany.
50. Engh, R.A. & Huber, R. (1991). Accurate bond and angle parameters for X-ray protein structure refinement. Acta Crystallogr. A 47, 392-400.
51. Laskowski, R.A., MacArthur, M.W., Moss, D.S. & Thornton, J.M. (1993). PROCHECK: a program to check the stereochemical quality of protein structures. J. Appl. Crystallogr. 26, 283-291.
52. Carson, M. (1991). Ribbons 2.0. J. Appl. Crystallogr. 24, 958-961.
53. Gunčar, G., Pungerčič, G., Klemencic, I., Turk, V. & Turk, D. (1999). Crystal structure of MHC class II-associated p41 Ii fragment bound to cathepsin L reveals the structural basis for differentiation between cathepsins L and S. EMBO J. 18, 793-803.
54. Kamphuis, I.G., Kalk, K.H., Swarte, M.B. & Drenth, J. (1984). Structure of papain refined at 1.65 Å resolution. J. Mol. Biol. 179, 233-286.

Because Structure with Folding & Design operates a ‘Continuous Publication System’ for Research Papers, this paper has been published on the internet before being printed (accessed from http://biomednet.com/cbiology/str). For further information, see the explanation on the contents page.