Biochimie xxx (2015) 1e25

Contents lists available at ScienceDirect

Biochimie

journal homepage: www.elsevier.com/locate/biochi

Review Structure and function of legumain in health and disease

* Elfriede Dall, Hans Brandstetter

Dept. of Molecular Biology, University of Salzburg, A-5020 Salzburg, Austria article info abstract

Article history: The last years have seen a steady increase in our understanding of legumain biology that is driven from Received 14 August 2015 two largely uncoupled research arenas, the mammalian and the plant legumain field. Research on Accepted 18 September 2015 legumain, which is also referred to as asparaginyl endopeptidase (AEP) or vacuolar processing Available online xxx (VPE), is slivered, however. Here we summarise recent important findings and put them into a common perspective. Legumain is usually associated with its endopeptidase activity in where Keywords: it contributes to antigen processing for class II MHC presentation. However, newly recognized functions Electrostatic stabilization disperse previously assumed boundaries with respect to their cellular compartmentalisation and enzy- Cellular localization Allostery matic activities. Legumain is also found extracellularly and even translocates to the cytosol and the nucleus, with seemingly incompatible pH and redox potential. These different milieus translate into Death domain changes of legumain's molecular properties, including its (auto-)activation, conformational stability and Context-dependent activities enzymatic functions. Contrasting its endopeptidase activity, legumain can develop a carboxypeptidase activity which remains stable at neutral pH. Moreover, legumain features a peptide activity, with intriguing mechanistic peculiarities in plant and human isoforms. In pathological settings, such as cancer or Alzheimer's disease, the proper association of legumain activities with the corresponding cellular compartments is breached. Legumain's increasingly recognized physiological and pathological roles also indicate future research opportunities in this vibrant field. © 2015 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).

Contents

1. History & nomenclature ...... 00 2. Classification within the world ...... 00 3. Phylogenetic distribution e whohasit?...... 00 4. Tissue distribution e where is it? ...... 00 4.1. (extra-)Cellular localization ...... 00 4.2. How do proteins get to the lysosomal/vesicular system? ...... 00 4.3. (patho-)Physiologic localization of mammalian legumain outside the ...... 00 4.4. How to exit the endo-lysosome? ...... 00 4.5. How does legumain survive at near neutral pH in the cytosol, nucleus or at the cell surface? ...... 00 5. Physiological functions ...... 00 5.1. Digestion ...... 00 5.2. Immunity ...... 00 5.3. Antimicrobial activity ...... 00 5.4. (Immune) signalling ...... 00 5.5. Apoptosis ...... 00 5.6. Transcription factor ...... 00 5.7. Osteoclast remodelling ...... 00 5.8. Legumain deficient mice ...... 00 6. Domain architecture e how does legumain look like? ...... 00

* Corresponding author. E-mail address: [email protected] (H. Brandstetter). http://dx.doi.org/10.1016/j.biochi.2015.09.022 0300-9084/© 2015 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).

Please cite this article in press as: E. Dall, H. Brandstetter, Structure and function of legumain in health and disease, Biochimie (2015), http:// dx.doi.org/10.1016/j.biochi.2015.09.022 2 E. Dall, H. Brandstetter / Biochimie xxx (2015) 1e25

6.1. Recombinant access ...... 00 6.2. Crystal structures of legumain ...... 00 6.3. The catalytic-domain e legumain is protease and ligase ...... 00 6.3.1. Protease substrates ...... 00 6.4. The LSAM-domain e signalling ...... 00 7. Activation of legumain ...... 00 7.1. Activation to -specific endopeptidase (AEP) ...... 00 7.2. Activation to ACP ...... 00 7.3. Catalytic mechanism of legumain protease activities (AEP/ACP) ...... 00 7.4. Comparison of legumain with and ...... 00 8. specificity ...... 00 8.1. P1 Asn/Asp ...... 00 8.2. P3eP2 and P10 ...... 00

8.3. Tuning legumain protease activity: kcat selection ...... 00 8.4. ACP substrate specificity ...... 00 9. Ligase activities ...... 00 9.1. ATP-dependent ...... 00 9.2. ATP-independent transpeptidases ...... 00 9.3. Catalytic mechanism of legumain ligase activity ...... 00 9.4. Ligase substrates and specificity ...... 00 9.4.1. P1 Asn/Asp ...... 00 9.4.2. P10 and P20 ...... 00 10. Activity regulation ...... 00 10.1. Protein inhibitors ...... 00 10.1.1. ...... 00 10.1.2. Clitocypin/macrocypin ...... 00 10.1.3. Prodomain e reversible activation and zymogenization ...... 00 10.1.4. How to prevent cleavage of protein substrates or canonical inhibitors ...... 00 10.2. The environment is regulating legumain activity ...... 00 10.3. Activators ...... 00 10.3.1. Direct conformational stabilization ...... 00 10.3.2. Allosteric stabilization by integrin binding ...... 00 10.3.3. Accelerators of activation ...... 00 10.4. Low molecular weight inhibitors ...... 00 11. Pathological functions ...... 00 11.1. Legumain in and around cancer ...... 00 11.2. Multiple sclerosis ...... 00 11.3. Legumain in Alzheimer's disease ...... 00 11.4. Therapeutic and diagnostic options ...... 00 11.4.1. Prodrugs and targeted drug delivery ...... 00 11.4.2. DNA vaccines ...... 00 11.4.3. Legumain imaging and its use as prognostic marker ...... 00 11.4.4. Inhibition/knock down of legumain to suppress tumour progression and Alzheimer's disease ...... 00 12. Future perspectives, outlook ...... 00 12.1. Smart Legumain Activity Modulation (SLAM) ...... 00 12.2. Legumain web ...... 00 Conflict of interest ...... 00 Acknowledgements ...... 00 References ...... 00

1. History & nomenclature asparaginyl endopeptidase (AEP), and these names are still used today. In 1993 the term ‘legumain’ was introduced by Kembhavi Already in the early 1980s the legumain was et al. [10], which is nowadays its most frequently used name. identified in the vetch seedlings and the common bean Phaseolus Already in 1993 it was shown that legumain very likely harbours a vulgaris, however not classified yet [1,2]. Only three years later, a protease and a ligase activity [9]. Mammalian legumain was iden- putative 32 kDa cysteine protease was found in the trematode tified in 1996 as putative cysteine protease PRSC1 in humans [11] Schistosoma mansoni [3]. Another two years later, in 1989, the and confirmed as legumain only one year later in pig. In 2007 32 kDa protein Sm32 was confirmed to be a protease [4]. The legumain activity was for the first time also found in arthropods sequence of plant legumains was then cross-confirmed by the [12]. Further names were used for legumain to emphasise specific sequence of the homologous Schistosoma enzyme [5]. In 1990 functions, including ACP (asparaginyl carboxypeptidase) because of legumain was initially named as haemoglobinase in S. mansoni [6]. its activity as a mono-carboxy-peptidase [13], or nucellain, because Later it was also referred to as endoprotease B in germinating barley of its localization to nucellar cell walls in barley [14]. Recently the seeds [7]. Given its localization and function in plant [5,8], name “butelase 1” was used for a legumain isoform from the seeds and its strict specificity for cleaving after asparagine residues [9] of Clitoria ternatea [15]. While the use of the term AEP seems legumain was named vacuolar processing enzyme (VPE) or justifiable for its brevity or if the endopeptidase activity should be

Please cite this article in press as: E. Dall, H. Brandstetter, Structure and function of legumain in health and disease, Biochimie (2015), http:// dx.doi.org/10.1016/j.biochi.2015.09.022 E. Dall, H. Brandstetter / Biochimie xxx (2015) 1e25 3 emphasised, we generally encourage to use the name legumain, relevance [41e43]. Legumains of blood digesting ticks are localized which is consistent with its multiple and still growing functions to the midgut [12]. In plants, different isoforms were shown to be and in accordance with the recommendations of the Nomenclature expressed at different levels in different organs and different Committee of the IUBMB [16]. developmental stages. Therefore, plant legumains are not solely expressed in seeds but also in vegetative organs [28,44,45]. aVPE 2. Classification within the protease world and gVPE are relatively abundant in vegetative organs [45], bVPE in reproductive organs, in particular the seed embryo [46], dVPE in Formally, legumain was classified as a member of clan CD and two cell layers of the seed coat at an early stage of seed develop- family C13 of cysteine (EC 3.4.22.34) because of the ment [28]. recognition of a conserved His148-Gly-spacer-Ala-Cys189 motif. This highly conserved motif is typical for clan CD members including 4.1. (extra-)Cellular localization caspase-1, , gingipain R, paracaspases and metacaspases and suggests an evolutionary relationship [16e18]. Furthermore, a The cellular localisation of legumain is organism dependent. highly conserved block of four hydrophobic residues precedes the Mammalian legumain is localized mainly to the endo-lysosomal catalytic His148 and Cys189 residues. Also, a ~15% sequence ho- system, plant legumains to vacuoles, nucellar cell walls and the mology to caspases and its strict specificity at the P1 position link pericarp [14,47] and tick legumains to the acidic compartments of legumain to caspases, albeit with a modified preference for the P1- gut cells (gut epithelium), but also outside the cells in the peri- residue, Asn vs. Asp. trophic matrix [12,31,48,49]. Common characteristics of those Notably, the C13 peptidase family of legumain-like proteins compartments are an acidic pH and a reducing redox-potential. In contains the Glycosylphosphatidylinositiol:protein transamidase mammalian lysosomes the reducing environment is established (Gpi8p; C13.005) [19]. Gpi8p co-localises with the rough endo- amongst other factors by the transition metals iron and copper plasmic reticulum where it attaches a GPI-anchor at the C-terminus liberated upon the degradation of metalloproteins [50,51]. Acidic of a nascent target protein, anchoring it to the membrane [20,21]. pH is maintained by ATPases that pump protons into the lumen of The evolutionary relation of Gpi8p with legumain is, therefore, lysosomes. The lysosomal pH is between 6e5 and 5e4.5 in early reflected not only by their sequence homology but also mecha- and late endosomes, respectively [52]. However, when lysosomes nistically by the transamidation reaction that is catalysed by Gpi8p mature, their pH can reach values as low as 3.8 [53,54]. Likewise, [22,23]. Human Gpi8p works in complex with 3 more proteins, the acidic compartments of tick gut cells share a gradual change in namely Gaa1p, Gpi16p and Gpi17p [24e26], all 4 of them being pH along the endocytic pathway. The pH in digestive vesicles membrane proteins [19]. In yeast 3 homologs of these proteins ranges from pH 3.5 to 4.5 [55]. Interestingly, were also (Gaa1p, Gpi8p and Gpi16p) suffice to form a stable complex [27]. shown to have an effect on lysosomal pH in humans. IL-10 raises the Gpi8p is likely found in all eukaryotes utilizing GPI anchoring. The pH of APC endosomes thereby attenuating protease activities, with genes encoding the Gpi-complex are essential in yeast, illustrating consequences for MHCII loading. By contrast, IL-6 was shown to its importance. decrease the pH in endosomes [56]. Plant vacuoles are very similar to mammalian lysosomes with respect to acidic pH and reducing 3. Phylogenetic distribution e who has it? conditions. Additionally, they are relatively high in NO3 (20e70 mM) [57] and contain glutathione, which contribute to Today, there is evidence for functional legumain in plants, their redox potential [58]. mammals, Blastocystis, trematodes like S. mansoni, and ticks, but not in bacteria [19]. The so far identified and classified number of 4.2. How do proteins get to the lysosomal/vesicular system? legumain isoforms varies with the organisms. Mammals only have 1 functional legumain isoform, plus a pseudo-gene with unknown Mammalian proteins are targeted to the lysosomal system function in humans [19]. Plants, however, have a varying number of mostly via the mannose-6-phosphate receptor pathway [52,59,60]. functional legumain isoforms, which can be subdivided into 4 Lysosomal cysteine cathepsins, as an example, very often harbour subfamilies in Arabidopsis thaliana aVPEedVPE [28]. Sunflower an N-glycosylation site on their N-terminal propeptides that targets contains at least 5 AEPs [29], and in barley 7e8 isoforms have been them to the vesicular pathway via the mannose-6-phosphate re- identified [29,30]. Also ticks have been shown to produce more ceptor (M6PR) [61]. Under mildly acidic conditions, they dissociate than 1 legumain isoforms. Haemaphysalis longicornis encodes 2 from the M6PR in the endo-lysosome. In plants proteins enter the functionally characterized isoforms [31,32], Ixodes ricinus has 1 vesicular system from the ER. Via an N- or C-terminal or an internal characterized isoform, IrAE [33], but very likely there are even more propeptide they are further translocated to vacuoles. Proteins functional and non-functional isoforms in ticks [34,35]. Blastocystis scheduled for secretion lack this sequence. Furthermore, plant ST7, an enteric protozoan parasite, produces at least 2 isoforms [36] vacuoles are subdivided into lytic and storage vacuoles. Proteins and in the trematodes S. mansoni, Fasciola gigantica and Opis- ending up in lytic vacuoles are transported via the Golgi complex. thorchis viverrini 1 to 2 isoforms have been identified [4,37,38]. Proteins destined for storage vacuoles do not necessarily need to go With the progress of sequencing technologies and whole genome via the Golgi [62]. sequencing also the number of identified legumain isoforms is expected to further increase. 4.3. (patho-)Physiologic localization of mammalian legumain outside the lysosome 4. Tissue distribution e where is it? Primarily under pathophysiologic conditions there is evidence Mammalian legumain is most abundant in kidney and testis, for legumain also at locations outside of acidic vesicles namely however it was shown to have a rather broad tissue distribution extracellularly, in the cytoplasm and in the nucleus [63e67] [39,40]. Furthermore, legumain could be localized to different types (Fig. 1). While the presentation at the cell surface has also been of antigen presenting cells (APCs), including human and murine described for lysosomal cathepsins under pathophysiological dendritic cells. It is relatively abundant in murine B-cell lines and conditions via the mannose 6-phosphate receptor pathway [68], EBV-transformed B-cell lines, underscoring its immunological the localization to the cytosol and nucleus is puzzling and

Please cite this article in press as: E. Dall, H. Brandstetter, Structure and function of legumain in health and disease, Biochimie (2015), http:// dx.doi.org/10.1016/j.biochi.2015.09.022 4 E. Dall, H. Brandstetter / Biochimie xxx (2015) 1e25

Fig. 1. Trafficking of legumain in and outside the cell. Synthesized as an inactive proenzyme, legumain is targeted via the ER and Golgi to the endolysosomal system, where it gets activated to the asparaginyl-specific endopeptidase (AEP). Additionally it might be secreted extracellularly directly via the Golgi (dark blue path) or indirectly via the endosomal system (light blue path). Likewise, extracellular legumain may re-enter the cell via endocytosis (green path) or directly via translocation to the cytosol (dark purple path), e.g. under cancer conditions. Additionally, under pathophysiologic conditions like Alzheimer's disease where the lysosome is altered, legumain may enter the cytosol directly from a leaky lysosome. Once exposed to the cytosol, legumain may then enter the nucleus (light purple path). discussed in the next paragraph. Similarly the functions of legu- cytosolic expression but rather postulate that legumain must main at these “non-canonical” compartments are poorly defined be routed through the ER and Golgi to obtain its required and often related to pathological situations, such as cancer or post-translational modifications; Alzheimer's disease [66,69e71]. Nuclear localization is not limited (ii) translocation from the cell surface into the cytosol. In this to the human system, but was also reported in plants with a po- scenario, legumain should be fully post-translationally tential role as transcription factor [72]. The nuclear localization modified, possibly but not necessarily including autocata- appears consistent with a potential bipartite nuclear localization lytic activation (Fig. 2). Legumain oscillates between the sequence (KRK289, KRK318) [73] e but only if the protein is exposed endo-lysosomal compartment and the secretory pathway to the cytosol [74]. [82], explaining why its activation state can vary (Fig. 1). The cell surface-to-cytosol translocation would resemble that of the well documented translocation of MMP-12 into virus- 4.4. How to exit the endo-lysosome? infected cells [83]. Legumain was found on the cell-surface mostly in the context of tumours, in particular of tumour- It is well established that prolegumain is initially located to associated macrophages where the local pH is acidic compartments like endoplasmic reticulum and Golgi where it is enough to maintain an active conformation [67,70,84e89]. routed to the endosome before it becomes activated in the lyso- Therefore, the cell surface to cytosol translocation appears a some (Fig. 1). Likewise, legumain has been found at the cell surface, sensible scenario in the microenvironment of tumours. albeit limited to the microenvironments of several tumours [75]. Intriguingly, nuclear localization of legumain was reported in However, it is less clear, and challenging current dogma, how it can tumours in agreement with the bipartite nuclear localization translocate to its destinations in the cytosol and in the nucleus sequence (NLS). The two motifs (KRK289, KRK318) are sepa- [74,76]. rated by 26 amino acids, whereas standard separation would These translocations are indeed not fully understood and puz- be 10e12 amino acids [73]. A single KRK motif may be suf- zling. Nonetheless, such non-canonical trafficking is not without ficient for nuclear localization as exemplified by the precedence. Recently, nuclear and cytosolic translocation of apoptotic signalling adapter molecule FADD. FADD was also secreted MMP-12 was documented by an array of functional and found to localize to the nucleus with a single KRK35-motif, as immune-fluorescent confocal microscopy data [77]; similar trans- demonstrated by mutagenesis [90]. In legumain the motif is locations were also reported for MMP-14 [78] and, more indirect, partly located on the activation peptide with the b cleavage for MMP-7 [79]. site at Asn323-Asp324. The active, b-cleaved form would Several hypothetical legumain routes could possibly account for allow the KRK318 motif to reorganize into spatial proximity to the cytosolic and nucleic localization of legumain, including KRK289, thus mimicking a standard bi-partite NLS [73]. This scenario implicitly assumes that the microenvironment of (i) cytosolic expression; an analogous situation has been the tumour contributes to a partial perforation of the cyto- described, for example, for the lysosomal protease solic membrane, the exact mechanism being unclear. L when expressed as a splice variant lacking the N-terminal Notably, virus-infected cells (as in MMP-12 [83]) and tumour signal peptide [80]. However, such a splice form has not been cells share a common immune status in presenting peptide- reported for legumain and appears particularly unlikely MHC I complexes at their cell surface. because the glycosylation was found necessary for the sta- (iii) translocation from the endo-lysosome to the cytosol. bility of human legumain [81]. Therefore, we exclude Depending on the maturation/acidification of the endo-

Please cite this article in press as: E. Dall, H. Brandstetter, Structure and function of legumain in health and disease, Biochimie (2015), http:// dx.doi.org/10.1016/j.biochi.2015.09.022 E. Dall, H. Brandstetter / Biochimie xxx (2015) 1e25 5

Fig. 2. Legumain stability, activation, activities, and their regulation. A, In a simple picture prolegumain is stable at neutral pH and can be activated via pH shift to the active endopeptidase (AEP). AEP is irreversibly destabilized at near neutral pH. B, In a more detailed picture the activation, stability, specificity and activity of different legumain variants is controlled by different interaction partners. The pH-induced (electrostatic) activation is accelerated by glycosaminoglycans (GAGs), triggering autocatalytic activation at pH ~5.5 and below. Alternatively, prolegumain can be proteolytically activated to the asparaginyl-specific carboxypeptidase (ACP) by selective removal of the activation peptide (dark blue). AEP can be generated from ACP via electrostatic release of the Legumain Stabilization and Activity Modulation (LSAM) domain at acidic pH. Activated AEP is stable at pH 6 and irreversibly conformationally inactivated at neutral pH values. Analogous to the LSAM domain, cystatins act as direct and reversible conformational stabilisers of AEP at neutral pH. 120 Active enzyme can be regenerated from a -AEP complex at pH < 4. Legumain can be allosterically stabilized either by binding to the integrin aVb3 via the RGD motif or by ternary complex formation with TRAF6 and HSP90a. As an additional dimension of complexity, legumain protease and ligase activities are pH dependent, with the ligase dominating at neutral pH and the protease at acidic pH.

lysosomal compartment, legumain may still be in its lysosomal biogenesis and vacuolar ATPase activity [95e97]. zymogen (full-length) form (Fig. 1). Indeed, cytosolic legu- Moreover, lysosomal membrane permeability in Alzheimer's main has been reported as zymogen [63], but also in its fully disease was reported to be altered in Alzheimer patients activated form [66,69e71]. The endo/lysosome-to-cytosol [98,99] and likely affected by lysosomal dysfunction translocation was reported in pathological situations, such [100,101]. Therefore, it may seem plausible that legumain as in cancer and Alzheimer's disease [63,69], oxidative stress may percolate from the lysosome to the cytosol. and necrotic/apoptotic cell death [20,91]. Recent literature provides further evidence about the dynamics of the lyso- some, e.g., in the context of autophagic lysosome reforma- 4.5. How does legumain survive at near neutral pH in the cytosol, tion. This complexity is exemplified by the conversion of nucleus or at the cell surface? vesicular to tubular efflux from the lysosome [92e94]. Additional circumstantial evidence relates to the neuronal While legumain is stable in its zymogen form at neutral pH, acidosis in Alzheimer's disease which appears linked with the asparaginyl endopeptidase activity is rapidly and irreversibly

Please cite this article in press as: E. Dall, H. Brandstetter, Structure and function of legumain in health and disease, Biochimie (2015), http:// dx.doi.org/10.1016/j.biochi.2015.09.022 6 E. Dall, H. Brandstetter / Biochimie xxx (2015) 1e25 destroyed at neutral pH (Fig. 2A) [81,102]. This inactivation results 5.2. Immunity from conformational unfolding due to the high density of acidic amino acids, which are partially protonated at acidic pH but are In mammals, legumain plays important roles in the immune negatively charged and electrostatically repulsive at neutral pH system. In particular, legumain is involved in processing self and [13]. Adding to the cell-biological conundrum discussed in the foreign proteins for presentation on the MHCII complex. Thereby it previous paragraph, the existence of active legumain outside the is involved both in generating tolerance by destructing self- endo-lysosomal compartment appears therefore also critical from proteins and in mounting immune responses towards foreign an- a protein conformational aspect. This paradox can be reconciled tigens by their processing [114e117]. Legumain is thus catalysing by several considerations. Firstly, in the pathophysiological set- the generation as well as the destruction of T-cell epitopes [118].A tings of cancer and Alzheimer's disease, the relevant environ- prominent example for the deleterious effects with peptide ments tend to be slightly acidified [84,95,103,104]. Secondly, only destruction is given by legumain's association with multiple scle- uncomplexed, fully activated legumain is destabilized at neutral rosis [43]. pH. Several complexes such as integrin aVb3, and probably b1 integrins [105], present at the cell surface, have been shown 5.3. Antimicrobial activity to stabilize activated legumain (Fig. 2B) [13,88]; likewise the complex formation with cystatin inhibitors stabilizes activated Also in plants legumains are important components of the im- legumain, thereby counterintuitively acting as legumain agonists mune system. Prominent examples include the generation of cyclic at neutral pH [106]. Thirdly, when legumain is proteolytically peptides that are important for plants' defence against pathogens activated at pH ~5.5e6.5, the C-terminal propeptide (LSAM) [119]. The immunological relevance of such cyclic peptides can be remains bound to the catalytic domain, resulting in a mono- illustrated by Kalata B1 from the African plant Oldenlandia affinis carboxypeptidase activity that is stable at near neutral pH [13]. that has proven antimicrobial and insecticidal activities, amongst Fourthly, post-translational modifications can stabilize legumain others [120,121]. Furthermore, it is the active component that is as has been illustrated by its Lys63-linked ubiquitination by the used in native African medicine to accelerate child birth [122]. E3 ligase TRAF6. The ubiquitination further resulted in ternary complex formation with the heat shock protein HSP90a, which 5.4. (Immune) signalling increased the intracellular legumain stability and secretion [63]. Legumain was ubiquitinated on its activation peptide at Lys318, Besides its immunological role in antigen processing and preceding the auto-cleavage site at Asn323. The modification site generation, legumain is also an important factor in im- explains why ubiquitination occurred only with pro-legumain mune signalling. Legumain was shown to proteolytically activate and why ubiquitination sterically prevented auto-activation. TLRs, receptors of the innate [123,124], although However, the ubiquitination could be reverted by the deubiqui- cathepsins seem to be sufficient to carry out TLR processing in the tinase USP17 [63]. absence of legumain [125]. Likewise, processing of endo-lysosomal Taken together, the existence of active legumain at near neutral cysteine proteases by legumain was proven in vitro and in vivo pH milieus of the cytosol, nucleus or at the cell surface is plausible if which again modulates the downstream immune response [126]. stabilized by intra- or intermolecular complexes. Moreover, legumain can initiate the removal of the invariant chain chaperone of MHCII, thereby not only critically influencing peptide generation but also MHCII activation [127]. Signalling is not 5. Physiological functions restricted to the immune system but also includes the activation of other proteases at the cell surface [128]. 5.1. Digestion 5.5. Apoptosis Legumains have different functions that are mostly, but not exclusively, related to its protease activity. Their functions are or- Plant legumains are discussed as main mediators of pro- ganism specific and partly adapted to the host. Tick legumain, for grammed cell death in plants since they perform caspase-like ac- example, predominantly functions as a haemoglobinase. When tivities/specificities; with several plant legumain isoforms present, ticks suck up blood, the haemoglobin that is contained in the blood they could potentially resemble caspase activation cascades. Along crystallizes in the gut lumen. These haemoglobin crystals are then that line, legumains mediate the response against pathogens in taken up by gut epithelial cells via receptor-mediated endocytosis tobacco [129], in the chemical-induced apoptosis in tomato [130], [107]. The vesicles containing host haemoglobin consequently fuse in nitric oxide-induced cell death in Arabidopsis [131] and more with lysosomes containing the active haemoglobinases. There recently, in virus induced hypersensitive cell death [132]. For a haemoglobin is digested to peptides and free amino acids [35]. detailed review on legumain in plant cell death see Ref. [133]. Legumain facilitates haemoglobin digestion not only directly by Furthermore, human legumain was shown to be involved in processing haemoglobin but also by activating other proteases. neuronal apoptosis by degrading a DNase inhibitor during excito- Recent genetic analysis also showed that there might be more yet neurotoxicity [134]. unknown legumain isoforms with different functions, potentially also organized in a cascade-like manner [34]. Likewise, Schistosome 5.6. Transcription factor legumain is also a key mediator of haemoglobin degradation [108]. Plant legumains are key in the processing and activation While most legumain literature reports on its protease activity, of seed storage proteins during seed maturation [109e113]. Arabi- there is also literature reporting protease-independent functions of dopsis mutants lacking VPE genes (a, b, g) accumulated precursor legumain, including its transcription factor activity. A legumain proteins in seeds, further illustrating the importance of this class of from tomato leaves was shown via chromatin immunoprecipitation enzyme in seed maturation. Interestingly, bVPE is especially to bind to the promoter of ACS (1-aminocyclopropane-1-1carbox- important, because it can compensate missing a and g genes [113]. ylic acid synthase) and thereby induce gene expression [72]. ACS is Consequently, vegetative VPEs (a and g) are not a prerequisite for inducing ethylene biosynthesis, which is a hormone influencing precursor protein processing as long as bVPE is present. plant growth, development and defence response. Likewise, human

Please cite this article in press as: E. Dall, H. Brandstetter, Structure and function of legumain in health and disease, Biochimie (2015), http:// dx.doi.org/10.1016/j.biochi.2015.09.022 E. Dall, H. Brandstetter / Biochimie xxx (2015) 1e25 7 legumain was found to translocate to the nucleus, albeit under disulfides are highly conserved. Glycosylation sites, the RGD-motif, pathophysiologic conditions [66]. It is tempting to speculate that the TRAF6-binding sequence, signal peptides and the exact there it might also act as a regulator of gene expression, i.e. as a sequence and position of autocatalytic processing sites is varying transcription factor [72]. Structural analysis would suggest an throughout different organisms and likely presents a means to electrostatic complementarity of the negatively charged DNA adapt the enzyme to the specific needs of the host. phosphate backbone to the basic C-terminal LSAM domain, not to the catalytic domain [13]. Intriguingly, an analogous transcription factor activity has been 6.1. Recombinant access demonstrated for MMP-12 which is secreted but was found to translocate to the nucleus of virus-infected cells, exhibiting an The gene encoding human legumain is located on chromosome antiviral effect [83]. 14 at the locus14q32.12. Since prolegumain harbours several posttranslational modifications like N-linked glycosylation sites, fi 5.7. Osteoclast remodelling disul des and autocatalytic processing sites, recombinant expres- sion of the enzyme proved to be challenging. However, meanwhile Legumain was also found in the inhibition of osteoclast forma- the successful production in different expression systems has been tion and thus to inhibit bone resorption [135e137]. Interestingly, reported; for a detailed list see Table 1. With some exceptions it was the environment of osteoclasts as well as their cytosol is slightly mostly eukaryotic systems including insect cells, yeast, HEK, CHO acidic, consistent with legumain being in an active state [138e140]. and LEXSY that proved to be successful. However, recently the However, the osteoclast inhibitory function was mapped to the C- expression of different plant isoforms as soluble proteins in the fl terminal prodomain and is correspondingly independent of its E. coli Shuf e strain was reported [144]. If the proenzyme was fi protease activity. To emphasize its role in bone remodelling, legu- produced recombinantly, subsequent puri cation proceeded fi fi main was synonymously termed osteoclast inhibitory peptide 2 mostly via a His-tag and Ni-af nity puri cation. [136]. 6.2. Crystal structures of legumain 5.8. Legumain deficient mice Crystal structures of prolegumain and the enzymatically active Legumain knock out mice have been constructed by different endopeptidase domain have been determined recently for human, groups and proved to be healthy with some minor phenotypes, mouse and Chinese hamster legumains [13,145,146]. For repre- when unchallenged [123,124,126,141,142]. Legumain knock out sentative pdb entries see 4aw9, 4fgu, 4noj, 4nok, 4d3x. The crystal mice showed slower antigen processing (TTCF); the initial T-cell structures revealed that prolegumain comprises of the caspase-like response is faster in wild-type mice. However, the difference to the AEP domain and a C-terminal prodomain positioned on top of the final anti-tetanus response was only little [116]. Moreover, the mice (Fig. 4B). The prodomain can be further segmented into a had a normal phenotype, were viable and healthy but did not gain latency conferring activation peptide (AP) and an LSAM (Legumain weight at the same rate as wild-type littermates [116]. Further- Stabilization and Activity Modulation) domain. Substrate access in more, Chan and colleagues found that legumain-deficient mice prolegumain is prohibited by the AP on the non-primed and S10and accumulated procathepsins in the kidney, which is typical for by the LSAM domain on the primed substrate interaction sites, lysosomal disorders [141]. Additionally, the mice showed symp- explaining why prolegumain is enzymatically inactive. The LSAM toms typical for the hemophagocytic syndrome. domain has a death domain (DD)-like fold, albeit with different connectivity of the DD-helices, and is additionally stabilized by two 6. Domain architecture e how does legumain look like? disulfides (Fig. 4B, C). Like the lysosomal cathepsins, legumain is synthesized as an inactive proenzyme that needs to undergo pH dependent auto- 6.3. The catalytic-domain e legumain is protease and ligase catalytic activation to release the active endopeptidase [39,143]. Essentially, human prolegumain starts with a signal peptide (Met1- Legumain's functions in digestion, antigen processing, signalling fi Ala17) that is released during traf cking, followed by a short 8 via processing/activation can be attributed mainly to its protease amino acid N-terminal propeptide (Val18-Asp25), the cysteine and ligase activities. Therefore, there can be no doubt that the protease domain (Gly26-Asn323) and a C-terminal prodomain catalytic caspase-domain is the molecule responsible for fulfilling (Asp324-Tyr433) (Figs. 3 and 4A) [13,39]. The cysteine protease these functions, although the LSAM domain modulates the pepti- domain has a caspase-like overall structure and harbours the cat- dase and possibly also the ligase function, as discussed below. alytic triad (Cys189-His148-Asn42) and, depending on the organ- ism, a varying number of N-glycosylation sites. Human legumain harbours 4 N-glycosylation sites resulting in a molecular weight of 6.3.1. Protease substrates ~56 kDa of the fully glycosylated proenzyme [13,18]. The C-terminal Meanwhile a huge number of legumain protease substrates have prodomain adopts a death-domain-like architecture that is addi- been identified in different organisms. For a detailed list including tionally stabilized by 2 disulfides [13]. Furthermore, the catalytic references please refer to Table 2. Essentially, protein substrates domain harbours an RGD motif that proved to be important for include prolegumain itself, (auto)antigens like the myelin basic interaction with integrin aVb3 [88]. Recently a (reversible) ubiq- protein (MBP) or the tetanus toxin C-terminal fragment (TTCF), uitination site (Lys318) and TRAF6 have been identified other lysosomal proteases like cathepsins B, H and L and their with consequences in breast cancer [63]. While prolegumain is cystatin inhibitors; immune-related proteins like innate Toll-like stable at neutral pH, active uncomplexed legumain is destabilized receptors TLR9/7 and the invariant chain chaperone Ii; seed stor- at pH > 6.0. age proteins, proforms of cyclic inhibitors, haemoglobin seem Overall, the catalytic domain is highly conserved throughout all specific to plants; in Alzheimer's disease the protein phosphatase 2 kingdoms, while the C-terminal prodomain is less well conserved inhibitor SET and the . All these substrates share an (Fig. 3). However, structural elements like the catalytic residues and asparagine or aspartic acid at P1 position.

Please cite this article in press as: E. Dall, H. Brandstetter, Structure and function of legumain in health and disease, Biochimie (2015), http:// dx.doi.org/10.1016/j.biochi.2015.09.022 8 E. Dall, H. Brandstetter / Biochimie xxx (2015) 1e25

Fig. 3. The legumain sequence space includes GPI-transamidase and caspases. A sequence alignment of human (Q99538), mouse (O89017), chinese hamster ovary (CHO, G3I1H5) and schistosome legumain (Q9NFY9), Haemaphysalis longicornis legumain isoform 1 (tick_1st; A4PF00) and isoform 2 (tick_2nd A9CQC1), A. thaliana aVPE (P49047), bVPE (Q39044), gVPE (Q39119), dVPE (Q9LJX8) and jack bean legumain (P49046), human GPI-transamidase (Q92643) and human caspase-1 (P29466) was prepared using ClustalW [265] and modified with Aline [266]. Autocatalytic processing sites are indicated with arrows, catalytic residues with red stars, glycosylation sites with green triangles, S1-specificity residues with blue diamonds, the double arginine-motif (Arg342, Arg403) with light blue diamonds, the RGD120 motif is shaded in yellow, the TRAF6-binding site in purple (A210NPRESSY217), the ubiquitination site Lys318 is indicated by a purple circle and the KRK289eKRK318 motifs, constituting a potential nuclear localization signal, are labelled with a blue box. The sequence numbering corresponds to human legumain. Secondary structure elements are indicated based on the crystal structure of human prolegumain (pdb entry 4fgu), whereby green elements correspond to the catalytic domain, blue elements to the activation peptide and orange elements to the LSAM (Legumain Stabilization and Activity Modulation) domain.

6.4. The LSAM-domain e signalling incompatible with the pH-stability requirements of the active pro- tease. Therefore, it is more likely that under such conditions either Although legumain is mostly referred to its hydrolytic activity, full-length prolegumain or a stabilized complex of the catalytic some of its functions are most likely independent from its protease domain with an unspecified partner is present at those locations. activity, including its roles as transcription factor [72] or in osteoclast Furthermore, the recognition of a death-domain like architecture remodelling. The latter activity was mapped to its C-terminal pro- within the prodomain provokes the speculation whether the LSAM domain [136]. Also, some of its reported (extra)cellular locations are domain is mediating proteineprotein interactions, typical for the

Please cite this article in press as: E. Dall, H. Brandstetter, Structure and function of legumain in health and disease, Biochimie (2015), http:// dx.doi.org/10.1016/j.biochi.2015.09.022 E. Dall, H. Brandstetter / Biochimie xxx (2015) 1e25 9

Fig. 4. Domain architecture of legumain. A, 1D-diagram of human prolegumain. Red: signal peptide (SP), light green: N-terminal fragment (NTF), dark green: catalytic domain, blue: activation peptide (AP; KRK289eN323), orange: Legumain Stabilization and Activity Modulation (LSAM; D324eY433) domain. (Autocatalytic) processing sites are labelled by arrows, catalytic residues by red stars, S1-specificity residues by blue diamonds, and the RGD120 motif by a purple triangle. Disulfide bridges on the LSAM domain are indicated. B, Crystal structure of human prolegumain (pdb entry 4fgu). The caspase-like catalytic domain is shown in green, the AP in blue and the LSAM domain in orange. The active site is labelled by the catalytic Cys189 and the C-terminal processing sites KRK289 and N323 are shown in sticks. The KRK289eKRK318 motif on the AP represents a potential bipartite nuclear local- ization signal. Furthermore, legumain may get ubiquitinated at Lys318. C, Topology diagram of prolegumain. The catalytic domain is shown in green, the AP in blue and the LSAM domain in orange. Red stars: catalytic residues, blue diamonds: S1-specificity residues.

DD-fold (Fig. 4B, C). Inflammatory and initiator caspases harbour at both termini, resulting in 47 kDa and 46 kDa intermediate legumain their N-terminus CARDs (caspase recruitment domain) and DEDs species (Fig. 5A) [102,143]. The C-terminal processing site in (death effector domain). These modules form, together with the DDs mammals was assigned to Asn323, N-terminal processing occurs (death domains) and the PYDs (pyrin domains), the death-fold su- after Asp21 and Asp25 [102,143,156]. C-terminal processing is perfamily of homotypic interaction motifs [147]. Death domains play already observed at pH 5.5 whereas processing after the N-terminal pivotal roles in apoptosis, necrosis, inflammation and immunity by aspartic acid residues is restricted to pH 4.5. While processing mediating the assembly of oligomeric signalling complexes like the after the Asn323 residue proved to be critical to generate peptidase apoptosome or the inflammasome [148,149]. Death domain in- activity, N-terminal cleavage is not essential for activation [81]. teractions do generally not cross the boundaries of a subfamily [147]. Additionally, another auto-cleavage site was identified in the hu- The death-fold is characterized by a bundle of typically 6 antiparallel man variant after Asp303-309 residues [81]. In vivo, an additional a-helices forming a Greek key topology [150]. They interact mostly C-terminal in trans trimming of 46 kDa legumain is observed, via three types of asymmetric interactions [21,151].Thesixsofar performed by a different protease resulting in a 36 kDa legumain identified interaction patches are non-overlapping, therefore each variant. However, the identity of the protease and the exact module can in theory bind up to six partners [147].Oligomeric cleavage site remain unclear. Indirect evidence hints at KRK289 as complexes have been identified for DDs [152,153]. CARDs, PYDs and the additional processing site, which might be performed by an DEDs have only been found in dimeric complexes [21,154,155]. enzyme with trypsin-like specificity like prohormone convertases, Therefore it is tempting to speculate, whether similar complexes e.g. PC2 [13,16]. Interestingly, the enzymatic activity of the 36 kDa exist for (pro)legumain, the asparaginyl-specific carboxypeptidase and the 46 kDa species towards a peptidic substrate is virtually or the isolated LSAM domain. identical [102]. Similarly, Asp303-309 processed legumain does not differ in activity [81]. Instead, this processing may affect legumain's ubiquitination by TRAF6 which occurs at Lys318 [157]; additionally, 7. Activation of legumain the Asp303/309 cleavage will likely influence its nuclear localiza- 289 fi tion, as it disrupts the bipartite nuclear localization motif KRK - 7.1. Activation to asparagine-speci c endopeptidase (AEP) 318 X26-KRK . Upon auto-activation the pH-stability profile of legumain is While prolegumain is synthesized as inactive zymogen (56 kDa changing dramatically. While prolegumain is stable at near in mammals), an acidic pH shift triggers autocatalytic processing at

Please cite this article in press as: E. Dall, H. Brandstetter, Structure and function of legumain in health and disease, Biochimie (2015), http:// dx.doi.org/10.1016/j.biochi.2015.09.022 10 E. Dall, H. Brandstetter / Biochimie xxx (2015) 1e25

Table 1 Similar to mammalian legumain, plant, Schistosome and tick Recombinant expression systems. legumains are also activated via autoproteolytic processing at acidic Organism Isoform/mutants Reference pH [12,158e160].

Pichia pastoris Sugar cane [267] 7.2. Activation to ACP Barley HvLeg-2, HvLeg-4 [30] Ixodes ricinus [12] Branchiostoma belcheri [235] The recently determined crystal structures of prolegumain and Schistosoma mansoni C189N [160,175] the enzymatically active endopeptidase revealed a two-domain HEK293 organization of prolegumain with a caspase-like AEP (peptidase) Human N323A [102,268] domain and a C-terminal prodomain positioned on top of the Mouse H45A, H148A, C189S [40] enzyme (Fig. 4B) [13,145,146]. The prodomain can be further sub- Insect cells Mouse [145] divided into a latency conferring activation peptide (AP) and an Human [145] LSAM (Legumain Stabilization and Activity Modulation) domain. Arabidopsis thaliana gVPE [159] The AP blocks access on the non-primed and S10 site, and the LSAM LEXSY domain on the primed substrate binding sites, explaining why Human E190K [81] CHO prolegumain is inactive. Besides blocking substrate access, the Chinese hamster ovary [146] LSAM domain serves as an electrostatic stabilizer of the AEP Human [143] domain at neutral pH by triggering an electrostatically encoded E. coli stability switch (ESS) localized near the AEP active site. Selective Blastocystis GST-fusion [269] proteolytic removal of the AP at (near) neutral pH converts prole- Castor bean [5] fi Sunflower HaAEP1 [144] gumain into an asparaginyl-speci c mono-carboxypeptidase (ACP; Jack bean CeAEP1 [144] Fig. 5A, B) [13]. The standard auto-catalytic cleavage after Asn323 Arabidopsis AtAEP2 [144] (b-site) clips the C-terminal end of the AP and renders the active Castor bean RcAEP1 [144] site at least partially available. An additional cleavage after a KRK289 E. coli refolding Haemaphysalis longicornis HlLgm2 [31] site (a-site) at the N-terminal end of the AP is needed for its release. Schistosoma mansoni [270] The latter cleavage is most likely performed by an enzyme with trypsin-like specificity like PC-2. Contrasting the standard activa- tion at low pH, the LSAM domain remains electrostatically bound to neutral pH, auto-catalytically activated legumain is stable at the AEP domain and is thereby stabilizing it at neutral pH, similar as acidic pH, as found in the endo-lysosomal system, but destabi- observed for prolegumain. The LSAM domain not only conforma- lized at pH > 6.0. tionally stabilizes the peptidase at neutral pH, but also provides a double arginine motif that is characteristic for carboxypeptidases

Table 2 Legumain substrates.

Protease/ligase substrate Protease substrate (P1eP10) Ligase (P1eP10) Reference

Mammals Annexin A2 [233] , H, L, S [79,126] Progelatinase A [128] Myelin basic protein (MBP) [43] (Pro)legumain AsneAsp AsneAsp [81,102,143,156] TTCF (Tetanus toxin C-terminal fragment) [115] TLR 7/9 [123,124] Invariant chain chaperone (Ii) [127] BetV1 [271] Fibronectin [272] Acetoacetyl-CoA synthetase [273] SET [134] Tau [71] Cystatin C, E AsneAsp/Ser AsneAsp/Ser [106] TDP-43 (TAR DNA-binding protein 43) [254] Tick, schistosoma Haemoglobin [35,108] Schistosoma cathepsin B [274] BSA [49] Prolegumain [12,160] Plant Globulin [275] 11S/12S proglobulin [112,113] Albumins [113,276] Prolegumain [158,159,277] Glutelin [111,278] Gliadin SH-EP (Serine protease inhibitor precursor) [279,280] Proconcanavalin A AsneGlu AsneSer [9,205] PawS1 (‘pro-SFTI-1’) AsxeXxx AspeGly [206] Kalata-type cyclic peptides (e.g. Oak1 ¼ precursor of kalata B1) AsneGly/Ser/His AsneGly [122,207] Cyclic knottins AsxeXxx AspeGly [208]

Please cite this article in press as: E. Dall, H. Brandstetter, Structure and function of legumain in health and disease, Biochimie (2015), http:// dx.doi.org/10.1016/j.biochi.2015.09.022 E. Dall, H. Brandstetter / Biochimie xxx (2015) 1e25 11

Fig. 5. Activation of legumain to the asparaginyl-specific endopeptidase (AEP) or carboxypeptidase (ACP). A, Inactive prolegumain is synthesized at neutral pH and undergoes autocatalytic processing at the C-terminal Asn323 (pH 5.5) and D303/309 (pH 4.0) sites and N-terminally after Asp21 and Asp25. Another in trans processing occurs after the KRK289 motif by an unknown protease. While N-terminal processing is not essential for enzyme activation, C-terminal processing renders the non-primed substrate binding sites accessible. The release of the activation peptide (blue) generates ACP whereas the release of the Legumain Stabilization and Activity Modulation (LSAM) domain is obligatory to gain full AEP activity. B, Proteolytic and electrostatic activation of legumain. At pH 6 legumain can be activated to an Asn-specific mono-carboxypeptidase (ACP) by proteolytic release of the activation peptide (AP; proteolytic activation). Shifting pH to < 5 leads to the electrostatic removal of the LSAM domain and releases the active endopeptidase (electrostatic activation). C, Zoom-in view on the ACP active site with a modelled substrate (Ac-YVADjA; blue sticks). The substrate was modelled based on the Ac-YVAD-chloromethylketone complex structure (pdb entry 4aw9). The double-arginine-motif (Arg342, Arg403) is shown in orange sticks. D, Zoom-in view on the AEP active site. Catalytic residues are shown in green sticks, S1-specificity resides in blue (Arg44, His45) and red (Glu187, Asp231) sticks, and Glu190 in yellow sticks.

(Fig. 5B, C) [161]. The notion that legumain can develop endo- and enzymatic properties and thus help to reconcile its partly con- exopeptidase activities in a pH-dependent manner is similar as flicting moonlighting activities and localizations. observed in cathepsin B. Cathepsin B can develop endo- and exopeptidase activities depending on the conformation of an 7.3. Catalytic mechanism of legumain protease activities (AEP/ACP) occluding loop [162]. At acidic pH (in the lysosome), cathepsin B acts as a carboxypeptidase, whereas at near neutral pH (early en- A key feature of carboxypeptidases is a positively charged an- dosome; extracellularly), it acts as an endopeptidase. The pH- chor to electrostatically grab a substrate's C-terminus [161]. Legu- dependence of the legumain AEP and ACP activities is differing, main harbours the double arginine motif Arg342 and Arg403, with the carboxypeptidase activity present at near neutral pH and derived from the LSAM helices DDa2 and DDa4. While the double the endopeptidase activity dominating at acidic pH. The distinct arginine motif is important for legumain ACP activity only, the activation states of legumain differ in both their biophysical and active site residues Cys189, His148 and Asn42 are essential for both

Please cite this article in press as: E. Dall, H. Brandstetter, Structure and function of legumain in health and disease, Biochimie (2015), http:// dx.doi.org/10.1016/j.biochi.2015.09.022 12 E. Dall, H. Brandstetter / Biochimie xxx (2015) 1e25

AEP and ACP activities (Fig. 5C, D). The catalytic residues are ar- ranged similar to the conformation found in caspases, and likewise the deprotonation of the catalytic Cys189 occurs spontaneously (i.e., not triggered by a catalytic base) and is dependent on the local pKa. The role of the catalytic His148 most likely is, together with the amides nitrogens of Cys189 and Gly147, to complete the oxy- anion pocket for the substrate's P1 residue; additionally, it may serve as catalytic acid for the leaving P10 residue and thereby facilitate the rupture of the scissile peptide bond. The side chain of Asn42 is located at a position similar to the carbonyl oxygen at position c177 in caspases. Although the exact role of Asn42 in is unclear, it most likely orients His148 and thus assists in polarizing and activating the substrate. The same might be true for the carbonyl oxygen 177 in caspases [163], paracaspases [164,165] and metacaspases [166].

7.4. Comparison of legumain with caspases and cathepsins

Legumain was classified as a member of clan CD, the caspase- like enzymes, by its sequential and structural relationship, as testified by a ~15% sequence homology and a highly conserved sequence motif surrounding the active site residues (Fig. 3). These highly conserved motifs suggest an evolutionary relationship [16e18]. Also, its strict specificity at the P1 position links it to caspases, albeit different in the nature of the preferred P1-residue. However, its location to the endo-lysosomal system, its pH dependent autoproteolytic activation, its pH stability profile, and its physiologic functions rather link it to cathepsins (clan CA). Like- Fig. 6. Crystal structure of the asparaginyl-specific endopeptidase (AEP). The crystal wise, it is inhibited by several members of the cystatin superfamily structure of human legumain activated at pH 4.0 is shown as a green cartoon (pdb providing a further relation to cathepsins. But these similarities are entry 4awb). The active site is labelled by the covalent Z-Ala-Ala-AzaAsn-chlor- 120 only superficial and not reflected by a detectable sequence ho- omethylketone inhibitor (orange sticks). Catalytic residues and the RGD motif are shown as sticks. mology between legumain and cathepsins [39]. Essentially, legu- main resembles a caspase in cathepsin's clothing. Indeed, cystatins are inhibiting legumain via a structurally distinct reactive centre recognition is basically replaced by the glycosylation at position loop located on the opposite to its /cathepsin-reactive loops 263 in mammalian legumains. Moreover, legumain's S1-specificity [167,168]. Therefore, the interaction of cystatins with legumain is pocket differs from caspases. Its bipolar nature makes it specific for not indicative for a structural similarity. Consistently, legumain is Asn over Asp (Fig. 5D). Unlike caspases, mature legumain is a single insensitive to E-64, a broad spectrum epoxy inhibitor of papain-like chain, monomeric enzyme. Legumain exploits an activation route proteases [39,167]. AEP's catalytic domain has a caspase-like fold different from caspases. The death domain like fold of the LSAM and also the interaction with a substrate is very much like in cas- domain links it to the monomeric caspases carrying N-terminal pases (Figs. 6 and 5D). Although, the core of secondary structure CARD (caspase-1, 9) or DED domains (caspase-8) [169]. Although elements is caspase-like, there exist some major differences. Firstly, structures of procaspase-1 (pdb entry 3e4c) and procaspase-8 (pdb there are four extra helices. One is separated from helix a2 via the entry 2k7z) are available, the exact position of the prodomain RGD motif (aI), aII is replacing the intersubunit linker connecting relative to the catalytic domain is still not clear. Conversely, the what is known as large and small subunit from caspases and there crystal structure of prolegumain gives a structural hint on the po- are two extra helices at the C-terminus (aIII and aIV) (Fig. 4C). tential position of the N-terminal prodomain in caspases, which e Secondly, the loop regions (L2 L4), which are important for cas- can be checked by biochemical data. pase activation, are different. There is an enormous extra loop starting after b2 spanning from Met70 to Thr109. This legumain specific insertion-loop is well conserved among all kingdoms, 8. Substrate specificity suggesting a specific function (Fig. 3). Interestingly, it harbours a glycosylation site (Asn91) close to the AP suggesting that it is sta- 8.1. P1 Asn/Asp bilizing the AP and consequently protects prolegumain against , i.e. premature activation. In contrast to caspases, the Legumain has a very narrow substrate specificity, cleaving intersubunit linker (loop L2 in caspase nomenclature) is ordered in selectively after asparagine residues. Only at very acidic pH con- legumain and forms helix aII. The substrate specificity loops c341 ditions, aspartic acid is also tolerated [81,170,171]. This observation and c381 (caspase-1 numbering) differ in size. Legumain has a was confirmed in mammals and plants likewise and can be un- prolonged and well-structured c341-loop forming the platform for derstood by the steric restraints and the zwitterionic architecture non-primed substrate recognition. The c381-loop in legumain is implemented in the S1-specificity pocket (Fig. 5D). The negatively shortened to a minimum, but it harbours a glycosylation site charged hemisphere ideally stabilizes the NH2 of an asparagine side Asn263 (Figs. 3 and 6). This glyco-moiety on Asn263 may prove chain, but will cause electrostatic repulsion of a charged aspartate important for substrate specificity at P5 and upwards. Interestingly, at pH 5.5, although it would fit in size. This is contrasting the sit- plant legumains have a c381-loop that is comparable in length to uation in caspases, where Serc347 and Glnc283 are in place of the caspases and consistently lack the glycosylation on this loop. This negatively charged residues of legumain. However, at a pH the comparison suggests that the role of the c381-loop in substrate pKa of aspartic acid (~3.8) the side chain becomes protonated and

Please cite this article in press as: E. Dall, H. Brandstetter, Structure and function of legumain in health and disease, Biochimie (2015), http:// dx.doi.org/10.1016/j.biochi.2015.09.022 E. Dall, H. Brandstetter / Biochimie xxx (2015) 1e25 13 now resembles an asparagine both in size and in charge [172]. 8.4. ACP substrate specificity Importantly, legumain P1 substrate specificity for asparagine does not tolerate post-translational modifications such as N-linked Using peptide competition assays it could be shown that legu- glycosylation and iso-Asp formation. Likewise, asparagine deami- main's asparaginyl carboxypeptidase (ACP) activity is a preferential dation interferes with processing as seen on the example of the mono-carboxy-peptidase, selectively removing one amino acid tetanus toxin C-terminal fragment TTCF [173]. from a protein C-terminus (Fig. 5C) [13]. As AEP, the ACP-activity obligatory requires an Asn at P1 position. However, a free C-ter- 8.2. P3eP2 and P10 minus can likely be replaced by an aspartic acid at P10 position, since the side chain of Asp resembles a C-terminus. This substitu- Besides its strict specificity for Asn at P1 position, legumain is tion is, for example, observed in the auto-catalytic Asn323-Asp324 less selective at P3eP2 and primed positions. The c341-loop serves cleavage site of human prolegumain. ACP's substrate preference in as a template for nonprimed substrate binding, mediated by anti- P10 is thus reversed to AEP's P10 preference where Asp or Glu would parallel b-sheet formation of the substrate with Ser216-Tyr220 on be disfavoured because of their adverse kcat effect. the bIV strand (Fig. 5D). Peptide libraries have been used to deter- mine cleavage preferences of pig, Schistosoma and human legu- 9. Ligase activities main. As a result it turned out that Arg/Lys/His are well tolerated at P10 whereas D/E are converted slower [174]. This observation is 9.1. ATP-dependent ligases consistent with kcat tuning effects that result from changes in the local pKa of the catalytic Cys189, as discussed in the next paragraph In order to link two peptides via a peptide bond, the C-terminus [13]. Residues with small side chains (G, C, N, S, A) are also well of one fragment needs to be activated. While a large variety of tolerated at P10 whereas proline was not well tolerated [174]. coupling reagents such as ester and anhydride bonds are used in Additionally, substrate specificity between legumains from synthetic chemistry, physiological activation is achieved by the different species seems to differ. This can be exemplified by Schis- formation of energy rich bonds, in particular carboxylate phosphate tosoma legumain which has an ideal cleavage sequence of mixed anhydrides [181]. Typically, either one of two energy rich ThreAlaeAsn (P3eP2eP1) whereas the human enzyme prefers ATP-derived intermediates is observed in vivo: aminoacyl-AMP ProeThreAsn [175]. Interestingly, plant legumain showed a pref- (aa þ ATP / aa-AMP þ PPi; aa: amino acid) and amino- erence for bulky residues at P3 position [171]. Additionally, glyco- acylphosphate (aa þ ATP / aa-Pi þ ADP). The ribosome uses the sylation on the 381-loop likely has a modulating effect on non- first intermediate to synthesize peptide bonds. In this case an primed substrate specificity [13]. aminoacyl-AMP is transferred to a t-RNA, consequently releasing the AMP. Thereby a stable aminoacyl-tRNA intermediate is gener- 8.3. Tuning legumain protease activity: kcat selection ated [182,183]. Non-ribosomal synthetases like glutathione syn- thase and L-amino acid a-ligase are examples for enzymes forming “How can a cysteine protease be active at acidic pH?” is one of peptide bonds via an aminoacylphosphate intermediate [183e186]. the major questions when working with lysosomal cysteine pro- The schematics of ATP-dependent ligation reactions are shown in teases. Deprotonation of the catalytic Cys Sg is essential to perform Fig. 7A. a nucleophilic attack. In chymotrypsin-like serine proteases a cat- alytic His serves as a base and abstracts the proton from the cata- 9.2. ATP-independent transpeptidases lytic Ser, enabling it to perform a nucleophilic attack [176,177]. Lysosomal cathepsins belonging to the papain-like cysteine prote- Besides the classic ATP-dependent, de novo peptide bond syn- ase family employ an analogous triad with an imidazoliume thio- thesis, there are also examples for ATP-independent transpeptidase late ion pair stabilization. Additionally, the catalytic cysteine is mechanisms, such as implemented in inteins and sortases. Inteins placed at the beginning of an a-helix; this helix cap with its positive catalyse a protein splicing reaction, i.e. the autocatalytic excision of charge of the helix dipole further favours the thiolate state, as re- an internal domain (“intein”) within a protein. In the most common flected by a resulting pKa of 3.6 [178e180]. situation, the central intein domain is flanked at the N-terminal and Such pKa tuning is not possible in caspase-like enzymes where C-terminal side by the N-extein and C-extein domain, respectively. the catalytic His237c is ~6 Å away from the catalytic Cys285c Sg Inteins are found in prokaryotes, eukaryotes and archaea, albeit in (Fig. 5D). Therefore, His237c (His148 in legumain) cannot abstract varying numbers. Methanococcus jannaschii for example contains the Sg proton. Consequently, the deprotonation of legumain's cat- 19 inteins [187,188]. alytic Cys189 can be expected to become the rate-limiting step at Their catalytic mechanism is essentially composed of 4 steps acidic pH of the lysosome. This pH dependence is indeed exploited and repeatedly exploits nucleophilic substitution reactions as in by Glu190 in human legumain which implements a kcat-driven serine or cysteine proteases. Therefore, suitable nucleophiles (Ser/ mechanism to fine-tune its substrate specificity [13]. The highly Cys/Thr) must be present both at the N-terminus of the intein and conserved Glu190 is positioned next to the catalytic Cys189 and of the C-extein; additionally, an Asn must be present at the intein's leads to an increase of the local pKa of Cys189 by stabilizing the C-terminus. The splicing reaction starts with the intein's Cys1/Ser1 proton at Sg (Fig. 5D). Legumain thus harbours a carboxylate e thiol attack of the N-terminal splice junction, converting the N- polar pair with reversed signs as compared to the imidazoliume- exteineintein peptide bond into a (thio)ester intermediate (step 1). thiolate ion pair found in cathepsins. In agreement with this anal- An analogous nucleophilic attack (step 2) by the C-extein nucleo- ysis, the charge reversal mutant E190K positions a positive charge phile Cys þ 1/Thr þ 1/Ser þ 1 results in a branched (thio)ester in- close to the catalytic Cys189 and thereby enhances legumain's kcat termediate between the N-extein and the C-extein, whereby the by decreasing the local pKa of Cys189. The E190K mutation does not intein is still connected with the C-extein. In a next step, (step 3) the alter the substrate binding affinity (KM) but the turnover speed intein's C-terminal Asn cyclises into a succinimide, thereby 0 0 (kcat). Substrates presenting charged residues at P1 and P2 posi- releasing the intein from the extein domains. Finally, (step 4) by an tion may therefore regulate their own turnover speed, in agree- acyl shift the (thio)ester bond between the extein domains is ment with results from Schwarz et al. [174], as described in the converted to an amide bond, i.e. a standard peptide bond [189]. previous paragraph. Sortases are transpeptidases present only in Gram-positive

Please cite this article in press as: E. Dall, H. Brandstetter, Structure and function of legumain in health and disease, Biochimie (2015), http:// dx.doi.org/10.1016/j.biochi.2015.09.022 14 E. Dall, H. Brandstetter / Biochimie xxx (2015) 1e25

Fig. 7. Schematics of catalytic mechanisms employed to form peptide bonds. A, In ATP-dependent protein synthesis, such as by the ribosome, energy rich ATP is used to activate a carboxy-terminus (peptide1) by covalently attaching an AMP (A is adenosine). B, By contrast, transpeptidation reactions use a nucleophile, usually the Og of a serine or the Sg of a cysteine, to activate a peptide bond (R is the Ca of the P10 amino acid) or the carbamoyl group of a glutamine side chain (R is a hydrogen). While inteins and sortases attack a peptide bond and transfer a new peptide, transglutaminases attack a glutamine side chain and form an isopeptide bond with the Nz of a lysine. C, In legumain, the C-terminal carboxylic acid of the substrate is activated by an energy rich Asu147 found in the active site. Thereby, an anhydride intermediate is generated. In either case the reaction proceeds with the formation of a peptide bond, linking peptide 2 via its free amino group to the C-terminus of peptide 1. The nucleophile of “peptide2” could be the amino group of a peptide or the Nz of a lysine side chain.

bacteria. They catalyse the covalent attachment of a secreted pro- mechanism proposed for plant legumains, as discussed below tein to the peptidoglycan cell wall [190,191]. There are 4 subfamilies [15,144]. The lack of a thio-ester intermediate in the ligation by sortase AeD, with sortase A being most prominent and present in human legumain poses the question for an alternative energy most gram-positive bacteria [192]. Their catalytic mechanism ex- source which could activate the P10 carboxylate and thereby enable ploits a conserved Cys residue that performs a nucleophilic attack peptide bond formation. The crystal structures of human, mouse of the substrate at the conserved LPXT-G motif (usually between T and also CHO legumain uncovered a conserved succinimide/ and G). Thereby a tetrahedral intermediate is generated, leading to aspartimide147 (Asu147) preceding the catalytic His148 (as evi- the rupture of the peptide bond and the concomitant release of the denced by the electron density of, for example, pdb entries 4n6o, primed side product. The mechanism thus mimics that of a cysteine 4noj) (Fig. 8B) [106,145,146]. Succinimides are generated from the protease. Importantly, the resulting thio-ester in sortases is long- Asp precursor via a condensation reaction [196]. The conversion of lived enough to allow the free N-terminus from a peptidoglycan Asp/Asn residues to succinimide and subsequent Asp racemization precursor (or a Lys side chain) to attack. Thereby a second tetra- in proteins is diagnostic to the ageing of proteins and is also hedral thio-hemiketal intermediate is generated. Finally, the cata- important in age-related diseases like AD and cataract [197e199]. lytic Cys is released, completing the formation of the new peptide Succinimides are energy rich moieties that represent the transition bond and the regeneration of the enzyme [193e195]. The sche- state between the conversion of an Asn to Asp via deamidation matics of intein- or sortase-like transpeptidation reactions are [196]. Succinimides are labile in water and pH sensitive, i.e. they are shown in Fig. 7B. easily hydrolysed at neutral pH by a hydroxide but can be stabilized at acidic pH where OH concentration is low [200,201]. Because 9.3. Catalytic mechanism of legumain ligase activity they harbour energy rich bonds (like ATP) they are used as coupling reagents [181]. The coupling of proteins via free amines onto a Mammalian legumains can be active both as protease and ligase, biosensor chip also exploits this feature [202]. Interestingly, suc- albeit with differing pH-profiles [106,145]. While the protease ac- cinimide residues have also been found in different crystal struc- tivity dominates at acidic pH, the ligase activity dominates at near tures, e.g. lysozyme [203] and amylomaltase. In the latter case neutral pH. Both activities are in a pH dependent equilibrium, su- succinimide formation is thought to be accelerated at increasing perposing each other. For human legumain it was further shown temperature and be indicative for increased protein stability of the that the ligase activity is independent of the catalytic Cys189. Co- enzyme from Thermus thermophilus [204]. valent modification of Cys189 Sg uncoupled the ligase and protease Prolegumain harbours an Asp147 when it leaves ribosomal activities by blocking the protease but not the ligase activity. This protein biosynthesis, as confirmed via mass spectrometric analysis. uncoupling was structurally translated into a rotation of the cata- However, during autocatalytic activation the Asp is converted to an lytic Cys189 Sg away from the scissile peptide bond in the legu- aspartimide (Asu) as observed in human legumain. Asp147 is mainecystatin E complex, representing the ligase state of the strictly conserved throughout all legumain sequences, like the enzyme (Fig. 8B) [106]. These observations exclude sortase and catalytic protease triad is, but is mostly Ser, Ala or Gly in caspases intein-like mechanisms and thus appear in marked contrast to the (Fig. 3). Additionally, mutagenesis studies (D147S) indicate the

Please cite this article in press as: E. Dall, H. Brandstetter, Structure and function of legumain in health and disease, Biochimie (2015), http:// dx.doi.org/10.1016/j.biochi.2015.09.022 E. Dall, H. Brandstetter / Biochimie xxx (2015) 1e25 15

Fig. 8. Interaction of cystatins with legumain. A, Crystal structure of legumain (AEP, green) in complex with human cystatin E (light blue). RCL: reactive centre loop, LEL: legumain exosite loop. The binding site of papain-like enzymes is indicated by a red circle. B, Zoom-in view on the legumain active site. The cystatin E reactive centre loop harbours the conserved Asn39 (blue sticks) and binds like a substrate to the AEP active site. Catalytic residues are indicated in green sticks, the cystatin E reactive centre loop in dark blue. C, Schematic model of legumain inhibition by cystatins. Cystatins (blue) bind to legumain (green) at pH > 4. Depending on the pH environment an equilibrium between Asn39- processed and intact/religated cystatin will be established. At acidic pH the processed variant dominates, whereas at neutral pH intact cystatin is abundant. Cleaved inhibitor remains bound to legumain because of exosite interactions mediated by the LEL (purple).

relevance for Asu147 for the ligation of peptide bonds [106]. Based here is to excise the 14 aa SFTI-1 peptide and perform the ring- on these observations a mechanism of legumain ligase activity was closure reaction, in order to obtain the mature cyclic inhibitor. proposed, where Cys189 is cleaving a substrate, thereby releasing a are cyclic peptides that play important functions in plant free C-terminus (step 1). Next, the free P1 C-terminus can attack the defence against pathogens [119]. They are structurally character- proximal Asu147, resulting in an energy rich carboxylate anhydride ized by a head-to-tail cyclised backbone and 3 conserved disulfides (step 2; Fig. 7C). Then another succinimide-intermediate is gener- with knotted topology. Prominent examples include the kalata- ated by the P1-Asn (step 3). The P1-Asu is finally attacked by the type peptides and cyclic knottin-type peptides. Members of both free P10 N-terminus (step 4), thereby regenerating the P1 aspara- families have been identified as legumain protease/ligase sub- gine. In this model, the P1 Asu formation (step 3) is optional and strates, e.g. Oak1, the precursor of kalata B1 and MCoTI-II, the cyclic depends on the structural/steric requirements of the substrate miniprotein Momordica cochinchinensis Trypsin Inhibitor II [106]. [122,207,208]. By contrast, recent studies on plant legumain and SFTI-1 by Bernath-Levin and colleagues suggest a sortase-like mechanism for the transpeptidation reaction (Fig. 7B) [144]. Accordingly, the re- 9.4.1. P1 Asn/Asp action starts with the attack of the P1 Asp by C189 Sg, forming a These substrates have in common a conserved Asn/Asp at P1 classical thio hemiketal intermediate. After release of the C-ter- position of the protease and the ligase substrate; this specificity is minal cleavage product, the thioester intermediate must be pro- dictated by legumain's S1 site. Cleavage after the Asn/Asp is tected from premature hydrolysis. Then the primed (i.e. C-terminal) obligatory as this generates the C-terminal free Asn/Asp that serves ligase substrate binds and its P10 Gly nucleophile substitutes and as P1 residue of the ligase substrate. In case of cystatin E, mutation releases the catalytic cysteine [144]. of the P1-Asn39 residue to Asp completely abolished ligase activity Given the high sequence conservation between mammalian and [106]. Both SFTI and cyclotides need processing after Asn or Asp plant legumains, it appears possible that the aspartimide-mediated residues and ligation to result in a mature, cyclic peptide (Table 2). and the thioester-mediated ligation reaction can be catalysed by While concanavalin A and kalata-type peptides have an Asn at P1 one and the same enzyme, albeit with different preferences under position of the ligase substrate, SFTI and knottins have Asp residues. different conditions, e.g., different pH optima for either reaction. Gillon and colleagues also showed that mutating the P1-Asn to Asp abolished cyclisation of kalata B1 [122]. This is also in agreement with observations by Nguyen and co-workers who showed that 9.4. Ligase substrates and specificity replacing the P1-Asn by Asp in their model peptide led to a tremendous reduction of the ligation efficacy (less than 10% in 4 h) Legumain ligase activity has been proven for human, mouse and [15]. Additionally, Bernath-Levin and colleagues recently showed plant forms [106,145,205]. Ligase substrates identified in mammals that mutation of the P1-Asp to Asn led to a ~800-fold increase in include (pro)legumain itself, which is capable of reverting the ligation efficacy, suggesting that the Asn is in principle a better Asn323-cleaved, activated form into the uncleaved zymogen form ligase substrate [144]. However, ligation kinetics is not necessarily (Table 2). Furthermore human legumain is processing and ligating optimized for a fast but rather for a specific reaction. For the cystatin inhibitors after a conserved Asn39 residue [106]. Ligase macrocyclisation of SFTI several cleavages have to occur in a substrates and products known from plants include concanavalin A, defined order which is guaranteed by the choice of an Asn (fast) and SFTI-1 and cyclotides. In concanavalin A legumain acts as a splicing Asp (slow) at P1. The authors suggest that the choice of the addi- enzyme by excising an internal propeptide and re-linking distant tional cleavages after Asn that are necessary to generate/release the parts of the protein in an inverted orientation [205]. PawS1 is the C-terminal Gly (P10), happen faster than the cleavage after Asp. precursor of the sunflower trypsin inhibitor 1 (SFTI-1), a 14 amino Thereby it is guaranteed that the P10 nucleophile is available and acid long, cyclic, disulfide-bonded inhibitor. PawS1 encodes 2 ready to attack when the metastable P1 Asp-thioester forms, thus mature proteins: SFTI-1 and an albumin [206]. Legumains function minimizing futile hydrolysis events [144].

Please cite this article in press as: E. Dall, H. Brandstetter, Structure and function of legumain in health and disease, Biochimie (2015), http:// dx.doi.org/10.1016/j.biochi.2015.09.022 16 E. Dall, H. Brandstetter / Biochimie xxx (2015) 1e25

9.4.2. P10 and P20 have been identified [221]. Macrocypins are ~20% identical in Cystatins present Asp and Ser residues at the P10 position of the sequence to clitocypin and have been subdivided in 5 subgroups protease/ligase substrate (Table 2). The cyclic substrates have a (macrocypin 1-5), with macrocypin 1 and 3 being the only ones glycine at P10 position (the C-terminus to be linked), in concanav- inhibiting legumain. Both clitocypin and macrocypins were alin A the P1-Asn is linked to a P10-Ser. Recently, Nguyen et al. expressed in E. coli as inclusion bodies and refolded. They proved to performed a systematic specificity profiling of butelase 1 ligase be very stable proteins, capable of reversible unfolding. Structures activity, a legumain homolog from the seeds of Clitoria ternatea. of clitocypin and macrocypin 1 have been solved and revealed that They found that at P10-position virtually every natural amino acid is they are Kunitz-type inhibitors. Together with mutagenesis data accepted except for Pro and acidic residues like Asp and Glu. the structures suggest disjunct papain and legumain reactive sites, However, the P20 seems to be more critical. Here only Leu, Ile, Val, functionally resembling the situation of cystatins [222]. Inhibition and Cys were tolerated [15]. This is also in agreement with earlier constants of macrocypins and clitocypin towards legumain are in observations by Gillon et al. [122]. However, Bernath-Levin and the nM range (Ki: rMcp1 3.3 nM; rMcp3 9.2 nM; rClt 21.5 nM). colleagues showed that mutating the P10-Gly to Ala prevented macrocyclization in a synthetic substrate [144]. 10.1.3. Prodomain e reversible activation and zymogenization In general, cyclic peptides usually have well-defined confor- By blocking access to the active site, the C-terminal prodomain mations and are very stable molecules due to resistance to pro- has an auto-inhibitory function. Interestingly, adding back the teolytic processing, particularly stable cross-linked hydrogen isolated prodomain to the AEP-domain also reduced enzymatic bonds, further stabilization by disulfide bonds and lack of confor- activity [159]. This effect is similar to cathepsins were isolated mational rearrangement upon binding to their target (protease) propeptides also efficiently inhibited their matching cathepsin [209,210]. They are extremely resistant even to harsh chemical in vitro [223]. Furthermore, autocatalytic processing at the Asn323 conditions like in the digestive tract [211]. These features make site is reversible in mammalian legumains and thereby an already them a promising scaffold for the design of orally available drugs. proteolytically activated enzyme can be converted back to the latent zymogen state by auto-ligation [106,145]. 10. Activity regulation 10.1.4. How to prevent cleavage of protein substrates or canonical 10.1. Protein inhibitors inhibitors Combining knowledge on known legumain inhibitors with 10.1.1. Cystatins structural and biochemical information on legumain provides us Cystatins form a superfamily of cysteine protease inhibitors with new strategies for the design of even more potent legumain which is further subdivided into four subfamilies: stefins, cystatins, inhibitors. An efficient strategy to block legumain activity could for kininogens and non-inhibitory cystatins. Family 1 cystatins (stefins) instance be to have high affinity binding of primed and non-primed are mainly localized intracellularly. However, extracellular loca- (cleavage) products together with its ligase activity, as it is tions have also been proven [212,213]. They are potent inhibitors of exploited by cystatins (“kcat<0”). Additionally, high affinity binding papain and papain-like enzymes with a Ki in the low nM range (KM) together with slow turnover (low kcat) will assist in preventing [16,214]. Due to an N-terminal signal peptide, type 2 cystatins are cleavage and thereby trap the enzyme in a high affinity complex. secreted outside the cell. Selected members of the family 2 cys- kcat-tuning can be easily accomplished by introducing negative 0 0 tatins were shown to be potent legumain inhibitors with Ki values charges on the P1 eP2 binding sites of a potential inhibitor. in the low nM/pM range [167,168,215]. A conserved asparagine (Asn39, cystatin C numbering) was proven to be important for 10.2. The environment is regulating legumain activity legumain inhibition. Asn39 is located between helix a1 and b2, on a reactive centre loop different to the papain inhibitory site (Fig. 8A) Legumain activity is greatly influenced by its environment. [167]. Although all members of the cystatin superfamily share the Acidic pH is essential for activation to the endopeptidase activity. conserved Asn39 essential for legumain inhibition, only selected Importantly, the stability of the AEP-domain is also pH-dependent members of the family 2 cystatins are inhibiting legumain. These (Figs. 2 and 9). The catalytic Cys189 must (transiently) exist as a members are cystatin C, E/M and F. Cystatin E/M is the most potent nucleophilic thiolate for catalysis, making it prone for “toxic” legumain inhibitor (Ki ¼ 0.0016 nM) followed by cystatin C and F oxidation; therefore, a reducing environment is favourable for ac- [16,167]. The crystal structure of a legumainecystatin E complex tivity (Fig. 9). The LSAM domain has a pH-stabilizing effect on the revealed that cystatins bind to the legumain active site canonically, AEP-domain. Thereby legumain can act as a carboxypeptidase also i.e. in a substrate-like manner with the Asn39 on the reactive centre at higher pH values. Likewise, the legumain ligase activity is loop serving as P1 residue (Fig. 8). The inhibitor is processed like a dominant at near neutral pH. Therefore, pH is essentially defining substrate after the Asn39. However, the cleavage products remain which enzymatic legumain variant is present (Fig. 2). bound to the enzyme because the enzymeeproduct complex is stabilized via exosite interactions that are mediated by cystatin's 10.3. Activators legumain exosite loop (LEL). Based on in silico docking studies and co-migration on gel filtration it seems very likely that a ternary 10.3.1. Direct conformational stabilization complex composed of a papain-like enzyme and legumain simul- The LSAM domain functions as a direct electrostatic stabilizer of taneously binding to cystatin may form in vivo [106,167]. Cystatins the AEP domain by physically shielding it/protecting it from a have also been identified in plants (phytocystatins) and parasites neutral pH environment (Fig. 9). Interestingly, recombinant [216e218]. expression of the AEP-domain only did not result in protein, sug- gesting that the LSAM domain also functions as a folding chaperone 10.1.2. Clitocypin/macrocypin (unpublished data). Likewise, cystatins bind to legumain directly to Clitocypin was isolated from the fungus Clitocybe nebularis.Itisa its active site. Thereby they block access to the substrate binding 16 kDa protein that belongs to family I48 in the MEROPS database. sites and also stabilize the AEP domain. A legumainecystatin C Clitocypin inhibits papain-like enzymes, legumain and complex formed at pH 5.5 is stable at pH 6.5 and can release the [219,220]. More recently, macrocypins from Macrolepiota procera active enzyme by shifting pH back to 4.0. Binding to the inhibitor

Please cite this article in press as: E. Dall, H. Brandstetter, Structure and function of legumain in health and disease, Biochimie (2015), http:// dx.doi.org/10.1016/j.biochi.2015.09.022 E. Dall, H. Brandstetter / Biochimie xxx (2015) 1e25 17

activity of glycosaminoglycans is a common phenomenon observed in (lysosomal) cathepsins [231]. Crystals structures of (active) in complex with C4S are available and illustrate allo- steric binding of GAGs to the enzyme [232]. Binding at 2 sites leads to a pH-stabilization of the mature enzyme and an increase in enzymatic activity towards collagen substrates. Furthermore, there is evidence for direct binding of GAGs also to cathepsins S and B. However, there is no consensus binding site in cathepsins but they rather bind to different sites on different targets/enzymes. GAGs seem to act in two ways: they stabilize the open conformation of procathepsins thereby allowing activation at higher pH values, and they turn the proenzyme into a better substrate for activation [227]. It is, therefore, tempting to speculate that the activating effects found for legumain might be mediated by similar conformational effects.

10.4. Low molecular weight inhibitors

Like other cysteine proteases legumain can be inhibited by compounds reacting with the catalytic cysteine like maleimide, iodacetamide/acetate, methyl-methanethiosulfonate [39,106,233]. Such compounds are rather unspecifically thiol-reactive and valu- Fig. 9. Different factors regulate legumain activity at different sites. In general, legu- able for in vitro studies, but less so for cellular assays [234]. Notably, main activity, activation and conformational stability are influenced by pH, the redox legumain is not inhibited by the broad spectrum cysteine protease fi potential, glycosylation and glycosaminoglycans (GAGs). Substrate speci city and inhibitors E-64, PMSF nor by the aspartate protease inhibitor turnover are directly regulated by substrates themselves presenting charged residues pepstatin A [235]. Consistent with its specificity for Asn but also close to the catalytic Cys189 (kcat effect); by the LSAM domain restricting access to the primed interaction sites; by cystatins blocking substrate access; and by pH and the Asp, classic covalent caspase inhibitors like YVAD-cmk/fmk are also local redox potential which tune the protonation state of Cys189. Furthermore an potent legumain inhibitors [236]. Likewise, Asp and Asn residues 120 allosteric RGD site encodes an integrin binding site. equipped with an acyloxymethyl ketone (AOMK) warhead have also been proven to be efficient inhibitors [170,237]. Given the tendency of asparagine for deamidation via aspartimide formation, can therefore be seen like a reversible conversion to the zymogen it is often preferred to use a modified AzaAsn. The AzaAsn serves state and suggests a molecular recycling mechanism. The inhibitor the P1 residue of a small peptide and is additionally linked to a cystatin C thus acts as an agonist at extracellular pH conditions by reactive warhead. Commonly used warheads include hal- protecting legumain in the complex with some leaking legumain omethylketones [238], epoxides and Michael acceptors. These in- accounting for a basal enzymatic activity. Upon internalization of hibitors are also used in conjunction with fluorescent labels for the legumainecystatin complex and subsequent re-entering of the in vivo imaging and can then be utilized as activity based probes for endo-lysosomal system, the free and active legumain will be imaging [239e242]. The latter two types of covalent inhibitors released from the complex. Binding of cystatin E to legumain is not were also successfully used towards Schistosoma and I. ricinus (easily) reversible at pH 4.0, therefore it might serve as an intra- legumains [243,244]. Interestingly, bovine legumain was also and extracellular inhibitor of legumain. þ þ shown to be inhibited by divalent cations like Hg2 and Cu2 [233], probably due to the thiol-reactivity of these metals. Further inhi- 10.3.2. Allosteric stabilization by integrin binding bition selectivity can be reached by extending peptidomimetic Legumain was previously shown to bind to b1 integrins [64] and frameworks by using non-natural amino acids [74]. to aVb3 integrin [23,88]. Furthermore, binding to the aVb3 integrin had a pH-stabilizing effect on the isolated AEP-domain. In silico 11. Pathological functions modelling of the AEP-integrin complex suggests that the integrin acts as an allosteric stabilizer, since the RGD motif is far off the 11.1. Legumain in and around cancer active site (Figs. 2B, 6 and 9) [13]. The aVb3 complex anchors legumain at the surface of tumour associated macrophages (TAMs), Overexpression of legumain in cancer was first reported in 2003 fl re ecting mechanistic analogies with lysosomal cathepsins that are [64]. Meanwhile it has been shown that legumain is overexpressed also anchored by TAMs and furthermore share a similar pH- in the majority of human solid tumours, including breast cancer fi e stability pro le [224 226]. [245], colorectal cancer [66], ovarian cancer [65], prostate cancer [246] and gastric cancer [247]. Legumain was shown to promote 10.3.3. Accelerators of activation cell migration and its overexpression is associated with enhanced Analogous to procathepsins [227], the auto-catalytic activation tissue invasion and metastasis; it promotes tumorigenesis and of legumain has been shown to be accelerated by glycosamino- correlates with poor prognosis in different cancer types glycans. Chondroitin 4-sulphate (C4S), chondroitin 6-sulphate [63e66,246,248,249]. (C6S), condroitin 4,6-sulphate (C4,6S), heparin, heparan sulphate Under cancer conditions, legumain was identified at locations and chondroitin sulphate-derived decasaccharides were shown to that appear incompatible with its pH-stability requirements like in accelerate auto-activation at pH 4.0. At pH 5.0 only C4S and C4,6S the nucleus, cytosol (gastric cancer) and extracellularly [66,75,247]. accelerated activation (Figs. 2B and 9) [228]. Nuclear legumain was found in colorectal cancer patients; however Proteoglycans are components of the extracellular matrix, but its function in the nucleus is still unclear. Here, it needs to be sta- also present in lysosomes, were they finally get degraded [229,230]. bilized, for example by the LSAM domain, to survive the neutral pH 289 318 Acceleration of activation and a stimulating effect on enzymatic environment. The KRK -X26-KRK motif, only present in

Please cite this article in press as: E. Dall, H. Brandstetter, Structure and function of legumain in health and disease, Biochimie (2015), http:// dx.doi.org/10.1016/j.biochi.2015.09.022 18 E. Dall, H. Brandstetter / Biochimie xxx (2015) 1e25 prolegumain (Fig. 4B), provides a plausible mechanism for legu- and translocated to the cytoplasm of neuronal cells of AD patients main's nuclear localization route: prolegumain still stabilized by the [69]. The increased legumain activity and translocation to the LSAM domain is localized to the nucleus via the bipartite KRK289- cytoplasm can be explained by acid conditions in AD brains that 318 X26-KRK motif, there it can act as a signalling molecule or, increase lysosomal permeability [20,91,253]. Legumain was shown alternatively, it can be proteolytically activated to ACP (stabilized by to proteolytically process tau, which is intimately related to AD the LSAM domain) and act as a protease. Possibly the Asn323 acti- (Fig. 10) [71]. Furthermore, SET, a multifunctional protein and in- vated ACP (b-legumain) variant will be preferred over prolegumain hibitor of protein phosphatase 2A (PP2A), becomes activated upon for nuclear import as in b-legumain the AP may fold back and such legumain cleavage. Thereby PP2A phosphatase activity is sup- end up with a spatial separation of the both KRK motifs similar to pressed, resulting in tau hyperphosphorylation. This in conse- the X10-12 linker [73]. However, the exact mechanism is still unclear. quence furthermore leads to neurofibrillary tangles (NFTs) which Cytoplasmic legumain was shown to be ubiquitinated at Lys318 are composed of hyperphosphorylated tau fragments [69]. by TRAF6 in breast cancer patients. TRAF6 binds to the motif Since SET is also a DNase inhibitor, degradation of SET by A210NPRESSY217 on prolegumain [63]. Ubiquitinated prolegumain legumain will result in DNA damage in brain. However, PIKE-L can forms a complex with HSP90a which promotes its intracellular bind and protect SET from processing by legumain [134]. stability and secretion. Disrupting the interaction between prole- Another legumain substrate involved in AD is TAR DNA-binding gumain and TRAF6 or inhibiting HSP90a led to a reduction in protein 43 (TDP-43). TDP-43 is a nuclear protein that is involved in prolegumain secretion and, consequently, had a positive effect on RNA splicing and it is a major component in tau-negative inclusions tumour expansion by reducing tumour metastasis. Interestingly, of frontotemporal lobar degeneration and amyotrophic lateral overexpression of the C189S-prolegumain dead mutant failed to sclerosis. Under such conditions, TDP-43 is translocated to the promote cell migration and invasion [63]. Pro-AEP was secreted cytoplasm where it can be phosphorylated and cleaved by legu- under stress conditions, such as starvation and hydrogen peroxide main [254]. stimulation. Legumain is expressed not only by tumour cells but also by 11.4. Therapeutic and diagnostic options surrounding stromal cells and found extracellularly in the tumour microenvironment, either associated with cell surfaces or with Because of high legumain concentrations in the tumour micro- matrix proteins. Because of the reduced pH of the tumour micro- environment mostly bound to the surface of tumour associated environment, legumain can be functional, at those locations [75]. macrophages (TAM), targeting TAMs via legumain is evaluated as a Tumour-associated macrophages (TAMs) are critical modulators of promising strategy for delivering anti-cancer drugs [250].Nowa- tumour development and metastasis. High levels of legumain have days, basically three approaches are studied: prodrugs, DNA vac- been found on the surface of TAMs, on the surface of and within cines and legumain inhibition. tumour and stroma cells of breast cancer and colorectal cancer pa- tients [87,250]. These diverse localizations of legumain in and 11.4.1. Prodrugs and targeted drug delivery around cancer cells suggest different functions at different locations. Legumain-targeted prodrugs are designed as fusion of a peptide Interestingly, not only the proteases but also the associated in- with C-terminal (P1) Asn coupled to a chemotherapeutic reagent, hibitors are altered in cancer. For example, cystatin E has been which is inactive when fused to the peptide. Legumain is the only shown to have a positive effect on the invasiveness of human mammalian protease that specifically cleaves after asparagine [78]. melanoma (by suppressing legumain activity). In breast cancer, Because legumain is only found extracellulary at high levels in the recent studies indicate that loss of the cysteine protease inhibitor tumour microenvironment, but sparsely otherwise [64,67,248], the cystatin E/M leads to increased tumour growth and metastasis drug gets specifically activated only at its intended target cell. [251]. Therefore, side effects are expectedly low. Examples of experimental prodrugs include legubicin, a legumain-cleavable peptide coupled to 11.2. Multiple sclerosis doxorubicin, a cytostatic drug used in chemotherapy. Legubicin was effectively tumoricidal in murine colon carcinoma model with Legumain is a key enzyme in the processing of foreign and self reduced toxicity [64]; LEG-3, which also couples a peptide to antigens; thus it critically affects the development of an immune doxorubicin [75]; CBZ-Ala-Ala-Asn-Dox employs an analogous response or tolerance, respectively. The situation can be exempli- peptide-doxorubicin-linkage strategy [86]; carbobenzyloxy- fied by the foreign tetanus toxin and the myelin basic protein (MBP) alanine-alanine-asparagine-ethylenediaine-etoposide, etoposide is and their consequent presentation on the MHCII complex [114,115]. the active cytostatic drug which acts by topoisomerase inhibition By processing the tetanus toxin legumain generates peptides for and gets released upon legumain cleavage [89]; carbobenzyloxy- class II MHC presentation and thus stirs a desired immune recog- alanine-alanine-asparagine-ethylenediamine-etoposide [89]; Suc- nition; by contrast, exaggerated legumain protease activity Ala-Ala-Asn-Val-colchicine, here colchicine is the anti-mitotic drug destructs MBP peptides that were important for presentation in the that inhibits microtuble formation, when released from the peptide thymus, and the development of tolerance against MBP. Alterna- [255]; Ala-Ala-Asn-TAT-liposomes. The Ala-Ala-Asn recognition tively, undue legumain ligase activity might mask MBP peptides for peptide is linked to the cell penetrating TAT peptide via a lysine side presentation and tolerance induction. This destruction (or mask- chain, forming an isopeptidic bond; the resulting branched TAT ing) is associated with multiple sclerosis (MS) [43]. Consequently, peptide has strongly reduced cell penetration capacity. Only after T-cells of MS patients respond to an MBP-derived peptide. In cleavage of the branched tripeptide by legumain, the cell pene- healthy individuals, T-cells reactive towards this peptide were trating TAT peptide is unmasked, allowing the drug-loaded lipo- cleared already during the process of generating tolerance towards some to selectively enter the tumour cells and deliver its cell toxic self-proteins [252]. cargo [256]. Similar to the prodrug approach, nanoparticles encapsulating 11.3. Legumain in Alzheimer's disease doxorubicin are targeted to the tumour microenvironment via the peptidomimetic legumain inhibitor RR-11a where they are endo- Rather recently legumain's involvement in Alzheimer's disease cytosed selectively by tumour cells; consequently, doxorubicin hits (AD) has been recognized. Active AEP was found at increased levels selectively the tumour cell [257]; Another variation of this

Please cite this article in press as: E. Dall, H. Brandstetter, Structure and function of legumain in health and disease, Biochimie (2015), http:// dx.doi.org/10.1016/j.biochi.2015.09.022 E. Dall, H. Brandstetter / Biochimie xxx (2015) 1e25 19

Fig. 10. Legumain is a key player in Alzheimer's disease (AD). Legumain is found translocated to the cytoplasm of neuronal cells in AD patients because of increased lysosomal permeability accompanying cell acidosis. Here, legumain is processing Tau and SET, the inhibitor of protein phosphatase 2A (PP2A) and is thereby activating it. Consequently PP2A phosphatase activity is suppressed, resulting in Tau hyperphosphorylation (pTau) catalysed by Tau kinases, e.g., GSK3b. This furthermore leads to the formation of neurofibrillary tangles (NFTs). NFTs are composed of truncated and hyperphosphorylated Tau. The pH environment combined with legumain ligase activity dominating at near neutral pH suggests that legumain might cross-link NFTs by forming new intra- or intermolecular (iso)peptide bonds (red lines). Since SET is also a DNase inhibitor, degradation of SET by legumain will result in DNA fragmentation in neurons in the brain. However, PIKE-L can bind SET and prevent its processing by legumain. Another legumain substrate involved in AD is TAR DNA- binding protein 43 (TDP-43). TDP-43 is a nuclear protein that is involved in RNA splicing and is translocated to the cytoplasm under AD conditions where it can be phosphorylated and cleaved. approach was the co-targeting of the legumain-activatable Auri- probes proved to be useful markers for primary tumour inflam- statin prodrug to legumain and integrin aVb3. The co-targeting led mation and early stage metastatic lesions. Recently, legumain- to higher efficacy in murine breast cancer models as compared to controlled MRI imaging techniques have been developed; legu- legumain targeting only [85,88]. Similarly, the Probody-approach main cleavage of the MRI contrast agent changed its hydropho- captializes on the co-targeting of a therapeutic antibody in anti- bicity and could thereby promote its rotational correlation time tR tumour treatment [258]. In the study by Desnoyers et al. the pro- [262]. Alternative on-off switching techniques rely on legumain- body contained a masked variant of the anti-EGFR antibody controlled disassembly of 19F MRI nanoparticles [263]. cetuximab. Only upon cleavage by legumain, cetuximab gets acti- vated and blocks the EGFR-pathway. The specific targeting and 11.4.4. Inhibition/knock down of legumain to suppress tumour activation of Probodies and prodrugs in general to the tumour progression and Alzheimer's disease environment will reduce on-target toxicity in normal tissues. While the correlation of legumain with the hypoxic tumour microenvironment is well established and used for targeted drug 11.4.2. DNA vaccines delivery, there have been also efforts to inhibit legumain itself as a DNA vaccination approaches similarly capitalize on the fact that causal target in tumour progression [239,251]. Legumain inhibition legumain is presented and accessible to the immune system only at can also be combined with other cancer targets such as aVb3 the tumour microenvironment, in particular on the surface of integrin in order to improve both specificity and efficacy; a bispe- tumour-associated macrophages (TAMs). By engineering artificial cific antibody simultaneously targeting these two proteins could poly-ubiquitin-legumain fusion vector constructs, it is possible to inhibit the growth of primary tumours in a breast cancer mouse ensure proteasomal degradation and class I MHC presentation, ul- model [264]. timately mounting a robust CD8þ cytotoxic T lymphocyte (CTL) Alzheimer's disease is another area where legumain inhibition response that is directed to legumain presenting TAMs. This might prove an attractive target. This notion has been fostered by concept has been tested and confirmed using different variants of studies that link legumain directly to tauopathies, e.g., tau hyper- legumain DNA vaccines [245,250,259]. In order to enable oral phosphorylation [69,71]. Further research will be necessary to administration of the DNA vaccine, different nanoparticle delivery evaluate the full potential of legumain inhibition in these disease systems have been successfully employed [260]. areas.

11.4.3. Legumain imaging and its use as prognostic marker 12. Future perspectives, outlook Legumain is also considered as a negative prognostic marker since its overexpression correlates with shorter overall survival e.g., 12.1. Smart Legumain Activity Modulation (SLAM) in breast cancer patients [67,249]. This correlation spurs the in- terest to monitor legumain distribution in vivo. Different imaging Given legumain's three orthogonal or compensatory activities strategies have been used, including quenched activity-based (endopeptidase, carboxypeptidase, ligase), the development of probes to visualize active legumain endopeptidase [261]. Those activity-specific probes and inhibitors is especially interesting, and

Please cite this article in press as: E. Dall, H. Brandstetter, Structure and function of legumain in health and disease, Biochimie (2015), http:// dx.doi.org/10.1016/j.biochi.2015.09.022 20 E. Dall, H. Brandstetter / Biochimie xxx (2015) 1e25 will open completely new possibilities. In fact, the unselective in- of protease B from germinating vetch seeds, Biokhimiia 47 (1982) 814e821. hibition of all three legumain activities may be contra-productive. [3] A. Ruppel, Y.E. Shi, D.X. Wei, H.J. Diesfeld, Sera of Schistosoma japonicum- infected patients cross-react with diagnostic 31/32 kD proteins of S. man- For one, activity-selective imaging tools would enable to probe soni, Clin. Exp. Immunol. 69 (1987) 291e298. which legumain activity is found in which cellular compartment [4] M.Q. Klinkert, R. Felleisen, G. Link, A. Ruppel, E. Beck, Primary structures of fi under different (patho)physiologic conditions. Take as an example Sm31/32 diagnostic proteins of Schistosoma mansoni and their identi cation as proteases, Mol. Biochem. Parasitol. 33 (1989) 113e122. the relevant legumain activity in cancer: Legumain's pH-stability [5] I. Hara-Nishimura, Y. Takeuchi, M. Nishimura, Molecular characterization of a profile suggests that its ligase activity matters in cancers where vacuolar processing enzyme related to a putative cysteine proteinase of legumain is overexpressed and translocated to the extracellular Schistosoma mansoni, Plant Cell 5 (1993) 1651e1659. [6] M.A. el Meanawy, T. Aji, N.F. Phillips, R.E. Davis, R.A. Salata, I. Malhotra, environment, cytoplasm or the nucleus, because the ligase activity D. McClain, M. Aikawa, A.H. Davis, Definition of the complete Schistosoma dominates at near neutral pH. However, under such conditions the mansoni hemoglobinase mRNA sequence and gene expression in developing isolated AEP-domain will not survive the neutral pH and will parasites, Am. J. Trop. Med. Hyg. 43 (1990) 67e78. [7] S. Marttila, I. Porali, T.H.D. Ho, A. Mikkonen, Expression of the 30 kD cysteine therefore need further direct or indirect stabilization, e.g. via the endoprotease B in germinating barley seeds, Cell Biol. Int. 17 (1993) LSAM domain or integrin aVb3 (Figs. 2B and 9). Therefore, the sit- 205e212. uation needs further clarification. Orthogonal activity-based probes [8] I. Hara-Nishimura, K. Inoue, M. Nishimura, A unique vacuolar processing would allow to assign distinct legumain activities to different (ex- enzyme responsible for conversion of several proprotein precursors into the mature forms, FEBS Lett. 294 (1991) 89e93. tra-)cellular compartments. Moreover, this assignment would form [9] S. Ishii, Asparaginylendopeptidase: an enzyme probably responsible to post- the basis to assign legumain substrates with the relevant legumain translational proteolysis and transpeptidation of proconcanavalin A, Seika- e activity. For example, very little is currently known about human gaku 65 (1993) 185 189. [10] A.A. Kembhavi, D.J. Buttle, C.G. Knight, A.J. Barrett, The two cysteine endo- legumain ligase substrates. However, it is conceivable that ligase peptidases of legume seeds: purification and characterization by use of activity plays critical roles in physiological situations such as the specific fluorometric assays, Arch. Biochem. Biophys. 303 (1993) 208e213. generation of 3d-crosslinked epitopes for class II MHC; but also in [11] T. Tanaka, J. Inazawa, Y. Nakamura, Molecular cloning of a human cDNA encoding putative cysteine protease (PRSC1) and its chromosome assign- pathological situations such as the cross-linking of NFTs or amyloid ment to 14q32.1, Cytogenet. Cell Genet. 74 (1996) 120e123. deposits in Alzheimer's disease (Fig. 10). [12] D. Sojka, O. Hajdusek, J. Dvorak, M. Sajid, Z. Franta, E.L. Schneider, C.S. Craik, How could such smart legumain activity modulators (SLAM) be M. Vancova, V. Buresova, M. Bogyo, K.B. Sexton, J.H. McKerrow, C.R. Caffrey, P. Kopacek, IrAE: an asparaginyl endopeptidase (legumain) in the gut of the developed? Despite its complexity, there are several routes to hard tick Ixodes ricinus, Int. J. Parasitol. 37 (2007) 713e724. accomplish the SLAM development. To illustrate the principle, a [13] E. Dall, H. Brandstetter, Mechanistic and structural studies on legumain SLAM inhibitor/probe could be designed to bind to the primed (e.g., explain its zymogenicity, distinct activation pathways, and regulation, Proc. 0e 0 Natl. Acad. Sci. U. S. A. 110 (2013) 10940e10945. S2 S4 ) recognition sites to distinguish between endo and [14] C. Linnestad, D.N. Doan, R.C. Brown, B.E. Lemmon, D.J. Meyer, R. Jung, carboxypeptidase; or a SLAM probe might block the thiol of the O.A. Olsen, Nucellain, a barley homolog of the dicot vacuolar-processing catalytic Cys189 without blocking active site access e and thus protease, is localized in nucellar cell walls, Plant Physiol. 118 (1998) e distinguish protease and ligase activities. In fact, MMTS is a prim- 1169 1180. [15] G.K. Nguyen, S. Wang, Y. Qiu, X. Hemu, Y. Lian, J.P. Tam, Butelase 1 is an Asx- itive variant of such a protease-specific inhibitor [106]. Since specific ligase enabling peptide macrocyclization and synthesis, Nat. Chem. legumain is an allosterically regulated protein, allosteric SLAMs Biol. 10 (2014) 732e738. should also exist. The details to implement such compounds pose a [16] N.D. Rawlings, A.J. Barrett, A. Bateman, MEROPS: the database of proteolytic enzymes, their substrates and inhibitors, Nucleic Acids Res. 40 (2012) challenge, but a very rewarding one, a Grand SLAM. D343eD350. [17] A.J. Barrett, N.D. Rawlings, Evolutionary lines of cysteine peptidases, Biol. Chem. 382 (2001) 727e733. 12.2. Legumain web [18] J.M. Chen, N.D. Rawlings, R.A. Stevens, A.J. Barrett, Identification of the active site of legumain links it to caspases, clostripain and gingipains in a new clan Analogous to the caspases, legumain harbours a death-domain like of cysteine endopeptidases, FEBS Lett. 441 (1998) 361e365. fi [19] N.D. Rawlings, A.J. Barrett, A. Bateman, MEROPS: the peptidase database, prodomain, and several legumain isoforms were identi ed in plants Nucleic Acids Res. 38 (2010) D227eD233. and ticks; additionally caspase-like functions such as in programmed [20] S.G. Lillico, J.C. Mottram, N.B. Murphy, S.C. Welburn, Characterisation of the cell death were attributed to legumains in plants. These analogies QM gene of Trypanosoma brucei, FEMS Microbiol. Lett. 211 (2002) 123e128. raise the question whether legumain(s) are similarly organised in a [21] J.D. Hilley, J.L. Zawadzki, M.J. McConville, G.H. Coombs, J.C. Mottram, Leish- mania mexicana mutants lacking glycosylphosphatidylinositol (GPI):protein cascade-like system. It is tempting to speculate whether there transamidase provide insights into the biosynthesis and functions of GPI- could also be other proteases involved in such a cascade. Possibly, anchored proteins, Mol. Biol. Cell 11 (2000) 1183e1195. there is even a death domain-like adaptor that can cross-connect [22] M. Benghezal, A. Benachour, S. Rusconi, M. Aebi, A. Conzelmann, Yeast Gpi8p is essential for GPI anchor attachment onto proteins, EMBO J. 15 (1996) legumain via its LSAM death domain to partner proteins, similar to 6575e6583. ASC in the caspase-system [169]. Experimental approaches like pull [23] M.A. Zacks, N. Garg, Recent developments in the molecular, biochemical and down assays will be required to address such possibilities. functional characterization of GPI8 and the GPI-anchoring mechanism [re- view], Mol. Membr. Biol. 23 (2006) 209e225. [24] D. Hamburger, M. Egerton, H. Riezman, Yeast Gaa1p is required for attach- Conflict of interest ment of a completed GPI anchor onto proteins, J. Cell Biol. 129 (1995) 629e639. [25] K. Ohishi, N. Inoue, Y. Maeda, J. Takeda, H. Riezman, T. Kinoshita, Gaa1p and The authors declare no conflicts of interest. gpi8p are components of a glycosylphosphatidylinositol (GPI) transamidase that mediates attachment of GPI to proteins, Mol. Biol. Cell 11 (2000) 1523e1533. Acknowledgements [26] K. Ohishi, N. Inoue, T. Kinoshita, PIG-S and PIG-T, essential for GPI anchor attachment to proteins, form a complex with GAA1 and GPI8, EMBO J. 20 This work was supported by the Austrian Science Fund (FWF (2001) 4088e4098. € [27] P. Fraering, I. Imhof, U. Meyer, J.M. Strub, A. van Dorsselaer, C. Vionnet, project P_23454-B11) and the Austrian Academy of Sciences (OAW A. Conzelmann, The GPI transamidase complex of Saccharomyces cerevisiae project number 22866). contains Gaa1p, Gpi8p, and Gpi16p, Mol. Biol. Cell 12 (2001) 3295e3306. [28] S. Nakaune, K. Yamada, M. Kondo, T. Kato, S. Tabata, M. Nishimura, I. Hara- Nishimura, A vacuolar processing enzyme, deltaVPE, is involved in seed coat References formation at the early stage of seed development, Plant Cell 17 (2005) 876e887. [1] C. Csoma, L. Polgar, Proteinase from germinating bean cotyledons. Evidence [29] N. Hatsugai, K. Yamada, S. Goto-Yamada, I. Hara-Nishimura, Vacuolar pro- for involvement of a thiol group in catalysis, Biochem. J. 222 (1984) 769e776. cessing enzyme in plant programmed cell death, Front. Plant Sci. 6 (2015) [2] A.D. Shutov, N.L. Do, I.A. Vaintraub, Purification and partial characterization 234.

Please cite this article in press as: E. Dall, H. Brandstetter, Structure and function of legumain in health and disease, Biochimie (2015), http:// dx.doi.org/10.1016/j.biochi.2015.09.022 E. Dall, H. Brandstetter / Biochimie xxx (2015) 1e25 21

[30] I. Julian, J. Gandullo, L.K. Santos-Silva, I. Diaz, M. Martinez, Phylogenetically [56] E. Fiebiger, P. Meraner, E. Weber, I.F. Fang, G. Stingl, H. Ploegh, D. Maurer, distant barley legumains have a role in both seed and vegetative tissues, Cytokines regulate proteolysis in major histocompatibility complex class II- J. Exp. Bot. 64 (2013) 2929e2941. dependent antigen presentation by dendritic cells, J. Exp. Med. 193 (2001) [31] M.A. Alim, N. Tsuji, T. Miyoshi, M.K. Islam, X. Huang, T. Hatta, K. Fujisaki, 881e892. HlLgm2, a member of asparaginyl endopeptidases/legumains in the midgut [57] N.M. Crawford, Nitrate: nutrient and signal for plant growth, Plant Cell 7 of the ixodid tick Haemaphysalis longicornis, is involved in blood-meal (1995) 859e868. digestion, J. Insect Physiol. 54 (2008) 573e585. [58] G. Noctor, A. Mhamdi, G. Queval, C.H. Foyer, Regulating the redox gate- [32] M.A. Alim, N. Tsuji, T. Miyoshi, M.K. Islam, T. Hatta, K. Yamaji, K. Fujisaki, keeper: vacuolar sequestration puts glutathione disulfide in its place, Plant Developmental stage- and organ-specific expression profiles of asparaginyl Physiol. 163 (2013) 665e671. endopeptidases/legumains in the ixodid tick Haemaphysalis longicornis, [59] J. Klumperman, A. Hille, T. Veenendaal, V. Oorschot, W. Stoorvogel, K. von J. Vet. Med. Sci. 70 (2008) 1363e1366. Figura, H.J. Geuze, Differences in the endosomal distributions of the two [33] D. Sojka, Z. Franta, M. Horn, O. Hajdusek, C.R. Caffrey, M. Mares, P. Kopacek, mannose 6-phosphate receptors, J. Cell Biol. 121 (1993) 997e1010. Profiling of proteolytic enzymes in the gut of the tick Ixodes ricinus reveals [60] S. Waguri, F. Dewitte, R. Le Borgne, Y. Rouille, Y. Uchiyama, J.F. Dubremetz, an evolutionarily conserved network of aspartic and cysteine peptidases, B. Hoflack, Visualization of TGN to endosome trafficking through fluo- Parasit. Vectors 1 (2008) 7. rescently labeled MPR and AP-1 in living cells, Mol. Biol. Cell 14 (2003) [34] M.E. Santamaria, P. Hernandez-Crespo, F. Ortego, V. Grbic, M. Grbic, I. Diaz, 142e155. M. Martinez, Cysteine peptidases and their inhibitors in Tetranychus urticae: [61] A. Hasilik, C. Wrocklage, B. Schroder, Intracellular trafficking of lysosomal a comparative genomic approach, BMC Genomics 13 (2012) 307. proteins and lysosomes, Int. J. Clin. Pharmacol. Ther. 47 (Suppl. 1) (2009) [35] D. Sojka, Z. Franta, M. Horn, C.R. Caffrey, M. Mares, P. Kopacek, New insights S18eS33. into the machinery of blood digestion by ticks, Trends Parasitol. 29 (2013) [62] A. Vitale, N.V. Raikhel, What do proteins need to reach different vacuoles? 276e285. Trends Plant Sci. 4 (1999) 149e155. [36] I. Wawrzyniak, C. Texier, P. Poirier, E. Viscogliosi, K.S. Tan, F. Delbac, H. El [63] Y. Lin, Y. Qiu, C. Xu, Q. Liu, B. Peng, G.F. Kaufmann, X. Chen, B. Lan, C. Wei, Alaoui, Characterization of two cysteine proteases secreted by Blastocystis D. Lu, Y. Zhang, Y. Guo, Z. Lu, B. Jiang, T.S. Edgington, F. Guo, Functional role ST7, a human intestinal parasite, Parasitol. Int. 61 (2012) 437e442. of asparaginyl endopeptidase ubiquitination by TRAF6 in tumor invasion and [37] P. Adisakwattana, V. Viyanant, W. Chaicumpa, S. Vichasri-Grams, metastasis, J. Natl. Cancer Inst. 106 (2014) dju012. A. Hofmann, G. Korge, P. Sobhon, R. Grams, Comparative molecular analysis [64] C. Liu, C. Sun, H. Huang, K. Janda, T. Edgington, Overexpression of legumain in of two asparaginyl endopeptidases and encoding genes from Fasciola tumors is significant for invasion/metastasis and a candidate enzymatic gigantica, Mol. Biochem. Parasitol. 156 (2007) 102e116. target for prodrug therapy, Cancer Res. 63 (2003) 2957e2964. [38] T. Laha, J. Sripa, B. Sripa, M. Pearson, L. Tribolet, S. Kaewkes, P. Sithithaworn, [65] L. Wang, S. Chen, M. Zhang, N. Li, Y. Chen, W. Su, Y. Liu, D. Lu, S. Li, Y. Yang, P.J. Brindley, A. Loukas, Asparaginyl endopeptidase from the carcinogenic Z. Li, D. Stupack, P. Qu, H. Hu, R. Xiang, Legumain: a biomarker for diagnosis liver fluke, Opisthorchis viverrini, and its potential for serodiagnosis, Int. J. and prognosis of human ovarian cancer, J. Cell Biochem. 113 (2012) Infect. Dis. 12 (2008) e49e59. 2679e2686. [39] J.M. Chen, P.M. Dando, N.D. Rawlings, M.A. Brown, N.E. Young, R.A. Stevens, [66] M.H. Haugen, H.T. Johansen, S.J. Pettersen, R. Solberg, K. Brix, K. Flatmark, E. Hewitt, C. Watts, A.J. Barrett, Cloning, isolation, and characterization of G.M. Maelandsmo, Nuclear legumain activity in colorectal cancer, PLoS One 8 mammalian legumain, an asparaginyl endopeptidase, J. Biol. Chem. 272 (2013) e52980. (1997) 8090e8098. [67] J. Gawenda, F. Traub, H.J. Luck, H. Kreipe, R. von Wasielewski, Legumain [40] J.M. Chen, P.M. Dando, R.A. Stevens, M. Fortunato, A.J. Barrett, Cloning and expression as a prognostic factor in breast cancer patients, Breast Cancer Res. expression of mouse legumain, a lysosomal endopeptidase, Biochem. J. 335 Treat. 102 (2007) 1e6. (Pt 1) (1998) 111e117. [68] P. Ghosh, N.M. Dahms, S. Kornfeld, Mannose 6-phosphate receptors: new [41] A.N. Antoniou, S.L. Blackwood, D. Mazzeo, C. Watts, Control of antigen pre- twists in the tale, Nat. Rev. Mol. Cell Biol. 4 (2003) 202e212. sentation by a single protease cleavage site, Immunity 12 (2000) 391e398. [69] G. Basurto-Islas, I. Grundke-Iqbal, Y.C. Tung, F. Liu, K. Iqbal, Activation of [42] E.S. Trombetta, M. Ebersold, W. Garrett, M. Pypaert, I. Mellman, Activation of asparaginyl endopeptidase leads to Tau hyperphosphorylation in Alzheimer lysosomal function during dendritic cell maturation, Science 299 (2003) disease, J. Biol. Chem. 288 (2013) 17495e17507. 1400e1403. [70] M.H. Haugen, K. Boye, J.M. Nesland, S.J. Pettersen, E.V. Egeland, T. Tamhane, [43] B. Manoury, D. Mazzeo, L. Fugger, N. Viner, M. Ponsford, H. Streeter, K. Brix, G.M. Maelandsmo, K. Flatmark, High expression of the cysteine G. Mazza, D.C. Wraith, C. Watts, Destructive processing by asparagine proteinase legumain in colorectal cancer e implications for therapeutic endopeptidase limits presentation of a dominant T cell epitope in MBP, Nat. targeting, Eur. J. Cancer 51 (2015) 9e17. Immunol. 3 (2002) 169e174. [71] Z. Zhang, M. Song, X. Liu, S.S. Kang, I.S. Kwon, D.M. Duong, N.T. Seyfried, [44] I. Hara-Nishimura, T. Shimada, K. Hatano, Y. Takeuchi, M. Nishimura, W.T. Hu, Z. Liu, J.Z. Wang, L. Cheng, Y.E. Sun, S.P. Yu, A.I. Levey, K. Ye, Transport of storage proteins to protein storage vacuoles is mediated by Cleavage of tau by asparagine endopeptidase mediates the neurofibrillary large precursor-accumulating vesicles, Plant Cell 10 (1998) 825e836. pathology in Alzheimer's disease, Nat. Med. 20 (2014) 1254e1262. [45] T. Kinoshita, M. Nishimura, I. Hara-Nishimura, The sequence and expression [72] N. Matarasso, S. Schuster, A. Avni, A novel plant cysteine protease has a dual of the gamma-VPE gene, one member of a family of three genes for vacuolar function as a regulator of 1-aminocyclopropane-1-carboxylic acid synthase processing enzymes in Arabidopsis thaliana, Plant Cell Physiol. 36 (1995) gene expression, Plant Cell 17 (2005) 1205e1216. 1555e1562. [73] S. Kosugi, M. Hasebe, N. Matsumura, H. Takashima, E. Miyamoto-Sato, [46] T. Kinoshita, M. Nishimura, I. Hara-Nishimura, Homologues of a vacuolar M. Tomita, H. Yanagawa, Six classes of nuclear localization signals specificto processing enzyme that are expressed in different organs in Arabidopsis different binding grooves of importin alpha, J. Biol. Chem. 284 (2009) thaliana, Plant Mol. Biol. 29 (1995) 81e89. 478e485. [47] V. Radchuk, D. Weier, R. Radchuk, W. Weschke, H. Weber, Development of [74] B. Alberts, A. Johnson, J. Lewis, D. Morgan, M. Raff, K. Roberts, P. Walter, maternal seed tissue in barley is mediated by regulated cell expansion and Molecular Biology of the Cell, Garland Science, New York, 2014, p. 1464. cell disintegration and coordinated with endosperm growth, J. Exp. Bot. 62 [75] W. Wu, Y. Luo, C. Sun, Y. Liu, P. Kuo, J. Varga, R. Xiang, R. Reisfeld, K.D. Janda, (2011) 1217e1227. T.S. Edgington, C. Liu, Targeting cell-impermeable prodrug activation to tu- [48] M. Horn, M. Nussbaumerova, M. Sanda, Z. Kovarova, J. Srba, Z. Franta, mor microenvironment eradicates multiple drug-resistant neoplasms, Can- D. Sojka, M. Bogyo, C.R. Caffrey, P. Kopacek, M. Mares, Hemoglobin digestion cer Res. 66 (2006) 970e980. in blood-feeding ticks: mapping a multipeptidase pathway by functional [76] T. Braulke, J.S. Bonifacino, Sorting of lysosomal proteins, Biochim. Biophys. proteomics, Chem. Biol. 16 (2009) 1053e1063. Acta 1793 (2009) 605e614. [49] M. Abdul Alim, N. Tsuji, T. Miyoshi, M. Khyrul Islam, X. Huang, M. Motobu, [77] D.J. Marchant, C.L. Bellac, T.J. Moraes, S.J. Wadsworth, A. Dufour, G.S. Butler, K. Fujisaki, Characterization of asparaginyl endopeptidase, legumain induced L.M. Bilawchuk, R.G. Hendry, A.G. Robertson, C.T. Cheung, J. Ng, L. Ang, by blood feeding in the ixodid tick Haemaphysalis longicornis, Insect Bio- Z.S. Luo, K. Heilbron, M.J. Norris, W.M. Duan, T. Bucyk, A. Karpov, L. Devel, chem. Mol. Biol. 37 (2007) 911e922. D. Georgiadis, R.G. Hegele, H.L. Luo, D.J. Granville, V. Dive, B.M. McManus, [50] T. Kurz, J.W. Eaton, U.T. Brunk, Redox activity within the lysosomal C.M. Overall, A new transcriptional role for matrix metalloproteinase-12 in compartment: implications for aging and apoptosis, Antioxid. Redox Signal. antiviral immunity, Nat. Med. 20 (2014) 497e506. 13 (2010) 511e523. [78] R. Shimizu-Hirota, W. Xiong, B.T. Baxter, S.L. Kunkel, I. Maillard, X.W. Chen, [51] P. Boya, Lysosomal function and dysfunction: mechanism and disease, F. Sabeh, R. Liu, X.Y. Li, S.J. Weiss, MT1-MMP regulates the PI3Kdelta.Mi-2/ Antioxid. Redox Signal. 17 (2012) 766e774. NuRD-dependent control of macrophage immune function, Genes Dev. 26 [52] P. Saftig, J. Klumperman, Lysosome biogenesis and lysosomal membrane (2012) 395e413. proteins: trafficking meets function, Nat. Rev. Mol. Cell Biol. 10 (2009) [79] X. Wang, F.L. Chow, T. Oka, L. Hao, A. Lopez-Campistrous, S. Kelly, S. Cooper, 623e635. J. Odenbach, B.A. Finegan, R. Schulz, Z. Kassiri, G.D. Lopaschuk, C. Fernandez- [53] R.W. M.E, J.B. Lloyd (Eds.), Biology of the Lysosome, vol. 27, Plenum Press, Patron, Matrix metalloproteinase-7 and ADAM-12 (a disintegrin and New York, 1996. metalloproteinase-12) define a signaling axis in agonist-induced hyperten- [54] B. Turk, D. Turk, V. Turk, Lysosomal cysteine proteases: more than scaven- sion and cardiac hypertrophy, Circulation 119 (2009) 2480e2489. gers, Biochim. Biophys. Acta 1477 (2000) 98e111. [80] B. Goulet, A. Baruch, N.S. Moon, M. Poirier, L.L. Sansregret, A. Erickson, [55] F.A. Lara, U. Lins, G.H. Bechara, P.L. Oliveira, Tracing heme in a living cell: M. Bogyo, A. Nepveu, A cathepsin L isoform that is devoid of a signal peptide hemoglobin degradation and heme traffic in digest cells of the cattle tick localizes to the nucleus in S phase and processes the CDP/Cux transcription Boophilus microplus, J. Exp. Biol. 208 (2005) 3093e3101. factor, Mol. Cell 14 (2004) 207e219.

Please cite this article in press as: E. Dall, H. Brandstetter, Structure and function of legumain in health and disease, Biochimie (2015), http:// dx.doi.org/10.1016/j.biochi.2015.09.022 22 E. Dall, H. Brandstetter / Biochimie xxx (2015) 1e25

[81] E. Dall, H. Brandstetter, Activation of legumain involves proteolytic and D.P. McManus, Proteolytic degradation of host hemoglobin by schistosomes, conformational events, resulting in a context- and substrate-dependent ac- Mol. Biochem. Parasitol. 89 (1997) 1e9. tivity profile, Acta Crystallogr. Sect. F Struct. Biol. Cryst. Commun. 68 (2012) [109] T. Shimada, N. Hiraiwa, M. Nishimura, I. Hara-Nishimura, Vacuolar pro- 24e31. cessing enzyme of soybean that converts proproteins to the corresponding [82] R. Smith, H.T. Johansen, H. Nilsen, M.H. Haugen, S.J. Pettersen, mature forms, Plant Cell Physiol. 35 (1994) 713e718. G.M. Maelandsmo, M. Abrahamson, R. Solberg, Intra- and extracellular [110] I. Hara-Nishimura, Vacuolar processing enzyme responsible for maturation regulation of activity and processing of legumain by cystatin E/M, Biochimie of vacuolar proteins, Seikagaku 67 (1995) 372e377. 94 (2012) 2590e2599. [111] H. Kato, T. Minamikawa, Identification and characterization of a rice cysteine [83] D.J. Marchant, C.L. Bellac, T.J. Moraes, S.J. Wadsworth, A. Dufour, G.S. Butler, endopeptidase that digests glutelin, Eur. J. Biochem. 239 (1996) 310e316. L.M. Bilawchuk, R.G. Hendry, A.G. Robertson, C.T. Cheung, J. Ng, L. Ang, Z. Luo, [112] K. Muntz, A.D. Shutov, Legumains and their functions in plants, Trends Plant K. Heilbron, M.J. Norris, W. Duan, T. Bucyk, A. Karpov, L. Devel, D. Georgiadis, Sci. 7 (2002) 340e344. R.G. Hegele, H. Luo, D.J. Granville, V. Dive, B.M. McManus, C.M. Overall, A new [113] T. Shimada, K. Yamada, M. Kataoka, S. Nakaune, Y. Koumoto, M. Kuroyanagi, transcriptional role for matrix metalloproteinase-12 in antiviral immunity, S. Tabata, T. Kato, K. Shinozaki, M. Seki, M. Kobayashi, M. Kondo, Nat. Med. 20 (2014) 493e502. M. Nishimura, I. Hara-Nishimura, Vacuolar processing enzymes are essential [84] O. Warburg, K. Posener, E. Negelein, On the metabolism of carcinoma cells, for proper processing of seed storage proteins in Arabidopsis thaliana, J. Biol. Biochem. Z. 152 (1924) 309e344. Chem. 278 (2003) 32292e32299. [85] K.M. Bajjuri, Y. Liu, C. Liu, S.C. Sinha, The legumain protease-activated auri- [114] T. Burster, A. Beck, E. Tolosa, V. Marin-Esteban, O. Rotzschke, K. Falk, statin prodrugs suppress tumor growth and metastasis without toxicity, A. Lautwein, M. Reich, J. Brandenburg, G. Schwarz, H. Wiendl, A. Melms, ChemMedChem 6 (2011) 54e59. R. Lehmann, S. Stevanovic, H. Kalbacher, C. Driessen, Cathepsin G, and not [86] H. Chen, X. Liu, E.S. Clayman, F. Shao, M. Xiao, X. Tian, W. Fu, C. Zhang, the asparagine-specific endoprotease, controls the processing of myelin B. Ruan, P. Zhou, Z. Liu, Y. Wang, W. Rui, Synthesis and evaluation of a CBZ- basic protein in lysosomes from human B lymphocytes, J. Immunol. 172 AAN-Dox prodrug and its in vitro effects on SiHa cervical cancer cells under (2004) 5495e5503. hypoxic conditions, Chem. Biol. Drug Des. (2015), http://dx.doi.org/10.1111/ [115] B. Manoury, E.W. Hewitt, N. Morrice, P.M. Dando, A.J. Barrett, C. Watts, An cbdd.12525. asparaginyl endopeptidase processes a microbial antigen for class II MHC [87] Y. Lin, C. Wei, Y. Liu, Y. Qiu, C. Liu, F. Guo, Selective ablation of tumor- presentation, Nature 396 (1998) 695e699. associated macrophages suppresses metastasis and angiogenesis, Cancer [116] S.P. Matthews, I. Werber, J. Deussing, C. Peters, T. Reinheckel, C. Watts, Sci. 104 (2013) 1217e1225. Distinct protease requirements for antigen presentation in vitro and in vivo, [88] Y. Liu, K.M. Bajjuri, C. Liu, S.C. Sinha, Targeting cell surface alpha(v)beta(3) J. Immunol. 184 (2010) 2423e2431. integrin increases therapeutic efficacies of a legumain protease-activated [117] K. Wolk, G. Grutz, K. Witte, H.D. Volk, R. Sabat, The expression of legumain, auristatin prodrug, Mol. Pharm. 9 (2012) 168e175. an asparaginyl endopeptidase that controls antigen processing, is reduced in [89] L. Stern, R. Perry, P. Ofek, A. Many, D. Shabat, R. Satchi-Fainaro, A novel endotoxin-tolerant monocytes, Genes Immun. 6 (2005) 452e456. antitumor prodrug platform designed to be cleaved by the endoprotease [118] C. Watts, C.X. Moss, D. Mazzeo, M.A. West, S.P. Matthews, D.N. Li, legumain, Bioconjug. Chem. 20 (2009) 500e510. B. Manoury, Creation versus destruction of T cell epitopes in the class II MHC [90] M. Gomez-Angelats, J.A. Cidlowski, Molecular evidence for the nuclear pathway, Ann. N. Y. Acad. Sci. 987 (2003) 9e14. localization of FADD, Cell Death Differ. 10 (2003) 791e797. [119] D.J. Craik, Host-defense activities of cyclotides, Toxins (Basel) 4 (2012) [91] V.N. Pivtoraiko, S.L. Stone, K.A. Roth, J.J. Shacka, Oxidative stress and auto- 139e156. phagy in the regulation of lysosome-dependent neuron death, Antioxid. [120] J.P. Tam, Y.A. Lu, J.L. Yang, K.W. Chiu, An unusual structural motif of anti- Redox Signal. 11 (2009) 481e496. microbial peptides containing end-to-end macrocycle and cystine-knot [92] L. Yu, C.K. McPhee, L.X. Zheng, G.A. Mardones, Y.G. Rong, J.Y. Peng, N. Mi, disulfides, Proc. Natl. Acad. Sci. U. S. A. 96 (1999) 8913e8918. Y. Zhao, Z.H. Liu, F.Y. Wan, D.W. Hailey, V. Oorschot, J. Klumperman, [121] C. Jennings, J. West, C. Waine, D. Craik, M. Anderson, Biosynthesis and E.H. Baehrecke, M.J. Lenardo, Termination of autophagy and reformation of insecticidal properties of plant cyclotides: the cyclic knotted proteins from lysosomes regulated by mTOR, Nature 465 (2010), 942eU911. Oldenlandia affinis, Proc. Natl. Acad. Sci. U. S. A. 98 (2001) 10614e10619. [93] Y.G. Rong, M. Liu, L. Ma, W.Q. Du, H.S. Zhang, Y. Tian, Z. Cao, Y. Li, H. Ren, [122] A.D. Gillon, I. Saska, C.V. Jennings, R.F. Guarino, D.J. Craik, M.A. Anderson, C.M. Zhang, L. Li, S. Chen, J.Z. Xi, L. Yu, Clathrin and phosphatidylinositol-4,5- Biosynthesis of circular proteins in plants, Plant J. 53 (2008) 505e515. bisphosphate regulate autophagic lysosome reformation, Nat. Cell Biol. 14 [123] S. Maschalidi, S. Hassler, F. Blanc, F.E. Sepulveda, M. Tohme, M. Chignard, (2012), 924-þ. P. van Endert, M. Si-Tahar, D. Descamps, B. Manoury, Asparagine endopep- [94] S. Sridhar, B. Patel, D. Aphkhazava, F. Macian, L. Santambrogio, D. Shields, tidase controls anti-influenza virus immune responses through TLR7 acti- A.M. Cuervo, The lipid kinase PI4KIII beta preserves lysosomal identity, vation, PLoS Pathog. 8 (2012) e1002841. EMBO J. 32 (2013) 324e339. [124] F.E. Sepulveda, S. Maschalidi, R. Colisson, L. Heslop, C. Ghirelli, E. Sakka, [95] C.M. Yates, J. Butterworth, M.C. Tennant, A. Gordon, Enzyme-activities in A.M. Lennon-Dumenil, S. Amigorena, L. Cabanie, B. Manoury, Critical role for relation to Ph and lactate in postmortem brain in Alzheimer-type and other asparagine endopeptidase in endocytic Toll-like receptor signaling in den- , J. Neurochem. 55 (1990) 1624e1630. dritic cells, Immunity 31 (2009) 737e748. [96] P. Syntichaki, N. Tavernarakis, Signaling pathways regulating protein syn- [125] S.E. Ewald, A. Engel, J. Lee, M. Wang, M. Bogyo, G.M. Barton, Nucleic acid thesis during ageing, Exp. Gerontol. 41 (2006) 1020e1025. recognition by Toll-like receptors is coupled to stepwise processing by ca- [97] M. Artal-Sanz, C. Samara, P. Syntichaki, N. Tavernarakis, Lysosomal biogen- thepsins and asparagine endopeptidase, J. Exp. Med. 208 (2011) 643e651. esis and function is critical for necrotic cell death in Caenorhabditis elegans, [126] R. Maehr, H.C. Hang, J.D. Mintern, Y.M. Kim, A. Cuvillier, M. Nishimura, J. Cell Biol. 173 (2006) 231e239. K. Yamada, K. Shirahama-Noda, I. Hara-Nishimura, H.L. Ploegh, Asparagine [98] A.J. Yang, D. Chandswangbhuvana, L. Margol, C.G. Glabe, Loss of endosomal/ endopeptidase is not essential for class II MHC antigen presentation but is lysosomal membrane impermeability is an early event in amyloid A beta 1- required for processing of cathepsin L in mice, J. Immunol. 174 (2005) 42 pathogenesis, J. Neurosci. Res. 52 (1998) 691e698. 7066e7074. [99] R.A. Nixon, A.M. Cataldo, P.M. Mathews, The endosomal-lysosomal system of [127] B. Manoury, D. Mazzeo, D.N. Li, J. Billson, K. Loak, P. Benaroch, C. Watts, neurons in Alzheimer's disease pathogenesis: a review, Neurochem. Res. 25 Asparagine endopeptidase can initiate the removal of the MHC class II (2000) 1161e1172. invariant chain chaperone, Immunity 18 (2003) 489e498. [100] R.A. Nixon, The role of autophagy in neurodegenerative disease, Nat. Med. 19 [128] J.M. Chen, M. Fortunato, R.A. Stevens, A.J. Barrett, Activation of progelatinase (2013) 983e997. A by mammalian legumain, a recently discovered cysteine proteinase, Biol. [101] D.M. Wolfe, J.H. Lee, A. Kumar, S. Lee, S.J. Orenstein, R.A. Nixon, Autophagy Chem. 382 (2001) 777e783. failure in Alzheimer's disease and the role of defective lysosomal acidifica- [129] O. del Pozo, E. Lam, Caspases and programmed cell death in the hypersen- tion, Eur. J. Neurosci. 37 (2013) 1949e1961. sitive response of plants to pathogens, Curr. Biol. 8 (1998) R896. [102] J.M. Chen, M. Fortunato, A.J. Barrett, Activation of human prolegumain by [130] A.J. De Jong, F.A. Hoeberichts, E.T. Yakimova, E. Maximova, E.J. Woltering, cleavage at a C-terminal asparagine residue, Biochem. J. 352 (Pt 2) (2000) Chemical-induced apoptotic cell death in tomato cells: involvement of 327e334. caspase-like proteases, Planta 211 (2000) 656e662. [103] O. Warburg, On the metabolism of cancer cells, Naturwissenschaften 12 [131] A. Clarke, R. Desikan, R.D. Hurst, J.T. Hancock, S.J. Neill, NO way back: nitric (1924) 1131e1137. oxide and programmed cell death in Arabidopsis thaliana suspension cul- [104] Y. Kato, S. Ozawa, C. Miyamoto, Y. Maehata, A. Suzuki, T. Maeda, Y. Baba, tures, Plant J. 24 (2000) 667e677. Acidic extracellular microenvironment and cancer, Cancer Cell Int. 13 (2013). [132] N. Hatsugai, M. Kuroyanagi, K. Yamada, T. Meshi, S. Tsuda, M. Kondo, [105] C. Liu, C.Z. Sun, H.N. Huang, K. Janda, T. Edgington, Overexpression of legu- M. Nishimura, I. Hara-Nishimura, A plant vacuolar protease, VPE, mediates main in tumors is significant for invasion/metastasis and a candidate enzy- virus-induced hypersensitive cell death, Science 305 (2004) 855e858. matic target for prodrug therapy, Cancer Res. 63 (2003) 2957e2964. [133] I. Hara-Nishimura, N. Hatsugai, S. Nakaune, M. Kuroyanagi, M. Nishimura, [106] E. Dall, J.C. Fegg, P. Briza, H. Brandstetter, Structure and mechanism of an Vacuolar processing enzyme: an executor of plant cell death, Curr. Opin. aspartimide-dependent peptide ligase in human legumain, Angew. Chem. Plant Biol. 8 (2005) 404e408. Int. Ed. Engl. 54 (2015) 2917e2921. [134] Z. Liu, S.W. Jang, X. Liu, D. Cheng, J. Peng, M. Yepes, X.J. Li, S. Matthews, [107] O. Grandjean, A. Aeschlimann, Contribution to the study of digestion in ticks: C. Watts, M. Asano, I. Hara-Nishimura, H.R. Luo, K. Ye, Neuroprotective ac- histology and fine structure of the midgut ephithelium of Ornithodorus tions of PIKE-L by inhibition of SET proteolytic degradation by asparagine moubata, Murray (Ixodoidea, Argasidae), Acta Trop. 30 (1973) 193e212. endopeptidase, Mol. Cell 29 (2008) 665e678. [108] P.J. Brindley, B.H. Kalinna, J.P. Dalton, S.R. Day, J.Y. Wong, M.L. Smythe, [135] S.J. Choi, S.V. Reddy, R.D. Devlin, C. Menaa, H. Chung, B.F. Boyce,

Please cite this article in press as: E. Dall, H. Brandstetter, Structure and function of legumain in health and disease, Biochimie (2015), http:// dx.doi.org/10.1016/j.biochi.2015.09.022 E. Dall, H. Brandstetter / Biochimie xxx (2015) 1e25 23

G.D. Roodman, Identification of human asparaginyl endopeptidase (legu- [162] D. Musil, D. Zucic, D. Turk, R.A. Engh, I. Mayr, R. Huber, T. Popovic, V. Turk, main) as an inhibitor of osteoclast formation and bone resorption, J. Biol. T. Towatari, N. Katunuma, et al., The refined 2.15 A X-ray crystal structure of Chem. 274 (1999) 27747e27753. human liver cathepsin B: the structural basis for its specificity, EMBO J. 10 [136] S.J. Choi, N. Kurihara, Y. Oba, G.D. Roodman, Osteoclast inhibitory peptide 2 (1991) 2321e2330. inhibits osteoclast formation via its C-terminal fragment, J. Bone Min. Res. 16 [163] K.P. Wilson, J.A. Black, J.A. Thomson, E.E. Kim, J.P. Griffith, M.A. Navia, (2001) 1804e1811. M.A. Murcko, S.P. Chambers, R.A. Aldape, S.A. Raybuck, et al., Structure and [137] K. Kubota, K. Wakabayashi, T. Matsuoka, Proteome analysis of secreted mechanism of interleukin-1 beta converting enzyme, Nature 370 (1994) proteins during osteoclast differentiation using two different methods: two- 270e275. dimensional electrophoresis and isotope-coded affinity tags analysis with [164] C. Wiesmann, L. Leder, J. Blank, A. Bernardi, S. Melkko, A. Decock, A. D'Arcy, two-dimensional chromatography, Proteomics 3 (2003) 616e626. F. Villard, P. Erbel, N. Hughes, F. Freuler, R. Nikolay, J. Alves, F. Bornancin, [138] H.C. Blair, S.L. Teitelbaum, R. Ghiselli, S. Gluck, Osteoclastic bone resorption M. Renatus, Structural determinants of MALT1 protease activity, J. Mol. Biol. by a polarized vacuolar proton pump, Science 245 (1989) 855e857. 419 (2012) 4e21. [139] J.P. Mattsson, C. Skyman, H. Palokangas, K.H. Vaananen, D.J. Keeling, Char- [165] J.W. Yu, P.D. Jeffrey, J.Y. Ha, X. Yang, Y. Shi, Crystal structure of the mucosa- acterization and cellular distribution of the osteoclast ruffled membrane associated lymphoid tissue lymphoma translocation 1 (MALT1) paracaspase vacuolar Hþ-ATPase B-subunit using isoform-specific antibodies, J. Bone region, Proc. Natl. Acad. Sci. U. S. A. 108 (2011) 21004e21009. Min. Res. 12 (1997) 753e760. [166] K. McLuskey, J. Rudolf, W.R. Proto, N.W. Isaacs, G.H. Coombs, C.X. Moss, [140] M. Grano, R. Faccio, S. Colucci, R. Paniccia, N. Baldini, A.Z. Zallone, A. Teti, J.C. Mottram, Crystal structure of a Trypanosoma brucei metacaspase, Proc. Extracellular Ca2þ sensing is modulated by pH in human osteoclast-like cells Natl. Acad. Sci. U. S. A. 109 (2012) 7469e7474. in vitro, Am. J. Physiol. 267 (1994) C961eC968. [167] M. Alvarez-Fernandez, A.J. Barrett, B. Gerhartz, P.M. Dando, J. Ni, [141] C.B. Chan, M. Abe, N. Hashimoto, C. Hao, I.R. Williams, X. Liu, S. Nakao, M. Abrahamson, Inhibition of mammalian legumain by some cystatins is due A. Yamamoto, C. Zheng, J.I. Henter, M. Meeths, M. Nordenskjold, S.Y. Li, to a novel second reactive site, J. Biol. Chem. 274 (1999) 19195e19203. I. Hara-Nishimura, M. Asano, K. Ye, Mice lacking asparaginyl endopeptidase [168] T. Cheng, K. Hitomi, I.M. van Vlijmen-Willems, G.J. de Jongh, K. Yamamoto, develop disorders resembling hemophagocytic syndrome, Proc. Natl. Acad. K. Nishi, C. Watts, T. Reinheckel, J. Schalkwijk, P.L. Zeeuwen, Cystatin M/E is a Sci. U. S. A. 106 (2009) 468e473. high affinity inhibitor of cathepsin V and cathepsin L by a reactive site that is [142] G. Miller, S.P. Matthews, T. Reinheckel, S. Fleming, C. Watts, Asparagine distinct from the legumain-binding site. A novel clue for the role of cystatin endopeptidase is required for normal kidney physiology and homeostasis, M/E in epidermal cornification, J. Biol. Chem. 281 (2006) 15893e15899. FASEB J. 25 (2011) 1606e1617. [169] P. Fuentes-Prior, G.S. Salvesen, The protein structures that shape caspase [143] D.N. Li, S.P. Matthews, A.N. Antoniou, D. Mazzeo, C. Watts, Multistep activity, specificity, activation and inhibition, Biochem. J. 384 (2004) autoactivation of asparaginyl endopeptidase in vitro and in vivo, J. Biol. 201e232. Chem. 278 (2003) 38980e38990. [170] D. Kato, K.M. Boatright, A.B. Berger, T. Nazif, G. Blum, C. Ryan, K.A. Chehade, [144] K. Bernath-Levin, C. Nelson, A.G. Elliott, A.S. Jayasena, A.H. Millar, D.J. Craik, G.S. Salvesen, M. Bogyo, Activity-based probes that target diverse cysteine J.S. Mylne, Peptide macrocyclization by a bifunctional endoprotease, Chem. protease families, Nat. Chem. Biol. 1 (2005) 33e38. Biol. 22 (2015) 571e582. [171] V.I. Rotari, P.M. Dando, A.J. Barrett, Legumain forms from plants and animals [145] L. Zhao, T. Hua, C. Crowley, H. Ru, X. Ni, N. Shaw, L. Jiao, W. Ding, L. Qu, differ in their specificity, Biol. Chem. 382 (2001) 953e959. L.W. Hung, W. Huang, L. Liu, K. Ye, S. Ouyang, G. Cheng, Z.J. Liu, Structural [172] L. Stryer, Biochemistry, Spektrum 4 (2003). analysis of asparaginyl endopeptidase reveals the activation mechanism and [173] C.X. Moss, S.P. Matthews, D.J. Lamont, C. Watts, Asparagine deamidation a reversible intermediate maturation stage, Cell Res. 24 (2014) 344e358. perturbs antigen presentation on class II major histocompatibility complex [146] W. Li, Structural Analysis of Bacterial and Viral Infections and a Cysteine molecules, J. Biol. Chem. 280 (2005) 18498e18503. Proteinase Legumain, Fakultat€ für Lebenswissenschaften, Technische Uni- [174] G. Schwarz, J. Brandenburg, M. Reich, T. Burster, C. Driessen, H. Kalbacher, versitat€ Carolo-Wilhelmina Braunschweig, 2014. Characterization of legumain, Biol. Chem. 383 (2002) 1813e1816. [147] K. Kersse, J. Verspurten, T. Vanden Berghe, P. Vandenabeele, The death-fold [175] M.A. Mathieu, M. Bogyo, C.R. Caffrey, Y. Choe, J. Lee, H. Chapman, M. Sajid, superfamily of homotypic interaction motifs, Trends Biochem. Sci. 36 (2011) C.S. Craik, J.H. McKerrow, Substrate specificity of schistosome versus human 541e552. legumain determined by P1-P3 peptide libraries, Mol. Biochem. Parasitol. [148] S. Yuan, X. Yu, M. Topf, S.J. Ludtke, X. Wang, C.W. Akey, Structure of an 121 (2002) 99e105. apoptosome-procaspase-9 CARD complex, Structure 18 (2010) 571e583. [176] D.M. Blow, J.J. Birktoft, B.S. Hartley, Role of a buried acid group in the [149] F. Martinon, K. Burns, J. Tschopp, The inflammasome: a molecular platform mechanism of action of chymotrypsin, Nature 221 (1969) 337e340. triggering activation of inflammatory caspases and processing of proIL-beta, [177] L. Hedstrom, Serine protease mechanism and specificity, Chem. Rev. 102 Mol. Cell 10 (2002) 417e426. (2002) 4501e4524. [150] A. Steward, G.S. McDowell, J. Clarke, Topology is the principal determinant in [178] S.D. Lewis, F.A. Johnson, J.A. Shafer, Effect of cysteine-25 on the ionization of the folding of a complex all-alpha Greek key death domain from human -159 in papain as determined by proton nuclear magnetic reso- FADD, J. Mol. Biol. 389 (2009) 425e437. nance spectroscopy. Evidence for a his-159eCys-25 ion pair and its possible [151] C.H. Weber, C. Vincenz, The death domain superfamily: a tale of two in- role in catalysis, Biochemistry 20 (1981) 48e51. terfaces? Trends Biochem. Sci. 26 (2001) 475e481. [179] F.A. Johnson, S.D. Lewis, J.A. Shafer, Determination of a low pK for histidine- [152] L. Wang, J.K. Yang, V. Kabaleeswaran, A.J. Rice, A.C. Cruz, A.Y. Park, Q. Yin, 159 in the S-methylthio derivative of papain by proton nuclear magnetic E. Damko, S.B. Jang, S. Raunser, C.V. Robinson, R.M. Siegel, T. Walz, H. Wu, resonance spectroscopy, Biochemistry 20 (1981) 44e48. The Fas-FADD death domain complex structure reveals the basis of DISC [180] S.D. Lewis, F.A. Johnson, J.A. Shafer, Potentiometric determination of ioni- assembly and disease mutations, Nat. Struct. Mol. Biol. 17 (2010) zations at the active site of papain, Biochemistry 15 (1976) 5009e5017. 1324e1329. [181] V.R. Pattabiraman, J.W. Bode, Rethinking amide bond synthesis, Nature 480 [153] H.H. Park, E. Logette, S. Raunser, S. Cuenin, T. Walz, J. Tschopp, H. Wu, Death (2011) 471e479. domain assembly mechanism revealed by crystal structure of the oligomeric [182] K. Lang, M. Erlacher, D.N. Wilson, R. Micura, N. Polacek, The role of 23S ri- PIDDosome core complex, Cell 128 (2007) 533e546. bosomal RNA residue A2451 in peptide bond synthesis revealed by atomic [154] H. Qin, S.M. Srinivasula, G. Wu, T. Fernandes-Alnemri, E.S. Alnemri, Y. Shi, mutagenesis, Chem. Biol. 15 (2008) 485e492. Structural basis of procaspase-9 recruitment by the apoptotic protease- [183] S. Hashimoto, Ribosome-independent peptide synthesis in nature and their activating factor 1, Nature 399 (1999) 549e557. application to dipeptide production, J. Biol. Macromol. 8 (2008) 28e37. [155] D. Kwon, J.H. Yoon, S.Y. Shin, T.H. Jang, H.G. Kim, I. So, J.H. Jeon, H.H. Park, [184] F.T. Guilford, J. Hope, Deficient glutathione in the pathophysiology of A comprehensive manually curated protein-protein interaction database for mycotoxin-related illness, Toxins 6 (2014) 608e623. the Death Domain superfamily, Nucleic Acids Res. 40 (2012) D331eD336. [185] H.J. Forman, H. Zhang, A. Rinna, Glutathione: overview of its protective roles, [156] S. Halfon, S. Patel, F. Vega, S. Zurawski, G. Zurawski, Autocatalytic activation measurement, and biosynthesis, Mol. Asp. Med. 30 (2009) 1e12. of human legumain at aspartic acid residues, FEBS Lett. 438 (1998) 114e118. [186] K. Tabata, H. Ikeda, S. Hashimoto, ywfE in Bacillus subtilis codes for a novel [157] Y.Y. Lin, Y.M. Qiu, C. Xu, Q.L. Liu, B.W. Peng, G.F. Kaufmann, X. Chen, B. Lan, enzyme, L-amino acid ligase, J. Bacteriol. 187 (2005) 5195e5202. C.Y. Wei, D.S. Lu, Y.S. Zhang, Y.F. Guo, Z.M. Lu, B. Jiang, T.S. Edgington, F. Guo, [187] F.B. Perler, InBase: the intein database, Nucleic Acids Res. 30 (2002) Functional Role of Asparaginyl Endopeptidase Ubiquitination by TRAF6 in 383e384. Tumor Invasion and Metastasis, J Natl. Cancer Inst. 106 (2014). [188] G. Volkmann, H.D. Mootz, Recent progress in intein research: from mecha- [158] N. Hiraiwa, M. Nishimura, I. Hara-Nishimura, Vacuolar processing enzyme is nism to directed evolution and applications, Cell. Mol. Life Sci. 70 (2013) self-catalytically activated by sequential removal of the C-terminal and N- 1185e1206. terminal propeptides, FEBS Lett. 447 (1999) 213e216. [189] F.B. Perler, Protein splicing mechanisms and applications, IUBMB Life 57 [159] M. Kuroyanagi, M. Nishimura, I. Hara-Nishimura, Activation of Arabidopsis (2005) 469e476. vacuolar processing enzyme by self-catalytic removal of an auto-inhibitory [190] O. Schneewind, P. Model, V.A. Fischetti, Sorting of protein A to the staphy- domain of the C-terminal propeptide, Plant Cell Physiol. 43 (2002) 143e151. lococcal cell wall, Cell 70 (1992) 267e281. [160] C.R. Caffrey, M.A. Mathieu, A.M. Gaffney, J.P. Salter, M. Sajid, K.D. Lucas, [191] W.W. Navarre, O. Schneewind, Proteolytic cleavage and cell wall anchoring C. Franklin, M. Bogyo, J.H. McKerrow, Identification of a cDNA encoding an at the LPXTG motif of surface proteins in gram-positive bacteria, Mol. active asparaginyl endopeptidase of Schistosoma mansoni and its expression Microbiol. 14 (1994) 115e121. in Pichia pastoris, FEBS Lett. 466 (2000) 244e248. [192] O. Schneewind, A. Fowler, K.F. Faull, Structure of the cell wall anchor of [161] H. Brandstetter, J.S. Kim, M. Groll, R. Huber, Crystal structure of the tricorn surface proteins in Staphylococcus aureus, Science 268 (1995) 103e106. protease reveals a protein disassembly line, Nature 414 (2001) 466e470. [193] U. Ilangovan, H. Ton-That, J. Iwahara, O. Schneewind, R.T. Clubb, Structure of

Please cite this article in press as: E. Dall, H. Brandstetter, Structure and function of legumain in health and disease, Biochimie (2015), http:// dx.doi.org/10.1016/j.biochi.2015.09.022 24 E. Dall, H. Brandstetter / Biochimie xxx (2015) 1e25

sortase, the transpeptidase that anchors proteins to the cell wall of Staph- nebularis, J. Biol. Chem. 275 (2000) 20104e20109. ylococcus aureus, Proc. Natl. Acad. Sci. U. S. A. 98 (2001) 6056e6061. [220] J. Sabotic, D. Gaser, B. Rogelj, K. Gruden, B. Strukelj, J. Brzin, Heterogeneity in [194] B.A. Frankel, R.G. Kruger, D.E. Robinson, N.L. Kelleher, D.G. McCafferty, the cysteine protease inhibitor clitocypin gene family, Biol. Chem. 387 (2006) Staphylococcus aureus sortase transpeptidase SrtA: insight into the kinetic 1559e1566. mechanism and evidence for a reverse protonation catalytic mechanism, [221] J. Sabotic, T. Popovic, V. Puizdar, J. Brzin, Macrocypins, a family of cysteine Biochemistry 44 (2005) 11188e11200. protease inhibitors from the basidiomycete Macrolepiota procera, FEBS J. [195] K.W. Clancy, J.A. Melvin, D.G. McCafferty, Sortase transpeptidases: insights 276 (2009) 4334e4345. into mechanism, substrate specificity, and inhibition, Biopolymers 94 (2010) [222] M. Renko, J. Sabotic, M. Mihelic, J. Brzin, J. Kos, D. Turk, Versatile loops in 385e396. mycocypins inhibit three protease families, J. Biol. Chem. 285 (2010) [196] T. Geiger, S. Clarke, Deamidation, isomerization, and racemization at aspar- 308e316. aginyl and aspartyl residues in peptides. Succinimide-linked reactions that [223] T. Fox, E. de Miguel, J.S. Mort, A.C. Storer, Potent slow-binding inhibition of contribute to protein degradation, J. Biol. Chem. 262 (1987) 785e794. cathepsin B by its propeptide, Biochemistry 31 (1992) 12571e12576. [197] S. Ritz-Timme, M.J. Collins, Racemization of aspartic acid in human proteins, [224] V. Gocheva, H.W. Wang, B.B. Gadea, T. Shree, K.E. Hunter, A.L. Garfall, Ageing Res. Rev. 1 (2002) 43e59. T. Berman, J.A. Joyce, IL-4 induces cathepsin protease activity in tumor- [198] N. Fujii, D-amino acid in elderly tissues, Biol. Pharm. Bull. 28 (2005) associated macrophages to promote cancer growth and invasion, Genes 1585e1589. Dev. 24 (2010) 241e255. [199] R. Motoie, N. Fujii, S. Tsunoda, K. Nagata, T. Shimo-oka, T. Kinouchi, N. Fujii, [225] A.M. Lechner, I. Assfalg-Machleidt, S. Zahler, M. Stoeckelhuber, W. Machleidt, T. Saito, K. Ono, Localization of D-beta-aspartyl residue-containing proteins M. Jochum, D.K. Nagler, RGD-dependent binding of procathepsin X to in various tissues, Int. J. Mol. Sci. 10 (2009) 1999e2009. integrin alphavbeta3 mediates cell-adhesive properties, J. Biol. Chem. 281 [200] B. Yan, S. Steen, D. Hambly, J. Valliere-Douglass, T. Vanden Bos, S. Smallwood, (2006) 39588e39597. Z. Yates, T. Arroll, Y. Han, H. Gadgil, R.F. Latypov, A. Wallace, A. Lim, [226] K. Stachowiak, M. Tokmina, A. Karpinska, R. Sosnowska, W. Wiczk, Fluoro- G.R. Kleemann, W. Wang, A. Balland, Succinimide formation at Asn 55 in the genic peptide substrates for carboxydipeptidase activity of cathepsin B, Acta complementarity determining region of a recombinant monoclonal antibody Biochim. Pol. 51 (2004) 81e92. IgG1 heavy chain, J. Pharm. Sci. 98 (2009) 3509e3521. [227] D. Caglic, J.R. Pungercar, G. Pejler, V. Turk, B. Turk, Glycosaminoglycans [201] A.L. Pace, R.L. Wong, Y.T. Zhang, Y.H. Kao, Y.J. Wang, Asparagine deamidation facilitate procathepsin B activation through disruption of propeptide-mature dependence on buffer type, pH, and temperature, J. Pharm. Sci 102 (2013) enzyme interactions, J. Biol. Chem. 282 (2007) 33076e33085. 1712e1723. [228] L. Berven, H.T. Johansen, R. Solberg, S.O. Kolset, A.B. Samuelsen, Autoacti- [202] GElifesciences, Biacore e Sensor Surface Handbook, 2015, pp. 1e100. vation of prolegumain is accelerated by glycosaminoglycans, Biochimie 95 [203] S. Noguchi, K. Miyawaki, Y. Satow, Succinimide and isoaspartate residues in (2013) 772e781. the crystal structures of hen egg-white lysozyme complexed with tri-N- [229] L. Kjellen, U. Lindahl, Proteoglycans: structures and interactions, Annu. Rev. acetylchitotriose, J. Mol. Biol. 278 (1998) 231e238. Biochem. 60 (1991) 443e475. [204] T.R. Barends, J.B. Bultema, T. Kaper, M.J. van der Maarel, L. Dijkhuizen, [230] B.G. Winchester, Lysosomal metabolism of glycoconjugates, Subcell. Bio- B.W. Dijkstra, Three-way stabilization of the covalent intermediate in amy- chem. 27 (1996) 191e238. lomaltase, an alpha-amylase-like transglycosylase, J. Biol. Chem. 282 (2007) [231] M. Novinec, B. Lenarcic, B. Turk, Cysteine cathepsin activity regulation by 17242e17249. glycosaminoglycans, Biomed. Res. Int. 2014 (2014) 309718. [205] P.S. Sheldon, J.N. Keen, D.J. Bowles, Post-translational peptide bond forma- [232] Z. Li, M. Kienetz, M.M. Cherney, M.N. James, D. Bromme, The crystal and tion during concanavalin A processing in vitro, Biochem. J. 320 (Pt 3) (1996) molecular structures of a cathepsin K:chondroitin sulfate complex, J. Mol. 865e870. Biol. 383 (2008) 78e91. [206] J.S. Mylne, M.L. Colgrave, N.L. Daly, A.H. Chanson, A.G. Elliott, E.J. McCallum, [233] T. Yamane, K. Takeuchi, Y. Yamamoto, Y.H. Li, M. Fujiwara, K. Nishi, A. Jones, D.J. Craik, Albumins and their processing machinery are hijacked for S. Takahashi, I. Ohkubo, Legumain from bovine kidney: its purification, cyclic peptides in sunflower, Nat. Chem. Biol. 7 (2011) 257e259. molecular cloning, immunohistochemical localization and degradation of [207] I. Saska, A.D. Gillon, N. Hatsugai, R.G. Dietzgen, I. Hara-Nishimura, annexin II and vitamin D-binding protein, Biochim. Biophys. Acta 1596 M.A. Anderson, D.J. Craik, An asparaginyl endopeptidase mediates in vivo (2002) 108e120. protein backbone cyclization, J. Biol. Chem. 282 (2007) 29721e29728. [234] J. Baell, M.A. Walters, Chemistry: Chemical con artists foil drug discovery, [208] J.S. Mylne, L.Y. Chan, A.H. Chanson, N.L. Daly, H. Schaefer, T.L. Bailey, Nature 513 (2014) 481e483. P. Nguyencong, L. Cascales, D.J. Craik, Cyclic peptides arising by evolutionary [235] L. Teng, H. Wada, S. Zhang, Identification and functional characterization of parallelism via asparaginyl-endopeptidase-mediated biosynthesis, Plant Cell legumain in amphioxus Branchiostoma belcheri, Biosci. Rep. 30 (2010) 24 (2012) 2765e2778. 177e186. [209] M.L. Korsinczky, H.J. Schirra, K.J. Rosengren, J. West, B.A. Condie, L. Otvos, [236] J. Rozman-Pungercar, N. Kopitar-Jerala, M. Bogyo, D. Turk, O. Vasiljeva, M.A. Anderson, D.J. Craik, Solution structures by 1H NMR of the novel cyclic I. Stefe, P. Vandenabeele, D. Bromme, V. Puizdar, M. Fonovic, M. Trstenjak- trypsin inhibitor SFTI-1 from sunflower seeds and an acyclic permutant, Prebanda, I. Dolenc, V. Turk, B. Turk, Inhibition of papain-like cysteine pro- J. Mol. Biol. 311 (2001) 579e591. teases and legumain by caspase-specific inhibitors: when reaction mecha- [210] N.L. Daly, Y.K. Chen, F.M. Foley, P.S. Bansal, R. Bharathi, R.J. Clark, nism is more important than specificity, Cell Death Differ. 10 (2003) C.P. Sommerhoff, D.J. Craik, The absolute structural requirement for a proline 881e888. in the P30-position of Bowman-Birk protease inhibitors is surmounted in the [237] K.B. Sexton, M.D. Witte, G. Blum, M. Bogyo, Design of cell-permeable, fluo- minimized SFTI-1 scaffold, J. Biol. Chem. 281 (2006) 23668e23675. rescent activity-based probes for the lysosomal cysteine protease aspar- [211] M.L. Colgrave, D.J. Craik, Thermal, chemical, and enzymatic stability of the aginyl endopeptidase (AEP)/legumain, Bioorg. Med. Chem. Lett. 17 (2007) cyclotide kalata B1: the importance of the cyclic cystine knot, Biochemistry 649e653. 43 (2004) 5965e5975. [238] A.J. Niestroj, K. Feussner, U. Heiser, P.M. Dando, A. Barrett, B. Gerhartz, [212] V. Turk, W. Bode, The cystatins: protein inhibitors of cysteine proteinases, H.U. Demuth, Inhibition of mammalian legumain by Michael acceptors and FEBS Lett. 285 (1991) 213e219. AzaAsn-halomethylketones, Biol. Chem. 383 (2002) 1205e1214. [213] M. Abrahamson, A.J. Barrett, G. Salvesen, A. Grubb, Isolation of six cysteine [239] J. Lee, M. Bogyo, Synthesis and evaluation of aza-peptidyl inhibitors of the proteinase inhibitors from human urine. Their physicochemical and enzyme lysosomal asparaginyl endopeptidase, legumain, Bioorg. Med. Chem. Lett. 22 kinetic properties and concentrations in biological fluids, J. Biol. Chem. 261 (2012) 1340e1343. (1986) 11282e11289. [240] O.D. Ekici, M.G. Gotz, K.E. James, Z.Z. Li, B.J. Rukamp, J.L. Asgian, C.R. Caffrey, [214] V. Turk, V. Stoka, O. Vasiljeva, M. Renko, T. Sun, B. Turk, D. Turk, Cysteine E. Hansell, J. Dvorak, J.H. McKerrow, J. Potempa, J. Travis, J. Mikolajczyk, cathepsins: from structure, function and regulation to new frontiers, Bio- G.S. Salvesen, J.C. Powers, Aza-peptide Michael acceptors: a new class of chim. Biophys. Acta (1824) 68e88. inhibitors specific for caspases and other clan CD cysteine proteases, J. Med. [215] J. Ni, M. Abrahamson, M. Zhang, M.A. Fernandez, A. Grubb, J. Su, G.L. Yu, Y. Li, Chem. 47 (2004) 1889e1892. D. Parmelee, L. Xing, T.A. Coleman, S. Gentz, R. Thotakura, N. Nguyen, [241] J.L. Asgian, K.E. James, Z.Z. Li, W. Carter, A.J. Barrett, J. Mikolajczyk, M. Hesselberg, R. Gentz, Cystatin E is a novel human cysteine proteinase G.S. Salvesen, J.C. Powers, Aza-peptide epoxides: a new class of inhibitors inhibitor with structural resemblance to family 2 cystatins, J. Biol. Chem. 272 selective for clan CD cysteine proteases, J. Med. Chem. 45 (2002) 4958e4960. (1997) 10853e10858. [242] J. Lee, M. Bogyo, Development of near-infrared fluorophore (NIRF)-labeled [216] M. Martinez, M. Diaz-Mendoza, L. Carrillo, I. Diaz, Carboxy terminal extended activity-based probes for in vivo imaging of legumain, ACS Chem. Biol. 5 phytocystatins are bifunctional inhibitors of papain and legumain cysteine (2010) 233e243. proteinases, FEBS Lett. 581 (2007) 2914e2918. [243] A. Ovat, F. Muindi, C. Fagan, M. Brouner, E. Hansell, J. Dvorak, D. Sojka, [217] B. Manoury, W.F. Gregory, R.M. Maizels, C. Watts, Bm-CPI-2, a cystatin ho- P. Kopacek, J.H. McKerrow, C.R. Caffrey, J.C. Powers, Aza-peptidyl Michael molog secreted by the filarial parasite Brugia malayi, inhibits class II MHC- acceptor and epoxide inhibitorsepotent and selective inhibitors of Schisto- restricted antigen processing, Curr. Biol. 11 (2001) 447e451. soma mansoni and Ixodes ricinus legumains (asparaginyl endopeptidases), [218] J. Murray, B. Manoury, A. Balic, C. Watts, R.M. Maizels, Bm-CPI-2, a cystatin J. Med. Chem. 52 (2009) 7192e7210. from Brugia malayi nematode parasites, differs from Caenorhabditis elegans [244] K.E. James, M.G. Gotz, C.R. Caffrey, E. Hansell, W. Carter, A.J. Barrett, cystatins in a specific site mediating inhibition of the antigen-processing J.H. McKerrow, J.C. Powers, Aza-peptide epoxides: potent and selective in- enzyme AEP, Mol. Biochem. Parasitol. 139 (2005) 197e203. hibitors of Schistosoma mansoni and pig kidney legumains (asparaginyl [219] J. Brzin, B. Rogelj, T. Popovic, B. Strukelj, A. Ritonja, Clitocypin, a new type of endopeptidases), Biol. Chem. 384 (2003) 1613e1618. cysteine proteinase inhibitor from fruit bodies of mushroom clitocybe [245] S. Lewen, H. Zhou, H.D. Hu, T. Cheng, D. Markowitz, R.A. Reisfeld, R. Xiang,

Please cite this article in press as: E. Dall, H. Brandstetter, Structure and function of legumain in health and disease, Biochimie (2015), http:// dx.doi.org/10.1016/j.biochi.2015.09.022 E. Dall, H. Brandstetter / Biochimie xxx (2015) 1e25 25

Y. Luo, A Legumain-based minigene vaccine targets the tumor stroma and confer respective “off” and “on” (19)F NMR/MRI signals for legumain activity suppresses breast cancer growth and angiogenesis, Cancer Immunol. detection in zebrafish, ACS Nano 9 (2015) 5117e5124. Immunother. 57 (2008) 507e515. [264] Y. Liu, R.K. Goswami, C. Liu, S.C. Sinha, Chemically programmed bispecific [246] Y. Ohno, J. Nakashima, M. Izumi, M. Ohori, T. Hashimoto, M. Tachibana, As- antibody targeting legumain protease and alphavbeta3 integrin mediates sociation of legumain expression pattern with prostate cancer invasiveness strong antitumor effects, Mol. Pharm. 12 (2015) 2544e2550. and aggressiveness, World J. Urol. 31 (2013) 359e364. [265] J.D. Thompson, D.G. Higgins, T.J. Gibson, CLUSTAL W: improving the sensi- [247] N. Li, Q. Liu, Q. Su, C. Wei, B. Lan, J. Wang, G. Bao, F. Yan, Y. Yu, B. Peng, J. Qiu, tivity of progressive multiple sequence alignment through sequence X. Yan, S. Zhang, F. Guo, Effects of legumain as a potential prognostic factor weighting, position-specific gap penalties and weight matrix choice, Nucleic on gastric cancers, Med. Oncol. 30 (2013) 621. Acids Res. 22 (1994) 4673e4680. [248] R.V. Murthy, G. Arbman, J. Gao, G.D. Roodman, X.F. Sun, Legumain expression [266] C.S. Bond, A.W. Schuttelkopf, ALINE: a WYSIWYG protein-sequence align- in relation to clinicopathologic and biological variables in colorectal cancer, ment editor for publication-quality alignments, Acta Crystallogr. D Biol. Clin. Cancer Res. 11 (2005) 2293e2299. Crystallogr. 65 (2009) 510e512. [249] M. Wu, G.R. Shao, F.X. Zhang, W.X. Wu, P. Xu, Z.M. Ruan, Legumain protein as [267] L.K. Santos-Silva, A. Soares-Costa, L.T.S. Gerald, S.P. Meneghin, F. Henrique- a potential predictive biomarker for Asian patients with breast carcinoma, Silva, Recombinant expression and biochemical characterization of sugar- Asian Pac. J. Cancer Prev. 15 (2014) 10773e10777. cane legumain, Plant Physiol. Biochem. 57 (2012) 181e192. [250] Y. Luo, H. Zhou, J. Krueger, C. Kaplan, S.H. Lee, C. Dolman, D. Markowitz, [268] L. Berven, H.T. Johansen, R. Solberg, S.O. Kolset, A.B. Samuelsen, Autoacti- W. Wu, C. Liu, R.A. Reisfeld, R. Xiang, Targeting tumor-associated macro- vation of prolegumain is accelerated by glycosaminoglycans, Biochimie 95 phages as a novel strategy against breast cancer, J. Clin. Investig. 116 (2006) (2013) 772e781. 2132e2141. [269] B. Wu, J. Yin, C. Texier, M. Roussel, K.S. Tan, Blastocystis legumain is localized [251] J.J. Briggs, M.H. Haugen, H.T. Johansen, A.I. Riker, M. Abrahamson, O. Fodstad, on the cell surface, and specific inhibition of its activity implicates a pro- G.M. Maelandsmo, R. Solberg, Cystatin E/M suppresses legumain activity and survival role for the enzyme, J. Biol. Chem. 285 (2010) 1790e1798. invasion of human melanoma, BMC Cancer 10 (2010) 17. [270] A.H. Davis, J. Nanduri, D.C. Watson, Cloning and gene expression of Schis- [252] C. Watts, S.P. Matthews, D. Mazzeo, B. Manoury, C.X. Moss, Asparaginyl tosoma mansoni protease, J. Biol. Chem. 262 (1987) 12851e12855. endopeptidase: case history of a class II MHC compartment protease, [271] M. Egger, A. Jurets, M. Wallner, P. Briza, S. Ruzek, S. Hainzl, U. Pichler, Immunol. Rev. 207 (2005) 218e228. C. Kitzmuller, B. Bohle, C.G. Huber, F. Ferreira, Assessing protein immuno- [253] R.J. Castellani, K. Honda, X. Zhu, A.D. Cash, A. Nunomura, G. Perry, M.A. Smith, genicity with a dendritic cell line-derived endolysosomal degradome, PLoS Contribution of redox-active iron and copper to oxidative damage in Alz- One 6 (2011) e17278. heimer disease, Ageing Res. Rev. 3 (2004) 319e326. [272] Y. Morita, H. Araki, T. Sugimoto, K. Takeuchi, T. Yamane, T. Maeda, [254] J.H. Herskowitz, Y.M. Gozal, D.M. Duong, E.B. Dammer, M. Gearing, K. Ye, Y. Yamamoto, K. Nishi, M. Asano, K. Shirahama-Noda, M. Nishimura, T. Uzu, J.J. Lah, J. Peng, A.I. Levey, N.T. Seyfried, Asparaginyl endopeptidase cleaves I. Hara-Nishimura, D. Koya, A. Kashiwagi, I. Ohkubo, Legumain/asparaginyl TDP-43 in brain, Proteomics 12 (2012) 2455e2463. endopeptidase controls extracellular matrix remodeling through the [255] R.L. Smith, O.A. Astrand, L.M. Nguyen, T. Elvestrand, G. Hagelin, R. Solberg, degradation of fibronectin in mouse renal proximal tubular cells, FEBS Lett. H.T. Johansen, P. Rongved, Synthesis of a novel legumain-cleavable colchicine 581 (2007) 1417e1424. prodrug with cell-specific toxicity, Bioorg. Med. Chem. 22 (2014) 3309e3315. [273] S. Hasegawa, M. Yamasaki, T. Fukui, Degradation of acetoacetyl-CoA syn- [256] Z. Liu, M. Xiong, J. Gong, Y. Zhang, N. Bai, Y. Luo, L. Li, Y. Wei, Y. Liu, X. Tan, thetase, a ketone body-utilizing enzyme, by legumain in the mouse kidney, R. Xiang, Legumain protease-activated TAT-liposome cargo for targeting Biochem. Biophys. Res. Commun. 453 (2014) 631e635. tumours and their microenvironment, Nat. Commun. 5 (2014) 4280. [274] M. Sajid, J.H. McKerrow, E. Hansell, M.A. Mathieu, K.D. Lucas, I. Hsieh, [257] D. Liao, Z. Liu, W. Wrasidlo, T. Chen, Y. Luo, R. Xiang, R.A. Reisfeld, Synthetic D. Greenbaum, M. Bogyo, J.P. Salter, K.C. Lim, C. Franklin, J.H. Kim, : a novel targeting ligand for nanotherapeutic drug delivery C.R. Caffrey, Functional expression and characterization of Schistosoma inhibiting tumor growth without systemic toxicity, Nanomedicine 7 (2011) mansoni cathepsin B and its trans-activation by an endogenous asparaginyl 665e673. endopeptidase, Mol. Biochem. Parasitol. 131 (2003) 65e75. [258] L.R. Desnoyers, O. Vasiljeva, J.H. Richardson, A. Yang, E.E. Menendez, [275] J. Tiedemann, A. Schlereth, K. Muntz, Differential tissue-specific expression T.W. Liang, C. Wong, P.H. Bessette, K. Kamath, S.J. Moore, J.G. Sagert, of cysteine proteinases forms the basis for the fine-tuned mobilization of D.R. Hostetter, F. Han, J. Gee, J. Flandez, K. Markham, M. Nguyen, M. Krimm, storage globulin during and after germination in legume seeds, Planta 212 K.R. Wong, S. Liu, P.S. Daugherty, J.W. West, H.B. Lowman, Tumor-specific (2001) 728e738. activation of an EGFR-targeting probody enhances therapeutic index, Sci. [276] D. Gruis, J. Schulze, R. Jung, Storage protein accumulation in the absence of Transl. Med. 5 (2013), 207ra144. the vacuolar processing enzyme family of cysteine proteases, Plant Cell 16 [259] M. Smahel, M. Duskova, I. Polakova, J. Musil, Enhancement of DNA vaccine (2004) 270e290. potency against legumain, J. Immunother. 37 (2014) 293e303. [277] H. Park, I. Kusakabe, Y. Sakakibara, H. Kobayashi, Autoproteolytic processing [260] Z. Liu, D. Lv, S. Liu, J. Gong, D. Wang, M. Xiong, X. Chen, R. Xiang, X. Tan, of aspartic proteinase from sunflower seeds, Biosci. Biotechnol. Biochem. 65 Alginic acid-coated chitosan nanoparticles loaded with legumain DNA vac- (2001) 702e705. cine: effect against breast cancer in mice, PLoS One 8 (2013) e60190. [278] Y. Wang, S. Zhu, S. Liu, L. Jiang, L. Chen, Y. Ren, X. Han, F. Liu, S. Ji, X. Liu, [261] L.E. Edgington, M. Verdoes, A. Ortega, N.P. Withana, J. Lee, S. Syed, J. Wan, The vacuolar processing enzyme OsVPE1 is required for efficient M.H. Bachmann, G. Blum, M. Bogyo, Functional imaging of legumain in glutelin processing in rice, Plant J. 58 (2009) 606e617. cancer using a new quenched activity-based probe, J. Am. Chem. Soc. 135 [279] T. Okamoto, A. Yuki, N. Mitsuhashi, T. Minamikawa, Asparaginyl endopep- (2013) 174e182. tidase (VmPE-1) and autocatalytic processing synergistically activate the [262] Y.J. Chen, S.C. Wu, C.Y. Chen, S.C. Tzou, T.L. Cheng, Y.F. Huang, S.S. Yuan, vacuolar cysteine proteinase (SH-EP), Eur. J. Biochem. 264 (1999) 223e232. Y.M. Wang, Peptide-based MRI contrast agent and near-infrared fluorescent [280] R.L. Heath, P.A. Barton, R.J. Simpson, G.E. Reid, G. Lim, M.A. Anderson, probe for intratumoral legumain detection, Biomaterials 35 (2014) 304e315. Characterization of the protease processing sites in a multidomain protein- [263] Y. Yuan, S. Ge, H. Sun, X. Dong, H. Zhao, L. An, J. Zhang, J. Wang, B. Hu, ase inhibitor precursor from Nicotiana alata, Eur. J. Biochem. 230 (1995) G. Liang, Intracellular self-assembly and disassembly of (19)F nanoparticles 250e257.

Please cite this article in press as: E. Dall, H. Brandstetter, Structure and function of legumain in health and disease, Biochimie (2015), http:// dx.doi.org/10.1016/j.biochi.2015.09.022