
ACTIVATION MECHANISMS FOR ZYMOGENS

BELONGING TO THE PAPAIN FAMILY OF CYSTEINE

PROTEASES

OMAR QURAISHI

Biochemistry Department

McGill University, Montreal

September 1999

A thesis submitted to the Faculty of Graduate Studies and Research in partial fulfilment of the requirements of the degree of Doctor of Philosophy

© Omar Quraishi, 1999

National Library of Canada, Acquisitions and Bibliographic Services, 395 Wellington Street, Ottawa ON K1A 0N4, Canada

The author has granted a non-exclusive licence allowing the National Library of Canada to reproduce, loan, distribute or sell copies of this thesis in microform, paper or electronic formats.

The author retains ownership of the copyright in this thesis. Neither the thesis nor substantial extracts from it may be printed or otherwise reproduced without the author's permission.

Acknowledgments

Abstract

Résumé

Introduction and Literature Review

- Serine Proteases

  - Activation of Trypsinogen and Chymotrypsinogen
  - Prothrombin Activation
  - Proprotein Convertases

- Aspartic Proteases

- Zinc Metalloproteases

  - Procarboxypeptidase A and B
  - Prostromelysin-1

- Cysteine Proteases

  - Caspase Family
  - Papain Family

Chapter 1: The Occluding Loop in Cathepsin B Defines the pH Dependence of Inhibition by Its Propeptide

- Connecting Text for Chapters 1 and 2

Chapter 2: Identification of Internal Autoproteolytic Cleavage Sites Within the Prosegments of Recombinant Procathepsin B and Procathepsin S

- Connecting Text for Chapters 2 and 3

Chapter 3: Functional Expression of Human Procathepsin H in Pichia pastoris and Attempts at its Correct Processing

Summary

References

LIST OF FIGURES AND TABLES

INTRODUCTION AND LITERATURE REVIEW
FIGURE 1: Activation of Chymotrypsinogen
FIGURE 2: Prothrombinase Complex
FIGURE 3: Progastricsin Activation
FIGURE 4: Sequence Alignment of Cathepsin L-like Prodomains
FIGURE 5: Prosegments of Cathepsin L and Cathepsin B
FIGURE 6: Structures of Procathepsin L and Procathepsin B
FIGURE 7: Structure of Procathepsin B
FIGURE 8: Model of Cystatins in Complex with Papain

ILLUSTRATIONS FOR CHAPTER 1
FIGURE 1: Prosequences of Rat and Human Cathepsin B
FIGURE 2: Conformations Adopted by the Occluding Loop
FIGURE 3: Autocatalytic Processing of Procathepsin B
TABLE 1: Propeptide Inhibition of Cathepsin B Mutants
TABLE 2: Role of Aspartic Acid Residues in Propeptide
TABLE 3: Activity of Cathepsin B Mutants Towards Z-Phe-Arg-MCA

ILLUSTRATIONS FOR CHAPTER 2
FIGURE 1: Am3-stained SDS-PAGE of Procathepsin B Processing
FIGURE 2: PVDF-membranes of Procathepsins B and S
FIGURE 3: Autoproteolytic Cleavage Sites in Procathepsins B and S
FIGURE 4: Active Site Cleft in Rat Cys29Ser Procathepsin B
FIGURE 5: Plots of kobs versus Proenzyme Concentration
FIGURE 6: Western Blots of Wild-Type and Arg8Ala Propapain

ILLUSTRATIONS FOR CHAPTER 3
FIGURE 1: Homology Model of Procathepsin H
FIGURE 2 (A,B): Non-Reducing SDS-PAGE of Glycosylated Procathepsin H
FIGURE 3: Structural Alignment of Prosegments Composed of Cys82p
FIGURE 4: Progress Curves of Aminopeptidase Activity for Cathepsin H Isoforms

ACKNOWLEDGMENTS

The research was funded in part by the Government of Canada's Network of Centres of Excellence Program supported by the Medical Research Council of Canada and the Natural Sciences and Engineering Research Council of Canada through PENCE Inc. (the Protein Engineering Network of Centres of Excellence).

The author gratefully acknowledges the mentorship and encouragement provided by Dr. Andrew C. Storer, who made sure that he left the laboratory with the ability to 'think independently' in order to be prepared to take on future challenges. The author is also grateful for the encouragement and advice provided by several members of the laboratory and other employees of the Biotechnology Research Institute, National Research Council Canada (past and present): Drs. Marko Pregel, Dorit K. Nagler, Robert Ménard, Edmund Ziomek, Richard Hrabal, J. Sivaraman, Shahul Nilar, and Dr. John S. Mort (Joint Diseases Laboratory, Shriners Hospital for Children). The author wishes to thank Mr. Robert Dupras for introducing him to the hockey and softball games which enabled him to interact with the friendly employees of the BRI.

The author would also like to reserve special mention for his wife, Dr. Katharine A. Carpenter, who demonstrated tremendous patience during the course of these studies. He also thanks his beloved parents, the late Abdul Mateen Quraishi and Yolande Quraishi, for instilling the value of obtaining a higher education from a very early age. Finally, the author is grateful to the Almighty God: The Creator, Ruler, and Sustainer, Cherisher of All Worlds, The Most Gracious, and The Most Merciful.

ABSTRACT

The activation mechanisms for zymogens belonging to the papain family of cysteine proteases were investigated. This was accomplished using site-directed mutagenesis, kinetic measurements, the identification of processing intermediates, and the analysis of the various X-ray crystal structures reported to date. Procathepsin B is a unique precursor among papain-like enzymes in that it is composed of a shorter prodomain (62 residues versus ≥90 residues for those belonging to the cathepsin L subfamily), and the mature enzyme contains a twenty-residue insertion termed the occluding loop. In this study, the pH dependence of cathepsin B inhibition by its propeptide was shown to be eliminated upon removal of this enzyme's occluding loop. Furthermore, variants of cathepsin B containing the mutation Asp22Ala or His110Ala also displayed a loss of pH dependence in their affinity for the propeptide inhibitor. Similarly, the overall rate of autoprocessing for full-length procathepsin B was shown to be affected by the occluding loop mutations. These results suggest a possible influence of the pH-dependent stability of the occluding loop on the overall rate of processing for this precursor.

The addition of the protein proteinase inhibitor, cystatin C, impeded the overall rate of autoactivation for procathepsin B and procathepsin S and caused the accumulation of processing intermediates for these precursors, which were subsequently identified by automated Edman degradation. The N-terminal sequences of these processing intermediates correspond to an area of the prodomains which binds through the substrate-binding clefts of these enzymes, thus suggesting a plausible intramolecular step of processing for this family of zymogens. This unimolecular mechanism was found to rely on the conformational mobility of prosegment residues (i.e., at the C-terminal end). Furthermore, in contrast to what has been observed for zymogens belonging to the aspartic protease family, it was determined that charged residues located at the N-terminus of the mature segment found in propapain do not contribute to the overall pH-triggering mechanism of activation for this precursor.

Procathepsin H was determined to be an unusual mammalian member of the cathepsin L subfamily due to its inability to autoprocess, and the aminopeptidase activity of mature cathepsin H was found to be incapable of converting its own precursor. Furthermore, prosegment residues located near the pro/mature junction of procathepsin H are highly resistant to the proteolytic action of secondary proteases. These findings are consistent with the pre-formation of a disulfide bond within the cathepsin H precursor which links the prodomain to the enzyme using Cys82p and Cys214. Consistent with the findings for procathepsins B and S, unrestricted conformational mobility at the C-terminal end of the prosegment (i.e., near the pro/mature junction) is an important prerequisite for efficient autoactivation among zymogens of papain-like enzymes.

RÉSUMÉ

Les mécanismes d'activation de zymogènes des protéases à cystéine de la famille de la papaïne ont été étudiés. Ceci a été accompli en utilisant la mutagénèse dirigée, des mesures de cinétique, l'identification d'intermédiaires de maturation, et l'analyse des diverses structures cristallines connues à ce jour. La procathepsine B est un précurseur unique dans le groupe d'enzymes similaires à la papaïne dû au fait qu'elle est composée d'une prorégion plus courte, contenant seulement 62 résidus, comparativement aux 90 et plus résidus rencontrés dans les précurseurs appartenant au sous-groupe d'enzymes similaires à la cathepsine L. De plus, la cathepsine B mature contient une insertion de vingt acides aminés qui constituent une boucle d'occlusion.

Dans cette étude, il a été démontré que l'influence du pH sur l'inhibition de la cathepsine B par son propeptide est éliminée lorsque cette boucle d'occlusion n'est pas présente. De plus, des variantes de la cathepsine B contenant la mutation Asp22Ala ou His110Ala ont aussi perdu cette dépendance du pH pour l'affinité de l'inhibiteur propeptide. Similairement, la vitesse globale d'automaturation de la procathepsine B est affectée par les mutations de la boucle d'insertion. Ces résultats suggèrent que la stabilité de cette boucle d'insertion, qui est dépendante du pH, influence la vitesse de maturation du précurseur de la cathepsine B.

L'addition d'un inhibiteur protéique des protéases à cystéine, la cystatine C, cause une forte diminution de la vitesse de maturation autocatalytique de précurseurs tels la procathepsine B et la procathepsine S, et en conséquence conduit à une accumulation d'intermédiaires de maturation pour ces précurseurs. Ces intermédiaires ont été identifiés par l'analyse de la séquence N-terminale par la méthode d'Edman. Les séquences N-terminales de ces intermédiaires permettent de localiser les sites d'hydrolyse à une région des prodomaines située au site de liaison de substrats de ces enzymes, ce qui nous permet de suggérer une étape intramoléculaire dans la maturation de cette famille de zymogènes. Ce mécanisme unimoléculaire dépend de la mobilité conformationnelle de résidus de la prorégion dans la partie C-terminale. De plus, contrairement à ce qui a été observé pour les zymogènes de la famille des protéases aspartyle, il a été démontré que des résidus chargés situés en N-terminal du domaine correspondant à l'enzyme mature dans la propapaïne ne contribuent pas au mécanisme pH-dépendant de déclenchement de la maturation pour ce précurseur.

Il a été déterminé que la procathepsine H est un membre particulier du sous-groupe d'enzymes similaires à la cathepsine L, dû au fait qu'elle ne peut procéder à une maturation autocatalytique, et à l'incapacité de la cathepsine H de convertir son précurseur en enzyme mature via son activité aminopeptidase. De plus, les résidus du prodomaine situés près de la jonction de la prorégion et de l'enzyme mature de la procathepsine H sont très résistants à l'action protéolytique de protéases exogènes. Ces résultats peuvent être expliqués par la présence d'un pont disulfure dans le précurseur de la cathepsine H (Cys82p-Cys214) qui relie le prodomaine au reste de l'enzyme. En accord avec les résultats obtenus pour les procathepsines B et S, la mobilité conformationnelle dans la partie C-terminale de la prorégion (près de la jonction prodomaine-enzyme mature) est un facteur déterminant pour la maturation autocatalytique des zymogènes des enzymes de la famille de la papaïne.

INTRODUCTION AND LITERATURE REVIEW

Enzymes are highly proficient at catalyzing a diverse array of chemical reactions. These reactions may involve the hydrolysis of peptide bonds (serine, cysteine, aspartyl, or Zn2+-coordinated proteases), folding/unfolding of polypeptides or large proteins (calnexin (1,2)), shifting the cis/trans isomerization equilibrium of proline residues within proteins (cyclophilins (3)), phosphorylation/dephosphorylation of proteins (kinases/phosphatases (4,5)), or mediating protein-protein or protein-DNA interactions. Enzymes have the capacity to accelerate the rates of reactions by many orders of magnitude using several methods. For example, enzymes are capable of increasing the effective concentrations and optimizing the relative orientations of reactants. Enzymes also provide a unique chemical environment which lowers the energy of activation for the reaction (Ea) and possess catalytic residues with enhanced nucleophilicity. The diversity of chemical reactions needed for the survival of living organisms requires a similar diversity among enzymes. In humans, enzymes have been documented to partake in many physiological functions such as general protein degradation, antigen processing, bone resorption, cartilage proteoglycan breakdown, blood coagulation, and cellular apoptosis. Because enzymes are central components of a variety of biological processes, factors which deregulate enzymatic activity can lead to several disease states. Thrombin, for example, is critical for activating the blood coagulation cascade.

Disorders in regulating thrombin activity may, therefore, lead to heart attacks or strokes. Furthermore, deregulated activity of the matrix metalloproteases (MMPs) has been implicated in tumor metastasis (6), and the lysosomal cysteine proteases belonging to the papain family have been associated with arthritis (7) and osteoporosis (8). Hence, enzymes are increasingly being viewed as important targets for therapeutic intervention.

In order to regulate the activity of these proteases, all known cellular and bacterial proteases are expressed as latent, higher molecular weight proprotein precursors. These proenzymes possess extensions of various lengths at the N-terminus of the enzyme. For example, the prosegment of procathepsin C is composed of over 200 residues (9,10), whereas that of trypsinogen consists of only six amino acids (11). The prosegments usually display their inhibitory activity by blocking access of natural substrates to the enzyme's substrate-binding cleft. Proregions have also been shown to promote proper folding of the enzyme, chaperone the enzyme to its appropriate destination, and stabilize the enzyme until it is transported to its destination (e.g., the stomach, vacuoles, Golgi, lysosome, or extracellular matrix). Once the zymogens arrive at their final destination, however, a mechanism is needed to activate them so as to allow the active enzymes to perform their catalytic duties. In order to make this conversion possible, most (but not all) proenzymes contain the structural information necessary to produce active protein; i.e., a preformed and functional catalytic ion-pair and a mature substrate-binding cleft. The proenzymes which are competent in performing autoactivation may do so using intra- or intermolecular pathways. The proenzymes which are not capable of self-activation, however, must rely either on the activity of secondary proteases, the binding of accessory protein molecules, or a combination of the two in order to complete the maturation process. Proenzymes capable of autoprocessing usually require the destabilization of the prodomain/enzyme complex. This 'loosening' may be achieved by simply varying the pH environment to which the precursor is subjected. The difficulties in elucidating the molecular mechanisms involved in proenzyme activation arise because many events may occur simultaneously, such as limited proteolysis of the precursor and conformational changes within the prosegments and/or catalytic domains. Clearly, a detailed characterization of the methods utilized by the prosegments to inhibit their parent enzymes is an important step in understanding the events which lead to precursor activation.

In this thesis, zymogens of proteolytic enzymes and their mechanisms of activation will be the main focus of discussion. Proteases have evolved several methods of cleaving peptide (amide) bonds. Some proteolytic enzymes utilize the side-chain of either a serine or a cysteine residue as a catalytic nucleophile. For these proteases, the formation of a hydrogen bond between the catalytic nucleophile and an imidazole (often located at a distant position relative to the catalytic residue within the primary structure of the protein) is required; the imidazole acts as a general base, lowering the pKa of the catalytic nucleophile and thereby enhancing its nucleophilicity and reactivity. Other proteases may utilize metal cofactors or the surrounding solvent to carry out their proteolytic activity. In addition, proteolytic enzymes possess an electrophilic pocket, termed the oxyanion hole, which accepts and stabilizes the negative charge generated on the carbonyl oxygen of the tetrahedral intermediate. For some families of enzymes, such as those of the serine and aspartic proteases, the mechanisms of activation have been well characterized because much progress has been made in elucidating the three-dimensional structures of their zymogens. However, for families of zymogens which have been identified only recently, such as those of the caspase family of cysteine proteases or the proprotein convertase family of serine proteases, there is a corresponding lack of structural information available. Hence, much progress remains to be made in the characterization of processing mechanisms for several families of precursors.

The main body of the thesis will address the structural and mechanistic features of autoactivation among zymogens belonging to the papain family of cysteine proteases. In order to appreciate the similarities and differences among the different families of proproteins, the introduction of this thesis reviews the available literature which focusses on the mechanisms of processing (i.e., structural requirements, identification of cleavage sites, etc.) for families of proteases other than the papain family. The classes of enzymes will be divided according to the nature of the catalytic nucleophile used to carry out hydrolysis. Therefore, all enzymes will be classified as either serine, aspartic, Zn2+-coordinated, or cysteine proteases. Each group will also be divided into sub-families, since enzymes using the same type of nucleophilic residue may be composed of different three-dimensional folds, or have a different cellular or physiological localization and function.

SERINE PROTEASES

To date, zymogens belonging to the serine family of proteases have been the best characterized as a result of over 45 years of accumulated data using kinetic, chemical, and physical techniques. Chymotrypsin and trypsin are biosynthesized as larger inactive precursors by the pancreatic acinar cells. Using the pancreatic duct, these proenzymes are then channelled into the duodenum where they are subsequently activated due to the low pH environment of the small intestine. Upon their activation, they utilize a serine residue as the catalytic nucleophile and serve as catalysts for the hydrolysis of peptide (amide) bonds with varying specificities for the side-chains located adjacent to the peptide bond to be cleaved. Acute pancreatitis is a serious medical condition which may arise from the premature activation of these zymogens within the pancreatic tissues.

ACTIVATION OF TRYPSINOGEN AND CHYMOTRYPSINOGEN

The zymogen of trypsin, termed trypsinogen, is activated (processed) by the proteolytic action of a second serine protease, enteropeptidase, which is secreted by the duodenal mucosa. Enteropeptidase recognizes the high concentration of negative charge formed by four consecutive aspartate residues (Asp11p-Asp14p) within trypsinogen's N-terminal hexapeptide extension, and cleaves between Lys15p and Ile16 to yield the active enzyme and cause the permanent removal of the prosegment (Note: the suffix 'p' refers to residues located in the prodomain). This site of proteolysis may also be recognized by the small amount of trypsin produced by enteropeptidase activity, and therefore, the extent of trypsinogen activation is amplified by the activity of the mature form of the protein. Hence, trypsinogen activation is autocatalytic but requires the initial activity of a secondary protease. Crystallographic studies of trypsinogen (12,13,14) indicate that the catalytic triad is indistinguishable from that of the mature enzyme, but its inactivity is due to a partially obstructed substrate-binding cleft and immature oxyanion hole arising from a disordered loop (residues 186-294) located within the active site. These observations are consistent with the necessity for structural changes during activation of this family of zymogens (15).

The situation for chymotrypsinogen activation is similar to that for trypsinogen. The active site center in mature (active) chymotrypsin is composed of Ser195 and His57. In addition, Asp102 is situated in the vicinity of the catalytic histidine to stabilize the required conformation of the imidazolium side-chain (16,17). The oxyanion hole, stabilizing the developing negative charge on the carbonyl oxygen atom of the tetrahedral intermediate, is formed by the backbone amide protons of Gly193 and Ser195 (18). Within the mature protein, therefore, the ability of these amides to donate their protons to the carbonyl oxygen of the P1 residue (notation of Schechter & Berger, 1967 (19)) requires a specific backbone conformation for residues 190-195. As with trypsinogen, chymotrypsinogen activation is initiated by the activity of a secondary protease. The activity of mature trypsin (or that of enteropeptidase) is required to cleave between residues Arg15p-Ile16 of the 15-residue proregion found in chymotrypsinogen (Figure 1). Residues 1p-15p of the prosegment, however, remain covalently attached to the main body of the enzyme via a disulfide bond linking Cys1p and Cys122. The liberated N-terminus at Ile16 then forms an ion-pair with the carboxylate side-chain of Asp194. This interaction is a prerequisite to form an active enzyme. These events are followed by the autocatalytic release of the dipeptides Ser14p-Arg15p and Thr147-Asn148. Direct comparison of the crystal structures of bovine chymotrypsinogen (20) and mature α- and γ-chymotrypsin (21,22) reveals that their overall folds are identical and that only a small segment of residues undergoes conformational changes during conversion. Significantly, it has been determined that the substrate-binding cleft is only partially formed in the chymotrypsin precursor. The peptide bond between Met192-Gly193 is in the wrong orientation to allow the backbone amide to contribute a proton to the oxyanion hole. As has been observed for trypsinogen, chymotrypsinogen consists of a preformed catalytic triad but an immature oxyanion hole, thus leading to a substrate-binding cleft which is partially obstructed. In the chymotrypsin precursor, the Asp194 carboxylate (located in the active site cleft) is H-bonded to His40. In the mature protein, however, Asp194 interacts with the new N-terminus at Ile16. This new interaction causes Asp194 to rotate approximately 180° about its own Cα-C bond. This conformational change causes the backbone amide nitrogens of Gly193 and Ser195 to protrude further into the substrate-binding cleft and contribute to the oxyanion hole as hydrogen donors to the substrate carbonyl oxygen. In addition, the side-chain of Met192 changes its position from a buried position in the zymogen to that of a solvent-exposed residue in the mature enzyme. In summary, the activation of trypsinogen and chymotrypsinogen involves the maturation of a partially formed substrate-binding cleft and oxyanion hole mediated by limited proteolysis at the pro/mature junction due to the activity of a secondary protease.

FIGURE 1: Activation of Chymotrypsinogen. Chymotrypsinogen (inactive, residues 1-245) is cleaved by trypsin to give π-chymotrypsin; chymotrypsin then releases the dipeptides Ser14-Arg15 and Thr147-Asn148 to give α-chymotrypsin, with a conformational change accompanying each step.

PROTHROMBIN ACTIVATION

In order to illustrate the diversity of activation mechanisms found among serine proteases, the process of converting prothrombin to mature thrombin is also discussed. Thrombin is the central protease which triggers the blood coagulation cascade, activates the conversion of soluble fibrinogen of blood plasma to the insoluble fibrin clot (23), and is known to activate blood platelets (24). Hence, the conversion of prothrombin to active thrombin must be under tight regulatory control in order to prevent the production of unwanted blood clots which could ultimately lead to strokes or heart attacks.

Active thrombin is generated via formation of the prothrombinase complex (25) (Figure 2). Upon vascular injury, soluble prothrombin will associate with the surface of blood vesicles through its fragment 1 (or kringle 1 (K1)) domain at the site of haemorrhage with the assistance of Ca2+ ions. Circulating factor Va will then recognize and associate with the fragment 2 (or kringle 2 (K2)) domain of the prothrombin molecule. The complex of prothrombin and factor Va exposes two sites with the sequence Ile-Glu-Gly-Arg, which are then recognized and cleaved by the active protease, factor Xa (26). These sites, however, are not accessible in the absence of factor Va, and major conformational changes are, therefore, required within the thrombin precursor. Following this limited proteolysis, further autocatalytic hydrolysis takes place to produce active thrombin. Hence, the efficient conversion of prothrombin to active thrombin requires the simultaneous presence of membrane, Ca2+ ions, factor Va, and the activity of a secondary protease, factor Xa.

FIGURE 2: Prothrombinase Complex
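As an illustration of how the factor Xa recognition sequence Ile-Glu-Gly-Arg might be located in a primary structure, the following minimal Python sketch scans a one-letter-code sequence for the tetrapeptide IEGR and reports the position of each Arg after which cleavage would occur. The function name and the toy sequence are hypothetical and are not taken from prothrombin; the sketch also ignores the accessibility requirement (binding of factor Va) discussed above.

```python
def find_factor_xa_sites(sequence: str, motif: str = "IEGR") -> list[int]:
    """Return 1-based positions of the final motif residue (the Arg),
    i.e. the residue after which cleavage would occur."""
    sites = []
    start = sequence.find(motif)
    while start != -1:
        # start is 0-based, so start + len(motif) is the 1-based Arg position
        sites.append(start + len(motif))
        start = sequence.find(motif, start + 1)
    return sites


# Hypothetical toy sequence containing two IEGR motifs (not prothrombin)
toy_sequence = "MAIEGRTSGGIEGRIV"
print(find_factor_xa_sites(toy_sequence))
```

A sequence match alone does not imply cleavage: as noted in the text, the two sites in prothrombin become accessible only in the complex with factor Va.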

PROPROTEIN CONVERTASES (PCs)

Recently, seven mammalian serine proteases homologous to the yeast kexin and bacterial subtilisins, referred to as proprotein convertases (PCs), have been identified and shown to be implicated in the maturation of a diverse array of prohormone polypeptides and proprotein precursors (27). Located mainly within the trans-Golgi network, these enzymes cleave a variety of precursors at the consensus (Arg/Lys)-(Xaa)n-Arg↓ sequence, where Xaa is any amino acid except Cys and n = 0, 2, 4, or 6 (28). Common examples of substrates for the PCs include α- and γ-endorphin (29), the metalloprotease ADAM-10 (30), as well as the amyloidogenic peptides Aβ40, -42, and -43 (31). All zymogens belonging to the PC family of serine proteases are composed of an N-terminal signal peptide, followed by a prosegment, a catalytic domain, a P-domain (whose function is not well understood), and an enzyme-specific C-terminal segment. Although PCs are specialized to process other precursors, it is interesting to note that PCs are themselves biosynthesized as proproteins in order to regulate their hydrolytic activity. The prosegment is thought to act as a molecular chaperone serving to promote proper folding of the convertases in the endoplasmic reticulum. The prosegment is cleaved by an autocatalytic mechanism (the molecular basis for this conversion is not well understood) and acts as a non-covalent inhibitor until the enzyme is safely channelled into the trans-Golgi network.
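The consensus rule for PC cleavage sites can be written down as a short search procedure. The Python sketch below is illustrative only: it assumes the consensus reads (Arg/Lys)-(Xaa)n-Arg with cleavage immediately after the final Arg, Xaa being any residue except Cys and n = 0, 2, 4, or 6. The function name and the test sequences are hypothetical.

```python
def find_pc_sites(sequence: str, spacers=(0, 2, 4, 6)) -> list[int]:
    """Return sorted 1-based positions of the P1 Arg for every match to
    (Arg/Lys)-(Xaa)n-Arg, where Xaa is not Cys and n is in `spacers`."""
    sites = set()
    for i, residue in enumerate(sequence):
        if residue not in "RK":
            continue  # the motif must start with Arg or Lys
        for n in spacers:  # spacers are assumed to be in ascending order
            p1 = i + 1 + n  # 0-based index of the candidate P1 Arg
            if p1 >= len(sequence):
                break  # longer spacers would run past the end as well
            spacer = sequence[i + 1 : p1]
            if sequence[p1] == "R" and "C" not in spacer:
                sites.add(p1 + 1)  # convert to 1-based numbering
    return sorted(sites)


# Hypothetical examples: a dibasic Lys-Arg site (n = 0) and an Arg-Xaa-Xaa-Arg site (n = 2)
print(find_pc_sites("AAKRSAA"))
print(find_pc_sites("RAAR"))
```

As with any motif scan, a match identifies only a candidate site; whether a given convertase cleaves it in vivo depends on factors that the consensus does not capture.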

ASPARTIC PROTEASES

The physiological roles of aspartic proteases include the digestion of dietary proteins in the stomach of mammals, the regulation of blood pressure (e.g., renin (32)), and maturation of the retroviral Gag polyprotein to the structural, regulatory and enzymatic proteins implicated in the stability and replication of the HIV virion (33). Common to all members of the aspartic protease family is the dimerization of a compact β-barrel domain to form the mature enzyme. The active site is composed of two aspartate residues (Asp32 and Asp215 in pepsin; Asp32 and Asp217 in gastricsin) which reside in close proximity to one another within a hydrophobic binding cleft formed between the two β-barrel domains. Interestingly, a water molecule is hydrogen-bonded to both aspartate residues. These proteases perform their catalytic activity using a single displacement mechanism whereby the H2O molecule (deprotonated) acts as a nucleophile and attacks the carbonyl carbon of the scissile amide bond (34,35). This is in contrast to the serine proteases, which utilize a double displacement mechanism involving nucleophilic catalysis by a residue of the enzyme and the formation of a covalent complex with the substrate. Furthermore, serine proteases use an H2O molecule to assist in the deacylation step. The X-ray crystal structures of porcine pepsinogen (36), human pepsinogen A (37), and human progastricsin (38) have been determined. Within the family of aspartic proteases there exists an excellent example of an intermediate of progastricsin processing (39) which may be stabilized by adjusting the pH to neutral conditions. Zymogens belonging to this family typically possess a positively charged N-terminal prosegment (> 40 residues) that interacts with the negatively charged catalytic domain using several salt bridges. In human progastricsin, Lys37p forms an H-bonded salt bridge to both aspartate residues located in the active site. These salt bridges allow the prosegment residues Pro34p to Arg39p to block access of natural substrates to the preformed active site. Similar to the precursors belonging to the papain family (described later), activation of the aspartic protease zymogens requires the destabilization of salt bridges caused by their exposure to low pH, followed by limited proteolysis of the prosegment. Unlike the precursors of the papain family (described later), however, zymogens of the aspartic protease family are distinguished by a major (irreversible) conformational rearrangement within the N-terminus of the mature enzyme segment which occurs as a consequence of activation.

Upon exposure of the gastricsin precursor to low pH, for example, there is loosening of the prosegment/enzyme complex involving the unfolding of residues Phe26p-Leu43p, which bind through the active site cleft, thus leading to the formation of intermediate 1 (40) (Figure 3). The first proteolytic event involves intramolecular autocatalytic cleavage of the peptide bond between Phe26p and Leu27p, followed by intermolecular autocatalytic cleavage at the pro/mature junction (Leu43p-Ser1), resulting in the formation of intermediate 2. Residues Ala1p-Phe26p remain temporarily associated non-covalently with the mature segment (40). Intermediate 2 is unique in that it may be stabilized by increasing the pH to neutral conditions, which inhibits the catalytic activity of these enzymes and prevents further processing to form mature protein. Interestingly, the catalytic H2O molecule was observed to bind to the two aspartate residues in intermediate 2. In summary, the catalytic centers and substrate-binding clefts found among the precursors of aspartic proteases are preformed and functional, and lowering the pH to which they are exposed results in the protonation of the carboxylate side chains of aspartate and glutamate residues, disrupting the salt bridges which stabilize the association between the prosegment and enzyme.

FIGURE 3: Progastricsin Activation
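The pH dependence of carboxylate protonation summarized above can be illustrated with the Henderson-Hasselbalch relationship, in which the protonated fraction of an acidic group equals 1 / (1 + 10^(pH - pKa)). The Python sketch below is a rough illustration only: the pKa of 3.9 is an assumed, typical value for a free aspartate side chain, and the pKa values of salt-bridged carboxylates in a folded zymogen may differ substantially.

```python
def fraction_protonated(ph: float, pka: float) -> float:
    """Fraction of an acidic group in its protonated (neutral) form,
    from the Henderson-Hasselbalch relationship."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


# Assumed, illustrative pKa for an aspartate side chain (not a measured value)
ASSUMED_ASP_PKA = 3.9

for ph in (2.0, 3.9, 7.0):
    fraction = fraction_protonated(ph, ASSUMED_ASP_PKA)
    print(f"pH {ph:.1f}: {fraction:.3f} of carboxylates protonated")
```

Under this assumption, the carboxylate is almost entirely deprotonated (and able to form a salt bridge) at neutral pH and almost entirely protonated at pH 2, in line with the low-pH loosening of the prosegment/enzyme complex described for progastricsin.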

ZINC METALLOPROTEASES

PROCARBOXYPEPTIDASE A AND B

Zinc metalloproteases such as carboxypeptidase A and carboxypeptidase B are found in the pancreas and serve to degrade proteins in the alimentary tract of mammals (41) by cleaving peptide bonds at the C-termini of polypeptide substrates. These enzymes contain a Zn2+ ion at the active site that is directly involved in the catalytic mechanism. The Zn2+ ion is coordinated by three residues located within the substrate-binding cleft of these enzymes: His69, Glu72, and His196. Similar to the aspartyl proteases, the mechanism of peptide hydrolysis for the pancreatic zinc metalloproteases involves the activation of a water molecule by Glu270 (procarboxypeptidase A numbering) followed by nucleophilic attack of the scissile amide bond by the hydroxide (42,43). The positively charged environment formed by the residues located within the substrate-binding cleft, including the Zn2+ ion itself, assists the hydrolytic reaction by neutralizing the developing negative charge of the tetrahedral intermediate.

The prosegments of the zinc carboxypeptidases are approximately 95 residues in length. Similarly to chymotrypsinogen, activation of procarboxypeptidases is initiated by limited proteolysis at the pro/mature junction by active trypsin. The pro/mature junction in procarboxypeptidase A (Arg99p-Ala1) is accessible to direct recognition and proteolysis since it is localized to a conformationally free loop (i.e., with high B-factors) (44,45). The prosegment is further degraded by trypsin and mature carboxypeptidase A at internal sites that are not accessible in the zymogen (at Arg74p-Tyr75p in procarboxypeptidase A and at Arg83p-Ser84p in procarboxypeptidase B). Therefore, major conformational changes must occur within the prosegment during conversion of these zymogens, but none have been observed within the catalytic domain.

PRO-STROMELYSIN-1

As opposed to carboxypeptidases A and B, stromelysin-1 is a Zn2+ endopeptidase, i.e., it cleaves at internal sites within a polypeptide substrate, and is a member of the family of matrix metalloproteases (MMPs). MMPs function at neutral pH and are involved in the degradation of connective tissue for the purpose of regulating tissue homeostasis (6,46). The deregulated activity of this family of enzymes has been associated with tumor metastasis (6). The prosegment of this zymogen is composed of 82 residues, and the C-terminus of the enzyme contains a domain that has been proposed to mediate interactions with macromolecular substrates and inhibitors (47). The X-ray crystal structure of pro-stromelysin-1 (47) reveals that the catalytic Zn2+ is directly coordinated by a residue from the prosegment, Cys75p. This interaction utilizes the sulfur atom from the side chain of Cys75p and thereby prevents catalytic activity for this zymogen. Similar to the prosegments found among precursors of the papain family (discussed later), the prosegment of pro-stromelysin-1 binds through the enzyme's active site cleft in the reverse substrate-binding mode.

The conversion of pro-stromelysin-1 to its mature protein involves limited proteolysis at the pro/mature junction (His82p-Phe1) by other proteolytic enzymes, and cleavage sites within the prosegment (e.g., Glu68p-Val69p, α-helix 3p) have also been identified (48). The only significant conformational change takes place in the loop containing the pro/mature junction. Following cleavage, the loop undergoes a conformational rearrangement that results in salt bridge formation between the newly liberated N-terminus at Phe1 and Asp237 in a manner similar to that observed for mature serine proteases. This salt bridge, however, is 12 Å away from the catalytic Zn2+ and is not expected to affect the active site or the catalytic activity of the enzyme. In contrast, the newly formed salt bridge among serine proteases (trypsin, chymotrypsin; discussed previously) is crucial for the maturation of the oxyanion hole.

CYSTEINE PROTEASES

CASPASE FAMILY

Apoptosis is a form of cell death that is vital for morphogenesis, tissue homeostasis, and host defense (49). Disruptions in the apoptotic program are associated with neurodegenerative disorders, where there is excessive cell death, and cancer, where there is insufficient cell death. Key mediators that initiate and execute the apoptotic program are members of the aspartate-specific family of cytoplasmic cysteine proteases, known as caspases (50,51). In order to regulate their activities, these apoptotic catalysts are synthesized as inactive precursors. This family of precursors is divided into two classes: Class I (apical or initiator enzymes), such as caspases-2, -8, -9, and -10, possess long amino-terminal prodomains in excess of 100 residues, and Class II (executioner enzymes), such as caspases-3, -6, and -7, possess relatively short prodomains (52,53). The zymogens of apical caspases may be recruited by specific adaptor molecules. For example, procaspase-8 and -10 are recruited at the cytosolic side of death receptors (e.g., Fas receptor) via their interaction with the adaptor molecule FADD (54), whereas procaspase-9 has been associated with the mitochondria in the signaling pathway leading to apoptosis (55,56). Processing of procaspases into the mature heterotetramer product, (p20p10)2, requires specific cleavage after aspartic acid residues located within the interdomain linkers of the protein (57). Furthermore, recent studies suggest that the long prodomains in some Class I caspases are able to mediate dimerization of procaspase molecules, thereby promoting autoprocessing (58). To date, no three-dimensional structure has been solved for any of these zymogens, and the mechanism of activation of procaspases in a particular apoptotic pathway is poorly understood. The catalytic activity of mature initiator enzymes is subsequently used to process downstream executioner enzymes in trans. Hence, there exists an elaborate hierarchy of zymogen activation among the caspase family of cysteine proteases. Accessory proteins are also utilized as death stimuli. For example, a regulator of apoptosis termed Apaf-1 (a mammalian homolog of C. elegans CED-4 (59)) is capable of binding to and activating procaspase-9 in the presence of cytochrome c and dATP, thus leading to the activation of caspase-3. In addition, the serine protease granzyme B, derived from cytotoxic T cells, triggers activation of both procaspase-3 and -7 (59). Much progress remains to be made, however, in elucidating the molecular mechanisms of processing utilized by the caspase family of zymogens.

PAPAIN FAMILY

Papain, derived from the dried latex of the tropical papaya fruit, is the canonical enzyme of a family of cysteine proteases whose membership continues to grow. Much interest has been placed on papain as a model for protein structure and function studies. Papain is a common ingredient in meat tenderizers since it is an efficient endopeptidase and is safely digested by the stomach. This enzyme has also been used to prevent chill haze during beer production. Papain-like enzymes are found in bacteria, yeast, plants, and mammals. In addition, a papain-like gene has been identified in the genome of the baculovirus Autographa californica Nuclear Polyhedrosis Virus (60). The mammalian homologs of papain include cathepsins B (61), C (9,10), H (62), K (63), L (64), S (65), W (66) and X (67). With some notable exceptions, these enzymes are targeted to the mature lysosome using the mannose-6-phosphate receptor (68) and have been shown to be responsible for general intracellular protein turnover (69), bone resorption (70), cartilage proteoglycan breakdown (71), and antigen processing within the endosomal system (72). Recently, cathepsin L has been found to be a critical protease for invariant chain degradation in thymic epithelium (73,74). Degradation of the invariant chain was found to be perturbed in cathepsin L-deficient mice, thus leading to impairment of positive selection of CD4+ T cells caused by a reduced repertoire of major histocompatibility-foreign peptide complexes on the cortical epithelial cells.

Cathepsin S is an enzyme which fulfills a similar role to that of cathepsin L but mediates negative selection within peripheral bone marrow-derived antigen-presenting cells in the thymic medulla (74). Furthermore, cathepsin K has been found to be very highly expressed in osteoclasts, thus suggesting a role for this enzyme in bone remodelling (75). Deficiency in the cathepsin K enzyme has been linked to a disease known as pycnodysostosis (or Pycno), which is a rare recessive trait characterized by bone fragility and short stature (76). Due to their ability to cleave type I collagen, elastin, and proteoglycans, enzymes such as cathepsin B are also believed to have a role in tumor progression (77). Therefore, these enzymes constitute interesting targets for therapeutic intervention. Although inhibitors of sufficient potency are available, they often lack the selectivity required for therapeutic applications.

Enzymes belonging to the papain family perform catalysis by utilizing a catalytic thiolate-imidazolium ion-pair. Nucleophilic attack on the carbonyl carbon of the peptide bond to be cleaved is performed by the sulfur atom of the catalytic cysteine, Cys25 (papain numbering; Cys29 in cathepsin B). The resulting tetrahedral intermediate carries a formal negative charge on the oxygen, which is stabilized by the oxyanion hole formed by Gln19 (papain numbering; Gln23 in cathepsin B). The departure of the leaving group leads to the formation of a covalent acyl-enzyme intermediate. The catalytic imidazole, His159 (papain numbering; His199 in cathepsin B), acts as a weak base and serves to deprotonate a water molecule. The covalent intermediate is then attacked by the hydroxyl group, causing the formation of the final product and the regeneration of free enzyme (78). In addition to the thiolate-imidazolium ion-pair, each enzyme possesses an extended substrate-binding cleft capable of accommodating several substrate residues simultaneously and accounting for a very modest degree of substrate selectivity. For example, cathepsin B prefers a basic amino-acid residue such as Arg in the P1 position and a hydrophobic Phe group in the P2 position (79). In general, however, designing an inhibitor which is highly selective for a given enzyme within this family constitutes a challenge due to their relatively broad substrate specificity.

Most members of the papain family function as endoproteases. Cathepsins B, C, and H, however, may exhibit unique exopeptidase activities. Cathepsin C is known to oligomerize into a tetramer and function as an aminodipeptidase (9), whereas cathepsin H functions as a mono-aminopeptidase (80,81). Cathepsin B is capable of carboxydipeptidase activity, i.e., the removal of dipeptides from the C-terminus of an extended polypeptide substrate (79). The X-ray crystal structure of cathepsin B (82), the most abundant lysosomal cysteine protease, reveals the molecular basis for the unique exopeptidase activity of this enzyme. Cathepsin B shares a common three-dimensional fold with other papain-like enzymes but also contains a novel insertion of over twenty residues. Within this insertion is found an exposed disulfide loop (residues Cys108-Cys119), referred to as the occluding loop, which partially 'occludes' the portion of the substrate-binding cleft that would otherwise bind substrate residues to the C-terminal side of the scissile amide bond (discussed in Chapter 1). Two specialized histidines located within the occluding loop, His110 and His111, are strategically positioned such that both of their side chains are directed into the substrate-binding cleft facing the S2' subsite. Therefore, the orientation of the two histidines allows them to accept the negatively charged P2' carboxylate of the bound substrate. Conversely, the unusual mono-aminopeptidase activity documented for cathepsin H is attributable to the exclusion of natural substrates from the unprimed subsites of this enzyme's substrate-binding cleft (83) (discussed in Chapter 3).

To protect themselves from unwanted digestion, eukaryotic cells synthesize these proteases as latent proproteins. These proenzymes carry polypeptides of various lengths at the N-terminus of the protease that act as potent pH-dependent inhibitors of the parent enzymes. The prosegments have also been shown to be strictly required for the expression of native proteases and to promote correct protein folding in vivo and in vitro (84). These proregions may also act as intramolecular chaperones and help to stabilize the enzymes upon exposure to neutral or alkaline pH environments (84). Similar to the caspase family, the papain-like enzymes have been divided into subfamilies based on the lengths of their prosegments. For example, most precursors belonging to the cathepsin L-subfamily possess over 90 residues within their prosegments (refer to Figure 4 on next page). Upon their conversion to mature protein, they generally display endopeptidase activity with varying substrate specificity.

Sequence identity among the cathepsin L-like proregions is lower than that observed for the mature enzyme domains. However, cathepsin L-like prosegments display a higher level of sequence homology within the central region composed of residues 21p-77p, including the ERFNIN motif (Glu27p-X3-Arg-X3-Phe-X2-Asn-X3-Ile-X3-Asn46p), which forms part of an extended α-helix (called α2p), and the (G/A)NFD segment (Gly59p-X1-Asn-X1-Phe-X1-Asp65p), which binds between the primed subsites of the enzyme's substrate-binding cleft and the hydrophobic prosegment-binding loop (also called the exosite) (85-88). The second subfamily is that of cathepsin B, of which this enzyme is the only member. The cathepsin B precursor is distinguished by a shorter prosegment of only 62 residues (refer to Figure 5) which lacks the ERFNIN motif, yet contains a motif (Gly27p-X1-Asn-X1-Tyr-X2-Asp34p; cathepsin B prosegment numbering) which binds between the primed subsites of the active site cleft and the prosegment-binding loop of cathepsin B (89-91), i.e., reminiscent of the GNFD segment within the cathepsin L-like prodomains. Furthermore, another unique feature associated with cathepsin B is an insertion of over twenty residues located within the catalytic domain, termed the occluding loop, which has been implicated in the dipeptidyl carboxypeptidase activity of this enzyme (discussed previously).
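As a rough illustration of how the residue spacing quoted for these motifs could be applied when scanning a sequence, the sketch below encodes the three motifs as regular expressions. Only the spacing comes from the text; the pattern labels, the helper `find_motifs`, and the test string are hypothetical, and the toy sequence is synthetic serine filler, not a real prosegment sequence.

```python
import re

# Spacing taken from the motif descriptions in the text; labels are illustrative only.
MOTIFS = {
    # Glu-X3-Arg-X3-Phe-X2-Asn-X3-Ile-X3-Asn (cathepsin L-like prosegments)
    "ERFNIN": re.compile(r"E.{3}R.{3}F.{2}N.{3}I.{3}N"),
    # (Gly/Ala)-X1-Asn-X1-Phe-X1-Asp (cathepsin L-like prosegments)
    "GNFD": re.compile(r"[GA].N.F.D"),
    # Gly-X1-Asn-X1-Tyr-X2-Asp (cathepsin B prosegment)
    "GNYD": re.compile(r"G.N.Y.{2}D"),
}


def find_motifs(sequence):
    """Return {motif label: [(1-based start, matched text), ...]} for a one-letter sequence."""
    hits = {}
    for name, pattern in MOTIFS.items():
        hits[name] = [(m.start() + 1, m.group()) for m in pattern.finditer(sequence)]
    return hits


# Synthetic sequence (NOT a real prosegment): serine filler around two motifs.
toy = "SS" + "ESSSRSSSFSSNSSSISSSN" + "SSSS" + "GSNSFSD" + "SS"
print(find_motifs(toy))
```

A pattern match of this kind only reports that the spacing is satisfied; it says nothing about whether the segment actually adopts the helical or extended conformation described for the crystal structures.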

The X-ray crystal structures of precursors belonging to the papain family (85-91) reveal a conserved mode of prosegment binding and a conserved mechanism of inhibiting the activities of this family of cysteine proteases (refer to Figures 6 and 7). The prosegments tend to be composed of more secondary (α-helical) structure within their N-terminal ends (residues 5p-75p) than their C-terminal ends (residues 76p-96p) (cathepsin L prosegment numbering). The bound prodomain is composed of a four-turn α-helix (α1p: 5p-19p) followed by a long amphipathic α-helix consisting of 7.5 turns (α2p: 25p-51p). Furthermore, the end of helix α2p turns and folds back into an extended conformation, forming a hairpin structure (β1p) where residues 56p-59p reside along the surface of the enzyme. Before reaching the substrate-binding cleft of the enzyme, the prodomain continues into a third α-helix composed of only two turns (α3p: 68p-75p). In general, residues 76p-96p stretch along the surface of the enzyme in an extended conformation from the substrate-binding cleft to the pro/mature junction.

Enzyme residues in contact with the prosegments are located within three major surfaces. The first major contact region is the active site cleft of the enzyme, which accommodates residues 75p-81p that adopt an extended conformation and bind in a direction opposite to that expected for natural substrates, i.e., reminiscent of the Zn2+ endopeptidase pro-stromelysin-1 discussed previously. Hence, the activities of these enzymes are inhibited by the ability of the prosegments to block the access of natural substrates to the catalytic center. The second enzyme interface for prosegment interactions is the primed subsites of the substrate-binding cleft, which is required for accommodating residues 63p-75p. The third major contact region is the hydrophobic prosegment-binding loop (also referred to as the exosite), which interacts with residues 54p-58p. Key prodomain residues at the enzyme interface are Phe56p (homologous to Trp24p in procathepsin B) and Gly77p (Gly43p; cathepsin B prosegment numbering) (92). Position 56p is occupied by an aromatic residue (Phe or Tyr; Trp in procathepsin B) which nestles into the hydrophobic prosegment-binding loop (exosite) of the enzyme and plays a critical role in stabilizing the N-terminal cap of the prodomain on the surface of the enzyme. The residue in position 77p is usually a small uncharged residue (Gly or Ala) which allows deep penetration of the prosegment into the substrate-binding cleft of the enzymes, i.e., in the reverse substrate-binding mode.

The stability of the prosegment-enzyme complex in cathepsin L-like precursors also relies heavily on the formation of salt bridges involving highly conserved residues of the prodomain, i.e., between Glu27p and Arg31p with Glu70p as well as between Arg31p and Asp65p, which help helices α1p and α2p to fold into close proximity to helix α3p (85-88).

The possibility of having mistargeted enzymes, i.e., enzymes misrouted to the extracellular matrix rather than to the lysosome, may lead to severe pathological conditions. As stated previously, aberrant proteolytic activity for cysteine proteases belonging to the papain family has been implicated in several disease states such as rheumatoid arthritis (6), osteoporosis (76), and cancer metastasis (77). Therefore, organisms are required to possess 'self' defense mechanisms against the potentially destructive activities of these enzymes. Cysteine proteases of the papain family are strongly but reversibly inhibited by the cystatins (dissociation constants in the femtomolar to picomolar range), a superfamily of proteins composed of 120 residues and homologous to chicken cystatin (93). For example, cystatin C is present in all human body fluids but is most abundant in cerebrospinal fluid and seminal plasma (94). The molecular basis for the interactions of cystatins with members of the papain family is distinct from the strategies utilized by the prodomains in the zymogen structures. Similarly to the prosegments, cystatins inhibit these enzymes by blocking access of incoming substrates to the substrate-binding cleft. However, as opposed to the prodomains, which extend through the active site cleft in the reverse substrate-binding mode, cystatins utilize three exposed loops (a tripartite interaction) which combine to form a wedge-shaped hydrophobic edge that is highly complementary in shape to the active-site cleft of enzymes belonging to the papain family (95) (refer to Figure 8). Significantly, the protrusion of Gly11 (amino-terminal segment) into the S2 subsite is critical for the stability of this 1:1 complex (95). Furthermore, the first hairpin loop is composed of a highly conserved QVVAG region (Q55-IVAG in human cystatin C) which is flanked on both sides by the amino-terminal segment (containing Gly11) and a second hairpin loop. Curiously, the QVVAG region is adjacent to another conserved motif found in the primary amino-acid sequence of the protein, Gly59-X-Asn-X-Phe-X-Asp, which is reminiscent of the GNFD segments found within the prodomains of the papain family (96).

The majority of precursors belonging to the papain family are capable of autoactivating upon their exposure to acidic pH environments, as is the case within the mature lysosome, which results in the production of mature (active) enzyme with an N-terminal sequence which begins at or near the pro/mature junction, i.e., the area of the prosegment with the least secondary structure.

Furthermore, subjecting inactive propapain (Cys25Ser) (97), procathepsin B (Cys29Ser) (98), and procathepsin L (Cys25Ser) (99) to catalytic quantities of their corresponding mature enzyme leads to correct cleavage of the precursor to form mature protein, i.e., mature enzymes belonging to the papain family are capable of maturing their own precursors via intermolecular cleavage (processing in trans). In addition, kinetic analyses are suggestive of an event which is independent of proenzyme concentration, i.e., a unimolecular mechanism of processing (97-100). Chemically synthesized peptides (termed propeptides) with a sequence identical to that of the proregions of rat cathepsin B (102), human cathepsin L (103), and human cathepsin S (104) are much weaker inhibitors of the parent enzyme at low pH (pH 4.0) than at neutral pH values (pH 6.0). Hence, the weaker affinities of the enzymes for their propeptides correlate with the observation that this family of zymogens autoprocesses faster at low pH than at neutral pH (97-100). These results suggest that the ionization of one or more carboxylic groups is important in regulating the stability of the prosegment/enzyme complex, along with the concomitant release of the prodomain due to proteolytic processing at or near the pro/mature junction. The partial or complete digestion of the free propeptide is also necessary in order to ensure that activation of these enzymes is irreversible.

This thesis addresses some of the structural requirements for efficient processing to occur among zymogens belonging to the papain family. This has been achieved by studying unique members of the papain superfamily, namely procathepsin B and procathepsin H. Initially, this project was highlighted by the finding of Quraishi, O. that the chemically synthesized propeptide of cathepsin B (residues 1p-56p) was a more potent inhibitor of this enzyme upon the deletion of the exposed occluding loop (ΔCys108-Cys119; termed the M1 mutant). (Please refer to reference (105), entitled 'Role of the Occluding Loop in Cathepsin B Activity' by Illy, C., Quraishi, O., Wang, J., Purisima, E., Vernet, T., and Mort, J.S. (1997) J. Biol. Chem. 272, 1197-1202.) In Table IV of this article, a 50-fold improvement in the affinity of the propeptide for the surface of the M1 mutant was observed at pH 6.0 in comparison to the wild-type enzyme. A similar improvement was determined by Illy, C. for the binding affinity of cystatin C to the M1 mutant. Therefore, in addition to 'occluding' the active site cleft in cathepsin B to extended polypeptide substrates, the occluding loop was also found to obstruct the binding of cathepsin B inhibitors (the cathepsin B prodomain and cystatin C). In addition, it was observed that the M1 mutant autoprocessed much more slowly than the wild-type precursor, i.e., complete maturation of the M1 precursor required 5 days at pH 5.0 and 4°C as opposed to several hours for wild-type procathepsin B under the same conditions. These preliminary data led to further investigation of the role of the occluding loop in regulating the pH dependence of propeptide binding to cathepsin B as well as procathepsin B processing. Chapter 1 of this thesis, entitled 'The Occluding Loop in Cathepsin B Defines the pH Dependence of Inhibition by Its Propeptide', has been accepted for publication (refer to reference (106)). This chapter identifies a critical salt bridge on the enzyme (Asp22-His110) which stabilizes the occluding loop on the surface of cathepsin B. As a consequence, perturbation of this salt bridge through site-directed mutagenesis essentially eliminates the pH dependence of propeptide binding and also has a profound effect on the overall rate of procathepsin B processing.

Chapter 2 of this thesis, entitled 'Identification of Internal Autoproteolytic Cleavage Sites Within the Prosegments of Recombinant Procathepsin B and Procathepsin S', provides evidence for a slow unimolecular mechanism of processing for this family of zymogens. This was achieved using the ability of human cystatin C to impede the rapid intermolecular cascade of autoproteolysis, thus favoring the observation of intramolecular reactions. Inspection of the X-ray crystal structures for this family of zymogens (85-91) indicates that the N-termini of the novel processing intermediates identified in this study correspond to an area of the prosegments which binds through the substrate-binding clefts of these enzymes, and that the sites of intramolecular hydrolysis occur within a stretch of prosegment residues which are in close proximity to the catalytic center. Finally, Chapter 3, entitled 'Functional Expression of Human Procathepsin H in Pichia pastoris and Attempts at its Correct Processing', highlights a unique member of the cathepsin L-subfamily (approximately 40% sequence homology to procathepsin L) which displays mono-aminopeptidase activity upon its maturation. The exopeptidase activity of this enzyme has been attributed to an octapeptide mini-chain, derived from the C-terminus of the cathepsin H prodomain, which remains attached to the main body of the enzyme via Cys82p and Cys214 following activation of the cathepsin H precursor. Therefore, in addition to the covalent attachment at the pro/mature junction, procathepsin H is an unusual mammalian homolog of papain-like enzymes in that it also contains a second covalent attachment which links the prosegment to the enzyme, thus restricting the conformational mobility of prosegment residues near the pro/mature junction. The results of this chapter illustrate that, unlike procathepsin L, the cathepsin H precursor is incapable of autoprocessing and that mature cathepsin H is not independently responsible for the conversion of its own precursor. It has also been established that the pro/mature junction in procathepsin H is highly resistant to proteolysis by mature cathepsins B, D, H, K, L, and S. Furthermore, it is shown that the mature mini-chain (derived from the C-terminus of the cathepsin H prosegment) optimizes but is not strictly required for the aminopeptidase activity displayed by this enzyme.

CHAPTER 1

Chapter 1: Contributions of Authors other than Omar Quraishi

Dorit K. Nägler - Provided the P. pastoris transformed with human wild-type and occluding loop mutants of cathepsin B.

Ted Fox - Performed experiments which led to the data provided in Table 2 of this chapter.

J. Sivaraman and Miroslaw Cygler - Provided the coordinates of the crystal structure of rat procathepsin B and assisted with the preparation of Figure 2 of this chapter.

John S. Mort - Provided the original cDNA of procathepsin B.

Andrew C. Storer - Provided funding and mentorship for this project.

The Occluding Loop in Cathepsin B Defines the pH Dependence of Inhibition by Its Propeptide†

†NRCC Publication No. 00000. The research was funded in part by the Government of Canada's Network of Centres of Excellence Program supported by the Medical Research Council of Canada and the Natural Sciences and Engineering Research Council of Canada through PENCE Inc. (the Protein Engineering Network of Centres of Excellence).

Omar Quraishi, Dorit K. Nägler, Ted Fox, J. Sivaraman, Miroslaw Cygler, John S. Mort, and Andrew C. Storer

Protein Engineering Network of Centres of Excellence and Department of Biochemistry, McGill University, 3655 Drummond Street, Montreal, Quebec, Canada H3G 1Y6; Pharmaceutical Biotechnology Sector, Biotechnology Research Institute, National Research Council Canada, 6100 Royalmount Avenue, Montreal, Quebec, Canada H4P 2R2; Joint Diseases Laboratory, Shriners Hospital for Children, 1529 Cedar Avenue, Montreal, Quebec, Canada H3G 1A6; and Protein Engineering Network of Centres of Excellence and Department of Surgery, McGill University, Montreal, Quebec, Canada.

Address correspondence to this author at the Biotechnology Research Institute, 6100 Royalmount Avenue, Montreal, Canada H4P 2R2. E-mail: [email protected].

Present address: Vertex Pharmaceuticals Inc., 130 Waverly St., Cambridge, Massachusetts, USA 02139.

RUNNING TITLE: Propeptide Inhibition of Cathepsin B Mutants

ABBREVIATIONS:

Residue numbering relates to mature human cathepsin B.

The abbreviations used are: Z-Phe-Arg-MCA, benzyloxycarbonyl-L-phenylalanyl-L-arginine 4-methylcoumarinyl-7-amide; SDS-PAGE, sodium dodecyl sulfate-polyacrylamide gel electrophoresis; DMSO, dimethyl sulfoxide; DTT, dithiothreitol; Δ(Cys108-Cys119) corresponds to the occluding loop deletion mutant as reported in Illy, C., Quraishi, O., et al., Mort, J.S. (Ref. 105).

In the text, the words "proregion" and "prosegment" refer to the polypeptide stretch located N-terminal to the mature enzyme in the proenzyme, while the word "propeptide" refers to the chemically synthesized polypeptide corresponding to the proregion sequence but without the mature enzyme.

Abstract:

Papain-like proenzymes are prone to autoprocess under acidic pH conditions. Similarly, peptides derived from the proregion of cathepsin B are potent pH-dependent inhibitors of that enzyme; i.e., at pH 6.0 the inhibition of human cathepsin B by its propeptide is defined by slow binding kinetics with a Ki of 3.7 nM and at pH 4.0 by classical kinetics with a Ki of 82 nM. This pH dependency is essentially eliminated either by the removal of a portion of the enzyme's occluding loop through deletion mutagenesis or by the mutation of either Asp22 or His110 to alanine; e.g., the mutant enzyme His110Ala is inhibited by its propeptide with Ki's of 2.0 ± 0.3 nM at pH 4.0 and 1.1 ± 0.2 nM at pH 6.0. For the His110Ala mutant the inhibition also displays slow binding kinetics at both pH 4.0 and pH 6.0. As shown by the crystal structure of mature cathepsin B [Musil, D., et al. (1991) EMBO J. 10, 2321-2330 (82)], Asp22 and His110 form a salt bridge in the mature enzyme, and it has been shown that this bridge stabilizes the occluding loop in its closed position [Nägler, D.K., et al. (1997) Biochemistry 36, 12608-12615 (107)]. Thus the pH dependency of propeptide binding can be explained on the basis of competitive binding between the occluding loop and the propeptide. At low pH, when the Asp22-His110 pair forms a salt bridge stabilizing the occluding loop in its closed conformation, the loop competes more effectively with the propeptide than at higher pH, where deprotonation of His110 and the concomitant destruction of the Asp22-His110 salt bridge results in a destabilization of the closed form of the loop. The rate of autocatalytic processing of procathepsin B to cathepsin B correlates with the affinity of the enzyme for its propeptide rather than with its catalytic activity, thus suggesting a possible influence of occluding loop stability on the rate of processing.

Cathepsin B (EC 3.4.22.1) is the most abundant lysosomal cysteine protease and is a unique member of the papain superfamily (108) in that it exhibits both endopeptidase and dipeptidyl carboxypeptidase (exopeptidase) activity. The X-ray crystal structure of mature cathepsin B reveals the molecular basis for this duality (82).

Unlike other papain-like proteases, cathepsin B possesses an extra structural element referred to as the 'occluding loop' which contributes to the primed subsites of the substrate binding cleft. Two critical residues of the occluding loop, His110 and His111, are strategically positioned to 'occlude' extended substrates from binding to the enzyme and to accept the negatively charged carboxylate of the P2' residue at the C-terminus of the substrates. Removal of this exposed disulfide loop (residues Cys108-Cys119) in cathepsin B was shown to increase affinity for the protein inhibitor cystatin C and the cathepsin B propeptide due to unrestricted access of these inhibitors to the active site (105). In addition, this variant of procathepsin B was shown to autoprocess much more slowly than the wild-type enzyme (105). Comparison of the recently determined three-dimensional structures of rat (89) and human (90,91) procathepsin B with mature cathepsin B (82) reveals that the occluding loop is a highly flexible segment of the protein. The backbone of the occluding loop is able to undergo a conformational transition in which the tip of the loop moves by as much as 10 Å. The largest movement is that of the side chain of His111, which is displaced by over 14 Å with a simultaneous large movement of His110.

The X-ray crystal structure of procathepsin B (89-91) also reveals that the 62-residue proregion adopts an extended conformation along the surface of cathepsin B with the majority of close contacts provided by residues 22p-47p of the prosegment (Figure 1 of Chapter 1). Cathepsin B residues in contact with the proregion are located within three major areas. The first major contact region is the substrate-binding cleft of the enzyme, which accommodates residues 41p to 47p of the prosegment (Region 1). In this region, the prosegment adopts an extended conformation and binds in a direction opposite to that expected for natural substrates. The second major site is the primed subsite of the active site cleft, termed the occluding loop crevice, which becomes exposed upon movement of the occluding loop. Exposure of this surface on cathepsin B is required for residues 29p to 40p of the proregion to bind (Region 2). The third major contact region is the hydrophobic prosegment-binding loop (exosite) which interacts with residues 22p to 26p of the prosegment (Region 3). The ability of the proregion to utilize all possible interactions with the surface of cathepsin B, namely with the active site, the occluding loop crevice and the exosite, is therefore highly dependent on the conformational mobility of the occluding loop.

Analysis of the observed conformations which are adopted by the occluding loop (Figure 2 of Chapter 1) shows that specific contacts between His110 and His111 of the occluding loop and the rest of the protein change as the occluding loop shifts from the 'open' conformation found in the proenzyme structure (89-91) to the 'closed' one found in the structure of the mature enzyme (82). When the loop is in the 'open' conformation, the side chains of residues His110 and Asn222 as well as His111 and Asp224 are in close proximity to one another. Following the disposal of the proregion to form the mature enzyme, these contacts are destroyed and new ones are formed. In mature cathepsin B, the side chain of His110 stacks against Trp221 and its Nε forms a salt bridge with Asp22, located in the primed subsite of the active site cleft. The side chain of His111 is closest to Leu181, Val112, and the backbone of Asp224 and Trp225, but no salt bridges are present. Furthermore, Arg116 forms a hydrogen bond with Asp224.

It has been suggested that the expected effect of pH on the stability of the occluding loop, i.e., due to protonation/deprotonation of the stabilizing salt bridges, could in part define the pH dependencies observed for the exo- and endopeptidase activities of cathepsin B (107). In addition, it has been observed that the processing of procathepsin B and the inhibition of mature cathepsin B by its corresponding propeptide exhibit similar pH dependencies; i.e., at neutral pH, which can be expected to favor the 'open' conformation of the loop, procathepsin B is less prone to autoprocessing (109) and the propeptide is a tight binding inhibitor (102), whereas at acidic pH, where the 'closed' conformation of the loop is expected to be favored, the propeptide binds less tightly and the proenzyme autoprocesses more rapidly. Thus it is the goal of this present study to explore the possible links between the pH dependency of: the cathepsin B occluding loop conformational flexibility; the propeptide binding affinity; and procathepsin B autoprocessing.

MATERIALS AND METHODS

The substrate Z-Phe-Arg-MCA was purchased from IAF Biochem International Inc., Laval, Québec. The synthesis and purification of the rat cathepsin B propeptides was as described previously (102). The pepsin-agarose resin was purchased from the Sigma Chemical Company. Recombinant human cystatin C was a generous gift from Dr. Irena Ekiel (Biotechnology Research Institute).

Synthesis and Purification of Human PCB1. Peptide synthesis of the human cathepsin B propeptide (residues 1p to 56p) was carried out using standard Fmoc chemistry on an Advanced Chemtech MPS 396 solid-phase synthesizer. Crude human peptide (hPCB1) was partially purified by HPLC on a Vydac C4 (300 Å) column (5 x 25 cm) using a linear 20-60% acetonitrile gradient at a flow rate of 33 mL/min for 120 min (A = 0.1% TFA in HPLC grade water; B = 0.1% TFA in HPLC grade acetonitrile). Human PCB1 was rechromatographed with a Vydac C18 column (0.46 x 25 cm) using a linear 10-70% acetonitrile gradient (1%/min) at a flow rate of 1 mL/min and stored lyophilized at 4°C. Amino acid analysis was performed using a Beckman model 6300 amino-acid analyzer. Electrospray mass spectral analysis using a Perkin Elmer SCIEX API III spectrometer operated in the positive mode for detection of protonated species confirmed the expected molecular mass of 6512 daltons.

Expression of Wild-Type and Occluding Loop Variants of Human Procathepsin B. In vitro site-directed mutagenesis was performed as described previously (105,107). The oligonucleotide 5'-TGT GAG CAC GCT GTG AAC GGC GCC C-3' (mutated bases are underlined) was used for the His111Ala mutation which introduced a new DraIII site. These cDNA constructs, consisting of wild-type and occluding loop variants of human procathepsin B as a fusion with the preproregion of yeast α-factor, were digested with Xho I and Not I and the proenzyme fragments were subcloned into the pPIC9 vector (Invitrogen Inc., San Diego, California) and expressed in the yeast Pichia pastoris. For integration into the Pichia genome, the pPIC9 based constructs were linearized by cleavage with Bgl II and purified. The P. pastoris host strain GS115 (Invitrogen Inc.) was then transformed with the linearized constructs by electroporation. Positive transformants were grown for 2 days in medium containing glycerol as the carbon source followed by incubation in the presence of methanol for a further 3 days to induce expression of recombinant protein. The consensus sequence for oligosaccharide substitution in the mature protein had been removed by the substitution, Ser115Ala, for all variants of cathepsin B. The site for oligosaccharide substitution within the proregion (Asn21p) was left unaltered. For the purpose of clarity on SDS-PAGE (12% gels), recombinant proenzymes were deglycosylated using endoglycosidase H prior to their purification.

Purification of Procathepsin B. The recombinant proenzymes were purified from the culture supernatant using a hydrophobic resin under non-acidic conditions. The culture supernatant (250 ml) was concentrated to 40 ml using an Amicon stirred-cell (YM-10 membrane). During concentration, the buffer was exchanged to 50 mM Tris (pH 7.4) containing 1.2 M (NH4)2SO4. Concentrated proenzyme samples were then applied to a 10 ml column of butyl-sepharose (Pharmacia Inc.) resin. Proenzyme fractions were eluted from the column by applying a linear gradient of decreasing ammonium sulfate concentration. Procathepsin B (wild-type and mutants) eluted at 0.6-0.8 M (NH4)2SO4 and was stored at 4°C.

Procathepsin B Activation. Both wild-type and His111Ala procathepsin B autoprocessed efficiently against 50 mM sodium acetate (pH 5.0) to form mature enzyme. The occluding loop deletion mutant, Δ(Cys108-Cys119), and variants carrying the mutation Asp22Ala or His110Ala autoprocessed slowly under these conditions. For this reason, these variants of procathepsin B were readily converted to mature enzyme at pH 4.7 using 50 U/mL of pepsin immobilized on agarose resin. Immobilized pepsin was removed by filtration following two hours of incubation with proenzyme. The processed enzymes were purified and their kcat/KM values for Z-Phe-Arg-MCA were obtained as described previously (105,107).

Procathepsin B Autoprocessing. Autoprocessing of purified procathepsin B to form mature enzyme was monitored using SDS-PAGE (12% gels). Wild-type (@ µM), His111Ala (3 µM), His110Ala (7 µM), and Asp22Ala (6 µM) procathepsin B were subjected to acidic pH conditions, i.e. 50 mM sodium acetate buffer (pH 5.0) with 1 mM DTT.

Kinetic Measurements. Kinetic fluorescence measurements were carried out using a Perkin Elmer LS-5B luminescence spectrometer which monitored MCA formation using an excitation wavelength of 380 nm and a detection wavelength of 440 nm. Since the KM of wild-type human cathepsin B for the substrate Z-Phe-Arg-MCA was estimated to be 0.100 mM under these conditions (105), a substrate concentration of 10 µM was used for slow-binding kinetics ([S] << KM). Z-Phe-Arg-MCA concentrations of 10, 20, 40, and 80 µM were used for plots of 1/v vs [inhibitor] (110). The final concentration of cathepsin B mutants was 0.1 nM for each assay, except for the Δ(Cys108-Cys119) mutant at pH 4.0 where a final concentration of 1.5 nM was used due to its low activity under these conditions (105). Unless otherwise stated, assays were performed at 25°C and conditions were 50 mM phosphate (pH 6.0) or acetate (pH 4.0-5.0) buffer containing 0.2 mM EDTA, 1 mM DTT, and 3% DMSO. The enzymes studied were sufficiently stable under the assay conditions used for the time required.
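The plots of 1/v vs [inhibitor] mentioned above can be illustrated with a minimal sketch. It assumes simple competitive inhibition, for which the 1/v vs [I] lines measured at different substrate concentrations cross at [I] = -Ki. The substrate concentrations and KM are those quoted in the text; Vmax, the inhibitor concentrations, the Ki value and all function names are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def competitive_rate(s, i, vmax, km, ki):
    """Initial rate under simple competitive inhibition (assumed model)."""
    return vmax * s / (km * (1.0 + i / ki) + s)


def fit_line(xs, ys):
    """Ordinary least-squares line; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def dixon_ki(inhibitor_concs, rates_by_substrate):
    """Estimate Ki from the common crossing point of the 1/v vs [I] lines."""
    lines = [
        fit_line(inhibitor_concs, [1.0 / v for v in rates])
        for rates in rates_by_substrate.values()
    ]
    crossings = [
        (b2 - b1) / (m1 - m2)  # [I] at which two lines intersect
        for (m1, b1), (m2, b2) in combinations(lines, 2)
    ]
    # For competitive inhibition the lines cross at [I] = -Ki.
    return -sum(crossings) / len(crossings)


KM_UM = 100.0                                    # 0.100 mM, from the text
SUBSTRATE_UM = [10.0, 20.0, 40.0, 80.0]          # from the text
INHIBITOR_NM = [0.0, 25.0, 50.0, 100.0, 200.0]   # hypothetical
TRUE_KI_NM = 50.0                                # hypothetical

# Simulated, noise-free rates (Vmax set to 1.0 in arbitrary units).
rates = {
    s: [competitive_rate(s, i, 1.0, KM_UM, TRUE_KI_NM) for i in INHIBITOR_NM]
    for s in SUBSTRATE_UM
}
ki_estimate = dixon_ki(INHIBITOR_NM, rates)
print(f"Ki recovered from the Dixon plot: {ki_estimate:.1f} nM")
```

With real data the crossing points would scatter, so averaging the pairwise intersections is only one simple choice; a global nonlinear fit would be an alternative.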

Analysis of Data. Under the experimental conditions used, progress curves for the inhibition of the occluding loop mutants by propeptides at pH 6.0 (and at pH < 4.7 for Δ(Cys108-Cys119), His110Ala, and Asp22Ala) followed typical one-step slow-binding kinetics as defined by the equations (111-115), where [P] is the concentration of free MCA formed, vi and vs are the initial and steady-state velocities, respectively, t is the time, and kobs is the rate constant for inhibition. Nonlinear regression using the program Enzfitter (published by Elsevier-Biosoft, Cambridge, U.K.) gave the individual parameters (vi, vs, and kobs) for each progress curve. For each data set, the enzyme-inhibitor dissociation constant (Ki) values were obtained from the relationship vi/vs - 1 = [I]/Ki (116), where the vi represents the initial rate for substrate hydrolysis in the absence of inhibitor. A plot of kobs vs [inhibitor] remains linear over the range of inhibitor concentrations studied (4-40 nM), confirming that inhibition of the occluding loop mutants by both hPCB1 and rPCB1 occurs by a one-step process (102). Due to the near-zero intercept, koff values were calculated using the relationship Ki = koff/kon. Inhibition of wild-type and His111Ala cathepsin B at pH 4.0-4.5 gave linear plots of [P] vs time, and Ki values were obtained from plots of 1/v vs [inhibitor] (110) at the four substrate concentrations.
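The slow-binding analysis described above can be sketched as follows. The progress-curve expression used here is the commonly quoted one-step slow-binding form, [P] = vs·t + (vi - vs)(1 - exp(-kobs·t))/kobs; it is an assumption on our part, since the equation itself is not reproduced in this text. The relationships vi/vs - 1 = [I]/Ki and Ki = koff/kon are taken from the paragraph above. All numerical values and function names are hypothetical.

```python
import math


def product_formed(t, v_i, v_s, k_obs):
    """[P] at time t for one-step slow-binding inhibition (assumed standard form)."""
    return v_s * t + (v_i - v_s) * (1.0 - math.exp(-k_obs * t)) / k_obs


def ki_from_velocities(v_i, v_s, inhibitor):
    """Ki from v_i/v_s - 1 = [I]/Ki (valid when [S] << KM, as in the text)."""
    return inhibitor / (v_i / v_s - 1.0)


def koff_from_ki(ki, k_on):
    """k_off from Ki = k_off / k_on."""
    return ki * k_on


# Hypothetical parameters, for illustration only.
V_I = 1.0            # initial velocity (arbitrary fluorescence units per s)
V_S = 0.2            # steady-state velocity in the presence of inhibitor
K_OBS = 0.01         # s^-1
INHIBITOR_NM = 20.0  # within the 4-40 nM range used in the text
K_ON = 1.0e-3        # nM^-1 s^-1, hypothetical

ki_nm = ki_from_velocities(V_I, V_S, INHIBITOR_NM)
k_off = koff_from_ki(ki_nm, K_ON)

# The curve starts with slope v_i and relaxes to slope v_s.
early_slope = (product_formed(1e-3, V_I, V_S, K_OBS) - product_formed(0.0, V_I, V_S, K_OBS)) / 1e-3
late_slope = product_formed(1001.0, V_I, V_S, K_OBS) - product_formed(1000.0, V_I, V_S, K_OBS)
print(f"Ki = {ki_nm:.2f} nM, k_off = {k_off:.4f} s^-1")
print(f"early slope = {early_slope:.3f}, late slope = {late_slope:.3f}")
```

In practice vi, vs and kobs would come from nonlinear regression of each measured progress curve; the sketch only shows how the fitted parameters are combined afterwards.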

RESULTS AND DISCUSSION

As shown in Table 1, the affinity of cathepsin B for a peptide derived from its proregion is increased approximately 40 fold at pH 6.0 and 200 fold at pH 4.0 by the partial deletion of the occluding loop. In addition, the affinity of cathepsin B for the same propeptide is increased significantly at both pH 4.0 and 6.0, although to a lesser degree, by the mutation of either or both of residues His110 and Asp22 to alanines. As shown by Nagler et al. (107), these mutations result in the destabilization of the closed conformation of the occluding loop found in the mature enzyme, thus decreasing its ability to compete with the propeptide for binding to the active site cleft.

It has been previously shown that a synthetic peptide with a sequence identical to that of the proregion of rat cathepsin B is a much weaker inhibitor (160 fold) of that enzyme at pH 4.0 than at pH 6.0 (102). At pH 6.0 the rat cathepsin B propeptide displays slow binding inhibition kinetics whereas at the lower pH it behaves as a classical competitive inhibitor. This pH dependent switch in inhibition type can be accounted for by the lower affinity of the enzyme for the propeptide at pH 4.0 coupled with a more rapid rate of dissociation at the lower pH. As can be seen from Table 1, the human cathepsin B-propeptide interaction displays a similar though smaller (22 fold) pH dependency. The weaker binding and faster off rates observed for both the rat and human propeptides correlate with the observation that the processing of procathepsin B is considerably faster at pH 4.0 than at pH 6.0. Fox et al. (102) reported that the pH dependency of the Ki for the inhibition of rat cathepsin B by its propeptide is consistent with the influence of multiple ionisations in the pH range 4.0-6.0 and it is suggested that the ionisation of one or more carboxylic groups is important for binding. However, the question of whether these carboxylic acid residues reside on the mature enzyme, the propeptide or both was not addressed. From the present study, based on the results discussed below, it is evident that it is residues within the mature enzyme that are responsible for this effect.

On comparing the sequences of the rat and human propeptides (Figure 1 of this chapter) it can be seen that only three acidic side chains are conserved, i.e. aspartic acid residues at positions 11p and 34p, and at position 12p aspartic acid (rat) and glutamic acid (human). In order to explore the possibility that one or more of these acid groups contributes to the pH dependency of propeptide binding, peptides with a 5-residue deletion at the N-terminus were synthesized in which each of the three aspartic acid residues found in the rat propeptide were sequentially substituted by asparagine residues (the deletion of 5 residues from the N-terminus results in a 3-fold increase in Ki(pH 4.0)/Ki(pH 6.0) versus the full length propeptide but does not adversely affect the overall pH dependency of binding, whereas it facilitates synthesis). The dissociation constants obtained for each of the modified peptides at pH 4.0 and 6.0 are given in Table 2. It can be concluded from Table 2 (following this chapter) that the carboxylic acid side chains found within the propeptides do not contribute significantly to the pH dependency of propeptide binding. In addition, the presence of negative charges within the propeptide does not have a large influence on the overall strength of peptide binding.

A recent study (107) indicates that pH dependent conformational changes occur in the occluding loop of cathepsin B in the pH range of 3.0 to 8.0. It was suggested that at the higher pH the loop becomes more flexible and as such less able to compete with extended endopeptidase substrates for the S' subsites. Conversely, at lower pH values the more rigid loop binds more tightly to those sites, thus occluding them and giving rise to a higher level of exopeptidase activity. That this pH dependence of the conformation of the occluding loop also plays a major role in the pH dependency of the propeptide binding is evidenced by the effect of partial deletion of the loop. In Table 1 it can be seen that for this mutant, Δ(Cys108-Cys119), the effect of pH on the Ki of the propeptide is essentially eliminated. Also from Table 1, it can be seen that removal of the ionic interactions between residues Asp22 and His110 through the mutation of either or both residues to alanine also largely eliminates the pH dependency of the inhibition of cathepsin B by its propeptide. Essentially, as with the occluding loop deletion mutant, the high affinity of the propeptide is maintained through the pH range 4.0-6.0. As discussed above, the salt bridge between residues Asp22 and His110 is important for maintaining the loop in a closed, rigid conformation. These results again support the view that at low pH (4.0), as opposed to pH 6.0, the occluding loop is able to compete more effectively with the propeptide for the binding site on the enzyme. Clearly, any change to the overall charge of the Asp22-His110 ion pair can be expected to influence the conformational stability of the occluding loop and as a consequence the measured Ki of the propeptide. Thus the deprotonation of His110 or the protonation of Asp22 can be expected to influence propeptide binding. Despite the fact that His110 and His111 of the occluding loop reside adjacent to one another, there is significant selectivity for the His110 residue in regulating the pH dependence of propeptide binding. This selectivity can be accounted for by their difference in chemical environment, since the His110 residue anchors the occluding loop to the rest of the enzyme. In order for the Asp22-His110 interaction to influence the propeptide binding it is necessary that the pKa of either of these two residues be within or very close to the range 4.0-6.0. Given that the stabilization effect decreases at higher pH, it is necessary to conclude that it is the deprotonation of the Nε of His110 that is responsible for the observed pH dependency.

Since the pH dependency of propeptide binding in the range of pH 4.0-6.0 appears to be determined by the integrity of the occluding loop and since mutation of either residue Asp22 or His110 (but not His111) to alanine eliminates the pH dependency (Table 1), as discussed above, it is possible to speculate that it is the protonation state of the Asp22-His110 ion pair that directly determines the pH dependency of Ki. In this case the relationship of the measured Ki (Ki meas) to the intrinsic Ki and pH is given by the equation:

$$K_{i,\mathrm{meas}} = K_i \, \frac{1 + [\mathrm{H}^+]/K_3}{1 + [\mathrm{H}^+]/K_4}$$

where Ki = dissociation constant for the propeptide binding to the enzyme when His110 is deprotonated, and K3 and K4 are the acid dissociation constants (corresponding to the pKa's) of His110 in the free enzyme and enzyme-propeptide complex, respectively.

It is interesting to note (Table 1) that the effects of the aspartic acid and histidine mutations to alanines are not additive, i.e. the double mutation, Asp22Ala/His110Ala, has the same effect as the individual mutations. This implies that the two single mutations have equivalent effects, i.e. the end result of the two mutations is the same and does not reflect interactions of these side chains with other groups. For example, in the open state His110 interacts with residue Asn222. If the effect of the His110Ala mutation were due, in part, to a loss of the His110-Asn222 interaction, it would be expected that the effects of the individual mutations His110Ala and Asp22Ala should show a significant degree of additivity for the double mutant.
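A short numerical sketch may help to show how the relationship between the measured and intrinsic Ki behaves across the pH range studied. It treats K3 and K4 as acid dissociation constants of His110 (10^-pKa) in the free enzyme and in the complex. The pKa values and the intrinsic Ki below are hypothetical placeholders, not values from this study, and the function name is ours.

```python
def ki_measured(ph, ki_intrinsic, pka_free, pka_complex):
    """Ki(meas) = Ki * (1 + [H+]/K3) / (1 + [H+]/K4).

    K3 and K4 are the acid dissociation constants of His110 in the free
    enzyme and in the enzyme-propeptide complex, respectively.
    """
    h = 10.0 ** (-ph)
    k3 = 10.0 ** (-pka_free)
    k4 = 10.0 ** (-pka_complex)
    return ki_intrinsic * (1.0 + h / k3) / (1.0 + h / k4)


# Hypothetical values, for illustration only.
KI_INTRINSIC_NM = 1.0
PKA_FREE = 5.5       # His110 stabilized by the Asp22 salt bridge (assumed)
PKA_COMPLEX = 4.0    # salt bridge broken when the propeptide is bound (assumed)

for ph in (4.0, 4.5, 5.0, 5.5, 6.0):
    ki = ki_measured(ph, KI_INTRINSIC_NM, PKA_FREE, PKA_COMPLEX)
    print(f"pH {ph:.1f}: Ki(meas) = {ki:.2f} nM")

# If the two pKa values coincide (e.g. no salt bridge to perturb His110),
# the pH dependence disappears, as observed for the Asp22Ala and
# His110Ala variants.
flat = ki_measured(4.0, KI_INTRINSIC_NM, 5.0, 5.0)
```

Under these assumptions Ki(meas) rises as the pH falls and levels off at Ki·K4/K3 at low pH, so the size of the pH effect is set by the pKa shift between free and complexed enzyme.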

In the study of Fox et al. (102) it was reported that the pH dependency of Ki is due to the influence of pH on koff rather than kon. Since both kon and koff for the mutants Asp22Ala and His110Ala are largely pH independent (Table 1), it follows that for the wild-type enzyme the pH dependency of koff is a reflection of the pH dependency of the aspartate-histidine interaction. How is this possible if, as discussed above and as demonstrated by the crystal structure of procathepsin B, the aspartate-histidine interaction is broken when the proregion and presumably the propeptide is bound? One possible explanation is that for the enzyme-propeptide complex protonation of His110 serves to actively displace the propeptide, i.e. rather than the sequence, dissociation of the propeptide followed by formation of the aspartate-histidine ion-pair, the ion-pair is either partially or fully formed prior to the dissociation of the propeptide. Thus displacement of the propeptide would take place as a two step process: an initial displacement from the occluding loop binding site as a result of the closing of the occluding loop and formation of the Asp22-His110 ion-pair, followed by dissociation of the propeptide from the active site cleft. Clearly, competing mechanisms could involve initial displacement from the active site followed by dissociation from the occluding loop binding site, and the direct one step simultaneous displacement from both sites. The pH dependency results would suggest a significant role for the two step process involving the earlier displacement from the occluding loop site. Conversely, the lower pH-sensitivity of kon implies that for the binding process, the binding of propeptide to the open form of the enzyme (Asp22-His110 ion-pair broken) predominates.

The rat and human cathepsin B propeptides (residues 1p to 56p) share an overall sequence homology of 71% (Figure 1). Furthermore, fragment analysis and alanine scanning studies revealed that the segment of polypeptide between the motif NTTW (21p to 24p), which binds to the hydrophobic prosegment-binding loop (exosite) on the enzyme, and the CGT (42p to 44p) motif, which binds through the active site cleft of cathepsin B, is crucial for the binding affinity of the free peptide to cathepsin B (92,117). The homology between these polypeptides within the segment 21p to 50p increases to 86%. Therefore, it is not surprising that both the rat and human propeptides display similar inhibitory activity towards mature human cathepsin B. In fact, the Ki values for the human propeptide did not show any significant differences over those of the rat (Table 1).

Procathepsin B, in vivo, is synthesized as a glycoprotein consisting of two solvent exposed sites of N-linked oligosaccharide substitution, one at Asn21p of the prosegment, which is located near the hydrophobic prosegment binding loop (exosite), and a second at Asn113, located on the occluding loop of the enzyme. Due to the fact that Asn21p is not in direct contact with the surface of the enzyme (89-91), it can reasonably be concluded that the absence of glycosylation on Asn21p of the chemically synthesized propeptides will not significantly affect their affinity for the enzyme. This is supported by alanine scanning studies (92) which did not link any unique importance of Asn21p to propeptide binding. Conversely, the absence of glycosylation on Asn113 of the occluding loop could affect the conformational stability of the occluding loop and as a consequence the values of the inhibition constants obtained for the propeptides. However, similar pH dependencies of autoprocessing of the glycosylated and non-glycosylated proenzymes are observed (data not shown), implying that the influence of glycosylation may be small.

There is an interesting correlation between the rate of autoprocessing of purified full-length variants of procathepsin B (Figure 3 of this chapter) and the decrease in the mature enzyme's affinity for the propeptides at pH 4.0 (Table 1). For example, the affinities of the propeptides towards mature wild-type and His111Ala cathepsin B are similarly affected by pH; i.e., there is a marked decrease in affinity at low pH. Similarly, exposure of the His111Ala procathepsin B to acidic pH conditions leads to a rate of maturation comparable to wild-type procathepsin B; i.e., under the conditions given for Figure 3 processing of these enzymes is complete within minutes rather than days. Conversely, there is a maintenance of potent inhibition of the Δ(Cys108-Cys119), Asp22Ala, and His110Ala forms of the enzyme by the propeptides at low pH, and the proenzymes of these three variants process much more slowly than either wild-type or His111Ala procathepsin B (Figure 3). The Δ(Cys108-Cys119) proenzyme requires 5 days incubation at pH 5.0 (105), and both Asp22Ala and His110Ala procathepsin B require over 7 days under the same conditions. Full length procathepsin B is stable in neutral pH environments, where the occluding loop favors an 'open' state, and is prone to autoprocess under acidic pH to form mature cathepsin B, where the occluding loop favors a 'closed' state. It is interesting to note, therefore, that the gating equilibrium of the occluding loop and autoprocessing of procathepsin B are both pH dependent. The Asp22Ala and His110Ala procathepsin B mutants shared much slower rates of autoactivation at pH 5.0 compared to wild-type and His111Ala procathepsin B (Figure 3). Once the occluding loop mutants have been matured using immobilized pepsin, however, they are still able to cleave small synthetic substrates (Table 3). Hence, the catalytic capability of these mutants has not been disrupted but the autoprocessing machinery has been greatly perturbed. It is possible, therefore, that the closure of the occluding loop constitutes an important and unique step in the elimination of the proregion in procathepsin B. Such a processing mechanism would explain the observation that perturbation of the pH-dependent gating equilibrium of the occluding loop has such a profound effect on the affinity of the free propeptide binding at low pH (Tables 1 and 2) as well as the rates of procathepsin B autoprocessing (Figure 3).

The absence of an occluding loop and the presence of longer proregions in other papain-like proenzymes (>90 residues versus 62 residues in procathepsin B) would suggest that the pH-triggering mechanism of autoprocessing in procathepsin B is as unique as the enzyme's dual exopeptidase and endopeptidase character. Sequence alignment reveals that Asp22 in cathepsin B is not entirely conserved throughout the papain superfamily. It is replaced by an asparagine residue in many other cysteine proteases such as papain, cathepsin L and cathepsin H, and substituted by a tyrosine residue in cathepsin S. Furthermore, structural alignment indicates that the residue located in this position is not in close proximity to the bound prosegment (85-91). Therefore, other enzymes would not share the same pH dependence of prosegment binding as that observed for cathepsin B. In addition, the negative charge of a highly conserved aspartate residue in the proregion of propapain (Asp65p, papain numbering) was shown to be important in maintaining the papain precursor in a latent form and to participate in an electrostatic triggering mechanism of propapain processing (100). Despite the difficulty in aligning the prosegments of propapain and that of procathepsin B, the closest match to Asp65p in propapain is Asp34p in procathepsin B (cathepsin B numbering, Figure 1). As noted earlier, replacement of Asp34p in the cathepsin B propeptide by an asparagine did not alter the pH dependence of cathepsin B inhibition. Curiously, structural alignment reveals that the conserved Asp65p resides within an area of propapain which is homologous to the position of the occluding loop in procathepsin B (85-91). Furthermore, only the koff value of the propeptide in cathepsin B was found to be pH dependent in the pH range 4.7-6.0 (102) and not that of kon, as is the case with the cathepsin L propeptide (103). Interestingly, the modest increase in Ki values of the propeptides for the Δ(Cys108-Cys119) mutant of cathepsin B as the pH is dropped from 6.0 to 4.0 corresponds to a similarly modest decrease in the kon value only. This suggests that the presence of the occluding loop in wild-type cathepsin B is responsible for the observed increase in the off-rate of the propeptide upon exposure to low pH. It has already been established that maturation of procathepsin B proceeds by an autoactivation mechanism (98,101,109) and that the main role of cathepsin B in its natural environment, the acidified lysosome, is to act as an exopeptidase (105) which relies on the 'closed' configuration of the occluding loop.

In summary, this study confirms a link between the pH dependencies of the cathepsin B occluding loop conformation and the propeptide binding affinity, and supports a direct correlation of this link with the in vitro rates of procathepsin B autoprocessing.

ACKNOWLEDGMENT

The authors would like to thank Jean Lefebvre for his technical assistance in the synthesis and purification of the human propeptide of cathepsin B, and Dr. Robert Ménard for many valuable discussions.

FIGURE 1

[Sequence alignment of the rat and human cathepsin B prosequences, numbered 1p to 56p, with Region 3, Region 2 and Region 1 marked.]

HUMAN RSRPSFHPLS DELVNYVNKR NTTWQAGHNF YNVDMSYLKR LCGTFLGGPK PPQRVM

Prosequences of Rat and Human Cathepsin B showing the primary sequence of the proregions which were chemically synthesized (residues 1p to 56p). Consensus of these two sequences is 86% within the 21p-50p segment.

Figure 2: Superimposition of the conformations adopted by the occluding loop in procathepsin B (blue) and mature cathepsin B (brown). Note the positions of both His110 and His111 with respect to the main body of the enzyme and the orientations adopted by the disulfide bridge linking Cys108 to Cys119 (green). Single letter codes used.

Figure 3: Monitoring procathepsin B (36 kDa) autoprocessing to mature cathepsin B (30 kDa) in 50 mM acetate buffer pH 5.0 with 1 mM DTT using SDS-PAGE (12% gels). Gels A, B, C, D correspond to Wild-Type (@ µM), His111Ala (3 µM), His110Ala (7 µM), and Asp22Ala (6 µM) procathepsin B, respectively. Initial proenzyme concentrations are indicated in brackets. Note the incubation times (panel time axes labelled MIN, MIN, DAY, DAY).

Table 1: Equilibrium and kinetic data for the inhibition of human cathepsin B mutants* by both human and rat cathepsin B propeptides at pH 4.0 and 6.0

Hurnan Wild type 11 &Cl 086119)

I Asp22Aa II Hlsl1OAla I Asp22AlaMlsll OAla 1.4 f 0.2 2.1 I Hls111Ala 86.0 f -

Rat Wild type 85.0 f 6.0' -

I &ci oscr i9) 0g6 0.1 4-2 I Asp22Ala 5.1 f 0.5 0.5

I Hls11OAla 4.1 f 0.3 0.4 l4 Asp22AleMlsl lOAla 1.8 0.2 1 11 Hlsl 11 Ala 87.0 f 8.0' -

aPropeptides comprise residues 1p-56p of the corresponding cathepsin B proregion. bThe Ki values are given as the averages of 4 determinations with the calculated standard deviations. cClassical inhibition observed, Ki's determined from plots of 1/v vs [I] (Dixon, 1953).

Table 2: Role of rat propeptide aspartic acid residues in the inhibition of rat cathepsin B by its propeptide at pH 4.0 and 6.0'

'Assay conditions were: 0.1 M phosphate (pH 6.0) or acetate (pH 4.0) buffer containing 1 mM EDTA, 1 mM DTT, 0.025% Brij-35 and 3% DMSO. Substituted propeptide residues are underlined and in boldface. Classical inhibition observed, Ki's determined from plots of 1/v vs [I] (Dixon, 1953).

Table 3: Activity of Cathepsin B Towards Z-Phe-Arg-MCA

kcat/KM (M^-1 s^-1)

Wild-Type             180,000     380,000     430,000
His111Ala             680,000   1,150,000   1,225,000
His110Ala             190,000     320,000     400,000
Asp22Ala              550,000     900,000   1,200,000
Δ(Cys108-Cys119)       75,000     330,000     261,000

CONNECTING TEXT FOR CHAPTERS 1 AND 2

The importance of the occluding loop, particularly salt bridge formation between His110 and Asp22, in defining the pH dependence of cathepsin B inhibition by its propeptide and affecting the overall rate of procathepsin B processing was discussed in Chapter 1. It may be reasonably concluded that the pH dependent mobility of the occluding loop constitutes a unimolecular process within the zymogen. Not addressed in Chapter 1, however, is whether this process contributes to an intramolecular autoproteolytic event. Perturbation of the closed form of the occluding loop using site-directed mutagenesis, i.e., destruction of the His110/Asp22 salt bridge, was shown to lead to variants of procathepsin B which could no longer autoprocess. In Chapter 2, internal cleavage sites within the prosegment of cathepsin B (as well as within the prosegment of cathepsin S) were identified using the inhibitory capacity of cystatin C. These autoproteolytic reactions occur within an area of the prosegments which binds through the substrate-binding clefts of these enzymes. The kinetic data, together with the observation of these intermediates of processing at all concentrations of proenzyme (even nanomolar concentrations), suggest that these reactions correspond to intramolecular proteolytic events.

Furthermore, it has been postulated that the N-terminus of the mature segment in propapain may play a role in a unimolecular autoproteolytic event (118). This possibility has been investigated by disrupting the formation of a highly conserved salt bridge between Asp6 and Arg8 through site-directed mutagenesis. In this study, the ability of Arg8Ala propapain to autoprocess is compared to that of the wild-type precursor.

CHAPTER 2

Contributions of Authors other than Omar Quraishi:

Andrew C. Storer: Provided funding and mentorship for this project.

Note: Both wild-type and Arg8Ala propapain (and the antibodies for Western blot analysis) were provided by the laboratory of Dr. D. Y. Thomas (BRI/NRC).

Identification of Internal Autoproteolytic Cleavage Sites Within the Prosegments of Recombinant Procathepsin B and Procathepsin S

CONTRIBUTION OF A PLAUSIBLE UNIMOLECULAR AUTOPROTEOLYTIC EVENT FOR THE PROCESSING OF ZYMOGENS BELONGING TO THE PAPAIN FAMILY*

*NRCC Publication No. 00000. The research was funded in part by the Government of Canada's Network of Centres of Excellence Program supported by the Medical Research Council of Canada and the Natural Sciences and Engineering Research Council of Canada through PENCE Inc. (the Protein Engineering Network of Centres of Excellence).

Omar Quraishi‡ and Andrew C. Storer‡§¶

From the ‡Protein Engineering Network of Centres of Excellence and Department of Biochemistry, McGill University, 3655 Drummond Street, Montreal, Quebec, Canada H3G 1Y6, and the §Pharmaceutical Biotechnology Sector, Biotechnology Research Institute, National Research Council Canada, 6100 Royalmount Avenue, Montreal, Quebec, Canada H4P 2R2.

¶To whom correspondence should be addressed: Pharmaceutical Biotechnology Sector, Biotechnology Research Institute, National Research Council of Canada, 6100 Royalmount Ave., Montreal, Quebec, Canada H4P 2R2. Tel.: 514-496-6256; Fax: 514-496-1629; E-mail: [email protected].

RUNNING TITLE: Processing Intermediates in Procathepsins B and S

ABBREVIATIONS

1 As indicated in the text, residue numbering relates to that of cathepsin B for

recombinant human cathepsin B, and to that of cathepsin L for recombinant

human cathepsin S. Residues in the proregion are identified with the suffix p.

2 The abbreviations used are: E-64, trans-epoxysuccinyl-L-leucyl-amido-(4-guanidino)butane; Z-Phe-Arg-MCA, benzyloxycarbonyl-L-phenylalanyl-L-arginine 4-methylcoumarinyl-7-amide; SDS-PAGE, sodium dodecyl sulfate-polyacrylamide gel electrophoresis; DMSO, dimethyl sulfoxide; EDTA, ethylenediaminetetraacetic acid; DTT, dithiothreitol.

3 In the text, the words 'prosegment', 'proregion', 'prosequence', or 'prodomain'

refer to the polypeptide stretch located N-terminal to the mature enzyme in the

proenzyme, while the word 'propeptide' refers to the chemically synthesized

polypeptide corresponding to the proregion sequence but without the mature

enzyme.

4 In the text, the terms 'autoprocessing', 'autoactivation', or 'maturation' relate to

the ability of a zymogen to convert itself to a mature protein resulting in

cleavage at or near the pro/mature junction.

Abstract:

Autoprocessing is a mechanism by which a latent, higher molecular weight zymogen may convert itself to form active enzyme. Autoprocessing of proenzymes belonging to the papain family of cysteine proteases, namely the identification of intermediate events, has been difficult to characterize. Processing near the pro/mature junction, due either to the catalytic activity of the mature enzyme or to secondary proteases, has been documented for this family of proenzymes. Furthermore, kinetic studies suggest that a slow mechanism which is independent of proenzyme concentration occurs during autoactivation. However, inspection of the recently determined X-ray crystal structures (85-91) does not support this evidence. This is due primarily to the extensive distances between the catalytic thiol-imidazolium ion-pair and the putative site of proteolysis required to form mature protein. Furthermore, the prosegments have been shown to bind through the substrate-binding clefts in a direction opposite to that expected for natural substrates. Prior to this work, a recent non-homology knowledge-based prediction of propapain activation (118) proposed that an intramolecular proteolytic event may involve major conformational rearrangement at the N-terminus of the mature enzyme domain. We report, using the cystatin C inhibitor and N-terminal sequencing, novel autoproteolytic intermediates of processing for recombinant procathepsin B and procathepsin S in vitro. The crystal structures (85-91) indicate that these reactions occur within a segment of the proregion which binds through the substrate-binding clefts of the enzymes, thus suggesting that these reactions occur as unimolecular processes. Using site-directed mutagenesis, we also show that charged residues located at the N-terminus of the mature enzyme domain of propapain do not participate in the overall pH-triggering mechanism of autoprocessing for this precursor. These results provide the molecular basis for a plausible unimolecular step of processing among zymogens of papain-like enzymes.

Prior to being shuttled to the mature lysosome, cysteine proteases of the papain family are first synthesized as latent precursors of higher molecular weight. Zymogens of papain-like enzymes carry polypeptide extensions of various lengths at the N-terminus of the mature enzyme domain that act as potent and selective inhibitors towards the cognate enzyme (102-104). The stability of the prosegment/enzyme complex is believed to rely mainly on electrostatic interactions since most precursors of the papain superfamily are susceptible to autoactivation upon their exposure to acidic pH environments (97-101,109). The crystal structures of rat (89) and human (90,91) procathepsin B, human procathepsin L (85), procaricain (86), and human procathepsin K (87,88) have been reported recently. The crystal structures reveal that each enzyme is inhibited by a small segment of the proregion binding through the substrate-binding cleft in a direction opposite to that expected for natural substrates. This reverse configuration is believed to protect cells from unregulated digestion by helping to ensure proenzyme stability at neutral pH as the zymogen is passed from the endoplasmic reticulum to its final destination, i.e., the acidified lysosomal compartment of the cell.

Progress still remains to be made in elucidating the molecular mechanisms involved in the conversion of many types of zymogens to their catalytically active forms. For instance, autoproteolytic cleavage of the serine protease precursor prosubtilisin E has been suggested to occur at the identical site (near the pro/mature junction) in either an intermolecular or intramolecular manner, with the predominant mechanism depending mainly on the starting concentration of the proenzyme (119,120). The molecular basis for the unimolecular mechanism of prosubtilisin E, however, has yet to be elucidated. Similarly, precursors belonging to the papain family of cysteine proteases have been proposed to undergo a non-exclusive unimolecular step (97-100). It is not clear, however, whether this step could represent a process other than an intramolecular cleavage reaction, such as a rate-limiting activation process that exposes the active site and renders the proenzyme active. If a unimolecular cleavage reaction is involved, it is also important to determine whether the cleavage site for the intramolecular event is identical to that observed via intermolecular processing (i.e., in trans), as has been proposed for prosubtilisin E (119,120), or if it occurs elsewhere in the proregion, generating a catalytically competent processing intermediate followed by its further conversion to yield mature enzyme.

In general, zymogens of all families of proteolytic enzymes may undergo maturation via intermolecular or intramolecular non-exclusive pathways. Kinetic studies of the conversion of propapain (97), procathepsin B (98) and procathepsin L (99) to form mature enzyme have revealed both an intermolecular and an intramolecular component to processing. Precursors belonging to the papain family may, therefore, utilize different pathways to convert themselves into a mature protein. For example, these precursors may participate in direct bimolecular cleavage to form mature enzyme. Furthermore, these proenzymes may utilize an intramolecular process to form catalytically competent processing intermediates followed by further intermolecular proteolysis to form mature enzyme. However, the expedient intermolecular reactions which contribute to proenzyme processing make it difficult to 'trap' and identify any processing intermediates. For example, intermediates corresponding to N-termini located elsewhere in the proregion may be unstable and difficult to detect under normal experimental conditions. Significantly, kinetic studies (97-99) have also revealed that the molecularity, i.e., the concentration of proenzyme at which the rate of the intermolecular events equals that of the unimolecular process, is in the range of only

1o-~- 1O-' M.

The occurrence of a unimolecular step of maturation, however, is inconsistent with what is observed in the three-dimensional structures for this family of precursors (85-91). For example, the crystal structure of procathepsin B reveals that the pro/mature junction, i.e., the putative site for proteolytic processing, is approximately 28 Å from the catalytic nucleophile. In addition, direct comparison of the crystal structures of procathepsin B (89-91) with that of mature cathepsin B (82) reveals no evidence for major N-terminal rearrangement within the mature segment following maturation as is found in zymogens belonging to other families (36,121,122). Taken on their own, the crystal structures have not provided any clues with regard to the possible existence of a unimolecular step. Despite this, a non-homology knowledge-based strategy predicted that a plausible unimolecular proteolytic step in propapain processing could involve the adjustment of a single β-turn that rearranges the first 12 residues of the enzyme domain and allows the mature N-terminus to reach the active site in a cleavable direction (118).

Within the mature N-terminus are Asp6 and Arg8, which are highly conserved residues among cysteine proteases of the papain family (108). Interestingly, all X-ray crystal structures of papain-like enzymes (pro- and mature) reported to date reveal that the side-chains of Asp6 and Arg8 contribute to the formation of a salt bridge (82,83,85-91).

Here, we attempt to identify novel intermediates of processing for the precursors of cathepsin B and cathepsin S in vitro. This has been achieved by using SDS-PAGE to monitor the processing of purified procathepsin B and procathepsin S in the presence of the endogenous inhibitor, cystatin C. Cystatin C has been shown to be a substrate-binding cleft-directed protein inhibitor of papain-like enzymes with Ki values in the sub-picomolar range (96,123). Since the formation of a tight-binding complex with cystatin C requires that the substrate-binding cleft of papain-like enzymes be accessible, i.e., free from natural substrates or the prodomain, it may be reasonably assumed that the affinity of cystatin C for the mature enzyme would be superior to that for either the proenzyme or any intermediates generated during autoproteolysis, i.e., a hierarchy of cystatin C affinity of mature enzyme > intermediates > proenzyme. In excess, therefore, cystatin C may provide the desired effect of inhibiting the intermolecular proteolytic cascade caused mainly by the activity of mature enzyme, thereby favoring intramolecular events which would otherwise go undetected. Furthermore, we investigate any role the N-terminus of the mature enzyme domain may play in processing. This has been achieved by performing site-directed mutagenesis of Arg8 to an alanine residue in propapain, i.e., destruction of the Asp6/Arg8 salt bridge, and monitoring the overall effect of this mutation on the ability of propapain to autoactivate.

EXPERIMENTAL PROCEDURES

Materials - Human wild-type and GIyl ZGIu cystatin C were a generous gift from Dr. Irena Ekiel (Biotechnology Research Institute). Recombinant human wild-type procathepsin B was expressed and purified as described previously (106). The vector (pPIC9) and P. pastoris strain GS115 were purchased from Invitrogen Corp. (San Diego, CA). The substrate benzyloxycarbonyl-L-phenylalanyl-L-arginine 4-methylcoumarinyl-7-amide hydrochloride (Z-Phe-Arg-MCA) and the covalent inhibitor E-64 (1-[[(L-trans-epoxysuccinyl)-L-leucyl]amino]-4-guanidinobutane) were purchased from IAF Biochem International Inc. (Laval, Canada).

Expression of Procathepsin B and Procathepsin S - A cDNA construct, consisting of human wild-type procathepsin B or procathepsin S as a fusion with the preproregion of yeast α-factor, was digested with XhoI and NotI and the proenzyme fragment was subcloned into the pPIC9 vector (Invitrogen Inc., San Diego, California). For integration into the Pichia genome, the pPIC9-based constructs were linearized by cleavage with BglII and purified. The Pichia pastoris host strain GS115 (Invitrogen Inc.) was then transformed with the linearized constructs by electroporation. Positive transformants were grown for 2 days in medium containing glycerol as the carbon source followed by incubation in the presence of methanol for a further 3 days to induce expression of recombinant protein. The consensus sequence for oligosaccharide substitution located in the occluding loop of mature cathepsin B had been removed by the substitution Ser115Ala. All other sites for oligosaccharide substitution within procathepsin B and procathepsin S were left unaltered. Protein secreted into the culture supernatants was analyzed by SDS-polyacrylamide gel electrophoresis (12% gels).

Expression of Wild-Type and R8A Propapain - Propapain was produced as described previously (97,100). Briefly, the Saccharomyces cerevisiae strain BJ3501 was transformed with the expression vector derived from pVT100-U in which the propapain gene is under the control of the α-factor promoter. Yeast cells were first grown under selective conditions to ensure plasmid maintenance and then transferred into a rich medium. The cells were lysed using a French press, and the soluble fraction of the lysate included propapain. Complete processing in cis was achieved by incubating soluble cellular extracts with 50 mM sodium acetate (pH 3.8), 20 mM cysteine at 65°C for 30 min. Samples were analyzed by Western blot following separation of the proteins using SDS-PAGE.

Purification of Procathepsin B and Procathepsin S - The proenzymes were purified from the culture supernatant using a hydrophobic resin under non-acidic conditions. The culture supernatant (250 ml) was concentrated to 50 ml using an Amicon stirred-cell (YM-10 membrane). During concentration, the supernatant was exchanged to 50 mM Tris (pH 7.4) containing 1.6 M (NH4)2SO4. Concentrated recombinant proenzyme was then purified on an FPLC system (Pharmacia) using a butyl-Sepharose fast flow column. Proenzyme fractions were eluted from the column by applying a linear gradient of decreasing ammonium sulfate concentration. Glycosylated procathepsin B and procathepsin S eluted at 0.6-0.8 M and 0.3-0.5 M (NH4)2SO4, respectively, and samples were stored at 4°C.

In Vitro Processing of Procathepsin B and Procathepsin S - Purified procathepsin B (20 µM) and procathepsin S (4 µM) samples were dialysed against 50 mM sodium acetate (pH 5.0), 1 mM dithiothreitol at 4°C overnight in the presence (or absence) of a 5-fold excess of human wild-type cystatin C. Each sample was then treated with excess E-64 and analyzed by SDS-PAGE (12% gels).

N-Terminal Identification of Protein Bands - After electrophoresis, protein bands were blotted onto hydrophobic polyvinylidene difluoride membranes using the method described previously (124). The membranes were then stained with Coomassie Brilliant Blue R250 (Bio-Rad Laboratories) and each protein band of interest was subjected to a minimum of five cycles of automated solid-phase Edman degradation.

Fluorogenic Assay for Monitoring Proenzyme Processing - Processing of human wild-type procathepsin B and procathepsin S was followed in a continuous manner by carrying out the reaction in a 3-ml quartz cuvette in the presence of the substrate Z-Phe-Arg-MCA (10 µM) and measuring fluorescence as a function of time. The conversion of procathepsins B and S to active enzyme leads to hydrolysis of the substrate, and fluorescence of the MCA product was monitored using excitation and emission wavelengths of 380 and 440 nm, respectively. Processing was initiated by lowering the pH from 7.4 (pH of the stock solution of procathepsins B and S) to 5.0 (pH of the assay). Reactions were carried out at 25°C in the presence of 50 mM sodium acetate buffer, 0.2 M NaCl, 2 mM EDTA, 2 mM dithiothreitol, and 3% DMSO. The reaction mixture was stirred continuously in the cuvette during the reaction. The product versus time curves were fitted to the following equation (111-115),

\[ [P] = v_{E}\,t - \frac{(v_{E} - v_{PE})\,\bigl(1 - \exp(-k_{obs}\,t)\bigr)}{k_{obs}} \]

where P is the MCA product formed, $v_{PE}$ represents the initial rate of product release (which should reflect activity of the proenzyme, if any), $v_{E}$ corresponds to the rate for mature cathepsin B or cathepsin S, and $k_{obs}$ is a first order rate constant.
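As a numerical illustration of the fitting model, the following Python sketch evaluates a progress curve whose rate relaxes with first-order kinetics from an initial rate (proenzyme activity) to a final rate (mature enzyme activity). The function names and the parameter values are assumptions made for illustration only; they are not taken from the experiments reported here, and the actual curve fitting was done by non-linear regression rather than with this code.

```python
import math


def product_formed(t, v_pe, v_e, k_obs):
    """Product [P] at time t when the rate relaxes first-order from v_pe to v_e."""
    return v_e * t - (v_e - v_pe) * (1.0 - math.exp(-k_obs * t)) / k_obs


def rate(t, v_pe, v_e, k_obs):
    """Analytical derivative d[P]/dt of product_formed."""
    return v_e - (v_e - v_pe) * math.exp(-k_obs * t)


# Hypothetical parameters in arbitrary units (not measured values).
V_PE = 0.0    # initial rate: activity of the proenzyme, if any
V_E = 2.0     # final rate: activity of fully processed enzyme
K_OBS = 0.01  # first-order rate constant of processing

if __name__ == "__main__":
    for t in (0.0, 100.0, 1000.0):
        print(t, product_formed(t, V_PE, V_E, K_OBS), rate(t, V_PE, V_E, K_OBS))
```

The rate equals the initial rate at time zero and approaches the final rate at long times, which is the behaviour the fitted parameters are meant to capture.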

RESULTS AND DISCUSSION

Exposing proenzymes of the papain family such as procathepsin B and procathepsin S to acidic pH environments leads to their rapid autocatalytic conversion to form mature protein. Exceptions to this general phenomenon include the precursor of papaya proteinase IV, which is incapable of autoprocessing (125) despite its demonstrated enzymatic activity (126). This incapacity has been attributed to the restricted specificity of this enzyme for substrates containing glycine in the P1 position. Furthermore, we have recently determined that the inability of procathepsin H to self-activate is due to the formation of a disulfide bond ((127); Chapter 3 of this thesis), featured only in the precursors of cathepsin H (mammalian), aleurain (barley), and oryzain γ (rice seeds), which links the prosegment to the catalytic domain via residues Cys82p and Cys214 (cathepsin L numbering) (128).

The occurrence of a unimolecular processing step for precursors of papain-like enzymes has been postulated based on kinetic data (97-99) where a rate of autocatalysis was extrapolated for proenzyme concentrations approaching zero. The nature of this step, however, has remained elusive. Monitoring the autocatalytic processing of purified precursors belonging to the papain family (no mature enzyme present) using SDS-PAGE typically reveals protein bands which correspond only to the full-length precursor of higher molecular weight and/or to the mature enzyme of lower molecular weight, i.e., processing intermediates are usually not observed. In the study of Vernet et al. (100), however, an intermediate protein band of 30 kDa was observed for propapain using Western blot analysis, yet the quantities were insufficient to identify this species using Edman degradation.

Monitoring the autoprocessing of purified recombinant procathepsin B and procathepsin S in vitro in the presence of excess amounts of the protein inhibitor, cystatin C, effectively increases the population of intermediate species which would otherwise be difficult to identify using SDS-PAGE. Cystatin C is an endogenous active site-directed inhibitor of papain-like enzymes. Three exposed loops on cystatin C, with Gly11 playing a key role, are necessary for the inhibitory activity of this protein (129), as is its unobstructed access to the substrate-binding cleft of the enzyme (105). Therefore, the prosequence of any precursor of papain-like enzymes must be removed from the active site cleft in order for this tripartite interaction to take place. The presence of excess amounts of cystatin C introduces a competition (between the prodomains and cystatin C) for the substrate-binding cleft of the enzyme, which results in significantly slower rates of activation due mainly to its effect on the intermolecular proteolytic cascade. Since stable complex formation between cystatin C and papain-like enzymes requires that access to the substrate-binding cleft of the enzyme be unobstructed, inhibition by cystatin C of the different isoforms of the enzyme is expected to be of the order: mature enzyme > intermediate > proenzyme. Therefore, intermolecular reactions will be more strongly inhibited than intramolecular processes.

As shown in Figures 1 and 2 of Chapter 2, the presence of cystatin C successfully retards proenzyme processing and allows for the time-dependent accumulation of an intermediate protein band at 32 kDa for cathepsin B and 29 kDa for cathepsin S.

Identification of Intermediate Cleavage Sites in Procathepsin B using the Cystatin C Assay - With the addition of cystatin C to the reaction mixture, an intermediate species of cathepsin B is observable even at nanomolar proenzyme concentrations as determined by AgNO3-stained SDS-PAGE (Figure 1) as well as in micromolar quantities (PVDF membranes; Coomassie Blue staining; Figure 2A). Direct N-terminal sequencing of the intermediate band of cathepsin B processing indicates a mixture of species following cleavage at Cys42p↑Gly43p (70%) and Arg40p↑Leu41p (30%) (Figure 3, cathepsin B prosegment numbering). Following several weeks of incubation, both full-length procathepsin B and the processing intermediate disappear and only the protein band corresponding to mature cathepsin B is observed. Therefore, it may be concluded that these are true intermediates of processing and not dead-end (side product) reactions. The crystal structure of procathepsin B (89-91) indicates that this segment of the cathepsin B proregion binds through the substrate-binding cleft of the enzyme in the reverse mode and that the carbonyl carbon of Cys42p is in closest proximity to the catalytic residue, Cys29 (Figure 4). Based on structural analysis of this region of the cathepsin B precursor, the carbonyl carbon of Cys42p is located approximately 4 Å from the catalytic nucleophile and the potential bond angle between the catalytic nucleophile and the carbonyl oxygen of Cys42p is 127 degrees, i.e., conducive to forming a tetrahedral intermediate (Figure 4). For the Arg40p↑Leu41p cleavage site, the carbonyl carbon of Arg40p is approximately 6.7 Å from Cys29 and the potential bond angle between the carbonyl oxygen of Arg40p and the catalytic nucleophile is 36.6 degrees. Hence, the ability of the carbonyl carbon of Arg40p to reach the catalytic center of the enzyme for hydrolysis requires significant conformational mobility of the small α-helix composed of residues Asp34p→Leu41p, which interacts with the occluding loop crevice and the primed subsites of the substrate-binding cleft of cathepsin B.
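The distances and angles quoted above are simple geometric quantities derived from atomic coordinates. The following Python sketch shows how such values can be computed from three points; the function names are illustrative, the choice of the carbonyl carbon as the vertex of the angle is an assumption, and the coordinates used in the example are hypothetical rather than taken from any procathepsin B structure.

```python
import math


def distance(a, b):
    """Euclidean distance between two points given as (x, y, z) tuples."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def angle_deg(a, b, c):
    """Angle a-b-c in degrees, with the vertex at point b."""
    v1 = [p - q for p, q in zip(a, b)]
    v2 = [p - q for p, q in zip(c, b)]
    dot = sum(x * y for x, y in zip(v1, v2))
    cos_theta = dot / (distance(a, b) * distance(c, b))
    # Clamp to [-1, 1] to guard against floating-point round-off.
    cos_theta = max(-1.0, min(1.0, cos_theta))
    return math.degrees(math.acos(cos_theta))


if __name__ == "__main__":
    # Hypothetical coordinates in angstroms (not from a crystal structure).
    sulfur = (4.0, 0.0, 0.0)           # stands in for the catalytic thiol S
    carbonyl_carbon = (0.0, 0.0, 0.0)  # stands in for a scissile carbonyl C
    carbonyl_oxygen = (-0.7, 1.0, 0.0) # stands in for the carbonyl O
    print(distance(sulfur, carbonyl_carbon))
    print(angle_deg(sulfur, carbonyl_carbon, carbonyl_oxygen))
```

With real data, the three points would be read from the coordinate records of the structure file for the atoms of interest.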

Given the proximity of these carbonyl carbons (those of Cys42p and Arg40p) to the catalytic center, it is tempting to speculate that these reactions occur in an intramolecular manner. As discussed previously, kinetic studies are suggestive of a unimolecular step among members of the papain family of precursors whose molecularity is unusually low (LO-~-~O~M) yet still rapid when compared to uncatalyzed (spontaneous) peptide hydrolysis (10⁻¹¹ M). How is this possible if many (but not all) intramolecular enzymatic reactions have molecularities in the range 10²-10⁴ M due to an increased effective concentration of reacting species and favorable entropic effects? This may be accounted for by the reverse binding mode adopted by the prodomain in its interaction with the substrate-binding cleft of the enzyme. As a consequence, the formation of a tetrahedral intermediate at the carbonyl carbon of

Cys42p or Arg40p would not be stabilized by the oxyanion hole formed by Gln23. This structural incompatibility has, therefore, led to the suggestion that it would not be possible for the enzyme to perform such reactions (91). Previous work with oxyanion hole mutants of papain (130) and cathepsin B (131), however, indicates that the specialized glutamine is not an absolute requirement for the hydrolysis of small synthetic substrates but rather is a feature which may improve the catalytic efficiency of these enzymes by only 10- to 100-fold depending on the substrate under study. Alternatively, the slow rate of these unimolecular reactions may also be due to the fact that the reverse complementarity between the bound proregion and the enzyme's substrate-binding cleft causes the distance between the δN of the catalytic His199 and the primary NH (leaving) group of Gly43p to be larger than would be the case for natural substrates (132). Hence, the donation of a proton from the δN of His199 to the leaving amide group of Gly43p may be less efficient than that observed for natural substrates, resulting in a reversible nucleophilic reaction which has difficulty going to completion. Since the catalytic thiol among cysteine proteases is more nucleophilic and simultaneously constitutes a better leaving group than the catalytic hydroxyl found among serine proteases, it has been postulated that proton transfers effected by the δN of catalytic histidines found among cysteine proteases would need to be more efficient than those found among serine proteases (133), i.e., the catalytic histidine found in cysteine proteases must compensate for the lower pKa of the catalytic thiol group. This compensation may involve the rotation of the catalytic imidazole side-chain about its Cβ-Cγ bond as has been observed in the crystal structure of cathepsin B in complex with a pyridyl disulfide inhibitor (134). In this complex, His199 was shown to be rotated 120 degrees relative to its position found in other crystal structures of cathepsin B (82,89-91). In summary, it is advantageous for eukaryotic cells to require that the conversion of these zymogens be under strict regulatory control, namely that of the pH environment in which they are found. Typically, precursors of papain-like enzymes are stable at neutral pH and are prone to autoprocess more efficiently under acidic pH conditions (97-101,109), as is the case in the acidified lysosomal compartment of the cell.

The ability to detect cleavage at the carbonyl carbon of Arg40p indicates that a significant degree of conformational mobility exists for the proregion within the prosegment/substrate-binding cleft interface. This mobility may be accounted for by the pH-dependent stability of the enzyme's occluding loop, which consequently defines the pH dependence of propeptide binding as well as the overall rate of procathepsin B processing (106). Competition between the prosegment and the occluding loop for the surface of the enzyme termed the 'occluding loop crevice' was shown to be regulated by the formation of a salt bridge between His110 of the occluding loop and Asp22 located on the main body of the enzyme. Site-directed mutagenesis of either His110 or Asp22 to an alanine residue produces a variant of procathepsin B which is stable and incapable of autoactivation. Remaining elusive (from the data presented in Chapter 1), however, was whether these mutations caused the perturbation of a unimolecular event involving proteolysis of the prosegment. As procathepsin B is exposed to acidic pH conditions, it is possible that salt bridge formation between His110 and Asp22 promotes competition between the occluding loop and the N-terminal cap of the proregion. From this competition, it follows that the remaining C-terminal residues of the proregion, i.e., prosegment residues which stretch from the substrate-binding cleft to the pro/mature junction, would have reduced affinity for the surface of the enzyme and increased conformational mobility. In agreement with this proposal is the consistent lack of secondary or tertiary structure found within the C-terminal end of papain-like prosegments when bound to the cognate enzyme (85-91). Furthermore, truncated propeptides composed only of these C-terminal residues display weaker affinity for the enzyme than the full-length propeptide (92,117). Additional evidence for mobility within the prosegment has been shown for the propeptide of cathepsin L, which loses most of its tertiary structure yet almost none of its secondary structure at low pH (135). It is believed that the mobility reflected in the high B-factors corresponding to the C-terminal end of papain-like prosegments facilitates the autocatalytic conversion of these zymogens upon their exposure to acidic pH. That the conformational mobility within the C-terminal end of papain-like prosegments is important for autoprocessing is evidenced by the results obtained for procathepsin H ((127); presented in Chapter 3 of this thesis).

Identification of Intermediate Cleavage Sites in Procathepsin S using the Cystatin C Assay - In the case of procathepsin S, the presence of cystatin C is not strictly required to observe an intermediate processing band on SDS-PAGE which is approximately 2 kDa heavier than the mature enzyme (data not shown). In order to ensure the accumulation of sufficient quantities of this species for N-terminal identification, cystatin C was added to procathepsin S under processing conditions to inhibit bimolecular reactions, as has been discussed previously for procathepsin B (Figure 2B). Based on their migration on SDS-PAGE, it is assumed that the intermediate species formed in the presence of cystatin C are identical to those formed in the absence of the inhibitor. This band corresponds to a mixture of species containing N-termini of ↑Ser77p-Leu78p-Arg79p (50%) and ↑Ser73p-Leu74p-Met75p (50%) (Figure 3; cathepsin L prosegment numbering). Despite the low primary amino-acid sequence homology between the prosequences found in cathepsin B and cathepsin S, and although the three-dimensional structure of procathepsin S has yet to be determined, the fold of the prosegments and the mechanism by which they inhibit enzymatic activity are common to all precursors of the papain family reported to date (85-91). The structural homology between the prosegments shown in Figure 3 reveals that the cleavage sites at the carbonyl carbon of residues in position 42p (procathepsin B) and 76p (procathepsin S) are perfectly aligned. This conservation is observed despite the great difference in length for these two prodomains, i.e., 62 residues for procathepsin B and >90 residues for procathepsin S. It follows, therefore, that Ser76p is predicted to bind through the substrate-binding cleft of cathepsin S in the reverse binding mode, i.e., as is the case for Cys42p in procathepsin B, and that its carbonyl carbon is located closest to the catalytic residue, Cys25.

It is interesting to note that cathepsin S prefers to cleave at internal sites within its prosegment where serine residues are located in the S1' position, namely at the putative Lys91p↑Ser92p site near the pro/mature junction to form mature cathepsin S as well as at the Met72p↑Ser73p and Ser76p↑Ser77p sites identified using the cystatin C assay (Figure 3). Located within the prosequence of cathepsin S are two adjacent serine residues, Ser76p and Ser77p. Curiously, proteolysis is only observed at the carbonyl carbon of Ser76p and not at the carbonyl carbon of Met75p. This indicates that cleavage at the carbonyl carbon of Ser76p may be selective. Cleavage at the carbonyl carbon of Met72p indicates a significant amount of conformational mobility among residues of the proregion which bind through the substrate-binding cleft of cathepsin S, i.e., as has been observed for procathepsin B.

The conversion of procathepsin H to its mature form (discussed in Chapter 3) has been proposed to involve cleavage at the carbonyl carbon of Ser77p (83) (Figure 3; cathepsin L prosegment numbering), which is located adjacent to the cleavage sites identified in this study for procathepsin B and procathepsin S using the cystatin C assay. This cleavage has been proposed to be a prelude to the formation of an octapeptide of prosegment residues (Glu78p→Thr85p), called the 'mini-chain', which remains attached to mature cathepsin H via a disulfide bond (136). We have recently determined that procathepsin H, similarly to proaleurain (137), is an unusual zymogen of the cathepsin L-subfamily which is incapable of autoactivation ((127); Chapter 3 of this thesis). This incapacity has been attributed to the pre-formation of a disulfide bridge linking the prosegment to the enzyme domain within the precursor. This additional covalent attachment is believed to restrict the conformational mobility at the C-terminal end of the cathepsin H prosegment. Consistent with this, procathepsin H was found to be incapable of performing cleavage at the carbonyl carbon of Ser77p (or Trp76p) in an intra- or intermolecular manner, thus requiring the action of secondary proteases ((127); Chapter 3 of this thesis).

Continuous Monitoring of Wild-type Procathepsin B and Procathepsin S Autocatalytic Processing - A continuous assay based on the hydrolysis of the substrate Z-Phe-Arg-MCA by the active enzyme generated in the autocatalytic process was used. The rate of substrate hydrolysis increases with time due to the time-dependent release of active enzyme from the precursor until a constant rate is obtained that corresponds to the activity of fully processed enzyme. The curves can be fitted to a model which assumes a first-order increase in rate from an initial rate, v_i, corresponding to the activity of the precursors (if any), to a final rate, v_E, corresponding to the activity of fully processed enzyme. Based on the results of non-linear regression analysis, no significant activity of the precursors against the Z-Phe-Arg-MCA substrate could be detected, and the first-order rate constant for autocatalytic processing, k_obs, increases linearly with proenzyme concentration (Figure 5). The direct link between the rate of autoprocessing and precursor concentration confirms the occurrence of a bimolecular reaction; i.e., intermolecular processing of proenzyme by fully or partially processed (active) enzyme. In support of the unimolecular autoproteolytic reactions discovered using the cystatin C assay (discussed above), the rate constants extrapolated to zero concentration of either procathepsin B or procathepsin S were not null. These kinetic data suggest the occurrence of an intramolecular event, i.e., one that is independent of the concentration of proenzyme, in the processing of these precursors.
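The analysis described above can be illustrated with a short sketch. It assumes that the fitted model is the integrated form of a rate that rises exponentially from an initial value (written v_i here) to a final value v_E with rate constant k_obs; the exact equation used is given in Experimental Procedures and may be parameterised differently. The k_obs values below are hypothetical, generated from the intercept reported for procathepsin B in Figure 5 (2.4 x 10^-4 s^-1) and an arbitrary assumed slope, purely to show how the concentration-independent term is recovered by extrapolation to zero proenzyme.

```python
import math


def rate(t, v_i, v_e, k_obs):
    """Rate of substrate hydrolysis at time t (first-order rise from v_i to v_e)."""
    return v_e - (v_e - v_i) * math.exp(-k_obs * t)


def product_formed(t, v_i, v_e, k_obs):
    """Product released by time t: the time integral of rate() from 0 to t."""
    return v_e * t - (v_e - v_i) * (1.0 - math.exp(-k_obs * t)) / k_obs


def fit_line(xs, ys):
    """Ordinary least-squares straight line; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical data: k_obs (s^-1) at several proenzyme concentrations (nM).
ASSUMED_INTERCEPT = 2.4e-4  # s^-1, value quoted for procathepsin B in Figure 5
ASSUMED_SLOPE = 1.0e-3      # nM^-1 s^-1, arbitrary illustrative value
concentrations = [0.05, 0.10, 0.15, 0.20, 0.25, 0.30]
k_obs_values = [ASSUMED_INTERCEPT + ASSUMED_SLOPE * c for c in concentrations]

slope, intercept = fit_line(concentrations, k_obs_values)

if __name__ == "__main__":
    print("bimolecular term (slope):", slope)
    print("unimolecular term (intercept at zero proenzyme):", intercept)
```

In this reading, the slope reflects intermolecular processing and the non-zero intercept reflects the concentration-independent event.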

Role of the N-terminus of the Mature Segment in the Autocatalytic Processing of Precursors Belonging to the Papain Family - Prior to this work, a non-homology knowledge-based prediction of propapain activation proposed that an intramolecular event may involve the N-terminus of the mature enzyme domain moving towards the active site cleft, thus facilitating the release of the prosegment (118). Using the structure of mature papain as a template (i.e., the effect of the prosegment was not considered), it was proposed that the adjustment of a single β-turn extends the first 12 residues at the N-terminus of the enzyme and allows the pro/mature junction to reach the active site in the substrate-binding mode. For this rearrangement to be possible, it would be expected that the salt bridge between Asp6 and Arg8, which is found in all papain-like enzymes reported to date (82,83,85-91) and is formed by residues which contribute to the β-turn, would also contribute to the overall pH-triggering mechanism of propapain processing. In order to test this hypothesis, site-directed mutagenesis aimed at removing this salt bridge was performed. In this study, we show that Arg8Ala propapain remains competent to autoactivate to form mature protein (Figure 6). These results are consistent with the X-ray crystal structures of papain-like enzymes (pro- and mature enzymes), which show high resolution among residue side-chains located at the N-terminus of the mature segment, corresponding to a region of the molecule which is conformationally constrained (low B-factors). Furthermore, the N-terminus of the mature segment within the precursors (85-91) is in essentially the same conformation as that found in the crystal structures of mature enzyme (82,83), suggesting that no major N-terminal rearrangement occurs during precursor activation. In addition, the overall assumption that the putative site of proteolysis to form mature protein near the pro/mature junction is the only possible cleavage site, as has been proposed for prosubtilisin E (119,120), remains speculative, as an unidentified processing intermediate was observed for propapain at 30 kDa (100). For example, autoprocessing of procathepsin B predominantly yields mature cathepsin B carrying six residues derived from its prosegment (Phe57p→Lys62p) rather than starting at Leu1 (98,101,109), despite the presence of the salt bridge formed by Asp6 and Arg8 within this precursor (89-91). In summary, it is not clear how the N-terminus of the mature segment within this family of precursors could (a) undergo major rearrangement from a low energy state consisting of a conserved salt bridge formed by Asp6 and Arg8 (b) to a stretched conformation consisting of little secondary structure with the concomitant destruction of the Asp6/Arg8 salt bridge, (c) followed by its eventual return to the position shown in the X-ray crystal structures (85-91). High-resolution nuclear magnetic resonance studies may provide further insight into the possible dynamical role that the N-terminus of the enzyme plays in pH-triggered precursor activation.

Nature of the Steps Involved in the Autocatalytic Processing of Procathepsin B and Procathepsin S - Previous studies have established that the reactivity of the catalytic cysteine residue found within the precursors of papain-like enzymes is responsible for the maturation of this family of zymogens (97-101,109). Hence, autoactivation of zymogens belonging to the papain family requires that the precursors possess a preformed and functional catalytic center and substrate-binding cleft. In this study, we have described the identification of novel processing intermediates for procathepsin B and procathepsin S. The intermediates identified for cathepsin B are observable only in the presence of cystatin C, at either very low starting concentrations of proenzyme (low nM) as determined by AgNO3-stained SDS-PAGE (Figure 1) or at higher concentrations of the precursor (µM) (Figure 2). In the case of cathepsin S, processing intermediates are observable on SDS-PAGE in the absence of cystatin C (data not shown). Due to their low quantities, however, their identification is facilitated only by the addition of cystatin C to the reaction. This addition has the desired effect of significantly increasing the population (half-life) of the cathepsin S intermediates. That these novel cleavage products are observed at all starting concentrations of proenzyme, including very low concentrations, suggests that these reactions are occurring as unimolecular processes and that they may be important. The crystal structures of precursors of the papain family (85-91) demonstrate that these cleavage reactions are taking place within a segment of the proregion which binds through the substrate-binding cleft of the enzyme in the reverse binding mode. Following the completion of these proteolytic steps, over 20 residues derived from the C-terminal end of the prosegment remain covalently attached to the mature segment at the pro/mature junction. Intuitively, these intermediate species would be as catalytically competent as the mature enzyme, since the C-terminal end of papain-like prosegments possesses low inhibitory activity as compared to that of the full-length prodomain (92,103,117).

Given the kinetic and structural data presented here, it is tempting to speculate that these reactions are occurring as slow unimolecular steps which may be necessary for triggering the intermolecular proteolytic cascade. It is also possible, however, that these intermediates merely represent bimolecular side-reactions which are more easily isolated using the cystatin C inhibitor. The incubation of inactive procathepsin B (Cys29Ser) with wild-type mature cathepsin B, however, leads only to the formation of mature enzyme, as proteolysis elsewhere in the cathepsin B proregion was not observed (98,101,109). A similar experiment for the inactive procathepsin S (Cys25Ser) has yet to be performed. It is interesting to note that these processing intermediates in cathepsin B (Figure 1) and cathepsin S (data not shown) are also observed at nanomolar concentrations of the precursors. As has been suggested for propapain (97,100), the conversion of procathepsin B and procathepsin S appears to involve mutually non-exclusive processes. The first step may involve the slow intramolecular cleavage reactions presented here, followed by the rapid intermolecular proteolytic cascade performed by the catalytic activity of mature or semi-mature species whose quantities accumulate with time.
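The two-step picture proposed above can be sketched as a simple kinetic scheme. This is an illustrative simplification and not the model fitted in this study: it assumes that proenzyme P is converted to active enzyme E by a slow unimolecular step (rate constant `k_intra`) and by an intermolecular step catalysed by E (rate constant `k_inter`), so that d[E]/dt = (k_intra + k_inter[E])([P]0 - [E]) with [E] = 0 at time zero. The function names and the value chosen for `k_inter` are hypothetical; `k_intra` is set to the intercept reported for procathepsin B in Figure 5.

```python
import math


def active_enzyme(t, p0, k_intra, k_inter):
    """Closed-form [E](t) for dE/dt = (k_intra + k_inter*E) * (p0 - E), E(0) = 0.

    Requires k_intra > 0; p0 and the returned value share the same units.
    """
    decay = math.exp(-(k_intra + k_inter * p0) * t)
    return k_intra * p0 * (1.0 - decay) / (k_intra + k_inter * p0 * decay)


def half_time(p0, k_intra, k_inter):
    """Time at which half of the proenzyme has been converted."""
    return math.log((2.0 * k_intra + k_inter * p0) / k_intra) / (k_intra + k_inter * p0)


if __name__ == "__main__":
    k_intra = 2.4e-4   # s^-1, intercept quoted for procathepsin B in Figure 5
    k_inter = 1.0e-3   # nM^-1 s^-1, arbitrary illustrative value
    for p0 in (0.05, 0.30):  # nM
        print(p0, half_time(p0, k_intra, 0.0), half_time(p0, k_intra, k_inter))
```

In this simplified scheme the exponent governing the approach to full conversion is k_intra + k_inter[P]0, which is linear in the starting proenzyme concentration and remains non-zero as that concentration tends to zero. This mirrors the pattern seen in Figure 5, although the scheme itself is only one of several that could produce it.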

The cystatin C assay described above was attempted with other precursors such as propapain, procathepsin K and procathepsin L. Due to the superior affinity of cystatin C (10~'~-10-'~M) for these enzymes as compared to their prosegments (IO-' M), the capacity of these precursors to autoprocess was essentially eliminated even after prolonged periods (data not shown). Following approximately two weeks of incubation, cystatin C was gradually degraded, due either to its own instability or to proteolysis by catalytic amounts of mature enzyme, followed by the rapid maturation of the proenzymes. How is it possible that the inhibition of procathepsin B activation by cystatin C produces a processing intermediate that is not observed for other precursors using SDS-PAGE? One possible explanation is the decreased affinity of cystatin C for the cathepsin B substrate-binding cleft due to the obstruction caused by this enzyme's occluding loop (105). The presence of the occluding loop in cathepsin B decreases the affinity of cystatin C for this enzyme to that found for the full-length propeptide; i.e., to the low nanomolar rather than the picomolar range (105). In an attempt to decrease the disparity between the affinities of cystatin C and the prodomains of cathepsin K and cathepsin L for these enzymes, a variant of human cystatin C carrying the mutation Gly11Glu was also used (96,129). It was determined that Gly11Glu cystatin C inhibited the activation of procathepsin K and procathepsin L as effectively as the wild-type inhibitor and did not lead to the accumulation of processing intermediates (data not shown).

The pH dependence of propeptide binding which has been observed for several enzymes belonging to the papain family (102-104) suggests that charged residues located within the prosequences, such as the highly conserved Asp65p and Glu70p found within prosegments of the cathepsin L-subfamily (85-88), and not those found on the enzyme, are responsible for the pH-triggered activation of these precursors.

However, we have recently determined that autoactivation in procathepsin B is as unique as its three-dimensional structure (106). The prosequence of cathepsin B is over 30 residues shorter than those of cathepsin L-like prosegments, and the enzyme possesses a unique insertion of 20 residues, termed the occluding loop, which contributes to the primed subsites of the substrate-binding cleft. It was determined that the formation of a salt bridge within cathepsin B, and not its prosequence, stabilizes the occluding loop on the surface of the enzyme, which consequently affects the pH dependence of propeptide binding as well as the overall rate of procathepsin B processing.

It has been proposed that deregulated secretion of papain-like enzymes to the extracellular matrix may serve as a catalyst for propagating tumour metastasis (138) as well as rheumatoid arthritis (7). Since the prodomains are known to act as intramolecular chaperones (84) and serve to stabilize the enzyme at neutral pH, it may be reasonably concluded that these enzymes are targeted to the extracellular matrix in their precursor form. Despite the neutral pH environment generally associated with the extracellular matrix, it has been postulated that microenvironments of acidic pH, called resorption pits, may be the location where low concentrations of zymogens of papain-like enzymes are converted to their mature forms. The discovery of a unimolecular mechanism of processing for this family of precursors may help to explain how lysosomal enzymes (even at low concentrations) are implicated in a number of degradative and invasive pathological conditions extracellularly (7,69-71,138). Although inhibitors with sufficient potency are available for this class of enzymes, they often lack the selectivity needed for therapeutic applications. In addition to the traditional approach of designing substrate-binding cleft-directed inhibitors, an improved understanding of the molecular basis of autoprocessing for zymogens of the papain family may lead to novel therapeutic strategies in which the conversion of proenzyme is blocked. This would have significant consequences given that the prosegments of papain-like cysteine proteases are intrinsic appendices which are both potent and selective inhibitors of the enzyme from which they originate.

ACKNOWLEDGMENTS

We thank France Dumas for the N-terminal sequence determinations and Dr. Irena Ekiel for her generous gift of human wild-type and Gly11Glu cystatin C. We also thank Dr. J. Sivaraman for his assistance in preparing Figure 4, and Daniel Tessier, Dr. Thierry Vernet and Dr. Dave Thomas for providing the constructs of wild-type and Arg8Ala propapain. We are also grateful to Dr. John S. Mort for many valuable discussions.

Figure 1: SDS-PAGE stained with AgNO3. Procathepsin B (37 kDa) at low concentrations (2 nM) exposed to 50 mM acetate buffer (pH 5.0) and 1 mM DTT for 12 hours in the presence of 10 nM (lane 1) and 50 nM (lane 2) human wild-type cystatin C. Lane 3 shows the autocatalytic processing of 2 µM procathepsin B in the absence of cystatin C (heavy band at 30 kDa corresponding to fully processed protein). Note that the proportion of processing intermediate (32 kDa) to mature cathepsin B (30 kDa) increases with increasing concentration of cystatin C.

Figure 2: Polyvinylidene difluoride membranes (Applied Biosystems, ProBlott™ Membranes) stained with Coomassie Brilliant Blue R-250 (Bio-Rad Laboratories) containing immobilized procathepsin B (A), and a mixture of glycosylated and deglycosylated procathepsin S (B). Lane 1 and lane 2 of each membrane represent the conversion of proenzyme to mature enzyme in the absence and presence of cystatin C (Mr = 12,500), respectively. The processing intermediates migrate at 32 kDa and 29 kDa for procathepsin B and procathepsin S, respectively. The 29 kDa species for procathepsin S is detectable in low quantities using SDS-PAGE in the absence of cystatin C (data not shown).

      72p     80p  85p  90p   96p
H     HKYLWSEPQNCSATKSNYLRGTGPY
S     MSLMSSLRVPSQWQRNITYKSNPNR

      38p    45p  50p  55p    62p
B     LKRLCGTFLGGPKPPQRVMFTEDLK

Figure 3: Structural alignment of the C-terminal ends of the prosegments of cathepsin B, cathepsin S, and cathepsin H. Cathepsin B prosegment numbering was used for cathepsin B, and cathepsin L prosegment numbering was used for cathepsin S and cathepsin H. The established cleavage sites to form mature protein near the pro/mature junction and the sites of autoproteolytic processing detected using the cystatin C assay are marked beneath the sequences with two different arrow symbols (arrow positions not reproduced here).

Figure 4: The Active Site Cleft in Rat Procathepsin B (Cys29Ser) (89). The cathepsin B prosegment (red residues) binds through the substrate-binding cleft of cathepsin B (blue residues) in the reverse N→C direction to that taken by natural substrates. The carbonyl carbon of Cys42p is in closest proximity to the catalytic nucleophile (approx. 4.4 Å), and the potential bond angle between the catalytic nucleophile and the carbonyl oxygen of Cys42p is 127 degrees. The carbonyl carbon of Arg40p is not shown.

Figure 5: Continuous Assay for the Autocatalytic Processing of Wild-Type Procathepsin B (A) and Procathepsin S (B). Plots of the first-order rates of processing (k_obs), obtained by nonlinear regression of the data to the equation discussed in Experimental Procedures, as a function of precursor concentration (determined by active site titration with the E-64 inhibitor). The data are in agreement with a first-order rate of processing. For both procathepsins B and S, the rate of processing, k_obs, increases linearly with proenzyme concentration, indicative of a bimolecular reaction which most likely corresponds to intermolecular processing of proenzyme by mature or semi-mature protein. In addition, there is a corresponding rate constant of precursor activation (2.4 x 10^-4 s^-1 for procathepsin B and 8.1 x 10-~s-l for procathepsin S) as the concentration of proenzyme is extrapolated to zero, indicative of an activation event which is independent of the concentration of precursor.

[Plot panels: (A) Procathepsin B Autoprocessing, x-axis [Procathepsin B] (nM), 0.00 to 0.35; (B) Procathepsin S Autoprocessing, x-axis [Procathepsin S] (nM), 0.00 to 0.35.]

Figure 6: Monitoring the autocatalytic processing of wild-type propapain (lane 1) and Arg8Ala propapain (lane 3) using Western blot analysis following their incubation at 65 °C in 50 mM acetate buffer (pH 3.8) and 20 mM cysteine for 30 min (lane 2, wild-type mature papain; lane 4, Arg8Ala mature papain). Molecular weights of the papain precursor and mature form are 37 kDa and 25 kDa, respectively. The Arg8Ala variant of propapain was observed to autoprocess as efficiently as the wild-type precursor.

CONNECTING TEXT FOR CHAPTERS 2 AND 3

In Chapter 2, unimolecular mechanisms of autoproteolytic processing for procathepsin B and procathepsin S were identified. These reactions involve cleavage at internal peptide bonds within the prosegments which bind through the substrate-binding clefts of these enzymes. Curiously, the maturation of procathepsin H has been proposed to involve proteolysis at Ser77p↓Glu78p. This reaction is a prelude to the production of an octapeptide of prosegment residues, termed the mini-chain, which remains covalently attached to the main body of cathepsin H through a stable disulfide bond formed by Cys82p and Cys214. Based on structural analysis, it is interesting to note that Ser77p and Glu78p reside within an area of the cathepsin H prosegment which is homologous to the stretch of amino-acid residues believed to interact with the active site cleft of the enzyme. By analogy with what was identified for procathepsins B and S in Chapter 2 using the cystatin C inhibitor, it was hypothesized in Chapter 3 of this thesis that procathepsin H would be capable of performing unimolecular autoproteolytic processing at Ser77p↓Glu78p. Instead, it has been determined that procathepsin H is an unusual mammalian member of the cathepsin L-subfamily which is incapable of autoactivation. This incapacity has been attributed to the pre-formation of a disulfide bond linking Cys82p and Cys214 and the presence of a unique tryptophan residue within the structure of the cathepsin H precursor. This additional covalent attachment was found to limit the conformational mobility of the C-terminal end of the prosegment, thus rendering the pro/mature junction within the cathepsin H precursor highly resistant to proteolysis.

Contributions of authors other than Omar Quraishi:

Mort, J.S. - Provided the cDNA constructs (procathepsin H integrated into the pPIC9 vector), transformed P. pastoris, wild-type human cathepsin H, and wild-type human cathepsin D.

Storer, A.C. - Provided funding and mentorship for this project.

Functional Expression of Human Procathepsin H in Pichia pastoris and Attempts at its Correct Processing*

CATHEPSIN H LACKING THE MINI-CHAIN RETAINS SIGNIFICANT AMINOPEPTIDASE ACTIVITY

*NRCC Publication No. 00000. The research was funded in part by the Government of Canada's Network of Centres of Excellence Program supported by the Medical Research Council of Canada and the Natural Sciences and Engineering Research Council of Canada through PENCE Inc. (the Protein Engineering Network of Centres of Excellence).

Omar Quraishi‡§, John S. Mort¶, and Andrew C. Storer‡‖* From the ‡Protein Engineering Network of Centres of Excellence and Department of Biochemistry, McGill University, 3655 Drummond Street, Montreal, Quebec, Canada H3G 1Y6, the ¶Joint Diseases Laboratory, Shriners Hospital for Children, 1529 Cedar Avenue, Montreal, Quebec, Canada H3G 1A6, and Protein Engineering Network of Centres of Excellence and Department of Surgery, McGill University, Montreal, Quebec, Canada, and the ‖Pharmaceutical Biotechnology Sector, Biotechnology Research Institute, National Research Council of Canada, 6100 Royalmount Avenue, Montreal, Quebec, Canada H4P 2R2.

* To whom correspondence should be addressed: Pharmaceutical Biotechnology Sector, Biotechnology Research Institute, National Research Council of Canada, 6100 Royalmount Ave., Montreal, Quebec, Canada H4P 2R2. Tel.: 514-496-6256; Fax: 514-496-1629; E-mail: [email protected].

§ Present Address: Biochem Therapeutic Inc., 275 Armand-Frappier Blvd., Laval, Québec, Canada H7V 4A7

RUNNING TITLE: Cathepsin H Expression, Processing, and Activity

ABBREVIATIONS

1 Unless stated otherwise, residue numbering in the text relates to that of cathepsin L as presented in Coulombe, R. et al. (85). Residues in the proregion are identified with the suffix p.

2 In the text, the words 'proregion', 'prosegment', 'prodomain', or 'prosequence' refer to the polypeptide stretch located N-terminal to the mature enzyme in the proenzyme, while the word 'propeptide' refers to the chemically synthesized polypeptide corresponding to the proregion sequence but without the mature enzyme.

3 In the text, the terms 'autoactivation', 'autoprocessing', or 'maturation' relate to the ability of an enzyme (pro- or mature) to convert its own precursor to a mature protein by cleaving at or near the pro/mature junction.

4 The abbreviations used are: E-64, trans-epoxysuccinyl-L-leucylamido-(4-guanidino)butane; Arg-MCA, L-arginine 4-methylcoumarinyl-7-amide; Z-Phe-Arg-MCA, benzyloxycarbonyl-L-phenylalanyl-L-arginine 4-methylcoumarinyl-7-amide; SDS-PAGE, sodium dodecyl sulfate-polyacrylamide gel electrophoresis; DMSO, dimethyl sulfoxide; EDTA, ethylenediaminetetraacetic acid; DTT, dithiothreitol.

Abstract:

Within the papain family of cysteine proteases, cathepsin H is unusual in that it displays mono-aminopeptidase activity. This capacity has been linked mainly to an octapeptide, termed the mini-chain, that blocks the unprimed subsites of this enzyme's substrate-binding cleft. The mini-chain, derived from residues located within the C-terminal end of the cathepsin H proregion, remains covalently linked to the main body of the enzyme following activation of the precursor through a disulfide bond formed by Cys82p and Cys214 (83). The mechanisms leading to the formation of mature cathepsin H, however, have yet to be elucidated. To date, all reported attempts to isolate cathepsin H have required that it be purified from natural sources. Here, we report that a cDNA encoding the human cathepsin H precursor has been expressed as an α-factor fusion construct in the methylotrophic yeast Pichia pastoris. Unlike most proenzymes belonging to the papain family, procathepsin H was determined to be incapable of autoprocessing under acidic pH conditions in vitro, and mature cathepsin H does not independently contribute to the conversion of its own precursor. The conversion of procathepsin H to its mature form, therefore, requires the action of other proteases. For example, cathepsin D has been found to cleave at the N-terminus of the mini-chain in the presence of SDS detergent. Furthermore, prosegment residues located near the pro/mature junction were found to be resistant to proteolysis. Homology modelling suggests that the unusual stability of the cathepsin H precursor to autocatalysis and cleavage near the pro/mature junction is attributable to the pre-formation of a disulfide bond linking the prodomain to the enzyme. This feature, unequaled in other mammalian precursors of the papain family, serves to restrict the conformational mobility at the C-terminal end of the proregion. Proteolysis near the pro/mature junction of procathepsin H may only be achieved using a secondary protease such as cathepsin L in the simultaneous presence of DTT and SDS detergent. Interestingly, this isoform of cathepsin H retains its aminopeptidase activity towards the Arg-MCA substrate; i.e., with a kcat/KM = 1,950 M^-1 s^-1 versus 11,700 M^-1 s^-1 and 4.2 M^-1 s^-1 for human wild-type cathepsin H and cathepsin L, respectively. These results suggest that the mini-chain serves to optimize, yet is not strictly required for, the aminopeptidase (exopeptidase) activity of cathepsin H.

Therefore, an additional role for the mini-chain is to limit the endopeptidase activity of cathepsin H.

Cathepsin H (EC 3.4.22.16) is a ubiquitously expressed lysosomal enzyme belonging to the papain superfamily of cysteine proteases (108), which includes the mammalian cathepsins B, K, L, and S. The identification of downstream targets and the precise physiological roles of cathepsin H have yet to be determined. Increased cathepsin H activity, however, has been correlated with human glioma cell invasion (139). Furthermore, the expression level of cathepsin H or cathepsin H-like enzymes (140) has been shown to be increased in other disease states such as melanoma and tumor metastasis (141) as well as breast carcinoma (142). Cathepsins B (107), C (9), H (62) and bleomycin hydrolase (143) are the only known exopeptidases of the papain family. Cathepsin B is primarily a carboxydipeptidase, while cathepsin C is an aminodipeptidase. The carboxydipeptidase activity of cathepsin B has been linked to a unique insertion of 20 residues, called the occluding loop, which contributes to the primed subsites of this enzyme's substrate-binding cleft. Conversely, a feature which distinguishes bleomycin hydrolase and cathepsin H from all other mammalian members of the papain family is their mono-aminopeptidase activity; i.e., cleaving a single residue from the N-terminus of an extended polypeptide substrate. The aminopeptidase activity of bleomycin hydrolase is accounted for by the protrusion of the C-terminal end of the enzyme into the active site cleft (144). Similar to the role of the occluding loop in the carboxydipeptidase activity of cathepsin B, it has been proposed that the aminopeptidase activity of cathepsin H is due mainly to the presence of an octapeptide, termed the mini-chain, that blocks the unprimed subsites of this enzyme's substrate-binding cleft (83). Comparison of the X-ray crystal structures for this family of zymogens (85-91) to the recently determined crystal structure of mature porcine cathepsin H (83) has raised many questions concerning the maturation mechanism for the cathepsin H precursor in particular, and that for zymogens of the papain family in general.

Cathepsin H isolated from subcellular fractions has been shown to possess different N-terminal sequences. Typically, this enzyme contains the N-terminal sequence corresponding to the pro/mature junction, namely Gly(-2)-Pro(-1)-Tyr1-Pro2-Pro3 and Tyr1-Pro2-Pro3 (136). The major isoform of cathepsin H has also been shown to possess an additional N-terminal sequence corresponding to a glycosylated octapeptide (Glu78p-Pro-Gln-Asn-Cys-Ser-Ala-Thr85p), referred to as the mini-chain, which is composed of residues located at the C-terminal end of the cathepsin H proregion. Following maturation of the cathepsin H precursor, the mini-chain has been shown to remain attached to the enzyme domain through a disulfide bridge. The crystal structure of mature porcine cathepsin H (83) reveals how the mini-chain interacts with the main body of the enzyme. Significantly, the disulfide bridge linking the mini-chain to the catalytic domain is composed of two cysteine residues which are unique to cathepsin H (mammalian), aleurain (barley), and oryzain γ (rice seeds): Cys82p of the prodomain and Cys214 located on the enzyme (128). Contrary to full-length proregions, which have been reported to bind in the reverse substrate-binding mode through the active site clefts of their cognate enzymes (85-91), the mini-chain interacts with the enzyme in the same direction as that taken by natural substrates (83).

Since the fold of the prosegments and their mechanism of inhibiting enzymatic activity are conserved among the zymogen structures reported to date (85-91), it follows that the terminal residues of the mini-chain, i.e., Glu78p and Thr85p, essentially exchange places with one another during the activation of procathepsin H. The 'flipping' of the mini-chain enables Thr85p to be strategically positioned within the S2 subsite of the enzyme's active site cleft with its free carboxyl group facing the S1 subsite (83); i.e., the carboxyl group of Thr85p serves to accept the positively charged N-terminus of the bound substrate. Based on the structure of the mature protein, it has been proposed that maturation of procathepsin H involves 'clipping' at the N-terminus of the mini-chain (carbonyl carbon of Ser77p) as a first step, followed by a second cleavage near the pro/mature junction, with the disulfide bond linking Cys82p and Cys214 remaining intact (83). In addition, it was proposed that the remaining prosegment residues (C-terminal end of the prosegment) 'flip' such that they bind to the enzyme in the substrate-binding mode, followed by intramolecular 'trimming' reactions leading to the mature mini-chain with Thr85p at its C-terminus (83). Here, we attempt to elucidate the sequence of proteolytic events leading to the formation of mature cathepsin H in vitro as well as to identify the maturase(s) responsible for these processes. Interestingly, model building of procathepsin H (Figure 1 of Chapter 3) predicts that residue Glu78p of the cathepsin H proregion binds through the substrate-binding cleft in the reverse mode and that the carbonyl carbon of Ser77p is in closest proximity to the catalytic center. Evidence for the capacity of cathepsin H to perform proteolysis at Ser77p↓Glu78p would be of significant interest, since a unimolecular mechanism of processing for procathepsin B and procathepsin S has been identified ((145); Chapter 2 of this thesis). These observations, therefore, suggest that a unimolecular proteolytic event at the Ser77p↓Glu78p site is a plausible mechanism leading to procathepsin H maturation. Confirmation of such a process would have important implications for other zymogens belonging to the papain family.

Prior to this work, natural sources such as rabbit liver (146) and lung (147), rat skin (148), human liver (149,150), placenta (151) and kidney (152), porcine (83) and bovine spleen (153), as well as bovine and human brain (154), have been used to isolate cathepsin H. Moreover, the activity yields in all of these cases were very poor. The expression of recombinant proenzymes belonging to the papain family of cysteine proteases has been achieved as an α-factor fusion construct in the methylotrophic yeast Pichia pastoris (85,88,89,99,105-107). We have used this system to produce human procathepsin H, which may be easily purified and thus facilitates the monitoring of procathepsin H processing in vitro. The ability of procathepsin H to perform intramolecular 'self-cleavage' at the N-terminus of the mini-chain, i.e., at the carbonyl carbon of Ser77p, is also investigated. Furthermore, we present a structure-activity relationship for different isoforms of cathepsin H towards the exopeptidase substrate, Arg-MCA.

EXPERIMENTAL PROCEDURES

Production of Active Cathepsins B, D, H, K, L, and S - Active human cathepsin H and cathepsin D were obtained from the laboratory of Dr. John S. Mort (Joint Diseases Laboratory; Shriners Hospital for Children) and Arg-MCA substrate was purchased from Bachem Bioscience Inc. Cathepsins B, K, L, and S were each expressed as a precursor in P. pastoris as described in this study for procathepsin H. Each precursor was activated by subjecting it to acidic pH and mildly reducing conditions. Each processed enzyme was independently purified on a fast-flow column of SP-Sepharose resin (Pharmacia) using 50 mM sodium acetate (pH 5.0). Each enzyme was eluted from the column using a linear gradient of increasing NaCl concentration. To each enzyme was added Hg2Cl2 to a final concentration of 0.1 mM, and all samples were stored at 4 °C.

Homology Model of Procathepsin H - The crystal structure of procathepsin L (85) (over 40% sequence homology to procathepsin H) was used as a template to construct a homology model of procathepsin H. The aromatic side-chains extending from the substrate-binding cleft to the prosegment-binding loop (exosite) are highly conserved among proenzymes belonging to the papain family. The side-chains of residues found in the prosegment of procathepsin L were sequentially substituted with those found in the cathepsin H precursor using ON0 version 5.10. The prosegment of cathepsin H was also oriented such that the side-chain of Cys82p is in closest proximity to Cys214; i.e., conducive to forming a disulfide bond. Finally, Insight II was used to present the model shown in Figure 1 of Chapter 3.

Construction of Wild-Type Human Procathepsin H - A cDNA construct consisting of human wild-type procathepsin H as a fusion with the preproregion of yeast α-factor had been prepared (obtained from the laboratory of Dr. John S. Mort, Joint Diseases Laboratory, Shriners Hospital for Children). All sites for glycosylation, AsnSOp, Asn81p and Asnl il, were kept intact. The XhoI and NotI restriction sites were then introduced into the C-terminal segment of the α-factor preproregion and the 3'-untranslated region of procathepsin H, respectively, to allow subcloning into the Pichia pastoris expression vector pPIC9 (Invitrogen Inc., San Diego, California).

Expression of Procathepsin H in P. pastoris - For integration into the Pichia genome, the pPIC9-based constructs were linearized by cleavage with BglII and purified. The P. pastoris host strain GS115 (Invitrogen) was then transformed with the linearized constructs by electroporation. Positive transformants were grown at 30°C for 2 days in medium containing glycerol as the carbon source, followed by incubation in the presence of methanol for a further 3 days to induce expression of recombinant protein. Protein secreted into the culture supernatant was analyzed by non-reducing SDS-PAGE.

Purification of Procathepsin H - The culture supernatant (250 ml) was concentrated to 40 ml using an Amicon stirred-cell (YM-10 membrane). During concentration, the supernatant was dialyzed against 50 mM Tris (pH 7.4) containing 1.6 M (NH4)2SO4. Concentrated recombinant proenzyme was then purified on an FPLC system using a butyl-sepharose fast flow column (Pharmacia Inc.). Proenzyme fractions were eluted from the column by applying a linear gradient of decreasing ammonium sulfate concentration. Both glycosylated and deglycosylated procathepsin H eluted at 0.3-0.6 M (NH4)2SO4 and the samples were stored at 4°C.

In Vitro Processing of Procathepsin H - For autoprocessing assays, purified samples of wild-type human procathepsin H were dialyzed against 50 mM acetate (pH 4.0-5.0) and 1 mM DTT at various temperatures (25, 37, and 60°C). For bimolecular processing (in trans) assays, catalytic amounts of lysosomal cysteine proteases such as human cathepsins B, H, K, L, and S were each added to procathepsin H (molar ratio of procathepsin H/exogenous protease > 250/1) under mildly reducing conditions (+DTT-SDS). For the addition of the aspartic proteases, 50 units/ml of immobilized pepsin (Sigma-Aldrich Canada Ltd.) was incubated with procathepsin H at pH 4.7 for two hours at 37°C followed by its removal by filtration. The addition of human cathepsin D was carried out under the same conditions as those for pepsin (-DTT-SDS). Following incubation with the various proteases, each reaction was then treated with 10 µM of either E-64 or pepstatin. Furthermore, processing reactions were also attempted in the presence of 0.1 mM SDS using either cathepsin L or cathepsin D as catalyst.
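As a worked illustration of the > 250/1 molar ratio quoted above, the following minimal Python sketch computes the largest exogenous protease concentration compatible with that ratio. The function name and the 5 µM procathepsin H concentration are hypothetical and are not taken from this study.

```python
def max_protease_concentration(zymogen_uM, min_molar_ratio=250.0):
    """Upper bound (in µM) on exogenous protease for a given zymogen concentration.

    Keeping the protease below this bound keeps the zymogen/protease
    molar ratio above min_molar_ratio (250 in the assays described above).
    """
    if zymogen_uM <= 0 or min_molar_ratio <= 0:
        raise ValueError("concentration and ratio must be positive")
    return zymogen_uM / min_molar_ratio


# Hypothetical example: 5 µM procathepsin H (assumed value, not from the thesis).
limit_uM = max_protease_concentration(5.0)
print(f"protease must stay below {limit_uM * 1000:.0f} nM")
```

With the assumed 5 µM of zymogen, the bound works out to 20 nM of added protease.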

N-Terminal Identification of the Cathepsin H Isoforms - Isoforms of cathepsin H immobilized in non-reducing polyacrylamide gels were electroblotted onto hydrophobic polyvinylidene difluoride membranes using the method described previously (124). The membranes were then stained with Coomassie Brilliant Blue R250 (Bio-Rad Laboratories) and each protein band of interest was subjected to a minimum of five cycles of automated solid-phase Edman degradation. In some cases, aqueous samples of cathepsin H were applied directly for N-terminal sequence analysis without prior purification.

Active Site Titration - Due to the inability of procathepsin H to autoactivate, exogenous proteases were needed to produce various isoforms of active cathepsin H with an exposed active site. A pre-determined catalytic amount of cathepsin L was added to procathepsin H in the presence of 0.1 mM SDS and this reaction was incubated at 37°C for 1 hr. The active site titrant, E-64, was determined to be a slow-binding inhibitor of cathepsin H as compared to cathepsin L. For this reason, the reaction mix of cathepsin H and cathepsin L was first treated with a pre-determined excess of E-64 ([E-64] at which no activity is detected for either enzyme; i.e., towards Arg-MCA for cathepsin H and Z-Phe-Arg-MCA for cathepsin L). Finally, aliquots of cathepsin L were added to this mixture until its activity towards the Z-Phe-Arg-MCA substrate was detected. With volume corrections taken into account, the original concentration of cathepsin H was calculated as being equal to the difference between [E-64]TOTAL and [cathepsin L]TOTAL when fluorescence generated by Z-Phe-Arg-MCA cleavage was recovered.
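The back-titration arithmetic described above can be sketched as follows. This is a minimal illustration that assumes a 1:1 stoichiometry between E-64 and each active site, and it works in moles so that the volume correction reduces to dividing by the volume of the original cathepsin H sample. The function name and the numerical inputs are hypothetical.

```python
def cathepsin_h_concentration(e64_nmol, cathepsin_l_nmol, sample_volume_ml):
    """Original cathepsin H concentration (µM) from an E-64 back-titration.

    e64_nmol:         total E-64 added to the mixture
    cathepsin_l_nmol: total cathepsin L present when Z-Phe-Arg-MCA
                      cleavage is first recovered (endpoint)
    sample_volume_ml: volume of the original cathepsin H sample
    Assumes one E-64 molecule inactivates one active site.
    """
    if sample_volume_ml <= 0:
        raise ValueError("sample volume must be positive")
    cathepsin_h_nmol = e64_nmol - cathepsin_l_nmol
    if cathepsin_h_nmol < 0:
        raise ValueError("endpoint overshoot: more cathepsin L than E-64")
    return cathepsin_h_nmol / sample_volume_ml  # nmol/ml is equivalent to µM


# Hypothetical numbers, for illustration only.
print(round(cathepsin_h_concentration(0.50, 0.35, 0.1), 3))
```

Working in moles rather than concentrations avoids having to track the changing total volume as cathepsin L aliquots are added.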

Enzyme Assays - Kinetic fluorescence measurements were carried out using a SPEX Fluorolog-2 spectrofluorometer which monitored MCA formation using an excitation wavelength of 380 nm and a detection wavelength of 440 nm. A final concentration of 1.0 nM cathepsin H was used for each assay. The KM of wild-type human cathepsin H for the exopeptidase Arg-MCA substrate was determined by the non-linear fitting of measured initial velocities at different concentrations of substrate to the Michaelis-Menten equation. Since the KM was estimated to be 0.282 mM, a substrate concentration of 10 µM was used for each assay to estimate kcat/KM values ([S] << KM). Previous studies on cathepsin H purified from cow brain (154) resulted in KM values of 0.169 mM and 0.195 mM at pH 7.0 and 37°C for Arg-2-naphthylamide and Arg-p-nitroanilide, respectively. In the present study, assays were performed at 25°C using 50 mM phosphate (pH 6.0) buffer containing 0.2 mM EDTA, 1 mM DTT, and 3% DMSO. Despite the pre-incubation with 0.1 mM SDS in some assays, each isoform of cathepsin H was sufficiently stable under the assay conditions used for the time required. As a control, no activity of cathepsins B, D, K, L, or S towards the Arg-MCA substrate was detected during the time required for all assays (≤ 1 hr), thus confirming that production of free MCA (from Arg-MCA) was due strictly to the aminopeptidase activity of the various cathepsin H isoforms. From these assays, values of kcat/KM were obtained using the equation v = [E][S]kcat/KM at [S] << KM.
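The relation v = [E][S]kcat/KM can be illustrated with a short Python sketch. The enzyme concentration, substrate concentration, and KM are those stated above; the initial velocity is a hypothetical value back-calculated for illustration and is not a measurement from this study. The second function, also an illustrative addition, estimates how much the [S] << KM shortcut underestimates kcat/KM relative to the full Michaelis-Menten equation.

```python
def kcat_over_km(v0_M_per_s, enzyme_M, substrate_M):
    """Estimate kcat/KM (M^-1 s^-1) from one initial velocity, valid when [S] << KM."""
    if enzyme_M <= 0 or substrate_M <= 0:
        raise ValueError("concentrations must be positive")
    return v0_M_per_s / (enzyme_M * substrate_M)


def approximation_shortfall(substrate_M, km_M):
    """Fraction by which the [S] << KM shortcut underestimates kcat/KM.

    From v = kcat[E][S]/(KM + [S]): estimate/true = KM/(KM + [S]),
    so the shortfall is [S]/(KM + [S]).
    """
    return substrate_M / (km_M + substrate_M)


E = 1.0e-9      # 1.0 nM cathepsin H (from the text)
S = 10.0e-6     # 10 µM Arg-MCA (from the text)
KM = 0.282e-3   # 0.282 mM (from the text)
v0 = 1.17e-10   # M/s; hypothetical, back-calculated for illustration only

print(kcat_over_km(v0, E, S))
print(approximation_shortfall(S, KM))
```

Under the stated conditions ([S] = 10 µM, KM = 0.282 mM), the shortcut underestimates kcat/KM by roughly 3.4%, which is consistent with treating [S] as much smaller than KM.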

Non-Reducing SDS-PAGE Analysis - Proteins were treated with 10 µM of E-64 or pepstatin (for reactions using pepsin and cathepsin D). When necessary, deglycosylated procathepsin H was obtained by incubation with 1 unit of endoglycosidase H (Boehringer Mannheim) for 1 hr at 37°C. In order to obtain accurate determinations of the sites of proteolysis due to the activities of cathepsins D, K, L, and S; i.e., in order to ensure the integrity of the disulfide bond linking Cys82p and Cys214, the cathepsin H isoforms were purified using non-reducing SDS-PAGE (12% gels). This was followed by blotting the proteins onto polyvinylidene difluoride membranes for direct Edman degradation. It should be noted, however, that the apparent molecular weights of the various cathepsin H isoforms are based on their migrations in non-reducing SDS-PAGE (Figure 2A and 2B) rather than standard (reducing) SDS-PAGE.

Modelling of Human Procathepsin H - The primary amino-acid sequence homology between procathepsin H and procathepsin L is over 40% and the structural homology among proenzymes belonging to the papain family; i.e., the fold of the prosegments and their mechanism of inhibiting enzymatic activity, is also conserved (85-91). In an attempt to construct a model of procathepsin H, the tertiary fold of procathepsin L was used as a template. Depicted in Figure 1 are the unique cysteine residues, Cys82p and Cys214, found only in procathepsin H among the mammalian members of papain-like enzymes as well as in proaleurain (barley) and pro-orizain γ (rice seeds). Figure 1 reveals that the two cysteines are in close proximity to one another and located within the unprimed segment of this enzyme's substrate-binding cleft. Furthermore, the scaffold of residues located on the enzyme, Phe143, Tyr146, Tyr151, Trp189, and Trp193, is also highlighted. Each of these residues contributes to an intricate ladder of hydrophobic interactions which are highly conserved among members of the papain family. These residues extend from the hydrophobic prosegment-binding loop (exosite) composed of Tyr146 and Tyr151, to Trp189 located within the primed subsites of the enzyme's substrate-binding cleft. It has also been documented that residue Phe56p of the proregion (not shown in Figure 1) binds to the prosegment-binding loop in procathepsin L (85-88) and is positioned perpendicularly to Tyr151. This position within the prosequence is typically occupied by an aromatic residue. For example, Phe56p is replaced by a tyrosine in procathepsin K and procathepsin S, and a tryptophan (Trp24p) in procathepsin B (85-91). The insertion of these hydrophobic residues into the exosite contributes to the hydrophobic ladder and provides added stability to the prosegment-enzyme complex. The prosegment of cathepsin H is unusual in that, in addition to possessing a phenylalanine residue in the 56p position, it also contains a unique tryptophan residue, Trp76p, which is predicted to bind in close proximity to the S1' subsite of the substrate-binding cleft in cathepsin H. Interestingly, as the prosegment is oriented such that the side chains of Cys82p and Cys214 are capable of forming a disulfide bond, the side chain of Trp76p is shown to stack against Trp189 and is in close proximity to the catalytic His163. Although the thiolate-imidazolium ion-pair has been shown to be preformed, and presumably functional, in the structures of zymogens belonging to the papain family as is found in the corresponding mature enzymes, the presence of Trp76p in procathepsin H may induce the catalytic His163 to stack against this unique aromatic side chain rather than to form a stable ion-pair with the catalytic cysteine, Cys25.

Expression and Purification of Procathepsin H - The recombinant cathepsin H precursor was expressed at far lower levels in Pichia pastoris than other precursors reported previously; i.e., approximately 1.0 mg/liter of culture medium as opposed to 10-20 mg/liter for rat procathepsin B (89), human procathepsin L (85), and human procathepsin K (88). The proenzyme was purified using hydrophobic interactions (butyl-sepharose resin, Pharmacia Inc.) under neutral pH conditions. The cathepsin H proenzyme was heterogeneous due to modification of the N-linked oligosaccharide moieties on the proregion and enzyme domain and migrated with an apparent molecular mass ranging from 45 to 60 kDa with a dominant band at 50 kDa in non-reducing SDS-PAGE (Figure 2). Following enzymatic deglycosylation with endoglycosidase H (Boehringer Mannheim), procathepsin H migrated as a single band with the expected size of 37 kDa (data not shown).

Processing of Procathepsin H In Vitro - Exposure of procathepsin H to conditions similar to those used for most other zymogens of the papain family; i.e., acidic pH and mildly reducing conditions, was insufficient to promote autoactivation of either glycosylated or deglycosylated forms of the proenzyme. The addition of active cathepsins D, K, L, and S, but not cathepsin H (Figure 2A) or cathepsin B, to the cathepsin H precursor (+DTT-SDS) yielded intermediate species (Figure 2B and Figure 3) with N-termini corresponding to a confined area of the prodomain which is solvent exposed and accessible to proteolytic processing in trans. This segment of the prosequence includes Asp65p and Glu70p, which are highly conserved residues among the prosegments of precursors belonging to the cathepsin L subfamily (>90 residues) (Figure 3). The crystal structure of human procathepsin L (85) reveals that Asp65p forms a salt bridge with Arg31p, and Glu70p contributes to salt bridge formation with both Arg31p and Glu27p. These conserved interactions serve to fold helices α1p (residues 6p-19p) and α2p (residues 25p-51p) in close proximity to helix α3p (residues 68p-75p). Curiously, it was proposed by Vernet et al. (100) that the N-terminal sequence of an intermediate band of propapain processing corresponded to a sequence residing in proximity to the segment Gly/Ala59p-Xxx-Asn-Xxx-Phe-Xxx-Asp65p. This conclusion was based on the apparent molecular weight of the intermediate as determined by its migration using Western blot analysis (100).

Production of Active Cathepsin H - In order to produce cathepsin H lacking the majority of prosegment residues, and consequently the mini-chain, the precursor must be incubated with catalytic amounts of a secondary protease, such as cathepsin L, in the simultaneous presence of DTT and SDS detergent (+DTT+SDS). Under these conditions, proteolysis at Arg92p↑Gly93p, located near the pro/mature junction, was observed (Figure 3). In the absence of SDS detergent, however, this site is not hydrolysed by cathepsin L. Under these conditions (i.e., +DTT-SDS), the major isoform of cathepsin H produced corresponded to an N-terminal sequence of ↑Glu70p-Ile71p-Lys72p (Figure 3), thus suggesting that over 20 residues from the prosegment continue to remain covalently attached to the enzyme via the pro/mature junction and the disulfide bridge formed by Cys82p and Cys214. Furthermore, incubation of procathepsin H with active cathepsin K produces an N-terminal sequence of cathepsin H identical to that observed with cathepsin L, and cathepsin S cleaves at ↑Phe68p-Ala69p-Glu70p. Conversely, procathepsin H was resistant to cleavage by the exopeptidase activities of cathepsin H and cathepsin B.

Incubation of procathepsin H with pepsin leads to hydrolysis at the ↑Asn61p-Gln62p-Phe63p site. Following incubation with cathepsin D (-DTT-SDS), the major cathepsin H isoform produced was determined to have the N-terminal sequence ↑Ala69p-Glu70p-Ile71p. Upon the addition of SDS detergent (+DTT+SDS), however, cathepsin D was capable of cleaving at the N-terminal end of the mini-chain, ↑Glu79p-Pro80p-Gln81p (25% of signal), in addition to the ↑Ala69p-Glu70p-Ile71p site (75% of signal).

Activity of Procathepsin H - The progress curves shown in Figure 4 indicate that cathepsin H is capable of cleaving the Arg-MCA substrate regardless of whether it contains the mature mini-chain. For example, most intermediates of processing continue to contain over 20 residues derived from the C-terminal end of the cathepsin H proregion (attached covalently at two sites to the enzyme), yet they demonstrate significant exopeptidase activity relative to cathepsin H containing the mature octapeptide. The kcat/KM value obtained for mature cathepsin H (containing the mini-chain) was determined to be 11,700 M⁻¹s⁻¹. For the cathepsin H intermediate produced following incubation with cathepsin L (+DTT-SDS; ↑Glu70p-Ile71p-Lys72p), the value was determined to be 889 M⁻¹s⁻¹. Finally, cathepsin H lacking the majority of prosegment residues, produced by incubation of the zymogen with cathepsin L in the simultaneous presence of DTT and SDS detergent (+DTT+SDS; ↑Gly93p-Thr94p-Gly95p), was determined to have a kcat/KM of 1,950 M⁻¹s⁻¹. Interestingly, this isoform of cathepsin H is highly homologous in sequence and three-dimensional structure to mature human cathepsin L. Similarly to mature human cathepsins B, D, K, and S (data not shown), however, cathepsin L is shown to display poor activity towards the Arg-MCA substrate (Figure 4).
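To put the three kcat/KM values reported above on a common scale, the following minimal Python sketch expresses each isoform's activity as a fraction of that of mature cathepsin H. The dictionary keys and the function name are illustrative labels only; the numerical values are those given in the text.

```python
# kcat/KM values towards Arg-MCA, in M^-1 s^-1, as reported above.
KCAT_OVER_KM = {
    "mature (mini-chain)": 11700.0,
    "cathepsin L, +DTT-SDS": 889.0,
    "cathepsin L, +DTT+SDS": 1950.0,
}


def relative_activity(values, reference):
    """Return each kcat/KM as a fraction of the reference isoform's value."""
    ref = values[reference]
    return {name: value / ref for name, value in values.items()}


rel = relative_activity(KCAT_OVER_KM, "mature (mini-chain)")
for name, fraction in rel.items():
    print(f"{name}: {100 * fraction:.1f}% of mature cathepsin H")
```

On this scale the +DTT-SDS intermediate retains about 7.6% of the mature activity, and the +DTT+SDS isoform about 16.7%, or roughly one-sixth.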

DISCUSSION

The ability of zymogens belonging to the papain family to undergo autocatalytic maturation is predicated on their containing an intact active-site machinery; i.e., a preformed catalytic ion-pair and substrate-binding cleft, similar to that found in the mature enzyme. Proenzymes of the papain family are more stable at neutral pH and are prone to autoprocess under acidic conditions (97-101,209) such as those found in the mature lysosomal compartment of the cell. This observation implies that electrostatic interactions are critical in regulating the stabilization or destabilization of these prosegment/enzyme complexes and correlates with the demonstrated pH dependence of inhibition by the propeptides of cathepsin B (102,106), cathepsin L (103), and cathepsin S (104) towards their parent enzyme. Recently, we have determined that the occluding loop of cathepsin B defines the pH dependence of inhibition by its propeptide, and consequently, regulates the molecular switch for procathepsin B autoactivation (106). The formation of a critical salt bridge on cathepsin B, involving His110 of the occluding loop and Asp22 located near the S2' pocket of this enzyme's substrate-binding cleft, helps to stabilize the closed form of the loop at low pH and allows it to compete with the propeptide for the surface of the enzyme (106). Interestingly, the absence of an occluding loop and the presence of longer proregions in most other precursors of the papain family, such as those of the cathepsin L-subfamily (>90 residues versus 62 residues in procathepsin B), imply that the pH-triggering mechanism of autoprocessing for procathepsin B may be unique to that enzyme, and that the important 'activating' salt bridges among cathepsin L-like precursors are likely to reside predominantly within the prodomains. In mammals, procathepsin H is a unique member of the cathepsin L-subfamily of zymogens in that it contains a conformational constraint at the C-terminal end of its prosegment. This constraint is the pre-formation of a disulfide bond linking Cys82p of the proregion to Cys214 located on the enzyme, which has been shown to remain intact in mature porcine cathepsin H (83). Therefore, a detailed characterization of the mechanism of processing for procathepsin H in vitro would provide unique insights into the conformational requirements of the prodomain for efficient autoprocessing to occur among zymogens of the papain family. Here, we attempt to elucidate the mechanism of processing for the cathepsin H precursor whose prosegment shares the same length (>90 residues) and high identity (>40%) with that of cathepsin L (Figure 3), but contains amino-acid residues which are unequaled among the mammalian homologs of papain; i.e., Trp76p and Cys82p within the proregion and Cys214 located on the main body of the enzyme.

Prior to this work, the only precursors of the papain family which were found to be incapable of autoprocessing were those for papaya proteinase IV (PPIV) (126) and the plant equivalent of cathepsin H, a barley vacuolar thiol protease known as aleurain (137,155). The inability of pro-PPIV to autoprocess has been attributed mainly to its crowded active-site cleft containing the unique residues Glu23 and Arg65 (papain numbering), which confer strict specificity of this enzyme for substrates with a glycine residue in the P1 position. In the case of proaleurain, it was determined that this precursor was not capable of autoprocessing in the absence of aleurone cell extracts and that mature (active) aleurain did not participate in the processing of its own precursor (137). Curiously, the prosequence and catalytic domain of aleurain share the identical residues Cys82p and Cys214, respectively, with those found in cathepsin H and orizain γ (128) (Figure 3). Similar to what was observed for proaleurain, we report that procathepsin H is a stable prosegment/enzyme complex and incapable of autoprocessing in vitro. On the basis that the cathepsin H proregion inhibits the enzyme using the reverse substrate-binding mode as has been reported for other proregions (85-91), the disulfide bridge linking Cys82p to Cys214 would significantly reduce the degrees of conformational freedom at the C-terminal end of the prosequence; i.e., from prosegment residues which bind through the substrate-binding cleft of cathepsin H to those near the pro/mature junction (Ser77p-Glu78p-Pro79p → Tyr1-Pro2-Pro3). Therefore, regardless of the pH conditions to which procathepsin H is subjected, the disulfide bond may serve to eliminate the pH dependence of prosegment binding that has been demonstrated for non-covalent propeptide/enzyme complexes (102-104,106). The negative charge of a highly conserved aspartate residue in the proregion of propapain (Asp65p) was shown to be important in maintaining the papain precursor in a latent form and to participate in an electrostatic triggering mechanism of propapain processing (100). Despite the conservation of both Asp65p and Glu70p within the cathepsin H proregion (Figure 3), it may be reasonably assumed that the Cys82p/Cys214 disulfide bridge impedes the prodomain from dissociating from the surface of the enzyme.

Full-length procathepsin H is void of catalytic activity towards synthetic substrates such as Z-Phe-Arg-MCA or Arg-MCA (data not shown). This observation suggests that the prosegment/enzyme complex is stable and that access of small substrates to the enzyme's substrate-binding cleft is restricted. Structural alignment and model building indicate that the prosegment of cathepsin H contains a unique tryptophan residue, Trp76p, which is predicted to bind near the S1' subsite of the enzyme's substrate-binding cleft (Figure 1). Curiously, there exists no other papain-like prosegment which contains a bulky aromatic group in this position. For example, the proregions of human procathepsins K, L, and S contain threonine, asparagine, and serine, respectively, in this position and those for proaleurain and pro-orizain γ contain a leucine (85). The conformation of the bound prosegment as illustrated in Figure 1 is such that the side-chains of Cys82p and Cys214 are at the closest possible distance to one another; i.e., conducive to forming a disulfide bond. This orientation also causes Trp76p to stack against Trp189 and contribute to a highly conserved and intricate ladder of hydrophobic residues that extend from the prosegment-binding loop (exosite) to the substrate-binding cleft of the enzyme. In precursors belonging to the cathepsin L-subfamily (85-88), it has been established that the side chain of Phe56p (structurally homologous to Trp24p in procathepsin B) protrudes into the cavity of the exosite formed by residues Tyr146 and Tyr151. In procathepsin H, therefore, the ladder is composed of seven aromatic residues compared to only six in other precursors of papain-like enzymes, which may contribute to the increased stability of the cathepsin H precursor. Furthermore, due to the unique location of Trp77p, it is possible that this novel aromatic group interferes with proper formation of the catalytic thiolate-imidazolium ion-pair in the full-length cathepsin H precursor. It is predicted that Trp76p is in close proximity to the catalytic imidazole, His163 (Figure 1). In order to assist in the transfer of protons to the leaving amino group, it has been proposed that the catalytic histidine of serine proteases has the inherent capacity to adopt various conformations via rotation about its Cα-Cβ axis (156). As an illustration of this capacity among cysteine proteases, the crystal structure of cathepsin B in complex with the pyridyl disulfide inhibitor revealed a rotation of 120° and 65° for the side-chains of the catalytic His199 and Cys29 residues (134), respectively, compared to their orientations observed in other crystal structures of cathepsin B (82,89-91). Furthermore, the crystal structure of porcine cathepsin H (83) revealed dimerization of the enzyme caused by crystal packing. This dimerization was shown to induce salt bridge formation between the εN of the catalytic His163 and the carboxyl group of the C-terminal residue, Val222, from a neighboring molecule in the crystal. Consequently, this salt bridge causes the catalytic histidine to rotate 80° about its Cα-Cβ bond. Hence, the positioning of the bulky Trp76p aromatic group in the S1' subsite may influence the catalytic histidine to rotate its side-chain perpendicular to Trp76p rather than to form a stable ion pair with the catalytic cysteine, and thus may help to explain how the full-length cathepsin H precursor is void of catalytic activity towards small synthetic substrates.

In the case of proaleurain, correct processing of this zymogen was only observed following its incubation with barley cell extracts (137). Both 'clipping' (loss of 9 kDa) and subsequent 'trimming' (loss of 1 kDa) proteolytic events were necessary for the complete maturation of proaleurain and were shown to be due to the activity of two independent maturases. The inhibition of the trimming reactions by E-64 suggested that this event was mediated by the activity of a thiol protease, but the barley enzyme needed to perform the 'clipping' reaction was not identified. Similarly to proaleurain, the conversion of the cathepsin H precursor has been proposed to occur in a multi-step fashion (83) involving cleavage at the pro/mature junction and at both termini of the mini-chain. Most notable is the proposed cleavage site at the N-terminus of the mini-chain, Ser77p↑Glu78p, since this segment of the proregion is predicted to bind in the reverse substrate-binding mode through the enzyme's substrate-binding cleft and to be in close proximity to the catalytic ion-pair (Figure 1). However, the sequence of proteolytic events and the protease(s) responsible for these reactions had yet to be determined. Prior to this work, procathepsin H had been shown to be processed following its incorporation into the lysosome, and this conversion was shown to be inhibited by pepstatin, a potent inhibitor of aspartic proteases such as cathepsin D (157-159). It was also shown that processing of the cathepsin H precursor displayed significantly slower kinetics when compared to that of cathepsin B (157-159). As has been documented for proaleurain, we report that conversion of the cathepsin H precursor requires the catalytic activity of a secondary protease and not that of cathepsin H itself (Figure 2A).

Based on the proposed sites of proteolysis discussed previously, it is reasonable to assume that an endopeptidase would perform these reactions more efficiently than an exopeptidase. This is evidenced by the resistance of procathepsin H to cleavage by either mature cathepsin B, which functions primarily as a carboxydipeptidase, or by mature cathepsin H. Conversely, the various endopeptidases used in this study, such as cathepsins D, K, L, S, and pepsin, were each capable of converting procathepsin H to a lower molecular weight species. Curiously, N-terminal identification of each cathepsin H intermediate indicated that the cysteine endopeptidases were only capable of cleaving within a finite stretch of proregion residues which includes the highly conserved Asp65p and Glu70p (Figure 3). Upon the addition of SDS detergent and cathepsin D, however, a minor signal (25% of total) corresponding to the N-terminus of the mature mini-chain, ↑Glu78p-Pro79p, was detected in addition to the ↑Ala69p-Glu70p site (75% of total signal). It is not clear whether cathepsin D may cleave directly at the N-terminus of the mini-chain or if this reaction is time-dependent and biphasic in nature; i.e., that cathepsin D needs to cleave upstream at ↑Ala69p-Glu70p before being capable of cleaving at the N-terminus of the mini-chain. Conversely, no proteolysis was observed at the C-terminus of the mature mini-chain (Thr85p↑Lys86p). Clearly, the cathepsin H intermediates containing the disulfide bridge linking Cys82p and Cys214 display remarkable stability under the conditions used in this study, which ultimately facilitates their identification by Edman degradation. Conversely, if the corresponding processing intermediates among other papain precursors are produced, their identification would be severely impeded by their rapid degradation to form mature protein.
The enhanced stability of the cathepsin H intermediates may be due in part to the presence of the disulfide bridge linking Cys82p and Cys214. Cleavage of procathepsin H by endopeptidases such as cathepsin D and cathepsin L removes over 60 residues at the N-terminal cap of the proregion (clipping), which includes the removal of Phe56p from the exosite surface of the enzyme. Presumably, this 'clipping' reaction has the effect of partially exposing the substrate-binding cleft of cathepsin H as well as destabilizing the hydrophobic interactions between Trp76p and Trp189, and between Trp76p and the catalytic His163 residue.

The lack of catalytic activity for the full-length cathepsin H precursor towards either Arg-MCA or Z-Phe-Arg-MCA suggests that the enzyme's substrate-binding cleft is inaccessible to small substrates and/or the catalytic ion-pair is not optimally formed to partake in chemical reactions. It is interesting to note, however, that the intermediate of cathepsin H processing produced by the activity of cathepsin L (Ala69p↑Glu70p) demonstrates significant aminopeptidase activity; i.e., kcat/KM = 889 M⁻¹s⁻¹ as compared to 11,700 M⁻¹s⁻¹ for wild-type cathepsin H containing the mature mini-chain (Figure 4). How is this possible if, as discussed above and as demonstrated by the homology model (Figure 1), the intermediate of cathepsin H lacks a mature mini-chain and instead retains over 20 residues derived from the C-terminal end of the prosegment? First, removal of the N-terminal α-helical cap of the prodomain (approximately 70 residues) will cause the active site cleft of the enzyme to be more exposed to incoming substrates than is the case for the full-length precursor; i.e., the remaining prosegment residues will bind less tightly to the surface of the enzyme. Second, the substrate-binding cleft of cathepsin H may be uniquely designed to accommodate aminopeptidase activity independently of the mature mini-chain. For example, the crystal structure of porcine cathepsin H reveals that the unprimed region of this enzyme's active site cleft is narrower than those of other related structures (83). Significantly, the backbone carbonyl oxygens surrounding Gly65-Gly66 have been shown to be positioned closer to the S2 pocket; i.e., the putative location of the positively charged N-terminus of a bound substrate. Furthermore, cathepsin H contains an insertion loop within the R-domain consisting of Lys155A-Thr155B-Pro155C-Asp155D located between Ser155-Ser156 (papain numbering). This insertion is positioned adjacent to the unprimed pockets of the active site cleft as well as to the bound mini-chain. When the mature mini-chain is bound to the enzyme, the side-chain of Asp155D is directed away from the substrate-binding cleft of the enzyme and faces the surrounding solvent. In the absence of the mini-chain, however, the orientation of the Asp155D side-chain may change such that it faces the positively charged N-terminus of the Arg-MCA substrate; i.e., in a manner analogous to Thr85p of the mature mini-chain. Conversely, the catalytic activity of the cathepsin H processing intermediates may be due instead to the production of trace amounts of cathepsin H containing the mature mini-chain. Based on the N-terminal sequence analyses of aqueous samples of cathepsin H, however, no such species was detected.

Processing in trans among papain-like precursors usually results in proteolysis at or near the pro/mature junction. Papain-like prosegments strategically possess the least secondary structure near the pro/mature junction when bound to their cognate enzyme (85-91); i.e., this is the most conformationally disordered segment of the proregions. Consequently, this conformational freedom facilitates proteolysis near the pro/mature junction to form mature protein. Similarly, truncated propeptides composed only of the C-terminal end of the full-length propeptide display poor inhibitory activity towards the parent enzyme (92,103,117).

In this study, we report that the C-terminal end of the cathepsin H prodomain is unusually resistant to cleavage by endopeptidases despite containing a Leu-Arg motif at positions -6 and -5, respectively; i.e., a sequence compatible with recognition by many thiol proteases such as active cathepsins B, L, S (160) and K (161). Furthermore, the C-terminal end of the mature mini-chain (Thr85p) is also resistant to proteolysis. As discussed above, the homology model of procathepsin H predicts that Cys82p and Cys214 form a disulfide bridge near the S4 subsite, linking the C-terminal end of the proregion to the main body of the enzyme. Intuitively, this additional covalent attachment would significantly limit the conformational mobility at the C-terminal end of the prosegment, thereby providing resistance to proteolysis in an area of the proregion where processing in trans normally occurs. Therefore, susceptibility of the cathepsin H precursor to proteolysis near the pro/mature junction may improve only upon the complete or partial denaturation of the bound prosegment. This is evidenced by the ability of cathepsin L to cleave near the pro/mature junction of procathepsin H in the simultaneous presence of DTT and SDS detergent (Figure 3). This reaction leads to the production of an isoform of cathepsin H (↑Gly93p-Thr94p) which lacks the majority of prosegment residues, and consequently the mini-chain, and is therefore highly homologous in sequence and structure to mature human cathepsin L.

Similar to the cathepsin H processing intermediates discussed above, this isoform retains significant aminopeptidase activity (kcat/KM = 1,950 M⁻¹ s⁻¹) towards Arg-MCA compared to mature human cathepsin H (kcat/KM = 11,700 M⁻¹ s⁻¹) and possesses enhanced aminopeptidase activity compared to mature human cathepsin L (kcat/KM = 4.2 M⁻¹ s⁻¹). These data indicate that the mini-chain optimizes, but is not strictly required for, the mono-aminopeptidase activity of cathepsin H. It is also important to note that intramolecular proteolysis at the Ser77p↑Glu78p site was not observed. The ability to perform such a unimolecular step among other precursors of the papain family which lack the disulfide attachment, however, has been demonstrated for procathepsin B and procathepsin S ((145); Chapter 2 of this thesis). In agreement with previous work (157-159), mature cathepsin H composed of the mature mini-chain is incapable of processing its own precursor, and cleavage at the Ser77p↑Glu78p site may be performed instead by another enzyme such as cathepsin D. Therefore, in order to produce active cathepsin H composed of the mature mini-chain, it may be necessary to subject procathepsin H to a cocktail of proteases, as is the case within the mature lysosome.
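The relative catalytic efficiencies quoted above for Arg-MCA can be made explicit with a short calculation. The sketch below uses only the three kcat/KM values reported in the text; the dictionary keys and the helper name `fold_change` are illustrative labels introduced here, not terms from the study.

```python
# kcat/KM values (M^-1 s^-1) towards Arg-MCA, as quoted in the text.
# The short keys are hypothetical labels chosen for this sketch.
KCAT_KM = {
    "catH_mature": 11_700.0,       # mature cathepsin H with mini-chain
    "catH_no_minichain": 1_950.0,  # isoform produced by cathepsin L + DTT + SDS
    "catL_mature": 4.2,            # mature human cathepsin L
}


def fold_change(numerator: str, denominator: str) -> float:
    """Return the ratio of two catalytic efficiencies."""
    return KCAT_KM[numerator] / KCAT_KM[denominator]


if __name__ == "__main__":
    loss = fold_change("catH_mature", "catH_no_minichain")
    gain = fold_change("catH_no_minichain", "catL_mature")
    print(f"Mature cathepsin H vs. isoform without mini-chain: {loss:.1f}-fold")
    print(f"Isoform without mini-chain vs. cathepsin L: {gain:.0f}-fold")
```

On these figures, loss of the mini-chain costs about a six-fold drop in efficiency, while the truncated isoform remains several hundred-fold more efficient than cathepsin L, which is the basis for describing the mini-chain as optimizing rather than essential.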

Cathepsins B, C, H, aleurain, orizain γ, and bleomycin hydrolase have each evolved from an ancestral papain-like cysteine protease (108). In the case of cathepsin B, the addition of a 20-residue insertion, called the occluding loop, contributes to the primed subsites of the substrate-binding cleft and enables this enzyme to function as a carboxydipeptidase. The shorter prosegment of cathepsin B (62 residues) also contains a cysteine residue, Cys42p (cathepsin B numbering), yet procathepsin B is capable of autoprocessing (98,101,109). The crystal structure of procathepsin B reveals that Cys42p binds close to the S1' pocket and is in proximity to the catalytic Cys29 residue, yet has no covalent partner (89-91). However, alanine scanning studies have revealed that Cys42p is a critical residue for the inhibitory activity of the full-length cathepsin B propeptide (92). In contrast, the unprimed subsites found in cathepsin H are narrower, and a disulfide bond, formed by Cys82p and Cys214, links the prosegment to the main body of the enzyme. We propose that this additional covalent attachment renders procathepsin H, and by extension proaleurain, unable to self-activate, so that both require the action of other maturases.

As has been shown for proaleurain (137), it remains possible that procathepsin H requires the activity of more than one endopeptidase for complete maturation to take place. This is evidenced by the inability of any one of the lysosomal proteases used in this study to produce mature cathepsin H (with or without the mini-chain) without the partial or complete denaturation of the prosegment by the simultaneous addition of DTT and SDS detergent. Furthermore, we have shown that the additional disulfide bond inhibits direct proteolysis at the pro/mature junction by secondary proteases, presumably by impeding the conformational mobility of the C-terminal end of the prosegment. Therefore, the simultaneous addition of SDS detergent and DTT to procathepsin H denatures the C-terminal end of the proregion sufficiently, by abrogating the disulfide bond, to render it more susceptible to proteolysis. In addition to the involvement of other maturases, these findings suggest that zymogen-membrane interactions mediated by the prosegment of cathepsin H may be important during lysosomal targeting and activation of procathepsin H, as has been observed for mouse procathepsin L (162).

ACKNOWLEDGMENTS

The authors thank France Dumas for N-terminal sequence identifications of the various cathepsin H isoforms and Dr. J. Sivaraman for his assistance in preparing Figure 1. We also thank Dr. Robert Ménard and Dr. Dorit K. Nagler for many valuable discussions.

Figure 1: Homology model of procathepsin H. View of the backbone trace (brown) of human Cys25Ser procathepsin L as reported in Coulombe, R. et al. (85). The prosegment residues (green) were sequentially mutated to those found in procathepsin H, and the conformation adopted by the bound prosegment reflects the highly conserved mode of inhibition observed in the crystal structures reported to date of precursors of papain-like enzymes (85-91). A ladder of highly conserved aromatic residues is highlighted (pink), as well as the catalytic residues, Cys25(Ser) and His163 (white). Located near the unprimed subsites are residues Cys82p and Cys214 (yellow), found only in the precursors of cathepsin H, aleurain, and orizain γ, linking the prosegment to the main body of the enzyme. Residue Trp76p, unequaled in other prosegments of the papain family, is also illustrated and shown to stack against Trp189 (and perhaps the catalytic His163) near the S1' subsite of the substrate-binding cleft. Residue Phe56p (not shown) is known to stack against Tyr151, located within the hydrophobic prosegment-binding loop (exosite) of the enzyme (equivalent to the position of Trp24p in procathepsin B). Prodomain residues which are predicted to bind through the substrate-binding cleft of cathepsin H in the reverse mode contain the following sequence: Leu75p-Trp-Ser-Glu-Pro-Gln-Asn-Cys82p, with the carbonyl carbon of Ser77p predicted to be in closest proximity to the catalytic center.

Figure 2: Non-reducing SDS-PAGE of purified glycosylated procathepsin H (Coomassie Blue staining). Gel A: (lane 1) purified procathepsin H; (lane 2) procathepsin H following treatment with mature human cathepsin H composed of the mini-chain (28 kDa). Gel B: procathepsin H incubated with active cathepsins B, D, K, L, and S. Positions of molecular mass standards are indicated on the left.

Figure 3: Comparison of primary prosequences of human cathepsin L, aleurain (barley), orizain γ (rice seeds) and human cathepsin H. The Cys82p residue found in the prosequences of aleurain, orizain γ and cathepsin H is underlined and in boldface, as is the unique Trp76p residue located in the prosegment of cathepsin H. The mature mini-chain is composed of residues Glu78p to Thr85p. Both Asp65p and Glu70p are highly conserved among prosegments of the papain family. The various N-terminal sequences detected for cathepsin H following treatment of the precursor with active cathepsins D, K, L, and S as well as pepsin (P) are indicated by arrows. The treatment of procathepsin H with cathepsin L with the simultaneous addition of DTT and SDS detergent leads to an N-terminal sequence located near the pro/mature junction, indicated as L(SDS). The treatment of procathepsin H with cathepsin D in the presence of SDS detergent leads to a major sequence at ↑Ala69p-Glu70p (75% of signal) and a minor sequence corresponding to the N-terminus of the mature mini-chain, ↑Glu78p-Pro79p (25% of signal, indicated as D(SDS)).

[Alignment panel: prosequences labelled CATL, ALEU, ORIZγ and CATH]

Figure 4: Structure-Activity Relationship of the Various Cathepsin H Isoforms. Curves 1, 2, 3, and 4 correspond to the catalytic activities of mature human wild-type cathepsin H composed of the mature mini-chain (kcat/KM = 11,700 M⁻¹ s⁻¹), cathepsin H composed of the N-terminus ↑Gly93p-Thr94p-Gly95p (kcat/KM = 1,950 M⁻¹ s⁻¹) formed following treatment with cathepsin L in the presence of SDS detergent (+DTT+SDS), cathepsin H composed of ↑Glu70p-Ile71p-Lys72p (kcat/KM = 889 M⁻¹ s⁻¹) following treatment with cathepsin L in the absence of SDS detergent, and mature human cathepsin L (kcat/KM = 4.2 M⁻¹ s⁻¹), respectively, towards the Arg-MCA substrate, which measures mono-aminopeptidase (exopeptidase) activity. Cathepsins B, D, K, L, and S did not display any appreciable hydrolytic activity towards Arg-MCA (≤ 1 hour of reaction).

[Plot: Cathepsin H Activity Towards Arg-MCA]

SUMMARY

The living cell possesses a variety of degradative enzymes. The mammalian homologs of papain are diverse in terms of their substrate specificity despite the fact that their overall three-dimensional folds are quite similar. This diversity may be accounted for by subtle differences in their primary amino-acid sequences and other structural features located within the substrate-binding clefts of the enzymes. Insights into the proteolytic pathways in which enzymes such as cathepsin L (73) and cathepsin S (74) participate have been elucidated only recently, leading to their consideration as important targets for therapeutic intervention.

Papain-like enzymes are first synthesized as zymogens consisting of long extensions at the N-terminus of the enzyme which serve to inhibit, stabilize, and target the enzymes until they reach their final destination; i.e., the mature lysosome. Upon their arrival in the acidified lysosomal compartment of the cell, these proenzymes undergo maturation. Due to the high local concentration of proteases within the mature lysosome, activation of these proenzymes may take place through autolysis or by the action of other proteases.

Procathepsin B is capable of autoactivation in vitro (98,101,109) and is a unique member of the papain family of zymogens in that it is composed of a shorter proregion (62 residues) than those found among cathepsin L-like precursors (>90 residues). Furthermore, cathepsin B is composed of an exposed disulfide loop, termed the occluding loop, which contributes to the primed subsites of this enzyme's substrate-binding cleft. Kinetic assays using small fluorogenic substrates have indicated that the rate of autocatalytic processing of procathepsin B (and other members of this family) correlates with the affinity of the enzyme for its propeptide rather than with its catalytic activity (102-104,106); i.e., peptides derived from the sequence of proregion residues display potent inhibition of the enzyme at neutral pH, where full-length precursors are more stable, but display weaker affinity for the enzyme under acidic pH conditions, where zymogens of papain-like enzymes autoprocess more efficiently.

In Chapter 1 of this thesis, it has been demonstrated that the affinity of cathepsin B for its propeptide may be improved significantly upon the deletion of the occluding loop (105,106). Hence, the formation of a stable propeptide/cathepsin B complex requires unobstructed access of the inhibitor to the enzyme's substrate-binding cleft. This finding is corroborated by the recently determined X-ray crystal structure of procathepsin B (89-91), which indicates that the occluding loop is a flexible motif capable of undergoing major conformational changes. In mature cathepsin B, the occluding loop is in a closed position (82) and is stabilized by the formation of a salt bridge between Asp22, located within the primed subsites of the substrate-binding cleft of cathepsin B, and His110, located within the occluding loop (106,107).

In this study, the occluding loop was shown to compete with the cathepsin B propeptide for the surface of the enzyme (termed the occluding loop crevice). This competition was shown to be maximized at low pH and regulated by the interactions of His110, but not His111, with the main body of the enzyme. Therefore, major conformational changes during the autocatalytic processing of procathepsin B involve the intramolecular pH-dependent movement of the occluding loop, suggesting that the pH-triggering mechanism of autoprocessing in procathepsin B is unique to this zymogen. In summary, the critical electrostatic interactions which regulate procathepsin B processing reside within the enzyme domain and not the cathepsin B prosegment. Conversely, for members of the cathepsin L subfamily (i.e., those with a prodomain composed of over 90 residues and no occluding loop insertion within the mature protein), the pH-triggering mechanism is likely to reside within the prosegment. In particular, the protonation of salt bridges which bring helices α1p and α2p in close proximity to helix α3p; i.e., involving the highly conserved GI37p and Arg31p to Glu70p, and Arg31p to Asp65p, is likely to contribute to the maturation process. In the study of Vernet et al. (100), it was demonstrated that the negative charge of Asp65p participates in the control of intramolecular processing of the papain precursor.

In Chapter 2 of this thesis, internal cleavage sites within the prosegments of cathepsin B and cathepsin S were identified during the autocatalytic conversion of their zymogens. This was accomplished by adding to the precursors excess amounts of the protein-proteinase inhibitor, cystatin C, prior to their exposure to acidic pH environments. The addition of cystatin C was successful in trapping these precursors into a slower cascade of autoproteolytic processing; i.e., the affinity of cystatin C decreases in the order mature enzyme > intermediate > full-length precursor. Hence, the addition of cystatin C impedes the rate of intermolecular processing caused by the activity of mature protein and thus facilitates the detection of unimolecular reactions. Based on structural analysis of the precursors (85-92), the newly identified sites of proteolytic processing lie within a stretch of prosegment residues which are known to bind in the reverse mode through the substrate-binding clefts of these enzymes. That these autoproteolytic processing reactions are observed at all concentrations of proenzyme suggests that they occur as unimolecular events and that they may be important. The unusually low molecularity of these unimolecular events may be accounted for by the fact that the prosequence binds through the active site cleft in the reverse direction to that taken by natural substrates. Significantly, this reverse complementarity causes the distance between the δN of the catalytic histidine and the leaving backbone amide group to be greater than would be the case for natural substrates, thus leading to highly reversible nucleophilic reactions at the carbonyl carbon of the scissile amide bond (within the prosegment) which have difficulty going to completion; i.e., protonation of the leaving backbone amide group, and consequently formation of the covalent acyl-enzyme intermediate, are inefficient processes.
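The inference that processing seen at all proenzyme concentrations reflects a unimolecular step can be illustrated with a simple rate comparison. The sketch below is not taken from the thesis: the rate constants and the assumed fraction of active enzyme are hypothetical placeholders, chosen only to show that a first-order (intramolecular) term gives a concentration-independent fractional rate, whereas a second-order (intermolecular) term scales with total protein concentration.

```python
def fractional_rate_unimolecular(k_uni: float, proenzyme_conc: float) -> float:
    """Initial rate per unit proenzyme for an intramolecular step.

    rate = k_uni * [P], so rate / [P] equals k_uni at any concentration.
    """
    return (k_uni * proenzyme_conc) / proenzyme_conc


def fractional_rate_bimolecular(
    k_bi: float, proenzyme_conc: float, active_fraction: float
) -> float:
    """Initial rate per unit proenzyme for processing in trans.

    rate = k_bi * [E] * [P]; as a simplifying assumption, the active enzyme
    concentration [E] is taken as a fixed fraction of [P].
    """
    enzyme_conc = active_fraction * proenzyme_conc
    return (k_bi * enzyme_conc * proenzyme_conc) / proenzyme_conc


if __name__ == "__main__":
    K_UNI = 1e-3            # s^-1, hypothetical
    K_BI = 1e4              # M^-1 s^-1, hypothetical
    ACTIVE_FRACTION = 0.01  # hypothetical

    for conc in (1e-9, 1e-7, 1e-5):  # proenzyme concentration in M
        uni = fractional_rate_unimolecular(K_UNI, conc)
        bi = fractional_rate_bimolecular(K_BI, conc, ACTIVE_FRACTION)
        print(f"[P] = {conc:.0e} M  unimolecular: {uni:.2e}  bimolecular: {bi:.2e}")
```

Under these assumptions, diluting the proenzyme leaves the unimolecular contribution unchanged while the intermolecular contribution falls in proportion, which is why persistence of cleavage at low concentration points to an intramolecular event.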

The N-terminus of the mature segment has been shown to undergo major conformational changes during the autoactivation of zymogens belonging to the aspartic protease family (e.g., progastricsin, discussed in Literature Review and Introduction). Conversely, it has been determined that highly conserved charged residues (Asp6 and Arg8) which contribute to the formation of a salt bridge within the catalytic domain of propapain were not found to be involved in regulating the pH-triggering mechanism of this precursor.

In Chapter 3 of this thesis, the precursor of cathepsin H was shown to display unusual stability to autoactivation as well as to proteolysis near the pro/mature junction. These findings have been attributed to the pre-formation of a disulfide bond, using Cys82p and Cys214, which serves to link the C-terminal end of the prosegment to the enzyme domain and is known to remain intact following maturation of the precursor (83). Procathepsin H is the only mammalian precursor of papain-like enzymes to be composed of Cys82p and Cys214. Other zymogens which contain these cysteine residues are proaleurain (barley) and pro-orizain γ (rice seeds) (128). Following the incubation of procathepsin H with various proteases, it was demonstrated that the various isoforms of cathepsin H lacking the mature mini-chain retain significant aminopeptidase activity towards Arg-MCA as compared to other members of this family (e.g., cathepsins B, D, K, L, and S) as well as to mature wild-type cathepsin H, which is composed of the octapeptide. This may be accounted for by novel insertions of amino-acid residues which narrow the substrate-binding cleft (i.e., the unprimed portion) found in cathepsin H.

In the case of the caspase family of cytoplasmic cysteine proteases involved in cellular apoptosis (discussed in the introduction), there exists a sophisticated hierarchy of intermolecular processing. Among lysosomal cysteine proteases belonging to the papain family, however, no such hierarchy of proteolytic events has been reported. The activation of most lysosomal enzymes appears to be mutually non-exclusive, since most of these enzymes are competent to autoactivate upon exposure to acidic pH environments. Based on the results from this study, however, the inability of procathepsin H to autoactivate and its dependence on the activity of other proteases suggest that this precursor may be an important exception.

Similar to what has been observed for zymogens of other families of proteases (serine, aspartic, Zn-coordinated), precursors of papain-like enzymes possess a pre-formed and functional catalytic triad and a mature substrate-binding cleft; i.e., a prerequisite for the capability to self-activate. These zymogens may also be activated by the action of secondary proteases in vivo, yet they do not need to bind to adaptor molecules, as has been reported for some precursors belonging to the caspase family of cysteine proteases (e.g., the association of Apaf-1 with procaspase-9) (58). In addition, pH-induced conformational rearrangements are necessary for the efficient conversion of these zymogens. In this study, the pH-dependent conformational stability of the occluding loop was shown to be critical in regulating the overall rate of procathepsin B processing. Furthermore, the conformational mobility of residues at the C-terminal end of the prosegments; i.e., residues which stretch from the substrate-binding cleft to the pro/mature junction, was also found to be important. Internal cleavage sites within the prosegments have been identified at which cleavage takes place while the prosegment is bound in the reverse substrate-binding mode. Therefore, the maturation of zymogens belonging to the papain family can proceed via non-exclusive intermolecular or intramolecular events. The discovery of a unimolecular step of autoactivation may help to explain how this family of enzymes, even when present at low concentrations, may be implicated in several invasive and pathological conditions extracellularly.

REFERENCES

1. Williams, D.B. (1995) Biochem. Cell Biol. 73, 123-132
2. Trombetta, S.E., and Helenius, A. (1998) Curr. Opin. Struct. Biol. 8, 587-592
3. Fischer, G., Wittmann-Liebold, B., Lang, K., Kiefhaber, T., and Schmid, F.X. (1989) Nature 337, 476-478
4. Ullrich, A., and Schlessinger, J. (1990) Cell 61, 203-212
5. Hanks, S.K., Quinn, A.M., and Hunter, T. (1988) Science 241, 42-52
6. Murphy, G.J.P., Murphy, G., and Reynolds, J.J. (1991) FEBS Lett. 289, 4-7
7. Mort, J.S., Recklies, A.D., and Poole, A.R. (1984) Arthritis Rheum. 27, 509-515
8. Drake, F.H., Dodds, R., James, I., Connor, J., Debouck, C., Richardson, S., Lee, E., Rieman, D., Barthlow, R., Hastings, G., and Gowen, M. (1996) J. Biol. Chem. 271, 12511-12516
9. Dolenc, I., Turk, B., Pungercic, G., Ritonja, A., and Turk, V. (1995) J. Biol. Chem. 270, 21626-21631
10. Pari:, A., Strukelj, B., Pungercar, J., Renko, M., Dolenc, I., Turk, V. (1995) FEBS Lett. 369, 326-330
11. Neurath, H. (1957) Adv. Protein Chem. 12, 319-386
12. Fehlhammer, H., Bode, W. and Huber, R. (1977) J. Mol. Biol. 111, 415-438
13. Bode, W. and Huber, R. (1978) FEBS Lett. 90, 265-269
14. Huber, R. and Bode, W. (1978) Acc. Chem. Res. 11, 114-122
15. Davie, E.W. and Neurath, H. (1955) J. Biol. Chem. 212, 515-529
16. Craik, C.S., Roczniak, S., Largman, C. and Rutter, W.J. (1987) Science 239, 909-913
17. Sprang, S., Standing, T., Fletterick, R.J., Stroud, R.M., Finer-Moore, J., Xuong, N.H., Hamlin, R., Rutter, W.J. and Craik, C.S. (1987) Science 237, 905-909
18. Robertus, J.D., Kraut, J., Alden, R.A. and Birktoft, J.J. (1972) Biochemistry 11, 4293-4303
19. Schechter, I. and Berger, A. (1967) Biochem. Biophys. Res. Commun. 27, 157-162
20. Wang, D., Bode, W. and Huber, R. (1985) J. Mol. Biol. 185, 595-624
21. Blevins, R.A. and Tulinsky, A. (1985) J. Biol. Chem. 260, 4264-4275
22. Cohen, G.H., Silverton, E.W. and Davies, D.R. (1981) J. Mol. Biol. 148, 449-479
23. Lorand, L. (1986) Ann. N.Y. Acad. Sci. 485, 144-158
24. Berndt, M.C., and Phillips, D.R. (1981) in Platelets and Pathology (Gordon, J.L., Ed.) pp. 43-74, Elsevier/North-Holland Biomedical Press, Amsterdam
25. Nesheim, M.E., Katzmann, J.A., Tracy, P.B., and Mann, K.G. (1981) Meth. Enzymol. [21] Factor V pp. 249-274 (Academic Press, Inc.)
26. Esmon, C.T. and Jackson, C.M. (1974) J. Biol. Chem. 249, 7782-7790
27. Seidah, N.G., Mbikay, M., Marcinkiewicz, M. and Chrétien, M. (1998) Proteolytic and Cellular Mechanisms in Prohormone and Neuropeptide Precursor Processing, ed. Hook, V.Y.H. (Landes, Georgetown, TX), 49-76
28. Steiner, D.F. (1998) Curr. Opin. Chem. Biol. 2, 31-39
29. Ling, N., Burgus, R. and Guillemin, R. (1976) Proc. Nat. Acad. Sci. USA 73, 3042-3046
30. Rosendahl, M.S., Ko, S.C., Long, D.L., Brewer, M.T., Rosenzweig, B., Hedl, E., Anderson, L., Pyle, S.M., Moreland, J., Meyers, M.A., et al. (1997) J. Biol. Chem. 272, 24588-24593
31. Checler, F. (1995) J. Neurochem. 65, 1431-1444
32. Sielecki, A.R., Hayakawa, K., Fujinaga, M., Murphy, M.E., Fraser, M., Muir, A.K., Carilli, C.T., Lewicki, J.A., Baxter, J.D. and James, M.N.G. (1989) Science 243, 1346-1351
33. Wlodawer, A. and Erickson, J.W. (1993) Ann. Rev. Biochem. 62, 543-585
34. Davies, D.R. (1990) Ann. Rev. Biophys. Chem. 19, 189-215
35. James, M.N.G., Sielecki, A.R., Hayakawa, K. and Gelb, M.H. (1992) Biochemistry 31, 3872-3886
36. James, M.N.G. and Sielecki, A.R. (1986) Nature 319, 33-38
37. Bateman, K.S., Cherney, M.M., Tarasova, N.I. and James, M.N.G. (1998) ed. The aspartic proteinases: Retroviral, fungal, plant, and mammalian. New York: Plenum Press. In press.
38. Moore, S.A., Sielecki, A.R., Chernaia, M.M., Tarasova, N.I. and James, M.N.G. (1995) J. Mol. Biol. 247, 466-485
39. Khan, A.R., Cherney, M.M., Tarasova, N.I. and James, M.N.G. (1997) Nature Struct. Biol. 4, 1010-1015
40. Foltmann, B. and Jensen, A.L. (1982) Eur. J. Biochem. 128, 63-70
41. Puigserver, A., Chapus, C. and Kerfelec, B. (1986) In: Desnuelle, P., Sjostrom, H., Noren, O., eds. Molecular and cellular basis of digestion. Amsterdam: Elsevier. pp. 235-247
42. Matthews, B.H. (1988) Acc. Chem. Res. 21, 333-340
43. Hanson, J.E., Kaplan, A.P., and Bartlett, P.A. (1989) Biochemistry 28, 6294-6305
44. Guasch, A., Coll, M., Aviles, F.X., and Huber, R. (1992) J. Mol. Biol. 224, 141-157
45. Aviles, F.X., Vendrell, J., Guasch, A., Coll, M., and Huber, R. (1993) Eur. J. Biochem. 211, 382-389
46. Docherty, A.J.P., O'Connell, J., Crabbe, T., Angal, S., and Murphy, G. (1992) Trends Biotechnol. 10, 200-207
47. Becker, J.W., Marcy, A.I., Rokosz, L.L., Axel, M.G., Burbaum, J.J., Fitzgerald, P.M.D., Cameron, P.M., Esser, C.K., Hagmann, W.K., Hermes, J.D., and Springer, J.P. (1995) Protein Sci. 4, 1966-1976
48. Nagase, H., Suzuki, K., Enghild, J.J., and Salvesen, G. (1991) Biomed. Biochim. Acta 50, 749-754
49. Ellis, R.E., Yuan, J., and Horvitz, H.R. (1991) Annu. Rev. Cell Biol. 7, 663-698
50. Cohen, G.M. (1997) Biochem. J. 326, 1-16
51. Nicholson, D.W., and Thornberry, N.A. (1997) Trends Biochem. Sci. 8, 299-306
52. Salvesen, G.S., and Dixit, V.M. (1997) Cell 91, 443-446
53. Froelich, C.J., Dixit, V.M., and Yang, X. (1998) Immunol. Today 19, 30-36
54. Muzio, M., Stockwell, B.R., Stennicke, H.R., Salvesen, G.S., and Dixit, V.M. (1998) J. Biol. Chem. 273, 2926-2930
55. Susin, S.A., Zamzami, N., Castedo, M., Daugas, E., Wang, H.G., Geley, S., Fassy, F., Reed, J.C., and Kroemer, G. (1997) J. Exp. Med. 186, 25-37
56. Thornberry, N.A., Bull, H.G., Calaycay, J.R., Chapman, K.T., Howard, A.D., Kostura, M.J., Miller, D.K., Molineaux, S.M., Weidner, J.R., and Aunins, J. (1992) Nature 356, 768-774
57. Butt, A.J., Harvey, N.L., Parasivam, G., Kumar, S. (1998) J. Biol. Chem. 273, 6763-6768
58. Li, P., Nijhawan, D., Budihardjo, I., Srinivasula, S., Ahmad, M., Alnemri, E.S., and Wang, X. (1997) Cell 91, 479-489
59. Yang, X., Stennicke, H.R., Wang, B., Green, D.R., Janicke, R.U., Srinivasan, A., Seth, P., Salvesen, G.S., and Froelich, C.J. (1998) J. Biol. Chem. 273, 34278-34283
60. Rawlings, N.D., Pearl, L.H., and Buttle, D.J. (1992) Biol. Chem. Hoppe-Seyler 373, 1211-1215
61. Mort, J.S., and Buttle, D.J. (1997) Int. J. Biochem. Cell. Biol. 29, 715-720
62. Kirschke, H., Langner, J., Wiederanders, B., Ansorge, G., Bohley, P., and Hanson, H. (1977) Acta Biol. Med. Germ. 36, 185-199
63. Li, Y.P., Alexander, M.B., Wucherpfennig, A.L., Chen, W., Yelick, P., and Stashenko, P. (1994) Mol. Biol. Cell. 5, 335a
64. Kirschke, H., and Barrett, A.J. (1987) in Lysosomes: Their Role in Protein Breakdown (Glaumann, H., and Ballard, F.J., eds) pp. 193-238, Academic Press, London
65. Nakagawa, T.Y., Brissette, W.H., Lira, P.D., Griffiths, R.J., Howard, E.D., Petrushova, N., Stock, J., McNeish, J.D., Eastman, S.E., Clarke, S.R., Rosloniec, E.F., Elliot, E.A., Rudensky, A.Y. (1999) Immunity 10, 207-217
66. Linnevers, C., Smeekens, S.P., and Bromme, D. (1997) FEBS Lett. 405, 253-259
67. Nagler, D.K., and Ménard, R. (1998) FEBS Lett. 434, 135-139
68. Lazzarino, D., and Gabel, C.A. (1990) J. Biol. Chem. 265, 11864-11871
69. Bohley, P., and Seglen, P.O. (1992) Experientia 48, 151-157
70. Delaisse, J.M., Boyde, A., Maconnachie, E., Ali, N.N., Sear, C.H.J., Eeckhout, Y., Vaes, G., and Jones, S.J. (1987) Bone 8, 305-313
71. Esser, R.E., Angelo, R.A., Murphey, M.D., Wats, L.M., Thornburg, L.P., Palmer, J.T., Talhouk, J.W., and Smith, R.E. (1994) Arthritis Rheum. 37, 236-247
72. Guagliardi, L.E., Koppelman, B., Blum, J., Marks, M.S., Cresswell, P., and Brodsky, F. (1990) Nature 343, 133-139
73. Nakagawa, T., Roth, W., Wong, P., Nelson, A., Farr, A., Deussing, J., Villadangos, J.A., Ploegh, H., Peters, C., Rudensky, A.Y. (1998) Science 280, 450-453
74. Cresswell, P. (1998) Science 280, 394-395
75. Bromme, D., Okamoto, K., Wang, B.B., Biroc, S. (1994) J. Biol. Chem. 271, 2126-2132
76. Gelb, B.D., Shi, G.-P., Chapman, H.A., Desnick, R.J. (1996) Science 273, 1236-1238
77. Sloane, B.F. (1981) Science 212, 1151-1153
78. Storer, A.C., and Ménard, R. (1994) Meth. Enzymol. 244, 486-500
79. Aronson, N.N., and Barrett, A.J. (1978) Biochem. J. 171, 759-765
80. Takahashi, T., Dehdarani, A.H., and Tang, J. (1988) J. Biol. Chem. 263, 10952-10957
81. Rothe, M., and Dodt, J. (1992) Eur. J. Biochem. 210, 759-764
82. Musil, D., Zucic, D., Turk, D., Engh, R.A., Mayr, I., Huber, R., Popovic, T., Turk, V., Towatari, T., Katunuma, N., and Bode, W. (1991) EMBO J. 10, 2321-2330
83. Guncar, G., Podobnik, M., Pungercar, J., Strukelj, B., Turk, V., and Turk, D. (1998) Structure 6, 51-61
84. Baker, D., Shiau, A.K., and Agard, D.A. (1993) Curr. Biol. 5, 966-970
85. Coulombe, R., Grochulski, P., Sivaraman, J., Ménard, R., Mort, J.S., and Cygler, M. (1996) EMBO J. 15, 5492-5503
86. Groves, M.R., Taylor, M.A., Scott, M., Cummings, N.J., Pickersgill, R.W., and Jenkins, J.A. (1996) Structure 4, 1193-1203
87. LaLonde, J.M., Zhao, B., Janson, C.A., D'Alessio, K.J., McQueney, M.S., Orsini, M.J., Debouck, C.M., and Smith, W.W. (1999) Biochemistry 38, 862-869
88. Sivaraman, J., Lalumière, M., Ménard, R., and Cygler, M. (1999) Protein Sci. 8, 283-290
89. Cygler, M., Sivaraman, J., Grochulski, P., Coulombe, R., Storer, A.C., and Mort, J.S. (1996) Structure 4, 405-416
90. Turk, D., Podobnik, M., Kuhelj, R., Dolinar, M., and Turk, V. (1996) FEBS Lett. 384, 211-214
91. Podobnik, M., Kuhelj, R., Turk, V., and Turk, D. (1997) J. Mol. Biol. 271, 774-788
92. Chen, Y., Plouffe, C., Ménard, R., and Storer, A.C. (1996) FEBS Lett. 393, 24-26
93. Barrett, A.J., Rawlings, N.D., Davies, M.E., Machleidt, W., Salvesen, G., and Turk, V. (1986) In: A.J. Barrett and G. Salvesen (eds.), Proteinase Inhibitors, 515-568, Elsevier, Amsterdam
94. Abrahamson, M., Barrett, A.J., Salvesen, G., and Grubb, A. (1986) J. Biol. Chem. 261, 11282-11289
95. Bode, W., Engh, R., Musil, D., Thiele, U., Huber, R., Karshikov, A., Brzin, J., Kos, J., and Turk, V. (1988) EMBO J. 7, 2593-2599
96. Hall, A., Dalbøge, H., Grubb, A., and Abrahamson, M. (1993) Biochem. J. 291, 123-129
97. Vernet, T., Khouri, H.E., Laflamme, P., Tessier, D.C., Musil, R., Gour-Salin, B.J., Storer, A.C., and Thomas, D.Y. (1991) J. Biol. Chem. 266, 21451-21457
98. Mach, L., Mort, J.S., and Glossl, J. (1994) J. Biol. Chem. 269, 13030-13035
99. Ménard, R., Carmona, E., Takebe, S., Dufour, É., Plouffe, C., Mason, P., and Mort, J.S. (1998) J. Biol. Chem. 273, 4478-4484
100. Vernet, T., Berti, P.J., de Montigny, C., Musil, R., Tessier, D.C., Ménard, R., Magny, M.-C., Storer, A.C., and Thomas, D.Y. (1995) J. Biol. Chem. 270, 10838-10846
101. Mach, L., SchwihJa, H., Stüwe, K., Rowan, A.D., Mort, J.S., and Glossl, J.

(1993) Biochem. J. 293,437-442

102. Fox, T., de Miguel, E., Mort, J.S., & Storer, A.C. (1992) Biochemistry 31,

12571-12576

103. Carmona, E., Dufour, É., Plouffe, C., Takebe, S., Mason, P., Mort, J.S., &

Ménard, R (1996) Biochernistry 35,8 149-8 157

104. Maubach, G., Schilling, K., Rommerskirch, W., Wenz, I., Schultz, J.E., Weber,

E., and Wiederanders, B. (1997) Eur. J. Biochem. 250,745-750

105. Illy, C., Quraishi, O., Wang, J., Purisima, E., Vemet, T., and Mort, J.S. (1997) J.

Biol. Chem. 272, 1197-1202

106. Quraishi, O., Nagler, D.K., Fox, T., Sivaraman, J., Cygler, M., Mort, J.S., and

Storer, A.C. (1999) Biochemistry 38, 5017-5023

107. Nagler, D.K., Storer, A.C., Portaro, F.C.V., Carmona, E., Juliano, L., and

Ménard, R. (1997) Biochemistry 36, 12608-12615

108. Berti, P.J. and Storer, A.C. (1995) J. Mol. Biol. 246, 273-283

109. Rowan, A.D., Mason, P., Mach, L., and Mort, J.S. (1992) J. Biol. Chem. 267,

15993-15999

110. Dixon, M. (1953) Biochem. J. 55, 170-171

111. Cha, S. (1975) Biochem. Pharmacol. 24, 2177-2185

112. Morrison, J.F. (1969) Biochim. Biophys. Acta 185, 269-286

113. Morrison, J.F. (1982) Trends Biochem. Sci. 7, 102-105

114. Morrison, J.F. and Stone, S.R. (1985) Comments Mol. Cell. Biophys. 2, 347-368

115. Morrison, J.F. and Walsh, C.T. (1988) Adv. Enzymol. Relat. Areas Mol. Biol.

61, 202-301

116. Izquierdo-Martin, M. and Stein, R.L. (1992) J. Am. Chem. Soc. 114, 1527-1528

117. Chagas, J.R., Ferrer-Di Martino, M., Gauthier, F., and Lalmanach, G. (1996)

FEBS Lett. 392, 233-236

118. Jaqueline Padilla-Zuniga, A., and Rojo-Domínguez, A. (1998) Folding &

Design 3, 271-284

119. Li, Y., and Inouye, M. (1996) J. Mol. Biol. 262, 591-594

120. Volkov, A., and Jordan, F. (1996) J. Mol. Biol. 262, 595-599

121. Gallagher, T., Gilliland, G., Wang, L., and Bryan, P. (1995) Structure 3, 907-

914

122. Becker, J.W., et al., and Springer, J.P. (1995) Prot. Sci. 4, 1966-1976

123. Abrahamson, M. (1994) Methods Enzymol. 244, 685-700

124. Matsudaira, P. (1987) J. Biol. Chem. 262, 10035-10038

125. Thomas, M.P., Topham, C.M., Kowlessur, D., Mellor, G.W., Thomas, E.W.,

Whitford, D., and Brocklehurst, K. (1994) Biochem. J. 300, 805-820

126. Baker, K.C., Taylor, M.A.J., Cummings, N.J., Tuñón, M.-A., Worboys, K.A.,

and Connerton, I.F. (1996) Prot. Eng. 9, 525-529

127. Quraishi, O., Mort, J.S., and Storer, A.C. (1999) (In Preparation; Chapter 3 of

this thesis)

128. Watanabe, H., Abe, K., Emori, Y., Hosoyama, H., and Arai, S. (1991) J. Biol.

Chem. 266, 16897-16902

129. Hall, A., Ekiel, I., Mason, R.W., Kasprzykowski, F., Grubb, A., and

Abrahamson, M. (1998) Biochemistry 37, 4071-4079

130. Ménard, R., Carrière, J., Laflamme, P., Plouffe, C., Khouri, H.E., Vernet, T.,

Tessier, D.C., Thomas, D.Y., and Storer, A.C. (1992) Biochemistry 30, 8924-8928

131. Nagler, D.K., Storer, A.C., and Ménard, R. (1999) Manuscript in Preparation

132. Arad, D., Langridge, R., and Kollman, P.A. (1990) J. Am. Chem. Soc. 112, 491-

502

133. Higaki, J.N., Evnin, L.B., and Craik, C.S. (1989) Biochemistry 28, 9256-9263

134. Jia, Z., Hasnain, S., Hirama, T., Lee, X., Mort, J.S., To, R., and Huber, C.P.

(1995) J. Biol. Chem. 270, 5527-5533

135. Jerala, R., Zerovnik, E., Kidric, J., and Turk, V. (1998) J. Biol. Chem. 273,

11498-11504

136. Ritonja, A., Popovic, T., Kotnik, M., Machleidt, W., and Turk, V. (1988) FEBS

Lett. 228, 341-345

137. Holwerda, B.C., Galvin, N.J., Baranski, T.J., and Rogers, J.C. (1990) Plant Cell

2, 1091-1106

138. Sloane, B.F. (1990) Semin. Cancer Biol. 1, 137-152

139. Sivaparvathi, M., Sawaya, R., Gokaslan, Z.L., Chintala, K.S., and Rao, J.S.

(1996) Cancer Lett. 104, 121-126

140. Tsushima, H., Ueki, A., Matsuoka, Y., Mihara, H., and Hopsu-Havu, V.K.

(1991) Int. J. Cancer 48, 726-732

141. Schweiger, A., Stabuc, B., Popovic, T., Turk, V., and Kos, J. (1997) J. Immunol.

Methods 201, 165-232

142. Gabrijelcic, D., et al., Turk, V. (1992) Eur. J. Clin. Chem. Clin. Biochem. 30,

69-74

143. Zheng, W., Johnston, S.A., and Joshua-Tor, L. (1998) Cell 93, 103-109

144. Joshua-Tor, L., Xu, H.E., Johnston, S.A., and Rees, D.C. (1995) Science 269, 945-

950

145. Quraishi, O., and Storer, A.C. (1999) (To be submitted to J. Biol. Chem.;

Chapter 2 of this thesis)

146. Singh, H., and Kalnitsky, G. (1978) J. Biol. Chem. 253, 4319-4326

147. Harvima, R.J., Yabe, K., Fraki, J.E., Fukuyama, K., and Epstein, W.L. (1987) J.

Invest. Dermatol. 88, 393-397

148. Barrett, A.J. (1980) Proc. FEBS Meet. (Enz. Regul. Mech. Action) 60, 307-315

149. Schwartz, W.N., and Barrett, A.J. (1980) Biochem. J. 191, 487-497

150. Sawicki, G., and Warwas, M. (1989) Acta Biochim. Pol. 36, 343-351

151. Popovic, T., Brzin, J., Kos, J., Lenarcic, B., Machleidt, W., Ritonja, A., Hanada,

K., and Turk, V. (1988) Biol. Chem. Hoppe-Seyler 369, 175-183

152. Locnikar, P., Popovic, T., Lah, T., Kregar, I., Babnik, J., Kopitar, M., and Turk,

V. (1981) in Proceedings of International Symposium, Proteinases & Their

Inhibitors: Struct. Funct. Appl. Aspects, 109-116

153. Azaryan, A.V., and Galoyan, A.A. (1987) Neurochem. Res. 12, 207-213

154. Raghav, N., Kamboj, R.C., Parnami, S., and Singh, H. (1995) Ind. J.

Biochemistry & Biophys. 32, 279-285

155. Rogers, J.C., Dean, D., and Heck, G.R. (1985) Proc. Natl. Acad. Sci. USA 82,

6512-6516

156. Rebek, J. Jr. (1988) Struct. Chem. 1, 129-131

157. Nishimura, Y., and Kato, K. (1987) Biochem. Biophys. Res. Comm. 148, 329-

334

158. Nishimura, Y., Kawabata, T., Yano, S., and Kato, K. (1990) Arch. Biochem.

Biophys. 283, 458-463

159. Nishimura, Y., Tsuji, H., Kato, K., Sato, H., Amano, J., and Himeno, M. (1995)

Biol. Pharm. Bull. 18, 829-836

160. Bromme, D., Bonneau, P.R., Lachance, P., and Storer, A.C. (1994) J. Biol.

Chem. 269, 30238-30242

161. Bossard, M.J., Tomaszek, T.A., Thompson, S.K., Amegadzie, B.Y., Hanning,

C.R., Jones, C., Kurdyla, J.T., McNulty, D.E., Drake, F.H., Gowen, M., and Levy,

M.A. (1996) J. Biol. Chem. 271, 12517-12524

162. McIntyre, G.F., Godbold, G.D., and Erickson, A.H. (1994) J. Biol. Chem. 269, 567-

572