Modification of the Prosegment in Understanding its Role in the Folding and Function of PMII

by

Ahmad Haniff Jaafar

A Thesis Presented to The University of Guelph

In partial fulfillment of requirements for the degree of Doctor of Philosophy in Food Science

Guelph, Ontario, Canada

© Ahmad Haniff Jaafar, August, 2014

ABSTRACT

MODIFICATION OF THE PROSEGMENT IN UNDERSTANDING ITS ROLE IN THE FOLDING AND FUNCTION OF PMII

Ahmad Haniff Jaafar
University of Guelph, 2014

Advisor: Professor R.Y. Yada

This thesis explores the folding and activation of plasmepsin II (PMII) and the role of its prosegment (PS) in the enzyme’s structure–function relationship. Three different constructs of PMII were designed with PSs of varying lengths: extended PSPMII with a 60-residue PS, truncated PSPMII with a 48-residue PS, and NoProPMII with no PS. Extended and truncated PSPMII produced mature enzymes with similar conformations. NoProPMII mature, however, showed improper folding as indicated by low thermal stability, a more solvent-exposed conformation, an 11-fold reduction in activity, and a lower pepstatin-A requirement for complete inhibition.

In addition, the PS length was discovered to affect the activation of PMII. Extended PSPMII produced mature enzyme with an extra two PS residues (+2 PMII mature), whereas truncated PSPMII produced mature enzyme with an extra 12 PS residues (+12 PMII mature).

The role of a PS in PMII folding was further investigated by conducting folding kinetic studies on PMII. It was found that the native PMII (Np) did not fold at the lowest free energy, but was kinetically stabilized. Upon unfolding, Np formed a thermodynamically stable, yet inactive refolded state (Rp). Np was characterized to have a slow rate of unfolding and folding, as indicated by large free energy barriers to unfold and fold of 24.50 and 25.12 kcal/mol, respectively. In the presence of the extended PS (60 residues), the energy landscape was shifted and the activation energy barrier was lowered to 12.37 kcal/mol, which enhanced the folding rate by approximately 18,550 times.

To elucidate the effect of PS residue length on PMII folding, structural analysis and in silico simulation were conducted on the two zymogens of PMII: extended PSPMII and truncated PSPMII. Both zymogens appeared to be in a more solvent-exposed conformation as compared to their mature forms. Due to the oppositely charged residues between the main body and the PS, it appeared that the interactions between both structures were driven by electrostatic forces with 27 and 34 interactions (H-bonds and salt-bridges) measured between the main PMII body and truncated and extended PS, respectively.

ACKNOWLEDGEMENTS

It gives me great honour to acknowledge herein Prof. Rickey Yada for his intellectual guidance and unceasing support rendered towards me during my Ph.D. candidature at the University of Guelph. My time at Guelph has been a truly remarkable journey of self-exploration and knowledge enrichment, best indicated by the completion of my studies and the presentation of this thesis thereafter. A million thanks also go to my Advisory Committee members, Prof. Yoshinori Mine, Prof. George Harauz and Prof. Alejandro Marangoni, who have equally contributed in making my success achievable.

My deepest and most sincere gratitude is conveyed to my family and parents. Their love and compassion have always ignited the passion in me to strive and persevere in order to make it to the finishing line. As such, my success is as much theirs as it is mine. Great thanks to my colleagues, especially all my lab mates Huogen, Brian, Yasumi, Dref, Derek, Charity, Brenna and Reena, for their help during my time in Guelph. My appreciation is also extended to Mary Anne Smith for her expert editing throughout my entire thesis.

And last but never least, to my employer, Universiti Putra Malaysia (UPM), and my sponsor, the Malaysian Ministry of Education (MOE), I am most grateful for and thus fully acknowledge the chance and opportunity given to me. My Ph.D. would not have been completed without the financial support and endorsement from both parties. THANK YOU.


TABLE OF CONTENTS

ABSTRACT ...... ii
ACKNOWLEDGEMENTS ...... iv
TABLE OF CONTENTS ...... v
LIST OF TABLES ...... viii
LIST OF FIGURES ...... ix
LIST OF ABBREVIATIONS ...... xi
CHAPTER 1 Introduction and literature review ...... 1
1.1 Introduction ...... 1
1.2 Literature Review ...... 5
1.2.1 The family of APs ...... 5
1.2.2 General structure and mechanism of APs ...... 7
1.2.3 Multifunctional PS in APs ...... 10
1.2.4 Prosegment catalyst folding ...... 12
1.2.5 Prosegment catalyzed folding in APs ...... 13
1.2.6 Malarial APs ...... 14
1.2.7 Plasmepsin II ...... 15
1.2.8 Expression of PMII ...... 16
1.2.9 Unique characteristic of the PMII PS ...... 17
CHAPTER 2 Characterization of PMII with various lengths of prosegment ...... 19
2.1 Introduction ...... 19
2.2 Materials and Methods ...... 20
2.2.1 Materials ...... 20
2.2.2 Generation of PMII expression constructs ...... 20
2.2.3 Expression and purification of PMII constructs ...... 22
2.2.4 Purification of PMII constructs ...... 22
2.2.4.1 Extended PSPMII and truncated PSPMII ...... 23
2.2.4.2 NoProPMII ...... 24
2.2.5 Protein concentration determination of PMII constructs and mature ...... 24
2.2.6 Optimum pH of activity and stability of PMII mature enzymes ...... 25
2.2.7 Structural characteristic analysis of PMII mature enzymes ...... 26
2.2.7.1 Far-UV circular dichroism spectroscopy ...... 26
2.2.7.2 Intrinsic fluorescence spectroscopy ...... 26
2.2.8 Thermostability analysis using differential scanning calorimetry ...... 26
2.2.9 Kinetic parameters of PMII mature enzymes ...... 27
2.2.10 Active enzyme comparison of PMII mature enzymes ...... 27
2.2.11 Structural prediction of +2 PMII and +12 PMII mature enzymes ...... 28
2.2.12 Statistical Analysis ...... 28
2.3 Results and Discussion ...... 28
2.3.1 Construction and soluble expression of PMII constructs ...... 28
2.3.2 Purification of PMII constructs ...... 35
2.3.3 Activation to +2 PMII, +12 PMII, and NoProPMII mature enzymes ...... 43
2.3.4 Optimum pH of activity and stability of PMII mature enzymes ...... 45
2.3.5 Structural characteristics of PMII mature enzymes ...... 48
2.3.5.1 Far-UV circular dichroism spectroscopy ...... 48
2.3.5.2 Intrinsic fluorescence spectroscopy ...... 50
2.3.6 Differential scanning calorimetry of PMII mature enzymes ...... 52
2.3.7 Kinetic parameters of PMII matures ...... 54
2.3.8 Active enzyme comparison using pepstatin-A ...... 56
2.3.9 Structural prediction of +2 PMII and +12 PMII mature enzymes ...... 58
2.4 Conclusion ...... 60
CHAPTER 3 Kinetically-stabilized and PS-catalyzed folding of PMII ...... 62
3.1 Introduction ...... 62
3.2 Materials and Methods ...... 64
3.2.1 Materials ...... 64
3.2.2 Expression and purification of PMII ...... 65
3.2.3 Preparation of refolded PMII ...... 65
3.2.4 Kinetic stability of native PMII ...... 65
3.2.5 Prosegment-catalyzed PMII folding ...... 66
3.2.6 Prosegment inhibition studies ...... 66
3.2.7 Statistical Analysis ...... 67
3.3 Results and Discussion ...... 67
3.3.1 Kinetic stability of PMII ...... 67
3.3.2 Prosegment-catalyzed folding of refolded PMII ...... 71
3.3.3 Prosegment affinity toward native PMII ...... 75
3.3.4 Energy landscape of uncatalyzed and prosegment-catalyzed PMII folding ...... 77
3.4 Conclusion ...... 80
CHAPTER 4 Structural characteristics of PMII zymogens ...... 81
4.1 Introduction ...... 81
4.2 Materials and Methods ...... 83
4.2.1 Expression and purification of PMII zymogens ...... 83
4.2.2 Structural determination of PMII zymogens ...... 83
4.2.3 Homology modelling of extended PSPMII ...... 83
4.2.4 Molecular dynamics simulations of PMII zymogens ...... 84
4.2.5 Structural visualization, surface properties, intermolecular interactions and flexibility of PMII zymogens ...... 85
4.3 Results and Discussion ...... 86
4.3.1 Secondary and tertiary structure of PMII zymogens ...... 86
4.3.2 In silico structural study ...... 88
4.3.3 Homology modelling of extended PSPMII ...... 89
4.3.4 Molecular dynamics simulations of PMII zymogens ...... 91
4.3.5 Electrostatic surface potential of PMII zymogens ...... 93
4.3.6 Flexibility of extended PSPMII ...... 95
4.3.7 Intermolecular-molecular contacts between the PS and main body of PMII ...... 97
4.4 Conclusion ...... 101
CHAPTER 5 Conclusion and future recommendations ...... 104
5.1 Conclusion ...... 104
5.2 Future recommendations ...... 107
REFERENCES ...... 109
APPENDICES ...... 120
Appendix 1 N-terminus analysis of extended PSPMII ...... 120
Appendix 2 N-terminus analysis of +2 PMII mature activated from extended PSPMII ...... 124
Appendix 3 N-terminus analysis of truncated PSPMII ...... 128
Appendix 4 N-terminus analysis of +12 PMII mature activated from truncated PSPMII ...... 132
Appendix 5 N-terminus analysis of NoPro mature ...... 136


LIST OF TABLES

Table 1.1: Family of aspartic proteases ...... 6

Table 2.1: N-terminal sequencing of PMII constructs ...... 45

Table 2.2: Secondary structure distribution of PMII matures...... 50

Table 2.3: Intrinsic protein (Trp) fluorescent signal of PMII mature enzymes ...... 52

Table 2.4: Thermal stability of PMII mature enzymes ...... 54

Table 2.5: Kinetic parameters of PMII mature enzymes ...... 56

Table 3.1: Kinetic parameters of extended PS and truncated PS ...... 72

Table 4.1: H-bond and salt-bridge interactions between the extended and truncated PSs and native PMII ...... 97


LIST OF FIGURES

Figure 1.1: Structure of a typical aspartic protease ...... 8

Figure 1.2: Steps involved in substrate catalysis and the interaction of catalytic groups with the surrounding residues ...... 10

Figure 1.3: Energy diagram of protein folding reaction ...... 13

Figure 1.4: Structure comparison of multiple APs zymogens ...... 18

Figure 2.1: Schematic diagram for generating the three different PMII constructs ...... 29

Figure 2.2: Complete sequence of extended PSPMII construct in pET32b+ plasmid ...... 31

Figure 2.3: Complete sequence of truncated PSPMII construct in pET32b+ plasmid ...... 33

Figure 2.4: Complete sequence of NoProPMII construct in pET32b+ plasmid ...... 35

Figure 2.5: Purification results for the extended PSPMII construct ...... 37

Figure 2.6: SDS-PAGE purification results for the extended PSPMII construct ...... 38

Figure 2.7: Purification results for the truncated PSPMII construct ...... 39

Figure 2.8: SDS-PAGE purification results for the truncated PSPMII construct ...... 40

Figure 2.9: Purification results for the NoProPMII construct...... 41

Figure 2.10: SDS-PAGE purification results for the NoProPMII construct ...... 42

Figure 2.11: Optimum pH activity of PMII mature enzymes ...... 46

Figure 2.12: Stability of PMII mature enzymes as a function of pH ...... 48

Figure 2.13: CD spectra of PMII mature enzymes ...... 49

Figure 2.14: Intrinsic protein (Trp) fluorescent signal of PMII mature enzymes ...... 51

Figure 2.15: DSC profiles of PMII mature enzymes ...... 53

Figure 2.16: Activity recovery upon pepstatin-A addition to mature enzymes ...... 57

Figure 2.17: Predicted structures of +12 PMII and +2 PMII generated from I-TASSER ...... 59

Figure 3.1: Unfolding kinetics of PMII using urea as denaturant...... 68

Figure 3.2: Uncatalyzed folding kinetics of PMII ...... 70

Figure 3.3: Free energy diagram summarizing the folding landscape of PMII ...... 71


Figure 3.4: Kinetic trace of recovery of Np activity upon incubation with several concentrations of PS ...... 73

Figure 3.5: PS-catalyzed folding of Rp to Np using truncated and extended PS ...... 74

Figure 3.7: Folding landscape of PMII in the presence and absence of PS ...... 79

Figure 4.1: CD analysis on the extended and truncated PSPMII ...... 86

Figure 4.2: Intrinsic protein (Trp) fluorescence spectra of truncated and extended PSPMII and +2 PMII mature ...... 87

Figure 4.3: Superimposition of the homology modelling structure and the crystal structure of 1PFZ ...... 90

Figure 4.4: Structural comparison between truncated and extended PSPMII ...... 92

Figure 4.5: Electrostatic surface potential of extended PSPMII ...... 94

Figure 4.6: Flexibility of the predicted structure of extended PSPMII superimposed with the crystal structure of 1PFZ ...... 96

Figure 4.7: H-bonds between the β-sheet and α-helix of the PS and the adjacent residues in the main body of native PMII...... 99

Figure 4.8: Salt-bridge interactions between the extra PS residues and adjacent charged residues within the main body of PMII ...... 100

Appendix Figure 1.1: Summary results on N-terminus analysis of extended PSPMII ...... 120

Appendix Figure 1.2: Results on N-terminus analysis of extended PSPMII ...... 123

Appendix Figure 2.1: Summary results on N-terminus analysis of +2 PMII mature activated from extended PSPMII ...... 124

Appendix Figure 2.2: Results on N-terminus analysis of +2 PMII mature activated from extended PSPMII ...... 127

Appendix Figure 3.1: Summary results on N-terminus analysis of truncated PSPMII ...... 128

Appendix Figure 3.2: Results on N-terminus analysis of truncated PSPMII ...... 131

Appendix Figure 4.1: Summary results on N-terminus analysis of +12 PMII mature activated from truncated PSPMII ...... 132

Appendix Figure 4.2: Results on N-terminus analysis of +12 PMII mature activated from truncated PSPMII ...... 135

Appendix Figure 5.1: Summary results on N-terminus analysis of NoProPMII mature ...... 136

Appendix Figure 5.2: Results on N-terminus analysis of NoProPMII mature ...... 139


LIST OF ABBREVIATIONS

APs, aspartic peptidase or protease(s)

BSA, bovine serum albumin

CAPS, N-cyclohexyl-3-aminopropanesulfonic acid

CD, circular dichroism

DSC, differential scanning calorimetry

Extended PSPMII, plasmepsin II expressed with 60 amino acid residue prosegment

FPLC, fast protein liquid chromatography

HAP, Histoaspartic protease

IPTG, isopropyl-β-D-thiogalactopyranoside

NoProPMII, plasmepsin II expressed without prosegment

NoProPMII mature, mature PMII produced in the absence of prosegment

Np, native PMII

PCR, polymerase chain reaction

PMII, plasmepsin II

proPMII, plasmepsin with full length prosegment

PS, prosegment

PS-Np, prosegment-native PMII complex

PS-Rp, prosegment-refolded PMII complex

Rp, refolded PMII

Tm, melting temperature

Truncated PSPMII, plasmepsin II expressed with 48 amino acid residue prosegment

2837b, synthetic substrate optimized for plasmepsin II cleavage

+2 PMII mature, mature PMII activated from Extended PSPMII

+12 PMII mature, mature PMII activated from Truncated PSPMII


CHAPTER 1 Introduction and literature review

1.1 Introduction

The function of a protein is dependent on its conformational stability in its biologically active state (Demidyuk et al., 2010). Therefore, it is important to understand how proteins fold into their biologically active states and how these active states are stabilized. There are two fundamental categories of protein stability: thermodynamic and kinetic (Sanchez-Ruiz, 2010). Thermodynamic stability is determined by the free energy difference between two ground states [e.g., native (N) and unfolded (U)]. It can be summarised using the Lewis equation, ΔG = –RT ln K, where ΔG is the change in unfolding free energy [ΔG = G(U) – G(N)], R is the gas constant, T is the absolute temperature, and K is the unfolding equilibrium constant measured at constant pressure and temperature. The Lewis equation connects to the concept of thermodynamic stability in that protein folding in any physiological environment is driven towards the state of lowest free energy (Anfinsen, 1973). Kinetic stability, on the other hand, is related to the free energy barrier (the transition state) that separates the N state from the non-functional form. In kinetically stable proteins, there is a high free energy barrier between the two and N is not thermodynamically favorable with respect to the inactive intermediate state (Baker et al., 1992).
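As a brief numerical illustration of the Lewis equation above, the following Python sketch converts between an unfolding equilibrium constant and the unfolding free energy. The function names, the default temperature of 298.15 K and the example values of K are assumptions made for illustration only; they are not taken from the experiments in this thesis.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def unfolding_free_energy(k_unfold, temperature_k=298.15):
    """Return dG = -RT ln K (kcal/mol) for N <-> U, with K = [U]/[N]."""
    if k_unfold <= 0:
        raise ValueError("the equilibrium constant must be positive")
    return -R_KCAL * temperature_k * math.log(k_unfold)


def equilibrium_constant(delta_g, temperature_k=298.15):
    """Invert the Lewis equation: K = exp(-dG / RT)."""
    return math.exp(-delta_g / (R_KCAL * temperature_k))


if __name__ == "__main__":
    # Hypothetical example: one unfolded molecule per million native ones.
    dg = unfolding_free_energy(1e-6)
    print(f"dG(unfolding) = {dg:.2f} kcal/mol")
```

With K below 1 the computed ΔG is positive, i.e., the native state is thermodynamically favoured; the hypothetical K of 10⁻⁶ corresponds to roughly 8.2 kcal/mol at 25 °C. A kinetically stabilized protein is the opposite case, in which the native state persists even though ΔG does not favour it.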

Aspartic proteases (APs) are believed to adopt kinetic stability to maintain their biologically active structure. Research on native pepsin, a representative of APs, suggests that it is irreversibly denatured upon unfolding (Dee and Yada, 2010), whereas pepsinogen (i.e., the zymogen form of pepsin that includes the PS) can be reversibly unfolded (Privalov et al., 1981). This indicates that native pepsin exists in a kinetically trapped or metastable native state. In this situation, the native form is separated from the denatured state by a large unfolding activation barrier (Dee and Yada, 2010), which prohibits return to the native state upon unfolding. Instead, the protein folds into a stable intermediate form (Sohl et al., 1998). In the presence of a PS, however, an intermediate complex of refolded-prosegment (RP-PS) is formed, thus lowering the energy barrier. The metastable native state that was previously protected by a large unfolding activation barrier can then be formed when the PS detaches upon activation (Truhlar et al., 2004; Dee and Yada, 2010).

Although the function of the PS is not entirely understood, there is evidence that indicates it is required in native structure folding. The involvement of the PS in the folding catalysis has been extensively studied in serine proteases such as subtilisin E, α-lytic protease and Streptomyces griseus peptidase B (Cunningham et al., 1999; Fisher et al., 2007; Truhlar et al., 2004). As well, similar catalytic behaviour has been detected in porcine pepsin (Dee and Yada, 2010) and there are numerous pepsin-like APs that are expected to mimic this behaviour.

Plasmepsin II (PMII) is a malarial AP that is believed to show similar catalytic behaviour to pepsin. PMII shares a significant amount of sequence with other eukaryotic APs (Silva et al., 1996). However, the PS of PMII is unique because it is larger (i.e., approximately 125 residues with a hydrophobic stretch of 21 amino acids), has an uncharacteristic conformation, and has no sequence similarity to other APs (Gardner et al., 2002). Due to its unique PS, PMII also has a different activation mechanism as compared to other APs. The PS of PMII functions like a “harness” that keeps the two lobes (the N- and C-termini) of the molecule apart (Bernstein et al., 1999). Despite this uniqueness, little attention has been paid to understanding the functionality of PMII’s PS, though research exploring the recombinant expression, crystallographic analysis and inhibition of mature PMII has been reported (Asoja et al., 2002; Clemente et al., 2006; Kim et al., 2006; Miura et al., 2010).

Understanding the function of PMII’s PS as a folding catalyst has been slowed because the structure of the PMII zymogen has not been fully elucidated. The hydrophobic stretch in the amino acid sequence makes expression of full length zymogen very difficult; however, a truncated version of the PMII zymogen has been used to express mature PMII. The truncated version has 48 PS residues (a length similar to other AP PSs) that interact substantially with the C-domain but not the N-domain (Bernstein et al., 1999). Research on cathepsin D (i.e., the closest AP to PMII, with 35% sequence identity) has reported contradictory results regarding the importance of the PS. Conner (1992) reported that the PS was necessary for proper folding, while Fortenberry and Chirgwin (1995) suggested otherwise.

Recently, Parr-Vasquez and Yada (2010) investigated two chimeric proteins and found that native PMII fused with a pepsin PS was capable of autoactivation in an acidic environment and capable of synthetic substrate cleavage. Native pepsin fused with a PMII PS, on the other hand, was unstable and misfolded. These findings highlight the importance of the PS in the proper folding of pepsin but not PMII. However, because the experiment focused on chimeric proteins, a detailed role of PMII’s PS could not be determined. In a more closely related paper, Xiao and co-workers (2011) discovered that PMII without its PS was unable to refold spontaneously once it had been unfolded. Upon unfolding, native PMII folded into an inactive intermediate form that was obstructed from transformation into a functional conformation by a folding barrier of 25.5 kcal/mol (Xiao et al., 2011). Due to this barrier, it would take weeks for native PMII to equilibrate with the intermediate form under normal circumstances which, in reality, are impossible to achieve due to aggregation and proteolytic effects (Xiao et al., 2011). This finding suggested that PMII needs its PS to catalyze folding within a practical time frame. The purpose of this research was to better understand the role of the PMII PS. The research has been divided into three parts, each with a separate but related objective.

Objective 1: Determine and compare the conformation of PMII mature enzymes obtained from constructs of PMII with three PSs of varying lengths (i.e., extended PSPMII: native with 60 residues of PS, truncated PSPMII: native with 48 residues of PS and NoProPMII: native with no PS) (Chapter 2).

Objective 2: Determine the kinetic stability of native PMII and measure the effect of the PS (extended and truncated) as a folding catalyst (Chapter 3).

Objective 3: Compare the structural conformation of extended and truncated PSPMII and identify the contacts and interactions made by the extra residues in the extended zymogen (Chapter 4).

The following three hypotheses were proposed:

Hypothesis 1: In PMII, the PS is critical in initiating the folding and stabilizing the biologically active native conformation.

Rationale: Proteins with PSs are initially synthesized as inactive precursors that undergo post-translational processing into proteins with optimal proteolytic function. Similar to other proteins with PS, the PS of PMII is believed to be crucial for the correct folding of the biologically active enzyme.

Hypothesis 2: PS length is critical to the activity and folding of PMII.

Rationale: The conversion of inactive zymogen to an active form is a complex process and the presence of the PS is critical. The sequence after the hydrophobic stretch in the PS is believed to enable the appropriate interaction during the conversion process.

Hypothesis 3: Native PMII exists as a kinetically stable protein with a high free energy barrier that separates the native and non-functional intermediate forms, and PMII is irreversibly denatured without the PS.

Rationale: Because it shares similarity with other APs in the same A1 family (e.g., pepsin and cathepsin D), PMII is expected to behave similarly, with the PS functioning as a folding catalyst by lowering the activation energy and accelerating the formation of the native state from the unfolded state.

1.2 Literature Review

1.2.1 The family of APs

Classically, proteolytic enzymes have been divided into four groups based on their catalytic mechanism: aspartic, cysteine, metallo and serine proteases. Acid proteases or aspartic peptidases (APs) are a widely distributed subfamily of proteolytic enzymes in the endopeptidase family. APs can be found in animals, bacteria, protozoa, archaea, fungi, plants, parasites and retroviruses (Davies, 1990). Based on the MEROPS database (http://www.merops.sanger.ac.uk), APs are further divided into six groups according to their amino acid sequence homology: AA, AC, AD, AE, AF, and an unassigned group (Table 1.1).


Table 1.1: Family of aspartic proteases

Group        Family   Type of Peptidase
AA           A1       Pepsin A (Homo sapiens)
             A2       Walleye dermal sarcoma virus retropepsin
             A3       Cauliflower mosaic virus-type peptidase (cauliflower mosaic virus)
             A9       Spumapepsin (human spumaretrovirus)
             A11      Copia transposon peptidase (Drosophila melanogaster)
             A28      DNA-damage inducible protein 1 (Saccharomyces cerevisiae)
             A32      PerP peptidase (Caulobacter crescentus)
             A33      Skin SASPase (Mus musculus)
AC           A8       Signal peptidase II
AD           A22      Presenilin 1 (Homo sapiens)
             A24      Type 4 prepilin peptidase 1 (Pseudomonas aeruginosa)
AE           A25      Gpr peptidase (Bacillus megaterium)
             A31      HybD peptidase (Escherichia coli)
AF           A26      Omptin (Escherichia coli)
Unassigned   A5       Thermopsin (Sulfolobus acidocaldarius)
             A36      Sporulation factor SpoIIGA (Escherichia coli)

(Adapted from the MEROPS website: http://www.merops.sanger.ac.uk)

The largest group is the AA group, which can be further divided into eight families based on evolutionary relationship and tertiary structure. Among the eight families, enzymes in the A1 family are widely distributed and can be active in acidic environments without cofactors (Rawling et al., 2006). The A1 family is the most extensively studied, is known for enzymes that contribute to proteolysis of food proteins in the vertebrate stomach, and includes pepsin, the second enzyme to be crystallized (Northrop, 1930). To date, 181 enzymes have been identified within the A1 family and 34 of them have had their structures crystallized, solved and deposited in the Protein Data Bank (Rawling et al., 2012). The A1 family includes the majority of plant APs and pepsin-like enzymes such as pepsin, rennin, and plasmepsin.


1.2.2 General structure and mechanism of APs

In common with other members of the A1 family, a great majority of pepsin-like APs can be generally characterized by two catalytic aspartic residues within the active site, an optimal activity in the acidic pH range and specific inhibition by pepstatin-A (Davies, 1990). Most APs exist as monomers and are formed from a single polypeptide chain that is folded into two structurally similar domains. Each domain contributes a catalytic Asp residue that converges in the plane of an approximate two-fold axis of symmetry (Dunn, 2002). The enzyme forms a classic bilobal structure with a large cleft that separates the two domains. The active site of the enzyme is located at the base of this large cleft and is stabilized via multiple hydrogen bonds in highly conserved Asp-Thr/Ser-Gly sequences (Tang and Wong, 1987). Another structural feature of APs is a highly mobile β-hairpin loop known as the “flap” that covers the substrate-binding cleft. The flap is found in an open conformation in apoenzymes and in a closed conformation in the presence of an inhibitor/substrate (Davies, 1990). The structure of a typical AP is shown in Figure 1.1.


Figure 1.1: Structure of a typical aspartic protease. The C-terminal and N-terminal domains are colored in yellow and red, respectively. The catalytic Asp residues are represented in stick mode. The image was generated in PyMOL from the 1PSG PDB file. (Adapted from Hartsuck et al., 1992)

Numerous studies have been conducted in an attempt to elucidate the catalytic mechanism of these enzymes. Many of these studies used the binding of pepstatin or pepstatin-like inhibitors as a model to investigate the tetrahedral intermediate and discovered similar extended conformations present in the active site cleft of the APs (for a review see James, 2004). Observation of the fragment of pepstatin bound to the active site of rhizopuspepsin showed that the hydroxyl group of the first statine was bound with both oxygen atoms to both catalytic Asp residues (i.e., Asp-215 and Asp-32; Bott et al., 1992). However, structural studies on penicillopepsin inhibited by pepstatin by James et al. (1992) showed that the distance between the two inner oxygen atoms (i.e., oxygen from the catalytic Asp) is 2.9 Å. The authors proposed that Asp-215 functioned as a general base while Asp-32 donated a proton to the carbonyl oxygen of the scissile peptide bond.

Even though specific details remain unclear, it is generally agreed that the catalytic mechanism of APs is an acid-base catalysis involving nucleophilic attack by a water molecule. It has been proposed that catalysis occurs when the first aspartic residue (Asp-32) is protonated and the second residue (Asp-215) remains charged. The charged Asp-215 functions as a general base and activates the nucleophilic attack on the carbonyl carbon of the scissile peptide bond to create the tetrahedral oxyanion intermediate. Meanwhile, Asp-32 assists in the protonation of the oxyanion intermediate, which destabilizes the carbon-nitrogen bond of the scissile peptide and breaks the intermediate into two products. The carboxyl product (from the amino terminal side of the peptide bond) remains hydrogen bonded to Asp-32 and Asp-215 in their negatively charged forms and is ready for the next round of catalysis (Dunn, 2002). The general catalytic mechanism for most APs is illustrated in Figure 1.2.


Figure 1.2: Steps involved in substrate catalysis and the interaction of catalytic groups with the surrounding residues. (a) Acid/base foundation support by the Asp residues and the nucleophilic attack from the water molecule. (b) Formation of tetrahedral intermediate and interaction with nearby residues adjacent to the catalytic site. (Adapted from Andreeva and Rumsh, 2001)

1.2.3 Multifunctional PS in APs

All vertebrate APs and the majority of fungal APs are produced as inactive precursors (zymogens) with a residue extension known as a prosegment (PS) attached to the active enzyme structure. The zymogen then undergoes activation to produce the active enzyme when the conditions are appropriate (Khan and James, 1998). The PSs of APs are reported to have high sequence similarity and their conformation consists of a long β-strand attached to the N-terminus followed by small α-helices. In most cases, the PS is approximately 40 to 50 amino acid residues in length, of which a high proportion are positively charged (Dunn, 2002).


Initially, the PS was believed to specifically regulate enzyme activation (Koelsch et al., 1994). In the zymogen, interactions among the charged residues between the PS and the native enzyme are formed via several hydrogen bonds and salt-bridge interactions, including an interaction between Lys-36p (pepsin numbering; p denotes the PS) and one of the active site Asp residues. This interaction sterically blocks access to the active site and inhibits the enzyme, thus preventing undesirable degradation throughout its passage in the cellular environment (Dunn, 2002). Upon exposure to an acidic environment, protonation of the charged residues disrupts the interactions and causes repulsion between the stabilizing forces (mainly salt bridges), which subsequently detaches the PS and activates the enzyme (Khan and James, 1998).

Over time, studies have shown that PSs have other roles outside of their inhibitory function in the activation of APs. In vitro folding experiments on multiple APs have shown that the presence of a PS is necessary for correct folding. Direct expression of AP-1 from Rhizopus niveus (Fakuda et al., 1996) and peptidase A from Saccharomyces cerevisiae (Van den Hazel et al., 1999) in the absence of a PS resulted in their rapid degradation and an inactive enzyme.

However, active forms (upon activation) for these two enzymes were produced when they were expressed with an attached PS. Recently, Parr-Vasquez and Yada (2010) observed a similar result when expressing PMII without the PS. The resulting enzyme had a similar secondary structure to the activated mature produced with the PS, but the activity recovered was approximately 20-fold less.

There has been a report that a PS is not critical for the correct folding of cathepsin D. When the enzyme was expressed in the absence of the 44-amino acid PS, the mature enzyme was steadily expressed and found to bind to pepstatinyl-agarose in an affinity column similar to the wild-type, which signified that the expressed mature folded correctly (Fortenberry and Chirgwin, 1995). These results contradict another study, where a mutation in the cathepsin D PS resulted in an unstable mature in vivo (Conner, 1992). Moreover, the crystal structure of mature cathepsin D obtained at neutral pH in the absence of a PS was discovered to be structurally rearranged and in a catalytically inactive form (Lee et al., 1998).

1.2.4 Prosegment catalyst folding

The role of the PS as a catalyst for folding was first established by Ikemura et al. (1987) who discovered, using subtilisin E (SbtE), that unfolded protein (U) was unable to retain its active form without the presence of the PS. Baker and coauthors (1992) conducted a series of refolding experiments on α-lytic protease and found that U was transformed into an intermediate state (I) that was partially folded and kinetically trapped, thus preventing the formation of catalytically active native form (N). In the absence of a PS, direct transition from U to N is thermodynamically unfavorable due to a higher stability (lower free energy) of U than N as shown in Figure 1.3 (Demidyuk et al., 2010). In the absence of the PS, U can be transformed into I under specific conditions. However, due to the lower energy state of U and higher energy barrier between I and N, the formation of N is unfavorable. In addition, I is not stable and undergoes degradation thereby preventing the formation of N.

Figure 1.3: Energy diagram of protein folding reaction. In the absence of PS, native enzyme is less thermodynamically stable and kinetically trapped with a large energy barrier. Upon the addition of PS, the N•P complex is more thermodynamically stable and the energy barrier is lowered. (Adapted from Demidyuk et al., 2010)

Adding PS to U will result in the formation of an I•P complex. The PS also decreases the magnitude of the energy barrier between I•P and N•P, which permits faster formation of the thermodynamically stable N•P complex (Demidyuk et al., 2010). Similar folding mechanisms have been reported for several enzymes that have PS-catalyzed folding such as subtilisin BPN' (Eder et al., 1993; Pulido et al., 2006), α-lytic protease (Sohl et al., 1998), Streptomyces griseus protease B (Truhlar et al., 2004), and pepsin (Dee and Yada, 2010).

1.2.5 Prosegment catalyzed folding in APs

Since the discovery of the PS’s role in protein folding (Ikemura et al., 1987), numerous experiments focusing on serine proteases have been reported, e.g., α-lytic protease (Sohl et al., 1998), Streptomyces griseus protease B (Truhlar et al., 2004), and carboxypeptidase Y (Winther et al., 1991). Recently, a series of experiments investigating PS-catalyzed folding have been conducted on APs using pepsin A (Dee et al., 2009; Dee et al., 2010).

Dee et al. (2009) reported that the PS functions as both a folding catalyst and inhibitor for native pepsin. In subsequent work, Dee et al. (2010) observed that native pepsin was kinetically stable due to a large unfolding barrier (24.5 kcal/mol) and experienced irreversible denaturation due to an equally large refolding barrier (24.6 kcal/mol), both of which indicated that native pepsin was thermodynamically metastable. The formation of native pepsin requires a PS, which drives the equilibrium towards the N•P complex and stabilizes the folding transition state, thereby catalyzing folding. Since pepsin is the archetype of a broad class of APs, it was hypothesized that native states were optimized for kinetic rather than thermodynamic stability, and may be a common feature for pepsin-like APs (Xiao et al., 2011).
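To give a sense of scale for barriers of this size, the following is a minimal sketch that converts an activation free energy into a first-order rate constant. It assumes the Eyring transition-state prefactor k_B·T/h and a temperature of 298.15 K; neither is stated in the text above, and the cited studies may have used a different prefactor (prefactors for protein folding are often taken to be much smaller). The function names are illustrative only.

```python
import math

R_KCAL = 1.987204e-3    # gas constant, kcal/(mol*K)
K_B = 1.380649e-23      # Boltzmann constant, J/K
H = 6.62607015e-34      # Planck constant, J*s


def eyring_rate(barrier_kcal_per_mol, temperature_k=298.15):
    """First-order rate constant (s^-1) for a given free energy barrier.

    ASSUMPTION: the prefactor is k_B*T/h (Eyring); this is not taken
    from the thesis and only sets the absolute scale of the result.
    """
    prefactor = K_B * temperature_k / H
    return prefactor * math.exp(-barrier_kcal_per_mol / (R_KCAL * temperature_k))


def rate_ratio(barrier_high, barrier_low, temperature_k=298.15):
    """Fold change in rate when a barrier drops from barrier_high to barrier_low.

    The prefactor cancels, so this ratio depends only on the barrier difference.
    """
    return math.exp((barrier_high - barrier_low) / (R_KCAL * temperature_k))


if __name__ == "__main__":
    # Barriers quoted above for native pepsin (Dee et al., 2010).
    for label, barrier in (("unfolding", 24.5), ("refolding", 24.6)):
        k = eyring_rate(barrier)
        half_life_h = math.log(2) / k / 3600.0
        print(f"{label}: k = {k:.2e} s^-1, half-life = {half_life_h:.1f} h")
```

Under these assumptions, each reduction of about 1.36 kcal/mol in the barrier corresponds to roughly a ten-fold increase in rate at room temperature, which is why a barrier near 25 kcal/mol implies very slow interconversion.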

1.2.6 Malarial APs

Members of the AP family are widely distributed in nature and are even found within the parasite responsible for malarial disease. Malaria was one of the earliest known diseases and remains the most lethal parasitic disease, especially in developing countries. Annually, it is responsible for the deaths of 2-3 million people worldwide (Snow et al., 2005).

Taxonomically, malarial parasites are in the kingdom protozoa, phylum Apicomplexa, class Sporozoa, order Haemosporida, family Plasmodiidae and genus Plasmodium (Coatney et al., 1971). Four Plasmodium species commonly infect humans (i.e., P. falciparum, P. vivax, P. malariae and P. ovale), but P. falciparum and P. vivax cause the highest morbidity and mortality with the former causing the most deaths (Gardiner et al., 2009). In 2002, Gardner and coworkers successfully identified 10 plasmepsin genes within the P. falciparum genome. To date, only plasmepsin I (PMI), plasmepsin II (PMII), plasmepsin IV (PMIV) and histoaspartic protease (HAP) have been characterized in detail. These four Plasmodium proteases are highly homologous, with sequence identity between PMI and PMII, PMIV, and HAP at 73%, 68%, and 63%, respectively (Bhaumik et al., 2009). Among the Plasmodium proteases, PMI and PMII (which are alternatively known as hemoglobinases) have been extensively studied (Francis et al., 1997). These two enzymes are believed to have a role in the initial step of host cell hemoglobin degradation within the acidic food vacuole of the parasite (Ersmark et al., 2006). PMII is the topic of this thesis work.

1.2.7 Plasmepsin II

Like many eukaryotic APs, PMII is synthesized in the cell as a zymogen. Under favorable conditions, the inactive zymogen then undergoes molecular rearrangement followed by the proteolytic removal of the PS to produce the active PMII mature. Activation to form mature enzyme in vivo is carried out by a maturase (Francis et al., 1997). The PMII zymogen can also be activated autocatalytically in an acidic environment (pH 3.5-5.0). However, the cleavage site for autoactivation is 12 residues upstream from the wild-type mature N-terminus (Hill et al., 1994). Structural analysis of PMII reveals that the mature, formed enzyme is composed of 329 amino acids and has the bilobal shape, topology, and β-hairpin structure within the N-terminal domain (the “flap”) typical of eukaryotic APs (Silva et al., 1996; Asojo et al., 2003; see section 1.2.2). As with other APs, the catalytic site of the enzyme is located at the core junction of the two domains and consists of two aspartic residues: Asp-34 and Asp-214 (Silva et al., 1996).

1.2.8 Expression of PMII

Isolation of sufficient quantities of plasmepsins from parasites is impractical for detailed structure-function studies. Therefore, recombinant enzymes are needed. However, recombinant expression of the full length PMII zymogen has been unsuccessful due to low levels of expression and misinitiation of translation at Met-70 (Silva et al., 1996; Luker et al., 1996). Moreover, the presence of hydrophobic trans-membrane regions as well as complex folding motifs limit the success of the normal expression technique that was initially developed for expressing prevalent, soluble proteins. Problems encountered during the expression of membrane proteins include the formation of toxic products or inclusion bodies, which limit protein yield (Loll, 2003). The difficulties with membrane protein expression are further amplified during purification where the tedious and time-consuming testing of detergents and extraction conditions that remove unwanted endogenous membrane proteins while maintaining the stability and function of the desired protein have notoriously impeded research (Loll, 2003).

To overcome this problem, a truncated version of the recombinant PMII zymogen (48 residues rather than 125 in the PS) was created and has been successfully expressed in high yields (Hill et al., 1994). This construct has been used to express the PMII zymogen. This zymogen is approximately the same size as the archetypal AP zymogens and is inactive at neutral pH. It also undergoes autoactivation at pH 4.7 (Hill et al., 1994). Bernstein and coworkers (1999) successfully crystallized the truncated zymogen and observed that the PMII zymogen was quite different from other pepsin-like zymogen structures.

1.2.9 Unique characteristic of the PMII PS

While PMII shares substantial sequence similarity with other eukaryotic APs, PMII’s PS is longer and has an uncharacteristic conformation when compared to other AP PSs (Bernstein et al., 1999). Comprising 125 amino acids, the PS of PMII is approximately 70 residues longer than the typical pepsin-like PS (Barry et al., 1995). As well, the PS of PMII has a well-defined crescent-shaped secondary structure with a β-strand followed by an α-helix, a turn, a second α-helix and a coil connection to the mature segment (Figure 1.4). The PS of PMII interacts extensively with the C-domain of the mature PMII (Kim et al., 2006) and contains a stretch of 21 residues of hydrophobic trans-membrane helix located in the middle of the PS and 36 residues from the C-terminus sequences. This section functions like an anchor and attaches the PMII zymogen to the cytostome region, thus forming a complex. This complex is delivered from the endoplasmic reticulum to the digestive vacuole where activation and hemoglobin digestion occurs (Francis et al., 1994).

A structural comparison of the zymogen and the mature PMII showed that the zymogen experiences a large domain opening resulting in changes to the architecture of the active site (Bernstein et al., 1999). The distance between the two catalytic Asp residues (Asp-34 and Asp-214) in the zymogen form is increased by an additional 3.2 Å relative to the active PMII. In this conformation, the Asp residues are unable to provide the acid/base assistance for substrate catalysis, as they do in the active form. In addition, interactions between the mature enzyme and the PS change the symmetry of the “fireman grip”, thus disrupting the stabilizing forces and altering the folding of the active site (Bernstein et al., 1999). Therefore, in the zymogenic form, the catalytic aspartic acid residues are too distant and the absence of a properly formed active site accounts for the unique inhibition mode in PMII.

Figure 1.4: Structure comparison of multiple AP zymogens. Crystal structures of the AP zymogens (a) porcine pepsinogen A, (b) prophytepsin, and (c) proplasmepsin II. (Adapted from Bernstein and James, 1999)

CHAPTER 2 Characterization of PMII with various lengths of prosegment

2.1 Introduction

Plasmepsin II (PMII) has been extensively studied as it is believed to be one of the first enzymes to trigger hemoglobin digestion in the parasites that cause malaria (Bhaumik et al., 2012; Ersmark et al., 2006). PMII is expressed within cells as an inactive zymogen in which an extra sequence (i.e., the prosegment, PS) is hydrolyzed to produce the active protein (Demidyuk et al., 2010). The PS of PMII is substantially longer than that of most other aspartic peptidase (AP) zymogens (Hill et al., 1994). In the middle of the PS, there is a hydrophobic region that forms a trans-membrane helix that anchors the protein to the membrane during transport from the endoplasmic reticulum to the digestive vacuole (Francis et al., 1994). Direct isolation of this enzyme from the host parasite is impractical and the presence of the hydrophobic membrane domain makes over-expression difficult due to the limited availability of soluble protein expression systems (Silva et al., 1996). Therefore, in this study, PMII was expressed with a truncated PS that did not include the hydrophobic region (Hill et al., 1994).

Little research has been conducted regarding the effect of PS truncation on the folding of PMII as compared to the full length PS (see Section 1.2.8). In this chapter, the effect of PS length on PMII folding was assessed by generating three constructs of recombinant PMII with varying PS lengths attached to the native enzyme: extended PSPMII (60 residues), truncated PSPMII (48 residues) and NoProPMII (0 residues). These constructs were expressed, purified, and the mature forms (+2 PMII mature, +12 PMII mature, and NoProPMII mature, respectively) were compared in terms of structural characteristics (secondary and tertiary structural analyses by circular dichroism and intrinsic tryptophan fluorescence spectroscopy), thermostability (analysed by differential scanning calorimetry), kinetic parameters, structural prediction and activity (using pepstatin-A) in order to better understand the importance of the PS in PMII folding.

2.2 Materials and Methods

2.2.1 Materials

The pET32b(+) vector plasmids and Escherichia coli Rosetta-gami™ B (DE3) pLysS cells were purchased from Novagen (Mississauga, ON, Canada). A GenElute™ Plasmid Miniprep Kit was purchased from Sigma-Aldrich (St. Louis, MO, USA). Pfu DNA polymerase was obtained from Fermentas Life Sciences (Burlington, ON, Canada) and primers were synthesized by Sigma Genosys (Oakville, ON, Canada). A QIAquick® PCR Purification Kit was purchased from Qiagen (Germantown, MD, USA). HIS®-Select 6.4 mL cartridges were obtained from Sigma-Aldrich (St. Louis, MO, USA). Mono Q 10/100 GL and Superose™ 12 10/300 GL columns were obtained from GE Healthcare (Chalfont St Giles, UK). Centrifugal Filter Units were supplied by Millipore Corp. (Bedford, MA, USA). All chemicals and media were obtained from either Fisher Scientific (Nepean, ON, Canada) or Sigma-Aldrich (St. Louis, MO, USA).

2.2.2 Generation of PMII expression constructs

Constructs of truncated PSPMII and NoProPMII fused with highly soluble thioredoxin protein transformed in Escherichia coli Rosetta-gami™ B (DE3) pLysS cells using pET32b(+) vector plasmid were used in this experiment. The amplified PCR product for NoProPMII was cloned previously into a modified pET32b(+) expression vector using the HindIII and NcoI restriction enzyme cut sites. The modified vector has its native thrombin cut site removed and replaced upstream of the NcoI restriction cut site in the multiple cloning site (Parr-Vasquez and Yada, 2010). The following primers were used for the amplification of truncated PSPMII and NoProPMII constructs (respectively):

Forward: 5’CCCCGAGGCCGCTGCCCTGAGTTCAAATGATAATATCG3’

Reverse: 5’CCAGCTTTTATCCCTTCTTTTTCGC3’

Forward: 5’GGCCATGGGGAGTTCCAAATGATGGTGTCGAATTAG3’

Reverse: 5’CCAAGTCTTTTATAAATTCTTTTTAGC3’

In order to acquire the extended PSPMII construct, pmii in the expression vector pET32b(+) was used for amplification of the target region. The extended PSPMII region was amplified with a 5’ overhang coding for NcoI restriction enzyme cut site and the 3’ overhang encoding for XhoI restriction enzyme cut site. The following primers were used for the amplification of extended PSPMII construct:

Forward: 5’ATCCCATGGCGCAAAGAGATAATGAAATGAATG3’

Reverse: 5’GCACTCGAGTTATAAATTCTTTTTAGCAAGAGC3’

The amplified PCR product was cloned into a pET32b(+) expression vector using the NcoI and XhoI restriction cut sites. Constructs were sequenced in-house using an ABI Prism DNA Sequencer.
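As a quick consistency check on the primer design described above, the sketch below searches the extended PSPMII primers for the restriction recognition sequences. The recognition sequences (NcoI: CCATGG; XhoI: CTCGAG; HindIII: AAGCTT) are standard; the dictionary and function names are illustrative and not part of the original protocol.

```python
# Recognition sequences of the restriction enzymes named in the text.
SITES = {"NcoI": "CCATGG", "XhoI": "CTCGAG", "HindIII": "AAGCTT"}

# Extended PSPMII primers, copied from the text (5' to 3').
PRIMERS = {
    "extended_forward": "ATCCCATGGCGCAAAGAGATAATGAAATGAATG",
    "extended_reverse": "GCACTCGAGTTATAAATTCTTTTTAGCAAGAGC",
}


def find_sites(sequence):
    """Return {enzyme: 0-based position} for each recognition site present."""
    sequence = sequence.upper()
    return {
        enzyme: sequence.find(site)
        for enzyme, site in SITES.items()
        if site in sequence
    }


if __name__ == "__main__":
    for name, primer in PRIMERS.items():
        print(name, find_sites(primer))
```

The forward primer carries the NcoI site and the reverse primer the XhoI site, each preceded by three extra 5' bases, which agrees with the overhang design stated in the text.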

2.2.3 Expression of PMII constructs

All three constructs transformed in E. coli were used for the expression of recombinant protein. Cells were grown in Luria-Bertani growth medium containing 15 µg/mL kanamycin, 34 µg/mL chloramphenicol, 12.5 µg/mL tetracycline and 50 µg/mL ampicillin in an orbital shaker at 37°C until the OD600 was approximately 1.0, and were then induced with isopropyl-β-D-thiogalactopyranoside (IPTG) at 0.5 mM. Following the induction, the cells were further incubated at 30°C for 16 hours. After expression, cells were recovered by centrifugation at 4800 x g for 10 min and stored at -20°C for further use. Cell pellets from 4 L cultures were resuspended in 100 mL 50 mM NaPO4, pH 7.5, containing 0.2 mg lysozyme and 4 µL benzonase nuclease (Novagen, Mississauga, ON, Canada) overnight at 4°C with gentle shaking. To remove the cell debris, the sample was centrifuged at 18,000 x g for 20 min at 4°C.

2.2.4 Purification of PMII constructs

Supernatant from each construct was applied to a HIS-Select Cartridge (Sigma-Aldrich, Oakville, ON, Canada) in an ÄKTA™ FPLC system (GE Healthcare, Chalfont St. Giles, UK). The column was washed with Buffer A (50 mM NaPO4/0.3 M NaCl/10 mM imidazole, pH 7.5) until a stable UV baseline was obtained. To remove weakly bound proteins, 10% Buffer B (50 mM NaPO4/0.3 M NaCl/250 mM imidazole, pH 7.5) was applied to the column and washed using eight column volumes. Target fusion protein was obtained by washing the column with 100% Buffer B, and then the extended and truncated PSPMII proteins were further purified using the protocol in Section 2.2.4.1 and the NoProPMII protein was further purified using the protocol in Section 2.2.4.2.

2.2.4.1 Extended PSPMII and truncated PSPMII

Following the initial purification described in Section 2.2.4, eluted samples of extended and truncated PSPMII constructs were concentrated and washed using the Amicon Ultra-15 centrifugal filter unit with an Ultracel-50 membrane (Millipore Corp., Bedford, MA, USA). This process was repeated three times using recombinant enterokinase buffer (20 mM Tris-HCl/0.2 M NaCl/20 mM CaCl2). Recombinant enterokinase (rEK-Novagen, Canada) was added at a ratio of 1:2000 (rEK:protein) and then the sample was incubated at 21°C for 16 hours. Next, the protein sample was washed three times with 20 mM Tris/HCl, pH 7.5 using the Amicon Ultrafiltration system with Ultracel-30 membranes (MWCO 30 kDa).

The concentrated sample (5 mL) was applied to a Mono Q 10/100 GL (GE Healthcare, Chalfont St. Giles, UK) anion exchange column and then washed with 20 mM Tris/HCl, pH 7.5 (Buffer C). Afterwards, a gradient of 0-20% 20 mM Tris/HCl/1 M NaCl, pH 7.5 (Buffer D) was applied over 10 column volumes with a constant flow rate of 1 mL/min. Fractions containing the zymogen were collected, concentrated and stored at 4°C in Buffer C. To obtain mature PMII, the concentrated zymogen (2 mL) was activated in 50 mM sodium acetate, pH 4.7 for 1 hour at room temperature. Following activation, the protein sample was washed and concentrated in 20 mM Tris/HCl, pH 7.5 using the Amicon Ultrafiltration system with Ultracel-30 membranes (MWCO 10 kDa). The concentrated protein (0.5 mL) was applied to a Superose™ 12 10/300 GL column (GE Healthcare, Chalfont St Giles, UK) at a flow rate of 0.5 mL/min in order to recover the pure protein. The zymogen and mature samples were also subjected to SDS-PAGE and electroblotted on a polyvinylidene difluoride (PVDF) membrane for N-terminal sequence analysis using Edman Sequencing in order to verify the identity of the samples.

2.2.4.2 NoProPMII

Following the initial purification procedure described at the beginning of Section 2.2.4, the fusion protein was washed and concentrated using the Amicon Ultrafiltration system with Ultracel-30 membranes (MWCO 10 kDa) in thrombin buffer (200 mM Tris/HCl/1.5 M NaCl/25 mM CaCl2, pH 8.4). The sample was digested with thrombin using the ratio of 1:5000 (thrombin:protein) at 37°C for 1 hour. After digestion, the sample was washed and concentrated in 20 mM Tris/HCl, pH 7.5 using the Amicon Ultrafiltration system with Ultracel-30 membranes (MWCO 10 kDa). The concentrated sample (5 mL) was applied to a UNO-Q1 anion exchange column (BIO-RAD, Mississauga, ON, Canada) at a flow rate of 1 mL/min. The column was washed with Buffer C (0.3 M NaCl/50 mM Tris/10 mM imidazole, pH 7.5) until a flat UV baseline was obtained. To remove weakly bound proteins, 10% Buffer D (0.3 M NaCl/50 mM Tris/250 mM imidazole, pH 7.5) was applied to the column. The target protein was obtained by washing the column with 20% Buffer D.

In order to remove the remaining contamination, the sample was concentrated using the Amicon Ultrafiltration system with Ultracel-30 membranes (MWCO 10 kDa). The concentrated sample was applied to a Superose™ 12 10/300 GL column in 20 mM Tris/HCl, pH 7.0 at a volume of 0.5 mL per run with a flow rate maintained at 0.5 mL/min. The product resulting from thrombin digestion was subjected to SDS-PAGE and electroblotted on a PVDF membrane for N-terminal sequence analysis by Edman Sequencing to verify the identity of the sample.

2.2.5 Protein concentration determination of PMII constructs and mature enzymes

Protein concentrations were determined using two different methods. First, during the protein purification steps, protein concentration was determined in triplicate via the DC Protein Assay (Bio-Rad, Hercules, CA, USA) using bovine serum albumin as the standard. Second, once the pure protein was recovered, protein concentration was determined using an absorbance assay at 280 nm with extinction coefficients of 1.021 M⁻¹ cm⁻¹ and 1.091 M⁻¹ cm⁻¹ for the zymogen and the mature enzyme, respectively, as calculated from ProtParam (www.expasy.ca).
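The absorbance-based determination follows the Beer-Lambert law, c = A / (ε · l). The sketch below is a minimal illustration only. It assumes a 1 cm path length and treats the coefficient as whatever unit system the user supplies, so the result is in the reciprocal concentration unit of ε. The dilution argument and the function name are hypothetical additions for illustration.

```python
def concentration_from_a280(absorbance, epsilon, path_cm=1.0, dilution=1.0):
    """Concentration from absorbance at 280 nm via the Beer-Lambert law.

    The result is expressed in the reciprocal concentration unit of
    `epsilon` (for example mg/mL if epsilon is in (mg/mL)^-1 cm^-1).
    ASSUMPTIONS: path length defaults to 1 cm; `dilution` is the fold
    dilution applied before the reading (1.0 means undiluted).
    """
    if epsilon <= 0 or path_cm <= 0:
        raise ValueError("epsilon and path length must be positive")
    return absorbance * dilution / (epsilon * path_cm)


if __name__ == "__main__":
    # Hypothetical reading of 0.51 for the mature enzyme (coefficient 1.091).
    print(concentration_from_a280(0.51, 1.091))
```

The unit of the result depends entirely on how the coefficient is interpreted, so the coefficient's unit should be confirmed against the ProtParam output before the value is used.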

2.2.6 Optimum pH of activity and stability of PMII mature enzymes

Optimum pH values for activity and stability were determined for each of the three mature enzymes (i.e., +2 PMII mature, +12 PMII mature and NoProPMII mature) in order to identify the optimum conditions for subsequent structural and enzymatic activity studies. Optimum proteolytic activity for each mature PMII was determined by conducting an activity assay at the following pH values: 2.2, 3.0, 4.0, 5.0, and 6.0. A synthetic peptide substrate EDANS-CO-CH2-CH2-CO-Ala-Leu-Glu-Arg-Met-Phe-Leu-Ser-Phe-Pro-Dap-(DABCYL)-OH (2837b) was used following the method of Istvan and Goldberg (2005). Fluorescence was measured using a Victor2 1420 Multilabel Counter (Perkin Elmer, Woodbridge, ON, Canada) with excitation at 335 nm and emission at 535 nm. The reaction rates were determined by calculating the slope of the linear portion of the curve (fluorescence/min) (Xiao et al., 2006).
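The slope calculation for the linear portion of a progress curve can be sketched as an ordinary least-squares fit over a chosen window of time points. The window indices and the example readings below are hypothetical; how the linear portion was selected in the original work is not stated.

```python
def initial_rate(times_min, fluorescence, start=0, stop=None):
    """Least-squares slope (fluorescence units per min) over points [start:stop].

    ASSUMPTION: the caller chooses the window that covers the linear
    portion of the progress curve.
    """
    t = list(times_min[start:stop])
    f = list(fluorescence[start:stop])
    n = len(t)
    if n < 2 or n != len(f):
        raise ValueError("need at least two matched time/fluorescence points")
    t_mean = sum(t) / n
    f_mean = sum(f) / n
    sxx = sum((ti - t_mean) ** 2 for ti in t)
    sxy = sum((ti - t_mean) * (fi - f_mean) for ti, fi in zip(t, f))
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    return sxy / sxx


if __name__ == "__main__":
    # Hypothetical progress curve: linear for the first four readings, then levelling off.
    times = [0, 1, 2, 3, 4, 5]
    signal = [100, 150, 200, 250, 270, 275]
    print(initial_rate(times, signal, stop=4))
```

Restricting the fit to the early readings avoids underestimating the rate once substrate depletion flattens the curve.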

Optimum pH stability was determined for all three mature enzymes by incubating the samples at 37°C for 1 hour at the following pH values: 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, and 10.0. Following incubation, each sample was assayed for activity using the synthetic substrate (2837b) at pH 4.8.

2.2.7 Structural characteristic analysis of PMII mature enzymes

2.2.7.1 Far-UV circular dichroism spectroscopy

Far-UV circular dichroism (CD) spectra were determined using a Jasco J-810 spectropolarimeter (Jasco Inc., Tokyo, Japan) with a 100 µL quartz cuvette and 0.1 cm pathlength. The far-UV spectrum from 250 nm to 185 nm was measured at room temperature using a protein concentration of 0.2 mg/mL in 0.1 M phosphate buffer, pH 7.0 (i.e., pH where the sample displayed the highest conformation; Xiao et al., 2011). Dichroweb, an online CD analysis tool, was used to analyse the results (http://www.cryst.bbk.ac.uk/cdweb/html/home.html).

2.2.7.2 Intrinsic fluorescence spectroscopy

Fluorescence spectra were recorded using a Shimadzu RF-540 spectrofluorometer (Shimadzu Corporation, Kyoto, Japan) using a 1 cm quartz cell at room temperature, excitation at 295 nm and an emission scan from 305 nm to 450 nm. Excitation and emission slit-widths of 3 nm and 5 nm were used, respectively. A sample concentration of 0.05 mg/mL in 0.1 M phosphate buffer, pH 7.0 was used.

2.2.8 Thermostability analysis using differential scanning calorimetry

Calorimetry experiments were carried out with a MicroCal VP-DSC (MicroCal, Northampton, MA) using a scan rate of 90°C/h over a temperature range of 10 to 90°C, a 30 s filtering period (time delay), and a passive feedback response. A protein concentration of 0.5 mg/mL in 0.1 M phosphate buffer, pH 7.0 was used. Data reduction and analysis were performed using MicroCal Origin v7.0 (OriginLab, Northampton, MA).

2.2.9 Kinetic parameters of PMII mature enzymes

Kinetic parameters were determined using the synthetic substrate 2837b. The assay was conducted using substrate concentrations of 0.2, 0.4, 0.6, 0.8, 1.0, 1.5, 2.0, 3.0, 4.0 and 5.0 µM added to the enzyme (3 nM for +2 PMII mature and +12 PMII mature and 30 nM for NoProPMII mature) in 0.1 M sodium acetate, pH 4.8 to a final volume of 200 µL. The concentration difference between +2 and +12 PMII mature and NoProPMII mature was due to low measured activity in NoProPMII, which thus required a larger amount of enzyme in order to conduct comparable activity measurements. The measured fluorescence was converted to moles per second using a conversion factor derived from a standard curve for the complete digestion of the substrate by S. cerevisiae proteinase A (Sigma-Aldrich, Oakville, ON, Canada) following the method of Xiao et al. (2006). Non-linear regression analysis using the Michaelis-Menten model was used to determine both Km and kcat (Parr-Vasquez and Yada, 2010).
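A non-linear least-squares fit of the Michaelis-Menten model, v = Vmax·[S] / (Km + [S]), can be sketched with the standard library alone. The approach below is one possible implementation and not necessarily the algorithm used by the software in the original work: for any trial Km the best Vmax has a closed form, so the fit reduces to a one-dimensional golden-section search over log10(Km). The search bounds, the assumption that the error surface has a single minimum within them, and the synthetic data are all illustrative.

```python
import math


def vmax_for_km(s, v, km):
    """Closed-form least-squares Vmax for a fixed Km."""
    x = [si / (km + si) for si in s]
    return sum(vi * xi for vi, xi in zip(v, x)) / sum(xi * xi for xi in x)


def sse(s, v, km):
    """Sum of squared residuals at a given Km, using the best Vmax for that Km."""
    vmax = vmax_for_km(s, v, km)
    return sum((vi - vmax * si / (km + si)) ** 2 for si, vi in zip(s, v))


def fit_michaelis_menten(s, v, km_lo=1e-3, km_hi=1e3, tol=1e-9):
    """Return (Km, Vmax) minimizing squared error via golden-section search.

    ASSUMPTION: the error is unimodal in log10(Km) between km_lo and km_hi.
    """
    phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = math.log10(km_lo), math.log10(km_hi)
    while abs(b - a) > tol:
        c = b - phi * (b - a)
        d = a + phi * (b - a)
        if sse(s, v, 10.0 ** c) < sse(s, v, 10.0 ** d):
            b = d
        else:
            a = c
    km = 10.0 ** ((a + b) / 2.0)
    return km, vmax_for_km(s, v, km)


def kcat(vmax, enzyme_conc):
    """Turnover number; vmax and enzyme_conc must share concentration units."""
    return vmax / enzyme_conc


if __name__ == "__main__":
    # Substrate concentrations from the text (µM); rates are synthetic,
    # generated from hypothetical Km = 1.5 µM and Vmax = 2.0 µM/s.
    substrate = [0.2, 0.4, 0.6, 0.8, 1.0, 1.5, 2.0, 3.0, 4.0, 5.0]
    rates = [2.0 * si / (1.5 + si) for si in substrate]
    km_fit, vmax_fit = fit_michaelis_menten(substrate, rates)
    print(km_fit, vmax_fit, kcat(vmax_fit, 0.003))  # 3 nM enzyme = 0.003 µM
```

Fitting the untransformed rates avoids the error distortion introduced by linearized plots such as Lineweaver-Burk, which is the usual reason for preferring non-linear regression.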

2.2.10 Active enzyme comparison of PMII mature enzymes

A pepstatin-A inhibition study was conducted in order to quantify the amount of active enzyme obtained from each construct (Yonezawa et al., 1999). Pepstatin-A concentrations of 0.1, 0.2, 0.3, 0.4, 0.5, 0.75, 1.0, 2.0 and 5.0 nM (0.025, 0.05, 0.075, 0.1, 0.15, 0.25, 0.5, 1.0, 2.0 and 5.0 nM for NoProPMII) were added to 3 nM enzymes (30 nM for NoProPMII) and incubated for 5 min prior to addition of 2 µM fluorescent synthetic substrate 2837b. Due to low activity levels, the concentrations of protein and pepstatin-A were different for the NoProPMII mature sample. Enzymatic activity was determined using the same procedure that was described in Section 2.2.6. Reaction rate slopes were normalized against slopes of activity in the absence of pepstatin-A. The initial reaction rates were determined from linear fits and were used to calculate the amount of active enzyme obtained from each construct.
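One common way to turn such a titration into an amount of active enzyme is to fit a straight line to the descending part of the normalized activity curve and read off its x-intercept. The sketch below assumes stoichiometric (1:1) tight binding of pepstatin-A, so that the intercept equals the active enzyme concentration. This interpretation, the activity cut-off used to select the linear region, and the example values are assumptions for illustration rather than the procedure documented above.

```python
def linear_fit(x, y):
    """Ordinary least-squares slope and intercept."""
    n = len(x)
    if n < 2 or n != len(y):
        raise ValueError("need at least two matched points")
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, y_mean - slope * x_mean


def active_enzyme_nm(inhibitor_nm, relative_activity, floor=0.1):
    """x-intercept of the linear titration region (nM of active enzyme).

    ASSUMPTIONS: 1:1 tight binding; points with relative activity below
    `floor` are treated as the plateau and excluded from the fit.
    """
    pairs = [(i, a) for i, a in zip(inhibitor_nm, relative_activity) if a >= floor]
    slope, intercept = linear_fit([p[0] for p in pairs], [p[1] for p in pairs])
    if slope >= 0:
        raise ValueError("activity does not decrease with inhibitor")
    return -intercept / slope


if __name__ == "__main__":
    # Hypothetical titration: activity normalized to the no-inhibitor control.
    inhibitor = [0.0, 0.5, 1.0, 1.5, 2.0, 5.0]
    activity = [1.0, 0.75, 0.5, 0.25, 0.0, 0.0]
    active = active_enzyme_nm(inhibitor, activity)
    print(active, active / 3.0)  # fraction active, assuming 3 nM total enzyme
```

Dividing the intercept by the total enzyme concentration in the assay gives the active fraction, which allows constructs assayed at different enzyme concentrations to be compared.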

2.2.11 Structural prediction of +2 PMII and +12 PMII mature enzymes

The predicted structures of +2 PMII mature and +12 PMII mature were determined based on a homology modelling technique using the I-TASSER server. The server is fully automated and can be used by any registered user. Moreover, this server was the highest ranked server for protein structure prediction in CASP7 and CASP8 experiments (Roy et al., 2010). The sequences of the two matures were uploaded to the server in FASTA format (i.e., a text-based format for representing peptide sequences) and the final results were accessed via a link delivered by email. The predicted structure was made by retrieving templates of proteins with similar folds from the Protein Data Bank library, and the selected templates were assembled into a full-length model by optimizing the H-bond network and removing any steric overlaps. Following that procedure, structures were refined by reassembling and clustering the possible templates (Roy et al., 2010).

2.2.12 Statistical Analysis

Statistical analyses were conducted using GraphPad Prism software (Motulsky, H.J. 2003; www.graphpad.com). All experiments were independently repeated three times with two determinations per replicate. Results were compared using one-way ANOVAs and means were separated using Tukey’s test. Significantly different means (p < 0.05) are denoted by different letters.
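For readers unfamiliar with the comparison, the F statistic of a one-way ANOVA can be computed by hand as the ratio of between-group to within-group mean squares. The sketch below uses only the standard library and hypothetical values; it does not reproduce GraphPad Prism's output, and it omits both the p-value (which requires the F distribution) and Tukey's post-hoc test.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of observations."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or n_total <= k:
        raise ValueError("need at least two groups and more observations than groups")
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Between-group sum of squares: group means around the grand mean.
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: observations around their own group mean.
    ss_within = sum(sum((x - sum(g) / len(g)) ** 2 for x in g) for g in groups)
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


if __name__ == "__main__":
    # Hypothetical triplicate measurements for three enzyme preparations.
    data = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
    print(one_way_anova_f(data))
```

A large F relative to the critical value for the given degrees of freedom indicates that at least one group mean differs, after which a post-hoc test such as Tukey's identifies which pairs differ.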

2.3 Results and Discussion

2.3.1 Construction and soluble expression of PMII constructs

The general scheme for generating the three different PMII constructs is illustrated in Figure 2.1. The first construct, extended PSPMII, contained the longest PS (60 residues), which covered the entire PMII sequence up to the trans-membrane domain. The second construct, truncated PSPMII, contained a 48 residue PS that was based on the archetypal number of PS residues that appears in most pepsin-like APs in the A1 family (Hill et al., 1994). The third construct, NoProPMII, was the control and did not have a PS.

[Figure 2.1 schematic. Panel (A): Trx, Tags, tPS (60 aa) + Mature PMII (329 aa). Panel (B): PS (125 aa) and Native PMII (329 aa), with cut-off positions labelled Q90p, S78p and S1 for Construct 1 (Extended PSPMII, 60 PS), Construct 2 (Truncated PSPMII, 48 PS) and Construct 3 (NoProPMII, 0 PS).]

Figure 2.1: Schematic diagram for generating the three different PMII constructs. (A) General template for generating the constructs. All three constructs were prepared by attaching the target protein to the thioredoxin and tag proteins. (B) Schematic diagram of zymogen PMII and the cut-off limit for all three constructs involved in the experiment.

Overexpression of the target protein was accomplished using the combination of pET32b(+) plasmid vector expressed into E. coli Rosetta-gami™ B (DE3) pLysS cells. This system was selected based on the bacterial strain compatibility and features, which enhanced the folding and solubility of the target protein (Novagen, 2006). Expressing the vector in the selected bacterial strain greatly enhanced disulfide bond formation within the E. coli cytoplasm and thus prevented protein degradation or compartmentalization in insoluble inclusion bodies. The presence of 6 His tag residues also helped minimize the number of purification steps (Derman et al., 1993). Furthermore, the bacterial strain used had been transformed with a vector (pLysS) that encoded rare tRNAs, lysozyme and two reductase mutations. These features were very important in preventing protein truncation, which is likely to occur when proteins originating from species other than E. coli are expressed (Prinz et al., 1997). The target protein was overexpressed through a 109-aa Trx•Tag™ thioredoxin protein fusion to form a Trx-tPMII, which increased the solubility of the expressed protein. The complete pET32b vector sequences of all constructs are shown in Figures 2.2, 2.3 and 2.4.

Figure 2.2: Complete sequence of extended PSPMII construct in pET32b(+) plasmid. The extended PMII zymogen was cloned into a pET32b(+) expression vector using the NcoI and XhoI restriction enzyme cut sites. Targeted region and restriction sites are highlighted in different colored boxes: His-Tag (red); Enterokinase (blue); NcoI cut sites (purple); Extended PS (orange); Activated mature PMII (green). Construct was sequenced using an ABI Prism DNA sequencer and translation of the nucleotide sequence to a protein sequence was done using Translate (ExPASy: http://www.expasy.org).

Figure 2.3: Complete sequence of truncated PSPMII construct in pET32b(+) plasmid. The truncated PMII zymogen was cloned into pET32b(+) expression vector using the NcoI and HindIII restriction enzyme cut sites (Parr-Vasquez and Yada, 2010). Targeted region and restriction sites are highlighted in different colored boxes: His-Tag (red); Enterokinase (blue); NcoI cut sites (purple); Extended PS (orange); Activated mature PMII (green). Construct was sequenced using an ABI Prism DNA sequencer and translation of the nucleotide sequence to a protein sequence was done using Translate (ExPASy: http://www.expasy.org).

Figure 2.4: Complete sequence of NoProPMII construct in pET32b(+) plasmid. The wild-type PMII zymogen was cloned into pET32b(+) expression vector using the NcoI and HindIII restriction enzyme cut sites (Parr-Vasquez and Yada, 2010). Targeted region and restriction sites are highlighted in different colored boxes: His-Tag (red); Enterokinase (blue); NcoI cut sites (purple); Extended PS (orange); Activated mature PMII (green). Construct was sequenced using an ABI Prism DNA sequencer and translation of the nucleotide sequence to a protein sequence was done using Translate (ExPASy: http://www.expasy.org).

2.3.2 Purification of PMII constructs

Purification protocol differences existed between constructs with a PS and the NoProPMII due to different activation procedures applied to each. Constructs with a PS (i.e., extended and truncated PSPMII) were prepared as a zymogen and later underwent acidic autoactivation. Alternatively, NoProPMII was designed so that it could produce mature enzyme directly after thrombin digestion.

Following protein expression, cells from each construct were lysed and centrifuged in order to obtain the crude fusion protein, which was applied to a nickel affinity column (see Section 2.2.4). At the end of this column, the majority of contaminants had been removed. Results from the partially purified fusion protein from all three constructs are shown in Figures 2.5A, 2.7A, and 2.9A.

For constructs with a PS (i.e., extended and truncated PSPMII), the crude fusion protein was concentrated and the protein buffer was changed to an rEK buffer via size exclusion centrifugation (Section 2.2.4.1). The fusion protein was digested overnight with rEK enzyme and then applied to a Mono Q 10/100 GL anion exchanger column to yield pure zymogen. These results are shown in Figures 2.5B and 2.7B. The peak containing pure zymogen was collected, concentrated, and activated through acidification and the sample was applied to a Superose™ 12 10/300 GL gel filtration column to obtain pure mature (Figures 2.5C, 2.7C).

Meanwhile, the NoProPMII construct was purified directly after thrombin digestion to recover pure enzyme (Section 2.2.4.2). The sample was then applied to a UNO Q anion exchanger column to yield partially purified mature enzyme (Figure 2.9B). Contaminants were removed when the sample was reapplied to a Superose™ 12 10/300 GL gel filtration column and the targeted protein was collected (Figure 2.9C).

For verification purposes, approximately 5 µg of sample from each purification result was applied to a 15% SDS-PAGE gel and stained with Coomassie Brilliant Blue. The purified samples are shown in Figures 2.6, 2.8 and 2.10.

UNICORN 5.01 (Build 318) Result file: C:\...\default\Ahmad\110811 long mature stage A HS

110811 long mature stage A HS:10_UV 110811 long mature stage A HS:10_Conc

mAU

2500 A

2000

1500

1000

500

UNICORN 5.01 (Build 318) Result file: C:\...\default\Ahmad\150811 Long Zymo mono Q

0 150811 Long Zymo mono Q:10_UV 150811 Long Zymo mono Q:10_Conc 150811 Long Zymo mono Q:10_Fractions 0.0 10.0 20.0 30.0 40.0 50.0 ml mAU

1000 B

800

600

280 A

400

200

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 Waste 5.0 10.0 15.0 20.0 25.0 30.0 35.0 ml C

Volume (mL)

Figure 2.5: Purification results for the extended PSPMII construct (A) Nickel affinity was performed on crude protein sample after cell lysis. Elution was recovered using 50 mM NaPO4/250 mM imidazole, pH 7.5; (B) Mono Q anion exchanger purification on extended zymogen of PMII. To purify the extended PMII zymogen, 5 mL of concentrated post nickel affinity sample were applied to mono Q anion exchanger column. Separation was achieved using the gradient of 20% 20 mM Tris/HCl/1 M NaCl, pH 7.5 applied to the column over eight column volumes; (C) Post activation size exclusion chromatography. After activation, +2 PMII mature was purified using a Superose™ 12 10/300 GL column with 0.5 mL concentrated sample applied to the column with flow rate maintained at 0.5 mL/min and separation was achieved in 50 mM NaPO4, pH 7.5. All purification protocols were monitored at 280 nm.

[Figure 2.6: SDS-PAGE gel image, lanes A to E; marker positions labelled 50, 40 and 20.]

Figure 2.6: SDS-PAGE purification results for the extended PSPMII construct (A) Protein marker; (B) Crude protein harvested from E. coli Rosetta gami™ B (DE3) pLysS cells; (C) Fusion protein purified using nickel affinity chromatography; (D) +2 PMII mature enzyme; (E) Extended PMII zymogen. Approximately 5 µg sample was applied in each well and the gel was stained with Coomassie Brilliant Blue.

[Figure 2.7, panels A to C: chromatograms of absorbance at 280 nm (mAU) versus volume (mL).]

Figure 2.7: Purification results for the truncated PSPMII construct (A) Nickel affinity was performed on crude protein sample after cell lysis. Elution was recovered using 50 mM NaPO4/250 mM imidazole, pH 7.5; (B) Mono Q anion exchanger purification on truncated zymogen of PMII. To purify the truncated PMII zymogen, 5 mL of concentrated post nickel affinity sample were applied to mono Q anion exchanger column. Separation was achieved using a gradient of 20% 20 mM Tris/HCl/1 M NaCl, pH 7.5 applied to the column over eight column volumes; (C) Post activation size exclusion chromatography. After activation, +12 PMII mature was purified using a Superose™ 12 10/300 GL column with 0.5 mL concentrated sample applied to the column with flow rate maintained at 0.5 mL/min and separation was achieved in 50 mM NaPO4, pH 7.5. All purification protocols were monitored at 280 nm.

[Figure 2.8: SDS-PAGE gel image, lanes A to E; marker positions labelled 50, 40 and 20.]

Figure 2.8: SDS-PAGE purification results for the truncated PSPMII construct (A) Protein marker; (B) Crude protein harvested from E. coli Rosetta gami™ B (DE3) pLysS cells; (C) Fusion protein purified using nickel affinity chromatography; (D) +12 PMII mature enzyme; (E) Truncated PMII zymogen. Approximately 5 µg sample was applied in each well and the gel was stained with Coomassie Brilliant Blue.

[Figure 2.9, panels A to C: chromatograms of absorbance at 280 nm (mAU) versus volume (mL).]

Figure 2.9: Purification results for the NoProPMII construct (A) Nickel affinity was performed on crude protein sample after cell lysis. Elution was recovered using 50 mM NaPO4/250 mM imidazole, pH 7.5; (B) UNO Q anion exchanger purification after thrombin treatment. To purify the NoProPMII, 5 mL of concentrated post nickel affinity sample were applied to UNO-Q1 anion exchanger column. Separation was achieved using the gradient of 20% 20 mM Tris/HCl/1 M NaCl, pH 7.5 applied to the column over eight column volumes; (C) Post activation size exclusion chromatography. After thrombin activation, NoProPMII mature was purified using a Superose™ 12 10/300 GL column with 0.5 mL concentrated sample applied to the column with flow rate maintained at 0.5 mL/min and separation was achieved in 50 mM NaPO4, pH 7.5. All purification protocols were monitored at 280 nm.

[Figure 2.10: SDS-PAGE gel image; marker positions labelled 50, 40 and 20.]

Figure 2.10: SDS-PAGE purification results for the NoProPMII construct (A) Protein marker; (B) Crude protein harvested from E. coli Rosetta gami cells; (C) Fusion protein purified using nickel affinity chromatography; (D) Crude sample after UNO-Q1 ion exchanger; (E) NoProPMII mature enzyme. Approximately 5 µg of sample was applied in each well and the gel was stained using Coomassie Brilliant Blue.

The protein purification steps were specifically designed to obtain the highest quality and yield of the final PMII mature enzyme. Thioredoxin protein fused with the targeted protein provided an advantage during the purification process as the majority of the protein contaminants were removed easily during the first purification step. In purifying the extended and truncated PS constructs, we discovered that it was better to purify the enzyme in its zymogenic form, which is similar to the in vivo production of PMII (Francis et al., 1997). In order to obtain the mature enzyme, the zymogen was activated by acidification and finally purified using a gel filtration column. By following these steps, consistency in the quality of the final mature samples was increased and the sample could be stored for a longer period in its zymogenic form, which was previously shown to be more stable than the mature enzyme (Xiao et al., 2011).

2.3.3 Activation to +2 PMII, +12 PMII, and NoProPMII mature enzymes

As previously stated, the extended and truncated PSPMII constructs were purified in their zymogen forms before undergoing autoactivation under acidic conditions to produce mature PMII. The theoretical molecular weights of the extended and truncated PMII zymogens are 43,260 and 41,774 Da, respectively, as calculated using ProtParam (http://web.expasy.org). SDS-PAGE could not resolve the two zymogens, both of which migrated at approximately 40 kDa (Figures 2.6E and 2.8E). However, N-terminal sequencing confirmed that these two samples were different, as the first six residues of the extended and truncated zymogens were determined to be A-M-A-*Q-R-D and A-M-A-I-*S-E, respectively. The first three or four residues in the sequence were derived from the plasmid and the authentic sequence for extended and truncated zymogen begins with the sequence Q-R-D and S-E, respectively. These data were consistent with the complete plasmid sequences for the truncated and extended constructs as shown in Figures 2.2 and 2.3.
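The theoretical molecular weights quoted for the zymogens are, in essence, the sum of the average residue masses of the chain plus one water molecule. The arithmetic can be sketched as follows; the function name `protein_mass` and the short example peptide are illustrative assumptions, and the full construct sequences (Figures 2.2 and 2.3) are not reproduced here.

```python
# Average residue masses in Da (free amino acid minus one water).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.01528  # added once per chain for the free N- and C-termini


def protein_mass(sequence: str) -> float:
    """Return the average molecular mass (Da) of an unmodified peptide chain."""
    seq = sequence.strip().upper()
    if not seq:
        raise ValueError("sequence is empty")
    unknown = sorted(set(seq) - set(RESIDUE_MASS))
    if unknown:
        raise ValueError(f"unsupported residue codes: {unknown}")
    return sum(RESIDUE_MASS[res] for res in seq) + WATER_MASS


# Illustrative input only: the six N-terminal residues reported for the
# extended zymogen, not the full construct sequence.
print(f"{protein_mass('AMAQRD'):.1f} Da")
```

Applied to a complete construct sequence, the same sum should reproduce a ProtParam-style value, provided the chain carries no modifications.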

An interesting result was discovered during autoactivation of the zymogens. As observed from the N-terminal sequencing result, autoactivation of the extended PMII zymogen resulted in a mature enzyme with an extra two amino acids (i.e., +2 PMII mature; L-G-*S-S-N-D-) as compared to the native enzyme isolated from the malarial parasite (Hill et al., 1994). Autoactivation of truncated PMII zymogen, on the other hand, produced a mature enzyme with an extra 12 amino acid residues (i.e., +12 PMII mature; L-N-S-G-L-) upstream from the native cleavage site. Both samples produced a homogeneous mature enzyme as confirmed by N-terminal sequence analysis (Appendices 3 and 4).

The occurrence of two activation sites (+12 and +2) in PMII was first reported by Tyas and colleagues in 1999. They discovered that the autoactivation of truncated PMII zymogen resulted in two overlapping sequences. The major sequence (62%) corresponded to +2 activated mature, and the minor sequence (38%) resembled the +12 activated mature. Gulnik and coworkers (2002) reported a similar observation during activation of truncated PMII zymogen. In addition, they conducted several mutation studies to see if the homogeneity of +2 activated PMII mature could be increased. The activation of truncated PMII zymogen produced a 50/50 distribution of +2 and +12 mature enzyme but when 10 PS residues (112p-121p) were deleted, they were able to produce a more homogeneous (but not 100%) +2 PMII mature (Gulnik et al., 2002). A homogeneous sample of mature enzyme that closely resembled native enzyme was desired in both experiments since heterogeneous mixtures (+2 and +12 PMII mature) are less suitable for crystallization trials (Gulnik et al., 2002). The approach taken in those studies, expressing the targeted protein as insoluble aggregates or inclusion bodies, might have hindered the overall re-folding and activation of PMII. Meanwhile, the soluble protein expression system used in this experiment, which directly folded the targeted protein within the E. coli host, was more similar to normal cellular conditions and produced homogeneous mature enzymes. +2 PMII mature was expected to be more active than +12 PMII mature because of its greater similarity to the native form of PMII.

NoProPMII underwent thrombin digestion to remove the thioredoxin fusion protein and thereby release the NoProPMII mature enzyme. Theoretically, the molecular weight of this enzyme should be 36,277 Da as calculated using the ProtParam tool (http://web.expasy.org), and the SDS-PAGE gels showed a band slightly below 40 kDa (Figure 2.10E). The N-terminal sequence of the NoProPMII mature was determined to be G-S-A-M-G-*S-S. Since the construct was inserted into a modified vector (Parr-Vasquez and Yada, 2010), the first five residues (G-S-A-M-G) corresponded to the thrombin and NcoI restriction site and the original sequence of PMII began with -S-S-, which was similar to the native cleavage site for PMII (Hill et al., 1994). N-terminal sequencing results are summarized in Table 2.1. Detailed information regarding the N-terminal sequencing results is shown in Appendices 1-5.

Table 2.1: N-terminal sequencing of PMII constructs Approximately 15-20 µg samples were electroblotted onto a polyvinylidene difluoride (PVDF) membrane and the samples were analyzed by Edman Sequencing.

SAMPLES                      N-TERMINAL RESULTS
Extended PSPMII zymogen      A-M-A-*Q-(unidentified residues)-D-
+2 PMII mature               *L-G-S-S-N-D-
Truncated PSPMII zymogen     A-M-A-I-*S-D-
+12 PMII mature              *L-N-S-G-L-
NoProPMII mature             G-S-A-M-G-*S-S

*indicates the beginning of the original sequence

2.3.4 Optimum pH of activity and stability of PMII mature enzymes

Optimum pH of activity and stability analyses were conducted to determine the optimum conditions for the structural and enzymatic activity studies. To establish the empirical optimum pH of activity, a quenched fluorescent substrate (2837b) was used and the enzymatic activity was measured over a pH range. As seen in Figure 2.11, two peaks were evident at pH 3.0 and 5.0. The dominant peak, however, was detected at pH 5.0, which is similar to previous PMII findings (Istvan and Goldberg, 2005). In food vacuoles where hemoglobin is degraded by the mature PMII, the pH is 5.0 (Banerjee et al., 2003). The synthetic fluorescent substrate used in this experiment was originally designed based on the primary cleavage site and peptide length corresponding to the residues within the α-chain of hemoglobin (Istvan and Goldberg, 2005) and hemoglobin digestion is known to be more favorable at low pH within the digestion vacuole (Chu and Ackers, 1991). Therefore, the occurrence of a second optimum pH at 3.0 may be due to the minor cleavage of the substrate, which favors hemoglobin hydrolysis.

[Figure 2.11 plot: relative activity (%) versus pH for +2 PMII mature, +12 PMII mature and NoProPMII mature.]

Figure 2.11: Optimum pH activity of PMII mature enzymes The activity of the three PMII mature enzymes at various pH values was measured using quenched fluorescent substrate 2837b, at 37°C. The relative activity at each pH was calculated individually for each sample as a percentage of that sample's highest activity. Error bars indicate the SD of n = 6 (3 replicates and 2 determinations per replicate).
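The normalization described in the caption (each sample scaled to its own highest activity) can be sketched as below. The function name `relative_activity` and the readings in `example` are hypothetical placeholders, not measured values from this study.

```python
def relative_activity(raw):
    """Scale raw activities (dict of pH -> activity) to percent of the maximum."""
    peak = max(raw.values())
    if peak <= 0:
        raise ValueError("highest activity must be positive")
    return {ph: 100.0 * value / peak for ph, value in raw.items()}


# Hypothetical raw readings (arbitrary fluorescence units) for one sample.
example = {3.0: 42.0, 4.0: 30.0, 5.0: 60.0, 6.0: 15.0}
profile = relative_activity(example)

for ph in sorted(profile):
    print(f"pH {ph:.1f}: {profile[ph]:.0f}%")
```

Because each enzyme is normalized separately, the curves compare the shape of the pH profile between samples and not their absolute activities.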

A similar result (i.e., two pH optima) was seen by Xiao et al. (2007) in the activity of PMI using the same synthetic fluorescent substrate; they reported two optimum pH ranges, at 4.5-5.5 and 2.5-3.0, with the dominant peak measured at 4.5-5.5. The authors further hydrolyzed human hemoglobin using PMI and discovered that higher hemoglobin hydrolysis occurred at a lower pH (pH 2.8) as compared to a higher pH (5.0). These findings suggest that PMI and PMII may share similar proteolytic characteristics.

The pH stability of PMII was determined by incubating the mature PMII at various pH values (pH 3-10) for 1 hour, at 37°C. Following incubation, activity was measured at pH 4.8 using a proteolytic assay (Section 2.2.6). All three mature enzymes showed a similar pattern of pH stability (Figure 2.12). In general, samples were most stable at a neutral pH and they maintained more than 60% of their activity between pH 6 and pH 8. It is also apparent from Figure 2.12 that the mature enzymes were more tolerant in acidic than alkaline environments.

In addition, PMII, like other AP digestion enzymes such as pepsin and cathepsin D, is known to be more tolerant under acidic conditions due to its high proportion of negatively charged residues (Lee et al., 1998; Mohanty et al., 1999; Fruton, 2002), which remain largely protonated at low pH values, thereby preventing repulsive forces that would lead to protein unfolding. PMII has approximately twice as many negatively charged residues as positive residues (41% negative and 23% positive residues). Stability studies of PMII as a function of pH showed that PMII was able to retain its structural conformation in acidic but not alkaline environments. At pH values greater than 8.0, structural changes occurred that led to activity loss (Xiao et al., 2011). In agreement with the stability study conducted by Xiao and coworkers (2011), all mature enzymes showed a more drastic drop in their activity at alkaline pH values as compared to acidic conditions.

[Figure 2.12 plot: relative activity (%) versus pH for +2 PMII mature, +12 PMII mature and NoProPMII mature.]

Figure 2.12: Stability of PMII mature enzymes as a function of pH The stabilities of multiple PMII mature enzymes were determined by monitoring synthetic peptide 2837b at pH 4.8 after incubating the enzyme at various pHs for 1 h at 37°C. The relative activity at the corresponding pH was calculated individually in each sample based on the percentage of the highest activity. Error bars indicate the SD of n = 6 (3 replicates and 2 determinations per replicate).

2.3.5 Structural characteristics of PMII mature enzymes

2.3.5.1 Far-UV circular dichroism spectroscopy

Circular dichroism (CD) analysis was conducted to monitor and compare the development of the secondary structures in +2 PMII mature, +12 PMII mature and NoProPMII mature. As shown in Figure 2.13, mature samples obtained from constructs with extra residues had similar spectra. This was reflected in their secondary structure fractions where the predicted secondary structures of these two mature enzymes (Table 2.2) were not significantly different (p > 0.05). The secondary structure fractions of the +2 PMII mature were ≈12% α-helix, ≈42% β-sheet and ≈45% turn and random while the secondary structure of the +12 PMII mature was ≈13% α-helix, ≈43% β-sheet and ≈43% turn and random. The spectrum of NoProPMII mature was similar to the spectra of other constructs, but had a lower amplitude in the 190-195 nm range that was reflected in higher proportions of random structure (Figure 2.13 and Table 2.2). In general, the secondary structure fractions of the mature enzymes consisted of high β-sheet and low α-helix distribution, which was consistent with other APs and the crystal structure of PMII (PDB file 1lf4; Robbins et al., 2009).

[Figure 2.13 plot: CD signal (millidegrees, machine units) versus wavelength (nm), 190 to 260 nm, for +2 PMII mature, +12 PMII mature and NoProPMII mature.]

Figure 2.13: CD spectrum of PMII mature enzymes CD scans were determined using 0.2 mg/mL sample in 0.1 M phosphate buffer, pH 7.0. The spectra of six scans (3 replicates and 2 determinations per replicate) were averaged.

Table 2.2: Secondary structure distribution of PMII matures Data from CD scans were analysed via Dichroweb using analysis programs (Selcon3 and CDSSTR) using the average of six scans (3 replicates and 2 determinations per replicate) to determine the secondary structure.

Samples                     α-helix (%)     β-Sheet (%)     Random & Turns (%)
+2 PMII mature              12.2 ± 0.3 a    42.9 ± 1.3 a    45.9 ± 0.2 b
+12 PMII mature             13.2 ± 0.6 a    43.9 ± 0.3 a    42.8 ± 0.5 b
NoProPMII mature             9.8 ± 0.7 b    37.4 ± 0.8 b    52.8 ± 1.5 a
Crystal Structure (1SME)    13.0            45.0            42.0

Data are expressed as mean ± SD of six scans. Values in the same column having the same letter are not significantly different at p > 0.05

2.3.5.2 Intrinsic fluorescence spectroscopy

The fluorescent signal of a folded protein arises from its aromatic residues and is mainly due to the excitation of tryptophan residues. The fluorescent signal from tryptophan (Trp) is very sensitive to the polarity of its local environment and serves as a probe for conformational change of the protein as a whole (Vivian and Callis, 2001). Mature PMII contains three Trp residues at positions Trp-41, Trp-128 and Trp-192. Two of the Trp residues (Trp-128 and Trp-192) are located adjacent to each other in the C-domain and Trp-41 is deeply embedded in the interior of the N-domain (Bernstein et al., 1999).

Figure 2.14 depicts the intrinsic tryptophan fluorescent spectra of the three PMII mature enzymes. The spectra for +2 PMII mature and +12 PMII mature were similar, with a maximum wavelength of 338 nm, although marginal differences in the intensity of the maxima (i.e., 60.8 and 56.8 fluorescent units, respectively) were observed; this suggests that these two mature enzymes shared similar local conformation. There were, however, substantial differences between the spectra of +2 and +12 PMII and that of NoProPMII, which had a maximum wavelength of 342.2 nm (a red shift) and exhibited a lower magnitude of intensity (i.e., 42.5 fluorescent units) (Table 2.3). This small red shift (i.e., ≈4 nm) may have resulted from a conformational change involving the surface Trps (Trp-192 and Trp-128) rather than Trp-41, which is buried in the hydrophobic core of the protein (Bernstein et al., 1999). In folded proteins, structural changes that involve buried tryptophan residues yield larger red shifts in the emission spectra (i.e., ≈10 to 20 nm; Rubinstein and Sherman, 2004; Vivian and Callis, 2001). Therefore, the results suggest that surface Trps may have been more exposed to the polar environment than Trp-41 in the NoProPMII enzyme as compared to the +2 and +12 samples.

[Figure 2.14 plot: intrinsic fluorescence versus emission wavelength (nm) for +2 PMII mature, +12 PMII mature and NoProPMII mature.]

Figure 2.14: Intrinsic protein (Trp) fluorescent signal of PMII mature enzymes Fluorescent spectra were determined using 0.05 mg/mL sample and the spectra were normalized with blank buffer of 0.1 M phosphate buffer, pH 7.0. The spectra of six scans (3 replicates and 2 determinations per replicate) were averaged.

Table 2.3: Intrinsic protein (Trp) fluorescent signal of PMII mature enzymes

Samples              Highest Intensity Wavelength (nm)    Magnitude of Intensity (F)
+2 PMII mature       338.0 ± 0.2 a                        60.8 ± 6.1 a
+12 PMII mature      338.0 ± 0.2 a                        56.8 ± 6.1 a
NoProPMII mature     342.2 ± 0.6 b                        42.5 ± 5.1 b

Data are expressed as mean ± SD of 6 scans (3 replicates and 2 determinations per replicate). Values in the same column having the same letter are not significantly different at p > 0.05

The above interpretation, however, must be taken with caution given that intrinsic fluorescence spectra represent the local areas adjacent to tryptophan in the protein rather than overall changes in conformation (Eftink, 2000).

2.3.6 Differential scanning calorimetry of PMII mature enzymes

Differential scanning calorimetry (DSC) analysis was conducted to measure the thermal stability of the three PMII mature enzymes. The experiment was conducted based on the fact that protein stability, which is associated with appropriate folding conformation, is critical for biological processes (Chiu and Prenner, 2011). +2 PMII mature and +12 PMII mature displayed two melting transition (Tm) points (peaks P1 and P2). Each sample was shown to have similar Tms (P1 and P2, ≈55°C and 69°C respectively; p > 0.05), which may represent the separate melting events of the N- and C-domains. Two Tms were also observed by Privalov et al. (1981) in their examination of pepsin, and were attributed to the individual melting of N- and C-domains. Conversely, NoProPMII mature exhibited one Tm at peak P1. Based on thermograms obtained for NoProPMII mature, which showed only one small peak at P1, it is suggested that this mature exhibited a different conformation that is less stable when compared to the other two matures. Therefore, it is speculated that the presence of the PS played a critical function in the proper folding of PMII. DSC profiles of the PMII matures are shown in Figure 2.15.

[Figure 2.15, panels A to C: thermograms of Cp (kcal/mole/°C) versus temperature (°C).]

Figure 2.15: DSC profiles of PMII mature enzymes (A) Thermogram of +12 PMII mature. (B) Thermogram of NoProPMII mature. (C) Thermogram of +2 PMII mature. DSC thermogram was determined using approximately 0.5 mg/mL sample, with the corresponding buffer (in 50 mM Tris-HCl, pH 7.0) used as the reference. Thermograms shown here represent the average of six thermal experiments (3 replicates and 2 determinations per replicate).


The calculated calorimetric enthalpy (∆Hcal) values of the PMII mature enzymes are summarized in Table 2.4. The NoProPMII mature had the lowest ∆Hcal value for peak P1, 17.2 kcal/mol, as compared to the other two samples (+12 PMII mature and +2 PMII mature), which measured 92.5 and 142.4 kcal/mol, respectively. Furthermore, for peak P2, the +12 PMII mature exhibited a higher ∆Hcal value of 69.7 kcal/mol than the +2 PMII mature, which had a value of 43.5 kcal/mol. The differences in the distribution of ∆Hcal between the two mature enzymes suggested that the extended PS (which produced the +2 PMII mature and is longer than the truncated PS) may have resulted in more interactions and coordinated the development of low melting transition domains (i.e., P1) in PMII.

Table 2.4: Thermal stability of PMII mature enzymes

                     Peak 1                              Peak 2
Samples              Tm (°C)         ∆Hcal (kcal/mol)    Tm (°C)         ∆Hcal (kcal/mol)
+2 PMII mature       55.4 ± 2.2 a    142.4 ± 18.0 a      69.8 ± 5.5 a    43.5 ± 5.1 b
+12 PMII mature      54.7 ± 2.3 a    92.5 ± 15.4 b       69.7 ± 4.4 a    69.7 ± 0.5 a
NoProPMII mature     56.2 ± 3.1 a    17.2 ± 0.4 c        N/A             N/A

Tm: melting transition, ∆Hcal: calorimetric enthalpy. Data are expressed as mean ± SD of six DSC scans. Values in the same column having the same letter are not significantly different at p > 0.05
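One way to make the "distribution of ∆Hcal" between the two peaks concrete is to express the P1 enthalpy as a share of the total for each enzyme. The sketch below uses the mean values from Table 2.4; the summary statistic and the name `p1_fraction` are illustrative assumptions and are not part of the original analysis. NoProPMII mature is omitted because it showed no P2 transition.

```python
# Mean calorimetric enthalpies from Table 2.4, in kcal/mol: (peak P1, peak P2).
DSC_ENTHALPY = {
    "+2 PMII mature": (142.4, 43.5),
    "+12 PMII mature": (92.5, 69.7),
}


def p1_fraction(dh_p1: float, dh_p2: float) -> float:
    """Fraction of the total calorimetric enthalpy carried by peak P1."""
    total = dh_p1 + dh_p2
    if total <= 0:
        raise ValueError("total enthalpy must be positive")
    return dh_p1 / total


p1_share = {name: p1_fraction(*peaks) for name, peaks in DSC_ENTHALPY.items()}

for name, share in p1_share.items():
    print(f"{name}: {100 * share:.0f}% of total enthalpy in P1")
```

On these mean values, a larger share of the enthalpy sits in P1 for +2 PMII mature than for +12 PMII mature, which is the pattern the text attributes to the longer PS.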

2.3.7 Kinetic parameters of PMII matures

The Michaelis-Menten model was used to quantify and compare the kinetic parameters of the three PMII mature enzymes. The experiment was conducted using quenched fluorescent substrate 2837b assayed at pH 4.8. The results of the kinetic parameters for each sample, determined at 37°C, are summarized in Table 2.5. The mature enzyme obtained in the presence of an extended PS (+2 PMII mature) gave the highest catalytic efficiency (kcat/Km), 3.32 ± 0.12 s⁻¹ µM⁻¹. The catalytic efficiency measured in +12 PMII mature was 2.75 ± 0.08 s⁻¹ µM⁻¹, which was slightly higher than the catalytic efficiency measured previously for mPMII (2.45 s⁻¹ µM⁻¹; Parr-Vasquez and Yada, 2010). The sample obtained without a PS (NoProPMII mature) had the lowest catalytic efficiency at 0.31 ± 0.12 s⁻¹ µM⁻¹, which was approximately 11-fold lower than the catalytic efficiency measured for +2 PMII mature.
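The relationship between the kinetic parameters can be illustrated with a short calculation. The sketch below takes the mean Km and kcat values from Table 2.5, evaluates the Michaelis-Menten rate law, and checks the fold difference quoted above; the function and variable names are illustrative assumptions.

```python
# Mean values from Table 2.5: (Km in µM, kcat in s^-1).
KINETICS = {
    "+2 PMII mature": (0.73, 2.46),
    "+12 PMII mature": (0.76, 2.10),
    "NoProPMII mature": (0.58, 0.18),
}


def initial_rate(kcat: float, km: float, enzyme: float, substrate: float) -> float:
    """Michaelis-Menten initial rate: v0 = kcat * [E] * [S] / (Km + [S])."""
    return kcat * enzyme * substrate / (km + substrate)


def catalytic_efficiency(km: float, kcat: float) -> float:
    """Catalytic efficiency kcat/Km in s^-1 µM^-1."""
    return kcat / km


efficiency = {name: catalytic_efficiency(km, kcat)
              for name, (km, kcat) in KINETICS.items()}

# Fold difference between the efficiencies reported in the text.
fold_reduction = 3.32 / 0.31

for name, value in efficiency.items():
    print(f"{name}: kcat/Km = {value:.2f} s^-1 uM^-1")
print(f"+2 PMII versus NoProPMII: about {fold_reduction:.1f}-fold")
```

The ratios of the mean values differ slightly from the reported efficiencies, which would be expected if kcat/Km was averaged over individual replicates; that explanation is an assumption and is not stated in the source.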

Analysis of the kinetic parameters revealed two interesting findings. First, as shown in Table 2.5, NoProPMII mature was the least active mature as it had the lowest catalytic efficiency. This suggests that the absence of the PS greatly impacted the formation of active enzyme. During the activation of PMII, removal of the PS allows the N-terminal domain to undergo a major rearrangement that is believed to result in the correct orientation for the formation of active enzyme (Bernstein et al., 1999; Friedman and Caflish, 2008). Therefore, PMII folding in the absence of a PS may result in improper folding leading to low enzymatic activity.

Second, the length of the PS was discovered to have an effect on the catalytic efficiency of the active PMII mature. The mature enzymes obtained from truncated and extended PS were significantly different (p < 0.05) in their catalytic efficiency but not their substrate affinity (Km; p > 0.05; 0.76 ± 0.13 and 0.73 ± 0.09 µM, respectively). The two mature enzymes, however, differed in their turnover number (kcat); +2 PMII mature had a higher kcat than +12 PMII mature (p < 0.05; 2.46 ± 0.18 and 2.10 ± 0.13 s⁻¹, respectively; Table 2.5). Therefore, it is suggested that the higher enzymatic efficiency measured in +2 PMII mature as compared to +12 PMII mature was not due to major structural conformational changes (see Sections 2.3.5.1, Far-UV circular dichroism spectroscopy, and 2.3.5.2, Intrinsic fluorescence spectroscopy), but it is possible that the extra PS residues in +12 PMII mature interacted with nearby residues, which may have altered the optimum catalysis properties in PMII. Based on the sequence of the additional 10 PS residues in the +12 PMII mature (Figure 2.3: Complete sequence of truncated PSPMII construct in pET32b+ plasmid), the extra PS residues can be as long as 28 nm (calculation made using PyMol, data not shown). Since nearby pockets (e.g., S2, S3 and S4 pockets) also take part in the catalytic interactions in PMII (Westling et al., 1999; Gupta et al., 2010), some part of the extra PS residues may have interacted with neighboring pockets (considering the length of the extra PS) and altered their surrounding environment.

Table 2.5: Kinetic parameters of PMII mature enzymes

Samples              Km (µM)          kcat (s⁻¹)       kcat/Km (s⁻¹ µM⁻¹)
+2 PMII mature       0.73 ± 0.09 a    2.46 ± 0.18 a    3.31 ± 0.12 a
+12 PMII mature      0.76 ± 0.13 a    2.10 ± 0.13 b    2.75 ± 0.08 b
NoProPMII mature     0.58 ± 0.10 b    0.18 ± 0.03 c    0.31 ± 0.12 c

*Values were determined by averaging analysis of three replicates with two measurements per replicate (n = 6). Means sharing the same letter are not significantly different at p > 0.05

2.3.8 Active enzyme comparison using pepstatin-A

The amount of properly folded enzyme is difficult to determine accurately as many samples obtained after purification procedures may contain a small amount of denatured or improperly folded protein (Yonezawa et al., 1999). In heterogeneous samples, proteins are determined by their weight or their absorbance (e.g., A280 or BSA protein assay). These methods, however, will provide the total amount of protein in the sample rather than the amount of active protein. Alternatively, the amount of active enzyme can be quantified by using the competitiveness of a selected inhibitor toward the targeted enzyme. Since most APs are strongly inhibited by pepstatin-A, including PMII (Silva et al., 1996), the amount of active enzyme for each of the three PMII mature enzymes was approximated by measuring the amount of titrated pepstatin-A required for complete inhibition of enzyme activity (Kunimoto et al., 1972). Inhibition occurred when one of the hydroxyl groups in pepstatin interacted with the catalytic aspartic residue (Asp-32), which resulted in a non-hydrolysable mimic of the transition state formed during amide hydrolysis (Dunn, 2002).

Active enzyme concentration was assessed by incubating the enzyme-pepstatin-A complex for 5 min before adding the substrate (Yonezawa et al., 1999). Enzyme activity was determined by measuring the initial rate of substrate conversion. Pepstatin-A concentrations ranging from 0.1 to 5 nM were used in the experiment and the total inhibition activities are summarized in Figure 2.16. The results showed that samples obtained without a PS required less pepstatin-A for complete inhibition.

[Figure 2.16, panels A to C: relative activity (%) versus pepstatin-A concentration (nM) for NoProPMII mature, +12 PMII mature and +2 PMII mature.]

Figure 2.16: Activity recovery upon pepstatin-A addition to PMII mature enzymes (A) Pepstatin-A inhibition plot of +2, +12 and NoPro PMII mature enzymes. (B) Linear regression plot of the first six inhibition measurements on NoProPMII mature. (C) Linear regression plot of the first six inhibition measurements determined on +2 PMII mature and +12 PMII mature. Error bars indicate the SD of n = 6 (3 replicates and 2 determinations per replicate). The R-squared values for NoPro, +12 and +2 PMII mature were 0.981, 0.987 and 0.993, respectively.

Initial data on the first six measurements were plotted and the amount of pepstatin-A required for complete inhibition was determined by extrapolating the linear fit to zero activity (panels B and C, Figure 2.16). +2 PMII mature required the highest amount of pepstatin-A for complete inhibition (0.887 nM) and, therefore, contained the most active mature PMII. Relative to the +2 PMII mature sample, +12 PMII mature and NoProPMII mature had approximately 75% (complete inhibition at 0.667 nM) and 17% (complete inhibition at 0.148 nM) active enzyme, respectively.
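The extrapolation step amounts to fitting a straight line to the early titration points and solving for the pepstatin-A concentration at zero activity. A minimal sketch follows, assuming an ordinary least-squares fit; the six titration points are hypothetical, while the endpoints in `REPORTED_ENDPOINT_NM` are the values quoted in the text.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def x_intercept(xs, ys):
    """Inhibitor concentration at which the fitted line reaches zero activity."""
    slope, intercept = fit_line(xs, ys)
    return -intercept / slope


# Hypothetical first six titration points (pepstatin-A in nM, activity in %).
pepstatin_nM = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5]
activity_pct = [100.0, 88.0, 78.0, 66.0, 55.0, 44.0]
endpoint = x_intercept(pepstatin_nM, activity_pct)
print(f"extrapolated endpoint: {endpoint:.3f} nM")

# Endpoints reported in the text, expressed relative to +2 PMII mature.
REPORTED_ENDPOINT_NM = {"+2": 0.887, "+12": 0.667, "NoPro": 0.148}
relative_active = {name: value / REPORTED_ENDPOINT_NM["+2"]
                   for name, value in REPORTED_ENDPOINT_NM.items()}
```

Dividing each reported endpoint by the +2 PMII value gives the relative active-enzyme fractions of roughly 75% and 17% stated above.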

2.3.9 Structural prediction of +2 PMII and +12 PMII mature enzymes

The structures of the +2 PMII and +12 PMII mature enzymes were predicted based on a homology modelling technique using I-TASSER (Roy et al., 2010). This technique is based on the principle that protein sequences that are evolutionarily related will have similar structural conformation (Floudas, 2007). From the predicted models (Figure 2.17), the mature enzymes were shown to be similar, with a root-mean-square deviation (RMSD) of 0.892 Å for the superposition of the 288 corresponding Cα atoms of both molecules. The differences between the two matures were seen in a few flexible regions that were located on the surface of the enzyme (e.g., the loop between Phe-9:Ile-11, the helix between Val-143:Glu-146, the loop between Lys-186:Tyr-191 and the loop between Glu-276:Pro-280).

The extra 12 residues in +12 PMII mature were shown to be on the protein’s surface and caused the first three N-terminal residues to bend toward the main pocket of the protein. This orientation allowed some interaction between the extra residues and the beginning of the N-terminal residues. The extra 2 residues on the +2 PMII mature were too short to achieve a similar orientation or cause bending. The positions for both PSs, however, were distal from the active site and, therefore, no direct interactions with the catalytic dyad of the enzyme (i.e., Asp-32 and Asp-214) were observed.

Figure 2.17: Predicted structures of +12 PMII and +2 PMII matures generated from I-TASSER. Top: The superposition of the predicted structures and the orientation of the 2 and 12 extra residues (red; labelled +2 PS and +12 PS). Bottom: The orientation of the 12 extra residues (+12 PS) and the S3 pocket in +12 PMII. All predicted structures are represented in cartoon mode; +12 PMII is green and +2 PMII is magenta.


Superposition of the two predicted mature enzymes indicates that the main chains are similar.

The substrate mechanism of PMII is unique in that it involves not only the main active pocket (i.e., the pocket with the catalytic dyad) but also nearby pockets (e.g., the S2, S3 and S4 pockets) (Westling et al., 1999; Gupta et al., 2010). As such, any changes in the associated pockets may change the overall catalytic efficiency of the enzyme. In the predicted structure (Figure 2.17), the 12 extra residues were orientated in the same direction as the beginning of the N-terminal residues and some of the extra residues were in a position to possibly interact with the residues in the S3 pocket (i.e., Ile-11 and Met-12). This position may change the hydrophobic nature of the S3 pocket and is likely to cause a loss of stabilizing interactions (i.e., van der Waals). This may explain why the kcat for +2 PMII was higher than that for +12 PMII despite similar Km values. The S3 pocket has previously been shown to be critical in the alignment of the peptide in the active site cleft and thus facilitates cleavage (Westling et al., 1999). In addition, studies on mammalian APs have shown great sensitivity to substitutions that interact with the S2 and S3 pockets (Krupa et al., 2002; Beyer & Dun, 1998, 1996; Scarborough and Dun, 1994).

2.4 Conclusion

This chapter was undertaken to design three different constructs of PMII with differing PS lengths to evaluate the importance of the PS in protein folding. Every construct was successfully expressed in a soluble expression system and purified, resulting in mature enzyme. In the absence of a PS, NoProPMII mature has the high β-sheet and low α-helix distribution that is characteristic of APs. The conformation, however, is slightly more solvent-exposed compared to the other two matures (+2 and +12 PMII mature forms). The NoProPMII mature also displayed the lowest thermal stability and the lowest activity, which indicated inadequate folding. Based on these results it is suggested that the amino acid sequence encoded in the mature region of the enzyme is capable of generating the main archetype for folding but that the PS is required for the correct folding of PMII.

The second major finding was obtained by comparing the mature enzymes resulting from two different lengths of PS. Autoactivation of the extended PMII zymogen produced a mature enzyme (+2 PMII mature) that was more similar to the wild-type enzyme than that obtained (+12 PMII mature) from the autoactivation of truncated PMII. The +2 PMII mature was found to have a higher turnover number (kcat) towards the synthetic fluorescent substrate 2837b and required a higher amount of pepstatin-A for complete inhibition. Superposition of the two predicted matures showed similarity in their main chain conformations. However, the extra residues in the +12 PMII mature were predicted to interact with the S3 pocket, and this orientation may alter the overall catalytic properties of PMII. This evidence suggests that the mature enzyme derived from the extended PMII zymogen was similar to the wild-type enzyme, indicating the importance of the length of the PS for PMII folding.


CHAPTER 3 Kinetically-stabilized and PS-catalyzed folding of PMII

3.1 Introduction

Biologically active proteins are folded from an unfolded state into a well-defined three-dimensional structure (Uversky, 2002). The mechanism of protein folding, however, remains unclear despite numerous studies. In 1973, Anfinsen proposed a “thermodynamic hypothesis” (also known as Anfinsen's Dogma) based on a refolding study of denatured ribonuclease A. In this study, he claimed that the three-dimensional structure of a native protein in its normal biological environment is resolved at the lowest Gibbs free energy (thermodynamically stabilized) and that the entire folding process is made possible exclusively by interatomic interactions within the amino acid sequence (Anfinsen, 1973).

This hypothesis appeared to be applicable to most proteins, with the notable exception of those that are synthesized as inactive precursors, such as zymogens, which undergo specific proteolytic cleavage (removal of the PS) in order to produce native enzyme (Khan and James, 1998). The native structures of these proteins are kinetically driven and, after denaturation, they are unable to refold on a relevant time scale (seconds or minutes) even after they are exposed to conditions favorable for enzyme folding (Eder and Fersht, 1995; Cunningham et al., 1999). Although the native form is the functional conformation, it has also been reported that the native forms of these enzymes are less stable than their unfolded states (Agard, 1993; Miller and Agard, 1999; Dee and Yada, 2010). Controversy, therefore, arises from the concept that the native conformation is supposed to resolve at the most thermodynamically stable form (Anfinsen, 1973).


The majority of zymogen-derived enzymes are produced with a PS that is covalently attached to the N-terminus of the native protease sequence. Similar to molecular chaperones (i.e., cellular proteins that assist in folding polypeptide chains but are not part of the final assembled structure), PSs were also discovered to assist the folding of polypeptide chains. However, the PS can only assist in folding to the native structure (Sorensen et al., 1993; Chen and Inouye, 2008).

PS-assisted folding was initially observed in serine proteases. In the absence of the PS, α-lytic serine protease (αLP) folds into a kinetically trapped and inactive intermediate state that is stable for several months, with very slow conversion to the native state (t½ ≈ 1,800 years) (Baker et al., 1992). In addition, native αLP was discovered to be less stable than the inactive intermediate state by 4 kcal/mol. In order to maintain its functional form, αLP is kinetically rather than thermodynamically stabilized, with a large unfolding barrier that separates the functional from the inactive form (Baker et al., 1992).

Cumulative studies on the folding of zymogen-derived enzymes suggest two important PS functions: (1) catalysis of the folding of native enzyme by lowering the transition activation energy barrier, thus making the folding more practical in a biologically relevant time frame (i.e., seconds or minutes rather than days, months or years; the half-times of recovery for pepsin and αLP were ≈ 82 days and ≈ 1,800 years, respectively; Dee and Yada, 2010; Baker et al., 1992), and (2) stabilization of the native state over the inactive state, by allowing the native•PS complex to be more thermodynamically stable than the inactive•PS complex (Baker et al., 1992; Kelch et al., 2012; Dee and Yada, 2010).


In addition to αLP, a variety of other proteases (e.g., subtilisin, carboxypeptidase Y, papain and pepsin) have been found to exhibit similar PS-assisted folding (Eder and Fersht, 1995; Bryan, 2002; Horimoto et al., 2009). Native pepsin has been shown to be kinetically stabilized and is unable to refold upon denaturation due to the presence of a large free energy barrier that separates the native state and the refolded state. In the presence of the PS, however, the transition activation energy barrier is reduced by 14.7 kcal/mol and folding to native pepsin is catalyzed (Dee and Yada, 2010). Given that native PMII and pepsin are classified in the same A1 family of APs and that both enzymes are synthesized in a zymogenic form, it is hypothesized that PMII may behave similarly. To test this hypothesis, the folding landscape of PMII was investigated in the presence and absence of a PS, and with PSs of different lengths (extended PS with 60 residues and truncated PS with 48 residues).

3.2 Materials and Methods

3.2.1 Materials

Synthetic peptides corresponding to the extended and truncated PSs of PMII (QRDNEMNEILKNSEHLTIGFKVENAHDRILKTIKTHKLKNYIKESVNFLNSGLTKTNYLG and SEHLTIGFKVENAHDRILKTIKTHKLKNYIKESVNFLNSGLTKTNYLG, respectively) were purchased from CanPeptide Inc. (Pointe-Claire, QC, Canada). Both samples had 95% purity as verified by mass spectrometry. Stock solutions of both samples were prepared (w/v) in 50 mM Tris-HCl, pH 7.
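As a quick consistency check on the two peptide sequences listed above, the sketch below confirms their lengths (60 and 48 residues) and that the truncated PS corresponds to the C-terminal portion of the extended PS. The variable names are illustrative only.

```python
# Sequences copied from the Materials text (one-letter amino acid codes).
extended_ps = "QRDNEMNEILKNSEHLTIGFKVENAHDRILKTIKTHKLKNYIKESVNFLNSGLTKTNYLG"
truncated_ps = "SEHLTIGFKVENAHDRILKTIKTHKLKNYIKESVNFLNSGLTKTNYLG"

# Residues present only in the extended PS (its N-terminal extension).
extra_residues = extended_ps[: len(extended_ps) - len(truncated_ps)]

print(len(extended_ps), len(truncated_ps), extra_residues)
print("truncated PS is a C-terminal subsequence:", extended_ps.endswith(truncated_ps))
```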


3.2.2 Expression and purification of PMII

Expression and purification of extended PSPMII was conducted as described in Chapter 2, Section 2.2.4.1. Samples of extended PSPMII were activated and used in the following experiments.

3.2.3 Preparation of refolded PMII

Alkaline denatured +2 PMII mature (8 mg/mL) was prepared in 50 mM CAPS, pH 10, and incubated for 1 hour at 30°C to ensure complete denaturation. The denaturation condition was similar to that used by Xiao et al. (2011). Inactive refolded PMII (Rp) was prepared by diluting the denatured sample to 0.2 mg/mL in 50 mM Tris-HCl buffer, pH 7.

3.2.4 Kinetic stability of native PMII

The unfolding rate of native PMII (0.05 mg/mL sample) was measured based on intrinsic tryptophan fluorescence (excitation at 295 nm, emission scan from 305 to 450 nm, and emission slit-widths of 3 nm and 5 nm) as a function of urea concentration, ranging from 0 to 8.0 M in 50 mM Tris-HCl, pH 7. Single-exponential fitting of the data ($y = a - b\,e^{-k_{unf}t}$) was used to measure the unfolding rate at a specific urea concentration. The natural logarithm of the unfolding rate versus urea concentration was extrapolated to a urea concentration of zero to estimate the unfolding rate in the absence of denaturant (Pace, 1986).

The uncatalyzed folding rate of PMII was measured based on the recovery of Rp activity using the fluorescence based assay described in Chapter 2, Section 2.2.6. Aliquots from a sample of freshly prepared Rp were assayed for hydrolytic activity at different time intervals over 72 hours. Initial reaction rates were determined using linear regression, corrected for background fluorescence, and used to calculate the amount of recovered native PMII.
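The extrapolation of the unfolding rate to 0 M urea described in this section can be sketched as a linear least-squares fit of ln k against urea concentration. The example below is a minimal illustration only: the rate values are synthetic (generated from an assumed intercept and slope), not data from this study, and the function name `fit_line` is hypothetical.

```python
import math


def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Synthetic unfolding rates (1/s) at each urea concentration (M), generated
# from ln k = ln(4e-6) + 1.2 * [urea]. These are NOT measurements.
urea_M = [4.0, 5.0, 6.0, 7.0, 8.0]
k_obs = [4.0e-6 * math.exp(1.2 * c) for c in urea_M]

slope, intercept = fit_line(urea_M, [math.log(k) for k in k_obs])
k_unf_water = math.exp(intercept)  # extrapolated rate at 0 M urea

print(f"slope = {slope:.2f} per M, k_unf at 0 M urea = {k_unf_water:.2e} 1/s")
```

In practice each rate would first be obtained from a single-exponential fit of the fluorescence trace at that urea concentration; only the extrapolation step is shown here.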


3.2.5 Prosegment-catalyzed PMII folding

The ability of the extended and truncated PSs to assist Rp in folding into native PMII was assessed through enzyme activity. Various concentrations of PS (1 to 50 μM) were added to Rp while maintaining a minimum PS:Rp ratio of 10:1. Aliquots were diluted in 0.1 M phosphate buffer, pH 4.8, at several time intervals (0, 15, 30, 60, 90, 120, 240, 360, 480, and 720 sec) before being assayed for Np activity as previously described in Chapter 2, Section 2.2.6. Experiments were conducted at 25°C. The recovery of Np activity over time at each PS concentration was fitted using a single-exponential function ($y = a - b\,e^{-k_f t}$) to determine the individual folding rate, kf. Data from multiple PS concentrations were analyzed as a function of PS concentration according to the modified Michaelis-Menten model described by Peter et al. (1998) in Equation 3.1, where kcat is the maximum folding rate and [PS] is the concentration of PS.

$$k_f = \frac{k_{cat}\,[\mathrm{PS}]}{K_M + [\mathrm{PS}]} \qquad \text{(Equation 3.1)}$$
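To illustrate how Equation 3.1 behaves, the sketch below evaluates the folding rate over a few PS concentrations. The kcat and KM values are those reported for the extended PS in Section 3.3.2 (Table 3.1); the function name and the chosen concentrations are illustrative assumptions.

```python
def folding_rate(ps_uM, k_cat, K_M):
    """Equation 3.1: k_f = k_cat [PS] / (K_M + [PS])."""
    return k_cat * ps_uM / (K_M + ps_uM)


# Fitted parameters reported for the extended PS (Table 3.1).
K_CAT = 4.35e-2  # maximum folding rate, 1/s
K_M = 1.584      # PS concentration giving half-maximal rate, uM

for ps in (1.0, 5.0, 50.0):
    rate = folding_rate(ps, K_CAT, K_M)
    print(f"[PS] = {ps:5.1f} uM -> k_f = {rate:.4f} 1/s ({100 * rate / K_CAT:.0f}% of kcat)")
```

As with any hyperbolic saturation model, the rate equals kcat/2 when [PS] = KM and approaches kcat only asymptotically at high [PS].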

3.2.6 Prosegment inhibition studies

PS-induced inhibition was investigated by measuring the reduction in activity of +2 PMII mature upon the addition of multiple concentrations of PS (0 to 15 µM). Upon the addition of PS, the mixture was incubated for 5 min at 25°C and assayed for activity using the fluorescent synthetic substrate 2837b, as described in Chapter 2, Section 2.2.6. The reaction rates were normalized to activity in the absence of the PS and data were analyzed using the competitive inhibitor form of the Michaelis-Menten equation (Equation 3.2), where vo is the initial reaction rate, Vmax is the maximum reaction rate, [S] is the substrate concentration (fixed at 0.1 mM), [I] is the inhibitor concentration, Km is the Michaelis constant, and Ki is the inhibition constant:

$$v_o = \frac{V_{max}\,[S]}{[S] + K_M\left(1 + \dfrac{[I]}{K_i}\right)} \qquad \text{(Equation 3.2)}$$
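The following sketch shows how Equation 3.2 yields the normalized activity (rate with inhibitor divided by rate without inhibitor) used in the analysis. The substrate concentration matches the method (0.1 mM); the Michaelis constant of the substrate and the Ki used here are hypothetical placeholder values chosen only to make the example run.

```python
def initial_rate(s, i, v_max, k_m, k_i):
    """Equation 3.2: v0 = Vmax [S] / ([S] + KM (1 + [I]/Ki))."""
    return v_max * s / (s + k_m * (1.0 + i / k_i))


S_UM = 100.0    # substrate concentration, uM (0.1 mM, as in the method)
K_M_UM = 10.0   # HYPOTHETICAL Michaelis constant of the substrate, uM
K_I_UM = 0.15   # HYPOTHETICAL inhibition constant of the PS, uM


def relative_activity(i_uM):
    """Activity normalized to the rate in the absence of PS (Vmax cancels)."""
    return initial_rate(S_UM, i_uM, 1.0, K_M_UM, K_I_UM) / initial_rate(
        S_UM, 0.0, 1.0, K_M_UM, K_I_UM
    )


for i in (0.0, 0.5, 2.0, 5.0, 15.0):
    print(f"[PS] = {i:4.1f} uM -> relative activity = {relative_activity(i):.3f}")
```

Because the data are normalized, Vmax drops out, and the PS concentration that halves the activity is Ki(1 + [S]/KM) rather than Ki itself.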

3.2.7 Statistical Analysis

Statistical analyses were conducted using GraphPad Prism software (Motulsky, H.J. 2003; www.graphpad.com). The data were subjected to one-way analysis of variance (ANOVA) and the means were separated using Tukey’s test. All experiments were conducted in two replicates with three determinations per replicate. Significantly different means (p < 0.05) are denoted by different letters.

3.3 Results and Discussion

3.3.1 Kinetic stability of PMII

The free energy of the folding and unfolding reaction is often used as an index to measure protein stability (Eder and Fersht, 2002). However, in the case of many zymogen-derived enzymes, the equilibrium between the folded and unfolded states is nearly impossible to measure accurately due to a slow recovery rate, aggregation and autolysis (Bryan, 2002; Dee and Yada, 2010). To overcome this problem, the free energy of each reaction (unfolding and folding) was determined separately, similar to the technique used for αLP (Derman and Agard, 1998). In this experiment, the free energy of Np unfolding and folding was determined by measuring the rate of unfolding (kunf) and uncatalyzed folding (kf(uncat)).

In order to determine kunf, the kinetics of urea-induced unfolding were measured as a function of urea concentration. The intrinsic fluorescence spectra from Trp residues served as probes that monitored protein unfolding (Eftnik, 1994) and the results are summarized in Figure 3.1A. Individual unfolding rates (ku) were generated from the unfolding spectra of Np over a range of urea concentrations and were used to generate a linear regression plot of the natural logarithm of the unfolding rate versus urea concentration (Figure 3.1B). From the plot, it was observed that Np had a slow unfolding rate of 3.8 × 10⁻⁶ ± 0.3 × 10⁻⁶ s⁻¹ (t1/2 = 50.7 h or ~2.1 days).

Figure 3.1: Unfolding kinetics of PMII using urea as denaturant. Unfolding was induced using a series of urea concentrations at 25°C. (A) The rate of unfolding (ku) was determined by monitoring the decrease in fluorescence intensity (a.u.) of Np at different concentrations of urea as a function of time (sec). (B) In order to determine the unfolding rate of Np, kunf, the plot of ln k versus urea concentration was extrapolated to 0 M urea. Error bars indicate the SD of two replicates with triplicate determinations per replicate (n = 6).


The uncatalyzed rate of folding (kf(uncat)) for PMII was calculated based on the recovery of Rp activity using a fluorescence-based assay over 72 hours. Nonlinear fitting of the data from the recovery of Rp upon denaturation is shown in Figure 3.2A.

From Figure 3.2A, an increase in the rate of substrate conversion was observed as the protein gradually refolded. However, recovery plateaued after 24 h, when the rate of enzyme recovery became comparable to the rate of degradation of Rp. In a previous folding experiment on PMII, Xiao et al. (2011) observed degradation in the renatured PMII sample after three days of incubation in 50 mM Tris-HCl buffer, pH 7. Therefore, in order to estimate the folding rate of Rp to Np, data from the first 18 h of activity recovery (where proteolysis was negligible) were linearly fitted (Figure 3.2B). From these data, the folding rate of Np was 2.3 × 10⁻⁶ ± 0.2 × 10⁻⁶ s⁻¹ (t1/2 fold = 82.1 h or approximately 3.4 days). The slow folding rate indicated that Np was unable to fold on its own within a biological timeframe (i.e., seconds or minutes).

Because kunf and kf(uncat) can be directly related to the relative stability of Rp and Np, these variables were used to calculate the activation free energy barriers for the folding and unfolding of PMII using the equation $\Delta G^{\ddagger} = -RT \ln\left(\frac{k\,h}{k_B T}\right)$, where k is the rate constant, h is the Planck constant and kB is the Boltzmann constant (Dee and Yada, 2010). The diagram of the activation free energy for PMII is shown in Figure 3.3. The Rp and Np states were separated by folding and unfolding activation energies of 25.12 and 24.83 kcal/mol, respectively. Of the two states (Np and Rp), Rp was determined to be more thermodynamically stable than Np by ≈ 0.3 kcal/mol (25.12 – 24.83 kcal/mol). This result is consistent with the results from a previous DSC experiment conducted on PMII, which showed that the Tm for Rp was ≈ 15°C higher than that for Np (Xiao et al., 2011). This phenomenon, in which Rp was more stable than Np, has also been observed in other zymogen-derived enzymes such as αLP (Bryan, 2002), subtilisin (Jaswal et al., 2002), and pepsin (Dee and Yada, 2010).
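The barrier heights and half-lives discussed above can be reproduced from the measured rate constants with the Eyring-type relation. The sketch below is an illustration under stated assumptions: the temperature is taken as 25°C (298.15 K), the physical constants are standard CODATA values, and the uncatalyzed folding rate is taken as 2.34 × 10⁻⁶ s⁻¹ as listed in Table 3.1. The function names are illustrative only.

```python
import math

R_KCAL = 1.987204e-3       # gas constant, kcal/(mol K)
K_B = 1.380649e-23         # Boltzmann constant, J/K
H_PLANCK = 6.62607015e-34  # Planck constant, J s


def eyring_barrier(rate_per_s, temp_K=298.15):
    """Activation free energy (kcal/mol): dG = -RT ln(k h / (kB T))."""
    return -R_KCAL * temp_K * math.log(rate_per_s * H_PLANCK / (K_B * temp_K))


def half_life_hours(rate_per_s):
    """Half-life of a first-order process, t1/2 = ln 2 / k, in hours."""
    return math.log(2) / rate_per_s / 3600.0


k_unf = 3.8e-6      # unfolding rate of Np, 1/s (reported above)
k_f_uncat = 2.34e-6  # uncatalyzed folding rate, 1/s (Table 3.1)

dG_unf = eyring_barrier(k_unf)
dG_fold = eyring_barrier(k_f_uncat)

print(f"unfolding barrier: {dG_unf:.2f} kcal/mol, t1/2 = {half_life_hours(k_unf):.1f} h")
print(f"folding barrier:   {dG_fold:.2f} kcal/mol, t1/2 = {half_life_hours(k_f_uncat):.1f} h")
print(f"Rp more stable than Np by {dG_fold - dG_unf:.2f} kcal/mol")
```

With these assumptions the computed barriers agree with the reported 24.83 and 25.12 kcal/mol to within a few hundredths of a kcal/mol; small differences are expected from rounding of the rates and the exact constants used.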

Figure 3.2: Uncatalyzed folding kinetics of PMII. The rate of uncatalyzed folding of Rp to Np at 25°C was estimated based on recovery of enzyme activity (activity recovery, %, versus time, h). (A) Nonlinear fitting: the recovery of Np upon denaturation was measured based on hydrolytic activity using a fluorescence substrate assay for up to 72 hours. (B) Linear fitting: kf(uncat) was determined by using a linear fit of the recovery data over the first 18 h. Error bars indicate the SD of two replicates with triplicate measurements (n = 6).


Based on the activation energies of PMII (Figure 3.3), Np was separated from Rp by a relatively large activation energy barrier. Due to this barrier, conversion from an unfolded to a folded state was extremely slow, even when conditions favored folding (50 mM Tris-HCl, pH 7, no denaturant). Results from the present study imply that the native structure of PMII was kinetically rather than thermodynamically stable, with a large unfolding barrier that preserved the biologically active conformation.

Figure 3.3: Free energy diagram summarizing the folding landscape of PMII (ΔG, kcal/mol, versus folding coordinate; states U, Rp and Np). The activation energy barrier of folding Rp to Np (left barrier, 25.12 kcal/mol) was determined from the uncatalyzed folding rate, kf(uncat), and the barrier of unfolding Np to Rp (right barrier, 24.83 kcal/mol) was determined from the unfolding rate, kunf.

3.3.2 Prosegment-catalyzed folding of refolded PMII

From the above study it was clear that a kinetic barrier exists between Rp and Np. In many zymogen-derived enzymes, a PS facilitates folding by lowering the activation barrier and stabilizing the transition state of the native enzyme (Eder and Fersht, 1995; Sohl et al., 1998; Dee and Yada, 2010). Therefore, it was hypothesized that the presence of a PS would help catalyze PMII folding, which was tested by comparing the folding kinetics in the presence and absence of a PS (see Section 3.2.5). In addition, two lengths of PS, extended (60 residues) and truncated (48 residues), were analyzed and compared for their ability to catalyze PMII folding.

Recovery of Rp to Np in the presence of multiple concentrations and different lengths of PS is presented in Figure 3.4, where the rate of folding increased with the amount of PS. Saturation occurred at approximately 5 µM for the extended PS and 7.5 µM for the truncated PS. In order to measure the enzymatic profile of the PS-catalyzed folding reaction, kf values were analysed according to Equation 3.1 (Section 3.2.5). Results are shown in Figure 3.5. The progress curve for the extended PS resulted in a kcat of 4.35 × 10⁻² ± 0.003 s⁻¹ and a Michaelis constant, Km, of 1.584 ± 0.101 µM. In contrast, the truncated PS had kcat and Km values of 2.64 × 10⁻² ± 0.002 s⁻¹ and 1.675 ± 0.124 µM, respectively (Table 3.1).

Table 3.1: Kinetic parameters of extended PS and truncated PS

Kinetic parameter                            Extended PS                    Truncated PS
kcat (s⁻¹)                                   4.35 × 10⁻² ± 4.5 × 10⁻³ a     2.64 × 10⁻² ± 2.0 × 10⁻³ b
KM (μM)                                      1.584 ± 0.228 a                1.675 ± 0.124 a
kcat/KM (s⁻¹ μM⁻¹)                           2.75 × 10⁻² ± 3.1 × 10⁻³ a     1.58 × 10⁻² ± 2.4 × 10⁻³ b
Uncatalyzed folding rate, kf(uncat) (s⁻¹)    2.34 × 10⁻⁶                    2.34 × 10⁻⁶
Folding rate enhancement (kcat/kf(uncat))    18,550 ×                       11,258 ×

* Values were determined by averaging analysis of two replicates with three determinations per replicate (n = 6). Values in the same row having the same letter are not significantly different (p > 0.05).


Figure 3.4: Kinetic trace of recovery of Np activity upon incubation with several concentrations of PS ([PS], µM). The data were fitted using a single-exponential function to obtain the folding rate, kf. (A) Kinetic trace of recovery of Np activity obtained using truncated PS. (B) Kinetic trace of recovery of Np activity obtained using extended PS. Error bars indicate the SD of n = 6 (2 replicates and 3 determinations per replicate).


Figure 3.5: PS-catalyzed folding of Rp to Np using truncated and extended PS. Summary of the modified Michaelis-Menten model fit to kf data obtained at various concentrations of PS. Error bars indicate the SD of n = 6 (2 replicates and 3 determinations per replicate).

A comparison of the rates of catalyzed and uncatalyzed folding (kcat versus kf(uncat)) highlighted the function of the PS as a folding catalyst in PMII. In the presence of extended PS, the half-time of recovery of Np was 15.9 s, which was approximately 18,550 times faster than the uncatalyzed rate. Truncated PS also improved the folding rate as compared to the uncatalyzed rate, but to a lesser extent (approximately 11,260 times; Table 3.1). In previous PS-catalyzed experiments, it was shown that the magnitude of the PS effect in catalyzing folding was unique to each specific enzyme. For example, the ability of the PS to accelerate the folding rate varied from as high as 3 × 10⁹ times in αLP to as low as 400 times in Streptomyces griseus protease B (SGPB) (Derman and Agard, 2000; Truhlar et al., 2004, respectively). As for APs that are closely related to PMII, the folding rate of pepsin was approximately 85,000 times faster in the presence of full length PS (Dee and Yada, 2010).
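The half-time and rate enhancements quoted for PMII above follow directly from the rate constants in Table 3.1. The sketch below recomputes them from the rounded table values; the function names are illustrative, and small differences from the reported enhancements (about 0.2%) are expected because the thesis values were presumably derived from unrounded rates.

```python
import math

K_F_UNCAT = 2.34e-6  # uncatalyzed folding rate, 1/s (Table 3.1)
K_CAT = {"extended PS": 4.35e-2, "truncated PS": 2.64e-2}  # 1/s (Table 3.1)


def half_time_s(rate_per_s):
    """Half-time of a first-order process in seconds, ln 2 / k."""
    return math.log(2) / rate_per_s


def enhancement(k_cat, k_uncat=K_F_UNCAT):
    """Fold increase of the catalyzed over the uncatalyzed folding rate."""
    return k_cat / k_uncat


for name, k in K_CAT.items():
    print(f"{name}: half-time = {half_time_s(k):.1f} s, "
          f"enhancement = {enhancement(k):,.0f}-fold")
```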


The ability to catalyze folding may be related to the length of the PS. For example, in pepsin, a truncated PS decreased the catalyzed folding rate by 2.5 to 70,000 times as compared to the full length PS (Myer, 2012). Similar results have been observed in αLP, α-defensin and cathepsin B (Figueredo and Ouellette 2010; Müntener et al., 2005; Cunningham et al., 2002).

The full length PS in proPMII is 125 residues long and includes a 21-residue trans-membrane region (Hill et al., 1994). The truncated PS (48 residues) used in this experiment was comparable in length to the PS of most APs, and the extended version (60-residue PS) included all residues up to the trans-membrane domain. Although the full length PS was not examined and its folding efficiency remains unknown, the current study highlights the importance of the PS as a folding catalyst in PMII and shows that the longer PS (extended PS) was more effective in Np formation than the shorter PS (truncated PS).

3.3.3 Prosegment affinity toward native PMII

To assess the affinity of the PS toward Np, an inhibition study was conducted on the activity of Np upon addition of multiple concentrations of PS (0 to 15 µM). Results are shown in Figures 3.6 A and B. The extended PS was a significantly stronger inhibitor of Np activity than the truncated PS (Ki = 0.1416 ± 0.021 µM vs. Ki = 0.3031 ± 0.037 µM, respectively; p < 0.05). Nevertheless, both PSs showed strong binding affinity and, therefore, the ability to inhibit the activity of Np PMII.

Figure 3.6: PS binding to Np fitted using a competitive inhibitor model. (A) Inhibition of Np by truncated PS. (B) Inhibition of Np by extended PS. Error bars indicate the SD of n = 6 (2 replicates and 3 determinations per replicate).

It is interesting to note that the length of the PS affected not only the folding but also the inhibition of Np, with the extended PS resulting in higher inhibition. These results are consistent with other studies, which established that the PS does not function solely as a folding catalyst but also as a strong inhibitor (Baker et al., 1992; Derman and Agard, 2000; Dee and Yada, 2010; Bhaumik et al., 2012). Based on the simulation study conducted in Chapter 4, Section 4.3.7, the extended PS was found to interact at both ends of the C-domain and the PS was further stabilized by the extra electrostatic interactions of two H-bonds and five salt bridges. The dynamics of the complex (extended PS and mature PMII) in this orientation may allow the correct conformation to form more rapidly, which would favor inhibition of the PMII zymogen. The role of electrostatic interactions in stabilizing the PS was also observed during mutation work on pepsin’s PS (Dee et al., 2009). The substitution of positively charged lysine with alanine (K-36A) showed the highest reduction in Ki as compared to the other two mutations (V-4A and R-8A), which indicated that electrostatic stabilizing forces were critical for the PS to act as an inhibitor. However, rates of PS-catalyzed folding were similar regardless of the mutation, which demonstrated the robustness of the PS as a folding catalyst (Dee et al., 2009).

3.3.4 Energy landscape of uncatalyzed and prosegment-catalyzed PMII folding

The present study was designed to determine the effect of the PS in folding native PMII (Np). By analyzing the PS-catalyzed folding rate (kcat), PS affinity towards Rp (Km), and PS affinity towards Np (Ki), free energy diagrams of PMII folding in the presence of truncated and extended PS were constructed from the equilibrium (Keq) and rate (k) constants using the equations $\Delta G = -RT \ln K_{eq}$ and $\Delta G^{\ddagger} = -RT \ln\left(\frac{k\,h}{k_B T}\right)$ (Dee and Yada, 2010). The folding barrier in the presence of PS was calculated based on the value of kcat and the PS-stabilized energy was measured using Km (stabilized state energy of Rp-PS) and Ki (stabilized state energy of Np-PS) (Dee and Yada, 2010). The free energy diagrams generated in the presence and absence of PS are summarized in Figures 3.7 A and B.


The results indicated that denatured PMII at neutral pH (50 mM Tris-HCl, pH 7) was rapidly converted into a stable intermediate, Rp, instead of folding back into Np. In the absence of PS (solid line in Figures 3.7 A and B), the activation energy barriers of unfolding (from kunf) and folding (from kf(uncat)) showed that Np was marginally destabilized when compared to Rp by approximately 0.3 kcal/mol (folding barrier of 25.12 kcal/mol minus the unfolding barrier of 24.83 kcal/mol). Moreover, the presence of large activation energy barriers of unfolding and folding prevented Np from being in equilibrium with Rp.

In the presence of PS, the energy barrier was lowered and the Np and Rp states were further stabilized. The extended PS lowered the folding barrier from 25.12 to 19.30 kcal/mol and stabilized Np to a greater extent than Rp, as evidenced by the approximately 1.4 kcal/mol lower stabilization energy measured in Np-PS versus Rp-PS (Figure 3.7A). A similar pattern of folding assistance was also observed with the truncated PS, but Np was stabilized to a lesser degree (approximately 1.0 kcal/mol, Figure 3.7B). Overall, both the truncated and extended PSs favored a shift to the Np-PS complex over the Rp-PS complex, which provided the driving force for the formation of the active conformation of Np rather than inactive Rp. As shown in Figures 3.7A and 3.7B, the extended and truncated PSs bound the folding transition state with greater affinity than Np or Rp, lowering the barrier by 12.37 and 12.03 kcal/mol, respectively. These values, however, were lower than those found for pepsin, where the PS lowered the energy barrier of transition by 14.7 kcal/mol rather than stabilizing Rp or Np; i.e., both Rp and Np were further stabilized by 8.6 and 10.2 kcal/mol in the presence of the PS, respectively (Dee and Yada, 2010). Similar to pepsin, the PS of PMII primarily catalyzed folding by stabilizing the folding transition state.
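The energy differences discussed above can be checked with a short calculation. The sketch below is illustrative and rests on stated assumptions: Km and Ki are treated as dissociation constants of the Rp-PS and Np-PS complexes, the temperature is 25°C, and the 6.55 kcal/mol term is the Rp-PS stabilization energy reported for the extended PS in the Conclusion. Because the reference concentration used in the thesis for the absolute stabilization energies is not stated here, only the difference between the two complexes (which is independent of that choice) is computed from the constants. Function names are hypothetical.

```python
import math

RT = 1.987204e-3 * 298.15  # kcal/mol at 25 C


def native_preference(k_m_uM, k_i_uM):
    """Extra stabilization of Np-PS over Rp-PS, RT ln(Km/Ki), in kcal/mol."""
    return RT * math.log(k_m_uM / k_i_uM)


def transition_state_stabilization(uncat_barrier, cat_barrier, rp_ps_stabilization):
    """Lowering of the folding transition state by the PS, in kcal/mol."""
    return uncat_barrier - cat_barrier + rp_ps_stabilization


# Km and Ki (uM) reported in Sections 3.3.2 and 3.3.3.
extended = native_preference(1.584, 0.1416)
truncated = native_preference(1.675, 0.3031)

# Barriers and Rp-PS stabilization (kcal/mol) as reported for the extended PS.
ts_extended = transition_state_stabilization(25.12, 19.30, 6.55)

print(f"Np-PS vs Rp-PS, extended PS:  {extended:.2f} kcal/mol")
print(f"Np-PS vs Rp-PS, truncated PS: {truncated:.2f} kcal/mol")
print(f"transition-state stabilization, extended PS: {ts_extended:.2f} kcal/mol")
```

The computed preferences of roughly 1.4 and 1.0 kcal/mol match the values discussed above, and the transition-state term reproduces the reported 12.37 kcal/mol.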


Figure 3.7: Folding landscape of PMII in the presence and absence of PS. (A) Folding landscape of native PMII in the presence of extended PS. (B) Folding landscape of native PMII in the presence of truncated PS. The PS-catalyzed activation energy barrier (left dotted barrier) was determined based on kcat values. The stabilized state energy of Rp to Rp+PS (left stabilized energy) was measured on the basis of Km values. The stabilized state energy of Np to Np+PS (right stabilized energy) was measured on the basis of Ki values. The PS-catalyzed transition barrier was determined by the formula: value of uncatalyzed folding barrier - value of catalyzed folding barrier + value of stabilized state energy of Rp to Rp+PS (e.g., calculation for extended PS = (25.12 - 19.30 + 6.55) kcal/mol = 12.37 kcal/mol).


3.4 Conclusion

Studies on the kinetic stability of +2 PMII mature showed that the enzyme was kinetically rather than thermodynamically stabilized and that Rp was more thermodynamically stable than Np. The presence of a large unfolding barrier prevented Np from converting to Rp even though Np was not the thermodynamically favored state. This situation is contrary to that proposed by Anfinsen, i.e., that the native structure is the most thermodynamically stable conformation (Anfinsen, 1973). In the absence of PS, Np was characterized by slow unfolding and uncatalyzed folding rates, as indicated by large energy barriers to unfolding and refolding (24.50 and 25.12 kcal/mol, respectively). Similar to most zymogen-derived enzymes (αLP, subtilisin and pepsin), PMII folds efficiently (in seconds or minutes, not days) to its metastable native conformation with the assistance of the PS.

The PS acted as a foldase in the folding of PMII Np with the extended PS having a greater effect than the truncated PS. The extended PS served as a strong catalyst and increased the folding rate by 18,550 times as compared to the uncatalyzed folding rate of Rp. In addition, the PS had high affinity toward Np and Rp. The binding of PS not only stabilized the two states but also shifted the state of Np to be more thermodynamically stable such that the formation of Np-PS was favored over Rp-PS as evidenced by a greater reduction in the stabilized energy measured in the Np-PS state as compared to the Rp-PS state (7.98 kcal/mol and 6.55 kcal/mol, respectively). While the formation of Np-PS was thermodynamically driven, once the PS was removed, Np was metastable with a large kinetic unfolding barrier that would help maintain the functional conformation (Baker, 1998; Dee and Yada, 2010).


CHAPTER 4 Structural characteristics of PMII zymogens

4.1 Introduction

Pepsin-like AP enzymes, members of the A1 family in the MEROPS database (Rawlings et al., 2006), share a high degree of similarity in their structures and functions. These enzymes are produced as zymogens and their conversion to active enzyme requires a biochemical transformation that involves molecular reconfiguration and proteolytic removal of the prosegment (PS) (Khan et al., 1999). A structural comparison between pepsinogen (i.e., the zymogen of pepsin) and pepsin showed that interactions between the PS and residues within the active site prevent the substrate from binding to the active site of the zymogen (Richter et al., 1998). Most pepsin-like AP zymogens are stable at neutral pH but are activated in acidic environments. In acidic environments, the acidic residues of the enzyme become protonated and the stabilizing electrostatic interactions between the PS and the active enzyme are disrupted (Tang and Wong, 1987; Coates et al., 2008). Once the stabilizing contacts are disrupted, the PS undergoes major orientation changes that initiate the activation reaction (Coates et al., 2008).

Although plasmepsin II (PMII) is related to pepsin-like APs and has a similar native structure, it has a different zymogen structure. In typical pepsin-like APs, the PSs are composed of 40-50 amino acids (e.g., the PSs of pepsinogen, procathepsin D and progastricsin consist of 44, 44 and 43 amino acids, respectively) and share great similarity in their sequences and three-dimensional structures (Khan et al., 1999). The PS of PMII, however, is longer (up to 125 amino acid residues including a trans-membrane domain in the middle of the PS) and shows virtually no sequence similarity when compared to the archetypal pepsinogen PS (Dame et al., 1994; Berry et al., 1995; Bernstein et al., 1999). These unique characteristics result in a distinct zymogen inhibition mechanism as compared to other pepsin-like enzymes. Instead of binding to the active site, the PS of PMII inhibits the enzyme by separating the two domains and stabilizing the catalytic dyad in a distorted conformation (Bernstein et al., 1999).

In the crystallized structure of truncated PSPMII (i.e., PS of 48 residues), the PS consists of a β-strand followed by an α-helix, a turn, a second α-helix and a coil connection to the mature segment (Bernstein et al., 1999). Almost the entire length of the truncated PS interacts with the mature enzyme, with the majority of the contacts being in the C-domain. The PS junction is located in a tight loop involving residues Tyr-122p, Leu-123p, Gly-124p, Ser-1, Ser-2, Asn-3 and Asp-4 (i.e., the Tyr-Asp loop), and these tight interactions anchor the PS to the body of the protein (Bernstein et al., 1999). The disruption of three salt bridges (Glu-87p:Arg-92p, Asp-91p:His-164 and Glu-108p:Glu-107p) in an acidic environment was reported to play a key role in autoactivation (Khan et al., 1999). Dissociation of these interactions destabilizes the PS structure and weakens its interaction with the C-domain. In addition, protonation of acidic residues (including Asp-4) disrupts the Tyr-Asp loop, introduces slack into the PS harness and produces a domain-closed form with a functional active site (Khan et al., 1999). Once the active site is formed, the enzyme hydrolyzes the scissile bond at Phe-112p:Leu-113p and produces an active enzyme with an extra 12 residues at the beginning of the N-domain (Khan et al., 1999).

As shown in Chapter 2, truncated and extended PSPMII produced two distinct mature enzymes after autoactivation: +12 PMII and +2 PMII, respectively. In vivo, proPMII is activated by a cysteine protease maturase within the acidic environment of a food vacuole (i.e., the digestive site within the parasite where hemoglobin is digested by the enzyme), and proteolytic cleavage of the PS takes place between Gly-124p:Ser-1, which produces a native enzyme without any extra residues (Francis et al., 1997). Thus, the length of the PS and its interactions may be critical in explaining the folding event in PMII. Therefore, a structural comparison study between truncated and extended PSPMII was undertaken using circular dichroism (CD) and intrinsic protein (Trp) fluorescence spectroscopy to determine differences in secondary and tertiary conformations. Given that there are no crystal structures available for the extended PMII zymogen, an in silico approach was used to generate a predicted structural model.

4.2 Materials and Methods

4.2.1 Expression and purification of PMII zymogens

Expression and purification of extended and truncated PSPMII were achieved using protocols described in Sections 2.2.2 through 2.2.4.1.

4.2.2 Structural determination of PMII zymogens

The secondary structures of extended and truncated PSPMII were determined through far-UV circular dichroism spectroscopy as described in Section 2.2.7.1. Tertiary structure analysis was completed using intrinsic protein (Trp) fluorescence spectroscopy as described in Section 2.2.7.2.

4.2.3 Homology modelling of extended PSPMII

A comparative modelling approach was chosen to predict the initial structure of extended PSPMII. For this purpose, the sequence of extended PSPMII was aligned with related, known tertiary and quaternary structures (i.e., the crystallized structure of truncated PMII zymogen, PDB entry 1PFZ) using I-TASSER (Eswar et al., 2006). The extended PSPMII sequence was uploaded to the server in FASTA format (i.e., a text-based format for representing peptide sequences) and the final results were accessed via a link delivered by registered email.
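As an illustration of the FASTA format mentioned above, the following minimal Python sketch writes a single record: a header line beginning with ">" followed by the sequence wrapped to a fixed width. The function name `to_fasta`, the record identifier and the 60-character line width are assumptions made for this example, and the residues shown are placeholders, not the extended PSPMII sequence.

```python
import textwrap


def to_fasta(identifier: str, sequence: str, width: int = 60) -> str:
    """Return one FASTA record: a '>' header line plus wrapped sequence lines."""
    # Remove any whitespace and normalise to upper-case one-letter codes.
    cleaned = "".join(sequence.split()).upper()
    body = "\n".join(textwrap.wrap(cleaned, width))
    return f">{identifier}\n{body}\n"


# Placeholder residues only (hypothetical); NOT the real extended PSPMII sequence.
record = to_fasta("extended_PSPMII_placeholder", "MKT" * 30)
print(record)
```

The resulting text can be pasted into a submission form or saved to a file; the sequence itself must come from the construct described in Chapter 2.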

4.2.4 Molecular dynamics simulations of PMII zymogens

The structure obtained from I-TASSER was used as the initial template for the molecular dynamics (MD) simulation. The simulation was conducted using GROMACS Version 3.3.1 (Van der Spoel et al., 2005) with a force field optimized for the helix-coil transition of polypeptides. The selected force field applies a simple backbone correction to two AMBER force fields (ff99SB and ff03) in order to produce a proper balance of average helix and coil populations at ambient conditions (Best and Hummer, 2009).

The simulation was 100 ns in duration and was conducted at pH 7.0; the pKa values of the ionizable groups in the protein were calculated using the H++ server (http://biophysics.cs.vt.edu/index.php, Gordon et al., 2005). The simulation temperature and pressure were maintained at 298 K and 1 atm, respectively. Na+ and Cl− ions were added to neutralize the charge of the system and the salt concentration was maintained at 0.1 mM. The simulation was carried out in water, with the protein immersed in a truncated octahedral box containing TIP3P model waters (Jorgensen et al., 1983). The box was sized so that its edges lay at least 1.0 nm from the protein. Contacts between the protein and water molecules were controlled by removing any water atom that lay closer to the protein than the van der Waals cut-off radius (≤ 0.9 Å).

4.2.5 Structural visualization, surface properties, intermolecular interactions and flexibility of PMII zymogens

The electrostatic surface potentials of extended and truncated PSPMII zymogens were generated using the Adaptive Poisson-Boltzmann Solver (APBS; Baker et al., 2001), which is a plug-in software package in PyMOL. This program uses the Poisson-Boltzmann equation to evaluate the electrostatic properties of large biomolecules by implementing a parallel multigrid solver algorithm. In order to use APBS, the atomic charges and radii of the selected protein had to be assigned via the PDB2PQR server (http://www.poissonboltzmann.org/pdb2pqr). The electrostatic surface potential was calculated at neutral pH and the pKa values of the ionizable groups were calculated using PROPKA (Hui et al., 2005). Intermolecular interactions (i.e., H-bonds and salt-bridges) between the PS and native PMII (i.e., involving 328 residues starting from Ser-1 to Leu-328) were investigated using the GRASP2 program (Donald and Honing, 2003). All figures were generated using PyMOL Version 10.1.1.

The dynamic mobilities of the predicted PMII zymogen structures were determined based on the B-factor (also known as the temperature factor) of each individual atom. The factor reflects the degree to which the electron density is spread out from an average value (Creighton et al., 1984); more flexible atoms have a larger displacement from the average position than less flexible atoms. In order to determine the flexibility of the predicted structure, the B-factor option must be selected when assigning the restraining forces during the MD simulation. The flexibility of the predicted structure can then be viewed in PyMOL via Action > Preset > b factor putty. In addition to determining the flexibility of the predicted structure, the measurement was also used as a tool to verify the accuracy of the predicted structure by comparing the flexible region of the predicted structure to the crystal structure of truncated PMII zymogen-1PFZ.
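The link between atomic displacement and the B-factor can be made concrete with the standard isotropic relation B = (8π²/3)⟨u²⟩, where ⟨u²⟩ is the mean-square displacement of an atom (the square of its root-mean-square fluctuation, RMSF). The Python sketch below applies this relation; the function name and the RMSF values are hypothetical and are not taken from the simulations in this chapter.

```python
import math


def bfactor_from_rmsf(rmsf_angstrom: float) -> float:
    """Isotropic B-factor (in Å^2) from an RMSF (in Å): B = (8*pi^2/3) * RMSF^2."""
    return (8.0 * math.pi ** 2 / 3.0) * rmsf_angstrom ** 2


# Hypothetical per-residue RMSF values (Å), for illustration only.
example_rmsf = {"rigid core residue": 0.5, "flexible loop residue": 2.0}
for label, rmsf in example_rmsf.items():
    print(f"{label}: RMSF = {rmsf:.1f} Å, B = {bfactor_from_rmsf(rmsf):.1f} Å^2")
```

Because B scales with the square of the displacement, a four-fold increase in RMSF corresponds to a sixteen-fold increase in B-factor, which is why flexible regions stand out so clearly in a putty representation.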

4.3 Results and Discussion

4.3.1 Secondary and tertiary structure of PMII zymogens

Far-UV circular dichroism (CD) spectroscopy was conducted to measure the secondary structure fractions of extended and truncated PSPMII. The far-UV spectra from 190 nm to 240 nm are shown in Figure 4.1. Both zymogens exhibited similar CD spectra, with a minor shift in the extended PSPMII spectrum between 190 and 195 nm. The calculated secondary structures of the two zymogens, however, were not significantly different (p > 0.05) (Figure 4.1). These data were consistent with the secondary structures of the PMII mature enzymes described in Chapter 2 (≈ 12% α-helix and ≈ 42% β-sheet). These findings suggest a high degree of similarity in the secondary structure between the two zymogens and also between the activated enzymes.

[Figure 4.1 plot: ellipticity versus wavelength (190-240 nm) for extended PSPMII and truncated PSPMII.]

Figure 4.1: CD analysis of the extended and truncated PSPMII. CD scans were acquired using 0.2 mg/mL sample in 0.1 M phosphate buffer, pH 7.0. The secondary structures were analyzed via Dichroweb using the analysis programs Selcon3 and CDSSTR with a minimum of four scans. Data are expressed as the mean ± SD of six scans (3 replicates and 2 determinations per replicate). Values in the same column having the same letter are not significantly different at p > 0.05.

[Figure 4.2 plot: intrinsic fluorescence versus wavelength (310-470 nm) for truncated PSPMII, extended PSPMII and +2 PMII mature.]

Figure 4.2: Intrinsic protein (Trp) fluorescence spectra of truncated and extended PSPMII and +2 PMII mature. Fluorescence spectra were determined using 0.05 mg/mL sample and the spectra were normalized with a blank buffer of 0.1 M phosphate buffer, pH 7.0. Data are expressed as the mean ± SD of six scans (3 replicates and 2 determinations per replicate) and the +2 PMII mature spectrum was from the scan generated in Chapter 2, Section 2.3.5.2.

In addition to the CD analysis, intrinsic protein (Trp) fluorescence spectroscopy was conducted to investigate the tertiary structure of the two zymogens (Figure 4.2). Since there were no Trp residues in the PS, the spectra were dependent on the excitation of Trp residues within the mature enzyme (Trp-41, Trp-128, and Trp-192). It is apparent from this figure that the spectra for the two samples were almost identical, indicating similar tertiary conformations. The emission maxima (λmax) for truncated and extended PSPMII were measured at 340.4 and 341.0 nm, respectively.

As compared to the fluorescence spectrum of +2 PMII mature enzyme (see Section 2.3.5.2, maximum wavelength at 338.0 nm), the emission spectra of the two zymogens were moderately red-shifted. This moderate red shift (≈ 2 to 3 nm) indicated that the tertiary conformations of the two zymogens were slightly more exposed to solvent as compared to the mature conformation. For reference, proteins that undergo a major tertiary change upon exposure to solvent experience a larger red shift (≈ 10-20 nm) than those experiencing a minor tertiary conformation change (≤ 5 nm; Vivian and Callis, 2001). The findings indicated that the zymogen has a slightly more solvent-exposed tertiary structure than the +2 PMII mature enzyme, which was consistent with the comparison made using crystallized structures of truncated PMII zymogen-1PFZ (Bernstein et al., 1999) and mature PMII-1SME (Silva et al., 1997).
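The red shifts discussed above follow directly from the reported emission maxima. The short Python sketch below reproduces the arithmetic and applies the thresholds quoted from Vivian and Callis (2001); the "intermediate" label for shifts between 5 and 10 nm is an assumption added here so that the function covers every case, and the names used are illustrative.

```python
MATURE_LAMBDA_MAX_NM = 338.0  # +2 PMII mature (Section 2.3.5.2)
ZYMOGEN_LAMBDA_MAX_NM = {"truncated PSPMII": 340.4, "extended PSPMII": 341.0}


def classify_red_shift(shift_nm: float) -> str:
    """Classify a red shift using the thresholds quoted in the text."""
    if shift_nm <= 5.0:
        return "minor"
    if shift_nm >= 10.0:
        return "major"
    return "intermediate"  # assumed label; the text does not name this range


# Red shift of each zymogen relative to the mature enzyme, rounded to 0.1 nm.
shifts = {
    name: round(lambda_max - MATURE_LAMBDA_MAX_NM, 1)
    for name, lambda_max in ZYMOGEN_LAMBDA_MAX_NM.items()
}
for name, shift in shifts.items():
    print(f"{name}: red shift = {shift} nm ({classify_red_shift(shift)})")
```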

In previous research, the C- and N-domains of PMII zymogen were found to be rotated by ≈ 5° and 14°, respectively, in comparison with the mature form of PMII (Bernstein et al., 1999). Moreover, domain opening in the zymogen form of PMII separated the catalytic Asp residues (Asp-32 and Asp-214) by an additional 3.2 Å (Bernstein et al., 1999). Interaction between the PS and the main body of the enzyme alters the shape of the substrate-binding cleft, thereby affecting substrate recognition. This inhibition mechanism (i.e., separation of the C- and N-domains) retained the PMII zymogen in an inactive form and was completely different from other AP inhibition mechanisms, where the PS blocks the substrate-binding cleft (Khan et al., 1999).

4.3.2 In silico structural study

Although extensive structural studies have been carried out on PMII, most of them focused on the inhibition of mature PMII by potential inhibitors (Silva et al., 1996; Jiang et al., 2001; Asojo et al., 2002). Only a single PMII zymogen structure (1PFZ), however, is available in the PDB and it is identified as the truncated zymogen consisting of 48 PS residues (Bernstein et al., 1999). Since there are no crystal data available for extended PSPMII, in silico approaches were applied to generate and evaluate the protein. The simulated structure of extended PSPMII was compared with truncated PSPMII in order to establish the importance and function of the extra 12 PS residues in protein folding.

4.3.3 Homology modelling of extended PSPMII

The initial structure of extended PSPMII was generated via homology modelling. In homology modelling, the accuracy of the predicted model depends highly on the quality of the alignment between the model sequence and the most similar sequences available within the PDB (Garnier, 1990). Since only a small part of the structure in the extended PSPMII was unknown (the 12 extra PS residues as compared to truncated PSPMII), homology modelling produced high quality initial structural models by reassembling the remaining 376 residues through the closely related crystal template of 1PFZ within the PDB. The predicted structure of extended PSPMII obtained from homology modelling is shown in Figure 4.3.

The predicted structure showed almost perfect alignment with the crystallized structure of PMII zymogen-1PFZ, with an RMSD of 0.102 Å as determined based on the alignment of all atoms in PyMOL. This low RMSD value was not surprising given that the two sequences are identical apart from the extra 12 residues. As there is no structure for extended PSPMII available in the database, the program reassembled the extra 12 PS residues by ab initio modelling and predicted that the extra residues were in a closed coil conformation orientated away from the main body (Figure 4.3).
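For readers unfamiliar with the RMSD metric, the sketch below shows the underlying calculation for two sets of matched atom coordinates. It assumes the structures have already been superimposed (PyMOL performs the superposition before reporting the value), and the coordinates are hypothetical numbers chosen for illustration, not atoms from 1PFZ or the predicted model.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation (Å) between two equal-length lists of (x, y, z)."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared_sum = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared_sum / len(coords_a))


# Hypothetical, already-superimposed coordinates (Å) for two matched atoms.
model_atoms = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
crystal_atoms = [(0.0, 0.0, 0.1), (1.0, 0.1, 0.0)]
print(f"RMSD = {rmsd(model_atoms, crystal_atoms):.3f} Å")
```

Only atoms present in both structures can be paired, so the extra 12 PS residues of the extended model do not contribute to an RMSD computed against 1PFZ.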

Figure 4.3: Superimposition of the homology modelling structure and the crystal structure of 1PFZ. Superimposition of the two structures was conducted using PyMOL. Predicted structure of extended PSPMII is green, the crystal structure of truncated PMII zymogen-1PFZ is red, and the extra 12 PS residues of extended PSPMII are blue. Structures are represented in cartoon mode.

The predicted structure for the extra PS residues should be treated with some caution, as it implies that no contacts occur between the extra 12 PS residues and the protein main body. This caution is based on the fact that homology modelling is not designed to predict the conformation of residues with low or unknown similarity (i.e., residues with no resolved crystal data within the database) due to limitations in the number of force field selections and conformational spaces covered (Blake and Cohen, 2001; Roy et al., 2010).

4.3.4 Molecular dynamics simulations of PMII zymogens

In order to simulate the actual structure of extended PSPMII, the structure generated from homology modelling was further analyzed by MD simulation using GROMACS as outlined in Section 4.2.4. By the end of the simulation, the program had predicted a crescent-shaped structure with a six-stranded β-sheet present in the centre (Figure 4.4). Similar to the crystal structure 1PFZ (Bernstein et al., 1999), the first 48 residues of the extended PS interacted mainly with the C-domain of the protein. The first two helices of the PS were found to interact exclusively with the edge of the C-domain and the β-strand alone interacted with both domains. This β-strand formed the first strand of the central six-stranded β-sheet and interacted with the central portions of the two domains.

There was a major orientation shift within the 12 extra PS residues of the extended PSPMII. During the simulation, the extra residues associated with the closed coil began to unwind and move toward nearby residues in search of a lower energy conformation. By the end of the simulation, these residues pointed primarily towards the other edge of the C-domain (Figure 4.5). With both sides of the C-domain secured, the extended PS further stabilized and minimized movement in the C-domain, thereby increasing its rigidity.

Figure 4.4: Structural comparison between truncated and extended PSPMII. Predicted structures for extended and truncated PSPMII using PyMOL in grid mode (right) and their superimposition (left). All structures are represented in cartoon mode with extended PSPMII in red, the extended PS in orange, the truncated PSPMII in blue and the truncated PS in purple.

In the truncated PSPMII, however, the PS was not long enough to interact with both sides of the C-domain; the C-domain was therefore predicted to be less rigid than in extended PSPMII. The restricted movement of the C-domain is consistent with previous research on domain flexibility in APs by Sali and coworkers (1992), who observed that the two domains of APs experienced rigid body movement during activation and inhibition, with the C-domain experiencing more limited movement than the N-domain.

4.3.5 Electrostatic surface potential of PMII zymogens

Electrostatic surface potential can be used as an important indicator of intermolecular association between the PS and the main body of PMII. In general, the surface of native PMII is rich in acidic, negatively charged residues (in red, Figure 4.5) and its theoretical isoelectric point was calculated as pH 4.8 using ProtParam (http://web.expasy.org). There was also a visible space that separated the two domains in native PMII (Figure 4.5). In the zymogenic form, the space between the two domains was filled by the PS.

Since the extended PS is rich in basic, positively charged residues (in blue) and has a theoretical isoelectric point of pH 9.4 (pH 9.7 for truncated PS), the interactions between the PS and the main body of PMII are mainly electrostatic. Due to the extra residues, the extended PS had two connections, whereas the truncated PS had only one. The two connections of the extended PS interacted with both sides of the C-domain (Figure 4.5). Since the interactions between the main body of PMII and the PS were driven by electrostatic interactions, the charged residues within the extra 12 residues of the extended PS (i.e., Arg-65p, Asp-66p, Glu-68p, Glu-71p and Lys-74p) were predicted to stabilize the extended PSPMII to a greater extent than truncated PSPMII.

Figure 4.5: Electrostatic surface potential of extended PSPMII. (Top) Electrostatic surface potential of the main body of PMII with the N- and C-domains labelled. (Bottom) Electrostatic surface potential of the extended PS (left) and the zymogen complex (right) shown separately in grid mode. Electrostatic surface potential was generated using Adaptive Poisson-Boltzmann Solver (APBS) in PyMOL. Surface color represents the charge distribution with positive, neutral and negatively charged areas represented in blue, white and red, respectively. The electrostatic surface potential of truncated PSPMII was similar to the extended model without the extra 12 residues (highlighted in the orange oval).

4.3.6 Flexibility of extended PSPMII

In general, the predicted structure of extended PSPMII showed a compact conformation with some flexible sections on the surface (Figure 4.6). Similar to the crystal structure of 1PFZ, a flexible region was detected in the area where the PS was positioned and at the Tyr-Asp loop. The predicted PS was very flexible in the region involving the two helices and within the extra 12 residues. The solitary β-strand of the PS was shown to be rigid as it was positioned in the middle of the two domains and was a central component of the six-stranded β-sheet region. As shown by Khan and colleagues (1999), during in vitro autoactivation of PMII zymogen, the stabilizing interactions between the PS and the main body of PMII are disrupted, which allows proteolysis of the PS and results in an active mature enzyme.

Since the PS is removed during autoactivation, the large flexibility within the PS, as indicated by a thicker backbone in Figure 4.6, was not surprising. The presence of the extra 12 flexible residues within the extended PS further increased the overall PS flexibility, which permitted proper formation of an active mature enzyme. Furthermore, most of the flexible segments identified in this study (i.e., beginning at the N-terminus of mature enzyme: Ser-1:Phe-29, the flap of Glu-73:Thr-81, the helix of Asp-106:Ala-116 and the loop of Lys-129:Ser-135) experienced major conformational changes during the activation of mature PMII, which is in accordance with the results from a study on truncated PMII zymogen and activated mature enzyme conducted by Bernstein et al. (1999). The similarity in the flexible regions between the two structures supports the predicted structure.

Figure 4.6: Flexibility of the predicted structure of extended PSPMII superimposed with the crystal structure of 1PFZ. The fluctuation of the predicted structure was determined using PyMOL based on the calculation of the B-factor of an atom. Flexible regions are visualized with a thicker backbone, the PS is dark red, and the extra 12 PS residues are highlighted in the black oval.

4.3.7 Intermolecular contacts between the PS and main body of PMII

Only when the PS has been detached from the main body of PMII can it be enzymatically cleaved to produce active enzyme. Therefore, it is important to identify the PS contacts with the main body of PMII that stabilize the zymogen. The present study examined these stabilizing forces, specifically the H-bonds and salt-bridge interactions. The numbers of H-bonds and salt-bridge interactions between the truncated and extended PSs and the main body of PMII, as generated from the simulation structure, are summarized in Table 4.1.

Table 4.1: H-bond and salt-bridge interactions between the extended and truncated PSs and native PMII

No. of interactions    Truncated PSPMII    Extended PSPMII
H-bonds                25                  27 (+2 from extra residues)
Salt-bridges           2                   7 (+5 from extra residues)
Total                  27                  34

In total, 27 interactions were observed between the truncated PS and the main body of PMII, of which 25 were H-bonds and two were salt-bridges. The solitary β-strand (Leu-80p:Glu-87p) and the second α-helix (Asn-88p:Thr-99p) regions in the PS demonstrated the highest number of H-bonds (i.e., nine and ten interactions, respectively) with native PMII (Figure 4.7).

Meanwhile, the contact made between Arg-28p and Asp-70 further stabilized the conformation with two extra salt-bridge interactions. As outlined in Table 4.1, the number of H-bonds and salt-bridge interactions increased with the number of PS residues. In total, 34 interactions were detected that stabilized the extended PSPMII. There were five additional salt-bridge interactions as compared to truncated PSPMII: three from the interaction between the cationic ammonium in Arg-65p and main body residues and two between the anionic carboxylate in Glu-71p and main body residues (Figure 4.8). The additional interactions were mainly due to increased numbers of charged residues in the extended PS that helped stabilize the extended PSPMII conformation.
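Salt-bridge detection of this kind reduces to a distance test between oppositely charged groups. The sketch below applies the ≤ 4 Å criterion stated in the caption of Figure 4.8 to labelled atom positions; it is not the GRASP2 implementation, and the function name, atom labels and coordinates are hypothetical values chosen only to show the logic.

```python
import math


def find_salt_bridges(cations, anions, cutoff=4.0):
    """Return (cation, anion, distance) for every charged pair within the cutoff (Å).

    `cations` and `anions` map an atom label to its (x, y, z) position in Å.
    """
    pairs = []
    for cation_label, cation_xyz in cations.items():
        for anion_label, anion_xyz in anions.items():
            distance = math.dist(cation_xyz, anion_xyz)
            if distance <= cutoff:
                pairs.append((cation_label, anion_label, round(distance, 2)))
    return pairs


# Hypothetical coordinates (Å); labels are illustrative, not taken from the model.
cationic_atoms = {"Arg (PS) NH1": (0.0, 0.0, 0.0)}
anionic_atoms = {
    "Asp (main body) OD1": (3.0, 0.0, 0.0),  # within the 4 Å cut-off
    "Glu (main body) OE1": (6.0, 0.0, 0.0),  # too far to count
}
bridges = find_salt_bridges(cationic_atoms, anionic_atoms)
print(bridges)
```

A single charged side chain can satisfy the criterion with more than one partner atom, which is one way a residue such as Arg-65p can contribute several salt-bridge interactions to the totals in Table 4.1.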

Disruption of several H-bond and salt-bridge interactions between the PS and the protein main body at low pH has been identified as an important factor for the activation of many APs including cathepsin D (Benes, 2008), pepsin (Davies, 1990; Dee and Yada, 2010), and plasmepsin (Bernstein et al., 1999; Friedman and Caflisch, 2008). As previously shown in Chapter 2, Section 2.3.3, both zymogens were capable of autoactivating in an acidic environment. In the truncated zymogen, the PS was removed due to the disruption and neutralization of multiple H-bond and salt-bridge interactions involving Asp and Glu residues (e.g., Tyr-Asp loop; Lys-107p:Glu-180p; Asp-91p:His-164; Glu-87p:Arg-92p; Bernstein et al., 1999). Because Asp and Glu side chains are deprotonated at the activation pH of 4.7 (Kelch et al., 2012), decreasing the pH toward an acidic environment may lead to the further loss of H-bonds and salt-bridge interactions that involve these residues due to neutralization of the charged carboxylic acid group.
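The pH dependence described above can be estimated with the Henderson-Hasselbalch relation, under which the deprotonated (charged) fraction of a carboxylic acid side chain is 1 / (1 + 10^(pKa − pH)). The sketch below uses approximate model-compound pKa values for Asp and Glu (about 3.9 and 4.3); these are textbook approximations assumed for illustration, and the effective pKa of a residue buried in a protein or engaged in a salt bridge can differ substantially.

```python
def fraction_deprotonated(pka: float, ph: float) -> float:
    """Henderson-Hasselbalch fraction of an acid in its deprotonated (charged) form."""
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


# Approximate model-compound side-chain pKa values (assumed, not measured for PMII).
MODEL_PKA = {"Asp": 3.9, "Glu": 4.3}

for ph in (7.0, 4.7, 3.5):
    fractions = ", ".join(
        f"{residue}: {fraction_deprotonated(pka, ph):.2f}"
        for residue, pka in MODEL_PKA.items()
    )
    print(f"pH {ph}: charged fraction -> {fractions}")
```

Under these assumed pKa values, most Asp and Glu side chains are still charged at pH 4.7, and the charged fraction falls steeply as the pH drops further, which is consistent with the progressive loss of salt bridges on acidification.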

Figure 4.7: H-bonds between the β-sheet and α-helix of the PS and the adjacent residues in the main body of native PMII (Top) H-bonds in the β-sheet. (Bottom) H-bonds in the α-helix. The PS is cyan and is displayed in both cartoon and sticks representation. The adjacent residues are displayed in sticks representation and are colored by the default element in PyMOL. Dashed lines indicate H-bonds as determined using GRASP2.

Figure 4.8: Salt-bridge interactions between the extra PS residues and adjacent charged residues within the main body of PMII. Salt-bridge interactions were measured using GRASP2 in default setting where the distance between anionic carboxylic and cationic ammonium charges was set to ≤ 4 Å.

Even though PS length did not inhibit the autoactivation of the two zymogens, it had an interesting effect on the cleavage position. Truncated PSPMII produced a mature enzyme with an extra 12 residues attached to the N-terminus, whereas the extended PSPMII produced a mature form with two additional residues (refer to Chapter 2, Section 2.3.3). The presence of a large number of charged residues within the extended PS may have provided greater repulsion between the PS and the main body of the enzyme and thus resulted in a greater opening of the Tyr-Asp loop during activation. This opening may have exposed the deeper scissile bond between Tyr-3 and Leu-2. In truncated PSPMII, however, the Tyr-Asp loop opening was not large enough to expose these two residues. Therefore, the activation took place at the more solvent-exposed scissile bond between Phe-13 and Leu-12. The opening of the Tyr-Asp loop during acidification has also been observed during the in vivo activation of PMII by maturase where the extended conformation (i.e., full length PS) was more suitable for proteolytic cleavage (Khan et al., 1999).

4.4 Conclusion

In order to determine the effect of PS length on PMII folding, a series of structural analyses was conducted on truncated and extended PSPMII. In addition to in vitro structural analyses, in silico approaches were applied to generate and compare the predicted three-dimensional structures of these two zymogens.

In this chapter, CD and intrinsic protein (Trp) fluorescence spectroscopy were used to determine the secondary and tertiary structures, respectively, of extended and truncated PSPMII. CD analysis revealed no significant secondary structure differences between the two zymogens. These data were similar to those for the mature PMII enzymes (i.e., high in β-sheets and low in α-helices). Intrinsic protein (Trp) fluorescence spectroscopy, however, revealed differences between the tertiary structures of the zymogens and the +2 PMII mature enzyme. The truncated and extended PSPMII had more solvent-exposed structures than the mature PMII, as indicated by a red shift of λmax to 340.4 and 341.0 nm, respectively, as compared to 338.0 nm for +2 PMII mature. These results highlighted the unique inhibition feature of the PS, in which it is believed to separate the two domains, thus stabilizing the catalytic dyad in a distorted conformation (Bernstein et al., 1999).

Since no crystal structure of extended PSPMII was available in the PDB, its predicted three-dimensional structure was generated in silico. Truncated PSPMII was predicted the same way, even though its structure has been crystallized (1PFZ; Bernstein et al., 1999). A comparison between the two showed that the structures were almost identical (i.e., RMSD measured at 0.102 Å). Due to the low sequence similarity and limitations in homology modelling force field selection, the extra 12 residues in the extended PS were predicted to be oriented away from the main body, which prevented contact with nearby residues. This prediction, however, was not reasonable considering the abundance of charged residues within the extended PS. Therefore, MD simulation was conducted on the two zymogens using GROMACS. Results from the predicted zymogens showed both PSs orientated toward the middle of PMII with more contacts measured within the C-domain as compared to the N-domain. Due to the longer PS in extended PMII zymogen, two PS sections were retained at both edges of the C-domain, whereas only one section was detected in truncated PMII zymogen.

Electrostatic surface potential showed that the interaction between the PS and the main protein body was driven by electrostatic interactions. Both the PS and native PMII had distinct distributions of charged residues; the PS was rich in positively charged residues and the native enzyme was rich in negatively charged residues. In total, there were 27 electrostatic interactions between the truncated PS and the main body of PMII (25 H-bonds and 2 salt-bridges). The presence of 12 extra residues in the extended PS increased the number of electrostatic interactions to 34 (two additional H-bonds and five additional salt-bridges between Arg-65p and Glu-71p and adjacent residues).

The extra interactions of the extended PS did not hinder the autoactivation of the zymogen; instead, extended PSPMII produced a mature enzyme that was more similar to native PMII than the mature enzyme produced by truncated PSPMII (+2 versus +12 extra residues, respectively). Since the extended PS was longer, more flexible and contained a larger number of charged residues than the truncated PS, the extended PSPMII was proposed to expose the deeper cleavage site of Tyr-3:Leu-2 through a large repulsion between the PS and the main body of PMII. Therefore, it is suggested that the extended PS was more appropriate than the truncated PS in overall PMII folding since it produced a mature enzyme that was more similar to native (wild-type) PMII.

CHAPTER 5 Conclusion and future recommendations

5.1 Conclusion

PMII has been investigated for over two decades (e.g., as a hemoglobinase involved in malarial infection) with most research focusing on recombinant expression, crystallographic determination and development of enzyme inhibitors (Asojo et al., 2002; Clement et al., 2006; Kim et al., 2006). Surprisingly, there has been very little effort to understand the fundamental folding mechanism and the involvement of the PS. Therefore, this thesis investigated the folding and activation of PMII and the role of the PS in these processes.

Three different constructs of PMII, each with a PS of different length, were designed and expressed in order to undertake a structure – function study of PMII and its zymogen: extended PSPMII (60 residues), truncated PSPMII (48 residues) and NoProPMII (no PS).

In the absence of a PS, the NoProPMII mature enzyme had a conformation similar to native enzyme with secondary structure fractions representative of an AP, i.e., high β-sheet and low α-helix content (Robbins et al., 2009). The mature enzyme, however, had a slightly more solvent-exposed conformation, the lowest thermodynamic stability and lower enzymatic activity when compared to mature enzymes obtained with a PS. Based on these findings, it is suggested that PMII was able to fold into a structure that resembled wild-type PMII in secondary structure without the help of a PS, but still required the PS to refine the structure and produce a properly folded native structure.

Extended PSPMII and truncated PSPMII produced active mature enzymes with similar secondary and tertiary structures. Major differences between these two constructs were apparent when the zymogens were autoactivated to produce mature enzymes. Based on N-terminus analysis, extended PSPMII produced a mature enzyme with two extra residues (+2 PMII mature) as compared to native PMII, whereas truncated PSPMII produced a longer mature enzyme with 12 extra residues (+12 PMII mature). The longer PS produced a mature enzyme with higher thermostability and activity than the shorter PS. This finding supports the suggestion that mature enzyme that has been produced or activated from a more native-like zymogen is more active and stable (Tyas et al., 1999; Gulnik et al., 2002).

In Chapter 3, a kinetic study of PMII folding was conducted in the presence and absence of PSs of different lengths (60 and 48 residues). The inactive refolded (Rp) state obtained from +2 PMII mature was thermodynamically more stable than the native PMII (Np) state. Np had slow rates of folding and unfolding, as indicated by large energy barriers to fold and unfold of 25.12 and 24.50 kcal/mol, respectively. Therefore, Np was unable to refold within a reasonable timeframe under conditions that favored recovery and, instead, the enzyme spontaneously folded into the thermodynamically more stable Rp state.
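The barrier heights quoted above can be related to approximate time scales through the Eyring equation of transition state theory. The following is a minimal sketch rather than the analysis used in this thesis: it assumes a temperature of 25 °C and a transmission coefficient of 1, neither of which is stated in this chapter, and the function names `eyring_rate` and `half_life_hours` are hypothetical. It also assumes that both barriers are measured to the same transition state, so that their difference gives the stability of Rp relative to Np.

```python
import math

# Physical constants.
K_B = 1.380649e-23      # Boltzmann constant, J/K
H = 6.62607015e-34      # Planck constant, J*s
R_KCAL = 1.987204e-3    # gas constant, kcal/(mol*K)


def eyring_rate(barrier_kcal_per_mol, temp_k=298.15):
    """Rate constant (1/s) for a given free energy barrier (Eyring equation).

    Assumes a transmission coefficient of 1 and, by default, 25 degrees C;
    both are illustrative assumptions, not values taken from the thesis.
    """
    prefactor = K_B * temp_k / H
    return prefactor * math.exp(-barrier_kcal_per_mol / (R_KCAL * temp_k))


def half_life_hours(rate_per_s):
    """Half-life of a first-order process, converted to hours."""
    return math.log(2) / rate_per_s / 3600.0


# Barriers reported in this chapter (kcal/mol).
BARRIER_FOLD = 25.12    # Rp -> Np
BARRIER_UNFOLD = 24.50  # Np -> Rp

# Free energy of Np relative to Rp; a positive value means Rp is more stable.
delta_g_np_minus_rp = BARRIER_FOLD - BARRIER_UNFOLD

if __name__ == "__main__":
    for label, barrier in (("folding", BARRIER_FOLD), ("unfolding", BARRIER_UNFOLD)):
        k = eyring_rate(barrier)
        print(f"{label}: k = {k:.2e} 1/s, half-life = {half_life_hours(k):.0f} h")
    print(f"G(Np) - G(Rp) = {delta_g_np_minus_rp:.2f} kcal/mol")
```

Under these assumptions both processes have half-lives on the order of days, in line with the slow folding and unfolding described above, and the 0.62 kcal/mol difference between the two barriers is the amount by which Rp would be more stable than Np.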

In the presence of the PS, the height of the free energy barrier of PMII was reduced and the overall folding kinetics were therefore changed. Both PSs (extended and truncated) displayed high binding affinity toward Np and Rp, and binding further stabilized these two states. The Np-PS complex was stabilized to a greater extent and was thermodynamically more stable than the Rp-PS complex, which provided the driving force necessary for the formation of Np. The PS greatly catalyzed PMII folding (18,550- and 11,258-fold for the extended and truncated PS, respectively), and was seen to bind more tightly to the folding transition state (the intermediate phase during the conversion of Rp to Np) than to Np or Rp. The extended PS lowered the transition state barrier by 12.37 kcal/mol (12.03 kcal/mol for the truncated PS) and allowed PMII to fold into its metastable native conformation within a biologically relevant timeframe.

To help understand the function of the PS and the effect of its length on PMII folding, the structures of extended and truncated PSPMII were compared (Chapter 4). Both zymogens had similar secondary and tertiary structures, which indicated that PS length had no major effect on PMII zymogen structure. When these two structures were compared with the activated mature enzyme, however, the zymogens had a slightly more solvent-exposed conformation, which demonstrated that the PS pushed the two domains apart. This unique feature allowed the zymogen to maintain an inactive form in which the catalytic dyad was in a distorted conformation (Bernstein et al., 1999). The conformation and contacts made by the extended PS were measured and compared to those of the truncated PS. An in silico approach was used to examine the two zymogen structures. Given that no crystal structure was available, extended PSPMII was modelled via a molecular dynamics simulation using GROMACS. In the predicted structure, the majority of PS residues were positioned in the middle of the PMII body. The extended PS formed two regions that complemented the edge of the C-domain, whereas the truncated PS formed only one region.

Contact between the PS and the main protein body was driven by electrostatic interactions. The 12 extra PS residues in the extended PS contributed an additional 2 H-bonds and 5 salt-bridge interactions. These extra interactions did not hinder autoactivation, but rather helped produce a mature enzyme with only two extra residues at the N-terminus as compared to wild-type PMII. It is proposed that, during autoactivation, the longer PS caused larger repulsion between the PS and the main PMII body, thus exposing the deeper cut site of Tyr-3:Leu-2.

In summary, the results highlighted the importance of the PS in refining and catalyzing PMII folding. With regard to PS length, the extended PS produced a mature enzyme that more closely mimicked the native enzyme and was more efficient in catalyzing PMII folding than the truncated PS. Therefore, extended PSPMII could be a good model to emulate full-length proPMII. As PMII is one of four homologous enzymes (PMI, PMII, PMIV and HAP) found in malarial parasites (Plasmodium spp.; Banerjee et al., 2002), the results of the present study may be applicable to the other three enzymes. This information is critical as these enzymes are hemoglobinases that act during the early stages of malarial infection. The PS, with its role in proper enzyme folding, may be a key point at which to deactivate these enzymes.

5.2 Future recommendations

The current findings have added substantially to our understanding of the structure-function relationship between the PS and the mature PMII enzyme. Information generated from this thesis highlighted the importance of the PS in assisting PMII folding. In this study, the extended PS was more efficient at producing a mature enzyme with higher activity that more closely resembled native PMII. It is predicted that the large number of charged residues in the extended PS caused larger repulsion during activation, and thus exposed the deeper cut site of Tyr-3:Leu-2. This explanation would be more convincing, however, if an activation model were built for the exact environment in which autoactivation takes place (sodium acetate buffer, pH 4.7). The dynamics of PMII during autoactivation (i.e., the opening of the PS to expose the Tyr-3:Leu-2 cut site) may provide direct evidence to support this prediction, and the entire event could be monitored through in silico simulations. For this purpose, molecular dynamics simulations of the selected protein in the specific autoactivation environment using GROMACS or other programs (e.g., AMBER, Spartan, Desmond) could be performed and compared for more accurate results.

Despite the ability of computational modelling to predict the conformation of the extended zymogen, the accuracy of the predicted structure remains uncertain until the structure is solved. As our lab has already established the protocol for expressing the extended zymogen, it is strongly suggested that the structure of this construct be solved by X-ray crystallography or NMR. Structural information generated by these two methods can precisely locate the PS and identify its interactions with the main protein body.

Data generated from the thesis indicated that both PSs were strong inhibitors of PMII activity and that the extended PS was stronger than the truncated PS. To date, detailed PMII inhibition studies based on PS properties have not been conducted, but it would be interesting to use the PS as a target in designing an inhibitor. Since the naturally occurring PS is long, it is suggested that the PS be split into smaller segments in order to identify the segment with the most inhibitory potential. Compared with conventional inhibition methods, which target the active site, the unique inhibition features of the PS (pushing the two domains apart and disrupting the catalytic dyad) offer considerable advantages in designing a PMII inhibitor. Since mature PMII shares significant similarity with other human-related APs, attempts to design an inhibitor specific to PMII have not been fully optimized because of cross-reactivity (e.g., pepstatin-derived inhibitors are known to inhibit most pepsin-like structures). Inhibition based on the unique PS, however, may prevent unwanted cross-reactions.

REFERENCES

Agard, D.A. (1993). To fold or not to fold. Science. 260, 1903-1904.

Andreeva, N.S. and Rumsh, L.D. (2001). Analysis of crystal structures of aspartic proteinases: on the role of amino acid residues adjacent to the catalytic site of pepsin-like enzymes. Protein Sci. 12, 2439-2450.

Anfinsen, C.B. (1973). Principles that govern the folding of protein chains. Science. 181, 223-230.

Asoja, O.A., Afonina, E., Gulnik, S.S., Yu, B., Erikson, J.W., Randad, R., Medjahed, D. and Silva, A.M. (2002). Structures of Ser205 mutant plasmepsins II from Plasmodium falciparum at 1.8 Å in complex with the inhibitors rs367 and rs370. Acta Crystallogr D Biol Crystallogr. 58, 2001-2008.

Asoja, O.A., Gulnik, S.V., Afonina, E., Yu, B., Ellman, J.A., Hague, T.S. and Silva, A.M. (2003). Novel uncomplexed and complexed structures of plasmepsin II, an aspartic protease from Plasmodium falciparum. J Mol Biol. 327, 173-181.

Baardsnes, J., Sidhu, S., Macleod, A., Elliot, J., Morden, D., Watson, J. and Borgford, T. (1998). Streptomyces griseus protease B: secretion correlates with the length of the propeptide. J Bacteriol. 180, 3241-3244.

Baker, D. (1998). Metastable states and folding free energy barriers. Nat Struct Biol. 5, 1021-1024.

Baker, D., Shiau, A.K. and Agard, D.A. (1993). The role of pro regions in protein folding. Curr Opin Cell Biol. 5, 966-970.

Baker, D., Silen, J.L. and Agard, D.A. (1992). Protease pro region required for folding is a potent inhibitor of the mature enzyme. Proteins. 12, 339-344.

Baker, D., Sohl, J.L. and Agard, D.A. (1992). A protein-folding reaction under kinetic control. Nature. 356, 263-265.

Baker, N.A., Sept, D., Joseph, S., Holst, M.J. and McCammon, J.A. (2001). Electrostatics of nanosystems: application to microtubules and the ribosome. Proc Natl Acad Sci USA. 98, 10037-10041.

Banerjee, R., Francis, S.E. and Goldberg, D.E. (2003). Food vacuole plasmepsins are processed at a conserved site by an acidic convertase activity in Plasmodium falciparum. Mol Biochem Parasitol. 129, 157-165.

Banerjee, R., Liu, J., Beatty, W., Pelosof, L., Klemba, M. and Goldberg, D.E. (2002). Four plasmepsins are active in the Plasmodium falciparum food vacuole, including a protease with an active-site histidine. Proc Natl Acad Sci USA. 99, 990-995.

Benes, P., Vetvicka, V. and Fusek, M. (2008). Cathepsin D – many functions of aspartic protease. Crit Rev Oncol Hematol. 68, 12-28.

Bernstein, N.K. and James, M.N. (1999). Novel ways to prevent proteolysis - prophytepsin and proplasmepsin II. Curr Opin Struct Biol. 9, 684-689.

Bernstein, N.K., Cherney, M.M., Loetscher, H., Ridley, R.G. and James, M.N. (1999). Crystal structure of the novel aspartic proteinase zymogen proplasmepsin II from Plasmodium falciparum. Nat Struct Biol. 1, 32-37.

Berry, C., Humphreys, M.J., Matharu, P., Granger, R., Horrocks, P., Moon, R.P., Certa, U., Ridley, R.G., Bur, D. and Kay, J. (1999). A distinct member of the aspartic proteinase gene family from the human malaria parasite Plasmodium falciparum. FEBS Lett. 447, 149-154.

Best, R.B. and Hummer, G. (2009). Optimized molecular dynamics force fields applied to the helix-coil transition of polypeptides. J Phys Chem B. 113, 9004–9015.

Beyer, B.M. and Dunn, B.M. (1996). Self-activation of recombinant lysosomal procathepsin D at a newly engineered cleavage junction, "short" pseudocathepsin D. J Biol Chem. 271, 15590-15596.

Beyer, B.M. and Dunn, B.M. (1998). Prime region subsite specificity characterization of human cathepsin D: The domain role position 128. Protein Sci. 7, 88-95.

Bhaumik, P., Xiao, H., Parr, C.L., Kiso, Y., Gustchina, A., Yada, R.Y. and Wlodawer, A. (2009). Crystal structures of the histo-aspartic protease (HAP) from Plasmodium falciparum. J Mol Biol. 388, 520-540.

Bhaumik, P., Gustchina, A. and Wlodawer, A. (2012). Structural studies of vacuolar plasmepsins. Biochim Biophys Acta. 1824, 207-223.

Blake, J.D. and Cohen, F.E. (2001). Pairwise sequence alignment below the twilight zone. J Mol Biol. 307, 721-735.

Bonilla, J.A. (2006). Assessing the function of the aspartic proteinases of the Plasmodium falciparum digestion vacuole using gene-knockout strategies (Doctoral dissertation). Retrieved from ProQuest Dissertations and Theses. (Accession Order No. UFE0012809).

Bryan, P.N. (2002). Prodomain and protein folding catalysis. Chem Rev. 102, 4805-4815.

Chen, Y.J. and Inouye, M. (2008). The intramolecular chaperone-mediated protein folding. Curr Opin Struct Biol. 18, 765-770.

Chiu, M.H. and Prenner, E.J. (2011). Differential scanning calorimetry: An invaluable tool for a detailed thermodynamic characterization of macromolecules and their interactions. J Pharm Bioallied Sci. 3, 39-59.

Chu, A.H. and Ackers, G.K. (1991). Mutual effect of protons, NaCl and oxygen on the dimer-tetramer assembly of human hemoglobin; the Dimer Bohr Effect. J Biol Chem. 256, 1199-1205.

Clemente, J.C., Govindasamy, L., Madabushi, A., Fisher, S.Z., Moose, R.E., Yowell, C.A., Hidaka, K., Kimura, T., Hayashi, Y., Kiso, Y., Agbandje-McKenna, M., Dame, J.B., Dunn, B.M. and McKenna, R. (2006). Structure of the aspartic protease plasmepsin 4 from the malaria parasite Plasmodium malariae bound to an allophenylnorstatine-based inhibitor. Acta Crystallogr D Biol Crystallogr. 62, 246-252.

Coates, L., Han-Fang, T., Tomanicek, S., Kovalevsky, A., Mustyakimov, M., Erskine, P. and Cooper, J. (2008). The catalytic mechanism of an aspartic proteinase explored with neutron and x-ray diffraction. J Am Chem Soc. 130, 7235-7237.

Coatney, G.R., Collins, W.E., Warren, M. and Contacos, P.G. (1971). The primate malarias: Evaluation of the primate malarias. National Institute of Health; National Institute of Allergy and Infectious Diseases (pp. 353-354). Bethesda, Maryland.

Conner, G.E. (1992). The role of the cathepsin D propeptide in sorting to the lysosome. J Biol Chem. 267, 21738-21745.

Creighton, T.E. (1984). Proteins Structures and Molecular Properties (Chapter 6, pp. 204-220). New York: W.H. Freeman & Co.

Cunningham, E.L., Jaswal, S.S., Sohl, J.L. and Agard, D.A. (1999). Kinetic stability as a mechanism for protease longevity. Proc Natl Acad Sci USA. 96, 11008-11014.

Dame, J.B., Reddy, G.R., Yowell, C.A., Dunn, B.M., Kay, J. and Berry, C. (1994). Sequence, expression and modeled structure of an aspartic proteinase from the human malaria parasite Plasmodium falciparum. Mol Biochem Parasitol. 64, 177-190.

Davies, D.R. (1990). The structure and function of the aspartic proteases. Annu Rev Biophys Chem. 19, 189-215.

Dee, D.R. and Yada, R.Y. (2010). The prosegment catalyzes pepsin folding to a kinetically trapped native state. Biochemistry. 49, 365-371.

Dee, D.R., Filonowics, S., Horimoto, Y. and Yada, R.Y. (2009). Recombinant prosegment peptide acts as a folding catalyst and inhibitor of native pepsin. Biochim Biophys Acta. 1794, 1795-1801.

Demidyuk, I.V., Shubin, A.V., Gasanov, E.V. and Kostrov, S.V. (2010). Propeptides as modulators of functional activity of proteases. BioMol Concepts. 31, 305-322.

Derman, A., Prinz, W.A., Belin, D. and Beckwith, J. (1993). Mutations that allow disulfide bond formation in the cytoplasm of Escherichia coli. Science. 5140, 1744-1747.

Derman, A.I. and Agard, D.A. (2000). Two energetically disparate folding pathways of alpha-lytic protease share a single transition state. Nat Struct Biol. 5, 394-397.

Dill, K.A. and MacCallum, J.L. (2012). The protein-folding problem, 50 years on. Science. 338, 1042-1046.

Donald, P. and Honing, B. (2003). GRASP2: visualization, surface properties, and electrostatics of macromolecular structures and sequences. Methods Enzymol. 374, 492-509.

Dunn, B. (2002). Structure and mechanism of the pepsin-like family of aspartic peptidases. Chem Rev. 102, 4431-4458.

Eder, J. and Fersht, A.R. (1995). Pro-sequence-assisted protein folding. Mol Microbiol. 4, 609-614.

Eder, J., Rheinnecker, M. and Fersht, A.R. (1993). Folding of subtilisin BPN': role of the pro-sequence. J Mol Biol. 233, 293-304.

Eftink, M.R. (2000). Topics in Fluorescence Spectroscopy (Vol. 6, p. 27). New York: Kluwer Academic/Plenum Publishers (J.R. Lakowicz, ed.).

Eftink, M.R. (1994). The use of fluorescence methods to monitor unfolding transitions in proteins. Biophys J. 66, 482-501.

Ersmark, K.S., Samuelsson, B. and Hallberg, A. (2006). Plasmepsins as potential targets for new antimalarial therapy. Med Res Rev. 26, 626-666.

Eswar, N., Webb, B., Marti-Renom, M.A., Madhusudhan, M.S., Eramian, D., Shen, M.Y., Pieper, U. and Sali, A. (2006). Comparative protein structure modeling with MODELLER. Curr Protoc Bioinformatics. 5, Unit 5.6.

Fakuda, R., Horiuchi, H., Ohta, A. and Takagi, M. (1994). The Prosequence of Rhizopus niveus aspartic proteinase-I supports correct folding and secretion of its mature part in Saccharomyces cerevisiae. J Biol Chem. 269, 9556-9561.

Fakuda, R., Umebayashi, K., Horiuchi, H., Ohta, A. and Takagi, M. (1996). Degradation of Rhizopus niveus aspartic proteinase-I with mutated prosequences occurs in the endoplasmic reticulum of Saccharomyces cerevisiae. J Biol Chem. 271, 14252-14255.

Fisher, K.E., Ruan, B., Alexander, P.A., Wang, L. and Bryan, P.N. (2007). Mechanism of the kinetically-controlled folding reaction of subtilisin. Biochemistry. 46, 640-651.

Floudas, C.A. (2007). Computational methods in protein structure prediction. Biotechnol Bioeng. 2, 207-213.

Fortenberry, S.C. and Chirgwin, J.M. (1995). The propeptide is nonessential for the expression of human cathepsin D. J Biol Chem. 270, 9778-9782.

Francis, S.E., Banerjee, R. and Goldberg, D.E. (1997). Biosynthesis and maturation of the malaria aspartic hemoglobinases plasmepsins I and II. J Biol Chem. 272, 14961-14968.

Francis, S.E., Gluzman, I.Y., Oksman, A., Knickerbocker, A., Mueller, R., Bryant, M.L., Sherman, D.R., Russell, D.G. and Goldberg, D.E. (1994). Molecular characterization and inhibition of a Plasmodium falciparum aspartic hemoglobinase. EMBO J. 13, 306-317.

Francis, S.E., Sullivan, D.R. and Goldberg, D.E. (1997). Hemoglobin metabolism in the malaria parasite Plasmodium falciparum. Annu Rev Microbiol. 51, 97-123.

Friedman, R. and Caflisch, A. (2008). Pepsinogen-like activation intermediate of plasmepsin II revealed by molecular dynamics analysis. Proteins. 4, 814-827.

Fruton, J. (2002). A history of pepsin and related enzymes. Q Rev Biol. 77, 127-143.

Fusek, M., Mares, M., Vagner, J., Voburka, Z. and Baudys, M. (1991). Inhibition of aspartic proteases by propart peptides of human procathepsin D and chicken pepsinogen. FEBS Lett. 287, 160-162.

Gardiner, D.L., Skinner-Adams, T.S., Brown, C.L., Andrews, K.T., Stack, C.M., McCarthy, J.S., Dalton, J.P. and Trenholme, K.R. (2009). Plasmodium falciparum: new molecular targets with potential for antimalarial drug development. Expert Rev Anti Infect Ther. 9, 1087-1093.

Gardner, M.J., Hall, N., Fung, E., White, O., Berriman, M., Hyman, R.W., Carlton, J.M., Pain, A., Nelson, K.E., Bowman, S., Paulsen, I.T., James, K., Eisen, J.A., Rutherford, K., Salzberg, S.L., Craig, A., Kyes, S., Chan, M.S., Nene, V., Shallom, S.J., Suh, B., Peterson, J., Angiuoli, S., Pertea, M., Allen, J., Selengut, J., Haft, D., Mather, M.W., Vaidya, A.B., Martin, D.M., Fairlamb, A.H., Fraunholz, M.J., Roos, D.S., Ralph, S.A., McFadden, G.I., Cummings, L.M., Subramanian, G.M., Mungall, C., Venter, J.C., Carucci, D.J., Hoffman, S.L., Newbold, C., Davis, R.W., Fraser, C.M. and Barrell, B. (2002). Genome sequence of the human malaria parasite Plasmodium falciparum. Nature. 6906, 498-511.

Garnier, J. (1990). Protein structure prediction. Biochimie. 72, 513-524.

Gordon, J.C., Myers, J.B., Folta, T., Shoja, V., Heath, L.S. and Onufriev, A. (2005). H++: a server for estimating pKas and adding missing hydrogens to macromolecules. Nucleic Acids Res. 33, 368-371.

Gulnik, S.V., Afonina, E.I., Gustchina, E., Yu, B., Silva, A.M., Kim, Y. and Erikson, J.W. (2002). Utility of (His)6 tag for purification and refolding of proplasmepsin-2 and mutants with altered activation properties. Protein Expr Purif. 24, 412-419.

Gutiérrez-González, L.H., Rojo-Domínguez, A., Cabrera-González, N.E., Pérez-Montfort, R. and Padilla-Zúñiga, A.J. (2006). Loosely packed papain prosegment displays inhibitory activity. Arch Biochem Biophys. 446, 151-160.

Hartsuck, J.A., Koelsch, G. and Remington, S.J. (1992). The high-resolution crystal structure of porcine pepsinogen. Proteins. 1, 1-25.

Hill, J., Tyas, L., Phylip, L.H., Kay, J., Dunn, B.M. and Berry, C. (1994). High level expression and characterisation of Plasmepsin II, an aspartic proteinase from Plasmodium falciparum. FEBS Lett. 353, 155-158.

Horimoto, Y., Dee, D.R. and Yada, R.Y. (2009). Multifunctional aspartic peptidase prosegments. N Biotechnol. 5, 318-324.

Hui, L., Robertson, A.D. and Jensen, J.H. (2005). Very fast empirical prediction and interpretation of protein pKa values. Proteins. 61, 704-721.

Ikemura, H., Takagi, H. and Inouye, M. (1987). Requirement of prosequence for the production of active subtilisin E in Escherichia coli. J Biol Chem. 262, 7859-7864.

Istvan, E.S. and Goldberg, D.E. (2005). Distal substrate interactions enhance plasmepsin activity. J Biol Chem. 280, 6890-6896.

James, M.N.G. (2004). Catalytic pathway of aspartic peptidases. In Handbook of Proteolytic Enzymes (2nd Edition, pp. 12-19). London: Elsevier (Barrett, A.J., Rawlings, N.D. and Woessner, J.F., eds.).

James, M.N.G., Sielecki, A.R., Hayakawa, K. and Gelb, M.B. (1992). Crystallographic analysis of transition state mimics bound to penicillopepsin: difluorostatine- and difluorostatone-containing peptides. Biochemistry. 31, 3872-3886.

Jaswal, S.S. (2002). Energetic landscape of α-lytic protease optimizes longevity through kinetic stability. Nature. 415, 343-346.

Jiang, S., Prigge, S.T., Wei, L., Gao, Y., Hudson, T.H., Gerena, L., Dame, J.B. and Kyle, D.E. (2001). New class of small nonpeptidyl compounds blocks Plasmodium falciparum development in-vitro by inhibiting plasmepsins. Antimicrob Agents Chemother. 9, 2577-2584.

Tang, J. and Wong, R.N. (1987). Evolution in the structure and function of aspartic proteases. J Cell Biochem. 33, 53-63.

Jørgensen, R., Yates, S.P., Teal, D.J., Nilsson, J., Prentice, G.A., Merrill, A.R. and Andersen, G.R. (2004). Crystal structure of ADP-ribosylated ribosomal translocase from Saccharomyces cerevisiae. J Biol Chem. 279, 45919-45925.

Jorgensen, W.L. (1983). Comparison of simple potential functions for simulating liquid water. J Chem Phys. 79, 926-931.

Kelch, B.A., Salimi, N.L. and Agard, D.A. (2012). Functional modulation of a protein folding landscape via side-chain distortion. Proc Natl Acad Sci USA. 24, 9414-9419.

Khan, A., Khazanovich-Bernstein, N., Bergman, E.M. and James, M.N. (1999). Structural aspects of activation pathways of aspartic protease zymogens and viral 3C protease precursors. Proc Natl Acad Sci USA. 96, 10968-10975.

Khan, A.R. and James, M.N. (1988). Molecular mechanisms for the conversion of zymogens to active proteolytic enzymes. Protein Sci. 4, 815-836.

Kim, Y.M., Lee, M.H., Piao, T.G., Lee, J.W., Kim, J.H., Lee, S., Choi, K.M., Jiang, J.H., Kim, T.U. and Park, H. (2006). Prodomain processing of recombinant plasmepsin II and IV, the aspartic proteases of Plasmodium falciparum, is auto- and trans-catalytic. J Biol Chem. 139, 189-195.

Koelsch, G., Mares, M., Metcalf, P. and Fusek, M. (1994). Multiple functions of pro-parts of aspartic proteinase zymogens. FEBS Lett. 343, 6-10.

Krop, M., Lu, X., Jan-Danser, A.H. and Meima, M.E. (2013). The (pro)renin receptor. A decade of research: what have we learned? Pflugers Arch. 1, 87-97.

Krupa, J.C., Hasnain, S., Nagler, D.K., Ménard, R. and Mort, J.S. (2002). S2' substrate specificity and the role of His 110 and His 111 in the exopeptidase activity of human cathepsin B. Biochem J. 361, 613-619.

Kunimoto, S., Aoyagi, T., Morishima, H., Takeuchi, T. and Umezawa, H. (1972). Mechanism of inhibition of pepsin by pepstatin. J Antibiot. 25, 251-255.

Le Bonniec, S., Deregnaucourt, C., Redeker, V., Banerjee, R., Grellier, P., Goldberg, D.E. and Schrevel, J. (1999). Plasmepsin II, an acidic hemoglobinase from the Plasmodium falciparum food vacuole, is active at neutral pH on the host erythrocyte membrane skeleton. J Biol Chem. 274, 14218-14223.

Lee, C.E., Kick, E.K. and Ellman, J.A. (1998). General solid-phase synthesis approach to prepare mechanism-based aspartyl protease inhibitor. Identification of potent cathepsin D inhibitor. J Am Chem Soc. 120, 9735-9747.

Loll, P.J. (2003). Membrane protein structural biology: the high throughput challenge. J Struct Biol. 142, 144-153.

Luker, K.E., Francis, S.E., Gluzman, I.Y. and Goldberg, D.E. (1996). Kinetic analysis of plasmepsin I and II, aspartic proteases of Plasmodium falciparum digestion vacuole. Mol Biochem Parasitol. 79, 71-78.

Mansfeld, J., Petermann, E., Durrschmit, P. and Ulbrich-Hofmann, R. (2005). The propeptide is not required to produce catalytically active neutral protease from Bacillus stearothermophilus. Protein Expr Purif. 39, 219-228.

Miller, D.W. and Agard, D.A. (1999). Enzyme specificity under dynamic control: A normal mode analysis of α-lytic protease. J Mol Biol. 286, 267-278.

Miura, T., Hidaka, K., Uemura, T., Kashimoto, K., Hori, Y., Kawasaki, Y., Ruben, A.J., Freire, E., Kimura, T. and Kiso, Y. (2010). Improvement of both plasmepsin inhibitory activity and antimalarial activity by 2-aminoethylamino substitution. Bioorg Med Chem Lett. 16, 4836-4839.

Mohanty, A.K., Mukhopadhyay, U.K., Grover, S. and Batish, V.K. (1999). Bovine chymosin: production by rDNA technology and application in cheese manufacture. Biotechnol Adv. 17, 205-217.

Motulsky, H.J. (2003). Prism 4 Statistical Guide: Statistical analysis for laboratory and clinical researchers. GraphPad Software, San Diego, CA.

Müntener, K., Willimann, A., Zwicky, R., Svoboda, B., Mach, L. and Baici, A. (2005). Folding competence of N-terminally truncated forms of human procathepsin B. J Biol Chem. 280, 11973-11980.

Myers, B. (2012). The role of N- and C-terminal amino acids to prosegment catalyzed folding in porcine pepsinogen A (Master’s Thesis). Retrieved from ProQuest Dissertations and Theses. (Accession Order No. AAT 3590).

Northrop, J.H. (1930). Crystalline pepsin I. Isolation and test of purity. J Gen Physiol. 6, 739-766.

Pace, C.N. (1986). Determination and analysis of urea and guanidine hydrochloride denaturation curves, in Methods in Enzymology. (pp 266-280). Orlando, FL: Academic Press (C. H. W. Hirs, and S. N. Timasheff, Eds.).

Parr-Vasquez, C.L. and Yada, R.Y. (2010). Functional chimera of porcine pepsin prosegment and Plasmodium falciparum plasmepsin II. Protein Eng Des Sel. 23, 19-26.

Peters, R.J., Shiau, A.K., Sohl, J.L., Anderson, D.E., Tang, G., Silen, J.L. and Agard, D.A. (1998). Pro region C-terminus: Protease active site interactions are critical in catalyzing the folding of alpha-lytic protease. Biochemistry, 12058-12067.

Prinz, W.A., Aslund, F., Hilmgen, A. and Beckwith, J. (1997). The role of the thioredoxin and glutaredoxin pathways in reducing protein disulfide bonds in the Escherichia coli cytoplasm. J Biol Chem. 272, 15661-15667.

Privalov, P.L. (1981). Comparative thermodynamic study of pepsinogen and pepsin structure. J Mol Biol. 152, 445-464.

Pulido, M., Saito, K., Tanaka, S., Koga, Y., Morikawa, M., Takano, K. and Kanaya, S. (2006). Ca2+-dependent maturation of subtilisin from a hyperthermophilic archaeon, Thermococcus kodakaraensis: the propeptide is a potent inhibitor of the mature domain but is not required for its folding. Appl Environ Microbiol. 72, 4154-4162.

Rawlings, N.D., Barrett, A.J. and Bateman, A. (2010). MEROPS: the peptidase database. Nucleic Acids Res. 38, 227-233.

Rawlings, N.D., Barrett, A.J. and Bateman, A. (2012). MEROPS: the database of proteolytic enzymes, their substrates and inhibitors. Nucleic Acids Res. 40, 343-350.

Richter, C., Tanaka, T. and Yada, R.Y. (1998). Mechanism of activation of the gastric aspartic proteinases: pepsinogen, progastricin and prochymosin. Biochem J. 335, 481-490.

Robbins, A.H., Dunn, B.M., Agbandje-McKenna, M. and McKenna, R. (2009). Crystallographic evidence for noncoplanar catalytic aspartic acids in plasmepsin II resides in the Protein Data Bank. Acta Crystallogr D Biol Crystallogr. 3, 294-296.

Roy, A., Kucukural, A. and Zhang, Y. (2010). I-TASSER: a unified platform for automated protein structure and function prediction. Nat Protoc. 4, 725-738.

Rubinstein, A. and Sherman, S. (2004). Influence of the solvent structure on the electrostatic interactions in proteins. Biophys J. 3, 1544-1557.

Sali, A., Veerapandian, B., Cooper, J.B., Moss, D.S., Hofmann, T. and Blundell, T.L. (1992). Domain flexibility in aspartic proteinases. Proteins. 12, 158-170.

Sanchez-Ruiz, J.M. (2010). Protein kinetic stability. Biophys Chem. 148, 1-15.

Sanchez-Ruiz, J.M. (2011). Probing free-energy surfaces with differential scanning calorimetry. Annu Rev Phys Chem. 62, 231-255.

Scarborough, P.E., Guruprasad, K., Topham, C., Richo, G.R., Blundell, T.L. and Dunn, B.M. (1993). Exploration of subsite binding specificity of human cathepsin D through kinetics and rule-based molecular modelling. Protein Sci. 2, 264-276.

Shinde, U. and Inouye, M. (2000). Intramolecular chaperones: Polypeptide extensions that modulate protein folding. Semin Cell Dev Biol. 11, 35-44.

Silva, A.M., Lee, A.Y., Gulnik, S.V., Maier, P., Collins, J., Bhat, T.N., Collins, P.J., Cachau, R.E., Luker, K.E., Gluzman, I.Y., Francis, S.E., Oksman, A., Goldberg, D.E. and Erikson, J.W. (1996). Structure and inhibition of plasmepsin II, a hemoglobin-degrading enzyme from Plasmodium falciparum. Proc Natl Acad Sci USA. 93, 10034-10039.

Snow, R.W., Guerra, C.A., Noor, A.M., Myint, H.Y. and Hay, S.I. (2005). The global distribution of clinical episodes of Plasmodium falciparum malaria. Nature. 434, 214-217.

Sohl, J.L., Jaswal, S.S. and Agard, D.A. (1998). Unfolded conformations of α-lytic protease are more stable than its native state. Nature. 395, 817-819.

Truhlar, S.M., Cunningham, E.L. and Agard, D.A. (2004). The folding landscape of Streptomyces griseus protease B reveals the energetic costs and benefits associated with evolving kinetic stability. Protein Sci. 13, 381-390.

Tyas, L., Gluzman, I., Moon, R.P., Rupp, K., Westling, J., Ridley, R.G., Kay, J., Goldberg, D.E. and Berry, C. (1999). Naturally-occurring and recombinant forms of the aspartic proteinases plasmepsin I and II from human malaria parasite Plasmodium falciparum. FEBS Lett. 454, 210-214.

Uversky, V. N. (2002). Natively unfolded proteins: A point where biology waits for physics. Protein Sci. 11, 739-756.

Van den Hazel, H.B., Kielland-Brandt, M.C. and Winther, J.R. (1993). The propeptide is required for in-vivo formation of stable active yeast proteinase A and can function even when not covalently linked to the mature region. J Biol Chem. 268, 18002-18007.

Van den Hazel, H.B., Pichler, H., Valle Matta, M.A., Leitner, E., Goffeau, A. and Daum, G. (1999). PDR16 and PDR17, two homologous genes of Saccharomyces cerevisiae, affect lipid biosynthesis and resistance to multiple drugs. J Biol Chem. 274, 1934-1941.

Van der Spoel, H., Lindahl, E., Hess, B., Groenhof, G., Mark, A.E. and Berendsen, H.J. (2005). GROMACS: Fast, flexible, and free. J Comput Chem. 16, 1701-1718.

Veerapandian, B., Cooper, J.B., Sali, A., Blundell, T.L., Rosati, R.L., Dominy, B.W., Damon, D.B. and Hoover, D.J. (1992). Direct observation by X-ray analysis of the tetrahedral "intermediate" of aspartic proteinases. Protein Sci. 3, 322-328.

Vivian, J.T. and Callis, P.R. (2001). Mechanisms of tryptophan fluorescence shifts in proteins. Biophys J. 80, 2093-2109.

Wiederanders, B. (2000). The function of propeptide domains of cysteine proteinases. Adv Exp Med Biol. 477, 261-270.

Winther, J.S. and Sorensen, P. (1991). Propeptide of carboxypeptidase Y provides a chaperone-like function as well as inhibition of the enzymatic activity. Proc Natl Acad Sci USA. 88, 9330-9334.

Wittlin, S., Rosel, J. and Stover, D.R. (1998). One step purification of cathepsin D by affinity chromatography using immobilized propeptide sequences. Eur J Biochem. 252, 530-536.

Wroblowski, B., Diaz, J.F., Schlittler, J. and Engelborghs, Y. (1997). Modelling pathways of α-chymotrypsin activation and deactivation. Protein Eng. 10, 1163-1174.

Xiao, H., Dee, D. and Yada, R.Y. (2011). The native conformation of plasmepsin II is kinetically trapped at neutral pH. Arch Biochem Biophys. 513, 102-109.

Xiao, H., Sinkovits, A.F., Bryksa, B.C., Ogawa, M. and Yada, R.Y. (2006). Recombinant expression and partial characterization of an active soluble histo-aspartic protease from Plasmodium falciparum. Protein Expr Purif. 49, 88-94.

Xiao, H., Tanaka, T., Ogawa, M. and Yada, R.Y. (2007). Expression and enzymatic characterization of the soluble recombinant plasmepsin I from Plasmodium falciparum. Protein Eng Des Sel. 20, 625-633.

Yasuda, Y., Tsukuba, T., Okamoto, K., Kadowaki, T. and Yamamoto, K. (2005). The role of the cathepsin E propeptide in correct folding, maturation and sorting to the endosome. J Biochem. 138, 621-630.

Yonezawa, H., Uchikoba, T. and Kaneda, M. (1999). Determination of pepstatin-sensitive carboxyl proteases by using pepstatinyldansyldiaminopropane (Dansyl-Pepstatin) as an active site titrant. J Biochem. 122, 294-299.

APPENDICES

Appendix 1 N-terminus analysis of extended PSPMII

Appendix Figure 1.1: Summary results on N-terminus analysis of extended PSPMII.

[Panels A to F; axis labels: Response Factor, Response, Retention Time (min)]

Appendix Figure 1.2: Results on N-terminus analysis of extended PSPMII. (A) 1st residue. (B) 2nd residue. (C) 3rd residue. (D) 4th residue. (E) 5th residue. (F) 6th residue. The letters are single-letter amino acid abbreviations, and the check marks indicate the most probable amino acid in each cycle.

Appendix 2 N-terminus analysis of +2 PMII mature activated from extended PSPMII

Appendix Figure 2.1: Summary results on N-terminus analysis of +2 PMII mature activated from extended PSPMII.

[Panels A to F; axis labels: Response Factor, Response, Retention Time (min)]

Appendix Figure 2.2: Results on N-terminus analysis of +2 PMII mature activated from extended PSPMII. (A) 1st residue. (B) 2nd residue. (C) 3rd residue. (D) 4th residue. (E) 5th residue. (F) 6th residue. The letters are single-letter amino acid abbreviations, and the check marks indicate the most probable amino acid in each cycle.

Appendix 3 N-terminus analysis of truncated PSPMII

Appendix Figure 3.1: Summary results on N-terminus analysis of truncated PSPMII.

[Panels A to F; axis labels: Response Factor, Response, Retention Time (min)]

Appendix Figure 3.2: Results on N-terminus analysis of truncated PSPMII. (A) 1st residue. (B) 2nd residue. (C) 3rd residue. (D) 4th residue. (E) 5th residue. (F) 6th residue. The letters are single-letter amino acid abbreviations, and the check marks indicate the most probable amino acid in each cycle.

Appendix 4 N-terminus analysis of +12 PMII mature activated from truncated PSPMII

Appendix Figure 4.1: Summary results on N-terminus analysis of +12 PMII mature activated from truncated PSPMII.

[Panels A to F; axis labels: Response Factor, Response, Retention Time (min)]

Appendix Figure 4.2: Results on N-terminus analysis of +12 PMII mature activated from truncated PSPMII. (A) 1st residue. (B) 2nd residue. (C) 3rd residue. (D) 4th residue. (E) 5th residue. (F) 6th residue. The letters are single-letter amino acid abbreviations, and the check marks indicate the most probable amino acid in each cycle.

Appendix 5 N-terminus analysis of NoProPMII mature

Appendix Figure 5.1: Summary results on N-terminus analysis of NoProPMII mature.

[Panels A to F; axis labels: Response Factor, Response, Retention Time (min)]

Appendix Figure 5.2: Results on N-terminus analysis of NoProPMII mature. (A) 1st residue. (B) 2nd residue. (C) 3rd residue. (D) 4th residue. (E) 5th residue. (F) 6th residue. The letters are single-letter amino acid abbreviations, and the check marks indicate the most probable amino acid in each cycle.
