
University of Calgary PRISM: University of Calgary's Digital Repository

Graduate Studies The Vault: Electronic Theses and Dissertations

2014-09-12 Towards a Biochemical Reconstitution of Nepenthes Pitcher Fluid for the Treatment of Celiac Disease

Yang, Menglin

Yang, M. (2014). Towards a Biochemical Reconstitution of Nepenthes Pitcher Fluid for the Treatment of Celiac Disease (Unpublished master's thesis). University of Calgary, Calgary, AB. doi:10.11575/PRISM/28475 http://hdl.handle.net/11023/1740 master thesis

University of Calgary graduate students retain copyright ownership and moral rights for their thesis. You may use this material in any way that is permitted by the Copyright Act or through licensing that has been assigned to the document. For uses that are not allowable under copyright legislation or licensing, you are required to seek permission. Downloaded from PRISM: https://prism.ucalgary.ca

UNIVERSITY OF CALGARY

Towards a Biochemical Reconstitution of Nepenthes Pitcher Fluid for the Treatment of Celiac Disease

by

Menglin Yang

A THESIS

SUBMITTED TO THE FACULTY OF GRADUATE STUDIES

IN PARTIAL FULFILMENT OF THE REQUIREMENTS FOR THE

DEGREE OF MASTER OF SCIENCE

GRADUATE PROGRAM IN BIOCHEMISTRY AND MOLECULAR BIOLOGY

CALGARY, ALBERTA

SEPTEMBER, 2014

© Menglin Yang 2014

Abstract

Celiac disease (CD) is an autoimmune disorder triggered by the incomplete digestion of gliadins in dietary gluten, which results from the abundance of proline (P) and glutamine (Q) residues in their protein sequences. This thesis provides an initial assessment of the potential of the proteolytic activity of Nepenthes extracts, which is attributed to the aspartic proteases nepenthesin I and II, as an oral therapeutic for CD. To this end, nepenthesin I and II were produced recombinantly and characterized. The recombinant nepenthesins reconstituted the proteolytic activity of Nepenthes extracts except for cleavage after P, which was attributed to a previously unidentified protease. Nevertheless, the Nepenthes extracts and recombinant nepenthesin I/II were assessed for their capacity to detoxify gliadins. Although the recombinant nepenthesins alone did not appear sufficient, the Nepenthes plant extracts appeared to detoxify gliadin efficiently, which supports the proposed formulation’s potential as an effective oral therapeutic for CD.


Preface

Portions of this thesis resulted in the following peer-reviewed or in preparation publications:

1) Rey, M., Yang, M., Burns, K.M., Yu, Y., Lees-Miller, S.P., and Schriemer, D.C. (2013). Nepenthesin from monkey cups for hydrogen/deuterium exchange mass spectrometry. Molecular and Cellular Proteomics, 12(2), 464-472

2) Yang, M., Hoeppner, M., Rey, M., Man, P. and Schriemer, D.C. (2014). Recombinant nepenthesin II for hydrogen/deuterium exchange mass spectrometry. In preparation.


Acknowledgements

First and foremost, I would like to thank my supervisor, Dr. David Schriemer, for giving me the opportunity to work on this project and for his immense support throughout my graduate studies. For their various experimental contributions to this project, I would like to thank Drs. Martial Rey, Laurent Brechenmacher and Kelvin Ma, as well as Ronghua Yu. Thank you to my committee members, Drs. Hans Vogel and Tony Schryvers, for their advice and guidance throughout my graduate studies. Thank you to all of the members of the Schriemer laboratory for providing a great work environment. Thank you to my mother and father, Lanlan and Shanning, for their support in my career decisions. Finally, thank you to Patricia Lan for your unwavering support in all of my decisions.


Dedication

To Patti.


Table of Contents

Abstract
Preface
Acknowledgements
Dedication
Table of Contents
List of Tables
List of Figures and Illustrations
List of Symbols, Abbreviations, and Nomenclature

CHAPTER ONE: INTRODUCTION
1.1 Celiac Disease: A General Overview
1.2 Epidemiology
1.3 Basis for Disease Development
1.4 Pathogenic Mechanism
1.5 Disease Symptoms and Outcomes
1.6 Treatments for CD- The Gluten-Free Diet
1.7 Proposed Therapeutics for CD
1.8 Oral Proteases: A Promising Treatment for CD
1.8.1 Oral Proteases: ALV003
1.8.2 Oral Proteases: STAN1
1.8.3 Oral Proteases: AN-PEP
1.8.4 Oral Proteases: Limitations and Future Directions
1.9 A New Oral Protease Candidate for the Treatment of CD- Nepenthesin
1.10 Research Hypothesis and Objectives

CHAPTER TWO: IDENTIFICATION OF THE PROTEOLYTIC COMPONENTS OF NEPENTHES PITCHER FLUID
2.1 Introduction
2.2 Experimental Procedures
2.2.1 Chemicals
2.2.2 Horticulture of Nepenthes
2.2.3 Preparation of Nepenthes fluid for Proteome Studies
2.2.4 Proteome Mass Spectrometry and Data Analysis
2.2.5 Visualization of the Nepenthes fluid Proteome over Time
2.2.6 In-gel Processing
2.2.7 Activity Assays
2.2.8 Pepstatin A Purification
2.2.9 Determination of Cleavage Specificities
2.3 Results and Discussion
2.3.1 In-solution Proteome Analyses of Nepenthes Pitcher Secretions
2.3.2 Activity and Cleavage Preferences of Nepenthes Pitcher Fluid
2.3.3 Pepstatin A Purified Nepenthes Pitcher Fluid
2.4 Conclusions
2.5 Contributions to the Chapter

CHAPTER THREE: RECOMBINANT RECONSTITUTION OF THE PROTEOLYTIC ACTIVITY OF NEPENTHES PITCHER FLUID
3.1 Introduction
3.2 Experimental Procedures
3.2.1 Chemicals
3.2.2 Plasmid Preparation
3.2.3 Intracellular and Periplasmic Protein Expression and Purification Trials in E. coli
3.2.4 Insoluble Protein Expression and Purification Trials in E. coli
3.2.5 Size-exclusion Chromatography
3.2.6 Protein Identification, Function and Disulfide Bond Formation
3.2.7 Determination of Size
3.2.8 Glycosylated Protein Expression Trials of Nepenthesin I in Pichia pastoris
3.2.9 Activity Assays
3.2.10 Digestion Mapping by LC-MS/MS
3.2.11 Inhibition of Nepenthes Fluid
3.3 Results and Discussion
3.3.1 Recombinant Production and Optimization Trials in E. coli
3.3.2 Recombinant Expression Trials in Pichia pastoris
3.3.3 Enzymatic Characterization
3.3.4 The Search for the Missing Proline Cleavage Activity
3.5 Conclusions and Future Directions
3.6 Contributions to the Chapter

CHAPTER FOUR: CHARACTERIZATION OF THE ASPARTIC PROTEASES, NEPENTHESIN I AND II, AS A THERAPEUTIC FOR CELIAC DISEASE
4.1 Introduction
4.2 Experimental Procedures
4.2.1 Chemicals
4.2.2 Monitoring Susceptibility to
4.2.3 LC-MS Quantitation of Digests of the 33-mer Peptide
4.2.4 Dose-dependent 33-mer Peptide and Gliadin Digest Preparations
4.2.5 Indirect ELISA
4.2.6 Time-dependent Gliadin Digestions
4.2.7 LC-MS/MS Analyses
4.2.8 Data Analysis
4.3 Results and Discussion
4.3.1 Quantitation of the Protease Concentration within Nepenthes Pitcher Fluid
4.3.2 Stability of Recombinant Nepenthesin I and II under Simulated Gastrointestinal Conditions
4.3.3 Quantitative Analysis of the Capacity for Digestion of the 33-mer Peptide
4.3.4 Assessment of the Recombinant Nepenthesins and Nepenthes Pitcher Fluids’ Capacity for Digestion of Gliadin Extracts
4.4 Conclusions and Future Directions
4.5 Contributions to the Chapter

CHAPTER FIVE: RESEARCH SUMMARY AND FUTURE CONSIDERATIONS
5.1 Summary
5.2 Future Considerations

BIBLIOGRAPHY

APPENDIX


List of Tables

Table 2.1. Initial proteome identification results for Nepenthes pitcher fluid digested with

Table 2.2. Proteome identification results for deglycosylated Nepenthes pitcher fluid digested with trypsin

Table 2.3. Proteome identification results for Nepenthes pitcher fluid digested with proteolytically active Nepenthes pitcher fluid

Table 3.1. Refolding data for a typical 0.5 L preparation of nepenthesin I and II

Table 3.2. Refolding data for the optimization of recombinant nepenthesin I production

Table 3.3. Refolding data for the optimization of recombinant nepenthesin II production

Table 4.1. Cleavage sites of the 33-mer peptide in gliadin extracts detected by LC-MS/MS


List of Figures and Illustrations

Figure 1.1. Schematic representation of the pathogenic mechanism of CD

Figure 1.2. Mechanism of glutamine deamidation by tissue-transglutaminase-2 (TG2)

Figure 1.3. Nepenthes pitcher plants

Figure 1.4. Digestion preferences of native Nepenthes pitcher secretions

Figure 2.1. The horticultural and harvesting process for the Nepenthes plants

Figure 2.2. Deglycosylation of the Nepenthes pitcher fluid

Figure 2.3. Stability of acid activated Nepenthes fluid over time

Figure 2.4. Relative activity (%) of two-week old Nepenthes fluid incubated over a 7-day time course at 37 ˚C

Figure 2.5. Proteolytic selectivity of activated Nepenthes fluid over time under physiological conditions

Figure 2.6. Proteolytic selectivity of Nepenthes pitcher fluid purified with pepstatin A under physiological conditions

Figure 3.1. Results of intracellular recombinant production trials for nepenthesin I and II

Figure 3.2. Results of periplasmic recombinant production trials of nepenthesin I and II

Figure 3.3. Refolding of insoluble recombinant nepenthesin I and II from inclusion bodies

Figure 3.4. Disulfide bond formation and functionality of recombinant nepenthesin I and II

Figure 3.5. Size-exclusion chromatography of recombinant nepenthesin I and II

Figure 3.6. Effects of redox environment and pre-purification on refolding of insoluble recombinant nepenthesin I and II from inclusion bodies

Figure 3.7. Relative activity of recombinant nepenthesin I and II generated from various refolding conditions

Figure 3.8. Cleavage preferences of recombinant nepenthesin I generated from various refolding conditions

Figure 3.9. Cleavage preferences of recombinant nepenthesin II generated from various refolding conditions

Figure 3.10. Recombinant nepenthesin I expression trials in P. pastoris

Figure 3.11. Absorbance at 280 nm compared to enzyme quantity (µg)

Figure 3.12. IC50 profiles for recombinant nepenthesin I and II

Figure 3.13. Effect of pH on recombinant nepenthesin I and II activity

Figure 3.14. Effect of temperature on recombinant nepenthesin I and II activity

Figure 3.15. Effect of pH on the stability of recombinant nepenthesin I and II

Figure 3.16. Effects of denaturing and reducing agents on the activity of recombinant nepenthesin I and II

Figure 3.17. Proteolytic cleavage preferences of recombinant nepenthesin I and II compared to Nepenthes fluid under physiological conditions

Figure 3.18. Proteolytic cleavage preferences of recombinant nepenthesin I and II compared to Nepenthes fluid in the P1 position under physiological conditions

Figure 3.19. Proteolytic activity of Nepenthes fluid in the presence of inhibitors under physiological conditions

Figure 3.20. Size-exclusion chromatography profile of Nepenthes fluid

Figure 4.1. Schematic of the indirect ELISA procedure

Figure 4.2. Quantitation of the protease concentration in Nepenthes pitcher fluid by SDS-PAGE

Figure 4.3. Stability of nepenthesin I and II under simulated gastrointestinal conditions

Figure 4.4. LC-MS quantitation of digests of the 33-mer peptide

Figure 4.5. Quantitation of the amount of the 33-mer peptide (%) remaining as a function of dose of nepenthesin II

Figure 4.6. Qualitative view of digestion of gliadin extracts by recombinant nepenthesins, Nepenthes pitcher fluid, and pepsin in a dose-dependent manner

Figure 4.7. Quantitation of the amount of the 33-mer peptide (%) remaining as a function of enzyme dose (µg)

Figure 4.8. Total ion chromatograms of gliadin extracts digested with various protease/protease formulations under simulated gastrointestinal conditions in vitro

Figure 4.9. Total ion chromatogram of gliadin extracts digested for 1 hour and 40 minutes with the indicated proteases/protease formulations (shown in legends) under simulated gastrointestinal conditions in vitro

Figure 4.10. The sum of the intensities of peptides separated into defined m/z ranges detected from digests of gliadin extracts by Nepenthes pitcher fluid under simulated gastrointestinal conditions plotted as a logarithmic function of time (min)

Figure 4.11. The sum of the intensities of peptides separated into defined m/z ranges detected from digests of gliadin extracts by recombinant nepenthesin I under simulated gastrointestinal conditions plotted as a logarithmic function of time (min)

Figure 4.12. The sum of the intensities of peptides separated into defined m/z ranges detected from digests of gliadin extracts by recombinant nepenthesin II under simulated gastrointestinal conditions plotted as a logarithmic function of time (min)

Figure 4.13. The sum of the intensities of peptides separated into defined m/z ranges detected from digests of gliadin extracts by pepsin under simulated gastrointestinal conditions plotted as a logarithmic function of time (min)

Figure 4.14. Qualitative analysis of digests of gliadin extracts with the indicated protease/protease formulations (shown in legends) under simulated gastrointestinal conditions

Figure 4.15. The sum of the intensities of peptides containing immunogenic epitopes detected from digests of gliadin extracts with the indicated protease/protease formulations (shown in legend) under simulated gastrointestinal conditions plotted as a logarithmic function of time (min)


List of Symbols, Abbreviations, and Nomenclature

Symbol    Definition

α-MF    Alpha-mating factor
µg    Microgram
µL    Microlitre
µm    Micron
µM    Micromolar
Ala, A    Alanine
a.a.    Amino acid
ACN    Acetonitrile
AGA    Anti-gliadin antibody
AMBIC    Ammonium bicarbonate
AN-PEP    Aspergillus niger prolyl endopeptidase
APC    Antigen presenting cells
APLF    Aprataxin and PNK-like factor
Arg, R    Arginine
ASP    Aspergillopepsin
Asp, D    Aspartate
AU    Absorbance units
BCA    Bicinchoninic acid
BRCT    BRCA1 carboxyl-terminal
BSA    Bovine Serum Albumin
C18    Octadecyl carbon chain
CD    Celiac disease
CID    Collision-induced dissociation
cm    Centimetre
Cys, C    Cysteine
Da    Dalton
DMSO    Dimethyl sulfoxide
DTT    Dithiothreitol
dupes    Number of additional matches for the same peptide
E. coli    Escherichia coli
EDTA    Ethylenediaminetetraacetic acid
ELISA    Enzyme-linked immunosorbent assay
EMA    Endomysial antibody
EP-B2    Endoprotease B, isoform 2
ESI    Electrospray ionization
ETD    Electron-transfer dissociation
FA    Formic acid
g    Gram
g    Gravitational acceleration
Gln, Q    Glutamine
Glu, E    Glutamic acid
Gly, G    Glycine
GndCl    Guanidine hydrochloride
H/DX    Hydrogen/deuterium exchange
HCl    Hydrochloric acid
His, H    Histidine
HLA    Human leukocyte antigen
HPLC    High-performance liquid chromatography
Ile, I    Isoleucine
i.d.    Internal diameter
IAA    Iodoacetamide
IC50    Half maximal inhibitory concentration
IFN    Interferon
Ig    Immunoglobulin
IPTG    Isopropyl β-D-1-thiogalactopyranoside
IU    International units
kDa    Kilodalton
L    Litre
LC    Liquid chromatography
Leu, L    Leucine
Lys, K    Lysine
M    Molar
m/z    Mass/charge
MALDI    Matrix-assisted laser desorption/ionization
MBP    Maltose binding protein
Met, M    Methionine
MHC    Major histocompatibility complex
min    Minute
mL    Millilitre
mM    Millimolar
mm    Millimetre
MOPS    3-(N-morpholino) propanesulfonic acid
MS    Mass spectrometer
Asn, N    Asparagine
N.    Nepenthes
NaCl    Sodium chloride
NepI    Nepenthesin I
NepII    Nepenthesin II
ng    Nanogram
Ni-NTA    Nickel-nitrilotriacetic acid
nL    Nanolitre
nm    Nanometre
NP-40    Nonyl-phenoxypolyethoxylethanol
p    Probability
P. pastoris    Pichia pastoris
P1    N-terminal side of the cleavage site
P1'    C-terminal side of the cleavage site
PAGE    Polyacrylamide gel electrophoresis
PBS    Phosphate buffered saline
PCR    Polymerase chain reaction
PEP    Prolyl endopeptidase
Phe, F    Phenylalanine
PIC    Peptide ion chromatogram
pmol    Picomole
pNA    para-Nitroanilide
PNGase F    Peptide-N-glycosidase F
PNK    Polynucleotide kinase
pNPP    para-Nitrophenylphosphate
ppm    Parts per million
Pro, P    Proline
Q-TOF    Quadrupole time-of-flight
QUAD    Quadrupole
RdB    Ribonuclease B
rpm    Revolutions per minute
s    Seconds
SC    Sphingomonas capsulata
SDS    Sodium dodecyl sulfate
Ser, S    Serine
TCA    Trichloroacetic acid
TCEP    tris(2-carboxyethyl)phosphine
TCR    T-cell receptor
TFA    Trifluoroacetic acid
TG2    Tissue transglutaminase-2
Thr, T    Threonine
TIC    Total ion chromatogram
TOF    Time-of-flight
Trp, W    Tryptophan
tTG    Tissue transglutaminase
v/v    Volume/volume
Val, V    Valine
w/v    Weight/volume
X    Any amino acid
x    Times
XIC    Extracted ion chromatogram
XLF    XRCC4-like factor
XRCC4    X-ray repair cross-complementing protein 4
Tyr, Y    Tyrosine
Z    Benzyloxycarbonyl

Chapter One: Introduction

1.1 Celiac Disease: A General Overview

In the early 1950s, the ingestion of dietary gluten, which is found in a wide range of grains such as wheat, rye and barley, was recognized as the trigger for celiac disease (CD) [1]. CD is an autoimmune disorder in genetically susceptible individuals that is triggered by the consumption of dietary gluten and is characterized by villous atrophy and inflammation of the small intestinal mucosa [2]. The high global prevalence of CD, coupled with a lack of treatment options other than a gluten-free diet, has established the disease as one of the most serious and concerning food intolerances to date [3]. The restrictive and costly lifestyle required by the only known treatment for CD patients, the gluten-free diet, is not only difficult to maintain but is sometimes impossible, especially in areas of poverty [4, 5]. Furthermore, the dangers of patient noncompliance and hidden gluten contamination, in combination with the burdens of a gluten-free diet, have driven the need to find and develop a therapeutic treatment. Several treatments aimed at inactivating contributors to the disease pathology or detoxifying the trigger, gluten, have been proposed and/or are in development, but none have yet been approved [6]. The proposed use of oral proteases for gluten detoxification has been shown to be particularly promising, with several candidates currently in advanced clinical trials [7]. However, the biochemical limitations of the proposed treatments have further prompted the need to find new clinical candidates. In this chapter, an overview of the current understanding of CD and its proposed treatments is provided.

1.2 Epidemiology

CD is currently estimated to affect ~1% of the world’s population, although cases of the disease are thought to be underreported, especially in Africa and Asia [8]. The highest incidences of CD are thought to occur in populations of European descent, namely in the United States of America (USA) and Western Europe, where an estimated 0.5-1% of the population is afflicted [9, 10]. Finland, in particular, has one of the highest prevalences of CD, with ~2% of the population thought to be afflicted [11]. Identified risk factors associated with the incidence of CD include Caucasian descent, female sex, and hereditary traits [12-17]. High prevalence of CD has also been reported elsewhere in the world, including the Middle East (e.g. Iran), Asia (e.g. India) and North Africa, especially the Sahara [8, 18]. In fact, 1 in 18 Saharawi children, a people of the Western Sahara, were reported to be afflicted with CD, resulting in a high rate of mortality, especially in the summers [19]. Even more concerning, numerous studies have concluded that the total global prevalence of CD has increased significantly over the past two decades, leading to the belief that CD is more common than once thought [3, 11]. The prevalence of CD in Finland, for example, almost doubled from 1970 to 2001 [11]. Furthermore, CD, originally considered a pediatric disorder, is being recognized in older patients who appear to develop the recognized clinical features with age [20]. The rise in CD has been attributed to a combination of advances in serological screening procedures and environmental factors. Over the past three decades, the development of the current panel of sensitive serological markers, namely immunoglobulin A anti-gliadin antibody (IgA AGA), immunoglobulin G anti-gliadin antibody (IgG AGA), IgA endomysial antibody (EMA) and IgA tissue transglutaminase (tTG), has allowed the identification of CD even in the face of low clinical suspicion [21]. Once CD is initially identified, a small bowel biopsy is performed for histological confirmation of infiltrative lesions, which is the gold standard for the diagnosis of CD [22]. Owing to the advances in, and lowered costs of, screening procedures, it is believed that CD follows a classical “iceberg” distribution in which undiagnosed cases far outnumber confirmed cases [23]. The results of numerous studies have supported this notion, including a large UK patient-based study suggesting that over 90% of asymptomatic children with CD remain undiagnosed [24]. Although their roles remain unclear, environmental factors such as economic status and hygiene have also been implicated in the perceived rise in the prevalence of CD [25]. Of concern, however, is that gluten intake has also increased owing to its worldwide use in processed foods, either as an additive or to confer favorable mechanical properties such as cohesiveness [26].


1.3 Basis for Disease Development

Currently, the complete basis for the development of CD is unknown. However, both genetic and environmental factors are thought to contribute to susceptibility to CD. Individuals carrying either of the human leukocyte antigen (HLA) molecules, DQ2 or DQ8, are known to carry an increased risk of developing the disease, with >95% of CD patients carrying either or both HLA genes [27, 28]. Only ~40% of those possessing the HLA genes risk developing CD; nevertheless, the HLA gene is currently the only established genetic variant for predisposition to CD, although two other genes, MYO9B and CTLA-4, have been implicated [29-32]. The genetic contributions for the remaining ~60% of CD patients, however, remain unknown, spurring the belief that non-genetic factors are at play. The prevalence in family members of CD patients in the USA, for example, has been found to be only ~3-20% depending on the research methods used [33]. Furthermore, comparison of two large population-based studies of twins in Italy showed that monozygotic twins had a higher concordance rate of CD (~85%) than dizygotic twins (~20%). Because the concordance rate in monozygotic twins is not 100%, genetic factors cannot fully explain the incidence of CD, and environmental or other factors may contribute [34-35]. Besides the ingestion of gluten, it is currently unclear what these other environmental factors are, although changes in gut microbiota composition have been linked to an increased occurrence of CD [36-38].

1.4 Pathogenic Mechanism

The destruction of the small intestinal mucosa, and ultimately the impaired nutrient absorption, seen in CD patients results from an inappropriate T-cell mediated immune response to the ingestion of gluten [39]. However, both the innate and adaptive arms of the immune system contribute to the pathogenesis of CD [6]. Gluten is a large protein composite consisting of glutenins and gliadins. Glutenins are insoluble, multimeric protein aggregates consisting of high and low molecular weight subunits that confer elasticity to dough [40]. Gliadins, on the other hand, are soluble monomeric prolamins that give dough its viscosity and extensibility, and are subdivided into α, β, γ, and ω fractions [41]. Gliadins have been shown to be the main immunogenic culprits in CD patients with


Figure 1.1. Schematic representation of the pathogenic mechanism of CD.

A subset of partially digested toxic gliadin peptides, such as the immunodominant 33-mer peptide of α-gliadin (red), cross the epithelial barrier from the intestinal lumen into the lamina propria, where they are deamidated by TG2 and then recognized, with enhanced binding, by antigen presenting cells (APCs) through their major histocompatibility complex (MHC) II receptors (HLA-DQ2 or HLA-DQ8 molecules). Once recognized by APCs, gliadin peptides are presented to gluten specific CD4+ T-cells, via their T-cell receptor (TCR), leading to the release of an array of pro-inflammatory cytokines such as IFN-γ. In tandem, B-cells are activated and contribute to the release of pro-inflammatory cytokines, as well as the production of autoantibodies against the self, TG2 and gluten.


α-gliadins considered as the most toxic followed by β, γ and ω-gliadins in order of decreasing toxicity [40, 42-43]. Once ingested, the underlying basis for the toxicity of gliadins is thought to be their resistance to gastrointestinal proteolysis throughout the digestive process, which is fuelled by their unique amino acid composition consisting of a high percentage of prolines (~15%) and glutamines (~30%) [39]. The abundance and positions of prolines in gliadins, coupled with an observed lack of endoprolyl peptidase activity throughout the gastrointestinal tract, are thought to contribute to the formation of immunogenic peptides with strong polyproline helical character that are resistant to proteolysis by gastrointestinal enzymes throughout the digestive process [44]. Once they reach the intestine, it is thought that a subset of immunogenic gliadin peptides crosses the epithelial cell barrier into the lamina propria. One possible mechanism is the triggering of zonulin release. Zonulin, identified as prehaptoglobin-2 (the precursor of haptoglobin-2), modulates the tight junctions regulating the permeability of the epithelial cell barrier leading into the lamina propria [45]. In support of this mechanism, gliadin was shown to bind to CXCR3 receptors found on intestinal epithelial cell surfaces in order to induce zonulin release, via a MyD88-dependent pathway, resulting in increased intestinal permeability to the lamina propria [46]. In the lamina propria, the abundance of glutamines found in immunogenic gliadin peptides further contributes to their increased immunogenicity in CD patients upon deamidation by tissue transglutaminase-2 (TG2) at the consensus sequence, QXP, resulting in the addition of an extra negative charge and thus enhanced affinity for the HLA-DQ2 and HLA-DQ8 molecules (Figure 1.2) [47]. The idea that gliadin specific immunogenic peptides are produced from incomplete proteolysis has led to the identification of a 33-mer peptide in α2-gliadin (residues 57-89, UniProt: Q9M4L6) that is thought to be one of the most potent stimulators of the inflammatory response to gluten. In vitro and in vivo studies in rats demonstrate that the 33-mer peptide is stable in the presence of gastrointestinal enzymes well past the normal course of digestion (>20 hours), reacts with TG2 and rapidly stimulates the T-cell proliferation cascade described below (Figure 1.1), based on T-cell lines derived from CD patients [48]. However, further studies indicate a much larger array of immunogenic gluten peptides, which may help to explain the heterogeneity in the and sensitivity of responses [44, 49-51]. Recognition of


Figure 1.2. Mechanism of glutamine deamidation by tissue-transglutaminase-2 (TG2).

The thiol of TG2, in the presence of Ca2+, attacks the carboxamide group of glutamine, producing a thioester intermediate and releasing ammonia. Water then acts as a nucleophile, hydrolysing the thioester intermediate and completing the deamidation. Adapted with permission from Pinkas et al., 2007 [52].


immunogenic gliadin peptides by the HLA-DQ2 or HLA-DQ8 molecules (MHC class II receptors) expressed on the surface of antigen presenting cells precedes their presentation to the T-cell receptors of gliadin specific CD4+ T-lymphocytes in the intestinal mucosa [53]. Presentation of the processed immunogenic peptides to the T-cell receptors of gliadin specific T-lymphocytes subsequently activates their release of pro-inflammatory cytokines such as interferon-γ (IFN-γ), tumor necrosis factor-α as well as interleukins 2, 6 and 10 [47, 54-56]. In addition, induction of disease-specific antibodies against the self, TG2 and gluten by antibody secreting B-cells, triggered in tandem with the T-cell activation cascade, has been implicated as an independent contributor to cytokine release and to the severity of the disease’s progression [55, 57-59]. The inflammation, if continued, eventually causes collapse of the intestinal villi, which normally absorb nutrients into the bloodstream for distribution throughout the body, resulting in a flat, smooth mucosal surface devoid of normal function [2]. If the trigger (gluten) is removed, the autoimmune response ceases and the intestinal mucosa, in most cases, slowly reverts to normal without scarring [39]. However, tolerance to gluten is never acquired and reintroduction of the trigger reactivates the disease.
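The sequence features described in this section, namely the high proline and glutamine content of gliadin peptides and the QXP consensus sequence targeted by TG2, can be illustrated with a short script. The following is a minimal sketch only. It assumes the commonly reported sequence of the 33-mer peptide, which is not quoted in this text, and the function names `residue_fraction` and `qxp_sites` are hypothetical helpers introduced for illustration. Positions are numbered within the peptide, not within the full α2-gliadin sequence.

```python
import re

# Assumption: the commonly reported 33-mer sequence (alpha2-gliadin residues 57-89).
# It is not quoted in the thesis text and is used here for illustration only.
PEPTIDE_33MER = "LQLQPFPQPQLPYPQPQLPYPQPQLPYPQPQPF"


def residue_fraction(seq: str, residue: str) -> float:
    """Fraction of positions in seq occupied by the given one-letter residue."""
    return seq.count(residue) / len(seq)


def qxp_sites(seq: str) -> list[int]:
    """1-based positions of each Q that starts a Q-X-P motif (X = any residue)."""
    # A zero-width lookahead reports every motif start, including overlapping ones.
    return [m.start() + 1 for m in re.finditer(r"(?=Q.P)", seq)]


if __name__ == "__main__":
    print(f"length: {len(PEPTIDE_33MER)}")
    for res in "PQ":
        print(f"{res}: {residue_fraction(PEPTIDE_33MER, res):.1%}")
    print("Q positions in QXP motifs:", qxp_sites(PEPTIDE_33MER))
```

For the assumed sequence, 13 of the 33 residues are proline and 10 are glutamine, and three glutamines fall within a QXP motif. A simple pattern match of this kind only flags candidate sites and says nothing about whether TG2 actually modifies them.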

1.5 Disease Symptoms and Outcomes

CD is an autoimmune disease that is characterized by inflammation, crypt hyperplasia and villous atrophy of the small intestinal mucosa [60]. Classic intestinal symptoms of CD include vomiting, chronic diarrhea, bloating, constipation and abdominal pain [61]. In CD, the integrity of the intestinal epithelial barrier to the lamina propria becomes compromised, which then allows the destruction of the villi necessary for nutrient absorption [62]. The final result is often serious impairment of nutrient absorption due to the destruction of the small intestinal mucosa, which leaves patients predisposed to more serious manifestations. Over 300 symptoms of CD have been described, including growth problems, weight loss, delayed puberty, arthritis, iron deficiency and bone loss, resulting in a severely reduced quality of life [63, 64]. CD has also been linked with psychiatric disorders such as cognitive decline and peripheral neuropathy [65]. The severity of the disease is highly variable, ranging from asymptomatic to death [66, 67]. Indeed, a large proportion of those with CD are and may remain asymptomatic for their


entire lives while others may develop symptoms with age and accrued exposure to environmental agents [65, 68].

1.6 Treatments for CD- The Gluten-Free Diet

Currently, a lifelong gluten-free diet is the only available treatment for managing CD. Strict, lifelong adherence to a gluten-free diet has been shown, in most cases, to alleviate the symptoms of CD and promote healing of the small intestinal mucosa [2]. However, despite strict adherence to a gluten-free diet, complete recovery of the small intestinal mucosa is exceptionally rare and symptoms may persist [69-71]. In fact, a study on the benefits of the gluten-free diet showed that, of 465 CD patients who adhered strictly to a gluten-free diet for 16 months, only 8% regained histological normalization as determined by duodenal biopsy [72]. This phenomenon has been attributed to the effects of hidden gluten contamination in supposedly “gluten-free” products combined with the variable sensitivity to gluten displayed by CD patients. It is estimated that CD patients on a gluten-free diet regularly consume 5-50 mg of unwanted gluten a day as a result of contamination, and that as little as 50 mg of gluten, corresponding to 1/96 of a slice of bread, ingested daily for 3 months is sufficient to contribute to the persistence of intestinal mucosal damage [73]. For this reason, multiple studies attempting to define a gluten threshold tolerable for all CD patients have suggested that the maximum acceptable amount of gluten in “gluten-free” products should be further lowered, but no studies have yet been conclusive [74]. The results are unsurprising given the observed variability in the amount of gluten that CD patients are able to tolerate and in the severity of their manifestations [73, 75-76]. It has been recognized that a gluten-free diet is not always possible to maintain, especially in areas of poverty [5]. Even in conditions of economic prosperity, a restrictive and costly lifestyle is required to maintain a gluten-free diet, and the lurking dangers of patient non-compliance can sometimes be fatal [4]. These concerns have driven interest in the development of a therapeutic treatment for CD to either combine with or replace the gluten-free diet.

1.7 Proposed Therapeutics for CD

An increased understanding of the molecular mechanisms governing CD pathogenesis has led to the identification of several possible targets for therapeutic development, including inhibition of antigen presentation, direct mitigation of the inflammatory response, inhibition of lymphocyte recruitment, induction of immune tolerance, inhibition of the mechanisms mediating increased intestinal permeability, and dietary detoxification. A brief summary of the current literature on the proposed treatments is provided below. Proposed therapeutics aimed at inhibiting antigen presentation, thus preventing initiation of the immune cascade, have mainly focused on blocking deamidation of glutamine residues in gluten (e.g. crosslinking pre-transamidated gluten) or direct inhibition of TG2 (e.g. oral protease inhibitors) [6, 77-78]. Although supporting evidence has been observed in vitro, the fact that some immunogenic peptides do not require deamidation to initiate the immune cascade, coupled with the variable sensitivity observed in CD patients, has raised concerns over the practicality of the proposed treatment [61, 79-83]. Treatments aimed at direct mitigation of the inflammatory response include a variety of anti-inflammatory and pro-regulatory cytokine compounds currently in the preclinical stage of development. However, these therapies are considered less than ideal due to their side effects and methods of delivery. Inhibition of lymphocyte recruitment is another treatment proposed for CD. Compounds aimed at inhibiting various factors required for T-cell recruitment have been proposed and are in the preclinical stage of development [84-88]. Again, however, the therapy is not considered ideal due to the possible side effects, including an increased risk of infection due to immunosuppression. Gaining immune tolerance, either through controlled exposures to gluten and/or vaccination, has been studied as another method of modulating CD [89-91].
Although this method is currently in clinical trials, it has raised concerns given the variable sensitivity that patients display to gluten and the risk of activating the immune system, with its long-term consequences. Currently considered one of the most promising therapeutics, AT-1001, an octapeptide, is aimed at inhibiting the increased intestinal permeability observed in CD patients by reversibly blocking the receptor for zonulin, which is thought to allow the passage of immunogenic gluten peptides into the lamina propria for initiation of the immune cascade [92]. Advanced clinical trials have been completed and fast-track designation has been assigned. Although the therapeutic is moving into phase 3 clinical trials, mixed results have been reported on its effectiveness at maintaining or repairing intestinal permeability relative to a placebo, although fewer CD symptoms were observed during treatment in clinical trials [93-95]. Finally, several treatments aimed at detoxifying dietary gluten before it can initiate the immune cascade in CD patients have been proposed. Pre-treatment of wheat by fermentation with Lactobacillus and fungal proteases, for example, resulted in digestion of the immunodominant 33-mer of α-gliadin and reduced IFN-γ mRNA production in CD patients [96-98]. However, the feasibility of producing truly "gluten-safe" wheat has been called into question, again due to the variable sensitivity to the trigger (gluten) observed in CD patients. The other proposed gluten detoxification treatment, which is considered one of the most promising treatments to date and is the focus of this thesis, is supplementation of the diet with oral proteases highly specific for cleavage at P and Q residues in order to digest immunogenic peptides. The method is highly favorable due to the low chance of side effects, economic feasibility, the ability to study and regulate dosage based on efficacy, and the lack of changes in the mechanical properties of flour [7, 99]. The main concern raised with this treatment method is whether the proteases are able to digest gluten to the extent that immunogenic peptides no longer initiate the immune cascade. The successes observed for this proposed therapeutic method in clinical trials, however, suggest that this is the case.

1.8 Oral Proteases: A Promising Treatment for CD

The hypothesis that a "missing peptidase" in the human gastrointestinal tract was responsible for the pathogenesis of CD was first proposed in the late 1950s, when it was observed that an enzyme extract from the intestinal mucosa of pigs was able to detoxify gliadin [100]. This hypothesis eventually led to the finding that the amount and locations of proline and glutamine residues in gliadin were responsible for triggering CD, with the immunodominant 33-mer as a prime example. The results of these studies, coupled with observations suggesting a lack of prolyl endopeptidase activity among the digestive enzymes of the gastrointestinal tract, led to the proposal of supplementation with prolyl endopeptidases (PEP) as a treatment for CD [53, 57]. As a result, several highly promising PEP-based treatments have been proposed as therapeutics for CD and are currently being assessed in advanced clinical trials. The most promising PEP treatments include ALV003, STAN1 and AN-PEP, which are discussed further below.

1.8.1 Oral Proteases: ALV003

Currently thought to be the most promising oral protease therapy for CD, ALV003 (Alvine Pharmaceuticals) is a drug consisting of two glutenases with high substrate specificity: a proline-specific PEP from Sphingomonas capsulata (SC-PEP) and a glutamine-specific cysteine endoprotease, endoprotease B, isoform 2 (EP-B2), from germinating barley seeds [101]. In vitro evaluation of EP-B2 showed that the enzyme displayed rapid auto-catalytic activation and a high degree of stability in acidic environments, with extreme resistance to proteolysis by pepsin but not by trypsin, suggesting susceptibility to inactivation in the small intestine. More importantly, EP-B2 efficiently digested α-gliadin, including within the sequence encoding the immunodominant 33-mer, and cleaved Gln residues in the consensus sequence (QXP), which is required for human TG2-mediated deamidation of gluten proteins [102]. Although initially overlooked, SC-PEP was first proposed as a possible oral supplement for the treatment of CD in a study comparing the biochemical properties of three different bacterial PEPs. Although unable, on its own, to digest the 33-mer peptide of α-gliadin, SC-PEP possessed an extended pH profile that showed an acceptable degree of stability in the acidic environment of the stomach and optimal activity under duodenal conditions. However, the study also showed that SC-PEP was susceptible to proteolysis by pepsin, and concerns were therefore raised about its ability to survive passage through the stomach to the small intestine, where it would need to act [103]. The idea of rationally combining a glutamine-specific glutenase (EP-B2), which would act to detoxify gluten in the stomach, with a PEP (SC-PEP), which would act to detoxify residual gluten in the upper intestinal lumen, gave rise to ALV003 and its subsequent evaluation as a combination enzyme therapy [101, 104]. In vitro and in vivo, extensive proteolysis of complex gluten proteins in whole wheat bread and reduced T-cell proliferation in 2 polyclonal cell lines derived from CD patients were observed [101]. Two phase I clinical trials showed that ALV003 did not elicit any serious adverse or allergic reactions, was well tolerated and was orally active in CD patients [105]. Furthermore, a double-blind study showed that CD patients pre-treated with ALV003 and then challenged with a gluten meal showed decreased gluten-specific T-cell responses and fewer gliadin or 33-mer peptide specific T-cells in peripheral blood compared to a placebo. However, symptoms associated with ingestion of gluten, such as bloating, headaches, and nausea, were not ablated [106]. As well, Phase IIa clinical trials reported substantially less small intestinal mucosal injury on biopsy, fewer symptoms, and neither adverse reactions nor changes in serology. Furthermore, attenuation of symptoms of CD associated with up to 2000 mg/day of gluten contamination was observed with ALV003 treatment [107]. The encouraging results have prompted further clinical investigation and the assignment of fast-track designation, which reflects both the seriousness of CD and ALV003's potential to treat it. Phase IIb clinical trials for ALV003 have recently been completed (ClinicalTrials.gov Identifier: NCT01560169) but the results have not yet been released.

1.8.2 Oral Proteases: STAN1

STAN1 is a combination of common microbial enzymes used in food supplements that consists of aspergillopepsin (ASP) from Aspergillus niger, a non-specific enzyme, and dipeptidyl-peptidase IV, a protease specific for cleavage N-terminal to proline [99]. STAN1 has been shown to detoxify modest amounts of gluten in vitro, as assessed by mass spectrometry, enzyme-linked immunosorbent assay (ELISA) and T-cell proliferation assays. However, neither ASP nor dipeptidyl-peptidase IV alone was able to detoxify gluten [108]. The success in the laboratory has led to a randomized, double-blind, placebo-controlled crossover study monitoring serum activity markers in the presence of STAN1 (ClinicalTrials.gov Identifier: NCT00962182). However, the results of the study have not yet been released.

1.8.3 Oral Proteases: AN-PEP

Aspergillus niger prolyl endopeptidase (AN-PEP) was first proposed for the treatment of CD in response to the recognition that previously characterized PEPs were susceptible to irreversible inactivation in the stomach, through a combination of pepsin digestion and acidity, and hence failed to detoxify gluten and attenuate the initiation of the intestinal inflammatory response [99]. In vitro assays showed that AN-PEP was stable at pH 2, completely resistant to pepsin digestion and efficiently degraded all immunogenic gluten peptides tested under simulated gastrointestinal-like conditions, although optimal activity was observed between pH 4-5 [109]. Studies using a dynamic model system that mimics the human gastrointestinal tract, the TIM system, corroborated the in vitro results and showed that AN-PEP accelerated the digestion of gluten in the stomach compartment to such an extent that few gluten peptides large enough to elicit an immune reaction ever reached the duodenal compartment, as assessed by T-cell proliferation assays and monoclonal antibody-based competition assays of intestinal compartment extracts [110]. The encouraging results obtained for AN-PEP have led to the completion of two Phase I and II clinical trials (ClinicalTrials.gov Identifier: NCT00810654 and NCT01335503), the results of which have not yet been released.

1.8.4 Oral Proteases: Limitations and Future Directions

CD is a highly prevalent and serious autoimmune disorder triggered by one of the world's most common food substances, with no known treatment except a gluten-free diet [39]. The difficult lifestyle imposed on CD patients, combined with the risks of patient non-compliance and the dangers of hidden gluten contamination, which are accentuated by the global rise in gluten intake, warrant the development of alternative treatments. Oral proteases appear to be prime candidates for treating CD, with several highly favorable advantages. Many of the proposed oral protease treatments are currently in clinical trials but the overall number of clinical candidates is low, which may partly be due to the fact that pre-clinical validation is hampered by the lack of a true CD-specific animal model [111]. Along with susceptibility to proteolysis by pepsin, the main stomach enzyme, the main concern raised with the proposed oral protease treatments centres on their need to survive passage through the stomach, where they show suboptimal to no activity, in order to reach the small intestine, where an important portion of their effects is meant to be exerted (e.g. ALV003) [104]. Furthermore, all of the proposed oral protease treatments appear less than optimal given that true detoxification must partly coincide and compete with immunogenic gliadin peptides arriving in the small intestine, where the immune cascade is stimulated. Efforts to overcome the concerns raised over the ability of the oral protease treatments to efficiently degrade gluten and prevent CD-specific T-cell activation have been partly successful. Acceptable doses have been determined to overcome some of the intrinsic biochemical shortcomings of the clinical candidates, although this is often difficult to judge given the varying amounts of gluten used in different clinical trials [101, 105-107]. As the search for new oral protease candidates continues, the list of candidates will grow, which may raise the possibility of further protease combinations. The ideal oral protease candidate should show efficient and optimal gliadin detoxification in the environment of the human stomach to prevent the presence of immunogenic peptides in the small intestine altogether.

1.9 A New Oral Protease Candidate for the Treatment of CD: Nepenthesin

The pitcher secretions of the Nepenthes genus of carnivorous plants contain proteolytic activity for digesting captured insects that appears to arise from the aspartic protease, nepenthesin [112-114]. Nepenthesins represent a distinct class of aspartic proteases (AP) that fall under the MEROPS subfamily A1 of pepsin-like enzymes [115]. These enzymes are typically expressed as zymogens that are capable of auto-activation in acidic pH, where they show optimal activity using the classical of Asp-Ser/Thr-Asp in their active site [116, 117]. In Nepenthes plants, two isozymes of nepenthesin (I and II), which share ~67% sequence homology, are secreted into the pitchers of the plant from the secretory glands; both isozymes have been purified from plants and extensively characterized [118]. The proteases show between 12-22% sequence homology to ordinary pepsin-type enzymes and classic aspartic protease features, such as sensitivity to inhibition by pepstatin A, as well as unique features, such as high stability across a wide pH and temperature range and pronounced sensitivity to denaturing and reducing agents [118, 119]. Serendipitously, nepenthesins are optimally active in highly acidic environments but show no observed activity under alkaline conditions [118]. It is thought that the stability of nepenthesins is attributable to their extensive disulfide bond network and putative N-glycosylation modifications [120]. Our recent study on the native Nepenthes pitcher fluid demonstrated a very puzzling specificity profile for aspartic proteases. In addition to the expected cleavage sites (e.g. after Q and L), we determined that concentrated fluid promoted cleavage C-terminal to residues such as K, R, H and P, known to be “forbidden” residues for classic aspartic proteases like pepsin (Figure 1.4) [121, 122]. That is, the concentrated fluid presents activity that appears to be a blend of proteases; however, the available proteomic studies have not revealed the presence of proteases other than nepenthesins, although other protease classes have been implicated [123-125]. Further analysis is therefore necessary to ascertain the basis for the unusual reactivity profile observed. Nevertheless, our study demonstrated that even in unfavorable conditions (10 ˚C, 2.5 minute digestion time), the native pitcher fluid showed an estimated >1400-fold higher digestion activity N- and C-terminal to P and Q residues than pepsin, the main stomach enzyme, under acidic conditions (pH 2.5) akin to the environment of the human stomach [121, 122]. As well, the stability and pH profiles of nepenthesin I/II, purified from the native plant fluid, demonstrated high stability across pH and inactivation in duodenal-like environments (pH 6-8) [118]. Combined, these observations led us to suggest that nepenthesin and/or possibly other components of the native Nepenthes pitcher secretions would be prime therapeutic candidates for the treatment of CD that appear to overcome many of the concerns previously raised with known oral protease candidates. That is, the novelty of the new enzyme(s) treatment appears to be the efficient digestion of gliadin solely in the stomach, which may effectively eliminate any immunogenic peptides that would normally reach the small intestine. As well, since inactivation of the enzyme(s) occurs from pH 6-8, it is unlikely that the digestive process within the small intestine will be affected.

Figure 1.3. Nepenthes pitcher plants.

Various species of Nepenthes pitcher plants (left). A close-up of a pitcher of the Nepenthes plant, which houses digestive fluid containing various digestive enzymes, including the nepenthesins, used for the digestion of trapped insects (right).

Figure 1.4. Digestion preferences of native Nepenthes pitcher secretions.

Nepenthes fluid cleavage preferences at the P1 position or C-terminal side of the residue (A) and at the P1ʹ position or N-terminal side of the residue (B). Data is grouped according to amino acid type and compared to a similar rendering of pepsin data from Hamuro et al. [122]. The black and grey bars indicate nepenthesin and pepsin digestion respectively. The % cleavage represents the total number of observed cleavages at a given residue relative to the total number of the residue in the set. Nepenthes fluid cleavage data was obtained from 2 min digests of six acid denatured proteins at 10 ˚C (reprinted with permission) [121].
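
The % cleavage measure defined in this caption can be illustrated with a minimal sketch. It assumes that each identified peptide is available as a (start, end) span on its substrate sequence and that each distinct cleavage position is counted once; neither detail is specified in the text, and the function name and toy sequence are hypothetical.

```python
from collections import Counter

def p1_cleavage_percent(sequence, peptides):
    """Percent cleavage at the P1 position for each residue type.

    sequence: substrate sequence in one-letter code.
    peptides: (start, end) spans of identified peptides, 0-based and
              end-exclusive (assumed input format).
    """
    # A cut at index i lies between sequence[i - 1] (P1) and sequence[i] (P1').
    cut_sites = set()
    for start, end in peptides:
        if start > 0:
            cut_sites.add(start)          # N-terminal end of the peptide
        if end < len(sequence):
            cut_sites.add(end)            # C-terminal end of the peptide
    observed = Counter(sequence[i - 1] for i in cut_sites)
    totals = Counter(sequence)            # all occurrences of each residue
    return {res: 100.0 * observed[res] / totals[res] for res in totals}

# Toy example (not thesis data): cuts after K (index 1) and P (index 4).
toy = p1_cleavage_percent("AKLQPAKL", [(0, 2), (2, 5), (5, 8)])
print(toy)
```

The same counting applied to `sequence[i]` instead of `sequence[i - 1]` would give the P1ʹ profile shown in panel B.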

1.10 Research Hypothesis and Objectives

In this thesis, we hypothesize that nepenthesin I and/or II is responsible for the observed proteolytic activity in native Nepenthes pitcher fluid and thus that efficiently produced recombinant nepenthesin is sufficient as a therapeutic for CD. Our specific aims are to:

1) Support our choice of a recombinant target by identifying nepenthesin and all other proteases in the Nepenthes pitcher fluid through a proteomics driven characterization of the native fluid (Chapter 2).
2) Reconstitute the proteolytic activity of the native Nepenthes pitcher secretions through the recombinant production and characterization of nepenthesin I and II and/or other proteomically identified proteases (Chapter 3).
3) Evaluate the potential of the reconstituted recombinant protease formulation as a therapeutic for CD in vitro (Chapter 4).

Completion of the objectives highlighted above will provide an initial assessment of a promising new oral protease candidate for one of the world’s most concerning food intolerances, one that appears to overcome many of the shortcomings of its current competition, as well as extensive biochemical data for a unique set of enzymes.

Chapter Two: Identification of the Proteolytic Components of Nepenthes Pitcher Fluid

2.1 Introduction

Currently, only a limited number of proteome studies have been performed on the pitcher secretions of Nepenthes plants. Based on these findings, the proteome is thought to be simple, consisting of ~8-11 proteins, with the observed proteolytic activity attributed to the aspartic proteases nepenthesin I and II [124,125]. However, some studies have indicated the presence of other protease classes, such as a possible cysteine protease, but no specific proteases have been identified or characterized [123]. Furthermore, our recent study on Nepenthes pitcher fluid demonstrated a puzzling specificity profile for aspartic proteases. In addition to the expected cleavage sites, we determined that concentrated fluid promoted cleavage C-terminal to residues such as K, R, H and P, known to be “forbidden” residues for classic aspartic proteases like pepsin [121]. That is, the concentrated fluid presents activity that appears to be a blend of proteases but, as previously mentioned, proteomic studies have not revealed the presence of proteases other than the nepenthesins. Ascertaining the basis for this puzzling specificity profile, which is currently attributed solely to aspartic proteases, is therefore a necessary first step prior to any attempt to reconstitute the proteolytic activity of the native Nepenthes pitcher secretions. As the amounts of pitcher secretions that can be obtained are limited, recombinant production appears to be the most promising means of obtaining sufficient quantities of protease(s) for characterization and for development towards further applications, such as the one proposed in this thesis: a therapeutic for the treatment of CD. In this chapter, we describe our results from global proteome characterizations using multiple strategies in order to identify all of the possible proteases in the Nepenthes pitcher secretions.
We further describe our assessments of the stability of the pitcher fluid proteome over time and of a fraction of native nepenthesins partially purified with a highly specific inhibitor of aspartic proteases, pepstatin A. The cleavage preferences of the partially purified nepenthesins are compared to those of the processed pitcher fluid in order to determine the specific proteolytic contributions of the partially purified proteases. Based on the current literature, we hypothesize that nepenthesins, alone, are responsible for the observed proteolytic activity of Nepenthes pitcher secretions.

2.2 Experimental Procedures

2.2.1 Chemicals

Water and acetonitrile (ACN), high-performance liquid chromatography (HPLC) grade from Burdick and Jackson, were purchased from VWR. Sodium chloride (NaCl) and Tris from AMRESCO were purchased from VWR. Trypsin, mass spectrometry grade, was purchased from Promega and all sodium dodecyl sulfate (SDS)-polyacrylamide gel electrophoresis (PAGE) components were obtained from Bio-Rad. All other chemicals and reagents, unless specifically stated, were obtained from Sigma-Aldrich.

2.2.2 Horticulture of Nepenthes Plants

N. rafflesiana, N. ampullaria, N. mirabilis, and N. globosa plants were purchased from Keehn’s Carnivores (http://www.keehnscarnivores.ca/). The plants were potted with wood bark, perlite, peat moss and humus mix (40, 35, 10, 5% respectively) and grown under a 14-hour light cycle in a terrarium. The terrarium was 36” x 18” x 24” (width, depth, height) and was of sufficient size to accommodate 6-8 pots (8”). Light was supplied with 2x4 Philips Mini Twister Daylight Compact Fluorescent Bulbs (23 Watts). Plants were watered frequently with deionized water and the humidity was maintained between 60% and 80%. Water was applied at the soil level, taking care to minimize addition of water to the pitchers. The temperature was maintained between 26-28 °C and was slightly lower in the dark. Every other week, the plants were fed with frozen Drosophilae, 1 or 2 in every pitcher (Figure 2.1, left panel), and the fluid was harvested the following week (Figure 2.1, right panel). Nepenthes pitcher fluid was collected with a 1 mL plastic pipette. Crude pitcher fluid was filtered through a 0.22 µm filter and then concentrated 80x. The concentrated fluid was acid-activated with 100 mM glycine-HCl (pH 2.5) for 24 hours in the fridge. Any peptides resulting from activation of the enzyme and fluid-protein digestion were washed away by 3 dilution-concentration cycles using 100 mM glycine-HCl (pH 2.5) in an Amicon Ultra centrifugal filter unit with a 10 kDa cut-off (Millipore).

Figure 2.1. The horticultural and harvesting process for the Nepenthes plants.

Nepenthes plant pitcher being fed Drosophilae (left) and being harvested for proteases (right).

2.2.3 Preparation of Nepenthes fluid for Proteome Studies

In order to prepare the processed Nepenthes fluid for global proteome studies, the fluid was first incubated at 37 ˚C for 24 hours in the hope of reducing background. The pH of the solution, corresponding to 150 µL of 20 times concentrated fluid, was then adjusted to 8 with 100 mM Tris prior to heat inactivation (95 ˚C, 10 minutes). After cooling, proteins were precipitated with 20% trichloroacetic acid (TCA)/1% trifluoroacetic acid (TFA) and incubated on ice for 30 minutes. The precipitate was pelleted by centrifugation (14000 g, 15 minutes) and washed three times with cold acetone (4 ˚C). Precipitated proteins were reduced with 40 mM dithiothreitol (DTT) for 30 minutes at 37 ˚C and alkylated with 80 mM iodoacetamide (IAA) for 30 minutes at room temperature in the dark. The resulting protein solution was digested overnight with either 1 µg of trypsin or 1 µL of the 20 times concentrated Nepenthes fluid at 37 ˚C. A separate sample was also digested with 1 µL of the 20 times concentrated Nepenthes fluid for 1 hour at 37 ˚C. The digested solutions were then lyophilized in a Savant SC110 speed-vac and resuspended in 1% formic acid (FA) prior to injection into the mass spectrometer for LC-MS/MS analyses. For deglycosylated Nepenthes fluid processing, a commercial PNGase F deglycosylation kit (New England Biolabs, lot no. 0391210, cat. no. P0704S) was used. Briefly, 150 µL of 20x concentrated fluid spiked with 2 µg of ribonuclease B was first heated at 95 ˚C for 10 minutes with denaturing buffer (0.5% SDS, 40 mM DTT). After cooling, NP-40 was added to the detergent-denatured sample in order to counteract the SDS. All reactions were then diluted to 1x G7 reaction buffer (50 mM sodium phosphate, pH 7.5) with the addition of 8 international units (IU) of PNGase F and the deglycosylation reaction was allowed to proceed for 3 hours at 37 ˚C, 200 rpm. After the incubation, the fluid samples were processed and analyzed as described above, starting from the addition of 20% TCA/1% TFA.

2.2.4 Proteome Mass Spectrometry and Data Analysis

For Nepenthes pitcher fluid proteome characterizations, one pmol of digested substrate was injected into an Easy LC 1000 system (Thermo Scientific) operating in a nanoflow configuration (C18 trap, 75 µm i.d. × 2 cm, 3 µm particle diameter, and C18 column, 75 µm i.d. × 15 cm, 2 µm particle diameter, Dionex). Peptides were eluted with a linear gradient from 5-50% of mobile phase B (97% ACN, 2.9% H2O and 0.1% FA) over 120 minutes at a flow rate of 300 nL/min. Peptides detected in these analyses were selected for collision induced dissociation (CID) fragmentation and subsequent collection of LC-MS/MS data on a Thermo Orbitrap Velos ETD mass spectrometer. The data was searched in Mascot v2.3 (Matrix Science) against the NCBI Viridiplantae (green plants), Drosophila (fruit flies) and Bacteria (Eubacteria) databases using the following conditions: a mass tolerance of 20 ppm on precursor ions and 0.2 Da on fragment ions, fixed carbamidomethyl and variable methionine oxidation modifications, either digestion by trypsin or no enzyme specificity, and ESI-QUAD-TOF as the instrument type. A standard probability cut-off of p= 0.05 was also implemented. All data was manually verified. Mascot search results are provided in Appendix Tables A2.1-A2.12.
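
As an illustration of how the search tolerances quoted above translate into numbers, the following minimal sketch checks a precursor m/z against a 20 ppm window and a fragment m/z against a 0.2 Da window. The function names and example m/z values are hypothetical and are not part of the Mascot software or of the thesis data.

```python
def ppm_error(observed_mz, theoretical_mz):
    """Relative mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6

def precursor_within_tolerance(observed_mz, theoretical_mz, tol_ppm=20.0):
    """True if the precursor ion falls inside the relative (ppm) window."""
    return abs(ppm_error(observed_mz, theoretical_mz)) <= tol_ppm

def fragment_within_tolerance(observed_mz, theoretical_mz, tol_da=0.2):
    """True if the fragment ion falls inside the absolute (Da) window."""
    return abs(observed_mz - theoretical_mz) <= tol_da

# Illustrative values only: at m/z 1000, a 20 ppm window is +/- 0.02.
print(precursor_within_tolerance(1000.015, 1000.0))  # about 15 ppm, inside
print(precursor_within_tolerance(1000.025, 1000.0))  # about 25 ppm, outside
print(fragment_within_tolerance(500.15, 500.0))      # 0.15 Da, inside
```

A relative (ppm) window scales with m/z, whereas the absolute (Da) window used for fragment ions does not.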

2.2.5 Visualization of the Nepenthes fluid Proteome over Time

For visualization of the stability of the Nepenthes fluid over time, 120 µL of 150x concentrated fluid, fresh from the source in 50 mM glycine-HCl, pH 2.5, was incubated at 37 ˚C for 24 hours. At different time points, 20 µL of the 150 times concentrated fluid was removed and the pH was adjusted to 8 prior to storage at -20 ˚C. After the time-course, all samples were analyzed by SDS-PAGE (12% acrylamide). After incubation of the Nepenthes fluid for 2 weeks in the fridge in 10 mM glycine-HCl, pH 2.5, a 7-day time-course was undertaken using the same procedure and analyzed by SDS-PAGE (12% acrylamide). Due to limitations in fluid quantity, all samples were first concentrated by lyophilization in a speed-vac.

2.2.6 In-gel Processing

Excised gel bands, cut into ~1x1 mm pieces, were destained through two washes with 50% ACN/25 mM ammonium bicarbonate (AMBIC). Gel pieces were reduced with 10 mM DTT (1 hour, 56 ˚C) and alkylated with 25 mM IAA in 100 mM AMBIC (30 minutes at room temperature in the dark). After two washes with 100 mM AMBIC, the gel pieces were dehydrated with 100% ACN and then dried further in a speed-vac. The dehydrated samples were rehydrated in 25 mM AMBIC and digested overnight at 37 ˚C with 0.5 µg trypsin or porcine pepsin (Sigma Aldrich, lot no. 074K7717, cat. no. P6887). After digestion, peptides were extracted twice with 50% ACN/1% FA, dried by speed-vac and resuspended in 10 µL of 1% FA for subsequent LC-MS/MS analyses. 1 µL of the resuspended sample was injected into the LC-system and peptides were trapped and enriched on a Polaris High Resolution Chip containing a 150 mm x 75 µm, Polaris 3 µm C18A column (Agilent Technologies). Peptides were eluted with a linear gradient of 3-50% mobile phase B (97% ACN, 2.9% water and 0.1% FA) over 25 minutes at a flow rate of 0.3 µL/min. The eluted peptides were then analyzed on an Agilent 6550 iFunnel Q (quadrupole)-TOF (time-of-flight) mass spectrometer operating in positive auto MS/MS mode (CID fragmentation). The resulting spectra were searched against a miniature database in Mascot v2.3 (Matrix Science).

2.2.7 Activity Assays

Proteolytic activity was measured using a modified version of the hemoglobin activity assay first described by Anson et al., 1938 [126]. The optimized assay consisted of 4 µL of 10 times concentrated Nepenthes fluid mixed with 1250 µg bovine blood hemoglobin (Sigma Aldrich, lot no. 010157618V, cat. no. H2500) in 100 mM glycine-HCl, pH 2.5 in a final volume of 100 µL. The digestion was allowed to proceed for 30 minutes at 37 ˚C, 200 rpm, and then stopped with the addition of 100 µL of 10% TCA. The precipitate was removed by centrifugation (14000 g, 10 minutes) and 3 µL of the supernatant was used to measure the absorbance of TCA soluble peptides at a wavelength of 280 nm with a pathlength of 0.5 mm on a GE Nanovue plus spectrophotometer. All data points are the means of three technical and three biological replicates.
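
The handling of the absorbance readings described above can be sketched as follows. This is only an illustration under stated assumptions: that the raw A280 reading is taken at the 0.5 mm pathlength and is scaled to a 10 mm equivalent using the Beer-Lambert proportionality, and that technical replicates are averaged within each biological replicate before the biological replicates are summarised. The thesis does not state how the replicates were combined, and all names and values below are hypothetical.

```python
from statistics import mean, stdev

def normalise_to_10mm(absorbance, pathlength_mm=0.5):
    """Scale an absorbance to a 10 mm pathlength (A is proportional to pathlength)."""
    return absorbance * 10.0 / pathlength_mm

def summarise_activity(replicates, blank=0.0):
    """Mean and standard deviation across biological replicates.

    replicates: list of biological replicates, each a list of technical
                A280 readings (assumed layout).
    blank: reading from a no-enzyme control, subtracted from each mean.
    """
    bio_means = [mean(technical) - blank for technical in replicates]
    return mean(bio_means), stdev(bio_means)

# Illustrative readings only (three biological x three technical replicates).
readings = [[0.50, 0.52, 0.51], [0.48, 0.49, 0.50], [0.53, 0.52, 0.54]]
avg, sd = summarise_activity(readings, blank=0.05)
print(round(avg, 3), round(sd, 3))
```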

2.2.8 Pepstatin A Purification

For partial purification of nepenthesins from the native pitcher secretions, as fluid was limited, only 5 mL of 4 times concentrated Nepenthes fluid was incubated with 0.5 mL of pepstatin A resin coupled to agarose beads (G Biosciences, lot no. 10008076, cat. no. 786-789) overnight at 4 ˚C on a shaker. After incubation, the solution was passed through a 100 mL glass column with a 0.22 µm filter in order to retain the pepstatin A resin. The resin was washed with 5 column volumes of 50 mM glycine-HCl, pH 2.5 prior to elution with elution buffer A (0.1 M sodium bicarbonate, 0.5 M sodium chloride, pH 8.7). The eluted solution was then buffer exchanged with three 10-fold washes of 100 mM glycine-HCl, pH 2.5 in an Amicon Ultra centrifugal filter unit (10 kDa cut-off) and re-concentrated to the original volume (5 mL).
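
To give a sense of what three 10-fold washes achieve, the sketch below estimates the fraction of the original buffer components remaining after repeated dilution and re-concentration. It assumes ideal behaviour, namely that small solutes pass freely through the 10 kDa membrane and that the sample returns to its starting volume after each cycle; the function name is hypothetical.

```python
def residual_fraction(dilution_factor, cycles):
    """Fraction of the original small solutes left after repeated
    dilute-and-reconcentrate cycles (ideal, freely permeable solutes)."""
    return (1.0 / dilution_factor) ** cycles

# Three 10-fold washes leave about 0.1% of the elution buffer components,
# e.g. 0.5 M sodium chloride would fall to roughly 0.5 mM.
fraction = residual_fraction(10, 3)
print(fraction, 0.5 * fraction * 1000, "mM NaCl (approx.)")
```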

2.2.9 Determination of Cleavage Specificities

Digestions were carried out in-solution and data was collected using an AB Sciex TripleTOF 5600 mass spectrometer. Peptides were identified with Mascot (v2.3) from MS/MS data, using .mgf files created in Analyst TF v1.51. Data was mapped to sequences using the following search terms: a mass tolerance of 10 ppm on precursor ions and 0.6 Da on fragment ions, no modifications, and no enzyme specificity. A standard probability cut-off of p= 0.05 was implemented and matches near the cut-off were manually verified. For the Nepenthes fluid digestions, 8 µM protein solutions (XLF, PNK, and APLF) were digested with 2x concentrated fluid for 2 min at 37 °C. Details for the recombinant production of the substrates are as previously described [121]. After dilution to 1 µM substrate concentration, 10 pmol was injected into the chilled reversed-phase LC system (4 °C) connected to the mass spectrometer. The peptides were trapped on a 5 cm, 200 µm i.d. Onyx C18 monolithic column (Phenomenex Inc.) and eluted with an acetonitrile gradient from 10% to 40% over 10 min. Peptides detected in these analyses were selected for CID fragmentation for acquisition of MS/MS spectra. Spectra were searched against a miniature database in Mascot containing the desired sequences and results were manually verified.

2.3 Results and Discussion

2.3.1 In-solution Proteome Analyses of Nepenthes Pitcher Secretions

Although the proteolytic activity of the Nepenthes pitcher secretions is attributed to a combination of nepenthesin I and II, other proteases have been implicated in the native fluid [123]. Therefore, global proteome assessments of the Nepenthes pitcher secretions were undertaken in order to identify all of the protease components within the fluid. Inactivated Nepenthes fluid, with or without deglycosylation, was digested overnight, in-solution, with trypsin prior to LC-MS/MS analyses. The top 10 protein identification results of these global proteome analyses, searched against the NCBI Viridiplantae (green plants) database using Mascot, are summarized in Tables 2.1 and 2.2. As shown in Table 2.1, overnight digestion of inactivated Nepenthes fluid with trypsin identified mainly a mixture of heat shock and uncharacterized proteins. Protein sequence similarity searches (BLASTp) determined that the uncharacterized proteins were either of unknown function or have been implicated as components of cellular regulation but were not proteolytic in nature. However, the Mascot scores for many of the identified proteins appeared to be inflated by redundancy: few unique peptides were identified for each protein, and most matches were repeats of the same peptide (dupes). Nevertheless, the only protease identified was nepenthesin I, ranked third, with 35 dupes of only one unique peptide corresponding to a Mascot score of 1108 (Table 2.1). The Mascot score used for ranking identified proteins is a transformed measure of the probability that observed matches are random events, where higher scores correspond to events less likely to occur by chance. However, the absolute values of the scores can only be properly evaluated against a known level of background noise. 
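As background to the scoring discussion, Mascot reports an ion score as -10·log10(P), where P is the probability that the observed peptide match is a random event. The sketch below converts between the two. It applies to individual ion scores only; the protein scores in the tables are aggregated from ion scores, and the significance threshold also depends on the number of candidate peptides searched, which this sketch does not model. The function names are illustrative.

```python
import math


def ion_score_to_probability(score):
    """Probability that a match is random, given an ion score S = -10*log10(P)."""
    return 10 ** (-score / 10)


def probability_to_ion_score(probability):
    """Ion score corresponding to a random-match probability P."""
    if not 0 < probability <= 1:
        raise ValueError("probability must be in (0, 1]")
    return -10 * math.log10(probability)


# Each additional 10 score units corresponds to a 10-fold lower probability.
for score in (20, 40, 60):
    print(score, ion_score_to_probability(score))
```

Because the scale is logarithmic, many repeats of one marginal peptide can add up to a large protein score without adding independent evidence, which is the redundancy concern raised above.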
Searching the experimental data against the NCBI Drosophila (fruit flies) and Bacteria (Eubacteria) databases, using Mascot, identified Drosophila and bacterial proteins with Mascot scores similar to or higher than those observed for the identified plant proteins, suggesting a high level of background. However, no other proteases were identified when the data was searched against the bacteria and Drosophila databases. Given the high level of background observed and the low number of unique peptides identified, the results of this study, at best, implicate the presence of nepenthesins in the native Nepenthes pitcher secretions. Further optimization was, therefore, undertaken in order to increase the

Table 2.1. Initial proteome identification results for Nepenthes pitcher fluid digested with trypsin.

Rank  NCBI accession number  Protein name                          Taxonomy            Number of unique peptides identified*  Mascot score†
1     gi|38325811            Heatshock protein 70-1                Nicotiana tabacum   5 (47)                                 1657
2     gi|326499079           predicted protein                     Hordeum vulgare     8 (49)                                 1468
3     gi|61214233            nepenthesin-1                         Nepenthes gracilis  1 (36)                                 1108
4     gi|326492680           predicted protein                     Hordeum vulgare     6 (52)                                 936
5     gi|308044587           Uncharacterized protein LOC100501669  Zea mays            4 (26)                                 749
6     gi|226532205           Uncharacterized protein LOC100274495  Zea mays            2 (19)                                 713
7     gi|226510502           Uncharacterized protein LOC100273100  Zea mays            8 (38)                                 601
8     gi|326502504           Predicted protein                     Hordeum vulgare     2 (18)                                 587
9     gi|226507094           Uncharacterized protein LOC100273141  Zea mays            2 (15)                                 576
10    gi|326502344           Predicted protein                     Hordeum vulgare     1 (8)                                  501

*The total number of dupes (repeats of the same unique peptide) identified is shown in brackets.
† Ion cut-off score (p<0.05): 40

number of unique peptide identifiers and to reduce background. In the previous proteomics analyses, we noticed many intense ions in the obtained spectra that were not accounted for in the Mascot searches. We reasoned that addressing this issue would, perhaps, improve both the number of unique peptide identifiers and the Mascot scores of the identified proteins. The proteome of the Nepenthes pitcher secretions is thought to be highly glycosylated, which may have impacted peptide identifications. We, therefore, decided to deglycosylate the Nepenthes fluid prior to proteomics processing and identification. Initial optimization of the process showed that complete deglycosylation of ribonuclease B with PNGase F under the experimental conditions required prior denaturation (Figure 2.2A). Therefore, as a control, we spiked Nepenthes fluid with ribonuclease B prior to deglycosylation in order to observe the efficiency of the process. As seen in Figure 2.2B, a lower band corresponding to deglycosylated ribonuclease B was observed in the precipitated fraction after deglycosylation (lanes 1 and 2). However, its overall drop in intensity, as well as that of the remaining glycosylated ribonuclease B band, suggested that the Nepenthes fluid retained some proteolytic activity throughout the deglycosylation process even though the samples were initially boiled in both a detergent and a reducing agent. Although highly inefficient, the process showed some deglycosylation and so we subjected the sample to the same in-solution proteomics assessment as described above, digested with trypsin, to see if there was an improvement in protein identifications. Unfortunately, no improvements in protein sequence identifications were observed and the number of unique peptide identifiers remained unusually low. 
The low number of unique peptide identifiers for each protein identified may be attributable to the cleavage specificity of trypsin and the scarcity of K and R residues in the Nepenthes fluid proteome. Certainly, mature nepenthesin I only appears to contain one K whereas mature nepenthesin II does not appear to contain any K or R residues, suggesting that the proteome components of the Nepenthes pitcher secretions may have developed to become highly acidic in nature in order to match their environment. Overall, the proteome results appeared very similar to those described in previous Nepenthes proteome studies when searched against the NCBI Viridiplantae (green plants) database [125]. The same single unique peptide for nepenthesin I was identified yet again, ranked third, as well as a single unique peptide for a serine carboxypeptidase-like protease, although with a low Mascot score of 73 (Table 2.2). When the data was searched against the NCBI Drosophila (fruit flies) and Bacteria (Eubacteria) databases, using Mascot, background

Figure 2.2. Deglycosylation of the Nepenthes pitcher fluid.

A) Deglycosylation of ribonuclease B (RbB) with prior denaturation (lane 1) or without (lane 2). The glycosylated RbB load (2 µg) is shown in lane 3. B) Nepenthes pitcher fluid spiked with 2 µg RbB after deglycosylation (lane 1) or after protein precipitation (lane 2). The glycosylated RbB load (2 µg) is shown in lane 3.

levels remained relatively high, especially for bacterial proteins, due to the remaining protein components of the deglycosylation process. Given the high background levels, the data can, again, only suggest the presence of nepenthesins in the native fluid as the only proteases present. The presence of a serine-carboxypeptidase cannot be ruled out completely but appears unlikely, given that only a single unique peptide with a low number of dupes was identified and that its Mascot score was roughly one-tenth that of nepenthesin I and not far above the threshold of significance. All of the results from the current proteome assessments have suffered from a high level of background coupled to a relatively low number of unique peptide identifiers. In order to improve sequence coverage and possibly reduce background ions, trypsin was replaced with a non-specific protease, proteolytically active Nepenthes pitcher fluid, in the digestion step of the experimental proteome processing method described above. As the protein contents of the fluid are identical and no detectable auto-digestion is observed when using Nepenthes pitcher fluid as a protease [121], we expected the non-specific nature of the enzyme to improve protein identifications and reduce background contaminants. The top identification matches in the new proteome study were nepenthesin II and I, respectively (Table 2.3). The modified processing method increased the number of unique peptides identified for nepenthesin II from 0 to 40, corresponding to an increase in sequence coverage to 39%. For nepenthesin I, an improvement from 1 to 16 unique peptides was observed, corresponding to a sequence coverage increase from 2% to 21%. Furthermore, the levels of background proteins identified within the searched proteomes were also greatly reduced. 
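The argument about trypsin and the sequence coverage figures can be illustrated with a small in-silico sketch. It assumes the standard trypsin rule (cleavage C-terminal to K or R, but not before P) and computes coverage as the percentage of residues spanned by at least one identified peptide. The sequences below are hypothetical toy strings, not nepenthesin sequences, and the function names are illustrative.

```python
def tryptic_peptides(sequence):
    """Split a protein after K or R, except when the next residue is P."""
    if not sequence:
        return []
    peptides, start = [], 0
    for i, residue in enumerate(sequence[:-1]):
        if residue in "KR" and sequence[i + 1] != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    peptides.append(sequence[start:])
    return peptides


def sequence_coverage(sequence, peptides):
    """Percentage of residues covered by at least one matched peptide."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    covered = [False] * len(sequence)
    for peptide in peptides:
        if not peptide:
            continue
        start = sequence.find(peptide)
        while start != -1:
            for j in range(start, start + len(peptide)):
                covered[j] = True
            start = sequence.find(peptide, start + 1)
    return 100 * sum(covered) / len(sequence)


# Hypothetical sequence containing K/R: trypsin yields several peptides.
print(tryptic_peptides("AAKGGRPCCRDD"))
# Hypothetical sequence with no K/R: trypsin leaves it as one long peptide.
print(tryptic_peptides("ACDEFGHILMNPQ"))
# Coverage when only two of the peptides are identified.
print(round(sequence_coverage("AAKGGRPCCRDD", ["AAK", "DD"]), 1))
```

A protein with few or no K and R residues yields very few tryptic peptides, and those are often too long to identify, whereas a non-specific protease produces many overlapping peptides that raise coverage.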
As we were unsure whether over-digestion would result from the use of a non-specific protease, a shorter, 1-hour digestion with proteolytically active Nepenthes fluid was also performed. It gave fewer identifications and fewer unique peptides for each identification, but the same highly reduced background as the overnight digestion with active Nepenthes fluid (Appendix Tables A2.11 and A2.12). The results of this study, therefore, confirm that nepenthesins are present in the Nepenthes fluid and suggest that they are the only proteases.

Table 2.2. Proteome identification results for deglycosylated Nepenthes pitcher fluid digested with trypsin.

Rank  NCBI accession number  Protein name             Taxonomy               Number of unique peptides identified*  Mascot score†
1     gi|165292442           class IV chitinase                              4 (26)                                 836
2     gi|85682819            thaumatin-like protein   Nepenthes gracilis     2 (20)                                 830
3     gi|61214233            Nepenthesin-1            Nepenthes gracilis     1 (31)                                 765
4     gi|393387669           β-1,3-glucanase          Nepenthes alata        3 (28)                                 397
5     gi|167998797           Predicted protein        Physcomitrella patens  1 (7)                                  131
6     gi|294461233           unknown                  Picea sitchensis       1 (41)                                 117
7     gi|413924608           Hypothetical protein     Zea mays               1 (96)                                 108
8     gi|527192719           Hypothetical protein     Genlisea aurea         1 (69)                                 90
9     gi|205830697           Unknown protein 18       Pseudotsuga menziesii  1 (2)                                  85
10    gi|2493495             Serine-carboxypeptidase  Pisum sativum          1 (6)                                  73

*The total number of dupes (repeats of the same unique peptide) identified is shown in brackets.
† Ion cut-off score (p<0.05): 39

Overall, the results of the global proteome studies presented here have repeatedly identified, with confidence, nepenthesins as the only proteases in the native fluid. A serine-carboxypeptidase-like protease was identified with low significance in the deglycosylated proteome study, although against a high background. Therefore, although other uncharacterized or very low abundance proteases with no recognized homology to any known plant, Drosophila or bacterial proteases may be present, multiple attempts at identifying all possible proteases through full proteome characterization of Nepenthes pitcher secretions have led to the identification of only nepenthesin I and II.

Table 2.3. Proteome identification results for Nepenthes pitcher fluid digested with proteolytically active Nepenthes pitcher fluid

Rank  NCBI accession number  Protein name                              Taxonomy             Number of unique peptides identified*  Mascot score†
1     gi|409179880           Nepenthesin II                            Nepenthes mirabilis  40 (143)                               2762
2     gi|61214233            Nepenthesin I                             Nepenthes gracilis   16 (57)                                1082
3     gi|218201535           Hypothetical protein                      Oryza sativa         1 (1)                                  68
4     gi|508715246           ARM repeat superfamily protein            Theobroma cacao      1 (1)                                  67
5     gi|502179750           Predicted, uncharacterized protein        Cicer arietinum      1 (1)                                  61
6     gi|502146217           Predicted, acid phosphatase-like protein  Cicer arietinum      1 (1)                                  61
7     gi|255549692           Hypothetical protein                      Ricinus communis     1 (1)                                  61

*The total number of dupes (repeats of the same unique peptide) identified is shown in brackets.
† Ion cut-off score (p<0.05): 60

2.3.2 Activity and Cleavage Preferences of Nepenthes Pitcher Fluid

In order to determine the stability of the Nepenthes pitcher fluid proteome over time, processed Nepenthes pitcher secretions from the source were immediately acid activated with 10 mM glycine-HCl, pH 2.5, incubated over a 24-hour time-course at 37 ˚C and visualized by SDS-PAGE (Figure 2.3A). Surprisingly, an intense band at ~50 kDa was maintained throughout the entire time-course, while lower molecular weight protein bands appeared to decrease in intensity. In-gel processing, using porcine pepsin in the digestion step followed by LC-MS/MS analysis, identified the intense band at ~50 kDa as a combination of nepenthesin I, with 14 unique peptides identified corresponding to 26% sequence coverage, and nepenthesin II, with 18 unique peptides identified corresponding to 34% sequence coverage. After two weeks in the fridge, the acid activated Nepenthes fluid was once again subjected to a time-course, this time for 7 days at 37 ˚C. As seen in Figure 2.3B, the intense nepenthesin I/II band remained the dominant component of the proteome throughout the time-course, with very little background observed. At the 5-day point, no protein bands other than the one corresponding to nepenthesin I/II appeared to be present, although auto-digestion was observed (Figure 2.3B, lanes 4 and 5). The relative increase in intensity of the nepenthesin I/II band over time is believed to have been caused either by slow resolubilization of the sample after lyophilisation or by concentration of the sample through evaporation, and not by the maturation of additional nepenthesins. In support of this, comparison of the activity profiles of the two-week old fluid over time showed a downward trend in activity, with ~32% relative activity remaining after incubation at 37 ˚C for 7 days (Figure 2.4). 
Likewise, a downward trend was also observed for cleavage of all residues constituting APLF in the P1 and P1’ positions over time (Figure 2.5), although initially the cleavage specificities of the two-week old fluid appeared similar to those described previously [121].

Figure 2.3. Stability of acid activated Nepenthes fluid over time.

A) Processed Nepenthes fluid straight from the source was acid activated with 10 mM glycine-HCl, pH 2.5 and incubated at 37 ˚C, 200 rpm for 0 minutes (lane 1), 15 minutes (lane 2), 30 minutes (lane 3), 1 hour (lane 4) and 24 hours (lane 5). Each sample was then analyzed by SDS-PAGE (12% acrylamide gel). B) Two-week old processed, acid activated Nepenthes fluid (pH 2.5) stored in the fridge was incubated at 37 ˚C, 200 rpm for 0 days (lane 1), 1 day (lane 2), 3 days (lane 3), 5 days (lane 4) and 7 days (lane 5). Each sample was then analyzed by SDS-PAGE (12% acrylamide gel).

Figure 2.4. Relative activity (%) of two-week old Nepenthes fluid incubated over a 7-day time-course at 37 ˚C.

Relative proteolytic activity towards hemoglobin was measured at 37 ˚C from 0-7 days. The means of 3 independent replicates and their standard deviations are shown.
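The relative activity values can be derived from replicate absorbance readings as sketched below. This is a minimal illustration that assumes activity is normalized to the mean of the day 0 readings; the thesis does not state the exact normalization, the absorbance values are invented for illustration, and the uncertainty of the reference itself is not propagated.

```python
from statistics import mean, stdev


def relative_activity(readings_by_day, reference_day=0):
    """Return {day: (mean %, SD %)} relative to the reference-day mean.

    Assumption: each day has at least two independent replicate readings.
    """
    reference = mean(readings_by_day[reference_day])
    if reference <= 0:
        raise ValueError("reference readings must have a positive mean")
    summary = {}
    for day, readings in sorted(readings_by_day.items()):
        summary[day] = (
            100 * mean(readings) / reference,
            100 * stdev(readings) / reference,
        )
    return summary


# Hypothetical A280 readings (three replicates per time point).
example_readings = {
    0: [0.49, 0.50, 0.51],
    3: [0.30, 0.32, 0.31],
    7: [0.15, 0.16, 0.17],
}
activity = relative_activity(example_readings)
for day, (pct, sd) in activity.items():
    print(f"day {day}: {pct:.1f} +/- {sd:.1f} %")
```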

These results suggest that, within the context of proteolytically active fluid, nepenthesins are able to digest other fluid components and remain highly abundant over time. Even at the 7-day time point, where no proteins other than nepenthesins appeared to be present, the two-week old Nepenthes fluid was able to cleave at all residues in the P1 and P1’ positions. If other proteases were present in the native pitcher secretions, a difference in enzyme stability could have been expected and the results may have shown a drop in cleavage of certain residues throughout the time-course. Instead, a general drop in cleavage for all residues in the P1 and P1’ positions was observed within reasonable error. The sudden increase in cleavage preference for H and Y in the P1 position for day 3 compared to day 7 is likely attributable to the error associated with the low abundance of those residues in APLF (13 and 11, respectively) coupled with slight differences in experimental handling on a day-to-day basis (Figure 2.5A). This, however, cannot explain the difference in cleavage preference for A in the P1 position between day 3 and day 7, as APLF contains 21 alanines, suggesting some experimental error. Therefore, although most of the cleavage preferences appear to follow a general downward trend over time, the results of this crude study are inconclusive but, at least, indicate that there does not appear to be a large difference in stability between proteases, if more than one were present in the native fluid. Taken together, however, the results of these studies provide relatively strong evidence that nepenthesins are responsible for the proteolytic cleavage specificities of Nepenthes fluid, which is congruent with previous literature [118,125]. Although other proteases may still be present in the native pitcher fluid, their relative abundance and contributions to cleavage specificity were not discernible in these studies.
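The P1 and P1’ preferences discussed above can be tallied from identified peptides as sketched below. Each peptide terminus that lies inside the substrate marks a cleavage site; the residue before the site is P1 and the residue after it is P1’. The sketch assumes each peptide maps to its first occurrence in the substrate and counts each site once, which may differ from the counting used in the thesis. The sequence and peptides are hypothetical.

```python
from collections import Counter


def cleavage_sites(protein, peptides):
    """Set of positions i such that cleavage occurred between i-1 and i."""
    sites = set()
    for peptide in peptides:
        start = protein.find(peptide)
        if not peptide or start == -1:
            continue  # skip empty or unmatched peptides
        end = start + len(peptide)
        if start > 0:
            sites.add(start)   # N-terminal boundary of the peptide
        if end < len(protein):
            sites.add(end)     # C-terminal boundary of the peptide
    return sites


def cleavage_preferences(protein, peptides):
    """Relative % of cleavages by residue at P1 and at P1' (two dicts)."""
    sites = cleavage_sites(protein, peptides)
    total = len(sites)
    p1 = Counter(protein[s - 1] for s in sites)
    p1_prime = Counter(protein[s] for s in sites)
    as_percent = lambda counts: {aa: 100 * n / total for aa, n in counts.items()}
    return as_percent(p1), as_percent(p1_prime)


# Hypothetical substrate and peptides, for illustration only.
p1_pct, p1_prime_pct = cleavage_preferences("ACDEFGHIK", ["CDE", "FGH"])
print(p1_pct)
print(p1_prime_pct)
```

Residues that are rare in the substrate contribute few sites, so their percentages fluctuate more between runs, which is consistent with the variability noted for H and Y.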

Figure 2.5. Proteolytic selectivity of activated Nepenthes fluid over time under physiological conditions.

A) C-terminal cleavage (P1 position) and B) N-terminal cleavage (P1’ position) preferences are shown as the relative % cleavage to the total. Changes in cleavage preferences were determined by LC-MS/MS analysis of digests of 1 model protein (APLF) with 2x concentrated two-week old Nepenthes fluid that had been incubated at 37 ˚C for 0-7 days. The in-solution digestion conditions were 2 minutes at 37 ˚C.

2.3.3 Pepstatin A Purified Nepenthes Pitcher Fluid

In order to specifically show that nepenthesins are responsible for the observed proteolytic cleavage preferences of the native plant fluid, we partially purified native nepenthesins with agarose beads coupled to pepstatin A, a highly specific inhibitor of aspartic proteases. The cleavage specificities of the partially purified protease fraction were compared to the 80x concentrated, processed crude fluid. Based on our previous estimation of a total protein concentration of 22 ng/µL for the 80 times concentrated fluid [121], the total protein concentration in the purified fraction, after concentration to the same volume as prior to purification, was judged to be below the limit of detection for the bicinchoninic acid (BCA) assay previously used. Therefore, equivalent volumes of the crude and purified protease fractions were used instead as an estimation of equivalent nepenthesin loads. Overall activity for the pepstatin A purified fraction appeared lower than that of the crude fraction, as would be expected from loss or inactivation of protease(s) during the purification procedure. However, although less prominent, cleavages at all residues in the P1 and P1’ positions were observed with the purified protease fraction (Figure 2.6). These results further suggest that nepenthesins or, at the very least, aspartic proteases are responsible for all of the proteolytic activity and previously described cleavage preferences of Nepenthes pitcher fluid [118,121,125].

Figure 2.6. Proteolytic selectivity of Nepenthes pitcher fluid purified with pepstatin A under physiological conditions.

A) C-terminal cleavage (P1 position) and B) N-terminal cleavage (P1’ position) preferences are shown as relative % cleavage to the total. Cleavage preferences were determined by digestion of 2 model proteins (PNK and XLF) with 2 µL of either pepstatin A purified Nepenthes fluid or 20 times concentrated processed Nepenthes fluid for 5 minutes at 37 ˚C.

2.4 Conclusions

In summary, multiple attempts at global proteome assessments of Nepenthes pitcher secretions using various proteases for identification (trypsin and proteolytically active Nepenthes fluid), environments (with/without deglycosylation) and digestion times (overnight and 1-hour) have identified nepenthesin I and II as the only proteases present. SDS-PAGE analysis of the proteome has revealed that nepenthesins appear to be, by far, the most abundant and stable proteins in prolonged acidic environments and that their breakdown corresponds with loss of activity and cleavage preferences. Furthermore, a fraction of nepenthesins partially purified with pepstatin A retained cleavage at all residues in the P1 and P1’ positions relative to the Nepenthes pitcher fluid. All of the results presented in this chapter, thus far, lead to the conclusions that: 1) nepenthesins appear to be the predominant proteases in the native Nepenthes pitcher secretions, 2) no other proteases appear to be present and therefore 3) recombinant production of nepenthesin I and II appears to be the most promising means of reconstituting the proteolytic activity of the native Nepenthes pitcher secretions.

2.5 Contributions to the Chapter

My contributions to this chapter included: the experimental design, processing and analysis of global proteome samples and in-gel digestions. After processing, samples were submitted to the Southern Alberta Mass Spectrometry Core Facility (Calgary, AB) for LC-MS/MS, which was performed by Dr. Laurent Brechenmacher. All other biochemical assays including SDS-PAGE, pepstatin purification, determination of cleavage specificities and activity assays were designed, performed and analyzed by myself. As well, all figures shown in this chapter were made by me.

Chapter Three: Recombinant Reconstitution of the Proteolytic Activity of Nepenthes Pitcher Fluid

3.1 Introduction

Ever since Hooker’s observations in the 1870s that the pitcher secretions of the Nepenthes genus of carnivorous plants contained proteolytic activity, uncertainty lingered as to whether the source of the observed activity was microbial or enzymatic in nature [1]. It has since been established that the main and perhaps only proteolytic source of the pitcher fluid of Nepenthes plants arises from the aspartic proteases, nepenthesin I and II [112-114]. Our own proteomic studies, as described in the previous chapter, have provided evidence in further support of this statement. Therefore, pursuing recombinant production of nepenthesin I and II appears to be the most promising means of reconstituting the proteolytic activity of the native Nepenthes pitcher fluid. In this chapter, we describe our attempts at generating recombinant nepenthesin I and II from a bacterial system, Escherichia coli (E. coli), and a eukaryotic yeast-based system, Pichia pastoris (P. pastoris). Strategies for recombinant production in the E. coli system include: intracellular expression with the addition of solubilisation tags, periplasmic expression and refolding of insolubly expressed proteins. Furthermore, in order to study the role of glycosylation for the recombinant products, attempts to generate the glycosylated form of nepenthesin I were carried out in the P. pastoris system. Ultimately, successful recombinant production of both targets was achieved and optimized in the E. coli based expression system, which allowed the characterization of the enzymatic and biochemical properties of the recombinant enzymes. The experimental data was then compared to previous literature containing similar data generated for the crude pitcher fluid and the native forms of the two enzymes purified from the natural pitcher fluid in order to determine whether successful reconstitution of proteolytic activity was achieved [118]. 
Based on the available evidence and the results of our own studies, we hypothesize that successful recombinant production of nepenthesin I and II will be sufficient to reconstitute the proteolytic properties of native Nepenthes pitcher secretions for development towards further applications, such as a therapeutic for the treatment of CD.

3.2 Experimental Procedures

3.2.1 Chemicals

Water and acetonitrile, HPLC grade from Burdick and Jackson, were purchased from VWR. Ampicillin was purchased from Invitrogen and tris(2-carboxyethyl)phosphine (TCEP), from Pierce, was purchased from Thermo Scientific. NaCl and Tris, from AMRESCO, were purchased from VWR and the BugBuster protein extraction reagent was purchased from Novagen Inc (lot no. D00115050, cat. no. 70584-3). Spectra/Por dialysis membranes (50 mm width, 32 mm diameter, 8000 MW cut-off) were purchased from Spectrum Laboratories Inc (lot no. 3263540, cat. no. 132665). All other chemicals and reagents were purchased from Sigma Aldrich.

3.2.2 Plasmid Preparation

The gene corresponding to pro-nepenthesin I (residues 25-437, UniProt: Q766C2), from Nepenthes gracilis, was amplified from a pET21d vector encoding the full gene, kindly provided by Dr. Hideshi Inoue (Tokyo University of Pharmacy and Life Science), and cloned into two custom vectors, e3884 and cv16, using the forward primer: 5’-ACT GAC AGATCT ACG TCA AGA ACA GCT CTC AAT C-3’, which contains a BglII restriction site (AGATCT), and the reverse primer: 5’-CTG TCT AAGCTT TTA CGA CGC ACC ACA TTG AG-3’, which contains a HindIII restriction site (AAGCTT). The gene corresponding to pro-nepenthesin II (residues 25-438, UniProt: Q766C3), from Nepenthes gracilis, was amplified from a synthesized pUC57 vector encoding the full gene from Genscript, USA and cloned into the same two custom vectors described above (e3884 and cv16) using the forward primer: 5’-ACT GAC GGATCC ACA TCG AGA GGC ACA TTA TTG CAT C-3’, which contains a BamHI restriction site (GGATCC), and the reverse primer: 5’-CTG TCT AAGCTT TTA GCT CGC ACC GCA CTG-3’, which contains a HindIII restriction site (AAGCTT). The N-terminal signal peptides (residues 1-25) were removed in both protein constructs. The plasmid maps for the e3884 and cv16 vectors are shown in Appendix 3.1 and 3.2, respectively.
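A quick sanity check on primers of this kind is to confirm that each one contains the recognition sequence of its intended restriction enzyme. The sketch below does this for the four primers listed above, using the standard recognition sequences for BglII (AGATCT), BamHI (GGATCC) and HindIII (AAGCTT). The dictionary keys and function name are illustrative labels, not identifiers from the thesis.

```python
# Standard recognition sequences (5' to 3') of the enzymes named in the text.
RESTRICTION_SITES = {"BglII": "AGATCT", "BamHI": "GGATCC", "HindIII": "AAGCTT"}

# Primer sequences as given in the text, paired with the intended enzyme.
PRIMERS = {
    "pro-nepenthesin I forward": ("ACT GAC AGATCT ACG TCA AGA ACA GCT CTC AAT C", "BglII"),
    "pro-nepenthesin I reverse": ("CTG TCT AAGCTT TTA CGA CGC ACC ACA TTG AG", "HindIII"),
    "pro-nepenthesin II forward": ("ACT GAC GGATCC ACA TCG AGA GGC ACA TTA TTG CAT C", "BamHI"),
    "pro-nepenthesin II reverse": ("CTG TCT AAGCTT TTA GCT CGC ACC GCA CTG", "HindIII"),
}


def site_position(primer, enzyme):
    """0-based index of the enzyme's site in the primer, or -1 if absent."""
    sequence = primer.replace(" ", "").upper()
    return sequence.find(RESTRICTION_SITES[enzyme])


for name, (primer, enzyme) in PRIMERS.items():
    print(f"{name}: {enzyme} site at index {site_position(primer, enzyme)}")
```

In all four primers the site follows a six-nucleotide 5’ overhang, which gives the enzyme flanking sequence to bind when digesting the PCR product.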

Polymerase chain reaction (PCR) amplified products for both enzymes were cloned in frame with either an N-terminal his-tag followed by a maltose binding protein (MBP) tag in the e3884 vector construct or an N-terminal leader peptide followed by a his-tag in the cv16 vector construct. Each construct was verified by restriction digestion analysis and DNA sequencing.

3.2.3 Intracellular and Periplasmic Protein Expression and Purification Trials in E. coli

E. coli BL21 (DE3) cells were transformed with either the e3884 (intracellular expression) or cv16 (periplasmic expression) vector containing pro-nepenthesin I or II. A 5% volume starter culture in 2XYT media with ampicillin (100 µg/mL) was incubated overnight (16 hours) at 37 ˚C. After inoculation of large scale cultures of 2XYT media with 100 µg/mL ampicillin with the starter cultures, cells were grown to an OD600nm of ~0.8 prior to induction with a final concentration of 0.8 mM isopropyl-β-D-thiogalactopyranoside (IPTG) then incubated overnight at 30 ˚C or 18 ˚C, 200 rpm. A final volume of 100 mL of culture was used for initial trials. For periplasmic expression, after the final overnight incubation, cells were harvested by centrifugation (4000 rpm, 30 min, 4 ˚C) and resuspended in sucrose solution (30 mM Tris-HCl, pH 8, 20% sucrose) prior to incubation at room temperature for 20 minutes. Cells were harvested once again by centrifugation then rapidly resuspended in ice-cold water and allowed to incubate for 20 minutes on ice. The supernatant containing the osmotic shock fraction was then isolated from cells by centrifugation and analyzed by SDS-PAGE. In-solution Ni-NTA purification of the osmotic shock fraction was performed to identify any secreted recombinant products. After incubation of Ni-NTA beads in the osmotic shock fraction for 3 hours at 4 ˚C, the beads were washed twice with wash buffer B (20 mM Tris-base, 300 mM NaCl, 10 mM imidazole, pH 8) and eluted with elution buffer B (20 mM Tris-base, 300 mM NaCl, 0.4 M imidazole, pH 8). The eluted solution was also analyzed by SDS-PAGE. For intracellular expression, after the final overnight incubation, cells were lysed by incubation in BugBuster protein extraction reagent for 10 minutes at room temperature. Centrifugation (13000 rpm, 10 min, 4 ˚C) allowed separation of the lysed insoluble and soluble fractions. The soluble fraction was then incubated in and purified with Ni-NTA beads as described above for the periplasmic expression protocol in order to identify whether soluble recombinant products were secreted. All fractions were analyzed by SDS-PAGE.
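The culture additions in this protocol follow the usual C1·V1 = C2·V2 dilution relation. The sketch below works through the 100 mL trial scale; the stock concentrations (1 M IPTG, 100 mg/mL ampicillin) are assumptions chosen for illustration and are not stated in the thesis, and the small volume added by the stock itself is ignored.

```python
def stock_volume(final_conc, final_volume, stock_conc):
    """Volume of stock needed, in the same unit as `final_volume`.

    `final_conc` and `stock_conc` must share one unit (C1*V1 = C2*V2).
    """
    if stock_conc <= 0 or final_conc > stock_conc:
        raise ValueError("stock must be positive and more concentrated than the target")
    return final_conc * final_volume / stock_conc


culture_ml = 100.0  # initial trial scale stated in the protocol

# 0.8 mM IPTG final, from an assumed 1 M (1000 mM) stock.
iptg_ul = stock_volume(0.8, culture_ml, 1000.0) * 1000

# 100 ug/mL ampicillin final, from an assumed 100 mg/mL (100000 ug/mL) stock.
ampicillin_ul = stock_volume(100.0, culture_ml, 100000.0) * 1000

# A 5% (v/v) starter culture for a 100 mL culture.
starter_ml = 0.05 * culture_ml

print(f"IPTG stock: {iptg_ul:.0f} uL")
print(f"ampicillin stock: {ampicillin_ul:.0f} uL")
print(f"starter culture: {starter_ml:.0f} mL")
```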

3.2.4 Insoluble Protein Expression and Purification Trials in E. coli

E. coli BL21 (DE3) cells were transformed with the e3884 vector containing pro-nepenthesin I or II. A 5% volume starter culture in 2XYT media with ampicillin (100 µg/mL) was incubated overnight (16 hours) at 37 ˚C. Large scale cultures of 2XYT media with 100 µg/mL ampicillin were inoculated with the starter cultures the next day and cells were grown to an OD600nm of ~0.8 prior to induction with a final concentration of 0.8 mM IPTG followed by incubation overnight at 30 ˚C, 200 rpm. Overnight cultures were harvested by centrifugation (4000 rpm, 30 min, 4 ˚C) then passed 4 times through a homogenizer for cell lysis after resuspension in Buffer C (50 mM Tris, 0.5 M NaCl, pH 8) supplemented with a cocktail of protease inhibitors (Roche, lot no. 16741000, cat. no. 04693159001). Lysed cells were harvested by centrifugation (4000 rpm, 30 min, 4 ˚C) and the pellet was resuspended in Buffer C prior to centrifugation (12000 rpm, 15 min, 4 ˚C) through a sucrose cushion (25% (w/v) sucrose, 50 mM Tris-HCl, 1 mM ethylenediaminetetraacetic acid (EDTA), pH 7.4). The pellet was then resuspended in Buffer D (50 mM Tris-HCl, 10 mM NaCl, 1 mM β-mercaptoethanol, 0.5% Triton X-100 (v/v), pH 7.4) and incubated for 30 min at 37 ˚C prior to centrifugation (12000 rpm, 15 min, 4 ˚C). Finally, the isolated inclusion bodies were washed twice with Buffer D without Triton X-100 prior to storage at -80 ˚C. For protein refolding, the isolated inclusion bodies were first resuspended in Buffer E (50 mM Tris-HCl, 8 M , 1 mM EDTA, 1 mM glycine, 500 mM NaCl, 300 mM β-mercaptoethanol, pH 10.5) then incubated at 37 ˚C, 200 rpm for 1 hour prior to two dialyses against 5x the volume of 50 mM Tris-HCl (pH 11) in 1-hour steps. The solubilized inclusion bodies were then dialyzed against 50 mM Tris (pH 7.5) overnight at 4 ˚C, then against refolding buffer (50 mM 3-(N-morpholino)propanesulfonic acid (MOPS), 300 mM NaCl, pH 7) for 24 hours at 4 ˚C. After refolding, precipitated material was removed by centrifugation (12000 rpm, 15 min, 4 ˚C). The supernatant was adjusted to pH 2.5 with a final concentration of 100 mM glycine-HCl then incubated overnight at 4 ˚C. Afterwards, the solution was centrifuged again to remove precipitated material, concentrated with an Amicon Ultra centrifugal concentrator with a 10 kDa cut-off and washed 1000x with 100 mM glycine-HCl, pH 2.5 prior to purity and size analysis by SDS-PAGE. This was considered the standard refolding protocol.

3.2.5 Size-exclusion Chromatography

Size-exclusion chromatography was performed with a Superdex S-75 size-exclusion chromatography column (Amersham Biosciences) on a Bio-Rad BioLogic DuoFlow chromatography system. All samples were eluted with 20 mM glycine-HCl, pH 2.5 at a flow rate of 0.5 mL/min.

3.2.6 Protein Identification, Function and Disulfide Bond Formation

Incubation of recombinant nepenthesin I and II in the presence or absence of 150 mM TCEP was performed to probe for disulfide bond formation. SDS-PAGE analysis (12% acrylamide gel) of a 5-minute digestion of 4 µg BSA (Sigma Aldrich, lot no. SLBB9570V, cat. no. A7906) with recombinant nepenthesin I or II at 37 ˚C, pH 2.5 was performed to probe for proteolytic activity. LC-MS/MS analyses of in-gel digestions of putative recombinant nepenthesin I and II bands were performed essentially as described in Sections 2.2.6 and 2.2.9.

3.2.7 Determination of Enzyme Size

Matrix-assisted laser desorption/ionization combined with time of flight (MALDI-TOF) was performed to confirm intact protein size and successful purification. Two picomoles of recombinant nepenthesin I or II were mixed with a saturated solution of sinapinic acid dissolved in 50% ACN/1% FA on a MALDI plate prior to analysis on an AB Sciex MALDI TOF/TOF 5800 mass spectrometer. Equine myoglobin (Sigma Aldrich, lot no. 069K7027, cat. no. M1882) and cytochrome C (Sigma Aldrich, lot no. 102K7053, cat. no. C7752) were used to calibrate the mass spectrometer. The data was processed in mMass [127].

3.2.8 Glycosylated Protein Expression Trials of Nepenthesin I in Pichia pastoris

The gene for pro-nepenthesin I (residues 25-437, UniProt: Q766C2) was amplified and cloned into 2 vectors, pJAG-aMF and pJAN-aMF, using PCR-generated 5’ and 3’ BsaI sites in frame with an N-terminal S. cerevisiae α-mating factor secretion signal (α-MF). The plasmids containing pro-nepenthesin I were then transformed into P. pastoris Bg10 cells, which were plated on YPD plates (1% yeast extract, 2% peptone, 2% dextrose, 2% agar) with either G418 (800 µg/ml) or nourseothricin (100 µg/ml) and incubated at 30 ˚C. A single colony was used to inoculate BMGY media (1% yeast extract, 2% peptone, 1.34% yeast nitrogen base, 0.75% glycerol, 100 mM potassium phosphate, pH 6) and was incubated for 24 hours at 30 ˚C. Cells were then induced with BMMY media (1% yeast extract, 2% peptone, 1.34% yeast nitrogen base, 100 mM potassium phosphate, pH 6, 1% methanol) and harvested after 2-3 days by centrifugation (4 ˚C, 715 g). The supernatant, containing secreted proteins, was then concentrated 10-fold and half of the concentrated solution was acid activated overnight in the fridge with a final concentration of 100 mM glycine-HCl, pH 2.5. The resulting acid activated fraction was used to digest 4 µg of BSA for 5 minutes at 37 ˚C to probe for proteolytic activity. All fractions were analyzed by SDS-PAGE (12% acrylamide).

3.2.9 Activity Assays

Proteolytic activity was measured using a modified version of the hemoglobin activity assay first described by Anson et al., 1938 [126]. The absorbance at a wavelength of 280 nm was compared to increasing amounts of enzyme under the modified assay conditions in order to optimize each assay within the signal range and prevent signal saturation. The optimized assay consisted of 400 ng of porcine pepsin (Sigma Aldrich, lot no. 074K7717, cat. no. p6887) or 1 µg of nepenthesin I or II mixed with 1250 µg bovine blood hemoglobin (Sigma Aldrich, lot no. 010K7618, cat. no. H2500) in 100 mM glycine-HCl, pH 2.5 in a final volume of 100 µL. Digestions were allowed to proceed for 30 minutes at 37 ˚C, 200 rpm, then stopped with the addition of 100 µL of 10% TCA. The precipitate was removed by centrifugation (14000 g, 10 min) and 3 µL of the supernatant was used to measure the absorbance of TCA-soluble peptides at a wavelength of 280 nm with a pathlength of 0.5 mm on a GE Nanovue plus
spectrophotometer. All data points are the mean of three technical and three biological replicates. This was considered the "standard activity assay." For determination of the half-maximal inhibitory concentration (IC50) of pepstatin A, final concentrations of 0.001-10 µM of pepstatin A (Sigma Aldrich, lot no. 087K8613, cat. no. P5378) dissolved in 2% dimethyl sulfoxide (DMSO) were used in conjunction with the standard activity assay. Enzyme concentrations were adjusted to 0.25 µM for porcine pepsin and 0.5 µM for nepenthesin I and II. The data was fitted in Igor Pro (Wavemetrics) using the Hill equation, taking into account standard deviations. For determination of the pH profile, the standard activity assay was modified with buffers ranging from pH 1.6-8 (100 mM). Glycine-HCl was used as the buffer for pH 1.6-3, ammonium acetate for pH 4-5, citrate for pH 6-7 and Tris-base for pH 8. For determination of the temperature profile (4-80 ˚C), the amount of enzyme was modified (50 ng of porcine pepsin or 100 ng of nepenthesin I or II) and the digestion time was reduced to 15 minutes to prevent signal saturation in the standard activity assay. For determination of the stability profile across pH, a 1 µg/µL solution of each enzyme containing 100 mM of buffer ranging from pH 1.6-8 was incubated at 37 ˚C. After 7 and 30 days, the standard activity assay was used to determine the remaining activity at pH 2.5. The standard activity assay was used for determining relative activity in the presence of the denaturing agents, urea or guanidine hydrochloride (GndCl) in a concentration dependent manner (1-6 M). The reducing agent TCEP was not used in these assays as it reacted under the experimental conditions to produce saturated signals.
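The unit bookkeeping behind the assay conditions above can be sketched as follows. This is a minimal illustration, not part of the original protocol: the helper names are hypothetical, the molecular masses are the MALDI-TOF values reported later in this chapter, and the pathlength scaling simply assumes Beer-Lambert linearity.

```python
# Sketch only: unit bookkeeping for the standard activity assay.
# Molecular masses are the MALDI-TOF values reported in this chapter;
# the helper names are illustrative and not part of the original protocol.

NEPENTHESIN_I_DA = 37461.0   # measured mass of mature nepenthesin I
NEPENTHESIN_II_DA = 37507.0  # measured mass of mature nepenthesin II


def concentration_um(mass_ng, mw_da, volume_ul):
    """Molar concentration (µM) of mass_ng of protein dissolved in volume_ul."""
    nmol = mass_ng / mw_da            # ng / (g/mol) = nmol
    return nmol / volume_ul * 1000.0  # nmol/µL is mM, so scale to µM


def mass_ng_for(conc_um, mw_da, volume_ul):
    """Mass (ng) of protein needed to reach conc_um in volume_ul."""
    return conc_um / 1000.0 * volume_ul * mw_da


def to_10mm_pathlength(absorbance, pathlength_mm=0.5):
    """Scale an absorbance reading linearly (Beer-Lambert) to a 10 mm path."""
    return absorbance * 10.0 / pathlength_mm


if __name__ == "__main__":
    # 1 µg of nepenthesin I in the 100 µL standard assay
    print(round(concentration_um(1000, NEPENTHESIN_I_DA, 100), 3))
    # Mass of nepenthesin I giving 0.5 µM in 100 µL (IC50 assay conditions)
    print(round(mass_ng_for(0.5, NEPENTHESIN_I_DA, 100), 1))
```

Under these assumptions, 1 µg of nepenthesin I in 100 µL corresponds to roughly 0.27 µM, so the 0.5 µM used for the IC50 measurements requires a little under 2 µg of enzyme per assay.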

3.2.10 Digestion Mapping by LC-MS/MS

For determination of the cleavage preferences of the recombinant products, 8 µM of substrate (XRCC4, XLF, PNK, BRCT and equine myoglobin) was mixed with 100 ng of nepenthesin I, II or Nepenthes fluid (estimated as previously described [11]) and the digestion was allowed to proceed in solution for 5 minutes at 37 ˚C, after which the pH of the solution was raised to 8 in order to attenuate the reaction. The recombinant production of the substrates was as previously described [121]. One pmol of digested substrate was then injected into an Easy LC 1000 system (Thermo Scientific) operating in a nanoflow configuration (C18
trap, 75 µm i.d. × 2 cm, 3 µm particle diameter, and C18 column, 75 µm i.d. × 15 cm, 2 µm particle diameter, Dionex). Peptides were eluted with a linear gradient from 5-50% of mobile phase B (97% ACN, 2.9% H2O and 0.1% FA) over 25 minutes at a flow rate of 350 nL/min. Peptides detected in these analyses were selected for CID fragmentation and subsequent collection of LC-MS/MS data on a Thermo Orbitrap Velos ETD mass spectrometer. Searching of the peptides for identification of cleavage sites was performed as previously described in Section 2.2.4.
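Once peptides have been identified, the P1 and P1’ residues at each cleavage site can be tallied. The sketch below is only an illustration of that bookkeeping and is not the search procedure of Section 2.2.4: the function names and the toy sequence are hypothetical, each peptide is assumed to map to a single location in the substrate, and each unique cleaved bond is counted once.

```python
from collections import Counter


def cleavage_sites(protein, peptides):
    """Return cleaved bond positions; position i is the bond between
    protein[i - 1] (P1) and protein[i] (P1')."""
    sites = set()
    for pep in peptides:
        start = protein.find(pep)  # assumes a unique match in the substrate
        if start == -1:
            continue               # peptide not found in this substrate
        end = start + len(pep)
        if start > 0:              # N-terminal end of the peptide was cleaved
            sites.add(start)
        if end < len(protein):     # C-terminal end of the peptide was cleaved
            sites.add(end)
    return sites


def residue_preferences(protein, peptides):
    """Percent of unique cleavage sites with each residue at P1 and at P1'."""
    sites = cleavage_sites(protein, peptides)
    total = len(sites)
    if total == 0:
        return {}, {}
    p1 = Counter(protein[i - 1] for i in sites)
    p1_prime = Counter(protein[i] for i in sites)
    as_percent = lambda counts: {aa: 100.0 * n / total for aa, n in counts.items()}
    return as_percent(p1), as_percent(p1_prime)


if __name__ == "__main__":
    toy_protein = "AKLFDERPGHL"            # arbitrary toy sequence, not a real substrate
    toy_peptides = ["AKL", "FDER", "PGHL"]  # hypothetical identified peptides
    p1, p1_prime = residue_preferences(toy_protein, toy_peptides)
    print(p1, p1_prime)
```

Counting unique bonds rather than peptide observations keeps abundant, repeatedly sampled peptides from dominating the tally; weighting by spectral counts would be an equally defensible choice.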

3.2.11 Inhibition of Nepenthes Fluid

Final concentrations of 100 µM pepstatin A, 1 mM leupeptin (Sigma Aldrich, lot no. 038K8611, cat. no. L2884) and 1 mM EDTA were incubated with acid processed Nepenthes fluid overnight in the fridge. The next day, an estimated 100 ng of total protein from the inhibitor treated fluids were used to digest 10 µg of BSA for 15 minutes at 37 ˚C. The resulting digests were analyzed by SDS-PAGE (12% acrylamide).

3.3 Results and Discussion

3.3.1 Recombinant Production and Optimization Trials in E. coli

In order to maximize the chances of success, multiple production strategies in E. coli were employed. The genes for pro-nepenthesin I and II, with the signal peptide removed, were each cloned into two different vectors, e3884 and cv16, representing a three-tier protein expression strategy. The e3884 vector is designed for intracellular expression and the construct contains an N-terminal 6x his-tag, for purification, and an MBP tag, for increased solubility, in frame with the genes encoding pro-nepenthesin I or II. The cv16 vector, on the other hand, is designed for periplasmic expression and contains an N-terminal leader peptide followed by a 6x his-tag, for purification, in frame with the genes encoding pro-nepenthesin I or II. Direct purification of soluble protein is favored for eventual therapeutic purposes so these two strategies were pursued first. Intracellular expression was expected to result in the highest protein yields without the use of detergents whereas periplasmic expression was expected to result in lower protein yields but a high percentage of correctly folded product(s). As a last resort, refolding of insolubly expressed proteins isolated from inclusion bodies was considered using the e3884 vector containing the desired pro-nepenthesin constructs. This third strategy was considered a last resort since purification through the use of harsh detergents is not ideal for therapeutic purposes. However, the method was considered viable for the initial purposes of characterizing the efficacy of the enzymes as a therapeutic for CD, or for other applications such as hydrogen/deuterium exchange (H/DX) mass spectrometry (MS) for studying protein interactions and conformational changes, for which the enzymes were initially characterized [121].

Intracellular Expression

Small-scale recombinant expression trials were initially performed for each pro-nepenthesin-vector combination at two temperatures, 30 ˚C and 18 ˚C, in order to identify any soluble protein expression. Although not clear-cut, protein expression at lower temperatures is thought to enhance the ratio of soluble:insoluble protein production by slowing down expression, thereby allowing more time for proper protein folding. As shown in Figure 3.1B, intracellular expression of pro-nepenthesin I and II was observed only in the insoluble fraction when
expressed at both 30 ˚C (lanes 1 and 4) and 18 ˚C (not shown), as determined by LC-MS/MS analyses of in-gel digestions of the putative bands. Ni-NTA purification confirmed that the nepenthesins were not found in the soluble fractions (Figure 3.1B, lanes 3 and 6). Plant proteins are notoriously difficult to produce recombinantly in E. coli systems so the results obtained from the intracellular expression trials were not unexpected.

Periplasmic Expression

As nepenthesins are secreted from the secretory glands into the pitchers of Nepenthes plants, we reasoned that expression in the periplasmic space of E. coli, where multiple chaperones exist to aid in proper protein folding [128], coupled with secretion into the media, mimicking native conditions, might promote soluble protein expression. As expected, the total protein yields in the osmotic shock fractions of cells grown overnight at 30 ˚C and 18 ˚C were very low compared to those observed for the intracellular expression trials in E. coli cells. After Ni-NTA purification of the osmotic shock fractions for cells grown overnight at 30 ˚C, no protein bands were observed (Figure 3.2B, lanes 2 and 4), requiring 10-fold concentration of the sample for visualization by SDS-PAGE. Initially, it appeared that protein bands corresponding to the correct molecular weights of pro-nepenthesin I and II had been purified by Ni-NTA from the osmotic shock fractions (Figure 3.2C, lanes 1 and 2 respectively). However, LC-MS/MS analyses of in-gel digests of the putative bands determined that the protein bands were in fact secreted bacterial proteins. Although it remains possible that some soluble pro-nepenthesins were produced and secreted into the osmotic shock fraction, a scale-up in production, within the limits of our laboratory, would not have produced sufficient amounts of the mature forms of the recombinant targets for the purposes of the studies outlined in this thesis, given that they were not discernible by SDS-PAGE and LC-MS/MS analyses even after 10-fold concentration. Therefore, as a last resort, refolding of insolubly expressed nepenthesins isolated from E. coli inclusion bodies was pursued.

Figure 3.1. Results of intracellular recombinant production trials for nepenthesin I and II.

A) The protein construct from N- to C-terminal in the e3884 vector used for intracellular expression is shown. B) After overnight incubation at 30 ˚C, cells were lysed and expression of the pro-nepenthesin I (pro-nepI) and II (pro-nepII) constructs was observed only in the insoluble fractions (lanes 1 and 4 respectively). Expression of the pro-nepI and II constructs was not observed in the soluble fractions (lanes 2 and 5 respectively), which was confirmed by Ni-NTA purification of the soluble fractions (lanes 3 and 6 respectively).

Figure 3.2. Results of periplasmic recombinant production trials of nepenthesin I and II.

A) The protein construct from N- to C-terminal in the cv16 vector used for periplasmic expression is shown. B) After overnight incubation at 30 ˚C, cells were removed by centrifugation and subjected to osmotic shock. The secreted proteins in the osmotic shock fractions for putative pro-nepenthesin I (lane 1) and II (lane 3) expression were purified by Ni-NTA to identify soluble protein expression (lanes 2 and 4 respectively). C) The initial Ni-NTA purified fractions were concentrated 10-fold prior to SDS-PAGE analysis (12% acrylamide). The concentrated fractions for pro-nepenthesin I and II are shown in lanes 1 and 2 respectively.

Refolding of Insoluble Recombinant Protein

Refolding of recombinant proteins expressed into E. coli inclusion bodies is a well-documented and widely used method for achieving high quantities of pure recombinant product [129,130]. When the method was applied, a general refolding protocol appeared sufficient for successful expression and purification of pro-nepenthesin I (residues 25-437) and II (residues 25-438). Each step of the procedure was monitored by SDS-PAGE, which confirmed the purity of the mature enzyme products after acid activation by lowering the pH to 2.5 and incubating overnight (16 hours) in the fridge (Figure 3.3A and C). After acid activation, the protein bands corresponding to mature nepenthesin I and II appeared much less intense than those observed for an equivalent amount of the pro-forms, suggesting that auto-digestion of the misfolded forms of the enzyme occurred (Figure 3.3A and C). LC-MS/MS analyses of in-gel digests confirmed that the final acid activated products were, in fact, nepenthesin I and II. When nepenthesin I or II was incubated under acidic conditions with BSA, complete proteolytic digestion of the substrate was observed, showing that the recombinant enzymes were proteolytically functional (Figure 3.4A and B, lanes 3). Yields for both mature nepenthesin I and II were typically between 1.5-2 mg of pure, functional enzyme per 0.5 L of culture, corresponding to a final protein yield of ~3-6% (Table 3.1). The molecular mass of mature nepenthesin I, as determined by MALDI-TOF MS, was 37461 Da (Figure 3.3B), which is consistent with a theoretical mass of 37473 Da corresponding to residues 79-437. Mature nepenthesin II was of a similar mass with a measured molecular mass of 37507 Da (Figure 3.3D), which closely matches the theoretical mass of 37509 Da corresponding to residues 80-438. When both enzymes were placed in reducing conditions (150 mM TCEP), distinct downwards shifts in electrophoretic mobility were observed (Figure 3.4), as would be expected for proteins with a reduced disulfide bond network [131].
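The agreement between the measured and theoretical masses quoted above can be expressed as an absolute and a relative (ppm) error. The following is a minimal sketch using only the four mass values given in the text; the function name is illustrative.

```python
# Sketch: mass error between MALDI-TOF measurements and theoretical masses.
# The four mass values are taken from the text; nothing else is assumed.

def mass_error(measured_da, theoretical_da):
    """Return (absolute error in Da, relative error in ppm)."""
    delta = measured_da - theoretical_da
    return delta, delta / theoretical_da * 1e6


MASSES = {
    "nepenthesin I": (37461.0, 37473.0),   # measured, theoretical (residues 79-437)
    "nepenthesin II": (37507.0, 37509.0),  # measured, theoretical (residues 80-438)
}

if __name__ == "__main__":
    for name, (measured, theoretical) in MASSES.items():
        delta, ppm = mass_error(measured, theoretical)
        print(f"{name}: {delta:+.0f} Da ({ppm:+.0f} ppm)")
```

Both measurements fall slightly below the theoretical masses, by 12 Da for nepenthesin I and 2 Da for nepenthesin II, which is a few hundred ppm or less.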

Figure 3.3. Refolding of insoluble recombinant nepenthesin I and II from inclusion bodies.

Recombinant production of the e3884 constructs for nepenthesin I (A) and II (C) from refolding of purified E. coli inclusion bodies is shown. Each step of the refolding procedure was monitored and is shown as: total solubilized protein from purified E. coli inclusion bodies (Lane 1), refolded nepenthesin after final dialysis (lane 2), 24-hour acid activation (100 mM glycine-HCl, pH 2.5) of refolded product (lane 3). MALDI-TOF MS analysis was performed on the 24-hour acid activated nepenthesin I (B) and II (D) enzymes. LC-MS/MS analyses of in-gel digests of the acid-activated bands (A and C, lanes 3) confirmed the presence of pure nepenthesin I and II respectively.

Table 3.1. Refolding data for a typical 0.5 L preparation of nepenthesin I and II.

| Sample | Total Solubilized Protein (mg) | Total Refolded Protein (mg) | Refolded Yield (%) | Total Acid Active Protein (mg) | Acid Active Yield (%) |
|---|---|---|---|---|---|
| Nepenthesin I | 203.5 | 32.4 | 15.9 | 1.80 | 5.6 |
| Nepenthesin II | 198.5 | 50.8 | 25.6 | 1.64 | 3.2 |
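The percentage columns in Table 3.1 follow directly from the mass columns. The short sketch below recomputes them; it suggests that the refolded yield is expressed relative to total solubilized protein and the acid active yield relative to total refolded protein. The overall yield (acid active protein relative to solubilized protein) is an extra quantity added here for illustration and is not reported in the table.

```python
# Sketch: recompute the yield columns of Table 3.1 from the mass columns.
# Values are (solubilized mg, refolded mg, acid active mg) as tabulated.

PREPS = {
    "Nepenthesin I": (203.5, 32.4, 1.80),
    "Nepenthesin II": (198.5, 50.8, 1.64),
}


def yields(solubilized_mg, refolded_mg, active_mg):
    """Return the three percentage yields for one preparation."""
    return {
        "refolded_yield_pct": 100.0 * refolded_mg / solubilized_mg,
        "acid_active_yield_pct": 100.0 * active_mg / refolded_mg,
        # Not in the table: acid active protein relative to solubilized protein
        "overall_yield_pct": 100.0 * active_mg / solubilized_mg,
    }


if __name__ == "__main__":
    for name, masses in PREPS.items():
        y = yields(*masses)
        print(name, {k: round(v, 2) for k, v in y.items()})
```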

Figure 3.4. Disulfide bond formation and functionality of recombinant nepenthesin I and II.

Recombinant nepenthesin I and II compared under reducing and non-reducing conditions. A 5 minute digest of BSA at pH 2.5 and room temperature was used to gauge proteolytic activity. Nepenthesin I (A) and II (B): under reducing conditions (1), under non-reducing conditions (2) and under non-reducing conditions with 4 µg BSA (3). The BSA load (4 µg) in each case is shown in lane (4).

Figure 3.5. Size-exclusion chromatography of recombinant nepenthesin I and II.

The size-exclusion chromatography profile of recombinant nepenthesin I and II is shown. Samples were eluted with 20 mM glycine-HCl, pH 2.5. The standards were: I) Dextran blue (2000 kDa), II) bovine blood hemoglobin (64 kDa) and III) equine myoglobin (17 kDa).

A single peak was observed for the elution of both nepenthesin I and II on a Superdex S-75 gel-filtration column (Figure 3.5A and B respectively). The single peaks observed were within the expected elution times for the monomeric state of each recombinant product. These observations confirmed that, after acid activation, nepenthesin I and II were each purified in one oligomeric state and suggest that both products are monomers. However, more sensitive methods are needed to conclusively determine the oligomeric states of nepenthesin I and II. Possible methods include dynamic light scattering or analytical ultracentrifugation, as documented in past studies [132,133].

Optimization of Recombinant Production

Refolding of detergent solubilized proteins from inclusion bodies is known to result in a large proportion of misfolded protein products [129]. Certainly, the difference in intensity of SDS-PAGE gel bands between the refolded pro- and mature forms of nepenthesins I and II further suggests that a large portion of the refolded products were misfolded. Therefore, we attempted to optimize the refolding procedure with the addition of a redox shuffling system, which consisted of various ratios of oxidized:reduced glutathione (1:1, 1:3 and 3:1), or with pre-purification of the detergent solubilized proteins from inclusion bodies by Ni-NTA [130, 134]. Low molecular weight thiols, such as glutathione, are known to promote rapid disulfide bond exchange reactions, allowing the most stable configuration, which is in general the native form, to be reached. This is particularly important for refolding of proteins with extensive disulfide bond networks such as nepenthesin I and II. On the other hand, pre-purification of solubilized proteins from inclusion bodies is thought to improve refolding yield by preventing protein contaminant driven aggregation, the leading cause of low refolding yields. In order to assess the benefits of the modifications to the standard refolding procedure, the final yield, activity and changes in cleavage preferences of the mature recombinant products were assessed in each case. For both nepenthesin I and II, both modifications to the standard refolding procedure either did not significantly affect or decreased the final yield (Figure 3.6, Tables 3.2 and 3.3) and the activity per 100 ng of enzyme (Figure 3.7). Comparison of the digestion

Figure 3.6. Effects of redox environment and pre-purification on refolding of insoluble recombinant nepenthesin I and II from inclusion bodies.

Nepenthesin I and II were refolded using the standard refolding procedure with the addition of various ratios of oxidized:reduced glutathione (ox:red) or pre-purification of solubilized protein prior to refolding. The effects of redox environment on refolding of nepenthesin I (A) and II (C) are shown as follows: refolded and acid activated nepenthesin with 1:1 ox:red glutathione (1 and 2 respectively), refolded and acid activated nepenthesin with 1:3 ox:red glutathione (3 and 4 respectively) and refolded and acid activated nepenthesin with 3:1 ox:red glutathione (5 and 6 respectively). The effects of pre-purification on the detergent solubilized nepenthesin I (B) and II (D) are shown as follows: detergent solubilized protein from inclusion bodies (1), Ni-NTA purification of the detergent solubilized fraction (2), final refolded fraction (3) and 24-hour acid activated fraction (4).

Table 3.2. Refolding data for the optimization of recombinant nepenthesin I production.

| Sample | Total Solubilized Protein (mg) | Total Refolded Protein (mg) | Refolded Yield (%) | Total Acid Active Protein (mg) | Acid Active Yield (%) |
|---|---|---|---|---|---|
| Standard | 60 | 7.60 | 12.7 | 0.311 | 4.09 |
| 1:1 ox:red | 60 | 2.17 | 3.6 | 0.158 | 7.26 |
| 1:3 ox:red | 60 | 9.54 | 15.9 | 0.252 | 2.64 |
| 3:1 ox:red | 60 | 8.23 | 16.7 | 0.207 | 2.52 |
| pre-his-purified | 8.0 | 0.45 | 5.6 | 0.021 | 4.7 |

Table 3.3. Refolding data for the optimization of recombinant nepenthesin II production.

| Sample | Total Solubilized Protein (mg) | Total Refolded Protein (mg) | Refolded Yield (%) | Total Acid Active Protein (mg) | Acid Active Yield (%) |
|---|---|---|---|---|---|
| Standard | 29 | 5.11 | 17.6 | 0.18 | 3.5 |
| 1:1 ox:red | 29 | 5.39 | 18.6 | 0.20 | 3.7 |
| 1:3 ox:red | 29 | 4.56 | 15.7 | 0.13 | 2.8 |
| 3:1 ox:red | 29 | 4.32 | 14.9 | 0.08 | 1.9 |
| pre-his-purified | 4.7 | 0.31 | 6.6 | 0.01 | 3.5 |
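One way to compare the refolding conditions in Tables 3.2 and 3.3 on a common footing is to express the acid active protein recovered per mg of solubilized starting material relative to the standard protocol. The sketch below does this with the tabulated values. Normalizing by the solubilized mass is an assumption made here so that the pre-his-purified preparations, which started from less protein, can be compared with the others; it is not a calculation reported in the thesis.

```python
# Sketch: acid active protein recovered per mg solubilized protein,
# expressed relative to the standard refolding protocol.
# Each entry is (total solubilized mg, total acid active mg) from Tables 3.2 and 3.3.

NEP_I = {
    "Standard": (60.0, 0.311),
    "1:1 ox:red": (60.0, 0.158),
    "1:3 ox:red": (60.0, 0.252),
    "3:1 ox:red": (60.0, 0.207),
    "pre-his-purified": (8.0, 0.021),
}

NEP_II = {
    "Standard": (29.0, 0.18),
    "1:1 ox:red": (29.0, 0.20),
    "1:3 ox:red": (29.0, 0.13),
    "3:1 ox:red": (29.0, 0.08),
    "pre-his-purified": (4.7, 0.01),
}


def recovery_vs_standard(conditions):
    """Recovery (active mg per solubilized mg) of each condition / standard."""
    std_solubilized, std_active = conditions["Standard"]
    std_ratio = std_active / std_solubilized
    return {
        name: (active / solubilized) / std_ratio
        for name, (solubilized, active) in conditions.items()
    }


if __name__ == "__main__":
    for label, table in (("Nepenthesin I", NEP_I), ("Nepenthesin II", NEP_II)):
        relative = recovery_vs_standard(table)
        print(label, {k: round(v, 2) for k, v in relative.items()})
```

On this measure none of the modified conditions recovers more nepenthesin I than the standard protocol, and only the 1:1 ox:red condition for nepenthesin II is marginally higher (about 1.1-fold).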

Figure 3.7. Relative activity of recombinant nepenthesin I and II generated from various refolding conditions.

The relative activity of equivalent amounts of mature nepenthesin I and II, recombinantly produced under different refolding conditions, was measured with the standard hemoglobin activity assay. The means of 3 independent replicates and their standard deviations are shown.

Figure 3.8. Cleavage preferences of recombinant nepenthesin I generated from various refolding conditions.

Cleavage of residues in the P1 position (A) and the P1’ position (B) are shown as the relative % cleavage to the total. Changes in cleavage preferences were determined by digestion of APLF with 100 ng enzyme for 5 minutes at 37 ˚C prior to LC-MS/MS analysis.

Figure 3.9. Cleavage preferences of recombinant nepenthesin II generated from various refolding conditions.

Cleavage of residues in the P1 position (A) and the P1’ position (B) are shown as the relative % cleavage to the total. Changes in cleavage preferences were determined by digestion of APLF with 100 ng enzyme for 5 minutes at 37 ˚C prior to LC-MS/MS analysis.

preferences of each recombinant enzyme in all refolding conditions through digestion of a model protein, APLF, resulted in no observable differences (Figure 3.8 and 3.9). Therefore, based on the evidence provided, the standard refolding protocol appeared to be sufficient, as attempts at optimization of the refolding protocol did not improve the final yield and activity or change any of the cleavage preferences of the recombinant enzymes. Interestingly, cleavage after P was not observed for either recombinant nepenthesin I or II. However, given that only one substrate was used, no conclusion could be drawn from the current data so further in-depth cleavage preference studies were performed and are described later in this chapter.

In summary, multiple strategies for recombinant expression and purification of nepenthesin I and II in E. coli were attempted and resulted in successful production through refolding of the enzymes expressed in inclusion bodies. Attempts at optimization of the method with the addition of various ratios of a redox shuffling system (oxidized/reduced glutathione) or pre-purification of detergent solubilized protein prior to refolding did not improve yield or activity per 100 ng of enzyme, or change any of the cleavage preferences of the mature enzyme products. Although the standard refolding protocol described here would need to be further optimized and scaled up in order to meet commercial demands, the current method appears to be sufficient for most structural studies as well as the purposes of the studies presented in this thesis. After complete acid activation, the mature enzyme products were shown to be pure and functional with a combination of size-exclusion chromatography, MALDI-TOF MS and digestion of BSA. Furthermore, disulfide bond formation was observed based on shifts in SDS-PAGE electrophoretic mobility under reducing and non-reducing conditions. However, the exact disulfide linkages in nepenthesin I and II were not probed and should be the subject of future structural studies. Of note, although no study has yet confirmed that nepenthesins are secreted as pro-enzymes into the pitcher fluid, the presence of a signal peptide and the auto-digested pro-peptide identified in this study suggest that this is the case.

3.3.2 Recombinant Expression Trials in Pichia pastoris

Nepenthesins are known to contain a high degree of glycosylation in the native plant fluid. Mature nepenthesin I has 6 potential N-glycosylation sites whereas mature nepenthesin II contains 1, although it is thought that nepenthesin II is not glycosylated [118]. Therefore, we attempted to study the role of glycosylation for nepenthesin I by generating the glycosylated form of the enzyme in a Pichia pastoris expression system. P. pastoris is a widely used yeast species for the generation of more complex recombinant proteins that are otherwise not possible to obtain from bacteria-based expression systems. Being a eukaryotic system, its advantages over E. coli or other bacteria-based systems include the addition of post-translational modifications such as N-glycosylation and more successful expression of proteins with extensive disulfide bond networks [135]. In initial expression trials, supernatants containing secreted proteins from cells transformed with two different expression vectors, pJAG-aMF and pJAN-aMF, each encoding pro-nepenthesin I, were assayed for activity by SDS-PAGE analysis of the digestion of BSA. Although smears that migrated at the expected molecular weight range of glycosylated pro-nepenthesin I (Figure 3.10, lanes 1 and 4) appeared lower once acid activated overnight (Figure 3.10, lanes 2 and 5), suggesting auto-digestion of the pro-peptide, neither vector-nepenthesin I combination showed any proteolytic activity when used to digest BSA for 30 minutes at 37 ˚C when compared to the control (Figure 3.10, lanes 3, 6 and 7). Furthermore, LC-MS/MS analyses of in-gel digestions of the smeared bands did not identify nepenthesin I. Therefore, it appears that the preliminary nepenthesin I expression trials failed to express and/or secrete the protein of interest in the P. pastoris system. Attempts at increasing yield by optimization of the induction conditions (e.g. pH and temperature), co-expression with chaperones, and the inclusion of purification tags (e.g. his-tag) for identification of the target proteins are currently in progress and are the subject of future studies.

Figure 3.10. Recombinant nepenthesin I expression trials in P. pastoris.

The supernatants of induced cells transformed with pro-nepenthesin I were concentrated 10-fold, acid activated and analyzed by SDS-PAGE. Activity was assayed by digestion of 4 µg BSA with the acid activated supernatants containing secreted proteins for 30 minutes at 37 ˚C. The results are as shown for pJAG-aMF and pJAN-aMF: supernatant containing pro-nepenthesin I (1 and 4 respectively), acid activated supernatants (2 and 5 respectively), acid activated supernatant fractions incubated with 4 µg BSA (3 and 6 respectively). Undigested BSA (4 µg) is shown in lane 7.

3.3.3 Enzymatic Characterization

As the production of pure, functional recombinant products was now established, we set about characterizing the enzymatic and biochemical properties of recombinant nepenthesin I and II in order to determine whether they, alone, could reconstitute the observed proteolytic activity of the Nepenthes pitcher fluid, as well as their suitability for further applications. To this end, we initially tried to develop a specific activity assay measuring the evolution of tryptophan fluorescence from digestion of dansyl-quenched peptides. However, attempts at designing peptides that would be digested by recombinant nepenthesin I and II gave ambiguous results, showing no increases in tryptophan fluorescence. As attempts at designing a specific activity assay have not yet been successful, we instead adopted the hemoglobin activity assay, first described by Anson et al., 1938, which has been used for all previous nepenthesin-related studies [126]. This relative activity assay measures the absorbance of TCA soluble peptides at a wavelength of 280 nm. All of the data generated were compared to porcine pepsin as a control. Using a modified version of the hemoglobin activity assay, we attempted to determine the relative activity of the recombinant nepenthesins compared to porcine pepsin, their sensitivity to pepstatin A and the effects of pH and temperature. The stability of the recombinant enzymes across pH over time and in the presence of denaturing agents was also probed with the modified hemoglobin activity assay. Finally, the cleavage preferences of the recombinant nepenthesins were mapped in order to determine whether they, alone, were sufficient for reconstitution of the proteolytic capabilities of the native Nepenthes pitcher fluid.

Relative Activity and IC50 of Pepstatin A

Using the modified version of the hemoglobin activity assay, the relative activity of the recombinant enzymes was determined and compared to porcine pepsin. The signal for each experiment, under the modified assay conditions, was always optimized within range, as shown in Figure 3.11A-C, in order to prevent error associated with signal saturation effects. At a standardized enzyme dose of 400 ng, it was determined that the relative activities of nepenthesin I and II compared to porcine pepsin were 45% and 46% respectively (Figure 3.11D). This is in
contrast to our previous study estimating the activity of Nepenthes fluid to be > 1400x that of pepsin [121]. Several reasons for this discrepancy are possible, but the most likely are the uncertain concentrations of protease in the native fluid and/or the lack of N-glycosylation in the recombinant enzymes, which may have lowered their overall activity, as has been exemplified in past studies [136]. Nevertheless, in order to probe a defining property of aspartic proteases, dose-dependent inhibition curves of nepenthesin I and II with pepstatin A were generated. The inhibition profile for porcine pepsin was also generated as a positive control. As shown in Figure 3.12, the experimentally determined IC50 values of pepstatin A for nepenthesin I and II were 1.4 µM ± 0.1 µM and 136.8 nM ± 6.5 nM respectively. The IC50 value of pepstatin A for porcine pepsin was determined to be 16.6 nM ± 0.6 nM, which is in good agreement with accepted values in the low nanomolar range [137]. Furthermore, although the curves for all three enzymes fit well, the fit was poorest for porcine pepsin (chi-squared value = 24.18) compared to nepenthesin I (chi-squared value = 4.18) and II (chi-squared value = 8.25). Overall, the data showed that both recombinantly produced enzymes were much less sensitive to pepstatin A than porcine pepsin. Even at 40-fold molar excess of pepstatin A, the proteolytic activity of nepenthesin I did not appear to be fully inhibited.

pH Profiles

In order to determine the effects of pH on enzymatic activity, the pH-relative activity profiles for nepenthesin I and II (pH 1.6-8) were generated and compared to that of porcine pepsin (Figure 3.13). Each profile was normalized to its optimal activity. Optimal activity for both nepenthesin I and II was observed at pH ~2.5, slightly higher than that of porcine pepsin (pH ~2). Also, nepenthesin I appeared to retain higher activity at more basic pH values, corresponding to a broader profile compared to nepenthesin II. When comparing all three profiles, nepenthesin I’s pH profile was the broadest and pepsin’s was the narrowest, whereas nepenthesin II’s profile fell in between. Above pH 6, all three enzymes were inactive. These results are congruent with the data generated for native nepenthesin I and II purified from the natural plant fluid [118].

Figure 3.11. Absorbance at 280 nm compared to enzyme quantity (µg).

The modified hemoglobin activity assay was used to measure relative activity. The signal (absorbance at 280 nm) was plotted against µg of nepenthesin I (A), nepenthesin II (B) and porcine pepsin (C). Relative activity of 400 ng of nepenthesin I and II was compared to an equivalent amount of porcine pepsin (D). The means of 3 independent replicates and their standard deviations are shown.

Figure 3.12. IC50 profiles for recombinant nepenthesin I and II.

Relative proteolytic activity towards hemoglobin was measured at 37 ˚C in the presence of different concentrations of pepstatin A (0.001-10 µM). The means of 3 independent replicates and their standard deviations are shown.
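To give a feel for what the reported IC50 values imply at the top of the tested concentration range, the sketch below evaluates a Hill-type inhibition curve at 10 µM pepstatin A. The exact parameterization fitted in Igor Pro and the fitted Hill coefficients are not stated in this section, so the functional form and a Hill coefficient of 1 are assumptions made purely for illustration.

```python
# Sketch: residual activity predicted by a Hill-type inhibition curve.
# IC50 values are those reported in the text; hill_n = 1 is an assumption.

IC50_UM = {
    "nepenthesin I": 1.4,      # 1.4 µM
    "nepenthesin II": 0.1368,  # 136.8 nM
    "porcine pepsin": 0.0166,  # 16.6 nM
}


def residual_activity(inhibitor_um, ic50_um, hill_n=1.0):
    """Fraction of uninhibited activity remaining at a given inhibitor dose."""
    if inhibitor_um <= 0:
        return 1.0
    return 1.0 / (1.0 + (inhibitor_um / ic50_um) ** hill_n)


if __name__ == "__main__":
    top_dose_um = 10.0  # highest pepstatin A concentration tested
    for enzyme, ic50 in IC50_UM.items():
        remaining = residual_activity(top_dose_um, ic50)
        print(f"{enzyme}: {100 * remaining:.1f}% activity remaining")
```

With these assumptions, nepenthesin I would be expected to retain on the order of 10% of its activity at 10 µM pepstatin A, whereas nepenthesin II and pepsin would be almost completely inhibited. A fitted Hill coefficient different from 1 would shift these numbers.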


Figure 3.13. Effect of pH on recombinant nepenthesin I and II activity.

Relative proteolytic activity towards hemoglobin was measured at 37 ˚C across pH (1.6-8). The means of 3 independent replicates and their standard deviations are shown.


Temperature Profiles

In order to determine the effects of temperature on enzymatic activity, the temperature-relative activity profiles for nepenthesin I and II were generated and compared to that of porcine pepsin (Figure 3.14). As before, each profile was normalized to its optimal activity. Nepenthesin II and porcine pepsin showed optimal activity at ~55 ˚C and were inactive by ~80 ˚C, whereas nepenthesin I appeared to show optimal activity at ~50 ˚C and was inactive by ~60 ˚C. Therefore, the temperature optima for nepenthesin II and porcine pepsin were slightly higher than that of nepenthesin I, showing some inconsistencies compared to the data described for the native enzymes. In addition, in contrast to recombinant nepenthesin I, which appeared to be inactivated by ~60 ˚C, native nepenthesin I has been described as requiring a higher temperature (~80 ˚C) for inactivation. Recombinant nepenthesin II also appeared much more temperature stable than previously described for native nepenthesin II [118].

Stability Profiles across pH

Stability across pH was measured as the remaining activity at pH 2.5 after incubation in buffers ranging from pH 1.6-8 at 37 ˚C for 7 and 30 days (Figure 3.15). After 7 days, nepenthesin I exhibited high stability after incubation in pH 2.5-8. On the other hand, nepenthesin II and porcine pepsin exhibited lower but similar stability after incubation in pH 1.6-5, with increasing stability as pH rose. However, after incubation above pH 6, porcine pepsin was essentially inactive whereas nepenthesin II retained >80% activity up to pH 7, where a large drop in stability was exhibited. After 30 days, nepenthesin I retained >50% activity in pH 4-8 and showed remarkable stability in pH 6-7 (>75% activity retained), whereas nepenthesin II retained >50% activity only after incubation in pH 5-7. Porcine pepsin was essentially inactive after incubation for 30 days except in pH 4 (25% activity retained) and pH 5 (47% activity retained). All three enzymes retained very little activity after the 30-day incubation in pH 1.6-3.


Figure 3.14. Effect of temperature on recombinant nepenthesin I and II activity.

Relative proteolytic activity towards hemoglobin was measured across a range of temperatures (˚C). The means of 3 independent replicates and their standard deviations are shown.


After measuring stability across pH 1.6-8, similar stabilities between native and recombinant nepenthesin I were observed at the more basic pH values (pH 4-8), but recombinant nepenthesin I was significantly less stable from pH 1.6-3. The stability of recombinant nepenthesin II also appeared different from that reported for its native form [118]. The stability profile of recombinant nepenthesin II appeared to be a more moderate version of that of its recombinant counterpart, nepenthesin I, up to pH 8, where a large drop in stability was exhibited. This is in contrast to the native form of nepenthesin II, which appeared to have a stability profile similar to that of porcine pepsin. We are unable to explain the discrepancies highlighted above. However, the differences in stability between the native and recombinant forms of the enzymes observed in this study are likely due to the lack of N-glycosylation in the recombinant enzymes, examples of which have been documented in past studies [136,138-139].

Stability Profiles in the Presence of Denaturing Agents

In order to characterize another aspect of the recombinant products, the stability of nepenthesin I and II, compared to porcine pepsin, in the presence of denaturing and reducing agents was studied. Using the modified activity assay, nepenthesin II and porcine pepsin were both observed to retain near complete activity even in the presence of 6 M urea, in contrast to nepenthesin I, which became inactivated between 3-6 M urea (Figure 3.16A). Likewise, nepenthesin II and porcine pepsin both retained near complete activity in the presence of GndCl up to a concentration of 3 M but were inactivated by 6 M, in contrast to nepenthesin I, which became fully inactivated in the presence of 3 M GndCl (Figure 3.16B). Overall, the susceptibilities of nepenthesin II and porcine pepsin to the denaturing agents, urea and GndCl, appeared very similar, and both enzymes were more resistant to the effects of the denaturants than nepenthesin I. These stability results are not congruent with previous studies reporting much lower activity for native nepenthesin II, purified from the plant pitcher fluid, in the presence of urea and GndCl [119]. The differences, however, may be a result of differences in the specific isoforms of nepenthesin II used and/or, again, possible N-glycosylation. The stability of nepenthesin I and II in the presence of the reducing agent, TCEP, was not probed using the modified hemoglobin activity assay, as the assay conditions produced a reaction resulting in a bright red solution that saturated the signal.


Figure 3.15. Effect of pH on the stability of recombinant nepenthesin I and II.

Relative proteolytic activity towards hemoglobin was measured at 37 ˚C across pH (1.6-8) at 7 days (top) and 30 days (bottom). The means of 3 independent replicates and their standard deviations are shown.


Figure 3.16. Effects of denaturing and reducing agents on the activity of recombinant nepenthesin I and II.

Proteolytic activity towards hemoglobin was measured at 37 ˚C against denaturant concentration of A) urea and B) guanidine hydrochloride. The means of 3 independent replicates and their standard deviations are shown.


Cleavage Preferences

The digestion preferences of recombinant nepenthesin I and II were investigated in order to determine isozyme-specific cleavage contributions relative to that of the native Nepenthes fluid (Figure 3.17). The digestion data represent an assessment of 1506 residues from 5 model proteins at physiological temperature (37 ˚C). The cleavage patterns identified for nepenthesin I were highly similar to those of nepenthesin II and, taken together, reconstitute the vast majority of the observed proteolytic activity of Nepenthes pitcher fluid. In agreement with our previous study on pitcher fluid, nepenthesin I and II demonstrate very little selectivity at the P1 and P1’ positions except at Gly and Pro as well as at Ile in the P1’ position [121]. Efficient cleavage after K and R, residues known to be “forbidden” for classic aspartic proteases such as pepsin, was also shown for both nepenthesin I and II. Notably, cleavage at Q in the P1 and P1’ positions was reconstituted relative to the native pitcher fluid, representing one half of the cleavages required of a therapeutic for CD. However, although cleavage before P (P in the P1’ position) was retained, cleavage after P was not observed with recombinant nepenthesin I and/or II. Given the near complete reconstitution of proteolytic activity relative to the Nepenthes pitcher fluid, the lack of cleavage after P suggests, perhaps, the presence of another enzyme in the native pitcher secretions.


Figure 3.17. Proteolytic cleavage preferences of recombinant nepenthesin I and II compared to Nepenthes fluid under physiological conditions.

Cleavage of residues in the P1 position (A) and in the P1’ position (B) is shown as relative % cleavage to the total. Cleavage preferences were determined by digestion of 5 model proteins (XRCC4, BRCT, PNK, XLF and equine myoglobin) with 100 ng of enzyme for 5 minutes at 37 ˚C prior to LC-MS/MS analyses.
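The relative % cleavage at P1 and P1’ can be tallied from identified peptides roughly as sketched below. This is a simplified illustration under stated assumptions, not the analysis pipeline used for Figure 3.17: the protein sequence and peptides are invented, the function names are hypothetical, every occurrence of a peptide in the protein is counted, and each cleaved bond is counted once regardless of how many peptides report it.

```python
from collections import Counter


def cleavage_sites(protein, peptides):
    """Return the set of cleaved bond positions implied by peptide termini.

    Position i denotes the bond between protein[i - 1] (P1) and protein[i] (P1').
    The protein's own N- and C-termini are not counted as cleavage events.
    """
    sites = set()
    for peptide in peptides:
        start = protein.find(peptide)
        while start != -1:
            end = start + len(peptide)
            if start > 0:
                sites.add(start)
            if end < len(protein):
                sites.add(end)
            start = protein.find(peptide, start + 1)
    return sites


def to_percent(counts, total):
    """Convert residue counts into percentages of all observed cleavages."""
    return {residue: 100.0 * n / total for residue, n in counts.items()}


def relative_cleavage(protein, peptides):
    """Return (P1 percentages, P1' percentages) keyed by residue."""
    sites = cleavage_sites(protein, peptides)
    p1 = Counter(protein[i - 1] for i in sites)
    p1_prime = Counter(protein[i] for i in sites)
    return to_percent(p1, len(sites)), to_percent(p1_prime, len(sites))


# Invented toy protein and peptide identifications (placeholders only).
toy_protein = "MKTAYIAKQRQISFVK"
toy_peptides = ["MKTAYIAK", "QRQISFVK", "TAYIAK"]

p1_pct, p1_prime_pct = relative_cleavage(toy_protein, toy_peptides)
print(p1_pct, p1_prime_pct)
```

In this toy case the two cleaved bonds both follow K, so K accounts for 100% of P1 cleavage, while T and Q each account for 50% of P1’ cleavage.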


3.3.4 The Search for the Missing Proline Cleavage Activity

Since it was determined that misfolding of the recombinant enzyme products was unlikely to be the cause of the missing proline cleavage activity in the P1 position, given the near complete reconstitution of the cleavage preferences of the Nepenthes pitcher fluid with recombinant nepenthesin I and/or II, studies were undertaken to identify the presence of another proteolytic enzyme in the native pitcher secretions. First, we hypothesized that a cofactor within the native fluid might be necessary for the proline cleavage activity in the P1 position. Therefore, the cleavage preferences of recombinant nepenthesin I and II supplemented with the flow-through of native pitcher fluid that had been concentrated in a centrifugal filter unit with a 10 kDa cut-off were determined. However, supplementation with the native fluid flow-through did not restore proline cleavage in the P1 position (Figure 3.18). Next, specific protease class inhibitors were employed on the native Nepenthes pitcher fluid in order to identify, at least, the presence of the putative missing protease. Of the inhibitors used (pepstatin A and leupeptin), only pepstatin A appeared to have an inhibitory effect on the proteolytic activity of the Nepenthes pitcher fluid (Figure 3.19), which did not provide any new insights. Also, EDTA at concentrations of up to 1 mM was shown to have no inhibitory effect on Nepenthes pitcher fluid (data not shown).


Figure 3.18. Proteolytic cleavage preferences of recombinant nepenthesin I and II compared to Nepenthes fluid in the P1 position under physiological conditions.

Cleavage of residues in the P1 position is shown as the relative % cleavage to the total. Cleavage preferences were determined by digestion of 3 model proteins (XRCC4, PNK and XLF) with 100 ng enzyme for 5 minutes at 37 ˚C prior to LC-MS/MS analyses.


Figure 3.19. Proteolytic activity of Nepenthes fluid in the presence of inhibitors under physiological conditions.

The effects of overnight treatments of leupeptin and pepstatin A on the proteolytic activity of Nepenthes fluid are as shown: leupeptin treated fluid with 10 µg BSA (1), pepstatin A treated fluid with 10 µg BSA (2), leupeptin and pepstatin A treated Nepenthes fluid with 10 µg BSA (3), non-treated Nepenthes fluid with 10 µg BSA (4) and 10 µg BSA (5). BSA digestions were carried out for 15 minutes at 37 ˚C.


Finally, we attempted to separate the native nepenthesins from the putative enzyme with proline cleavage activity by gel-filtration chromatography, since previous attempts at purifying native nepenthesins with pepstatin A beads could not discern between the different enzymes (Section 2.3.3). As shown in Figure 3.20, chromatography on a Superdex S-75 gel-filtration column was able to separate the proteins in the Nepenthes fluid into multiple fractions. When the cleavage preferences of these fractions were compared by digestion of the model protein, APLF, peaks I and II showed recombinant nepenthesin-like digestion in the P1 and P1’ positions whereas peak IV showed, in addition to recombinant nepenthesin-like digestion, enriched cleavage for prolines in the P1 position (Figure 3.20). Furthermore, when the fraction containing the enriched proline cleavage was incubated with a high dose of pepstatin A prior to its use in the digestion of APLF, only cleavage of prolines at the P1 position was observed, strongly suggesting the presence of a prolyl endopeptidase (PEP). Multiple attempts at identification of the putative PEP in the enriched fraction by proteomics-based methods, including de novo sequencing, were largely inconclusive when searched against all organisms in the NCBI database in Mascot and by Blastp analysis (data not shown). Therefore, attempts at characterizing the cDNA genome of the secretory glands of the Nepenthes pitcher plants appear to be the next logical step and are currently in progress. Nevertheless, the evidence provided thus far strongly suggests the presence of another previously unidentified protease in the pitcher fluid of Nepenthes plants.


Figure 3.20. Size-exclusion chromatography profile of Nepenthes fluid.

The size-exclusion chromatography fractionation profile of Nepenthes pitcher fluid is shown. Samples were eluted with 20 mM glycine-HCl, pH 2.5. Fractions I-V were selected for digestion of APLF (5 min, 37 ˚C) and subsequent LC-MS/MS analyses. Fractions I and II appeared to contain recombinant nepenthesin-like cleavage preferences whereas fraction IV appeared to contain enriched cleavage for prolines at the P1 position. Fractions III and V showed no proteolytic activity.


3.5 Conclusions and Future Directions

In summary, the results of this chapter have provided detailed insights into the enzymatic and biochemical properties of nepenthesins. Recombinant nepenthesin I and II were successfully generated in an E. coli expression system by refolding the target proteins expressed in inclusion bodies. SDS-PAGE, MALDI-TOF analysis and size-exclusion chromatography identified the purified, mature and functional forms of the recombinant enzymes. So far, attempts at producing the glycosylated form of nepenthesin I in P. pastoris have not been successful but further attempts are currently in progress. As attempts at developing a specific activity assay have not yet been successful, the sensitivity to pepstatin A, the pH and temperature profiles, and the stability profiles across pH and in the presence of denaturing agents were generated for recombinant nepenthesin I and II from E. coli by measuring relative activity using a modified hemoglobin activity assay, as previously described. For the most part, the recombinant enzymes appeared to have similar properties to those described for their native forms purified from the natural plant source. Any differences in observed properties were mainly attributed to the lack of N-glycosylation but either enhanced or did not appear to limit their utility as a therapeutic for CD. Further structural studies are necessary in order to ascertain the basis for the observed properties. The cleavage preferences generated for the recombinant enzymes revealed that nepenthesin I and/or II were able to reconstitute the vast majority of the cleavage preferences observed from the native Nepenthes pitcher fluid. However, neither enzyme, alone or in combination, could cleave at proline in the P1 position, which is necessary for a therapeutic for CD.
Given the extensive reconstitution of the cleavage preferences by the recombinant enzymes relative to the native secretions, it was hypothesized that another enzyme was present in the native pitcher fluid and studies were performed to support this notion. Eventually, size-exclusion chromatography of the native pitcher secretions was able to partially separate enriched proline cleavage activity from full native Nepenthes pitcher fluid activity. When these partially separated fractions were treated with pepstatin A, only proline cleavage at the P1 position was observed, suggesting the presence of a previously unidentified protease in the native pitcher secretions that is likely a PEP. However, extensive proteomics studies have failed to identify this putative PEP. Therefore, in order to identify the missing putative PEP, attempts at generating the cDNA genome of the Nepenthes secretory glands are necessary and are currently in progress. Although beyond the scope of this thesis, after RNA extraction and subsequent reverse transcription into cDNA, the transcripts will need to be searched for homology to known PEPs. The sequences for the identified hits will then need to be subjected to recombinant expression trials in order to validate the proline cleavage activity and fully reconstitute the observed proteolytic activity of Nepenthes pitcher fluid. In conclusion, the results of this chapter have identified and characterized the functional differences between nepenthesin I and II generated recombinantly. The results also validate, for the most part, our previous study showing a curious cleavage specificity profile for aspartic proteases. Nepenthesins appear to favor cleavage at basic residues in addition to the expected residues favored by classic aspartic proteases such as pepsin. Unfortunately, cleavage after P was essentially forbidden for recombinant nepenthesin I and II, although cleavage of Q in the P1 and P1’ positions was reconstituted. Strong evidence for a putative PEP suggests that the integrity of the recombinant nepenthesins remains intact. Therefore, at least a three-enzyme cocktail appears to be responsible for the proteolytic activity of Nepenthes fluid, with both the nepenthesins and the putative PEP being necessary for the treatment of CD. Alone, the recombinant nepenthesins appear to reconstitute only one-half of the necessary activity for the treatment of CD. However, their contributions to the proposed protease cocktail still need to be characterized and are therefore the subject of the next chapter. As well, an initial characterization of the proposed protease cocktail’s capacity as a therapeutic for CD can still be performed through the use of the native Nepenthes pitcher fluid and is also discussed in the next chapter.


3.6 Contributions to the Chapter

My main contributions to the chapter included: the experimental design of and the recombinant production, optimization and characterization (SDS-PAGE, MALDI-TOF MS, size-exclusion chromatography and cleavage preferences) of all nepenthesin I and II constructs and the enzymatic characterization of their mature forms (activity assays, pH profile, temperature profile, stability across pH and time, pepstatin IC50 profile, susceptibility to urea and GndCl). All figures shown in this chapter were made by me. I also prepared and analyzed LC-MS/MS samples for determination of the cleavage preferences and identification (in-gel digests) of nepenthesin I and II, which were injected into a Thermo Orbitrap Velos by Dr. Kelvin Ma (Laboratory of Dr. David Schriemer) and an Agilent 6550 iFunnel Q-TOF by Dr. Laurent Brechenmacher (Southern Alberta Mass Spectrometry Core Facility, Calgary, AB, Canada), respectively. As well, I performed all experiments in search of the missing prolyl endoprotease (SDS-PAGE with inhibitors and LC-MS/MS) except for the final gel-filtration profiles, which were performed by Dr. Martial Rey (Laboratory of Dr. David Schriemer). In addition, I performed the molecular cloning of nepenthesin I and II into the desired E. coli expression vectors in collaboration with Ronghua Yu in the laboratory of Dr. Tony Schryvers (Dept. of Biochemistry and Molecular Biology, University of Calgary, Calgary, AB, Canada). Finally, I performed the experiments to identify successful recombinant expression of nepenthesin I and II in Pichia pastoris (activity assays and SDS-PAGE), although the production trials were performed by Biogrammatics Inc. (Carlsbad, California).


Chapter Four: Characterization of the Aspartic Proteases, Nepenthesin I and II, as a Therapeutic for Celiac Disease.

4.1 Introduction

CD is a globally prevalent autoimmune disorder triggered by one of the world’s most common food substances, with no known treatment except a gluten-free diet [2]. The global rise in gluten intake, the dangers of hidden gluten contamination, patient non-compliance and limitations in lifestyle warrant the development of alternative treatments. Oral proteases for the detoxification of gluten prior to ingestion are highly promising and offer several desirable advantages required of a therapeutic for CD [7,99]. In support of this, several clinical candidates are currently being evaluated in advanced clinical trials. However, the total number and efficacies of the proposed treatments are considered low. Efforts to overcome concerns over the efficacy of the proposed treatments have been somewhat successful, mainly through increased dosages, but are considered suboptimal. Ideally, a single oral protease or group of oral protease candidates should offer optimal and efficient gluten detoxification prior to the trigger’s arrival in the small intestine, where the immune cascade is initiated. In our previous studies, we identified that the proteolytic activity of the pitcher secretions of Nepenthes plants presented efficient cleavage before and after P and Q, the high content of which is the basis for gluten’s immunogenicity [121]. Although the proteolytic activity of the pitcher fluid was originally attributed solely to the aspartic protease, nepenthesin, based on current literature [112-114, 118] and further supported by our own proteomics studies (Chapter 2), we later determined, through recombinant production and characterization of both isozymes of nepenthesin and further probing of the proteolytic activities of the Nepenthes pitcher fluid, that a previously unidentified enzyme, a putative PEP, appears to be responsible for the observed cleavage after P (Chapter 3).
Currently, we have gathered strong evidence for the existence of this PEP but attempts at proteomic identification through MS/MS and de novo sequencing have not yet been successful (Section 3.3.4). Therefore, in addition to further attempts at improving proteomic identification, genome characterization is currently being undertaken, the results of which are beyond the scope of the studies described in this thesis. Regardless, the results of the studies presented in this thesis show that the enzyme formulation of the Nepenthes pitcher fluid is different than originally hypothesized, consisting of more than just the nepenthesins. The results highlighted above do, however, suggest a further therapeutic advantage to the proposed treatment for CD in that a titratable two/three-enzyme cocktail, the ratio of which can be experimentally adjusted, rather than a single enzyme is responsible for the complete proteolytic activity of Nepenthes pitcher fluid. Nevertheless, proteolytic cleavage before and after Q, as well as before P, has been successfully reconstituted through the recombinant production of nepenthesin I and II. Since partial reconstitution of the proteolytic activity of the native Nepenthes fluid, or one-half of the proposed protease cocktail, has been successfully observed from recombinant nepenthesin I and II, the suitability of both enzymes as part of the proposed protease cocktail for the treatment of CD can be characterized. In addition, as the Nepenthes pitcher fluid houses the complete protease cocktail necessary for cleavage before and after P and Q, an initial characterization of the pitcher fluid’s capacity for gluten detoxification can be performed. In this chapter, we evaluate the potential of the novel aspartic proteases, nepenthesin I and II, alone and in combination with their intended native enzyme formulation from the pitcher fluid of Nepenthes plants, whose properties appear suited for an oral protease therapeutic for CD. The ability of each protease or protease formulation to digest an established immunodominant epitope of α-gliadin, the 33-mer peptide, was assessed by LC-MS quantitation and indirect ELISA, then compared to digests with an equivalent dose of pepsin. The recombinant nepenthesins were also used to digest gliadins from whole wheat extracts in a dose-dependent manner under simulated gastrointestinal conditions and assayed by indirect ELISA in order to quantitate digestion of the 33-mer peptide within a complex background.
The results were once again compared to pepsin-mediated digests of gliadin extracts at an equivalent dose. Finally, we digested gliadin extracts with a fixed but equivalent dosage of recombinant nepenthesins, Nepenthes pitcher fluid and pepsin and assessed overall gliadin and immunogenic epitope digestion over time by LC-MS/MS analyses.


4.2 Experimental Procedures

4.2.1 Chemicals

The Nepenthes pitcher fluid used in this chapter was obtained from a contracted greenhouse, the Urban Bog (Langley, BC). The 33-mer peptide of α-gliadin (LQLQPFPQPQLPYPQPQLPYPQPQLPYPQPQPF) was synthesized by New England Peptide Inc. The anti-gliadin peptide mouse IgG2a, kappa primary antibody (lot no. OE1701931, cat. no. HYB 314-02-02) was obtained from Thermo Scientific. All other chemicals, reagents, and antibodies, unless specifically stated, were obtained from Sigma Aldrich.

4.2.2 Monitoring Susceptibility to Pepsin

Recombinant nepenthesin I and II (4 µg each) were incubated alone or in combination with 4 µg porcine pepsin (Sigma Aldrich, lot no. 074K7717, cat. no. P6887) for 1.5 hours at 37 ˚C in 100 mM glycine-HCl, pH 2.5. After 1.5 hours, the proteases were used to digest 4 µg of BSA (Sigma Aldrich, lot no. SLBB9570V, cat. no. A7906) at 37 ˚C for 5 minutes and analyzed by SDS-PAGE (12% acrylamide). A time-course was performed for a combination of nepenthesin I and porcine pepsin at 37 ˚C in 100 mM glycine-HCl, pH 2.5 from 0-60 min. Densitometry of the gel bands was performed with ImageJ v1.47 [140] to monitor the stability of each enzyme over time.

4.2.3 LC-MS Quantitation of Digests of the 33-mer Peptide

The 33-mer peptide (20 µM) was digested with ~100 ng of native Nepenthes pitcher fluid or 2 µM of nepenthesin I, nepenthesin II, or porcine pepsin at 37 ˚C, 200 rpm for 1.5 hours in 10 mM glycine-HCl, pH 2.5. The reaction was quenched by raising the pH to 8 with 20 mM Tris-base and frozen at -80 ˚C until the time of further quantitation analyses. For LC-MS quantitation, samples were thawed then diluted to 1 µM of 33-mer peptide in 1% FA prior to immediate injection into an LC system coupled to an Agilent 6550 iFunnel Q-TOF mass spectrometer for MS analyses. Data were collected essentially as described in Section 2.2.6 except that the mass spectrometer was operated in positive auto MS mode. The peptide ion chromatogram (PIC) of the 33-mer peptide (3912 Da) was extracted in each case and the area was calculated using Qual Browser (ve05) in Xcalibur (Thermo Scientific). Back-to-back runs were performed on each sample with blank samples run in between to ensure that the mass spectrometer was operating stably and that no carryover was evident.
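The quoted mass of the 33-mer peptide can be checked from its sequence (Section 4.2.1). The sketch below sums standard average residue masses, adds one water, and converts the result to m/z for a few charge states. The choice of charge states is an assumption for illustration only; the text does not state which ions were used for the extracted chromatograms.

```python
# Standard average residue masses (Da) for the residues present in the 33-mer.
AVERAGE_RESIDUE_MASS = {
    "L": 113.1594,
    "Q": 128.1307,
    "P": 97.1167,
    "F": 147.1766,
    "Y": 163.1760,
}
WATER = 18.0153   # average mass of H2O added for the free termini
PROTON = 1.00728  # mass of a proton

SEQ_33MER = "LQLQPFPQPQLPYPQPQLPYPQPQLPYPQPQPF"


def average_mass(sequence):
    """Average molecular mass of an unmodified linear peptide."""
    return sum(AVERAGE_RESIDUE_MASS[residue] for residue in sequence) + WATER


def mz(mass, charge):
    """m/z of the [M + zH]z+ ion for a given neutral mass."""
    return (mass + charge * PROTON) / charge


mass_33mer = average_mass(SEQ_33MER)
print(round(mass_33mer, 1))
for z in (3, 4, 5):  # illustrative charge states only
    print(z, round(mz(mass_33mer, z), 2))
```

The computed average mass is approximately 3911.5 Da, consistent with the 3912 Da quoted above.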

4.2.4 Dose-dependent 33-mer Peptide and Gliadin Digest Preparations

In vitro digests of gliadins from whole wheat extracts (Sigma Aldrich, lot no. 118K7004V, cat. no. G3375), which contain an abundance of gliadins relative to glutenins, were performed under simulated gastrointestinal conditions in a total volume of 0.5 mL and consisted of 5 mg of gliadin extracts, 10 µg porcine pepsin and 0-500 µg of recombinant nepenthesin I or II or porcine pepsin in 100 mM glycine-HCl, pH 2.5. Dose-dependent 33-mer peptide digestions were performed with 125 µM of the 33-mer peptide and 0-10.6 µM of nepenthesin II. All digestions proceeded for 1.5 hours at 37 ˚C, 200 rpm. Tubes were laid on their sides to ensure homogeneity of the slurries. The reactions were quenched by raising the pH to 8 with ammonium hydroxide vapors, so as to maintain the concentrations of the reagents, then immediately flash frozen and stored at -80 ˚C until the time of further analyses.

4.2.5 Indirect ELISA

Digestion of the 33-mer peptide in its pure peptide form and within gliadin extracts was assessed by indirect ELISA. The procedure was initially optimized by titration with 1:100-1:40000 fold dilutions of the anti-gliadin peptide mouse IgG2a, kappa primary antibody (1 mg/mL), which recognizes a portion of the immunodominant 33-mer (residues 57-65 of α-gliadin) and 1:100-1:32000 fold dilutions of the goat anti-mouse IgG secondary antibody (Sigma Aldrich, lot no. 067K60691, cat. no. A9316). The optimal dilutions for the primary and secondary antibodies were experimentally determined to be 1:300 and 1:1000 respectively. Standard curves were generated showing linearity for signals obtained from each concentration of substrate, the 33-mer peptide or gliadin extracts, within the experimental limits.


On day 1, the digested 33-mer peptide and gliadin extract solutions (described in Section 4.2.4) were diluted to 0.16 µg/mL and 2.5 µg/mL respectively in coating solution (50 mM sodium carbonate/bicarbonate buffer, pH 9.6). 100 µL of each diluted solution was then incubated in a Greiner-Bio-One Microlon 600 high binding 96-well plate at 4 ˚C, overnight (16 hours) with gentle rocking. On day 2, wells were washed thrice with wash solution (PBS, pH 7.4, 0.05% Tween-20) prior to incubation in Starting Block T20 TBS buffer (Thermo Scientific, lot no. 0B182151, cat. no. 37538) at 4 ˚C for 2 hours with gentle rocking, in order to block the remaining protein-binding sites. Samples were then incubated with the primary antibody, at a dilution of 1:300, at 4 ˚C, overnight (16 hours) with gentle rocking. On day 3, wells were washed thrice with wash solution prior to incubation with the secondary antibody at a dilution of 1:1000 at room temperature for 3 hours with gentle shaking. After incubation with the secondary antibody, samples were washed thrice with wash solution once again. Finally, 200 µL/well of substrate solution (5 mg/mL pNPP, 50 mmol/L sodium carbonate/bicarbonate buffer, pH 9.8, 1 mmol/L MgCl2) was added and the absorbance at 405 nm with a pathlength of 1 cm was measured every 5 minutes for 60 minutes in a Molecular Devices Filtermax F5 microplate reader. As a negative control, 0-500 µg of recombinant nepenthesin I was used to digest BSA rather than gliadin extracts as described in Section 4.2.4 and subjected to the indirect ELISA procedure described above in order to determine the experimental background.
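For orientation, the dilution factors implied by the coating concentrations can be worked out from the digest compositions in Section 4.2.4. The sketch below assumes nominal starting concentrations (5 mg of gliadin extract in 0.5 mL; 125 µM of 33-mer at the quoted mass of 3912 Da) and ignores any change caused by digestion or quenching; the variable and function names are illustrative.

```python
def fold_dilution(stock_ug_per_mL, target_ug_per_mL):
    """Fold dilution needed to bring a stock to the target concentration."""
    return stock_ug_per_mL / target_ug_per_mL


# Gliadin digest: 5 mg of extract in 0.5 mL, expressed in µg/mL.
gliadin_stock = 5.0 * 1000.0 / 0.5

# 33-mer digest: µmol/L x g/mol gives µg/L; divide by 1000 for µg/mL.
peptide_stock = 125.0 * 3912.0 / 1000.0

gliadin_fold = fold_dilution(gliadin_stock, 2.5)   # target 2.5 µg/mL
peptide_fold = fold_dilution(peptide_stock, 0.16)  # target 0.16 µg/mL
print(gliadin_stock, gliadin_fold)
print(peptide_stock, round(peptide_fold))
```

Under these assumptions the gliadin digests correspond to a 4000-fold dilution and the 33-mer digests to a roughly 3000-fold dilution before coating.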


Figure 4.1. Schematic of the indirect ELISA procedure.

Antigens (digests of 33-mer peptide or gliadin extracts) were coated overnight at 4 ˚C with gentle rocking in a Greiner-Bio-One Microlon 600 high binding 96-well plate. The next day, wells were washed thrice with PBS-Tween and blocked with Starting Block T20 TBS buffer for 2 hours at room temperature with gentle rocking. After 3 successive washes, wells were then incubated with primary antibody overnight at 4 ˚C with gentle rocking to capture coated antigen. On the final day, wells were initially washed thrice then incubated with the secondary antibody conjugated with alkaline phosphatase at room temperature for 3 hours with gentle rocking in order to capture primary antibody. After the incubation, wells were washed thrice for a final time before addition of the substrate (pNPP). The resulting signal (A405nm) was monitored at room temperature every 5 minutes for 1 hour.


4.2.6 Time-dependent Gliadin Digestions

Gliadin extracts were digested, in vitro, under simulated gastrointestinal conditions in a total volume of 0.5 mL. The digests consisted of 5 mg of gliadin from whole wheat extracts, 20 µg caffeine as an internal calibrant, and 20 µg of Nepenthes fluid proteases, nepenthesin I, nepenthesin II or porcine pepsin in 100 mM glycine-HCl, pH 2.5. Digestions proceeded for 52 hours and 40 minutes at 37 ˚C, 200 rpm with aliquots taken at 5 time points. Tubes were laid on their sides to ensure homogeneity of the slurries. At each time point, samples were centrifuged for 5 minutes at 13,000 rpm to remove the insoluble material then the supernatants were immediately flash frozen in liquid nitrogen and stored at -80 ˚C until the time of further analyses.

4.2.7 LC-MS/MS Analyses

Gliadin extract digests collected at different time points were assessed by LC-MS/MS. Approximately 1 µg of soluble gliadin/glutenin peptides was injected into the LC system and trapped over 4 minutes onto a 10 cm, 150 µm i.d., 5 µm particle diameter Magic C-18 column (Agilent Technologies). It should be noted that since only the glycine-HCl soluble peptides were collected, the true concentration of gliadin/glutenin peptides injected into the LC system was unknown. The putative amount injected was determined experimentally based on the desired signal intensity and standardized across all samples. Once trapped, gliadin/glutenin peptides were eluted using an acetonitrile gradient from 0-50% over 50 minutes at a flow rate of 4 µL/min. Peptides detected in these analyses were selected for CID fragmentation and spectra were searched against a miniaturized database in Mascot (v2.3) containing various isoforms of α- and γ-gliadins as well as low and high molecular weight glutenins (Appendix Table A4.1). The MS/MS ion search parameters used in Mascot were: a mass tolerance of 10 ppm on precursor ions and 0.02 Da on fragment ions, variable methionine oxidation modification, no enzyme specificity, and ESI-QUAD-TOF as the instrument type. A standard probability cut-off of p = 0.05 was also implemented. All data were manually verified. Blanks were run in between each experimental sample to ensure that no carryover was evident for quantitation purposes. The Mascot search results are shown in Appendix Tables A4.2-A4.21.
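To make the 10 ppm precursor tolerance concrete, the sketch below converts a ppm tolerance into an absolute window and checks whether an observed m/z falls within it. The m/z values are arbitrary examples and the function names are hypothetical; this illustrates the arithmetic only and is not a description of Mascot’s internal matching.

```python
def ppm_error(observed_mz, theoretical_mz):
    """Signed mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def tolerance_window_da(mz_value, tol_ppm=10.0):
    """Half-width, in m/z units, of a ppm tolerance window at a given m/z."""
    return mz_value * tol_ppm / 1e6


def within_tolerance(observed_mz, theoretical_mz, tol_ppm=10.0):
    """True if the observed value lies within the ppm tolerance."""
    return abs(ppm_error(observed_mz, theoretical_mz)) <= tol_ppm


# Arbitrary example: a precursor expected at m/z 800.
print(tolerance_window_da(800.0))        # half-width of the window at m/z 800
print(within_tolerance(800.004, 800.0))  # 5 ppm away
print(within_tolerance(800.012, 800.0))  # 15 ppm away
```

At m/z 800, a 10 ppm tolerance corresponds to a window of ±0.008, which is tighter than the 0.02 Da tolerance applied to fragment ions.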


4.2.8 Data Analysis

The sum of the intensities of all peptide ion chromatograms (PICs) for each experimental sample at each time point was determined in Mass Spec Studio [Schriemer Laboratory Software, unpublished]. The sum of the PICs for each sample digest was further separated into increments of 50 m/z units ranging from 400-1250 m/z and compared between each time point for each enzyme digest. Furthermore, the sum of the PIC intensities of each detected peptide containing the full intact sequence of an immunogenic epitope, as defined by current literature, was determined for all protease digests at each time point. In this way, any cleavage within an intact immunogenic epitope would result in a decrease in PIC intensity. The sequences of the immunogenic epitopes, which were mainly discovered from T-cell proliferation assays, housed in α/β- and γ-gliadins as well as low and high molecular weight glutenins that were searched in Mascot in these experiments are shown in Appendix Table A4.22.
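The binning step described above can be sketched as follows. This is only an illustrative reconstruction, not the Mass Spec Studio procedure: the input format (pairs of m/z and PIC intensity), the half-open bin boundaries, and all names are assumptions made for the example.

```python
from collections import defaultdict

# Assumed bin layout: 17 half-open bins of 50 m/z units, [400, 450) ... [1200, 1250).
BIN_START, BIN_END, BIN_WIDTH = 400.0, 1250.0, 50.0


def bin_label(mz: float):
    """Return the (low, high) m/z bin containing mz, or None if out of range."""
    if not (BIN_START <= mz < BIN_END):
        return None
    index = int((mz - BIN_START) // BIN_WIDTH)
    low = BIN_START + index * BIN_WIDTH
    return (low, low + BIN_WIDTH)


def sum_intensity_by_bin(peptides):
    """Sum PIC intensities per m/z bin for one digest at one time point.

    peptides: iterable of (mz, intensity) pairs (hypothetical input format).
    """
    totals = defaultdict(float)
    for mz, intensity in peptides:
        label = bin_label(mz)
        if label is not None:
            totals[label] += intensity
    return dict(totals)


# Hypothetical peptide ions, for illustration only.
example = [(412.3, 100.0), (449.9, 50.0), (450.0, 10.0), (1249.9, 5.0), (1300.0, 99.0)]
print(sum_intensity_by_bin(example))
```

Comparing the per-bin sums between time points then shows whether signal moves from the high m/z bins to the low m/z bins as digestion proceeds.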


4.3 Results and Discussion

4.3.1 Quantitation of the Protease Concentration within Nepenthes Pitcher Fluid

The pitcher secretions of Nepenthes plants contain proteolytic activity attributed to the aspartic proteases, nepenthesin I and II, and a putative PEP. Combined, this putative protease formulation appears to efficiently cleave before and after P and Q, the abundance of which is the basis for the immunogenicity of gliadins. In order to characterize the potential of this putative protease formulation for processing gluten, sufficient quantities of Nepenthes pitcher fluid were necessary and therefore obtained from a contracted greenhouse, the Urban Bog (Langley, BC), but the total protease concentration within the pitcher fluid remained to be determined. We had previously estimated the total nepenthesin concentration in 80x concentrated pitcher fluid from our home grown Nepenthes plants (Section 2.2.2) to be 22 ng/µL using the BCA assay [121], which measures total protein concentration, based on the assumption that nepenthesins predominate within the acid active pitcher fluid, an assumption validated by our own studies (Figure 2.3). We realized, however, that the use of this method provided an overestimation of the protease concentration within the Nepenthes pitcher fluid that may have been enhanced by the presence of glycosylation, which is known to interfere with the BCA assay [141]. Instead, for estimating the total protease concentration of our new Nepenthes pitcher fluid, the Urban Bog fluid, we compared the SDS-PAGE gel band intensities of the pitcher fluid sample with that of recombinant nepenthesin I. As seen in Figure 4.2, the intensity of the gel band corresponding to native nepenthesin I/II, as identified by LC-MS/MS, of 20 µL of 15x concentrated, acid active Nepenthes pitcher fluid appeared similar to that of ~1.5 µg of nepenthesin I. 
Based on these observations, the estimated concentration of proteases within the 1x Nepenthes pitcher fluid appears to be ~5 ng/µL given that the putative PEP cannot be observed by SDS-PAGE and its abundance is estimated to be <10 fold that of the native nepenthesins, as determined by the size exclusion chromatography profile of the pitcher fluid (Figure 3.20). Therefore, for all of the studies described in this chapter involving Nepenthes pitcher fluid, the quantities of proteases are provided based on an estimation of 5 ng/µL of protease for 1x pitcher fluid and the same batch of pitcher fluid was used for each experiment.
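The arithmetic behind the ~5 ng/µL estimate can be written out explicitly. The sketch below uses only the values stated above (a band equivalent to ~1.5 µg in 20 µL of 15x concentrated fluid); the function names are hypothetical, and the second helper simply inverts the estimate to give the volume of 1x fluid corresponding to a given protease dose.

```python
def estimate_1x_concentration_ng_per_ul(band_mass_ug: float,
                                        loaded_volume_ul: float,
                                        concentration_factor: float) -> float:
    """Protease concentration of unconcentrated (1x) fluid in ng/µL."""
    # Concentration of the loaded (concentrated) sample, converted from µg to ng.
    concentrated_ng_per_ul = band_mass_ug * 1000.0 / loaded_volume_ul
    # Divide by the fold concentration to return to 1x fluid.
    return concentrated_ng_per_ul / concentration_factor


def volume_for_dose_ul(dose_ug: float, concentration_ng_per_ul: float) -> float:
    """Volume of 1x fluid (µL) containing the requested protease dose."""
    return dose_ug * 1000.0 / concentration_ng_per_ul


# 1.5 µg in 20 µL of 15x fluid: 75 ng/µL at 15x, i.e. 5 ng/µL at 1x.
conc_1x = estimate_1x_concentration_ng_per_ul(1.5, 20.0, 15.0)
print(conc_1x)
# Equivalent volume of 1x fluid for a 20 µg protease dose.
print(volume_for_dose_ul(20.0, conc_1x))
```

Because the band intensity comparison is visual, the result should be read as an order-of-magnitude estimate rather than a measured concentration.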


Figure 4.2. Quantitation of the protease concentration in Nepenthes pitcher fluid by SDS-PAGE.

20 µL of 15 times concentrated Nepenthes pitcher fluid from the Urban Bog Greenhouse (Langley, BC), that had been acid activated with 100 mM glycine-HCl, pH 2.5 for 2 weeks in the fridge, was compared to 0.5-5 µg of mature recombinant nepenthesin I by SDS-PAGE (12% acrylamide).


4.3.2 Stability of Recombinant Nepenthesin I and II under Simulated Gastrointestinal Conditions

In order to be physiologically active as an oral therapeutic for CD, recombinant nepenthesin I and II must remain relatively stable throughout the typical course of digestion in the stomach, namely in an acidic environment in the presence of pepsin. Therefore, the stability of the recombinant nepenthesins under conditions akin to the environment and typical duration of digestion in the stomach was assessed in vitro by their incubation in the presence of pepsin at pH 2.5, 37 ˚C for 1.5 hours. Based on these studies, nepenthesin II appeared to maintain stability in the presence of pepsin or pepsin and substrate (BSA) as observed by densitometry throughout the entire 1.5 hour digestion period with little to no observed degradation (Figure 4.3A, lanes 5 and 7). In the presence of nepenthesin II, pepsin also maintained stability with little to no observed degradation (Figure 4.3A, lanes 5 and 7) that is no different from the natural auto-digestion of pepsin with time (Figure 4.3A, lane 8), which is known to occur over extended time periods in its active form [142]. On the other hand, nepenthesin I appeared to degrade to ~6% in the presence of pepsin and ~0% in the presence of pepsin and BSA by the end of the 1.5 hour incubation period (Figure 4.3A, lanes 4 and 6 respectively). Pepsin also appeared to degrade faster, to ~15%, over the 1.5 hour incubation period in the presence of nepenthesin I and BSA (Figure 4.3A, lane 6). Combined, these results suggest that nepenthesin I and pepsin are susceptible to each other’s proteolytic effects whereas nepenthesin II and pepsin are resistant to each other. Nepenthesin I’s susceptibility to pepsin digestion, therefore, prompted the need for a time-course evaluation to determine at which point nepenthesin I would be fully degraded in the presence of pepsin in order to determine its usefulness as a therapeutic. 
As seen in the time-course evaluation, ~70% of nepenthesin I and ~11% of pepsin was degraded over a 1 hour incubation in each other’s presence at 37 ˚C, pH 2.5 (Figure 4.3B). These results suggest that recombinant nepenthesin I is much more susceptible to digestion by pepsin than vice versa but is still able to function for at least half of the typical digestion period occurring in the stomach when incubated in equivalent amounts with pepsin. However, given that the substrate ratio will be largely in excess throughout the relatively short time-frame of physiological digestion occurring in the stomach, the rate of nepenthesin I digestion by pepsin should be dramatically reduced. Therefore, recombinant nepenthesin I appeared to remain a viable candidate and was assessed further for its capacity for gluten digestion and as a potential therapeutic for CD. Future work should,


Figure 4.3. Stability of nepenthesin I and II under simulated gastrointestinal conditions.

A) The proteases used in these studies are as shown: nepenthesin I (1), nepenthesin II (2) and pepsin (3). The stability of each protease or protease combination incubated for 1.5 hours at 37 ˚C, pH 2.5 is shown as: nepenthesin I and pepsin (4), nepenthesin II and pepsin (5). After the 1.5 hour incubation period, 4 µg of BSA was incubated with the proteases for 5 minutes in order to gauge remaining functionality and are as shown: nepenthesin I, pepsin and BSA (6), nepenthesin II, pepsin and BSA (7), pepsin and BSA (8), and BSA (9). Densitometry data relative to each unincubated protease is shown below the gel.

B) Time-course evaluation of nepenthesin I in the presence of pepsin. 4 µg each of nepenthesin I and pepsin were incubated at 37 ˚C for 0-60 minutes at pH 2.5. Incubation of nepenthesin I and pepsin alone at 37 ˚C for 60 minutes is shown (leftmost lanes). Densitometry data relative to 0 minutes incubation is shown above and below each pepsin and nepenthesin I gel band respectively.


however, assess whether the glycosylated form of nepenthesin I possesses an improved resistance to pepsin digestion.

4.3.3 Quantitative Analysis of the Capacity for Digestion of the 33-mer Peptide

There exists much evidence that certain gluten peptides, shown to be resistant to proteolysis by gastrointestinal enzymes, are able to stimulate the immune cascade required for the pathogenesis of CD. One such peptide that has been well described, the 33-mer peptide of α-gliadin (LQLQPFPQPQLPYPQPQLPYPQPQLPYPQPQPF), consists of 6 copies of 3 immunodominant epitopes identified in T-cell proliferation assays, namely: 1 copy of PFPQPQLPY, two copies of PYPQPQLPY and three copies of PQPQLPYPQ and is therefore an ideal antigen for initial studies of gliadin detoxification [143,144]. Therefore, digestion of the 33-mer peptide under simulated gastrointestinal conditions (1.5 hour digestion at pH 2.5, 37 ˚C) with the recombinant nepenthesins, Nepenthes pitcher fluid and pepsin was assessed by LC-MS and compared. All runs were performed twice, back to back and showed reproducible intensities and peak areas. As determined by the area of the PICs of the 33-mer peptide, in a 1:10 molar ratio of enzyme to substrate, recombinant nepenthesin I and II showed improved digestion of the 33-mer peptide after 1.5 hours, with 78.7% and 34.0% of the peptide remaining respectively (Figure 4.4A), relative to digestion mediated by pepsin in which digestion was negligible with 93% of the 33-mer peptide retained (Figure 4.4B). The results for nepenthesin II were further validated by indirect ELISA and showed that, in a dose-dependent manner, up to 50 ± 9% of 125 µM of the 33-mer was digested in a 1:13 ratio of nepenthesin II to substrate (Figure 4.5A). The Nepenthes pitcher fluid, on the other hand, showed vastly more efficient digestion of the 33-mer peptide in which complete digestion was observed in an estimated 1:150 molar ratio of enzyme to substrate (Figure 4.4A). 
Despite nepenthesin I and II’s improved cleavage preferences for Q in the P1 and P1’ positions relative to pepsin, they still appear to have a difficult time digesting the 33-mer peptide, although nepenthesin II appears to be more efficient in this regard than nepenthesin I. This may be due to the lack of ability of the recombinant nepenthesins for cleavage C-terminal to P, which is the basis for the proteolytic resistance of this immunogenic epitope.
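The epitope composition of the 33-mer stated above (one, two and three copies of the three epitopes, six in total) can be checked directly from the sequence. The short sketch below counts overlapping occurrences, since the epitopes share residues with one another; the helper name is hypothetical.

```python
PEPTIDE_33MER = "LQLQPFPQPQLPYPQPQLPYPQPQLPYPQPQPF"
EPITOPES = ["PFPQPQLPY", "PYPQPQLPY", "PQPQLPYPQ"]


def count_overlapping(sequence: str, motif: str) -> int:
    """Count occurrences of motif in sequence, allowing overlaps."""
    count = 0
    start = sequence.find(motif)
    while start != -1:
        count += 1
        # Advance by one residue so that overlapping matches are not skipped.
        start = sequence.find(motif, start + 1)
    return count


copies = {epitope: count_overlapping(PEPTIDE_33MER, epitope) for epitope in EPITOPES}
print(len(PEPTIDE_33MER), copies, sum(copies.values()))
```

The same counting approach applies to any peptide detected in a digest: a peptide retains an epitope only if the full nine-residue sequence is still present.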


Figure 4.4. LC-MS quantitation of digests of the 33-mer peptide.

A) Peptide ion chromatogram of the 33-mer peptide alone (black) compared to an equivalent amount of the 33-mer digested with recombinant nepenthesin I (green), II (purple), or Nepenthes pitcher fluid (blue). B) Peptide ion chromatogram of the 33-mer peptide alone (black) compared to an equivalent amount of the 33-mer digested with pepsin (red).

Digestions were performed in a 1:10 or 1:150 (Nepenthes fluid only) molar enzyme to substrate ratio at pH 2.5, 37 ˚C for 1.5 hours.


Figure 4.5. Quantitation of the amount of the 33-mer peptide (%) remaining as a function of dose of nepenthesin II.

In vitro digests of the 33-mer peptide by varying doses of recombinant nepenthesin II were analyzed by indirect ELISA (A) under optimized and linear conditions as determined by titration and an experimentally determined standard curve (B) respectively. The portion of the 33-mer peptide detected by the primary antibody is shown in bold (top). Each data point shown is the mean of 3 biological replicates.


Regardless, recombinant nepenthesin II, at least, appears to show a markedly improved capacity for digestion of the immunodominant 33-mer peptide relative to pepsin, which further highlights this well characterized immunodominant gliadin epitope’s extreme resistance to proteolysis by gastrointestinal enzymes. The observed improvement in digestion of the 33-mer by nepenthesin II may translate further for other known immune-stimulating epitopes of gliadin that are resistant to gastrointestinal enzymes given that the basis for resistance is the abundance and positions of P and Q residues. However, further studies are required in order to assess this assumption especially within the context of a complex protein background, which is addressed later. Nevertheless, in combination with the putative PEP observed in the Nepenthes pitcher fluid, there appears to be highly efficient digestion of the 33-mer peptide even in extremely low doses within the time frame of digestion expected in the stomach, which further highlights the high efficiency of the proposed protease formulation. In order to stimulate the immune cascade, gluten peptides must be recognized by antigen presenting cells within a complex background. Therefore, we then assessed the efficiency of the recombinant nepenthesins and Nepenthes pitcher fluid for digestion of gliadin extracts, which contain an abundance of gliadins relative to glutenins, compared to pepsin. Qualitatively, all three proteases and the native protease formulation appeared to somewhat process the gliadin extracts in a dose-dependent manner as observed by clarification of the resulting gliadin extract slurries (Figure 4.6). Recombinant nepenthesin I and II appeared more efficient at clarification of the gliadin extract slurry than pepsin at equivalent doses (Figure 4.6A and C). 
As expected, however, Nepenthes fluid appeared to be much more efficient at processing the gliadin extract as the resulting solution was clarified further than that observed for either of the nepenthesins alone or pepsin even at lower doses and less than one-fourth of the standard 1.5 hour digestion time (Figure 4.6B). These results suggest that the recombinant nepenthesins are able to process gliadin better than pepsin, our natural stomach enzyme, but are much less efficient compared to the proposed formulation of nepenthesins and the PEP housed in the Nepenthes pitcher fluid. As well, this experiment addresses any concerns raised about recombinant nepenthesin I’s possible susceptibility to digestion by pepsin as similar clarification was observed in dose-dependent samples between equivalent doses of nepenthesin I and II and improved clarification was observed by nepenthesin I when compared to equivalent doses of pepsin thereby supporting the


Figure 4.6. Qualitative view of digestion of gliadin extracts by recombinant nepenthesins, Nepenthes pitcher fluid, and pepsin in a dose-dependent manner.

The control digestion sample contained 5 mg of whole wheat gliadin extract and 10 µg of pepsin in 100 mM glycine-HCl, pH 2.5.

A) The control digestion samples were incubated with the indicated amounts of recombinant nepenthesin I and II for 1.5 hours at 37 ˚C. B) The control digestion samples were incubated with the indicated amounts of Nepenthes pitcher fluid proteases for 20 minutes at 37 ˚C. C) The control digestion samples were incubated with the indicated amounts of pepsin for 1.5 hours at 37 ˚C.


earlier assumption that in the presence of excess substrate, very little nepenthesin I digestion by pepsin occurs. Since all protease/protease formulations tested under simulated gastrointestinal conditions were able to process the gliadin extracts, we then specifically assessed their capacities for digesting the 33-mer peptide beyond antibody recognition within the gliadin extracts. Initially, the cleavage sites of the 33-mer peptide within gliadin extracts were mapped by LC-MS/MS from digests by the recombinant nepenthesins and Nepenthes pitcher fluid. As shown in Table 4.1, the recombinant nepenthesins were able to cleave before and after L and most importantly after Q in the TG2 targeted consensus sequence, QXP, within the 33-mer peptide [52,145]. Remarkably, the Nepenthes pitcher fluid appeared to cleave before and after almost every residue in the 33-mer peptide, including the QXP consensus sequence, suggesting highly efficient destruction of the immune epitope. Of note, cleavage within all three of the LPYPQPQ repeats by the recombinant nepenthesins may have occurred but was not detected in these analyses due to the lack of overlapping sequence coverage in that area. The resistance to proteolysis coupled with the insoluble nature of the gliadin extract in water and high salt concentrations may have contributed to the lack of sequence coverage obtained for the 33-mer peptide when digested with the recombinant nepenthesins. Although methods for solubilisation have been developed, such as 60-80% (v/v) ethanol or high percentages of other alcohols [146-148], we remained with our established simulated gastrointestinal conditions as we did not want to introduce artificial solubility to the gliadins nor affect the activity of the proteases/protease formulations studied. Next, we quantified recombinant nepenthesin I and II digests of the 33-mer peptide within gliadin extracts in a dose-dependent manner by indirect ELISA. 
Digestion of the gliadin extracts with the Nepenthes pitcher fluid was not assessed in this instance as the fluid was limiting and larger quantities appeared to interfere with the experimental system. As shown in Figure 4.7, nepenthesin I and II appeared to digest up to ~81.5 ± 6.4% of the original amount of the 33-mer peptide within the gliadin extracts beyond antibody recognition in a dose-dependent manner up to a dosage of 500 µg. Pepsin, on the other hand, appeared to digest up to ~22.5 ± 2.4% of the original amount of the 33-mer peptide within the gliadin extracts with a dosage up to 50 µg where the signal then remained relatively stable even with increasing doses of pepsin up to 500 µg. Performing the experiment with BSA rather than gliadin digested with nepenthesin I


Table 4.1. Cleavage sites of the 33-mer peptide in gliadin extracts detected by LC-MS/MS

Epitope                         Protease         Sequence and Cleavage Sites for Digestion
33-mer (α-gliadin, a.a. 57-89)  NepI             L↓Q↓L↓QPFPQ↓PQ↓L↓PYPQPQ↓LPYPQPQLPYPQPQPF
                                NepII            L↓Q↓L↓Q↓PF↓PQ↓PQ↓L↓PYPQPQ↓LPYPQPQLPYPQPQPF
                                Nepenthes fluid  L↓Q↓L↓QP↓FP↓Q↓P↓Q↓L↓P↓YP↓QP↓Q↓L↓P↓Y↓P↓Q↓P↓Q↓L↓P↓Y↓PQ↓P↓Q↓P↓F

*LPYPQPQ repeats are in bold
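The arrow notation used in Table 4.1 can be converted into residue positions, which makes it straightforward to compare the proteases and to check which residues occupy the position immediately N-terminal to each cleavage site. The sketch below is an illustrative parser, not part of the original analysis; the function names are hypothetical, and the Nepenthes fluid entry is assumed to be the two wrapped fragments of the table row joined into one sequence.

```python
ARROW = "↓"

CLEAVAGE_MAPS = {
    "NepI": "L↓Q↓L↓QPFPQ↓PQ↓L↓PYPQPQ↓LPYPQPQLPYPQPQPF",
    "NepII": "L↓Q↓L↓Q↓PF↓PQ↓PQ↓L↓PYPQPQ↓LPYPQPQLPYPQPQPF",
    # Assumption: the wrapped row of Table 4.1 rejoined into a single string.
    "Nepenthes fluid": ("L↓Q↓L↓QP↓FP↓Q↓P↓Q↓L↓P↓YP↓QP↓Q↓L↓P↓Y↓P↓Q↓P↓Q"
                        "↓L↓P↓Y↓PQ↓P↓Q↓P↓F"),
}


def parse_cleavage_map(annotated: str):
    """Split an arrow-annotated sequence into (plain sequence, cleavage sites).

    Each site is the 1-based index of the residue N-terminal to the cleaved bond.
    """
    residues, sites = [], []
    for char in annotated:
        if char == ARROW:
            sites.append(len(residues))
        else:
            residues.append(char)
    return "".join(residues), sites


def residues_before_cleavage(annotated: str):
    """Set of residues found immediately N-terminal to a cleavage site."""
    sequence, sites = parse_cleavage_map(annotated)
    return {sequence[site - 1] for site in sites}


for protease, annotated in CLEAVAGE_MAPS.items():
    sequence, sites = parse_cleavage_map(annotated)
    print(protease, len(sites), sorted(residues_before_cleavage(annotated)))
```

Listing the residues that precede each cleavage site gives a quick view of whether a given digest shows any cleavage C-terminal to P.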


showed that the signal was specific as very little detectable signal was observed with BSA digested with 0-500 µg of nepenthesin I (Figure 4.7). In order to estimate the original concentration of the 33-mer peptide in the gliadin extract digests, the initial signals obtained from the gliadin extracts with no nepenthesin I or II digestion were applied to the standard curve generated from the pure 33-mer peptide antigen (Figure 4.5B) with the dilutions used taken into account. Based on these calculations, the original concentration of the 33-mer within the gliadin extract solution was estimated to be ~220 µM, making the ratio of enzyme to substrate ~1:15 for the 500 µg dose of recombinant nepenthesin I or II. These results produced a few visible discrepancies as it was previously shown that pepsin was essentially unable to digest the 33-mer peptide as determined by LC-MS quantitation (Figure 4.4B). As well, the results of this experiment appear to suggest that the recombinant nepenthesins are more efficient at digestion of the 33-mer peptide within gliadin extracts rather than in the pure antigen form alone even at lower enzyme to substrate ratios than those used in the previous LC-MS quantitation and indirect ELISA studies (Figure 4.4 and 4.5). These discrepancies may be explained by the differences in the quantitation methods and/or type of antigen used. In the indirect ELISA procedure, digestion of the gliadin extracts may have affected its coating capacities, which may be responsible for the initial drop in the signal corresponding to the amount of the 33-mer observed in the pepsin mediated digests; as well as, the perceived improvement in the nepenthesin I and II mediated digests. Therefore, the absolute quantity of the 33-mer at each enzyme dose cannot be confidently extracted from the results of this experiment. 
However, the difference in the signal between equivalent doses of nepenthesins compared to pepsin suggests that the recombinant nepenthesins are more efficient in digesting the 33-mer peptide than pepsin even within a complex background. As well, the relatively stable signal and large error bars observed for 50-500 µg pepsin but not recombinant nepenthesin I and II digests of the gliadin extracts further highlight the 33-mer peptide’s extreme resistance to pepsin digestion. Taken together, the results of these experiments suggest that recombinant nepenthesins are able to digest the 33-mer peptide more efficiently than an equivalent dosage of pepsin within a complex background of gliadins and glutenins in a dose-dependent manner.


Figure 4.7. Quantitation of the amount of the 33-mer peptide (%) remaining as a function of enzyme dose (µg).

In vitro digests of gliadin extracts with varying doses of the proteases shown (legend) were analyzed by indirect ELISA (A) under optimized and linear conditions as determined by titration and an experimentally determined standard curve (B) respectively. The portion of the 33-mer peptide detected by the primary antibody is shown in bold (top). Each data point shown is the mean of 3 biological replicates.


4.3.4 Assessment of the Recombinant Nepenthesins and Nepenthes Pitcher Fluids’ Capacity for Digestion of Gliadin Extracts

The results of the studies described thus far have shown that the proposed enzyme formulation of nepenthesins I/II and a putative PEP found in Nepenthes pitcher fluid efficiently digests a well characterized immunodominant epitope of α-gliadin, the 33-mer peptide, alone and in the context of a complex background under simulated gastrointestinal conditions in vitro (Figure 4.4 and 4.7). While it was demonstrated that the 33-mer peptide was less resistant to proteolysis by recombinant nepenthesin I and II relative to pepsin (Figure 4.4 and 4.7), the recombinant enzymes were much less efficient than the native formulation. Qualitatively, the Nepenthes pitcher fluid also appeared to digest gliadin extracts more efficiently than either of the recombinant nepenthesins alone, which, in turn, appeared more efficient than an equivalent dose of pepsin (Figure 4.6). This then led us to study, in depth, the efficiency of the recombinant nepenthesins and the Nepenthes pitcher fluid for global processing of gliadin extracts, which consist of a mixture of gliadins in excess of glutenins. To this end, we quantified the relative digestion capabilities of a standardized fixed dosage of each protease/protease formulation for the gliadin extracts over time (30 minutes to 52 hours and 40 minutes) by LC-MS and LC-MS/MS. Although physiological digestion in the stomach only occurs for ~1.5 hours, the effects of proteolysis observed at longer time points in these experiments should be achievable with increased dosages. To mitigate the effects of experimental loading errors, a standardized amount of an internal calibrant, caffeine, was added to each sample and all spectra obtained were normalized to the changes in the area of the extracted ion chromatogram (XIC) for caffeine at the first time point (30 minutes) of each digest. The XIC of the internal calibrant was, however, the same for all of the time points except for the last in each protease-mediated digest. 
Each digestion contained 5 mg of the gliadin extract and 20 µg of each protease/protease formulation in 100 mM glycine-HCl, pH 2.5, from which small aliquots (~1% of the total volume) were removed at each indicated time point for LC-MS/MS analyses. The retention times and intensities of the total ion chromatograms (TIC) of each digest were compared to gain a relative sense of the processing efficiency of the gliadin extracts by each protease/protease formulation for each digestion period. In general, lower retention times on the C18 column indicate smaller peptide sizes and an increase in intensity indicates a larger


Figure 4.8. Total ion chromatograms of gliadin extracts digested with various protease/protease formulations under simulated gastrointestinal conditions in vitro.

Digests of gliadin extracts with A) nepenthesin I, B) nepenthesin II, C) Nepenthes pitcher fluid and D) pepsin for the indicated times (shown in legends). The TIC profiles were generated with equivalent doses of each protease/protease formulation and the same mass-load of substrate. An internal calibrant confirmed that signals obtained between samples were stable.


Figure 4.9. Total ion chromatogram of gliadin extracts digested for 1 hour and 40 minutes with the indicated proteases/protease formulations (shown in legends) under simulated gastrointestinal conditions in vitro.

The TIC profiles were generated with an equivalent dose of each protease/protease formulation and the same mass-load of substrate. An internal calibrant confirmed that signals obtained between samples were stable.


number of peptides. Comparison of the TIC profiles for the Nepenthes pitcher fluid digests of the gliadin extracts showed an overall increase in intensity and clear shifts in the TIC profiles towards lower retention times with increased digestion times (Figure 4.8C) whereas very little shifting towards lower retention times was observed in the TIC profiles for pepsin (Figure 4.8D). Analysis of the TIC profiles at different digestion time points by recombinant nepenthesin I and II indicated that while shifts towards lower retention times were apparent, especially compared to pepsin, they were not as prominent as that observed for the Nepenthes pitcher fluid (Figure 4.8A- C). Between the two isozymes, nepenthesin I appeared to process the gliadin extract more efficiently than nepenthesin II based on comparison of the shifts in the TIC profiles toward lower retention times (Figure 4.8A and B). These results once again suggest that digestion of gliadin extracts over time by the Nepenthes pitcher fluid decreases peptide sizes at a far more efficient rate than either of the recombinant nepenthesins alone or pepsin. In further support of this notion, comparison of the digests after 1 hour and 40 minutes, the approximate amount of time of physiological digestion in the stomach, for all four protease/protease formulations showed the TIC profile of the Nepenthes pitcher fluid to be shifted far more prominently towards a lower retention time than either of the recombinant nepenthesins or pepsin, which all showed a similar TIC profile (Figure 4.9). Next, we assessed the digestion efficiency of each protease/protease formulation by the sum of the intensities of the PIC spectra detected in each sample at each digestion time point within a range of 50 m/z “bins” ranging from 400-1250 m/z. In this way, digestion of the gliadin extracts should result in a decrease in intensity sums in higher m/z bins but an increase in intensity in lower m/z bins with increased digestion times. 
As seen in Figure 4.10, the desired effect was observed for digests of gliadin extract with Nepenthes pitcher fluid; showing a clear decrease in the sums of the PIC intensities ranging from 1000-1250 m/z but a large increase from 400-700 m/z with increased digestion times. To a lesser extent, the desired effect was also observed for digests of gliadin extracts by recombinant nepenthesin I and II, although the intensities observed at mid-to-high range m/z bins remained relatively high with increased digestion times (Figure 4.11 and 4.12). However, a clear increase in intensity was observed in the lower m/z bins with increased digestion times. For pepsin digests of the gliadin extract, the sum of the intensities of the PIC’s for each bin ranging from 400-1200 m/z showed a global rise in intensity with increased digestion times until a maximum was reached and then subsequently


Figure 4.10. The sum of the intensities of peptides separated into defined m/z ranges detected from digests of gliadin extracts by Nepenthes pitcher fluid under simulated gastrointestinal conditions plotted as a logarithmic function of time (min).

All intensities were normalized to an internal calibrant and blanks were run between each experimental sample to minimize the effects of carryover.
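A minimal sketch of normalization to an internal calibrant is given below. It assumes a simple ratio correction, in which every intensity in a sample is scaled by the calibrant (caffeine) XIC area at the reference time point divided by the calibrant XIC area in that sample. The exact procedure applied in the software is not specified here, so the function names, the scaling rule and the example values are all assumptions.

```python
def normalization_factor(calibrant_xic_reference: float,
                         calibrant_xic_sample: float) -> float:
    """Scale factor that maps a sample onto the reference calibrant signal."""
    if calibrant_xic_sample <= 0:
        raise ValueError("calibrant XIC area must be positive")
    return calibrant_xic_reference / calibrant_xic_sample


def normalize_intensities(intensities, calibrant_xic_reference: float,
                          calibrant_xic_sample: float):
    """Apply the calibrant-based scale factor to a list of peptide intensities."""
    factor = normalization_factor(calibrant_xic_reference, calibrant_xic_sample)
    return [intensity * factor for intensity in intensities]


# Hypothetical areas: the sample calibrant signal is 20% lower than the
# reference, so all peptide intensities are scaled up by a factor of 1.25.
print(normalize_intensities([100.0, 200.0], 1000.0, 800.0))
```

With this correction, a sample whose calibrant signal matches the reference is left unchanged, so differences between time points reflect digestion rather than loading.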


Figure 4.11. The sum of the intensities of peptides separated into defined m/z ranges detected from digests of gliadin extracts by recombinant nepenthesin I under simulated gastrointestinal conditions plotted as a logarithmic function of time (min).

All intensities were normalized to an internal calibrant and blanks were run between each experimental sample to minimize the effects of carryover.


Figure 4.12. The sum of the intensities of peptides separated into defined m/z ranges detected from digests of gliadin extracts by recombinant nepenthesin II under simulated gastrointestinal conditions plotted as a logarithmic function of time (min).

All intensities were normalized to an internal calibrant and blanks were run between each experimental sample to minimize the effects of carryover.


Figure 4.13. The sum of the intensities of peptides separated into defined m/z ranges detected from digests of gliadin extracts by pepsin under simulated gastrointestinal conditions plotted as a logarithmic function of time (min).

All intensities were normalized to an internal calibrant and blanks were run between each experimental sample to minimize the effects of carryover.


declined (Figure 4.13). This indicated that while digestion of the gliadin extract occurred in the presence of pepsin, the average net change in peptide size remained relatively stable with the accumulation of time, perhaps due to the presence of the proteolytically resistant gliadin/glutenin peptides that remain to stimulate the immune cascade in CD. To support this notion, a qualitative comparison of the average peptide lengths of gliadin/glutenin peptides detected by LC-MS/MS in the presence of pepsin showed a relatively stable average peptide length over increasing digestion times (Figure 4.14B). To assess qualitative trends in the digests of the gliadin extract that could be correlated to the data described above, we determined the average peptide lengths and total detected numbers of gliadin/glutenin and known immunogenic gliadin peptides by LC-MS/MS. Based on these analyses, the total number of gliadin/glutenin peptides detected for pepsin and nepenthesin I and II digests of gliadin extracts appeared to rise with increased digestion times whereas their average peptide lengths remained relatively stable (Figure 4.14A and B). For pepsin (Figure 4.13) and to a lesser extent, nepenthesin I and II (Figure 4.11 and 4.12), the global increase in intensity and the relatively stable m/z distribution profiles observed with increased digestion times described above correlated well with the rise in total peptide numbers and the relatively stable average lengths of gliadin/glutenin peptides detected respectively (Figure 4.14A and B). On the other hand, digests of gliadin extracts by Nepenthes pitcher fluid resulted in increasing amounts of the total number of gliadin/glutenin peptides detected to a maximum followed by a decline while their average peptide lengths purely decreased with increased digestion times (Figure 4.14A and B). 
The declining average peptide lengths correlated well with the intensity profiles of lower m/z peptides over increasing digestion times for Nepenthes pitcher fluid digests of the gliadin extract (Figure 4.10). However, the drop in the total number of detected peptides after the 16 hours and 40 minutes digest by Nepenthes pitcher fluid indicated an instrument specific digestion maximum rather than a true digestion maximum as smaller peptides become more difficult, with peptides below 5 residues almost impossible, to detect by LC-MS/MS. Nevertheless, the instrument specific digestion maximum observed highlights the efficiency of the Nepenthes pitcher fluid to be able to process gliadin/glutenin peptides beyond detection by a highly sensitive instrument. Finally, as we had assessed the abilities of the recombinant nepenthesins and Nepenthes pitcher fluid for processing gliadin extracts on a global scale compared to pepsin, we then

Figure 4.14. Qualitative analysis of digests of gliadin extracts with the indicated protease/protease formulations (shown in legends) under simulated gastrointestinal conditions.

The total number of peptides detected (A), average peptide length (B), total number of immunogenic epitopes (C) and average length of immunogenic epitopes (D) are plotted as a logarithmic function of time (min). The dotted line in B) shows, in general, the minimal detectable peptide length and the dotted line in D) represents the minimal peptide length for the definition of an immunogenic epitope.

wanted to determine whether the same digestion effects could be observed for a panel of detectable immunogenic epitopes within the complex background of the gliadin extracts. For the qualitative assessment, the total number and average length of detected peptides containing a full immunogenic epitope sequence were determined as a function of digestion time from digests of the gliadin extracts by recombinant nepenthesin I and II, the native Nepenthes pitcher fluid, and pepsin. In general, the immunogenic peptide profiles for the digests appeared very similar to the profiles obtained for the total number and average length of all gliadin/glutenin peptides. The number and average length of immunogenic epitope-containing peptides remained relatively stable with increased digestion times by nepenthesin I and II as well as pepsin (Figure 4.14C and D). At the last time point of 52 hours and 40 minutes, however, the total number and average length of peptides containing an immunogenic epitope rose sharply for each of these digests, indicating release of, rather than digestion of, immunogenic gliadin/glutenin peptides, which may contribute to the manifestation of CD symptoms following gluten ingestion in a physiological context. The relative stability of the total number and average length of peptides containing immunogenic epitopes over the majority of the digestion time-course evaluated in these studies highlights, once again, their extreme resistance to proteolysis by pepsin and, to a lesser extent, nepenthesin I and II. For digests of the gliadin extract with the Nepenthes pitcher fluid, the total number of immunogenic epitopes detected by LC-MS/MS first rose to a maximum and subsequently declined, whereas the average length of these peptides decreased steadily with increased digestion times (Figure 4.14C and D).
This indicates that with increased digestion time or dose, the Nepenthes pitcher fluid is able to efficiently process immunogenic epitopes normally resistant to proteolysis by gastrointestinal enzymes within a complex background. Furthermore, the average length of the remaining intact immunogenic peptides detected at the final time point was 12 residues, which is not far from the minimum core length of 9 residues defined for immunogenic gliadin epitopes thus far [50]. These qualitative results were further corroborated by quantitative analysis of the sum of the PIC areas containing full, intact sequences of immunogenic epitopes detected from time-dependent digests of gliadin extracts by all four of the protease/protease formulations. These analyses, again, showed a relative stability in the sum of the intensities of immune-stimulating

Figure 4.15. The sum of the intensities of peptides containing immunogenic epitopes detected from digests of gliadin extracts with the indicated protease/protease formulations (shown in legend) under simulated gastrointestinal conditions plotted as a logarithmic function of time (min).

All intensities were normalized to an internal calibrant and blanks were run between each experimental sample to minimize effects of carryover.

epitope containing peptides with increased digestion times until the final time point, where a sharp increase in intensities was observed for nepenthesin I and II as well as pepsin digests of gliadin extracts (Figure 4.15). Digests of the gliadin extract with Nepenthes pitcher fluid, on the other hand, showed an initial increase in the sum of the intensities of immunogenic epitopes with increasing digestion times until the 1 hour and 40 minute time point, after which the intensities decreased. Taken together, these results suggest that there is an “uphill barrier” to the digestion of gliadins and glutenins, in which proteases initially release all of the immunogenic epitopes into solution, owing to their proteolytic resistance, prior to their destruction. This places further importance on the use of an optimally active and extremely efficient protease/protease formulation for the detoxification of gliadins and glutenins in CD patients, as inefficient digestion may increase the number and/or severity of CD symptoms, thereby causing more harm. In conclusion, the results described in this section once again indicate that while recombinant nepenthesin I and II appear to process gliadins and glutenins more efficiently than an equivalent dose of pepsin, they are much less efficient than the full protease formulation residing in the Nepenthes pitcher fluid, which appears to efficiently detoxify gliadins and glutenins.
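
As an illustration only, and not the analysis pipeline actually used for this thesis, the per-time-point metrics discussed in this section (total number of detected peptides, average peptide length, number and average length of peptides containing a full immunogenic epitope, and the summed calibrant-normalized intensity of those peptides) could be computed along the following lines. The Peptide record, the function names and all intensity values are hypothetical, and the two epitope strings are used purely as example inputs; the epitope panel actually evaluated is not listed in this section.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Iterable, Optional, Sequence


@dataclass(frozen=True)
class Peptide:
    """Hypothetical record for one peptide identified at one time point."""
    sequence: str     # one-letter amino acid sequence
    intensity: float  # raw peak intensity (arbitrary units)


def has_full_epitope(sequence: str, epitopes: Iterable[str]) -> bool:
    """True if the peptide contains at least one complete epitope sequence."""
    return any(epitope in sequence for epitope in epitopes)


def mean_length(peptides: Sequence[Peptide]) -> Optional[float]:
    """Average peptide length in residues, or None if nothing was detected."""
    return mean(len(p.sequence) for p in peptides) if peptides else None


def summarize_time_point(peptides, epitopes, calibrant_intensity):
    """Summarize one digest time point.

    Intensities are divided by the internal calibrant intensity, an assumed
    stand-in for the normalization mentioned in the Figure 4.15 caption.
    """
    if calibrant_intensity <= 0:
        raise ValueError("calibrant intensity must be positive")
    peptides = list(peptides)
    epitopes = tuple(epitopes)
    carriers = [p for p in peptides if has_full_epitope(p.sequence, epitopes)]
    return {
        "total_peptides": len(peptides),
        "mean_length": mean_length(peptides),
        "epitope_peptides": len(carriers),
        "epitope_mean_length": mean_length(carriers),
        "epitope_intensity_sum": sum(p.intensity for p in carriers)
        / calibrant_intensity,
    }


# Example inputs (intensities are invented for illustration).
EXAMPLE_EPITOPES = ("PFPQPQLPY", "PQPQLPYPQ")
EXAMPLE_PEPTIDES = [
    Peptide("LQLQPFPQPQLPYPQPQLPYPQPQLPYPQPQPF", 200.0),
    Peptide("QPFPQPQLPYPQ", 100.0),
    Peptide("PQLPY", 50.0),
    Peptide("QQPFPQ", 50.0),
]

if __name__ == "__main__":
    print(summarize_time_point(EXAMPLE_PEPTIDES, EXAMPLE_EPITOPES, 100.0))
```

Applying such a summary to each time point and plotting the values against the logarithm of digestion time would give curves of the kind described for Figures 4.14 and 4.15. A peptide counts as epitope-containing only when the complete epitope sequence is present, mirroring the "full, intact sequence" criterion used in the text.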

4.4 Conclusions and Future Directions

In summary, the results of initial characterizations of recombinant nepenthesin I/II and that of the proposed protease cocktail, housed in Nepenthes pitcher secretions, have supported their potential as effective therapeutics for CD. Recombinant nepenthesin I and II showed more efficient digestion of an immunodominant α-gliadin epitope, the 33-mer peptide, alone and within a complex background of gliadins and glutenins relative to pepsin as determined by LC-MS quantitation and indirect ELISA. While still resistant to proteolysis by the recombinant nepenthesins, the 33-mer peptide, alone, was digested more efficiently by nepenthesin II than I. However, within a complex background of gliadins and glutenins, both recombinant enzymes appeared to digest the 33-mer equivalently at similar dosages. The native Nepenthes pitcher fluid, on the other hand, showed highly efficient digestion of the 33-mer peptide as determined

by LC-MS quantitation even at an estimated 15x lower dosage than pepsin or either of the recombinant nepenthesins alone. After initial, promising results showing destruction of the 33-mer peptide, recombinant nepenthesins and the native Nepenthes pitcher fluid were then assessed for their capacity for digestion of gliadins and glutenins by LC-MS and LC-MS/MS. Digests of gliadin extracts with equivalent dosages of recombinant nepenthesin I and II, Nepenthes pitcher fluid, and pepsin were carried out over time and aliquots were collected for each digest at each time point. The quantitative and qualitative results of these studies showed that recombinant nepenthesin I and II appeared to process gliadin and glutenin peptides better than pepsin, although nepenthesin I appeared to be more efficient than nepenthesin II in this instance. However, all three proteases appeared to cause further release of gliadin and glutenin and immunogenic gliadin peptides that appeared to be resistant to further proteolysis even with increased digestion times. The native Nepenthes pitcher secretions, on the other hand, appeared to efficiently process the gliadin extracts globally and within detected immunogenic epitopes. Quantitatively and qualitatively, destruction of the immunogenic epitopes by the native Nepenthes pitcher fluid was characterized by an initial rise in the total number and intensities of immunogenic epitopes, indicating a release of peptides, followed by a steady decline, indicating destruction, with increased digestion times. In parallel, the average lengths and intensities in higher m/z ranges of gliadin extract peptides showed a continuous decline with increased digestion times. Overall, the results presented in this chapter support the potential of the proposed protease formulation residing in the Nepenthes pitcher fluid as a therapeutic for the treatment of CD. 
However, while one-half of the proposed protease formulation, the recombinant nepenthesins, showed modest detoxification effects relative to an equivalent dose of pepsin, they do not appear to be sufficient, alone, as a therapeutic for CD. Therefore, efforts should be placed in obtaining the sequence and subsequent recombinant product of the putative missing PEP in the Nepenthes pitcher secretions. Although not complete, the experiments described in this chapter for the initial characterizations of the native Nepenthes pitcher secretions and recombinant nepenthesins can be taken as a framework for assessing the suitability of the putative protease formulation, once fully reconstituted, as a potential therapeutic for CD. In addition, further pre-clinical experiments,

including in vivo effects, are a necessity for further assessing the therapeutic potential of the proposed protease formulation prior to clinical trials.

4.5 Contributions to the Chapter

My contributions to this chapter included: the experimental design and performance of all assays for gauging fluid protein concentration, all biochemical assays for gauging stability (SDS-PAGE) and activity of all protease/protease formulations, indirect ELISA assays against the 33-mer peptide, and gliadin digests. All figures shown in this chapter were made by me. In addition, I designed, performed and analyzed the digestion of the 33-mer peptide, although injection of the sample into the Agilent 6550 iFunnel Q-TOF mass spectrometer for LC-MS quantitation was performed by Dr. Laurent Brechenmacher (Southern Alberta Mass Spectrometry Laboratory). Finally, the LC-MS/MS experiments on in vitro digests of gliadin extracts at various time points were designed, performed and analyzed with an equal contribution from Dr. Martial Rey (Laboratory of Dr. David Schriemer).

Chapter Five: Research Summary and Future Considerations

5.1 Summary

Celiac disease is an autoimmune disorder arising in genetically susceptible individuals that is triggered by the ingestion of one of the world’s most common food substances, gluten. For those afflicted with disease symptoms, prolonged exposure to dietary gluten results in a seriously reduced quality of life and can be fatal. With the estimated global rise in gluten intake, the dangers of hidden gluten contamination, patient non-compliance and limitations placed on lifestyle, therapeutic options have been rigorously pursued but no therapeutic has yet been approved. Therefore, the ultimate goals of this thesis were to recombinantly reconstitute a novel protease formulation found in Nepenthes pitcher fluid, whose properties appear well suited to the task, and to assess its potential as a therapeutic for celiac disease. In Chapter 2, the identification of all proteases within the proteome of Nepenthes pitcher fluid was attempted through a proteomics-driven approach in addition to various biochemical assays. The results of these studies identified nepenthesin I and II as the only proteases present in the native Nepenthes pitcher fluid. Although these studies were inconclusive as a whole, sufficient evidence was presented to support the notion that the proteolytic activity of Nepenthes pitcher fluid is attributed solely to a combination of nepenthesin I and II. In Chapter 3, successful recombinant production of nepenthesin I and II was achieved and optimized in an E. coli based expression system. Subsequent assessment and comparison of the biochemical/enzymatic characteristics of the recombinant nepenthesins to those of their native forms strongly suggested correctly folded and functional recombinant products. However, neither of the recombinant nepenthesins was able to cleave C-terminal to proline, unlike digestion mediated by the Nepenthes pitcher fluid, although cleavage N- and C-terminal to all other residues was observed.
Eventually, further purification and biochemical trials with the Nepenthes pitcher fluid yielded an enriched fraction containing pure cleavage C-terminal to prolines, indicative of a previously unidentified and uncharacterized protease. In Chapter 4, the proposed protease cocktail housed in the Nepenthes pitcher fluid and its partially reconstituted proteolytic activity, in the form of recombinant nepenthesin I and II, were assessed for their capacity to process gliadin extracts in vitro by LC-MS/MS and indirect ELISA.

The recombinant nepenthesins were shown to digest an immunodominant gliadin epitope, the 33-mer of α-gliadin, more efficiently than an equivalent dose of pepsin, with nepenthesin II appearing to be more efficient than nepenthesin I. However, when digests of gliadin extracts were compared between the recombinant nepenthesins and pepsin, only modest benefits in detoxification potential were observed. On the other hand, Nepenthes pitcher fluid efficiently digested the 33-mer peptide, gliadin extracts and a host of immunogenic epitopes within gliadin extracts, reinforcing the potential of the proposed protease formulation, consisting of the nepenthesins and a putative uncharacterized prolyl endopeptidase, as a novel therapeutic for the treatment of celiac disease.

5.2 Future Considerations

While recombinant nepenthesin I and II, in their unglycosylated form, have been successfully generated and characterized, glycosylated nepenthesin I, which is found in nature, has not yet been produced and requires further exploration. Our own attempts for production of glycosylated nepenthesin I in a yeast-based expression system have, so far, been unsuccessful but further optimization of the expression procedure is currently in progress. Ultimately, if unsuccessful, different eukaryotic expression systems known for their capacity to produce proteins with complex disulfide networks and post-translational modifications, such as insect cells, should be employed in order to study whether glycosylation plays a role in the stability and/or efficiency of nepenthesin I. If so, studies to determine the proteolytic benefits of glycosylated nepenthesin I should be undertaken in order to determine whether, taken together, the use of glycosylated nepenthesin I is warranted for therapeutic means of treating CD. In these assessments, both a medical and cost perspective should be considered. The next crucial step required for the continuation of the studies presented in this thesis is the successful recombinant production of the uncharacterized PEP residing in Nepenthes pitcher fluid. First, however, the identity of the amino acid sequence of the putative PEP must be elucidated. To this end, isolation of cDNA from the pitchers of the Nepenthes pitcher plants is currently being optimized for whole genome sequencing. Once the sequence of the uncharacterized PEP has been obtained, a recombinant production strategy akin to the one employed in this thesis (Chapter 3) should be attempted and if unsuccessful, may require the use of more complex expression systems as described above. Successful recombinant production of the putative PEP will then require detailed enzymatic and biochemical characterizations such as that presented in Chapter 3 of this thesis. 
However, characterization of specific enzyme activity should first be attempted and the Z-Gly-Pro-pNA substrate, which has previously been employed successfully for studying PEP candidates for the treatment of CD, appears to be an attractive first option [102,103,109]. If the specific solvents necessary for the proposed specific activity assay are not compatible with the newly produced PEP in question, the hemoglobin activity assay used in Chapter 3 of this thesis can be used as an alternative. Nevertheless, efficient cleavage N- and C-terminal to P and Q must be reconstituted recombinantly.

Once complete reconstitution of the proteolytic activity observed for the Nepenthes pitcher fluid has been achieved recombinantly, the gluten detoxification abilities of the newly acquired PEP alone and in combination with recombinant nepenthesin I and/or II should be assessed in the same manner as described in Chapter 4 of this thesis. The optimum ratio of recombinant nepenthesin(s):PEP should be determined based on their relative digestion efficiency and the experimentally determined ratio should then be used for further in vitro gluten processing assays either at a fixed dosage or in a dose-dependent manner. Other studies for assessing gluten detoxification in more complex backgrounds simulating full meal loads could also be pursued, as could studies of the effects of digestion on deamidation by TG2 and in T-cell proliferation assays. Once in vitro assessments have been completed, in vivo assessments should be attempted. Although no bona fide animal model for CD exists, models such as the one proposed by Leroux et al., which monitors digestion of the 33-mer peptide in rats, offer a reasonable means for assessing gluten detoxification in vivo [149]. Finally, once the arsenal of pre-clinical studies in vitro and in vivo has been completed and the results remain promising, clinical trials should be pursued.

Chapter Six: Bibliography

1. Dicke, W., Weijers, H. and van de Kamer, J. (1953). Coeliac disease. II. The presence in wheat of a factor having a deleterious effect in cases of coeliac disease. Acta Paediatr., 3(42), 34-42

2. Kagnoff, M.F. (2007). Celiac disease: pathogenesis of a model immunogenetic disease. J. Clin. Invest., 117(1), 41-49

3. Green, P.H. and Cellier, C. (2007). Celiac disease. N. Engl. J. Med., 357, 1731-1743

4. Lee, A. and Newman, J.M. (2003). Celiac diet: its impact on quality of life. J. Am. Diet. Assoc., 103, 1533–1535

5. Hall, N.J., Rubin, G. and Charnock, A. (2009). Systematic review: adherence to a gluten-free diet in adult patients with coeliac disease. Aliment. Pharmacol. Ther., 30, 315–330

6. Rashtak, S. and Murray, J.A. (2012). Review article: coeliac disease, new approaches to therapy. Aliment Pharmacol Ther., 35(7), 768-781

7. Cerf-Bensussan, N., Matysiak-Budnik, T., Cellier, C. and Heyman, M. (2006). Oral proteases: a new approach to managing coeliac disease. Gut, 56, 157-160

8. Ludvigsson, J.F., Rubio-Tapia, A., van Dyke, C.T., Melton, L.J. 3rd, Zinsmeister, A.R., Lahr, B.D., et al. (2013). Increasing incidence of celiac disease in a North American population. Am. J. Gastroenterol., 108(5), 818-824

9. Rubio-Tapia, A., Ludvigsson, J.F., Brantner, T.L., Murray, J.A. and Everhart, J.E. (2012). The prevalence of celiac disease in the United States. Am. J. Gastroenterol., 107(10), 1538-1544

10. Gujral, N., Freeman, H.J. and Thomson, A.B.R. (2012). Celiac disease: Prevalence, diagnosis, pathogenesis and treatment. World J. Gastroenterol., 18(42), 6036-6059

11. Lohi, S., Mustalahti, K., Kaukinen, K., Laurila, K., Collin, P., Rissanen, H., et al. (2007). Increasing prevalence of celiac disease over time. Aliment. Pharmacol. Ther., 26 (9), 1217–1225

12. Ivarsson, A., Persson, L.A., Nyström, L. and Hernell, O. (2003). The Swedish coeliac disease epidemic with a prevailing twofold higher risk in girls compared to boys may reflect gender specific risk factors. Eur. J. Epidemiol., 18 (7), 677-684

13. Kapur, N., Hunt, I., Lunt, M., McBeth, J., Creed, F. and Macfarlane, G. (2005). Primary care consultation predictors in men and women: A cohort study. Br. J. Gen. Pract., 55, 108–113

14. Fasano, A. and Catassi, C. (2001). Current approaches to diagnosis and treatment of celiac disease: An evolving spectrum. Gastroenterology, 120, 636–651

15. Falchuk, Z.M., Rogentine, G.N. and Strober, W. (1972). Predominance of histocompatibility antigen HL-A8 in patients with gluten sensitive enteropathy. J. Clin. Invest., 51, 1602–1605

16. Stokes, P.L., Asquith, P., Holmes, G.K., Mackintosh, P. and Cooke, W.T. (1972). Histocompatibility antigens associated with adult coeliac disease. Lancet, 2, 162–164

17. Fasano, A., Berti, I., Gerarduzzi, T., Not, T., Colletti, R.B., Drago, S., et al. (2003). Prevalence of celiac disease in at-risk and not-at-risk groups in the United States: A large multicenter study. Arch. Intern. Med., 163, 286–292

18. Malekzadeh, R. (2005). Coeliac disease in developing countries: Middle East, India and North Africa. Best Pract. Res. Clin. Gastroenterol., 19(3), 351-358

19. Catassi, C., Rätsch, I.M., Gandolfi, L., Pratesi, R., Fabiani, E., El Asmar et al. (1999). Why is coeliac disease endemic in the people of the Sahara? Lancet, 354(9179), 647-648

20. Dewar, D.H. and Ciclitira, P.J. (2005). Clinical features and diagnosis of celiac disease. Gastroenterology, 128, S19–S24

21. Rostom, A., Dubé, C., Cranney, A., Saloojee, N., Sy, R, Garritty, C., et al. (2005). The diagnostic accuracy of serologic tests for celiac disease: A systematic review. Gastroenterology, 128(4), S38-S46

22. Reddick, B.K., Crowell, K. and Fu, B. (2006). Clinical inquiries: What blood tests help diagnose celiac disease? J. Fam. Pract., 55(12), 1088, 1090, 1093

23. Catassi, C., Fabiani, E., Ratsch I.M., Coppa, G.V., Giorgi, P.L., Pierdomenico, R., et al. (1996). The coeliac iceberg in Italy. A multicentre antigliadin antibodies screening for coeliac disease in school-age subjects. Acta Paediatr. Suppl., 412, 29–35

24. Ravikumara, M., Nootigattu, V.K.T. and Sandhu, B.K. (2007). Ninety percentage of celiac disease is being missed. J. Pediatr. Gastroenterol. Nutr., 45, 497–499

25. Kondrashova, A., Mustalahti, K., Kaukinen, K., Viskari, H., Volodicheva, V., Haapala, et al. (2008). Lower economic status and inferior hygienic environment may protect against celiac disease. Ann. Med., 40, 223–231

26. Day, L., Augustin, M.A., Batey, I.L. and Wrigley, C.W. (2006). Wheat-gluten uses and industry needs. Trends Food Sci. Technol., 17(2), 82-90

27. Bevan, S., Popat, S., Braegger, C.P., Busch, A., O’Donoghue, D., Falth-Magnusson, et al. (1999). Contribution of the MHC region to the familial risk of coeliac disease. J. Med. Genet., 36, 687–690

28. Monsuur, A.J. and Wijmenga, C. (2006). Understanding the molecular basis of celiac disease: what genetic studies reveal. Ann. Med., 38, 578–591

29. Petronzelli, F., Bonamico, M., Ferrante, P., Grillo, R., Mora, B., Mariani, P., et al. (1997). Genetic contribution of the HLA region to the familial clustering of coeliac disease. Ann. Hum. Genet., 61(Pt 4), 307–317

30. Monsuur, A.J., de Bakker, P.I., Alizadeh, B.Z., Zhernakova, A., Bevova, M.R., Strengman, E., et al. (2005). Myosin IXB variant increases the risk of celiac disease and points toward a primary intestinal barrier defect. Nat. Genet., 37, 1341–1344

31. King, A.L., Moodie, S.J., Fraser, J.S., Curtis, D., Reid, E., Dearlove, A.M., et al. (2002). CTLA-4/CD28 gene region is associated with genetic susceptibility to coeliac disease in UK families. J. Med. Genet., 39, 51–54

32. Djilali-Saiah, I., Schmitz, J., Harfouch-Hammoud, E., Mougenot, J.F., Bach, J.F. et al. (1998). CTLA-4 gene polymorphism is associated with predisposition to coeliac disease. Gut, 43, 187–189

33. Dube, C., Rostom, A., Sy, R., Cranney, A., Saloojee, N., Garritty, C., et al. (2005). The prevalence of celiac disease in average-risk and at risk Western European populations: a systematic review. Gastroenterology, 128 (Suppl 1), S57–67

34. Nistico, L., Fagnani, C., Coto, I., Percopo, S., Cotichini, R., Limongelli, M.G., et al. (2006). Concordance, disease progression, and heritability of coeliac disease in Italian twins. Gut, 55, 803–808

35. Greco, L., Romino, R., Coto, I., Di Cosmo, N., Percopo, S., Maglio, M., et al. (2002). The first large population based twin study of coeliac disease. Gut, 50, 624–628

36. Nadal, I., Donat, E., Ribes-Koninckx, C., Calabuig, M. and Sanz, Y. (2007). Imbalance in the composition of the duodenal microbiota of children with coeliac disease. J. Med. Microbiol., 56 (Pt 12), 1669-1674

37. Sánchez, E., De Palma, G., Capilla, A., Nova, E., Pozo, T., Castillejo, G., et al. (2011). Influence of environmental and genetic factors linked to celiac disease risk on infant gut colonization by Bacteroides species. Appl. Environ. Microbiol., 77 (15), 5316-5323

38. Collado, M.C., Donat, E., Ribes-Koninckx, C., Calabuig, M. and Sanz, Y. (2009). Specific duodenal and faecal bacterial groups associated with paediatric coeliac disease. J. Clin. Pathol., 62, 264–269

39. Lahdeaho, M.L., Lindfors, K., Airaksinen, L., Kaukinen, K. and Maki, M. (2012). Recent advances in the development of new treatments for celiac disease. Exp. Opin. Biol. Ther. 12 (12), 1589-1600

40. Frisoni, M., Corazza, G.R., Lafiandra, D., De Ambrogio, E., Filipponi, C., Bonvicini, F., et al. (1995). Wheat deficient in gliadins: a promising tool for treatment of coeliac disease. Gut, 36, 375-378

41. Carroccio, A., Di Prima, L., Noto, D., Fayer, F., Ambrosiano, G., Villanacci, V. et al. (2011). Searching for wheat plants with low toxicity in celiac disease: Between direct toxicity and immunologic activation. Dig. Liver Dis., 43 (1), 34-39

42. Van den Broeck, H.C., van Herpen, T.W., Schult, C., Salentijn, E.M., Dekking, L., Bosch, D. et al. (2009). Removing celiac disease-related gluten proteins from bread wheat while retaining technological properties: a study with Chinese Spring deletion lines. BMC Plant Biol., 9, 41-53

43. Spaenij-Dekking, L., Kooy-Winkelaar, Y., van Veelen, P., Drijfhout, J.W., Jonker, H., van Soest, L., et al. (2005). Natural variation in toxicity of wheat: potential for selection of nontoxic varieties for celiac disease patients. Gastroenterology, 129, 797-806

44. Shan, L., Qiao, S.W., Arentz-Hansen, H., Molberg, Ø., Gray, G.M., Sollid, L.M., et al. (2005). Identification and analysis of multivalent proteolytically resistant peptides from gluten: implications for celiac sprue. J. Proteome Res., 4, 732–741

45. Fasano, A., Not, T., Wang, W., Uzzau, S., Berti, I., Tommasini, A., et al. (2000). Zonulin, a newly discovered modulator of intestinal permeability, and its expression in coeliac disease. Lancet, 355, 1518–1519

46. Lammers, K.M., Lu, R., Brownley, J., Lu, B., Gerard, C., Thomas, K., et al. (2008). Gliadin induces an increase in intestinal permeability and zonulin release by binding to the chemokine receptor CXCR3. Gastroenterology, 135, 194–204

47. Abadie, V., Sollid, L.M., Barreiro, L.B. and Jabri, B. (2011). Integration of genetic and immunological insights into a model of celiac disease pathogenesis. Ann. Rev. Immunol., 23, 493-525

48. Shan, L., Molberg, Ø., Parrot, I., Hausch, F., Filiz, F., Gray, G.M., et al. (2002). Structural basis for gluten intolerance in celiac sprue. Science, 297, 2275-2279

49. Camarca, A., Anderson, R.P., Mamone. G., Fierro, O., Facchiano, A., Costantini, S., et al. (2009). Intestinal T cell responses to gluten peptides are largely heterogeneous: implications for a peptide-based therapy in celiac disease. Journal of Immunology, 182 (7), 4158–4166

50. Vader, L.W., Stepniak, D.T., Bunnik, E.M., Kooy, Y.M., de Haan, W., Drijfhout, J.W., et al. (2003). Characterization of cereal toxicity for celiac disease patients based on protein homology in grains. Gastroenterology, 125, 1105–1113

51. Salentijn, E.M.J, Mitea, D.C., Goryunova, S.V., van der Meer, I.M., Padioleau, I., Gilissen, L.J., et al. (2012). Celiac disease T-cell epitopes from gamma gliadins: immunoreactivity depends on the genome of origin, transcript frequency, and flanking protein variation. BMC Genomics, 13, 277

52. Pinkas, D.M., Strop, P., Brunger, A.T. and Khosla, C. (2007). Transglutaminase 2 undergoes a large conformational change upon activation. PLoS Biol., 5(12), e327

53. Sollid, L.M. and Khosla, C. (2005). Future therapeutic options for celiac disease. Nat. Clin. Pract. Gastroenterol. Hepatol., 2(3), 140-147

54. Nilsen, E.M., Lundin, K.E.A., Kracji, P., Scott, H., Sollid, L.M. and Brandtzaeg P. (1995). Gluten specific, HLA-restricted T cells from celiac mucosa produce cytokines with Th1 or Th0 profile dominated by interferon gamma. Gut, 37, 766-776

55. Salvati, V.M., Mazzarella, G., Gianfrani, C., Levings, M.K., Stefanile, R., De Giulio, B., et al. (2005). Recombinant human interleukin 10 suppresses gliadin dependent T cell activation in ex vivo cultured coeliac intestinal mucosa. Gut, 54 (1), 46–53

56. Sollid L. (2009). Coeliac disease: dissecting a complex inflammatory disorder. Nat. Rev. Immunol., 9, 647–655

57. Lindfors, K., Maki, M. and Kaukinen, K. (2010). Transglutaminase 2-targeted autoantibodies in celiac disease: pathogenic players in addition to diagnostic tools? Autoimm. Rev., 9, 744-749

58. Hue, S., Mention, J.J., Monteiro, R.C., Zhang, S., Cellier, C., Schmitz, J., et al. (2004). A direct role for NKG2D/MICA interaction in villous atrophy during celiac disease. Immunity, 21, 367–377

59. Maiuri, L., Ciacci, C., Ricciardelli, I., Vacca, L., Raia, V., Auricchio, S., et al. (2003). Association between innate response to gliadin and activation of pathogenic T cells in coeliac disease. Lancet, 362, 30–37

60. Schuppan, D., Junker, Y. and Barisani, D. (2009). Celiac disease: from pathogenesis to novel therapies. Gastroenterology, 137, 1912-1933

61. Rewers, M. (2005). Epidemiology of celiac disease: what are the prevalence, incidence, and progression of celiac disease? Gastroenterology, 128 (Suppl 1), S47–51

62. Fasano, A. and Uzzau, S. (1997). Modulation of intestinal tight junctions by Zonula occludens toxin permits enteral administration of insulin and other macromolecules in an animal model. J. Clin. Invest., 99, 1158-1164

63. Libonati, C.J. (2007). Recognizing Celiac Disease: Signs, Symptoms, Associated Disorders & Complications. Fort Washington, PA: Gluten Free Works Publishing.

64. Basude, D. and Paul, S. (2013). Recognition and management of coeliac disease in children. J. Fam. Health Care, 23 (8), 28-30, 32-35

65. Lurie, Y., Landau, D., Pfeffer, J. and Oren, R. (2008). Celiac disease diagnosed in the elderly. J. Clin. Gastroenterol., 42, 59–61

66. Nenna, R., Mora, B., Megiorni, F., Mazzilli, M.C., Magliocca, F.M., Tiberti, C., et al. (2008). HLA-DQB1_02 dose effect on RIA antitissue transglutaminase autoantibody levels and clinicopathological expressivity of celiac disease. J. Pediatr. Gastroenterol. Nutr., 47, 288–292

67. Westerberg, D.P., Gill, J.M., Dave, B., DiPrinzio, M.J., Quisel, A. and Foy, A. (2006). New Strategies for Diagnosis and Management of Celiac Disease. JAOA, 106(3), 145-151

68. Vilppula, A., Kaukinen, K., Luostarinen, L., Krekelä, I., Patrikainen, H., Valve, R., et al. (2009). Increasing prevalence and high incidence of celiac disease in elderly people: a population-based study. BMC Gastroenterol., 9, 49

69. Bardella, M.T., Velio, P., Cesana, B.M., Prampolini, L., Casella, G., Di Bella, C., et al. (2007). Coeliac disease: a histological follow-up study. Histopathology, 50, 465-471

70. Rubio-Tapia, A., Rahim, M.W., See, J.A., Lahr, B.D., Wu, T.T. and Murray, J.A. (2010). Mucosal recovery and mortality in adults with celiac disease after treatment with a gluten-free diet. Am. J. Gastroenterol., 105(6), 1412-1420

71. Midhagen, G. and Hallert, C. (2003). High rate of gastrointestinal symptoms in celiac patients living on a gluten-free diet: controlled study. Am. J. Gastroenterol., 98, 2023-2026

72. Lanzini, A., Lanzarotto, F., Villanacci, V., Mora, A., Bertolazzi, S., Turini, D., et al. (2009). Complete recovery of intestinal mucosa occurs very rarely in adult coeliac patients despite adherence to gluten-free diet. Aliment. Pharmacol. Ther., 29(12), 1299-1308

73. Catassi, C. and Fabiani, E. (2007). A prospective double-blind, placebo controlled trial to establish a safe gluten threshold for patients with celiac disease. Am. J. Clin. Nutr., 85(1), 160-166

74. Akobeng, A.K. and Thomas, A.G. (2008). Systematic review: tolerable amount of gluten for people with celiac disease. Aliment. Pharmacol. Ther., 27, 1044–1052

75. Thompson, T., Lee, A.R. and Grace, T. (2010). Gluten contamination of grains, seeds and flours in the United States: a pilot study. J. Am. Diet Assoc., 110(6), 937-940

76. Thompson, T. and Mendez, E. (2008). Commercial assays to assess gluten content of gluten-free foods: why they are not created equal. J. Am. Diet. Assoc., 108, 1682–1687

77. Gianfrani, C., Siciliano, R.A., Facchiano, A.M., Camarca, A., Mazzeo, M.F., Costantini, S., et al. (2007). Transamidation of wheat flour inhibits the response to gliadin of intestinal T cells in celiac disease. Gastroenterology, 133(3), 780-789

78. Molberg, Ø., McAdam, S.N., Lundin, K.E., Kristiansen, C., Arentz-Hansen, H., Kett, K., et al. (2001). T-cells from celiac disease lesions recognize gliadin epitopes deamidated in situ by endogenous tissue transglutaminase. European Journal of Immunology, 31(5), 1317-1323

79. Klöck, C., Jin, X., Choi, K., Khosla, C., Madrid, P.B., Spencer, A., et al. (2010). Acylideneoxoindoles: a new class of reversible inhibitors of human transglutaminase 2. Bioorg. Med. Chem. Lett., 21, 2692–2696

80. Choi, K., Siegel, M., Piper, J.L., Yuan, L., Cho, E., Strnad, P., et al. (2005). Chemistry and biology of dihydroisoxazole derivatives: selective inhibitors of human transglutaminase 2. Chem. Biol., 12, 469–475

81. Watts, R.E., Siegel, M. and Khosla, C. (2006). Structure-activity relationship analysis of the selective inhibition of transglutaminase 2 by dihydroisoxazoles. J. Med. Chem., 49, 7493–7501

82. Mazzarella, G., Salvati, V.G., Iaquinto, G., Stefanile, R., Capobianco, F., Luongo, D., et al. (2012). Reintroduction of gluten following flour transamidation in adult celiac patients: a randomized, controlled clinical study. Clinical and Developmental Immunology, vol. 2012, Article ID 329150, 10 pages

83. Freund, K.F., Doshi, K.P., Gaul, S.L., Claremon, D.A., Remy, D.C., Baldwin, J.J., et al. (1994). Transglutaminase inhibition by 2-[(2-oxopropyl)thio]imidazolium derivatives: mechanism of factor XIIIa inactivation. Biochemistry, 33, 10109–10119

84. Jamma, S., Leffler, D.A., Dennis, M., Najarian, R.M., Schuppan, D.B., Sheth, S., et al. (2011). Small intestinal release mesalamine for the treatment of refractory celiac disease type I. J. Clin. Gastroenterol., 45, 30–33

85. Costantino, G., della Torre, A., Lo Presti, M.A., Caruso, R., Mazzon, E. and Fries, W. (2008). Treatment of life threatening type I refractory coeliac disease with long-term infliximab. Dig. Liver Dis., 40, 74–77


86. Przemioslo, R.T., Lundin, K.E., Sollid, L.M., Nelufer, J. and Ciclitira, P.J. (1995). Histological changes in small bowel mucosa induced by gliadin sensitive T lymphocytes can be blocked by anti-interferon gamma antibody. Gut, 36, 874–879

87. Ciacci, C., Maiuri, L., Russo, I., Tortora, R., Bucci, C., Cappello, C., et al. (2009). Efficacy of budesonide therapy in the early phase of treatment of adult coeliac disease patients with malabsorption: an in vivo/in vitro pilot study. Clin. Exp. Pharmacol. Physiol., 36, 1170–1176

88. Rubio-Tapia, A., Talley, N.J., Gurudu, S.R., Wu, T.T. and Murray, J.A. (2010). Gluten-free diet and steroid treatment are effective therapy for most patients with collagenous sprue. Clin. Gastroenterol. Hepatol., 8, 344–349

89. Vickery, B.P. and Burks, A.W. (2009). Immunotherapy in the treatment of food allergy: focus on oral tolerance. Curr. Opin. Allergy Clin. Immunol., 9, 364–370

90. Keech, C.L., Dromey, J., Chen, Z., Anderson, R.P., and McCluskey, J. (2009). Immune Tolerance Induced By Peptide Immunotherapy in An HLA Dq2-Dependent Mouse Model of Gluten Immunity. Gastroenterology, 136, A57

91. Maurano, F., Siciliano, R.A., De Giulio, B., Luongo, D., Mazzeo, M.F., Troncone, R., et al. (2001). Intranasal administration of one alpha gliadin can downregulate the immune response to whole gliadin in mice. Scand. J. Immunol., 53, 290–295

92. Paterson, B.M., Lammers, K.M., Arrieta, M.C., Fasano, A. and Meddings, J.B. (2007). The safety, tolerance, pharmacokinetic and pharmacodynamic effects of single doses of AT-1001 in coeliac disease subjects: a proof of concept study. Aliment. Pharmacol. Ther., 26, 757–766.

93. Gopalakrishnan, S., Durai, M., Kitchens, K., Tamiz, A.P., Somerville, R., Ginski, M., et al. (2012). Larazotide acetate regulates epithelial tight junctions in vitro and in vivo. Peptides, 35(1), 86-94

94. Leffler, D.A., Kelly, C.P., Abdallah, H.Z., Colatrella, A.M., Harris, L.A., Leon, F., et al. (2012). A randomized, double-blind study of larazotide acetate to prevent the activation of celiac disease during gluten challenge. Am. J. Gastroenterol., 107(10), 1554-1562

95. Kelly, C.P., Green, P.H., Murray, J.A., Dimarino, A., Colatrella, A., Leffler, D.A., et al. (2013). Larazotide acetate in patients with coeliac disease undergoing a gluten challenge: a randomised placebo-controlled study. Aliment. Pharmacol. Ther., 37(2), 252-262

96. Di Cagno, R., De Angelis, M., Auricchio, S., Greco, L., Clarke, C., De Vincenzi, M., et al. (2004). Sourdough bread made from wheat and nontoxic flours and started with selected lactobacilli is tolerated in celiac sprue patients. Appl. Environ. Microbiol., 70, 1088-1096


97. Di Cagno, R., Barbato, M., Di Camillo, C., Rizzello, C.G., De Angelis, M., Giuliani, G., et al. (2010). Gluten-free sourdough wheat baked goods appear safe in celiac sprue patients: a pilot study. J. Pediatr. Gastroenterol. Nutr., 51, 777-783

98. Greco, L., Gobbetti, M., Auricchio, R., Di Mase, R., Landolfo, F., Paparo, F., et al. (2011). Safety for patients with celiac disease of baked goods made of wheat flour hydrolyzed during food processing. Clin. Gastroenterol. Hepatol., 9, 24-29

99. Stoven, S., Murray, J.A. and Marietta, E. (2012). Celiac disease: Advances in treatment via gluten modification. Clin. Gastroenterol. Hepatol., 10, 859-862

100. Frazer, A., Fletcher, R., Ross, C., Shaw, B., Sammons, H.G. and Schneider, R. (1959). Gluten-induced enteropathy: The effect of partially digested gluten. Lancet, 2, 252–255

101. Gass, J., Bethune, M.T., Siegel, M., Spencer, A. and Khosla, C. (2007). Combination enzyme therapy for gastric digestion of dietary gluten in patients with celiac sprue. Gastroenterology, 133, 472-480

102. Bethune, M.T., Strop, P., Tang, Y., Sollid, L.M. and Khosla, C. (2006). Heterologous expression, purification, refolding, and structural-functional characterization of EP-B2, a self-activating barley cysteine endoprotease. Chem. Biol., 13, 637-647

103. Shan, L., Marti, T., Sollid, L.M., Gray, G.M. and Khosla, C. (2004). Comparative biochemical analysis of three bacterial prolyl endopeptidases: implications for coeliac sprue. Biochem. J., 383, 311-318

104. Siegel, M., Bethune, M.T., Gass, J., Ehren, J., Xia, J., Johannsen, A., et al. (2006). Rational design of combination enzyme therapy for celiac sprue. Chem. Biol., 13, 649-658

105. Siegel, M., Garber, M.E., Spencer, A.G., Botwick, W., Kumar, P., Williams, R.N., et al. (2012). Safety, tolerability, and activity of ALV003: Results from two Phase 1 single, escalating-dose clinical trials. Dig. Dis. Sci., 57, 440-450

106. Tye-Din, J.A., Anderson, R.P., French, R.A., Brown, G.J., Hodsman, P., Siegel, M., et al. (2010). The effects of ALV003 pre-digestion of gluten on immune response and symptoms in celiac disease in vivo. Clin. Immunol., 134, 289-295

107. Lahdeaho, M., Maki, M., Kaukinen, K., Laurila, K., Marcantonio, A. and Adelman, D. (2011). ALV003, a novel glutenase, attenuates gluten-induced small intestinal mucosal injury in celiac disease patients: a randomized controlled phase 2A clinical trial. Gut, 60, A12

108. Ehren, J., Moron, B., Martin, E., Bethune, M.T., Gray, G.M. and Khosla, C. (2009). A food-grade enzyme preparation with modest gluten detoxification properties. PLoS ONE, 4(7), e6313


109. Stepaniak, D., Spaenij-Dekking, L., Mitea, C., Moester, M., de Ru, A., Baak-Pablo, R., et al. (2006). Highly efficient gluten degradation with a newly identified prolyl endoprotease: implications for celiac disease. Am. J. Physiol. Gastrointest. Liver Physiol., 291, G621-G629

110. Mitea, C., Havenaar, R., Drijfhout, J.W., Edens, L., Dekking, L. and Koning, F. (2008). Efficient degradation of gluten by prolyl endoprotease in a gastrointestinal model: implications for coeliac disease. Gut, 57, 25-32

111. Marietta, E.V., David, C.S. and Murray, J.A. (2011). Important lessons derived from animal models of celiac disease. Int. Rev. Immunol., 30(4), 197-206

112. Thornhill, A.H., Harper, I.S. and Hallam, N.D. (2008). The development of the digestive glands and enzymes in the pitchers of three Nepenthes species: N. alata, N. tobaica and N. ventricosa (Nepenthaceae). Int. J. Plant Sci., 169, 615–624

113. Vines, S.H. (1897). The proteolytic enzyme of Nepenthes. Ann. Bot., os-11, 563–584

114. Tokes, Z.A., Woon, W.C. and Chambers, S.M. (1974). Digestive enzymes secreted by the Nepenthes macferlanei L. Planta, 119, 39-46

115. Rawlings, N. D., Barrett, A. J. and Bateman, A. (2012). MEROPS: the database of proteolytic enzymes, their substrates and inhibitors. Nucleic Acids Res., 40, D343–350

116. Tang, J. and Wong, R.N. (1987). Evolution in the structure and function of aspartic proteases. J. Cell Biochem., 33(1), 53-63

117. Khan, A. R. and James, M. N. (1998). Molecular mechanisms for the conversion of zymogens to active proteolytic enzymes. Protein. Sci., 7, 815–836

118. Athauda, S. B. P., Matsumoto, K., Rajapakshe, S., Kuribayashi, M., Kojima, M., Kubomura-Yoshida, N., et al. (2004). Enzymic and structural characterization of nepenthesin, a unique member of a novel subfamily of aspartic proteinases. Biochem. J., 381, 295–306

119. Kubota, K., Metoki, Y., Athauda, S. B., Shibata, C. and Takahashi, K. (2010). Stability profiles of nepenthesin in urea and guanidine hydrochloride: comparison with porcine pepsin A. Biosci. Biotechnol. Biochem., 74, 2323–2326

120. Takahashi, K., Athauda, S. B. P., Matsumoto, K., Rajapakshe, S., Kuribayashi, M., Kojima, M., et al. (2005). Nepenthesin, a unique member of a novel subfamily of aspartic proteinases: enzymatic and structural characteristics. Curr. Protein. Pept. Sci., 6, 513–25

121. Rey, M., Yang, M., Burns, K. M., Yu, Y., Lees-Miller, S. P. and Schriemer, D. C. (2013). Nepenthesin from monkey cups for hydrogen/deuterium exchange mass spectrometry. Mol. Cell. Proteomics, 12, 464–472


122. Hamuro, Y., Coales, S.J., Molnar, K.S., Tuske, S.J. and Morrow, J.A. (2008). Specificity of immobilized porcine pepsin in H/D exchange compatible conditions. Rapid Commun. Mass Spectrom., 22, 1041-1046

123. Stephenson, P. and Hogan, J. (2006). Cloning and characterization of a ribonuclease, a cysteine proteinase, and an aspartic proteinase from pitchers of the carnivorous plant Nepenthes ventricosa Blanco. Int. J. Plant Sci., 167(2), 239–248

124. Hatano, N. and Hamada, T. (2012). Proteomic analysis of secreted protein induced by a component of prey in pitcher fluid of the carnivorous plant Nepenthes alata. J. Proteomics, 75(15), 4844-4852

125. Hatano, N. and Hamada, T. (2008). Proteome analysis of pitcher fluid of the carnivorous plant Nepenthes alata. J. Proteome Res., 7(2), 809-816

126. Anson, M. L. (1938). The estimation of pepsin, trypsin, papain, and cathepsin with hemoglobin. J. Gen. Physiol., 22, 79–89

127. Strohalm, M., Kavan, D., Novak, P., Volny, M. and Havlicek, V. (2010). mMass 3: a cross-platform software environment for precise analysis of mass spectrometric data. Anal. Chem., 82, 4648–4651

128. Hannig, G. and Makrides, S.C. (1998). Strategies for optimizing heterologous protein expression in Escherichia coli. Trends Biotechnol., 16(2), 54-60

129. Clark, E.D.B. (1998). Refolding of recombinant proteins. Curr. Opin. Biotech., 9, 157-163

130. Vallejo, L.F. and Rinas, U. (2004). Strategies for the recovery of active proteins through refolding of bacterial inclusion body proteins. Microb. Cell Fact., 3(1), 11

131. Fritz, J.D., Swartz, D.R. and Greaser, M.L. (1989). Factors affecting polyacrylamide gel electrophoresis and electroblotting of high-molecular-weight myofibrillar proteins. Anal. Biochem., 180(2), 205-210

132. Nobmann, U., Connah, M., Fish, B., Varley, P., Gee, C., et al. (2007). Dynamic light scattering as a relative tool for assessing the molecular integrity and stability of monoclonal antibodies. Biotechnol. Genet. Eng. Rev., 24, 117-128

133. Schuck, P. (2013). Analytical Ultracentrifugation as a Tool for Studying Protein Interactions. Biophys. Rev., 5(2), 159-171

134. Yamaguchi, S., Yamamoto, E., Mannen, T. and Nagamune, T. (2013). Protein refolding using chemical refolding additives. Biotechnol. J., 8(1), 17-31


135. Macauley-Patrick, S., Fazenda, M.L., McNeil, B. and Harvey, L.M. (2005). Heterologous protein production using the Pichia pastoris expression system. Yeast, 22(4), 249-270

136. Aikawa, J., Yamashita, T., Nishiyama, M., Horinouchi, S. and Beppu, T. (1990). Effects of glycosylation on the secretion and enzyme activity of Mucor rennin, an aspartic proteinase of Mucor pusillus, produced by recombinant yeast. J. Biol. Chem., 265(23), 13955-13959

137. Aoyagi, T., Kunimoto, S., Morishima, H., Takeuchi, T. and Umezawa, H. (1971). Effect of pepstatin on acid proteases. J. Antibiot., 24(10), 687-694

138. Chavaroche, A., Cudic, M., Giulianotti, M., Houghten, R.A., Fields, G.B. and Minond, D. (2013). Glycosylation of a disintegrin and metalloprotease 17 affects its activity and inhibition. Anal. Biochem., 449C, 68-75

139. Grinnell, B.W., Walls, J.D. and Gerlitz, B. (1991). Glycosylation of Human Protein C Affects Its Secretion, Processing, Functional Activities, and Activation by Thrombin. J. Biol. Chem., 266(15), 9778-9785

140. Schneider, C.A., Rasband, W.S. and Eliceiri, K.W. (2012). NIH Image to ImageJ: 25 years of image analysis. Nature Methods, 9, 671-675

141. Brown, R.E., Jarvis, K.L., and Hyland, K.J. (1989). Protein measurement using bicinchoninic acid: elimination of interfering substances. Anal. Biochem., 180, 136-139

142. Mills, J.N. and Tang, J. (1967). Molecular weight and composition of human gastricsin and pepsin. J. Biol. Chem., 242, 3093-3097

143. Arentz-Hansen, H., Korner, R., Molberg, Ø., Quarsten, H., Vader, W., et al. (2000). The Intestinal T cell response to α-gliadin in adult celiac disease is focused on a single deamidated glutamine targeted by tissue transglutaminase. J. Exp. Med., 191(4), 603-612

144. Arentz-Hansen, H., McAdam, S.N., Molberg, Ø., Fleckenstein, B., Lundin, K.E., et al. (2002). Celiac lesion T cells recognize epitopes that cluster in regions of gliadins rich in proline residues. Gastroenterology, 123(3), 803-809

145. Vader, L.W., de Ru, A., van der Wal, Y., Kooy, Y.M., Benckhuijsen, W., et al. (2002). Specificity of tissue transglutaminase explains cereal toxicity in celiac disease. J. Exp. Med., 195, 643–649

146. Jackson, E.A., Holt, L.M. and Payne, P.I. (1983). Characterisation of high molecular weight gliadin and low-molecular-weight glutenin subunits of wheat endosperm by two-dimensional electrophoresis and the chromosomal localisation of their controlling genes. Theor. Appl. Genet., 66(1), 29-37

147. Cook, R. H. and Rose, R.C. (1934). Solubility of gluten. Nature, 134, 380-381


148. Kurowska, E. and Bushuk, W. (1988). Solubility of flour and gluten protein in a solvent of acetic acid, urea, and cetyltrimethylammonium bromide, and its relationship to dough strength. Cereal Chem., 65(2), 156-158

149. Fuhrmann, G. and Leroux, J. (2012). In vivo fluorescence imaging of exogenous enzyme activity in the gastrointestinal tract. PNAS, 109(42), 9032-9037

150. Vader, W., Kooy, Y., Van Veelen, P., De Ru, A., Harris, D., Benckhuijsen, W., et al. (2002). The gluten response in children with celiac disease is directed toward multiple gliadin and glutenin peptides. Gastroenterology, 122(7), 1729-1739

151. van de Wal, Y., Kooy, Y.M., van Veelen, P.A., Peña, S.A., Mearin, L.M., et al. (1998). Small intestinal T cells of celiac disease patients recognize a natural pepsin fragment of gliadin. PNAS, 95(17), 10050-10054

152. Sjöström, H., Lundin, K.E., Molberg, Ø., Körner, R., McAdam, S.N., et al. (1998). Identification of a gliadin T-cell epitope in coeliac disease: general importance of gliadin deamidation for intestinal T-cell recognition. Scand. J. Immunol., 48(2), 111-115


Appendix

Table A2.1. Mascot search results in the NCBI Viridiplantae (green plants) database for characterization of the Nepenthes pitcher fluid proteome. Samples were digested with trypsin overnight (16 hours) and processed as described in Sections 2.2.3 and 2.2.4.

Columns: Mascot Score; Mass (Da); Num. of significant matches; Num. of significant sequences*; Description
1657 71195 47 5 heat shock protein 70-1 [Nicotiana tabacum]
964 72676 25 3 binding protein 1 [Chlamydomonas reinhardtii]
939 71854 27 4 Heat shock protein 70B [Theobroma cacao]
899 71333 23 3 hypothetical protein VITISV_041184 [Vitis vinifera]
897 74941 25 3 Heat Shock Protein 70, cytosolic [Bathycoccus prasinos]
732 73055 18 2 hypothetical protein SELMODRAFT_440900 [Selaginella moellendorffii]
717 56433 17 2 PREDICTED: heat shock cognate 70 kDa protein-like [Fragaria vesca subsp. vesca]
380 72268 13 2 predicted protein [Hordeum vulgare subsp. vulgare]
1468 55457 49 8 predicted protein [Hordeum vulgare subsp. vulgare]
541 58295 17 2 hypothetical protein COCSUDRAFT_48882 [Coccomyxa subellipsoidea C-169]
1108 47012 36 1 RecName: Full=Aspartic proteinase nepenthesin-1; AltName: Full=Nepenthesin-I; Flags: Precursor
936 50519 52 6 predicted protein [Hordeum vulgare subsp. vulgare]

749 45842 26 4 uncharacterized protein LOC100501669 [Zea mays]
713 54496 19 2 uncharacterized protein LOC100274495 [Zea mays]
601 59710 38 8 uncharacterized protein LOC100273100 [Zea mays]
443 59120 18 4 predicted protein [Hordeum vulgare subsp. vulgare]
414 61693 19 3 F1F0 ATP synthase, subunit alpha, mitochondrial [Volvox carteri f. nagariensis]
587 81177 18 2 predicted protein [Hordeum vulgare subsp. vulgare]
518 80392 15 2 predicted protein [Hordeum vulgare subsp. vulgare]
271 80973 13 2 heat shock protein 90A [Chlamydomonas reinhardtii]
49 81298 3 2 heat shock protein 90A [Arabidopsis thaliana]
576 18285 15 2 uncharacterized protein LOC100273141 [Zea mays]
501 33414 8 1 predicted protein [Hordeum vulgare subsp. vulgare]
356 38771 13 4 actin, partial [Mesostigma viride]
245 42094 9 4 actin [Chlamydomonas reinhardtii]
134 41897 6 4 Os05g0106600 [Oryza sativa Japonica Group]
124 51882 6 2 PREDICTED: actin-1-like isoform X1 [Setaria italica]
410 42857 20 4 ADP/ATP carrier 2 isoform 1 [Theobroma cacao]
270 35113 19 5 adenosine nucleotide translocator [Brassica oleracea var. viridis]


375 36406 12 1 RecName: Full=Malate dehydrogenase, mitochondrial; Flags: Precursor
333 46102 7 2 dead box ATP-dependent RNA helicase, putative [Ricinus communis]
67 46287 2 2 predicted protein [Hordeum vulgare subsp. vulgare]
318 110436 15 3 predicted protein [Hordeum vulgare subsp. vulgare]
308 62302 8 2 chaperonin 60, mitochondrial [Ostreococcus lucimarinus CCE9901]
284 29365 13 5 uncharacterized protein LOC100382976 [Zea mays]
225 20725 9 3 ADP-ribosylation factor [Actinidia chinensis]
219 29361 12 4 14-3-3 protein homologue [Hordeum vulgare subsp. vulgare]
44 59707 2 1 maturase K [Castanopsis inermis]
208 33064 6 1 Aldo/keto reductase [Coccomyxa subellipsoidea C-169]
183 35305 4 1 predicted protein [Hordeum vulgare subsp. vulgare]
182 60562 9 1 PREDICTED: UPF0481 protein At3g47200-like [Glycine max]
169 61889 10 1 PREDICTED: heat shock cognate 70 kDa protein-like [Fragaria vesca subsp. vesca]

168 74179 4 3 predicted protein [Hordeum vulgare subsp. vulgare]
153 41766 4 2 uncharacterized protein LOC100274291 precursor [Zea mays]
151 8536 8 2 RecName: Full=Ubiquitin
149 49429 4 2 uncharacterized protein LOC100274418 [Zea mays]
145 30771 4 1 predicted protein [Hordeum vulgare subsp. vulgare]
115 37563 5 1 Cullin 1, putative isoform 1 [Theobroma cacao]
108 25801 6 1 PREDICTED: vacuolar iron transporter homolog 4-like [Glycine max]
103 16198 4 1 predicted protein [Hordeum vulgare subsp. vulgare]
102 14782 3 2 hypothetical protein FOXB_10824 [Fusarium oxysporum Fo5176]
94 24642 5 1 predicted protein [Hordeum vulgare subsp. vulgare]
93 22400 2 1 predicted protein [Hordeum vulgare subsp. vulgare]
93 32710 2 1 spermidine synthase [Arabidopsis thaliana]
92 86445 4 2 predicted protein [Hordeum vulgare subsp. vulgare]
91 32001 2 1 predicted protein [Ostreococcus lucimarinus CCE9901]
88 16456 4 1 NDPK I [Coccomyxa subellipsoidea C-169]
86 93150 4 1 predicted protein [Bathycoccus prasinos]
85 31779 14 1 RecName: Full=Serine carboxypeptidase-like
83 383332 4 1 hypothetical protein VOLCADRAFT_92545 [Volvox carteri f. nagariensis]
80 86657 2 1 predicted protein [Hordeum vulgare subsp. vulgare]
79 40081 2 1 Ubiquitin family, putative [Oryza sativa Japonica Group]
79 29390 2 1 predicted protein [Populus trichocarpa]
79 27859 2 1 predicted protein [Hordeum vulgare subsp. vulgare]


75 72762 2 1 predicted protein [Hordeum vulgare subsp. vulgare]
75 94948 9 1 PREDICTED: plastid division protein CDP1, chloroplastic-like [Cucumis sativus]
73 23110 2 1 unnamed protein product [Arabidopsis thaliana]
73 33044 1 1 predicted protein [Hordeum vulgare subsp. vulgare]
72 95470 3 1 elongation factor 2 [Coccomyxa subellipsoidea C-169]
69 45637 3 1 aspartate aminotransferase [Medicago sativa]
69 21409 2 1 unknown [Picea sitchensis]
67 12721 2 1 40S ribosomal protein S26 [Bathycoccus prasinos]
66 9492 3 1 glyceraldehyde 3-phosphate dehydrogenase [Prosopis argentina]
64 45161 5 1 mitochondrial transcription termination factor family protein [Arabidopsis thaliana]
62 16851 2 1 RecName: Full=60S ribosomal protein L12
61 43839 1 1 RecName: Full=Tubulin beta-1 chain; AltName: Full=Beta-1-tubulin

60 82813 2 1 predicted protein [Ostreococcus lucimarinus CCE9901]

60 87675 3 1 predicted protein [Micromonas pusilla CCMP1545]
59 55493 3 2 predicted protein [Hordeum vulgare subsp. vulgare]
58 51170 2 1 predicted protein [Hordeum vulgare subsp. vulgare]
58 22387 3 1 predicted protein [Hordeum vulgare subsp. vulgare]
58 249199 1 1 hypothetical protein CARUB_v10012793mg [Capsella rubella]
57 17540 1 1 RecName: Full=40S ribosomal protein S18
56 38918 2 1 GDSL esterase/ [Arabidopsis thaliana]
55 37863 2 1 unknown [Populus trichocarpa x Populus deltoides]
55 26842 2 1 auxin-regulated protein-like protein [Oryza sativa Japonica Group]
54 111759 2 1 putative male sterility 1 protein (ISS) [Ostreococcus tauri]
54 28036 5 1 unknown [Picea sitchensis]
53 22062 2 1 Lipase, putative, expressed [Oryza sativa Japonica Group]
53 74723 2 1 hypothetical protein SELMODRAFT_408106 [Selaginella moellendorffii]
53 64805 1 1 phosphoglucomutase [Chlamydomonas reinhardtii]
52 85140 6 1 hypothetical protein ZEAMMB73_246912 [Zea mays]
52 188803 2 1 predicted protein [Physcomitrella patens subsp. patens]
52 27103 3 1 predicted protein [Physcomitrella patens subsp. patens]
51 111490 2 1 uncharacterized protein LOC100279776 [Zea mays]
51 22621 2 1 hypothetical protein CARUB_v10015867mg [Capsella rubella]
50 22536 1 1 unknown [Lotus japonicus]
50 59440 2 1 PREDICTED: uncharacterized protein LOC101257070 [Solanum lycopersicum]
50 29364 3 1 40S ribosomal protein S4 [Zea mays]


49 80638 1 1 protein kinase [Oryza sativa Japonica Group]
49 133399 1 1 Protein popC [Aegilops tauschii]
49 21406 1 1 YK426 [Oryza sativa (japonica cultivar-group)]
49 36269 4 1 unknown [Zea mays]
49 132113 2 1 predicted protein [Physcomitrella patens subsp. patens]
49 13561 1 1 dehydrin-like protein [Selaginella lepidophylla]
49 106518 2 1 mismatch repair protein [Arabidopsis thaliana]
47 13017 1 1 uncharacterized protein [Arabidopsis thaliana]
47 37896 1 1 aldehyde dehydrogenase, putative [Ricinus communis]
46 45387 3 1 hypothetical protein SORBIDRAFT_01g022760 [Sorghum bicolor]
46 31172 1 1 hypothetical protein CHLNCDRAFT_134066 [Chlorella variabilis]
46 156894 2 1 predicted protein [Ostreococcus lucimarinus CCE9901]

46 82471 1 1 predicted protein [Micromonas sp. RCC299]
46 23179 1 1 unknown [Picea sitchensis]
46 82189 2 1 unknown protein [Oryza sativa Japonica Group]
46 76908 1 1 hypothetical protein VITISV_011663 [Vitis vinifera]
46 40028 1 1 putative sorbitol dehydrogenase [Arabidopsis thaliana]
46 42482 1 1 Cysteine-rich RLK (RECEPTOR-like protein kinase) 8 [Theobroma cacao]
45 34427 1 1 nodulin-like protein [Arabidopsis thaliana]
45 89387 1 1 predicted protein [Bathycoccus prasinos]
45 6783 1 1 hypothetical protein [Oryza sativa Japonica Group]
45 45745 2 1 hypothetical protein VITISV_006088 [Vitis vinifera]
45 73391 2 1 diaminohydroxyphosphoribosylaminopyrimidine deaminase [Chlamydomonas reinhardtii]
45 122292 1 1 valyl-tRNA synthetase, putative [Ricinus communis]
44 47395 1 1 hypothetical protein COCSUDRAFT_83655 [Coccomyxa subellipsoidea C-169]
44 87484 1 1 Transcriptional activator FLO8 [Medicago truncatula]
44 12676 1 1 cytochrome c oxidase subunit 2, partial [Genlisea aurea]
44 32809 1 1 PREDICTED: uncharacterized protein LOC100787513 [Glycine max]
44 50484 1 1 alpha1 tubulin
44 69484 1 1 hypothetical protein OsI_03416 [Oryza sativa Indica Group]
44 47441 1 1 predicted protein [Micromonas pusilla CCMP1545]
43 32892 1 1 predicted protein [Hordeum vulgare subsp. vulgare]
43 7952 1 1 ribosomal protein S17 [Solanum lycopersicum]
43 70419 1 1 hypothetical protein PRUPE_ppa024355mg [Prunus persica]
43 41087 1 1 hypothetical protein PRUPE_ppa007562mg [Prunus persica]


42 56470 1 1 HXXXD-type acyl- family protein, putative [Theobroma cacao]
42 18712 1 1 uncharacterized protein LOC100382991 [Zea mays]
42 23087 1 1 60S ribosomal protein L13a-2 [Zea mays]
42 55538 1 1 PREDICTED: LOW QUALITY PROTEIN: beta- 1, chloroplastic-like, partial [Vitis vinifera]
42 11852 1 1 hypothetical protein MTR_6g006310 [Medicago truncatula]
42 147476 1 1 predicted protein [Bathycoccus prasinos]
42 399188 1 1 PREDICTED: uncharacterized protein LOC101246789 [Solanum lycopersicum]
42 39605 1 1 root iron transporter protein IRT1 [Malus xiaojinensis]
42 81914 1 1 hypothetical protein COCSUDRAFT_66283 [Coccomyxa subellipsoidea C-169]
42 18445 1 1 cyclophilin, partial [Brassica napus]
42 33821 1 1 hypothetical protein SELMODRAFT_417760 [Selaginella moellendorffii]
41 30674 1 1 hypothetical protein At2g10850 [Arabidopsis thaliana]
41 86246 1 1 putative jumonji-like transcription factor family protein [Zea mays]

41 84112 1 1 predicted protein [Hordeum vulgare subsp. vulgare]
41 28726 1 1 60S ribosomal protein L7 [Coccomyxa subellipsoidea C-169]
41 66331 2 1 predicted protein [Bathycoccus prasinos]
41 32853 1 1 hypothetical protein SELMODRAFT_56569 [Selaginella moellendorffii]
41 43871 1 1 predicted protein [Ostreococcus lucimarinus CCE9901]
41 90910 1 1 ARF-GAP domain 1 isoform 1 [Theobroma cacao]
41 22318 1 1 PREDICTED: uncharacterized protein LOC101302928 [Fragaria vesca subsp. vesca]
41 76508 1 1 predicted protein [Micromonas sp. RCC299]
40 43586 1 1 predicted protein [Populus trichocarpa]
40 41387 1 1 hypothetical protein CARUB_v10018750mg [Capsella rubella]
40 37800 1 1 zinc finger -like [Oryza sativa Japonica Group]
40 84591 1 1 NADH dehydrogenase [Coreopsis tinctoria]

*Ion cut-off score (p<0.05): 40

Mascot Results Link:

http://136.159.167.24/mascot/cgi/master_results_2.pl?file=..%2Fdata%2F20131106%2FF005897.dat


Table A2.2. Mascot search results in the NCBI Drosophila (fruit flies) database for characterization of the Nepenthes pitcher fluid proteome. Samples were digested with trypsin overnight (16 hours) and processed as described in Sections 2.2.3 and 2.2.4.

Columns: Mascot Score; Mass (Da); Num. of significant matches; Num. of significant sequences; Description
1286 50535 55 4 EF-1-alpha [Drosophila melanogaster]
1127 71372 23 3 heat shock protein cognate 4, isoform A [Drosophila melanogaster]
1007 71179 22 4 GK20316 [Drosophila willistoni]
713 53544 17 2 ATP synthase beta subunit [Drosophila melanogaster]
646 42196 24 6 actin [Drosophila melanogaster]
602 81994 12 1 RecName: Full=Heat shock protein 83; AltName: Full=HSP 82

503 35660 13 1 GH21422 [Drosophila grimshawi]
315 29951 15 3 14-3-3epsilon, isoform A [Drosophila melanogaster]
243 36819 6 1 AT08919p [Drosophila melanogaster]
206 8548 10 2 ubiquitin, partial [Drosophila melanogaster]
162 20684 5 2 GH23951 [Drosophila grimshawi]
136 59612 10 3 bellwether [Drosophila melanogaster]

121 40682 5 1 ANGEL 39 [Drosophila melanogaster]

117 74390 2 1 heat shock protein cognate 71 [Drosophila melanogaster]
101 22818 7 2 GM16917 [Drosophila sechellia]
62 13223 5 2 Chain C, Drosophila Nucleosome Structure
88 8975 5 1 GA25843 [Drosophila pseudoobscura pseudoobscura]
87 27989 2 1 ribosomal protein L8, isoform A [Drosophila melanogaster]
71 11504 3 1 unnamed protein product [Drosophila melanogaster]
69 15332 8 1 ribosomal protein S17 [Drosophila melanogaster]
67 15777 3 1 GL15782 [Drosophila persimilis]
66 51479 5 1 CG43861, isoform A [Drosophila melanogaster]
66 47849 2 1 S-adenosyl-L-homocysteine [Drosophila melanogaster]
61 22610 3 1 ribosomal protein S9, isoform A [Drosophila melanogaster]
60 57586 4 1 GF22728 [Drosophila ananassae]
59 59985 1 1 GF23744 [Drosophila ananassae]
59 233011 2 1 GJ14200 [Drosophila virilis]
59 26786 3 1 GJ24109 [Drosophila virilis]
55 35997 2 1 GI22416 [Drosophila mojavensis]


54 21231 2 2 GF20391 [Drosophila ananassae]
54 11827 3 1 GE15152 [Drosophila yakuba]
51 50899 2 1 GH10205 [Drosophila grimshawi]
51 57290 1 1 GA18419 [Drosophila pseudoobscura pseudoobscura]
50 84623 2 1 GA17662 [Drosophila pseudoobscura pseudoobscura]
50 231681 3 1 GJ21927 [Drosophila virilis]
48 93597 2 1 CG32344 [Drosophila melanogaster]
48 124321 3 1 Cep135, isoform A [Drosophila melanogaster]
48 99827 2 1 CG3409, isoform A [Drosophila melanogaster]
46 303124 2 1 MICAL short isoform [Drosophila melanogaster]
46 328756 3 1 GH13563 [Drosophila grimshawi]
46 184341 4 1 GL19882 [Drosophila persimilis]

46 89065 2 1 GH19328 [Drosophila grimshawi]
46 29855 2 1 GK22915 [Drosophila willistoni]
46 9545 1 1 GH16809 [Drosophila grimshawi]
45 59576 2 1 GI16060 [Drosophila mojavensis]
45 72516 2 1 GK13393 [Drosophila willistoni]
44 24891 1 1 GTP-binding nuclear protein RAN [Drosophila melanogaster]
44 31028 4 1 GH10772 [Drosophila grimshawi]
44 19706 2 1 GM12048 [Drosophila sechellia]
44 65535 2 1 GK22895 [Drosophila willistoni]
43 33846 3 1 GJ20058 [Drosophila virilis]
43 59164 2 1 CG3011, isoform A [Drosophila melanogaster]
43 45452 3 1 GJ14372 [Drosophila virilis]
43 156154 2 2 GH12781 [Drosophila grimshawi]
43 19728 2 1 GK19019 [Drosophila willistoni]
43 29266 2 1 ribosomal protein S4 [Drosophila melanogaster]
42 25760 2 1 ribosomal protein S5a [Drosophila melanogaster]
42 55531 2 1 GA27814 [Drosophila pseudoobscura pseudoobscura]
42 29559 1 1 GJ23083 [Drosophila virilis]
42 13004 2 1 TPA_inf: HDC18414 [Drosophila melanogaster]
41 48843 1 1 septin 2, isoform A [Drosophila melanogaster]
41 16729 2 1 GF16224 [Drosophila ananassae]
41 49261 2 1 GF13806 [Drosophila ananassae]
41 583537 2 1 futsch [Drosophila erecta]


41 23172 2 1 LD24589p, partial [Drosophila melanogaster]
41 48785 2 1 CG7564, isoform A [Drosophila melanogaster]
40 190543 1 1 GF21845 [Drosophila ananassae]
40 71046 1 1 GA14572 [Drosophila pseudoobscura pseudoobscura]
40 25328 1 1 GI24134 [Drosophila mojavensis]
40 79571 1 1 alpha-catenin-related protein [Drosophila melanogaster]
39 67580 1 1 GF19769 [Drosophila ananassae]
39 108266 1 1 GL11202 [Drosophila persimilis]
39 41399 1 1 CG3093 [Drosophila miranda]
39 84689 1 1 FGF homolog [Drosophila melanogaster]
39 46899 2 1 GK10664 [Drosophila willistoni]
39 21791 2 1 RE02065p [Drosophila melanogaster]

38 47310 1 1 CG9911, isoform A [Drosophila melanogaster]
38 170575 1 1 GI14719 [Drosophila mojavensis]
38 104529 1 1 GL26780 [Drosophila persimilis]
38 49173 1 1 GK15366 [Drosophila willistoni]
37 59669 1 1 GH19936 [Drosophila grimshawi]
37 93760 1 1 AT25463p [Drosophila melanogaster]
37 15570 1 1 GL22458 [Drosophila persimilis]
37 51438 1 1 tubulin-beta-3 [Drosophila melanogaster]
37 54498 1 1 GI10686 [Drosophila mojavensis]
37 58630 1 1 GL23692 [Drosophila persimilis]
37 10790 1 1 GK19002 [Drosophila willistoni]
37 97513 1 1 GJ17169 [Drosophila virilis]
37 255814 1 1 GK14900 [Drosophila willistoni]
37 43118 1 1 GL11798 [Drosophila persimilis]
37 31665 2 1 aromatic-L-amino-acid decarboxylase [Drosophila caribiana]
36 86072 1 1 GJ15152 [Drosophila virilis]
36 102942 1 1 GH18392 [Drosophila grimshawi]
36 164873 1 1 GM10062 [Drosophila sechellia]
36 98342 1 1 D19A [Drosophila melanogaster]
35 59830 1 1 GG21134 [Drosophila erecta]
35 227844 1 1 GH10629 [Drosophila grimshawi]
35 53925 1 1 hemomucin [Drosophila simulans]
35 186590 1 1 GI23105 [Drosophila mojavensis]


35 | 47318 | 1 | 1 | GE10456 [Drosophila yakuba]
34 | 66605 | 1 | 1 | GF23189 [Drosophila ananassae]
34 | 57459 | 1 | 1 | GH14374 [Drosophila grimshawi]
34 | 14719 | 1 | 1 | GK19425 [Drosophila willistoni]
34 | 15374 | 1 | 1 | GA28726, partial [Drosophila pseudoobscura pseudoobscura]
34 | 37431 | 1 | 1 | GL20756 [Drosophila persimilis]
34 | 57651 | 1 | 1 | GA19920 [Drosophila pseudoobscura pseudoobscura]
34 | 13680 | 1 | 1 | ATPase 6 [Drosophila navojoa]
34 | 273254 | 1 | 1 | GK12221 [Drosophila willistoni]
33 | 60309 | 1 | 1 | LD38816p [Drosophila melanogaster]
33 | 150284 | 1 | 1 | CG2807, isoform A [Drosophila melanogaster]

*Ion cut-off score (p<0.05): 33

Mascot Results Link:

http://136.159.167.24/mascot/cgi/master_results_2.pl?file=..%2Fdata%2F20131106%2FF005898.dat


Table A2.3. Mascot search results in the NCBI Bacteria (Eubacteria) database for characterization of the Nepenthes pitcher fluid proteome. Samples were digested with trypsin overnight (16 hours) and processed as described in Sections 2.2.3 and 2.2.4.

Mascot Score | Mass (Da) | Num. of significant matches | Num. of significant sequences | Description
914 | 31023 | 28 | 4 | ATP synthase subunit beta [Sphingomonas sp. 397]
911 | 52631 | 27 | 4 | F0F1 ATP synthase subunit beta [Acetobacter pasteurianus IFO 3283-01]
527 | 54995 | 21 | 3 | ATP synthase f1 subcomplex beta subunit [Marivirga tractuosa DSM 4126]
428 | 50507 | 16 | 3 | ATP synthase subunit beta [Candidatus Midichloria mitochondrii IricVA]
510 | 70835 | 15 | 2 | hypothetical protein [Verrucomicrobia bacterium SCGC AAA164-N20]
492 | 67153 | 14 | 2 | chaperone protein dnak [Campylobacter lari]
477 | 8520 | 21 | 1 | DNA-binding protein [Bacteroides fragilis NCTC 9343]
474 | 39142 | 31 | 3 | translation elongation factor EF-1, subunit alpha, partial [Escherichia coli]
310 | 55344 | 14 | 2 | ATP synthase F1 subunit alpha [Zymomonas mobilis subsp. pomaceae ATCC 29192]
239 | 55377 | 11 | 2 | F0F1 ATP synthase subunit alpha [Nitrosomonas eutropha C91]
129 | 54936 | 11 | 3 | F0F1 ATP synthase subunit alpha [Roseobacter sp. MED193]
122 | 15127 | 7 | 1 | hypothetical protein [Pseudomonas fragi]
405 | 17894 | 10 | 1 | putative peptidyl-prolyl cis-trans [Streptomyces ambofaciens ATCC 23877]
379 | 42288 | 14 | 3 | actin, alpha cardiac muscle 1 [Danio rerio]
358 | 71012 | 10 | 1 | heat shock protein 90 [Rickettsia prowazekii str. Madrid E]
252 | 34397 | 10 | 1 | malate dehydrogenase [Haemophilus ducreyi 35000HP]
251 | 46508 | 8 | 1 | enolase [Coprobacillus sp. 29_1]
223 | 66961 | 4 | 1 | heat shock protein 70 [Helicobacter pylori]
221 | 53483 | 9 | 1 | 6-phosphogluconate dehydrogenase, decarboxylating [Petrotoga mobilis SJ95]
195 | 48105 | 6 | 2 | S-adenosyl-L-homocysteine hydrolase [Acidocella sp. MX-AZ02]
181 | 4252 | 7 | 1 | hypothetical protein, partial [Pectobacterium carotovorum]
167 | 62057 | 10 | 1 | copper/silver resistance periplasmic protein [Vibrio furnissii NCTC 11218]
153 | 26698 | 4 | 1 | triosephosphate isomerase [Clostridium acetobutylicum ATCC 824]


152 | 29988 | 5 | 1 | morphine 6-dehydrogenase [Mycobacterium smegmatis str. MC2 155]
137 | 72047 | 9 | 1 | heat shock protein Hsp90 [Campylobacter rectus]
126 | 23346 | 2 | 1 | 70 kDa heat shock protein, partial [Dialister micraerophilus DSM 19965]
118 | 53249 | 2 | 1 | S-adenosyl-L-homocysteine hydrolase [Streptomyces coelicolor A3(2)]
100 | 55673 | 3 | 1 | gamma Glu-Cys synthetase
94 | 10499 | 7 | 2 | ubiquitin, partial [Herbaspirillum seropedicae]
93 | 61546 | 3 | 1 | ABC-type multidrug transport system, ATPase and permease components [Roseburia intestinalis XB6B4]
92 | 82346 | 4 | 1 | aconitate hydratase [Flavobacterium sp. SCGC AAA160-P02]
91 | 39018 | 3 | 1 | hypothetical protein [Acinetobacter sp. ATCC 27244]
91 | 219551 | 4 | 1 | hypothetical protein [Singularimonas variicoloris]
90 | 72471 | 4 | 1 | transketolase [Synechococcus sp. JA-2-3B'a(2-13)]
83 | 17906 | 3 | 1 | hypothetical protein, partial [Streptomyces sp. S4]
82 | 13037 | 2 | 1 | hypothetical protein, partial [Herbaspirillum seropedicae]
82 | 58004 | 3 | 1 | heat shock protein [Treponema pallidum]
82 | 34348 | 2 | 1 | spermidine synthase [Thermaerobacter marianensis DSM 12885]
81 | 126157 | 2 | 1 | type VI secretion system core protein VasK [Cronobacter sakazakii ES15]
79 | 12098 | 3 | 1 | hypothetical protein [Prevotella buccae]
78 | 40272 | 2 | 1 | outer membrane porin [Burkholderia pseudomallei K96243]
77 | 67834 | 2 | 1 | DnaK-like protein [Deinococcus gobiensis I-0]
76 | 45597 | 1 | 1 | acyl-CoA dehydrogenase [Ruegeria pomeroyi DSS-3]
73 | 32246 | 1 | 1 | pyridoxal biosynthesis [Anaerostipes caccae]
73 | 70372 | 3 | 1 | lipoprotein [SAR86 cluster bacterium SAR86A]
73 | 38737 | 4 | 1 | ABC transporter [Moorella thermoacetica ATCC 39073]
72 | 46264 | 5 | 1 | hypothetical protein [Xanthomonas translucens]
69 | 22764 | 2 | 1 | DNA-binding protein [Pseudomonas sp. Ag1]
69 | 37729 | 2 | 1 | outer membrane protein [Burkholderia cepacia]
69 | 22966 | 7 | 1 | hypothetical protein [Nonomuraea coxensis]
68 | 10962 | 2 | 1 | hypothetical protein R27_p157 [Salmonella enterica subsp. enterica serovar Typhi]
68 | 84359 | 1 | 1 | catalase/peroxidase HPI [Pedobacter saltans DSM 12145]
67 | 24420 | 5 | 1 | 5'-methylthioadenosine/S-adenosylhomocysteine nucleosidase [Campylobacter curvus 525.92]
67 | 29327 | 2 | 1 | conserved hypothetical protein [Methylosinus trichosporium]
67 | 48375 | 2 | 1 | restriction endonuclease [Congregibacter litoralis]
67 | 27562 | 2 | 1 | chemotaxis protein CheD [Acidovorax sp. CF316]
65 | 86208 | 2 | 1 | 5-methyltetrahydropteroyltriglutamate/homocysteine S-methyltransferase [Rhodospirillum rubrum ATCC 11170]


65 | 38447 | 2 | 1 | glycosyl transferase family protein [Acidaminococcus intestini RyC-MR95]
64 | 43095 | 2 | 1 | formate dehydrogenase [Lactobacillus buchneri NRRL B-30929]
64 | 67933 | 1 | 1 | molecular chaperone DnaK [Microcystis aeruginosa NIES-843]
64 | 92750 | 1 | 1 | adenylate cyclase, partial [uncultured bacterium]
63 | 28396 | 1 | 1 | general secretion pathway gspg related transmembrane protein [Fusobacterium sp. CAG:815]
62 | 61937 | 2 | 1 | hypothetical protein lpl2806 [Legionella pneumophila str. Lens]
62 | 66332 | 1 | 1 | hypothetical protein [Thermaerobacter subterraneus]
61 | 210024 | 3 | 1 | hypothetical protein [Sutterella wadsworthensis]
61 | 50355 | 1 | 1 | tRNA(Ile)-lysidine synthetase [Prevotella sp. oral taxon 473]
61 | 56837 | 4 | 1 | F0F1 ATP synthase subunit alpha [Chlorobium chlorochromatii CaD3]
61 | 59730 | 4 | 1 | hypothetical protein Cal7507_4786 [Calothrix sp. PCC 7507]
61 | 21749 | 2 | 1 | flavodoxin [Gemella haemolysans]
60 | 90681 | 1 | 1 | Copper-translocating P-type ATPase [Catellicoccus marimammalium]
60 | 29365 | 2 | 1 | GntR family transcriptional regulator [Xanthobacter autotrophicus Py2]
60 | 61149 | 3 | 1 | extracellular solute-binding 5 [Sphaerobacter thermophilus DSM 20745]
60 | 31485 | 1 | 1 | F0F1 ATP synthase subunit gamma [Gillisia sp. CBA3202]
60 | 16962 | 1 | 1 | anti-sigma factor, protein serine/threonine kinase [Geobacter sulfurreducens PCA]
59 | 41957 | 2 | 1 | N-acetylglucosamine-6-phosphate deacetylase [Streptococcus pneumoniae]
59 | 40626 | 1 | 1 | fructose-bisphosphate aldolase [Candidatus Blochmannia pennsylvanicus str. BPEN]
58 | 49827 | 1 | 1 | DEAD/DEAH box helicase domain-containing protein [Fluviicola taffensis DSM 16823]
58 | 26483 | 2 | 1 | short-chain dehydrogenase [gamma proteobacterium HdN1]
58 | 17834 | 1 | 1 | putative Peptidyl-prolyl cis-trans isomerase [Streptomyces aurantiacus]
57 | 44713 | 1 | 1 | hypothetical protein BPSS0684 [Burkholderia pseudomallei K96243]
57 | 30924 | 1 | 1 | hypothetical protein [Bacillus nealsonii]
57 | 62688 | 2 | 1 | coiled coil protein [Fluoribacter dumoffii]
56 | 44018 | 1 | 1 | mannose-6-phosphate isomerase [Actinomyces georgiae]
56 | 32952 | 1 | 1 | hypothetical protein [Acidocella sp. MX-AZ02]
56 | 9051 | 1 | 1 | hypothetical protein Bcen2424_6887 [Burkholderia cenocepacia HI2424]
56 | 20062 | 1 | 1 | ATP--cobalamin adenosyltransferase [Thermosipho melanesiensis BI429]


56 | 17820 | 1 | 1 | hypothetical protein [Prevotella multiformis]
55 | 23829 | 1 | 1 | putative antioxidant peroxidase [uncultured bacterium]
55 | 201968 | 1 | 1 | alpha-2-macroglobulin [Capnocytophaga gingivalis]
55 | 35631 | 1 | 1 | glyceraldehyde-3-phosphate dehydrogenase [Sulfurovum sp. AR]
55 | 53007 | 1 | 1 | 6-phosphogluconate dehydrogenase [Cardiobacterium valvarum]
54 | 52899 | 1 | 1 | RND efflux system, outer membrane lipoprotein,NodT family [Mesorhizobium sp. STM 4661]
54 | 18221 | 1 | 1 | PTS family fructose/mannitol porter component IIA [Corynebacterium glucuronolyticum]
54 | 593020 | 1 | 1 | non-ribosomal peptide synthetase [Streptomyces scabiei 87.22]
53 | 89427 | 1 | 1 | phosphoketolase [Synechococcus elongatus PCC 6301]
53 | 27468 | 1 | 1 | hypothetical protein L083_2172 [Actinoplanes sp. N902-109]
53 | 83083 | 1 | 1 | maltooligosyl trehalose synthase [Streptomyces sp. AA4]
53 | 45613 | 1 | 1 | unknown [Rhodococcus erythropolis]
53 | 57279 | 1 | 1 | hypothetical protein [Streptomyces chartreusis]
53 | 36360 | 1 | 1 | TrbL/VirB6 plasmid conjugal transfer protein [Chelativorans sp. BNC1]
53 | 59284 | 1 | 1 | phosphoglucomutase [Proteobacteria bacterium CAG:495]
53 | 47630 | 1 | 1 | cAMP-binding protein [Brachyspira hyodysenteriae WA1]
52 | 17015 | 1 | 1 | hypothetical protein [Acidobacteriaceae bacterium KBS 89]
52 | 21691 | 1 | 1 | hypothetical protein Moth_0715 [Moorella thermoacetica ATCC 39073]
52 | 28230 | 1 | 1 | putative hydrolase [Ilumatobacter coccineum YM16-304]
52 | 48300 | 1 | 1 | hypothetical protein [Alicyclobacillus hesperidum]
51 | 172381 | 1 | 1 | Mur middle domain-containing protein [Variovorax paradoxus S110]
51 | 39524 | 1 | 1 | hypothetical protein aq_737 [Aquifex aeolicus VF5]
51 | 80212 | 1 | 1 | (p)ppGpp 3-pyrophosphohydrolase [Hydrogenobacter thermophilus TK-6]

*Ion cut-off score (p<0.05): 51

Mascot Results Link:

http://136.159.167.24/mascot/cgi/master_results_2.pl?file=..%2Fdata%2F20140602%2FF007883.dat


Table A2.4. Mascot search results in the NCBI Viridiplantae (green plants) database for characterization of the Nepenthes pitcher fluid proteome. Samples were deglycosylated, digested with trypsin overnight (16 hours) and processed as described in Sections 2.2.3 and 2.2.4.

Mascot Score | Mass (Da) | Num. of significant matches | Num. of significant sequences | Description
4891 | 25557 | 164 | 8 | concanavalin A - jack bean
4584 | 25620 | 146 | 8 | Chain A, The Neutron Structure Of Concanavalin A At 2.2 Angstroms
1630 | 25616 | 43 | 3 | Chain A, Crystal Structure Of Recombinant Dioclea Guianensis Lectin Complexed With 5-Bromo-4-Chloro-3-Indolyl-A-D-Mannose
836 | 29864 | 19 | 3 | class IV chitinase [Nepenthes alata]
830 | 24621 | 20 | 2 | thaumatin-like protein [Nepenthes gracilis]
715 | 47012 | 27 | 1 | RecName: Full=Aspartic proteinase nepenthesin-1; AltName: Full=Nepenthesin-I; Flags: Precursor
337 | 36782 | 16 | 2 | glucanase []
131 | 64297 | 7 | 1 | predicted protein [Physcomitrella patens subsp. patens]
117 | 28036 | 28 | 1 | unknown [Picea sitchensis]
108 | 85140 | 31 | 1 | hypothetical protein ZEAMMB73_246912 [Zea mays]
85 | 1393 | 2 | 1 | RecName: Full=Unknown protein 18
81 | 11426 | 4 | 1 | Os01g0858200 [Oryza sativa Japonica Group]
73 | 31779 | 6 | 1 | RecName: Full=Serine carboxypeptidase-like
66 | 40698 | 4 | 1 | GDSL-like Lipase/Acylhydrolase superfamily protein, putative isoform 1 [Theobroma cacao]
58 | 73279 | 3 | 1 | unknown [Triticum aestivum]
55 | 147520 | 3 | 1 | hypothetical protein VOLCADRAFT_103675 [Volvox carteri f. nagariensis]
55 | 45751 | 3 | 1 | NADH dehydrogenase subunit 7 [Ipomoea purpurea]
55 | 15893 | 2 | 1 | PREDICTED: thylakoid membrane phosphoprotein 14 kDa, chloroplastic-like isoform 1 [Solanum lycopersicum]
53 | 42134 | 3 | 1 | maturase [Fouquieria splendens]
51 | 60178 | 3 | 1 | hypothetical protein CHLREDRAFT_120727 [Chlamydomonas reinhardtii]
50 | 34971 | 5 | 1 | putative F-box protein PP2-B6 [Arabidopsis thaliana]
49 | 32123 | 3 | 1 | chitinase [Mesembryanthemum crystallinum]
49 | 54377 | 3 | 1 | hypothetical protein SELMODRAFT_272349 [Selaginella moellendorffii]
48 | 47250 | 3 | 1 | PREDICTED: 3-hydroxyisobutyryl-CoA hydrolase-like protein 3, mitochondrial-like [Solanum lycopersicum]
47 | 48031 | 1 | 1 | Lipoyl synthase, mitochondrial [Triticum urartu]
46 | 64037 | 2 | 1 | predicted protein [Hordeum vulgare subsp. vulgare]


44 | 26903 | 1 | 1 | Crossover junction endonuclease MUS81 [Aegilops tauschii]
44 | 51186 | 1 | 1 | EMB2107 [Arabidopsis lyrata subsp. lyrata]
44 | 26557 | 2 | 1 | hypothetical protein [Brachypodium sylvaticum]
44 | 126468 | 1 | 1 | PREDICTED: cohesin subunit SA-1-like [Solanum lycopersicum]
44 | 27624 | 2 | 1 | PREDICTED: ATPase 4, plasma membrane-type-like, partial [Vitis vinifera]
44 | 973 | 1 | 1 | RecName: Full=Putative heat shock protein 2
43 | 19215 | 1 | 1 | Os02g0160300 [Oryza sativa Japonica Group]
43 | 13065 | 1 | 1 | Os06g0610000 [Oryza sativa Japonica Group]
43 | 18160 | 1 | 1 | pathogenesis-related protein PR1 [Brassica napus]
42 | 208252 | 1 | 1 | uncharacterized protein [Arabidopsis thaliana]
40 | 79284 | 1 | 1 | leucine-rich repeat transmembrane protein kinase-like protein [Arabidopsis thaliana]
40 | 8301 | 1 | 1 | Uncharacterized protein isoform 2, partial [Theobroma cacao]
40 | 87055 | 1 | 1 | putative paramyosin [Oryza sativa Japonica Group]
39 | 80760 | 1 | 1 | RSH-like protein [Capsicum annuum]

*Ion cut-off score (p<0.05): 39

Mascot Results Link:

http://136.159.167.24/mascot/cgi/master_results_2.pl?file=..%2Fdata%2F20131011%2FF005741.dat


Table A2.5. Mascot search results in the NCBI Drosophila (fruit flies) database for characterization of the Nepenthes pitcher fluid proteome. Samples were deglycosylated, digested with trypsin overnight (16 hours) and processed as described in Sections 2.2.3 and 2.2.4.

Mascot Score | Mass (Da) | Num. of significant matches | Num. of significant sequences | Description
108 | 173464 | 6 | 1 | SD13096p [Drosophila melanogaster]
95 | 537427 | 11 | 1 | TPA: dynein heavy chain [Drosophila pseudoobscura]
68 | 250569 | 4 | 1 | rutabaga adenylyl cyclase [Drosophila melanogaster]
64 | 38130 | 9 | 1 | GI22635 [Drosophila mojavensis]
61 | 45600 | 3 | 1 | GH17045 [Drosophila grimshawi]
55 | 94610 | 5 | 1 | GH25143 [Drosophila grimshawi]
54 | 103337 | 2 | 1 | GJ21992 [Drosophila virilis]
46 | 311645 | 2 | 1 | GF10854 [Drosophila ananassae]
45 | 60789 | 3 | 1 | GL26762 [Drosophila persimilis]
44 | 19631 | 3 | 1 | CG14983 [Drosophila melanogaster]
41 | 48785 | 3 | 1 | CG7564, isoform A [Drosophila melanogaster]
39 | 74907 | 1 | 1 | GL27110 [Drosophila persimilis]
38 | 182658 | 1 | 1 | GH20490 [Drosophila grimshawi]
38 | 160158 | 1 | 1 | GL12817 [Drosophila persimilis]
36 | 122726 | 1 | 1 | GD16184 [Drosophila simulans]
36 | 206382 | 2 | 1 | lethal (1) G0196, isoform K [Drosophila melanogaster]
36 | 58395 | 1 | 1 | GK13679 [Drosophila willistoni]
35 | 36337 | 1 | 1 | GF20119 [Drosophila ananassae]
34 | 33870 | 1 | 1 | GF10808 [Drosophila ananassae]
34 | 45452 | 1 | 1 | GJ14372 [Drosophila virilis]
34 | 26132 | 1 | 1 | GK10890 [Drosophila willistoni]
34 | 47414 | 1 | 1 | GE16416 [Drosophila yakuba]
34 | 145177 | 2 | 1 | LD24110p [Drosophila melanogaster]

*Ion cut-off score (p<0.05): 32

Mascot Results Link:

http://136.159.167.24/mascot/cgi/master_results_2.pl?file=..%2Fdata%2F20131011%2FF005744.dat


Table A2.6. Mascot search results in the NCBI Bacteria (Eubacteria) database for characterization of the Nepenthes pitcher fluid proteome. Samples were deglycosylated, digested with trypsin overnight (16 hours) and processed as described in Sections 2.2.3 and 2.2.4.

Mascot Score | Mass (Da) | Num. of significant matches | Num. of significant sequences | Description
6240 | 35131 | 261 | 13 | Chain A, The Three-Dimensional Structure Of Pngase F, A Glycosylasparaginase From Flavobacterium Meningosepticum
4167 | 147022 | 224 | 25 | hypothetical protein EF1800 [Enterococcus faecalis V583]
3868 | 43014 | 115 | 11 | RecName: Full=Sialidase; AltName: Full=Neuraminidase
1729 | 71037 | 68 | 13 | beta-N-acetylglucosaminidase [Xanthomonas axonopodis]
244 | 8520 | 10 | 1 | DNA-binding protein [Bacteroides fragilis NCTC 9343]
211 | 7356 | 8 | 3 | cold shock protein C [Salmonella enterica subsp. enterica serovar Typhimurium]
135 | 24763 | 7 | 1 | hypothetical protein DehaBAV1_1232 [Dehalococcoides sp. BAV1]
89 | 56872 | 10 | 1 | transporter [Ochrobactrum intermedium]
58 | 65996 | 1 | 1 | ABC transporter ATP-binding protein/permease [Clostridium termitidis]
57 | 37711 | 1 | 1 | aldo/keto reductase [Anaeromyxobacter sp. Fw109-5]
57 | 81863 | 3 | 1 | L-serine dehydratase [Bifidobacterium longum NCC2705]
56 | 7250 | 2 | 1 | hypothetical protein ACD_28C00351G0001 [uncultured bacterium]
56 | 27474 | 1 | 1 | hypothetical protein [Acinetobacter sp. NIPH 899]
52 | 38634 | 1 | 1 | NADH-dependent dehydrogenase [Oceanobacillus iheyensis HTE831]

*Ion cut-off score (p<0.05): 50

Mascot Results Link:

http://136.159.167.24/mascot/cgi/master_results_2.pl?file=..%2Fdata%2F20131011%2FF005746.dat


Table A2.7. Mascot search results in the NCBI Viridiplantae (green plants) database for characterization of the Nepenthes pitcher fluid proteome. Samples were digested with Nepenthes pitcher fluid overnight (16 hours) and processed as described in Sections 2.2.3 and 2.2.4.

Mascot Score | Mass (Da) | Num. of significant matches | Num. of significant sequences | Description
2762 | 46474 | 141 | 39 | aspartic proteinase nepenthesin 2 []
2077 | 46829 | 124 | 38 | RecName: Full=Aspartic proteinase nepenthesin-2; AltName: Full=Nepenthesin-II; Flags: Precursor
1082 | 47012 | 56 | 16 | RecName: Full=Aspartic proteinase nepenthesin-1; AltName: Full=Nepenthesin-I; Flags: Precursor
68 | 38020 | 2 | 1 | hypothetical protein OsI_30076 [Oryza sativa Indica Group]
67 | 110588 | 1 | 1 | ARM repeat superfamily protein isoform 2 [Theobroma cacao]
61 | 27739 | 1 | 1 | PREDICTED: uncharacterized protein At2g37660, chloroplastic-like [Cicer arietinum]
61 | 70105 | 1 | 1 | PREDICTED: probable inactive purple acid phosphatase 24-like [Cicer arietinum]
61 | 40813 | 1 | 1 | conserved hypothetical protein [Ricinus communis]

*Ion cut-off score (p<0.05): 60

Mascot Results Link:

http://136.159.167.24/mascot/cgi/master_results_2.pl?file=..%2Fdata%2F20131106%2FF005947.dat


Table A2.8. Mascot search results in the NCBI Drosophila (fruit flies) database for characterization of the Nepenthes pitcher fluid proteome. Samples were digested with Nepenthes pitcher fluid overnight (16 hours) and processed as described in Sections 2.2.3 and 2.2.4.

Mascot Score | Mass (Da) | Num. of significant matches | Num. of significant sequences | Description
190 | 122730 | 12 | 1 | GI23221 [Drosophila mojavensis]
70 | 66928 | 2 | 1 | GH11117 [Drosophila grimshawi]
61 | 137876 | 2 | 1 | GF17780 [Drosophila ananassae]
58 | 103338 | 1 | 1 | GI24591 [Drosophila mojavensis]
57 | 80551 | 1 | 1 | GK18040 [Drosophila willistoni]
56 | 28971 | 4 | 1 | GA17670 [Drosophila pseudoobscura pseudoobscura]
54 | 54664 | 1 | 1 | GF20974 [Drosophila ananassae]

*Ion cut-off score (p<0.05): 53

Mascot Results Link:

http://136.159.167.24/mascot/cgi/master_results_2.pl?file=..%2Fdata%2F20131106%2FF005919.dat


Table A2.9. Mascot search results in the NCBI Viridiplantae (green plants) database for characterization of the Nepenthes pitcher fluid proteome. Samples were digested with recombinant nepenthesin I for 3 hours and processed as described in Sections 2.2.3 and 2.2.4.

Mascot Score | Mass (Da) | Num. of significant matches | Num. of significant sequences | Description
985 | 46474 | 37 | 10 | aspartic proteinase nepenthesin 2 [Nepenthes mirabilis]
784 | 46829 | 33 | 9 | RecName: Full=Aspartic proteinase nepenthesin-2; AltName: Full=Nepenthesin-II; Flags: Precursor
520 | 47012 | 30 | 10 | RecName: Full=Aspartic proteinase nepenthesin-1; AltName: Full=Nepenthesin-I; Flags: Precursor
79 | 108232 | 2 | 1 | hypothetical protein ARALYDRAFT_492440 [Arabidopsis lyrata subsp. lyrata]

*Ion cut-off score (p<0.05): 59

Mascot Results Link:

http://136.159.167.24/mascot/cgi/master_results_2.pl?file=..%2Fdata%2F20131106%2FF005948.dat


Table A2.10. Mascot search results in the NCBI Drosophila (fruit flies) database for characterization of the Nepenthes pitcher fluid proteome. Samples were digested with recombinant nepenthesin I for 3 hours and processed as described in Sections 2.2.3 and 2.2.4.

Mascot Score | Mass (Da) | Num. of significant matches | Num. of significant sequences | Description
93 | 122730 | 7 | 1 | GI23221 [Drosophila mojavensis]
56 | 102071 | 1 | 1 | GH20054 [Drosophila grimshawi]
52 | 30208 | 1 | 1 | GF19521 [Drosophila ananassae]

*Ion cut-off score (p<0.05): 52

Mascot Results Link:

http://136.159.167.24/mascot/cgi/master_results_2.pl?file=..%2Fdata%2F20131106%2FF005926.dat


Table A2.11. Mascot search results in the NCBI Viridiplantae (green plants) database for characterization of the Nepenthes pitcher fluid proteome. Samples were digested with Nepenthes pitcher fluid for 1 hour and processed as described in Sections 2.2.3 and 2.2.4.

Mascot Score | Mass (Da) | Num. of significant matches | Num. of significant sequences | Description
878 | 46474 | 37 | 10 | aspartic proteinase nepenthesin 2 [Nepenthes mirabilis]
300 | 47012 | 23 | 9 | RecName: Full=Aspartic proteinase nepenthesin-1; AltName: Full=Nepenthesin-I; Flags: Precursor
239 | 46972 | 19 | 8 | aspartic proteinase nepenthesin I [Nepenthes alata]
113 | 147279 | 5 | 1 | unnamed protein product [Vitis vinifera]
64 | 29717 | 1 | 1 | AP2 domain class transcription factor [Malus domestica]

*Ion cut-off score (p<0.05): 60

Mascot Results Link:

http://136.159.167.24/mascot/cgi/master_results_2.pl?file=..%2Fdata%2F20131106%2FF005949.dat


Table A2.12. Mascot search results in the NCBI Drosophila (fruit flies) database for characterization of the Nepenthes pitcher fluid proteome. Samples were digested with Nepenthes pitcher fluid for 1 hour and processed as described in Sections 2.2.3 and 2.2.4.

Mascot Score | Mass (Da) | Num. of significant matches | Num. of significant sequences | Description
81 | 122730 | 4 | 1 | GI23221 [Drosophila mojavensis]

*Ion cut-off score (p<0.05): 53

Mascot Results Link:

http://136.159.167.24/mascot/cgi/master_results_2.pl?file=..%2Fdata%2F20131106%2FF005934.dat


Appendix 3.1. Vector Map for Plasmid e3884


Appendix 3.2. Vector Map for Plasmid cv16.

Nepenthesin I or II was cloned into the plasmid using the BamHI and HindIII sites at the 5′ and 3′ ends, respectively.


Table A4.1. Miniaturized Mascot database containing all gliadin and glutenin protein sequences used for identification (Sections 4.2.6 to 4.2.8).

Uniprot ID | Mass (Da) | Length (a.a.) | Description
P02863 | 32963 | 286 | Alpha/beta-gliadin OS=Triticum aestivum PE=2 SV=2
P04722 | 33661 | 291 | Alpha/beta-gliadin A-II OS=Triticum aestivum PE=2 SV=1
P04723 | 32236 | 282 | Alpha/beta-gliadin A-III OS=Triticum aestivum PE=2 SV=1
P04724 | 34239 | 297 | Alpha/beta-gliadin A-IV OS=Triticum aestivum PE=2 SV=1
P04725 | 36666 | 319 | Alpha/beta-gliadin A-V OS=Triticum aestivum PE=2 SV=1
P04726 | 33941 | 296 | Alpha/beta-gliadin clone PW1215 OS=Triticum aestivum PE=3 SV=1
P04727 | 36118 | 313 | Alpha/beta-gliadin clone PW8142 OS=Triticum aestivum PE=3 SV=1
P18573 | 35397 | 307 | Alpha/beta-gliadin MM1 OS=Triticum aestivum PE=1 SV=1
P08453 | 37122 | 327 | Gamma-gliadin OS=Triticum aestivum PE=3 SV=1
P06659 | 32967 | 291 | Gamma-gliadin B OS=Triticum aestivum PE=3 SV=1
P08079 | 29054 | 251 | Gamma-gliadin (Fragment) OS=Triticum aestivum PE=2 SV=1
P21292 | 34300 | 302 | Gamma-gliadin OS=Triticum aestivum PE=3 SV=1
P10386 | 34928 | 307 | Glutenin, low molecular weight subunit 1D1 OS=Triticum aestivum PE=2 SV=1
P08489 | 89173 | 838 | Glutenin, high molecular weight subunit PW212 OS=Triticum aestivum PE=3 SV=1
P08488 | 70867 | 660 | Glutenin, high molecular weight subunit 12 OS=Triticum aestivum PE=3 SV=1


Table A4.2. Mascot search results for digestions of gliadin extracts with Nepenthes pitcher fluid for 30 minutes (Sections 4.2.6 and 4.2.7).

Mascot Score | Mass (Da) | Num. of significant matches | Num. of significant sequences* | Description
1857 | 37099 | 71 | 64 | Gamma-gliadin OS=Triticum aestivum PE=3 SV=1
1483 | 32946 | 66 | 58 | Gamma-gliadin B OS=Triticum aestivum PE=3 SV=1
1097 | 34278 | 49 | 43 | Gamma-gliadin OS=Triticum aestivum PE=3 SV=1
639 | 29035 | 30 | 28 | Gamma-gliadin (Fragment) OS=Triticum aestivum PE=2 SV=1
1770 | 34906 | 79 | 73 | Glutenin, low molecular weight subunit 1D1 OS=Triticum aestivum PE=2 SV=1
1451 | 36643 | 56 | 46 | Alpha/beta-gliadin A-V OS=Triticum aestivum PE=2 SV=1
1314 | 35375 | 55 | 46 | Alpha/beta-gliadin MM1 OS=Triticum aestivum PE=1 SV=1
1289 | 34217 | 51 | 44 | Alpha/beta-gliadin A-IV OS=Triticum aestivum PE=2 SV=1
1284 | 36095 | 53 | 44 | Alpha/beta-gliadin clone PW8142 OS=Triticum aestivum PE=3 SV=1
1067 | 32943 | 52 | 45 | Alpha/beta-gliadin OS=Triticum aestivum PE=2 SV=2
915 | 32216 | 39 | 33 | Alpha/beta-gliadin A-III OS=Triticum aestivum PE=2 SV=1
914 | 33640 | 41 | 38 | Alpha/beta-gliadin A-II OS=Triticum aestivum PE=2 SV=1
491 | 89120 | 21 | 20 | Glutenin, high molecular weight subunit PW212 OS=Triticum aestivum PE=3 SV=1
396 | 70824 | 17 | 17 | Glutenin, high molecular weight subunit 12 OS=Triticum aestivum PE=3 SV=1

*Ion cut-off score (p<0.05): 12

Mascot Results Link:

http://136.159.167.24/mascot/cgi/master_results_2.pl?file=..%2Fdata%2F20140407%2FF007482.dat


Table A4.3. Mascot search results for digestions of gliadin extracts with Nepenthes pitcher fluid for 1 hour and 40 minutes (Sections 4.2.6 and 4.2.7).

Mascot Score | Mass (Da) | Num. of significant matches | Num. of significant sequences* | Description
2055 | 37099 | 95 | 82 | Gamma-gliadin OS=Triticum aestivum PE=3 SV=1
1882 | 36643 | 83 | 73 | Alpha/beta-gliadin A-V OS=Triticum aestivum PE=2 SV=1
1858 | 34906 | 110 | 98 | Glutenin, low molecular weight subunit 1D1 OS=Triticum aestivum PE=2 SV=1
1777 | 35375 | 82 | 72 | Alpha/beta-gliadin MM1 OS=Triticum aestivum PE=1 SV=1
1688 | 33640 | 75 | 67 | Alpha/beta-gliadin A-II OS=Triticum aestivum PE=2 SV=1
1625 | 36095 | 76 | 69 | Alpha/beta-gliadin clone PW8142 OS=Triticum aestivum PE=3 SV=1
1521 | 34217 | 70 | 62 | Alpha/beta-gliadin A-IV OS=Triticum aestivum PE=2 SV=1
1506 | 32943 | 76 | 68 | Alpha/beta-gliadin OS=Triticum aestivum PE=2 SV=2
1228 | 34278 | 67 | 59 | Gamma-gliadin OS=Triticum aestivum PE=3 SV=1
1184 | 32946 | 74 | 65 | Gamma-gliadin B OS=Triticum aestivum PE=3 SV=1
1085 | 32216 | 52 | 46 | Alpha/beta-gliadin A-III OS=Triticum aestivum PE=2 SV=1
998 | 29035 | 55 | 49 | Gamma-gliadin (Fragment) OS=Triticum aestivum PE=2 SV=1
620 | 70824 | 28 | 26 | Glutenin, high molecular weight subunit 12 OS=Triticum aestivum PE=3 SV=1
610 | 89120 | 28 | 28 | Glutenin, high molecular weight subunit PW212 OS=Triticum aestivum PE=3 SV=1

*Ion cut-off score (p<0.05): 13

Mascot Results Link:

http://136.159.167.24/mascot/cgi/master_results_2.pl?file=..%2Fdata%2F20140407%2FF007479.dat


Table A4.4. Mascot search results for digestions of gliadin extracts with Nepenthes pitcher fluid for 5 hours and 16 minutes (Sections 4.2.6 and 4.2.7).

Mascot Score | Mass (Da) | Num. of significant matches | Num. of significant sequences* | Description
2447 | 36643 | 116 | 98 | Alpha/beta-gliadin A-V OS=Triticum aestivum PE=2 SV=1
2122 | 37099 | 111 | 92 | Gamma-gliadin OS=Triticum aestivum PE=3 SV=1
2096 | 35375 | 104 | 92 | Alpha/beta-gliadin MM1 OS=Triticum aestivum PE=1 SV=1
2058 | 33920 | 92 | 78 | Alpha/beta-gliadin clone PW1215 OS=Triticum aestivum PE=3 SV=1
1936 | 33640 | 93 | 81 | Alpha/beta-gliadin A-II OS=Triticum aestivum PE=2 SV=1
1827 | 32943 | 93 | 83 | Alpha/beta-gliadin OS=Triticum aestivum PE=2 SV=2
1809 | 34217 | 82 | 70 | Alpha/beta-gliadin A-IV OS=Triticum aestivum PE=2 SV=1
1773 | 36095 | 94 | 81 | Alpha/beta-gliadin clone PW8142 OS=Triticum aestivum PE=3 SV=1
1647 | 34278 | 85 | 79 | Gamma-gliadin OS=Triticum aestivum PE=3 SV=1
1547 | 34906 | 102 | 93 | Glutenin, low molecular weight subunit 1D1 OS=Triticum aestivum PE=2 SV=1
1541 | 32946 | 99 | 83 | Gamma-gliadin B OS=Triticum aestivum PE=3 SV=1
1529 | 29035 | 88 | 78 | Gamma-gliadin (Fragment) OS=Triticum aestivum PE=2 SV=1
1087 | 32216 | 59 | 54 | Alpha/beta-gliadin A-III OS=Triticum aestivum PE=2 SV=1
696 | 89120 | 43 | 41 | Glutenin, high molecular weight subunit PW212 OS=Triticum aestivum PE=3 SV=1
591 | 70824 | 34 | 32 | Glutenin, high molecular weight subunit 12 OS=Triticum aestivum PE=3 SV=1

*Ion cut-off score (p<0.05): 13

Mascot Results Link:

http://136.159.167.24/mascot/cgi/master_results_2.pl?file=..%2Fdata%2F20140407%2FF007480.dat


Table A4.5. Mascot search results for digestions of gliadin extracts with Nepenthes pitcher fluid for 16 hours and 40 minutes (Sections 4.2.6 and 4.2.7).

Mascot Score | Mass (Da) | Num. of significant matches | Num. of significant sequences* | Description
2973 | 36643 | 142 | 115 | Alpha/beta-gliadin A-V OS=Triticum aestivum PE=2 SV=1
2557 | 35375 | 137 | 111 | Alpha/beta-gliadin MM1 OS=Triticum aestivum PE=1 SV=1
2556 | 33920 | 121 | 97 | Alpha/beta-gliadin clone PW1215 OS=Triticum aestivum PE=3 SV=1
2396 | 34217 | 129 | 104 | Alpha/beta-gliadin A-IV OS=Triticum aestivum PE=2 SV=1
2289 | 36095 | 117 | 99 | Alpha/beta-gliadin clone PW8142 OS=Triticum aestivum PE=3 SV=1
2161 | 33640 | 108 | 89 | Alpha/beta-gliadin A-II OS=Triticum aestivum PE=2 SV=1
1981 | 34278 | 98 | 87 | Gamma-gliadin OS=Triticum aestivum PE=3 SV=1
1887 | 37099 | 109 | 89 | Gamma-gliadin OS=Triticum aestivum PE=3 SV=1
1879 | 32943 | 99 | 84 | Alpha/beta-gliadin OS=Triticum aestivum PE=2 SV=2
1737 | 34906 | 127 | 111 | Glutenin, low molecular weight subunit 1D1 OS=Triticum aestivum PE=2 SV=1
1634 | 32946 | 109 | 93 | Gamma-gliadin B OS=Triticum aestivum PE=3 SV=1
1633 | 29035 | 104 | 93 | Gamma-gliadin (Fragment) OS=Triticum aestivum PE=2 SV=1
1228 | 32216 | 76 | 67 | Alpha/beta-gliadin A-III OS=Triticum aestivum PE=2 SV=1
1040 | 89120 | 55 | 51 | Glutenin, high molecular weight subunit PW212 OS=Triticum aestivum PE=3 SV=1
547 | 70824 | 41 | 39 | Glutenin, high molecular weight subunit 12 OS=Triticum aestivum PE=3 SV=1

*Ion cut-off score (p<0.05): 13

Mascot Results Link:

http://136.159.167.24/mascot/cgi/master_results_2.pl?file=..%2Fdata%2F20140407%2FF007481.dat


Table A4.6. Mascot search results for digestions of gliadin extracts with Nepenthes pitcher fluid for 52 hours and 40 minutes (Sections 4.2.6 and 4.2.7).

Mascot Score | Mass (Da) | Num. of significant matches | Num. of significant sequences* | Description
1935 | 33640 | 100 | 84 | Alpha/beta-gliadin A-II OS=Triticum aestivum PE=2 SV=1
1917 | 36643 | 103 | 84 | Alpha/beta-gliadin A-V OS=Triticum aestivum PE=2 SV=1
1892 | 34217 | 104 | 85 | Alpha/beta-gliadin A-IV OS=Triticum aestivum PE=2 SV=1
1826 | 33920 | 98 | 81 | Alpha/beta-gliadin clone PW1215 OS=Triticum aestivum PE=3 SV=1
1809 | 36095 | 99 | 80 | Alpha/beta-gliadin clone PW8142 OS=Triticum aestivum PE=3 SV=1
1807 | 35375 | 104 | 85 | Alpha/beta-gliadin MM1 OS=Triticum aestivum PE=1 SV=1
1765 | 37099 | 107 | 92 | Gamma-gliadin OS=Triticum aestivum PE=3 SV=1
1528 | 32943 | 86 | 75 | Alpha/beta-gliadin OS=Triticum aestivum PE=2 SV=2
1294 | 29035 | 93 | 80 | Gamma-gliadin (Fragment) OS=Triticum aestivum PE=2 SV=1
1273 | 32946 | 99 | 82 | Gamma-gliadin B OS=Triticum aestivum PE=3 SV=1
1144 | 34278 | 77 | 66 | Gamma-gliadin OS=Triticum aestivum PE=3 SV=1
1053 | 34906 | 86 | 80 | Glutenin, low molecular weight subunit 1D1 OS=Triticum aestivum PE=2 SV=1
849 | 89120 | 51 | 48 | Glutenin, high molecular weight subunit PW212 OS=Triticum aestivum PE=3 SV=1
794 | 32216 | 58 | 51 | Alpha/beta-gliadin A-III OS=Triticum aestivum PE=2 SV=1
715 | 70824 | 44 | 43 | Glutenin, high molecular weight subunit 12 OS=Triticum aestivum PE=3 SV=1

*Ion cut-off score (p<0.05): 13

Mascot Results Link:

http://136.159.167.24/mascot/cgi/master_results_2.pl?file=..%2Fdata%2F20140407%2FF007483.dat


Table A4.7. Mascot search results for digestions of gliadin extracts with pepsin for 30 minutes (Sections 4.2.6 and 4.2.7).

Mascot Score | Mass (Da) | Num. of significant matches | Num. of significant sequences* | Description
894 | 37099 | 37 | 31 | Gamma-gliadin OS=Triticum aestivum PE=3 SV=1
756 | 32946 | 30 | 27 | Gamma-gliadin B OS=Triticum aestivum PE=3 SV=1
625 | 36095 | 23 | 18 | Alpha/beta-gliadin clone PW8142 OS=Triticum aestivum PE=3 SV=1
586 | 36643 | 21 | 17 | Alpha/beta-gliadin A-V OS=Triticum aestivum PE=2 SV=1
535 | 35375 | 20 | 19 | Alpha/beta-gliadin MM1 OS=Triticum aestivum PE=1 SV=1
524 | 33640 | 20 | 17 | Alpha/beta-gliadin A-II OS=Triticum aestivum PE=2 SV=1
518 | 34906 | 23 | 23 | Glutenin, low molecular weight subunit 1D1 OS=Triticum aestivum PE=2 SV=1
488 | 32943 | 18 | 16 | Alpha/beta-gliadin OS=Triticum aestivum PE=2 SV=2
431 | 34217 | 18 | 16 | Alpha/beta-gliadin A-IV OS=Triticum aestivum PE=2 SV=1
421 | 34278 | 18 | 17 | Gamma-gliadin OS=Triticum aestivum PE=3 SV=1
346 | 29035 | 11 | 9 | Gamma-gliadin (Fragment) OS=Triticum aestivum PE=2 SV=1
57 | 28241 | 4 | 3 | [Homo sapiens]
427 | 89120 | 17 | 16 | Glutenin, high molecular weight subunit PW212 OS=Triticum aestivum PE=3 SV=1
36 | 70824 | 3 | 3 | Glutenin, high molecular weight subunit 12 OS=Triticum aestivum PE=3 SV=1

*Ion cut-off score (p<0.05): 12

Mascot Results Link:

http://136.159.167.24/mascot/cgi/master_results_2.pl?file=..%2Fdata%2F20140407%2FF007487.dat


Table A4.8. Mascot search results for digestions of gliadin extracts with pepsin for 1 hour and 40 minutes (Sections 4.2.6 and 4.2.7).

Mascot Score | Mass (Da) | Num. of significant matches | Num. of significant sequences* | Description
1224 | 36095 | 44 | 42 | Alpha/beta-gliadin clone PW8142 OS=Triticum aestivum PE=3 SV=1
1216 | 36643 | 40 | 38 | Alpha/beta-gliadin A-V OS=Triticum aestivum PE=2 SV=1
1084 | 33640 | 39 | 37 | Alpha/beta-gliadin A-II OS=Triticum aestivum PE=2 SV=1
1003 | 34217 | 34 | 32 | Alpha/beta-gliadin A-IV OS=Triticum aestivum PE=2 SV=1
945 | 32943 | 35 | 35 | Alpha/beta-gliadin OS=Triticum aestivum PE=2 SV=2
939 | 35375 | 37 | 35 | Alpha/beta-gliadin MM1 OS=Triticum aestivum PE=1 SV=1
894 | 34906 | 38 | 35 | Glutenin, low molecular weight subunit 1D1 OS=Triticum aestivum PE=2 SV=1
857 | 33920 | 29 | 29 | Alpha/beta-gliadin clone PW1215 OS=Triticum aestivum PE=3 SV=1
289 | 32216 | 18 | 17 | Alpha/beta-gliadin A-III OS=Triticum aestivum PE=2 SV=1
887 | 37099 | 40 | 35 | Gamma-gliadin OS=Triticum aestivum PE=3 SV=1
831 | 32946 | 38 | 34 | Gamma-gliadin B OS=Triticum aestivum PE=3 SV=1
613 | 34278 | 27 | 23 | Gamma-gliadin OS=Triticum aestivum PE=3 SV=1
414 | 29035 | 14 | 11 | Gamma-gliadin (Fragment) OS=Triticum aestivum PE=2 SV=1
613 | 89120 | 21 | 18 | Glutenin, high molecular weight subunit PW212 OS=Triticum aestivum PE=3 SV=1
82 | 70824 | 6 | 6 | Glutenin, high molecular weight subunit 12 OS=Triticum aestivum PE=3 SV=1

*Ion cut-off score (p<0.05): 12

Mascot Results Link:

http://136.159.167.24/mascot/cgi/master_results_2.pl?file=..%2Fdata%2F20140407%2FF007484.dat


Table A4.9. Mascot search results for digestions of gliadin extracts with pepsin for 5 hours and 16 minutes (Sections 4.2.6 and 4.2.7).

| Mascot Score | Mass (Da) | Num. of significant matches | Num. of significant sequences* | Description |
|---|---|---|---|---|
| 1483 | 36643 | 61 | 52 | Alpha/beta-gliadin A-V OS=Triticum aestivum PE=2 SV=1 |
| 1468 | 36095 | 59 | 50 | Alpha/beta-gliadin clone PW8142 OS=Triticum aestivum PE=3 SV=1 |
| 1227 | 35375 | 52 | 45 | Alpha/beta-gliadin MM1 OS=Triticum aestivum PE=1 SV=1 |
| 1166 | 34906 | 51 | 45 | Glutenin, low molecular weight subunit 1D1 OS=Triticum aestivum PE=2 SV=1 |
| 1157 | 32943 | 46 | 40 | Alpha/beta-gliadin OS=Triticum aestivum PE=2 SV=2 |
| 1114 | 34217 | 42 | 36 | Alpha/beta-gliadin A-IV OS=Triticum aestivum PE=2 SV=1 |
| 941 | 33640 | 40 | 36 | Alpha/beta-gliadin A-II OS=Triticum aestivum PE=2 SV=1 |
| 820 | 33920 | 34 | 31 | Alpha/beta-gliadin clone PW1215 OS=Triticum aestivum PE=3 SV=1 |
| 443 | 32216 | 29 | 24 | Alpha/beta-gliadin A-III OS=Triticum aestivum PE=2 SV=1 |
| 920 | 37099 | 45 | 41 | Gamma-gliadin OS=Triticum aestivum PE=3 SV=1 |
| 803 | 32946 | 39 | 34 | Gamma-gliadin B OS=Triticum aestivum PE=3 SV=1 |
| 749 | 34278 | 36 | 32 | Gamma-gliadin OS=Triticum aestivum PE=3 SV=1 |
| 617 | 29035 | 24 | 21 | Gamma-gliadin (Fragment) OS=Triticum aestivum PE=2 SV=1 |
| 530 | 89120 | 20 | 17 | Glutenin, high molecular weight subunit PW212 OS=Triticum aestivum PE=3 SV=1 |
| 203 | 70824 | 11 | 11 | Glutenin, high molecular weight subunit 12 OS=Triticum aestivum PE=3 SV=1 |
| 17 | 36554 | 1 | 1 | glucanase [Nepenthes khasiana] |

*Ion cut-off score (p<0.05): 12

Mascot Results Link:

http://136.159.167.24/mascot/cgi/master_results_2.pl?file=..%2Fdata%2F20140407%2FF007485.dat


Table A4.10. Mascot search results for digestions of gliadin extracts with pepsin for 16 hours and 40 minutes (Sections 4.2.6 and 4.2.7).

| Mascot Score | Mass (Da) | Num. of significant matches | Num. of significant sequences* | Description |
|---|---|---|---|---|
| 1694 | 34906 | 78 | 69 | Glutenin, low molecular weight subunit 1D1 OS=Triticum aestivum PE=2 SV=1 |
| 1432 | 36643 | 63 | 54 | Alpha/beta-gliadin A-V OS=Triticum aestivum PE=2 SV=1 |
| 1393 | 36095 | 64 | 55 | Alpha/beta-gliadin clone PW8142 OS=Triticum aestivum PE=3 SV=1 |
| 1255 | 35375 | 55 | 50 | Alpha/beta-gliadin MM1 OS=Triticum aestivum PE=1 SV=1 |
| 1164 | 32943 | 55 | 49 | Alpha/beta-gliadin OS=Triticum aestivum PE=2 SV=2 |
| 1145 | 34217 | 52 | 48 | Alpha/beta-gliadin A-IV OS=Triticum aestivum PE=2 SV=1 |
| 1016 | 37099 | 52 | 50 | Gamma-gliadin OS=Triticum aestivum PE=3 SV=1 |
| 1003 | 33640 | 43 | 39 | Alpha/beta-gliadin A-II OS=Triticum aestivum PE=2 SV=1 |
| 962 | 33920 | 45 | 43 | Alpha/beta-gliadin clone PW1215 OS=Triticum aestivum PE=3 SV=1 |
| 811 | 32946 | 44 | 43 | Gamma-gliadin B OS=Triticum aestivum PE=3 SV=1 |
| 725 | 34278 | 43 | 37 | Gamma-gliadin OS=Triticum aestivum PE=3 SV=1 |
| 597 | 32216 | 35 | 31 | Alpha/beta-gliadin A-III OS=Triticum aestivum PE=2 SV=1 |
| 470 | 89120 | 25 | 23 | Glutenin, high molecular weight subunit PW212 OS=Triticum aestivum PE=3 SV=1 |
| 415 | 29035 | 26 | 24 | Gamma-gliadin (Fragment) OS=Triticum aestivum PE=2 SV=1 |
| 272 | 70824 | 17 | 16 | Glutenin, high molecular weight subunit 12 OS=Triticum aestivum PE=3 SV=1 |

*Ion cut-off score (p<0.05): 12

Mascot Results Link:

http://136.159.167.24/mascot/cgi/master_results_2.pl?file=..%2Fdata%2F20140407%2FF007486.dat


Table A4.11. Mascot search results for digestions of gliadin extracts with pepsin for 52 hours and 40 minutes (Sections 4.2.6 and 4.2.7).

| Mascot Score | Mass (Da) | Num. of significant matches | Num. of significant sequences* | Description |
|---|---|---|---|---|
| 1653 | 36643 | 68 | 59 | Alpha/beta-gliadin A-V OS=Triticum aestivum PE=2 SV=1 |
| 1639 | 34906 | 75 | 66 | Glutenin, low molecular weight subunit 1D1 OS=Triticum aestivum PE=2 SV=1 |
| 1428 | 32943 | 59 | 53 | Alpha/beta-gliadin OS=Triticum aestivum PE=2 SV=2 |
| 1360 | 36095 | 59 | 51 | Alpha/beta-gliadin clone PW8142 OS=Triticum aestivum PE=3 SV=1 |
| 1278 | 35375 | 68 | 60 | Alpha/beta-gliadin MM1 OS=Triticum aestivum PE=1 SV=1 |
| 1176 | 34217 | 61 | 53 | Alpha/beta-gliadin A-IV OS=Triticum aestivum PE=2 SV=1 |
| 1132 | 33640 | 55 | 49 | Alpha/beta-gliadin A-II OS=Triticum aestivum PE=2 SV=1 |
| 1017 | 33920 | 46 | 41 | Alpha/beta-gliadin clone PW1215 OS=Triticum aestivum PE=3 SV=1 |
| 915 | 37099 | 46 | 42 | Gamma-gliadin OS=Triticum aestivum PE=3 SV=1 |
| 718 | 32946 | 41 | 38 | Gamma-gliadin B OS=Triticum aestivum PE=3 SV=1 |
| 599 | 34278 | 35 | 31 | Gamma-gliadin OS=Triticum aestivum PE=3 SV=1 |
| 546 | 32216 | 34 | 31 | Alpha/beta-gliadin A-III OS=Triticum aestivum PE=2 SV=1 |
| 354 | 29035 | 24 | 22 | Gamma-gliadin (Fragment) OS=Triticum aestivum PE=2 SV=1 |
| 41 | 21241 | 2 | 2 | |
| 750 | 89120 | 31 | 27 | Glutenin, high molecular weight subunit PW212 OS=Triticum aestivum PE=3 SV=1 |
| 607 | 70824 | 26 | 24 | Glutenin, high molecular weight subunit 12 OS=Triticum aestivum PE=3 SV=1 |
| 35 | 36554 | 3 | 3 | glucanase [Nepenthes khasiana] |

*Ion cut-off score (p<0.05): 12

Mascot Results Link:

http://136.159.167.24/mascot/cgi/master_results_2.pl?file=..%2Fdata%2F20140407%2FF007488.dat


Table A4.12. Mascot search results for digestions of gliadin extracts with recombinant nepenthesin I for 30 minutes (Sections 4.2.6 and 4.2.7).

| Mascot Score | Mass (Da) | Num. of significant matches | Num. of significant sequences* | Description |
|---|---|---|---|---|
| 651 | 37099 | 20 | 17 | Gamma-gliadin OS=Triticum aestivum PE=3 SV=1 |
| 442 | 32946 | 15 | 11 | Gamma-gliadin B OS=Triticum aestivum PE=3 SV=1 |
| 200 | 29035 | 5 | 3 | Gamma-gliadin (Fragment) OS=Triticum aestivum PE=2 SV=1 |
| 170 | 34278 | 6 | 6 | Gamma-gliadin OS=Triticum aestivum PE=3 SV=1 |
| 563 | 34906 | 25 | 20 | Glutenin, low molecular weight subunit 1D1 OS=Triticum aestivum PE=2 SV=1 |
| 424 | 34217 | 18 | 16 | Alpha/beta-gliadin A-IV OS=Triticum aestivum PE=2 SV=1 |
| 390 | 32943 | 20 | 18 | Alpha/beta-gliadin OS=Triticum aestivum PE=2 SV=2 |
| 384 | 35375 | 18 | 16 | Alpha/beta-gliadin MM1 OS=Triticum aestivum PE=1 SV=1 |
| 337 | 33640 | 15 | 13 | Alpha/beta-gliadin A-II OS=Triticum aestivum PE=2 SV=1 |
| 250 | 32216 | 9 | 8 | Alpha/beta-gliadin A-III OS=Triticum aestivum PE=2 SV=1 |
| 203 | 36643 | 11 | 10 | Alpha/beta-gliadin A-V OS=Triticum aestivum PE=2 SV=1 |
| 413 | 89120 | 19 | 17 | Glutenin, high molecular weight subunit PW212 OS=Triticum aestivum PE=3 SV=1 |
| 162 | 70824 | 6 | 5 | Glutenin, high molecular weight subunit 12 OS=Triticum aestivum PE=3 SV=1 |

*Ion cut-off score (p<0.05): 11

Mascot Results Link:

http://136.159.167.24/mascot/cgi/master_results_2.pl?file=..%2Fdata%2F20140407%2FF007473.dat


Table A4.13. Mascot search results for digestions of gliadin extracts with recombinant nepenthesin I for 1 hour and 40 minutes (Sections 4.2.6 and 4.2.7).

| Mascot Score | Mass (Da) | Num. of significant matches | Num. of significant sequences* | Description |
|---|---|---|---|---|
| 1275 | 36643 | 46 | 38 | Alpha/beta-gliadin A-V OS=Triticum aestivum PE=2 SV=1 |
| 1219 | 36095 | 45 | 39 | Alpha/beta-gliadin clone PW8142 OS=Triticum aestivum PE=3 SV=1 |
| 1194 | 34217 | 45 | 36 | Alpha/beta-gliadin A-IV OS=Triticum aestivum PE=2 SV=1 |
| 1115 | 32943 | 46 | 39 | Alpha/beta-gliadin OS=Triticum aestivum PE=2 SV=2 |
| 1079 | 37099 | 44 | 40 | Gamma-gliadin OS=Triticum aestivum PE=3 SV=1 |
| 1033 | 35375 | 46 | 37 | Alpha/beta-gliadin MM1 OS=Triticum aestivum PE=1 SV=1 |
| 996 | 33640 | 38 | 32 | Alpha/beta-gliadin A-II OS=Triticum aestivum PE=2 SV=1 |
| 992 | 34906 | 49 | 44 | Glutenin, low molecular weight subunit 1D1 OS=Triticum aestivum PE=2 SV=1 |
| 976 | 32946 | 36 | 31 | Gamma-gliadin B OS=Triticum aestivum PE=3 SV=1 |
| 781 | 32216 | 31 | 25 | Alpha/beta-gliadin A-III OS=Triticum aestivum PE=2 SV=1 |
| 495 | 34278 | 22 | 22 | Gamma-gliadin OS=Triticum aestivum PE=3 SV=1 |
| 388 | 29035 | 16 | 14 | Gamma-gliadin (Fragment) OS=Triticum aestivum PE=2 SV=1 |
| 543 | 89120 | 22 | 21 | Glutenin, high molecular weight subunit PW212 OS=Triticum aestivum PE=3 SV=1 |
| 531 | 70824 | 22 | 21 | Glutenin, high molecular weight subunit 12 OS=Triticum aestivum PE=3 SV=1 |
| 13 | 36554 | 1 | 1 | glucanase [Nepenthes khasiana] |

*Ion cut-off score (p<0.05): 12

Mascot Results Link:

http://136.159.167.24/mascot/cgi/master_results_2.pl?file=..%2Fdata%2F20140407%2FF007470.dat


Table A4.14. Mascot search results for digestions of gliadin extracts with recombinant nepenthesin I for 5 hours and 16 minutes (Sections 4.2.6 and 4.2.7).

| Mascot Score | Mass (Da) | Num. of significant matches | Num. of significant sequences* | Description |
|---|---|---|---|---|
| 1444 | 36643 | 56 | 49 | Alpha/beta-gliadin A-V OS=Triticum aestivum PE=2 SV=1 |
| 1436 | 36095 | 62 | 55 | Alpha/beta-gliadin clone PW8142 OS=Triticum aestivum PE=3 SV=1 |
| 1386 | 34906 | 61 | 53 | Glutenin, low molecular weight subunit 1D1 OS=Triticum aestivum PE=2 SV=1 |
| 1234 | 34217 | 53 | 43 | Alpha/beta-gliadin A-IV OS=Triticum aestivum PE=2 SV=1 |
| 1233 | 37099 | 51 | 43 | Gamma-gliadin OS=Triticum aestivum PE=3 SV=1 |
| 1214 | 33640 | 53 | 44 | Alpha/beta-gliadin A-II OS=Triticum aestivum PE=2 SV=1 |
| 1151 | 32943 | 52 | 44 | Alpha/beta-gliadin OS=Triticum aestivum PE=2 SV=2 |
| 1131 | 35375 | 55 | 46 | Alpha/beta-gliadin MM1 OS=Triticum aestivum PE=1 SV=1 |
| 884 | 32946 | 40 | 37 | Gamma-gliadin B OS=Triticum aestivum PE=3 SV=1 |
| 697 | 70824 | 31 | 28 | Glutenin, high molecular weight subunit 12 OS=Triticum aestivum PE=3 SV=1 |
| 690 | 32216 | 38 | 35 | Alpha/beta-gliadin A-III OS=Triticum aestivum PE=2 SV=1 |
| 630 | 89120 | 35 | 33 | Glutenin, high molecular weight subunit PW212 OS=Triticum aestivum PE=3 SV=1 |
| 606 | 34278 | 29 | 28 | Gamma-gliadin OS=Triticum aestivum PE=3 SV=1 |
| 474 | 29035 | 22 | 20 | Gamma-gliadin (Fragment) OS=Triticum aestivum PE=2 SV=1 |
| 27 | 36554 | 1 | 1 | glucanase [Nepenthes khasiana] |

*Ion cut-off score (p<0.05): 12

Mascot Results Link:

http://136.159.167.24/mascot/cgi/master_results_2.pl?file=..%2Fdata%2F20140407%2FF007471.dat


Table A4.15. Mascot search results for digestions of gliadin extracts with recombinant nepenthesin I for 16 hours and 40 minutes (Sections 4.2.6 and 4.2.7).

| Mascot Score | Mass (Da) | Num. of significant matches | Num. of significant sequences* | Description |
|---|---|---|---|---|
| 1835 | 36095 | 73 | 64 | Alpha/beta-gliadin clone PW8142 OS=Triticum aestivum PE=3 SV=1 |
| 1788 | 36643 | 65 | 59 | Alpha/beta-gliadin A-V OS=Triticum aestivum PE=2 SV=1 |
| 1720 | 32943 | 65 | 58 | Alpha/beta-gliadin OS=Triticum aestivum PE=2 SV=2 |
| 1574 | 34217 | 52 | 47 | Alpha/beta-gliadin A-IV OS=Triticum aestivum PE=2 SV=1 |
| 1496 | 33640 | 58 | 50 | Alpha/beta-gliadin A-II OS=Triticum aestivum PE=2 SV=1 |
| 1446 | 34906 | 74 | 65 | Glutenin, low molecular weight subunit 1D1 OS=Triticum aestivum PE=2 SV=1 |
| 1428 | 37099 | 64 | 58 | Gamma-gliadin OS=Triticum aestivum PE=3 SV=1 |
| 1363 | 33920 | 51 | 46 | Alpha/beta-gliadin clone PW1215 OS=Triticum aestivum PE=3 SV=1 |
| 1336 | 35375 | 58 | 53 | Alpha/beta-gliadin MM1 OS=Triticum aestivum PE=1 SV=1 |
| 950 | 89120 | 45 | 40 | Glutenin, high molecular weight subunit PW212 OS=Triticum aestivum PE=3 SV=1 |
| 859 | 70824 | 45 | 42 | Glutenin, high molecular weight subunit 12 OS=Triticum aestivum PE=3 SV=1 |
| 858 | 32946 | 54 | 51 | Gamma-gliadin B OS=Triticum aestivum PE=3 SV=1 |
| 707 | 34278 | 45 | 42 | Gamma-gliadin OS=Triticum aestivum PE=3 SV=1 |
| 604 | 32216 | 38 | 34 | Alpha/beta-gliadin A-III OS=Triticum aestivum PE=2 SV=1 |
| 599 | 29035 | 37 | 35 | Gamma-gliadin (Fragment) OS=Triticum aestivum PE=2 SV=1 |
| 32 | 36554 | 2 | 2 | glucanase [Nepenthes khasiana] |

*Ion cut-off score (p<0.05): 12

Mascot Results Link:

http://136.159.167.24/mascot/cgi/master_results_2.pl?file=..%2Fdata%2F20140407%2FF007472.dat


Table A4.16. Mascot search results for digestions of gliadin extracts with recombinant nepenthesin I for 52 hours and 40 minutes (Sections 4.2.6 and 4.2.7).

| Mascot Score | Mass (Da) | Num. of significant matches | Num. of significant sequences* | Description |
|---|---|---|---|---|
| 1790 | 34217 | 76 | 68 | Alpha/beta-gliadin A-IV OS=Triticum aestivum PE=2 SV=1 |
| 1765 | 33640 | 76 | 65 | Alpha/beta-gliadin A-II OS=Triticum aestivum PE=2 SV=1 |
| 1762 | 34906 | 84 | 75 | Glutenin, low molecular weight subunit 1D1 OS=Triticum aestivum PE=2 SV=1 |
| 1691 | 35375 | 79 | 70 | Alpha/beta-gliadin MM1 OS=Triticum aestivum PE=1 SV=1 |
| 1690 | 36095 | 84 | 70 | Alpha/beta-gliadin clone PW8142 OS=Triticum aestivum PE=3 SV=1 |
| 1569 | 32943 | 77 | 67 | Alpha/beta-gliadin OS=Triticum aestivum PE=2 SV=2 |
| 1564 | 37099 | 78 | 68 | Gamma-gliadin OS=Triticum aestivum PE=3 SV=1 |
| 1400 | 33920 | 65 | 57 | Alpha/beta-gliadin clone PW1215 OS=Triticum aestivum PE=3 SV=1 |
| 1396 | 36643 | 63 | 54 | Alpha/beta-gliadin A-V OS=Triticum aestivum PE=2 SV=1 |
| 955 | 34278 | 57 | 51 | Gamma-gliadin OS=Triticum aestivum PE=3 SV=1 |
| 939 | 32946 | 61 | 57 | Gamma-gliadin B OS=Triticum aestivum PE=3 SV=1 |
| 753 | 32216 | 47 | 40 | Alpha/beta-gliadin A-III OS=Triticum aestivum PE=2 SV=1 |
| 691 | 29035 | 51 | 44 | Gamma-gliadin (Fragment) OS=Triticum aestivum PE=2 SV=1 |
| 27 | 36554 | 1 | 1 | glucanase [Nepenthes khasiana] |
| 1219 | 70824 | 56 | 49 | Glutenin, high molecular weight subunit 12 OS=Triticum aestivum PE=3 SV=1 |
| 730 | 89120 | 43 | 37 | Glutenin, high molecular weight subunit PW212 OS=Triticum aestivum PE=3 SV=1 |
| 20 | 69248 | 1 | 1 | Serum albumin - Bos taurus (Bovine). |

*Ion cut-off score (p<0.05): 12

Mascot Results Link:

http://136.159.167.24/mascot/cgi/master_results_2.pl?file=..%2Fdata%2F20140407%2FF007474.dat


Table A4.17. Mascot search results for digestions of gliadin extracts with recombinant nepenthesin II for 30 minutes (Sections 4.2.6 and 4.2.7).

| Mascot Score | Mass (Da) | Num. of significant matches | Num. of significant sequences* | Description |
|---|---|---|---|---|
| 894 | 37099 | 28 | 26 | Gamma-gliadin OS=Triticum aestivum PE=3 SV=1 |
| 601 | 32946 | 22 | 19 | Gamma-gliadin B OS=Triticum aestivum PE=3 SV=1 |
| 571 | 34278 | 15 | 12 | Gamma-gliadin OS=Triticum aestivum PE=3 SV=1 |
| 105 | 29035 | 3 | 3 | Gamma-gliadin (Fragment) OS=Triticum aestivum PE=2 SV=1 |
| 705 | 34906 | 30 | 27 | Glutenin, low molecular weight subunit 1D1 OS=Triticum aestivum PE=2 SV=1 |
| 559 | 36643 | 24 | 23 | Alpha/beta-gliadin A-V OS=Triticum aestivum PE=2 SV=1 |
| 495 | 32943 | 25 | 23 | Alpha/beta-gliadin OS=Triticum aestivum PE=2 SV=2 |
| 493 | 33640 | 19 | 19 | Alpha/beta-gliadin A-II OS=Triticum aestivum PE=2 SV=1 |
| 492 | 34217 | 20 | 18 | Alpha/beta-gliadin A-IV OS=Triticum aestivum PE=2 SV=1 |
| 371 | 32216 | 17 | 17 | Alpha/beta-gliadin A-III OS=Triticum aestivum PE=2 SV=1 |
| 361 | 35375 | 17 | 16 | Alpha/beta-gliadin MM1 OS=Triticum aestivum PE=1 SV=1 |
| 451 | 89120 | 20 | 17 | Glutenin, high molecular weight subunit PW212 OS=Triticum aestivum PE=3 SV=1 |
| 69 | 70824 | 4 | 4 | Glutenin, high molecular weight subunit 12 OS=Triticum aestivum PE=3 SV=1 |

*Ion cut-off score (p<0.05): 11

Mascot Results Link:

http://136.159.167.24/mascot/cgi/master_results_2.pl?file=..%2Fdata%2F20140407%2FF007478.dat


Table A4.18. Mascot search results for digestions of gliadin extracts with recombinant nepenthesin II for 1 hour and 40 minutes (Sections 4.2.6 and 4.2.7).

| Mascot Score | Mass (Da) | Num. of significant matches | Num. of significant sequences* | Description |
|---|---|---|---|---|
| 1373 | 34906 | 58 | 52 | Glutenin, low molecular weight subunit 1D1 OS=Triticum aestivum PE=2 SV=1 |
| 994 | 37099 | 43 | 40 | Gamma-gliadin OS=Triticum aestivum PE=3 SV=1 |
| 748 | 32946 | 34 | 29 | Gamma-gliadin B OS=Triticum aestivum PE=3 SV=1 |
| 546 | 34278 | 23 | 22 | Gamma-gliadin OS=Triticum aestivum PE=3 SV=1 |
| 312 | 29035 | 14 | 12 | Gamma-gliadin (Fragment) OS=Triticum aestivum PE=2 SV=1 |
| 836 | 32943 | 42 | 36 | Alpha/beta-gliadin OS=Triticum aestivum PE=2 SV=2 |
| 784 | 36643 | 35 | 31 | Alpha/beta-gliadin A-V OS=Triticum aestivum PE=2 SV=1 |
| 779 | 36095 | 36 | 32 | Alpha/beta-gliadin clone PW8142 OS=Triticum aestivum PE=3 SV=1 |
| 698 | 34217 | 32 | 27 | Alpha/beta-gliadin A-IV OS=Triticum aestivum PE=2 SV=1 |
| 669 | 35375 | 33 | 28 | Alpha/beta-gliadin MM1 OS=Triticum aestivum PE=1 SV=1 |
| 637 | 33640 | 32 | 28 | Alpha/beta-gliadin A-II OS=Triticum aestivum PE=2 SV=1 |
| 473 | 32216 | 26 | 24 | Alpha/beta-gliadin A-III OS=Triticum aestivum PE=2 SV=1 |
| 474 | 89120 | 18 | 17 | Glutenin, high molecular weight subunit PW212 OS=Triticum aestivum PE=3 SV=1 |
| 201 | 70824 | 10 | 10 | Glutenin, high molecular weight subunit 12 OS=Triticum aestivum PE=3 SV=1 |

*Ion cut-off score (p<0.05): 12

Mascot Results Link:

http://136.159.167.24/mascot/cgi/master_results_2.pl?file=..%2Fdata%2F20140407%2FF007475.dat


Table A4.19. Mascot search results for digestions of gliadin extracts with recombinant nepenthesin II for 5 hours and 16 minutes (Sections 4.2.6 and 4.2.7).

| Mascot Score | Mass (Da) | Num. of significant matches | Num. of significant sequences* | Description |
|---|---|---|---|---|
| 1454 | 34906 | 73 | 64 | Glutenin, low molecular weight subunit 1D1 OS=Triticum aestivum PE=2 SV=1 |
| 1242 | 37099 | 54 | 47 | Gamma-gliadin OS=Triticum aestivum PE=3 SV=1 |
| 955 | 36643 | 37 | 32 | Alpha/beta-gliadin A-V OS=Triticum aestivum PE=2 SV=1 |
| 857 | 34217 | 39 | 35 | Alpha/beta-gliadin A-IV OS=Triticum aestivum PE=2 SV=1 |
| 843 | 32946 | 49 | 44 | Gamma-gliadin B OS=Triticum aestivum PE=3 SV=1 |
| 832 | 36095 | 36 | 33 | Alpha/beta-gliadin clone PW8142 OS=Triticum aestivum PE=3 SV=1 |
| 741 | 35375 | 36 | 34 | Alpha/beta-gliadin MM1 OS=Triticum aestivum PE=1 SV=1 |
| 696 | 32943 | 35 | 33 | Alpha/beta-gliadin OS=Triticum aestivum PE=2 SV=2 |
| 608 | 32216 | 25 | 23 | Alpha/beta-gliadin A-III OS=Triticum aestivum PE=2 SV=1 |
| 597 | 33640 | 30 | 29 | Alpha/beta-gliadin A-II OS=Triticum aestivum PE=2 SV=1 |
| 520 | 34278 | 26 | 24 | Gamma-gliadin OS=Triticum aestivum PE=3 SV=1 |
| 510 | 29035 | 24 | 21 | Gamma-gliadin (Fragment) OS=Triticum aestivum PE=2 SV=1 |
| 449 | 89120 | 23 | 19 | Glutenin, high molecular weight subunit PW212 OS=Triticum aestivum PE=3 SV=1 |
| 320 | 70824 | 17 | 17 | Glutenin, high molecular weight subunit 12 OS=Triticum aestivum PE=3 SV=1 |
| 17 | 36554 | 1 | 1 | glucanase [Nepenthes khasiana] |

*Ion cut-off score (p<0.05): 12

Mascot Results Link:

http://136.159.167.24/mascot/cgi/master_results_2.pl?file=..%2Fdata%2F20140407%2FF007476.dat


Table A4.20. Mascot search results for digestions of gliadin extracts with recombinant nepenthesin II for 16 hours and 40 minutes (Sections 4.2.6 and 4.2.7).

| Mascot Score | Mass (Da) | Num. of significant matches | Num. of significant sequences* | Description |
|---|---|---|---|---|
| 1542 | 34906 | 79 | 67 | Glutenin, low molecular weight subunit 1D1 OS=Triticum aestivum PE=2 SV=1 |
| 1018 | 36643 | 42 | 36 | Alpha/beta-gliadin A-V OS=Triticum aestivum PE=2 SV=1 |
| 978 | 37099 | 51 | 46 | Gamma-gliadin OS=Triticum aestivum PE=3 SV=1 |
| 969 | 36095 | 46 | 42 | Alpha/beta-gliadin clone PW8142 OS=Triticum aestivum PE=3 SV=1 |
| 903 | 34217 | 46 | 39 | Alpha/beta-gliadin A-IV OS=Triticum aestivum PE=2 SV=1 |
| 868 | 32943 | 48 | 43 | Alpha/beta-gliadin OS=Triticum aestivum PE=2 SV=2 |
| 799 | 35375 | 46 | 40 | Alpha/beta-gliadin MM1 OS=Triticum aestivum PE=1 SV=1 |
| 781 | 33640 | 42 | 38 | Alpha/beta-gliadin A-II OS=Triticum aestivum PE=2 SV=1 |
| 706 | 32946 | 45 | 42 | Gamma-gliadin B OS=Triticum aestivum PE=3 SV=1 |
| 605 | 89120 | 27 | 25 | Glutenin, high molecular weight subunit PW212 OS=Triticum aestivum PE=3 SV=1 |
| 605 | 29035 | 34 | 31 | Gamma-gliadin (Fragment) OS=Triticum aestivum PE=2 SV=1 |
| 593 | 34278 | 37 | 35 | Gamma-gliadin OS=Triticum aestivum PE=3 SV=1 |
| 565 | 32216 | 32 | 28 | Alpha/beta-gliadin A-III OS=Triticum aestivum PE=2 SV=1 |
| 294 | 70824 | 18 | 18 | Glutenin, high molecular weight subunit 12 OS=Triticum aestivum PE=3 SV=1 |

*Ion cut-off score (p<0.05): 12

Mascot Results Link:

http://136.159.167.24/mascot/cgi/master_results_2.pl?file=..%2Fdata%2F20140407%2FF007477.dat


Table A4.21. Mascot search results for digestions of gliadin extracts with recombinant nepenthesin II for 52 hours and 40 minutes (Sections 4.2.6 and 4.2.7).

| Mascot Score | Mass (Da) | Num. of significant matches | Num. of significant sequences* | Description |
|---|---|---|---|---|
| 1751 | 34906 | 83 | 71 | Glutenin, low molecular weight subunit 1D1 OS=Triticum aestivum PE=2 SV=1 |
| 1520 | 36643 | 59 | 50 | Alpha/beta-gliadin A-V OS=Triticum aestivum PE=2 SV=1 |
| 1470 | 37099 | 70 | 62 | Gamma-gliadin OS=Triticum aestivum PE=3 SV=1 |
| 1427 | 35375 | 59 | 51 | Alpha/beta-gliadin MM1 OS=Triticum aestivum PE=1 SV=1 |
| 1409 | 36095 | 59 | 51 | Alpha/beta-gliadin clone PW8142 OS=Triticum aestivum PE=3 SV=1 |
| 1326 | 33640 | 60 | 54 | Alpha/beta-gliadin A-II OS=Triticum aestivum PE=2 SV=1 |
| 1158 | 32943 | 50 | 46 | Alpha/beta-gliadin OS=Triticum aestivum PE=2 SV=2 |
| 1049 | 34278 | 55 | 50 | Gamma-gliadin OS=Triticum aestivum PE=3 SV=1 |
| 1035 | 34217 | 41 | 39 | Alpha/beta-gliadin A-IV OS=Triticum aestivum PE=2 SV=1 |
| 971 | 32946 | 61 | 57 | Gamma-gliadin B OS=Triticum aestivum PE=3 SV=1 |
| 970 | 33920 | 46 | 41 | Alpha/beta-gliadin clone PW1215 OS=Triticum aestivum PE=3 SV=1 |
| 908 | 89120 | 35 | 32 | Glutenin, high molecular weight subunit PW212 OS=Triticum aestivum PE=3 SV=1 |
| 849 | 70824 | 34 | 30 | Glutenin, high molecular weight subunit 12 OS=Triticum aestivum PE=3 SV=1 |
| 767 | 32216 | 40 | 36 | Alpha/beta-gliadin A-III OS=Triticum aestivum PE=2 SV=1 |
| 739 | 29035 | 43 | 39 | Gamma-gliadin (Fragment) OS=Triticum aestivum PE=2 SV=1 |
| 15 | 69248 | 2 | 2 | Serum albumin - Bos taurus (Bovine). |

*Ion cut-off score (p<0.05): 12

Mascot Results Link:

http://136.159.167.24/mascot/cgi/master_results_2.pl?file=..%2Fdata%2F20140407%2FF007495.dat


Table A4.22. Miniaturized database containing the sequences of all immunogenic gliadin epitopes searched in Mascot for the qualitative and quantitative digestion analyses (Sections 4.2.6-4.2.8).

| Epitope | Sequence | Description |
|---|---|---|
| 33-mer (any 9 residue sequence) | LQLQPFPQPQLPYPQPQLPYPQPQLPYPQPQPF | α-gliadin, a.a. 57-89 [48] |
| DQ2-α-Ia | PFPQPQLPY | α-gliadin, a.a. 60-68 [143] |
| DQ2-α-II | PQPQLPYPQ | α-gliadin, a.a. 62-70 [143] |
| DQ2-α-Ib | PYPQPQLPY | α-gliadin, a.a. 67-75 [144] |
| GLIA-α-20 | FRPQQPYPQ | α-gliadin, a.a. 94-102 [150] |
| DQ8 GLIA-α-I | QGSFQPSQQ | α-gliadin, a.a. 208-216 [151] |
| GLIA 31-43 | PGQQQPFPPQQPY | α-gliadin, a.a. 31-43 [59] |
| DQ2-γ-I | QPQQSFPQQQR | γ-gliadin, a.a. 140-150 [155] |
| Algorithm 1* | {FY}XXQX{QP}QX{FYLWI} | Predictive immune epitope algorithm [50] |
| Algorithm 2* | {FWYILV}XX{PQVLI}X{QAP}QX{FYLWI} | Predictive immune epitope algorithm [50] |
| Algorithm 3* | {FWYILV}XX{DQ}X{QAP}XP | Predictive immune epitope algorithm [50] |

*Residues in {XXX} denote the preferred residues for that singular position in the epitope algorithms
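The three predictive algorithms in Table A4.22 are positional motifs, so they can be read as regular expressions. The Python sketch below is illustrative only and is not the Mascot-based search used in this thesis. It assumes that `X` stands for any residue (the table footnote defines only the `{...}` sets), and the names `motif_to_regex`, `find_motif`, `ALGORITHMS` and `THIRTY_THREE_MER` are hypothetical helpers introduced for the example.

```python
import re

# Motifs and the 33-mer sequence copied from Table A4.22.
ALGORITHMS = {
    "Algorithm 1": "{FY}XXQX{QP}QX{FYLWI}",
    "Algorithm 2": "{FWYILV}XX{PQVLI}X{QAP}QX{FYLWI}",
    "Algorithm 3": "{FWYILV}XX{DQ}X{QAP}XP",
}
THIRTY_THREE_MER = "LQLQPFPQPQLPYPQPQLPYPQPQLPYPQPQPF"


def motif_to_regex(motif: str) -> str:
    """Translate the table's motif notation into a regular expression.

    {ABC} -> [ABC]  (one residue from the preferred set)
    X     -> .      (ASSUMPTION: X means any residue)
    Any other letter is matched literally.
    """
    parts = []
    i = 0
    while i < len(motif):
        ch = motif[i]
        if ch == "{":
            end = motif.index("}", i)
            parts.append("[" + motif[i + 1:end] + "]")
            i = end + 1
        elif ch == "X":
            parts.append(".")
            i += 1
        else:
            parts.append(re.escape(ch))
            i += 1
    return "".join(parts)


def find_motif(motif: str, sequence: str):
    """Return (1-based start, matched peptide) for every hit, overlaps included."""
    # A lookahead with a capture group reports overlapping matches.
    pattern = re.compile("(?=(" + motif_to_regex(motif) + "))")
    return [(m.start() + 1, m.group(1)) for m in pattern.finditer(sequence)]


if __name__ == "__main__":
    for name, motif in ALGORITHMS.items():
        print(name, motif_to_regex(motif), find_motif(motif, THIRTY_THREE_MER))
```

Start positions are reported relative to the peptide passed in, so a hit in the 33-mer would need an offset of 56 added to express it in the α-gliadin numbering (a.a. 57-89) used in the table.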
