
DETERMINING THE SPECIFICITY OF PEPSIN FOR PROTEOLYTIC DIGESTION

A thesis presented

by

Melissa H. Palashoff

to The Department of Chemistry and Chemical Biology

in partial fulfillment of the requirements for the degree of Master of Science

in the field of

Chemistry

Northeastern University Boston, Massachusetts

August 2008

DETERMINING THE SPECIFICITY OF PEPSIN FOR PROTEOLYTIC DIGESTION

by

Melissa H. Palashoff

ABSTRACT OF THESIS

Submitted in partial fulfillment of the requirements for the degree of Master of Science in Chemistry and Chemical Biology in the Graduate School of Arts and Sciences of Northeastern University, August 2008

ABSTRACT

Pepsin is an enzyme that is commonly found in the stomach of many organisms. Porcine pepsin is the most studied and is fully active at pH 1.9 but inactive above pH ~7. Pepsin is known to have limited specificity and there are only general rules about its cleavage preferences.

To further define rules regarding pepsin specificity, a database was constructed consisting of 40 proteins and 1344 peptide cleavages from the literature. Contemporary scientific literature was searched for all publications that involve pepsin digestion and mass spectrometry at pH 2.5-2.7. Peptide data for 40 proteins were extracted and combined to create a map of pepsin cleavage specificity. The frequency of cleavage for each protein was normalized based on how many times that specific combination of residues occurred in the protein sequence.

In addition to the literature search, nine proteins along with E.coli whole cell lysate were digested at pH 1.0, 2.5 and 4.0. The proteins were analyzed with online pepsin digestion using an immobilized pepsin column and UPLC/ESI-MSE. The peptides

and their fragments were identified with a combination of MSE, software analysis, and

manual inspection.

The analysis of the data indicated that pepsin maintains limited cleavage

preferences. At pH 2.5, pepsin will cleave preferentially after most bulky, hydrophobic

amino acids such as leucine and phenylalanine. Additionally, the residues that most often

occur immediately following the cleaved bond are tryptophan and tyrosine. It has

also been shown that pepsin will rarely cleave at proline and . Analysis

performed at pH 1.0 and 4.0 yielded similar results.

ACKNOWLEDGEMENTS

I would like to begin by thanking my advisor Dr. John R. Engen for all of his help

in making me a better scientist. Without his guidance and advice this research would not

have been possible. I am also grateful to my other committee members, Dr. Mary Jo

Ondrechen and Dr. Paul Vouros, for helping to make this thesis the best that it could be.

I would like to sincerely thank everyone in Dr. Engen’s lab for their continuous support over the past year. I thank my labmates, Dr. Thomas Wales, Dr. Roxana Iacob, Christopher Morgan, Sean Marcsisin, Damian Houde and Susan Fang, for providing a wonderful environment in which to work and learn.

Finally, I would like to dedicate this to my family and friends for their constant encouragement during the past five years. I would especially like to thank my parents,

William and Patricia, and my brother Joshua for everything they have done to support

me, both morally and materially, during my college career. Last, but certainly not least,

to Eddie for always being there to support me during the good days and the bad, thank

you.

TABLE OF CONTENTS

ABSTRACT
TABLE OF CONTENTS
LIST OF FIGURES
LIST OF TABLES
LIST OF ABBREVIATIONS

CHAPTER 1: INTRODUCTION AND BACKGROUND TO ASPARTIC ACID PROTEASES AND PEPSIN
  1.1 Aspartic Acid Proteases
  1.2 Catalytic Mechanism of Aspartic Proteases
  1.3 Primary Structure of the Pepsin-like Family
  1.4 Pepsin
    1.4.1 History of pepsin
    1.4.2 Activation of pepsin
    1.4.3 Pepsin crystal structure
    1.4.4 Activity of pepsin
    1.4.5 Pepsin and proteomics
  1.5 Research Objectives
  1.6 References

CHAPTER 2: LITERATURE SEARCH
  2.1 Introduction
  2.2 Materials and Methods
  2.3 Construction of Cleavage Database
  2.4 Data Normalization
  2.5 Cleavage Data Map
  2.6 Revised Cleavage Data Map
  2.7 Summary of Literature Research
  2.8 References

CHAPTER 3: EXPERIMENTAL DETERMINATION OF PEPSIN SPECIFICITY AND THE EFFECTS OF pH
  3.1 Introduction
  3.2 Instrumentation
    3.2.1 UPLC and online pepsin digestion
    3.2.2 Mass Spectrometry
  3.3 Materials and Methods
    3.3.1 Protein sample analysis
      3.3.1.1 Protein sample preparation
      3.3.1.2 UPLC methods
      3.3.1.3 Mass analysis
    3.3.2 E.coli sample analysis
      3.3.2.1 E.coli sample preparation
      3.3.2.2 UPLC analysis
      3.3.2.3 Mass analysis
  3.4 Data Analysis
    3.4.1 Software processing
    3.4.2 Peptide analysis
  3.5 Results
  3.6 References

CHAPTER 4: PERSPECTIVES AND FUTURE DIRECTIONS
  4.1 Discussion and Conclusions
    4.1.1 Literature search
    4.1.2 Experimental research
    4.1.3 Research on pepsin specificity
  4.2 Future Directions

LIST OF FIGURES

Figure 1.1 Aspartic proteases family tree
Figure 1.2 The aspartic protease catalytic mechanism
Figure 1.3 Sequence alignment of porcine pepsinogen and porcine pepsin
Figure 1.4 Crystal structure of human pepsin
Figure 2.1 Example of a peptic digest map
Figure 2.2 Example of cleavage nomenclature
Figure 2.3 Equation used for data normalization and example calculation
Figure 2.4 Cleavage data map
Figure 2.5 Cleavage data map with probability defined as a percentage
Figure 3.1 Schematic of online pepsin digestion
Figure 3.2 Schematic of the operation of MSE
Figure 3.3 Example MS/MS data of hemoglobin
Figure 3.4 Example MS/MS data of myoglobin
Figure 3.5 pH 2.5 peptic digest map of Abl
Figure 3.6 pH 2.5 peptic digest map of Albumin
Figure 3.7 pH 2.5 peptic digest map of Aldolase
Figure 3.8 pH 2.5 peptic digest map of Amyloglucosidase
Figure 3.9 pH 2.5 peptic digest map of β-Lactalbumin
Figure 3.10 pH 2.5 peptic digest map of Hemoglobin
Figure 3.11 pH 2.5 peptic digest map of Myoglobin
Figure 3.12 pH 2.5 peptic digest map of Nef
Figure 3.13 pH 2.5 peptic digest map of Ubiquitin
Figure 3.14 Cleavage data map pH 1.0
Figure 3.15 Cleavage data map pH 2.5
Figure 3.16 Cleavage data map pH 4.0
Figure 3.17 Cleavage data map with probability defined as a percentage, pH 1.0
Figure 3.18 Cleavage data map with probability defined as a percentage, pH 2.5
Figure 3.19 Cleavage data map with probability defined as a percentage, pH 4.0

LIST OF TABLES

Table 2.1 Literature search results
Table 2.2 Cleavage database
Table 2.3 Sum of cleavages between two residues
Table 2.4 Possible cleavages between two residues
Table 2.5 Normalized cleavage data
Table 2.6 Cleavage data with probability defined as a percentage
Table 3.1 Proteins used for digestion
Table 3.2 Auxiliary solvents
Table 3.3 pH 1.0 peptides
Table 3.4 pH 4.0 peptides
Table 3.5 pH 2.5 E.coli peptides
Table 3.6 Normalized pH 1.0 cleavage data
Table 3.7 Normalized pH 2.5 cleavage data
Table 3.8 Normalized pH 4.0 cleavage data
Table 3.9 Normalized pH 1.0 cleavage data defined as a percentage
Table 3.10 Normalized pH 2.5 cleavage data defined as a percentage
Table 3.11 Normalized pH 4.0 cleavage data defined as a percentage

LIST OF ABBREVIATIONS

ACN      acetonitrile
DTT      dithiothreitol
E.coli   Escherichia coli
ESI      electrospray ionization
eV       electron volts
FA       formic acid
fmol     femtomole
HCl      hydrochloric acid
Hpb      hydrophobic residue
HPLC     high performance liquid chromatography
kDa      kilo-Dalton
kV       kilovolts
L        liter
LC       liquid chromatography
M        molar
min      minute
μL       microliters
μM       micromolar
mM       millimolar
mm       millimeter
MS       mass spectrometry
MS/MS    tandem mass spectrometry
MW       molecular weight
PDB      protein data bank
pmol     picomoles
UPLC     ultra performance liquid chromatography
V        volts

CHAPTER 1

INTRODUCTION AND BACKGROUND

TO ASPARTIC ACID PROTEASES AND PEPSIN

1.1 Aspartic Acid Proteases

Aspartic acid proteases are a large class of enzymes that are widely distributed. They are found in a multitude of organisms such as vertebrates, retroviruses, fungi and plants (Davies 1990). Aspartic proteases share two characteristics: they have an optimum pH in the acid range and they are inhibited by pepstatin (Fruton 1976). The MEROPS database (Rawlings, Morton et al. 2008) has classified the aspartic proteases into many different families. However, this class of enzymes is believed to have a conserved segment of residues (-Asp-Thr-Gly-) in its primary structure (Dunn 2002). Taking this into account, aspartic proteases can be split up into five main families (Figure 1.1). The largest of these groups is the pepsin-like family, which contains about 70 members. The retroviral family contains 45 members, the cauliflower mosaic virus family contains six members, the human spumaretroviral family contains one member and the copia transposon family contains six members (Dunn 2001; Rawlings, Morton et al. 2008).

Because aspartic proteases are found in so many different organisms, they perform a variety of functions. The most studied of these aspartic proteases are involved in digestion (pepsin, gastricsin, chymosin, etc.) and protein degradation (cathepsin D). In fungi, acid proteases play an important role in sporulation. Retroviral proteases cleave retroviruses during the activation of the virus (Dickson 1984).

Aspartic Proteases

Pepsin-like: Pepsin A, Gastricsin, Chymosin, Rhizopuspepsin, Barrierpepsin, Aspergillopepsin, Phytepsin, Plasmepsin I, Plasmepsin II, Yapsin, Memapsin I, Memapsin II
Retroviral Proteases: HIV-1 retropepsin, HIV-2 retropepsin
Cauliflower Mosaic Viruses: Bacilliform virus
Spumaretroviral
Copia Transposons

Figure 1.1. Aspartic proteases family tree. The aspartic acid protease family can be split up into five main categories. Pepsin-like contains 70 members, retroviral contains 45 members, cauliflower mosaic contains 6, spumaretroviral contains 1, and copia transposon contains 6. Only a selection of enzymes is shown in this diagram. Adapted from Dunn (2001) and Rawlings, Morton et al. (2008).

1.2 Catalytic Mechanism of Aspartic Proteases

As previously mentioned, aspartic proteases contain a conserved segment of residues. This amounts to about 5% sequence identity across all of the enzymes. Two of the residues that remain conserved are aspartic acids (hence the name aspartic acid proteases). The aspartic residues, Asp32 and Asp215 (pepsin numbering), are located in the cleft of the active site and are involved in the catalytic mechanism (Dunn 1989). To date the most popular mechanism of aspartic proteases (Figure 1.2) was proposed by Northrop (Northrop 2001) and is believed to involve the controversial low-barrier hydrogen bond.

The mechanism begins with the free enzyme, wherein the two aspartate groups are believed to form a 10-atom cyclic structure with a water molecule. This forms a loose complex with the substrate in step 1. The catalytic process begins in step 2 when the complex takes on the required geometry and distances. In step 3 a proton is removed from the bound water molecule in order to generate a hydroxide ion, which attacks the carbon atom of the substrate’s carbonyl group. A proton is then transferred to the nitrogen of the peptide bond in step 4 and bond cleavage occurs in step 5. In step 6 the complex opens up for release of the products in step 7. A proton is then lost in step 8 and finally a new water molecule is bound to the complex in step 9. Steps 8 and 9 represent the deprotonation, rehydration and restructuring of the complex to complete isomerization. Isomerization is necessary in order to reform the low-barrier hydrogen bond (Northrop 2001; Dunn 2002).

Figure 1.2. The aspartic protease catalytic mechanism. The first step in the mechanism is the formation of a loose complex upon substrate binding. Step 2 begins the catalytic process and in step 3 the carbonyl group is attacked following the removal of a proton. In step 4 there is the transfer of a proton and in step 5 bond cleavage occurs. The complex opens up in step 6 and the product is released in step 7. In step 8 a proton is lost and in step 9 a new water molecule binds to the complex. Adapted from Northrop (2001).

1.3 Primary Structure of the Pepsin-like Family

Pepsin-like proteases make up the largest family of acid proteases. They are roughly 330-350 residues in length and share a common primary structure. Pepsin-like proteases are single chain enzymes that are usually described as having two domains. The primary structure supports this two-domain description because it has a repetitive form (Orengo, Michie et al. 1997).

~ 30-35 AA ~ Hpb-Hpb-Asp-Thr-Gly ~ 45-50 AA ~ Tyr ~ 45-50 AA ~ Leu-Gly-Ile ~ 90-95 AA ~ Hpb-Hpb-Asp-Thr-Gly ~ 85-90 AA ~ Leu-Gly-Asp ~ 20-30 AA

The sequence typically begins with 30 to 35 residues followed by two hydrophobic residues (Hpb). The hydrophobic residues are followed by the conserved Asp-Thr-Gly sequence found in all aspartic proteases. The Asp-Thr-Gly sequence occurs within a wide loop known as the Psi-loop. About 45 residues after the Asp-Thr-Gly sequence a conserved Tyr occurs. After about another 45 residues there is a Leu-Gly-Ile sequence followed by about 90 more residues. The sequence then shows some form of repetition. Two hydrophobic residues occur again before a second Asp-Thr-Gly sequence. There are about 85 more residues before a Leu-Gly-Asp sequence. Another 20 to 30 amino acids make up the rest of the primary structure of a pepsin-like protease (Davies 1990).
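The repeated Hpb-Hpb-Asp-Thr-Gly element described above can be located with a simple pattern search. The following is a minimal sketch, not part of the original analysis. It uses the mature porcine pepsin sequence from Figure 1.3 and assumes that "Hpb" means any of the residues A, V, L, I, M, F, W or Y; that residue set and the function name `catalytic_asp_positions` are illustrative choices only.

```python
import re

# Mature porcine pepsin sequence, copied from the alignment in Figure 1.3.
PEPSIN = (
    "IGDEPLENYLDTEYFG"
    "TIGIGTPAQDFTVIFDTGSSNLWVPSVYCSSLACSDHNQFNPDDSSTFEATSQELSITYG"
    "TGSMTGILGYDTVQVGGISDTNQIFGLSETEPGSFLYYAPFDGILGLAYPSISASGATPV"
    "FDNLWDQGLVSQDLFSVYLSSNDDSGSVVLLGGIDSSYYTGSLNWVPVSVEGYWQITLDS"
    "ITMDGETIACSGGCQAIVDTGTSLLTGPTSAIANIQSDIGASENSDGEMVISCSSIDSLP"
    "DIVFTINGVQYPLSPSAYILQDDDSCTSGFEGMDVPTSSGELWILGDVFIRQYYTVFDRA"
    "NNKVGLAPVA"
)

# Assumption: one-letter codes treated as "hydrophobic" (Hpb) for this sketch.
HYDROPHOBIC = "AVLIMFWY"

# Two hydrophobic residues followed by the conserved Asp-Thr-Gly triplet.
MOTIF = re.compile(f"[{HYDROPHOBIC}]{{2}}DTG")


def catalytic_asp_positions(sequence):
    """Return the 1-based position of the Asp in every Hpb-Hpb-Asp-Thr-Gly match."""
    # match.start() is the 0-based index of the first Hpb residue;
    # the Asp lies two residues further on, hence +3 for a 1-based position.
    return [match.start() + 3 for match in MOTIF.finditer(sequence)]


if __name__ == "__main__":
    print(catalytic_asp_positions(PEPSIN))  # -> [32, 215]
```

With this residue set the search finds the motif twice, and the two Asp positions match the catalytic residues Asp32 and Asp215 in pepsin numbering.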

The tyrosine residue that is conserved in the sequence is important because it helps to define the active site pockets where substrate side-chain residues bind. In the Leu-Gly-Ile and Leu-Gly-Asp sequences the glycine occurs for structural reasons, as it is easily able to fit through the Psi-loop. The leucine, isoleucine and aspartic acid residues are there because their bulkiness ensures that the Psi-loop is locked into place (Dunn 2001).

1.4 Pepsin

1.4.1 History of Pepsin

In terms of evolutionary history, pepsin is considered to be the first enzyme in the

aspartic protease family. It was the first enzyme recognized as having activity (in

digestive processes) and in 1825 it was the first to be given a name (Gillespie 1898).

Porcine pepsin was also one of the first proteins to be extracted (Gillespie 1898) and to be

crystallized (Northrop 1930). It was also the first enzyme for which X-ray diffraction

patterns were obtained from crystals (Bernal and Crowfoot 1934). The extracted enzyme needs to be

acidified in order for full activity to be regained. It was noted that different acids yielded

different levels of activity. The activities were plotted against hydrogen ion

concentrations by Sorensen (Sorensen 1909). The plot had a scaling issue which

Sorensen resolved using a logarithmic abscissa, thus inventing the pH scale.

1.4.2 Activation of Pepsin

Pepsin, along with other aspartic proteases commonly found in vertebrates and plants, is most often synthesized as an inactive zymogen. For pepsin this zymogen is pepsinogen. Pepsinogen has the same primary structure as pepsin plus an additional 44 residues at the N-terminus of the protein (Figure 1.3). This 44-residue segment is often referred to as a propeptide and pepsinogen is often referred to as a proenzyme (Davies 1990). The pepsinogen propeptide contains nine lysine residues, two arginine residues and two histidine residues, which make the peptide basic. The propeptide forms a helical structure that is stabilized by electrostatic forces, as six of the basic side chains form ion pairs with the carboxylate side chains of pepsin (Perlmann 1963). The propeptide inhibits the activity of the enzyme because a segment of it blocks access to the catalytic aspartates in the active site. Removal of the propeptide results in the activation of pepsinogen to pepsin (James and Sielecki 1986). A loss of helical structure of the propeptide also typically occurs during activation of the zymogen (Davies 1990).

Pepsinogen  LVKVPLVRKKSLRQNLIKNGKLKDFLKTHKHNPASKYFPEAAALIGDEPLENYLDTEYFG
Pepsin      --------------------------------------------IGDEPLENYLDTEYFG

Pepsinogen  TIGIGTPAQDFTVIFDTGSSNLWVPSVYCSSLACSDHNQFNPDDSSTFEATSQELSITYG
Pepsin      TIGIGTPAQDFTVIFDTGSSNLWVPSVYCSSLACSDHNQFNPDDSSTFEATSQELSITYG

Pepsinogen  TGSMTGILGYDTVQVGGISDTNQIFGLSETEPGSFLYYAPFDGILGLAYPSISASGATPV
Pepsin      TGSMTGILGYDTVQVGGISDTNQIFGLSETEPGSFLYYAPFDGILGLAYPSISASGATPV

Pepsinogen  FDNLWDQGLVSQDLFSVYLSSNDDSGSVVLLGGIDSSYYTGSLNWVPVSVEGYWQITLDS
Pepsin      FDNLWDQGLVSQDLFSVYLSSNDDSGSVVLLGGIDSSYYTGSLNWVPVSVEGYWQITLDS

Pepsinogen  ITMDGETIACSGGCQAIVDTGTSLLTGPTSAIANIQSDIGASENSDGEMVISCSSIDSLP
Pepsin      ITMDGETIACSGGCQAIVDTGTSLLTGPTSAIANIQSDIGASENSDGEMVISCSSIDSLP

Pepsinogen  DIVFTINGVQYPLSPSAYILQDDDSCTSGFEGMDVPTSSGELWILGDVFIRQYYTVFDRA
Pepsin      DIVFTINGVQYPLSPSAYILQDDDSCTSGFEGMDVPTSSGELWILGDVFIRQYYTVFDRA

Pepsinogen  NNKVGLAPVA
Pepsin      NNKVGLAPVA

Figure 1.3. Sequence alignment of porcine pepsinogen and porcine pepsin. The 44-residue propeptide is highlighted in blue. This segment is cleaved off upon activation.
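The basic character of the propeptide can be checked directly against the sequences in Figure 1.3. The sketch below is illustrative only. It takes the first 60 residues of pepsinogen and the first 16 residues of pepsin from that figure, recovers the propeptide as the stretch that precedes the start of the mature enzyme, and counts its Lys (K), Arg (R) and His (H) residues. The function names `propeptide` and `basic_residue_counts` are hypothetical helpers.

```python
from collections import Counter

# N-terminal residues of porcine pepsinogen and pepsin, copied from Figure 1.3.
PEPSINOGEN_N_TERMINUS = (
    "LVKVPLVRKKSLRQNLIKNGKLKDFLKTHKHNPASKYFPEAAALIGDEPLENYLDTEYFG"
)
PEPSIN_N_TERMINUS = "IGDEPLENYLDTEYFG"


def propeptide(zymogen, mature):
    """Return the zymogen residues that precede the start of the mature enzyme."""
    start = zymogen.find(mature)
    if start < 0:
        raise ValueError("mature sequence not found in zymogen")
    return zymogen[:start]


def basic_residue_counts(peptide):
    """Count lysine (K), arginine (R) and histidine (H) residues in a peptide."""
    counts = Counter(peptide)
    return {residue: counts[residue] for residue in "KRH"}


if __name__ == "__main__":
    pro = propeptide(PEPSINOGEN_N_TERMINUS, PEPSIN_N_TERMINUS)
    print(len(pro))                   # -> 44
    print(basic_residue_counts(pro))  # -> {'K': 9, 'R': 2, 'H': 2}
```

The counts agree with the composition quoted from the literature: 13 basic residues within the 44-residue propeptide.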

Pepsinogen activation occurs when the pH of a solution of pepsinogen is lowered.

The lowering of the pH is believed to protonate the carboxylate side chains of pepsin

which causes the complex to break down and leads to the formation of the active enzyme.

Raising the pH can fully reverse the zymogen activation if performed in a timely manner.

However, if the pH is lowered for a prolonged period of time the activation is irreversible

(James and Sielecki 1986).

The activation of pepsinogen into pepsin is believed to occur through two pathways, either in a one-step process or in a sequential manner. There are also two different reactions that occur during activation. In the intramolecular reaction pepsinogen cleaves itself to form the active pepsin, while in the intermolecular reaction pepsinogen is cleaved by either another pepsinogen molecule, an intermediate form or an active pepsin

molecule. Kinetic experiments have shown that the intramolecular reaction is

predominant at a pH lower than 3.0 (al-Janabi, Hartsuck et al. 1972). The one-step

activation pathway appears to proceed mainly, but not exclusively, through the

intermolecular reaction (Kageyama and Takahashi 1983).

Both the one-step pathway and the stepwise pathway are believed to occur

simultaneously during the activation of pepsinogen to pepsin (Christensen, Pedersen et al.

1977). The intramolecular reaction and the intermolecular reaction are both involved in the one-step pathway. It appears as though the intramolecular reaction is essential for the initial activation in order to generate the active pepsin molecules. The intermolecular reaction is important for completion of the activation (Kageyama and Takahashi 1987).

1.4.3 Pepsin Crystal Structure

Porcine pepsin was first crystallized in 1930 by John Northrop, and its structure was later refined by Sielecki et al. (Sielecki, Fedorov et al. 1990). Figure 1.4 illustrates the crystal structure of human pepsin (Fujinaga, Chernaia et al. 1995). The catalytic Asp residues, Asp32 and Asp215, are highlighted in blue while the pepsin inhibitor pepstatin is highlighted in red. The protein can be divided into three regions (James and Sielecki 1986). The first region consists of a six-stranded antiparallel β-sheet. This interdomain region forms the backbone of the structure and is located behind the catalytic site region. The other two regions are lobes. One lobe is the N-terminal domain, which consists of 142 residues, and the other is the C-terminal domain, which consists of 123 residues. Despite a similar pattern in their amino acid sequences, the N-terminal and C-terminal domains are not very similar in their secondary or tertiary structures (Sielecki, Fedorov et al. 1990).

The pepsin crystal structure also contains a short interdomain peptide that lies next to the external side of the six-stranded β-sheet (Sielecki, Fedorov et al. 1990). There are also two strands that form a β-hairpin loop that is often called the flap. The flap projects out at the active site cleft of the molecule (Davies 1990). Pepsin contains a large hydrophobic core at its center. This is a result of the reassembly of the three regions mentioned above. A major factor contributing to the hydrophobic core is the side chains that protrude inward from the six-stranded β-sheet (Sielecki, Fedorov et al. 1990).

Figure 1.4. Crystal structure of human pepsin. The two catalytic Asp residues, Asp32 and Asp215, are highlighted in blue. The pepsin inhibitor pepstatin is highlighted in red. PDB: 1PSN

The catalytic site of pepsin is defined by two aspartic acid residues, Asp32 and Asp215. One Asp residue is located in the N-terminal domain and the other in the C-terminal domain. The two Asp residues are located towards the end of each domain and are connected through a network of hydrogen bonds. The active site is quite rigid. However, the flap that protrudes out above the active site is rather flexible. The flap can close around inhibitors that are bound to the active site, thus limiting the mobility of the flap (James, Sielecki et al. 1982).

1.4.4 Activity of Pepsin

Pepsin is an enzyme whose activity is greatly dependent on pH. Pepsin has its optimum enzymatic activity at a pH between 1.8 and 2.0. It remains stable, and still highly active, when the pH drops to as low as 1.0 (Ryle 1970). Pepsin begins to lose activity around pH 5 (Smith 1991) and becomes irreversibly inactive at a pH around 7. However, a high concentration of pepsin will not become inactive until a pH of about 8 (Jones and Landon 2002). The activity of pepsin is also dependent upon the enzyme to protein ratio. The higher this ratio is, the more efficient the enzyme becomes (Wu, Kaveti et al. 2006).

1.4.5 Pepsin and Proteomics

Pepsin can be a very useful tool for proteomics. Pepsin has a very broad specificity and is believed to often cleave after bulky hydrophobic residues (Fruton 1970; Ryle 1970). Because of its broad specificity pepsin produces many peptides during digestion. The multiple cleavage sites mean that the peptides produced are usually small, around 3 to 30 residues in length. The peptides also typically overlap, which is useful for protein mapping. Despite its broad specificity pepsin is still a very reproducible enzyme, meaning it will yield the same peptides when digestion of a protein is performed under identical conditions (Zhang and Smith 1993).

1.5 Research Objectives

As previously mentioned, little is known about pepsin’s specificity other than the fact that it prefers to cleave after bulky hydrophobic residues (Fruton 1970). Determining trends in pepsin specificity is important because pepsin is widely used, and without rules about its cleavage preferences protein characterization can be very difficult. Pepsin is used for protein characterization despite its poorly defined specificity because it is one of the few enzymes with high activity in the low pH range.

One of the most common uses for pepsin is in hydrogen/deuterium exchange mass spectrometry (HXMS). To date, pepsin is essentially the only enzyme that can be used for these experiments. Due to the nature of the reaction, the digests need to be performed at a pH around 2.5 and at 0 °C. While other aspartic proteases will work at this pH, no other enzymes remain as active as pepsin at this low temperature. Since pepsin is so often used to characterize proteins and because so little is known about pepsin cleavage preferences, the main objective of this research is to determine whether pepsin has any specificity, thus advancing the use of pepsin digestion for protein characterization by mass spectrometry.

The steps taken to determine trends in pepsin specificity consist of two main

parts. Chapter 2 outlines an extensive literature search performed in order to gather all

existing data involving pepsin cleavages. The corresponding experimental research is discussed in chapter 3. The goal of the experimental research was to obtain more data on pepsin digests. These analyses were performed at pH 2.5, which is the most common pH for pepsin digests. Chapter 3 also examines the effect of pH on the specificity of pepsin. It is known that pepsin’s activity is highly dependent on pH (Ryle 1970), but it is not known whether that loss or gain of activity is directly related to specificity.

1.6 References

al-Janabi, J., J. A. Hartsuck, et al. (1972). "Kinetics and mechanism of pepsinogen activation." J Biol Chem 247: 4628-32.

Bernal, J. D. and D. Crowfoot (1934). "X-Ray photographs of crystalline pepsin." Nature 133: 794-795.

Christensen, K. A., V. B. Pedersen, et al. (1977). "Identification of an enzymatically active intermediate in the activation of porcine pepsinogen." FEBS Lett 76: 214-8.

Davies, D. R. (1990). "The structure and function of the aspartic proteinases." Annu Rev Biophys Biophys Chem 19: 189-215.

Dickson, C., Eisenman, R., Fan, H., Hunter, E., Teich, N. (1984). RNA Tumor Viruses. R. Weiss, Teich, N., Varmus, H., Coffin, J. New York, Cold Spring Harbor Lab: 513-648.

Dunn, B. M. (1989). Determination of Protease Mechanism. Proteolytic Enzymes: A Practical Approach. R. J. Beynon and J. S. Bond. Oxford, England, Information Press Ltd.: 57-81.

Dunn, B. M. (2001). "Overview of pepsin-like aspartic peptidases." Curr Protoc Protein Sci Chapter 21: Unit 21.3.

Dunn, B. M. (2002). "Structure and mechanism of the pepsin-like family of aspartic peptidases." Chem Rev 102: 4431-58.

Fruton, J. S. (1970). "The specificity and mechanism of pepsin action." Adv Enzymol Relat Areas Mol Biol 33: 401-43.

Fruton, J. S. (1976). "The mechanism of the catalytic action of pepsin and related acid proteinases." Adv Enzymol Relat Areas Mol Biol 44: 1-36.

Fujinaga, M., M. M. Chernaia, et al. (1995). "Crystal structure of human pepsin and its complex with pepstatin." Protein Sci 4: 960-72.

Gillespie, A. L. (1898). The Natural History of Digestion. London, W. Scott.

James, M. N., A. Sielecki, et al. (1982). "Conformational flexibility in the active sites of aspartyl proteinases revealed by a pepstatin fragment binding to penicillopepsin." Proc Natl Acad Sci U S A 79: 6137-41.

James, M. N. and A. R. Sielecki (1986). "Molecular structure of an aspartic proteinase zymogen, porcine pepsinogen, at 1.8 A resolution." Nature 319(6048): 33-8.

Jones, R. G. and J. Landon (2002). "Enhanced pepsin digestion: a novel process for purifying antibody F(ab')(2) fragments in high yield from serum." J Immunol Methods 263: 57-74.

Kageyama, T. and K. Takahashi (1983). "Occurrence of two different pathways in the activation of porcine pepsinogen to pepsin." J Biochem 93: 743-54.

Kageyama, T. and K. Takahashi (1987). "Activation mechanism of monkey and porcine pepsinogens A. One-step and stepwise activation pathways and their relation to intramolecular and intermolecular reactions." Eur J Biochem 165: 483-90.

Northrop, D. B. (2001). "Follow the protons: a low-barrier hydrogen bond unifies the mechanisms of the aspartic proteases." Acc Chem Res 34: 790-7.

Northrop, J. H. (1930). "Crystalline pepsin I. Isolation and tests for purity." J. Gen. Physiol. 13: 739-766.

Orengo, C. A., A. D. Michie, et al. (1997). "CATH--a hierarchic classification of protein domain structures." Structure 5: 1093-108.

Perlmann, G. E. (1963). "The optical rotatory properties of pepsinogen." J Mol Biol 6: 452-64.

Rawlings, N. D., F. R. Morton, et al. (2008). "MEROPS: the peptidase database." Nucleic Acids Res 36: D320-5.

Ryle, A. (1970). "The Porcine Pepsin and Pepsinogens." Methods Enzymol 19: 316-336.

Sielecki, A. R., A. A. Fedorov, et al. (1990). "Molecular and crystal structures of monoclinic porcine pepsin refined at 1.8 A resolution." J Mol Biol 214: 143-70.

Smith, J. L., Billings, G. E., Yada, R. Y. (1991). "Chemical Modification of Amino Groups in Mucor miehei Aspartyl Proteinase, Porcine Pepsin, and Chymosin. I. Structure and Function." Agricultural and Biological Chemistry 55: 2009-2016.

Sorensen, S. P. L. (1909). "Enzymstudien II. Mitteilung. Uber die Messung und die Bedeutung der Wasserstoffionen-konzentration bei enzymatischen Prozessen." Biochem. Z. 21: 201-304.

Wu, Y., S. Kaveti, et al. (2006). "Extensive deuterium back-exchange in certain immobilized pepsin columns used for H/D exchange mass spectrometry." Anal Chem 78: 1719-23.

Zhang, Z. and D. L. Smith (1993). "Determination of amide hydrogen exchange by mass spectrometry: a new tool for protein structure elucidation." Protein Sci 2: 522-31.

CHAPTER 2

LITERATURE SEARCH

2.1. Introduction

The first part of this research project was to conduct a search of contemporary scientific literature for publications that involve pepsin digestion and mass spectrometry.

Performing this literature search was useful because it provided a large amount of cleavages from a diverse set of proteins. This extensive amount of data is a very good basis for determining pepsin specificity.

A literature analysis of pepsin specificity has been performed previously (Keil

1992). However, the search performed in this book was a very broad one. There were no limitations put on the pH at which the digestions were performed. As previously mentioned pepsin can be greatly affected by pH. Since this search takes into account digestions performed at a wide range of pH values, some of the specificity results could be skewed because of it.

The literature search performed for this project placed strict limits on the pH at which the digestions were performed. To be included in this literature search, a digestion needed to be performed within a pH range of 2.5 to 2.7. Pepsin is highly active within this range. This pH range is also of considerable importance because it is the range in which HXMS experiments are conducted. As previously mentioned, HXMS experiments are one of the main types of experiments that use pepsin digestion. As a result, a large body of published literature contains digestions performed within this pH range.

2.2. Materials and Methods

Search engines for online databases were used to conduct the literature search. As mentioned above, the literature of interest involved pepsin digestion and mass spectrometry. The initial search yielded hundreds of results. The results were narrowed by choosing only digestions that were performed within a pH range of 2.5 to 2.7.

Once the results were limited to those containing digests performed at pH 2.5 to 2.7, all of the papers were scanned to see whether they contained a peptic digest map (Figure 2.1). If two or more sources used the same peptic digest map, data were extracted from only one of them. Likewise, if two or more sources contained digest maps of the same protein from the same organism, the most comprehensive map was used, ensuring that a specific protein was considered only once when gathering data. The final results yielded peptic digest maps from 40 sources. Table 2.1 lists the references along with the protein(s) studied in each publication.

2.3. Construction of Cleavage Database

All of the peptic digest maps retrieved from the literature search were analyzed in order to create a database of pepsin cleavages. The database consists of the residues found in the P4 to P4’ positions (see Table 2.2 at the end of the chapter). As shown in Figure 2.2, residues P1 through P4 occur before the cleavage site and residues P1’ through P4’ occur after it. The database extends out to the P4 and P4’ positions because there is some evidence that residues in these positions can affect pepsin specificity (Keil 1992).
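The P4 to P4’ window around a cleavage site can be read directly from a protein sequence. The following is a minimal Python sketch of that lookup, not the procedure used to build the database; the function name `cleavage_window`, the 0-based `site` index, and the `-` padding for positions that fall off the end of the sequence are all assumptions made for illustration. The example sequence is the first two ten-residue blocks shown in Figure 2.1.

```python
def cleavage_window(sequence, site, flank=4):
    """Return the residues before and after a cleavage site.

    `site` is the 0-based index of the P1' residue, so the cleavage lies
    between sequence[site - 1] (P1) and sequence[site] (P1').
    Positions outside the sequence are padded with '-' (an assumption).
    """
    before = sequence[max(0, site - flank):site].rjust(flank, "-")  # P4..P1
    after = sequence[site:site + flank].ljust(flank, "-")           # P1'..P4'
    return before, after


# First two blocks of the sequence in Figure 2.1, written without spaces.
seq = "RRGAISAEVYTEEDAASYVR"

# Cleavage between D (index 13) and A (index 14).
print(cleavage_window(seq, 14))  # ('TEED', 'AASY')
```

Each returned pair corresponds to one row of the cleavage database: the first string holds P4 through P1 and the second holds P1’ through P4’.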

Peptic peptides

RRGAISAEVY TEEDAASYVR KVIPKDYKTM

AALAKAIEKN VLFSHLDDNE RSDIFDAMFP

VSFIAGETVI QQGDEGDNFY VIDQGEMDVY

Figure 2.1. Example of a peptic digest map. Peptic peptides are underlined in red. Pepsin will produce multiple overlapping peptides.

Table 2.1. Literature search results

Reference                          Protein
Lu, Wintrode et al. 2007           Major prion protein (human)
Man, Montagner et al. 2007         Myoglobin (sperm whale)
Brier, Maria et al. 2007           CENP-E (human)
Cheng, Cusanovich et al. 2006      Photoactive yellow protein
Cheng, Wysocki et al. 2006         cytochrome c2 (Rhodobacter capsulatus)
Hochrein, Wales et al. 2006        HIV and SIV Nef
Shi, Koeppe et al. 2006            AChBP from L. stagnalis
Tsutsui, Liu et al. 2006           α1AT (human)
Wales and Engen 2006               Lyn SH3 and α-spectrin SH3
Weis, Kjellen et al. 2006          Lck SH3
Yao, Zhou et al. 2006              Cks1 and Skp2
Catalina, Fischer et al. 2005      SH2 domains of Syk tSH2
Kang and Prevelige 2005            P22 capsid coat protein
Lee, Hoofnagle et al. 2005         ERK2
Brier, Lemaire et al. 2004         Kinesin-like protein KIF11 (human)
Casbarra, Birolo et al. 2004       Human α-LA
Croy, Koeppe et al. 2004           human and bovine α-thrombin
Croy, Bergqvist et al. 2004        IκBα
Li, Chou et al. 2004               LR3IGF-I
Mazon, Marcillat et al. 2004       Creatine kinase M-type (rabbit)
Wu, Hasan et al. 2004              human rCRALBP
Yan, Broderick et al. 2004         human RXRα LBD
Anand, Law et al. 2003             PKA
Chik and Schriemer 2003            rabbit muscle actin
Cravello, Lascoux et al. 2003      PBP-2X*
Rist, Jorgensen et al. 2003        σ32
Wintrode, Friedrich et al. 2003    HSP16.9
Hasan, Smith et al. 2002           α-Crystallin (αA and αB)
Yan, Zhang et al. 2002             rhM-CSFβ
Hughes, Mandell et al. 2001        CheB
Wang, Lane et al. 2001             BMV
Chen and Smith 2000                GroEL
Engen, Smithgall et al. 1999       Hck SH(3+2)
Wang, Li et al. 1999               cNTnC
Resing and Ahn 1998                human MKK1
Neubert, Walsh et al. 1997         recoverin
Wang, Blanchard et al. 1997        DHPR
Dharmasiri and Smith 1996          horse heart cyt c
Zhang, Post et al. 1996            rabbit muscle aldolase
Johnson and Walsh 1994             equine myoglobin


P4  P3  P2  P1  P1’ P2’ P3’ P4’
T   E   E   D   A   A   S   Y

RRGAISAEVY TEEDAASYIR KVIPKDYKTM
AALAKAIEKN VLFSHLDDNE RSDIFDAMFP
VSFIAGETVI QQGDEGDNFY VIDQGEMDVY

Figure 2.2. Example of cleavage nomenclature. The P4 through P1 residues are those that occur before the cleavage site. The P1’ through P4’ residues occur after the cleavage site.

After the database was constructed, the analysis focused on the residues in the P1 and P1’ positions, as these are believed to have the most influence on pepsin specificity (Hamuro, Coales et al. 2008). For the analysis, all of the cleavages between two specific residues were tallied to construct a matrix of cleavage data (Table 2.3).
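The tally of cleavages by P1 and P1’ residue can be sketched in a few lines of Python. This is an illustrative sketch only, not the method used to produce Table 2.3; the names `tally_cleavages` and `as_matrix` are hypothetical, and each database row is assumed to be stored as a pair of four-letter strings (P4 to P1, then P1’ to P4’). The four example rows are taken from the first entries of Table 2.2.

```python
from collections import Counter

# One-letter amino acid codes in the alphabetical order used by the tables.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def tally_cleavages(windows):
    """Count observed cleavages for each (P1, P1') residue pair.

    `windows` is an iterable of (before, after) strings, where P1 is the
    last residue of `before` and P1' is the first residue of `after`.
    """
    counts = Counter()
    for before, after in windows:
        counts[(before[-1], after[0])] += 1
    return counts


def as_matrix(counts):
    """Arrange the tally as a 20 x 20 matrix: rows are P1, columns are P1'."""
    return [[counts.get((p1, p1_prime), 0) for p1_prime in AMINO_ACIDS]
            for p1 in AMINO_ACIDS]


# First four rows of Table 2.2 (major prion protein).
example_windows = [("AAAG", "AVVG"), ("GLGG", "YMLG"),
                   ("YMLG", "SAMS"), ("LGSA", "MSRP")]
example_counts = tally_cleavages(example_windows)
print(example_counts[("G", "A")])  # 1
```

Running `tally_cleavages` over every row of the database and passing the result to `as_matrix` would give a matrix with the same layout as Table 2.3.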

As shown in Table 2.3, the residue that occurred most often before the cleavage site (the P1 position) is leucine. Out of the 1,344 cleavages, 372 (28%) occurred after a leucine. The residues that occurred most often in the P1’ position are leucine and alanine.

The residues that occur least often before the cleavage site are proline, histidine and lysine. Glycine, proline and lysine are the three residues that produce the fewest cleavages when found in the position following the cleavage site. This matrix captures all of the cleavage data from the literature search; however, it has some limitations. It is difficult to extract trends in pepsin specificity just by looking at the table. In addition, the raw data do not take into account the abundance of each amino acid. To make the values more meaningful, the data must be normalized.

2.4. Data Normalization

Normalizing the data is important because it takes into account how often a specific amino acid occurs in a protein’s sequence. In Table 2.3, for example, 36 cleavages occurred between a leucine and a leucine while only three cleavages occurred between two tryptophans. In the sequences of the proteins from the literature, two leucines occur next to one another 85 times; thus pepsin produced a cleavage between a leucine and a leucine 36 out of 85 possible times. Two tryptophans occurred next to each other in the primary structures only five times. Therefore, pepsin produced a cleavage between a tryptophan and a tryptophan three out of five times. Leucine is one of the more abundant amino acids while tryptophan is not. This is why data normalization is needed.

Table 2.3. Sum of cleavages between two residues

P1 \ P1'    A   C   D   E   F   G   H   I   K   L   M   N   P   Q   R   S   T   V   W   Y
A           5   1   2   6  10   0   1  16   4  14   2   1   1   1   2   3   2  11   0  12
C           1   1   1   1   0   0   0   1   0   2   1   0   2   0   0   1   1   3   0   1
D           6   1   3   4   6   1   0  14   0   7   1   3   4   2   2   1   2   9   3   6
E          11   1  11   5   7   1   2  10   6  20   6   6   2   4  10   1   2  16   3  10
F          17   2  13   9  10   9   3  11   8  14   3   6   2   5  10  11   5  15   2  10
G           4   2   0   3   4   3   1   4   1   5   2   1   0   1   2   3   2   3   2   4
H           1   0   1   0   0   0   0   0   0   1   0   0   0   0   0   0   0   0   0   0
I           5   0   0   1   5   0   0   2   1   5   1   2   0   1   2   3   3   3   0   1
K           3   0   2   4   0   0   0   1   2   1   0   2   0   0   0   1   3   1   0   1
L          39   9  25  28  19  13  13  17  14  36  12  15   8  14  15  22  22  25   9  17
M           7   1   3  10   6   2   1   4   9   4   4   2   0   0   0   5   1   7   2   3
N           0   1   2   1   6   2   0   5   1   2   2   1   1   0   1   1   3   7   1   4
P           0   0   1   0   0   0   0   1   0   0   0   1   0   0   0   0   1   1   0   0
Q           3   0   3   2   3   0   0   5   1   4   0   1   0   1   1   0   1   7   1   5
R           0   0   1   2   1   0   0   1   1   3   0   0   1   1   1   1   2   2   2   0
S           4   0   2   3   4   4   1   4   2   6   2   0   2   0   1   0   0   6   3   8
T           6   0   4   1   6   5   0   6   0   7   1   1   0   1   4   0   0   2   1   6
V           5   0   4   1   2   1   0   2   3   3   1   0   0   1   1   3   3   2   1   5
W           0   0   1   0   3   0   1   2   0   3   0   0   0   2   0   0   0   0   3   2
Y           3   1   0   1   4   2   0   5   1   4   2   3   0   2   1   2   2  11   2   4

Tally of cleavages between two specific residues. Residues in the P1 position are in the left-hand column and residues in the P1’ position are along the top row. For example, cleavage between a phenylalanine and glycine occurred 9 times while cleavage between a leucine and a leucine occurred 36 times.

The first step in normalizing the raw data was to gather the sequences of all of the proteins from the literature search. A matrix of possible cleavages was then constructed (Table 2.4). The total number of possible cleavages for this set of proteins is 11,240. The data were then normalized using the equation in Figure 2.3 (Keil 1992). The normalized values are shown in Table 2.5.
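The normalization can be checked numerically with a short Python sketch. The function name `normalized_value` is hypothetical, and returning 0.0 when a residue pair never occurs in the sequences is an assumption made here to avoid dividing by zero. The inputs are the counts quoted in this chapter: 1,344 observed cleavages and 11,240 possible cleavages in total.

```python
def normalized_value(observed, total_observed, possible, total_possible):
    """Ratio of the observed cleavage frequency to the pair's frequency of occurrence.

    observed        cleavages seen between the two residues
    total_observed  all cleavages in the database
    possible        times the two residues are adjacent in the sequences
    total_possible  all adjacent pairs in the sequences
    """
    if possible == 0:
        return 0.0  # assumption: a pair that never occurs gets a value of zero
    return (observed / total_observed) / (possible / total_possible)


# Phenylalanine-glycine: 9 cleavages observed, 42 adjacent occurrences.
print(round(normalized_value(9, 1344, 42, 11240), 1))   # 1.8
# Leucine-tryptophan: 9 cleavages observed, 10 adjacent occurrences.
print(round(normalized_value(9, 1344, 10, 11240), 1))   # 7.5
# Leucine-leucine: 36 cleavages observed, 85 adjacent occurrences.
print(round(normalized_value(36, 1344, 85, 11240), 1))  # 3.5
```

A value above 1 means the pair is cleaved more often than its abundance alone would predict, and a value below 1 means it is cleaved less often.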

The matrix in Table 2.5 was then transformed into a cleavage data map. The cleavage data map is a more illustrative way to represent the data than the matrix form: it is much easier to identify trends in pepsin specificity by glancing at the cleavage map than by reading the matrix. The cleavage data map is shown in Figure 2.4.

2.5. Cleavage Data Map

The cleavage data map in Figure 2.4 shows some trends in pepsin specificity. The P1 residues are on the y-axis and the P1’ residues are on the x-axis. The size of each square represents the normalized cleavage value. For example, the normalized value for cleavage between leucine and tryptophan is 7.5, while the normalized value for cleavage between phenylalanine and glycine is 1.8. The map is arranged so that the residues at which pepsin cleaves most often are nearest to the origin of the chart and those at which pepsin rarely cleaves are farther out.

Table 2.4. Possible cleavages between two residues

P1 \ P1'    A   C   D   E   F   G   H   I   K   L   M   N   P   Q   R   S   T   V   W   Y
A          89   7  55  56  24  73  18  61  55  75  22  32  35  38  34  47  45  55   9  24
C           8   1  13   7   7  12   3   7   7  14   4   5   8  10   5  12   8   9   0   4
D          47   8  47  49  27  60  12  47  42  75  16  21  27  19  34  38  18  44   9  21
E          66   6  55  90  31  58  21  48  78  79  24  34  19  33  47  29  49  50  10  23
F          34   6  39  21  22  42  11  24  31  33   6  25  17  18  24  32  23  23   2  15
G          56  11  47  58  35  58  14  49  58  61  27  34  22  36  42  51  59  59  11  34
H          12   4  10  18  12  25  44   6  18  30  13   6  21   8  14  22  14  17   1  15
I          47   8  33  27  24  31  23  32  37  61  12  16  30  25  36  44  45  45   8  20
K          77  11  48  60  27  57  27  39  68  68  16  32  37  24  30  35  34  56  11  25
L          92  16  58  88  38  61  28  34  73  85  18  50  39  54  53  82  70  51  10  24
M          25   2  19  27  14  23   6  12  28  29   7   6  11   7  13  18  15  21   2   7
N          20   7  19  36  20  26   8  27  30  48  16  15  22  14  19  29  24  34   6  14
P          30   7  34  56  28  36  14  23  30  39  12  16  16  12  20  45  24  33  11  11
Q          27   3  23  31  11  30  10  28  32  47  17  15  22  19  20  30  21  39   3   9
R          40  10  34  47  24  27  12  34  32  45   9  18  35  23  31  35  35  40   8  14
S          40  12  43  48  31  59  24  35  38  65  12  27  46  29  37  57  31  43   9  23
T          51   5  27  51  30  51  15  33  38  68  17  21  34  22  29  30  42  38   9  19
V          64  14  37  57  31  44  16  37  57  69  18  40  33  21  38  51  37  40   6  17
W           8   2   6   6   5  11   3   9  11  10   1   7   2  10   7   6   7   7   5   7
Y          18   5  20  17   7  23   4  20  22  29   9  18  21  17  22  19  23  26   3   9

Tally of possible cleavages between two specific residues. Residues in the P1 positions are in the left hand column and residues in the P1’ positions are along the top row. For example, cleavage between a phenylalanine and glycine could occur 42 times while cleavage between a leucine and a leucine could occur 85 times.


\[
\text{Normalized value} =
\frac{\left(\dfrac{\text{Frequency of cleavages between two specific residues}}{\text{Total number of cleavages}}\right)}
     {\left(\dfrac{\text{Frequency of times two specific residues appear next to each other in the sequences}}{\text{Total number of residues}}\right)}
\]

\[
\text{Normalized value} = \frac{\left(\dfrac{9}{1344}\right)}{\left(\dfrac{42}{11240}\right)} = 1.8
\]

Figure 2.3. Equation used for data normalization (Keil 1992) and example calculation. The normalization equation takes into account the abundance of amino acids. The example calculation is for cleavage between a phenylalanine and a glycine. Cleavage between these two specific residues was observed nine times in the literature, out of a total of 1344 cleavages. These two residues were found next to one another in the protein sequences 42 times. This means that cleavage between these two residues was possible 42 times out of a possible 11240 residues. Therefore the normalized value for cleavage between a phenylalanine and a glycine is 1.8.

Table 2.5. Normalized cleavage data

P1 \ P1'     A    C    D    E    F    G    H    I    K    L    M    N    P    Q    R    S    T    V    W    Y
A          0.5  1.2  0.3  0.9  3.5  0.0  0.5  2.2  0.6  1.6  0.8  0.3  0.2  0.2  0.5  0.5  0.4  1.7  0.0  4.2
C          1.0  8.4  0.6  1.2  0.0  0.0  0.0  1.2  0.0  1.2  2.1  0.0  2.1  0.0  0.0  0.7  1.0  2.8  0.0  2.1
D          1.1  1.0  0.5  0.7  1.9  0.1  0.0  2.5  0.0  0.8  0.5  1.2  1.2  0.9  0.5  0.2  0.9  1.7  2.8  2.4
E          1.4  1.4  1.7  0.5  1.9  0.1  0.8  1.7  0.6  2.1  2.1  1.5  0.9  1.0  1.8  0.3  0.3  2.7  2.5  3.6
F          4.2  2.8  2.8  3.6  3.8  1.8  2.3  3.8  2.2  3.5  4.2  2.0  1.0  2.3  3.5  2.9  1.8  5.5  8.4  5.6
G          0.6  1.5  0.0  0.4  1.0  0.4  0.6  0.7  0.1  0.7  0.6  0.2  0.0  0.2  0.4  0.5  0.3  0.4  1.5  1.0
H          0.7  0.0  0.8  0.0  0.0  0.0  0.0  0.0  0.0  0.3  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0
I          0.9  0.0  0.0  0.3  1.7  0.0  0.0  0.5  0.2  0.7  0.7  1.0  0.0  0.3  0.5  0.6  0.6  0.6  0.0  0.4
K          0.3  0.0  0.3  0.6  0.0  0.0  0.0  0.2  0.2  0.1  0.0  0.5  0.0  0.0  0.0  0.2  0.7  0.1  0.0  0.3
L          3.5  4.7  3.6  2.7  4.2  1.8  3.9  4.2  1.6  3.5  5.6  2.5  1.7  2.2  2.4  2.2  2.6  4.1  7.5  5.9
M          2.3  4.2  1.3  3.1  3.6  0.7  1.4  2.8  2.7  1.2  4.8  2.8  0.0  0.0  0.0  2.3  0.6  2.8  8.4  3.6
N          0.0  1.2  0.9  0.2  2.5  0.6  0.0  1.5  0.3  0.3  1.0  0.6  0.4  0.0  0.4  0.3  1.0  1.7  1.4  2.4
P          0.0  0.0  0.2  0.0  0.0  0.0  0.0  0.4  0.0  0.0  0.0  0.5  0.0  0.0  0.0  0.0  0.3  0.3  0.0  0.0
Q          0.9  0.0  1.1  0.5  2.3  0.0  0.0  1.5  0.3  0.7  0.0  0.6  0.0  0.4  0.4  0.0  0.4  1.5  2.8  4.6
R          0.0  0.0  0.2  0.4  0.3  0.0  0.0  0.2  0.3  0.6  0.0  0.0  0.2  0.4  0.3  0.2  0.5  0.4  2.1  0.0
S          0.8  0.0  0.4  0.5  1.1  0.6  0.3  1.0  0.4  0.8  1.4  0.0  0.4  0.0  0.2  0.0  0.0  1.2  2.8  2.9
T          1.0  0.0  1.2  0.2  1.7  0.8  0.0  1.5  0.0  0.9  0.5  0.4  0.0  0.4  1.2  0.0  0.0  0.4  0.9  2.6
V          0.7  0.0  0.9  0.1  0.5  0.2  0.0  0.5  0.4  0.4  0.5  0.0  0.0  0.4  0.2  0.5  0.7  0.4  1.4  2.5
W          0.0  0.0  1.4  0.0  5.0  0.0  2.8  1.9  0.0  2.5  0.0  0.0  0.0  1.7  0.0  0.0  0.0  0.0  5.0  2.4
Y          1.4  1.7  0.0  0.5  4.8  0.7  0.0  2.1  0.4  1.2  1.9  1.4  0.0  1.0  0.4  0.9  0.7  3.5  5.6  3.7

Residues in the P1 positions are in the left hand column and residues in the P1’ positions are along the top row.

Figure 2.4. Cleavage data map illustrating the normalized cleavage value between two specific residues. The P1 residues are on the y-axis while the P1’ residues are on the x-axis. For example, the normalized value for cleavage between phenylalanine and glycine is 1.8 while the value for cleavage between leucine and tryptophan is 7.5.

The map shows that pepsin cleaves most often after leucine, phenylalanine, and tyrosine, and most often before tryptophan, tyrosine, phenylalanine and valine. One of the more telling aspects of the map is where pepsin rarely cleaves. Pepsin rarely cleaves after proline, histidine or lysine, and rarely cleaves before glycine, proline or lysine.

2.6. Revised Cleavage Data Map

A second cleavage data map (Figure 2.5) was then produced illustrating the probability of cleavage between two specific residues as a percentage. The percentages, shown in Table 2.6, were calculated by dividing the number of times a cleavage occurred between two specific amino acids by the total number of times those residues occurred next to each other in the protein sequences. For example, cleavage between leucine and tryptophan occurred nine out of a possible ten times, or 90% of the time, while cleavage between phenylalanine and glycine occurred only nine out of a possible 42 times, or 21% of the time.

In the second cleavage data map, the x-axis and y-axis are simply ordered alphabetically. This presentation of the data makes it easier to spot trends in pepsin specificity. While both cleavage data maps use color, only in the second version do the colors carry meaning. In the first version the colors were arbitrarily assigned to the P1 residues, while in the second version the colors represent the percentage of the time a cleavage occurred between two specific residues.
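The percentage calculation can be sketched in Python as follows. The function names are hypothetical. Rounding halves upward is an assumption chosen because it reproduces the tabulated value for cysteine-alanine (1 of 8 possible cleavages, listed as 13%). The ten-point color bins follow the two ranges given for the map (21-30% and 81-90%); extending them to the whole scale, and leaving 0% without a bin, are also assumptions.

```python
import math


def cleavage_percentage(observed, possible):
    """Percentage of possible cleavages that were observed, rounded half up."""
    if possible == 0:
        return 0  # assumption: a pair that never occurs is reported as 0%
    return math.floor(100 * observed / possible + 0.5)


def percentage_bin(percent):
    """Return the (low, high) ten-point range that a non-zero percentage falls in."""
    if percent == 0:
        return None  # assumption: 0% is not assigned to any colored range
    high = -(-percent // 10) * 10  # round up to the next multiple of ten
    return (high - 9, high)


print(cleavage_percentage(9, 10))  # leucine-tryptophan: 90
print(cleavage_percentage(9, 42))  # phenylalanine-glycine: 21
print(percentage_bin(21))          # (21, 30)
print(percentage_bin(90))          # (81, 90)
```

Unlike the normalized value, this percentage depends only on the counts for the pair itself, so it is bounded between 0 and 100.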


Figure 2.5. Cleavage data map representing the probability of cleavage between two specific residues as a percentage. This map is read in the same way as Figure 2.4, with the residue occurring before the cleavage site (P1) on the y-axis and the residue occurring after the cleavage site (P1’) on the x-axis; both axes are ordered alphabetically by one-letter code. Each colored square represents a percentage range on a scale from 10 to 100 in steps of 10. For example, the brown square represents the range 21-30% and the orange square represents the range 81-90%. Cleavage between phenylalanine and glycine occurred 21% of the time while cleavage between leucine and tryptophan occurred 90% of the time.

Table 2.6. Cleavage data with probability defined as a percentage

P1 \ P1'    A   C   D   E   F   G   H   I   K   L   M   N   P   Q   R   S   T   V   W   Y
A           6  14   4  11  42   0   6  26   7  19   9   3   3   3   6   6   4  20   0  50
C          13 100   8  14   0   0   0  14   0  14  25   0  25   0   0   8  13  33   0  25
D          13  13   6   8  22   2   0  30   0   9   6  14  15  11   6   3  11  20  33  29
E          17  17  20   6  23   2  10  21   8  25  25  18  11  12  21   3   4  32  30  43
F          50  33  33  43  45  21  27  46  26  42  50  24  12  28  42  34  22  65 100  67
G           7  18   0   5  11   5   7   8   2   8   7   3   0   3   5   6   3   5  18  12
H           8   0  10   0   0   0   0   0   0   3   0   0   0   0   0   0   0   0   0   0
I          11   0   0   4  21   0   0   6   3   8   8  13   0   4   6   7   7   7   0   5
K           4   0   4   7   0   0   0   3   3   1   0   6   0   0   0   3   9   2   0   4
L          42  56  43  32  50  21  46  50  19  42  67  30  21  26  28  27  31  49  90  71
M          28  50  16  37  43   9  17  33  32  14  57  33   0   0   0  28   7  33 100  43
N           0  14  11   3  30   8   0  19   3   4  13   7   5   0   5   3  13  21  17  29
P           0   0   3   0   0   0   0   4   0   0   0   6   0   0   0   0   4   3   0   0
Q          11   0  13   6  27   0   0  18   3   9   0   7   0   5   5   0   5  18  33  56
R           0   0   3   4   4   0   0   3   3   7   0   0   3   4   3   3   6   5  25   0
S          10   0   5   6  13   7   4  11   5   9  17   0   4   0   3   0   0  14  33  35
T          12   0  15   2  20  10   0  18   0  10   6   5   0   5  14   0   0   5  11  32
V           8   0  11   2   6   2   0   5   5   4   6   0   0   5   3   6   8   5  17  29
W           0   0  17   0  60   0  33  22   0  30   0   0   0  20   0   0   0   0  60  29
Y          17  20   0   6  57   9   0  25   5  14  22  17   0  12   5  11   9  42  67  44

Residues in the P1 positions are in the left hand column and residues in the P1’ positions are along the top row. The values listed here represent the percentage of time a cleavage was observed between two specific residues.

2.7. Summary of Literature Search

The extensive search of contemporary scientific literature involving pepsin digestions performed at pH 2.5 to 2.7 showed that pepsin has some preferences in where it cleaves. The peptic digest maps found in the literature were analyzed in order to create a database of cleavages. The cleavages between the residues in the P1 and P1’ positions were tallied in order to create a matrix of data. These data were then normalized in order to take into account the abundance of amino acids in the protein sequences. The normalized data were then made into a cleavage map in order to easily see trends in pepsin specificity. The literature shows that pepsin prefers to cleave after bulky hydrophobic residues such as leucine and phenylalanine. It also shows that pepsin rarely cleaves after proline or histidine.

The data gathered from the literature search consist of only 1,344 cleavages. While this is a good starting point, much more data are needed to determine trends in the specificity of pepsin more accurately.

2.8. References

Anand, G. S., D. Law, et al. (2003). "Identification of the protein kinase A regulatory

RIalpha-catalytic subunit interface by amide H/2H exchange and protein

docking." Proc Natl Acad Sci U S A 100(23): 13264-9.

Brier, S., D. Lemaire, et al. (2004). "Identification of the protein binding region of S-

trityl-L-cysteine, a new potent inhibitor of the mitotic kinesin Eg5." Biochemistry

43(41): 13072-82.

Brier, S., G. Maria, et al. (2007). "Purification and characterization of pepsins A1 and A2

from the Antarctic rock cod Trematomus bernacchii." FEBS J 274(23): 6152-66.

Casbarra, A., L. Birolo, et al. (2004). "Conformational analysis of HAMLET, the folding

variant of human alpha-lactalbumin associated with apoptosis." Protein Sci 13(5):

1322-30.

Catalina, M. I., M. J. Fischer, et al. (2005). "Binding of a diphosphorylated-ITAM

peptide to spleen tyrosine kinase (Syk) induces distal conformational changes: a

hydrogen exchange mass spectrometry study." J Am Soc Mass Spectrom 16(7):

1039-51.

Chen, J. and D. L. Smith (2000). "Unfolding and disassembly of the chaperonin GroEL

occurs via a tetradecameric intermediate with a folded equatorial domain."

Biochemistry 39(15): 4250-8.

Cheng, G., M. A. Cusanovich, et al. (2006). "Properties of the dark and signaling states of

photoactive yellow protein probed by solution phase hydrogen/deuterium

exchange and mass spectrometry." Biochemistry 45(39): 11744-51.

Cheng, G., V. H. Wysocki, et al. (2006). "Local stability of Rhodobacter capsulatus

cytochrome c2 probed by solution phase hydrogen/deuterium exchange and mass

spectrometry." J Am Soc Mass Spectrom 17(11): 1518-25.

Chik, J. K. and D. C. Schriemer (2003). "Hydrogen/deuterium exchange mass

spectrometry of actin in various biochemical contexts." J Mol Biol 334(3): 373-85.

Cravello, L., D. Lascoux, et al. (2003). "Use of different proteases working in acidic

conditions to improve sequence coverage and resolution in hydrogen/deuterium

exchange of large proteins." Rapid Commun Mass Spectrom 17(21): 2387-93.

Croy, C. H., S. Bergqvist, et al. (2004). "Biophysical characterization of the free

IkappaBalpha ankyrin repeat domain in solution." Protein Sci 13(7): 1767-77.

Croy, C. H., J. R. Koeppe, et al. (2004). "Allosteric changes in solvent accessibility

observed in thrombin upon active site occupation." Biochemistry 43(18): 5246-55.

Dharmasiri, K. and D. L. Smith (1996). "Mass spectrometric determination of isotopic

exchange rates of amide hydrogens located on the surfaces of proteins." Anal

Chem 68(14): 2340-4.

Engen, J. R., T. E. Smithgall, et al. (1999). "Comparison of SH3 and SH2 domain

dynamics when expressed alone or in an SH(3+2) construct: the role of protein

dynamics in functional regulation." J Mol Biol 287(3): 645-56.

Hamuro, Y., S. J. Coales, et al. (2008). "Specificity of immobilized porcine pepsin in

H/D exchange compatible conditions." Rapid Commun Mass Spectrom 22(7):

1041-6.

Hasan, A., D. L. Smith, et al. (2002). "Alpha-crystallin regions affected by adenosine 5'-

triphosphate identified by hydrogen-deuterium exchange." Biochemistry 41(52):

15876-82.

Hochrein, J. M., T. E. Wales, et al. (2006). "Conformational features of the full-length

HIV and SIV Nef proteins determined by mass spectrometry." Biochemistry

45(25): 7733-9.

Hughes, C. A., J. G. Mandell, et al. (2001). "Phosphorylation causes subtle changes in

solvent accessibility at the interdomain interface of methylesterase CheB." J Mol

Biol 307(4): 967-76.

Johnson, R. S. and K. A. Walsh (1994). "Mass spectrometric measurement of protein

amide hydrogen exchange rates of apo- and holo-myoglobin." Protein Sci 3(12):

2411-8.

Kang, S. and P. E. Prevelige, Jr. (2005). "Domain study of bacteriophage p22 coat protein

and characterization of the capsid lattice transformation by hydrogen/deuterium

exchange." J Mol Biol 347(5): 935-48.

Keil, B. (1992). Specificity of Proteolysis. New York, Springer-Verlag.

Lee, T., A. N. Hoofnagle, et al. (2005). "Hydrogen exchange solvent protection by an

ATP analogue reveals conformational changes in ERK2 upon activation." J Mol

Biol 353(3): 600-12.

Li, X., Y. T. Chou, et al. (2004). "Integration of hydrogen/deuterium exchange and

cyanylation-based methodology for conformational studies of cystinyl proteins."

Anal Biochem 331(1): 130-7.

Lu, X., P. L. Wintrode, et al. (2007). "Beta-sheet core of human prion protein amyloid

fibrils as determined by hydrogen/deuterium exchange." Proc Natl Acad Sci U S

A 104(5): 1510-5.

Man, P., C. Montagner, et al. (2007). "Defining the interacting regions between

apomyoglobin and lipid membrane by hydrogen/deuterium exchange coupled to

mass spectrometry." J Mol Biol 368(2): 464-72.

Mazon, H., O. Marcillat, et al. (2004). "Hydrogen/deuterium exchange studies of native

rabbit MM-CK dynamics." Protein Sci 13(2): 476-86.

Neubert, T. A., K. A. Walsh, et al. (1997). "Monitoring calcium-induced conformational

changes in recoverin by electrospray mass spectrometry." Protein Sci 6(4): 843-50.

Resing, K. A. and N. G. Ahn (1998). "Deuterium exchange mass spectrometry as a probe

of protein kinase activation. Analysis of wild-type and constitutively active

mutants of MAP kinase kinase-1." Biochemistry 37(2): 463-75.

Rist, W., T. J. Jorgensen, et al. (2003). "Mapping temperature-induced conformational

changes in the Escherichia coli heat shock transcription factor sigma 32 by amide

hydrogen exchange." J Biol Chem 278(51): 51415-21.

Shi, J., J. R. Koeppe, et al. (2006). "Ligand-induced conformational changes in the

acetylcholine-binding protein analyzed by hydrogen-deuterium exchange mass

spectrometry." J Biol Chem 281(17): 12170-7.

Tsutsui, Y., L. Liu, et al. (2006). "The conformational dynamics of a metastable serpin

studied by hydrogen exchange and mass spectrometry." Biochemistry 45(21):

6561-9.

Wales, T. E. and J. R. Engen (2006). "Partial unfolding of diverse SH3 domains on a

wide timescale." J Mol Biol 357(5): 1592-604.

Wang, F., J. S. Blanchard, et al. (1997). "Hydrogen exchange/electrospray ionization

mass spectrometry studies of substrate and inhibitor binding and conformational

changes of Escherichia coli dihydrodipicolinate reductase." Biochemistry 36(13):

3755-9.

Wang, F., W. Li, et al. (1999). "Fourier transform ion cyclotron resonance mass

spectrometric detection of small Ca(2+)-induced conformational changes in the

regulatory domain of human cardiac troponin C." J Am Soc Mass Spectrom

10(8): 703-10.

Wang, L., L. C. Lane, et al. (2001). "Detecting structural changes in viral capsids by

hydrogen exchange and mass spectrometry." Protein Sci 10(6): 1234-43.

Weis, D. D., P. Kjellen, et al. (2006). "Altered dynamics in Lck SH3 upon binding to the

LBD1 domain of Herpesvirus saimiri Tip." Protein Sci 15(10): 2402-10.

Wintrode, P. L., K. L. Friedrich, et al. (2003). "Solution structure and dynamics of a heat

shock protein assembly probed by hydrogen exchange and mass spectrometry."

Biochemistry 42(36): 10667-73.

Wu, Z., A. Hasan, et al. (2004). "Identification of CRALBP ligand interactions by

photoaffinity labeling, hydrogen/deuterium exchange, and structural modeling." J

Biol Chem 279(26): 27357-64.

Yan, X., D. Broderick, et al. (2004). "Dynamics and ligand-induced solvent accessibility

changes in human retinoid X receptor homodimer determined by hydrogen

deuterium exchange and mass spectrometry." Biochemistry 43(4): 909-17.

Yan, X., H. Zhang, et al. (2002). "Hydrogen/deuterium exchange and mass spectrometric

analysis of a protein containing multiple disulfide bonds: Solution structure of

recombinant macrophage colony stimulating factor-beta (rhM-CSFbeta)." Protein

Sci 11(9): 2113-24.

Yao, Z. P., M. Zhou, et al. (2006). "Activation of ubiquitin ligase SCF(Skp2) by Cks1:

insights from hydrogen exchange mass spectrometry." J Mol Biol 363(3): 673-86.

Zhang, Z. (1995). Protein Hydrogen Exchange Determined by Mass Spectrometry: A

New Tool for Probing Protein High-order Structure and Structural Changes.

Purdue. Ph.D.

Zhang, Z., C. B. Post, et al. (1996). "Amide hydrogen exchange determined by mass

spectrometry: application to rabbit muscle aldolase." Biochemistry 35(3): 779-91.

Table 2.2. Cleavage database

Reference                                   Protein                       P4  P3  P2  P1  P1' P2' P3' P4'
Lu, X., P. L. Wintrode, et al. (2007)       Major prion protein (human)   A   A   A   G   A   V   V   G
Lu, X., P. L. Wintrode, et al. (2007)       Major prion protein (human)   G   L   G   G   Y   M   L   G
Lu, X., P. L. Wintrode, et al. (2007)       Major prion protein (human)   Y   M   L   G   S   A   M   S
Lu, X., P. L. Wintrode, et al. (2007)       Major prion protein (human)   L   G   S   A   M   S   R   P
Lu, X., P. L. Wintrode, et al. (2007)       Major prion protein (human)   G   S   A   M   S   R   P   I
Lu, X., P. L. Wintrode, et al. (2007)       Major prion protein (human)   F   G   S   D   Y   E   D   R
Lu, X., P. L. Wintrode, et al. (2007)       Major prion protein (human)   E   D   R   Y   Y   R   E   N
Lu, X., P. L. Wintrode, et al. (2007)       Major prion protein (human)   Y   Y   R   E   N   M   H   R
Lu, X., P. L. Wintrode, et al. (2007)       Major prion protein (human)   R   E   N   M   H   R   V   P
Lu, X., P. L. Wintrode, et al. (2007)       Major prion protein (human)   Y   P   N   Q   V   Y   Y   R
Lu, X., P. L. Wintrode, et al. (2007)       Major prion protein (human)   P   N   Q   V   Y   Y   R   P
Lu, X., P. L. Wintrode, et al. (2007)       Major prion protein (human)   N   Q   V   Y   Y   R   P   M
Lu, X., P. L. Wintrode, et al. (2007)       Major prion protein (human)   P   M   D   E   Y   S   N   Q
Lu, X., P. L. Wintrode, et al. (2007)       Major prion protein (human)   D   C   V   N   I   T   I   K
Lu, X., P. L. Wintrode, et al. (2007)       Major prion protein (human)   C   V   N   I   T   I   K   Q
Lu, X., P. L. Wintrode, et al. (2007)       Major prion protein (human)   V   N   I   T   I   K   Q   H
Lu, X., P. L. Wintrode, et al. (2007)       Major prion protein (human)   K   G   E   N   F   T   E   T
Lu, X., P. L. Wintrode, et al. (2007)       Major prion protein (human)   F   T   E   T   D   V   K   M
Lu, X., P. L. Wintrode, et al. (2007)       Major prion protein (human)   T   E   T   D   V   K   M   M
Lu, X., P. L. Wintrode, et al. (2007)       Major prion protein (human)   D   V   K   M   M   E   R   V
Lu, X., P. L. Wintrode, et al. (2007)       Major prion protein (human)   K   M   M   E   R   V   V   E
Lu, X., P. L. Wintrode, et al. (2007)       Major prion protein (human)   V   E   Q   M   C   I   T   Q
Lu, X., P. L. Wintrode, et al. (2007)       Major prion protein (human)   C   I   T   Q   Y   E   R   E
Lu, X., P. L. Wintrode, et al. (2007)       Major prion protein (human)   E   S   Q   A   Y   Y   Q   R
Man, P., C. Montagner, et al. (2007)        Myoglobin (sperm whale)       MVLSEGE
Man, P., C. Montagner, et al. (2007)        Myoglobin (sperm whale)       S   E   E   E   W   Q   L   V
Man, P., C. Montagner, et al. (2007)        Myoglobin (sperm whale)       E   G   G   W   Q   L   V   L
Man, P., C. Montagner, et al. (2007)        Myoglobin (sperm whale)       Q   L   V   L   H   V   W   A
Man, P., C. Montagner, et al. (2007)        Myoglobin (sperm whale)       V   L   H   V   W   A   K   V
Man, P., C. Montagner, et al. (2007)        Myoglobin (sperm whale)       Q   D   I   L   I   R   L   F
Man, P., C. Montagner, et al. (2007)        Myoglobin (sperm whale)       L   I   R   L   F   K   S   H
Man, P., C. Montagner, et al. (2007)        Myoglobin (sperm whale)       E   A   E   M   K   A   S   E
Man, P., C. Montagner, et al. (2007)        Myoglobin (sperm whale)       V   T   V   L   T   A   L   G
Man, P., C. Montagner, et al. (2007)        Myoglobin (sperm whale)       L   T   A   L   G   A   I   L
Man, P., C. Montagner, et al. (2007)        Myoglobin (sperm whale)       I   P   I   K   Y   L   E   F
Man, P., C. Montagner, et al. (2007)        Myoglobin (sperm whale)       Y   L   E   F   I   S   E   A
Man, P., C. Montagner, et al. (2007)        Myoglobin (sperm whale)       F   I   S   E   A   I   I   H
Man, P., C. Montagner, et al. (2007)        Myoglobin (sperm whale)       I   S   E   A   I   I   H   V
Man, P., C. Montagner, et al. (2007)        Myoglobin (sperm whale)       A   L   E   L   F   R   K   D
Brier, S., E. Carletti, et al. (2006)       CENP-E (human)                MAEEG
Brier, S., E. Carletti, et al. (2006)       CENP-E (human)                G   A   V   A   V   C   V   R
Brier, S., E. Carletti, et al. (2006)       CENP-E (human)                V   A   V   C   V   R   V   R
Brier, S., E. Carletti, et al. (2006)       CENP-E (human)                P   L   N   S   R   E   E   S
Brier, S., E. Carletti, et al. (2006)       CENP-E (human)                R   P   L   N   S   R   E   E
Brier, S., E. Carletti, et al. (2006)       CENP-E (human)                N   S   R   E   E   S   L   G
Brier, S., E. Carletti, et al. (2006)       CENP-E (human)                E   T   A   Q   V   Y   W   K
Brier, S., E. Carletti, et al. (2006)       CENP-E (human)                T   A   Q   V   Y   W   K   T
Brier, S., E. Carletti, et al. (2006)       CENP-E (human)                K   T   D   N   N   V   I   Y
Brier, S., E. Carletti, et al. (2006)       CENP-E (human)                T   D   N   N   V   I   Y   Q
Brier, S., E. Carletti, et al. (2006)       CENP-E (human)                N   V   I   Y   Q   V   D   G

Brier, S., E. Carletti, et al. (2006)       CENP-E (human)                I   Y   Q   V   D   G   S   K
Brier, S., E. Carletti, et al. (2006)       CENP-E (human)                K   S   F   N   F   D   R   V
Brier, S., E. Carletti, et al. (2006)       CENP-E (human)                S   F   N   F   D   R   V   F
Brier, S., E. Carletti, et al. (2006)       CENP-E (human)                T   T   K   N   V   Y   E   E
Brier, S., E. Carletti, et al. (2006)       CENP-E (human)                K   N   V   Y   E   E   I   A
Brier, S., E. Carletti, et al. (2006)       CENP-E (human)                N   V   Y   E   E   I   A   A
Brier, S., E. Carletti, et al. (2006)       CENP-E (human)                V   Y   E   E   I   A   A   P
Brier, S., E. Carletti, et al. (2006)       CENP-E (human)                E   I   A   A   P   I   I   D
Brier, S., E. Carletti, et al. (2006)       CENP-E (human)                P   I   I   D   S   A   I   Q
Brier, S., E. Carletti, et al. (2006)       CENP-E (human)                I   I   D   S   A   I   Q   G
Brier, S., E. Carletti, et al. (2006)       CENP-E (human)                I   D   S   A   I   Q   G   Y
Brier, S., E. Carletti, et al. (2006)       CENP-E (human)                G   T   I   F   A   Y   G   Q
Brier, S., E. Carletti, et al. (2006)       CENP-E (human)                T   I   F   A   Y   G   Q   T
Brier, S., E. Carletti, et al. (2006)       CENP-E (human)                S   G   K   T   Y   T   M   M
Brier, S., E. Carletti, et al. (2006)       CENP-E (human)                Y   T   M   M   G   S   E   D
Brier, S., E. Carletti, et al. (2006)       CENP-E (human)                E   D   H   L   G   V   I   P
Brier, S., E. Carletti, et al. (2006)       CENP-E (human)                H   D   I   F   Q   K   I   K
Brier, S., E. Carletti, et al. (2006)       CENP-E (human)                R   E   F   L   L   R   V   S
Brier, S., E. Carletti, et al. (2006)       CENP-E (human)                E   F   L   L   R   V   S   Y
Brier, S., E. Carletti, et al. (2006)       CENP-E (human)                R   V   S   Y   M   E   I   Y
Brier, S., E. Carletti, et al. (2006)       CENP-E (human)                V   S   Y   M   E   I   Y   N
Brier, S., E. Carletti, et al. (2006)       CENP-E (human)                I   T   D   L   L   C   G   T
Brier, S., E. Carletti, et al. (2006)       CENP-E (human)                T   D   L   L   C   G   T   Q
Brier, S., E. Carletti, et al. (2006)       CENP-E (human)                I   I   R   E   D   V   N   R
Brier, S., E. Carletti, et al. (2006)       CENP-E (human)                R   N   V   Y   V   A   D   L
Brier, S., E. Carletti, et al. (2006)       CENP-E (human)                Y   V   A   D   L   T   E   E
Brier, S., E. Carletti, et al. (2006)       CENP-E (human)                V   A   D   L   T   E   E   V
Brier, S., E. Carletti, et al. (2006)       CENP-E (human)                L   T   E   E   V   V   Y   T
Brier, S., E. Carletti, et al. (2006)       CENP-E (human)                Y   T   S   E   M   A   L   K
Brier, S., E. Carletti, et al. (2006)       CENP-E (human)                T   S   E   M   A   L   K   W
Brier, S., E. Carletti, et al. (2006)       CENP-E (human)                E   T   K   M   N   Q   R   S
Brier, S., E. Carletti, et al. (2006)       CENP-E (human)                H   T   I   F   R   M   I   L
Brier, S., E. Carletti, et al. (2006)       CENP-E (human)                R   M   I   L   E   S   R   E
Brier, S., E. Carletti, et al. (2006)       CENP-E (human)                P   S   N   C   E   G   S   V
Brier, S., E. Carletti, et al. (2006)       CENP-E (human)                V   S   H   L   N   L   V   D
Brier, S., E. Carletti, et al. (2006)       CENP-E (human)                S   H   L   N   L   V   D   L
Brier, S., E. Carletti, et al. (2006)       CENP-E (human)                H   L   N   L   V   D   L   A
Brier, S., E. Carletti, et al. (2006)       CENP-E (human)                A   G   S   E   R   A   A   Q
Brier, S., E. Carletti, et al. (2006)       CENP-E (human)                A   A   Q   T   G   A   A   G
Brier, S., E. Carletti, et al. (2006)       CENP-E (human)                G   A   A   G   V   R   L   K
Brier, S., E. Carletti, et al. (2006)       CENP-E (human)                V   R   L   K   E   G   C   N
Brier, S., E. Carletti, et al. (2006)       CENP-E (human)                E   G   C   N   I   N   R   S
Brier, S., E. Carletti, et al. (2006)       CENP-E (human)                G   C   N   I   N   R   S   L
Brier, S., E. Carletti, et al. (2006)       CENP-E (human)                N   R   S   L   F   I   L   G
Brier, S., E. Carletti, et al. (2006)       CENP-E (human)                R   S   L   F   I   L   G   Q
Brier, S., E. Carletti, et al. (2006)       CENP-E (human)                V   G   G   F   I   N   Y   R
Brier, S., E. Carletti, et al. (2006)       CENP-E (human)                G   G   F   I   N   Y   R   D
Brier, S., E. Carletti, et al. (2006)       CENP-E (human)                F   I   N   Y   R   D   S   K
Brier, S., E. Carletti, et al. (2006)       CENP-E (human)                D   S   K   L   T   R   I   L
Brier, S., E. Carletti, et al. (2006)       CENP-E (human)                N   A   K   T   R   I   I   C
Brier, S., E. Carletti, et al. (2006)       CENP-E (human)                R   I   I   C   T   I   T   P

Brier, S., E. Carletti, et al. (2006) CENP-E (human) I I C T I T P V
Brier, S., E. Carletti, et al. (2006) CENP-E (human) I T P V S F D E
Brier, S., E. Carletti, et al. (2006) CENP-E (human) P V S F D E T L
Brier, S., E. Carletti, et al. (2006) CENP-E (human) T L T A L Q F A
Brier, S., E. Carletti, et al. (2006) CENP-E (human) L T A L Q F A S
Brier, S., E. Carletti, et al. (2006) CENP-E (human) T A L Q F A S T
Brier, S., E. Carletti, et al. (2006) CENP-E (human) A L Q F A S T A
Brier, S., E. Carletti, et al. (2006) CENP-E (human) F A S T A K Y M
Brier, S., E. Carletti, et al. (2006) CENP-E (human) Y V N E V S T D
Brier, S., E. Carletti, et al. (2006) CENP-E (human) V N E V S T D L
Cheng, G., M. A. Cusanovich, et al. (2006) Photoactive yellow protein H V A F G S E D
Cheng, G., M. A. Cusanovich, et al. (2006) Photoactive yellow protein F G S E D I E N
Cheng, G., M. A. Cusanovich, et al. (2006) Photoactive yellow protein G S E D I E N T
Cheng, G., M. A. Cusanovich, et al. (2006) Photoactive yellow protein E N T L A K M D
Cheng, G., M. A. Cusanovich, et al. (2006) Photoactive yellow protein D G Q L D G L A
Cheng, G., M. A. Cusanovich, et al. (2006) Photoactive yellow protein L D G L A F G A
Cheng, G., M. A. Cusanovich, et al. (2006) Photoactive yellow protein A I Q L D G D G
Cheng, G., M. A. Cusanovich, et al. (2006) Photoactive yellow protein N I L Q Y N A A
Cheng, G., M. A. Cusanovich, et al. (2006) Photoactive yellow protein I L Q Y N A A E
Cheng, G., M. A. Cusanovich, et al. (2006) Photoactive yellow protein I G K N F F K D
Cheng, G., M. A. Cusanovich, et al. (2006) Photoactive yellow protein G K N F F K D V
Cheng, G., M. A. Cusanovich, et al. (2006) Photoactive yellow protein S P E F Y G K F
Cheng, G., M. A. Cusanovich, et al. (2006) Photoactive yellow protein S G N L N T M F
Cheng, G., M. A. Cusanovich, et al. (2006) Photoactive yellow protein N L N T M F E Y
Cheng, G., M. A. Cusanovich, et al. (2006) Photoactive yellow protein L N T M F E Y T
Cheng, G., M. A. Cusanovich, et al. (2006) Photoactive yellow protein E Y T F D Y Q M
Cheng, G., M. A. Cusanovich, et al. (2006) Photoactive yellow protein Y T F D Y Q M T
Cheng, G., M. A. Cusanovich, et al. (2006) Photoactive yellow protein S G D S Y W V F
Cheng, G., M. A. Cusanovich, et al. (2006) Photoactive yellow protein G D S Y W V F V
Cheng, G., M. A. Cusanovich, et al. (2006) Photoactive yellow protein Y W V F V K R V
Cheng, G., V. H. Wysocki, et al. (2006) cytochrome c2 (Rhodobacter capsulatus) G D A A K G E K
Cheng, G., V. H. Wysocki, et al. (2006) cytochrome c2 (Rhodobacter capsulatus) H S I I A P D G
Cheng, G., V. H. Wysocki, et al. (2006) cytochrome c2 (Rhodobacter capsulatus) D G T E I V K G
Cheng, G., V. H. Wysocki, et al. (2006) cytochrome c2 (Rhodobacter capsulatus) G P N L Y G V V
Cheng, G., V. H. Wysocki, et al. (2006) cytochrome c2 (Rhodobacter capsulatus) Y P E F K Y K D
Cheng, G., V. H. Wysocki, et al. (2006) cytochrome c2 (Rhodobacter capsulatus) I V A L G A S G
Cheng, G., V. H. Wysocki, et al. (2006) cytochrome c2 (Rhodobacter capsulatus) S I V A L G A S
Cheng, G., V. H. Wysocki, et al. (2006) cytochrome c2 (Rhodobacter capsulatus) G A S G F A W T
Cheng, G., V. H. Wysocki, et al. (2006) cytochrome c2 (Rhodobacter capsulatus) A S G F A W T E
Cheng, G., V. H. Wysocki, et al. (2006) cytochrome c2 (Rhodobacter capsulatus) W T E E D I A T
Cheng, G., V. H. Wysocki, et al. (2006) cytochrome c2 (Rhodobacter capsulatus) T E E D I A T Y
Cheng, G., V. H. Wysocki, et al. (2006) cytochrome c2 (Rhodobacter capsulatus) D I A T Y V K D
Cheng, G., V. H. Wysocki, et al. (2006) cytochrome c2 (Rhodobacter capsulatus) P G A F L K E K
Cheng, G., V. H. Wysocki, et al. (2006) cytochrome c2 (Rhodobacter capsulatus) G A F L K E K L
Cheng, G., V. H. Wysocki, et al. (2006) cytochrome c2 (Rhodobacter capsulatus) T G M A F K L A
Cheng, G., V. H. Wysocki, et al. (2006) cytochrome c2 (Rhodobacter capsulatus) E D V A A Y L A
Cheng, G., V. H. Wysocki, et al. (2006) cytochrome c2 (Rhodobacter capsulatus) D V A A Y L A S
Cheng, G., V. H. Wysocki, et al. (2006) cytochrome c2 (Rhodobacter capsulatus) Y L A S V V K
Hochrein, J. M., T. E. Wales, et al. (2006) HIV Nef G W S A I R E R
Hochrein, J. M., T. E. Wales, et al. (2006) HIV Nef D G V G A V S R
Hochrein, J. M., T. E. Wales, et al. (2006) HIV Nef N A D C A W L E

Hochrein, J. M., T. E. Wales, et al. (2006) HIV Nef A Q E E E E V G
Hochrein, J. M., T. E. Wales, et al. (2006) HIV Nef R P M T Y K A A
Hochrein, J. M., T. E. Wales, et al. (2006) HIV Nef K A A L D I S H
Hochrein, J. M., T. E. Wales, et al. (2006) HIV Nef I S H F L K E K
Hochrein, J. M., T. E. Wales, et al. (2006) HIV Nef K G G L E G L I
Hochrein, J. M., T. E. Wales, et al. (2006) HIV Nef I L D L W I Y H
Hochrein, J. M., T. E. Wales, et al. (2006) HIV Nef L D L W I Y H T
Hochrein, J. M., T. E. Wales, et al. (2006) HIV Nef Q G Y F P D W Q
Hochrein, J. M., T. E. Wales, et al. (2006) HIV Nef D W Q N Y T P G
Hochrein, J. M., T. E. Wales, et al. (2006) HIV Nef L T F G W C F K
Hochrein, J. M., T. E. Wales, et al. (2006) HIV Nef G W C F K L V P
Hochrein, J. M., T. E. Wales, et al. (2006) HIV Nef K V E E A N E G
Hochrein, J. M., T. E. Wales, et al. (2006) HIV Nef P M S L H G M E
Hochrein, J. M., T. E. Wales, et al. (2006) HIV Nef K E V L V W R F
Hochrein, J. M., T. E. Wales, et al. (2006) HIV Nef D S K L A F H H
Hochrein, J. M., T. E. Wales, et al. (2006) HIV Nef H P E Y Y K D C
Hochrein, J. M., T. E. Wales, et al. (2006) SIV Nef H G G A I S M R
Hochrein, J. M., T. E. Wales, et al. (2006) SIV Nef G D L R Q R L L
Hochrein, J. M., T. E. Wales, et al. (2006) SIV Nef Y G R L L G E V
Hochrein, J. M., T. E. Wales, et al. (2006) SIV Nef L S S L S C E G
Hochrein, J. M., T. E. Wales, et al. (2006) SIV Nef N Q G Q Y M N T
Hochrein, J. M., T. E. Wales, et al. (2006) SIV Nef P A E E R E K L
Hochrein, J. M., T. E. Wales, et al. (2006) SIV Nef N M D D I D E E
Hochrein, J. M., T. E. Wales, et al. (2006) SIV Nef D D D L V G V S
Hochrein, J. M., T. E. Wales, et al. (2006) SIV Nef R T M S Y K L A
Hochrein, J. M., T. E. Wales, et al. (2006) SIV Nef S Y K L A I D M
Hochrein, J. M., T. E. Wales, et al. (2006) SIV Nef M S H F I K E K
Hochrein, J. M., T. E. Wales, et al. (2006) SIV Nef E G I Y Y S A R
Hochrein, J. M., T. E. Wales, et al. (2006) SIV Nef R I L D I Y L E
Hochrein, J. M., T. E. Wales, et al. (2006) SIV Nef D I Y L E K E E
Hochrein, J. M., T. E. Wales, et al. (2006) SIV Nef D W Q D Y T S G
Hochrein, J. M., T. E. Wales, et al. (2006) SIV Nef F G W L W K L V
Hochrein, J. M., T. E. Wales, et al. (2006) SIV Nef E A Q E D E E H
Hochrein, J. M., T. E. Wales, et al. (2006) SIV Nef Q T S Q W D D P
Hochrein, J. M., T. E. Wales, et al. (2006) SIV Nef Y E A Y V R Y P
Hochrein, J. M., T. E. Wales, et al. (2006) SIV Nef K S G L S E E E
Hochrein, J. M., T. E. Wales, et al. (2006) SIV Nef L L N M A D K K
Shi, J., J. R. Koeppe, et al. (2006) AChBP from L. stagnalis R A D I L Y N I
Shi, J., J. R. Koeppe, et al. (2006) AChBP from L. stagnalis A D I L Y N I R
Shi, J., J. R. Koeppe, et al. (2006) AChBP from L. stagnalis R P V A V S V S
Shi, J., J. R. Koeppe, et al. (2006) AChBP from L. stagnalis P V A V S V S L
Shi, J., J. R. Koeppe, et al. (2006) AChBP from L. stagnalis D V V F W Q Q T
Shi, J., J. R. Koeppe, et al. (2006) AChBP from L. stagnalis S D R T L A W N
Shi, J., J. R. Koeppe, et al. (2006) AChBP from L. stagnalis D L A A Y N A I
Shi, J., J. R. Koeppe, et al. (2006) AChBP from L. stagnalis T P Q L A R V V
Shi, J., J. R. Koeppe, et al. (2006) AChBP from L. stagnalis G E V L Y M P S
Shi, J., J. R. Koeppe, et al. (2006) AChBP from L. stagnalis D D S E Y F S Q
Shi, J., J. R. Koeppe, et al. (2006) AChBP from L. stagnalis D S E Y F S Q Y
Shi, J., J. R. Koeppe, et al. (2006) AChBP from L. stagnalis Y S R F E I L D
Shi, J., J. R. Koeppe, et al. (2006) AChBP from L. stagnalis F E I L D V T Q
Shi, J., J. R. Koeppe, et al. (2006) AChBP from L. stagnalis E A Y E D V E V
Shi, J., J. R. Koeppe, et al. (2006) AChBP from L. stagnalis E D V E V S L N
Shi, J., J. R. Koeppe, et al. (2006) AChBP from L. stagnalis E V S L N F R K
Shi, J., J. R. Koeppe, et al. (2006) AChBP from L. stagnalis V S L N F R K K
Shi, J., J. R. Koeppe, et al. (2006) AChBP from L. stagnalis S L N F R K K G

Tsutsui, Y., L. Liu, et al. (2006) α1AT (human) F N K I T
Tsutsui, Y., L. Liu, et al. (2006) α1AT (human) N L A E F A F S
Tsutsui, Y., L. Liu, et al. (2006) α1AT (human) A F S L Y R Q L
Tsutsui, Y., L. Liu, et al. (2006) α1AT (human) S T N I F F S P
Tsutsui, Y., L. Liu, et al. (2006) α1AT (human) I A T A F A M L
Tsutsui, Y., L. Liu, et al. (2006) α1AT (human) A F A M L S L G
Tsutsui, Y., L. Liu, et al. (2006) α1AT (human) G T K A D T H D
Tsutsui, Y., L. Liu, et al. (2006) α1AT (human) D E I L E G L N
Tsutsui, Y., L. Liu, et al. (2006) α1AT (human) N F N L T E I P
Tsutsui, Y., L. Liu, et al. (2006) α1AT (human) Q E L L H T L N
Tsutsui, Y., L. Liu, et al. (2006) α1AT (human) N G L F L S E G
Tsutsui, Y., L. Liu, et al. (2006) α1AT (human) L K L V D K F L
Tsutsui, Y., L. Liu, et al. (2006) α1AT (human) H S E A F T V N
Tsutsui, Y., L. Liu, et al. (2006) α1AT (human) Q I N D Y V E K
Tsutsui, Y., L. Liu, et al. (2006) α1AT (human) I V D L V K E L
Tsutsui, Y., L. Liu, et al. (2006) α1AT (human) D T V F A L V N
Tsutsui, Y., L. Liu, et al. (2006) α1AT (human) Y I F F K G K N
Tsutsui, Y., L. Liu, et al. (2006) α1AT (human) E E D F H V D Q
Tsutsui, Y., L. Liu, et al. (2006) α1AT (human) Q V T T V K V P
Tsutsui, Y., L. Liu, et al. (2006) α1AT (human) L G M F N I Q H
Tsutsui, Y., L. Liu, et al. (2006) α1AT (human) K L S S W V L L
Tsutsui, Y., L. Liu, et al. (2006) α1AT (human) S W V L L M K Y
Tsutsui, Y., L. Liu, et al. (2006) α1AT (human) A T A I F F L P
Tsutsui, Y., L. Liu, et al. (2006) α1AT (human) I T K F L E N E
Tsutsui, Y., L. Liu, et al. (2006) α1AT (human) T Y D L K S V L
Tsutsui, Y., L. Liu, et al. (2006) α1AT (human) K S V L G Q L G
Tsutsui, Y., L. Liu, et al. (2006) α1AT (human) N G A D L S G V
Tsutsui, Y., L. Liu, et al. (2006) α1AT (human) V T E E A P L K
Tsutsui, Y., L. Liu, et al. (2006) α1AT (human) L K L S K A V H
Tsutsui, Y., L. Liu, et al. (2006) α1AT (human) A V L T I D E K
Tsutsui, Y., L. Liu, et al. (2006) α1AT (human) A G A M F L E A
Tsutsui, Y., L. Liu, et al. (2006) α1AT (human) P F V F L M I D
Tsutsui, Y., L. Liu, et al. (2006) α1AT (human) S P L F M G K V
Wales, T. E. and J. R. Engen (2006) Lyn SH3 D I V V A L Y P
Wales, T. E. and J. R. Engen (2006) Lyn SH3 F K K G E K M K
Wales, T. E. and J. R. Engen (2006) Lyn SH3 H G E W W K A K
Wales, T. E. and J. R. Engen (2006) Lyn SH3 E W W K A K S L
Wales, T. E. and J. R. Engen (2006) Lyn SH3 K S L L T K E E
Wales, T. E. and J. R. Engen (2006) Lyn SH3 K E G F I P S N
Wales, T. E. and J. R. Engen (2006) Lyn SH3 P S N Y V A K L
Wales, T. E. and J. R. Engen (2006) Lyn SH3 A K L N T L E T
Wales, T. E. and J. R. Engen (2006) α-spectrin SH3 A L Y D Y Q E K
Wales, T. E. and J. R. Engen (2006) α-spectrin SH3 K S P R E V T M
Wales, T. E. and J. R. Engen (2006) α-spectrin SH3 P R E V T M K K
Wales, T. E. and J. R. Engen (2006) α-spectrin SH3 G D I L T L L N
Wales, T. E. and J. R. Engen (2006) α-spectrin SH3 D I L T L L N S
Wales, T. E. and J. R. Engen (2006) α-spectrin SH3 L N S T N K D W

Wales, T. E. and J. R. Engen (2006) α-spectrin SH3 T N K D W W K V
Wales, T. E. and J. R. Engen (2006) α-spectrin SH3 D R Q G F V P A
Wales, T. E. and J. R. Engen (2006) α-spectrin SH3 A A Y V K K L D
Weis, D. D., P. Kjellen, et al. (2006) Lck SH3 N L V I A L
Weis, D. D., P. Kjellen, et al. (2006) Lck SH3 V I A L H S Y E
Weis, D. D., P. Kjellen, et al. (2006) Lck SH3 K G E Q L R I L
Weis, D. D., P. Kjellen, et al. (2006) Lck SH3 G E Q L R I L E
Weis, D. D., P. Kjellen, et al. (2006) Lck SH3 S G E W W K A Q
Weis, D. D., P. Kjellen, et al. (2006) Lck SH3 G F I P F N F V
Weis, D. D., P. Kjellen, et al. (2006) Lck SH3 P F N F V A K A
Yao, Z. P., M. Zhou, et al. (2006) Cks1 D D E E F E Y R
Yao, Z. P., M. Zhou, et al. (2006) Cks1 K T H L M S E S
Yao, Z. P., M. Zhou, et al. (2006) Cks1 S E S E W R N L
Yao, Z. P., M. Zhou, et al. (2006) Cks1 Q S Q G W V H Y
Yao, Z. P., M. Zhou, et al. (2006) Cks1 W V H Y M I H E
Yao, Z. P., M. Zhou, et al. (2006) Cks1 H I L L F R R P
Yao, Z. P., M. Zhou, et al. (2006) Skp2 G S P E F M
Yao, Z. P., M. Zhou, et al. (2006) Skp2 P K L N R E N F
Yao, Z. P., M. Zhou, et al. (2006) Skp2 P G V S W D S L
Yao, Z. P., M. Zhou, et al. (2006) Skp2 D E L L L G I F
Yao, Z. P., M. Zhou, et al. (2006) Skp2 C L P E L L K V
Yao, Z. P., M. Zhou, et al. (2006) Skp2 S D E S L W Q T
Yao, Z. P., M. Zhou, et al. (2006) Skp2 V T G R L L S Q
Yao, Z. P., M. Zhou, et al. (2006) Skp2 V I A F R C P R
Yao, Z. P., M. Zhou, et al. (2006) Skp2 P L A E H F S P
Yao, Z. P., M. Zhou, et al. (2006) Skp2 H M D L S N S V
Yao, Z. P., M. Zhou, et al. (2006) Skp2 S V I E V S T L
Yao, Z. P., M. Zhou, et al. (2006) Skp2 H G I L S Q C S
Yao, Z. P., M. Zhou, et al. (2006) Skp2 N L S L E G L R
Yao, Z. P., M. Zhou, et al. (2006) Skp2 V N T L A K N S
Yao, Z. P., M. Zhou, et al. (2006) Skp2 N S N L V R L N
Yao, Z. P., M. Zhou, et al. (2006) Skp2 S G C S G F S E
Yao, Z. P., M. Zhou, et al. (2006) Skp2 F S E F A L Q T
Yao, Z. P., M. Zhou, et al. (2006) Skp2 Q T L L S S C S
Yao, Z. P., M. Zhou, et al. (2006) Skp2 S C S R L D E L
Yao, Z. P., M. Zhou, et al. (2006) Skp2 A H V S E T I T
Yao, Z. P., M. Zhou, et al. (2006) Skp2 I T Q L N L S G
Yao, Z. P., M. Zhou, et al. (2006) Skp2 K S D L S T L V
Yao, Z. P., M. Zhou, et al. (2006) Skp2 R R S P N L V H
Yao, Z. P., M. Zhou, et al. (2006) Skp2 L D L S D S V M
Yao, Z. P., M. Zhou, et al. (2006) Skp2 L K N D C F Q E
Yao, Z. P., M. Zhou, et al. (2006) Skp2 F Q E F F Q L N
Yao, Z. P., M. Zhou, et al. (2006) Skp2 E F F Q L N Y L
Yao, Z. P., M. Zhou, et al. (2006) Skp2 Q L N Y L Q H L
Yao, Z. P., M. Zhou, et al. (2006) Skp2 H L S L S R C Y
Yao, Z. P., M. Zhou, et al. (2006) Skp2 L S R C Y D I I
Yao, Z. P., M. Zhou, et al. (2006) Skp2 P E T L L E L G
Yao, Z. P., M. Zhou, et al. (2006) Skp2 P T L K T L Q V
Yao, Z. P., M. Zhou, et al. (2006) Skp2 L Q V F G I V P
Yao, Z. P., M. Zhou, et al. (2006) Skp2 T L Q L L K E A
Yao, Z. P., M. Zhou, et al. (2006) Skp2 L P H L Q I N C

Yao, Z. P., M. Zhou, et al. (2006) Skp2 K C R L T L Q K
Catalina, M. I., M. J. Fischer, et al. (2005) SH2 domains of Syk tSH2 F F F G N I
Catalina, M. I., M. J. Fischer, et al. (2005) SH2 domains of Syk tSH2 I T R E E A E D
Catalina, M. I., M. J. Fischer, et al. (2005) SH2 domains of Syk tSH2 T R E E A E D Y
Catalina, M. I., M. J. Fischer, et al. (2005) SH2 domains of Syk tSH2 G M S D G L Y L
Catalina, M. I., M. J. Fischer, et al. (2005) SH2 domains of Syk tSH2 S D G L Y L L R
Catalina, M. I., M. J. Fischer, et al. (2005) SH2 domains of Syk tSH2 Q S R N Y L G G
Catalina, M. I., M. J. Fischer, et al. (2005) SH2 domains of Syk tSH2 Y L G G F A L S
Catalina, M. I., M. J. Fischer, et al. (2005) SH2 domains of Syk tSH2 L G G F A L S V
Catalina, M. I., M. J. Fischer, et al. (2005) SH2 domains of Syk tSH2 H Y T I L N G T
Catalina, M. I., M. J. Fischer, et al. (2005) SH2 domains of Syk tSH2 N G T Y A I A G
Catalina, M. I., M. J. Fischer, et al. (2005) SH2 domains of Syk tSH2 P A D L C H Y H
Catalina, M. I., M. J. Fischer, et al. (2005) SH2 domains of Syk tSH2 E S D G L V C L
Catalina, M. I., M. J. Fischer, et al. (2005) SH2 domains of Syk tSH2 L V C L L K K P
Catalina, M. I., M. J. Fischer, et al. (2005) SH2 domains of Syk tSH2 F E D L K E N L
Catalina, M. I., M. J. Fischer, et al. (2005) SH2 domains of Syk tSH2 K E N L I R E Y
Catalina, M. I., M. J. Fischer, et al. (2005) SH2 domains of Syk tSH2 E N L I R E Y V
Catalina, M. I., M. J. Fischer, et al. (2005) SH2 domains of Syk tSH2 I R E Y V K Q T
Catalina, M. I., M. J. Fischer, et al. (2005) SH2 domains of Syk tSH2 Q A L E Q A I L
Catalina, M. I., M. J. Fischer, et al. (2005) SH2 domains of Syk tSH2 A L E Q A I L S
Catalina, M. I., M. J. Fischer, et al. (2005) SH2 domains of Syk tSH2 Q A I L S Q K P
Catalina, M. I., M. J. Fischer, et al. (2005) SH2 domains of Syk tSH2 K P Q L E K L I
Catalina, M. I., M. J. Fischer, et al. (2005) SH2 domains of Syk tSH2 L E K L I A T T
Catalina, M. I., M. J. Fischer, et al. (2005) SH2 domains of Syk tSH2 E K L I A T T A
Catalina, M. I., M. J. Fischer, et al. (2005) SH2 domains of Syk tSH2 S R D E S E Q I
Catalina, M. I., M. J. Fischer, et al. (2005) SH2 domains of Syk tSH2 D E S E Q I V L
Catalina, M. I., M. J. Fischer, et al. (2005) SH2 domains of Syk tSH2 S E Q I V L I G
Catalina, M. I., M. J. Fischer, et al. (2005) SH2 domains of Syk tSH2 Q I V L I G S K
Catalina, M. I., M. J. Fischer, et al. (2005) SH2 domains of Syk tSH2 N G K F L I R A
Catalina, M. I., M. J. Fischer, et al. (2005) SH2 domains of Syk tSH2 N N G S Y A L C
Catalina, M. I., M. J. Fischer, et al. (2005) SH2 domains of Syk tSH2 S Y A L C L L H
Catalina, M. I., M. J. Fischer, et al. (2005) SH2 domains of Syk tSH2 F D T L W Q L V
Catalina, M. I., M. J. Fischer, et al. (2005) SH2 domains of Syk tSH2 E H Y S Y K S D
Catalina, M. I., M. J. Fischer, et al. (2005) SH2 domains of Syk tSH2 S D G L L R V L
Catalina, M. I., M. J. Fischer, et al. (2005) SH2 domains of Syk tSH2 D G L L R V L T
Kang, S. and P. E. Prevelige, Jr. (2005) P22 capsid coat protein I V T L A V D E
Kang, S. and P. E. Prevelige, Jr. (2005) P22 capsid coat protein L A V D E I I E
Kang, S. and P. E. Prevelige, Jr. (2005) P22 capsid coat protein A S M Q R S S N
Kang, S. and P. E. Prevelige, Jr. (2005) P22 capsid coat protein I W M P V E Q E
Kang, S. and P. E. Prevelige, Jr. (2005) P22 capsid coat protein P V E Q E S P T
Kang, S. and P. E. Prevelige, Jr. (2005) P22 capsid coat protein W D L T D K A T
Kang, S. and P. E. Prevelige, Jr. (2005) P22 capsid coat protein A T G L L E L N
Kang, S. and P. E. Prevelige, Jr. (2005) P22 capsid coat protein D N D F F Q L R
Kang, S. and P. E. Prevelige, Jr. (2005) P22 capsid coat protein D L R D E T A Y
Kang, S. and P. E. Prevelige, Jr. (2005) P22 capsid coat protein N N V E L K V A
Kang, S. and P. E. Prevelige, Jr. (2005) P22 capsid coat protein N V E L K V A N
Kang, S. and P. E. Prevelige, Jr. (2005) P22 capsid coat protein G S L V I T S P
Kang, S. and P. E. Prevelige, Jr. (2005) P22 capsid coat protein D A W N F V A D
Kang, S. and P. E. Prevelige, Jr. (2005) P22 capsid coat protein E I M F S R E L
Kang, S. and P. E. Prevelige, Jr. (2005) P22 capsid coat protein G T S Y F F N P
Kang, S. and P. E. Prevelige, Jr. (2005) P22 capsid coat protein D I F G R I P E

Kang, S. and P. E. Prevelige, Jr. (2005) P22 capsid coat protein P E E A Y R D G
Kang, S. and P. E. Prevelige, Jr. (2005) P22 capsid coat protein V A G F D D V L
Kang, S. and P. E. Prevelige, Jr. (2005) P22 capsid coat protein V D N R F A T V
Kang, S. and P. E. Prevelige, Jr. (2005) P22 capsid coat protein T V T L S A T T
Kang, S. and P. E. Prevelige, Jr. (2005) P22 capsid coat protein K I S F A G V K
Kang, S. and P. E. Prevelige, Jr. (2005) P22 capsid coat protein A K N V L A Q D
Kang, S. and P. E. Prevelige, Jr. (2005) P22 capsid coat protein D A T F S V V R
Kang, S. and P. E. Prevelige, Jr. (2005) P22 capsid coat protein P V A L D D V S
Kang, S. and P. E. Prevelige, Jr. (2005) P22 capsid coat protein A D A M A V N I
Kang, S. and P. E. Prevelige, Jr. (2005) P22 capsid coat protein T N V F W A D D
Kang, S. and P. E. Prevelige, Jr. (2005) P22 capsid coat protein W A D D A I R I
Kang, S. and P. E. Prevelige, Jr. (2005) P22 capsid coat protein H E L F A G M K
Kang, S. and P. E. Prevelige, Jr. (2005) P22 capsid coat protein F A G M K T T S
Kang, S. and P. E. Prevelige, Jr. (2005) P22 capsid coat protein N G I F A T Q G
Kang, S. and P. E. Prevelige, Jr. (2005) P22 capsid coat protein L S G L C R I A
Lee, T., A. N. Hoofnagle, et al. (2005) ERK2 A G P E M V R G
Lee, T., A. N. Hoofnagle, et al. (2005) ERK2 G M V C S A Y D
Lee, T., A. N. Hoofnagle, et al. (2005) ERK2 V C S A Y D N L
Lee, T., A. N. Hoofnagle, et al. (2005) ERK2 Y D N L N K V R
Lee, T., A. N. Hoofnagle, et al. (2005) ERK2 T L R E I K I L
Lee, T., A. N. Hoofnagle, et al. (2005) ERK2 I K I L L R F R
Lee, T., A. N. Hoofnagle, et al. (2005) ERK2 G I N D I I R A
Lee, T., A. N. Hoofnagle, et al. (2005) ERK2 K D V Y I V Q D
Lee, T., A. N. Hoofnagle, et al. (2005) ERK2 V Q D L M E T D
Lee, T., A. N. Hoofnagle, et al. (2005) ERK2 D L M E T D L Y
Lee, T., A. N. Hoofnagle, et al. (2005) ERK2 E T D L Y K L L
Lee, T., A. N. Hoofnagle, et al. (2005) ERK2 H I C Y F L Y Q
Lee, T., A. N. Hoofnagle, et al. (2005) ERK2 A N V L H R D L
Lee, T., A. N. Hoofnagle, et al. (2005) ERK2 P S N L L L N T
Lee, T., A. N. Hoofnagle, et al. (2005) ERK2 S N L L L N T T
Lee, T., A. N. Hoofnagle, et al. (2005) ERK2 T T C D L K I C
Lee, T., A. N. Hoofnagle, et al. (2005) ERK2 T C D L K I C D
Lee, T., A. N. Hoofnagle, et al. (2005) ERK2 D F G L A R V A
Lee, T., A. N. Hoofnagle, et al. (2005) ERK2 A T R W Y R A P
Lee, T., A. N. Hoofnagle, et al. (2005) ERK2 P E I N L N S K
Lee, T., A. N. Hoofnagle, et al. (2005) ERK2 E I N L N S K G
Lee, T., A. N. Hoofnagle, et al. (2005) ERK2 W S V G C I L A
Lee, T., A. N. Hoofnagle, et al. (2005) ERK2 L A E M L S N R
Lee, T., A. N. Hoofnagle, et al. (2005) ERK2 P S Q E D L N C
Lee, T., A. N. Hoofnagle, et al. (2005) ERK2 Q E D L N C I I
Lee, T., A. N. Hoofnagle, et al. (2005) ERK2 I I N L K A R N
Lee, T., A. N. Hoofnagle, et al. (2005) ERK2 R N Y L L S L P
Lee, T., A. N. Hoofnagle, et al. (2005) ERK2 A L D L L D K M
Lee, T., A. N. Hoofnagle, et al. (2005) ERK2 Y L E Q Y Y D P
Lee, T., A. N. Hoofnagle, et al. (2005) ERK2 D M E L D D L P
Lee, T., A. N. Hoofnagle, et al. (2005) ERK2 E L I F E E T A
Lee, T., A. N. Hoofnagle, et al. (2005) ERK2 I C D F G L A R
Brier, S., D. Lemaire, et al. (2004) Kinesin-like protein KIF11 (human) K N I Q V V V R
Brier, S., D. Lemaire, et al. (2004) Kinesin-like protein KIF11 (human) N I Q V V V R C
Brier, S., D. Lemaire, et al. (2004) Kinesin-like protein KIF11 (human) V R C R P F N L
Brier, S., D. Lemaire, et al. (2004) Kinesin-like protein KIF11 (human) P F N K A E R K

Brier, S., D. Lemaire, et al. (2004) Kinesin-like protein KIF11 (human) K A S A H S I V
Brier, S., D. Lemaire, et al. (2004) Kinesin-like protein KIF11 (human) S A H S I V E C
Brier, S., D. Lemaire, et al. (2004) Kinesin-like protein KIF11 (human) S I V E C D P V
Brier, S., D. Lemaire, et al. (2004) Kinesin-like protein KIF11 (human) V E C D P V R K
Brier, S., D. Lemaire, et al. (2004) Kinesin-like protein KIF11 (human) K E V S V R T G
Brier, S., D. Lemaire, et al. (2004) Kinesin-like protein KIF11 (human) V R T G G L A D
Brier, S., D. Lemaire, et al. (2004) Kinesin-like protein KIF11 (human) T G G L A D K S
Brier, S., D. Lemaire, et al. (2004) Kinesin-like protein KIF11 (human) S S R K T Y T F
Brier, S., D. Lemaire, et al. (2004) Kinesin-like protein KIF11 (human) T Y T F D M V F
Brier, S., D. Lemaire, et al. (2004) Kinesin-like protein KIF11 (human) Y T F D M V F G
Brier, S., D. Lemaire, et al. (2004) Kinesin-like protein KIF11 (human) F D M V F G A S
Brier, S., D. Lemaire, et al. (2004) Kinesin-like protein KIF11 (human) Q I D V Y R S V
Brier, S., D. Lemaire, et al. (2004) Kinesin-like protein KIF11 (human) S V V C P I L D
Brier, S., D. Lemaire, et al. (2004) Kinesin-like protein KIF11 (human) C P I L D E V I
Brier, S., D. Lemaire, et al. (2004) Kinesin-like protein KIF11 (human) P I L D E V I M
Brier, S., D. Lemaire, et al. (2004) Kinesin-like protein KIF11 (human) I L D E V I M G
Brier, S., D. Lemaire, et al. (2004) Kinesin-like protein KIF11 (human) D E V I M G Y N
Brier, S., D. Lemaire, et al. (2004) Kinesin-like protein KIF11 (human) M G Y N C T I F
Brier, S., D. Lemaire, et al. (2004) Kinesin-like protein KIF11 (human) N C T I F A Y G
Brier, S., D. Lemaire, et al. (2004) Kinesin-like protein KIF11 (human) C T I F A Y G Q
Brier, S., D. Lemaire, et al. (2004) Kinesin-like protein KIF11 (human) Y G Q T G T G K
Brier, S., D. Lemaire, et al. (2004) Kinesin-like protein KIF11 (human) P N E E Y T W E
Brier, S., D. Lemaire, et al. (2004) Kinesin-like protein KIF11 (human) N E E Y T W E E
Brier, S., D. Lemaire, et al. (2004) Kinesin-like protein KIF11 (human) E E Y T W E E D
Brier, S., D. Lemaire, et al. (2004) Kinesin-like protein KIF11 (human) Y T W E E D P L
Brier, S., D. Lemaire, et al. (2004) Kinesin-like protein KIF11 (human) I I P R T L H Q
Brier, S., D. Lemaire, et al. (2004) Kinesin-like protein KIF11 (human) P R T L H Q I F
Brier, S., D. Lemaire, et al. (2004) Kinesin-like protein KIF11 (human) T L H Q I F E K
Brier, S., D. Lemaire, et al. (2004) Kinesin-like protein KIF11 (human) H Q I F E K L T
Brier, S., D. Lemaire, et al. (2004) Kinesin-like protein KIF11 (human) Q I F E K L T D
Brier, S., D. Lemaire, et al. (2004) Kinesin-like protein KIF11 (human) K L T D N G T E
Brier, S., D. Lemaire, et al. (2004) Kinesin-like protein KIF11 (human) N G T E F S V K
Brier, S., D. Lemaire, et al. (2004) Kinesin-like protein KIF11 (human) G T E F S V K V
Brier, S., D. Lemaire, et al. (2004) Kinesin-like protein KIF11 (human) T E F S V K V S
Brier, S., D. Lemaire, et al. (2004) Kinesin-like protein KIF11 (human) K V S L L E I Y
Brier, S., D. Lemaire, et al. (2004) Kinesin-like protein KIF11 (human) V S L L E I Y N
Brier, S., D. Lemaire, et al. (2004) Kinesin-like protein KIF11 (human) N E E L F D L L
Brier, S., D. Lemaire, et al. (2004) Kinesin-like protein KIF11 (human) N P S S D V S E
Brier, S., D. Lemaire, et al. (2004) Kinesin-like protein KIF11 (human) D V S E R L Q M
Brier, S., D. Lemaire, et al. (2004) Kinesin-like protein KIF11 (human) R L Q M F D D P
Brier, S., D. Lemaire, et al. (2004) Kinesin-like protein KIF11 (human) K R G V I I K G
Brier, S., D. Lemaire, et al. (2004) Kinesin-like protein KIF11 (human) G L E E I T V H
Brier, S., D. Lemaire, et al. (2004) Kinesin-like protein KIF11 (human) V H N K D E V Y
Brier, S., D. Lemaire, et al. (2004) Kinesin-like protein KIF11 (human) D E V Y Q I L E
Brier, S., D. Lemaire, et al. (2004) Kinesin-like protein KIF11 (human) E V Y Q I L E K
Brier, S., D. Lemaire, et al. (2004) Kinesin-like protein KIF11 (human) Y Q I L E K G A
Brier, S., D. Lemaire, et al. (2004) Kinesin-like protein KIF11 (human) Q I L E K G A A
Brier, S., D. Lemaire, et al. (2004) Kinesin-like protein KIF11 (human) A A T L M N A Y
Brier, S., D. Lemaire, et al. (2004) Kinesin-like protein KIF11 (human) A T L M N A Y S
Brier, S., D. Lemaire, et al. (2004) Kinesin-like protein KIF11 (human) L M N A Y S S R
Brier, S., D. Lemaire, et al. (2004) Kinesin-like protein KIF11 (human) S S R S H S V F

Brier, S., D. Lemaire, et al. (2004) Kinesin-like protein KIF11 (human) H S V F S V T I
Brier, S., D. Lemaire, et al. (2004) Kinesin-like protein KIF11 (human) T I H M K E T T
Brier, S., D. Lemaire, et al. (2004) Kinesin-like protein KIF11 (human) T I D G E E L V
Brier, S., D. Lemaire, et al. (2004) Kinesin-like protein KIF11 (human) D G E E L V K I
Brier, S., D. Lemaire, et al. (2004) Kinesin-like protein KIF11 (human) G E E L V K I G
Brier, S., D. Lemaire, et al. (2004) Kinesin-like protein KIF11 (human) E E L V K I G K
Brier, S., D. Lemaire, et al. (2004) Kinesin-like protein KIF11 (human) K L N L V D L A
Brier, S., D. Lemaire, et al. (2004) Kinesin-like protein KIF11 (human) L N L V D L A G
Brier, S., D. Lemaire, et al. (2004) Kinesin-like protein KIF11 (human) N I G R S G A V
Brier, S., D. Lemaire, et al. (2004) Kinesin-like protein KIF11 (human) R E A G N I N Q
Brier, S., D. Lemaire, et al. (2004) Kinesin-like protein KIF11 (human) N Q S L L T L G
Brier, S., D. Lemaire, et al. (2004) Kinesin-like protein KIF11 (human) Q S L L T L G R
Brier, S., D. Lemaire, et al. (2004) Kinesin-like protein KIF11 (human) L L T L G R V I
Brier, S., D. Lemaire, et al. (2004) Kinesin-like protein KIF11 (human) L T L G R V I T
Brier, S., D. Lemaire, et al. (2004) Kinesin-like protein KIF11 (human) R V I T A L V E
Brier, S., D. Lemaire, et al. (2004) Kinesin-like protein KIF11 (human) V I T A L V E R
Brier, S., D. Lemaire, et al. (2004) Kinesin-like protein KIF11 (human) I T A L V E R T
Brier, S., D. Lemaire, et al. (2004) Kinesin-like protein KIF11 (human) R E S K L T R I
Brier, S., D. Lemaire, et al. (2004) Kinesin-like protein KIF11 (human) E S K L T R I L
Brier, S., D. Lemaire, et al. (2004) Kinesin-like protein KIF11 (human) L Q D S L G G R
Brier, S., D. Lemaire, et al. (2004) Kinesin-like protein KIF11 (human) Q D S L G G R T
Brier, S., D. Lemaire, et al. (2004) Kinesin-like protein KIF11 (human) D S L G G R T R
Brier, S., D. Lemaire, et al. (2004) Kinesin-like protein KIF11 (human) G G R T R T S I
Brier, S., D. Lemaire, et al. (2004) Kinesin-like protein KIF11 (human) R T S I I A T I
Brier, S., D. Lemaire, et al. (2004) Kinesin-like protein KIF11 (human) T S I I A T I S
Brier, S., D. Lemaire, et al. (2004) Kinesin-like protein KIF11 (human) S I I A T I S P
Brier, S., D. Lemaire, et al. (2004) Kinesin-like protein KIF11 (human) I I A T I S P A
Brier, S., D. Lemaire, et al. (2004) Kinesin-like protein KIF11 (human) I A T I S P A S
Brier, S., D. Lemaire, et al. (2004) Kinesin-like protein KIF11 (human) I S P A S L N L
Brier, S., D. Lemaire, et al. (2004) Kinesin-like protein KIF11 (human) S P A S L N L E
Brier, S., D. Lemaire, et al. (2004) Kinesin-like protein KIF11 (human) S L N L E E T L
Brier, S., D. Lemaire, et al. (2004) Kinesin-like protein KIF11 (human) L N L E E T L S
Brier, S., D. Lemaire, et al. (2004) Kinesin-like protein KIF11 (human) E E T L S T L E
Brier, S., D. Lemaire, et al. (2004) Kinesin-like protein KIF11 (human) S T L E Y A G R
Casbarra, A., L. Birolo, et al. (2004) Human α-LA L S Q L L K D I
Casbarra, A., L. Birolo, et al. (2004) Human α-LA G I A L P E L I
Casbarra, A., L. Birolo, et al. (2004) Human α-LA C T M F H T S G
Casbarra, A., L. Birolo, et al. (2004) Human α-LA Y G L F Q I S N
Casbarra, A., L. Birolo, et al. (2004) Human α-LA S N K L W C K S
Casbarra, A., L. Birolo, et al. (2004) Human α-LA K I L D I K G I
Casbarra, A., L. Birolo, et al. (2004) Human α-LA G I D Y W L A H
Casbarra, A., L. Birolo, et al. (2004) Human α-LA L E Q W L C E K
Croy, C. H., J. R. Koeppe, et al. (2004) human α-thrombin R P L F E K K S
Croy, C. H., J. R. Koeppe, et al. (2004) human α-thrombin T E R E L L E S
Croy, C. H., J. R. Koeppe, et al. (2004) human α-thrombin L E S Y I D G R
Croy, C. H., J. R. Koeppe, et al. (2004) human α-thrombin E G S D A E I G
Croy, C. H., J. R. Koeppe, et al. (2004) human α-thrombin Q V M L F R K S
Croy, C. H., J. R. Koeppe, et al. (2004) human α-thrombin Q E L L C G A S
Croy, C. H., J. R. Koeppe, et al. (2004) human α-thrombin N D L L V R I G
Croy, C. H., J. R. Koeppe, et al. (2004) human α-thrombin D L L V R I G K
Croy, C. H., J. R. Koeppe, et al. (2004) human α-thrombin E R N I E K I S

Croy, C. H., J. R. Koeppe, et al. (2004) human α-thrombin E K I S M L E K
Croy, C. H., J. R. Koeppe, et al. (2004) human α-thrombin I S M L E K I Y
Croy, C. H., J. R. Koeppe, et al. (2004) human α-thrombin S M L E K I Y I
Croy, C. H., J. R. Koeppe, et al. (2004) human α-thrombin Y N W R E N L D
Croy, C. H., J. R. Koeppe, et al. (2004) human α-thrombin R E N L D R D I
Croy, C. H., J. R. Koeppe, et al. (2004) human α-thrombin D I A L M K L K
Croy, C. H., J. R. Koeppe, et al. (2004) human α-thrombin F S D Y I H P V
Croy, C. H., J. R. Koeppe, et al. (2004) human α-thrombin A A S L L Q A G
Croy, C. H., J. R. Koeppe, et al. (2004) human α-thrombin A S L L Q A G Y
Croy, C. H., J. R. Koeppe, et al. (2004) human α-thrombin G N L K E T W T
Croy, C. H., J. R. Koeppe, et al. (2004) human α-thrombin G G P F V M K S
Croy, C. H., J. R. Koeppe, et al. (2004) human α-thrombin P F V M K S P F
Croy, C. H., J. R. Koeppe, et al. (2004) human α-thrombin F N N R W Y Q M
Croy, C. H., J. R. Koeppe, et al. (2004) human α-thrombin K Y G F Y T H V
Croy, C. H., J. R. Koeppe, et al. (2004) human α-thrombin T H V F R K K K
Croy, C. H., J. R. Koeppe, et al. (2004) human α-thrombin I D Q F G E
Croy, C. H., J. R. Koeppe, et al. (2004) bovine α-thrombin Q P F F N E K T
Croy, C. H., J. R. Koeppe, et al. (2004) bovine α-thrombin V M L F R K S P
Croy, C. H., J. R. Koeppe, et al. (2004) bovine α-thrombin Q E L L C G A S
Croy, C. H., J. R. Koeppe, et al. (2004) bovine α-thrombin A S L I S D R W
Croy, C. H., J. R. Koeppe, et al. (2004) bovine α-thrombin W V L T A A H C
Croy, C. H., J. R. Koeppe, et al. (2004) bovine α-thrombin D D L L V R I G
Croy, C. H., J. R. Koeppe, et al. (2004) bovine α-thrombin K E N L D R D I
Croy, C. H., J. R. Koeppe, et al. (2004) bovine α-thrombin D I A L L K L K
Croy, C. H., J. R. Koeppe, et al. (2004) bovine α-thrombin R P I E L S D Y
Croy, C. H., J. R. Koeppe, et al. (2004) bovine α-thrombin L S D Y I H P V
Croy, C. H., J. R. Koeppe, et al. (2004) bovine α-thrombin H A G F K G R V
Croy, C. H., J. R. Koeppe, et al. (2004) bovine α-thrombin T T S V A E V Q
Croy, C. H., J. R. Koeppe, et al. (2004) bovine α-thrombin T S V A E V Q P
Croy, C. H., J. R. Koeppe, et al. (2004) bovine α-thrombin K A S T R I R I
Croy, C. H., J. R. Koeppe, et al. (2004) bovine α-thrombin T D N M F C A G
Croy, C. H., J. R. Koeppe, et al. (2004) bovine α-thrombin D N M F C A G Y
Croy, C. H., J. R. Koeppe, et al. (2004) bovine α-thrombin G G P F V M K S
Croy, C. H., J. R. Koeppe, et al. (2004) bovine α-thrombin P F V M K S P Y
Croy, C. H., J. R. Koeppe, et al. (2004) bovine α-thrombin K S P Y N N R W
Croy, C. H., J. R. Koeppe, et al. (2004) bovine α-thrombin Y N N R W Y Q M
Croy, C. H., J. R. Koeppe, et al. (2004) bovine α-thrombin G I V S W G E G
Croy, C. H., S. Bergqvist, et al. (2004) IκBα D S F L H L A I
Croy, C. H., S. Bergqvist, et al. (2004) IκBα F L H L A I I H
Croy, C. H., S. Bergqvist, et al. (2004) IκBα A L T M E V I R
Croy, C. H., S. Bergqvist, et al. (2004) IκBα L T M E V I R Q
Croy, C. H., S. Bergqvist, et al. (2004) IκBα M E V I R Q V K
Croy, C. H., S. Bergqvist, et al. (2004) IκBα K G D L A F L N
Croy, C. H., S. Bergqvist, et al. (2004) IκBα D L A F L N F Q
Croy, C. H., S. Bergqvist, et al. (2004) IκBα L A F L N F Q N
Croy, C. H., S. Bergqvist, et al. (2004) IκBα F L N F Q N N L
Croy, C. H., S. Bergqvist, et al. (2004) IκBα P L H L A V I T
Croy, C. H., S. Bergqvist, et al. (2004) IκBα E I A E A L L G
Croy, C. H., S. Bergqvist, et al. (2004) IκBα A G C D P E L R
Croy, C. H., S. Bergqvist, et al. (2004) IκBα D P E L R D F R
Croy, C. H., S. Bergqvist, et al. (2004) IκBα E L R D F R G N

Croy, C. H., S. Bergqvist, et al. (2004) IκBα GNTRLHLA
Croy, C. H., S. Bergqvist, et al. (2004) IκBα TPKHLACE
Croy, C. H., S. Bergqvist, et al. (2004) IκBα PLHLACEQ
Croy, C. H., S. Bergqvist, et al. (2004) IκBα EQGCLASV
Croy, C. H., S. Bergqvist, et al. (2004) IκBα LASVGVLT
Croy, C. H., S. Bergqvist, et al. (2004) IκBα LHSILKAT
Croy, C. H., S. Bergqvist, et al. (2004) IκBα GHTLLHLA
Croy, C. H., S. Bergqvist, et al. (2004) IκBα HTLLHLAS
Croy, C. H., S. Bergqvist, et al. (2004) IκBα GIVELLVS
Croy, C. H., S. Bergqvist, et al. (2004) IκBα IVELLVSK
Croy, C. H., S. Bergqvist, et al. (2004) IκBα GRTALHLA
Croy, C. H., S. Bergqvist, et al. (2004) IκBα ALHLAVDK
Croy, C. H., S. Bergqvist, et al. (2004) IκBα RPSTRIQQ
Croy, C. H., S. Bergqvist, et al. (2004) IκBα QLGQLTLE
Croy, C. H., S. Bergqvist, et al. (2004) IκBα ESEFTEFT
Croy, C. H., S. Bergqvist, et al. (2004) IκBα YDDCVFGG
Li, X., Y. T. Chou, et al. (2004) LR3IGF-I MPSLFVNG
Li, X., Y. T. Chou, et al. (2004) LR3IGF-I PSLFVNGP
Li, X., Y. T. Chou, et al. (2004) LR3IGF-I CGAELVDA
Li, X., Y. T. Chou, et al. (2004) LR3IGF-I GAELVDAL
Li, X., Y. T. Chou, et al. (2004) LR3IGF-I LVDALQFV
Li, X., Y. T. Chou, et al. (2004) LR3IGF-I VDALQFVC
Li, X., Y. T. Chou, et al. (2004) LR3IGF-I DALQEVCG
Li, X., Y. T. Chou, et al. (2004) LR3IGF-I DRGFYFNK
Li, X., Y. T. Chou, et al. (2004) LR3IGF-I RGFYFNKP
Li, X., Y. T. Chou, et al. (2004) LR3IGF-I FYFNKPTG
Li, X., Y. T. Chou, et al. (2004) LR3IGF-I QTGIVDEC
Li, X., Y. T. Chou, et al. (2004) LR3IGF-I VDECCFRS
Li, X., Y. T. Chou, et al. (2004) LR3IGF-I RLEMYCAP
Mazon, H., O. Marcillat, et al. (2004) Creatine kinase M-type (rabbit) YPDLSKHN
Mazon, H., O. Marcillat, et al. (2004) Creatine kinase M-type (rabbit) TPDLYKKL
Mazon, H., O. Marcillat, et al. (2004) Creatine kinase M-type (rabbit) ETPSGFTL
Mazon, H., O. Marcillat, et al. (2004) Creatine kinase M-type (rabbit) PSGFTLDD
Mazon, H., O. Marcillat, et al. (2004) Creatine kinase M-type (rabbit) FTLDDVIQ
Mazon, H., O. Marcillat, et al. (2004) Creatine kinase M-type (rabbit) TLDDVIQT
Mazon, H., O. Marcillat, et al. (2004) Creatine kinase M-type (rabbit) FIMTVGCV
Mazon, H., O. Marcillat, et al. (2004) Creatine kinase M-type (rabbit) MTVGCVAG
Mazon, H., O. Marcillat, et al. (2004) Creatine kinase M-type (rabbit) DEESYTVF
Mazon, H., O. Marcillat, et al. (2004) Creatine kinase M-type (rabbit) YTVFKDLF
Mazon, H., O. Marcillat, et al. (2004) Creatine kinase M-type (rabbit) DLFDPIIQ
Mazon, H., O. Marcillat, et al. (2004) Creatine kinase M-type (rabbit) HENLKGGD
Mazon, H., O. Marcillat, et al. (2004) Creatine kinase M-type (rabbit) HYVLSSRV
Mazon, H., O. Marcillat, et al. (2004) Creatine kinase M-type (rabbit) KLSVEALN
Mazon, H., O. Marcillat, et al. (2004) Creatine kinase M-type (rabbit) VEALNSLT
Mazon, H., O. Marcillat, et al. (2004) Creatine kinase M-type (rabbit) QQQLIDDH
Mazon, H., O. Marcillat, et al. (2004) Creatine kinase M-type (rabbit) DDHFLFDK
Mazon, H., O. Marcillat, et al. (2004) Creatine kinase M-type (rabbit) DHFLFDKP
Mazon, H., O. Marcillat, et al. (2004) Creatine kinase M-type (rabbit) SPLLLASG
Mazon, H., O. Marcillat, et al. (2004) Creatine kinase M-type (rabbit) PLLLASGM
Mazon, H., O. Marcillat, et al. (2004) Creatine kinase M-type (rabbit) NKSFLVWV
Mazon, H., O. Marcillat, et al. (2004) Creatine kinase M-type (rabbit) KSFLVWVN

Mazon, H., O. Marcillat, et al. (2004) Creatine kinase M-type (rabbit) EDHLRVIS
Mazon, H., O. Marcillat, et al. (2004) Creatine kinase M-type (rabbit) VISMEKQG
Mazon, H., O. Marcillat, et al. (2004) Creatine kinase M-type (rabbit) KEVFRRFC
Mazon, H., O. Marcillat, et al. (2004) Creatine kinase M-type (rabbit) FCVGLQKI
Mazon, H., O. Marcillat, et al. (2004) Creatine kinase M-type (rabbit) KIEEIFKK
Mazon, H., O. Marcillat, et al. (2004) Creatine kinase M-type (rabbit) HPFMWNEH
Mazon, H., O. Marcillat, et al. (2004) Creatine kinase M-type (rabbit) HLGYVLTC
Mazon, H., O. Marcillat, et al. (2004) Creatine kinase M-type (rabbit) GYVLTCPS
Mazon, H., O. Marcillat, et al. (2004) Creatine kinase M-type (rabbit) VLTCPSNL
Mazon, H., O. Marcillat, et al. (2004) Creatine kinase M-type (rabbit) PSNLGTGL
Mazon, H., O. Marcillat, et al. (2004) Creatine kinase M-type (rabbit) HPKFEEIL
Mazon, H., O. Marcillat, et al. (2004) Creatine kinase M-type (rabbit) EEILTRLR
Mazon, H., O. Marcillat, et al. (2004) Creatine kinase M-type (rabbit) LTRLRLQK
Mazon, H., O. Marcillat, et al. (2004) Creatine kinase M-type (rabbit) AAVGSVFD
Mazon, H., O. Marcillat, et al. (2004) Creatine kinase M-type (rabbit) GSVFDISN
Mazon, H., O. Marcillat, et al. (2004) Creatine kinase M-type (rabbit) SEVEQVQL
Mazon, H., O. Marcillat, et al. (2004) Creatine kinase M-type (rabbit) EVEQVQLV
Mazon, H., O. Marcillat, et al. (2004) Creatine kinase M-type (rabbit) QVQLVVDG
Mazon, H., O. Marcillat, et al. (2004) Creatine kinase M-type (rabbit) GVKLMVEM
Mazon, H., O. Marcillat, et al. (2004) Creatine kinase M-type (rabbit) LMVEMEKK
Mazon, H., O. Marcillat, et al. (2004) Creatine kinase M-type (rabbit) MVEMEKKL
Wu, Z., A. Hasan, et al. (2004) human rCRALBP DKHMSEGV
Wu, Z., A. Hasan, et al. (2004) human rCRALBP YVNFRLQY
Wu, Z., A. Hasan, et al. (2004) human rCRALBP QAENTAF
Yan, X., D. Broderick, et al. (2004) human RXRα LBD MPVERILE
Yan, X., D. Broderick, et al. (2004) human RXRα LBD PVERILEA
Yan, X., D. Broderick, et al. (2004) human RXRα LBD ERILEAEL
Yan, X., D. Broderick, et al. (2004) human RXRα LBD ILEAELAV
Yan, X., D. Broderick, et al. (2004) human RXRα LBD LEAELAVE
Yan, X., D. Broderick, et al. (2004) human RXRα LBD EAELAVEP
Yan, X., D. Broderick, et al. (2004) human RXRα LBD KTETYVEA
Yan, X., D. Broderick, et al. (2004) human RXRα LBD VEANMGLN
Yan, X., D. Broderick, et al. (2004) human RXRα LBD NMGLNPSS
Yan, X., D. Broderick, et al. (2004) human RXRα LBD NICQAADK
Yan, X., D. Broderick, et al. (2004) human RXRα LBD QLFTLVFW
Yan, X., D. Broderick, et al. (2004) human RXRα LBD LFTLVEWA
Yan, X., D. Broderick, et al. (2004) human RXRα LBD TLVEWAKR
Yan, X., D. Broderick, et al. (2004) human RXRα LBD IPHFSELP
Yan, X., D. Broderick, et al. (2004) human RXRα LBD PHFSELPL
Yan, X., D. Broderick, et al. (2004) human RXRα LBD FSELPLDD
Yan, X., D. Broderick, et al. (2004) human RXRα LBD PLDDQVIL
Yan, X., D. Broderick, et al. (2004) human RXRα LBD LDDQVILL
Yan, X., D. Broderick, et al. (2004) human RXRα LBD QVILLRAG
Yan, X., D. Broderick, et al. (2004) human RXRα LBD WNELLIAS
Yan, X., D. Broderick, et al. (2004) human RXRα LBD NELLIASF
Yan, X., D. Broderick, et al. (2004) human RXRα LBD LIASFSHR
Yan, X., D. Broderick, et al. (2004) human RXRα LBD IASFSHRS
Yan, X., D. Broderick, et al. (2004) human RXRα LBD RSIAVKDG
Yan, X., D. Broderick, et al. (2004) human RXRα LBD DGILLATG
Yan, X., D. Broderick, et al. (2004) human RXRα LBD GILLATGL
Yan, X., D. Broderick, et al. (2004) human RXRα LBD ATGLHVHR

Yan, X., D. Broderick, et al. (2004) human RXRα LBD AGVGAIFD
Yan, X., D. Broderick, et al. (2004) human RXRα LBD GAIFDRVL
Yan, X., D. Broderick, et al. (2004) human RXRα LBD AIFDRVLT
Yan, X., D. Broderick, et al. (2004) human RXRα LBD VLTELVSK
Yan, X., D. Broderick, et al. (2004) human RXRα LBD LTELVSKM
Yan, X., D. Broderick, et al. (2004) human RXRα LBD DMQMDKTE
Yan, X., D. Broderick, et al. (2004) human RXRα LBD KTELGCLR
Yan, X., D. Broderick, et al. (2004) human RXRα LBD LGCLRAIV
Yan, X., D. Broderick, et al. (2004) human RXRα LBD CLRAIVLF
Yan, X., D. Broderick, et al. (2004) human RXRα LBD AIVLFNPD
Yan, X., D. Broderick, et al. (2004) human RXRα LBD NPAEVEAL
Yan, X., D. Broderick, et al. (2004) human RXRα LBD AEVEALRE
Yan, X., D. Broderick, et al. (2004) human RXRα LBD EVEALREK
Yan, X., D. Broderick, et al. (2004) human RXRα LBD ALREKVYA
Yan, X., D. Broderick, et al. (2004) human RXRα LBD YASLEAYC
Yan, X., D. Broderick, et al. (2004) human RXRα LBD SLEAYCKH
Yan, X., D. Broderick, et al. (2004) human RXRα LBD LEAYCKHK
Yan, X., D. Broderick, et al. (2004) human RXRα LBD PGRFAKLL
Yan, X., D. Broderick, et al. (2004) human RXRα LBD FAKLLLRL
Yan, X., D. Broderick, et al. (2004) human RXRα LBD AKLLLRLP
Yan, X., D. Broderick, et al. (2004) human RXRα LBD LLRLPALR
Yan, X., D. Broderick, et al. (2004) human RXRα LBD LPALRSIG
Yan, X., D. Broderick, et al. (2004) human RXRα LBD ALRSIGLK
Yan, X., D. Broderick, et al. (2004) human RXRα LBD RSIGLKCL
Yan, X., D. Broderick, et al. (2004) human RXRα LBD SIGLKCLE
Yan, X., D. Broderick, et al. (2004) human RXRα LBD GLKCLEHL
Yan, X., D. Broderick, et al. (2004) human RXRα LBD KCLEHLFF
Yan, X., D. Broderick, et al. (2004) human RXRα LBD HLFFFKLI
Yan, X., D. Broderick, et al. (2004) human RXRα LBD PIDTFLME
Yan, X., D. Broderick, et al. (2004) human RXRα LBD TFLMEMLE

Anand, G. S., D. Law, et al. (2003) PKA - R1α AISAEVYT
Anand, G. S., D. Law, et al. (2003) PKA - R1α DAASYVRK
Anand, G. S., D. Law, et al. (2003) PKA - R1α KDYKTMAA
Anand, G. S., D. Law, et al. (2003) PKA - R1α TMAALAKA
Anand, G. S., D. Law, et al. (2003) PKA - R1α EKNVLFSH
Anand, G. S., D. Law, et al. (2003) PKA - R1α KNVLFSHL
Anand, G. S., D. Law, et al. (2003) PKA - R1α FSHLDDNE
Anand, G. S., D. Law, et al. (2003) PKA - R1α LDDNERSD
Anand, G. S., D. Law, et al. (2003) PKA - R1α DDNERSDI
Anand, G. S., D. Law, et al. (2003) PKA - R1α RSDIFDAM
Anand, G. S., D. Law, et al. (2003) PKA - R1α DEGDNFYV
Anand, G. S., D. Law, et al. (2003) PKA - R1α SFGELALI
Anand, G. S., D. Law, et al. (2003) PKA - R1α EKALIYGT
Anand, G. S., D. Law, et al. (2003) PKA - R1α NVKLWGID
Anand, G. S., D. Law, et al. (2003) PKA - R1α DSYRRILM
Anand, G. S., D. Law, et al. (2003) PKA - R1α MGSTLRKR
Anand, G. S., D. Law, et al. (2003) PKA - R1α GSTLRKRK
Anand, G. S., D. Law, et al. (2003) PKA - C-Subunit SVKEFLAK
Anand, G. S., D. Law, et al. (2003) PKA - C-Subunit AKEDLLKK
Anand, G. S., D. Law, et al. (2003) PKA - C-Subunit NTAQDDQF
Anand, G. S., D. Law, et al. (2003) PKA - C-Subunit TAQLDQFD

Anand, G. S., D. Law, et al. (2003) PKA - C-Subunit LDQFFRIK
Anand, G. S., D. Law, et al. (2003) PKA - C-Subunit GTGSGGRV
Anand, G. S., D. Law, et al. (2003) PKA - C-Subunit HKESVNHY
Anand, G. S., D. Law, et al. (2003) PKA - C-Subunit KQKVKKLK
Anand, G. S., D. Law, et al. (2003) PKA - C-Subunit QKVVQLKQ
Anand, G. S., D. Law, et al. (2003) PKA - C-Subunit VKLKKIEH
Anand, G. S., D. Law, et al. (2003) PKA - C-Subunit TLNEPRIL
Anand, G. S., D. Law, et al. (2003) PKA - C-Subunit AVNFMFLV
Anand, G. S., D. Law, et al. (2003) PKA - C-Subunit AGGERFSH
Anand, G. S., D. Law, et al. (2003) PKA - C-Subunit FSHLSRIG
Anand, G. S., D. Law, et al. (2003) PKA - C-Subunit IGRFYEPH
Anand, G. S., D. Law, et al. (2003) PKA - C-Subunit HARFYAAQ
Anand, G. S., D. Law, et al. (2003) PKA - C-Subunit LDLILRDL
Anand, G. S., D. Law, et al. (2003) PKA - C-Subunit KPENDLID
Anand, G. S., D. Law, et al. (2003) PKA - C-Subunit NLLISQQG
Anand, G. S., D. Law, et al. (2003) PKA - C-Subunit EIILWKGY
Anand, G. S., D. Law, et al. (2003) PKA - C-Subunit AVDWFALG
Anand, G. S., D. Law, et al. (2003) PKA - C-Subunit FPSHDSSD
Anand, G. S., D. Law, et al. (2003) PKA - C-Subunit HFSSLLKD
Anand, G. S., D. Law, et al. (2003) PKA - C-Subunit FSSDTKDL
Anand, G. S., D. Law, et al. (2003) PKA - C-Subunit GVDLIKRF
Anand, G. S., D. Law, et al. (2003) PKA - C-Subunit TTDWFAIY
Anand, G. S., D. Law, et al. (2003) PKA - C-Subunit DTSNDDDY
Anand, G. S., D. Law, et al. (2003) PKA - C-Subunit TSNFDDYE
Chik, J. K. and D. C. Schriemer (2003) PKA - C-Subunit TTALVCDN
Chik, J. K. and D. C. Schriemer (2003) rabbit muscle actin GSGLVKAG
Chik, J. K. and D. C. Schriemer (2003) rabbit muscle actin RAVFPSIV
Chik, J. K. and D. C. Schriemer (2003) rabbit muscle actin QGVMVGMG
Chik, J. K. and D. C. Schriemer (2003) rabbit muscle actin VMVGMGQK
Chik, J. K. and D. C. Schriemer (2003) rabbit muscle actin RGILTLKY
Chik, J. K. and D. C. Schriemer (2003) rabbit muscle actin GILTLKYP
Chik, J. K. and D. C. Schriemer (2003) rabbit muscle actin WDDMEKIW
Chik, J. K. and D. C. Schriemer (2003) rabbit muscle actin FYNELRVA
Chik, J. K. and D. C. Schriemer (2003) rabbit muscle actin YNELRVAP
Chik, J. K. and D. C. Schriemer (2003) rabbit muscle actin HPTLLTEA
Chik, J. K. and D. C. Schriemer (2003) rabbit muscle actin KMTQIMFE
Chik, J. K. and D. C. Schriemer (2003) rabbit muscle actin TQIMFETF
Chik, J. K. and D. C. Schriemer (2003) rabbit muscle actin QIMFETFN
Chik, J. K. and D. C. Schriemer (2003) rabbit muscle actin VLSLYASG
Chik, J. K. and D. C. Schriemer (2003) rabbit muscle actin GIVLDSGD
Chik, J. K. and D. C. Schriemer (2003) rabbit muscle actin IYEGYAKP
Chik, J. K. and D. C. Schriemer (2003) rabbit muscle actin IMRLDLAG
Chik, J. K. and D. C. Schriemer (2003) rabbit muscle actin TDYLMKIL
Chik, J. K. and D. C. Schriemer (2003) rabbit muscle actin RSYSFVTT
Chik, J. K. and D. C. Schriemer (2003) rabbit muscle actin SYSFVTTA
Chik, J. K. and D. C. Schriemer (2003) rabbit muscle actin TTAEREIV
Chik, J. K. and D. C. Schriemer (2003) rabbit muscle actin AEREIVRD
Chik, J. K. and D. C. Schriemer (2003) rabbit muscle actin KLCYVALD
Chik, J. K. and D. C. Schriemer (2003) rabbit muscle actin IGNERFRC
Chik, J. K. and D. C. Schriemer (2003) rabbit muscle actin QPSFIGME
Chik, J. K. and D. C. Schriemer (2003) rabbit muscle actin NSIMKCDI

Chik, J. K. and D. C. Schriemer (2003) rabbit muscle actin IMKCDIDI
Chik, J. K. and D. C. Schriemer (2003) rabbit muscle actin YANNVMSG
Chik, J. K. and D. C. Schriemer (2003) rabbit muscle actin KEITALAP
Chik, J. K. and D. C. Schriemer (2003) rabbit muscle actin PSTMKIKI
Chik, J. K. and D. C. Schriemer (2003) rabbit muscle actin YSVWIGGS
Chik, J. K. and D. C. Schriemer (2003) rabbit muscle actin FQQMWITK
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* SPEFGTGT
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* FGTDLAKE
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* GTDLAKEA
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* KEAKKVHQ
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* TVPAKRGT
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* IAEDATSY
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* YNVYAVID
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* NVYAVIDE
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* VIDENYKS
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* IDENYKSA
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* ENYKSATG
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* GKILYVEK
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* KILYVEKT
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* KTQFNKVA
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* KVAEVFHK
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* VAEVFHKY
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* YLDMEESY
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* MEESYVRE
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* EESYVREQ
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* QPNLKQVS
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* AKGNGITY
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* GNGITYAN
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* YANMMSIK
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* KKELEAAE
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* KELEAAEV
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* ELEAAEVK
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* LEAAEVKG
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* EAAEVKGI
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* KGIDFTTS
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* GIDFTTSP
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* SYPNGQFA
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* YPNGQFAS
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* PNGQFASS
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* ASSFIGLA
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* SFIGLAQL
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* IGLQALHE
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* LAQLHENE
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* QLHENEDG
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* KSLLGTSG
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* MESSLNLI
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* NSILAGTD
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* LAGTDGII
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* GIITYEKD
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* PGTEQVSQ
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* GTEQVSQR

Cravello, L., D. Lascoux, et al. (2003) PBP-2X* MDGKDVYT
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* DGKDVYTT
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* KDVYTTIS
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* LQSEMETQ
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* QSEMETQM
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* QMDAFQFK
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* FQFKVKGK
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* TATLVSAK
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* SAKTGEIL
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* GEILATTQ
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* LATTQRPT
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* TQRPTFDA
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* RPTFDADT
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* TKEGITED
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* ITEDFVWR
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* TEDFVWRD
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* RDILYQSN
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* GSTMKVMM
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* MKVMMLAA
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* KVMMLAAA
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* VMMLAAAI
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* IDNNTFPG
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* GEVFNSSE
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* NSSELKIA
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* IADATIRD
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* TIRDWDVN
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* ATIRDWDV
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* IRDWDVNE
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* EGLTGGRM
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* GGRMMTFS
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* GRMMTFSQ
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* MMTFSQGF
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* GMTLLEQK
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* DATWLDYL
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* ATWLDYLN
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* FGVPTRFG
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* PTRFGLTD
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* LTDEYAGQ
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* ADNIVNIA
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* VNIAQSSF
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* AFTAIAND
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* ANDGVMLE
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* NDGVMLEP
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* FISAIYDP
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* IYDPNDQT
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* DQTARKSQ
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* QTARKSQK
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* RKSQKEIV
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* KSQKEIVG
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* IVGNPVSK
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* NPVSKDAA

Cravello, L., D. Lascoux, et al. (2003) PBP-2X* AASLTRTN
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* SLTRTNMV
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* TNMVLVGT
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* NMVLVGTD
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* DPVYGTMY
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* YGTMYNHS
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* GTMYNHST
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* PGQNVALK
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* GTAQIADE
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* QIADEKNG
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* GGYLVGLT
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* VGLTDYIF
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* DYIFSAVS
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* YIFSAVSM
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* IFSAVSMS
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* AVSMSPAE
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* NPDFILYV
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* DFILYVTV
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* ILYVTVQQ
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* VTVQQPEH
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* YSGIQLGE
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* IQLGEFAN
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* LGEFANPI
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* ANPILERA
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* NPILERAS
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* ASAMKDSL
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* SLNLQTTA
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* LNLQTTAK
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* AKALEQVS
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* PGDLAEEL
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* GDLAEELR
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* LVQPIVVG
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* VGTGTKIK
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* TGTKIKNS
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* TKIKNSSA
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* AEEGKNLA
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* VLILSDKA
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* EEVPDMYG
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* PDMYGWTK
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* GWTKETAE
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* TAETLAKW
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* AETLAKWL
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* KWLNIELE
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* NIELEFQG
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* DVRANTAI
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* AIKDIKKI
Cravello, L., D. Lascoux, et al. (2003) PBP-2X* TLTLGD
Rist, W., T. J. Jorgensen, et al. (2003) σ32 MQSLALAP
Rist, W., T. J. Jorgensen, et al. (2003) σ32 VGNLDSYI
Rist, W., T. J. Jorgensen, et al. (2003) σ32 NLDSYIRA
Rist, W., T. J. Jorgensen, et al. (2003) σ32 WPMLSADE

Rist, W., T. J. Jorgensen, et al. (2003) σ32 DLEAAKTL
Rist, W., T. J. Jorgensen, et al. (2003) σ32 AAKTLILS
Rist, W., T. J. Jorgensen, et al. (2003) σ32 HLRFVVHI
Rist, W., T. J. Jorgensen, et al. (2003) σ32 QADLIQEG
Rist, W., T. J. Jorgensen, et al. (2003) σ32 NIGLMKAV
Rist, W., T. J. Jorgensen, et al. (2003) σ32 LMKAVRRF
Rist, W., T. J. Jorgensen, et al. (2003) σ32 PEVGVRLV
Rist, W., T. J. Jorgensen, et al. (2003) σ32 GVRLVSFA
Rist, W., T. J. Jorgensen, et al. (2003) σ32 RLVSFAVH
Rist, W., T. J. Jorgensen, et al. (2003) σ32 LVSFAVHW
Rist, W., T. J. Jorgensen, et al. (2003) σ32 EIHEYVLR
Rist, W., T. J. Jorgensen, et al. (2003) σ32 VLRNWRIV
Rist, W., T. J. Jorgensen, et al. (2003) σ32 KLFFNLRK
Rist, W., T. J. Jorgensen, et al. (2003) σ32 WFNQDEVE
Rist, W., T. J. Jorgensen, et al. (2003) σ32 NQDEVEMV
Rist, W., T. J. Jorgensen, et al. (2003) σ32 DEVEMVAR
Rist, W., T. J. Jorgensen, et al. (2003) σ32 EVEMVARE
Rist, W., T. J. Jorgensen, et al. (2003) σ32 ESRMAAQD
Rist, W., T. J. Jorgensen, et al. (2003) σ32 QDMTFDLS
Rist, W., T. J. Jorgensen, et al. (2003) σ32 DMTFDLSS
Rist, W., T. J. Jorgensen, et al. (2003) σ32 APVLYLQD
Rist, W., T. J. Jorgensen, et al. (2003) σ32 KSSNFADG
Rist, W., T. J. Jorgensen, et al. (2003) σ32 SSNFADGI
Rist, W., T. J. Jorgensen, et al. (2003) σ32 IEDDNWEE
Rist, W., T. J. Jorgensen, et al. (2003) σ32 RSQDIIRA
Rist, W., T. J. Jorgensen, et al. (2003) σ32 RARWLDED
Rist, W., T. J. Jorgensen, et al. (2003) σ32 KSTLQELA
Rist, W., T. J. Jorgensen, et al. (2003) σ32 LQELADRY
Rist, W., T. J. Jorgensen, et al. (2003) σ32 QELADRYG
Rist, W., T. J. Jorgensen, et al. (2003) σ32 YGVSAERV
Rist, W., T. J. Jorgensen, et al. (2003) σ32 QLEKNAMK
Wintrode, P. L., K. L. Friedrich, et al. (2003) HSP16.9 MSIVR
Wintrode, P. L., K. L. Friedrich, et al. (2003) HSP16.9 FADLWADP
Wintrode, P. L., K. L. Friedrich, et al. (2003) HSP16.9 PFDTFRSI
Wintrode, P. L., K. L. Friedrich, et al. (2003) HSP16.9 FDTFRSIV
Wintrode, P. L., K. L. Friedrich, et al. (2003) HSP16.9 ETAAFANA
Wintrode, P. L., K. L. Friedrich, et al. (2003) HSP16.9 NARMDWKE
Wintrode, P. L., K. L. Friedrich, et al. (2003) HSP16.9 ARMDWKET
Wintrode, P. L., K. L. Friedrich, et al. (2003) HSP16.9 AHVFKADL
Wintrode, P. L., K. L. Friedrich, et al. (2003) HSP16.9 GNVLVVSG
Wintrode, P. L., K. L. Friedrich, et al. (2003) HSP16.9 NDKWHRVE
Wintrode, P. L., K. L. Friedrich, et al. (2003) HSP16.9 SGKFVRRF
Wintrode, P. L., K. L. Friedrich, et al. (2003) HSP16.9 LLEDAKVE
Wintrode, P. L., K. L. Friedrich, et al. (2003) HSP16.9 KVEEVKAG
Wintrode, P. L., K. L. Friedrich, et al. (2003) HSP16.9 NGVLTVTV
Wintrode, P. L., K. L. Friedrich, et al. (2003) HSP16.9 EVKAIQIS
Hasan, A., D. L. Smith, et al. (2002) α-Crystallin αA MDVTIQ
Hasan, A., D. L. Smith, et al. (2002) α-Crystallin αA PSRLFDQF
Hasan, A., D. L. Smith, et al. (2002) α-Crystallin αA DQFFGEGL
Hasan, A., D. L. Smith, et al. (2002) α-Crystallin αA EGLFEYDL
Hasan, A., D. L. Smith, et al. (2002) α-Crystallin αA YDLLPFLS

Hasan, A., D. L. Smith, et al. (2002) α-Crystallin αA RQSLPRTV
Hasan, A., D. L. Smith, et al. (2002) α-Crystallin αA RTVLDSGI
Hasan, A., D. L. Smith, et al. (2002) α-Crystallin αA SGISGVRS
Hasan, A., D. L. Smith, et al. (2002) α-Crystallin αA RDKFVIFL
Hasan, A., D. L. Smith, et al. (2002) α-Crystallin αA FVIFLDVK
Hasan, A., D. L. Smith, et al. (2002) α-Crystallin αA IFLDVKHF
Hasan, A., D. L. Smith, et al. (2002) α-Crystallin αA PEDLTVKV
Hasan, A., D. L. Smith, et al. (2002) α-Crystallin αA QDDFVEIH
Hasan, A., D. L. Smith, et al. (2002) α-Crystallin αA ERQDDHGY
Hasan, A., D. L. Smith, et al. (2002) α-Crystallin αA DDHGYISR
Hasan, A., D. L. Smith, et al. (2002) α-Crystallin αA VDQSALSC
Hasan, A., D. L. Smith, et al. (2002) α-Crystallin αA QSALSCSL
Hasan, A., D. L. Smith, et al. (2002) α-Crystallin αA SCSLSADG
Hasan, A., D. L. Smith, et al. (2002) α-Crystallin αA DGMLTFCG
Hasan, A., D. L. Smith, et al. (2002) α-Crystallin αB MDIAIH
Hasan, A., D. L. Smith, et al. (2002) α-Crystallin αB PSRLFDQF
Hasan, A., D. L. Smith, et al. (2002) α-Crystallin αB RLFDQFFG
Hasan, A., D. L. Smith, et al. (2002) α-Crystallin αB FDQFFGEH
Hasan, A., D. L. Smith, et al. (2002) α-Crystallin αB ESDLFPTS
Hasan, A., D. L. Smith, et al. (2002) α-Crystallin αB PPSFLRAP
Hasan, A., D. L. Smith, et al. (2002) α-Crystallin αB SWFDTGLS
Hasan, A., D. L. Smith, et al. (2002) α-Crystallin αB GLSEMRLE
Hasan, A., D. L. Smith, et al. (2002) α-Crystallin αB KDRFSVNL
Hasan, A., D. L. Smith, et al. (2002) α-Crystallin αB SVNLDVKH
Hasan, A., D. L. Smith, et al. (2002) α-Crystallin αB PEELKVKV
Hasan, A., D. L. Smith, et al. (2002) α-Crystallin αB DVIEVHGK
Hasan, A., D. L. Smith, et al. (2002) α-Crystallin αB ITSSLSSD
Hasan, A., D. L. Smith, et al. (2002) α-Crystallin αB DGVLTVNG
Yan, X., H. Zhang, et al. (2002) rhM-CSFβ SGHLQSLQ
Yan, X., H. Zhang, et al. (2002) rhM-CSFβ LQSLQRLI
Yan, X., H. Zhang, et al. (2002) rhM-CSFβ DSQMETSC
Yan, X., H. Zhang, et al. (2002) rhM-CSFβ ITFEFVDQ
Yan, X., H. Zhang, et al. (2002) rhM-CSFβ KAFLLVQD
Yan, X., H. Zhang, et al. (2002) rhM-CSFβ DIMEDTMR
Yan, X., H. Zhang, et al. (2002) rhM-CSFβ TMRFRDNT
Yan, X., H. Zhang, et al. (2002) rhM-CSFβ TPNAIAIV
Yan, X., H. Zhang, et al. (2002) rhM-CSFβ NAIAIVQK
Yan, X., H. Zhang, et al. (2002) rhM-CSFβ IVQLQELS
Yan, X., H. Zhang, et al. (2002) rhM-CSFβ QLQELSLR
Yan, X., H. Zhang, et al. (2002) rhM-CSFβ ELSLRLKS
Yan, X., H. Zhang, et al. (2002) rhM-CSFβ CVRTFYET
Yan, X., H. Zhang, et al. (2002) rhM-CSFβ VRTFYETP
Yan, X., H. Zhang, et al. (2002) rhM-CSFβ PLQLLEKV
Yan, X., H. Zhang, et al. (2002) rhM-CSFβ KNLLDKDW
Yan, X., H. Zhang, et al. (2002) rhM-CSFβ WNIFSKNC
Yan, X., H. Zhang, et al. (2002) rhM-CSFβ CSSQDVVT
Yan, X., H. Zhang, et al. (2002) rhM-CSFβ CNCLYPKA
Yan, X., H. Zhang, et al. (2002) rhM-CSFβ VAQLTWED
Yan, X., H. Zhang, et al. (2002) rhM-CSFβ LTWEDSEG
Yan, X., H. Zhang, et al. (2002) rhM-CSFβ SSLLPGEQ
Yan, X., H. Zhang, et al. (2002) rhM-CSFβ HTVDPGSA

Hughes, C. A., J. G. Mandell, et al. (2001) CheB MSKIR
Hughes, C. A., J. G. Mandell, et al. (2001) CheB DSALMRQI
Hughes, C. A., J. G. Mandell, et al. (2001) CheB IMTEIINS
Hughes, C. A., J. G. Mandell, et al. (2001) CheB DMEMVATA
Hughes, C. A., J. G. Mandell, et al. (2001) CheB PLVARDLI
Hughes, C. A., J. G. Mandell, et al. (2001) CheB VARDLIKK
Hughes, C. A., J. G. Mandell, et al. (2001) CheB VLTLDVEM
Hughes, C. A., J. G. Mandell, et al. (2001) CheB LTLDVEMP
Hughes, C. A., J. G. Mandell, et al. (2001) CheB DGLDFLEK
Hughes, C. A., J. G. Mandell, et al. (2001) CheB GLDFLEKL
Hughes, C. A., J. G. Mandell, et al. (2001) CheB LDFLEKLM
Hughes, C. A., J. G. Mandell, et al. (2001) CheB LEKLMRLR
Hughes, C. A., J. G. Mandell, et al. (2001) CheB PVVMVSSL
Hughes, C. A., J. G. Mandell, et al. (2001) CheB ALELGAID
Hughes, C. A., J. G. Mandell, et al. (2001) CheB AIDFVTKP
Hughes, C. A., J. G. Mandell, et al. (2001) CheB EGMLAYSE
Hughes, C. A., J. G. Mandell, et al. (2001) CheB GMLAYSEM
Hughes, C. A., J. G. Mandell, et al. (2001) CheB YSEMIAEK
Hughes, C. A., J. G. Mandell, et al. (2001) CheB PTTLKAGP
Hughes, C. A., J. G. Mandell, et al. (2001) CheB SEKLIAIG
Hughes, C. A., J. G. Mandell, et al. (2001) CheB GGTEAIRH
Hughes, C. A., J. G. Mandell, et al. (2001) CheB GTEAIRHV
Hughes, C. A., J. G. Mandell, et al. (2001) CheB PLSSPAVI
Hughes, C. A., J. G. Mandell, et al. (2001) CheB PAVIITQH
Hughes, C. A., J. G. Mandell, et al. (2001) CheB PPGFTRSF
Hughes, C. A., J. G. Mandell, et al. (2001) CheB AERLNKLC
Hughes, C. A., J. G. Mandell, et al. (2001) CheB VLPGHAYI
Hughes, C. A., J. G. Mandell, et al. (2001) CheB PGHAYIAP
Hughes, C. A., J. G. Mandell, et al. (2001) CheB GHAYIAPG
Hughes, C. A., J. G. Mandell, et al. (2001) CheB HMELARSG
Hughes, C. A., J. G. Mandell, et al. (2001) CheB SGANYQIK
Hughes, C. A., J. G. Mandell, et al. (2001) CheB PSVDVLFH
Hughes, C. A., J. G. Mandell, et al. (2001) CheB VDVLFHSV
Hughes, C. A., J. G. Mandell, et al. (2001) CheB DVLFHSVA
Hughes, C. A., J. G. Mandell, et al. (2001) CheB GVILTGMG
Hughes, C. A., J. G. Mandell, et al. (2001) CheB EASCVVFG
Hughes, C. A., J. G. Mandell, et al. (2001) CheB MPREAINM
Hughes, C. A., J. G. Mandell, et al. (2001) CheB GVSEVVDL
Hughes, C. A., J. G. Mandell, et al. (2001) CheB VSEVVDLS
Hughes, C. A., J. G. Mandell, et al. (2001) CheB QMLAKISA
Wang, L., L. C. Lane, et al. (2001) BMV AIAGYSIS
Wang, L., L. C. Lane, et al. (2001) BMV ASSDAITA
Wang, L., L. C. Lane, et al. (2001) BMV SDAITAKA
Wang, L., L. C. Lane, et al. (2001) BMV ATNAMSIT
Wang, L., L. C. Lane, et al. (2001) BMV NAMSITLP
Wang, L., L. C. Lane, et al. (2001) BMV GRVLLWLG
Wang, L., L. C. Lane, et al. (2001) BMV RVLLWLGL
Wang, L., L. C. Lane, et al. (2001) BMV LLWLGLLP
Wang, L., L. C. Lane, et al. (2001) BMV LGLLPSVA
Wang, L., L. C. Lane, et al. (2001) BMV RIKACVAE
Wang, L., L. C. Lane, et al. (2001) BMV KACVAEKQ

Reference Protein P4 P3 P2 P1 P1'P2'P3'P4' Wang, L., L. C. Lane, et al. (2001) BMV AC VAEKQA Wang, L., L. C. Lane, et al. (2001) BMV AEAAFQVA Wang, L., L. C. Lane, et al. (2001) BMV EAAFQVAL Wang, L., L. C. Lane, et al. (2001) BMV AFQVALAV Wang, L., L. C. Lane, et al. (2001) BMV FQVALAVA Wang, L., L. C. Lane, et al. (2001) BMV QVALAVAD Wang, L., L. C. Lane, et al. (2001) BMV SSKEVVAA Wang, L., L. C. Lane, et al. (2001) BMV VAAMYTDA Wang, L., L. C. Lane, et al. (2001) BMV YTDAFRGA Wang, L., L. C. Lane, et al. (2001) BMV GDLLNLQI Wang, L., L. C. Lane, et al. (2001) BMV LLNLQIYL Wang, L., L. C. Lane, et al. (2001) BMV QIYL YASE Wang, L., L. C. Lane, et al. (2001) BMV PAKAVVHL Wang, L., L. C. Lane, et al. (2001) BMV VVHLEVEH Wang, L., L. C. Lane, et al. (2001) BMV FDDFFTPV Chen, J. and D. L. Smith (2000) GroEL RGVNVLAD Chen, J. and D. L. Smith (2000) GroEL KGRNVVLD Chen, J. and D. L. Smith (2000) GroEL DG VSVARE Chen, J. and D. L. Smith (2000) GroEL EIELEDKF Chen, J. and D. L. Smith (2000) GroEL GAQM VKEV Chen, J. and D. L. Smith (2000) GroEL ATVLAQAI Chen, J. and D. L. Smith (2000) GroEL VT AAVEEL Chen, J. and D. L. Smith (2000) GroEL ELKALSVP Chen, J. and D. L. Smith (2000) GroEL DSKAIAQV Chen, J. and D. L. Smith (2000) GroEL VGKLIAEA Chen, J. and D. L. Smith (2000) GroEL AEAMDKVG Chen, J. and D. L. Smith (2000) GroEL QDELDVVE Chen, J. and D. L. Smith (2000) GroEL DELDVVEG Chen, J. and D. L. Smith (2000) GroEL EGMQFDRG Chen, J. and D. L. Smith (2000) GroEL SPYFINKP Chen, J. and D. L. Smith (2000) GroEL NKPETGAV Chen, J. and D. L. Smith (2000) GroEL GAVELESP Chen, J. and D. L. Smith (2000) GroEL PFILLADK Chen, J. and D. L. Smith (2000) GroEL FILLADKK Chen, J. and D. L. Smith (2000) GroEL REMLPVLE Chen, J. and D. L. Smith (2000) GroEL LPVL EAVA Chen, J. and D. L. Smith (2000) GroEL KPLLIIAE Chen, J. and D. L. Smith (2000) GroEL IIAEDVEG Chen, J. and D. L. Smith (2000) GroEL GEALATLV Chen, J. and D. L. Smith (2000) GroEL LATLVVNT Chen, J. and D. L. 
Smith (2000) GroEL TMRGIVKV Chen, J. and D. L. Smith (2000) GroEL MLQDIATL Chen, J. and D. L. Smith (2000) GroEL ISEEIGME Chen, J. and D. L. Smith (2000) GroEL EEIGMELE Chen, J. and D. L. Smith (2000) GroEL KATLEDLG Chen, J. and D. L. Smith (2000) GroEL DTTTIIDG Chen, J. and D. L. Smith (2000) GroEL QIEEATSD Chen, J. and D. L. Smith (2000) GroEL KLQERVAK Chen, J. and D. L. Smith (2000) GroEL GGVAVIKV Chen, J. and D. L. Smith (2000) GroEL AATEVEMK Chen, J. and D. L. Smith (2000) GroEL EDALHATR 73

Reference Protein P4 P3 P2 P1 P1'P2'P3'P4' Chen, J. and D. L. Smith (2000) GroEL AVEEGVVA Chen, J. and D. L. Smith (2000) GroEL GVALIRVA Chen, J. and D. L. Smith (2000) GroEL ASKLADLR Chen, J. and D. L. Smith (2000) GroEL LADLRGQN Chen, J. and D. L. Smith (2000) GroEL KVALRAME Chen, J. and D. L. Smith (2000) GroEL PLRQIVLN Chen, J. and D. L. Smith (2000) GroEL ATEEYGNM Chen, J. and D. L. Smith (2000) GroEL YGNMIDMG Chen, J. and D. L. Smith (2000) GroEL MIDMGILD Chen, J. and D. L. Smith (2000) GroEL SALQ YAAS Chen, J. and D. L. Smith (2000) GroEL VAGLMITT Chen, J. and D. L. Smith (2000) GroEL TECMVTDL Engen, J. R., T. E. Smithgall, et al. (1999) Hck SH(3 + 2) GSEDIIVV Engen, J. R., T. E. Smithgall, et al. (1999) Hck SH(3 + 2) IVVALYDY Engen, J. R., T. E. Smithgall, et al. (1999) Hck SH(3 + 2) VVALYDYE Engen, J. R., T. E. Smithgall, et al. (1999) Hck SH(3 + 2) HHEDLSFQ Engen, J. R., T. E. Smithgall, et al. (1999) Hck SH(3 + 2) HEDLSFQK Engen, J. R., T. E. Smithgall, et al. (1999) Hck SH(3 + 2) DLSFQKGD Engen, J. R., T. E. Smithgall, et al. (1999) Hck SH(3 + 2) GDQMVVLE Engen, J. R., T. E. Smithgall, et al. (1999) Hck SH(3 + 2) SGEWWKAR Engen, J. R., T. E. Smithgall, et al. (1999) Hck SH(3 + 2) IPSIYVAR Engen, J. R., T. E. Smithgall, et al. (1999) Hck SH(3 + 2) PSIYVARV Engen, J. R., T. E. Smithgall, et al. (1999) Hck SH(3 + 2) SIYVARVD Engen, J. R., T. E. Smithgall, et al. (1999) Hck SH(3 + 2) YVARVDSL Engen, J. R., T. E. Smithgall, et al. (1999) Hck SH(3 + 2) VDSLETEE Engen, J. R., T. E. Smithgall, et al. (1999) Hck SH(3 + 2) SLETEEWF Engen, J. R., T. E. Smithgall, et al. (1999) Hck SH(3 + 2) TEEWFFKG Engen, J. R., T. E. Smithgall, et al. (1999) Hck SH(3 + 2) FFKGISRK Engen, J. R., T. E. Smithgall, et al. (1999) Hck SH(3 + 2) ARGNMLGS Engen, J. R., T. E. Smithgall, et al. (1999) Hck SH(3 + 2) NMLGSFMI Engen, J. R., T. E. Smithgall, et al. (1999) Hck SH(3 + 2) LGSFMIRD Engen, J. R., T. E. Smithgall, et al. (1999) Hck SH(3 + 2) GSFMIRDS Engen, J. R., T. E. 
Smithgall, et al. (1999) Hck SH(3 + 2) SYSLSVRD Engen, J. R., T. E. Smithgall, et al. (1999) Hck SH(3 + 2) DNGGFYIS Engen, J. R., T. E. Smithgall, et al. (1999) Hck SH(3 + 2) NGGFYISP Engen, J. R., T. E. Smithgall, et al. (1999) Hck SH(3 + 2) PRSTFSTL Engen, J. R., T. E. Smithgall, et al. (1999) Hck SH(3 + 2) FSTLQELV Engen, J. R., T. E. Smithgall, et al. (1999) Hck SH(3 + 2) NDGLCQKL Wang, F., W. Li, et al. (1999) cNTnC I YKAAVEQ Wang, F., W. Li, et al. (1999) cNTnC AVEQLTEE Wang, F., W. Li, et al. (1999) cNTnC KAAFDIFV Wang, F., W. Li, et al. (1999) cNTnC FDIFVLGA Wang, F., W. Li, et al. (1999) cNTnC IFVLGAED Wang, F., W. Li, et al. (1999) cNTnC EDGCISTK Wang, F., W. Li, et al. (1999) cNTnC VMRMLGQN Wang, F., W. Li, et al. (1999) cNTnC TPEELQEM Wang, F., W. Li, et al. (1999) cNTnC PEELQEMI Wang, F., W. Li, et al. (1999) cNTnC IDEVDEDG Resing, K. A. and N. G. Ahn (1998) human MKK1 KKKPTP I Q Resing, K. A. and N. G. Ahn (1998) human MKK1 ETNLEALQ Resing, K. A. and N. G. Ahn (1998) human MKK1 KLEELELD 74

Reference Protein P4 P3 P2 P1 P1'P2'P3'P4' Resing, K. A. and N. G. Ahn (1998) human MKK1 RLEAFLTQ Resing, K. A. and N. G. Ahn (1998) human MKK1 LEAFLTQK Resing, K. A. and N. G. Ahn (1998) human MKK1 KDDDFEKI Resing, K. A. and N. G. Ahn (1998) human MKK1 DDDFEKIS Resing, K. A. and N. G. Ahn (1998) human MKK1 KISELGAG Resing, K. A. and N. G. Ahn (1998) human MKK1 GVVFKVSH Resing, K. A. and N. G. Ahn (1998) human MKK1 GLVMARKL Resing, K. A. and N. G. Ahn (1998) human MKK1 LIHLEIKP Resing, K. A. and N. G. Ahn (1998) human MKK1 IIRELQVL Resing, K. A. and N. G. Ahn (1998) human MKK1 IVGFYGAF Resing, K. A. and N. G. Ahn (1998) human MKK1 YGAFYSDG Resing, K. A. and N. G. Ahn (1998) human MKK1 GEISICME Resing, K. A. and N. G. Ahn (1998) human MKK1 ISICMEHM Resing, K. A. and N. G. Ahn (1998) human MKK1 GGSLDQVL Resing, K. A. and N. G. Ahn (1998) human MKK1 VSIAVIKG Resing, K. A. and N. G. Ahn (1998) human MKK1 LTYLREKH Resing, K. A. and N. G. Ahn (1998) human MKK1 KPSNILVN Resing, K. A. and N. G. Ahn (1998) human MKK1 EIKLCDVG Resing, K. A. and N. G. Ahn (1998) human MKK1 SGQLIDSM Resing, K. A. and N. G. Ahn (1998) human MKK1 LIDSMANS Resing, K. A. and N. G. Ahn (1998) human MKK1 MANSFVGT Resing, K. A. and N. G. Ahn (1998) human MKK1 ANSFVGTR Resing, K. A. and N. G. Ahn (1998) human MKK1 GTHYSVQS Resing, K. A. and N. G. Ahn (1998) human MKK1 WSMGLSLV Resing, K. A. and N. G. Ahn (1998) human MKK1 GLSLVEMA Resing, K. A. and N. G. Ahn (1998) human MKK1 DAKELELM Resing, K. A. and N. G. Ahn (1998) human MKK1 ELELMFGC Resing, K. A. and N. G. Ahn (1998) human MKK1 ELMFGCQV Resing, K. A. and N. G. Ahn (1998) human MKK1 FGCQVEGD Resing, K. A. and N. G. Ahn (1998) human MKK1 EGDAAETP Resing, K. A. and N. G. Ahn (1998) human MKK1 MAIFELLD Resing, K. A. and N. G. Ahn (1998) human MKK1 ELLDYIVN Resing, K. A. and N. G. Ahn (1998) human MKK1 SGVFSLEF Resing, K. A. and N. G. Ahn (1998) human MKK1 FSLEFQDF Resing, K. A. and N. G. Ahn (1998) human MKK1 MVHAFIKR Resing, K. A. and N. G. 
Ahn (1998) human MKK1 D AEEVDFA Resing, K. A. and N. G. Ahn (1998) human MKK1 AGWLCSTI Resing, K. A. and N. G. Ahn (1998) human MKK1 LCSTIGLN Neubert, T. A., K. A. Walsh, et al. (1997) recoverin LSKEILEE Neubert, T. A., K. A. Walsh, et al. (1997) recoverin ILEELQLN Neubert, T. A., K. A. Walsh, et al. (1997) recoverin ELQLNTKF Neubert, T. A., K. A. Walsh, et al. (1997) recoverin EEELSSWY Neubert, T. A., K. A. Walsh, et al. (1997) recoverin LSSWYQSF Neubert, T. A., K. A. Walsh, et al. (1997) recoverin YQSFLKEC Neubert, T. A., K. A. Walsh, et al. (1997) recoverin QSFLKECP Neubert, T. A., K. A. Walsh, et al. (1997) recoverin QTIYSKFF Neubert, T. A., K. A. Walsh, et al. (1997) recoverin YSKFFPEA Neubert, T. A., K. A. Walsh, et al. (1997) recoverin D PKAYAQH Neubert, T. A., K. A. Walsh, et al. (1997) recoverin QHVFRSFD Neubert, T. A., K. A. Walsh, et al. (1997) recoverin DGTLDFKE Neubert, T. A., K. A. Walsh, et al. (1997) recoverin DF KEYV I A 75

Reference Protein P4 P3 P2 P1 P1'P2'P3'P4' Neubert, T. A., K. A. Walsh, et al. (1997) recoverin VIALHMTS Neubert, T. A., K. A. Walsh, et al. (1997) recoverin LEWAFSLY Neubert, T. A., K. A. Walsh, et al. (1997) recoverin AFSLYDVD Neubert, T. A., K. A. Walsh, et al. (1997) recoverin NEVLEIVT Neubert, T. A., K. A. Walsh, et al. (1997) recoverin EIVTAIFK Neubert, T. A., K. A. Walsh, et al. (1997) recoverin TAIFKMIS Neubert, T. A., K. A. Walsh, et al. (1997) recoverin IWGFFGKK Neubert, T. A., K. A. Walsh, et al. (1997) recoverin TEKEFIEG Neubert, T. A., K. A. Walsh, et al. (1997) recoverin ANKEILRL Neubert, T. A., K. A. Walsh, et al. (1997) recoverin ILRLIQFE Wang, F., J. S. Blanchard, et al. (1997) DHPR HDANIRVA Wang, F., J. S. Blanchard, et al. (1997) DHPR ANIRVAIA Wang, F., J. S. Blanchard, et al. (1997) DHPR IQAALALE Wang, F., J. S. Blanchard, et al. (1997) DHPR QAALALEG Wang, F., J. S. Blanchard, et al. (1997) DHPR GAALEREG Wang, F., J. S. Blanchard, et al. (1997) DHPR QSSLDAVK Wang, F., J. S. Blanchard, et al. (1997) DHPR KDDFDVFI Wang, F., J. S. Blanchard, et al. (1997) DHPR FDVFIDFT Wang, F., J. S. Blanchard, et al. (1997) DHPR HLAFCRQH Wang, F., J. S. Blanchard, et al. (1997) DHPR TTGFDEAG Wang, F., J. S. Blanchard, et al. (1997) DHPR AAADIAIV Wang, F., J. S. Blanchard, et al. (1997) DHPR AIVFAANF Wang, F., J. S. Blanchard, et al. (1997) DHPR ANFSVGVN Wang, F., J. S. Blanchard, et al. (1997) DHPR MLKLLEKA Wang, F., J. S. Blanchard, et al. (1997) DHPR DYTDIEII Wang, F., J. S. Blanchard, et al. (1997) DHPR GTALAMGE Wang, F., J. S. Blanchard, et al. (1997) DHPR GEAIAHAL Wang, F., J. S. Blanchard, et al. (1997) DHPR DC AVYSRE Wang, F., J. S. Blanchard, et al. (1997) DHPR TIGFATVR Wang, F., J. S. Blanchard, et al. (1997) DHPR HTAMFADI Wang, F., J. S. Blanchard, et al. (1997) DHPR GERLEITH Wang, F., J. S. Blanchard, et al. (1997) DHPR FANGAVRS Wang, F., J. S. Blanchard, et al. (1997) DHPR RSALWLSG Wang, F., J. S. Blanchard, et al. (1997) DHPR ESGLFDMR Wang, F., J. S. 
Blanchard, et al. (1997) DHPR DMRDVLDL Wang, F., J. S. Blanchard, et al. (1997) DHPR VLDLNNL Dharmasiri, K. and D. L. Smith (1996) horse heart cyt c KK I FVQKC Dharmasiri, K. and D. L. Smith (1996) horse heart cyt c HTVEKGGK Dharmasiri, K. and D. L. Smith (1996) horse heart cyt c GPNLHGL F Dharmasiri, K. and D. L. Smith (1996) horse heart cyt c HGL FGRKT Dharmasiri, K. and D. L. Smith (1996) horse heart cyt c APGFTYTD Dharmasiri, K. and D. L. Smith (1996) horse heart cyt c PGFTYTDA Dharmasiri, K. and D. L. Smith (1996) horse heart cyt c EETLMEYL Dharmasiri, K. and D. L. Smith (1996) horse heart cyt c ETLMEYLE Dharmasiri, K. and D. L. Smith (1996) horse heart cyt c TLMEYLEN Dharmasiri, K. and D. L. Smith (1996) horse heart cyt c LMEYLENP Dharmasiri, K. and D. L. Smith (1996) horse heart cyt c GTKM I FAG Dharmasiri, K. and D. L. Smith (1996) horse heart cyt c TKMI FAGI Dharmasiri, K. and D. L. Smith (1996) horse heart cyt c KM I FAG I K Dharmasiri, K. and D. L. Smith (1996) horse heart cyt c FAGI KKTE Dharmasiri, K. and D. L. Smith (1996) horse heart cyt c REDL I AYL 76

Reference Protein P4 P3 P2 P1 P1'P2'P3'P4' Dharmasiri, K. and D. L. Smith (1996) horse heart cyt c EDL I AYLK Dharmasiri, K. and D. L. Smith (1996) horse heart cyt c DL I AYLKK Dharmasiri, K. and D. L. Smith (1996) horse heart cyt c L I AYLKKA Zhang, Z., C. B. Post, et al. (1996) rabbit muscle aldolase HPALTPEQ Zhang, Z., C. B. Post, et al. (1996) rabbit muscle aldolase ELSDIAHR Zhang, Z., C. B. Post, et al. (1996) rabbit muscle aldolase KGILAADE Zhang, Z., C. B. Post, et al. (1996) rabbit muscle aldolase G S I A K R L Q Zhang, Z., C. B. Post, et al. (1996) rabbit muscle aldolase IGTENTEE Zhang, Z., C. B. Post, et al. (1996) rabbit muscle aldolase NTEENRRF Zhang, Z., C. B. Post, et al. (1996) rabbit muscle aldolase NRRFYRQL Zhang, Z., C. B. Post, et al. (1996) rabbit muscle aldolase RGLLLTAD Zhang, Z., C. B. Post, et al. (1996) rabbit muscle aldolase Q L L L T A D D Zhang, Z., C. B. Post, et al. (1996) rabbit muscle aldolase LLLTADDR Zhang, Z., C. B. Post, et al. (1996) rabbit muscle aldolase LTADDRVN Zhang, Z., C. B. Post, et al. (1996) rabbit muscle aldolase TADDRVNP Zhang, Z., C. B. Post, et al. (1996) rabbit muscle aldolase PCIGGVIL Zhang, Z., C. B. Post, et al. (1996) rabbit muscle aldolase G V I L F H E T Zhang, Z., C. B. Post, et al. (1996) rabbit muscle aldolase HETLYQKA Zhang, Z., C. B. Post, et al. (1996) rabbit muscle aldolase G V V G I K V D Zhang, Z., C. B. Post, et al. (1996) rabbit muscle aldolase LDGLSERC Zhang, Z., C. B. Post, et al. (1996) rabbit muscle aldolase DGLSERCA Zhang, Z., C. B. Post, et al. (1996) rabbit muscle aldolase DGADFAKW Zhang, Z., C. B. Post, et al. (1996) rabbit muscle aldolase RCVLKIGE Zhang, Z., C. B. Post, et al. (1996) rabbit muscle aldolase TPSALAIM Zhang, Z., C. B. Post, et al. (1996) rabbit muscle aldolase PSALAIME Zhang, Z., C. B. Post, et al. (1996) rabbit muscle aldolase SALAIMEN Zhang, Z., C. B. Post, et al. (1996) rabbit muscle aldolase AIMENANV Zhang, Z., C. B. Post, et al. 
(1996) rabbit muscle aldolase ANVLARYA Zhang, Z., C. B. Post, et al. (1996) rabbit muscle aldolase LARYASIC Zhang, Z., C. B. Post, et al. (1996) rabbit muscle aldolase ICQQNGIV Zhang, Z., C. B. Post, et al. (1996) rabbit muscle aldolase PI VEPE I L Zhang, Z., C. B. Post, et al. (1996) rabbit muscle aldolase RCQYVTEK Zhang, Z., C. B. Post, et al. (1996) rabbit muscle aldolase YVTEKVLA Zhang, Z., C. B. Post, et al. (1996) rabbit muscle aldolase EKVL AAVY Zhang, Z., C. B. Post, et al. (1996) rabbit muscle aldolase VL AAVYKA Zhang, Z., C. B. Post, et al. (1996) rabbit muscle aldolase L AAVYKAL Zhang, Z., C. B. Post, et al. (1996) rabbit muscle aldolase AAVYKALS Zhang, Z., C. B. Post, et al. (1996) rabbit muscle aldolase HIYLEGTL Zhang, Z., C. B. Post, et al. (1996) rabbit muscle aldolase YLEGTLLK Zhang, Z., C. B. Post, et al. (1996) rabbit muscle aldolase YSHEEIAM Zhang, Z., C. B. Post, et al. (1996) rabbit muscle aldolase EIAMATVT Zhang, Z., C. B. Post, et al. (1996) rabbit muscle aldolase M A T V T A L R Zhang, Z., C. B. Post, et al. (1996) rabbit muscle aldolase TVTALRRT Zhang, Z., C. B. Post, et al. (1996) rabbit muscle aldolase VTALRRTV Zhang, Z., C. B. Post, et al. (1996) rabbit muscle aldolase PAVTGVTF Zhang, Z., C. B. Post, et al. (1996) rabbit muscle aldolase SEEEAS I N Zhang, Z., C. B. Post, et al. (1996) rabbit muscle aldolase EEEAS I NL Zhang, Z., C. B. Post, et al. (1996) rabbit muscle aldolase SINLNAIN Zhang, Z., C. B. Post, et al. (1996) rabbit muscle aldolase NLNAINKC Zhang, Z., C. B. Post, et al. (1996) rabbit muscle aldolase W A L T F S Y G Zhang, Z., C. B. Post, et al. (1996) rabbit muscle aldolase ALQASALK 77

Reference                                Protein                  P4 P3 P2 P1 P1' P2' P3' P4'
Zhang, Z., C. B. Post, et al. (1996)     rabbit muscle aldolase   ASALKAWG
Zhang, Z., C. B. Post, et al. (1996)     rabbit muscle aldolase   AQEEYVKR
Zhang, Z., C. B. Post, et al. (1996)     rabbit muscle aldolase   QEEYVKRA
Zhang, Z., C. B. Post, et al. (1996)     rabbit muscle aldolase   ANSLACQG
Zhang, Z., C. B. Post, et al. (1996)     rabbit muscle aldolase   SESLFISN
Zhang, Z., C. B. Post, et al. (1996)     rabbit muscle aldolase   ISNHAY
Johnson, R. S. and K. A. Walsh (1994)    equine myoglobin         DGEWQQVL
Johnson, R. S. and K. A. Walsh (1994)    equine myoglobin         QVKNVWGK
Johnson, R. S. and K. A. Walsh (1994)    equine myoglobin         QEVLIRLF
Johnson, R. S. and K. A. Walsh (1994)    equine myoglobin         LIRLFTGH
Johnson, R. S. and K. A. Walsh (1994)    equine myoglobin         AEMKASED
Johnson, R. S. and K. A. Walsh (1994)    equine myoglobin         TVVLTALG
Johnson, R. S. and K. A. Walsh (1994)    equine myoglobin         PIKYLEFI
Johnson, R. S. and K. A. Walsh (1994)    equine myoglobin         YLEFISDA
Johnson, R. S. and K. A. Walsh (1994)    equine myoglobin         FISDAIIH
Johnson, R. S. and K. A. Walsh (1994)    equine myoglobin         ISDAIIHV
Johnson, R. S. and K. A. Walsh (1994)    equine myoglobin         MTKALELF
Johnson, R. S. and K. A. Walsh (1994)    equine myoglobin         ALELFRND

CHAPTER 3

EXPERIMENTAL DETERMINATION OF

PEPSIN SPECIFICITY AND THE EFFECTS OF pH

3.1 Introduction

The second part of this project involved the experimental research, the goal of

which was to increase the data set of pepsin cleavages. The literature search provided a

good starting point, but as stated previously, the amount of data produced was not enough

to accurately determine the trends in pepsin specificity.

Another goal of the experimental research was to determine if the specificity of

pepsin is affected by a change in pH. It is known that pepsin activity is directly affected

by pH (Jones and Landon 2002). Pepsin experiences its highest activity around pH 1.9.

It remains very active down to pH 1 and begins to lose activity around pH 5 (Brier, Maria

et al. 2007).

To address the two main goals of the experimental project, various proteins were digested with pepsin and analyzed using LC/MS/MS. The proteins were digested at three different pH conditions: pH 1.0, pH 2.5 and pH 4.0. In addition to this selection of proteins, whole cell lysate from the bacterium Escherichia coli (E.coli) was digested at the same pH points and analyzed using LC/MS/MS.

3.2 Instrumentation

Online digestion and sample analysis were performed using UPLC/ESI-MSE. This method was chosen because it allows for fast and accurate determination of peptides (Chakraborty, Berger et al. 2007).

3.2.1 UPLC and online pepsin digestion

UPLC was chosen for analysis over HPLC because it allows for a faster analysis time while still obtaining a very high-resolution separation (Swartz 2005). UPLC uses a smaller particle size (less than 2 μm diameter), which creates very high operating pressures, up to 15,000 psi. Another advantage of UPLC over HPLC is its greater sensitivity, which allows a smaller sample size to be used (Plumb, Castro-Perez et al. 2004; Swartz 2005).

Online pepsin digestion is preferred over a solution digest for several reasons. First, pepsin becomes more efficient as the enzyme to protein ratio increases, and much more pepsin can be packed into a column than is used in a solution digest, so the effective concentration of pepsin is greater (Wu, Kaveti et al. 2006). Second, because the efficiency of pepsin is increased, the time needed for digestion can be decreased: a solution digest that would take five minutes can be shortened to 20 seconds in a pepsin column (Jones and Landon 2002). Finally, online digestion can be automated. This was especially important for this project because of the number of samples that needed to be analyzed.

A schematic of the UPLC setup for online digestion can be found in Figure 3.1. The sample is injected into valve A. It travels through the loop and is then pushed through the pepsin column via the auxiliary solvent manager (ASM), which contains a pH buffer in water. Following digestion, the peptides are desalted and concentrated on the peptide trap. Valve B then switches the trap inline with the binary solvent manager (BSM). The BSM then makes up the needed mixture of A and B solvents, where A is 0.05% formic acid (FA) in water and B is 0.05% FA in acetonitrile (ACN), according to the desired gradient. The peptic peptides are then separated on a C18 analytical column. Following separation, the eluant flows into the mass spectrometer for analysis (Wang, Pan et al. 2002).

[Figure 3.1: schematic showing valves A and B, the auxiliary solvent manager (Pump A and Pump B), the pepsin column, the peptide trap, the binary solvent manager, the C18 column and the MS]

Figure 3.1. Schematic of online pepsin digestion adapted from Wang (2002). The sample is injected into the loop of valve A and is then pushed through the pepsin column with Pump A of the auxiliary solvent manager (ASM) containing pH buffer in H2O. The peptic peptides are then desalted and concentrated on the peptide trap. Valve B then switches to turn the trap inline with the binary solvent manager (BSM). The BSM creates the correct mixture of A and B solvents, where A is 0.05% FA in H2O and B is 0.05% FA in ACN, dependent on the gradient. The sample is then separated on a C18 column with the eluant flowing to the mass spectrometer. Pump B of the ASM delivers the reference material to the mass spectrometer.

3.2.2 Mass spectrometry

For these experiments the technique of MSE was used for mass spectral analyses. MSE is a data-independent method of performing MS and MS/MS analyses. With MSE, both fragment and precursor ion information can be obtained relatively quickly and with accurate mass in one analysis (Plumb, Johnson et al. 2006).

A diagram of the operation of MSE is displayed in Figure 3.2 (Plumb, Johnson et al. 2006). The peptides that are separated during the UPLC analysis are introduced into the mass spectrometer using an electrospray ionization (ESI) source and then enter the first quadrupole, where the peptides are separated. The peptic peptides then move into the collision cell, where they are subjected to either low or elevated collision energy. After this step they move into the second quadrupole, which is operated in V mode (Chakraborty, Berger et al. 2007).

[Figure 3.2: schematic of the instrument with numbered stages 1-7 and example MS and MSE spectra]

Figure 3.2. Schematic of the operation of MSE adapted from Plumb (2006). Peptides are separated during the UPLC separation (1), in this example two peptides co-elute, and are introduced into the mass spectrometer via an ESI source (2). From there the peptides are separated in the first quadrupole (3), then move into the collision cell (4) and then into the second quadrupole (5), which is operated in V mode. The collision energy in the gas cell alternates between low and high energy. At low energy, precursor ion information is obtained, yielding a classic mass spectrum (6). At the elevated energy, information is collected for all of the fragment ions (7).

The collision energy in the gas cell continuously alternates between low and elevated energies at a specified rate. At the low collision energy a classic MS spectrum is obtained, generating information about the precursor ions. At the elevated collision energy, information about the fragment ions is obtained: the high voltage produces both the precursor ions and their fragments. Therefore, throughout one analysis, information is constantly being gathered about both the precursor and fragment ions. During data processing, the precursor ions are matched up with their fragment ions based on their common retention times (Chakraborty, Berger et al. 2007).
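The retention-time matching of precursor and fragment ions can be illustrated with a minimal Python sketch. This is not the algorithm used by the instrument software; the tolerance value, the tuple layout, the function name and the example ions are all assumptions made for illustration.

```python
def assign_fragments(precursors, fragments, rt_tolerance=0.05):
    """Group fragment ions with precursors whose retention times agree.

    precursors, fragments: lists of (mz, retention_time_min) tuples.
    rt_tolerance: hypothetical matching window in minutes.
    Returns a dict mapping each precursor tuple to a list of fragment m/z values.
    """
    assignments = {prec: [] for prec in precursors}
    for frag_mz, frag_rt in fragments:
        for prec in precursors:
            # A fragment is attached to every precursor eluting within the window.
            if abs(frag_rt - prec[1]) <= rt_tolerance:
                assignments[prec].append(frag_mz)
    return assignments


# Hypothetical ions: two precursors eluting about 0.3 min apart.
precursors = [(825.42, 3.10), (510.74, 3.40)]
fragments = [(175.12, 3.11), (262.15, 3.09), (147.11, 3.41)]
result = assign_fragments(precursors, fragments)
```

Under this simple rule, fragments of precursors that truly co-elute would be assigned to both precursors, so a real implementation needs additional criteria to separate them.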

3.3 Materials and Methods

3.3.1 Protein sample analysis

The proteins used for these experiments are commercially available from Sigma-Aldrich or were obtained in-house. A list of the proteins used for digestion can be found in Table 3.1. The proteins encompass a wide range of sizes, from 8.6 kDa (Ubiquitin) to 68.0 kDa (Amyloglucosidase).

3.3.1.1 Protein sample preparation

All of the proteins were dissolved in 50 mM TRIS buffer, pH 8.315, to a concentration of 10 μM. These stock solutions were kept in a -20 °C freezer. 100 μL of each protein stock solution was diluted into 100 μL of TRIS buffer to a concentration of 5 μM. Abelson tyrosine kinase (Abl), Nef (a small HIV accessory protein) and Ubiquitin were heated at 60 °C for ten minutes to help reduce the disulfide bonds present. A small amount of 1 M guanidine-HCl was also added to all samples to aid in digestion. For pH 1.0 analysis the samples were adjusted to pH 1.0 by adding 0.1 M HCl. The samples were also analyzed at pH 2.5 and 4.0. Fresh aliquots of sample were adjusted to these pH points by adding a buffer consisting of 50 mM potassium phosphate monobasic and 50 mM potassium phosphate dibasic for pH 1.0 and 2.5 analyses, and 50 mM citric acid for pH 4.0 analysis.

Table 3.1. Proteins used for digestion

Protein                          SwissProt ID and Accession Number   Molecular Weight (kDa)   Number of Residues
Abelson tyrosine kinase (Abl)*                                       57.4                     501
Albumin                          Albu_Bovin (P02769)                 66.0                     583
Aldolase                         Aldoa_Rabit (P00883)                39.5                     364
Amyloglucosidase                 Amyg_Aspng (P69328)                 68.0                     640
β-Lactoglobulin                  Lacb_Bovin (P02754)                 18.4                     162
Hemoglobin                       Hba_Bovin (P01966)                  64.5                     576
                                 Hbb_Bovin (P02070)
                                 Hbe2_Bovin (P06642)
                                 Hbe4_Bovin (P06643)
Myoglobin                        Myg_Horse (P68082)                  17.0                     154
Nef*                                                                 25.5                     226
Ubiquitin                        Ubiq_Bovin (P62990)                 8.6                      76

* Proteins obtained in-house, all others available from Sigma-Aldrich
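The dilution in section 3.3.1.1 and the 50 pmol load quoted in section 3.3.1.2 can be checked with a short Python sketch. The function names are hypothetical, and the calculation ignores the small volumes of guanidine-HCl and acid or buffer added afterwards, which would dilute the sample slightly further.

```python
def diluted_concentration(stock_conc_um, stock_vol_ul, diluent_vol_ul):
    """Concentration (uM) after mixing stock with diluent (C1*V1 = C2*V2)."""
    return stock_conc_um * stock_vol_ul / (stock_vol_ul + diluent_vol_ul)


def amount_pmol(conc_um, volume_ul):
    """Amount in pmol: 1 uM equals 1 pmol/uL, so pmol = uM * uL."""
    return conc_um * volume_ul


# 100 uL of 10 uM stock diluted into 100 uL of TRIS buffer.
working_conc = diluted_concentration(10.0, 100.0, 100.0)
# 10 uL of the working solution loaded for online digestion.
loaded = amount_pmol(working_conc, 10.0)
```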

3.3.1.2 UPLC methods

Online pepsin digestion and UPLC separation were performed on a Waters nanoAcquity UPLC system. All samples were analyzed in triplicate. Pepsin digestion was carried out using a 2.1 mm x 50 mm stainless steel column (Alltech) packed with pepsin immobilized onto POROS-20R2 support (Applied Biosystems), and online digestion was performed essentially as described (Wang, Pan et al. 2002). 10 μL of protein (50 pmol) was loaded into a 20 μL loop and pushed through the pepsin column by pump A of the ASM (see Table 3.2 for solvents used) at a flow rate of 100 μL/min. Digestion, trapping and desalting took a total of 3 minutes. A VanGuard BEH C18 trap (Waters) containing 1.7 μm particles was used. The switch valve then moved to align the peptide trap with the BSM. Chromatography was then carried out using a nanoAcquity Symmetry C18, 180 μm x 20 mm column (Waters) containing 5 μm particles. The elution gradient was 8% to 40% acetonitrile (ACN) in 6 minutes followed by a ramp to 85% ACN over 30 seconds with a flow rate of 40 μL/min. 0.05% formic acid (FA) was added to both mobile phases. The sample manager was held at a temperature of 4 °C, the pepsin column was at room temperature and the analytical column was held at 35 °C.
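The elution gradient can be written as a piecewise function of time. The sketch below assumes that both segments are linear and that the composition is held constant before the start and after the end of the programme; neither detail is stated in the text, and the function name is hypothetical.

```python
def percent_acn(t_min):
    """Percent ACN at time t_min (minutes) for the protein-digest gradient.

    8% -> 40% over 6 min, then 40% -> 85% over the next 0.5 min.
    Linear segments and constant hold outside 0-6.5 min are assumptions.
    """
    if t_min <= 0:
        return 8.0
    if t_min <= 6.0:
        return 8.0 + (40.0 - 8.0) * t_min / 6.0
    if t_min <= 6.5:
        return 40.0 + (85.0 - 40.0) * (t_min - 6.0) / 0.5
    return 85.0


midpoint = percent_acn(3.0)      # halfway through the first segment
ramp_middle = percent_acn(6.25)  # halfway through the 30 second ramp
```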

Table 3.2. Auxiliary solvents

pH     Buffer
1.0    17% FA in H2O
2.5    0.05% FA in H2O
4.0    50 mM citric acid in H2O*

* Buffer adjusted to pH 4 using 0.01 N HCl

3.3.1.3 Mass analysis

All mass spectral analyses were carried out using a Waters Synapt HDMS system equipped with a standard ESI source. All measurements were taken in positive ion V mode with a capillary voltage of 3.2 kV, cone voltage of 37 V, source temperature of 100 °C, desolvation temperature of 250 °C and desolvation gas flow of 600 L/hour. The low collision energy was 5 eV and the elevated collision energy was 25 eV. Each 0.4 second scan spanned an m/z range of 50-1700 with an interscan delay time of 0.02 seconds. To ensure accurate mass, all analyses were carried out using Waters LockSpray technology. (Glu1)-fibrinopeptide B (GFP) was used as a reference. The 250 fmol/μL solution of GFP was delivered to the mass spectrometer at a rate of 2.5 μL/min using pump B of the ASM. Sampling of the reference solution was performed every 20 seconds.

3.3.2 E.coli sample analysis

E.coli whole cell lysate was obtained from Waters Corporation. The sample was prepared from E.coli strain DH5alpha (Q-BIOgene) as described by Millea, Krull et al. (2006). The sample was measured to have a final protein concentration of 95 mg/mL using a modified Coomassie dye-binding assay (BioRad).

3.3.2.1 E.coli sample preparation

10.4 μL of E.coli lysate was added to 187.5 μL of H2O and 6.2 μL of DTT. The solution was then heated at 60 °C for 10 minutes. Following heating, the sample was cooled back to room temperature and briefly centrifuged. 6.2 μL of iodoacetic acid was then added, and the sample was vortexed and then allowed to sit in the dark at room temperature for 30 minutes. After this incubation, 40 μL of H2O was added to the sample, followed by 1 M guanidine HCl. The sample was then adjusted to the appropriate pH as described in section 3.3.1.1. Finally, the sample was vortexed for about one minute.

3.3.2.2 UPLC analysis

Online digestion and UPLC separation were performed essentially as described in section 3.3.1.2. Changes made to the methods include a total digestion, trapping and desalting time of 18 minutes with a flow rate of 15 μL/min. The elution gradient was also changed to 8% to 40% ACN in 90 minutes followed by a ramp to 85% ACN over 6 minutes with a flow rate of 40 μL/min. All other parameters remained unchanged.

3.3.2.3 Mass analysis

Mass spectral analyses were carried out as described in section 3.3.1.3 with the only change being that the elevated collision energy was set at 15 eV.

3.4 Data Analysis

3.4.1 Software processing

Acquired data were processed using Waters ProteinLynx Global Server software (PLGS version 2.3). The low energy data and elevated energy data were lockmass corrected, and the fragment ions were aligned with their corresponding precursor ions based on their retention times. For the individual proteins, the processed data were searched against the known protein sequence. The processed E.coli data were searched against a database of over 8700 proteins found in E.coli. The search was set for a non-specific enzyme in order to identify the maximum number of peptides, allowing for one missed cleavage and no modifications in the search parameters.

3.4.2 Peptide analysis

All peptides generated from the software search were further refined to ensure accuracy by inspecting the MS/MS data for each resulting peptide. Example MS/MS data can be found in Figures 3.3 and 3.4. These examples show two different peptides, identified from hemoglobin and myoglobin respectively. In the spectra, the red peaks represent y-ion fragments while the blue peaks represent b-ion fragments. The green peaks represent neutral loss of water or ammonia, and immonium ions. All remaining peaks are grey and were not assigned to one of the above groups. Other ion types and internal fragments are not labeled.

For a peptide to be considered reliable, it must have been found in at least two out of the three replicates. In addition, the standard deviation of its retention time must have been less than ten seconds, and a minimum of five fragment ions must have been generated. Once the reliable peptides were established, they were mapped onto their corresponding protein sequence. Peptic peptide maps of the nine proteins analyzed at pH 2.5 can be found in Figures 3.5-3.13 at the end of this chapter. With the exception of β-lactoglobulin and amyloglucosidase, the sequence coverage for the proteins was very high.
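The three reliability criteria can be expressed as a simple filter. The Python sketch below is an illustration only: the data layout and function name are hypothetical, the sample standard deviation is assumed, and the fragment-ion minimum is assumed to apply to every replicate in which the peptide was observed, which the text does not specify.

```python
from statistics import stdev


def is_reliable(observations, min_replicates=2, max_rt_sd_s=10.0, min_fragments=5):
    """Apply the replicate, retention-time and fragment-ion criteria.

    observations: one dict per replicate in which the peptide was found,
    each with 'rt_s' (retention time in seconds) and 'n_fragments'.
    """
    if len(observations) < min_replicates:
        return False
    rts = [obs["rt_s"] for obs in observations]
    # stdev needs at least two values; a single value has no spread.
    rt_sd = stdev(rts) if len(rts) > 1 else 0.0
    if rt_sd >= max_rt_sd_s:
        return False
    # Assumption: the fragment minimum must hold in every observation.
    return all(obs["n_fragments"] >= min_fragments for obs in observations)


# Hypothetical observations for four peptides.
good = [{"rt_s": 200.0, "n_fragments": 7}, {"rt_s": 204.0, "n_fragments": 6}]
single = [{"rt_s": 200.0, "n_fragments": 9}]
drifting = [{"rt_s": 200.0, "n_fragments": 7}, {"rt_s": 230.0, "n_fragments": 7}]
sparse = [{"rt_s": 200.0, "n_fragments": 3}, {"rt_s": 201.0, "n_fragments": 8}]
```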

90

Figure 3.3. Example MS/MS data

Protein: Hemoglobin Peptide identified: ASHLPSDFTPAVHASL Peptide mass: 1649.8282 Intensity: 465,874

Spectral identification key: Red peaks – y-ions Blue pea ks – b-ions Green peaks – neutral loss of water and ammonia and immonium ions Grey peaks – not assigned to y-ions, b-ions, or neutral loss fragments. Other ion types and internal fragments are not labeled. 91

Figure 3.4. Example MS/MS data

Protein: Myoglobin
Peptide identified: GADAQGAMTKA
Peptide mass: 1020.4761
Intensity: 22,189

Spectral identification key:
Red peaks – y-ions
Blue peaks – b-ions
Green peaks – neutral loss of water and ammonia and immonium ions
Grey peaks – not assigned to y-ions, b-ions, or neutral loss fragments.
Other ion types and internal fragments are not labeled.

3.5 Results

Digestion of the nine proteins at pH 1.0 generated 582 peptides (see Table 3.3 at end of chapter) at 695 cleavage points. The pH 2.5 protein digestions produced 482 peptides (shown in Figures 3.5-3.13 at end of chapter) with 591 different cleavage points, and the pH 4.0 digestions yielded 235 peptides (see Table 3.4 at end of chapter) and 366 cleavage points. The E.coli digest at pH 2.5 produced 991 peptides (see Table 3.5 at end of chapter) at 1368 different cleavage points, and 49 proteins were identified in the E.coli lysate. In total, the digests performed at pH 2.5 yielded 1473 peptides and 1959 cleavage points.
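As a quick consistency check, the pH 2.5 totals quoted above are the sums of the nine-protein digest and the E.coli lysate digest. The snippet below is a sketch of that arithmetic only; the dictionary layout and variable names are hypothetical.

```python
# Counts quoted in the text for the two digests performed at pH 2.5.
digests_ph25 = {
    "nine proteins": {"peptides": 482, "cleavage_points": 591},
    "E.coli lysate": {"peptides": 991, "cleavage_points": 1368},
}

total_peptides = sum(d["peptides"] for d in digests_ph25.values())
total_points = sum(d["cleavage_points"] for d in digests_ph25.values())

print(total_peptides, total_points)  # 1473 1959
```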

Although 49 proteins were identified in the E.coli lysate digest, this number was much lower than expected, as E.coli contains thousands of proteins. Further optimization of the digestion and chromatography conditions could increase the number of proteins and peptides identified. For example, a longer digestion time and a longer gradient may allow more proteins and peptides to be detected.

After the peptic peptide maps were constructed as described in section 3.4, the data were treated as described in sections 2.3-2.5. A database of cleavages was produced, followed by data normalization and then the creation of a cleavage data map for each pH. The cleavage data maps, along with the corresponding matrices of normalized data representing the probability of a cleavage occurring between two specific amino acids, can be found in Figures 3.14-3.16 and Tables 3.6-3.8 for pH 1.0, 2.5 and 4.0, respectively.
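To illustrate how a database of cleavages can be derived from mapped peptides, the sketch below locates each peptide in its protein sequence and records the residue pair (P1, P1') at each end. It is an assumption about how such a tabulation could be done, not the procedure of sections 2.3-2.5; the function and constant names are hypothetical. The hemoglobin alpha sequence is taken from Figure 3.10 and the three peptides from Table 3.3.

```python
# Hemoglobin alpha chain as printed in Figure 3.10.
HEMOGLOBIN_ALPHA = (
    "MVLSAADKGNVKAAWGKVGGHAAEYGAEALERMFLSFPTTKTYFPHFDLS"
    "HGSAQVKGHGAKVAAALTKAVEHLDDLPGALSELSDLHAHKLRVDPVNFK"
    "LLSHSLLVTLASHLPSDFTPAVHASLDKFLANVSTVLTSKYR"
)


def cleavage_sites(protein, peptide):
    """Return (P1 position, P1, P1') for each end of the peptide.

    Positions are 1-based. The protein termini themselves are not counted
    as cleavage sites. The first occurrence of the peptide is used.
    """
    start = protein.find(peptide)          # 0-based index of first residue
    if start == -1:
        raise ValueError(f"{peptide} not found in protein sequence")
    end = start + len(peptide)             # 0-based index just past the peptide
    sites = []
    if start > 0:                          # cleavage before the peptide
        sites.append((start, protein[start - 1], protein[start]))
    if end < len(protein):                 # cleavage after the peptide
        sites.append((end, protein[end - 1], protein[end]))
    return sites


def distinct_cleavage_points(protein, peptides):
    """Collect each cleavage point once, even if several peptides share it."""
    points = set()
    for peptide in peptides:
        points.update(cleavage_sites(protein, peptide))
    return sorted(points)


peptides = ["ASHLPSDFTPAVHASL", "LASHLPSDFTPAVHASL", "VTLASHLPSDFTPAVHASL"]
for point in distinct_cleavage_points(HEMOGLOBIN_ALPHA, peptides):
    print(point)
# (107, 'L', 'V')
# (109, 'T', 'L')
# (110, 'L', 'A')
# (126, 'L', 'D')
```

The three peptides share the same C-terminal cleavage after L126, so six peptide ends give only four distinct cleavage points, which is why the number of cleavage points is reported separately from the number of peptides.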

Figure 3.14. Cleavage data map pH 1.0 (axes: P1 and P1’)

Table 3.6. Normalized pH 1.0 cleavage data (rows: P1; columns: P1’)

     A    C    D    E    F    G    H    I    K    L    M    N    P    Q    R    S    T    V    W    Y
A  1.6  0.9  0.7  0.3  0.9  0.8  0.6  1.7  0.7  1.1  2.7  1.0  0.0  0.8  1.1  1.3  0.7  1.5  1.5  1.5
C  0.0  0.0  0.5  2.3  0.0  0.0  0.0  0.0  0.0  1.5  0.0  0.0  1.1  1.8  0.0  0.0  1.0  0.0  0.0  1.5
D  0.8  0.0  0.3  1.0  1.7  0.3  0.0  2.0  0.5  0.8  0.0  0.0  0.4  0.0  0.0  0.3  0.5  0.0  0.0  2.3
E  1.4  0.6  0.0  1.4  2.0  0.5  2.3  0.5  0.5  1.7  4.5  1.1  0.9  0.6  0.5  1.3  1.2  0.8  2.3  2.1
F  2.7  0.0  1.2  0.0  0.9  1.4  0.0  2.7  2.0  1.8  0.0  1.5  1.3  1.7  0.8  1.9  1.9  2.6  3.0  2.3
G  0.6  0.0  0.0  0.2  1.8  0.3  2.6  0.9  0.8  0.8  0.0  0.9  0.9  0.6  1.0  1.0  0.7  0.7  3.0  1.1
H  0.4  1.1  0.0  2.3  0.0  1.4  0.6  1.5  0.0  1.7  0.0  0.0  0.6  0.0  0.0  1.3  0.0  0.6  0.0  0.0
I  0.7  0.0  0.0  0.0  2.3  0.6  0.0  1.5  1.0  1.1  4.5  1.1  0.0  1.1  1.5  0.8  1.3  0.7  0.0  0.0
K  0.2  1.5  1.0  0.5  0.4  1.5  1.5  0.6  0.6  1.0  0.0  0.0  0.5  0.6  0.9  0.6  0.3  0.9  0.9  0.8
L  1.6  1.8  1.2  2.4  3.2  0.7  1.5  2.9  1.7  2.3  0.0  1.9  1.5  1.3  1.7  2.0  1.4  1.7  0.9  2.7
M  1.1  0.0  2.3  1.9  4.5  2.3  0.0  2.3  1.8  0.0  0.0  0.0  2.3  2.3  3.0  0.0  1.1  4.5  0.0  0.0
N  1.7  1.5  0.0  0.0  2.8  0.3  0.0  0.0  1.5  0.5  0.0  0.0  1.0  0.0  0.8  0.0  1.1  1.7  0.0  0.9
P  1.8  0.0  0.5  0.8  1.5  1.4  1.5  0.0  0.4  0.9  2.3  0.0  0.0  0.6  2.3  0.4  0.0  2.3  0.0  2.3
Q  1.4  1.5  0.0  1.7  0.0  0.5  0.0  0.0  0.4  0.5  4.5  0.0  0.0  0.9  0.0  0.0  0.9  0.8  0.0  1.5
R  1.8  0.9  1.8  0.5  0.0  0.0  4.5  1.5  2.3  0.3  0.0  1.0  1.5  0.0  0.5  0.0  0.0  0.9  0.0  0.0
S  0.3  0.0  0.7  0.7  0.8  1.0  1.5  0.6  0.0  0.9  0.0  0.6  0.6  0.0  0.6  0.7  0.5  0.0  0.8  0.0
T  0.7  0.0  0.6  0.4  0.5  0.7  1.1  0.0  0.4  0.6  0.9  4.5  1.3  0.5  0.0  0.5  0.4  1.3  0.0  0.8
V  0.9  0.0  1.2  0.8  0.0  1.9  0.0  2.3  0.6  1.6  1.5  2.3  0.8  0.0  1.5  1.0  1.2  0.5  3.4  1.8
W  0.0  1.5  0.0  0.0  0.0  0.6  0.0  0.0  0.0  3.0  0.0  0.0  1.5  1.1  0.0  0.0  0.8  2.3  0.0  1.5
Y  1.8  0.0  0.0  0.6  0.8  0.9  1.5  1.1  1.9  1.7  0.0  0.0  0.0  0.0  1.1  0.9  0.0  1.9  0.0  2.3

Figure 3.15. Cleavage data map pH 2.5 (axes: P1 and P1’)

Table 3.7. Normalized pH 2.5 cleavage data (rows: P1; columns: P1’)

     A    C    D    E    F    G    H    I    K    L    M    N    P    Q    R    S    T    V    W    Y
A  1.4  2.2  0.8  0.7  0.4  0.6  1.1  2.2  0.9  1.3  0.4  0.7  1.0  1.0  0.8  1.1  1.2  1.3  1.2  1.0
C  0.0  0.0  0.0  0.0  1.3  0.0  0.0  1.9  0.0  0.0  1.6  0.0  0.6  0.0  1.3  0.0  1.2  0.8  0.0  1.9
D  1.4  1.6  0.2  0.7  1.6  0.4  0.0  1.5  0.7  1.6  1.3  0.7  0.5  0.0  1.1  0.5  0.2  1.5  1.3  1.1
E  1.5  0.0  0.7  1.4  2.4  0.7  1.3  2.1  1.3  1.1  1.8  1.0  0.8  0.6  1.5  0.7  0.7  2.1  1.0  2.3
F  2.9  2.6  2.7  0.8  1.6  2.1  3.1  3.0  1.4  1.8  2.6  2.2  1.8  1.8  2.1  1.8  1.8  3.3  1.1  2.4
G  0.5  0.0  0.1  1.0  1.5  0.5  0.9  1.0  0.9  0.4  0.6  0.5  0.5  0.8  0.5  0.8  0.5  0.8  3.0  0.5
H  0.7  1.9  0.0  1.7  0.0  1.4  0.0  1.2  0.3  0.4  0.0  0.0  1.2  0.0  0.0  1.0  1.5  1.9  7.8  1.0
I  1.0  2.2  0.6  0.4  1.6  1.1  0.6  1.0  0.6  0.9  1.3  0.4  0.2  0.4  0.3  0.9  0.8  0.7  0.0  0.6
K  0.5  1.6  0.1  0.8  0.4  0.8  0.5  0.5  0.5  0.2  0.4  0.3  0.5  0.6  0.9  0.9  0.6  0.8  0.5  0.5
L  1.7  0.6  2.9  1.8  3.5  1.2  2.0  2.4  2.0  2.1  1.6  2.2  0.8  1.9  1.9  1.5  2.1  2.2  1.8  2.6
M  1.8  0.0  0.7  1.8  1.9  1.3  3.9  3.4  0.5  1.8  0.0  2.3  1.7  1.6  1.9  1.8  0.7  1.7  3.9  0.0
N  0.8  0.0  0.2  0.8  1.8  0.3  0.0  1.3  0.7  1.0  0.5  0.8  0.8  0.0  1.6  0.6  0.0  1.9  1.6  1.8
P  0.6  0.0  0.2  0.3  0.7  0.7  0.0  0.5  0.6  0.4  0.0  0.3  1.9  0.5  0.6  0.0  0.9  0.6  0.0  1.2
Q  1.7  0.0  0.4  0.6  1.2  0.7  0.0  1.1  0.6  0.6  0.0  1.5  0.3  1.6  0.0  0.8  0.8  0.8  0.0  2.7
R  0.6  0.9  0.0  0.9  0.3  0.5  0.8  0.5  1.2  0.3  0.5  0.8  1.0  0.7  0.4  0.8  0.2  0.4  0.0  0.7
S  1.2  0.0  0.2  1.5  0.8  0.7  0.9  0.7  0.6  0.7  0.8  0.5  1.1  0.9  1.1  0.4  0.0  1.1  1.7  0.5
T  1.1  1.0  0.3  1.2  2.2  0.5  0.8  1.6  0.8  1.0  0.5  0.6  1.1  0.5  1.1  1.0  0.5  1.3  0.6  0.3
V  0.7  0.0  1.0  1.0  1.1  0.7  0.0  1.6  0.6  0.7  1.1  0.4  0.7  0.7  0.8  0.9  0.8  0.9  1.3  1.0
W  0.6  2.6  1.2  0.0  0.0  0.0  1.6  2.6  0.0  0.5  2.6  0.0  0.0  5.5  0.0  3.9  1.4  1.9  3.9  5.2
Y  0.5  3.9  1.2  0.5  1.2  0.8  0.0  0.4  1.2  1.2  1.9  0.5  1.0  0.0  0.5  0.0  0.7  2.9  0.0  0.6

Figure 3.16. Cleavage data map pH 4.0 (axes: P1 and P1’)

Table 3.8. Normalized pH 4.0 cleavage data (rows: P1; columns: P1’)

     A    C    D    E    F    G    H    I    K    L    M    N    P    Q    R    S    T    V    W    Y
A  2.0  0.0  1.0  0.9  0.0  0.0  0.9  1.8  1.2  2.2  0.0  3.1  1.4  1.6  0.7  1.8  0.0  1.2  0.0  0.0
C  1.0  0.9  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  1.8  1.4  0.0  0.0  1.2  0.0  0.0  0.0
D  3.5  0.0  0.6  1.1  1.0  0.9  0.0  0.9  0.4  0.0  0.0  0.0  1.8  0.0  0.0  1.4  1.2  0.0  0.0  0.0
E  1.3  0.0  0.0  1.9  0.8  0.0  1.8  0.9  0.0  0.7  3.5  1.2  0.9  3.5  1.6  0.0  2.0  1.0  2.3  1.6
F  1.6  0.0  2.6  1.8  0.0  1.8  0.0  2.3  0.9  3.9  0.0  0.0  2.0  1.0  0.0  3.0  1.2  2.8  7.0  0.0
G  2.9  0.0  0.0  0.5  0.0  0.5  1.0  3.5  0.0  1.2  2.3  2.3  0.0  0.0  1.2  3.0  0.0  0.5  2.3  0.0
H  0.7  3.5  0.0  0.0  0.0  2.1  0.0  2.3  0.9  1.0  0.0  0.0  1.8  0.0  1.4  3.0  0.0  0.0  0.0  3.5
I  0.6  2.3  0.0  0.0  0.0  0.0  0.0  2.3  0.8  0.0  3.5  0.0  0.0  3.5  0.0  1.0  2.3  1.0  0.0  0.0
K  0.4  0.0  0.0  0.4  0.0  0.0  2.3  0.0  0.5  0.4  0.0  1.8  0.0  0.0  0.0  0.0  2.0  1.1  1.4  1.9
L  1.8  0.0  2.3  1.9  4.7  0.8  1.5  2.8  0.8  1.7  0.0  2.8  3.5  4.7  2.1  2.1  1.4  1.6  0.0  2.3
M  0.0  0.0  0.0  1.0  3.5  1.8  0.0  0.0  1.4  0.0  0.0  7.0  3.5  0.0  0.0  0.0  3.5  3.5  0.0  0.0
N  1.3  2.3  0.0  0.0  0.0  0.0  0.0  0.0  2.3  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.9  0.0  1.8
P  1.0  0.0  1.0  0.0  2.3  0.8  0.0  0.0  0.7  0.0  0.0  0.0  0.0  1.2  0.0  0.0  0.0  0.9  0.0  0.0
Q  1.0  0.0  0.0  0.9  0.0  0.0  0.0  3.5  0.6  2.8  0.0  0.0  0.0  1.8  0.0  0.0  0.0  0.0  0.0  1.8
R  1.2  1.4  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  2.8  0.0  0.8  0.0  0.0  1.4  0.0  1.2
S  1.2  0.0  0.6  1.2  1.8  0.0  1.2  0.8  0.0  0.6  0.0  0.0  0.0  0.0  0.0  1.2  1.0  2.3  7.0  0.0
T  0.5  0.0  0.0  0.6  1.2  0.0  0.0  0.0  0.6  1.6  0.0  0.0  2.3  0.0  0.0  1.8  0.0  0.8  0.0  2.0
V  0.0  0.0  1.9  0.5  2.3  1.6  1.8  2.3  0.0  1.7  2.3  1.0  0.0  0.0  1.4  1.3  2.5  0.5  2.3  0.7
W  0.0  3.5  0.0  0.0  0.0  1.4  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  3.5
Y  1.8  0.0  0.0  0.0  0.0  0.8  0.0  2.3  1.2  2.3  0.0  0.0  0.0  0.0  1.8  0.0  1.8  2.8  0.0  0.0

The peptide cleavage maps show that at all pH points pepsin will often cleave after most bulky hydrophobic residues such as leucine and phenylalanine. The data also show that pepsin has a low probability of cleaving after the residues arginine, cysteine, proline and histidine. The pH 2.5 data incorporate all of the E.coli data, so there is a large data set for this pH point. The squares in the pH 2.5 cleavage data map are also generally smaller than those found on the other two maps. This is because the sequence coverage of the proteins identified in the E.coli lysate was on the lower end. The lower the sequence coverage, the lower the probability of cleavage appears to be.
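Because sequence coverage limits how many cleavages can be observed, it is useful to be able to compute it directly from the residue ranges of the mapped peptides. The sketch below is a minimal illustration with a hypothetical function name. The ranges are the β-lactoglobulin entries of Table 3.3 (pH 1.0) and the length of 162 residues is counted from the sequence in Figure 3.9; the resulting percentage is computed here for illustration and is not a value reported in the text.

```python
def sequence_coverage(protein_length, residue_ranges):
    """Return (covered residues, percent coverage) for 1-based inclusive ranges."""
    covered = set()
    for first, last in residue_ranges:
        covered.update(range(first, last + 1))   # overlapping ranges count once
    return len(covered), 100.0 * len(covered) / protein_length


# Residue ranges of the pH 1.0 peptides listed for this protein in Table 3.3.
ranges_ph1 = [
    (1, 11), (1, 13), (1, 19), (10, 19), (13, 38), (32, 42),
    (42, 54), (42, 57), (43, 54), (83, 95), (132, 149), (147, 156),
]

covered, percent = sequence_coverage(162, ranges_ph1)
print(covered, f"{percent:.1f}%")  # 95 58.6%
```

Any P1-P1’ pair that lies entirely in an uncovered stretch can never be observed as a cleavage, so a protein with low coverage pulls down the apparent cleavage probability for the residue pairs it contains.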

Revised versions of the cleavage data maps were made as described in section 2.6, with the data representing the probability of cleavage between two specific residues as a percentage. The data maps and corresponding percentage values can be found in Figures 3.17-3.19 and Tables 3.9-3.11 for pH 1.0, 2.5 and 4.0, respectively. One difference in this set of cleavage data maps is that the pH 2.5 map takes into consideration only the data produced from the digestion of the nine standard proteins and not the data produced from the digestion of E.coli whole cell lysate. The benefit of showing only this selected data is that the maps for all three pH points can be directly compared, because they contain digestions of the same proteins and therefore have the same number of possible cleavages. Overall, there were not many distinct differences between the digestions at the different pH points. The trends in the specificity of pepsin were similar at pH 1.0, 2.5 and 4.0.
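The idea of expressing cleavage probability as a percentage of the possible cleavages can be sketched as follows. This assumes the percentage for a P1-P1’ pair is the number of observed cleavage points between that pair divided by the number of times the pair occurs in the digested sequences; the exact procedure is the one in section 2.6, which is not reproduced here. The function name and the toy sequence are hypothetical.

```python
from collections import Counter


def cleavage_percentages(digests):
    """Percent of possible cleavages observed for each (P1, P1') pair.

    `digests` is an iterable of (sequence, cleaved_p1_positions), where the
    positions are the 1-based indices of the P1 residue of each distinct
    cleavage point.
    """
    possible = Counter()
    observed = Counter()
    for sequence, p1_positions in digests:
        # Every adjacent residue pair is one possible cleavage.
        possible.update(zip(sequence, sequence[1:]))
        for pos in p1_positions:
            observed[(sequence[pos - 1], sequence[pos])] += 1
    return {pair: 100.0 * observed[pair] / n for pair, n in possible.items()}


# Toy sequence (hypothetical): A-L occurs 3 times, L-F twice, F-A twice.
toy = [("ALFALFAL", {3, 5})]   # cleaved after F3 (F-A) and after L5 (L-F)
for pair, pct in sorted(cleavage_percentages(toy).items()):
    print(pair, pct)
# ('A', 'L') 0.0
# ('F', 'A') 50.0
# ('L', 'F') 50.0
```

Because the denominator depends only on the protein sequences, digests of the same set of proteins share the same number of possible cleavages, which is what makes the three pH maps directly comparable.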

Figure 3.17. Cleavage data map with probability defined as a percentage, pH 1.0 (vertical axis: P1; horizontal axis: P1’; scale: 10-100)

Table 3.9. Normalized pH 1.0 cleavage data defined as a percentage (rows: P1; columns: P1’)

P1’: A C D E F G H I K L M N P Q R S T V W Y
A 31 20156 111813338 2560140 18252415273333
C 0 01133000003300254000220033
D 18 01423257044514000000100050
E 33 13 0 26 56 11 40 11 12 39 100 24 20 14 11 26 27 11 50 45
F 36 0 25 0 20 31 0 60 33 40 100 33 29 43 17 29 33 57 67 50
G 20 0 0 5 407 572012190 8 20130 2414256725
H 9 25050144013330290013002901700
I 15 0 0 2050140 33222533250 25331829130 0
K 9 3322159333314815001102014821208
L 33 40 31 52 67 11 40 54 40 47 100 38 33 17 38 44 30 37 17 62
M 25 0 33 43 100 40 0 33 40 0 0 100 50 33 67 0 25 50 0 0
N 46 33006380 03390 0220170259020
P 40 011183330330020500011509050050
Q 20 3303801000 81300 020002017033
R 40 20 40 10 0 0 100 33 67 6 25 33 33 0 11 0 0 20 0 0
S 6 0 20211823331310150 140 0 1317110 140
T 20 01481817200913201002810012029017
V 19 0 14160 330 5023403333170 332125147540
W 0 33 0 0 0 29 0 0 100 67 0 0 33 20 0 0 0 50 0 33
Y 40 00143320332543360000259043050

Figure 3.18. Cleavage data map with probability defined as a percentage, pH 2.5 (vertical axis: P1; horizontal axis: P1’; scale: 10-100)

Table 3.10. Normalized pH 2.5 cleavage data defined as a percentage (rows: P1; columns: P1’)

P1’: A C D E F G H I K L M N P Q R S T V W Y
A 24 3814200 22253315252533201110250 210 50
C 0 04303300500000000033000
D 57 0084300256180000332000020
E 27 0 1733440 6013202850112225220 29203356
F 67 0 44676733673338440 0 575010043388310067
G 18 0 0 7 33 14 14 29 13 18 33 18 0 0 0 17 11 8 67 0
H 40 25025143000221400250252905000
I 9 0005025250112233000014142200
K 14 33 0 5 22 20 17 14 15 10 0 0 22 0 25 33 22 14 20 17
L 29 0 40 30 38 40 33 69 42 31 100 33 25 20 50 42 41 48 50 40
M 0 0 0 43 50 40 0 0 40 0 0 100 50 33 67 0 0 50 0 0
N 8 00071130009001702520022025
P 29 0 131133110 0 100 0 202543100040387550
Q 43 033130130338205020040040250050
R 17 20 0 22 0 50 50 0 0 6 25 33 0 0 13 50 25 20 0 20
S 18 00291820170145000251491333500
T 31 1000254300091800191033673333014
V 20 0201933330338173303300270363310
W 25 500004000000006005000050
Y 25 0 0 0 40 22 33 0 33 44 0 0 20 0 0 0 25 20 0 0

Figure 3.19. Cleavage data map with probability defined as a percentage, pH 4.0 (vertical axis: P1; horizontal axis: P1’; scale: 10-100)

Table 3.11. Normalized pH 4.0 cleavage data defined as a percentage (rows: P1; columns: P1’)

P1’: A C D E F G H I K L M N P Q R S T V W Y
A 20 0141300 02215250222022101901600
C 14 13000000000025200017000
D 57 08171411013600025002014000
E 20 0 0 28220 40130 1150161150220 29133322
F 22 0 333333250 3313440 0 29170 4325331000
G 36 002007144301833270003308330
H 20 2500 0300332200 02502543017050
I 9 330000033110330025014291100
K 5 0050033085025000022142033
L 25 0 402563137 3113210 335060253518240 30
M 0 0 0 14 50 20 0 0 20 0 0 100 50 0 0 0 50 25 0 0
N 23 33000000330000000022025
P 14 0130331100100000140001300
Q 14 001300033840000200000025
R 17 200000003300040013002000
S 9 081827017110900000181333500
T 8 008140009180031000011029
V 5 0 306 332220330 1733130 0 2018339 3310
W 0 50000200000000200000050
Y 25 000011173317330000250254000

One thing that was affected by the change in pH was the number of peptides generated from each digestion. The low number of peptides produced by the digestions at pH 4.0 is one reason why the pH 4.0 cleavage data map looks sparse compared to those of pH 1.0 and 2.5. Digestion at pH 1.0 produced the most data, while digestion at pH 4.0 produced the least. This is most likely because at pH 1.0 the proteins are denatured, and it is therefore easier for pepsin to digest them. Another reason for the lower number of cleavages at pH 4.0 is that the activity of pepsin is much lower at this pH than at the lower pH values. If pepsin is less active, it will cleave less often and therefore produce fewer peptides.

3.6 References

Brier, S., G. Maria, et al. (2007). "Purification and characterization of pepsins A1 and A2 from the Antarctic rock cod Trematomus bernacchii." FEBS J 274: 6152-66.

Chakraborty, A. B., S. J. Berger, et al. (2007). "Use of an integrated MS--multiplexed MS/MS data acquisition strategy for high-coverage peptide mapping studies." Rapid Commun Mass Spectrom 21(5): 730-44.

Jones, R. G. and J. Landon (2002). "Enhanced pepsin digestion: a novel process for purifying antibody F(ab')(2) fragments in high yield from serum." J Immunol Methods 263: 57-74.

Millea, K. M., I. S. Krull, et al. (2006). "Integration of multidimensional chromatographic protein separations with a combined "top-down" and "bottom-up" proteomic strategy." J Proteome Res 5(1): 135-46.

Plumb, R., J. Castro-Perez, et al. (2004). "Ultra-performance liquid chromatography coupled to quadrupole-orthogonal time-of-flight mass spectrometry." Rapid Commun Mass Spectrom 18(19): 2331-7.

Plumb, R. S., K. A. Johnson, et al. (2006). "UPLC/MS(E); a new approach for generating molecular fragment information for biomarker structure elucidation." Rapid Commun Mass Spectrom 20: 1989-94.

Swartz, M. E. (2005). "Ultra performance liquid chromatography (UPLC): An introduction." LC GC North America: 8-14.

Swartz, M. E. (2005). "UPLC (TM): An introduction and review." Journal of Liquid Chromatography & Related Technologies 28(7-8): 1253-1263.

Wang, L., H. Pan, et al. (2002). "Hydrogen exchange-mass spectrometry: optimization of digestion conditions." Mol Cell Proteomics 1: 132-8.

Wu, Y., S. Kaveti, et al. (2006). "Extensive deuterium back-exchange in certain immobilized pepsin columns used for H/D exchange mass spectrometry." Anal Chem 78: 1719-23.


Figure 3.5. pH 2.5 peptic digest map of Abl

1 GQQPGKVLGD QRREPQGLSE AARWNBKENL LAGPSENDPN LFVALYDFVA

51 SGDNTLSITK GEKLRVLGYN HNGEWCEAQT KNGQGWVPSN YITPVNSLEK

101 HSWYHGPVSR NAAEYLLSSG INGSFLVRES ESSPGQRSIS LRYEGRVYHY

151 RINTASDGKL YVSSESRFNT LAELVHHHST VADGLITTLH YPAPKRNKPT

201 VYGVSPNYDK WEMERTDITM KHKLGGGQYG EVYEGVWKKY SLTVAVKTLK

251 EDTMEVEEFL KEAAVMKEIK HPNLVQLLGV CTREPPFYII TEFMTYGNLL

301 DYLRECNRQE VNAVVLLYMA TQISSAMEYL EKKNFIHRDL AARNCLVGEN

351 HLVKVADFGL SRLMTGDTYT AHAGAKFPIK WTAPESLAYN KFSIKSDVWA

401 FGVLLWEIAT YGMSPYPGID LSQVYELLEK DYRMERPEGC PEKVYELMRA

451 CWQWNPSDRP SFAEIHQAFE TMFQESSISD EVEKELGKEN LYFQGHHHHH

501 H

Figure 3.6. pH 2.5 peptic digest map of Albumin

1 DTHKSEIAHR FKDLGEEHFK GLVLIAFSQY LQQCPFDEHV KLVNELTEFA

51 KTCVADESHA GCEKSLHTLF GDELCKVASL RETYGDMADC CEKQEPERNE

101 CFLSHKDDSP DLPKLKPDPN TLCDEFKADE KKFWGKYLYE IARRHPYFYA

151 PELLYYANKY NGVFQECCQA EDKGACLLPK IETMREKVLA SSARQRLRCA

201 SIQKFGERAL KAWSVARLSQ KFPKAEFVEV TKLVTDLTKV HKECCHGDLL

251 ECADDRADLA KYICDNQDTI SSKLKECCDK PLLEKSHCIA EVEKDAIPEN

301 LPPLTADFAE DKDVCKNYQE AKDAFLGSFL YEYSRRHPEY AVSVLLRLAK

351 EYEATLEECC AKDDPHACYS TVFDKLKHLV DEPQNLIKQN CDQFEKLGEY

401 GFQNALIVRY TRKVPQVSTP TLVEVSRSLG KVGTRCCTKP ESERMPCTED

451 YLSLILNRLC VLHEKTPVSE KVTKCCTESL VNRRPCFSAL TPDETYVPKA

501 FDEKLFTFHA DICTLPDTEK QIKKQTALVE LLKHKPKATE EQLKTVMENF

551 VAFVDKCCAA DDKEACFAVE GPKLVVSTQT ALA

Figure 3.7. pH 2.5 peptic digest map of Aldolase

1 MPHSHPALTP EQKKELSDIA HRIVAPGKGI LAADESTGSI AKRLQSIGTE

51 NTEENRRFYR QLLLTADDRV NPCIGGVILF HETLYQKADD GRPFPQVIKS

101 KGGVVGIKVD KGVVPLAGTN GETTTQGLDG LSERCAQYKK DGADFAKWRC

151 VLKIGEHTPS ALAIMENANV LARYASICQQ NGIVPIVEPE ILPDGDHDLK

201 RCQYVTEKVL AAVYKALSDH HIYLEGTLLK PNMVTPGHAC TQKYSHEEIA

251 MATVTALRRT VPPAVTGVTF LSGGQSEEEA SINLNAINKC PLLKPWALTF

301 SYGRALQASA LKAWGGKKEN LKAAQEEYVK RALANSLACQ GKYTPSGQAG

351 AAASESLFIS NHAY

Figure 3.8. pH 2.5 peptic digest map of Amyloglucosidase

1 MSFRSLLALS GLVCTGLANV ISKRATLDSW LSNEATVART AILNNIGADG

51 AWVSGADSGI VVASPSTDNP DYFYTWTRDS GLVLKTLVDL FRNGDTSLLS

101 TIENYISAQA IVQGISNPSG DLSSGAGLGE PKFNVDETAY TGSWGRPQRD

151 GPALRATAMI GFGQWLLDNG YTSTATDIVW PLVRNDLSYV AQYWNQTGYD

201 LWEEVNGSSF FTIAVQHRAL VEGSAFATAV GSSCSWCDSQ APEILCYLQS

251 FWTGSFILAN FDSSRSGKDA NTLLGSIHTF DPEAACDDST FQPCSPRALA

301 NHKEVVDSFR SIYTLNDGLS DSEAVAVGRY PEDTYYNGNP WFLCTLAAAE

351 QLYDALYQWD KQGSLEVTDV SLDFFKALYS DAATGTYSSS SSTYSSIVDA

401 VKTFADGFVS IVETHAASNG SMSEQYDKSD GEQLSARDLT WSYAALLTAN

451 NRRNSVVPAS WGETSASSVP GTCAATSAIG TYSSVTVTSW PSIVATGGTT

501 TTATPTGSGS VTSTSKTTAT ASKTSTSTSS TSCTTPTAVA VTFDLTATTT

551 YGENIYLVGS ISQLGDWETS DGIALSADKY TSSDPLWYVT VTLPAGESFE

601 YKFIRIESDD SVEWESDPNR EYTVPQACGT STATVTDTWR

Figure 3.9. pH 2.5 peptic digest map of β-Lactalbumin

1 LIVTQTMKGL DIQKVAGTWY SLAMAASDIS LLDAQSAPLR VYVEELKPTP

51 EGDLEILLQK WENGECAQKK IIAEKTKIPA VFKIDALNEN KVLVLDTDYK

101 KYLLFCMENS AEPEQSLACQ CLVRTPEVDD EALEKFDKAL KALPMHIRLS

151 FNPTQLEEQC HI

Figure 3.10. pH 2.5 peptic digest map of Hemoglobin

Alpha chain

1 MVLSAADKGN VKAAWGKVGG HAAEYGAEAL ERMFLSFPTT KTYFPHFDLS

51 HGSAQVKGHG AKVAAALTKA VEHLDDLPGA LSELSDLHAH KLRVDPVNFK

101 LLSHSLLVTL ASHLPSDFTP AVHASLDKFL ANVSTVLTSK YR

Beta chain (P02070|HBB_BOVIN)

1 MLTAEEKAAV TAFWGKVKVD EVGGEALGRL LVVYPWTQRF FESFGDLSTA

51 DAVMNNPKVK AHGKKVLDSF SNGMKHLDDL KGTFAALSEL HCDKLHVDPE

101 NFKLLGNVLV VVLARNFGKE FTPVLQADFQ KVVAGVANAL AHRYH

Epsilon-2 chain

1 MVHFTTEENV AVASLWAKVN VEVVGGESLA RLLIVCPWTQ RFFDSFGNLY

51 SESAIMGNPK VKVYGRKVLN SFGNAIKHMD DLKGTFADLS ELHCDKLHVD

101 PENFRLLGNM ILIVLATHFS KEFTPQMQAA WQKLTNAVAN ALTHKYH

Epsilon-4 chain

1 MVHFTTEEKA AVASLWAKVN VEVVGGESLA RLLIVYPWTQ RFFDSFGNLY

51 SESAIMGNPK VKAHGRKVLN SFGNAIEHMD DLKGTFADLS ELHCDKLHVD

101 PENFRLLGNM ILIVLATHFS KEFTPQMQAS WQKLTNAVAN ALAHKYH

Figure 3.11. pH 2.5 peptic digest map of Myoglobin

1 MGLSDGEWQQ VLNVWGKVEA DIAGHGQEVL IRLFTGHPET LEKFDKFKHL

51 KTEAEMKASE DLKKHGTVVL TALGGILKKK GHHEAELKPL AQSHATKHKI

101 PIKYLEFISD AIIHVLHSKH PGDFGADAQG AMTKALELFR NDIAAKYKEL

151 GFQG

Figure 3.12. pH 2.5 peptic digest map of Nef

1 MGSSHHHHHH SSGLVPRGSH MGGKWSKSSV IGWPAVRERM RRAEPAADGV

51 GAVSRDLEKH GAITSSNTAA NNAACAWLEA QEEEEVGFPV TPQVPLRPMT

101 YKAAVDLSHF LKEKGGLEGL IHSQRRQDIL DLWIYHTQGY FPDWQNYTPG

151 PGVRYPLTFG WCYKLVPVEP DKVEEANKGE NTSLLHPVSL HGMDDPEREV

201 LEWRFDSRLA FHHVARELHP EYFKNC

Figure 3.13. pH 2.5 peptic digest map of Ubiquitin

1 MQIFVKTLTG KTITLEVEPS DTIENVKAKI QDKEGIPPDQ QRLIFAGKQL

51 EDGRTLSDYN IQKESTLHLV LRLRGG

Table 3.3. pH 1.0 peptides

Protein  Residues  Peptide Sequence
Abl  1-20  GQQPGKVLGDQRREPQGLSE
Abl  29-57  NLLAGPSENDPNLFVALYDFVASGDNTLS
Abl  31-41  LAGPSENDPNL
Abl  31-42  LAGPSENDPNLF
Abl  32-50  AGPSENDPNLFVALYDFVA
Abl  42-56  FVALYDFVASGDNTL
Abl  46-64  YDFVASGDNTLSITKGEKL
Abl  46-74  YDFVASGDNTLSITKGEKLRVLGYNHNGE
Abl  46-75  YDFVASGDNTLSITKGEKLRVLGYNHNGEW
Abl  48-75  FVASGDNTLSITKGEKLRVLGYNHNGEW
Abl  48-77  FVASGDNTLSITKGEKLRVLGYNHNGEWCE
Abl  49-63  VASGDNTLSITKGEK
Abl  49-64  VASGDNTLSITKGEKL
Abl  49-75  VASGDNTLSITKGEKLRVLGYNHNGEW
Abl  50-64  ASGDNTLSITKGEKL
Abl  51-64  SGDNTLSITKGEKL
Abl  57-75  SITKGEKLRVLGYNHNGEW
Abl  65-77  RVLGYNHNGEWCE
Abl  75-84  WCEAQTKNGQ
Abl  76-90  CEAQTKNGQGWVPSN
Abl  76-98  CEAQTKNGQGWVPSNYITPVNSL
Abl  77-114  EAQTKNGQGWVPSNYITPVNSLEKHSWYHGPVSRNAAE
Abl  78-90  AQTKNGQGWVPSN
Abl  85-131  GWVPSNYITPVNSLEKHSWYHGPVSRNAAEYLLSSGINGSFLVRESE
Abl  86-104  WVPSNYITPVNSLEKHSWY
Abl  88-114  PSNYITPVNSLEKHSWYHGPVSRNAAE
Abl  91-114  YITPVNSLEKHSWYHGPVSRNAAE
Abl  92-114  ITPVNSLEKHSWYHGPVSRNAAE
Abl  93-102  TPVNSLEKHS
Abl  93-114  TPVNSLEKHSWYHGPVSRNAAE
Abl  98-110  LEKHSWYHGPVSR
Abl  99-114  EKHSWYHGPVSRNAAE
Abl  107-122  PVSRNAAEYLLSSGIN
Abl  107-125  PVSRNAAEYLLSSGINGSF
Abl  110-123  RNAAEYLLSSGING
Abl  111-123  NAAEYLLSSGING
Abl  115-125  YLLSSGINGSF
Abl  115-126  YLLSSGINGSFL
Abl  117-127  LSSGINGSFLV
Abl  125-141  FLVRESESSPGQRSISL
Abl  126-140  LVRESESSPGQRSIS
Abl  126-141  LVRESESSPGQRSISL
Abl  126-160  LVRESESSPGQRSISLRYEGRVYHYRINTASDGKL
Abl  127-141  VRESESSPGQRSISL
Abl  140-160  SLRYEGRVYHYRINTASDGKL
Abl  141-160  LRYEGRVYHYRINTASDGKL
Abl  142-160  RYEGRVYHYRINTASDGKL
Abl  154-163  TASDGKLYVS
Abl  159-185  KLYVSSESRFNTLAELVHHHSTVADGL
Abl  161-171  YVSSESRFNTL
Abl  161-173  YVSSESRFNTLAE
Abl  172-185  AELVHHHSTVADGL
Abl  172-202  AELVHHHSTVADGLITTLHYPAPKRNKPTVY
Abl  172-213  AELVHHHSTVADGLITTLHYPAPKRNKPTVYGVSPNYDKWEM
Abl  174-185  LVHHHSTVADGL
Abl  186-213  ITTLHYPAPKRNKPTVYGVSPNYDKWEM
Abl  198-212  KPTVYGVSPNYDKWE
Abl  203-213  GVSPNYDKWEM
Abl  214-236  ERTDITMKHKLGGGQYGEVYEGV
Abl  214-237  ERTDITMKHKLGGGQYGEVYEGVW

Abl  214-243  ERTDITMKHKLGGGQYGEVYEGVWKKYSLT
Abl  214-245  ERTDITMKHKLGGGQYGEVYEGVWKKYSLTVA
Abl  214-255  ERTDITMKHKLGGGQYGEVYEGVWKKYSLTVAVKTLKEDTME
Abl  215-230  RTDITMKHKLGGGQYG
Abl  219-246  TMKHKLGGGQYGEVYEGVWKKYSLTVAV
Abl  222-244  HKLGGGQYGEVYEGVWKKYSLTV
Abl  233-245  YEGVWKKYSLTVA
Abl  244-255  VAVKTLKEDTME
Abl  246-255  VKTLKEDTME
Abl  256-274  VEEFLKEAAVMKEIKHPNL
Abl  258-276  EFLKEAAVMKEIKHPNLVQ
Abl  259-274  FLKEAAVMKEIKHPNL
Abl  261-274  KEAAVMKEIKHPNL
Abl  275-292  VQLLGVCTREPPFYIITE
Abl  278-292  LGVCTREPPFYIITE
Abl  278-293  LGVCTREPPFYIITEF
Abl  282-292  TREPPFYIITE
Abl  282-293  TREPPFYIITEF
Abl  293-313  FMTYGNLLDYLRECNRQEVNA
Abl  294-313  MTYGNLLDYLRECNRQEVNA
Abl  302-313  YLRECNRQEVNA
Abl  316-342  LLYMATQISSAMEYLEKKNFIHRDLAA
Abl  317-346  LYMATQISSAMEYLEKKNFIHRDLAARNCL
Abl  318-346  YMATQISSAMEYLEKKNFIHRDLAARNCL
Abl  321-335  TQISSAMEYLEKKNF
Abl  327-338  MEYLEKKNFIHR
Abl  328-346  EYLEKKNFIHRDLAARNCL
Abl  329-346  YLEKKNFIHRDLAARNCL
Abl  330-346  LEKKNFIHRDLAARNCL
Abl  331-351  EKKNFIHRDLAARNCLVGENH
Abl  339-354  DLAARNCLVGENHLVK
Abl  347-358  VGENHLVKVADF
Abl  347-360  VGENHLVKVADFGL
Abl  347-363  VGENHLVKVADFGLSRL
Abl  348-358  GENHLVKVADF
Abl  348-360  GENHLVKVADFGL
Abl  348-363  GENHLVKVADFGLSRL
Abl  364-387  MTGDTYTAHAGAKFPIKWTAPESL
Abl  365-377  TGDTYTAHAGAKF
Abl  366-377  GDTYTAHAGAKF
Abl  369-388  YTAHAGAKFPIKWTAPESLA
Abl  378-388  PIKWTAPESLA
Abl  380-398  KWTAPESLAYNKFSIKSDV
Abl  383-392  APESLAYNKF
Abl  383-393  APESLAYNKFS
Abl  394-424  IKSDVWAFGVLLWEIATYGMSPYPGIDLSQV
Abl  397-424  DVWAFGVLLWEIATYGMSPYPGIDLSQV
Abl  406-421  WEIATYGMSPYPGIDL
Abl  416-425  YPGIDLSQVY
Abl  416-430  YPGIDLSQVYELLEK
Abl  428-446  LEKDYRMERPEGCPEKVYE
Abl  429-437  EKDYRMERP
Abl  432-448  YRMERPEGCPEKVYELM
Abl  447-462  LMRACWQWNPSDRPSF
Abl  448-462  MRACWQWNPSDRPSF
Abl  458-499  DRPSFAEIHQAFETMFQESSISDEVEKELGKENLYFQGHHHH
Abl  463-472  AEIHQAFETM
Abl  472-495  MFQESSISDEVEKELGKENLYFQG
Abl  473-491  FQESSISDEVEKELGKENL
Abl  474-491  QESSISDEVEKELGKENL
Abl  476-489  SSISDEVEKELGKE
Abl  481-495  EVEKELGKENLYFQG
Abl  485-495  ELGKENLYFQG

Albumin  1-21  DTHKSEIAHRFKDLGEEHFKG
Albumin  1-24  DTHKSEIAHRFKDLGEEHFKGLVL
Albumin  2-24  THKSEIAHRFKDLGEEHFKGLVL
Albumin  3-24  HKSEIAHRFKDLGEEHFKGLVL
Albumin  21-43  GLVLIAFSQYLQQCPFDEHVKLV
Albumin  23-35  VLIAFSQYLQQCP
Albumin  25-38  IAFSQYLQQCPFDE
Albumin  25-42  IAFSQYLQQCPFDEHVKL
Albumin  25-45  IAFSQYLQQCPFDEHVKLVNE
Albumin  27-38  FSQYLQQCPFDE
Albumin  30-45  YLQQCPFDEHVKLVNE
Albumin  31-45  LQQCPFDEHVKLVNE
Albumin  31-77  LQQCPFDEHVKLVNELTEFAKTCVADESHAGCEKSLHTLFGDELCKV
Albumin  34-81  CPFDEHVKLVNELTEFAKTCVADESHAGCEKSLHTLFGDELCKVASLR
Albumin  59-82  HAGCEKSLHTLFGDELCKVASLRE
Albumin  107-121  DDSPDLPKLKPDPNT
Albumin  112-133  LPKLKPDPNTLCDEFKADEKKF
Albumin  113-133  PKLKPDPNTLCDEFKADEKKF
Albumin  125-142  EFKADEKKFWGKYLYEIA
Albumin  126-144  FKADEKKFWGKYLYEIARR
Albumin  138-154  LYEIARRHPYFYAPELL
Albumin  141-153  IARRHPYFYAPEL
Albumin  141-154  IARRHPYFYAPELL
Albumin  141-155  IARRHPYFYAPELLY
Albumin  145-156  HPYFYAPELLYY
Albumin  154-164  LYYANKYNGVF
Albumin  169-191  QAEDKGACLLPKIETMREKVLAS
Albumin  170-188  AEDKGACLLPKIETMREKV
Albumin  177-191  LLPKIETMREKVLAS
Albumin  177-208  LLPKIETMREKVLASSARQRLRCASIQKFGER
Albumin  199-211  CASIQKFGERALK
Albumin  212-227  AWSVARLSQKFPKAEF
Albumin  213-227  WSVARLSQKFPKAEF
Albumin  275-293  KECCDKPLLEKSHCIAEVE
Albumin  276-290  ECCDKPLLEKSHCIA
Albumin  277-300  CCDKPLLEKSHCIAEVEKDAIPEN
Albumin  279-319  DKPLLEKSHCIAEVEKDAIPENLPPLTADFAEDKDVCKNYQ
Albumin  292-303  VEKDAIPENLPP
Albumin  328-338  SFLYEYSRRHP
Albumin  331-345  YEYSRRHPEYAVSVL
Albumin  334-354  SRRHPEYAVSVLLRLAKEYEA
Albumin  337-357  HPEYAVSVLLRLAKEYEATLE
Albumin  355-399  TLEECCAKDDPHACYSTVFDKLKHLVDEPQNLIKQNCDQFEKLGE
Albumin  369-390  YSTVFDKLKHLVDEPQNLIKQN
Albumin  402-422  FQNALIVRYTRKVPQVSTPTL
Albumin  406-422  LIVRYTRKVPQVSTPTL
Albumin  407-419  IVRYTRKVPQVST
Albumin  407-421  IVRYTRKVPQVSTPT
Albumin  407-422  IVRYTRKVPQVSTPTL
Albumin  408-421  VRYTRKVPQVSTPT
Albumin  418-441  STPTLVEVSRSLGKVGTRCCTKPE
Albumin  419-432  TPTLVEVSRSLGKV
Albumin  419-440  TPTLVEVSRSLGKVGTRCCTKP
Albumin  443-463  ERMPCTEDYLSLILNRLCVLH
Albumin  448-466  TEDYLSLILNRLCVLHEKT
Albumin  460-470  CVLHEKTPVSE
Albumin  460-511  CVLHEKTPVSEKVTKCCTESLVNRRPCFSALTPDETYVPKAFDEKLFTFHAD
Albumin  464-482  EKTPVSEKVTKCCTESLVN
Albumin  490-505  LTPDETYVPKAFDEKL
Albumin  496-515  YVPKAFDEKLFTFHADICTL
Albumin  529-543  VELLKHKPKATEEQL
Albumin  529-545  VELLKHKPKATEEQLKT
Albumin  529-547  VELLKHKPKATEEQLKTVM

Albumin  529-549  VELLKHKPKATEEQLKTVMEN
Albumin  529-550  VELLKHKPKATEEQLKTVMENF
Albumin  531-545  LLKHKPKATEEQLKT
Albumin  532-550  LKHKPKATEEQLKTVMENF
Albumin  555-579  DKCCAADDKEACFAVEGPKLVVSTQ
Albumin  568-580  AVEGPKLVVSTQT
Aldolase  2-31  PHSHPALTPEQKKELSDIAHRIVAPGKGIL
Aldolase  3-15  HSHPALTPEQKKE
Aldolase  4-11  SHPALTPE
Aldolase  5-15  HPALTPEQKKE
Aldolase  5-22  HPALTPEQKKELSDIAHR
Aldolase  7-16  ALTPEQKKEL
Aldolase  9-35  TPEQKKELSDIAHRIVAPGKGILAADE
Aldolase  10-34  PEQKKELSDIAHRIVAPGKGILAAD
Aldolase  31-63  LAADESTGSIAKRLQSIGTENTEENRRFYRQLL
Aldolase  32-54  AADESTGSIAKRLQSIGTENTEE
Aldolase  32-58  AADESTGSIAKRLQSIGTENTEENRRF
Aldolase  32-63  AADESTGSIAKRLQSIGTENTEENRRFYRQLL
Aldolase  32-65  AADESTGSIAKRLQSIGTENTEENRRFYRQLLLT
Aldolase  32-67  AADESTGSIAKRLQSIGTENTEENRRFYRQLLLTAD
Aldolase  33-58  ADESTGSIAKRLQSIGTENTEENRRF
Aldolase  34-58  DESTGSIAKRLQSIGTENTEENRRF
Aldolase  34-64  DESTGSIAKRLQSIGTENTEENRRFYRQLLL
Aldolase  35-58  ESTGSIAKRLQSIGTENTEENRRF
Aldolase  35-74  ESTGSIAKRLQSIGTENTEENRRFYRQLLLTADDRVNPCI
Aldolase  50-70  ENTEENRRFYRQLLLTADDRV
Aldolase  52-71  TEENRRFYRQLLLTADDRVN
Aldolase  55-64  NRRFYRQLLL
Aldolase  55-92  NRRFYRQLLLTADDRVNPCIGGVILFHETLYQKADDGR
Aldolase  64-79  LTADDRVNPCIGGVIL
Aldolase  65-79  TADDRVNPCIGGVIL
Aldolase  66-79  ADDRVNPCIGGVIL
Aldolase  68-79  DRVNPCIGGVIL
Aldolase  78-102  ILFHETLYQKADDGRPFPQVIKSKG
Aldolase  80-95  FHETLYQKADDGRPFP
Aldolase  85-106  YQKADDGRPFPQVIKSKGGVVG
Aldolase  85-131  YQKADDGRPFPQVIKSKGGVVGIKVDKGVVPLAGTNGETTTQGLDGL
Aldolase  96-106  QVIKSKGGVVG
Aldolase  107-131  IKVDKGVVPLAGTNGETTTQGLDGL
Aldolase  108-125  KVDKGVVPLAGTNGETTT
Aldolase  112-128  GVVPLAGTNGETTTQGL
Aldolase  120-147  NGETTTQGLDGLSERCAQYKKDGADFAK
Aldolase  139-166  KKDGADFAKWRCVLKIGEHTPSALAIME
Aldolase  148-164  WRCVLKIGEHTPSALAI
Aldolase  154-170  IGEHTPSALAIMENANV
Aldolase  154-171  IGEHTPSALAIMENANVL
Aldolase  163-188  AIMENANVLARYASICQQNGIVPIVE
Aldolase  169-188  NVLARYASICQQNGIVPIVE
Aldolase  171-188  LARYASICQQNGIVPIVE
Aldolase  171-207  LARYASICQQNGIVPIVEPEILPDGDHDLKRCQYVTE
Aldolase  171-210  LARYASICQQNGIVPIVEPEILPDGDHDLKRCQYVTEKVL
Aldolase  172-188  ARYASICQQNGIVPIVE
Aldolase  172-207  ARYASICQQNGIVPIVEPEILPDGDHDLKRCQYVTE
Aldolase  175-188  ASICQQNGIVPIVE
Aldolase  179-207  QQNGIVPIVEPEILPDGDHDLKRCQYVTE
Aldolase  184-204  VPIVEPEILPDGDHDLKRCQY
Aldolase  188-199  EPEILPDGDHDL
Aldolase  189-203  PEILPDGDHDLKRCQ
Aldolase  189-204  PEILPDGDHDLKRCQY
Aldolase  189-207  PEILPDGDHDLKRCQYVTE
Aldolase  193-204  PDGDHDLKRCQY
Aldolase  194-207  DGDHDLKRCQYVTE
Aldolase  208-226  KVLAAVYKALSDHHIYLEG

Aldolase  211-229  AAVYKALSDHHIYLEGTLL
Aldolase  211-239  AAVYKALSDHHIYLEGTLLKPNMVTPGHA
Aldolase  211-247  AAVYKALSDHHIYLEGTLLKPNMVTPGHACTQKYSHE
Aldolase  212-229  AVYKALSDHHIYLEGTLL
Aldolase  212-233  AVYKALSDHHIYLEGTLLKPNM
Aldolase  212-247  AVYKALSDHHIYLEGTLLKPNMVTPGHACTQKYSHE
Aldolase  213-229  VYKALSDHHIYLEGTLL
Aldolase  213-247  VYKALSDHHIYLEGTLLKPNMVTPGHACTQKYSHE
Aldolase  213-251  VYKALSDHHIYLEGTLLKPNMVTPGHACTQKYSHEEIAM
Aldolase  214-226  YKALSDHHIYLEG
Aldolase  214-239  YKALSDHHIYLEGTLLKPNMVTPGHA
Aldolase  214-250  YKALSDHHIYLEGTLLKPNMVTPGHACTQKYSHEEIA
Aldolase  215-247  KALSDHHIYLEGTLLKPNMVTPGHACTQKYSHE
Aldolase  227-247  TLLKPNMVTPGHACTQKYSHE
Aldolase  227-250  TLLKPNMVTPGHACTQKYSHEEIA
Aldolase  229-247  LKPNMVTPGHACTQKYSHE
Aldolase  230-247  KPNMVTPGHACTQKYSHE
Aldolase  230-251  KPNMVTPGHACTQKYSHEEIAM
Aldolase  236-251  PGHACTQKYSHEEIAM
Aldolase  238-254  HACTQKYSHEEIAMATV
Aldolase  240-251  CTQKYSHEEIAM
Aldolase  240-280  CTQKYSHEEIAMATVTALRRTVPPAVTGVTFLSGGQSEEEA
Aldolase  248-260  EIAMATVTALRRT
Aldolase  250-265  AMATVTALRRTVPPAV
Aldolase  252-267  ATVTALRRTVPPAVTG
Aldolase  252-270  ATVTALRRTVPPAVTGVTF
Aldolase  252-276  ATVTALRRTVPPAVTGVTFLSGGQS
Aldolase  255-285  TALRRTVPPAVTGVTFLSGGQSEEEASINLN
Aldolase  257-270  LRRTVPPAVTGVTF
Aldolase  258-276  RRTVPPAVTGVTFLSGGQS
Aldolase  264-292  AVTGVTFLSGGQSEEEASINLNAINKCPL
Aldolase  265-287  VTGVTFLSGGQSEEEASINLNAI
Aldolase  265-289  VTGVTFLSGGQSEEEASINLNAINK
Aldolase  267-304  GVTFLSGGQSEEEASINLNAINKCPLLKPWALTFSYGR
Aldolase  269-294  TFLSGGQSEEEASINLNAINKCPLLK
Aldolase  271-284  LSGGQSEEEASINL
Aldolase  285-299  NAINKCPLLKPWALT
Aldolase  285-300  NAINKCPLLKPWALTF
Aldolase  287-299  INKCPLLKPWALT
Aldolase  289-316  KCPLLKPWALTFSYGRALQASALKAWGG
Aldolase  290-316  CPLLKPWALTFSYGRALQASALKAWGG
Aldolase  300-311  FSYGRALQASAL
Aldolase  300-326  FSYGRALQASALKAWGGKKENLKAAQE
Aldolase  300-327  FSYGRALQASALKAWGGKKENLKAAQEE
Aldolase  309-327  SALKAWGGKKENLKAAQEE
Aldolase  312-326  KAWGGKKENLKAAQE
Aldolase  318-345  KENLKAAQEEYVKRALANSLACQGKYTP
Aldolase  322-330  KAAQEEYVK
Aldolase  326-338  EEYVKRALANSLA
Aldolase  327-337  EYVKRALANSL
Aldolase  328-337  YVKRALANSL
Aldolase  328-352  YVKRALANSLACQGKYTPSGQAGAA
Aldolase  328-355  YVKRALANSLACQGKYTPSGQAGAAASE
Aldolase  328-357  YVKRALANSLACQGKYTPSGQAGAAASESL
Aldolase  332-347  ALANSLACQGKYTPSG
Aldolase  337-357  LACQGKYTPSGQAGAAASESL
Aldolase  338-355  ACQGKYTPSGQAGAAASE
Aldolase  338-357  ACQGKYTPSGQAGAAASESL
Aldolase  338-358  ACQGKYTPSGQAGAAASESLF
Aldolase  347-362  GQAGAAASESLFISNH
Aldolase  350-358  GAAASESLF
Amyloglucosidase  25-43  ATLDSWLSNEATVARTAIL
Amyloglucosidase  31-43  LSNEATVARTAIL

Amyloglucosidase 43-58 LNNIGADGAWVSGADS Amyloglucosidase 53-73 VSGADSGIVVASPSTDNPDYF Amyloglucosidase 54-69 SGADSGIVVASPSTDN Amyloglucosidase 55-67 GADSGIVVASPST Amyloglucosidase 60-73 IVVASPSTDNPDYF Amyloglucosidase 74-84 YTWTRDSGLVL Amyloglucosidase 109-133 QAIVQGISNPSGDLSSGAGLGEPKF Amyloglucosidase 110-133 AIVQGISNPSGDLSSGAGLGEPKF Amyloglucosidase 111-133 IVQGISNPSGDLSSGAGLGEPKF Amyloglucosidase 134-154 NVDETAYTGSWGRPQRDGPAL Amyloglucosidase 134-156 NVDETAYTGSWGRPQRDGPALRA Amyloglucosidase 160-177 IGFGQWLLDNGYTSTATD Amyloglucosidase 178-189 IVWPLVRNDLSY Amyloglucosidase 212-230 TIAVQHRALVEGSAFATAV Amyloglucosidase 309-324 FRSIYTLNDGLSDSEA Amyloglucosidase 310-324 RSIYTLNDGLSDSEA Amyloglucosidase 325-343 VAVGRYPEDTYYNGNPWFL Amyloglucosidase 389-413 SSSSTYSSIVDAVKTFADGFVSIVE β-Lactoglobulin 1-11 LIVTQTMKGLD β-Lactoglobulin 1-13 LIVTQTMKGLDIQ β-Lactoglobulin 1-19 LIVTQTMKGLDIQKVAGTW β-Lactoglobulin 10-19 LDIQKVAGTW β-Lactoglobulin 13-38 QKVAGTWYSLAMAASDISLLDAQSAP β-Lactoglobulin 32-42 LDAQSAPLRVY β-Lactoglobulin 42-54 YVEELKPTPEGDL β-Lactoglobulin 42-57 YVEELKPTPEGDLEIL β-Lactoglobulin 43-54 VEELKPTPEGDL β-Lactoglobulin 83-95 KIDALNENKVLVL β-Lactoglobulin 132-149 ALEKFDKALKALPMHIRL β-Lactoglobulin 147-156 IRLSFNPTQL Hemoglobin-alpha 1-32 MVLSAADKGNVKAAWGKVGGHAAEYGAEALER Hemoglobin-alpha 2-24 VLSAADKGNVKAAWGKVGGHAAE Hemoglobin-alpha 2-25 VLSAADKGNVKAAWGKVGGHAAEY Hemoglobin-alpha 2-28 VLSAADKGNVKAAWGKVGGHAAEYGAE Hemoglobin-alpha 2-30 VLSAADKGNVKAAWGKVGGHAAEYGAEAL Hemoglobin-alpha 3-22 LSAADKGNVKAAWGKVGGHA Hemoglobin-alpha 3-29 LSAADKGNVKAAWGKVGGHAAEYGAEA Hemoglobin-alpha 4-24 SAADKGNVKAAWGKVGGHAAE Hemoglobin-alpha 6-41 ADKGNVKAAWGKVGGHAAEYGAEALERMFLSFPTTK Hemoglobin-alpha 7-18 DKGNVKAAWGKV Hemoglobin-alpha 7-43 DKGNVKAAWGKVGGHAAEYGAEALERMFLSFPTTKTY Hemoglobin-alpha 9-22 GNVKAAWGKVGGHA Hemoglobin-alpha 16-26 GKVGGHAAEYG Hemoglobin-alpha 30-43 LERMFLSFPTTKTY Hemoglobin-alpha 30-57 LERMFLSFPTTKTYFPHFDLSHGSAQVK Hemoglobin-alpha 34-47 FLSFPTTKTYFPHF Hemoglobin-alpha 34-49 
FLSFPTTKTYFPHFDL Hemoglobin-alpha 34-68 FLSFPTTKTYFPHFDLSHGSAQVKGHGAKVAAALT Hemoglobin-alpha 34-74 FLSFPTTKTYFPHFDLSHGSAQVKGHGAKVAAALTKAVEHL Hemoglobin-alpha 34-81 FLSFPTTKTYFPHFDLSHGSAQVKGHGAKVAAALTKAVEHLDDLPGAL Hemoglobin-alpha 34-84 FLSFPTTKTYFPHFDLSHGSAQVKGHGAKVAAALTKAVEHLDDLPGALSEL Hemoglobin-alpha 35-60 LSFPTTKTYFPHFDLSHGSAQVKGHG Hemoglobin-alpha 35-72 LSFPTTKTYFPHFDLSHGSAQVKGHGAKVAAALTKAVE Hemoglobin-alpha 35-81 LSFPTTKTYFPHFDLSHGSAQVKGHGAKVAAALTKAVEHLDDLPGAL Hemoglobin-alpha 35-83 LSFPTTKTYFPHFDLSHGSAQVKGHGAKVAAALTKAVEHLDDLPGALSE Hemoglobin-alpha 35-84 LSFPTTKTYFPHFDLSHGSAQVKGHGAKVAAALTKAVEHLDDLPGALSEL Hemoglobin-alpha 36-47 SFPTTKTYFPHF Hemoglobin-alpha 38-58 PTTKTYFPHFDLSHGSAQVKG Hemoglobin-alpha 38-65 PTTKTYFPHFDLSHGSAQVKGHGAKVAA Hemoglobin-alpha 38-81 PTTKTYFPHFDLSHGSAQVKGHGAKVAAALTKAVEHLDDLPGAL Hemoglobin-alpha 38-83 PTTKTYFPHFDLSHGSAQVKGHGAKVAAALTKAVEHLDDLPGALSE Hemoglobin-alpha 48-67 DLSHGSAQVKGHGAKVAAAL Hemoglobin-alpha 48-72 DLSHGSAQVKGHGAKVAAALTKAVE

Hemoglobin-alpha 48-81 DLSHGSAQVKGHGAKVAAALTKAVEHLDDLPGAL Hemoglobin-alpha 48-84 DLSHGSAQVKGHGAKVAAALTKAVEHLDDLPGALSEL Hemoglobin-alpha 49-78 LSHGSAQVKGHGAKVAAALTKAVEHLDDLP Hemoglobin-alpha 51-76 HGSAQVKGHGAKVAAALTKAVEHLDD Hemoglobin-alpha 53-86 SAQVKGHGAKVAAALTKAVEHLDDLPGALSELSD Hemoglobin-alpha 60-78 GAKVAAALTKAVEHLDDLP Hemoglobin-alpha 65-81 AALTKAVEHLDDLPGAL Hemoglobin-alpha 66-81 ALTKAVEHLDDLPGAL Hemoglobin-alpha 67-81 LTKAVEHLDDLPGAL Hemoglobin-alpha 67-92 LTKAVEHLDDLPGALSELSDLHAHKL Hemoglobin-alpha 67-99 LTKAVEHLDDLPGALSELSDLHAHKLRVDPVNF Hemoglobin-alpha 68-99 TKAVEHLDDLPGALSELSDLHAHKLRVDPVNF Hemoglobin-alpha 69-97 KAVEHLDDLPGALSELSDLHAHKLRVDPV Hemoglobin-alpha 81-116 LSELSDLHAHKLRVDPVNFKLLSHSLLVTLASHLPS Hemoglobin-alpha 82-99 SELSDLHAHKLRVDPVNF Hemoglobin-alpha 84-99 LSDLHAHKLRVDPVNF Hemoglobin-alpha 85-98 SDLHAHKLRVDPVN Hemoglobin-alpha 85-99 SDLHAHKLRVDPVNF Hemoglobin-alpha 86-104 DLHAHKLRVDPVNFKLLSH Hemoglobin-alpha 87-96 LHAHKLRVDP Hemoglobin-alpha 88-99 HAHKLRVDPVNF Hemoglobin-alpha 94-105 VDPVNFKLLSHS Hemoglobin-alpha 94-106 VDPVNFKLLSHSL Hemoglobin-alpha 94-107 VDPVNFKLLSHSLL Hemoglobin-alpha 94-124 VDPVNFKLLSHSLLVTLASHLPSDFTPAVHA Hemoglobin-alpha 103-136 SHSLLVTLASHLPSDFTPAVHASLDKFLANVSTV Hemoglobin-alpha 105-126 SLLVTLASHLPSDFTPAVHASL Hemoglobin-alpha 106-135 LLVTLASHLPSDFTPAVHASLDKFLANVST Hemoglobin-alpha 107-137 LVTLASHLPSDFTPAVHASLDKFLANVSTVL Hemoglobin-alpha 108-126 VTLASHLPSDFTPAVHASL Hemoglobin-alpha 108-129 VTLASHLPSDFTPAVHASLDKF Hemoglobin-alpha 108-133 VTLASHLPSDFTPAVHASLDKFLANV Hemoglobin-alpha 108-134 VTLASHLPSDFTPAVHASLDKFLANVS Hemoglobin-alpha 108-137 VTLASHLPSDFTPAVHASLDKFLANVSTVL Hemoglobin-alpha 109-131 TLASHLPSDFTPAVHASLDKFLA Hemoglobin-alpha 109-133 TLASHLPSDFTPAVHASLDKFLANV Hemoglobin-alpha 110-126 LASHLPSDFTPAVHASL Hemoglobin-alpha 110-128 LASHLPSDFTPAVHASLDK Hemoglobin-alpha 110-129 LASHLPSDFTPAVHASLDKF Hemoglobin-alpha 111-126 ASHLPSDFTPAVHASL Hemoglobin-alpha 111-129 ASHLPSDFTPAVHASLDKF Hemoglobin-alpha 111-133 
ASHLPSDFTPAVHASLDKFLANV Hemoglobin-alpha 111-134 ASHLPSDFTPAVHASLDKFLANVS Hemoglobin-alpha 111-137 ASHLPSDFTPAVHASLDKFLANVSTVL Hemoglobin-alpha 112-141 SHLPSDFTPAVHASLDKFLANVSTVLTSKY Hemoglobin-alpha 119-133 TPAVHASLDKFLANV Hemoglobin-alpha 121-133 AVHASLDKFLANV Hemoglobin-beta 3-13 TAEEKAAVTAF Hemoglobin-beta 12-30 AFWGKVKVDEVGGEALGRL Hemoglobin-beta 14-30 WGKVKVDEVGGEALGRL Hemoglobin-beta 28-52 GRLLVVYPWTQRFFESFGDLSTADA Hemoglobin-beta 31-44 LVVYPWTQRFFESF Hemoglobin-beta 31-84 LVVYPWTQRFFESFGDLSTADAVMNNPKVKAHGKKVLDSFSNGMKHLDDLKGTF Hemoglobin-beta 41-84 FESFGDLSTADAVMNNPKVKAHGKKVLDSFSNGMKHLDDLKGTF Hemoglobin-beta 45-84 GDLSTADAVMNNPKVKAHGKKVLDSFSNGMKHLDDLKGTF Hemoglobin-beta 48-80 STADAVMNNPKVKAHGKKVLDSFSNGMKHLDDL Hemoglobin-beta 48-84 STADAVMNNPKVKAHGKKVLDSFSNGMKHLDDLKGTF Hemoglobin-beta 50-84 ADAVMNNPKVKAHGKKVLDSFSNGMKHLDDLKGTF Hemoglobin-beta 51-84 DAVMNNPKVKAHGKKVLDSFSNGMKHLDDLKGTF Hemoglobin-beta 51-85 DAVMNNPKVKAHGKKVLDSFSNGMKHLDDLKGTFA Hemoglobin-beta 52-62 AVMNNPKVKAH Hemoglobin-beta 52-84 AVMNNPKVKAHGKKVLDSFSNGMKHLDDLKGTF Hemoglobin-beta 53-84 VMNNPKVKAHGKKVLDSFSNGMKHLDDLKGTF

Hemoglobin-beta 54-65 MNNPKVKAHGKK Hemoglobin-beta 54-69 MNNPKVKAHGKKVLDS Hemoglobin-beta 54-84 MNNPKVKAHGKKVLDSFSNGMKHLDDLKGTF Hemoglobin-beta 55-84 NNPKVKAHGKKVLDSFSNGMKHLDDLKGTF Hemoglobin-beta 59-90 VKAHGKKVLDSFSNGMKHLDDLKGTFAALSEL Hemoglobin-beta 60-91 KAHGKKVLDSFSNGMKHLDDLKGTFAALSELH Hemoglobin-beta 72-90 NGMKHLDDLKGTFAALSEL Hemoglobin-beta 75-95 KHLDDLKGTFAALSELHCDKL Hemoglobin-beta 77-91 LDDLKGTFAALSELH Hemoglobin-beta 85-101 AALSELHCDKLHVDPEN Hemoglobin-beta 85-109 AALSELHCDKLHVDPENFKLLGNVL Hemoglobin-beta 88-109 SELHCDKLHVDPENFKLLGNVL Hemoglobin-beta 91-101 HCDKLHVDPEN Hemoglobin-beta 95-108 LHVDPENFKLLGNV Hemoglobin-beta 112-131 VLARNFGKEFTPVLQADFQK Hemoglobin-beta 113-123 LARNFGKEFTP Hemoglobin-beta 113-124 LARNFGKEFTPV Hemoglobin-beta 113-132 LARNFGKEFTPVLQADFQKV Hemoglobin-beta 114-125 ARNFGKEFTPVL Hemoglobin-beta 114-127 ARNFGKEFTPVLQA Hemoglobin-beta 114-132 ARNFGKEFTPVLQADFQKV Hemoglobin-beta 116-133 NFGKEFTPVLQADFQKVV Hemoglobin-beta 118-133 GKEFTPVLQADFQKVV Hemoglobin-beta 118-138 GKEFTPVLQADFQKVVAGVAN Hemoglobin-beta 120-135 EFTPVLQADFQKVVAG Hemoglobin-beta 126-139 QADFQKVVAGVANA Hemoglobin-epsilon 2 4-30 FTTEENVAVASLWAKVNVEVVGGESLA Hemoglobin-epsilon 2 8-49 ENVAVASLWAKVNVEVVGGESLARLLIVCPWTQRFFDSFGNL Hemoglobin-epsilon 2 37-73 PWTQRFFDSFGNLYSESAIMGNPKVKVYGRKVLNSFG Hemoglobin-epsilon 2 67-83 KVLNSFGNAIKHMDDLK Hemoglobin-epsilon 2 93-103 HCDKLHVDPEN Hemoglobin-epsilon 2 107-140 LGNMILIVLATHFSKEFTPQMQAAWQKLTNAVAN Hemoglobin-epsilon 2 112-136 LIVLATHFSKEFTPQMQAAWQKLTN Hemoglobin-epsilon 2 123-146 FTPQMQAAWQKLTNAVANALTHKY Hemoglobin-epsilon 2 125-138 PQMQAAWQKLTNAV Hemoglobin-epsilon 2 128-135 QAAWQKLT Hemoglobin-epsilon 4 10-18 AAVASLWAK Hemoglobin-epsilon 4 21-33 VEVVGGESLARLL Hemoglobin-epsilon 4 21-64 VEVVGGESLARLLIVYPWTQRFFDSFGNLYSESAIMGNPKVKAH Hemoglobin-epsilon 4 47-76 GNLYSESAIMGNPKVKAHGRKVLNSFGNAI Hemoglobin-epsilon 4 52-80 ESAIMGNPKVKAHGRKVLNSFGNAIEHMD Hemoglobin-epsilon 4 54-79 AIMGNPKVKAHGRKVLNSFGNAIEHM Hemoglobin-epsilon 4 62-85 
KAHGRKVLNSFGNAIEHMDDLKGT Hemoglobin-epsilon 4 67-92 KVLNSFGNAIEHMDDLKGTFADLSEL Hemoglobin-epsilon 4 93-103 HCDKLHVDPEN Hemoglobin-epsilon 4 106-140 LLGNMILIVLATHFSKEFTPQMQASWQKLTNAVAN Hemoglobin-epsilon 4 115-143 LATHFSKEFTPQMQASWQKLTNAVANALA Hemoglobin-epsilon 4 121-135 KEFTPQMQASWQKLT Myoglobin 2-12 GLSDGEWQQVL Myoglobin 2-14 GLSDGEWQQVLNV Myoglobin 2-15 GLSDGEWQQVLNVW Myoglobin 2-20 GLSDGEWQQVLNVWGKVEA Myoglobin 2-30 GLSDGEWQQVLNVWGKVEADIAGHGQEVL Myoglobin 8-30 WQQVLNVWGKVEADIAGHGQEVL Myoglobin 9-30 QQVLNVWGKVEADIAGHGQEVL Myoglobin 10-30 QVLNVWGKVEADIAGHGQEVL Myoglobin 13-30 NVWGKVEADIAGHGQEVL Myoglobin 13-31 NVWGKVEADIAGHGQEVLI Myoglobin 15-30 WGKVEADIAGHGQEVL Myoglobin 15-31 WGKVEADIAGHGQEVLI Myoglobin 16-30 GKVEADIAGHGQEVL Myoglobin 19-34 EADIAGHGQEVLIRLF Myoglobin 24-47 GHGQEVLIRLFTGHPETLEKFDKF

Myoglobin 25-45 HGQEVLIRLFTGHPETLEKFD Myoglobin 30-69 LIRLFTGHPETLEKFDKFKHLKTEAEMKASEDLKKHGTVV Myoglobin 31-56 IRLFTGHPETLEKFDKFKHLKTEAEM Myoglobin 34-55 FTGHPETLEKFDKFKHLKTEAE Myoglobin 34-70 FTGHPETLEKFDKFKHLKTEAEMKASEDLKKHGTVVL Myoglobin 40-50 TLEKFDKFKHL Myoglobin 42-70 EKFDKFKHLKTEAEMKASEDLKKHGTVVL Myoglobin 45-59 DKFKHLKTEAEMKAS Myoglobin 48-65 KHLKTEAEMKASEDLKKH Myoglobin 49-65 HLKTEAEMKASEDLKKH Myoglobin 49-66 HLKTEAEMKASEDLKKHG Myoglobin 54-74 AEMKASEDLKKHGTVVLTALG Myoglobin 57-66 KASEDLKKHG Myoglobin 57-70 KASEDLKKHGTVVL Myoglobin 71-106 TALGGILKKKGHHEAELKPLAQSHATKHKIPIKYLE Myoglobin 71-107 TALGGILKKKGHHEAELKPLAQSHATKHKIPIKYLEF Myoglobin 73-104 LGGILKKKGHHEAELKPLAQSHATKHKIPIKY Myoglobin 73-106 LGGILKKKGHHEAELKPLAQSHATKHKIPIKYLE Myoglobin 74-104 GGILKKKGHHEAELKPLAQSHATKHKIPIKY Myoglobin 84-97 EAELKPLAQSHATK Myoglobin 107-135 FISDAIIHVLHSKHPGDFGADAQGAMTKA Myoglobin 107-138 FISDAIIHVLHSKHPGDFGADAQGAMTKALEL Myoglobin 108-135 ISDAIIHVLHSKHPGDFGADAQGAMTKA Myoglobin 108-137 ISDAIIHVLHSKHPGDFGADAQGAMTKALE Myoglobin 108-138 ISDAIIHVLHSKHPGDFGADAQGAMTKALEL Myoglobin 109-136 SDAIIHVLHSKHPGDFGADAQGAMTKAL Myoglobin 111-135 AIIHVLHSKHPGDFGADAQGAMTKA Myoglobin 111-137 AIIHVLHSKHPGDFGADAQGAMTKALE Myoglobin 111-138 AIIHVLHSKHPGDFGADAQGAMTKALEL Myoglobin 112-135 IIHVLHSKHPGDFGADAQGAMTKA Myoglobin 112-138 IIHVLHSKHPGDFGADAQGAMTKALEL Myoglobin 113-142 IHVLHSKHPGDFGADAQGAMTKALELFRND Myoglobin 115-140 VLHSKHPGDFGADAQGAMTKALELFR Myoglobin 115-145 VLHSKHPGDFGADAQGAMTKALELFRNDIAA Myoglobin 122-146 GDFGADAQGAMTKALELFRNDIAAK Myoglobin 124-145 FGADAQGAMTKALELFRNDIAA Myoglobin 124-146 FGADAQGAMTKALELFRNDIAAK Myoglobin 126-143 ADAQGAMTKALELFRNDI Myoglobin 129-146 QGAMTKALELFRNDIAAK Myoglobin 132-147 MTKALELFRNDIAAKY Myoglobin 139-152 FRNDIAAKYKELGF Nef 3-30 SSHHHHHHSSGLVPRGSHMGGKWSKSSV Nef 9-21 HHSSGLVPRGSHM Nef 9-32 HHSSGLVPRGSHMGGKWSKSSVIG Nef 17-27 RGSHMGGKWSK Nef 17-32 RGSHMGGKWSKSSVIG Nef 28-52 SSVIGWPAVRERMRRAEPAADGVGA Nef 29-41 SVIGWPAVRERMR Nef 34-48 PAVRERMRRAEPAAD Nef 35-64 
AVRERMRRAEPAADGVGAVSRDLEKHGAIT Nef 41-62 RRAEPAADGVGAVSRDLEKHGA Nef 42-62 RAEPAADGVGAVSRDLEKHGA Nef 47-79 ADGVGAVSRDLEKHGAITSSNTAANNAACAWLE Nef 47-81 ADGVGAVSRDLEKHGAITSSNTAANNAACAWLEAQ Nef 49-77 GVGAVSRDLEKHGAITSSNTAANNAACAW Nef 50-72 VGAVSRDLEKHGAITSSNTAANN Nef 56-90 DLEKHGAITSSNTAANNAACAWLEAQEEEEVGFPV Nef 58-89 EKHGAITSSNTAANNAACAWLEAQEEEEVGFP Nef 60-72 HGAITSSNTAANN Nef 60-93 HGAITSSNTAANNAACAWLEAQEEEEVGFPVTPQ Nef 77-97 WLEAQEEEEVGFPVTPQVPLR Nef 82-108 EEEEVGFPVTPQVPLRPMTYKAAVDLS Nef 99-118 MTYKAAVDLSHFLKEKGGLE

Nef 106-119 DLSHFLKEKGGLEG Nef 106-120 DLSHFLKEKGGLEGL Nef 108-119 SHFLKEKGGLEG Nef 140-164 YFPDWQNYTPGPGVRYPLTFGWCYK Nef 152-173 GVRYPLTFGWCYKLVPVEPDKV Nef 154-166 RYPLTFGWCYKLV Nef 173-187 VEEANKGENTSLLHP Nef 183-209 SLLHPVSLHGMDDPEREVLEWRFDSRL Nef 201-219 LEWRFDSRLAFHHVARELH Nef 202-219 EWRFDSRLAFHHVARELH Nef 206-220 DSRLAFHHVARELHP Ubiquitin 1-15 MQIFVKTLTGKTITL Ubiquitin 4-15 FVKTLTGKTITL Ubiquitin 25-45 NVKAKIQDKEGIPPDQQRLIF Ubiquitin 44-58 IFAGKQLEDGRTLSD Ubiquitin 46-58 AGKQLEDGRTLSD Ubiquitin 46-67 AGKQLEDGRTLSDYNIQKESTL Ubiquitin 59-69 YNIQKESTLHL


Table 3.4. pH 4.0 peptides
Protein Residues Peptide Sequence
Abl 31-41 LAGPSENDPNL Abl 31-42 LAGPSENDPNLF Abl 49-64 VASGDNTLSITKGEKL Abl 65-75 RVLGYNHNGEW Abl 76-90 CEAQTKNGQGWVPSN Abl 78-90 AQTKNGQGWVPSN Abl 93-102 TPVNSLEKHS Abl 93-103 TPVNSLEKHSW Abl 126-141 LVRESESSPGQRSISL Abl 127-141 VRESESSPGQRSISL Abl 150-160 YRINTASDGKL Abl 172-185 AELVHHHSTVADGL Abl 186-202 ITTLHYPAPKRNKPTVY Abl 187-202 TTLHYPAPKRNKPTVY Abl 198-210 KPTVYGVSPNYDK Abl 203-213 GVSPNYDKWEM Abl 378-387 PIKWTAPESL Abl 482-491 VEKELGKENL Albumin 1-21 DTHKSEIAHRFKDLGEEHFKG Albumin 28-58 SQYLQQCPFDEHVKLVNELTEFAKTCVADES Albumin 32-79 QQCPFDEHVKLVNELTEFAKTCVADESHAGCEKSLHTLFGDELCKVAS Albumin 71-78 GDELCKVA Albumin 77-105 VASLRETYGDMADCCEKQEPERNECFLSH Albumin 103-137 LSHKDDSPDLPKLKPDPNTLCDEFKADEKKFWGKY Albumin 113-133 PKLKPDPNTLCDEFKADEKKF Albumin 130-159 EKKFWGKYLYEIARRHPYFYAPELLYYANK Albumin 146-158 PYFYAPELLYYAN Albumin 154-164 LYYANKYNGVF Albumin 179-192 PKIETMREKVLASS Albumin 189-202 LASSARQRLRCASI Albumin 231-262 TKLVTDLTKVHKECCHGDLLECADDRADLAKY Albumin 235-271 TDLTKVHKECCHGDLLECADDRADLAKYICDNQDTIS Albumin 240-277 VHKECCHGDLLECADDRADLAKYICDNQDTISSKLKEC Albumin 241-277 HKECCHGDLLECADDRADLAKYICDNQDTISSKLKEC Albumin 253-271 ADDRADLAKYICDNQDTIS Albumin 261-289 KYICDNQDTISSKLKECCDKPLLEKSHCI Albumin 269-290 TISSKLKECCDKPLLEKSHCIA Albumin 278-287 CDKPLLEKSH Albumin 287-327 HCIAEVEKDAIPENLPPLTADFAEDKDVCKNYQEAKDAFLG Albumin 291-306 EVEKDAIPENLPPLTA Albumin 292-306 VEKDAIPENLPPLTA Albumin 292-307 VEKDAIPENLPPLTAD Albumin 292-308 VEKDAIPENLPPLTADF Albumin 308-323 FAEDKDVCKNYQEAKD Albumin 373-386 FDKLKHLVDEPQNL Albumin 374-386 DKLKHLVDEPQNL Albumin 378-390 HLVDEPQNLIKQN Albumin 381-396 DEPQNLIKQNCDQFEK Albumin 384-399 QNLIKQNCDQFEKLGE Albumin 395-406 EKLGEYGFQNAL Albumin 407-419 IVRYTRKVPQVST Albumin 466-483 TPVSEKVTKCCTESLVNR Albumin 485-512 PCFSALTPDETYVPKAFDEKLFTFHADI Albumin 488-501 SALTPDETYVPKAF Albumin 493-528 DETYVPKAFDEKLFTFHADICTLPDTEKQIKKQTAL Albumin 496-514 YVPKAFDEKLFTFHADICT Albumin 506-521 
FTFHADICTLPDTEKQ Albumin 516-528 PDTEKQIKKQTAL Albumin 529-542 VELLKHKPKATEEQ Albumin 542-555 QLKTVMENFVAFVD

Albumin 544-560 KTVMENFVAFVDKCCAA Albumin 545-572 TVMENFVAFVDKCCAADDKEACFAVEGP Aldolase 2-18 PHSHPALTPEQKKELSD Aldolase 2-21 PHSHPALTPEQKKELSDIAH Aldolase 2-25 PHSHPALTPEQKKELSDIAHRIVA Aldolase 2-31 PHSHPALTPEQKKELSDIAHRIVAPGKGIL Aldolase 4-11 SHPALTPE Aldolase 10-34 PEQKKELSDIAHRIVAPGKGILAAD Aldolase 32-44 AADESTGSIAKRL Aldolase 32-46 AADESTGSIAKRLQS Aldolase 32-50 AADESTGSIAKRLQSIGTE Aldolase 32-54 AADESTGSIAKRLQSIGTENTEE Aldolase 33-50 ADESTGSIAKRLQSIGTE Aldolase 37-53 TGSIAKRLQSIGTENTE Aldolase 60-79 RQLLLTADDRVNPCIGGVIL Aldolase 62-82 LLLTADDRVNPCIGGVILFHE Aldolase 64-76 LTADDRVNPCIGG Aldolase 71-93 NPCIGGVILFHETLYQKADDGRP Aldolase 85-97 YQKADDGRPFPQV Aldolase 85-106 YQKADDGRPFPQVIKSKGGVVG Aldolase 107-121 IKVDKGVVPLAGTNG Aldolase 107-122 IKVDKGVVPLAGTNGE Aldolase 107-128 IKVDKGVVPLAGTNGETTTQGL Aldolase 107-130 IKVDKGVVPLAGTNGETTTQGLDG Aldolase 107-131 IKVDKGVVPLAGTNGETTTQGLDGL Aldolase 107-132 IKVDKGVVPLAGTNGETTTQGLDGLS Aldolase 107-133 IKVDKGVVPLAGTNGETTTQGLDGLSE Aldolase 109-134 VDKGVVPLAGTNGETTTQGLDGLSER Aldolase 142-164 GADFAKWRCVLKIGEHTPSALAI Aldolase 175-188 ASICQQNGIVPIVE Aldolase 180-188 QNGIVPIVE Aldolase 183-204 IVPIVEPEILPDGDHDLKRCQY Aldolase 189-203 PEILPDGDHDLKRCQ Aldolase 189-204 PEILPDGDHDLKRCQY Aldolase 191-202 ILPDGDHDLKRC Aldolase 193-203 PDGDHDLKRCQ Aldolase 193-204 PDGDHDLKRCQY Aldolase 203-221 QYVTEKVLAAVYKALSDHH Aldolase 212-224 AVYKALSDHHIYL Aldolase 213-224 VYKALSDHHIYL Aldolase 215-224 KALSDHHIYL Aldolase 225-234 EGTLLKPNMV Aldolase 241-253 TQKYSHEEIAMAT Aldolase 257-270 LRRTVPPAVTGVTF Aldolase 258-270 RRTVPPAVTGVTF Aldolase 285-299 NAINKCPLLKPWALT Aldolase 287-299 INKCPLLKPWALT Aldolase 301-310 SYGRALQASA Aldolase 305-325 ALQASALKAWGGKKENLKAAQ Aldolase 309-321 SALKAWGGKKENL Aldolase 320-334 NLKAAQEEYVKRALA Aldolase 320-342 NLKAAQEEYVKRALANSLACQGK Aldolase 327-343 EYVKRALANSLACQGKY Aldolase 327-348 EYVKRALANSLACQGKYTPSGQ Aldolase 329-357 VKRALANSLACQGKYTPSGQAGAAASESL Aldolase 337-357 LACQGKYTPSGQAGAAASESL Aldolase 338-357 
ACQGKYTPSGQAGAAASESL Hemoglobin-alpha 2-24 VLSAADKGNVKAAWGKVGGHAAE Hemoglobin-alpha 3-21 LSAADKGNVKAAWGKVGGH Hemoglobin-alpha 3-22 LSAADKGNVKAAWGKVGGHA Hemoglobin-alpha 3-24 LSAADKGNVKAAWGKVGGHAAE Hemoglobin-alpha 4-24 SAADKGNVKAAWGKVGGHAAE Hemoglobin-alpha 27-36 AEALERMFLS

Hemoglobin-alpha 34-47 FLSFPTTKTYFPHF Hemoglobin-alpha 34-49 FLSFPTTKTYFPHFDL Hemoglobin-alpha 35-47 LSFPTTKTYFPHF Hemoglobin-alpha 35-54 LSFPTTKTYFPHFDLSHGSA Hemoglobin-alpha 36-47 SFPTTKTYFPHF Hemoglobin-alpha 45-68 PHFDLSHGSAQVKGHGAKVAAALT Hemoglobin-alpha 48-67 DLSHGSAQVKGHGAKVAAAL Hemoglobin-alpha 52-61 GSAQVKGHGA Hemoglobin-alpha 53-72 SAQVKGHGAKVAAALTKAVE Hemoglobin-alpha 62-79 KVAAALTKAVEHLDDLPG Hemoglobin-alpha 62-82 KVAAALTKAVEHLDDLPGALS Hemoglobin-alpha 66-81 ALTKAVEHLDDLPGAL Hemoglobin-alpha 67-81 LTKAVEHLDDLPGAL Hemoglobin-alpha 68-83 TKAVEHLDDLPGALSE Hemoglobin-alpha 71-81 VEHLDDLPGAL Hemoglobin-alpha 71-83 VEHLDDLPGALSE Hemoglobin-alpha 73-87 HLDDLPGALSELSDL Hemoglobin-alpha 80-95 ALSELSDLHAHKLRVD Hemoglobin-alpha 81-95 LSELSDLHAHKLRVD Hemoglobin-alpha 82-94 SELSDLHAHKLRV Hemoglobin-alpha 85-99 SDLHAHKLRVDPVNF Hemoglobin-alpha 88-99 HAHKLRVDPVNF Hemoglobin-alpha 94-105 VDPVNFKLLSHS Hemoglobin-alpha 94-106 VDPVNFKLLSHSL Hemoglobin-alpha 108-123 VTLASHLPSDFTPAVH Hemoglobin-alpha 109-124 TLASHLPSDFTPAVHA Hemoglobin-alpha 110-124 LASHLPSDFTPAVHA Hemoglobin-alpha 111-123 ASHLPSDFTPAVH Hemoglobin-alpha 111-124 ASHLPSDFTPAVHA Hemoglobin-alpha 111-126 ASHLPSDFTPAVHASL Hemoglobin-alpha 117-131 DFTPAVHASLDKFLA Hemoglobin-alpha 120-140 PAVHASLDKFLANVSTVLTSK Hemoglobin-beta 5-22 EEKAAVTAFWGKVKVDEV Hemoglobin-beta 6-22 EKAAVTAFWGKVKVDEV Hemoglobin-beta 14-27 WGKVKVDEVGGEAL Hemoglobin-beta 14-30 WGKVKVDEVGGEALGRL Hemoglobin-beta 48-69 STADAVMNNPKVKAHGKKVLDS Hemoglobin-beta 52-69 AVMNNPKVKAHGKKVLDS Hemoglobin-beta 52-73 AVMNNPKVKAHGKKVLDSFSNG Hemoglobin-beta 53-69 VMNNPKVKAHGKKVLDS Hemoglobin-beta 54-69 MNNPKVKAHGKKVLDS Hemoglobin-beta 55-69 NNPKVKAHGKKVLDS Hemoglobin-beta 65-77 KVLDSFSNGMKHL Hemoglobin-beta 68-84 DSFSNGMKHLDDLKGTF Hemoglobin-beta 70-84 FSNGMKHLDDLKGTF Hemoglobin-beta 79-89 DLKGTFAALSE Hemoglobin-beta 86-97 ALSELHCDKLHV Hemoglobin-beta 89-108 ELHCDKLHVDPENFKLLGNV Hemoglobin-beta 114-125 ARNFGKEFTPVL Hemoglobin-beta 118-133 GKEFTPVLQADFQKVV 
Hemoglobin-beta 118-138 GKEFTPVLQADFQKVVAGVAN Hemoglobin-beta 123-131 PVLQADFQK Hemoglobin-beta 125-140 LQADFQKVVAGVANAL Hemoglobin-beta 126-139 QADFQKVVAGVANA Hemoglobin-beta 128-140 DFQKVVAGVANAL Hemoglobin-beta 131-144 KVVAGVANALAHRY Hemoglobin-epsilon 2 10-26 VAVASLWAKVNVEVVGG Hemoglobin-epsilon 2 22-36 EVVGGESLARLLIVC Hemoglobin-epsilon 2 58-71 NPKVKVYGRKVLNS Hemoglobin-epsilon 2 109-122 NMILIVLATHFSKE Hemoglobin-epsilon 2 125-142 PQMQAAWQKLTNAVANAL Hemoglobin-epsilon 4 25-42 GGESLARLLIVYPWTQRF Hemoglobin-epsilon 4 27-40 ESLARLLIVYPWTQ

Hemoglobin-epsilon 4 67-107 KVLNSFGNAIEHMDDLKGTFADLSELHCDKLHVDPENFRLL Hemoglobin-epsilon 4 99-131 VDPENFRLLGNMILIVLATHFSKEFTPQMQASW Hemoglobin-epsilon 4 101-113 PENFRLLGNMILI Hemoglobin-epsilon 4 107-123 LGNMILIVLATHFSKEF Hemoglobin-epsilon 4 109-122 NMILIVLATHFSKE Hemoglobin-epsilon 4 124-144 TPQMQASWQKLTNAVANALAH Hemoglobin-epsilon 4 137-145 AVANALAHK Myoglobin 2-12 GLSDGEWQQVL Myoglobin 14-37 VWGKVEADIAGHGQEVLIRLFTGH Myoglobin 15-30 WGKVEADIAGHGQEVL Myoglobin 16-30 GKVEADIAGHGQEVL Myoglobin 26-36 GQEVLIRLFTG Myoglobin 30-41 LIRLFTGHPETL Myoglobin 41-52 LEKFDKFKHLKT Myoglobin 41-55 LEKFDKFKHLKTEAE Myoglobin 42-56 EKFDKFKHLKTEAEM Myoglobin 49-65 HLKTEAEMKASEDLKKH Myoglobin 58-72 ASEDLKKHGTVVLTA Myoglobin 69-84 VLTALGGILKKKGHHE Myoglobin 71-87 TALGGILKKKGHHEAEL Myoglobin 73-87 LGGILKKKGHHEAEL Myoglobin 85-111 AELKPLAQSHATKHKIPIKYLEFISDA Myoglobin 88-104 KPLAQSHATKHKIPIKY Myoglobin 92-102 QSHATKHKIPI Myoglobin 98-110 HKIPIKYLEFISD Myoglobin 105-117 LEFISDAIIHVLH Myoglobin 108-127 ISDAIIHVLHSKHPGDFGAD Myoglobin 109-124 SDAIIHVLHSKHPGDF Myoglobin 109-136 SDAIIHVLHSKHPGDFGADAQGAMTKAL Myoglobin 111-132 AIIHVLHSKHPGDFGADAQGAM Myoglobin 112-127 IIHVLHSKHPGDFGAD Myoglobin 112-130 IIHVLHSKHPGDFGADAQG Myoglobin 112-135 IIHVLHSKHPGDFGADAQGAMTKA Myoglobin 113-121 IHVLHSKHP Nef 4-38 SHHHHHHSSGLVPRGSHMGGKWSKSSVIGWPAVRE Nef 11-32 SSGLVPRGSHMGGKWSKSSVIG Nef 35-53 AVRERMRRAEPAADGVGAV Nef 52-68 AVSRDLEKHGAITSSNT Nef 60-72 HGAITSSNTAANN Nef 85-100 EVGFPVTPQVPLRPMT Nef 98-115 PMTYKAAVDLSHFLKEKG Nef 100-112 TYKAAVDLSHFLK Nef 154-167 RYPLTFGWCYKLVP Nef 189-202 SLHGMDDPEREVLE Nef 207-224 SRLAFHHVARELHPEYFK Ubiquitin 1-15 MQIFVKTLTGKTITL Ubiquitin 16-45 EVEPSDTIENVKAKIQDKEGIPPDQQRLIF Ubiquitin 25-45 NVKAKIQDKEGIPPDQQRLIF Ubiquitin 44-58 IFAGKQLEDGRTLSD


Table 3.5. pH 2.5 E. coli peptides
Protein Peptide Sequence
ACP_ECOLI (P02901) AAIDYINGHQA ACP_ECOLI (P02901) AIDYINGHQA ACP_ECOLI (P02901) ALEEEF ACP_ECOLI (P02901) DTEIPDEEA ACP_ECOLI (P02901) DTEIPDEEAEKITTVQ ACP_ECOLI (P02901) FDTEIPDEE ACP_ECOLI (P02901) FDTEIPDEEA ACP_ECOLI (P02901) GVKQEEVTNNAS ACP_ECOLI (P02901) GVKQEEVTNNASF ACP_ECOLI (P02901) LGADSLDTVE ACP_ECOLI (P02901) RVKKIIGEQL ACP_ECOLI (P02901) SFVEDLGADSLDTVELVM AHPC_ECOLI (P26427) AEGIGRDASDL AHPC_ECOLI (P26427) EGRWSVFFFYPADFTFVCPTELGDV AHPC_ECOLI (P26427) FKNGEF AHPC_ECOLI (P26427) GDPTGALTRNFDNMR AHPC_ECOLI (P26427) GDVADHYEEL AHPC_ECOLI (P26427) GEVCPAKWKEGE AHPC_ECOLI (P26427) GLADRATFVVDPQGIIQ AHPC_ECOLI (P26427) IEITEKDTEGRW AHPC_ECOLI (P26427) IEVTAEGIGRDASDL AHPC_ECOLI (P26427) IGDPTGAL AHPC_ECOLI (P26427) IGDPTGALTRN AHPC_ECOLI (P26427) IGDPTGALTRNFDNM AHPC_ECOLI (P26427) IGRDASDLLR AHPC_ECOLI (P26427) ITEKDTEGRW AHPC_ECOLI (P26427) QYVASHPGEVCPAKWKE AHPC_ECOLI (P26427) REDEGLADRATF AHPC_ECOLI (P26427) TAEGIGRDASDL AHPC_ECOLI (P26427) TAEGIGRDASDLL AHPC_ECOLI (P26427) VTAEGIGRDASDL ALKH_ECOLI (P10177) AISPGLTEPLL ALKH_ECOLI (P10177) AIVGAGTVLNPQQL ALKH_ECOLI (P10177) FAISPGLTEPLL ALKH_ECOLI (P10177) FKFFPAEANGGVKALQA ALKH_ECOLI (P10177) GTVLNPQQLAEVTEAGAQ ALKH_ECOLI (P10177) IAKEVPEAIVGAGTVLNPQQLAE ALKH_ECOLI (P10177) KAATEGTIPLIPGIST ALKH_ECOLI (P10177) LVAGGVRVLE ALKH_ECOLI (P10177) PFSQVRFCPTGGISP ALKH_ECOLI (P10177) PQQLAEVTEAGAQFAI ALKH_ECOLI (P10177) PQQLAEVTEAGAQFAISPGL ALKH_ECOLI (P10177) QLAEVTEAGAQFAISPGLT ALKH_ECOLI (P10177) SVLCIGGSWLVPADALEA ALKH_ECOLI (P10177) TTGPVVPVIV ALKH_ECOLI (P10177) YLALKSVLCIGGS ARTI_ECOLI (P30859) AGMDITPEREKQVL ARTI_ECOLI (P30859) DLQNGRIDGVF ARTI_ECOLI (P30859) DLQNGRIDGVFGDTA ARTI_ECOLI (P30859) IAVRQGNTEL ARTI_ECOLI (P30859) IMDKHPEITTVPYDSYQNAKL ARTI_ECOLI (P30859) NAKLDLQNG ARTI_ECOLI (P30859) QALCKEIDATC ARTI_ECOLI (P30859) QNGRIDGVFGDTA ARTI_ECOLI (P30859) SIDANNQIVGFDVDLAQ CH10_ECOLI (P05380) AAKSTRGEVL CH10_ECOLI (P05380) AVGNGRILENG 
CH10_ECOLI (P05380) AVGNGRILENGEVKPLD CH10_ECOLI (P05380) AVGNGRILENGEVKPLDV CH10_ECOLI (P05380) AVGNGRILENGEVKPLDVKVGD

CH10_ECOLI (P05380) GNGRILENGEVKPLD CH10_ECOLI (P05380) GNGRILENGEVKPLDVKVGD CH10_ECOLI (P05380) IVIFNDGY CH10_ECOLI (P05380) IVIFNDGYGVKS CH10_ECOLI (P05380) IVIFNDGYGVKSE CH10_ECOLI (P05380) IVIFNDGYGVKSEKIDNE CH10_ECOLI (P05380) IVIFNDGYGVKSEKIDNEE CH10_ECOLI (P05380) MNIRPLHD CH10_ECOLI (P05380) SAGGIVLTGSAAAKSTRGE CH10_ECOLI (P05380) TGSAAAKSTRGEVL CH10_ECOLI (P05380) VKPLDVKVGDIVIFN CH60_ECOLI (P06139) AAIQGRVAQ CH60_ECOLI (P06139) AAIQGRVAQIRQQIEE CH60_ECOLI (P06139) AAKDVKFGND CH60_ECOLI (P06139) AIQGRVAQIRQQ CH60_ECOLI (P06139) ATLTGGTVISEEIGMELEKA CH60_ECOLI (P06139) ATSDYDREKLQE CH60_ECOLI (P06139) DKSFGAPTITKDGVS CH60_ECOLI (P06139) EDGTGLQDELDVVE CH60_ECOLI (P06139) EDLGQAKRVVINKDTTT CH60_ECOLI (P06139) EEPSVVANTVKGGDGNYGY CH60_ECOLI (P06139) EEPSVVANTVKGGDGNYGYNAA CH60_ECOLI (P06139) EGLKAVAAGMNPM CH60_ECOLI (P06139) EPSVVANTVKGGDGNYGYNAATEEYGNMIDMGILD CH60_ECOLI (P06139) ETVGKLIAEAMDKV CH60_ECOLI (P06139) GAAGGMGGMGGMGGMM CH60_ECOLI (P06139) GILDPTKVTRSALQ CH60_ECOLI (P06139) GQAKRVVINKDTTT CH60_ECOLI (P06139) IAQVGTISANSDETVGKL CH60_ECOLI (P06139) IATLTGGTVISEE CH60_ECOLI (P06139) IDMGILDPTKVTRSALQ CH60_ECOLI (P06139) IIDGVGEE CH60_ECOLI (P06139) KDGVSVAREIELED CH60_ECOLI (P06139) KKARVEDALHATRAA CH60_ECOLI (P06139) LGQAKRVVINKDTTT CH60_ECOLI (P06139) LIIAEDVEGEAL CH60_ECOLI (P06139) LSPYFINKPETGAV CH60_ECOLI (P06139) MVTDLPKNDA CH60_ECOLI (P06139) MVTDLPKNDAADL CH60_ECOLI (P06139) MVTDLPKNDAADLGAAGGMGGMGGMGGMM CH60_ECOLI (P06139) QDIATLTGGTVISEEIG CH60_ECOLI (P06139) QGRVAQIRQ CH60_ECOLI (P06139) RVEDALHATRAAV CH60_ECOLI (P06139) TLGPKGRNVVLDKSFGAP CH60_ECOLI (P06139) TRSALQYAASV CH60_ECOLI (P06139) TSDYDREKLQE CH60_ECOLI (P06139) VAAGMNPMDL CH60_ECOLI (P06139) VGTISANSDETVGKL CH60_ECOLI (P06139) VIKVGAATE CH60_ECOLI (P06139) VKEVASKANDAAGDGTTTA CH60_ECOLI (P06139) VKVTLGPKGRN CH60_ECOLI (P06139) VTDLPKNDAADL CH60_ECOLI (P06139) VTDLPKNDAADLGAAGGMGGMGGMGGMM CH60_ECOLI (P06139) VVLDKSFGAPTITKDGVS CH60_ECOLI (P06139) YFINKPETGAVELES CH60_ECOLI 
(P06139) YLSPYF CSPC_ECOLI (P36996) EIQDGQKGPAAVNV CSPC_ECOLI (P36996) ESKGFGFITPADG CSPC_ECOLI (P36996) FEIQDGQKGPAAVNV CSPC_ECOLI (P36996) FITPADGSKDVF CSPC_ECOLI (P36996) FKTLAEGQN CSPC_ECOLI (P36996) FKTLAEGQNVE CSPC_ECOLI (P36996) FVHFSAIQGNGFK

CSPC_ECOLI (P36996) GFGFITPADGSK CSPC_ECOLI (P36996) GFITPADGSKDVF CSPC_ECOLI (P36996) GFITPADGSKDVFVHFSA CSPC_ECOLI (P36996) IQGNGFKTLAEGQN CSPC_ECOLI (P36996) ITPADGSKDVF CSPC_ECOLI (P36996) ITPADGSKDVFVHFSAIQ CSPC_ECOLI (P36996) SAIQGNGFKTLAEGQ CSPC_ECOLI (P36996) SAIQGNGFKTLAEGQN CSPC_ECOLI (P36996) SAIQGNGFKTLAEGQNVE CSPE_ECOLI (P36997) EITNGAKGPSAANV CSPE_ECOLI (P36997) FITPEDGSKDVF CSPE_ECOLI (P36997) IQTNGFKTLAEGQRVEF CSPE_ECOLI (P36997) KDVFVHFSAIQTNGFKTLAEGQRVEFEITN CSPE_ECOLI (P36997) QRVEFEITNGAKG CSPE_ECOLI (P36997) SAIQTNGFKTLAEGQRVE CSPE_ECOLI (P36997) SAIQTNGFKTLAEGQRVEF CSPE_ECOLI (P36997) TNGFKTLAEGQRVE DBHA_ECOLI (P02342) AALESTLAAITESLKEGD DBHA_ECOLI (P02342) AEKAEL DBHA_ECOLI (P02342) IKIAAANVPAF DBHA_ECOLI (P02342) ITESLKEGDAVQLVGFG DBHA_ECOLI (P02342) KEGDAVQLVGFGTFKVNHRAERTGRNP DBHA_ECOLI (P02342) MNKTQL DBHA_ECOLI (P02342) NKTQL DBHA_ECOLI (P02342) QTGKEIKIAAANVPAFV DBHA_ECOLI (P02342) SGKALKDAVK DBHA_ECOLI (P02342) SKTQAKAAL DBHA_ECOLI (P02342) SLKEGDAVQL DBHA_ECOLI (P02342) VGFGTF DBHA_ECOLI (P02342) VSGKALKDAVK DBHB_ECOLI (P02341) AAGADISKAAAGRALDAIIASVTESLKEGDDVAL DBHB_ECOLI (P02341) AAGRALDAIIASVTESLKEGDDVALVGFGTFAVK DBHB_ECOLI (P02341) AAKVPSF DBHB_ECOLI (P02341) AGRALDAIIAS DBHB_ECOLI (P02341) AKVPSF DBHB_ECOLI (P02341) ALDAIIASVTESLKEGDDVALVGFGT DBHB_ECOLI (P02341) ASVTESLKEGDD DBHB_ECOLI (P02341) ASVTESLKEGDDVALVGFG DBHB_ECOLI (P02341) EITIAAAKVPS DBHB_ECOLI (P02341) ESLKEGDDVALVGFGTFAVKER DBHB_ECOLI (P02341) GADISKAAAGRA DBHB_ECOLI (P02341) GRALDAIIASVTESLK DBHB_ECOLI (P02341) IASVTESLKEGDDV DBHB_ECOLI (P02341) IDKIAAGAD DBHB_ECOLI (P02341) ISKAAAGRALD DBHB_ECOLI (P02341) KERAARTGRN DBHB_ECOLI (P02341) MNKSQL DBHB_ECOLI (P02341) NKSQL DBHB_ECOLI (P02341) RAGKALKDAVN DBHB_ECOLI (P02341) SVTESL DBHB_ECOLI (P02341) SVTESLKEGDD DBHB_ECOLI (P02341) SVTESLKEGDDVAL DBHB_ECOLI (P02341) VGFGTF DCEA_ECOLI (P80063) AINFSRPAGQVIAQ DCEA_ECOLI (P80063) FSRPAGQVIAQ DCEA_ECOLI (P80063) HAPAPKNGQAVGTNTIGSSEA DCEA_ECOLI (P80063) HIDAASGGFLAPFVA 
DCEA_ECOLI (P80063) IAKLGPYEF DCEA_ECOLI (P80063) LSINKNWIDKEEYPQSAA DCEA_ECOLI (P80063) NVDYLGGQIGTF DCEA_ECOLI (P80063) QVPAFTLGGEATDIVVMR DCEA_ECOLI (P80063) REIPMRPGQL DCEA_ECOLI (P80063) VAFQIINDELYL

DCEA_ECOLI (P80063) WHAPAPKNGQAVGTNTIGSSE DCEA_ECOLI (P80063) WVIWRDEEAL DCEA_ECOLI (P80063) YEFPQPLHDAL DCEA_ECOLI (P80063) YLDGNARQNLAT DCEB_ECOLI (P28302) AGKPTDKPN DCEB_ECOLI (P28302) AINFSRPAGQVIAQ DCEB_ECOLI (P28302) APKNGQAVGTNTIG DCEB_ECOLI (P28302) AYLADEIAKLGPYEFI DCEB_ECOLI (P28302) ELLDSRFGAKSIST DCEB_ECOLI (P28302) ELVFNVDYLGGQIGT DCEB_ECOLI (P28302) ELYLDGNARQNLATF DCEB_ECOLI (P28302) FSRPAGQVIAQ DCEB_ECOLI (P28302) GGQIGTFAINFSR DCEB_ECOLI (P28302) GTFAINFSRPAGQV DCEB_ECOLI (P28302) HIDAASGGFLAPFVA DCEB_ECOLI (P28302) IAESKRFPLHE DCEB_ECOLI (P28302) IAKLGPYEF DCEB_ECOLI (P28302) KEEYPQSAAIDLRCVNM DCEB_ECOLI (P28302) KLKDGEDPGYTL DCEB_ECOLI (P28302) LAPFVAPDIVWD DCEB_ECOLI (P28302) LRSELLDSRFGAKSIST DCEB_ECOLI (P28302) LSINKNWIDKEEYPQSAA DCEB_ECOLI (P28302) NTIGVVPTF DCEB_ECOLI (P28302) NVDYLGGQIGTF DCEB_ECOLI (P28302) NVDYLGGQIGTFAINFSRPAGQVIAQYYEFL DCEB_ECOLI (P28302) QVPAFTLGGEATDIVVMR DCEB_ECOLI (P28302) REIPMRPGQL DCEB_ECOLI (P28302) RPAGQVIAQYYEFL DCEB_ECOLI (P28302) RPGQLFMDPKRM DCEB_ECOLI (P28302) RSELLDSRFGAKS DCEB_ECOLI (P28302) SASGHKFGLAPLGCGWVIWRDE DCEB_ECOLI (P28302) TDLRSELLDSRF DCEB_ECOLI (P28302) VADLWHAPAPKNGQA DCEB_ECOLI (P28302) VAFQIINDELYL DCEB_ECOLI (P28302) VDYLGGQIGTFAIN DCEB_ECOLI (P28302) WHAPAPKNGQAVGTNTIGSSE DCEB_ECOLI (P28302) WIDKEEYPQSAAID DCEB_ECOLI (P28302) WQVPAFTLGGEA DCEB_ECOLI (P28302) WVIWRDEEAL DCEB_ECOLI (P28302) YEFPQPLHDAL DCEB_ECOLI (P28302) YLDGNARQNLAT DGAL_ECOLI (P02927) AKNLADGKGAADGTNWKIDNKV DGAL_ECOLI (P02927) ALVKSGALAGTVL DGAL_ECOLI (P02927) AMAMGAVEALKAHN DGAL_ECOLI (P02927) AMAMGAVEALKAHNKSSIPVFGVDA DGAL_ECOLI (P02927) ATFDLAKNLADG DGAL_ECOLI (P02927) DTAQAKDKMDA DGAL_ECOLI (P02927) ESGIIQGDLIAKH DGAL_ECOLI (P02927) FMSVVRKAIEQDA DGAL_ECOLI (P02927) IAKHWAANQGWDL DGAL_ECOLI (P02927) IEKARGQNVPVV DGAL_ECOLI (P02927) IEQDAKAAPDVQLLMNDSQ DGAL_ECOLI (P02927) INLVDPAAAGTVI DGAL_ECOLI (P02927) INLVDPAAAGTVIEKARGQNVPVVFFNKE DGAL_ECOLI (P02927) IQGDLIAKHWA DGAL_ECOLI (P02927) KARGQNVPVVF DGAL_ECOLI 
(P02927) KATFDLAKNLADGKGAA DGAL_ECOLI (P02927) KELNDKGIKTEQ DGAL_ECOLI (P02927) KGEPGHPDAEARTTY DGAL_ECOLI (P02927) LFGAAAHAADTRIG DGAL_ECOLI (P02927) LMNDSQNDQSKQNDQIDVL DGAL_ECOLI (P02927) LSGPNANKIE DGAL_ECOLI (P02927) LTLSAVMASMLFGAAAHAA

DGAL_ECOLI (P02927) NDANNQAKATF DGAL_ECOLI (P02927) TFDLAKNLADGKGAADGT DGAL_ECOLI (P02927) VDPAAAGTVIEKARGQNVPVVFFNKEPSR DGAL_ECOLI (P02927) VFFNKEPSRKALDSYDKA DGAL_ECOLI (P02927) VIEKARGQNVPVVF DGAL_ECOLI (P02927) VIKELNDKGIKTEQ DGAL_ECOLI (P02927) VKSGALAGTVL DGAL_ECOLI (P02927) VRKAIEQDAKAAPDVQL DGAL_ECOLI (P02927) VVIANNDA DGAL_ECOLI (P02927) WDTAQAKDKMDA DGAL_ECOLI (P02927) WLSGPNANKIE DGAL_ECOLI (P02927) YVGTDSKESGIIQGDL DGAL_ECOLI (P02927) YYVGTDSKESGIIQGDL DNAK_ECOLI (P04475) ADNKSLGQFNLDGINPAPRGMPQ DNAK_ECOLI (P04475) AGADASANNAKDDD DNAK_ECOLI (P04475) AGLSVSDIDDVILVGG DNAK_ECOLI (P04475) AIMDGTTPRVLE DNAK_ECOLI (P04475) AIMDGTTPRVLENAEGDRTTPSI DNAK_ECOLI (P04475) AIMDGTTPRVLENAEGDRTTPSII DNAK_ECOLI (P04475) AIMDGTTPRVLENAEGDRTTPSIIA DNAK_ECOLI (P04475) ALAYGLDKGTGNRT DNAK_ECOLI (P04475) AQQTDVNLPYITA DNAK_ECOLI (P04475) AQVSQKLME DNAK_ECOLI (P04475) ATNGDTHLGGED DNAK_ECOLI (P04475) ATNGDTHLGGEDF DNAK_ECOLI (P04475) AVITVPAY DNAK_ECOLI (P04475) AVITVPAYF DNAK_ECOLI (P04475) AYGLDKGTGNRTIA DNAK_ECOLI (P04475) AYGLDKGTGNRTIAV DNAK_ECOLI (P04475) DDKTAIESALTALETALKG DNAK_ECOLI (P04475) DDVILVGGQTRMPMVQ DNAK_ECOLI (P04475) DGINPAPRGMPQIE DNAK_ECOLI (P04475) DGTTPRVLENAEGDRTTPSIIA DNAK_ECOLI (P04475) DIDADGIL DNAK_ECOLI (P04475) DKGTGNRTIAVY DNAK_ECOLI (P04475) DLVNRSIEPLKVAL DNAK_ECOLI (P04475) EAVAIGAAVQGGVLTGDVKD DNAK_ECOLI (P04475) ELVQTRNQGDHLLHSTRKQVEEA DNAK_ECOLI (P04475) EPTAAALAYGLDKGT DNAK_ECOLI (P04475) EPTAAALAYGLDKGTGN DNAK_ECOLI (P04475) EVKRIINEPTAAAL DNAK_ECOLI (P04475) FGKEPRKDVNPDEA DNAK_ECOLI (P04475) FNDAQRQATKDAGRIAGL DNAK_ECOLI (P04475) GDHLLHSTRKQVEE DNAK_ECOLI (P04475) GGQTRMPMVQKKVAEF DNAK_ECOLI (P04475) GIETMGGVMTTLIAKNTT DNAK_ECOLI (P04475) GKEPRKDVNPDEAVAIGAAVQGGVLTGDV DNAK_ECOLI (P04475) GQFNLDGINPAPRGM DNAK_ECOLI (P04475) IAKNTTIPTKHSQV DNAK_ECOLI (P04475) IAKNTTIPTKHSQVF DNAK_ECOLI (P04475) IAKNTTIPTKHSQVFSTAEDNQSAVT DNAK_ECOLI (P04475) IAQQQHAQQQTAGADASANNAKDDDVVD DNAK_ECOLI (P04475) IAQQQHAQQQTAGADASANNAKDDDVVDAE DNAK_ECOLI 
(P04475) IAVYDLGGGTFDISIIE DNAK_ECOLI (P04475) IDEVDGEKT DNAK_ECOLI (P04475) IDEVDGEKTFE DNAK_ECOLI (P04475) IMDGTTPRVLE DNAK_ECOLI (P04475) IMDGTTPRVLENAEGDRTTPSIIA DNAK_ECOLI (P04475) INPAPRGMPQIEVTFDI DNAK_ECOLI (P04475) KASSGLNEDEIQKMV DNAK_ECOLI (P04475) KQVEEAGDKLPADDKT DNAK_ECOLI (P04475) KTAIESALTALETALK DNAK_ECOLI (P04475) LAQVSQKLME

DNAK_ECOLI (P04475) LDVTPLSL DNAK_ECOLI (P04475) LDVTPLSLGIETMGG DNAK_ECOLI (P04475) LLDVTPLSL DNAK_ECOLI (P04475) LVNRSIEPLKVAL DNAK_ECOLI (P04475) MEIAQQQHAQQQTAGADASANNAKDDDVVD DNAK_ECOLI (P04475) MEIAQQQHAQQQTAGADASANNAKDDDVVDAE DNAK_ECOLI (P04475) NAEADRKFEEL DNAK_ECOLI (P04475) NAEGDRTTPSIIA DNAK_ECOLI (P04475) NEPTAAALAYGLDK DNAK_ECOLI (P04475) NGDAWVEVKGQKM DNAK_ECOLI (P04475) NLDGINPAPRGMPQI DNAK_ECOLI (P04475) NLDGINPAPRGMPQIE DNAK_ECOLI (P04475) NLDGINPAPRGMPQIEVT DNAK_ECOLI (P04475) NPAPRGMPQIEVTF DNAK_ECOLI (P04475) PRVLENAEGDRTTPSIIA DNAK_ECOLI (P04475) PYITA DNAK_ECOLI (P04475) PYITADATGPKHMN DNAK_ECOLI (P04475) QGGVLTGDVKDV DNAK_ECOLI (P04475) RAADNKSLGQFNL DNAK_ECOLI (P04475) RKQVEEAGDKLPADDK DNAK_ECOLI (P04475) RKQVEEAGDKLPADDKT DNAK_ECOLI (P04475) RRFQDEEVQRD DNAK_ECOLI (P04475) RSIEPLKVAL DNAK_ECOLI (P04475) SLGIETMGGVMTT DNAK_ECOLI (P04475) SSAQQTDVNL DNAK_ECOLI (P04475) SSAQQTDVNLPYITA DNAK_ECOLI (P04475) SVSDIDD DNAK_ECOLI (P04475) TGPKHMNIKVTRAKL DNAK_ECOLI (P04475) VAIMDGTTPRVLENAEGDRTTPSIIA DNAK_ECOLI (P04475) VEVKGQKMAPPQISA DNAK_ECOLI (P04475) VGGQTRMPMVQKKVAEF DNAK_ECOLI (P04475) VGQPAKRQAVTNPQNTL DNAK_ECOLI (P04475) VGQPAKRQAVTNPQNTLF DNAK_ECOLI (P04475) VLATNGDTHLGGED DNAK_ECOLI (P04475) VLATNGDTHLGGEDF DNAK_ECOLI (P04475) VNPDEAVAIGAAVQGG DNAK_ECOLI (P04475) VNRSIEPLKVAL DNAK_ECOLI (P04475) VQRDVSIMPFKIIA DNAK_ECOLI (P04475) VSAKDKNSGKEQK DNAK_ECOLI (P04475) VSIMPFKIIA DNAK_ECOLI (P04475) VSIMPFKIIAADNGDA DNAK_ECOLI (P04475) WVEVKGQKMAPPQISA DNAK_ECOLI (P04475) WVEVKGQKMAPPQISAE DNAK_ECOLI (P04475) YDLGGGTF DNAK_ECOLI (P04475) YDLGGGTFDISIIEIDEVD EFTS_ECOLI (P02997) ADKVLDAAVAGKITDVEVLKAQFE EFTS_ECOLI (P02997) AGFQAFADKVL EFTS_ECOLI (P02997) AMQSGKPKEIAEKMVEG EFTS_ECOLI (P02997) CKKALTEANGDIEL EFTS_ECOLI (P02997) EGDVLGSYQHGARIGVL EFTS_ECOLI (P02997) HVAASKPEF EFTS_ECOLI (P02997) ILEVNCQTDFVAKDA EFTS_ECOLI (P02997) RFEVGEGIEKVET EFTS_ECOLI (P02997) TGQPFVMEPSKTVGQL EFTS_ECOLI (P02997) VSLTGQPFVMEPSKTVGQL EFTU_ECOLI (P02990) 
AAQMDGAILVVAATDGPMPQTREHIL EFTU_ECOLI (P02990) AIREGGRTVGAGVVA EFTU_ECOLI (P02990) AKTYGGAARAF EFTU_ECOLI (P02990) DQIDNAPEEKARG EFTU_ECOLI (P02990) DYVKNMITGAAQMDGAIL EFTU_ECOLI (P02990) EKFERTKPHVNVG EFTU_ECOLI (P02990) EKFERTKPHVNVGTIGH EFTU_ECOLI (P02990) FDQIDNAPEEKARG

EFTU_ECOLI (P02990) FFKGYRPQFYFRTTDV EFTU_ECOLI (P02990) FRKLLDEGRAGEN EFTU_ECOLI (P02990) GGRHTPFFKGYRPQFYFRTTDVT EFTU_ECOLI (P02990) GRQVGVPYIIVF EFTU_ECOLI (P02990) HVDHGKTTLTAAI EFTU_ECOLI (P02990) IKETQKSTCTGVEM EFTU_ECOLI (P02990) LDSYIPEPERA EFTU_ECOLI (P02990) TINTSHVEYDTPTRH EFTU_ECOLI (P02990) VGTIGHVDHGKTTLT EFTU_ECOLI (P02990) VKNMITGAAQMDGAILVVAATDGPMPQT EFTU_ECOLI (P02990) VNVGTIGHVDHGKTT EFTU_ECOLI (P02990) VVAATDGPMPQTRE EFTU_ECOLI (P02990) YDFPGDDTPIVRGSAL EFTU_ECOLI (P02990) YFRTTDVTGT EFTU_ECOLI (P02990) YVKNMITGAAQM G3P1_ECOLI (P06977) GAAKAVGKVLP G3P1_ECOLI (P06977) GGRGASQNII G3P1_ECOLI (P06977) GPSHKDWRGGRGAS G3P1_ECOLI (P06977) KGANFDKYAGQDIVSNASCTTNCLAPLA G3P1_ECOLI (P06977) NIIPSSTGAAKAVGKVLPEL G3P1_ECOLI (P06977) PANLKWDEVGVDVVAEATGLFLTDETAR G3P1_ECOLI (P06977) TIKVGINGF G3P1_ECOLI (P06977) YDNETGYSNKVLDL GLNH_ECOLI (P10344) DYELKPMDF GLNH_ECOLI (P10344) EAQQYGIAFPKGSDE GLNH_ECOLI (P10344) EAQQYGIAFPKGSDEL GLNH_ECOLI (P10344) GNGQFKAVGDSL GLNH_ECOLI (P10344) LKPMDFSGIIPALQTKNVDL GLNH_ECOLI (P10344) LRQFPNIDNA GLNH_ECOLI (P10344) MELGTNRADAVL GLNH_ECOLI (P10344) YMELGTNRADAVL GLR3_ECOLI (P37687) DARGGLDPLLK GLR3_ECOLI (P37687) EEMIKRSGRTTVPQ GLR3_ECOLI (P37687) IDAQHIGGCDDL GLR3_ECOLI (P37687) KGVSFQELPIDGNAAKREE GLR3_ECOLI (P37687) QELPIDGNAAKREEM GLR3_ECOLI (P37687) YALDARGGLDPLLK HDEA_ECOLI (P26604) ACTQDKQANF HDEA_ECOLI (P26604) ADAQKAADNKKPVNS HDEA_ECOLI (P26604) ADAQKAADNKKPVNSW HDEA_ECOLI (P26604) ADAQKAADNKKPVNSWT HDEA_ECOLI (P26604) ADAQKAADNKKPVNSWTCE HDEA_ECOLI (P26604) ADAQKAADNKKPVNSWTCEDF HDEA_ECOLI (P26604) AEALNNKDKPED HDEA_ECOLI (P26604) AEALNNKDKPEDAVL HDEA_ECOLI (P26604) ALNNKDKPED HDEA_ECOLI (P26604) AQKAADNKKPVNSW HDEA_ECOLI (P26604) AVDESFQPTA HDEA_ECOLI (P26604) AVDESFQPTAVG HDEA_ECOLI (P26604) AVDESFQPTAVGF HDEA_ECOLI (P26604) CTQDKQANF HDEA_ECOLI (P26604) DAQKAADNKKPVNS HDEA_ECOLI (P26604) DAQKAADNKKPVNSW HDEA_ECOLI (P26604) DAVLDVQGIATVTP HDEA_ECOLI (P26604) DESFQPTA HDEA_ECOLI (P26604) DESFQPTAVGF HDEA_ECOLI 
(P26604) DKIKKD HDEA_ECOLI (P26604) DVQGIA HDEA_ECOLI (P26604) DVQGIATVT HDEA_ECOLI (P26604) DVQGIATVTPAIVQ HDEA_ECOLI (P26604) DVQGIATVTPAIVQA HDEA_ECOLI (P26604) EALNNKDKPED HDEA_ECOLI (P26604) ESFQPTAVGF

HDEA_ECOLI (P26604) FAEALNNKDKPED HDEA_ECOLI (P26604) FAEALNNKDKPEDAVL HDEA_ECOLI (P26604) GFAEALNNKD HDEA_ECOLI (P26604) IATVTPAIVQ HDEA_ECOLI (P26604) KDKVKGEW HDEA_ECOLI (P26604) LAVDE HDEA_ECOLI (P26604) LAVDESFQPTA HDEA_ECOLI (P26604) LAVDESFQPTAVG HDEA_ECOLI (P26604) LAVDESFQPTAVGF HDEA_ECOLI (P26604) NKDKPEDAVLDVQGIAT HDEA_ECOLI (P26604) NKKPVNSW HDEA_ECOLI (P26604) PAIVQACTQDKQANFKDK HDEA_ECOLI (P26604) PTAVGFAEALNNKDKPE HDEA_ECOLI (P26604) QGIATVTPAIVQ HDEA_ECOLI (P26604) QPTAVGF HDEA_ECOLI (P26604) SFQPTA HDEA_ECOLI (P26604) SFQPTAVGF HDEA_ECOLI (P26604) SFQPTAVGFAEALNN HDEA_ECOLI (P26604) TAVGFAEALNNKDK HDEA_ECOLI (P26604) TCEDF HDEA_ECOLI (P26604) TVTPAIVQ HDEA_ECOLI (P26604) TVTPAIVQA HDEA_ECOLI (P26604) VDESFQPTAVGF HDEA_ECOLI (P26604) VTPAIVQ HDEA_ECOLI (P26604) WTCEDF HDEB_ECOLI (P26605) ANESAKDMTCQEF HDEB_ECOLI (P26605) ETVYKGGDT HDEB_ECOLI (P26605) ETVYKGGDTVT HDEB_ECOLI (P26605) ETVYKGGDTVTL HDEB_ECOLI (P26605) FIDLNPKAMT HDEB_ECOLI (P26605) FIDLNPKAMTPVAW HDEB_ECOLI (P26605) FKNQASNDLPN HDEB_ECOLI (P26605) HEETVYKGGDTVT HDEB_ECOLI (P26605) IDLNPKAMT HDEB_ECOLI (P26605) IDLNPKAMTPVAW HDEB_ECOLI (P26605) IDLNPKAMTPVAWW HDEB_ECOLI (P26605) IDLNPKAMTPVAWWML HDEB_ECOLI (P26605) KAFIFMGAVAALS HDEB_ECOLI (P26605) LNETDL HDEB_ECOLI (P26605) LTQIPKVIE HDEB_ECOLI (P26605) LTQIPKVIEY HDEB_ECOLI (P26605) MTPVAW HDEB_ECOLI (P26605) NETDL HDEB_ECOLI (P26605) NETDLTQIPKVIE HDEB_ECOLI (P26605) NISSLRKAFIFM HDEB_ECOLI (P26605) NPKAMTPVAW HDEB_ECOLI (P26605) TDLTQIPKVIE HDEB_ECOLI (P26605) TQIPKVIE HDEB_ECOLI (P26605) TQIPKVIEY HDEB_ECOLI (P26605) VYKGGDTVTL HDEB_ECOLI (P26605) WMLHEET HLPA_ECOLI (P11457) ADKIAIVNMGSLFQQVAQKTGVS HLPA_ECOLI (P11457) ADKIAIVNMGSLFQQVAQKTGVSNT HLPA_ECOLI (P11457) ADKIAIVNMGSLFQQVAQKTGVSNTL HLPA_ECOLI (P11457) ADKIAIVNMGSLFQQVAQKTGVSNTLE HLPA_ECOLI (P11457) ADKIAIVNMGSLFQQVAQKTGVSNTLENE HLPA_ECOLI (P11457) AIVNMGSLFQQVAQKTG HLPA_ECOLI (P11457) FQQVAQKTGVSNTLE HLPA_ECOLI (P11457) FQQVAQKTGVSNTLENE HLPA_ECOLI (P11457) GRASELQRMETDLQAK 
HLPA_ECOLI (P11457) NEFKGRASELQRM HLPA_ECOLI (P11457) VDANAVAYNSSDVKDITAD HNS_ECOLI (P08936) DPNELL

HNS_ECOLI (P08936) EKLEV HNS_ECOLI (P08936) EVVVNE HNS_ECOLI (P08936) IADGIDPNEL HNS_ECOLI (P08936) IADGIDPNELL HNS_ECOLI (P08936) IADGIDPNELLNS HNS_ECOLI (P08936) IKKAMDEQGKSLDD HNS_ECOLI (P08936) IKKAMDEQGKSLDDF HNS_ECOLI (P08936) IRTLRA HNS_ECOLI (P08936) LEKLEV HNS_ECOLI (P08936) LKILNN HNS_ECOLI (P08936) LNSLA HNS_ECOLI (P08936) SEALKILNN HNS_ECOLI (P08936) TKTWTGQGRTPAVIK HNS_ECOLI (P08936) VDENGETKTWTGQGRTPAV HNS_ECOLI (P08936) VDENGETKTWTGQGRTPAVI HNS_ECOLI (P08936) WTGQGRTPAV HNS_ECOLI (P08936) WTGQGRTPAVI IPYR_ECOLI (P17288) AKAEIVASFERAKN IPYR_ECOLI (P17288) EAAKAEIVASFER IPYR_ECOLI (P17288) IEIPANADPIKY IPYR_ECOLI (P17288) IEIPANADPIKYEIDKESGALFVDRF IPYR_ECOLI (P17288) SLLNVPAGKDLPED IPYR_ECOLI (P17288) SLLNVPAGKDLPEDI IPYR_ECOLI (P17288) SLLNVPAGKDLPEDIY IPYR_ECOLI (P17288) VIEIPANADPIKY IPYR_ECOLI (P17288) VLVPTPYPLQPGSV IPYR_ECOLI (P17288) VVIEIPANADPIKY IPYR_ECOLI (P17288) VVIEIPANADPIKYEID IPYR_ECOLI (P17288) VVIEIPANADPIKYEIDKESGAL IPYR_ECOLI (P17288) YDHIKDVNDLPEL IPYR_ECOLI (P17288) YGYINHTLSL KDUI_ECOLI (Q46938) ASIDTGTPAKF KDUI_ECOLI (Q46938) EIGHRDALYVGKGA KDUI_ECOLI (Q46938) SIHSGVGTKAYT KDUI_ECOLI (Q46938) VTPDEVSPVTLGDNLTS KDUI_ECOLI (Q46938) YEIGHRDAL MALE_ECOLI (P02928) AVINAASGRQTVDE MALE_ECOLI (P02928) KFPQVAATGDGPDIIFWAHDR MALE_ECOLI (P02928) PDKLEEKFPQVAATGDGPDII MALE_ECOLI (P02928) TVDEALKDAQTRI MALE_ECOLI (P02928) VIWINGDKGYNGL OSMY_ECOLI (P27291) AKAVDGVKSVKNDL OSMY_ECOLI (P27291) AVKVAKGVE OSMY_ECOLI (P27291) AVKVAKGVEGVTSVSDKL OSMY_ECOLI (P27291) DGVKSVKNDLKTK OSMY_ECOLI (P27291) DHDNIKSTD OSMY_ECOLI (P27291) DIVPSRHVKVETTDG OSMY_ECOLI (P27291) DTATTSEIKAKLLADDIVPS OSMY_ECOLI (P27291) ENNAQTTNESAG OSMY_ECOLI (P27291) ENNAQTTNESAGQKVDSSM OSMY_ECOLI (P27291) ENNAQTTNESAGQKVDSSMNKVGNF OSMY_ECOLI (P27291) FVESQAQAEE OSMY_ECOLI (P27291) FVESQAQAEEA OSMY_ECOLI (P27291) GVTSVSDKL OSMY_ECOLI (P27291) GVTSVSDKLHVRDAKEG OSMY_ECOLI (P27291) HVRDAKEGSVKGYAG OSMY_ECOLI (P27291) IAKAVDGVKSVKNDL OSMY_ECOLI (P27291) IAKAVDGVKSVKNDLKTK 
OSMY_ECOLI (P27291) ISVKTDQKVVT OSMY_ECOLI (P27291) ISVKTDQKVVTLSG OSMY_ECOLI (P27291) KVETTDGVVQLSGTVDSQA OSMY_ECOLI (P27291) LADDI OSMY_ECOLI (P27291) LADDIVPSRHVKVETTDG

OSMY_ECOLI (P27291) LSGFVESQAQAEEAVKVAKGVEGVTS OSMY_ECOLI (P27291) NKVGNF OSMY_ECOLI (P27291) PSRHVKVETTDG OSMY_ECOLI (P27291) PSRHVKVETTDGVVQLS OSMY_ECOLI (P27291) RDAKEGSVKGYA OSMY_ECOLI (P27291) SAGQKVDSSM OSMY_ECOLI (P27291) SGTVDSQAQSDRAES OSMY_ECOLI (P27291) SGTVDSQAQSDRAESIA OSMY_ECOLI (P27291) SQAQSDRAESIAKA OSMY_ECOLI (P27291) SVKTDQKVVT OSMY_ECOLI (P27291) SVSDKL OSMY_ECOLI (P27291) TDGVVQLSGTVDSQA OSMY_ECOLI (P27291) VDHDNIKSTD OSMY_ECOLI (P27291) VDHDNIKSTDIS OSMY_ECOLI (P27291) VKVAKGVEGVTSVSDKL OSMY_ECOLI (P27291) VQLSGTVDSQAQSDRAES OSMY_ECOLI (P27291) VVQLSGTVDSQAQSD OSMY_ECOLI (P27291) VVQLSGTVDSQAQSDRAES OSMY_ECOLI (P27291) VVQLSGTVDSQAQSDRAESIA PTGA_ECOLI (P08837) DGTIGKIFETNHAFSI PTGA_ECOLI (P08837) GETPVIRIKK PTGA_ECOLI (P08837) IEDVPDVVF PTGA_ECOLI (P08837) IEFDLPLLE PTGA_ECOLI (P08837) IIAPLSGE PTGA_ECOLI (P08837) IVNIEDVPDVVF PTGA_ECOLI (P08837) SGVELFVHFGIDTVELKG PTGA_ECOLI (P08837) TGTIEIIAPLSGEI PTHP_ECOLI (P07006) EAKGFTSEITVTSNGKSA PTHP_ECOLI (P07006) FKLQTLGL PTHP_ECOLI (P07006) FKLQTLGLTQGTVVT PTHP_ECOLI (P07006) FVKEAKGF PTHP_ECOLI (P07006) FVKEAKGFTSEITV PTHP_ECOLI (P07006) ISAEGEDEQKAVEHL PTHP_ECOLI (P07006) ITAPNGLHTRPAAQ PTHP_ECOLI (P07006) ITAPNGLHTRPAAQF PTHP_ECOLI (P07006) ITVTSNGKSASAKSL PTHP_ECOLI (P07006) ITVTSNGKSASAKSLF PTHP_ECOLI (P07006) KGFTSEITVTSNGK PTHP_ECOLI (P07006) KLQTLGLTQGTVVT PTHP_ECOLI (P07006) MFQQEVTITAPNG PTHP_ECOLI (P07006) QEVTITAPNGLHT PTHP_ECOLI (P07006) SLFKLQTLGLTQGTVVTISAEGEDEQKAVEHLV PTHP_ECOLI (P07006) TVTSNGKSASAKSL PTHP_ECOLI (P07006) VTITAPNGL PTHP_ECOLI (P07006) VTITAPNGLHTR PTHP_ECOLI (P07006) VTITAPNGLHTRPAAQ PTHP_ECOLI (P07006) VTITAPNGLHTRPAAQF PTHP_ECOLI (P07006) VTSNGKSASAKSL RBSB_ECOLI (P02925) AAHKFNVL RBSB_ECOLI (P02925) AAHKFNVLASQPADFDRIKGLNVM RBSB_ECOLI (P02925) AATIAQLPDQIGAKGVE RBSB_ECOLI (P02925) AATIAQLPDQIGAKGVETA RBSB_ECOLI (P02925) AKKAGEGAKV RBSB_ECOLI (P02925) ALSATVSANAMAKDTIALVVSTLNNPFFVSLKD RBSB_ECOLI (P02925) ANIPVITL RBSB_ECOLI (P02925) ARERGEGFQQA RBSB_ECOLI (P02925) 
ASQPADF RBSB_ECOLI (P02925) AVAAHKFNVL RBSB_ECOLI (P02925) AVALSATVSANAMAKDTIALV RBSB_ECOLI (P02925) DGTPDGEKAVNDGKL RBSB_ECOLI (P02925) DGTPDGEKAVNDGKLAAT RBSB_ECOLI (P02925) DKVLKGEKVQAKYPVD RBSB_ECOLI (P02925) DRIKGLNV

RBSB_ECOLI (P02925) DRQATKGEVVSHIASD RBSB_ECOLI (P02925) DRQATKGEVVSHIASDN RBSB_ECOLI (P02925) DRQATKGEVVSHIASDNVL RBSB_ECOLI (P02925) DRQATKGEVVSHIASDNVLGGKIAGD RBSB_ECOLI (P02925) DSQNNPAKE RBSB_ECOLI (P02925) DSQNNPAKELANVQD RBSB_ECOLI (P02925) DSQNNPAKELANVQDL RBSB_ECOLI (P02925) FDRIKGLNV RBSB_ECOLI (P02925) FQQAVAAHKFNVL RBSB_ECOLI (P02925) GAKGVETADKVLKGEKVQAKYPVD RBSB_ECOLI (P02925) GAKGVETADKVLKGEKVQAKYPVDL RBSB_ECOLI (P02925) GGKIAGDYIAKKAG RBSB_ECOLI (P02925) GIAGTSAARERGEGF RBSB_ECOLI (P02925) GIAGTSAARERGEGFQQAVAA RBSB_ECOLI (P02925) HIASDNVLGGKIAGD RBSB_ECOLI (P02925) IAKKAGEGAKVIE RBSB_ECOLI (P02925) IAQLPDQIGAKG RBSB_ECOLI (P02925) IAQLPDQIGAKGVE RBSB_ECOLI (P02925) IAQLPDQIGAKGVET RBSB_ECOLI (P02925) IAQLPDQIGAKGVETA RBSB_ECOLI (P02925) IAQLPDQIGAKGVETAD RBSB_ECOLI (P02925) IAQLPDQIGAKGVETADKVL RBSB_ECOLI (P02925) IAQLPDQIGAKGVETADKVLKGEKVQ RBSB_ECOLI (P02925) IAQLPDQIGAKGVETADKVLKGEKVQAKYPVD RBSB_ECOLI (P02925) IASDNVLGGKIAGD RBSB_ECOLI (P02925) INPTDSDAVGNA RBSB_ECOLI (P02925) INPTDSDAVGNAVKMANQANIPVITL RBSB_ECOLI (P02925) KDGAQKEADKL RBSB_ECOLI (P02925) KDGAQKEADKLGYNL RBSB_ECOLI (P02925) KDTIAL RBSB_ECOLI (P02925) KGEVVSHIASDNVLGG RBSB_ECOLI (P02925) KVIELQGIAGTSAARERG RBSB_ECOLI (P02925) LANVQD RBSB_ECOLI (P02925) LANVQDL RBSB_ECOLI (P02925) LANVQDLTVRGT RBSB_ECOLI (P02925) LGGKIAGDY RBSB_ECOLI (P02925) LGYNLVVLDSQNNPAKELANVQD RBSB_ECOLI (P02925) LINPTDSDAVGNA RBSB_ECOLI (P02925) LKDGAQKEADK RBSB_ECOLI (P02925) LLINPTDSDAVGNA RBSB_ECOLI (P02925) LLTAHPDVQAVFAQNDEMA RBSB_ECOLI (P02925) LNNPFFVSLKDGAQ RBSB_ECOLI (P02925) LQGIAGTSA RBSB_ECOLI (P02925) LQGIAGTSAARERGEG RBSB_ECOLI (P02925) LQTAGKSDVM RBSB_ECOLI (P02925) LRALQTAGKSDV RBSB_ECOLI (P02925) LRALQTAGKSDVM RBSB_ECOLI (P02925) LTAHPDVQ RBSB_ECOLI (P02925) LTVRGTKIL RBSB_ECOLI (P02925) LTVRGTKILL RBSB_ECOLI (P02925) LVVSTL RBSB_ECOLI (P02925) LVVSTLNNPFF RBSB_ECOLI (P02925) MQNLLTAHPDVQ RBSB_ECOLI (P02925) MVVGFDGTPDGEKAVNDGKL RBSB_ECOLI (P02925) MVVGFDGTPDGEKAVNDGKLAAT RBSB_ECOLI (P02925) 
NLLTAHPDVQ RBSB_ECOLI (P02925) NPFFVSLKDGAQKE RBSB_ECOLI (P02925) NVLGGKIAGD RBSB_ECOLI (P02925) NVLGGKIAGDY RBSB_ECOLI (P02925) PDGEKAVNDGKL RBSB_ECOLI (P02925) PDGEKAVNDGKLAAT RBSB_ECOLI (P02925) QGIAGTSA RBSB_ECOLI (P02925) QGIAGTSAARERGEG

RBSB_ECOLI (P02925) QGIAGTSAARERGEGFQQA RBSB_ECOLI (P02925) QGIAGTSAARERGEGFQQAVA RBSB_ECOLI (P02925) QGIAGTSAARERGEGFQQAVAA RBSB_ECOLI (P02925) QLPDQIGAKGVETADKVLKGEKVQAKYPVD RBSB_ECOLI (P02925) QQAVAAHKFNVL RBSB_ECOLI (P02925) QTAGKSDVMVVGFDGTPDGEKA RBSB_ECOLI (P02925) RALQTAGKSDV RBSB_ECOLI (P02925) RALQTAGKSDVM RBSB_ECOLI (P02925) RALQTAGKSDVMV RBSB_ECOLI (P02925) RGEGFQQAVAA RBSB_ECOLI (P02925) SLKDGAQKEADKL RBSB_ECOLI (P02925) SLKDGAQKEADKLGYN RBSB_ECOLI (P02925) SLKDGAQKEADKLGYNL RBSB_ECOLI (P02925) STLNNPFF RBSB_ECOLI (P02925) TADKVLKGEKVQAKYPVD RBSB_ECOLI (P02925) TIAQLPDQIGAKGVETA RBSB_ECOLI (P02925) TVRGTKIL RBSB_ECOLI (P02925) TVRGTKILL RBSB_ECOLI (P02925) VAAHKFNVL RBSB_ECOLI (P02925) VAAHKFNVLA RBSB_ECOLI (P02925) VETADKVLKGEKVQAKYPVD RBSB_ECOLI (P02925) VGFDGTPDGEKAVNDGKL RBSB_ECOLI (P02925) VGFDGTPDGEKAVNDGKLAAT RBSB_ECOLI (P02925) VKMANQANIPVIT RBSB_ECOLI (P02925) VKMANQANIPVITL RBSB_ECOLI (P02925) VSLKDGAQKEADKL RBSB_ECOLI (P02925) VSLKDGAQKEADKLGYN RBSB_ECOLI (P02925) VSLKDGAQKEADKLGYNL RBSB_ECOLI (P02925) VVGFDGTPDGEKAVNDGKL RBSB_ECOLI (P02925) VVGFDGTPDGEKAVNDGKLAA RBSB_ECOLI (P02925) VVGFDGTPDGEKAVNDGKLAAT RBSB_ECOLI (P02925) VVLDSQNNPAKE RBSB_ECOLI (P02925) VVLDSQNNPAKELANVQD RBSB_ECOLI (P02925) VVSTLNNPFF RBSB_ECOLI (P02925) YIAKKAGEGAK RBSB_ECOLI (P02925) YIAKKAGEGAKV RBSB_ECOLI (P02925) YIAKKAGEGAKVIE RBSB_ECOLI (P02925) YNLVVLDSQNNPAKELAN RL13_ECOLI (P02410) AVKGMLPKGPLGRAMF RL13_ECOLI (P02410) IAVKGMLPKGPLGRAM RL13_ECOLI (P02410) IAVKGMLPKGPLGRAMF RL13_ECOLI (P02410) MKTFTAKPETVK RL13_ECOLI (P02410) RDWYVVDATG RL13_ECOLI (P02410) TAKPETVKRDW RL13_ECOLI (P02410) VVDATGKTLGRL RL13_ECOLI (P02410) YVVDATGKTLGRL RL16_ECOLI (P02414) AKLPIKTTF RL16_ECOLI (P02414) GRNRGLAQGTDVSF RL16_ECOLI (P02414) IRVFPDKPITE RL16_ECOLI (P02414) IRVFPDKPITEKPLA RL16_ECOLI (P02414) LIQPGKVL RL16_ECOLI (P02414) YEMDGVPEEL RL2_ECOLI (P02387) AAIKPGNTLPMRNIPVGSTVHN RL2_ECOLI (P02387) AAIKPGNTLPMRNIPVGSTVHNVE RL2_ECOLI (P02387) AAIKPGNTLPMRNIPVGSTVHNVEM RL2_ECOLI 
(P02387) AAIKPGNTLPMRNIPVGSTVHNVEMKPGKGGQL RL2_ECOLI (P02387) DCRATLGEVGNAEHM RL2_ECOLI (P02387) ERLEYDPNRSAN RL2_ECOLI (P02387) GGGHKQAYRIVDFKRNKDGIPA RL2_ECOLI (P02387) GQLARSAGTYVQIVA RL2_ECOLI (P02387) KAGAARWRGVRP RL2_ECOLI (P02387) KDGIPAVVERLEYDPNR RL2_ECOLI (P02387) KGGQLARSAGTYV

RL2_ECOLI (P02387) NIPVGSTVHN RL2_ECOLI (P02387) PMRNIPVGSTVHN RL2_ECOLI (P02387) PMRNIPVGSTVHNVE RL2_ECOLI (P02387) RATLGEVGNAEHM RL2_ECOLI (P02387) RVLGKAGAARWRGV RL2_ECOLI (P02387) TTRHIGGGHKQAYRIVDFK RL2_ECOLI (P02387) VQIVARDGAY RL2_ECOLI (P02387) VRGTAMNPVDHPHGGGEGR RL24_ECOLI (P02425) AATGKADRVGFRFEDG RL24_ECOLI (P02425) AIFNAATGKADRVGF RL24_ECOLI (P02425) EAAIQVSNVAIFNAAT RL24_ECOLI (P02425) FKSNSETIK RL24_ECOLI (P02425) FNAATGKADRVGFRFEDG RL24_ECOLI (P02425) NAATGKADRVGF RL24_ECOLI (P02425) RFFKSNSETI RL29_ECOLI (P02429) AASGQLQQSHLLK RL29_ECOLI (P02429) GQLQQSHLLKQ RL29_ECOLI (P02429) LREKSVEEL RL29_ECOLI (P02429) QAASGQLQQSHL RL29_ECOLI (P02429) RMQAASGQLQQSHL RL3_ECOLI (P02386) ADVKKVDVTGTSKGKGF RL3_ECOLI (P02386) AGQMGNERVTVQSL RL3_ECOLI (P02386) DVTGTSKGKGFAGTVKRWNFRTQDA RL3_ECOLI (P02386) ELFADVKKVDVT RL3_ECOLI (P02386) FADVKKVDVTGTSKGKGF RL3_ECOLI (P02386) GLWEFRLAEGEEF RL3_ECOLI (P02386) GYRAIQVTTGAKKANRVTKPEAGHFA RL3_ECOLI (P02386) KKMAGQMGNERVTVQSL RL3_ECOLI (P02386) LVKGAVPGATGSDL RL3_ECOLI (P02386) NSLSHRVPG RL3_ECOLI (P02386) NSLSHRVPGSI RL3_ECOLI (P02386) RIFTEDGVSIPVTVIEVEANRVTQV RL3_ECOLI (P02386) RTQDATHGNSLSHRVPGSIGQNQTP RL3_ECOLI (P02386) TGAKKANRVTKPEAG RL3_ECOLI (P02386) TGTSKGKGFAGTVKRWNFRT RL3_ECOLI (P02386) TVGQSISVELFADV RL6_ECOLI (P02390) GPRDGY RL6_ECOLI (P02390) KGADKQVIGQVA RL6_ECOLI (P02390) QAGTARALLNSMVIGVTEG RL6_ECOLI (P02390) QVIGQVAADLRAY RL6_ECOLI (P02390) SLGFSHPVDHQLPAGIT RL6_ECOLI (P02390) SLGFSHPVDHQLPAGITAECPTQTE RL6_ECOLI (P02390) SRVAKAPVVVPAGVD RL7_ECOLI (P02392) AAAVAVAAGPVEAAEEKT RL7_ECOLI (P02392) AVAAGPVEAAEEKTEFDVI RL7_ECOLI (P02392) FGVSAAAAVAVAAGPVEAAE RL7_ECOLI (P02392) IKAVRGATGLGL RL7_ECOLI (P02392) KAAGANKVAV RL7_ECOLI (P02392) KAAGANKVAVIKA RL7_ECOLI (P02392) LISAMEEKFGVSAAAAVAVAAGPVEAAEE RL7_ECOLI (P02392) LVESAPAAL RL7_ECOLI (P02392) VAVAAGPVE RL7_ECOLI (P02392) VAVAAGPVEA RS15_ECOLI (P02371) AQINHLQGHF RS15_ECOLI (P02371) FGRDANDTGSTEVQ RS15_ECOLI (P02371) GRDANDTGSTEVQVA RS15_ECOLI (P02371) SLSTEA 
RS15_ECOLI (P02371) SLSTEATAKIVSE RS15_ECOLI (P02371) TAQINHLQGHF RS6_ECOLI (P02358) AHYVLMNVEAPQEVID RS6_ECOLI (P02358) FRFNDAVIRSMVMRT RS6_ECOLI (P02358) ITGAEGKIHRLED RS6_ECOLI (P02358) LMNVEAPQEVI

RS6_ECOLI (P02358) MVHPDQSEQVPGM RS6_ECOLI (P02358) QVPGMIERYTAAIT SODC_ECOLI (P53635) EGPEGAGHLGDLPALVVN SODC_ECOLI (P53635) IGSVTITETDKGLEFS SODC_ECOLI (P53635) IHAKGSCQPATKDGKASAAE SODC_ECOLI (P53635) ITETDKGLEFSPDL SODC_ECOLI (P53635) LVTSQGVGQSIGSVT SODC_ECOLI (P53635) MSDQPKPLGGGGERYACG SODC_ECOLI (P53635) MSDQPKPLGGGGERYACGV SODC_ECOLI (P53635) NLVTSQGVGQSIGS SODC_ECOLI (P53635) NLVTSQGVGQSIGSVT SODC_ECOLI (P53635) QGVGQSIGSVTITETDK SODC_ECOLI (P53635) TGAQAASEKVEMNLVT SODC_ECOLI (P53635) TKDGKASAAESAGGHL SODC_ECOLI (P53635) VATGAQAASEKVE SODC_ECOLI (P53635) VATGAQAASEKVEMNLVT SODC_ECOLI (P53635) VHVGGDNMSDQPKPLGGGGERYACG SODC_ECOLI (P53635) VTSQGVGQSIGSVT SODC_ECOLI (P53635) VTSQGVGQSIGSVTITET SODC_ECOLI (P53635) VVATGAQAASEK STFR_ECOLI (P76072) AAAQSKSTAESAATRAETAAKRAEDIASAVALEDA STFR_ECOLI (P76072) AAARSASAAKTSETNAKASET STFR_ECOLI (P76072) AAASSASSAASSASSASASKDEA STFR_ECOLI (P76072) AAKSSETNASSSASSAASSAT STFR_ECOLI (P76072) AARSASAAKTSETN STFR_ECOLI (P76072) AASTSAGQASASATAAG STFR_ECOLI (P76072) AATSAGAAKTSETNASASLQSAATSASTATTK STFR_ECOLI (P76072) ADKRGMRYVRVNAP STFR_ECOLI (P76072) AGAAKTSETNASASLQSA STFR_ECOLI (P76072) AGSATAAAQSKSTAESAATRA STFR_ECOLI (P76072) APGADLVVNDTTY STFR_ECOLI (P76072) AQNTAAAKKSASDA STFR_ECOLI (P76072) ARNASAVAQNTAAAKKSASD STFR_ECOLI (P76072) ARSSETAAGQSASAAAGSKT STFR_ECOLI (P76072) ASAGAHAHTVGIGAH STFR_ECOLI (P76072) ASSAASSASSASASKDEATRQASAAKSSATTASTKATEA STFR_ECOLI (P76072) ASSASSAASSASSASASKDEATR STFR_ECOLI (P76072) ASSASSASASKDE STFR_ECOLI (P76072) ASSSAGTASTKATEASKSAAAA STFR_ECOLI (P76072) ASSTDLGTKTTSSFDY STFR_ECOLI (P76072) ATAAGNSAKAAK STFR_ECOLI (P76072) ATSASTATTKASEAATS STFR_ECOLI (P76072) AVAQNTAAAK STFR_ECOLI (P76072) EEVARNASAVAQNTAAAK STFR_ECOLI (P76072) ETAAKRAEDIASAVA STFR_ECOLI (P76072) GATNPATECIAADV STFR_ECOLI (P76072) GATSGKYYPVV STFR_ECOLI (P76072) GHTITVNAAGNAENTVK STFR_ECOLI (P76072) HSASASSTDLGTKTTSSFDYGTKSTNNTGAHTHSVS STFR_ECOLI (P76072) KSAAAAESSKSAAATSAG STFR_ECOLI (P76072) KSSATTASTKATEA STFR_ECOLI 
(P76072) KSSATTASTKATEAAGS STFR_ECOLI (P76072) LLTNQGDVYGGWNTLR STFR_ECOLI (P76072) LQSAATSASTATTKA STFR_ECOLI (P76072) NAAGNAENTVKN STFR_ECOLI (P76072) NAPAGATSGKYYP STFR_ECOLI (P76072) NTAAAKKSASDAST STFR_ECOLI (P76072) NVNTASANSGAGSASTRLSVVHNQNYATSSAGAHTHS STFR_ECOLI (P76072) PAGATSGKYYPVVVM STFR_ECOLI (P76072) PESYPVGAPIPWPSDTVPSGYALMQGQAFDKSAY STFR_ECOLI (P76072) PSHAGTITVYEDSQPGT STFR_ECOLI (P76072) PVFAFIEDGLSI STFR_ECOLI (P76072) QASASATAAGKSAESAA

STFR_ECOLI (P76072) QASASATAAGKSAESAASS STFR_ECOLI (P76072) QSASAAAGSKTAAASSA STFR_ECOLI (P76072) RAASTSAGQAASSAQSASSS STFR_ECOLI (P76072) RSASAAKTSETNAKA STFR_ECOLI (P76072) SAGAAKTSET STFR_ECOLI (P76072) SANSGAGSASTRLS STFR_ECOLI (P76072) SASLNGNALTAT STFR_ECOLI (P76072) SKSAAATSAGAAKTS STFR_ECOLI (P76072) SKTAAASSASSAASSA STFR_ECOLI (P76072) SLIVNDNLSCKKL STFR_ECOLI (P76072) SSASASKDEATRQASAAKSS STFR_ECOLI (P76072) SSASSASASKDEA STFR_ECOLI (P76072) SSETAAGQSASAA STFR_ECOLI (P76072) STKATEAAGSATAAAQ STFR_ECOLI (P76072) THSASASSTDLGTKTTSSFDYGTKSTNNTGAHTHSV STFR_ECOLI (P76072) THSLANVNTASAN STFR_ECOLI (P76072) TSKNLPPESYPVGAP STFR_ECOLI (P76072) TTKAGEATEQASAAARS STFR_ECOLI (P76072) TTKASEAATSARDAA STFR_ECOLI (P76072) TVGIGAHTHSVA TNAA_ECOLI (P00913) AGGQPVSLANLKAMYSIAKKYDIPVVMDSARFAEN TNAA_ECOLI (P00913) DFKGNFDLEGLERGIEE TNAA_ECOLI (P00913) EDVFIDLLTDSGTGAVTQS TNAA_ECOLI (P00913) ERLAVGLYDGMNL TNAA_ECOLI (P00913) FDLEGLERGIEEVGPNNV TNAA_ECOLI (P00913) GLEGGAMERLA TNAA_ECOLI (P00913) GLERGIEEVGPNNV TNAA_ECOLI (P00913) IIKSGMNPFL TNAA_ECOLI (P00913) LAMSAKKDAMVPMGGL TNAA_ECOLI (P00913) LEGLERGIEEVGPNNV TNAA_ECOLI (P00913) LRLTIPRATY TNAA_ECOLI (P00913) LTDSGTGAVTQSMQ TNAA_ECOLI (P00913) TITSNSAGGQPVSL TNAA_ECOLI (P00913) VATITSNSAGGQPVSLANL TNAA_ECOLI (P00913) VDAGKLLPHIPADQF TNAA_ECOLI (P00913) VDAGKLLPHIPADQFPAQAL TNAA_ECOLI (P00913) VSLANLKAMYSIAKKY TTDT_ECOLI (P39414) AAALAMPEIP TTDT_ECOLI (P39414) ELILAPVTPSNSARG TTDT_ECOLI (P39414) EQLAQPGFKFTAK TTDT_ECOLI (P39414) FCLMVGAAIGLGSI TTDT_ECOLI (P39414) GDQVPRWAETELQAMGP TTDT_ECOLI (P39414) GFKFTAKSLSWAVSG TTDT_ECOLI (P39414) MVGAAIGLGSILTPYA TTDT_ECOLI (P39414) PEIPLPVFCLMVG TTDT_ECOLI (P39414) SLSWAVSGF TTDT_ECOLI (P39414) TGYEKTGLGR YBGS_ECOLI (P75758) AADAGQVAPDARE YBGS_ECOLI (P75758) AALAADSGAQTNNGQANAAADAGQVAPDARENVA YBGS_ECOLI (P75758) ADSGAQTNNGQANAAADAGQVAPDARE YBGS_ECOLI (P75758) ALAADSGAQTNNGQANAAADAGQ YBGS_ECOLI (P75758) KVQTGDGINNDVDTKTDGTTQ YBGS_ECOLI (P75758) NVAPNNVDNNGVNTGSGGTM YBGS_ECOLI 
(P75758) NVAPNNVDNNGVNTGSGGTML YBGS_ECOLI (P75758) NVAPNNVDNNGVNTGSGGTMLH YBGS_ECOLI (P75758) PDARENVAPNNVDNNGVNTG YBGS_ECOLI (P75758) SGAQTNNGQANAAADAGQVAPDAR YBGS_ECOLI (P75758) SGAQTNNGQANAAADAGQVAPDARE YBGS_ECOLI (P75758) VDNNGVNTGSGGT YCII_ECOLI (P31070) LTAGPMPAVDSNDPGAAGF YCII_ECOLI (P31070) LTAGPMPAVDSNDPGAAGFTGSTV YGIW_ECOLI (P52083) AAVIAVMALCSAPVMA YGIW_ECOLI (P52083) AAVIAVMALCSAPVMAAEQG

YGIW_ECOLI (P52083) AEQGGFSGPSATQSQAGGFQGPNGSVT YGIW_ECOLI (P52083) ALCSAPVMAAEQG YGIW_ECOLI (P52083) CSAPVMAAEQGGFS YGIW_ECOLI (P52083) ESAKSLRDDTW YGIW_ECOLI (P52083) IQGEVDKDWNSVE YGIW_ECOLI (P52083) KDASGTINVDIDHKRWNGVTVTPKDTVEIQG YGIW_ECOLI (P52083) MAAEQGGFSGPSAT YGIW_ECOLI (P52083) RISDDL YGIW_ECOLI (P52083) TVESAKSLRDDTW YGIW_ECOLI (P52083) VDIDHKRWNGVTVTPKDTVE YJBJ_ECOLI (P32691) DWETRN YJBJ_ECOLI (P32691) GKIQERYGYQKDQAEKE YJBJ_ECOLI (P32691) IEGKRDQ YJBJ_ECOLI (P32691) IEGKRDQL YJBJ_ECOLI (P32691) IEGKRDQLVG YJBJ_ECOLI (P32691) IIEGKRDQLVG YJBJ_ECOLI (P32691) KLTDDDMTIIEGK YJBJ_ECOLI (P32691) MTIIEGKRDQL YJBJ_ECOLI (P32691) RYGYQKDQAEKE YJBJ_ECOLI (P32691) TIIEGKRDQL YJBJ_ECOLI (P32691) TIIEGKRDQLVG YJBJ_ECOLI (P32691) VGKIQE YJBJ_ECOLI (P32691) VGKIQERYGYQKDQAEKE YJBJ_ECOLI (P32691) VVDWETRN YJBJ_ECOLI (P32691) VVDWETRNE YJBJ_ECOLI (P32691) VVDWETRNEYRW YJGF_ECOLI (P39330) AAGLKVGDIVKT YJGF_ECOLI (P39330) ENAPAAIGPYVQGVDLG YJGF_ECOLI (P39330) FTEHNATFPARSCVE YJGF_ECOLI (P39330) IATENAPAAIGPYVQG YJGF_ECOLI (P39330) IITSGQIPVNPKTGEVPAD YJGF_ECOLI (P39330) IITSGQIPVNPKTGEVPADVAA YJGF_ECOLI (P39330) IITSGQIPVNPKTGEVPADVAAQARQSL YJGF_ECOLI (P39330) IITSGQIPVNPKTGEVPADVAAQARQSLDNVKA YJGF_ECOLI (P39330) QIPVNPKTGEVPADVAAQ YJGF_ECOLI (P39330) SKTIATENAPAAIGPYVQG YJGF_ECOLI (P39330) SKTIATENAPAAIGPYVQGVD YJGF_ECOLI (P39330) SKTIATENAPAAIGPYVQGVDLGNMIITSGQIPVNPKTGE YJGF_ECOLI (P39330) VKDLNDFATVN YJGF_ECOLI (P39330) YVQGVDLGNMIITSGQIPVNP YJGK_ECOLI (P39335) EPYEARRAEYHA YJGK_ECOLI (P39335) IAFLPEGVDEKT YJGK_ECOLI (P39335) IIGNIHNLQPWLPQE YJGK_ECOLI (P39335) ILNEGDFVVFY YJGK_ECOLI (P39335) KGKHDIEGNR YJGK_ECOLI (P39335) MIIGNIHNLQPWLPQE YJGK_ECOLI (P39335) MIIGNIHNLQPWLPQEL YJGK_ECOLI (P39335) QPWLPQELR YJGK_ECOLI (P39335) WLADKDIAFLPEGV

The left column is the protein identified, named according to its SwissProt entry name. The SwissProt accession number is listed in parentheses.
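
For readers who want to work with this table programmatically, the entry name, accession number, and peptide layout can be parsed with a regular expression. The following Python sketch is illustrative only and was not part of the original analysis; the function name `parse_peptide_table` and the pattern itself are assumptions based on the layout described above.

```python
import re
from collections import defaultdict

# Assumed row layout: ENTRY_NAME (ACCESSION) PEPTIDE, separated by whitespace.
# The pattern is an illustration, not a validated SwissProt identifier grammar.
ROW = re.compile(r"\b([A-Z0-9]+_[A-Z]+)\s+\(([A-Z0-9]{6})\)\s+([A-Z]+)")

def parse_peptide_table(text):
    """Group peptide sequences by (entry name, accession number)."""
    peptides = defaultdict(list)
    for name, accession, peptide in ROW.findall(text):
        peptides[(name, accession)].append(peptide)
    return dict(peptides)

# Three rows copied from the table above.
sample = (
    "HNS_ECOLI (P08936) EKLEV HNS_ECOLI (P08936) EVVVNE "
    "RL2_ECOLI (P02387) NIPVGSTVHN"
)
table = parse_peptide_table(sample)
for (name, accession), seqs in table.items():
    print(name, accession, seqs)
```

Because the pattern tolerates any whitespace between fields, rows that wrap across lines are still matched.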

CHAPTER 4

PERSPECTIVES AND FUTURE DIRECTIONS

4.1 Discussion and Conclusions

4.1.1 Literature search

The analysis of the data gathered from the literature search indicated that, although pepsin is not as specific as other proteases, it does maintain limited cleavage preferences. At pH 2.5, pepsin cleaves preferentially after most bulky, hydrophobic amino acids, such as leucine and phenylalanine. Additionally, the residues that most often occur immediately following the cleaved peptide bond are tryptophan and tyrosine. The data also showed that pepsin rarely cleaves at proline and histidine.
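
As an illustration of how P1 and P1' preferences of this kind could be tallied, the sketch below counts the residue pairs flanking each cleaved bond and normalizes each count by how often that pair occurs in the protein sequence. It is a minimal sketch, not the procedure used in this work: the function names are hypothetical, the sequence and peptides are invented toy values rather than database entries, and each peptide is assumed to map to its first occurrence in the sequence.

```python
from collections import Counter

def cleavage_sites(protein, peptides):
    """Return bond indices implied by the peptide termini.

    Index i denotes the bond between protein[i-1] (P1) and protein[i] (P1').
    Assumption: each peptide is placed at its first occurrence in the protein.
    """
    sites = set()
    for pep in peptides:
        start = protein.find(pep)
        if start == -1:
            continue  # peptide not found in this sequence; skip it
        end = start + len(pep)
        if start > 0:
            sites.add(start)  # bond at the peptide N-terminus
        if end < len(protein):
            sites.add(end)    # bond at the peptide C-terminus
    return sites

def normalized_pair_frequency(protein, peptides):
    """Cleaved P1-P1' pairs divided by occurrences of that pair in the protein."""
    sites = cleavage_sites(protein, peptides)
    observed = Counter(protein[i - 1:i + 1] for i in sites)
    available = Counter(protein[i - 1:i + 1] for i in range(1, len(protein)))
    return {pair: observed[pair] / available[pair] for pair in observed}

# Toy input (hypothetical, not taken from the thesis data).
toy_protein = "MKLFAALWGLFY"
toy_peptides = ["MKL", "FAAL", "WGLF"]
freq = normalized_pair_frequency(toy_protein, toy_peptides)
print(freq)
```

In the toy sequence the pair LF occurs twice but is cleaved only once, so its normalized frequency is 0.5, whereas LW and FY each occur once and are cleaved once.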

4.1.2 Experimental research

The results of the corresponding experimental research support the trends established in the literature search. The data collected for the digests of the proteins and E. coli samples at pH 2.5 show that pepsin cleaves most often after leucine and phenylalanine and least often after arginine, cysteine, proline and histidine.

Digestion of the samples at pH 1.0 and pH 4.0 yielded similar results. There were no distinct differences in pepsin specificity among the three pH points. Because all three pH points yielded similar specificity, these results indicate that the activity of pepsin is not directly related to its specificity.

4.1.3 Research on pepsin specificity

The research presented in this thesis is not the first investigation of pepsin specificity. A study by Zhang (Zhang 1995) examined pepsin specificity by analyzing the cleavages produced from digesting three standard proteins. A more recent study of pepsin specificity was performed by Hamuro et al. (Hamuro, Coales et al. 2008). That study is more comprehensive than Zhang's, as it covers cleavages produced from the digestion of 39 proteins.

The work presented by Hamuro et al. is comparable to the work performed for this thesis. The results were similar, in that pepsin cleaves preferentially after most bulky hydrophobic residues. However, the two studies differ in some respects. Hamuro et al. considered the digestion of proteins only at a pH around 2.5. They also did not strictly control the pH of the sample before digestion: the pH of the digested samples ranged from 1.3 to 2.5.

The research for this thesis placed strict limits on the pH of both the samples and their digestion, which gives a better understanding of pepsin specificity at a given pH. This work also examined the specificity of pepsin at other pH points, thus expanding current knowledge beyond the specificity known at pH 2.5.

4.2 Future Directions

The research presented here is a good start toward determining where pepsin prefers to cleave. However, more work remains. To gain a more solid perspective on the specificity of pepsin, much more data need to be acquired. This research yielded a total of 1473 peptides generated from 58 proteins at pH 2.5. While this appears to be a lot of data, it is not yet enough for this type of study.

Hundreds of proteins and tens of thousands of cleavages should be gathered in order to verify the trends established in this work. Gathering this much data would, however, be difficult. One possible approach is to optimize the digestion and chromatography conditions for E. coli whole cell lysate. Only 49 proteins were identified during the E. coli digestion in this research. If the conditions were optimized, the digestion of E. coli could yield hundreds of identified proteins, producing the data set that is needed.

Another aspect of pepsin specificity that still needs to be investigated is how the residues in positions P4 through P2 and P2’ through P4’ affect where pepsin will cleave. This research focused only on the residues in the P1 and P1’ positions, as they have the greatest effect on pepsin specificity. However, these studies produced a cleavage database that extends out to the P4 and P4’ residues for the proteins analyzed. This database will allow further examination of which residues affect the specificity of pepsin.
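
To make the P4 through P4' notation concrete, the following sketch extracts the residues surrounding a cleaved bond from a protein sequence. It is offered only as an illustration of how such a database record might be assembled: the function name `subsite_window`, the use of "-" as padding beyond the protein termini, and the toy sequence are all assumptions, and the actual structure of the cleavage database is not reproduced here.

```python
def subsite_window(protein, site, flank=4):
    """Map subsite labels (P4..P1, P1'..P4') to residues around a cleaved bond.

    `site` is the index of the P1' residue, so the cleaved bond lies between
    protein[site - 1] (P1) and protein[site] (P1'). Positions that fall
    outside the sequence are padded with "-" (an assumed convention).
    """
    window = {}
    for k in range(1, flank + 1):
        left = site - k        # index of the Pk residue
        right = site + k - 1   # index of the Pk' residue
        window[f"P{k}"] = protein[left] if left >= 0 else "-"
        window[f"P{k}'"] = protein[right] if right < len(protein) else "-"
    return window

# Toy input (hypothetical sequence, cleaved between L at index 2 and F at index 3).
toy_window = subsite_window("MKLFAALWGLFY", 3)
ordered = ["P4", "P3", "P2", "P1", "P1'", "P2'", "P3'", "P4'"]
print(" ".join(f"{label}={toy_window[label]}" for label in ordered))
```

Collecting one such window per observed cleavage would allow residue frequencies to be tabulated at each subsite in the same way as for P1 and P1'.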

4.3 References

Hamuro, Y., S. J. Coales, et al. (2008). "Specificity of immobilized porcine pepsin in H/D exchange compatible conditions." Rapid Commun Mass Spectrom 22: 1041-6.

Zhang, Z. (1995). Protein Hydrogen Exchange Determined by Mass Spectrometry: A New Tool for Probing Protein High-order Structure and Structural Changes. Ph.D., Purdue.