
Using a Combinatorial Peptide Ligand Library to Reduce the Dynamic Range of Concentrations in Complicated Biological Samples

THESIS

Presented in Partial Fulfillment of the Requirements for the Degree Master of Science in the Graduate School of The Ohio State University

By

Ran An

Graduate Program in Chemistry

The Ohio State University

2014

Master's Examination Committee:

Vicki Wysocki, Advisor

Dehua Pei

Copyright by

Ran An

2014

Abstract

To improve the detection of low-abundance proteins in biological fluids whose proteins occur over a wide dynamic range, a combinatorial peptide ligand library (CPLL) approach combined with two-dimensional LC-MS/MS was used. To model biomarker discovery samples, Aspergillus fumigatus proteins were spiked into patient bronchoalveolar lavage fluid (BALF). The protein profile of BALF treated with the peptide ligand library was compared with that of non-treated BALF. The limit of detection of Aspergillus fumigatus and recombinant catalase protein spiked separately into BALF was evaluated. Orbitrap and Velos Pro mass spectrometric analysis showed that a BALF/bead slurry volume ratio of 20:1 (molar ratio of BALF proteins to bead capacity of 7:1) was the optimal ratio for normalizing BALF protein concentrations. In general, CPLL-treated BALF led to detection of twofold more total proteins and twofold more peptides than non-treated BALF under the same LC-MS conditions. Of the total proteins detected in the CPLL and non-CPLL samples, 60% were unique to CPLL-treated samples, while 15% were unique to the non-CPLL sample. Preliminary data showed that the lowest spike concentrations of Aspergillus fumigatus and catalase detected in 12 μM BALF were 0.2 μM and 0.02 μM, respectively. Preliminary data also showed that CPLL is potentially more selective towards specific proteins while removing others.


Dedication

This dissertation is dedicated to my family.


Acknowledgments

First and foremost, I would like to thank my academic advisor, Professor Vicki Wysocki, for accepting me into her group and making me appreciate the power of mass spectrometry. Without her constant contribution and practical advice, this thesis would not have been possible.

A special mention goes to Professor Dehua Pei for offering insightful and detailed discussion about elution conditions for previous experiment design.

It is a great pleasure to thank everyone in the Wysocki group with whom I had the honor to work. They provided a friendly and cooperative atmosphere at work. I owe sincere and earnest thankfulness to Chengsi (Michelle) Huang for being a reliable source of scientific knowledge and a caring partner throughout my graduate career, and to my superwoman, Yang (Stella) Song, who was willing to come to the lab at midnight to help me troubleshoot the instrument. Thanks to all the ladies and the only gentleman, Andrew Vanschoiack, in the proteomics subgroup for helpful discussions on research progress every week. I am grateful to my colleagues, Yun (Winnie) Zhang, Nilini S. Ranbaduge, Akiko Tanimoto, Matthew Bernier, Xin Ma, and Jing Yan, for providing useful feedback and insightful comments on this thesis. Thanks for all the good times; the memories will be cherished forever.


Vita

2007 to 2011 ...... B.S. Chemistry, Nankai University, China; B.S. Chemical Engineering, Tianjin University, China

2011 to 2012 ...... Graduate Assistant, Department of Chemistry and Biochemistry, the University of Arizona

2012 to present ...... Graduate Assistant, Department of Chemistry and Biochemistry, the Ohio State University

Fields of Study

Major Field: Chemistry


Table of Contents

Abstract

Dedication

Acknowledgments

Vita

List of Tables

List of Figures

Chapter 1. Introduction

1.1. Aspergillus fumigatus Spiked in Bronchoalveolar Lavage Fluid (BALF) as Model System

1.2. Establishment of Proteomics Using Mass Spectrometry

1.2.1. General Workflow of Bottom-up Proteomics

1.2.2. Liquid Chromatography Coupled to Mass Spectrometry

1.2.3. Electrospray Ionization and Nano Electrospray Ionization

1.2.4. Collision-induced Dissociation

1.2.5. Thermo Scientific Velos Pro Mass Spectrometer

1.2.6. Thermo Orbitrap Elite Mass Spectrometer

1.2.7. Database Searching

1.3. The Quest for Low-abundance Proteins

1.3.1. Detection of Low-abundance Proteins

1.3.2. Combinatorial Peptide Ligand Library

1.3.2.1. Reduction of Protein Dynamic Concentration Range

1.3.2.2. Mechanism and Properties of Peptide Ligand Library in Protein Capturing

1.3.3. Protein Quantification in Combinatorial Peptide Ligand Library

Chapter 2. Study of Limit of Detection of Aspergillus fumigatus Spiked in Bronchoalveolar Lavage Fluid (BALF) before CPLL Treatment

2.1. Experimental Procedures

2.1.1. Materials

2.1.2. Bead Bed Volume Optimization for BALF

2.1.3. Antigen Spiked in BALF before CPLL

2.1.4. Trichloroacetic Acid (TCA) Precipitation

2.1.5. Protein Assay

2.1.6. Sodium Dodecyl Sulfate-Polyacrylamide Gel Electrophoresis and Silver Staining

2.1.7. Protein Digestion

2.1.8. LC-MS/MS

2.1.9. MS Data Analysis

2.2. Results and Discussions

2.2.1. Gel Results of Bed Volume Optimization for BALF

2.2.2. MS Results of Bed Volume Optimization for BALF

2.2.3. Gel Results for Aspergillus fumigatus and Catalase Spiked in BALF

2.2.4. MS Results for Aspergillus fumigatus and Catalase Spiked in BALF

Chapter 3. Conclusion and Future Directions

Appendix A A list of 429 BALF and Aspergillus fumigatus proteins identified on Orbitrap

Appendix B A list of 219 BALF and catalase proteins identified on Orbitrap

Reference


List of Tables

Table 2.1 Different spike concentrations and mass ratios a) Aspergillus fumigatus spiked in BALF and b) catalase spiked in BALF

Table 2.2 Top twenty-four proteins identified from initial BALF without CPLL treatment

Table 2.3 Identified Aspergillus fumigatus protein groups of two BALF spiked with different amount of Aspergillus fumigatus

Table 2.4 Top twenty-four proteins identified from BALF without CPLL treatment

List of Figures

Figure 1.1 General workflow of bottom-up proteomics experiment

Figure 1.2 Illustration of interactions between peptides and stationary phase of reversed phase liquid chromatography

Figure 1.3 Illustration of an ESI ion source operated in positive mode

Figure 1.4 Schematic diagram of a Thermo Scientific LTQ Velos ion trap mass spectrometer. Figure adapted from Thermo Scientific

Figure 1.5 Schematic diagram of a Thermo Scientific Orbitrap Elite Hybrid MS

Figure 1.6 Schematic representation of peptide library synthesis using “split, couple, recombine” method

Figure 1.7 Schematic representation of the process for the normalization of protein concentration differences in a sample

Figure 2.1 SDS-PAGE of initial BALF from patient 1 and the same after CPLL treatment

Figure 2.2 Number of identified BALF proteins and peptides with and without CPLL

Figure 2.3 Venn diagrams of BALF run on Velos Pro, independently treated with and without CPLL

Figure 2.4 Venn diagrams of BALF run on Orbitrap, independently treated without and with two CPLL

Figure 2.5 SDS-PAGE of BALF spiked with different amounts of Aspergillus fumigatus or catalase before CPLL treatment

Figure 2.6 Number of identified proteins for BALF spiked with different amounts of Aspergillus fumigatus

Figure 3.1 Different cellular component percentages comparison graph for BALF proteins

Figure 3.2 Different molecular function percentages comparison graph for BALF proteins

Chapter 1. Introduction

1.1. Aspergillus fumigatus Spiked in Bronchoalveolar Lavage Fluid (BALF) as Model System

Aspergillus fumigatus is one of the most ubiquitous airborne fungi. It is typically found in soil and decaying organic matter. Humans and animals constantly inhale numerous conidia of this fungus. In a healthy host, the conidia are quickly cleared. However, immunocompromised individuals can be infected by inhalation of conidia, which is the major cause of life-threatening invasive aspergillosis (IA).1,2

Patient BALF, as sampled by fiber-optic bronchoscopy, remains an important means of collecting and analyzing airway cells, proteins and metabolites.3 The hypothesis of this study was that the Aspergillus fumigatus proteins in BALF of IA-affected humans would show differential protein expression compared to the Aspergillus fumigatus proteins in BALF of healthy humans. The ultimate goal of this study is to identify Aspergillus fumigatus proteins that are specifically present in patient BALF during invasive aspergillosis (IA). However, soluble protein components of BALF are dominated by plasma-derived proteins (e.g. albumin and immunoglobulins), as well as resident proteins that are secreted from the airway epithelium and immune responsive cells.3 The large dynamic range of protein abundance in BALF makes the detection of low-abundance Aspergillus fumigatus proteins challenging. To explore the limit of detection of Aspergillus fumigatus proteins in the presence of BALF, a model system of Aspergillus fumigatus spiked in BALF was employed in this study.

In order to evaluate a large number of proteins at once, proteomics has been selected for a comprehensive study of protein composition in the biological sample. The goal of proteomics studies is to identify and quantify complex proteome compositions in order to provide a better understanding of how proteins affect biological processes (e.g. metabolic pathways, disease processes, and cell cycle) and conversely, how these are affected by the external environment.

Several main challenges are involved in proteome analysis of biological samples due to the complexity and dynamics of the proteome compared to the genome. These challenges include the following: (i) proteins constantly undergo dynamic changes, and some transient species only appear under certain conditions; (ii) proteins are post-translationally modified, so that even the same type of protein varies widely with different environmental conditions; (iii) protein isoforms resulting from post-translational modifications need to be analyzed effectively; and (iv) proteins usually have a wide dynamic range of concentrations in the body, making it extremely difficult to detect low-abundance proteins in a complex biological sample.

1.2. Establishment of Proteomics Using Mass Spectrometry

1.2.1. General Workflow of Bottom-up Proteomics

“Bottom-up” proteomics refers to the characterization of proteins extracted from cells, tissues, or biological fluids by analysis of peptides produced via proteolysis. Figure 1.1 shows a general workflow of bottom-up proteomics. Due to the high degree of protein complexity, the sample is often fractionated by gel or affinity selection. Two main gel-based methods generally used are one-dimensional sodium dodecyl sulfate (SDS)-polyacrylamide gel electrophoresis (PAGE), which separates proteins based on molecular weight (MW), and two-dimensional SDS-PAGE, which separates proteins first by isoelectric point (pI) and second by MW. The proteins in these gel bands are then digested by proteases into peptides for more sensitive mass spectrometry analysis. The digested peptide mixture is separated by reversed-phase liquid chromatography coupled to mass spectrometry (LC-MS). Collected mass spectra are searched against a comprehensive database to identify the corresponding peptide sequences for large-scale datasets.

Figure 1.1 General workflow of a bottom-up proteomics experiment. (1) Protein fractionation and gel separation to reduce sample complexity. (2) Extracted proteins are digested into peptides by a proteolytic enzyme. (3) Peptide mixtures are separated by one or more steps of LC-MS. (4) A mass spectrum of eluting peptides. (5) Selected peptides are fragmented in MS/MS experiments. Reproduced with permission, Copyright 2003, Nature Publishing Group.

1.2.2. Liquid Chromatography Coupled to Mass Spectrometry

The combination of liquid chromatography with mass spectrometry (MS) allows a more confident identification of compounds than can be achieved by either method alone.4 A mass spectrometer can accurately and sensitively measure the mass-to-charge ratio (m/z) of ionized analytes in the gas phase. In principle, a mass spectrometer consists of an ionization source that brings the analyte into the gas phase and adds charges onto it, a mass analyzer that measures the m/z of the ionized analytes, and a detector that detects the number of ions at each m/z value. The ionization system is coupled to different types of mass analyzers to analyze the precursor ions and fragment ions. For the majority of techniques developed so far, protein mixtures must be digested into peptides by selected proteases. Protein identification is achieved by tracing the MS fragmentation pattern back to the original sequence of the peptide.

Although MS is a powerful tool to obtain mass and structural information, the overall sample complexity in proteomics studies limits what a single technique can achieve. When the analyte of interest is part of a complex biological sample, the resulting mass spectra will contain ions from all the compounds, making it difficult to identify the analyte with high confidence. In this case, liquid chromatography helps to separate the proteins prior to MS analysis to simplify the sample. High performance liquid chromatography (HPLC) is a popular liquid-based separation technique in proteomics due to its compatibility with MS.

However, employing HPLC as a single dimension of separation faces the challenge of resolving thousands of co-eluting peptides with similar m/z ratios in order to obtain accurate quantitative and qualitative information in a proteomics study.5 Therefore, a better method is multidimensional HPLC (MDLC), in which proteins or peptides are separated more than once to enhance fractionation and minimize the number of co-eluting peptides. There are various combinations of MDLC. In the case of RPLC-RPLC, separation is based on partitioning of the analyte between an apolar stationary phase and a polar mobile phase. In principle, the pH is the main contributor to the analytical selectivity due to the ionic nature and recharging of peptides (Fig 1.2).6 Acidic peptides (pI < 5.5) are more attracted to the column at low pH, while basic peptides (pI > 7.5) are more strongly retained at high pH.7

Figure 1.2 Illustration of interactions between peptides and stationary phase of reversed phase liquid chromatography. Reproduced from reference 6, with permission, Copyright 2012, Elsevier.

1.2.3. Electrospray Ionization and Nano Electrospray Ionization

Electrospray ionization (ESI) works by transferring analyte from the liquid phase into the gas phase to generate intact ions at atmospheric pressure. The advantage of using electrospray ionization mass spectrometry (ESI-MS) to study biological macromolecules lies in the fact that it is a relatively “soft” ionization technique and can conserve covalent bonds of the molecules. Therefore, analytes present in solution are largely preserved intact after transfer to the gas phase. ESI is also popular for its capability to produce multiply charged ions, which makes the detection of very high molecular mass compounds possible on instruments with limited mass ranges. Another advantage of ESI is its compatibility with traditional liquid chromatography separation techniques widely used in proteomics. The ESI process typically consists of three major steps in order to generate intact gas-phase ions (see Fig. 1.3). Firstly, a high voltage is applied to a nozzle containing liquid, and a Taylor cone is formed at the tip of the capillary tube. The Taylor cone shape is a result of two opposing forces experienced by the liquid. One is the force derived from surface tension, which tends to hold the liquid back in the nozzle. The other is the Coulombic attraction that pulls the liquid toward the counter electrode.8 Secondly, charged droplets are emitted from the tip of the Taylor cone. Thirdly, these droplets undergo further fission and evaporation, assisted by warm gas, until desolvation is complete and ions are eventually released from the droplets.9

Figure 1.3 Illustration of an ESI ion source operated in positive mode. Labeled components: counter electrode, capillary tube. Reproduced from Mass Spectrometry Reviews, 28, 899, with permission, Copyright 2009, Wiley Periodicals, Inc.
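The effect of multiple charging on the observable m/z can be illustrated with a short calculation. The following is a minimal sketch, not part of the original work: it assumes positive-mode ions of the form [M + zH]z+, and the protein mass and the upper m/z limit are hypothetical values chosen only for illustration.

```python
PROTON_MASS = 1.007276  # Da, mass of a proton


def mz_for_charge(neutral_mass_da: float, charge: int) -> float:
    """Return the m/z of an [M + zH]z+ ion formed in positive mode."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (neutral_mass_da + charge * PROTON_MASS) / charge


def charge_states_in_range(neutral_mass_da: float, mz_max: float,
                           max_charge: int = 100) -> list:
    """List the charge states whose m/z falls at or below the upper limit."""
    return [z for z in range(1, max_charge + 1)
            if mz_for_charge(neutral_mass_da, z) <= mz_max]


if __name__ == "__main__":
    mass = 66_500.0  # hypothetical protein mass in Da (assumption)
    limit = 2000.0   # hypothetical upper m/z limit of the analyzer (assumption)
    states = charge_states_in_range(mass, limit)
    print(f"lowest charge state within range: {states[0]}+")
    print(f"m/z at that charge state: {mz_for_charge(mass, states[0]):.1f}")
```

With these assumed values, a singly charged ion would lie far outside the m/z window, whereas sufficiently high charge states bring the same molecule into range.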

For many proteomic applications, where low sample consumption is desired or required, nano-ESI, a minimized-flow form of ESI, is used. The flow rate of nano-ESI is reduced from microliters per minute to nanoliters per minute compared to the conventional ESI process, and only a few microliters of analyte solution (10⁻⁵ to 10⁻⁸ M) are sufficient for structural investigations and molecular weight determinations by MS/MS.10 However, nano-ESI is more than just a miniaturized form of ESI. The reduced flow rate of nano-ESI has been shown to produce smaller initial droplet sizes than conventional ESI, which increases both sensitivity and salt tolerance.11 Moreover, nano-ESI enables the analysis of proteins in their native form with their structures preserved, because nano-ESI replaces organic solvents and high interface temperatures with stable spraying of water and aqueous salt solutions.

1.2.4. Collision-induced Dissociation

Collision-induced dissociation (CID) is the most commonly used method of fragmentation. Ions of interest, often referred to as precursor ions, are excited by colliding with an inert target gas (He, N2, Ar). After the collision, the precursor ions dissociate into fragment ions. The technique that uses two stages of mass analysis is termed tandem mass spectrometry (MS/MS). When an ion collides with a neutral atom or molecule, some of the ion’s kinetic energy can be converted into internal energy. If there is enough excess internal energy to break chemical bonds, the ion will dissociate, given enough time in a particular instrument region. In this way, the structural information of the ions can be studied from the tandem MS spectra.

During the CID process, only a fraction of the ion kinetic energy is converted into ion internal energy. When an ion carrying some kilo-electron volts of kinetic energy in the laboratory frame, Elab, collides with a neutral atom or molecule, the maximum amount of kinetic energy available for conversion into internal energy is given by the center-of-mass collision energy, Ecm:12,13

$$E_{cm} = E_{lab}\,\frac{m_n}{m_n + m_i}$$

where mn is the mass of the neutral and mi is the mass of the ion.

From the equation above, it can be seen that increasing the lab frame ion kinetic energy or the neutral mass can increase the amount of kinetic energy transferred into internal energy.

However, the collision process can also activate the ion analyte. As a result, the ion is subjected to dissociation or rearrangement after randomization of the obtained internal energy.14
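The dependence of the center-of-mass collision energy on the neutral mass can be illustrated numerically. The sketch below assumes the standard relation Ecm = Elab × mn / (mn + mi); the ion mass and lab-frame energy are hypothetical values, and the gas masses are approximate.

```python
def center_of_mass_energy(e_lab_ev: float, ion_mass: float,
                          neutral_mass: float) -> float:
    """Maximum energy (eV) convertible to internal energy in one collision.

    Assumes E_cm = E_lab * m_n / (m_n + m_i); masses may be in any common unit.
    """
    if ion_mass <= 0 or neutral_mass <= 0:
        raise ValueError("masses must be positive")
    return e_lab_ev * neutral_mass / (neutral_mass + ion_mass)


if __name__ == "__main__":
    ion_mass = 1000.0  # Da, hypothetical peptide ion (assumption)
    e_lab = 35.0       # eV, hypothetical lab-frame kinetic energy (assumption)
    # Approximate masses (Da) of the collision gases named in the text.
    gases = {"He": 4.003, "N2": 28.01, "Ar": 39.95}
    for name, mass in gases.items():
        e_cm = center_of_mass_energy(e_lab, ion_mass, mass)
        print(f"{name}: E_cm = {e_cm:.3f} eV")
```

For a fixed lab-frame energy, the heavier target gas yields the larger center-of-mass energy, in line with the statement that increasing the neutral mass increases the energy available for conversion.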

1.2.5. Thermo Scientific Velos Pro Mass Spectrometer

The Thermo Scientific Velos Pro mass spectrometer is composed of many parts: an ion source, ion optics, a rotated quadrupole with neutral beam blocker, an octopole mass transfer, and a dual-pressure linear ion trap (Fig. 1.4). The optical lenses focus and accelerate gas-phase ions generated by the electrospray ionization source into the quadrupole. Ions are transmitted to the octopole prior to the dual-pressure linear ion trap. The dual-pressure design provides optimal pressure for both ion manipulation (high-pressure cell) and detection (low-pressure cell). The high-pressure cell is used for effective trapping of injected ions. Increased pressure also raises the chance of effective ion collisions with the buffer gas, which improves the efficiency of precursor ion fragmentation by traditional collision-induced dissociation. The low-pressure cell is used for scanning fragment ions out radially through the slots in two opposite rod electrodes. The ejected ions are detected by two detectors to double the sensitivity of the device.

Figure 1.4 Schematic diagram of a Thermo Scientific LTQ Velos ion trap mass spectrometer. Figure reproduced from Part of Thermo Fisher Scientific. Copyright 2011 Thermo Fisher Scientific Inc. Thermo Scientific LTQ Velos Dual-Pressure Linear Ion Trap. Retrieved from Thermo Scientific, website: http://www.thermo.com/eThermo/CMA/PDFs/Product/productPDF_51541.pdf

1.2.6. Thermo Orbitrap Elite Mass Spectrometer

The Thermo Scientific Orbitrap Elite mass spectrometer is a hybrid ion trap mass spectrometer. It combines a dual-pressure linear ion trap with a new high-field orbitrap mass analyzer to enable high resolving power and accurate mass measurement. A schematic diagram of the Orbitrap Elite hybrid MS is shown in Figure 1.5. Ions ejected axially from the linear ion trap are directed through a quadrupole focusing element into a curved RF-only quadrupole known as the C-trap, from where the ion packets are injected into the orbitrap for detection. The orbitrap is also an ion trap, in which moving ions are trapped in an electrostatic field very much like a satellite in an orbit. Ion oscillations in the orbitrap are detected by converting the image current signal into a frequency-domain signal by Fourier transformation for accurate reading of their m/z.

Figure 1.5 Schematic diagram of a Thermo Scientific Orbitrap Elite Hybrid MS. Figure reproduced with permission from Thermo Fisher Scientific Copyright 2013 Thermo Fisher Scientific Inc. Orbitrap Elite Hybrid Ion Trap Mass Spectrometer. Retrieved from Thermo Scientific, website: http://planetorbitrap.com/orbitrap-elite#tab:schematic
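The link between the detected oscillation frequency and m/z can be sketched as follows. This example is an illustration added here, not a description of the instrument software: it assumes the commonly cited relation that the axial oscillation frequency in an orbitrap is proportional to the inverse square root of m/z, so that an unknown m/z can be obtained from a calibrant of known m/z. The frequencies and the calibrant m/z are hypothetical.

```python
def mz_from_frequency(freq_hz: float, ref_freq_hz: float, ref_mz: float) -> float:
    """Estimate m/z from an axial frequency using a calibrant ion.

    Assumes frequency is proportional to 1 / sqrt(m/z), which gives
    m/z = ref_mz * (ref_freq / freq) ** 2.
    """
    if freq_hz <= 0 or ref_freq_hz <= 0:
        raise ValueError("frequencies must be positive")
    return ref_mz * (ref_freq_hz / freq_hz) ** 2


if __name__ == "__main__":
    ref_mz = 500.0       # hypothetical calibrant m/z (assumption)
    ref_freq = 400_000.0 # Hz, hypothetical calibrant frequency (assumption)
    measured = 350_000.0 # Hz, hypothetical measured frequency (assumption)
    print(f"estimated m/z: {mz_from_frequency(measured, ref_freq, ref_mz):.3f}")
```

Because m/z scales with the inverse square of the frequency, an ion oscillating at half the calibrant frequency has four times the calibrant m/z.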

1.2.7. Database Searching

Peptide sequences from tandem MS experiments are identified using software that correlates experimentally acquired tandem mass spectra to theoretical spectra predicted from sequences obtained from a sequence database.15 Fragmented peptides are then traced back to proteins using a second software algorithm.16 One of the more popular routines for database matching of peptide MS/MS spectra is SEQUEST.17 A cross-correlation (XCorr) function is used to assess the quality of the match between the mass-to-charge ratios for the fragment ions observed in the tandem mass spectrum and the fragment ions predicted from amino acid sequence information in a database. The XCorr value depends on the quality of the tandem mass spectrum and the closeness of its fit to the theoretical spectrum. SEQUEST scores are currently normalized using the cross-correlation difference between the first- and second-ranked sequences (ΔCn). In general, when ΔCn is greater than 0.1, the match between the sequence and spectrum has been reported to be acceptable.17 XCorr is independent of database size, whereas ΔCn is database dependent and reflects the quality of the match relative to near misses.15
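The ΔCn filter can be sketched in a few lines. The example below assumes the commonly used definition ΔCn = (XCorr1 − XCorr2) / XCorr1, where XCorr1 and XCorr2 are the scores of the first- and second-ranked candidate sequences; the function names and the XCorr values are hypothetical, and only the 0.1 threshold is taken from the text.

```python
def delta_cn(xcorr_scores: list) -> float:
    """Normalized XCorr difference between the two best-scoring candidates.

    Assumes delta Cn = (XCorr_1 - XCorr_2) / XCorr_1.
    """
    if len(xcorr_scores) < 2:
        raise ValueError("need at least two candidate scores")
    ranked = sorted(xcorr_scores, reverse=True)
    best, second = ranked[0], ranked[1]
    if best <= 0:
        raise ValueError("top XCorr must be positive")
    return (best - second) / best


def is_acceptable(xcorr_scores: list, threshold: float = 0.1) -> bool:
    """Apply the delta Cn > 0.1 rule of thumb quoted in the text."""
    return delta_cn(xcorr_scores) > threshold


if __name__ == "__main__":
    candidates = [2.1, 3.4, 2.9, 1.7]  # hypothetical XCorr values for one spectrum
    print(f"delta Cn = {delta_cn(candidates):.3f}")
    print(f"acceptable: {is_acceptable(candidates)}")
```

Because ΔCn compares the best match with the runner-up, it changes when the database changes, whereas the top XCorr for a given sequence does not.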

1.3. The Quest for Low-abundance Proteins

1.3.1. Detection of Low-abundance Proteins

The dynamic concentration range of proteins in body fluids (e.g. serum and plasma) can span over 12 orders of magnitude, from highly expressed proteins (e.g., albumin in serum) to proteins that are expressed only in a few copies. For example, in human plasma, the 22 most abundant proteins make up more than 99% of the mass, while most disease markers are masked by high-abundance proteins. To circumvent this dynamic range barrier, several sample treatment strategies have been developed over the years. Subcellular fractionation is employed for proteins displaying region-specific expression patterns. The sample complexity can be greatly reduced by isolating subcellular compartments, including organelles and macromolecules.18 Another way to achieve better proteome coverage is through depletion of high-abundance proteins. Multiple high-abundance proteins can be reduced in concentration simultaneously using an affinity-based system. Several methods have also been developed to enrich for transient proteins and peptides based on their post-translational modification, such as immunoprecipitation, immobilized metal affinity chromatography, and strong cation exchange.

1.3.2. Combinatorial Peptide Ligand Library

1.3.2.1. Reduction of Protein Dynamic Concentration Range

The idea of using a combinatorial peptide ligand library (CPLL) approach to reduce protein dynamic concentration range comes from the affinity chromatographic concept. In this concept, a single protein can specifically bind to ligands, such as antibodies grafted onto the columns. From there, mixed beds of chromatographic beads can be used to provide as many immunoaffinity beads as the diversity of the proteins in the sample. In theory, if each protein can find its own binding partner, the beads can capture all the proteins at similar concentrations. The CPLL (now commercially available as ProteoMiner™, Bio-Rad Labs, CA, USA) methodology was originally developed with a very similar principle, in that each bead, with an average diameter of 65 μm, carries many copies of the same hexapeptide instead of antibodies. The library consists of a mixture of porous poly(hydroxymethacrylate) beads on which hexapeptides are covalently attached via the C-terminus. This solid-phase ligand library is generated by the “split, couple, recombine” combinatorial chemistry method. Figure 1.6 shows the synthesis of a linear hexapeptide ligand library with 16 amino acids (ProteoMiner Beads). Briefly, a pool of microscopic porous chromatographic beads is separated into several reaction vessels. The number of reaction vessels is the same as the number of amino acids used in the synthesis of the ligand sequences. Each bead vessel receives a different amino acid, which is then chemically attached to the beads. Once the coupling reaction comes to completion, the different bead vessels are extensively washed to remove the excess reagents. Coupled amino acids are then de-protected and all the beads are recombined. Finally, the batch is split up again into the same number of vessels as before. The process of coupling amino acids is repeated six times until a hexapeptide library is produced.19,20

Figure 1.6 Schematic representation of peptide library synthesis using “split, couple, recombine” method. In the first cycle, 16 amino acids are chemically attached to the beads in 16 reaction vessels. Once the coupling reaction comes to completion, all the beads are recombined. Finally, the beads are split up again into 16 vessels as before for the second cycle. The process of coupling amino acids is repeated six times until a hexapeptide library is produced.

The ligands are distributed throughout the beads' porous structure with a ligand density of ca. 40-60 μmol/mL bead volume (about 50 pmol of the same hexapeptide per bead).21,22 The commercially available library beads (ProteoMiner) are composed of 16.7 million different peptide combinations resulting from the use of 16 amino acids.23 It is to be noted that the binding is not as specific as for affinity columns. Experimental evidence has shown that a given protein can bind to more than one type of ligand and vice versa.21
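The library size quoted above follows directly from the combinatorics of the synthesis: 16 building blocks over six coupling cycles give 16^6 = 16,777,216 possible hexapeptides, or about 16.7 million. The sketch below checks this number and runs a toy “split, couple, recombine” simulation. The single-letter labels are placeholders rather than the actual amino acids used in the commercial library, and the bead count is hypothetical.

```python
import random


def library_size(n_building_blocks: int, peptide_length: int) -> int:
    """Number of distinct linear sequences of the given length."""
    return n_building_blocks ** peptide_length


def split_couple_recombine(n_beads: int, building_blocks: str,
                           cycles: int, seed: int = 0) -> list:
    """Toy simulation: every bead ends up carrying one sequence."""
    rng = random.Random(seed)
    beads = [""] * n_beads
    n_vessels = len(building_blocks)
    for _ in range(cycles):
        rng.shuffle(beads)  # recombined pool is mixed before the next split
        # Split: bead i goes to vessel i mod n_vessels.
        # Couple: the building block of that vessel is added to the bead.
        beads = [seq + building_blocks[i % n_vessels]
                 for i, seq in enumerate(beads)]
    return beads


if __name__ == "__main__":
    blocks = "ABCDEFGHIJKLMNOP"  # 16 placeholder labels (assumption)
    print(f"possible hexapeptides: {library_size(len(blocks), 6):,}")
    beads = split_couple_recombine(n_beads=10_000, building_blocks=blocks, cycles=6)
    print(f"distinct sequences on 10,000 simulated beads: {len(set(beads))}")
```

Each simulated bead passes through exactly one vessel per cycle, which is why a single bead carries a single sequence while the pool as a whole samples the full combinatorial space.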

The workflow of the general equalizing process is shown in Fig 1.7. The vast population of peptide ligands can theoretically interact with most, if not all, of the proteins in a complex proteome. During sample binding, because each individual bead has a limited binding capacity, saturation is quickly reached for most of the high-abundance proteins (a and b), while the medium-to-low abundance proteins (c and d) are captured in progressively increasing amounts as the beads are overloaded with protein sample. When most binding sites become saturated, the excess of proteins that cannot be bound any further is collected in the flow-through. As a result, medium-to-low abundance proteins are concentrated on the beads, while the excess of high-abundance proteins is washed off. Therefore, the overall dynamic range of protein concentrations is normalized.

Figure 1.7 Schematic representation of the process for the normalization of protein concentration differences in a sample. First, the sample is incubated with solid-phase combinatorial ligand library at a given ratio. Second, the binding sites on the ligands are gradually saturated by overloading of the sample. Third, excess amount of proteins in the sample are collected in the flow-through fraction by centrifugation and non-adsorbed proteins are removed by washing. At last, the adsorbed proteins are eluted off the beads.
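The normalization process described above can be approximated with a deliberately simplified model. The sketch below assumes that each protein has its own ligand population with the same fixed capacity and that a protein binds until either the sample or its ligand capacity is exhausted; cross-binding and displacement are ignored. The protein labels follow the a to d notation of the text, and all amounts are hypothetical arbitrary units.

```python
import math


def equalize(loaded: dict, capacity_per_protein: float) -> dict:
    """Amount of each protein retained on the beads after washing.

    Simplified model: bound amount = min(amount loaded, ligand capacity).
    """
    return {name: min(amount, capacity_per_protein)
            for name, amount in loaded.items()}


def dynamic_range_log10(amounts: dict) -> float:
    """Orders of magnitude between the most and least abundant protein."""
    values = [v for v in amounts.values() if v > 0]
    return math.log10(max(values) / min(values))


if __name__ == "__main__":
    # Hypothetical load: a and b are high abundance, c and d are low abundance.
    sample = {"a": 1e6, "b": 1e5, "c": 1e2, "d": 1e1}
    bound = equalize(sample, capacity_per_protein=1e3)
    print(f"dynamic range before: {dynamic_range_log10(sample):.1f} orders")
    print(f"dynamic range after:  {dynamic_range_log10(bound):.1f} orders")
```

In this model the high-abundance species are truncated at the ligand capacity while the low-abundance species are retained in full, so the range is compressed only when the beads are overloaded.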

1.3.2.2. Mechanism and Properties of Peptide Ligand Library in Protein Capturing

The mechanism of protein capture by the hexapeptide bead library is still under debate. Earlier, Keidel et al.24 stated that ProteoMiner beads act simply “according to a general hydrophobic mechanism, where diversity in surface ligands plays only a negligible role”. This proposed mechanism has been shown to be erroneous because of the pH-dependent nature of protein capture and the ability to elute proteins under high salt conditions.25 On the other hand, binding is also not as specific as for affinity columns, because in a mixed bed of beads many peptide ligands differ from each other by only a single amino acid. It has been demonstrated that a single bead is capable of binding more than one protein species, presumably with different dissociation constants.26 In a statistical model proposed by Righetti et al.,27 the binding is further described as a collection of multiple interactions of several peptides that bind to multiple regions of the protein surface.

A random linear hexapeptide from a combinatorial peptide ligand library can form various types of bonds. While secondary amide bonds on the peptide backbone contribute to hydrogen bonding, the side chains of amino acids are involved in ionic interactions through the anionic (glutamic acid and aspartic acid) or cationic (arginine, lysine and histidine) charges present. Hydrophobic interactions are formed by non-ionic hydrocarbon side chains from isoleucine, leucine, and phenylalanine.20 All of these interactions play an important role in the protein capturing process. Binding interactions between proteins and ligands can be quite strong and require highly denaturing eluting agents.28,29 For example, 6 M guanidine-HCl (GuHCl), pH 6, is considered to be a general eluent due to its strong chaotropic effect and high ionic strength. It is able to disrupt all bonds and denature all proteins into random coils, and can therefore be used as the sole elution step. However, it has to be removed from the recovered proteins via dialysis, as it is incompatible with downstream analysis. Researchers have reported that in order to recover more than 99% of the bound species, the beads have to be boiled in 4% SDS buffer containing 30 mM dithiothreitol (DTT).28

There are several critical parameters that can influence the performance of a peptide ligand library, such as peptide length, environmental pH, protein ionic strength, and overloading conditions. The performance of ligand libraries with different peptide lengths has been examined by Simo et al.30 with a crude cytoplasmic extract of human red blood cells that contains a large amount of hemoglobin. Their results have shown that the diversity of captured proteins changes in a nonlinear manner as the peptide is elongated from a single amino acid to a hexapeptide. The diversity of captured proteins tends to plateau at the length of a tripeptide.

Environmental pH has been used to change dissociation constants through its effect on the global ionic state of different protein species. The benefit of modulating the environmental pH is that proteins can be captured at several different pH levels, and the proteins identified by LC-MS/MS differ at each level.31 Changing the ionic strength by adding or omitting sodium chloride also has a profound influence on the composition of the eluted proteins.32

Another non-negligible parameter is the sample loading condition. The binding capacity of the beads is about 10 mg/mL, or roughly 3 ng of protein per bead with an average diameter of 65 μm.20 As long as the sample load is below the saturation level of the beads, the composition of the eluted proteins is similar to that of the loaded sample.33 This shows that overloading is very important for the detection of low-abundance proteins, which are progressively concentrated on the beads. Naturally, proteins with multiple binding partners are enriched better than others because they are more likely to reach the saturation of the loading capacity. When a large amount of sample is loaded, displacement effects overlap with the bead saturation effect. According to the law of mass action, displacement effects depend on the dissociation constants and relative concentrations of the respective proteins. Therefore, upon overloading, the binding favors the species that displace others due to their higher affinity for the peptide ligand.
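The dependence of displacement on dissociation constants and relative concentrations can be illustrated with a textbook competitive-binding expression. The sketch below is not the statistical model of Righetti et al.; it assumes a single class of binding site at equilibrium, with the fraction of sites held by species i given by (c_i/K_i) / (1 + sum_j c_j/K_j). The function name, species names and numerical values are hypothetical and chosen only for illustration.

```python
def competitive_occupancy(free_conc, kd):
    """Fraction of binding sites held by each species at equilibrium.

    free_conc and kd are dicts keyed by species name and must use the
    same concentration units. All values used below are hypothetical.
    """
    # Law of mass action: each species competes with weight c_i / K_i.
    weights = {name: free_conc[name] / kd[name] for name in free_conc}
    denom = 1.0 + sum(weights.values())
    return {name: w / denom for name, w in weights.items()}


# Hypothetical example: an abundant weak binder and a rare tight binder.
occupancy = competitive_occupancy(
    free_conc={"abundant": 10.0, "rare": 0.01},  # µM, illustrative
    kd={"abundant": 10.0, "rare": 0.001},        # µM, illustrative
)
for name, fraction in occupancy.items():
    print(f"{name}: {fraction:.3f} of sites")
```

In this illustrative case the rare species holds most of the sites despite its thousand-fold lower concentration, because its assumed dissociation constant is much smaller.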

1.3.3. Protein Quantification in Combinatorial Peptide Ligand Library

With the capability of peptide ligand libraries to decrease the concentration of high-abundance species while concentrating low-abundance species demonstrated, quantitation has remained one of the most frequently asked questions about the technology. A number of published papers have since addressed the quantitation of proteins after treatment with CPLL. In 2008, Roux-Dalvai et al.34 demonstrated good reproducibility of the MS signal of a spiked protein (a human protein extract spiked with an exogenous alcohol dehydrogenase) between technical replicates. A linear response of the MS signal with increasing amounts of this protein was reported in the interval from 100 to 1000 pmol. This was the first demonstration that differential quantitative proteomics could be performed for low-abundance proteins after reduction of the dynamic concentration range.

Another exploration of quantitation came from the work of Mouton-Barbosa et al.35 In this study, four heterologous proteins (beta-lactoglobulin, beta- and kappa-caseins, and phosphorylase b) were spiked into human serum in the nanomolar to micromolar range before treatment with CPLL. After mass spectrometry analysis, the signal was found to increase linearly as a function of protein concentration over at least three orders of magnitude. Thus it was demonstrated that relative quantitation of proteins is possible after peptide library treatment, as long as the saturation of the beads is not reached. Once the protein concentrations reach the saturation of the bead capacity, the peptide signal from those proteins no longer changes linearly with any additional load.

Chapter 2. Study of Limit of Detection of Aspergillus fumigatus Spiked in Bronchoalveolar lavage fluid (BALF) before CPLL Treatment

2.1. Experimental Procedures

2.1.1. Materials

The solid-phase combinatorial libraries of linear hexapeptides known under the trade name ProteoMiner Beads were purchased from Bio-Rad Laboratories, Hercules, CA. The library used 16 natural amino acids, excluding cysteine and methionine to prevent secondary reactions, alanine due to its neutral character and leucine due to its hydrophobicity.30 Bulk beads of 525 mg were purchased. Ethanol (200 proof) used to swell the beads was purchased from Decon Labs, Inc. Disposable mini Bio-Spin chromatography columns with a 1.2 ml bed volume and 50 ml of Phosphate Buffered Saline (PBS) wash buffer were also from Bio-Rad. The wash buffer consisted of 150 mM NaCl and 10 mM NaH2PO4 at pH 7.4. The elution reagent was 6 M guanidine-HCl (GuHCl) prepared from solids supplied by Sigma Aldrich, St. Louis, MO. Bronchoalveolar lavage fluid (BALF) from two patients known to be Aspergillus fumigatus negative by culture was obtained from Dr. Karen Wood of The Ohio State Hospital under an IRB-approved protocol. Aspergillus fumigatus commercial powder was purchased from GREER, Lenoir, NC. Recombinant catalase was obtained from Dr. Marta Feldmesser’s lab at Albert Einstein College of Medicine, Yeshiva University (Bronx, NY). Trichloroacetic acid (TCA) was purchased from EMD Biosciences, Darmstadt, Germany. The BCA protein assay reagent was obtained from Pierce Biotechnology, Inc. (Rockford, IL). Silver nitrate was purchased from Acros-Organics (Morris Plains, NJ). All other silver staining reagents (methanol, acetic acid, sodium thiosulfate, sodium carbonate, and formaldehyde) were purchased from Fisher Scientific. Sequencing Grade Trypsin and ProteaseMAX surfactant were from Promega (Madison, WI). Trifluoroacetic acid (TFA) was from Fisher Scientific, St. Louis, MO. LC solvents used in this study were Fisher Optima LC/MS grade ammonium formate, formic acid, acetonitrile and water.

2.1.2. Bead Bed Volume Optimization for BALF

Before use, the bulk beads were swelled overnight at 4 °C in 20% v/v aqueous ethanol. The swelled beads were aliquoted into bead slurry volumes of 10 μl, 25 μl, 50 μl and 100 μl. The final slurry following swelling is approximately 20% beads in 20% aqueous ethanol; therefore, the dry bead volumes of the above slurry aliquots were 2 μl, 5 μl, 10 μl and 20 μl. Prior to sample binding, the biological samples were centrifuged at 10,000 × g for 10 min to remove insoluble fractions, including particulates and cells. The ethanol was then removed and the beads were washed twice with PBS buffer to pre-equilibrate the column. 200 μl of crude BALF was mixed with 2 μl, 5 μl, 10 μl and 20 μl of solid-phase ligand library, respectively, and incubated overnight with rotation. The flow-through was collected by centrifugation at 1,000 × g for 1 min. The beads were then washed three times with PBS buffer. Proteins captured by the CPLL were desorbed together using 60 μl of 6 M GuHCl for 1.5 hours. Recovered proteins were subjected to the TCA precipitation described below.
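The relationship between the slurry aliquots, the bead volumes and the BALF/bead slurry volume ratios used later in this chapter can be checked with a short calculation. This is a minimal sketch using only the values stated above (200 μl of BALF and a slurry of approximately 20% beads); the function and variable names are illustrative and are not part of the original protocol.

```python
BALF_VOLUME_UL = 200   # volume of crude BALF per sample (from the text)
BEAD_FRACTION = 0.20   # approx. 20% beads in the swelled slurry (from the text)


def bead_plan(slurry_volumes_ul):
    """Return bead volume and BALF/slurry ratio for each slurry aliquot."""
    plan = []
    for slurry in slurry_volumes_ul:
        plan.append({
            "slurry_ul": slurry,
            "bead_ul": slurry * BEAD_FRACTION,
            "balf_to_slurry": BALF_VOLUME_UL / slurry,
        })
    return plan


plan = bead_plan([10, 25, 50, 100])
for row in plan:
    print(f"{row['slurry_ul']:>4} ul slurry = {row['bead_ul']:.0f} ul beads, "
          f"BALF/slurry ratio {row['balf_to_slurry']:.0f}:1")
```

The four aliquots correspond to BALF/bead slurry volume ratios of 20:1, 8:1, 4:1 and 2:1.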

2.1.3. Antigen Spiked in BALF before CPLL

A series of spiking experiments was performed after the appropriate BALF/bead slurry volume ratio was determined. In the spiking experiments, BALF samples were spiked separately with Aspergillus fumigatus at six different concentrations and with catalase at four concentrations (see Table 2.1). The average molecular weight of BALF proteins was estimated to be 65 kDa based on the molecular weight of serum albumin, the most abundant protein in these samples. The weight-average molecular weight of Aspergillus fumigatus proteins was estimated to be 55 kDa based on the molecular weights of its 16 most abundant proteins. The molecular weight of catalase was 79.9 kDa, obtained from Proteome Discoverer software (Version 1.4). Each of the 10 samples was then loaded onto a spin column containing 2 μl of ProteoMiner beads. The CPLL process for the spiking experiments was the same as described above.

a)

Aspergillus fumigatus   Final concentration of Aspergillus   Mass ratio of BALF to Aspergillus
spike level             fumigatus proteins                   fumigatus proteins
1                       10 pg/ml        0.2 pM               8 × 10^7:1
2                       1 × 10^3 pg/ml  0.02 nM              8 × 10^5:1
3                       1 × 10^5 pg/ml  2 nM                 8 × 10^3:1
4                       1 × 10^6 pg/ml  0.02 μM              8 × 10^2:1
5                       1 × 10^7 pg/ml  0.2 μM               8 × 10:1
6                       1 × 10^8 pg/ml  2 μM                 8:1

b)

Catalase                Final concentration of catalase      Mass ratio of BALF to
spike level                                                  aspergillus proteins
1                       1 × 10^3 pg/ml  0.01 nM              10^6:1
2                       1 × 10^5 pg/ml  1 nM                 10^4:1
3                       1 × 10^6 pg/ml  0.01 μM              10^3:1
4                       1 × 10^7 pg/ml  0.1 μM               10^2:1

Table 2.1 Different spike concentrations and mass ratios: a) Aspergillus fumigatus spiked in BALF and b) catalase spiked in BALF
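The molar concentrations in Table 2.1 follow from the mass concentrations and the estimated molecular weights. The sketch below shows the conversion under the assumptions stated in the text (65 kDa for BALF proteins, 55 kDa for Aspergillus fumigatus proteins) together with the BALF concentration of 0.8 mg/ml reported in the results; the function and variable names are illustrative only.

```python
def pg_per_ml_to_molar(conc_pg_per_ml, mw_kda):
    """Convert a mass concentration in pg/ml to mol/L for a given MW in kDa."""
    grams_per_litre = conc_pg_per_ml * 1e-12 * 1e3  # pg/ml -> g/ml -> g/L
    return grams_per_litre / (mw_kda * 1e3)         # g/L divided by g/mol


BALF_PG_PER_ML = 0.8e9  # 0.8 mg/ml, as reported for the BALF sample

balf_molar = pg_per_ml_to_molar(BALF_PG_PER_ML, 65)  # BALF proteins, 65 kDa
spike1_molar = pg_per_ml_to_molar(10, 55)            # spike level 1, 55 kDa
mass_ratio_spike1 = BALF_PG_PER_ML / 10              # BALF mass : spike mass

print(f"BALF: {balf_molar * 1e6:.1f} uM")
print(f"Spike 1: {spike1_molar * 1e12:.2f} pM")
print(f"Mass ratio at spike 1: {mass_ratio_spike1:.0e}:1")
```

The results are consistent with the rounded values used in this chapter: about 12 μM for BALF, about 0.2 pM for the lowest Aspergillus fumigatus spike, and a mass ratio of 8 × 10^7:1.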

2.1.4. Trichloroacetic Acid (TCA) Precipitation

Eluted proteins (60 μl) were mixed with 15 μl of TCA. The mixture was incubated at 4 °C for 10 min and centrifuged at 14,000 rpm and 4 °C for 10 min. The supernatant was removed and the pellet was washed three times with 200 μl of acetone. The acetone-containing supernatant was removed and the pellet was air dried. After TCA precipitation, the proteins were dried using a SpeedVac and stored at -20 °C until used. For the protein assay, the pellet was first redissolved in water by heating at 95 °C for 5 min followed by sonication for 10 min.

2.1.5. Protein Assay

Protein concentration was determined using the Thermo Scientific Pierce protein assay kit, which is based on bicinchoninic acid (BCA) for the colorimetric detection and quantitation of total protein, with bovine serum albumin as the standard. The protocol provided by Pierce for the BCA kit was followed when measuring the total protein concentration.

2.1.6. Sodium Dodecyl Sulfate-Polyacrylamide Gel Electrophoresis and Silver Staining

Electrophoresis of redissolved proteins was performed using Bio-Rad (Invitrogen, Carlsbad, CA, USA) Mini-PROTEAN TGX Gels (10-well comb, precast, 4-20% polyacrylamide gel plates). Samples of appropriate protein concentration were diluted and analyzed by Sodium Dodecyl Sulfate-Polyacrylamide Gel Electrophoresis (SDS-PAGE) in the presence of 2-mercaptoethanol. About 0.9 μg of protein was loaded per lane. Electrophoresis was performed at 120 V for 60 min. Silver staining was performed after gel electrophoresis to visualize the bands as follows.

Immediately following electrophoresis, the polyacrylamide gel was soaked in 100 ml of fix 1 solution (50% methanol, 10% acetic acid and 40% water) on a shaker. After 30 min, the gel was transferred to 100 ml of fix 2 solution (5% methanol, 1% acetic acid and 94% water) for 15 min with shaking. The gel was then washed in 100 ml of MilliQ water with shaking for 3 × 10 min. A sensitizer solution of 0.02% sodium thiosulfate was used to sensitize the gel for exactly 90 seconds, followed by rinsing with MilliQ water for 3 × 30 seconds. For protein silver staining, the gel was incubated in 100 ml of chilled 0.2% silver nitrate solution for at least 30 minutes with shaking. 100 ml of developer solution made of 0.06 g/ml sodium carbonate, 50 μl of 37% formaldehyde, and 2 ml of 0.0004% sodium thiosulfate was used to visualize the stain. When the desired intensity was reached, the process was stopped by soaking the gel in 6% acetic acid for 5 min.

2.1.7. Protein Digestion

All proteins left over after the protein assay and gel electrophoresis experiments were digested. 1% ProteaseMAX Surfactant in 50 mM ammonium bicarbonate was mixed with the proteins, which were diluted in 50 mM ammonium bicarbonate. Proteins were reduced with 0.5 M dithiothreitol and alkylated with 0.55 M iodoacetamide. The proteins were then digested with 1.8 μg of trypsin and incubated on a shaker at 37 °C overnight. Trypsin was inactivated by mixing with TFA to a final concentration of 0.5%. Prior to LC-MS, the digested proteins were centrifuged at 12,000 × g for 10 min to avoid introducing non-peptide material into the system. The digested proteins were dried using a SpeedVac and redissolved in 10 μl of water for downstream analysis.

2.1.8. LC-MS/MS

One-dimensional LC-MS/MS was performed using an LTQ Velos Pro (Thermo Scientific, Bremen, Germany) mass spectrometer coupled to a 1D Waters nanoACQUITY UPLC system. LC separation was done with a Thermo Scientific EASY-Spray column (PepMap, C18, 3-μm packed 15-cm × 75-μm ID column). Approximately 0.5 μg of in-solution digested protein was first loaded onto the trap column at a flow rate of 15 μl/min, and the desalted samples were then loaded onto the analytical column at a flow rate of 0.4 μl/min. Separation was achieved by applying a gradient of eluents A (H2O with 0.1% v/v formic acid) and B (ACN with 0.1% v/v formic acid), from 1% B to 35% B in 210 min, then from 35% B to 85% B in 10 min, followed by 10 min at 85% B and finally a decrease to 1% B in 20 min. Full-scan mass spectra (m/z 400-1800) were acquired in the LTQ Velos Pro mass spectrometer, after which the ten most intense ions were automatically selected and fragmented in the ion trap. The collision energy was set to 35 eV, with an isolation width of 1. Target ions already selected for MS/MS were dynamically excluded with the following settings: repeat count 1 and exclusion duration 30 s. In each single LC-MS/MS run, about 0.5 μg of protein was loaded onto the column and separated with a 240 min gradient time.

The two-dimensional LC-MS experiment was performed using an Orbitrap Elite (Thermo Scientific, Bremen, Germany) coupled to a 2D Waters nanoACQUITY UPLC. The peptide mixture from a tryptic digest was separated on a Waters NanoEase 300 µm × 50 mm, 5 µm particle size XBridge BEH130 C18 column for the first dimension, and a Waters nanoACQUITY UPLC 75 µm × 150 mm, 1.8 µm particle size HSS T3 column for the second dimension, in three-fraction 2D runs. First, the samples were loaded onto the first dimension column at a flow rate of 2 μl/min. They were then eluted stepwise (three fractions) onto the trap column at a flow rate of 2 μl/min. Finally, the desalted samples were separated on the second dimension analytical column at a flow rate of 0.5 μl/min. Separation was achieved using solvents (A) 20 mM ammonium formate and (B) acetonitrile for the first dimension, and (A) water with 0.1% formic acid and (B) acetonitrile with 0.1% formic acid for the second dimension. Peptides loaded onto the first column were transferred onto the trap column in 20 min at 13.1%, 17.7% and 50% B for the three-fraction 2D runs. The loading time was 15 min for the first fraction and 9.5 min for the other two fractions. For each fraction, peptides were separated on the second dimension column under a gradient of B, from 5% B to 35% B in 40 min, followed by a further increase to 85% B in 3 min and a hold at 85% B for 3 min, then a sharp decrease to 5% B in 1 min and a hold at 5% B for 10 min. An Easy-nanoLC chromatography system (Thermo Scientific) was coupled on-line to the Orbitrap Elite instrument (Thermo Scientific) via a Nanospray Flex Ion Source (Thermo Scientific). In each LC/LC-MS/MS run, about 0.5 μg of digested peptides was loaded onto the column and separated in three fractions, each with a 55 min gradient time. MS data were acquired with a data-dependent strategy by selectively fragmenting the ten most intense precursors in the full mass scan (m/z 400-1600). The collision energy was set to 35 eV, with an isolation width of 1. The resolution was 240,000 for the full scan and 30,000 for MS/MS. Target ions already selected for MS/MS were dynamically excluded with the following settings: repeat count 1 and exclusion duration 3 s.

2.1.9. MS Data Analysis

Spectra of all the samples were searched against a customized database containing non-redundant human proteins from the Universal Protein Resource (UniProt) and Aspergillus fumigatus proteins from the National Center for Biotechnology Information (NCBI). The Proteome Discoverer (Version 1.4) search engine (Thermo Scientific) was used to generate the protein list from the data. For BALF analyzed on the Velos Pro, a maximum of two missed cleavages was allowed; cysteine alkylation and oxidation of methionine residues were used as variable modifications; precursor mass tolerance was set to 1 Da and fragment mass tolerance was set to 0.8 Da. The criterion used for protein identification was at least 2 unique peptides sequenced with high confidence.

Spectra of BALF spiked samples were searched with a maximum of two missed cleavages; cysteine alkylation, oxidation of methionine residues and deamidation of asparagine and glutamine were used as variable modifications; precursor mass tolerance was set to 10-20 ppm and fragment mass tolerance was set to 0.8 Da. The criterion used for protein identification was at least 2 unique peptides sequenced with high confidence.

Spectral count normalization was performed for identified proteins to enable comparative analysis. Normalization was done by dividing the spectral counts for individual peptides by the total spectral counts for that sample and multiplying by a factor to obtain an integer normalized spectral count.
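The normalization described above can be sketched in a few lines. The multiplication factor is not stated in the text, so the default `scale` below is an assumption made only for illustration, as are the function name and the example counts.

```python
def normalize_spectral_counts(counts, scale=10_000):
    """Scale each spectral count by the sample total and round to an integer.

    counts maps an identifier to its raw spectral count in one sample.
    scale is an assumed multiplication factor; the thesis does not state
    the value that was used.
    """
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no spectral counts")
    return {name: round(count * scale / total) for name, count in counts.items()}


# Hypothetical raw counts for two samples of different depth.
sample_a = normalize_spectral_counts({"protein_1": 30, "protein_2": 70}, scale=1000)
sample_b = normalize_spectral_counts({"protein_1": 60, "protein_2": 140}, scale=1000)
print(sample_a, sample_b)
```

Dividing by the sample total makes runs of different depth comparable: in this illustrative case both samples give the same normalized counts even though the second has twice as many spectra.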

2.2. Results and Discussions

2.2.1. Gel Results of Bed Volume Optimization for BALF

Figure 2.1 SDS-PAGE of initial BALF from patient 1 and the same after CPLL treatment. Lane 1 is the molecular-weight standard; Lane 2 is initial BALF without CPLL. Lanes 3, 4, 5 and 6 represent the samples after CPLL treatment with BALF/bead slurry volume ratios of 20:1, 8:1, 4:1 and 2:1, respectively. About 0.9 μg of protein was loaded for each lane.

Figure 2.1 illustrates the protein profile of initial BALF treated with four different volumes of CPLL bead slurry, followed by SDS-PAGE. Approximately 0.9 μg of protein was loaded onto each lane. Lane 2 represents the initial BALF sample without CPLL treatment. The gel clearly shows that several bands that are undetected in lane 2 are observed in lanes 3-6 after CPLL treatment with different volumes of bead slurry, indicating enrichment of these proteins. Intermediate concentration proteins that were previously masked by high-abundance proteins such as albumin and immunoglobulin became visible due to interactions with their ligands in the combinatorial peptide library. According to Bio-Rad Laboratories (Hercules, CA), the recommended sample/bead slurry volume ratio is 10:1 with a sample concentration greater than 50 mg/ml. Here, the BALF sample used has a concentration of 0.8 mg/ml, and therefore BALF/bead slurry volume ratios of 20:1, 8:1, 4:1 and 2:1 were tested. As the ratio decreases from 20:1 or 8:1 to 4:1 or 2:1, the albumin band becomes thicker and darker (compare lanes 5 and 6 to lanes 3 and 4), which results in the loss of some other protein bands on the gel. This can be explained by the fact that, since the binding is nonspecific, when there are too many binding sites the excess of abundant proteins has a greater chance to bind instead of being removed.

2.2.2. MS Results of Bed Volume Optimization for BALF

The eluates from the experiment described above were subjected to tryptic digestion followed by LC-MS analysis. The CPLL process was performed twice. The eluates from the first run were analyzed on the Velos Pro (Thermo Scientific) by LC-MS/MS, while those from the second run were analyzed on the Orbitrap Elite (Thermo Scientific) by LC/LC-MS/MS. Figure 2.2 shows histograms of the identified non-redundant protein groups and peptides from the LC-MS/MS and LC/LC-MS/MS analyses, respectively. About 0.5 μg of protein was loaded for LC analysis on both instruments. As shown by the number on top of each column in the histograms, more proteins and peptides were in general identified in the BALF sample after CPLL treatment. BALF/bead slurry volume ratios of 20:1 and 8:1 gave better results than the other two ratios. Moreover, the ratio of 20:1 is slightly better than 8:1, since more protein groups and unique peptides were identified on both instruments for that ratio, as shown in the Venn diagrams of Figure 2.3 and Figure 2.4. The protein groups shown here are non-redundant and have at least two unique peptide hits. In total, CPLL treatment provided identification of 142 BALF protein groups with at least two peptide hits and 1360 peptides on the Orbitrap using 3 fractions with 55 min LC gradients, compared to 65 BALF protein groups and 599 peptides without CPLL treatment under the same LC-MS conditions. Among the 166 BALF protein groups identified with and without CPLL at the optimized ratio, 101 are unique to the CPLL treated sample and 24 are unique to the non-CPLL treated sample. Consistent with the Orbitrap results, 115 BALF protein groups with at least two peptide hits and 746 peptides were identified on the Velos Pro, compared to only 61 BALF protein groups and 360 peptides without CPLL treatment, both using a 4-hour LC gradient. The Velos Pro runs provided 135 identified BALF proteins in total across the optimized CPLL ratio and the non-CPLL run; 74 of these were unique to the CPLL treated sample, while 20 were unique to the non-CPLL sample.
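The protein group counts reported above can be cross-checked with simple set arithmetic: the groups shared by the two samples equal the union minus the two unique sets, and each sample total equals its unique groups plus the shared ones. The sketch below uses only the numbers quoted in this section; the function name is illustrative.

```python
def venn_summary(total_union, unique_cpll, unique_non_cpll):
    """Derive the shared and per-sample totals of a two-set Venn diagram."""
    shared = total_union - unique_cpll - unique_non_cpll
    return {
        "shared": shared,
        "cpll_total": unique_cpll + shared,
        "non_cpll_total": unique_non_cpll + shared,
        "pct_unique_cpll": 100 * unique_cpll / total_union,
        "pct_unique_non_cpll": 100 * unique_non_cpll / total_union,
    }


orbitrap = venn_summary(total_union=166, unique_cpll=101, unique_non_cpll=24)
velos_pro = venn_summary(total_union=135, unique_cpll=74, unique_non_cpll=20)
print(orbitrap)
print(velos_pro)
```

Both instruments give 41 shared protein groups, and the derived totals match the reported 142 versus 65 (Orbitrap) and 115 versus 61 (Velos Pro) protein groups. For the Orbitrap data, about 61% of the groups are unique to the CPLL treated sample and about 14% to the non-CPLL sample.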

Figure 2.2 Number of identified BALF proteins and peptides with and without CPLL. Histograms show the number of a) protein groups and b) peptides identified by Orbitrap or Velos Pro mass spectrometer. 20:1, 8:1, 4:1 and 2:1 represent the samples after CPLL treatment with different BALF/bead slurry volume ratios. About 0.5 μg of protein was loaded on the LC column.

Figure 2.3 Venn diagrams of BALF run on the Velos Pro, independently treated with and without CPLL. Captured proteins were eluted and digested for LC-MS/MS on the Velos Pro mass spectrometer. The search was performed using Sequest with two peptide hits at high confidence. a) Number of protein groups b) Number of peptides. 20:1 and 8:1 represent the samples after CPLL treatment with different BALF/bead slurry volume ratios. About 0.5 μg of protein was loaded on the LC column.

Figure 2.4 Venn diagrams of BALF run on the Orbitrap, independently treated with and without CPLL. Captured proteins were eluted and digested for LC/LC-MS/MS on the Orbitrap mass spectrometer. The search was performed using Sequest with two peptide hits at high confidence. a) Number of protein groups b) Number of peptides. 20:1 and 8:1 represent the samples after CPLL treatment with different BALF/bead slurry volume ratios. About 0.5 μg of protein was loaded on the LC column.

2.2.3. Gel Results for Aspergillus Fumigatus and Catalase Spiked in BALF

To demonstrate the capability of CPLL to increase the number of proteins detected, comparative studies were done with Aspergillus fumigatus and catalase separately spiked in BALF. Figure 2.5 (a) shows the eluates of six BALF samples spiked with different amounts of Aspergillus fumigatus before CPLL treatment. Approximately 0.9 μg of protein was loaded in each lane. A similar protein profile can be observed across the gel for all six spiked samples, indicating an effective normalization of the protein concentration range for BALF spiked with Aspergillus fumigatus. Figure 2.5 (b) shows the eluates of four BALF samples spiked with different amounts of catalase before CPLL treatment. Approximately 0.9 μg of protein was loaded in each lane. The catalase spiked BALF seems to have fewer proteins on the gel than the Aspergillus fumigatus spiked BALF.

Figure 2.5 SDS-PAGE of BALF spiked with different amounts of Aspergillus fumigatus or catalase before CPLL treatment. The sample/bead slurry volume ratio used was 20:1. a) SDS-PAGE of Aspergillus fumigatus spiked in BALF before CPLL. Lane 7 is the molecular-weight standard; Lanes 1 to 6 are BALF from patient 2 with a final Aspergillus fumigatus spiked concentration of 0.2 pM, 0.02 nM, 2 nM, 0.02 μM, 0.2 μM, and 2 μM, respectively. About 0.9 μg of protein was loaded for each lane. b) SDS-PAGE of catalase spiked in BALF before CPLL. Lane 7 is the molecular-weight standard; Lanes 1 to 6 are BALF from patient 2 with a catalase spiked concentration of 0.01 nM, 1 nM, 0.01 μM, and 0.1 μM, respectively. About 0.9 μg of protein was loaded for each lane.

2.2.4. MS Results for Aspergillus Fumigatus and Catalase Spiked in BALF

The eluates from BALF spiked with Aspergillus fumigatus were analyzed by LC/LC-MS/MS on the Orbitrap mass spectrometer (Thermo Scientific). The Aspergillus fumigatus and BALF protein groups identified for the different spiked quantities are shown in Figure 2.6. The number of identified Aspergillus fumigatus proteins was four at the 0.2 μM spiked quantity and increased to 23 for the highest spike. This result indicates that as the Aspergillus fumigatus protein concentration gets higher, the reduction of the dynamic range of Aspergillus fumigatus proteins by CPLL becomes possible. This is reasonable, since at low concentration it would be hard for Aspergillus fumigatus proteins to compete for binding sites with high-abundance BALF proteins.

Figure 2.6 Number of identified proteins for BALF spiked with different amounts of Aspergillus fumigatus. Histograms show the number of protein groups identified for Aspergillus fumigatus and BALF from tryptic peptides by Orbitrap mass spectrometer with an Aspergillus fumigatus spike concentration of 0.2 pM, 0.02 nM, 2 nM, 0.02 μM, 0.2 μM and 2 μM in BALF, respectively.

The top twenty-four BALF proteins identified on the Orbitrap mass spectrometer from the initial BALF of patient 2 without CPLL treatment are summarized in Table 2.2. The numbers represent normalized spectral counts for the initial BALF, BALF with CPLL treatment, and BALF spiked with Aspergillus fumigatus from low to high concentration (0.2 pM, 0.02 nM, 2 nM, 0.02 μM, 0.2 μM and 2 μM, spike 1 to spike 6) before CPLL treatment. Normalized PSMs are color coded, with red representing larger PSMs and white representing fewer PSMs. The whole list of identified proteins can be found in Appendix A. The BALF proteins identified here overlap substantially with those found previously in various BALF samples by a senior group member, Chengsi Huang. In Table 2.2, comparing the first column to the other columns shows that, after CPLL treatment, certain high-abundance proteins in BALF, such as human serum albumin, experienced a great reduction in spectral counts, from 4176 to about 550 on average. For some mid-abundance proteins, the spectral counts increased, indicating an effective concentration of those proteins on the ligand library. Other mid-abundance proteins were reduced in terms of spectral counts, making their concentration less significant. Low-abundance proteins in the initial BALF experienced a general increase in spectral counts after CPLL, allowing them to be detected. Some mid-abundance proteins, such as apolipoprotein A-I and complement C4 beta chain, were enriched by a significant amount across all the spiked quantities, and some proteins, such as inter-alpha-trypsin inhibitor heavy chain H2 and alpha-actinin-4, that were not initially detected in BALF were enriched after CPLL for all spiked quantities. Together, these observations indicate that CPLL does have the ability to change the protein profile and that the results for some proteins are reproducible. However, there were discrepancies for some proteins that would affect the reproducibility of the CPLL treatment. Some proteins, such as IGK@ protein, were present in certain samples but completely absent in others, even though this was the same sample with the same treatment (apart from the spike levels). 84% of the total identified protein groups are unique to CPLL treated samples. One thing to note is that there was also some loss of proteins due to the CPLL treatment as compared to the initial sample.
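The size of these changes can be estimated directly from the normalized spectral counts in Table 2.2. The sketch below averages the seven CPLL treated columns (BALF with CPLL and spikes 1 to 6) for three proteins and divides by the count in the initial BALF; this fold-change measure is an illustrative choice and not a calculation reported in the thesis.

```python
from statistics import mean

# Normalized spectral counts copied from Table 2.2:
# (initial BALF, [BALF with CPLL, spike 1 ... spike 6 with CPLL])
TABLE_2_2 = {
    "Serum albumin": (4176, [514, 486, 490, 505, 731, 546, 555]),
    "Complement C3": (62, [412, 327, 352, 481, 389, 329, 382]),
    "Apolipoprotein A-I": (34, [398, 155, 203, 242, 417, 187, 224]),
}


def fold_change(initial, treated):
    """Mean treated count divided by the initial count (>1 means enrichment)."""
    return mean(treated) / initial


changes = {
    name: fold_change(initial, treated)
    for name, (initial, treated) in TABLE_2_2.items()
}
albumin_mean_after_cpll = mean(TABLE_2_2["Serum albumin"][1])

for name, change in changes.items():
    print(f"{name}: {change:.2f}x")
```

The mean albumin count after CPLL is about 547, consistent with the value of about 550 quoted above, while complement C3 and apolipoprotein A-I increase by more than six-fold on this measure.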

Among the top twenty-four most abundant proteins in BALF, five proteins were significantly enriched by CPLL treatment: complement C3, apolipoprotein A-I, putative uncharacterized protein, cDNA FLJ76826, and vitamin D-binding protein. Although they have different molecular weights (187.0, 30.8, 40.5, 122.1, and 52.9 kDa, respectively), they are all weakly acidic proteins with pI values of 5-6. Vitamin D-binding protein, alpha-1-antitrypsin, transferrin, alpha-1-acid glycoprotein, hemoglobin subunit beta, complement C3, hemoglobin alpha, alpha-1-antichymotrypsin, transthyretin, , and complement C4-A in the top twenty-four were shown to be associated with albumin (R.L. Gundry, et al., 2007).36

Table 2.2 Top twenty-four proteins identified from initial BALF without CPLL treatment. Normalized spectral counts for the initial BALF without CPLL, BALF with CPLL, and BALF with different Aspergillus fumigatus spiked quantities from low to high (0.2 pM, 0.02 nM, 2 nM, 0.02 μM, 0.2 μM and 2 μM, spike 1 to spike 6) before CPLL are shown. Data distribution is shown by color scales, with the shade of the color representing higher or lower values. Higher normalized spectral count cells have a more red color. Higher molecular weight cells have a more blue color. Higher pI cells have a more green color. The search was performed using Thermo Proteome Discoverer (version 1.4) software. The molecular weight and calculated pI were also obtained from the above software.

Columns "Initial" to "Spk 6" are # PSM: initial BALF, BALF with CPLL, and spikes 1 to 6 with CPLL.

Accession Initial  CPLL  Spk 1  Spk 2  Spk 3  Spk 4  Spk 5  Spk 6  MW [kDa]  calc. pI  Description
P02768       4176   514    486    490    505    731    546    555      69.3      6.28  Serum albumin [ALBU_HUMAN]
P01009        406    36     86     60    104     69     52     76      46.7      5.59  Alpha-1-antitrypsin [A1AT_HUMAN]
Q6MZV7        370   322    186    152    260    185    206    193      52.1      7.58  Putative uncharacterized protein DKFZp686C11235 [Q6MZV7_HUMAN]
Q53H26        170     0      0      0      0      0      0      7      77.0      7.03  Transferrin variant (Fragment) [Q53H26_HUMAN]
P00738        138     0     27      9     14     23      8      8      45.2      6.58  Haptoglobin [HPT_HUMAN]
Q6PJF2        124     0      0      0    104      0      0    108      25.5      6.55  IGK@ protein [Q6PJF2_HUMAN]
P01860         97    49     64     56     86     72    111     82      41.3      7.90  Ig gamma-3 chain C region [IGHG3_HUMAN]
P01023         90    69     81    100     80    100    111    118     163.2      6.46  Alpha-2-macroglobulin [A2MG_HUMAN]
P02763         90     0      6      3      5      7      2      0      23.5      5.02  Alpha-1-acid glycoprotein 1 [A1AG1_HUMAN]
C6KXN3         76    52      0     40     49     47     38     53      24.7      5.54  Lambda light chain of human immunoglobulin surface antigen-related protein (Fragment) [C6KXN3_HUMAN]
Q8NEJ1         68     0     54     34     43      0     36     46      25.0      7.69  Uncharacterized protein [Q8NEJ1_HUMAN]
P68871         68     0      9      4      8      4      2      4      16.0      7.28  Hemoglobin subunit beta [HBB_HUMAN]
P01024         62   412    327    352    481    389    329    382     187.0      6.40  Complement C3 [CO3_HUMAN]
Q6N093         62     0      0     51      0      0      0      0      46.0      7.59  Putative uncharacterized protein DKFZp686I04196 (Fragment) [Q6N093_HUMAN]
Q8WVW5         51   614    207    160    164    202    188    165      40.5      6.14  Putative uncharacterized protein (Fragment) [Q8WVW5_HUMAN]
E9M4D4         50     0      3      2      2      0      0      0      10.8      8.48  Hemoglobin alpha-1 globin chain (Fragment) [E9M4D4_HUMAN]
P01011         47     7     18     19     16     20     16     21      47.6      5.52  Alpha-1-antichymotrypsin [AACT_HUMAN]
P02647         34   398    155    203    242    417    187    224      30.8      5.76  Apolipoprotein A-I [APOA1_HUMAN]
P02766         33    71     25     46     30     38     35     46      15.9      5.76  Transthyretin [TTHY_HUMAN]
Q96K68         32    43     48     42     46     45     40     54      53.1      6.86  cDNA FLJ14473 fis, clone MAMMA1001080, highly similar to Homo sapiens SNC73 protein (SNC73) mRNA [Q96K68_HUMAN]
A8K5A4         28   155    219    191    248    163    185    209     122.1      5.74  cDNA FLJ76826, highly similar to Homo sapiens ceruloplasmin (ferroxidase) (CP), mRNA [A8K5A4_HUMAN]
P02774         24   351    438    238    304    214    189    193      52.9      5.54  Vitamin D-binding protein [VTDB_HUMAN]
P02790         23     0      2      0      0      4      0      0      51.6      7.02  Hemopexin [HEMO_HUMAN]
P0C0L4         22   372      0      0      0      0      0      0     192.7      7.08  Complement C4-A [CO4A_HUMAN]

In the spike experiment, all Aspergillus fumigatus proteins identified by LC/LC-MS/MS are shown in Table 2.3, with the numbers representing spectral counts. The initial A. fumigatus data without CPLL were obtained by another group member, Yun Zhang, using in-solution digestion with a four hour LC gradient on the Velos Pro mass spectrometer. #PSM Spk6 and #PSM Spk5 are the numbers of spectral counts for identified Aspergillus fumigatus proteins when the spike concentrations were 2 μM and 0.2 μM, respectively (see Table 2.1). Identified peptides from the spike experiments were searched with the Basic Local Alignment Search Tool (BLAST) for homologous peptide sequences. There were 98 Aspergillus fumigatus proteins identified in the antigen. When Aspergillus fumigatus was spiked into BALF at concentrations of 0.2 μM and 2 μM, 3 and 24 Aspergillus fumigatus proteins were identified, respectively. The concentration of BALF protein was 12 μM, and the final Aspergillus fumigatus concentrations at spike levels 5 and 6 were 0.2 μM and 2 μM, respectively. Therefore, in this particular experiment, the lowest spike concentration of Aspergillus fumigatus that could be detected lies in the high-abundance protein concentration range of BALF. However, this may not be the limit of detection of Aspergillus fumigatus proteins in BALF, since repeated experiments need to be performed to show the reproducibility of the results.
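The way the detected-protein counts are read off the spectral-count columns of Table 2.3 can be illustrated with a minimal Python sketch. This is not part of the original analysis: it assumes that a protein counts as detected when its PSM value is at least 1, it uses only five illustrative rows copied from Table 2.3 rather than the full table, and the names table_2_3_subset, count_detected, and lowest_detected_level are hypothetical.

```python
# Illustrative subset of Table 2.3 (not the full table):
# (description, PSM initial antigen, PSM spike 6 at 2 uM, PSM spike 5 at 0.2 uM)
table_2_3_subset = [
    ("Catalase B", 165, 45, 6),
    ("alpha-mannosidase", 84, 34, 5),
    ("Probable alpha/beta-glucosidase agdC", 79, 28, 0),
    ("major allergen Asp F2", 29, 16, 0),
    ("class V chitinase ChiB1", 110, 0, 0),
]

SPIKE6, SPIKE5 = 2, 3  # column positions within each row tuple


def count_detected(rows, column, min_psm=1):
    """Count proteins whose spectral count in `column` is at least `min_psm`.

    The threshold of 1 PSM is an assumption made for this sketch.
    """
    return sum(1 for row in rows if row[column] >= min_psm)


def lowest_detected_level(rows, levels):
    """Return the lowest spike concentration (uM) with any detected protein.

    `levels` maps a column position to its spike concentration in uM.
    Returns None when nothing was detected at any level.
    """
    detected = [conc for col, conc in levels.items()
                if count_detected(rows, col) > 0]
    return min(detected) if detected else None


print(count_detected(table_2_3_subset, SPIKE6))
print(count_detected(table_2_3_subset, SPIKE5))
print(lowest_detected_level(table_2_3_subset, {SPIKE6: 2.0, SPIKE5: 0.2}))
```

Applied to all rows of Table 2.3, the same counting would give the number of Aspergillus fumigatus proteins detected at each spike level; a stricter criterion, such as requiring at least two peptide hits, would need peptide-level data that the table does not list.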


Table 2.3 Identified Aspergillus fumigatus protein groups from two BALF samples spiked with different amounts of Aspergillus fumigatus (0.2 μM and 2 μM). The 109 initial Aspergillus fumigatus proteins were identified by another group member, Yun Zhang, using in-solution digestion and a 4 hour LC gradient on the Velos Pro mass spectrometer. In total, 24 Aspergillus fumigatus proteins were identified from two of the six spiked quantities (0.2 pM, 0.02 nM, 2 nM, 0.02 μM, 0.2 μM, and 2 μM from spike 1 to spike 6, respectively). Data distribution is shown by color scales, with the shade of the color representing higher or lower values. Cells with higher normalized spectral counts are more red. Cells with higher molecular weights are more blue. Cells with higher pI values are more green. The search was performed using Thermo Proteome Discoverer (version 1.4) software. The molecular weight and calculated pI were also obtained from the above software.

Accession | Description | # PSM Initial Aspergillus fumigatus without CPLL | # PSM Spk 6 with CPLL | # PSM Spk 5 with CPLL | MW [kDa] | calc. pI
P02768 | RecName: Full=Catalase B; AltName: Full=Antigenic catalase | 165 | 45 | 6 | 79.9 | 5.82
P01009 | alpha-mannosidase [Aspergillus fumigatus Af293] | 84 | 34 | 5 | 123.7 | 6.27
Q6MZV7 | RecName: Full=Probable alpha/beta-glucosidase agdC; Flags: Precursor | 79 | 28 | 0 | 98.8 | 6.42
Q53H26 | major allergen Asp F2 [Aspergillus fumigatus Af293] | 29 | 16 | 0 | 32.2 | 5.69
P00738 | RecName: Full=Probable beta-glucosidase A | 100 | 13 | 0 | 94.7 | 5.19
Q6PJF2 | conserved hypothetical protein [Aspergillus fumigatus Af293] | 23 | 13 | 2 | 37.2 | 5.49
P01860 | family protein [Aspergillus fumigatus Af293] | 32 | 11 | 0 | 66.2 | 5.01
P01023 | GtaA [Aspergillus fumigatus Af293] | 31 | 6 | 0 | 76.1 | 4.88
P02763 | aspartyl aminopeptidase [Aspergillus fumigatus Af293] | 29 | 5 | 0 | 54.9 | 6.60
C6KXN3 | dihydrolipoamide dehydrogenase [Aspergillus fumigatus Af293] | 26 | 4 | 0 | 54.9 | 8.18
Q8NEJ1 | [Aspergillus fumigatus Af293] | 8 | 4 | 0 | 45.2 | 5.81
P68871 | ABC transporter (Adp1) [Aspergillus fumigatus Af293] | 29 | 3 | 0 | 119.9 | 6.44
P01024 | RecName: Full=Probable carboxypeptidase AFUA_6G06800 | 29 | 3 | 0 | 46.3 | 5.52
Q6N093 | RecName: Full=Probable beta-glucosidase F | 107 | 3 | 0 | 92.9 | 5.97
Q8WVW5 | glucan 1,4-alpha-glucosidase [Aspergillus fumigatus Af293] | 115 | 3 | 0 | 67.1 | 5.21
E9M4D4 | conserved hypothetical protein [Aspergillus fumigatus Af293] | 9 | 2 | 0 | 20.3 | 5.20
P01011 | vacuolar carboxypeptidase Cps1, putative [Aspergillus fumigatus Af293] | 6 | 2 | 0 | 62.0 | 5.21
P02647 | RecName: Full=Dipeptidyl-peptidase 5 | 149 | 2 | 0 | 79.7 | 5.90
P02766 | RecName: Full=Asp-hemolysin; Short=Asp-HS; Flags: Precursor | 623 | 2 | 0 | 15.2 | 5.53
P01876 | RecName: Full=Probable glucan 1,3-beta-glucosidase A | 14 | 2 | 0 | 45.7 | 4.72
A8K5A4 | alkaline phosphatase Pho8 [Aspergillus fumigatus Af293] | 70 | 2 | 0 | 66.3 | 5.27
P02774 | ABC metal ion transporter, putative [Aspergillus fumigatus Af293] | 0 | 1 | 0 | 171.6 | 8.02
P02790 | RecName: Full=Vacuolar protease A | 14 | 1 | 0 | 43.3 | 5.00
Accession | [Aspergillus fumigatus Af293] | 5 | 1 | 0 | 16.0 | 7.24
P02768 | class V chitinase ChiB1 [Aspergillus fumigatus Af293] | 110 | 0 | 0 | 47.6 | 5.26
P01009 | RecName: Full=1,3-beta-glucanosyltransferase gel1 | 5 | 0 | 0 | 48.0 | 5.19
Q6MZV7 | glutathione Glr1, putative [Aspergillus fumigatus Af293] | 10 | 0 | 0 | 51.3 | 7.03
Q53H26 | 2-methylcitrate , putative [Aspergillus fumigatus Af293] | 4 | 0 | 0 | 62.1 | 8.37
P00738 | transaldolase [Aspergillus fumigatus Af293] | 4 | 0 | 0 | 35.4 | 6.44
Q6PJF2 | RecName: Full=Probable beta-glucosidase M | 26 | 0 | 0 | 82.6 | 5.34
P01860 | alpha-amylase [Aspergillus fumigatus Af293] | 26 | 0 | 0 | 68.5 | 4.92
P01023 | Ser/Thr protein phosphatase family [Aspergillus fumigatus Af293] | 4 | 0 | 0 | 71.3 | 5.83
P02763 | glycosyl [Aspergillus fumigatus Af293] | 4 | 0 | 0 | 75.3 | 5.55
C6KXN3 | RecName: Full=Alkaline protease 1; Short=ALP | 33 | 0 | 0 | 42.2 | 6.81
Q8NEJ1 | RecName: Full=Probable mannosyl-oligosaccharide alpha-1,2-mannosidase 1B | 114 | 0 | 0 | 53.8 | 5.27
P68871 | conserved hypothetical protein [Aspergillus fumigatus Af293] | 17 | 0 | 0 | 99.4 | 6.27
P01024 | RecName: Full=Probable Xaa-Pro aminopeptidase pepP | 17 | 0 | 0 | 51.9 | 5.73
Q6N093 | aminopeptidase [Aspergillus fumigatus Af293] | 40 | 0 | 0 | 56.7 | 5.14
Q8WVW5 | vacuolar aspartyl aminopeptidase Lap4, putative [Aspergillus fumigatus Af293] | 37 | 0 | 0 | 56.0 | 7.08
E9M4D4 | conserved hypothetical protein [Aspergillus fumigatus Af293] | 19 | 0 | 0 | 26.0 | 4.54
P01011 | endoglucanase, putative [Aspergillus fumigatus Af293] | 3 | 0 | 0 | 31.6 | 8.27
P02647 | GPI anchored protein, putative [Aspergillus fumigatus Af293] | 4 | 0 | 0 | 58.5 | 5.54
P02766 | NAD-dependent formate dehydrogenase AciA/Fdh [Aspergillus fumigatus Af293] | 18 | 0 | 0 | 45.7 | 8.27
P01876 | exo-beta-1,3-glucanase, putative [Aspergillus fumigatus Af293] | 5 | 0 | 0 | 86.7 | 5.34
A8K5A4 | delta-1-pyrroline-5-carboxylate dehydrogenase PrnC [Aspergillus fumigatus Af293] | 15 | 0 | 0 | 63.0 | 8.18
P02774 | exo-beta-1,3-glucanase [Aspergillus fumigatus Af293] | 4 | 0 | 0 | 82.5 | 5.85
P02790 | RecName: Full=Phosphatidylglycerol/phosphatidylinositol transfer protein | 5 | 0 | 0 | 19.1 | 5.19
Accession | GMC oxidoreductase, putative [Aspergillus fumigatus Af293] | 29 | 0 | 0 | 67.6 | 5.39
P02768 | RecName: Full=Allergen Asp f 15 | 70 | 0 | 0 | 15.9 | 4.75
P01009 | class V chitinase, putative [Aspergillus fumigatus Af293] | 65 | 0 | 0 | 49.1 | 5.44
Q6MZV7 | exo-beta-1,3-glucanase Exg0 [Aspergillus fumigatus Af293] | 63 | 0 | 0 | 100.6 | 5.39
Q53H26 | RecName: Full=Alkaline protease 2; Short=ALP2 | 18 | 0 | 0 | 52.6 | 6.25
P00738 | G-protein complex beta subunit CpcB [Aspergillus fumigatus Af293] | 10 | 0 | 0 | 35.0 | 6.52
Q6PJF2 | RecName: Full=Catalase-peroxidase; Short=CP | 13 | 0 | 0 | 83.7 | 6.58
P01860 | RecName: Full=Protein ecm33; Flags: Precursor | 33 | 0 | 0 | 41.5 | 4.98
P01023 | RecName: Full=Lysophospholipase 1 | 4 | 0 | 0 | 68.1 | 4.77
P02763 | alpha-amylase [Aspergillus fumigatus Af293] | 22 | 0 | 0 | 53.8 | 4.94
C6KXN3 | alpha-glucosidase AgdA [Aspergillus fumigatus Af293] | 38 | 0 | 0 | 108.4 | 5.43
Q8NEJ1 | RecName: Full=; AltName: Full=2-phospho-D-glycerate hydro- | 3 | 0 | 0 | 47.3 | 5.58
P68871 | RecName: Full=Extracellular metalloproteinase mep | 8 | 0 | 0 | 68.7 | 5.41
P01024 | beta-N-acetylhexosaminidase NagA [Aspergillus fumigatus Af293] | 28 | 0 | 0 | 67.4 | 5.97
Q6N093 | RecName: Full=Probable dipeptidyl peptidase 4 | 55 | 0 | 0 | 85.8 | 5.81
Q8WVW5 | endo-1,3-beta-glucanase Engl1 [Aspergillus fumigatus Af293] | 20 | 0 | 0 | 104.9 | 5.91
E9M4D4 | exo-beta-1,3-glucanase [Aspergillus fumigatus Af293] | 70 | 0 | 0 | 84.1 | 5.22
P01011 | endonuclease/exonuclease/phosphatase family protein [Aspergillus fumigatus Af293] | 40 | 0 | 0 | 33.8 | 6.11
P02647 | RecName: Full=Mannitol-1-phosphate 5-dehydrogenase | 6 | 0 | 0 | 43.0 | 5.90
P02766 | conserved hypothetical protein [Aspergillus fumigatus Af293] | 56 | 0 | 0 | 42.2 | 6.05
P01876 | amidase, putative [Aspergillus fumigatus Af293] | 17 | 0 | 0 | 65.0 | 6.95
A8K5A4 | RecName: Full=Nucleoside diphosphate kinase; Short=NDK; Short=NDP kinase | 29 | 0 | 0 | 16.9 | 7.97
P02774 | conserved hypothetical protein [Aspergillus fumigatus Af293] | 29 | 0 | 0 | 29.4 | 5.44
P02790 | conserved hypothetical protein [Aspergillus fumigatus Af293] | 8 | 0 | 0 | 17.1 | 5.10
Accession | ribose 5-phosphate A [Aspergillus fumigatus Af293] | 5 | 0 | 0 | 28.4 | 6.25
P02768 | RecName: Full=Probable pectate lyase D; Flags: Precursor | 22 | 0 | 0 | 25.4 | 4.39
P01009 | malate dehydrogenase, NAD-dependent [Aspergillus fumigatus Af293] | 8 | 0 | 0 | 35.9 | 9.09
Q6MZV7 | RecName: Full=Uncharacterized protein AFUA_6G02800; Flags: Precursor | 4 | 0 | 0 | 24.6 | 4.77
Q53H26 | acid phosphatase [Aspergillus fumigatus Af293] | 10 | 0 | 0 | 30.4 | 4.88
P00738 | hydrolase, putative [Aspergillus fumigatus Af293] | 14 | 0 | 0 | 72.1 | 5.64
Q6PJF2 | RecName: Full=Probable leucine aminopeptidase 2 | 5 | 0 | 0 | 54.2 | 5.87
P01860 | FAD-dependent oxygenase [Aspergillus fumigatus Af293] | 8 | 0 | 0 | 55.0 | 6.99
P01023 | phosphoglycerate mutase, 2,3-bisphosphoglycerate-independent [Aspergillus fumigatus Af293] | 10 | 0 | 0 | 57.4 | 5.67
P02763 | ubiquinol-cytochrome c reductase iron-sulfur subunit precursor [Aspergillus fumigatus] | 5 | 0 | 0 | 32.6 | 8.95
C6KXN3 | thioredoxin reductase, putative [Aspergillus fumigatus Af293] | 24 | 0 | 0 | 42.8 | 7.49
Q8NEJ1 | extracellular lipase, putative [Aspergillus fumigatus Af293] | 24 | 0 | 0 | 31.4 | 5.97
P68871 | conserved hypothetical protein [Aspergillus fumigatus Af293] | 3 | 0 | 0 | 22.0 | 5.15
P01024 | conserved hypothetical protein [Aspergillus fumigatus Af293] | 14 | 0 | 0 | 23.4 | 4.78
Q6N093 | lactoylglutathione lyase [Aspergillus fumigatus Af293] | 5 | 0 | 0 | 36.5 | 6.39
Q8WVW5 | conserved hypothetical protein [Aspergillus fumigatus Af293] | 17 | 0 | 0 | 26.2 | 5.73
E9M4D4 | DUF1237 domain protein [Aspergillus fumigatus Af293] | 5 | 0 | 0 | 58.9 | 6.33
P01011 | extracellular serine-rich protein, putative [Aspergillus fumigatus Af293] | 43 | 0 | 0 | 85.4 | 5.02
P02647 | endonuclease/exonuclease/phosphatase family protein [Aspergillus fumigatus Af293] | 26 | 0 | 0 | 64.1 | 4.87
P02766 | acyl-CoA:6-aminopenicillanic-acid-acyltransferase, putative [Aspergillus fumigatus Af293] | 5 | 0 | 0 | 35.0 | 5.72
P01876 | aldose 1-epimerase, putative [Aspergillus fumigatus Af293] | 17 | 0 | 0 | 50.7 | 5.99
A8K5A4 | BYS1 domain protein, putative [Aspergillus fumigatus Af293] | 26 | 0 | 0 | 16.0 | 5.08
P02774 | extracelular serine carboxypeptidase, putative [Aspergillus fumigatus Af293] | 13 | 0 | 0 | 64.2 | 5.58
P02790 | RecName: Full=Cyanate hydratase; Short=Cyanase; AltName: Full=Cyanate hydrolase; | 6 | 0 | 0 | 17.1 | 6.80
Accession | ribonuclease T2, putative [Aspergillus fumigatus Af293] | 3 | 0 | 0 | 29.3 | 5.15
P02768 | RecName: Full=Probable glycosidase crf1 | 40 | 0 | 0 | 40.3 | 4.78
P01009 | RecName: Full=Probable beta-glucosidase L | 97 | 0 | 0 | 78.3 | 6.00
Q6MZV7 | conserved hypothetical protein [Aspergillus fumigatus Af293] | 5 | 0 | 0 | 36.6 | 7.12
Q53H26 | alpha glucosidase II, alpha subunit, putative [Aspergillus fumigatus Af293] | 15 | 0 | 0 | 109.6 | 6.52
P00738 | alpha,alpha-trehalose glucohydrolase TreA/Ath1 [Aspergillus fumigatus Af293] | 6 | 0 | 0 | 116.9 | 5.58
Q6PJF2 | RecName: Full=Probable glucan endo-1,3-beta-glucosidase eglC | 68 | 0 | 0 | 44.6 | 5.07
P01860 | alpha-1,3-glucanase/mutanase [Aspergillus fumigatus Af293] | 27 | 0 | 0 | 54.0 | 5.21
P01023 | alcohol dehydrogenase [Aspergillus fumigatus Af293] | 19 | 0 | 0 | 37.5 | 7.40
P02763 | Cupin domain protein [Aspergillus fumigatus Af293] | 18 | 0 | 0 | 21.7 | 7.93
C6KXN3 | RecName: Full=Probable pectate lyase A; Flags: Precursor | 63 | 0 | 0 | 33.8 | 6.74
Q8NEJ1 | FG-GAP repeat protein [Aspergillus fumigatus Af293] | 70 | 0 | 0 | 33.7 | 5.86
P68871 | RecName: Full=Superoxide dismutase [Cu-Zn] | 34 | 0 | 0 | 16.0 | 6.52
P01024 | Chain B, Structure Of Elastase Inhibitor Afuei (crystal Form Ii) | 15 | 0 | 0 | 7.5 | 4.72


For BALF spiked with catalase before CPLL treatment, catalase was identified at spike levels 3 and 4, with final catalase concentrations of 0.01 μM and 0.1 μM, respectively. From spike level 1 to 4 (spike concentrations of 0.01 nM, 1 nM, 0.01 μM, and 0.1 μM, respectively), the numbers of identified protein groups were 158, 166, 149, and 136. These are significantly fewer protein groups than were identified in the Aspergillus fumigatus spiked BALF. Table 2.4 lists the top twenty-four BALF proteins identified from the catalase spiked BALF samples. The whole protein list can be found in Appendix B. Similar to the Aspergillus fumigatus spike results, most high-abundance proteins were reduced while some mid- to low-abundance proteins were enriched. The PSMs at the different spike levels show that the identified BALF proteins are reproducible. 70% of the total identified BALF proteins are unique to the CPLL treated samples.

Among the top twenty-four most abundant proteins in BALF, the proteins that were significantly enriched by CPLL treatment have various molecular weights (187.0, 40.5, 30.8, 122.1, and 52.9 kDa), but they are all weakly acidic proteins with pI values of 5-6. This result agrees well with what was shown earlier with the Aspergillus fumigatus spiked BALF.
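The degree of enrichment or reduction can be expressed as the ratio of normalized spectral counts with and without CPLL treatment. The following is a minimal Python sketch under stated assumptions: the PSM pairs are copied from the "BALF without CPLL" and "BALF with CPLL" columns of Table 2.4, the function name fold_change and the shortened protein labels are hypothetical, and a simple ratio is only one possible way to quantify the change.

```python
# (label, PSM without CPLL, PSM with CPLL), values copied from Table 2.4
psm_pairs = [
    ("Serum albumin (P02768)", 3743, 514),
    ("Complement C3 (P01024)", 56, 412),
    ("Putative uncharacterized protein (Q8WVW5)", 45, 614),
    ("Apolipoprotein A-I (P02647)", 31, 398),
    ("cDNA FLJ76826, ceruloplasmin-like (A8K5A4)", 25, 155),
    ("Vitamin D-binding protein (P02774)", 22, 351),
]


def fold_change(before, after):
    """Ratio of spectral counts after vs. before CPLL treatment.

    Returns None when the protein was not seen before treatment,
    because the ratio is undefined in that case.
    """
    if before == 0:
        return None
    return after / before


for label, before, after in psm_pairs:
    ratio = fold_change(before, after)
    if ratio is None:
        status = "detected only after CPLL"
    elif ratio > 1:
        status = f"enriched {ratio:.1f}-fold"
    else:
        status = f"reduced to {ratio:.2f} of the untreated count"
    print(f"{label}: {status}")
```

With these values, serum albumin drops to roughly one seventh of its untreated spectral count, while the five enriched proteins increase by about 6-fold to 16-fold.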


Table 2.4 Top twenty-four proteins identified from BALF without CPLL treatment. Normalized spectral counts for BALF with no treatment, BALF with CPLL treatment, and BALF spiked with different concentrations of catalase (0.01 nM, 1 nM, 0.01 μM, and 0.1 μM from spike 1 to spike 4, respectively) before CPLL are shown. Spectral counts were normalized against the smallest PSM of the BALF with CPLL treatment sample. Data distribution is shown by color scales, with the shade of the color representing higher or lower values. Cells with higher normalized spectral counts are more red. Cells with higher molecular weights are more blue. Cells with higher pI values are more green. The search was performed using Thermo Proteome Discoverer (version 1.4) software. The molecular weight and calculated pI were also obtained from the above software.

Accession | Description | # PSM BALF without CPLL | # PSM BALF with CPLL | # PSM Spk 1 with CPLL | # PSM Spk 2 with CPLL | # PSM Spk 3 with CPLL | # PSM Spk 4 with CPLL | MW [kDa] | calc. pI
P02768 | Serum albumin [ALBU_HUMAN] | 3743 | 514 | 361 | 319 | 418 | 497 | 69.3 | 6.28
P01009 | Alpha-1-antitrypsin [A1AT_HUMAN] | 364 | 36 | 53 | 54 | 67 | 64 | 46.7 | 5.59
Q6MZV7 | Putative uncharacterized protein DKFZp686C11235 [Q6MZV7_HUMAN] | 332 | 322 | 214 | 141 | 191 | 153 | 52.1 | 7.58
Q53H26 | Transferrin variant (Fragment) [Q53H26_HUMAN] | 152 | 0 | 0 | 0 | 6 | 0 | 77.0 | 7.03
P00738 | Haptoglobin [HPT_HUMAN] | 124 | 0 | 0 | 2 | 7 | 3 | 45.2 | 6.58
Q6PJF2 | IGK@ protein [Q6PJF2_HUMAN] | 111 | 0 | 0 | 0 | 0 | 0 | 25.5 | 6.55
P01860 | Ig gamma-3 chain C region [IGHG3_HUMAN] | 87 | 49 | 0 | 42 | 0 | 0 | 41.3 | 7.90
P01023 | Alpha-2-macroglobulin [A2MG_HUMAN] | 81 | 69 | 99 | 83 | 111 | 131 | 163.2 | 6.46
P02763 | Alpha-1-acid glycoprotein 1 [A1AG1_HUMAN] | 81 | 0 | 0 | 0 | 0 | 0 | 23.5 | 5.02
C6KXN3 | Lambda light chain of human immunoglobulin surface antigen-related protein (Fragment) rG PE=1 SV=1 - [C6KXN3_HUMAN] | 68 | 52 | 58 | 57 | 56 | 62 | 24.7 | 5.54
Q8NEJ1 | Uncharacterized protein [Q8NEJ1_HUMAN] | 61 | 0 | 0 | 0 | 0 | 0 | 25.0 | 7.69
P68871 | Hemoglobin subunit beta [HBB_HUMAN] | 61 | 0 | 0 | 0 | 0 | 0 | 16.0 | 7.28
P01024 | Complement C3 [CO3_HUMAN] | 56 | 412 | 285 | 189 | 244 | 237 | 187.0 | 6.40
Q6N093 | Putative uncharacterized protein DKFZp686I04196 (Fragment) [Q6N093_HUMAN] | 56 | 0 | 0 | 0 | 0 | 0 | 46.0 | 7.59
Q8WVW5 | Putative uncharacterized protein (Fragment) [Q8WVW5_HUMAN] | 45 | 614 | 354 | 227 | 212 | 213 | 40.5 | 6.14
E9M4D4 | Hemoglobin alpha-1 globin chain (Fragment) [E9M4D4_HUMAN] | 45 | 0 | 0 | 0 | 0 | 0 | 10.8 | 8.48
P01011 | Alpha-1-antichymotrypsin [AACT_HUMAN] | 42 | 7 | 8 | 11 | 15 | 11 | 47.6 | 5.52
P02647 | Apolipoprotein A-I [APOA1_HUMAN] | 31 | 398 | 365 | 482 | 321 | 178 | 30.8 | 5.76
P02766 | Transthyretin [TTHY_HUMAN] | 29 | 71 | 101 | 59 | 56 | 82 | 15.9 | 5.76
P01876 | Ig alpha-1 chain C region [IGHA1_HUMAN] | 29 | 43 | 49 | 50 | 49 | 53 | 37.6 | 6.51
A8K5A4 | cDNA FLJ76826, highly similar to Homo sapiens ceruloplasmin (ferroxidase) (CP), mRNA [A8K5A4_HUMAN] | 25 | 155 | 168 | 104 | 128 | 86 | 122.1 | 5.74
P02774 | Vitamin D-binding protein [VTDB_HUMAN] | 22 | 351 | 338 | 312 | 230 | 284 | 52.9 | 5.54
P02790 | Hemopexin [HEMO_HUMAN] | 20 | 0 | 0 | 0 | 0 | 0 | 51.6 | 7.02
P0C0L4 | Complement C4-A OS=Homo sapiens GN=C4A PE=1 SV=2 - [CO4A_HUMAN] | 20 | 372 | 273 | 354 | 318 | 331 | 192.7 | 7.08


Chapter 3. Conclusion and Future Directions

The efficiency of CPLL for dynamic range reduction of low concentration BALF samples and of Aspergillus fumigatus antigen spiked into BALF was evaluated in the previously detailed experiments. The SDS-PAGE runs and the Orbitrap and Velos Pro mass spectrometric analyses showed that a BALF/bead slurry volume ratio of 20:1 (molar ratio of BALF proteins to bead capacity of 7:1) identified more total BALF protein groups and peptides than ratios of 8:1, 4:1, and 2:1. Thus, 20:1 was the ratio chosen for the normalization of protein concentrations in the sample. In total, CPLL treatment provided identification of 142 BALF protein groups with at least two peptide hits and 1360 peptides on the Orbitrap, using 3 fractions in the first dimension with 55 min LC gradients in the second dimension, as compared to 65 BALF protein groups and 599 peptides without CPLL treatment under the same LC-MS conditions. Among the 166 BALF protein groups identified with and without CPLL at the optimized ratio, approximately 61% are unique to the CPLL treated sample and 14% are unique to the non-CPLL treated sample. Consistent with the Orbitrap results, 115 BALF protein groups with at least two peptide hits and 746 peptides were identified on the Velos Pro, as compared to only 61 BALF protein groups and 360 peptides without CPLL treatment, both using a 4 hour LC gradient. The Velos Pro runs provided a total of 135 identified BALF proteins across the optimized CPLL ratio run and the non-CPLL run. Approximately 55% of these proteins were unique to the CPLL treated sample, while 15% were unique to the non-CPLL sample. As mentioned above, the 20:1 ratio was the better condition for identifying unique proteins and peptides, and all spike experiments were performed with this optimized sample/bead slurry volume ratio.
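The unique-protein percentages quoted above follow from the three protein group counts by inclusion-exclusion. The following minimal Python sketch shows the arithmetic; it assumes that the combined totals (166 on the Orbitrap, 135 on the Velos Pro) are the unions of the CPLL and non-CPLL identifications and that all counts use the same identification criterion. The function name unique_fractions is hypothetical.

```python
def unique_fractions(n_cpll, n_control, n_union):
    """Fractions of the combined protein list unique to each sample.

    n_cpll    -- protein groups identified with CPLL treatment
    n_control -- protein groups identified without CPLL treatment
    n_union   -- protein groups identified in either sample
    """
    shared = n_cpll + n_control - n_union  # inclusion-exclusion
    if shared < 0 or shared > min(n_cpll, n_control):
        raise ValueError("counts are not consistent with a union of two sets")
    unique_cpll = n_cpll - shared
    unique_control = n_control - shared
    return unique_cpll / n_union, unique_control / n_union


# Counts taken from the text above
orbitrap = unique_fractions(142, 65, 166)
velos_pro = unique_fractions(115, 61, 135)

for name, (cpll, control) in (("Orbitrap", orbitrap), ("Velos Pro", velos_pro)):
    print(f"{name}: {cpll:.1%} unique to CPLL, {control:.1%} unique to non-CPLL")
```

Under these assumptions both instruments share 41 protein groups between the treated and untreated samples, with 101 and 24 unique groups on the Orbitrap and 74 and 20 on the Velos Pro, which rounds to the percentages reported in the text.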

When 0.2 μM Aspergillus fumigatus was spiked before CPLL treatment, 3 Aspergillus fumigatus proteins and 288 BALF proteins were identified with at least two peptide hits in a single LC/LC-MS/MS run on the Orbitrap. At a 2 μM Aspergillus fumigatus spike, 24 Aspergillus fumigatus proteins and 235 BALF proteins were identified using the same LC and MS conditions. A majority of the identified BALF proteins showed reproducible results across the six spike levels. There were 302, 326, 299, 275, 288, and 235 BALF protein groups identified when 0.2 pM, 0.02 nM, 2 nM, 0.02 μM, 0.2 μM, and 2 μM Aspergillus fumigatus was spiked into BALF, respectively. When the single protein catalase was spiked into BALF before CPLL treatment, it was identified at concentrations of 0.02 μM and 0.2 μM after the spike, with 9 and 28 unique peptides, respectively. The numbers of BALF proteins identified in these catalase spiked samples were 149 and 136. The limit of detection of catalase was an order of magnitude lower than that of Aspergillus fumigatus, but it still falls in the high-abundance range of BALF proteins. A majority of the identified BALF proteins showed reproducible results across the four spiked quantities. There were 158, 166, 149, and 136 BALF protein groups identified when catalase was spiked into BALF at final concentrations of 0.01 nM, 1 nM, 0.01 μM, and 0.1 μM, respectively.


Among the top twenty-four most abundant proteins in BALF, the proteins that were significantly enriched by CPLL treatment have various molecular weights (187.0, 40.5, 30.8, 122.1, and 52.9 kDa), but they are generally weakly acidic proteins with pI values of 5-6. When the complete protein list was examined, basic proteins with various molecular weights were also found to be enriched by CPLL. Many proteins associated with albumin that were identified by R.L. Gundry et al.36 (for example, apolipoprotein E, retinol-binding protein, alpha-2-antiplasmin, and vitamin D-binding protein) were enriched to varying degrees after CPLL treatment, indicating possible interactions between albumin and other proteins present in the sample during the CPLL process. Most of the 24 identified Aspergillus fumigatus proteins are weakly acidic proteins with pI values from 4.72 to 6.99. The detected Aspergillus fumigatus proteins have molecular weights varying between 15.2 and 171.6 kDa and pI values varying from 4.39 to 8.18.

There are some aspects of this experiment that will need to be improved in the future. First, the data for the spiking experiments were obtained from single LC/LC-MS/MS runs; therefore, each spike level should be repeated to show that the Aspergillus fumigatus results are reproducible. Second, the dynamic exclusion time chosen for the Orbitrap analysis was 3 seconds, based on experience from previous experiments. However, the extracted ion chromatograms obtained using the Discoverer software for the identified Aspergillus fumigatus and BALF peptides showed that most of the peak widths are greater than 10 seconds. Therefore, in future experiments, the dynamic exclusion should be set to 30 seconds. For the catalase spiked BALF samples, there seemed to be significantly less protein on the gel than for the Aspergillus fumigatus spiked BALF, and this was reflected in the number of proteins identified in each MS run. The LC analytical column was changed for all the catalase runs. Although 0.5 μg of protein was loaded each time, the samples may have degraded over time. The experiment needs to be repeated to confirm the result. Finally, to evaluate the detection of Aspergillus fumigatus in the presence of BALF under CPLL treatment, the Aspergillus fumigatus proteins identified from the in-solution digestion and 4 hour LC gradient run performed by Yun Zhang were used as the control. However, a sample of Aspergillus fumigatus treated with CPLL and a sample of BALF spiked with Aspergillus fumigatus without CPLL would be more appropriate controls for comparison.
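The effect of the dynamic exclusion setting discussed above can be illustrated with a deliberately simplified model. The Python sketch below is an assumption-laden illustration rather than a description of the instrument software: it assumes that a precursor is selected the moment it starts eluting and again each time its exclusion window expires, and it ignores scan cycle time, intensity thresholds, and exclusion list size. The function name max_selections_per_peak is hypothetical.

```python
import math


def max_selections_per_peak(peak_width_s, exclusion_s):
    """Upper bound on how often one precursor can be selected for MS/MS
    while it elutes, under the simplified model described above.

    Selections occur at t = 0, exclusion_s, 2 * exclusion_s, ... for as
    long as t is still inside the chromatographic peak.
    """
    if peak_width_s <= 0 or exclusion_s <= 0:
        raise ValueError("durations must be positive")
    return math.ceil(peak_width_s / exclusion_s)


# A 10 s wide peak with the 3 s exclusion used in this work
print(max_selections_per_peak(10, 3))
# The same peak with the proposed 30 s exclusion
print(max_selections_per_peak(10, 30))
```

In this model a 10 second peak can trigger up to four MS/MS events on the same precursor with a 3 second exclusion but only one with a 30 second exclusion, which frees instrument time for lower-abundance precursors.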

Figure 3.1 Different cellular component percentages comparison graph for BALF proteins with and without CPLL treatment and BALF spiked with different amounts of Aspergillus fumigatus before CPLL.


Figure 3.2 Different molecular function percentages comparison graph for BALF proteins with and without CPLL treatment and BALF spiked with different amounts of Aspergillus fumigatus before CPLL.

To further study the characteristics of proteins captured by CPLL, the molecular functions of proteins in biological systems treated with and without CPLL should be compared. Grouping proteins by cellular component and molecular function could provide information on the chemistry of the proteins in a species-independent manner37. Here, the protein grouping was performed using the Thermo Proteome Discoverer software for preliminary analysis. Figure 3.1 shows the percentages of different cellular components for identified BALF proteins with and without CPLL treatment and for BALF spiked with different amounts of Aspergillus fumigatus before CPLL. It can be seen that after CPLL treatment, less than half of the extracellular components were removed. The endosome component was also decreased by CPLL treatment. On the other hand, the cellular components enriched by CPLL treatment include cytosol, cytoskeleton, nucleus, and mitochondrion. Components that were unique to the CPLL treated samples were proteasome, , ribosome, and spliceosomal complex. As shown in Figure 3.2, in terms of molecular function, CPLL tended to improve the detection of proteins that have catalytic activity, structural molecule activity, binding function, and RNA binding function. On the other hand, CPLL was likely to decrease the amount of proteins that are important for protein binding, enzyme regulator activity, transporter activity, and antioxidant activity. In addition to the re-annotation results, determining the hydrophobicity of the protein groups that are enriched or depleted by CPLL, and discovering whether there is any trend in this characteristic, would be helpful for interpreting the pattern of the proteins captured and the overall mechanism of capture by CPLL.


Appendix A: A list of 429 BALF and Aspergillus fumigatus proteins identified on Orbitrap

Normalized spectral counts for the initial BALF without CPLL, BALF with CPLL, and BALF spiked with different quantities of Aspergillus fumigatus from low to high (0.2 pM, 0.02 nM, 2 nM, 0.02 μM, 0.2 μM, and 2 μM from spike 1 to spike 6) before CPLL are shown. Data distribution is shown by color scales, with the shade of the color representing higher or lower values. Cells with higher normalized spectral counts are more red. Cells with higher molecular weights are more blue. Cells with higher pI values are more green. The search was performed using Thermo Proteome Discoverer (version 1.4) software. The molecular weight and calculated pI were also obtained from the above software.

Accession | Description | # PSM BALF without CPLL | # PSM BALF with CPLL | # PSM Spk 1 with CPLL | # PSM Spk 2 with CPLL | # PSM Spk 3 with CPLL | # PSM Spk 4 with CPLL | # PSM Spk 5 with CPLL | # PSM Spk 6 with CPLL | MW [kDa] | calc. pI
P02768 | Serum albumin [ALBU_HUMAN] | 4176 | 514 | 486 | 490 | 505 | 731 | 546 | 555 | 69.3 | 6.28
P01009 | Alpha-1-antitrypsin [A1AT_HUMAN] | 406 | 36 | 86 | 60 | 104 | 69 | 52 | 76 | 46.7 | 5.59
Q6MZV7 | Putative uncharacterized protein DKFZp686C11235 [Q6MZV7_HUMAN] | 370 | 322 | 186 | 152 | 260 | 185 | 206 | 193 | 52.1 | 7.58
Q53H26 | Transferrin variant (Fragment) [Q53H26_HUMAN] | 170 | 0 | 0 | 0 | 0 | 0 | 0 | 7 | 77.0 | 7.03
P00738 | Haptoglobin [HPT_HUMAN] | 138 | 0 | 27 | 9 | 14 | 23 | 8 | 8 | 45.2 | 6.58
Q6PJF2 | IGK@ protein [Q6PJF2_HUMAN] | 124 | 0 | 0 | 0 | 104 | 0 | 0 | 108 | 25.5 | 6.55
P01860 | Ig gamma-3 chain C region [IGHG3_HUMAN] | 97 | 49 | 64 | 56 | 86 | 72 | 111 | 82 | 41.3 | 7.90
P01023 | Alpha-2-macroglobulin [A2MG_HUMAN] | 90 | 69 | 81 | 100 | 80 | 100 | 111 | 118 | 163.2 | 6.46
P02763 | Alpha-1-acid glycoprotein 1 [A1AG1_HUMAN] | 90 | 0 | 6 | 3 | 5 | 7 | 2 | 0 | 23.5 | 5.02
C6KXN3 | Lambda light chain of human immunoglobulin surface antigen-related protein (Fragment) rG | 76 | 52 | 0 | 40 | 49 | 47 | 38 | 53 | 24.7 | 5.54
Q8NEJ1 | Uncharacterized protein [Q8NEJ1_HUMAN] | 68 | 0 | 54 | 34 | 43 | 0 | 36 | 46 | 25.0 | 7.69
P68871 | Hemoglobin subunit beta [HBB_HUMAN] | 68 | 0 | 9 | 4 | 8 | 4 | 2 | 4 | 16.0 | 7.28
P01024 | Complement C3 [CO3_HUMAN] | 62 | 412 | 327 | 352 | 481 | 389 | 329 | 382 | 187.0 | 6.40
Q6N093 | Putative uncharacterized protein DKFZp686I04196 (Fragment) [Q6N093_HUMAN] | 62 | 0 | 0 | 51 | 0 | 0 | 0 | 0 | 46.0 | 7.59
Q8WVW5 | Putative uncharacterized protein (Fragment) [Q8WVW5_HUMAN] | 51 | 614 | 207 | 160 | 164 | 202 | 188 | 165 | 40.5 | 6.14
E9M4D4 | Hemoglobin alpha-1 globin chain (Fragment) [E9M4D4_HUMAN] | 50 | 0 | 3 | 2 | 2 | 0 | 0 | 0 | 10.8 | 8.48
P01011 | Alpha-1-antichymotrypsin [AACT_HUMAN] | 47 | 7 | 18 | 19 | 16 | 20 | 16 | 21 | 47.6 | 5.52
P02647 | Apolipoprotein A-I [APOA1_HUMAN] | 34 | 398 | 155 | 203 | 242 | 417 | 187 | 224 | 30.8 | 5.76
P02766 | Transthyretin [TTHY_HUMAN] | 33 | 71 | 25 | 46 | 30 | 38 | 35 | 46 | 15.9 | 5.76
Q96K68 | cDNA FLJ14473 fis, clone MAMMA1001080, highly similar to Homo sapiens SNC73 protein (SNC73) mRNA [Q96K68_HUMAN] | 32 | 43 | 48 | 42 | 46 | 45 | 40 | 54 | 53.1 | 6.86

A8K5A4 | cDNA FLJ76826, highly similar to Homo sapiens ceruloplasmin (ferroxidase) (CP), mRNA [A8K5A4_HUMAN] | 28 | 155 | 219 | 191 | 248 | 163 | 185 | 209 | 122.1 | 5.74
P02774 | Vitamin D-binding protein [VTDB_HUMAN] | 24 | 351 | 438 | 238 | 304 | 214 | 189 | 193 | 52.9 | 5.54
P02790 | Hemopexin [HEMO_HUMAN] | 23 | 0 | 2 | 0 | 0 | 4 | 0 | 0 | 51.6 | 7.02
P0C0L4 | Complement C4-A [CO4A_HUMAN] | 22 | 372 | 0 | 0 | 0 | 0 | 0 | 0 | 192.7 | 7.08
B0UZ83 | Complement C4 beta chain [B0UZ83_HUMAN] | 22 | 366 | 250 | 208 | 206 | 268 | 238 | 231 | 192.8 | 7.03
P0C0L5 | Complement C4-B [CO4B_HUMAN] | 22 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 192.6 | 7.27
P19652 | Alpha-1-acid glycoprotein 2 | 21 | 0 | 0 | 0 | 0 | 6 | 0 | 0 | 23.6 | 5.11
E9PN95 | Uteroglobin [E9PN95_HUMAN] | 17 | 0 | 4 | 4 | 3 | 5 | 5 | 6 | 6.3 | 4.96
B4E1H2 | Plasma protease C1 inhibitor [B4E1H2_HUMAN] | 17 | 12 | 26 | 25 | 20 | 24 | 24 | 37 | 49.7 | 6.54
P02679 | Fibrinogen gamma chain [FIBG_HUMAN] | 16 | 118 | 72 | 76 | 81 | 86 | 96 | 103 | 51.5 | 5.62
P02675 | Fibrinogen beta chain [FIBB_HUMAN] | 15 | 260 | 120 | 128 | 178 | 124 | 122 | 152 | 55.9 | 8.27
P02652 | Apolipoprotein A-II [APOA2_HUMAN] | 15 | 36 | 27 | 33 | 26 | 20 | 31 | 52 | 11.2 | 6.62
P06702 | Protein S100-A9 [S10A9_HUMAN] | 15 | 0 | 0 | 6 | 4 | 0 | 2 | 6 | 13.2 | 6.13
Q65ZC9 | Single-chain Fv (Fragment) [Q65ZC9_HUMAN] | 12 | 0 | 8 | 6 | 8 | 8 | 5 | 10 | 25.6 | 9.11

B4E1Z4 | Complement factor B [B4E1Z4_HUMAN] | 11 | 1 | 5 | 4 | 3 | 4 | 2 | 5 | 140.9 | 7.18
P04217 | Alpha-1B-glycoprotein [A1BG_HUMAN] | 10 | 0 | 4 | 3 | 1 | 6 | 2 | 4 | 54.2 | 5.86
Q59FS1 | Inter-alpha (Globulin) inhibitor H4 (Plasma Kallikrein-sensitive glycoprotein) variant (Fragment) [Q59FS1_HUMAN] | 9 | 19 | 0 | 0 | 0 | 33 | 0 | 39 | 76.9 | 6.02
B7Z544 | cDNA FLJ51742, highly similar to Inter-alpha-trypsin inhibitor heavy chain H4 [B7Z544_HUMAN] | 9 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 98.3 | 6.60
B2RMS9 | Inter-alpha (Globulin) inhibitor H4 (Plasma Kallikrein-sensitive glycoprotein) [B2RMS9_HUMAN] | 9 | 19 | 33 | 37 | 29 | 33 | 36 | 37 | 103.3 | 6.98
P01833 | Polymeric immunoglobulin receptor [PIGR_HUMAN] | 9 | 8 | 19 | 24 | 18 | 29 | 28 | 29 | 83.2 | 5.74
A8K3E4 | cDNA FLJ78367, highly similar to Homo sapiens fibrinogen, A alpha polypeptide (FGA), transcriptvariant alpha, mRNA [A8K3E4_HUMAN] | 8 | 147 | 69 | 67 | 63 | 59 | 74 | 64 | 69.7 | 8.06
Q68CK4 | Leucine-rich alpha-2-glycoprotein | 8 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 38.1 | 6.95
P01591 | Immunoglobulin [IGJ_HUMAN] | 8 | 11 | 16 | 14 | 12 | 15 | 13 | 15 | 18.1 | 5.24
P02765 | Alpha-2-HS-glycoprotein [FETUA_HUMAN] | 8 | 45 | 54 | 67 | 94 | 102 | 64 | 99 | 39.3 | 5.72
P05109 | Protein S100-A8 [S10A8_HUMAN] | 8 | 0 | 5 | 6 | 3 | 3 | 2 | 6 | 10.8 | 7.03
B2R8I2 | cDNA, FLJ93914, highly similar to Homo sapiens histidine-rich glycoprotein (HRG), mRNA [B2R8I2_HUMAN] | 8 | 0 | 31 | 23 | 18 | 29 | 19 | 23 | 59.5 | 7.44
P04196 | Histidine-rich glycoprotein [HRG_HUMAN] | 8 | 54 | 31 | 23 | 18 | 0 | 19 | 23 | 59.5 | 7.50
A2J1M2 | Rheumatoid factor RF-IP9 (Fragment) [A2J1M2_HUMAN] | 8 | 4 | 4 | 6 | 4 | 6 | 5 | 7 | 10.5 | 9.17
E7EQB2 | Kaliocin-1 (Fragment) [E7EQB2_HUMAN] | 7 | 65 | 113 | 110 | 85 | 106 | 119 | 110 | 76.6 | 8.02
H6VRF8 | Keratin 1 [H6VRF8_HUMAN] | 6 | 30 | 61 | 70 | 58 | 59 | 93 | 53 | 66.0 | 8.12
H6VRG2 | Keratin 1 [H6VRG2_HUMAN] | 6 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 66.0 | 8.12
A0N5G5 | Rheumatoid factor D5 light chain (Fragment) [A0N5G5_HUMAN] | 6 | 0 | 2 | 2 | 3 | 0 | 0 | 4 | 12.8 | 8.97
P08603 | Complement factor H [CFAH_HUMAN] | 6 | 18 | 55 | 53 | 46 | 49 | 48 | 65 | 139.0 | 6.61
P01871 | Ig mu chain C region [IGHM_HUMAN] | 6 | 47 | 56 | 46 | 58 | 46 | 43 | 58 | 49.3 | 6.77
B7ZLE5 | FN1 protein [B7ZLE5_HUMAN] | 5 | 130 | 0 | 147 | 0 | 153 | 151 | 121 | 246.5 | 6.06
P02751 | [FINC_HUMAN] | 5 | 0 | 143 | 150 | 127 | 155 | 151 | 122 | 262.5 | 5.71
P02649 | Apolipoprotein E [APOE_HUMAN] | 4 | 32 | 39 | 39 | 33 | 25 | 47 | 47 | 36.1 | 5.73
A6XMH5 | Beta-2-microglobulin [A6XMH5_HUMAN] | 4 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 10.4 | 8.02

P13796 | Plastin-2 [PLSL_HUMAN] | 3 | 2 | 6 | 12 | 16 | 8 | 5 | 7 | 70.2 | 5.43
Q5HYB6 | Putative uncharacterized protein DKFZp686J1372 [Q5HYB6_HUMAN] | 2 | 14 | 25 | 20 | 16 | 14 | 22 | 17 | 27.2 | 4.74
B4DPN0 | cDNA FLJ51265, moderately similar to Beta-2-glycoprotein 1 | 2 | 0 | 5 | 4 | 3 | 3 | 3 | 5 | 30.3 | 7.85
B4DUI5 | Triosephosphate isomerase [B4DUI5_HUMAN] | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 22.9 | 6.92
P62258 | 14-3-3 protein epsilon [1433E_HUMAN] | 2 | 15 | 26 | 27 | 23 | 24 | 25 | 23 | 29.2 | 4.74
P35527 | Keratin, type I cytoskeletal 9 [K1C9_HUMAN] | 2 | 22 | 34 | 36 | 28 | 30 | 58 | 18 | 62.0 | 5.24
P07900 | Heat shock protein HSP 90-alpha [HS90A_HUMAN] | 2 | 71 | 77 | 68 | 51 | 72 | 82 | 65 | 84.6 | 5.02
B2R7Z6 | cDNA, FLJ93674 [B2R7Z6_HUMAN] | 2 | 2 | 7 | 9 | 7 | 5 | 9 | 10 | 52.5 | 7.55
P68104 | Elongation factor 1-alpha 1 [EF1A1_HUMAN] | 2 | 4 | 9 | 9 | 6 | 6 | 7 | 7 | 50.1 | 9.01
A7Y9J9 | 5AC, oligomeric mucus/gel-forming [A7Y9J9_HUMAN] | 2 | 0 | 7 | 5 | 8 | 3 | 2 | 5 | 648.4 | 6.76
Q08380 | Galectin-3-binding protein [LG3BP_HUMAN] | 1 | 38 | 39 | 35 | 39 | 45 | 43 | 46 | 65.3 | 5.27
P06733 | Alpha-enolase [ENOA_HUMAN] | 1 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 47.1 | 7.39
P08670 | Vimentin [VIME_HUMAN] | 0 | 55 | 78 | 74 | 65 | 66 | 83 | 96 | 53.6 | 5.12
P63104 | 14-3-3 protein zeta/delta [1433Z_HUMAN] | 0 | 29 | 35 | 35 | 28 | 25 | 31 | 28 | 27.7 | 4.79

P31946 | 14-3-3 protein beta/alpha [1433B_HUMAN] | 0 | 13 | 23 | 22 | 22 | 18 | 22 | 18 | 28.1 | 4.83
D3DQX7 | Serum amyloid A protein [D3DQX7_HUMAN] | 0 | 11 | 12 | 35 | 23 | 5 | 15 | 19 | 13.6 | 6.79
B4E1B2 | cDNA FLJ53691, highly similar to Serotransferrin [B4E1B2_HUMAN] | 0 | 0 | 40 | 15 | 19 | 36 | 6 | 7 | 74.8 | 7.12
Q53G71 | Calreticulin variant (Fragment) [Q53G71_HUMAN] | 0 | 34 | 68 | 73 | 56 | 63 | 82 | 52 | 46.9 | 4.45
O00299 | Chloride intracellular channel protein 1 [CLIC1_HUMAN] | 0 | 12 | 16 | 11 | 13 | 9 | 9 | 11 | 26.9 | 5.17
P02787 | Serotransferrin [TRFE_HUMAN] | 0 | 0 | 0 | 15 | 0 | 0 | 6 | 7 | 77.0 | 7.12
Q86TY5 | Full-length cDNA clone CS0DI041YE05 of Placenta of Homo sapiens (human) [Q86TY5_HUMAN] | 0 | 0 | 10 | 9 | 6 | 9 | 2 | 0 | 13.9 | 9.16
P09382 | Galectin-1 [LEG1_HUMAN] | 0 | 0 | 12 | 7 | 7 | 5 | 8 | 6 | 14.7 | 5.50
F8W1R7 | Myosin light polypeptide 6 [F8W1R7_HUMAN] | 0 | 8 | 12 | 14 | 8 | 4 | 9 | 7 | 16.3 | 4.65
P06727 | Apolipoprotein A-IV [APOA4_HUMAN] | 0 | 30 | 44 | 41 | 43 | 39 | 57 | 57 | 45.4 | 5.38
P25786 | Proteasome subunit alpha type-1 [PSA1_HUMAN] | 0 | 13 | 21 | 23 | 19 | 17 | 26 | 31 | 29.5 | 6.61
H0YN26 | Acidic leucine-rich nuclear phosphoprotein 32 family member A [H0YN26_HUMAN] | 0 | 0 | 6 | 9 | 8 | 3 | 8 | 4 | 20.0 | 4.58

G3V1A4 | Cofilin 1 (Non-muscle), isoform CRA_a [G3V1A4_HUMAN] | 0 | 0 | 7 | 12 | 8 | 5 | 9 | 6 | 16.8 | 8.35
Q06830 | Peroxiredoxin-1 [PRDX1_HUMAN] | 0 | 7 | 10 | 15 | 16 | 9 | 8 | 11 | 22.1 | 8.13
P13645 | Keratin, type I cytoskeletal 10 [K1C10_HUMAN] | 0 | 24 | 44 | 48 | 33 | 42 | 59 | 49 | 58.8 | 5.21
O43707 | Alpha-actinin-4 [ACTN4_HUMAN] | 0 | 20 | 72 | 53 | 40 | 46 | 64 | 50 | 104.8 | 5.44
P14618 | Pyruvate kinase PKM [KPYM_HUMAN] | 0 | 30 | 58 | 30 | 29 | 42 | 38 | 30 | 57.9 | 7.84
P61981 | 14-3-3 protein gamma [1433G_HUMAN] | 0 | 14 | 20 | 19 | 17 | 16 | 16 | 18 | 28.3 | 4.89
A8K9J7 | Histone H2B [A8K9J7_HUMAN] | 0 | 10 | 19 | 15 | 32 | 10 | 16 | 14 | 14.0 | 10.32
P19105 | Myosin regulatory light chain 12A [ML12A_HUMAN] | 0 | 0 | 0 | 7 | 3 | 0 | 3 | 0 | 19.8 | 4.81
Q53HL1 | Myosin regulatory light chain MRCL3 variant (Fragment) [Q53HL1_HUMAN] | 0 | 0 | 0 | 7 | 3 | 0 | 3 | 0 | 19.8 | 4.84
P67936 | Tropomyosin alpha-4 chain [TPM4_HUMAN] | 0 | 15 | 27 | 26 | 16 | 21 | 26 | 20 | 28.5 | 4.69
P06396 | Gelsolin [GELS_HUMAN] | 0 | 29 | 32 | 38 | 24 | 51 | 41 | 34 | 85.6 | 6.28
P62805 | Histone H4 [H4_HUMAN] | 0 | 3 | 6 | 12 | 9 | 6 | 10 | 15 | 11.4 | 11.36
E9PPQ4 | Ferritin heavy chain (Fragment) [E9PPQ4_HUMAN] | 0 | 0 | 0 | 2 | 0 | 0 | 0 | 0 | 6.7 | 5.83
P31947 | 14-3-3 protein sigma | 0 | 12 | 16 | 12 | 11 | 13 | 10 | 11 | 27.8 | 4.74
P07437 | Tubulin beta chain [TBB5_HUMAN] | 0 | 7 | 12 | 16 | 11 | 15 | 16 | 18 | 49.6 | 4.89

Full-length cDNA clone CS0DD006YL02 of Q86TT1 Neuroblastoma of Homo 0 47 0 0 0 0 0 0 41.2 6.79 sapiens (human) [Q86TT1_HUMAN] Protein - P07237 isomerase 0 5 21 16 15 15 16 10 57.1 4.87 [PDIA1_HUMAN] cDNA FLJ37398 fis, clone BRAMY2027467, B3KT06 highly similar to Tubulin 0 21 25 18 13 22 33 15 46.3 5.14 alpha-ubiquitous chain [B3KT06_HUMAN] Lambda-chain (AA -20 A2NUT2 to 215) 0 0 0 0 0 47 35 0 24.6 7.62 [A2NUT2_HUMAN]


Proteasome subunit beta P20618 type-1 0 0 7 9 5 6 10 8 26.5 8.13 [PSB1_HUMAN] A30 (Fragment) A2MYE1 0 0 2 0 0 0 0 0 10.4 8.50 [A2MYE1_HUMAN] Adenylyl cyclase- D3DPU2 associated protein 0 21 41 22 22 26 24 19 51.6 8.02 [D3DPU2_HUMAN] Proteasome subunit Q05DH1 alpha type (Fragment) 0 11 15 12 18 13 18 13 26.7 8.87 [Q05DH1_HUMAN] Proteasome subunit P25787 alpha type-2 0 8 6 11 9 9 13 7 25.9 7.43 [PSA2_HUMAN] cDNA FLJ54023, highly similar to Heat shock B4DMA2 0 45 62 48 42 45 57 40 79.1 5.02 protein HSP 90-beta [B4DMA2_HUMAN] Histone H2A C9J0D1 0 0 0 0 0 0 10 0 13.2 9.99 [C9J0D1_HUMAN]


Putative uncharacterized protein Q6N096 0 322 0 152 260 0 0 0 50.9 8.06 DKFZp686I15196 [Q6N096_HUMAN] Antileukoproteinase P03973 0 6 16 15 9 3 8 8 14.3 8.75 [SLPI_HUMAN] IGK@ protein Q6P5S8 0 89 99 138 0 187 102 0 25.8 6.33 [Q6P5S8_HUMAN] Eukaryotic translation P63241 initiation factor 5A-1 0 1 2 6 4 3 2 0 16.8 5.24 [IF5A1_HUMAN] Lysozyme C P61626 0 0 3 7 5 0 3 5 16.5 9.16 [LYSC_HUMAN] cDNA FLJ52068, highly similar to Microtubule- B4DM33 associated protein 0 0 4 4 3 1 3 2 26.6 5.29

RP/EB family member 1 [B4DM33_HUMAN]


RecName: Full=Catalase B; AltName: Full=Antigenic catalase; 2493539 0 0 0 0 0 0 10 79 79.9 5.82 AltName: Full=Slow catalase; Flags: Precursor Dihydropyrimidinase- Q16555 related protein 2 0 4 10 13 8 11 16 11 62.3 6.38 [DPYL2_HUMAN] 14-3-3 protein theta P27348 0 0 18 15 14 16 16 13 27.7 4.78 [1433T_HUMAN] Tryptophan--tRNA P23381 , cytoplasmic 0 9 9 11 12 8 5 5 53.1 6.23 [SYWC_HUMAN] Heat shock protein beta- P04792 0 0 8 7 6 8 9 8 22.8 6.40 1 [HSPB1_HUMAN] Uncharacterized protein A0A5E4 0 52 55 0 0 0 0 0 24.7 5.94 [A0A5E4_HUMAN]


40S ribosomal protein C9J9K3 SA (Fragment) 0 0 4 5 6 3 7 0 29.5 5.25 [C9J9K3_HUMAN] P51884 0 45 38 34 32 34 36 34 38.4 6.61 [LUM_HUMAN] Prothrombin P00734 0 26 29 26 20 22 24 19 70.0 5.90 [THRB_HUMAN] Inter-alpha-trypsin Q5T985 inhibitor heavy chain H2 0 46 70 92 88 84 71 52 105.2 7.03 [Q5T985_HUMAN] Tubulin beta-4B chain P68371 0 0 0 12 8 11 12 12 49.8 4.89 [TBB4B_HUMAN] TPMsk3 (Fragment) Q8TCG3 0 0 23 0 0 0 0 0 28.8 4.75 [Q8TCG3_HUMAN] Serum amyloid A-4

P35542 protein 0 5 7 9 6 4 9 10 14.7 9.07 [SAA4_HUMAN] 78 kDa glucose- P11021 regulated protein 0 4 22 24 18 20 17 22 72.3 5.16 [GRP78_HUMAN] Proteasome subunit P60900 alpha type-6 0 11 12 14 11 15 13 15 27.4 6.76 [PSA6_HUMAN] Proteasome subunit P25789 alpha type-4 0 1 5 7 6 4 9 6 29.5 7.72 [PSA4_HUMAN] C9JGI3 0 12 15 20 16 13 13 17 46.1 5.52 (Fragment) [C9JGI3_HUMAN] Ig gamma-4 chain C P01861 region 0 34 0 0 67 0 93 51 35.9 7.36 [IGHG4_HUMAN] cDNA FLJ56821, highly similar to Inter-alpha- B7Z549 0 18 26 33 27 30 30 19 75.5 6.96 trypsin inhibitor heavy chain H1

[B7Z549_HUMAN] Anti-(ED-B) scFV A2KBC2 (Fragment) 0 0 9 8 6 11 11 11 25.2 7.71 [A2KBC2_HUMAN] Complement C1q subcomponent subunit B D6RA08 0 11 10 11 8 11 14 11 24.1 9.16 (Fragment) [D6RA08_HUMAN] Complement component P02748 0 19 30 26 32 26 30 30 63.1 5.59 C9 [CO9_HUMAN] Alpha-actinin-1 P12814 0 14 46 39 26 30 42 31 103.0 5.41 [ACTN1_HUMAN] Proteasome subunit beta P49721 type-2 0 2 7 43 8 3 5 4 22.8 7.02 [PSB2_HUMAN]

cDNA FLJ60461, highly

similar to Peroxiredoxin- B4DF70 0 0 4 7 5 0 0 4 20.1 8.78 2 (EC 1.11.1.15) [B4DF70_HUMAN] Nucleosome assembly protein 1-like 1 F8VY35 0 0 3 2 1 0 2 0 31.0 4.51 (Fragment) [F8VY35_HUMAN] Transitional endoplasmic reticulum P55072 0 7 21 17 13 11 13 8 89.3 5.26 ATPase [TERA_HUMAN] Apolipoprotein D P05090 0 15 23 14 13 13 12 11 21.3 5.15 [APOD_HUMAN] Glyceraldehyde-3- phosphate P04406 0 0 4 6 5 3 4 5 36.0 8.46 dehydrogenase [G3P_HUMAN] Tubulin alpha-4A chain A8MUB1 0 0 16 13 8 15 18 0 48.3 5.01 [A8MUB1_HUMAN] P04080 Cystatin-B 0 0 2 2 0 0 2 0 11.1 7.56


[CYTB_HUMAN] Moesin P26038 0 0 16 23 19 20 16 38 67.8 6.40 [MOES_HUMAN] cDNA FLJ53207, highly similar to Homo sapiens fibulin 1 (FBLN1), B4DUV1 0 29 30 27 24 26 33 21 70.1 5.26 transcript variant C, mRNA [B4DUV1_HUMAN] cDNA FLJ75422, highly similar to Homo sapiens capping protein (actin A8K0T9 0 0 7 6 5 4 6 4 32.9 5.69 filament) muscle Z-line, alpha 1, mRNA [A8K0T9_HUMAN] Histone H2A type 1-H Q96KK5 0 6 11 15 8 8 13 14 13.9 10.89 [H2A1H_HUMAN] Coronin-1A P31146 0 5 12 11 11 10 12 4 51.0 6.68 [COR1A_HUMAN] Myeloperoxidase P05164 0 7 23 41 14 22 25 25 83.8 8.97 [PERM_HUMAN] Heterogeneous nuclear ribonucleoproteins G3V2D6 0 0 10 8 11 6 7 4 23.8 4.78 C1/C2 (Fragment) [G3V2D6_HUMAN] Anti-streptococcal/anti- myosin immunoglobulin kappa light chain Q96SA9 0 0 0 0 0 3 0 0 11.5 8.85 variable region (Fragment) [Q96SA9_HUMAN] Proteasome subunit beta P49720 type-3 0 0 6 6 7 3 6 5 22.9 6.55 [PSB3_HUMAN] Putative uncharacterized Q6MZU6 protein 0 27 51 0 52 0 46 61 51.1 7.71 DKFZp686C15213 [Q6MZU6_HUMAN] Haptoglobin-related P00739 protein 0 0 0 5 7 0 0 0 39.0 7.09 [HPTR_HUMAN] Ig kappa chain V-I P01593 region AG 0 0 0 0 0 0 0 4 12.0 5.99 [KV101_HUMAN] Kininogen 1, isoform B4E1C2 CRA_b 0 5 21 21 20 22 19 16 71.9 6.81 [B4E1C2_HUMAN] cDNA FLJ77770, highly similar to Homo sapiens nucleobindin 1 A8K7Q1 0 0 11 9 3 4 8 3 53.9 5.25 (NUCB1), mRNA [A8K7Q1_HUMAN]

cDNA FLJ57475, highly

similar to Pulmonary

B4E1F5 surfactant-associated 0 0 5 10 8 2 5 4 38.5 5.96 protein B [B4E1F5_HUMAN] Heat shock cognate 71 E9PKE3 kDa protein 0 0 14 15 15 15 12 11 68.8 5.52 [E9PKE3_HUMAN] Proteasome subunit P28066 alpha type-5 0 9 10 8 9 8 11 11 26.4 4.79 [PSA5_HUMAN] cDNA FLJ54121, highly similar to Cysteine and B4E2T4 0 2 0 0 0 0 0 0 15.0 8.57 glycine-rich protein 1 [B4E2T4_HUMAN] Keratin, type II P35908 cytoskeletal 2 epidermal 0 0 24 19 15 17 24 16 65.4 8.00 [K22E_HUMAN] Keratin, type II P05787 cytoskeletal 8 0 14 24 31 22 20 35 20 53.7 5.59 [K2C8_HUMAN]

cDNA FLJ76079, highly similar to Homo sapiens lymphocyte-specific A8K2L4 0 3 17 11 14 8 9 0 37.2 4.74 protein 1 (LSP1), mRNA [A8K2L4_HUMAN] cDNA, FLJ96345, Homo sapiens SET translocation (myeloid B2RCX0 0 0 7 8 7 4 9 6 32.1 4.22 leukemia-associated) (SET),mRNA [B2RCX0_HUMAN] Uncharacterized protein H7C469 (Fragment) 0 0 3 9 8 3 2 3 40.4 5.76

[H7C469_HUMAN]

cDNA FLJ77835, highly similar to Homo sapiens complement component A8K2N0 1, s subcomponent 0 7 14 15 15 15 16 11 76.6 4.98 (C1S), transcript variant 2, mRNA [A8K2N0_HUMAN] NL3 Q6UY50 0 0 2 4 2 2 2 0 24.6 6.79 [Q6UY50_HUMAN] Complement C1q P02745 subcomponent subunit A 0 4 0 7 4 8 7 10 26.0 9.11 [C1QA_HUMAN] RecName: Full=Asp- 83300542 hemolysin; Short=Asp- 0 0 0 0 0 0 0 4 15.2 5.53 HS; Flags: Precursor Alpha-2-antiplasmin C9JMH6 (Fragment) 0 0 4 6 3 4 5 6 27.8 6.77 [C9JMH6_HUMAN] LMNA protein Q8N519 0 0 8 9 8 6 6 4 53.2 6.40 [Q8N519_HUMAN] Protein S100-A4 P26447 0 0 0 2 0 0 2 0 11.7 6.11 [S10A4_HUMAN]

Ran-specific GTPase- activating protein C9JJ34 0 0 2 0 3 0 0 0 18.8 5.21 (Fragment) [C9JJ34_HUMAN] Protein AMBP P02760 0 23 16 16 13 13 14 10 39.0 6.25 [AMBP_HUMAN] Fibulin-1 P23142 0 0 25 22 20 22 27 18 77.2 5.22 [FBLN1_HUMAN] Ig kappa chain V-I P01598 region EU 0 0 0 0 3 0 0 0 11.8 8.44 [KV106_HUMAN] Pyridoxal kinase F2Z2Y4 0 3 7 7 6 8 9 11 30.6 6.65 [F2Z2Y4_HUMAN] P10909 0 41 44 29 52 36 35 45 52.5 6.27 [CLUS_HUMAN]

cDNA, FLJ96580, highly similar to Homo sapiens hepatoma-derived growth factor B2RDE8 0 0 2 3 0 1 0 0 26.8 4.67 (high-mobility group protein 1-like) (HDGF), mRNA [B2RDE8_HUMAN] alpha-mannosidase 66852431 [Aspergillus fumigatus 0 0 0 0 0 0 9 59 123.7 6.27 Af293] Transforming growth factor-beta-induced G8JLA8 0 6 6 4 6 10 8 7 74.6 7.25 protein ig-h3 [G8JLA8_HUMAN] Proteasome subunit beta P28070 type-4 0 3 6 7 6 6 10 7 29.2 5.97 [PSB4_HUMAN] HCCR-binding protein 2 Q5J908 0 0 4 4 3 2 3 0 12.6 4.91 [Q5J908_HUMAN] cDNA FLJ46506 fis, B3KY04 0 0 6 2 4 3 2 0 35.7 5.64 clone THYMU3030752,

highly similar to BTB/POZ domain- containing protein KCTD12 [B3KY04_HUMAN] Glutathione S- pi C7DJS1 0 0 0 0 1 0 0 0 16.7 5.10 (Fragment) [C7DJS1_HUMAN] Complement factor H- Q03591 related protein 1 0 0 4 7 3 4 4 6 37.6 7.39 [FHR1_HUMAN] Prothymosin a14 Q9UMZ1 0 0 7 9 6 3 9 6 11.1 3.79 [Q9UMZ1_HUMAN] RecName: Full=Probable 74672124 0 0 0 0 0 0 0 49 98.8 6.42 alpha/beta-glucosidase agdC; Flags: Precursor

Hsc70-interacting H7C3I1 protein (Fragment) 0 0 2 2 0 0 2 0 16.3 4.88 [H7C3I1_HUMAN] Complement C1q P02747 subcomponent subunit C 0 15 10 7 11 10 11 11 25.8 8.41 [C1QC_HUMAN] Keratin, type I P08727 cytoskeletal 19 0 0 9 11 6 6 13 10 44.1 5.14 [K1C19_HUMAN] GTP-binding nuclear P62826 protein Ran 0 0 2 4 3 3 2 0 24.4 7.49 [RAN_HUMAN] Histone H3 K7EMV3 0 0 4 5 5 4 7 5 10.3 11.82 [K7EMV3_HUMAN] Myosin-reactive immunoglobulin kappa Q9UL85 chain variable region 0 0 3 4 2 3 3 4 11.8 8.51 (Fragment) [Q9UL85_HUMAN]

Protein disulfide- B7Z254 isomerase A6 0 5 16 7 7 4 6 5 47.8 5.08 [B7Z254_HUMAN] P04004 0 23 23 23 22 22 35 20 54.3 5.80 [VTNC_HUMAN] Thymosin alpha-1 B8ZZQ6 0 0 7 9 6 0 9 6 11.8 3.81 [B8ZZQ6_HUMAN] Tumor rejection antigen Q5CAQ5 (Gp96) 1 0 8 15 16 9 15 16 12 92.3 4.86 [Q5CAQ5_HUMAN] Heat shock 70 kDa P08107 protein 1A/1B 0 0 9 12 0 8 8 8 70.0 5.66 [HSP71_HUMAN] 14-3-3 protein eta Q04917 0 0 10 9 8 11 9 11 28.2 4.84 [1433F_HUMAN] Proteasome subunit beta A2ACR1 type 0 3 5 5 4 5 6 7 20.9 4.89 [A2ACR1_HUMAN] Deleted in malignant Q9UGM3 brain tumors 1 protein 0 4 7 7 7 6 7 5 260.6 5.44 [DMBT1_HUMAN] Serum amyloid P- P02743 component 0 0 3 5 4 7 5 5 25.4 6.54 [SAMP_HUMAN] Keratin, type I P02533 cytoskeletal 14 0 0 8 10 11 8 16 0 51.5 5.16 [K1C14_HUMAN] Ig kappa chain V-IV P01625 region Len 0 0 0 0 0 3 2 0 12.6 7.93 [KV402_HUMAN] Heterogeneous nuclear Q5T6W5 ribonucleoprotein K 0 4 7 4 4 3 3 0 47.5 5.63 [Q5T6W5_HUMAN] High mobility group Q5T7C4 protein B1 0 0 7 4 4 7 5 0 18.3 9.70 [Q5T7C4_HUMAN]


cDNA FLJ55287, highly B7Z7L4 similar to Calpastatin 0 4 7 7 9 3 6 3 46.1 4.75 [B7Z7L4_HUMAN] cDNA FLJ75066, highly similar to Homo sapiens complement component A8K5J8 0 0 5 4 6 5 3 4 80.1 6.44 1, r subcomponent (C1R), mRNA [A8K5J8_HUMAN] Annexin (Fragment) H0YNA0 0 0 2 0 0 0 0 0 9.1 4.70 [H0YNA0_HUMAN] Cathepsin G P08311 0 0 2 1 1 0 4 0 28.8 11.19 [CATG_HUMAN] N-acetyl-D-glucosamine Q9UJ70 kinase 0 2 7 6 6 6 8 4 37.4 6.24 [NAGK_HUMAN] Truncated nucleolar A4ZU86 phosphoprotein B23 0 0 6 6 4 3 5 5 30.0 4.75

[A4ZU86_HUMAN]

CALM3 protein Q9BRL5 0 0 3 5 4 2 5 7 16.5 4.46 [Q9BRL5_HUMAN] Dickkopf-related protein E7EUD0 0 0 6 3 2 2 2 0 35.3 4.64 3 [E7EUD0_HUMAN] C-reactive protein P02741 0 0 4 4 3 3 3 0 25.0 5.63 [CRP_HUMAN] Proteasome activator Q06323 complex subunit 1 0 0 0 3 5 1 3 5 28.7 6.02 [PSME1_HUMAN] Capping protein (Actin filament) muscle Z-line, B1AK87 0 0 7 7 5 4 5 3 29.3 6.92 beta, isoform CRA_a [B1AK87_HUMAN] Calcyphosin Q13938 0 0 2 4 3 0 0 0 21.0 4.89 [CAYP1_HUMAN] Proteasome subunit beta Q6FHU0 type (Fragment) 0 5 7 6 8 6 5 5 29.7 5.82 [Q6FHU0_HUMAN]


cDNA FLJ75700, highly similar to Homo sapiens complement component 1, q subcomponent A8K651 binding protein 0 0 6 2 3 0 0 0 31.4 4.84 (C1QBP), nuclear encoding mitochondrial protein, mRNA [A8K651_HUMAN] Elongation factor 2 P13639 0 7 13 14 9 13 11 11 95.3 6.83 [EF2_HUMAN] Proteasome subunit P25788 alpha type-3 0 0 3 4 3 5 6 5 28.4 5.33 [PSA3_HUMAN] CD44 antigen H0YD13 (Fragment) 0 0 7 4 3 2 2 3 20.2 8.27 [H0YD13_HUMAN] Myosin-9 P35579 0 29 57 40 32 35 46 35 226.4 5.60 [MYH9_HUMAN] 60S acidic ribosomal F8VWV4 protein P0 (Fragment) 0 0 2 0 0 0 0 0 12.2 9.25 [F8VWV4_HUMAN] Protein disulfide- P13667 isomerase A4 0 6 16 11 9 10 9 7 72.9 5.07 [PDIA4_HUMAN] PNAS-139 Q9BXV5 0 0 0 1 2 0 0 0 22.7 6.27 [Q9BXV5_HUMAN] Protein phosphatase 1 regulatory subunit 7 H7C003 0 0 3 4 4 0 2 0 38.7 4.84 (Fragment) [H7C003_HUMAN] Rho GDP-dissociation J3KTF8 inhibitor 1 (Fragment) 0 0 0 7 9 0 0 0 21.6 5.49 [J3KTF8_HUMAN] Heparin 2 P05546 0 0 7 9 12 5 4 5 57.0 6.90 [HEP2_HUMAN]


Dermcidin P81605 0 0 3 0 0 0 2 0 11.3 6.54 [DCD_HUMAN] Profilin-1 P07737 0 0 0 0 2 0 0 0 15.0 8.27 [PROF1_HUMAN] Filamin-A Q5HY54 0 2 26 25 12 28 27 21 276.4 6.05 [Q5HY54_HUMAN] cDNA, FLJ93426, highly similar to Homo B2R7F8 sapiens plasminogen 0 5 11 14 10 15 12 13 90.5 7.24 (PLG), mRNA [B2R7F8_HUMAN] Proteasome activator H0YM70 complex subunit 2 0 0 2 2 2 0 2 2 26.0 5.92 [H0YM70_HUMAN]

Gamma-interferon- inducible lysosomal P13284 0 0 0 4 0 0 0 0 27.9 4.88 thiol reductase [GILT_HUMAN] Afamin P43652 0 0 5 4 6 4 3 6 69.0 5.90 [AFAM_HUMAN] Serum P27169 paraoxonase/arylesterase 0 3 7 5 4 6 5 4 39.7 5.22 1 [PON1_HUMAN] Napsin-A O96009 0 2 6 3 4 0 3 0 45.4 6.61 [NAPSA_HUMAN] Neutrophil defensin 1 P59665 0 0 2 4 0 0 3 4 10.2 6.99 [DEF1_HUMAN] Glutathione peroxidase 3 P22352 0 0 0 4 3 0 3 4 25.5 8.13 [GPX3_HUMAN] conserved hypothetical 66851322 protein [Aspergillus 0 0 0 0 0 0 0 4 20.3 5.20 fumigatus Af293] Plasma retinol-binding Q5VY30 protein(1-182) 0 0 2 3 3 3 2 2 22.9 6.09 [Q5VY30_HUMAN] Keratin, type II P13647 0 0 14 15 0 17 18 0 62.3 7.74 cytoskeletal 5

[K2C5_HUMAN] cDNA FLJ77823, highly similar to Homo sapiens EGF-containing fibulin- A8KAJ3 like extracellular matrix 0 4 4 4 4 4 2 0 54.6 5.14 protein 1, transcript variant 3, mRNA [A8KAJ3_HUMAN] V3-3 protein (Fragment) Q5NV83 46 PE=4 SV=1 - 0 0 0 0 0 0 2 0 10.4 7.28 [Q5NV83_HUMAN] V3-2 protein (Fragment) Q5NV80 43 PE=2 SV=1 - 0 0 0 0 0 1 0 4 10.4 7.11 [Q5NV80_HUMAN]

Retinal dehydrogenase 1 P00352 0 0 0 6 3 5 5 4 54.8 6.73

[AL1A1_HUMAN] Carboxypeptidase N P22792 subunit 2 0 0 5 4 4 4 2 2 60.5 5.99 [CPN2_HUMAN] RecName: Full=Probable beta- glucosidase A; AltName: Full=Beta-D- glucoside 74669696 glucohydrolase A; 0 0 0 0 0 0 0 23 94.7 5.19 AltName: Full=Cellobiase A; AltName: Full=Gentiobiase A; Flags: Precursor agmatinase [Aspergillus 70989709 0 0 0 0 0 0 0 6 45.2 5.81 fumigatus Af293] Inorganic Q15181 pyrophosphatase 0 0 0 2 0 0 0 2 32.6 5.86 [IPYR_HUMAN] R4GN98 Protein S100-A6 0 0 0 2 2 0 0 0 9.7 5.45

(Fragment) [R4GN98_HUMAN] cDNA FLJ53910, highly similar to Keratin, type B4DRR0 0 0 0 0 13 0 0 0 57.8 8.00 II cytoskeletal 6A [B4DRR0_HUMAN] conserved hypothetical 70983229 protein [Aspergillus 0 0 0 0 0 0 3 23 37.2 5.49 fumigatus Af293] cDNA, FLJ92973, highly similar to Homo B2R6J2 sapiens villin 2 (ezrin) 0 0 7 12 6 12 8 20 69.4 6.27 (VIL2), mRNA [B2R6J2_HUMAN] Thioredoxin P10599 0 0 0 0 2 0 0 0 11.7 4.92

[THIO_HUMAN]

Transgelin-2 P37802 0 0 0 0 3 0 0 0 22.4 8.25 [TAGL2_HUMAN] Prostaglandin E synthase B4DHP2 0 0 0 2 0 0 0 0 14.8 4.11 3 [B4DHP2_HUMAN] adenosine deaminase family protein 129558247 0 0 0 0 0 0 0 18 66.2 5.01 [Aspergillus fumigatus Af293] Azurocidin P20160 0 0 7 7 6 6 9 5 26.9 9.50 [CAP7_HUMAN] Lamin-B1 P20700 0 6 12 5 8 6 6 4 66.4 5.16 [LMNB1_HUMAN] Elongation factor 1-delta P29692 0 0 3 4 3 0 2 0 31.1 5.01 [EF1D_HUMAN] cDNA FLJ78244, highly similar to Homo sapiens eukaryotic translation A8K7F6 initiation factor 4A, 0 0 4 5 0 4 4 3 46.1 5.48 isoform 1 (EIF4A1), mRNA [A8K7F6_HUMAN]

cDNA FLJ51488, highly similar to Macrophage B4DU58 0 0 0 3 3 0 0 0 36.2 6.25 capping protein [B4DU58_HUMAN] aspartyl aminopeptidase 66852422 [Aspergillus fumigatus 0 0 0 0 0 0 0 8 54.9 6.60 Af293] Heterogeneous nuclear Q00839 ribonucleoprotein U 0 0 6 6 3 6 2 0 90.5 6.00 [HNRPU_HUMAN] Prolactin-inducible P12273 0 0 0 0 2 0 0 0 16.6 8.05 protein [PIP_HUMAN] ATP synthase subunit Q0QEN7 beta (Fragment) 0 0 2 2 0 1 2 0 48.1 5.07 [Q0QEN7_HUMAN]

Glucosidase 2 subunit

P14314 beta 0 2 3 5 3 6 5 5 59.4 4.41 [GLU2B_HUMAN] cDNA FLJ54993, highly similar to ATP- B4E356 dependent DNA helicase 0 0 7 5 3 4 5 0 54.4 8.87 2 subunit 1 (EC 3.6.1.-) [B4E356_HUMAN] Histone H1.2 P16403 0 0 3 2 0 0 0 0 21.4 10.93 [H12_HUMAN] IQ motif containing GTPase activating A4QPB0 0 3 20 12 8 12 16 6 189.2 6.48 protein 1 [A4QPB0_HUMAN] Adenosylhomocysteinas Q1RMG2 0 0 2 2 0 3 4 0 33.8 6.61 e [Q1RMG2_HUMAN] S-phase kinase- E5RJR5 associated protein 1 0 0 2 2 0 0 0 0 18.7 4.70 [E5RJR5_HUMAN] Complement C5 P01031 0 4 7 11 14 13 7 18 188.2 6.52 [CO5_HUMAN]


cytidine deaminase 70982805 [Aspergillus fumigatus 0 0 0 0 0 0 0 2 16.0 7.24 Af293] Scavenger receptor cysteine-rich type 1 F5GZZ9 0 9 12 14 18 13 9 11 120.2 6.10 protein M130 [F5GZZ9_HUMAN] KRT18 protein I6L965 (Fragment) 0 0 7 5 6 2 5 4 42.1 5.10 [I6L965_HUMAN] Eosinophil cationic P12724 protein 0 0 2 2 1 0 2 0 18.4 10.02 [ECP_HUMAN] cDNA FLJ54373, highly

similar to 60 kDa heat

B7Z597 shock protein, 0 0 3 4 2 0 3 3 60.0 5.74 mitochondrial [B7Z597_HUMAN] Ferritin light chain P02792 0 0 0 3 0 0 0 2 20.0 5.78 [FRIL_HUMAN] cDNA FLJ60316, highly similar to B4DNT5 0 0 0 1 1 0 3 3 30.8 8.66 Apolipoprotein-L1 [B4DNT5_HUMAN] Carboxypeptidase N P15169 catalytic chain 0 0 4 4 3 5 3 6 52.3 7.34 [CBPN_HUMAN] cDNA FLJ57219, highly similar to Glutamine B4DWM6 0 0 0 0 0 1 0 0 18.1 7.85 synthetase (EC 6.3.1.2) [B4DWM6_HUMAN] cDNA FLJ57277, highly similar to Tripeptidyl- B4DSE2 peptidase 1 (EC 0 0 2 1 0 0 0 0 41.6 5.45 3.4.14.9) [B4DSE2_HUMAN] P20290 factor 0 0 0 0 0 0 2 0 22.2 9.38

BTF3 [BTF3_HUMAN] Keratin, type I K7ERE3 cytoskeletal 13 0 0 0 7 0 0 0 0 45.2 4.81 [K7ERE3_HUMAN] Proteasome subunit beta P40306 type-10 0 0 0 3 6 3 5 6 28.9 7.81 [PSB10_HUMAN] Hornerin Q5DT20 0 10 2 1 3 1 0 0 282.2 10.02 [Q5DT20_HUMAN] Chromobox protein Q13185 homolog 3 0 0 2 0 2 1 2 0 20.8 5.33 [CBX3_HUMAN] Leucine-rich repeat

flightless-interacting Q32MZ4 0 0 7 5 5 3 3 0 89.2 4.65 protein 1 [LRRF1_HUMAN] cDNA FLJ57133, highly similar to Bifunctional B4DP06 purine biosynthesis 0 0 2 2 4 0 0 0 58.6 6.84 protein PURH [B4DP06_HUMAN] major allergen Asp F2 70994042 [Aspergillus fumigatus 0 0 0 0 0 0 0 27 32.2 5.69 Af293] cDNA FLJ50118, highly similar to Splicing B4DEM8 factor, arginine/serine- 0 0 0 0 2 0 2 0 18.3 5.11 rich 4 [B4DEM8_HUMAN] Putative uncharacterized protein DKFZp547B159 Q69YT6 0 0 0 1 0 0 2 0 21.2 5.08 (Fragment) [Q69YT6_HUMAN] Coronin B4DMH3 0 0 5 4 5 3 5 0 49.3 7.02 [B4DMH3_HUMAN]


cDNA, FLJ96158, highly similar to Homo sapiens calpain 2, (m/II) B2RCM3 0 2 3 4 4 1 0 0 80.0 5.00 large subunit (CAPN2), mRNA [B2RCM3_HUMAN] Complement component 8, beta polypeptide, B7Z550 0 0 5 4 3 5 2 6 60.1 7.77 isoform CRA_b [B7Z550_HUMAN] EF-hand domain- containing protein D2 H0Y4Y4 0 0 0 1 0 0 0 0 19.6 6.35 (Fragment) [H0Y4Y4_HUMAN]

Inter-alpha-trypsin Q06033 inhibitor heavy chain H3 0 7 11 12 10 15 16 12 99.8 5.74 [ITIH3_HUMAN]

Plasma kallikrein P03952 0 0 3 3 3 3 2 2 71.3 8.22 [KLKB1_HUMAN] Serine/threonine-protein E9PMD7 phosphatase (Fragment) 0 0 0 0 2 0 0 0 28.9 4.87 [E9PMD7_HUMAN] DnaJ homolog P31689 subfamily A member 1 0 0 0 1 0 2 2 0 44.8 7.08 [DNJA1_HUMAN] cDNA FLJ57246, highly similar to Poly(A)- B4DZW4 0 0 2 1 0 3 0 0 65.6 9.67 binding protein 1 [B4DZW4_HUMAN] HNRNPA2B1 protein I6L957 0 0 2 2 0 2 2 0 28.4 4.86 [I6L957_HUMAN] Hsp90 co-chaperone Q16543 Cdc37 0 7 0 0 0 0 2 0 44.4 5.25 [CDC37_HUMAN] Cytosol aminopeptidase P28838 0 0 3 3 4 3 2 4 56.1 7.93 [AMPL_HUMAN]


Monocyte differentiation B2R888 antigen CD14 0 0 0 1 1 0 0 0 40.0 6.23 [B2R888_HUMAN] Phospholipid transfer P55058 protein 0 0 2 0 2 1 2 0 54.7 7.01 [PLTP_HUMAN] cDNA FLJ53437, highly similar to Major vault B4DP93 0 2 5 3 3 3 3 4 87.5 5.66 protein [B4DP93_HUMAN] YBX1 protein Q6PKI6 (Fragment) 0 0 0 0 0 2 3 0 29.4 10.23 [Q6PKI6_HUMAN] RecName: Full=Probable glucan

1,3-beta-glucosidase A;

AltName: Full=Exo-1,3- 74669912 0 0 0 0 0 0 0 4 45.7 4.72

beta-glucanase 1; AltName: Full=Exo-1,3- beta-glucanase A; Flags: Precursor cDNA, FLJ96225, highly similar to Homo sapiens heat shock B2RCQ9 0 0 0 0 7 0 0 0 70.3 5.81 70kDa protein 1-like (HSPA1L), mRNA [B2RCQ9_HUMAN] Antithrombin-III P01008 0 0 3 1 2 3 4 5 52.6 6.71 [ANT3_HUMAN] 40S ribosomal protein P62269 0 0 0 0 1 0 0 0 17.7 10.99 S18 [RS18_HUMAN] Elongation factor 1- P26641 gamma 0 0 3 3 0 3 5 2 50.1 6.67 [EF1G_HUMAN] L-lactate dehydrogenase A8MW50 (Fragment) 0 0 4 4 7 3 0 0 25.2 5.81 [A8MW50_HUMAN]

cDNA FLJ51711, highly similar to T-complex B4DE30 0 0 2 2 0 0 0 0 51.5 5.53 protein 1 subunit epsilon [B4DE30_HUMAN] LBP protein Q8TCF0 0 3 0 0 1 2 2 0 52.9 6.76 [Q8TCF0_HUMAN] Coronin-1B Q9BR76 0 0 5 6 6 4 5 4 54.2 5.88 [COR1B_HUMAN] cDNA FLJ51597, highly similar to C4b-binding B4E1D8 0 0 2 3 4 3 2 0 60.4 6.65 protein alpha chain [B4E1D8_HUMAN] Actin-related protein 3 B4DXW1 0 0 0 1 1 0 3 0 42.0 5.62 [B4DXW1_HUMAN]

L-lactate dehydrogenase P00338 A chain 0 0 0 0 4 0 4 0 36.7 8.27 [LDHA_HUMAN] cDNA FLJ58073, moderately similar to B4DL49 Cathepsin B (EC 0 0 2 3 3 3 2 3 30.7 6.62 3.4.22.1) [B4DL49_HUMAN] Annexin A1 P04083 0 0 2 1 3 0 2 3 38.7 7.02 [ANXA1_HUMAN] Serine-threonine kinase receptor-associated B0AZV0 0 0 2 0 0 0 0 0 28.5 4.91 protein [B0AZV0_HUMAN] Ubiquitin-like modifier- P22314 activating enzyme 1 0 0 7 6 5 8 6 11 117.8 5.76 [UBA1_HUMAN] cDNA FLJ75376, highly similar to Homo sapiens A8K050 0 0 0 0 2 0 0 0 62.1 7.37 recognition protein L (PGLYRP) mRNA [A8K050_HUMAN]

Glucose-6-phosphate 1- dehydrogenase Q2VF42 0 0 2 2 1 2 3 3 54.8 7.28 (Fragment) [Q2VF42_HUMAN] cDNA FLJ78571, highly similar to Homo sapiens A8K477 sulfhydryl oxidase 0 0 2 1 0 1 3 0 66.8 8.60 mRNA [A8K477_HUMAN] Golgi membrane protein Q8NBJ4 0 0 2 1 2 2 0 0 45.3 4.97 1 [GOLM1_HUMAN] V-type proton ATPase P36543 subunit E 1 0 0 5 4 5 1 5 4 26.1 8.00 [VATE1_HUMAN] Proteasome subunit beta

B2RAQ9 type 0 0 0 5 0 0 5 3 29.9 7.68 [B2RAQ9_HUMAN]

Nucleosome assembly protein 1-like 4 C9JZI7 0 2 0 0 0 0 0 0 31.8 5.01 (Fragment) [C9JZI7_HUMAN] Putative uncharacterized protein XRCC5 Q53T09 0 0 2 2 4 1 2 0 64.2 6.00 (Fragment) [Q53T09_HUMAN] cDNA FLJ54507, highly similar to Heat shock 70 B4DT47 0 0 0 3 3 1 0 2 77.3 5.07 kDa protein 4 [B4DT47_HUMAN] Elongation factor 1-beta P24534 0 0 0 1 0 0 0 0 24.7 4.67 [EF1B_HUMAN] Fermitin family Q86UX7 homolog 3 0 2 3 2 0 1 2 0 75.9 6.98 [URP2_HUMAN] cDNA FLJ54170, highly B4DV28 similar to Cytosolic 0 0 0 1 4 0 2 3 51.5 6.43 nonspecific dipeptidase

[B4DV28_HUMAN] glutaminase GtaA 66852539 [Aspergillus fumigatus 0 0 0 0 0 0 0 11 76.1 4.88 Af293] Cathepsin Z Q5U000 0 0 2 3 3 3 4 0 33.8 7.11 [Q5U000_HUMAN] SAFB2 protein A0PJ47 (Fragment) 0 0 2 0 3 0 0 0 56.9 4.79 [A0PJ47_HUMAN] PURA protein Q2NLD4 (Fragment) 0 0 2 1 0 0 0 0 32.0 7.05 [Q2NLD4_HUMAN] Aldehyde

dehydrogenase, P05091 0 0 5 0 0 0 0 0 56.3 7.05 mitochondrial

[ALDH2_HUMAN]

Neuroblast differentiation- Q09666 associated protein 0 0 15 9 6 4 5 7 628.7 6.15 AHNAK [AHNK_HUMAN] BRCA1/BRCA2- containing complex, D3DWY6 subunit 3, isoform 0 0 2 0 1 0 0 0 29.7 6.40 CRA_e [D3DWY6_HUMAN] HEXA protein Q9BVJ8 (Fragment) 0 0 0 1 1 1 0 0 47.1 5.00 [Q9BVJ8_HUMAN] S-adenosylmethionine B4DN45 synthase 0 0 0 1 0 0 0 0 32.9 7.25 [B4DN45_HUMAN] Splicing factor, arginine/serine-rich 1 Q59FA2 (Splicing factor 2, 0 0 0 0 0 0 2 0 25.6 9.01 alternate splicing factor) variant (Fragment)

[Q59FA2_HUMAN] Complement component P07357 C8 alpha chain 0 0 0 3 4 4 4 4 65.1 6.47 [CO8A_HUMAN] Coagulation factor XII P00748 0 0 2 2 2 2 2 4 67.7 7.74 [FA12_HUMAN] IgGFc-binding protein Q9Y6R7 0 3 16 14 11 12 10 5 571.6 5.34 [FCGBP_HUMAN] dihydrolipoamide dehydrogenase 70988990 0 0 0 0 0 0 0 6 54.9 8.18 [Aspergillus fumigatus Af293] Heat shock protein 105 Q92598 0 0 2 2 2 3 2 0 96.8 5.39 kDa [HS105_HUMAN]

cDNA FLJ53638, highly B7Z582 similar to Annexin A6 0 0 0 0 2 0 0 0 61.8 5.54

[B7Z582_HUMAN]

Beta-Ala-His J3KRP0 dipeptidase 0 0 0 2 0 0 0 3 51.9 5.40 [J3KRP0_HUMAN] FBRNP Q65ZQ3 0 0 3 3 0 3 0 0 29.3 8.29 [Q65ZQ3_HUMAN] Thioredoxin-like 1 Q59G46 variant (Fragment) 0 0 2 0 0 0 0 0 31.3 4.83 [Q59G46_HUMAN] 26S protease regulatory P43686 subunit 6B 0 0 2 2 0 0 2 0 47.3 5.21 [PRS6B_HUMAN] Complement component P10643 0 0 0 3 19 0 2 3 93.5 6.48 C7 [CO7_HUMAN] RecName: Full=Probable carboxypeptidase 74671026 0 0 0 0 0 0 0 5 46.3 5.52 AFUA_6G06800; AltName: Full=Peptidase M20


domain-containing protein AFUA_6G06800; Flags: Precursor Epsilon-COP Q7Z4Z1 0 0 0 3 2 0 0 0 34.4 5.20 [Q7Z4Z1_HUMAN] Drebrin-like protein Q9UJU6 0 0 0 1 1 0 0 0 48.2 5.05 [DBNL_HUMAN] cDNA FLJ54035, highly similar to Neutral alpha- B4DIW2 0 0 2 0 0 3 0 0 93.9 5.87 glucosidase AB [B4DIW2_HUMAN] Nuclear migration Q9Y266 protein nudC 0 0 0 4 0 3 2 0 38.2 5.38 [NUDC_HUMAN]

Metalloendopeptidase

Q96E52 OMA1, mitochondrial 0 0 0 0 0 0 2 0 60.1 9.25

[OMA1_HUMAN] cDNA FLJ53631, highly similar to Intercellular B4DNT6 0 0 0 1 0 0 0 0 48.4 7.74 adhesion molecule 1 [B4DNT6_HUMAN] Clathrin heavy chain 1 Q00610 0 0 6 4 2 3 4 0 191.5 5.69 [CLH1_HUMAN] Mesothelin (Fragment) H3BMA1 0 0 0 1 0 0 0 0 37.6 5.97 [H3BMA1_HUMAN] ATP-citrate synthase P53396 0 3 5 5 4 5 5 5 120.8 7.33 [ACLY_HUMAN] cDNA FLJ55694, highly similar to Dipeptidyl- B4DJQ8 peptidase 1 (EC 0 0 0 4 0 3 5 4 50.1 6.99 3.4.14.1) [B4DJQ8_HUMAN] SYNCRIP protein Q05CK9 (Fragment) 0 0 2 1 2 2 0 0 50.6 6.71 [Q05CK9_HUMAN]


cDNA, FLJ79457, highly similar to Insulin- like growth factor- B0AZL7 0 0 0 3 2 0 0 0 66.0 6.79 binding proteincomplex acid labile chain [B0AZL7_HUMAN] cDNA FLJ77456, highly similar to Homo sapiens interleukin enhancer A8K590 binding factor 3, 90kDa 0 0 3 2 1 1 4 0 76.0 7.75 (ILF3), transcript variant 2, mRNA [A8K590_HUMAN] cDNA FLJ56531, highly similar to UV excision

B4DEA3 repair protein RAD23 0 0 2 3 3 2 0 0 42.3 5.43 homolog B [B4DEA3_HUMAN] glucan 1,4-alpha- 70988699 glucosidase [Aspergillus 0 0 0 0 0 0 0 4 67.1 5.21 fumigatus Af293] Vasodilator-stimulated P50552 phosphoprotein 0 0 2 0 0 0 0 0 39.8 8.94 [VASP_HUMAN] Clathrin light chain A P09496 0 0 3 3 2 0 0 0 27.1 4.51 [CLCA_HUMAN] RecName: Full=Probable beta- glucosidase F; AltName: Full=Beta-D-glucoside glucohydrolase F; 296439598 0 0 0 0 0 0 0 4 92.9 5.97 AltName: Full=Cellobiase F; AltName: Full=Gentiobiase F; Flags: Precursor B4E1B3 cDNA FLJ53950, highly 0 0 2 1 2 2 0 2 51.0 6.16

similar to Angiotensinogen [B4E1B3_HUMAN] cDNA FLJ39263 fis, clone OCBBF2009571, highly similar to ATP- B3KU66 0 0 2 1 0 2 0 0 59.9 9.45 dependent RNA helicase A (EC 3.6.1.-) [B3KU66_HUMAN] Talin-1 Q5TCU6 0 7 5 0 0 0 3 2 257.9 6.49 [Q5TCU6_HUMAN] cDNA FLJ14052 fis, clone HEMBA1006914, highly similar to B3KNA3 0 0 2 0 0 0 0 0 41.2 5.01 Ubiquitin-like 1-activating enzyme E1B

[B3KNA3_HUMAN] Nucleolin, isoform B3KM80 CRA_c 0 0 2 0 0 2 0 0 58.5 4.67 [B3KM80_HUMAN] RecName: Full=Vacuolar protease A; AltName: Full=Aspartic 74675969 0 0 0 0 0 0 0 2 43.3 5.00 endopeptidase pep2; AltName: Full=Aspartic

protease pep2; Flags: Precursor cDNA FLJ59335, highly similar to B4DLL8 Transmembrane 0 0 0 0 0 1 0 2 50.5 6.24 glycoprotein NMB [B4DLL8_HUMAN] V-type proton ATPase B7Z1R5 catalytic subunit A 0 8 7 8 4 8 10 10 64.7 5.66 [B7Z1R5_HUMAN] B4DT28 Heterogeneous nuclear 0 0 2 2 3 3 3 2 55.7 9.23


ribonucleoprotein R, isoform CRA_a [B4DT28_HUMAN] Catalase B4DWK8 0 0 2 0 0 0 0 0 53.3 7.94 [B4DWK8_HUMAN] cDNA FLJ14022 fis, clone HEMBA1003538, weakly similar to Q9H804 COMPLEMENT C1R 0 0 0 0 1 0 0 0 48.0 7.46 COMPONENT (EC 3.4.21.41) [Q9H804_HUMAN] Spectrin alpha chain, Q13813 non-erythrocytic 1 0 0 11 7 6 3 8 8 284.4 5.35 [SPTN1_HUMAN] PACSIN2 protein Q6FIA3 0 0 2 0 0 0 0 0 51.3 5.39 [Q6FIA3_HUMAN] Acylamino-acid- P13798 releasing enzyme 0 0 2 1 2 2 2 3 81.2 5.48 [ACPH_HUMAN] Eosinophil peroxidase P11678 0 0 0 2 0 0 0 0 81.0 10.29 [PERE_HUMAN] Eukaryotic translation initiation factor 3 B4DV79 0 0 2 1 1 1 2 0 85.1 5.26 subunit B [B4DV79_HUMAN] Histone-binding protein E9PC52 RBBP7 0 0 2 0 0 2 0 0 46.9 5.07 [E9PC52_HUMAN] alkaline phosphatase 70989193 Pho8 [Aspergillus 0 0 0 0 0 0 0 3 66.3 5.27 fumigatus Af293] Suprabasin Q6UWP8 0 0 0 1 0 0 0 0 60.5 7.01 [SBSN_HUMAN] cDNA FLJ45763 fis, B3KXN4 clone N1ESE2000698, 0 0 2 2 0 0 0 0 62.1 6.62 highly similar to WD

repeat protein 1 [B3KXN4_HUMAN] cDNA FLJ77317, highly similar to Homo sapiens retinoblastoma binding A8K6A2 0 0 2 0 0 2 0 0 47.8 5.08 protein 7 (RBBP7), mRNA [A8K6A2_HUMAN] cDNA FLJ54278, highly similar to SPARC-like B4DNS6 0 0 0 1 0 1 0 0 58.5 4.94 protein 1 [B4DNS6_HUMAN] cDNA FLJ78071, highly similar to Human MHC A8K8Z4 class III complement 0 0 6 4 8 5 4 5 104.6 6.62

component C6 mRNA [A8K8Z4_HUMAN]

RecName:

Full=Dipeptidyl- peptidase 5; AltName: 229485355 Full=Dipeptidyl- 0 0 0 0 0 0 0 4 79.7 5.90 peptidase V; Short=DPP V; Short=DppV; Flags: Precursor Vacuolar protein sorting-associated Q96QK1 0 0 0 0 0 2 0 0 91.6 5.49 protein 35 [VPS35_HUMAN] Stress-induced- F5H0T1 phosphoprotein 1 0 0 2 2 1 1 0 4 59.7 6.80 [F5H0T1_HUMAN] T-complex protein 1 B4DQH4 subunit theta 0 0 0 0 0 1 0 0 51.6 5.24 [B4DQH4_HUMAN] Myosin-14 Q7Z406 0 0 5 4 0 0 5 0 227.7 5.60 [MYH14_HUMAN] Q03252 Lamin-B2 0 0 3 0 0 0 0 0 67.6 5.35

[LMNB2_HUMAN] cDNA, FLJ92896, B2R6D0 highly similar to Homo 0 0 2 0 0 0 0 0 105.8 5.39 sapiens proteasome Hyaluronan-binding Q14520 protein 2 0 0 4 2 0 3 3 4 62.6 6.54 [HABP2_HUMAN] Vascular cell adhesion P19320 protein 1 0 0 0 0 1 0 0 0 81.2 5.22 [VCAM1_HUMAN] TNC variant protein Q4LE33 (Fragment) 0 0 4 4 0 3 0 0 244.2 4.94 [Q4LE33_HUMAN] Putative uncharacterized

protein UGP2 Q53QE9 0 0 0 1 0 0 0 0 49.2 8.69 (Fragment) [Q53QE9_HUMAN] vacuolar carboxypeptidase Cps1, 129558348 0 0 0 0 0 0 0 4 62.0 5.21 putative [Aspergillus fumigatus Af293] Deoxynucleoside triphosphate Q9Y3Z3 triphosphohydrolase 0 0 2 0 0 0 0 0 72.2 7.14 SAMHD1 [SAMH1_HUMAN] ABC transporter (Adp1) 70991545 [Aspergillus fumigatus 0 0 0 0 0 0 0 5 119.9 6.44 Af293] cDNA FLJ51093, highly B4DU18 similar to Cadherin-5 0 0 2 0 0 0 0 0 83.9 5.50 [B4DU18_HUMAN] Plectin Q15149 0 0 5 2 4 3 4 4 531.5 5.96 [PLEC_HUMAN] Coatomer protein B4DZI8 0 0 0 0 0 0 2 0 99.0 5.16 complex, subunit beta 2


(Beta prime), isoform CRA_b [B4DZI8_HUMAN] Vinculin P18206 0 0 0 0 1 0 0 0 123.7 5.66 [VINC_HUMAN] ABC metal ion transporter, putative 66851475 0 0 0 0 0 0 0 2 171.6 8.02 [Aspergillus fumigatus Af293] Fatty acid synthase P49327 0 0 2 1 0 0 0 0 273.3 6.44

[FAS_HUMAN]


Appendix B A list of 219 BALF and catalase proteins identified on Orbitrap

Normalized spectral counts are shown for BALF with no treatment, BALF with CPLL treatment, and BALF spiked with different concentrations of catalase (0.01 nM, 1 nM, 0.01 μM, and 0.1 μM for spike 1 to spike 4, respectively) before CPLL treatment. The data distribution is shown by color scales, with the shade of the color representing higher or lower values: cells with higher normalized spectral counts are more red, cells with higher molecular weights are more blue, and cells with higher pI values are more green. The search was performed using Thermo Proteome Discoverer (version 1.4) software, which also provided the molecular weights and calculated pI values.

Columns: Accession; Description; # PSM BALF without CPLL; # PSM BALF with CPLL; # PSM Spk 1 with CPLL; # PSM Spk 2 with CPLL; # PSM Spk 3 with CPLL; # PSM Spk 4 with CPLL; MW [kDa]; calc. pI

P02768 Serum albumin [ALBU_HUMAN] 3743 514 361 319 418 497 69.3 6.28

P01009 | Alpha-1-antitrypsin [A1AT_HUMAN] | 364 | 36 | 53 | 54 | 67 | 64 | 46.7 | 5.59
Q6MZV7 | Putative uncharacterized protein DKFZp686C11235 [Q6MZV7_HUMAN] | 332 | 322 | 214 | 141 | 191 | 153 | 52.1 | 7.58
Q53H26 | Transferrin variant (Fragment) [Q53H26_HUMAN] | 152 | 0 | 0 | 0 | 6 | 0 | 77.0 | 7.03
P00738 | Haptoglobin [HPT_HUMAN] | 124 | 0 | 0 | 2 | 7 | 3 | 45.2 | 6.58
Q6PJF2 | IGK@ protein [Q6PJF2_HUMAN] | 111 | 0 | 0 | 0 | 0 | 0 | 25.5 | 6.55
P01860 | Ig gamma-3 chain C region [IGHG3_HUMAN] | 87 | 49 | 0 | 42 | 0 | 0 | 41.3 | 7.90
P01023 | Alpha-2-macroglobulin [A2MG_HUMAN] | 81 | 69 | 99 | 83 | 111 | 131 | 163.2 | 6.46
P02763 | Alpha-1-acid glycoprotein 1 [A1AG1_HUMAN] | 81 | 0 | 0 | 0 | 0 | 0 | 23.5 | 5.02
C6KXN3 | Lambda light chain of human immunoglobulin surface antigen-related protein (Fragment) rG PE=1 SV=1 - [C6KXN3_HUMAN] | 68 | 52 | 58 | 57 | 56 | 62 | 24.7 | 5.54
Q8NEJ1 | Uncharacterized protein [Q8NEJ1_HUMAN] | 61 | 0 | 0 | 0 | 0 | 0 | 25.0 | 7.69
P68871 | Hemoglobin subunit beta [HBB_HUMAN] | 61 | 0 | 0 | 0 | 0 | 0 | 16.0 | 7.28
P01024 | Complement C3 [CO3_HUMAN] | 56 | 412 | 285 | 189 | 244 | 237 | 187.0 | 6.40
Q6N093 | Putative uncharacterized protein DKFZp686I04196 (Fragment) [Q6N093_HUMAN] | 56 | 0 | 0 | 0 | 0 | 0 | 46.0 | 7.59
Q8WVW5 | Putative uncharacterized protein (Fragment) | 45 | 614 | 354 | 227 | 212 | 213 | 40.5 | 6.14
E9M4D4 | Hemoglobin alpha-1 globin chain (Fragment) [E9M4D4_HUMAN] | 45 | 0 | 0 | 0 | 0 | 0 | 10.8 | 8.48
P01011 | Alpha-1-antichymotrypsin [AACT_HUMAN] | 42 | 7 | 8 | 11 | 15 | 11 | 47.6 | 5.52
P02647 | Apolipoprotein A-I [APOA1_HUMAN] | 31 | 398 | 365 | 482 | 321 | 178 | 30.8 | 5.76
P02766 | Transthyretin [TTHY_HUMAN] | 29 | 71 | 101 | 59 | 56 | 82 | 15.9 | 5.76
P01876 | Ig alpha-1 chain C region [IGHA1_HUMAN] | 29 | 43 | 49 | 50 | 49 | 53 | 37.6 | 6.51

A8K5A4 | cDNA FLJ76826, highly similar to Homo sapiens ceruloplasmin (ferroxidase) (CP), mRNA [A8K5A4_HUMAN] | 25 | 155 | 168 | 104 | 128 | 86 | 122.1 | 5.74
P02774 | Vitamin D-binding protein [VTDB_HUMAN] | 22 | 351 | 338 | 312 | 230 | 284 | 52.9 | 5.54
P02790 | Hemopexin [HEMO_HUMAN] | 20 | 0 | 0 | 0 | 0 | 0 | 51.6 | 7.02
P0C0L4 | Complement C4-A [CO4A_HUMAN] | 20 | 372 | 273 | 354 | 318 | 331 | 192.7 | 7.08
P19652 | Alpha-1-acid glycoprotein 2 [A1AG2_HUMAN] | 19 | 0 | 0 | 0 | 0 | 0 | 23.6 | 5.11
E9PN95 | Uteroglobin [E9PN95_HUMAN] | 16 | 0 | 0 | 0 | 3 | 3 | 6.3 | 4.96
B4E1H2 | Plasma protease C1 inhibitor [B4E1H2_HUMAN] | 15 | 12 | 22 | 18 | 24 | 14 | 49.7 | 6.54
P02679 | Fibrinogen gamma chain [FIBG_HUMAN] | 15 | 118 | 117 | 99 | 118 | 116 | 51.5 | 5.62
P02675 | Fibrinogen beta chain [FIBB_HUMAN] | 14 | 260 | 208 | 170 | 169 | 190 | 55.9 | 8.27
P02652 | Apolipoprotein A-II [APOA2_HUMAN] | 14 | 36 | 58 | 27 | 41 | 47 | 11.2 | 6.62
B2R4M6 | cDNA, FLJ92148, highly similar to Homo sapiens S100 calcium binding protein A9 (calgranulin B) (S100A9), mRNA [B2R4M6_HUMAN] | 13 | 0 | 0 | 0 | 0 | 0 | 13.2 | 6.13
Q65ZC9 | Single-chain Fv (Fragment) [Q65ZC9_HUMAN] | 10 | 0 | 0 | 0 | 7 | 3 | 25.6 | 9.11
P00751 | Complement factor B [CFAB_HUMAN] | 10 | 1 | 0 | 0 | 6 | 0 | 85.5 | 7.06
P04217 | Alpha-1B-glycoprotein [A1BG_HUMAN] | 9 | 0 | 0 | 0 | 0 | 0 | 54.2 | 5.86
P01833 | Polymeric immunoglobulin receptor [PIGR_HUMAN] | 8 | 8 | 15 | 22 | 14 | 13 | 83.2 | 5.74
B2RMS9 | Inter-alpha (Globulin) inhibitor H4 (Plasma Kallikrein-sensitive glycoprotein) [B2RMS9_HUMAN] | 8 | 19 | 29 | 27 | 32 | 14 | 103.3 | 6.98
A8K3E4 | cDNA FLJ78367, highly similar to Homo sapiens fibrinogen, A alpha polypeptide (FGA), transcript variant alpha, mRNA [A8K3E4_HUMAN] | 7 | 147 | 77 | 86 | 74 | 97 | 69.7 | 8.06
Q68CK4 | Leucine-rich alpha-2-glycoprotein [Q68CK4_HUMAN] | 7 | 0 | 0 | 0 | 0 | 0 | 38.1 | 6.95
P01591 | Immunoglobulin J chain [IGJ_HUMAN] | 7 | 11 | 18 | 16 | 19 | 24 | 18.1 | 5.24
P02765 | Alpha-2-HS-glycoprotein [FETUA_HUMAN] | 7 | 45 | 40 | 48 | 25 | 34 | 39.3 | 5.72
P05109 | Protein S100-A8 [S10A8_HUMAN] | 7 | 0 | 0 | 0 | 0 | 0 | 10.8 | 7.03
P04196 | Histidine-rich glycoprotein [HRG_HUMAN] | 7 | 54 | 14 | 17 | 36 | 32 | 59.5 | 7.50
A2J1M2 | Rheumatoid factor RF-IP9 (Fragment) [A2J1M2_HUMAN] | 7 | 4 | 3 | 3 | 0 | 0 | 10.5 | 9.17
E7EQB2 | Kaliocin-1 (Fragment) [E7EQB2_HUMAN] | 6 | 65 | 99 | 129 | 138 | 121 | 76.6 | 8.02
H6VRF8 | Keratin 1 [H6VRF8_HUMAN] | 6 | 30 | 49 | 43 | 60 | 103 | 66.0 | 8.12
H6VRG2 | Keratin 1 [H6VRG2_HUMAN] | 6 | 0 | 0 | 0 | 0 | 0 | 66.0 | 8.12
A0N5G5 | Rheumatoid factor D5 light chain (Fragment) [A0N5G5_HUMAN] | 6 | 0 | 0 | 0 | 0 | 0 | 12.8 | 8.97
P01871 | Ig mu chain C region [IGHM_HUMAN] | 5 | 47 | 52 | 57 | 59 | 60 | 49.3 | 6.77
P08603 | Complement factor H [CFAH_HUMAN] | 5 | 18 | 23 | 24 | 30 | 19 | 139.0 | 6.61
B7ZLE5 | FN1 protein [B7ZLE5_HUMAN] | 5 | 130 | 151 | 185 | 189 | 204 | 246.5 | 6.06
P02751 | Fibronectin [FINC_HUMAN] | 5 | 0 | 150 | 185 | 189 | 0 | 262.5 | 5.71

P02649 | Apolipoprotein E [APOE_HUMAN] | 4 | 32 | 43 | 40 | 51 | 24 | 36.1 | 5.73
A6XMH5 | Beta-2-microglobulin [A6XMH5_HUMAN] | 4 | 0 | 0 | 0 | 0 | 0 | 10.4 | 8.02
P13796 | Plastin-2 [PLSL_HUMAN] | 3 | 2 | 1 | 0 | 3 | 0 | 70.2 | 5.43
Q5HYB6 | Putative uncharacterized protein DKFZp686J1372 [Q5HYB6_HUMAN] | 2 | 14 | 21 | 26 | 25 | 29 | 27.2 | 4.74
B4DPN0 | cDNA FLJ51265, moderately similar to Beta-2-glycoprotein 1 (Beta-2-glycoprotein I) [B4DPN0_HUMAN] | 2 | 0 | 0 | 0 | 0 | 0 | 30.3 | 7.85
B4DUI5 | Triosephosphate isomerase [B4DUI5_HUMAN] | 2 | 0 | 0 | 0 | 0 | 0 | 22.9 | 6.92
P07900 | Heat shock protein HSP 90-alpha [HS90A_HUMAN] | 2 | 71 | 87 | 99 | 111 | 93 | 84.6 | 5.02
P35527 | Keratin, type I cytoskeletal 9 [K1C9_HUMAN] | 2 | 22 | 36 | 33 | 47 | 64 | 62.0 | 5.24
P62258 | 14-3-3 protein epsilon [1433E_HUMAN] | 2 | 15 | 20 | 19 | 23 | 24 | 29.2 | 4.74
B4DNE0 | cDNA FLJ52573, highly similar to Elongation factor 1-alpha 1 [B4DNE0_HUMAN] | 2 | 4 | 6 | 10 | 15 | 9 | 42.6 | 9.01
B2R7Z6 | cDNA, FLJ93674 [B2R7Z6_HUMAN] | 2 | 2 | 0 | 1 | 2 | 0 | 52.5 | 7.55
Q9HC84 | Mucin-5B [MUC5B_HUMAN] | 2 | 0 | 0 | 0 | 1 | 1 | 596.0 | 6.64
Q08380 | Galectin-3-binding protein [LG3BP_HUMAN] | 1 | 38 | 51 | 64 | 67 | 69 | 65.3 | 5.27
Q96GV1 | Enolase (Fragment) [Q96GV1_HUMAN] | 1 | 0 | 0 | 0 | 0 | 0 | 20.4 | 7.03
P31946 | 14-3-3 protein beta/alpha [1433B_HUMAN] | 0 | 13 | 20 | 18 | 16 | 0 | 28.1 | 4.83
Q04917 | 14-3-3 protein eta [1433F_HUMAN] | 0 | 0 | 18 | 17 | 19 | 18 | 28.2 | 4.84
P31947 | 14-3-3 protein sigma [1433S_HUMAN] | 0 | 12 | 12 | 13 | 0 | 0 | 27.8 | 4.74
P63104 | 14-3-3 protein zeta/delta [1433Z_HUMAN] | 0 | 29 | 33 | 29 | 27 | 25 | 27.7 | 4.79
C9J9K3 | 40S ribosomal protein SA (Fragment) [C9J9K3_HUMAN] | 0 | 0 | 2 | 0 | 4 | 6 | 29.5 | 5.25
P11021 | 78 kDa glucose-regulated protein [GRP78_HUMAN] | 0 | 4 | 6 | 7 | 5 | 0 | 72.3 | 5.16

H0YN26 | Acidic leucine-rich nuclear phosphoprotein 32 family member A [H0YN26_HUMAN] | 0 | 0 | 6 | 4 | 6 | 0 | 20.0 | 4.58
B2RDY9 | Adenylyl cyclase-associated protein [B2RDY9_HUMAN] | 0 | 21 | 21 | 21 | 21 | 25 | 51.6 | 8.22
P43652 | Afamin [AFAM_HUMAN] | 0 | 0 | 0 | 0 | 0 | 3 | 69.0 | 5.90
B4YAH7 | ALDH2 (Fragment) [B4YAH7_HUMAN] | 0 | 0 | 4 | 3 | 0 | 3 | 26.6 | 5.90
P12814 | Alpha-actinin-1 [ACTN1_HUMAN] | 0 | 14 | 33 | 32 | 22 | 24 | 103.0 | 5.41
O43707 | Alpha-actinin-4 [ACTN4_HUMAN] | 0 | 20 | 49 | 54 | 38 | 47 | 104.8 | 5.44
P03973 | Antileukoproteinase [SLPI_HUMAN] | 0 | 6 | 9 | 6 | 14 | 5 | 14.3 | 8.75
P06727 | Apolipoprotein A-IV [APOA4_HUMAN] | 0 | 30 | 48 | 42 | 57 | 67 | 45.4 | 5.38
P05090 | Apolipoprotein D [APOD_HUMAN] | 0 | 15 | 24 | 26 | 32 | 30 | 21.3 | 5.15
P53396 | ATP-citrate synthase [ACLY_HUMAN] | 0 | 3 | 0 | 0 | 0 | 0 | 120.8 | 7.33
P20160 | Azurocidin [CAP7_HUMAN] | 0 | 0 | 3 | 7 | 6 | 9 | 26.9 | 9.50
Q53G71 | Calreticulin variant (Fragment) [Q53G71_HUMAN] | 0 | 34 | 30 | 66 | 47 | 55 | 46.9 | 4.45
H0YD13 | CD44 antigen (Fragment) [H0YD13_HUMAN] | 0 | 0 | 2 | 5 | 3 | 0 | 20.2 | 8.27
A1L0W4 | CDC37 protein (Fragment) [A1L0W4_HUMAN] | 0 | 7 | 8 | 11 | 9 | 12 | 12.2 | 4.65
B3KPS3 | cDNA FLJ32131 fis, clone PEBLM2000267, highly similar to Tubulin alpha-ubiquitous chain [B3KPS3_HUMAN] | 0 | 21 | 22 | 22 | 23 | 20 | 46.2 | 5.12
B3KTV0 | cDNA FLJ38781 fis, clone LIVER2000216, highly similar to HEAT SHOCK COGNATE 71 kDa PROTEIN [B3KTV0_HUMAN] | 0 | 0 | 0 | 6 | 6 | 5 | 67.9 | 5.45
Q6ZVC6 | cDNA FLJ42761 fis, clone BRAWH3002574, highly similar to Calpain 2, large [Q6ZVC6_HUMAN] | 0 | 2 | 0 | 0 | 0 | 0 | 24.3 | 5.95
B3KY04 | cDNA FLJ46506 fis, clone THYMU3030752, highly similar to BTB/POZ domain-containing protein KCTD12 [B3KY04_HUMAN] | 0 | 0 | 3 | 0 | 0 | 0 | 35.7 | 5.64
B4DU58 | cDNA FLJ51488, highly similar to Macrophage capping protein [B4DU58_HUMAN] | 0 | 0 | 0 | 0 | 3 | 0 | 36.2 | 6.25
B4DE78 | cDNA FLJ52141, highly similar to 14-3-3 protein gamma [B4DE78_HUMAN] | 0 | 14 | 18 | 17 | 18 | 17 | 23.5 | 4.82
B7Z8S8 | cDNA FLJ52193, highly similar to Calpastatin [B7Z8S8_HUMAN] | 0 | 4 | 3 | 4 | 3 | 5 | 37.7 | 4.61
B4DUV1 | cDNA FLJ53207, highly similar to Homo sapiens fibulin 1 (FBLN1), transcript variant C, mRNA [B4DUV1_HUMAN] | 0 | 29 | 25 | 25 | 25 | 24 | 70.1 | 5.26
B4E336 | cDNA FLJ53287, highly similar to CD97 antigen [B4E336_HUMAN] | 0 | 0 | 2 | 2 | 0 | 0 | 72.2 | 7.58
B4DP93 | cDNA FLJ53437, highly similar to Major vault protein [B4DP93_HUMAN] | 0 | 2 | 0 | 0 | 0 | 0 | 87.5 | 5.66
B4DMA2 | cDNA FLJ54023, highly similar to Heat shock protein HSP 90-beta [B4DMA2_HUMAN] | 0 | 45 | 60 | 73 | 84 | 70 | 79.1 | 5.02
B4E2T4 | cDNA FLJ54121, highly similar to Cysteine and glycine-rich protein 1 [B4E2T4_HUMAN] | 0 | 2 | 0 | 0 | 0 | 0 | 15.0 | 8.57
B4E356 | cDNA FLJ54993, highly similar to ATP-dependent DNA helicase 2 subunit 1 (EC 3.6.1.-) [B4E356_HUMAN] | 0 | 0 | 4 | 6 | 0 | 0 | 54.4 | 8.87
B7Z539 | cDNA FLJ56954, highly similar to Inter-alpha-trypsin inhibitor heavy chain H1 [B7Z539_HUMAN] | 0 | 18 | 22 | 24 | 36 | 31 | 72.1 | 7.68
B4E1F5 | cDNA FLJ57475, highly similar to Pulmonary surfactant-associated protein B [B4E1F5_HUMAN] | 0 | 0 | 7 | 6 | 6 | 3 | 38.5 | 5.96
B4DNL5 | cDNA FLJ59361, highly similar to Protein disulfide-isomerase (EC 5.3.4.1) [B4DNL5_HUMAN] | 0 | 5 | 13 | 19 | 18 | 19 | 55.3 | 4.79
B4DF70 | cDNA FLJ60461, highly similar to Peroxiredoxin-2 (EC 1.11.1.15) [B4DF70_HUMAN] | 0 | 0 | 0 | 5 | 0 | 0 | 20.1 | 8.78

A8K0T9 | cDNA FLJ75422, highly similar to Homo sapiens capping protein (actin filament) muscle Z-line, alpha 1, mRNA [A8K0T9_HUMAN] | 0 | 0 | 3 | 0 | 3 | 0 | 32.9 | 5.69
A8K651 | cDNA FLJ75700, highly similar to Homo sapiens complement component 1, q subcomponent binding protein (C1QBP) | 0 | 0 | 4 | 7 | 6 | 9 | 31.4 | 4.84
A8K2L4 | cDNA FLJ76079, highly similar to Homo sapiens lymphocyte-specific protein 1 (LSP1), mRNA [A8K2L4_HUMAN] | 0 | 3 | 5 | 8 | 8 | 13 | 37.2 | 4.74
A8KAJ3 | cDNA FLJ77823, highly similar to Homo sapiens EGF-containing fibulin-like extracellular matrix protein 1, transcript variant 3, mRNA [A8KAJ3_HUMAN] | 0 | 4 | 3 | 4 | 3 | 2 | 54.6 | 5.14
A8K8Z4 | cDNA FLJ78071, highly similar to Human MHC class III complement component C6 mRNA [A8K8Z4_HUMAN] | 0 | 0 | 0 | 3 | 0 | 0 | 104.6 | 6.62
B2R7F8 | cDNA, FLJ93426, highly similar to Homo sapiens plasminogen (PLG), mRNA [B2R7F8_HUMAN] | 0 | 5 | 5 | 6 | 0 | 0 | 90.5 | 7.24
B2RCX0 | cDNA, FLJ96345, Homo sapiens SET translocation (myeloid leukemia-associated) (SET), mRNA [B2RCX0_HUMAN] | 0 | 0 | 11 | 11 | 11 | 8 | 32.1 | 4.22
O00299 | Chloride intracellular channel protein 1 [CLIC1_HUMAN] | 0 | 12 | 15 | 11 | 14 | 13 | 26.9 | 5.17
P10909 | Clusterin [CLUS_HUMAN] | 0 | 41 | 43 | 49 | 58 | 58 | 52.5 | 6.27
E9PLJ3 | Cofilin-1 (Fragment) [E9PLJ3_HUMAN] | 0 | 0 | 12 | 7 | 6 | 5 | 9.1 | 8.38
P02745 | Complement C1q subcomponent subunit A [C1QA_HUMAN] | 0 | 4 | 0 | 0 | 0 | 0 | 26.0 | 9.11
D6RA08 | Complement C1q subcomponent subunit B (Fragment) [D6RA08_HUMAN] | 0 | 11 | 9 | 12 | 10 | 10 | 24.1 | 9.16
P02747 | Complement C1q subcomponent subunit C [C1QC_HUMAN] | 0 | 15 | 11 | 13 | 12 | 7 | 25.8 | 8.41

F8WCZ6 | Complement C1s subcomponent [F8WCZ6_HUMAN] | 0 | 7 | 9 | 9 | 8 | 10 | 57.5 | 5.45
P01031 | Complement C5 [CO5_HUMAN] | 0 | 4 | 0 | 0 | 0 | 0 | 188.2 | 6.52
P02748 | Complement component C9 [CO9_HUMAN] | 0 | 19 | 21 | 18 | 14 | 14 | 63.1 | 5.59
B4DMH3 | Coronin [B4DMH3_HUMAN] | 0 | 0 | 3 | 0 | 0 | 2 | 49.3 | 7.02
P31146 | Coronin-1A [COR1A_HUMAN] | 0 | 5 | 6 | 6 | 17 | 20 | 51.0 | 6.68
Q9BR76 | Coronin-1B [COR1B_HUMAN] | 0 | 0 | 0 | 2 | 6 | 0 | 54.2 | 5.88
Q5VVP7 | C-reactive protein(1-205) [Q5VVP7_HUMAN] | 0 | 0 | 7 | 6 | 8 | 9 | 11.6 | 8.46
Q9UGM3 | Deleted in malignant brain tumors 1 protein [DMBT1_HUMAN] | 0 | 4 | 0 | 6 | 0 | 0 | 260.6 | 5.44
Q16555 | Dihydropyrimidinase-related protein 2 | 0 | 4 | 7 | 11 | 3 | 0 | 62.3 | 6.38
P13639 | Elongation factor 2 [EF2_HUMAN] | 0 7 3 2 6 95.3 6.83
P63241 | Eukaryotic translation initiation factor 5A-1 [IF5A1_HUMAN] | 0 | 1 | 0 | 0 | 0 | 0 | 16.8 | 5.24
Q86UX7 | Fermitin family homolog 3 [URP2_HUMAN] | 0 | 2 | 0 | 2 | 0 | 0 | 75.9 | 6.98
Q5HY54 | Filamin-A [Q5HY54_HUMAN] | 0 | 2 | 6 | 8 | 13 | 8 | 276.4 | 6.05
Q86TT1 | Full-length cDNA clone CS0DD006YL02 of Neuroblastoma of Homo sapiens (human) [Q86TT1_HUMAN] | 0 | 47 | 0 | 0 | 0 | 0 | 41.2 | 6.79
Q86TY5 | Full-length cDNA clone CS0DI041YE05 of Placenta of Homo sapiens (human) [Q86TY5_HUMAN] | 0 | 0 | 3 | 2 | 0 | 0 | 13.9 | 9.16
P09382 | Galectin-1 [LEG1_HUMAN] | 0 | 0 | 1 | 0 | 6 | 5 | 14.7 | 5.50
P06396 | Gelsolin [GELS_HUMAN] | 0 | 29 | 46 | 37 | 49 | 50 | 85.6 | 6.28
P04406 | Glyceraldehyde-3-phosphate dehydrogenase [G3P_HUMAN] | 0 | 0 | 0 | 3 | 11 | 35 | 36.0 | 8.46
B4DY72 | Heat shock protein 105 kDa [B4DY72_HUMAN] | 0 | 0 | 0 | 3 | 0 | 0 | 77.1 | 5.10
P04792 | Heat shock protein beta-1 [HSPB1_HUMAN] | 0 | 0 | 2 | 0 | 0 | 0 | 22.8 | 6.40
A8K8G0 | Hepatoma-derived growth factor [A8K8G0_HUMAN] | 0 | 0 | 0 | 0 | 0 | 2 | 22.9 | 4.55
Q5T6W5 | Heterogeneous nuclear ribonucleoprotein K [Q5T6W5_HUMAN] | 0 | 4 | 0 | 0 | 0 | 0 | 47.5 | 5.63
G3V2D6 | Heterogeneous nuclear ribonucleoproteins C1/C2 (Fragment) [G3V2D6_HUMAN] | 0 | 0 | 4 | 7 | 10 | 9 | 23.8 | 4.78
Q5T7C4 | High mobility group protein B1 [Q5T7C4_HUMAN] | 0 | 0 | 0 | 2 | 0 | 6 | 18.3 | 9.70
Q96KK5 | Histone H2A type 1-H [H2A1H_HUMAN] | 0 | 6 | 13 | 14 | 19 | 13 | 13.9 | 10.89
A8K9J7 | Histone H2B [A8K9J7_HUMAN] | 0 | 10 | 30 | 29 | 23 | 22 | 14.0 | 10.32
K7EMV3 | Histone H3 [K7EMV3_HUMAN] | 0 | 0 | 4 | 2 | 4 | 2 | 10.3 | 11.82
P62805 | Histone H4 [H4_HUMAN] | 0 | 3 | 12 | 4 | 13 | 4 | 11.4 | 11.36
Q5DT20 | Hornerin [Q5DT20_HUMAN] | 0 | 10 | 0 | 0 | 0 | 0 | 282.2 | 10.02

P01861 | Ig gamma-4 chain C region [IGHG4_HUMAN] | 0 | 34 | 32 | 35 | 47 | 0 | 35.9 | 7.36
Q9Y6R7 | IgGFc-binding protein [FCGBP_HUMAN] | 0 | 3 | 6 | 3 | 0 | 2 | 571.6 | 5.34
Q6P5S8 | IGK@ protein [Q6P5S8_HUMAN] | 0 | 89 | 90 | 135 | 109 | 97 | 25.8 | 6.33
Q6GMW4 | IGL@ protein [Q6GMW4_HUMAN] | 0 | 0 | 45 | 42 | 0 | 0 | 24.8 | 6.74
Q5T985 | Inter-alpha-trypsin inhibitor heavy chain H2 [Q5T985_HUMAN] | 0 | 46 | 65 | 72 | 94 | 86 | 105.2 | 7.03
Q06033 | Inter-alpha-trypsin inhibitor heavy chain H3 [ITIH3_HUMAN] | 0 | 7 | 6 | 12 | 12 | 14 | 99.8 | 5.74
A4QPB0 | IQ motif containing GTPase activating protein 1 [A4QPB0_HUMAN] | 0 | 3 | 0 | 2 | 0 | 0 | 189.2 | 6.48
P13645 | Keratin, type I cytoskeletal 10 [K1C10_HUMAN] | 0 | 24 | 46 | 28 | 39 | 95 | 58.8 | 5.21
P08727 | Keratin, type I cytoskeletal 19 [K1C19_HUMAN] | 0 | 0 | 11 | 8 | 0 | 0 | 44.1 | 5.14
P35908 | Keratin, type II cytoskeletal 2 epidermal [K22E_HUMAN] | 0 | 0 | 16 | 10 | 0 | 36 | 65.4 | 8.00
B4E1C2 | Kininogen 1, isoform CRA_b [B4E1C2_HUMAN] | 0 | 5 | 12 | 17 | 14 | 10 | 71.9 | 6.81
Q969I0 | KRT8 protein (Fragment) [Q969I0_HUMAN] | 0 | 14 | 21 | 19 | 20 | 28 | 41.1 | 5.00
A2NUT2 | Lambda-chain (AA -20 to 215) [A2NUT2_HUMAN] | 0 | 0 | 0 | 53 | 0 | 0 | 24.6 | 7.62
P20700 | Lamin-B1 [LMNB1_HUMAN] | 0 | 6 | 6 | 7 | 6 | 7 | 66.4 | 5.16
Q8TCF0 | LBP protein [Q8TCF0_HUMAN] | 0 | 3 | 0 | 0 | 0 | 0 | 52.9 | 6.76
P51884 | Lumican [LUM_HUMAN] | 0 | 45 | 55 | 49 | 50 | 61 | 38.4 | 6.61
Q6PJT4 | MSN protein (Fragment) [Q6PJT4_HUMAN] | 0 | 0 | 13 | 12 | 14 | 0 | 38.8 | 9.35
P05164 | Myeloperoxidase [PERM_HUMAN] | 0 | 7 | 11 | 15 | 13 | 16 | 83.8 | 8.97
F8W1R7 | Myosin light polypeptide 6 [F8W1R7_HUMAN] | 0 | 8 | 13 | 10 | 13 | 22 | 16.3 | 4.65
P35579 | Myosin-9 [MYH9_HUMAN] | 0 | 29 | 32 | 39 | 31 | 34 | 226.4 | 5.60
Q9UJ70 | N-acetyl-D-glucosamine kinase [NAGK_HUMAN] | 0 | 2 | 4 | 5 | 0 | 0 | 37.4 | 6.24
O96009 | Napsin-A [NAPSA_HUMAN] | 0 | 2 | 5 | 6 | 0 | 0 | 45.4 | 6.61
Q09666 | Neuroblast differentiation-associated protein AHNAK [AHNK_HUMAN] | 0 | 0 | 2 | 3 | 5 | 3 | 628.7 | 6.15
C9JKZ2 | Nucleobindin-1 (Fragment) [C9JKZ2_HUMAN] | 0 | 0 | 0 | 4 | 5 | 0 | 31.1 | 6.01
C9JZI7 | Nucleosome assembly protein 1-like 4 (Fragment) [C9JZI7_HUMAN] | 0 | 2 | 0 | 0 | 0 | 0 | 31.8 | 5.01
Q06830 | Peroxiredoxin-1 [PRDX1_HUMAN] | 0 | 7 | 7 | 4 | 17 | 12 | 22.1 | 8.13
A2VCQ4 | PRKCSH protein (Fragment) [A2VCQ4_HUMAN] | 0 | 2 | 4 | 3 | 8 | 6 | 20.3 | 4.72
B4DP21 | Prostaglandin E synthase 3 [B4DP21_HUMAN] | 0 | 0 | 0 | 1 | 0 | 0 | 14.9 | 4.77
B7Z478 | Proteasome (Prosome, macropain) subunit, beta type, 2, isoform CRA_b [B7Z478_HUMAN] | 0 | 2 | 4 | 6 | 3 | 0 | 20.2 | 7.44
H0YM70 | Proteasome activator complex subunit 2 [H0YM70_HUMAN] | 0 | 0 | 3 | 0 | 0 | 0 | 26.0 | 5.92
P25786 | Proteasome subunit alpha type-1 [PSA1_HUMAN] | 0 | 13 | 26 | 28 | 32 | 24 | 29.5 | 6.61
P25787 | Proteasome subunit alpha type-2 [PSA2_HUMAN] | 0 | 8 | 12 | 10 | 12 | 13 | 25.9 | 7.43

P25788 | Proteasome subunit alpha type-3 [PSA3_HUMAN] | 0 | 0 | 3 | 6 | 0 | 0 | 28.4 | 5.33
P25789 | Proteasome subunit alpha type-4 [PSA4_HUMAN] | 0 | 1 | 9 | 10 | 10 | 7 | 29.5 | 7.72
P28066 | Proteasome subunit alpha type-5 [PSA5_HUMAN] | 0 | 9 | 12 | 13 | 16 | 15 | 26.4 | 4.79
P60900 | Proteasome subunit alpha type-6 [PSA6_HUMAN] | 0 | 11 | 9 | 8 | 5 | 10 | 27.4 | 6.76
O14818 | Proteasome subunit alpha type-7 [PSA7_HUMAN] | 0 | 11 | 19 | 26 | 22 | 19 | 27.9 | 8.46
A2ACR1 | Proteasome subunit beta type [A2ACR1_HUMAN] | 0 | 3 | 6 | 6 | 8 | 5 | 20.9 | 4.89
P20618 | Proteasome subunit beta type-1 [PSB1_HUMAN] | 0 | 0 | 6 | 4 | 7 | 8 | 26.5 | 8.13
P40306 | Proteasome subunit beta type-10 [PSB10_HUMAN] | 0 | 0 | 0 | 7 | 0 | 0 | 28.9 | 7.81
P49720 | Proteasome subunit beta type-3 [PSB3_HUMAN] | 0 | 0 | 6 | 4 | 3 | 0 | 22.9 | 6.55
P28070 | Proteasome subunit beta type-4 [PSB4_HUMAN] | 0 | 3 | 13 | 9 | 3 | 5 | 29.2 | 5.97
B0UZC1 | Proteasome subunit beta type-8 [B0UZC1_HUMAN] | 0 | 5 | 6 | 8 | 10 | 12 | 27.8 | 7.80
P02760 | Protein AMBP [AMBP_HUMAN] | 0 | 23 | 17 | 19 | 21 | 22 | 39.0 | 6.25
P13667 | Protein disulfide-isomerase A4 [PDIA4_HUMAN] | 0 | 6 | 12 | 15 | 12 | 10 | 72.9 | 5.07
B7Z254 | Protein disulfide-isomerase A6 [B7Z254_HUMAN] | 0 | 5 | 8 | 10 | 8 | 6 | 47.8 | 5.08
B5MCY6 | Protein phosphatase 1 regulatory subunit 7 [B5MCY6_HUMAN] | 0 | 0 | 0 | 2 | 0 | 0 | 25.6 | 5.02
C9J2F3 | Protein phosphatase 1F (Fragment) [C9J2F3_HUMAN] | 0 | 0 | 0 | 0 | 1 | 0 | 10.0 | 8.69
P00734 | Prothrombin [THRB_HUMAN] | 0 | 26 | 29 | 26 | 22 | 20 | 70.0 | 5.90
Q9UMZ1 | Prothymosin a14 [Q9UMZ1_HUMAN] | 0 | 0 | 9 | 8 | 0 | 0 | 11.1 | 3.79
Q6MZU6 | Putative uncharacterized protein DKFZp686C15213 [Q6MZU6_HUMAN] | 0 | 27 | 0 | 0 | 0 | 0 | 51.1 | 7.71

F2Z2Y4 | Pyridoxal kinase [F2Z2Y4_HUMAN] | 0 | 3 | 14 | 17 | 12 | 16 | 30.6 | 6.65
P14618 | Pyruvate kinase PKM [KPYM_HUMAN] | 0 | 30 | 38 | 33 | 28 | 30 | 57.9 | 7.84
2493539 | RecName: Full=Catalase B; AltName: Full=Antigenic catalase; AltName: Full=Slow catalase; Flags: Precursor | 0 | 0 | 0 | 0 | 24 | 244 | 79.9 | 5.82
J3KTF8 | Rho GDP-dissociation inhibitor 1 (Fragment) [J3KTF8_HUMAN] | 0 | 0 | 0 | 0 | 8 | 0 | 21.6 | 5.49
F5GZZ9 | Scavenger receptor cysteine-rich type 1 protein M130 | 0 | 9 | 8 | 8 | 9 | 17 | 120.2 | 6.10
D3DQX7 | Serum amyloid A protein [D3DQX7_HUMAN] | 0 | 11 | 19 | 11 | 16 | 8 | 13.6 | 6.79
B2R5G8 | Serum amyloid A protein [B2R5G8_HUMAN] | 0 | 5 | 12 | 8 | 13 | 9 | 14.8 | 8.98
P02743 | Serum amyloid P-component [SAMP_HUMAN] | 0 | 0 | 2 | 0 | 0 | 0 | 25.4 | 6.54
P27169 | Serum paraoxonase/arylesterase 1 [PON1_HUMAN] | 0 | 3 | 1 | 3 | 0 | 0 | 39.7 | 5.22
Q13813 | Spectrin alpha chain, non-erythrocytic 1 [SPTN1_HUMAN] | 0 | 0 | 8 | 12 | 8 | 12 | 284.4 | 5.35
Q5TCU6 | Talin-1 [Q5TCU6_HUMAN] | 0 | 7 | 2 | 0 | 0 | 0 | 257.9 | 6.49
F5H7V9 | Tenascin [F5H7V9_HUMAN] | 0 | 0 | 2 | 3 | 0 | 0 | 201.0 | 5.03
C9JGI3 | Thymidine phosphorylase (Fragment) [C9JGI3_HUMAN] | 0 | 12 | 15 | 12 | 11 | 16 | 46.1 | 5.52
B8ZZQ6 | Thymosin alpha-1 [B8ZZQ6_HUMAN] | 0 | 0 | 9 | 8 | 0 | 0 | 11.8 | 3.81
G8JLA8 | Transforming growth factor-beta-induced protein ig-h3 [G8JLA8_HUMAN] | 0 | 6 | 4 | 3 | 0 | 0 | 74.6 | 7.25
P67936 | Tropomyosin alpha-4 chain [TPM4_HUMAN] | 0 | 15 | 23 | 33 | 32 | 44 | 28.5 | 4.69
P23381 | Tryptophan--tRNA ligase, cytoplasmic [SYWC_HUMAN] | 0 | 9 | 2 | 2 | 12 | 9 | 53.1 | 6.23
P07437 | Tubulin beta chain [TBB5_HUMAN] | 0 | 7 | 9 | 14 | 10 | 10 | 49.6 | 4.89
Q5CAQ5 | Tumor rejection antigen (Gp96) 1 [Q5CAQ5_HUMAN] | 0 | 8 | 14 | 16 | 16 | 14 | 92.3 | 4.86
Q96IF9 | VCP protein (Fragment) [Q96IF9_HUMAN] | 0 | 7 | 0 | 8 | 4 | 3 | 71.0 | 5.06

P08670 | Vimentin [VIME_HUMAN] | 0 | 55 | 100 | 103 | 120 | 118 | 53.6 | 5.12
P04004 | Vitronectin [VTNC_HUMAN] | 0 | 23 | 26 | 31 | 29 | 31 | 54.3 | 5.80
B7Z1R5 | V-type proton ATPase catalytic subunit A [B7Z1R5_HUMAN] | 0 | 8 | 8 | 12 | 11 | 9 | 64.7 | 5.66
C9J8H1 | V-type proton ATPase subunit E 1 (Fragment) [C9J8H1_HUMAN] | 0 0 3 0 3 323.5 9.04 9.04
