<<

Scandinavian Journal of Clinical & Laboratory Investigation, 2010; 70(Suppl 242): 19–22

ORIGINAL ARTICLE

Molecular biology tools: techniques in discovery

FRIEDRICH LOTTSPEICH, JOSEF KELLERMANN & EVA-MARIA KEIDEL

Max-Planck-Institute of , Protein Analytics, Germany

Abstract Despite worldwide efforts biomarker discovery by plasma proteomics was not successful so far. Several reasons for this failure are obvious. Mainly, proteome diversity is remarkable between different individuals and is caused by genetic, envi- ronmental and life style parameters. To recognize disease related proteins that could serve as potential is only feasible by investigating a non realizable large number of patients. Furthermore, plasma proteomics comprises enormous technical hurdles for quantitative analysis. High reproducibility of blood sampling in clinical routine is hard to achieve. Quantitative proteome analysis has to struggle with the complexity of millions of protein species comprising typical plasma proteins, cellular leakage proteins and antibodies and concentration differences of more than 1011 between high and low abundant proteins. Therefore, no successful quantitative and comprehensive plasma proteome analysis is reported so far. A novel proteomics strategy is proposed for biomarker discovery in plasma. Instead of comparing the plasma proteome of different individuals it is recommended to analyze the proteomes of different time points of a single individual during the development of a disease. This strategy is realized by the use of plasma of the Bavarian Red Cross Blood Bank, were three million samples are stored under standardized conditions. To achieve reliable data the isotope coded protein labelling proteomics technology was used.

Key Words: individualized proteomics , ICPL , ICPLQuant For personal use only.

Introduction blood collection, hardly achievable in rou- So far, only very few biomarkers have been discovered tine clinical work. using proteomics techniques. Cellular proteomics, ii) The enormous complexity of several millions analyzing the protein pattern of relevant tissues from of molecular distinct protein species and the diseased patients compared with healthy control per- wide dynamic range of more than 10 orders sons, is state of the art. This strategy usually requires of magnitude between high and low abun- invasive sample collection and enrichment of the dis- dant proteins confronts biomarker discovery eased material by e.g. microdissection. After quite with almost intractable technical challenges. sophisticated quantitative analyses using DIGE or All protein fractionation techniques neces- based techniques numerous bio- sary for the reduction of complexity are con- Scand J Clin Lab Invest Downloaded from informahealthcare.com by Max-Planck-Gesellschaft on 03/14/11 marker candidates usually can be found. However, nected with unpredictable protein loss and almost all of them do not survive the validation phase. unpredictable recovery. Especially low abun- In contrast, proteomics starting from blood plasma is dant proteins, suspected to contain the noninvasive and easily applicable and thus much majority of potentially relevant biomarkers, more desirable to discover prognostic and diagnostic are currently out of reach for the common markers. Unfortunately, despite incredible academic proteomics analysis methods. For all these and industrial efforts, plasma proteomics has not reasons, today very few reports on quantita- delivered so far. The reasons are rather obvious. tive plasma proteomics are published. iii) Usually many differences in the proteome i) Sample acquisition is critical and reproduc- pattern of different individuals can be ible plasma samples can only be obtained by detected. However, due to their high number, precise standardized operating procedures of the correlation of any of these changes with a

Correspondence: Friedrich Lottspeich, Max-Planck-Institute of Biochemistry, Protein Analytics, Germany. E-mail: [email protected]

ISSN 0036-5513 print/ISSN 1502-7686 online © 2010 Informa UK Ltd. (Informa Healthcare, Taylor & Francis AS) DOI: 10.3109/00365513.2010.493359 20 F. Lottspeich et al.

disease state is diffi cult. The patient numbers to a single distinct protein species, produce required for solid statistics can hardly be in principle much more meaningful results. investigated due to cost and time constrains. This can be achieved by protein based (top- down) proteomics strategies. Several strategic considerations have to be made iv) The individual heterogeneity is probably the to overcome the causes of failure of plasma proteom- major cause for the ineffi cacy of proteomics ics studies. in biomarker discovery. The protein patterns of two individuals are much more different i) Standardization of sample collection is man- than the protein patterns of one single indi- datory. Nevertheless, in clinical routine vidual in diseased and healthy state. To dis- reproducible sampling of plasma will remain criminate between individual heterogeneity, diffi cult. Consequently, only rather robust caused by genetic and environmental factors markers can be detected. and disease related protein changes is ii) Doubtlessly, the proteomics techniques for extremely challenging. This can only be over- biomarker discovery have to be quantitative. come by either a large number of individuals Label free techniques have been quite suc- analyzed, which is prohibitively costly, or by cessfully applied in low complex systems and a kind of personalized proteomics approach, most of the published plasma proteomics where healthy and diseased states of one sin- projects have used label free techniques. gle individual are compared. Here at least However, these techniques are probably not genetic variability can be neglected. really suited for plasma proteomics. Since the reduction of complexity has to involve An example of a feasibility study multidimensional fractionation steps, unpre- dictable loss of distinct proteins can occur. A biomarker discovery study on colon cancer was Unfortunately, these losses depend on the carried out to evaluate plasma proteomics in the light composition of the sample, which in the case of the arguments given. Following the idea of an indi- of plasma can be quite different with differ- vidualized proteomics strategy, samples from indi- ent blood donors (e.g. , carbohydrates, vidual blood donors were chosen as starting material. etc.). A further disadvantage is that label free The Blutspendedienst of the Bavarian Red Cross col- proteomics techniques have to be performed lects hundreds of thousands blood samples a year. separately for each sample. Better suited for For legal reasons, a portion of these samples has to For personal use only. plasma proteomics are techniques based on be stored in a blood bank for several years. After isotopic labelling where up to four samples informed consent of the blood donor, these samples can be multiplexed at a time. The quantita- may then become accessible for research. 16 blood tive ratios of all the proteins in different donors were selected, who have donated blood samples are fi xed at an early stage and quan- repeatedly and who were diagnosed with colon can- titative results remain valid even after multi- cer with similar staging and grading. Additionally, a dimensional fractionation steps. blood sample after successful therapy was obtained. iii) The type of proteomics technique applied As quantitative proteomics technique the protein plays a major role on the accuracy and the based isotope coded protein label (ICPL) strategy validity of the results. Entirely peptide based was chosen [1]. With the ICPL technique up to four (bottom up) strategies, where the proteome samples can be compared in one single experiment.

Scand J Clin Lab Invest Downloaded from informahealthcare.com by Max-Planck-Gesellschaft on 03/14/11 is cleaved into peptides as a fi rst step, carries Dedicated software, ICPLQuant [2] covers the whole especially for plasma a certain risk. After ICPL workfl ow shown in Figure 1. enzymatic cleavage a certain peptide may be The 20 most abundant proteins account for about released from different protein species. Thus, 97% of the protein content of blood plasma. Any the quantitative analysis of a peptide may not analysis performed directly on whole plasma mainly refl ect the amount of one distinct protein but results in data derived from these major proteins and rather the amount of all protein species and will thereby obscure information on low abundance isoforms containing this peptide. In plasma proteins. Therefore, the 20 most abundant proteins it is well documented that many proteins were extracted by immunoaffi nity chromatography appear in tens of different forms (genetic (IGMA). The bound fraction containing mainly isoforms, glycosylation heterogeneity, phos- albumin and immunoglobulin is eluted and can be phorylated forms, degradation products, further used e.g. to investigate the antibody reper- etc). All these individual protein species can toire of the patient [3]. In our opinion the depletion only be quantifi ed correctly if all the existing of the high abundant proteins is an indispensible forms are known, which is almost impossi- fractionation step for two reasons. The dynamic ble. Therefore, proteomic strategies, where range of the plasma proteome is reduced by at least the analyzed peptides are undoubtedly linked 2–3 orders of magnitude and the isotopic labelling Proteomics in biomarker discovery 21

Figure 1. Workfl ow of ICPL-based proteomics of human blood plasma.

of large protein quantities is rather expensive. At least 20 slices, the different molecular size fractions were 100 μ L of plasma (corresponding to about 7 mg pro- cleaved enzymatically and subjected to mass spec- tein) has to be used to get access to low abundant trometry for quantifi cation of the peptides. The proteins. Only then cellular leakage proteins which ICPLQuant software, specially designed to follow the are in the mg/L interval can be detected with state ICPL workfl ow, recognizes and quantifi es peptide of the art mass spectrometry instrumentation. multiplets and stores the data (e.g. mass, retention After depletion, less than 1 mg protein material from time, quantitative, identifi cations) in a database. Only a 100 μ L plasma proteome sample remains. This is peptides exhibiting the expected concentration change a reasonable amount for labeling and further sample pattern correlated to the progression of the disease workup. However, the depletion step is done without were further investigated for identifi cation by MSMS.

For personal use only. any isotopic control and the large amount of protein Quantitative peptide patterns were recorded for all 16 material removed may unintentionally eliminate cer- patients. However, bioinformatic analysis revealed no tain proteins from the plasma sample by specifi c or single protein consistently regulated in all patients. unspecifi c binding. Therefore, great care has to be The results obtained in this feasibility study taken to reproducibly perform the depletion under showed the high selectivity of the approach but sug- standard operating procedures. gested that the analysis depth of the proteomics study Following ICPL labelling, four samples derived was not suffi cient to reveal a new biomarker. How- from 4 different blood donation time points (as indi- ever, an important result supports the applied strat- cated in Figure 1) were combined and fractionated egy. When comparing the protein abundances in by preparative SDS-gels. After cutting the gels into different patients, only very few proteins are present Scand J Clin Lab Invest Downloaded from informahealthcare.com by Max-Planck-Gesellschaft on 03/14/11

Figure 2. Comparison of peptide amounts in two different patients (A) and two different time points of the same patient (B). 22 F. Lottspeich et al.

in similar concentrations. In contrast, comparing the most successful approach is to quantitatively com- protein pattern of different time points within one pare the proteomes of diseased tissues with their patient the majority of the proteins remain almost healthy counterparts. Proteins found to differ sig- unchanged (Figure 2). Only few proteins changed in nifi cantly in amount in the two proteomic states concentrations even if the time points of sample col- have then to be extensively validated for specifi city lection were more than two years apart. and sensitivity by other techniques. Some of these Currently, these samples are reanalyzed using putative biomarkers are disseminated into the a similar proteomics workfl ow but with much circulation and can be traced directly in blood by higher fractionation of the depleted plasma. This is targeted proteomics techniques with a high detect- obtained by depletion, ICPL labeling, multiplexing ability like SRM or immuno-assays. The major and extensive fractionation by preparative isoelectric obstacle in fi nding new biomarkers is the genetic focusing (OffGel, Agilent). After this, reversed phase polymorphism of different individuals. Therefore, fractionation of each of the isoelectric focusing frac- personalized proteomics techniques in combination tionation steps is performed on the peptide level. with accurate quantifi cation and sophisticated bio- However, since the number of fractions now become informatic methods may improve biomarker discov- quite large (i.e. several thousand fractions per ery in the future. patient), ESI-LC analyses are too time consuming. Automated sample preparation for rapid MALDI- Declaration of interest: The authors report no MS analyses is being developed. confl icts of interest. The authors alone are responsible for the content and writing of the paper.

Discussion and conclusion References The targeted mass spectrometry techniques to [1] Schmidt A, Kellermann J, Lottspeich F. A novel strategy for monitor protein changes in many samples are at a quantitative proteomics using isotope-coded protein labels. mature stage and expected to enter routine clinical Proteomics 2005;5:4–15. diagnosis and therapy monitoring rather soon. [2] Brunner A, Keidel E, Dosch D, Kellermann J, Lottspeich F. However, the targets of interest have to be well ICPL Quant – a software for non-isobaric isotopic labeling proteomics. Proteomics 2010;10:315–23. known and extensively characterized on the molec- [3] Seliger B, Kellner R. Design of proteome-based studies in ular level. Quantitative proteomics for biomarker combination with serology for the identifi cation of biomark- discovery still is in its infancy. Today, probably the ers and novel targets. Proteomics 2002;2:1641–51. For personal use only. Scand J Clin Lab Invest Downloaded from informahealthcare.com by Max-Planck-Gesellschaft on 03/14/11