Differential Expression Proteomics


Lecture 3: Differential Expression Proteomics
Arthur Moseley
Genome Academy, April 2013

Quantitative Mass Spectrometry of Peptides and Proteins
– Quantitative MS is easy to try, hard to do right
– Sets of “Light and Heavy” reagents can be used for relative quantitation
– Quantitative MS often relies on use of isotopically labeled authentic standards
– Spiking authentic stable-labeled molecules (peptides, drugs, pesticides, etc.) into samples provides for molar quantitation
  • THE gold standard approach for quantitative mass spectrometry
– Label-free quantitation is often very useful
  • Used for relative quantitation and “Top-3” mole quantitation
  • Ultimate flexibility in experimental design

“Old-School” Differential Expression Proteomics
The first mass spec based differential expression proteomics method (ICAT) was developed by Ruedi Aebersold (Nature Biotechnology, 17, 994, 1999)
[Figure: ICAT Reagent and Strategy]

Stable Isotope Labeling for Quantitative Proteomics: Lots of Options
Goshe and Smith, Curr Op in Biotech (2003) 14:101

Analytical Challenges Associated with Performing Quantitative Proteomics Using Chemical Isotopic Labeling
• Bypassing gels avoids problems with membrane proteins and other special cases
• Sample loading issues contributing to poor dynamic range are reduced
• Not all proteins contain the targeted amino acid (tag dependent consideration)
• Post-translational modifications can be missed (tag dependent)
• Quantitation from LC/MS: relative intensities of isotope clusters
• Qualitative identification from LC/MS/MS: peptide sequencing (MS/MS)
• Analytical challenge: very complex mixtures (30,000+ peptides/sample) are made more complex by isotope labeling (doubles the number of analytes)
  – Pre-fractionate samples
  – Multidimensional analytical HPLC (capillary LC/LC/MS/MS)

Applied Biosystems iTRAQ reagents use isobaric tags
• Multiple tags present with the same nominal mass in survey spectra
• Quantitation is done during the MS/MS step, simultaneously with peptide identification
• Only peptides sequenced by MS/MS are quantified: a subset of all peptides present
• Label-free methods quantitate all species regardless of identification
http://docs.appliedbiosystems.com/pebiodocs/00113379.pdf

Metabolic Stable Isotope Coding
Goshe and Smith, Curr Op in Biotech (2003) 14:101

SILAC generates a lot of data regarding 2 samples
• Be aware of statistical limitations

Even when quantitative methods are used, most of the time the focus is on function. There is little attention to the details of quantitation. Such an approach is fundamentally flawed.

Forget not the basic principles of quantitative analyses
– Replication; QCs; Validation

Rigorously use Quantitatively Reproducible Analytical Methods
Forget not the basics of analytical chemistry
• Highly reproducible chromatography is required
• A high sampling rate across the chromatographic peak is required for accurate quantitation
  – Ideally want 15-20 sampling points across the chromatographic profile
  – Highly reproducible chromatography is required for sample-to-sample comparisons
• High resolution, accurate mass (precursor & products) tandem mass spectrometry technology is needed
  – For quantitative selectivity (near isobaric cross-talk)
  – For accurate qualitative identifications: 1% FPR at peptide level (Decoy DB; Peptide Prophet)
• No QCs = No Quantifiably Reliable Data
• No Replication = No Quantifiably Reliable Data
• No Common Standard = No Meaningful Comparison across Projects

Overview of Label Free Quantitation
[Figure: workflow diagram (courtesy Rosetta Biosoftware). Boxes: Acquisition of LC Separation MS Data; Acquisition of Selected MS/MS Data via Targeted Analysis; Peptide Identification (Database Search Engine); Import Raw Data; Data Alignment & Feature Extraction; Statistical Analysis of Differences; Import Raw MS/MS Data; Annotation & Peptide/Protein Analysis]

Gel-Free Label Free Proteomics
High Resolution, Accurate Mass 3D Peptide Mass Map
• X and Y coordinates identify the peptide
• Y coordinate (mass-to-charge ratio) is fixed to < 5 ppm error
• X coordinate (LC retention time) has more variability (typically < 6 seconds)
• Intensity (AUC) of the SIC of the peptide is the quantitative measure
• Must be accurately measured across a statistically significant sample cohort
[Figure: 3D peptide mass map showing an isotope group of a peptide. Axes: LC retention time (x), mass-to-charge ratio, m/z (y)]

Results of Data Alignment based on Accurate Mass and Retention Time
[Figure: panels Raw Data; Aligned Data; Aligned Data Combined by Biological Condition. 111,015 features aligned across 16 LC/MS analyses of cell lines]

How to QC this vast amount of data?
• QC of individual isotope groups
• Pairwise t-tests of significance of peak area measurement

Rigorously use Quantitatively Reproducible Analytical Methods
Daily QC Checks of Data Acquisition Precision and Reproducibility
Day 1(+): Instrument performance checks; QCs; column conditioning; preliminary database searches
Days 2 and 3: Data collection (Column Condition, QC 1, Samples 1 to 10, QC 2, Samples 11 to 13)
………
Day X: Data collection (QC X-1, Samples X-5 to X, QC X)
• Want to maximize biological powering: analyze as many samples as possible
• Must use a robust LC-MS platform and singlicate analysis of each sample
• Data QC is performed by daily injections of a “standard” of the same biological sample (pool)
• Aliquots of the same pool are used in all projects: QC tracking across projects

Quantitatively Reproducible Analytical Methods
Forget not the basics of analytical chemistry
Assessing Quantitative Reproducibility with Daily QCs
QC Metric #1 = %CV (Anal. + Biol. Variability) - %CV (Anal. Variability)
[Figure: %CV histograms for ~35,000 peptides. Panels: Analytical Variability (Daily QC Sample, pool of QC plasma sample; x-axis 0 to 170% CV) and Analytical + Biological Variability (Patient Samples; x-axis 0 to 500% CV). Note the x-axis scale differences. Annotations: ~40% peptides CV < 10%; ~70% peptides CV < 20%; ~90% peptides CV < 25%; ~2% peptides CV < 25%; 25% CV plasma peptides; 125% CV plasma peptides]
- Alternating cycles (1 sec. each) of precursor / product scans provide high reproducibility via a high sampling rate across the chromatographic peak
- Major attribute of MSE

Rigorously use Quantitatively Reproducible Analytical Methods
Assessing Quantitative Reproducibility at the Peptide Level with QCs
Reproducibility of Internal Standard Spiked into Each Sample
• ADH1_YEAST (50 fmol/ug)
• Peptide abundance across 60 patient clinical cohort
• VVGLSTLPEYIEK, 12.8% CV across all samples
• DDA data: qual only

Label Free Intensity Plots
Differential expression visualization

Cluster Analysis of Label Free Quantitation Datasets
[Figure: cluster heat map. Axes: Proteins; Treatment Groups]
• Cluster Analyses
  – Examine large data sets and determine if items behave similarly
  – Data belonging to the same cluster are similar at some level
  – Data sets in different clusters are less similar at some level
  – Make a preliminary assessment of possible relationships between clusters and identify data sets for further investigation

Differential Protein Expression
• Differential protein expression studies are key for
  – Identifying biomarkers of disease and treatment response
  – Elucidating biological pathways
  – Identifying and validating protein drug targets
• Essentially all differential proteomics studies have studied relative protein expression
  – Isotope labeling methods
  – Label free methods
• Differential proteomic expression studies based on “absolute” quantitation have yet to be fully exploited

Relative Protein Expression
• Provides data on protein expression changes between two or more samples within the same experiment
• Requires direct comparison of proteolytic peptides or marker ions from proteolytic peptides
  – Provides relative abundance ratios of the same protein between different samples
  – Data do not easily extrapolate beyond the experiment
• Experiments are isolated “islands of information”

One Exemplar Biomarker Discovery & Verification Project
Biomarkers to Predict Outcomes of Hepatitis C Patient Treatment in Serum of Treatment Naive Patients
Jeanette McCarthy, Keyur Patel, Joe Lucas and John McHutchison
[Figure: disease course diagram. Labels: Spontaneous clearance (~25%); Chronic infection; Eligible for Treatment; Responders; Non-responders (>50%); Hepatic Fibrosis; Steatosis; Insulin resistance; Dyslipidemia; 20% cirrhosis; 3-5% cancer; Increased risk of diabetes; Unknown consequences]

Cohort Selection and Placement in the Pipeline (Guided by an “Unmet Clinical Need”, US HUPO 2009)
• Biomarker Discovery: 10,000s of analytes; 10s of samples
• Biomarker Verification: 100-1,000 analytes; 100-1,000 samples
• Biomarker Validation: 10s of analytes; 1,000s of samples
First: Discover in Matched Cohorts to Focus on the Clinical Variable of Interest
Second: Verify in All-Comers Trials to Test Robustness

Biomarker Discovery Paradigm Challenge
Platforms: Open Platform LC/MS; LC/MS/MS (MRM); LC/MS/MS (MRM)

Hepatitis C Cohorts – all by UPLC/Q-Tof
Duke Hepatology Biorepository - 3,169 patients
• Discovery Cohort - small discovery experiment
  – Well matched cohort from Biorepository
  – n = 55 patients
  – ‘omic LC/MS/MS
• Verification Cohort 1
  – Well matched cohort from Biorepository
  – n = 41 patients
• Verification Cohort 2
  – Pediatric patients, “all-comers” trial
  – n = 50 patients
• Verification / Validation Cohort 3
  – “All-comers” trial (Australia)
  – n = 243 patients

Ensure Professional Use of Statistical Tools Suitable for High Dimensional Data Analyses
Sparse Latent Factor Regression - Bayesian Factor Regression Modeling
[Figure labels: 35,000 Isotope Groups; Predictive Factor; Factor Score; “Metaproteins”; “Expression Value”]
• Regression
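
The daily-QC slides above define QC Metric #1 as %CV (analytical + biological variability) minus %CV (analytical variability). The following is a minimal Python sketch of that arithmetic for individual peptides, not the lecture's actual software. It assumes %CV is computed as 100 × sample standard deviation / mean; the function names, peptide labels, and intensity values are all hypothetical.

```python
from statistics import mean, stdev


def percent_cv(intensities):
    """Percent coefficient of variation: 100 * sample SD / mean."""
    return 100.0 * stdev(intensities) / mean(intensities)


def qc_metric_1(patient_intensities, qc_intensities):
    """%CV across patient samples (analytical + biological variability)
    minus %CV across repeated QC-pool injections (analytical only)."""
    return percent_cv(patient_intensities) - percent_cv(qc_intensities)


def fraction_below(cvs, threshold):
    """Fraction of peptides whose %CV is below the threshold."""
    return sum(cv < threshold for cv in cvs) / len(cvs)


# Hypothetical peak areas: repeated injections of the QC pool ...
qc = {
    "pepA": [100.0, 102.0, 98.0, 101.0],
    "pepB": [50.0, 60.0, 40.0, 55.0],
}
# ... and single injections of different patient samples.
patients = {
    "pepA": [80.0, 120.0, 150.0, 60.0],
    "pepB": [45.0, 58.0, 41.0, 52.0],
}

qc_cvs = {pep: percent_cv(vals) for pep, vals in qc.items()}
metric = {pep: qc_metric_1(patients[pep], qc[pep]) for pep in qc}

for pep in qc:
    print(f"{pep}: QC %CV = {qc_cvs[pep]:.1f}, QC Metric #1 = {metric[pep]:.1f}")
print("Fraction of peptides with QC %CV < 10:",
      fraction_below(list(qc_cvs.values()), 10.0))
```

In this toy data the metric is clearly positive for pepA, where variability across patients far exceeds the analytical variability, and close to zero for pepB, where the spread across patients is no larger than the spread seen in the QC pool.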

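The accurate mass and retention time slides above state that the m/z coordinate is fixed to < 5 ppm error while LC retention time typically varies by < 6 seconds. As a rough sketch of how two features from different LC/MS runs might be matched under those tolerances, the Python below computes a ppm mass error and applies both windows. The function names, the tuple layout, and the example values are hypothetical, and real alignment software does considerably more (for example retention time warping and charge-state checks).

```python
def ppm_error(observed_mz, reference_mz):
    """Mass error in parts per million relative to the reference m/z."""
    return (observed_mz - reference_mz) / reference_mz * 1e6


def same_feature(a, b, ppm_tol=5.0, rt_tol_s=6.0):
    """Return True if two (m/z, retention time in seconds) features agree
    within the mass tolerance (ppm) and retention time tolerance (s)."""
    mz_a, rt_a = a
    mz_b, rt_b = b
    mass_ok = abs(ppm_error(mz_a, mz_b)) <= ppm_tol
    rt_ok = abs(rt_a - rt_b) <= rt_tol_s
    return mass_ok and rt_ok


# Hypothetical features as (m/z, retention time in seconds).
run1 = (500.001, 1200.0)
run2 = (500.000, 1203.0)   # about 2 ppm and 3 s away from run1
run3 = (500.010, 1200.0)   # about 20 ppm away from run2

print(same_feature(run1, run2))
print(same_feature(run3, run2))
```

Both tolerances are keyword arguments so they can be tightened or relaxed for a given instrument and chromatography setup.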