The Study of Macromolecular Complexes by Quantitative Proteomics
Jeff Ranish
Proteomics: the systematic study of the protein complement of the cell
Patterson and Aebersold, Nat. Gen., 2003
Many cellular functions are carried out by proteins in complexes
Response to environmental signals (Koretzky, G., Myung, P., Nature Immunology Rev., 2001)
Nuclear transport (Rout, M.P., Aitchison, J.D., J.B.C., 2001)
RNA synthesis (Shilatifard, A., FASEB J, 1998)
Transcription factor complexes orchestrate the control of gene expression
[Figure: chromatin remodeling complexes, activator/repressor complexes, and the Pol II machinery acting at a gene to produce mRNA]
Distinct transcription complexes regulate expression of specific genes
Outline
I. Introduction to the study of macromolecular complexes by mass spectrometry
II. Analysis of macromolecular complexes using quantitative MS
A human sequence-specific DNA binding transcription factor
An RNA polymerase II transcription complex
A new component of the transcription machinery
Changes in transcription factor complex composition during development
III. Future directions
Transcription complexes in chromatin
Targeted MS approaches to study complexes
IV. Conclusions
Macromolecular complex analysis by mass spectrometry
Step 1. Purify complex from cell extracts
Separate based on physical, chemical and/or biochemical properties
e.g., ion exchange chromatography, gel filtration chromatography, density gradients, affinity interaction chromatography (antibodies, nucleic acids, …)
Step 2. Analyze purified sample by MS
Step 3. Evaluate results using informatics tools
Kumar and Snyder, Nature, 2002
Isolating interacting proteins by affinity chromatography: epitope tags
[Figure: protein of interest fused to an epitope tag]
Epitope tag                   Composition          Affinity matrix
FLAG                          DYKDDDDK             FLAG antibody
HA                            YPYDVPDYA            HA antibody
C-MYC                         EQKLISEEDL           c-MYC antibody
6XHIS                         HHHHHH               Immobilized metal affinity (IMAC)
Biotinylation signal          78 amino acids       avidin/streptavidin
Strep binding                 10-50 amino acids    avidin/streptavidin
Protein A                     137 amino acids      IgG
Calmodulin binding peptide    26 amino acids       Calmodulin
Isolating interacting proteins: Tandem affinity purification (TAP)
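[Figure: TAP tag on the protein of interest, consisting of a Protein A tag, a TEV protease cleavage site, and a calmodulin-binding domain (Calmodulin-BD); the two affinity steps use IgG-sepharose and calmodulin-sepharose]
TAP purification of U1snRNP complex
[Figure: gel, lanes 1-8, comparing extract and purified fractions from TAP-tagged and WT strains, with or without IgG beads, TEV cleavage, and calmodulin beads; molecular weight marker and TEV protease lanes included]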
Large scale studies have characterized protein interactions associated with >2000 yeast proteins
Gavin, Nature, 2002; Ho, Nature, 2002; Krogan, Nature, 2006
Current limitations of mass spectrometry-based protein complex analysis
• Difficult to distinguish specific complex components from non-specific proteins without extensive purification
• Loosely associated factors may be lost during extensive purification
• Static representation of complex composition
• No information about subunit stoichiometry
NCDIR - APPROACH
Chait, Aitchison & Rout, Nature Methods, 2007
• Protein A Tag Macromolecular Complexes
• Complexes Preserved by Freezing & Cryolysis
• Rapid Isolation (5-30 minutes) Preserves Complexes
• MS analysis to determine complex: COMPOSITION, MODIFICATIONS
A quantitative MS approach to complex characterization
Ranish, Nat. Gen., 2003
a) control purification vs. specific purification
b) purification from cell state 1 vs. purification from cell state 2
Both workflows follow the same steps:
• differentially label (isotopically normal vs. isotopically heavy)
• combine
• proteolysis
• sample complexity reduction
• LC-MS/MS
• identify peptides and quantify their relative levels by measuring peak ratios
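The final step of both workflows, quantifying relative peptide levels from peak ratios, can be sketched as follows. This is a minimal illustration and not the Xpress or ASAPratio algorithm: the function names, the example intensities, and the 2-fold cutoff are assumptions chosen for the example.

```python
import math

def peak_ratio(light_intensity, heavy_intensity):
    """Heavy/light abundance ratio for one peptide pair (hypothetical helper)."""
    if light_intensity <= 0 or heavy_intensity <= 0:
        raise ValueError("both peaks must be observed with positive intensity")
    return heavy_intensity / light_intensity

def classify(ratio, fold_cutoff=2.0):
    """Label a peptide pair by comparing log10(ratio) to an assumed fold cutoff."""
    log_ratio = math.log10(ratio)
    if log_ratio > math.log10(fold_cutoff):
        return "enriched in heavy sample"
    if log_ratio < -math.log10(fold_cutoff):
        return "enriched in light sample"
    return "unchanged"

# Made-up intensities: control purification labeled light, specific labeled heavy.
pairs = {
    "peptide_A": (1.0e6, 9.5e6),   # behaves like a specific complex component
    "peptide_B": (4.0e6, 4.2e6),   # behaves like a co-purifying background protein
}
calls = {name: classify(peak_ratio(l, h)) for name, (l, h) in pairs.items()}
print(calls)
```

In workflow a) a pair called "unchanged" would point to a non-specific protein, while in workflow b) it would point to an invariant complex component.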
[Figure: relative abundance vs. m/z for light/heavy peptide pairs. a) complex-specific vs. non-specific peak ratios distinguish specific complex components from co-purifying proteins; b) condition-specific enrichment vs. invariant peak ratios detect changes in complex abundance or composition]
I-DIRT
Tackett, JPR, 2005
Mousson, F. (2008) Mol. Cell. Proteomics 7: 845-852
Quantitation: Labeling vs. Label-free
L. Muller et al. Nat. Methods 2007
Isotopic labeling strategies for MS-based quantitative proteomics
[Figure: MS1 spectrum, intensity vs. m/z; modified from CEBI web site]
Stable Isotope Tagging by Metabolic Labeling (e.g., SILAC)
Cells are grown on media containing isotopically heavy or normal nutrients, e.g., lysine and arginine
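As a rough illustration of what metabolic labeling produces in the mass spectrometer, the sketch below computes the expected m/z spacing between the normal and heavy forms of a tryptic peptide. It assumes heavy lysine carries six 13C and two 15N atoms (+8.0142 Da) and heavy arginine six 13C and four 15N atoms (+10.0083 Da); the isotope composition is not stated at this point in the talk, and the function name and example peptide are hypothetical.

```python
# Assumed label masses (Da): 13C6,15N2-lysine and 13C6,15N4-arginine.
HEAVY_SHIFT = {"K": 8.014199, "R": 10.008269}

def silac_mz_spacing(peptide, charge):
    """m/z distance between the light and heavy forms of a fully labeled peptide."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    mass_shift = sum(HEAVY_SHIFT.get(residue, 0.0) for residue in peptide)
    return mass_shift / charge

# Example tryptic peptide ending in lysine, observed as a 2+ ion.
spacing = silac_mz_spacing("LSETSIK", 2)
print(f"{spacing:.4f}")
```

Because trypsin cleaves after lysine and arginine, nearly every tryptic peptide carries at least one labeled residue, which is why potentially all peptides are quantifiable with this strategy.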
Strengths
• Simple in vivo labeling protocols
• Minimal sample handling
• Potentially all peptides labeled
Weaknesses
• Compatible with selected species, samples only
• Label potentially metabolized
• Labeling potentially perturbs biological system
• No inherent sample enrichment
[Figure: detect changes in complex composition; purification from cell state 1 vs. purification from cell state 2]
Stable Isotope Tagging by Chemical Reaction (e.g., ICAT, ICPL, N-isotag)
Labeling at the protein or peptide level, after cell lysis. Most reagents target amines or sulfhydryl groups
[Figure: ICAT reagent structure: biotin tag, linker (X = hydrogen or deuterium), thiol-specific reactive group; Δmass = 8 daltons]
[Figure: N-isotag reagent (heavy or normal), built from GABA NHS and tBoc-Leucine NHS; X = C12 or C13, *N = N14 or N15; Δmass = 7 daltons]
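The 8 dalton mass difference of the ICAT reagent translates into a predictable m/z spacing between the normal and heavy forms of a tagged peptide. The sketch below is an illustration, not part of any cited software: it infers the charge state from an observed pair, assuming the heavy reagent carries eight deuteriums (about 8.05 Da per labeled cysteine) and that the peptide has a single cysteine unless told otherwise. The function name is hypothetical; the example m/z values are the pair reported for the TSP peptides in this talk.

```python
# Mass added per labeled cysteine by the heavy reagent relative to the normal one,
# assuming eight hydrogens are replaced by deuterium (8 x 1.006277 Da).
ICAT_DELTA_MASS = 8 * 1.006277

def infer_charge(light_mz, heavy_mz, n_cysteines=1, max_charge=6):
    """Return the charge state whose predicted pair spacing best matches the data."""
    observed = heavy_mz - light_mz
    if observed <= 0:
        raise ValueError("heavy_mz must be larger than light_mz")
    candidates = range(1, max_charge + 1)
    return min(candidates,
               key=lambda z: abs(ICAT_DELTA_MASS * n_cysteines / z - observed))

charge = infer_charge(568.1, 570.8)
print(charge)
```

For this pair the observed spacing of 2.7 is closest to the value predicted for a triply charged peptide with one labeled cysteine.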
Strengths
• Compatible with any protein source
• Different specificities can be designed into reagent
• Selective tagging reduces sample complexity
Weaknesses
• Chemical reactions required
• Sample handling
• Tag might interfere with MS or MS/MS
• Potential for side reactions, incomplete reactions
[Figure: monitoring complex enrichment; control purification (e.g., control antibody) vs. specific purification (e.g., specific antibody)]
iTRAQ-isobaric tags
MS2-based quantification
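MS2-based quantification reads relative abundances from reporter ions in the fragment spectrum. The following sketch assumes the 4-plex reagent set with reporter ions near m/z 114.1, 115.1, 116.1, and 117.1 (nominal values), a peak list given as (m/z, intensity) tuples, and a 0.05 m/z matching window; all of these, and the function name, are illustrative choices.

```python
# Nominal 4-plex reporter ion positions (assumed; exact values depend on the reagent).
REPORTER_MZ = {"114": 114.1, "115": 115.1, "116": 116.1, "117": 117.1}

def reporter_fractions(peaks, tolerance=0.05):
    """Sum intensity near each reporter m/z and return each channel's share."""
    totals = {channel: 0.0 for channel in REPORTER_MZ}
    for mz, intensity in peaks:
        for channel, target in REPORTER_MZ.items():
            if abs(mz - target) <= tolerance:
                totals[channel] += intensity
    grand_total = sum(totals.values())
    if grand_total == 0:
        raise ValueError("no reporter ions detected in this spectrum")
    return {channel: value / grand_total for channel, value in totals.items()}

# Made-up MS2 peak list: four reporter ions plus two sequence fragment ions.
spectrum = [(114.11, 1000.0), (115.11, 2000.0), (116.11, 1000.0),
            (117.12, 4000.0), (175.12, 5200.0), (404.23, 800.0)]
fractions = reporter_fractions(spectrum)
print(fractions)
```

The error raised for an empty reporter region reflects a practical constraint of the method: a peptide that is never selected for fragmentation, or whose low m/z range is not recorded, cannot be quantified.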
Ross, P. L. (2004) Mol. Cell. Proteomics 3: 1154-1169
Stable Isotope Tagging by Chemical Reaction-iTRAQ
Strengths
• Multiplexed, up to 8
• Compatible with any protein source
• Sample complexity reduction in the MS1 dimension
• Not necessary to reconstruct ion chromatograms for quantification
Weaknesses
• Chemical reactions required
• Sample handling
• Need to detect reporter ions in low m/z range
• Ion must be selected for CID to quantify
Label-free Method: Spectral Counting
N unique = 5 N spectra = 14
protein sequence
MESSPFNRRQWTSLSLRVTAKELSLVNKNKSSAIVEIFSKYQKAAEETNMEKKRSNTENLSQHFRKGTLTVLKKKWENP GLGAESHTDSLRNSSTEIRHRADHPPAEVTSHAASGAKADQEEQIHPRSRLRSPPEALVQGRYPHIKDGEDLKDHSTES KKMENCLGESRHEVEKSEISENTDASGKIEKYNVPLNRLKMMFEKGEPTQTKILRAQSRSASGRKISENSYSLDDLEIG PGQLSSSTFDSEKNESRRNLELPRLSETSIKDRMAKYQAAVSKQSSSTNYTNELKASGGEIKIHKMEQKENVPPGPEVC ITHQEGEKISANENSLAVRSTPAEDDSRDSQVKSEVQQPVHPKPLSPDSRASSLSESSPPKAMKKFQAPARETCVECQK TVYPMERLLANQQVFHISCFRCSYCNNKLSLGTYASLHGRIYCKPHFNQLFKSKGNYDEGFGHRPHKDLWASKNENEEI LERPAQLANARETPHSPGVEDAPIAKVGVLAASMEAKASSQQEKEDKPAETKKLRIAWPPPTELGSSGSALEEGIKMSK PKWPPEDEISKPEVPEDVDLDLKKLRRSSSLKERSRPFTVAASFQSTSVKSPKTVSPPIRKGWSMSEQSEESVGGRVAE RKQVENAKASKKNGNVGKTTWQNKESKGETGKRSKEGHSLEMENENLVENGADSDEDDNSFLKQQSPQEPKSLNWSSFVD NTFAEEFTTQNQKSQDVELWEGEVVKELSVEEQIKRNRYYDEDEDEE Label free approaches: Spectral counting
• normalized count of the number of correctly identified spectra per protein (Zybailov et al., J. Proteome Res. 2006; Choi et al., MCP, 2008)
• robust for high abundance proteins and simple mixtures
• simple to perform
• can compare any appropriately matched samples
• indirect assessment of abundance
• accuracy may be compromised by duty cycle limitations
• sample handling
Label free approaches: Intensity-based quantification
• Quantification is performed computationally by comparing ion intensities in sequential runs
• Any appropriately matched samples can be compared
• Computationally intensive
• Accurate measurements require:
  • multiple MS runs
  • high mass accuracy
  • reproducible chromatography
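Comparing ion intensities across sequential runs requires finding the same peptide feature in each run, which is why mass accuracy and reproducible chromatography matter. A minimal sketch of that matching step follows, under assumed tolerances (10 ppm, 0.5 min) and with hypothetical feature tuples of (m/z, retention time in minutes, intensity); it is not the algorithm of any specific software platform.

```python
def match_features(run_a, run_b, ppm_tol=10.0, rt_tol=0.5):
    """Pair features from two runs that agree in m/z (ppm) and retention time (min).

    Each feature is a tuple (mz, rt, intensity). Returns a list of
    (feature_a, feature_b, intensity_ratio_b_over_a) for the closest matches.
    """
    matches = []
    for mz_a, rt_a, int_a in run_a:
        best = None
        for mz_b, rt_b, int_b in run_b:
            ppm_error = abs(mz_b - mz_a) / mz_a * 1e6
            if ppm_error <= ppm_tol and abs(rt_b - rt_a) <= rt_tol:
                if best is None or ppm_error < best[0]:
                    best = (ppm_error, (mz_b, rt_b, int_b))
        if best is not None:
            feature_b = best[1]
            matches.append(((mz_a, rt_a, int_a), feature_b, feature_b[2] / int_a))
    return matches

# Made-up features from two LC-MS runs.
run_1 = [(500.2500, 20.1, 1.0e6), (650.3300, 35.0, 2.0e5)]
run_2 = [(500.2520, 20.3, 3.0e6), (650.3600, 35.1, 2.1e5), (720.4000, 40.0, 5.0e4)]
matched = match_features(run_1, run_2)
print(matched)
```

In this made-up example only the first feature is matched; the second differs by about 46 ppm, so with lower mass accuracy (a wider tolerance) it could be paired incorrectly with an unrelated ion.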
• Open source platforms are available
  • msInspect (Bioinformatics, 2006. 22(15): p. 1902-9)
  • MZmine (Bioinformatics, 2006. 22(5): p. 634-6)
  • SpecArray (Mol Cell Proteomics. 2005 (9):1328-40)
  • SuperHirn (Proteomics, 2007 (19):3470-80)
• Platforms available through instrument vendors
  • Waters Protein Expression Informatics (Anal. Chem. 2005, 77, 2187-2200)
How is the Muscle Creatine Kinase gene regulated?
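[Figure: map of the Muscle Creatine Kinase (MCK) gene on a 0-14 kb scale, showing exons 1-8, MR-1, the enhancer (Enh), and the promoter (P); the 206 nt enhancer is expanded to show its transcription factor binding sites, including the Trex site and the left (L) and right (R) E-boxes]
Transcription factor binding to the MCK enhancer results in muscle-specific expression of the MCK gene
Charis Himeda, Steve Hauschka
Experimental strategy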
• incubate HeLa nuclear extract with DNA-coupled beads (wt Trex or mt Trex); TrexBF binds the beads
• apply magnet, remove unbound proteins, wash
• elute bound proteins with high salt
• differentially label with ICAT reagents
• analyze by quantitative mass spectrometry
Himeda, Mol. Cell. Biol., 2004
Protein composition and activity of DNA affinity purified samples
[Figure: SDS-PAGE of the purified samples (molecular weight markers 180, 116, 97.4, 66, 48.5, 29, 18.4, 14.2, and 6.5 kD) and gel-shift assay showing TrexBF]
Distribution of abundance ratios
[Figure: histogram of number of proteins (0-200) vs. log10[abundance ratio (specific/control)], binned from -0.30 to >0.35]
868 proteins or protein groups identified using SEQUEST, Peptide Prophet and Protein Prophet (P > 0.90), and quantified by Xpress
3 proteins with abundance ratios >2 are enriched in the Trex specific purification
Trex binding factor candidates
Protein       Peptide                d8/d0
annexin a7    CYQSEFGRDLEK           2.7 : 1
CNBP (+1)     CGESGHLAK              2.1 : 1
              DCDLQEDACYNCGR         2.6 : 1
              GFQFVSSSLPDICYR        2.9 : 1
              CYSCGEFGHIQK           10.7 : 1
              CGESGHLAR              7.0 : 1
              CGETGHVAINCSK          9.2 : 1
Six4          YVLDGMVDTVCEDLETDKK    2.4 : 1
TrexBF in mouse skeletal myocytes is Six4
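The peptide-level d8/d0 ratios in the Trex binding factor candidate table can be summarized per protein. The sketch below uses a geometric mean, which is one reasonable choice for ratio data and is an assumption here, not necessarily how the reported values were combined; the dictionary layout and function name are hypothetical.

```python
import math

# Peptide d8/d0 ratios transcribed from the Trex binding factor candidate table.
peptide_ratios = {
    "annexin a7": [2.7],
    "CNBP": [2.1, 2.6, 2.9, 10.7, 7.0, 9.2],
    "Six4": [2.4],
}

def protein_ratio(ratios):
    """Geometric mean of peptide ratios, computed as the mean in log space."""
    if not ratios:
        raise ValueError("at least one peptide ratio is required")
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))

summary = {protein: protein_ratio(r) for protein, r in peptide_ratios.items()}
enriched = sorted(p for p, ratio in summary.items() if ratio > 2.0)
print(summary, enriched)
```

The wide spread of the CNBP peptide ratios (2.1 to 10.7) is a reminder that a single summary value hides peptide-to-peptide variation, so single-peptide identifications such as Six4 benefit from independent validation.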
[Figure: gel-shift assay, lanes 1-12, with the Trex probe and either mouse skeletal myocyte extract or Six4; competitors: none, Trex, MEF3, mt; antibodies: none, αSix4, control (CTL); the TrexBF band is marked]
Six4 stimulates transcription from the Trex site
[Figure: relative CAT activity with and without Cand.3 in three panels: MM14 skeletal myocytes with the (+Trex)6TKCAT and (-M1)5TKCAT reporters; MM14 skeletal myocytes with the (-enh)80MCKCAT and (-enh-M1)80MCKCAT reporters; neonatal rat myocardiocytes with the (-enh)80MCKCAT and (-enh-M1)80MCKCAT reporters]
Quantitative mass spectrometry permits the identification of specific complex components in partially purified samples
[Figure: SDS-PAGE (molecular weight markers 180 to 6.5 kD) with the positions of Six4, CNBP, and Annexin a7 indicated]
DNA affinity purification/qMS:
Himeda CL, et al. Mol Cell Biol. 2008
Qi Y, et al. Proc Natl Acad Sci U S A. 2008
Rubio ED, et al. Proc Natl Acad Sci U S A. 2008
The RNA Polymerase II core machinery
[Figure: the Pol II core machinery at a TATA-containing promoter with an activator and chromatin remodeling complexes; polypeptide counts in parentheses: SRB/Meds (24), TAFs (14), Pol II/IIF (15), TBP (1), TFIIA (2), TFIIB (1), TFIIE (2), TFIIH (9)]
Responsible for expression of all mRNAs in the nucleus
~68 polypeptides are thought to be recruited to promoters as part of the core transcription machinery
Use of a TBP mutant extract and ICAT to guide identification of pol II core transcription factors
[Figure: immobilized promoter template with a Pst I site, activator binding sites, and TATA, incubated with TBP(I143N) nuclear extract and activator, with or without rTBP; holoenzyme, TAFs, and IIA are drawn on the rTBP-bound template]
• elute with Pst I
• isotopically label (H or L)
• combine
• proteolyze
• fractionate
• μLC/MS/MS
Ranish et al., Nature Gen., 2003
Protein composition after affinity purification
[Figure: A) silver stain, lanes 1-2, molecular weight markers 10-220 kD; B) western, lanes 1-3, with values in parentheses: TBP (51), TOA2 (TFIIA) (>24), TFIIB (5.0), SRB4 (mediator) (2.0), KIN28 (TFIIH) (5.4)]
ASAPratio
Li et al., Anal. Chem., 2003
• 252 proteins quantified by ASAPratio
• 47/57 proteins with enrichment ratios > 2 and p < 0.1 are known core PIC components (70%)
• 17/18 GTFs, 7/7 Pol II-specific subunits, 10/14 TAFs, 12/24 mediator subunits
• 5 proteins with annotated roles in Pol II transcription
• 5 potential new PIC components
Comparison of amine labeling approaches
Experiment       PIC proteins identified       Total proteins identified
                 number    percent of PIC      number    percent PIC
ACE1-N-isotag    99        90.8%               287       34.5%
GCN4-N-isotag    94        86.2%               239       39.3%
ACE1-iTRAQ       105       96.3%               418       25.1%
GCN4-iTRAQ       105       96.3%               418       25.1%
TOTAL            108       99.1%               530       20.4%
Jie Luo
TSP: a new component of the transcription machinery?
• Sequest identifies 2 overlapping cysteine-containing tryptic peptides from TSP
[Figure: ion chromatograms (relative abundance vs. scan number) for the peptide pair at m/z = 568.1 and m/z = 570.8]
Xpress ratio, Light : Heavy = 1.0 : 2.0 (0.23)
Composition of core TFIIH in the absence of TSP
John Leppard
[Figure: SILAC labeling of TFB2-FLG strains (Δarg4, Δlys1): tspΔ extract (light) vs. TSP extract (heavy; d10ARG, d8LYS)]
Detect changes in complex composition by quantitative MS
[Figure: Xpress quantification of core TFIIH subunits (TFB4, TFB2, SSL2, SSL1, TFB1, TSP) alongside the core TFIIH architecture; label: TSP (9)]
Chromatin Remodeling Complexes
BAF chromatin remodeling complexes contain one of two ATPases: Brg, or Brm (Brahma) and ~10 core subunits
Required for pluripotency and self renewal in ES cells but not for proliferation of fibroblasts and other cell types.
Hypothesize the existence of BAF complexes with distinct subunit composition in different cell types
From Gary M. Halliday et al. Int J Biochem Cell Biol (2008)
Lena Ho, Gerald Crabtree, Stanford
Alexey Nesvizhskii, Univ. Michigan
Embryonic Stem Cell BAFs
Goal: to understand the role of BAF complexes in pluripotency
Step 1: determine the composition of BAF complexes in ES cells
Compared complexes purified from nuclear extracts from:
• Mouse Embryonic Stem Cells (ES)
• Mouse Embryonic Fibroblasts (MEF)
• P0 Mouse Brain, Neurons (Neurons)
Identification of BAF components
Nuclear extracts from mouse Embryonic Stem Cells, Mouse Embryonic Fibroblast, Neurons (brain)
Affinity Purification of BAF complexes with anti-Brg/Brm antibody
Trypsin digestion
Strong Cation Exchange (SCX) Fractionation
ESI-MS/MS (LTQ Orbitrap)
Peptide/protein identification (Trans-Proteomic Pipeline)
Spectral counting-based quantification and computational analysis
Identification of BAF components by AP-MS
Comparison of BAF-associated proteins reveals common and cell-type specific components in ES, MEF, Neurons
Quantification of BAF Complex Components by Spectral Counting
Protein abundance is estimated by the number of spectra acquired for each protein, normalized to account for protein length and to the total spectra in each dataset
Immunoblotting vs. Adjusted Spectral Counts
Immunoblotting confirms differential expression of BAF155 and BAF170 in ES, MEFs, and Brain
[Figure: immunoblotting panel and spectral counting panel, side by side]
Good agreement between immunoblotting and spectral count-based quantification
esBAF Complex
Spectral quantification of core BAF components
(normalized for protein length, and total number of spectra in each dataset)
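The two normalizations named here, by protein length and by the total number of spectra in the dataset, can be sketched as a normalized spectral abundance factor. The exact adjustment used for the BAF data is not given in the talk, so the formula choice, function name, and example counts and lengths below are assumptions for illustration only.

```python
def normalized_spectral_abundance(spectral_counts, lengths):
    """Length- and dataset-normalized spectral counts.

    spectral_counts: protein -> number of identified spectra
    lengths: protein -> protein length in amino acids
    Each protein's count is divided by its length, then by the sum of the
    length-normalized counts over all proteins in the dataset.
    """
    per_length = {p: spectral_counts[p] / lengths[p] for p in spectral_counts}
    total = sum(per_length.values())
    if total == 0:
        raise ValueError("dataset contains no spectra")
    return {p: value / total for p, value in per_length.items()}

# Made-up counts and lengths for three hypothetical proteins.
counts = {"protein_1": 120, "protein_2": 30, "protein_3": 30}
lengths = {"protein_1": 1200, "protein_2": 300, "protein_3": 600}
nsaf = normalized_spectral_abundance(counts, lengths)
print(nsaf)
```

In this example protein_1 yields four times as many spectra as protein_2 but is also four times longer, so the two receive the same normalized abundance.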
Composition of BAF complexes from ES cells
Transcription complexes in chromatin
Challenges for the comprehensive analysis of gene-specific transcription complexes
• Need an efficient way to isolate the complexes in a form that is amenable to MS
• Gene-specific complexes are present at ~single copy per cell
  • only ~100 fmol in 3 liters of cells
  • the limit of detection in a “shotgun” MS experiment is ~10 fmol
• the complexes are dynamic
Chromatin isolation
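The ~100 fmol estimate for a single-copy complex in 3 liters of cells follows from simple copy-number arithmetic. The sketch below reproduces it under an assumed culture density of 2 x 10^7 cells per ml, a value chosen for illustration because the talk does not state one; the function name is hypothetical.

```python
AVOGADRO = 6.02214076e23  # molecules per mole

def femtomoles_of_complex(copies_per_cell, cells_per_ml, culture_liters):
    """Amount of a complex (fmol) in a culture, assuming perfect recovery."""
    n_cells = cells_per_ml * culture_liters * 1000.0  # 1000 ml per liter
    moles = copies_per_cell * n_cells / AVOGADRO
    return moles * 1e15  # 1 mol = 1e15 fmol

# Single-copy complex, assumed density of 2e7 cells/ml, 3 liters of culture.
amount = femtomoles_of_complex(1, 2e7, 3)
print(f"{amount:.1f} fmol")
```

Under these assumptions the result is close to 100 fmol, only about ten times the ~10 fmol shotgun detection limit, and that is before any losses during purification.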
[Figure: two minichromosomes carrying ARS1, TRP1, and LacO sites bound by 3XFLAG-LacI, one labeled gene X and one labeled promoter, gene X-GFP]
• Grow cells under specific environmental conditions in presence of isotopically heavy or light lysine and arginine
• Anti-FLAG immuno-purify
• Elute bound complexes
• Analyze composition and changes in composition by quantitative MS
Kinetochore purification scheme
Bungo Akiyoshi, Sue Biggins
[Figure: minichromosome carrying CEN3, TRP1, ARS1, and LacO sites bound by LacI-3XFLAG]
• 5 liters of cells
• Cryolysis
• Resuspend in buffer
• Ultracentrifuge
• Extract
• anti-FLAG IP
• Wash beads
• Elute with FLAG peptide or SDS
• SCX fractionate and analyze by MS/MS
Proteins >3-fold enriched on centromeric minichromosomes
[Table columns: Name, Function, %Coverage, #unique, enrichment ratio]
*Detected >90% of known core kinetochore proteins by MS
Akiyoshi, Genes Dev., 2009
Targeted MS using Selected Reaction Monitoring
[Figure: selected reaction monitoring. The first m/z selection corresponds to peptide x, the second m/z selection corresponds to a fragment from peptide x, and the fragment gives a diagnostic signal for protein X]
Domon et al., Science (2006)
• Two levels of mass selection: high selectivity
• Selective scanning, short duty cycle: high sensitivity, reproducibility
• The most sensitive mass spectrometry method known (low amol)
The transcription factor SRM assay
After optimization and validation our assay now includes
- 420 proteins
- 1539 peptides
- 4615 transitions (Q1/Q3), 3 transitions/peptide
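Each transition pairs a precursor m/z (Q1) with a fragment m/z (Q3). As a hedged illustration of how such pairs are computed, the sketch below derives the doubly charged precursor and three singly charged y-ion fragments for an unmodified peptide from standard monoisotopic residue masses. The peptide, the choice of y ions, and the function names are hypothetical and are not taken from the assay described here.

```python
# Standard monoisotopic residue masses (Da) for a subset of amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "L": 113.08406, "I": 113.08406,
    "D": 115.02694, "E": 129.04259, "K": 128.09496, "R": 156.10111,
}
WATER = 18.010565
PROTON = 1.007276

def precursor_mz(peptide, charge):
    """m/z of the intact peptide ion (the Q1 value)."""
    mass = sum(RESIDUE_MASS[aa] for aa in peptide) + WATER
    return (mass + charge * PROTON) / charge

def y_ion_mz(peptide, n_residues):
    """m/z of the singly charged y ion holding the last n_residues residues (a Q3 value)."""
    fragment = peptide[-n_residues:]
    return sum(RESIDUE_MASS[aa] for aa in fragment) + WATER + PROTON

def transitions(peptide, charge=2, n_transitions=3):
    """Pair the precursor with the n_transitions largest y ions (excluding the full peptide)."""
    q1 = precursor_mz(peptide, charge)
    sizes = range(len(peptide) - 1, len(peptide) - 1 - n_transitions, -1)
    return [(q1, y_ion_mz(peptide, n)) for n in sizes]

example = transitions("PEPTIDE")
print(example)
```

Monitoring three such Q1/Q3 pairs per peptide mirrors the 3 transitions/peptide design of the assay; in practice the transitions are chosen empirically for signal and specificity rather than by fragment size alone.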
Hamid Mirzaei, Paola Picotti, Ruedi Aebersold
Control of FLO11 expression
[Figure: nutrient/environment signals act through Ras2 and three pathways: the Snf1 kinase pathway (Nrg1, Nrg2), the cAMP/PKA pathway (Flo8, Sfl1), and the Kss1/MAPK pathway (Ste12/Tec1)]
Rupp et al., EMBO, 1999
            FLO11 repressed       FLO11 activated
            Round form cells      Elongation and invasion
Haploid     high glucose          low glucose
Diploid     high nitrogen         low nitrogen
Application of the assay to identify potential regulators of the yeast FLO11 gene
[Figure: regulators plotted by promoter segment, marked as promoter enriched or control enriched]
Systematically measured the binding preference of 222 TFs for each promoter segment by SRM
Advantages of quantitative mass spectrometry for complex analysis
Quantitative measurement increases confidence in identification of complex components
Permits identification of complex components without the need for extensive purification
Permits detection of quantitative changes in complex composition and abundance
Stoichiometry measurements possible with absolute quantification
Limitations of quantitative mass spectrometry for complex analysis
Labeling
• Extra steps needed to incorporate labels may result in loss of sensitivity and sample handling errors
• Low signal-to-noise ratios or incomplete resolution of ions can limit accuracy of quantification (high resolution instruments improve accuracy of quantification)
Label-free
• Multiple reproducible MS runs needed for accurate intensity-based measurements
• Duty cycle limitations can hinder quantification by spectral counting
• Sample handling errors
• Ion suppression issues
Acknowledgements
ISB: Bong Kim, John Leppard, Jie Luo, Hamid Mirzaei, Jimmy Eng, Andrew Keller, Xiao-jun Li, David Shteynberg, Tim Galitski, Theo Knijnenburg, Ilya Shmulevich, John Aitchison
ETH Zurich: Paola Picotti, Ruedi Aebersold
University of Washington: Charis Himeda, Steve Hauschka
University of Michigan: Alexey Nesvizhskii
Fred Hutchinson Cancer Research Center: Steve Hahn (Pol II), Bungo Akiyoshi (kinetochore), Sue Biggins
Stanford: Lena Ho, Jerry Crabtree