<<

Chap 7

Biochemistry and „ 9.1 Introduction „ 9.2 Ionization „ 9.3 Mass Analysis „ 9,5 Structural Information by Tandem Mass Spectrometry „ 9.7 Computing and Database Analysis

1 Biological Mass Spectrometry

A key tool for Protein. Peptides. . DNA………and more

2 Back to the history………

The first mass spectrometer - parabola spectrograph

1912 J. J. Thomson

3 Back to the history………

J.J. Thomson

• J.J. Thomson observes a line at mass 22 in the spectrum of neon. • J.J. Thomson delivers his Bakerian Lecture, “Rays of Positive Electricity” to the Royal Society of London. 4 • Francis Aston is awarded the Nobel Prize in for his discovery of isotopes of “inactive elements.” Francis Aston

5 Advances in Modern Mass Spectrometry

Limitations of traditional MS on biological applications

♦ High molecular weight >50,000 ♦ Amount of Sample < 10-12 -10-15 mole

ElectroSpray Ionization MS Matrix Assisted Laser Desorption Ionization MS

John B. Fenn Koichi Tanaka ESI MALDI 6 Evolution of Mass Spectrometry-1

Francis Aston is awarded the for his discovery of isotopes of “inactive The double-foucsing elements.” mass spectrograph is developed J.J. Thomson Rays of Positive Electricity

1913 1922 1934 1946 7 Evolution of Mass Spectrometry-2

Biomolecules

Ion- reaction, unimolecular dissociation

Volatile organic

1960 1970 1980 1990 2000

8 Human Genome Project

2003, 4, 14

35000-40000 Genes

9 Genomics (基因體學) 1953: Watson and Crick: DNA double helix Danio rerio C. Elegans (1998) Rattus norvegicus Anopheles gambiae (2002

Homo sapiens (2001) Fugu rubripes (2002) Mus musculus10(2002) Drosophila melanogaster (2000) From Genomics to Proteomics

Genome Transcriptome Proteome

DNA RNA Proteins Modified Biological Proteins Function Y

Y

Transcription Translation Post-Translation Modification PROTEin complement to a genOME

Entire protein complement in a given cell, tissue or organism 11 PR TE MICS 蛋白質體學

Protein activity, modifications, localizations, amd interactions of proteins in complexes

Proteomics can be defined as the qualitative and quantitative comparison of proteomes under different conditions to further unravel biological processes

12 Protein chemistry and proteomics

Protein Identification ( 蛋白質鑑定)

Protein chemistry Proteomics Individual proteins Complex mixtures Complete sequence analysis Partial sequence analysis Emphasis on structure and function Emphasis on identification by database matching Structual biology Systems biology

13 Disease Proteomics 一滴血可以診斷出疾病嗎?

14 Protein Chips 蛋白質晶片

15 找尋腫瘤標記蛋白質

正常人

良性攝護 腺腫瘤

攝護腺癌

16 17 Wide Applications of Mass Spectrometry

Jennings et al , Int. J. Freiser et al , Morris et al, Rapid Mass Spectrom. . Anal. Chem. 54, commun. Mass Phys. 1, 227 (1968) 1431 (1982) Spectrom. 10, 889 (1996) COLLISIONAL- Derrick et al, Org. FOURIER TRANSFORM Mass Spectrom. 27, ORTHOGONAL INDUCED ION CYCLOTRON 1077 (1992) QUADRUPOLE- DISSOCIATION TIME-OF-FLIGHT RESONANCE MS TIME-OF-FLIGHT /TIME-OF-FLIGHT MS MS Futrell et al Yost et al, J. Am. Cooks et al, (1964) Chem. Soc. 100, Anal. Chem. 2274 (1978) 59, 1677 (1987) FIVE TRIPLE QUADRUPOLE Hager et al , Rapid Commun. SECTOR QUADRUPOLE ION TRAP MS Mass Spectrom. 2002, 16, MS MS 512 (2002) LINEAR ION TRAP

1960 1970 1980 1990 2000 ¾ FTICR- TRAP 1943 ¾ ION- TRAP ¾Triple Q ¾Q-TOF ¾Q-FTICR 1943 ¾ ION- TRAP ¾¾EBE-TOFTriple Q ¾Q-TOF ¾Q-FTICR 1st COMMERICAL ¾EBE-TOF ¾TOF-TOF 18 MASS1st COMMERICAL SPECTROMETER ¾LINEAR¾TOF-TOF TRAP MASS SPECTROMETER ¾LINEAR TRAP What is mass spectrometry?

1619.28 100 968.64 1251.93

1252.97 2313.69 969.70 2312.76 1621.14 2314.75 1670.25

Ionization 1671.24

+ - % 2315.80 M or M 970.69 1254.00 1672.14 M 1013.80 2366.83 836.58 1060.23 1570.17 2367.90 893.67 1873.46 1061.26 2271.83 2368.85 1448.14 1853.38 2019.57 1876.41 2036.57 2369.91 2718.09

0 m/z 1000 1200 1400 1600 1800 2000 2200 2400 2600 2800

Mass-to-charge Ratio ( m/z ) MS is an analytical tool that measure the molecular weight of molecules based on the motion of charged particle in an electrical or magnetic field.

19 Two different ways of measurement • Composition of analyte (MS) e.g. Peptide mass fingerprinting

• Structure of analyte (MS/MS, tandem MS) Dissociation

20 Turbo pumps High Vacuum System Diffusion Pumps Rough pumps Rotary pumps

Detector Data Inlet Ion source Mass Filter System

•Microchannel Plate •PC’s •Sample plate •MALDI •TOF •Electron Multiplier •SunSPARK •“Hybrid” •DECStation •Target •Electrospray •Quadrupole •Macs •HPLC •FAB •Ion Trap •GC •LSIMS •Magnetic Sector •CE •EI •FTMS •CI 21 Ionization Methods

• Electron Impact (EI) • Fast Bombardment (FAB) • Electrospray Ionization (ESI) • Matrix-Assisted Laser Desorption Ionization (MALDI)

22 Advances in Modern Mass Spectrometry

Limitations of traditional MS on biological applications

♦ High molecular weight >50,000 ♦ Amount of Sample < 10-12 -10-15 mole

ElectroSpray Ionization MS Matrix Assisted Laser Desorption Ionization MS

2002 Nobel Prize in Chemistry

23 Matrix-Assisted Laser Desorption Ionization (MALDI)

24 Matrix-Assisted Laser Desorption Ionization (MALDI)

Analyte Metal Ion Matrix http://www.ionsource.com 25 Matrix selection CH O CH CHCOOH 3 α-Cyano-4-hydroxy- Peptides<10kDa cinnamic acid (CHCA) HO CH O Sinapinic Acid Proteins >10kDa 3

Sinapinic acid (3,5-Dimethoxy- 2,5-Dihydroxybenzoic Neutral , 4-hydroxy cinnamic acid) acid (DHB) Synthetic Polymers COOH

“Super DHB” Proteins, OH Glycosylated proteins N 3-Hydroxypicolinic acid Oligonucleotides 3-hydroxypicolinic acid (3-HPA) HABA Proteins,

Oligosaccharides CH C(CN)COOH COOH

NN HO

OH α-cyano-4-hydroxycinnamic acid 2-(4-hydroxyphenylazo)-benzoic acid (HABA) 26 A typical mass spectra

[M+H]+

[M+2H]2+

27 Electrospray: Generation of aerosols and droplets

28 ElectroSpray Ionization (ESI)

29 Charge Residue Model

Coulomb repulsion : Charge repulsion > surface tension

Ion Desorption Model Ion desorption from the droplet surface 30 Multiple-charged ESI spectra Deconvolution

31 32 Deconvolution

Step 1: assume 190.1 = mass/charge 379.2 = mass/charge Step 2: assume the two peaks are related 190.1 = m/(z+1) 379.2=m/z Step 3: solve m and z m=378.2, z=1

Charge state Calculation Unprotonated mass +1 (379.2-1)*1 378.2 +2 (190.1-1)*2 378.2 average 378.2

33 34 Molecular Weight of Proteins

35 Mass Analysis are separated according to their mss-to-charge (m/z)

TRANSPORT of + / FULL + / - IONS AND -IONS TO SCAN NEUTRALS PARENT MASS ION SPECTRA TO FORMED IN ANALYZER THE DETECTOR ion SOURCE

ION SOURCE Transport Mass analyzer DETECTOR optics 36 Ion Mass Data source Detector Inlet Filter System • •Sector Instruments „ Time-of-flight Analyzer

„ Quadrupole Mass Filter

„ Ion-Trap Instrument

„ FT-ICR 37 Quadrupole Mass Filter (m/z -4000)

M+

2 2 au = ax = -ay = 4zU / mω ro a/q = 2U/V, others fixed 2 2 qu = qx = -qy = 2zV / mω ro 38 Length, diameter, kinetic energy -Æm/e range, resolution 39 Ion-Trap Instrument Cap electrode

• Principle very similar to quadrupole • Ions stored by RF & DC fields • Scanning field can eject ions of Ring electrode specific m/z MSn

40 41 Multiple MS/MS (MSn) for Structural determination

42 Time-of-flight Mass Spectrometry

43 Laser

2500 Volt

+ + + + + +

electric field drift region ~3.6cm ~1m 1 Mass spectrum  m  2 + t =   D  2zeEs  + + + t= 100 µ second for mass 50 ( small moleculae) 1000 µ second for mass 5,000 (peptide, polymer…) 3000 µ second for mass 50,000 (protein, DNA….)

m/z 44 Calibration of Mass Spectrum

Flight time 1 D: drift length  m  2 t =   D  2zeEs  1 t = a()m 2 + b The mass is independent of any instrumental parameters [M+H]+ „ Internal calibration „ External calibration 45 m/z is a function of flight time Vacuum pump S D 10-5 ~10-8 torr

E=V/s

V In a electric field E = where V = accelerating voltage s 1 2 1 2  2zeEs  In drift region mv = qV = zeEs v =   2  m  1  m  2 Flight time t =   D  2zeEs   m   2eEs  Relation between m/z and t   =  t 2  z   D2  46 FactorsFactors ofof PoorPoor ResolutionResolution inin TOFTOF MSMS

Linear MALDI TOF

Time-of-flight will be affected by:

Linear MALDI TOF ‹ Temporal Distribution Large kinetic energy spread, broad peaks time of ion formation poor resolution ‹ Spatial Distribution (unresolved isotope) initial location in the extraction field m/z ‹ Kinetic Energy Distribution different initial kinetic energy 47 Ions of same mass, Delayed Extraction (DE) different velocities 1: No applied electric field. 0 V Ions spread out.

Uslow > Umedium > Ufast V < V < V Potential

+20 kV

2: Field applied. Gradient accelerates slow ions more than fast ones.

+20 kV +

3: Slow ions catch up with faster ones at the detector 48 Fourier Transform Ion Cyclotron Resonance Analyzer, FTICR

Z

Comisarow, M. B.; Marshall, A. G. Fourier transform ion cyclotron resonance spectroscopy49 Chem. Phys. Lett. 1974, 25, 282-283. v F=z v x B F: Magnetic force on the ion z: Charge of the ion F v: Velocity of the ion

X B: Magnetic Field

Z A charged particle (ion) will rotate around the magnetic field line of a homogeneous magnetic field in a circular motion.

50 Large biological system Exact mass measurement LC-MS Comparison of Different Types of Mass Spectrometers

Characteristic Magnetic Quadrupole QIT TOF FT-ICR Sector Mass Range 15,000 4,000 100,000 unlimited > 106 (Da) Resolution 200,000 Unit 30,000 15,000 > 106

Dynamic ++++ +++ +++ +++ ++ Range MS/MS ++++ +++ +++ +++ ++

LC or CE + ++++ +++ ++ ++

Cost $$$$ + + ++ ++++ 51 Tandem Mass Spectrometry

PARENT MASS ANALYZER IS PRODUCT PARENT IONS PARKED, PASSING MASS ANALYZER ENTER ONLY PARENT IONS MS 2 IS SCANNED, TRANSPORT of + / COLLISION CELL + / - IONS AND OF A SINGLE M / Z PASSING FULL -IONS TO COLLIDE WITH NEUTRALS TO COLLISION SCAN PRODUCT PARENT MASS Ar GAS AND FORMED IN CELL ION SPECTRA TO ion SOURCE ANALYZER MS 1 DISASSOCIATE THE DETECTOR

E n e rg y

Collision cell ION SOURCE Transport MS 1 MS 2 DETECTOR optics Inert gas 52 Methods of MS-MS

MS 1 Collision MS 2 In space cell Precursor Product ion scan ion selection Dissociation

CID

53 54 Methods of MS-MS

In time Time 1 Time 2 Time 3

CID

famplitude rf EJECT ALL IONS OTHER PRODUCT THAN PARENT ION SCAN ION

INJECION OF IONS INTO THE COLLISIONAL- TRAP INDUCED DISSOCIATION 55 Time (ms) New Types Mass spectrometer

MS 1 Collision MS 2 cell

Electrostatic magnetic analyzer analyzer

Quadrupole Quadrupole analyzer analyzer Ion cyclotron Ion trap + resonance cell analyzer

Ion trap analyzer Time-of-flight analyzer Time-of-flight analyzer 56 Orthogonal-Quadrupole Time-of-flight Mass Spectrometer, Q-TOF

Triple quadruple MS ¾ Good precursor ion selection ¾ Efficient fragmentation +

Time-of-flight MS ¾Good sensitivity ¾ Good sensitivity ¾ Good resolution ¾ Speed 57 2D Linear Trap Instrument

Triple quadruple MS Quadrupole ion trap MS

+ Y

¾ 4 scan modes ¾Good sensitivity ¾ Quantitative analysis ¾ Sensitive product ion scans

Blanc et al,58 proteomics, 3, 859 (2003) FTICR Trap Instrument Linear ion trap MS

¾Good sensitivity ¾ MSn capibility

super conducting + magnet

Fourier transform ion cyclotron resonance MS super conducting magnet ¾ Accurate Mass ¾ Ultra-high Resolution ¾ High dynamic range

59 Time-of-flight / time-of-flight Mass Spectrometer, TOF-TOF

N2 laser Detectors TOF 2 source

TIS Retarding Reflectron lens TOF 1 Floating TOF 2

collision cell Schematics from Applied Biosystemss60 High Energy Peptide Fragment Pattern and Nomenclature

xn-i - Single cleavage A yn-i zn-i Yn-i-1= yn-i-1-2H vn-i+1 wn-i yn-i-1 -HN-CH-CO-NH-CH-CO-NH-

Ri CH-R’i+1

ai b R”i+1 b i c i+1 B - Double cleavage i di+1 Nomenclature of Peptide Fragmentation -CO-HN-CH-CO-NH-CH-CO-NH-

Ri Ri+1 ai+1 yn-i 61 bi+1 yn-i Carbohydrates • Carbohydrates are the most abundant organic molecules in nature – Photosynthesis energy stored in carbohydrates; – Carbohydrates are the metabolic precursors of all other biomolecules; – Important component of cell structures; – Important function in cell-cell recognition; – Carbohydrate chemistry: • Contains at least one asymmetric carbon center; • Favorable cyclic structures; • Able to form polymers 62 Summary of Some Functions of Glycans

63 The cell surface of bacterial is LPS-rich

64 Roles of in Recognition and Adhesion at the Cell Surface Virus, Bacterium, Lymphocyte…….

65 Carbohydrates Have Diverse M.W.

66 Mass and Low Mass MS/MS (CID) Carbohydrate Marker Ions

Monosaccharide / Composition Monoisotopic mass Low Mass CID Intact Residue Hexose, Hex C6O6H12 180.0634 162.0528 163 (Galactose, Mannose) Deoxy-Hexose, dHex C6O5H12 164.0685 146.0579 147 (Fucose) N-acetylhexoseamine, HexNAc C8O6NH15 221.0899 203.0794 204 (N-acetylglucoseamine, N-acetylgalactoseamine) N-acetylneuraminic C11O9NH19 309.106 291.0954 292/274/256 acid, Neu5Ac, NANA (sialic acid) Hex-HexNAc C14O11NH25 383.1428 365.1322 366 Hex3HexNac2 C46O35N2H76 910.3278 892.3172 - (Common N-linked Core)

67 High Mannose-Type Glycans

68 THE JOURNAL OF BIOLOGICAL CHEMISTRY Vol. 276, No. 10, Issue of March 9, pp. 7351–7356, 2001 MALDI/TOF/TOF MS Spectrum of N-Glycans Derived from Ribonuclease B 1257.4617 100 2725.9

90

80

70

60

50 % Intensity 40 1419.5123 30

20 1743.6193 1581.5762 10 1905.6665 69 0 999.0 1499.6 2000.2 Mass (m/z) Tandem Mass Spectrometry

70 MALDI/TOF/TOF MS/MS of Man5 Derived

22.9524 3,5 100 X4α’ from Ribonuclease B 2.0E+4 CH2OH O

90 O CH OH OH 2 1,5X 3α 3,5 3,5 OH 1,5 O 3,5 A A X A3 4 5 4α’ CH2OH 1,4 CH2OH 80 3,5 O CH2OH X OH A 4α” OH 5 CH2OH O O 2,4 O O A5 O 0,4 A OH O O 70 A3 OH OH OH OH 3,5 OH OH X 1,5 NH OH 3β NH 1,5 CH2OH X2 X4α” O O C O C O 60 0,2 CH 0,2 CH3 A4 3 A5 OH OH B4

50 OH 1,5 β X4β 4x,3 % Intensity α 1036.3893 α 3 3 X 3 3 A A

40 1257.5524 1,5 2 X X 5 4 0,4 X 0,4 B /Y

3 4x A 1,5 1,5 A C β 1,5 B 3 2α 2,4 30 Y 3,5 2 4x,3 1123.4529 β 671.2255 3 3 5 /Y 849.2728 799.2855

4 C 5 475.1992 A A

(Hex) A 4 4x,3 4

1 2 C B Y 1095.4425 2α 907.3265 A 20 1 447.1826 X Y 0,2

Z A 3,5 X Z 2 3,5 3α 3,5

1 3,5

B 0,2 1,5 653.2201 3 347.1320 527.2029 1061.4415

10 771.2791

874.3221 Y 509.1753 550.2066 979.3621 4x 429.1849 583.2186 73.2773 625.2424 226.0908 399.1900 689.2335 41.1236 164.9358 306.8996 327.1572 1156.4673 244.1070 745.2795 953.3788 817.2909 712.2309 365.1320 185.0653 272.1050 1010.3968 147.7515 1182.4816 933.3752 607.1801 203.0858 91.4020 0 9.0 271.6 534.2 796.8 1059.4 1322.0 Mass (m/z) 71 MALDI/TOF/TOF MS/MS of Man6 Derived from Ribonuclease B

3,5 X4α’ 100 CH2OH O 6688.0

O CH OH OH 21,5 90 X3α OH O 3,5 1,5 3,5 A 3,5 X A3 4 A 4α’ O CH2OH 5 3,5 OH CH2OH CH2OH

22.9496 X OH 2,4 80 CH2OH 4α” O O A5 O B4 O O 0,4 OH O O A3 OH OH 1419.5812 OH OH 3,5 3OH OH 70 X 3β 1,5 NH NH OH X 1198.4194 CH2OH X2 1,5 C O O O C O X4α” 0,2 CH 0,2 CH3 A5 3

A4 4x 60 4 3,5 OH A X /Y 4βOH CH2OH 1,5

1,5 4x O X3β 0,2

50 O 5 /Y 4x α 3 3 A

OH OH 5 X % Intensity B Y A 2,4 1,5 OH 1,5 / 40 2,4 X 4x 4β β

B 3

3 Y 4x 2 α 671.2195 X 3 5 4 X /Y X A 1,5 A 30 4x 3 4x 1,5 4 1285.4937

Y 1257.4738

2 1,5 3 0,2 β 3,5 2 A /Y B A /Z 3 3α 3 1011.3154 X 3 X 0,4 3,5 B 4x,3 A / B 0,4

20 0,4 995.3016 475.2043 X C 1123.3922 4x 3,5 Y C 447.1803 2α 73.2619 961.3378 1 2β Y4x 5 1069.3621 3,5 Y 41.1166 Z1 A 1095.3969 55.1815 3,5 655.2412

10 833.2704 503.2459 347.1321 527.1953 1036.3636 874.3069 306.8848 164.9289 583.2076 933.3193 1216.4386 712.2351 1172.4325 907.2592 226.0978 799.3037 740.3250 1344.5597 131.6229 105.4097 185.0463 0 9.0 305.6 602.2 898.8 1195.4 1492.0 Mass (m/z) 72 Identification and Quantification of Proteins

73 Peptide and Protein Analysis

74 Mass Spectrometry Methods

digestion

Molecular weight Identification Sequence information

9 MALDI TOF MS 9Peptide mass fingerprinting 9 ESI MS (MALDI TOF MS) 9Peptide Sequencing (Tandem MS)

75 Peptide Mass Fingerprint • This methodology was the first to allow protein identification without the need for time- consuming Edman sequencing or immunoaffinity probes..

• Landmark Study(1993) MS approaches alone could be Left to right: used to analyze proteins from William J. Henzel; Protein Chemistry; Genentech 2DE John T. Stults; ; Genentech Colin Watanabe; Software Engineer; Genentech 76 Peptide Mass Fingerprinting

H2N-

-COOH

77 Step 1: Enzyme digestion or chemical fragmentation

Protein Cys H N- 2 Lys Arg

-COOH

Reduction / Alkylation Tryptic Digest

Every protein generate a set of unique peptides 每一個蛋白質有一套獨特的peptide碎片 78 Step 2: Mass spectrometry analysis

A unique list

Peptide Mass Lists Peptide Mixtures for each protein in Database % Intensity Extraction, clean up Mass Spectrometry Analysis

Mass Spectrum

600m/z 3000 79 Step 3: Database match

Digest Reagents Peptide Properties

Tolerance

Mol Weight Exclude List Database

Isoelectric Point

Modification Hits to return

Fixed Optional Peptide Match

80 Step 3: In-Silica digestion Mass Peptide Sequence QNGTTDISILHCTTEYPTPY 3583.8 PSLNLNVIHTLK Protein Sequence RPGNGISPMNWYDILGQEAQ 3493.6 DDFEEDEVIR MVYIIAEIGC NHNGDINLAKKMVDVAVSCG GVETFSTPFDEESLEFLIST 2995.4 DMPIYK VDAVKFQTFK AEKLISKFAP KAEYQKATTG DLTIGYSDHSIGSEVPIAAA TADSQLEMTKRLELSFEEYL EMRDYAISKG 2944.5 AMGAEVIEK VETFSTPFDE ESLEFLISTD MPIYKIPSGE 2324.2 VILSTGMAVMEEIHQAVNIL R ITNLPYLEKI GKQQKKVILS TGMAVMEEIH 2188.1 MVYIIAEIGCNHNGDINLAK QAVNILRQNG TTDISILHCT TEYPTPYPSL NLNVIHTLKD EFKDLTIGYS DHSIGSEVPI Digestion 1811.8 FENQLPELHHHHHH AAAAMGAEVI EKHFTLDTNM 1683.8 HFTLDTNMEVPDHK EVPDHKASAT 1573.8 IPSGEITNLPYLEK PDILAALVKG FALLNQALGR FEKIPDPVEE 1558.7 LELSFEEYLEMR KNKIVARKSV VALKPIKKGD IYSIENITVK 1453.7 ATTGTADSQLEMTK RPGNGISPMN WYDILGQEAQ DDFEEDEVIR 1392.7 MVDVAVSCGVDAVK % DSRFENQLPE LHHHHHH 1351.7 GDIYSIENITVK 1269.7 ASATPDILAALVK 1159.7 GFALLNQALGR 954.63 SVVALKPIK 926.48 IPDPVEEK 696.36 DYAISK Intensity 670.36 FQTFK 638.31 AEYQK 600 10001500 2000 2500 3000 538.25 DEFK 81 m/z Search Result

82 Peptide Mass Fingerprinting (PMF) 蛋白質身份鑑定

In-Gel Digestion

Extract peptides; mass analyze

Each protein has a unique peptide mass list

Database search 83 Tandem Mass Spectrometry (MS/MS)

Ionization Source MALDI Electrospray

CID

Molecular Ion

Molecular Ions Fragment Ions

84 m/z m/z Peptide Fragmentation

a,b,c – N-terminal side x,y,z – C-terminal side

85 Table9.2 Symbols and residue masses of the protein amino acids Name Symbol Residue mass Side-chain Residue Alanine A,Ala 71.079 CH3- Mass of Arginine R,Arg 156.188 HN=C(NH2)-N-(CH2)3 Asparagine N,Asn 114.104 H2N-CO-CH2-

Amino Aspartic acid D,Asp 115.089 HOOC-CH2- Acids Cysteine C,Cys 103.145 HS-CH2- Glutamine Q,Gln 128.131 H2N-CO-(CH2)2-

Glutamic acid E,Glu 129.116 HOOC-(CH2)2- Glycine G,Gly 57.052 H-

Histidine H,His 137.141 Imidazole-CH2-

Isoleucine I,Ile 113.16 CH3-CH2-CH(CH3)-

Leucine L,Leu 113.16 (CH3)2-CH-CH2-

Lysine K,Lys 128.17 H2N-(CH2)4-

Methionine M,Met 131.199 CH3-S-(CH2)2-

Metsulphoxide Met.SO 147.199 CH3-S(O)-(CD2)2-

Phenylalanine F,Phe 147.177 Phenyl-CH2- Proline P,Pro 97.117 Phrrolidone-CH-

Serine S,Ser 87.078 HO-CH2-

Threonine T,Thr 101.105 CH3-CH(OH)- Tryptophan W,Trp 186.213 ndole-NH-CH=C-CH

Tyrosine Y,Tyr 163.176 4-OH-Phenyl-CH86 2-

Valine V,Val 99.133 CH3-CH(CH2)- b Peptide Sequencing n F E S N F N T Q A yn

147 87

F (phenylalanine): 147.177 S (Serine) : 87.078 87 Separation and Quantification Techniques for Mass Spectrometry

88 Diverse Properties of Proteins v.s. Mass Spectrometry

9 M.W. measurement 9 Protein identification Protein 9 tandem mass spectrometry Complex Protein-ligand (machines) interaction 9 M.W. measurement 9 Protein identification

Lipid NO PO4

Post-translational modification

9 Protein identification 9 Tandem mass spectrometry 89 What Do We See?

Technology Platform V.S. Complex Proteome 90 Proteomic technologies for biomarker detection & discovery

„ Gel electrophoresis 9 Two-dimensional gel electrophoresis (2DE) 9 Two-dimensional differential gel electrophoresis (DIGE)

• Gel free ( Mass spectrometry) 9 Surface-enhanced laser desorption / ionization TOF MS (SELDI- TOF MS) 9 Multidimentional protein identification technology (MudPIT) 9 Isotope coded affinity tag (ICAT)

„ Other technologies:

9 Protein microarrays 91 9 ELISA Number of Proteins per Genome

• Haemophilus 1742 • E. coli 4413 • Yeast 6600 • Caenorhabditis 18000 • Drosophila 13000 • Human >100000

92 A key technical challenge in proteomics

A more complex biological problem

Diverse solution (105 –106 for protein abundance)

Mass-spectrometry Sample prefractionation strategy for more + complex mixture ? 93 Protein Separation Fxn B5 -> H6 E. coli Lysate E.

200

116 95 Salt Gradient 68

39

29

20 14

Chromatography SDS-PAGE 2D-gel

94 Principle of Electrophoresis

-

v = µ ’ E v = migration velocity (cm/s) µ = electrophoretic mobility (cm2/V’s) charge, size, shape of molecule,

viscosity, pore size, buffer pH and 95 ionic strength, temperature of medium Two-dimensional gel electrophoresis

Protein Extract

Separate proteins on 2-D gels Separation based on pI Separation basedonsize

Analyze spots by AAA Sequencing Mass spectrometry

96 2D-PAGE (二維電泳分離)

Run gel; stain;

In-Gel Digestion and sample prep 97000 for MS 66000

1619.28 100 968.64 1251.93

1252.97 2313.69 969.70 2312.76 1621.14 2314.75 1670.25 1619.28 100 968.64 1671.24 1251.93

% 1252.97 2313.69 45000 969.70 2312.76 2315.80 970.69 1254.00 1672.14 1621.14 2314.75 1670.25 1619.28 1013.80 100 2366.83 968.64 1671.24 836.58 1251.93 1060.23 1570.17 2367.90 893.67 1873.46 1061.26 2271.83 2368.85 % 1448.14 1853.38 2019.57 1252.97 2313.69 1876.41 969.70 2036.57 2369.91 2312.76 2718.092315.80 970.69 1254.00 1672.14 1621.14 0 m/z 2314.75 1000 1200 1400 1600 1800 2000 2200 2400 26001670.25 2800 1619.28 1013.80 100 2366.83 968.64 1671.24 836.58 1251.93 1060.23 1570.17 2367.90 893.67 1873.46 1061.26 2271.83 2368.85 % 1448.14 1853.38 2019.57 1252.97 2313.69 1876.41 969.70 2036.57 2369.91 2312.76 2718.092315.80 970.69 1254.00 1672.14 1621.14 0 m/z 2314.75 1000 1200 1400 1600 1800 2000 2200 2400 26001670.25 2800 1013.80 2366.83 836.58 1671.24 1060.23 1570.17 2367.90 893.67 1873.46 1061.26 Extract peptides; 2271.83 2368.85 % 1448.14 1853.38 2019.57 1876.41 2036.57 2369.91 2718.092315.80 30000 970.69 1254.00 1672.14 0 m/z 1000 1200 1400 1600 1800 2000 2200 2400 2600 2800 1013.80 2366.83 836.58 1060.23 1570.17 2367.90 893.67 1873.46 1061.26 2271.83 2368.85 1448.14 1853.38 2019.57 1876.41 2036.57 2369.91 mass analyze 2718.09

0 m/z 1000 1200 1400 1600 1800 2000 2200 2400 2600 2800

20100

14400 Database search

01-2576b Mix-2 97 Separation methods for protein and peptides

Method Basis of separation Chromatographic methods size-exclusion molecular weight ion-exchange chromatography charge reverse-phase high-performance chromatograph hydrophobic interaction between the sample and bonded hydrophobic-interaction chromatography salt-promoted adsorption chromatography affinity chromatography biomolecular interaction (DNA, ligand, ⋯) Electrophoretic methods one-dimensional gel electrophoresis molecular weight two-dimensional gel electrophoresis 1st dimension: charge; 2nd dimension: molecular weigh gel-free isoelectric focusing charge

Subcellular fractionation subcellular fractionation

98 (二維電泳分離) Example: Today’s 2D Gel-Based Proteomics

differentially expressed (170) no differences 99 2D-PAGE & protein identification

isoelectric focusing, IEF SDS-PAGE pretreated pI (Molecular weight) samples

differential expression pattern

m/z database peptide mass MS in-gel digestion searching finger-printing 100 2D Gel Robotic spot excision In-Gel Digestion and sample prep for MS

1619.28 100 968.64 1251.93

1252.97 2313.69 969.70 2312.76 1621.14 2314.75 1670.25 100 1619.28 968.64 1671.24 1251.93

% 1252.97 2313.69 969.70 2312.76 2315.80 970.69 1254.00 1672.14 1621.14 2314.75 1670.25 1013.80 100 2366.831619.28 968.64 1671.24 836.58 1251.93 1060.23 1570.17 2367.90 893.67 1873.46 1061.26 % 2271.831252.97 2368.85 1448.14 1853.38 2019.57 2313.69 969.701876.41 2036.57 2369.91 2312.76 2718.092315.80 970.69 1254.00 1672.14 1621.14 0 m/z 2314.75 1000 1200 1400 1600 1800 2000 2200 2400 26001670.25 2800 1013.80 100 2366.831619.28 968.64 1671.24 836.58 1251.93 1060.23 1570.17 2367.90 893.67 1873.46 1061.26 % 2271.831252.97 2368.85 1448.14 1853.38 2019.57 2313.69 969.701876.41 2036.57 2369.91 2312.76 2718.092315.80 970.69 1254.00 1672.14 1621.14 0 m/z 2314.75 1000 1200 1400 1600 1800 2000 2200 2400 26001670.25 2800 1013.80 2366.83 836.58 1671.24 1060.23 1570.17 2367.90 893.67 1873.46 1061.26 % 2271.83 2368.85 1448.14 1853.38 2019.57 1876.41 2036.57 2369.91 2718.092315.80 970.69 1254.00 1672.14 0 m/z 1000 1200 1400 1600 1800 2000 2200 2400 2600 2800 1013.80 2366.83 836.58 1060.23 1570.17 2367.90 893.67 1873.46 1061.26 2271.83 2368.85 1448.14 1853.38 2019.57 1876.41 2036.57 2369.91 2718.09

0 m/z 1000 1200 1400 1600 1800 2000 2200 2400 2600 2800

Automated data processing and database searching Advanced MS 101 Differential gel electrophoresis (2D-DIGE)

Test labelled Control labelled with propyl-Cy3 with methyl-Cy5

IEF

2D-electrophoresis SDS-PAGE λE Cy3 λE Cy5 Image A Image B

102 2D-PAGE & protein identification

• MALDI-TOF MS is mostly used • advantages: – discovering biomarker – high reproducibility – semi-quantitative • drawbacks: – low sensitivity, particularly for less-abundant proteins

103 Plasma proteome of Severe Acute Respiratory Syndrome (SARS) analyzed by 2-DE and MS

(B) Control SARS (A) (A) 31 34 Ⅰ 18 19 32 13 13 18 30 19 17 Ⅰ 30 29 66 29 13 Ⅱ 36 36 15 33 28 15 15 Ⅱ 45 36 16 14 16 10 12 27 6 5 11 5 9 Ⅲ Ⅲ 37 7 8 9 37 37 30 35

7 8 27 25 Ⅳ 23 24 38 25 21 Ⅳ 24 22 25 38 21 20 20 26 4 Ⅴ 3 Ⅴ 4 3 1 4 3 2 14 2 1

pH 4 pH 7 Normal Patient Proc. Natl. Acad. Sci. USA., 101, 17039 (2004) 104 Protein Profiles of SARS Patients

030707-4, 5/28 030708-4, 6/1 030701-4, 5/5 030702-4, 5/11 030703-4, 5/16 030704-4, 5/21

03-0801, 5/3 03-0802, 5/9 03-0803, 5/15 03-0804, 5/21 03-0806, 5/27 03-0807, 5/30

030808, 4/29 030810, 5/5 030811, 5/8 030813, 5/10 030812, 5/12

0816, 5/21 0817, 5/29 0815, 5/16 0809, 5/1 0814, 5/11

105 MALDI

Peptide Mass Fingerprinting (PMF)

Direct LC-MS/MS

Poor salt tolerance Ion suppression

106 液相層析輔助質譜 Ion suppression

1619.28 100 968.64 Salt tolerance 1251.93 1252.97 2313.69 969.70 2312.76 1621.14 2314.75 1670.25 100 1619.28 968.64 1671.24 1251.93

% 1252.97 2313.69 Sample concentration 969.70 2312.76 2315.80 970.69 1254.00 1672.14 1621.14 2314.75 1670.25 1013.80 100 2366.831619.28 968.64 1671.24 836.58 1251.93 1060.23 1570.17 2367.90 893.67 1873.46 1061.26 % 2271.831252.97 2368.85 1448.14 1853.38 2019.57 2313.69 969.701876.41 2036.57 2369.91 2312.76 2718.092315.80 970.69 1254.00 1672.14 1621.14 0 m/z 2314.75 1000 1200 1400 1600 1800 2000 2200 2400 26001670.25 2800 1013.80 100 2366.831619.28 968.64 1671.24 836.58 1251.93 1060.23 1570.17 2367.90 893.67 1873.46 1061.26 % 2271.831252.97 2368.85 1448.14 1853.38 2019.57 2313.69 969.701876.41 2036.57 2369.91 2312.76 2718.092315.80 970.69 1254.00 1672.14 1621.14 0 m/z 2314.75 1000 1200 1400 1600 1800 2000 2200 2400 26001670.25 2800 1013.80 2366.83 836.58 1671.24 1060.23 1570.17 1619.282367.90 893.67 100 968.641873.46 1061.26 1251.93 % 2271.83 2368.85 1448.14 1853.38 2019.57 1876.41 2036.57 2369.91 2718.092315.80 970.69 1254.00 1252.971672.14 2313.69 969.70 0 2312.76m/z 1000 1200 1400 1600 1800 2000 2200 24001621.14 2600 2800 1013.80 2366.83 2314.75 836.58 1670.25 1619.282367.90 893.67 1060.23 1570.17100 968.641873.46 1671.24 1251.93 1061.26 2271.83 2368.85 1448.14 1853.38 2019.57 1876.41 2036.57 2369.91 % 1252.97 2718.09 2313.69 969.70 2312.76 0 2315.80 m/z 1000 1200 1400970.69 16001254.00 1800 20001672.14 2200 24001621.14 2600 2800 2314.75 1670.25 1013.80 100 2366.831619.28 968.64 1671.24 836.58 1251.93 1060.23 1570.17 2367.90 893.67 1873.46 1061.26 % 2271.831252.97 2368.85 1448.14 1853.38 2019.57 2313.69 969.701876.41 2036.57 2369.91 2312.76 2718.092315.80 970.69 1254.00 1672.14 1621.14 0 m/z 2314.75 1000 1200 1400 1600 1800 2000 2200 2400 26001670.25 2800 1013.80 100 2366.831619.28 968.64 1671.24 836.58 1251.93 1060.23 1570.17 2367.90 893.67 1873.46 1061.26 % 2271.831252.97 2368.85 1448.14 1853.38 2019.57 2313.69 969.701876.41 2036.57 2369.91 2312.76 2718.092315.80 970.69 1254.00 1672.14 1621.14 0 m/z 2314.75 1000 1200 1400 1600 1800 2000 2200 2400 26001670.25 2800 1013.80 2366.83 836.58 1671.24 1060.23 1570.17 2367.90 893.67 1873.46 1061.26 % 2271.83 2368.85 1448.14 1853.38 2019.57 1876.41 2036.57 2369.91 2718.092315.80 970.69 1254.00 1672.14 0 m/z 1000 1200 1400 1600 1800 2000 2200 2400 2600 2800 1013.80 2366.83 836.58 1060.23 1570.17 2367.90 893.67 1873.46 1061.26 2271.83 2368.85 1448.14 1853.38 2019.57 1876.41 2036.57 2369.91 2718.09

0 m/z 1000 1200 1400 1600 1800 2000 2200 2400 2600 2800

107 Table9.3 Mass differences due to isotopes in multiply charged p Charge on peptide Apparent mass Mass difference between isotope Single charge [(M+H)/1] 1Da Double charge [(M+2H)/2] 0.5Da Triple charge [(M+3H)/3] 0.33Da n charges [(M+nH)/n] 1/nDa

108 Liquid chromatography scan

MS scan

MS/MS scan

109 110 111 112 113 114 115 Multi-dimensiona Liquid Chromatography(MDLC) Coupled Directly to MS

Initial Biological Sample Prep and Protein Extration (organs, tissues, cell types, subcellular components)

MDLC

Dry Down Reduce/Alkylate/Digest

Edman Sequencing Mass Spectrometry

Further Analysis and Characterization 116 MudPIT (Multidimentional Protein Identification Dechnology )

Liquid Chromatography digestion column 1 column 2 pretreated ion-exchange hydrophobic interaction samples MS database searching LC/MS 5 on-line & data comparison LC/MS 4 tandem MS

LC/MS 3 LC/MS 2

LC/MS117 1 Isotope Coded Affinity Tag

Nature biotechnology (1999)17:994-999 O

HN NH O X X O X X O I N O O N S X X H X X H

Thiol-specific Biotin Linker (light or heavy) reactive group Light reagent : D0-ICAT (X=H) Quantification Heavy reagent : D8-ICAT (X=D) SH CH2

H2NCHC OH m=8 O Cysteine

118 Isotope Coded Affinity Tag (ICAT) strategy

protein tagged with “light” mixing [12C or H] sample ICAT reagent digestion pretreatment protein tagged with “heavy” purifying tagged [13C or D] pepticdes ICAT reagent

LC-

database m/z searching MS/MS abnormal “light”/ “ heavy”ratio 119 (四) 應用實例

120 Proteomic pattern diagnostics

surface-enhanced laser desorption / ionization-time-of-flight, SELDI-TOF MS

applying to protein-binding chip samples from chip types: normal persons cation or anion exchange, and patients immobilized metal affinity, hydrophobic interaction, etc. further analysis rinsing to remove using MS/MS unbound species, technology ? MS m/z adding matrix gel-view compound mass spectra 121 Disease Proteomics 一滴血可以診斷出疾病嗎?

122 Protein Chips 蛋白質晶片

123 找尋腫瘤標記蛋白質

正常人

良性攝護 腺腫瘤

攝護腺癌

124 125 126 127 128 Tandem-affinity purification (TAP)

1739

1548

1167

589

232

Mass spectrometry for component identification 129 Statistics of identified proteins and complexes

130 Primary validation of complex composition by `reverse' purifcation

The bands of the tagged proteins are indicated by arrowheads.

New components 131 Protein complex network

Yeast TAP-C21 132 Structural Proteomics

comprehensive description of the multitude of interactions between molecular entities, which in turn is a prerequisite for the discovery of general structural principles that underlie all cellular processes.

NATURE | VOL 422 | 13 MARCH133 2003 Wolfgang Baumeister