
October 2015 www.nature.com/milestones/mass-spec MILESTONES

Produced by: Nature Methods, Nature, Nature Biotechnology, Nature Chemical Biology and Nature Protocols

MILESTONES
Mass Spectrometry

MILESTONES
4 Timeline
5 Discovering the power of mass-to-charge (1910)
6 Development of ionization methods (1929)
7 Isotopes and ancient environments (1939)
8 When a velocitron meets a reflectron (1946)
8 Spinning trajectories (1949)
9 Fly out of the traps (1953)
10 Breaking down problems (1956)
10 Amicable separations (1959)
11 Solving the primary structure of peptides (1959)
12 A technique to carry a torch for (1961)
12 The pixelation of mass spectrometry (1962)
13 Conquering carbohydrate complexity (1963)
14 Forming fragments (1966)
14 Seeing the full picture of metabolism (1966)
15 Electrospray makes molecular elephants fly (1968)
16 Signatures of disease (1975)
16 Reduce complexity by choosing your reactions (1978)
17 Enter the matrix (1985)
18 Dynamic protein structures (1991)
19 Protein discovery goes global (1993)
20 In pursuit of PTMs (1995)
21 Putting the pieces together (1999)

COLLECTION
NATURE METHODS: COMMENTARY
23 Mass spectrometry in high-throughput : ready for the big time
Tommy Nilsson, , , John R Yates III, Amos Bairoch & John J M Bergeron

NATURE: REVIEW
28 The biological impact of mass-spectrometry-based proteomics
Benjamin F. Cravatt, Gabriel M. Simon & John R. Yates III

NATURE: REVIEW
38 Metabolic phenotyping in clinical and surgical environments
Jeremy K. Nicholson, Elaine Holmes, James M. Kinross, Ara W. Darzi, Zoltan Takats & John C. Lindon

CITING THE MILESTONES
The Nature Milestones: Mass Spectrometry supplement has been published as a joint project between Nature Methods, Nature, Nature Biotechnology, Nature Chemical Biology and Nature Protocols. However, most referencing formats and software do not allow the inclusion of more than one journal name or volume in an article reference. Therefore, should you wish to cite any of the Milestones, please reference the page number (Sxx–Sxx) as a supplement to Nature Methods. For example, Nat. Methods 12, Sxx–Sxx (2015). To cite articles from the Collection, please use the original citation, which can be found at the start of each article.

VISIT THE SUPPLEMENT ONLINE
The Nature Milestones in Mass Spectrometry supplement can be found at www.nature.com/milestones/mass-spec
All Collection articles will be available free for six months.

CONTRIBUTING JOURNALS
Nature Methods, Nature, Nature Biotechnology, Nature Chemical Biology and Nature Protocols.

SUBSCRIPTIONS AND CUSTOMER SERVICE
Americas: Springer Nature, Customer Service, One New York Plaza, Suite 4500, New York, NY 10004-1562. T: (212) 726 9200 or +1 212 726 9223 (outside US/Canada). E-mail: [email protected]
UK/Europe/ROW (excluding Japan): Nature Publishing Group, Subscriptions, Brunel Road, Basingstoke, Hants RG21 6XS, UK. Tel: +44 (0)1256 329 242; Fax: +44 (0)1256 812 358. E-mail: [email protected]
Japan: Nature Publishing Group, Asia-Pacific, Chiyoda Building 5-6th Floor, 2-37 Ichigaya Tamachi, Shinjuku-ku, Tokyo 162-0843, Japan. Tel: +81 3 3267 8751; Fax: +81 3 3267 8746. E-mail: [email protected]
CUSTOMER SERVICE: [email protected]

Copyright © 2015 Nature America, Inc.

NATURE MILESTONES | MASS SPECTROMETRY OCTOBER 2015 | 1

Nature Milestones are special supplements that aim to highlight the outstanding technological developments and scientific discoveries that have helped to define a particular field. Nature Milestones in Mass Spectrometry, a collaboration between five Nature Publishing Group journals, presents a historical look back at the key technical developments in mass spectrometry and the chemical and biological applications that stemmed from these advances. Each short Milestone article, written by a Nature Publishing Group editor, covers one breakthrough, highlighting the main papers that contributed to the advance and discussing both their value at the time and their lasting influence on mass spectrometry today.

The Milestone topics and papers were selected with the help of expert advisers, but the ultimate decisions on what to include were made by the editors. Nature Milestones in Mass Spectrometry is not meant to be a comprehensive overview of this field, and despite our and the advisers’ best efforts, omissions of important literature are inevitable. Our intent is to give readers a taste of the key advances in this technique, with a special focus on biological and biomedical applications, areas in which much of the research using mass spectrometry is currently concentrated.

The seeds of mass spectrometry were planted just over a century ago with the pioneering work of physicist J.J. Thomson (see Milestone 1). The development of ionization methods (Milestone 2) and instrumentation (see Milestones 4–6) was fueled in part by the Manhattan Project during the Second World War. The first applications of mass spectrometry in the field of were reported soon after, and to this day, mass spectrometry serves as a workhorse technique for molecular and elemental analysis in laboratories worldwide (see Milestones 3, 7, 10 and 12).

The development of the soft ionization techniques of electrospray ionization (Milestone 15) and matrix-assisted laser desorption/ionization (MALDI; Milestone 18), and also of tandem mass spectrometry (Milestone 13) and of the combination of chromatographic separation with mass spectrometry (Milestone 8), further revolutionized the field, allowing mass spectrometry to become an essential tool not just in chemical research but also in the biological arena. Today, mass spectrometry is the central technology employed in the field of proteomics (Milestone 20), enabling the analysis of post-translational modifications (Milestone 21) and protein interactions (Milestone 22), and it is also an important tool in structural biology (Milestone 19).

The supplement includes a Timeline that lists the key developments (by the year in which the first milestone paper pertinent to each breakthrough was published), a reprinted Commentary from Nature Methods and two reprinted Reviews from Nature (these articles will be made freely available online until March 2016). The Milestones website also includes an extensive Library listing of mass spectrometry–related papers published in Nature Publishing Group journals.

We would like to sincerely thank our advisers and acknowledge support from SCIEX, Thermo Fisher Scientific Inc. and Waters Corporation. As always, Nature Publishing Group takes complete responsibility for the editorial content.

Allison Doerr, Senior Editor, Nature Methods
Joshua Finkelstein, Senior Editor, Nature
Irene Jarchum, Associate Editor, Nature Biotechnology
Catherine Goodman, Senior Editor, Nature Chemical Biology
Bronwen Dekker, Senior Editor, Nature Protocols

Cover: Design by Erin Dewalt. Original mass spectrum taken from Käll, L. et al., Nat. Methods 4, 923–925 (2007).

EDITORIAL OFFICES
New York: Springer Nature, One New York Plaza, Suite 4500, New York, NY 10004-1562. T: (212) 726 9200
Coordinating editors: Allison Doerr, Joshua Finkelstein, Irene Jarchum, Catherine Goodman and Bronwen Dekker
Production editor: Jennifer Gustavson
Copy editors: Rebecca Barr and Ashley Stevenson
Editorial assistant: Tanyeli Taze
Web production editors: Jayce Childs and James McSweeney
Web design: Sam Rios and Luke Stavenhagen
Manufacturing production: Susan Gray
Marketing: Hannah Phipps
Head of Publishing Services: Ruth Wilson
Editor-in-chief, Nature Publications: Philip Campbell
Sponsorship: David Bagshaw and Yvette Smith
Copyright © 2015 Nature America, Inc.

MILESTONES ADVISORS
*Ruedi Aebersold, ETH Zürich, Switzerland
*Peter Armentrout, University of Utah, USA
Daniel Armstrong, University of Texas, USA
*H. Alex Brown, , USA
*, Vanderbilt University, USA
Steven Carr, Broad Institute of MIT and Harvard, USA
*Brian Chait, The Rockefeller University, USA
David Clemmer, Indiana University, USA
*Anne Dell, Imperial College London, UK
*Rob Ellam, University of Glasgow, UK
Michael H. Gelb, University of Washington, USA
*Gary Glish, University of North Carolina at Chapel Hill, USA
*Michael A. Grayson, American Society for Mass Spectrometry, USA
*Jürgen H. Gross, University of Heidelberg,
*Steven Gygi, Harvard Medical School, USA
Donald F. Hunt, University of Virginia, USA
*Akihiko Kameyama, National Institute of Advanced Industrial Science and Technology, Japan
Neil Kelleher, Northwestern University, USA
*Bernhard Küster, Technische Universität München, Germany
*Joseph A. Loo, University of California, Los Angeles, USA
*Matthias Mann, Max Planck Institute of , Germany
Raymond March, Trent University, Canada
*Fred W. McLafferty, , USA
Howard R. Morris, Imperial College London, UK
David C. Muddiman, North Carolina State University, USA
Francis Pullen, University of Greenwich, UK
*Joshua Rabinowitz, Princeton University, USA
*Paula J. Reimer, Queen’s University Belfast, UK
*Carol Robinson, University of Oxford, UK
David H. Russell, Texas A&M University, USA
*Uwe Sauer, ETH Zürich, Switzerland
*Antonio Simonetti, University of Notre Dame, USA
*Gary Siuzdak, The Scripps Research Institute, USA
Luke Skinner, University of Cambridge, UK
Richard Smith, Pacific Northwest National Laboratory, USA
*Giulio Superti-Furga, Research Center for Molecular Medicine of the Austrian Academy of Sciences,
*Jonathan Sweedler, University of Illinois at Urbana-Champaign, USA
John Todd, University of Kent, UK
*John Yates III, The Scripps Research Institute, USA
*Richard Yost, University of Florida, USA
*Joseph Zaia, Boston University, USA
*, ETH Zürich, Switzerland

* indicates advisers who assisted with multiple stages of the project

MILESTONES TIMELINE

1910 The beginnings (Milestone 1)

1929 Development of ionization methods (Milestone 2)

1939 Environmental analysis (Milestone 3)

1946 Time of flight (Milestone 4)

1949 Trapping mass analyzers (Milestone 5)

1953 Quadrupole and triple-stage quadrupole mass filters (Milestone 6)

1956 Small-molecule analysis (Milestone 7)

1959 Separations (Milestone 8)

1959 Peptide sequencing (Milestone 9)

1961 Inductively coupled plasma mass spectrometry (Milestone 10)

1962 Imaging mass spectrometry (Milestone 11)

1963 Carbohydrate analysis (Milestone 12)

1966 Tandem mass spectrometry (Milestone 13)

1966 Metabolomics (Milestone 14)

1968 Electrospray ionization (Milestone 15)

1975 Medical applications (Milestone 16)

1978 Selected reaction monitoring (Milestone 17)

1985 Matrix-assisted laser desorption/ionization (Milestone 18)

1991 Structural biology applications (Milestone 19)

1993 Proteomics (Milestone 20)

1995 Post-translational modification analysis (Milestone 21)

1999 Interactome analysis (Milestone 22)


MILESTONE 1

Discovering the power of mass-to-charge

The twenty-first century is an exciting time for mass spectrometrists. But things were quite exhilarating even at the technique’s birth, more than a hundred years ago. The discovery of the electron and of the isotopes of neon, and year-to-year leaps in the degree of accuracy and resolution of the data, were just some of the reasons scientists were motivated to push ahead with the nascent technology.

It was perhaps Wilhelm Wien’s discovery showing that rays of positively charged particles could be deflected with very powerful magnetic fields that gave mass spectrometry its start. Wien measured the deflection of these positive particles and was able to calculate their mass.

Following up on these discoveries, Joseph John (J.J.) Thomson showed that positive rays traveling along an axis x and striking a plane at right angles could be deflected by parallel electric and magnetic forces on axis y. This caused the rays to be deflected and strike the plane at a different place depending on their charge-to-mass ratio. The rays hit the plane on a parabolic arc, so to capture this information, Thomson allowed the particles to fall on a photographic plate. He then measured the parabolas on the photograph and calculated the charge-to-mass ratio of the particles using mathematical equations.

Thomson, working at the University of Cambridge, made another crucial observation: he found that in the purest preparations of neon gas there were two parabolas, one corresponding to an atomic weight of 20 and another one to 22. Although he could not explain it at the time, this discovery would later be recognized as the first indication that stable elements can have isotopes.

Though regarded as a major advance, Thomson’s technique had limitations, as he himself recognized. In particular, some of the rays hit the walls of the tube as they traveled, filling the tubes with ‘metallic dust’ and requiring frequent cleanings, and the intensities of the parabolas on the photographic plate were sometimes insufficient for accurate measurements.

Francis Aston, also at the University of Cambridge, shortly thereafter took on these challenges with the aim of increasing the intensity of the signal. He did this by designing an instrument that would focus the rays in the form of a line hitting the plate at a specific point on a focal plane. Aston’s device incorporated two parallel slits and used two electromagnetically charged plates to focus the rays, mimicking the focusing effect of an optical lens. This first mass spectrograph had not only greater measurement intensity and accuracy but also better resolution than Thomson’s instrument. Aston used his spectrograph to resolve the puzzle of neon, demonstrating for the first time that stable elements can be isotopic.

Another important technological development came from Arthur Dempster at the University of Chicago. Dempster’s spectrograph, referred to as a magnetic sector analyzer, deflected the rays by 180° by applying a strong magnetic field. This focused the rays of a specific mass-to-charge ratio through a narrow slit. These were then detected in real time using an electrometer, doing away with the cumbersome photographic plates. Dempster also introduced electron bombardment as a method to generate positive ions. Both of these discoveries made ripples in the field, and Dempster’s ‘mass spectrometer’, as it came to be called, became the basis of later commercially developed instruments.

Dempster and Aston also carried out critical work toward determining the isotopic abundance and mass of the elements. Among these was uranium. Others had shown that splitting the uranium atom released a large amount of energy, and on the brink of the Second World War, the idea that the fission of high-purity uranium could be used as a powerful weapon was born. In 1940, Alfred Nier (see Milestone 2) provided the missing piece: he was able to make pure preparations of ²³⁵U and ²³⁸U, which were then used to identify ²³⁵U as responsible for slow neutron fission. Efforts to isolate ²³⁵U were named the ‘Manhattan Project’ and occupied leading physicists during the war.

Investments toward a nuclear bomb led to the development of techniques that advanced the field of mass spectrometry in the postwar years. As we now know, and as is described in the following milestones, there would be many more critical developments to follow.

Irene Jarchum, Associate Editor, Nature Biotechnology

[Figure: J.J. Thomson captured the parabolas of deflected rays on a photographic plate. Reproduced with permission from Proc. Roy. Soc. A 89, 1–20 (1913), J.J. Thomson, ‘Bakerian Lecture: rays of positive electricity’. © Proceedings of the Royal Society]

ORIGINAL RESEARCH PAPERS Thomson, J.J. Rays of positive electricity. Philos. Mag. Ser. 6, 20, 752–767 (1910) | Dempster, A.J. A new method of positive ray analysis. Phys. Rev. 11, 316–324 (1918) | Aston, F.W. A positive ray spectrograph. Philos. Mag. 38, 707–715 (1919)

FURTHER READING Wien, W. Untersuchungen über die elektrische Entladung in verdünnten Gasen. Ann. Phys. 313, 244–266 (1902) | Lawrence, E.O. Method and apparatus for the acceleration of ions. US patent 1,948,384 (1934) | Washburn, H.W., Wiley, H.F. & Rock, S.M. The mass spectrometer as an analytical tool. Ind. Eng. Chem. Anal. Ed. 15, 541–547 (1943)
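
The article says only that Thomson obtained the charge-to-mass ratio from his parabolas "using mathematical equations". As a rough illustration of why parallel electric and magnetic fields trace a parabola, here is a textbook-style sketch that is not taken from Thomson's paper. It assumes small deflections, an ion of mass m, charge q and speed v, field strengths E and B, plate coordinates y (electric deflection) and z (magnetic deflection), and two apparatus constants k_E and k_B. All of these symbols are introduced here for illustration.

```latex
% Illustrative sketch only; k_E and k_B are hypothetical geometry constants.
\begin{align}
  y &= k_E\,\frac{qE}{m v^{2}}, &
  z &= k_B\,\frac{qB}{m v}
  && \text{(electric and magnetic deflections)} \\
  v &= k_B\,\frac{qB}{m z}
  &&\Longrightarrow&
  y &= \frac{k_E}{k_B^{2}}\,\frac{E}{B^{2}}\,\frac{m}{q}\,z^{2}
  && \text{(eliminate the unknown speed } v\text{)}
\end{align}
```

Under these assumptions, ions with the same charge-to-mass ratio but different speeds fall on one parabola, and each ratio has its own parabola. For singly charged neon, the coefficients for masses 22 and 20 would differ by a factor of 22/20 = 1.1, which is consistent with the two neon parabolas described above.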


MILESTONE 2

Development of ionization methods

In 1929, following the pioneering work of Arthur Dempster and Francis Aston (Milestone 1), Walker Bleakney described a new method of positive ray analysis that involved heating a tungsten wire filament to generate a stream of electrons and then using a uniform magnetic field to focus those electrons into a narrow beam. Bleakney used this technique to measure the first four ionization energies of mercury. Five years later, John Tate and Philip Smith showed that this ionization method, called ‘electron impact’ (EI) at the time but now known as ‘electron ionization’, could be used to measure the ionization energies of several other elements. It was even possible to generate highly ionized species, such as Cs⁷⁺.

Nearly 20 years later, Alfred Nier, whom many would call the ‘father of modern mass spectrometry’, published a detailed description of an EI mass spectrometer that was able to measure the relative isotopic abundances of carbon, nitrogen and oxygen in a sample. Nier had reported a simpler version of the instrument in 1940, but his work on the Manhattan Project (Milestone 1) prevented him from publishing his improvements until after the Second World War had ended.

EI became the ‘gold standard’ ionization method for many years. However, the conditions required for EI proved too harsh for many organic molecules; the molecular ion usually decomposed into smaller ions. In the mid-1950s, Mark Inghram and Robert Gomer published two papers that described a ‘softer’ ionization method, field ionization (FI), in which an analyte was ionized in close proximity to the tungsten tip of a field emission microscope. Application of a high electric field led to the gentle ionization of the analyte at the microscope’s tip. Whereas an EI spectrum of acetone, for example, contains 19 low-molecular-weight peaks, its FI spectrum contains only a single peak that corresponds to the molecular ion.

In 1969, Hans Beckey showed that adsorbing a sample onto a tungsten wire containing a dense array of ‘micro needles’ increased the sensitivity of the FI method and reduced the amount of analyte needed to acquire a high-quality mass spectrum. Beckey used this field desorption (FD) method to obtain mass spectra of monosaccharides, which represented a challenge for other ionization methods because of their thermal lability. The EI spectrum of D-glucose exclusively contains low-molecular-weight fragment ions, and its FI spectrum contains a peak from the protonated monosaccharide, as well as others resulting from the successive dehydration of this species. The FD spectrum of this sugar, in contrast, consists predominantly of a large peak that corresponds to the protonated sugar and a smaller peak that corresponds to the molecular ion. Other researchers reported soon after that FD could be used to obtain spectra of other heat-sensitive organic molecules, including glycosides, nucleotides and short peptides.

The harsh conditions of EI inspired scientists to develop new ionization methods that would not result in the decomposition of the molecular ion. In 1966, Burnaby Munson and Frank Field found that when a small amount of an analyte is mixed with methane gas, electrons that pass through the mixture almost exclusively ionize the methane. The resulting methane ions can then react with the analyte, and these ‘chemical ionization’ (CI) events will produce ions of the analyte. Munson and Field obtained CI and EI spectra of several organic molecules and found that “[g]enerally, the ions in the chemical ionization mass spectrum are predominantly in the high molecular weight end of the spectrum, whereas the converse is true for the electron impact mass spectrum.”

In the mid-1970s, E.C. Horning and colleagues and D.I. Carroll and colleagues described modifications to Munson’s and Field’s device: they replaced the ionization source with a ⁶³Ni foil or a corona discharge. Because the analyte was ionized in a stream of flowing gas at ambient pressure, it could be coupled to a high-performance liquid chromatography (HPLC) instrument (Milestone 8). Modern versions of these instruments are compact and inexpensive.

During this time period, other scientists were trying to obtain mass spectra of solid samples. In 1949, Richard Herzog and Franz Viehböck showed that an ion source that produces a beam of positive ions can be used to bombard the surface of a solid; the impact of these positively charged particles with the sample results in the ionization and ejection of some of the atoms, as ‘secondary ions’, from the surface. Less than a decade later, Richard Honig reported a mass spectrometer that uses a similar approach, termed ‘sputtering’, to identify a broad range of neutral, positively charged and negatively charged species on the surfaces of silver, germanium and germanium-silicon alloy samples. Other scientists showed that this method, eventually named secondary ion mass spectrometry (SIMS), could be used to determine the chemical composition of rocks from the Moon and to produce images of cells and other biological samples (Milestone 11).

Although mass spectra of small organic molecules were, at this point, relatively easy to obtain using EI, CI, FI and FD, larger biomolecules were still difficult to characterize using mass spectrometry. In 1976, Ronald Macfarlane and David Torgerson showed that nuclear fission of the radioactive element ²⁵²Cf, which generates high-energy particles, could be used to ionize a biological molecule that had been deposited on a nickel foil. Using this technique, known as plasma desorption mass spectrometry (PDMS), they obtained mass spectra of several molecules, including the thermally labile neurotoxin tetrodotoxin, vitamin B₁₂ and the antibiotic gramicidin A. Other researchers soon used PDMS to obtain mass spectra of oligonucleotides and small proteins.

In 1981, Michael Barber and his colleagues published two papers that described a new ionization method called fast atom bombardment (FAB), in which a beam of neutral argon atoms is aimed at an analyte on a copper sample stage containing a low-volatility organic matrix, such as glycerol. The analyte is ionized via the same sputtering mechanism as SIMS. Barber and colleagues used FAB to obtain high-quality spectra of oligosaccharides, nucleotides, organometallic complexes and small proteins, and the technique became the first ionization method able to sequence longer peptides (Milestone 9). The popularity of FAB waned quickly, however, as the arrival of electrospray ionization (ESI; Milestone 15) and matrix-assisted laser desorption/ionization (MALDI; Milestone 18) a few years later meant that FAB would only be needed when other ionization methods failed to produce high-quality spectra.

With such an assortment of ionization methods available, one might expect modern scientists to cease their efforts to develop new ones. But that has not been the case. For example, two recently reported ionization methods, desorption electrospray ionization (DESI) and direct analysis in real time (DART), have generated quite a bit of excitement because they can directly ionize an analyte from a solid surface at ambient pressure without any sample preparation. DESI was able to detect the presence of an antihistamine on a human fingertip 40 minutes after oral administration, and DART could detect trace amounts of nitroglycerin on a man’s necktie eight hours after he walked past a construction site where demolition work was taking place. Mass spectrometry ionization methods have made a tremendous journey, from EI to ESI and MALDI in fewer than 60 years, affecting nearly every scientific discipline along the way.

Joshua Finkelstein, Senior Editor, Nature

[Figure: The EI (panel a) and CI (panel b) spectra of methionine, plotted as relative intensity (%) against m/z. Adapted from Gross, J.H., Mass Spectrometry: A Textbook (Springer, Berlin, Germany, 2011), with kind permission from Springer Science and Business Media.]

ORIGINAL RESEARCH PAPERS Bleakney, W. A new method of positive ray analysis and its application to the measurement of ionization potentials in mercury vapor. Phys. Rev. 34, 157–160 (1929) | Tate, J.T. & Smith, P.T. Ionization potentials and probabilities for the formation of multiply charged ions in the alkali vapors and in krypton and xenon. Phys. Rev. 46, 773–776 (1934) | Nier, A.O. Mass spectrometer for isotope and gas analysis. Rev. Sci. Instrum. 18, 398–411 (1947) | Herzog, R.F.K. & Viehböck, F.P. Ion source for mass spectrography. Phys. Rev. 76, 855–856 (1949) | Inghram, M.G. & Gomer, R. Mass spectrometric analysis of ions from the field microscope. J. Chem. Phys. 22, 1279–1280 (1954) | Gomer, R. & Inghram, M.G. Applications of field ionization to mass spectrometry. J. Am. Chem. Soc. 77, 500 (1955) | Honig, R.E. Sputtering of surfaces by positive ion beams of low energy. J. Appl. Phys. 29, 549–555 (1958) | Munson, M.S.B. & Field, F.H. Chemical ionization mass spectrometry. I. General introduction. J. Am. Chem. Soc. 88, 2621–2630 (1966) | Beckey, H.D. Field desorption mass spectrometry: a technique for the study of thermally unstable substances of low volatility. Int. J. Mass Spectrom. Ion Phys. 2, 500–503 (1969) | Horning, E.C., Horning, M.G., Carroll, D.I., Dzidic, I. & Stillwell, R.N. New picogram detection system based on a mass spectrometer with an external ionization source at atmospheric pressure. Anal. Chem. 45, 936–943 (1973) | Carroll, D.I., Dzidic, I., Stillwell, R.N., Haegele, K.D. & Horning, E.C. Atmospheric pressure ionization mass spectrometry. Corona discharge ion source for use in a liquid chromatograph-mass spectrometer-computer analytical system. Anal. Chem. 47, 2369–2373 (1975) | Macfarlane, R.D. & Torgerson, D.F. Californium-252 plasma desorption mass spectroscopy. Science 191, 920–925 (1976) | Barber, M., Bordoli, R.S., Sedgwick, R.D. & Tyler, A.N. Fast atom bombardment of solids (F.A.B.): a new ion source for mass spectrometry. J. Chem. Soc. Chem. Commun. 7, 325–327 (1981) | Barber, M., Bordoli, R.S., Sedgwick, R.D. & Tyler, A.N. Fast atom bombardment of solids as an ion source in mass spectrometry. Nature 293, 270–275 (1981)

FURTHER READING Takats, Z., Wiseman, J.M., Gologan, B. & Cooks, R.G. Mass spectrometry sampling under ambient conditions with desorption electrospray ionization. Science 306, 471–473 (2004) | Cody, R.B., Laramée, J.A. & Durst, H.D. Versatile new ion source for the analysis of materials in open air under ambient conditions. Anal. Chem. 77, 2297–2302 (2005) | Gross, J.H. Mass Spectrometry: A Textbook (Springer, Berlin, Germany, 2011)

MILESTONE 3

Isotopes and ancient environments

In the early twentieth century, the relatively new technique of mass spectrometry provided an opportunity to assess whether there were systematic variations in the ratio of the isotopes of matter. In 1939, Alfred Nier and Earl Gulbransen showed that this was the case for carbon. They found that the isotopic composition of carbon varied depending on how and when the various types of rock and organic matter formed. For example, they found subtle but consistent differences between carbon minerals formed by volcanic processes and those derived from seawater. The isotope composition even varied between the flesh and the shell of clams.

It later became clear that carbon isotope ratios are even more variable than initially found in this early work: they also reflect changes in carbon cycling in the terrestrial biosphere and throughout the oceans. These discoveries opened up the possibility of tracing the evolution of life and the carbon cycle through time.

Stable oxygen isotopes, particularly ¹⁸O and ¹⁶O, also proved to be important tracers. The work of Willi Dansgaard demonstrated that the oxygen isotope composition of precipitation could be used to trace both the temperature at which the water droplets formed and the history of the air mass the water originated from as it moved away from the initial vapor source. Oxygen isotopes became the primary means to interpret ice cores collected from the Greenland and Antarctic ice sheets, and are also key to interpreting biogenic carbonates in marine sediment cores.

Mass spectrometry–based analyses of ice and sediment cores soon showed that dramatic and repeated periods of climate upheaval had occurred for the past few million years. But there were impediments to fully realizing the potential of these environmental archives, including the large sample size required for isotopic analyses, particularly with radioisotopes. The decay counting techniques of the early 1970s required large amounts of often irreplaceable samples to be destroyed.

Richard Muller’s report of the use of a cyclotron to measure tritium in water samples in 1977 was therefore a welcome development. In conventional mass spectrometry, stable isotopes overwhelm any radioisotope signal, but at the high energy reached in the cyclotron, it was possible to distinguish the radioisotopes from other isotopes with similar mass-to-charge ratios. Muller estimated that similar techniques should allow ¹⁴C and ¹⁰Be to be measured in far smaller samples than had been possible at the time. Indeed, the cyclotron proved useful in measuring ¹⁰Be, a powerful method for determining the age of glacial landforms such as moraines. In a slight variation on the cyclotron technique, Charles Bennett and colleagues showed that a linear accelerator coupled with a negative ion source could detect very small amounts of radiocarbon.

The issue of sample size for stable isotope analysis was solved with a modification to the gas inlet system of the mass spectrometer developed by Nicholas Shackleton. In his setup, the molecular leak system by which carbon dioxide produced from carbonate samples entered the mass spectrometer was applicable to small sample sizes, but this system also caused the sample to undergo further fractionation. Automation of the inlet valves enabled the amount of time the sample and a reference standard flowed through the leak to be equalized, thus allowing for correction of the fractionation. With this system, samples as small as 100 µl could be analyzed.

Collectively, these developments meant that for fossil material such as tooth or bone, only small parts of the fossil needed to be sacrificed. This also allowed measurements of fossil carbonate within marine sediments to be made at a much higher resolution, resolving the patterns of glacial-interglacial temperature change as well as the timing of more recent fluctuations. Such high-resolution measurements ultimately revealed that swings from glacial to interglacial states over the past half million years were paced by changes in the Earth’s orbit around the sun.

Alicia Newton, Senior Editor, Nature Geoscience

[Figure: Glacial landforms can be dated by mass spectrometry, using radioactive isotopes such as ¹⁴C and ¹⁰Be. Image credit: PHOTOALTO]

ORIGINAL RESEARCH PAPERS Nier, A.O. & Gulbransen, E.A. Variations in the relative abundance of the carbon isotopes. J. Am. Chem. Soc. 61, 697–698 (1939) | Muller, R.A. Radioisotope dating with a cyclotron. Science 196, 489–494 (1977)

FURTHER READING Dansgaard, W. The abundance of O18 in atmospheric water and water vapour. Tellus 5, 461–469 (1953) | Shackleton, N.J. The high precision isotopic analysis of oxygen and carbon in carbon dioxide. J. Scient. Instrum. 42, 689–692 (1965) | Hays, J.D., Imbrie, J. & Shackleton, N.J. Variations in the Earth’s orbit: pacemaker of the ice ages. Science 194, 1121–1132 (1976) | Nelson, D.E., Korteling, R.G. & Stott, W.R. C-14: Direct detection at natural concentrations. Science 198, 507–508 (1977) | Bennett, C.L. et al. Radiocarbon dating using electrostatic accelerators: negative ions provide key. Science 198, 508–510 (1977) | Raisbeck, G.M., Yiou, F., Fruneau, M. & Loiseaux, J.M. Be-10 mass-spectrometry with a cyclotron. Science 202, 215–217 (1978)

NATURE MILESTONES | MASS SPECTROMETRY OCTOBER 2015 | 7 MILESTONES

MILESTONE 4

When a velocitron meets a reflectron

A drawback of sector mass spectrometers (Milestone 1) was the narrow range of mass-to-charge ratio (m/z) that could be analyzed at any given time. These analyzers could be thought of as 'mass filters', so that acquiring a complete m/z spectrum required 'tuning' the filter across all the relevant ranges of interest. This took time, and it was ill-suited to capturing the spectrum of a sample that either was short-lived or had a composition that changed quickly. Moreover, there were upper limits to the detectable mass that were inherent to the spectrometer.

In 1946, W.E. Stephens from the University of Pennsylvania proposed a new technology to circumvent these limitations. He called it "A pulsed mass spectrometer with time dispersion." The keywords 'pulsed' and 'time dispersion' give away the two main features of this mass analyzer: the use of microsecond pulses of ions, and the fact that ions with different m/z reach the detector at different times, allowing ion species to be distinguished by their 'time of flight' (TOF) in the analyzer. The ion pulses are accelerated by an electric field to the same energy and travel down a vacuum tube. Because ions with the same energy but different m/z travel at different velocities, they will hit the detector at slightly different times, with lighter and/or more highly charged ions arriving first. In this way, the entire m/z spectrum of the sample under study can be acquired in just hundreds of microseconds, or as long as it takes the heaviest ions to reach the detector. Stephens pinpointed the advantages of his method: "The response time should be limited only by the repetition rate (milliseconds). The indication would be continuous and visual and easily photographed. Magnets and stabilization equipment would be eliminated. Resolution would not be limited by smallness of slits or alignment." The seed of time-of-flight mass spectrometry (TOF-MS) was thus planted.

It took two years before a proof-of-principle TOF spectrometer was developed by A.E. Cameron and D.F. Eggers, then working at Clinton Engineer Works–Tennessee Eastman Corporation. They gave it a more concise name, the 'ion velocitron'. As Stephens had proposed, the velocitron did not need a magnetic field and used ion pulses of 5 microseconds. Cameron and Eggers showed that in this first TOF spectrometer, mercury ions with different charge could be resolved, but their isotopes could not. The poor mass resolution was the result of initial distributions in both energy and spatial location of the ions, which led to dispersions in the measured times not related to m/z. However, in contrast to previous technologies, the TOF spectrometer did allow mass spectra to be readily visualized on an oscilloscope and updated every millisecond or so.

The mass resolution of TOF was improved by W.C. Wiley and I.H. McLaren in 1955. They devised an improved ion source with two accelerating regions that could correct for the initial ion spatial distribution that is generated by the finite width of the ionization electron beam. A mass resolving power of up to 300 could be achieved in this type of spectrometer, which was significant enough to open the path to commercialization.

In 1973, B.A. Mamyrin and colleagues solved the issue of the initial ion energy distribution. They proposed the use of an electrostatic reflector to detour ions with the same mass but higher velocities, in what they termed a 'reflectron' TOF-MS. Such ions penetrate the electric field to a greater depth, increasing the length of their path toward the detector and thus compensating for the initial differences in energy. The introduction of the reflectron enabled a further increase in mass resolving power, by an order of magnitude, as compared to TOF-MS in which ions propagate linearly.

The velocitron developed by Cameron and Eggers could resolve mercury ions with one, two or three positive charges, but not their isotopes. Reprinted with permission from Cameron, A.E. & Eggers, D.F. Rev. Sci. Instrum. 19, 605–607 (1948). © 1948, AIP Publishing LLC
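The time dispersion at the heart of Stephens's proposal follows from elementary mechanics. The sketch below is a minimal illustration, not a reconstruction of any historical instrument: it assumes that ions start at rest, gain kinetic energy zeV in a single acceleration step, and then drift through a field-free tube. The 500 V accelerating potential, the 2 m tube length and the function name `flight_time` are hypothetical choices made for the example; only the 5 microsecond pulse width comes from the text.

```python
import math

E_CHARGE = 1.602176634e-19   # elementary charge in coulombs
DALTON = 1.66053906660e-27   # unified atomic mass unit in kilograms


def flight_time(mass_da, charge, accel_voltage, tube_length):
    """Return the drift time in seconds for an ion that starts at rest.

    The ion gains kinetic energy z*e*V in the accelerating field and then
    coasts through a field-free tube, so t = L * sqrt(m / (2*z*e*V)).
    """
    kinetic_energy = charge * E_CHARGE * accel_voltage
    velocity = math.sqrt(2.0 * kinetic_energy / (mass_da * DALTON))
    return tube_length / velocity


# Hypothetical instrument parameters, chosen only for illustration.
VOLTAGE = 500.0      # accelerating potential in volts
LENGTH = 2.0         # drift tube length in metres
PULSE_WIDTH = 5e-6   # ion pulse width in seconds, as quoted for the velocitron

# Arrival times for three charge states of mercury (nominal mass 202 Da).
for z in (1, 2, 3):
    t = flight_time(202.0, z, VOLTAGE, LENGTH)
    print(f"Hg z={z}: {t * 1e6:.2f} microseconds")

# Arrival-time gap between two isotopes that carry the same charge.
isotope_gap = (flight_time(202.0, 1, VOLTAGE, LENGTH)
               - flight_time(200.0, 1, VOLTAGE, LENGTH))
print(f"202 Da vs 200 Da at z=1: {isotope_gap * 1e9:.0f} ns apart")
```

With these assumed parameters the charge states arrive more than one pulse width apart, whereas the two isotopes arrive well within a single pulse width, which is consistent with what Cameron and Eggers observed.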

MILESTONE 5

Spinning ion trajectories

In 1932, Ernest Lawrence invented the cyclotron, a particle accelerator using a static magnetic field in which charged particles follow an outward spiral, accelerated by a rapidly varying radiofrequency (RF) field; for this work, he was awarded the Nobel Prize in Physics in 1939. At around that time, John Hipple was working for Westinghouse Electric Co. on the design of 90° magnetic sector mass spectrometers (see Milestone 1). A few years later, after joining the US Bureau of Standards, Hipple combined his knowledge of magnetic sector mass spectrometers with the principles of cyclotron acceleration in a new device he called the omegatron.

John Hipple in 1943. Image reproduced from Encyclopedia of Mass Spectrometry: Vol. 9: Historical Perspectives, Part A: The Development of Mass Spectrometry (Keith A. Nier, Alfred L. Yergey & P. Jane Gale), Newnes, 2015, p. 112, with permission from Elsevier.

In the first prototype, Hipple and two colleagues trapped hydrogen ions using a static electric potential and a magnetic field. Tuning the frequency of an additional RF field to resonance with the cyclotron frequency ensured that only ions with a desired charge-to-mass ratio would be accelerated. The ions would then be pushed along the exact outward-spiraling trajectory necessary for them to hit the detector. Today, this technique is known as ion cyclotron resonance (ICR) mass analysis and powers the highest-performance mass spectrometers.

By 1951, Hipple's team was already envisioning their omegatron as a powerful mass analyzer and discussing ways to improve its resolution using higher magnetic fields, better trapping and enhanced detection. Although trapping was improved in the subsequent development of magnetic ion traps, substantial improvements in detection had to wait until the 1974 work of Melvin Comisarow and Alan Marshall. Instead of detecting the charged particles directly, Comisarow and Marshall measured the image current generated by the charges in the detector plates. Specifically, turning off the RF excitation causes bunches of ions to rotate at the cyclotron frequency. As the ions repeatedly pass the detector plates, they produce a free induction current that can be detected and subsequently converted to a frequency spectrum using the Fourier transform, hence the name 'Fourier transform ICR' (FT-ICR) given to the new technique.

With trapping and detection taken care of, the most obvious route to even higher resolution and mass accuracy was to increase the magnetic field. Thanks to superconducting magnets, these fields can go ever higher: the National High Magnetic Field Laboratory's FT-ICR mass spectrometers hold the current world record of 21 tesla, an impressive but very expensive achievement.

But were magnetic fields actually needed? Back in 1923, Kenneth Kingdon had described the trapping of charged particles in a simple electrostatic device, the Kingdon trap, consisting of a cylinder with a wire along its axis, with a voltage difference between the two. As Kingdon saw it, an ion would be imprisoned in the tube, forced to orbit to and fro around the axis until it lost its transverse velocity and collapsed into the wire.

In 2000, Alexander Makarov revisited this concept to create an equally simple and elegant design known as the orbitrap. Makarov replaced the wire with a spindle-shaped electrode and the cylinder with a barrel-like electrode. The ions would follow intricate spiraling trajectories around the spindle, much


Introduced commercially in the early 1960s, TOF-MS has seen alternating fortunes, experiencing a renaissance in popularity in the early 1990s due to new methods to produce pulsed ion sources. In particular, the gentle ionization of soft biological macromolecules, whose importance was underlined by the awarding of part of the 2002 Nobel Prize in Chemistry to John Fenn and Koichi Tanaka, for "their development of soft desorption ionisation methods for mass spectrometric analyses of biological macromolecules," spread the application of TOF-MS in the biological sciences. Nowadays, TOF-MS is one of the main mass analyzer technologies available, alongside quadrupoles (Milestone 6), ion traps (Milestone 5) and Fourier transform ion cyclotron resonance (Milestone 5), and is distinguished by a relatively high mass resolving power of up to 60,000 at fast scan speeds.
Elisa De Ranieri, Senior Editor, Nature Energy

ORIGINAL RESEARCH PAPERS Stephens, W.E. A pulsed mass spectrometer with time dispersion. Phys. Rev. 69, 674–792 (1946) | Cameron, A.E. & Eggers, D.F. An ion "velocitron". Rev. Sci. Instrum. 19, 605–607 (1948) | Wiley, W.C. & McLaren, I.H. Time-of-flight mass spectrometer with improved resolution. Rev. Sci. Instrum. 26, 1150–1157 (1955) | Mamyrin, B.A., Karataev, V.I., Shmikk, D.V. & Zagulin, V.A. The mass-reflectron, a new nonmagnetic time-of-flight mass spectrometer with high resolution. Sov. Phys.-JETP 37, 45–48 (1973) [Russian original: Zh. Eksp. Teor. Fiz. 64, 82, 1973]
FURTHER READING The 2002 Nobel Prize in Chemistry: advanced information (Nobel Media AB, 2014)

like a thread spun from a yarn. At the same time, they would swing back and forth along the axis of the spindle, trapped in an electrostatic harmonic potential. The charge-to-mass ratio of the ions can be derived from these harmonic axial oscillations. Combined with image current detection, as in FT-ICR, the orbitrap provides a high-accuracy, high-resolution, simple and compact mass analyzer that is now routinely used in proteomics research (Milestone 20).

Whether achieved by using magnetic fields or electrostatic potentials, the idea of spinning ions on spiraling trajectories that betray their charge-to-mass ratio by the frequencies measured is a surprisingly simple yet very powerful concept.
Iulia Georgescu, Senior Editor, Nature Physics

ORIGINAL RESEARCH PAPERS Hipple, J.A., Sommer, H. & Thomas, H.A. A precise method of determining the Faraday by magnetic resonance. Phys. Rev. 76, 1877–1878 (1949) | Sommer, H., Thomas, H.A. & Hipple, J.A. The measurement of e/M by cyclotron resonance. Phys. Rev. 82, 697–702 (1951) | Comisarow, M.B. & Marshall, A.G. Fourier transform ion cyclotron resonance spectroscopy. Chem. Phys. Lett. 25, 282–283 (1974) | Makarov, A. Electrostatic axially harmonic orbital trapping: a high-performance technique of mass analysis. Anal. Chem. 72, 1156–1162 (2000)
FURTHER READING Kingdon, K.H. A method for the neutralization of electron space charge by positive ionization at very low gas pressures. Phys. Rev. 21, 408–418 (1923)

MILESTONE 6

Fly out of the traps

Perhaps fittingly for a machine that works by sending beams of charged particles on undulating paths, the journey of the quadrupole mass filter from conception to laboratory bench was not particularly straightforward. Wolfgang Paul, at the University of Bonn, first published the concept behind the mass filter (or mass spectrometer) in 1953. Today, he is regarded as the technology's father. Less known, however, is that a researcher at the University of California (later Lawrence) Radiation Laboratory, Richard Post, came up with a similar idea around the same time. But he never published his work: his ideas made it only into his personal notebooks and a Lawrence Radiation Laboratory report.

Paul is known also for his development of quadrupole ion traps, for which he shared the Nobel Prize in Physics in 1989. Quadrupole ion traps and mass filters both use electric quadrupole fields to manipulate ionized atoms or charged particles. In ion traps, the ions are confined to a small region in which they can be laser cooled and used for spectroscopy, ultracold chemistry or quantum information processing, whereas quadrupole mass filters guide the ions to a detector. The filters consist of four metal rods that have both direct-current and oscillating radiofrequency voltages applied in a constant ratio between the opposing pairs. The exact nature of the trajectories of the ions depends on their mass-to-charge ratios. This ratio can be determined relatively easily, as different electric-field strengths and oscillating frequencies are required to transmit different species of ions through the electrode structure and onto the detector.

By the early 1960s, several companies were producing quadrupole mass filters, but widespread adoption of the technology was surprisingly slow. Compared with early magnetic sector analyzers, the mass filters were smaller, cheaper, tolerant of more extreme conditions and generally easier to automate, so why such resistance to them? Magnetic-based devices were trusted, and researchers knew both how they worked and how to use them. The quadrupole mass filters therefore needed to do something spectacular, beyond what was possible with conventional techniques. One person who saw such an opportunity was Robert Finnigan.

In 1967, Finnigan co-founded Finnigan Instruments, which aimed to combine these mass filters with gas chromatography (Milestone 8) to achieve a single, computerized system that could separate and identify the constituents of a mixture. Importantly, quadrupole mass filters were able to generate mass spectra at unparalleled sampling speeds. This gave biochemists and countless others a simple system that could analyze samples within hours.

In the late 1970s, as quadrupole technologies became more widely adopted, Jim Morrison discovered an innovative use for them. With a line of three quadrupoles, a specific ion can be isolated in the first filter, broken into fragments in the second and then analyzed and detected in the third, providing a method for probing chemical structure or selecting and monitoring specific chemical reactions (Milestone 17). Morrison used light to break up ions, but this was not a practical approach for analytical studies.

Along with his student Richard Yost, Christie Enke showed that energetic gas-phase collisions in the second quadrupole could be used to break apart the ions, without the need for light. This collision-induced fragmentation was achieved by adding an inert gas, such as argon, to increase the pressure. Similar to what happens in particle accelerators, energetic ions, when accelerated by electric fields applied between the first and second quadrupoles, collide with these gas molecules and separate into smaller pieces.

Enke, Yost and Morrison realized that this triple-stage system could be extended to the analysis of more complex organic ions (see Milestone 7), as well as to ions formed from proteins (see Milestone 20). And although the initial adoption of quadrupole mass filters was not as fast as one might expect for such a revolutionary technology, triple-stage quadrupole mass spectrometers rose quickly to popularity and are firmly established as invaluable tools for a range of disciplines in laboratories around the world.
Luke Fleet, Associate Editor, Nature Physics

ORIGINAL RESEARCH PAPERS Paul, W. & Steinwedel, H. Ein neues Massenspektrometer ohne Magnetfeld. Zeitschrift für Naturforschung A. 8, 448–450 (1953) | Paul, W. Apparatus for separating charged particles of different specific charges. US patent 2,939,952 A (1953) | Yost, R.A., Enke, C.G., McGilvery, D.C., Smith, D. & Morrison, J.D. High-efficiency collision-induced dissociation in an RF-only quadrupole. Int. J. Mass Spectrom. Ion Phys. 30, 127–136 (1979)
FURTHER READING Finnigan, R.E. Quadrupole mass spectrometers: from development to commercialisation. Anal. Chem. 66, 969A–975A (1994)

Diagram of Wolfgang Paul's patent for a quadrupole mass filter. Image adapted from US patent 2,939,952 A (1953).
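Both the omegatron and FT-ICR of Milestone 5 rest on the cyclotron relation f = qB/(2πm): in a fixed magnetic field, the orbital frequency depends only on the charge-to-mass ratio. The sketch below is a minimal illustration of that textbook relation and ignores the trapping-field and space-charge corrections that real instruments apply. The function names and the example m/z value are hypothetical; the 21 tesla field is the record quoted in the text, and 7 tesla is an arbitrary lower field for comparison.

```python
import math

E_CHARGE = 1.602176634e-19   # elementary charge in coulombs
DALTON = 1.66053906660e-27   # unified atomic mass unit in kilograms


def cyclotron_frequency(mz, b_field):
    """Unperturbed cyclotron frequency in Hz for an ion of the given m/z.

    mz is the mass in daltons divided by the number of elementary charges,
    b_field is the magnetic flux density in tesla.
    """
    return E_CHARGE * b_field / (2.0 * math.pi * mz * DALTON)


def mz_from_frequency(frequency, b_field):
    """Invert the relation: recover m/z from a measured frequency in Hz."""
    return E_CHARGE * b_field / (2.0 * math.pi * frequency * DALTON)


# Example ion (hypothetical m/z) at two field strengths.
for b in (7.0, 21.0):
    f = cyclotron_frequency(1000.0, b)
    print(f"B = {b:4.1f} T: f = {f / 1e3:.1f} kHz, "
          f"recovered m/z = {mz_from_frequency(f, b):.3f}")
```

Because the frequency scales linearly with the field, a stronger magnet spreads neighboring m/z values further apart in frequency, which is one way to see why higher fields help resolution.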


MILESTONE 7

Breaking down problems

At its simplest, mass spectrometry gives information about the atomic or molecular weight of an element or compound. Initial applications of mass spectrometry focused on confirming the weight of a known compound or analyzing elemental samples. The next step was its use in determining molecular structures.

In the 1950s, there was a pressing need for such techniques. NMR spectroscopy was in its infancy, with commercial machines only beginning to emerge. Elemental analysis could confirm the composition of a sample, but it did not help with bond connectivity. Crystallography only worked for molecules that form single crystals.

It was in this climate that scientists began to use mass spectra as a means to reveal all, or part, of a molecule's structure. Some methods for small-molecule structure determination were straightforward; the simplest was to compare the molecule's spectrum with known reference spectra. By this time, laboratories at Dow Chemical, for example, had encoded reference spectra for thousands of compounds onto IBM punched cards. These could be compared rapidly with the significant peaks of an unknown sample, thus allowing automated matches to be made in many cases.

Even when reference spectra were not available, scientists learned how to extract increasingly greater amounts of information from spectral peaks, through a combination of empirical evidence and chemical intuition. Fred McLafferty, John Beynon and others examined many spectra, which led to a number of general observations. For example, they determined that when carbon-carbon double bonds are present, including in aromatic rings, cleavage occurs typically at the β-position. Molecules with carbonyls fragment at the position α to the double bond. Saturated rings fragment adjacent to the ring.

Applied together, these rules could distinguish between closely related isomers. As an example, McLafferty examined styrene chlorohydrin, which has two possible isomers depending on the location of the hydroxyl and chlorine substituents. Employing the general rule that major bond breakage occurs β to a benzene ring, he could predict the main characteristics of the molecules' expected spectra: 2-chloro-2-phenylethanol should generate large peaks corresponding to PhCHCH₂OH⁺ (loss of chloro) and PhCHCl⁺ (loss of CH₂OH). 2-Chloro-1-phenylethanol, in contrast, should result in peaks for PhCHCH₂Cl⁺ and PhCHOH⁺, also from β-bond cleavage. McLafferty observed the latter scenario and was able to assign the structural isomer confidently.

The mass spectrum of styrene chlorohydrin, as reported by McLafferty. Reprinted with permission from McLafferty, F.W., Mass spectrometric analysis broad applicability to chemical research, Anal. Chem. 28, 306–316 (1956). © 1956, American Chemical Society

Mass spectra offer insight beyond structural assignments and have been employed to monitor gas-phase reactions of ions. Ionization techniques produce reactive gas-phase species within a mass spectrometer, and the products that arise from collisions can subsequently be monitored by the instrument's detector. F.H. Field, J.L. Franklin and F.W. Lampe were instrumental in developing this field, starting in the mid-1950s. They studied the secondary ions formed through gas-phase

MILESTONE 8

Amicable separations

By the 1950s, mass spectrometry was a well-established technology for the analysis of volatile compounds in the petroleum, pharmaceutical and chemical industries. However, the deconvolution of spectra comprising multiple analytes was proving problematic: there was a growing desire for a rapid, online separation method.

In fact, the chromatographic techniques necessary for such separation were themselves just coming to the market. Although gas chromatography (GC) was achieving previously unimaginable separation performances, the detection methods then available gave limited chemical insight. The answer lay in coupling the powerful separation ability of chromatography with the specificity and precision of mass spectrometry.

This solution was first explored in 1955 by Roland Gohlke and Fred McLafferty of Dow Chemical Company, who hooked up a homemade gas chromatograph to a time-of-flight (TOF) instrument. This TOF instrument had been developed only recently (Milestone 4) and generated spectra much faster than did magnetic sector instruments (see Milestone 1). Soon, the team could separate mixtures of organic species and identify them in real time.

Of course, the marriage of GC and mass spectrometry was always going to be harmonious; the gaseous exhaust of the GC was primed, ready for ionization. In contrast, pairing liquid chromatography (LC), used for the separation of non-volatile and thermally unstable compounds, with mass spectrometry proved more difficult.

Patrick Arpino brings the incompatibility of LC and MS to life. Reprinted from Arpino, P.J., On-line liquid chromatography/mass spectrometry? An odd couple!, Trends Analyt. Chem. 1, 154–158 (1982), with permission from Elsevier. © 1982, Elsevier

Initially, V.L. Tal'roze and G.V. Karpov tried direct liquid injection. By leaking a minute volume of LC effluent into the high-vacuum conditions of the ionization chamber, they could vaporize the sample and then ionize it through electron impact (now called electron ionization; see Milestone 2). Michael Baldwin and McLafferty improved this approach by switching to a chemical ionization technique (later developed and sold commercially by Hewlett-Packard). Meanwhile, others experimented with belt-drying to remove solvent before ionization (later resulting in a commercial instrument from Finnigan) or concentrating the analytes using membrane separation. A real game-changer was the development of charged droplet evaporation techniques. In 1978, Calvin Blakley, Mary McAdams and Marvin Vestal reported a method for forcing liquid through a heated capillary at increased pressure to effect nebulization, a process that they termed thermospray. They found that ionization could be achieved chemically by adding ammonium acetate to the carrier liquid. Other ionization techniques were trialed with mixed results, electrospray ionization (Milestone 15) being a particular triumph, and the most successful were soon incorporated by all major manufacturers into the new generation of commercial instruments. These technologies gave LC–mass spectrometry (LC–MS) a new level of usability in terms of compatible solvents and analytes. This flexibility, along with the improved speed and precision of modern LC–MS, makes it an invaluable method for the unequivocal detection of trace molecules, for example in testing for banned drugs in athletes' blood or urine.

The softness of the new ionization methods meant that even quite large molecular ions were detected intact, simplifying interpretation of the data considerably and, importantly, widening the scope for potential biological applications. Indeed, when coupled with capillary zone electrophoresis, another liquid-based separation method whereby charged species move under an applied potential, these ionization techniques proved to be useful for the identification of peptides and proteins.

In a similar way to GC, ion-mobility separation (IMS) lent itself well to a partnership with mass spectrometry, because both handle ions in the gaseous phase. Pairing IMS with a magnetic sector or a TOF instrument allowed the analysis of gas-phase reactions, such as the formation of H₃⁺ after ionization of hydrogen. Later, experiments showed that IMS could separate different conformations of intact proteins that have identical m/z values,


collisions, allowing reaction rates and rate MILESTONE 9 constants to be estimated. The structural information that could be deduced from mass spectra continued to Solving the primary structure of peptides increase. was instrumental in By the late 1950s, chemists had realized that as Edman sequencing—which was streamlined applying mass spectrometry analysis to mass spectrometry could be used to decipher by that point—as well as DNA sequencing, natural products. In 1963, he and his the structures of molecules (Milestone 7). which yielded gene sequences that could be co-workers performed a systematic study of Scientists were also just beginning to identify translated into the protein sequence. Ultimately, fragmentation patterns in pentacyclic the primary structure, or sequence, of peptides mass spectrometry proved complementary to triterpenes, describing fragmentation and proteins using chemical approaches. In these techniques, allowing researchers to behaviors and enabling unknown substances 1953, Fred Sanger used N-terminal labeling of determine the C termini of proteins that were to be assigned to a particular subclass. peptide fragments, followed by hydrolysis and too long for Edman sequencing and to confirm Key ion fragments and the reactions that analysis via paper chromatography, to translated sequences. gave rise to them could be deduced. Even sequence insulin (a feat that earned him the The early 1980s brought further innovations. stereochemistry at carbon bridges could be Nobel Prize in Chemistry in 1958). Around the In 1981, Donald Hunt and co-workers carried determined—a powerful demonstration of same time, Pehr Edman devised a method for out the first sequencing of peptides by tandem how much these techniques had progressed, sequencing proteins by stepwise degradation mass spectrometry (Milestone 13). They and foreshadowing their continued use today. starting from their N termini. 
analyzed permethylated peptides on a triple Enda Bergin, Senior Editor, The application of mass spectrometry to quadrupole (Milestone 6) mass spectrometer Nature Communications peptide sequencing came shortly thereafter, in following chemical ionization; this allowed 1959, when and colleagues direct analysis of a complex mixture of ORIGINAL RESEARCH PAPERS McLafferty, F.W. Mass described an innovative way to elucidate peptides, generated by protease cleavage of a spectrometric analysis broad applicability to chemical research. Anal. Chem. 28, 306–316 (1956) | Beynon, J.H. The use of the peptide structures using the reduction of small large protein, without prior fractionation. mass spectrometer for the identification of organic compounds. peptides to generate polyamino alcohols with Tandem mass spectrometry soon became the Microchimica Acta 44, 437–453 (1956) | Field, F.H., Franklin, J.L. characteristic spectra. Their key advance was standard method for peptide sequencing. & Lampe, F.W. Reactions of gaseous ions. I. Methane and ethylene. J. Am. Chem. Soc. 79, 2419–2429 (1957) | identifying chemistry to reduce the highly polar, Another revolution was the introduction of Budzikiewicz, H., Wilson, J.M. & Djerassi, C. Mass spectrometry zwitterionic character of peptides to allow them ‘soft’ ionization methods, which can be used on in structural and stereochemical problems. XXXII. Pentacyclic to be vaporized for ionization. In the years that polar, thermally labile compounds and yield triterpenes. J. Am. Chem. Soc. 85, 3688–3699 (1963) followed, numerous groups pioneered methods ions that are not highly fragmented (see for mass spectrometry–based peptide Milestone 2). 
In 1981, Michael Barber and and, in 1998, David Clemmer and colleagues sequencing, developing ways to chemically colleagues developed one of the first of these developed an instrument that could record modify peptides to be compatible with techniques: (FAB), mass-resolved ion mobilities for all analyte ions technical innovations that allowed direct which involves mixing samples in solution with simultaneously. This approach has since become a powerful tool in the characterization of confor- introduction of samples into the ion source for a matrix and bombarding them with mational dynamics of large biomolecules. mass spectrometry. high-energy atoms. FAB allowed the group to Thomas Faust, Associate Editor, In parallel, Biemann and colleagues built on sequence unmodified peptides. Although Nature Communications their previous work to develop a sequencing important, FAB was ultimately surpassed by

ORIGINAL RESEARCH PAPERS Gohlke, R.S. Time-of-flight mass spectrometry and gas-liquid partition chromatography. Anal. Chem. 31, 535–541 (1959) | Tal’roze, V.L., Karpov, G.V., Gordetski, I.C. & Skurat, V.E. Russ. J. Phys. Chem. 42, 1658–1664 (1968) | McFadden, W.H., Schwartz, H.L. & Evans, S. Direct analysis of liquid chromatographic effluents. J. Chromatogr. A 122, 389–396 (1976) | Blakley, C.R., McAdams, M.J. & Vestal, M.L. Crossed-beam liquid chromatograph–mass spectrometer combination. J. Chromatogr. A 158, 261–276 (1978) | Hoaglund, C.S., Valentine, S.J., Sporleder, C.R., Reilly, J.P. & Clemmer, D.E. Three-dimensional ion mobility/TOFMS analysis of electrosprayed biomolecules. Anal. Chem. 70, 2236–2242 (1998)
FURTHER READING McDaniel, E.W., Martin, D.W. & Barnes, W.S. Drift tube-mass spectrometer for studies of low-energy ion-molecule reactions. Rev. Sci. Instrum. 33, 2–7 (1962) | McAfee, K.B. Jr. & Edelson, D. Identification and mobility of ions in a Townsend discharge by time-resolved mass spectrometry. Proc. Phys. Soc. 81, 382–384 (1963) | Baldwin, M.A. & McLafferty, F.W. Liquid chromatography-mass spectrometry interface-I: the direct introduction of liquid solutions into a chemical ionization mass spectrometer. Org. Mass Spectrom. 7, 1111–1112 (1973) | Smith, R.D., Olivares, J.A., Nguyen, N.T. & Udseth, H.R. Capillary zone electrophoresis–mass spectrometry using an electrospray ionization. Anal. Chem. 60, 436–441 (1988) | Gohlke, R.S. & McLafferty, F.W. Early gas chromatography/mass spectrometry. J. Am. Soc. Mass Spectrom. 4, 367–371 (1993) | Pullen, F. The fascinating history of the development of LC-MS; a personal perspective. Chromatography Today February/March, 4–6 (2010)

strategy that was both fast and generally applicable for use on short peptides. However, they soon found that the mass spectra were complicated by factors such as side-chain fragmentation and variable ion abundance. To sort these out, they used a computational approach to interpret the mass spectra. The technique relied on using the exact masses of ion fragments to compute all possible peptide sequences, from which they could select the most probable sequence on the basis of the most abundant ions. They confirmed their choice by looking for other ions that should be present were the selected structure correct. This work was among the first to use computers to analyze mass spectra, setting an important precedent for the field.

The late 1960s and 1970s saw many advances in peptide sequencing by mass spectrometry, including a permethylation technique that Howard Morris and colleagues used for partial sequencing of proteins. However, mass spectrometry methods had competition from alternative approaches, such

soft ionization methods such as matrix-assisted laser-desorption/ionization (MALDI; Milestone 18), which are used widely today.

The past few decades have been fruitful, and the use of mass spectrometry for peptide and protein analysis has become commonplace. It is clear that these and other seminal works have had a lasting impact on the analysis of protein primary structures.

Rita Strack, Assistant Editor, Nature Methods

ORIGINAL RESEARCH PAPERS Biemann, K., Gapp, G. & Seibl, J. Application of mass spectrometry to structure problems. I. Amino acid sequence in peptides. J. Am. Chem. Soc. 81, 2274–2275 (1959) | Biemann, K., Cone, C., Webster, B.R. & Arsenault, G.P. Determination of the amino acid sequence in oligopeptides by computer interpretation of their high-resolution mass spectra. J. Am. Chem. Soc. 88, 5598–5606 (1966) | Hunt, D.F., Buko, A.M., Ballard, J.M., Shabanowitz, J. & Giordani, A.B. Sequence analysis of polypeptides by collision activated dissociation on a triple quadrupole mass spectrometer. Biomed. Mass Spectrom. 8, 397–408 (1981) | Barber, M., Bordoli, R.S., Sedgwick, R.D. & Tyler, A.N. Fast atom bombardment of solids as an ion source in mass spectrometry. Nature 293, 270–275 (1981)
FURTHER READING Morris, H.R., Williams, D.H. & Ambler, R.P. Determination of the sequences of protein-derived peptides and peptide mixtures by mass spectrometry. Biochem J. 125, 189–201 (1971)
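The computational idea described for peptide sequencing above (enumerate every sequence consistent with an exact mass, then prefer the one best supported by the observed fragment ions) can be illustrated with a small sketch. This is not the 1966 program: the residue alphabet is truncated, the function names are hypothetical, and plain prefix-mass sums stand in for whatever fragment ions a real spectrum of a derivatized peptide would contain.

```python
# Monoisotopic residue masses (Da) for a small, illustrative alphabet.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "L": 113.08406, "N": 114.04293,
    "D": 115.02694, "Q": 128.05858, "K": 128.09496, "E": 129.04259,
}
WATER = 18.01056  # H and OH on the free termini of an underivatized peptide


def candidate_sequences(peptide_mass, max_len=4, tol=0.005):
    """Return every sequence whose exact mass matches peptide_mass within tol."""
    target = peptide_mass - WATER
    found = []

    def extend(prefix, total):
        if prefix and abs(total - target) <= tol:
            found.append(prefix)
        if len(prefix) == max_len:
            return
        for aa, mass in RESIDUE_MASS.items():
            # Prune branches that already overshoot the target mass.
            if total + mass <= target + tol:
                extend(prefix + aa, total + mass)

    extend("", 0.0)
    return found


def prefix_masses(seq):
    """Summed residue masses of each proper N-terminal prefix (assumed fragment model)."""
    total, sums = 0.0, []
    for aa in seq[:-1]:
        total += RESIDUE_MASS[aa]
        sums.append(total)
    return sums


def support(seq, observed, tol=0.005):
    """Count predicted prefix masses that appear among the observed fragment masses."""
    return sum(any(abs(p - o) <= tol for o in observed) for p in prefix_masses(seq))


def best_candidate(peptide_mass, observed, max_len=4, tol=0.005):
    """Pick the candidate sequence with the most supporting fragments."""
    cands = candidate_sequences(peptide_mass, max_len, tol)
    return max(cands, key=lambda s: support(s, observed, tol))
```

For the tripeptide GAS, the exact mass alone is ambiguous: Gln has the same elemental composition as Gly plus Ala, so QS fits the mass as well. Supplying the two prefix fragments (57.02146 and 128.05857 Da) is what singles out GAS, which mirrors the confirm-by-expected-ions step in the text.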


MILESTONE 10

A technique to carry a torch for

In 1961, American inventor Thomas Reed wrote, “This type of apparatus should be useful in high temperature research and engineering,” referring to his latest invention, the inductively coupled plasma (ICP) torch. He could probably not have predicted, however, the vast influence this publication would have. Two decades later, the connection of an ICP torch to a mass spectrometer would transform elemental and isotope analysis techniques and add unprecedented analytical capabilities to the fields of chemistry, geology, archaeology, forensics and biology.

In ICP mass spectrometry (ICP-MS), an inductively coupled argon plasma torch is used to atomize and ionize a sample, which is usually introduced as an aerosol or by means of laser ablation. The singly charged ions are then transferred to a mass spectrometer through an orifice and analyzed, one charge-to-mass ratio at a time. Modern ICP-MS systems are rapid, allowing for trace measurements of multiple elements simultaneously. The technology can detect all metals and even some metalloids and non-metals, and it is highly sensitive: detection limits fall in the part-per-billion to part-per-trillion range, depending on the element being analyzed.

The process required to achieve such capabilities was long, set into motion by Reed’s plasma-torch design. Most prior plasma torches had required the immersion of electrodes into gas, which often led to electrode disintegration. But Reed was able to create an electrode-free, stable plasma at atmospheric pressures by inductively heating the gas with a radiofrequency coil. It soon became clear that Reed’s torch was extremely useful. In the mid-1960s, Stanley Greenfield and colleagues and Richard Wendt and colleagues reported that the ICP torch could be coupled to an atomic-emission spectrometer to detect trace metal ions in solution.

In the late 1970s, Alan Gray started to experiment with capillary-arc plasmas as potential ionization sources for mass spectrometry. Although his instrument had excellent detection limits, matrix effects and interference from ion species with similar mass-to-charge ratios were severe. The real breakthrough came in 1980, when Gray, together with researchers from the Ames Laboratory in Iowa, decided to instead use an ICP torch, which could cleanly atomize and ionize the analyte to produce singly charged species.

The potential for elemental and isotope analysis was obvious immediately, not only to researchers but also to scientific-instrument manufacturing companies, most notably SCIEX and VG Isotopes, which started to make the improvements that were needed for the technique to mature. For example, the extraction of the ICP ions, produced at atmospheric pressures, from the plasma into the vacuum system of the mass spectrometer required careful control of electrostatic effects. By 1988, the technology had advanced dramatically, and SCIEX and VG Isotopes had installed around 175 of these bulky instruments in laboratories around the world.

Modern ICP mass spectrometers have become substantially more compact, and there are numerous instruments currently in use worldwide. In geology labs, they are used to take reliable measurements of isotope ratios to determine the age and origin of liquid or solid samples. This application goes back to a development by VG Isotopes and Alex Halliday in the early 1990s known as multiple-collector ICP-MS, a type of mass spectrometry that separates ions into mass-resolved beams that are then guided to different collectors. With this technique, isotope ratios can be determined with a precision of 0.01–0.001%. In 2009, Dmitry Bandura and colleagues reported the use of ICP-MS to simultaneously measure multiple antigens on single cells. To carry out this mass-cytometry technique, they attached lanthanide tags to antibodies against cell-surface antigens.

For Thomas Reed (who, despite entering retirement, has not stopped inventing), it must be incredibly rewarding to see how many useful applications his torch has sparked.

Leonie Mueck, Senior Editor, Nature

ORIGINAL RESEARCH PAPERS Reed, T.B. Induction-coupled plasma torch. J. Appl. Phys. 32, 821–824 (1961) | Houk, R.S. et al. Inductively coupled argon plasma as an ion source for mass-spectrometric determination of trace elements. Anal. Chem. 52, 2283–2289 (1980)
FURTHER READING Greenfield, S., Jones, I.L. & Berry, C.T. High-pressure plasmas as spectroscopic emission sources. Analyst 89, 713–720 (1964) | Wendt, R.H. & Fassel, V.A. Induction-coupled plasma spectrometric excitation source. Anal. Chem. 37, 920–922 (1965) | Houk, R.S. & Thompson, J.J. Inductively coupled plasma spectrometry. Mass Spectrom. Rev. 7, 425–461 (1988) | Walder, A.J. & Freeman, P.A. Isotopic ratio measurement using a double focusing magnetic sector mass analyser with an inductively coupled plasma as an ion source. J. Anal. At. Spectrom. 7, 571–575 (1992) | Bandura, D.R. et al. Mass cytometry: technique for real time single cell multitarget immunoassay based on inductively coupled plasma time-of-flight mass spectrometry. Anal. Chem. 81, 6813–6822 (2009)

A plasma torch. Adapted from Aeschliman, D.B. et al., J. Anal. At. Spectrom. 18, 1008–1014 (2003), with permission of the Royal Society of Chemistry. © 2003, Royal Society of Chemistry.

MILESTONE 11

The pixelation of mass spectrometry

The idea of employing mass spectrometry for imaging purposes (imaging mass spectrometry; IMS) was first introduced by Raymond Castaing and Georges Slodzian in 1962, using the principles of secondary ion mass spectrometry (SIMS; see Milestone 2). Castaing had begun his career as a research engineer at the Office National d’Etudes et de Recherches Aéronautiques (ONERA), and from then on had nurtured an interest in metal alloys and how their composition varies across a sample.

In SIMS, a sample is bombarded with high-energy ‘primary ions’, such as Cs+, O, Ar+, Ga+ and In+, causing an ejection of ‘secondary ions’ from the surface. These secondary ions are representative of the atomic composition of the surface and can be collected and separated according to their mass ratios. Castaing and Slodzian reasoned that these ions could also be used to obtain information about the specific element or isotopes that are present at a particular position on a metal-alloy surface. They recognized that by analyzing different, defined points across the sample, they could build up an image of which element or isotope was present. Each sampling point would reflect what we now describe as a pixel in an image.

Castaing and Slodzian compared their new SIMS-based approach with an approach based on X-ray emission spectroscopy called ‘scanning microanalysis’, which Castaing had developed while working with André Guinier in the 1940s and 1950s. Castaing and Slodzian found that although it was easier to obtain quantitative results using scanning microanalysis, their new approach was able to detect extremely light elements and discriminate between isotopes; it also had greater resolving power.

In 1970, Pierre Galle used Castaing’s and Slodzian’s SIMS imaging approach to look at biological samples. He investigated renal tissue and red blood cells, observing that the distribution of sodium and potassium varied in different parts of the renal tissue. He also demonstrated that sufficient sodium was present in the red blood cells to produce a mass spectrometry–based ‘picture’ of the cell.

A further technical improvement, which would have a broad impact on imaging of biological samples using mass spectrometry, arrived in 1997. Richard Caprioli, Terry Farmer and Jocelyn Gile used matrix-assisted laser desorption/ionization (MALDI) for IMS for the first time to image peptides and proteins in tissues. The development of MALDI (Milestone 18) had transformed the analysis of large, biologically relevant molecules, but researchers typically combined and lysed cells prior to analysis, thus losing spatial information about where molecules of interest were originally located in the sample being analyzed. Caprioli and his colleagues took advantage of the high sensitivity of the MALDI technique (at that time, low-femtomole to attomole levels for proteins and peptides), recognizing that this would enable them to determine the molecular weights of molecules by analyzing small, discrete areas on a tissue surface. The researchers reasoned that coating a biological surface with a thin layer of the matrix required for MALDI-MS analysis and rastering the laser across the surface would enable point-by-point analysis of the molecules present at specific locations, much like the use of SIMS for elemental imaging. They found a diverse array of peptides and proteins in rat tissue samples, demonstrating the value of their method.

The interest in applying IMS in biology is reflected in the development of various commercial systems, mostly based on SIMS, desorption electrospray ionization (DESI; Milestone 2) or MALDI. Today, robotic methods are employed for speed and to obtain high-spatial-resolution images. Further biological applications are still under development, including the introduction of IMS into clinical pathology laboratories, suggesting that IMS will continue to be highly valued as an application of mass spectrometry.

Katharine Barnes, Managing Editor, Nature Protocols

ORIGINAL RESEARCH PAPERS Castaing, R. & Slodzian, G. Microanalyse par émission ionique secondaire. J. Microscopie 1, 395–410 (1962) | Caprioli, R.M., Farmer, T.B. & Gile, J. Molecular imaging of biological samples: localization of peptides and proteins using MALDI-TOF MS. Anal. Chem. 69, 4751–4760 (1997)
FURTHER READING Galle, P. Sur une nouvelle méthode d’analyse cellulaire utilisant le phénomène d’[...]. Ann. Phys. Biol. Med. 42, 83–94 (1970) | McDonnell, L.A & Heeren, R.M.A. Imaging mass spectrometry. Mass. Spectrom. Rev. 26, 606–643 (2007)

IMS analysis of nitrogen fixation by bacterial symbionts within a marine bivalve. Reproduced from Lechene, C.P., Luyten, Y., McMahon, G. & Distel, D.L., Science 317, 1563–1566, 2007, with permission from AAAS.

MILESTONE 12

Conquering carbohydrate complexity

Despite mass spectrometry’s success in the identification and characterization of small molecules (Milestone 7), its application to carbohydrates was initially stymied. This was due to their low vapor pressure (which made it difficult to push them into the gas phase) and heat sensitivity, which led to low reproducibility or poor signal intensity. In 1959, however, P.A. Finan and R.I. Reed published a spectrum of a methylated polysaccharide, suggesting that carbohydrate modifications could solve the problem of low vapor pressure.

On the basis of this technical clue, Klaus Biemann and colleagues and N.K. Kochetkov and colleagues followed up with two comprehensive reports in 1963, presenting the mass spectra of a variety of methylated or acetylated monosaccharides and establishing mass spectrometry as a viable technique for the structural determination of carbohydrates. The two groups identified characteristic patterns of ions that reflect the chemical cleavage of the monosaccharides into different fragments. In tandem with the spectra obtained from deuterated monosaccharides to monitor specific portions of the ring structure, these patterns can be used to identify the ring size and position of some functional groups.

Over the next two decades, additional reports emerged that focused on longer carbohydrate sequences or specific types of carbohydrates; inefficiencies in the methylation reactions, however, remained a roadblock. In a seminal report in 1984, Ionel Ciucanu and Francisc Kerek revisited the chemical mechanism of this methylation reaction, discovering that simple bases such as OH– or H– were sufficient to catalyze the reaction. Their revised reaction conditions improved yields from 30% to 100%, shortened the reaction time from hours to minutes, eliminated by-products that had plagued earlier techniques and used chemicals that were more accessible to researchers.

With improved techniques in hand, and the development of tandem mass spectrometry (Milestone 13), scientists in the field were poised to investigate more complex carbohydrates. However, it quickly became clear that the nomenclature developed to describe carbohydrate fragments was insufficient to capture this new information. In 1988, Bruno Domon and Catherine Costello proposed a systematic nomenclature, based on recently developed peptide terminology, that indicated where chemical cleavage occurred within a saccharide structure, which ‘end’ of the carbohydrate structure the observed fragment was from and whether the fragment was part of a branch or the ‘core’ structure.

From the 1990s to early 2000s, the dissemination of commercial electrospray mass spectrometers, along with other improvements in instrumentation, provided major advances in carbohydrate analysis. Douglas Sheeley and Vernon Reinhold demonstrated that the quadrupole ion trap mass spectrometer was superior to earlier instruments because of its efficiency in creating and isolating ions. As a result, it enabled multiple tandem mass spectrometry experiments and detection of less abundant fragments, facilitating more complete assignments of the connections between substructures than had been possible before.

Whereas previous studies on glycopeptides had examined only the isolated peptide or glycan structure, Alan Marshall and Carol Nilsson and colleagues reported the use of a single instrument, the Fourier transform ion cyclotron resonance mass spectrometer (see Milestone 5), to collect two different kinds of spectra of intact glycopeptides. By using the ‘soft’ fragmentation technique of electron-capture dissociation, the researchers were able to detect fragments of the peptide backbone, whereas infrared multiphoton dissociation reported on the carbohydrate structure, facilitating glycopeptide assignment.

This combination of methods for modification, characterization and interpretation remains the standard today, powering increasingly insightful explorations of the glycome.

Catherine Goodman, Senior Editor, Nature Chemical Biology

ORIGINAL RESEARCH PAPERS Biemann, K., DeJongh, D.C. & Schnoes, H.K. Application of mass spectrometry to structure problems. XIII. Acetates of pentoses and hexoses. J. Am. Chem. Soc. 85, 1763–1771 (1963) | Kochetkov, N.K., Wulfson, N.S., Chizhov, O.S. & Zolotarev, B.M. Mass spectrometry of carbohydrate derivatives. Tetrahedron 19, 2209–2224 (1963) | Ciucanu, I. & Kerek, F. A simple and rapid method for the permethylation of carbohydrates. Carbohydr. Res. 131, 209–217 (1984) | Domon, B. & Costello, C.E. A systematic nomenclature for carbohydrate fragmentations in FAB-MS/MS spectra of glycoconjugates. Glycoconjugate J. 5, 397–409 (1988) | Sheeley, D.M. & Reinhold, V.N. Structural characterization of carbohydrate sequence, linkage and branching in a quadrupole ion trap mass spectrometer: neutral oligosaccharides and N-linked glycans. Anal. Chem. 70, 3053–3059 (1998) | Håkansson, K. et al. Electron capture dissociation and infrared multiphoton dissociation MS/MS of an N-glycosylated tryptic peptide to yield complementary sequence information. Anal. Chem. 73, 4530–4536 (2001)
FURTHER READING Finan, P.A. & Reed, R.I. Application of the [...] 1866 (1959) | Hirabayashi, J., Arata, Y. & Kasai, K. Glycome project: concept, strategy and preliminary application to Caenorhabditis elegans. Proteomics 1, 295–303 (2001)


MILESTONE 13

Forming fragments

The 1960s were a productive time for the pioneers of mass spectrometry. New methods of ionizing molecules were under development, and technical advancements in the preceding decade had improved the accuracy of mass spectrometers. Yet, although the technique could identify the mass-to-charge ratios and relative abundances of ionized molecules, it could not provide any information about the relative arrangement of atoms within them.

In 1966, Jean Futrell and C.D. Miller reported a mass spectrometer design that would ultimately lead to the analysis of ions by means of collision-induced dissociation tandem mass spectrometry. That was not, however, their aim at the time. Instead, Futrell and Miller wanted to study the interaction of charged ions with neutral molecules. To do this, they developed a setup that used a spectrometer as an ion gun to direct the ion into a collision chamber containing a neutral gas; the effects of the interactions between the ion and the gas could then be detected using a second spectrometer. This arrangement, two spectrometers separated by a collision chamber, became the standard for collision-induced dissociation tandem mass spectrometry (MS/MS).

Despite the innovative nature of their setup, Futrell and Miller did not use it to determine any new structural information about the parent ion. This would require another breakthrough later in the same year, when T.W. Shannon and Fred McLafferty showed that ions with the chemical formula C2H5O+ can be classified into different groups on the basis of the spectra that arise from the decomposition of a ‘metastable peak’. This type of peak represents ions with lifetimes that are shorter than the time required for them to travel between the ionization source and the detector. The decomposition of different structural isomers of C2H5O+ ions produces characteristic fragmentation patterns; therefore, these patterns provide structural information about the metastable parent ion. This approach (selecting a specific ion from the mixture formed and then analyzing the spectra that are generated through fragmentation) is the basic premise of MS/MS. However, Shannon and McLafferty relied on the parent ion generating a metastable peak that would decompose quickly inside the mass spectrometer, which is rarely the case.

Keith Jennings eliminated the need for a metastable peak in 1968. Although he used a setup that differed from Futrell’s and Miller’s, it relied on a similar concept: colliding stable molecular ions with a gas introduced into the spectrometer. He showed that it was possible to record a ‘collision-induced mass spectrum’, in which a collision produces ions that are fragments of the parent ion, which can then be detected. This meant that a metastable peak was no longer necessary. Furthermore, the decomposition of an ion that gives rise to a metastable peak is highly dependent on the parent ion’s internal energy. In contrast, collision-induced dissociations give rise to spectra that are independent of the energy of the parent ion, thereby simplifying its identification.

Collectively, these papers showed that a mass spectrometer could be used to select a species of ion, collide it with a neutral molecule to break it into fragments and then analyze it in a second mass spectrometer. Crucially, the work also showed that the fragment ions could be used to infer information about the structure of the parent ion, information not available from the parent ion’s spectra. These concepts enabled the development of collision-induced dissociation, and the alternative fragmentation methods that subsequently followed, into powerful, highly sensitive techniques for detecting analytes and determining their molecular structure.

Russell Johnson, Associate Editor, Nature Chemistry

ORIGINAL RESEARCH PAPERS Futrell, J.H. & Miller, C.D. Tandem mass spectrometer for the study of ion-molecule reactions. Rev. Sci. Instrum. 37, 1521–1526 (1966) | Shannon, T.W. & McLafferty, F.W. Identification of gaseous organic ions by the use of “metastable peaks.” J. Am. Chem. Soc. 88, 5021–5022 (1966) | Jennings, K.R. Collision-induced decompositions of aromatic molecular ions. Int. J. Mass Spec. Ion Phys. 1, 227–235 (1968)
FURTHER READING Zubarev, R.A., Kelleher, N.L. & McLafferty, F.W. Electron capture dissociation of multiply charged protein cations. A nonergodic process. J. Am. Chem. Soc. 120, 3265–3266 (1998) | Syka, J.E.P., Coon, J.J., Schroeder, M.J., Shabanowitz, J. & Hunt, D.F. Peptide and protein sequence analysis by electron transfer dissociation mass spectrometry. Proc. Natl. Acad. Sci. USA 101, 9528–9533 (2004) | Olsen, J.V., Macek, B., Lange, O., Makarov, A., Horning, S. & Mann, M. Higher-energy C-trap dissociation for peptide modification analysis. Nat. Methods 4, 709–712 (2007)

A schematic of the tandem mass spectrometer developed by Futrell and Miller. Reprinted with permission from Futrell, J.H. & Miller, C.D., Tandem mass spectrometer for study of ion-molecule reactions, Rev. Sci. Instrum. 37, 1521 (1966). © 1966, AIP Publishing LLC.

MILESTONE 14

Seeing the full picture of metabolism

Cells, tissues and organisms contain complex mixtures of chemicals, the products and by-products of metabolism. Although mass spectrometry was routinely used to identify specific classes of metabolites starting in the mid-1950s (Milestone 7), it would take another decade before researchers began using the technique to examine complex metabolite mixtures.

In one early example, E.C. Horning and colleagues separated and characterized organic compounds from urine, mainly acids and steroids, using gas-liquid chromatography and confirming chemical identity by mass spectrometry. In 1971, they extended their analysis to include drugs and drug metabolites, initiating the use of the method as a means of establishing a ‘metabolic profile’ of an individual.

Despite these early instances of complex metabolite identification, the field lay relatively dormant for more than 20 years, until Richard Gross and colleagues reported the identification of more than 50 phospholipids from erythrocyte membranes in 1994. Several other groups later reported the identification of complex lipid mixtures or the quantification of a few select species, and the identification and quantification of phospholipids from total cellular lipid extracts in 1997 by F.T. Wieland, W.D. Lehmann and colleagues heralded the birth of lipidomics. By using electrospray ionization mass spectrometry (ESI-MS) (Milestone 15), optimizing fragmentation energies for each phospholipid headgroup class and including internal standards with non-natural fatty acids, they were able to assess both the quality and quantity of a lipid extract.

Metabolic profiling by mass spectrometry became part of the functional genomics toolkit in 2000 when Oliver Fiehn and colleagues compared extracts from two Arabidopsis thaliana ecotypes, finding that they expressed distinct metabolic profiles. It was not long before this approach was applied to yeast, and later adapted to a high-throughput format by analyzing spent culture medium rather than cell lysate, a method Douglas Kell and colleagues referred to as ‘metabolic footprinting’. In 2002, Ralf Takors and colleagues described a process to monitor the flow of metabolites in Escherichia coli in response to changes in nutrient status, which they termed ‘metabolomics’. Whereas mass spectrometry had previously been used to present static snapshots of metabolic state, now dynamic changes could be monitored. The following year, Eliane Fischer and Uwe Sauer’s investigation of the metabolic flux of carbon through a panel of E. coli mutants provided insight into central carbon metabolism in this organism. More recent metabolomics studies yield increasingly complicated datasets, whose interpretation has been aided by improvements in mass spectrometry techniques and by software tools such as XCMS designed to enable facile detection of unknown spectrum peaks that differ substantially in concentration across samples.

Following on from its early use in profiling complex metabolite mixtures in biological fluids, metabolomics has recently found an expanding number of uses in basic research and medicine, from gaining insight into how metabolic pathways interact and influence one another to understanding the effect of drugs and diet on our bodies. Applying metabolomics to direct synthetic biology approaches and adapting it to advance personalized medicine are two of the current challenges and opportunities facing metabolomics researchers.

Kyle Legate, Associate Editor, Nature Communications

ORIGINAL RESEARCH PAPERS Dalgliesh, C.E., Horning, E.C., Horning, M.G., Knox, K.L. & Yarger, K. Gas-liquid-chromatographic procedure for separating a wide range of metabolites occurring in urine or tissue extracts. Biochem. J. 101, 792–810 (1966) | Horning, E.C. & Horning, M.G. Metabolic profiles: gas-phase methods for analysis of metabolites. Clin. Chem. 17, 802–809 (1971) | Han, X. & Gross, R.W. Electrospray ionization mass spectroscopic analysis of human erythrocyte plasma membrane phospholipids. Proc. Natl. Acad. Sci. USA 91, 10635–10639 (1994) | Brügger, B., Erben, G., Sandhoff, R., Wieland, F.T. & Lehmann, W.D. Quantitative analysis of biological membrane lipids at the low picomole level by nano-electrospray ionization tandem mass spectrometry. Proc. Natl. Acad. Sci. USA 94, 2339–2344 (1997) | Fiehn, O. et al. Metabolite profiling for plant functional genomics. Nat. Biotechnol. 18, 1157–1161 (2000) | Buchholz, A., Hurlebaus, J., Wandrey, C. & Takors, R. Metabolomics: quantification of intracellular metabolite dynamics. Biomol. Eng. 19, 5–15 (2002)
FURTHER READING McFadden, W.H., Teranishi, R., Corse, J., Black, D.R. & Mon, T.R. Volatiles from strawberries. II. Combined mass spectrometry and gas chromatography on complex mixtures. J. Chromatogr. 18, 10–19 (1965) | Allen, J. et al. High-throughput classification of yeast mutants for functional genomics using metabolic footprinting. Nat. Biotechnol. 21, 692–696 (2003) | Fischer, E. & Sauer, U. Metabolic flux profiling of Escherichia coli mutants in central carbon metabolism using GC-MS. Eur. J. Biochem. 270, 880–891 (2003) | Smith, C.A., Want, E.J., O’Maille, G., Abagyan, R. & Siuzdak, G. XCMS: processing mass spectrometry data for metabolite profiling using nonlinear peak alignment, matching, and identification. Anal. Chem. 78, 779–787 (2006)

Identification of triglycerides in a mixture by ESI-MS and tandem mass spectrometry. Image reproduced from Han, X. & Gross, R.W., Anal. Biochem. 295, 88–100 (2001), with permission from Elsevier. © 2001, Elsevier.

MILESTONE 15

Electrospray makes molecular elephants fly

The ‘electrospray revolution’ in the late 1980s dramatically increased the utility of mass spectrometry. Developed by John Fenn at Yale University, electrospray ionization (ESI) finally allowed the mass spectrometric analysis of biological macromolecules, including proteins, and was rapidly adopted in laboratories around the world. But the seeds of the electrospray revolution had been planted long before.

In 1968, Malcolm Dole at Northwestern University found that large gas-phase ions (‘macroions’) could be transiently generated by “electrospraying a dilute solution into an evaporation chamber containing nitrogen, and by allowing the volatile solvent to evaporate from the tiny drops so produced.” Dole generated a spray of charged liquid drops by streaming a dilute polymer solution through a needle held at high voltage relative to the spray chamber wall. Once electrosprayed into the evaporation chamber, polymer ions were produced in the ambient gas by the interplay of evaporation and electrostatic repulsion.

Lord Rayleigh had reported in 1882 that as a solvent evaporates, the charge density on the surface of a “charged spherical mass of liquid” increases until electrical repulsion overcomes surface tension, causing the droplet to disintegrate. In Dole’s apparatus, repetitive disintegrations resulted in droplets so small that they contained only a single macroion, whose mass could then be determined.

Unfortunately, the mass spectrometers available at the time could not detect singly charged ions with masses larger than a few thousand daltons, well below the mass range for macromolecules of biological interest. Also, Dole had overlooked the problem of ion re-solvation occurring prior to mass measurement. Owing to the substantial temperature drop during its adiabatic expansion, the electrosprayed aerosol would rapidly become saturated with solvent vapor. Hence, Dole’s macroions would get re-solvated, and their measured masses would be higher than the actual values by an unknown amount.

John Fenn soon became interested in Dole’s research, but at first he was unable to overcome the problems Dole had encountered. Almost 20 years passed before Fenn resumed working on the electrospray technique, starting with small molecules such as vitamin B6 and using a quadrupole mass analyzer (Milestone 6). With a newly designed electrospray apparatus, Fenn and colleagues obtained spectra of a variety of small molecules, and observed “peaks corresponding, for example, to protons solvated by various combinations and numbers of methanol and water molecules that could be varied widely simply by adjusting the flow rate and/or temperature of the counter-current drying gas.” More importantly, they were able to generate intact ions of fragile molecules (without fragmentation) for mass spectrometric analysis. Electrospray ionization mass spectrometry (ESI-MS) was born.

Fenn then analyzed larger molecules, such as the antibiotic gramicidin S and the immunosuppressant cyclosporin A, and observed peaks corresponding to singly and multiply charged ions with intensities that depended on the composition and concentration of the solutions. The observation of multiply charged ions would be the fundamental step toward extending the application of ESI-MS to macromolecules, because the mass range of the mass analyzers could now be increased by a factor equal to the number of charges on a macroion.

By extensively studying polyethylene glycol oligomers, Fenn determined how multiply charged ions are generated and how to control that process. Mass spectrometric analysis of proteins immediately followed. Electrospray ionization is currently the technique typically used to generate ions in liquid chromatography–mass spectrometry (Milestone 8) and is broadly used in many biological and biomedical applications, including proteomics (Milestone 20) and drug-discovery efforts.

In 2002, John Fenn was awarded the Nobel Prize in Chemistry together with Koichi Tanaka for the “development of soft desorption ionisation methods for mass spectrometric analyses of biological macromolecules” (see Milestone 18). As he quipped in his Nobel Prize lecture, Fenn became a Nobel laureate for giving “electrospray wings to molecular elephants.”

Cosma D Dellisanti, Associate Editor, Nature Structural & Molecular Biology

PRIMARY RESEARCH PAPERS Dole, M. et al. Molecular beams of macroions. J. Chem. Phys. 49, 2240–2249 (1968) | Yamashita, M. & Fenn, J.B. Electrospray ion source. Another variation on the free-jet theme. J. Phys. Chem. 88, 4451–4459 (1984) | Whitehouse, C.M., Dreyer, R.N., Yamashita, M. & Fenn, J.B. Electrospray interface for liquid chromatographs and mass spectrometers. Anal. Chem. 57, 675–679 (1985) | Fenn, J.B., Mann, M., Meng, C.K., Wong, S.F. & Whitehouse, C.M. Electrospray ionization for mass spectrometry of large biomolecules. Science 246, 64–71 (1989)
FURTHER READING Rayleigh, J.S.W. On the equilibrium of liquid conducting masses charged with electricity. Philos. Mag. 14, 184–186 (1882) | Wong, S.F., Meng, C.K. & Fenn, J.B. Multiple charging in electrospray ionization of poly(ethylene glycols). J. Phys. Chem. 92, 546–550 (1988) | Fenn, J.B. Electrospray ionization mass spectrometry: how it all began. J. Biomol. Tech. 13, 101–118 (2002) | Fenn, J.B. Nobel Lecture: Electrospray wings for molecular elephants (Nobel Foundation, 2002)
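The point in the electrospray milestone that multiple charging extends the effective mass range by a factor equal to the number of charges can be made concrete with a little arithmetic. The sketch below assumes that every charge comes from an added proton and uses hypothetical function names and an arbitrary example mass; real spectra carry adducts, noise and many charge states, so this is an illustration rather than a deconvolution method.

```python
PROTON = 1.00728  # mass of a proton in daltons


def mz(neutral_mass, charge):
    """m/z of a molecule carrying `charge` protons (assumed charging model)."""
    return (neutral_mass + charge * PROTON) / charge


def infer_from_adjacent(peak_high, peak_low):
    """Recover charge and neutral mass from two neighbouring charge-state peaks.

    peak_high is assumed to carry z protons and peak_low z + 1, so
    peak_high - peak_low = M / (z * (z + 1)) and peak_low - PROTON = M / (z + 1).
    """
    z = round((peak_low - PROTON) / (peak_high - peak_low))
    return z, z * (peak_high - PROTON)


protein = 16951.5                    # arbitrary illustrative mass in daltons
singly_charged = mz(protein, 1)      # far beyond a few-thousand m/z analyzer
highly_charged = mz(protein, 20)     # comfortably below 1,000 m/z
z, recovered = infer_from_adjacent(mz(protein, 10), mz(protein, 11))
```

With 20 protons attached, the same molecule that would appear near m/z 17,000 as a singly charged ion falls below m/z 1,000, and two adjacent peaks are enough to recover both the charge and the neutral mass under these assumptions.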


MILESTONE 16

Signatures of disease

Image: Mass spectrometry–based approaches are used today to screen for metabolic disease in newborns. CORBIS

Treating a disease effectively depends on making the right diagnosis. For a large part of the twentieth century, medical laboratory diagnostics relied on often laborious and occasionally imprecise staining procedures or biochemical assays. A fast, sensitive and reliable technique to detect pathogens such as bacteria or biomarkers of disease in tissue or bodily fluids was badly needed, but had remained tantalizingly out of grasp.

Mass spectrometry, with its power to accurately quantify and identify minuscule amounts of a dazzling array of biomolecules, had just the right mix of capabilities for analyzing bacteria but was hindered in the 1970s by the harsh conditions required to prepare samples. To render such samples suitable for mass spectrometry analysis, they had to first be decomposed by heating at high temperature. Invariably, this process of pyrolysis led to the almost total degradation of the material, therefore making mass spectrometry of little use for diagnostic purposes.

In 1975, John Anhalt and Catherine Fenselau showed that pyrolysis conditions could be made gentler, allowing retention of larger and more complex biomolecules and generating mass spectra unique to specific microorganisms. This team demonstrated that lyophilized bacterial samples could be loaded directly onto the mass spectrometer for ionization. Indeed, when applying this approach to different bacterial species, they observed highly characteristic mass spectra derived largely from the pyrolysis of ubiquinones and phospholipids. Even closely related bacteria such as Staphylococcus aureus and Staphylococcus epidermidis could be readily distinguished on the basis of their mass spectra.

Subsequent work has revolutionized microbial diagnostics further through the introduction of soft ionization approaches such as the matrix-assisted laser desorption/ionization time-of-flight (MALDI-TOF) technique (Milestone 18), which has been successfully applied to identify a broad range of microorganisms from a variety of tissue sources. MALDI-TOF results in little molecular fragmentation and allows rapid, highly accurate identification of microorganisms right down to the species level.

Challenges remain in using mass spectrometry for clinical microbial diagnostics, such as reliably distinguishing between highly related microbes and analyzing spore-forming bacteria. However, modern approaches that combine mass spectrometry with techniques such as the polymerase chain reaction have started to address these issues and even have the potential to rapidly identify the presence of antibiotic resistance.

Mass spectrometry has proven to be a game-changer in medicine by also enabling the diagnosis of inherited metabolic disorders. Because these conditions result in the production of abnormal acylcarnitine metabolites that can be cytotoxic, their prompt diagnosis in newborns is critical. Yet chromatographic technologies do not permit the detection of these metabolites in blood samples, in part because acylcarnitines are small, low in abundance and structurally similar to normal blood plasma analytes. An additional challenge was the forbidding complexity of blood plasma, a morass of thousands of biomolecules that stymied the straightforward detection of any disease signature.

That is, until 1990, when D.S. Millington and colleagues showed that it was possible to use tandem mass spectrometry (MS/MS; Milestone 13) to accurately profile acylcarnitines from dried blood spots taken during routine newborn heel-pricks. MS/MS was especially useful for such metabolic screening because it allows multiple molecules to be analyzed simultaneously. This not only increases the throughput but also allows relative levels of analytes to be compared, which provides a better diagnostic of metabolic disease than absolute levels of a single biomarker. This work charted a path to using MS/MS routinely in newborn blood-spot diagnosis. Indeed, MS/MS is now well-integrated into standard newborn screening and is used for the detection of dozens of inherited metabolic disorders.

Building on these landmark medical mass spectrometry studies is the search for new biomarkers and approaches to improve diagnosis and monitor disease. In particular, the generation of larger and more accurate libraries of pathogen mass spectra should increase the number of microorganisms that can be identified and the depth at which they can be analyzed. Future advances in mass spectrometry technology should enable the diagnosis of potentially any condition that leads to derangement of body chemistry, including cancer, atherosclerosis and diabetes.

Zoltan Fehervari, Senior Editor, Nature Immunology

ORIGINAL RESEARCH PAPERS Anhalt, J.P. & Fenselau, C. Identification of bacteria using mass spectrometry. Anal. Chem. 47, 219–225 (1975) | Millington, D.S., Kodo, N., Norwood, D.L. & Roe, C.R. Tandem mass spectrometry: a new method for acylcarnitine profiling with potential for neonatal screening for inborn errors of metabolism. J. Inher. Metab. Dis. 13, 321–324 (1990)
FURTHER READING Wilcken, B., Wiley, V., Hammond, J. & Carpenter, C. Screening newborns for inborn errors of metabolism by tandem mass spectrometry. N. Engl. J. Med. 348, 2304–2312 (2003) | Rifai, N., Gillette, M.A. & Carr, S.A. Protein biomarker discovery and validation: the long and uncertain path to clinical utility. Nat. Biotech. 24, 971–983 (2006)

MILESTONE 17

Reduce complexity by choosing your reactions

Mass spectra are built up by recording the abundance of ions at sequential mass-to-charge ratios (m/z) and can be used to identify what compounds are present in a sample. If a researcher wants to determine the presence or quantity of specific compounds, then the mass spectrometer can be programmed to look exclusively at their characteristic ions or fragmentation reactions. The development of selected ion monitoring (SIM) and, later, selected reaction monitoring (also called multiple reaction monitoring) allowed researchers to use such a strategy to achieve both greater sensitivity and quantitative accuracy.

Initially, SIM was performed using magnetic sector instruments. To do this, the magnetic field and the accelerating voltage from the ion source were adjusted to select up to three ions to monitor. A landmark paper from Bengt Samuelsson, Mats Hamberg and Charles Sweeley in 1970 showed that very sensitive and accurate quantitative analysis of nanogram amounts of the lipid vasodilator prostaglandin E1 could be achieved by adding a known amount of a deuterium-labeled version of the analyte (an internal standard) and then combining gas chromatography–mass spectrometry (GC-MS; see Milestone 8) with SIM to measure the relative ion intensities of both the heavy and light versions. Building on this initial work, the use of isotopically labeled internal standards and SIM became routine for quantitative analysis in both GC-MS and liquid chromatography (LC)-MS.

Although SIM greatly reduced the complexity of GC-MS data, there were still many interfering compounds detected in the extracts of biological material; a further reduction in complexity was needed. In 1978, Richard Kondrat, Gary McClusky and R. Graham Cooks introduced selected reaction monitoring (SRM) by performing tandem mass spectrometry using a mass-analyzed kinetic energy spectrometer. They showed that, by monitoring a fragmentation reaction that was characteristic of a compound of interest, it was possible, for example, to detect cocaine in a sample of coca leaf without using any type of sample extraction or chromatography. The molecular ion for cocaine was selected using a magnetic sector and fragmented by introducing a collision gas through a second inlet. An electrostatic sector focused ions with the same kinetic energy; in this case, the kinetic energy of a cation characteristic of cocaine fragmentation was chosen.

Today, the instrument of choice for SRM experiments is the triple quadrupole mass analyzer (Milestone 6). Richard Yost and Christie Enke were the first to demonstrate the advantages of this instrument for SRM, showing that up to 50 compounds or drug classes could be directly detected in a drug screen of blood extracts, and even from untreated serum. The analytical speed of the triple quadrupole meant that SRM could also be performed in real time for compounds eluting from GC and, later, LC columns.

SRM, especially in combination with standard isotope dilution, has found numerous applications in environmental, forensic and clinical analysis of small organic molecules. In modern triple quadrupole instruments, more than 100 precursor-product ion pairs can be recorded for peptides analyzed in an LC-MS run. Such advances have had a profound impact on the detection and quantitation of proteins, enabling a targeted proteomics approach akin to the use of antibodies for protein detection.

Bronwen Dekker, Senior Editor, Nature Protocols

Figure: Targeted detection of methtryptoline in rat brain extracts, comparing GC/MS with SIM at m/z 362 (signal scale 3×10^6) and GC/MS/MS with SRM of m/z 362 → 179 (signal scale 3×10^3); signal is plotted against retention time (0–16 min). Adapted with permission from Johnson, J.V. & Yost, R.A., Tandem mass spectrometry for trace analysis, Anal. Chem. 57, 758A–768A (1985). © 1985, American Chemical Society.

ORIGINAL RESEARCH PAPERS Kondrat, R.W., McClusky, G.A. & Cooks, R.G. Multiple reaction monitoring in mass spectrometry/mass spectrometry for direct analysis of complex mixtures. Anal. Chem. 50, 2017–2021 (1978) | Yost, R.A. & Enke, C.G. Triple quadrupole mass spectrometry for direct mixture analysis and structural elucidation. Anal. Chem. 51, 231–243 (1979)
FURTHER READING Samuelsson, B., Hamberg, M. & Sweeley, C.C. Quantitative gas chromatography of prostaglandin E1 at the nanogram level: use of deuterated carrier and multiple-ion analyser. Anal. Biochem. 38, 301–304 (1970) | Brotherton, H.O. & Yost, R.A. Determination of drugs in blood serum by mass spectrometry/mass spectrometry. Anal. Chem. 55, 549–553 (1983) | Gillette, M.A. & Carr, S.A. Quantitative analysis of peptides and proteins in biomedicine by targeted mass spectrometry. Nat. Methods 10, 28–34 (2013)

MILESTONE 18

Enter the matrix

By the early 1980s, mass spectrometry was a well-established laboratory technique for the characterization of small organic molecules (Milestone 7). But larger ones (particularly biological molecules, such as proteins, DNA and complex carbohydrates) were proving to be a challenge. Because mass analysis relies on the detection of ionized species in the gas phase, the key problem was imparting sufficient energy to these large molecules to send them into the gas phase without destroying them. Although some progress had been made using fast atom bombardment (FAB) and 252Cf-plasma desorption (see Milestone 2), at the time the gold standard was the CO2 laser desorption of organic molecules with masses up to 1,000 daltons, so the world of proteins was largely inaccessible.

Michael Karas and Franz Hillenkamp, in Germany, were two of the earliest adopters of lasers for mass spectrometry, starting efforts in this area when laser technology was relatively immature. In 1985, while irradiating an equimolar mixture of alanine and tryptophan, they observed signals in the mass spectrum for both amino acids. Although this does not sound particularly surprising, Karas and Hillenkamp knew that one needed to use a much higher laser energy at that wavelength to form gaseous ions of alanine, so they only expected to observe signals for tryptophan. They postulated that the alanine was ‘riding piggyback’ on the tryptophan and called this phenomenon “matrix-assisted laser desorption.” Although the term matrix-assisted laser desorption/ionization mass spectrometry (MALDI-MS) would not be coined for a few years, the premise (the use of a matrix material to absorb laser energy and transfer it to an admixed analyte of interest, facilitating the vaporization and ionization of that analyte) was born.

Karas and Hillenkamp went on to use their matrix-assisted laser-desorption technique to compare the solids tryptophan and nicotinic acid (another small organic molecule) with liquid matrices that had been developed for FAB-MS. They obtained mass spectra for three ‘medium-sized’ molecules (the tetrasaccharide stachyose and the antibiotics erythromycin and gramicidin S) and achieved significantly better results (less fragmentation and a more intense signal from the parent ion) with the solid than with the liquid matrices. More promising results also came when analyzing vitamin B12 in a nicotinic acid matrix: no desorption was observed without a matrix, and there were solubility problems when using the liquid matrices. Evidence was mounting that a solid, UV-absorbing small-molecule matrix might be the tool needed to obtain mass spectra of large proteins.

On the other side of the world, Koichi Tanaka and colleagues from the Shimadzu Corporation in Japan were working on a similar problem, and in 1988, they published the mass spectra of several proteins and synthetic polymers obtained from a laser ionization mass spectrometer they had built. Signals for molecules up to ~25,000 daltons, and oligomeric molecules up to m/z 100,000, could be detected using their instrument. Tanaka and colleagues developed an unconventional sample preparation approach in which they dispersed an ultra-fine cobalt powder in glycerol. As the Shimadzu sample preparation method relied on inorganic nanoparticles, rather than small organic molecules, for photon absorption, it was not technically MALDI, but it was nonetheless a landmark development. Tanaka was awarded a portion of the 2002 Nobel Prize for his work on laser-induced soft desorption/ionization of large molecules that led to modern-day MALDI techniques.

Karas and Hillenkamp further developed the use of solid small molecules as matrices, and later in the same year they published the mass spectra of four proteins, obtained using nicotinic acid as a matrix. Several other matrices have been developed in the past 25 years, as have alternative mechanisms to facilitate the mild ionization and desorption of macromolecules. Today, MALDI and its laser-desorption cousins are workhorses in most chemistry and biochemistry labs.

Claire Hansell, Associate Editor, Nature

Figure: The mass spectrum obtained by Karas and Hillenkamp of a mixture of alanine (Ala) and tryptophan (Trp). Reprinted with permission from Karas, M., Bachmann, D. & Hillenkamp, F., Influence of the wavelength in high-irradiance ultraviolet laser desorption mass spectrometry of organic molecules, Anal. Chem. 57, 2935–2939 (1985). © 1985, American Chemical Society.

ORIGINAL RESEARCH PAPERS Karas, M., Bachmann, D. & Hillenkamp, F. Influence of the wavelength in high-irradiance ultraviolet laser desorption mass spectrometry of organic molecules. Anal. Chem. 57, 2935–2939 (1985) | Karas, M., Bachmann, D., Bahr, U. & Hillenkamp, F. Matrix-assisted ultraviolet laser desorption of non-volatile compounds. Int. J. Mass Spectrom. Ion Processes 78, 53–68 (1987) | Tanaka, K. et al. Protein and polymer analyses up to m/z 100,000 by laser ionization time-of-flight mass spectrometry. Rapid Commun. Mass Spectrom. 2, 151–153 (1988) | Karas, M. & Hillenkamp, F. Laser desorption ionization of proteins with molecular masses exceeding 10,000 daltons. Anal. Chem. 60, 2299–2301 (1988)
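The isotope-dilution quantitation described in Milestone 17 (a known amount of a deuterium-labeled internal standard, with SIM measuring the relative intensities of the heavy and light ions) can be illustrated with a short calculation. This is only a minimal sketch, not the procedure of any cited paper: the function name, the example intensities and the `response_factor` parameter are hypothetical, and it assumes that the labeled and unlabeled forms respond equally unless a factor is supplied.

```python
def isotope_dilution_amount(light_intensity: float,
                            heavy_intensity: float,
                            standard_amount_ng: float,
                            response_factor: float = 1.0) -> float:
    """Estimate the analyte amount (ng) from two monitored ion intensities.

    light_intensity    -- signal of the unlabeled analyte ion
    heavy_intensity    -- signal of the isotope-labeled internal standard ion
    standard_amount_ng -- known amount of internal standard added to the sample
    response_factor    -- hypothetical correction if the two forms respond
                          differently (1.0 assumes an identical response)
    """
    if heavy_intensity <= 0:
        raise ValueError("internal-standard signal must be positive")
    if response_factor <= 0:
        raise ValueError("response factor must be positive")
    ratio = light_intensity / heavy_intensity
    return ratio * standard_amount_ng / response_factor


if __name__ == "__main__":
    # Made-up intensities: the analyte gives half the signal of 10 ng of standard.
    print(isotope_dilution_amount(4.2e5, 8.4e5, standard_amount_ng=10.0))
```

Because the analyte and its labeled standard pass through extraction, chromatography and ionization together, losses largely cancel in the ratio, which is why the approach became routine for quantitation.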


MILESTONE 19

Dynamic protein structures

The native state of most proteins in a cell is a well-defined three-dimensional structure, which is maintained by intramolecular interactions. Detailed knowledge of protein structure allows researchers to determine the molecules’ functions and to design inhibitors for medical or research purposes. The traditional methods that are used to elucidate structure, such as X-ray crystallography and nuclear magnetic resonance (NMR) spectroscopy, require a lot of time and a large quantity of protein, and, in many cases, the experiments are not successful. In the early 1990s, researchers began to explore how mass spectrometry (a rapid technique that requires only small sample amounts) could be used to provide detailed, dynamic structural information about proteins.

In 1991, several groups began to apply electrospray ionization mass spectrometry (ESI-MS; Milestone 15) to analyze the intact, native structures of protein systems. This approach, now known as native mass spectrometry, allowed the groups to examine the conformational changes that occur within proteins during the unfolding process triggered by harsh conditions.

Joseph Loo and colleagues monitored the small globular regulatory protein ubiquitin as it changed from its native conformation under physiological pH and solvent conditions to an extended conformation when exposed to denaturing conditions. In a more complex system, Bruce Ganem, Yu Tsyr Li and Jack Henion used ESI-MS to visualize the interaction of drugs with the cytoplasmic receptor protein FKBP, demonstrating how mass spectrometry could be used to detect physiologically relevant noncovalent interactions. Yet another group studied noncovalent interactions in myoglobin, a globular protein for which structural information was already available. Viswanatham Katta and Brian Chait built on their work from 1990, which included the refinement of ESI-MS without using organic additives, to examine the impact of protein unfolding on the noncovalent heme-globin domain interaction. This paved the way in 1992 for the first work to report the observation of a functional noncovalent protein-protein interaction, done by Manuel Baca and Stephen Kent. These early papers marked the beginning of an explosion of interest in the use of ESI-MS to examine conformational changes of native proteins in the gas phase, as well as in characterizing noncovalent macromolecular assemblies, an application for which separation by ion mobility (see Milestone 8) has been particularly useful.

Katta and Chait had a busy 1991: they were also among the groups developing ESI-MS as a means for examining conformational changes in proteins, using a method called hydrogen-deuterium exchange (HDX). For this work, they used ubiquitin as a model and took advantage of the fact that labile hydrogen atoms in a protein can be replaced by deuterium only when they are exposed to deuterium-containing solvent. In a folded protein, the hydrogens on the surface can be exchanged with deuterium and, because the mass of deuterium is greater than that of hydrogen, this change can be detected by mass spectrometry. As the protein unfolds in harsh conditions, formerly protected hydrogens become exposed and are exchanged for the heavier isotope, which leads to a change in the mass of the resulting peptides. The hydrogen-to-deuterium exchange rate can thus be used to monitor the timeline of the conformational changes in the protein. In 1993, Zhongqi Zhang and David Smith used HDX to examine conformational changes in the mitochondrial protein cytochrome c upon heating, using high-performance liquid chromatography (see Milestone 8) combined with fast-atom bombardment mass spectrometry (see Milestone 2). As with native mass spectrometry, these first HDX papers paved the way for assessing protein dynamics in increasingly complex systems.

Nearly ten years after the first structural biology applications of mass spectrometry were published, another development greatly extended the information that could be obtained about protein structure using the technique. Malin Young and colleagues used chemical cross-linkers to covalently bind lysines together in basic fibroblast growth factor 2 (FGF-2) and analyzed the peptides resulting from the protein using time-of-flight mass spectrometry (Milestone 4). The cross-links provided information on the comparative positions of the modified lysines, and the researchers combined this data with the output from structure-prediction servers to build a model of the protein’s structure. The structure for FGF-2 obtained through this method was similar to that determined by NMR spectroscopy and X-ray crystallography, but it had taken a fraction of the time to achieve.

Taken together, these advances in mass spectrometry research demonstrated that this technique could provide extensive information on the structure and dynamic motion of proteins; this can be further extended in combination with other structural-biology and molecular-modeling methods. These methods will continue to advance as technology and structural information improve, thanks to the trailblazers who started the process.

Rebecca Kirk, Senior Editor, Nature Communications

Figure: Chemical cross-links detected by mass spectrometry provide structural information. Image reproduced with permission from Young et al., Proc. Natl. Acad. Sci. USA 97, 5802–5806 (2000). © (2000) National Academy of Sciences USA.

ORIGINAL RESEARCH PAPERS Ganem, B., Li, Y. & Henion, J.D. Detection of noncovalent receptor-ligand complexes by mass spectrometry. J. Am. Chem. Soc. 113, 6294–6296 (1991) | Katta, V. & Chait, B.T. Observation of the Heme-Globin complex in native myoglobin by electrospray-ionization mass spectrometry. J. Am. Chem. Soc. 113, 8534–8535 (1991) | Katta, V. & Chait, B.T. Conformational changes in proteins probed by hydrogen-exchange electrospray-ionization mass spectrometry. Rapid Commun. Mass Spectrom. 5, 214–217 (1991) | Loo, J.A. et al. Solvent-induced conformational changes of polypeptides probed by electrospray-ionization mass spectrometry. Rapid Commun. Mass Spectrom. 5, 101–105 (1991) | Zhang, Z. & Smith, D.L. Determination of amide hydrogen exchange by mass spectrometry: a new tool for protein structure elucidation. Prot. Sci. 2, 522–531 (1993) | Young, M.M. et al. High-throughput protein fold identification by using experimental constraints derived from intramolecular cross-links and mass spectrometry. Proc. Natl. Acad. Sci. USA 97, 5802–5806 (2000)
FURTHER READING Chowdhury, S.K., Katta, V. & Chait, B.T. Probing conformational changes in proteins by mass spectrometry. J. Am. Chem. Soc. 112, 9012–9013 (1990) | Baca, M. & Kent, S.B.H. Direct observation of a ternary complex between the dimeric enzyme HIV-1 protease and a substrate-based inhibitor. J. Am. Chem. Soc. 114, 3992–3993 (1992) | Robinson, C.V. et al. Probing the nature of noncovalent interactions by mass spectrometry. A study of protein−CoA ligand binding and assembly. J. Am. Chem. Soc. 118, 8646–8653 (1996) | Ruotolo, B.T. et al. Evidence for macromolecular protein rings in the absence of bulk water. Science 310, 1658–1661 (2005)
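The reasoning behind HDX in Milestone 19 (each hydrogen replaced by deuterium makes the molecule heavier, so a mass shift reports how many sites have exchanged) can be made concrete with a small calculation. This is a sketch under simplifying assumptions, not the analysis used in the cited papers: the function names and example masses are hypothetical, the solvent is treated as fully deuterated, and back-exchange is ignored.

```python
# Monoisotopic masses in daltons.
MASS_H = 1.00782503
MASS_D = 2.01410178
DELTA_HD = MASS_D - MASS_H  # about 1.00628 Da gained per exchanged hydrogen


def exchanged_hydrogens(mass_before: float, mass_after: float) -> float:
    """Number of H-to-D exchanges implied by an observed mass increase."""
    return (mass_after - mass_before) / DELTA_HD


def uptake_time_course(undeuterated_mass: float,
                       masses_by_time: dict[float, float]) -> dict[float, float]:
    """Map each labeling time (s) to the estimated number of exchanged sites."""
    return {t: exchanged_hydrogens(undeuterated_mass, m)
            for t, m in sorted(masses_by_time.items())}


if __name__ == "__main__":
    # Made-up masses for a small protein sampled at three labeling times.
    course = uptake_time_course(8565.0, {10: 8585.1, 100: 8605.3, 1000: 8620.4})
    for seconds, n_sites in course.items():
        print(f"{seconds:>5} s: ~{n_sites:.1f} hydrogens exchanged")
```

A rising count over time, or a jump under denaturing conditions, is what signals that previously protected hydrogens have become exposed to solvent.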


MILESTONE 20

Protein discovery goes global

Image: In 2002, William Henzel, John Stults and Colin Watanabe (pictured left to right) won the Distinguished Contribution in Mass Spectrometry Award for their work on peptide mass fingerprinting. Reproduced from Robinson, C. & Gross, M., Focus on proteomics in honor of the 2002 Distinguished Contribution in Mass Spectrometry Award to W. J. Henzel, J. T. Stults, and C. Watanabe, J. Am. Soc. Mass Spectrom. 14, 929–930 (2003), with kind permission from Springer Science and Business Media.

The term ‘proteome’ was created in the mid-1990s to describe the entire set of proteins expressed by a cell or organism at any one time. Although the importance of understanding how proteins function in biological systems was realized long before this, the high-throughput characterization of these complex molecules had only recently become possible.

In the early 1990s, Edman degradation was commonly used to resolve the sequences of gel-separated proteins. The process was slow, required large amounts of protein sample and sometimes failed to identify correct peptide sequences, particularly in the presence of protein modifications such as glycosylation or phosphorylation. Looking to speed up the protein identification process, a team led by William Henzel, John Stults and Colin Watanabe began working on a mass spectrometry–based technique to match peptide mass spectra to molecular-weight information inferred from known amino acid sequences, a process now known as peptide mass fingerprinting. Although the method was promising, its low sensitivity prevented widespread adoption.

It was the emergence of matrix-assisted laser-desorption/ionization (MALDI; Milestone 18) and electrospray ionization (ESI; Milestone 15) that propelled mass spectrometry forward as a sensitive and efficient method for protein identification. Henzel and colleagues were the first to exploit the increased sensitivity of MALDI and demonstrate a rapid peptide-mass-fingerprinting method for identifying proteins from two-dimensional gels. They developed a computational algorithm, Fragfit, that accurately matched peptide masses to the masses of known sequences generated using identical proteolytic digestion techniques. They applied Fragfit to ten proteins isolated from an Escherichia coli cell lysate, and, using only three peptide masses per protein, the algorithm uniquely identified each protein from 91,000 protein sequences. This study was one of the first to showcase the use of mass spectrometry for protein identification.

In the following year, ESI-based liquid chromatography (Milestone 8) coupled with tandem mass spectrometry (LC-MS/MS; Milestone 13) eclipsed the performance of MALDI-MS for protein identification. John Yates and his team developed the SEQUEST algorithm to match experimentally obtained LC-MS/MS spectra to theoretical spectra inferred from every amino acid sequence in the GenPept database. SEQUEST successfully assigned peptides derived from digested E. coli and yeast cells to the correct sequences in species-specific databases. This work formed the basis for future ‘shotgun’ proteomics studies by showing that global protein profiles could be obtained from digested complex protein mixtures.

Around the same time, Matthias Mann and Matthias Wilm found that fragmented peptides contained short, identifiable amino acid sequences that, together with the mass of the regions flanking the peptide, could be used to match proteins to known sequences. These so-called sequence tags were shown to have enormous discriminating potential, and, importantly, Wilm and Mann were able to quantify this potential by calculating the likelihood that sequence-tag matches were incorrect. The manual nature of this approach somewhat limited its popularity, however.

Notably, both of these studies emphasized the need for algorithms that could correlate experimental mass spectra to translated gene sequences. The importance of such algorithms became increasingly evident as advances in sequencing technologies spurred a rapid expansion in DNA sequence libraries that could be used for high-throughput protein identification.

As protein-identification tools continued to improve, the demand for accurate quantitation also became apparent. Consequently, stable isotopes were incorporated into samples prior to mass spectrometry analysis to determine the relative abundance of proteins in two samples. One approach involved chemically modifying two protein populations using isotope-coded affinity tags (ICAT); although successful, this was limited to proteins containing specific residues, was difficult to replicate and compared only two samples in a single analysis.

More sensitive quantitation came about through the use of metabolic labeling, a method introduced by Brian Chait and his team in 1999. They showed that by growing pools of cells on medium containing different metabolically labeled amino acids, all cellular proteins could be labeled as they were synthesized. A popular example of this approach, stable-isotope labeling by amino acids in cell culture (SILAC), was published three years later. The following year, a chemical labeling method using tandem mass tags was shown to improve the multiplexing capabilities of protein quantification, allowing eight samples to be analyzed simultaneously. This work gave rise to a range of isobaric tagging methods that remain popular today.

Another important component of proteomics research was the development of methods that estimate and control for erroneous assignments of mass spectra to known sequences. In particular, the target-decoy approach, introduced by Steven Gygi and colleagues in 2003, remains the gold standard for error assessment in large-scale protein identification studies.

In just over two decades, the field of proteomics has progressed from the analysis of single proteins to the ability to profile near-complete proteomes and detect post-translational modifications (Milestone 21). Mass spectrometry–based proteomics has proven to be an indispensable technique for understanding how complex organisms function.

Sarah Perry, Associate Editor, Nature Biotechnology

ORIGINAL RESEARCH PAPERS Henzel, W.J. et al. Identifying proteins from two-dimensional gels by molecular mass searching of peptide fragments in protein sequence databases. Proc. Natl. Acad. Sci. USA 90, 5011–5015 (1993) | Eng, J.K., McCormack, A.L. & Yates, J.R. An approach to correlate tandem mass spectral data of peptides with amino acid sequences in a protein database. J. Am. Soc. Mass Spectrom. 5, 976–989 (1994) | Mann, M. & Wilm, M. Error-tolerant identification of peptides in sequence databases by peptide sequence tags. Anal. Chem. 66, 4390–4399 (1994)
FURTHER READING Wilkins, M.R. et al. From proteins to proteomes: large scale protein identification by two-dimensional electrophoresis and amino acid analysis. Bio/technology 14, 61–65 (1996) | Gygi, S.P. et al. Quantitative analysis of complex protein mixtures using isotope-coded affinity tags. Nat. Biotechnol. 17, 994–999 (1999) | Oda, Y., Huang, K., Cross, F.R., Cowburn, D. & Chait, B.T. Accurate quantitation of protein expression and site-specific phosphorylation. Proc. Natl. Acad. Sci. USA 96, 6591–6596 (1999) | Ong, S.E. et al. Stable isotope labeling by amino acids in cell culture, SILAC, as a simple and accurate approach to expression proteomics. Mol. Cell Proteomics 1, 376–386 (2002) | Thompson, A. et al. Tandem mass tags: a novel quantification strategy for comparative analysis of complex protein mixtures by MS/MS. Anal. Chem. 75, 1895–1904 (2003) | Nesvizhskii, A., Keller, A., Kolker, E. & Aebersold, R. A statistical model for identifying proteins by tandem mass spectrometry. Anal. Chem. 75, 4646–4658 (2003) | Peng, J. et al. Evaluation of multidimensional chromatography coupled with tandem mass spectrometry (LC/LC-MS/MS) for large-scale protein analysis: the yeast proteome. J. Proteome Res. 2, 43–50 (2003)
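The idea behind peptide mass fingerprinting in Milestone 20 (digest each database sequence in silico with the same protease, then count how many measured peptide masses it explains) can be sketched in a few lines. This is an illustrative toy and not Fragfit itself: the function names, the two-protein database and the 0.5 Da tolerance are hypothetical, the digest assumes trypsin with no missed cleavages, and masses are neutral monoisotopic values.

```python
# Monoisotopic residue masses (Da) for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # added once per peptide for the free termini


def tryptic_digest(sequence: str) -> list[str]:
    """Cleave after K or R, except when the next residue is P."""
    peptides, start = [], 0
    for i, aa in enumerate(sequence):
        next_aa = sequence[i + 1] if i + 1 < len(sequence) else ""
        if aa in "KR" and next_aa != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


def peptide_mass(peptide: str) -> float:
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def rank_proteins(observed_masses: list[float],
                  database: dict[str, str],
                  tolerance_da: float = 0.5) -> list[tuple[str, int]]:
    """Rank database entries by how many observed masses they explain."""
    scores = []
    for name, sequence in database.items():
        theoretical = [peptide_mass(p) for p in tryptic_digest(sequence)]
        hits = sum(
            any(abs(obs - theo) <= tolerance_da for theo in theoretical)
            for obs in observed_masses
        )
        scores.append((name, hits))
    return sorted(scores, key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    toy_db = {"protA": "MKTAYIAKQR", "protB": "GASPVTCLNDKWWR"}  # made-up sequences
    measured = [peptide_mass(p) for p in tryptic_digest(toy_db["protA"])]
    print(rank_proteins(measured, toy_db))
```

Real search tools add refinements this sketch leaves out, such as missed cleavages, modifications, charge states and statistical scoring of how likely a match is to arise by chance.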


MILESTONE 21

In pursuit of PTMs

In the 1990s, advances in approaches to mass spectrometry were enabling the nascent proteomics field to grow rapidly (Milestone 20). But these improvements also revealed just how much more work would be required to truly understand how the proteome functions. For this, it was essential to study how proteins interact with one another (Milestone 22) and how they are modified, post-translationally, in the cell.

A single gene in the genome does not ‘result’ in a single protein entity; thousands of enzymes in the cell modify proteins after translation has occurred. These post-translational modifications (PTMs) range from changes as simple as phosphorylation (the addition of a phosphate group) or the formation of a disulfide bond between two cysteine residues, to those as complex as the conjugation of a chain of ubiquitin proteins to the amino acid lysine, or of a branching series of sugars to an asparagine. Such modifications allow for a great diversification of protein chemistry and, by extension, protein function.

In 1995, hot on the heels of their work describing a method for mass spectrometry–based proteomic analysis (Milestone 20), John Yates and colleagues began to explore whether a similar idea could be applied to interpret spectra from unidentified modified peptides. It was a relatively straightforward task to interpret the collision-induced dissociation (CID; Milestone 13) spectrum of an unmodified peptide: just match it to a theoretical CID spectrum generated from a protein-sequence database. Information about PTMs, however, was not captured in gene-based sequence databases. The Yates group thus adapted their database search algorithm to account for the increased mass of a peptide containing up to three PTMs. Although the researchers did not apply the method on a large scale in the 1995 report, the technique helped to set the stage for the use of sequence database searching as a means to interpret mass spectra produced by modified proteins.

The lack of information about PTMs in sequence databases was only part of the challenge. Another pressing issue was the low abundance of modified peptides in the proteome, which made them difficult to detect by mass spectrometry without using some form of purification. In 2002, Forest White, Donald Hunt and colleagues developed a chemical approach to enrich for phosphorylated peptides from cell lysate. This method involved digesting proteins into peptides, blocking reactive carboxylate groups by converting them to methyl esters and using immobilized metal-affinity chromatography to capture only those peptides containing phosphate groups. Working in yeast, the researchers identified 383 phosphorylation sites in 216 peptides, the first large-scale analysis of the phosphoproteome. The work served as early inspiration for the potpourri of PTM-enrichment approaches that is available today, ranging from chemical techniques to immunoaffinity methods that rely on PTM-specific antibodies, as was first shown in a landmark paper by Michael Comb and colleagues in 2003, describing an antibody to enrich for tyrosine phosphorylation.

In 2004, Steven Gygi’s lab quickly followed in White’s and Hunt’s footsteps, reporting 2,002 phosphorylation sites in 967 nuclear proteins in a human cell line. Gygi’s group achieved this feat by applying strong cation-exchange chromatography to isolate phosphorylated peptides with a charge state of 1+ from unmodified peptides with a charge state of 2+. They also designed a method to unambiguously identify the phosphorylation site using CID alone: they subjected suspected phosphopeptides to a second fragmentation step to garner additional clues that would enable them to identify the phosphorylated residue. Alternative fragmentation methods (Milestone 13), including electron-capture dissociation, electron-transfer dissociation and higher-energy C-trap dissociation, would later become important for analyzing protein phosphorylation, as well as other PTMs, on a proteomic scale.

Although some PTMs are irreversible, protein phosphorylation is a transient modification that has an important role in cell-signaling networks. In an early step toward understanding how signaling occurs over time on a proteomic scale, Matthias Mann’s group applied a time-course strategy to quantitatively profile the temporal dynamics of 6,600 phosphorylation sites on 2,244 proteins in human cells stimulated with a growth factor. This, along with other work, cemented mass spectrometry as an essential tool in the field of .

Today, the ability to identify tens of thousands of phosphorylation sites in a single mass spectrometry study (as, for example, the Gygi group showed in a 2010 investigation of the tissue-specific mouse phosphoproteome) is taken almost for granted. Yet it once seemed a near-insurmountable challenge.

Quantitative proteomics methods (Milestone 20) have been essential for determining PTM site occupancy and for comparing PTM profiles under different biological conditions. Intrepid scientists have devised a myriad of methods to enrich, detect and profile protein modifications. These include well-known PTMs such as acetylation and ubiquitination, as well as complex carbohydrate modifications (Milestone 12) that play important parts in cell-cell communication but that are extremely challenging to profile in terms of modification site and carbohydrate structure. There are many other modifications whose biological roles are just beginning to be discovered, thanks to the power of mass spectrometry for PTM analysis on a large scale.

Allison Doerr, Senior Editor, Nature Methods

ORIGINAL RESEARCH PAPERS Yates, J.R., Eng, J.K., McCormack, A.L. & Schieltz, D. Method to correlate tandem mass spectra of modified peptides to amino acid sequences in the protein database. Anal. Chem. 67, 1426–1436 (1995) | Ficarro, S.B. et al. Phosphoproteome analysis by mass spectrometry and its application to Saccharomyces cerevisiae. Nat. Biotechnol. 20, 301–305 (2002) | Beausoleil, S.A. et al. Large-scale characterization of HeLa cell nuclear phosphoproteins. Proc. Natl. Acad. Sci. USA 101, 12130–12135 (2004) | Olsen, J.V. et al. Global, in vivo, and site-specific phosphorylation dynamics in signaling networks. Cell 127, 635–648 (2006)

FURTHER READING Peng, J. et al. A proteomics approach to understanding protein ubiquitination. Nat. Biotechnol. 21, 921–926 (2003) | Zhang, L. H., Li, X.-J., Martin, D.B. & Aebersold, R. Identification and quantification of N-linked glycoproteins using hydrazide chemistry, stable isotope labeling and mass spectrometry. Nat. Biotechnol. 21, 660–666 (2003) | Rush, J. et al. Immunoaffinity profiling of tyrosine phosphorylation in cancer cells. Nat. Biotechnol. 23, 94–101 (2005) | Choudhary, C. et al. Lysine acetylation targets protein complexes and co-regulates major cellular functions. Science 325, 834–840 (2009) | Huttlin, E.L. et al. A tissue-specific atlas of mouse protein phosphorylation and expression. Cell 143, 1174–1189 (2010)

A multitude of post-translational modifications occur in the cell. Image reprinted from Jensen, O.N., Nat. Rev. Mol. Cell Biol. 7, 391–403 (2006). © 2006, Nature Publishing Group
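The idea of extending a database search to allow for the added mass of up to three PTMs per peptide can be illustrated with a minimal sketch. It is not the Yates group's algorithm: it considers only phosphorylation on serine, threonine and tyrosine, matches on precursor mass alone (a real search engine also scores the fragment spectrum), and the function names and the 0.02 Da tolerance are hypothetical choices made for illustration. The residue masses are standard monoisotopic values.

```python
from itertools import combinations

# Standard monoisotopic residue masses in daltons.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # H2O added for the peptide termini
PHOSPHO = 79.96633  # HPO3 mass shift of one phosphorylation


def peptide_mass(seq, n_phospho=0):
    """Neutral monoisotopic mass of a peptide carrying n_phospho phosphate groups."""
    return sum(RESIDUE_MASS[aa] for aa in seq) + WATER + n_phospho * PHOSPHO


def candidate_forms(seq, max_mods=3, sites="STY"):
    """Yield (mass, modified_positions) for 0..max_mods phosphorylations."""
    positions = [i for i, aa in enumerate(seq) if aa in sites]
    for k in range(min(max_mods, len(positions)) + 1):
        for combo in combinations(positions, k):
            yield peptide_mass(seq, k), combo


def match_precursor(observed_mass, peptides, tol=0.02):
    """Return (sequence, modified_positions) pairs whose mass is within tol Da."""
    hits = []
    for seq in peptides:
        for mass, combo in candidate_forms(seq):
            if abs(mass - observed_mass) <= tol:
                hits.append((seq, combo))
    return hits
```

Capping the number of modifications per peptide matters because the number of candidate forms grows combinatorially with the number of modifiable residues; a peptide with four such residues already yields 15 forms when up to three modifications are allowed.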


MILESTONE 22

Putting the pieces together

Most proteins carry out their function within the cell not as individuals but as components of macromolecular complexes that vary widely in size and complexity. By the mid-1990s, it had become clear that to truly grasp how cells and tissues function would require methods for characterizing how proteins interact with one another. Early efforts by the proteomics community were vital in generating the large datasets of protein-protein interaction (PPI) networks now widely used to connect specific gene products to cellular functions and gene mutations to diseases.

Two earlier developments in mass spectrometry technology formed the basis for the difficult task of systematically identifying individual components of protein complexes: the capacity to establish a peptide’s sequence from its mass spectrum and the ability to identify individual peptides within a mixture (Milestone 20). Although by the late 1990s proteomics technology had become relatively efficient, better means to systematically isolate native cellular complexes with the required degree of purity and to connect isolated masses to sometimes yet-to-be-characterized gene products were needed.

In 1999, John Yates and colleagues demonstrated that large complexes such as the ribosome could be analyzed directly by multidimensional chromatography (see Milestone 8) followed by tandem mass spectrometry (see Milestone 13), making it possible to identify more than 100 individual components in a single experiment. Still, it was not then clear whether multiprotein assemblies were limited to certain core housekeeping molecular machines, typically around nucleic acids, or were more generally ubiquitous. Although facilitating the identification of components of large complexes, Yates’s approach required the ability to purify the target to near homogeneity, and thus was limited to analyzing known complexes.

Around the same time, Bertrand Séraphin and colleagues were developing the tandem affinity purification (TAP) tagging method. This approach, which relied on two N-terminal affinity tags placed in tandem on the same ‘bait’ protein, allowed the efficient isolation of proteins directly or indirectly bound as part of the same complex. The use of two affinity tags enabled a two-step purification procedure that substantially limited background contamination while allowing the mild purification conditions needed to maintain the integrity of the protein complex. Séraphin and colleagues demonstrated the method’s power by TAP-tagging the yeast protein Snu71p, a component of the U1 small nuclear ribonucleoprotein (snRNP), which they then used to isolate a functional complex containing all the known components of the full U1 snRNP from yeast cells.

To what extent individual components of the proteome act ‘socially’ as part of macromolecular complexes became the next obvious question. With the ability to isolate native complexes and efficiently identify their components at hand, it was not long before efforts were under way to tackle the protein interactome full on, with the ultimate goal of identifying every protein complex present in a cell. In early 2002, using mass spectrometry, Anne-Claude Gavin, Giulio Superti-Furga and colleagues and Daniel Figeys and colleagues both reported identifying hundreds of protein complexes, many of whose components had no previously defined function, comprising over 25% of the budding yeast proteome. Those analyses provided the first glimpse of the extent to which individual components of the proteome are connected through functional networks. The knowledge of those connections could also be used to infer the function of uncharacterized proteins and would eventually pave the way for today’s more integrated view of the cellular machinery.

Identifying the components of complexes and the connections between complexes was only the beginning. Those advances were soon followed by demonstrations from groups led by Ruedi Aebersold and Matthias Mann that changes in the abundance of complexes and their subcomponents could also be quantitatively analyzed by mass spectrometry. These advances cemented the idea that proteomics could be used to comprehensively analyze not only how the proteome is organized, but how the cellular machinery functions and responds to perturbations.

Although it has already been over 15 years since these early forays into protein complex identification through proteomics, much remains to be discovered. Mass spectrometry instruments, as well as the methods geared toward efficient isolation and characterization of protein complexes, are still evolving rapidly. Thus it is likely that the field of quantitative interactomics will continue to shape our understanding of normal physiology and disease for some time to come.

Stéphane Larochelle, Associate Editor, Nature Communications

ORIGINAL RESEARCH PAPERS Rigaut, G. et al. A generic protein purification method for protein complex characterization and proteome exploration. Nat. Biotechnol. 17, 1030–1032 (1999) | Gavin, A.C. et al. Functional organization of the yeast proteome by systematic analysis of protein complexes. Nature 415, 141–147 (2002) | Ho, Y. et al. Systematic identification of protein complexes in Saccharomyces cerevisiae by mass spectrometry. Nature 415, 180–183 (2002)

FURTHER READING Neubauer, G. et al. Identification of the proteins of the yeast U1 small nuclear ribonucleoprotein complex by mass spectrometry. Proc. Natl. Acad. Sci. USA 94, 385–390 (1997) | McCormack, A.L. et al. Direct analysis and identification of proteins in mixtures by LC/MS/MS and database searching at the low-femtomole level. Anal. Chem. 69, 767–776 (1997) | Link, A.J. et al. Direct analysis of protein complexes using mass spectrometry. Nat. Biotechnol. 17, 676–682 (1999) | Ranish, J.A. et al. The study of macromolecular complexes by quantitative proteomics. Nat. Genet. 33, 349–355 (2003) | Blagoev, B. et al. A proteomics strategy to elucidate functional protein-protein interactions applied to EGF signaling. Nat. Biotechnol. 21, 315–318 (2003)

Mass spectrometry enables protein interactome analysis. Adapted from Gavin et al., Nature 415, 141–147 (2002). © 2002, Nature Publishing Group

COLLECTION COMMENTARY

• This Commentary was first published in Nature Methods 7, 681–685 (2010) doi:10.1038/nmeth0910-681

Mass spectrometry in high-throughput proteomics: ready for the big time

Tommy Nilsson1,2, Matthias Mann3, Ruedi Aebersold4,5, John R Yates III6, Amos Bairoch7,8 & John J M Bergeron1,2

1The Research Institute of the McGill University Health Centre, McGill University, Montreal, Quebec, Canada. 2Department of Medicine, McGill University, Montreal, Quebec, Canada. 3Department of Proteomics and Signal Transduction, Max Planck Institute for Biochemistry, Martinsried, Germany. 4Institute of Molecular Systems Biology, Swiss Federal Institute of Technology (ETH) Zurich, Zurich, Switzerland. 5Faculty of Science, University of Zurich, Zurich, Switzerland. 6Department of Chemical Physiology, The Scripps Research Institute, La Jolla, California, USA. 7Swiss Institute of Bioinformatics, Geneva, Switzerland. 8University of Geneva, Geneva, Switzerland. Correspondence should be addressed to T.N. ([email protected]) or J.J.M.B. ([email protected]).

Mass spectrometry has evolved and matured to a level where it is able to assess the complexity of the human proteome. We discuss some of the expected challenges ahead and promising strategies for success.

The age of whole-genome sequencing has made the research field of proteomics possible. Proteomics offers highly complementary information to genomics; as most biological functions are transmitted through proteins, proteomics has yielded new biology and insight into disease. Mass spectrometry (MS) technology allows proteins to be analyzed rapidly, accurately and with high sensitivity at a relatively low cost and, when properly applied, with high reproducibility. With current MS technology, several thousand proteins can be identified and quantified in a single study. As it is estimated that the human genome consists of no more than 20,300 protein-coding genes1, it seems that quantification of the whole human proteome is within reach.

In contrast to a genome, however, a proteome is seemingly boundless as each protein may be present in different forms, in different amounts and at different times. Because of this great complexity, many believe that quantitative profiling of all proteins expressed in a cell at a particular time may be an unachievable goal. Proteomics has also been viewed with skepticism as it has largely failed to deliver new biomarkers for diseases, an application for which it was highly touted2. This was in part the result of early high-profile studies3 that could not be confirmed and, consequently, have negatively affected the credibility of proteomics. A recent comparative study also demonstrated that 20 out of 27 proteomics laboratories (with varying levels of expertise) had difficulties in identifying proteins in even a simple mixture4,5.

However, when proteomics technologies are applied carefully and correctly, the technology is highly reproducible. An understanding of these techniques and their limitations is necessary for success and has facilitated the development of improved strategies to overcome problems in proteomics and advance the field6.

Pipeline quality control

The main problem areas of proteomics can be separated into sample preparation, sample handling, data analysis and data evaluation. Whereas biologists and clinicians often control how samples are generated and stored, mass spectrometrists are typically in charge of all other steps, including evaluation of data obtained. This division of labor can result in serious data quality issues. In our opinion, a lack of accountability and management of the data generated will cause the proteomics field to continue to suffer. Biologists and clinicians should not be left in the dark as to the validity of data obtained. It is neither acceptable nor practical to say that ‘someone else should validate what I just produced’. Mass spectrometrists best know the limitations of MS and thus are in the best position to validate the data. If the clinician or cell biologist attempts to determine the validity of the data without knowing the limitations and pitfalls of the technology, then it is not possible to distinguish whether the experiment was successful or not. Through routinely performing checks and balances, similar to those used when purifying a protein by means of classical biochemistry, the mass spectrometrist must ensure high fidelity of the platform. Protein standards to test that trypsin digestion, peptide elution, chromatography and MS are in working order are essential. The assignment of spectra to peptides and proteins should also be carefully checked either manually in small data sets such as those generated from single gel bands or spots, or by using accurate statistical tools in the case of large data sets, before handing over any data to the clinician or the cell biologist, and this handoff must include clear instructions as to the interpretation of the data.

Mass spectrometric technology

There are also inherent challenges in mass spectrometry with regards to sensitivity (dynamic range), reproducibility and comprehensiveness, the three of which are interrelated6. However, these limitations are understood and can be overcome, and many tools are available to assess the quality of the data.

Sensitivity (dynamic range). Most commercial mass spectrometers have limits of detection in the low femtomole or attomole range, sensitive enough to detect almost any protein. However, the true sensitivity of MS is modulated by the nature of the sample7. Biological samples have a wide range of protein abundances, and mass spectrometers are not well equipped to deal with this wide


dynamic range. For example, about 35% of all predicted human proteins have yet to be observed reliably by MS. This is partly because peptides do not ionize with equal efficiency, potentially putting some proteins (and their associated post-translational modifications (PTMs)) at a disadvantage in terms of detection. This issue is further compounded for low-molecular-weight proteins and those expressed at low abundance.

However, these technical challenges can be overcome. Sample fractionation and preclearing are both routinely used to improve the dynamic range. Low-abundance proteins, such as the tau protein in spinal fluid8 or cytokines9, can be detected in a proteome using targeted proteomics approaches.

Reproducibility. MS is also prone to undersampling. Because the process of acquiring tandem mass spectra is driven by computer-controlled algorithms over the course of a finite scan time determined by ion statistics, undersampling can and does occur when peptide mixtures entering the mass spectrometer are complex. Undersampling means that repeated analyses may not yield exactly the same protein identifications, as a slightly different set of peptides may be sampled in the repeat analysis. However, undersampling can be remedied and reproducibility improved by repeating experiments. Additionally, reducing the complexity of samples with extensive prefractionation can reduce the impact of undersampling and improve reproducibility between analyses. Even so, data-dependent sampling of peptides with current mass spectrometers has a well documented degree of variability10.

The expression of proteins can be strongly influenced by growth conditions, genetic variability and sample handling; these factors further contribute to variability. It is therefore important to minimize technical as well as biological variability among samples. But technical variability is not easily controlled, as shown in a recent comparative analysis where eight laboratories attempted quantification of a carefully designed test sample containing seven different proteins in plasma using three replicates and standard operating procedures, through a targeted approach called selected reaction monitoring (SRM)7. The study showed that the SRM method was reproducible in the sense that the same peptides were detected consistently by the different labs and with good quantitative accuracy; the 25% quantitative variation among labs was better than expected. Even so, this puts a limit to reproducibility and begs the question of how multiple measurements can be handled and how samples can be compared to reveal, for example, differences in expression.

One possible solution to the issue of variability in comparing samples is to use peptides that have a high probability of being observed as internal quality control standards. In a recent comparative study4 where 27 participating laboratories were asked to identify 20 equimolar human proteins and 22 peptides of 1,250 Da, the participants were also required to submit their raw data into a central repository, permitting all tandem mass spectra to be pooled and analyzed centrally. With this central analysis, the peptides seen by most participants became evident; these peptides were observed by MS irrespective of operating procedures or instrumentation and can empirically be classified as ‘high-probability’ peptides (that is, if the protein is there, there is a high probability that this peptide will be observed). By comparing individual participants’ data with the high-probability peptides of the pooled data, it was relatively easy to reveal discrepancies and to identify the errors made by a particular participant. Differences in ionization, in precursor ion selection algorithms and in upstream fractionation techniques do not yield great variability in terms of identification, at least not for high-abundance proteins. For low-abundance proteins, differences among mass spectrometers may be more decisive.

Another question the field must resolve is not just the reproducibility of MS results, but also potential discrepancies with other technologies. For the SRM reproducibility study mentioned above7, the absolute concentration values measured for seven proteins in plasma samples differed not just among the laboratories using SRM but also from those measured by enzyme-linked immunosorbent assay (ELISA). For example, for C-reactive protein, a variation of 0.31 to 1.8 fmol μl⁻¹ was observed by SRM at an amount judged to be 4 fmol μl⁻¹ by ELISA. Whether this indicates limitations with commercial ELISA assays or with the MS SRM-based approach is not clear, especially as C-reactive protein is comparatively abundant in plasma.

Comprehensiveness. Although the number of protein-coding genes can be predicted in a genome, when, where or indeed whether such genes are transcribed and translated into proteins is not obvious. As noted above, 35% of the 20,300 predicted human proteins have little or no evidence for their expression. Given that there are some 230 different cell types in the human body, each presumably expressing only a subset of the 20,300 predicted proteins, defining when a proteome of a cell type, body fluid or tissue is complete is a very difficult proposition. Analyzing the transcripts using, for example, a genome-wide array does not provide the answer, as the correlation factor between mRNA abundance and protein abundance on a genome-wide scale is very low11.

However, approaches have been developed to determine when the proteome is ‘complete’. By combining fractionation and replicate analyses to collect as much data as possible, it is possible to develop saturation curves: that is to say, as each new experiment is run, the number of new proteins or peptides identified decreases until eventually no new proteins are identified. This approach was used to define large portions of the secreted proteomes of human saliva from the parotid gland and the submandibular plus sublingual glands12, the proteome of a specific Drosophila cell type13, the proteome of rat lung luminal endothelial cell plasma membranes14 and, in a landmark study, the complete proteome of the budding yeast Saccharomyces cerevisiae15. This method is contingent on setting a low false positive rate for peptide identification, as the number of ‘new’ proteins will continue to grow unless this rate is kept low16,17. Low false discovery rates can be aided by using highly accurate mass spectrometers, but without tight control of the false positive rate, the number of false positive proteins could continue to climb until every protein encoded by the genome has been identified18.

To obtain complete proteomes, the field must address the issue of protein extractability. The use of gel-based isoelectric focusing for protein separation invariably leads to a loss of transmembrane proteins in particular, but as much as 70% of all proteins could be lost through this procedure. Solution-based isoelectric focusing has overcome most of this limitation, but the perception is still that transmembrane proteins may be under-represented. However, a very high degree of comprehensiveness can be obtained for transmembrane proteins. A study that put this to the test was the characterization of the proteome of the synaptic vesicle, where 80% of all membrane proteins in the synaptic vesicles


were characterized, revealing a density of 130,000 transmembrane domains per μm² (ref. 19). It is theoretically impossible to fit many more transmembrane proteins into the lipid bilayer of a synaptic vesicle as there would not be space to surround the membrane-spanning domains with phospholipids.

Quantification

Introducing quantification can avoid many of the pitfalls and caveats described above. By determining the absolute or relative amounts of proteins in the sample, it is possible to place a number on contamination (for example, proteins from blood in a tissue or proteins from other organelles in a purified organellar fraction) and to more easily gauge undersampling and comprehensiveness, as well as sensitivity. Quantification also allows samples to be compared with each other. The dimension of quantity, however, puts a greater demand on the MS-based proteomics pipeline, and several methods have therefore been developed to circumvent this. The advantages and disadvantages of different quantitative approaches have been summarized recently6.

One label-free approach focuses on ion currents carried by several peptides whose sequences match a specific protein to provide an approximation of abundance20. Another label-free method scores the frequency of tandem mass spectra assigned to proteins21. In both cases, though, increased accuracy becomes in part a function of time spent on sample analysis on the mass spectrometer, which may extend into the hundreds of hours for even a simple project22.

Applications to cultured cells and mice of SILAC (stable isotope labeling with amino acids in cell culture) have also proven highly successful23, and because heavy isotope–labeled rodents are now available, an application to whole-animal experiments is feasible24. The merit of more targeted approaches is the focus on highly sensitive and reproducible detection of representative peptides for each protein, such as in SRM. In most cases there are peptide sequences that are unique to each protein of the protein-coding genome. Those unique peptides that also ionize and ‘fly’ well in the tandem mass spectrometer are referred to as proteotypic peptides25. Evaluating the abundance of proteotypic peptides in digests of complex proteins is now feasible by SRM or label-free methods.

In our opinion, by using combinations of quantitative approaches with SRM, it should be feasible to quantitatively map complete proteomes of higher eukaryotes, including humans. This, however, will require highly curated and annotated databases of proteotypic peptides15,26,27. The availability of synthetic proteotypic peptide reagents to the community would enormously facilitate the characterization of the entire detectable protein complement of any biological sample28, although SRM alone is not the answer to obtaining a complete proteome.

Antibody reagents to target proteins of low abundance would also be a very valuable resource for the community. Most studies identify abundant proteins, whereas proteins of low abundance are rarely seen; 50% of all tandem mass spectra generated so far and deposited in public databases only account for 82 human proteins (L. Martens, personal communication). As mentioned above, this can be partially overcome through biochemical prefractionation or subcellular fractionation. However, prefractionation is not always sufficient, especially for fluids such as blood where a sixfold difference in dynamic range exists. For low-abundance proteins, an antibody capture method to capture the entire protein and its isoforms has proven both quantitative and reproducible, as exemplified by the low-abundance tau protein found in cerebrospinal fluid29. The SISCAPA (stable isotope standards and capture by anti-peptide antibodies) method, whereby anti-peptide antibodies are used to capture trypsinized peptides in samples, has been coupled to SRM for absolute quantification30.

Databases

Protein databases are a necessary part of MS-based proteomics pipelines to identify the mass spectra generated by shotgun approaches. However, data matching and protein naming have been identified as one of the main bottlenecks in MS-based proteomics4. The scientific issues with databases have been compounded with political and socio-economical considerations, and the net result is a state of affairs that is far from acceptable. This has become increasingly evident as we put increasing demands on bioinformaticians, including the curators of the databases.

It is important to point out the distinction between ‘simple’ organisms such as bacteria, archaea and lower eukaryotes as opposed to complex eukaryotes such as Homo sapiens, representing the pinnacle of complexity and significance. Most model organisms are endowed with what is termed a model organism database. Model organism databases try to organize and disseminate genetic, molecular and, increasingly, functional information relevant to their species of interest. Paradoxically, because of the intrinsic importance of our species, no single institution has been endowed with the responsibility of centralizing molecular information concerning Homo sapiens. The resulting picture for the user is then one of utter confusion because, at present, multiple resources cater to overlapping needs. This could still be satisfactory if it were possible to map ‘objects’ from one of these resources to the other. Unfortunately, automatic mapping between any two databases rarely results in a success rate above 95%, which is unacceptable. The situation is not helped by the tendency of proteomics users to ‘mix and match’ the results from different resources such that they obtain the biggest possible set of identifications. The net effect of this is that protein lists full of a hodgepodge of multiple identifiers are generated that often or at least in part are obsolete or irrelevant. One simple way to avoid this should be to only cite protein names. However, this introduces yet another problem because many human proteins do not yet have stable (read single), representative and well defined names.

So is there any hope? Yes, definitely! The US National Center for Biotechnology Information (NCBI), European Bioinformatics Institute (EBI) and Swiss Institute of Bioinformatics (SIB), as well as the HGNC (Human Gene Nomenclature Committee), are working together to come up with a clean set of genes, gene symbols, gene sequences and corresponding protein names and sequences. Already, the Swiss-Prot section of the UniProt knowledge base (UniProtKB/Swiss-Prot) provides a fully annotated and trackable set of 20,300 master protein entries, most of which are linked to a single protein-coding gene. This master set is complemented by more than 14,500 additional splice isoforms, 62,000 annotated sequence variants and more than 80,000 PTMs.

At the level of search engines, the software used to match peptide fragmentation spectra to peptides in the database, there are also problems that must be addressed. Almost no search engines make use of the annotation available in databases such as UniProtKB/Swiss-Prot. Some do take into account splice

NATURE METHODS | VOL.7 NO.9 | SEPTEMBER 2010 | 683 NATURE MILESTONES | MASS SPECTROMETRY OCTOBER 2015 | 25


variants, but most search engines have not made any effort to import any information other than the protein sequence. Therefore, they miss out on all annotated and relevant known PTMs and, even worse, fail to take polymorphisms into account. This oversight leads to incorrect or absent identifications.

The HUPO Proteomics Standard Initiative has proposed a common database format, PEFF (PSI Extended FASTA Format) (http://psidev.info/index.php?q=node/363), to store information relevant to the proteomic community. If implemented by commercial and academic search engine developers, this format would address most of the current oversights of search engines. The PEFF format files that will be used by the UniProt consortium will not contain all information stored in UniProtKB/Swiss-Prot, but will provide what is necessary to facilitate the process of identification (for example, known PTMs, polymorphisms and taxonomic information).

Understanding human biology through mass spectrometry
We are convinced that MS-based proteomics, when used with appropriate restraint and properly controlled, is applicable to the study of the human proteome, both on a hypothesis-driven level and on the level of a large-scale effort to comprehensively map the human protein landscape. Proteomics is ready to define the roles of proteins in normal cellular physiology versus disease by accelerating traditional cell biology experiments and enabling new approaches to discovery. Aberrant functions may be dictated by mutations, polymorphisms, isoforms or modification states of proteins; MS-based proteomics is well positioned to measure the types of change that may influence functions and physiology. Proteomics can also define the higher-order organization of cells; that is, where and when proteins localize in the cell as a function of time or disease.

The gathering of MS data with the aim to comprehensively map the human proteome must be done in a way that it does not detract from ongoing activities motivated and driven by scientific curiosity, hypothesis testing and clinically relevant questions. The huge combinatorial space of the human proteome encompasses all body fluids and 230 different cell types. Each cell type and body fluid possesses a unique proteome that depends on which particular genes are transcribed, pre- and post-translational regulation, post-translational processing, protein sorting, intracellular localization, endocytosis, exocytosis, and protein and membrane degradation, all in combination with isoforms, splice variants, amino acid–relevant single-nucleotide polymorphisms and more than 200 different PTMs. Hence, the construction of a human proteome resource must be a long-term process.

In our opinion, it is possible to proceed in much the same manner with which we today conduct basic or clinical research. The main difference would be to pay explicit attention to using published data in a more efficient manner. Here, a gene-centric human proteome may serve as a useful scaffold, as in the building of a skyscraper. Its foundation, the human genome, has already been laid31–33, as has most of the building itself, through neXtProt, a comprehensive knowledge resource on human proteins, hosting some 20,300 rooms (protein-encoding genes) all ready to be furnished or already partially furnished. The gene/protein–centric nature of neXtProt will ensure that each room is populated such that all associated variants and PTMs, spatial and temporal coordinates are collated, along with their functions and relevance to disease.

Another challenge is then to ensure that the data accumulated are made available in an intuitive and useful manner. This would necessitate the creation of a repository for raw MS data but would ensure that data from MS-based studies have long-lasting value. The ProteomeXchange consortium members, which include Tranche34, PeptideAtlas35, the NCBI peptidome36 and PRIDE37, already go a long way as potential repositories for a human proteome. Providing these are sufficiently annotated, such a continuously growing data set could then be 'diced' through centralized analysis such that all that is known for a particular protein, process, structure, cell type, organ or condition, and disease at that point in time may be revealed instantly through the push of a button. This, however, does not preclude targeted large-scale efforts38–40, and indeed, neXtProt already connects to genome-wide projects such as the Human Protein Atlas project41, which aims to map the expression and localization of proteins in human tissue using immunohistochemistry and immunofluorescence. Also, data from complementary efforts, such as genome-wide RNA interference screens42 and GFP fusion protein localizations43, can be incorporated into neXtProt. Chemical genomics information can be included; phenotypic perturbations can be used to explore the molecular basis for alterations in phenotype using MS-based proteomic methods. Further elements of the cell's higher organization need also to be included; proteins associate with other proteins to form operational enzyme or signaling complexes as well as super-structures such as chromatin, nuclear pores or kinetochores. These are not static structures, and thus how protein interactions are regulated as a function of physiology, both normal and abnormal, must be defined to better understand the underlying cell biology.

Many of the questions of cell biology can be addressed through traditional approaches, but perhaps can be answered more quickly and definitively with MS-based methods44. Other types of questions have been enabled by the creation of large-scale proteomic methods for protein expression or modification profiling. These methods are now poised to address how expression or modifications change as a function of disease. Sequencing the human genome was perhaps the easy part, and now making sense of the constantly moving and changing picture of the proteome will require a lot of time, effort and creativity.

COMPETING FINANCIAL INTERESTS
The authors declare no competing financial interests.

1. Clamp, M. et al. Proc. Natl. Acad. Sci. USA 104, 19428–19433 (2007).
2. Anonymous. Nature 452, 913–914 (2008).
3. Petricoin, E.F. et al. Lancet 359, 572–577 (2002).
4. Bell, A.W. et al. Nat. Methods 6, 423–430 (2009).
5. Mann, M. Nat. Methods 6, 717–719 (2009).
6. Domon, B. & Aebersold, R. Nat. Biotechnol. 28, 710–721 (2010).
7. Addona, T.A. et al. Nat. Biotechnol. 27, 633–641 (2009).
8. Westman-Brinkmalm, A. et al. Front. Biosci. 14, 1793–1806 (2009).
9. Nettikadan, S. et al. Mol. Cell. Proteomics 5, 895–901 (2006).
10. Tabb, D.L. et al. J. Proteome Res. 9, 761–776 (2010).
11. Ingolia, N.T., Ghaemmaghami, S., Newman, J.R.S. & Weissman, J.S. Science 324, 218–223 (2009).
12. Denny, P. et al. J. Proteome Res. 7, 1994–2006 (2008).
13. Brunner, E. et al. Nat. Biotechnol. 25, 576–583 (2007).
14. Durr, E. et al. Nat. Biotechnol. 22, 985–992 (2004).
15. de Godoy, L.M. et al. Nature 455, 1251–1254 (2008).
16. Tabb, D.L. J. Proteome Res. 7, 45–46 (2008).
17. Brusniak, M.Y. et al. BMC Bioinformatics 9, 542 (2008).
18. Reiter, L. et al. Mol. Cell. Proteomics 8, 2405–2417 (2009).



19. Takamori, S. et al. Cell 127, 831–846 (2006).
20. Silva, J.C., Gorenstein, M.V., Li, G.-Z., Vissers, J.P.C. & Geromanos, S.J. Mol. Cell. Proteomics 5, 144–156 (2006).
21. Blondeau, F. et al. Proc. Natl. Acad. Sci. USA 101, 3833–3838 (2004).
22. Gilchrist, A. et al. Cell 127, 1265–1281 (2006).
23. Krüger, M. et al. Cell 134, 353–364 (2008).
24. McClatchy, D.B., Dong, M.-Q., Wu, C.C., Venable, J.D. & Yates, J.R. III J. Proteome Res. 6, 2005–2010 (2007).
25. Kuster, B., Schirle, M., Mallick, P. & Aebersold, R. Nat. Rev. Mol. Cell Biol. 6, 577–583 (2005).
26. Picotti, P. et al. Nat. Methods 5, 913–914 (2008).
27. Malmstrom, J. et al. Nature 460, 762–765 (2009).
28. Picotti, P. et al. Nat. Methods 7, 43–46 (2010).
29. Portelius, E. et al. J. Proteome Res. 7, 2114–2120 (2008).
30. Whiteaker, J.R., Zhao, L., Anderson, L. & Paulovich, A.P. Mol. Cell. Proteomics 9, 184–196 (2010).
31. Collins, F.S., Green, E.D., Guttmacher, A.E. & Guyer, M.S. Nature 422, 835–847 (2003).
32. International Human Genome Sequencing Consortium. Nature 431, 931–945 (2004).
33. Gregory, S.G. et al. Nature 441, 315–321 (2006).
34. Hill, J.A., Smith, B.E., Papoulias, P.G. & Andrews, P.C. J. Proteome Res. 9, 2809–2811 (2010).
35. Desiere, F. et al. Nucleic Acids Res. 34, D655–D658 (2006).
36. Ji, L. et al. Nucleic Acids Res. 38, D731–D735 (2010).
37. Jones, P. et al. Nucleic Acids Res. 34, D659–D663 (2006).
38. Faca, V.M. et al. PLoS Med. 5, e123 (2008).
39. Sun, A. et al. J. Proteome Res. 9, 50–58 (2010).
40. Yoshida, Y., Miyamoto, M., Bo, X., Yaoita, E. & Yamamoto, T. in Proteomics in Nephrology – Towards Clinical Applications (ed. Thongboonkerd, V.) 186–197 (Karger, 2008).
41. Pontén, F. et al. Mol. Syst. Biol. 5, 337 (2009).
42. Neumann, B. et al. Nat. Methods 3, 385–390 (2006).
43. Starkuviene, V. et al. Methods Mol. Biol. 457, 193–201 (2008).
44. Bergeron, J.J., Au, C.E., Desjardins, McPherson, P.S. & Nilsson, T. Trends Cell Biol. 20, 337–345 (2010).


NATURE | Vol 450 | 13 December 2007 | doi:10.1038/nature06525 | INSIGHT REVIEW

• This Review was first published in Nature 450, 991–1000 (2007) doi:10.1038/nature06525

The biological impact of mass-spectrometry-based proteomics

Benjamin F. Cravatt1,2, Gabriel M. Simon1,2 & John R. Yates III1

In the past decade, there have been remarkable advances in proteomic technologies. Mass spectrometry has emerged as the preferred method for in-depth characterization of the protein components of biological systems. Using mass spectrometry, key insights into the composition, regulation and function of molecular complexes and pathways have been gained. From these studies, it is clear that mass-spectrometry-based proteomics is now a powerful ‘hypothesis-generating engine’ that, when combined with complementary molecular, cellular and pharmacological techniques, provides a framework for translating large data sets into an understanding of complex biological processes.

Of the many fascinating discoveries made by genome-sequencing projects, perhaps none is more provocative than the prediction that all prokaryotes and eukaryotes produce numerous proteins with uncharacterized or pleiotropic functions1,2. Confronted with the challenge of annotating this enormous segment of the proteome, scientists have sought to expedite the characterization of proteins by developing new methods for rapid and parallel analysis. These large-scale approaches to protein science are collectively termed proteomics3,4.

Proteomics encompasses diverse techniques that allow different aspects of protein structure and function to be analysed. Many proteomic methods (including protein microarrays5,6, large-scale two-hybrid analyses7, and high-throughput protein production and crystallization8) have had a marked impact on the current understanding of protein structures, activities and interactions. Among proteomic techniques, however, mass spectrometry has emerged as the main method for analysing the production and function of proteins in native biological systems9–11.

Mass spectrometry has become the dominant technique for several reasons, mainly because of its unparalleled ability to acquire high-content quantitative information about biological samples of enormous complexity. The core technologies of mass-spectrometry-based proteomics, including the instrumentation and the methods for data acquisition and analysis, have been discussed in several recent reviews9–11 and are outlined in Box 1. Although these technologies will continue to be developed in the quest for improved sensitivity, throughput and proteome coverage, mass-spectrometry-based proteomics has now developed to the point at which it is routinely applied worldwide to address a large range of biological problems. It therefore seems an opportune time to reflect on the functional impact of mass-spectrometry-based proteomics. What has been learned about the molecular mechanisms of complex biological processes? How were successful experiments carried out? What additional methods were required to make important biological discoveries? Finally, are there lessons that might guide future applications?

In this review, we address these questions by focusing on several cases in which mass-spectrometry-based proteomics has had a crucial role in advancing our understanding of basic cellular and physiological processes. We highlight common themes that seem to have primed investigations for success, including the configuration of biologically relevant model systems (and controls), the implementation of mass-spectrometry-based proteomics as a hypothesis-generating platform, and a commitment to focused follow-up studies that test emerging hypotheses directly. These examples underscore both the opportunities and the challenges that face the systematic integration of mass-spectrometry-based proteomic techniques into the arsenal of experimental approaches used by molecular and cellular biologists.

Mass-spectrometry-based proteomics has several biological applications. In many pioneering studies, it was used to make an inventory of the content of subcellular structures and organelles, creating valuable repositories of information about the localization of proteins in cells and tissues (see ref. 12 for a recent review). It is also emerging as a powerful way to discern higher-order structural features of protein complexes, including subunit orientation and stoichiometry (see page 973). Here, we focus on two main applications (the functional characterization of protein complexes, and the functional characterization of protein pathways), highlighting studies that have led to major advances in understanding the molecular basis of cellular and physiological processes.

Functional characterization of protein complexes
Many proteins function as components of complexes in cells and tissues13. Protein complexes can vary in size and composition, from megadalton assemblies of dozens of proteins (such as the ribosome and the spliceosome) to smaller clusters of a few proteins. The composition and stability of protein complexes is highly regulated in both a context-dependent manner (for example, there are cell-type-specific differences) and a time-dependent manner14. These biological variables present a challenge to researchers interested in determining the structure and the function of protein complexes. Mass-spectrometry-based proteomics, however, can address this issue in a systematic and relatively unbiased manner, often revealing surprising protein partnerships and assemblies that regulate cellular and physiological processes. In addition to the examples discussed in this section, other notable studies that have used mass-spectrometry-based proteomics to characterize protein complexes are described in refs 15–19.

A mitochondrial protein complex that links apoptosis and glycolysis
Stanley Korsmeyer and colleagues20 provided an early example of the value of mass-spectrometry-based proteomics for uncovering unexpected

1Department of Chemical Physiology, 2The Skaggs Institute for Chemical Biology, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, California 92037, USA.

physical connections between proteins that had been thought to function in independent pathways. With the aim of mapping mitochondrial protein complexes that contain the pro-apoptotic protein Bcl-2 antagonist of cell death (BAD), they developed a gentle and efficient enrichment method to isolate these complexes from mouse liver mitochondria (Fig. 1a). A 232-kDa assembly consisting of five major proteins was identified by silver staining of proteins separated by polyacrylamide gel electrophoresis (PAGE). The protein spots constituting this complex were then excised and analysed by liquid-chromatography–tandem mass spectrometry (LC–MS/MS). This revealed, in addition to BAD, the presence of protein phosphatase 1C (PP1C), cyclic-AMP-dependent protein kinase (PKA) and the PKA-anchoring protein WAVE1. Previous studies had implicated PKA as an important regulator of BAD, inhibiting the activity of BAD by phosphorylating multiple serine residues21. The authors provided

Box 1 | Fundamentals of mass-spectrometry-based proteomic experiments

[Figure: Box 1 schematic. A peptide mixture is separated by LC or LC/LC and analysed in a tandem mass spectrometer (MS 1, collision cell, MS 2). MS panels (molecular-mass determination) and MS/MS panels (sequence identification) each plot relative abundance against mass-to-charge ratio.]

Mass-spectrometry-based proteomic experiments involve several steps (see Figure). A peptide mixture can be obtained from the sample of interest by proteolytic digestion of a protein mixture or of a gel band or spot, following separation by electrophoresis. These peptides are then introduced into a one-dimensional (LC) or multi-dimensional (LC/LC) liquid-chromatography system. After separation, they are eluted into an electrospray ionization tandem mass spectrometer. The mass-to-charge ratios of the peptide ions are measured first by mass spectrometry (upper panels; ions pass unperturbed through the first mass analyser and collision cell) to determine the molecular mass of each peptide. Then, each peptide ion is isolated in the first mass analyser (MS 1) and directed into a collision cell, where it collides with neutral gas molecules (for example, helium) and becomes fragmented (lower panels). The mass-to-charge ratios of the resultant fragments are measured in the second mass analyser (MS 2), producing a tandem mass spectrum (shown for the peptide ion indicated with a white rectangle in the upper panel) and, after computer analysis, an amino-acid sequence for each peptide. These steps are described in further detail below.

Sample preparation
Two general strategies are often used to prepare proteins. Proteins that have been enriched or obtained as part of an experiment can be fractionated by SDS–polyacrylamide gel electrophoresis (SDS–PAGE). Individual bands can be removed and analysed, or an entire lane can be excised and divided into 10–15 slices. Proteins in the gel slices are then digested in situ with trypsin, and the peptides are extracted. Extracted peptides are then analysed by mass spectrometry. Protein mixtures can also be digested directly in solution. A protein mixture is denatured by using chaotropes and then digested, sometimes by a two-step procedure that involves proteases, such as the endoprotease LysC followed by trypsin, to generate a peptide mixture that is suitable for mass-spectrometry analysis. In general, trypsin digestion is preferred to generate peptides with an arginine or lysine residue at the C terminus, but other types of enzyme, including nonspecific proteases, have also been used78.

Liquid chromatography
Before peptide mixtures are introduced into the mass spectrometer, they are fractionated in-line with the instrument. The most common method of fractionation is reversed-phase liquid chromatography, which separates peptides according to their hydrophobicity. To achieve the best sensitivity and efficiency of separation, microcolumns (< 100 μm in length) with a small diameter tip (for example, 5 μm) are typically used. The electrospray ionization that takes place in the mass spectrometer acts like a concentration-dependent detector; therefore, the introduction of peptides in narrow peaks improves detection limits, and low flow rates (in the order of nL min–1) are used to achieve this. As peptide mixtures become more complex, the introduction of a second dimension of separation can improve the resolution of separation. A good choice for a second dimension is strong cation-exchange (SCX) liquid chromatography, which separates peptides mainly on the basis of positive charges. SCX can be used off-line, and then each fraction is analysed by reversed-phase high-pressure liquid chromatography, followed by mass spectrometry. Alternatively, both the reversed-phase and SCX resins can be packed into a single column, and, by introducing buffers in series, a two-dimensional separation can be achieved before mass-spectrometry analysis.

Electrospray ionization
A potential is placed on the liquid flowing from the liquid-chromatography column through a fused silica column or needle, causing the solution to spray. The spray contains fine droplets that encompass the sample. The droplets are desolvated as they enter the mass spectrometer, by applying heat to generate ions. The efficiency of ionization depends on the chemical properties of each molecule.

Mass-spectrometry analysis
Mass spectrometers measure the mass-to-charge ratio of an ion. This is carried out by manipulating ions in electric and/or magnetic fields or by measuring their time of flight (TOF). In addition to determining the mass-to-charge ratio, the intensity of the signal obtained reflects the abundance of the ion. The abundance of ions can vary with the ionization, so samples can be labelled with stable isotopes to determine quantitatively the ratio of peptides from different 'states' (by measuring the mass-to-charge ratio and abundance). Various mass analysers are used in proteomic experiments, including ion-trap mass spectrometers, quadrupole/TOF hybrids, ion-trap/orbitrap hybrids and ion-trap/ion-cyclotron-resonance (FTMS) hybrids. Some types of mass analyser can measure the mass-to-charge ratio with high resolution (up to 150,000 m/∆m, where m denotes mass) and high mass accuracy (to < 1 part per million).
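The quantities discussed in Box 1 (tryptic peptides ending in lysine or arginine, mass-to-charge ratio, resolution m/∆m and mass accuracy in parts per million) can be illustrated with a short Python sketch. It is not taken from the review and is only a minimal illustration under stated assumptions: the protein sequence is hypothetical, the residue masses are standard monoisotopic values, and the rule of not cleaving before proline is a common convention that Box 1 does not mention.

```python
# Minimal sketch: in-silico tryptic digestion and peptide m/z (illustrative only).

# Standard monoisotopic residue masses in daltons.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # one H2O per peptide (free N and C termini)
PROTON = 1.007276   # mass of each charge-carrying proton


def tryptic_digest(protein):
    """Cleave after K or R, except when the next residue is P (assumed rule)."""
    peptides, start = [], 0
    for i, aa in enumerate(protein):
        next_aa = protein[i + 1] if i + 1 < len(protein) else ""
        if aa in "KR" and next_aa != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])  # C-terminal remainder
    return peptides


def peptide_mz(peptide, charge):
    """Monoisotopic m/z of a peptide carrying `charge` protons."""
    neutral = sum(MONO[aa] for aa in peptide) + WATER
    return (neutral + charge * PROTON) / charge


def ppm_error(observed, theoretical):
    """Mass accuracy expressed in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


def peak_width(mz, resolution):
    """Smallest resolvable difference, delta m, at resolution m / delta m."""
    return mz / resolution


if __name__ == "__main__":
    for pep in tryptic_digest("MAGKRPLEWKSTR"):  # hypothetical sequence
        print(pep, round(peptide_mz(pep, 2), 4))
```

Under these assumptions, a resolution of 150,000 at m/z 1,500 corresponds to a resolvable difference of about 0.01, and a 1 p.p.m. error at m/z 1,000 corresponds to 0.001.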

strong evidence that PP1C functions to counter the effect of PKA, by dephosphorylating BAD. These data therefore led to a model in which a BAD–PKA–PP1C complex, possibly scaffolded by WAVE1, creates a local microenvironment in which the phosphorylation status of BAD can be finely controlled.

Korsmeyer and colleagues next considered the possible function(s) of this BAD-containing complex in mitochondrial physiology. First, they pursued the characterization of the fifth (and still unidentified) component of the complex. By LC–MS/MS analysis, this 50-kDa protein was identified as the glycolytic enzyme glucokinase (also known as hexokinase 4). This result was initially surprising; even though apoptosis and glycolysis are both crucial physiological processes that are linked to cell survival22,23, the molecular pathways involved had been thought to function independently. The organization of glucokinase and BAD into a stable multiprotein complex in mitochondria indicated otherwise. Indeed, the authors showed that Bad–/– mice, which lack the gene encoding BAD, had severely blunted mitochondrial glucokinase activity, glucose-driven respiration and glucose-dependent ATP production. Moreover, these mice had significantly higher blood glucose concentrations after fasting than did wild-type (Bad+/+) mice. The effect of BAD on glucose homeostasis depended on its phosphorylation status, suggesting that other members of the BAD-containing complex (for example, PKA and PP1C) had a regulatory role.

These findings therefore indicate that the mitochondrial fraction of glucokinase, defined as that portion of the enzyme stably associated with the BAD–PKA–PP1C–WAVE1 complex (Fig. 1b), has a disproportionate role in maintaining proper glucose metabolism (given that most of the glucokinase in a cell is cytosolic). The glucokinase- and BAD-containing mitochondrial complex was also proposed to function as an integration centre that links metabolic state and cell death. This hypothesis was supported by the finding that liver cells from Bad–/– mice underwent less glucose-deprivation-induced apoptosis than wild-type liver cells.

In summary, the mass-spectrometry-based proteomic analysis of mitochondrial BAD-containing complexes discovered an unexpected physical association between a key apoptotic protein (BAD) and a key glycolytic protein (glucokinase), thereby leading to new models to explain how cells coordinate metabolic signals and survival signals.

Figure 1 | Discovery of a BAD-containing protein complex that localizes to mitochondria and integrates apoptotic and glycolytic processes. a, A complex containing BAD, PKA, PP1C, WAVE1 and glucokinase was discovered by multidimensional fractionation of mitochondrial protein complexes from Bad+/+ (wild-type) mice and Bad–/– mice, followed by LC–MS/MS analysis20. Multidimensional fractionation involved gel filtration as the first dimension (1D), native (non-denaturing) PAGE as the second dimension (2D), and denaturing PAGE (SDS–PAGE) as the third dimension (3D). b, In the absence of BAD, none of the members of the protein complex associates with mitochondrial membranes.

A transcription-factor complex relevant to trichothiodystrophy
The proper maintenance, repair and transcription of DNA requires several multiprotein complexes24. Ruedi Aebersold and colleagues25 were interested in fully characterizing the components of the yeast (Saccharomyces cerevisiae) RNA polymerase II (PolII) pre-initiation complex and established an elegant biochemical system to enrich this protein assembly (Fig. 2). They first isolated nuclear extracts from a mutant yeast strain carrying a temperature-sensitive allele of the TATA-binding protein (TBP) and incubated these proteomes with an immobilized HIS4 promoter from yeast, which includes the TATA box, in the presence or absence of recombinant TBP. They then used isotope-coded affinity tagging (ICAT) in conjunction with LC–MS/MS26 to identify proteins that were quantitatively enriched (by at least 1.9-fold) in promoter-associated fractions from TBP-containing proteomes. Nearly all of the proteins that met this criterion were known components of the PolII transcriptional machinery, with the notable exception of an open reading frame, YDR079C-A (Fig. 2), which corresponded to a small (8 kDa) protein of unknown function.

BLAST searches (http://www.ncbi.nlm.nih.gov/blast/Blast.cgi) revealed homologues of YDR079C-A in several eukaryotic organisms, including humans and Chlamydomonas reinhardtii. Interestingly, the protein encoded by C. reinhardtii was known to suppress sensitivity to ultraviolet light (which can damage DNA)27. This finding suggested that the protein encoded by YDR079C-A might be a component of transcription factor IIH (TFIIH), which has a role in both general transcription and repair of DNA damage. Aebersold and colleagues25 confirmed this hypothesis by carrying out a further round of quantitative proteomic experiments, this time comparing proteins that bound to an epitope (FLAG)-tagged YDR079C-A protein. The proteins that were most enriched after immunoprecipitation of FLAG–YDR079C-A protein with FLAG-specific antibodies were components of TFIIH. Conversely, immunoprecipitation with antibodies specific for other TFIIH components 'pulled down' the YDR079C-A protein. Finally, YDR079C-A protein was shown to be required for stable recruitment of TFIIH to promoters. These findings confirmed that the YDR079C-A protein is a core subunit of TFIIH, prompting the authors to rename the protein as the transcription factor subunit Tfb5.

In an amazing example of a basic scientific discovery being rapidly translated, the human homologue of Tfb5 was almost immediately shown to be the long-sought gene product that is mutated in a set of unexplained cases of trichothiodystrophy28, a rare human disease characterized by brittle hair and skin photosensitivity. Most cases of trichothiodystrophy had been traced to mutations in the genes encoding the nine known components of TFIIH. However, an individual with symptoms of trichothiodystrophy but no mutations in these genes had been found several years earlier29. Interestingly, cells from patients with this unusual variant of trichothiodystrophy (called trichothiodystrophy A) were found to have low cellular concentrations of TFIIH29. On discovery of Tfb5 as the tenth component of yeast TFIIH, Wim Vermeulen, Aebersold and colleagues sequenced the corresponding human gene in patients with trichothiodystrophy A and found inactivating mutations in four individuals from three separate families28. Moreover, they showed that recombinant human TFB5 could stabilize TFIIH complexes and correct the DNA-repair defects in cells from patients with trichothiodystrophy A.

In summary, the use of mass-spectrometry-based proteomics to characterize a previously unknown component of TFIIH catalysed a remarkable bench-to-bedside-to-bench investigation that succeeded in explaining the molecular basis for a human photosensitivity syndrome. On a technical note, it is worth emphasizing that this discovery hinged on the use of quantitative profiling, which allowed the researchers to confidently identify Tfb5, despite its showing only a moderate (twofold) increase in abundance in promoter-associated samples compared with control samples. Direct LC–MS/MS analysis also proved crucial, because the small size of Tfb5 (8 kDa) precluded straightforward detection by SDS–PAGE (and might explain why this protein eluded detection by more-classical methods). Finally, the interaction of Tfb5 with another component of TFIIH, Tfb2, was recently confirmed in a genome-wide tandem-affinity-purification study30, thus underscoring the capacity of large-scale mass-spectrometry-based proteomic experiments to characterize physiologically relevant protein complexes.

Figure 2 | Discovery of Tfb5 as the tenth subunit of TFIIH, which is involved in transcriptional and DNA-repair processes. Nuclear extracts from yeast expressing a temperature-sensitive mutant of TBP were incubated with HIS4-promoter-linked beads, in the presence or absence of recombinant TBP. The inclusion of recombinant TBP facilitated enrichment for proteins that are members of PolII pre-initiation complexes (PICs). Samples enriched in the absence or presence of recombinant TBP were then labelled with non-deuterated (d0) and deuterated (d8) ICAT probes, respectively, enabling LC–MS-based quantitative proteomic analysis of differentially enriched proteins. This led to the identification of YDR079C-A (subsequently named Tfb5) as a component of the transcription-factor complex TFIIH25. Complementary genetic studies (not shown) confirmed that the gene encoding the human homologue of Tfb5 is mutated in rare forms of trichothiodystrophy28.

A chaperone complex that regulates CFTR folding and transport
Transmembrane proteins depend on a complex range of chaperones and co-chaperones for optimal folding, localization and, ultimately, function31,32. Genetic mutations that impair the folding of membrane proteins form the basis of many human diseases, including cystic fibrosis. Cystic fibrosis is mainly caused by point mutations in the gene encoding an apical membrane ATP-regulated chloride channel, which is known as the cystic fibrosis transmembrane conductance regulator (CFTR)33. The main disease-associated mutation, ∆F508 (deletion of the phenylalanine residue at position 508 of the wild-type protein), disrupts the folding of CFTR in the endoplasmic reticulum, leading to almost complete degradation of this channel34 (Fig. 3a). Interestingly, however, properly folded CFTR with this mutation can traffic to the plasma membrane, where it forms a functional chloride channel. These findings suggest that rescuing the folding of ∆F508-CFTR could eventually be used to treat patients with cystic fibrosis.

Initial studies have implicated both chaperone assemblies that contain heat-shock protein 90 (HSP90) and those that contain HSP40 and HSP70 in the folding pathways for CFTR35,36. William Balch, John Yates and colleagues37 proposed that a more complete understanding of the chaperone assemblies that regulate CFTR folding and transport could be achieved by carrying out a proteomic analysis of the proteins associated with the channel. The authors used the shotgun LC–MS method MudPIT38 (multidimensional protein-identification technology) to analyse cells expressing the gene encoding the wild-type CFTR or ∆F508-CFTR. MudPIT analysis of wild-type CFTR and ∆F508-CFTR immunoprecipitates identified nearly 200 CFTR-associated proteins (compared with controls in which nonspecific antibodies or cells lacking CFTR were used). Collectively, these proteins have been named the CFTR interactome (Fig. 3b). These proteins included known CFTR-binding chaperones, such as calnexin, HSP40–HSP70 and HSP90, as well as many previously unknown interactors.

These researchers next set out to test whether any of the newly discovered CFTR-associated proteins regulated channel folding and export from the endoplasmic reticulum37. RNA-interference (RNAi)-mediated knockdown of either p23 (also known as PTGES3) or FKBP8, two HSP90 co-chaperones that were selectively identified in ∆F508-CFTR immunoprecipitates (as determined by protein-sequence coverage and spectral counting in MudPIT experiments), resulted in greatly reduced amounts of endoplasmic-reticulum-associated and cell-surface-associated ∆F508-CFTR (Fig. 3c). Overexpression of these co-chaperones, however, had opposite effects on ∆F508-CFTR, with an increase in p23 leading to more endoplasmic-reticulum-associated CFTR and an increase in FKBP8 leading to less. These data were thought to reflect the distinct roles of p23 and FKBP8 in modulating specific aspects of the HSP90-guided folding cycle. Notably, overexpression of either co-chaperone failed to increase the amount of cell-surface-associated ∆F508-CFTR (or wild-type CFTR), suggesting that they modulate the initial folding and stability of CFTR but do not participate in the subsequent steps that are required to deliver the folded channel to the endoplasmic-reticulum export machinery.

RNAi-mediated knockdown of a third HSP90 co-chaperone present in wild-type and ∆F508-CFTR immunoprecipitates, AHA1, substantially corrected the amount of both endoplasmic-reticulum-associated and cell-surface-associated ∆F508-CFTR (Fig. 3c). These data suggest that disruption of AHA1, unlike p23 or FKBP8, facilitates a folding pathway that favours not only stability of the channel but also coupling to the endoplasmic-reticulum export machinery. Potentially consistent with this premise, a decrease in CFTR-bound HSP90 was observed in cells in which AHA1 had been knocked down, similar to the finding from MudPIT analyses that wild-type CFTR immunoprecipitates contained less HSP90 than ∆F508-CFTR immunoprecipitates. Collectively, these data

indicate that a reduction in the amount of AHA1 might alter the kinetics of HSP90–CFTR interactions, thereby increasing the efficiency of transition from folding to export pathways. Finally, the authors showed that a considerable proportion of the cell-surface-associated ∆F508-CFTR channels were functional, as determined by chloride conductance measurements.

Mass-spectrometry-based proteomic studies of the CFTR interactome thus identified specific co-chaperone and chaperone folding pathways that seem to control mutant channel stability, cell-surface expression and function. Why might the basal chaperone machinery of a cell, or ‘chaperome’, prevent proper assembly and transport of ∆F508-CFTR, thereby exacerbating the disease phenotype? The authors37 speculate that the folding energetics of ∆F508-CFTR lie outside the capacity of the normal chaperome environment, which has been evolutionarily optimized to fold wild-type proteins (and to eliminate misfolded proteins). A provocative extension of this idea is that genetic or pharmacological interventions that shift the chaperome so that it can support the folding and transport of mutant proteins could be used to treat patients with cystic fibrosis, as well as those with other protein-conformational disorders.

Figure 3 | Discovery of chaperone complexes that regulate CFTR folding and endoplasmic-reticulum-mediated export. a, The cellular fate of CFTR chloride channels is depicted. Wild-type CFTR is transported to the plasma membrane. By contrast, ∆F508-CFTR, the mutant protein present in individuals with cystic fibrosis, is degraded by endoplasmic-reticulum-mediated pathways (ERAD) before it reaches the plasma membrane. b, Immuno-enrichment of proteins bound to wild-type CFTR and ∆F508-CFTR identified several chaperones and co-chaperones37. c, RNAi-mediated knockdown of these chaperones and co-chaperones was carried out, using small interfering RNA (siRNA) directed against the corresponding mRNAs. In the case of p23 and FKBP8 (centre), less ∆F508-CFTR was found in the endoplasmic reticulum, indicating that these proteins regulate the folding and stability of CFTR proteins in the endoplasmic reticulum. By contrast, in the case of AHA1 (right), more ∆F508-CFTR was found both in the endoplasmic reticulum and at the cell surface, indicating that this chaperone controls both the folding of CFTR proteins and their export from the endoplasmic reticulum37.

Functional characterization of protein pathways

One of the original and most enduring applications of mass-spectrometry-based proteomics is the comparative analysis of biological samples that differ in specific physiological or pathophysiological phenotypes (that is, comparing ‘disease’ and ‘normal’39). These studies are intended to identify the minimal protein ‘signatures’ that depict and, ideally, determine the higher-order biological processes under investigation. As highlighted in this section, mass-spectrometry-based proteomics carried out in this comparative analysis mode has successfully identified previously unknown protein pathways with key roles in a wide range of biological systems.

Kinase pathways that regulate sex-specific functions in Plasmodium

Malaria is caused by unicellular parasites of the genus Plasmodium. These parasites undergo a complex series of highly regulated life-cycle transitions that allow transmission between vertebrates and mosquitoes40. Chief among these life-cycle stages is the generation of haploid sexually differentiated (male and female) cells, termed gametocytes. In vertebrate blood, gametocytes are in a state of arrest, but on transfer to the mosquito mid-gut, they become activated and differentiate into gametes, which fertilize and, eventually, produce infectious oocysts. Despite the importance of sexual development to the transmission of Plasmodium spp., the proteins that distinguish male and female cells have not been systematically inventoried. Andrew Waters and colleagues set out to address this important problem through an innovative combination of advanced cell-biological models and proteomic technologies41.

Previous efforts to characterize sex-specific proteins had been confounded by a technical inability to separate and purify male and female gametocytes. Waters and colleagues overcame this difficulty by creating transgenic lines of Plasmodium berghei that produce green fluorescent protein (GFP) under the control of a male-specific or a female-specific promoter (from the genes encoding α-tubulin II and elongation factor 1α, respectively) (Fig. 4a). These lines enabled male gametocytes and female gametocytes to be selectively enriched by flow cytometry (Fig. 4b). These sex-specific cell populations were then compared with one another (and with gametocyte-free (asexual) blood stages) by mass-spectrometry-based proteomics. Specifically, proteomes were separated into ten fractions by one-dimensional SDS–PAGE, and each fraction was digested with trypsin and analysed by LC–MS/MS. Sex-specific proteins were identified by comparing the number of unique ‘tryptic peptides’ in each sample.

A remarkable number of sex-specific proteins were identified: there were 236 unique proteins in male gametocytes, and 101 in female gametocytes (Fig. 4c). Analysis of the sex-specific proteomes showed clear enrichment for protein families that are functionally linked to male-gametocyte and female-gametocyte biology. For example, nearly 70% of the Plasmodium proteins that are annotated as DNA-replication proteins (17 of 25) were highly represented in male gametocytes, which is consistent with the more extensive genome replication that these cells undergo during the gamete-activation cycle. The male-gametocyte proteome was also strongly enriched in axoneme proteins, which form the flagella required for motility of male gametes. The female-gametocyte proteome, by contrast, contained larger amounts of ribosomal and mitochondrial proteins than the male-gametocyte proteome.

Waters and colleagues next selected individual sex-specific proteins for functional analysis, to determine whether they have important

roles in male-gametocyte or female-gametocyte biology. The authors focused on two protein kinases: mitogen-activated protein kinase 2 (MAP2; accession number PB000659.00.0), which was found only in male gametocytes; and NIMA-related kinase (NEK4; accession number PB001094.00.0), which was found only in female gametocytes. Targeted disruption of the gene encoding MAP2 resulted in male gametocytes that can re-enter the cell cycle after activation and complete genome replication but fail to enter nuclear division. Disruption of the gene encoding NEK4, by contrast, did not seem to impair gamete formation but, instead, arrested zygote development. Cross-fertilization studies confirmed that the latter phenotype was due to defective female (but not male) gametes.

In summary, developing an innovative cell-biological strategy to enrich distinct sexual stages of the P. berghei life cycle allowed the generation of high-quality cellular models for in-depth analysis by mass-spectrometry-based proteomics. The output of the proteomic investigations was the most comprehensive inventory of sex-specific parasite proteins generated so far, including the discovery of novel protein kinases that regulate male-specific and female-specific signalling pathways. Interestingly, both MAP2 and NEK4 belong to protein-kinase subfamilies (the MAP and NEK subfamilies) that have multiple members in Plasmodium spp.42. These studies are therefore a compelling example of the value of comparative proteomics for assigning unique (that is, non-redundant) cellular functions to uncharacterized members of protein classes.

Figure 4 | Identification of male-gametocyte-specific and female-gametocyte-specific proteins in Plasmodium berghei. a, Transgenic Plasmodium berghei lines that produce GFP under the control of a female-gametocyte-specific promoter (from the gene encoding elongation factor 1α) or a male-gametocyte-specific gene promoter (from the gene encoding α-tubulin II) were generated41. b, Enriched populations of female and male gametocytes were then obtained by flow cytometry, gating on GFP signals (shown for the line in which female gametocytes produce GFP; the plot shows number of cells against GFP intensity). c, Comparative proteomic analysis of male-gametocyte-enriched, female-gametocyte-enriched and asexual populations identified 236, 101 and 171 proteins, respectively, that are expressed solely in each cell type. Among these proteins were identified two protein kinases that contribute to male-gametocyte-specific and female-gametocyte-specific cellular functions.

An ether-lipid signalling pathway that supports tumour pathogenesis

Cancer cells have long been suspected to have alterations in metabolism that support their malignant behaviour. Most cancer cells, for example, have a greater dependence on glycolysis than on oxidative phosphorylation for energy production, a phenomenon referred to as the Warburg effect22. In an effort to map dysregulated biochemical pathways in cancer more globally, Benjamin Cravatt and colleagues used a chemical proteomic technology known as activity-based protein profiling (ABPP)43, in conjunction with mass-spectrometry-based analytical platforms (such as MudPIT), to identify enzyme activities that are increased in aggressive cancer cell lines and primary tumours in humans44.

In ABPP, active-site-directed probes are used to profile the functional state of enzymes directly in native proteomes43. ABPP probes contain two main elements: a reactive group that binds to, and covalently labels, many enzymes from a given mechanistic class; and a reporter group, such as biotin or a fluorophore, that allows detection, enrichment and identification of probe-modified enzymes (Fig. 5a). In their initial studies, Cravatt and colleagues used fluorophosphonate-containing ABPP probes45,46 to profile the activities of serine hydrolases in a panel of human cancer cell lines44. These experiments identified sets of enzyme activities that distinguished cancer cells on the basis of tissue of origin and state of aggressiveness. Chief among these enzymes was a previously uncharacterized transmembrane enzyme KIAA1363 (also known as AADACL1), increased amounts of which were found in aggressive lines from several tumour types, including breast cancer, ovarian cancer and melanoma (Fig. 5b). Cravatt and colleagues later showed by ABPP–MudPIT analysis that the activity of KIAA1363 is much higher in oestrogen-receptor-negative primary breast tumours from humans than in oestrogen-receptor-positive primary breast tumours, which are usually less aggressive, or in normal breast tissue47.

Cravatt and colleagues next used a competitive version of ABPP48 to develop a potent and selective inhibitor of KIAA1363, which they named AS115 (Fig. 5c). Treatment of cancer cells with this inhibitor, followed by metabolomic analysis using untargeted LC–MS methods49, revealed that KIAA1363 regulates an unusual class of neutral lipids: the monoalkylglycerol ethers (MAGEs)50 (Fig. 5d). Additional studies confirmed that KIAA1363 is a 2-acetyl-MAGE hydrolase, producing large amounts of MAGEs in aggressive cancer cells. These MAGEs are, in turn, converted into biologically active lysophospholipids, such as lysophosphatidic acid (LPA). By contrast, inhibiting KIAA1363 stabilizes 2-acetyl-MAGE, resulting in its conversion into another class of signalling molecule, the lipid platelet-activating factor. Finally, RNAi-mediated knockdown of the protein and activity of KIAA1363 led to a marked decrease in the amount of MAGE and LPA lipids in cancer cells, correlating with significant reductions in the migratory and tumour-forming potential of these cells (Fig. 5d).

In summary, Cravatt and colleagues used a combination of mass-spectrometry-based functional proteomic and metabolomic methods to determine that the enzyme KIAA1363 is more abundant and has a higher activity in aggressive cancer cells, where it is a key node that bridges platelet-activating factor and LPA in an ether-lipid signalling network. Considering that disruption of this network impaired cancer-cell migration and tumour growth, the KIAA1363–ether-lipid pathway probably has a key role in regulating important aspects of cancer pathogenesis.

Phosphoprotein networks involved in the DNA-damage response

Post-translational modifications constitute one of the most pervasive mechanisms for regulating protein function in cells and tissues. Protein phosphorylation, in particular, dynamically modulates numerous signalling pathways and is controlled by the complementary action of protein kinases and protein phosphatases. One of the big challenges in the post-genomic era is determining the endogenous substrates of the more than 500 protein kinases in the human proteome51. Recently, Stephen Elledge and colleagues52 introduced a creative mass-spectrometry-based proteomic strategy that allowed them to make a comprehensive inventory of substrates for the protein kinases ATM (ataxia telangiectasia mutated) and ATR (ATM and Rad3 related), which are involved in the DNA-damage-response pathway52.

Previous studies had identified about 25 ATM and/or ATR substrates, which contained an unusual consensus sequence for phosphorylation: Ser/Thr-Gln53. On the basis of this information, Elledge and colleagues used a panel of 68 antibodies specific for phospho-Ser-Gln or phospho-Thr-Gln to immunoprecipitate candidate substrates for ATM and ATR from human cells that were either exposed to ionizing radiation to induce the DNA-damage response or not irradiated (Fig. 6). These cell populations had previously been subjected to stable isotope labelling54,55, so radiation-induced phosphorylation events could be quantified by LC–MS/MS analysis. Relative quantification of heavy-isotope-labelled and light-isotope-labelled phosphopeptide pairs identified 905 phosphorylation sites, across 700 proteins, that had fourfold higher signals in irradiated cells than in non-irradiated cells. Thus, in a single set of experiments, the researchers increased the number of candidate ATM and ATR substrates by more than 20-fold (from about 25 proteins to 700 proteins). The increase in phosphorylation found in irradiated cells was confirmed for several candidate substrates by immunoblotting with antibodies specific for phospho-Ser-Gln or phospho-Thr-Gln.

The researchers next examined whether the newly identified substrates have a role in the DNA-damage response, by systematically disrupting expression of the corresponding genes by RNAi. Of the 37 substrates examined, 35 were found to contribute to at least one aspect of the DNA-damage response. Although these studies do not directly test whether the phosphorylation state of the proteins is crucial for their function, the results indicate that many more proteins contribute to the DNA-damage response than was originally thought, and these proteins are subject to dynamic phosphorylation in response to DNA-damage signals. Interestingly, there were several cases in which multiple components of a given pathway were phosphorylated, leading the authors to conclude that protein kinases can increase their effect on specific signalling pathways by simultaneously phosphorylating several nodes.

The authors then rapidly mined their phosphoproteomic data sets, facilitating the functional annotation of two previously uncharacterized proteins. One of these proteins, which the authors named abraxas (also known as CCDC98 and FLJ13614), was identified as a potential ATM and/or ATR substrate. Abraxas was more heavily phosphorylated in irradiated cells than in non-irradiated cells, and it formed a complex with RAP80 (also known as UIMC1) and BRCA1, which was required for resistance to DNA damage, control of the cycle checkpoint at the G2–M boundary and repair of DNA56. The other protein, which the authors named FANCI (also known as KIAA1794), was found to form a complex with FANCD2, which then localized to chromatin in response to DNA damage57. Interestingly, a mutation in the gene encoding FANCI had been causally linked to Fanconi’s anaemia, a syndrome that impairs development and increases the risk of developing cancer.

These studies, together with others58–61, underscore the rapid development of quantitative mass-spectrometry methods for mapping protein phosphorylation sites in proteomes. Similar approaches are emerging for the global analysis of other key protein modifications, including acetylation62,63, methylation63,64, glycosylation65 and ubiquitylation66. We expect that these methods will also help to improve our understanding of the role of post-translational modifications in regulating protein function in biological systems.

Insulin pathways in Caenorhabditis elegans dauer formation and ageing

Genetic studies in C. elegans have determined that the signalling pathway involving insulin and insulin-like growth factor has an important role in regulating lifespan67. For example, disruption of the C. elegans receptor DAF-2, which is homologous to the mammalian receptors for both insulin and insulin-like growth factor 1, extends lifespan and increases entry to the dauer phase (a phase characterized by delayed development, which C. elegans can enter if environmental conditions are unfavourable early in development)68. To understand the molecular basis of these marked changes in physiology, John Yates and colleagues carried out a quantitative mass-spectrometry-based proteomic analysis of wild-type and daf-2 mutant strains of C. elegans69.

Two forms of quantification were used: ratiometric analysis of proteomes from both wild-type C. elegans and daf-2 mutants with a reference proteome corresponding to wild-type C. elegans fed on 15N-enriched bacteria70; and direct spectral counting71 of unlabelled proteins in wild-type and daf-2 mutant proteomes (Fig. 7a). Together, these methods identified 86 proteins that were differentially expressed in daf-2 mutants, 47 that were more abundant and 39 that were less abundant than in wild-type C. elegans. There were good correlations between the proteomic data obtained with the two methods, indicating that either approach can provide an accurate estimate of the relative levels of proteins in two or more biological samples. The authors verified their proteomic data by selecting several proteins from wild-type strains and daf-2 mutant strains for analysis by immunoblotting.

Interestingly, proteins that had similar changes in abundance in the daf-2 mutant strain tended to show a functional relationship. For example, as a group, the more abundant proteins tended to have translation-elongation and lipid-transport functions, whereas the less-abundant proteins were over-represented in the categories of amino-acid biosynthesis, reactive-oxygen-species metabolism and carbohydrate metabolism.
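The two quantification modes described for the daf-2 study (ratios measured against a shared 15N-labelled reference, and label-free spectral counting) can be illustrated with a minimal Python sketch. This is not the authors' analysis pipeline: the function names, the pseudocount, the twofold cut-off and all numbers are hypothetical choices made only for illustration.

```python
def ratio_via_reference(a_light, a_ref_heavy, b_light, b_ref_heavy):
    """Relative abundance of one protein in sample A versus sample B.

    Each unlabelled (light) sample is assumed to have been mixed with the
    same heavy-labelled reference proteome, so the reference signal cancels
    out in the ratio of ratios.
    """
    return (a_light / a_ref_heavy) / (b_light / b_ref_heavy)


def spectral_count_ratio(count_a, count_b, total_a, total_b, pseudocount=1.0):
    """Label-free estimate of A versus B from spectral counts.

    Counts are normalised to the total number of spectra in each run; the
    pseudocount (an assumption of this sketch) avoids division by zero.
    """
    norm_a = (count_a + pseudocount) / total_a
    norm_b = (count_b + pseudocount) / total_b
    return norm_a / norm_b


def classify(ratio, threshold=2.0):
    """Label a mutant/wild-type ratio using a hypothetical fold-change cut-off."""
    if ratio >= threshold:
        return "more abundant"
    if ratio <= 1.0 / threshold:
        return "less abundant"
    return "unchanged"


if __name__ == "__main__":
    # Invented signal intensities and spectral counts for a single protein.
    isotope_ratio = ratio_via_reference(300.0, 100.0, 150.0, 100.0)
    count_ratio = spectral_count_ratio(59, 19, 10000, 10000)
    print(classify(isotope_ratio), classify(count_ratio))
```

When the two estimates for a protein fall in the same class, that agreement is a simple form of the cross-method correlation that the text reports; a real analysis would also need replicate measurements and statistical testing.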
Figure 5 | Discovery of an ether-lipid signalling pathway that supports cancer pathogenesis. a, The general structure and mechanism of action of ABPP probes are shown, with proteins of various activities in blue and a probe with a specific reactive group. b, ABPP of a panel of human tumour cell lines identified an uncharacterized hydrolase, KIAA1363, in aggressive cell lines from several tumour types. The activity of KIAA1363 increased with the aggressiveness of the cell lines (as determined by in-gel fluorescence scanning of probe-labelled KIAA1363)44. c, Inactivation of KIAA1363 by the selective inhibitor AS115 (or short hairpin RNA probes) decreased the abundance of a family of ether lipids, including MAGEs and alkyl-lysophospholipids (lysophosphatidylcholine (LPC) and LPA), as determined by LC–MS analysis50. d, These results suggest a model in which KIAA1363 regulates an ether-lipid pathway that proceeds from MAGEs to LPC and LPA. Disruption of this lipid network by blockade of KIAA1363 inhibited cancer-cell migration and tumour growth (not shown)50. Me, methyl.

Yates and colleagues next tested whether these proteins

affected DAF-2-dependent processes such as lifespan. Curiously, RNAi-mediated knockdown in wild-type C. elegans of the mRNAs encoding proteins that had increased abundance in daf-2 mutants tended to extend the lifespan further, whereas knockdown of the mRNAs encoding less-abundant proteins shortened the lifespan of wild-type C. elegans (Fig. 7b). These results suggest that many of the proteomic changes observed in daf-2 mutants reflect compensatory changes in metabolic and/or signalling pathways that limit the impact of loss of DAF-2 function.

Principal among the observed compensatory pathways was TAX-6 (also known as CNA-1), the C. elegans orthologue of the protein phosphatase known as calcineurin A. Significantly more TAX-6 was present in daf-2 mutants than in wild-type C. elegans. In addition, disruption of the gene encoding TAX-6 (tax-6) produced a similar phenotype to loss of DAF-2 (that is, extended lifespan and increased entry to the dauer phase). Disruption of both tax-6 and daf-2 resulted in even more marked phenotypes. Collectively, these data indicate that TAX-6 is part of a feedback loop that buffers the effects of DAF-2 on longevity, through compensatory mechanisms (Fig. 7c). A provocative extension of this idea is that pharmacological strategies to block such compensatory pathways might be useful for extending the lifespan of animals. From a more technical perspective, this study, together with another study by Yates and colleagues72, shows that stable isotope labelling can be applied to intact organisms, as well as to cell-culture preparations, thus greatly expanding the potential applications of this quantitative mass-spectrometry-based proteomic method.

Figure 6 | Identification of candidate ATM and/or ATR substrates involved in the DNA-damage response. Cells treated with light-isotope-labelled amino acids (12C and 14N) were exposed to ionizing radiation, and cells treated with heavy-isotope-labelled amino acids (13C and 15N) were maintained under control conditions. Candidate ATM and ATR substrates were then identified by trypsin digestion of whole-cell proteomes, followed by immunoprecipitation with antibodies specific for the consensus ATM and ATR phosphorylation motif phospho-Ser/Thr-Gln, and then LC–MS/MS analysis. Phosphoproteins produced in response to irradiation were identified by ratiometric analysis of mass signals from light-isotope-labelled cells and heavy-isotope-labelled cells. Many of these proteins were found to have important roles in the DNA-damage response52.

Emergent themes for mass-spectrometry-based proteomics

The studies described in this review have several common conceptual and experimental themes that are instructive for researchers interested in using mass-spectrometry-based proteomics. First, it is clear that, to ask specific biological questions, well-configured model systems need to be established. Proteomic experiments produce large amounts of data. For these data sets to deliver answers or inspire compelling hypotheses that explain the molecular basis of complex biological processes, well-designed experimental systems and controls must be incorporated into the research plan. Not surprisingly, experimental systems often involve pathophysiological states for which clinical phenotypes are well described. Using the appropriate controls allows investigators rapidly to winnow down proteomic observations to a manageable number of proteins that show changes in abundance, activity or post-translational modification in the experimental model under study.

If carried out properly, mass-spectrometry-based proteomics experiments should uncover a set of proteins associated with a specific cellular or physiological process. Testing the function of these proteins, however, requires ‘targeted’ follow-up studies that use complementary experimental approaches. A second theme is the emergence of RNAi as a near-universal method to perturb the production of any protein in cells and organisms, offering researchers a powerful strategy to test the function of proteins identified in proteomic experiments. RNAi also has the advantage of operating on a scale that is compatible with screening the biological function of hundreds to thousands of candidate proteins73,74, making it an attractive method to rapidly validate targets discovered in large-scale proteomic endeavours. Perhaps the best way to picture the growing synergistic relationship between mass-spectrometry-based proteomic techniques and RNAi techniques is to view the former approach as a hypothesis-generating engine and the latter as a tool for testing these hypotheses. In this manner, proteomic observations can be connected to function or phenotype.

A third common theme among the studies highlighted here is that the follow-up biological experiments were carried out by the same research group as the original mass-spectrometry-based proteomics investigation. Although repositories of proteomic data are undoubtedly useful, this finding suggests that the primary biological users of proteomic information are typically the generators of these data. There are several reasons why this might be the case. First, biologists are inundated with large-scale data sets, including those that inventory transcript, protein and metabolite expression, as well as protein–protein interactions and post-translational modification state. This glut of molecular information almost certainly has a saturating effect on potential users, who may face too many candidate targets or pathways to explore. Second, potential users might be concerned about the quality of mass-spectrometry-based proteomic data (for example, the number of false-positive and false-negative results). Follow-up biological studies are not trivial in terms of cost or time, and having confidence in the quality of the data would probably lower the ‘activation energy barrier’ for secondary users of proteomic results. Last, it might simply take more time for secondary users to incorporate mass-spectrometry-based proteomic data sets into their biological studies, or secondary users might incorporate data from proteomic experiments mainly to validate observations from their own experiments. Thus, there might be particular issues to overcome before repositories of large-scale proteomic data influence hypothesis-driven research, which often involves highly specific objectives for which proteomic data might be too general to address. This situation should improve as new methods for mining stored proteomic data are developed. It should also be noted that it is much easier to track scientific progress if a common authorship is preserved. We might therefore be underestimating the number of researchers who have capitalized on repositories of mass-spectrometry-based proteomic data to gain new insights into biological systems.

We have highlighted experimental commonalities among mass-spectrometry-based proteomic studies that made important biological discoveries; however, there are also some noteworthy differences. For example, several methods for protein quantification have been used: these include ICAT25; stable isotope labelling of cells52 and organisms69; and label-free techniques such as unique peptide number41, protein-sequence coverage37 and spectral counting37,47,69. Given that each of these strategies is generally successful, does there need to be a single form of data collection and analysis in quantitative mass-spectrometry-based proteomic experiments?
This question can be distilled to the issue of balancing accuracy and ease of implementation. Label-free methods are Relative Relative the simplest and most cost-effective to carry out, but they lack the preci- abundance abundance sion of isotope-labelling techniques. However, as long as researchers are Mass-to-charge Mass-to-charge committed to validating a portion of their proteomic results by using ratio ratio complementary techniques (for example, immunoblotting or selec- tive-reaction monitoring), confidence in the overall data sets acquired with either method should be achievable. These validation experiments should, for example, readily identify false-positive data, which can be b eft-2-directed siRNA tax-6-directed siRNA eliminated from further analysis. False-negative results (that is, changes that occur but are not detected) are more problematic but are almost cer- tainly minimized as the accuracy of the quantification method increases. 1 1 On this note, the discovery of Tfb5 as a component of TFIIH is worth revisiting. This protein showed only a twofold increase in ICAT signal 0.5 0.5 in enriched TFIIH complexes25, a signal difference that probably would C. elegans C. elegans not have registered as meaningful if less-accurate (label-free) quantifica- 0 0 tion methods had been used. Regardless, in all of the studies highlighted of surviving Fraction Time (days) of surviving Fraction Time (days) here, the overall importance of the proteomic data sets was established by follow-up biological experiments. c Longevity Conclusions and future directions Future technical challenges for mass-spectrometry-based proteomics mainly relate to the nature of proteins in biological systems. Proteins have a wide range of abundances, and this is further confounded by the EFT-2 DAF-2 TAX-6 myriad post-translational modifications that are dynamically regulated by cellular context and time. 
To capture the various states of proteins Figure 7 | Discovery of DAF-2-regulated protein pathways that modulate in a cell fully, proteomes must therefore be sampled in different condi- longevity in Caenorhabditis elegans. a, A quantitative proteomic analysis of tions and at several time points following perturbation. Several tech- changes in protein abundance was carried out in daf-2 mutant C. elegans, by 69 nical aspects of mass spectrometers need to be improved to meet the using metabolic labelling and MudPIT analysis . Shown are representative demands for higher throughput and proteome coverage without sac- examples of proteins that either decreased (EFT-2) or increased (TAX-6) b rificing information content. First, advances in instrument scan speed in abundance in daf-2 mutants. , Follow-up studies on differentially expressed proteins identified cases in which RNAi-mediated knockdown would allow more frequent sampling of ions. Higher rates of sampling of the corresponding mRNAs decreased (EFT-2) or increased (TAX-6) the would translate into more tandem mass spectra acquired per unit time, lifespan of worms. These results suggest that the proteins participate in which would, in turn, enable higher-resolution chromatography meth- compensatory pathways that limit the effects of daf-2 mutation on longevity ods to be used. Increased sampling rates should also improve dynamic and dauer formation. c, A model of how DAF-2-regulated proteins range, because lower-abundance ions are more likely to be detected. participate in compensatory pathways that affect longevity was assembled Second, coupling these changes to continued improvements in sen- from the results of these experiments. sitivity and mass accuracy measurements, the gain in dynamic range could be multiplied. 
Increased resolution and mass accuracy should as accurate mass tagging and single-ion reaction monitoring of peptides also strengthen confidence in peptide identifications and facilitate the can increase throughput and reduce sample demands, but they limit discovery of protein modifications. Third, advances in ‘top-down’ mass the analysis to peptides or proteins that are known to be present in a spectrometry for sequence-based characterization of intact proteins mixture and, therefore, preclude serendipitous discoveries76,77. Because can allow patterns of modifications on a protein to be correlated with mass-spectrometry-based proteomic methodology continues to develop specific activities or functions. At present, top-down mass spectrometry at a rapid pace, there is much hope that these and other problems will is most effective for small proteins (< 25 kDa) and presents difficulties for be solved. analysing larger proteins75. Key areas for the improvement of top-down It is clear that biologists are becoming increasingly savvy users of mass spectrometry are the development of more general fragmentation mass-spectrometry instrumentation and, conversely, that mass spec- methods for large proteins, and of higher-throughput and more-robust trometrists are gaining familiarity with other biological techniques. methods to introduce intact proteins into the mass spectrometer. Final We therefore expect that distinctions between these types of scien- issues to consider relate to the throughput and sample demands of stand- tist will soon begin to lose meaning. How long might it be, for exam- ard mass-spectrometry-based proteomics experiments. Unbiased, glo- ple, before mass spectrometers stand alongside centrifuges and PCR bal methods such as a two-dimensional liquid-chromatography-based machines as core pieces of equipment in every biology lab? 
Wouldn’t shotgun proteomics require considerable time (several hours per sample) that be the ultimate sign of biological impact for this powerful analytical and material (> 0.1 mg protein per sample). Using other strategies such technology? ■ 36 | OCTOBER 2015 www.nature.com/milestones/mass-spec999 INSIGHT REVIEW NATURE|Vol 450|13 December 2007 COLLECTION
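The contrast drawn above between isotope-labelling and label-free quantification can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the intensities and spectral counts are hypothetical numbers, the pseudocount is one common way of guarding against zero counts, and none of it is taken from the studies cited in this Review.

```python
import math


def log2_ratio(a, b, pseudocount=0.0):
    """Return log2((a + pseudocount) / (b + pseudocount))."""
    return math.log2((a + pseudocount) / (b + pseudocount))


# Hypothetical values for a single protein, chosen only for illustration.

# Isotope-labelled pair: summed peak intensities of the two labelled forms.
enriched_intensity = 2.0e6
control_intensity = 1.0e6
label_ratio = log2_ratio(enriched_intensity, control_intensity)

# Label-free spectral counting: number of tandem mass spectra matched to the
# protein in each sample. A pseudocount of 1 (an assumption, not a standard
# prescribed by the text) avoids division by zero for undetected proteins.
enriched_spectra = 4
control_spectra = 2
count_ratio = log2_ratio(enriched_spectra, control_spectra, pseudocount=1.0)

print(f"isotope-label log2 ratio:  {label_ratio:.3f}")
print(f"spectral-count log2 ratio: {count_ratio:.3f}")
```

With counts this small, the same nominal twofold difference yields a smaller log ratio once the pseudocount is applied, which is one way to see why a modest change might not register as meaningful in a label-free experiment.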

1. Whisstock, J. C. & Lesk, A. M. Prediction of protein function from protein sequence and structure. Q. Rev. Biophys. 36, 307–340 (2003).
2. Galperin, M. Y. & Koonin, E. V. ‘Conserved hypothetical’ proteins: prioritization of targets for experimental study. Nucleic Acids Res. 32, 5452–5463 (2004).
3. Zhu, H., Bilgin, M. & Snyder, M. Proteomics. Annu. Rev. Biochem. 72, 783–812 (2003).
4. de Hoog, C. L. & Mann, M. Proteomics. Annu. Rev. Genomics Hum. Genet. 5, 267–293 (2004).
5. MacBeath, G. Protein microarrays and proteomics. Nature Genet. 32 (suppl.), 526–532 (2002).
6. Hall, D. A., Ptacek, J. & Snyder, M. Protein microarray technology. Mech. Ageing Dev. 128, 161–167 (2007).
7. Causier, B. Studying the interactome with the yeast two-hybrid system and mass spectrometry. Mass Spectrom. Rev. 23, 350–367 (2004).
8. Stevens, R. C., Yokoyama, S. & Wilson, I. A. Global efforts in structural genomics. Science 294, 89–92 (2001).
9. Aebersold, R. & Mann, M. Mass spectrometry-based proteomics. Nature 422, 198–207 (2003).
10. Yates, J. R. Mass spectral analysis in proteomics. Annu. Rev. Biophys. Biomol. Struct. 33, 297–316 (2004).
11. Domon, B. & Aebersold, R. Mass spectrometry and protein analysis. Science 312, 212–217 (2006).
12. Andersen, J. S. & Mann, M. Organellar proteomics: turning inventories into insights. EMBO Rep. 7, 874–879 (2006).
13. Cusick, M. E., Klitgord, N., Vidal, M. & Hill, D. E. Interactome: gateway into systems biology. Hum. Mol. Genet. 14, R171–R181 (2005).
14. Michnick, S. W. Proteomics in living cells. Drug Discov. Today 9, 262–267 (2004).
15. Neubauer, G. et al. Mass spectrometry and EST-database searching allows characterization of the multi-protein spliceosome complex. Nature Genet. 20, 46–50 (1998).
16. Wang, Y. et al. BASC, a super complex of BRCA1-associated proteins involved in the recognition and repair of aberrant DNA structures. Genes Dev. 14, 927–939 (2000).
17. Rout, M. P. et al. The yeast nuclear pore complex: composition, architecture, and transport mechanism. J. Cell Biol. 148, 635–651 (2000).
18. Bouwmeester, T. et al. A physical and functional map of the human TNF-α/NF-κB signal transduction pathway. Nature Cell Biol. 6, 97–105 (2004).
19. Das, R. et al. SR proteins function in coupling RNAP II transcription to pre-mRNA splicing. Mol. Cell 26, 867–881 (2007).
20. Danial, N. N. et al. BAD and glucokinase reside in a mitochondrial complex that integrates glycolysis and apoptosis. Nature 424, 952–956 (2003).
21. Harada, H. et al. Phosphorylation and inactivation of BAD by mitochondria-anchored protein kinase A. Mol. Cell 3, 413–422 (1999).
22. Gatenby, R. A. & Gillies, R. J. Why do cancers have high aerobic glycolysis? Nature Rev. Cancer 4, 891–899 (2004).
23. Wang, X. The expanding role of mitochondria in apoptosis. Genes Dev. 15, 2922–2933 (2001).
24. Coulombe, B., Jeronimo, C., Langelier, M. F., Cojocaru, M. & Bergeron, D. Interaction networks of the molecular machines that decode, replicate, and maintain the integrity of the human genome. Mol. Cell. Proteomics 3, 851–856 (2004).
25. Ranish, J. A. et al. Identification of TFB5, a new component of general transcription and DNA repair factor IIH. Nature Genet. 36, 707–713 (2004).
26. Gygi, S. P. et al. Quantitative analysis of complex protein mixtures using isotope-coded affinity tags. Nature Biotechnol. 17, 994–999 (1999).
27. Cenkci, B., Petersen, J. L. & Small, G. D. REX1, a novel gene required for DNA repair. J. Biol. Chem. 278, 22574–22577 (2003).
28. Giglia-Mari, G. et al. A new, tenth subunit of TFIIH is responsible for the DNA repair syndrome trichothiodystrophy group A. Nature Genet. 36, 714–719 (2004).
29. Vermeulen, W. et al. Sublimiting concentration of TFIIH transcription/DNA repair factor causes TTD-A trichothiodystrophy disorder. Nature Genet. 26, 307–313 (2000).
30. Krogan, N. J. et al. Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature 440, 637–643 (2006).
31. Krebs, M. P., Noorwez, S. M., Malhotra, R. & Kaushal, S. Quality control of integral membrane proteins. Trends Biochem. Sci. 29, 648–655 (2004).
32. Kelly, J. W. & Balch, W. E. The integration of cell and chemical biology in protein folding. Nature Chem. Biol. 2, 224–227 (2006).
33. Riordan, J. R. Assembly of functional CFTR chloride channels. Annu. Rev. Physiol. 67, 701–718 (2005).
34. Qu, B. H., Strickland, E. H. & Thomas, P. J. Localization and suppression of a kinetic defect in cystic fibrosis transmembrane conductance regulator folding. J. Biol. Chem. 272, 15739–15744 (1997).
35. Loo, M. A. et al. Perturbation of Hsp90 interaction with nascent CFTR prevents its maturation and accelerates its degradation by the proteasome. EMBO J. 17, 6879–6887 (1998).
36. Meacham, G. C., Patterson, C., Zhang, W., Younger, J. M. & Cyr, D. M. The Hsc70 co-chaperone CHIP targets immature CFTR for proteasomal degradation. Nature Cell Biol. 3, 100–105 (2001).
37. Wang, X. et al. Hsp90 cochaperone Aha1 downregulation rescues misfolding of CFTR in cystic fibrosis. Cell 127, 803–815 (2006).
38. Washburn, M. P., Wolters, D. & Yates, J. R. Large-scale analysis of the yeast proteome by multidimensional protein identification technology. Nature Biotechnol. 19, 242–247 (2001).
39. Hanash, S. Disease proteomics. Nature 422, 226–232 (2003).
40. Kumar, N. et al. Molecular complexity of sexual development and gene regulation in Plasmodium falciparum. Int. J. Parasitol. 34, 1451–1458 (2004).
41. Khan, S. M. et al. Proteome analysis of separated male and female gametocytes reveals novel sex-specific Plasmodium biology. Cell 121, 675–687 (2005).
42. Ward, P., Equinet, L., Packer, J. & Doerig, C. Protein kinases of the human malaria parasite Plasmodium falciparum: the kinome of a divergent eukaryote. BMC Genomics 5, 79 (2004).
43. Jessani, N. & Cravatt, B. F. The development and application of methods for activity-based protein profiling. Curr. Opin. Chem. Biol. 8, 54–59 (2004).
44. Jessani, N., Liu, Y., Humphrey, M. & Cravatt, B. F. Enzyme activity profiles of the secreted and membrane proteome that depict cancer invasiveness. Proc. Natl Acad. Sci. USA 99, 10335–10340 (2002).
45. Liu, Y., Patricelli, M. P. & Cravatt, B. F. Activity-based protein profiling: the serine hydrolases. Proc. Natl Acad. Sci. USA 96, 14694–14699 (1999).
46. Patricelli, M. P., Giang, D. K., Stamp, L. M. & Burbaum, J. J. Direct visualization of serine hydrolase activities in complex proteome using fluorescent active site-directed probes. Proteomics 1, 1067–1071 (2001).
47. Jessani, N. et al. A streamlined platform for high-content functional proteomics of primary human specimens. Nature Methods 2, 691–697 (2005).
48. Leung, D., Hardouin, C., Boger, D. L. & Cravatt, B. F. Discovering potent and selective reversible inhibitors of enzymes in complex proteomes. Nature Biotechnol. 21, 687–691 (2003).
49. Saghatelian, A. et al. Assignment of endogenous substrates to enzymes by global metabolite profiling. Biochemistry 43, 14332–14339 (2004).
50. Chiang, K. P., Niessen, S., Saghatelian, A. & Cravatt, B. F. An enzyme that regulates ether lipid signaling pathways in cancer annotated by multidimensional profiling. Chem. Biol. 13, 1041–1050 (2006).
51. Manning, G., Whyte, D. B., Martinez, R., Hunter, T. & Sudarsanam, S. The protein kinase complement of the human genome. Science 298, 1912–1934 (2002).
52. Matsuoka, S. et al. ATM and ATR substrate analysis reveals extensive protein networks responsive to DNA damage. Science 316, 1160–1166 (2007).
53. Shiloh, Y. The ATM-mediated DNA-damage response: taking shape. Trends Biochem. Sci. 31, 402–410 (2006).
54. Oda, Y., Huang, K., Cross, F. R., Cowburn, D. & Chait, B. T. Accurate quantitation of protein expression and site-specific phosphorylation. Proc. Natl Acad. Sci. USA 96, 6591–6596 (1999).
55. Mann, M. Functional and quantitative proteomics using SILAC. Nature Rev. Mol. Cell Biol. 7, 952–958 (2006).
56. Wang, B. et al. Abraxas and RAP80 form a BRCA1 protein complex required for the DNA damage response. Science 316, 1194–1198 (2007).
57. Smogorzewska, A. et al. Identification of the FANCI protein, a monoubiquitinated FANCD2 paralog required for DNA repair. Cell 129, 289–301 (2007).
58. Olsen, J. V. et al. Global, in vivo, and site-specific phosphorylation dynamics in signaling networks. Cell 127, 635–648 (2006).
59. Rush, J. et al. Immunoaffinity profiling of tyrosine phosphorylation in cancer cells. Nature Biotechnol. 23, 94–101 (2005).
60. Jin, M. et al. Quantitative analysis of protein phosphorylation in mouse brain by hypothesis-driven multistage mass spectrometry. Anal. Chem. 77, 7845–7851 (2005).
61. Huang, P. H. et al. Quantitative analysis of EGFRvIII cellular signaling networks reveals a combinatorial therapeutic strategy for glioblastoma. Proc. Natl Acad. Sci. USA 104, 12867–12872 (2007).
62. Kim, S. C. et al. Substrate and functional diversity of lysine acetylation revealed by a proteomics survey. Mol. Cell 23, 607–618 (2006).
63. Garcia, B. A., Pesavento, J. J., Mizzen, C. A. & Kelleher, N. L. Pervasive combinatorial modification of histone H3 in human cells. Nature Methods 4, 487–489 (2007).
64. Ong, S. E., Mittler, G. & Mann, M. Identifying and quantifying in vivo methylation sites by heavy methyl SILAC. Nature Methods 1, 119–126 (2004).
65. Khidekel, N. et al. Probing the dynamics of O-GlcNAc glycosylation in the brain using quantitative proteomics. Nature Chem. Biol. 3, 339–348 (2007).
66. Peng, J. et al. A proteomics approach to understanding protein ubiquitination. Nature Biotechnol. 21, 921–926 (2003).
67. Mukhopadhyay, A. & Tissenbaum, H. A. Reproduction and longevity: secrets revealed by C. elegans. Trends Cell Biol. 17, 65–71 (2007).
68. Kenyon, C., Chang, J., Gensch, E., Rudner, A. & Tabtiang, R. A C. elegans mutant that lives twice as long as wild type. Nature 366, 461–464 (1993).
69. Dong, M. Q. et al. Quantitative mass spectrometry identifies new insulin targets in C. elegans. Science 317, 660–663 (2007).
70. Venable, J. D., Dong, M. Q., Wohlschlegel, J., Dillin, A. & Yates, J. R. Automated approach for quantitative analysis of complex peptide mixtures from tandem mass spectra. Nature Methods 1, 39–45 (2004).
71. Liu, H., Sadygov, R. G. & Yates, J. R. A model for random sampling and estimation of relative protein abundance in shotgun proteomics. Anal. Chem. 76, 4193–4201 (2004).
72. Wu, C. C., MacCoss, M. J., Howell, K. E., Matthews, D. E. & Yates, J. R. Metabolic labeling of mammalian organisms with stable isotopes for quantitative proteomic analysis. Anal. Chem. 76, 4951–4959 (2004).
73. Berns, K. et al. A large-scale RNAi screen in human cells identifies new components of the p53 pathway. Nature 428, 431–437 (2004).
74. Perrimon, N. & Mathey-Prevot, B. Applications of high-throughput RNA interference screens to problems in cell and developmental biology. Genetics 175, 7–16 (2007).
75. Zabrouskov, V., Senko, M. W., Du, Y., Leduc, R. D. & Kelleher, N. L. New and automated MSn approaches for top-down identification of modified proteins. J. Am. Soc. Mass Spectrom. 16, 2027–2038 (2005).
76. Conrads, T. P., Anderson, G. A., Veenstra, T. D., Pasa-Tolic, L. & Smith, R. D. Utility of accurate mass tags for proteome-wide protein identification. Anal. Chem. 72, 3349–3354 (2000).
77. Gerber, S. A., Rush, J., Stemman, O., Kirschner, M. W. & Gygi, S. P. Absolute quantification of proteins and phosphoproteins from cell lysates by tandem MS. Proc. Natl Acad. Sci. USA 100, 6940–6945 (2003).
78. MacCoss, M. J. et al. Shotgun identification of protein modifications from protein complexes and lens tissue. Proc. Natl Acad. Sci. USA 99, 7900–7905 (2002).

Acknowledgements We gratefully acknowledge the support of the National Institutes of Health.

Author information Reprints and permissions information is available at npg.nature.com/reprints. Correspondence should be addressed to B.F.C. ([email protected]) or J.R.Y. ([email protected]).


This Review was first published in Nature 491, 384–392 (2012); doi:10.1038/nature11708

Metabolic phenotyping in clinical and surgical environments

Jeremy K. Nicholson1, Elaine Holmes1, James M. Kinross1, Ara W. Darzi1, Zoltan Takats1 & John C. Lindon1

Metabolic phenotyping involves the comprehensive analysis of biological fluids or tissue samples. This analysis allows biochemical classification of a person’s physiological or pathological states that relate to disease diagnosis or prognosis at the individual level and to disease risk factors at the population level. These approaches are currently being implemented in hospital environments and in regional phenotyping centres worldwide. The ultimate aim of such work is to generate information on patient biology using techniques such as patient stratification to better inform clinicians on factors that will enhance diagnosis or the choice of therapy. There have been many reports of direct applications of metabolic phenotyping in a clinical setting.

Clinical diagnosis, prognosis and treatment selection are increasingly dependent on the use of molecular tools that help to classify diseases and their subtypes, and to define underlying individual variations in patient biology. The application of stratified and new therapeutic approaches that have been optimized through predictive modelling of deep biological information (for example, genetic, metabolic or physiological) on individual patient variation will not only have major health-care benefits, but also inevitably lead to socioeconomic, health-care deployment, regulatory and research changes in the clinic1. One of the most widely applicable areas for the development of precision medicine relates to the diverse applications of metabolic phenotyping (or metabotyping2) to clinical diagnostics, prognostics and molecular epidemiology. The metabotypes of individuals can be measured from the composition of accessible biofluids or tissues that are sampled in the clinic. Metabotypes vary extensively between individuals and populations, and result from the complex interplay between host genes, lifestyle, diet and gut microbes3,4. Thus, metabotyping has applications in both population-based disease-risk investigation studies and in solving problems related to personalized health care and patient stratification4. Hence, the ability to generate metabolic phenotypes from large sample cohorts that have been collected as part of epidemiological studies means that the ensuing enormous statistical power allows the identification of good candidates for metabolic biomarkers of disease risk in different populations (such as predictors of elevated blood pressure in so-called metabolome-wide association studies3). The gene–environment interactions that determine metabotypes are identical to those that determine disease risk in the general population, as well as individual susceptibility to disease and response to treatment. Thus, metabotypes are both statistically and biologically connected to disease risk factors and treatment outcomes, and thereby underpin the value of metabolic analysis in a diverse range of medical scenarios4. Metabolic phenotypes have been measured and mapped indirectly for many centuries, mainly unknowingly. For example, the urine wheel was used by physicians to relate the colours, smells and tastes of urine samples to likely diagnoses and treatments5. More recently, spectroscopic methods have been applied to generate multivariate profiles of metabolites, mainly using nuclear magnetic resonance (NMR) or mass-spectrometric methods that can measure a wide range of metabolites simultaneously. The data are then analysed using multivariate statistics (Fig. 1 and Box 1).

A number of terms are used to describe the various metabolic-analysis procedures. Metabolomics6, for example, essentially describes the metabolic composition of a given sample in terms of metabolite presence and concentration, the metabolome being the multivariate sum of these components. There are about 500 histologically distinct cell types in the human body. Each one of these cell types has specific functions and, consequently, a different gene expression pattern, proteome and metabolome. Cellular metabotypes may overlap within histological specimens, but they interact in space and time through the connecting vascular and lymphatic systems. Humans, therefore, contain more than 500 dynamic cellular metabolomes, as well as those of the individual tissue-specific extracellular fluid compartments (which are compositionally different from their surrounding cells) and the various secretory and excretory biological fluids (Fig. 2). Disease processes and medical treatments also occur over variable time frames, and metabotypes change dynamically with disease and treatment. Thus, the term metabonomics has been used since the late 1990s (ref. 7) to describe the metabolic responses of complex systems to perturbations through time, and how these responses can be mapped using appropriate analytical and statistical techniques. This stimulus could be disease, nutritional changes, drug therapy, genetic modulation or a myriad of other inputs. Specifically, metabonomics addresses such phenotypic changes at the level of small-molecule metabolites, and usually in the context of analysis of body fluids such as urine or blood plasma. The terms metabolomics and metabonomics are widely, and often interchangeably, used, having received about 26,000 and about 10,400 Google Scholar hits, respectively, at the time of writing. Although metabolic profiling has been applied in a wide variety of research fields over the past 30 years (ranging from microbiology to plant and food science, through to animal toxicology and mechanisms of disease), it is the clinical areas that are currently receiving the most attention. Applying high-throughput metabolic technologies to provide new diagnostic biomarkers and to uncover disease mechanisms is an attractive proposition. In this Review, we discuss some of the key and developing areas in clinical metabotyping, and its range of applications in progressing our understanding of human disease processes.

Metabolic analysis of biofluids, cells and tissue

A series of interacting metabolic networks that operate in multiple body compartments gives rise to a continuum of metabolic processes that contribute to the overall metabotype, and includes contributions from diet,

1Department of Surgery and Cancer, Faculty of Medicine, Imperial College London, South Kensington, London SW7 2AZ, UK.


[Figure 1 axes: analytical and medical timescales; modelling scale (cohort size), from low (n = 10^2) to high (n > 10^4); modelling complexity. Timescale labels: 1–2 s (REIMS-MS iKnife); 10 s (direct injection or nanospray MS, no sample preparation), for real-time surgical diagnostics; 5–10 min (little sample preparation); 10–20 min; 1–2 hours (little sample preparation). Other techniques shown: biofluid NMR, MAS NMR of tissues, UPLC-MS and CE-MS of biofluids, GC-MS (fluids or tissue extracts) and MALDI-TOF-MS (tissue). Application labels: clinical diagnostic biomarkers, histopathology augmentation, surgical and minimally invasive diagnostics, stratification and optimization, mathematical modelling and visualization, medical and surgical prognostics, medical and surgical theranostics, and large-scale population studies.]

Figure 1 | Technology platforms and analytical timescales for patient journey phenotyping, diagnostic and prognostic biomarker discovery, and population disease-risk biomarker modelling. Different analytical technologies (Box 1) can be applied to a variety of clinically derived biosamples, and the choice of technology is dependent on the timescale for reaching a solution to the clinical problem, as well as the analytical performance characteristics of the technology. Thus, surgical problems require either real-time or near real-time solutions for clinical decision-making, whereas histopathological augmentation has a multi-hour timescale. The cohort size for predictive modelling for technique optimization depends on the biological information obtained. Epidemiological problems, such as public health-care epidemiology and identifying new biomarkers as well as surgical risk stratification and pre-operative optimization, involve the analysis of hundreds or thousands of samples from different populations, and the transfer of information can cause bottlenecks to data processing and total-cohort analysis. In the critical care or surgical setting, large-scale population studies can also be used to identify populations at risk of surgical morbidity or a poor outcome. GC, gas chromatography; iKnife, intelligent knife; MALDI–TOF–MS, matrix-assisted laser desorption ionization time-of-flight mass spectrometry; MAS–NMR, magic-angle-spinning–nuclear-magnetic-resonance; MS, mass spectrometry; REIMS, rapid evaporative ionization mass spectrometry; UPLC, ultraperformance liquid chromatography.

drugs and gut-microbial activities8. The local phenotypic expression of the network interactions is obtained by analysing samples in these compartments, such as plasma or urine (which are the two most widely used clinical diagnostic fluids; Fig. 2) or tissue. This analysis leads to the generation of a series of static snapshots of metabolic activity that can be difficult to interpret in isolation, unless there is overt metabolic disease, because of the background presence of physiological variability. Obtaining time series of samples from individuals who are undergoing diagnostic or prognostic evaluation, or at different stages of a disease process allows a longitudinal metabolic pattern or trajectory to emerge that carries much more information on site, severity and, potentially, mechanism of damage9. Similar arguments can be applied to responses to therapy. However, in reality, only a relatively small number of tissue or fluid types can be sampled, and the exhaustive analysis of these samples by advanced metabolic and spectroscopic techniques (Box 1) still gives only ‘islands of information’ that represent local activities (tissue or specialized biofluids) (Fig. 2) or systemic activities that affect the extracellular environment (urine and plasma). Thus, one of the challenges of metabolism-based ‘top-down’ systems biology10 is to try to build mathematical bridges between these islands to create system-level models that, in turn, can be used to generate biochemical or medical hypotheses for further testing using ‘bottom-up’ systems-biology methods. Furthermore, urine and plasma samples carry very different information sets on various molecules and pathways, representing numerous systemic timescales. For example, data from plasma provide a description of the metabolic system at the time of sampling, although persistent alterations induced by dietary or chronic interventions may also be detected. By contrast, information from urine is time-averaged because of its collection and storage in the bladder. There are also multiple complex physicochemical interactions and differences in analytical matrix properties that not only determine how samples need to be prepared and analysed, but also carry other types of dynamic diagnostic information that is not demonstrated by simple (molecular identity and concentration) compositional analysis. Uniquely, NMR spectroscopic approaches do not disturb these complex perturbations in dynamic physicochemical interactions between molecules in biofluids, whereas this occurs of necessity in mass spectrometry. However, from a diagnostic point of view, dynamic chemical features have received relatively little attention in comparison with purely compositional biomarker analysis, despite the fact that some biofluids (such as semen) are highly reactive post collection due to intrinsic enzymatic activities11. In terms of biological data generation, metabonomic and related methods are highly complementary to other ‘omics’ tools such as genomics, metagenomics, proteomics and transcriptomics, each of which covers different aspects of systemic and cellular-activity space, and all of which are interrelated. Systems-biology approaches also seek to integrate these data sets by using appropriate multivariate statistical and network modelling to obtain a more holistic view of human disease, although in the clinical environment multiple omics screening has, to date, rarely been feasible. No one tool, metric or platform gives a complete biological picture of a condition, and all approaches generate hypotheses that need rigorous testing and validation in the field. An advantage to metabolic profiling is that the relatively low cost per assay or procedure means it lends itself to large-scale testing and, in a clinical setting, useful information from commonly available samples (such as urine and plasma) can be obtained.

Overview of clinical applications

Metabotyping approaches have been used widely in animal models of disease, drug toxicity and drug action, resulting in many advances in the


BOX 1 Technology

NMR spectroscopy and mass spectrometry are the main techniques that are used for the metabolic profiling of biofluids (for example, urine, blood plasma, amniotic fluid and cerebrospinal fluid) and tissues in the form of extracts or intact biopsies94,95. New solid-state NMR methods allow analysis of samples of less than 1 mg, thereby allowing detailed studies of tissue heterogeneity, such as between tumour tissue and tumour margins96.

Both NMR and mass spectrometry can simultaneously identify and quantify information on a wide range of small molecules with good analytical precision and accuracy, and require only a small amount of sample (typically 10–400 μl). NMR spectroscopy is highly reproducible, with a detection limit in the sub-micromolar range. All hydrogen-containing metabolites in a biofluid are detected simultaneously and non-destructively with little sample preparation. Mass spectrometry has much lower detection limits, but it is destructive and a more targeted approach is often needed with prior separation of metabolites, using either chromatography or capillary electrophoresis97. Thus, mass-spectrometry approaches tend to be less reproducible, more platform-dependent and susceptible to variability. Some of the earliest clinical studies used gas-chromatography–mass-spectrometry, especially for detection of inborn errors of metabolism98, and, although still widely used, the requirement for chemical derivatization of the sample to allow metabolite volatilization imposes limitations on its widespread use in clinical diagnostics. Liquid chromatography, particularly ultra-high-performance liquid chromatography (UPLC), is being used increasingly for metabolic profiling.

Chemometrics (multivariate statistics applied to chemical data) are used in clinical metabonomics to reduce the dimensionality of complex spectroscopic data sets, and to identify biochemical patterns that relate to a disease or an intervention. Linear-projection methods, such as principal-components analysis and partial-least-squares discriminant analysis, are commonly used to map samples on the basis of their biochemical similarity and to extract patterns of metabolites that relate to a particular disease14. Principal-components analysis is used extensively in metabonomics. This technique transforms the data descriptors into a set of linear combinations of the original features based on decreasing levels of variance. Any clustering seen is based on the data alone, and there is no pre-assignment of sample classes. Alternatively, in ‘supervised’ methods, multiparametric data sets can be modelled so that the class of separate samples (a ‘validation’ set) can be predicted based on a series of mathematical models derived from the original data or ‘training’ set. Partial-least-squares analysis is a widely used supervised method (using a training set of data with known end points). This method relates a data matrix containing independent variables from samples, such as spectral intensity values (an X matrix), to a matrix containing dependent variables (for example, measurements of response) for those samples (a Y matrix). Partial least squares can also be combined with discriminant analysis to establish the optimal position to place a surface that best separates classes. Other popular chemometric methods include hierarchical clustering, self-organizing maps and neural networks14,95.

Statistical spectroscopy is a form of computational modelling that is used to enhance biomarker recovery, allowing improved information extraction from a set of spectra. This method generally operates on defining correlation structures between variables (signals) that are found to be discriminatory between sample classes. Highly correlated signals are likely to come from the same molecule or from molecules regulated by the same metabolic pathway. Statistical total correlation spectroscopy is used to identify correlated signals within a data set for biomarker identification. NMR and mass spectrometry possess high complementarity in molecular-structure elucidation studies. In many metabonomic studies, multiple samples with a wide range of biochemical variation are available for both NMR and mass-spectrometry analysis, creating an opportunity for statistical analysis of signal amplitude co-variation between the two sets of data. Statistical heterospectroscopy is an extension of statistical total correlation spectroscopy for the co-analysis of multispectroscopic data sets, which have been acquired from multiple samples. The statistical heterospectroscopy approach, originally developed for NMR and mass-spectrometry correlation, can be used if any two or more independent spectroscopic data sets from any source are available for any sample cohort99.

A new and exciting approach that uses mass spectrometry is the analysis of smoke from a cauterization device used in surgery to identify the exact type of tissue being investigated92,93. By using a combination of new sampling methods, high-speed mass spectrometry and chemometrics for classification purposes, it is possible to identify different types of tissue in real time during a surgical procedure, a development known as the intelligent knife (i-knife). A parallel application of mass spectrometry is to locate molecules within a sample as an imaging technique by using ionization methods based on laser ablation from the tissue surface89. In combination with chemometric-enhanced information recovery, this has led to the possibility of an augmented histological assessment100.
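Two of the chemometric ideas in Box 1 can be illustrated with a minimal sketch: an unsupervised projection onto the first principal component, and a statistical-total-correlation-style calculation in which every spectral variable is correlated with a chosen 'driver' signal. This is not the authors' software. The intensity matrix `X`, the function names and the use of power iteration (rather than a full decomposition) are assumptions made purely for illustration.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def column(X, j):
    """Values of spectral variable j across all samples."""
    return [row[j] for row in X]

def stocsy(X, driver):
    """Correlate every spectral variable with the chosen driver variable."""
    d = column(X, driver)
    return [pearson(d, column(X, j)) for j in range(len(X[0]))]

def first_principal_component(X, iterations=200):
    """PC1 loadings and sample scores by power iteration on mean-centred data."""
    n, p = len(X), len(X[0])
    means = [sum(column(X, j)) / n for j in range(p)]
    Xc = [[row[j] - means[j] for j in range(p)] for row in X]
    v = [1.0 / math.sqrt(p)] * p  # arbitrary unit start vector (an assumption)
    for _ in range(iterations):
        t = [sum(r[j] * v[j] for j in range(p)) for r in Xc]            # scores: Xc v
        w = [sum(Xc[i][j] * t[i] for i in range(n)) for j in range(p)]  # Xc^T t
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    scores = [sum(r[j] * v[j] for j in range(p)) for r in Xc]
    return v, scores

# Hypothetical intensities: rows are samples (3 controls, then 3 cases),
# columns are four spectral variables. Variables 0 and 1 behave like two
# signals from one molecule; variable 2 moves in the opposite direction.
X = [
    [1.0, 2.0, 5.0, 3.0],
    [1.2, 2.4, 4.8, 3.1],
    [0.9, 1.8, 5.1, 2.9],
    [3.0, 6.0, 2.0, 3.0],
    [3.2, 6.4, 1.9, 3.1],
    [2.9, 5.8, 2.1, 2.9],
]

correlations = stocsy(X, driver=0)
loadings, scores = first_principal_component(X)
```

In this toy data set, signals that co-vary with the driver would be candidates for assignment to the same molecule or pathway, and the PC1 scores separate the two groups without any class labels being supplied. Real spectra would need preprocessing (alignment, normalization, scaling) that is omitted here.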

development of modelling techniques, chemometrics and ways to identify new biomarkers12–15. But as the technology and modelling platforms have matured and improved, there has been a shift towards the implementation of clinical studies. Hence, in this Review, we focus on outlining specific areas that have the potential to have significant impacts on translational medicine and clinical delivery in the hospital environment. These include screening patients with established diseases for the detection of new biomarkers to aid in clinical classification. We are now in a position to deliver a systems-biology framework for complex clinical problems that are often compromised by extreme gene–environment variation.

Early studies that applied metabolic profiling to clinical conditions were largely focused on identifying biomarkers, and were typically hindered by small group sizes and technical constraints. However, these studies paved the way for the wider use of metabotyping approaches to further the understanding of systemic disease, as well as diagnostic and prognostic biomarker discovery. Indeed, the general principles of metabotyping approaches had already been well-demonstrated by the 1980s, particularly for overt metabolic disease such as that seen in type 2 diabetes16 or inborn errors of metabolism (based on ¹H-NMR-spectroscopy-derived urine and serum profiles)17. It was also shown quite early on that biochemically relevant systemic information could be recovered, such as the increase in serum alanine and the reduction in branched-chain amino acids (corresponding to decreased amino-acid gluconeogenesis and increased ketogenesis) that followed insulin withdrawal in people with type 2 diabetes. In addition, through time modification of plasma lipid and lipoprotein profiles, therapeutic optimization could also be monitored in this way16. The field has expanded to encompass epidemiological and population-scale studies18, and to take into account some of the complications of diabetes, such as vascular lesions, that lead to premature death19. Detailed cross-species metabolic analysis has uncovered information on the potential mechanisms underlying type 2 diabetes that relate to nucleotide metabolism, and to modulation of N-methylnicotinamide, which is conserved across rats, mice and humans20. In the case of type 1 diabetes, a metabolic dysregulation of lipid and amino-acid metabolism was found to precede


[Figure 2 diagram. Labels: supra-organism regulation/interactions; system tissue/organ regulation; intercellular regulation and communication; subcellular control systems; genomic and metabolic networks; stochastic environmental interactions and exposures; microbiome interaction. Sampled compartments: diagnostic fluids (urine, time-averaged data; plasma, snap-shot data; organismal); other accessible analytical compartments (pathological fluids; specialized fluids and biopsies (selected fluids); artificial fluids).]

Figure 2 | Local and global metabolic interactions in relation to sampled compartments, fluids and their properties. In clinical settings, it is only possible to obtain limited ‘sampling islands’ for metabolic measurements. Within the body, there is a complex and dynamic continuum of metabolic interactions, from the subcellular level through multiple layers of biomolecular organization up to the whole supraorganism system, including the symbiotic microbiome components. At the system tissue and organ level, multicellular interactions occur through time and space through the secretion of biochemical products, as well as hormone and neurological control of function and physiological homeostatic regulation. Environmental and exogenous factors, including lifestyle, diet, drug therapy and the microbiota, all influence metabolism. For example, the microbiota, as part of the supraorganism, has a commensal and symbiotic relationship with tissues of the gut; the body’s interactions with pathogens and parasitic organisms, as well as quorum sensing, also have a role. At the intercellular level, signalling molecules and transporter systems coordinate functions and metabolic flux between cells. Finally, within the cell itself, enzymes require specific substrates and cofactors; biochemical conversions in organelles are topographically constrained, and the metabolome requires specific functional pathway units. The two most accessible components are urine and plasma, but they carry different system information sets as a result of these different regulation and control systems. Because urine is stored in the bladder, it represents time-averaged data and has the following physiological characteristics: a variable pH, ionic strength and osmolarity; a high dielectric constant; an extreme dynamic concentration range (more than 10¹¹); thousands of molecules of less than 1 kDa; metal complexes and supramolecular aggregates; many small proteins; high enzyme activities in pathological states; and a dynamically reactive matrix. Plasma, however, provides snap-shot data and has the following characteristics: relatively constant pH, ionic strength and osmolarity; a lower bulk dielectric constant; a high dynamic concentration range (more than 10⁵); hundreds of molecules both smaller and larger than 1 kDa; metal complexes and supramolecular complexes; a multi-compartment multi-diffusional matrix; and many large proteins and protein complexes. There are also a series of specialized secretory and pathological fluids that can be sampled and give, on spectroscopic analysis, more localized biochemical information specific to tissue injury. Specialized fluids are cerebrospinal, thyroid, saliva (sublingual, parotid and submaxillary), respiratory washings, gastric, bile, pancreatic, amniotic, follicular, milk, seminal vesicle, prostatic, epididymal and semen. Artificial fluids include bronchiolar lavage fluid, peritoneal dialysates, haemodialysates, faecal water, rectal dialysates, cell extracts and cell supernatants. Pathological fluids include ascites, pus, cystic fluid and effusions (malignant and infective).

onset of the disease21, whereas a panel of five branched-chain amino acids was found to be predictive of type 2 diabetes22. Unsurprisingly, insulin resistance, and both type 1 and type 2 diabetes, have been the subject of intensive metabolic investigation for many years, and the contributions of metabonomics and metabolomics have been reviewed extensively23.

In the field of cancer (reviewed on page 364 of this issue), most studies were originally centred on extracts of the tumour tissue itself, and ¹H-NMR spectroscopy coupled to pattern recognition methods showed the ease with which discrete cancer-tissue types could be discriminated24. Perhaps what is now more clinically relevant is the identification of potential cancer biomarkers in biofluids, including successful mapping of plasma ovarian cancer signatures, which are characterized by an altered pattern of ceramides and lysophospholipids, increased ketone bodies, and decreased alanine, valine and low-density lipoproteins25,26. Patients with lung cancer have also been distinguished from a control group by their low urinary levels of hippurate and trigonelline, together with elevated d-3-hydroxyisovalerate, α-hydroxyisobutyrate and N-acetylglutamine27. An inverse relationship between endometrial cancer and the metabolites stearic acid and serum acylcarnitines has been identified28, and dysregulation of acylcarnitines also has a role in kidney cancer29. Some metabonomic studies have found that models built on serum metabolite profiles perform better in terms of sensitivity than conventional markers, such as carcinoembryonic antigen in colorectal cancer30, for predicting early-stage tumours (stage 0–2). Other cancer metabonomic studies, including those on ovarian and breast cancers25,31 and renal-cell carcinoma32, have shown promise in differentiating early- from late-stage tumours. Perhaps even more exciting than the diagnostic potential is the ability of metabolic models to predict clinical outcomes for certain cancers. Micrometastases were predicted in a study of people with breast cancer, in which patients who went on to develop metastases were shown to have higher levels of plasma glucose, proline, lysine, phenylalanine and N-acetylcysteine and lower levels of lipids33. Similarly, a recent study showed that, based on pretreatment serum samples for 500 women with metastatic breast cancer, time to progression, overall survival and treatment toxicity could be predicted from the serum levels of phenylalanine and glutamate (higher) and glucose (lower) for a subset of patients who were HER2 positive, although correlation between pre-treatment serum profile and outcome was not possible for the general trial population34. As for diabetes, there has been an eruption of metabolic research on the processes of onset and progression of tumour development over the past decade, as well as identification of cancer biomarkers, and this has been comprehensively reviewed with respect to cellular biochemistry35, therapeutic target discovery36 and tumour typing32,37. There have also been advances in metabotyping as a tool for basic cardiac research38, for predicting cardiovascular events from the baseline profiles of individuals at risk of coronary artery disease39, and for understanding the origins of pathology, including the complex environmental and non-infectious microbiological triggers of disease40. For the most part, diagnostic methods have been developed for serum and urine. However, for certain classes of disease, such as lung disease, gas chromatography methods for characterizing volatile components of exhaled breath condensate have shown considerable promise. For example, children with asthma and allergic rhinitis were distinguished from controls based on the alkane and aldehyde composition of breath condensate41; and differentiation


between each of the stages from 1 to 3 in chronic obstructive pulmonary disease identified ketones, methyl-branched alkanes and alcohols, among other compounds, in exhaled breath42. An advantage of metabolic profiling is that multiple compartments and fluids can be analysed to give complementary information on systemic dysfunction. Thus, the metabolic signature of lung disease also extends to urine and serum. Urinary concentrations of the tricarboxylic-acid (TCA) cycle intermediates α-ketoglutarate, succinate, fumarate and cis-aconitate were found to be differential between people with stable and unstable asthma43, whereas decreased serum lipoproteins and N-dimethylglycine, and increased glutamine, 3-methylhistidine and branched-chain amino acids have been associated with chronic obstructive pulmonary disease44.

Human metabolic phenotypes, and multiple disease processes, are highly dependent on gut-microbial activity. An emerging area for metabolic profiling is the characterization of the functional properties of the gut microbiome. This is the combined genomic composition (more than 3.3 million genes45) of several thousand species that make up the gut microbiota, and varies with age and between human populations46. Abnormalities of the gut microbiome have been associated with a remarkable variety of human conditions, ranging from obesity and diabetes to autoimmune diseases and neuropsychiatric disorders47. Metabotyping of inflammatory bowel diseases has been carried out on urine, plasma and faecal samples both to characterize the metabolic consequences of ulcerative colitis and Crohn’s disease and to identify disease-induced changes in the metabolites deriving from the gut microbiota48–50. Differences in the levels of metabolites, such as the short-chain fatty acids, 4-cresyl sulphate and hippurate, are indicative of a perturbed microbiota in inflammatory bowel conditions, whereas altered levels of TCA-cycle intermediates and amino acids reflect a shift in energy balance. Crucially, the activities of the gut microbiota influence the host metabolic phenotypes51 through a series of complex signalling axes that connect to multiple host compartments, including the liver and brain47, as well as the immune system52. The metabolic axes involve bile-acid-dependent signalling (bile acids are themselves heavily metabolized by the microbiota), binding to a variety of nuclear receptors, such as those in the liver, which in turn affect host gene-expression profiles53. There are also signalling axes that involve gut-microbe-generated short-chain fatty acids (from colonic fermentation of polysaccharides and oligosaccharides), aromatic amines and acids (from aromatic amino-acid and protein putrefaction in the distal colon54), as well as gut-microbe-derived links (through the endocannabinoid system) which affect host adipogenesis55, and ultimately multiple CNS signalling axis connections. Children who are diagnosed with the neurobehavioural disorder autism manifest different urinary metabolite phenotypes compared with controls, including increased excretion of gut-microbial metabolites (such as phenylacetylglutamine and 4-cresyl sulphate) as well as altered amino-acid and nicotinic-acid profiles56,57. The microbiome influences metabolism from birth, and early events in the development of the microbiome–host signalling axes can leave a lasting metabolic imprint. Babies who are born before 37 weeks of gestation have a higher risk of developing metabolic syndrome and end-stage renal failure than those born at full term. Individuals who are born preterm can still be differentiated from those born at full term when they reach adulthood by profiling of the microbial degradation products choline, bile acids and acetylated glycoproteins58. These metabolic signalling axes may form the basis of drug discovery and other therapeutic strategies designed to operate either directly on the microbiome59 or indirectly through interactions with host metabolic pathways and immune signalling60. Indeed, the microbiome may offer more druggable targets than the human genome, but determining this lies in future research and will be highly dependent on the successful application of metabotyping approaches to help elucidate these complex microbial–host symbiotic interactions.

Phenotyping patient journeys

All patients who enter the diagnostic environment undergo a series of tests that is designed to characterize and stage disease, as well as to select suitable therapies, which then result in either a ‘good’ or ‘bad’ clinical outcome for the individual, a process known as the patient journey. By use of the advanced metabotyping methods already described, coupled to classic clinical diagnostic criteria, it is now possible to conceptualize a phenotypically enhanced patient journey in which multiple technology platforms are deployed throughout the patient-handling pipeline61. The first level of deployment is to create enhanced diagnostic biomarker profiles at each stage of the journey to assess how the patient responds to therapies, as well as to form differential diagnoses. However, by using pharmacometabonomic approaches62, it is also possible to consider the sum of many patient journeys, and to engage in prospective or prognostic analysis of patient outcomes. In pharmacometabonomic studies, pre-intervention profiles of biofluids, such as urine or plasma, are used to create mathematical models of therapeutic interventions (using cross-validated models (Box 1)) so that prognostic outcomes can be judged. Such studies have predicted xenobiotic hepatotoxicity in experimental animal models through pre-intervention urinary profiling62, and predicted drug (paracetamol) metabolism in humans using NMR-based spectroscopic profiling of urine (also demonstrating a complex connection between gut microbes and drug metabolic fate)63. An example of a successful pharmacometabonomic patient-stratification model is the prediction of response to capecitabine therapy in patients with colorectal cancer, whereby high levels of serum polyunsaturated fatty acids were predictive of drug toxicity64. The approach is also relevant to understanding surgical interventional outcomes (discussed later)61, and is generally well-suited to modelling multiple longitudinal congruent patient journeys and for the abstraction of patient stratification information to help inform decision-making.

In particular, patient-journey phenotyping lends itself to scalable and translatable models that can be applied to any acute hospital admission, covering a variety of disease states (Fig. 3). Furthermore, this generalized patient-journey phenotyping protocol can be applied in a drug-development testing environment in which we envisage the development of a phenotypically augmented clinical trial. Standard clinical-trial information would be supplemented by molecular data, leading to a significant enhancement of the mechanistic background to observed responder or non-responder phenotypes that are seen in many clinical trials. Nowhere in the clinic is the ability to deliver a rapid prognostic metric of clinical condition more important than in the critical-care setting. Here, a gain in minutes or hours when choosing and implementing a therapeutic strategy can mean the difference between life and death, and poor decisions carry substantial financial cost. Recent studies suggest that metabolic profiling tools can augment the identification of sepsis based on a set of serum acylcarnitines and glycerophosphatidylcholines measured by liquid-chromatography–mass-spectrometry65, whereas serum levels of triacylglycerides, glucose and glutamate are predictive of survival in patients who experienced trauma66.

Mapping metabolic phenotypes during surgery

Surgery is, by definition, the most personalized of all therapeutic options delivered in current clinical practice, yet there is a lack of molecular diagnostic and prognostic instruments that are required for modern precision-based surgical practice1. As a result, clinical decisions are made on the basis of calculations of surgical risk using single or univariate biomarkers or data from retrospective logistic regression models. Surgery also poses other challenges. First, clinical diagnostic phenotypes in surgery are often hard to define precisely and tend to be inadequate for omics-based research. The best example of this is found in the management of patients who have experienced major trauma, in whom injuries are highly heterogeneous and exert a variable and often unpredictable inter-individual effect67. Secondly, patients who undergo modern surgery are exposed to a large and variable environmental load of operative drugs and bacteria, as well as nutritional optimization strategies, which cannot be objectively measured by standard biochemical assays68. While surgical outcomes have improved, the population has aged, and patients increasingly have a more diverse range of co-morbidities, with higher associated rates of malnutrition69, polypharmacy70 and medical interventions.


Finally, the operating environment presents a unique and stringent set of requirements for molecular platforms. Not only must they be able to objectively quantify the large number of rapidly changing environmental influences on patient outcome during a single operative journey, but they must be rapid, highly reliable, inexpensive and physically able to function under the conditions of an operation. Longitudinal metabotyping describes a metabonomic expression of an individual’s metabolism, the trajectory of which can be accurately measured as a patient passes through a multivariate operative journey61. These data can be integrated into current clinical data sets to augment surgical decision-making, and deviations from a personalized trajectory can be used to detect early clinical deterioration, predict risk or guide intervention. There is now clinical evidence that this approach has merit. Mass-spectrometry-targeted analysis of 69 serum metabolites in a large population of patients (478 patients) who were undergoing cardiac surgery was able to accurately predict a poor operative outcome over a mean follow-up period of 4.3 ± 2.4 years71. Short-chain dicarboxylacylcarnitines, ketone-related metabolites and short-chain acylcarnitines were all independently predictive of an adverse outcome after multivariate adjustment. However, personalized or stratified approaches to health care must not only focus on mammalian biology. Initial work suggests that a metabonomic strategy is also able to predict early-onset systemic inflammatory response syndrome (SIRS) in those patients who are exposed to major trauma. Partial-least-squares discriminant analysis was also able to clearly distinguish between patients with SIRS and multi-organ dysfunction syndrome, according to variation in levels of carbohydrate, amino acids, glucose, lactate, glutamine signals, fatty acyl chains and lipids72. Finally, metabonomics has had a particular impact on experimental models of kidney transplant surgery, in which it is being assessed for its ability to predict graft failure73 and end-organ drug toxicity74, and to assess hypoxic injury in cadaveric specimens. It has also been widely used in experimental models of liver75 and gut76 transplantation. Metabonomics may have a significant impact on transplant surgery, in which there is a demand for rapid molecular diagnostics within the operating room to predict graft suitability and survival.

The British surgeon Joseph Lister first described the importance of antisepsis more than 100 years ago, yet the wider role of gut bacteria in surgical health is only just being recognized and little is known about the importance of commensal bacteria to post-operative recovery. This is largely because clinicians rely on culture-dependent analysis to glean useful information about pathogenic organisms, or complex and dynamic ecosystems. Rapid, culture-independent analytical approaches for the detection of species-specific changes in pathogenic or commensal bacteria are therefore of particular use in surgery. However, clinicians need to know not only which bacteria are present, but also what they are doing. Global metabotyping therefore extends beyond the model of culture-independent analysis, by allowing the exploration of the functional and symbiotic biochemical relationship between humans and microbiota, as patients progress through their recovery phase. For example, surgical

[Figure 3 diagram. Patient journey timeline: entry into diagnostic environment; pre-intervention diagnostics; intervention; post-intervention outcome; recovery; rehabilitation; with branches for critical care, deviation from recovery and death. Samples taken along the journey: tissue, stool, blood and urine, feeding metabolic phenotyping and biobanking. Modelling layers: real-time modelling (diagnostics) and longitudinal patient modelling (prognostics). Outputs: post-operative optimization; targeted treatments; biomarker predictors and risk phenotypes; personalized long-term health phenotyping; disease prevention; therapeutic targeting; phenotypically augmented clinical trials. Three areas are marked 'unmet need'.]

Figure 3 | Phenotyping the patient journey and phenotypically augmented clinical trials. Patients enter the diagnostic environment either through community admission, electively, as an acute case or as an emergency. At any point in the patient journey, there are multiple opportunities for metabolic phenotyping using technologies such as mass spectrometry or NMR spectroscopy. Sections of these samples can also be stored in biobanks prior to analysis for use in future research. Deviation from recovery can occur at any point in the patient journey, and samples can be taken again. This analysis can be used to enhance differential diagnosis, therapeutic responses and long-term outcomes of therapy. Taking biosamples also provides real-time diagnosis and prognosis to enhance clinical decision-making61. Before admission to hospital, this can be used for disease prevention and after intervention to optimize recovery. However, current biomarkers have left several areas of unmet clinical need in personalized prevention and therapeutic strategies, and in the delivery of sensitive and specific diagnostics and prognostic platforms for both surgical and medical diseases. By modelling congruent longitudinal journeys using pharmacometabonomic approaches62 it is possible to derive prognostic biomarker predictors and risk phenotypes that allow patient stratification, or that can give mechanistic information relating to therapeutic responder or non-responder status.
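The 'deviation from recovery' idea in Figure 3 can be sketched numerically as below. This is a hypothetical illustration, not a method taken from the Review. It assumes that each time point is summarized by a small vector of metabolite levels, that a reference mean and standard deviation for an uncomplicated recovery are available at each time point, and that a root-mean-square z-score above an arbitrary threshold of 2 counts as a deviation. All names and numbers are invented.

```python
import math

def deviation_scores(patient, reference_mean, reference_sd):
    """Root-mean-square z-score of the patient's profile at each time point."""
    scores = []
    for obs, mu, sd in zip(patient, reference_mean, reference_sd):
        z_squared = [((o - m) / s) ** 2 for o, m, s in zip(obs, mu, sd)]
        scores.append(math.sqrt(sum(z_squared) / len(z_squared)))
    return scores

def first_deviation(scores, threshold=2.0):
    """Index of the first time point whose score exceeds the threshold, else None."""
    for t, score in enumerate(scores):
        if score > threshold:
            return t
    return None

# Hypothetical reference trajectory: four time points, three metabolites
# (arbitrary units), identical at every time point for simplicity.
reference_mean = [[1.0, 5.0, 0.5]] * 4
reference_sd = [[0.2, 0.5, 0.1]] * 4

# Hypothetical patient who tracks the reference and then departs from it.
patient = [
    [1.1, 5.2, 0.50],
    [1.0, 4.8, 0.55],
    [1.8, 6.5, 0.90],
    [2.2, 7.0, 1.10],
]

scores = deviation_scores(patient, reference_mean, reference_sd)
flagged_at = first_deviation(scores)
```

A per-metabolite z-score is the simplest choice here. A practical system would more likely score distance in a multivariate model space and would need a clinically validated threshold.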


bypass of the foregut is increasingly used to treat obesity and its metabolic complications, such as type 2 diabetes77. There are now convincing animal and human data to suggest that this surgical bypass fundamentally disrupts the distal gut microbiome, and that the subsequent disruption to the metabolome can be objectively measured using both NMR and mass-spectrometry-based approaches78,79. Following bariatric surgery, there is a marked dysregulation in the gut microbiota from a landscape dominated by Firmicutes and Bacteroidetes to one dominated by Proteobacteria78,80. Metabolic profiles reflect this shift in the community with a persistent alteration in urinary, serum and faecal levels of cresols, indoles and biogenic amines78,79. This suggests that personalized biomarkers for the prediction of long-term weight loss will have to account for the gut microbiome, and that the long-term health consequences of permanently altered enteric flow have yet to be fully defined78. The gut microbiome is influenced by nearly all medical peri-operative therapeutic and risk-reduction strategies (for example, broad-spectrum antibiotic use), yet the effect of this on operative morbidity is not yet understood. Therefore, the clinical deployment of metabonomic technologies that provide real-time functional insight into the human–microbiome surgical health axis during surgery will have a significant effect. It is likely that this will be crucial to older people, who are the most vulnerable, as well as the fastest growing, group of patients who have surgery. It is well-established that stable age-related changes occur in gut-microbiome function81. Recent data suggest the presence of an age-related diet–microbiome–health axis, in which the gut microbiome is markedly different in patients living in residential care than in those living in the community82. This could be linked to dietary changes, as well as to objective measures of frailty and poor health. The same is probably true of patients who are subjected to long stays in hospital, suggesting that measures of gut health are urgently required, and that new models of pre-operative nutritional optimization are needed that include metabonomic measures of gut health. Moreover, surgical excision of the colon may have much wider metabolic consequences for older people than currently understood.

Equally, there is a need for real-time interventional precision biomarkers that improve the quality and efficacy of the surgical excision itself. Oncological surgery is still based on Halsted’s principles of oncological clearance, and the technique has remained largely unchanged since it was introduced. Clearance is typically based on arbitrary measurements that are not objective and are often inadequate, with potentially serious implications for the patient. For example, more than 20% of breast cancers require re-excision for positive margins83. Metabonomics permits near real-time analysis with minimal sample preparation on very low volume samples, and several analytical platforms offer a potential solution to this problem. Magic-angle-spinning (MAS) ¹H-NMR spectroscopy is able to robustly determine the difference between benign and malignant tissue from patients with breast and colon cancer with a high degree of sensitivity and specificity84. This has been extensively used in the analysis of brain tumours to differentiate between malignant tumour types85, and these detailed biochemical profiles can then be related to the lower-resolution spectra obtained in vivo using NMR spectroscopy86. This significantly improves MRI-based characterization of grade IV glioblastomas, metastases, medulloblastomas, lymphomas and glial tumours. Low concentrations of citrate and high concentrations of choline-containing compounds are metabolic characteristics that have been observed by NMR spectroscopy of prostate-cancer tissue.

mechanisms, and it is conceivable that MAS-NMR spectroscopy can be performed within a 10–20-minute time frame if facilities are co-located near to, or within, clinical environments. Therefore, this analysis has translational potential as a clinical resource for rapid diagnostics within either the outpatient clinic or the operating room environment. However, few technologies exist that are significantly faster than frozen-section histopathological analysis for the intra-operative assessment of tumour margins. Mass spectrometry has been used to characterize intact biological tissues for more than 30 years, but the field gained real momentum in the late 1990s with the advent of matrix-assisted laser desorption/ionization (MALDI) imaging analysis of histological tissue sections. Mass-spectrometry imaging studies, including MALDI, have revealed the molecular fingerprint of tissues in terms of their metabolic constituents, have shown that lipids and proteins have a high histological specificity, and have identified a number of prognostic markers (both single and complex). Mass spectrometry imaging has been suggested as an alternative to frozen-section histology; however, the time demand of this type of analysis is currently several hours per sample, even at coarse (100 µm) resolution89. Multivariate, chemically augmented histology of this type therefore has two significant benefits over current histological staining methodologies. First, it provides instantaneous tissue identification, which allows interactive and feedback-controlled surgical and diagnostic interventions. Second, it avoids the poor inter-operator reproducibility of histological data, which can be exacerbated by low-quality (for example, smeared) histological sections.

In contrast to mass-spectrometry imaging, rapid evaporative ionization mass spectrometry (REIMS) was developed exclusively for in situ analysis, even for the in vivo chemical characterization of tissues90. REIMS was developed as a result of the discovery that all surgical instruments that use thermal evaporation approaches (including electrosurgery, laser surgery, radiofrequency ablation and microwave ablation) ionize the molecular constituents of biological tissues. The subsequent combination of surgical instruments with mass spectrometry has yielded an approach capable of identifying tissues and their pathological subtypes during surgical or diagnostic interventions90. Data generated from REIMS are strikingly similar to other imaging mass spectrometry (MALDI or desorption electrospray ionization) data91. Although REIMS is a more suitable technology for the surgical environment than MALDI imaging, the latter guarantees a high histological specificity with quantitative, and potentially automated, histopathological analysis of tissue specimens. Recent results suggest that the REIMS technology can be successfully implemented in the surgical environment90. Although the spatial resolution of REIMS is limited by the hand-held nature of the probe and its geometry, 100 µm is generally achievable with rapid (less than 0.9 s) feedback to the operator. Initial analysis of adenocarcinoma of the gastrointestinal system and lungs together with hepatic metastases, primary tumours of the liver, pre-cancerous degenerations of colon mucosa and sentinel lymph-node mapping has shown more than 95% concordance with classical histology and a less than 1% false-negative rate92. The technology has also been successfully trialled in neurosurgical brain-tumour excision of astrocytomas, meningiomas and metastatic brain tumours, as well as healthy brain tissue, with similar sensitivity and specificity93. Thus, mass-spectrometry metabonomics provides real-time, descriptive in vivo data that are directly comparable with post-interventional histological analysis.
A similar approach has, therefore, been Concluding remarks used in prostate cancer, in which this technique demonstrated an overall Systems-biology approaches have allowed a deeper understanding accuracy of between 93% and 97% for detecting the presence of prostate- of the metabolic and physiological function of the human symbiotic cancer lesions87. ‘supraorganism’ (which includes the sum total of the eukaryotic In a study88 to find regulatory genes with the potential for targeted and prokaryotic genomes required for human health). Disorders therapies, the gene products acetylcitrate lyase and m-aconitase were both of supraorganism function underlie the aetiology of many modern found to be predictive of significantly reduced citrate level. In the same non-communicable diseases, and metabolism-based mechanistic study, which used 133 fresh-frozen samples from 41 patients undergo- understanding of these processes therefore has much to offer personalized ing radical prostatectomy, the two genes whose expression most closely health-care systems of the future. Super-system surgery is the influence accompanied the increase in choline-containing compounds were of surgical interventions or trauma on this complex symbiotic network, PLA2G7 and CHKA. Thus, MAS-NMR spectroscopy, when incorporated and the resultant time-dependent disruption in microbial–mammalian into a systems-level analysis, provides new insight into cancer disease co-metabolic pathways. To take full advantage of theories such as this, 44 | OCTOBER 2015 www.nature.com/milestones/mass-spec 390 | NATURE | VOL 491 | 15 NOVEMBER 2012

384-392 Insight Nicholson NS.indd 390 08/11/2012 17:44 COLLECTIONREVIEW INSIGHT

new systems-medicine technologies for surgical biomarker and drug target discovery are required, and these are already being developed. Metabonomics is used to probe the real-world nature of biochemical functionality and is sensitive to both gene and environmental influences; it is, therefore, likely to be more practical than gene-based measurements of responses to therapy. Nevertheless, a multi-omics approach can provide more information than a single one. Thus, it is currently possible to integrate heterogeneous data sources (for example, metagenomic and metabonomic, transcriptomic or proteomic data sets) to provide a complete top-down overview of complex disease states. We affirm that metabonomics has the power to influence clinical decision-making in the hospital environment for both medical and surgical treatments. The exquisite sensitivity of metabolic profiles to different diseases and treatment options means that computer models can be generated to aid decision-making processes for the medical practitioner. Moreover, by recording a patient’s metabotype as treatment progresses, it will be possible to monitor the beneficial or detrimental effects of treatment, so that, for example, drug regimes or diet can be altered and a prognosis of disease outcome can be made. However, future systems must be able to link omics-level data sets and clinical databases seamlessly, and incorporating electronic health records into experimental data sets would seem to be an essential, although formidable, task for the future. ■

1. Mirnezami, R., Nicholson, J. & Darzi, A. Preparing for precision medicine. N. Engl. J. Med. 366, 489–491 (2012).
2. Gavaghan, C. L., Holmes, E., Lenz, E., Wilson, I. D. & Nicholson, J. K. An NMR-based metabonomic approach to investigate the biochemical consequences of genetic strain differences: application to the C57BL10J and Alpk:ApfCD mouse. FEBS Lett. 484, 169–174 (2000).
3. Holmes, E. et al. Human metabolic phenotype diversity and its association with diet and blood pressure. Nature 453, 396–400 (2008).
   This study is the first example of the metabolome-wide association study concept in which disease risk factors (such as elevated blood pressure) were analysed in relation to exploratory (NMR) spectroscopic data.
4. Holmes, E., Wilson, I. D. & Nicholson, J. K. Metabolic phenotyping in health and disease. Cell 134, 714–717 (2008).
5. Nicholson, J. K. & Lindon, J. C. Systems biology: metabonomics. Nature 455, 1054–1056 (2008).
6. Fiehn, O. Metabolomics—the link between genotypes and phenotypes. Plant Mol. Biol. 48, 155–171 (2002).
7. Nicholson, J. K., Lindon, J. C. & Holmes, E. ‘Metabonomics’: understanding the metabolic responses of living systems to pathophysiological stimuli via multivariate statistical analysis of biological NMR spectroscopic data. Xenobiotica 29, 1181–1189 (1999).
   This article describes and defines metabonomics as a tool for studying systemic metabolic changes due to disease, stresses, physiological stimulus or genetic modification.
8. Nicholson, J. K. & Wilson, I. D. Understanding ‘global’ systems biology: metabonomics and the continuum of metabolism. Nature Rev. Drug Discov. 2, 668–676 (2003).
9. Holmes, E. et al. Nuclear magnetic resonance spectroscopy and pattern recognition analysis of the biochemical processes associated with the progression of and recovery from nephrotoxic lesions in the rat induced by mercury(II) chloride and 2-bromoethanamine. Mol. Pharmacol. 42, 922–930 (1992).
   This article reports the first use of metabolic profiling approaches to follow longitudinal changes in systemic metabolism.
10. Loscalzo, J., Kohane, I. & Barabasi, A. L. Human disease classification in the postgenomic era: a complex systems approach to human pathobiology. Mol. Syst. Biol. 3, 124 (2007).
11. Tomlins, A. M. et al. High resolution ¹H NMR spectroscopic studies on dynamic biochemical processes in incubated human seminal fluid samples. Biochim. Biophys. Acta 1379, 367–380 (1998).
12. Patterson, A. D. et al. Metabolomics reveals attenuation of the SLC6A20 kidney transporter in nonhuman primate and mouse models of type 2 diabetes mellitus. J. Biol. Chem. 286, 19511–19522 (2011).
13. Robertson, D. G., Reily, M. D. & Baker, J. D. Metabonomics in pharmaceutical discovery and development. J. Proteome Res. 6, 526–539 (2007).
14. Trygg, J., Holmes, E. & Lundstedt, T. Chemometrics in metabonomics. J. Proteome Res. 6, 469–479 (2007).
15. Nevedomskaya, E., Mayboroda, O. A. & Deelder, A. M. Cross-platform analysis of longitudinal data in metabolomics. Mol. Biosyst. 7, 3214–3222 (2011).
16. Nicholson, J. K. et al. Proton-nuclear-magnetic-resonance studies of serum, plasma and urine from fasting normal and diabetic subjects. Biochem. J. 217, 365–375 (1984).
17. Iles, R. A., Snodgrass, G. J., Chalmers, R. A. & Stacey, T. E. Rapid screening of metabolic diseases by proton NMR. Lancet 2, 1221–1222 (1984).
   This article provides an early example of the power of non-targeted phenotyping for use in classification of metabolic diseases and for exploring pathway abnormalities in genetic disease.
18. Suhre, K. et al. Metabolic footprint of diabetes: a multiplatform metabolomics study in an epidemiological setting. PLoS ONE 5, e13953 (2010).
19. Makinen, V. P. et al. ¹H NMR metabonomics approach to the disease continuum of diabetic complications and premature death. Mol. Syst. Biol. 4, 167 (2008).
20. Salek, R. M. et al. A metabolomic comparison of urinary changes in type 2 diabetes in mouse, rat, and human. Physiol. Genomics 29, 99–108 (2007).
21. Oresic, M. et al. Dysregulation of lipid and amino acid metabolism precedes islet autoimmunity in children who later progress to type 1 diabetes. J. Exp. Med. 205, 2975–2984 (2008).
22. Wang, T. J. et al. Metabolite profiles and the risk of developing diabetes. Nature Med. 17, 448–453 (2011).
23. Friedrich, N. Metabolomics in diabetes research. J. Endocrinol. 215, 29–42 (2012).
24. Howells, S. L., Maxwell, R. J. & Griffiths, J. R. Classification of tumour ¹H NMR spectra by pattern recognition. NMR Biomed. 5, 59–64 (1992).
25. Fan, L. et al. Identification of metabolic biomarkers to diagnose epithelial ovarian cancer using a UPLC/QTOF/MS platform. Acta Oncol. 51, 473–479 (2012).
26. Garcia, E. et al. Diagnosis of early stage ovarian cancer by ¹H NMR metabonomics of serum explored by use of a microflow NMR probe. J. Proteome Res. 10, 1765–1771 (2011).
27. Carrola, J. et al. Metabolic signatures of lung cancer in biofluids: NMR-based metabonomics of urine. J. Proteome Res. 10, 221–230 (2011).
28. Gaudet, M. M. et al. Analysis of serum metabolic profiles in women with endometrial cancer and controls in a population-based case–control study. J. Clin. Endocrinol. Metab. 97, 3216–3223 (2012).
29. Ganti, S. et al. Urinary acylcarnitines are altered in human kidney cancer. Int. J. Cancer 130, 2791–2800 (2012).
30. Nishiumi, S. et al. A novel serum metabolomics-based diagnostic approach for colorectal cancer. PLoS ONE 7, e40459 (2012).
31. Slupsky, C. M. et al. Urine metabolite analysis offers potential early diagnosis of ovarian and breast cancers. Clin. Cancer Res. 16, 5835–5841 (2010).
32. Lin, L. et al. LC-MS based serum metabonomic analysis for renal cell carcinoma diagnosis, staging, and biomarker discovery. J. Proteome Res. 10, 1396–1405 (2011).
33. Oakman, C. et al. Identification of a serum-detectable metabolomic fingerprint potentially correlated with the presence of micrometastatic disease in early breast cancer patients at varying risks of disease relapse by traditional prognostic methods. Ann. Oncol. 22, 1295–1301 (2011).
34. Tenori, L. et al. Exploration of serum metabolomic profiles and outcomes in women with metastatic breast cancer: a pilot study. Mol. Oncol. 6, 437–444 (2012).
35. Griffin, J. L. & Shockcor, J. P. Metabolic profiles of cancer cells. Nature Rev. Cancer 4, 551–561 (2004).
36. Tennant, D. A., Durán, R. V. & Gottlieb, E. Targeting metabolic transformation for cancer therapy. Nature Rev. Cancer 10, 267–277 (2010).
   This is an important study on the use of targeted metabolic analysis for understanding fundamental metabolic processes in cancer cells for the discovery of drug targets and strategies.
37. Spratlin, J. L., Serkova, N. J. & Eckhardt, S. G. Clinical applications of metabolomics in oncology: a review. Clin. Cancer Res. 15, 431–440 (2009).
38. Griffin, J. L., Atherton, H., Shockcor, J. P. & Atzori, L. Metabolomics as a tool for cardiac research. Nature Rev. Cardiol. 8, 630–643 (2011).
39. Shah, S. H. et al. Baseline metabolomic profiles predict cardiovascular events in patients at risk for coronary artery disease. Am. Heart J. 163, 844–850 (2012).
40. Wang, Z. et al. Gut flora metabolism of phosphatidylcholine promotes cardiovascular disease. Nature 472, 57–63 (2011).
   This article reports the major discovery of the potential involvement of gut-microbial metabolism in developing cardiovascular disease.
41. Caldeira, M. et al. Profiling allergic asthma volatile metabolic patterns using a headspace-solid phase microextraction/gas chromatography based methodology. J. Chromatogr. A 1218, 3771–3780 (2011).
42. Fens, N. et al. Exhaled air molecular profiling in relation to inflammatory subtype and activity in COPD. Eur. Respir. J. 38, 1301–1309 (2009).
43. Saude, E. J. et al. Metabolomic profiling of asthma: diagnostic utility of urine nuclear magnetic resonance spectroscopy. J. Allergy Clin. Immunol. 127, 757–764 (2011).
44. Ubhi, B. K. et al. Metabolic profiling detects biomarkers of protein degradation in COPD patients. Eur. Respir. J. 40, 345–355 (2012).
45. Qin, J. et al. A human gut microbial gene catalogue established by metagenomic sequencing. Nature 464, 59–65 (2010).
46. Yatsunenko, T. et al. Human gut microbiome viewed across age and geography. Nature 486, 222–227 (2012).
47. Nicholson, J. K. et al. Host–gut microbiota metabolic interactions. Science 336, 1262–1267 (2012).
48. Ooi, M. et al. GC/MS-based profiling of amino acids and TCA cycle-related molecules in ulcerative colitis. Inflamm. Res. 60, 831–840 (2011).
49. Williams, H. R. et al. Characterization of inflammatory bowel disease with urinary metabolic profiling. Am. J. Gastroenterol. 104, 1435–1444 (2009).
50. Marchesi, J. R. et al. Rapid and non-invasive metabonomic characterization of inflammatory bowel disease. J. Proteome Res. 6, 546–551 (2007).
51. Li, M. et al. Symbiotic gut microbes modulate human metabolic phenotypes. Proc. Natl Acad. Sci. USA 105, 2117–2122 (2008).
   This article reports the first demonstration of statistical cross-omics integration to unravel gut-microbe–host metabolic interactions.
52. Hooper, L. V., Littman, D. R. & Macpherson, A. J. Interactions between the microbiota and the immune system. Science 336, 1268–1273 (2012).
53. Swann, J. R. et al. Systemic gut microbial modulation of bile acid metabolism in host tissue compartments. Proc. Natl Acad. Sci. USA 108, 4523–4530 (2011).


54. Holmes, E. et al. Therapeutic modulation of microbiota–host metabolic interactions. Sci. Transl. Med. 4, 137rv6 (2012).
   This article provides a comprehensive discussion of major gut-microbe–host metabolic interactions and possible therapeutic interventional strategies.
55. Muccioli, G. G. et al. The endocannabinoid system links gut microbiota to adipogenesis. Mol. Syst. Biol. 6, 392 (2010).
56. Yap, I. K. et al. Urinary metabolic phenotyping differentiates children with autism from their unaffected siblings and age-matched controls. J. Proteome Res. 9, 2996–3004 (2010).
57. Evans, C. et al. Altered amino acid excretion in children with autism. Nutr. Neurosci. 11, 9–17 (2008).
58. Thomas, E. L. et al. Aberrant adiposity and ectopic lipid deposition characterize the adult phenotype of the preterm infant. Pediatr. Res. 70, 507–512 (2011).
59. Gordon, J. I. Honor thy gut symbionts: redux. Science 336, 1251–1253 (2012).
   This article provides an overview of the importance of the gut microbiome in the aetiopathogenesis of diverse non-infectious diseases.
60. Jia, W., Li, H., Zhao, L. & Nicholson, J. K. Gut microbiota: a potential new territory for drug targeting. Nature Rev. Drug Discov. 7, 123–129 (2008).
61. Kinross, J. M., Holmes, E., Darzi, A. W. & Nicholson, J. K. Metabolic phenotyping for monitoring surgical patients. Lancet 377, 1817–1819 (2011).
62. Clayton, T. A. et al. Pharmaco-metabonomic phenotyping and personalized drug treatment. Nature 440, 1073–1077 (2006).
   This report provides the first description of the use of pre-interventional metabolic profile models to predict interventional outcomes.
63. Clayton, T. A., Baker, D., Lindon, J. C., Everett, J. R. & Nicholson, J. K. Pharmacometabonomic identification of a significant host–microbiome metabolic interaction affecting human drug metabolism. Proc. Natl Acad. Sci. USA 106, 14728–14733 (2009).
64. Backshall, A., Sharma, R., Clarke, S. J. & Keun, H. C. Pharmacometabonomic profiling as a predictor of toxicity in patients with inoperable colorectal cancer treated with capecitabine. Clin. Cancer Res. 17, 3019–3028 (2011).
   This study is the first example of pharmacometabonomic principles to predict drug toxicity in humans.
65. Schmerler, D. et al. Targeted metabolomics for discrimination of systemic inflammatory disorders in critically ill patients. J. Lipid Res. 53, 1369–1375 (2012).
66. Cohen, M. J., Serkova, N. J., Wiener-Kronish, J., Pittet, J. F. & Niemann, C. U. ¹H-NMR-based metabolic signatures of clinical outcomes in trauma patients—beyond lactate and base deficit. J. Trauma 69, 31–40 (2010).
67. Polinder, S., Haagsma, J. A., Toet, H. & van Beeck, E. F. Epidemiological burden of minor, major and fatal trauma in a national injury pyramid. Br. J. Surg. 99, 114–121 (2012).
68. Alverdy, J. C., Laughlin, R. S. & Wu, L. Influence of the critically ill state on host–pathogen interactions within the intestine: gut-derived sepsis redefined. Crit. Care Med. 31, 598–607 (2003).
69. Volkert, D., Saeglitz, C., Gueldenzoph, H., Sieber, C. C. & Stehle, P. Undiagnosed malnutrition and nutrition-related problems in geriatric patients. J. Nutr. Health Aging 14, 387–392 (2010).
70. Fitzgerald, S. P. & Bean, N. G. An analysis of the interactions between individual comorbidities and their treatments – implications for guidelines and polypharmacy. J. Am. Med. Dir. Assoc. 11, 475–484 (2010).
71. Shah, A. A. et al. Metabolic profiles predict adverse events after coronary artery bypass grafting. J. Thorac. Cardiovasc. Surg. 143, 873–878 (2012).
72. Mao, H. et al. Systemic metabolic changes of traumatic critically ill patients revealed by an NMR-based metabonomic approach. J. Proteome Res. 8, 5423–5430 (2009).
73. Chen, J. et al. Metabonomics study of the acute graft rejection in rat renal transplantation using reversed-phase liquid chromatography and hydrophilic interaction chromatography coupled with mass spectrometry. Mol. Biosyst. 8, 871–878 (2012).
74. Kim, C. D. et al. Metabonomic analysis of serum metabolites in kidney transplant recipients with cyclosporine A- or tacrolimus-based immunosuppression. Transplantation 90, 748–756 (2010).
75. Legido-Quigley, C. et al. Bile UPLC-MS fingerprinting and bile acid fluxes during human liver transplantation. Electrophoresis 32, 2063–2070 (2011).
76. Girlanda, R. et al. Metabolomics of human intestinal transplant rejection. Am. J. Transplant. http://dx.doi.org/10.1111/j.1600-6143.2012.04183.x (July 2012).
77. Fornari, F., Comis, V. R. & Lisboa, H. R. Bariatric surgery or medical therapy for obesity. N. Engl. J. Med. 367, 474 (2012).
78. Li, J. V. et al. Metabolic surgery profoundly influences gut microbial–host metabolic cross-talk. Gut 60, 1214–1223 (2011).
79. Mutch, D. M. et al. Metabolite profiling identifies candidate markers reflecting the clinical adaptations associated with Roux-en-Y gastric bypass surgery. PLoS ONE 4, e7905 (2009).
80. Zhang, H. et al. Human gut microbiota in obesity and after gastric bypass. Proc. Natl Acad. Sci. USA 106, 2365–2370 (2009).
81. Biagi, E., Candela, M., Fairweather-Tait, S., Franceschi, C. & Brigidi, P. Aging of the human metaorganism: the microbial counterpart. Age (Dordr.) 34, 247–267 (2012).
82. Claesson, M. J. et al. Gut microbiota composition correlates with diet and health in the elderly. Nature 488, 178–184 (2012).
83. Jeevan, R. et al. Reoperation rates after breast conserving surgery for breast cancer among women in England: retrospective study of hospital episode statistics. Br. Med. J. 345, e4505 (2012).
84. Chan, E. C. et al. Metabolic profiling of human colorectal cancer using high-resolution magic angle spinning nuclear magnetic resonance (HR-MAS NMR) spectroscopy and gas chromatography mass spectrometry (GC/MS). J. Proteome Res. 8, 352–361 (2009).
85. Opstad, K. S., Bell, B. A., Griffiths, J. R. & Howe, F. A. An investigation of human brain tumour lipids by high-resolution magic angle spinning ¹H MRS and histological analysis. NMR Biomed. 21, 677–685 (2008).
86. Wright, A. J. et al. Ex-vivo HRMAS of adult brain tumours: metabolite quantification and assignment of tumour biomarkers. Mol. Cancer 9, 66 (2010).
87. Wu, C. L. et al. Metabolomic imaging for human prostate cancer detection. Sci. Transl. Med. 2, 16ra18 (2010).
88. Bertilsson, H. et al. Changes in gene transcription underlying the aberrant citrate and choline metabolism in human prostate cancer samples. Clin. Cancer Res. 18, 3261–3269 (2012).
89. McDonnell, L. A. & Heeren, R. M. Imaging mass spectrometry. Mass Spectrom. Rev. 26, 606–643 (2007).
90. Balog, J. et al. Identification of biological tissues by rapid evaporative ionization mass spectrometry. Anal. Chem. 82, 7343–7350 (2010).
   This article provides a description of the technology development and application of the ‘intelligent knife’ concept for real-time surgical diagnostics.
91. Guenther, S. et al. Electrospray post-ionization mass spectrometry of electrosurgical aerosols. J. Am. Soc. Mass Spectrom. 22, 2082–2089 (2011).
92. Gerbig, S. et al. Analysis of colorectal adenocarcinoma tissue by desorption electrospray ionization mass spectrometric imaging. Anal. Bioanal. Chem. 403, 2315–2325 (2012).
93. Schafer, K. C. et al. Real time analysis of brain tissue by direct combination of ultrasonic surgical aspiration and sonic spray mass spectrometry. Anal. Chem. 83, 7729–7735 (2011).
94. Beckonert, O. et al. Metabolic profiling, metabolomic and metabonomic procedures for NMR spectroscopy of urine, plasma, serum and tissue extracts. Nature Protoc. 2, 2692–2703 (2007).
95. Lindon, J. C. & Nicholson, J. K. Spectroscopic and statistical techniques for information recovery in metabonomics and metabolomics. Annu. Rev. Anal. Chem. 1, 45–69 (2008).
96. Wong, A. et al. Evaluation of high resolution magic-angle coil spinning NMR spectroscopy for metabolic profiling of nanoliter tissue biopsies. Anal. Chem. 84, 3843–3848 (2012).
97. Jellum, E. et al. Application of glass capillary-column gas chromatography-mass spectrometry to the studies of human diseases. J. Chromatogr. 126, 487–493 (1976).
98. Ramautar, R., Mayboroda, O. A., Somsen, G. W. & de Jong, G. J. CE-MS for metabolomics: developments and applications in the period 2008–2010. Electrophoresis 32, 52–65 (2011).
99. Crockford, D. J. et al. Statistical heterospectroscopy, an approach to the integrated analysis of NMR and UPLC-MS data sets: application in metabonomic toxicology studies. Anal. Chem. 78, 363–371 (2006).
100. Fonville, J. M. et al. Robust data processing and normalization strategy for MALDI mass spectrometric imaging. Anal. Chem. 84, 1310–1319 (2012).

Acknowledgements The authors would like to acknowledge the National Institute of Health Research Biomedical Research Centre for funding clinical and surgical metabonomic projects in real-time diagnostics and chemical imaging at Imperial College London. We also wish to thank the MRC and NIHR for funding major programmes that relate to these studies, including the MRC-NIHR Phenome Centre (joint with Kings College London, Bruker Spectrospin and the Waters Corporation) and the Imperial NIHR Clinical Phenome Centre.

Author Information Reprints and permissions information is available at www.nature.com/reprints. The authors declare no competing financial interests. Readers are welcome to comment on this article at go.nature.com/jpozrc. Correspondence should be addressed to J.K.N. ([email protected]).
