FOODOMICS

WILEY SERIES ON MASS SPECTROMETRY

Series Editors

Dominic M. Desiderio
Departments of Neurology and Biochemistry
University of Tennessee Health Science Center

Nico M. M. Nibbering
Vrije Universiteit Amsterdam, The Netherlands

A complete list of the titles in this series appears at the end of this volume.

FOODOMICS

Advanced Mass Spectrometry in Modern Food Science and Nutrition

Edited by

ALEJANDRO CIFUENTES
Laboratory of Foodomics (CIAL)
National Research Council (CSIC)
Madrid, Spain

A JOHN WILEY & SONS, INC., PUBLICATION

Copyright © 2013 by John Wiley & Sons, Inc. All rights reserved.

Published by John Wiley & Sons, Inc., Hoboken, New Jersey. Published simultaneously in Canada.

No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning, or otherwise, except as permitted under Section 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax (978) 750-4470, or on the web at www.copyright.com. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748-6011, fax (201) 748-6008, or online at http://www.wiley.com/go/permission.

Limit of Liability/Disclaimer of Warranty: While the publisher and author have used their best efforts in preparing this book, they make no representations or warranties with respect to the accuracy or completeness of the contents of this book and specifically disclaim any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives or written sales materials. The advice and strategies contained herein may not be suitable for your situation. You should consult with a professional where appropriate. Neither the publisher nor author shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.

For general information on our other products and services or for technical support, please contact our Customer Care Department within the United States at (800) 762-2974, outside the United States at (317) 572-3993 or fax (317) 572-4002.

Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic formats. For more information about Wiley products, visit our web site at www.wiley.com.

Library of Congress Cataloging-in-Publication Data

Foodomics : advanced mass spectrometry in modern food science and nutrition / edited by Alejandro Cifuentes. p. cm Includes bibliographical references and index. ISBN 978-1-118-16945-2 (cloth) 1. Food–Analysis. 2. Mass spectrometry. I. Cifuentes, Alejandro, editor of compilation. TX547.F66 2013 664.07–dc23 2012035736

Printed in the United States of America

ISBN: 9781118169452

To the three women in my life, Susana, Claudia and Fernanda, every day they make of this world a better place to be.

A las tres mujeres de mi vida, Susana, Claudia y Fernanda, porque cada día ellas hacen de este mundo un lugar mejor donde vivir.

CONTENTS

Preface
Contributors

1 Foodomics: Principles and Applications
Alejandro Cifuentes
1.1 Introduction to Foodomics
1.2 Foodomics Applications: Challenges, Advantages, and Drawbacks
1.3 Foodomics, Systems Biology, and Future Trends
Acknowledgments
References

2 Next Generation Instruments and Methods for Proteomics
María del Carmen Mena and Juan Pablo Albar
2.1 Introduction
2.2 Emerging Methods in Proteomics
2.3 The Move from Shotgun to Targeted Proteomics Approaches
2.4 New Instrumental Methods for Proteomics
2.5 Bioinformatics Tools
References

3 Proteomic-Based Techniques for the Characterization of Food Allergens
Gianluca Picariello, Gianfranco Mamone, Francesco Addeo, Chiara Nitride, and Pasquale Ferranti
3.1 Introduction: What is Food Allergy?
3.2 Food Allergy: Features and Boundaries of the Disease
3.3 Immunopathology of Food Allergy and Role of Proteomics
3.4 Identification of Food Allergy Epitopes
3.5 Expression Proteomics and Functional Proteomics in Food Allergy
3.6 Identification of Allergens in Transformed Products
3.7 Concluding Remarks
References

4 Examination of the Efficacy of Antioxidant Food Supplements Using Advanced Proteomics Methods
Ashraf G. Madian, Elsa M. Janle, and Fred E. Regnier
4.1 Introduction
4.2 Methods for Studying the Efficacy of Antioxidants
4.3 Strategies Used for Proteomic Analysis of Carbonylated Proteins and the Impact of Antioxidants
4.4 Studying Oxidation Mechanisms
4.5 Quantification of Carbonylation Sites
4.6 Biomedical Consequence of Protein Oxidation and the Impact of Antioxidants
4.7 Redox Proteomics and Testing the Efficacy of Antioxidants
References

5 Proteomics in Food Science
José M. Gallardo, Mónica Carrera, and Ignacio Ortea
5.1 Proteomics
5.2 Applications in Food Science
5.3 Species Identification and Geographic Origin
5.4 Detection and Identification of Spoilage and Pathogenic Microorganisms
5.5 Changes During Food Storage and Processing and Their Relationship to Quality
5.6 Proteomics Data Integration to Explore Food Metabolic Pathways and Physiological Activity of Food Components
5.7 Nutriproteomics
5.8 Final Considerations and Future Trends
References

6 Proteomics in Nutritional Systems Biology: Defining Health
Martin Kussmann and Laurent Fay
6.1 Introduction
6.2 From Food Proteins to Nutriproteomics
6.3 Nutritional Peptide and Protein Bioactives
6.4 Nutritional Peptide and Protein Biomarkers
6.5 Ecosystem-Level Understanding of Nutritional Host Health
6.6 Conclusions and Perspectives
References

7 MS-Based Methodologies for Transgenic Foods Development and Characterization
Alberto Valdés and Virginia García-Cañas
7.1 Introduction
7.2 Controversial Safety Aspects and Legislation on GMOs
7.3 Analysis of GMOs: Targeted Procedures and Profiling Methodologies
7.4 Conclusions and Future Outlook
Acknowledgments
References

8 MS-Based Methodologies to Study the Microbial Metabolome
Wendy R. Russell and Sylvia H. Duncan
8.1 Introduction
8.2 The Gut Microbiota and Their Role in Metabolism
8.3 Metagenomics
8.4
8.5 Microbial Metabolites in the Human Gut
8.6 Analysis of the Microbial Metabolome
8.7 Implications for Human Health and Disease
8.8 Summary
Acknowledgments
References

9 MS-Based Metabolomics in Nutrition and Health Research
Clara Ibáñez and Carolina Simó
9.1 Introduction
9.2 MS-Based Metabolomics Workflow
9.3 Metabolomics in Nutrition-Related Studies
9.4 Diet/Nutrition and Disease: Metabolomics Applications
9.5 Other Applications in Nutritional Metabolomics
9.6 Integration with Other “Omics”
9.7 Concluding Remarks
Acknowledgments
References

10 Shaping the Future of Personalized Nutrition with Metabolomics
Max Scherer, Alastair Ross, Sofia Moco, Sebastiano Collino, François-Pierre Martin, Jean-Philippe Godin, Peter Kastenmayer, and Serge Rezzi
10.1 Introduction
10.2 Metabolomics Technologies
10.3 Personalized Nutrition
10.4 Conclusion
References

11 How Does Foodomics Impact Optimal Nutrition?
Anna Arola-Arnal, Josep M. del Bas, Antoni Caimari, Anna Crescenti, Francesc Puiggròs, Manuel Suárez, and Lluís Arola
11.1 Introduction
11.2 Nutrigenomics
11.3 Nutrigenetics and Personalized Nutrition
11.4 The Added Value of Foodomics for the Food Industry
11.5 Concluding Remarks
References

12 Lipidomics
Isabel Bondia-Pons and Tuulia Hyötyläinen
12.1 Definition and Analytical Challenges in Lipidomics
12.2 Lipidomics in Nutrition and Health Research
12.3 Lipidomics and Food Science
12.4 Future Perspectives
References

13 Foodomics Study of Micronutrients: The Case of Folates
Susan J. Duthie
13.1 Folates in the Diet
13.2 Folate and Human Health
13.3 Measuring Folates in Human Biomonitoring
13.4 Folate and Colon Cancer: Establishing Mechanisms of Genomic Instability Using a Combined Proteomic and Functional Approach
13.5 Folate Deficiency and Abnormal DNA Methylation: A Common Mechanism Linking Cancer and Atherosclerosis
13.6 Summary
Acknowledgments
References

14 Metabolomics Markers in Acute and Endurance/Resistance Physical Activity: Effect of the Diet
Sonia Medina, Débora Villaño, José Ignacio Gil, Cristina García-Viguera, Federico Ferreres, and Angel Gil-Izquierdo
14.1 Introduction
14.2 Metabolomics Consequences of Physical Activity: Metabolites and Physiological Pathways Affected
14.3 Metabolomics and Physical Activity: Effect of the Diet
14.4 Concluding Remarks and Future Perspectives
Acknowledgments
References

15 MS-Based Omics Evaluation of Phenolic Compounds as Functional Ingredients
Débora Villaño, Sonia Medina, José Ignacio Gil, Cristina García-Viguera, Federico Ferreres, Francisco A. Tomás-Barberán, and Angel Gil-Izquierdo
15.1 Introduction
15.2 Use of Metabolomics in Nutritional Trials
15.3 Statistic Tools in Nutritional Metabolomics
15.4 Metabolomics from Clinical Trials after Intake of Polyphenol-Rich Foods
15.5 Human Metabolome in Low and Normal Polyphenol Dietary Intake
15.6 Concluding Remarks and Future Perspectives
Acknowledgments
References

16 Metabolomics of Diet-Related Diseases
Marcela A. Erazo, Antonia García, Francisco J. Rupérez, and Coral Barbas
16.1 Introduction
16.2 Analysis of the Metabolome: Metabolomics
16.3 Diet-Related Diseases
References

17 MS-Based Metabolomics Approaches for Food Safety, Quality, and Traceability
María Castro-Puyana, José A. Mendiola, Elena Ibáñez, and Miguel Herrero
17.1 Introduction
17.2 MS-Based Metabolomics for Food Safety
17.3 MS-Based Metabolomics to Assess Food Quality
17.4 MS-Based Metabolomics Strategies for Food Traceability
17.5 Conclusions and Future Outlook
Acknowledgments
References

18 Green Foodomics
José A. Mendiola, María Castro-Puyana, Miguel Herrero, and Elena Ibáñez
18.1 Basic Concepts of Foodomics (and How to Make it Greener)
18.2 Basic Concepts of Green Chemistry
18.3 Green Processes to Produce Functional Food Ingredients
18.4 Development of Green Analytical Processes for Foodomics
18.5 Comparative LCA Study of Green Analytical Techniques: Case Study
18.6 Conclusion
Acknowledgments
References

19 Chemometrics, Mass Spectrometry, and Foodomics
Thomas Skov and Søren B. Engelsen
19.1 Foodomics Studies
19.2 XC-MS Data
19.3 Data Structures and Models
19.4 Conclusion
References

20 Systems Biology in Food and Nutrition Research
Matej Orešič
20.1 Systems Biology: New Opportunity for Food and Nutrition Research
20.2 Systems Approach to Identify Molecular Networks Behind Health and Disease
20.3 Food Metabolome and its Effect on Host Physiology
20.4 Building A Systems Biology Platform for Food and Nutrition Research
20.5 Future Perspectives
References

Index

PREFACE

The impressive analytical developments achieved at the end of the twentieth century made possible the sequencing of nearly the whole human genome at the beginning of the twenty-first century, opening the so-called postgenomic era. These advances have made feasible analytical instruments and methodological developments that were unthinkable a few decades ago. Such developments have traditionally found their first application in the biotechnological or biochemical field, often linked to pharmaceutical, medical, or clinical needs. The huge amount of money allocated to these fields of research is logically an additional incentive when selecting the area in which a new analytical method can be tested, and a good way to compensate for the effort behind any innovative analytical development. As a result, the biotech, pharmaceutical, and clinical industries have usually been the first targets for analytical chemists and instrumentation companies. This has left food analysis overshadowed and tied to more traditional analytical approaches. Nowadays, the boundaries among the different research fields are becoming more and more diffuse, giving rise to impressive possibilities in emerging interdisciplinary areas, for example, health and food. As a result, researchers in food science and nutrition are being pushed to move from classical methodologies to more advanced strategies, usually borrowing methods well established in medical, pharmacological, and/or biotechnology research. This trend has led to the emergence of new areas of research for which a new terminology is required. In this context, our group defined Foodomics a few years ago as a discipline that studies the food and nutrition domains through the application of advanced omics technologies to improve consumer’s well-being, health, and confidence.
The main idea behind the use of this new term has been not only to use it as a flag of the new times for food analysis but also to highlight that the investigation into traditional and new problems in food analysis in the postgenomic era can find exciting opportunities and new answers through the use of genomics, transcriptomics, epigenetics, proteomics, and metabolomics tools. Indeed, Foodomics is opening a new and unexpected land, still wild and unexplored, to a new generation of researchers who, using ever more powerful omics technologies, can find original research possibilities and innovative answers to crucial questions related not only to food science but also to its complex links with our health. The interest of the scientific community in modern food analysis and Foodomics, and the different trends in this hot area of research, are well documented in the 20 chapters that compose this volume, “Foodomics: Advanced Mass Spectrometry in Modern Food Science and Nutrition”, the first book devoted to this new discipline, in which the authors present their advanced perspective of the topic. Namely, the first chapter presents the principles of Foodomics, and the next five chapters (Chapters 2 to 6) are devoted to proteomics applications in Foodomics, including a description of modern instruments and methods for proteomics, proteomic-based techniques for food science and food allergen characterization, the examination of antioxidant food supplements using advanced proteomics methods, and proteomics in nutritional systems biology. The next two chapters (Chapters 7 and 8) describe advanced MS-based methodologies to study the development and characterization of transgenic foods and the microbial metabolome.
The following nine chapters (Chapters 9 to 17) are devoted to metabolomics developments in Foodomics, with special emphasis on the possibilities of MS-based metabolomics in nutrition and health research; in food safety, quality, and traceability; in future personalized nutrition; in the study of the effect of diet on acute and endurance exercise; in the investigation of diet-related diseases; and in the study of how Foodomics impacts optimal nutrition or can provide crucial information on micronutrients (the case of folates), phenolic compounds as functional ingredients, and lipids (lipidomics). The following two chapters (Chapters 18 and 19) present the main principles of Green Foodomics and the use of chemometrics in mass spectrometry and Foodomics. The last chapter of the book describes the possibilities of systems biology in food and nutrition research. As editor of this book devoted to “Foodomics: Advanced Mass Spectrometry in Modern Food Science and Nutrition”, I would like to thank all the authors for their suitable contributions, Dom Desiderio for inviting me to prepare this piece of work, Michael Leventhal for his help and support, and those in the John Wiley & Sons team who contributed their effort to the preparation of this volume.

Alejandro Cifuentes

CONTRIBUTORS

Francesco Addeo, Dipartimento di Scienza degli Alimenti, University of Naples Federico II, Naples, Italy
Juan Pablo Albar, Functional Proteomics Group, Centro Nacional de Biotecnología–CSIC, Madrid, Spain
Lluís Arola, Centre Tecnològic de Nutrició i Salut (CTNS), TECNIO, Reus, Spain; Departament de Bioquímica i Biotecnologia, Nutrigenomics Research Group, Universitat Rovira i Virgili, Tarragona, Spain
Anna Arola-Arnal, Departament de Bioquímica i Biotecnologia, Nutrigenomics Research Group, Universitat Rovira i Virgili, Tarragona, Spain
Coral Barbas, Center for Metabolomics and Bioanalysis (CEMBIO), Facultad de Farmacia, Universidad CEU San Pablo, Boadilla del Monte, Madrid, Spain
Isabel Bondia-Pons, Quantitative Biology and Bioinformatics, VTT Technical Research Centre of Finland, Espoo, Finland
Antoni Caimari, Centre Tecnològic de Nutrició i Salut (CTNS), TECNIO, Reus, Spain
Mónica Carrera, Institute of Molecular Systems Biology, ETH Zürich, Zürich, Switzerland
María Castro-Puyana, Laboratory of Foodomics, Institute of Food Science Research (CIAL), National Research Council (CSIC), Madrid, Spain

Alejandro Cifuentes, Laboratory of Foodomics, Institute of Food Science Research (CIAL), National Research Council (CSIC), Madrid, Spain
Sebastiano Collino, BioAnalytical Science, Nestlé Research Center, Lausanne, Switzerland
Anna Crescenti, Centre Tecnològic de Nutrició i Salut (CTNS), TECNIO, Reus, Spain
Josep M. del Bas, Centre Tecnològic de Nutrició i Salut (CTNS), TECNIO, Reus, Spain
Sylvia H. Duncan, Rowett Institute of Nutrition and Health, University of Aberdeen, Aberdeen, UK
Susan J. Duthie, Natural Products Group, Division of Lifelong Health, Rowett Institute of Nutrition and Health, University of Aberdeen, Aberdeen, UK
Søren B. Engelsen, Faculty of Science, University of Copenhagen, Copenhagen, Denmark
Marcela A. Erazo, Center for Metabolomics and Bioanalysis (CEMBIO), Facultad de Farmacia, Universidad CEU San Pablo, Boadilla del Monte, Madrid, Spain
Laurent Fay, R&D Infant Formulae, Nestlé Nutrition, Vevey, Switzerland
Pasquale Ferranti, Istituto di Scienze dell’Alimentazione, CNR, Avellino, Italy; Dipartimento di Scienza degli Alimenti, University of Naples Federico II, Naples, Italy
Federico Ferreres, Department of Food Science and Technology, CEBAS-CSIC, Murcia, Spain
José M. Gallardo, Marine Research Institute, CSIC, Vigo, Pontevedra, Spain
Antonia García, Center for Metabolomics and Bioanalysis (CEMBIO), Facultad de Farmacia, Universidad CEU San Pablo, Boadilla del Monte, Madrid, Spain
Virginia García-Cañas, Laboratory of Foodomics, Institute of Food Science Research (CIAL), National Research Council (CSIC), Madrid, Spain
Cristina García-Viguera, Department of Food Science and Technology, CEBAS-CSIC, Murcia, Spain
José Ignacio Gil, Service of Radiodiagnostic, Mammary Pathology Department, Hospital José María Morales Meseguer, Murcia, Spain
Angel Gil-Izquierdo, Department of Food Science and Technology, CEBAS-CSIC, Murcia, Spain
Jean-Philippe Godin, BioAnalytical Science, Nestlé Research Center, Lausanne, Switzerland

Miguel Herrero, Laboratory of Foodomics, Institute of Food Science Research (CIAL), National Research Council (CSIC), Madrid, Spain
Tuulia Hyötyläinen, Quantitative Biology and Bioinformatics, VTT Technical Research Centre of Finland, Espoo, Finland
Clara Ibáñez, Laboratory of Foodomics, Institute of Food Science Research (CIAL), National Research Council (CSIC), Madrid, Spain
Elena Ibáñez, Laboratory of Foodomics, Institute of Food Science Research (CIAL), National Research Council (CSIC), Madrid, Spain
Elsa M. Janle, Department of Foods and Nutrition, Purdue University, West Lafayette, Indiana, USA
Peter Kastenmayer, BioAnalytical Science, Nestlé Research Center, Lausanne, Switzerland
Martin Kussmann, Proteomics/Metabonomics Core, Nestlé Institute of Health Sciences, Lausanne, Switzerland; Faculty of Science, Aarhus University, Aarhus, Denmark
Ashraf G. Madian, Department of Chemistry, Purdue University, West Lafayette, Indiana, USA
Gianfranco Mamone, Istituto di Scienze dell’Alimentazione, CNR, Avellino, Italy
François-Pierre Martin, BioAnalytical Science, Nestlé Research Center, Lausanne, Switzerland
Sonia Medina, Department of Food Science and Technology, CEBAS-CSIC, Murcia, Spain
María del Carmen Mena, Functional Proteomics Group, Centro Nacional de Biotecnología–CSIC, Madrid, Spain
José A. Mendiola, Laboratory of Foodomics, Institute of Food Science Research (CIAL), National Research Council (CSIC), Madrid, Spain
Sofia Moco, BioAnalytical Science, Nestlé Research Center, Lausanne, Switzerland
Chiara Nitride, Dipartimento di Scienza degli Alimenti, University of Naples Federico II, Naples, Italy
Matej Orešič, Systems Biology and Bioinformatics, VTT Technical Research Centre of Finland, Espoo, Finland
Ignacio Ortea, Health Research Institute of Santiago de Compostela, A Coruña, Spain

Gianluca Picariello, Istituto di Scienze dell’Alimentazione, CNR, Avellino, Italy
Francesc Puiggròs, Centre Tecnològic de Nutrició i Salut (CTNS), TECNIO, Reus, Spain
Fred E. Regnier, Department of Chemistry, Purdue University, West Lafayette, Indiana, USA
Serge Rezzi, BioAnalytical Science, Nestlé Research Center, Lausanne, Switzerland
Alastair Ross, BioAnalytical Science, Nestlé Research Center, Lausanne, Switzerland
Francisco J. Rupérez, Center for Metabolomics and Bioanalysis (CEMBIO), Facultad de Farmacia, Universidad CEU San Pablo, Boadilla del Monte, Madrid, Spain
Wendy R. Russell, Rowett Institute of Nutrition and Health, University of Aberdeen, Aberdeen, UK
Max Scherer, BioAnalytical Science, Nestlé Research Center, Lausanne, Switzerland
Thomas Skov, Faculty of Science, University of Copenhagen, Copenhagen, Denmark
Carolina Simó, Laboratory of Foodomics, Institute of Food Science Research (CIAL), National Research Council (CSIC), Madrid, Spain
Manuel Suárez, Departament de Bioquímica i Biotecnologia, Nutrigenomics Research Group, Universitat Rovira i Virgili, Tarragona, Spain
Francisco A. Tomás-Barberán, Department of Food Science and Technology, CEBAS-CSIC, Murcia, Spain
Alberto Valdés, Laboratory of Foodomics, Institute of Food Science Research (CIAL), National Research Council (CSIC), Madrid, Spain
Débora Villaño, Department of Food Science and Technology, CEBAS-CSIC, Murcia, Spain

1 FOODOMICS: PRINCIPLES AND APPLICATIONS

Alejandro Cifuentes

1.1 INTRODUCTION TO FOODOMICS

Research in food science and nutrition has grown in parallel with consumers’ concern about what is in their food and the safety of the food they eat. To respond adequately to rising consumer demands, food and nutrition researchers around the world are facing increasingly complex challenges that require the use of the best available science and technology. A good portion of this complexity is due to so-called globalization and the worldwide movement of food and related raw materials, which are generating contamination episodes that are also becoming global. An additional difficulty is that many products contain multiple and processed ingredients, which are often shipped from different parts of the world and share common storage spaces and production lines. As a result, ensuring the safety, quality, and traceability of food has never been more complicated, or more necessary, than today.

The first goal of food science has traditionally been, and still is, to ensure food safety. To meet this goal, food laboratories are being pushed to exchange their classical procedures for modern analytical techniques that allow them to respond adequately to this global demand. In addition, the new European regulations in the European Union countries (e.g., Regulation EC 258/97 or EN 29000 and subsequent issues), the Nutrition Labeling and Education Act in the USA, and the Montreal Protocol have had a major impact on food laboratories. Consequently, more powerful, cleaner, and cheaper analytical procedures are now required by food chemists, regulatory agencies, and quality control laboratories. These demands have increased the need for more sophisticated instrumentation and more appropriate methods able to offer better qualitative and quantitative results while increasing the sensitivity, precision, specificity, and/or speed of analysis.

Foodomics: Advanced Mass Spectrometry in Modern Food Science and Nutrition, First Edition. Edited by Alejandro Cifuentes. © 2013 John Wiley & Sons, Inc. Published 2013 by John Wiley & Sons, Inc.

Currently, there is also a general trend in food science toward linking food and health. Thus, food is considered today not only a source of energy but also an affordable way to prevent future diseases. The number of opportunities (e.g., new methodologies, newly generated knowledge, new products) derived from this trend is impressive and includes, for example, the possibility of food products tailored to promote the health and well-being of population groups identified on the basis of their individual genomes. The interaction of modern food science and nutrition with disciplines such as pharmacology, medicine, or biotechnology provides impressive new challenges and opportunities. As a result, researchers in food science and nutrition are moving from classical methodologies to more advanced strategies, and usually borrow methods well established in medical, pharmacological, and/or biotechnology research. Consequently, advanced analytical methodologies, “omics” approaches, and bioinformatics, frequently together with in vitro, in vivo, and/or clinical assays, are applied to investigate topics in food science and nutrition that were considered unapproachable a few years ago.

In modern food science and nutrition, terms such as nutrigenomics, nutrigenetics, transgenics, functional foods, nutraceuticals, genetically modified (GM) foods, microbiomics, toxicogenomics, nutritranscriptomics, nutriproteomics, nutrimetabolomics, and systems biology are increasingly used. This novelty has also brought about some problems related to the poor definition of part of this terminology or its low acceptance, probably due to the difficulty of working in a developing field in which several emerging strategies are frequently put together.

1.1.1 Definition of Foodomics

Although the term Foodomics has been used on various web pages and at scientific meetings since 2007 (see, e.g., Slater and Wilson, 2007 or Capozzi and Placucci, 2009), Foodomics was first defined in an SCI journal in 2009 as a new discipline that studies the food and nutrition domains through the application of advanced omics technologies to improve consumer’s well-being, health, and confidence (Cifuentes, 2009; Herrero et al., 2010, 2012). Thus, Foodomics is not only a useful concept that comprises, in a simple and straightforward way, all of the emerging terms mentioned above (e.g., nutrigenomics, nutrigenetics, microbiomics, toxicogenomics, nutritranscriptomics, nutriproteomics, nutrimetabolomics); more importantly, Foodomics is a global discipline that includes all the working areas in which food (including nutrition) and advanced omics tools are put together. A representation of the areas covered by Foodomics and the tools employed can be seen in Figure 1.1.

[Figure 1.1: diagram labeled Foodomics, showing the tools (genomics and epigenomics, transcriptomics, proteomics, metabolomics) and the covered areas (bioactivity, safety, quality, traceability).]

FIGURE 1.1 Foodomics: covered areas and tools.

Just to name a few topics that could be addressed by this new discipline, Foodomics would help:
(a) to understand the gene-based differences among individuals in response to a specific dietary pattern, following nutrigenetic approaches;
(b) to understand the biochemical, molecular, and cellular mechanisms that underlie the beneficial or adverse effects of certain bioactive food components, following nutrigenomic approaches;
(c) to determine the effect of bioactive food constituents on crucial molecular pathways;
(d) to identify the genes involved in the stage preceding the onset of disease, which are therefore possible molecular biomarkers;
(e) to establish the global role and functions of the gut microbiome, a topic that is expected to open an impressive field of research in the near future;
(f) to investigate unintended effects in GM crops;
(g) to understand the stress adaptation responses of food-borne pathogens in order to ensure food hygiene, processing, and preservation;
(h) to investigate the use of food microorganisms as delivery systems, including the impact of gene inactivation and deletion systems;
(i) to assess food safety, quality, and traceability comprehensively, ideally as a whole;
(j) to understand the molecular basis of biological processes of agronomic interest and economic relevance, such as the interaction between crops and their pathogens, as well as the physicochemical changes that take place during fruit ripening; and
(k) to fully understand postharvest phenomena through a global approach that links genetic and environmental responses and identifies the underlying biological networks.

In this regard, it is expected that the new omics technologies combined with systems biology, as proposed by Foodomics, can lead postharvest research into a new era. The interest in Foodomics also coincides with a clear shift in medicine and biosciences toward the prevention of future diseases through adequate food intake, and the development of the so-called functional foods that are discussed below.

1.1.2 Foodomics Tools

As can be seen in Figure 1.1, Foodomics involves the use of multiple tools to deal with its different subdisciplines and applications. Thus, the use of omics tools such as genomics, epigenomics, transcriptomics, proteomics, and metabolomics is a must in this new discipline. Although a detailed description of these tools is beyond the scope of this chapter, some fundamentals of these techniques are provided below.

Epigenomics studies the mechanisms of gene expression that can be maintained across cell divisions, and thus throughout the life of the organism, without changing the DNA sequence. Epigenetic mechanisms are related to the changes induced (e.g., by toxins or bioactive food ingredients) in gene expression via altered DNA methylation patterns, altered histone modifications, or noncoding RNAs, including small RNAs. In mammals, many dietary components, including folate, vitamin B6, vitamin B12, betaine, methionine, and choline, have been linked to changes in DNA methylation. These nutrients can all affect the pathways of one-carbon metabolism that determine the amount of available S-adenosylmethionine, which is the methyl donor for DNA methylation and histone methylation. Although it is too early to apply epigenetic alterations induced by dietary ingredients as biomarkers in public health and medicine, research in this area is expected to be boosted by the expanding use of next-generation DNA sequencing technologies. Applications include chromatin immunoprecipitation followed by DNA sequencing (ChIP–seq) to assess the genomic distribution of histone modifications, histone variants, and nuclear proteins, and global DNA methylation analysis through the sequencing of bisulphite-converted genomic DNA. Combined with appropriate statistical and bioinformatic tools, these methods will permit the identification of all the loci that are epigenetically altered.
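
As a rough illustration of how sequencing bisulphite-converted DNA can reveal methylation, the following Python sketch simulates the conversion on a toy sequence. It relies on the standard chemistry (bisulphite treatment converts unmethylated cytosine to uracil, which is read as thymine after amplification, while methylated cytosine is left unchanged). The function names, the toy sequence, and the single-strand, error-free read are assumptions made for illustration and are not taken from this chapter.

```python
def bisulfite_convert(seq, methylated_positions):
    """Simulate the read obtained after bisulphite treatment and amplification.

    Unmethylated C is read as T; methylated C (0-based positions given in
    methylated_positions) stays C. Other bases are unchanged.
    """
    methylated = set(methylated_positions)
    return "".join(
        "T" if base == "C" and i not in methylated else base
        for i, base in enumerate(seq)
    )


def call_methylation(reference, converted):
    """Compare the reference with the converted read.

    Returns a dict mapping each reference cytosine position to True
    (still C, so methylated) or False (read as T, so unmethylated).
    """
    calls = {}
    for i, (ref_base, obs_base) in enumerate(zip(reference, converted)):
        if ref_base == "C":
            calls[i] = obs_base == "C"
    return calls


# Toy example (hypothetical sequence): only the cytosine at position 1 is methylated.
reference = "ACGTCCGA"
converted = bisulfite_convert(reference, methylated_positions=[1])
print(converted)
print(call_methylation(reference, converted))
```

In real experiments each cytosine is covered by many reads, and the methylation level at a position is estimated as the fraction of reads that still show C there.
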
Regarding transcriptomics, the global analysis of gene expression offers impressive opportunities in Foodomics (e.g., for the identification of the effect of bioactive food constituents on homeostatic regulation and how this regulation is potentially altered in the development of certain chronic diseases). Two conceptually different analytical approaches have emerged to allow quantitative and comprehensive analysis of changes in mRNA expression levels of hundreds or thousands of genes. One approach is based on microarray technology, and the other group of techniques is based on DNA sequencing. Next, real-time PCR is typically applied to confirm the up- or down-regulation of a selected number of genes.

In proteomics, the huge dynamic concentration range of proteins in biological samples causes many detection difficulties because many proteins are below the sensitivity threshold of the most advanced instruments. For this reason, fractionation and subsequent concentration of the proteome is often needed. Besides, the use and development of high-resolving separation techniques as well as highly accurate mass spectrometers is nowadays essential to resolve the complexity of the proteome. Currently, more than a single electrophoretic or chromatographic step is used to separate the thousands of proteins found in a biological sample. This separation step is followed by analysis of the isolated proteins (or peptides) by mass spectrometry (MS) via the so-called “soft ionization” techniques, such as electrospray ionization (ESI) and matrix-assisted laser desorption/ionization (MALDI), combined with ever more powerful mass spectrometers. Two fundamental analytical strategies can be employed: the bottom-up and the top-down approach. The two methodologies differ in their separation requirements and in the type of MS instrumentation. New proteomic approaches based on array technology are also being employed.
Protein microarrays can be composed of recombinant protein molecules or antibodies immobilized in a high-density format on the surface of a substrate material. There are two major classes of protein micro- (nano-) arrays, analytical and functional protein microarrays, with the antibody-based microarray being the most common platform in proteomic studies.

The metabolome can be defined as the full set of endogenous or exogenous low molecular weight metabolic entities of approximately <1000 Da (metabolites), and the small pathway motifs that are present in a biological system (cell, tissue, organ, organism, or species). Unlike nucleic acid or protein-based omics techniques, which intend to determine a single chemical class of compounds, metabolomics has to deal with very different compounds of very diverse chemical and physical properties. Moreover, the relative concentration of metabolites in biological fluids can vary from millimolar (or higher) to picomolar level, making it easy to exceed the linear range of the analytical techniques employed. Metabolites are, in general, the final downstream products of the genome, and reflect most closely the operation of the biological system, its phenotype. The analysis of metabolic patterns and of changes in metabolism can therefore be very useful in the nutrition field to locate, for example, variations in different metabolic pathways due to the consumption of different compounds in the diet. One of the main challenges in metabolomics is to face the complexity of any metabolome, usually composed of a huge number of compounds of very diverse chemical and physical properties (sugars, amines, amino acids, organic acids, steroids, etc.). Sample preparation is especially important in metabolomics, because the procedure used for metabolite extraction has to be robust and highly reproducible. Sample preparation will depend on the sample type and the targeted metabolites of interest (fingerprinting or profiling approach).
Moreover, no single analytical methodology or platform is applicable to detect, quantify, and identify all metabolites in a certain sample. Two analytical platforms are currently used for metabolomic analyses: MS- and NMR-based systems. These techniques, either stand-alone or combined with separation techniques (typically, LC-NMR, GC-MS, LC-MS, and CE-MS), can produce complementary analytical information to attain more extensive metabolome coverage. There are three basic approaches that can be used in metabolomics: target analysis, metabolic profiling, and metabolic fingerprinting. Target analysis aims at the quantitative measurement of selected analytes, such as specific biomarkers or reaction products. Metabolic profiling is a nontargeted strategy that focuses on the study of a group of related metabolites or a specific metabolic pathway. It is one of the basic approaches to phenotyping, because the study of the metabolic profiles of a cell gives a more accurate description of a phenotype. Meanwhile, metabolic fingerprinting does not aim to identify all metabolites, but to compare the patterns of metabolites that change in response to the cellular environment.

Due to the huge amount of data usually obtained from omics studies, it has been necessary to develop strategies to convert the complex raw data obtained into useful information. Thus, bioinformatics has also become a crucial tool in Foodomics. In recent years, bioinformatics has made it possible to use the biological knowledge accumulated in public databases to systematically analyze large data lists in an attempt to assemble a summary of the most significant biological aspects.
Also, statistical tools are usually applied for exploratory data analysis to determine correlations among samples (which can be caused by either a biological difference or a methodological bias), to discriminate within the complete data list and reduce it to the most relevant entries for biomarker discovery, etc.
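As a minimal sketch of the kind of exploratory step mentioned above, the following Python example computes pairwise Pearson correlations among samples. It is only an illustration: the sample names, the intensity values, and the choice of Pearson correlation are assumptions made here, not data or methods taken from the studies cited in this chapter.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equally long profiles."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("a constant profile has no defined correlation")
    return cov / (norm_x * norm_y)

def correlation_matrix(samples):
    """Return {(name_a, name_b): r} for every ordered pair of samples."""
    names = list(samples)
    return {(a, b): pearson(samples[a], samples[b]) for a in names for b in names}

# Hypothetical log-scaled intensities of four features in four samples.
samples = {
    "control_1": [6.6, 5.7, 4.3, 3.5],
    "control_2": [6.6, 5.6, 4.5, 3.2],
    "treated_1": [6.6, 5.7, 5.9, 5.4],
    "treated_2": [6.7, 5.6, 5.9, 5.3],
}
corr = correlation_matrix(samples)
```

In this toy data set the two control samples correlate more strongly with each other than with the treated samples. As noted in the text, such a pattern may reflect a biological difference or a methodological bias, so it is a starting point for inspection rather than a conclusion.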

1.2 FOODOMICS APPLICATIONS: CHALLENGES, ADVANTAGES, AND DRAWBACKS

Although there is still a large number of gaps to be filled in our current knowledge on food science and nutrition, the great analytical potential of Foodomics can help resolve many issues and questions related to food safety, traceability, quality, new foods, transgenic foods, functional foods, nutraceuticals, etc.

1.2.1 Food Safety, Quality, and Traceability

Foodomics can help solve some of the new challenges that modern food safety, quality, and traceability have to face. These challenges encompass the multiple analyses of contaminants and allergens; the establishment of more powerful analytical methodologies to guarantee food origin, traceability, and quality; the discovery of biomarkers to detect unsafe products; the capability to detect food safety problems before they grow and affect more consumers; etc.

1.2.2 Transgenic Foods

Although this book includes a chapter devoted to transgenic foods, a brief outline of this topic is given below. Recombinant DNA technology, or genetic engineering, allows selected individual gene sequences to be transferred from one organism to another, including between nonrelated species. Genetic engineering has been used in agriculture and the food industry in recent years in order to improve the performance of plant varieties (resistance to pests, herbicides, and hydric or saline stresses), improve technological properties during storage and processing (firmness of fruits), or improve the sensorial and nutritional properties of food products (starch quality, content of vitamins or essential amino acids). The organisms derived from recombinant DNA technology are termed genetically modified organisms (GMOs). A transgenic food is a food that is derived from or contains GMOs. The use of genetic engineering in food production has grown steadily in recent years, as has the concern among part of the public. This is due, on the one hand, to the increasing impact of this technology on foodstuff production and, on the other, to the continued campaign against GMO crops led by environmental organizations. Claims about the advantages derived from GMO crops include those from the biotechnology companies and most of the scientific community, stressing the benefits for agriculture and the food industry and the lack of scientific evidence of any detrimental effects on human health. On the other side, environmental groups are concerned about the impact of GM plants on human health and on the environment. In this context, most governments have issued regulations on the use, spreading, and marketing of GMOs, in order to regain the confidence of consumers.
Owing to the complexity of the compositional study of a biological system such as a GMO, the study of substantial equivalence as well as the detection of any unintended effect should be approached with advanced profiling techniques that have the potential to extend the breadth of comparative analyses. However, no single technique is currently available that can acquire enough data in a single experimental analysis to detect all compounds found in GMOs or any other organism. Consequently, multiple analytical techniques have to be combined to improve analytical coverage of proteins and metabolites. Indeed, the European Food Safety Authority (EFSA) (EFSA, 2006) has recommended the monitoring of the composition, traceability, and quality of these GM foods using advanced analytical techniques, including omics techniques, to provide a broad profile of these GM foods (Levandi et al., 2008; García-Villalba et al., 2008, 2010; Simó et al., 2010; García-Cañas et al., 2011). The development of new analytical strategies based on Foodomics will provide extraordinary opportunities to increase our understanding of GMOs, including the investigation of unintended effects in GM crops, or the development of the so-called second-generation GM foods. Besides, Foodomics has to deal with the particular difficulties commonly found in food analysis, such as the huge dynamic concentration range of food components as well as the heterogeneity of food matrices and the analytical interferences typically found in these complex matrices.

1.2.3 Foodomics in Nutrition and Health Research

Nowadays, food is investigated not only as a source of energy but also as a potential health promoter. As a result, food scientists and nutritionists have to face a large number of challenges to adequately answer the new questions emerging from this new field of research. One of the main challenges is to improve our limited understanding of the roles of nutritional compounds at the molecular level (i.e., their interaction with genes and their subsequent effect on proteins and metabolites) for the rational design of strategies to manipulate cell functions through diet, which is expected to have an extraordinary impact on our health (García-Cañas et al., 2010). The problem to be resolved is huge, and it includes the study of the individual variations in gene sequences, particularly in single nucleotide polymorphisms (SNPs), and their expected different responses to nutrients. Moreover, nutrients can be considered as signaling molecules that are recognized by specific cellular-sensing mechanisms. However, unlike pharmaceuticals, the simultaneous presence of a variety of nutrients with diverse chemical structures and concentrations, and having numerous targets with different affinities and specificities, enormously increases the complexity of the problem. Therefore, it is necessary to look at hundreds of test compounds simultaneously and observe the diverse temporal and spatial responses.

Foodomics can be an adequate strategy to investigate the complex issues related to prevention of future diseases and health promotion through food intake. It is now well known that health is heavily influenced by genetics. However, diet, lifestyle, and environment can have a crucial influence on the epigenome, gut microbiome, and, by association, the transcriptome, proteome, and, ultimately, the metabolome. When the combination of genetics and nutrition/lifestyle/environment is not properly balanced, poor health results. Foodomics can be a major tool for detecting small changes induced by food ingredient(s) at different expression levels. A representation of an ideal Foodomics strategy to investigate the effect of food ingredient(s) on a given system (cell, tissue, organ, or organism) is shown in Figure 1.2. Following this Foodomics strategy, results on the effect of food ingredient(s) at genomic/transcriptomic/proteomic and/or metabolomic levels are obtained, making possible new investigations on food bioactivity and its effect on human health at the molecular level. The interest in Foodomics also coincides with a clear shift in medicine and biosciences toward prevention of future diseases through adequate food intakes, and the development of the so-called functional foods. It has been mentioned that it is probably too early to conclude on the value of many substances for health, and the same can apply to other health relationships that are still under study. In this regard, it is interesting to remark that several of the health benefits assigned to many dietary constituents are still under controversy, as can be deduced from the large number of applications rejected by the EFSA about health claims of new foods and ingredients (EFSA, 2010; Gilsenan, 2011). More sound scientific evidence is needed to demonstrate the claimed beneficial effects of these new foods and constituents. In this sense, the advent of new postgenomic strategies such as Foodomics seems to be essential to understand how the bioactive compounds from diet interact at molecular and cellular levels, as well as to provide better scientific evidence on their health benefits. The combination of the information from the three expression levels (gene, protein, and metabolite) can be crucial to adequately understand and scientifically sustain the health benefits from food ingredients. To achieve this goal, it will be necessary to carry out more studies to discover more single nucleotide polymorphisms, to identify genes related to complex disorders, to extend the research on new food products, and to demonstrate a higher degree of evidence through epidemiological studies based on Foodomics that can lead to public recommendations. Moreover, in spite of the significant outcomes expected from a global Foodomics strategy, there are practically no papers published in the literature in which results from the three expression levels (transcriptomics, proteomics, and metabolomics) are simultaneously presented and merged.

[Figure 1.2 (diagram): a cell, tissue, organ, or organism under study (control vs. treated with dietary ingredient(s)) yields nucleic acids, proteins, and metabolites. These are analyzed by genomics/transcriptomics (gene expression: microarrays, RT-qPCR, CGE-LIF), proteomics (protein expression: 2DE, MALDI-TOF-TOF, LC-MS, CE-MS), and metabolomics (metabolite expression: CE-MS, LC-MS, FT-MS, GC-MS, NMR and/or MALDI-MSI). Data analysis and data integration through bioinformatics and systems biology form the Foodomics platform, whose outcomes are proved effects, health benefits, biomarkers discovery, claims on new functional foods, and legal issues (known and scientifically based approval).]

FIGURE 1.2 Scheme of an ideal Foodomics strategy to investigate the health benefits from dietary constituents, including methodologies and expected outcomes. Modified from Ibáñez et al. (2012) with permission from Elsevier.
Figure 1.3 shows the results from a global Foodomics study on the chemopreventive effect of dietary polyphenols against HT29 colon cancer cells (Ibáñez et al., 2012). It shows the genes, proteins, and metabolites that were identified (after transcriptomic, proteomic, and metabolomic analysis) to be involved in the principal biological processes altered in HT29 colon cancer cells after the treatment with rosemary polyphenols. In order to demonstrate all its value, Foodomics still needs to be translated into methods or approaches with medicinal impact, for example, through the so-called personalized nutrition. In this regard, data interpretation and integration when dealing with such complex systems is not straightforward and has been identified as one of the main bottlenecks. In Foodomics, to carry out a comprehensive elucidation of the mechanisms of action of natural compounds, specific nutrients, or diets, in vitro assays or animal models are mainly used because (a) they are genetically homogeneous within a particular assay or animal model and (b) environmental factors can be controlled. Moreover, these assays allow the study of certain tissues that would not be possible to obtain from humans. On the other hand, the main difficulty in the study of diets is the simultaneous presence of a variety of nutrients, with diverse chemical structures, that can have numerous targets with different affinities and specificities. Ideally, the final demonstration of the bioactivity of a given food constituent should come from Foodomics, based on a global omics study of the biological samples generated during a clinical trial. It is interesting to mention that there are still rather few studies on the effect of specific natural compounds, nutrients, or diet on the transcriptome/proteome/metabolome of organisms, tissues, or cells, and review papers on this topic outnumber research papers.

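[Figure 1.3 (image): transcriptomics, proteomics, and metabolomics results grouped under three biological processes: antioxidative effect, apoptosis, and cell cycle arrest. Panel label: “Till 1308 genes”.]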

FIGURE 1.3 Foodomics identification of the proteins, genes, and metabolites involved in three of the principal biological processes altered in HT29 colon cancer cells after the treatment with rosemary polyphenols. Underlined, down-regulated; not underlined, up-regulated. Modified from Ibáñez et al., 2012, with permission from Elsevier.

1.3 FOODOMICS, SYSTEMS BIOLOGY, AND FUTURE TRENDS

Analytical strategies used in Foodomics have to face important difficulties derived, among others, from food complexity, the huge natural variability, the large number of different nutrients and bioactive food compounds, their very different concentrations, and the numerous targets with different affinities and specificities that they might have. In this context, proteomics and metabolomics (plus transcriptomics) represent powerful analytical platforms developed for the analysis of proteins and metabolites (plus gene expression). However, “omics” platforms must be integrated in order to understand the biological meaning of the results on the investigated system (e.g., cell, tissue, organ), ideally through a holistic strategy as proposed by Systems Biology. Thus, Systems Biology can be defined as an integrated approach to study biological systems at the levels of cells, organs, or organisms, by measuring and integrating genomic, proteomic, and metabolic data (Panagiotou and Nielsen, 2009). Systems Biology approaches might encompass molecules, cells, organs, individuals, or even ecosystems, and are regarded as an integrative approach to all information at the different levels of genomic expression (mRNA, protein, metabolite).

Although Systems Biology has been scarcely applied to Foodomics studies, its potential is underlined by its adoption by other related disciplines. For instance, Systems Biology has been applied to understand the complexity of the processes in the intestinal tract (dos Santos et al., 2010).
This study is based on human adult microbiota characterization by deep metagenomic sequencing, identification of several hundreds of intestinal genomes at the sequence level, identification of the transcriptional response of the host and selected microbes in animal model systems and in humans, determination of the transcriptional response of the host to different diets in humans, germ-free and gene knockout animals, together with different metabolomics and proteomics studies. The long-term goal is to understand how specific nutrients, diets, and environmental conditions influence cell and organ function, and how they thereby impact on health and disease. This systems knowledge will be pivotal for the development of rational intervention strategies for the prevention of diseases such as diabetes, metabolic syndrome, obesity, and inflammatory bowel diseases. The challenge in the combination of Foodomics and Systems Biology is not only at the technological level, where great improvements are being made and are expected in the “omics” technologies, but also in improving our limited knowledge of many biological processes that can take place at the molecular level. Last but not least, bioinformatics (including data processing, clustering, dynamics, or integration of the various “omics” levels) will have to progress for Systems Biology to demonstrate all its potential in the new Foodomics discipline. In this regard, it is also interesting to mention that the traditional medical world has often noted that although many of the omics tools and Foodomics approaches provide academically interesting research, they have not been translated into methods or approaches with medicinal impact and value because data integration when dealing with such complex systems is not straightforward.
In the future, the combination of Foodomics and Systems Biology can provide crucial information on, for example, host–microbiome interactions, nutritional immunology, food microorganisms including pathogen resistance, postharvest, plant biotechnology, or farm animal production. In addition, other innovative approaches are expected to emerge, such as green Foodomics (see the chapter on this topic in this book), green systems biology (Weckwerth, 2011), or the human gutome.

ACKNOWLEDGMENTS

This work was supported by AGL2011-29857-C03-01 (Ministerio de Ciencia e Innovación, Spain) and CSD2007-00063 FUN-C-FOOD (Programa CONSOLIDER, Ministerio de Educación y Ciencia, Spain) projects.

REFERENCES

Capozzi F, Placucci G (2009). 1st International Conference in Foodomics, Cesena, Italy, 2009.

Cifuentes A (2009). Food analysis and foodomics. Journal of Chromatography A 1216:7109–7110.

dos Santos VM, Müller M, de Vos WM (2010). Systems biology of the gut: the interplay of food, microbiota and host at the mucosal interface. Current Opinion in Biotechnology 21:539–550.

EFSA (2006). Guidance document of the scientific panel on genetically modified organisms for the risk assessment of genetically modified plants and derived food and feed. EFSA Communications Department, Parma, Italy.

EFSA (2010). Opinions of the NDA panel published on 2009 and 2010. http://www.efsa.europa.eu/cs/Satellite (accessed on April 7, 2010).

García-Cañas V, Simó C, León C, Cifuentes A (2010). Advances in nutrigenomics research: novel and future analytical approaches to investigate the biological activity of natural compounds and food functions. Journal of Pharmaceutical and Biomedical Analysis 51:290–304.

García-Cañas V, Simó C, León C, Ibáñez E, Cifuentes A (2011). MS-based analytical methodologies to characterize genetically modified crops. Mass Spectrometry Reviews 30:396–416.

García-Villalba R, León C, Dinelli G, Segura-Carretero A, Fernández-Gutiérrez A, García-Cañas V, Cifuentes A (2008). Comparative metabolomic study of transgenic versus conventional soybean using capillary electrophoresis–time-of-flight mass spectrometry. Journal of Chromatography A 1195:164–173.

Gilsenan MB (2011). Nutrition & health claims in the European Union: a regulatory overview. Trends in Food Science and Technology 22:536–542.

Herrero M, García-Cañas V, Simó C, Cifuentes A (2010). Recent advances in the application of CE methods for food analysis and foodomics. Electrophoresis 31:205–228.

Herrero M, Simó C, García-Cañas V, Ibáñez E, Cifuentes A (2012). Foodomics: MS-based strategies in modern food science and nutrition. Mass Spectrometry Reviews 31:49–69.

Ibáñez C, Valdés A, García-Cañas V, Simó C, Celebier M, Rocamora L, Gómez A, Herrero M, Castro M, Segura-Carretero A, Ibáñez E, Ferragut JA, Cifuentes A (2012). Global foodomics strategy to investigate the health benefits of dietary constituents. Journal of Chromatography A 1248:139–153.

Levandi T, León C, Kaljurand M, García-Cañas V, Cifuentes A (2008). Capillary electrophoresis time-of-flight mass spectrometry for comparative metabolomics of transgenic vs. conventional maize. Analytical Chemistry 80:6329–6335.

Panagiotou G, Nielsen J (2009). Nutritional systems biology: definitions and approaches. Annual Reviews in Nutrition 29:329–339.

Simó C, Domínguez-Vega E, Marina ML, García MC, Dinelli G, Cifuentes A (2010). CE-TOF MS analysis of complex protein hydrolyzates from genetically modified soybeans. A tool for Foodomics. Electrophoresis 31:1175–1183.

Slater N, Wilson I (2007). Horizon Seminar, Foodomics? Why we eat, what we eat, and what’s next on the menu took place on Tuesday, 19 June, 2007. http://www.ceb.cam.ac.uk/pages/foodomics.html.

Weckwerth W (2011). Green systems biology: from single genomes, proteomes and metabolomes to ecosystems research and biotechnology. Journal of Proteomics 75:284–305.

2 NEXT GENERATION INSTRUMENTS AND METHODS FOR PROTEOMICS

María del Carmen Mena and Juan Pablo Albar

2.1 INTRODUCTION

Proteomic methods play a prominent role in the analysis of complex proteomes, contributing to the characterization of the great diversity of proteins present in living organisms. Mass spectrometry (MS) has increasingly become a powerful analytical tool for both qualitative and quantitative analysis of protein samples over a wide dynamic range. MS-based proteomics is a discipline made possible by the availability of genome sequence databases and advances in many areas. Although much ground has been covered, continued advances in methods, instrumentation, and computational analysis are needed to achieve a complete analysis of biological systems. Recently, Foodomics has been defined as a new discipline that studies food and nutrition domains through the application of omics technologies in which MS techniques and proteomics are considered cornerstone players (Herrero et al., 2012).

2.1.1 History of Mass Spectrometry-Based Proteomics

With the rapid advances in protein analytical technologies in the early 1990s, it became possible to perform large-scale protein studies identifying the expression of many of the proteins resolvable by two-dimensional electrophoresis (2-DE). Gel electrophoresis was successfully developed for oligonucleotide sequencing in the late 1970s (Maxam and Gilbert, 1977), and it was also developed to separate proteins at about the same time (O’Farrell, 1975). In 1994 the term “proteome” was coined (Wilkins et al., 1996) and defined as the set of proteins expressed by the genome, and the study of proteomes was named “proteomics.” In its short history, proteomics has evolved vastly. Indeed, just a decade ago, all proteomic data were generated on instruments with low mass accuracy and resolution, and limited scan speed and sensitivity, compared to the high-performance hybrid mass spectrometers presently in common use.

Foodomics: Advanced Mass Spectrometry in Modern Food Science and Nutrition, First Edition. Edited by Alejandro Cifuentes. © 2013 John Wiley & Sons, Inc. Published 2013 by John Wiley & Sons, Inc.

2.1.2 Overview of Classical Proteomics Techniques

2.1.2.1 Separation Techniques by Chromatographic Methods

Chromatography has been used for decades as a separation technique and, over time, has developed into a sophisticated analytical technique. The most used methods for protein separation are liquid chromatographic techniques (e.g., ion exchange, size exclusion, affinity, and reversed-phase (RP)), as well as electrophoretic separation in liquid-phase techniques (capillary isoelectric focusing, capillary zone electrophoresis, capillary gel electrophoresis, and free-flow electrophoresis). Modern RP-HPLC utilizes a wide selection of chromatographic packing materials to separate proteins and peptides. The separation efficiency is determined by particle and pore size, surface area, stationary phase, as well as the chemistry of the substrate surface. The most popular column packing is based on spherical silica particles whose surface is modified by alkyl chains varying in length from C4 to C18 (Neverova and Van Eyk, 2005). The C18 bonded phase is the most used, offering retention and selectivity for a wide range of compounds containing different polar and non-polar groups on their surface. C4 and C8 phases are used preferentially for the separation of proteins, and C18 for peptides (Wagner et al., 2002).

2.1.2.2 Two-Dimensional Electrophoresis

The introduction of sodium dodecyl sulphate polyacrylamide gel electrophoresis (SDS-PAGE) allowed the use of electrophoretic mobility to study the occurrence of multiple protein forms. Nevertheless, single-dimension separations are inadequate for effectively resolving complex protein mixtures. Separation of proteins by 2-DE dates back to the 1950s (Smithies and Poulik, 1956) and is still one of the most frequently used techniques to separate complex protein mixtures prior to characterization by MS. The development of modern 2-DE began with the combination of separation by isoelectric focusing (IEF) in the first dimension and SDS-PAGE in the second dimension, a technique published in 1969 by different authors (Dale and Latner, 1969; Macko and Stegemann, 1969). The most used methods to visualize proteins after 2-DE are Coomassie and silver staining, both compatible with downstream MS analyses (Shevchenko et al., 1996), with limits of detection on the order of picomoles and femtomoles, respectively (Miller et al., 2006). Colloidal Coomassie staining is more sensitive than classical Coomassie. There are several fluorescent dyes to quantify the relative abundances of proteins in 2-DE gels, such as Nile red, SYPRO Orange, SYPRO Red, and SYPRO Tangerine, but the ruthenium-based dye SYPRO Ruby, with sensitivity similar to silver staining and an extended dynamic range, is nowadays one of the most appropriate staining dyes in proteomics.

After electrophoresis separation and gel staining, statistical analysis is performed via one of the powerful software packages specifically designed to match protein spots of gel replicates for the different conditions, to compare protein patterns, and to detect protein changes, both qualitative (presence/absence) and quantitative (spot intensities) (Rotilio et al., 2012). The main limitation of this technique is that some proteins are not suitable for separation by 2-DE. Proteins with a molecular weight lower than 10 kDa or higher than 150 kDa or with very basic isoelectric point (pI) are seldom detected using conventional gels. Moreover, hydrophobic proteins with low solubility cannot enter the gels. In addition, the detection of low-abundance proteins can be hindered by proteins with similar size and charge or by protein expression levels below the detection limits of the technique (Monteoliva and Albar, 2004).
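The limitations listed above can be turned into a simple pre-screen of a protein list. The sketch below is only illustrative: the 10 kDa and 150 kDa limits come from the text, whereas the pI cutoff of 10, the `Protein` record, and the example entries are assumptions introduced here, since the text only speaks of a "very basic" pI.

```python
from dataclasses import dataclass

MIN_MW_KDA = 10.0   # lower molecular weight limit quoted in the text
MAX_MW_KDA = 150.0  # upper molecular weight limit quoted in the text
MAX_PI = 10.0       # ASSUMPTION: numeric stand-in for "very basic pI"

@dataclass
class Protein:
    name: str
    mw_kda: float  # molecular weight in kDa
    pi: float      # isoelectric point

def gel_concerns(protein):
    """List the reasons why a protein may be poorly detected on a conventional 2-DE gel."""
    reasons = []
    if protein.mw_kda < MIN_MW_KDA:
        reasons.append("molecular weight below 10 kDa")
    if protein.mw_kda > MAX_MW_KDA:
        reasons.append("molecular weight above 150 kDa")
    if protein.pi > MAX_PI:
        reasons.append("very basic pI (assumed cutoff)")
    return reasons

# Hypothetical proteins, used only to exercise the function.
proteins = [
    Protein("hypothetical_A", 45.0, 6.2),
    Protein("hypothetical_B", 8.5, 5.0),
    Protein("hypothetical_C", 210.0, 11.3),
]
flagged = {p.name: gel_concerns(p) for p in proteins}
```

Hydrophobicity and abundance, the other two limitations mentioned in the text, are not covered by this sketch because they cannot be reduced to a single threshold without further assumptions.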

2.1.2.3 Difference In-Gel Electrophoresis

Differential proteomics, that is, the comparison of different proteomes or different samples such as healthy versus diseased, makes it possible to perform sensitive, accurate, and reproducible quantitative proteomics studies (Monteoliva and Albar, 2004). The conventional 2-DE technique does not accomplish this goal, because two different samples cannot be distinguished within the same gel. In contrast, difference in-gel electrophoresis (DIGE), a modified form of 2-DE, allows different proteins, and even different isoforms of a protein that have different migration patterns on the 2-DE gel, to be quantified. DIGE technology will be explained in more detail below.

2.1.2.4 Protein Identification by Mass Spectrometry

Identification by Two-Dimensional Electrophoresis Combined with Mass Spectrometry. The classical workflow in MS-based proteomics includes protein separation by 2-DE and staining. Afterward, gel images are analyzed and spots of interest are excised and destained to prevent the stain from interfering with MS analysis. Some samples may also need to be desalted and concentrated using pipette tips containing C18 or C4 resin. The proteins are then digested, most commonly with trypsin, which cleaves proteins very specifically at the C-terminal side of lysine and arginine and generates peptides in the preferred mass range for subsequent MS analysis. Alternatively, protein mixtures may be digested directly without previous separation, and the resulting peptide mixtures are then analyzed by LC–MS.
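The cleavage rule described above (cutting at the C-terminal side of lysine and arginine) can be illustrated with a minimal in silico digestion sketch in Python. The function name, the `missed_cleavages` parameter, and the example sequence are hypothetical and chosen only for illustration; the sketch deliberately ignores refinements such as reduced cleavage before proline, which the text does not discuss.

```python
def tryptic_digest(sequence, missed_cleavages=0):
    """Cleave after every K or R and return the resulting peptides.

    Simplified rule taken from the text; real digestion tools apply
    additional exceptions that are not modeled here.
    """
    sequence = sequence.upper()
    # Cleavage positions: the index just after each lysine (K) or arginine (R).
    sites = [i + 1 for i, aa in enumerate(sequence) if aa in "KR"]
    bounds = [0] + [s for s in sites if s < len(sequence)] + [len(sequence)]
    peptides = []
    for i in range(len(bounds) - 1):
        # Joining neighboring fragments models incomplete (missed) cleavage.
        last = min(i + 2 + missed_cleavages, len(bounds))
        for j in range(i + 1, last):
            peptides.append(sequence[bounds[i]:bounds[j]])
    return peptides


# Hypothetical example sequence, not taken from a real protein entry.
print(tryptic_digest("MKWVTFISLLR"))
print(tryptic_digest("MKWVTFISLLR", missed_cleavages=1))
```

Allowing one missed cleavage also returns the undigested combination of neighboring fragments, which is how incomplete digestion is usually accommodated when peptide lists are generated for database searching.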

Ionization Techniques for Peptides and Proteins. To measure the mass or, more specifically, the mass-to-charge ratio (m/z) in a mass spectrometer, peptides and proteins must first be ionized and transferred into the high-vacuum system of the instrument. In the late 1980s, two high-sensitivity ionization methods were developed: matrix-assisted laser desorption ionization (MALDI) (Karas and Hillenkamp, 1988) and electrospray ionization (ESI) (Fenn et al., 1989).
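Because the instrument measures m/z rather than mass, it can help to see how the two are related for positive ions formed by protonation. The short Python sketch below assumes that each charge corresponds to one added proton; the function names and the example mass are illustrative only.

```python
PROTON_MASS = 1.007276  # mass of a proton in Da


def mz_from_mass(neutral_mass, charge):
    """m/z of an [M + zH]z+ ion, assuming protonation is the only charge carrier."""
    return (neutral_mass + charge * PROTON_MASS) / charge


def mass_from_mz(mz, charge):
    """Neutral mass recovered from an observed m/z and a known charge state."""
    return charge * (mz - PROTON_MASS)


# Hypothetical peptide of neutral monoisotopic mass 1000.0 Da.
for z in (1, 2, 3):
    print(z, round(mz_from_mass(1000.0, z), 4))
```

The same peptide therefore appears at different m/z values depending on its charge state, which is relevant when comparing spectra dominated by singly charged ions with spectra containing multiply charged ions.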

Matrix-Assisted Laser Desorption Ionization. MALDI is one of the most widely used ionization techniques in MS. MALDI time-of-flight (MALDI-TOF) instruments are among the most commonly used mass spectrometers and consist of an ion source, a mass analyzer, and a detector. The purpose of the ion source is to convert sample molecules from solution or solid phase into ionized analytes. First, analytes are co-crystallized with an organic matrix, such as α-cyano-4-hydroxycinnamic acid or sinapinic acid, on a metal target. A rapidly pulsed laser excites the matrix, which absorbs the laser energy and transfers it to the acidified analyte; the resulting rapid heating of the molecules eventually leads to desorption of ions into the gas phase. After ionization, the ions reach the TOF mass analyzer, where they are separated on the basis of their m/z. Ion motion in the mass analyzer can be manipulated by electric or magnetic fields to direct ions to a detector, which registers the number of ions at each individual m/z value (Rotilio et al., 2012). MALDI requires several hundred laser shots to achieve an acceptable signal-to-noise ratio for ion detection, and the generated ions are predominantly singly charged (Sze et al., 2002). The drawbacks of this type of ionization are low shot-to-shot reproducibility and strong dependence on sample preparation methods. In general, the mass resolution and accuracy of a MALDI-TOF mass spectrometer are not high enough to give an unambiguous identification of a peptide. The concept of MALDI has led to techniques such as surface-enhanced laser desorption ionization (SELDI), which introduce surface affinity toward various protein and peptide molecules.
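The separation by m/z in a TOF analyzer follows from basic physics: ions accelerated through the same potential acquire the same kinetic energy per charge, so heavier ions travel more slowly. The Python sketch below assumes an idealized linear TOF with a field-free drift tube; the acceleration voltage (20 kV) and drift length (1 m) are illustrative assumptions rather than values from the text.

```python
import math

DALTON_KG = 1.66053907e-27            # kilograms per dalton
ELEMENTARY_CHARGE = 1.602176634e-19   # coulombs


def flight_time(mz, accel_voltage=20_000.0, drift_length=1.0):
    """Flight time in seconds for an ion of the given m/z (Da per charge).

    Idealized model: z*e*U = 0.5*m*v**2, then t = L / v.
    """
    velocity = math.sqrt(2 * ELEMENTARY_CHARGE * accel_voltage / (mz * DALTON_KG))
    return drift_length / velocity


for mz in (500.0, 1000.0, 2000.0):
    print(f"m/z {mz:7.1f}: {flight_time(mz) * 1e6:.2f} microseconds")
```

In this model the flight time grows with the square root of m/z, so an ion with four times the m/z takes twice as long to reach the detector.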

Electrospray Ionization. Unlike MALDI, the ESI source produces ions from solution. The use of ESI coupled to MS was introduced in 1989 (Fenn et al., 1989) and led to the Nobel Prize in Chemistry in 2002. During ESI, a high voltage is applied between the emitter at the end of the separation pipeline and the inlet of the mass spectrometer. The physicochemical processes of ESI involve the creation of an electrically charged spray, followed by the formation and desolvation of analyte-solvent droplets; desolvation is aided by a heated capillary and, in some cases, by a heated gas flow at the mass spectrometer inlet (Steen and Mann, 2004). Important developments of the ESI technique include micro- and nano-ESI, in which peptide mixtures are sprayed into the mass spectrometer at very low flow rates, improving the sensitivity of the method.

Peptide Mass Fingerprinting. The development of new MALDI instruments, in which a MALDI source is coupled to a double time-of-flight section (MALDI-TOF/TOF), a hybrid quadrupole TOF, or an ion trap, makes it possible to determine peptide sequences. MALDI-TOF/TOF MS is widely used in proteomics to identify proteins by a process called peptide mass fingerprinting (PMF), in which the measured masses of the proteolytic peptides are compared with the masses predicted from database sequences. The main limitations of the MALDI-based PMF approach are that proteins must be completely sequenced and annotated in databases; it cannot identify proteins containing post-translational modifications (PTMs); it requires complete protein separation; and it is not appropriate for proteins with extensive cross-similarity.
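The matching step behind PMF can be sketched in a few lines of Python: each database sequence is digested in silico, the monoisotopic masses of its peptides are computed, and the candidate explaining the most observed masses is reported. This is a simplified illustration, not the scoring scheme of any particular search engine; the toy database, the tolerance, and all function names are hypothetical.

```python
# Monoisotopic residue masses in Da (standard values).
MONO = {"G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
        "T": 101.04768, "C": 103.00919, "L": 113.08406, "I": 113.08406, "N": 114.04293,
        "D": 115.02694, "Q": 128.05858, "K": 128.09496, "E": 129.04259, "M": 131.04049,
        "H": 137.05891, "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931}
WATER = 18.01056


def digest(seq):
    """Simplified tryptic digestion: cleave after every K or R."""
    peptides, start = [], 0
    for i, aa in enumerate(seq):
        if aa in "KR":
            peptides.append(seq[start:i + 1])
            start = i + 1
    if start < len(seq):
        peptides.append(seq[start:])
    return peptides


def peptide_mass(peptide):
    """Neutral monoisotopic mass: residue masses plus one water."""
    return sum(MONO[aa] for aa in peptide) + WATER


def pmf_score(observed, seq, tol=0.02):
    """Number of observed masses explained by a predicted peptide within tol (Da)."""
    theoretical = [peptide_mass(p) for p in digest(seq)]
    return sum(any(abs(o - t) <= tol for t in theoretical) for o in observed)


def best_match(observed, database):
    """Name of the database entry with the highest number of matched masses."""
    return max(database, key=lambda name: pmf_score(observed, database[name]))


# Toy database with hypothetical sequences.
database = {"protA": "GASPKVTCR", "protB": "LLNDKQEMHR"}
observed = [peptide_mass("GASPK"), peptide_mass("VTCR")]
print(best_match(observed, database))
```

The sketch also makes the limitations listed above concrete: a protein absent from the database, or a peptide whose mass is shifted by a modification, simply produces no match.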

Tandem Mass Spectrometry. Tandem mass spectrometry (MS/MS) is a process in which an ion formed in an ion source is mass-selected in the first phase, reacted and fragmented, and then the charged products from the reaction are analyzed in the second phase. The high precision of mass spectrometric measurements makes it possible to analyze small molecules and distinguish closely related species, and MS/MS can provide structural information on molecular ions that can be specifically isolated on the basis of their m/z and fragmented in the gas phase within the instrument. In LC–MS/MS, peptides generated from the digestion of complex mixtures of proteins are separated on the basis of their hydrophobicity and introduced into the mass spectrometer, in most cases directly via online ESI. The ESI source can be coupled to several mass analyzers, such as a quadrupole, ion trap, orbitrap, or Fourier transform ion cyclotron resonance system, whose accuracy and sensitivity differ widely (Yates et al., 2009). After ionization and detection, the final step in this process is the identification of the proteins by searching the MS/MS fragmentation spectra against specific databases.

2.1.3 Sample Preparation Methods

Sample preparation is critically important in proteomics experiments. Less soluble proteins are difficult to study, and the detection of low-abundance proteins is a great challenge for proteomics. Adjuvants and contaminants, such as salts, detergents, or stabilizers, can interfere with the results of mass spectrometric analysis. In the case of LC coupled to ESI-MS, salts and detergents can be removed online within the HPLC setup (e.g., with a guard column or trapping column). For higher concentrations and for MALDI-MS applications, spin columns, dialysis, or precipitation are the most commonly applied methods. Nevertheless, to avoid losses or modifications of the proteins, sample preparation steps have to be kept to the minimum necessary.

2.2 EMERGING METHODS IN PROTEOMICS

2.2.1 Bottom-up and Top-down Proteomics

The field of MS-based proteomics can be broadly divided into two fundamental approaches: the increasingly popular top-down approach, which focuses on the direct MS analysis of intact proteins subjected to gas-phase fragmentation; and the more widely used bottom-up approach, which focuses on the analysis of peptides obtained by proteolytic digestion of proteins (Fig. 2.1). With top-down analysis, all PTMs are subjected to analysis, whereas bottom-up analysis may miss the fragments carrying these modifications.

2.2.1.1 Bottom-up Proteomics

Bottom-up analyses are performed by initial proteolytic digestion of the protein of interest, followed by LC–MS analysis of the resultant peptides, whose sequences are used to identify the corresponding proteins. The enzyme most used for protein digestion is trypsin, which is very well suited to downstream analysis by the most common MS and MS/MS techniques. However, information regarding PTMs or protein isoforms could be missed, and it is often worth considering other proteolytic enzymes or applying a panel of enzymes (Swaney et al., 2010). The digestion of proteins greatly increases the complexity of samples, and it is essential to separate them into manageable, reproducible fractions. In addition, pre-analytical sample processing must be considered, especially for high-complexity samples in large-scale analyses.

FIGURE 2.1 Representation of top-down and bottom-up proteomics approaches.

Experimentally determined peptide masses that differ from those predicted from the primary protein sequence allow for identification of modified regions within the protein (Henzel et al., 1993). Analysis of these peptide ions by MS/MS may be used for further characterization of the modification, including its localization to a specific site within the peptide sequence. It is common, however, that some of the peptides resulting from bottom-up digestion strategies are not observed upon mass spectrometric analysis because of their poor chromatographic retention behavior or inefficient ionization (Kapp et al., 2003). The main pre-fractionation methods used are SDS-PAGE, size-exclusion, anion-exchange, cation-exchange, and lectin-affinity chromatography, RP-LC in basic media, free solution IEF, and high-abundance protein depletion.

Some of the advantages of the bottom-up approach include better front-end separation of peptides compared with proteins and higher sensitivity than the top-down method. Drawbacks of the bottom-up approach include limited protein sequence coverage by identified peptides, loss of labile PTMs, and ambiguity of the origin of redundant peptide sequences (Yates et al., 2009).
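The comparison of experimental and predicted peptide masses described for the bottom-up approach can be sketched as a lookup of the mass difference against a table of known modification masses. The Python example below uses a small, illustrative subset of common monoisotopic mass shifts; the tolerance, the function name, and the example masses are assumptions made for the illustration.

```python
# Monoisotopic mass shifts in Da for a few common modifications (illustrative subset).
PTM_SHIFTS = {
    "phosphorylation": 79.96633,
    "acetylation": 42.01057,
    "oxidation": 15.99491,
    "methylation": 14.01565,
}


def explain_mass_shift(observed_mass, predicted_mass, tol=0.01):
    """Return the modifications whose mass shift matches observed - predicted."""
    delta = observed_mass - predicted_mass
    return [name for name, shift in PTM_SHIFTS.items() if abs(delta - shift) <= tol]


# Hypothetical peptide predicted at 1000.004 Da but observed about 80 Da heavier.
print(explain_mass_shift(1079.970, 1000.004))
```

A matching mass shift only indicates that a modification is likely present somewhere in the peptide; as stated above, MS/MS is still needed to localize it to a specific residue.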

2.2.1.2 Top-down Proteomics

The top-down MS method is used for the comprehensive identification and characterization of the total number and type of co-translational modifications and PTMs that are present within a protein of interest. This top-down approach is based on the mass difference between the experimental and predicted masses of the intact protein. However, determination of an intact protein mass alone, even at the highest resolution and mass accuracy provided by modern MS instrumentation, is generally not sufficient for the characterization of a modified protein, because the modification cannot be unambiguously localized to a specific site within the protein sequence (Lee et al., 2002). In top-down MS/MS-based strategies, ions derived from the intact protein are isolated following their initial mass analysis and then subjected to fragmentation. As the entire sequence of the protein is available, protein identification may potentially be achieved in a single step, including the characterization of any PTMs (Sze et al., 2002). Compared with bottom-up approaches, the higher sequence coverage of top-down experiments reduces the ambiguities of peptide-to-protein mapping, which allows for identification of specific protein isoforms (Uttenweiler-Joseph et al., 2008).

However, there are technological limitations to the top-down method: front-end separation of intact proteins is more challenging than the separation of peptide mixtures, and methods to fragment large proteins are more complex. To apply these methods successfully, it is necessary to use instrumental or chemical approaches for determining the charge states and masses of the multiply charged product ions resulting from the dissociation of large multiply charged protein ions, and to develop multistage tandem MS or alternative ion dissociation methods to maximize the sequence coverage obtained from these dissociation reactions (Scherperel and Reid, 2007).
Further advances in protein characterization using top-down approaches can be made with the development of high-resolution mass analyzers.
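One common instrumental way to determine the charge state of a multiply charged ion, provided the analyzer resolves the isotopic peaks, is to use the spacing between adjacent isotope peaks: neighboring isotopologues differ by about 1.00335 Da (the 13C minus 12C mass difference), so their spacing on the m/z axis is roughly 1.00335/z. The Python sketch below illustrates this idea; it is one possible approach rather than the specific method of the cited authors, and the peak list is hypothetical.

```python
ISOTOPE_SPACING = 1.00335  # Da, mass difference between 13C and 12C
PROTON_MASS = 1.007276     # Da


def charge_from_isotopes(peak_mzs):
    """Estimate the charge state from the m/z values of adjacent isotope peaks."""
    peaks = sorted(peak_mzs)
    spacings = [b - a for a, b in zip(peaks, peaks[1:])]
    mean_spacing = sum(spacings) / len(spacings)
    return round(ISOTOPE_SPACING / mean_spacing)


def neutral_mass(mz, charge):
    """Neutral mass of a protonated ion observed at the given m/z and charge."""
    return charge * (mz - PROTON_MASS)


# Hypothetical isotope cluster of a large product ion.
cluster = [1000.0000, 1000.1003, 1000.2007]
z = charge_from_isotopes(cluster)
print(z, round(neutral_mass(cluster[0], z), 3))
```

The example also shows why high-resolution analyzers matter for top-down work: at a charge of 10 the isotope peaks are only about 0.1 m/z units apart and must be resolved for the charge state to be read directly from the spectrum.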

FIGURE 2.2 Workflow of protein quantitation. MudPIT, multidimensional protein identification technology; SCX, strong cation exchange; RP, reversed phase.

2.2.2 Methods for Quantitative Proteomics

In addition to the classical methods of 2-DE and DIGE, MS-based quantification methods have gained increasing popularity. There are two broad groups of quantitative methods in MS-based proteomics: relative and absolute quantitative proteomics. In addition, quantitative proteomics can be classified into two major approaches: differential stable isotope labeling and label-free techniques (Fig. 2.2). Proteomic quantification methods that use dyes, fluorophores, or radioactivity provide very good sensitivity, linearity, and dynamic range, but they have two shortcomings: they require high-resolution protein separation, typically provided by 2-DE gels, which limits their applicability to abundant and soluble proteins; and they do not allow the proteins to be identified.

In labeling methods, two or more different samples (e.g., healthy and diseased) are labeled with either the light or the heavy isotope to create a specific mass tag that can be recognized by a mass spectrometer; the ratio of heavy to light then allows a comparative analysis of the relative amounts of proteins in the samples. These isotope labels can be introduced into amino acids of proteins or peptides (a) metabolically (such as SILAM and SILAC), (b) chemically (such as ICAT, ICPL, iTRAQ, and TMT), (c) enzymatically (18O/16O), or (d) as an external standard using spiked synthetic peptides. In contrast, label-free methods aim at comparing two or more experiments on the basis of the signal intensity for any given peptide or by counting the spectra identified for each sample by the search engine.

2.2.2.1 Quantitative Proteomics by Difference In-Gel Electrophoresis

DIGE allows for simultaneous separation of up to three samples on one gel. These samples, usually two different samples (control vs. experimental conditions) and one internal standard, are covalently labeled separately with three fluorescent cyanine dyes (CyDyes Cy2, Cy3, and Cy5), each with a unique excitation/emission wavelength (to discriminate the proteins from each sample), then combined and run together (multiplexed) on the same 2-DE gel. Typically, Cy3 and Cy5 are used for labeling samples and Cy2 is used for the internal standard. The internal standard is a pooled mixture containing an equal aliquot of all test samples, which facilitates accurate inter-gel matching of spots; it allows for data normalization and minimizes gel-to-gel experimental variability, making it possible to measure subtle changes in protein abundance. Protein spots corresponding to proteins of the different samples can then be visualized by scanning, and differential analysis software enables relative quantitation of the labeled proteins (Fernandez and Albar, 2012; Richard et al., 2006).

DIGE offers clear advantages over gel-to-gel comparisons because the different samples are run on the same 2-DE gel and hence the same spots comigrate. Analysis is performed using software such as GE DeCyder, which includes a co-detection algorithm, and results are presented using univariate statistics (Student's t-test). Labeling with DIGE fluors is extremely sensitive, but the relatively high cost of the reagents, equipment, and software limits wide application of the technique.
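The role of the Cy2 internal standard in normalization can be illustrated with a short Python sketch: each sample signal is first expressed relative to the pooled standard on the same gel, and the two standardized values are then compared. This is a simplified illustration and not the algorithm implemented in DeCyder; the assignment of Cy3 to the control and Cy5 to the experimental sample, the spot volumes, and the function names are assumptions.

```python
import math


def standardized_log_ratios(spots):
    """Map spot id -> (log2 Cy3/Cy2, log2 Cy5/Cy2).

    spots: {spot_id: (cy2, cy3, cy5)} spot volumes from a single gel,
    where Cy2 carries the pooled internal standard.
    """
    return {spot: (math.log2(cy3 / cy2), math.log2(cy5 / cy2))
            for spot, (cy2, cy3, cy5) in spots.items()}


def log2_fold_change(spots):
    """log2 (experimental / control), assuming Cy3 = control and Cy5 = experimental."""
    return {spot: cy5_std - cy3_std
            for spot, (cy3_std, cy5_std) in standardized_log_ratios(spots).items()}


# Hypothetical spot volumes (arbitrary units).
gel = {"spot_17": (1000.0, 500.0, 2000.0), "spot_42": (800.0, 800.0, 800.0)}
print(log2_fold_change(gel))
```

Because every gel contains the same pooled standard, the standardized log ratios from different gels can be placed on a common scale before a statistical test such as Student's t-test is applied across replicates.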

2.2.2.2 Differential Stable Isotope Labeling

Stable isotope labeling was introduced into proteomics in 1999 by three independent laboratories (Gygi et al., 1999; Oda et al., 1999; Paša-Tolić et al., 1999). Given that a mass spectrometer can recognize the mass difference between the labeled and unlabeled forms of a peptide, quantification is achieved by comparing their respective signal intensities. Figure 2.3 represents the stable isotope labeling methods most widely used in proteomics.

The first stable isotope labeling approach was based on a class of reagents termed isotope-coded affinity tags (ICATs), in which the sample is chemically reacted with the light or heavy form of an isotope tag, leading to cysteine-specific tagging of intact proteins, followed by proteolytic digestion. This technique provides an alternative to the 2-DE-based approaches, but its main drawback is that only proteins containing cysteines can be quantified.

The cleavable version of this reagent (cICAT) contains nine 13C atoms instead of deuterium and an acid-cleavable biotin moiety. The advantages of this approach are that the slightly different LC retention times caused by deuterium no longer occur and that the potential confusion arising from double ICAT labeling producing the same mass shift as oxidation is removed. In addition, biotin cleavage improves the quality of the spectra and leads to the identification of more proteins (Elliott et al., 2009).

Most labeling-based quantification approaches have potential limitations. These include increased time and complexity of sample preparation, the requirement for higher sample concentration, the high cost of the isotope labels, incomplete labeling, and the requirement for specific quantification software. Moreover, so far only some labeling methods allow the comparison of multiple samples at the same time, whereas most of them can only compare the relative quantity of a protein in two to three different samples. In one study comparing ICAT, iTRAQ, and DIGE, iTRAQ was found to be more sensitive for quantitation but more susceptible to errors in precursor ion isolation (Wu et al., 2006). Figure 2.4 illustrates the chemical structures of the most used labeling reagents.

FIGURE 2.3 Relative quantitation: labeling methods. [Workflow panels: SILAC (metabolic labeling of light and heavy cultures); ICAT (with avidin affinity separation); ICPL (chemical labeling); iTRAQ (four samples mixed 1:1:1:1, reporter ions at m/z 114, 115, 116, and 117); TMT; enzymatic labeling; spiked peptides. In each panel the samples are mixed, fractionated, digested, and analyzed by LC–MS.] The figure shows the stage (protein or peptide) at which labeling is most commonly performed, although in some cases other options exist: ICAT, peptides; ICPL, peptides; iTRAQ, proteins. For ICPL the two-plex option is shown; triplex and four-plex versions also exist but are less used. For iTRAQ the four-plex option is shown; an eight-plex version also exists. For TMT the two-plex option is shown; a six-plex version also exists. I, isobaric mass tags; NI, non-isobaric mass tags. The figure also indicates the level (MS or MS/MS mode) at which quantification is performed for each labeling option. Identification is always performed in MS/MS mode.

Stable Isotope Labeling of Mammals. The stable isotope labeling of mammals (SILAM) approach is a metabolic labeling method based on the introduction of stable isotopes such as 15N into all the proteins of an animal model, such as a rodent, by maintaining a long-term diet enriched with that isotope. Afterward, the organs enriched with the isotopes can be used as a quantitative internal standard to measure changes in individual protein levels after treatment with different drugs or during the development of a disease (McClatchy et al., 2011). Thus, the analysis of labeled mammalian tissues allows a global quantitative analysis of any model of human disease, providing greater insight into physiology than cultured cells or other labeling approaches.

FIGURE 2.4 Schematic representation of the chemical structures of the most used labeling reagents. For ICPL, the positions where carbon or hydrogen is replaced by the corresponding isotope (13C and/or deuterium, respectively) are indicated in gray for both the two-plex and four-plex versions of the reagent. For iTRAQ reagents, the mass ranges (m/z) of the reporter group and balancer group, as well as the constant mass of the total group (isobaric tag), are indicated in brackets for both the four-plex and eight-plex versions of the reagent. For the TMT reagent, the mass ranges (m/z) of the reporter group are indicated in brackets for both the two-plex and six-plex variants of the reagent. The mass normalization group balances the mass differences between reporter groups so that the isobaric tag has the same overall mass.

Stable Isotope Labeling by Amino Acids in Cell Culture. The stable isotope labeling by amino acids in cell culture (SILAC) approach incorporates amino acids isotopically labeled with deuterium, 13C, or 15N into proteins via metabolic labeling in the cell culture medium itself, as the proteins are synthesized by the growing organism. Cell samples to be compared are grown separately in media supplied with either the heavy or the light form of an essential amino acid, usually leucine, arginine, lysine, and/or tyrosine. The proteins from the light and heavy cells are distinguishable by MS, and the relative abundance ratio of peptides can be measured experimentally by comparing heavy/light peptide pairs. Lysine and arginine are the two most commonly used labeled amino acids, as virtually every tryptic peptide will contain one of these amino acids. One drawback of this technique is the metabolic conversion of labeled amino acids, such as the generation of labeled proline from arginine. Nevertheless, reducing labeled proteins to amino acids by acid hydrolysis before LC–MS/MS analysis and increasing the proline concentration in the growth medium lead to more complete incorporation of the labeled amino acids and therefore to more accurate quantitative results (Marcilla et al., 2011). SILAC works well in mammalian cell lines because of their inability to synthesize all of their amino acids, but it does not work well in plants because of their autotrophic nature. Additionally, some cells are harder to grow in the dialyzed serum required for SILAC because of the loss of essential growth factors (Elliott et al., 2009). Finally, a recent paper describes Super-SILAC, in which a mixture of five SILAC-labeled human carcinoma cell lines was used as an internal standard to quantify proteins in an unknown tumor sample (Geiger et al., 2010).
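The comparison of heavy/light peptide pairs can be sketched in Python as two steps: predicting where the heavy partner of a light peptide should appear, and forming the intensity ratio. The sketch assumes one common choice of labels (13C6-lysine and 13C6,15N4-arginine); the mass shifts follow from the isotope mass differences, while the function names and example values are hypothetical.

```python
# Mass shift per labeled residue in Da, assuming 13C6-Lys and 13C6,15N4-Arg labels.
HEAVY_SHIFT = {"K": 6.02013, "R": 10.00827}


def heavy_mz(light_mz, peptide, charge):
    """Expected m/z of the heavy partner of a light peptide ion."""
    shift = sum(HEAVY_SHIFT.get(aa, 0.0) for aa in peptide)
    return light_mz + shift / charge


def silac_ratio(light_intensity, heavy_intensity):
    """Heavy-to-light abundance ratio for one peptide pair."""
    return heavy_intensity / light_intensity


# Hypothetical doubly charged peptide ending in arginine.
print(round(heavy_mz(500.0, "VTCR", 2), 4))
print(silac_ratio(light_intensity=2.0e6, heavy_intensity=1.0e6))
```

In practice the ratios of several peptides from the same protein would be combined, for example by taking their median, to obtain a protein-level estimate.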

Isotope-Coded Protein Label. Isotope-coded protein label (ICPL) is a non-isobaric technique based on stable isotope tagging at the free amino groups. After labeling of up to four different proteome states at the same time, the protein samples are combined and digested, and the complexity is reduced by any of the separation methods available. The ratios of labeled peptides corresponding to the different proteome states are quantified by MS, and the peptides are identified by MS/MS. Alternatively, labeling can be performed at the peptide level instead of the protein level, increasing both the number of proteins identified and the quantitation accuracy (Paradela et al., 2010).

The quantification of multiplexed ICPL experiments is greatly facilitated by the ICPLQuant software. The method shows highly accurate and reproducible quantification of proteins, yields high sequence coverage, and is useful for the comprehensive detection of PTMs and protein isoforms (Lottspeich and Kellermann, 2011).

Isobaric Tags for Relative and Absolute Quantitation. The isobaric tag for relative and absolute quantitation (iTRAQ) is a quantitative labeling method that can be used to profile four to eight different samples in a single experiment. In contrast to the majority of other labeling methods, iTRAQ relies on quantification at the MS/MS level rather than at the MS level. To this end, iTRAQ reagents contain an amine-reactive tagging group that is available in four to eight isotope-coded variants, all with an identical molar mass (isobaric), a balance group, and a reporter group. The labeling can be performed at the protein level (before digestion) or at the peptide level, although it is more frequently employed at the peptide level.
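Quantification at the MS/MS level can be illustrated with a short Python sketch that reads the reporter ion intensities from a fragmentation spectrum and expresses each channel relative to a reference channel. The nominal reporter m/z values 114 to 117 are those of the four-plex reagent shown in Figure 2.3; the matching tolerance, the choice of channel 114 as reference, and the example spectrum are assumptions made for illustration.

```python
REPORTER_CHANNELS = (114, 115, 116, 117)  # nominal four-plex reporter m/z values


def reporter_ratios(spectrum, reference=114, tol=0.2):
    """Relative abundance of each reporter channel in one MS/MS spectrum.

    spectrum: list of (mz, intensity) pairs; peaks within tol of a nominal
    reporter m/z are summed and divided by the reference channel intensity.
    """
    intensity = {ch: sum(i for mz, i in spectrum if abs(mz - ch) <= tol)
                 for ch in REPORTER_CHANNELS}
    if intensity[reference] == 0:
        raise ValueError("reference reporter ion not observed")
    return {ch: intensity[ch] / intensity[reference] for ch in REPORTER_CHANNELS}


# Hypothetical MS/MS peak list: four reporter ions plus one sequence fragment.
spectrum = [(114.11, 100.0), (115.11, 200.0), (116.11, 50.0),
            (117.11, 100.0), (300.20, 999.0)]
print(reporter_ratios(spectrum))
```

Because the tags are isobaric, the differently labeled forms of a peptide are selected together as a single precursor, and only the reporter ions released on fragmentation reveal the contribution of each sample.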

Tandem Mass Tag Labeling. Tandem mass tag (TMT) is an isobaric quantitative labeling method that is available in two different sets of tags, which share the same chemical structure but differ in the number of incorporated heavy isotopes: TMT duplex, which allows for the routine investigation of two different samples in parallel; and TMT six-plex, for six samples. Routinely, raw protein samples are denatured, reduced, alkylated, and digested with a protease such as trypsin prior to labeling with TMT reagents. The TMT label attaches to amino groups (both N-terminal α-amino groups and ε-amino groups of lysine residues) of peptides generated by tryptic digestion. After labeling, the samples are mixed and subsequently analyzed by LC–MS/MS, and peptides are simultaneously identified and quantified (Kuhn et al., 2012b).

Absolute Quantification. In the method of absolute quantification (AQUA) of proteins, a stable isotope-labeled synthetic peptide is used as a standard for a particular peptide and is added in a known quantity to a protein digest. The comparison of the mass spectrometric signal of the peptide in the sample with that of the heavy labeled standard can be used to calculate the concentration of the peptide and hence of the protein from which it is derived. Given that tryptic digests of entire proteomes are very complex mixtures, and that most mass spectrometers have a rather limited dynamic detection range, there are a number of limitations to the AQUA approach. A basic step is the choice of the synthetic peptide standard and of the amount of labeled standard to add to a sample, which may differ for each protein of interest because protein abundances can differ greatly within a sample. Another limitation is the specificity of the spiked standard, as there are likely to be multiple isobaric peptides present in the mixture.

Absolute measurements of proteins can also be achieved with the quantification concatemer (QconCAT) approach (Beynon et al., 2005). This approach employs a synthetic gene encoding a concatenation of multiple proteotypic peptides from proteins of interest; the resulting artificial protein is isotopically labeled, digested, and then used in multiple reaction monitoring (MRM) assays. Another approach is Stable Isotope Standards and Capture by Anti-Peptide Antibodies (SISCAPA), which employs immunoaffinity enrichment of targeted peptides from complex mixtures such as digested tissue or biofluids prior to analysis by MRM-MS (immuno-MRM) (Unwin et al., 2006).
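The arithmetic behind the AQUA calculation is a simple proportion between the signals of the endogenous (light) peptide and the spiked heavy standard. The Python sketch below assumes that both forms give the same response per mole and that the measured signals are peak areas; the function name, units, and numbers are hypothetical.

```python
def aqua_amount(light_area, heavy_area, spiked_fmol):
    """Amount of endogenous peptide (fmol) from the light/heavy signal ratio.

    Assumes light and heavy forms ionize equally and co-elute, so the
    area ratio equals the molar ratio.
    """
    return light_area / heavy_area * spiked_fmol


# Hypothetical measurement: 50 fmol of heavy standard spiked into the digest.
amount = aqua_amount(light_area=3.0e6, heavy_area=1.5e6, spiked_fmol=50.0)
print(amount)
```

The example also illustrates the practical limitation noted above: if the endogenous peptide were thousands of times more or less abundant than the spiked amount, one of the two signals could fall outside the dynamic range of the instrument.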

2.2.2.3 Label-Free Quantitative Proteomic Methods

MS label-free quantitative proteomics can be classified into two different strategies: (a) Area Under the Curve (AUC), also referred to as peptide ion current area, which is based on measuring the chromatographic signal intensity of precursor ions, such as peptide peak areas or peak heights; and (b) Spectral Counting, which is based on counting and comparing the number of peptide MS/MS spectra assigned to a protein after MS/MS analysis (Fig. 2.5). These principles are applied to compare a peptide/protein in two or more experiments to obtain an indicator of its respective amount in a given sample. These are strategies for rapid, highly reproducible, and accurate quantification of differential protein expression in complex biological samples. Regardless of which label-free quantitative proteomics method is used, some steps must always be carried out: protein extraction, reduction, alkylation, digestion, sample separation by liquid chromatography, and analysis by MS/MS leading to protein identification and quantification (Fig. 2.2).

The data processing workflow for a label-free quantitative proteomics experiment begins with matching spectra to peptides by database searching for protein identification using different algorithms. Although these algorithms aim to be as specific as possible, they cannot eliminate the possibility of false positives in the output data set; therefore, further analyses are required to minimize false discovery rates (Nesvizhskii et al., 2003).

FIGURE 2.5 Relative quantitation: label-free methods. [Workflow panels: spectral counting, quantified at the MS/MS level; ion peak intensity (AUC), quantified at the MS level.] Control and sample are subjected to individual LC–MS/MS analyses. The figure indicates the level (MS or MS/MS mode) at which quantification is performed for each label-free option. Identification is always performed in MS/MS mode.

Relative Quantification by Spectral Counting. The spectral counting technique compares the number of identified MS/MS spectra (spectral count) from the same protein across multiple LC–MS/MS runs. The assumption is that an increase in protein abundance results in an increase in protein sequence coverage, the number of unique peptides identified, and the number of identified total MS/MS spectra for each protein.
In addition, peptides that are more abundant are more likely to be selected for fragmentation than less abundant peptides and will produce a larger number of MS/MS spectra; in data-dependent acquisition, this number is therefore proportional to protein amount. Thus, since a peptide can be expected to ionize equally in different samples, the number of fragmentation events should be similar too and can therefore be used as a relative quantification measure between multiple samples. Normalization and statistical analysis of spectral counting datasets are required for accurate and reliable identification of protein changes in complex mixtures.

In contrast to quantification by peptide ion chromatogram intensities, spectral counting-based quantification has proved more reproducible and has a larger dynamic range, leading to more extensive MS/MS data acquisition across the chromatographic time scale for both protein identification and protein quantification. Nevertheless, the spectral counting approach does not measure any direct physical property of a peptide; the spectrum (retention time and peak width) can be different for every peptide; and it requires the observation of many spectra for a given protein.
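The text does not prescribe a particular normalization for spectral counts. As one widely used option, the normalized spectral abundance factor (NSAF) divides each spectral count by the protein length, because longer proteins yield more peptides, and then scales the values so that they sum to one within a run. The Python sketch below illustrates this calculation; the protein names, counts, and lengths are hypothetical.

```python
def nsaf(spectral_counts, lengths):
    """Normalized spectral abundance factor for each protein in one run.

    spectral_counts: {protein: number of identified MS/MS spectra}
    lengths: {protein: sequence length in residues}
    """
    saf = {p: spectral_counts[p] / lengths[p] for p in spectral_counts}
    total = sum(saf.values())
    return {p: value / total for p, value in saf.items()}


# Hypothetical run: protein B has three times the spectra but is three times longer.
counts = {"A": 10, "B": 30}
lengths = {"A": 100, "B": 300}
print(nsaf(counts, lengths))
```

After normalization the two proteins in the example receive the same value, whereas the raw counts alone would have suggested a threefold difference in abundance.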

Area under the Curve or Signal Intensity Measurement. The general process of protein quantitation based on AUC involves the measurement of ion abundances (either peak height or area) at specific retention times for the given ionized peptides without the use of a stable isotope standard. The measured ion current increases with increasing concentrations of an injected peptide. As ionized peptides elute from an RP column into the mass spectrometer, their ion intensities can be measured within the given detection limits of the experimental setup. The ion chromatograms for every peptide are extracted from an LC–MS/MS run, and either their mass spectrometric peak height or area is integrated over the chromatographic time scale. The intensity value for each peptide in one experiment can then be compared to the respective signals in one or more other experiments to yield relative quantitative information.

The relationship between the amount of peptide and the ion current holds for standard samples of limited complexity. In addition, some factors must be controlled, such as variations in sample preparation, injection volume, LC signal resolution, retention time, and co-eluting peptides, as well as temperature and pressure fluctuations in the mass spectrometer.

Intensity-based quantification considers a peptide as a feature with two coordinates, its retention time and its m/z value, and records an MS-derived intensity value for each feature (Wiener et al., 2004). Separate LC–MS analyses from distinct samples can then be aligned and normalized by these coordinates, while the intensity values provide quantitative information for the aligned features across different samples.
The strength of this method is that it readily scales with the number of samples, thereby being the only quantitative method providing appropriate statistical analysis, which makes label-free intensity-based quantification a good candidate for biomarker discovery (Helsens et al., 2011). Computational methods required to process raw LC–MS data for quantitation need to take into consideration the alignment of retention times to enable comparative assessment of peptides across experiments, noise suppression, optimal peak picking, and peak abundance normalization. Subsequent MS/MS scans for these peptides will confirm their identity (Neilson et al., 2011).
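The integration of an extracted ion chromatogram described in this subsection can be sketched as follows. The example assumes the chromatogram is already available as paired lists of retention times and intensities and uses the trapezoidal rule; real software also performs peak picking, noise suppression, and alignment, none of which is shown. The function names are hypothetical.

```python
def peak_area(times, intensities):
    """Trapezoidal area under an extracted ion chromatogram.

    times       -- retention times of the data points, in ascending order
    intensities -- ion intensity recorded at each time point
    """
    if len(times) != len(intensities):
        raise ValueError("times and intensities must have the same length")
    area = 0.0
    for i in range(1, len(times)):
        width = times[i] - times[i - 1]
        area += 0.5 * (intensities[i] + intensities[i - 1]) * width
    return area


def relative_abundance(area_sample, area_reference):
    """Ratio of the peak areas of the same peptide feature in two runs."""
    if area_reference <= 0:
        raise ValueError("reference area must be positive")
    return area_sample / area_reference
```

Comparing the areas of the same feature (same m/z and aligned retention time) across runs gives the relative quantitative information referred to in the text.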

Absolute Label-Free Quantification

An estimate of the protein abundance in a sample can be calculated using the protein abundance index (PAI), defined as the number of observed peptides divided by the number of all possible tryptic peptides from a particular protein (Rappsilber et al., 2002). In addition, the exponentially modified form (emPAI) shows a better correlation to known protein amounts. The protein content can be calculated in terms of a molar percentage by dividing the emPAI value of a protein by the sum of all emPAI values and multiplying by 100. The emPAI method does have associated disadvantages: the correlation between emPAI and protein abundance decreases with the use of low-resolution ion trap mass spectrometers, emPAI can become saturated by the presence of highly abundant proteins, and the quantitation sensitivity is comparable to that of protein staining (Ishihama et al., 2005).

One of the shortcomings of spectral counting is that peptides inherently possess different physicochemical properties, which introduce variability and bias in MS measurements. Absolute Protein Expression (APEX) is a modified spectral counting technique that takes into account the number of observed peptide mass spectra for a protein and the probability of the peptides being detected by the MS instrument (Lu et al., 2007). The key to APEX is the introduction of an appropriate correction factor (Oi) that makes the fraction of the expected number of tryptic peptides and the fraction of the observed number of peptides proportional to one another (Tang et al., 2006). The APEX technique has been implemented as the APEX Quantitative Proteomics Tool, a freely available open-source Java implementation (http://pfgrc.jcvi.org/), and it is a rapid, robust, and simple technique to use with large-scale proteomic data sets (Malmstrom et al., 2009).
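As a worked illustration of the PAI and emPAI calculations, the sketch below uses the published definition emPAI = 10^PAI − 1 (Ishihama et al., 2005) together with the molar percentage described above. The function names and the example numbers are illustrative assumptions.

```python
def pai(n_observed, n_observable):
    """Protein abundance index: observed peptides / observable peptides."""
    if n_observable <= 0:
        raise ValueError("a protein must have at least one observable peptide")
    return n_observed / n_observable


def empai(n_observed, n_observable):
    """Exponentially modified PAI: 10**PAI - 1."""
    return 10 ** pai(n_observed, n_observable) - 1


def molar_percent(empai_values):
    """Protein content (mol %) = emPAI / sum of all emPAI values * 100."""
    total = sum(empai_values.values())
    if total == 0:
        raise ValueError("no protein has a non-zero emPAI")
    return {protein: 100.0 * v / total for protein, v in empai_values.items()}
```

For example, a protein with 5 of 10 possible peptides observed has PAI = 0.5 and emPAI = 10^0.5 − 1, roughly 2.16.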
Spectral counting has also been modified to take into consideration that the length of a protein will affect the number of spectral counts: a longer protein will generate more peptides and MS/MS fragments. A normalized spectral abundance factor (NSAF) provides an improved measure for relative abundance by taking into account the length of the protein; it is calculated by dividing the spectral counts of a protein by its length and then normalizing this value by the sum of the length-corrected counts of all proteins in the experiment (Zybailov et al., 2007).
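The NSAF calculation can be sketched in a few lines. The example assumes the commonly used form in which the length-corrected counts (spectral count divided by protein length) are further divided by their sum over all proteins in the run, so that the values add up to 1; the function and variable names are hypothetical.

```python
def nsaf(spectral_counts, lengths):
    """Normalized spectral abundance factor for each protein in one run.

    spectral_counts -- {protein: number of MS/MS spectra assigned}
    lengths         -- {protein: sequence length in residues}
    """
    # Spectral abundance factor: counts corrected for protein length.
    saf = {p: spectral_counts[p] / lengths[p] for p in spectral_counts}
    total = sum(saf.values())
    if total == 0:
        raise ValueError("no spectra assigned to any protein")
    return {p: value / total for p, value in saf.items()}
```

Two proteins with equal spectral counts but lengths of 100 and 300 residues receive NSAF values of 0.75 and 0.25, which reflects the correction for the larger number of peptides a longer protein generates.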

2.2.2.4 Label-Free Methods versus Labeling Methods

Compared with isotope-labeling methods, label-free experiments need to be more carefully controlled because of possible errors caused by run-to-run variations in the performance of LC and MS. The main disadvantages are that AUC quantitation requires highly reproducible HPLC runs and that quantitation using spectral count numbers can be subject to variation. In addition, label-free shotgun proteomics produces large volumes of data that require an exhaustive statistical assessment. However, advances in instrumentation, such as highly reproducible nano-HPLC separation and high-resolution mass spectrometers, as well as sophisticated computational tools, have greatly improved the reliability and accuracy of label-free quantitative proteomics.

In protein-labeling approaches, different protein samples are combined once labeling is finished, and the pooled mixtures are then taken through the sample preparation step before being analyzed in a single LC–MS/MS or LC/LC–MS/MS experiment. In contrast, with label-free quantitative methods, each sample is separately prepared and then subjected to individual LC–MS/MS or LC/LC–MS/MS runs (Zhu et al., 2010).

Nonetheless, label-free quantification is worth considering for a number of reasons. It is faster, as there are no steps for introducing a label into proteins or peptides, and it is therefore more cost-effective. In addition, it is a cleaner and more versatile method, and its results are simpler than those obtained with labeling techniques. Furthermore, there is in principle no limit to the number of experiments that can be compared. This is certainly an advantage over stable isotope labeling techniques, which are typically limited to two to eight experiments that can be directly compared.
In contrast to most stable isotope labeling techniques, the number of detected peptides/proteins in a label-free experiment is higher because the mass spectrometer is not occupied with fragmenting all forms of the labeled peptide. There is evidence that label-free methods provide a higher dynamic range of quantification than stable isotope labeling and therefore may be advantageous when large and global protein changes between experiments are observed. However, particularly for spectral counting, this comes at the cost of unclear linearity and relatively poor accuracy (Old et al., 2005).

Regardless of whether a labeling approach or a label-free quantitative technique is employed, another drawback is the occurrence of shared peptides, defined as non-unique or degenerate peptides whose sequences can be matched in a database to more than one protein candidate. A single gene can result in hundreds of different proteins, derived from splicing variants, PTMs, protein isoforms, and homologous proteins, creating indistinguishable protein identifications when dealing with incomplete sequence coverage (Black, 2000).

Combining methods of label-free quantitation seems to be the key to achieving reliable results (Griffin et al., 2010). Some studies have demonstrated considerable variability in quantitation outcomes depending on the method used. Ratios obtained by label-free MS methods (AUC or spectral counting) were closer to the expected values than the ratios obtained by stable isotope labeling methods. Nevertheless, whether labeling-based or label-free MS methods are used, these approaches are to some extent complementary and have many potential applications in proteomics.
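The shared-peptide problem can be made concrete with a short sketch that separates unique from shared (degenerate) peptides, given the peptides matched to each protein candidate. The data structure, the function name, and the example sequences are hypothetical and serve only as an illustration.

```python
from collections import defaultdict


def classify_peptides(protein_to_peptides):
    """Split peptides into unique and shared sets.

    protein_to_peptides -- {protein: iterable of matched peptide sequences}
    Returns (unique, shared): unique maps a peptide to its single protein,
    shared maps a peptide to the sorted list of proteins it could come from.
    """
    owners = defaultdict(set)
    for protein, peptides in protein_to_peptides.items():
        for peptide in peptides:
            owners[peptide].add(protein)
    unique = {pep: next(iter(prots))
              for pep, prots in owners.items() if len(prots) == 1}
    shared = {pep: sorted(prots)
              for pep, prots in owners.items() if len(prots) > 1}
    return unique, shared
```

Only the unique peptides support one protein unambiguously; how the signal of shared peptides is distributed among candidates is a separate modeling decision that this sketch does not address.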

2.2.2.5 Selected Reaction Monitoring as a Label-Free Quantitative Approach and a Validation Tool

Selected reaction monitoring (SRM) can determine relative and absolute protein quantities with a sensitivity similar to that of some ELISAs or immunoblots. SRM entails the monitoring of a pre-selected peptide precursor ion. For a particular peptide, unique precursor and fragment ion m/z signals (transitions) are chosen. The m/z-filtered precursor ions are subjected to collision-induced dissociation (CID) followed by fragment ion m/z filtration. The fragment ion intensities are measured and the peptides are identified. By spiking in peptides of known concentration, absolute quantities of targeted proteins may be calculated. Alternatively, comparison between two samples enables relative quantitation (Neilson et al., 2011).

Quantitation in SRM is subject to variations in signal intensities and ionization efficiency over time, and to ion suppression or enhancement from co-eluting analytes. Additionally, the acquisition of reproducible quantitative data points over the chromatographic elution of any given peptide depends upon the balance between the number of data points accumulated per transition and the total number of transitions analyzed in each cycle, which is governed by the dwell time per transition (Lange et al., 2008).

SRM is a valuable tool for the validation and verification of label-free and labeled quantitative proteomic experiments. Studies using SRM as a validation technique report a high degree of consistency between the SRM quantitative measurements and label-free methods.
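The trade-off between the number of transitions and the number of data points across a chromatographic peak can be estimated with simple arithmetic. The sketch below assumes that all transitions are measured one after another in every cycle and that each costs its dwell time plus a fixed overhead; the overhead parameter and the function names are assumptions made for illustration.

```python
def cycle_time_s(n_transitions, dwell_ms, overhead_ms=0.0):
    """Time needed to measure every transition once, in seconds."""
    return n_transitions * (dwell_ms + overhead_ms) / 1000.0


def points_per_peak(peak_width_s, n_transitions, dwell_ms, overhead_ms=0.0):
    """Data points acquired per transition across one chromatographic peak."""
    return peak_width_s / cycle_time_s(n_transitions, dwell_ms, overhead_ms)


def max_transitions(peak_width_s, dwell_ms, min_points, overhead_ms=0.0):
    """Largest number of transitions that still yields min_points per peak."""
    per_transition_ms = dwell_ms + overhead_ms
    return int(peak_width_s * 1000.0 / (min_points * per_transition_ms))
```

With a 30 s wide peak and a 20 ms dwell time, 100 transitions give a 2 s cycle and 15 points per peak; requiring at least 10 points per peak caps the method at 150 transitions under these assumptions.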

2.2.3 Post-Translational Protein Modifications Identification Methods

PTMs are not directly coded by genes, but they are present in most eukaryotic proteins and determine their tertiary and quaternary structures, modulating their activity, localization, and biological function. The combination of separation technologies with MS has proved particularly successful in characterizing these modified peptides. Many proteins present PTMs at multiple sites. The most common are phosphorylation, glycosylation, sulfation, nitration, glycation, acetylation, prenylation, methylation, proteolytic cleavage, and various forms of oxidation. Therefore, the study of proteins only at the expression level provides a very limited view of the proteome, as protein activities are often modulated by PTMs that do not necessarily reflect changes in protein abundance. Generally, the strategies for identifying PTMs employ specific chemical isolation methods or affinity enrichment in order to extract classes of PTM-containing proteins. The detection and characterization of PTMs by MS is usually based on the mass difference that accompanies the changes in amino acid structure of all post-translationally modified proteins (Rotilio et al., 2012).

2.2.3.1 Phosphoproteomics Methods

Phosphorylation of serines, threonines, and tyrosines of peptides and proteins is the most abundant and ubiquitous PTM. These reversible modifications affect both the folding and function of proteins, regulating essential functions such as cell division, signal transduction, and enzymatic activity. Phosphoproteomics is geared toward the identification and quantification of phosphorylated proteins and the identification of phosphorylation sites (Paradela and Albar, 2008). The large excess of nonphosphorylated proteins and their wide dynamic range compared with the phosphorylated ones complicate the analysis of these proteins (Navajas et al., 2011).

Immobilized Metal Affinity Chromatography

Immobilized metal affinity chromatography (IMAC) is an enrichment method for phosphopeptides that exploits the strong affinity of the phosphate groups of phosphopeptides toward immobilized cations, such as Zn2+, Fe3+, Ti4+, Zr4+, and Ga3+. Anti-phosphotyrosine antibodies are widely used for immunoprecipitation of intact phosphoproteins in MS-driven proteomic studies of cell signaling (Rotilio et al., 2012).

Other strategies for the enrichment of phosphoproteomes include chemical derivatization of phosphate groups and metal oxide affinity chromatography (MOAC). Some non-affinity techniques, such as anion-exchange chromatography (Han et al., 2008a), mixed-bed chromatography (Motoyama et al., 2007), and hydrophilic interaction chromatography (HILIC) (McNulty and Annan, 2008), offer alternative strategies for phosphopeptide enrichment. HILIC partitions peptides between the hydrophilic layer and the hydrophobic elution buffer. HILIC fractionation with an IMAC-compatible buffer (salt-free trifluoroacetic acid/acetonitrile) constitutes an attractive alternative for screening phosphoproteomes (McNulty and Annan, 2008). Recently, a procedure based on IMAC/reversed-phase phosphopeptide purification and analysis by nano-HPLC-ESI-MS/MS with an ion trap has been shown to improve the results (Navajas et al., 2011).

Titanium Dioxide

This is a promising alternative to the use of IMAC for the enrichment of phosphorylated peptides. The approach is based on the selective interaction of water-soluble phosphates with porous titanium dioxide microspheres via bidentate binding at the TiO2 surface. Phosphopeptides are trapped in a TiO2 precolumn under acidic conditions and desorbed under alkaline conditions. An increased specificity for phosphopeptides has been reported, although TiO2-based columns still retain nonphosphorylated acidic peptides (Paradela and Albar, 2008).

2.2.3.2 Protein Glycosylation

The analysis of glycosylated proteins is challenging because of their complexity and variability. The most common methods for identifying protein glycosylation consist of a proteolytic digestion of glycoproteins followed by LC–MS/MS analysis. One of the methods employed is lectin affinity chromatography in combination with LC/MS analysis (Mechref et al., 2008). Another powerful approach in glycoproteomics is the degradation of free glycans or glycopeptides by specific exo- and endoglycosidases, followed by the detection of the reaction products with MS and small-scale monosaccharide analysis for the compositional analysis of individual glycans (Zdebska and Koscielak, 1999). Glycan sequencing may further be achieved by the analysis of free and derivatized glycans using MS/MS fragmentation data. Although MS/MS analysis may be accomplished on native glycans, the quality of the product ion patterns is better for permethylated glycans (Leymarie and Zaia, 2012). Recently, a new cost-effective methodology based on phenylboronate acrylamide gel electrophoresis has been described to detect, identify, and analyze these PTMs (Morais et al., 2012).

2.2.3.3 The Multiplexed Proteomics

The multiplexed proteomics (MP) approach is a methodology that allows the parallel determination of phosphorylation, glycosylation, and general protein expression patterns within a single gel electrophoresis experiment through serial staining. In particular, Pro-Q Diamond, Pro-Q Emerald, and Sypro fluorescent staining are used to detect phosphoproteins, glycoproteins, and total proteins, respectively. Although this method does not provide a high degree of certainty, it is simple and fast, and it does not require sophisticated equipment or expertise (Ge et al., 2004).

2.2.3.4 Multidimensional Protein Identification

Non-gel-based strategies such as Multidimensional Protein Identification Technology (MudPIT) are powerful tools for large-scale protein expression and characterization studies. This approach combines different separation techniques to resolve the high sample complexity obtained after protein digestion. MudPIT involves sample preparation, orthogonal chromatography, and MS analyses. This technology is widely applied in the large-scale analysis of phosphoproteins through the specific enrichment of phosphopeptides and the analysis of protein expression and phosphorylation levels. In addition, emerging applications, such as fractionation procedures and mixed-bed anion-cation-exchange multidimensional protein identification technology (ACE-MudPIT), already address both phosphopeptides and unmodified peptides during the same analysis (Yates et al., 2009). Other applications are focused on plant sciences for the study of common bean and rice leaf, root, and seed reference maps that included the most comprehensive proteome exploration available (Lee et al., 2009).

2.3 THE MOVE FROM SHOTGUN TO TARGETED PROTEOMICS APPROACHES

In a typical shotgun proteomic strategy, a mixture of proteins is digested into peptides and fractionated prior to MS/MS analysis, yielding a large amount of information about the major proteins in the sample. Nevertheless, for many types of experiments, the objective of the study is only a relatively small number of proteins under different conditions. For such experiments, targeted approaches such as SRM or MRM offer higher sensitivity and greater speed of analysis. Both shotgun and targeted proteomics diminish the problems associated with the high dynamic range of proteins and with sequencing speed, and thereby increase the probability that a peptide is sequenced and identified.

2.3.1 Shotgun Proteomics

Shotgun proteomics is a remarkably powerful technology for identifying complex samples of proteins on a global level. This technology refers to the use of bottom-up proteomics techniques in which protein samples are digested prior to separation and MS analysis (Rotilio et al., 2012). The complexity of protein mixtures derived from biological samples is so great that it may be necessary to perform a separation in two dimensions. The most common of these technologies for shotgun proteomics is MudPIT, which usually couples strong cation exchange (SCX) LC with RP-microLC, so that peptides are separated first on the basis of their charge and then on the basis of their hydrophobicity (Zhang et al., 2010). By increasing the number of SCX fractions, a further increase in proteome coverage is achieved. In addition, OFFGEL electrophoresis technology can be used for first-dimension separation of peptide mixtures according to their pI.

Another method for shotgun proteomics combines gel electrophoresis with on-line electrospray tandem mass spectrometry (GeLC–MS/MS). Here, intact proteins are separated by SDS-PAGE, the gel is then cut into multiple slices, the proteins are digested in-gel, and the resulting peptides are analyzed by LC–MS/MS. Compared to MudPIT, GeLC–MS/MS requires less protein material. Furthermore, abundant proteins will concentrate in distinct gel slices, thereby increasing the chance of identifying less abundant proteins (de Godoy et al., 2006).

The strength of shotgun proteomics is that by random sampling of a peptide mixture, an overview of the proteome composition is readily generated in which many proteins are identified by multiple peptides, which increases the reliability of such identifications. One of the main drawbacks of shotgun proteomics is that large amounts of heterogeneous proteins are cleaved into multiple peptides, dramatically increasing the complexity of the samples.

2.3.2 Targeted Proteomics

Targeted proteomics allows specific proteins to be studied among other, less interesting proteins while avoiding redundant or uninformative peptides. The main strategy is to extract a selected set of peptides from a whole proteome digest and analyze only these by LC–MS/MS. This selection must be representative of the analyzed proteome. Moreover, since a selection of peptides always yields a less dense peptide mixture, random sampling tends to be reduced (Malmstrom et al., 2012). For example, human serum contains thousands of peptides, most of which are likely proteolytic fragments of larger proteins, whose identity remains undetermined (Rotilio et al., 2012). Advanced proteomics technologies such as MudPIT combined with MALDI-TOF-MS and SELDI-TOF-MS can profile proteins in the low-mass range, contributing to the discovery of potential biomarkers of several diseases (Ciordia et al., 2006). Nevertheless, the critical point is to evaluate a list of candidate biomarker proteins with enough specificity and sensitivity for the targeted disease. In these cases, the advantages of targeted approaches are potentially higher sensitivity and higher throughput.

The accurate inclusion mass screening (AIMS) technology is designed to provide a bridge from unbiased discovery to MS-based targeted assay development. Masses on the software inclusion list are monitored in each scan on the mass spectrometer, and MS/MS spectra for sequence confirmation are acquired only when a peptide from the list is detected with both the correct accurate mass and charge state. The AIMS experiment confirms that a given peptide (and thus the protein from which it is derived) is present in the sample with a sensitivity similar to that of MRM experiments and allows hundreds of proteins to be analyzed in a short time (Jaffe et al., 2008).
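The decision rule behind an inclusion list, triggering MS/MS only when both the accurate mass and the charge state match, can be sketched as follows. The 10 ppm tolerance, the dictionary layout of the list entries, and the function names are assumptions chosen for illustration and are not taken from the AIMS publication.

```python
def ppm_error(observed_mz, target_mz):
    """Signed mass error of an observed m/z relative to a target, in ppm."""
    return (observed_mz - target_mz) / target_mz * 1e6


def match_inclusion_list(observed_mz, observed_charge, inclusion_list,
                         tolerance_ppm=10.0):
    """Return the first list entry matching both m/z and charge, else None.

    inclusion_list -- iterable of dicts with the keys "peptide", "mz", "charge"
    """
    for entry in inclusion_list:
        same_charge = entry["charge"] == observed_charge
        within_tol = abs(ppm_error(observed_mz, entry["mz"])) <= tolerance_ppm
        if same_charge and within_tol:
            return entry  # an MS/MS scan would be triggered for this precursor
    return None
```

A precursor observed at m/z 400.6880 with charge 2 matches a list entry at 400.6872 (about 2 ppm away), whereas the same m/z with charge 3, or an m/z 30 ppm away, does not.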

2.3.2.1 Sample Complexity Reduction

On average, tryptic digestion of a protein generates a large number of peptides. One methodology to reduce sample complexity is COmbined FRActional DIagonal Chromatography (COFRADIC), which is capable of selecting, among others, cysteinyl and methionyl peptides. First, two identical and consecutive RP-HPLC separations of the peptide mixture are performed. Between the two separations, a modification reaction alters the physicochemical properties of a targeted group of peptides. As a result, modified peptides acquire a different elution profile during the second RP-HPLC separation, by which they are distinguished from non-modified peptides. Clearly, by changing the actual modification reaction, different sets of peptides can be targeted and thus isolated (Helsens et al., 2011). For the isolation of N-terminal peptides, the modification reaction uses 2,4,6-trinitrobenzenesulfonic acid, which renders the non-N-terminal peptides more hydrophobic so that N-terminal peptides are readily isolated.

Once proteins of interest have been discovered by the above technologies, one may wish to measure their behavior under a large set of different conditions. It would then be attractive to focus the MS measurement only on specific peptides related to these proteins (Cox and Mann, 2011).
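To see how quickly digestion inflates sample complexity, an in silico tryptic digest can be sketched. The example assumes the common rule that trypsin cleaves C-terminal to lysine (K) and arginine (R) except when the next residue is proline (P), and it optionally allows missed cleavages; the function name and the example sequence are hypothetical.

```python
def tryptic_digest(sequence, missed_cleavages=0, min_length=1):
    """Return the tryptic peptides of a protein sequence.

    Cleavage is assumed after K or R unless the next residue is P.
    """
    # Collect cleavage positions, including both ends of the sequence.
    sites = [0]
    for i, residue in enumerate(sequence[:-1]):
        if residue in "KR" and sequence[i + 1] != "P":
            sites.append(i + 1)
    sites.append(len(sequence))

    peptides = []
    for start in range(len(sites) - 1):
        for missed in range(missed_cleavages + 1):
            end = start + 1 + missed
            if end >= len(sites):
                break
            peptide = sequence[sites[start]:sites[end]]
            if len(peptide) >= min_length:
                peptides.append(peptide)
    return peptides
```

Even the short sequence AAKRPGGRCC yields three fully cleaved peptides and five when one missed cleavage is allowed, which illustrates why a selection step such as COFRADIC is attractive for whole-proteome digests.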

2.3.2.2 Selected Reaction Monitoring

The objective of a proteomic SRM experiment is the precise quantification of targeted proteins in complex mixtures. SRM targets predetermined precursor ions for fragmentation, which allows peptides from a particular protein of interest to be monitored, giving access to lower-abundance species even in complex mixtures such as plasma and serum with high consistency, accuracy, and sensitivity.

Triple quadrupoles are best suited for SRM: the first quadrupole accurately filters a targeted precursor, the second quadrupole fragments this precursor ion, and the third quadrupole accurately filters for a specified fragment ion. Thus, a peptide ion is transferred from the first to the last quadrupole and a fragment ion is recorded, and such transitions are monitored through time (Fig. 2.6). A few transitions per peptide (2–5) are often exceptionally specific, and monitoring them surpasses other methods in terms of sensitivity (Lange et al., 2008).

The advances in the development and validation of the assays, as well as novel software and data repositories, are increasing the potential of the SRM approach in whole-proteome analysis. In clinical research and diagnostic applications, the SRM technique has a huge potential as a biomarker verification tool, even compared with the standard ELISA methods that are commonly used for this purpose. In addition, the amount of sample required for SRM analysis is small.

A typical SRM assay consists of two parts: the first involves selecting enzymes that can produce peptides with certain target characteristics, and the second involves experimental testing to verify the predictions from the first phase. The manual process of identifying the optimal enzyme to give the best peptide characteristics and SRM transitions for MS is very time-consuming, especially if multiple protein targets are involved (Afzal et al., 2011).
In response to this, a number of software tools have been developed to assist with this process (Cham Mead et al., 2010). Selecting the most appropriate peptides and fragment ions, unique to the protein of interest and showing high mass spectrometry signal response, is essential to a successful SRM experiment. Therefore, it is necessary to identify the m/z of some abundant peptides that appear consistently in the same sample (referred to as proteotypic peptides) as well as the m/z of their fragment ions that are generated with high intensity. These “transitions” (specific precursor–fragment ion pairs) allow targeted analysis of a particular peptide in a complex mixture. This selected monitoring and double selection criteria (precursor/product ions) provide high specificity for peptide selection, since only desired transitions are recorded and other signals are regarded as noise (Zhi et al., 2011). There are many guidelines that can aid in the selection of appropriate transitions, based on prior experimentation, physicochemical parameters, and in silico predictions (Walsh et al., 2009). Although this design stage can take considerable time, once transitions are established, they can be used indefinitely for experiments studying the protein of interest. In addition, specific MS parameters must be optimized for each peptide–fragment pair.

The SRM approach can be used for protein quantification as described above. Relative quantification can be conducted simply by comparing the absolute peak areas of the individual samples (label-free quantification), although it is difficult to obtain precise measurements because of differences in ionization efficiency, analyte composition, and chromatography. SRM experiments can also be combined with many of the standard isotope labels used in quantitative proteomic experiments, including ICAT, SILAC, ICPL, and iTRAQ. Additionally, several methods that greatly speed up the assay development aspect of SRM have emerged, including databases such as MRMAtlas (Picotti et al., 2008).

FIGURE 2.6 Schematic diagram of an SRM/MRM assay conducted on a triple quadrupole (Q) instrument. The first mass analyzer (Q1) is used as a mass filter that isolates only the peptide of interest, which is fragmented in the second mass analyzer (Q2). In the third mass analyzer (Q3), only the fragments of interest are selected, and these are then monitored over the chromatographic elution time by the detector. For SRM only a few peptides are selected, while for MRM a larger number of peptides is selected.

In the MRMAtlas each protein assay is presented as a set of optimal MRM coordinates for the peptide(s) that represent a protein. Peptide identifications have been validated by acquiring the corresponding tandem mass spectra on the triple quadrupole mass spectrometer, which can be viewed as single or consensus spectra. The final assay coordinates can be downloaded directly in Excel table format, pasted into an MRM/SRM method of a triple quadrupole instrument, and used to specifically detect and quantify the protein of interest in a complex protein digest.

This SRM approach has been shown to be extremely powerful, both in the confirmation of potential biomarkers and in the discovery of novel biomarkers (Menon and Omenn, 2010). SRM offers advantages for the detection of lower-abundance proteins over more traditional shotgun experiments in that the instrument spends the majority of the time measuring the proteins of interest, resulting in a higher fraction of useful data (Malmstrom et al., 2012). In addition, SRM has several advantages over antibody assays for biomarker validation, such as high sensitivity with no cross-reactivity. Moreover, SRM can be used for any MS-observable ion, which makes it generally cheaper than antibody assays, and SRM assays are quantitative and easily multiplexed (Latterich et al., 2008).
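The notion of a transition as a precursor–fragment m/z pair can be illustrated by computing such pairs from a peptide sequence. The sketch below assumes unmodified residues, standard monoisotopic residue masses, a doubly charged precursor, and singly charged y-ions; the choice of which y-ions to monitor is arbitrary here, whereas in practice it would follow the selection guidelines discussed in this section. The function names are hypothetical.

```python
# Monoisotopic residue masses (Da) of the 20 standard, unmodified amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # H2O, added once per peptide or y-ion
PROTON = 1.007276   # mass of the charge carrier


def precursor_mz(sequence, charge):
    """m/z of the protonated peptide at the given charge state (Q1 value)."""
    neutral = sum(RESIDUE_MASS[aa] for aa in sequence) + WATER
    return (neutral + charge * PROTON) / charge


def y_ion_mz(sequence, n, charge=1):
    """m/z of the y-ion containing the last n residues (Q3 value)."""
    if not 1 <= n < len(sequence):
        raise ValueError("n must be between 1 and len(sequence) - 1")
    neutral = sum(RESIDUE_MASS[aa] for aa in sequence[-n:]) + WATER
    return (neutral + charge * PROTON) / charge


def build_transitions(sequence, precursor_charge=2, y_numbers=(3, 4, 5)):
    """List of (Q1 m/z, Q3 m/z, label) tuples for the chosen y-ions."""
    q1 = round(precursor_mz(sequence, precursor_charge), 4)
    return [(q1, round(y_ion_mz(sequence, n), 4), f"y{n}") for n in y_numbers]
```

For the test peptide PEPTIDE, `build_transitions("PEPTIDE")` returns three transitions sharing the doubly charged precursor near m/z 400.687, one for each of the y3, y4, and y5 fragments.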

2.3.2.3 Multiple Reaction Monitoring

In recent years, MRM-MS has emerged as an attractive targeted MS technique for biomarker verification and validation. The high selectivity of MRM is achieved by isolating specific peptide parent ions and MRM transitions in a triple quadrupole mass spectrometer, and many specific peptides of interest are simultaneously targeted and measured in a single run (Fig. 2.6). The MRM transitions can also be determined from synthetic peptides. Key challenges for MRM-based targeted proteomics at present are robust statistics for false-positive signals, increasing throughput, and the synthesis and handling of the large numbers of isotope-labeled peptides that are necessary for accurate quantification in this approach (Cox and Mann, 2011).

MRM coupled with stable isotope dilution (SID) using chemically identical synthesized peptides has been shown to be well suited to achieving absolute and reproducible quantitation of proteins (Gerber et al., 2003). SID-MRM, coupled with peptide fractionation or specific enrichment, has been shown to target several candidate proteins in plasma or serum at low concentrations by detecting signature peptides with a broad dynamic range (Keshishian et al., 2009). As expected, SID-MRM quantitation is highly reproducible. Since stable labeled peptides have the same retention times as the targeted peptides, unambiguous confirmation of the targeted peptides can be achieved even in the presence of closely co-eluting peptides. Interferences can also be easily detected by monitoring multiple MRM transitions of the stable labeled peptides and comparing the transition intensity ratios with the ratios from the targeted peptides.
However, the cost and lead time for synthesizing, purifying, and evaluating stable isotope standard peptides for absolute quantitation, as well as setting up spike-in experiments and standard curves, can be substantial, especially at the verification and early-stage validation steps, where screening of a large number of putative candidate biomarkers may be of interest (Tang et al., 2011). Furthermore, some studies have reported that coupling SISCAPA methodology to MRM-MS produces immuno-MRM assays that can be multiplexed to quantify proteins in plasma with high sensitivity, specificity, and precision (Kuhn et al., 2012a).
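The arithmetic behind SID-MRM quantitation and the interference check based on transition ratios can be sketched as follows. The example assumes that peak areas have already been integrated for each transition of the endogenous (light) and the spiked stable-isotope-labeled (heavy) peptide, that the median light/heavy ratio serves as the reference, and that a 20% relative deviation marks a suspicious transition; the threshold and all names are illustrative assumptions.

```python
from statistics import median


def absolute_amount(light_area, heavy_area, heavy_spiked_fmol):
    """Amount of endogenous peptide from the light/heavy peak area ratio."""
    if heavy_area <= 0:
        raise ValueError("heavy standard was not detected")
    return light_area / heavy_area * heavy_spiked_fmol


def flag_interference(light_areas, heavy_areas, max_rel_deviation=0.2):
    """Transitions whose light/heavy ratio deviates from the median ratio.

    light_areas, heavy_areas -- {transition label: integrated peak area}
    An interference-free peptide should show the same light/heavy ratio on
    every transition, so outlying ratios point to a co-eluting interference.
    """
    ratios = {t: light_areas[t] / heavy_areas[t] for t in heavy_areas}
    reference = median(ratios.values())
    return sorted(t for t, r in ratios.items()
                  if abs(r - reference) / reference > max_rel_deviation)
```

If 50 fmol of heavy peptide is spiked in and the light peak area is twice the heavy one, the endogenous amount is estimated at 100 fmol; a transition whose ratio is far from that of the others would be excluded before this calculation.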

2.3.3 Tandem Mass Spectrometry versus Selected/Multiple Reaction Monitoring

Selected/multiple reaction monitoring assays, conducted on triple quadrupole instruments, can be coupled to liquid chromatography for the analysis of complex proteome digests. In SRM/MRM assays the first (Q1) and last (Q3) mass analyzers of a triple quadrupole mass spectrometer are used as mass filters to isolate a peptide ion and a corresponding fragment ion. The signal of the fragment ion is then monitored over the chromatographic elution time. The selectivity resulting from the two filtering stages, combined with the high duty cycle, results in quantitative analyses. The specific pairs of m/z values associated with the selected precursor and fragment ions are referred to as “transitions” and effectively constitute mass spectrometric assays that allow the identification and quantification of a specific peptide and, by inference, the corresponding protein in a complex protein digest.

MS/MS works well for discovery, whereas SRM/MRM works well for monitoring known targets, with higher sensitivity and speed than MS/MS. Furthermore, dependent MS/MS scans can be triggered by SRM/MRM scans to provide sequence information for the selected peptides, further increasing the specificity of the technique. Multiple transitions (50–100) corresponding to multiple proteins of interest can be monitored and sequenced in a single MRM analysis, providing great potential for quantitative analyses of a relatively large number of proteins in a single assay (Zhi et al., 2011).

2.3.4 Tandem Mass Spectrometry with Alternative Acquisition Methods

There are some acquisition methods, such as SWATH™, that allow the monitoring of all fragment ion data within a wide m/z range in a few seconds. The main advantage of applying this acquisition mode on a 5600 TripleTOF system is that a great number of peptides can be identified and quantified in a single run in a targeted mode. Afterward, several MS/MS spectra can be selected and analyzed in a data-independent acquisition method. The information obtained on the proteins and peptides contained in the sample is complete and highly specific, both qualitative and quantitative in one analysis, and it can be retrospectively analyzed in silico as new hypotheses are developed (Gillet et al., 2012).

2.3.5 Applications of Targeted Approaches in Food Science

Some authors have demonstrated the potential of SRM/MRM-based proteomics for the measurement of any set of proteins of interest in foods with high throughput and quantitative accuracy. In these experiments, protein targets were selected based on the list of absolute protein abundances. Afterward, these proteins were grouped and, for each protein, some proteotypic peptides were selected.

Proteome analysis has been applied extensively to cereal grain proteins, especially in rice, wheat, and barley (Agrawal and Rakwal, 2006; Finnie and Svensson, 2009; Skylas et al., 2005) and also maize (Mechin et al., 2007). The workflow for a targeted proteomics study is thus similar to that for a classical proteomics experiment, requiring techniques for tissue isolation, protein extraction, separation, and identification; however, it is specifically tailored to the subproteome of interest. In order to achieve this, prior knowledge of the proteins of interest is often required, such as solubility characteristics, cellular location, or binding specificity. Thus, classical proteome studies provide important data on which to base the targeted analyses (Finnie et al., 2011).

2.4 NEW INSTRUMENTAL METHODS FOR PROTEOMICS

MS-based proteomic studies have led to improved MS instrumentation with increased sensitivity, expanded functions, robust integration of HPLC, and highly automated computer control to facilitate the analysis of the very complex samples generated in many of these workflows. In particular, many new types of hybrid instruments that combine more than one mass analyzer have been commercialized to provide high-resolution MS/MS spectra and MRM measurements. Mass spectrometers usually consist of three main parts: the ion source and optics, the mass analyzer, and the detector. The main ionization techniques have been explained above. Mass analyzers are an integral part of each instrument because they can store ions and separate them on the basis of their m/z ratios. Ion trap (IT), Orbitrap, and Fourier transform ion cyclotron resonance (FT-ICR) mass analyzers separate ions based on their m/z resonance frequency, quadrupoles (Q) use m/z stability, and time-of-flight (TOF) analyzers use flight time. Each mass analyzer has unique properties, such as mass range, analysis speed, resolution, sensitivity, ion transmission, and dynamic range (Yates et al., 2009). This section describes the instruments most widely used at present; other mass spectrometers are available on the market and can be used for many applications.

2.4.1 Fragmentation Methods

Dissociation or fragmentation of protein and peptide ions is essential for MS/MS analysis. Advances in MS instrumentation have enabled the integration of multiple fragmentation methods with high-precision mass measurements of the resultant fragment ions.

2.4.1.1 Electron Transfer Dissociation and Electron Capture Dissociation versus Collision-Induced Dissociation

Collision-induced dissociation (CID) was the first fragmentation method introduced in proteomics and is the most commonly used method for peptide fragmentation. In CID, precursor ions collide with inert gas atoms such as He or Ar in a collision cell, producing mainly b- and y-fragment ions. CID also produces immonium ions specific for individual amino acids and readily dissociates labile peptide bonds and unstable modified residues (Huang et al., 2008). Figure 2.7 illustrates the theoretical ion fragments of a peptide after CID or electron transfer dissociation (ETD) fragmentation. ETD (Syka et al., 2004) and electron capture dissociation (ECD) (Zubarev et al., 2000) rely on an electron-based dissociation process and predominantly produce c- and z-ions along the peptide backbone in a sequence-independent manner, unlike CID, which favors labile peptide bonds. ECD can only be performed in FT-ICR cells and is not widely implemented because of the cost of this analyzer. ETD, on the other hand, is used in ion traps and is therefore more widely applied. Since ETD and ECD transfer negatively charged electrons to positively charged peptides, these peptides need to be highly charged (e.g., 3+, 4+) to give sufficient fragment ion signal. One of the latest instruments using the ETD fragmentation method is the amaZon speed ETD by Bruker, which provides a sensitive, robust, and reliable setup for top-down applications and PTM analysis.

2.4.1.2 High-Energy Collisional Dissociation

High-energy collisional dissociation (HCD) is a fragmentation technique performed by injecting peptide ions from the ion trap into a collision cell at the far side of the C-trap. Fragment ions are transferred back to the C-trap and analyzed at high resolution and mass accuracy in the Orbitrap analyzer. HCD fragmentation is similar to the fragmentation in triple quadrupole or quadrupole-TOF instruments. Compared with traditional ion trap-based CID, it overcomes the low-mass cutoff of ion trap fragmentation and generates a larger number of fragment ions, resulting in higher-quality MS/MS spectra (Nagaraj et al., 2010). HCD also employs higher-energy dissociations than those used in CID, enabling a wider range of fragmentation pathways. One drawback is that spectral acquisition times are up to twofold longer than with CID because more ions are required for Fourier transform detection in the Orbitrap (Jedrychowski et al., 2011).

2.4.1.3 Collision-Activated Dissociation

Trypsin digestion of proteins produces a complex mixture of peptides containing one basic amino acid residue per peptide. These peptides ionize via ESI to low charge states (generally +3 or less, with the +1 and +2 charge states most favored), which are optimal for collision-activated dissociation (CAD) MS (Mikesh et al., 2006). Ion fragmentation with quadrupole ion trap (QIT), quadrupole time-of-flight (Q-TOF), and quadrupole linear ion trap (QLT) instruments can be performed with CAD. In this process, protonated peptides are kinetically excited and undergo collisions with an inert gas such as helium or argon. During each collision, imparted translational energy is converted into vibrational energy that is then rapidly distributed throughout all covalent bonds. Fragment ions are formed when the internal energy of the ion exceeds the activation barrier required for a particular bond cleavage. Fragmentation of protonated amide bonds affords a homologous series of complementary product ions of types b and y. Mass differences observed between homologous members of an ion series allow assignment of a particular amino acid to the extra residue in the larger fragment and thus facilitate peptide sequence analysis (Syka et al., 2004).

Analysis of some PTMs, such as phosphorylation, sulfonation, and glycosylation, is difficult with CAD because the modification is labile and is lost in preference to peptide backbone fragmentation, resulting in little to no peptide sequence information. The presence of multiple basic residues also makes peptides exceptionally difficult to sequence by conventional CAD MS (Mikesh et al., 2006). In these cases, ETD represents an advantageous tool in proteomic research by readily identifying peptides resistant to analysis by CAD.

FIGURE 2.7 Peptide fragmentation of the peptide AEGAAMGPAVCR by collision-induced dissociation (series b and y) or electron-transfer dissociation (series c and z).
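The ion series named in Figure 2.7 can be computed directly from residue masses. The sketch below is a minimal illustration for singly charged fragments of the same peptide. It assumes standard monoisotopic residue masses, an unmodified cysteine, the usual conventions that a c ion is the b ion plus ammonia and a z-dot ion is the y ion minus ammonia plus one hydrogen, and a hypothetical function name.

```python
# Monoisotopic residue masses in Da (standard values).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
PROTON = 1.007276
WATER = 18.010565
AMMONIA = 17.026549
HYDROGEN = 1.007825


def fragment_ions(peptide):
    """Singly charged b, y, c and z-dot m/z values for every backbone cleavage.

    Index 0 of each list is the one-residue fragment (b1, y1, c1, z1).
    """
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    b, y = [], []
    for i in range(1, len(masses)):
        b.append(sum(masses[:i]) + PROTON)           # N-terminal fragment
        y.append(sum(masses[-i:]) + WATER + PROTON)  # C-terminal fragment
    c = [mz + AMMONIA for mz in b]                   # assumed: c = b + NH3
    z = [mz - AMMONIA + HYDROGEN for mz in y]        # assumed: z-dot = y - NH3 + H
    return {"b": b, "y": y, "c": c, "z": z}


PEPTIDE = "AEGAAMGPAVCR"
ions = fragment_ions(PEPTIDE)
for series in ("b", "y", "c", "z"):
    print(series, [round(mz, 4) for mz in ions[series][:3]], "...")
```

Each b ion and its complementary y ion together account for the whole peptide plus two protons, which is a convenient consistency check when annotating a spectrum.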

2.4.2 High Mass Accuracy and Fast Scanning Instrumentation

The latest generation of instruments for MS-based proteomics offers high sensitivity, high throughput, and a high degree of automation. The most relevant features of a mass analyzer are resolution, mass accuracy, sensitivity, speed of data acquisition, and the possibility of performing single- or multistage MS/MS. Proteome coverage depends on these features, on the dynamic range of the instrument (the signal intensity range in which two distinct analytes can be detected), and on the duty cycle of the mass spectrometer (the number of fragmentation spectra acquired within a time window). The ionized analytes formed in the ion source are transported to the mass analyzer, where their trajectories are controlled and analyzed, enabling accurate m/z measurements. Ion trajectories can be controlled by two general methods: by applying a dynamic electrical field (as in the QIT, linear ion trap (LIT), Orbitrap, quadrupole, and time-of-flight analyzers) or by applying a magnetic field (as in Fourier transform ion cyclotron resonance) (Helsens et al., 2011).

The quadrupole (Q) mass analyzer acts as an m/z filter by applying a radio frequency (RF) voltage between two pairs of rods. The QIT is similar to the quadrupole but generates a 3D RF field in which ions are first trapped and then sequentially ejected (March, 2009). The LIT, in turn, is similar to the QIT, but ions are trapped and injected in a 2D RF field, which results in higher ion injection efficiency and ion storage capacity, a wider dynamic range, and better ion-trapping capacity and mass accuracy (Domon and Aebersold, 2006), thus increasing the overall sensitivity. The TOF analyzer uses an electrical field to accelerate ions in a vacuum tube. The kinetic energy acquired by an ion is determined by its charge and the applied voltage, so its velocity depends on its m/z, and measurement of the flight time allows calculation of the m/z value.
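The relation between flight time and m/z can be written down for an idealized linear TOF analyzer. The sketch below assumes that all acceleration happens before a field-free drift tube and that there is no reflectron or initial energy spread; the 20 kV voltage, 1 m tube length, and function names are illustrative assumptions.

```python
import math

E_CHARGE = 1.602176634e-19   # elementary charge, C
DALTON = 1.66053906660e-27   # unified atomic mass unit, kg


def flight_time(mz, voltage, length):
    """Drift time (s) of an ion of given m/z (Da per charge).

    Idealized model: z*e*V = (1/2)*m*v**2, then t = L / v.
    """
    return length * math.sqrt(mz * DALTON / (2.0 * E_CHARGE * voltage))


def mz_from_flight_time(t, voltage, length):
    """Invert the model above: m/z = 2*e*V*t**2 / (L**2 * u)."""
    return 2.0 * E_CHARGE * voltage * t ** 2 / (length ** 2 * DALTON)


VOLTAGE = 20_000.0   # V, illustrative
LENGTH = 1.0         # m, illustrative

for mz in (500.0, 1000.0, 2000.0):
    t = flight_time(mz, VOLTAGE, LENGTH)
    print(f"m/z {mz:7.1f}: {t * 1e6:6.2f} us, "
          f"recovered m/z {mz_from_flight_time(t, VOLTAGE, LENGTH):.3f}")
```

Under these assumptions an ion of m/z 1000 takes about 16 µs to cross the tube, and flight time grows with the square root of m/z, so heavier ions arrive later.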

2.4.2.1 LTQ-Ion Trap

Ion trap instruments are the high-throughput workhorses in proteomics (Douglas et al., 2005). These versatile instruments feature fast scan rates, MSⁿ scans, a high duty cycle, high sensitivity, and reasonable resolution (2000, full width at half height (FWHH)) and mass accuracy (100 ppm). The LTQ ion trap from Thermo Scientific combines a tenfold higher ion storage capacity than 3D traps with high resolution at a fast scanning rate (5555 Da s⁻¹). In addition, the LTQ radial ion ejection offers higher sensitivity than other two-dimensional ion-trap instruments (Schwartz et al., 2002).

2.4.2.2 Orbitrap

The Orbitrap is a high-performance mass analyzer that operates by trapping ions in an electrostatic field, which causes them to orbit a central electrode in rings determined by their m/z ratio (Hu et al., 2005). The image current of these rotating ions is Fourier transformed into a frequency spectrum, which in turn is converted into a mass spectrum. The Orbitrap is highly versatile and features high resolution (up to 150,000), high mass accuracy (2–5 ppm), an m/z range of 6000, and a dynamic range greater than 10³ (Yates et al., 2009). These instruments have enabled greater proteomic coverage owing to their much higher resolution compared with earlier instruments and are also ideally suited to quantification (Domon and Aebersold, 2006).
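The figures of merit quoted for the ion trap and the Orbitrap can be compared on a common footing. The sketch below converts a resolution value into a peak width and a ppm accuracy into an absolute mass error at a chosen m/z. It assumes the common definition of resolution as m/z divided by the peak width at half height, applies the quoted values at m/z 1000 purely for illustration, and uses hypothetical function names.

```python
def ppm_error(measured_mz, theoretical_mz):
    """Relative mass error in parts per million."""
    return (measured_mz - theoretical_mz) / theoretical_mz * 1e6


def peak_width(mz, resolution):
    """Peak width at half height implied by resolution = m/z / width."""
    return mz / resolution


def mass_tolerance(mz, ppm):
    """Absolute m/z tolerance corresponding to a ppm accuracy."""
    return mz * ppm / 1e6


MZ = 1000.0  # illustrative m/z value
for name, resolution, ppm in (("ion trap", 2000, 100), ("Orbitrap", 150000, 5)):
    print(f"{name:8s}: peak width {peak_width(MZ, resolution):.4f}, "
          f"tolerance +/-{mass_tolerance(MZ, ppm):.4f} at m/z {MZ:.0f}")
```

At m/z 1000 the quoted ion trap values correspond to a peak 0.5 wide and an error window of 0.1, whereas the quoted Orbitrap values correspond to a peak of roughly 0.007 and a window of 0.005, which is why far fewer candidate peptides match a given Orbitrap measurement.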

2.4.3 New Hybrid Instruments

An analytically useful mass spectrometer can be constructed efficiently by combining different types of mass analyzers and ion-guiding devices. Hybrid mass spectrometers use a combination of two different types of analyzers for the first and second stages in MS/MS analysis. They combine the strengths of each analyzer type, supporting different analytical strategies while minimizing the compromises that might result from interfacing two or more MS technologies. In proteomics, the term was originally given to the combination of a quadrupole analyzer with a TOF detector, namely, the Q-TOF-type instruments (Q-TOF and QSTAR). Other commonly used hybrid instruments are the time-of-flight–time-of-flight (TOF-TOF), ion trap–time-of-flight (Trap-TOF), triple quadrupole, quadrupole–ion trap (Q-Trap), ion trap–orthogonal time-of-flight (Trap-TOF), quadrupole–orthogonal time-of-flight (Q-TOF), quadrupole-FTMS, and ion trap–FTMS instruments. Hybrid Q-TOF instruments have two quadrupoles (each consisting of four parallel metal rods with an electrical field) in front of a TOF tube. Quadrupole Q1 serves as a mass filter and Q2 acts as the collision cell. Q-Q-TOF instruments have good sensitivity (femtomole [10⁻¹⁵ mol] limits of detection), good mass accuracy (low parts per million [ppm]), and good to high resolution (often exceeding 12,000, where resolution is defined as the mass of the peak divided by its width at half height), and they are suited to both peptide identification and quantitative analyses (Domon and Aebersold, 2006). Most MS/MS is achieved by ESI, but MALDI TOF/TOF instrumentation is also available, in which LC is performed off-line for global peptide-centric workflows or not at all for 2-DE and MS.
These instruments have performance attributes similar to those of the Q-TOF instruments but offer even greater sensitivity (subfemtomole limits of detection) and are more tolerant of contaminants such as salts and small amounts of detergent. In addition, there are times when the production of singly charged peptides is an advantage in terms of interpreting mass spectra. However, the nature of the configuration means that they are less suited to PTM analysis, where data-dependent scanning is advantageous.

LIT devices have been implemented on triple quadrupole-type instruments where Q2 is replaced by the LIT (which also acts as the collision cell) and Q1 and Q3 serve as mass filters (called Q-Q-LIT instruments) (Domon and Aebersold, 2006). In summary, MS/MS is usually performed in the product ion mode, whereby MS-identified peaks (peptides) are selected for fragmentation to determine the amino acid sequence; this can be done on all tandem MS instruments. However, multiple-stage sequential MS/MS, in which fragment ions can be iteratively isolated and further fragmented, can only be done on triple quadrupole instruments, such as the quadrupole ion trap (Q-Q-LIT) or triple quadrupole (Q-Q-Q; Q1 and Q3 serve as mass filters and Q2 acts as the collision cell) instruments. Such experiments are extremely useful for PTM analysis, where experiments to detect a subset of peptides that contain a specific functional group (e.g., a phosphate group for phosphorylated proteins) are required (Brewis and Brennan, 2010). Hybrid instruments have been widely used for determining food contaminants and are widely cited in the literature (Rubert et al., 2011; Rubert et al., 2012; Shan et al., 2012).

2.4.3.1 AB SCIEX TripleTOF™ 5600 System

The TripleTOF 5600 System is a hybrid quadrupole time-of-flight mass spectrometer that combines high mass accuracy (<3 ppm) and resolution (30k) with high MS/MS spectral acquisition rates (20 Hz). It operates by means of information-dependent acquisition (IDA), with the speed and sensitivity of a TOF mass spectrometer and quantification capabilities similar to those of a triple quadrupole mass spectrometer. These features make this instrument a powerful LC–MS/MS platform for complex proteomic mixtures (Andrews et al., 2011). It also works as a DDA platform of high resolution, accuracy, and speed. Recent studies have demonstrated that the high-speed MS/MS scan acquisition capabilities of the TripleTOF 5600 system make it possible to obtain good depth of protein coverage in the samples analyzed (Dunham et al., 2011; Tambor et al., 2012).

2.4.3.2 AB SCIEX QTRAP® 5500 LC/MS/MS System

With the addition of a newly developed ion guide and an improved Q3 LIT, the QTRAP 5500 shows large increases in signal response, particularly in LIT mode, and also has enhanced scan rates compared with previous systems. This instrument offers MS³ analysis and has the potential for lower quantitative detection limits in the future, as well as opening the door to the development of MS⁴ systems. Several authors have reported good results with this highly sensitive instrument, which allows the simultaneous identification and quantification of the proteins of interest (Grobosch et al., 2012; Janeckova et al., 2012; Lazartigues et al., 2011; Li et al., 2011; Paine et al., 2012; Rust et al., 2012; Wu et al., 2011).

2.4.3.3 ThermoFisher LTQ-Orbitrap

Both Orbitrap and ICR instruments use a fast Fourier transform (FFT) algorithm to convert the time-domain signal into an m/z spectrum (Senko et al., 1996). The Orbitrap mass analyzer features high resolution (up to 150,000), high mass accuracy (2–5 ppm), a mass-to-charge range of 6000, and a dynamic range greater than 10³ (Makarov et al., 2006). When coupled to an LTQ ion trap, the LTQ Orbitrap instrument represents a multistage trap combination that offers both the high resolution and mass accuracy of the Orbitrap and the speed and sensitivity of the LTQ. In MS mode, the linear trap collects the ion population and passes it on to an intermediate C-trap for injection and analysis in the Orbitrap analyzer at high resolution. In MS/MS mode, the LIT retains only a chosen mass window, which is activated by a supplemental RF field, leading to fragmentation of the trapped precursor ions, and records the signal of a mass-dependent scan at low resolution (Michalski et al., 2011). Furthermore, the LTQ-Orbitrap can be operated in a parallel mode: the Orbitrap acquires MS full scans while the LTQ carries out fragmentation reactions.

Several papers describe the performance of the Orbitrap for bottom-up (Han et al., 2008b; Olsen et al., 2005; Perry et al., 2008; Yates, 2004; Yates et al., 2006) and top-down (Frank et al., 2008; Macek et al., 2006; Schenk et al., 2008) proteomic applications. Some recent applications of the LTQ-Orbitrap highlight the benefits of high mass accuracy, which improves the quantification of low-abundance peptides in very complex biological samples such as human plasma (Schenk et al., 2008).

ThermoFisher LTQ-Orbitrap XL and LTQ-Orbitrap-Velos

These instruments have a linear octopole collision cell. The LTQ Orbitrap XL ETD is the only ETD instrument with both high resolution and accurate mass. This technology enables protein/peptide characterization, including highly sensitive PTM analyses (especially phosphorylation) and top-down sequencing.

Recently, an improved LIT Orbitrap analyzer combination termed LTQ Orbitrap Velos has been introduced. It features an S-lens with up to 10-fold improved ion transmission from the atmosphere, a dual LIT, and a more efficient HCD cell interfaced directly to the C-trap (Olsen et al., 2009). HCD fragmentation is similar to the fragmentation in triple quadrupole or quadrupole TOF instruments, and its products are analyzed with high mass accuracy in the Orbitrap analyzer (Olsen et al., 2007). Thus, the LTQ Orbitrap and LTQ Orbitrap Velos instruments offer versatile fragmentation modes depending on the analytical problem (McAlister et al., 2008).

The LTQ Velos system is configured with an Eksigent uPLC system and ESI ionization source for nano-LC–MS/MS analyses. It is particularly useful for the analysis of low-complexity samples, such as 2-DE protein spots, as well as complex samples containing a number of proteins in two-dimensional LC fractionations. Database searches are performed using Proteome Discoverer 1.2 software based on the SEQUEST algorithm.

DART-Orbitrap

A direct analysis in real time (DART) ion source coupled to a high-resolution Orbitrap mass spectrometer employs a glow discharge for ionization. Metastable helium atoms react with ambient water, oxygen, or other atmospheric components to produce the reactive ionizing species (Cody et al., 2005). The DART ion source has been shown to be efficient for soft ionization of a wide range of both polar and nonpolar compounds. Several papers have described DART applications, including rapid quantitative analysis of various substances occurring in foodstuffs and food crops (Edison et al., 2011; Lojza et al., 2012; Vaclavik et al., 2010).

Q Exactive-Orbitrap and ESI-Orbitrap Exactive

The combination of a LIT with the Orbitrap analyzer has proven to be a popular instrument configuration. The Q Exactive instead couples a quadrupole mass filter to the Orbitrap analyzer. It features high ion currents because of an S-lens and fast high-energy CID peptide fragmentation because of parallel filling and detection modes. The image current from the detector is processed by an enhanced Fourier transformation algorithm, doubling mass spectrometric resolution. Together with almost instantaneous isolation and fragmentation, the instrument achieves overall cycle times of 1 s for a top 10 higher-energy collisional dissociation method. More than 2500 proteins can be identified in a standard 90-min gradient of tryptic digests of mammalian cell lysate, a significant improvement over previous Orbitrap mass spectrometers. Furthermore, the quadrupole Orbitrap analyzer combination enables multiplexed operation at the MS and tandem MS levels (Michalski et al., 2011). Compared with previous Orbitrap instruments, the Q Exactive offers the potential to analyze many more peptides in a given time, with very high MS/MS data quality (Nagaraj et al., 2011). Performance of the Q Exactive for complex peptide mixtures compares well with that of current LTQ Orbitrap instruments such as the LTQ Orbitrap Velos. Although the Q Exactive offers only the HCD fragmentation mode, HCD speed and sensitivity are not limiting (Michalski et al., 2011).

Another hybrid Orbitrap instrument is the ESI-Orbitrap Exactive. Some authors have recently reported applications of this system (Pagnotti et al., 2011; Trimpin et al., 2010).

2.4.3.4 AB SCIEX QSTAR® XL Hybrid LC/MS/MS System

Several applications can be found in the literature in which proteins are identified by MS/MS using a nano-ESI hybrid quadrupole time-of-flight instrument (QSTAR XL Hybrid LC/MS/MS System) (Charlton et al., 2009; Chen, 2006; Issaq et al., 2008; Kang et al., 2009; Kirkland et al., 2008; Ransohoff et al., 2008), although other mass spectrometers offer better scanning speed and detection sensitivity (Zhang et al., 2007).

2.4.3.5 AB SCIEX Triple Quad™ 5500 LC/MS/MS System

Triple quadrupole mass analyzers have been the most often applied MS detectors in targeted analysis (Pico et al., 2004). This type of instrument has now become one of the most widely used mass spectrometers because of its ease of handling, small size, and relatively low cost. Mass separation in a quadrupole mass filter is based on achieving a stable trajectory for ions of specific m/z values in a hyperbolic electrostatic field. An idealized quadrupole mass spectrometer consists of four parallel hyperbolic rods. Each rod has an applied direct current (DC) voltage and a superimposed radiofrequency (RF) potential. One pair of diagonally opposite rods has a positive DC value, and the other has a negative one. The RF phase is opposite for the positive and negative electrodes (Pico et al., 2004). The ions are accelerated sufficiently (typically 5–20 eV) to pass through the rods, but not so much that they fly through the rods without any mass separation. Once they reach the applied field, the ions oscillate according to their m/z; the ions remain in the quadrupole for hundreds of microseconds. Appropriate values of the DC and RF potentials allow ions of only one m/z to pass through the filter and move on to the detector. Scanning the DC and RF fields while fixing the ratio between them stabilizes, in turn, the trajectories of ions of different m/z through the rods and allows them to pass through the filter and move on to the detector (Pico et al., 2004).

In triple quadrupole instruments, an ion of interest is preselected with the first mass filter Q1, collisionally activated with energies up to 300 eV with argon in the pressurized collision chamber Q2, and the fragmentation products are analyzed with the third quadrupole Q3. This process is known as low-energy CID MS/MS and has the advantage that precursor ions are selected prior to CID, which eliminates any uncertainty about the origin of the fragment ions. Some authors have used the Triple Quad 5500 LC/MS/MS System to analyze different food samples (Cronly et al., 2010; Kim et al., 2011).
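The statement that scanning the DC and RF potentials at a fixed ratio brings ions of successive m/z into a stable trajectory can be illustrated with the dimensionless Mathieu parameters a and q of an ideal quadrupole. This is a sketch under textbook assumptions that are not taken from this chapter: a perfect hyperbolic field, a = 8eU/(m r0² Ω²) and q = 4eV/(m r0² Ω²) with U the DC voltage and V the zero-to-peak RF amplitude, and an apex of the first stability region near a = 0.237, q = 0.706. The rod radius, RF frequency, and function names are hypothetical.

```python
import math

E_CHARGE = 1.602176634e-19   # elementary charge, C
DALTON = 1.66053906660e-27   # unified atomic mass unit, kg


def mathieu_parameters(mz, dc_volts, rf_volts, r0_m, rf_hz):
    """Return (a, q) for an ion of given m/z in an ideal quadrupole field."""
    omega = 2.0 * math.pi * rf_hz
    scale = E_CHARGE / (mz * DALTON * r0_m ** 2 * omega ** 2)
    return 8.0 * dc_volts * scale, 4.0 * rf_volts * scale


def rf_amplitude_for(mz, q_target, r0_m, rf_hz):
    """RF amplitude (V) that places an ion of given m/z at q_target."""
    omega = 2.0 * math.pi * rf_hz
    return q_target * mz * DALTON * r0_m ** 2 * omega ** 2 / (4.0 * E_CHARGE)


R0 = 4.0e-3                    # field radius in m, hypothetical
FREQ = 1.0e6                   # RF frequency in Hz, hypothetical
A_APEX, Q_APEX = 0.237, 0.706  # approximate apex of the stability region

for mz in (500.0, 1000.0):
    v = rf_amplitude_for(mz, Q_APEX, R0, FREQ)
    u = v * A_APEX / (2.0 * Q_APEX)   # DC follows RF at a fixed U/V ratio
    a, q = mathieu_parameters(mz, u, v, R0, FREQ)
    print(f"m/z {mz:6.1f}: U = {u:7.1f} V, V = {v:7.1f} V, a = {a:.3f}, q = {q:.3f}")
```

Because a/q equals 2U/V regardless of mass, a fixed ratio keeps every ion on the same line through the stability diagram, and raising both voltages together moves progressively heavier ions to the transmitting tip of that line. A real filter operates just below the apex so that a narrow band of m/z is transmitted.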

2.4.3.6 AB SCIEX TOF/TOF™ 5800 System

This is a tandem time-of-flight MS/MS system that provides not only peptide identification but also peptide sequencing by means of its dual time-of-flight mass analyzers. The MALDI TOF/TOF offers high mass accuracy and ion sensitivity, allowing hundreds of samples to be analyzed within minutes using fewer laser shots than other systems. Several recent protein studies have been performed using the MALDI TOF/TOF 5800 mass spectrometer (Jakoby et al., 2012; Manivannan et al., 2012; Rebello et al., 2011; Sim et al., 2012; Wang et al., 2011).

2.4.3.7 Waters SYNAPT G2-S HDMS

This is a hybrid quadrupole-ion mobility-orthogonal acceleration-time-of-flight (Q-IM-oa-TOF) instrument. An ion mobility cell consisting of three (trap, mobility, and transfer) traveling wave ion guides (TWIGs) is interposed between the quadrupole and TOF analyzers. The trap TWIG collects and releases ions into the mobility TWIG for ion mobility separation, while the transfer TWIG directs the ions to the TOF analyzer for mass analysis. The trap and transfer TWIGs can also act as collision cells, allowing an ion fragmentation step prior to and/or after IMS separation (Benton et al., 2012). SYNAPT® G2-S combines revolutionary StepWave™ ion optics with proven Quantitative Tof (QuanTof™) and High Definition MS™ technologies to provide the highest levels of sensitivity, selectivity, and speed. Many applications of this instrument have been reported recently (Cuyckens et al., 2012; Hart et al., 2011; Inutan and Trimpin, 2010; Inutan et al., 2011; Trimpin et al., 2011).

2.5 BIOINFORMATICS TOOLS

The capacity to annotate, analyze, store, manage, and distribute the complex data collected is very important in proteomics, and that depends as much on computers and software tools as on instruments to generate the data.

2.5.1 Algorithms for Protein Identification

Several search engines match MS/MS spectra to protein databases, and the entire amino acid sequence of a protein can be determined by two different types of approaches. The first type comprises database search algorithms, which work by matching the mass spectrum of the peptide against a database of known peptide sequences. To date, most complete proteomics studies have fallen into this category. Examples of these algorithms include MASCOT (Perkins et al., 1999), Protein Prospector (Clauser et al., 1999), SEQUEST (Yates et al., 1995), and Paragon (Shilov et al., 2007). The second type is de novo sequencing of peptides from mass spectra, which is needed when neither genomic sequence information nor sufficient mass spectrum data are available. Examples of these algorithms include PEAKS (Ma et al., 2003), ADEPTS (He and Ma, 2010), Lutefisk (Frank and Pevzner, 2005; Taylor and Johnson, 1997), PepNovo (Frank and Pevzner, 2005), GST-SPC (Ning et al., 2008), MsNovo (Mo et al., 2007), and QuiXoT (Bonzon-Kulichenko et al., 2011). These search engines are complementary to some extent, so it is often useful to analyze MS/MS data with at least two different algorithms to increase confidence and sensitivity, and tools are available to aid this (Price et al., 2007). Validation of the peptide and protein identifications is necessary and is often conducted by determining false discovery rates using decoy databases and other statistical methods (Li et al., 2009).
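A decoy-based false discovery rate can be estimated in a few lines. The sketch below assumes a concatenated target-decoy search in which each peptide-spectrum match carries a score and a flag saying whether it hit a decoy sequence, and it estimates the FDR above a score threshold as the number of decoy hits divided by the number of target hits. The scores, the data layout, and the function names are hypothetical, and published tools use more refined estimators.

```python
def fdr_at_threshold(psms, threshold):
    """Estimated FDR among matches scoring at or above threshold.

    psms is a list of (score, is_decoy) pairs; the estimate is
    decoy hits / target hits (a simple, commonly used approximation).
    """
    targets = sum(1 for score, is_decoy in psms if score >= threshold and not is_decoy)
    decoys = sum(1 for score, is_decoy in psms if score >= threshold and is_decoy)
    return decoys / targets if targets else 0.0


def score_cutoff(psms, max_fdr=0.01):
    """Lowest observed score whose estimated FDR does not exceed max_fdr."""
    best = None
    for threshold in sorted({score for score, _ in psms}, reverse=True):
        if fdr_at_threshold(psms, threshold) <= max_fdr:
            best = threshold
    return best


# Hypothetical search results: (score, matched a decoy sequence?)
psms = [(90, False), (85, False), (80, False), (75, True),
        (70, False), (60, False), (55, True), (50, False)]

print("FDR at score >= 70:", fdr_at_threshold(psms, 70))
print("cutoff for 1% FDR:", score_cutoff(psms, 0.01))
print("cutoff for 20% FDR:", score_cutoff(psms, 0.20))
```

The same function can be applied to the output of each search engine separately, which gives a common scale for comparing or combining engines whose raw scores are not directly comparable.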

2.5.1.1 Database Search Algorithms

The analysis of mass spectra and identification of proteins is usually done by searching protein sequence databases. These databases are continually updated with submissions produced from the cloning of genes, from which amino acid sequences are generated by translation of the nucleotide sequences in their correct reading frames. They include theoretical peptide masses generated in silico. If a sufficient number of peptides from the experimental peptide mass fingerprint (PMF) match those computed from the database, a theoretical protein match is reported with an ion score based on the percentage of sequence matched and the number of matching peptides. Further verification of the identifications can be provided by matching the molecular weights and isoelectric points of the theoretical proteins with those derived from the spot position on the gel (Rotilio et al., 2012). The peptide fragment fingerprinting (PFF) approach is very similar to the PMF approach but is applied to MS/MS spectra, and hence correlates peptide spectra with theoretical peptides from a database (Palagi et al., 2006).

One of the main limitations of database searching methods is that some databases do not adequately account for sequence modifications, mutations, or alterations, or lack sequences that have not been documented; valuable information can therefore be missed and the protein will not be correctly identified. Table 2.1 shows the most important search engines and databases in common use.
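The PMF matching step can be sketched as an in silico digest followed by a mass comparison. The example below assumes the common trypsin rule (cleavage after K or R unless the next residue is P), no missed cleavages, unmodified residues, singly protonated peptides, and a 50 ppm tolerance. The protein sequence, the observed peak list, and the function names are hypothetical; real search engines add scoring on top of this matching.

```python
import re

# Monoisotopic residue masses in Da (standard values).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565
PROTON = 1.007276


def tryptic_peptides(protein):
    """Cleave after K or R unless followed by P (no missed cleavages)."""
    return [p for p in re.split(r"(?<=[KR])(?!P)", protein) if p]


def peptide_mass(peptide):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def match_fingerprint(observed_mz, protein, tol_ppm=50.0):
    """Theoretical peptides whose [M+H]+ lies within tol_ppm of an observed peak."""
    matched = []
    for peptide in tryptic_peptides(protein):
        theoretical = peptide_mass(peptide) + PROTON
        if any(abs(mz - theoretical) / theoretical * 1e6 <= tol_ppm
               for mz in observed_mz):
            matched.append(peptide)
    return matched


protein = "MKAEGRPLKVVR"               # hypothetical database entry
observed = [770.4519, 373.2558, 500.0]  # hypothetical peak list (m/z)

print(tryptic_peptides(protein))
print(match_fingerprint(observed, protein))
```

The fraction of the sequence covered by matched peptides and the number of matches are the two quantities on which the score described above is based.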

2.5.1.2 De Novo Sequencing

In some cases it is necessary to use a de novo sequencing algorithm to identify proteins from a proteomic experiment. De novo sequencing for MS is typically performed without prior knowledge of the amino acid sequence; it is the process of assigning amino acids from the peptide fragment masses of a protein. Correct identification of the entire protein from de novo sequencing is difficult because genome sequence information is still lacking or incomplete for many organisms. In addition, because de novo sequencing is based on mass and some amino acids have identical masses (e.g., leucine and isoleucine), accurate manual sequencing is sometimes not possible, and a database search and de novo sequencing must be used in tandem.

De novo sequencing is easier for a peptide than for proteins. Each fragmentation depends on the sequence, and some positions may not be completely resolved. Additionally, a mass difference may be explained by more than one amino acid combination, leading to inconclusive sequences. Because additional fragmentation, such as internal fragments, side cleavage, or doubly charged ions, may occur and overlay the ion series, manual interpretation is quite laborious.

Several software packages have been developed to perform automated de novo sequencing. The accuracy of the results depends strongly on the quality of the fragmentation spectra. Resulting peptide candidates can be searched for similarity against sequence databases. MS-BLAST is an example of an alignment tool for this purpose. Additionally, some software packages supplied with MS instruments, such as BioTools from Bruker Daltonik GmbH, either incorporate a full de novo algorithm or support sequence tag generation by annotation of the resulting MS/MS spectrum. Table 2.1 also shows the most important available MS/MS de novo sequencing tools.
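The core step of de novo sequencing, reading residues from the mass differences between consecutive ions of one series, can be sketched as follows. The example assumes a clean, complete, singly charged ion ladder and a 0.01 Da tolerance; the ladder values and the function name are hypothetical. Real algorithms must additionally cope with missing peaks, noise, and overlapping series.

```python
# Monoisotopic residue masses in Da (standard values).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def read_ladder(ion_mz, tol=0.01):
    """Assign residues to the gaps between consecutive ions of one series.

    Returns one entry per gap: the matching residue, several residues
    joined by "/" when they cannot be told apart, or "?" when nothing fits.
    """
    calls = []
    for low, high in zip(ion_mz, ion_mz[1:]):
        delta = high - low
        candidates = sorted(aa for aa, mass in RESIDUE_MASS.items()
                            if abs(mass - delta) <= tol)
        calls.append("/".join(candidates) if candidates else "?")
    return calls


# Hypothetical b-ion ladder (m/z, singly charged)
ladder = [72.04439, 201.08698, 258.10844, 371.19250]
print(read_ladder(ladder))
```

The last gap is reported as "I/L" because leucine and isoleucine have identical masses, which is the ambiguity noted above. Widening the tolerance would also merge lysine and glutamine, whose masses differ by only about 0.036 Da.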

2.5.2 Post-Translational Modifications Identification by Computational Methods

As explained above, PTMs play many important roles in the biological functions of proteins, and hence tools for identifying them are of great importance. Identification algorithms attempt to assign PTMs accurately from MS/MS data. However, spectral imperfections in tandem mass spectra, such as noise peaks and missing peaks, give rise to computational artifacts and often hamper the identification of PTMs, which can result in false PTM assignments (Chung et al., 2011).

PTMfinder is PTM identification software based on an unrestricted PTM identification algorithm that performs a stepwise complete mass shift search, which decreases the computational complexity and increases performance by filtering computational artifacts.

TABLE 2.1 Algorithms Mainly Used for Protein Identification

Search Engines for PMF
  MASCOT         http://www.matrixscience.com
  MS-Fit         http://prospector.ucsf.edu/prospector/cgi-bin/msform.cgi?form=msfitstandard
  PepIdent       http://vsites.unb.br/cbsp/paginiciais/pepident.htm
  PepSea         http://vsites.unb.br/cbsp/paginiciais/pepseaseqtag.htm
  ProFound       http://vsites.unb.br/cbsp/paginiciais/profound.htm

Search Engines for PFF
  FindPept       http://web.expasy.org/findpept/
  InsPecT        http://proteomics.ucsd.edu/InspectDocs/Database.html
  Mascot         http://www.matrixscience.com
  MS-Seq         http://prospector.ucsf.edu/prospector/cgi-bin/msform.cgi?form=msseq
  MS-Tag         http://prospector.ucsf.edu/prospector/cgi-bin/msform.cgi?form=mstagstandard
  OMSSA          http://pubchem.ncbi.nlm.nih.gov/omssa/
  PepFrag        http://prowl.rockefeller.edu/prowl/pepfrag.html
  Phenyx         http://www.genebio.com/products/phenyx/index.html
  Popitam        http://www.expasy.org/tools/popitam/
  ProID          http://sashimi.sourceforge.net/software_mi.html
  Sequest        http://fields.scripps.edu/sequest/
  SpectrumMill   http://spectrummill.mit.edu/
  VEMS           http://yass.sdu.dk/
  X!tandem       http://www.thegpm.org/tandem/

Sequence Databases
  DB Stat        http://prospector.ucsf.edu/prospector/cgi-bin/msform.cgi?form=dbstat
  dbEST          http://www.ncbi.nlm.nih.gov/dbEST/
  EMBL           http://www.ebi.ac.uk/embl/index.html
  GenBank        http://www.ncbi.nlm.nih.gov/Genbank
  MSDB           ftp://ftp.ebi.ac.uk/pub/databases/MassSpecDB/
  NCBInr         www.ncbi.nlm.nih.gov/Entrez
  OWL            http://www.bioinf.man.ac.uk/dbbrowser/OWL/index.php
  PDB            www.rcsb.org/pdb
  PIR            http://pir.georgetown.edu/
  PRF            www.prf.or.jp
  PRIDE          http://www.ebi.ac.uk/pride
  Swiss-Prot     www.expasy.ch/sprot

MS/MS De Novo Sequencing Tools
  AUDENS         www.ti.inf.ethz.ch/pw/software/audens/
  DeNovoX        www.thermo.com
  Lutefisk       www.hairyfatguy.com/Lutefisk
  NovoHMM        http://www.embl.de/~befische/software.html
  PEAKS          http://www.bioinformaticssolutions.com/peaks-protein-id
  PepNovo        http://proteomics.ucsd.edu/Software/PepNovo.html
  Sequit!        http://proteomefactory.com/services/00000097dc126831d/index.html
  SpectrumMill   http://spectrummill.mit.edu/

PTMClust is a machine learning algorithm that can be applied to the output of blind PTM search methods to improve prediction quality by suppressing noise in the data and clustering peptides with the same underlying modification to form PTM groups. PTMClust markedly outperforms PTMfinder because it is able to reduce false PTM assignments, improve overall detection coverage, and facilitate novel PTM discovery, including terminus modifications (Chung et al., 2011). In addition, an approach called NetworKIN has been reported that mines large-scale phosphorylation data sets in the context of the protein–protein interaction network topology to predict kinase substrates in phosphorylation networks (Linding et al., 2007).
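The idea behind a mass shift search can be illustrated with a minimal sketch. It is not the PTMfinder or PTMClust algorithm; it only compares an observed precursor mass with the theoretical mass of an unmodified peptide and looks the difference up in a small table of well-known modification masses. The function names, the tolerance, and the short modification table are assumptions chosen for illustration.

```python
# Monoisotopic residue masses (Da) of the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # one water is added for the peptide termini

# Illustrative subset of common modifications (monoisotopic shifts, Da).
KNOWN_SHIFTS = {
    "phosphorylation": 79.96633,
    "acetylation": 42.01057,
    "oxidation": 15.99491,
    "methylation": 14.01565,
}


def peptide_mass(sequence):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER


def explain_mass_shift(sequence, observed_mass, tolerance=0.01):
    """Return the mass shift and the known modifications that match it.

    The tolerance (in Da) is a hypothetical value; a real search would
    derive it from the mass accuracy of the instrument.
    """
    shift = observed_mass - peptide_mass(sequence)
    candidates = [name for name, delta in KNOWN_SHIFTS.items()
                  if abs(shift - delta) <= tolerance]
    return shift, candidates
```

For example, `explain_mass_shift("PEPTIDE", peptide_mass("PEPTIDE") + 79.96633)` reports phosphorylation as the only candidate. A shift that matches nothing in the table is exactly the kind of observation that blind search methods pass on for clustering and refinement.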

2.5.3 Processing and Analyzing Proteomics Data

Proteomic experiments usually generate complex data. Normalization of the data, together with an understanding of the experimental setup and of the nature and quality of the data obtained, is required to devise appropriate statistical methods. The “data-mining” approaches used to extract relevant information, such as protein interactions, signaling pathways, and biological networks, are very useful but need to be applied and interpreted correctly. One of the most powerful tools available, and often the first tool used to analyze a large dataset, is Gene Ontology (GO), which is used to standardize the way in which proteins are described across different species and databases (Ashburner et al., 2000). GO annotation of a large MS dataset can be used to determine whether there is any enrichment or depletion for a particular GO category, or to compare two different datasets. Other annotation databases relate to protein domains (InterPro, PFAM) and to pathways, such as the Kyoto Encyclopedia of Genes and Genomes (KEGG) or Reactome. This type of analysis may directly yield functional insights into the data set and is easily accomplished using standard tools. DAVID, GoMiner, Cytoscape, and BINGO are examples of readily available software that can be used (Kumar and Mann, 2009). A list of proteins can be uploaded via a web interface into software such as Ingenuity Pathway Analysis (Ingenuity Systems, www.ingenuity.com) (Raponi et al., 2004), Pathguide (http://www.pathguide.org), or Panther (http://www.pantherdb.org) (Thomas et al., 2003) in order to highlight predominant functional themes, organize proteins into groups on the basis of molecular functions or biological processes, and discover common pathways. These software packages build hypothetical protein interaction clusters on the basis of regularly updated databases.
Some of these databases also integrate a broad range of systems biology information, including protein function, cellular localization, small molecule, and disease inter-relationships (Rotilio et al., 2012). The Cytoscape tool visualizes proteins and their interactions in dynamic molecular networks that can be highly customized with biological annotations. Moreover, the BINGO plugin for Cytoscape enables gene ontology-driven analyses of parts of the network. Other available tools for metadata analysis of proteomic data, such as DAVID (Huang da et al., 2007), PANDORA, Babelomics, and Conceptgen, also attempt to classify lists of protein results into functional groups (Sartor et al., 2010). In addition, STRING generates interaction networks as well as predicted interactions and pathway information (Jensen et al., 2009). Data can be input to STRING as protein lists, and it has a user-friendly interface, MiMI, from the National Institute for Integrative Biomedical.

The Protein Information and Knowledge Extractor (PIKE) (http://proteo.cnb.csic.es:8080/pike/) automatically accesses several public information systems and databases across the Internet, searching for all annotations that are described for each of the proteins provided as an input list of accession codes. After compiling all relevant and updated information from the most relevant databases, PIKE summarizes the information for every single protein in several file formats that allow the information to be shared and exchanged with other software tools (Medina-Aunon et al., 2010).
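The GO enrichment question raised in this section is commonly framed as a one-sided hypergeometric test: given a background of annotated proteins, how likely is it to find at least the observed number of proteins from one GO category in the experimental list by chance? The sketch below is a generic illustration of that test and is not taken from DAVID, GoMiner, or BINGO; the function and parameter names are assumptions.

```python
from math import comb


def enrichment_p_value(population, annotated, sample, hits):
    """One-sided hypergeometric tail probability P(X >= hits).

    population -- number of proteins in the background set
    annotated  -- background proteins carrying the GO term
    sample     -- number of proteins in the experimental list
    hits       -- proteins in the list that carry the GO term
    """
    upper = min(annotated, sample)
    total = comb(population, sample)
    # Sum the probabilities of every outcome at least as extreme as `hits`.
    tail = sum(
        comb(annotated, i) * comb(population - annotated, sample - i)
        for i in range(hits, upper + 1)
    )
    return tail / total
```

With a background of 10 proteins, 3 of them annotated, and a list of 3 proteins that all carry the term, `enrichment_p_value(10, 3, 3, 3)` equals 1/120. When many GO categories are tested at once, the resulting p-values still need a multiple-testing correction, which this sketch leaves out.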

2.5.3.1 Analysis of Quantitative Proteomic Data

Tools and workflows for the analysis of quantitative data from proteomic experiments are continuously emerging and being improved. Nevertheless, there is no standard procedure broadly applicable to all experiment types, so it is essential to choose the one that is appropriate for each assay. The Trans-Proteomic Pipeline (TPP) is a suite of software tools applicable to label-free or labeled quantitative proteomics data processing (Keller et al., 2005). The TPP supports all steps from spectrometer output file conversion to protein-level statistical validation, including XPRESS and ASAPRatio, which are used for the relative quantitation of isotopically labeled peptides and proteins (Deutsch et al., 2010). MaxQuant is a software suite for the analysis and quantitation of SILAC experiments (Cox and Mann, 2008). Similarly, Mascot Distiller, from Matrix Science, determines quantitation based on the relative intensities of extracted ion chromatograms for precursors. This approach can be used both for label-free and for labeling approaches such as AQUA, 18O/16O, ICAT, ICPL, and SILAC. ProteinPilot, from Applied Biosystems, provides protein identification and quantitation for SILAC and iTRAQ. For label-free approaches, many open-source and commercial software packages are available (Wang et al., 2008).
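As a rough illustration of relative quantitation from precursor intensities, the sketch below turns light and heavy peptide intensities into a protein-level log2 ratio by taking the median over peptides, then centers all proteins on the global median. This is a simplified assumption-laden example, not the procedure implemented in XPRESS, ASAPRatio, MaxQuant, or Mascot Distiller; the function names and the input layout are hypothetical.

```python
from math import log2
from statistics import median


def protein_log2_ratios(peptides):
    """Median log2(heavy/light) ratio per protein.

    peptides -- iterable of (protein, light_intensity, heavy_intensity)
    """
    per_protein = {}
    for protein, light, heavy in peptides:
        if light <= 0 or heavy <= 0:
            continue  # skip peptides with a missing channel
        per_protein.setdefault(protein, []).append(log2(heavy / light))
    return {protein: median(values) for protein, values in per_protein.items()}


def center_on_median(ratios):
    """Subtract the global median log2 ratio from every protein.

    Assumes that most proteins do not change between the two samples.
    """
    shift = median(ratios.values())
    return {protein: value - shift for protein, value in ratios.items()}
```

The median is used here because it is less sensitive than the mean to a single outlying peptide ratio, for instance one caused by a co-eluting interference.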

2.5.4 Proteomics Data Repositories

Proteomics studies generate large volumes of raw experimental data and inferred biological results. To facilitate the management, integration, storage, and dissemination of these data and to make the results accessible, several centralized data repositories have been established. Reflecting the intrinsic complexity of the field, proteomics repositories are quite heterogeneous and differ in their interests and focus, which has led to limited data sharing in proteomics. The main existing public repositories are the Global Proteome Machine Database (GPMDB), PeptideAtlas, the PRoteomics IDEntifications database (PRIDE), Tranche, and NCBI Peptidome (Vizcaino et al., 2010).

2.5.4.1 PRoteomics IDEntifications

The PRIDE (PRoteomics IDEntifications) database is a freely available public data repository for the data generated by MS proteomics experiments. PRIDE includes data standards, raw spectral data, peptides, protein identifications, and associated statistics that may be uploaded, downloaded, or viewed using a single, centralized web interface (http://www.ebi.ac.uk/pride/) that is independent of the hardware or algorithms used to generate the data. This allows researchers to achieve standards compliance for data generated by many different platforms (Riffle and Eng, 2009).

2.5.4.2 PeptideAtlas

The PeptideAtlas project was initiated to map high-confidence peptide identifications to eukaryotic genomes as one component of a resource for proteomics information (Deutsch et al., 2008). The project annotates genome sequences of multiple organisms with peptide and protein information derived primarily from MS/MS data. This repository provides information such as the total number of times a peptide has been observed, from which experimental samples the peptide has been identified, and a histogram of the number of identifications from each experimental sample (Riffle and Eng, 2009).
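The kind of summary described above (total observations per peptide and a per-sample breakdown) can be sketched in a few lines. This does not use the PeptideAtlas interface or data format; the input layout, a list of peptide and sample pairs with one pair per identified spectrum, is an assumption made for illustration.

```python
from collections import Counter, defaultdict


def summarize_observations(identifications):
    """Count peptide observations overall and per experimental sample.

    identifications -- iterable of (peptide_sequence, sample_name),
                       one pair per identified spectrum
    """
    total = Counter()
    per_sample = defaultdict(Counter)
    for peptide, sample in identifications:
        total[peptide] += 1
        per_sample[peptide][sample] += 1
    return total, per_sample
```

Here `total[peptide]` gives the number of times a peptide was observed, and `per_sample[peptide]` holds the counts behind a per-sample histogram for that peptide.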

2.5.4.3 The Global Proteome Machine Database

The Global Proteome Machine (GPM) is a portal to a proteomics database and open-source software that was developed by Beavis Informatics (Beavis, 2006). Its core is a search engine named X! Tandem that identifies peptides and proteins from MS/MS data. Results of searches performed on the GPM sites are stored and regularly collated into the GPMDB central repository. This expansive collection of proteomics data in GPMDB, including high-confidence peptide identifications and their corresponding experimental tandem mass spectra, is a valuable resource for further MS computational research. The GPMDB can be accessed at http://gpmdb.thegpm.org/ (Riffle and Eng, 2009). One application of this resource is the study of proteotypic peptides, defined as those that ionize and fragment well, so that they are successfully identified in MS/MS experiments, and that represent the protein of interest. This is very useful in targeted mass spectrometry experiments.

2.5.4.4 National Center for Biotechnology Information Peptidome

The National Center for Biotechnology Information (NCBI) Peptidome (http://www.ncbi.nlm.nih.gov/peptidome) is the most recent of the main proteomics repositories to be launched (Slotta et al., 2009) and, for that reason, is the one that contains the least amount of data at present. It is essentially a sibling repository to PRIDE, since data are not reprocessed in any way and the submitter's original view of the data is represented. This repository stores lists of identified peptides and proteins, mass spectra as the supporting evidence for these identifications, and descriptive information about the biological samples, instrumentation, and/or the informatics pipeline. As with all the previous resources, reviewer accounts can be created to anonymously access data in the prepublication stage. There are two types of components in Peptidome: Samples (containing all the data related to the biological material, which is derived from one or more MS runs) and Studies (collection of samples from the same experiment) (Vizcaino et al., 2010). Due to budgetary constraints, NCBI has discontinued the Peptidome Repository. All existing data and metadata files will continue to be made available indefinitely from the ftp server ftp://ftp.ncbi.nih.gov/pub/peptidome/.

2.5.4.5 The ProteoRed Minimum Information about a Proteomics Experiment Repository

The ProteoRed Minimum Information About a Proteomics Experiment (MIAPE) repository, accessible through the MIAPE Generator Tool (Martinez-Bartolome et al., 2010) (http://www.proteored.org/MIAPEGenerator), is an online resource in which researchers can store and retrieve MIAPE (Taylor et al., 2007) documents describing several phases of a proteomics experiment. The repository allows MIAPE documents to be shared with other users and also to be made publicly available. Several tools have been developed to produce, store, or retrieve MIAPE data to and from the repository (Medina-Aunon et al., 2011). Currently the repository stores data and metadata from more than 3500 experiments, 400 of which are publicly accessible.

REFERENCES

Afzal V, Huang JT, Atrih A, Crowther DJ (2011). PChopper: high throughput peptide prediction for MRM/SRM transition design. BMC Bioinformatics 12:338.
Agrawal GK, Rakwal R (2006). Rice proteomics: a cornerstone for cereal food crop proteomes. Mass Spectrometry Reviews 25(1):1–53.
Andrews GL, Simons BL, Young JB, Hawkridge AM, Muddiman DC (2011). Performance characteristics of a new hybrid quadrupole time-of-flight tandem mass spectrometer (TripleTOF 5600). Analytical Chemistry 83(13):5442–5446.
Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM, Davis AP, Dolinski K, Dwight SS, Eppig JT, Harris MA, Hill DP, Issel-Tarver L, Kasarskis A, Lewis S, Matese JC, Richardson JE, Ringwald M, Rubin GM, Sherlock G (2000). Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nature Genetics 25(1):25–29.
Beavis RC (2006). Using the global proteome machine for protein identification. Methods in Molecular Biology 328:217–228.
Benton CM, Lim CK, Moniz C, Jones DJ (2012). Travelling wave ion mobility mass spectrometry of 5-aminolaevulinic acid, porphobilinogen and porphyrins. Rapid Communications in Mass Spectrometry 26(4):480–486.
Beynon RJ, Doherty MK, Pratt JM, Gaskell SJ (2005). Multiplexed absolute quantification in proteomics using artificial QCAT proteins of concatenated signature peptides. Nature Methods 2(8):587–589.
Black DL (2000). Protein diversity from alternative splicing: a challenge for bioinformatics and post-genome biology. Cell 103(3):367–370.
Bonzon-Kulichenko E, Perez-Hernandez D, Nunez E, Martinez-Acedo P, Navarro P, Trevisan-Herraz M, Ramos Mdel C, Sierra S, Martinez-Martinez S, Ruiz-Meana M, Miro-Casas E, Garcia-Dorado D, Redondo JM, Burgos JS, Vazquez J (2011). A robust method for quantitative high-throughput analysis of proteomes by 18O labeling. Molecular & Cellular Proteomics 10(1):M110.003335.

Brewis IA, Brennan P (2010). Proteomics technologies for the global identification and quantification of proteins. Advances in Protein Chemistry and Structural Biology 80:1–44.
Ciordia S, de Los Rios V, Albar JP (2006). Contributions of advanced proteomics technologies to cancer diagnosis. Clinical & Translational Oncology: Official Publication of the Federation of Spanish Oncology Societies and of the National Cancer Institute of Mexico 8(8):566–580.
Clauser KR, Baker P, Burlingame AL (1999). Role of accurate mass measurement (+/− 10 ppm) in protein identification strategies employing MS or MS/MS and database searching. Analytical Chemistry 71(14):2871–2882.
Cody RB, Laramee JA, Durst HD (2005). Versatile new ion source for the analysis of materials in open air under ambient conditions. Analytical Chemistry 77(8):2297–2302.
Cox J, Mann M (2008). MaxQuant enables high peptide identification rates, individualized p.p.b.-range mass accuracies and proteome-wide protein quantification. Nature Biotechnology 26(12):1367–1372.
Cox J, Mann M (2011). Quantitative, high-resolution proteomics for data-driven systems biology. Annual Review of Biochemistry 80:273–299.
Cronly M, Behan P, Foley B, Malone E, Martin S, Doyle M, Regan L (2010). Rapid multi-class multi-residue method for the confirmation of chloramphenicol and eleven nitroimidazoles in milk and honey by liquid chromatography-tandem mass spectrometry (LC-MS). Food Additives & Contaminants. Part A, Chemistry, Analysis, Control, Exposure & Risk Assessment 27(9):1233–1246.
Cuyckens F, Pauwels N, Koppen V, Leclercq L (2012). Use of relative 12C/14C isotope ratios to estimate metabolite concentrations in the absence of authentic standards. Bioanalysis 4(2):143–156.
Cham Mead JA, Bianco L, Bessant C (2010). Free computational resources for designing selected reaction monitoring transitions. Proteomics 10(6):1106–1126.
Charlton M, Viker K, Krishnan A, Sanderson S, Veldt B, Kaalsbeek AJ, Kendrick M, Thompson G, Que F, Swain J, Sarr M (2009). Differential expression of lumican and fatty acid binding protein-1: new insights into the histologic spectrum of nonalcoholic fatty liver disease. Hepatology 49(4):1375–1384.
Chen S (2006). Rapid protein identification using direct infusion nanoelectrospray ionization mass spectrometry. Proteomics 6(1):16–25.
Chung C, Liu J, Emili A, Frey BJ (2011). Computational refinement of post-translational modifications predicted from tandem mass spectrometry. Bioinformatics 27(6):797–806.
Dale G, Latner AL (1969). Isoelectric focusing of serum proteins in acrylamide gels followed by electrophoresis. Clinica Chimica Acta; International Journal of Clinical Chemistry 24(1):61–68.
de Godoy LM, Olsen JV, de Souza GA, Li G, Mortensen P, Mann M (2006). Status of complete proteome analysis by mass spectrometry: SILAC labeled yeast as a model system. Genome Biology 7(6):R50.
Deutsch EW, Lam H, Aebersold R (2008). PeptideAtlas: a resource for target selection for emerging targeted proteomics workflows. EMBO Reports 9(5):429–434.
Deutsch EW, Mendoza L, Shteynberg D, Farrah T, Lam H, Tasman N, Sun Z, Nilsson E, Pratt B, Prazen B, Eng JK, Martin DB, Nesvizhskii AI, Aebersold R (2010). A guided tour of the trans-proteomic pipeline. Proteomics 10(6):1150–1159.

Domon B, Aebersold R (2006). Mass spectrometry and protein analysis. Science 312(5771):212–217.
Douglas DJ, Frank AJ, Mao D (2005). Linear ion traps in mass spectrometry. Mass Spectrometry Reviews 24(1):1–29.
Dunham WH, Larsen B, Tate S, Badillo BG, Goudreault M, Tehami Y, Kislinger T, Gingras AC (2011). A cost-benefit analysis of multidimensional fractionation of affinity purification-mass spectrometry samples. Proteomics 11(13):2603–2612.
Edison SE, Lin LA, Gamble BM, Wong J, Zhang K (2011). Surface swabbing technique for the rapid screening for pesticides using ambient pressure desorption ionization with high-resolution mass spectrometry. Rapid Communications in Mass Spectrometry 25(1):127–139.
Elliott MH, Smith DS, Parker CE, Borchers C (2009). Current trends in quantitative proteomics. Journal of Mass Spectrometry 44(12):1637–1660.
Fenn JB, Mann M, Meng CK, Wong SF, Whitehouse CM (1989). Electrospray ionization for mass spectrometry of large biomolecules. Science 246(4926):64–71.
Fernandez M, Albar JP (2012). 2D DIGE for the analysis of RAMOS cells subproteomes. Methods in Molecular Biology 854:239–252.
Finnie C, Sultan A, Grasser KD (2011). From protein catalogues towards targeted proteomics approaches in cereal grains. Phytochemistry 72(10):1145–1153.
Finnie C, Svensson B (2009). Barley seed proteomics from spots to structures. Journal of Proteomics 72(3):315–324.
Frank A, Pevzner P (2005). PepNovo: de novo peptide sequencing via probabilistic network modeling. Analytical Chemistry 77(4):964–973.
Frank AM, Pesavento JJ, Mizzen CA, Kelleher NL, Pevzner PA (2008). Interpreting top-down mass spectra using spectral alignment. Analytical Chemistry 80(7):2499–2505.
Ge Y, Rajkumar L, Guzman RC, Nandi S, Patton WF, Agnew BJ (2004). Multiplexed fluorescence detection of phosphorylation, glycosylation, and total protein in the proteomic analysis of breast cancer refractoriness. Proteomics 4(11):3464–3467.
Geiger T, Cox J, Ostasiewicz P, Wisniewski JR, Mann M (2010). Super-SILAC mix for quantitative proteomics of human tumor tissue. Nature Methods 7(5):383–385.
Gerber SA, Rush J, Stemman O, Kirschner MW, Gygi SP (2003). Absolute quantification of proteins and phosphoproteins from cell lysates by tandem MS. Proceedings of the National Academy of Sciences of the United States of America 100(12):6940–6945.
Gillet LC, Navarro P, Tate S, Roest H, Selevsek N, Reiter L, Bonner R, Aebersold R (2012). Targeted data extraction of the MS/MS spectra generated by data-independent acquisition: a new concept for consistent and accurate proteome analysis. Molecular & Cellular Proteomics 11(6):O111.016717.
Griffin NM, Yu J, Long F, Oh P, Shore S, Li Y, Koziol JA, Schnitzer JE (2010). Label-free, normalized quantification of complex mass spectrometry data for proteomic analysis. Nature Biotechnology 28(1):83–89.
Grobosch T, Schwarze B, Stoecklein D, Binscheck T (2012). Fatal poisoning with Taxus baccata. Quantification of Paclitaxel (taxol A), 10-Deacetyltaxol, Baccatin III, 10-Deacetylbaccatin III, Cephalomannine (taxol B), and 3,5-Dimethoxyphenol in Body Fluids by Liquid Chromatography-Tandem Mass Spectrometry. Journal of Analytical Toxicology 36(1):36–43.

Gygi SP, Rist B, Gerber SA, Turecek F, Gelb MH, Aebersold R (1999). Quantitative analysis of complex protein mixtures using isotope-coded affinity tags. Nature Biotechnology 17(10):994–999.
Han G, Ye M, Zhou H, Jiang X, Feng S, Tian R, Wan D, Zou H, Gu J (2008a). Large-scale phosphoproteome analysis of human liver tissue by enrichment and fractionation of phosphopeptides with strong anion exchange chromatography. Proteomics 8(7):1346–1361.
Han X, Aslanian A, Yates JR, 3rd (2008b). Mass spectrometry for proteomics. Current Opinion in Chemical Biology 12(5):483–490.
Hart PJ, Francese S, Claude E, Woodroofe MN, Clench MR (2011). MALDI-MS imaging of lipids in ex vivo human skin. Analytical and Bioanalytical Chemistry 401(1):115–125.
He L, Ma B (2010). ADEPTS: advanced peptide de novo sequencing with a pair of tandem mass spectra. Journal of Bioinformatics and Computational Biology 8(6):981–994.
Helsens K, Martens L, Vandekerckhove J, Gevaert K (2011). Mass spectrometry-driven proteomics: an introduction. Methods in Molecular Biology 753:1–27.
Henzel WJ, Billeci TM, Stults JT, Wong SC, Grimley C, Watanabe C (1993). Identifying proteins from two-dimensional gels by molecular mass searching of peptide fragments in protein sequence databases. Proceedings of the National Academy of Sciences of the United States of America 90(11):5011–5015.
Herrero M, Simo C, Garcia-Canas V, Ibanez E, Cifuentes A (2012). Foodomics: MS-based strategies in modern food science and nutrition. Mass Spectrometry Reviews 31:49–69.
Hu Q, Noll RJ, Li H, Makarov A, Hardman M, Graham Cooks R (2005). The Orbitrap: a new mass spectrometer. Journal of Mass Spectrometry 40(4):430–443.
Huang da W, Sherman BT, Tan Q, Collins JR, Alvord WG, Roayaei J, Stephens R, Baseler MW, Lane HC, Lempicki RA (2007). The DAVID Gene Functional Classification Tool: a novel biological module-centric algorithm to functionally analyze large gene lists. Genome Biology 8(9):R183.
Huang Y, Tseng GC, Yuan S, Pasa-Tolic L, Lipton MS, Smith RD, Wysocki VH (2008). A data-mining scheme for identifying peptide structural motifs responsible for different MS/MS fragmentation intensity patterns. Journal of Proteome Research 7(1):70–79.
Inutan ED, Trimpin S (2010). Laserspray ionization-ion mobility spectrometry-mass spectrometry: baseline separation of isomeric amyloids without the use of solvents desorbed and ionized directly from a surface. Journal of Proteome Research 9(11):6077–6081.
Inutan ED, Wang B, Trimpin S (2011). Commercial intermediate pressure MALDI ion mobility spectrometry mass spectrometer capable of producing highly charged laserspray ionization ions. Analytical Chemistry 83(3):678–684.
Ishihama Y, Oda Y, Tabata T, Sato T, Nagasu T, Rappsilber J, Mann M (2005). Exponentially modified protein abundance index (emPAI) for estimation of absolute protein amount in proteomics by the number of sequenced peptides per protein. Molecular & Cellular Proteomics 4(9):1265–1272.
Issaq HJ, Nativ O, Waybright T, Luke B, Veenstra TD, Issaq EJ, Kravstov A, Mullerad M (2008). Detection of bladder cancer in human urine by metabolomic profiling using high performance liquid chromatography/mass spectrometry. The Journal of Urology 179(6):2422–2426.

Jaffe JD, Keshishian H, Chang B, Addona TA, Gillette MA, Carr SA (2008). Accurate inclusion mass screening: a bridge from unbiased discovery to targeted assay development for biomarker verification. Molecular & Cellular Proteomics 7(10):1952–1962.
Jakoby T, van den Berg BH, Tholey A (2012). Quantitative protease cleavage site profiling using tandem-mass-tag labeling and LC-MALDI-TOF/TOF MS/MS analysis. Journal of Proteome Research 11(3):1812–1820.
Janeckova H, Hron K, Wojtowicz P, Hlidkova E, Baresova A, Friedecky D, Zidkova L, Hornik P, Behulova D, Prochazkova D, Vinohradska H, Peskova K, Bruheim P, Smolka V, Stastna S, Adam T (2012). Targeted metabolomic analysis of plasma samples for the diagnosis of inherited metabolic disorders. Journal of Chromatography A 1226:11–17.
Jedrychowski MP, Huttlin EL, Haas W, Sowa ME, Rad R, Gygi SP (2011). Evaluation of HCD- and CID-type fragmentation within their respective detection platforms for murine phosphoproteomics. Molecular & Cellular Proteomics 10(12):M111.009910.
Jensen LJ, Kuhn M, Stark M, Chaffron S, Creevey C, Muller J, Doerks T, Julien P, Roth A, Simonovic M, Bork P, von Mering C (2009). STRING 8–a global view on proteins and their functional interactions in 630 organisms. Nucleic Acids Research 37(Database issue):D412–D416.
Kang SU, Fuchs K, Sieghart W, Pollak A, Csaszar E, Lubec G (2009). Gel-based mass spectrometric analysis of a strongly hydrophobic GABAA-receptor subunit containing four transmembrane domains. Nature Protocols 4(7):1093–1102.
Kapp EA, Schutz F, Reid GE, Eddes JS, Moritz RL, O’Hair RA, Speed TP, Simpson RJ (2003). Mining a tandem mass spectrometry database to determine the trends and global factors influencing peptide fragmentation. Analytical Chemistry 75(22):6251–6264.
Karas M, Hillenkamp F (1988). Laser desorption ionization of proteins with molecular masses exceeding 10,000 daltons. Analytical Chemistry 60(20):2299–2301.
Keller A, Eng J, Zhang N, Li XJ, Aebersold R (2005). A uniform proteomics MS/MS analysis platform utilizing open XML file formats. Molecular Systems Biology 1:2005.0017.
Keshishian H, Addona T, Burgess M, Mani DR, Shi X, Kuhn E, Sabatine MS, Gerszten RE, Carr SA (2009). Quantification of cardiovascular biomarkers in patient plasma by targeted mass spectrometry and stable isotope dilution. Molecular & Cellular Proteomics 8(10):2339–2349.
Kim JW, Isobe T, Chang KH, Amano A, Maneja RH, Zamora PB, Siringan FP, Tanabe S (2011). Levels and distribution of organophosphorus flame retardants and plasticizers in fishes from Manila Bay, the Philippines. Environmental Pollution 159(12):3653–3659.
Kirkland PA, Humbard MA, Daniels CJ, Maupin-Furlow JA (2008). Shotgun proteomics of the haloarchaeon Haloferax volcanii. Journal of Proteome Research 7(11):5033–5039.
Kuhn E, Whiteaker JR, Mani DR, Jackson AM, Zhao L, Pope ME, Smith D, Rivera KD, Anderson NL, Skates SJ, Pearson TW, Paulovich AG, Carr SA (2012a). Interlaboratory evaluation of automated, multiplexed peptide immunoaffinity enrichment coupled to multiple reaction monitoring mass spectrometry for quantifying proteins in plasma. Molecular & Cellular Proteomics 11(6):M111.013854.
Kuhn K, Baumann C, Tommassen J, Prinz T (2012b). TMT labelling for the quantitative analysis of adaptive responses in the meningococcal proteome. Methods in Molecular Biology 799:127–141.
Kumar C, Mann M (2009). Bioinformatics analysis of mass spectrometry-based proteomics data sets. FEBS Letters 583(11):1703–1712.

Lange V, Picotti P, Domon B, Aebersold R (2008). Selected reaction monitoring for quantitative proteomics: a tutorial. Molecular Systems Biology 4:222.
Latterich M, Abramovitz M, Leyland-Jones B (2008). Proteomics: new technologies and clinical applications. European Journal of Cancer 44(18):2737–2741.
Lazartigues A, Wiest L, Baudot R, Thomas M, Feidt C, Cren-Olive C (2011). Multiresidue method to quantify pesticides in fish muscle by QuEChERS-based extraction and LC-MS/MS. Analytical and Bioanalytical Chemistry 400(7):2185–2193.
Lee J, Feng J, Campbell KB, Scheffler BE, Garrett WM, Thibivilliers S, Stacey G, Naiman DQ, Tucker ML, Pastor-Corrales MA, Cooper B (2009). Quantitative proteomic analysis of bean plants infected by a virulent and avirulent obligate rust fungus. Molecular & Cellular Proteomics 8(1):19–31.
Lee SW, Berger SJ, Martinovic S, Pasa-Tolic L, Anderson GA, Shen Y, Zhao R, Smith RD (2002). Direct mass spectrometric analysis of intact proteins of the yeast large ribosomal subunit using capillary LC/FTICR. Proceedings of the National Academy of Sciences of the United States of America 99(9):5942–5947.
Leymarie N, Zaia J (2012). Effective use of mass spectrometry for glycan and glycopeptide structural analysis. Analytical Chemistry 84(7):3040–3048.
Li X, Pizarro A, Grosser T (2009). Elective affinities–bioinformatic analysis of proteomic mass spectrometry data. Archives of Physiology and Biochemistry 115(5):311–319.
Li Y, Henion J, Abbott R, Wang P (2011). Dried blood spots as a sampling technique for the quantitative determination of guanfacine in clinical studies. Bioanalysis 3(22):2501–2514.
Linding R, Jensen LJ, Ostheimer GJ, van Vugt MA, Jorgensen C, Miron IM, Diella F, Colwill K, Taylor L, Elder K, Metalnikov P, Nguyen V, Pasculescu A, Jin J, Park JG, Samson LD, Woodgett JR, Russell RB, Bork P, Yaffe MB, Pawson T (2007). Systematic discovery of in vivo phosphorylation networks. Cell 129(7):1415–1426.
Lojza J, Cajka T, Schulzova V, Riddellova K, Hajslova J (2012). Analysis of isoflavones in soybeans employing direct analysis in real-time ionization-high-resolution mass spectrometry. Journal of Separation Science 35(3):476–481.
Lottspeich F, Kellermann J (2011). ICPL labeling strategies for proteome research. Methods in Molecular Biology 753:55–64.
Lu P, Vogel C, Wang R, Yao X, Marcotte EM (2007). Absolute protein expression profiling estimates the relative contributions of transcriptional and translational regulation. Nature Biotechnology 25(1):117–124.
Ma B, Zhang K, Hendrie C, Liang C, Li M, Doherty-Kirby A, Lajoie G (2003). PEAKS: powerful software for peptide de novo sequencing by tandem mass spectrometry. Rapid Communications in Mass Spectrometry 17(20):2337–2342.
Macek B, Waanders LF, Olsen JV, Mann M (2006). Top-down protein sequencing and MS3 on a hybrid linear quadrupole ion trap-orbitrap mass spectrometer. Molecular & Cellular Proteomics 5(5):949–958.
Macko V, Stegemann H (1969). Mapping of potato proteins by combined electrofocusing and electrophoresis identification of varieties. Hoppe-Seyler’s Zeitschrift fur Physiologische Chemie 350(7):917–919.
Makarov A, Denisov E, Lange O, Horning S (2006). Dynamic range of mass accuracy in LTQ Orbitrap hybrid mass spectrometer. Journal of the American Society for Mass Spectrometry 17(7):977–982.

Malmstrom J, Beck M, Schmidt A, Lange V, Deutsch EW, Aebersold R (2009). Proteome-wide cellular protein concentrations of the human pathogen Leptospira interrogans. Nature 460(7256):762–765.
Malmstrom L, Malmstrom J, Selevsek N, Rosenberger G, Aebersold R (2012). Automated workflow for large-scale selected reaction monitoring experiments. Journal of Proteome Research 11(3):1644–1653.
Manivannan B, Jordan TW, Secor WE, La Flamme AC (2012). Proteomic changes at eight weeks after infection are associated with chronic liver pathology in experimental schistosomiasis. Journal of Proteomics 75(6):1838–1848.
Marcilla M, Alpizar A, Paradela A, Albar JP (2011). A systematic approach to assess amino acid conversions in SILAC experiments. Talanta 84(2):430–436.
March RE (2009). Quadrupole ion traps. Mass Spectrometry Reviews 28(6):961–989.
Martinez-Bartolome S, Medina-Aunon JA, Jones AR, Albar JP (2010). Semi-automatic tool to describe, store and compare proteomics experiments based on MIAPE compliant reports. Proteomics 10(6):1256–1260.
Maxam AM, Gilbert W (1977). A new method for sequencing DNA. Proceedings of the National Academy of Sciences 74(2):560–564.
McAlister GC, Berggren WT, Griep-Raming J, Horning S, Makarov A, Phanstiel D, Stafford G, Swaney DL, Syka JE, Zabrouskov V, Coon JJ (2008). A proteomics grade electron transfer dissociation-enabled hybrid linear ion trap-orbitrap mass spectrometer. Journal of Proteome Research 7(8):3127–3136.
McClatchy DB, Liao L, Park SK, Xu T, Lu B, Yates JR, 3rd (2011). Differential proteomic analysis of mammalian tissues using SILAM. PloS One 6(1):e16039.
McNulty DE, Annan RS (2008). Hydrophilic interaction chromatography reduces the complexity of the phosphoproteome and improves global phosphopeptide isolation and detection. Molecular & Cellular Proteomics 7(5):971–980.
Mechin V, Thevenot C, Le Guilloux M, Prioul JL, Damerval C (2007). Developmental analysis of maize endosperm proteome suggests a pivotal role for pyruvate orthophosphate dikinase. Plant Physiology 143(3):1203–1219.
Mechref Y, Madera M, Novotny MV (2008). Glycoprotein enrichment through lectin affinity techniques. Methods in Molecular Biology 424:373–396.
Medina-Aunon JA, Martinez-Bartolome S, Lopez-Garcia MA, Salazar E, Navajas R, Jones AR, Paradela A, Albar JP (2011). The ProteoRed MIAPE web toolkit: a user-friendly framework to connect and share proteomics standards. Molecular & Cellular Proteomics 10(10):M111.008334.
Medina-Aunon JA, Paradela A, Macht M, Thiele H, Corthals G, Albar JP (2010). Protein information and knowledge extractor: discovering biological information from proteomics data. Proteomics 10(18):3262–3271.
Menon R, Omenn GS (2010). Proteomic characterization of novel alternative splice variant proteins in human epidermal growth factor receptor 2/neu-induced breast cancers. Cancer Research 70(9):3440–3449.
Michalski A, Damoc E, Hauschild JP, Lange O, Wieghaus A, Makarov A, Nagaraj N, Cox J, Mann M, Horning S (2011). Mass spectrometry-based proteomics using Q Exactive, a high-performance benchtop quadrupole Orbitrap mass spectrometer. Molecular & Cellular Proteomics 10(9):M111.011015.

Mikesh LM, Ueberheide B, Chi A, Coon JJ, Syka JE, Shabanowitz J, Hunt DF (2006). The utility of ETD mass spectrometry in proteomic analysis. Biochimica et Biophysica Acta 1764(12):1811–1822. Miller I, Crawford J, Gianazza E (2006). Protein stains for proteomic applications: which, when, why? Proteomics 6(20):5385–5408. Mo L, Dutta D, Wan Y, Chen T (2007). MSNovo: a dynamic programming algorithm for de novo peptide sequencing via tandem mass spectrometry. Analytical Chemistry 79(13):4870– 4878. Monteoliva L, Albar JP (2004). Differential proteomics: an overview of gel and non-gel based approaches. Briefings in Functional Genomics & Proteomics 3(3):220–239. Morais MP, Fossey JS, James TD, van den Elsen JM (2012). Analysis of protein glycation using phenylboronate acrylamide gel electrophoresis. Methods in Molecular Biology 869:93– 109. Motoyama A, Xu T, Ruse CI, Wohlschlegel JA, Yates JR, 3rd (2007). Anion and cation mixed-bed ion exchange for enhanced multidimensional separations of peptides and phos- phopeptides. Analytical Chemistry 79(10):3623–3634. Nagaraj N, D’Souza RC, Cox J, Olsen JV, Mann M (2010). Feasibility of large-scale phospho- proteomics with higher energy collisional dissociation fragmentation. Journal of Proteome Research 9(12):6786–6794. Nagaraj N, Kulak NA, Cox J, Neuhaus N, Mayr K, Hoerning O, Vorm O, Mann M (2011). Systems-wide perturbation analysis with near complete coverage of the yeast proteome by single-shot UHPLC runs on a bench-top Orbitrap. Molecular & Cellular Proteomics 11(3):M111.013722. Navajas R, Paradela A, Albar JP (2011). Immobilized metal affinity chromatography/reversed- phase enrichment of phosphopeptides and analysis by CID/ETD tandem mass spectrometry. Methods in Molecular Biology 681:337–348. Neilson KA, Ali NA, Muralidharan S, Mirzaei M, Mariani M, Assadourian G, Lee A, van Sluyter SC, Haynes PA (2011). Less label, more free: approaches in label-free quantitative mass spectrometry. Proteomics 11(4):535–553. 
Nesvizhskii AI, Keller A, Kolker E, Aebersold R (2003). A statistical model for identifying proteins by tandem mass spectrometry. Analytical chemistry 75(17):4646–4658. Neverova I, Van Eyk JE (2005). Role of chromatographic techniques in proteomic analysis. Journal of Chromatography B: Analytical Technologies in the Biomedical and Life Sciences 815(1–2):51–63. Ning K, Ye N, Leong HW (2008). On preprocessing and antisymmetry in de novo peptide sequencing: improving efficiency and accuracy. Journal of Bioinformatics and Computa- tional Biology 6(3):467–492. O’Farrell PH (1975). High resolution two-dimensional electrophoresis of proteins. Journal of Biological Chemistry 250(10):4007–4021. Oda Y, Huang K, Cross FR, Cowburn D, Chait BT (1999). Accurate quantitation of protein expression and site-specific phosphorylation. Proceedings of the National Academy of Sciences of the United States of America 96(12):6591–6596. Old WM, Meyer-Arendt K, Aveline-Wolf L, Pierce KG, Mendoza A, Sevinsky JR, Resing KA, Ahn NG (2005). Comparison of label-free methods for quantifying human proteins by shotgun proteomics. Molecular & Cellular Proteomics 4(10):1487– 1502. REFERENCES 63

Olsen JV, de Godoy LM, Li G, Macek B, Mortensen P, Pesch R, Makarov A, Lange O, Horn- ing S, Mann M (2005). Parts per million mass accuracy on an Orbitrap mass spectrome- ter via lock mass injection into a C-trap. Molecular & Cellular Proteomics 4(12):2010– 2021. Olsen JV, Macek B, Lange O, Makarov A, Horning S, Mann M (2007). Higher-energy C-trap dissociation for peptide modification analysis. Nature Methods 4(9):709–712. Olsen JV, Schwartz JC, Griep-Raming J, Nielsen ML, Damoc E, Denisov E, Lange O, Remes P, Taylor D, Splendore M, Wouters ER, Senko M, Makarov A, Mann M, Horning S (2009). A dual pressure linear ion trap Orbitrap instrument with very high sequencing speed. Molecular & Cellular Proteomics 8(12):2759–2769. Pagnotti VS, Chubatyi ND, McEwen CN (2011). Solvent assisted inlet ionization: an ultrasensi- tive new liquid introduction ionization method for mass spectrometry. Analytical Chemistry 83(11):3981–3985. Paine MR, Barker PJ, Maclauglin SA, Mitchell TW, Blanksby SJ (2012). Direct detection of additives and degradation products from polymers by liquid extraction surface analy- sis employing chip-based nanospray mass spectrometry. Rapid Communications in Mass Spectrometry 26(4):412–418. Palagi PM, Hernandez P, Walther D, Appel RD (2006). Proteome informatics I: bioinformatics tools for processing experimental data. Proteomics 6(20):5435–5444. Paradela A, Albar JP (2008). Advances in the analysis of protein phosphorylation. Journal of Proteome Research 7(5):1809–1818. Paradela A, Marcilla M, Navajas R, Ferreira L, Ramos-Fernandez A, Fernandez M, Mariscotti JF, Garcia-del Portillo F, Albar JP (2010). Evaluation of isotope-coded protein labeling (ICPL) in the quantitative analysis of complex proteomes. Talanta 80(4):1496–1502. Pasa-Toliˇ c´ L, Jensen PK, Anderson GA, Lipton MS, Peden KK, MartinovicS,Toli´ cN,´ Bruce JE, Smith RD (1999). High throughput proteome-wide precision measurements of protein expression using mass spectrometry. 
Journal of the American Chemical Society 121(34):7949–7950. Perkins DN, Pappin DJ, Creasy DM, Cottrell JS (1999). Probability-based protein identi- fication by searching sequence databases using mass spectrometry data. Electrophoresis 20(18):3551–3567. Perry RH, Cooks RG, Noll RJ (2008). Orbitrap mass spectrometry: instrumentation, ion motion and applications. Mass Spectrometry Reviews 27(6):661–699. Pico Y, Blasco C, Font G (2004). Environmental and food applications of LC-tandem mass spectrometry in pesticide-residue analysis: an overview. Mass Spectrometry Reviews 23(1):45–85. Picotti P, Lam H, Campbell D, Deutsch EW, Mirzaei H, Ranish J, Domon B, Aebersold R (2008). A database of mass spectrometric assays for the yeast proteome. Nature Methods 5(11):913–914. Price TS, Lucitt MB, Wu W, Austin DJ, Pizarro A, Yocum AK, Blair IA, FitzGerald GA, Grosser T (2007). EBP, a program for protein identification using multiple tandem mass spectrometry datasets. Molecular & Cellular Proteomics 6(3):527–536. Ransohoff DF, Martin C, Wiggins WS, Hitt BA, Keku TO, Galanko JA, Sandler RS (2008). Assessment of serum proteomics to detect large colon adenomas. Cancer Epidemiology, Biomarkers & Prevention: A publication of the American Association for Cancer Research, Cosponsored by the American Society of Preventive Oncology 17(8):2188–2193. 64 NEXT GENERATION INSTRUMENTS AND METHODS FOR PROTEOMICS

Raponi M, Belly RT, Karp JE, Lancet JE, Atkins D, Wang Y (2004). Microarray analysis reveals genetic pathways modulated by tipifarnib in acute myeloid leukemia. BMC Cancer 4:56. Rappsilber J, Ryder U, Lamond AI, Mann M (2002). Large-scale proteomic analysis of the human spliceosome. Genome Research 12(8):1231–1245. Rebello KM, Barros JS, Mota EM, Carvalho PC, Perales J, Lenzi HL, Neves-Ferreira AG (2011). Comprehensive proteomic profiling of adult Angiostrongylus costaricensis, a human parasitic nematode. Journal of Proteomics 74(9):1545–1559. Richard E, Monteoliva L, Juarez S, Perez B, Desviat LR, Ugarte M, Albar JP (2006). Quanti- tative analysis of mitochondrial protein expression in methylmalonic acidemia by two- dimensional difference gel electrophoresis. Journal of Proteome Research 5(7):1602– 1610. Riffle M, Eng JK (2009). Proteomics data repositories. Proteomics 9(20):4653–4663. Rotilio D, Della Corte A, D’Imperio M, Coletta W, Marcone S, Silvestri C, Giordano L, Di Michele M, Donati MB (2012). Proteomics: bases for protein complexity understanding. Thrombosis Research 129(3):257–262. Rubert J, James KJ, Manes J, Soler C (2012). Applicability of hybrid linear ion trap-high reso- lution mass spectrometry and quadrupole-linear ion trap-mass spectrometry for mycotoxin analysis in baby food. Journal of Chromatography. A 1223:84–92. Rubert J, Manes J, James KJ, Soler C (2011). Application of hybrid linear ion trap-high resolution mass spectrometry to the analysis of mycotoxins in beer. Food Additives & Con- taminants. Part A, Chemistry, Analysis, Control, Exposure & Risk Assessment 28(10):1438– 1446. Rust KY, Baumgartner MR, Meggiolaro N, Kraemer T (2012). Detection and validated quan- tification of 21 benzodiazepines and 3 “z-drugs” in human hair by LC-MS/MS. Forensic science international 215(1–3):64–72. Sartor MA, Mahavisno V, Keshamouni VG, Cavalcoli J, Wright Z, Karnovsky A, Kuick R, Jagadish HV, Mirel B, Weymouth T, Athey B, Omenn GS (2010). 
ConceptGen: a gene set enrichment and gene set relation mapping tool. Bioinformatics 26(4):456–463. Schenk S, Schoenhals GJ, de Souza G, Mann M (2008). A high-confidence, manually validated human blood plasma protein reference set. BMC Medical Genomics 1:41. Scherperel G, Reid GE (2007). Emerging methods in proteomics: top-down protein character- ization by multistage tandem mass spectrometry. The Analyst 132(6):500–506. Schwartz JC, Senko MW, Syka JE (2002). A two-dimensional quadrupole ion trap mass spectrometer. Journal of the American Society for Mass Spectrometry 13(6):659–669. Senko MW, Canterbury JD, Guan S, Marshall AG (1996). A high-performance modular data system for Fourier transform ion cyclotron resonance mass spectrometry. Rapid Commu- nications in Mass Spectrometry 10(14):1839–1844. Shan Q, Liu Y, He L, Ding H, Huang X, Yang F, Li Y, Zeng Z (2012). Metabolism of mequindox and its metabolites identification in chickens using LC-LTQ-Orbitrap mass spectrometry. Journal of Chromatography. B, Analytical Technologies in the Biomedical and Life Sciences 881–882:96–106. Shevchenko A, Wilm M, Vorm O, Mann M (1996). Mass spectrometric sequencing of proteins silver-stained polyacrylamide gels. Analytical Chemistry 68(5):850–858. Shilov IV, Seymour SL, Patel AA, Loboda A, Tang WH, Keating SP, Hunter CL, Nuwaysir LM, Schaeffer DA (2007). The Paragon Algorithm, a next generation search engine that REFERENCES 65

uses sequence temperature values and feature probabilities to identify peptides from tandem mass spectra. Molecular & Cellular Proteomics 6(9):1638–1655. Sim SL, He T, Tscheliessnig A, Mueller M, Tan RB, Jungbauer A (2012). Branched polyethy- lene glycol for protein precipitation. Biotechnology and Bioengineering 109(3):736–746. Skylas DJ, Van Dyk D, Wrigley CW (2005). Proteomics of wheat grain. Journal of Cereal Science 41(2):165–179. Slotta DJ, Barrett T, Edgar R (2009). NCBI Peptidome: a new public repository for mass spectrometry peptide identifications. Nature Biotechnology 27(7):600–601. Smithies O, Poulik MD (1956). Two-dimensional electrophoresis of serum proteins. Nature 177(4518):1033. Steen H, Mann M (2004). The ABC’s (and XYZ’s) of peptide sequencing. Nature reviews. Molecular Cell Biology 5(9):699–711. Swaney DL, Wenger CD, Coon JJ (2010). Value of using multiple proteases for large- scale mass spectrometry-based proteomics. Journal of Proteome Research 9(3):1323– 1329. Syka JE, Coon JJ, Schroeder MJ, Shabanowitz J, Hunt DF (2004). Peptide and protein sequence analysis by electron transfer dissociation mass spectrometry. Proceedings of the National Academy of Sciences of the United States of America 101(26):9528–9533. Sze SK, Ge Y, Oh H, McLafferty FW (2002). Top-down mass spectrometry of a 29-kDa protein for characterization of any posttranslational modification to within one residue. Proceed- ings of the National Academy of Sciences of the United States of America 99(4):1774– 1779. Tambor V, Hunter CL, Seymour SL, Kacerovsky M, Stulik J, Lenco J (2012). CysTRAQ – A combination of iTRAQ and enrichment of cysteinyl peptides for uncovering and quantifying hidden proteomes. Journal of Proteomics 75(3):857–867. Tang H, Arnold RJ, Alves P, Xun Z, Clemmer DE, Novotny MV, Reilly JP, Radivojac P (2006). A computational approach toward label-free protein quantification using predicted peptide detectability. Bioinformatics 22(14):e481–e488. 
Tang HY, Beer LA, Barnhart KT, Speicher DW (2011). Rapid verification of candidate sero- logical biomarkers using gel-based, label-free multiple reaction monitoring. Journal of Proteome Research 10(9):4005–4017. Taylor CF, Paton NW, Lilley KS, Binz PA, Julian RK, Jr., Jones AR, Zhu W, Apweiler R, Aebersold R, Deutsch EW, Dunn MJ, Heck AJ, Leitner A, Macht M, Mann M, Martens L, Neubert TA, Patterson SD, Ping P, Seymour SL, Souda P, Tsugita A, Vandekerckhove J, VondriskaTM, Whitelegge JP, Wilkins MR, Xenarios I, YatesJR, 3rd, Hermjakob H (2007). The minimum information about a proteomics experiment (MIAPE). Nature Biotechnology 25(8):887–893. Taylor JA, Johnson RS (1997). Sequence database searches via de novo peptide sequencing by tandem mass spectrometry. Rapid Communications in Mass Spectrometry 11(9):1067– 1075. Thomas PD, Campbell MJ, Kejariwal A, Mi H, Karlak B, Daverman R, Diemer K, Muruganujan A, Narechania A (2003). PANTHER: a library of protein families and subfamilies indexed by function. Genome Research 13(9):2129–2141. Trimpin S, Inutan ED, Herath TN, McEwen CN (2010). Matrix-assisted laser desorp- tion/ionization mass spectrometry method for selectively producing either singly or multiply charged molecular ions. Analytical Chemistry 82(1):11–15. 66 NEXT GENERATION INSTRUMENTS AND METHODS FOR PROTEOMICS

Trimpin S, Ren Y, Wang B, Lietz CB, Richards AL, Marshall DD, Inutan ED (2011). Extending the laserspray ionization concept to produce highly charged ions at high vacuum on a time- of-flight mass analyzer. Analytical Chemistry 83(14):5469–5475. Unwin RD, Evans CA, Whetton AD (2006). Relative quantification in proteomics: new approaches for biochemistry. Trends in Biochemical Sciences 31(8):473–484. Uttenweiler-Joseph S, Claverol S, Sylvius L, Bousquet-Dubouch MP, Burlet-Schiltz O, Mon- sarrat B (2008). Toward a full characterization of the human 20S proteasome subunits and their isoforms by a combination of proteomic approaches. Methods in Molecular Biology 484:111–130. Vaclavik L, Zachariasova M, Hrbek V, Hajslova J (2010). Analysis of multiple mycotoxins in cereals under ambient conditions using direct analysis in real time (DART) ionization coupled to high resolution mass spectrometry. Talanta 82(5):1950–1957. Vizcaino JA, Foster JM, Martens L (2010). Proteomics data repositories: providing a safe haven for your data and acting as a springboard for further research. Journal of Proteomics 73(11):2136–2146. Wagner K, Miliotis T, Marko-Varga G, Bischoff R, Unger KK (2002). An automated on-line multidimensional HPLC system for protein and peptide mapping with integrated sample preparation. Analytical Chemistry 74(4):809–820. Walsh GM, Lin S, Evans DM, Khosrovi-Eghbal A, Beavis RC, Kast J (2009). Implementation of a data repository-driven approach for targeted proteomics experiments by multiple reaction monitoring. Journal of Proteomics 72(5):838–852. Wang DZ, Li C, Xie ZX, Dong HP, Lin L, Hong HS (2011). Homology-driven proteomics of dinoflagellates with unsequenced genomes using MALDI-TOF/TOF and automated de novo sequencing. Evidence-Based Complementary and Alternative Medicine 2011:471020. Wang M, You J, Bemis KG, Tegeler TJ, Brown DP (2008). Label-free mass spectrometry- based protein quantification technologies in proteomic analysis. 
Briefings in Functional Genomics & Proteomics 7(5):329–339. Wiener MC, Sachs JR, Deyanova EG, Yates NA (2004). Differential mass spectrometry: a label-free LC-MS method for finding significant differences in complex peptide and protein mixtures. Analytical Chemistry 76(20):6085–6096. Wilkins MR, Sanchez JC, Gooley AA, Appel RD, Humphery-Smith I, Hochstrasser DF, Williams KL (1996). Progress with proteome projects: why all proteins expressed by a genome should be identified and how to do it. Biotechnology & Genetic Engineering Reviews 13:19–50. Wu JJ, Mak YL, Murphy MB, Lam JC, Chan WH, Wang M, Chan LL, Lam PK (2011). Valida- tion of an accelerated solvent extraction liquid chromatography-tandem mass spectrometry method for Pacific ciguatoxin-1 in fish flesh and comparison with the mouse neuroblastoma assay. Analytical and Bioanalytical Chemistry 400(9):3165–3175. Wu WW, Wang G, Baek SJ, Shen RF (2006). Comparative study of three proteomic quantitative methods, DIGE, cICAT, and iTRAQ, using 2D gel- or LC-MALDI TOF/TOF. Journal of Proteome Research 5(3):651–658. Yates JR, 3rd (2004). Mass spectral analysis in proteomics. Annual Review of Biophysics and Biomolecular Structure 33:297–316. Yates JR, 3rd, Eng JK, McCormack AL, Schieltz D (1995). Method to correlate tandem mass spectra of modified peptides to amino acid sequences in the protein database. Analytical Chemistry 67(8):1426–1436. REFERENCES 67

Yates JR, Cociorva D, Liao L, Zabrouskov V (2006). Performance of a linear ion trap-Orbitrap hybrid for peptide analysis. Analytical Chemistry 78(2):493–500. Yates JR, Ruse CI, Nakorchevsky A (2009). Proteomics by mass spectrometry: approaches, advances, and applications. Annual Review of Biomedical Engineering 11:49–79. Zdebska E, Koscielak J (1999). A single-sample method for determination of carbohydrate and protein contents glycoprotein bands separated by sodium dodecyl sulfate- polyacrylamide gel electrophoresis. Analytical BIochemistry 275(2):171–179. Zhang J, Xu X, Gao M, Yang P, Zhang X (2007). Comparison of 2-D LC and 3-D LC with post- and pre-tryptic-digestion SEC fractionation for proteome analysis of normal human liver tissue. Proteomics 7(4):500–512. Zhang X, Fang A, Riley CP, Wang M, Regnier FE, Buck C (2010). Multi-dimensional liquid chromatography in proteomics—a review. Analytica Chimica Acta 664(2):101–113. Zhi W, Wang M, She JX (2011). Selected reaction monitoring (SRM) mass spectrometry without isotope labeling can be used for rapid protein quantification. Rapid Communications in Mass Spectrometry 25(11):1583–1588. Zhu W, Smith JW, Huang CM (2010). Mass spectrometry-based label-free quantitative pro- teomics. Journal of Biomedicine & Biotechnology 2010:840518. Zubarev RA, Horn DM, Fridriksson EK, Kelleher NL, Kruger NA, Lewis MA, Carpenter BK, McLafferty FW (2000). Electron capture dissociation for structural characterization of multiply charged protein cations. Analytical Chemistry 72(3):563–573. Zybailov BL, Florens L, Washburn MP (2007). Quantitative shotgun proteomics using a pro- tease with broad specificity and normalized spectral abundance factors. Molecular bioSys- tems 3(5):354–360. 3 PROTEOMIC-BASED TECHNIQUES FOR THE CHARACTERIZATION OF FOOD ALLERGENS

Gianluca Picariello, Gianfranco Mamone, Francesco Addeo, Chiara Nitride, and Pasquale Ferranti

3.1 INTRODUCTION: WHAT IS FOOD ALLERGY?

Despite remarkable advances in biochemical and clinical research, food allergy (FA) remains one of the most controversial and debated issues in the scientific community. As with skin and respiratory allergies, the mechanisms underlying the onset, development, and outcomes of FA are far from being fully elucidated. FA arises from an anomalous interaction between food components and the immune system. The great diversity of the food allergens described so far and the extreme variability of the individual response to these allergens make this pathology one of the most difficult to fit into a comprehensive scheme. For this reason, although FA is very often defined as a single disease, it actually embraces a panel of various food-related disorders. Although FA-related diseases are restricted to certain categories of individuals, the number of affected patients has grown steadily over the last 10 years, making this field of primary importance for clinicians as well as for the food industry. Indeed, the eight food matrices (the so-called “big eight”) that account for over 90% of allergic reactions in predisposed individuals (Sampson, 2004; Sicherer and Sampson, 2009), that is, milk, egg, wheat, soy, peanuts, tree nuts (e.g., hazelnuts, walnuts, pecans, almonds, and cashews), fish, and shellfish, are all basic constituents of the diet of most of the human population worldwide. As a consequence, they are in practice the raw ingredients that the industry processes into almost all food preparations. Additional known allergen-containing foods include cereals such as maize, rice, and barley, legumes (lupin), mustard, sesame, and celery, among others.

Foodomics: Advanced Mass Spectrometry in Modern Food Science and Nutrition, First Edition. Edited by Alejandro Cifuentes. © 2013 John Wiley & Sons, Inc. Published 2013 by John Wiley & Sons, Inc.

Clinical manifestations of FA range from a tingling sensation in the mouth, lips, or throat to anaphylaxis and death, depending on the severity of the individual response. Allergic reactions to various foods can also develop into allergy-related conditions such as asthma, allergic rhinitis, and atopic dermatitis.

Most food allergens are proteins. In recent years, many studies have been dedicated to identifying and detecting allergenic protein components from the most common food sources, and this knowledge has undoubtedly improved the understanding of FA etiology. However, despite important progress in immunopathology and in clinical protocols, the core problems of FA remain unresolved from several standpoints: (i) clinical evidence suggests that many of the suspected allergens present in a food have not yet been identified or characterized; (ii) the structural traits that make a protein an allergen, and the relationship between allergenic determinants and disease patterns, remain substantially unknown; (iii) the mechanisms through which allergenic proteins and their derived peptides elicit adverse reactions to food ingestion have yet to be fully explained. These knowledge gaps also slow the development of novel, sensitive screening and confirmatory tests for the diagnosis and prognosis of allergy, as well as of more efficient therapeutic protocols.
The need for reliable methods allowing accurate structural identification and quantification of the offending allergens has prompted researchers to develop new approaches for the unambiguous characterization of food allergens. In recent years, the integrated approaches based on the various “-omics” for food peptide and protein characterization (proteomics, peptidomics, and metabolomics), recently merged into the new discipline of foodomics, have proved essential at various levels in the study of FA, from the structural characterization of novel food allergens to the controversial issue of the resistance of allergenic proteins to digestion and the efficiency of epitope removal from foods intended for allergic patients. Many excellent recent reviews cover selected aspects of the field that has been christened “allergenomics” (Akagawa et al., 2007). Thus, rather than attempting an exhaustive survey of the literature on the application of -omic tools to FA analysis, this chapter critically presents the newest illustrative achievements in the “hot spots” of allergenomics: understanding the structural features of epitopes; elucidating the mechanisms that trigger the individual response to allergens; and monitoring the formation and fate of allergens along the technological processes leading from raw materials to finished foods.

3.2 FOOD ALLERGY: FEATURES AND BOUNDARIES OF THE DISEASE

In the last decade, the European Commission and other governmental and scientific organizations worldwide have funded numerous projects aimed at promoting and extending knowledge in the field of FA. Recent estimates agree on a prevalence of confirmed FA of 3.2–4% during the first year of life. According to some authors, FA could be underdiagnosed, and its prevalence would reach 6–7.5% (Kagan, 2003; Sicherer and Sampson, 2006; Rash, 2008). A recent meta-analysis put the prevalence of FA in adulthood at approximately 3.5% (Rona et al., 2007). Large-scale epidemiological studies are hampered by a series of factors, including the complexity of food matrices and of their interactions with the immune system, the individual variability of the immune response, the wide range of symptomatic patterns, and the differences among diagnostic methodologies. Self-reported reactions to foods tend to greatly overestimate the prevalence of FA compared with studies that use objective evaluation tools, particularly those relying on the double-blind, placebo-controlled oral food challenge, which is considered the “gold standard” for diagnosis. In any case, FA is undoubtedly increasing rapidly, a trend that is probably related to greater public awareness and to diagnostic advances. Currently, in developed countries, FA ranks first among the causes of hospitalization for anaphylaxis. The increasing incidence of FA can also be partly ascribed to the recent deep changes in dietary behavior introduced by globalization. Although the search for clinically relevant allergens has progressed dramatically, the identification and characterization of allergens still require extensive effort as well as a large amount of starting material, which is available only for allergens that have been cloned.
In general, allergen characterization requires the combination of clinical, immunological, genomic, proteomic, and bioinformatic approaches.

3.3 IMMUNOPATHOLOGY OF FOOD ALLERGY AND ROLE OF PROTEOMICS

FA has an extremely complex etiopathology. Because of the heterogeneity of the protein fraction of many foods and the individual variability of the immunological response, a food can contain more than one trigger (allergen). In addition, reactions can be elicited by more than one specific peptide sequence (epitope) within a protein allergen. Furthermore, immunological responses can be IgE-mediated, cell-mediated, or a combination of the two mechanisms. A precise evaluation of the relative prevalence of these mechanisms is hindered by frequent under- or misdiagnosis, especially of non-IgE-mediated FA, and by the possible late onset in adulthood (Crittenden and Bennett, 2005).

The gastrointestinal (GI) tract is normally exposed to a huge load of potential allergens, which are effectively reduced to harmless foreign antigens along the digestive path. Under normal conditions, the tight junctions of the gut epithelium prevent proteins from crossing the intestinal wall in large amounts and coming into contact with the gut-associated lymphoid tissue (GALT). It is estimated that less than 2% of dietary proteins are absorbed in an immunologically intact form. Intestinal permeability is therefore also likely to affect the uptake of potential allergens. For instance, the immaturity of both the GI digestive machinery and the gut epithelium is known to reduce the efficiency of the infant mucosal barrier (Weaver et al., 1987). Under physiological conditions, when intact proteins do cross the gut barrier, the immune system induces oral tolerance (Chehade and Mayer, 2005). FA can be considered the result of a transient or permanent functional failure of the oral tolerance mechanisms.

The so-called “type I” hypersensitivity reactions, that is, the IgE-mediated ones, are by far the best characterized adverse reactions to foods.
The adverse reaction has an early onset, on a timescale ranging from seconds, in oral allergy syndromes (OASs), to 2–3 h. Symptoms include vomiting, abdominal pain, and diarrhea. The antigens absorbed through the digestive tract are carried by the blood to distal organs, giving rise to rhinitis and asthma or to urticaria and angioedema. Reactions can also evolve into atopic dermatitis or into anaphylactic shock. Symptom patterns result from the inflammatory response of immunocompetent cells to the allergens. The development of allergy is a multistep process, and the mechanisms leading to sensitization and to the production of IgE antibodies are complex and, in some respects, not fully understood. When antigens come into contact with the GALT, they are taken up by antigen-presenting cells, processed, and exposed for recognition by T cells. T cells activate chemotactic signaling that induces the conversion of B cells into IgE antibody–producing plasma cells. After sensitization, upon further exposure, the circulating IgE recognizes the antigen. The key step in a specific allergic reaction is the binding of at least two IgE antibody molecules to a multivalent antigen. The allergen-IgE complex, in fact, binds with high affinity to the IgE receptors (FcεRI) of effector cells, that is, circulating basophils and tissue mast cells, inducing degranulation with a massive release of mediators (i.e., histamine, tryptase, cysteinyl leukotrienes, and prostaglandin D2) that trigger allergic inflammatory responses (Wang and Sampson, 2011).

Non–IgE-mediated allergy seems to have a different immunopathogenesis. Cell-mediated FA is only rarely life threatening, and symptoms are generally limited to GI discomfort. Nevertheless, it is a cause of morbidity in infants and young children.
Several lines of evidence assign a key role in the development of non–IgE-mediated allergies to the polarization of T cells toward the Th1 subtype, which is most likely governed by regulatory T cells (Treg). The pathways of the allergic response involve the concerted action of a large number of biomolecules, most of which are still to be identified and whose roles are far from being defined. For these reasons, one of the main obstacles to diagnostic and clinical progress in the field of FA is the limited knowledge of the molecular details of the mechanisms that trigger and/or amplify the adverse immune reaction to food ingestion.

On the other hand, proteomic technologies are currently accelerating drug discovery, diagnostics, and molecular medicine, representing the investigative link between genes, proteins, and disease states. Current proteome research is essentially focused on two major areas: expression proteomics (EP), which aims to measure the upregulation and downregulation of protein levels, and functional proteomics (FP), which aims to characterize protein activities, multiprotein complexes, and signaling pathways (Neubauer et al., 1997; Shevchenko et al., 1997; Pandey et al., 2000; Hinsby et al., 2003). In particular, both the tracing of unique patterns of protein expression (EP) and the identification of biomarkers associated with IgE-mediated (food) allergy via FP are rapidly emerging areas of clinical proteomics. Progress in protein annotation and in our understanding of protein–protein interactions can undoubtedly lead to diagnostic and therapeutic advances in the treatment of FA.
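The notion of measuring upregulation and downregulation in expression proteomics can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the protein names, the intensity values, and the twofold threshold are hypothetical and do not come from the studies cited in this chapter, and real workflows add normalization, replicates, and statistical testing.

```python
import math


def log2_fold_change(case_intensity, control_intensity):
    """Return log2(case / control); positive values mean higher abundance in the case sample."""
    if case_intensity <= 0 or control_intensity <= 0:
        raise ValueError("intensities must be positive")
    return math.log2(case_intensity / control_intensity)


def classify_regulation(log2_fc, threshold=1.0):
    """Label a protein as up, down, or unchanged.

    The default threshold of 1.0 (a twofold change) is an assumed, illustrative cutoff.
    """
    if log2_fc >= threshold:
        return "up"
    if log2_fc <= -threshold:
        return "down"
    return "unchanged"


# Hypothetical signal intensities in arbitrary units: (allergic sample, control sample).
example = {
    "protein_A": (400.0, 100.0),
    "protein_B": (50.0, 200.0),
    "protein_C": (120.0, 100.0),
}

calls = {
    name: classify_regulation(log2_fold_change(case, control))
    for name, (case, control) in example.items()
}
print(calls)
```

The logarithmic scale is used here so that a twofold increase and a twofold decrease are symmetric around zero.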

3.4 IDENTIFICATION OF FOOD ALLERGY EPITOPES

3.4.1 The Epitopes of Food Allergy

The typical FA immune reactions in predisposed individuals are generally elicited by restricted subsets of proteins with peculiar structural characteristics among the hundreds of different proteins contained in the food. Many studies simply state that food allergens are (glyco)proteins with a molecular mass ranging from 10 to 70 kDa that belong to a variety of protein families. Such a classification is of limited utility, as these features are shared by most common food protein matrices. Indeed, no structural motifs that determine the antigenic potential of food allergens have been identified so far. As a consequence, although several strategies of protein modeling and structural comparison have been proposed to assess reactivity, no currently available method can reliably predict the allergenic potential of a food component. Recent platforms for protein purification and characterization are finding wide use in identifying the food components that trigger IgE-mediated reactions in humans.

Food allergens have been broadly classified into two groups, namely, incomplete and complete allergens (Aalberse, 1997). Incomplete or “type 2” food allergens can elicit clinical symptoms only in previously sensitized individuals. Their “conformational epitopes” are discontinuous sequence clusters that bind IgE because of a higher order structural arrangement. They are heat labile and are responsible for the so-called OAS, a condition in which symptoms are usually confined to the oropharyngeal area of the mouth and throat and include pruritus, urticaria, and angioedema. The complete or “type 1” allergens, also referred to as “sensitizing elicitors,” can both sensitize and elicit allergic responses in susceptible subjects.

Epitopes are the antigenic determinants of allergens.
The cross-linking of mast cell- or basophil-bound IgE, which is necessary to induce the release of the mediators of allergic symptoms, requires that at least two high-affinity IgE-binding epitopes occur on a single allergen. Both linear and conformational epitopes can be involved in the IgE-binding mechanisms (Fig. 3.1). In linear epitopes, the primary amino acid sequence of the allergen is the exclusive structural feature affecting IgE-binding affinity. In contrast, secondary or tertiary structure elements are required for the binding of conformational epitopes to IgE. Because of thermally induced conformational transitions, the antigenic potential of conformational epitopes can be reduced or completely abolished upon protein denaturation by cooking or heat processing. The role of conformational IgE-binding epitopes is relevant to the etiology of aeroallergen-mediated allergic reactions. Considering that food allergens cross-reactive with aeroallergens may gain access across the oral or nasal mucosa, stability to digestion is not a critical factor for conformational epitopes. The major apple allergen Mal d1 is a well-established example of an extremely digestion-labile protein that can trigger allergic symptoms owing to its cross-reactivity with the birch pollen allergen Bet v1 (Son et al., 1999). Api g1 in celery and Dau c1 in carrot are additional examples of these “nonsensitizing elicitors” or type 2 allergens (Faeste et al., 2010).

FIGURE 3.1 Schematic representation of antibodies interacting with linear and conformational epitopes. Linear epitopes, short and continuous, after digestion/denaturation may still be able to bind the antibody. Conformational epitopes, noncontiguous residues brought together by the folding of the antigen to its native structure, after digestion/denaturation can no longer bind the antibody.

Interestingly, mass spectrometry (MS) techniques, including multiple reaction monitoring (MRM) MS, have been exploited to detect and quantify the snow crab airborne allergens that are responsible for occupational asthma related to seafood allergy (Abdel Rahman et al., 2010; Abdel Rahman et al., 2012). In contrast, airborne peanut allergens were not detected in many simulated environments (Perry et al., 2004).

3.4.2 Proteomic Strategies for Allergen Identification, Detection, and Quantification

The strategies for allergen characterization and monitoring are substantially the same general procedures used for protein analysis. Although several immunotherapy approaches to FA, primarily aimed at restoring oral tolerance, are now starting to be launched, the current therapy for FA is still limited to strict avoidance of the offending food and to educating patients in the management of the pathology. This approach is generally effective, but compliance can be very difficult, partly because of cross-contamination arising from the almost ubiquitous occurrence of some allergenic food commodities and partly because of inadequate food labeling. Thus, considering that even traces of allergens can be harmful to some allergic subjects, the European Community has issued several directives to regulate the labeling of potential sources of allergens in foodstuffs (EU Directive 2000/13/EC, amended by Food Labelling Directive 2003/89/EC and Directive 2007/68/EC). The current policy concerning the declaration of food allergens, including the precautionary warnings on food labels and the aspects related to possible cross-contamination or “hidden allergens,” has recently been examined in case studies that analyzed 550 food labels (van Hengel, 2007a, 2007b). From an analytical standpoint, sensitivity and specificity are mandatory to support an adequate preventive program.

The identification of “type I” allergens and B-cell epitopes can take advantage of their affinity for the IgE of allergic individuals. Classically, food allergens have been identified and monitored by immunochemical techniques, such as Western blotting and enzyme-linked immunosorbent assays (ELISA), which also provide semiquantitative information. Typical limits of detection (LODs) of ELISA tests fall in the low ppm range (1–5 ppm).
However, the major drawback of immunological methods is related to antibody specificity and stability and to possible cross-reactivity with matrix components, which can produce false positives. In food allergen detection, the antibody-binding properties can be severely affected by the changes induced in proteins by thermal or other technological treatments. In addition, the most commonly used sandwich ELISA-based tests tend to underestimate or to miss linear epitopes that arise from proteolytic events. This is the case for peptides derived from the allergenic milk proteins in fermented or processed dairy products (Ragno et al., 1993; Van Hoeyveld et al., 1998) and for prolamin fragments in glucose syrups (Dostalek et al., 2009) and in fermented beverages such as beer (Picariello et al., 2011a; Colgrave et al., 2012). The presence of interfering non-protein compounds and the need for effective extraction of the allergens for reliable identification and quantification are further critical factors that introduce matrix dependence into the detection process. The aspects concerning sample preparation and allergen extraction will be presented later in this chapter.

DNA-based methods, which consist of the PCR amplification of DNA segments specific for the offending food, also suffer from a series of shortcomings, first of all the fact that allergens are proteins rather than nucleic acids. In other words, the presence of DNA from a raw food component in a commodity does not prove the presence of an allergen, and vice versa. DNA stability is also drastically affected by thermal processing. The advantages and limitations of the classical methodologies of food allergen analysis have been reviewed (Kirsch et al., 2009).

Just as in proteomic analysis in general, MS-based platforms currently constitute the core technologies for allergen discovery and detection (Fig. 3.2).
The revolutionary shift in analytical perspectives that has taken place over the last two decades has been prompted by the development of “soft” ionization sources for MS analysis, such as matrix-assisted laser desorption ionization (MALDI) and electrospray ionization (ESI). Nowadays ESI and MALDI interfaces are commercially available in combination with a large array of mass analyzers. MS, also coupled with high-resolution separation techniques (i.e., chromatography, electrophoresis, and capillary electrophoresis), enables the analysis of a protein or its proteolytic digest with high sensitivity and unparalleled specificity. Techniques such as tandem MS (MS/MS) also provide structural information that makes it possible to characterize and sequence new allergens and/or related protein isoforms.

FIGURE 3.2 Typical proteomic-based approaches for identifying allergens or epitopes in complex food matrices. Food protein fractions can be separated by 2DE and allergens are immunostained using serum of allergic patients as the source of specific IgE. Immunoreactive spots are identified by MS and MS/MS techniques. Alternatively, proteins can be identified by HPLC-tandem MS of the proteolytic digest of the whole protein fractions. In separate experiments, protein digests can be fractionated and peptides can be assayed for their IgE-binding affinity, thus enabling the identification of the epitopic (type I) regions. Bioinformatics is an indispensable tool for the MS-based protein and peptide identification.

As a general current trend, the high-throughput, so-called “shotgun” proteomic strategies, especially those based on multi-dimensional chromatographic separation prior to MS, are progressively replacing those based on prior two-dimensional gel electrophoresis (2DE) separation. In the shotgun proteomic approach, an entire protein extract is digested with trypsin before nanoflow LC-ESI-MS/MS analysis, thereby avoiding the possible pitfalls of 2DE-based separation.
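The digestion step of the shotgun workflow can be mimicked in silico, which is how search engines generate the candidate peptides that are matched against MS/MS spectra. The following Python sketch is only illustrative: it assumes the commonly used trypsin rule (cleavage C-terminal to K or R, but not before P), and the function name, the parameters, and the toy sequence are hypothetical, not taken from any of the cited studies.

```python
import re

def tryptic_digest(sequence, missed_cleavages=0, min_length=6):
    """Return the tryptic peptides of a protein given in one-letter code.

    Assumed cleavage rule: after K or R, unless the next residue is P.
    """
    # Zero-width split: requires Python 3.7 or later.
    fragments = [f for f in re.split(r"(?<=[KR])(?!P)", sequence) if f]
    peptides = []
    for i in range(len(fragments)):
        # Join up to `missed_cleavages` additional consecutive fragments.
        last = min(i + missed_cleavages, len(fragments) - 1)
        for j in range(i, last + 1):
            peptide = "".join(fragments[i:j + 1])
            if len(peptide) >= min_length:
                peptides.append(peptide)
    return peptides

# Toy sequence, invented for illustration only (not a real allergen).
toy_protein = "AAAGGGKPLLLSSSRTTTVVVKEEEFFFR"
print(tryptic_digest(toy_protein))
print(tryptic_digest(toy_protein, missed_cleavages=1))
```

The K-P bond in the toy sequence is left intact, so the first peptide spans both the lysine and the arginine. Allowing missed cleavages increases the number of candidate peptides, which is one reason why proteolysis multiplies the analytical complexity of the sample.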
Although the analytical complexity of the system is greatly increased by proteolysis, the multiplication of protein-specific peptide sequences enhances the probability of detecting and confidently identifying at least one peptide of each parent protein, regardless of the size and the isoelectric point of the polypeptide chains. The gel-free shotgun methods take advantage of the much more sensitive and specific detection of peptides compared with large proteins. In particular, one of the primary advantages of the gel-free methods is the higher analytical dynamic range, which enables the detection of low-abundance components and, in the case of foodstuffs, the monitoring of “hidden allergens” in trace amounts. The support of effective bioinformatic tools for data mining is essential to extract useful information from the analytical process. Obviously, although MS techniques enable effective monitoring of allergens and epitopes even in very complex samples, the identification and validation of food allergens in general still requires the combination of clinical, immunological, genomic, bioinformatic, and proteomic approaches, as mentioned above.

FIGURE 3.3 General procedure of the MRM MS assays, conducted on triple quadrupole instruments, for protein absolute quantification. Proteotypic peptides are selected with a preliminary full scan MS survey (left panel). For quantitative analysis (right panel), whole protein extracts are trypsinized and the peptide mixture is spiked with standard AQUA peptides. Proteotypic peptides, selected as analytical probes of the target protein(s), are quantified by comparing the ion intensities. The strategy is increasingly applied to the quantification of several food allergens. Reprinted with permission from Picariello et al. (2011b).

Although MS is not inherently a quantitative analytical technique, as the ionization yield of a protein or peptide depends on molecule-specific physicochemical properties, the newest emerging methodologies for food allergen quantification rely on MS analysis (Fig. 3.3) and specifically on the selected reaction monitoring (SRM) MS approach and on its extension, MRM MS, which preferably exploit triple quadrupole (QqQ) or the recent triple-TOF instruments. In a shotgun analysis, the targeted MRM MS detection of protein allergens involves monitoring the mass and the transitions (due to fragmentation in the course of the tandem MS process) of one or more peptides that are selected as analytically representative surrogates of the parent protein(s). These so-called “proteotypic peptides” are selected through preliminary explorative analyses (discovery-driven analysis). Although many recent label-free attempts at protein quantification based on spectral counting statistics have been proposed, the MRM MS approach permits a reliable absolute determination of allergens provided that suitable internal reference peptides are available. AQUA (Absolute QUAntification) peptides are synthetic copies of proteotypic peptides that are isotopically labeled at one or more amino acid residues in order to shift the molecular mass without appreciably influencing LC retention time and ionization properties.
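The arithmetic behind an AQUA-based determination is a simple isotope dilution calculation, sketched below in Python. The sketch rests on stated assumptions rather than on any protocol from the cited studies: the native and labeled peptides are taken to respond identically, digestion is assumed complete so that one proteotypic peptide is released per protein molecule, and ppm is read as mg of protein per kg of food. The function name, the parameters, and the numbers in the usage line are hypothetical.

```python
def aqua_quantify(area_native, area_heavy, spiked_heavy_fmol,
                  protein_mass_da, sample_mass_mg):
    """Estimate allergen content from one native/AQUA peptide pair.

    area_native, area_heavy : integrated MRM peak areas of the two forms
    spiked_heavy_fmol       : amount of AQUA peptide in the analyzed aliquot
    protein_mass_da         : molecular mass of the parent protein (g/mol)
    sample_mass_mg          : mass of food represented by the aliquot
    """
    # Assumption: identical response, so area ratio equals molar ratio.
    native_fmol = area_native / area_heavy * spiked_heavy_fmol
    # Assumption: one peptide per protein molecule (complete digestion).
    protein_ng = native_fmol * 1e-15 * protein_mass_da * 1e9
    # ppm taken as mg protein per kg food, numerically equal to ng per mg.
    ppm = protein_ng / sample_mass_mg
    return native_fmol, ppm

# Hypothetical numbers, for illustration only.
fmol, ppm = aqua_quantify(2.0e6, 1.0e6, 500.0, 20000.0, 10.0)
print(f"{fmol:.0f} fmol native peptide, {ppm:.1f} ppm allergen")
```

In practice, incomplete extraction or digestion would bias such an estimate low, which is why the result of this calculation is better regarded as a lower bound unless recovery has been checked.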
Using an MRM MS-based strategy, eight allergens have recently been quantified in commercial soybean cultivars, and the differential expression of four allergens (glycinin G3, glycinin G4, β-conglycinin, and Kunitz trypsin inhibitor 1) has been determined across 20 varieties (Houston et al., 2011). By monitoring the mass and transitions of peptides arising from hordein subtypes, the amount of gliadin-like fragments, potentially harmful to celiac patients, has been determined in 60 beer samples (Colgrave et al., 2012). The determination of both food and airborne allergens of snow crab has been carried out with an AQUA MRM MS approach (Abdel Rahman et al., 2010; Abdel Rahman et al., 2011). Very recently, LC-triple quadrupole MS operating in MRM mode has been successfully used for the simultaneous multiplexed determination of allergens from seven different potentially allergenic food matrices (milk, egg, soy, hazelnut, peanut, walnut, and almond), after a preliminary explorative screening of the peptide mixtures. The linear range of quantification as well as the sensitivity was adequate for the purpose, covering the 10–1000 ppm range (Heick et al., 2011a). A label-free approach has been applied to quantify vicilin-like isoforms from lupin (Lupinus albus) employing a microfluidic nanoHPLC-chip separation coupled with ion trap-MS/MS (Brambilla et al., 2009).

Given the large panel of methodologies developed and validated, as well as the intrinsic specificity of the diverse analytical approaches, it is clear that MS-based analysis of food allergens does not consist in the standard application of a single analytical protocol for all purposes. Proteomic and peptidomic methodologies should rather be regarded as a set of tools, each best suited to provide tailored answers.
The design of an MS experiment requires the careful evaluation of several technical factors, including the type of instrumentation, the fragmentation method, and the sample preparation, in relation to the specific analytical question. Technical and instrumental details are given in other chapters of this book or can be found in dedicated reviews (Mamone et al., 2009; Picariello et al., 2011b) and will not be presented further here.
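The precursor-to-product “transitions” monitored in the MRM assays described in this section are pairs of m/z values that can be calculated from the peptide sequence. The Python sketch below computes the precursor m/z and the singly charged y-ion series from standard monoisotopic residue masses. It is a simplified illustration, not an instrument method: the function names are hypothetical, cysteine is treated as unmodified, only y ions are listed, and the optional `label_shift` assumes that the isotopic label sits on the C-terminal residue so that every y ion carries it.

```python
# Monoisotopic residue masses (Da) of the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # monoisotopic mass of H2O
PROTON = 1.007276   # mass of a proton

def precursor_mz(peptide, charge=2, label_shift=0.0):
    """m/z of the protonated peptide (selected in the first quadrupole)."""
    neutral = sum(RESIDUE_MASS[aa] for aa in peptide) + WATER + label_shift
    return (neutral + charge * PROTON) / charge

def y_ion_mz(peptide, n, label_shift=0.0):
    """m/z of the singly charged y_n ion (the last n residues)."""
    residues = sum(RESIDUE_MASS[aa] for aa in peptide[-n:])
    return residues + WATER + PROTON + label_shift

def mrm_transitions(peptide, charge=2, label_shift=0.0):
    """List (ion name, precursor m/z, product m/z) for y1 to y(n-1)."""
    q1 = precursor_mz(peptide, charge, label_shift)
    return [(f"y{n}", round(q1, 4), round(y_ion_mz(peptide, n, label_shift), 4))
            for n in range(1, len(peptide))]

# Hypothetical proteotypic peptide; the heavy form assumes a C-terminal
# 13C6,15N4-arginine (+10.00827 Da).
for name, q1, q3 in mrm_transitions("VVNDFSR"):
    print(name, q1, q3)
for name, q1, q3 in mrm_transitions("VVNDFSR", label_shift=10.00827):
    print("heavy", name, q1, q3)
```

The native and labeled forms share the same fragment series, displaced by the label mass, which is what allows the two to be monitored side by side in the same chromatographic run.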

3.4.3 Identification of Linear and Conformational Epitopes

The challenge of FA research is to determine what makes proteins allergenic, as there is no definitively accepted procedure to predict or establish the allergenicity of a protein. The precise identification of food-derived epitopes is a key step in defining diagnostic and prognostic elements in FA and in designing suitable strategies for either therapeutic intervention or technological reduction of the allergenic potential. For instance, reactivity against specific B-cell epitopes of αs1-casein, as well as peculiar patterns of IgE reactivity, seems to be associated with persistent cow’s milk (CM) allergy (Chatchatee et al., 2001; Järvinen et al., 2002) and egg allergy (Cooke and Sampson, 1997). Epitope mapping of the proteins of milk and of other allergenic foods has been attempted using synthetic peptides combined with immunological and proteomic approaches.

In the case of α-lactalbumin (α-La), one of the major milk whey proteins, the recombinantly expressed proteins were characterized by means of MS and circular dichroism. IgE epitope mapping was performed with synthetic peptides. Superposition of the IgE-reactive peptides onto the three-dimensional structure of α-La revealed the close vicinity of the N- and C-terminal peptides within a surface-exposed patch. This study also explored the possibility of using the recombinant proteins for the diagnosis of patients with severe allergic reactions to CM (Hochwallner et al., 2010).

The development of fast and cost-effective methods for the simple identification of the epitopes in a number of major food allergens would also open the prospect of safe peptide-based vaccines (Larché, 2003). A major problem in this field is that specific allergen immunotherapy is frequently associated with adverse reactions. Several strategies are being developed to reduce the allergenicity while maintaining the therapeutic benefits.
Peptide immunotherapy is one of these strategies. Proteomic methods for the simple and rapid identification of the immunogenic epitopes of allergens (i.e., allergenic epitopes) are under development and could potentially help to design peptide-based vaccines. An epitope extraction technique, based on biofunctionalized magnetic microspheres self-organized under a magnetic field in a channel of a simple microfluidic device, was applied to the isolation and identification of prospective allergenic epitopes (Bilkova et al., 2005) and then extended to a model system containing ovalbumin, a high-molecular-weight antigen (Jankovicovaa et al., 2008). The protein was first efficiently digested in a magnetic proteolytic reactor; the allergenic epitopes were then selectively “fished” from the peptide mixture by a magnetic immunoaffinity carrier with immobilized rabbit anti-ovalbumin IgG molecules and analyzed by MALDI-TOF-MS. In this way, a dodecapeptide fragment of ovalbumin was identified as the relevant allergenic epitope. This microfluidic magnetic force-based epitope extraction technique has the potential to be a significant step toward developing safe and cost-effective epitope-based vaccines (Jankovicovaa et al., 2008).

Stability to GI digestion has been proposed as a basic criterion for predicting the allergenic potential of food proteins, even though the allergenicity assessment of digestion-stable proteins has given inconsistent results among studies. However, many reports converge in linking an increased incidence of FA and of enteropathies, such as celiac disease (CD), to impaired gastric digestion in humans and animal models. Thus, resistance to pepsin hydrolysis in model systems mimicking gastric digestion has been included in the Food and Agriculture Organization/World Health Organization decision tree for assessing the safety of novel foods produced through agricultural biotechnology (FAO/WHO, 2001).

The pepsin-resistant fragments of ovomucoid (OVM), another egg allergen, retain the IgE-binding properties as well as the allergenic potential (Urisu et al., 1999). Similarly, peptides arising upon tryptic, peptic, or chymotryptic proteolysis of the peanut allergens Ara h1 and Ara h2 preserve intact IgE-binding domains (Sen et al., 2002). Several model systems of GI digestion with different degrees of complexity have been developed with the aim of predicting or identifying the domains of food proteins that can withstand proteolysis. The features, advantages, and pitfalls of these different models have been extensively reviewed (Moreno, 2007; Wickham et al., 2009; Hur et al., 2011). It is noteworthy that, because of its technical capabilities, MS is increasingly supporting investigations into the fate of dietary proteins and, specifically, of food protein allergens.

The compactness of the members of the prolamin superfamily, which includes the 2S albumins, nonspecific lipid transfer proteins (nsLTPs), cereal α-amylase/protease inhibitors, and cereal prolamin families, has been broadly attributed to a conserved skeleton of eight cysteine residues engaged in disulphide bonds. This compact arrangement confers stability toward both hydrolytic enzymes and thermal denaturation. The precise structural details underlying the resistance to pepsinolysis of peach and barley nsLTPs have been established recently by a combined approach making use of crystallography, dynamic simulations, NMR, and MS (Wijesinha-Bettoni et al., 2010). The role of MS has been decisive in identifying the labile protein motifs and the preferential cleavage sites. MALDI-MS and ESI-MS have been used in a complementary manner to identify the regions of milk proteins that survive an in vitro multiphasic model of GI digestion (Picariello et al., 2010).
Similarly, peptides arising from an allergenic 32-kDa avocado endochitinase (Pers a1) upon simulated gastric digestion were identified by MALDI-MS and assayed for their binding to IgE from individuals with a clinical history of latex-fruit allergy syndrome (Diaz-Perales et al., 2003).

The B-cell linear epitopes of several food allergens, such as peanut Ara h1 and soybean β-conglycinin, have been mapped throughout the polypeptide chains. However, in general, owing to the different assessment strategies and the strong individual variability of the response, the precise identification of epitope sequences is still controversial. This is, for instance, the case for the milk-derived B-cell epitopes. Data about T-cell epitopes that induce either indirect antibody production or cell-mediated immune reactions are even more fragmentary, and far fewer studies have been produced over the years. In the case of CD, research in this direction is more advanced, as a comprehensive inventory of gluten-derived T-cell epitopes has already been outlined (Tye-Din et al., 2010). Multiple peanut Ara h1 T-cell epitopes with defined HLA restriction (DeLong et al., 2011) and two immunodominant T-cell-reactive regions of peach Pru p3 (Pastorello et al., 2010) have recently been described. It can be expected that in the near future a large part of the research effort will be focused on the identification of T-cell epitopes of food allergens, in consideration of their effectiveness in inducing oral tolerance and, hence, of their promising use in immunotherapy (Tanabe, 2007). MS-based techniques are expected to provide invaluable support to the discovery and characterization of the epitopes of food allergens.

The uptake of food-derived allergens is a downstream aspect that has to be evaluated in the global estimation of the allergenic potential. The absorption rates of the GI digests arising from two purified major allergens, namely the 2S albumins from Brazil nuts (Ber e1) and from white sesame seeds (Ses i1), have been inferred by monitoring peptide transport across Caco-2 cell monolayers, used as a model of the human intestinal epithelium (Moreno et al., 2006). Similarly, transport experiments have been performed with the unhydrolyzed bovine whey proteins α-lactalbumin and β-lactoglobulin (Caillard and Tomé, 1995; Rytkönen et al., 2006). These studies, also supported by several lines of evidence acquired in vivo, strengthen the hypothesis that immunogenic proteins can enter the human epithelium intact, in an immunologically active form. However, the previously described sampling and sentry-like role of dendritic cells, which are active at the level of the intestinal lumen, suggests that sensitization or immune reactions could arise even without internalization and processing of dietary allergens. The nature of the food matrix and the dynamic process of food digestion in the lumen also have a strong impact on the mechanisms through which food proteins or protein fragments are presented to the gut mucosal barrier, thus affecting their uptake and allergenic potential.

Rodent animal models, which raise serum IgE upon immunization and/or develop anaphylactic reactions upon oral challenge, have been developed to infer the allergenic potential of foods. These models have been particularly exploited to elucidate the B- and T-cell–mediated mechanisms of induction of partial or complete tolerance in immunotherapy experiments (Kobayashi et al., 2003).

Promising ex vivo tests for allergenicity prediction include the evaluation of histamine release or of the upregulation of surface activation markers on basophil granulocytes, known as the basophil activation test (BAT). Regardless of the type of allergenic potential assessment, the characterization and standardization of the molecular trigger(s) are prerequisites that can be effectively addressed with MS-based techniques.

3.5 EXPRESSION PROTEOMICS AND FUNCTIONAL PROTEOMICS IN FOOD ALLERGY

One of the most powerful tools in EP is difference gel electrophoresis (DIGE), which allows the parallel comparison of multiple protein samples within the same gel (Alm et al., 2007; Hobson et al., 2007). In this methodology, proteins are labeled with three spectrally distinct, charge- and mass-matched fluorescent dyes before being mixed and separated simultaneously on the same 2DE gel. Proteins are then differentially visualized via fluorescence detection at different wavelengths. The robustness of the dye technology enables the measurement of small differences (as low as 10%) in protein abundance with a sensitivity comparable to that of silver staining. The DIGE technology avoids the drawbacks related to poor gel-to-gel reproducibility, which represent the main pitfall of 2DE-based quantitative determinations. 2DE-DIGE was used to compare the amount of the Fra a1 allergen among different varieties of strawberry (Alm et al., 2007). With the same technique, the cultivar-dependent expression of specific allergens was also assessed in ten varieties of rice (Teshima et al., 2010) and in two varieties of peanut (Schmidt et al., 2009). Studies like these can be used to select specific cultivars with the lowest level of potentially allergenic proteins and can enable the quantification of allergens in genetically modified plants compared with their wild-type counterparts.

One of the main and still unresolved FA issues is that current knowledge of allergen nature, tolerance, effector mechanisms, and predisposing conditions does not allow accurate prediction of allergenicity in humans without food trials or prior to accidental exposure. Consequently, there is great interest in the development of validated animal models to objectively assess the allergenicity of foods for safety testing and immunotherapy development. The ultimate aim is to extend the results obtained to the understanding of human allergy.
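The relative quantification that DIGE provides, and the fold changes quoted in the studies discussed in this section, reduce to ratios of spot volumes. The Python sketch below shows one possible way to compute such a ratio. It assumes, beyond what is stated in the text, that one of the three dyes labels a pooled internal standard run on every gel and that each sample spot is normalized to it; the function name, the sign convention for decreases, and the numbers are hypothetical.

```python
from statistics import mean

def dige_fold_change(treated, control):
    """Fold change of one protein spot between two groups of gels.

    treated, control : lists of (sample_volume, standard_volume) pairs,
    one pair per gel. Each sample volume is divided by the volume of the
    co-migrating internal standard (an assumption of this sketch) to
    cancel gel-to-gel variation.
    """
    treated_mean = mean([s / std for s, std in treated])
    control_mean = mean([s / std for s, std in control])
    ratio = treated_mean / control_mean
    # Convention used here: decreases are reported as negative fold values.
    return ratio if ratio >= 1 else -1 / ratio

# Hypothetical spot volumes (arbitrary units) from two gels per group.
sensitized = [(200.0, 100.0), (220.0, 100.0)]
naive = [(100.0, 100.0), (110.0, 100.0)]
print(round(dige_fold_change(sensitized, naive), 2))
```

With these numbers the spot is about twofold more abundant in the first group; swapping the arguments would report the same change with a negative sign.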
A pivotal EP approach has recently been proposed to identify protein biomarkers of FA in mice following exposure to OVM, in a type-1 hypersensitivity animal model (Hobson et al., 2007). Differential protein expression levels in plasma as well as in mesenteric lymph nodes, liver, spleen, and ileum were assessed by 2D-DIGE coupled to identification by HPLC-ESI-Q-TOF-MS/MS and by MALDI-TOF-MS. Plasma proteins overexpressed in OVM-sensitized mice included haptoglobin (41-fold), peroxiredoxin-2 (1.9-fold), and serum amyloid A (19-fold). Haptoglobin and serum amyloid A belong to the class of acute phase proteins (APPs), a group of proteins whose expression in plasma varies as a component of the acute phase response immediately following tissue injury, whereas peroxiredoxin-2 is an antioxidant protein. These results indicate that both specific APPs and antioxidant proteins may represent important biomarkers for assessing the allergic potential of a food in mice, data which could possibly be extended to human egg allergy. This study exemplifies how data from EP analysis provide a large amount of biological information and can help establish multimodal markers for early diagnosis and prognosis. Furthermore, EP data can provide an opportunity for identifying the molecular pathways that underlie various clinical allergic phenotypes.

The combination of clinical data with comparative DIGE and MS characterization on human subjects is also opening completely new approaches to evaluate and elucidate the mechanisms that contribute to FA. A recent, finely performed EP study (Nair et al., 2011) compared the protein profile of subjects with severe FA with that of healthy subjects. Pooled sera from patients with allergy to several of the most common offending foods (peanut, tree nut, milk, egg, strawberry, tomato, tuna, and cocoa) were compared with control sera.
The aim was not merely to identify specific FA biomarkers resulting in IgE-mediated atopy but, more interestingly, to trace the molecular mechanisms that underlie the allergic response in FA patients. High-throughput 2D-DIGE and MALDI-TOF-MS analysis identified 52 proteins that were differentially expressed between the allergic and control groups. This panel of proteins included chondroitin transferase, zinc finger proteins, C-type lectins, retinoic acid–binding proteins, heat shock proteins, myosin, cytokines, mast cell–expressed proteins, and MAP kinases. It is known that allergic responses to food antigens involve a state of immediate hypersensitivity to certain food proteins. The mechanism underlying the initiation and development of allergic responses involves cytokine activation, which is believed to directly induce the differentiation of effector Th2 lymphocytes; the latter trigger allergic responses to specific food antigens. Biological network analysis with dedicated FP software tools revealed that most of the regulated proteins identified in this study play a role in immune tolerance and hypersensitivity and modulate cytokine patterns that induce a Th2 lymphocyte response, which typically results in an IgE-mediated allergic reaction; they therefore have a direct or indirect biological link to FA. Although the experimental design (use of pooled instead of individual sera) made it impossible to subdivide FA patients by specific food type, the identification of these key proteins provides clues to the mechanisms and structures that contribute to allergenicity, which in turn would help to reduce the allergenic potential. Further, these key proteins could be used as diagnostic markers that may not only help in the diagnosis and management of FA but also find use in future immunotherapy.
Extension of these approaches to specific food types, by identifying unique biomarkers associated with certain allergic phenotypes and potentially cross-reactive proteins through bioinformatic analyses, could provide considerable insight into the mechanisms that underlie the allergic response in individuals with FA.

EP is also elucidating the causes and outcomes of occupational exposure in the food industry. For instance, a very recent study (Green et al., 2011) investigated the sensitization of workers in soy-processing facilities, where allergy to soy is five times more frequent than at other food-processing sites, through a combination of immunological methods and antigen characterization by nanoUPLC MS/MS (Antó et al., 1989). The effects of processing (raw, de-hulled, and crushed soybeans, and soy flakes) were also compared, as well as the possible differences in allergenicity between wild-type and transgenic soy. A notable finding was that environmental exposure associated with respiratory allergy is mainly due to low-molecular-weight soy antigens (Gly m1 and Gly m2), whereas occupational respiratory sensitization to soy is instead associated with the high-molecular-weight proteins Gly m5 and Gly m6. The impact of the production process on the IgE affinity of the soy proteins was also evaluated. Like most raw food materials, soy flakes undergo various preparative processes during the production phase, including heat, chemical, and washing treatments. Postprocessed soy materials were shown to have a lower allergen number and content than preprocessed soy, in contrast with other studies reporting that temperature treatments during transportation and storage generated additional soy allergen determinants (Codina et al., 1998).
These conflicting reports show the complexity of evaluating the effects of technological processing on FA, which in general involves a great number of interconnected parameters, and well illustrate the need for further investigation to understand the influence of the food production process on allergens. Another crucial issue about which EP and FP are starting to provide some clues is the problem of allergen cross-reactivity. The proteomic approach to food allergen analysis can readily differentiate immunologically cross-reacting allergens. Exemplary is the study of celery (Apium graveolens), acknowledged as one of the main food allergens in Europe, for which mandatory labeling for preprocessed foods has been implemented, while no methods for specific detection of celery proteins in foods were available until 2010, when a sandwich ELISA method was set up with an LOD of 0.5 ppm. However, its application to the screening of food products was limited because of extensive cross-reactivity with potato and carrot proteins. This issue was rationalized by proteomic analysis (Faeste et al., 2010): using nanoHPLC-IT-MS/MS, the cross-reacting species were identified as a novel patatin (Sola t1) in celery and a flavin adenine dinucleotide-binding domain-containing protein (Api g5)-like in potato. The information provided by the identification of the cross-reacting species was then used to test HPLC-ESI-QqQ-MS/MS for specific detection of celery, carrot, and potato allergens in whole food extracts. Several highly specific precursor-to-product ion transitions were determined for each species, suggesting the feasibility of developing an MS-based screening method to detect celery allergens in foods in the presence of cross-reactive ingredients. This is also a clear example of how proteomics data may provide unique indications to improve existing immunodetection methods.
However, there are also crucial diagnostic issues linked to cross-reactivity in which the role of proteomics is becoming even more central. In the diagnosis of allergy, the determination of IgE specific to allergens is routinely used (Hamilton, 2010), and allergen-mediated BATs are performed in selected cases (De Week et al., 2008). The ability of an allergen to trigger the activation of basophils indicates that an allergen-specific IgE has biological activity in that patient (De Week et al., 2008; Hamilton, 2010). However, in vitro IgE reactivity or basophil activation is not always accompanied by in vivo activity or clinical allergy. Therefore, allergenic cross-reactivity should be taken into account when interpreting the results of allergy tests. Primary sensitization mediated by IgE against a particular allergen can induce reactivity to structurally related molecules (Aalberse et al., 2001). For instance, it is known that many plant and invertebrate glycoproteins contain similarly fucosylated and/or xylosylated N-glycans, also known as cross-reactive carbohydrate determinants (CCDs) (Mari, 2002; van Ree, 2002; Malandain, 2005; Altmann, 2007). Because CCDs are ubiquitous in nature, CCD sensitization can induce widespread IgE reactivity, therefore limiting the specificity of in vitro tests for allergy. A very recent proteomic investigation addressed this issue by examining sensitization to wine glycoproteins in nonallergic heavy drinkers (Gonzalez-Quintela et al., 2011). In that study, the prevalence and biological significance of IgE antibodies to N-glycans from wine glycoproteins in heavy drinkers were investigated with a combined immunologic and proteomic approach. Skin prick tests, serum IgE levels, IgE immunoblotting to wine extracts, and BATs showed that most heavy drinkers had IgE affinity to proteins in wine extracts, which were identified by MS, although no subject reported symptoms of hypersensitivity to any food or wine.
From a diagnostic standpoint, CCD interference in in vitro allergy tests should be given special consideration in heavy drinkers in order to avoid false positives. The discrepancy between strong in vitro activity (basophil activation) and no apparent in vivo activity of IgE to CCDs indicates that tolerance mechanisms should exist, but they remain elusive, and further studies of CCD tolerance in sensitized individuals are required. In this scenario, glycoproteins in alcoholic beverages may represent a source of CCD sensitization or, alternatively, IgE reactivity to these glycoproteins may represent an epiphenomenon of sensitization to CCDs from another source. A panel of 28 white wine glycoproteins potentially cross-reactive with plant allergens has recently been catalogued by applying an elegant multiplexed chromatographic approach consisting of sequential hydrophilic interaction liquid chromatography and TiO2 enrichment followed by a chemical-based (hydrazide capture) strategy prior to nanoRP-HPLC–ESI-LTQ-Orbitrap MS/MS (Palmisano et al., 2010). The precise role of exposure to alcohol and food glycoproteins in CCD sensitization is a crucial point for many aspects of FA and deserves further investigation.

3.6 IDENTIFICATION OF ALLERGENS IN TRANSFORMED PRODUCTS

A food product is in general the result of a series of consecutive technological steps that involve physical and chemical processes: thermal treatments, spray drying, cooking, extrusion, gel or dough formation, chemical or enzymatic hydrolysis, cross-linking, and oxidation, to mention just a few, all of which induce deep and different structural changes in the food constituents. For this reason, while the characterization of allergens in raw food materials, where they are still in a relatively “native” state, can be considered quite standardized today, allergenomics of processed foods remains a challenging task and requires properly designed approaches. The main limiting factors to allergenomic analysis of processed foods are the increased protein complexity (e.g., production of oxidized protein families or of mixtures of hydrolytic fragments) and the interaction of allergens with other proteins or with other molecules within the food matrix (for instance, in dough network formation or in the case of the condensation products between carbohydrates and proteins in the early stages of the Maillard reaction). MALDI-TOF-MS and Fourier transform ion cyclotron resonance (FTICR)-MS have been used to study the alteration of gliadins during the baking process (Rodriguez-Mateos et al., 2006). To address these difficulties, in recent years protein chemistry procedures, sometimes derived from classical biochemistry protocols, have been developed or adapted to obtain efficient allergen extraction and characterization from processed foods. Sample preparation is a key step for reliable allergen quantification. The first concern is efficient allergen extraction from the food matrix: if allergen recovery is incomplete, the allergen content is underestimated. Protein extraction from “recalcitrant” plant tissues is particularly challenging because of the occurrence of interfering substances (Marzban et al., 2008).
Therefore, extraction protocols have to be suitably adapted to the nature of the starting food matrix. In the selection of the extraction buffer, the effects of food processing must be taken into account. Thermal treatments such as roasting, cooking, and boiling, as well as sanitization processes, have a tremendous impact on protein stability and structure and, thus, on extraction efficiency (Sathe and Sharma, 2009). Elevated roasting temperatures in food processing were found to drastically reduce the protein extraction yields of oil-roasted and dry-roasted peanuts by 50–75% and 75–80%, respectively, compared with the raw material (Poms et al., 2004). In this latter study, several extraction buffers contained in commercial ELISA kits were tested and compared. Although five nut allergens were successfully detected in breakfast cereals and biscuits using a bicarbonate buffer extraction (Bignardi et al., 2010), accurate quantification requires detergents and chaotropes in order to achieve denaturing extraction conditions. The LC-MS/MS LOD of peanut Ara h1 in dark chocolate was significantly lowered if digestion was performed before extraction rather than extracting the proteins first and then digesting them (Shefcheck et al., 2006). This should be ascribed to incomplete protein extraction by the bicarbonate buffer. When combined with MRM, the LOD of a signature peptide of Ara h1 (Shefcheck and Musser, 2004) decreased 5-fold. Trypsin is the most widely used proteolytic enzyme for MS-based detection and quantification because it often guarantees complete digestion, good peptide ionization, and reproducibility. Protein modifications occurring between raw materials and finished foods are also relevant. The technological processes required for industrial food preparations may induce chemical modifications in allergens which may be decisive for their toxicological characteristics.
In simple terms, allergenic epitopes originally present in a native protein can be destroyed by technological treatments (an epitope may be hydrolyzed, oxidized, or conjugated to other molecules) or, conversely, neo-epitopes that did not exist before may be generated by the treatments or by the process of food digestion (hydrolysis which exposes previously hidden epitopes, cross-linking, conjugation, and deamidation). The issue of deamidation is crucial in the case of CD. Gliadins are considered the main factor triggering CD (Dewar et al., 2004; Londei et al., 2005), a widespread enteropathy induced by ingestion of wheat gliadin and related prolamins from oat (avenin), rye (secalin), and barley (hordein) in genetically susceptible individuals. A large and recent series of studies has clearly shown that the activity of tissue transglutaminase (tTG) is critical for CD. Patients with active CD have auto-antibodies to tTG, which are highly disease-specific and whose formation is dependent on ingestion of gluten. Notably, tTG itself was described to be involved in covalent complex formation with gluten-derived peptides in the presence of calcium. tTG is a multifunctional enzyme capable of forming covalent protein cross-links (transamidation) and also capable of catalyzing other biochemical reactions such as peptide deamidation. Both transamidation and deamidation play a key role in the pathogenesis of CD by generating a variety of haptens that, in addition to others, are responsible for an autoimmune response (Quarsten et al., 1999; Fleckenstein et al., 2002). Combined immunological and proteomic analyses and molecular modeling have shown that deamidated peptides fit more tightly into the binding pocket of the HLA molecules. The resulting higher binding affinity increases T-cell reactivity. These data highlight the need for methods that allow probing of peptides that are substrates of tTG in real wheat-based foods.
Proteomics is an efficient tool for the rapid identification of peptides in complex mixtures and could therefore facilitate identification of tTG-mediated modifications of peptides. The resolution and specificity achieved with the new generation of hybrid (Q-TOF, Orbitrap) instruments enable discrimination among peptides in which a single deamidated Gln residue is present, even in complex mixtures such as those occurring in the enzymatic digests of gluten proteins. In order to detect gliadin peptides derived from gastric and pancreatic digestion and possibly modified by tTG, a proteomic analytical workflow for selective identification of susceptible Gln in wheat proteins was developed (Mamone et al., 2004). The whole gliadin fraction, isolated from wheat flour, was subjected to simulated gastric digestion. Monodansylcadaverine was then applied to the complex digestion mixture as a fluorescent chemical label to identify the tTG-susceptible peptides. Fluorescent peaks on the HPLC profile were selectively characterized by nano-ESI-MS/MS. As a result, six gluten peptides were identified, some of which had previously been recognized as toxic for CD patients. For the first time in a wheat flour, this procedure made it possible to identify, as tTG substrates in vitro, a few gliadin peptides with a specific amino acid composition among a myriad of other peptides. In conclusion, the outlined procedure could be used to probe peptides susceptible to tTG in any food, even in a duodenal mucosal biopsy, and to obtain their structural characterization. A positive response with the fluorescent probe would predict the presence of a specific tTG-susceptible Gln residue determinant potentially toxic to a genetically predisposed subject. Thus, a systematic screening of foods with this type of predictive test could help to ensure safer food for CD patients and to support the selection of less harmful cereal varieties.
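The reason high-resolution instruments are needed to pick out a single deamidated Gln can be made concrete with a short calculation. The sketch below is only an illustration under stated assumptions: the peptide sequence is an arbitrary example chosen here, not one reported in the text, and the helper names are hypothetical. It uses standard monoisotopic residue masses to show that Gln-to-Glu deamidation shifts the peptide mass by about +0.984 Da, which places the deamidated monoisotopic peak only about 19 mDa below the first 13C isotope peak of the unmodified peptide.

```python
# Monoisotopic residue masses (Da) for the residues used in this example.
RESIDUE_MASS = {
    "P": 97.05276, "F": 147.06841, "Q": 128.05858,
    "E": 129.04259, "L": 113.08406, "Y": 163.06333,
}
WATER = 18.01056       # added once per peptide (free N- and C-termini)
PROTON = 1.00728       # charge carrier in positive-ion ESI
C13_SPACING = 1.00335  # mass difference between 13C and 12C


def peptide_mass(sequence):
    """Neutral monoisotopic mass of an unmodified linear peptide."""
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER


def mz(mass, charge):
    """m/z of the protonated ion at the given charge state."""
    return (mass + charge * PROTON) / charge


def deamidate(sequence, position):
    """Replace the Gln at a 0-based position with Glu (hypothetical helper)."""
    if sequence[position] != "Q":
        raise ValueError("position does not hold a Gln residue")
    return sequence[:position] + "E" + sequence[position + 1:]


native = "PFPQPQLPY"             # illustrative sequence, an assumption of this sketch
modified = deamidate(native, 5)  # deamidate the second Gln

shift = peptide_mass(modified) - peptide_mass(native)
# Distance between the deamidated monoisotopic peak and the first
# 13C isotope peak of the native peptide.
gap = C13_SPACING - shift
# Rough m/(delta m) needed to separate the two peaks for the 2+ ion.
resolving_power = mz(peptide_mass(modified), 2) / (gap / 2)

print(f"deamidation shift: {shift:.4f} Da")
print(f"gap to native 13C peak: {gap * 1000:.1f} mDa")
print(f"approximate resolving power needed at 2+: {resolving_power:,.0f}")
```

The resolving power printed here is only a rough m/Δm estimate; whether two such peaks are separated in practice also depends on peak shape and on the relative abundance of the native and deamidated forms.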
This approach may allow identification of the specific modification sites in a real flour sample, offering information for understanding how dietary peptides interact with the host organism in CD. It also opens up new perspectives, today only at an early stage, for a possible use of tTG on wheat flours in order to eliminate toxic epitopes of gluten. The issue of food detoxification is crucial to FA and is probably one of the most complex topics, partly because of the lack of analytical methods to relate structural modifications of food to allergenicity. Exemplary is the case of hen’s egg. Hen’s egg allergy represents one of the most common and severe IgE-mediated reactions to food in infants and young children. In many cases, however, it persists throughout life. Major allergenic egg proteins are ovalbumin (Gal d2), conalbumin (Gal d3), OVM (Gal d1), and lysozyme (Gal d4). At least 24 antigenic hen’s egg components are known (Langeland, 1983). The allergologically significant fractions are mainly OVM, ovalbumin, ovotransferrin (conalbumin), and lysozyme. These proteins make up 80% of the total protein content of egg white. The remainder consists of proteins that are less significant for FA, such as macroglobulin, avidin, and several different enzymes. Despite several attempts to apply some of the procedures normally used for food processing, the allergenicity of hen’s egg could not be reduced to a level suitable for allergic consumers while preserving the desired properties (texture and flavor) of hen’s egg. A recent proteomic/immunological study assayed the technological processes used to reduce the allergenic potential of hen’s egg in order to relate the technological changes to allergenicity (Hildebrandt et al., 2008). The investigation focused on pasteurized eggs as the starting material and on the intermediate and final products of a nine-step manufacturing process designed to make eggs usable in convenience products for allergic individuals.
The steps consisted of a combination of various heat treatments and enzymatic hydrolyses. The shift to alkaline conditions during storage was also considered. Furthermore, thermal processing may generate disulfide-linked polymers with unwanted properties due to disulfide interchange, a reaction that is catalyzed by alkaline conditions. The alterations were monitored by sodium dodecyl sulfate-polyacrylamide gel electrophoresis, immunoblotting, EAST inhibition, and MS. These combined approaches showed that the allergenic potential of the raw material was reduced from step to step and that, despite the known stability of certain egg proteins against heat and proteolysis, the total allergenic potential was finally below 1/100 that of the starting material without a significant change in texture and flavor as evaluated in various products. Taking advantage of a high resolution/wide dynamic range LTQ-Orbitrap instrument coupled with 1D nanoRP-HPLC, 158 proteins have recently been identified in hen’s egg white by shotgun proteomics, including several known egg allergens (Mann and Mann, 2011). In principle, thermal processing should decrease food allergenicity because heating and cooking disrupt the protein structure. On the other hand, neo-antigens can potentially be created, changing the digestibility, solubility, and resistance of proteins to stomach acid (Humeny et al., 2002). Modified proteins can be more resistant to acid or enzymatic digestion and, if stable, can be absorbed intact. The chemical modification can be recognized by the immune system, forming a new B-cell epitope or new anti-IgE epitope. Glycation, the reaction between a sugar and a free amino group of the protein, is the first and simplest reaction between a protein and a sugar.
The effects of thermal treatments, which also induce protein carbonylation and glycosylation with a consequent possible decrease of allergenicity or, conversely, with formation of neo-allergens, have also been investigated in in vitro food models, including milk proteins (Fenaille et al., 2005), apple (Sancho et al., 2005), peanut (Starkl et al., 2011; Vissers et al., 2011; Mondoulet et al., 2005), and soy (Krishnan et al., 2009). With the exception of a few cases (Moreno et al., 2008; Corzo-Martínez et al., 2009), these methods of analysis were mainly based on immunological techniques (e.g., ELISA). However, food-processing treatments such as the roasting of peanuts affect the stability of proteins and were shown to influence the detection of allergens by ELISA. To overcome this difficulty, methods employing MS detection have been developed that allow an unambiguous identification of allergenic food ingredients. In practice, either proteins or peptides are targeted for this purpose. More importantly, with a shotgun proteomics approach, the three major peanut allergens in processed peanut were monitored by selecting some allergen tryptic peptides as markers for specific peanut allergens in food products (Chassaigne et al., 2007). With a slightly different approach, a preliminary 2D chromatography method consisting of affinity capture of proteins on immunobeads followed by tryptic digestion and RP-HPLC-ESI-IT-MS/MS, Ara h 3/4 peanut allergens were identified in breakfast cereals (Careri et al., 2008). The ultimate aim of these studies is either to apply sensitive and reliable MS-based analytical protocols to assess the presence of allergens in processed foods or to identify objective molecular markers of allergens to improve current methods for their detection. The main problems in detecting food allergens are encountered with heavily processed ingredients or in matrices where the allergens or the toxic fraction are expected only at trace levels.
The most problematic case is that of the hydrolyzed starch products, including maltodextrins and glucose syrup, obtained industrially through chemical and/or enzymatic methods from various cereals, wheat included. These quite inexpensive ingredients are of enormous relevance in the food industry as functionalizing agents in food formulations, including baking and confectionery, juices, and beverages, and also in dairy and meat-derived foods, all products where the presence of gluten is not immediately apparent to intolerant or allergic consumers. In these products, gluten determination by immunological tests is made unreliable by a number of factors, including the low amount of gluten to be detected, which is dispersed in a very high amount of interfering substances (low- and high-mass sugars, other by-products of the process). Also, proteins undergo modifications such as oxidation, polymerization, and proteolysis by the action of the same chemicals used for starch degradation or of contaminating proteases. These modifications also interfere with gliadin immunoassays, either because the loss of the sequences targeted by antibody recognition may cause gluten underestimation or, conversely, because gluten hydrolysis can generate peptides containing repeated motifs common in the gliadin sequence, leading to overestimation of the real gluten content and to false positives. The key to the proteomic approach in this case is the optimization of the gluten extraction procedure. For this purpose, several protocols have been developed, based on the use of mixtures of water and organic solvents also containing reducing agents followed by MALDI-TOF-MS analysis (Mena et al., 2012), that enable semiquantitative gluten detection (1–10 ppm estimated LOD), thus making it possible to establish whether these products exceed the 20 ppm limit required for foods “rendered” gluten-free.
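Whether a given method can support a decision against the 20 ppm limit depends on how its LOD compares with that limit: a non-detect is informative only if the LOD lies below the threshold. The following minimal sketch spells out that reasoning. The function name, the result labels, and the example values are assumptions made here for illustration (ppm is taken as mg/kg), not a regulatory procedure.

```python
GLUTEN_LIMIT_PPM = 20.0  # limit quoted in the text for foods "rendered" gluten-free


def classify(measured_ppm, lod_ppm, limit_ppm=GLUTEN_LIMIT_PPM):
    """Interpret one gluten measurement against a regulatory limit.

    measured_ppm: measured gluten content in ppm (mg/kg), or None if
                  nothing was detected.
    lod_ppm:      limit of detection of the method, in ppm.
    """
    if measured_ppm is None:
        # A non-detect only demonstrates compliance when the method
        # could have seen gluten at or below the limit.
        return "compliant" if lod_ppm <= limit_ppm else "inconclusive"
    return "exceeds limit" if measured_ppm > limit_ppm else "compliant"


# Illustrative values only.
print(classify(None, lod_ppm=10))    # non-detect with a sensitive method
print(classify(None, lod_ppm=200))   # non-detect with an insensitive method
print(classify(40, lod_ppm=10))      # detected above the limit
print(classify(5, lod_ppm=1))        # detected below the limit
```

With these inputs, a non-detect from a method whose LOD is ten times the limit is reported as inconclusive rather than as evidence that the product is gluten-free.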
Analysis of several glucose syrup, maltodextrin, and crystalline dextrose samples by MS found some intact gliadin proteins and some fragments from degradation of gluten, with total concentrations in the range 1–40 mg/kg (EFSA, 2007). In another study, MALDI-TOF-MS of glucose syrups did not detect any intact gluten proteins or fragments of gluten proteins (Dostalek et al., 2010). However, the LOD in this study (200 mg/kg) was quite inadequate. These results show the extreme variability of allergen or toxic compound contents, which is mainly due to the production process employed, and suggest that great attention has to be paid in evaluating the toxicity of a food. Furthermore, these data may add useful information for developing diets and therapies for allergic as well as for CD patients. The problem of “hidden” allergens is also present in beverages, including wine and beer, for different reasons. After alcoholic fermentation, commercial wines are subjected to fining processes which use animal or plant protein matrices to eliminate haze-associated compounds so as to obtain a clear product. The most widely used proteins come from the major allergenic matrices: wheat, soy and other legumes, milk proteins, and egg albumin. These fining agents are not considered ingredients because they are removed after the process (they are classified as technological coadjuvants); therefore current labeling regulation in the EU does not require their use to be declared, with a consequent risk for allergic consumers. Just as for the starch hydrolyzates, ELISA detection is hampered by the very low residual amounts and by the interference of the alcoholic mixture. To overcome these limitations, a high mass resolution LC-ESI-MS method using a single-stage Orbitrap mass spectrometer has been developed for the quantification of casein allergens present in white wines as a result of fining with caseinate. The method is based on the search for “proteotypic” peptides of the allergen.
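A first step in searching for proteotypic peptides is usually an in silico digestion of the allergen sequence. The sketch below illustrates the idea under simplifying assumptions: trypsin is modeled with the common rule of cleavage after Lys or Arg unless the next residue is Pro, with no missed cleavages; the sequences are made-up toy strings, not real casein or wine protein sequences; and the length window and the function names are choices made here, not parameters from the cited method. A peptide is kept as a candidate marker only if it does not also arise from any background protein.

```python
def tryptic_digest(sequence):
    """Cleave after K or R, except when the next residue is P."""
    peptides = []
    start = 0
    for i, residue in enumerate(sequence):
        is_last = i + 1 == len(sequence)
        if residue in "KR" and (is_last or sequence[i + 1] != "P"):
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        # C-terminal peptide that does not end in K or R
        peptides.append(sequence[start:])
    return peptides


def candidate_markers(target, background, min_len=6, max_len=25):
    """Tryptic peptides of the target that no background protein yields.

    The length window is an assumed, adjustable filter for peptides
    that are convenient to detect by LC-MS.
    """
    background_peptides = {
        peptide for protein in background for peptide in tryptic_digest(protein)
    }
    return [
        peptide
        for peptide in tryptic_digest(target)
        if min_len <= len(peptide) <= max_len
        and peptide not in background_peptides
    ]


# Toy sequences invented for this example.
target = "AAKGLPQEVLNEKRPTTSVGGKYLGSLEQVVR"
background = ["MMKGLPQEVLNEKAAR"]

print(tryptic_digest(target))
print(candidate_markers(target, background))
```

In a real workflow the background would be the proteome of the matrix and of related species, and the surviving candidates would still have to be confirmed experimentally for ionization efficiency and stability during processing.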
The method shows the great potential of Orbitrap MS as a reliable technique in the field of protein allergen detection once the peptide markers are identified. In this case (Monaci et al., 2010; Monaci et al., 2011a, 2011b), the LOD ranged from 0.15 to 0.7 ppm in wine, depending on the peptide selected. These limits are compatible with the caseinate concentrations typically adopted for wine-fining purposes. By CE and ESI-MS analysis, minute amounts of lysozyme could be detected in wines (Simó et al., 2004).

A more sophisticated approach to the same issue has been based on the powerful enrichment methodology of combinatorial peptide ligand libraries (CPLLs) (Cereda et al., 2010). Adding minute amounts of CPLL beads to the entire content of a bottle of white or red wine (750 mL) made it possible to sequester residual traces of casein with high efficiency (up to 80%), permitting a signal “amplification” of at least 5000-fold. In this way, 1 ppb of allergen could be successfully detected in wines. Analysis of a significant number of Italian red wines, in which traces of the commonly used egg albumin were searched for, unexpectedly revealed that in all samples the only fining agent used was bovine casein, a different allergenic matrix, just as in white wines (D’Amato et al., 2010). The fact that such very low levels of fining agents can still be detected in treated wines should be taken into consideration by winemakers in labeling their products and by EC regulators in issuing proper regulations. These encouraging results in FA research are now driving the development and diffusion of automated methods for allergen detection and quantification at low-ppm levels that are fully based on MS analysis and commercialized by MS instrument producers for this purpose. The initial sample preparation stages are refined from established allergen extraction and purification protocols and have been simplified for rapid and easy-to-perform handling. Allergen quantification is based on the aforementioned monitoring of the MRM transitions of proteotypic peptides using the MRM-initiated detection and sequencing workflow (Lane et al., 2008). The derived method for allergen detection is already being applied by food control companies and agencies to the analysis of major allergens in baked foods (Heick et al., 2011a, 2011b), where it is advantageous given the drawbacks of ELISA tests.
The detection capabilities of this novel method were demonstrated by analyzing commercial foodstuffs containing milk, egg, soy, peanut, hazelnut, walnut, and almond.
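An MRM transition pairs the m/z of a precursor peptide ion with the m/z of one of its fragment ions. As a rough illustration of how candidate transitions for a proteotypic peptide could be tabulated, the sketch below computes the precursor m/z and the singly charged y-ion series from standard monoisotopic residue masses. The function names and the short test peptide are assumptions of this example; it ignores modifications and does not predict which fragments are actually intense, which in practice is established experimentally.

```python
# Standard monoisotopic residue masses (Da).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056
PROTON = 1.00728


def precursor_mz(peptide, charge=2):
    """m/z of the protonated, unmodified peptide at the given charge."""
    neutral = sum(RESIDUE_MASS[aa] for aa in peptide) + WATER
    return (neutral + charge * PROTON) / charge


def y_ion_mz(peptide):
    """Singly charged y ions y1..y(n-1), keyed by label."""
    ions = {}
    running = WATER + PROTON
    # Walk from the C-terminus, skipping the N-terminal residue.
    for i, aa in enumerate(reversed(peptide[1:]), start=1):
        running += RESIDUE_MASS[aa]
        ions[f"y{i}"] = running
    return ions


def mrm_transitions(peptide, charge=2):
    """List of (Q1 m/z, fragment label, Q3 m/z) candidate transitions."""
    q1 = precursor_mz(peptide, charge)
    return [
        (round(q1, 4), label, round(q3, 4))
        for label, q3 in y_ion_mz(peptide).items()
    ]


# Short hypothetical peptide used only to exercise the functions.
for q1, label, q3 in mrm_transitions("AGK"):
    print(f"{q1} -> {label} {q3}")
```

For a real assay, a few of the most intense and most specific transitions per peptide would be retained after inspecting measured MS/MS spectra.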

3.7 CONCLUDING REMARKS

Apart from the analytical support in immunological and clinical studies, the largest contribution of MS and proteomic techniques in the field of FA has been the discovery of new major and minor allergens. However, from this standpoint, we are persuaded that the potential of proteomics and related -omic techniques has been only partly expressed so far and, hence, in the very near future we expect striking advances toward the comprehensive and systematic cataloguing of food allergens and a more accurate characterization of the allergenic determinants. Undoubtedly, the instrumentation and maintenance costs and the operative skills required are among the primary factors that have prevented MS from contributing further to research in FA and from being included in routine monitoring protocols. Thus, at present, MS-based proteomics is essentially applied to assist the development and improvement of less expensive platforms for allergen discovery and monitoring, which rely on immunochemical microarrays and biosensors as well as commercial ELISA kits for customized use. However, it is very likely that MS techniques will enter routine practice on a large scale.

Until very recently, most of the newly developed proteomic operative protocols, as well as the process of data mining, were not yet standardized, thus hampering the immediate use of analytical data. Today, the proteomic analytical tools and methodological procedures are being consolidated, so that we can confidently expect MS techniques to complement the definition of improved diagnostic and immunotherapy strategies in preclinical and clinical practice as well. For this purpose, the individualized identification of epitopes and the acquisition of further clues about the immune mechanisms leading to the loss of oral tolerance, with subsequent development of FA, will be particularly decisive. MS-based proteomics has also, in practice, been charged with tackling the relevant issue of allergen standardization. The concept of standardization of allergens and allergoids (aldehyde derivatives of natural allergens able to decrease their allergenicity and potentially increase their safety in immunotherapy) for both diagnosis and treatment purposes is not confined to the mere confirmation of identity or quantification of individual allergens. Allergen standardization encompasses the assessment of quality, safety, and efficacy before a marketing authorization can be granted for an allergen product (Lorenz et al., 2008). MS techniques are currently irreplaceable, particularly in selected case studies such as the characterization of allergoids, for which immunochemical techniques might be ineffective because of the altered IgE-binding affinities, and the discrimination among allergen isoforms. The specific matter of allergen standardization has recently been reviewed (Reuter et al., 2009), with special reference to the pioneering tandem MS application for standardizing allergens and allergoids from Betula alba (Carnés et al., 2009).
The very recent successful use of MRM MS for standardizing timothy (Phleum pratense) pollen allergens (Seppälä et al., 2011) anticipates the likely extension of the technique to food allergens.

REFERENCES

Aalberse RC (1997). Food allergens. Environmental Toxicology and Pharmacology 4:55–60.
Aalberse RC, Akkerdaas J, van Ree R (2001). Cross-reactivity of IgE antibodies to allergens. Allergy 56(6):478–490.
Abdel Rahman AM, Gagné S, Helleur RJ (2012). Simultaneous determination of two major snow crab aeroallergens in processing plants by use of isotopic dilution tandem mass spectrometry. Analytical and Bioanalytical Chemistry 403(3):821–31.
Abdel Rahman AM, Lopata AL, Randell EW, Helleur RJ (2010). Absolute quantification method and validation of airborne snow crab allergen tropomyosin using tandem mass spectrometry. Analytica Chimica Acta 681(1–2):49–55.
Abdel Rahman AM, Kamath SD, Lopata AL, Robinson JJ, Helleur JR (2011). Biomolecular characterization of allergenic proteins in snow crab (Chionoecetes opilio) and de novo sequencing of the second allergen arginine kinase using tandem mass spectrometry. Journal of Proteomics 74(2):231–41.

Akagawa M, Handoyo T, Ishii T, Kumazawa S, Morita N, Suyama K (2007). Proteomic analysis of wheat flour allergens. Journal of Agricultural and Food Chemistry 55(17):6863–6870.
Alm R, Ekefjärd A, Krogh M, Häkkinen J, Emanuelsson C (2007). Proteomic variation is as large within as between strawberry varieties. Journal of Proteome Research 6(8):3011–3020.
Altmann F (2007). The role of protein glycosylation in allergy. International Archives of Allergy and Immunology 142:99–115.
Antó JM, Sunyer J, Rodriguez-Roisin R, Suarez-Cervera M, Vazquez L (1989). Community outbreaks of asthma associated with inhalation of soybean dust. New England Journal of Medicine 320(17):1097–1102.
Bignardi C, Elviri L, Penna A, Careri M, Mangia A (2010). Particle-packed column versus silica-based monolithic column for liquid chromatography-electrospray-linear ion trap-tandem mass spectrometry multiallergen trace analysis in foods. Journal of Chromatography A 1217(48):7579–7585.
Bilkova Z, Stefanescu R, Cecal R, Korecka L, Ouzka S, Jezova J, Viovy JL, Przybylski M (2005). Epitope extraction technique using a proteolytic magnetic reactor combined with Fourier-transform ion cyclotron resonance mass spectrometry as a tool for the screening of potential vaccine lead peptides. European Journal of Mass Spectrometry 11:489–495.
Brambilla F, Resta D, Isak I, Zanotti M, Arnoldi A (2009). A label-free internal standard method for the differential analysis of bioactive lupin proteins using nano HPLC-Chip coupled with Ion Trap mass spectrometry. Proteomics 9(2):272–286.
Caillard I, Tomé D (1995). Transport of β-lactoglobulin and α-lactalbumin in enterocyte-like Caco-2 cells. Reproduction Nutrition Development 35:179–188.
Careri M, Elviri L, Lagos JB, Mangia A, Speroni F, Terenghi M (2008). Selective and rapid immunomagnetic bead-based sample treatment for the liquid chromatography-electrospray ion-trap mass spectrometry detection of Ara h3/4 peanut protein in foods. Journal of Chromatography A 1206:89–94.
Carnés J, Himly M, Gallego M, Iraola V, Robinson DS, Fernández-Caldas E, Briza P (2009). Detection of allergen composition and in vivo immunogenicity of depigmented allergoids of Betula alba. Clinical and Experimental Allergy 39(3):426–434.
Cereda A, Kravchuk AV, D’Amato A, Bachi A, Righetti PG (2010). Proteomics of wine additives: mining for the invisible via combinatorial peptide ligand libraries. Journal of Proteomics 73(9):1732–1739.
Chassaigne H, Nørgaard JV, Hengel AJ (2007). MS analysis of peanut allergens in food, what are the markers we are looking for? Journal of Agricultural and Food Chemistry 55:4461–4467.
Chatchatee P, Järvinen KM, Bardina L, Beyer K, Sampson HA (2001). Identification of IgE- and IgG-binding epitopes on alpha(s1)-casein: differences in patients with persistent and transient cow’s milk allergy. Journal of Allergy and Clinical Immunology 107(2):379–383.
Chehade M, Mayer L (2005). Oral tolerance and its relation to food hypersensitivities. Journal of Allergy and Clinical Immunology 115:3–12.
Codina R, Oehling AGJ, Lockey RF (1998). Neoallergens in heated soybean hull. International Archives of Allergy and Immunology 117:120–125.
Colgrave ML, Goswami H, Howitt CA, Tanner GJ (2012). What is in a beer? Proteomic characterization and relative quantification of hordein (gluten) in beer. Journal of Proteome Research 11(1):386–396.

Cooke SK, Sampson HA (1997). Allergenic properties of ovomucoid in man. Journal of Immunology 159:2026–2032. Corzo-Mart´ınez M, Lebron-Aguilar´ R, Villamiel M, Quintanilla-Lopez´ JE, Moreno FJ (2009). Application of liquid chromatography-tandem mass spectrometry for the characterization of galactosylated and tagatosylated beta-lactoglobulin peptides derived from in vitro gas- trointestinal digestion. Journal of Chromatography A 1216:7205–7212. Crittenden RG, Bennett LE (2005). Cow’s milk allergy: a complex disorder. Journal of the American College of Nutrition 24:S582–S591. D’Amato A, Kravchuk AV, Bachi A, Righetti PG (2010). Noah’s nectar: the proteome content of a glass of red wine. Journal of Proteomics 73(12):2370–2377. De Week AL, Sanz ML, Gamboa PM, Aberer W, Bienvenu J, Blanca M, Demoly P, Ebo DG, Mayorga L, Monneret G, Sainte Laudy J (2008). Diagnostic tests based on human basophils: more potentials and perspectives than pitfalls. II. Technical issues. Journal of Investigation in Allergology and Clinical Immunology 18:143–155. DeLong JH, Simpson KH, Wambre E, James EA, Robinson D, Kwok WW (2011). Ara h 1-reactive T cells in individuals with peanut allergy. Journal of Allergy and Clinical Immunology 127(5):1211–1218. Dewar D, Pereira SP, Ciclitira PJ (2004). The pathogenesis of coeliac disease. International Journal of Biochemistry and Cell Biology 36:17–22. D´ıaz-Perales A, Blanco C, Sanchez-Monge´ R, Varela J, Carrillo T, Salcedo G (2003). Analysis of avocado allergen (Prs a 1) IgE-binding peptides generated by simulated gastric fluid digestion. Journal of Allergy and Clinical Immunology 112(5):1002–1007. Dostalek P, Gabrovska D, Rysova J, Mena MC, Hernando A, Mendez E, Chmelik J, Salplachta J (2009). Determination of gluten in glucose syrups. Journal of Food Composition and Analysis 22:7–8. Dostalek P, Gabrovska D, Rysova J, Mena MC, Hernando A, Mendez E, Chmelik J, Salplachta J (2010). Determination of gluten in glucose syrups. 
Journal of Food Composition and Analysis 22: 762–765. EFSA (2007). Opinion of the Scientific Panel on Dietetic Products, Nutrition and Allergies on a request from the Commission related to a notification from AAC on wheat-based glucose syrups including dextrose pursuant to Article 6 paragraph 11 of Directive 2000/13/EC. EFSA Journal 488:1--8. EU Directive 2000/13/EC, amended by Food Labelling Directive 2003/89/EC and Directive 2007/68/EC. Faeste CK, Jonscher KR, Sit L, Klawitter J, Løvberg KE, Moen LH (2010). Differentiating cross-reacting allergens in the immunological analysis of celery (Apium graveolens) by mass spectrometry. Journal of AOAC International 93(2):451–461. Fenaille F, Parisod VT, Tabet J-C, Guy PA (2005). Carbonylation of milk powder proteins as a consequence of processing conditions. Proteomics 5:3097–3104. Fleckenstein B, Molberg O, Qiao SW, Schimdt DG, Von der Muller F, Elgstoen K, Jung G, Sollid LM (2002). Gliadin T cell epitope selection by tissue transglutaminase in celiac dis- ease. Role of enzyme specificity and pH influence on the transamidation versus deamidation process. Journal of Biological Chemistry 277:34109–34112. Gonzalez-Quintelaa A, Gomez-Rialb J, Valcarcela C, Camposa J, Sanze M-L, Linnebergf A, Gudec F, Vidald C (2011). Immunoglobulin-E reactivity to wine glycoproteins in heavy drinkers. Alcohol 45:113–122. 94 TECHNIQUES FOR THE CHARACTERIZATION OF FOOD ALLERGENS

Green BJ, Cummings KJ, Rittenour WR, Hettick JM, Bledsoe TA, Blachere FM, Siegel PD, Gaughan DM, Kullman GJ, Kreiss K, Cox-Ganser J, Beezhold DH (2011). Occupational sensitization to soy allergens in workers at a processing facility. Clinical & Experimental Allergy 41:1022–1030. Hamilton RG (2010). Clinical laboratory assessment of immediate-type hypersensitivity. Jour- nal of Allergy and Clinical Immunology 125(2S2):S284–S296. Heick J, Fischer M, Kerbach S, Tamm U, Popping B (2011a). Application of a liquid chro- matography tandem mass spectrometry method for the simultaneous detection of seven allergenic foods in flour and bread and comparison of the method with commercially available ELISA test kits. Journal of AOAC International 94(4):1060–1068. Heick J, Fischer M, Popping B (2011b). First screening method for the simultaneous detection of seven allergens by liquid chromatography mass spectrometry. Journal of Chromatogra- phy A 1218(7):938–943. Hildebrandt S, Kratzin HD, Schaller R, Fritsche R, Steinhart H, Pasckhe A (2008). In vitro determination of the allergenic potential of technologically altered hen’s egg. Journal of Agricultural and Food Chemistry 56:1727–1733. Hinsby AM, Olsen JV, Bennett KL, Mann M (2003). Signaling initiated by overexpression of the fibroblast growth factor receptor-1 investigated by mass spectrometry. Molecular and Cell Proteomics 2:29–36. Hobson DJ, Rupa P, Diaz GJ, Zhang H, Yang M, Mine Y, Turner PV, Kirby GM (2007). Proteomic analysis of ovomucoid hypersensitivity in mice by two-dimensional difference gel electrophoresis (2D-DIGE). Food and Chemical Toxicology 45:2372– 2380. Hochwallner H, Schulmeister U, Swoboda I, Focke-Tejkl M, Civaj V, Balica N, Nystrand M, Harlin A, Thalhamer J, Scheiblhofer S, Keller W, Pavkov T, Zafred D, Niggemann B, Quirce S, Mari A, Pauli G, Ebner C, Papadopoulos NG, Herz U, van Tol EAF, Valenta R, Spitzauer S (2010). Visualization of clustered IgE epitopes on ␣-lactalbumin. 
Journal of Allergy and Clinical Immunology 125:1279–1285. Houston NL, Lee DG, Stevenson SE, Ladics GS, Bannon GA, McClain S, Privalle L, Stagg N, Herouet-Guicheney C, MacIntosh SC, Thelen JJ (2011). Quantitation of soybean allergens using tandem mass spectrometry. Journal of Proteome Research 10(2):763–773. Humeny A, Kislinger T, Becker CM, Pischetsrieder M (2002). Qualitative determination of specific protein glycation products by matrix-assisted laser desorption/ionization mass spectrometry peptide mapping. Journal of Agricultural and Food Chemistry 50(7):2153– 2160. Hur SJ, Lim BO, Decker EA, McClements DJ (2011). In vitro human digestion models for food applications. Food Chemistry 125:1–12. Jankovicovaa B, Rosnerovaa S, Slovakovaa M, Zverinovaa Z, Hubalekb M, Hernychovab L, Rehulkac P, Viovyd J-L, Bilkovaa Z (2008). Epitope mapping of allergen ovalbumin using biofunctionalized magnetic beads packed in microfluidic channels The first step towards epitope-based vaccines. Journal of Chromatography A 1206:64–71. Jarvinen¨ KM, Beyer K, Vila L, Chatchatee P, Busse PJ, Sampson HA (2002). B-cell epitopes as a screening instrument for persistent cow’s milk allergy. Journal of Allergy and Clinical Immunology 110(2):293–297. Kagan RS (2003). Food allergy: an overview. Environmental and Health Perspectives 111(2):223–225. REFERENCES 95

Kirsch S, Fourdrilis S, Dobson R, Maghuin Rogister G, Scippo ML, De Pauw E (2009). Quantitative methods for food allergens. Analytical Bioanalytical Chemistry 395(1):57– 67. Krishnan HB, Kim WS, Jang S, Kerley MS (2009). All three subunits of soybean beta- conglycinin are potential food allergens. Journal of Agricultural and Food Chemistry 57(3):938–943. Kobayashi K, Yoshida T, Takahashi K, Hattori M (2003). Modulation of the T cell response to beta-lactoglobulin by conjugation with carboxymethyl dextran. Bioconjugate Chemistry 14(1):168–176. Lane C, Jackson PJ, Potts D, Stahl-Zeng J, Serna A, Popping B, Lock SJ (2008). Proceedings of the 56th ASMS Conference; May 31–June 6, 2008; Denver, CO. Langeland T (1983). A clinical and immunological study of allergy to hen’s egg white. Allergy 38:399--412. Larche´ M (2003). Allergen-derived T cell peptides in immunotherapy. Revue Franc¸aise d’Allergologie et d’Immunologie Clinique 43(1):59–63. Londei M, Ciacci C, Ricciardelli I, Vacca L, Quaratino S, Maiuri L (2005). Gliadin as a stimulator of innate responses in celiac disease. Molecular Immunology 42:913– 917. Lorenz AR, Luttkopf¨ D, Seitz R, Vieths S (2008). The regulatory system in Europe with special emphasis on allergen products. International Archives of Allergy and Immunology 147:263–275. Malandain H (2005). IgE-reactive carbohydrate epitopes: classification, cross-reactivity, and clinical impact (2nd part). European Annuals of Allergy and Clinical Immunology 37:247– 256. Mamone G, Ferranti P, Melck D, Tafuro F, Longobardo L, Chianese L, Addeo F (2004). Susceptibility to transglutaminase of gliadin peptides predicted by a mass spectrometry- based assay. FEBS Letters 562:177–182. Mamone G, Picariello G, Caira S, Addeo F, Ferranti P (2009). Analysis of food pro- teins and peptides by mass spectrometry-based techniques. Journal of Chromatography A 1216(43):7130–7142. Mann K, Mann M (2011). In-depth analysis of the chicken egg white proteome using an LTQ Orbitrap Velos. 
Proteome Science 9(1):7. Mari A (2002). IgE to cross-reactive carbohydrate determinants: analysis of the distribution and appraisal of the in vivo and in vitro reactivity. International Archives of Allergy and Immunology 129:286–295. Marzban G, Herndl A, Maghuly F, Katinger H, Laimer M (2008). Mapping of fruit allergens by 2D electrophoresis and immunodetection. Expert Reviews in Proteomics 5 61–75. Mena MC, Lombard´ıa M, Hernando A, Mendez´ E, Albar JP (2012). Comprehensive analysis of gluten in processed foods using a new extraction method and a competitive ELISA based on the R5 antibody. Talanta 91:33–40. Monaci L, Losito I, Palmisano F, Visconti A (2010). Identification of allergenic milk proteins markers in fined white wines by capillary liquid chromatography-electrospray ionization- tandem mass spectrometry. Journal of Chromatography A 1217(26):4300–4305. Monaci L, Losito I, Palmisano F, Godula M, Visconti A (2011a). Towards the quantifica- tion of residual milk allergens in caseinate-fined white wines using HPLC coupled with 96 TECHNIQUES FOR THE CHARACTERIZATION OF FOOD ALLERGENS

single-stage Orbitrap mass spectrometry. Food Additives and Contaminants A 28(10):1304– 1314. Monaci L, Losito I, Palmisano F, Visconti A (2011b). Reliable detection of milk allergens in food using a high-resolution, stand-alone mass spectrometer. Journal of AOAC International 94(4):1034–1042. Mondoulet L, Paty E, Drumare MF, Ah-Leung S, Scheinmann P, Willemot RM, Wal JM, Bernard H (2005). Influence of thermal processing on the allergenicity of peanut proteins. Journal of Agricultural and Food Chemistry 53:4547–4553. Moreno FJ (2007). Gastrointestinal digestion of food allergens: effect on their allergenicity. Biomedical Pharmacotherapy 61(1):50–60. Moreno FJ, Rubio LA, Olano A, Clemente A (2006). Uptake of 2S albumin allergens, Ber e 1 and Ses i 1, across human intestinal epithelial Caco-2 cell monolayers. Journal of Agricultural and Food Chemistry 54:8631–8639. Moreno FJ, Quintanilla-Lopez´ JE, Lebron-Aguilar´ R, Olano A, Sanz ML (2008). Mass spec- trometric characterization of glycated beta-lactoglobulin peptides derived from galacto- oligosaccharides surviving the in vitro gastrointestinal digestion. Journal of the American Society for Mass Spectrometry 19:927–937. Nair B, Wheeler JC, Sykes DE, Brown P, Reynolds JL, Aalinkeel R, Mahajan SD, Schwartz SA (2011). Proteomic approach to evaluate mechanisms that contribute to food allergenicity: comparative 2D-DIGE analysis of radioallergosorbent test positive and negative patients. International Journal of Proteomics doi:10.1155/2011/673618. Neubauer G, Gottschalk A, Fabrizio P, Seraphin B, Lurhmann R, Mann M (1997). Iden- tification of the proteins of the yeast U1 small nuclear ribonucleoprotein complex by mass spectrometry. Proceedings of the National Academy of Sciences USA 94:385– 390. Palmisano G, Antonacci D, Larsen MR (2010). Glycoproteomic profile in wine: a ‘sweet’ molecular renaissance. Journal of Proteome Research 9(12):6148–6159. 
Pandey A, Fernandez MM, Stehen H, Blagoev B, Nielsen MM, Roche S, Mann M, Lodish HF (2000). Identification of a novel immunoreceptor tyrosine-based activation motif-containing molecule, STAM2, by mass spectrometry and its involvement in growth factor and cytokine receptor signaling pathways. Journal of Biological Chemistry 275:38633–38639. Pastorello EA, Monza M, Pravettoni V, Longhi R, Bonara P, Scibilia J, Primavesi L, Scorza R (2010). Characterization of the T-cell epitopes of the major peach allergen Pru p 3. International Archives of Allergy and Immunology 153(1):1–12. Perry T, Conover-Walker M, Pomes A, Chapman M, Wood R (2004). Distribution of peanut allergen in the environment. Journal of Allergy and Clinical Immunology 113:973– 976. Picariello G, Ferranti P, Fierro O, Mamone G, Caira S, Di Luccia A, Monica S, Addeo F (2010). Peptides surviving the simulated gastrointestinal digestion of milk proteins: biological and toxicological implications. Journal of Chromatography B 878(3–4):295–308. Picariello G, Bonomi F, Iametti S, Rasmussen P, Pepe C, Lilla S, Ferranti P (2011a). Proteomic and peptidomic characterisation of beer: immunological and technological implications. Food Chemistry 124(4):1718–1726. Picariello G, Mamone G, Addeo F, Ferranti P (2011b). The frontiers of mass spectrometry- based techniques in food allergenomics. Journal of Chromatography A 1218(42):7386– 7398. REFERENCES 97

Poms RE, Capelletti C, Anklam E (2004). Effect of roasting history and buffer composition on peanut protein extraction efficiency. Molecular Nutrition and Food Research 48(6):459– 464. Quarsten H, Molberg O, Fugger L, McAdam SN, Sollid LM (1999). HLA binding and T cell recognition of a tissue transglutaminase modified gliadin epitope. European Journal of Immunology 29:2506–2510. Ragno V, Giampietro PG, Bruno G, Businco L (1993). Allergenicity of milk protein hydrolysate formulae in children with cow’s milk allergy. European Journal of Paediatrics 152:760– 762. Rash S (2008). Food allergy overview in children. Clinical Reviews in Allergy and Immunology 34:217–230. FAO/WHO (2001). Report of a Joint FAO/WHO Expert Consultation on Allergenicity of Foods Derived from Biotechnology. Evaluation of Allergenicity of Genetically Modified Foods. Rome, Italy: FAO; 2001. pp. 1–29. Reuter A, Luttkopf¨ D, Vieths S (2009). New frontiers in allergen standardization. Clinical and Experimental Allergy 39:307–309. Rodriguez-Mateos A, Millar SJ, Bhandari DG, Frazier RA (2006). Formation of dityrosine cross-links during breadmaking. Journal of Agricultural and Food Chemistry 54:2761– 2767. Rona RJ, Keil T, Summers C, Gislason D, Zuidmeer L, Sodergren E, Sigurdardottir ST, Lindner T, Goldhahn K, Dahlstrom J, McBride D, Madsen C (2007). The prevalence of food allergy: a meta-analysis. Journal of Allergy and Clinical Immunology 120:638--646. Rytkonen¨ J, Valkonen KH, Virtanen V, Foxwell RA, Kyd JM, Cripps AW, Karttunen TJ (2006). Enterocyte and M-cell transport of native and heat-denatured bovine ␣-lactoglobulin: sig- nificance of heat denaturation. Journal of Agricultural and Food Chemistry 54:1500–1507. Sampson HA (2004). Update on food allergy. Journal of Allergy and Clinical Immunology 113(5):805–820. Sancho AI, Rigby NM, Zuidmeer L, Asero R, Mistrello G, Amato S, Gonzalez-Mancebo E, Fernndez-Rivas M, van Ree R, Mills ENC (2005). 
The effect of thermal processing on the IgE reactivity of the non-specific lipid transfer protein from apple, Mal d 3. Allergy 60:1262–1268. Sathe SK, Sharma GM (2009). Effects of food processing on food allergens. Molecular Nutri- tion and Food Research 53(8):970–978. Schmidt H, Gelhaus C, Latendorf T, Nebendahl M, Petersen A, Krause S, Leippe M, Becker W-M, Janssen O (2009). 2D-DIGE analysis of the proteome of extracts from peanut variants reveals striking differences in major allergen contents. Proteomics 9:3507–3521. Sen M, Kopper R, Pons L, Abraham EC, Burks AW, Bannon GA (2002). Protein structure plays a critical role in peanut allergen stability and may determine immunodominant IgE-binding epitopes. Journal of Immunology 169:882–887. Seppal¨ a¨ U, Dauly C, Robinson S, Hornshaw M, Larsen JN, Ipsen H (2011). Absolute quan- tification of allergens from complex mixtures: a new sensitive tool for standardization of allergen extracts for specific immunotherapy. Journal of Proteome Research 10(4):2113– 2122. Shefcheck KJ, Musser SM (2004). Confirmation of the allergenic peanut protein, Ara h 1, in a model food matrix using liquid chromatography/tandem mass spectrometry (LC/MS/MS). Journal of Agricultural and Food Chemistry 52(10):2785–2790. 98 TECHNIQUES FOR THE CHARACTERIZATION OF FOOD ALLERGENS

Shefcheck KJ, Callahan JH, Musser SM (2006). Confirmation of peanut protein using peptide markers in dark chocolate using liquid chromatography-tandem mass spectrometry (LC- MS/MS). Journal of Agricultural and Food Chemistry 54(21):7953–7959. Shevchenko A, Keller P, Scheiffele P, Mann M, Simons K (1997). Identification of components of trans-Golgi network-derived transport vesicles and detergent-insoluble complexes by nanoelectrospray tandem mass spectrometry. Electrophoresis 18:2591–2600. Sicherer SH, Sampson HA (2006). Food allergy. Journal of Allergy and Clinical Immunology 117:S470--S475. Sicherer SH, Sampson HA (2009). Food allergy: recent advances in pathophysiology and treatment. Annual Review of Medicine 60:261–277. Simo´ C, Elvira C, Gonzalez´ N, San Roman´ J, Barbas C, Cifuentes A (2004). Capillary electrophoresis-mass spectrometry of basic proteins using a new physically adsorbed polymer coating. Some applications in food analysis. Electrophoresis 25(13):2056– 2064. Son DY, Scheurer S, Hoffmann A, Haustein D, Vieths S (1999). Pollen-related food allergy: cloning and immunological analysis of isoforms and mutants of Mal d 1, the major apple allergen, and Bet v 1, the major birch pollen allergen. European Journal of Nutrition 38(4):201–215. Starkl P, Krishnamurthy D, Szalai K, Felix F, Lukschal A, Oberthuer D, Sampson HA, Swoboda I, Betzel C, Untersmayr E, Jensen-Jarolim E (2011). Heating affects structure, enterocyte adsorption and signalling, as well as immunogenicity of the peanut allergen Ara h2.Open Allergy Journal 4:24–34. Tanabe S (2007). Epitope peptides and immunotherapy. Current Protein and Peptide Science 8(1):109–118. Teshima R, Nakamura R, Satoh R, Nakamura R (2010) 2D-DIGE analysis of rice proteins from different cultivars. Regulatory Toxicology and Pharmacology 58:S30–S35. 
Tye-Din JA, Stewart JA, Dromey JA, Beissbarth T, van Heel DA, Tatham A, Henderson K, Mannering SI, Gianfrani C, Jewell DP, Hill AV, McCluskey J, Rossjohn J, Anderson RP (2010) Comprehensive, quantitative mapping of T cell epitopes in gluten in celiac disease. Science Translational Medicine 2(41):41–51. Urisu A, Yamada K, Tokuda R, Ando H, Wada E, Kondo Y, Morita Y (1999). Clinical significance of IgE-binding activity to enzymatic digests of ovomucoid in the diagnosis and the prediction of the outgrowing of egg white hypersensitivity. International Archives of Allergy and Immunology 120:192–198. van Hengel AJ (2007a). Declaration of allergens on the label of food products purchased on the European market. Trends in Food Science and Technology 18:96--100. van Hengel AJ (2007b) Food allergen detection methods and the challenge to protect food- allergic consumers. Analytical and Bioanalytical Chemistry 389:111--118. Van Hoeyveld EM, Escalona-Monge M, de Swert LF, Stevens EA (1998) Allergenic and antigenic activity of peptide fragments in a whey hydrolysate formula. Clinical and Exper- imental Allergy 28(9):1131–1137. van Ree R (2002). Carbohydrate epitopes and their relevance for the diagnosis and treatment of allergic diseases. International Archives of Allergy and Immunology 129:189–197. Vissers YM, Blanc F, Stahl Skov P, Johnson PE, Rigby NM, Przybylski-Nicaise L, Bernard H, Wal J-M, Ballmer-Weber B, Zuidmeer-Jongejan L, Szepfalusi Z, Ruinemans-Koerts J, Jansen APH, Savelkoul HFJ, Wichers HJ, Mackie AR, Mills CEN, Adel-Patient K (2011). REFERENCES 99

Effect of heating and glycation on the allergenicity of 2S albumins (Ara h 2/6) from peanut. PLoS ONE 6(8):e23998. doi:10.1371/journal.pone.0023998. Wang J, Sampson HA (2011). Food allergy. Journal of Clinical Investigation 121(3):827– 835. Weaver LT, Laker MF, Nelson R, Lucas A (1987). Milk feeding and changes in intestinal permeability and morphology in the newborn. Journal of Pediatric Gastroenterology and Nutrition 6:351–358. Wickham M, Faulks R, Mills C (2009). In vitro digestion methods for assessing the effect of food structure on allergen breakdown. Molecular Nutrition and Food Research 53(8):952– 958. Wijesinha-Bettoni R, Alexeev Y, Johnson P, Marsh J, Sancho AI, Abdullah SU, Mackie AR, Shewry PR, Smith LJ, Mills EN (2010). The structural characteristics of nonspecific lipid transfer proteins explain their resistance to gastroduodenal proteolysis. Biochemistry 49:2130–2139. 4 EXAMINATION OF THE EFFICACY OF ANTIOXIDANT FOOD SUPPLEMENTS USING ADVANCED PROTEOMICS METHODS

Ashraf G. Madian, Elsa M. Janle, and Fred E. Regnier

4.1 INTRODUCTION

4.1.1 Oxidative Stress in Aging and Disease

With aging, endogenous antioxidant defences decrease and production of reactive oxygen species (ROS) increases (Joseph et al., 1998; Herrera et al., 2009), along with the accumulation of oxidative damage (Ames et al., 1993). Exercise also results in the production of ROS, so older people who exercise to maintain good health may have an added ROS burden (Herrera et al., 2009). Beyond aging, oxidative stress (OS) has been associated with many age-related diseases, including heart disease (Diaz et al., 1997; Tsutsui et al., 2011), cataracts, Alzheimer’s disease (Christen, 2000), Parkinson’s disease, cancer (Ames et al., 1995), and diabetes (Brownlee, 2005). There are both endogenous and exogenous defences against OS. Endogenous mechanisms generally neutralize excess ROS or, in some cases, repair the damage; superoxide dismutase (SOD), catalase, and glutathione peroxidase are key enzymes in controlling ROS levels in cells (Halliwell, 1997). It is often the case, however, that endogenous defence mechanisms are insufficient to completely protect against ROS damage. Exogenous defence comes from the diet in the form of antioxidants, especially from fruits and vegetables (Barnes et al., 1994; Tanaka et al., 2011).

Foodomics: Advanced Mass Spectrometry in Modern Food Science and Nutrition, First Edition. Edited by Alejandro Cifuentes. © 2013 John Wiley & Sons, Inc. Published 2013 by John Wiley & Sons, Inc.


4.1.2 Dietary Antioxidants

Many foods contain antioxidants (Halvorsen et al., 2006) that limit oxidative damage from ROS. The vitamins A, C (ascorbic acid), and E (α- and γ-tocopherol) and the carotenoids are important antioxidants which have long been known to be beneficial to health. Polyphenols from foods are another class of antioxidants that contribute to the health benefits of foods. These include phenols, phenolic acids, flavonoids, coumarins, tannins, and lignans. Flavonoids make up a large portion of the polyphenols in plant-derived antioxidants. Consumption of flavonoids can be in the range of 50–800 mg/day (Pietta, 2000), so there is the potential for flavonoids to contribute significantly to exogenous antioxidant capacity. In vitro studies have demonstrated that dietary antioxidants function by several mechanisms, including inhibition of enzymes responsible for ROS formation, chelation of metal ions, scavenging of ROS, inhibition of lipid peroxidation, and up-regulation of endogenous antioxidant defences (Halliwell, 2008). Epidemiological studies have focused on determining the health benefits of dietary antioxidants. Based on the potential health benefits of dietary antioxidants, the US Department of Agriculture (USDA) has compiled a list of antioxidant-rich foods along with their antioxidant capacities and a database for the flavonoid content of foods (Harnly et al., 2006).

4.2 METHODS FOR STUDYING THE EFFICACY OF ANTIOXIDANTS

Both in vitro and in vivo assays have been used to assess antioxidant efficacy. The horseradish peroxidase-luminol-hydrogen peroxide assay and the 1,1-diphenyl-2-picrylhydrazyl (DPPH) assay are good examples of in vitro evaluation methods, but unfortunately these colorimetric methods correlate poorly with the in vivo efficacy of antioxidants (Cos et al., 2002; Georgetti et al., 2003). Some of the reasons for this include in vivo degradation, differences in the bioavailability of antioxidants, and interactions with food ingredients. Tests that target the in vivo efficacy of antioxidants avoid these problems. Examples of these tests include the thiobarbituric acid test, which measures the decrease in lipid oxidation (Hermans et al., 2005), and the 8-hydroxy-2′-deoxyguanosine test, in which DNA oxidation is measured (Podmore et al., 1998). Surprisingly, assessment of antioxidant utility by examining protein oxidation has not been well explored, primarily due to the lack of evaluation tools. One of the most frequently used techniques for examining protein oxidation is to measure the total carbonyl content with a colorimetric test based on dinitrophenylhydrazine (DNPH) (Hermans et al., 2007). Nonspecificity is one of the many limitations of this protein carbonyl content (PCC) assay; that is, it allows no discrimination between specific proteins. The same is true for the assay based on the release of 3-nitrotyrosine following acid hydrolysis of proteins.

The new discipline of Foodomics applies powerful proteomic methods to the analysis of antioxidants. These methods examine oxidation in specific proteins and the degree to which various antioxidants provide protection. A recent proteomics study of human SH-SY5Y neuroblastoma cells was designed to assess the effect of green tea polyphenols as neuroprotective agents under long-term serum deprivation. Of the hundreds of proteins being oxidized in these cells, only four were found to increase while three decreased in concentration as a function of OS (Weinreb et al., 2007). In another study, the effect of dietary coenzyme Q10 on plasma proteins was examined using 2D-gel electrophoresis (2DGE) and mass spectrometry. Seven proteins decreased in concentration, while three increased (Santos-Gonzalez et al., 2007). Another study using streptozotocin (STZ)-induced diabetes in rats found that 14 proteins were either up- or down-regulated in the retina. After the administration of grape seed proanthocyanidin extract (GSPE), seven of these proteins returned to normal levels (Li et al., 2008).

Although a great improvement over the old PCC assay, these newer studies are still not ideal. They all suffer from the problem that measuring changes in protein concentration is not the best way to assess OS. Protein degradation is seriously compromised when OS reaches pathological levels, destroying the correlation between OS and the concentration of oxidized proteins. A much better approach is to directly measure oxidative damage of proteins and correlate this with organ damage. This chapter describes ways to do that: first by examining protein oxidation mechanisms and second by using different methods for isolation and quantitation of oxidized proteins. The chapter concludes with a discussion of the biomedical consequences of protein oxidation along with the impact of antioxidants.

4.2.1 Carbonylation as a Universal Indicator of Oxidative Stress

The fact that ROS production and the sites oxidized in proteins are not genetically coded gives the oxidation process a high degree of random character. Several studies have shown that numerous types of oxidative stress-induced posttranslational modification (OSi-PTM) are generated in hundreds of proteins during an OS episode (Madian and Regnier, 2010; Madian et al., 2011c). These OSi-PTM types are related in that they bear a carbonyl group. For this reason, carbonylation is considered to be a universal indicator of OS. The many types of OSi-PTM are produced in three ways. The first is by the oxidation of an amino-acid side chain, predominantly at Pro, Arg, Lys, Thr, Glu, or Asp residues, or by oxidative cleavage of the protein backbone (Amici et al., 1989; Requena et al., 2001; Stadtman and Levine, 2003). The second is by protein conjugation with lipid peroxidation end products ranging from malondialdehyde and 2-propenal to 4-hydroxy-2-nonenal (Dalle-Donne et al., 2006), and the third is by oxidation of the carbohydrate residue on advanced glycation end (AGE) products conjugated to proteins. Cleavage of diols in the AGE product produces a carbonyl residue on the protein. The logic behind the old colorimetric tests using dinitrophenylhydrazine (DNPH) is sound; the problem is that they cannot identify the sites of oxidation, the mechanisms of oxidation, the proteins involved, the function and structure of those proteins, or the efficacy of therapeutic antioxidants. Recently, advances in proteomics methods have introduced new techniques to target, isolate, identify, and quantify oxidized proteins and their oxidation sites and to use this information to determine the efficacy of antioxidants.

4.2.2 Methods for Purifying Carbonylated Proteins from Complex Mixtures, Mechanistic Studies of Diseases, and the Impact of Antioxidants

A cellular proteome can be very complex. Unfortunately, mass spectrometers cannot deal with the level of complexity found in a whole proteome. Moreover, at present they lack the sensitivity required to quantify proteins below ng/mL levels. Thus, reducing the complexity of the cellular proteome and enriching proteins from samples is a prerequisite in many cases. Additionally, separation of OS-modified proteins from a proteome greatly facilitates OSi-PTM identification in peptides derived from proteins. Several selection methods have been described for carbonylated proteins. All of them depend on derivatizing the carbonyl group with a reagent, followed by affinity selection of the derivatized proteins.

4.2.2.1 Dinitrophenylhydrazine

DNPH derivatization in combination with 2D-PAGE has been used extensively to detect carbonylated proteins. This method recognizes oxidized proteins but does not enrich them. When the derivatization is coupled with LC/MS, carbonylated proteins can be isolated, identified, and quantified through the use of antibodies that target DNPH in the derivatized proteins (Dalle-Donne et al., 2003). Identification (Kristensen et al., 2004; Prokai and Forster, 2006; Tsujimoto et al., 2007) and quantification of carbonylated proteins (Prokai and Forster, 2006; Tsujimoto et al., 2007) have been achieved by digesting the derivatized proteins, purifying either the protein or the derivatized peptide with antibodies, and separating the selected polypeptides with reversed-phase chromatography (RPC) (Prokai and Forster, 2006; Tsujimoto et al., 2007) or with ion exchange and reversed-phase chromatography (IEC/RPC) (Kristensen et al., 2004), followed by detection with tandem mass spectrometry (MS/MS).

4.2.2.2 Biotin Hydrazide

This method involves a multiple-step process in which carbonyl groups are first derivatized with biotin hydrazide (BHZ) to form a Schiff base. The Schiff base is then reduced with sodium cyanoborohydride to prevent reversal of derivatization, and the sample is dialyzed to remove excess BHZ. Biotinylated proteins are then selected with an immobilized monomeric avidin sorbent. Oxidized proteins thus selected are desorbed from the avidin column (with 2 mM biotin or 0.1 M glycine) as a single fraction, often further fractionated by RPC, digested with trypsin, and finally analyzed by LC/MS/MS. It should be noted that the biocytin hydrazide reaction is similar to that of BHZ and has been used to detect carbonylated proteins in aged mice (Soreghan et al., 2003). It should also be noted that monomeric avidin is generally used more frequently than tetrameric avidin for the isolation of biotinylated molecules because tetrameric avidin binds biotin very strongly and it is difficult to disrupt this interaction during elution. Both 2DGE and conventional LC/MS have been used to separate biotinylated proteins, and 2DGE–MS has been used to study carbonylated proteins in yeast (Yoo and Regnier, 2004). A detection limit of 10 ng can be achieved by using avidin–fluorescein isothiocyanate (FITC) (Yoo and Regnier, 2004). LC/MS combined with avidin affinity purification, on the other hand, has been used to study carbonylated proteins in yeast (Mirzaei and Regnier, 2005, 2006c, 2007, 2008), rats (Madian et al., 2011c), and humans (Madian and Regnier, 2010; Madian et al., 2011a).

4.2.2.3 Girard’s P Reagent Carbonylated peptides can be isolated after derivatization with Girard’s P reagent (GPR) (1-(2-hydrazino-2-oxoethyl) pyridinium chloride). The hydrazide group of this reagent reacts with the carbonyl group of an OSi-PTM forming a hydrazone; after derivatization, the peptides are positively charged at pH 6. This derivative can be isolated with strong cation exchange (SCX) chromatography. The eluted peptides can then be separated and identified by RPC–MS/MS (Mirzaei and Regnier, 2006d). Dialysis is not used in this approach as excess derivatizing agent does not interfere with SCX. Additionally, quaternization enhances the ionization of poorly ionized peptides.

4.2.2.4 Oxidation-Dependent Element Coded Affinity Tags Derivatization with oxidation-dependent element coded affinity tags (O-ECAT), that is, ((S)-2-(4-(2-aminooxy)-acetamido)-benzyl)-1,4,7,10-tetraazacyclododecane-N,N,N,N-tetraacetic acid, provides another route for the selection of carbonylated peptides. The O-ECAT moiety has the ability to chelate rare earth metals such as Tb (158.92 Da) or Ho (164.93 Da). This differential labeling agent allows relative quantitation of peptides bearing oxidative modifications. The labeled peptides can be isolated by immunoaffinity chromatography and analyzed by nanoRPC-FTICR mass spectrometry (Lee et al., 2006; Cheal et al., 2009).

4.2.2.5 Precautions There are also some caveats that accompany the use of the methods described above.

1. It should be noted that none of the methods mentioned above will isolate all oxidized forms of proteins. Sulfhydryl oxidation, for example, does not produce a carbonyl group. Only proteins bearing a carbonyl group will be isolated.
2. Nonoxidized proteins that are complexed with carbonylated proteins will be isolated with this method as well. Verification that a protein is oxidized comes from the detection of the carbonylation site(s).
3. Only fresh samples should be used for carbonylation studies. This is due to the fact that carbonyl groups are highly reactive and can react with lysine residues and N-terminal amine groups, precluding detection. Therefore, stored biological samples may not be of any value for proteomics studies involving carbonylation.

4.3 STRATEGIES USED FOR PROTEOMIC ANALYSIS OF CARBONYLATED PROTEINS AND THE IMPACT OF ANTIOXIDANTS

The methods described above can be combined with affinity chromatography and sophisticated proteomics techniques to allow the isolation of carbonylated proteins and their oxidation sites. This can be achieved using three different strategies.

4.3.1 Isolating Carbonylated Peptides

Fresh biotinylated samples can be immediately digested with an enzyme, for example, trypsin or Glu-C, and then subjected to avidin affinity chromatography for the selection of carbonylated peptides. The peptides in the fraction eluted from the affinity column can then be identified and characterized using LC/MS/MS. The advantage of this approach is that only carbonylated peptides are analyzed. A disadvantage is that carbonylated peptides are generally larger than their unoxidized counterparts from a protein (Mirzaei and Regnier, 2007). This is due to the oxidation of arginine and lysine residues, which removes the cleavage sites that trypsin recognizes. Another problem is that the identification of proteins can be difficult due to the lack of redundant unmodified peptides. With BHZ labeling of the carbonylated proteins and subsequent selection of the biotinylated proteins, this method resulted in the identification of 415 yeast proteins and 87 oxidation sites after treatment with hydrogen peroxide (Mirzaei and Regnier, 2006c).

4.3.2 Targeting Carbonylated Proteins as a Group

In this approach, the carbonylated proteins thus biotinylated are isolated using avidin affinity chromatography. The purified proteins are then digested, and peptide fragments from these proteins can then be analyzed by LC/MS/MS. This approach is simple and quick. Additionally, multiple unoxidized peptides from a protein are available, which can be used for identification. The main disadvantage of this approach is that fewer carbonylation sites are detected compared to the other two approaches. This is mainly due to suppression of ionization by the more abundant, easily ionized unmodified peptides in the protein.

4.3.3 Multidimensional Separation

Biotinylated proteins are purified by avidin affinity chromatography in this approach and then further fractionated with RPC before proteolysis and identification by RPC–MS/MS. Unmodified peptides from the digest are used to identify the proteins. This makes identification of their biotinylated counterparts much easier. Without this, it is necessary to identify biotinylated peptides with a high resolution, high mass accuracy mass spectrometer. Having backup identification of unmodified peptides from the same protein precludes the need for such sophisticated mass spectrometry. This approach was utilized to characterize the oxidation sites for 87 yeast proteins (Mirzaei and Regnier, 2007). One of the advantages of this approach is that it allows the identification of both the proteins and their oxidation sites. The main disadvantage is that analysis times are longer. It should be noted that protein:protein cross-linking products, protein:RNA cross-linking products, isoforms of the same proteins, and in vivo cleavage products will appear as multiple peaks during RPC fractionation. This information will be missed with the previous two approaches.

4.4 STUDYING OXIDATION MECHANISMS

Carbonylation of proteins can occur in at least three ways: (i) direct oxidation with ROS, (ii) adduct formation with lipid peroxidation products by Michael addition, and (iii) formation of adducts with AGE products (Fig. 4.1).

[Figure 4.1: chemical structures, not reproduced. Panels: direct carbonylation (glutamic semialdehyde, from oxidized proline or arginine; aminoadipic semialdehyde, from oxidized lysine; 2-amino-3-ketobutyric acid, from oxidized threonine); glycation and advanced glycation end products (AGEs) adducts (deoxyglucosone adduct; methylglyoxal adduct); advanced lipid peroxidation end products (ALEs) adducts (malondialdehyde adduct; HNE Michael adduct).]

FIGURE 4.1 Structures of carbonylation products detected in this study. R refers to the sequence of polypeptides, aa refers to lysine, histidine, or cysteine that can form Michael adducts with 4-HNE. Reprinted with permission from Madian et al. (2011c). Copyright 2011 American Chemical Society.

4.4.1 Direct Oxidation with ROS

Posttranslationally modified proteins commonly exist at low concentration in a high background of unmodified proteins. Direct identification of posttranslationally modified proteins under these conditions is possible but requires computer algorithms that search for mass shifts unique to the modification, along with a high resolution, high mass accuracy mass spectrometer and a partial sequence of the peptide to confirm that the modification is correctly identified. Common bioinformatics tools (e.g., Mascot) can detect this change in mass and thus identify the presence of carbonylation. Additionally, the mass spectra associated with modified peptides should be carefully examined manually to determine the location of the oxidation site. Sometimes proteins appear in multiple, nonadjacent fractions during RPC (Mirzaei and Regnier, 2007). This occurs for several reasons. One is when multiple OSi-PTMs occur in the same protein and the isoforms are sufficiently different to be separated. Partial proteolysis and oxidation in the same protein would be such a case. Proteins can also cross-link to other species after oxidation. The oxidized monomer and the cross-linked version are thus both seen. Proteins can cross-link with another protein in multiple ways (Berlett and Stadtman, 1997; Stadtman and Berlett, 1997; Mirzaei and Regnier, 2007). One occurs when the carbonyl group on an oxidized amino acid of one protein forms a Schiff base with the amino group of a lysine residue of another protein. Another occurs with Schiff base formation between the two aldehyde groups of malondialdehyde and the amino groups of lysine residues on two different proteins. A third results from AGE adducts forming a cross-link between the two proteins. Finally, disulfide links between proteins can be mediated by cysteine oxidation. Proteins also cross-link with RNA and DNA.
Studies of yeast grown in an H2O2 environment have shown that multiple ribosomal proteins cross-link to ribosomal RNA (rRNA) in addition to undergoing carbonylation. When lysates from these cultures were biotinylated with BHZ and the carbonylated proteins were isolated by avidin affinity chromatography followed by RPC of the affinity-selected protein fraction, the same ribosomal protein was, surprisingly, eluted in three different nonadjacent fractions. The possibility that this might be due to cross-linking with rRNA was examined in the following way. Each of the RPC fractions was subjected to affinity chromatography on a meta-amino phenyl boronic acid (mAPBA) column. This type of affinity chromatography is known to select vicinal diol species, as would be the case for RNA species and some glycoproteins. After trypsin digestion, the peptide cleavage products were identified by RPC–MS/MS. Spectral analyses indicated that some of the isolated peptides carried specific covalently linked RNA bases (Mirzaei and Regnier, 2006a). Through this approach and inspection of X-ray crystallographic structures of yeast ribosomes, approximately 37 proteins were identified along with their RNA cross-linking sites.
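The mass-shift search mentioned at the start of this section can be illustrated with a minimal Python sketch. It is not the algorithm used by Mascot or by the cited studies. The elemental compositions are standard chemistry for the direct carbonylation products named in Figure 4.1, while the function names, the tolerance, and the peptide sequence AKTPR are hypothetical and introduced here only for illustration.

```python
# Monoisotopic element masses in Da (standard values).
MONO = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}

# Residue compositions (amino acid minus water) before oxidation.
RESIDUE = {
    "P": {"C": 5, "H": 7, "N": 1, "O": 1},
    "R": {"C": 6, "H": 12, "N": 4, "O": 1},
    "K": {"C": 6, "H": 12, "N": 2, "O": 1},
    "T": {"C": 4, "H": 7, "N": 1, "O": 2},
}

# Direct carbonylation products, also written as residue compositions.
PRODUCT = {
    "P": ("glutamic semialdehyde", {"C": 5, "H": 7, "N": 1, "O": 2}),
    "R": ("glutamic semialdehyde", {"C": 5, "H": 7, "N": 1, "O": 2}),
    "K": ("aminoadipic semialdehyde", {"C": 6, "H": 9, "N": 1, "O": 2}),
    "T": ("2-amino-3-ketobutyric acid", {"C": 4, "H": 5, "N": 1, "O": 2}),
}


def mass(composition):
    """Monoisotopic mass of a composition given as {element: count}."""
    return sum(MONO[element] * count for element, count in composition.items())


def carbonyl_shift(residue):
    """Mass change (Da) when the residue is converted to its carbonyl product."""
    _, product = PRODUCT[residue]
    return mass(product) - mass(RESIDUE[residue])


def candidate_sites(sequence, observed_delta, tol=0.01):
    """List (position, residue, product) whose shift matches the observed delta.

    observed_delta is the measured peptide mass minus the unmodified mass;
    tol is a hypothetical tolerance in Da.
    """
    hits = []
    for position, residue in enumerate(sequence, start=1):
        if residue in PRODUCT and abs(carbonyl_shift(residue) - observed_delta) <= tol:
            hits.append((position, residue, PRODUCT[residue][0]))
    return hits


if __name__ == "__main__":
    for r in "PRKT":
        print(r, PRODUCT[r][0], round(carbonyl_shift(r), 4))
    print(candidate_sites("AKTPR", -1.0316))
```

The shifts for lysine and threonine differ by only about 1 Da, which is one reason the text recommends a high mass accuracy instrument and manual inspection of the spectra to confirm the site.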

4.4.2 Adducting of Advanced Lipid Peroxidation End Products to Proteins

4.4.2.1 Adducts Oxidation of lipids can result in the generation of reactive degradation products like 4-hydroxynonenal (HNE) and malondialdehyde. These compounds frequently react with proteins through either Michael addition or Schiff base formation, often with pathological consequences. For example, apolipoprotein B-100 (Apo B-100) is a single low-density lipoprotein (LDL) that solubilizes fatty acids by adsorption. Oxidation of Apo B-100 makes it susceptible to uptake and accumulation in receptor cells, leading to the formation of atherosclerotic plaques inside blood vessels. To study advanced lipid peroxidation adducts of this molecule, NaBH4 was used to stabilize the Michael adducts in oxidized LDL. The samples were then delipidated and digested with trypsin, and the tryptic peptides were analyzed by LC–MS/MS. A diagnostic product ion of m/z 268, corresponding to the histidine immonium ion modified by HNE, is generated upon fragmentation of peptides modified with HNE. Generally, these modified peptides were located on the surface of the LDL molecules (Bolgar et al., 1996). In addition to the HNE Michael addition described above, HNE and malondialdehyde can form Schiff base adducts with lysine. Mass spectrometry easily discriminates between these two mechanisms: Schiff base formation increases the mass of a peptide by 138 amu, whereas Michael addition of HNE increases it by 156 amu. A study with model proteins in plasma (hemoglobin and β-lactoglobulin) showed that the ratio of Michael adducts to those of Schiff base formation is 99:1 (Bruenner et al., 1995). It should be noted that the choice of reducing agent (NaCNBH3 or NaBH4) affects the type of adduct formed. If NaBH4 is added at the final steps of the reaction between HNE and the proteins, the Michael adduct formed is reduced.
On the other hand, if NaCNBH3 is added at the beginning of the same reaction, the N-terminal amino acid residue is modified mainly via a Schiff base adduct (Fenaille et al., 2003). The extent of modification is also strongly affected by the polypeptide structure. The more open structure of apomyoglobin resulted in a higher degree of modification than with myoglobin (Liu et al., 2003). The type of mass spectrometer used is also critical to the analysis of protein oxidation. Masses of the different oxidative modifications can be very similar, and therefore, a mass spectrometer with high accuracy and high resolving power is needed. FTICR–MS is a good example of a high-resolution instrument with high mass accuracy. With this instrument, it has been possible to locate the sites of HNE modification in apomyoglobin, finding that three to nine Michael adducts were formed. FTICR–MS was also used to characterize the HNE adducts of creatine kinase. Histidine and cysteine had the highest probability of being derivatized (Eliuk Shannon et al., 2007). FTICR can also be combined with a linear ion trap. This hybrid linear ion trap-FTICR mass spectrometer (LTQ-FT) can perform the usual data-dependent acquisition of spectra along with neutral loss-driven MS3 acquisition. This neutral loss scanning resulted in the identification of 25% of the HNE modification sites, out of a total of approximately 24 sites on 15 mitochondrial proteins (Stevens et al., 2007). Additionally, mass spectrometers equipped for electron transfer dissociation (ETD) may be preferred over those limited to collision-induced dissociation (CID), as ETD can be superior for the characterization of oxidation sites because it produces c and z ions that provide better sequence coverage (Rauniyar et al., 2007).
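The mass arithmetic behind the HNE adducts and the m/z 268 diagnostic ion can be worked through in a short Python sketch. It assumes the elemental composition of HNE (C9H16O2) and assumes that borohydride reduction adds two hydrogens to the adduct; neither is stated explicitly in the text. The function names and the tolerance are hypothetical.

```python
# Monoisotopic element masses in Da (standard values) and the proton mass.
MONO = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}
PROTON = 1.00727647


def mass(composition):
    """Monoisotopic mass of a composition given as {element: count}."""
    return sum(MONO[element] * count for element, count in composition.items())


HNE = mass({"C": 9, "H": 16, "O": 2})   # 4-hydroxynonenal (assumed composition)
WATER = mass({"H": 2, "O": 1})
H2 = mass({"H": 2})                      # gained on reduction (assumption)

# Mass added to a peptide by each type of HNE adduct.
SHIFTS = {
    "Michael adduct": HNE,
    "Michael adduct, reduced": HNE + H2,
    "Schiff base": HNE - WATER,
    "Schiff base, reduced": HNE - WATER + H2,
}

# Histidine immonium ion: residue minus CO, plus a proton.
HIS_RESIDUE = mass({"C": 6, "H": 7, "N": 3, "O": 1})
HIS_IMMONIUM = HIS_RESIDUE - mass({"C": 1, "O": 1}) + PROTON


def classify(delta, tol=0.02):
    """Return the adduct types whose mass shift matches delta within tol (Da)."""
    return [name for name, shift in SHIFTS.items() if abs(shift - delta) <= tol]


def hne_his_immonium(reduced=True):
    """m/z of the HNE-modified histidine immonium ion (singly charged)."""
    key = "Michael adduct, reduced" if reduced else "Michael adduct"
    return HIS_IMMONIUM + SHIFTS[key]


if __name__ == "__main__":
    for name, shift in SHIFTS.items():
        print(f"{name}: +{shift:.4f} Da")
    print(f"HNE-His immonium ion: m/z {hne_his_immonium():.4f}")
```

Under these assumptions the reduced Michael adduct on histidine gives an immonium ion near m/z 268, and the Schiff base shift comes out near 138 Da, in line with the values quoted above.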

4.4.2.2 Purification of Adducts The low concentration of the advanced lipid peroxidation adducts sometimes hinders their detection with mass spectrometric techniques. Therefore, several strategies have been employed for their isolation and enrichment. First, avidin chromatography can purify biotinylated adducts after labeling with BHZ. Using this approach, HNE peptides were purified from HNE-spiked yeast. The sites of modification were mapped to approximately 67 proteins, mainly on histidine residues (Roe et al., 2007). The same strategy was used to label the HNE adducts in adipose tissue of mice. Adipocyte fatty acid-binding protein was shown to be among the proteins modified with HNE. This protein plays a role in insulin resistance (Grimsrud et al., 2007). Another reagent that can be used is biotinylated hydroxylamine, which forms an oxime with the HNE Michael adducts, eliminating the need for the reduction step (Chavez et al., 2006; Chung et al., 2008). Another way is to label the HNE Michael and malondialdehyde Schiff base adducts with DNP followed by purification of the labeled peptides with anti-DNP affinity chromatography. This process was shown to be quantitative (Fenaille et al., 2002). A major challenge associated with the aforementioned approaches is that the reagents react with all carbonyl groups, not only those from HNE. This can be solved by the use of an anti-HNE antibody immobilized on CNBr-activated Sepharose. This antibody is specific for Michael adducts only (Fenaille et al., 2002).

4.4.3 Analysis of Advanced Glycation End Products

AGE products are isomeric mixtures of products (Amadori, 1925a, 1925b) that are characterized by long-term stability (Kuhn and Dansl, 1936). They are formed upon Amadori rearrangement of reducing sugar adducts to proteins generated by the addition of monosaccharides to amines on proteins (generally known as the Maillard reaction). The concentrations of these products increase with aging, diabetes, and renal failure (Wautier and Schmidt, 2004). One of the major problems associated with glycation is that it may alter the biological activity of the proteins. One interesting example is the loss of the oxidative repair activity of the glutathione peroxidase enzyme upon irreversible modification with methylglyoxal at residues R184 and R185 (Yong et al., 2003). Studies on model proteins have shown that both glycation and the formation of AGE products are site-specific. For example, the in vitro incubation of RNase with glucose for 14 days showed that both the carboxymethyl groups and the Amadori adducts share the same oxidation sites: K41, K7, and K37. In addition, an Amadori adduct was also formed at K1. Interestingly, the carboxymethylation results mainly from the Amadori adduct rather than from the autoxidation of glucose, which results in the formation of glyoxal (Brock et al., 2003). The previous studies were based on model proteins, whereas isolation of glycated proteins and proteins bearing AGE products from complex mixtures is challenging. Few methods have been described for this purpose. Again through affinity selection with mAPBA, diol species (including glycated proteins and peptides) were selected by reversible formation of a covalent ester between mAPBA and 1,2 or 1,3 diols (Zhang et al., 2007b).
Using this approach, glycated proteins and peptides from human plasma and erythrocytes of individuals with impaired glucose tolerance or type 2 diabetes were isolated and sequenced by ETD and CID (Zhang et al., 2008).

Use of CID combined with ETD offers a great advantage for the sequencing of glycated peptides. ETD provides spectra with a nearly full series of c- and z-type ions, which allows easier peptide sequencing. In contrast, CID produces a series of y and b ions and ions with neutral loss of water (Zhang et al., 2007a). Additionally, the neutral loss of 162 amu in CID is yet another way to recognize glycated peptides. This resulted in the identification of 31 out of 59 lysine residues that are partially glycated in human serum albumin (HSA) (Gadgil et al., 2007). Matrix-assisted laser desorption ionization (MALDI) is yet another method of ionization for the analysis of glycated peptides. The high collision energy provided by the time of flight (TOF)/TOF mass analyzer in combination with MALDI has proven to be a valuable tool in structural analysis (Brancia et al., 2006). mAPBA can also be immobilized on a MALDI chip to analyze glycated peptides, significantly reducing nonspecific binding (Gontarev et al., 2007). Bioinformatics is an important component in the analysis of glycated proteins. Recently, a Perl script was described for identifying glycation sites. After human β2-microglobulin was incubated with glucose, the protein was digested and the proteolytic fragments were analyzed by MALDI–TOF–MS. The Perl script took the list of masses of the observed peptides and matched them to the list of peptides putatively modified by AGE (Cocklin et al., 2003).
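The mass-matching idea behind such a script can be sketched as follows. This is written in Python rather than Perl and is not the published script of Cocklin et al. The two mass increments (hexose addition for an Amadori adduct, and carboxymethylation) are standard values; the peptide names, masses, lysine counts, and the ppm tolerance are hypothetical placeholders.

```python
# Mass added to a peptide by two common lysine modifications (Da).
MODS = {
    "Amadori (hexose)": 162.0528,   # C6H10O5
    "carboxymethyl": 58.0055,       # C2H2O2
}


def match_glycated(observed, peptides, tol_ppm=50.0):
    """Match observed masses to putatively modified peptides.

    observed : list of measured monoisotopic peptide masses (Da)
    peptides : {name: (unmodified mass in Da, number of lysines)}
    Returns a list of (peptide name, modification, observed mass).
    Only a single modification per peptide is considered in this sketch.
    """
    hits = []
    for name, (base_mass, n_lys) in peptides.items():
        if n_lys == 0:
            continue  # no lysine, so no site for these modifications
        for mod, delta in MODS.items():
            theoretical = base_mass + delta
            for obs in observed:
                if abs(obs - theoretical) / theoretical * 1e6 <= tol_ppm:
                    hits.append((name, mod, obs))
    return hits


if __name__ == "__main__":
    # Hypothetical digest: two peptides, one without lysine.
    peptides = {"pepA": (1000.5, 1), "pepB": (1500.7, 0)}
    observed = [1162.5528, 1558.7055, 900.0]
    print(match_glycated(observed, peptides))
```

A match by mass alone is only a candidate assignment; the neutral loss of 162 amu in CID or ETD sequencing, as described above, would be needed to confirm the site.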

4.5 QUANTIFICATION OF CARBONYLATION SITES

Most of the methods described in the following sections were developed using model proteins. These methods can be easily adapted to study the effect of antioxidants on protein carbonylation.

4.5.1 Quantitation Using Stable Isotope Coding

Traditionally, quantitation of carbonylated proteins was performed using 2DGE coupled with staining by DNPH. The main advantage of quantitation using stable isotope coding over 2DGE is that stable isotope coding is characterized by low errors in quantification (only 6–8%) irrespective of the number of steps involved (Julka and Regnier, 2005). DNPH can be used for the differential coding of carbonylated proteins, where the samples are labeled with either (13C6)-DNPH or (12C6)-DNPH according to the sample origin. Samples can then be mixed and analyzed with RPC coupled to electrospray ionization tandem mass spectrometry (LC/ESI–MS/MS) for peptide identification and quantification of carbonylation sites (Prokai and Forster, 2006; Tsujimoto et al., 2007). A version of BHZ called hydrazide-functionalized isotope-coded affinity tag (HICAT) was used in relative quantification of protein carbonylation with mitochondrial proteins from heart (Zhang et al., 2007b). In this approach, the HNE–peptide adducts were synthesized and labeled with (13C4)-HICAT. The HNE–peptide adducts were coded by the light version of this reagent, (12C4)-HICAT. The heavy- and light-labeled adducts were then mixed, enriched, and analyzed by RPC coupled to MALDI–MS/MS. The carbonylated peptides can be easily recognized as their masses vary by 4 amu. Additionally, the heavy and light isoforms of Girard’s P reagent (GPR) and O-ECAT have been used for relative quantitation of protein carbonylation. For example, d0-GPR and d5-GPR were used to label oxidized samples according to their origins. Then both forms were mixed in a 1:1 ratio and analyzed by RPC–MS/MS. Mass differences of 5 Da or a multiple thereof indicate the presence of carbonylated peptides.
False-positive identification was reduced by the use of both RPC–MS/MS and MALDI–MS/MS in addition to parameter filtering, including retention time, resolution, correct concentration ratio, and tag number (Mirzaei and Regnier, 2006b).
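The pairing step for the d0/d5 Girard reagent labels can be sketched in Python as below. This is a simplified illustration, not the filtering procedure of the cited study. It assumes the exact mass difference between deuterium and hydrogen (about 1.0063 Da, so roughly 5.03 Da per tag), and the peak list, charge state, tolerance, and maximum tag count are hypothetical.

```python
# Exact mass difference between deuterium and hydrogen (Da).
D_MINUS_H = 2.0141018 - 1.0078250
TAG_DELTA = 5 * D_MINUS_H  # d5 versus d0 label, about 5.03 Da per tag


def find_pairs(peaks, charge=1, max_tags=3, tol=0.01):
    """Find light/heavy partners in a peak list.

    peaks : list of (m/z, intensity) tuples from one spectrum
    Returns (light m/z, heavy m/z, number of tags, heavy/light ratio).
    The m/z spacing of a pair is n * TAG_DELTA / charge for n tags.
    """
    peaks = sorted(peaks)
    pairs = []
    for i, (mz_light, int_light) in enumerate(peaks):
        for mz_heavy, int_heavy in peaks[i + 1:]:
            for n in range(1, max_tags + 1):
                expected = n * TAG_DELTA / charge
                if abs((mz_heavy - mz_light) - expected) <= tol:
                    pairs.append((mz_light, mz_heavy, n, int_heavy / int_light))
    return pairs


if __name__ == "__main__":
    # Hypothetical singly charged peaks: one labeled pair and one unpaired peak.
    spectrum = [(500.0000, 100.0), (505.0314, 200.0), (700.0000, 50.0)]
    for pair in find_pairs(spectrum):
        print(pair)
```

In practice the additional filters listed above (retention time, resolution, concentration ratio, and tag number) would be applied to the candidate pairs to reduce false positives.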

4.5.2 Quantitation Using Targeted Proteomics Techniques

Despite the importance of the relative methods for quantitation described earlier for determining the efficacy of antioxidants on carbonylated proteins, absolute quantification is greatly needed to determine the absolute amount of oxidized proteins in the cell and the fraction that is affected by antioxidants. Multiple reaction monitoring (MRM) has been extensively used for the absolute quantitation of small molecules through spiking the samples with heavy internal standard peptides of known concentration. In proteomics-based methods, the internal standard can be either a protein generated via bioengineering or an isotope-labeled peptide generated via chemical synthesis. A synthetic 13C-labeled peptide is usually used because it precludes chromatographic separation of the peptide isotopomers during RPC. A full discussion of MRM methods is beyond the scope of this chapter. The reader is directed to several excellent reviews about the topic (Keshishian et al., 2007; McKay et al., 2007; Stahl-Zeng et al., 2007; Lange et al., 2008). The use of MRM methods for the analysis of oxidized proteins has not been described, but this method is likely to be of great value in determining the concentration of isoforms oxidized at specific sites. The main problem with the application of these methods to redox proteomics is the difficulty in synthesizing a large number of carbonylated peptides to determine the absolute concentration of several oxidation sites.
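The arithmetic behind absolute quantitation with a spiked heavy internal standard can be shown in a few lines of Python. This is a minimal sketch under the assumption that the light (endogenous) and heavy (standard) peptides respond identically, so that the amount scales with the peak area ratio; the function name, units, and peak areas are hypothetical.

```python
def absolute_amount(light_areas, heavy_areas, heavy_spiked_fmol):
    """Estimate the endogenous peptide amount from MRM peak areas.

    light_areas / heavy_areas : peak areas for the same transitions of the
        endogenous peptide and of the isotope-labeled standard
    heavy_spiked_fmol : known amount of standard added to the sample
    The light/heavy ratio is averaged over transitions, then scaled by the
    spiked amount.
    """
    if len(light_areas) != len(heavy_areas) or not light_areas:
        raise ValueError("need the same, nonzero number of transitions")
    ratios = [light / heavy for light, heavy in zip(light_areas, heavy_areas)]
    return heavy_spiked_fmol * sum(ratios) / len(ratios)


if __name__ == "__main__":
    # Hypothetical areas for two transitions and a 10 fmol spike.
    print(absolute_amount([2000.0, 1000.0], [4000.0, 2000.0], 10.0))
```

One standard is needed for every oxidation site to be quantified, which is the synthesis burden the paragraph above identifies as the main obstacle.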

4.6 BIOMEDICAL CONSEQUENCE OF PROTEIN OXIDATION AND THE IMPACT OF ANTIOXIDANTS

Heart disease, neurological diseases, and aging are all related to OS. In addition, in the United States today, more than 24 million individuals have type II diabetes and an additional 57 million are prediabetic (Mirzaei and Regnier, 2007). Healthcare costs for this disease alone total well over 174 billion dollars per year (Mirzaei and Regnier, 2007). Renal failure, neuropathy, cardiovascular disease, and blindness along with concomitant reductions in quality of life, greater healthcare costs, and increased mortality are some of the long-term consequences of OS. The study of protein oxidation in various diseases and of the impact of antioxidants on it has only just begun to emerge. For example, the OS-induced carbonylation of the fatty acid-binding protein in the adipose tissue of insulin-resistant mice causes at least a 10-fold reduction in the affinity to fatty acids (Grimsrud et al., 2007). The oxidation site was shown to be an in vitro HNE adduct at Cys-117. Additionally, under ischemia and reperfusion, it was shown that Hsp70-1 was oxidized at Arg 469 in the hippocampus of the macaque monkey (Oikawa et al., 2009). Another example is ADP/ATP translocase 1 in cardiac mitochondria, in which carbonylation was detected only at Cys-256 due to the formation of adducts with HNE and acrolein (Han et al., 2007). Oxidation of proteins can impact the biological function of these proteins. GAPDH is an example of a protein whose enzymatic activity is altered by OS (Pierce et al., 2008). Moreover, the activity of creatine kinase and carbonic anhydrase is reduced in COPD patients under the effect of OS (Barreiro et al., 2005). Additionally, the in vitro addition of HNE to enolase resulted in the loss of its activity (Hussain et al., 2006). Efforts are even underway to catalog carbonylated proteins in various diseases and with drugs.
Approximately 100 carbonylated proteins have been identified in mouse brain (Soreghan et al., 2003), while in aged mice, the levels of three proteins increased significantly compared to controls, and another three decreased upon antioxidant treatment with lipoic acid (Poon et al., 2005). Additionally, carbonylated proteins were cataloged in muscle and plasma of diabetic rats (Maeda et al., 2003; Oh-Ishi et al., 2003). Administration of GSPE was shown to return 7 of 14 differentially regulated retinal proteins to their normal levels (Li et al., 2008). Analysis of the carbonylated proteins in the plasma of patients with breast cancer showed that 95 out of 405 proteins were altered more than 1.5-fold. These proteins are part of major pathway networks in breast cancer (Fig. 4.2) (Madian et al., 2011a). In the brain of patients with Alzheimer’s disease, glutamine synthase, creatine kinase BB, and ubiquitin carboxy-terminal hydrolase L-1 were found to be differentially oxidized (Castegna et al., 2002). In association with mild cognitive impairment (MCI), eleven HNE-modified proteins were elevated. This in turn resulted in the loss of the protein activity and neuronal death that may lead to the progression of MCI in patients with Alzheimer’s disease (Reed et al., 2008).

4.7 REDOX PROTEOMICS AND TESTING THE EFFICACY OF ANTIOXIDANTS

Recent studies have shown that it is possible to study the efficacy of antioxidant supplementation at the level of oxidation sites using selective reaction monitoring (SRM) techniques (Madian et al., 2011b). As a proof of concept, the efficacy of green tea on OSi-PTMs in plasma proteins of Zucker diabetic fatty (ZDF) rats was evaluated (Figs. 4.3 and 4.4). The general features of this approach are (i) the use of proteolytic fragments bearing OSi-PTM of oxidized proteins in plasma as biomarkers, (ii) defining the degree and mechanism of OS damage and antioxidant protection through these OSi-PTM peptide biomarkers, and (iii) organ-specific assessment of OS and antioxidant protection using blood biomarkers. The sensitivity of this approach made it possible to assess oxidized peptides of identical structure, as opposed to older methods, which quantitate protein carbonylation based on the absorbance of 2,4-DNP.

FIGURE 4.2 Protein networks associated with proteins that changed more than 1.5-fold in the plasma of patients with breast cancer compared to their controls. The network was created by a direct interaction algorithm of GeneGo™ using the list of proteins from our dataset. Lines between nodes indicate the interaction between proteins with green being activation, red inhibition, and cyan canonical pathways. Shapes of the nodes represent the functional class of the proteins as shown at the bottom of the figure. Brown circles indicate the cancer-related proteins that proteins from our dataset interact with. Reprinted with permission from Madian et al. (2011a). Copyright 2011 Elsevier.

The study was also able to determine the mechanism by which green tea supplementation affected protein carbonylation in ZDF rats with type II diabetes. The approach allowed the examination of the oxidation pathways and the extent of oxidative modifications within individual proteins (Fig. 4.4). MS/MS fragmentation along with the selectivity and sensitivity of triple quadrupole-based SRM methods were exploited to quantify changes in carbonylation sites. Seven out of the 17 differentially altered oxidation sites decreased dramatically. However, the level of one oxidized peptide increased, which indicates that the mechanism of protection of antioxidants is not universal. Additionally, the pro-oxidant effect of green tea may have contributed to that.

[Figure 4.3: SRM chromatograms, not reproduced. Axes: time (minutes, approximately 6.85 to 7.75) versus counts (×10^2). Traces: transitions 607.3/915.5 (labeled 607.3/915.4 for the diabetic sample) and 607.3/599.4, shown for the pooled diabetic rat plasma sample and for the pooled green tea-fed diabetic rat plasma sample.]

FIGURE 4.3 Relative quantitation of carbonylated peptides using selective reaction monitoring (SRM). The HNE-modified peptide, KVADALAK, was biotinylated through addition of biotin hydrazide. Quantification of this modified peptide is based on two transitions: 915.5 (y5) and 599.4 (y2-NH3). As seen, the levels were reduced 25-fold in the pooled plasma sample from green tea-fed diabetic rats compared to the pooled plasma sample from control diabetic animals. Reprinted with permission from Madian et al. (2011b). Copyright 2011 American Chemical Society.
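How a fold change of this kind is obtained from SRM traces can be sketched in Python: each transition is integrated over its elution window and the summed areas of the two samples are compared. The trapezoidal integration, the function names, and all intensity values below are assumptions made for illustration. The toy counts were chosen so the arithmetic is easy to follow and are not the published data.

```python
def peak_area(times, counts):
    """Trapezoidal area under one SRM trace (counts versus time)."""
    area = 0.0
    for i in range(1, len(times)):
        area += (times[i] - times[i - 1]) * (counts[i] + counts[i - 1]) / 2.0
    return area


def fold_reduction(control, treated):
    """Ratio of summed transition areas, control over treated.

    control / treated : {transition label: (times, counts)}
    Both samples must contain the same transitions.
    """
    control_total = sum(peak_area(*control[t]) for t in control)
    treated_total = sum(peak_area(*treated[t]) for t in control)
    return control_total / treated_total


if __name__ == "__main__":
    times = [7.0, 7.2, 7.4]  # minutes, hypothetical elution window
    control = {
        "607.3/915.5": (times, [0.0, 100.0, 0.0]),
        "607.3/599.4": (times, [0.0, 50.0, 0.0]),
    }
    treated = {
        "607.3/915.5": (times, [0.0, 4.0, 0.0]),
        "607.3/599.4": (times, [0.0, 2.0, 0.0]),
    }
    print(round(fold_reduction(control, treated), 2))
```

Using two transitions per peptide, as in the figure, gives a consistency check: both transitions should report a similar ratio if they derive from the same peptide.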

FIGURE 4.4 Carbonylation sites quantified using SRM. The figure shows average ratios for carbonylated peptides in diabetic rats to their lean controls and in green-tea-treated diabetic rats to their diabetic controls as detected in this study. Structures were made using the SWISS-MODEL Workspace software from the Swiss Institute of Bioinformatics. Each sequence was uploaded into the Modeling Workspace where the software finds the best structure template based on homology with other proteins of known structure. Reprinted with permission from Madian et al. (2011b). Copyright 2011 American Chemical Society.

An interesting aspect of this work is that there were substantial differences in the extent to which oxidation is affected by antioxidants, both between proteins oxidized by the same pathway and even between sites in the same protein. Figure 4.4 provides several examples. One of them is hemoglobin alpha 2 chain, in which ALE forms an adduct with K12 and K68. The ratio of the K12 modification in diabetic to lean animals is almost twofold, while the ratio of antioxidant-treated diabetic animals to their diabetic controls is almost 1. In contrast, the ratio for the ALE adduct at K68 between diabetic and lean control animals was almost 20-fold, while the ratio between the antioxidant-treated diabetic animals and their controls was about 0.03. The two sites thus vary by 40-fold in the degree of antioxidant protection. Another example is fibrinogen alpha polypeptide isoform 1. The arginine residues were oxidized at R419 and R770. For R770, the ratio of diabetic to lean was 5.9, while the ratio of the antioxidant-treated diabetic to its control was 0.2. In contrast, the ratio of diabetic to lean at residue R419 was 2, while the ratio of antioxidant-treated diabetic to diabetic controls was almost 1. Similar results were seen between different proteins oxidized by the same mechanism (e.g., AGE adduct formation) in murinoglobulin 2, albumin, hemoglobin, fatty acid binding protein, inter-alpha-inhibitor H4 heavy chain, and Ig gamma-2A. Overall, the significance of this analytical approach is that antioxidant efficacy can potentially be evaluated with blood samples by identifying the proteins damaged in OS, quantifying both the nature and the extent of individual protein oxidation, and identifying whether the probable source of OS is (i) ROS-initiated oxidative cleavage of amino acids, (ii) glycation, or (iii) addition of lipid peroxidation products. The major mechanisms affected were advanced lipid peroxidation end products followed by AGE products.
Not only will this level of differentiation be of great value in quantitatively assessing OS, but it will also provide a means to evaluate a wide variety of antioxidants. Over the next decade, we expect that redox proteomics techniques will be used extensively to study the efficacy of antioxidants. New plasma methods will still need to be developed to allow studying the efficacy of antioxidants at the organ level. The use of larger samples, specific antibodies, and the evolution of more sensitive mass spectrometers will allow this to happen.

REFERENCES

Amadori M (1925a). Hydrated mesotartaric acid. Atti della Accademia Nazionale dei Lincei, Classe di Scienze Fisiche, Matematiche e Naturali, Rendiconti 1:244–246.
Amadori M (1925b). Products of condensation between glucose and p-phenetidine. I. Atti della Accademia Nazionale dei Lincei 2:337–342.
Ames BN, Shigenaga MK, Hagen TM (1993). Oxidants, antioxidants, and the degenerative diseases of aging. Proceedings of the National Academy of Sciences of the United States of America 90(17):7915–7922.
Ames BN, Gold LS, Willett WC (1995). The causes and prevention of cancer. Proceedings of the National Academy of Sciences of the United States of America 92(12):5258–5265.
Amici A, Levine RL, Tsai L, Stadtman ER (1989). Conversion of amino acid residues in proteins and amino acid homopolymers to carbonyl derivatives by metal-catalyzed oxidation reactions. Journal of Biological Chemistry 264(6):3341–3346.
Barnes S, Peterson G, Grubbs C, Setchell K (1994). Potential role of dietary isoflavones in the prevention of cancer. Advances in Experimental Medicine and Biology 354:135–147.
Barreiro E, Gea J, Matar G, Hussain SN (2005). Expression and carbonylation of creatine kinase in the quadriceps femoris muscles of patients with chronic obstructive pulmonary disease. American Journal of Respiratory Cell and Molecular Biology 33(6):636–642.

Berlett BS, Stadtman ER (1997). Protein oxidation in aging, disease, and oxidative stress. Journal of Biological Chemistry 272(33):20313–20316.
Bolgar MS, Yang C, Gaskell SJ (1996). First direct evidence for lipid/protein conjugation in oxidized human low density lipoprotein. Journal of Biological Chemistry 271(45):27999–28001.
Brancia FL, Bereszczak JZ, Lapolla A, Fedele D, Baccarin L, Seraglia R, Traldi P (2006). Comprehensive analysis of glycated human serum albumin tryptic peptides by off-line liquid chromatography followed by MALDI analysis on a time-of-flight/curved field reflectron tandem mass spectrometer. Journal of Mass Spectrometry 41(9):1179–1185.
Brock JWC, Hinton DJ, Cotham WE, Metz TO, Thorpe SR, Baynes JW, Ames JM (2003). Proteomic analysis of the site specificity of glycation and carboxymethylation of ribonuclease. Journal of Proteome Research 2(5):506–513.
Brownlee M (2005). The pathobiology of diabetic complications: a unifying mechanism. Diabetes 54(6):1615–1625.
Bruenner BA, Jones AD, German JB (1995). Direct characterization of protein adducts of the lipid peroxidation product 4-hydroxy-2-nonenal using electrospray mass spectrometry. Chemical Research in Toxicology 8(4):552–529.
Castegna A, Aksenov M, Aksenova M, Thongboonkerd V, Klein JB, Pierce WM, Booze R, Markesbery WR, Butterfield DA (2002). Proteomic identification of oxidatively modified proteins in Alzheimer’s disease brain. Part I: creatine kinase BB, glutamine synthase, and ubiquitin carboxy-terminal hydrolase L-1. Free Radical Biology and Medicine 33(4):562–571.
Chavez J, Wu J, Han B, Chung WG, Maier CS (2006). New role for an old probe: affinity labeling of oxylipid protein conjugates by N-aminooxymethylcarbonylhydrazino d-biotin. Analytical Chemistry 78(19):6847–6854.
Cheal SM, Ng M, Barrios B, Miao Z, Kalani AK, Meares CF (2009). Mapping protein-protein interactions by localized oxidation: consequences of the reach of hydroxyl radical. Biochemistry 48(21):4577–4586.
Christen Y (2000). Oxidative stress and Alzheimer disease. The American Journal of Clinical Nutrition 71(2):621S–629S.
Chung W, Miranda CL, Maier CS (2008). Detection of carbonyl-modified proteins in interfibrillar rat mitochondria using N-aminooxymethylcarbonylhydrazino-D-biotin as an aldehyde/keto-reactive probe in combination with Western blot analysis and tandem mass spectrometry. Electrophoresis 29(6):1317–1324.
Cocklin RR, Zhang Y, O’Neill KD, Chen NX, Moe SM, Bidasee KR, Wang M (2003). Identity and localization of advanced glycation end products on human beta 2-microglobulin using matrix-assisted laser desorption/ionization time-of-flight mass spectrometry. Analytical Biochemistry 314(2):322–325.
Cos P, Rajan P, Vedernikova I, Calomme M, Pieters L, Vlietinck AJ, Augustyns K, Haemers A, Vanden Berghe D (2002). In vitro antioxidant profile of phenolic acid derivatives. Free Radical Research 36(6):711–716.
Dalle-Donne I, Rossi R, Giustarini D, Milzani A, Colombo R (2003). Protein carbonyl groups as biomarkers of oxidative stress. Clinica Chimica Acta 329(1-2):23–38.
Dalle-Donne I, Scaloni A, Butterfield DA, editors (2006). Proteins as sensitive biomarkers of human conditions associated with oxidative stress. In: Redox Proteomics. Hoboken, NJ: John Wiley & Sons. p 487–525.

Diaz MN, Frei B, Vita JA, Keaney JF Jr (1997). Antioxidants and atherosclerotic heart disease. The New England Journal of Medicine 337(6):408–416.
Eliuk SM, Renfrow MB, Shonsey EM, Barnes S, Kim H (2007). Active site modifications of the brain isoform of creatine kinase by 4-hydroxy-2-nonenal correlate with reduced enzyme activity: mapping of modified sites by Fourier transform-ion cyclotron resonance mass spectrometry. Chemical Research in Toxicology 20(9):1260–1268.
Fenaille F, Tabet JC, Guy PA (2002). Immunoaffinity purification and characterization of 4-hydroxy-2-nonenal- and malondialdehyde-modified peptides by electrospray ionization tandem mass spectrometry. Analytical Chemistry 74(24):6298–6304.
Fenaille F, Guy PA, Tabet JC (2003). Study of protein modification by 4-hydroxy-2-nonenal and other short chain aldehydes analyzed by electrospray ionization tandem mass spectrometry. Journal of the American Society for Mass Spectrometry 14(3):215–226.
Gadgil HS, Bondarenko PV, Treuheit MJ, Ren D (2007). Screening and sequencing of glycated proteins by neutral loss scan LC/MS/MS method. Analytical Chemistry 79(15):5991–5999.
Georgetti SR, Casagrande R, Di Mambro VM, Azzolini AE, Fonseca MJ (2003). Evaluation of the antioxidant activity of different flavonoids by the chemiluminescence method. American Association of Pharmaceutical Scientists 5(2):E20.
Gontarev S, Shmanai V, Frey SK, Kvach M, Schweigert FJ (2007). Application of phenylboronic acid modified hydrogel affinity chips for high-throughput mass spectrometric analysis of glycated proteins. Rapid Communications in Mass Spectrometry 21(1):1–6.
Grimsrud PA, Picklo MJ Sr, Griffin TJ, Bernlohr DA (2007). Carbonylation of adipose proteins in obesity and insulin resistance: identification of adipocyte fatty acid-binding protein as a cellular target of 4-hydroxynonenal. Molecular and Cellular Proteomics 6(4):624–637.
Halliwell B (1997). Antioxidants and human disease: a general introduction. Nutrition Reviews 55(1 Pt 2):S44–S49.
Halliwell B (2008). Are polyphenols antioxidants or pro-oxidants? What do we learn from cell culture and in vivo studies? Archives of Biochemistry and Biophysics 476(2):107–112.
Halvorsen BL, Carlsen MH, Phillips KM, Bøhn SK, Holte K, Jacobs DR Jr, Blomhoff R (2006). Content of redox-active compounds (ie, antioxidants) in foods consumed in the United States. The American Journal of Clinical Nutrition 84(1):95–135.
Han B, Stevens JF, Maier CS (2007). Design, synthesis, and application of a hydrazide-functionalized isotope-coded affinity tag for the quantification of oxylipid-protein conjugates. Analytical Chemistry 79(9):3342–3354.
Harnly JM, Doherty RF, Beecher GR, Holden JM, Haytowitz DB, Bhagwat S, Gebhardt S (2006). Flavonoid content of U.S. fruits, vegetables, and nuts. Journal of Agricultural and Food Chemistry 54(26):9966–9977.
Hermans N, Cos P, Berghe DV, Vlietinck AJ, de Bruyne T (2005). Method development and validation for monitoring in vivo oxidative stress: evaluation of lipid peroxidation and fat-soluble vitamin status by HPLC in rat plasma. Journal of Chromatography, B Analytical Technologies in the Biomedical and Life Sciences 822(1-2):33–39.
Hermans N, Cos P, Maes L, De Bruyne T, Vanden Berghe D, Vlietinck AJ, Pieters L (2007). Challenges and pitfalls in antioxidant research. Current Medicinal Chemistry 14(4):417–430.
Herrera E, Jiménez R, Aruoma OI, Hercberg S, Sánchez-García I, Fraga C (2009). Aspects of antioxidant foods and supplements in health and disease. Nutrition Reviews 67(Suppl 1):S140–S144.

Hussain SN, Matar G, Barreiro E, Florian M, Divangahi M, Vassilakopoulos T (2006). Modifications of proteins by 4-hydroxy-2-nonenal in the ventilatory muscles of rats. American Journal of Physiology 290(5, Pt. 1):L996–L1003.
Joseph JA, Denisova N, Fisher D, Bickford P, Prior R, Cao G (1998). Age-related neurodegeneration and oxidative stress: putative nutritional intervention. Neurologic Clinics 16(3):747–755.
Julka S, Regnier FE (2005). Recent advancements in differential proteomics based on stable isotope coding. Briefings in Functional Genomics and Proteomics 4(2):158–177.
Keshishian H, Addona T, Burgess M, Kuhn E, Carr SA (2007). Quantitative, multiplexed assays for low abundance proteins in plasma by targeted mass spectrometry and stable isotope dilution. Molecular and Cellular Proteomics 6(12):2212–2229.
Kristensen BK, Askerlund P, Bykova NV, Egsgaard H, Møller IM (2004). Identification of oxidised proteins in the matrix of rice leaf mitochondria by immunoprecipitation and two-dimensional liquid chromatography-tandem mass spectrometry. Phytochemistry 65(12):1839–1851.
Kuhn R, Dansl A (1936). A molecular rearrangement of N-glucosides. Berichte der Deutschen Chemischen Gesellschaft [Abteilung] B: Abhandlungen 69B:1745–1754.
Lange V, Picotti P, Domon B, Aebersold R (2008). Selected reaction monitoring for quantitative proteomics: a tutorial. Molecular Systems Biology 4:222.
Lee S, Young NL, Whetstone PA, Cheal SM, Benner WH, Lebrilla CB, Meares CF (2006). Method to site-specifically identify and quantitate carbonyl end products of protein oxidation using oxidation-dependent element coded affinity tags (O-ECAT) and nanoliquid chromatography Fourier transform mass spectrometry. Journal of Proteome Research 5(3):539–547.
Li M, Ma YB, Gao HQ, Li BY, Cheng M, Xu L, Li XL, Li XH (2008). A novel approach of proteomics to study the mechanism of action of grape seed proanthocyanidin extracts on diabetic retinopathy in rats. Chinese Medical Journal 121(24):2544–2552.
Liu Z, Minkler PE, Sayre LM (2003). Mass spectroscopic characterization of protein modification by 4-hydroxy-2-(E)-nonenal and 4-oxo-2-(E)-nonenal. Chemical Research in Toxicology 16(7):901–911.
Madian AG, Regnier FE (2010). Profiling carbonylated proteins in human plasma. Journal of Proteome Research 9(3):1330–1343.
Madian AG, Diaz-Maldonado N, Gao Q, Regnier FE (2011a). Oxidative stress induced carbonylation in human plasma. Journal of Proteomics 74(11):2395–2341.
Madian AG, Myracle AD, Diaz-Maldonado N, Rochelle NS, Janle EM, Regnier FE (2011b). Determining the effects of antioxidants on oxidative stress induced carbonylation of proteins. Analytical Chemistry 83(24):9328–9336.
Madian AG, Myracle AD, Diaz-Maldonado N, Rochelle NS, Janle EM, Regnier FE (2011c). Differential carbonylation of proteins as a function of in vivo oxidative stress. Journal of Proteome Research 10(9):3959–3972.
Maeda T, Oishi M, Ueno T, Kodera Y (2003). Detection of oxidized proteins in muscles of diabetic rats. Journal of the Mass Spectrometry Society of Japan 51(5):509–515.
McKay MJ, Sherman J, Laver MT, Baker MS, Clarke SJ, Molloy MP (2007). The development of multiple reaction monitoring assays for liver-derived plasma proteins. Proteomics: Clinical Applications 1(12):1570–1581.

Mirzaei H, Regnier F (2005). Affinity chromatographic selection of carbonylated proteins followed by identification of oxidation sites using tandem mass spectrometry. Analytical Chemistry 77(8):2386–2392.
Mirzaei H, Regnier F (2006a). Protein-RNA cross-linking in the ribosomes of yeast under oxidative stress. Journal of Proteome Research 5(12):3249–3259.
Mirzaei H, Regnier F (2006b). Identification and quantification of protein carbonylation using light and heavy isotope labeled Girard’s P reagent. Journal of Chromatography A 1134(1–2):122–133.
Mirzaei H, Regnier F (2006c). Creation of allotypic active sites during oxidative stress. Journal of Proteome Research 5(9):2159–2168.
Mirzaei H, Regnier F (2006d). Enrichment of carbonylated peptides using Girard P reagent and strong cation exchange chromatography. Analytical Chemistry 78(3):770–78.
Mirzaei H, Regnier F (2007). Identification of yeast oxidized proteins: chromatographic top-down approach for identification of carbonylated, fragmented and cross-linked proteins in yeast. Journal of Chromatography A 1141(1):22–31.
Mirzaei H, Regnier F (2008). Protein: protein aggregation induced by protein oxidation. Journal of Chromatography, B Analytical Technologies in the Biomedical and Life Sciences 873(1):8–14.
Oh-Ishi M, Ueno T, Maeda T (2003). Proteomic method detects oxidatively induced protein carbonyls in muscles of a diabetes model Otsuka Long-Evans Tokushima Fatty (OLETF) rat. Free Radical Biology and Medicine 34(1):11–22.
Oikawa S, Yamada T, Minohata T, Kobayashi H, Furukawa A, Tada-Oikawa S, Hiraku Y, Murata M, Kikuchi M, Yamashima T (2009). Proteomic identification of carbonylated proteins in the monkey hippocampus after ischemia-reperfusion. Free Radical Biology and Medicine 46(11):1472–1477.
Pierce A, Mirzaei H, Muller F, De Waal E, Taylor AB, Leonard S, Van Remmen H, Regnier F, Richardson A, Chaudhuri A (2008). GAPDH is conformationally and functionally altered in association with oxidative stress in mouse models of amyotrophic lateral sclerosis. Journal of Molecular Biology 382(5):1195–1210.
Pietta P (2000). Flavonoids as antioxidants. Journal of Natural Products 63(7):1035–1042.
Podmore ID, Griffiths HR, Herbert KE, Mistry N, Mistry P, Lunec J (1998). Vitamin C exhibits pro-oxidant properties. Nature 392(6676):559.
Poon HF, Farr SA, Thongboonkerd V, Lynn BC, Banks WA, Morley JE, Klein JB, Butterfield DA (2005). Proteomic analysis of specific brain proteins in aged SAMP8 mice treated with alpha-lipoic acid: implications for aging and age-related neurodegenerative disorders. Neurochemistry International 46(2):159–168.
Prokai L, Forster MJ (2006). Isotope labeled dinitrophenylhydrazines and methods for use. WO/2006/039456.
Rauniyar N, Stevens SM Jr, Prokai L (2007). Fourier transform ion cyclotron resonance mass spectrometry of covalent adducts of proteins and 4-hydroxy-2-nonenal, a reactive end-product of lipid peroxidation. Analytical and Bioanalytical Chemistry 389(5):1421–1428.
Reed T, Perluigi M, Sultana R, Pierce WM, Klein JB, Turner DM, Coccia R, Markesbery WR, Butterfield DA (2008). Redox proteomic identification of 4-Hydroxy-2-nonenal-modified brain proteins in amnestic mild cognitive impairment: insight into the role of lipid peroxidation in the progression and pathogenesis of Alzheimer’s disease. Neurobiology of Disease 30(1):107–120.

Requena JR, Chao CC, Levine RL, Stadtman ER (2001). Glutamic and aminoadipic semialdehydes are the main carbonyl products of metal-catalyzed oxidation of proteins. Proceedings of the National Academy of Sciences of the United States of America 98(1):69–74.
Roe MR, Xie H, Bandhakavi S, Griffin TJ (2007). Proteomic mapping of 4-hydroxynonenal protein modification sites by solid-phase hydrazide chemistry and mass spectrometry. Analytical Chemistry 79(10):3747–3756.
Santos-González M, Gómez Díaz C, Navas P, Villalba JM (2007). Modifications of plasma proteome in long-lived rats fed on a coenzyme Q10-supplemented diet. Experimental Gerontology 42(8):798–806.
Soreghan BA, Yang F, Thomas SN, Hsu J, Yang AJ (2003). High-throughput proteomic-based identification of oxidatively induced protein carbonylation in mouse brain. Pharmaceutical Research 20(11):1713–1720.
Stadtman ER, Berlett BS (1997). Reactive oxygen-mediated protein oxidation in aging and disease. Chemical Research in Toxicology 10(5):485–494.
Stadtman ER, Levine RL (2003). Free radical-mediated oxidation of free amino acids and amino acid residues in proteins. Amino Acids 25(3-4):207–218.
Stahl-Zeng J, Lange V, Ossola R, Eckhardt K, Krek W, Aebersold R, Domon B (2007). High sensitivity detection of plasma proteins by multiple reaction monitoring of N-glycosites. Molecular and Cellular Proteomics 6(10):1809–1817.
Stevens SM, Rauniyar N, Prokai L (2007). Rapid characterization of covalent modifications to rat brain mitochondrial proteins after ex vivo exposure to 4-hydroxy-2-nonenal by liquid chromatography-tandem mass spectrometry using data-dependent and neutral loss-driven MS3 acquisition. Journal of Mass Spectrometry 42(12):1599–1605.
Tanaka K, Miyake Y, Fukushima W, Sasaki S, Kiyohara C, Tsuboi Y, Yamada T, Oeda T, Miki T, Kawamura N, Sakae N, Fukuyama H, Hirota Y, Nagai M; Fukuoka Kinki Parkinson’s Disease Study Group (2011). Intake of Japanese and Chinese teas reduces risk of Parkinson’s disease. Parkinsonism and Related Disorders 17(6):446–450.
Tsujimoto K, Hayashi A, Kawai T, Matsumoto H (2007). Oxidized protein quantitation method using isotope-substituted labeling reagent and mass spectrometry. JP55617 2007111193, 20070320.
Tsutsui H, Kinugawa S, Matsushima S (2011). Oxidative stress and heart failure. American Journal of Physiology 301(6):H2181–H2190.
Wautier J, Schmidt AM (2004). Protein glycation: a firm link to endothelial cell dysfunction. Circulation Research 95(3):233–238.
Weinreb O, Amit T, Youdim MB (2007). A novel approach of proteomics and transcriptomics to study the mechanism of action of the antioxidant-iron chelator green tea polyphenol (-)-epigallocatechin-3-gallate. Free Radical Biology and Medicine 43(4):546–556.
Yong S, Takahashi M, Miyamoto Y, Suzuki K, Dohmae N, Takio K, Honke K, Taniguchi N (2003). Identification of the binding site of methylglyoxal on glutathione peroxidase: methylglyoxal inhibits glutathione peroxidase activity via binding to glutathione binding sites Arg 184 and 185. Free Radical Research 37(2):205–211.
Yoo B, Regnier FE (2004). Proteomic analysis of carbonylated proteins in two-dimensional gel electrophoresis using avidin-fluorescein affinity staining. Electrophoresis 25(9):1334–1341.
Zhang Q, Frolov A, Tang N, Hoffmann R, van de Goor T, Metz TO, Smith RD (2007a). Application of electron transfer dissociation mass spectrometry in analyses of non-enzymatically glycated peptides. Rapid Communications in Mass Spectrometry 21(5):661–666.

Zhang Q, Tang N, Brock JW, Mottaz HM, Ames JM, Baynes JW, Smith RD, Metz TO (2007b). Enrichment and analysis of nonenzymatically glycated peptides: boronate affinity chromatography coupled with electron-transfer dissociation mass spectrometry. Journal of Proteome Research 6(6):2323–2330.
Zhang Q, Tang N, Schepmoes AA, Phillips LS, Smith RD, Metz TO (2008). Proteomic profiling of nonenzymatically glycated proteins in human plasma and erythrocyte membranes. Journal of Proteome Research 7(5):2025–2032.

5 PROTEOMICS IN FOOD SCIENCE

José M. Gallardo, Mónica Carrera, and Ignacio Ortea

5.1 PROTEOMICS

Foodomics is a recently defined discipline that studies food and nutrition through the application of advanced, high-throughput, biology-related technologies (referred to as “omics”) to improve consumer well-being, health, and confidence (Cifuentes, 2009; Herrero et al., 2012). These technological approaches include genomics, transcriptomics, proteomics, and metabolomics. This chapter reviews the current and potential applications of proteomics in food science.

As a discipline, proteomics is defined as the large-scale analysis of proteins in a particular biological system at a particular moment in time (Pandey and Mann, 2000). Proteins play crucial roles in almost every biological process. A proteome reflects the biological context of a particular biological system; it is highly dynamic and changes constantly in response to different stimuli, including nutrition. Proteomics covers not only the structure and function of proteins but also their modifications, their interactions, their intracellular location, and the quantification of their abundance. The global profiling of proteins in food science offers multiple applications that can be divided into three main topics: (i) quality control and food safety, (ii) food processing studies, and (iii) nutritional aspects and characterization of new food ingredients with beneficial effects on human health.

Mass spectrometry (MS), mainly matrix-assisted laser desorption/ionization–time of flight (MALDI–TOF) and electrospray-ion trap (ESI-IT) mass spectrometry, is recognized as an indispensable tool for proteomic studies (Aebersold and Mann, 2003). The history of proteomics, however, began in the 1970s with the development of two-dimensional gel electrophoresis (2-DE), which provided the first method for displaying thousands of proteins on a single gel (Klose, 1975; O’Farrell, 1975). Currently, the bioinformatic treatment of data has increased the scale of proteomic tools, which represent a powerful strategy for high-throughput protein and peptide identification and quantification.

The recent success of proteomic methodologies makes them a promising strategy for food science studies, where research institutions, industries, agencies, and regulatory laboratories are combining efforts to acquire essential information on food composition, quality, safety, and biological activity. To this end, this chapter provides a comprehensive overview of the state of the art and the future outlook for proteomic approaches in food science.

The most common proteomic workflow, referred to as the bottom-up approach, is emphasized in the following sections. In this approach, the protein(s) of interest are converted into peptides using enzymes such as trypsin, and the resulting peptides are analyzed by MS (Aebersold and Mann, 2003). An alternative approach characterizes fragments produced from the breakup of intact proteins directly in a mass spectrometer, without the need for protein digestion. This approach, known as top-down proteomics (Kelleher, 2004), is now possible due to the high mass accuracy of new high-resolution mass spectrometers.

Foodomics: Advanced Mass Spectrometry in Modern Food Science and Nutrition, First Edition. Edited by Alejandro Cifuentes. © 2013 John Wiley & Sons, Inc. Published 2013 by John Wiley & Sons, Inc.

5.1.1 Bottom-Up Proteomic Approach

The greatest challenge for proteomic technology is the inherently complex nature and dynamic range of cellular proteomes. A dynamic range of 10^12 has been estimated for protein concentrations in the human plasma proteome (Jacobs et al., 2005). Because of this wide range, low-abundance proteins can be reached only by depleting the most abundant proteins (Bellei et al., 2011) or by selectively enriching the low-abundance proteins (Bandow, 2010). After protein depletion and/or enrichment, further separation is performed at the protein and/or peptide level based on gel electrophoresis or liquid chromatography (LC). 2-DE is a conventional gel-based proteomic method for the separation of proteins in biological samples. This method separates proteins based on two properties: isoelectric point (pI) and molecular weight (Mr). The proteins are visualized and quantified after staining with a staining reagent (Miller et al., 2006). In the gel-based approach, the resulting spots of interest are excised and hydrolyzed by specific proteases, such as trypsin, leading to a specific set of peptides for each spot. The resulting peptides are ultimately identified by MS. Gel-based proteomics remains the most widely applied method in proteome analysis due to its high protein resolution and its minimal requirements for apparatus and bioinformatic tools. In addition, this method is the most powerful option for nonmodel organisms (e.g., seafood samples), in which the identification of proteins is based on comparing peptides from the proteins of interest with orthologous proteins from other species or by means of de novo MS sequencing strategies. However, gel-based methods still have some limitations, such as the poor separation of hydrophobic and poorly soluble proteins and the limited sensitivity of the available detection methods.
In the gel-free approach, also referred to as the shotgun proteomic approach, the mixture of proteins is digested with proteases, and the resulting mixture of peptides is separated by LC, usually based on hydrophobicity via reverse-phase (RP) chromatography. Subsequently, the eluted peptides are analyzed in a mass spectrometer. However, the peptide mixture is very complex, and the peak capacity of one-dimensional chromatography is usually not sufficient to separate it. Therefore, multidimensional LC coupled with mass detection is widely used (Motoyama and Yates, 2008). Strong cation exchange (SCX) resin coupled with RP chromatography is the most common combination. A gel-free approach presents a number of advantages over 2-DE, such as higher sensitivity, easier automation of procedures (and therefore better reproducibility), and a reduced influence of intrinsic protein characteristics (pI, Mr, etc.). However, the disadvantages of gel-free approaches include the loss of qualitative and quantitative information on protein isoforms and the inability to perform cross-species identification on poorly sequenced genomes. The gel-based and gel-free methods will continue to prove useful in the long term.
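The protease digestion step shared by the gel-based and gel-free workflows can be illustrated with a small in silico example. The sketch below is only an approximation: it assumes the commonly used simplified trypsin rule (cleavage C-terminal to K or R, except when the next residue is P), and the function name, the `missed_cleavages` parameter, and the example sequence are hypothetical names chosen for illustration.

```python
def tryptic_digest(sequence, missed_cleavages=0):
    """Return peptides from a simplified in silico trypsin digestion.

    Assumed rule: cleave after K or R unless the next residue is P.
    """
    if not sequence:
        return []

    # Collect cleavage positions (indices at which a new peptide starts).
    sites = [0]
    for i in range(len(sequence) - 1):
        if sequence[i] in "KR" and sequence[i + 1] != "P":
            sites.append(i + 1)
    sites.append(len(sequence))

    # Fully cleaved fragments between consecutive cleavage positions.
    fragments = [sequence[a:b] for a, b in zip(sites, sites[1:])]

    # Optionally join neighbouring fragments to mimic missed cleavages.
    peptides = []
    for i in range(len(fragments)):
        for extra in range(missed_cleavages + 1):
            if i + extra < len(fragments):
                peptides.append("".join(fragments[i:i + extra + 1]))
    return peptides


if __name__ == "__main__":
    # Hypothetical sequence, not taken from any real protein.
    print(tryptic_digest("AKRPGKLMR"))
    print(tryptic_digest("AKRPGKLMR", missed_cleavages=1))
```

Allowing one or two missed cleavages is a common choice in database searches because real digestions are rarely complete; the value used here is arbitrary.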

5.1.2 Mass Spectrometry

A mass spectrometer consists of an ion source, such as electrospray ionization (ESI) (Fenn et al., 1989) or MALDI (Karas and Hillenkamp, 1988), to produce ions from the sample; one or more mass analyzers (e.g., quadrupole, TOF, IT) to separate ions based on their mass-to-charge (m/z) ratios; and a detector to register the number of ions leaving the analyzer, from which the mass spectrum is produced (Cañas et al., 2006). The development of two soft ionization techniques (ESI and MALDI) able to convey large molecules into the gas phase without affecting their integrity has revolutionized proteomics, enabling the high-throughput analysis of thousands of proteins in a single experiment. With ESI, a high voltage is applied to an acidic liquid mixture of peptides flowing through a narrow capillary, forming an electrical spray composed of small, charged drops. These microdrops evaporate rapidly until the charge density on their surface surpasses the Rayleigh limit, which causes the drops to explode and form smaller microdrops. This process is repeated several times until the ionizable analytes present in the solution are desolvated. One of the advantages of ESI is that, depending on their molecular mass and structure, the ions may acquire multiple charges. In fact, tryptic peptides usually become doubly or triply charged ions before entering the mass analyzer. MALDI relies on a laser that is fired at a sample plate containing a dried mixture of the matrix (α-cyano-4-hydroxycinnamic acid or sinapinic acid, among others) and the sample (peptide mixture). The matrix absorbs radiation from the laser, resulting in excitation of the matrix molecules and producing a dense plume of matrix and analyte molecules. The analyte molecules interact with protons from the matrix to form mainly singly charged ions that enter the mass analyzer.
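The effect of multiple charging in ESI on the observed m/z can be made concrete with a short calculation. The sketch below assumes positive-mode ions of the form [M + zH]^z+ and a rounded proton mass of 1.007276 Da; the function names and the 1500 Da example peptide are hypothetical and serve only as an illustration.

```python
PROTON_MASS = 1.007276  # Da, rounded mass of a proton (assumed constant)


def mz_from_mass(neutral_mass, charge):
    """m/z of an [M + zH]^z+ ion for a peptide of given neutral mass (Da)."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (neutral_mass + charge * PROTON_MASS) / charge


def mass_from_mz(mz, charge):
    """Neutral mass (Da) recovered from an observed m/z and its charge state."""
    return charge * (mz - PROTON_MASS)


if __name__ == "__main__":
    peptide_mass = 1500.0  # hypothetical tryptic peptide, in Da
    for z in (1, 2, 3):
        print(z, round(mz_from_mass(peptide_mass, z), 4))
```

Under these assumptions, the same peptide appears at progressively lower m/z as the charge increases, which is why multiply charged tryptic peptides fall within the working range of analyzers with a limited m/z window.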

In the MS analyzer, the formed ions are separated according to their m/z ratio. TOF, quadrupole (Q), and IT analyzers are the most commonly used in proteomic approaches. Mass spectrometers may be constructed with one or more analyzers. Instruments composed of two or more coupled mass analyzers are known as tandem mass spectrometers. A TOF analyzer consists of a high vacuum tube in which the ions, accelerated to equal kinetic energies, travel along the tube at different velocities, which are inversely proportional to the square root of their m/z values (Weickhardt et al., 1996). Ions with smaller m/z values will reach the detector sooner than those with higher m/z values. A large majority of TOF instruments use MALDI for ion formation (MALDI–TOF). The quadrupole approach uses oscillating electrical fields to selectively stabilize ions passing through a radio-frequency field created by four parallel rods (Leary and Schmidt, 1996). Inside this electrically oscillating field, ions travel along complex trajectories, and only those with stable trajectories travel along the quadrupole and reach the detector. The IT approach includes 3D, linear, and cyclotron resonance ITs. These traps are usually connected to ESI sources, which in turn are coupled to LC separations (LC–ESI-IT). These types of instruments are characterized by ion retention inside a dynamic electric field with sequential ejection into the detector according to their m/z values (Jonscher and Yates, 1997). The process of ion selection, decomposition, and fragment analysis can be repeated several times in a process known as MS^n. Tandem mass spectrometry (MS/MS) usually couples two MS stages, providing sequence information on the peptide. Standard MS/MS equipment includes ESI–MS/MS, MALDI-Q-TOF, ESI-Q-TOF, Q-IT, and MALDI–TOF/TOF.
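The TOF principle described above can be sketched numerically. For an idealized linear TOF, an ion of mass m and charge z accelerated through a potential V acquires kinetic energy zeV, so its flight time over a field-free path of length L is t = L * sqrt(m / (2 z e V)), which grows with the square root of m/z. The acceleration voltage (20 kV) and tube length (1 m) below are assumed, illustrative values, not parameters of any particular instrument.

```python
import math

ELEMENTARY_CHARGE = 1.602176634e-19  # coulombs
DALTON = 1.66053906660e-27           # kilograms per dalton


def flight_time(mz, voltage=20000.0, length=1.0):
    """Flight time (s) of an ion in an idealized linear TOF analyzer.

    mz      : mass-to-charge ratio in Da per elementary charge
    voltage : acceleration potential in volts (assumed value)
    length  : field-free drift length in metres (assumed value)
    """
    mass_per_charge = mz * DALTON / ELEMENTARY_CHARGE  # kg per coulomb
    return length * math.sqrt(mass_per_charge / (2.0 * voltage))


if __name__ == "__main__":
    for mz in (500.0, 1000.0, 2000.0, 4000.0):
        print(mz, f"{flight_time(mz) * 1e6:.2f} microseconds")
```

In this idealized model, a fourfold increase in m/z doubles the flight time; real instruments add corrections (for example, reflectrons and delayed extraction) that this sketch ignores.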
In MS/MS, a particular peptide is isolated, energy is imparted by collision with an inert gas (collision-induced dissociation, CID) or by electron transfer (electron-transfer dissociation, ETD), and this energy causes the peptide to fragment. A mass spectrum of the resulting fragments is thereby generated, making it possible to reconstruct a peptide sequence from MS/MS data (Steen and Mann, 2004).
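To show why a fragment spectrum carries sequence information, the following sketch computes the singly charged b- and y-type fragment ions that are typically the main products of CID (ETD mainly yields other ion types, which are not modeled here). The monoisotopic residue masses are standard values rounded to five decimals; the function name and the test peptide "PEPTIDE" are illustrative choices, not taken from the chapter.

```python
# Monoisotopic residue masses in Da (rounded standard values).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "I": 113.08406,
    "L": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # Da
PROTON = 1.007276   # Da


def fragment_ions(peptide):
    """Return ([b1..b(n-1)], [y1..y(n-1)]) m/z values for 1+ ions."""
    masses = [RESIDUE_MASS[aa] for aa in peptide]

    # b ions: N-terminal fragments, sum of residues plus one proton.
    b_ions, running = [], 0.0
    for m in masses[:-1]:
        running += m
        b_ions.append(running + PROTON)

    # y ions: C-terminal fragments, sum of residues plus water and a proton.
    y_ions, running = [], 0.0
    for m in reversed(masses[1:]):
        running += m
        y_ions.append(running + WATER + PROTON)

    return b_ions, y_ions


if __name__ == "__main__":
    b, y = fragment_ions("PEPTIDE")
    print("b:", [round(x, 4) for x in b])
    print("y:", [round(x, 4) for x in y])
```

Consecutive ions within one series differ by exactly one residue mass, which is the property exploited when a sequence is read back from the spectrum.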

5.1.3 Protein Identification

Currently, MS is the method of choice for protein characterization and identification. The general approach consists of comparing MS experimental data with calculated mass values obtained from entries in a sequence database. Matches are scored using a search engine such as Sequest (Eng et al., 1994) or Mascot (Perkins et al., 1999), and if the protein to be identified is contained in the database, the peptide or protein with the best score is assigned to that entry. In some approaches, the MS experimental data consist of the peptide masses obtained from the enzymatic digestion of the previously isolated unknown protein (Pappin et al., 1993). This approach, known as peptide mass fingerprinting (PMF), often uses trypsin as the protease and 2-DE as the protein isolation method. Another approach, commonly known as peptide fragmentation fingerprint (PFF), uses MS/MS fragment ion data from one or more peptides instead of the masses of the complete set of peptides from the protein; this sequences the peptide and unambiguously identifies the protein (Eng et al., 1994). A similar approach, known as peptide sequence tagging (Mann and Wilm, 1994), combines a partial sequence of the peptide (the sequence tag) with the peptide mass and the masses of the fragments preceding and trailing the sequence tag. For all of these approaches, the availability of the corresponding protein sequence in the database is of primary importance. If the database does not contain data on the unknown protein but does have data on highly homologous proteins, then the protein with the best match, that is, the one with the closest homology, is selected. Such selected proteins are usually proteins from related species. If the sequence similarity with the database proteins is too low, then peptides must be sequenced de novo (Shevchenko et al., 1997), which means that the MS/MS spectrum must be interpreted manually or by computer-assisted identification of the fragment ions whose mass differences correspond to the residue masses of the amino acids.
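The idea behind de novo interpretation, reading residues from the mass differences between consecutive fragment ions, can be sketched as follows. This is a deliberately simplified illustration: it assumes a clean, complete ladder of singly charged ions from a single series, unmodified residues, and an arbitrary tolerance of 0.02 Da. The function name and the example ladder are hypothetical; real de novo algorithms must cope with missing peaks, noise, and mixed ion series.

```python
# Monoisotopic residue masses in Da (rounded standard values).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "I": 113.08406,
    "L": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def read_ladder(ion_mzs, tol=0.02):
    """Assign a residue to each gap between consecutive ladder ions.

    Returns one label per gap: a residue letter, several letters joined
    by "/" when residues are indistinguishable by mass (e.g. I and L),
    or "?" when no single residue fits within the tolerance.
    """
    peaks = sorted(ion_mzs)
    labels = []
    for low, high in zip(peaks, peaks[1:]):
        delta = high - low
        matches = [aa for aa, mass in RESIDUE_MASS.items()
                   if abs(mass - delta) <= tol]
        labels.append("/".join(matches) if matches else "?")
    return labels


if __name__ == "__main__":
    # Hypothetical b-ion ladder (b1 to b5) of a peptide starting P-E-P-T-I.
    ladder = [98.060036, 227.102626, 324.155386, 425.203066, 538.287126]
    print(read_ladder(ladder))
```

The "I/L" ambiguity in the output reflects a real limitation: isoleucine and leucine have identical masses and cannot be distinguished by mass differences alone.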

5.1.4 Quantitative Proteomics Most biological questions require quantitative information, such as the relative change in protein amounts between different conditions or states (e.g., control vs. case) or the absolute amount of a protein in a sample. The former (relative quantification) can be achieved through various methodologies that can be classified into gel-based, label-based, and label-free approaches. Gel-based methods compare the signals of electrophoretically separated spots corresponding to proteins from different samples. Each sample, representing one of the conditions or states being compared, can be run on a separate gel, or alternatively, up to three samples can be run on the same gel using difference gel electrophoresis (DIGE) labeling and scanning technology (Unlu et al., 1997). DIGE increases confidence in the detection and quantification of differences in protein abundance and reduces the number of gels needed to perform an experiment. For the label-based approaches (Ong et al., 2002; Gygi et al., 1999; Yao et al., 2001), relative quantification is obtained from the MS read-out. Depending on the specific method, proteins or peptides are first (i) metabolically labeled, by providing the cultured cells with isotope-labeled nutrients; (ii) chemically labeled with different isotopic tags; or (iii) 18O-labeled via enzymatic digestion in 18O-labeled water (H2 18O). Quantification is based on the intensity ratio of isotope-labeled peptide pairs. Label-free methods are recent and promising alternatives that obtain MS-based quantitative information without the need for labeling or using stable isotopes. Quantification is calculated based on the comparison of peak areas or intensities of the same peptide (Chelius and Bondarenko, 2002) or on the number of identified MS/MS spectra (spectral count) of the same protein (Asara et al., 2008).
For the absolute quantification of proteins, isotope-labeled synthetic peptides are needed as internal standards for each targeted protein (Gerber et al., 2003); these methods can therefore quantify only previously known proteins. An overview of the most commonly used approaches for quantitative proteomics is shown in Table 5.1.

TABLE 5.1 Overview of the Most Common Approaches for Quantitative Proteomics Studies

Quantification  Approach     Method                                          Examples            Reference
Relative        Gel-based    Comparison of spot abundance                    DIGE                Unlu et al., 1997
                Label-based  Metabolic isotope labeling                      SILAC               Ong et al., 2002
                             Chemical isotope labeling                       ICAT                Gygi et al., 1999
                                                                             ICPL                Schmidt et al., 2005
                                                                             TMT                 Thompson et al., 2003
                                                                             iTRAQ               Ross et al., 2004
                             Enzymatic isotope labeling                      18O labeling        Yao et al., 2001
                Label-free   MS signal intensity                             SELDI-TOF MS        Vorderwulbecke et al., 2005
                                                                             LC-MS/MS XIC-based  Chelius and Bondarenko, 2002
                             Spectral counting                                                   Asara et al., 2008
Absolute                     Use of isotopically labeled synthetic peptides  AQUA                Gerber et al., 2003

DIGE, difference gel electrophoresis; SILAC, stable isotope labeling by amino acids in cell culture; ICAT, isotope-coded affinity tagging; ICPL, isotope-coded protein label; TMT, tandem mass tag; iTRAQ, isobaric tag for relative and absolute quantification; SELDI-TOF, surface-enhanced laser desorption/ionization-time of flight; XIC, extracted ion current; AQUA, absolute quantification.
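
The arithmetic behind these quantification strategies is simple and can be sketched as follows. The function names, the pseudocount used for spectral counting, and the example numbers are assumptions made for illustration; none of them is prescribed by the cited methods.

```python
def isotope_pair_ratio(light_intensity, heavy_intensity):
    """Label-based relative quantification: intensity ratio of a peptide pair."""
    return light_intensity / heavy_intensity


def spectral_count_ratio(count_a, count_b, pseudocount=1.0):
    """Label-free relative quantification from MS/MS spectral counts.

    The pseudocount (an assumption of this sketch) avoids division by zero
    when a protein is not identified in one of the two samples.
    """
    return (count_a + pseudocount) / (count_b + pseudocount)


def absolute_amount(endogenous_area, standard_area, standard_fmol):
    """Absolute quantification with a spiked isotope-labeled synthetic peptide.

    The endogenous peptide amount is the peak-area ratio to the labeled
    standard multiplied by the known amount of standard added.
    """
    return endogenous_area / standard_area * standard_fmol


if __name__ == "__main__":
    # Hypothetical values: the endogenous peak is twice the area of a 50 fmol standard.
    print(absolute_amount(2.0e6, 1.0e6, 50.0))
```

The absolute calculation presupposes that the labeled standard and the endogenous peptide ionize and fragment identically, which is why the standard must be a synthetic copy of a peptide from a protein that is already known.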

5.1.5 Posttranslational Modifications Proteins can undergo posttranslational modifications (PTMs) in response to a wide range of extra- and intracellular signals. More than 300 different PTMs have been described (Jensen, 2004). PTMs play crucial roles in regulating cell biology because they can change a protein’s physical or chemical properties, activity, location, and stability. Several proteomic approaches have been developed to identify and quantify PTMs, such as phosphorylation, acetylation, glycosylation, or oxidation. A common characteristic of PTMs is that the accompanying change in amino-acid structure produces a corresponding change in the formula weight of that amino acid relative to the original, unmodified residue. This mass change is usually the basis for the detection and characterization of PTMs by MS, usually with LC–ESI-IT–MS/MS. In addition, several methods have been developed to enrich the samples in proteins or peptides with specific PTMs; these methods include the use of anti-pY antibodies, IMAC, and TiO2 for phosphorylation (Corthals et al., 2005; Larsen et al., 2005), affinity capture with lectins for glycosylated proteins (Yang and Hancock, 2004), and resin coupled with antiacetyl-lysine for acetylated proteins (Kim et al., 2006).
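
Because detection rests on a mass change relative to the unmodified residue, a first screening step can be sketched as a lookup of the observed mass difference. The monoisotopic mass shifts below are the standard values for three common modifications; the function name, the 0.01 Da tolerance, and the example masses are assumptions of this sketch. Glycosylation is omitted because its mass shift depends on the glycan attached.

```python
# Monoisotopic mass shifts (Da) of three common PTMs.
PTM_DELTAS = {
    "phosphorylation": 79.96633,  # addition of HPO3
    "acetylation": 42.01057,      # addition of C2H2O
    "oxidation": 15.99491,        # addition of O
}


def candidate_ptms(unmodified_mass, observed_mass, tol=0.01):
    """Return the PTMs whose mass shift explains the observed difference."""
    delta = observed_mass - unmodified_mass
    return [name for name, shift in PTM_DELTAS.items() if abs(delta - shift) <= tol]


if __name__ == "__main__":
    # Hypothetical peptide of 1000.0000 Da observed at 1079.9663 Da.
    print(candidate_ptms(1000.0, 1079.9663))
```

A matching mass shift only suggests a modification; assigning it to a specific residue still requires the MS/MS fragment ions.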

5.1.6 Targeted Proteomics A targeted or hypothesis-driven proteomic approach is being increasingly used to complement discovery studies or to validate candidate proteins. Proteomics has thereby progressed from a pure discovery tool to a screening and validation tool. Targeted proteomics focuses on the detection and quantification of a specific set of peptides associated with proteins of interest. In these selective and sensitive operating modes, the MS analyzer is focused on analyzing only the compounds of interest by selected reaction monitoring (SRM) or multiple reaction monitoring (MRM) (Lange et al., 2008; Gallien et al., 2011). Monitoring transitions (suitable pairs of precursor and fragment ion m/z) constitutes a common assay for identifying and quantifying biomarkers. This system provides high analytical reproducibility, a good signal-to-noise (S/N) ratio, and an increased dynamic range (Lange et al., 2008). Although SRM and MRM performed on a triple-quadrupole (QQQ) are the most sensitive scanning modes (low attomolar), with a broad dynamic range (up to 5 orders of magnitude), their optimization for a definite SRM/MRM assay is time-consuming. More importantly, these scanning procedures do not register a complete MS/MS spectrum. A molecule’s MS/MS spectrum is of paramount importance when confirming its structure. New routines, such as MRM-triggered MS/MS using hybrid Q-IT mass spectrometers, have been explored to solve this problem (Unwin et al., 2009). When a significant signal for a specific MRM transition is detected in these assays, the instrument switches the third Q automatically to the IT mode, collecting the full MS/MS spectrum. Selected MS/MS ion monitoring (SMIM) in an IT is another scanning mode that provides sensitive monitoring of specific molecules and produces complete structural information (Jorge et al., 2007).
The high scanning speed attainable in the IT mode produces MS/MS spectra in a fraction of a second and registers the information provided by the complete spectrum. High-confidence MS/MS spectra are recorded due to signal averaging during acquisition. The utility of this operating mode for authenticating various seafood products has been demonstrated in several published studies (Jorge et al., 2007; Carrera et al., 2011; Ortea et al., 2011).
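
A transition, as defined above, is a pair of precursor and fragment ion m/z values. The following sketch computes candidate transitions for a peptide from monoisotopic residue masses, using singly charged y ions as fragments. The peptide SAMPLER is a made-up sequence, and the function names and the choice of y ions are assumptions of this example, not a recommendation of which transitions to monitor in a real assay.

```python
# Monoisotopic residue masses (Da) of the 20 standard amino acids.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L": 113.08406, "I": 113.08406, "N": 114.04293,
    "D": 115.02694, "Q": 128.05858, "K": 128.09496, "E": 129.04259, "M": 131.04049,
    "H": 137.05891, "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER, PROTON = 18.01056, 1.00728


def precursor_mz(peptide, charge):
    """m/z of the intact peptide carrying `charge` protons."""
    neutral = sum(MONO[aa] for aa in peptide) + WATER
    return (neutral + charge * PROTON) / charge


def y_ion_mz(peptide, n):
    """m/z of the singly charged y ion containing the last n residues."""
    return sum(MONO[aa] for aa in peptide[-n:]) + WATER + PROTON


def transitions(peptide, charge=2):
    """List of (label, precursor m/z, fragment m/z) for y1 .. y(n-1)."""
    pre = precursor_mz(peptide, charge)
    return [(f"y{n}", pre, y_ion_mz(peptide, n)) for n in range(1, len(peptide))]


if __name__ == "__main__":
    for label, q1, q3 in transitions("SAMPLER"):
        print(f"{label}: Q1 {q1:.4f} -> Q3 {q3:.4f}")
```

On a triple quadrupole, the first value of each pair would be set on the first quadrupole and the second on the third; choosing which of these candidates respond best is the time-consuming optimization step mentioned in the text.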

5.1.7 Proteomics and Systems Biology The ultimate goal of proteomics is to understand how proteins are integrated and participate in biological systems. To achieve this goal, a new approach known as systems biology is used. This approach uses information from a number of advanced biology-related fields (genomics, transcriptomics, proteomics, and metabolomics) together with the ability of computational modeling to understand and predict the properties of biological systems using a holistic point of view. In the context of food and nutritional sciences, we envision that systems analysis of normal and nutrient-perturbed signaling networks will lead to a future in which people’s health will be improved through predictive and preventive nutrition.

5.2 APPLICATIONS IN FOOD SCIENCE

The close relationship between nutrition and health has led to changes in the habits of consumers who demand products that meet their dietary and nutritional preferences. In light of this demand, both academia and food industries face a new challenge: the need to develop strategies and products that are not only safe but also contribute to the maintenance of good health and that can even prevent the development of specific diseases. Advances in proteome analysis offer multiple applications that can be used to meet the challenge. These applications can be divided into three main topics:
• Quality and food safety control
• Food-processing studies
• Nutritional aspects and characterization of new food ingredients with beneficial effects on human health.

In this chapter, we illustrate how proteomics tools can be applied to these three main food science topics. The study of food allergens using proteomic tools is described in another chapter of this book.

5.3 SPECIES IDENTIFICATION AND GEOGRAPHIC ORIGIN

5.3.1 Food Labeling, Traceability, and Proteomics The authentication of food components is one of the major food quality and safety issues that demands attention from consumers and the food industry, and consequently from the authorities. Food products can be adulterated in various ways, which leads to mislabeling. An ingredient may be substituted, partially or entirely, with another similar but inferior ingredient. Undeclared ingredients may be present in the food. False claims may be made regarding geographic or production origin. Substitution of highly valuable species, which are in greater demand due to their superior organoleptic features, by phylogenetically related but cheaper species decreases the final product quality. In addition to the commercial fraud that these adulterations represent, the nondeclared introduction of potentially harmful food ingredients, such as allergenic or toxic compounds, represents a safety risk (Lockley and Bardsley, 2000; Mermelstein, 1993; Sotelo et al., 1993). Adulteration can also reduce the effectiveness of wildlife conservation and management programs that help protect overexploited and endangered species (Civera, 2003). Moreover, food information is directly related to product choice, which impacts consumers’ lifestyles, for example, when selecting food for vegetarian diets or religious concerns. In response to the increasing concern about food composition from consumers and industry, regulations have been implemented to ensure that correct information is provided and species substitution is prevented, thus guaranteeing market transparency and protecting consumers against mislabeling. In the United States, food substitution has been prohibited by the Federal Food, Drug and Cosmetic Act, Section 403, Misbranded Food (Food and Drug Administration, 2006), which declares that a food shall be deemed misbranded if it is offered for sale under the name of another food.
In the European Union, the General Food Law (European Parliament, 2002) seeks to prevent fraudulent or deceptive practices, such as adulteration of food and practices that may mislead consumers, to provide consumers with the basis for making informed choices about the food they eat. Numerous regulations regarding the labeling and traceability of food and feed have been promulgated. For instance, the Council Regulation (EC) No 104/2000 (European Parliament, 1999) on the common organization of the markets in fishery and aquaculture products advises that seafood products should be labeled indicating (i) the commercial designation of the species, (ii) the production method (wild or farmed), and (iii) the geographic origin. The commercial designation, scientific name, production method, and catch area for the fish must be available at each stage of the marketing chain, ensuring traceability. These requirements have been implemented in each of the European States, ensuring the correct labeling and identification of food products (Ministry of Agriculture, Fisheries and Food, Spain, 2003, 2004a, 2004b). Food species identification has traditionally relied on morphological analysis. However, morphological features are particularly difficult to differentiate among certain species, such as seafood species, due to their phenotypic similarities. In the case of processed food products, differentiation of the constituents can be even more difficult, as the external features are often removed. Therefore, there is a pressing need for fast and reliable molecular identification methods that provide authorities and food industries with the tools to comply with labeling and traceability requirements at both the species and origin levels, thus ensuring product quality and protecting consumer value. Electrophoretic and immunological protein-based methods have been extensively used for the detection and authentication of food species (Lockley and Bardsley, 2000; Rehbein, 1990). 
For instance, the Association of Official Analytical Chemists (AOAC) adopted isoelectric focusing (IEF), a method based on the separation of proteins on a polyacrylamide gel using a pH gradient, as the only official validated method for species identification (AOAC, 1984). Limitations of these traditional methods, such as the lack of stability of some proteins during food processing, cross-reactivity between closely related species, and labor intensiveness, were solved with the introduction of methods based on DNA analysis. Various DNA targets, mainly mitochondrial DNA (mtDNA), have been used in PCR-based studies on food authenticity (Rasmussen and Morrissey, 2008; Mafra et al., 2008). Due to recent advances in MS, proteomic tools have recently been proposed as fast, sensitive, and high-throughput approaches for the assessment of the authenticity and traceability of species in seafood products (Piñeiro et al., 2003; Martinez and Friis, 2004). MS is being used both for the discovery of species-specific peptide markers in reference samples and for the subsequent detection of the diagnostic peptides in real samples (López et al., 2002a; Carrera et al., 2007). Figure 5.1 shows the proteomic approaches that are being used for the discovery, characterization, and monitoring of species-specific peptides for identification purposes. Proteomic tools take advantage of the high-throughput capacity of MS to achieve a fast, robust, and sensitive protein and peptide characterization, detection, and quantification. Proteomic-based methods can be automated to produce fast and reproducible results that allow a high-throughput analysis of foodstuffs. These methodologies can be applied to species that are poorly characterized in genomic databases, avoiding the time-consuming steps of DNA amplification and sequencing. The identification and characterization of species-specific diagnostic peptides is the first step toward designing fast and cheap detection analysis, such as antibody-based assays and MRM MS detection methods. Table 5.2 summarizes the application of proteomics to the assessment of species authenticity in food.

FIGURE 5.1 Proteomic approaches considered for the (a) identification and characterization and (b) detection and quantification of species-specific diagnostic peptides for authentication purposes.

5.3.2 Fish 2-DE has proven to be a valuable tool for discriminating between different gadoid fish species, namely, cod, pollock, blue whiting, and five different hake species (Piñeiro et al., 1998), and between nine flat fish species, namely, megrim, turbot, halibut, common dab, flounder, plaice, witch, witch flounder, and sole (Piñeiro et al., 1999). 2-DE can even differentiate between species that show identical IEF protein patterns. The differential classification of all previously mentioned species has been based on the specific 2-DE profiles of the parvalbumin fractions, which are small, heat-resistant proteins found in the low-molecular weight range and the 3.5–5.2 pH interval. Therefore, the methodology described may be useful in identifying fish species in heat-treated products. The 2-DE protein patterns of the water-soluble extracts of five hake species (European, Southern, Argentinian, Chilean, and Cape hakes) show species-specific profiles in the parvalbumin fraction and in another protein cluster with higher molecular mass and pI range, identified as nucleoside diphosphate kinase (NDK) by ESI-IT (Piñeiro et al., 2001). MALDI–TOF PMF analysis of these two protein groups classified the hake species into two groups: the East Atlantic group and the West Atlantic group. The analysis also specifically identified the Southern hake Merluccius australis. Martínez and Friis (2004) demonstrated the discriminative potential of 2-DE in identifying different muscle tissues (fast skeletal, slow skeletal, and cardiac muscles) in five fish species (cod, saithe, haddock, mackerel, and capelin) according to the electrophoretic patterns of myosin light chains (MLC) 1, 2, and 3. Moreover, MLC were shown to work as markers for breeding stock, as the differentiation of two stocks of Arctic char (Hammerfest and Sila Lake strains) was achieved.
2-DE and MALDI–TOF analysis of parvalbumin fractions were used for the classification of 10 hake species (European, Cape, Benguela, Deep-water, Patagonian, Peruvian, Southern or Austral, Pacific, and Silver hakes) and two populations of grenadier (namely, Blue and Patagonian grenadiers) (Carrera et al., 2006). Parvalbumin PMF clearly differentiated the genera Merluccius and Macruronus and classified hakes into two groups depending on geographical origin, namely, Euro-African and American. Moreover, specific peptides provided clear individual identification for most of the species. All of these hake species, including the Senegalese hake, were differentiated by PMF and de novo peptide sequencing of the MS/MS spectra of the protein NDK B (Carrera et al., 2007) after 2-DE isolation. The species-specific characterized peptides allowed for a gel-free LC–MS classification method using the SMIM scanning mode. Once peptide biomarkers have been characterized, identification can be achieved much more rapidly by directly searching for the specific known peptides in the protein extract and avoiding the laborious and time-consuming 2-DE isolation step. When dealing with complex samples, the scanning mode most suitable for detecting and quantifying known peptides is MRM, or when working with an ion trap, it is its variant SMIM (Jorge et al., 2007). A fast strategy for monitoring parvalbumin species-specific peptides from the 11 previously mentioned hake species has been described (Carrera et al., 2011), which combines fast sample preparation using high-intensity focused ultrasound (HIFU) trypsin digestion and the peptide detection ability of MS working in the SMIM scanning mode. Because parvalbumins are thermostable proteins, the workflow identifies species even in processed and precooked products. Mazzeo et al. (2008) obtained specific MALDI–TOF MS profiles from 25 different fish species, representing the most complete fish authentication study in terms of the numbers of species included.
Several of the proteins that generated the specific signals were identified as parvalbumins. The major advantage of the proposed method is the speed, due to the fast sample preparation step that avoids protein digestion. Recently, the comparison of the 2-DE sarcoplasmic protein patterns of three tuna species (Thunnus thynnus, Thunnus albacares, and Thunnus alalunga) revealed interspecies differences (Pepe et al., 2010). A protein with an Mr of 70 kDa, present only in T. thynnus and identified by PMF as triose phosphate isomerase, was proposed as a potential specific marker for these species.

TABLE 5.2 Summary of Proteomic-Based Methods Applied to the Authentication of Species in Food

Main Technique                    Species/Food Products                            Target                                     Reference
IEF + MS                          14 shrimp and prawn species                      SCPs                                       Ortea et al., 2010
2-DE                              8 gadoid fishes                                  Parvalbumins                               Piñeiro et al., 1998
                                  9 flat fishes                                    Sarcoplasmic proteins                      Piñeiro et al., 1999
                                  5 fish species                                   Myosin light chains                        Martinez and Friis, 2004
                                  3 Thunnus species                                Triose phosphate isomerase                 Pepe et al., 2010
ESI MS                            Pig, beef, sheep, and horse meats                Hemoglobin/myoglobin                       Taylor et al., 1993
HPLC–ESI MS                       Cow and goat milk                                β-lactoglobulin                            Chen et al., 2004
CE–MS                             Cow, goat, and sheep milk                        β-lactoglobulin                            Müller et al., 2008
MALDI–TOF protein fingerprinting  Cow, ewe and buffalo milk                        α-lactalbumin and β-lactoglobulin          Cozzolino et al., 2001
                                  Mozzarella cheese                                α-lactalbumin and β-lactoglobulin          Cozzolino et al., 2002
                                  25 commercial fish species                       Sarcoplasmic proteins                      Mazzeo et al., 2008
                                  Honey                                            -                                          Wang et al., 2009
                                  Cow, sheep, goat, and buffalo milk               Caseins                                    Cuollo et al., 2010
MALDI–TOF PMF                     5 hake species                                   Parvalbumins and NDKA                      Piñeiro et al., 2001
                                  3 European mussels                               TM                                         López et al., 2002
                                  10 Merlucciidae hake and grenadier species       Parvalbumins                               Carrera et al., 2006
                                  11 Merlucciidae hake and grenadier species       NDKB                                       Carrera et al., 2007
                                  Penaeus monodon, Fenneropenaeus indicus          AK                                         Ortea et al., 2009a
                                  6 shrimp and prawn species                       AK                                         Ortea et al., 2009b
                                  7 shrimp and prawn species                       AK                                         Ortea et al., 2009c
                                  Pandalus borealis (Northern shrimp)              AK                                         Pascoal et al., 2012
MS/MS                             5 hake species                                   Parvalbumins and NDKA                      Piñeiro et al., 2001
                                  Soybean in meat products                         Glycinin, β-conglycinin                    Leitner et al., 2006
                                  11 Merlucciidae commercial hakes and grenadiers  NDKB                                       Carrera et al., 2007
                                  Bovine gelling agent in meat                     Fibrinopeptides                            Grundy et al., 2007
                                  7 shrimp and prawn species                       AK                                         Ortea et al., 2009c
                                  Soy and pea in skimmed-milk powder               Glycinin, β-conglycinin, legumin, vicilin  Cordawener et al., 2009
                                  Sheep milk in goat and cow cheeses               Casein                                     Guarino et al., 2010
                                  Pandalus borealis (Northern shrimp)              AK                                         Pascoal et al., 2012
HPLC-ESI MS + XIC                 Cow, sheep, goat, and buffalo milk               Caseins                                    Cuollo et al., 2010
SIM + MS/MS                       3 European mussels                               TM                                         López et al., 2002a
MRM                               11 Merlucciidae hake and grenadier species       NDKB                                       Carrera et al., 2007
                                  7 shrimp and prawn species                       AK peptides                                Ortea et al., 2011a
                                  11 Merlucciidae hake and grenadier species       Parvalbumins                               Carrera et al., 2011
SIM + AQUA                        Chicken in meat preparations                     Myosin light chain 3                       Sentandreu et al., 2010

IEF, isoelectric focusing; MS, mass spectrometry; 2-DE, two-dimensional electrophoresis; ESI, electrospray ionization; HPLC, high-performance liquid chromatography; CE, capillary electrophoresis; MALDI–TOF, matrix-assisted laser desorption/ionization-time of flight; PMF, peptide mass fingerprinting; MS/MS, tandem mass spectrometry; XIC, extracted ion current; SIM, selected ion monitoring; MRM, multiple reaction monitoring; AQUA, absolute quantification; SCPs, sarcoplasmic calcium-binding proteins; AK, arginine kinase; NDKA, nucleoside diphosphate kinase A; NDKB, nucleoside diphosphate kinase B; TM, tropomyosin.

5.3.3 Shellfish Changes in protein expression were observed by 2-DE between two related species of marine mussels: Mytilus edulis and Mytilus galloprovincialis (López et al., 2002b). However, these quantitative differences were attributed to environmental variations and genetic differences. In a subsequent study (López et al., 2002a), tropomyosin peptides specific to three European marine mussel species (M. edulis, M. galloprovincialis, and Mytilus trossulus) were detected by MALDI–TOF PMF and sequenced by nESI-IT MS/MS. These species-specific markers were validated by HPLC–ESI-IT MS using the selected ion monitoring (SIM) configuration. In this mode, the IT detector was programmed to continuously monitor the previously identified selected peptides. Among fishery products, decapod crustaceans are of major commercial value, especially those within the superfamily Penaeoidea (penaeid shrimps and prawns), which represent important resources for both fisheries and aquafarming facilities worldwide and account for more than 17% of the global consumption of seafood products (Yamauchi et al., 2004; FAO, 2009). Due to phenotypic similarities among penaeid species, differentiation between them is difficult even when there is no processing and the external features remain intact. The ability of MALDI–TOF MS PMF to discriminate between two very closely related penaeid species (Penaeus monodon and Fenneropenaeus indicus) has been tested (Ortea et al., 2009a). Fingerprints of the sarcoplasmic protein arginine kinase (AK) presented two peaks specific to P. monodon, which may be useful as markers for this species. In a subsequent study (Ortea et al., 2009b), a combination of AK isolation by 2-DE and PMF provided unequivocal discrimination between six shrimp species of commercial value, including the giant tiger prawn (P. monodon) and the northern shrimp (Pandalus borealis), the two most traded species of crustaceans.
In this study, a mathematical method for drawing a PMF spectra dendrogram was reported. In addition to helping classify unknown samples, the generated dendrograms were used to infer taxonomic relationships between different specimens, thus opening the way for further studies on phyloproteomics (Ortea et al., 2009b). Because AK interspecific variability was high enough to use as a biomarker for shrimp species identification, further identification and characterization of the diagnostic peptides from the AK from seven of the most commercially valuable shrimp and prawn species was performed by PMF and MS/MS, with subsequent database searches and/or de novo sequence interpretation (Ortea et al., 2009c). Several species-specific peptides were reported, together with their MS/MS fragmentation spectra. A subsequent study (Pascoal et al., 2012) characterized additional P. borealis-specific peptides. Once identified and characterized, differential peptides may be used in immunoassay kits for the sensitive and inexpensive detection of each species and in the development of fast and highly specific MRM MS assays. In a recent work (Ortea et al., 2011), a shotgun proteomics approach was applied to the detection of some previously characterized AK species-specific peptides (Ortea et al., 2009c). A methodology that combined ultrasound-assisted tryptic digestion of protein extracts, HPLC separation, and MRM-SMIM MS detection was able to differentiate seven shrimp and prawn species in less than 90 minutes, timed from the arrival of the sample to the identification of the species. An alternative offline analysis of the tryptic digests, although more laborious, was able to identify the closely related species in less time (Ortea et al., 2011). To the best of our knowledge, these approaches are the fastest methods to date for the unambiguous authentication of species in foods.
In addition to AK, sarcoplasmic calcium-binding proteins (SCPs) have been reported as potential species-specific biomarkers in a study that used IEF of sarcoplasmic proteins for the unambiguous identification of 14 shrimp and prawn species of commercial value (Ortea et al., 2010a). Interestingly, both AK and SCPs have been described as allergens (Yu et al., 2003; Shiomi et al., 2008), and, therefore, an analysis targeting these proteins would have two applications: species identification and food safety. Such an analysis may detect the allergen, help manufacturers and control authorities to identify contaminants in the production line, and may even assist in quantifying the allergenic protein (Abdel Rahman et al., 2010).
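
The idea of deriving a dendrogram from PMF spectra can be illustrated with a generic sketch. This is not the mathematical method of Ortea et al. (2009b), whose details are not reproduced in this chapter; it is a plain average-linkage clustering on a Jaccard-type distance between peak lists, and the sample names, peak masses, and 0.5 Da tolerance are hypothetical.

```python
def shared_peaks(a, b, tol=0.5):
    """Number of peaks matched between two peak lists (symmetric count)."""
    ab = sum(any(abs(x - y) <= tol for y in b) for x in a)
    ba = sum(any(abs(x - y) <= tol for x in a) for y in b)
    return min(ab, ba)


def distance(a, b, tol=0.5):
    """Jaccard-type distance: 0 for identical peak lists, 1 for no shared peak."""
    shared = shared_peaks(a, b, tol)
    union = len(a) + len(b) - shared
    return 1.0 - shared / union if union else 0.0


def cluster_spectra(spectra, tol=0.5):
    """Average-linkage agglomerative clustering; returns a nested-tuple tree."""
    clusters = [(name, [name]) for name in spectra]  # (subtree, member names)

    def avg_distance(c1, c2):
        pairs = [(i, j) for i in c1[1] for j in c2[1]]
        return sum(distance(spectra[i], spectra[j], tol) for i, j in pairs) / len(pairs)

    while len(clusters) > 1:
        # Find the two closest clusters and merge them into one subtree.
        i, j = min(
            ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))),
            key=lambda ij: avg_distance(clusters[ij[0]], clusters[ij[1]]),
        )
        merged = ((clusters[i][0], clusters[j][0]), clusters[i][1] + clusters[j][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return clusters[0][0]


if __name__ == "__main__":
    # Hypothetical PMF peak lists (m/z) for three specimens.
    toy = {
        "specimen_1": [1000.5, 1200.6, 1500.7],
        "specimen_2": [1000.5, 1200.6, 1600.8],
        "specimen_3": [900.4, 1100.5, 1700.9],
    }
    print(cluster_spectra(toy))
```

Specimens that share most of their peptide masses are joined first, which is the property that allows such trees to be read as a rough indication of relatedness.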

5.3.4 Meat and Other Food Products In contrast to the increasing use of proteomics for the authentication of seafood species, the application of MS-based proteomics to the authentication of species in other types of foods has been limited to date. Taylor et al. (1993) highlighted the potential of MS as an analytical method for meat species identification. Using ESI MS, the researchers were able to identify the origin of purified hemoglobin and myoglobin from various sources (pig, beef, sheep, and horse). Grundy et al. (2007) reported a method for detecting the addition of 5% bovine gelling agents to different food matrices. This method was based on the MS/MS-based detection of the species-specific fibrinopeptides, released from fibrinogen during gelling. Sentandreu et al. (2010) developed a sensitive LC–MS/MS methodology for detecting chicken meat in other meat mixtures. After MLC-3 enrichment by OFFGEL fractionation, two chicken-specific tryptic peptides from MLC-3 were detected in pork meat containing as little as 0.5% chicken meat. Stable isotope-labeled peptides (AQUA) were used to calculate the amount of the biomarker peptides in the samples. MS/MS-based proteomics has also been employed to detect soybean proteins added to processed meat products (Leitner et al., 2006) and for the detection of soy and pea proteins in milk powder (Cordawener et al., 2009). MALDI–TOF protein profiling was used to develop a fast method for determining the geographical origin of honey obtained from Hawaiian bees (Wang et al., 2009). Substitution of milk by lower-cost milk and the undeclared use of milk from different species for the production of traditional cheeses are frequent practices in the dairy industry. Using whey proteins as biomarkers, MALDI–TOF MS has been used to detect bovine milk additions to ewe and buffalo milk (Cozzolino et al., 2001) and the adulteration of mozzarella cheese (Cozzolino et al., 2002).
The whey protein β-lactoglobulin has also been used for the detection and quantification of cow milk adulteration in goat milk (Chen et al., 2004) or in either goat or sheep milk (Müller et al., 2008) by means of an HPLC–ESI–MS or a capillary electrophoresis–MS method, respectively, using retention times and accurate molecular masses to detect the presence of cow milk at levels as low as 5%. Guarino et al. (2010) developed an LC–ESI–MS/MS method based on the detection of a sheep-specific peptide from the digestion of casein, a method that is able to detect up to 2% of sheep milk in goat and cow cheeses. Species-specific casein peptides have also been used for the detection and quantification of bovine, ovine, buffalo, and caprine milks in milk mixtures, using MALDI–MS or LC–ESI MS methods that are able to detect extraneous milk concentrations as low as 0.5% (Cuollo et al., 2010).

5.4 DETECTION AND IDENTIFICATION OF SPOILAGE AND PATHOGENIC MICROORGANISMS

The identification and classification of microorganisms has traditionally been based on morphological, physiological, and biochemical characterization. Currently, fingerprinting techniques and common DNA sequencing are the most widely used methods for bacterial identification, although proteomic technologies are being introduced to assist in bacterial identification (Emerson et al., 2008).

5.4.1 Mass Spectrometry in Bacterial Identification MS is becoming an important tool for the accurate identification of microorganisms. The first application of MS for bacterial identification was performed by Anhalt and Fenselau (1975), who applied pyrolysis MS to the characterization of small compounds from lyophilized bacteria. The introduction of soft ionization techniques such as MALDI (Tanaka et al., 1988; Karas and Hillenkamp, 1988) and ESI (Fenn et al., 1989) has made the identification of proteins possible. MALDI–TOF MS has been used to produce protein profiles following cellular extraction (Cain et al., 1994) and to obtain fingerprinting spectra from whole bacterial cells (Holland et al., 1996). This approach has been applied to identify species and strains of Gram-negative and Gram-positive bacteria (Claydon et al., 1996; Krishnamurthy and Ross, 1996; Vargha et al., 2006). ESI–MS, although a powerful tool for the analysis of proteins, has been used less frequently for microorganism identification. However, protein profiles of whole bacterial cells (Krishnamurthy et al., 1999; Zheng et al., 2003) and of cell lysates, with or without prior separation (Dworzanski et al., 2004), have been reported. In general, ions obtained by MALDI–TOF MS are attributed to proteins with molecular masses below 15 kDa (Arnold et al., 1999; Holland et al., 1999; Ryzhov and Fenselau, 2001), with most of the peaks in the spectral profiles corresponding to ribosomal proteins (Pineda et al., 2003). The identification of microorganisms by MALDI–TOF MS can be performed by comparing the resulting MS spectra with fingerprint databases of known reference bacterial strains previously identified by genetic analysis (Wunschel et al., 2005). This approach is the most widely applied, and high discrimination has been reported in the identification of bacterial species and strains (Keys et al., 2004; Vargha et al., 2006).
Another approach to bacterial identification, one that does not require a library of reference spectra, involves comparing the experimentally determined masses associated with an unknown microorganism with masses taken from proteome databases (Demirev et al., 1999; Pineda et al., 2003). The bottom-up proteomic approach (Carrera et al., 2007), in which proteins are digested into peptides using proteases and the peptides are ionized, mass measured, and fragmented in a tandem mass spectrometer (Aebersold and Mann, 2003), has also been used for bacterial identification (Warscheid et al., 2003; Pribil et al., 2005). Recently, the discrimination and phylogenomic classification of Bacillus anthracis, Bacillus cereus, and Bacillus thuringiensis strains using LC–MS/MS of whole cell protein digests was reported (Dworzanski et al., 2010). In addition to the bottom-up approach, the top-down proteomic approach yields highly accurate masses and makes protein digestion unnecessary, providing a good option for the characterization of proteins from microorganisms (Demirev et al., 2005; Fagerquist et al., 2009; Wynne et al., 2009). Recently, six protein biomarkers from two strains of Escherichia coli O157:H7 and one non-O157:H7, nonpathogenic strain of E. coli have been identified using MALDI–TOF/TOF MS/MS and top-down proteomics (Fagerquist et al., 2010).

MALDI–TOF MS has been reported as the most widely used mass spectral method for bacterial identification and differentiation at the genus, species, and, in some cases, strain level, owing to its speed and reduced cost compared with biochemical and molecular techniques. It has also been used together with other methodologies, such as SELDI–TOF MS, a modified version of MALDI–TOF MS that has been reported to correctly identify very closely related bacterial species (Barzaghi et al., 2004; Kiehntopf et al., 2011; Lundquist et al., 2005).
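The proteome-database approach mentioned at the start of this discussion can be sketched as follows. The sketch counts, for each candidate organism, how many observed masses fall within a relative tolerance of a predicted protein mass. All names, masses, and the 500 ppm tolerance are hypothetical, and published methods additionally assess the statistical significance of the matches, which this sketch does not attempt.

```python
from typing import Dict, List, Tuple


def match_masses(observed: List[float], proteome: Dict[str, float],
                 tol_ppm: float = 500.0) -> List[Tuple[float, str]]:
    """Pair each observed mass with every predicted protein mass within tol_ppm."""
    hits = []
    for m_obs in observed:
        for protein, m_theo in proteome.items():
            if abs(m_obs - m_theo) / m_theo * 1e6 <= tol_ppm:
                hits.append((m_obs, protein))
    return hits


def rank_organisms(observed: List[float],
                   databases: Dict[str, Dict[str, float]],
                   tol_ppm: float = 500.0) -> List[Tuple[str, int]]:
    """Rank organisms by the number of observed masses explained by their proteome."""
    counts = []
    for organism, proteome in databases.items():
        explained = {m for m, _ in match_masses(observed, proteome, tol_ppm)}
        counts.append((organism, len(explained)))
    return sorted(counts, key=lambda item: item[1], reverse=True)


# Hypothetical predicted protein masses (Da) for two organisms.
databases = {
    "organism_X": {"protein_1": 7274.0, "protein_2": 9536.0, "protein_3": 10300.0},
    "organism_Y": {"protein_1": 7410.0, "protein_2": 9060.0},
}
observed = [7275.5, 9534.0, 11000.0]  # hypothetical measured masses (Da)

print(rank_organisms(observed, databases))
```

Counting each observed mass once, even if it matches several proteins, keeps a dense proteome from being favored simply because it contains more entries near a given mass.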
Over the past 16 years, there have been several reports targeted mainly at clinical diagnostic microbiology, biodefence, and environmental research that have demonstrated the efficacy of MALDI–TOF MS as a tool for bacterial identification (Croxatto et al., 2011; Demirev and Fenselau, 2008a, 2008b; Giebel et al., 2010; Freiwald and Sauer, 2009; Sauer et al., 2008; Welker and Moore, 2011), and many books on this field have recently appeared (Wilkins et al., 2006; Shah and Gharbia, 2010).

5.4.2 Food Microbial Proteomics

As mentioned above, proteomic technologies have been used for routine bacterial identification, mainly in clinical microbiology, biodefence, and environmental research. However, few studies have been performed in the field of microbial food MS for the identification of foodborne pathogens and microorganisms responsible for food spoilage, a process caused by various biochemical changes occurring in food due to microbial activity. These alterations depend on specific and nonspecific microflora, growth conditions for microorganisms related to intrinsic and extrinsic factors (temperature, pH, and water activity, aw), and contamination during processing. Spoilage is responsible for significant economic losses in the food industry and serious foodborne diseases. Food intoxications result from the intake of toxins produced in food by bacteria such as Staphylococcus aureus and Clostridium botulinum. Poisoning by bacteria can be caused, for example, by E. coli O104:H4, E. coli O157:H7, Salmonella spp., Listeria monocytogenes, S. aureus, B. cereus, Shigella spp., Vibrio cholerae, and Vibrio parahaemolyticus, among others.

Food safety challenges have increased due to the development of new products, market globalization, and new food preservation processes. To meet these challenges, new technologies for accurately and rapidly identifying and classifying microorganisms, such as the new MS-based proteomic tools, are currently complementing traditional and genetics-based identification techniques.

MALDI–TOF MS of intact bacterial cells has been used for the detection and identification of 24 different foodborne pathogens and food spoilage bacteria, including genera such as Escherichia, Yersinia, Proteus, Morganella, Salmonella, Staphylococcus, Micrococcus, Lactococcus, Pseudomonas, Leuconostoc, and Listeria (Mazzeo et al., 2006). These authors were able to distinguish between the pathogenic E. coli O157:H7 and the nonpathogenic E. coli ATCC 25922.
Ochoa and Harrington (2005) reported the identification of E. coli O157:H7 in ground beef. Dieckmann et al. (2008) identified 126 strains of Salmonella spp. in chicken, turkey, swine, and cattle. Mandrell et al. (2005) characterized 75 strains of Campylobacter spp. from humans, poultry, swine, dogs, and cats. Barbuddhe et al. (2008) used MALDI–TOF MS to identify 146 strains of Listeria spp. in meat, poultry, dairy, and vegetables, and Angelakis et al. (2011) used this methodology for bacterial identification at the species level in probiotic foods and yoghurts.

MALDI–TOF MS of low-molecular-weight proteins extracted from intact bacterial cells was successfully applied to the safety assessment of fresh and processed seafood, identifying the main species of seafood spoilage and pathogenic Gram-negative bacteria, including Aeromonas hydrophila, Acinetobacter baumannii, Pseudomonas spp., and Enterobacter spp. (Böhme et al., 2010; Böhme et al., 2011a) (Fig. 5.2). A subsequent study reported on the identification of Gram-positive bacteria in seafood, including Bacillus spp., Listeria spp., Clostridium spp., and Staphylococcus spp. (Böhme et al., 2011b).

FIGURE 5.2 Method for the identification of food spoilage and pathogenic bacteria by MALDI–TOF MS fingerprinting. Modified from Böhme et al. (2010).

Poisoning by the biogenic amine histamine is a type of food intoxication that is usually related to the ingestion of scombrid fishes but is also related to the consumption of cheese, wine, and fermented meats and vegetables (Tenebrick et al., 1990). MALDI–TOF MS has been used to identify biogenic amine-producing bacteria involved in food poisoning (Fernández-No et al., 2010; Fernández-No et al., 2011).

Streptococcus parauberis is an agent of mastitis in cows and thus is a source of economic loss for the dairy industry. Barreiro et al. (2010) reported the identification of subclinical cow mastitis pathogens in milk using MALDI–TOF MS. S. parauberis has also been associated with spoilage in meat products (Koort et al., 2006) and has been reported to produce infectious disease in farmed fish, such as turbot (Domenech et al., 1996). Recently, the isolation and MALDI–TOF MS identification of S. parauberis in vacuum-packed seafood products have been reported (Fernández-No et al., 2012).

5.4.3 Spectrum Libraries

A number of commercial databases have been created for MALDI–TOF MS identification of bacteria, including, for example, SARAMIS (Spectral Archiving and Microbial Identification System, AnagnosTec GmbH, Zossen, Germany), the MicrobeLynx bacterial identification system (Waters Corporation, Manchester, UK), and MALDI Biotyper (Bruker Daltonics Inc, Billerica, MA, USA). These private databases are mainly targeted at human pathogens that cause infectious diseases, although some bacterial species that play important roles in food spoilage and safety are also included. A public spectra database containing data on 24 foodborne bacterial species has been constructed by Mazzeo et al. (2006) and is freely available on the Web (http://bioinformatica.isa.cnr.it/Descr_Bact_Dbase.htm). The public database SpectraBank (http://www.spectrabank.org) was recently created by Santiago de Compostela University. This library contains the mass spectral fingerprints of the main spoilage and pathogenic bacteria species found in seafood, including more than 70 species of interest to the food sector.

5.5 CHANGES DURING FOOD STORAGE AND PROCESSING AND THEIR RELATIONSHIP TO QUALITY

Food quality is determined to a large extent not only by the biological and genetic variability of the raw constituents but also by the treatment of food during production, processing, and storage. During harvest/slaughter and processing, food proteins are subjected to both natural and processing changes that influence key food properties, including nutritional value, shelf life, and quality features such as tenderness, taste, juiciness, and color. Proteomics has been used not only to track changes in food proteins after harvest/slaughter and evaluate the effects of postharvest/postslaughter processing/conservation treatments such as heating, curing, and high-pressure processing but also to investigate changes induced by various preslaughter conditions.

5.5.1 Postmortem Changes in Meat Proteins

Postmortem conversion of muscle tissue into meat is a complex biochemical process that greatly affects meat quality. Postmortem processes include proteolysis, interactions of soluble muscle tissue proteins, and nonbiological PTMs such as oxidation and glycation. Understanding the complex mechanisms that influence tenderness and meat quality would significantly benefit production methods and processing technologies and is, therefore, of great interest to the field of meat science.

Protein degradation and denaturation during postmortem aging have been associated with meat tenderness and water-holding capacity (Koohmaraie, 1996; Melody et al., 2004). The endogenous calpain system, which comprises several calcium-dependent cysteine proteases (calpains) and their specific inhibitor (calpastatin), regulates proteolysis of muscle tissue proteins under postmortem conditions (Huang and Forsberg, 1998) and has thus been associated with tenderization processes (Goll et al., 1991). Factors that regulate calpain activity, such as denaturation (Kim et al., 2010), oxidation (Lametsch et al., 2008), and nitrosylation (Huff-Lonergan et al., 2010), may then be used to alter muscle tissue protein proteolysis during meat aging.

Proteomics-based approaches have been extensively used to study the postmortem changes in muscle tissue proteins that affect meat quality (Hollung et al., 2007). As muscle tissue is converted into meat, myofibrillar and cytoskeletal proteins, such as desmin, actin, myosin, troponin, and tropomyosin, are degraded by calpains (Lametsch and Bendixen, 2001; Lametsch et al., 2002; Lametsch et al., 2003; Lametsch et al., 2004; Morzel et al., 2004; Hwang et al., 2005; Muroya et al., 2007). Other proteins, including cellular defence/stress proteins (such as heat shock proteins (HSP) 70 and 27) and metabolic enzymes (such as creatine kinase, myokinase, pyruvate kinase, glycogen phosphorylase, and NADH dehydrogenase), were also found to change in abundance during postmortem storage and therefore also seem to play a role in meat aging (Lametsch et al., 2002; Jia et al., 2006; Jia et al., 2007; Laville et al., 2009; Bjarnadottir et al., 2010; Bernevic et al., 2011). The postmortem accumulation of several enzymes involved in ATP-generating pathways, such as the glycolytic pathway and the tricarboxylic acid cycle, suggests that an increase in aerobic metabolism occurs after slaughter to replenish ATP levels in muscle tissue (Jia et al., 2006). In a recent study, Laville et al. (2009) reported a reduction in the solubility of some HSPs and glycolytic enzymes and a fragmentation of glycolytic and structural proteins during muscle tissue aging. The abundance of mitochondrial membrane proteins just after slaughter suggests a link between the apoptosis process and tenderness.
The involvement of HSPs, which are part of the muscle tissue stress response pathway that stabilizes myofibrillar structures after challenges, in the postmortem process and in the development of tenderness has also been studied (Pulford et al., 2008). Intramuscular connective tissue components, such as collagen, are also degraded during postmortem conditioning of meat, thus contributing to meat texture and tenderness (Purslow, 2005).

Protein fragments produced by protein degradation during the postmortem transformation of muscle tissue into meat and during meat aging can be digested to release smaller peptides and even individual amino acids (Mullen et al., 2000). Although these smaller peptides may influence the organoleptic and biological value of food products, their presence in meat has been scarcely studied. Bauchart et al. (2006) reported a decrease in the bioactive tripeptide glutathione during storage and cooking of beef. Other small peptides of up to 3.6 kDa, which were found to be degradation products of proteins from connective tissue (procollagen I and IV) and other structural proteins (troponin T, nebulin, and cypher protein 3), were generated during meat aging and especially after cooking. During meat aging, an increase in peptides of 3 to 17 kDa was reported in pork and beef muscle tissue (Claeys et al., 2004), and small peptides (<2.4 kDa) were also found to have increased in lamb muscle tissue (Sylvestre et al., 2001).

The study of postmortem conversion of muscle tissue into meat has been extended to changes occurring in cured/fermented products. Proteomics has been used to understand these processes, especially in products from pig meats such as dry-cured hams (Sentandreu et al., 2007; Theron et al., 2009).
During ham ripening, endogenous proteases (cathepsins and calpains) and aminopeptidases produce complete hydrolysis of most fibrillar proteins, such as myosin heavy and light chains and actin, as well as some sarcoplasmic proteins (Di Luccia et al., 2005). Meat protein proteolysis creates small peptides, free amino acids, and degradation products that determine the texture, flavor, and odor of the final product (Toldrá et al., 2000; Sforza et al., 2001; Sentandreu et al., 2003). Recently, a number of peptides generated from actin (Sentandreu et al., 2007), troponin T (Mora et al., 2010), and MLC-1 and MLC-2 (Mora et al., 2011) during the dry curing of Spanish hams have been isolated and identified by 2D-LC coupled to MS/MS. However, the microflora also plays a relevant role in the ripening process (Talon et al., 2007), as do endogenous proteases.

5.5.2 Postmortem Changes in Fish Muscle Tissue

Less is known about the postmortem degradation processes in fish muscle tissue than about those in mammalian species. Postmortem proteolytic tenderization of muscle tissue is, together with bacterial spoilage, one of the most adverse changes related to fish freshness and quality and has therefore been widely studied. Postmortem proteolysis mainly affects myofibrillar and cytoskeletal proteins such as titin, nebulin, dystrophin, α-actinin, myosin, and tropomyosin (Verrez-Bagnis et al., 1999; Delbarre-Ladrat et al., 2004; Caballero et al., 2009; Du et al., 2010; Wu et al., 2010). Various endogenous proteolytic enzymes seem to be involved, including cathepsins, calpains, aminopeptidases, and connective tissue hydrolases (Delbarre-Ladrat et al., 2006). These authors have suggested that calpains and cathepsins in fish act in complementary and synergistic ways at different levels of the myofibrillar protein breakdown. Proteomics, mainly 2-DE, has been used to characterize postmortem changes during cold storage in the muscle tissue of various fish and crustacean species (Verrez-Bagnis et al., 2001; Kjærsgård and Jessen, 2003), and changes have been reported in the concentrations of several structural proteins, such as myosin and α-actinin, as well as some glycolytic proteins, including triose-phosphate isomerase and glyceraldehyde-3-phosphate dehydrogenase (Martinez et al., 2001; Kjærsgård et al., 2006; Terova et al., 2011).

5.5.3 Preslaughter Conditions and Proteome Changes

Preslaughter conditions such as physical activity, density, stress, and lairage time may influence postmortem muscle tissue integrity through several mechanisms, such as glycogen depletion and pH decline (Dall Aaslyng and Barton Gade, 2000). The influence of preslaughter stress and physical activity on the quality of farmed trout was studied by comparing the muscle proteomes of an exercised group of trout that were crowded together for 15 min with those of a rested group that were quickly netted and killed without crowding (Morzel et al., 2006). Using 2-DE, the study found differences in the expression levels of energy-producing enzymes and structural proteins. The effects of crowding were also investigated in muscle tissue and blood proteomes of Atlantic salmon, and changes were observed in the levels of proteins involved in secondary and tertiary stress responses (Veiseth-Kent et al., 2010). Morzel et al. (2004) investigated the influence of the transport and lairage period before slaughter on the pork meat proteome. Mitochondrial ATPase and the dephosphorylated form of MLC-2 were found to be present in higher quantities in the animals transported immediately before slaughter, and more differences were found as a result of postmortem storage. Lametsch et al. (2006) studied the increased meat tenderness found in pigs subjected to compensatory growth; in other words, the animals were feed-restricted for a period of time and then given free access to food instead of having free access at all times. Several proteins affected by compensatory growth were detected, some of which were known to play a role in muscle development and muscle rigor, such as some HSPs and the 14-3-3 protein γ.

5.5.4 Proteomics to Study Effects of Processing/Conservation Treatments

Numerous processing treatments, including heating, freezing, high-pressure treatment, and modified atmosphere packaging, are used in the food industry primarily to improve shelf life, food safety, and organoleptic quality. These treatments, however, influence the quality of the food by modifying the proteins. During food processing and storage, nonbiological (environmental or process-induced) posttranslational modifications, commonly referred to as nonenzymatic posttranslational modifications (nePTMs), occur regularly and significantly affect the technological, sensorial, and nutritional qualities of the food. Oxidation modifications, such as carbonylation, thiol oxidation, and aromatic hydroxylation, and Maillard glycation (a reaction of sugars with amino-acid side chains) are the most frequently reported nePTMs, although condensation, elimination of side chains, or peptide backbone breakdown have also been described (Pischetsrieder and Baeuerlein, 2009). These modifications are responsible for changes in key protein features such as hydrophobicity, protein aggregation, and protein solubility. Therefore, characterization of nePTMs (knowing what modification has occurred and where in the protein it is present) is critical to evaluating the effects on the final food products.

Protein oxidation modifications in meat, which are produced by oxidative species such as reactive oxygen species derived from metabolic processes occurring during storage and processing, have recently been studied. The relationships between the early postmortem sarcoplasmic proteome from pig muscle tissue and protein oxidation generated during meat storage and cooking were investigated using 2-DE, and predictive markers of protein sensitivity to oxidative stress were identified by MALDI–TOF MS (Promeyrat et al., 2011).
Myoglobin oxidation causes a brown discoloration of the meat, which tends to lead to consumer rejection. MALDI–TOF MS and LC–MS/MS were used to study the influence of lipid oxidation on the oxidation of porcine and bovine muscle myoglobin to metmyoglobin during meat storage, mediated by the binding of hydroxynonenal to histidine residues of myoglobin (Suman et al., 2007). Oxidative processes may also contribute to drip loss. Using 2-DE combined with LC–MS/MS, Bernevic et al. (2011) analyzed the oxidation profile of porcine muscle tissue at different drip losses. They identified and characterized several oxidative modifications in creatine kinase, actin, and triosephosphate isomerase, representing possible biomarker candidates.

Storage of fresh and frozen fish can also cause protein oxidation, which can be monitored by detecting protein carbonylation in 2-DE gels following the chemical derivatization of the protein carbonyls. Using this approach, increased oxidation levels of several proteins during fish storage have been reported (Kjærsgård and Jessen, 2004; Kinoshita et al., 2007). Several carbonylated proteins were identified by LC–MS/MS in frozen rainbow trout (Oncorhynchus mykiss) fillets; these proteins include nucleoside diphosphate kinase, adenylate kinase, pyruvate kinase, actin, creatine kinase, tropomyosin, MLC-1 and MLC-2, and myosin heavy chain (Kjærsgård et al., 2006b).

Various nePTMs occurring during heating and storage of milk have also been extensively mapped and characterized using proteomic approaches. The structure and binding sites of six major protein modifications formed in milk and dairy products caused by glycation, oxidation, and condensation reactions were characterized using electrophoresis and MALDI–TOF PMF (Meltretter et al., 2008). Johnson et al.
(2011) developed a method that couples size exclusion chromatography with ESI–MS to characterize processing-induced changes such as aggregation and Maillard glycation in milk proteins. Protein carbonylation is considered a marker of protein oxidation in dairy products. Fenaille et al. (2005), using various proteomic approaches, highlighted the preferential carbonylation of β-lactoglobulin during the heat treatment of milk. A method for detecting the glycoxidation product carboxymethyllysine in the entire milk proteome was developed by Meyer et al. (2011). Holland et al. (2012) reported deamidation and loss of the N-terminal dipeptide of α-casein in UHT milk stored at elevated temperatures, supporting the idea that storage temperature may affect the stability of milk proteins. Recently, lactosylation, deamidation, and protein crosslinking were observed in milk protein concentrate powder depending on storage time, temperature, and humidity (Le et al., 2012).

The effects of high-pressure processing in preventing microbial and viral growth have been widely studied, even with proteomics (Martínez-Gomáriz et al., 2009), but there are few studies dealing with proteomics and changes in food proteins in response to this technological treatment. Using electrophoresis, Marcos et al. (2010) demonstrated that high-pressure processing modifies the composition of the sarcoplasmic protein fraction of beef and that this change in the protein profile influences some meat qualities such as moisture loss and color. Ortea et al. (2010b) studied the effects of high-pressure treatment on the sarcoplasmic proteome of chilled salmon, and a protein identified by MS/MS as phosphoglycerate mutase was found to be decreased when fish were treated with ≥170 MPa.

Modified atmosphere packaging also affects food proteins. Lund et al.
(2007) reported the reduction of myofibril fragmentation in meat stored in high-oxygen atmospheres, indicating a lower level of proteolysis, as well as protein crosslinking, with a resulting reduction in tenderness and juiciness. The 2-DE method has been used to study changes in protein levels during processing of fish products with other techniques, such as the addition of Ca and Mg salts (Martinez et al., 1992) and fermentation with the starter culture Lactobacillus sake (Morzel et al., 2000).
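Several of the nePTMs discussed in this section are recognized in MS data by a characteristic mass shift between the modified and unmodified peptide or protein. The following minimal sketch assigns candidate modifications to an observed shift. The monoisotopic values are commonly tabulated figures calculated from elemental composition; they are not taken from the studies cited above, and the function name and the 0.01 Da tolerance are illustrative assumptions.

```python
from typing import List

# Monoisotopic mass shifts in Da (assumed reference values, not from this chapter).
NEPTM_SHIFTS = {
    "deamidation": 0.9840,
    "oxidation / hydroxylation": 15.9949,
    "carboxymethylation (e.g., carboxymethyllysine)": 58.0055,
    "hydroxynonenal Michael adduct": 156.1150,
    "hexose glycation (Amadori product)": 162.0528,
    "lactosylation (lactose Amadori product)": 324.1056,
}


def annotate_shift(delta_da: float, tol_da: float = 0.01) -> List[str]:
    """Return the modifications whose mass shift lies within tol_da of delta_da."""
    return [name for name, shift in NEPTM_SHIFTS.items()
            if abs(delta_da - shift) <= tol_da]


# Hypothetical observed differences between modified and unmodified masses.
for delta in (324.110, 15.990, 100.000):
    print(delta, annotate_shift(delta))
```

A mass shift alone identifies the type of modification, not its position; locating the modified residue requires fragmentation data, as in the LC–MS/MS studies cited above.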

5.6 PROTEOMICS DATA INTEGRATION TO EXPLORE FOOD METABOLIC PATHWAYS AND PHYSIOLOGICAL ACTIVITY OF FOOD COMPONENTS

The construction of food metabolic pathways including proteomic data can contextualize the large-scale data within the overall physiological scheme of a biological system. This method is an efficient way to predict metabolic phenotype and discover new pathways. In vitro assays using cell cultures and in vivo studies using animal models are the main methods for performing a comprehensive elucidation of the mechanism of action of natural compounds, specific nutrients, and diets. Although the number of studies is currently limited, we believe that this topic will be significant. In proteomics, the bottom-up approach is the most widely used. More specifically, studies on protein abundance, protein–protein interaction, and posttranslational modification analysis should be integrated into functional and dynamical food networks to understand the mechanism of action of specific diet components. For instance, proteomic approaches have been used to elucidate biological pathways and networks to obtain information about the functional role of the bovine milk proteome (D’Alessandro et al., 2011). In that article, a preliminary map of the bovine milk interactome was created for 573 nonredundant protein entries, including functional gene ontology (GO) term enrichment in silico studies. This approach revealed the biological role of bovine milk proteins from an unprecedented interactomic perspective.

Epidemiological studies support the association between diet and disease; therefore, proteomics may be a useful strategy for linking an observed effect to a particular food component. Traditional approaches usually test physiologically active food components by treating cells in vitro with defined components.
Therefore, systematic approaches such as those described by foodomics (Herrero et al., 2012), including the analysis of physiological reactions to food components by proteome analysis, promise to provide a broader picture of the cellular consequences of food components and accelerate the search for highly relevant bioactive food ingredients. Cultured cells can be better standardized and have a lower biological variability than animals or humans. Thus, physiologically active food components, either as functional food or nutraceuticals, are tested by treating cells in vitro with defined compounds and measuring defined cellular reactions to accelerate the search for highly relevant bioactive food ingredients. For instance, HT29 human colon cancer cells were treated with or without 20-S-ginsenoside Rg3, a ginseng metabolite that inhibits the proliferation of tumor cells in vitro and may increase the antiproliferative effects of chemotherapy (Lee et al., 2009). Cellular proteomes were compared by 2-DE, and all regulated spots were identified by MALDI–TOF/TOF MS. The antiproliferative mechanism of 20S-Rg3 was found to be dependent on mitotic inhibition, DNA replication and repair, and growth factor signaling.

Moving from cell culture to animal studies, a recent study investigated the influence of dietary oil on the plasma proteome during aging in rats (Santos-González et al., 2012). Using a proteomic approach, the authors studied how dietary oil affected plasma proteins in young (6 months) and old (24 months) rats fed throughout their lives with two experimental diets enriched with either sunflower or virgin olive oil. After the depletion of the most abundant proteins, the corresponding plasma subproteome was investigated by 2-DE and MS.
The results showed that the significant changes in specific proteins reinforced the beneficial role of a diet rich in virgin olive oil compared with a diet rich in sunflower oil, through the modulation of inflammation, homeostasis, oxidative stress, and cardiovascular risk during aging.

A few studies have applied proteome analysis to human food trials. One particular study determined changes in the plasma proteome after folic acid supplementation (Duthie et al., 2010), which is also described in detail in another chapter of this book. Healthy volunteers were included in a double-blind, randomized trial and received either a placebo or 1.2 mg of folic acid daily. After 12 weeks, the subjects’ plasma proteomes were assessed by 2-DE, and the differentially regulated proteins were identified by MS. Low folate status pre- and posttreatment was associated with lower levels of proteins involved in activation and regulation of immune function and coagulation, demonstrating that supplementation with synthetic folic acid increases the expression of proteins primarily involved in immune function activation.

The investigations performed on cell cultures, animal models, and human trials illustrate that proteomic approaches are being successfully applied to a variety of research questions to understand the physiological activity of different food components.
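The comparative 2-DE experiments summarized in this section share a common first step: selecting the spots whose abundance differs between two conditions before identifying them by MS. A minimal sketch of that selection, based only on the log2 fold change of mean spot volumes, is given below. The spot identifiers, the volumes, and the twofold threshold are hypothetical, and the cited studies relied on their own image analysis software and statistical tests rather than on this simplified criterion.

```python
import math
from statistics import mean
from typing import Dict, List, Tuple


def log2_fold_change(control: List[float], treated: List[float]) -> float:
    """Log2 ratio of the mean spot volume in treated versus control gels."""
    return math.log2(mean(treated) / mean(control))


def regulated_spots(spots: Dict[str, Tuple[List[float], List[float]]],
                    threshold: float = 1.0) -> Dict[str, float]:
    """Keep spots whose absolute log2 fold change reaches the threshold.

    A threshold of 1.0 corresponds to a twofold increase or decrease.
    """
    selected = {}
    for spot_id, (control, treated) in spots.items():
        lfc = log2_fold_change(control, treated)
        if abs(lfc) >= threshold:
            selected[spot_id] = lfc
    return selected


# Hypothetical normalized spot volumes from replicate gels (control, treated).
spots = {
    "spot_1": ([100.0, 110.0, 90.0], [210.0, 190.0, 200.0]),
    "spot_2": ([50.0, 55.0, 45.0], [52.0, 48.0, 50.0]),
}

print(regulated_spots(spots))
```

Fold change alone does not account for variability between replicate gels, so in practice it would be combined with a significance test before spots are excised for identification.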

5.7 NUTRIPROTEOMICS

Personalized nutrition involves adapting food and diet to individual needs, depending on the host’s biological variation as well as gender, life stage, lifestyle, and situation. Gender differences in the rat plasma proteome were analyzed in response to normal and high-fat diets (Liu et al., 2012). Plasma from rats fed a high-fat diet was analyzed by 2-DE and MALDI–TOF/TOF MS to compare differential regulation patterns between male and female plasma proteins. A total of 31 proteins were significantly modulated in a gender-dependent manner and were classified into three groups. The authors proposed that the observed gender-dimorphic differences in the plasma proteome for specific diets represent strong evidence that knowledge of gender-related differences is important when making gender-specific nutritional recommendations.

Modern human nutrition science is exploring the application of -omic high-throughput tools to nutritional and health-related issues. New disciplines, such as foodomics, which includes nutrigenomics, nutrigenetics, and nutriproteomics, are emerging to offer new insights into the complexities of human health and nutrition. While nutrigenetics addresses how an individual’s genetic makeup predisposes for dietary susceptibility, nutrigenomics can be defined as the study of the effects of foods and food components on gene expression. Nutritional proteomics or nutriproteomics is a new field that involves the characterization and quantification of food-derived bioactive peptides and proteins for discerning the mechanisms by which proteome variation impacts nutrition-related health outcomes (Ozdemir et al., 2010). Specifically, nutriproteomics provides two essential avenues for molecular nutrition research and applications in personalized nutrition: (a) the characterization and quantification of food-derived bioactive peptides and proteins and (b) the elucidation of biomarkers for mechanism-of-action, efficacy, and side effects of nutritional interventions.

The Human Proteome Organisation (HUPO) proposes an international effort to map the protein complement of the human genome, called the Human Proteome Project (HPP). The HPP plans to characterize the “normal (nonpathological) human proteome,” which may be used as a control stage for further human nutriproteomics studies. Several proteome biobanks collecting samples of plasma, urine, and other fluids and tissues are being developed. The final goal of these modern nutritional disciplines is to deliver personalized nutrition and healthcare by providing dietary recommendations to specific individuals and consumer groups.

5.8 FINAL CONSIDERATIONS AND FUTURE TRENDS

As reported in this chapter, the study of proteomics contributes enormously to the development of modern food science. Proteomic technologies are helping to address the major challenges of food authentication, thanks to the development of inexpensive, fast, and easy-to-use methodologies for routine use. MS/MS-based proteomic methodologies, such as the detection and quantification of species-specific peptides by MRM, are extremely promising, as they provide high-throughput and sensitive multiplexed species detection, identification, and quantification in complex matrices. In addition, proteomics is contributing to the understanding of postmortem and technology-induced changes and the relationship of these changes with quality traits. Efforts have been made in the discovery of protein markers for such traits. Thus, the use of fast sample preparation methods coupled with sensitive and reliable MS for the discovery and monitoring of food quality biomarkers will improve quality control and quality assessment in the food industry and will greatly help with the optimization and development of food technological processes. New developments based on microfluidic devices, lab-on-a-chip systems, and protein arrays offer a promising area within modern food science where the proteomic results for discovery and monitoring can be implemented for routine control, diagnosis, and monitoring of food products. We anticipate that these new platforms will be essential components of most food control laboratories within the next decade and will provide diagnostic information that drives decision making by the authorities.

In addition, the recent advent of -omic high-throughput analytical platforms, coupled with computational data management using bioinformatics and mathematical models, has opened up new vistas in our understanding of complex biological systems.
The complex interplay that occurs in the individual in terms of genetics, phys- iology, health, diet, and environment requires tools to perform comparative genetic, genomic, proteomic, and metabolomic analyses. In addition, tools for data integration, data mining, and interpretation are also needed. This global vision offers enormous potential in the elucidation of diseases as well as in defining key pathways and net- works involved in optimal human health and nutrition. The tools and technologies now available in systems biology offer exciting opportunities to develop the emerging area of individual nutritional profiling. 152 PROTEOMICS IN FOOD SCIENCE

Pineiro˜ C, Barros-Velazquez´ J, Vazquez´ J, Figueras A, Gallardo JM (2003). Proteomics as a tool for the investigation of seafood and other marine products. Journal of Proteome Research 2:127–135. Pischetsrieder M, Baeuerlein R (2009). Proteome research in food science. Chemical Society Reviews 38:2600–2608. Pribil PA, Patton E, Black G, Doroshenko V, Fenselau C (2005). Rapid characterization of Bacillus spores targeting species-unique peptides produced with an atmospheric-pressure MALDI source. Journal of Mass Spectrometry 40:464–474. Promeyrat A, Sayd T, Laville E, Chambon C, Lebret B, Gatellier Ph (2011). Early post-mortem sarcoplasmic proteome of porcine muscle related to protein oxidation. Food Chemistry 127:1097–1104. Pulford DJ, Fraga Vazquez S, Frost DF, Fraser-Smith E, Dobbie P, Rosenvold K (2008). The intracellular distribution of small heat shock proteins in post-mortem beef is determined by ultimate pH. Meat Science 79:623–630. Purslow P (2005). Intramuscular connective tissue and its role in meat quality. Meat Science 70:435–447. Rasmussen RS, Morrissey MT (2008). DNA-based methods for the identification of commer- cial fish and seafood species. Comprehensive Reviews in Food Science and Food Safety 7:280–295. REFERENCES 163

Rehbein HZ (1990). Electrophoretic techniques for species identification of fishery products. Zeitschrift fur¨ Lebensmittel-Untersuchung und-Forschung A 191:1–10. Ross PL, Huang YN, Marchese JN, Williamson B, Parker K, Hattan S, Khainovski N, Pillai S, Dey S, Daniels S, Purkayastha S, Juhasz P, Martin S, Bartlet-Jones M, He F, Jacob- son A, Pappin DJ (2004). Multiplexed protein quantitation in Saccharomyces cerevisiae using amine-reactive isobaric tagging reagents. Molecular & Cellular Proteomics 3:1154– 1169. Ryzhov V, Feneslau C (2001). Characterization of the protein subset desorbed by MALDI from whole bacterial cells. Analytical Chemistry 73:746–750. Santos-Gonzalez´ M, Lopez-Miranda´ J, Perez-Jim´ enez´ F, Navas P, Villalba JM (2012). Dietary oil modifies the plasma proteome during aging in the rat. Age (Dordrecht, Netherlands) 34:341–358. Sauer S, Freiwald A, Maier T, Kube M, Reinhard R, Kostrzewa M, Geider K (2008). Clas- sification and identification of bacteria by mass spectrometry and computational analysis. PLoS One 3:e2843. Schmidt A, Kellermann J, Lottspeich F (2005). A novel strategy for quantitative proteomics using isotope-coded protein labels. Proteomics 5:4–15. Sentandreu MA, Stoeva S, Aristoy MC, Laib K, Voelter W, Toldra´ F (2003). Identification of small peptides generated in Spanish dry-cured ham. Journal of Food Science 68:64–69. Sentandreu MA, Armenteros M, Calvete JJ, Ouali A, Aristoy MC, Toldra´ F (2007). Proteomic identification of actin-derived oligopeptides in dry-cured ham. Journal of Agricultural and Food Chemistry 55:3613–3619. Sentandreu MA, Fraser PD, Halket J, Patel R, Bramley PM (2010). A proteomic-based approach for detection of chicken in meat mixes. Journal of Proteome Research 9:3374– 3383. Sforza S, Pagazzani A, Motti M, Porta C, Virgili R, Galaverna R, Dossena A, Marchelli R (2001). Oligopeptides and free amino acids in Parma hams of known cathepsin B activity. Food Chemistry 75:267–273. Shah HN, Gharbia SE, editors (2010). 
Mass Spectrometry for Microbial Proteomics. Chich- ester, UK: John Wiley & Sons. Shevchenko A, Wilm M, Mann M (1997). Peptide sequencing by mass spectrometry for homology searches and cloning of genes. Journal of Protein Chemistry 16:481–490. Shiomi K, Sato Y, Hamamoto S, Mita H, Shimakura K (2008). Sarcoplasmic calcium-binding protein: identification as a new allergen of the black tiger shrimp Penaeus monodon. International Archives of Allergy and Immunology 146:91–98. Sotelo CG, Pineiro˜ C, Gallardo JM, Perez-Mart´ ´ın RI (1993). Fish species identification in seafood products. Trends in Food Science and Technology 4:395–401. Steen H, Mann M (2004). The ABC’s (and XYZ’s) of peptide sequencing. Nature Reviews Molecular Cell Biology 5:699–711. Suman SP, Faustman C, Stamer SL, Liebler DC (2007). Proteomics of lipid oxidation-induced oxidation of porcine and bovine oxymyoglobins. Proteomics 7:628–640. Sylvestre MN, Feidt C, Brun-Bellut J (2001). Post-mortem evolution of non-protein nitrogen and its peptide composition in growing lamb muscles. Meat Science 58:363–369. Talon R, Leroy S, Lebert I (2007). Microbial ecosystems of traditional fermented meat products: the importance of indigenous starters. Meat Science 77:55–62. 164 PROTEOMICS IN FOOD SCIENCE

Tanaka K, Waki H, Ido Y, Akita S, Yoshida Y, Yoshida T (1988). Protein and polymer analyses up to m/z 100000 by laser ionization time-of-flight mass spectrometry. Rapid Communications in Mass Spectrometry 2:151–153. Taylor AJ, Linforth R, Weir O, Hutton T, Green B (1993). Potential of electrospray mass- spectrometry for meat pigment identification. Meat Science 33:75–83. Tenebrick B, Damink C, Joosten H, Tveld J (1990). Occurrence and formation of biologically- active amines in foods. International Journal of Food Microbiology 11:73–84. Terova G, Addis MF, Preziosa E, Pisanu S, Pagnozzi D, Biosa G, Gornati R, Bernardini G, Roggio T, Saroglia M (2011). Effects of postmortem storage temperature on sea bass (Dicentrarchus labrax) muscle protein degradation: analysis by 2-D DIGE and MS. Pro- teomics 11:2901–2910. Theron L, Chevarin L, Robert N, Dutertre C, Sante-Lhoutellier V (2009). Time course of peptide fingerprints in semimembranosus and biceps femoris muscles during Bayonne ham processing. Meat Science 82:272–277. Thompson A, Schafer¨ J, Kuhn K, Kienle S, Schwarz J, Schmidt G, Neumann T, Johnstone R, Mohammed AK, Hamon C (2003). Tandem mass tags: a novel quantification strategy for comparative analysis of complex protein mixtures by MS/MS. Analytical Chemistry 75:1895–1904. Toldra´ F, Aristoy MC, Flores M (2000). Contribution of muscle aminopeptidases to flavor development of dry-cured ham. Food Research International 33:181–185. Unlu M, Morgan ME, Minden JS (1997). Difference gel electrophoresis: a single gel method for detecting changes in protein extracts. Electrophoresis 18:2071–2077. Unwin RD, Griffiths JR, Whetton AD (2009). A sensitive mass spectrometric method for hypothesis-driven detection of peptide post-translational modifications: multiple reaction monitoring-initiated detection and sequencing (MIDAS). Nature Protocols 4:870–877. Vargha M, Takats Z, Konopka A, Nakatsu CH (2006). 
Optimization of MALDI-TOF-MS for strain level differentiation of Arthrobacter isolates. Journal of Microbiology Methods 66:399–409. Veiseth-Kent E, Grove H, Færgestad EM, Fjæra SO (2010). Changes in muscle and blood plasma proteomes of Atlantic salmon (Salmo salar) induced by crowding. Aquaculture 309:272–279. Verrez-Bagnis V, Noel¨ J, Sautereau C, Fleurence (1999). Desmin degradation in postmortem fish muscle. Journal of Food Science 64:240–242 Verrez-Bagnis V, Ladrat C, Morzel M, Noel¨ J, Fleurence J (2001). Protein changes in post- mortem sea bass (Dicentrarchus labrax) muscle monitored by one- and two-dimensional gel electrophoresis. Electrophoresis 22:1539–1544. Vorderwulbecke S, Cleverley S, Weinberger SR, Wiesner A (2005). Protein quantification by the SELDI-TOF-MS-based ProteinChip system. Nature Methods 2:393–395. Wang J, Kliks MM, Qu W, Jun S, Shi G, Li QX (2009). Rapid determination of the geographical origin of honey based on protein fingerprinting and barcoding using MALDI TOF MS. Journal of Agricultural and Food Chemistry 57:10081–10088. Warscheid B, Jackson K, Sutton C, Fenselau C (2003). MALDI analysis of Bacilli in spore mixtures by applying a quadrupole ion trap time-of-flight mass spectrometer. Analytical Chemistry 75:5608–5617. Weickhardt C, Moritz F, Grotemeyer J (1996). Time-of-flight mass spectrometry: state-of-the- art in chemical analysis and molecular science. Mass Spectrometry Reviews 15:139–162. REFERENCES 165

Welker M, Moore ERB (2011). Applications of whole-cell matrix-assisted laser-desorption/ ionization time-of-flight mass spectrometry in the systematic microbiology. Systematic and Applied Microbiology 34:2–11. Wilkins CL, Lay JO, editors (2006) Identification of Microorganisms by Mass Spectrometry. Hoboken, NJ: John Wiley & Sons. Wu GP, Chen SH, Liu GM, Yoshida A, Zhang LJ, Su WJ, Cao MJ (2010). Purification and characterization of a collagenolytic serine proteinase from the skeletal muscle of red sea bream (Pagrus major). Comparative Biochemistry and Physiology Part B-Biochemistry and Molecular Biology 155:281–287. Wunschel SC, Jarman KH, Petersen CE, Valentine NC, Wahl KL, Schauki D, Jackman J, Nelson CP, White E (2005). Bacterial analysis by MALDI-TOF mass spectrometry: an inter- laboratory comparison. Journal of the American Society for Mass Spectrometry 16:456–462. Wynne C, Fenselau C, Demirev PA, Edwards N (2009). Top-down identification of protein biomarkers in bacteria with unsequenced genomes. Analytical Chemistry 81:9633–9642. Yamauchi MM., Miya MU, Machida RJ, Nishida M (2004). PCR-based approach for sequenc- ing mitochondrial genomes of decapod crustaceans, with a practical example from Kuruma prawn (Marsupenaeus japonicus). Marine Biotechnology 6:419–429. YangZ, Hancock WS (2004). Approach to the comprehensive analysis of glycoproteins isolated from human serum using a multi-lectin affinity column. Journal of Chromatography A 1053:79–88. Yao X, Freas A, Ramirez J, Demirev PA, Fenselau C (2001). Proteolytic 18O labeling for com- parative proteomics: model studies with two serotypes of adenovirus. Analytical Chemistry 73:2836–2842. Yu C-J, Lin Y-F, Chiang B-L, Chow L-P (2003). Proteomics and immunological analysis of a novel shrimp allergen, Pen m 2. Journal of Immunology 170:445–453. Zheng S, Schneider KA, Barder TJ, Lubman DM (2003). 
Two-dimensional liquid chromatogra- phy protein expression mapping for differential proteomic analysis of normal and O157:H7 Escherichia coli. Biotechniques 35:1202–1212. 6 PROTEOMICS IN NUTRITIONAL SYSTEMS BIOLOGY: DEFINING HEALTH

Martin Kussmann and Laurent Fay

6.1 INTRODUCTION

Deep understanding of nutritional health effects at the molecular level requires the holistic analysis of the interplay between the food, the gut microbial, and the human host genomes (Kussmann and Van Bladeren, 2011). The recently defined discipline of Foodomics encompasses the omics-based systems tools and bioinformatics applied to the study of food science and nutritional consumer health and well-being (Cifuentes, 2009). Food genomes are mined for discovery and exploitation of macro- and micronutrients as well as specific bioactives. The genes coding for bioactive proteins and peptides are of particular interest. The human gut microbiota forms a complex ecosystem in the gastrointestinal tract with significant influence on host metabolism. It is being studied at genomic (Jones et al., 2010; Dimitrov, 2011) and, more recently, also at proteomic (de Graaf and Venema, 2008) and metabonomic (Nicholson et al., 2004) levels. Humans are characterized by their genetic predisposition and interindividual variability at the level of (i) response to nutritional interventions (Michalsen et al., 2009); (ii) direction of health trajectories; (iii) epigenetic, metabolic programming (Burdge et al., 2011); and (iv) acute genomic expression as a systems-level response to diet, monitored at gene transcript, protein, and metabolite level (Kussmann et al., 2008).

Figure 6.1 depicts the broad scope of nutritional proteomics translated to protein-derived bioactives and biomarkers; to ultimately define human metabolic health, we need to understand the changes over time (e.g., along aging) and the interplay of food, host, and microbial proteomes. The trajectories of these proteomes (and epigenomes, transcriptomes, metabolomes; see other chapters) need to be monitored to help define the bandwidth of healthy aging.

Foodomics: Advanced Mass Spectrometry in Modern Food Science and Nutrition, First Edition. Edited by Alejandro Cifuentes. © 2013 John Wiley & Sons, Inc. Published 2013 by John Wiley & Sons, Inc.

[Figure 6.1 diagram. Labels: Food incl. proteins; Stomach; Intestine + Microbiota; Liver; Kidney; Tissues; Host; Metabolism; Urine; Faeces; Proteomics; Age; Healthy ageing]

FIGURE 6.1 Scope of nutritional proteomics applied to protein-derived bioactives and biomarkers: to define human metabolic health, we need to understand the changes along aging and the interplay of food, host, and microbial proteomes. The trajectories of these proteomes need to be monitored to help define the bandwidth of healthy aging.

Most biological processes occurring in the human body involve proteins. As enzymes, proteins catalyze virtually every biochemical reaction involved in human metabolism. Proteins also ensure structural and mechanical functions and are involved in cell signaling and immune response. Due to their involvement in almost all biological reactions in the human organism, modifications of blood plasma proteins can indicate specific pathological states (Tiscornia et al., 2010). Some of these changes are used as biomarkers of clinical disorders (Kiernan et al., 2003, 2004, 2008). Nevertheless, in most of the common clinical conditions, very few blood plasma proteins are used as diagnostic markers due to the lack of understanding of their regulation and the difficulty of precisely measuring the variation of their plasma concentrations. Indeed, the measured abundance of a protein marker is the net result of its synthesis and breakdown (Doherty and Beynon, 2006). When those processes are in equilibrium, the net concentration of the protein remains unchanged. For this reason, even after an acute protein load, such as a meal, and despite increased synthesis rates of specific proteins, such as albumin, there are no measurable changes in concentrations of many proteins due to their slow turnover.
Therefore, protein quantification techniques should be sensitive enough to detect changes that occur in very low abundance proteins, which may have a fast turnover rate (Kussmann et al., 2010).
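The turnover argument above can be made concrete with a minimal sketch. It assumes the simplest possible model, zero-order synthesis and first-order breakdown (dP/dt = k_syn - k_deg * P), which the chapter does not itself specify. The half-lives, the 50% synthesis boost, and the 4-hour window are illustrative values chosen here, not measurements.

```python
import math

def abundance(t_hours, p0, k_syn, k_deg):
    """Closed-form solution of dP/dt = k_syn - k_deg * P with P(0) = p0."""
    p_steady = k_syn / k_deg
    return p_steady + (p0 - p_steady) * math.exp(-k_deg * t_hours)

def relative_change_after_pulse(half_life_hours, synthesis_boost=1.5, pulse_hours=4.0):
    """Fractional abundance change after synthesis is raised for pulse_hours.

    The protein starts at steady state (abundance 1.0, arbitrary units), so the
    basal synthesis rate equals k_deg * 1.0. All parameters are assumptions.
    """
    k_deg = math.log(2) / half_life_hours
    p0 = 1.0
    k_syn_basal = k_deg * p0
    p_end = abundance(pulse_hours, p0, synthesis_boost * k_syn_basal, k_deg)
    return (p_end - p0) / p0

# Assumed half-lives: about 20 days for a slow-turnover plasma protein,
# 2 hours for a hypothetical fast-turnover, low-abundance protein.
slow_change = relative_change_after_pulse(half_life_hours=20 * 24)
fast_change = relative_change_after_pulse(half_life_hours=2.0)
print(f"slow-turnover protein: {slow_change:+.2%}")
print(f"fast-turnover protein: {fast_change:+.2%}")
```

Under these assumed values the slow-turnover protein changes by well under 1% while the fast-turnover protein changes by tens of percent, which is why the sensitivity requirement falls mainly on low-abundance, fast-turnover proteins.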

Human proteomics is the comprehensive analysis of the whole protein content of a biological sample taken from a human being, whether healthy or not. It is a powerful platform for the elucidation of molecular events related to nutrition, health, and disease. It can identify and quantify bioactive proteins and peptides, shed light on dietary effects at protein/peptide level, and thereby address questions of nutritional bioefficacy. In the late 1990s and early 2000s, proteomics raised high expectations, mainly fueled by several life science challenges that appeared ideally suited to being addressed by proteomics: (1) a genomic sequence does not necessarily translate into biological function of the related expressed gene; (2) proteomics complements genomics by focusing on the active gene products, thereby delivering information closer to the observed phenotype; (3) proteins execute biochemical reactions, shape structures, guide traffic, and control interactions; therefore, they are highly informative to understand biology from a systems point of view; (4) transcript abundance does not always linearly translate into protein quantity and, hence, the proteins have to be measured directly; (5) genomic prediction of gene products remains difficult even with modern bioinformatics, and consequently, the verification of a gene product by proteomics is important for genome annotation (e.g., PeptideAtlas, http://www.peptideatlas.org; Zhang et al., 2008; Farrah et al., 2012); (6) prediction of protein modifications or localizations is barely possible from a DNA-only angle; proteomics is therefore key for the elucidation of posttranslational modifications, protein isoforms, and localization; and (7) protein regulation, interactions, or complex protein structures can only be determined via proteomics.
However, during the last two decades, proteomics has only partly fulfilled these high expectations, among which were the establishment of a complete human proteome map and replenished pipelines of clinical biomarkers and drug candidates. Two factors in particular have dampened these hopes: (i) the complexity of the human proteome and (ii) the dynamics of the human proteome. Despite these persisting challenges and shortcomings, remarkable progress has been made that in turn has consolidated the role of proteomics as a leading technology in the life sciences.

Indeed, advances in two major technologies have contributed to the rapid rise of proteomics: the development of high-end mass spectrometers (MS) enabling the sequencing of proteins and peptides, and the advances in liquid chromatography from HPLC to ultra-high performance liquid chromatography (UPLC). Depletion of abundant and/or enrichment of less prominent proteins are routinely being performed as well as extensive protein and peptide prefractionation. Mass spectrometric platforms can now cover a wider dynamic range due to improved instrumentation with higher sensitivity, mass accuracy, and resolution. Moreover, software for peptide sequencing and protein identification has evolved and diversified considerably: the computing tools for data processing have matured from scoring mass spectra in terms of their peptide sequence-read fidelity to assessing the trade-off between false-positives (specificity) and false-negatives (sensitivity) in a data-dependent manner, which enables data validation and “learning on the data set” (Cox et al., 2009; Pedrioli, 2010).
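One widely used way to assess the false-positive versus false-negative trade-off on the data set itself is target-decoy estimation of the false discovery rate (FDR). The chapter does not name this technique, so the sketch below is an illustration swapped in here, not the method of the cited tools. The `PSM` record, the score scale, and the example values are all hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class PSM:
    """Hypothetical peptide-spectrum match record (not from any real tool)."""
    score: float    # higher means a better match
    is_decoy: bool  # True if the match is to a decoy (e.g., reversed) sequence

def fdr_at_threshold(psms, threshold):
    """Estimate FDR as decoy hits / target hits among matches scoring >= threshold."""
    targets = sum(1 for p in psms if p.score >= threshold and not p.is_decoy)
    decoys = sum(1 for p in psms if p.score >= threshold and p.is_decoy)
    if targets == 0:
        return 0.0, 0
    return decoys / targets, targets

def loosest_threshold(psms, max_fdr):
    """Return the lowest observed score cut-off whose estimated FDR is <= max_fdr."""
    best = None
    for cut in sorted({p.score for p in psms}, reverse=True):
        fdr, n_targets = fdr_at_threshold(psms, cut)
        if n_targets > 0 and fdr <= max_fdr:
            best = cut  # keep walking down; a lower passing cut-off replaces it
    return best

# Made-up scores for illustration only.
example = [
    PSM(9.0, False), PSM(8.0, False), PSM(7.0, False), PSM(6.0, False),
    PSM(5.0, False), PSM(4.5, True), PSM(4.0, False), PSM(3.5, True),
    PSM(3.0, False), PSM(2.0, True),
]
strict_cut = loosest_threshold(example, max_fdr=0.01)
relaxed_cut = loosest_threshold(example, max_fdr=0.25)
print(strict_cut, relaxed_cut)
```

Lowering the cut-off accepts more true identifications (fewer false negatives) at the price of more decoy-estimated false positives; the threshold is learned from the data rather than fixed in advance.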

Essentially, today’s proteomics analytical starting point is a tissue, cell, or body fluid sample whose complexity is subsequently reduced, typically at the biological (organelle isolation) (Brunet et al., 2003; Lescuyer et al., 2006) and/or biochemical (protein/peptide fractionation) (Tam et al., 2004; Manadas et al., 2010) level. Separation has traditionally consisted of protein-level gel-based techniques, which have been largely outperformed and replaced by peptide-level LC (Domon and Aebersold, 2010), with the latter often complemented by isoelectric focusing (IEF) (Fraterman et al., 2007). Quantification was classically done by protein gel spot staining and is nowadays increasingly done by MS-based protein/peptide signal integration, with (Mann, 2006; Pan and Aebersold, 2007) or without (Guillaume et al., 2009; Hansson et al., 2011) protein/peptide labeling. Finally, data acquisition is still widely executed in a data-dependent manner (Schwudke et al., 2007), that is, by selecting precursor (peptide) ions for fragmentation or sequencing (classical MS/MS), but is increasingly being challenged by data-independent modes (Bern et al., 2010), that is, by uncoupling intact-mass from fragment-mass acquisition.

The future proteomics workflow needs to deliver an optimized trade-off between effort and yield, that is, between throughput and resource use on the one hand and completeness of proteome coverage on the other.
This is increasingly being achieved by (i) maximizing sample specificity if sample amount allows for it (e.g., pure cell populations or cell organelles) (Duclos and Desjardins, 2011); (ii) multiplexed depletion of the most abundant proteins, possibly complemented by biochemical enrichment of less abundant proteins (Polaskova et al., 2010); (iii) very importantly, transferring peptide fractionation from the liquid to the gas phase, that is, restricting liquid-phase separation to 1D (or online 2D applying a pH step) LC plus gas-phase fractionation (GPF) (Yi et al., 2002); and (iv) equally relevant, data-independent acquisition (DIA) of intact and fragment peptide masses, which uncouples peptide mass determination from peptide sequencing (PAcIFIC (Acosta-Martin et al., 2011), MSE (Geromanos et al., 2009; Xie et al., 2009), and SWATH-MS (Gillet et al., 2012)). Such GPF, GPF-DIA, PAcIFIC, and SWATH routines can deliver deeper proteome coverage at higher throughput and with greater reproducibility.

All these developments and improvements on the machine, workflow, data acquisition, and processing side have put complete maps of cellular proteomes within reach (de Godoy et al., 2006; Nagaraj et al., 2012). While the extremely complex proteomes of higher organisms, such as human tissue and body fluid proteomes, have not yet been captured in their entirety, mainly due to their enormous complexity in space and time including posttranslational modifications (Lamond et al., 2012), the arsenal of proteomic technologies today is stunning and the derived answers are most impressive, extending well beyond protein identification and quantification at certain places and given moments in time. This spectrum of possibilities, questions, and answers has very recently been reviewed at a Keystone Proteomics Symposium (https://www.keystonesymposia.org/Meetings/ViewMeetings.cfm?MeetingID=1133#35119).
The pioneers of the field (Ruedi Aebersold, Matthias Mann, and Mathias Uhlén) gathered today’s top experts and assembled contributions on MS-based high-throughput, time- and space-resolved proteomics (Lamond et al., 2012) as well as global assessment of posttranslational modifications (Choudhary and Mann, 2010); the emerging human protein atlas with antibody-based protein annotation of all human tissues and for all protein-coding genes, and this across different conditions (Lundberg and Uhlen, 2010; Uhlen et al., 2010); comprehensive pictures on proteome signaling (Kuzu et al., 2012), activity (Phizicky and Grayhack, 2006; Gevaert et al., 2007; Yang et al., 2010), and turnover (Doherty and Beynon, 2006; Claydon and Beynon, 2011; Claydon et al., 2012); and the study of molecular complexes and machines like transporters, channels, and motors by measuring these assemblies directly in the MS (Damoc et al., 2007; Gold et al., 2011). Coupled with molecular and cell biology techniques as well as imaging and sorting techniques, these proteomics technologies produce terabytes of data per study and, provided substantial bioinformatic resources are available, deliver fascinating insights into systems-level function of organelles and cells. This said, proteomics requires further advancements in order to extrapolate these organelle and cell insights into more complex systems such as organs and entire higher organisms, ultimately enabling an improved understanding of the human proteome in different tissues, at different time points, and under different conditions.
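The contrast drawn in this section between data-dependent and data-independent acquisition can be sketched in a few lines. This is a simplified illustration, not instrument control code. The peak list, the top-3 setting, and the 400 to 1200 m/z range tiled with 25 m/z windows are assumptions chosen here; real data-dependent methods add further rules (for example dynamic exclusion) that are omitted.

```python
def dda_top_n(survey_scan, n=3):
    """Data-dependent: choose the n most intense precursors for fragmentation.

    survey_scan is a list of (m/z, intensity) tuples from one MS1 scan.
    """
    ranked = sorted(survey_scan, key=lambda peak: peak[1], reverse=True)
    return [mz for mz, _ in ranked[:n]]

def dia_windows(mz_start, mz_end, width):
    """Data-independent: tile the precursor range with fixed isolation windows."""
    windows = []
    lo = mz_start
    while lo < mz_end:
        hi = min(lo + width, mz_end)
        windows.append((lo, hi))
        lo = hi
    return windows

def fragmented_by_dia(survey_scan, windows):
    """Precursors that fall inside any window, regardless of their intensity."""
    return [mz for mz, _ in survey_scan if any(lo <= mz < hi for lo, hi in windows)]

# Hypothetical MS1 peaks: (m/z, intensity).
scan = [(450.2, 1.0e6), (512.7, 3.0e4), (633.3, 8.0e5), (701.9, 5.0e3), (820.4, 2.0e5)]

selected = dda_top_n(scan, n=3)
windows = dia_windows(400, 1200, 25)
covered = fragmented_by_dia(scan, windows)
print(selected)
print(len(windows), covered)
```

In this toy scan the data-dependent selection skips the two weakest precursors, whereas the window scheme fragments everything in range. That is the sense in which data-independent modes uncouple fragment acquisition from precursor selection, at the cost of more complex, multiplexed fragment spectra.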

6.2 FROM FOOD PROTEINS TO NUTRIPROTEOMICS

Nutrients and genomes interact. Dietary components are consumed in complex mixtures and not as single compounds. Hence, interactions between these compounds also affect ingredient bioavailability and bioefficacy (Kussmann et al., 2007). Proteomics has, therefore, logically developed into a central platform in nutrigenomics (Kussmann and Affolter, 2009). Under this overarching term, we attempt to holistically understand (i) how our genome is expressed in response to diet and (ii) our genetic predisposition and susceptibility toward diet (Kaput, 2008). Nutrigenomics can be deployed to stratify cohorts of subjects enrolled in intervention studies and to discern responders from nonresponders among those subjects (Hauner et al., 2003; Klenke et al., 2011). Epigenetics investigates DNA sequence-unrelated biochemical modifications of both DNA itself and DNA-binding proteins and emerges as a molecular explanation for metabolic imprinting (Gluckman et al., 2008; Kussmann et al., 2010). Mass spectrometry and proteomics play a key role here, too, as they can address posttranslational modifications (e.g., acetylations) of DNA-packaging proteins and thereby help decipher the so-termed histone code or quantitatively measure DNA methylation changes, thereby advancing our understanding of gene regulation (Trelle and Jensen, 2007; Sidoli et al., 2012).

Proteomics in nutrition delivers bioactives and biomarkers and answers questions of nutritional bioavailability and bioefficacy (Kussmann et al., 2010). Nutritional proteomics has more recently been extended toward a metaproteomics approach comprising information from three different proteome levels: host, food, and microbes (Kussmann and Van Bladeren, 2011).
While this concept is a logical extension of metagenomic studies, it comes with the proteomics-typical additional challenge of a (meta)proteome being far more complex than any (meta)genome: the host proteome will continue to be intensely assessed to reveal nutritional health biomarkers; a variety of animal and plant food proteomes have been and are still being extensively characterized (Panchaud et al., 2011); by contrast, the metagenome and metaproteome of the intestinal microbiota, that is, of all gut-residing bacteria, are only beginning to be appreciated as a fundamental host health factor affecting, for example, energy balance (Samuel et al., 2008; Velagapudi et al., 2010) and immunity. While high-throughput and deep sequencing now enable an in-depth characterization of the intestinal bacterial population structure beyond phylum and down to family, possibly even species level, the question is now moving from “who is there?” to “who is doing what?”, that is, from a population census to an activity profiling, the latter potentially being enabled by metaproteomic analyses.

6.3 NUTRITIONAL PEPTIDE AND PROTEIN BIOACTIVES

6.3.1 Nutrition and Immunity: A Bioactive Perspective from Human Breast Milk

Breast milk is the most important food for the newborn because of its unique nutrient composition, which meets all the critical needs for growth, development, and immune protection during early life. Breast-feeding is associated with lower incidence of necrotizing enterocolitis (Le Huerou-Luron et al., 2010) and diarrhea (Le Huerou-Luron et al., 2010; Sazawal et al., 1992) during early life and with lower incidence of inflammatory bowel disease (IBD) (Loland et al., 2007), type 2 diabetes mellitus (T2DM) (Schrezenmeir and Jagla, 2000; Villegas et al., 2008), and obesity (Arenz, 2008; Fewtrell, 2011) later in life. According to the ESPGHAN Committee on Nutrition, “the degree of health benefits derived from breast-feeding is higher in developing countries than in developed countries, and is inversely proportional to the socioeconomic level of the population, which is obviously lower in developing than in developed countries” (Bjorck et al., 2010).

The high nutritional and protective value of human milk is related to its nutritional composition, which changes over the lactation period, and to biological activities of specific micro- and macronutrients. The generally slower growth rate of breast-fed infants compared to formula-fed infants may be attributed to a self-regulation of milk intake via the more direct mother–child interaction as opposed to the less-demanding bottle-suckling and easier overfeeding of formula-fed infants.

Many of the beneficial outcomes of breast-feeding are conferred by the protein complement of breast milk. Human milk is unique in its bulk protein composition: whey accounts for the major part of protein (60–80%), whereas caseins represent a much smaller fraction. The whey protein fraction is dominated by a small number of abundant proteins which constitute over 80% of its protein content (Tremblay and Leonil, 2003).
The protein β-lactoglobulin alone constitutes 50% of whey. Another study, based on the parallel use of electrospray ionization and matrix-assisted laser desorption/ionization (MALDI) sources, enhanced protein identification yield with a total of 39 bovine milk proteins identified with a high degree of confidence (Molle et al., 2009).

Comparative proteomic analysis of human and bovine milk proteins was performed through a combined approach based on functional gene ontology enrichment, hierarchical clustering plus pathway and network analyses, all applied to merged data from literature-derived human milk protein studies (D’Alessandro et al., 2010). A core of 106 proteins was established, with most of the entries associated with three main biological functions: nutrient transport and lipid metabolism, immune response, and cellular proliferation. The pivotal role of a series of proteins involved in lipid/vitamin/nutrient transport and humoral immune responses against pathogen infections was evidenced. This approach highlighted the biological scope of human milk function as extending from providing the suckling infant with nutrients and defense molecules against pathogens to direct growth stimulation and to the development of a proper independent immune system.

Recently, low-abundance proteins in the human milk casein fraction prepared by acid precipitation were investigated by the group of Lönnerdal using proteomics (Liao et al., 2011): 82 proteins were identified in the casein micelle, 18 of which are not present in the whey compartment. Thirty-two of these proteins specifically associated with the casein micelle had not been identified previously in human milk or colostrum. Proteins involved in immune function comprised the largest share (28%) of all identified proteins, closely followed by a proportion involved in metabolism and energy production (22%). Most of the proteins were of extracellular or cytoplasmic origin (50% and 29%, respectively). This study provides new insights into how to adequately proportion casein and casein-associated proteins in infant formula. The dynamic changes in whole milk proteins during lactation, however, have not yet been fully characterized.
Human colostrum and mature milk differ substantially in the quantity of total protein as well as in protein composition. Some more recent studies have, however, shed more light on the protein profile changes of human milk over a 12-month lactation period and identified proteins with differentially regulated abundance during lactation, including low-abundance proteins (Liao et al., 2011). Significant efforts have been made to characterize the milk fat globule membrane (MFGM) proteins. Milk fat globules are the natural colloidal assemblies secreted by the mammary epithelial cells to provide proteins, lipids, and other bioactive molecules to the newborn. It was shown recently that addition of an MFGM-enriched protein fraction to complementary food has a beneficial effect on diarrhea in infants (Zavaleta et al., 2011). Qualitative and quantitative proteomic profiling of two MFGM-enriched milk fractions, a whey protein concentrate and a buttermilk protein concentrate, using LC–MS/MS-based shotgun proteomics revealed the presence of 244 proteins in the whey concentrate and 133 in the buttermilk concentrate (Affolter et al., 2010). Our group furthermore compiled scientific papers on milkomics, that is, expanding from proteomics-only studies to glycomic and lipidomic investigations of human breast milk (Casado et al., 2009). None of these three “milkomes” is near completion, with the human milk proteome being more advanced in terms of cataloguing and annotation than the milk oligosaccharide (glycome) and fat (lipidome) compositions. To overcome the limitations of the classical electrophoresis-based approach, MFGM and whey protein fractions were analyzed by nanoflow-HPLC coupled to Fourier transform–ion cyclotron resonance (FT-ICR) MS. This high-resolution shotgun proteomics strategy showed an as yet unmatched potential to profile low-abundance proteins in human milk (Picariello et al., 2012).
Proteins associated with 301 different gene products were identified. These proteins relate to multiple metabolic pathways and are involved in various physiological functions, such as membrane trafficking, cell signaling, fat metabolism and transport, metabolite delivery, protein synthesis/proteolysis or folding, and immunity-related actions. In addition, relative quantification of the MFGM proteins during the course of lactation was performed (Liao et al., 2011). Data showed that some proteins decreased in abundance during the course of lactation, whereas others increased or remained at a relatively constant level. MFGM components are involved in lipid and energy metabolism, important in milk synthesis and secretion, but also in immune function, with the latter possibly explaining the difference in prevalence of infections between breast-fed and formula-fed infants, because infant formulas are devoid of MFGM proteins. In summary, this study is the first attempt to use a quantitative proteomic approach that enables a more comprehensive understanding of the human MFGM proteome.
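
The three abundance trends reported for MFGM proteins over lactation (decreasing, increasing, or roughly constant) can be illustrated with a minimal sketch. It assumes a simple fold-change rule on mean abundances at an early and a late time point; the function name `classify_trend`, the 1.5-fold threshold, and all abundance values are hypothetical and are not taken from the cited study, which may have used a different statistical procedure.

```python
from statistics import mean


def classify_trend(early, late, threshold=1.5):
    """Classify a protein by the ratio of mean late to mean early abundance.

    The threshold is an illustrative assumption, not a published cutoff.
    """
    ratio = mean(late) / mean(early)
    if ratio >= threshold:
        return "increased"
    if ratio <= 1 / threshold:
        return "decreased"
    return "constant"


# Hypothetical replicate abundances (arbitrary units): (early lactation, late lactation)
abundances = {
    "protein_A": ([10.0, 12.0], [30.0, 34.0]),
    "protein_B": ([50.0, 54.0], [20.0, 18.0]),
    "protein_C": ([8.0, 9.0], [9.0, 8.5]),
}

# Map each protein name to its trend label
trends = {name: classify_trend(e, l) for name, (e, l) in abundances.items()}
print(trends)
```

In practice, a fold-change rule of this kind would normally be combined with a significance test across replicates before a protein is called regulated.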

6.4 NUTRITIONAL PEPTIDE AND PROTEIN BIOMARKERS

6.4.1 Biomarkers in Nutrition

A biomarker is a measurable change related to a phenotype. Molecular biomarkers are, for example, responsive, specific, and applicable variations in mRNA, protein, or metabolite concentrations. Nutritional biomarkers help to understand how nutrient absorption, transport, and metabolism within an organism translate into an effective dose at the target tissue. Biomarkers of susceptibility consider host factors (such as genetic predisposition) as well as environmental and lifestyle factors. A valid nutritional biomarker links, for example, a specific exposure to a dietary compound to a health outcome. The aforementioned processes can be interpreted as a continuum connecting dose, exposure, and effect, with biological steps along the pathway being potentially observed, monitored, and quantified by means of biomarkers. Markers of internal dose are direct measures of a dietary compound or its metabolites at the systemic level, for example, in body fluids. More specific, but also more difficult to measure, are biomarkers for the dose of a dietary compound or metabolite at target tissues. Markers of biologically effective nutrient dose assess the interaction of a food constituent with its molecular targets. Biomarkers for disease prevention and nutritional/therapeutic intervention may be measured anywhere along the pathway, with earlier markers bearing a greater potential to avert disease and later markers being more closely related to the disease (Kussmann et al., 2007).

The challenges of applying biomarkers to nutrition research are less related to the sensitivity of the analytical techniques than to a lack of accuracy in terms of understanding the mechanisms of action and bioavailability of bioactive food components and to insufficient validation in terms of application to large populations.
Furthermore, most micronutrients at normal dietary dose levels are only weakly biologically active in the short term and exert these moderate acute effects on multiple targets. Some of these micronutrients even have a narrow dose window (for example, selenium; Schrauzer, 2000; Andrews, 2010), below which they may be inactive and above which they may even be deleterious. Therefore, in most cases, a battery of biomarkers needs to be measured to comprehensively describe the entire continuum, from exposure via effect to end point (Crews et al., 2001). Key issues to consider when assessing the use of biomarkers in studies of nutrition and health are: (i) timing of measurement related to bioavailability and bioefficacy; (ii) biomarkers may correlate with intake but often represent the combined result of intake, absorption, metabolism, and excretion; (iii) biomarkers may correlate poorly with intake amount in the case of homeostatic regulation of nutrient levels in body fluids; and (iv) both host genetics and environmental factors influence the correlation between dietary intake and biomarkers (Kussmann and Affolter, 2009).

6.4.2 Nutrition and Immunity: A Biomarker Perspective

Nutrition has a strong influence on immune status, development, and decline. Consequently, nutritional modulation of immunity is a major axis in nutrition and health research with the objectives to favorably “program” neonate immunity, maintain immune homeostasis throughout life, and reinforce immunity in the elderly. Modern immune-modulating nutrition accompanies the consumer through their life stages and styles (Kussmann and Blum, 2007).

Our corporate research organizes yearly symposia on nutrition and health, and the 2011 event’s theme was “nutrition and the immune system” (http://www.nestle.com/Media/NewsAndFeatures/Pages/Nestle_International_Nutrition_Symposium.aspx). During this symposium, the challenging topic of the increasing incidence of chronic inflammatory disorders and allergy/asthma was discussed. These disorders are widely recognized to result from a combination of environmental and individual risk factors, constituting a very significant health burden worldwide. Emphasis during the conference was placed on the modulating role of specific nutrients under circumstances of immune suppression and inflammation. Gökhan Hotamişligil, for example, talked about mechanisms linking nutrients and chronic inflammatory responses and resulting implications for metabolic disease (Fu et al., 2011). Other contributions related to allergy and inflammation are referred to later in this section.

While mass spectrometry is a powerful tool to develop and apply biomarkers for immune status and nutritional immune modulation, it is to date largely underdeployed in this context. By contrast, immune-relevant food sources like milk have been extensively investigated by MS in terms of their protein and peptide complement (see preceding section).
A few nutritional interventions have been monitored with regard to their immune effects, mainly by assessing the peripheral blood mononuclear cell (PBMC) proteome, which generally serves as an accessible and relevant, although heterogeneous, immune cell population readily amenable to mass spectrometric proteomics.

6.4.2.1 Proteomics in Systemic Immunity

Immunomodulation markers in human nutrition intervention studies have been reviewed by Albers et al. (2005). However, these markers do not descend from proteomic projects but rather reflect targeted measurements of biomolecules or read-outs from cellular assays, typically performed in (pre)clinical settings. By contrast, Tjalsma et al. (2008) reviewed immunoproteomics with a focus on circulating serum antibodies as useful clinical markers. The amplification cascade of the humoral immune system elevates the concentrations of these circulating antibodies after the appearance of the corresponding (low-abundance) antigen. Together with the high stability of these antibodies compared to many other serum proteins, these features seem to make them ideal probes in clinical diagnostic assays. The cited article reviews immunoproteomics at the level of technologies for biomarker discovery, emphasizing recently developed gel-free MS-based approaches, and previews potential immunoproteomic applications in diagnostic medicine. Moving from diagnostics to interventions, De Roos and McArdle (2008) presented an overview of (immune-related) nutritional studies as monitored by mass spectrometry. In view of the rather few studies on micronutrient effects on the proteome, their article looks at dietary intervention studies that have deployed 2DE and MALDI-MS-based proteomics to reveal pathway changes related to glucose and fatty acid metabolism, redox status, oxidative stress, and antioxidant defense. While citing mainly classical gel-rooted proteomics discovery workflows, the authors highlight the challenge of measuring the regulation of low-abundance proteins, for example, those involved in inflammation, and of validating candidate biomarkers in human biofluids. They rightly state that this will increasingly depend on more quantitative and sensitive methods like multiple reaction monitoring (MRM) and multiplexed immunoassays.
A related review by the same group deals with the more technical study challenges and discusses developments in nutritional proteomics analyzing plasma, platelets, and PBMCs, including issues related to study design, sample preparation, and data interpretation (de Roos, 2008).

6.4.2.2 Proteomics in Intestinal Immunity

Zooming in from systemic to organ or local immunity, Song and Hanash (2006) described protein microarrays, mass spectrometry-based proteomic tools, and guidelines for biomarker development in gastrointestinal immune disorders. The authors highlight the issue of still-missing biomarkers for prevalent gut-inflammatory conditions such as inflammatory bowel disease (IBD) and irritable bowel syndrome (IBS), and place proteomics in a position to change that. Within IBD, better markers are needed to distinguish between Crohn’s disease (CD) and ulcerative colitis (UC) and to improve diagnosis and prediction of therapy. Purcell and Gorman (2004) reviewed mass spectrometry-based studies of immune responses and discussed the contribution of proteomics to the elucidation of the cytotoxic T lymphocytes, the T-cell–B-cell cooperation and antibody secretion, defining targets of T-cell immunity, discovery of T-cell epitopes, analysis of antigen-presenting cell (APC) surface proteins, and the sequencing of major histocompatibility complex (MHC)-bound peptides. Addressing a more specific immune context, Weingarten et al. (2005) discussed the application of mass spectrometric protein analysis to biomarker and target finding for immunotherapy. Their article focuses on regulatory T cells, which play a central role in maintaining the immunological balance and inhibiting T-cell activation both in vivo and in vitro.

Inflammation is an essential immune process whereby tissues of the body respond to injury or infection (Medzhitov, 2010). It has been characterized as purposeful, timely, powerful and, as a consequence, also as dangerous if it is not timely and appropriately resolved (Kussmann and Blum, 2007). The normal outcome of acute inflammation is its successful resolution and repair of tissue damage, rather than a persisting inflammatory response (Henson, 2005).
Although inflammation is essential for tissue homeostasis, prolonged inflammation is a characteristic feature of many chronic diseases, such as IBD and autoimmunity. Moreover, chronic inflammation has been shown to be implicated in critical conditions such as atherosclerosis, arthritis, cancer, and asthma, all leading to tissue destruction, fibrosis, and impairment or loss of organ function. Celiac disease and the IBD subtype CD are inflammatory conditions of aberrant gastrointestinal mucosal immune function (James, 2005). Celiac disease is a disorder of the small intestine characterized by chronic inflammation of the mucosa caused by loss of tolerance to dietary antigens, such as antigenic peptides in wheat, rye, and barley. Hallmarks of CD are chronic gastrointestinal inflammation and association with several genetic mutations, with at least one of them being implicated in innate immunity. Gut inflammatory diseases, like IBD and its subtypes CD and UC, can also give rise to abnormal immune responses to gut microbiota (James, 2005) (see also “The Human Host and Their Gut Microbiota”):

– Ng and coworkers (2011) investigated the relationship between human intestinal dendritic cells (DCs), gut microbiota, and CD activity. They found that intestinal DC IL-6 production is increased in patients with CD and correlated with disease activity and C-reactive protein (CRP). In terms of host–microbe interaction, they suggest that bacterially driven local IL-6 production by host intestinal DCs may result in unopposed effector function and tissue damage. This would mean that intestinal DC function can be influenced by the commensal microbiota composition.
– In another project linking intestinal microbiota and gut inflammation, Pruteanu and colleagues (2011) studied the degradation of extracellular matrix components by bacterial-derived metalloproteases: proteolytic degradation of the extracellular matrix is a hallmark of mucosal homeostasis and tissue renewal but also contributes to complications of intestinal inflammation. It is not known how much of this activity is host-derived and how much is exhibited by the gut microbiota. Therefore, Pruteanu et al. screened fecal bacterial colonies from healthy controls, UC subjects, and patients with CD for gelatinolytic activity. The group concluded that microbial proteolytic activity has the capacity to contribute to mucosal homeostasis and may participate in IBD pathogenesis.

Several of the rather few mass spectrometry-based in vivo proteomic studies in the context of intestinal inflammation have been performed by the Bendixen team, using the omnivore piglet as a model: for example, they compared proteome patterns of healthy and inflamed gut tissues harvested from preterm piglets to investigate the effect of inflammation on acquisition of passive immunity (Danielsen et al., 2006). The molecular differences in the 2DE protein patterns between healthy and inflamed intestinal tissues suggested that inflamed tissues fail to absorb and transfer Ig from colostrum to epithelial cells. Mass spectrometry identified isoforms of IgA and IgG heavy chains as well as Ig κ and λ light chains as being absorbed by healthy intestinal tissues and indicated that colostrum protein uptake in the porcine gut is a selective process deranged in inflamed preterm intestine.

Allergy

Food allergy is an adverse reaction to food or food additives with an underlying immunological mechanism (Brandtzaeg, 2010). At our nutrition symposium on immunity held in 2011, the allergy and asthma sessions addressed the role of nutrition in the early development and education of the immune system with the aim of promoting appropriate immune responses later in life. Erika von Mutius reported on protection from childhood asthma and allergies in farming environments, delivering epidemiological data and possible mechanisms underlying the so-termed “hygiene” hypothesis. She recently reviewed the influence of exposure to environmental microorganisms on childhood asthma (Ege et al., 2011). At the same conference, Rolf Zinkernagel reported on long-lasting immunity conferred by early infection of infants who were protected by maternal antibodies (Navarini et al., 2010), and Per Brandtzaeg discussed secretory IgA and the intestinal barrier function as determinants of homeostasis and allergy (Brandtzaeg, 2010).

In contrast to nutritional immune marker studies, where proteomics could and should be deployed more widely, proteomics is already an established platform for detection, identification, and characterization of allergens. Many food allergens have been identified and often structurally well characterized, typically by mass spectrometry. The risk posed by food allergens necessitates detecting and monitoring (potential) allergens before, during, and after food processing (Eigenmann, 2001) and, consequently, a list of the ten most sensitizing proteins has been compiled. Although this list varies geographically, these allergenic proteins and peptides basically derive from egg, fish, shellfish, milk, soy, wheat, peanuts, tree nuts, citrus fruits, and sesame seeds. Most of these food allergens are glycoproteins in the range from 14 to 40 kDa (Chandra, 1997).
These physicochemical characteristics make them easily amenable to mass spectrometric and proteomic analysis, with their power to identify, sequence, and quantify proteins, to characterize posttranslational modifications such as glycosylation, and to differentiate between protein isoforms (Kussmann et al., 2005; Kussmann and Affolter, 2006).

6.5 ECOSYSTEM-LEVEL UNDERSTANDING OF NUTRITIONAL HOST HEALTH

6.5.1 The Human Host and Their Gut Microbiota

Humans and other mammals are colonized by a complex and dynamic community of microorganisms. In fact, in terms of cell numbers, adult humans can be considered as a “eukaryotic minority in a prokaryotic ecosystem,” with 90% of the cells present in the human body estimated to be microbial and only 10% human (Savage, 1986). The impact of these microbial consortia on human health is probably most important in the intestine, because this organ is inhabited by most of these bacteria: while the microbial densities in the proximal and middle small intestine are relatively low, they increase markedly in the distal small intestine (∼10⁸ bacteria/mL of luminal contents) and colon (10¹¹–10¹²/g) (Savage, 1986). The microbiota in the adult human body encompasses a huge biomass of >100,000 billion bacteria, spread over ∼1500 different species in the gut alone. These microbes exert intense metabolic activity, predominantly in the colon, and play an important physiological role in the host (Bourlioux et al., 2003). One of the central functions of the colonic microbiota is its capability to resist colonization by any external new strain of bacteria (Bourlioux et al., 2003).

Within the gastrointestinal tract, food is partly metabolized by the above-introduced bacteria residing in the stomach and, in particular, in the gut. From this angle, the gut microbes can be interpreted as a collective metabolically active “organ” affecting the host’s energy metabolism and immunity (Macpherson and Harris, 2004; Turnbaugh et al., 2006). This concept has been championed by the groups of Gordon (Backhed et al., 2005) and Nicholson (Nicholson et al., 2005) and further pursued by ourselves (Rabot et al., 2010; Harris et al., 2012).
The microbiota can degrade a variety of dietary substances that are otherwise nondigestible and, therefore, inaccessible to the host (Savage, 1986); one such example is the energy harvest from complex carbohydrates (Hooper et al., 2002). Gut colonization by commensal bacteria has also been demonstrated to influence the modulation of genes involved in nutrient absorption, gastrointestinal and mucosal immune function, and xenobiotic metabolism (Hooper and Gordon, 2001). Returning to the central theme of this chapter, that is, nutrition, proteomics, and immunity, the microbiota shapes the host immune system via a complex interplay throughout life; however, little is known about the microbial effect on key players of the adaptive immune system, the B2 B cells. In a study on microbial effects on systemic immunity, our group therefore evaluated the effect of commensal bacteria on B cell ontogeny and function (Hansson et al., 2011); combining classical immunology with transcriptomics, we revealed an influence of gut microbiota on the function of mucosal B2 B cells. We furthermore executed a closely related project on the impact of the gut microbiome on local, that is, intestinal immunity: dynamic postnatal intestinal development is characterized by morphological changes coinciding with functional adaptation to the nutritional change from a diet rich in fat (milk) to a diet rich in carbohydrates from weaning. We analyzed changes of primary intestinal epithelial cells from jejunum during the early and middle suckling periods, as well as the weaning period, in mice, using a label-free proteomics approach (Hansson et al., 2011), thereby providing the first time-resolved proteomics study of intestinal epithelial cells along postnatal intestinal development.
Investigations on host–microbiome interactions with regard to immune balance, energy metabolism, or overall health trajectories and disease induction can to date be classified into two types. (i) Human cohorts or (groups of) populations that differ in their metabolic or immune health have been compared at the level of their GI microbiome, mostly by 16S rRNA sequencing and more recently by deep sequencing of the entire transcriptome of the metaorgan, thereby providing a much greater population census at much higher throughput than would be accessible by culturing only the aerobic part of the gut bacteria. These studies are typically performed in and by large consortia: for example, the International Human Microbiome Consortium (IHMC) (http://www.human-microbiome.org/); the NIH Human Microbiome Project (http://www.hmpdacc.org/); the MetaHIT group (http://www.metahit.eu/); and the MetMicrOBES project (http://www.inra.fr/micro_obes_eng/). They have delivered associations between health and disease conditions and gut-residing bacterial populations. (ii) Gnotobiotic, that is, a priori gut-sterile animal models (mostly rodents such as mice and rats, but also piglets) have been colonized with single or few defined gut bacterial strains, sometimes sequentially, sometimes simultaneously, to study host–bacterial interactions per strain and/or to study a (much) simplified, but defined, gut bacteria–host collective as a model for the entire ecosystem, respectively. While these studies have yielded a wealth of data and insights into host–microbe interactions, it is still an enormous challenge to simulate, for example, the complex bacterial–mucosal immune interaction using in vivo models. In one such attempt at in vivo modeling of host–guest immune interactions, Nicholls et al. (2003) elucidated metabolic events concomitant with acclimatization of germ-free rats to standard laboratory conditions.
In order to unravel gut microbial effects under physiologically relevant conditions, animals with an a priori sterile gastrointestinal tract (GIT) that are then monocolonized with probiotics are now used as a suitable model, especially the gnotobiotic mouse (Falk et al., 1998) but also germ-free piglets (Danielsen et al., 2007). The latter model was recruited to investigate how bacterial colonization affects the porcine intestinal proteome (Danielsen et al., 2007): small-intestinal protein expression patterns in gnotobiotic pigs maintained germ-free or monoassociated with either Lactobacillus fermentum or nonpathogenic Escherichia coli revealed that bacterial colonization differentially affected proteolysis, epithelial proliferation, and lipid metabolism, which confirms studies of other germ-free animal models. Gut ecology is extremely complex and requires an ecosystem-level metagenomics approach to understand the health impact of the intestinal microbiota and probiotics (Xu and Gordon, 2003). Logically, nutrigenomics has, therefore, been extended via metagenomics toward metaproteomics, collecting information from all three proteomes: host, food, and microbes (Kussmann and Van Bladeren, 2011). While techniques such as microarrays and high-throughput sequencing (Hamady and Knight, 2009) have delivered comprehensive data on the intestinal bacterial population structure (Turnbaugh et al., 2006), more recently the question, as mentioned above, has changed from “who is there?” to “who is doing what?”, that is, from a population census (Turnbaugh and Gordon, 2009) to an activity profiling, the latter being facilitated by metaproteomic analyses (Verberkmoes et al., 2009). The latter approach adds the proteomics-typical additional challenge of any (meta)proteome being much more complex than any (meta)genome. Biagi et al.
(2012), for example, delivered a metagenomic perspective of the aging process in the human gut: they introduce human beings as “metaorganisms,” taking a more holistic view of the aging process in which the interaction between environment, intestinal microbiota, and host is taken into consideration. Age-related physiological changes of the gastrointestinal tract, lifestyle, nutritional behavior, and the host’s immune system affect the gut microbial ecosystem. Biagi et al. review the current knowledge of gut microbial changes in aging people and propose age-related gut microbial imbalances to be involved in “inflamm-aging” and immunosenescence. In view of the importance of gut microbiota homeostasis for host health, they consider medical and nutritional applications based on probiotic and prebiotic preparations specific for the elderly. They also review the few clinical intervention trials reporting the use of pre-/probiotics in the elderly.

6.6 CONCLUSIONS AND PERSPECTIVES

By balancing their diet, consumers want to optimize some health aspects without compromising others. Holistic and integrative approaches are, therefore, essential. Proteomics is a central platform in nutrigenomics, which attempts to holistically understand how our genome is expressed as a response to diet. From a molecular perspective, nutritional proteomics covers two dimensions: characterization of bioactive food proteins and peptides, beyond bulk macronutrients, and discovery and quantification of these bioactives and of biomarkers of health, diet, and nutritional intervention.

The further success of proteomics in nutrition and health will depend on multiple factors. First, the proteomic technology per se will benefit from ever-improving protein/peptide separation, depletion, and enrichment on the one hand and more sensitive, accurate, and specific MS on the other. Second, data processing and interpretation is probably today’s major bottleneck, which requires continuously improving tools, including those for data visualization and, most importantly, cross-platform correlation. Last but not least, the third area for improvement concerns the analytical strategy: focusing on proteome subsets, be it at the level of cell organelles, protein subclasses, or the mass spectral level (targeted proteomics), will provide deeper insights into preselected molecular networks.

Apart from this expected progress at platform level, the technology will increasingly benefit from its cross-correlation with gene expression analysis and metabolite profiling. One option for addressing the interrelated timing of gene and protein expression is the investigation of protein turnover at proteomic scale but single-protein resolution, that is, interpreting protein abundance changes as a result of both protein synthesis and degradation rather than taking proteomic “snapshots.”

Given the complexity and dynamics of proteomes, proteomics is currently experiencing a paradigm shift.
Strategically speaking, the original hypothesis-free discovery workflow is being increasingly complemented by either hypothesis-driven analysis or candidate-based targeted analysis and validation. Notably, proteomics has thereby developed from a pure discovery tool into a screening and validation tool. A further change in nutritional proteomic “philosophy” is rooted in the increasing appreciation of peptides as bioactive, health-beneficial food components. The analysis of such peptides requires a different analytical approach because these entities vary much more in their chemical nature than classical tryptic peptides generated in shotgun proteomics workflows for protein biomarker identification. These recent developments in nutriproteomics are forming a new basis for understanding the biological value of milk and its protein and peptide content, which will further strengthen our knowledge about why breast milk is best for a healthy newborn development.

REFERENCES

Acosta-Martin AE, Panchaud A, Chwastyniak M, Dupont A, Juthier F, Gautier C, Jude B, Amouyel P, Goodlett DR, Pinet F (2011). Quantitative mass spectrometry analysis using PAcIFIC for the identification of plasma diagnostic biomarkers for abdominal aortic aneurysm. PLoS One 6(12):e28698.
Affolter M, Grass L, Vanrobaeys F, Casado B, Kussmann M (2010). Qualitative and quantitative profiling of the bovine milk fat globule membrane proteome. Journal of Proteomics 73(6):1079–1088.
Albers R, Antoine JM, Bourdet-Sicard R, Calder PC, Gleeson M, Lesourd B, Samartín S, Sanderson IR, Van Loo J, Vas Dias FW, Watzl B (2005). Markers to measure immunomodulation in human nutrition intervention studies. The British Journal of Nutrition 94(3):452–481.
Andrews PJ (2010). Selenium and glutamine supplements: where are we heading? A critical care perspective. Current Opinion in Clinical Nutrition and Metabolic Care 13(2):192–197.
Arenz S (2008). [Do breast-fed children have a lower risk for later obesity? Discussion of a meta-analysis]. Gesundheitswesen 70(Suppl 1):S25–S28.
Backhed F, Ley RE, Sonnenburg JL, Peterson DA, Gordon JI (2005). Host-bacterial mutualism in the human intestine. Science 307(5717):1915–1920.
Bern M, Finney G, Hoopmann MR, Merrihew G, Toth MJ, MacCoss MJ (2010). Deconvolution of mixture spectra from ion-trap data-independent-acquisition tandem mass spectrometry. Analytical Chemistry 82(3):833–841.
Biagi E, Candela M, Fairweather-Tait S, Franceschi C, Brigidi P (2012). Ageing of the human metaorganism: the microbial counterpart. Age (Dordr) 34(1):247–267.
Bjorck S, Brundin C, Lörinc E, Lynch KF, Agardh D (2010). Screening detects a high proportion of celiac disease in young HLA-genotyped children. Journal of Pediatric Gastroenterology and Nutrition 50(1):49–53.
Bourlioux P, Koletzko B, Guarner F, Braesco V (2003). The intestine and its microflora are partners for the protection of the host: report on the Danone Symposium “The Intelligent Intestine,” held in Paris, June 14, 2002. The American Journal of Clinical Nutrition 78(4):675–683.
Brandtzaeg P (2010). Food allergy: separating the science from the mythology. Nature Reviews. Gastroenterology & Hepatology 7(7):380–400.
Brunet S, Thibault P, Gagnon E, Kearney P, Bergeron JJ, Desjardins M (2003). Organelle proteomics: looking at less to see more. Trends in Cell Biology 13(12):629–638.
Burdge GC, Hoile SP, Uller T, Thomas NA, Gluckman PD, Hanson MA, Lillycrop KA (2011). Progressive, transgenerational changes in offspring phenotype and epigenotype following nutritional transition. PLoS One 6(11):e28282.
Casado B, Affolter M, Kussmann M (2009). OMICS-rooted studies of milk proteins, oligosaccharides and lipids. Journal of Proteomics 73(2):196–208.
Chandra RK (1997). Food hypersensitivity and allergic disease: a selective review. The American Journal of Clinical Nutrition 66(2):526S–529S.
Choudhary C, Mann M (2010). Decoding signalling networks by mass spectrometry-based proteomics. Nature Reviews. Molecular Cell Biology 11(6):427–439.
Cifuentes A (2009). Food analysis and foodomics. Journal of Chromatography A 1216(43):7109.
Claydon AJ, Beynon RJ (2011). Protein turnover methods in single-celled organisms: dynamic SILAC. Methods in Molecular Biology 759:179–195.
Claydon AJ, Thom MD, Hurst JL, Beynon RJ (2012). Protein turnover: measurement of proteome dynamics by whole animal metabolic labelling with stable isotope labelled amino acids. Proteomics 12(8):1194–1206.
Cox J, Matic I, Hilger M, Nagaraj N, Selbach M, Olsen JV, Mann M (2009). A practical guide to the MaxQuant computational platform for SILAC-based quantitative proteomics. Nature Protocols 4(5):698–705.
Crews H, Alink G, Andersen R, Braesco V, Holst B, Maiani G, Ovesen L, Scotter M, Solfrizzo M, van den Berg R, Verhagen H, Williamson G (2001). A critical assessment of some biomarker approaches linked with dietary intake. British Journal of Nutrition 86(Suppl 1):S5–S35.
D’Alessandro A, Scaloni A, Zolla L (2010). Human milk proteins: an interactomics and updated functional overview. Journal of Proteome Research 9(7):3339–3373.
Damoc E, Fraser CS, Zhou M, Videler H, Mayeur GL, Hershey JW, Doudna JA, Robinson CV, Leary JA (2007). Structural characterization of the human eukaryotic initiation factor 3 protein complex by mass spectrometry. Molecular & Cellular Proteomics 6(7):1135–1146.
Danielsen M, Hornshoj H, Siggers RH, Jensen BB, van Kessel AG, Bendixen E (2007). Effects of bacterial colonization on the porcine intestinal proteome. Journal of Proteome Research 6(7):2596–2604.
Danielsen M, Thymann T, Jensen BB, Jensen ON, Sangild PT, Bendixen E (2006). Proteome profiles of mucosal immunoglobulin uptake in inflamed porcine gut. Proteomics 6(24):6588–6596.
de Godoy LM, Olsen JV, de Souza GA, Li G, Mortensen P, Mann M (2006). Status of complete proteome analysis by mass spectrometry: SILAC labeled yeast as a model system. Genome Biology 7(6):R50.
de Graaf AA, Venema K (2008). Gaining insight into microbial physiology in the large intestine: a special role for stable isotopes. Advances in Microbial Physiology 53:73–168.
de Roos B (2008). Proteomic analysis of human plasma and blood cells in nutritional studies: development of biomarkers to aid disease prevention. Expert Review of Proteomics 5(6):819–826.
de Roos B, McArdle HJ (2008). Proteomics as a tool for the modelling of biological processes and biomarker development in nutrition research. The British Journal of Nutrition 99(Suppl 3):S66–S71.
Dimitrov DV (2011). The human gutome: nutrigenomics of the host-microbiome interactions. OMICS 15(7–8):419–430.
Doherty MK, Beynon RJ (2006). Protein turnover on the scale of the proteome. Expert Review of Proteomics 3(1):97–110.

Domon B, Aebersold R (2010). Options and considerations when selecting a quantitative proteomics strategy. Nature Biotechnology 28(7):710–721. Duclos S, Desjardins M (2011). Organelle proteomics. Methods in Molecular Biology 753:117– 128. Ege MJ, Mayer M, Normand AC, Genuneit J, Cookson WO, Braun-Fahrlander¨ C, Heederik D, Piarroux R, von Mutius E; GABRIELA Transregio 22 Study Group (2011). Exposure to environmental microorganisms and childhood asthma. New England Journal of Medicine 364(8):701–709. Eigenmann PA (2001). Food allergy: a long way to safe processed foods. Allergy 56(12):1112– 1113. Falk PG, Hooper LV, Midtvedt T, Gordon JI (1998). Creating and maintaining the gastroin- testinal ecosystem: what we know and need to know from gnotobiology. Microbiology and Molecular Biology Reviews 62(4):1157–1170. Farrah T, Deutsch EW, Kreisberg R, Sun Z, Campbell DS, Mendoza L, Kusebauch U, Brusniak MY, Huttenhain¨ R, Schiess R, Selevsek N, Aebersold R, Moritz RL (2012). PASSEL: The PeptideAtlas SRMexperiment library. Proteomics 12(8):1170–1175. Fewtrell MS (2011). Breast-feeding and later risk of CVD and obesity: evidence from ran- domised trials. Proceedings of the Nutrition Society 70(4):472–477. Fraterman S, Zeiger U, Khurana TS, Rubinstein NA, Wilm M (2007). Combination of pep- tide OFFGEL fractionation and label-free quantitation facilitated proteomics profiling of extraocular muscle. Proteomics 7(18):3404–3416. Fu S, Yang L, Li P, Hofmann O, Dicker L, Hide W, Lin X, Watkins SM, Ivanov AR, Hotamis- ligil GS (2011). Aberrant lipid metabolism disrupts calcium homeostasis causing liver endoplasmic reticulum stress in obesity. Nature 473(7348):528–531. Geromanos SJ, Vissers JP, Silva JC, Dorschel CA, Li GZ, Gorenstein MV, Bateman RH, Langridge JI (2009). The detection, correlation, and comparison of peptide precursor and product ions from data independent LC-MS with data dependant LC-MS/MS. Proteomics 9(6):1683–1695. 
Gevaert K, Impens F, Van Damme P, Ghesquiere` B, Hanoulle X, Vandekerckhove J (2007). Applications of diagonal chromatography for proteome-wide characterization of protein modifications and activity-based analyses. FEBS J 274(24):6277–6289. Gillet LC, Navarro P, Tate S, Rost¨ H, Selevsek N, Reiter L, Bonner R, Aebersold R (2012). Targeted data extraction of the MS/MS spectra generated by data independent acquisition: a new concept for consistent and accurate proteome analysis. Molecular and Cell Proteomics. DOI: 10.1074/mcp.O111.016717. Gluckman PD, Beedle AS, Hanson MA, Yap EP (2008). Developmental perspectives on individual variation: implications for understanding nutritional needs. Nestle Nutrition Workshop Series Pediatric Program 62:1–9; discussion 9–12. Gold MG, Stengel F, Nygren PJ, Weisbrod CR, Bruce JE, Robinson CV, Barford D, Scott JD (2011). Architecture and dynamics of an A-kinase anchoring protein 79 (AKAP79) signaling complex. Proceedings of the National Academy of Sciences of the United States of America 108(16):6426–6431. Guillaume E, Berger B, Affolter M, Kussmann M (2009). Label-free quantitative proteomics of two Bifidobacterium longum strains. Journal of Proteomics 72(5):771–784. Hamady M, Knight R (2009). Microbial community profiling for human microbiome projects: tools, techniques, and challenges. Genome Research 19(7):1141–1152. REFERENCES 185

Hansson J, Bosco N, Favre L, Raymond F, Oliveira M, Metairon S, Mansourian R, Blum S, Kussmann M, Benyacoub J (2011). Influence of gut microbiota on mouse B2 B cell ontogeny and function. Molecular Immunology 48(9–10):1091–1101. Hansson J, Panchaud A, Favre L, Bosco N, Mansourian R, Benyacoub J, Blum S, Jensen ON, Kussmann M (2011). Time-resolved quantitative proteome analysis of in vivo intestinal development. Molecular and Cell Proteomics 10(3):M110 005231. Harris K, Kassis A, Major G, Chou CJ (2012). Is the gut microbiota a new factor contributing to obesity and its metabolic disorders? Journal of Obesity 2012:879151. Hauner H, Meier M, Jockel¨ KH, Frey UH, Siffert W (2003). Prediction of successful weight reduction under sibutramine therapy through genotyping of the G-protein beta3 subunit gene (GNB3) C825T polymorphism. Pharmacogenetics 13(8):453–459. Henson PM (2005). Dampening inflammation. Nature Immunology 6(12):1179–1181. Hooper LV, Gordon JI (2001). Commensal host-bacterial relationships in the gut. Science 292(5519):1115–1118. Hooper LV, Midtvedt T, Gordon JI (2002). How host-microbial interactions shape the nutrient environment of the mammalian intestine. Annual Review of Nutrition 22:283–307. James SP (2005). Prototypic disorders of gastrointestinal mucosal immune function: Celiac disease and Crohn’s disease. Journal of Allergy and Clinical Immunology 115(1):25–30. Jones BV, Sun F, Marchesi JR (2010). Comparative metagenomic analysis of plasmid encoded functions in the human gut microbiome. BMC Genomics 11:46. Kaput J (2008). Nutrigenomics research for personalized nutrition and medicine. Current Opinion in Biotechnology 19(2):110–120. Kiernan UA, Hernandez L, Niederkofler EE, Tubbs KA, Nelson RW (2008). MS-based pheno- typic characterization of a human blood protein from urinary waste products. Proteomics Clinical Applications 2(7–8):1019–1024. Kiernan UA, Nedelkov D, Tubbs KA, Niederkofler EE, Nelson RW (2004). 
Proteomic char- acterization of novel serum amyloid P component variants from human plasma and urine. Proteomics 4(6):1825–1829. Kiernan UA, Tubbs KA, Nedelkov D, Niederkofler EE, McConnell E, Nelson RW (2003). Comparative urine protein phenotyping using mass spectrometric immunoassay. Journal of Proteomics 2(2):191–197. Klenke S, Kussmann M, Siffert W (2011). The GNB3 C825T polymorphism as a pharmacoge- netic marker in the treatment of hypertension, obesity, and depression. Pharmacogenetics and Genomics 21(9):594–606. Kussmann M, Affolter M (2006). Proteomic methods in nutrition. Current Opinion in Clinical Nutrition and Metabolic Care 9(5):575–583. Kussmann M, Affolter M (2009). Proteomics at the center of nutrigenomics: comprehensive molecular understanding of dietary health effects. Nutrition 25(11–12):1085–1093. Kussmann M, Affolter M, Fay LB (2005). Proteomics in nutrition and health. Combinatorial Chemistry & High Throughput Screening 8(8):679–696. Kussmann M, Affolter M, Nagy K, Holst B, Fay LB (2007). Mass spectrometry in nutrition: understanding dietary health effects at the molecular level. Mass Spectrometry Reviews 26(6):727–750. Kussmann M, Blum S (2007). OMICS-derived targets for inflammatory gut disorders: opportu- nities for the development of nutrition related biomarkers. Endocrine, Metabolic & Immune Disorders - Drug Targets 7(4):271–287. 186 PROTEOMICS IN NUTRITIONAL SYSTEMS BIOLOGY: DEFINING HEALTH

Kussmann M, Krause L, Siffert W (2010). Nutrigenomics: where are we with genetic and epigenetic markers for disposition and susceptibility? Nutrition Reviews 68(Suppl 1):S38– S47. Kussmann M, Panchaud A, Affolter M (2010). Proteomics in nutrition: status quo and outlook for biomarkers and bioactives. Journal of Proteomics 9(10):4876–4887. Kussmann M, Rezzi S, Daniel H (2008). Profiling techniques in nutrition and health research. Current Opinion in Biotechnology 19(2):83–99. Kussmann M, Van Bladeren PJ (2011). The extended nutrigenomics - understanding the interplay between the genomes of food, gut microbes, and human host. Frontiers in Genetics 2:21. Kuzu G, Keskin O, Gursoy A, Nussinov R (2012). Constructing structural networks of signaling pathways on the proteome scale. Current Opinion in Structural Biology 22(3):367–377. Lamond AI, Uhlen M, Horning S, Makarov A, Robinson CV, Serrano L, Hartl FU, Baumeister W, Werenskiold AK, Andersen JS, Vorm O, Linial M, Aebersold R, Mann M (2012). Advancing cell biology through proteomics in space and time (PROSPECTS). Molecular and Cell Proteomics 11(3):O112 017731. Le Huerou-Luron I, Blat S, Boudry G (2010). Breast- v. formula-feeding: impacts on the digestive tract and immediate and long-term health effects. Nutrition Research Reviews 23(1):23–36. Lescuyer P, Chevallet M, Luche S, Rabilloud T (2006). Organelle proteomics. Current Proto- cols in Protein Science Chapter 24:Unit 24.2. Liao Y, Alvarado R, Phinney B, Lonnerdal¨ B (2011). Proteomic characterization of human milk fat globule membrane proteins during a 12 month lactation period. Journal of Proteomics 10(8):3530–3541. Liao Y, Alvarado R, Phinney B, Lonnerdal¨ B (2011). Proteomic characterization of human milk whey proteins during a twelve-month lactation period. Journal of Proteomics 10(4):1746– 1754. Liao Y, Alvarado R, Phinney B, Lonnerdal¨ B (2011). Proteomic characterization of specific minor proteins in the human milk casein fraction. 
Journal of Proteomics 10(12):5409–5415. Loland BF, Baerug AB, Nylander G (2007). [Human milk, immune responses and health effects]. Tidsskrift for Den Norske Laegeforening 127(18):2395–2398. Lundberg E, Uhlen M (2010). Creation of an antibody-based subcellular protein atlas. Pro- teomics 10(22):3984–3996. Macpherson AJ, Harris NL (2004). Interactions between commensal intestinal bacteria and the immune system. Nature Reviews. Immunology 4(6):478–485. Manadas B, Mendes VM, English J, Dunn MJ (2010). Peptide fractionation in proteomics approaches. Expert Review Proteomics 7(5):655–663. Mann M (2006). Functional and quantitative proteomics using SILAC. Nature Reviews. Molec- ular Cell Biology 7(12):952–958. Medzhitov R (2010). Inflammation 2010: new adventures of an old flame. Cell 140(6):771– 776. Michalsen A, Frey UH, Merse S, Siffert W, Dobos GJ (2009). Hunger and mood during extended fasting are dependent on the GNB3 C825T polymorphism. Annals of Nutrition and Metabolism 54(3):184–188. Molle D, Jardin J, Piot M, Pasco M, Leonil´ J, Gagnaire V (2009). Comparison of elec- trospray and matrix-assisted laser desorption ionization on the same hybrid quadrupole REFERENCES 187

time-of-flight tandem mass spectrometer: application to bidimensional liquid chromatogra- phy of proteins from bovine milk fraction. Journal of Chromatography A 1216(12):2424– 2432. Nagaraj N, Kulak NA, Cox J, Neuhauser N, Mayr K, Hoerning O, Vorm O, Mann M (2012). System-wide perturbation analysis with nearly complete coverage of the yeast proteome by single-shot ultra HPLC runs on a bench top Orbitrap. Molecular and Cell Proteomics 11(3):M111 013722. Navarini AA, Krzyzowska M, Lang KS, Horvath E, Hengartner H, Niemialtowski MG, Zinker- nagel RM (2010). Long-lasting immunity by early infection of maternal-antibody-protected infants. European Journal of Immunology 40(1):113–116. Ng SC, Benjamin JL, McCarthy NE, Hedin CR, Koutsoumpas A, Plamondon S, Price CL, Hart AL, Kamm MA, Forbes A, Knight SC, Lindsay JO, Whelan K, Stagg AJ (2011). Relationship between human intestinal dendritic cells, gut microbiota, and disease activity in Crohn’s disease. Inflammatory Bowel Diseases 17(10):2027–2037. Nicholls AW, Mortishire-Smith RJ, Nicholson JK (2003). NMR spectroscopic-based metabo- nomic studies of urinary metabolite variation in acclimatizing germ-free rats. Chemical Research in Toxicology 16(11):1395–1404. Nicholson JK, Holmes E, Lindon JC, Wilson ID (2004). The challenges of modeling mam- malian biocomplexity. Nature Biotechnology 22(10):1268–1274. Nicholson JK, Holmes E, Wilson ID (2005). Gut microorganisms, mammalian metabolism and personalized health care. Nature Reviews. Microbiology 3(5):431–438. Pan S, Aebersold R (2007). Quantitative proteomics by stable isotope labeling and mass spectrometry. Methods in Molecular Biology 367:209–218. Panchaud A, Affolter M, Kussmann M (2011). Mass spectrometry for nutritional peptidomics: how to analyze food bioactives and their health effects. Journal of Proteomics. Pedrioli PG (2010). Trans-proteomic pipeline: a pipeline for proteomic analysis. Methods in Molecular Biology 604:213–238. Phizicky EM, Grayhack EJ (2006). 
Proteome-scale analysis of biochemical activity. Critical Reviews in Biochemistry and Molecular Biology 41(5):315–327. Picariello G, Ferranti P, Mamone G, Klouckova I, Mechref Y, Novotny MV, Addeo F (2012). Gel-free shotgun proteomic analysis of human milk. Journal of Chromatography A 1227:219–233. Polaskova V, Kapur A, Khan A, Molloy MP, Baker MS (2010). High-abundance protein depletion: comparison of methods for human plasma biomarker discovery. Electrophoresis 31(3):471–482. Pruteanu M, Hyland NP, Clarke DJ, Kiely B, Shanahan F (2011). Degradation of the extracellu- lar matrix components by bacterial-derived metalloproteases: implications for inflammatory bowel diseases. Inflammatory Bowel Diseases 17(5):1189–1200. Purcell AW, Gorman JJ (2004). Immunoproteomics: mass spectrometry-based methods to study the targets of the immune response. Molecular and Cell Proteomics 3(3):193–208. Rabot S, Membrez M, Bruneau A, Gerard´ P, Harach T, Moser M, Raymond F, Mansourian R, Chou CJ (2010). Germ-free C57BL/6J mice are resistant to high-fat-diet-induced insulin resistance and have altered cholesterol metabolism. Faseb Journal 24(12):4948–4959. Samuel BS, Shaito A, Motoike T, Rey FE, Backhed F, Manchester JK, Hammer RE, Williams SC, Crowley J, Yanagisawa M, Gordon JI (2008). Effects of the gut microbiota on host adiposity are modulated by the short-chain fatty-acid binding G protein-coupled receptor, 188 PROTEOMICS IN NUTRITIONAL SYSTEMS BIOLOGY: DEFINING HEALTH

Gpr41. Proceedings of the National Academy of Sciences of the United States of America 105(43):16767–16772. Savage DC (1986). Gastrointestinal microflora in mammalian nutrition. Annual Review of Nutrition 6:155–178. Sazawal S, Bhan MK, Bhandari N (1992). Type of milk feeding during acute diarrhoea and the risk of persistent diarrhoea: a case control study. Acta Paediatrica. Supplement 381:93–97. Schrauzer GN (2000). Selenomethionine: a review of its nutritional significance, metabolism and toxicity. Journal of Nutrition 130(7):1653–1656. Schrezenmeir J, Jagla A (2000). Milk and diabetes. Journal of the American College of Nutrition 19(Suppl 2):176S-190S. Schwudke D, Liebisch G, Herzog R, Schmitz G, Shevchenko A (2007). Shotgun lipidomics by tandem mass spectrometry under data-dependent acquisition control. Methods in Enzy- mology 433:175–191. Sidoli S, Cheng L, Jensen ON (2012). Proteomics in chromatin biology and epigenetics: elucidation of post-translational modifications of histone proteins by mass spectrometry. Journal of Proteomics 75(12):3419–3433. Song K, Hanash S (2006). Unraveling the complex proteome for biomarker discovery in gastrointestinal and liver diseases. Gastroenterology 131(5):1375–1378. Tam SW, Pirro J, Hinerfeld D (2004). Depletion and fractionation technologies in plasma proteomic analysis. Expert Review Proteomics 1(4):411–420. Tiscornia MM, Riera MA, Lorenzati MA, Zapata PD (2010). Phosphotyrosine phosphatases in cancer diagnostic and treatment. Recent Patents on DNA & Gene Sequences 4(1):46–51. Tjalsma H, Schaeps RM, Swinkels DW (2008). Immunoproteomics: from biomarker discovery to diagnostic applications. Proteomics Clinical Applications 2(2):167–180. Trelle MB, Jensen ON (2007). Functional proteomics in histone research and epigenetics. Expert Review Proteomics 4(4):491–503. Tremblay L, Laporte MF, Leonil J, Dupont D, Paquin P (2003). Quantitation of proteins in milk and milk products. In: Fox PF, McSweeney PLH, editors. 
Advanced Dairy Chemistry, 3rd ed. New York: Kluwer Academic/Plenum Publishers. p 49–138. Turnbaugh PJ, Gordon JI (2009). The core gut microbiome, energy balance and obesity. Journal De Physiologie 587(Pt 17):4153–4158. Turnbaugh PJ, Ley RE, Mahowald MA, Magrini V, Mardis ER, Gordon JI (2006). An obesity-associated gut microbiome with increased capacity for energy harvest. Nature 444(7122):1027–1031. Uhlen M, Oksvold P, Fagerberg L, Lundberg E, Jonasson K, Forsberg M, Zwahlen M, Kampf C, Wester K, Hober S, Wernerus H, Bjorling¨ L, Ponten F (2010). Towards a knowledge-based human protein atlas. Nature Biotechnology 28(12):1248–1250. Velagapudi VR, Hezaveh R, Reigstad CS, Gopalacharyulu P, Yetukuri L, Islam S, Felin J, Perkins R, Boren´ J, Oresic M, Backhed¨ F (2010). The gut microbiota modulates host energy and lipid metabolism in mice. Journal of Lipid Research 51(5):1101–1112. Verberkmoes NC, Russell AL, Shah M, Godzik A, Rosenquist M, Halfvarson J, Lefsrud MG, Apajalahti J, Tysk C, Hettich RL, Jansson JK (2009). Shotgun metaproteomics of the human distal gut microbiota. The ISME Journal 3(2):179–189. Villegas R, Gao YT, Yang G, Li HL, Elasy T, Zheng W, Shu XO (2008). Duration of breast- feeding and the incidence of type 2 diabetes mellitus in the Shanghai women’s health study. Diabetologia 51(2):258–266. REFERENCES 189

Weingarten P, Lutter P, Wattenberg A, Blueggel M, Bailey S, Klose J, Meyer HE, Huels C (2005). Application of proteomics and protein analysis for biomarker and target finding for immunotherapy. Methods in Molecular Medicine 109:155–174. Xie H, Gilar M, Gebler JC (2009). Characterization of protein impurities and site-specific modifications using peptide mapping with liquid chromatography and data independent acquisition mass spectrometry. Analytical Chemistry 81(14):5699–5708. Xu J, Gordon JI (2003). Honor thy symbionts. Proceedings of the National Academy of Sciences of the United States of America 100(18):10452–10459. Yang PY, Liu K, Ngai MH, Lear MJ, Wenk MR, Yao SQ (2010). Activity-based proteome profiling of potential cellular targets of Orlistat–an FDA-approved drug with anti-tumor activities. Journal of the American Chemical Society 132(2):656–666. Yi EC, Marelli M, Lee H, Purvine SO, Aebersold R, Aitchison JD, Goodlett DR (2002). Approaching complete peroxisome characterization by gas-phase fractionation. Elec- trophoresis 23(18):3205–3216. Zavaleta N, Kvistgaard AS, Graverholt G, Respicio G, Guija H, Valencia N, Lonnerdal¨ B (2011). Efficacy of an MFGM-enriched complementary food in diarrhea, anemia, and micronutrient status in infants. Journal of Pediatric Gastroenterology and Nutrition 53(5):561–568. Zhang Q, Menon R, Deutsch EW, Pitteri SJ, Faca VM, Wang H, Newcomb LF, Depinho RA, Bardeesy N, Dinulescu D, Hung KE, Kucherlapati R, Jacks T, Politi K, Aebersold R, Omenn GS, States DJ, Hanash SM (2008). A mouse plasma peptide atlas as a resource for disease proteomics. Genome Biologiae 9(6):R93. 7 MS-BASED METHODOLOGIES FOR TRANSGENIC FOODS DEVELOPMENT AND CHARACTERIZATION

Alberto Valdés and Virginia García-Cañas

7.1 INTRODUCTION

The rapid progress of genetic engineering (or recombinant DNA technology) has provided new options for the development of novel foods and food ingredients (Petit et al., 2007). This technology allows selected individual gene sequences to be transferred from one organism into another, including between nonrelated species. The organism resulting from genetic engineering is termed a genetically modified organism (GMO). Food products containing or derived from GMOs are commonly referred to as transgenic foods. The fast development of genetic engineering in agriculture has led to the production of important genetically modified (GM) crops such as soybean, maize, wheat, rice, cotton, potato, canola, and tobacco that provide advantages for agronomic productivity and industrial processing over their nonmodified counterparts. Among the genetically improved traits, the most frequent in currently commercialized GMOs are tolerance to herbicides (Deblock et al., 1987) and resistance to insects and disease (Hails, 2000). In recent years, over 150 GMOs, representing 24 different crops, have been approved by regulatory agencies in different countries. In the near future, this number is expected to rise, and a second generation of GMOs with nutritionally enhanced traits, such as plants enriched in β-carotene (Ye et al., 2000), vitamin E (Cahoon et al., 2003), or omega-3 fatty acids (Kinney, 2006), is likely to obtain commercialization approval (Schubert, 2008).

Foodomics: Advanced Mass Spectrometry in Modern Food Science and Nutrition, First Edition. Edited by Alejandro Cifuentes. © 2013 John Wiley & Sons, Inc. Published 2013 by John Wiley & Sons, Inc.

7.2 CONTROVERSIAL SAFETY ASPECTS AND LEGISLATION ON GMOs

The presence of unintended changes that derive from the genetic transformation represents one of the main controversial issues associated with GMO safety. Such unintended effects go beyond the primary expected effects of the genetic modification, and represent statistically significant differences in a phenotype compared with an appropriate phenotype control (Cellini et al., 2004). Unintended changes may originate from unexpected mutations (rearrangements, deletions, insertions, etc.) induced by the genetic transformation or during tissue-culture stages of GMO development (Fitch et al., 1992; Windels et al., 2001; Hernandez et al., 2003; Latham et al., 2006; Rosati et al., 2008). In other cases, the unintended effects can be associated with secondary effects of gene expression that can be explained by considering the function of the transgene, the site of its integration in the genome, or our current knowledge of plant metabolism (Kuiper and Kleter, 2003; Ali et al., 2008). Such effects could also be observed if the changes result in a distinct phenotype, including compositional alterations. Regardless of their origin, unintended effects are difficult to explain or predict without a thorough characterization of the plant at the molecular level. Unintended effects have been suggested to represent a significant source of unpredictability that might have an impact on human health and/or the environment (Ioset et al., 2007).

In the European Union and other countries, strict regulations concerning different aspects of GMOs, including risk assessment, marketing, labeling, and traceability, have been established. The most common approach is based on the assumption that commercialized traditional crop-plant varieties have been consumed for years and have gained a history of safe use. Consequently, they can be used as comparators for the safety assessment of new GM crop varieties derived from established plant lines. Although this concept, commonly known as “comparative safety assessment” or “substantial equivalence,” has been adopted for the current safety assessment of GMOs in several countries, the approval procedure for GMOs differs across national jurisdictions. This heterogeneity and lack of international harmonization among regulations have led to an “asynchronous approval” of GMOs around the world. The requirements for GMO labeling and traceability also differ between the legal frameworks of different countries. For instance, labeling of foodstuffs may be voluntary or mandatory, and the specific thresholds set for labeling vary between countries. In the particular case of the European Union, Regulation 1829/2003 establishes that any food containing more than 0.9% GM content has to be labeled as such; below this threshold, labeling is not required provided that the presence of the GM material is adventitious or technically unavoidable. For nonauthorized GM ingredients, the threshold is set at 0.5%, provided that the GMO has passed the first stages of approval.
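The threshold logic quoted above can be made concrete with a short sketch. It is a simplified illustration under stated assumptions, not legal guidance: the record type `Ingredient`, its field names, the function `assess`, and the returned strings are all hypothetical, and the reading that nonauthorized material above 0.5% is not tolerated is an interpretation of the text rather than a quotation of the regulation.

```python
from dataclasses import dataclass

# Thresholds quoted in the text for EU Regulation 1829/2003 (percent GM content).
AUTHORIZED_THRESHOLD_PCT = 0.9
NONAUTHORIZED_THRESHOLD_PCT = 0.5  # applies only once first approval stages are passed


@dataclass
class Ingredient:
    """Hypothetical record for one ingredient; field names are illustrative."""
    name: str
    gm_content_pct: float                # measured GM content of the ingredient, in %
    authorized: bool                     # GMO is authorized in the EU
    adventitious: bool                   # presence is adventitious or technically unavoidable
    passed_first_approval: bool = False  # only relevant for nonauthorized GMOs


def assess(ingredient: Ingredient) -> str:
    """Classify an ingredient using the simplified threshold rules from the text."""
    if ingredient.gm_content_pct <= 0:
        return "no GM content"
    if ingredient.authorized:
        exempt = (ingredient.adventitious
                  and ingredient.gm_content_pct <= AUTHORIZED_THRESHOLD_PCT)
        return "no label required" if exempt else "label required"
    # Nonauthorized GMO: assumed tolerated only below the lower threshold.
    tolerated = (ingredient.passed_first_approval
                 and ingredient.adventitious
                 and ingredient.gm_content_pct <= NONAUTHORIZED_THRESHOLD_PCT)
    return "tolerated" if tolerated else "not tolerated"


if __name__ == "__main__":
    sample = Ingredient("soy flour", 1.2, authorized=True, adventitious=True)
    print(sample.name, "->", assess(sample))
```

The sketch makes explicit that the decision depends on two inputs that must both be established analytically or by documentation: the measured GM content and whether its presence is adventitious. The first of these is the quantity that the methods discussed in the following sections are meant to determine.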

7.3 ANALYSIS OF GMOs: TARGETED PROCEDURES AND PROFILING METHODOLOGIES

In order to verify compliance with the requirements imposed by the legislation regarding GMOs, appropriate methodologies able to cope with the analysis of these novel foods are required. In this regard, there is a need for analytical tools that: (1) provide useful information about the primary effects of the genetic modification during GMO development; (2) enable the specific identification and accurate determination of GMO content in foods for labeling compliance; and (3) facilitate comprehensive compositional studies of GMOs in order to effectively investigate potential adverse effects on human health, including the existence (or not) of unintended effects.

In order to face these analytical challenges, two conceptually and methodologically different strategies have been proposed for the analysis of GMOs, that is, targeted analysis and profiling. In the last decade, targeted analysis has been the prevailing strategy for the quantitative and qualitative detection of GMOs in food samples, as well as for the comparative safety assessment of a GM crop with its nonmodified counterpart. In the context of substantial equivalence, a group of 50–150 compounds is typically analyzed for each crop variety, following recommendations in the OECD consensus documents that include macro- and micronutrients, antinutrients, and natural toxins (Cellini et al., 2004; Shepherd et al., 2006). Although it has been indicated that this strategy may cover more than 95% of the crop composition (Chassy, 2010), its application to compare the composition of GMOs with their conventional counterparts has raised numerous concerns. More precisely, it has been pointed out that this strategy is biased (Millstone et al., 1999) and presents many limitations, such as the possible occurrence of unknown toxicants and antinutrients that a predefined list of analytes would miss, particularly in food-plant species with no history of (safe) use (Kuiper et al., 2001). 
Moreover, although a few studies have identified unintended effects with targeted approaches (Hashimoto et al., 1999; Shewmaker et al., 1999; Ye et al., 2000), this strategy might restrict the possibilities to detect other unpredictable effects that could result directly or indirectly from the genetic modification.

The limitations of targeted analysis have encouraged the development and application of new and more powerful analytical approaches to face the complexity of this problem and to improve the chances of detecting unintended effects. In this regard, the European Food Safety Authority (EFSA) has recommended the development and use of profiling technologies such as omics technologies, which have the potential to improve the breadth of comparative analyses (EFSA, 2006). More recently, a panel of experts on risk assessment and management has recommended profiling especially in cases where the most scientifically valid isogenic and conventional comparator would not grow, or not grow as well, under the relevant stress condition (AHTEG, 2010). However, certain questions have been raised about the value of molecular profiling for GMO risk assessment (Chassy, 2010). Some arguments against profiling rely on the lack of validated procedures and the difficulty of interpreting the differences observed between a certain GMO and its comparator. Nevertheless, a number of reports demonstrating the suitability and applicability of different profiling approaches for the comparative analysis of GMOs suggest good acceptance of these fast-evolving techniques by the scientific community (García-Cañas et al., 2011; Heinemann et al., 2011).
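The targeted comparison of a GM line with its comparator, described above as a search for statistically significant differences in a predefined list of compounds, can be illustrated with a minimal sketch. The statistical test used here (an exact two-sided permutation test on the difference of group means) is an assumption chosen for illustration because it needs only the Python standard library; it is not the procedure prescribed by the OECD documents or used in the cited studies. The function names, the 0.05 significance level, and all concentration values are hypothetical.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(gm, control):
    """Exact two-sided permutation test on the difference of group means."""
    pooled = list(gm) + list(control)
    indices = range(len(pooled))
    observed = abs(mean(gm) - mean(control))
    extreme = total = 0
    # Enumerate every way of relabeling len(gm) of the pooled values as "GM".
    for chosen in combinations(indices, len(gm)):
        chosen_set = set(chosen)
        group_a = [pooled[i] for i in chosen]
        group_b = [pooled[i] for i in indices if i not in chosen_set]
        total += 1
        # Small tolerance so the observed split itself is always counted.
        if abs(mean(group_a) - mean(group_b)) >= observed - 1e-12:
            extreme += 1
    return extreme / total


def compare_composition(gm_data, control_data, alpha=0.05):
    """Return {analyte: (relative difference in %, p-value, flagged)}."""
    results = {}
    for analyte, gm in gm_data.items():
        control = control_data[analyte]
        rel_diff = 100.0 * (mean(gm) - mean(control)) / mean(control)
        p = permutation_p_value(gm, control)
        results[analyte] = (rel_diff, p, p < alpha)
    return results


if __name__ == "__main__":
    # Made-up replicate measurements (arbitrary units), for illustration only.
    gm_line = {"protein": [38.1, 37.9, 38.4, 38.0],
               "phytic_acid": [1.9, 2.0, 2.1, 2.0]}
    comparator = {"protein": [38.2, 38.0, 37.8, 38.3],
                  "phytic_acid": [1.5, 1.4, 1.6, 1.5]}
    for name, (diff, p, flagged) in compare_composition(gm_line, comparator).items():
        print(f"{name}: {diff:+.1f}% p={p:.3f} flagged={flagged}")
```

Two caveats follow from the text itself. First, with 50–150 analytes per crop, a real assessment would need a correction for multiple testing, which this sketch omits. Second, the comparison is limited to the analytes supplied as input, which is the limitation of targeted analysis that motivates the profiling strategies discussed here.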

7.3.1 Targeted Analysis

To date, targeted analysis has been the strategy of choice for the detection, identification, and quantification of GMOs and GM-derived materials in food samples (García-Cañas et al., 2004; Deisingh and Badrie, 2005; Marmiroli et al., 2008; Michelini et al., 2008; Alderborn et al., 2010). In this field, DNA detection methods have achieved a prominent role. Compared with proteins, DNA has higher thermal stability and is present in most biological tissues; in addition, the genetic modification primarily affects the DNA sequence. Together, these features make DNA a more suitable target for GMO detection. Most DNA-based detection methods for GMOs rely on the polymerase chain reaction (PCR) to detect, identify, and quantify GMOs in food. PCR in its different formats has been established as the prevailing technique for GMO detection and traceability due to its specificity, its sensitivity, and the fact that it allows a rapid and relatively low-cost analysis. However, many factors can affect the sensitivity and specificity of PCR-based methods, such as the quality of the DNA, sample processing, equipment, and chemicals. The ability of PCR to amplify specific DNA sequences in a complex DNA extract depends, to a great extent, on the integrity, quantity, and purity of the DNA extract. These limiting factors define the amplifiability of target DNA sequences by PCR-based methods, and are considered critical issues for GMO analysis in highly processed and complex food samples (Gryson, 2010). DNA methods are outside the scope of this chapter, and excellent reviews on this topic can be found elsewhere (Elenis et al., 2008; Marmiroli et al., 2008; Michelini et al., 2008; Morisset et al., 2008; Shrestha et al., 2010).

The primary or intended effect of the genetic modification is usually studied by the analysis of target compounds. 
For instance, the interest might be focused on the characterization of the genetic modification, such as the insertion and the expression of the new transgene; in that case, the analysis might be directed toward the detection of specific proteins in addition to DNA and mRNA sequences. Likewise, to study the intended effect of the genetic modification at the metabolite level, targeted analysis might focus on the detection of a limited selection of metabolites that are involved in certain metabolic pathways in the GMO. Despite the extensive application of DNA detection methods, the analysis of other target molecules, including proteins and metabolites, has also been important for the characterization of the genetic modification. The next sections focus on the latest developments and advances in MS-based techniques applied to the analysis of target proteins and metabolites. A summary of some representative applications of MS-based methodologies in targeted analysis of GMOs is given in Table 7.1.

7.3.1.1 MS-Based Analysis of Target Proteins

Conventionally, the most established methods for the analysis of transgenic proteins have been based on immunochemical detection (Grothaus et al., 2006). Compared to immunological methods, there are few works reporting on the use of MS-based methodologies for the targeted analysis of transgenic proteins in samples. The main limitations of the application of MS to the analysis of target proteins are linked to the low expression level of the recombinant protein as well as its heterogeneous distribution in the different plant tissues. Furthermore, owing to the wide dynamic concentration range of proteins in biological tissues, sample fractionation based on the different physicochemical properties of proteins, followed by concentration of the selected target protein, is commonly required in order to detect low-expression proteins that are below the sensitivity level of the most advanced instruments. Based on this approach, Fernández Ocaña et al. (2007) used gel-filtration chromatography (GFC) followed by sequential SDS-PAGE and anion-exchange fractionation steps to purify the transgenic protein CP4 EPSPS from glyphosate-tolerant soybeans. The MS analytical approach, based on the tryptic digestion of the purified transgenic protein of the GM soybeans and subsequent analysis with either matrix-assisted laser desorption/ionization with time-of-flight MS (MALDI-TOF-MS) or nano-liquid chromatography with electrospray ionization and quadrupole TOF-MS (nLC-ESI-QTOF-MS), allowed the detection of 0.9% GM soya seeds. A further work by the same group demonstrated the potential of isobaric tags for relative and absolute quantification (iTRAQ) for the quantification of CP4 EPSPS protein levels in soya samples (Fernández Ocaña et al., 2009). The combination of iTRAQ labeling with a fractionation step using SCX chromatography prior to nLC-ESI-QTOF-MS enabled the quantitative detection of 0.5% GM soya in seed mixtures.

TABLE 7.1 MS-Based Analysis of Target Proteins and Metabolites in GMOs

Targets | Modification | Donor Organism | GM Crop | Phenotype | Tissue | Technique | Reference
Proteins | EPSPS enzyme | Agrobacterium tumefaciens | Soybean | Herbicide tolerance | Seed | LC-ESI-QTOF-MS; MALDI-TOF-MS | Fernández Ocaña et al., 2007, 2009
Proteins | α-amylase inhibitor | Pea | Pea | Insect resistance | Seed | MALDI-TOF-MS | Marsh et al., 2011
Proteins | ALS, GAT4621, PAT | Maize, Bacillus licheniformis, Streptomyces hygroscopicus | Maize | Herbicide tolerance | Leaf | LC-ESI-IT-MS/MS | Hu and Owens, 2011
Metabolites | Virus Y resistance | Potato virus Y | Potato | Virus resistance | Tuber | CE-ESI-IT-MS/MS | Bianco et al., 2003
Metabolites | Chalcone isomerase | Petunia hybrid | Tomato | Flavonoid production | Fruit | LC-ESI-Q-MS; LC-ESI-Q-MS/MS | Le Gall et al., 2003b
Metabolites | LC and C1 genes | Maize | Tomato | Flavonoid production | Fruit | LC-ESI-Q-MS; LC-ESI-Q-MS/MS | Le Gall et al., 2003b
Metabolites | Pinoresinol lariciresinol reductase | Forsythia intermedia | Wheat | Lignan production | Seed | LC-ESI-IT-MS | Ayella et al., 2007
Metabolites | RCH10, RAC22, β-1,3 Glu, B-RIP | Rice, alfalfa, barley | Rice | Antifungal activity | Seed | GC-EI-Q-MS | Jiao et al., 2010
Metabolites | RC24, β-1,3-Glu | Rice, alfalfa | Rice | Antifungal activity | Seed | GC-EI-Q-MS | Jiao et al., 2010
Metabolites | Bt toxin, sck | Bacillus thuringiensis | Rice | Insect resistance | Seed | GC-EI-Q-MS | Jiao et al., 2010
The suitability of LC-ESI-QTRAP-MS for simultaneous identification and quantification of three transgenic proteins in maize leaves has been recently reported (Hu and Owens, 2011). The analysis of tryptic-digested extracts under linear ion-trap mode provided a linear dynamic quantitative range of two orders of magnitude (correlation coefficient >0.997) with good accuracy (deviation from nominal concentration <15%) for the recombinant proteins.

Glycosylation patterns have attracted attention due to their potential role in allergenicity of transgenic proteins. In this regard, Marsh et al. (2011) have developed a novel methodology based on protein fractionation, MALDI-TOF-MS, and multivariate analysis to monitor glycosylation patterns of transgenic proteins. The method was tested on α-amylase inhibitor from bean, expressed in pea (Fig. 7.1). The analytical procedure demonstrated that the differences in N-glycan patterns between the α-amylase inhibitor from common bean and pea were smaller than those observed between the inhibitors from common bean and other related species.
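
The acceptance figures quoted above for the maize assay (correlation coefficient >0.997, deviation from nominal concentration <15%) can be checked with a few lines of code. The following is a minimal sketch, assuming an unweighted least-squares calibration line; the function names and the standard concentrations are hypothetical and are not taken from Hu and Owens (2011), whose regression model is not described here.

```python
from math import sqrt
from statistics import mean


def linear_fit(x, y):
    """Unweighted least-squares line y = slope * x + intercept, plus Pearson r."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    r = sxy / sqrt(sxx * syy)
    return slope, intercept, r


def percent_deviation(x, y, slope, intercept):
    """Back-calculate each standard and return its deviation from nominal (%)."""
    return [100.0 * ((yi - intercept) / slope - xi) / xi for xi, yi in zip(x, y)]


def passes(x, y, min_r=0.997, max_dev=15.0):
    """True if the curve meets both criteria quoted in the text (assumed thresholds)."""
    slope, intercept, r = linear_fit(x, y)
    deviations = percent_deviation(x, y, slope, intercept)
    return r > min_r and all(abs(d) < max_dev for d in deviations)


# Hypothetical standards spanning two orders of magnitude (arbitrary units).
nominal = [1.0, 2.0, 5.0, 10.0, 20.0, 50.0, 100.0]
response = [3.0 * c + 0.5 for c in nominal]  # idealized, perfectly linear detector
print(passes(nominal, response))
```

With real data spanning a wide concentration range, a weighted regression is often preferred, because an unweighted fit tends to be dominated by the highest standards.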

FIGURE 7.1 MALDI-TOF-MS spectra of permethylated N-glycans isolated from the α-amylase inhibitor extracted from (a) transgenic pea and (c) native bean; (insets) α-amylase inhibitor extracted from single seeds of (b) transgenic pea and (d) native bean. Reprinted with permission from Marsh et al. (2011). Copyright 2011 American Chemical Society.

7.3.1.2 MS-Based Analysis of Target Metabolites

The analysis of target metabolites can be useful to study the specific effect produced in an organism by the genetic modification. In particular, this goal is feasible when the genetic modification is directed to increase or decrease the activity of key enzymes within a metabolic pathway affecting the levels of a specific metabolite or a group of metabolites. MS analysis of target metabolites has been especially useful in the development and characterization of GM crops with interesting traits for human health. This is the case for a genetically transformed wheat cultivar overexpressing the pinoresinol lariciresinol reductase, an enzyme involved in lignan biosynthesis. The application of LC-ESI-MS to the determination of lignan content in transgenic wheat samples demonstrated that the technique was essential to corroborate and evaluate the success of the functional transformation and, therefore, of the primary effect of the genetic modification (Ayella et al., 2007).

Another example of a GM crop developed to confer beneficial biological activity to the consumer is the GM tomato with increased levels of flavonoid glycosides. The improved levels of these antioxidants in tomato may be helpful to prevent cancer and other pathologies. In this case, the modification was directed at the overexpression of two maize regulatory genes of flavonoid biosynthesis (Le Gall et al., 2003a and 2003b). Several analytical techniques, including LC with diode array detection, nuclear magnetic resonance (NMR), MS, and tandem MS (MS/MS), were used to investigate the flavonoid composition of tomatoes at different stages of maturation. The chromatographic analyses of tomato samples indicated the presence of seven flavonoids at much higher concentration (up to 60-fold difference) in GM tomatoes than in the nonmodified controls. Also, the analyses performed using LC-MS and LC-MS/MS confirmed the identity of the aglycon moiety of two minor, but important, dihydrokaempferol hexosides. This identification was achieved by comparing the main fragmentations of the MH+ ions of the unknown compounds with those of the MH+ ions obtained from standards of flavonoid glycosides (Le Gall et al., 2003a and 2003b).

Capillary electrophoresis (CE) coupled to an MS detector has also provided good results in targeted analysis of GMOs.
CE-MS coupling has been applied to the analysis of glycoalkaloids in tubers of GM virus Y-resistant potato plants (Bianco et al., 2003). The separation method, based on nonaqueous CE, exhibited very good MS compatibility due to the use of organic solvents in the background electrolyte. The glycoalkaloid content was determined from methanolic extracts of three lines of potatoes with different resistance to infection by potato virus Y and of a conventional cultivar, Désirée. Using CE-MS/MS, the two glycoalkaloids, chaconine and α-solanine, were identified. In addition, it was found that potato tubers from the resistant line showed slightly higher content of α-solanine in the peel and in the flesh when compared to control potato tubers.

7.3.2 Profiling Methodologies

The study of biological systems such as GMOs entails high complexity, which restricts the applicability of targeted analysis. This underlines the need for new and more powerful analytical approaches to study such complexity for comparative safety assessment, and to increase the opportunities to detect unintended effects. As an alternative strategy to targeted analysis, the development and use of profiling technologies have the potential to improve the coverage in comparative analyses of GMOs (EFSA, 2006; AHTEG, 2010). In this context, Foodomics, defined as a new discipline that studies the food and nutrition domains through the application of advanced omics technologies in order to improve consumers' well-being and confidence (Cifuentes, 2009; Herrero et al., 2010; Herrero et al., 2012), can play an important role in the investigation of GMOs. Thus, Foodomics can provide significant information, which could be valuable for GMO traceability and characterization, as well as for the detection of unintended effects during any stage of GMO development (García-Cañas et al., 2011).

Regarding transcriptomics, for years the expression of individual genes has been determined by quantification of mRNA with Northern blotting. This classical technique has gradually been replaced by more sensitive techniques such as real-time PCR. Both techniques, however, can only analyze the expression of a limited number of genes per analysis. This can be very useful to monitor the up- or down-regulation of a given gene for a specific problem, but the applicability of these techniques is limited when the potentially up- or down-regulated genes are unknown. On the other hand, global gene expression profiling may offer better opportunities for the comprehensive study of the transcriptome in GMOs. For instance, the gene expression microarray has been shown to be a valuable profiling method to assess possible unintended effects of genetic transformation in plants. With this technology, detailed information has recently been obtained on nontargeted effects of transgenes in several plant crops including potato, rice, wheat, and maize (Baudo et al., 2006; Coll et al., 2008; van Dijk et al., 2010). In most cases, the genetic modification did not considerably alter overall gene expression, which fell within the range of natural variation of the plant varieties; this supports the possibility of producing transgenic plants that are substantially equivalent to nontransformed plants at the transcriptomic level.

Although the microarray is currently the technique of choice for profiling RNA populations under different conditions, the new features of next-generation sequencers have stimulated the development of new techniques that have expanded their applications, for example, to comprehensively map and quantify transcriptomes, for which Sanger sequencing would not have been economically or logistically practical before (Hutchison, 2007; Marguerat et al., 2008).
These novel techniques for transcriptomics, termed RNA-Seq methods, are still under active development and evaluation in multiple laboratories for RNA profiling. They may represent a good alternative for the future comprehensive study of GMOs at the transcriptome level. In addition, the development of novel tools for network and pathway analysis, in combination with other profiling techniques, will improve our understanding of the differences found in comparative assays between GM and conventional varieties.

Regarding the proteome and the metabolome, there is no single technique currently available that can detect, in a single experimental analysis, either all proteins or all metabolites found in GMOs or any other organism (Saito and Matsuda, 2010). Consequently, multiple analytical techniques have to be combined to improve the analytical coverage of these molecules. MS-based techniques are crucial for proteomics and metabolomics studies. A discussion of the latest advances in MS-based protein and metabolite profiling is provided in the next sections.

7.3.2.1 MS-Based Protein Profiling

Bottom-Up Approach

Comparative proteomic analysis has been widely applied for the study of differentially expressed proteins in GMOs (Table 7.2). Two-dimensional gel electrophoresis (2-DGE), followed by image analysis and MS (typically MALDI-TOF-MS) or different variants of LC-MS, constitutes the so-called bottom-up approach.

TABLE 7.2 MS-Based Protein Profiling in GMOs

Modification | Donor Organism | GM Crop | Phenotype | Tissue | Technique | Reference
(Antisense G1-1 gene) | Potato | Potato | Sprouting delay | Tuber | MALDI-TOF-MS | Careri et al., 2003
TSWV nucleoprotein | Tomato spotted wilt virus | Tomato | Virus resistance | Seed | MALDI-TOF-MS | Corpillo et al., 2004
Rab1 protein | Tobacco | Wheat | Improved functional properties | Seed | MALDI-TOF-MS; LC-ESI-QTOF-MS/MS | Di Luccia et al., 2005
Glucan branching enzyme | Aureobasidium pullulans | Potato | Waxy phenotype | Tuber | LC-ESI-IT-MS/MS | Lehesranta et al., 2005
Glycoprotein | Potato | Potato | Changes in cell wall structure | Tuber | LC-ESI-IT-MS/MS | Lehesranta et al., 2005
AdoMetDC | Potato | Potato | Modified metabolism | Tuber | LC-ESI-IT-MS/MS | Lehesranta et al., 2005
Bar protein | Wheat | Wheat | Herbicide tolerance | Seed | MALDI-TOF-MS | Horvath-Szanics et al., 2006
Aldehyde dehydrogenase | Escherichia coli | Grapevine | Abiotic stress | Leaf | MALDI-TOF-MS; LC-QTOF-MS | Sauvage et al., 2007
CDPK13, CRTintP1 | Rice | Rice | Cold tolerance | Leaf | ESI-QTOF-MS | Komatsu et al., 2007
Bt toxin | B. thuringiensis | Maize | Insect resistance | Seed | MALDI-TOF-MS | Albo et al., 2007
Bt toxin | B. thuringiensis | Maize | Insect resistance | Seed | LC-ESI-IT-MS/MS | Zolla et al., 2008
LMW-GS | Wheat | Wheat | Improved functional properties | Seed | LC-ESI-QTOF-MS | Scossa et al., 2008
Bt toxin | B. thuringiensis | Maize | Insect resistance | Grain | CE-ESI-TOF-MS; CE-ESI-IT-MS | Erny et al., 2008
ScFv (G4) | - | Tomato | Virus resistance | Leaf | MALDI-TOF-MS; LC-ESI-IT-MS/MS | Di Carli et al., 2009
hGM-CSF | Human | Rice | Recombinant protein production | Seed | LC-ESI-QTOF-MS | Luo et al., 2009
α-amylase inhibitor | Pea | Pea | Insect resistance | Seed | LC-ESI-QTOF-MS | Islam et al., 2009
α-amylase inhibitor | Pea | Pea | Insect resistance | Seed | MALDI-TOF-TOF-MS | Chen et al., 2009
Bt toxin | B. thuringiensis | Maize | Insect resistance | Grain | LC-ESI-IT-MS | García-López et al., 2009
EPSPS enzyme | B. thuringiensis | Soybean | Herbicide tolerance | Seed | MALDI-QTOF-MS | Brandao et al., 2010
EPSPS enzyme | B. thuringiensis | Soybean | Herbicide tolerance | Seed | CE-ESI-TOF-MS | Simó et al., 2010
Bt toxin | B. thuringiensis | Maize | Insect resistance | Grain | LC-ESI-IT-MS | Coll et al., 2011
(Antisense trx s gene) | Phalaris coerulescens | Wheat | Resistance to preharvest sprouting | Seed | MALDI-TOF-MS | Guo et al., 2011
Bt toxin | B. thuringiensis | Maize | Insect resistance | Leaf | MALDI-TOF-TOF-MS | Balsamo et al., 2011

2-DGE provides the highest protein-resolution capacity at a low instrumentation cost. This technique has been applied to compare the protein profiles of GMOs with those of their corresponding unmodified lines, including wheat with improved functional properties (Di Luccia et al., 2005; Scossa et al., 2008), tomatoes with a genetically added resistance to virus and insect attacks (Corpillo et al., 2004; Di Carli et al., 2009), potatoes showing modified cell wall structure and a delayed sprouting process (Careri et al., 2003; Lehesranta et al., 2005), and insect-resistant GM maize (Albo et al., 2007; Zolla et al., 2008; Coll et al., 2011; Balsamo et al., 2011). The comparison of protein profiles of GMOs with those obtained from the unmodified lines often does not reveal more significant differences than those observed between different nonmodified cultivars/genotypes. A representative example of this observation was reported by Lehesranta et al. (2005) in a study of a large selection of potato varieties and landraces. In their work, 2-DGE analysis showed significant quantitative and qualitative differences in most of the detected proteins between varieties. However, when different GM potato lines were compared with their controls, statistical analysis showed no clear differences in the protein patterns. In some cases, the differential expression of proteins in the GMO is considered predictable and can be explained as a result of the genetic modification. Such findings have highlighted the importance of knowing the extent of natural variation in the proteome of plants grown under a range of different environments, in order to avoid any mis- or over-interpretation of the results. Following this idea, a series of studies have been carried out on model plants as well as on common crops such as rice, soybean, and potato (Ruebelt et al., 2006a, 2006b, and 2006c; Natarajan, 2010; Teshima et al., 2010).
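
The comparison logic described above, in which a difference in a GM line is judged against the spread seen among conventional cultivars, can be sketched as follows. This is an illustrative simplification, assuming one abundance value per protein spot and a plain min-max range; the spot names, values, and function names are hypothetical, and the cited studies used statistical analysis rather than this simple rule.

```python
def natural_range(values):
    """Return (min, max) of a feature across conventional varieties."""
    return min(values), max(values)


def outside_natural_variation(gm_profile, conventional_profiles):
    """List features whose GM value lies outside the conventional min-max range.

    gm_profile: {feature: abundance in the GM line}
    conventional_profiles: {feature: [abundance in each conventional variety]}
    """
    flagged = []
    for feature, gm_value in gm_profile.items():
        low, high = natural_range(conventional_profiles[feature])
        if gm_value < low or gm_value > high:
            flagged.append(feature)
    return sorted(flagged)


# Hypothetical normalized spot volumes (arbitrary units).
conventional = {
    "spot_01": [0.8, 1.1, 1.4],
    "spot_02": [2.0, 2.6, 3.1],
    "spot_03": [0.2, 0.3, 0.5],
}
gm_line = {"spot_01": 1.2, "spot_02": 3.9, "spot_03": 0.4}

print(outside_natural_variation(gm_line, conventional))
```

Only features flagged in this way would then need closer inspection; a difference from the single parental control that still lies inside the range of conventional varieties is not flagged.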
In addition to the technical limitations of 2-DGE in separating highly hydrophobic proteins and proteins with extreme isoelectric points or high molecular weight (MW), gel-to-gel variation is one of the major sources of error, hampering an exact match of spots in the image-analysis process. To circumvent gel-to-gel irreproducibility in comparative proteomics, differential in-gel electrophoresis (DIGE) loads different samples, labeled with ultrahigh-sensitivity fluorescent dyes (typically Cy5 and Cy3), in the same gel (Timms and Cramer, 2008). DIGE has been applied to compare the proteomes of wild-type cultivars with two GM pea lines expressing α-amylase inhibitor from common bean (Islam et al., 2009). Tryptic digests of proteins from individual excised spots were analyzed with LC-ESI-QTOF-MS. Approximately 600 proteins with isoelectric points between 3 and 10 and MWs ranging from 15 to 100 kDa were resolved in the gels. In that study, the gel images for the analysis of one of the GM peas displayed 66 spots showing significant changes. In addition to changes in the abundance of these proteins, complementary analyses suggested post-transcriptional and post-translational modifications of endogenous proteins. Recently, Brandao et al. (2010) have also emphasized the importance of optimizing the parameters that influence the comparison of protein maps after different gel runs, including those parameters involved in image acquisition. Using a strictly controlled routine for image analysis of 2-D gels, a maximum spot match of 79% was achieved when the GM soybean proteome was compared to that of the corresponding nonmodified soybean line.

The combination of 2-DGE and MS is still the dominant analytical platform to investigate how the genetic modification alters protein abundance, structure, or function, as well as to study the mechanisms involved in the response of GMOs subjected to a variety of abiotic (chemicals, drought, salinity, etc.) and biotic (pathogens, parasites, etc.) stresses. Horvath-Szanics et al. (2006) studied the effect of drought stress on the proteomic expression of an herbicide-resistant transgenic wheat using 2-DGE followed by MALDI-TOF-MS. In another study, 2-DGE and ESI-QTOF-MS were used to study the effect of the overexpression of calcium-dependent protein kinase 13 and calreticulin-interacting protein 1, involved in cold-stress response, in GM rice (Komatsu et al., 2007).

As an alternative to 2-DGE, gel-free protein (or peptide) separation techniques, including LC and CE, benefit from direct coupling to a mass spectrometer. These techniques allow full automation, provide potential high-throughput capabilities, require less starting material, and achieve better reproducibility in terms of qualitative and quantitative analysis. For instance, protein profiles of transgenic MON810 maize lines have been compared to those obtained from their corresponding unmodified cultivar using LC-ESI-IT-MS in order to investigate possible differences (García-López et al., 2009). The analyses revealed spectral signals that were similar between the GM lines and the nonmodified ones. CE-ESI-MS has also been applied to the analysis of the intact zein-protein fraction from three different GM maize cultivars and their corresponding isogenic lines (Erny et al., 2008). Results showed similar sensitivity and repeatability for two different mass analyzers (TOF and IT); however, CE-ESI-TOF-MS provided better results with regard to the number of identified proteins. A comparison of the protein profiles obtained by CE-ESI-TOF-MS did not show significant differences between the GM lines and their nonmodified counterparts.

Shotgun Approach

Shotgun-proteomics profiling has also been applied to study unintended effects in GMOs. In shotgun proteomics, protein digestion is performed without any prefractionation or separation of the proteome. The resulting peptides are separated with LC followed by MS analysis to provide a rapid and automatic identification of proteins in the sample. Based on this approach, a profiling CE-ESI-TOF-MS method has been recently developed for the investigation of unintended effects in GM soybeans (Simó et al., 2010). During the method development stage, several parameters affecting the separation and detection of peptides were optimized. Using this method, a total of 151 peptides were automatically detected for each soybean line. The comparative analysis showed no differences between the peptide profiles obtained from GM soybean and its conventional counterpart (Fig. 7.2).

FIGURE 7.2 CE-TOF-MS analyses of the digested protein extract from conventional and transgenic soybean. Axes: intensity (×10^5) versus time (0 to 30 min). Reprinted with permission from Simó et al. (2010). Copyright 2010 John Wiley & Sons.

Shotgun proteomics has been combined with iTRAQ to quantify differences in protein profiles between GM and unmodified rice (Luo et al., 2009). To this aim, four independent isobaric reagents, designed to react with all primary amines of protein hydrolyzates, were used to treat four different digested samples that were subsequently pooled for MS/MS analysis. The analyses using this analytical technique revealed significant differences between GM and wild-type rice in 103 proteins out of the 1,883 proteins identified in rice endosperm samples. At present, the iTRAQ technique allows the use of up to eight different isobaric reagents for the simultaneous analysis of different samples in the same MS analysis, a feature that enhances the potential of this analytical strategy for large-scale screenings.
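
Because the isobaric tags release sample-specific reporter ions on fragmentation, relative quantification reduces to comparing reporter intensities within each MS/MS spectrum. The sketch below assumes a 4-plex experiment with reporter ions at m/z 114 to 117, summarizes each protein by the median log2 ratio of its peptides, and uses hypothetical intensities and channel assignments; it is not the data-processing workflow of Luo et al. (2009).

```python
from math import log2
from statistics import median


def protein_log2_ratio(peptide_reporters, numerator, denominator):
    """Median log2(numerator/denominator) reporter ratio over a protein's peptides.

    peptide_reporters: list of {reporter m/z: intensity}, one dict per peptide spectrum.
    Spectra with a missing or zero intensity in either channel are skipped.
    """
    ratios = []
    for reporters in peptide_reporters:
        num = reporters.get(numerator, 0.0)
        den = reporters.get(denominator, 0.0)
        if num > 0 and den > 0:
            ratios.append(log2(num / den))
    if not ratios:
        raise ValueError("no usable reporter-ion pairs for this protein")
    return median(ratios)


# Hypothetical reporter intensities; channel 114 = wild type, 116 = GM (assumed labels).
peptides = [
    {114: 1000.0, 115: 980.0, 116: 2000.0, 117: 2100.0},
    {114: 500.0, 115: 510.0, 116: 1000.0, 117: 950.0},
    {114: 250.0, 115: 240.0, 116: 1000.0, 117: 1020.0},
]
print(protein_log2_ratio(peptides, numerator=116, denominator=114))
```

The median is used here only because it is robust to a single outlying peptide; other summaries, and a normalization step across channels, would be equally plausible design choices.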

7.3.2.2 MS-Based Metabolite Profiling

In general, metabolites are the final downstream products of the genome, and reflect most closely the status of a certain biological system. The investigation of the metabolic patterns and changes in the metabolism within the frame of GMO analysis might indicate whether intended and/or unintended effects have taken place as a result of genetic modification (Shepherd et al., 2006). However, one of the main challenges in metabolite profiling is to face the complexity of any metabolome, usually composed of a wide range of chemical species that have diverse physicochemical properties (amino acids, organic acids, sugars, steroids, amines, etc.). Also, the relative concentration of metabolites in a biological system can span several orders of magnitude (from mM to pM). Consequently, high sensitivity and resolution are the most relevant parameters to consider for the selection of an appropriate method for comprehensive metabolomic analysis (Villas-Boas et al., 2005). In metabolomics, sample preparation is a critical step since the method for metabolite extraction from biological samples has to be highly reproducible and robust. Nevertheless, at the moment no single analytical platform or methodology exists to detect, quantify, and identify all metabolites in a single experimental analysis of a certain sample. Essentially, two metabolomics platforms are currently used for metabolite profiling of GMOs: MS- and NMR-based methodologies. MS and NMR, either combined with separation techniques or stand-alone, are complementary and are frequently used in parallel in metabolomics research. NMR-based techniques are outside the scope of this chapter; therefore, we will only discuss MS-based techniques within the field of GMO analysis.

In recent years, as confirmed by the number of published applications based on gas chromatography-MS (GC-MS), LC-MS, CE-MS, or MS as a stand-alone technique, MS-based methodologies have been shown to offer wide possibilities to evaluate GM crops based on their metabolic profiles (Hoekenga, 2008). Some representative works are summarized in Table 7.3.
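
To make the concentration range mentioned above concrete, the span from millimolar to picomolar can be written out as a short worked calculation (taking 1 mM and 1 pM as the nominal end points, an assumption for illustration):

```latex
\[
\log_{10}\!\left(\frac{1\ \mathrm{mM}}{1\ \mathrm{pM}}\right)
= \log_{10}\!\left(\frac{10^{-3}\ \mathrm{M}}{10^{-12}\ \mathrm{M}}\right)
= 9
\]
```

That is, about nine orders of magnitude can separate the most and the least abundant metabolites in the same sample.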

Gas Chromatography-Mass Spectrometry

GC-MS is one of the most frequently reported analytical tools for studying the metabolome of GMOs. This technique provides high separation efficiency and reproducibility. GC-MS is often used for the analysis of a wide range of volatile compounds, and of semi- and nonvolatile compounds by employing chemical derivatization. The combination of this versatile technique with chemometric methods (e.g., principal components analysis, PCA) has demonstrated excellent potential to discover significant differences for the discrimination among different plant varieties (Fiehn et al., 2000). In a pioneering work, Roessner et al. (2000) applied GC coupled to a quadrupole mass spectrometer to characterize the metabolic composition of GM potato tubers with modified sugar or starch metabolism. The sample preparation procedure included extraction of polar metabolites from potato tubers, followed by methoximation and silylation to render various classes of compounds volatile. GC-MS analyses allowed the identification of 77 out of 150 compounds, providing valuable information related to altered metabolic pathways and unexpected changes in the levels of some compounds in the GM potato. Further research, focused on the application of the GC-MS technique, demonstrated altered sucrose catabolism in GM potato tubers coincident with a massive elevation in the content of each individual amino acid (Roessner et al., 2001a). In a separate report, the same group demonstrated the suitability of GC-MS in combination with data-mining tools (e.g., PCA and hierarchical clustering) to identify differences that enable the discrimination of the GM potato and tomato lines from the respective unmodified lines at different stages of fruit development (Roessner et al., 2001b; Roessner-Tunali et al., 2003). Following a similar approach, a number of GC-MS studies have been reported for metabolic profiling. Thus, Inaba et al.
(2007) investigated a tryptophan (Trp)-enriched GM soybean line. In their study, GC-MS analyses of leaf extracts indicated higher levels of fructose, myo-inositol, and shikimic acid among 37 total organic acids, sugars, alcohols, and phenolic compounds in GM plants than in the controls.

TABLE 7.3 MS-Based Metabolite Profiling in GMOs

Modification | Donor Organism | GM Crop | Phenotype | Tissue | Technique | Reference
Modified starch, sucrose metabolism | Potato | Tomato | Altered starch composition | Tuber | GC-EI-Q-MS | Roessner et al., 2001a
Hexokinase | Arabidopsis thaliana | Tomato | Altered carbohydrate metabolism | Leaf and fruit | GC-EI-Q-MS | Roessner-Tunali et al., 2003
1-SST, 1-FFT proteins | Artichoke | Potato | Inulin synthesis | Tuber | GC-EI-TOF-MS; LC-ESI-Q-MS | Catchpole et al., 2005
YK1 protein | Maize | Rice | Stress tolerance | Leaf and calli | FT-ICR-MS | Takahashi et al., 2005
Fructokinase, α-glucosidase, S-adenosylmethionine | A. pullulans | Potato | Starch biosynthesis, leaf morphology, ethylene production | Tuber | GC-EI-Q-MS | Shepherd et al., 2006
Aldehyde dehydrogenase | E. coli | Grapevine | Abiotic stress | Leaf | LC-ESI-IT-MS; GC-EI-Q-MS | Tesniere et al., 2006
C1 and R-S regulatory genes | Maize | Rice | Flavonoid production | Leaf | LC-ESI-IT-MS/MS | Shin et al., 2006
YK1 protein | Maize | Rice | Stress tolerance | Seed | CE-ESI-Q-MS | Takahashi et al., 2006
Anthranilate synthase | Tobacco | Soybean | Nutritionally enhanced | Leaf and seed | GC-EI-Q-MS | Inaba et al., 2007
EPSPS enzyme | A. tumefaciens | Soybean | Herbicide tolerance | Seed | GC-EI-Q-MS | Bernal et al., 2007
STS enzyme | Grapevine | Tomato | Resveratrol synthesis | Fruit | LC-ESI-Q-MS | Nicoletti et al., 2007
β-1,3-glucanase | Barley | Wheat | Antifungal activity | Leaf | LC-ESI-IT-MS/MS | Ioset et al., 2007
Bt toxin | B. thuringiensis | Maize | Insect resistance | Grain | CE-ESI-TOF-MS | Levandi et al., 2008
EPSPS enzyme | A. tumefaciens | Soybean | Herbicide tolerance | Seed | CE-ESI-TOF-MS | García-Villalba et al., 2008
Virus movement protein | Dwarf virus | Raspberries | Virus resistance | Fruit | GC-EI-Q-MS | Malowicki et al., 2008
Bt toxin, skc protein | B. thuringiensis | Rice | Insect resistance | Seed | GC-EI-Q-MS | Zhou et al., 2009
ADP glucose pyrophosphorylase | E. coli | Rice | Nutritionally enhanced | Seed | LC-ESI-Q-MS | Nagai et al., 2009
Bt toxin | B. thuringiensis | Maize | Insect resistance | Grain | GC-EI-Q-MS | Jimenez et al., 2009
Thaumatin-II | Thaumatococcus daniellii | Cucumber | Sweet flavor | Fruit | GC-EI-Q-MS; GC-EI-TOF-MS | Zawirska-Wojtasiak et al., 2009
EPSPS enzyme | A. tumefaciens | Soybean | Herbicide tolerance | Seed | CE-ESI-TOF-MS | Giuffrida et al., 2009
Bt toxin | B. thuringiensis | Maize | Insect resistance | Grain | FT-ICR-MS | Leon et al., 2009
Anthranilate synthase | Tobacco | Rice | Nutritionally enhanced | Seed | LC-ESI-Q-MS | Matsuda et al., 2010
Bt toxin | B. thuringiensis | Maize | Insect resistance | Grain | GC-EI-Q-MS | Barros et al., 2010
(1,3-1,4)-β-glucanase | B. amyloliquefaciens, B. macerans | Barley | Antifungal activity | Seed | LC-ESI-IT-MS | Kogel et al., 2010
Endochitinase | Trichoderma harzianum | Barley | Nutritionally enhanced | Seed | LC-ESI-IT-MS | Kogel et al., 2010
Miraculin | Richadella dulcifica | Tomato | Sweet flavor | Fruit | GC-EI-TOF-MS; LC-ESI-QTOF-MS; CE-ESI-QTOF-MS | Kusano et al., 2011
Bt toxin, skc protein | B. thuringiensis | Rice | Insect resistance | Seed | LC-ESI-QTOF-MS | Chang et al., 2012

The compositional similarities/differences between conventional and GM potatoes containing high levels of inulin-type fructans have been investigated by Catchpole et al. (2005) using GC-MS. The analytical strategy involved an initial evaluation of the degree of compositional similarity between tubers of GM and several conventional potato cultivars. In this first stage of the study, 600 potato extracts were analyzed using flow-injection analysis-TOF-MS. Then, PCA provided the 15 top-ranking ions, some of them corresponding to oligofructans of different polymerization degrees, allowing genotype separation. In addition, more than 2000 tuber samples were analyzed using GC-EI-TOF-MS, providing a global profiling of 252 individual metabolites (90 positively identified, 89 assigned to a specific metabolite class, and 73 unknowns). Further chemometric analysis of the profiles showed that, apart from the changes expected as a result of the genetic modification, GM potatoes showed a metabolite composition within the range normally exhibited by conventional cultivars. In a more recent report, the unintended effects in insect-resistant GM rice have been investigated by Zhou et al. (2009) by means of the combined application of GC-flame ionization detection (FID) and GC-MS. In this case, however, GC-MS was exclusively used to identify certain important compounds after GC-FID profiling.
Although the genetic transformation and growing conditions induced similar effects on the concentrations of glycerol-2-phosphate, citric acid, and oleic acid, other metabolites, such as mannitol and glutamic acid, were widely affected by the genetic modification. Volatilearoma compounds, such as aldehydes and alcohols, are secondary metabo- lites influenced by a number of variables, including genetic makeup and abiotic fac- tors. GC-MS has been shown to be a useful tool for profiling aroma compounds in GM vegetables and fruits. Thus, volatile fraction of GM raspberries with added resis- tance to virus attack has been investigated using GC-MS (Malowicki et al., 2008). The quantitative study of 30 selected compounds belonging to various chemical classes (e.g., alcohol, aldehyde, ketone, ester, and terpene) did not show significant differences between the GM line and the wild type. Similarly, the aroma compound profiles of four lines of GM cucumber and their unmodified lines were investigated by Zawirska-Wojtasiak et al. (2009) using GC-MS. In this study, the evaluation of two extraction methods, including microdistillation and solid-phase microextraction (SPME) revealed that the latter enabled the identification of higher number of com- pounds (a total of 28 compounds) due to its capability to detect low boiling point volatiles. The cucumber extracts were subjected to GC-EI-TOF-MS and GC-EI-Q- MS. Although all identified compounds were identical in GM and conventional lines, analyses showed that, regardless of the type of MS analyzer used for the analysis, significant quantitative differences were found between GM and control cucumber lines. The combinations of several selective extraction methods using supercritical fluids or accelerated solvents with GC-MS have been also explored for the investiga- tion of unintended effects in GMOs (Bernal et al., 2007; Jimenez et al., 2009). 
These techniques have been applied to extract selectively amino acids and fatty acids from soybean and maize for subsequent profiling and quantification.
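The idea of judging a GM line against the range normally exhibited by conventional cultivars, as in the GM potato study discussed above, can be illustrated with a minimal sketch. The function name, the cultivar labels, and all abundance values below are hypothetical and were invented for illustration; the cited studies relied on chemometric analysis of full profiles, not on this simple min-max check.

```python
def outside_conventional_range(conventional, gm):
    """Flag metabolites whose GM level lies outside the min-max range
    observed across the conventional cultivars."""
    flagged = {}
    for metabolite, gm_value in gm.items():
        values = [profile[metabolite] for profile in conventional.values()]
        low, high = min(values), max(values)
        if not low <= gm_value <= high:
            flagged[metabolite] = (gm_value, low, high)
    return flagged


# Hypothetical relative abundances (arbitrary units), for illustration only.
conventional = {
    "cultivar_A": {"sucrose": 1.00, "citric acid": 0.80, "inulin": 0.00},
    "cultivar_B": {"sucrose": 1.30, "citric acid": 0.65, "inulin": 0.00},
    "cultivar_C": {"sucrose": 0.90, "citric acid": 0.95, "inulin": 0.00},
}
gm_line = {"sucrose": 1.10, "citric acid": 0.70, "inulin": 0.42}

flagged = outside_conventional_range(conventional, gm_line)
for name, (value, low, high) in flagged.items():
    print(f"{name}: GM = {value}, conventional range = [{low}, {high}]")
```

In this invented example only the metabolite targeted by the modification falls outside the conventional range, which mirrors the kind of conclusion reported for the GM potatoes; real data would also require replicate measurements and a statistical treatment of variability.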

Liquid Chromatography-Mass Spectrometry. LC coupled to MS has been demonstrated to be a valuable tool for the metabolomic analysis of GMOs. LC-MS provides reproducible quantitative analysis, a wide dynamic range, and the ability to separate and analyze highly complex samples. The versatility of this technique is demonstrated by the growing number of applications reported in the field of GMO analysis. For instance, LC-MS has been used to study flavonoids in transformed rice expressing regulatory genes from maize that induce flavonoid biosynthesis (Shin et al., 2006). In a different report, Ioset et al. (2007) investigated changes in metabolite accumulation in two transgenic lines of wheat (Triticum aestivum L.) with either antifungal or viral resistance. Solid-phase extraction was used to extract flavonoids that were subsequently analyzed by LC-IT-MS with two different ionization sources, ESI and atmospheric pressure chemical ionization (APCI). The use of ESI in negative mode in LC-MS/MS analysis enabled the discrimination of C-glycoside flavonoids from their O-glycoside analogues. Hierarchical clustering of the data revealed a closer correlation between GM and non-GM plants of the same variety than between unmodified plants of different cultivars. Also, an LC-MS method has been developed for the profiling of stilbenes, a specific class of polyphenols, in transgenic tomato overexpressing a grapevine gene encoding the enzyme stilbene synthase (Nicoletti et al., 2007). The study focused on the detection of potential alterations in the synthesis of other metabolites along the flavonoid pathway. Extracts from tomato fruits and peels were analyzed by LC-ESI-MS in the negative ionization mode, which provided lower background noise and higher sensitivity than the positive mode for the detection of stilbenes and phenolic compounds. Differences in the concentrations of rutin, naringenin, and chlorogenic acids were detected when the profiles of GM and control tomatoes were compared using this methodology.

The combination of LC-MS with GC-MS has been shown to improve the characterization of the metabolome of GMOs. More precisely, LC-ESI-IT-MS and GC-MS were applied to detect differences in some phenolic compounds and volatile secondary metabolites belonging to the classes of monoterpenes, C12-norisoprenoids, and shikimates between GM and unmodified grapevine lines (Tesniere et al., 2006). Matsuda et al. (2010) used LC-ESI-Q-MS to study the effect of expressing a gene encoding a feedback-insensitive α-subunit of anthranilate synthase on the metabolic profile of GM rice with increased Trp content. Metabolic profiles obtained from the analysis of different plant tissues were subjected to three different statistical methods (independent component analysis, correlation analysis, and Student’s t-test) in order to characterize the differences between GM rice and its unmodified counterpart. The results also indicated that the concentration of Trp changes in a time- and tissue-dependent manner. Recently, Chang et al. (2012) reported the metabolic profiling of GM rice using LC-ESI-QTOF-MS. Reproducibility and instrument precision were determined during the first stage of method development. Linearity, recoveries, and LODs of selected compounds were also evaluated. Then, the metabolic differences between GM rice (expressing the sck and cry1Ac genes) and conventional rice were investigated using multivariate analysis, including PCA and partial least squares-discriminant analysis (PLS-DA). Environmental factors were also studied by comparing rice samples sown in different seasons and locations. The results indicated that environmental factors played a greater role than gene modification for most metabolites.

Reversed phase is the mode most frequently used in LC-MS metabolite profiling for GMO analysis; however, other modes have also proven valuable. For example, Nagai et al. (2009) used a HILIC phase for the separation of major carbon metabolites in transgenic rice by LC-ESI-MS/MS. Also, Kogel et al. (2010) applied ion-exchange chromatography coupled to ESI-MS to investigate the substantial equivalence of GM barley. In their work, PCA was computed for the 307 most significant mass signals obtained by the metabolomic analysis, confirming that cultivar-specific differences in the metabolome greatly exceeded the changes caused by the genetic modification.
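Several of the studies above rely on PCA to judge whether the variation attributable to the genetic modification is small relative to cultivar or environmental variation. A minimal sketch of that reasoning is given below, assuming a small samples-by-features intensity table and extracting only the first principal component by power iteration on the covariance matrix. The function name and all intensity values are hypothetical; published work would typically use a dedicated chemometrics package, scaling of the variables, and many more signals.

```python
import math


def first_principal_component(samples, iterations=200):
    """Return (loadings, scores) of PC1 for a samples x features table."""
    n, p = len(samples), len(samples[0])
    # Mean-center every feature (column).
    means = [sum(row[j] for row in samples) / n for j in range(p)]
    centered = [[row[j] - means[j] for j in range(p)] for row in samples]
    # Sample covariance matrix (p x p).
    cov = [[sum(r[a] * r[b] for r in centered) / (n - 1) for b in range(p)]
           for a in range(p)]
    # Power iteration converges to the eigenvector with the largest eigenvalue,
    # provided the starting vector is not orthogonal to it.
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Project the centered samples onto the PC1 direction.
    scores = [sum(r[j] * v[j] for j in range(p)) for r in centered]
    return v, scores


# Hypothetical signal intensities for three mass signals (illustration only).
profiles = [
    [10.0, 5.0, 2.0],   # cultivar 1, non-GM
    [10.2, 5.1, 2.3],   # cultivar 1, GM
    [14.0, 2.0, 2.0],   # cultivar 2, non-GM
    [14.1, 2.2, 2.4],   # cultivar 2, GM
]
loadings, scores = first_principal_component(profiles)
for label, score in zip(["c1 non-GM", "c1 GM", "c2 non-GM", "c2 GM"], scores):
    print(f"{label}: PC1 score = {score:.2f}")
```

With these invented numbers the two cultivars separate along PC1 while each GM line stays close to its non-GM counterpart, which is the pattern the barley study describes. The sign of a principal component is arbitrary, so only relative positions along PC1 are meaningful.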

Capillary Electrophoresis-Mass Spectrometry. CE-MS has been shown to be suitable for the analysis of a wide range of analytes, including ionic and polar thermolabile compounds, and is considered complementary to LC-MS and GC-MS. High efficiency, short analysis times, and high resolution are features of CE-MS, which, moreover, requires little sample pretreatment. On the other hand, only moderate sensitivity is frequently achieved because of the minute sample volumes injected in CE-MS. The potential of CE-MS for metabolic profiling of GMOs has already been demonstrated in studies of GM maize and soybean (Levandi et al., 2008; García-Villalba et al., 2008). Thus, a novel CE-ESI-TOF-MS method was developed to determine statistically significant differences in the metabolic profiles of conventional and insect-resistant GM maize varieties (Levandi et al., 2008). Different extraction procedures were assayed and optimized in order to obtain the highest number of metabolites from the maize flour. The application of PCA to the metabolic profile data enabled the identification of statistically significant differences in the levels of L-carnitine and stachydrine between conventional and GM maize. In addition, a similar CE-ESI-TOF-MS methodology was developed for the comparative analysis of metabolic profiles from GM soybean (glyphosate resistant) and its corresponding nonmodified parental line (García-Villalba et al., 2008). In that study, over 45 different metabolites, including isoflavones, amino acids, and carboxylic acids, were tentatively identified. The metabolic profiles of the two lines showed differences in the concentrations of three free amino acids (proline, histidine, and asparagine), whereas a metabolite tentatively identified as 4-hydroxy-L-threonine was absent from the transgenic soybean compared with its control line. A chiral CE-ESI-TOF-MS method has also been developed to study differences in the chiral amino acid profile between six varieties of conventional and herbicide-tolerant transgenic soybean (Giuffrida et al., 2009). Novel modified cyclodextrins were used as chiral selectors in the background electrolyte to achieve good resolution of chiral compounds. Using this method, similar D/L-amino acid profiles were obtained for conventional and GM soybean.

Multiplatform Mass Spectrometry Approaches. Recently, the potential of multiplatform approaches for metabolomic profiling of GMOs has been highlighted by Kusano et al. (2011). Data from GC-TOF-MS, LC-TOF-MS, and CE-TOF-MS were combined in a summarized data set to improve coverage of the tomato metabolome. This combined data set included 175 unique identified metabolites and 1460 peaks with no or imprecise metabolite annotation. Next, PCA was performed using the physicochemical properties of 160 identified metabolites from the data set to evaluate its chemical diversity, using a tomato metabolism database (LycoCyc) as reference (Fig. 7.3). Using this method, the data set accounted for 85% of the chemical diversity. In addition, multivariate analysis based on orthogonal projections to latent structures discriminant analysis was applied to study the differences between GM and control tomato lines. The changes between the GM and nonmodified lines were small compared to the changes observed between ripening stages and traditional cultivars.

FIGURE 7.3 Evaluation of the chemical diversity achieved by the multiplatform approach (GC-MS, LC-MS, and CE-MS). PCA was performed on the predicted physicochemical properties of the detected metabolites and the metabolites in the LycoCyc database. (a) The loading plots show that PC1 is dominated by size-related properties and PC2 by solubility-related properties. (b) The score plots show that the distribution of the detected metabolites occupies a similar space as the reference metabolites. The inset bar plot shows the ratio of variance among the reference metabolites covered by each of the individual platforms and the summarized data set. Reprinted from Kusano et al. (2011).

Fourier transform ion-cyclotron resonance MS (FT-ICR-MS) has proven its great potential as an analytical platform for metabolomic profiling of GMOs (Aharoni et al., 2002; Takahashi et al., 2005; Mungur et al., 2005). FT-ICR-MS has been used in combination with CE-TOF-MS for the metabolomic profiling of six varieties of maize, three GM insect-resistant lines and their corresponding isogenic lines (Leon et al., 2009). Pressurized liquid extraction (PLE) of metabolites from maize flours was investigated as an automated procedure for sample preparation. Metabolite extracts were analyzed using FT-ICR-MS in positive and negative ESI modes. Data sets were uploaded to the MassTRIX server in order to identify maize-specific metabolites annotated in the KEGG database (Suhre and Schmitt-Kopplin, 2008). PLS-DA revealed the most discriminative masses that contribute to differentiating the GMO samples from their isogenic lines. These discriminant m/z values were uploaded to MassTRIX, which returned the total number of compounds identified and present in maize. Although the FT-ICR-MS technique offers good mass accuracy (sub-ppm) and resolution (greater than 100,000), various compounds could not be unambiguously identified, for instance, isomers having the same molecular formula. In those cases, electrophoretic mobilities and m/z values provided by CE-TOF-MS were used to confirm the identity of various isomeric compounds.
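The limitation noted above, that sub-ppm mass accuracy cannot separate isomers, follows directly from how accurate mass is computed. The sketch below calculates the m/z of a singly protonated ion from an elemental composition and the mass error in ppm, defined as (measured − theoretical) / theoretical × 10^6. Leucine and isoleucine (both C6H13NO2) are used here only as a familiar illustrative pair of isomers, and the "measured" value is hypothetical; neither is taken from the maize study.

```python
# Monoisotopic masses (u) of the most abundant isotopes, and the proton mass.
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503207,
    "N": 14.0030740048,
    "O": 15.9949146196,
}
PROTON = 1.007276466812


def protonated_mz(formula):
    """m/z of the singly charged [M+H]+ ion for an elemental composition."""
    neutral = sum(MONOISOTOPIC[element] * count
                  for element, count in formula.items())
    return neutral + PROTON


def ppm_error(measured, theoretical):
    """Mass error in parts per million."""
    return (measured - theoretical) / theoretical * 1e6


leucine = {"C": 6, "H": 13, "N": 1, "O": 2}
isoleucine = {"C": 6, "H": 13, "N": 1, "O": 2}   # same formula, different structure

mz_leu = protonated_mz(leucine)
mz_ile = protonated_mz(isoleucine)
measured = 132.10185                              # hypothetical instrument reading
error = ppm_error(measured, mz_leu)

print(f"[M+H]+ leucine    = {mz_leu:.5f}")
print(f"[M+H]+ isoleucine = {mz_ile:.5f}")
print(f"mass error        = {error:.2f} ppm")
```

Because the two theoretical m/z values are identical, no improvement in mass accuracy or resolution can distinguish the pair; an orthogonal property such as the electrophoretic mobility measured by CE-TOF-MS is needed, which is the strategy described in the text.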

In addition to the aforementioned multiplatform studies, Barros et al. (2010) reported the application of various omics techniques, including gene expression microarrays, 2-DGE-based proteomics, and NMR- and GC-MS-based metabolomics, to the investigation of unintended effects in two GM maize varieties. To simplify the interpretation, identification, and presentation of the large amount of data generated by these technologies, chemometric tools such as PCA were used for the statistical analysis, complemented with ANOVA for the determination of significant differences in transcripts, proteins, or metabolites.
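As a rough illustration of how ANOVA complements PCA when testing a single transcript, protein, or metabolite, the following sketch computes the one-way ANOVA F statistic for three groups, for example a non-GM variety and two GM varieties. The function name and the abundance values are hypothetical, and the sketch stops at the F statistic and its degrees of freedom; a p-value would additionally require the F distribution, and omics-scale studies would also need a correction for multiple testing.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of sample groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Variation of the individual values around their own group mean.
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical relative abundances of one metabolite (illustration only).
non_gm = [5.1, 4.9, 5.0]
gm_variety_1 = [5.2, 5.0, 5.1]
gm_variety_2 = [6.9, 7.1, 7.0]

f_stat, df_between, df_within = one_way_anova_f([non_gm, gm_variety_1, gm_variety_2])
print(f"F({df_between}, {df_within}) = {f_stat:.1f}")
```

A large F value indicates that the differences between group means are large relative to the scatter within groups; a follow-up pairwise comparison would be needed to say which variety differs.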

7.4 CONCLUSIONS AND FUTURE OUTLOOK

The development of new analytical methods allowing the qualitative (target or profiling) and quantitative analysis of GMOs has been and will continue to be driven by the establishment of regulations concerning GMO risk assessment, marketing, labeling, and traceability. Despite the restrictions imposed by regulations in several countries, there is a steadily growing worldwide market for food products containing GMOs that, in the foreseeable future, will include the so-called second-generation GMOs. In order to provide control bodies with the appropriate tools, future research in this field will keep focusing on methods able to reliably detect, characterize, and quantify GMOs. Moreover, these methodologies will pass through a validation process in order to be confidently used by enforcement and commercial laboratories. In this context, both target and profiling strategies are expected to keep playing a definitive role. Additionally, more studies will be required to characterize the natural variability of crops in order to make the identification of any unintended effect or GM crop easier. The definition of common standardized experimental protocols is a major challenge in omics strategies, for which Foodomics can be the right framework. The unification of the different analytical platforms and protocols will enable the comparison of experiments performed in laboratories around the world. Besides, the integration of omics data (transcriptomics, proteomics, and metabolomics) will require a significant effort involving a vast amount of collaborative work to compare and share data within the scientific community.

ACKNOWLEDGMENTS

This work was supported by Projects AGL2011-29857-C03-01 (Ministerio de Ciencia e Innovación, Spain) and CSD2007-00063 FUNC-FOOD (Programa CONSOLIDER, Ministerio de Educación y Ciencia, Spain).

REFERENCES

Ad Hoc Technical Expert Group (AHTEG) (2010). Guidance Document on risk assessment of living modified organisms. https://www.cbd.int/doc/meetings/bs/bsrarm-02/official/bsrarm-02-05-en.pdf. United Nations Environment Programme Convention for Biodiversity.
Aharoni A, De Vos CHR, Verhoeven HA, Maliepaard CA, Kruppa G, Bino R, Goodenowe DB (2002). Nontargeted metabolome analysis by use of Fourier Transform Ion Cyclotron Mass Spectrometry. OMICS 6:217–234.
Albo AG, Mila S, Digilio G, Motto M, Aime S, Corpillo D (2007). Proteomic analysis of a genetically modified maize flour carrying Cry1Ab gene and comparison to the corresponding wild-type. Maydica 52:443–455.
Alderborn A, Sundstrom J, Soeria-Atmadja D, Sandberg M, Andersson HC, Hammerling U (2010). Genetically modified plants for non-food or non-feed purposes: Straightforward screening for their appearance in food and feed. Food and Chemical Toxicology 48:453–464.
Ali S, Zafar Y, Xianyin Z, Ali GM, Jumin T (2008). Transgenic crops: Current challenges and future perspectives. African Journal of Biotechnology 7:4667–4676.
Ayella AK, Trick HN, Wang W (2007). Enhancing lignan biosynthesis by overexpressing pinoresinol lariciresinol reductase in transgenic wheat. Molecular Nutrition and Food Research 51:1518–1526.
Balsamo GM, Cangahuala-Inocente GC, Bertoldo JB, Terenzi H, Arisi ACM (2011). Proteomic analysis of four Brazilian MON810 maize varieties and their four non-genetically-modified isogenic varieties. Journal of Agricultural and Food Chemistry 59:11553–11559.
Barros E, Lezar S, Anttonen MJ, van Dijk JP, Röhlig RM, Kok EJ, Engel KH (2010). Comparison of two GM maize varieties with a near-isogenic non-GM variety using transcriptomics, proteomics and metabolomics. Plant Biotechnology Journal 8:436–451.
Bernal JL, Nozal MJ, Toribio L, Diego C, Mayo R, Maestre R (2007). Use of supercritical fluid extraction and gas chromatography–mass spectrometry to obtain amino acid profiles from several genetically modified varieties of maize and soybean. Journal of Chromatography A 1192:266–272.
Bianco G, Schmitt-Kopplin P, Crescenzi A, Comes S, Kettrup A, Cataldi TRI (2003). Evaluation of glycoalkaloids in tubers of genetically modified virus Y-resistant potato plants (var. Désirée) by non-aqueous capillary electrophoresis coupled with electrospray ionization mass spectrometry (NACE–ESI–MS). Analytical and Bioanalytical Chemistry 375:799–804.
Brandao AR, Barbosa HS, Arruda MAZ (2010). Image analysis of two-dimensional gel electrophoresis for comparative proteomics of transgenic and non-transgenic soybean seeds. Journal of Proteomics 73:1433–1440.
Cahoon EB, Hall SE, Ripp KG, Ganzke TS, Hitz WD, Coughland SJ (2003). Metabolic redesign of vitamin E biosynthesis in plants for tocotrienol production and increased antioxidant content. Nature Biotechnology 21:1082–1087.
Careri M, Elviri L, Mangia A, Zagnoni I, Agrimonti C, Visioli G, Marmiroli N (2003). Analysis of protein profiles of genetically modified potato tubers by matrix-assisted laser desorption/ionization time-of-flight mass spectrometry. Rapid Communications in Mass Spectrometry 17:479–483.
Catchpole GS, Beckmann M, Enot DP, Mondhe M, Zywicki B, Taylor J, Hardy N, Smith A, King RD, Kell DB, Fiehn O, Draper J (2005). Hierarchical metabolomics demonstrates substantial compositional similarity between genetically modified and conventional potato crops. Proceedings of the National Academy of Sciences 102:14458–14462.

Cellini F, Chesson A, Colquhoun I, Constable A, Davies HV, Engel KH, Gatehouse AMR, Kärenlampi S, Kok EJ, Leguay JJ, Lehesranta S, Noteborn HP, Pedersen J, Smith M (2004). Unintended effects and their detection in genetically modified crops. Food and Chemical Toxicology 42:1089–1125.
Chang Y, Zhao C, Zhu Z, Wu Z, Zhou J, Zhao Y, Lu X, Xu G (2012). Metabolic profiling based on LC/MS to evaluate unintended effects of transgenic rice with cry1Ac and sck genes. Plant Molecular Biology 78:477–487.
Chassy BM (2010). Can -omics inform a food safety assessment? Regulatory Toxicology and Pharmacology 58:S62–S70.
Chen H, Bodulovic G, Hall PJ, Moore A, Higgins TJV, Djordjevic MA, Rolfe BG (2009). Unintended changes in protein expression revealed by proteomic analysis of seeds from transgenic pea expressing a bean α-amylase inhibitor gene. Proteomics 9:4406–4415.
Cifuentes A (2009). Food analysis and foodomics. Journal of Chromatography A 1216:7109–7110.
Coll A, Nadal A, Palaudelmas M, Messeguer J, Melé E, Puigdomenech P, Pla M (2008). Lack of repeatable differential expression patterns between MON810 and comparable commercial varieties of maize. Plant Molecular Biology 68:105–117.
Coll A, Nadal A, Rossignol M, Puigdomènech P, Pla M (2011). Proteomic analysis of MON810 and comparable non-GM maize varieties grown in agricultural fields. Transgenic Research 20:939–949.
Corpillo D, Gardini G, Vaira AM, Basso M, Aime S, Accotto GP, Fasano M (2004). Proteomics as a tool to improve investigation of substantial equivalence in genetically modified organisms: The case of a virus-resistant tomato. Proteomics 4:193–200.
Deblock M, Botterman J, van Dewiele M, Dockx J, Thoen C, Goseele V, Movva NR, Thompson C, Van Montagu M, Leemans J (1987). Engineering herbicide resistance in plants by expression of a detoxifying enzyme. EMBO Journal 6:2513–2518.
Deisingh AK, Badrie N (2005). Detection approaches for genetically modified organisms in foods. Food Research International 38:639–649.
Di Carli M, Villani ME, Renzone G, Nardi L, Pasquo A, Franconi R, Scaloni A, Benvenuto E, Desiderio A (2009). Leaf proteome analysis of transgenic plants expressing antiviral antibodies. Journal of Proteome Research 8:838–848.
Di Luccia A, Lamacchia C, Fares C, Padalino L, Mamone G, La Gatta B, Gambacorta G, Faccia M, Di Fonzo N, La Notte E (2005). A proteomic approach to study protein variation in GM durum wheat in relation to technological properties of semolina. Annali di Chimica 95:405–414.
Elenis DS, Kalogianni DP, Glynou K, Ioannou PC, Christopoulos TK (2008). Advances in molecular techniques for the detection and quantification of genetically modified organisms. Analytical and Bioanalytical Chemistry 392:347–354.
Erny GL, León C, Marina ML, Cifuentes A (2008). Time of flight versus ion trap MS coupled to CE to analyze intact proteins. Journal of Separation Science 31:1810–1818.
European Food Safety Agency (2006). Guidance document of the scientific panel on genetically modified organisms for the risk assessment of genetically modified plants and derived food and feed. EFSA Communications Department, Parma, Italy.
Fernández Ocaña M, Fraser P, Patel RKP, Halket JM, Bramley PM (2007). Mass spectrometric detection of CP4 EPSPS in genetically modified soya and maize. Rapid Communications in Mass Spectrometry 21:319–328.

Fernández Ocaña M, Fraser P, Patel RKP, Halket JM, Bramley PM (2009). Evaluation of stable isotope labelling strategies for the quantitation of CP4 EPSPS in genetically modified soya. Analytica Chimica Acta 634:75–82.
Fiehn O, Kopka J, Dormann P, Altmann T, Trethewney RN, Willmitzer L (2000). Metabolite profiling for functional genomics. Nature Biotechnology 18:1157–1161.
Fitch MM, Manshardt RM, Gonsalves D, Slightom JL, Sanford JC (1992). Virus resistant papaya plants derived from tissues bombarded with the coat protein gene of papaya ringspot virus. Biotechnology 10:1466–1472.
García-Cañas V, Cifuentes A, González R (2004). Detection of genetically modified organisms in foods by DNA amplification techniques. Critical Reviews in Food Science and Nutrition 44:425–436.
García-Cañas V, Simó C, León C, Ibáñez E, Cifuentes A (2011). MS-based analytical methodologies to characterize genetically modified crops. Mass Spectrometry Reviews 30:396–416.
García-López MC, García-Cañas V, Marina ML (2009). Reversed-phase high-performance liquid chromatography-electrospray mass spectrometry profiling of transgenic and non-transgenic maize for cultivar characterization. Journal of Chromatography A 1216:7222–7228.
García-Villalba R, León C, Dinelli G, Segura-Carretero A, Fernandez-Gutierrez A, García-Cañas V, Cifuentes A (2008). Comparative metabolomic study of transgenic versus conventional soybean using capillary electrophoresis–time-of-flight mass spectrometry. Journal of Chromatography A 1195:164–173.
Giuffrida A, León C, García-Cañas V, Cucinotta V, Cifuentes A (2009). Modified cyclodextrins for fast and sensitive chiral-capillary electrophoresis-mass spectrometry. Electrophoresis 30:1734–1742.
Grothaus GD, Bandla M, Currier T, Giroux R, Jenkins GR, Lipp M, Shan G, Stave JW, Pantella V (2006). Immunoassay as an analytical tool in agricultural biotechnology. Journal of AOAC International 89:913–928.
Gryson N (2010). Effect of food processing on plant DNA degradation and PCR-based GMO analysis: a review. Analytical and Bioanalytical Chemistry 396:2003–2022.
Guo H, Zhang H, Li Y, Ren J, Wang X, Niu H, Yin J (2011). Identification of changes in wheat (Triticum aestivum L.) seeds proteome in response to anti-trx s gene. Plos One 6:e22255.
Hails RS (2000). Genetically modified plants – the debate continues. Trends in Ecology and Evolution 15:14–18.
Hashimoto W, Momma K, Katsube T, Ohkawa Y, Ishige T, Kito M, Utsumi S, Murata K (1999). Safety assessment of genetically engineered potatoes with designed soybean glycinin: compositional analyses of the potato tubers and digestibility of the newly expressed protein in transgenic potatoes. Journal of the Science of Food and Agriculture 79:1607–1612.
Heinemann JA, Kurenbach B, Quist D (2011). Molecular profiling – a tool for addressing emerging gaps in the comparative risk assessment of GMOs. Environment International 37:1285–1293.
Hernandez M, Pla M, Esteve T, Prat S, Puigdomenech P, Ferrando A (2003). A specific real-time quantitative PCR detection system for event MON810 in maize YieldGard based on the 3′-transgene integration sequence. Transgenic Research 12:179–218.
Herrero M, García-Cañas V, Simo C, Cifuentes A (2010). Recent advances in the application of CE methods for food analysis and foodomics. Electrophoresis 31:205–228.
Herrero M, Simó C, García-Cañas V, Ibáñez E, Cifuentes A (2012). Foodomics: MS-based strategies in modern food science and nutrition. Mass Spectrometry Reviews 31:49–69.

Hoekenga OA (2008). Using metabolites to estimate unintended effects in transgenic crop plants: problems, promises and opportunities. Journal of Biomolecular Techniques 19:159–166.
Horvath-Szanics E, Szabo Z, Janaky T, Pauk J, Hajos G (2006). Proteomics as an emergent tool for identification of stress-induced proteins in control and genetically modified wheat lines. Chromatographia 63:S143–S147.
Hu XT, Owens MA (2011). Multiplexed protein quantification in maize leaves by liquid chromatography coupled with tandem mass spectrometry: an alternative tool to immunoassays for target protein analysis in genetically engineered crops. Journal of Agricultural and Food Chemistry 59:3551–3558.
Hutchison CA (2007). DNA sequencing: bench to bedside and beyond. Nucleic Acids Research 35:6227–6237.
Inaba Y, Brotherton JE, Ulanov A, Widholm JM (2007). Expression of a feedback insensitive anthranilate synthase gene from tobacco increases free tryptophan in soybean plants. Plant Cell Reports 26:1763–1771.
Ioset JR, Urbaniak B, Ndjoko-Ioset K, Wirth J, Martin F, Gruissem W, Hostettmann K, Sautter C (2007). Flavonoid profiling among wild type and related GM wheat varieties. Plant Molecular Biology 65:645–654.
Islam N, Campbell PM, Higgins TJV, Hirano H, Akhurst RJ (2009). Transgenic peas expressing an α-amylase inhibitor gene from beans show altered expression and modification of endogenous proteins. Electrophoresis 30:1863–1868.
Jiao Z, Si X, Li G, Zhang Z, Xu X (2010). Unintended compositional changes in transgenic rice seeds (Oryza sativa L.) studied by spectral and chromatographic analysis coupled with chemometrics methods. Journal of Agricultural and Food Chemistry 58:1746–1754.
Jimenez JJ, Bernal JL, Nozal MJ, Toribio L, Bernal J (2009). Profile and relative concentrations of fatty acids in corn and soybean seeds from transgenic and isogenic crops. Journal of Chromatography A 1216:7288–7295.
Kinney AJ (2006). Metabolic engineering in plants for human health and nutrition. Current Opinion in Biotechnology 17:130–138.
Kogel KH, Voll LM, Schafer P, Jansen C, Wu Y, Langen G, Imani J, Hofmann J, Schmiedl A, Sonnewald S, von Wettstein D, Cook RJ, Sonnewald U (2010). Transcriptome and metabolome profiling of field-grown transgenic barley lack induced differences but show cultivar-specific variances. Proceedings of the National Academy of Sciences 107:6198–6203.
Komatsu S, Yang G, Khan M, Onodera H, Toki S, Yamaguchi M (2007). Overexpression of calcium-dependent protein kinase 13 and calreticulin interacting protein 1 confers cold tolerance on rice plants. Molecular Genetics and Genomics 277:713–723.
Kuiper HA, Kleter GA (2003). The scientific basis for risk assessment and regulation of genetically modified foods. Trends in Food Science and Technology 14:277–293.
Kuiper HA, Kleter GA, Hub PJ, Kok EJ (2001). Assessment of the food safety issues related to genetically modified foods. Plant Journal 27:503–528.
Kusano M, Redestig H, Hirai T, Oikawa A, Matsuda F, Fukushima A, Arita M, Watanabe S, Yano M, Hiwasa-Tanase K, Ezura H, Saito K (2011). Covering chemical diversity of genetically-modified tomatoes using metabolomics for objective substantial equivalence assessment. Plos One 16:16989.

Latham JR, Wilson AK, Steinbrecher RA (2006). The mutational consequences of plant transformation. Journal of Biomedicine and Biotechnology 25376:1–7.
Le Gall G, Colquhon IJ, Davis AL, Collins GJ, Verhoeyen ME (2003a). Metabolite profiling of tomato (Lycopersicon esculentum) using 1H NMR spectroscopy as a tool to detect potential unintended effects following a genetic modification. Journal of Agricultural and Food Chemistry 51:2447–2456.
Le Gall G, Dupont MS, Mellon FA, Davis AL, Collins GJ, Verhoeyen ME, Colquhoun MJ (2003b). Characterization and content of flavonoid glycosides in genetically modified tomato (Lycopersicon esculentum) fruits. Journal of Agricultural and Food Chemistry 51:2438–2446.
Lehesranta SJ, Davies HV, Shepherd LV, Nunan N, McNicol JW, Auriola S, Koistinen KM, Suomalainen S, Kokko HI, Kärenlampi SO (2005). Comparison of tuber proteomes of potato varieties, landraces, and genetically modified lines. Plant Physiology 138:1690–1699.
Leon C, Rodriguez-Meizoso I, Lucio M, García-Cañas V, Ibañez E, Schmitt-Kopplin P, Cifuentes A (2009). Metabolomics of transgenic maize combining Fourier transform-ion cyclotron resonance-mass spectrometry, capillary electrophoresis-mass spectrometry and pressurized liquid extraction. Journal of Chromatography A 1216:7314–7323.
Levandi T, Leon C, Kaljurand M, García-Cañas V, Cifuentes A (2008). Capillary electrophoresis-time of flight-mass spectrometry for comparative metabolomics of transgenic vs. conventional maize. Analytical Chemistry 80:6329–6335.
Luo J, Ning T, Sun Y, Zhu J, Zhu Y, Lin Q, Yang D (2009). Proteomic analysis of rice endosperm cells in response to expression of hGM-CSF. Journal of Proteome Research 8:829–837.
Malowicki SMM, Martin R, Qian MC (2008). Comparison of sugar, acids, and volatile composition in raspberry bushy dwarf virus-resistant transgenic raspberries and the wild type ‘Meeker’ (Rubus idaeus L.). Journal of Agricultural and Food Chemistry 56:6648–6655.
Marguerat S, Wilhelm BT, Bahler J (2008). Next-generation sequencing: applications beyond genomes. Biochemical Society Transactions 36:1091–1096.
Marmiroli N, Maestri E, Gulli M, Malcevschi A, Peano C, Bordoni R, De Bellis G (2008). Methods for detection of GMOs in food and feed. Analytical and Bioanalytical Chemistry 392:392–384.
Marsh JT, Tryfona T, Powers SJ, Stephens E, Dupree P, Shewry PR, Lovegrove A (2011). Determination of the N-glycosylation patterns of seed proteins: applications to determine the authenticity and substantial equivalence of genetically modified (GM) crops. Journal of Agricultural and Food Chemistry 59:8779–8788.
Matsuda F, Ishihara A, Takanashi K, Morino K, Miyazawa H, Wakasa K, Miyagawa H (2010). Metabolic profiling analysis of genetically modified rice seedlings that overproduce tryptophan reveals the occurrence of its inter-tissue translocation. Plant Biotechnology 27:17–27.
Michelini E, Simoni P, Cevenini L, Mezzanotte L, Roda A (2008). New trends in bioanalytical tools for the detection of genetically modified organisms: an update. Analytical and Bioanalytical Chemistry 392:355–367.
Millstone E, Brunner E, Mayer S (1999). Beyond substantial equivalence. Nature 401:525–526.

Morisset D, Stebih D, Cankar K, Zel J, Gruden K (2008). Alternative DNA amplification methods to PCR and their application in GMO detection: a review. European Food Research and Technology 227:1287–1297.
Mungur R, Glass AND, Goodenow DB, Lightfoot DA (2005). Metabolite fingerprinting in transgenic Nicotiana tabacum altered by the Escherichia coli glutamate dehydrogenase gene. Journal of Biomedicine and Biotechnology 2005:198–214.
Nagai YS, Sakulsinghroj C, Edwards GE, Satoh H, Greene TW, Blakeslee B, Okita TW (2009). Control of starch synthesis in cereals: metabolite analysis of transgenic rice expressing an up-regulated cytoplasmic ADP-glucose pyrophosphorylase in developing seeds. Plant Cell Physiology 50:635–643.
Natarajan SS (2010). Natural variability in abundance of prevalent soybean proteins. Regulatory Toxicology and Pharmacology 58:S26–S29.
Nicoletti I, De Rossi A, Giovinazzo G, Corradini D (2007). Identification and quantification of stilbenes in fruits of transgenic tomato plants (Lycopersicon esculentum Mill.) by reversed phase HPLC with photodiode array and mass spectrometry detection. Journal of Agricultural and Food Chemistry 55:3304–3311.
Petit L, Pagny G, Baraige F, Nignol AC, Zhang D (2007). Characterization of genetically modified maize in weakly contaminated seed batches and identification of the origin of the adventitious contamination. Journal of AOAC International 90:1098–1106.
Roessner U, Wagner C, Kopka J, Trethewney RN, Willmitzer L (2000). Simultaneous analysis of metabolites in potato tuber by gas chromatography-mass spectrometry. Plant Journal 23:131–142.
Roessner U, Willmitzer L, Fernie AR (2001a). High-resolution metabolic phenotyping of genetically and environmentally diverse potato tuber systems. Identification of phenocopies. Plant Physiology 127:749–764.
Roessner U, Luedemann A, Brust D, Fiehn O, Linke T, Willmitzer L, Fernie AR (2001b). Metabolic profiling allows comprehensive phenotyping of genetically or environmentally modified plant systems. Plant Cell 13:11–29.
Roessner-Tunali U, Hegemann B, Lytovchenko A, Carrari F, Bruedigam C, Granot D, Fernie AR (2003). Metabolic profiling of transgenic tomato plants overexpressing hexokinase reveals that the influence of hexose phosphorylation diminishes during fruit development. Plant Physiology 133:84–99.
Rosati A, Bogani PG, Santarlasci A, Buiatti M (2008). Characterisation of 3′ transgene insertion site and derived mRNAs in MON810 YieldGard maize. Plant Molecular Biology 67:271–281.
Ruebelt MC, Leimgruber NK, Lipp M, Reynolds TL, Nemeth MA, Astwood JD, Engel KH, Jany KD (2006a). Application of two-dimensional gel electrophoresis to interrogate alterations in the proteome of genetically modified crops. 1. Assessing analytical validation. Journal of Agricultural and Food Chemistry 54:2154–2161.
Ruebelt MC, Lipp M, Reynolds TL, Astwood JD, Engel KH, Jany KD (2006b). Application of two-dimensional gel electrophoresis to interrogate alterations in the proteome of genetically modified crops. 2. Assessing natural variability. Journal of Agricultural and Food Chemistry 54:2162–2168.
Ruebelt MC, Lipp M, Reynolds TL, Schmuke JJ, Astwood JD, Della Penna D, Engel KH, Jany KD (2006c). Application of two-dimensional gel electrophoresis to interrogate alterations in the proteome of genetically modified crops. 3. Assessing unintended effects. Journal of Agricultural and Food Chemistry 54:2169–2177.
Saito K, Matsuda F (2010). Metabolomics for functional genomics, systems biology, and biotechnology. Annual Review of Plant Biology 61:463–489.
Sauvage FX, Pradal M, Chatelet P, Tesniere C (2007). Proteome changes in leaves from grapevine (Vitis vinifera L.) transformed for alcohol dehydrogenase activity. Journal of Agricultural and Food Chemistry 55:2597–2603.
Schubert DR (2008). The problem with nutritionally enhanced plants. Journal of Medicinal Food 11:601–605.
Scossa F, Laudencia-Chingcuanco D, Anderson OD, Vensel WH, Lafiandra D, D’Ovidio R, Masci S (2008). Comparative proteomic and transcriptional profiling of a bread wheat cultivar and its derived transgenic line overexpressing a low molecular weight glutenin subunit gene in the endosperm. Proteomics 8:2948–2966.
Shepherd LVT, McNicol JW, Razzo R, Taylor MA, Davies HV (2006). Assessing the potential for unintended effects in genetically modified potatoes perturbed in metabolic and developmental processes. Targeted analysis of key nutrients and anti-nutrients. Transgenic Research 15:409–425.
Shewmaker CK, Sheehy JA, Daley M, Colburn S, Ke DY (1999). Seed-specific overexpression of phytoene synthase: Increase in carotenoids and other metabolic effects. Plant Journal 20:401–412.
Shin YM, Park HJ, Yim SD, Baek NI, Lee CH, An G, Woo YM (2006). Transgenic rice lines expressing maize C1 and R-S regulatory genes produce various flavonoids in the endosperm. Plant Biology Journal 4:303–315.
Shrestha HK, Hwu K, Chang M (2010). Advances in detection of genetically engineered crops by multiplex polymerase chain reactions methods. Trends in Food Science and Technology 21:442–454.
Simó C, Domínguez-Vega E, Marina ML, García MC, Dinelli G, Cifuentes A (2010). CE-TOF MS analysis of complex protein hydrolyzates from genetically modified soybeans-A tool for foodomics.
Electrophoresis 31:1175–1183. Suhre K, Schmitt-Kopplin P (2008). MassTRIX: mass translator into pathways. Nucleic Acids Research 36:W481–484. Takahashi H, Hotta Y, Hayashi M, Kawai-Yamada M, Komatsu S, Uchimiya H (2005). High throughput metabolome and proteome analysis of transgenic rice plants (Oryza sativa L.). Plant Biotechnology 22:47–50. Takahashi H, Hayashi M, Goto F, Sato S, Soga T, Nishioka T, Tomita M, Kawai-Yamada M, Uchimiya H (2006). Evaluation of metabolic alteration in transgenic rice overexpressing dihydroflavonol-4-reductase. Annals of Botany 98:819–825. Teshima R, Nakamura R, Satoh R, Nakamura R (2010). 2D-DIGE analysis of rice proteins from different cultivars. Regulatory Toxicology and Pharmacology 58:S30–S35. Tesniere C, Torregrosa L, Pradal M, Souquet JM, Gilles C, Dos Santos K, Chatelet P, Gunata Z (2006). Effects of genetic manipulation of alcohol dehydrogenase levels on the response to stress and the synthesis of secondary metabolites in grapevine leaves. Journal of Exper- imental Botany 57:91–99. Timms JF, Cramer R (2008). Difference gel electrophoresis. Proteomics 8:4886– 4897. 220 MS-BASED METHODOLOGIES FOR TRANSGENIC FOODS DEVELOPMENT

Van Dijk JP, Leifert C, Barros E, Kok EJ (2010). Gene expression profiling for food safety assessment: examples in potato and maize. Regulatory Toxicology and Pharmacology 58:S21–S25. Villas-Boas SG, Mas S, Akeson M, Smedsgaard J, Nielsen J (2005). Mass spectrometry in metabolome analysis. Mass Spectrometry Reviews 24:613–646. Windels P, Taverniers I, Depicker A, Van Bockstaele E, De Loose M (2001). Characterization of the roundup ready soybean insert. European Food Research and Technology 213:107– 111. Ye X, Al-Babili S, Kloti A, Zhang J, Lucca P, Beyer P, Potrykus I (2000). Engineering the provitamin A (␤-carotene) biosynthetic pathway into (carotenoid-free) rice endosperm. Science 287:303–305. Zawirska-Wojtasiak R, Goslinski M, Gajc-Wolska J, Mildner-Szkudlarz S (2009). Aroma evaluation of transgenic, thaumatin II-producing cucumber fruits. Journal of Food Science 74:204–210. Zhou J, Ma C, Xu H, Yuan K, Lu X, Zhu Z, Wu Y, Xu G (2009). Metabolic profiling of transgenic rice with cryIAc and sck genes: an evaluation of unintended effects at metabolic level by using GC-FID and GC–MS. Journal of Chromatography B 877:725–732. Zolla L, Rinalducci S, Antonioli P, Righetti PG (2008). Proteomics as a complementary tool for identifying unintended side effects occurring in transgenic maize seeds as a result of genetic modifications. Journal of Proteome Research 7:1850–1861. 8 MS-BASED METHODOLOGIES TO STUDY THE MICROBIAL METABOLOME

Wendy R. Russell and Sylvia H. Duncan

8.1 INTRODUCTION

The gut microbiota plays a key role in human physiology and contributes to a variety of important functions that the host could not otherwise perform. Despite advances in the molecular approaches describing the composition of the human gut microbiota, it must be emphasized that, to date, relatively little is known about the individual functionality and role of even the most dominant commensal organisms within the human host. The products of microbial metabolism are considered to play an important role in both inflammation and the immune response (Maslowski et al., 2009; Vijay-Kumar et al., 2010; Clarke et al., 2010). Despite this, many of the products of the microbial metabolism of the dietary compounds implicated remain poorly characterized and, consequently, their mechanisms of action are unknown. As enormous efforts are being made, at huge cost, to continue to characterize the microbiome and its interaction with the human immune system, there is now a pressing need to establish the functional role of the microbiota. It is important that, in addition to describing the microbial populations, the species are also defined based on their function and metabolic output. Provision of these data is exceptionally timely, as there is increasing pressure to provide healthy food to a growing and increasingly unhealthy population. Elucidating the products of microbial metabolism provides the essential and missing link between well-defined diets and the physiological effect of dietary constituents (Fig. 8.1). Without this information, the nutritional value of food cannot be fully established, food claims cannot be effectively evaluated, and the contribution of the gut microbiota to the overall aim of developing sustainable agricultural products beneficial for human health cannot be ascertained. Additionally, the gut microbiota is considered to play a role in the pathogenesis of under-nutrition and in effective translational medicine, again through regulation of nutrient and nonnutrient metabolism. Only once the link between microbial diversity and metabolic functionality is firmly established will the mechanisms by which the gut microbiota contributes to the maintenance of health or disease development be elucidated.

FIGURE 8.1 Schematic showing the relationship between diet, the gut metabolome, and human health and disease. The applications of MS within this scheme are highlighted with an asterisk (∗). The use of MS methodology to study the gut microbiota and their metabolites is detailed in this chapter. Other applications include measurement of dietary components, drugs, and other xenobiotics with potential to impact on the human gut, and evaluation of endogenous metabolites and biomarkers of health and disease.

8.2 THE GUT MICROBIOTA AND THEIR ROLE IN METABOLISM

The undigested dietary material that survives passage through the human stomach and small intestine is available as a source of energy and nutrients to the microbiota that colonize the large intestine. While some micro-organisms can also be supported by energy sources of endogenous origin, including mucin, the species composition of the colonic community is likely to be influenced by the energy supply from dietary residues (Duncan et al., 2007; Russell et al., 2011; Walker et al., 2011). Carefully controlled dietary intervention studies are only now beginning to reveal how different dietary components influence the composition and metabolic output of intestinal microbial communities. It is becoming increasingly important that we can predict the consequences of dietary manipulation of the gut microbial community upon health. Although the role of most major gut pathogens in disease causation is well understood, we are only just beginning to understand the roles of the different groups of commensal micro-organisms that colonize the gut in health maintenance and disease prevention. Such understanding is, however, of fundamental importance in predicting the health impact of dietary components that modulate the gut microbiota and its metabolism toward maintaining a healthy gut.

The human intestine contains complex microbial communities, the composition of which varies with the anatomical site and with environmental and dietary-related conditions. Host factors including genotype, immunological status, age, and health status also drive microbial composition (Zoetendal et al., 2004; Flint et al., 2007; Arumugam et al., 2011; Wu et al., 2011). The most densely colonized region is the large intestine, where microbial densities can exceed 10¹¹ per gram of contents and total microbial numbers greatly exceed the number of host cells in the body. While the major phyla of gut bacteria are generally well conserved between individuals, at the level of species or phylotypes there is notable interindividual variation, and populations of individual bacterial groups fluctuate constantly with time and dietary intake (Walker et al., 2011). Nevertheless, profiling studies do suggest there is normally a degree of stability in an individual’s fecal microbiota over time (Zoetendal et al., 1998). A number of species (or phylotypes) appear to be detected among the most abundant organisms present in most individuals (Tap et al., 2009; Walker et al., 2011).

There has been rapid progress in the development of molecular methodologies for identifying the structure of the gut microbiota, in particular methods that survey ribosomal 16S rRNA sequences (see below). These studies reveal that there are four dominant bacterial phyla present in the large intestine: the low percentage G + C Gram-positive Firmicutes, which make up approximately 65–75% of the total microbiota (Duncan et al., 2007); the high percentage G + C Gram-positive Actinobacteria; the Gram-negative Bacteroidetes, which are usually present at around 20–30% of the total microbiota; and the Proteobacteria, which are usually present at only a few percent (Tap et al., 2009; Walker et al., 2011). Altogether there are hundreds of bacterial species (or phylotypes) in the colon. The Firmicutes predominantly comprise two families, namely Lachnospiraceae and Ruminococcaceae. Two of the most dominant species that colonize the healthy colon are Faecalibacterium prausnitzii and Eubacterium rectale (Flint et al., 2012). These species are of particular interest, as they are butyrate producers and this short chain fatty acid (SCFA) is considered to be important for a healthy colon (Barcenilla et al., 2000; Louis et al., 2004). Other dominant species, including Eubacterium hallii and Anaerostipes species, can use lactate to form butyrate (Duncan et al., 2004). An estimated 20% of the butyrate formed in the colon is derived from lactate (Belenguer et al., 2006), and these species may also compete with sulfate-reducing bacteria for lactate as a growth substrate, thereby modulating hydrogen sulfide formation in the colon (Marquet et al., 2009). This is also important, as hydrogen sulfide is toxic to colonocytes.

With regard to other SCFAs, the identity of the bacterial species involved is less well defined. Propionate is one of the products of metabolism by Bacteroidetes species and certain members of the Veillonellaceae. Acetate is a fermentation product of many colonic bacterial species.

We are only beginning to recognize the dominant degraders of the nondigestible carbohydrates from plant cell walls that reach the human colon. The main polysaccharides include resistant starch (RS), nonstarch polysaccharides, and pectin. Keystone species in the colon with a special role in starch digestion include Ruminococcus bromii-related bacteria, as these were dominant in volunteers consuming diets enriched with RS (Davis et al., 2010; Walker et al., 2011; Ze et al., 2012). Among the Lachnospiraceae, the ability to utilize starch has been reported for most members of the Roseburia/E. rectale group of butyrate-producing bacteria (Aminov et al., 2006). Molecular studies and new isolations suggest that the Firmicutes play a significant role in the degradation of complex plant carbohydrates. In particular, Ruminococcus champanellensis is the only human colonic bacterium so far reported to degrade microcrystalline cellulose (Chassard et al., 2011). Xylan utilization has been reported for Roseburia intestinalis (Duncan et al., 2002) and also for Butyrivibrio fibrisolvens isolated from wheat bran enrichment (Rumney et al., 1995). The highly abundant species F. prausnitzii is now known to include strains able to utilize apple pectin for growth (Lopez-Siles et al., 2012). The only other pectin-utilizing species identified so far are the Gram-positive Firmicutes Eubacterium eligens and Lachnospira pectinoschiza and the Gram-negative Bacteroides species (Salyers et al., 1977).

As can be seen, much of the information regarding metabolism by the gut microbiota concerns utilization of carbohydrate and production and use of SCFAs, with some information on protein metabolism. By contrast, there is little metabolomic information on the products formed from plant secondary metabolites and other bioactives. Additionally, a proportion of the bacteria within the human intestinal community remains unstudied, and this should be an important consideration when trying to predict, for example, the role of a novel bacterial species in metabolite transformations.

8.3 METAGENOMICS

To fully appreciate the gut metabolome, it is important to have a brief understanding of the methods for evaluating the gut microbiota. Metagenomics is the analysis of total DNA recovered from environmental samples, which allows researchers to survey microbial communities without the need for cultivation (Riesenfeld et al., 2004; Kuczynski et al., 2012). Metagenomic libraries are analyzed either by functional screening, for example, for a metabolic activity of interest such as carbohydrate utilization (Gill et al., 2006), or by random high-throughput sequencing, which allows large numbers of samples to be analyzed. Moreover, taxonomic assignment of sequences to bacterial species may be feasible, but this depends either on comparison of the 16S rRNA gene sequences or on the availability of reference genomes from cultured isolates. These are rapidly becoming available through, for example, the EU MetaHIT program. Sequence information from cultured isolates of gut bacteria is, therefore, essential for meaningful analysis of metagenomic data, and these methods should provide useful insights into the role of the gut microbiota in health and disease (Maccaferri et al., 2011).

A metagenomics approach has been used to profile and compare the gut microbiota of 40 twin pairs that were either concordant or discordant for Crohn’s disease or ulcerative colitis (Willing et al., 2010). These studies revealed a difference in the profile of the microbiota of the ileal Crohn’s cohort in particular, with diminished abundance of F. prausnitzii (Duncan et al., 2002) and an increase in Proteobacteria abundance (Willing et al., 2010). F. prausnitzii has been reported previously to have potent anti-inflammatory activity (Sokol et al., 2008). Metatranscriptomics and metaproteomics are additional tools to analyze the active component of the gut microbiota (Turnbaugh et al., 2007). Metatranscriptomic data provide the sequence information for a particular set of genes of interest that are being actively expressed at a given time by a complex microbial community such as that found in the human gastrointestinal tract.
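Comparisons of abundance such as those described above rest on a simple calculation once sequences have been assigned to taxa: the share of assigned reads belonging to each group. The following is a minimal sketch of that step only, assuming that taxonomic assignment has already been done elsewhere; the function name and the read counts are hypothetical and are not taken from the studies cited.

```python
from collections import Counter

def relative_abundance(assignments):
    """Return percent relative abundance per taxon from per-read taxon labels."""
    counts = Counter(assignments)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no reads were assigned")
    return {taxon: 100.0 * n / total for taxon, n in counts.items()}

# Hypothetical phylum-level assignments for 1,000 16S rRNA reads (illustrative only).
reads = (["Firmicutes"] * 700 + ["Bacteroidetes"] * 250
         + ["Actinobacteria"] * 30 + ["Proteobacteria"] * 20)

profile = relative_abundance(reads)
for phylum, percent in sorted(profile.items(), key=lambda item: -item[1]):
    print(f"{phylum:15s} {percent:5.1f}%")
```

Relative abundances of this kind are compositional: an apparent increase in one group can simply reflect a decrease in another, which is one reason they are usually interpreted alongside total bacterial numbers.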

8.4 METABOLOMICS

Metabolomics is a comprehensive and nonselective analytical chemistry approach aimed at providing a global description of all the metabolites present in a biological sample at any given time. Measurements of human metabolites have always been an excellent indicator of human disease, and it is likely that they will also be a useful predictor of the effect of diet on human health. In particular, metabolomics is likely to deliver the necessary information regarding the microbial metabolome. The measurement of metabolites will supplement genomic and proteomic approaches, which can give useful information regarding the predisposition to and occurrence of disease, and will make possible the application of global strategies as proposed by Foodomics. Although metabolic profiling is not a new science, modern instrumentation and statistical methodologies have found recent application in predicting the outcomes of dietary and clinical studies. Metabolomics may prove particularly valuable where it has been extremely difficult to establish the long-term implications of interventions in outwardly healthy individuals because of a lack of effective biomarkers.

The main methods of metabolomic analysis rely on spectroscopic detection. Nuclear magnetic resonance (NMR) spectroscopy exploits nuclear spin and the discrete energy states that nuclei occupy when placed in a magnetic field. The electronic signals obtained following radio frequency perturbation provide information on chemical structure. This information allows direct and simultaneous measurement of low molecular weight metabolites (Nicholson and Lindon, 2008). NMR can be applied to many biological samples, including intact tissue with the use of high resolution magic angle spinning spectroscopy (Moka et al., 1997). Samples require limited preparation, and measurable concentrations can be as low as nmol dm⁻³, depending on the nature of the sample. NMR can be coupled to LC, allowing separation of the metabolites prior to detection. Proton (¹H) NMR is more sensitive than NMR of other nuclei (e.g., ¹³C) and, consequently, short scan times allow its use as a high-throughput technology.

Mass spectrometry (MS) is the main focus of this chapter; this technique discriminates molecules based on their mass-to-charge (m/z) ratio. The main advantages of MS over NMR are sensitivity and the ability to perform quantitative and targeted analysis. Recent advances in MS have resulted in robust and powerful methods to study the human metabolome (Villas-Bôas et al., 2005; Hollywood et al., 2006). Although direct-injection MS (DIMS) has been employed to various extents, the real potential of MS has been achieved through interfacing with gas chromatography (GC–MS), liquid chromatography (LC–MS), ultra performance liquid chromatography (UPLC-MS), and, to a lesser extent, capillary electrophoresis (CE-MS). The plethora of ionization techniques (e.g., electron ionization, chemical ionization, liquid secondary ion ionization, atmospheric pressure ionization, matrix-assisted laser desorption ionization) and the wide availability of mass analyzers (e.g., magnetic sector, linear quadrupole, ion trap, time of flight, Fourier transform ion cyclotron resonance) make modern MS analysis an extremely versatile technique (Villas-Bôas et al., 2005; Hollywood et al., 2006; Dettmer et al., 2007). MS is an excellent tool to study microbial metabolomics, both the gut microbiota (as described above) and the metabolites they produce (Garcia et al., 2008; Tang, 2011). It can also be applied to evaluate many of the additional factors which impact on the gut metabolome, as well as endogenous metabolites and biomarkers on which the microbial metabolites can have an effect (Fig. 8.1).
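As an illustration of the mass-to-charge ratio on which MS discriminates molecules, the sketch below calculates the m/z expected for the singly deprotonated ion of butyric acid (C4H8O2), one of the SCFAs discussed in this chapter. It is a minimal example under stated assumptions: ions are formed only by gain or loss of protons, the isotope masses are standard values rounded to five decimal places, and the function names are hypothetical rather than part of any instrument software.

```python
# Monoisotopic masses (u) of the most abundant isotopes, rounded to 5 decimals.
MONOISOTOPIC = {"C": 12.00000, "H": 1.00783, "N": 14.00307,
                "O": 15.99491, "S": 31.97207}
PROTON = 1.00728  # mass of a proton (u)

def neutral_mass(formula):
    """Monoisotopic mass of a neutral molecule given as {element: count}."""
    return sum(MONOISOTOPIC[element] * count for element, count in formula.items())

def mz(formula, charge):
    """m/z of the ion formed by gaining (charge > 0) or losing (charge < 0) protons."""
    if charge == 0:
        raise ValueError("charge must be non-zero")
    return (neutral_mass(formula) + charge * PROTON) / abs(charge)

butyric_acid = {"C": 4, "H": 8, "O": 2}
print("[M-H]-  m/z:", round(mz(butyric_acid, -1), 3))
print("[M+H]+  m/z:", round(mz(butyric_acid, +1), 3))
```

Adduct ions (for example, sodium adducts) and multiply charged species would need additional terms and are deliberately left out of this sketch.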

8.5 MICROBIAL METABOLITES IN THE HUMAN GUT

The human colon contains a plethora of endogenous and exogenous metabolites, many of which are the products of extensive microbial metabolism (Fig. 8.2). Although the major role of the human colon is the conservation of water and electrolytes, it also plays a role in absorption, as columnar absorptive cells and goblet cells are abundant. The undiluted aqueous phase of feces (fecal waters) has been shown to be extremely toxic to epithelial cells in vitro (Rafter et al., 1987a). Rapid turnover of the cells in the base of the colonic crypts is likely to be one of the protective mechanisms for epithelial tissues. However, subepithelial cells, which play a major role in inflammation and the immune response, will be subject to continuous contact with molecules absorbed from the colonic lumen. The metabolite profile of the colon, therefore, has the potential to have a major impact on human health (Rafter et al., 1987b).

FIGURE 8.2 Main classes and examples of metabolites found in the human gut which have the potential to impact on health and disease.

Many of the metabolites present in the human gut will be of food origin; however, other aspects including the environment, drugs, stress, and genetic factors must be considered (Fig. 8.1). The major metabolites found in the human colon derived from food will be the products of undigested macronutrients (carbohydrate, protein, and fat), micronutrients (minerals and vitamins), and non-nutrients such as phytochemicals, which have not been absorbed in the small intestine.

Carbohydrate is the major energy source for the gut microbiota. For individuals consuming a western-style diet, the principal carbohydrate sources are resistant starch and, to a lesser extent, non-starch polysaccharides (which are consumed mainly in cereals) and oligosaccharides (e.g., fructans). The gut microbiota both utilize and extensively metabolize carbohydrate. One important group of metabolites derived from carbohydrate metabolism is the SCFAs. These are a subgroup of fatty acids with aliphatic tails of fewer than six carbons and include acetic, propionic, butyric, valeric, and caproic acid. Branched chain examples include isovaleric and isobutyric acid. There are also substituted short chain carboxylic acids such as formic and lactic acid. The total concentration of SCFA in the large intestine may reach upward of 100 mM (Macfarlane GT and Macfarlane S, 2002). Dietary shifts can result in changes in the SCFA production rates and in the molar proportions of the different SCFA detected in feces. Diets high in protein but low in carbohydrate, for example, were shown to reduce fecal butyrate by up to fourfold (Duncan et al., 2007). Meanwhile, higher proportions of propionate and butyrate and lower acetate have been reported to result from increasing prebiotic or fiber intake (Queenan et al., 2007). In addition to carbohydrate, SCFAs can be produced from undigested protein and intestinal mucin.

Protein putrefaction in the colon can produce a variety of metabolite classes. These include the polyamines (putrescine, cadaverine, spermine, spermidine, tyramine, pyrrolidine, histamine, piperidine), indoles, ammonia, hydrogen sulfide, and derivatives of the aromatic amino acids (tyrosine, tryptophan, and phenylalanine). Particularly detrimental to health are the potentially mutagenic and genotoxic heterocyclic amines and N-nitroso compounds also derived from dietary protein (Russell et al., 2011).

Glycerides which are undigested and not absorbed in the small intestine can be metabolized by the gut microbiota to fatty acids (Mackie et al., 1991). Glycerol, a by-product of lipid metabolism, is converted to 3-hydroxypropanal and 1,3-propanediol (De Weirdt et al., 2010; Vollenweider et al., 2003). Bile acid metabolism is modulated by the gut microbiota. In humans, the two primary bile acids are cholic acid and chenodeoxycholic acid. Microbial metabolism produces the dehydroxylated products deoxycholic acid and lithocholic acid. Bile acids are additionally conjugated with glycine and taurine prior to secretion from the liver, and urso- and hyo-derivatives also occur (Russell and Setchell, 1992).

There is very little information regarding the presence of dietary vitamins and minerals in the colon. Little is also known about the effect of these compounds on the gut microbiota or whether bacteria are involved in their metabolism and absorption. It has been shown that the microbiota can synthesize some B vitamins (B3, B5, and B6) and related molecules including biotin, tetrahydrofolate, vitamin K, and the corrinoids (Goodman et al., 2009). Microbes are considered to affect absorption of certain dietary minerals, with many studies demonstrating that carbohydrate and the associated SCFAs modulate uptake of sodium, calcium, and potassium (Younes et al., 2001). Microbes also actively accumulate iron as a siderophore complex (Neilands, 1995).

Non-nutrient gut metabolites are of particular interest, as epidemiological studies suggest that there is an inverse association between the intake of phytochemical-rich diets and the incidence of cardiovascular disease, diabetes, and cancer (Hertog et al., 1993). In dietary plant sources, there are thousands of compounds classified as secondary metabolites, many of which have an important role for the plant, including protection against pathogens. Biochemically, they can be broadly categorized according to their structure and biosynthetic pathways, but it should be appreciated that many secondary metabolites are derived by combining elements from several of these biosynthetic routes. Some of these compounds, particularly if available as small molecules or as their aglycones, may be absorbed in the upper GI tract and directly enter the systemic circulation (Russell et al., 2009a). However, many will be available in the colon, in particular those bound to other plant components such as carbohydrate. Within the colon, these phytochemicals are extensively metabolized. Gut microbiota are capable of performing many transformations, including hydrolysis, de-amination, dehydrogenation, demethylation, decarboxylation, ring cleavage, and chain shortening. Both the parent compounds and the metabolites produced have the potential to influence specific microbial groups.

Of the secondary metabolites, the group most widely studied is the products of the phenylpropanoid pathway, as almost all plant foods considered to have cancer-preventative properties are rich in these compounds (Russell et al., 2009b). These include the benzoic, cinnamic, phenylacetic, phenylpropionic, mandelic, phenyllactic, and phenylpyruvic acids and aldehydes, acetophenones, simple phenols, phenolic dimers, coumarins, psoralens, and flavonoids (flavones, flavonols, flavanones, flavanonols, isoflavones, flavan-3-ols, flavan-4-ols, flavan-3,4-diols, and anthocyanidins) (Russell et al., 2011). Although these phenylpropanoid derivatives are the most widely studied, many other secondary metabolites and their derivatives are present in the human colon. Metabolites of plant compounds produced by the mevalonate pathway, such as the terpenoids, are little studied; these include the tetraterpenes collectively known as the carotenoids (Rao AV and Rao LG, 2007). These highly colored pigments impart the characteristic red color to tomatoes (lycopene) and the orange color to carrots (β-carotene). More highly modified terpenoids include the phytosterols and steroidal saponins (Tirapelli et al., 2010; Yang and Dou, 2010). Although these are considered to be important bioactives, little is known about their metabolism in vivo. The metabolites produced by the acetate pathway, in particular the polyketides, exhibit a range of bioactivities, including laxative, antibiotic, and antifungal effects (Herbert, 1994). Again, little is known about their human metabolism. Although the glucosinolates and their metabolites (isothiocyanates and indoles) have been extensively studied in relation to protection against carcinogenesis and mutagenesis (Higdon et al., 2007; Mithen et al., 2000), many other nitrogen- and sulfur-related compounds have been generally overlooked in terms of metabolism.
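The molar proportions of SCFA and the fold changes referred to earlier in this section are simple ratios. As a minimal sketch, the example below converts a set of concentrations into molar proportions and expresses a change in butyrate as a fold reduction. All concentrations are hypothetical values chosen for illustration and are not data from the studies cited; the function names are likewise assumptions.

```python
def molar_proportions(conc_mM):
    """Convert SCFA concentrations (mM) into molar proportions (% of total SCFA)."""
    total = sum(conc_mM.values())
    if total <= 0:
        raise ValueError("total SCFA concentration must be positive")
    return {acid: 100.0 * conc / total for acid, conc in conc_mM.items()}

def fold_reduction(before, after):
    """How many times lower the second value is than the first."""
    if after <= 0:
        raise ValueError("the reduced value must be positive")
    return before / after

# Hypothetical fecal SCFA concentrations (mM) on two diets (illustrative only).
maintenance = {"acetate": 60.0, "propionate": 20.0, "butyrate": 20.0}
high_protein = {"acetate": 45.0, "propionate": 15.0, "butyrate": 5.0}

for name, diet in (("maintenance", maintenance), ("high protein", high_protein)):
    shares = molar_proportions(diet)
    print(name, {acid: round(share, 1) for acid, share in shares.items()})

print("butyrate fold reduction:",
      fold_reduction(maintenance["butyrate"], high_protein["butyrate"]))
```

Reporting both the absolute concentration and the molar proportion is useful because a metabolite can fall in concentration while its share of the total is unchanged, or the reverse.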

8.6 ANALYSIS OF THE MICROBIAL METABOLOME

In many studies, the composition of the gut metabolome has been characterized directly by MS. This work has been additionally supported by MS analysis of the products obtained from in vitro incubation of foods, extracts, and molecules with fecal inocula and bacterial species. Early work analyzed the aromatic compounds present in the aqueous phase of human fecal samples (fecal waters) by GC-MS (Jenner et al., 2005). This provided an insight into the composition of the microbial metabolome, which consisted of simple phenols and derivatives of the phenolic acids (benzoic, cinnamic, phenylacetic, and phenylpropionic) and flavonoids. Some of the compounds identified were supplied directly by the diet, whereas many were products of microbial metabolism. Since then, LC-MS and UPLC-MS methodology has also been developed to measure a wide range of microbial-derived phenolic acids (Russell et al., 2011; Sánchez-Patán et al., 2011). Recent application of these techniques has focused on determining the sources of these gut metabolites, the bacteria responsible for their metabolism, and the identity of novel and/or uncharacterized metabolites.

Following ingestion of raspberry puree (200 g/d for four days), the gut metabolites produced from plant phenolics were characterized by GC-MS (Gill et al., 2010). These were predominantly phenylacetic acid and phenylpropionic acid metabolites, but inter-individual variation meant that there were no significant increases across all volunteers. Significant changes in the gut metabolome were observed for volunteers consuming diets in which the protein and carbohydrate ratios were modulated (Russell et al., 2011; Fig. 8.3). LC-MS showed that derivatives of a wide range of plant phenolics considered to be cancer-protective were reduced on diets high in protein (137 g/d) and low in carbohydrate (22 g/d).
These included derivatives of benzoic, cinnamic, phenylacetic, and phenylpropionic acids, benzaldehydes, acetophenones, and benzene derivatives. Phenylacetic acid, a major metabolite related to protein metabolism, was found to be significantly increased. Increasing the carbohydrate content (181 g/d) resulted in significant increases in some of the phenolic acids and their derivatives, principally fiber-related phenolics, namely ferulic acid, 4-hydroxy-3-methoxyphenylpropionic acid, and 3-hydroxyphenylpropionic acid.

FIGURE 8.3 Principal component analysis (unit variance-scaled) plot, with axes t[1] and t[2], showing the discrimination between metabolite profiles in the human gut obtained through a macronutrient dietary change. Metabolites derived from high-protein, low-carbohydrate; high-protein, medium-carbohydrate; and maintenance (typical western-style) diets are represented by triangles labelled 0, 1 and 2, respectively. Adapted from Russell et al. (2011).

The microbial conversion of caffeic acid and its major esters (chlorogenic and caftaric acids) by human fecal microbiota in vitro was studied by LC-MS (Gonthier et al., 2006). For all three substrates, 3-hydroxyphenylpropionic acid and benzoic acid were found to be the major metabolites, and the product profile suggested that esterification did not have an impact on microbial metabolism. The gut microbiota seems to de-esterify compounds effectively, whether the conjugate is quinic acid, tartaric acid, or a sugar moiety. This is also demonstrated by the primary metabolism of soyasaponin I on in vitro incubation with fecal inocula. The major metabolite, soyasapogenol B, was a fully deglycosylated product and the intermediate, soyasaponin III, was partially deglycosylated (Hu et al., 2004). Intestinal bacteria were also shown to metabolize the dietary isoflavone genistein to dihydroxygenistein, 6-hydroxy-O-desmethylangolensin, and 2-(4-hydroxyphenyl)-propionic acid (Braune and Blaut, 2011). With the same bacterial incubations, irilone (which differs from genistein by the presence of an A-ring methylenedioxy group) was not extensively metabolized (Braune et al., 2011). This demonstrates that minor changes in chemical structure can have a major influence on microbial metabolism. LC-MS methodology has also been used to characterize the estrogenic metabolites of polycyclic aromatic hydrocarbons produced by hydroxylation during incubation with colonic microbiota (Van de Wiele et al., 2005).

The contribution of microbial metabolism in the small intestine was evaluated by incubation of individual phenolic compounds found in green tea with ileostomy fluid and measurement of the resulting compounds by LC-MS (Schantz et al., 2010).
3,4,5-Trihydroxyphenyl-γ-valerolactone and 3,4-dihydroxyphenyl-γ-valerolactone were identified as metabolites of (+)-catechin and (−)-epicatechin, and 3,4,5-trihydroxyphenyl-γ-valerolactone was also identified as a metabolite of (−)-epigallocatechin 3-O-gallate. Of particular interest was the cleavage of gallic acid from (−)-epicatechin 3-O-gallate and (−)-epigallocatechin 3-O-gallate. Although metabolism differed depending on the source of ileal fluid, this suggests that de-esterification can occur prior to metabolism in the colon.

In some studies, specific gut metabolites were detected in plasma and urine by MS. This suggests that these compounds have entered the systemic circulation via the enterohepatic circulation. Urinary phenolic acid metabolites measured by GC-MS were found to be significantly increased with consumption of red wine and red grape juice extracts (van Dorsten et al., 2010). The strongest markers of intake included syringic acid, 3- and 4-hydroxyhippuric acid, and 4-hydroxymandelic acid. Reductions in p-cresol sulfate and 3-indoxylsulfuric acid, and increases in indole-3-acetic acid and nicotinic acid, were also observed in urine following consumption of red wine and grape extracts (Jacobs et al., 2012). Sulforaphane, the gut metabolite of glucoraphanin (commonly found in Brassica spp.), was measured by isotope dilution-MS in urine from volunteers consuming a Brassica extract beverage (Egner et al., 2011). Sesamin, a major bioactive lignan found in sesame seeds, was shown to be metabolized to enterolactone by in vitro incubation with fecal inocula. Sesamin consumption also demonstrated that this compound was a precursor to enterolactone in vivo (Peñalvo et al., 2005). Following ingestion of a range of ellagitannin-rich foods, the microbial metabolite 3,8-dihydroxy-6H-dibenzo[b,d]pyran-6-one (urolithin B) conjugated with glucuronic acid was detected in urine by LC-MS (Cerda et al., 2005).
These urolithin metabolites were also present in human plasma (Seeram et al., 2006).

UPLC-MS/MS has also been shown to have potential for measuring gut metabolites in urine and plasma. Phenylacetylglutamine, 4-cresyl sulfate, and hippurate have been measured by this method and shown to be potential biomarkers of gut function (Wijeyesekera et al., 2012). Analysis of the products obtained by incubation of red wine extract with fecal inocula showed large increases in catechol, pyrocatechol, 4-hydroxy-5-(phenyl)-valeric acid, phenylacetic acid and its 3- and 4-hydroxy derivatives, and phenylpropionic and benzoic acids during fermentation (Sánchez-Patán et al., 2012). An MS method using atmospheric pressure chemical ionization (APCI) has been validated for the analysis of phytoestrogens shown to be gut microbial metabolites found in human urine and plasma (Wyns et al., 2010).

The impact of gut metabolites on the microbiota has also been evaluated by MS. Metabolites of green tea phenolics suppressed pathogenic bacteria (Clostridium perfringens, Clostridium difficile, and Bacteroides spp.), whereas some commensal species (Bifidobacterium and Lactobacillus spp.) were less affected (Lee et al., 2006).

However, in a study of the metabolites of red wine phenolics, there was no significant effect of the metabolites produced on the main bacterial groups studied (Sánchez-Patán et al., 2012). LC-inductively coupled plasma-MS (LC-ICP-MS) has also found application in elucidating the microbial metabolome. This technique is likely to be most useful for the analysis of inorganic constituents. Using LC-ICP-MS, inorganic arsenic metabolites were identified in human fecal samples (Alava et al., 2012).
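The unit variance-scaled principal component analysis behind Figure 8.3 can be illustrated with a minimal sketch. This is not the workflow of Russell et al. (2011): the intensity values, the sample labels, and the function names (uv_scale, first_component) are invented for illustration, and only the first component, t[1], is extracted, here by power iteration in plain Python. The sketch assumes that every metabolite column has non-zero variance.

```python
import math

def uv_scale(rows):
    """Unit variance scaling: centre each column, divide by its sample SD."""
    n = len(rows)
    cols = list(zip(*rows))
    means = [sum(c) / n for c in cols]
    sds = [math.sqrt(sum((v - m) ** 2 for v in c) / (n - 1))
           for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, sds)] for row in rows]

def first_component(scaled, iterations=200):
    """Return (loadings, scores) of the first principal component.

    Uses power iteration on X^T X. The start vector is arbitrary, but it
    must not be orthogonal to the leading loading vector.
    """
    p = len(scaled[0])
    w = [1.0 / (j + 1) for j in range(p)]
    for _ in range(iterations):
        # Scores for the current loading estimate: t = X w
        t = [sum(x * wj for x, wj in zip(row, w)) for row in scaled]
        # Updated loadings: w = X^T t, normalised to unit length
        w = [sum(row[j] * ti for row, ti in zip(scaled, t)) for j in range(p)]
        norm = math.sqrt(sum(v * v for v in w))
        w = [v / norm for v in w]
    scores = [sum(x * wj for x, wj in zip(row, w)) for row in scaled]
    return w, scores

# Hypothetical intensities (rows: samples; columns: three metabolites).
# The labels mimic the diet codes of Figure 8.3, but the numbers are made up.
labels = [0, 0, 0, 2, 2, 2]
intensities = [
    [10.0, 8.0, 1.0],
    [11.0, 9.0, 2.0],
    [9.0, 7.0, 1.5],
    [2.0, 3.0, 9.0],
    [3.0, 2.0, 10.0],
    [1.0, 2.5, 8.0],
]

scaled = uv_scale(intensities)
loadings, scores = first_component(scaled)
for label, score in zip(labels, scores):
    print(f"diet {label}: t[1] = {score:+.2f}")
```

In this toy data set the two groups fall on opposite sides of zero along t[1], which is the kind of separation the score plot displays. With real data, a library routine based on singular value decomposition would normally be used to obtain both t[1] and t[2].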

8.7 IMPLICATIONS FOR HUMAN HEALTH AND DISEASE

Dysbiosis of the gut microbiota, particularly in relation to the host’s immune system, has been implicated in the development of diseases such as inflammatory bowel disease, celiac disease, and colon cancer (Mazmanian et al., 2008; Sellitto et al., 2012). Advances in MS methodology are increasingly allowing the identification of the microbial-derived products responsible for disease development and/or progression (Hamer et al., 2012). One particular strength of this technique is that, once identified, these molecules may be used as potential diagnostic or prognostic markers for inflammatory diseases and related co-morbidities.

8.7.1 Implications for Obesity

Obesity is a major health problem in both developed and developing nations and is considered to arise when the caloric content of ingested food exceeds energy expenditure. Evidence is accumulating to suggest that the gut microbiota plays a role in obesity pathogenesis. Preclinical studies indicated that bacterial fermentative activity in the colon may contribute to fat deposition (Bäckhed et al., 2004; Turnbaugh et al., 2006). Obese humans on weight loss diets were shown to have altered microbial profiles (Ley et al., 2005, 2006). Also, drastic reduction in total dietary carbohydrate intake in weight loss diets alters the composition of the colonic microbial community as well as fecal metabolite profiles (Duncan et al., 2007a, 2008). Microbial metabolites associated with increased weight gain include hypoxanthine, hippurate, dimethylglycine, and creatinine, the urinary excretion of which is increased (Zhang et al., 2012). In studies where weight loss was achieved via gastric bypass surgery, asparagine, lysophosphatidylcholine (C18:2), nervonic acid (C24:1), p-cresol sulfate, lactate, lycopene, glucose, and mannose were all significantly reduced (Mutch et al., 2011).

8.7.2 Implications for Diabetes

Type 2 diabetes was found to be associated with changes in the gut microbiota regardless of body mass index (BMI). Specifically, clostridial species were reduced and the ratio of Bacteroidetes to Firmicutes correlated positively with plasma glucose concentration, but not with BMI (Larsen et al., 2010). Bacteroides vulgatus and Bifidobacterium species were also lower in the diabetic group (Wu et al., 2010).

Patients with type 2 diabetes were also found to have reduced numbers of F. prausnitzii, which correlated with increased inflammatory markers (Furet et al., 2010). Evidence also supports the hypothesis that host recognition of the gut microbiota is essential in preventing the onset and progression of type 1 diabetes. This is likely to involve the myeloid differentiation factor 88 (MyD88) signaling pathway, but the microbial product initiating the response is yet to be identified. Much of the evidence for metabolite changes comes from animal studies and relates principally to choline and bile acid metabolism.

8.7.3 Implications for Cardiovascular Disease

The potential impact of gut microbiota and diet extends beyond gut health to include cardiovascular and metabolic health (Cani et al., 2007). One likely cause lies in the systemic consequences of changes in bacterial populations and their microbially produced metabolites, and their effect on inflammation and the immune system. In a study on healthy humans, four weeks of supplementation with resistant starch (30 g/d) produced a significant improvement in insulin sensitivity that might be linked to changes in SCFA formation (Robertson et al., 2005). Preclinical evidence demonstrates that FOS may help improve insulin sensitivity under conditions of high fat intake by stimulating bifidobacteria (Cani et al., 2007). In this case, the mechanism was suggested to involve an improvement in barrier function, reducing bacterial endotoxin-mediated inflammation. Gut metabolism also affects the development of dyslipidemia, which results in imbalances in lipoprotein, cholesterol, and triglyceride metabolism. High-fat diets were shown to increase plasma lipopolysaccharides (LPS), compounds found in the outer membrane of Gram-negative bacteria (Amar et al., 2008). LPS can elicit strong immune responses, potentially resulting in metabolic endotoxemia. No correlation was observed between LPS and carbohydrate or protein intake. It was hypothesized that fat could more effectively transport bacterial LPS into the systemic circulation (Amar et al., 2008).

8.7.4 Implications for Inflammation and Cancer

Microbial metabolites are likely to be a key factor regulating inflammatory and immunological responses in the colon (Parsonnet, 1995). As described above, metabolism of dietary carbohydrates results in the formation of SCFAs, which have an important role in the maintenance of human health. SCFAs have a variety of potential effects. Butyrate is beneficial for gut health as it serves as the major energy source for the colonocytes and has a role in preventing colorectal cancer (Pryde et al., 2002; McIntyre et al., 1993). Propionate is metabolized in the liver and is gluconeogenic. Activation of the gut receptors GPR41 and GPR43 (also known as FFA3 and FFA2) by SCFAs influences gut motility and reduces inflammatory responses (Brown et al., 2003; Tazoe et al., 2008). Interestingly, lactate has been reported to accumulate in fecal samples of patients with Crohn’s disease (Vernia and Cittadini, 1995).

Carbohydrate, and in particular fiber, delivers a wide range of phytochemicals and their metabolites to the human colon (Russell et al., 2011). Many of these compounds have been shown to have anti-inflammatory activity at the concentrations found in vivo (Russell et al., 2006, 2008). Increasing the protein content of the diet, particularly through greater consumption of red and processed meat, is associated with an increased risk of cancer development. Consumption of high-protein diets has been shown to result in increased levels of toxic bacterial metabolites (Russell et al., 2011). These include nitrosamines, heterocyclic amines, fecapentaenes, superoxide radicals, and hydrogen sulfide (Gill and Rowland, 2002). Fecapentaenes are conjugated ether lipids and are reportedly synthesized by Bacteroides spp. in the colon (Van Tassell et al., 1990). They are considered to be formed from polyunsaturated ether phospholipids, and although fecapentaenes are detected in fecal samples from individuals on western diets, the dietary origin of these metabolites is as yet unknown. Fecapentaenes are mutagens that may alkylate DNA to form mutagenic adducts (Van Tassell et al., 1990). Cooking proteinaceous foods generates heterocyclic amines that can be further transformed to genotoxic intermediates both in the liver and in the colon by Eubacterium and Clostridium species (Van Tassell et al., 1990). The production of these products is likely to be linked to increased risk of CRC (Hughes et al., 2000). Colonic bacteria also have a role in forming N-nitroso compounds (Lijinsky et al., 1992). Levels of N-nitroso compounds have been shown to be elevated following intake of high-protein diets, particularly meat (Russell et al., 2011).

Sulfate-reducing bacteria, such as Desulfovibrio piger, produce the toxic metabolite hydrogen sulfide. D. piger can metabolize the lactic acid that may accumulate in bowel diseases such as ulcerative colitis (Vernia et al., 1988) whilst reducing sulfate to sulfide (Marquet et al., 2009). Meat is a major source of sulfur that promotes the growth and activity of sulfate-reducing bacteria (Christl et al., 1992). The genotoxic potential of hydrogen sulfide is in part mediated by oxidative free radicals, and COX-2 is upregulated in epithelial cells following administration of hydrogen sulfide at physiological concentrations (Ang et al., 2011).

The secondary bile acids, mainly deoxycholic acid and lithocholic acid, are formed exclusively by microbial conversion, which is carried out by a wide range of bacteria (Ridlon et al., 2006). They are transformed from the primary bile acids that are formed in the liver and secreted into the duodenum. Bile acid excretion is related to the intake of fat and red meat, which are risk factors for colorectal cancer (Reddy, 1980). These acids can accumulate to high levels in the enterohepatic circulation of some individuals and may contribute to the pathogenesis of colon cancer, gallstones, and other gastrointestinal diseases (Ridlon et al., 2006). Epidemiological evidence shows higher concentrations of secondary bile acids in patients with colon cancer than in healthy controls (Gill and Rowland, 2002). Secondary bile acids can cause DNA damage (Gill and Rowland, 2002) through the production of oxygen radicals and reactive nitrogen species (Dvorak et al., 2007). Certain bile acids, ursodeoxycholic acid in particular, have been suggested to be preventative against colon cancer (Solimando et al., 2011). The mechanism of action is considered to involve countering the tumor-promoting effects of secondary bile acids (Serfaty et al., 2010). This may affect lipid raft composition and the growth factor and inflammatory signals involved in colorectal carcinogenesis (epidermal growth factor receptor signaling and COX-2 expression). Bile acid levels may also modulate the abundance of certain bacterial species, including F. prausnitzii (Lopez-Siles et al., 2012), which has been reported to have potent anti-inflammatory activity (Sokol et al., 2008).

8.8 SUMMARY

It is clear that the metabolites produced by the gut bacteria have potential to influence human health and disease. Advances in technology, particularly in MS instrumentation and methodology, have allowed the characterization of some of these products, as well as the gut bacteria responsible for their formation. Applications of these techniques are allowing us to discover the secrets of the gut metabolome, but many questions remain. In particular, we need to know: (i) What are the identities of novel and/or as yet uncharacterized metabolites? (ii) What are the dietary sources of these gut metabolites and how are they modulated by dietary change? (iii) What are the bacteria responsible for their production and the mechanisms of their formation? Another major challenge remaining is the characterization of endogenous metabolites. These compounds are generally less stable and are present at lower concentrations than the exogenous products of microbial metabolism, which are likely to modulate their formation. As MS methodology continues to develop and application increases, a clearer and much needed understanding of the complex interplay between diet, the gut microbiota, and human health will be achieved.

ACKNOWLEDGMENTS

Funding is gratefully acknowledged from the Scottish Government Rural and Environment Science and Analytical Services division; “Food, Land and People” programme. We would also like to thank the staff of the MS facilities at the Rowett Institute of Nutrition and Health and in particular Lorraine Scobbie, Louise Cantlay, and Gary Duncan.

REFERENCES

Aeschbacher HU, Turesky RJ (1991). Mammalian cell mutagenicity and metabolism of heterocyclic aromatic amines. Mutation Research 259(3–4):235–250.

Alava P, Tack F, Laing GD, de Wiele TV (2012). HPLC-ICP-MS method development to monitor arsenic speciation changes by human gut microbiota. Biomedical Chromatography 26(4):524–533.

Amar J, Burcelin R, Ruidavets JB, Cani PD, Fauvel J, Alessi MC, Chamontin B, Ferrières J (2008). Energy intake is associated with endotoxemia in apparently healthy men. American Journal of Clinical Nutrition 87(5):1219–1223.

Aminov RI, Walker AW, Duncan SH, Harmsen HJM, Welling GW, Flint HJ (2006). Molecular diversity, cultivation, and improved FISH detection of a dominant group of human gut bacteria related to Roseburia and Eubacterium rectale. Applied and Environmental Microbiology 72:6371–6376.

Ang SF, Sio SW, Moochhala SM, MacAry PA, Bhatia M (2011). Hydrogen sulfide upregulates cyclooxygenase-2 and prostaglandin E metabolite in sepsis-evoked acute lung injury via transient receptor potential vanilloid type 1 channel activation. Journal of Immunology 187(9):4778–4787.

Arumugam M, Raes J, Pelletier E, Le Paslier D, Yamada T, Mende DR, Fernandes GR, Tap J, Bruls T, Batto JM, Bertalan M, Borruel N, Casellas F, Fernandez L, Gautier L, Hansen T, Hattori M, Hayashi T, Kleerebezem M, Kurokawa K, Leclerc M, Levenez F, Manichanh C, Nielsen HB, Nielsen T, Pons N, Poulain J, Qin J, Sicheritz-Ponten T, Tims S, Torrents D, Ugarte E, Zoetendal EG, Wang J, Guarner F, Pedersen O, de Vos WM, Brunak S, Doré J; MetaHIT Consortium, Antolín M, Artiguenave F, Blottiere HM, Almeida M, Brechot C, Cara C, Chervaux C, Cultrone A, Delorme C, Denariaz G, Dervyn R, Foerstner KU, Friss C, van de Guchte M, Guedon E, Haimet F, Huber W, van Hylckama-Vlieg J, Jamet A, Juste C, Kaci G, Knol J, Lakhdari O, Layec S, Le Roux K, Maguin E, Mérieux A, Melo Minardi R, M’rini C, Muller J, Oozeer R, Parkhill J, Renault P, Rescigno M, Sanchez N, Sunagawa S, Torrejon A, Turner K, Vandemeulebrouck G, Varela E, Winogradsky Y, Zeller G, Weissenbach J, Ehrlich SD, Bork P (2011). Enterotypes of the human gut microbiome. Nature 473:174–180.

Bäckhed F, Ding H, Wang T, Hooper LV, Koh GY, Nagy A, Semenkovich CF, Gordon JI (2004). The gut microbiota as an environmental factor that regulates fat storage. Proceedings of the National Academy of Sciences USA 101(44):15718–15723.

Barcenilla A, Pryde SE, Martin JC, Duncan SH, Stewart CS, Flint HJ (2000). Phylogenetic relationships of dominant butyrate producing bacteria from the human gut. Applied and Environmental Microbiology 66:1654–1661.

Belenguer A, Duncan SH, Calder AG, Holtrop G, Louis P, Lobley GE, Flint HJ (2006). Two routes of metabolic cross-feeding between Bifidobacterium adolescentis and butyrate-producing anaerobes from the human gut. Applied and Environmental Microbiology 72:3593–3599.

Braune A, Blaut M (2011). Deglycosylation of puerarin and other aromatic C-glucosides by a newly isolated human intestinal bacterium. Environmental Microbiology 13(2):482–494.

Braune A, Maul R, Schebb NH, Kulling SE, Blaut M (2011). The red clover isoflavone irilone is largely resistant to degradation by the human gut microbiota. Molecular Nutrition and Food Research 54(7):929–938.

Brown AJ, Goldsworthy SM, Barnes AA, Eilert MM, Tcheang L, Daniels D, Muir AI, Wigglesworth MJ, Kinghorn I, Fraser NJ, Pike NB, Strum JC, Steplewski KM, Murdock PR, Holder JC, Marshall FH, Szekeres PG, Wilson S, Ignar DM, Foord SM, Wise A, Dowell SJ (2003). The orphan G protein-coupled receptors GPR41 and GPR43 are activated by propionate and other short chain carboxylic acids. Journal of Biological Chemistry 278(13):11312–11319.

Cani PD, Neyrinck AM, Fava F, Knauf C, Burcelin RG, Tuohy KM, Gibson GR, Delzenne NM (2007). Selective increases of bifidobacteria in gut microflora improve high-fat-diet-induced diabetes in mice through a mechanism associated with endotoxaemia. Diabetologia 50(11):2374–2383.

Cerda B, Tomas-Barberan FA, Espin JC (2005). Metabolism of antioxidant and chemopreventive ellagitannins from strawberries, raspberries, walnuts, and oak-aged wine in humans: identification of biomarkers and individual variability. Journal of Agricultural and Food Chemistry 53:227–235.

Chassard C, Delmas E, Robert C, Lawson PA, Bernalier-Donadille A (2011). Ruminococcus champanellensis sp. nov., a cellulose-degrading bacteria from the human gut microbiota. International Journal of Systematic and Evolutionary Microbiology 62:138–143.

Christl SU, Gibson GR, Cummings JH (1992). Role of dietary sulphate in the regulation of methanogenesis in the human large intestine. Gut 33(9):1234–1238.

Clarke TB, Davis KM, Lysenko ES, Zhou AY, Yu Y, Weiser JN (2010). Recognition of peptidoglycan from the microbiota by Nod1 enhances systemic innate immunity. Nature Medicine 16:228–231.

Davis LMG, Martinez I, Walter J, Hutkins R (2010). A dose dependent impact of prebiotic galactooligosaccharides on the intestinal microbiota of healthy adults. International Journal of Food Microbiology 144:285–292.

De Weirdt R, Possemiers S, Vermeulen G, Moerdijk-Poortvliet TC, Boschker HT, Verstraete W, Van de Wiele T (2010). Human faecal microbiota display variable patterns of glycerol metabolism. FEMS Microbiology Ecology 74(3):601–611.

Dettmer K, Aronov PA, Hammock BD (2007). Mass spectrometry-based metabolomics. Mass Spectrometry Reviews 26(1):51–78.

Duncan SH, Belenguer A, Holtrop G, Johnstone AM, Flint HJ, Lobley GE (2007). Reduced dietary intake of carbohydrates by obese subjects results in decreased concentrations of butyrate and butyrate-producing bacteria in feces. Applied and Environmental Microbiology 73:1073–1078.

Duncan SH, Hold GL, Barcenilla A, Stewart CS, Flint HJ (2002). Roseburia intestinalis sp. nov., a novel saccharolytic, butyrate-producing bacterium from human faeces. International Journal of Systematic and Evolutionary Microbiology 52(5):1615–1620.

Duncan SH, Lobley GE, Holtrop G, Ince J, Johnstone AM, Louis P, Flint HJ (2008). Human colonic microbiota associated with diet, obesity and weight loss. International Journal of Obesity 11:1720–1724.
Duncan SH, Louis P, Flint HJ (2004). Lactate-utilizing bacteria, isolated from human feces, that produce butyrate as a major fermentation product. Applied and Environmental Microbiology 70(10):5810–5817.

Dvorak K, Payne CM, Chavarria M, Ramsey L, Dvorakova B, Bernstein H, Holubec H, Sampliner RE, Guy N, Condon A, Bernstein C, Green SB, Prasad A, Garewal HS (2007). Bile acids in combination with low pH induce oxidative stress and oxidative DNA damage: relevance to the pathogenesis of Barrett’s oesophagus. Gut 56(6):763–771.

Egner PA, Chen JG, Wang JB, Wu Y, Sun Y, Lu JH, Zhu J, Zhang YH, Chen YS, Friesen MD, Jacobson LP, Muñoz A, Ng D, Qian GS, Zhu YR, Chen TY, Botting NP, Zhang Q, Fahey JW, Talalay P, Groopman JD, Kensler TW (2011). Bioavailability of Sulforaphane from two broccoli sprout beverages: results of a short-term, cross-over clinical trial in Qidong, China. Cancer Prevention Research (Phila) 4(3):384–395.

Flint HJ, Duncan SH, Scott KP, Louis P (2007). Interactions and competition within the microbial community of the human colon: links between diet and health. Environmental Microbiology 9:1101–1111.

Flint HJ, Scott KP, Duncan SH, Louis P, Forano E (2012). Microbial degradation of complex carbohydrates in the gut. Gut Microbes 3(4):1–18.

Furet JP, Kong LC, Tap J, Poitou C, Basdevant A, Bouillot JL, Mariat D, Corthier G, Doré J, Henegar C, Rizkalla S, Clément K (2010). Differential adaptation of human gut microbiota to bariatric surgery-induced weight loss: links with metabolic and low-grade inflammation markers. Diabetes 59(12):3049–3057.

Garcia DE, Baidoo EE, Benke PI, Pingitore F, Tang YJ, Illa S, Keasling JD (2008). Separation and mass spectrometry in microbial metabolomics. Current Opinion in Microbiology 11(3):233–239.

Gill CI, McDougall GJ, Glidewell S, Stewart D, Shen Q, Tuohy K, Dobbin A, Boyd A, Brown E, Haldar S, Rowland IR (2010). Profiling of phenols in human fecal water after raspberry supplementation. Journal of Agricultural and Food Chemistry 58(19):10389–10395.

Gill CI, Rowland IR (2002). Diet and cancer: assessing the risk. British Journal of Nutrition 88(1):S73–S87.

Gill SR, Pop M, DeBoy RT, Eckburg PB, Turnbaugh PJ, Samuel BS, Gordon JI, Relman DA, Fraser-Liggett CM, Nelson KE (2006). Metagenomic analysis of the human distal gut microbiome. Science 312:1355–1359.

Gonthier MP, Remesy C, Scalbert A, Cheynier V, Souquet JM, Poutanen K, Aura AM (2006). Microbial metabolism of caffeic acid and its esters chlorogenic and caftaric acids by human faecal microbiota in vitro. Biomedicine & Pharmacotherapy 60(9):536–540.

Goodman AL, McNulty NP, Zhao Y, Leip D, Mitra RD, Lozupone CA, Knight R, Gordon JI (2009). Identifying genetic determinants needed to establish a human gut symbiont in its habitat. Cell Host and Microbe 6(3):279–289.

Hamer HM, De Preter V, Windey K, Verbeke K (2012). Functional analysis of colonic bacterial metabolism: relevant to health? American Journal of Physiology – Gastrointestinal and Liver Physiology 302(1):1–9.

Herbert R (1994). The Biosynthesis of Secondary Metabolites. 2nd ed. London: Chapman & Hall.

Hertog MG, Feskens EJ, Hollman PC, Katan MB, Kromhout D (1993). Dietary antioxidant flavonoids and risk of coronary heart disease: the Zutphen Elderly Study. Lancet 342(8878):1007–1011.

Higdon JV, Delage B, Williams DE, Dashwood RH (2007). Cruciferous vegetables and human cancer risk: epidemiologic evidence and mechanistic basis. Pharmacological Research 55:224–236.

Hollywood K, Brison DR, Goodacre R (2006). Metabolomics: current technologies and future trends. Proteomics 6:4716–4723.

Hu J, Zheng YL, Hyde W, Hendrich S, Murphy PA (2004). Human fecal metabolism of soyasaponin I. Journal of Agricultural and Food Chemistry 52(9):2689–2696.

Hughes R, Magee EA, Bingham S (2000). Protein degradation in the large intestine: relevance to colorectal cancer. Current Issues in Intestinal Microbiology 1(2):51–58.

Jacobs DM, Fuhrmann JC, van Dorsten FA, Rein D, Peters S, van Velzen EJ, Hollebrands B, Draijer R, van Duynhoven J, Garczarek U (2012). Impact of short-term intake of red wine and grape polyphenol extract on the human metabolome. Journal of Agricultural and Food Chemistry 60(12):3078–3085.

Jenner AM, Rafter J, Halliwell B (2005). Human fecal water content of phenolics: the extent of colonic exposure to aromatic compounds. Free Radical Biology and Medicine 38:763–772.

Kuczynski J, Lauber CL, Walters WA, Parfrey LW, Clemente JC, Gevers D, Knight R (2012). Experimental and analytical tools for studying the human microbiome. Nature Reviews Genetics 13:47–58.

Larsen N, Vogensen FK, van den Berg FWJ, Nielsen DS, Andreasen AS, Pedersen BK, Al-Soud WA, Sørensen SJ, Hansen LH, Jakobsen M (2010). Gut microbiota in human adults with type 2 diabetes differs from non-diabetic adults. PLoS ONE 5(2):e9085.

Lee JH, Shim JS, Lee JS, Kim JK, Yang IS, Chung MS, Kim KH (2006). Inhibition of pathogenic bacterial adhesion by acidic polysaccharide from green tea (Camellia sinensis). Journal of Agricultural and Food Chemistry 54(23):8717–8723.

Ley RE, Bäckhed F, Turnbaugh P, Lozupone CA, Knight RD, Gordon JI (2005). Obesity alters gut microbial ecology. Proceedings of the National Academy of Sciences USA 102(31):11070–11075.

Ley RE, Turnbaugh PJ, Klein S, Gordon JI (2006). Microbial ecology – human gut microbes associated with obesity. Nature 444(7122):1022–1023.

Lijinsky W, Kovatch RM, Saavedra JE (1992). Carcinogenesis and mutagenesis by N-nitroso compounds having a basic center. Cancer Letters 63(2):101–107.

Lopez-Siles M, Khan TM, Duncan SH, Harmsen HJ, Garcia-Gil LJ, Flint HJ (2012). Cultured representatives of two major phylogroups of human colonic Faecalibacterium prausnitzii can utilize pectin, uronic acids, and host-derived substrates for growth. Applied and Environmental Microbiology 78(2):420–428.

Louis P, Duncan SH, McCrae S, Millar J, Jackson MS, Flint HJ (2004). Restricted distribution of the butyrate kinase pathway among butyrate-producing bacteria from the human colon. Journal of Bacteriology 186:2099–2106.

Maccaferri S, Biagi E, Brigidi P (2011). Metagenomics: key to human gut microbiota. Digestive Diseases 29:525–530.

Macfarlane GT, Macfarlane S (2002). Diet and metabolism of the intestinal flora. Bioscience and Microflora 21(4):199–208.

Mackie RI, White BA, Bryant MP (1991). Lipid metabolism in anaerobic ecosystems. Critical Reviews in Microbiology 17(6):449–479.

Marquet P, Duncan SH, Chassard C, Bernalier-Donadille A, Flint HJ (2009). Lactate has the potential to promote hydrogen sulphide formation in the human colon. FEMS Microbiology Letters 299(2):128–34.

Maslowski KM, Vieira AT, Ng A, Kranich J, Sierro F, Yu D, Schilter HC, Rolph MS, Mackay F, Artis D, Xavier RJ, Teixeira MM, Mackay CR (2009). Regulation of inflammatory responses by gut microbiota and chemoattractant receptor GPR43. Nature 461:1282–1287.

Mazmanian SK, Round JL, Kasper DL (2008). A microbial symbiosis factor prevents intestinal inflammatory disease. Nature 453(7195):620–625.

McIntyre AP, Gibson P, Young GP (1993). Butyrate production from dietary fibre and protection against large bowel cancer in a gut model. Gut 34:386–391.

Mithen RF, Dekker M, Verkerk R, Rabot S, Johnson IT (2000). The nutritional significance, biosynthesis and bioavailability of glucosinolates in human foods. Journal of the Science of Food and Agriculture 80:967–984.

Moka D, Vorreuther R, Schicha H, Spraul M, Humpfer E, Lipinski M, Foxall PJD, Nicholson JK, Lindon JC (1997). Magic angle spinning proton nuclear magnetic resonance spectroscopic analysis of intact kidney tissue samples. Analytical Communications 34:107–109.

Mutch DM, Fuhrmann JC, Rein D, Wiemer JC, Bouillot JL, Poitou C, Clément K (2011). Metabolite profiling identifies candidate markers reflecting the clinical adaptations associated with Roux-en-Y gastric bypass surgery. PLoS ONE 4:e7905.

Neilands JB (1995). Siderophores: structure and function of microbial iron transport compounds. Journal of Biological Chemistry 270(45):26723–26726.

Nicholson JK, Lindon JC (2008). Systems biology – metabonomics. Nature 455:1054–1056.

Parsonnet J (1995). Bacterial infection as a cause of cancer. Environmental Health Perspectives 103(8):263–268.

Peñalvo JL, Heinonen SM, Aura AM, Adlercreutz H (2005). Dietary sesamin is converted to enterolactone in humans. Journal of Nutrition 135(5):1056–1062.

Pryde SE, Duncan SH, Hold GL, Stewart CS, Flint HJ (2002). The microbiology of butyrate formation in the human colon. FEMS Microbiology Letters 217:133–139.

Queenan KM, Stewart ML, Smith KN, Thomas W, Fulcher RG, Slavin JL (2007). Concentrated oat beta-glucan, a fermentable fiber, lowers serum cholesterol in hypercholesterolemic adults in a randomized controlled trial. Nutrition Journal 6:6.

Rafter J, Geltner U, Bruce R (1987a). Cellular toxicity of human faecal water – possible role in aetiology of colon cancer. Scandinavian Journal of Gastroenterology Supplement 129:245–250.

Rafter JJ, Child P, Anderson AM, Alder R, Eng V, Bruce WR (1987b). Cellular toxicity of fecal water depends on diet. American Journal of Clinical Nutrition 45:559–563.

Rao AV, Rao LG (2007). Carotenoids and human health. Pharmacological Research 55:207–216.

Reddy BS (1980). Dietary fibre and colon cancer: epidemiologic and experimental evidence. Canadian Medical Association Journal 123(9):850–856.

Ridlon JM, Kang DJ, Hylemon PB (2006). Bile salt biotransformations by human intestinal bacteria. Journal of Lipid Research 47(2):241–259.

Riesenfeld CS, Schloss PD, Handelsman J (2004). Metagenomics: genomic analysis of microbial communities. Annual Reviews of Genetics 28:525–552.

Robertson MD, Bickerton AS, Dennis AL, Vidal H, Frayn KN (2005). Insulin-sensitizing effects of dietary resistant starch and effects on skeletal muscle and adipose tissue metabolism. American Journal of Clinical Nutrition 82:559–567.

Rumney CJ, Duncan SH, Henderson C, Stewart CS (1995). Isolation and characteristics of a wheatbran-degrading Butyrivibrio from human faeces. Letters in Applied Microbiology 20:232–236.

Russell DW, Setchell KD (1992). Bile acid biosynthesis. Biochemistry 31:4737–4749.

Russell WR, Drew JE, Scobbie L, Duthie GG (2006). Inhibition of cytokine-induced prostanoid biogenesis by phytochemicals in human colonic fibroblasts. Biochimica et Biophysica Acta: Molecular Basis of Disease 1762(1):124–130.

Russell WR, Duthie GG (2011). Plant secondary metabolites and gut health: the case for phenolic acids. Proceedings of the Nutrition Society 70(3):389–396.

Russell WR, Gratz SW, Duncan SH, Holtrop G, Ince J, Scobbie L, Duncan G, Johnstone AM, Lobley GE, Wallace RJ, Duthie GG, Flint HJ (2011). High protein, reduced carbohydrate weight loss diets promote metabolite profiles likely to be detrimental to colonic health. American Journal of Clinical Nutrition 93(5):1062–1072.

Russell WR, Scobbie L, Chesson A, Richardson AJ, Stewart CS, Duncan SH, Drew JE, Duthie GG (2008). Anti-inflammatory implications of the dietary phenolic compounds. Nutrition and Cancer – an International Journal 60:636–642.
Russell WR, Scobbie L, Labat A, Duncan GJ, Duthie GG (2009a). Phenolic acid content of fruits commonly consumed and locally produced in Scotland. Food Chemistry 115:100–104.
Russell WR, Scobbie L, Labat A, Duthie GG (2009b). Selective bio-availability of phenolic acids from Scottish strawberries. Molecular Nutrition and Food Research 53:85–91.
Salyers AA, West SEH, Vercellotti JR, Wilkins TD (1977). Fermentation of mucins and plant polysaccharides by anaerobic bacteria from the human colon. Applied and Environmental Microbiology 34:529–533.
Sánchez-Patán F, Cueva C, Monagas M, Walton GE, Gibson GR, Quintanilla-López JE, Lebrón-Aguilar R, Martín-Álvarez PJ, Moreno-Arribas MV, Bartolomé B (2012). In vitro fermentation of a red wine extract by human gut microbiota: changes in microbial groups and formation of phenolic metabolites. Journal of Agricultural and Food Chemistry 60(9):2136–2147.
Sánchez-Patán F, Monagas M, Moreno-Arribas MV, Bartolomé B (2011). Determination of microbial phenolic acids in human faeces by UPLC-ESI-TQ MS. Journal of Agricultural and Food Chemistry 59(6):2241–2247.
Schantz M, Erk T, Richling E (2010). Metabolism of green tea catechins by the human small intestine. Biotechnology Journal 5(10):1050–1059.
Seeram NP, Henning SM, Zhang Y, Suchard M, Li Z, Heber D (2006). Pomegranate juice ellagitannin metabolites are present in human plasma and some persist in urine for up to 48 hours. Journal of Nutrition 136(10):2481–2485.
Serfaty L, Bissonnette M, Poupon R (2010). Ursodeoxycholic acid and chemoprevention of colorectal cancer. Gastroenterology and Clinical Biology 34(10):516–522.
Sellitto M, Guoyun B, Gloria S, Fricke WF, Sturgeon C, Gajer P, White JR, Koenig SSK, Sakamoto J, Boothe D, Gicquelais R, Kryszak D, Puppa E, Catassi C, Ravel J, Fasano A (2012). Proof of concept of microbiome-metabolome analysis and delayed gluten exposure on celiac disease autoimmunity in genetically at-risk infants. PLoS ONE 7(3):e33387.
Sokol H, Pigneur B, Watterlot L, Lakhdari O, Bermúdez-Humarán LG, Gratadoux JJ, Blugeon S, Bridonneau C, Furet JP, Corthier G, Grangette C, Vasquez N, Pochart P, Trugnan G, Thomas G, Blottière HM, Doré J, Marteau P, Seksik P, Langella P (2008). Faecalibacterium prausnitzii is an anti-inflammatory commensal bacterium identified by gut microbiota analysis of Crohn disease patients. Proceedings of the National Academy of Sciences USA 105(43):16731–16736.
Solimando R, Bazzoli F, Ricciardiello L (2011). Chemoprevention of colorectal cancer: a role for ursodeoxycholic acid, folate and hormone replacement treatment? Best Practice and Research Clinical Gastroenterology 25(4–5):555–568.
Tang J (2011). Microbial metabolomics. Current Genomics 12(6):391–403.
Tap J, Mondot S, Levenez F, Pelletier E, Caron C, Furet J-P, Ugarte E, Munoz-Tamayo R, Le Paslier D, Nalin R, Dore J, LeClerc M (2009). Towards the human intestinal microbiota phylogenetic core. Environmental Microbiology 11:2574–2584.
Tazoe H, Otomo Y, Kaji I, Tanaka R, Karaki SI, Kuwahara A (2008). Roles of short-chain fatty acids receptors, GPR41 and GPR43 on colonic functions. Journal of Physiology and Pharmacology 59(2):251–262.

Tirapelli CR, Ambrosio SR, de Oliveira AM, Tostes RC (2010). Hypotensive action of naturally occurring diterpenes: a therapeutic promise for the treatment of hypertension. Fitoterapia 81:690–702.
Turnbaugh PJ, Ley RE, Hamady M, Fraser-Liggett CM, Knight R, Gordon JI (2007). The human microbiome project. Nature 449:804–810.
Turnbaugh PJ, Ley RE, Mahowald MA, Magrini V, Mardis ER, Gordon JI (2006). An obesity-associated gut microbiome with increased capacity for energy harvest. Nature 444:1027–1031.
Van de Wiele T, Vanhaecke L, Boeckaert C, Peru K, Headley J, Verstraete W, Siciliano S (2005). Human colon microbiota transform polycyclic aromatic hydrocarbons to estrogenic metabolites. Environmental Health Perspectives 113(1):6–10.
van Dorsten FA, Grün CH, van Velzen EJ, Jacobs DM, Draijer R, van Duynhoven JP (2010). The metabolic fate of red wine and grape juice polyphenols in humans assessed by metabolomics. Molecular Nutrition and Food Research 54(7):897–908.
Van Tassell RL, Kingston DG, Wilkins TD (1990). Metabolism of dietary genotoxins by the human colonic microflora; the fecapentaenes and heterocyclic amines. Mutation Research 238(3):209–221.
Vernia P, Caprilli R, Latella G, Barbetti F, Magliocca FM, Cittadini M (1988). Fecal lactate and ulcerative colitis. Gastroenterology 95(6):1564–1568.
Vernia P, Cittadini M (1995). Short-chain fatty acids and colorectal cancer. European Journal of Clinical Nutrition 49(3):S18–S21.
Vijay-Kumar M, Aitken JD, Carvalho FA, Cullender TC, Mwangi S, Srinivasan S, Sitaraman SV, Knight R, Ley RE, Gewirtz AT (2010). Metabolic syndrome and altered gut microbiota in mice lacking toll-like receptor 5. Science 328:228–231.
Villas-Bôas SG, Mas S, Åkesson M, Smedsgaard J, Nielsen N (2005). Mass spectrometry in metabolome analysis. Mass Spectrometry Reviews 24:613–646.
Vollenweider S, Grassi G, König I, Puhan Z (2003). Purification and structural characterization of 3-hydroxypropionaldehyde and its derivatives. Journal of Agricultural and Food Chemistry 51(11):3287–3293.
Walker AW, Ince J, Duncan SH, Webster LM, Holtrop G, Ze X, Brown D, Stares MD, Scott P, Bergerat A, Louis P, McIntosh F, Johnstone AM, Lobley GE, Parkhill J, Flint HJ (2011). Dominant and diet-responsive groups of bacteria within the human colonic microbiota. ISME Journal 5:220–230.
Wijeyesekera A, Selman C, Barton RH, Holmes E, Nicholson JK, Withers DJ (2012). Metabotyping of long-lived mice using (1)H NMR spectroscopy. Journal of Proteome Research 11(4):2224–2235.
Willing BP, Dicksved J, Halfvarson J, Andersson AF, Lucio M, Zheng Z, Jarnerot G, Tysk C, Jansson JK, Engstrand L (2010). A pyrosequencing study in twins shows that gastrointestinal microbial profiles vary with inflammatory bowel disease phylotypes. Gastroenterology 139:1844–1854.
Wu GD, Chen J, Hoffmann C, Bittinger K, Chen Y-Y, Keilbaugh SA, Bewtra M, Knights D, Walters WA, Knight R, Sinha R, Gilroy E, Gupta K, Baldassano R, Nessel L, Li H, Bushman FD, Lewis JD (2011). Linking long-term dietary patterns with gut microbial enterotypes. Science 334:105–108.
Wyns C, Bolca S, De Keukeleire D, Heyerick A (2010). Development of a high-throughput LC/APCI-MS method for the determination of thirteen phytoestrogens including gut microbial metabolites in human urine and serum. Journal of Chromatography B: Analytical Technologies in the Biomedical and Life Sciences 878(13–14):949–956.
Yang HJ, Dou QP (2010). Targeting apoptosis pathway with natural terpenoids: implications for treatment of breast and prostate cancer. Current Drug Targets 11:733–744.
Younes H, Coudray C, Bellanger J, Demigné C, Rayssiguier Y, Rémésy C (2001). Effects of two fermentable carbohydrates (inulin and resistant starch) and their combination on calcium and magnesium balance in rats. British Journal of Nutrition 86(4):479–485.
Ze X, Duncan SH, Louis P, Flint HJ (2012). Ruminococcus bromii is a keystone species for the degradation of resistant starch in the human colon. ISME Journal 6(8):1535–1543.
Zhang Y, Yan S, Gao X, Xiong X, Dai W, Liu X, Li L, Zhang W, Mei C (2012). Analysis of urinary metabolic profile in aging rats undergoing caloric restriction. Aging Clinical and Experimental Research 24(1):79–84.
Zoetendal EG, Akkermans ADL, de Vos WM (1998). Temperature gradient gel electrophoresis analysis of 16S rRNA from human fecal samples reveals stable and host-specific communities of active bacteria. Applied and Environmental Microbiology 64:3854–3859.
Zoetendal EG, Collier CT, Koike S, Mackie RI, Gaskins HR (2004). Molecular ecological analysis of the gastrointestinal microbiota: a review. Journal of Nutrition 34:465–472.

9 MS-BASED METABOLOMICS IN NUTRITION AND HEALTH RESEARCH

Clara Ibáñez and Carolina Simó

9.1 INTRODUCTION

With the impressive development of novel high-throughput “Omics” technologies, there is a renewed interest in dietary components that may potentially affect gene expression and the integrative physiological and metabolic functions of an organism. In this line, the term Nutrigenomics was coined to describe the study of the influence of interactions between genes and diet on an individual’s health (Peregrin, 2001; Wittwer et al., 2011). One of the key opportunities of omic sciences is the exploration of the link between specific gene polymorphisms and the individual response to nutrients, which is the subject of Nutrigenetics. Recently, a new discipline called Foodomics (Cifuentes, 2009; Herrero et al., 2010; Herrero et al., 2012) has been defined, creating a global framework able to integrate, in the current postgenomic era, the multiple applications and terms (e.g., nutrigenomics, microbiomics, toxicogenomics, transgenic foods, etc.) derived from the combination of Omics technologies with food science and nutrition. In this regard, one of the main interests of Foodomics is the prevention of chronic diseases through personalized diet, with the goal of improving the prospects for individuals to enjoy greater health. The disorders generated by any disease are complex and multifactorial, involving not only genetic factors but also a number of behavioural and environmental factors such as exposure to certain food components. Thus, there is a need to improve our limited understanding of the roles of dietary components at the molecular level (i.e., their interaction with genes, and their subsequent effect on proteins and metabolites), and the application of Omics technologies will be essential to solve this complex issue. The newly acquired knowledge will allow us to manipulate cell functions through specific diets, which will have an important impact on our health.

Foodomics: Advanced Mass Spectrometry in Modern Food Science and Nutrition, First Edition. Edited by Alejandro Cifuentes. © 2013 John Wiley & Sons, Inc. Published 2013 by John Wiley & Sons, Inc.

Diet is considered one of the main factors contributing to the incidence of metabolic disorders such as obesity, diabetes, cancer and cardiovascular disease, among others. How foods and dietary bioactive components influence health and disease is a question that has not yet been completely answered. In the post-genomic era, metabolomics is playing an increasingly important role in correlating bioactive food components with disease prevention. Single nutrients may have multiple biochemical targets and subsequent physiological actions, which may not be easily addressed with classical targeted biomarker analysis. Thus, one of the main focuses of metabolomics in food science and nutrition research is the study of the changes in the metabolome caused by specific dietary interventions.

Although there is a clear dominance of genome-wide omic studies (i.e., Transcriptomics), a deeper knowledge of how nutrient–gene interactions can engender health-promoting metabolic shifts in individuals is of major importance. Some authors (van Ommen et al., 2010) proposed the creation of a “Nutritional Phenotype Database” as a research and collaboration tool and a publicly available data and knowledge repository for nutritional studies involving “Omics” technologies, including metabolomics. This database is now freely available (http://www.dbnp.org/) and benefits the research community by enabling the integration and examination of data from multiple studies by different research groups worldwide.
The emerging field of metabolomics is receiving increasing attention from the scientific community in nutritional studies. To reflect this trend, Figure 9.1 shows the number of works published in the period 2001–2011 found through a search in the Web of Knowledge database (http://apps.webofknowledge.com) using as key terms “Metabolomics/Metabolome” and “Nutrition/diet”. For further background on metabolomics in nutrition research, the review works listed in Table 9.1 are recommended.

9.2 MS-BASED METABOLOMICS WORKFLOW

Metabolomics, one of the newest “Omics” disciplines in the nutrition field, has generated high expectations in this novel area of research. However, considerable analytical challenges must still be overcome. One of the first difficulties comes from the way in which the metabolome has been defined: the metabolome is the full set of endogenous or exogenous low molecular weight entities of approximately <1500 Da (metabolites), and the small pathway motifs that are present in a biological system (cell, biofluid, tissue, organ, organism, etc.) (Trujillo et al., 2006). This wide definition includes compounds such as lipids, carbohydrates, amino acids, organic acids, steroids, peptides, and others. Thus, a broad spectrum of molecules with diverse physicochemical properties is the target in metabolomics. Moreover, metabolite concentrations in biological samples span an enormous dynamic range. Both the chemical diversity of the metabolome and this concentration range represent an important analytical challenge, as will be explained later.

FIGURE 9.1 Bar plot representation of the number of works published in the period 2001–2011 available in the database Web of Knowledge using as key terms “Metabolomics/Metabolome” and “Nutrition/diet”. (Vertical axis: number of publications; horizontal axis: years. Source: Web of Knowledge, http://apps.webofknowledge.com.)

Two different analytical approaches can be followed in nontargeted metabolomics. While “metabolic profiling” refers to the analysis of a subclass of metabolites (a chemical family or a metabolic pathway), “metabolic fingerprinting” has been proposed as a means of analyzing the total set of metabolites in a given sample. In the latter case the identity of the metabolites of interest is established after statistical data analysis of the metabolic fingerprints. Targeted metabolomics methods are, in principle, similar to those available in a typical analytical chemistry laboratory, and thus they will not be covered in this chapter. A typical workflow in nontargeted metabolomics is given in Figure 9.2.

The metabolome is dynamic and biologically close to the phenotype of the system, and hence its temporal responses to environmental effects may also be indicative of health status. Moreover, changes in the metabolome result from differences in food intake and in the individual’s metabolic condition. Nontargeted metabolomics can potentially provide a variety of biomarkers to establish and monitor the health of a subject at any time point during their lifetime.

Biological matrices are very complex, and thus a sample preparation step must be carried out beforehand in any metabolomic study. After sampling, the first necessary step is to stop any inherent enzymatic activity or any changes in the metabolite levels. This is usually called the “quenching” process.
Typically, enzyme activity is inhibited by decreasing or increasing the temperature and/or by the immediate addition of organic solvents. When working with cell cultures or tissues, a second step of cell disruption must be included to allow the release of metabolites from the cells (Sellick et al., 2009; Volmer et al., 2011). In an ideal metabolomic fingerprinting approach, the metabolite extraction method should not be biased towards any group of molecules. In practice, this aspect has not been resolved yet.

TABLE 9.1 Review Works on Metabolomics in Food Science and Nutrition

| Title | Year | Reference |
|---|---|---|
| Metabolomics in practice: Emerging knowledge to guide future dietetic advice toward individualized health | 2005 | German et al., 2005 |
| Metabolomics in human nutrition: Opportunities and challenges | 2005 | Gibney et al., 2005 |
| Nutrigenomics, proteomics, metabolomics, and the practice of dietetics | 2006 | Trujillo et al., 2006 |
| Characterization of Proteomic and metabolomic responses to dietary factors and supplements | 2007 | Astle et al., 2007 |
| Personalised nutrition. Metabolomic applications in nutritional research | 2008 | Brennan, 2008 |
| Metabolomics: Applications to food science and nutrition research | 2008 | Wishart, 2008a |
| Metabolomics for assessment of nutritional status | 2009 | Zivkovic and German, 2009 |
| Metabolomics, a novel tool for studies of nutrition, metabolism and lipid dysfunction | 2009 | Oresic, 2009 |
| Mass-spectrometry-based metabolomics: Limitations and recommendations for future progress with particular focus on nutrition research | 2009 | Scalbert et al., 2009 |
| The complex links between dietary phytochemicals and human health deciphered by metabolomics | 2009 | Manach et al., 2009 |
| Measurement of dietary exposure: A challenging problem which may be overcome thanks to metabolomics? | 2009 | Favé et al., 2009 |
| Metabolomic analysis in food science: A review | 2009 | Cevallos-Cevallos et al., 2009 |
| Nutritional Metabonomics: An approach to promote personalized health and wellness | 2011 | Collino et al., 2011 |
| Analytical metabolomics: Nutritional opportunities for personalized health | 2011 | McNiven et al., 2011 |
| Metabolomics and human nutrition | 2011 | Primrose et al., 2011 |
| Nutrimetabolomic strategies to develop new biomarkers of intake and health effects | 2012 | Llorach et al., 2012 |
| Nutritional Metabolomics: Progress in addressing complexity in diet and health | 2012 | Jones et al., 2012 |

FIGURE 9.2 Workflow in nontargeted metabolomics in health/food intervention studies. (Stages shown: sample, influenced by food/diet, genetics, environment and health state and divided into training, quality control and validation sample sets; preparation by LLE, SPE, ultrafiltration, protein precipitation, etc.; analysis by NMR, LC-MS, GC-MS, CE-MS, MS; data processing, including peak finding, peak integration, time alignment, adduct removal and normalization, to give the data matrix; chemometrics for biomarker discovery and predictive models; identification; quantification by target analysis; pathway analysis for the understanding of biological processes; data integration.)

Blood, plasma, serum, as well as other biofluids such as cerebrospinal

fluid contain a wide range of macromolecules; thus, a robust metabolite extraction procedure should be applied to avoid the overlap of macromolecule peaks with peaks from small molecules (Wishart et al., 2008b; Gika and Theodoridis, 2011). A comparison of different sample treatments for nontargeted metabolomics has recently been published (Simó et al., 2011), demonstrating that the composition and quantity of the metabolites detected depended to a large extent on the sample preparation step and highlighting the importance of this frequently underestimated analytical step. Liquid–liquid extraction, solid-phase extraction, protein precipitation, and ultrafiltration are the preferred sample preparation methodologies among the great variety of purification procedures used in metabolomics. Urine is one of the most commonly used biofluids in metabolomics. Owing to its lower protein content, less complex sample pretreatment is typically required. Moreover, it can be obtained in large quantities by noninvasive collection procedures, and repeated sampling is not a problem (Ryan et al., 2011).

Regarding the design of the experiment, if the aim of the work is to create or to validate a classification method, the addition of a blinded group of samples is recommended. This aspect is not always present in metabolomic studies. The “training data set”, composed of the known samples, and the “validation sample set”, formed by the blinded samples, have to be processed at the same time. A “quality control set” of samples can also be included in the study in order to monitor the performance of the entire process (from the sample analysis to the statistical results). This last group of samples is usually made up of a pool of samples from the training data set.
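One common way of using the pooled quality control set described above is to compute, for each detected feature, the relative standard deviation (RSD) of its intensity across the QC injections and to discard features whose RSD exceeds a chosen cut-off. The following is a minimal Python sketch of that idea; the function names, the 30% threshold and the toy intensities are illustrative assumptions and are not taken from the studies cited in this chapter.

```python
from statistics import mean, stdev

def qc_rsd(intensities):
    """Relative standard deviation (%) of one feature across QC injections."""
    avg = mean(intensities)
    if avg == 0:
        return float("inf")  # never detected in the QC pool: treat as unreliable
    return 100.0 * stdev(intensities) / avg

def filter_features(qc_table, max_rsd=30.0):
    """Keep features whose QC RSD is at or below max_rsd (assumed cut-off).

    qc_table maps a feature label to its intensities in the pooled QC samples.
    Returns {feature label: RSD in %} for the features that pass.
    """
    passed = {}
    for name, values in qc_table.items():
        rsd = qc_rsd(values)
        if rsd <= max_rsd:
            passed[name] = rsd
    return passed

# Toy data (hypothetical): three features measured in four pooled QC injections.
qc_table = {
    "m/z 181.0707 @ 1.2 min": [1000.0, 1050.0, 980.0, 1020.0],  # stable signal
    "m/z 205.0972 @ 3.4 min": [500.0, 900.0, 150.0, 700.0],     # unstable signal
    "m/z 90.0550 @ 0.8 min": [200.0, 210.0, 190.0, 205.0],      # stable signal
}
kept = filter_features(qc_table)
for name, rsd in kept.items():
    print(f"{name}: RSD = {rsd:.1f}%")
```

Because the QC samples are identical aliquots of one pool, any spread in their intensities reflects analytical rather than biological variation, which is why this set can be used to monitor the whole process.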
For example, in an intervention metabolomic study of fat intake and specific genetic variants with a special focus on lipid analysis, 600 plasma samples and 24 pooled quality control samples were analyzed in random order by high-performance liquid chromatography–mass spectrometry (HPLC–MS) (Bijlsma et al., 2006).

As stated before, one of the main challenges in metabolomics is the complete analysis of entire metabolomes, since metabolites have very different molecular structures and are present in biofluids and tissues over a wide dynamic range of concentrations. Two analytical platforms are currently used for metabolomic analyses: MS- and NMR-based systems. Technological advances in NMR- and MS-based methodologies make it possible to measure several hundred metabolites in small volumes of biological samples. However, current analytical platforms are not able to analyze all the metabolites present in an organism. NMR- and MS-based methods are highly complementary and are combined to cover as much of the metabolome as possible. A detailed comparison between these two techniques has already been carried out (Pan and Raftery, 2007). NMR provides detailed structural information with no need for laborious sample treatment. In contrast, the sensitivity of MS is higher than that of NMR, enabling broader surveys of the metabolome. However, MS usually requires sample cleanup and/or pre-fractionation. Even today, many of the analytes in metabolomic studies are unknown or are difficult to obtain from commercial sources. Thus, the use of ultra-high-resolution mass spectrometers (e.g., TOF, FT-ICR MS, Orbitrap®) is essential to obtain accurate mass measurements for the determination of the elemental compositions of metabolites, and to carry out their tentative identification with the help of metabolite databases (Dettmer et al., 2007). On the other hand, MSⁿ
On the other hand, MSn MS-BASED METABOLOMICS WORKFLOW 251 experiments, especially when product ions are analyzed at high resolution (with Q-TOF, TOF-TOF, or LTQ-Orbitrap) provide additional structural information for metabolite identification purposes. MS can be used as a standalone technique through the direct infusion of samples (Han et al., 2008; Junot et al., 2010), however, chemi- cal isomers cannot be distinguished since they have the same exact mass and there- fore would require previous chromatographic/electrophoretic separation. Thus, for MS-based analysis samples are commonly submitted to different separation tech- niques such as gas chromatography (GC), liquid chromatography (LC), or capil- lary electrophoresis (CE), coupled to MS detection. GC–MS is applied to volatile organic molecules and generally requires sample derivatization of metabolites to create volatile compounds. LC–MS is applicable to the analysis of a wide range of semi-polar compounds. The recent introduction of the reduced particle size (sub- 2 ␮m) used in ultra-high-performance liquid chromatography (UPLC) compared to normal particle size in LC (3.5–5 ␮m) results in increased peak capacity, resolution and lower analysis time. However, special instrumentation is required due to the high pressure needed to operate with these reduced-particle size columns. CE–MS, on the other hand, is particularly suited for the rapid separation of ionic, weakly ionic and/or highly polar metabolites with very high resolution using extremely small reagents and sample volumes. However, CE–MS is not as robust and stable as GC–MS or LC–MS. Owing to the metabolome complexity global coverage of biofluid/tissue metabolome will require the application of multiple analytical platforms. For this reason, the use of multidimensional analytical platforms adds an additional level of separation in an automated-online fashion. 
For example, GC × GC–TOF MS has been used for high-resolution metabolomics of mouse tissues (Shellie et al., 2005).

When the objective is to detect as many metabolites as possible in a complex sample matrix and the number of samples is high, raw data processing is a very important step in data analysis. High-throughput analysis of biological fluids, especially those obtained in a minimally invasive manner, generates massive amounts of data. This is especially relevant in studies carried out to understand the relationships between nutrition and health/disease, in which a high number of large data sets must typically be examined. Bioinformatic tools play an important role here in converting the complex raw data obtained into useful information. High-throughput peak picking and spectral deconvolution in GC–MS, LC–MS and CE–MS are technically challenging when a high number of samples are handled. In particular, migration time variability in CE–MS and poorer metabolite peak resolution in LC–MS can hinder the application of metabolite peak alignment algorithms across samples. The variables in the high-dimensional data from MS-based analytical platforms comprise a list of ion masses, each with its signal intensity and retention/migration time. An additional difficulty is MS signal redundancy and other signal artefacts: the complexity of raw data from MS-based platforms is greatly increased by the formation of metabolite adducts with sample matrix components or mobile phase/electrolyte compounds. To process raw MS data acquired by GC–MS, LC–MS or CE–MS, several open source bioinformatics tools are now available, including MetAlign (Vorst et al., 2005), MZmine2 (Katajamaa et al., 2006), XCMS (Smith et al., 2006), and BinBase (Fiehn et al., 2005), among others.
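The signal redundancy caused by adduct formation can be illustrated with a small sketch that flags features likely to be ammonium, sodium or potassium adducts of a co-eluting protonated ion. This is a simplified illustration in Python and is not the algorithm implemented in MetAlign, MZmine2, XCMS or BinBase; the function name, the tolerances and the toy peak list are assumptions made for the example.

```python
# Mass differences (Da) between common positive-mode adducts and [M+H]+.
ADDUCT_SHIFTS = {
    "[M+NH4]+": 17.02655,
    "[M+Na]+": 21.98194,
    "[M+K]+": 37.95588,
}

def annotate_adducts(features, mz_tol=0.005, rt_tol=0.05):
    """Flag features that look like adducts of a co-eluting [M+H]+ ion.

    features: list of (mz, retention_time_min) tuples.
    Returns {index of adduct feature: (adduct label, index of parent feature)}.
    """
    annotations = {}
    for i, (mz_i, rt_i) in enumerate(features):
        for j, (mz_j, rt_j) in enumerate(features):
            if i == j or abs(rt_i - rt_j) > rt_tol:
                continue  # adducts of one metabolite share its retention time
            for label, shift in ADDUCT_SHIFTS.items():
                if abs((mz_j - mz_i) - shift) <= mz_tol:
                    annotations[j] = (label, i)
    return annotations

# Toy peak list (hypothetical values): a hexose-like ion with two adducts,
# plus an unrelated feature that elutes later.
features = [
    (181.0707, 1.20),  # candidate [M+H]+
    (203.0526, 1.21),  # 181.0707 + 21.9819, sodium adduct
    (219.0265, 1.19),  # 181.0707 + 37.9558, potassium adduct
    (205.0972, 3.40),  # different retention time, left unannotated
]
adducts = annotate_adducts(features)
print(adducts)
```

Grouping such redundant signals before statistical analysis avoids counting the same metabolite several times in the data matrix.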

Statistical analysis and data mining must be carried out to identify the significant metabolites that capture the bulk of the variation between datasets and that represent candidate biomarkers (e.g., of a dietary intervention). Specific statistical programs such as STATISTICA (http://www.statsoft.com), SPSS Statistics (http://www.ibm.com/us/en/) or SIMCA-P (http://www.umetrics.com) are commonly used because of the broad range of statistical techniques and graphic types they offer. A variety of pattern recognition procedures are employed, including principal component analysis (PCA), partial least squares discriminant analysis (PLS-DA) and hierarchical clustering analysis (HCA). Several authors have reviewed the main aspects of the great diversity of bioinformatics tools available for MS-based metabolomic data analysis (Wishart, 2010; Korman et al., 2012; Sugimoto et al., 2012).

Another challenge in metabolomics is the identification of metabolites in a high-throughput manner. Today, many metabolites in complex biological samples are still not annotated in databases. Because of the lack of comprehensive databases and the chemical complexity involved, accurate mass spectra must be exploited for the structural elucidation of compounds. But even with high mass accuracy measurements, the monoisotopic mass of a compound is usually not sufficient to determine its elemental composition (Kind and Fiehn, 2006). Thus, the isotopic abundance pattern is needed to rule out wrong elemental composition candidates. Tentative identification of metabolites is usually carried out by matching the measured accurate m/z values against the theoretical m/z values contained in freely available databases, such as the Human Metabolome Database (HMDB) (Wishart et al., 2009) and Metlin (Smith et al., 2005).
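The database-matching step can be sketched as a search for candidates whose theoretical m/z lies within a given tolerance, expressed in parts per million (ppm), of the measured value. The short Python example below uses a tiny hand-made candidate list instead of a real query to HMDB or Metlin; the choice of compounds, the function names and the tolerances are illustrative assumptions. It also shows why accurate mass alone can be ambiguous: protonated glucose and protonated theobromine differ by only about 7 ppm.

```python
# Monoisotopic masses (Da) of the most abundant isotopes, and the proton mass.
ELEMENT_MASS = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}
PROTON = 1.00727646

# Hypothetical mini-database: compound name -> elemental composition.
CANDIDATES = {
    "glucose": {"C": 6, "H": 12, "O": 6},
    "theobromine": {"C": 7, "H": 8, "N": 4, "O": 2},
    "caffeine": {"C": 8, "H": 10, "N": 4, "O": 2},
    "tryptophan": {"C": 11, "H": 12, "N": 2, "O": 2},
}

def protonated_mz(formula):
    """Theoretical m/z of the singly charged [M+H]+ ion."""
    neutral = sum(ELEMENT_MASS[el] * count for el, count in formula.items())
    return neutral + PROTON

def match_mz(observed_mz, tol_ppm):
    """Return (name, error in ppm) for candidates within tol_ppm, best first."""
    hits = []
    for name, formula in CANDIDATES.items():
        theoretical = protonated_mz(formula)
        error_ppm = (observed_mz - theoretical) / theoretical * 1e6
        if abs(error_ppm) <= tol_ppm:
            hits.append((name, error_ppm))
    return sorted(hits, key=lambda hit: abs(hit[1]))

hits_5ppm = match_mz(181.0720, tol_ppm=5)    # only theobromine is within 5 ppm
hits_10ppm = match_mz(181.0720, tol_ppm=10)  # glucose becomes a candidate too
print([name for name, _ in hits_5ppm])
print([name for name, _ in hits_10ppm])
```

When several candidates survive the mass filter, as with the 10 ppm tolerance here, the isotopic abundance pattern or fragmentation data are needed to decide between them, in line with the limitation noted by Kind and Fiehn (2006).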
Other freely available databases are useful for obtaining additional information about the metabolites under study, such as their location in different biological pathways or their alteration in certain diseases, as is the case of the Kyoto Encyclopedia of Genes and Genomes (KEGG) (Kanehisa and Goto, 2000), or their physical properties, chemical structure details and main related publications, as provided by PubChem (Bolton et al., 2008), among others. For the most common metabolites these databases are linked, making the interpretation of results easier.

To complete the metabolomic study, a pathway analysis of the identified molecules can be performed by means of some of the databases mentioned above. The identification of altered pathways after a dietary intervention provides important information about the biochemical processes and possible consequences at the molecular level, helping to generate hypotheses and design new studies. MassTRIX (Suhre and Schmitt-Kopplin, 2008) is an online bioinformatic tool for accurate MS data that locates the metabolites in their KEGG pathways and allows the comparison of all the altered metabolites and/or metabolic routes between two groups of samples. In a recent work (Krug et al., 2012), MassTRIX was applied to a multiplatform metabolomic study. Namely, 15 healthy male volunteers underwent different physical challenges designed to explore metabolic plasticity in catabolic conditions (fasting and cycling), anabolic states (oral glucose tolerance test, oral lipid tolerance test, and standard liquid diet) or a stress situation (cold pressor test). NMR, FIA-MS/MS and FT-ICR MS were used to analyze plasma, urine, breath air and exhaled breath condensate samples. As expected, the concentrations of some metabolites were altered (e.g., insulin, glucose and lactate). Interestingly, some short-chain acylcarnitines, important in metabolic pathways such as lipolysis and β-oxidation of fatty acids, were also altered. In this study, MassTRIX proved useful for establishing connections between metabolites and pathways. In another work (Manna et al., 2010), the effects of alcohol administration associated with liver disease were studied in two genetically different mouse models by UPLC-QTOF-MS. MassTRIX revealed that metabolites related to tryptophan metabolism were significantly upregulated, such as indole-3-lactic acid, which contributed to the shift in the redox balance due to chronic alcohol consumption. This finding can contribute to the understanding of the negative effects of chronic alcohol consumption and help to prevent or stop the progression of liver disease associated with these biochemical processes.

9.3 METABOLOMICS IN NUTRITION-RELATED STUDIES

Understanding the effect of diet on metabolism requires studying, at the molecular level, the mechanisms of action of nutrients and other bioactive food components. This is supported by the growing number of studies in humans, animal models and cell cultures demonstrating that certain dietary compounds can regulate gene expression in different ways. A variety of biological samples have been analyzed for this purpose. It must be underlined that the metabolic information obtained from different biofluids and tissues is complementary. Determining the role of diet/dietary components in metabolic regulation has become a key objective in nutrition research. For this purpose, nutritional intervention studies have incorporated novel metabolomics approaches. Some interesting metabolomic applications in nutrition-related studies are summarized in Table 9.2. Llorach-Asuncion et al. (2010) proposed a metabolomic approach to study the modification of the urinary metabolome after the consumption of cocoa powder. A combined multivariate statistical analysis (PLS-DA and two-way HCA) was used to simplify the analysis of the data set obtained by HPLC-QTOF-MS. Overall, 27 altered metabolites with a variety of structures (alkaloids, polyphenols, amino acids, etc.) could be identified as important markers of cocoa consumption. A similar analytical approach was also followed to study the effect of nut consumption on the urinary metabolome of subjects with metabolic syndrome (Tulipani et al., 2011). As can be seen in Figure 9.3, following a 12-week dietary intervention, the urine samples of the control diet group (not supplemented), the nut-supplemented diet group and the baseline samples (collected before the intervention study) are markedly differentiated.
Twenty potential metabolites, including fatty acid conjugated metabolites, phase II and microbial-derived phenolic metabolites, and serotonin metabolites, were proposed as markers of nut intake. This information is important for the development of a personalized approach to nutrition, which will ultimately allow the identification of biomarkers for dietary intervention strategies.

FIGURE 9.3 Orthogonal projections to latent structures (OSC-PLS) scores plot (t[1] vs. t[2]) deriving from the urine samples collected at baseline (black squares, "Before treatment") and after 12 weeks of control diet (gray diamonds) and nut-enriched diet (gray spheres). Reprinted with permission from Tulipani et al. (2011). Copyright American Chemical Society.

TABLE 9.2 Metabolomic Applications in Health/Dietary Intervention Studies

Intervention | Sample | Metabolic Observation | Platform | Reference
Green tea intake | Human urine | Glucose metabolism, citric acid cycle and amino acid metabolism altered. | NMR; GC-Q MS; LC-QqQ MS | Law et al., 2008
Post-exercise intake of 4 types of beverages (a) | Human serum | Different systemic human blood serum metabolic response. | GC-TOF MS | Chorell et al., 2009
Chocolate consumption | Human urine and blood | Different energy homeostasis, hormonal metabolism and gut microbial activity. | NMR; GC–MS; LC–MS/MS | Martin et al., 2009
Polyphenol-rich red wine and grape juice intake | Human urine | Endogenous metabolites. Microbial polyphenol metabolite profiles. | NMR; GC-TOF–MS | van Dorsten et al., 2010
Fortified wheat flour (b) | Human serum | Twenty potential metabolite biomarkers were identified. | UPLC-TOF–MS | Jiang et al., 2011
Exercise | Human plasma | Lipolysis pathways modification. | LC-QTRAP®-MS | Lewis et al., 2010
Cocoa powder consumption | Human urine | 27 altered metabolite concentrations. | HPLC-QTOF–MS | Llorach-Asuncion et al., 2010
2 and 4 week duration individualized diets | Human urine and blood | Three distinct dietary patterns related to the different nutrient intake. | NMR | O’Sullivan et al., 2010
Cruciferous vegetables intake | Human urine | S-methyl-L-cysteine and three related metabolites were identified as potential biomarkers of cruciferous consumption. | NMR | Edmands et al., 2011
Low vs. standard caloric diet | Human urine and blood | Different metabolic profiles. | LC-FTICR-MS; FIE-MS | Fave et al., 2011
Green tea catechins intake in D-galactose age-induced rats | Rat plasma | Nine potential biomarkers for catechins ingestion. | UPLC-QTOF-MS | Fu et al., 2011
Vitamins E and C in combination with dietary fish oil consumption | Dog urine and plasma | Altered lysophospholipid profiles in fish-oil diet. Decreased 11-dehydro-thromboxane B2 urine level in fish oil diet. | GC-MS; UPLC-MS/MS | Hall et al., 2011
Citrus fruit consumption | Human urine | Appearance of proline betaine and several biotransformed products. | FTICR-MS; FIE-MS | Lloyd et al., 2011a
Standard corn flakes breakfast vs. smoked salmon vs. broccoli florets vs. raspberries vs. whole grain cereals | Human urine | Potential metabolite markers of each dietary intervention. | GC-TOF-MS; FIE-MS | Lloyd et al., 2011b
Fasting vs. non-fasting; high vs. low caloric diet; corn vs. olive oil intake | Rat plasma | Different metabolic profiles and identification of biomarkers. | GC-MS; HPLC-MS/MS | Mellert et al., 2011
Breast milk vs. formula milk | Piglet cecal samples and human faeces | Different metabolic profiles, highlighting sugars and fatty acids. Different gut bacteria composition. | GC-Q-MS | Poroyko et al., 2011
Fitness status | Human plasma | Metabolites related to antioxidant defense system and inflammatory pathway of lipids. | GC-TOF-MS | Chorell et al., 2012
Caloric restricted diet | Human plasma | Changes in intermediates of fatty acid oxidation. | GC-Q-MS; GC–MS/MS | Huffman et al., 2012
Red wine vs. polyphenol mix extract vs. placebo | Human urine and blood | 17 potential biomarkers in plasma and 12 potential biomarkers in urine were identified. | NMR; GC–MS; HPLC–MS/MS | Jacobs et al., 2012
7% apple pectin vs. 10 g raw apple vs. no supplementation intake | Rat urine | 119 potential apple- and 39 potential pectin-exposure markers. | UPLC-QTOF-MS | Kristensen et al., 2012

(a) Beverages: water, low carbohydrate content, high carbohydrate content and low carbohydrate with low protein content.
(b) Flour was supplemented with folic acid, vitamin B1, vitamin B2, ferric sodium edetate and zinc oxide.

As stated above, the use of more than one analytical technique is required to increase metabolome coverage. Thus, to study global changes in the human urine and blood metabolome due to chocolate consumption, three complementary analytical platforms (NMR, GC–MS and LC–MS) were employed (Martin et al., 2009). Subjects with a higher anxiety trait showed a distinct metabolic profile indicative of a different energy homeostasis, hormonal metabolism and gut microbial activity. It was also observed that chocolate ingestion reduced the urinary excretion of stress hormones. Another interesting multiplatform work was presented by Petersen et al. (2006). A multicentre human obesity project was carried out to elucidate the role of interactions between the macronutrient composition of the diet, with particular emphasis on fat intake, and specific genetic variants. In that work, plasma samples obtained from 150 subjects at 4 time points, before and after a single intake of a high-fat test meal followed by a 10-week hypocaloric intervention with either high- or low-fat content, were analyzed. To cover as much of the metabolome as possible, four analytical methods were used: NMR, GC–MS and LC–MS were selected for polar compounds, and another LC–MS method was used for lipid determination. One of the main disadvantages of multicentre studies is the increased variability due to subtle differences in sample collection, variations in the storage of samples and reagents, and other minor changes in laboratory procedures. A different multiplatform analytical strategy based on GC-Q-MS, LC-QqQ-MS and NMR was also developed by Law et al. (2008) for the metabolic profiling of human urine samples associated with the intake of green tea. Complementary metabolic signatures were obtained using the different analytical techniques. Metabolites involved in glucose metabolism, the citric acid cycle and amino acid metabolism were mainly affected after the ingestion of green tea. Other aspects regarding nutrition and health have also been studied from a global metabolomic approach.
For example, exercise-induced responses have been tackled by several researchers using nontargeted metabolomic approaches. Exercise provides numerous beneficial effects, but again, there is a limited understanding of how these occur at the molecular level. Lewis et al. (2010) provided an overview of human plasma metabolite signatures before and after exercise by using LC-QTRAP®-MS. Among other results, it was observed that exercise-induced metabolic changes differed depending on individual fitness. Thus, more fit individuals activated lipolysis (glycerol), facilitated the entry of fatty acids into the TCA cycle (pantothenate), and expanded the TCA cycle intermediate pool (fumarate, malate, succinate) to a greater extent than less fit individuals. Similarly, Chorell et al. (2012) carried out an intervention study focused on characterizing the human plasma metabolome by GC-TOF-MS in relation to fitness status. After multi- and univariate statistical analysis, global metabolite patterns as well as individual markers were related to the antioxidant system and the inflammatory pathway of lipids, and may be used to predict the fitness state and metabolic capacity in humans. Post-exercise ingestion of four beverages with differences in macronutrient composition has also been studied by the same research group (Chorell et al., 2009). Ingestion of water, a low-carbohydrate beverage, a high-carbohydrate beverage, and a low-carbohydrate-protein beverage was investigated in twenty-four healthy males regularly involved in exercise training. Blood was collected at six time points, one pre- and five post-exercise, to monitor metabolic changes. The serum metabolome was characterized by GC-TOF-MS.
Separation of subjects according to fitness level was achieved, and it was observed that ingestion of carbohydrates in combination with proteins generated a different systemic human blood serum metabolic response, compared to the sole ingestion of carbohydrates or water, in the early recovery phase following exercise. Figure 9.4a shows the discrimination based on the four different post-exercise intakes using orthogonal partial least-squares-discriminant analysis (OPLS-DA). Moreover, Figure 9.4b shows the corresponding loading plot (map over metabolites), which provides detailed information regarding the different metabolites and classes of metabolites responsible for the observed nutrition-dependent separation of subjects.

FIGURE 9.4 (a) Subjects in recovery, correlated to ingested beverage: OPLS-DA model (tCV[1]P vs. tCV[2]P) describing the first two predictive components, revealing clustering of subjects in relation to their ingested beverage following exercise (high carbohydrate (HCHO), low carbohydrate (LCHO), low carbohydrate-protein (LCHO-P) or water; including all recovery samples taken after nutritional intake: at 15, 30, 60, and 90 min after completed exercise). (b) Metabolic map, correlated to ingested beverage: OPLS-DA covariance loading plot (w[1]P vs. w[2]P) of the resulting metabolic patterns explaining the pattern seen in the above-mentioned score plot (w1P and w2P are covariance loadings for the first and second predictive component, respectively). Metabolites are marked according to their class (serum sugar, serum amino acid, serum fatty acid; identified or non-identified metabolite). Reprinted with permission from Chorell et al. (2009). Copyright American Chemical Society.

A major challenge in diet intervention studies is the biological variation within the human population (diversity in the genome between ethnic groups and individuals) as well as differences in food availability, which mainly depend on cultural, geographical and economic diversity. For this reason, compared to any human study, investigation with animal models is likely to exhibit a much lower level of variation due to their isogenic nature and controllable environment. Thus, easier interpretation of effects in diet-related intervention studies is expected from animal models. GC-MS- and LC-MS-based metabolomic approaches were developed by Mellert et al. (2011) to study the effect of different nutrition states in model rats. The effect of fasting vs. non-fasting prior to blood sampling, the influence of a high caloric diet and caloric restriction, as well as the administration of corn oil and olive oil were studied for their influence on the plasma metabolome. Some biomarkers derived from food consumption were detected (e.g., alpha-tocopherol, ascorbic acid, beta-sitosterol, campesterol). On the other hand, different diets had different effects on metabolic profiles. The levels of triacylglycerols, phospholipids and their degradation products (fatty acids, glycerol, lysophosphatidylcholine) were also observed to be altered depending on the nutritional status. In another recent work, Kristensen et al. (2012) studied how fresh apple or apple pectin affected the urinary metabolome of rats. After the comparison of urinary metabolic profiles obtained by UPLC-QTOF-MS, quinic acid, m-coumaric acid and (−)-epicatechin were identified as exposure markers of apple intake, whereas hippuric acid acted as an effect marker. On the other hand, pyrrole-2-carboxylic acid and 2-furoylglycine were identified as pectin exposure markers, and 2-piperidinone behaved as a pectin effect marker.
The complexity of inter-organ metabolic relationships and their sensitivity to dietary changes was demonstrated by Jove et al. (2011). In that work, the different metabolic responses in plasma, urine and faecal contents from mice after the administration of different polyphenol-rich diets were studied. The effect of three different plant-derived extracts (hazelnut, almond, and carob) was studied by HPLC-QTOF-MS. It was demonstrated that the sensitivity to dietary changes depended on the biofluid/compartment studied. The polyphenol-rich diet mainly affected bile acid and taurine metabolism pathways. Data regarding bioaccessibility and bioavailability were also obtained with this nontargeted metabolomic approach. The effect of dietary components on aging has also been studied in model rats. Aging is a multifactorial process of enormous complexity. A common characteristic of aging-related disease is the involvement of metabolic-related systems in general, and the mitochondria in particular (Kenyon, 2010). Fu et al. (2011) studied the effect of green tea catechins (namely, (−)-epigallocatechin-3-gallate, (−)-epicatechin-3-gallate, (−)-epigallocatechin, (−)-epicatechin, (+)-gallocatechin and (+)-catechin) in D-galactose age-induced rats. After the multivariate analysis of metabolite profiles obtained by UPLC-QTOF-MS, nine potential biomarkers that allowed the classification of aging rats and green tea-treated aging rats were identified.
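The studies in this section rely on multivariate models (PLS-DA, OPLS-DA) that require a dedicated chemometrics package. As a much simpler stand-in, and not the method used by any of the cited authors, the following sketch ranks features between an intervention group and a control group by Welch's t statistic and log2 fold change. The feature names and intensities are invented purely for illustration.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic for two independent groups with unequal variances.

    Assumes at least two values per group and a non-zero pooled standard error.
    """
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se


def log2_fold_change(a, b):
    """log2 ratio of group means; assumes strictly positive intensities."""
    return math.log2(mean(a) / mean(b))


def rank_features(treated, control):
    """Return (feature, log2FC, t) tuples sorted by decreasing |t|.

    treated and control map each feature name to a list of intensities.
    """
    rows = [
        (feat, log2_fold_change(treated[feat], control[feat]),
         welch_t(treated[feat], control[feat]))
        for feat in treated
    ]
    return sorted(rows, key=lambda row: abs(row[2]), reverse=True)


# Hypothetical intensities for two features in four animals per group.
treated = {"feature_1": [10, 12, 11, 13], "feature_2": [5, 6, 5, 6]}
control = {"feature_1": [5, 6, 5, 6], "feature_2": [5, 6, 6, 5]}
ranked = rank_features(treated, control)
```

In practice such a univariate screen would be followed by correction for multiple testing and by identification of the top-ranked features, which is where accurate mass and MS/MS information come in.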

9.4 DIET/NUTRITION AND DISEASE: METABOLOMICS APPLICATIONS

Diseases of modern civilization, such as diabetes, heart disease, and cancer, are known to be influenced by dietary patterns. The use of single biomarkers for monitoring disease state is being progressively replaced by comprehensive profiling of metabolites, linked to a better understanding of health and human metabolism (Wang et al., 2010; Atzei et al., 2011; Dunn et al., 2011; Rhee and Gerszten, 2012). Metabolomics, besides being a promising approach to identify new biomarkers that can be used in the noninvasive monitoring of disease, is predicted to play an increasingly important role in correlating bioactive food components and disease prevention. As an example, the relationship between diet and cancer is under continuous study (Ross, 2010). Adequate nutrition during cancer plays an important role in clinical outcome measures, such as medical treatment response, quality of life and cost of care. Moreover, epidemiologic studies comparing dietary patterns from countries with a particular incidence of certain cancers have confirmed that diet has the potential to modify cancer risk (Go et al., 2003). Unravelling the effects of diet on cancer risk is therefore of great importance. The study of changes in small molecule patterns provides important mechanistic insights into the cancer process and tumour characterization, and aids the search for predictive biomarkers of response and toxicity (Griffin and Shockcor, 2004; Claudino et al., 2012). There is important evidence supporting the significant role of metabolic regulation in cancer. Indeed, a review has recently been published on the emerging hypothesis that cancer is primarily a disease of energy metabolism (Seyfried and Shelton, 2010).
Nontargeted metabolomics will provide additional information that can be linked with transcriptomics and proteomics data to obtain a comprehensive view of the effect of bioactive compounds from the diet on cancer. Interesting works can be found in the literature regarding the use of metabolomic approaches in cancer research for diagnostic and biomarker discovery purposes. Thus, Kind et al. (2007) proposed a nontargeted urine metabolome approach using several analytical platforms (namely, HILIC/LC–MS, RP/LC–MS and GC–MS) to discriminate kidney cancer patients from healthy individuals. Currently, only a limited number of diet intervention investigations in humans exist in this area. Tissue/cell as well as animal model systems are mainly used to study the effect of dietary components. For example, an LC-QTOF-MS-based methodology was proposed by Godzien et al. (2011) for global metabolite analysis in urine samples from induced-diabetic rats supplemented with a polyphenol-rich extract from rosemary. After urine metabolome profiling and multivariate statistical analysis, a group of 20 endogenous metabolites showing statistically significant differences could be tentatively identified using metabolite databases and MS/MS spectral information. Among them, several amino acids and their metabolites changed due to the effect of the gut microbiota. In addition, the comparison between control and diabetic rats showed some metabolic coincidences between type 1 diabetes and other autoimmune diseases such as autism and/or Crohn’s disease, and the diet intervention succeeded in inducing changes in such biomarkers. In a different work, the potential nutraceutical antioxidant properties of a carotenoid-rich Cystoseira spp. extract were explored in a diabetic rat model using a nontargeted metabolomic analysis of urine based on CE–MS (Moraes et al., 2011). A group of compounds allowed the classification of animals according to the extract consumption.
The metabolites were mainly lysine derivatives, probably related to post-translational (glycation) modifications of proteins. In this work the authors highlighted the complementary metabolic information obtained by CE–MS compared to other more commonly used techniques such as LC–MS and GC–MS. Cell cultures have also been used as a test system to investigate the effects of bioactive dietary components on the metabolome. A cell culture is a less expensive model and easier to control than animals or human subjects; moreover, biological variability is expected to be small in cell lines grown under the same conditions (Cuperlovic-Culf et al., 2010). Thus, studies with cell cultures are a good starting point to investigate the effect of novel bioactive compounds found in natural products. Among dietary constituents, polyphenols have been claimed to show promising anticancer activities.

Although many of the health benefits assigned to numerous dietary constituents are still under controversy, it is clear that sounder scientific evidence will help to elucidate their claimed beneficial effects, as proposed by the new Foodomics discipline. Better scientific evidence will also make the approval of these compounds by food authorities easier. Recently, both LC-TOF-MS and CE-TOF-MS analytical platforms were used to characterize the metabolome changes in colon cancer cells whose proliferation was extensively reduced by a rosemary extract rich in polyphenols (Ibáñez et al., 2012a). A closer look at the altered metabolites showed that rosemary polyphenols markedly affected the intracellular levels of polyamines and their derived catabolites in colorectal cancer cells.

9.5 OTHER APPLICATIONS IN NUTRITIONAL METABOLOMICS

Plant metabolomics is starting to be widely applied in a variety of research areas. It has been estimated that plants contain >200,000 metabolites (Dixon and Strack, 2003). Plant-based products comprise the vast majority of human food intake. Once more, metabolomics is helping us to gain better knowledge of the components that are present in our diet. The quality and nutritional aspects of crop plants are directly related to their metabolite content. Metabolomics is providing novel information regarding the composition and potential nutritional imbalance of plant-based foodstuffs (Hall et al., 2008). Several studies have been published on how metabolomic analysis can be used to measure the phytochemical composition of foods, the biotransformation of ingested phytochemicals, and the metabolic response to the ingestion of phytochemicals (McGhie and Rowan, 2012). Metabolomics has also been applied to the study of nutritionally improved foods. Thus, metabolomic studies of genetically modified plants might indicate whether intended and/or unintended effects have taken place as a result of genetic modification (Shepherd et al., 2006; García-Cañas et al., 2011), but this issue is discussed in more depth in another chapter of this book. Environmental metabolomics is also important in Foodomics, since the food metabolome may also contain exogenous compounds such as herbicides, insecticides, fungicides, hormones and other chemicals of interest for environmental health. Soltow et al. (2011) used a dual chromatography-Fourier-transform-MS approach to analyze endogenous and exogenous compounds in human and monkey plasma samples. Metabolic profiling of plasma revealed over 100 chemicals apparently of environmental origin (insecticides, herbicides, flame retardants, and plasticizers). The biotransformation and bioavailability of dietary bioactive compounds are playing an increasingly important role in nutritional metabolomics.
Most transformations are regarded as inactivating biologically active compounds and facilitating their excretion. The bioavailability of polyphenols is of special interest due to their health benefits. Several mechanisms of action have been proposed for the disease-preventive activities of polyphenols based on in vitro studies using concentrations of standard compounds much higher than those generally attainable through diet. Due to the beneficial properties attributed to polyphenols, the study of the effects of these compounds in humans is of great interest. However, dietary polyphenols undergo extensive biotransformation in the small intestine and liver, and thus their bioavailability changes (Lambert et al., 2007; Simons et al., 2010). In addition, a major fraction of the dietary polyphenols reaches the colon, where it is extensively degraded by the gut microbiota into simpler phenolic compounds. The bioconverted active compounds will interact with the host metabolism and affect multiple physiological processes. In order to understand the absorption and metabolism of polyphenols and their potential biological activity in humans, van Dorsten et al. (2010) studied the impact of polyphenol-rich red wine and grape juice consumption in humans using a nontargeted metabolomic approach (NMR and GC-MS). Differences in microbial polyphenol metabolite profiles as well as in endogenous metabolite excretion in urine were observed after the polyphenol-rich diet intake. This information will help to determine the potential beneficial effects of modified polyphenols in nutritional intervention studies. Nevertheless, the understanding of the role and importance of gut microbial metabolism of dietary bioactive compounds in humans is still limited. In this line, Poroyko et al.
presented a metabolomic study relating the intestinal ecosystems of breast milk-fed and formula-fed full-term piglets and human premature infants, in order to determine key metabolites in the gut that induced predictable changes in microbial community composition (Poroyko et al., 2011). After analysis of the intestinal microbial and metabolic profiles, it was observed that the chemical composition of the diet appears to have a significant role in defining the microbiota of the immature gut. In that work the authors hypothesized that the basic chemical composition of the diet fundamentally selects for specific intestinal microbiota, which might help to explain disparate disease outcomes and guide therapeutic direction. In a different work, Wikoff et al. (2009) studied the effect of the gut microbiome on mammalian blood metabolites. For that purpose, plasma extracts from germ-free mice were compared with samples from conventional animals by using an LC-QTOF MS methodology. It was observed that amino acid metabolites were particularly affected. Several pathways, including the metabolic processing of indole-containing molecules, were seen to interact particularly with the microbiome. Moreover, organic acids containing phenyl groups were also greatly increased in the presence of gut microbes. Approximately 10% of the plasma metabolome was directly dependent upon the microbiome in the selected animal model. As can be deduced, one of the biggest challenges to be addressed in nutritional metabolomics is the extremely variable background of inter-individual differences in host genetics, lifestyle, diet and gut microbiota. In this sense, further systematic studies in this field are needed to evaluate the biological activities of polyphenol metabolites in target biofluids/tissues in real dietary intervention studies.

9.6 INTEGRATION WITH OTHER “OMICS”

To understand complex biological systems, the integration of the different levels of knowledge is compulsory. This is where the concept of Systems Biology comes into play. “Systems Biology studies biological systems by systematically perturbing them (biologically, genetically, or chemically); monitoring the gene, protein, and informational pathway responses; integrating these data; and ultimately, formulating mathematical models that describe the structure of the system and its response to individual perturbations” (Ideker et al., 2001). An adequate Systems Biology approach in dietary intervention studies should provide a holistic view of the molecular mechanisms underlying the beneficial or adverse effects of certain bioactive food components. Metabolite data from different model organisms are abundant in the literature; however, their integration in global databases is yet to be fully accomplished. An integrated metabolomic and proteomic approach was recently published to study diet-induced effects on hepatic metabolism in a rat model (Bertram et al., 2012). In that work, interesting correlations were observed. The most evident correlation was found between the hepatic protein malate dehydrogenase and the levels of lactate, glucose, and glutamine/glutamate. However, other protein-metabolite correlations remained unexplained. A remarkable nutritional intervention study was carried out by Bakker et al. (2010) in overweight men with mildly increased C-reactive protein concentrations (associated with cardiovascular diseases and type 2 diabetes). A supplement with potential anti-inflammatory properties, containing resveratrol, green tea extract, α-tocopherol, vitamin C, n-3 (omega-3) polyunsaturated fatty acids, and tomato extract, was studied in this work.
The effects of these dietary compounds were studied using a nutrigenomic approach based on large-scale profiling of genes, proteins, and metabolites in blood, urine, and fat tissue. The effects on markers of inflammation, oxidation, and metabolism, obtained by integrating the results from a multiplatform approach, were reported in this novel study. Although the authors described several limitations to be addressed in the future, to date this is the first nutritional intervention study with integrated information obtained from three different expression levels (genes, proteins and metabolites). Recently, a second example has been reported in which the activity of dietary polyphenols against colon cancer cell proliferation was investigated by integrating, in a global Foodomics approach, the information from the three levels of expression obtained using transcriptomics, proteomics and metabolomics platforms (Ibáñez et al., 2012b). That hypothesis-free Foodomics study concluded that the antiproliferative effect of the studied polyphenols can be explained through three different biological mechanisms working together and linked to cell cycle arrest, apoptosis and antioxidative enhancement.
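The kind of protein-metabolite correlation mentioned above for the integrated proteomic and metabolomic study can be sketched as a simple Pearson correlation screen across the two data layers. This is a generic illustration under stated assumptions, not the procedure of Bertram et al. (2012): the abundance values, the 0.8 threshold and the function names are invented, and the sign and size of the example correlations carry no biological meaning.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlate_layers(proteins, metabolites, threshold=0.8):
    """List (protein, metabolite, r) pairs whose |r| reaches the threshold.

    Both arguments map a name to measurements taken on the same samples,
    in the same order.
    """
    pairs = []
    for p_name, p_values in proteins.items():
        for m_name, m_values in metabolites.items():
            r = pearson(p_values, m_values)
            if abs(r) >= threshold:
                pairs.append((p_name, m_name, round(r, 3)))
    return pairs


# Hypothetical relative abundances in four animals (invented numbers).
proteins = {"malate dehydrogenase": [1.0, 2.0, 3.0, 4.0]}
metabolites = {
    "lactate": [2.0, 4.1, 5.9, 8.0],
    "glucose": [3.0, 1.0, 4.0, 2.0],
}
pairs = correlate_layers(proteins, metabolites)
```

With thousands of proteins and metabolites, such pairwise screens produce many chance correlations, so multiple-testing control and biological plausibility checks are needed before any pair is interpreted.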

9.7 CONCLUDING REMARKS

Metabolomics is an interdisciplinary field of science. The difficulties still to be resolved by the scientific community are related to the lack of well-established and standardized methods (sample treatment, data processing, and analytical methodologies, among others) to improve the quality of experimental design, analyses, and results. Thus, natural metabolic diversity together with the lack of a universal sample preparation procedure and analytical platform are major challenges to be tackled by the scientific community in the near future. Identification of the discriminating metabolites is still not trivial with the current analytical platforms and the incomplete metabolite databases available. Moreover, the in vivo biotransformation of dietary bioactive components needs further clarification. Continuous advances in MS technology suggest a significant growth of applications in the metabolomics field. The high-throughput nature of metabolic profiling using MS-based analytical platforms will allow metabolomics to be applied to screen large sample sets at a relatively low cost. On the other hand, one of the biggest challenges to be addressed in metabolomics is the interpretation of the identified significant metabolites in a biologically meaningful manner. For this reason, a successful metabolomic study focused on nutrition and health research should always be performed as a collaborative effort between analytical chemists, computational scientists, dietetic experts, nutritionists, physicians, etc. As the field of metabolomics develops, many of these limitations will be overcome. However, it remains an open question whether metabolomics will deliver its full promise in nutritional research.

ACKNOWLEDGMENTS

This work was supported by projects AGL2011-29857-C03-01 and CSD2007-00063 FUN-C-FOOD (CONSOLIDER INGENIO 2010, Ministerio de Educación y Ciencia). C.I. thanks the Ministerio de Economía y Competitividad for her FPI predoctoral fellowship. Authors thank Prof. A. Cifuentes for carefully revising our work.

REFERENCES

Astle J, Ferguson JT, German JB, Harrigan GG, Kelleher NL, Kodadek T, Parks BA, Roth MJ, Singletary KW, Wenger CD, Mahady GB (2007). Characterization of proteomic and metabolomic responses to dietary factors and supplements. Journal of Nutrition 137:2787–2793.
Atzei A, Atzori L, Moretti C, Barberini L, Noto A, Ottonello G, Pusceddu E, Fanos V (2011). Metabolomics in paediatric respiratory diseases and bronchiolitis. Journal of Maternal-Fetal and Neonatal Medicine 24:59–62.
Bakker GGCM, van Erk MJ, Pellis L, Wopereis S, Rubingh CM, Cnubben NHP, Kooistra T, van Ommen B, Hendriks HFJ (2010). An antiinflammatory dietary mix modulates inflammation and oxidative and metabolic stress in overweight men: a nutrigenomics approach. The American Journal of Clinical Nutrition 91:1044–1059.
Bertram HC, Larsen LB, Chen X, Jeppesen PB (2012). Impact of high-fat and high-carbohydrate diets on liver metabolism studied in a rat model with a Systems Biology approach. Journal of Agricultural and Food Chemistry 60:676–684.
Bijlsma S, Bobeldijk I, Verheij ER, Ramaker R, Kochhar S, Macdonald IA, van Ommen B, Smilde AK (2006). Large-scale human Metabolomics studies: a strategy for data (pre-)processing and validation. Analytical Chemistry 78:567–574.
Bolton E, Wang Y, Thiessen PA, Bryant SH (2008). PubChem: integrated platform of small molecules and biological activities. Annual Reports in Computational Chemistry 4:217–241.
Brennan L (2008). Personalised nutrition. Metabolomic applications in nutritional research. The Proceedings of the Nutrition Society 67:404–408.

Cevallos-Cevallos JM, Reyes-de-Corcuera JI, Etxeberria E, Danyluk MD, Rodrick GE (2009). Metabolomic analysis in food science: a review. Trends in Food Science & Technology 20:557–566.
Chorell E, Moritz T, Branth S, Antti H, Svensson MB (2009). Predictive Metabolomics evaluation of nutrition-modulated metabolic stress responses in human blood serum during the early recovery phase of strenuous physical exercise. Journal of Proteome Research 8:2966–2977.
Chorell E, Svensson MB, Moritz T, Antti H (2012). Physical fitness level is reflected by alterations in the human plasma metabolome. Molecular BioSystems 8:1187–1196.
Cifuentes A (2009). Food analysis and Foodomics. Journal of Chromatography A 1216:7109–7110.
Claudino WM, Goncalves PH, di Leo A, Philip PA, Sarkar FH (2012). Metabolomics in cancer: a bench-to-bed side intersection. Critical Reviews in Oncology/Hematology 84:1–7.
Collino S, Martin FPJ, Kochhar S, Rezzi S (2011). Nutritional Metabonomics: an approach to promote personalized health and wellness. Chimia 65:396–399.
Cuperlovic-Culf M, Barnett DA, Culf AS, Chute I (2010). Cell culture metabolomics: applications and future directions. Drug Discovery Today 15:610–621.
Dettmer K, Aronov PA, Hammock BD (2007). Mass spectrometry-based metabolomics. Mass Spectrometry Reviews 26:51–78.
Dixon RA, Strack D (2003). Phytochemistry meets genome analysis, and beyond. Phytochemistry 62:815–816.
van Dorsten FA, Grun CH, van Velzen EJJ, Jacobs DM, Draijer R, van Duynhoven JPM (2010). The metabolic fate of red wine and grape juice polyphenols in humans assessed by Metabolomics. Molecular Nutrition & Food Research 54:897–908.
Dunn WB, Goodacre R, Neyses L, Mamas M (2011). Integration of metabolomics in heart disease and diabetes research: current achievements and future outlook. Bioanalysis 3:2205–2222.
Edmands WM, Beckonert OP, Stella C, Campbell A, Lake BG, Lindon JC, Holmes E, Gooderham NJ (2011). Identification of human urinary biomarkers of cruciferous vegetable consumption by metabonomic profiling. Journal of Proteome Research 10:4513–4521.
Favé G, Beckmann ME, Draper JH, Mathers JC (2009). Measurement of dietary exposure: a challenging problem which may be overcome thanks to metabolomics? Genes & Nutrition 4:135–141.
Fave G, Beckmann M, Lloyd AJ, Zhou S, Harold G, Lin W, Tailliart K, Xie L, Draper J, Mathers JC (2011). Development and validation of a standardized protocol to monitor human dietary exposure by metabolite fingerprinting of urine samples. Metabolomics 7:469–484.
Fiehn O, Wohlgemuth G, Scholz M (2005). Setup and Annotation of Metabolomic Experiments by Integrating Biological and Mass Spectrometric Metadata. In: Ludäscher B, Raschid L, editors. Data integration in the Life Sciences. Verlag Berlin Heidelberg: Springer. p 224–239.
Fu C, Wang T, Wang Y, Chen X, Jiao J, Ma F, Zhong M, Bi K (2011). Metabonomics study of the protective effects of green tea polyphenols on aging rats induced by D-galactose. Journal of Pharmaceutical and Biomedical Analysis 55:1067–1074.
García-Cañas V, Simó C, León C, Ibáñez E, Cifuentes A (2011). MS-based analytical methodologies to characterize genetically modified crops. Mass Spectrometry Reviews 30:396–416.

German JB, Watkins SM, Fay LB (2005). Metabolomics in practice: emerging knowledge to guide future dietetic advice toward individualized health. Journal of the American Dietetic Association 105:1425–1432.
Gibney MJ, Walsh M, Brennan L, Roche HM, German B, van Ommen B (2005). Metabolomics in human nutrition: opportunities and challenges. The American Journal of Clinical Nutrition 82:497–503.
Gika H, Theodoridis G (2011). Sample preparation prior to the LC-MS-based metabolomics/metabonomics of blood-derived samples. Bioanalysis 3:1647–1661.
Go VL, Butrum RR, Wong DA (2003). Diet, nutrition, and cancer prevention: the postgenomic era. Journal of Nutrition 133:3830S–3836S.
Godzien J, García-Martínez D, Martinez-Alcazar P, Ruperez FJ, Barbas C (2011). Effect of a nutraceutical treatment on diabetic rats with targeted and CE-MS non-targeted approaches. Metabolomics DOI: 10.1007/s11306-011-0351.
Griffin JL, Shockcor JP (2004). Metabolic profiles of cancer cells. Nature Reviews Cancer 4:551–561.
Hall RD, Brouwer ID, Fitzgerald MA (2008). Plant metabolomics and its potential application for human nutrition. Physiologia Plantarum 132:162–175.
Hall JA, Brockman JA, Jewell DE (2011). Dietary fish oil alters the lysophospholipid metabolomic profile and decreases urinary 11-dehydro thromboxane B2 concentration in healthy Beagles. Veterinary Immunology and Immunopathology 144:355–365.
Han J, Danell RM, Patel JR, Gumerov DR, Scarlett CO, Speir JP, Parker CE, Rusyn I, Zeisel S, Borchers CH (2008). Towards high-throughput metabolomics using ultrahigh-field Fourier transform ion cyclotron resonance mass spectrometry. Metabolomics 4:128–140.
Herrero M, García-Cañas V, Simó C, Cifuentes A (2010). Recent advances in the application of CE methods for food analysis and Foodomics. Electrophoresis 31:205–228.
Herrero M, Simó C, García-Cañas V, Ibáñez E, Cifuentes A (2012). Foodomics: MS-based strategies in modern food science and nutrition. Mass Spectrometry Reviews 31:49–69.
Huffman KM, Redman LM, Landerman LR, Pieper CF, Stevens RD, Muehlbauer MJ, Wenner BR, Bain JR, Kraus VB, Newgard CB, Ravussin E, Kraus WE (2012). Caloric restriction alters the metabolic response to mixed-meal: Results from a randomized, controlled trial. Public Library of Science One 7:e28190.
Ibáñez C, Simó C, García-Cañas V, Gómez-Martínez A, Ferragut JA, Cifuentes A (2012a). CE/LC-MS multiplatform for broad metabolomic analysis of dietary polyphenols effect on colon cancer cells proliferation. Electrophoresis 33:2328–2336.
Ibáñez C, Valdés A, García-Cañas V, Simó C, Celebier M, Rocamora-Reverte L, Gómez-Martínez A, Herrero M, Castro-Puyana M, Segura-Carretero A, Ibáñez A, Ferragut JA, Cifuentes A (2012b). Global foodomics strategy to investigate the health benefits of dietary constituents. Journal of Chromatography A 1248:139–153.
Ideker T, Galitski T, Hood L (2001). A new approach to decoding life: systems Biology. Annual Review of Genomics and Human Genetics 2:343–372.
Jacobs DM, Fuhrmann JC, van Dorsten FA, Rein D, Peters S, van Velzen EJJ, Hollebrands B, Draijer R, van Duynhoven J, Garczarek U (2012). Impact of short-term intake of red wine and grape polyphenols on human metabolome. Journal of Agricultural and Food Chemistry 60:3078–3085.
Jiang Z, Liang Q, Wang Y, Zheng X, Pei L, Zhang T, Wang Y, Luo G (2011). Metabonomic study on women of reproductive age treated with nutritional intervention: screening potential biomarkers related to neural tube defects occurrence. Biomedical Chromatography 25:767–774.
Jones DP, Park Y, Ziegler TR (2012). Nutritional Metabolomics: progress in addressing complexity in diet and health. Annual Review of Nutrition 32:18.1–18.20.
Jove M, Serrano JCE, Ortega N, Ayala V, Angles N, Reguant J, Morello JR, Romero MP, Motilva MJ, Prat J, Pamplona R, Portero-Otín M (2011). Multicompartmental LC-Q-TOF-based Metabonomics as an exploratory tool to identify novel pathways affected by polyphenol-rich diets in mice. Journal of Proteome Research 10:3501–3512.
Junot C, Madalinski G, Tabet JC, Ezan E (2010). Fourier transform mass spectrometry for metabolome analysis. Analyst 135:2203–2219.
Kanehisa M, Goto S (2000). KEGG: Kyoto Encyclopedia of Genes and Genomes. Nucleic Acids Research 28:27–30.
Katajamaa M, Miettinen J, Oresic M (2006). MZmine: toolbox for processing and visualization of mass spectrometry based molecular profile data. Bioinformatics 22:634–636.
Kenyon CJ (2010). The genetics of ageing. Nature 464:504–512.
Kind T, Fiehn O (2006). Metabolomic database annotations via query of elemental compositions: mass accuracy is insufficient even at less than 1 ppm. BMC Bioinformatics 7:234–244.
Kind T, Tolstikov V, Fiehn O, Weiss RH (2007). A comprehensive urinary metabolomic approach for identifying kidney cancer. Analytical Biochemistry 363:185–195.
Korman A, Oh A, Raskind A, Banks D (2012). Statistical methods in metabolomics. Methods in Molecular Biology 856:381–413.
Kristensen M, Engelsen SB, Dragsted LO (2012). LC-MS metabolomics top-down approach reveals new exposure and effect biomarkers of apple and apple-pectin intake. Metabolomics 8:64–73.
Krug S, Kastenmüller G, Stückler F, Rist MJ, Skurk T, Sailer M, Raffler J, Römisch-Margl W, Adamski J, Prehn C, Frank T, Engel KH, Hofmann T, Luy B, Zimmermann R, Moritz F, Schmitt-Kopplin P, Krumsiek J, Kremer W, Huber F, Oeh U, Theis FJ, Szymczak W, Hauner H, Suhre K, Daniel H (2012). The dynamic range of the human metabolome revealed by challenges. Federation of American Societies for Experimental Biology 26:2607–2619.
Lambert JD, Sang S, Yang CS (2007). Biotransformation of green tea polyphenols and the biological activities of those metabolites. Molecular Pharmaceutics 4:819–825.
Law WS, Huang PY, Ong ES, Ong CN, Li SF, Pasikanti KK, Chan EC (2008). Metabonomics investigation of human urine after ingestion of green tea with gas chromatography/mass spectrometry, liquid chromatography/mass spectrometry and (1)H NMR spectroscopy. Rapid Communications in Mass Spectrometry 22:2436–2446.
Lewis GD, Farrell L, Wood MJ, Martinovic M, Arany Z, Rowe GC, Souza A, Cheng S, McCabe EL, Yang E, Shi X, Deo R, Roth FP, Asnani A, Rhee EP, Systrom DM, Semigran MJ, Vasan RS, Carr SA, Wang TJ, Sabatine MS, Clish CB, Gerszten RE (2010). Metabolic signatures of exercise in human plasma. Science Translational Medicine 2:33–37.
Llorach-Asuncion R, Jauregui O, Urpi-Sarda M, Andres-Lacueva C (2010). Methodological aspects for metabolome visualization and characterization. A metabolomic evaluation of the 24 h evolution of human urine after cocoa powder consumption. Journal of Pharmaceutical and Biomedical Analysis 51:373–381.

Llorach R, Garcia-Aloy M, Tulipani S, Vázquez-Fresno R, Andres-Lacueva C (2012). Nutrimetabolomic strategies to develop new biomarkers of intake and health effects. Journal of Agricultural and Food Chemistry DOI: 10.1021/jf301142b.
Lloyd AJ, Beckmann M, Fave G, Mathers JC, Draper J (2011a). Proline betaine and its biotransformation products in fasting urine samples are potential biomarkers of habitual citrus fruit consumption. British Journal of Nutrition 106:812–824.
Lloyd AJ, Fave G, Beckmann M, Lin W, Tailliart K, Xie L, Mathers JC, Draper J (2011b). Use of mass spectrometry fingerprinting to identify urinary metabolites after consumption of specific foods. The American Journal of Clinical Nutrition 94:981–991.
Manach C, Hubert J, Llorach R, Scalbert A (2009). The complex links between dietary phytochemicals and human health deciphered by metabolomics. Molecular Nutrition & Food Research 53:1303–1315.
Manna SK, Patterson AD, Yang Q, Krausz KW, Li H, Idle JR, Fornace Jr AJ, Gonzalez FJ (2010). Identification of noninvasive biomarkers for alcohol-induced liver disease using urinary Metabolomics and the Ppara-null Mouse. Journal of Proteome Research 9:4176–4188.
Martin FPJ, Rezzi S, Pere-Trepat E, Kamlage B, Collino S, Leibold E, Kastler J, Rein D, Fay LB, Kochhar S (2009). Metabolic effects of dark chocolate consumption on energy, gut microbiota, and stress-related metabolism in free-living subjects. Journal of Proteome Research 8:5568–5579.
McGhie TK, Rowan DD (2012). Metabolomics for measuring phytochemicals, and assessing human and animal responses to phytochemicals, in food science. Molecular Nutrition & Food Research 56:147–158.
McNiven EMS, German JB, Slupsky CM (2011). Analytical metabolomics: nutritional opportunities for personalized health. Journal of Nutritional Biochemistry 22:995–1002.
Mellert W, Kapp M, Strauss V, Wiemer J, Kamp H, Walk T, Looser R, Prokoudine A, Fabian E, Krennrich G, Herold M, van Ravenzwaay B (2011). Nutritional impact on the plasma metabolome of rats. Toxicology Letters 207:173–181.
Moraes EP, Ruperez FJ, Plaza M, Herrero M, Barbas C (2011). Metabolomic assessment with CE-MS of the nutraceutical effect of Cystoseira spp extracts in an animal model. Electrophoresis 32:2055–2062.
van Ommen B, Bouwman J, Dragsted LO, Drevon CA, Elliott R, de Groot P, Kaput J, Mathers JC, Müller M, Pepping F, Saito J, Scalbert A, Radonjic M, Rocca-Serra P, Travis A, Wopereis S, Evelo CT (2010). Challenges of molecular nutrition research 6: the nutritional phenotype database to store, share and evaluate nutritional systems biology studies. Genes & Nutrition 5:189–203.
Oresic M (2009). Metabolomics, a novel tool for studies of nutrition, metabolism and lipid dysfunction. Nutrition, Metabolism & Cardiovascular Diseases 19:816–824.
O’Sullivan S, Gibney MJ, Brennan L (2010). Dietary intake patterns are reflected in metabolomic profiles: potential role in dietary assessment studies. The American Journal of Clinical Nutrition 93:314–321.
Pan Z, Raftery D (2007). Comparing and combining NMR spectroscopy and mass spectrometry in metabolomics. Analytical and Bioanalytical Chemistry 387:525–527.
Peregrin T (2001). The new frontier of Nutrition Science: Nutrigenomics. Journal of the American Dietetic Association 101:1306–1306.

Petersen M, Taylor MA, Saris WH, Verdich C, Toubro S, Macdonald I, Rössner S, Stich V, Guy-Grand B, Langin D, Martinez JA, Pedersen O, Holst C, Sørensen TI, Astrup A (2006). Randomized, multi-center trial of two hypo-energetic diets in obese subjects: high- versus low-fat content. International Journal of Obesity 30:552–560.
Poroyko V, Morowitz M, Bell T, Ulanov A, Wang M, Donovan S, Bao N, Gu S, Hong L, Alverdy JC, Bergelson J, Liu DC (2011). Diet creates metabolic niches in the “immature gut” that shape microbial communities. Nutricion Hospitalaria 26:1283–1295.
Primrose S, Draper J, Elsom R, Kirkpatrick V, Mathers JC, Seal C, Beckmann M, Haldar S, Beattie JH, Lodge JK, Jenab M, Keun H, Scalbert A (2011). Metabolomics and human nutrition. British Journal of Nutrition 105:1277–1283.
Rhee EP, Gerszten RE (2012). Metabolomics and cardiovascular biomarker discovery. Clinical Chemistry 58:139–147.
Ross SA (2010). Evidence for the relationship between diet and cancer. Experimental Oncology 32:137–142.
Ryan D, Robards K, Prenzler PD, Kendall M (2011). Recent and potential developments in the analysis of urine: a review. Analytica Chimica Acta 684:8–20.
Scalbert A, Brennan L, Fiehn O, Hankemeier T, Kristal BS, van Ommen B, Pujos-Guillot E, Verheij E, Wishart D, Wopereis S (2009). Mass-spectrometry-based metabolomics: limitations and recommendations for future progress with particular focus on nutrition research. Metabolomics 5:435–58.
Sellick CA, Hansen R, Maqsood AR, Dunn WB, Stephens GM, Goodacre R, Dickson AJ (2009). Effective quenching processes for physiologically valid metabolite profiling of suspension cultured mammalian cells. Analytical Chemistry 81:174–83.
Seyfried TN, Shelton LM (2010). Cancer as a metabolic disease. Nutrition & Metabolism 7:1–22.
Shellie RA, Welthagen W, Zrostliková J, Spranger J, Ristow M, Fiehn O, Zimmermann R (2005). Statistical methods for comparing comprehensive two-dimensional gas chromatography-time-of-flight mass spectrometry results: metabolomic analysis of mouse tissue extracts. Journal of Chromatography A 1086:83–90.
Shepherd LVT, McNicol JW, Razzo R, Taylor MA, Davies HV (2006). Assessing the potential for unintended effects in genetically modified potatoes perturbed in metabolic and developmental processes. Targeted analysis of key nutrients and anti-nutrients. Transgenic Research 15:409–425.
Simó C, Ibáñez C, Gómez-Martínez A, Ferragut JA, Cifuentes A (2011). Is metabolomics reachable? Different purification strategies of human colon cancer cells provide different CE-MS metabolite profiles. Electrophoresis 32:1765–1777.
Simons AL, Renouf M, Murphy PA, Hendrich S (2010). Greater apparent absorption of flavonoids is associated with lesser human fecal flavonoid disappearance rates. Journal of Agricultural and Food Chemistry 58:141–147.
Smith CA, O’Maille G, Want EJ, Qin C, Trauger SA, Brandon TR, Custodio DE, Abagyan R, Siuzdak G (2005). METLIN: A metabolite mass spectral database. Therapeutic Drug Monitoring 27:747–751.
Smith CA, Want EJ, O’Maille G, Abagyan R, Siuzdak G (2006). XCMS: processing mass spectrometry data for metabolite profiling using nonlinear peak alignment, matching, and identification. Analytical Chemistry 78:779–787.

Soltow QA, Strobel FH, Mansfield KG, Wachtman L, Park Y, Jones DP (2011). High-performance metabolic profiling with dual chromatography-Fourier-transform mass spectrometry (DC-FTMS) for study of the exposome. DOI: 10.1007/s11306-011-0332-1.
Sugimoto M, Kawakami M, Robert M, Soga T, Tomita M (2012). Bioinformatics tools for mass spectroscopy-based metabolomic data processing and analysis. Current Bioinformatics 7:96–108.
Suhre K, Schmitt-Kopplin P (2008). MassTRIX: mass translator into pathways. Nucleic Acids Research 36:W481–W484.
Trujillo E, Davis C, Milner J (2006). Nutrigenomics, Proteomics, Metabolomics, and the practice of dietetics. Journal of the American Dietetic Association 106:403–413.
Tulipani S, Llorach R, Jauregui O, Lopez-Uriarte P, Garcia-Aloy M, Bullo M, Salas-Salvado J, Andres-Lacueva C (2011). Metabolomics unveils urinary changes in subjects with metabolic syndrome following 12-week nut consumption. Journal of Proteome Research 10:5047–5058.
Volmer M, Northoff S, Scholz S, Thüte T, Büntemeyer H, Noll T (2011). Fast filtration for metabolome sampling of suspended animal cells. Biotechnology Letters 33:495–502.
Vorst O, Vos CH, Lommen A, Staps RV, Visser RG, Bino RJ, Hall RD (2005). A non-directed approach to the differential analysis of multiple LC-MS-derived metabolic profiles. Metabolomics 1:169–180.
Wang H, Tso VK, Slupsky CM, Fedorak RN (2010). Metabolomics and detection of colorectal cancer in humans: a systematic review. Future Oncology 6:1395–1406.
Wikoff WR, Anfora AT, Liu J, Schultz PG, Lesley SA, Peters EC, Siuzdak G (2009). Metabolomics analysis reveals large effects of gut microflora on mammalian blood metabolites. Proceedings of the National Academy of Sciences 106:3698–3703.
Wishart DS (2008a). Metabolomics: applications to food science and nutrition research. Trends in Food Science & Technology 19:482–493.
Wishart DS, Lewis MJ, Morrissey JA, Flegel MD, Jeroncic K, Xiong Y, Cheng D, Eisner R, Gautam B, Tzur D, Sawhney S, Bamforth F, Greiner R, Li L (2008b). The human cerebrospinal fluid metabolome. Journal of Chromatography B 871:164–173.
Wishart DS, Knox C, Guo AC, Eisner R, Young N, Gautam B, Hau DD, Psychogios N, Dong E, Bouatra S, Mandal R, Sinelnikov I, Xia J, Jia L, Cruz JA, Lim E, Sobsey CA, Shrivastava S, Huang P, Liu P, Fang L, Peng J, Fradette R, Cheng D, Tzur D, Clements M, Lewis A, De Souza A, Zuniga A, Dawe M, Xiong Y, Clive D, Greiner R, Nazyrova A, Shaykhutdinov R, Li L, Vogel HJ, Forsythe I (2009). HMDB: a knowledgebase for the human metabolome. Nucleic Acids Research 37:D603–D610.
Wishart DS (2010). Computational approaches to metabolomics. In: Matthiesen R, editor. Bioinformatics Methods in Clinical Research. Clifton, NJ: Humana Press. p 283–313.
Wittwer J, Rubio-Aliaga I, Hoeft B, Bendik I, Weber P, Daniel H (2011). Nutrigenomics in human intervention studies: current status, lessons learned and future perspectives. Molecular Nutrition & Food Research 55:341–358.
Zivkovic AM, German JB (2009). Metabolomics for assessment of nutritional status. Current Opinion in Clinical Nutrition & Metabolic Care 12:501–507.

10 SHAPING THE FUTURE OF PERSONALIZED NUTRITION WITH METABOLOMICS

Max Scherer, Alastair Ross, Sofia Moco, Sebastiano Collino, François-Pierre Martin, Jean-Philippe Godin, Peter Kastenmayer, and Serge Rezzi

10.1 INTRODUCTION

It is well known that nutrition is a cornerstone of health. Over the past years, diet and lifestyle have changed rapidly, influencing the health status of populations. Overnutrition and a shift in dietary patterns toward energy-dense diets high in fat, particularly saturated fat, and lower in unrefined carbohydrates have contributed to the pandemic of chronic diseases including obesity, type 2 diabetes, cardiovascular disease (CVD), and some types of cancer (Collino et al., 2011). Nowadays, nutrition research is focused on improving the health of individuals by providing new tailored dietary patterns (Kussmann and Van Bladeren, 2011). In general, personalized nutrition describes the concept of adapting food to individual needs. Studies have shown that individuals respond differently to various nutrients depending on their genetic makeup, lifestyle, and environmental factors. For example, Ferguson and coworkers (Ferguson et al., 2010) discovered that omega-3 polyunsaturated fatty acids (PUFA) from fish oil, generally considered a “healthy fat,” are more beneficial for individuals with a particular genetic background. This study clearly demonstrates that individuals respond differently to a specific diet, suggesting that generic public dietary advice may not be the most effective way of improving public health. Therefore, the challenge is to provide products according to consumer needs and desired benefits. Personalized nutrition tries to set the individual apart, considering their specific physical and genetic characteristics.

Nutritional metabolomics seeks to capture subtle metabolic changes resulting from different nutritional effects. Usually NMR spectroscopy and mass spectrometry (MS) are used to provide a metabolic inventory of a given biological matrix (Nicholson and Wilson, 2003; Dettmer et al., 2007; Nicholson and Lindon, 2008; Benton et al., 2012). The proton NMR spectrum reports on all hydrogen-containing molecules above the detection limit and can guide further analytical approaches in specific directions (Nicholson and Lindon, 2008). An advantage of NMR is that the sample does not require any preparation prior to the analysis, unlike MS-based metabolomics (Benton et al., 2012). MS-based metabolomics can be used either as a fingerprinting approach, a technique that basically identifies all detectable metabolites (Dettmer et al., 2007), or as a targeted approach, which aims to quantify a subset of metabolites in a given sample (Xiao et al., 2012). Alternatively, the metabolites can be derivatized to make them more volatile for GC-MS analysis (Li et al., 2012). In this chapter, we review state-of-the-art metabolomics technologies in the context of personalized nutrition. Recent advances in biomarker screening to determine nutritional status are also discussed.

Foodomics: Advanced Mass Spectrometry in Modern Food Science and Nutrition, First Edition. Edited by Alejandro Cifuentes. © 2013 John Wiley & Sons, Inc. Published 2013 by John Wiley & Sons, Inc.

10.2 METABOLOMICS TECHNOLOGIES

Metabolomics aims at measuring a large set of metabolites participating either as substrates or products in biochemical reactions. Even today, and despite the enormous performance improvements of analytical techniques, the comprehensive and quantitative analysis of the human metabolome remains a scientific challenge, not only due to instrumentation constraints but also due to the extensive number of molecular species covering a broad dynamic range.

10.2.1 Nuclear Magnetic Resonance

Nowadays NMR-based metabolomics provides efficient high-throughput analysis of biological samples, making it a relatively cost-effective approach. NMR spectroscopy offers the unique prospect of holistically profiling hundreds of metabolites without a priori selection. In proton NMR spectroscopy (1H NMR), all covalently attached protons from mobile molecules within a very high dynamic range of concentrations, that is, from the millimolar to the nanomolar range, are scanned simultaneously, thus providing a biochemical fingerprint of the biological sample. 1H NMR-based metabolomics is generally preferred to metabolomics based on other nuclei such as carbon-13 because of its high sensitivity and the relatively short experimental time needed to acquire metabolic profiles. However, metabolite resonances may overlap heavily within the proton resonance window. In such cases, an ultrahigh magnetic field and/or two-dimensional (2D) NMR spectroscopy can be used to resolve the overlapped resonances.

Urine and blood serum or plasma are the most commonly used biofluids for metabolomics studies due to their intrinsic richness in metabolic information and their relatively easy and noninvasive access. Detailed procedures to collect, store, and prepare biofluids (e.g., urine, serum, and plasma) or tissue samples for NMR analysis have been provided as guidelines for metabolomics (Beckonert et al., 2007). Urine, serum, and plasma usually require minimal pretreatment such as the addition of sodium azide to control bacterial growth, phosphate buffer to control pH-induced shifts in resonance, deuterated water to lock the magnetic field, and TSP (3-(trimethylsilyl)propionate, sodium salt) and DSS (2,2-dimethyl-2-silapentane-5-sulfonate, sodium salt) for chemical shift calibration. The recent introduction of cryoprobes has improved NMR sensitivity by a factor of about 4 relative to conventional room-temperature probes (Keun et al., 2002). Furthermore, high-resolution magic angle spinning NMR (HR-MAS) (Beckonert et al., 2010) offers a unique prospect of generating metabolic profiles from intact tissues, thus preserving the biological integrity of the investigated sample. One of the key features of NMR remains its very good robustness, with reported reproducibility above 98% (Dumas et al., 2006).
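To put the roughly 4-fold cryoprobe sensitivity gain in perspective, the following minimal sketch estimates how many scans are needed to reach the same signal-to-noise ratio on a more sensitive probe. It rests on the standard assumption that signal-to-noise grows with the square root of the number of averaged scans; the function name and the scan count in the example are hypothetical and do not come from the cited studies.

```python
import math


def scans_for_same_snr(scans_room_temp: int, sensitivity_gain: float) -> int:
    """Scans needed on a more sensitive probe to match the original S/N.

    Assumes S/N scales with sqrt(number of scans), so a k-fold gain in
    per-scan S/N reduces the required number of scans by a factor of k**2.
    """
    return max(1, math.ceil(scans_room_temp / sensitivity_gain ** 2))


# Hypothetical example: 256 scans on a room-temperature probe, 4-fold gain.
print(scans_for_same_snr(256, 4.0))
```

Under this assumption a 4-fold sensitivity gain corresponds to a 16-fold reduction in acquisition time for a given signal-to-noise ratio, which is one reason such a gain matters for high-throughput work.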

10.2.2 Gas Chromatography Hyphenated to Mass Spectrometry

Applications of gas chromatography hyphenated to mass spectrometry (GC-MS) for the analysis of metabolites have a long history. They started in the 1970s and 1980s with studies on human metabolic disorders (Horning and Horning, 1971; Tanaka et al., 1980) and continued in the 2000s in plant science (Fiehn et al., 2000; Roessner et al., 2000). As such, metabolite profiling by GC-MS is probably the oldest technique used to screen metabolites in various matrices, and it is very affordable compared with other MS instruments. Today, GC-MS offers a robust solution for studying complex mixtures and has therefore been used extensively in metabolomics following two analytical strategies. On the one hand, GC-MS can perform targeted analysis, for example, the analysis of 20 amino acids or a specific set of organic acids. This approach achieves good accuracy and precision, especially when labeled internal standards are used. On the other hand, GC-MS can also be used for untargeted profiling. This approach provides a detailed chromatographic profile of metabolites in complex biological matrices with their relative or absolute concentrations. Metabolic profiling expands the coverage of the metabolome, but it is still restricted to certain classes of molecules. Reliable detection of 100–200 features in serum or plasma is a common readout using GC-MS (Begley et al., 2009). GC-MS is also considered one of the most effective techniques for urine analysis due to the polarity of the molecules present in urine (Zhang et al., 2007).

Since the 1960s, electron impact ionization (EI) has been standardized at 70 eV, making GC-MS a very effective technique for delivering complete ionization and fragmentation patterns in a very reproducible manner. Thus, by combining GC retention times (or retention indices) and specific EI mass spectra, mass spectral libraries can be built regardless of the instrument manufacturer (Kind et al., 2009).
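The combination of retention index and EI spectrum described above can be sketched as a small library lookup. This is an illustration only, not the procedure of any particular library or vendor software: it assumes the linear (van den Dool and Kratz) retention index for temperature-programmed GC and a plain cosine similarity between spectra, and all compound names, retention times, and intensities are made-up placeholders.

```python
import math


def linear_retention_index(rt, rt_before, rt_after, carbons_before):
    """Linear retention index from the two n-alkanes bracketing the analyte.

    rt_before and rt_after are the retention times of the n-alkanes eluting
    just before and just after the analyte; carbons_before is the carbon
    number of the earlier alkane (the later one has one more carbon).
    """
    return 100.0 * (carbons_before + (rt - rt_before) / (rt_after - rt_before))


def cosine_similarity(spec_a, spec_b):
    """Cosine similarity between two spectra given as {m/z: intensity}."""
    mzs = set(spec_a) | set(spec_b)
    dot = sum(spec_a.get(mz, 0.0) * spec_b.get(mz, 0.0) for mz in mzs)
    norm_a = math.sqrt(sum(v * v for v in spec_a.values()))
    norm_b = math.sqrt(sum(v * v for v in spec_b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def best_match(query_ri, query_spec, library, ri_tolerance=10.0):
    """Best spectral match among library entries inside the RI window."""
    candidates = [
        (name, cosine_similarity(query_spec, spec))
        for name, (ri, spec) in library.items()
        if abs(ri - query_ri) <= ri_tolerance
    ]
    return max(candidates, key=lambda c: c[1], default=(None, 0.0))


# Hypothetical library: name -> (retention index, EI spectrum).
library = {
    "compound_A": (1450.0, {73: 100.0, 147: 40.0}),
    "compound_B": (1452.0, {73: 20.0, 218: 100.0}),
    "compound_C": (1900.0, {73: 100.0, 147: 40.0}),
}

query_ri = linear_retention_index(12.5, 12.0, 13.0, 14)
name, score = best_match(query_ri, {73: 95.0, 147: 42.0}, library)
print(query_ri, name)
```

In this toy case the retention index window removes compound_C even though its spectrum is identical to that of compound_A, which illustrates why the two kinds of information are combined rather than used separately.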
These libraries can be used to identify the chemical structure of unknown metabolites via interpretation of fragment ions and fragmentation patterns. Commercially available databases such as NIST, Wiley, and FiehnLib can be used for that purpose.

GC-MS is ideally suited for volatile components and, after specific derivatization, for nonvolatile components, with a molecular weight limit of around 500 Da. Oximation and silylation reactions (used in combination) are both very popular in metabolomics. They allow derivatization of hydroxyl or amine groups into trimethylsilyl (TMS) or tert-butyldimethylsilyl (TBDMS) derivatives, while carbonyl groups are converted into methoxime (MO) derivatives (Dunn et al., 2011). A major drawback of silylation reagents is their inherent susceptibility to hydrolysis in the presence of traces of residual water. This can, however, be mitigated by using heavier silylation reagents such as N-methyl-N-(tert-butyldimethylsilyl)trifluoroacetamide (MTBSTFA). Another derivatization method of choice uses alkyl chloroformate reagents, which derivatize amine and carboxylic groups (i.e., amino acids, organic acids, and amines). These reagents, although limited to specific classes of molecules, offer a fast (less than 10-min reaction time) and practical solution for direct derivatization in aqueous media (Qiu et al., 2007; Smart et al., 2010). Silylation with N,O-bis(trimethylsilyl)trifluoroacetamide (BSTFA) is also known to convert arginine to ornithine. Moreover, some components are more sensitive to storage and analytical conditions and are easily degraded (i.e., those with amide, thiol, and sulfonic groups).

In addition, fast GC (defined by van Deurse et al. (2000) in terms of peak widths in the range of 1–3 s or below), with either a quadrupole instrument or a fast time-of-flight (TOF) instrument, is currently used to achieve high-throughput analysis. With a quadrupole mass analyzer, 0.15-mm internal diameter GC columns and an optimized scan experiment are suggested (Kirchner et al., 2005). An alternative approach is to use GC-TOF in order to achieve fast acquisition rates (up to 500 Hz) and accurate peak shape and mass data, thus facilitating metabolite identification.

Data processing prior to statistical analysis is also an important step in GC-MS. To date, deconvolution is probably used more often than peak picking. Deconvolution makes use of differences between the mass spectra of coeluting metabolites to separate overlapping peaks. Another aspect of GC-MS rarely described in metabolomics is method validation. In a comprehensive review, Koeck et al. (2011) discussed recommendations on method development, validation, and quality control to circumvent the absence of certified reference materials or official guidelines for the absolute quantitation of many, or all, metabolites.
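The link between fast GC peak widths and detector acquisition rate can be made concrete with a short calculation. The sketch below only multiplies peak width by acquisition rate; the target of about 10 data points across a peak is a commonly cited rule of thumb that is assumed here for illustration, and the 5 Hz scan rate is a hypothetical value rather than a figure from the text.

```python
def points_per_peak(peak_width_s: float, acquisition_rate_hz: float) -> float:
    """Number of spectra recorded across one chromatographic peak."""
    return peak_width_s * acquisition_rate_hz


def minimum_rate_hz(peak_width_s: float, points_needed: float = 10.0) -> float:
    """Acquisition rate needed to record points_needed spectra per peak.

    The default of 10 points is an assumed rule of thumb, not a value
    taken from the chapter.
    """
    return points_needed / peak_width_s


# A 1 s wide fast-GC peak on a TOF acquiring at 500 Hz (rate from the text).
print(points_per_peak(1.0, 500.0))

# The same peak on a scanning instrument at a hypothetical 5 spectra per second.
print(points_per_peak(1.0, 5.0))

# Rate needed to describe a 1 s peak with the assumed 10 points.
print(minimum_rate_hz(1.0))
```

Seen this way, narrow fast-GC peaks leave a slow scanning instrument with only a handful of points per peak, which is why optimized scan settings or a fast TOF analyzer are needed.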

10.2.3 Inductively Coupled Plasma Mass Spectrometry

Inductively coupled plasma mass spectrometry (ICP-MS) is an analytical technique that has developed rapidly during the past 20 years. It has quickly established itself as one of the most useful and versatile techniques for the analysis of minerals and trace elements in biological, food, and environmental samples (Houk and Thompson, 1988; Vandecasteele and Block, 1993; Thomas, 2008). The development of ICP-MS was driven by the possibility of combining the multielement capability and broad linear working range of ICP-AES with the exceptionally low detection limits of graphite furnace atomic absorption spectrometry (GFAAS). ICP-MS can easily handle both simple and complex sample matrices and exhibits detection limits that are superior to those obtained in inductively coupled plasma optical emission spectrometry (ICP-OES), one of the most widely used routine analytical techniques in mineral analysis. In addition to quantitative analysis, ICP-MS also allows measurement of isotope ratios, which has been of great importance for nutritional tracer studies (Crews et al., 1994).

In ICP-MS, a high-temperature plasma ion source at atmospheric pressure is combined with a mass spectrometer under vacuum as a sensitive detector. The inductively coupled plasma is generated by inductive heating of argon with a high-frequency electromagnetic field. Ions that are produced in the plasma are sampled in an axial direction through a narrow hole (approximately 0.7–1.2 mm diameter) into a differentially pumped interface and from there extracted into the mass analyzer. For most types of ICP-MS, a quadrupole is used for mass separation, but high-resolution magnetic sector instruments are also available. Transmitted ions are detected by an off-axis electron multiplier, which can be operated in the pulse-counting and/or analogue mode. Data acquisition can be done in the scanning or peak-jumping (peak-hopping) mode.
In the scanning mode, the mass region with the isotopes of interest is scanned, whereas in the peak-hopping mode, only preselected ions are measured. The most common sample introduction mode is the direct injection of solutions using a nebulizer and a spray chamber. Owing to the high temperature of the plasma, the analyte compounds in the aerosol are efficiently dissociated and atomized, and singly charged positive ions are formed. More than 50 elements are ionized to form M+ ions to an extent of >90%. Unfortunately, peaks from oxide (MO+), doubly charged (M2+), and polyatomic (e.g., ArNa+) ions are also produced from the analyte, the sample matrix, or the solvent. These peaks complicate the spectra and can cause serious spectral interferences if they occur at the masses of singly charged analyte ions (e.g., 40Ar16O+ on 56Fe+). They cannot be resolved using quadrupole analyzers alone but can be minimized by optimizing the instrumental operating conditions or using alternative methods of sample introduction. Additionally, the choice of solvent can help reduce background interferences. For example, dilute HNO3 is preferred over HCl, H2SO4, and H3PO4 for most applications because it produces a simpler background spectrum.

In recent ICP-MS instrumentation, polyatomic interferences can in many cases be eliminated by using quadrupole analyzers in combination with a collision/reaction cell. With this approach, ions enter the interface in the normal manner, where they are extracted under vacuum into a collision/reaction cell that is positioned before the analyzer quadrupole. A collision/reaction gas such as hydrogen or helium is then bled into the cell, which consists of a multipole (a quadrupole, hexapole, or octapole), usually operated in the radio frequency (rf)-only mode.
By a number of different ion–molecule collision and reaction mechanisms, either interfering ions such as 40Ar+, 40Ar16O+, and 38ArH+ are converted to harmless noninterfering species, or the analyte is converted to another ion that is not interfered with. This is exemplified by the reaction below, which shows the use of hydrogen gas to reduce the 38ArH+ polyatomic interference in the determination of 39K+. Hydrogen gas converts 38ArH+ to the harmless H3+ ion and atomic argon, but does not react with the potassium.

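$$^{38}\mathrm{ArH}^{+} + \mathrm{H}_2 \rightarrow \mathrm{H}_3^{+} + \mathrm{Ar}$$

$$^{39}\mathrm{K}^{+} + \mathrm{H}_2 \rightarrow {}^{39}\mathrm{K}^{+} + \mathrm{H}_2 \quad \text{(no reaction)}$$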

An alternative way of eliminating interference is to use the more expensive high-resolution magnetic sector instruments, which allow a mass resolution of up to 10,000. With this technique, 40Ar16O+ can be separated from the 56Fe+ signal. It should be noted, however, that a significant reduction in signal intensity has to be accepted when using higher-resolution settings (e.g., 10,000).

ICP-MS also suffers from matrix effects, that is, matrix-induced changes in the intensity of the ion signal, especially at concentrations of >1 g/L dissolved solids. At high salt concentrations, matrix effects such as ionization suppression or space charge effects can be observed. Several methods, such as sample dilution, matrix matching, use of an internal standard, standard addition, chemical separation, and isotope dilution, are used to address such matrix effects.

The most common means of sample introduction in ICP-MS is nebulization of the sample solution. In recent years, a variety of other methods have been developed. Solid samples can be analyzed directly, without preliminary dissolution, by electrothermal volatilization (ETV) or laser ablation. Gaseous samples such as volatile hydrides (e.g., of Se and As), as well as compounds eluting from a gas chromatograph or HPLC (e.g., Cr3+/Cr6+), can also be introduced directly and efficiently into the ICP.

Detection limits of quadrupole instruments are better than 0.01 µg/L for most elements, thus outperforming ICP-AES (1–100 µg/L). Further advantages include high sample throughput (typically >100 samples/d) and access to isotopic information. The main disadvantages of ICP-MS are the high instrument and running costs (mainly due to the large consumption of pure argon gas) and the existence of isobaric interferences in the low mass range (<80 amu). The latter issue has been addressed by the collision/reaction cell technologies that are now available.
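The resolution figures quoted above can be checked with a short calculation. The sketch below assumes the usual definition of required mass resolution, R = m/Δm, and uses monoisotopic masses rounded from standard isotope tables (electron masses are neglected). The dictionary and function names are illustrative only.

```python
# Monoisotopic masses in unified atomic mass units, rounded from standard
# isotope tables; treat the last digit as approximate.
ISOTOPE_MASS = {
    "1H": 1.007825,
    "16O": 15.994915,
    "38Ar": 37.962732,
    "40Ar": 39.962383,
    "39K": 38.963706,
    "56Fe": 55.934936,
}


def required_resolution(analyte, interferent_isotopes):
    """Resolution m/delta_m needed to separate an analyte ion from a
    polyatomic interferent built from the listed isotopes."""
    m_analyte = ISOTOPE_MASS[analyte]
    m_interferent = sum(ISOTOPE_MASS[iso] for iso in interferent_isotopes)
    return m_analyte / abs(m_interferent - m_analyte)


r_fe = required_resolution("56Fe", ["40Ar", "16O"])  # 40Ar16O+ on 56Fe+
r_k = required_resolution("39K", ["38Ar", "1H"])     # 38ArH+ on 39K+
print(f"56Fe vs 40Ar16O: R of about {r_fe:.0f}")
print(f"39K vs 38ArH:   R of about {r_k:.0f}")
```

With these rounded masses, both pairs require a resolution of a few thousand, which is beyond a unit-resolution quadrupole but within the upper limit of 10,000 quoted for magnetic sector instruments.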

10.2.4 Liquid Chromatography–Mass Spectrometry

Along with NMR, LC-MS is probably the most widely used technique in metabolomics. Compared to NMR, LC-MS allows the detection of a wide range of metabolites, reaching higher sensitivity levels. Essentially, two different approaches have been developed: the detection of all instrumentally accessible metabolites and the detection of specific classes of metabolites, which are named untargeted and targeted methods, respectively.

The detection of metabolites by LC-MS first depends on optimizing the sample preparation, chromatographic separation, and ionization. Sample preparation is perhaps the most underestimated step in metabolomics, yet it is crucial, as it makes the first selection of the targeted metabolite classes according to their chemical properties. Typically, samples are quenched and then extracted, often by liquid extraction with organic solvents or solid-phase extraction, so that any enzymatic activity is stopped and the conservation of the metabolite pool is ensured (Álvarez-Sánchez et al., 2010; Bojko et al., 2011). The integrity and recovery of the metabolome are important parameters, as is repeatability, so that samples can be compared in a robust way. Sample preparation strategies have been described for biofluids (Bruce et al., 2009; Bojko et al., 2011), tissues (Römisch-Margl et al., 2012), and cells (Villas-Boas et al., 2005; Sellick et al., 2009). As with GC-MS, the use of internal and external standards, such as nonendogenous compounds or isotopically labeled species, is recommended whenever possible to better compensate for variability during sample preparation (Wu et al., 2005; Buescher et al., 2010).

The ionization of analytes prior to MS analysis is a prerequisite for the detection of metabolites.
Electrospray ionization (ESI) and atmospheric pressure chemical ionization (APCI), which are atmospheric pressure ionization techniques, are the most used in LC-MS. The chemistry involved in ionization is complex and strongly depends on characteristics of the solvents and additives (volatility, surface tension, viscosity, conductivity, ionic strength, dielectric constant, electrolyte concentration, pH, and gas-phase ion–molecule reactions), the analyte (acid dissociation constant, hydrophobicity, surface activity, ion solvation energy, proton affinity), and operational parameters such as flow rate, temperature, and ESI voltage (Kostiainen and Kauppila, 2009). Nordström and coworkers (Nordström et al., 2008) have applied a multiple ionization strategy, using ESI and APCI in both positive and negative ionization modes, matrix-assisted laser desorption ionization (MALDI), and desorption ionization on silicon (DIOS), for the analysis of human serum. The combination of ionization techniques maximizes the coverage of measured metabolite classes. Recently, two techniques have been introduced for metabolomics application (Harris et al., 2011).

Chromatographic separations are often used to aid isolation of particular metabolites from the biological matrix. Nowadays, a wide range of stationary phases is available to improve both resolution and sensitivity. The central carbon metabolism, including glycolysis, the pentose phosphate pathway, the tricarboxylic acid cycle (TCA cycle), and surrounding metabolic reactions, contains mostly polar compounds such as sugars, sugar phosphates, and organic and amino acids, and has been covered by GC-MS after derivatization (Bruce et al., 2010), ion pairing LC-MS (Buescher et al., 2010), hydrophilic interaction LC-MS (Tolstikov and Fiehn, 2002), and CE-MS (Soga et al., 2009). In addition, a comprehensive analysis of biological lipids can be performed using MS-based lipidomics.
A number of analytical approaches can be deployed to access the lipid inventory of a given biological matrix (Griffiths et al., 2011). Recent advances in LC-MS technology make this field a promising area of biochemical research (Wenk, 2005). Although such approaches are reasonably well established for high-throughput analysis of the major lipid classes (phospholipids, sphingolipids) (Liebisch et al., 1999, 2004, 2006; Schuhmann et al., 2011), they still have to be fine-tuned for the quantification of low-abundance signaling lipids such as sphingosine, sphingosine-1-phosphate, or lysophosphatidic acid (Scherer et al., 2009, 2010, 2011).

Advances in mass spectrometry provide the operator with a wide range of technical solutions for metabolomics. These include quadrupole (Q), TOF, orbitrap, ion trap (IT), and ion cyclotron resonance (ICR) analyzers, as well as combinations (such as QQQ, QTOF, TOF-TOF, IT-orbitrap). Targeted methods typically use QQQ-MS or QTrap-MS, which allow parameters to be optimized for each selected ion pair (typically the molecular ion and an abundant fragment), also called a transition. This acquisition mode, known as selected reaction monitoring (SRM) or multiple reaction monitoring (MRM), can be used for quantitative purposes. IT allows ions to be isolated and fragmented up to n times (with n being the number of collision events), while TOFs, ICRs, and orbitraps provide very high mass accuracy and resolution, which can be useful for metabolite identification purposes. Plasma has been analyzed by untargeted GC-TOF-MS (Bruce et al., 2010), leading to analysis of 46 endogenous metabolites. Using a 12T Fourier transform ICR-MS, about 570 distinct metabolite features, represented by monoisotopic masses above S/N 3 within the mass range m/z 90–570 (a 480 Da mass range), were detected in murine plasma.

10.2.5 Capillary Electrophoresis Hyphenated to Mass Spectrometry

Capillary electrophoresis hyphenated to mass spectrometry (CE-MS) has been established as a versatile tool for metabolomics and food analysis, providing fast, efficient, and automated separation with low consumption of sample volumes and reagents. In addition to GC-MS and LC-MS, which are the main techniques used for metabolomics and foodomics studies, CE-MS has emerged as an advanced and complementary approach to analyze a wide range of compounds, such as amino acids, peptides, proteins, phenolic compounds, carbohydrates, DNA fragments, vitamins, toxins, and pesticides, as well as chiral compounds, in several biological matrices (Castro-Puyana et al., 2012). CE is particularly suitable for the analysis of polar and charged compounds, as compounds are separated on the basis of their charge-to-mass ratio. The separation mechanism of CE differs fundamentally from that of reversed-phase LC and, therefore, CE can provide additional information on the composition of a biological sample (Ramautar et al., 2011). In contrast to LC-MS, samples can be analyzed without laborious and expensive pretreatment. In addition, fused-silica columns are relatively cheap compared to expensive LC columns. CE-MS can be applied as a targeted or nontargeted screening approach, the latter without a priori knowledge of the nature and identity of the metabolites. Today, CE-MS is well suited as a complementary technique for metabolomics and foodomics studies.

10.3 PERSONALIZED NUTRITION

10.3.1 Metabolomics: One of the Keys to Unlock the Potential of Personalized Nutrition?

The growth of the aging population and the rising incidence of chronic diseases raise new challenges for global public health care, in which preventive medicine approaches, particularly by means of optimal nutrition, will be crucial. When looking at the pandemic evolution of complex disorders such as obesity and type 2 diabetes mellitus (T2DM), with an etiology intimately linked to complex interactions between genetic and environmental factors, it becomes obvious that the traditional vision of nutrition as a vector to provide nutrients to the body needs to evolve toward an effective role in maintaining or improving the health of populations. It is in this context that the concept of personalized nutrition, defined as an approach by which individuals can adapt their lifestyle and nutrition to their individual needs, has become particularly popular.

Currently, research emphasis in metabolomics is given to developing new biomarkers of the onset of homeostatic loss, providing a way to identify early metabolic deviations. This would mean that, in the absence of clinical symptoms, corrective nutrition for management of disease risk factors could be started before the need for drug-based intervention (Rezzi et al., 2007a). Efforts are also being made to understand which particular nutrition will suit individuals or groups of individuals. This implies the development of so-called prognostic biomarkers, similar to the concept of pharmacometabolomics developed for personalized drug therapies (Chen et al., 2012). This new field of research, nutrimetabolomics, which focuses on nutrition rather than drug intervention, will aim to determine whether or not an individual will respond to a specific nutritional intervention.
Although it remains in the conceptual phase, the identification and validation of nutrimetabolomic biomarkers can be seen as a way to understand and utilize the full extent of interindividual genetic and gut microbiota variation.

An important aspect of metabolomic biomarkers is that they capture information on metabolic regulation in its totality. This includes surrogate effects, which may or may not be directly linked to the root cause of a pathological process. This feature makes them particularly well suited to generate novel mechanistic insights that could lead to new therapeutic routes for correcting homeostatic loss and promoting metabolic health with personalized nutritional approaches. Lastly, it is highly conceivable that the biomarker categories mentioned above could lead to new molecular targets for rapid and efficient health monitoring purposes. Such health monitoring technologies will enable the demonstration of the health benefits of personalized nutritional solutions to consumers and/or specific consumer groups.

10.3.2 Personalized Nutrition: From Population to the Individual

Personalized nutrition aims to tailor dietary advice and interventions to known differences between people and populations, rather than assuming all people respond in more or less the same manner to different foods. Out of necessity, most nutritional advice is based on what is suitable for the majority of the population. For example, recommended daily intakes for micronutrients are designed to cover the needs of around 97% of the given population. While strategies such as recommended daily intakes are useful tools for getting the message of good nutrition to the general population, there are elements of personalized nutrition built in. In most countries, there are different recommendations for infants, children, adolescents, adults, and the elderly, as well as pregnant and lactating women, and this population stratification can be seen as a form of personalized nutrition.

Other examples of public health nutrition merging with personalized nutrition include general nutritional advice/measures for people living in different geographic locations. Low vitamin D status has been associated with poor health, and vitamin D supplements are especially recommended for dark-skinned people at extreme latitudes (melanin protects against UV radiation, but UV radiation is required to complete the synthesis of vitamin D, which is markedly less efficient among dark-skinned people) (Fiscella et al., 2011; Kahn et al., 2011). In some parts of the world, selenium supplementation is advised because soils are low in selenium, leading to low amounts of selenium in the food supply and, consequently, potentially deficient intakes. Finland, for example, has introduced selenium supplementation of fertilizers for some foods to combat this particular micronutrient deficiency in its population (Pietilainen et al., 2007).
While nutritional advice from public health authorities is getting more specific for subpopulations, it is still a long way from one of the oft-touted ultimate goals of personalized nutrition: to tailor dietary advice for the individual to optimally achieve a lifestyle goal based on diet. Goals may include weight loss/gain (or optimization of body composition), physical performance, or reduction in disease risk. This sort of individual advice would be based on one or a combination of genetic and metabolomic technologies that could highlight predispositions (genetics) and actual deficiencies (metabolomics). Specific dietary advice could then be given based on this information, which should have a tangible effect on a person's well-being.

So how far are we from this goal? To date, commercialized whole genome sequencing is available and coming down rapidly in price. However, the type of advice that is provided based on single-nucleotide polymorphism (SNP) profiling is questionable, given the current lack of understanding of the complexity of multigene-based differences (disease itself cannot serve as the endpoint, since a personalized nutrition approach would ultimately seek to mitigate disease risk rather than treat disease) and the difficulty in replicating SNP associations across different populations and cohorts (Joost et al., 2007; Williams et al., 2008). In contrast to whole genome sequencing, metabolomics may appear to be an ideal tool to address issues in nutrition related to excesses or deficiencies in known nutrients or changes in key diet-related metabolic pathways such as central energy metabolism and gut microbiota metabolism. Metabolomics could be used both as a way of identifying biomarkers/biomarker patterns that could serve as diagnostic tests and as a diagnostic tool in its own right, measuring multiple metabolic pathways simultaneously.
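The statement in this section that recommended daily intakes cover the needs of around 97% of a population can be made concrete with a small calculation. The sketch below assumes, purely for illustration, that individual requirements are normally distributed and that the recommendation is set two standard deviations above the mean requirement. The nutrient values are hypothetical and the function names are not taken from any guideline.

```python
from statistics import NormalDist


def fraction_covered(intake, mean_requirement, sd_requirement):
    """Share of the population whose requirement is at or below `intake`,
    assuming normally distributed requirements (an assumption)."""
    return NormalDist(mean_requirement, sd_requirement).cdf(intake)


def intake_for_coverage(coverage, mean_requirement, sd_requirement):
    """Intake that meets the requirement of the given share of the population."""
    return NormalDist(mean_requirement, sd_requirement).inv_cdf(coverage)


# Hypothetical nutrient: mean requirement 100 units/day, SD 10 units/day.
MEAN_REQ = 100.0
SD_REQ = 10.0

recommended = MEAN_REQ + 2 * SD_REQ
covered = fraction_covered(recommended, MEAN_REQ, SD_REQ)
print(f"Recommendation of {recommended:.0f} units/day covers {covered:.1%} of the population")
```

Under these assumptions the remaining few percent of individuals have requirements above the recommendation, which is one reason population-level advice cannot simply be equated with individual advice.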

10.3.3 Challenges for Using Metabolomics for Personalized Nutrition

A metabolomic readout should give information about the current nutritional state, though this too raises several biological and technical issues. For example, to what degree do various nutrient and biomarker pools fluctuate, or to what degree are they affected by bioavailability kinetics (how many replicate samples are required to get a stable readout of the metabolome)? How responsive are selected biomarkers to a nutritional intervention (seeing an improvement in biomarker profile with a dietary solution is critical for the credibility of personalized nutrition)? Are current general metabolite profiling techniques suitable for specific nutrient-related metabolites? These issues are key challenges that need to be addressed before relevant biomarkers identified by metabolomics can be used for determining nutritional and health status.

While, conceptually, metabolomics could be ideal for personalized nutrition, current metabolic profiling techniques are not necessarily well suited to the analysis of many vitamins and do not cover minerals. Methods to analyze many vitamins in biofluids are highly specialized due to low concentrations and/or poor stability during sample preparation. Mineral profiling techniques such as quantification of individual minerals may seem to be a useful adjunct to personalized nutrition to detect mineral deficiencies, except that the total concentration of a mineral is often not informative about mineral status. An example is iron: iron status is not measured by the amount of iron present in a biofluid but by the amount of one or more proteins that are related to iron transport (e.g., ferritin and zinc protoporphyrin) (Zimmermann and Hurrell, 2007). Current methods (normally enzyme based) for the analysis of these proteins are relatively expensive, even though iron is probably one of the better researched essential minerals.
A personalized nutrition metabolic readout based on targeted analysis of many different nutritionally important compounds could be envisaged with current technologies, though the cost of, and time taken for, running many individual analyses would be prohibitive. However, there are some models that indicate that this type of approach could work, most notably the systems used for drug control at major international sporting events, where turnaround times are typically less than 12 h, though at considerable cost.

The problems surrounding accurate and comprehensive measurement of metabolites related to nutrition may in time be solved via the use of surrogate markers that are relatively stable and can be measured using profiling techniques. Advances in sensitivity (especially for mass spectrometers) will widen the number of candidate compounds, though considerable work will be required to tease out suitable surrogate molecules and understand their metabolism to enable adequate interpretation. Realistically, the goal of individual personalized nutrition is some distance away, but knowledge allowing the provision of better advice based on subpopulations already exists, and increasing knowledge on the integration of omics technologies such as metabolomics and genomics will allow these subpopulations to decrease in size.

10.3.4 Challenges for Deciphering Nutritional Biomarkers

Nutrition increasingly relies not only on the characterization of food ingredients but also on nutritional biomarkers. In fact, biomarkers of nutrition often tend to be more informative, as they mirror the impact of particular ingredients on metabolism and health. However, there are certain fundamental limitations. It is no surprise that individuals exhibit different physiologies, even if challenged with analogous diets. Nutritional studies need to deal with a large interindividual variability, especially in human cohorts. The establishment of baselines by introducing control groups in nutritional studies is pivotal in evaluating nutritional outcomes. Lifestyle, age, gender, genetic and epigenetic background, food habits, gut microbiota, and body composition are some of the variables that characterize the discrepancies between

[FIGURE 10.1 The study of human nutrition: the food is the input, while biofluids are the nutritional readouts. Lifestyle, age, gender, genetic, and epigenetic background are some of the variables characterizing individuals. Interlaboratory harmonization and standardization of sampling conditions and analytical setup are crucial to ensure reliable readouts. Diagram labels: food, nutrient (N), individual, nutritional biomarker (NB). Factors listed in the diagram: metabolism, physiology, absorption, age, gender, compartmentalization, cooking, transport, food combinations, drugs; time, purpose of the study, sampling (blood, urine, etc.), analysis, storage of samples.]

individuals' metabolism. The absorption of nutrients is influenced by food preparation and cooking, as well as food combinations (Fig. 10.1). The breakdown, transport, and temporal effects of food ingredients in the human body are complex, given its compartmentalization and organ organization, leading to differential biotransformation, partition, availability, and excretion (Potischman, 2003). Another major challenge is related to the quantification of a nutritional effect, especially on health (as opposed to disease effects). A definition of health is difficult to establish, which makes an improvement in health status difficult to evaluate. Moreover, differences between subjects are often much larger than differences between nutritional interventions. Nutrition provokes changes of minimal magnitude, as a result of multiple minor genetic differences, low gene expression, and protein alterations, modulated by numerous ingredients, with an immense number of confounding factors (van Ommen, 2007). Lastly, given that human studies are confined to obtaining minimally intrusive samples such as blood, urine, and stool samples, only an indirect readout of the mechanisms of action is possible. Variations may also arise from the aim, design, and experimental setup of the study. To sum up, there are concrete challenges in studying nutritional effects in humans, but these should not hinder the scientific community from establishing general observations and tendencies and from seeking novel nutritional biomarkers (Fig. 10.1).

The identification of biomarkers requires systems-approach strategies, such as omics technologies, that establish potential metabolic differences between a control status and an intervention status. However, often enough there is a pattern of responders and nonresponders to a given nutritional intervention. For example, blood β-carotene and vitamin A responses to oral β-carotene are variable in humans (van Ommen, 2007). In this and other cases, one can identify a parallel between nutrition and drug interventions. In fact, the line between ingredient and drug can often be thin, as both can be active substances that interact with receptors and therefore modulate metabolism. In this respect, nutritional and medical biomarkers follow the same discovery path. In both cases, the adequate combination of drugs or ingredients should be administered to the (group of) subjects to achieve the best health improvement. Drug discovery has until now followed a one-drug-to-one-target-to-whole-population approach, but it is starting to rely on systems approaches, so-called systems pharmacology, to deal with long-observed phenomena such as low drug responsiveness in some individuals. Van der Greef and McBurney (2005) discuss the importance of in vivo studies and the monitoring of drug effects by system response profiles (SRPs), by looking at drug versus placebo and disease versus healthy SRPs when challenged with different drugs. This strategy would pinpoint not only primary effects but also possible side effects. In addition, the combination of drugs is proposed there as a strategy to increase responsiveness, by using, for example, natural products.

While there is much to be learned from pharmaceutical research, there are some fundamental differences that will require tailored approaches for applying systems biology to develop personalized nutrition.
While drugs tend to be well controlled in terms of dose and intake and are generally compounds that are not regularly handled by humans, food is highly heterogeneous, especially in micronutrient and phytochemical content, and our metabolic systems have evolved to handle most components present in food. Thus, while drugs may elicit a strong acute metabolic response, changes in diet may not necessarily lead to strong metabolic changes over a short-term period. Exceptions to this can be seen for foods that are relatively novel from an evolutionary perspective, such as cocoa (Martin et al., 2009a).

Nutritional studies need systems-wide approaches, such as metabolomics, to deal with population heterogeneity (genome, epigenome, gut microbiome), diet heterogeneity and complexity (as multi-ingredient ensembles), and low-amplitude effects on metabolism. There remains much that is unknown or controversial, and there is much to be learned from the experience of pharmaceutical research in this field. However, fundamental differences between drugs and food also mean that novel approaches will be needed.

10.3.5 Reality versus Dreams in Personalized Nutrition: Examples with Biomarkers of Intake

While much of current nutrition thinking revolves around adequacy of macro- and micronutrients, people generally eat food. Thus, one approach to avoid problems with the difficult analysis of some vitamins is to instead use metabolomics to identify whether intake of various foods is adequate, too low, or too high. The area of biomarkers of intake (chemicals or compounds that mirror food intake) has existed for several decades, beginning with the use of urinary nitrogen measurements to estimate protein intake. Present developments of biomarkers of food intake include the use of alkylresorcinols as biomarkers of wholegrain cereal intake (Ross, 2012), various carotenoids for fruit and vegetable intake (Al-Delaimy et al., 2005; McNaughton et al., 2005), and proline betaine as a biomarker of citrus intake (Heinzmann et al., 2010). Studies on using these compounds as biomarkers can be informative about the challenges faced when trying to accurately estimate adequacy of food intake based on a metabolic profile. Generally, biomarkers of food intake have correlations with food intake estimated from food records on the order of 0.2–0.5. Even in cases where the biomarker compound can only come from one food source, such as lycopene from tomatoes or alkylresorcinols from wheat and rye, correlations are only moderate. This appears to be due to interindividual variation in response to the same dose of the biomarker compound. Interindividual variability is due not only to individual differences in genetic background but also to environment, stress, gut microbiota, and diet.
Lipid-soluble compounds are better absorbed when consumed with meals rich in lipids, while some plasma metabolites proposed as biomarkers, such as the mammalian lignans enterolactone and enterodiol (proposed biomarkers of a diet rich in plant foods), are dependent on gut microbiota and all but disappear after antibiotic treatment. Differing pharmacokinetics also need to be considered. Proline betaine is highly specific for citrus fruit intake but has a very short half-life, with most being excreted within 14 h (Heinzmann et al., 2010), so a morning spot urine sample would be unlikely to contain any evidence of citrus consumption from breakfast or lunch the day before, and a 24 h urine collection would be needed. Alkylresorcinols and their metabolites have a longer half-life (Landberg et al., 2006; Soderholm et al., 2011) and, because they can be stored in adipose tissue, tend to reach an equilibrium with plasma that is relatively stable, meaning that small variations in an individual's intake will not have a major impact on interpretation. Food intake is also generally stable, especially for the staple foods that have a major impact on dietary pattern. Correlation studies find that, at best, dietary intake of the biomarker compound explains only around 36% of observed variation (i.e., a correlation of 0.6).

So how can the relatively low correlation between food intake and intake biomarker response be explained? Gender was confirmed to be a key predictor of plasma alkylresorcinol concentrations, along with plasma lipids (nonesterified fatty acids and triglycerides, but not cholesterol) (Ross et al., 2012). Even accounting for all known factors, and using controlled feeding conditions, diet explains only around a third of the variation between individuals.
The explanation for the remaining variation can only be speculated about at this stage, though differences in absorption and metabolic rate (i.e., rate of the metabolism of the parent compounds) may explain a portion of this, while differences in sampling time between subjects may lead to a large amount of variation (higher amounts of circulating lipids would potentially falsely elevate plasma concentrations of alkylresorcinols). A recent overview of factors affecting energy balance highlighted that even an area as fundamental to nutrition as variation in energy output in response to diet and lifestyle changes is poorly understood (Hall et al., 2012). Time since last meal is also important, especially if compounds are transported in lipoproteins. Recent studies on plasma alkylresorcinols suggest that use of nonfasting plasma samples leads to compromised results (Landberg et al., 2006; Andersson et al., 2011). It should be noted here that while intake biomarkers in general are not excellent predictors of individual intake, they do function well for ranking subjects in epidemiological cohorts: the correlation of mean alkylresorcinol concentrations with alkylresorcinol (wholegrain wheat and rye) intake across different studies is 0.89, even though correlations within individual studies ranged from 0.25 to 0.58 (Ross, 2012). This illustrates the challenge of moving from research studying biomarkers in large populations to providing meaningful advice for the individual. Even in an ideal scenario for using biomarkers as surrogate measures of food intake, where the only source of the biomarker is the food group of interest, many factors lead to high interindividual variation, even if intraindividual variation may be low under controlled conditions (Landberg et al., 2009).
The main sources of interindividual variation need to be identified before metabolic biomarkers can be reliably used to estimate food intake in the type of scenario envisaged for individualized nutrition recommendations. A future strategy will be to identify other metabolites that predict individual response and can be used to correct individual responses to approximate these to population means.
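The contrast drawn in this section, between modest correlations at the level of the individual and a much stronger correlation between group means, can be reproduced with a small simulation. The sketch below is purely illustrative: the study means, the standard deviations, and the additive noise model are invented assumptions, not values taken from Ross (2012) or any other cited work.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def simulate(study_mean_intakes, n_per_study=500, seed=1):
    """Simulate several cohorts (arbitrary units, hypothetical model).

    Within a cohort, intake varies a little around the cohort mean,
    while the biomarker response carries larger interindividual noise
    (standing in for absorption, metabolism, and sampling time).
    """
    rng = random.Random(seed)
    within, mean_intake, mean_marker = [], [], []
    for mu in study_mean_intakes:
        intake = [rng.gauss(mu, 5.0) for _ in range(n_per_study)]
        marker = [x + rng.gauss(0.0, 10.0) for x in intake]
        within.append(pearson(intake, marker))
        mean_intake.append(sum(intake) / n_per_study)
        mean_marker.append(sum(marker) / n_per_study)
    return within, pearson(mean_intake, mean_marker)


within_r, between_r = simulate([20, 30, 40, 50, 60, 70, 80, 90])
print("within-study r:", [round(r, 2) for r in within_r])
print("r of study means:", round(between_r, 3))
```

Averaging over many subjects cancels most of the individual noise, so the study means track intake closely even though no single subject's biomarker value is a reliable guide to that subject's intake. This is why such biomarkers can rank groups well while remaining weak tools for individual advice.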

10.3.6 Moving from Global-Population-Based Biomarkers to Individual Biomarkers

Over the past three decades, the majority of metabolomics studies have been centered on identifying metabolic changes related to diseases or drug/toxic substance interventions (Nicholson et al., 2002; Lindon et al., 2003; Nicholson, 2006). Only relatively recently has metabolomics been applied in the area of food and nutrition research. There are two main aspects to nutritional metabolomics, analogous to other fields of human metabolomics research: one is to understand global metabolic responses to food intake (i.e., population-level understanding with a focus on biochemical mechanisms) and the other is to understand individual responses to diet. While the latter is intuitively important for personalized nutrition, understanding the role of different biomarkers and their interactions is critical for interpreting biomarker responses. The sheer complexity of interactions among the many food components, coupled with endogenous factors such as genetics and epigenetics, in situ factors (chiefly gut microbiota status), as well as exogenous factors (lifestyle, diet history, and environment), makes this field highly challenging.

“Traditional” nutrition research has been based on a combination of population-based studies looking at associations between disease outcomes and reported dietary intake (nutritional epidemiology) and smaller-scale intervention studies to investigate mechanisms. This approach is useful for determining population-based dietary advice, but less informative for individual responses to diet. Nutritional metabolomics has followed a similar evolution to date. Recently, Holmes et al.
used metabolomics in the context of a large-scale epidemiological study to identify metabolic signatures across and within selected human populations in relation to geography, diet-related disease risk factors, and CVD prevalence (Holmes et al., 2008). Urinary metabolite excretion patterns differed between East Asian and Western populations, as well as between genetically related individuals living in different geographic areas (e.g., ethnic Japanese living in Japan or the United States, and Han Chinese living in the northern or southern regions of China). This metabolomics epidemiology study found that urinary excretion of formate was inversely correlated with blood pressure across different populations, in spite of genetic and lifestyle diversity. Due to the relative expense of running such metabolomic epidemiology studies, they are still rare, but represent a potentially powerful approach for determining new biomarker targets relevant to nutrition and disease.

Smaller intervention and case control studies allow for more detailed investigation of biomarkers related to health and disease. As an example, metabolic syndrome, a combination of risk factors related to obesity, insulin resistance, and dyslipidemia, is a major public health concern. Its increase and prevention are related to diet, making it an ideal field for nutritional metabolomics (Wirfalt et al., 2001; Grundy et al., 2005). While energy imbalance is the primary cause of metabolic syndrome, individual responses vary widely: many obese people do not have metabolic syndrome, while many people are insulin resistant, but are not classed as obese. Several metabolomics studies have provided novel insights into metabolic pathways harboring different levels of dysregulation during the development of metabolic syndrome.
The potential of such findings for population stratification and diagnosis was also successfully demonstrated. These metabolic readouts also offer promising molecular targets with which to assess individual disease risk in healthy populations or nutritional impact on delaying the onset of metabolic dysregulation and disease development.

Several studies have used metabolomics to find novel biomarkers related to metabolic syndrome risk factors, in addition to those used to diagnose the condition. These include branched-chain amino acids (BCAAs) contributing to insulin resistance in obese human subjects (Newgard et al., 2009) and 3-indoxyl sulfate (linked to kidney dysfunction), indicators of shifts in lipid metabolism (glycerophospholipids and free fatty acids) and markers of changes in bile acid metabolism (Suhre et al., 2010). Predictive markers of diabetes in plasma have been determined from prospective studies, underlining a role for both aromatic amino acids and BCAAs in insulin resistance (Rhee et al., 2011; Wang et al., 2011a). Using a combination of three amino acids (isoleucine, phenylalanine, tyrosine), it was possible to predict future development of diabetes (>5-fold higher risk for individuals in the top quartile) (Wang et al., 2011a). In a second report, the authors evaluated the specific inter-relationships between dyslipidemia and the development of insulin resistance (Rhee et al., 2011), reporting that lipids of lower carbon number and fewer double bonds were associated with an increased risk of type 2 diabetes. Again, a combination of two specific compounds (two triacylglycerols) further improved prediction of diabetes incidence. Metabolomics was also able to determine indicators of early onset of prediabetes, marked by alterations in fatty acid, tryptophan, uric acid, bile acid, and gut microbial metabolism (Zhao et al., 2010).
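To make the idea of a multi-metabolite predictor and a "top quartile" risk comparison concrete, the following sketch combines three amino acid measurements into a single score, ranks subjects into quartiles, and computes a relative risk. It is a hedged illustration only: summing z-scores is an assumed way of combining the markers (the method used by Wang et al. is not described here), and all concentrations and case counts are invented.

```python
import statistics


def z_scores(values):
    """Standardize values to mean 0 and sample standard deviation 1."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


def composite_scores(*metabolites):
    """Sum the z-scores of several metabolites, subject by subject."""
    standardized = [z_scores(m) for m in metabolites]
    return [sum(zs) for zs in zip(*standardized)]


def quartiles(scores):
    """Label each subject 1..4 by rank of score (4 = highest quarter)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    labels = [0] * len(scores)
    for rank, i in enumerate(order):
        labels[i] = 1 + (4 * rank) // len(scores)
    return labels


def relative_risk(cases_a, n_a, cases_b, n_b):
    """Risk in group A divided by risk in group B."""
    return (cases_a / n_a) / (cases_b / n_b)


# Invented plasma concentrations (arbitrary units) for eight subjects.
isoleucine = [50, 55, 60, 65, 70, 75, 80, 85]
phenylalanine = [40, 42, 45, 47, 50, 53, 55, 58]
tyrosine = [45, 48, 50, 54, 57, 60, 62, 66]

labels = quartiles(composite_scores(isoleucine, phenylalanine, tyrosine))
print("quartile per subject:", labels)

# Hypothetical follow-up: 25 of 100 top-quartile subjects develop
# diabetes versus 5 of 100 bottom-quartile subjects.
print("relative risk:", round(relative_risk(25, 100, 5, 100), 2))
```

In a real cohort the quartile boundaries would be fixed at baseline and incidence counted over follow-up; the hypothetical counts above simply show what a fivefold difference in risk between the extreme quartiles looks like numerically.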
Gut microbiota appear to be involved in many pathways related to health and disease and represent an important “organ” where dietary intervention may play an important role. In a study looking at the relationships between gut microbial metabolism of dietary phosphatidylcholine and cardiovascular pathogenesis in humans, metabolomics analysis indicated that circulating concentrations of choline, trimethylamine oxide (TMAO), and glycine betaine were predictive of cardiovascular events (Wang et al., 2011b). These three metabolites, and TMAO in particular, are related to gut microbiota function in many studies and suggest that modulating gut microbiota through diet could be a future strategy for prevention of CVD.

Existing therapeutic strategies for metabolic syndrome lie in a combination of improved dietary habits and lifestyle coupled with drug interventions. These generally yield only marginal benefits for morbidly obese patients, and it appears as though the only rapidly effective treatment for morbid obesity and type 2 diabetes is bariatric surgery. There are considerable barriers to widespread use of bariatric surgery (up-front cost, mortality during the procedure, considerable psychological adjustment postprocedure, as well as the philosophical dilemma of offering a solution for a condition popularly associated with lack of self-control). Thus, preventative approaches including tailor-made weight management programs may be more sustainable. A key component of understanding success of weight loss/management programs is body composition, rather than assumptions based on body mass index (BMI). For example, body fat distribution, and visceral fat in particular, has been demonstrated to be a key determinant of increased risk of CVD (Lapidus et al., 1984; Larsson et al., 1984; Donahue and Abbott, 1987), diabetes (Hartz et al., 1983; Kalkhoff et al., 1983), hypertension (Despres et al., 1988), nonalcoholic fatty liver disease (Park et al., 2007), and a higher risk of all cause mortality (Folsom et al., 1993).
Body composition assessment can range from simply quantifying fat-free mass and total body water through to quantifying the specific location of adipose deposits. Over the past decades, there has been an increasing awareness that excess fat stored in the trunk or android regions could be metabolically less healthy and elevate the risk of type 2 diabetes (T2DM) and CVD compared to proportionally more fat stored in the gynoid area. This difference in fat distribution appears to lead to a difference in disposition to insulin resistance (Wirfalt et al., 2001; Grundy et al., 2005). Many studies have previously described the role of intra-abdominal fat accumulation in the development of insulin resistance. In particular, reduction of visceral adipose tissue (VAT), a major compartment of visceral adiposity, significantly restores glucose and insulin toward normal levels in humans (Dulloo and Montani, 2010). Moreover, visceral adiposity also includes fat deposition within and around other tissues and organs, also called ectopic fat deposition, which can also impair metabolic homeostasis. For instance, intracellular lipid accumulations in endocrine pancreas, liver, and skeletal muscle cells have all been described and contribute to the pathogenesis of impaired insulin secretion and insulin resistance (Dulloo and Montani, 2010). In contrast to upper body obesity, gluteofemoral adipose tissue mass is associated with a favorable lipid and glucose profile, as well as with a decrease in cardiovascular and metabolic risk (Manolopoulos et al., 2010). This tissue counterbalances the metabolism of visceral adiposity, through long-term entrapment of excess fatty acids, thus protecting from the adverse effects associated with ectopic fat deposition. However, there are many individuals apparently at risk of diabetes due to their obesogenic and diabetogenic environment and phenotype who remain metabolically healthy (Wildman et al., 2008).
Understanding how the underlying metabolic processes contribute to individual predisposition, development and maintenance of physiological dysregulation may offer new avenues to develop future therapeutic and nutraceutical interventions aiming to prevent, delay, or normalize, at least partially, these metabolic processes.

In parallel with work on a deeper understanding of disease phenotypes, significant scientific effort has been put into identifying specific disease risk genotypes. While many genes and transcription factors associated with fat storage and obesity have been determined (Klannemark et al., 1998; Viguerie et al., 2005a, 2005b; Sorensen et al., 2006; Clement and Langin, 2007), the molecular determinants of obesity are not fully characterized. Genetic work suggests that both obesity and metabolic syndrome are heritable (Teran-Garcia and Bouchard, 2007), though this does not fully explain observed phenotypic differences. The recent combination of metabolomics with genome-wide association studies (GWAS) is potentially a powerful tool to explore disease-related metabolic deregulations and interactions between environmental exposure, lifestyle, genetic predisposition, and actual metabolic phenotype at both the individual and population scale. The identification of metabolic signatures associated with specific genotypes in free-living populations remains challenging due to the relatively low amplitude of the associations when compared to inherent intra- and interindividual variability in the data that cannot easily be corrected for (Griffin, 2004). However, such genotype-metabolic phenotype associations offer unprecedented opportunities to link genetic predisposition with specific metabolic dysregulation, which can serve as targets for future therapeutic and nutraceutical management solutions.

Gieger et al. first combined SNP-based genotypes with targeted metabolic phenotypes of serum samples (Gieger et al., 2008).
The study highlighted a specific association with the FADS1 gene, which codes for the fatty acid delta-5 desaturase, a key enzyme in the metabolism of long-chain PUFA. The SNP of this gene was associated with different serum levels of phosphatidylcholines, plasmalogen/plasminogen, and phosphatidylinositol, which can be readily ascribed to changes in the efficiency of the delta-5 desaturase reaction. Other investigations have demonstrated that urinary metabolic profiles can be combined with GWAS for probing possible genetic causes behind metabolic traits (Suhre et al., 2011). In particular, the authors described a very strong association between the AGXT2 gene (coding for alanine-glyoxylate aminotransferase-2) and the urinary excretion of β-aminoisobutyrate, in agreement with its function as carrier involved in hyper-β-aminoisobutyric aciduria. The information contained in the metabolic traits can be enriched with the integration of data from both urine and plasma and may lead to new associations between the metabolome and genome (Nicholson et al., 2011). In this instance, two gut microbiota metabolites, urinary trimethylamine (TMA) and plasma dimethylamine, were associated with the pyridine nucleotide-disulphide oxidoreductase gene PYROXD2. TMA is derived from phosphatidylcholine (PC), which can be derived from the diet mainly via meat, milk, fish, and eggs. Gastric enzymes release the choline moiety from PC and gut bacteria convert choline into TMA, which is then absorbed (Zeisel et al., 1983; Zeisel and Blusztajn, 1994). Trimethylamine oxide (TMAO), the oxidation product of TMA by flavin-containing monooxygenases (FMO), has been recently reported as a potential risk factor for CVD (Wang et al., 2011b). The association between TMA and the PYROXD2 gene is of importance as it suggests how the conversion of TMA to TMAO can be related to key hepatic functions that are influenced by different genotypes and may predispose individuals to specific disease risks. Such approaches offer novel avenues to screen individuals for specific predispositions and determine candidate metabolic targets for tailor-made nutritional management programs.

An increasing number of studies also demonstrate how the gut microbiota has a profound impact on multiple host cell metabolic pathways with implications for health and nutritional outcomes (Martin et al., 2007; Clayton et al., 2009; Wikoff et al., 2009). The whole gut functional ecosystem itself is dynamic and varies with host age, diet, and health status. For example, there is a possible association of human gut microbial symbionts with the incidence of obesity (Ley et al., 2006; Turnbaugh et al., 2006; Backhed et al., 2007). Systems biology approaches have emerged over the last two decades as a novel way forward to provide insights into the role of mammalian gut microbial metabolic interactions in individual susceptibility to health and disease outcomes. Therefore, current and future emphasis toward personalization of health care and nutritional programs is dependent not only on the host but also on the functional modulation of the gut microbiota–host metabolic interactions.

Health status and having a low disease risk are determined by multiple genetic and environmental factors, among which nutrition, as part of lifestyle, plays a key role. In particular, dietary habits not only influence present health but also individual predisposition to disease risk and may have long-term health consequences.
It is therefore highly likely that it will be possible to provide consumers with tailor-made nutritional recommendations adapted to their specific metabolic requirements (Rezzi et al., 2007b). While many applications of approaches such as metabolomics combined with GWAS can be easily envisioned in the near future to help develop personalized nutrition programs, nutrimetabolomics remains an extremely complex science. Food-induced metabolic changes are not only the end results of many complex interactions among many endogenous and exogenous molecules but also vary widely among individuals due to genetic and environmental factors which influence individual responses. A significant challenge therefore remains in generalizing and validating specific metabolic profile biomarkers in various populations and in defining their context of use at the individual scale for personal monitoring. Many critical lessons can be learnt from the successes and failures encountered in the development of novel pharmaceutical drugs.

10.3.7 Understanding Population and Individual Metabolic Needs and Predisposition Will Pave the Way Forward for Future Nutritional Applications

Identifying metabolic targets/dysregulation that could respond to an adapted nutrition program is only part of the task of personalized nutrition. It is also essential to have a credible, science-based set of nutrition solutions that can improve biomarkers that are identified as being out of range. This could include tailored dietary recommendations and meal plans, prepared meals, and specific nutrient supplements.

Building knowledge on the metabolic effects of specific foods and bioactives and, more importantly, comprehensively documenting the metabolic processes associated with individual response are critical in the development of such solutions. Using a single or a combination of bioactives/foods, the aim is that it should be possible to modulate specific metabolic processes to restore or maintain the metabolic homeostasis of an individual. Recent metabolomic applications in nutritional research, both in clinical and preclinical studies, have furthered our current understanding of individual needs and of the functionalities of foods and nutrients in target populations.

A number of metabolomics studies have indicated that different foods can modulate diverse areas of human metabolism. For example, supplementation with soy led to clear differences in premenopausal women for plasma lipoproteins, isoleucine, valine, triglycerides, choline, and carbohydrates, indicating an overall shift in energy metabolism favoring lipid metabolism, supported by urine profile results (Solanky et al., 2003, 2005). Based on this type of information, it could be envisioned that a soy- or soy-isoflavone–based “treatment” could be useful for helping to correct abnormalities in lipid metabolism. Further targeted studies would be needed to test the efficacy of soy in these circumstances before recommendations could be made on this basis.

Several metabolomics studies have demonstrated that differences in diet or psychological state can be detected via changes in the metabolic profile. Stella et al. used metabolomics to characterize the metabolic effects of vegetarian, low-meat and high-meat diets in humans (Stella et al., 2006). Urinary metabolic profiles showed specific signatures according to the diets.
In particular, higher urinary levels of creatine, creatinine, carnitine, acetylcarnitine, taurine, TMAO, and glutamine characterized the metabolic signature of the high-meat diet, whilst the vegetarian diet was associated with higher urinary excretion of p-hydroxyphenylacetate, a microbial mammalian co-metabolite, and a decreased level of N,N,N-trimethyllysine. Overall dietary patterns could also be distinguished using metabolomics; in a cohort of Danish twins, principal component analysis (PCA) of the plasma metabolic profile was strongly linked to key dietary patterns, including overall energy intake, preference for traditional foods, and consumption of low-fat or high-fat dairy products (Pere-Trepat et al., 2010). Metabolic profiling of urine found that “chocolate lovers” have a specific energy metabolism and harbor a gut microbiota with different activities to people who are ambivalent toward chocolate, possibly leading to long-term health differences.

In addition to differences in diet or food preferences, it appears as though metabolomics can also detect metabolic differences between different psychological states. In a study where subjects were segmented according to self-perceived anxiety (low or high anxiety), the high-anxiety trait was associated with elevated excretion of hormones, such as the glucocorticoids and catecholamines, with potential modulation of gluconeogenesis by the catecholamines. Surprisingly, the study also highlighted changes in the urinary excretion of the specific mammalian microbial co-metabolites hippurate and p-cresol sulfate, illustrating how life stress may impact on gut microbiota metabolism. These subjects then undertook a daily dark chocolate intervention, resulting in subtle cumulative metabolic effects over a 2-wk period only in high-anxiety trait subjects.
Consumption of dark chocolate resulted in the decrease of urinary excretion of catecholamines, corticosterone, and the stress hormone cortisol, as well as gut microbiota co-metabolites hippurate and p-cresol sulfate, in subjects with high-trait anxiety, suggesting that metabolomics can not only detect differences in stress state but also monitor improvements, in this case due to a dietary intervention.

Many systems biology approaches have highlighted the importance of gut microbiota for the normal functioning of mammals, especially with regard to metabolism (Martin et al., 2007; Clayton et al., 2009; Martin et al., 2009b; Claus et al., 2011; Merrifield et al., 2011; Mestdagh et al., 2011; Swann et al., 2011). Preclinical work suggests that changes to gut microbiota may have overall systemic effects and be involved in regulating energy metabolism, with consequent effects on gut microbiota (Martin et al., 2007, 2009b; Clayton et al., 2009; Claus et al., 2011; Merrifield et al., 2011; Mestdagh et al., 2011; Swann et al., 2011). Consequently, the microbiome is a key nutritional target today and might also become the foundation of future drug targeting and interventions (Nicholson et al., 2005; Jia et al., 2008; Zheng et al., 2011).

The rise of multifactorial disorders such as obesity, irritable bowel syndrome, and inflammatory bowel disease, among others that are associated with gut microbiota, highlights the need to model the web of metabolic interactions between genetics, metabolism, environmental factors, lifestyle, and nutrition. A large variety of biomarkers, based on a concept of a metabolic pattern or signature, are increasingly being proposed for various diseases, as illustrated above for several features of metabolic syndrome. The development of systems biology approaches and the new generation of biomarker patterns will provide the opportunity to associate metabolism to the etiology of multifactorial diseases.
It is hoped that this will subsequently lead to the development of whole system mechanistic hypotheses that will result in improved tailored nutritional advice and products that cater for the needs of subpopulations and ultimately individuals.

As metabolic diseases are often multifactorial, global personalized nutrition solutions, as proposed here, may be an ideal panacea, using food, as Hippocrates suggested, as a well-tolerated medicine with multifactorial effects.
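Several of the studies summarized in this section rely on principal component analysis (PCA) to show that metabolic profiles group according to diet. The following is a minimal sketch of how a first principal component can separate two dietary groups, assuming a plain power-iteration implementation rather than the software used in the cited studies. The three metabolites are those named above as distinguishing high-meat from vegetarian diets, but every intensity value is invented for illustration.

```python
import math


def first_principal_component(rows, iterations=200):
    """Return (direction, scores) for the first principal component.

    rows: list of samples, each a list of metabolite intensities.
    Uses power iteration on the covariance matrix of the mean-centered
    data; a sketch that assumes the data are not constant and that the
    start vector is not orthogonal to the leading component.
    """
    n, p = len(rows), len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(p)]
    centered = [[row[j] - means[j] for j in range(p)] for row in rows]
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(p)]
           for a in range(p)]
    v = [1.0 / (j + 1) for j in range(p)]  # asymmetric start vector
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    scores = [sum(c[j] * v[j] for j in range(p)) for c in centered]
    return v, scores


# Invented urinary intensities (arbitrary units); the columns are
# creatine, carnitine, and p-hydroxyphenylacetate.
profiles = [
    [9.0, 8.0, 2.0],    # high-meat diet
    [10.0, 9.0, 1.0],   # high-meat diet
    [11.0, 9.0, 2.0],   # high-meat diet
    [2.0, 1.0, 8.0],    # vegetarian diet
    [1.0, 2.0, 9.0],    # vegetarian diet
    [2.0, 2.0, 10.0],   # vegetarian diet
]
direction, scores = first_principal_component(profiles)
for diet, s in zip(["meat"] * 3 + ["veg"] * 3, scores):
    print(f"{diet:4s} PC1 score = {s:+.2f}")
```

With data this cleanly separated, the two diets fall on opposite sides of zero along the first component. Real metabolic profiles contain hundreds of variables and far more overlap, which is one reason signatures found at the population level are hard to apply to a single person.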

10.4 CONCLUSION

As previously expressed, personalized nutrition aims to tailor dietary advice and intervention to individual needs, assuming that people respond in different ways to different foods. With these aims, the analysis of biofluids by means of metabolomics may now provide new insight to characterize the metabolic footprint of individuals, capturing both the end points of the host metabolism and its interactions with symbiotic partners, that is, the gut microbiota. Such comprehensive metabolic footprints open new research avenues to identify a new generation of biomarkers, aiming to function as predictive, prognostic, and mechanistic biomarkers. Therefore, it is highly conceivable that these biomarker categories could lead to new molecular targets for rapid and efficient health monitoring purposes. Such health monitoring technologies will enable the demonstration of the health benefits of personalized nutritional solutions to consumers and/or specific consumer groups.

Nevertheless, defining such individual biomarkers accurately remains even today an important scientific challenge. This is particularly due to the inherent interindividual biological variability that is the evolutionary product of genetic and environmental interactions. However, metabolomics will undoubtedly provide new perspectives for understanding and modulating metabolism according to nutrition. The prospect of preventing the progression of human diseases by individually tailored nutritional intervention programs could certainly benefit from the application of metabolomics.

REFERENCES

Al-Delaimy WK, Ferrari P, Slimani N, Pala V, Johansson I, Nilsson S, Mattisson I, Wirfalt E, Galasso R, Palli D, Vineis P, Tumino R, Dorronsoro M, Pera G, Ocke MC, Bueno-de-Mesquita HB, Overvad K, Chirlaque M, Trichopoulou A, Naska A, Tjonneland A, Olsen A, Lund E, Alsaker EH, Barricarte A, Kesse E, Boutron-Ruault MC, Clavel-Chapelon F, Key TJ, Spencer E, Bingham S, Welch AA, Sanchez-Perez MJ, Nagel G, Linseisen J, Quiros JR, Peeters PH, van Gils CH, Boeing H, van Kappel AL, Steghens JP, Riboli E (2005). Plasma carotenoids as biomarkers of intake of fruits and vegetables: individual-level correlations in the European Prospective Investigation into Cancer and Nutrition (EPIC). European Journal of Clinical Nutrition 59(12):1387–1396.
Álvarez-Sánchez B, Priego-Capote F, Castro MDLD (2010). Metabolomics analysis II. Preparation of biological samples prior to detection. Trends in Analytical Chemistry 29(2):120–127.
Andersson A, Marklund M, Diana M, Landberg R (2011). Plasma alkylresorcinol concentrations correlate with whole grain wheat and rye intake and show moderate reproducibility over a 2- to 3-month period in free-living Swedish adults. Journal of Nutrition 141:1712–1718.
Backhed F, Manchester JK, Semenkovich CF, Gordon JI (2007). Mechanisms underlying the resistance to diet-induced obesity in germ-free mice. Proceedings of the National Academy of Sciences U S A 104(3):979–984.
Beckonert O, Coen M, Keun HC, Wang Y, Ebbels TM, Holmes E, Lindon JC, Nicholson JK (2010). High-resolution magic-angle-spinning NMR spectroscopy for metabolic profiling of intact tissues. Nature Protocols 5(6):1019–1032.
Beckonert O, Keun HC, Ebbels TM, Bundy J, Holmes E, Lindon JC, Nicholson JK (2007). Metabolic profiling, metabolomic and metabonomic procedures for NMR spectroscopy of urine, plasma, serum and tissue extracts. Nature Protocols 2(11):2692–2703.
Begley P, Francis-McIntyre S, Dunn WB, Broadhurst DI, Halsall A, Tseng A, Knowles J, Goodacre R, Kell DB (2009).
Development and performance of a gas chromatography-time-of-flight mass spectrometry analysis for large-scale nontargeted metabolomic studies of human serum. Analytical Chemistry 81(16):7038–7046.
Benton HP, Want E, Keun HC, Amberg A, Plumb RS, Goldfain-Blanc F, Walther B, Reily MD, Lindon JC, Holmes E, Nicholson JK, Ebbels TM (2012). Intra- and interlaboratory reproducibility of ultra performance liquid chromatography-time-of-flight mass spectrometry for urinary metabolic profiling. Analytical Chemistry 84(5):2424–2432.

Bojko B, Vuckovic D, Cudjoe E, Hoque ME, Mirnaghi F, Wasowicz M, Jerath A, Pawliszyn J (2011). Determination of tranexamic acid concentration by solid phase microextraction and liquid chromatography-tandem mass spectrometry: first step to in vivo analysis. Journal of Chromatography B: Analytical Technologies in the Biomedical and Life Sciences 879(32):3781–3787.
Brown M, Dunn WB, Dobson P, Patel Y, Winder CL, Francis-McIntyre S, Begley P, Carroll K, Broadhurst D, Tseng A, Swainston N, Spasic I, Goodacre R, Kell DB (2009). Mass spectrometry tools and metabolite-specific databases for molecular identification in metabolomics. Analyst 134(7):1322–1332.
Bruce SJ, Breton I, Decombaz J, Boesch C, Scheurer E, Montoliu I, Rezzi S, Kochhar S, Guy PA (2010). A plasma global metabolic profiling approach applied to an exercise study monitoring the effects of glucose, galactose and fructose drinks during post-exercise recovery. Journal of Chromatography B: Analytical Technologies in the Biomedical and Life Sciences 878(29):3015–3023.
Bruce SJ, Tavazzi I, Parisod V, Rezzi S, Kochhar S, Guy PA (2009). Investigation of human blood plasma sample preparation for performing metabolomics using ultrahigh performance liquid chromatography/mass spectrometry. Analytical Chemistry 81(9):3285–3296.
Buescher JM, Moco S, Sauer U, Zamboni N (2010). Ultrahigh performance liquid chromatography-tandem mass spectrometry method for fast and robust quantification of anionic and aromatic metabolites. Analytical Chemistry 82(11):4403–4412.
Castro-Puyana M, Garcia-Canas V, Simo C, Cifuentes A (2012). Recent advances in the application of capillary electromigration methods for food analysis and foodomics. Electrophoresis 33(1):147–167.
Chen R, Mias GI, Li-Pook-Than J, Jiang L, Lam HY, Chen R, Miriami E, Karczewski KJ, Hariharan M, Dewey FE, Cheng Y, Clark MJ, Im H, Habegger L, Balasubramanian S, O’Huallachain M, Dudley JT, Hillenmeyer S, Haraksingh R, Sharon D, Euskirchen G, Lacroute P, Bettinger K, Boyle AP, Kasowski M, Grubert F, Seki S, Garcia M, Whirl-Carrillo M, Gallardo M, Blasco MA, Greenberg PL, Snyder P, Klein TE, Altman RB, Butte AJ, Ashley EA, Gerstein M, Nadeau KC, Tang H, Snyder M (2012). Personal omics profiling reveals dynamic molecular and medical phenotypes. Cell 148(6):1293–1307.
Claus SP, Ellero SL, Berger B, Krause L, Bruttin A, Molina J, Paris A, Want EJ, de Waziers I, Cloarec O, Richards SE, Wang Y, Dumas ME, Ross A, Rezzi S, Kochhar S, van BP, Lindon JC, Holmes E, Nicholson JK (2011). Colonization-induced host-gut microbial metabolic interaction. MBio. 2(2):e271–e310.
Clayton TA, Baker D, Lindon JC, Everett JR, Nicholson JK (2009). Pharmacometabonomic identification of a significant host-microbiome metabolic interaction affecting human drug metabolism. Proceedings of the National Academy of Sciences U S A 106(34):14728–14733.
Clement K, Langin D (2007). Regulation of inflammation-related genes in human adipose tissue. Journal of Internal Medicine 262(4):422–430.
Collino S, Martin FP, Kochhar S, Rezzi S (2011). Nutritional metabonomics: an approach to promote personalized health and wellness. Chimia (Aarau) 65(6):396–399.
Crews HM, Ducros V, Eagles J, Mellon FA, Kastenmayer P, Luten JB, McGaw BA (1994). Mass spectrometric methods for studying nutrient mineral and trace element absorption and metabolism in humans using stable isotopes. A review. Analyst 119(11):2491–2514.

Despres JP, Tremblay A, Leblanc C, Bouchard C (1988). Effect of the amount of body fat on the age-associated increase in serum cholesterol. Preventive Medicine 17(4):423–431. Dettmer K, Aronov PA, Hammock BD (2007). Mass spectrometry-based metabolomics. Mass Spectrometry Reviews 26(1):51–78. Donahue RP, Abbott RD (1987). Central obesity and coronary heart disease in men. Lancet 2(8569):1215. Dulloo AG, Montani JP (2010). Phenotyping for early predictors of obesity and the metabolic syndrome. International Journal of Obesity (Lond) 34(Suppl 2):S1–S3. Dumas ME, Maibaum EC, Teague C, Ueshima H, Zhou B, Lindon JC, Nicholson JK, Stam- ler J, Elliott P, Chan Q, Holmes E (2006). Assessment of analytical reproducibility of 1H NMR spectroscopy based metabonomics for large-scale epidemiological research: the INTERMAP Study. Analytical Chemistry 78(7):2199–2208. Dunn WB, Broadhurst D, Begley P, Zelena E, Francis-McIntyre S, Anderson N, Brown M, Knowles JD, Halsall A, Haselden JN, Nicholls AW, Wilson ID, Kell DB, Goodacre R (2011). Procedures for large-scale metabolic profiling of serum and plasma using gas chro- matography and liquid chromatography coupled to mass spectrometry. Nature Protocols 6(7):1060–1083. Exarchou V, Godejohann M, van Beek TA, Gerothanassis IP, Vervoort J (2003). LC-UV-solid- phase extraction-NMR-MS combined with a cryogenic flow probe and its application to the identification of compounds present in Greek oregano. Analytical Chemistry 75(22):6288– 6294. Ferguson JF, Phillips CM, McMonagle J, Perez-Martinez P, Shaw DI, Lovegrove JA, Helal O, Defoort C, Gjelstad IM, Drevon CA, Blaak EE, Saris WH, Leszczynska-Golabek I, Kiec-Wilk B, Riserus U, Karlstrom B, Lopez-Miranda J, Roche HM (2010). NOS3 gene polymorphisms are associated with risk markers of cardiovascular disease, and interact with omega-3 polyunsaturated fatty acids. Atherosclerosis 211(2):539–544. Fiehn O, Kopka J, Dormann P, Altmann T, Trethewey RN, Willmitzer L (2000). 
Metabolite profiling for plant functional genomics. Nature Biotechnology 18(11):1157–1161. Fiscella K, Winters P, Tancredi D, Franks P (2011). Racial disparity in blood pressure: is vitamin D a factor? Journal of General Internal Medicine 26(10):1105–1111. Folsom AR, Kaye SA, Sellers TA, Hong CP, Cerhan JR, Potter JD, Prineas RJ (1993). Body fat distribution and 5-year risk of death in older women. The Journal of the American Medical Society 269(4):483–487. Gieger C, Geistlinger L, Altmaier E, Hrabe de AM, Kronenberg F, Meitinger T, Mewes HW, Wichmann HE, Weinberger KM, Adamski J, Illig T, Suhre K (2008). Genetics meets metabolomics: a genome-wide association study of metabolite profiles in human serum. PLoS Genetics 4(11):e1000282. Griffin JL 2004. Metabolic profiles to define the genome: can we hear the phenotypes? Philo- sophical Transactions of the Royal Society B: Biological Sciences 359(1446):857–871. Griffiths WJ, Ogundare M, Williams CM, Wang Y (2011). On the future of “omics”: lipidomics. Journal of Inherited Metabolic Disorders 34(3):583–592. Grundy SM, Cleeman JI, Daniels SR, Donato KA, Eckel RH, Franklin BA, Gordon DJ, Krauss RM, Savage PJ, Smith SC, Jr., Spertus JA, Fernando C (2005). Diagnosis and management of the metabolic syndrome: an American Heart Association/National Heart, Lung, and Blood Institute scientific statement: executive summary. Critical Pathways in Cardiology 4(4):198–203. REFERENCES 295

Hall KD, Heymsfield SB, Kemnitz JW, Klein S, Schoeller DA, Speakman JR. 2012. Energy balance and its components: implications for body weight regulation. American Journal of Clinical Nutrition 95(4):989–994. Harris GA, Galhena AS, Fernandez FM (2011). Ambient sampling/ionization mass spectrom- etry: applications and current trends. Analytical Chemistry 83(12):4508–4538. Hartz AJ, Rupley DC, Jr., Kalkhoff RD, Rimm AA (1983). Relationship of obesity to diabetes: influence of obesity level and body fat distribution. Preventive Medicine 12(2):351–357. Heinzmann SS, Brown IJ, Chan Q, Bictash M, Dumas ME, Kochhar S, Stamler J, Holmes E, Elliott P, Nicholson JK (2010). Metabolic profiling strategy for discovery of nutritional biomarkers: proline betaine as a marker of citrus consumption. American Journal of Clinical Nutrition 92(2):436–443. Holmes E, Loo RL, Stamler J, Bictash M, Yap IK, Chan Q, Ebbels T, De IM, Brown IJ, Veselkov KA, Daviglus ML, Kesteloot H, Ueshima H, Zhao L, Nicholson JK, Elliott P (2008). Human metabolic phenotype diversity and its association with diet and blood pressure. Nature 453(7193):396–400. Horning EC, Horning MG (1971). Metabolic profiles: gas-phase methods for analysis of metabolites. Clinical Chemistry 17(8):802–809. Houk RS, Thompson J (1988). Inductively coupled plasma mass spectrometry. Mass Spec- trometry 7:425–461. Jia W, Li H, Zhao L, Nicholson JK (2008). Gut microbiota: a potential new territory for drug targeting. Nature Reviews Drug Discovery 7(2):123–129. Joost HG, Gibney MJ, Cashman KD, G + Armanˆ U, Hesketh JE, Mueller M, Van Ommen B, Williams CM, Mathers JC (2007). Personalised nutrition: status and perspectives. British Journal of Nutrition 98(1):26–31. Kahn LS, Satchidanand N, Kopparapu A, Goh W, Yale S, Fox CH (2011). High prevalence of undetected vitamin D deficiency in an urban minority primary care practice. Journal of the National Medical Association 103(5):407–411. 
Kalkhoff RK, Hartz AH, Rupley D, Kissebah AH, Kelber S (1983). Relationship of body fat distribution to blood pressure, carbohydrate tolerance, and plasma lipids in healthy obese women. Journal of Laboratory and Clinical Medicine 102(4):621–627. Kamleh MA, Ebbels TM, Spagou K, Masson P, Want EJ (2012). Optimizing the use of quality control samples for signal drift correction in large-scale urine metabolic profiling studies. Analytical Chemistry 84(6):2670–2677. Keun HC, Beckonert O, Griffin JL, Richter C, Moskau D, Lindon JC, Nicholson JK (2002). Cryogenic probe 13C NMR spectroscopy of urine for metabonomic studies. Analytical Chemistry 74(17):4588–4593. Kind T, Wohlgemuth G, Lee dY, Lu Y, Palazoglu M, Shahbaz S, Fiehn O (2009). FiehnLib: mass spectral and retention index libraries for metabolomics based on quadrupole and time- of-flight gas chromatography/mass spectrometry. Analytical Chemistry 81(24):10038– 10048. Kirchner M, Matisova E, Hrouzkova S, de ZJ (2005). Possibilities and limitations of quadrupole mass spectrometric detector in fast gas chromatography. Journal of Chromatography A 1090(1–2):126–132. Klannemark M, Orho M, Langin D, Laurell H, Holm C, Reynisdottir S, Arner P, Groop L (1998). The putative role of the hormone-sensitive lipase gene in the pathogenesis of type II diabetes mellitus and abdominal obesity. Diabetologia 41(12):1516–1522. 296 SHAPING THE FUTURE OF PERSONALIZED NUTRITION WITH METABOLOMICS

Koek MM, Jellema RH, van der Greef J, Tas AC, Hankemeier T (2011). Quantitative metabolomics based on gas chromatography mass spectrometry: status and perspectives. Metabolomics 7(3):307–328. Kostiainen R, Kauppila TJ (2009). Effect of eluent on the ionization process in liq- uid chromatography-mass spectrometry. Journal of Chromatography A 1216(4):685– 699. Kussmann M, Van Bladeren PJ (2011). The extended nutrigenomics – understanding the interplay between the genomes of food, gut microbes, and human host. Frontiers in Genetics 2:21. Landberg R, Kamal-Eldin A, Andersson SO, Johansson JE, Zhang JX, Hallmans G, Åman P (2009). Reproducibility of plasma alkylresorcinols during a 6-week rye intervention study in men with prostate cancer. Journal of Nutrition 139(5):975–980. Landberg R, Linko AM, Kamal-Eldin A, Vessby B, Adlercreutz H, Åman P (2006). Human plasma kinetics and relative bioavailability of alkylresorcinols after intake of rye bran. Journal of Nutrition 136(11):2760–2765. Lapidus L, Bengtsson C, Larsson B, Pennert K, Rybo E, Sjostrom L (1984). Distribution of adipose tissue and risk of cardiovascular disease and death: a 12 year follow up of participants in the population study of women in Gothenburg, Sweden. Medical Journal (Clinical Research Edition) 289(6454):1257–1261. Larsson B, Renstrom P, Svardsudd K, Welin L, Grimby G, Eriksson H, Ohlson LO, Wilhelmsen L, Bjorntorp P (1984). Health and ageing characteristics of highly physically active 65- year-old men. European Heart Journal 5 (suppl E):31–35. Ley R, Turnbaugh P, Klein S, Gordon J (2006). Microbial ecology: human gut microbes associated with obesity. Nature 444(7122):1022–1023. Li Y, Ruan Q, Li Y, Ye G, Lu X, Lin X, Xu G (2012). A novel approach to transforming a non-targeted metabolic profiling method to a pseudo-targeted method using the retention time locking gas chromatography/mass spectrometry-selected ions monitoring. Journal of Chromatography A 1255:228-236. 
Liebisch G, Binder M, Schifferer R, Langmann T, Schulz B, Schmitz G (2006). High through- put quantification of cholesterol and cholesteryl ester by electrospray ionization tandem mass spectrometry (ESI-MS/MS). Biochimica Biophysica Acta 1761(1):121–128. Liebisch G, Drobnik W, Reil M, Trumbach B, Arnecke R, Olgemoller B, Roscher A, Schmitz G (1999). Quantitative measurement of different ceramide species from crude cellular extracts by electrospray ionization tandem mass spectrometry (ESI-MS/MS). Journal of Lipid Research 40(8):1539–1546. Liebisch G, Lieser B, Rathenberg J, Drobnik W, Schmitz G (2004). High-throughput quantifi- cation of phosphatidylcholine and sphingomyelin by electrospray ionization tandem mass spectrometry coupled with isotope correction algorithm. Biochimica Biophysica Acta 1686 (1–2):108–117. Lindon JC, Nicholson JK, Holmes E, Antti H, Bollard ME, Keun H, Beckonert O, Ebbels TM, Reily MD, Robertson D (2003). Contemporary issues in toxicology the role of metabo- nomics in toxicology and its evaluation by the COMET project. Toxicology and Applied Pharmacology 187(3):137–146. Lommen A, Gerssen A, Oosterink JE, Kools HJ, Ruiz-Aracama A, Peters RJ, Mol HG (2011). Ultra-fast searching assists in evaluating sub-ppm mass accuracy enhancement in U-HPLC/Orbitrap MS data. Metabolomics 7(1):15–24. REFERENCES 297

Manolopoulos KN, Karpe F, Frayn KN. 2010. Gluteofemoral body fat as a determinant of metabolic health. International Journal of Obesity (Lond) 34(6):949–959. Martin FP, Dumas ME, Wang Y, Legido-Quigley C, Yap IK, Tang H, Zirah S, Murphy GM, Cloarec O, Lindon JC, Sprenger N, Fay LB, Kochhar S, van BP, Holmes E, Nicholson JK (2007). A top-down systems biology view of microbiome-mammalian metabolic interac- tions in a mouse model. Molecular Systems Biology 3:112. Martin FP, Rezzi S, Pere-Trepat E, Kamlage B, Collino S, Leibold E, Kastler J, Rein D, Fay LB, Kochhar S (2009a). Metabolic effects of dark chocolate consumption on energy, gut microbiota, and stress-related metabolism in free-living subjects. Journal Proteome Research 8(12):5568–5579. Martin FP, Sprenger N, Yap IK, Wang Y, Bibiloni R, Rochat F, Rezzi S, Cherbut C, Kochhar S, Lindon JC, Holmes E, Nicholson JK (2009b). Panorganismal gut microbiome-host metabolic crosstalk. Journal Proteome Research 8(4):2090–2105. McNaughton SA, Marks GC, Gaffney P, Williams G, Green A. 2005. Validation of a food- frequency questionnaire assessment of carotenoid and vitamin E intake using weighed food records and plasma biomarkers: the method of triads model. European Journal of Clinical Nutrition 59(2):211–218. Merrifield CA, Lewis M, Claus SP, Beckonert OP, Dumas ME, Duncker S, Kochhar S, Rezzi S, Lindon JC, Bailey M, Holmes E, Nicholson JK (2011). A metabolic system- wide characterisation of the pig: a model for human physiology. Molecular Biosystems 7(9):2577–2588. Mestdagh R, Dumas ME, Rezzi S, Kochhar S, Holmes E, Claus SP, Nicholson JK (2011). Gut microbiota modulate the metabolism of brown adipose tissue. Journal Proteome Research 11(2):620–630. Moco S, Forshed J, Vos CHR, Bino RJ, Vervoort J (2008). Intra- and inter-metabolite correla- tion spectroscopy of tomato metabolomics data obtained by liquid chromatography-mass spectrometry and nuclear magnetic resonance. Metabolomics 4(3):202–215. 
Moco S, Vervoort J, Bino RJ, de Vos CH, Bino R (2007). Metabolomics technologies and metabolite identification. Trends in Analytical Chemistry 26:855–866. Newgard CB, An J, Bain JR, Muehlbauer MJ, Stevens RD, Lien LF, Haqq AM, Shah SH, Arlotto M, Slentz CA, Rochon J, Gallup D, Ilkayeva O, Wenner BR, Yancy WS, Jr., Eisenson H, Musante G, Surwit RS, Millington DS, Butler MD, Svetkey LP (2009). A branched-chain amino acid-related metabolic signature that differentiates obese and lean humans and contributes to insulin resistance. Cell Metabolism 9(4):311–326. Nicholson JK (2006). -Omics dreams of personalized healthcare. Journal of Proteome Research 5(9):2067, 2069. Nicholson JK, Lindon JC (2008). Systems biology: metabonomics. Nature 455(7216):1054– 1056. Nicholson JK, Wilson ID (2003). Opinion: understanding ‘global’ systems biology: metabo- nomics and the continuum of metabolism. Nature Reviews Drug Discovery 2(8):668–676. Nicholson JK, Connelly J, Lindon JC, Holmes E (2002). Metabonomics a platform for studying drug toxicity and gene function. Nature Reviews Drug Discovery 1(2):153–161. Nicholson JK, Holmes E, Wilson ID (2005). Gut microorganisms, mammalian metabolism and personalized health care. Nature Reviews Microbiology 3(5):431–438. Nicholson G, Rantalainen M, Maher AD, Li JV, Malmodin D, Ahmadi KR, Faber JH, Hall- grimsdottir IB, Barrett A, Toft H, Krestyaninova M, Viksna J, Neogi SG, Dumas ME, 298 SHAPING THE FUTURE OF PERSONALIZED NUTRITION WITH METABOLOMICS

Sarkans U, The Molpage Consortium, Silverman BW, Donnelly P, Nicholson JK, Allen M, Zondervan KT, Lindon JC, Spector TD, McCarthy MI, Holmes E, Baunsgaard D, Holmes CC (2011). Human metabolic profiles are stably controlled by genetic and environmental variation. Molecular Systems Biology 7:525. Nordstrom¨ A, Want E, Northen T, Lehtio J, Siuzdak G (2008). Multiple ionization mass spectrometry strategy used to reveal the complexity of metabolomics. Analytical Chemistry 80(2):421–429. Park SH, Kim BI, Kim SH, Kim HJ, Park DI, Cho YK, Sung IK, Sohn CI, Kim H, Keum DK, Kim HD, Park JH, Kang JH, Jeon WK (2007). Body fat distribution and insulin resistance: beyond obesity in nonalcoholic fatty liver disease among overweight men. Journal of the American College of Nutrition 26(4):321–326. Pere-trepat´ E, Ross A, Martin F-PJ, Rezzi S, Kochhar S, Hasselbalch A, Kyvik K, Sorensen TIA (2010). Chemometric strategies to assess metabonomic imprinting of food habits in epidemiological studies. Chemometrics and Intelligent Laboratory Systems 104: 95–100. Pietilainen KH, Sysi-Aho M, Rissanen A, Seppanen-Laakso T, Yki-Jarvinen H, Kaprio J, Oresic M (2007). Acquired obesity is associated with changes in the serum lipidomic profile independent of genetic effects–a monozygotic twin study. PLoS One 2(2):e218. Potischman N (2003). Biologic and methodologic issues for nutritional biomarkers. Journal of Nutrition 133 (Suppl 3):875S–880S. Qiu Y, Su M, Liu Y, Chen M, Gu J, Zhang J, Jia W (2007). Application of ethyl chloroformate derivatisation for gas chromatography-mass spectrometry based metabonomic profiling. Analytica Chimica Acta 583(2):277–283. Rezzi S, Martin FP, Kochhar S (2007a). Defining personal nutrition and metabolic health through metabonomics. Ernst Schering Foundation Symposium Proceedings (4):251–264. Rezzi S, Ramadan Z, Fay LB, Kochhar S (2007b). Nutritional metabonomics: applications and perspectives. Journal of Proteome Research 6(2):513-525. 
Rhee EP, Cheng S, Larson MG, Walford GA, Lewis GD, McCabe E, Yang E, Farrell L, Fox CS, O’Donnell CJ, Carr SA, Vasan RS, Florez JC, Clish CB, Wang TJ, Gerszten RE (2011). Lipid profiling identifies a triacylglycerol signature of insulin resistance and improves diabetes prediction in humans. The Journal of Clinical Investigation 121(4):1402– 1411. Roessner U, Wagner C, Kopka J, Trethewey RN, Willmitzer L (2000). Technical advance: simultaneous analysis of metabolites in potato tuber by gas chromatography-mass spec- trometry. Plant Journal 23(1):131–142. Romisch-Margl¨ W, Prehn C, Bogumil R, Rohring¨ C, Suhre K, Adamski J (2012). Proce- dure for tissue sample preparation and metabolite extraction for high-throughput targeted metabolomics. Metabolomics 8(1):133–142. Ross AB (2012). Present status and perspectives on the use of alkylresorcinols as biomarkers of wholegrain wheat and rye intake. Journal of Nutrition and Metabolism 2012(462967):1–12. Ross AB, Bourgeois A, Macharia HN, Kochhar S, Jebb SA, Brownlee IA, Seal CJ (2012). Plasma alkylresorcinols as a biomarker of whole grain food consumption in a large pop- ulation - results from the WHOLEheart intervention study. American Journal Clinical Nutrition 95:204–211. Scherer M, Schmitz G, Liebisch G (2009). High-throughput analysis of sphingosine 1- phosphate, sphinganine 1-phosphate, and lysophosphatidic acid in plasma samples REFERENCES 299

by liquid chromatography-tandem mass spectrometry. Clinical Chemistry 55(6): 1218–1222. Scherer M, Leuthauser-Jaschinski K, Ecker J, Schmitz G, Liebisch G (2010). A rapid and quantitative LC-MS/MS method to profile sphingolipids. Journal of Lipid Research 51(7): 2001–2011. Scherer M, Bottcher A, Schmitz G, Liebisch G (2011). Sphingolipid profiling of human plasma and FPLC-separated lipoprotein fractions by hydrophilic interaction chromatography tan- dem mass spectrometry. Biochimica Biophysica Acta 1811(2):68–75. Schuhmann K, Herzog R, Schwudke D, Metelmann-Strupat W, Bornstein SR, Shevchenko A (2011). Bottom-up shotgun lipidomics by higher energy collisional dissociation (HCD) on LTQ Orbitrap mass spectrometers. Analytical Chemistry 83(14):5480–5487. Sellick CA, Hansen R, Maqsood AR, Dunn WB, Stephens GM, Goodacre R, Dickson AJ (2009). Effective quenching processes for physiologically valid metabolite profiling of suspension cultured mammalian cells. Analytical Chemistry 81(1):174–183. Smart KF, Aggio RB, Van Houtte JR, Villas-Boas SG (2010). Analytical platform for metabolome analysis of microbial cells using methyl chloroformate derivatisation followed by gas chromatography-mass spectrometry. Nature Protocols 5(10):1709–1729. Soderholm PP, Lundin JE, Koskela AH, Tikkanen MJ, Adlercreutz HC (2011). Pharma- cokinetics of alkylresorcinol metabolites in human urine. British Journal of Nutrition 106(7):1040–1044. Soga T, Igarashi K, Ito C, Mizobuchi K, Zimmermann HP, Tomita M (2009). Metabolomic profiling of anionic metabolites by capillary electrophoresis mass spectrometry. Analytical Chemistry 81(15):6165–6174. Solanky KS, Bailey NJ, Beckwith-Hall BM, Bingham S, Davis A, Holmes E, Nicholson JK, Cassidy A (2005). Biofluid 1H NMR-based metabonomic techniques in nutrition research - metabolic effects of dietary isoflavones in humans. Journal of Nutritional. Biochemistry 16(4):236–244. 
Solanky KS, Bailey NJ, Beckwith-Hall BM, Davis A, Bingham S, Holmes E, Nicholson JK, Cassidy A (2003). Application of biofluid 1H nuclear magnetic resonance-based metabo- nomic techniques for the analysis of the biochemical effects of dietary isoflavones on human plasma profile. Analytical Biochemistry 323(2):197–204. Sorensen TI, Boutin P, Taylor MA, Larsen LH, Verdich C, Petersen L, Holst C, Echwald SM, Dina C, Toubro S, Petersen M, Polak J, Clement K, Martinez JA, Langin D, Oppert JM, Stich V, Macdonald I, Arner P, Saris WH, Pedersen O, Astrup A, Froguel P (2006). Genetic polymorphisms and weight loss in obesity: a randomised trial of hypo-energetic high- versus low-fat diets. PLoS Clinical Trials 1(2):e12. Stella C, Beckwith-Hall B, Cloarec O, Holmes E, Lindon JC, Powell J, vanderOuderaa F, Bingham S, Cross AJ, Nicholson JK (2006). Susceptibility of human metabolic phenotypes to dietary modulation. Journal Proteome Research 5(10):2780–2888. Suhre K, Meisinger C, Doring A, Altmaier E, Belcredi P, Gieger C, Chang D, Milburn MV, Gall WE, Weinberger KM, Mewes HW, Hrabe de AM, Wichmann HE, Kronenberg F, Adamski J, Illig T (2010). Metabolic footprint of diabetes: a multiplatform metabolomics study in an epidemiological setting. PLoS ONE 5(11):e13953. Suhre K, Wallaschofski H, Raffler J, Friedrich N, Haring R, Michael K, Wasner C, Krebs A, Kronenberg F, Chang D, Meisinger C, Wichmann HE, Hoffmann W, Volzke H, Volker U, Teumer A, Biffar R, Kocher T, Felix SB, Illig T, Kroemer HK, Gieger C, Romisch-Margl 300 SHAPING THE FUTURE OF PERSONALIZED NUTRITION WITH METABOLOMICS

W, Nauck M (2011). A genome-wide association study of metabolic traits in human urine. Nature Genetics 43(6):565–569. Sumner L, Amberg A, Barrett D, Beale M, Beger R, Daykin C, Fan T, Fiehn O, Goodacre R, Griffin JL (2007). Proposed minimum reporting standards for chemical analysis. Metabolomics 3(3) 211–221. Swann JR, Want EJ, Geier FM, Spagou K, Wilson ID, Sidaway JE, Nicholson JK, Holmes E (2011). Systemic gut microbial modulation of bile acid metabolism in host tissue compart- ments. Proceedings of the National Academy of Sciences U S A 108 (Suppl 1):4523–4530. Tanaka K, Hine DG, West-Dull A, Lynn TB (1980). Gas-chromatographic method of analysis for urinary organic acids. I. Retention indices of 155 metabolically important compounds. Clinical Chemistry 26(13):1839–1846. Tautenhahn R, Bottcher C, Neumann S (2008). Highly sensitive feature detection for high resolution LC/MS. BMC Bioinformatics 9:504. Teran-Garcia M, Bouchard C (2007). Genetics of the metabolic syndrome. Applied Physiology Nutrition and Metabolism 32(1):89–114. Thomas R (2008). Practical Guide to ICP-MS. A Tutorial For Beginners. Boca Raton, FL: CRC Press p 1–5. Tolstikov VV, Fiehn O (2002). Analysis of highly polar compounds of plant origin: combina- tion of hydrophilic interaction chromatography and electrospray ion trap mass spectrometry. Analytical Biochemistry 301(2):298–307. Turnbaugh P, Ley R, Mahowald M, Magrini V, Mardis E, Gordon J (2006). An obesity- associated gut microbiome with increased capacity for energy harvest. Nature 444:1027– 1031. van der Greef J, McBurney RN (2005). Innovation: rescuing drug discovery: in vivo systems pathology and systems pharmacology. Nature Reviews Drug Discovery 4(12):961–967. van Deurse MM, Beens J, Janssen HG, Leclercq PA, Cramers CA (2000). Evaluation of time- of-flight mass spectrometric detection for fast gas chromatography. Journal of Chromatogry A 878(2):205–213. van Ommen B (2007). 
Personalized nutrition from a health perspective: luxury or necessity? Genes Nutrition 2(1):3–4. VandecasteeleC, Block CB (1993). Mass spectrometry-inductively coupled mass spectrometry. In: Modern Methods for Trace Element Determination Chichester, UK: John Wiley & Sons p 192–260. Viguerie N, Poitou C, Cancello R, Stich V, Clement K, Langin D (2005a). Transcriptomics applied to obesity and caloric restriction. Biochimie 87(1):117–123. Viguerie N, Vidal H, Arner P, Holst C, Verdich C, Avizou S, Astrup A, Saris WH, Macdonald IA, Klimcakova E, Clement K, Martinez A, Hoffstedt J, Sorensen TI, Langin D (2005b). Adipose tissue gene expression in obese subjects during low-fat and high-fat hypocaloric diets. Diabetologia 48(1):123–131. Villas-Boas SG, Hojer-Pedersen J, Akesson M, Smedsgaard J, Nielsen J (2005). Global metabo- lite analysis of yeast: evaluation of sample preparation methods. Yeast 22(14):1155–1169. Wang Z, Klipfell E, Bennett BJ, Koeth R, Levison BS, Dugar B, Feldstein AE, Britt EB, Fu X, Chung YM, Wu Y, Schauer P, Smith JD, Allayee H, Tang WH, DiDonato JA, Lusis AJ, Hazen SL (2011a). Gut flora metabolism of phosphatidylcholine promotes cardiovascular disease. Nature 472(7341):57–63. REFERENCES 301

Wang TJ, Larson MG, Vasan RS, Cheng S, Rhee EP, McCabe E, Lewis GD, Fox CS, Jacques PF, Fernandez C, O’Donnell CJ, Carr SA, Mootha VK, Florez JC, Souza A, Melander O, Clish CB, Gerszten RE (2011b). Metabolite profiles and the risk of developing diabetes. Nature Medicine 17(4):448–453. Wenk MR (2005). The emerging field of lipidomics. Nature Reviews Drug Discovery 4(7): 594–610. Wikoff WR, Anfora AT, Liu J, Schultz PG, Lesley SA, Peters EC, Siuzdak G (2009). Metabolomics analysis reveals large effects of gut microflora on mammalian blood metabo- lites. Proceedings of the National Academy of Sciences U S A 106(10):3698–3703. Wildman RP, Muntner P, Reynolds K, McGinn AP, Rajpathak S, Wylie-Rosett J, Sowers MR (2008). The obese without cardiometabolic risk factor clustering and the normal weight with cardiometabolic risk factor clustering: prevalence and correlates of 2 phenotypes among the US population (NHANES 1999-2004). Archives of Internal Medicine 168(15):1617–1624. Williams CM, Ordovas JM, Lairon D, Hesketh J, Lietz G, Gibney M, van OB (2008). The challenges for molecular nutrition research 1: linking genotype to healthy nutrition. Genes Nutrition 3(2):41–49. Wirfalt E, Hedblad B, Gullberg B, Mattisson I, Andren C, Rosander U, Janzon L, Berglund G (2001). Food patterns and components of the metabolic syndrome in men and women: a cross-sectional study within the Malmo Diet and Cancer cohort. American Journal of Epidemiology 154(12):1150–1159. Wu L, Mashego MR, van Dam JC, Proell AM, Vinke JL, Ras C, van Winden WA, van Gulik WM, Heijnen JJ (2005). Quantitative analysis of the microbial metabolome by isotope dilution mass spectrometry using uniformly 13C-labeled cell extracts as internal standards. Analytical Biochemistry 336(2):164–171. Xiao JF, Zhou B, Ressom HW (2012). Metabolite identification and quantitation in LC- MS/MS-based metabolomics. Trends in Analytical Chemistry 32:1–14. Zeisel SH, Blusztajn JK (1994). Choline and human nutrition. 
Annual Reviews Nutrition 14:269–296. Zeisel SH, Wishnok JS, Blusztajn JK (1983). Formation of methylamines from ingested choline and lecithin. Journal of Pharmacology and Experimental Therapeutics 225(2):320–324. Zhang Q, Wang G, Du Y, Zhu L, Jiye A (2007). GC/MS analysis of the rat urine for metabo- nomic research. Journal Chromatography B Analytical Technologies in the Biomedical and Life Science 854(1–2): 20–25. Zhao X, Fritsche J, Wang J, Chen J, Rittig K, Schmitt-Kopplin P, Fritsche A, Haring HU, Schleicher ED, Xu G, Lehmann R (2010). Metabonomic fingerprints of fasting plasma and spot urine reveal human pre-diabetic metabolic traits. Metabolomics 6(3):362–374. Zheng X, Xie G, Zhao A, Zhao L, Yao C, Chiu NH, Zhou Z, Bao Y, Jia W, Nicholson JK, Jia W (2011). The footprints of gut microbial-mammalian co-metabolism. Journal of Proteome Research 10(12):5512–5522. Zimmermann MB, Hurrell RF (2007). Nutritional iron deficiency. Lancet 370(9586):511–520. 11 HOW DOES FOODOMICS IMPACT OPTIMAL NUTRITION?

Anna Arola-Arnal, Josep M. del Bas, Antoni Caimari, Anna Crescenti, Francesc Puiggròs, Manuel Suárez, and Lluís Arola

11.1 INTRODUCTION

The concept of optimal nutrition has been under constant revision since nutrition was first established as a scientific discipline. In 1956, W.A. Krehl (Krehl, 1956) wrote, “Optimum nutrition might be described, then, as that which provides all dietary nutrients in respect to type and amount, and in proper state of combination or balance so that the organism may always meet the varied exogenous and endogenous stresses of life, whether in health or disease, with a minimal demand or strain on the body’s natural homeostatic mechanisms.” In the postgenomic era, however, it has become well accepted that nutrients and other food components can modulate the expression of genes and, therefore, the natural homeostatic mechanisms themselves. Thus, 50 years after Krehl’s definition, nutrition research has experienced a shift of focus. In Krehl’s time, nutrition was understood as the study of the basic requirements for maintaining a healthy life. Today, research groups investigate how dietary patterns can enhance quality of life and prevent disease, and doing so requires an understanding of the basis of nutrient action. New technologies such as -omics approaches, combined with molecular biology, genetics, and other disciplines, are fundamental tools for achieving this goal. To explain the current status of nutrition research, and therefore the modern concept of optimal nutrition, this introduction summarizes the most important milestones in the field.

Foodomics: Advanced Mass Spectrometry in Modern Food Science and Nutrition, First Edition. Edited by Alejandro Cifuentes. © 2013 John Wiley & Sons, Inc. Published 2013 by John Wiley & Sons, Inc.


11.1.1 The Five Patterns of Nutrition

The nutritional status of human populations has been divided into five different patterns (Popkin, 2006). The first is known as the Paleolithic pattern and covers the longest period in human history. This pattern is characterized by a healthy diet whose benefits are counteracted by infectious diseases and other natural causes of death, resulting in a short life span. The establishment of complex societies and the arrival of modern agriculture establish the basis of the second pattern of nutrition, in which an increase in the population leads to the emergence of famines. This period is not accompanied by significant advances in disease prevention or treatment, and this pattern involves a worsened nutritional status and a short life span. These features are improved in the third nutritional pattern, in which the incomes of a society rise and famine is only transient. Nevertheless, the societies matching this third pattern are sensitive to natural events and to a poor health status, resulting in an unstable nutritional situation. The fourth pattern describes a significant portion of current developed and developing societies. This pattern involves increased incomes that guarantee access to food and health benefits, as well as modern concepts such as globalization, urbanization, and marketing. Despite apparent progress, changes in diet and activity favor the prevalence of new diseases, mainly related to aging, and associated morbidity. Recent years have seen the first instances of overfeeding and unhealthy excessive diets in human history, and diseases that were previously unknown or rare represent a new challenge not only for nutrition but also for medicine and physiology. As a result, many populations feature longer life spans but a suboptimal quality of life. Finally, the fifth nutritional pattern involves a change in behavior that counteracts the negative features of the previous pattern. The aging process is better handled, and the problems derived from inactivity and unhealthy diets have been overcome.

While these five patterns could summarize the nutritional history of a modern developed society (Fig. 11.1), the majority of these patterns are present in the world today. Patterns two and three are applicable to various underdeveloped societies and coexist with the fourth and fifth patterns characterizing developing and developed societies. While the fifth pattern represents the most desirable scenario, this situation has only been achieved in a limited number of societies and only in a portion of the populations of those societies. In fact, improved quality of life, prevention of disease, and healthy aging by means of nutrition are still goals to be achieved (Müller and Kersten, 2003; Costa et al., 2010). Foodomics can provide new tools to achieve these goals of nutrition research.

[Figure 11.1: two timeline panels. Upper panel, timeline from -10000 to 2000: nutritional patterns 1st, 2nd, and 3rd; historical events: Neolithic revolution, agricultural revolutions (worldwide). Lower panel, timeline from 1700 to 2050: nutritional patterns 3rd, 4th, and 5th(?); historical events: “chemical revolution,” 2nd agricultural revolution, industrial revolution, vitamin era, “optimal nutrition” concept, Mediterranean/Japanese diets, sequencing of the human genome, -omics and systems biology.]

FIGURE 11.1 Historical evolution of the different nutritional patterns in a developed society. The five patterns of nutrition described in this chapter (Popkin, 2006) can describe the current nutritional statuses of the different worldwide populations, as well as the nutritional history of a modern developed society. Thus, the first, second, and third patterns are the most common through history, while the fourth and fifth patterns have only appeared during the last centuries in parallel with technological and socioeconomic advances. Therefore, a nutritional pattern reflects and depends on the overall status of a human population.

11.1.2 The Evolution of the “Optimal Nutrition” Concept

Many authors set the birth of nutrition as a scientific discipline at the end of the eighteenth century (Dickerson, 2001; Carpenter, 2003a). It is not a coincidence that this science was born in parallel with the so-called “chemical revolution” in France (Fig. 11.1). The arrival of modern chemistry, led by French scientists such as Antoine Lavoisier, brought methods of chemical analysis that allowed for the scientific assessment of new and old ideas. Thus, as with many other disciplines, advances in nutrition research have always depended on the most important advances in science overall, such as in chemistry, physics, and physiology. The development and evolution of modern chemistry led to impressive advances in the nutritional sciences, in stark contrast with the relatively modest knowledge that had been accumulated during the previous centuries of human history. The first decades of nutrition research produced many important findings, mainly related to the development of analytical techniques and animal models that aided in the evolution of nutritional science. The main period of growth of nutritional sciences began in 1912, the start of the vitamin era (Fig. 11.1). Different vitamins such as the B group and vitamins A, D, E, and K were identified along with other components of the diet such as essential fatty acids, amino acids, or mineral elements, among others (reviewed in Carpenter, 2003c, 2003d).
In parallel, various technological advances took place such as calorimetry and respirometry, and knowledge rapidly expanded in the fields of physiology, chemistry, and biology, including the use of food energy by the human body and the process of digestion, with each of these advances making important contributions to the definition of key aspects of human nutrition (reviewed in Carpenter, 2003b, 2003c). By 1944, the number of discoveries in the nutritional sciences was so overwhelming that some authors suggested that there was nothing more to explore in nutrition research. Elmer McCollum, discoverer of many fat-soluble vitamins, wrote “It seems logical to close this history of ideas with the year 1940. Essentially that year marks the achievement of the primary objectives set by pioneers in this field of study. They sought to discover what, in terms of chemical substances, constituted an adequate diet for man and domestic animals, and that purpose was realized” (Nichols and Reeds, 1991). Indeed, while nutrition research did not end in the 1940s, this date can be considered another inflection point in the discipline. The dietary requirements to avoid many nutritional deficiencies and to maintain basic homeostatic mechanisms had been successfully defined. Thus, many nutritional recommendations could be set at that time (Welsh, 1994). The concept of “optimal nutrition” had come to be understood as a diet able to provide all the essential nutrients to avoid nutritional deficiencies. Unfortunately, not all these nutritional recommendations were correct. In some affluent countries, such as the United States and northern European countries, increased incomes did not translate into improved health.
Instead, various diseases related to overfeeding, unhealthy diets, and inactivity began to appear, with obesity and cardiovascular disease (CVD) becoming prominent concerns in those countries from World War II onward (WHO, 2000; Hill et al., 2003; Caballero, 2007). In the 1960s, researchers began to suspect that dietary patterns were the basis of the problem, but this was not clarified until various epidemiological studies provided conclusive data by the 1980s. In 1980, Ancel Keys and coworkers reported that the prevalence of cardiovascular events correlated with the amount of saturated fat intake in different countries. This was one of the conclusions of the prospective Seven Countries Study, carried out with 12,763 men from seven different countries (Keys, 1980). Today, it is generally believed that saturated fat intake increases the risk of CVD, although this remains a subject of debate among some experts (Siri-Tarino et al., 2010; Astrup et al., 2011). This finding led the Mediterranean and Japanese diets, which contain less saturated fat, to be considered healthier than Western diets (Fig. 11.1). However, surprises were still to come in the field of nutrition science. Renaud and de Lorgeril (1992) and Artaud-Wild et al. (1993) demonstrated that the French population had a low rate of deaths due to cardiovascular events despite having the same saturated fat intake as other northern countries, a phenomenon known as the French paradox. The French paradox was initially explained by the moderate consumption of red wine. The beneficial effects of red wine intake were later attributed mainly to the polyphenolic compounds found in the beverage (de Lange, 2007; Sies, 2010; Magrone and Jirillo, 2011). The French paradox is currently accepted as arising not only from red wine intake but from the combination of dietary components such as vegetables, fiber, and olive oil. Taken together, these findings suggested that some dietary components can protect against disease. Nutrition research should adopt new techniques to test these findings and hypotheses and assess how food components interact with different metabolic pathways.

11.1.3 Nutrients as Signaling Molecules: Nutrition is Not Only Essential Nutrients

Nutritional sciences evolved together with the understanding of intermediary metabolism and the integration of metabolism. Knowledge of how nutrients and energy are transformed and used and of the interconnections between various metabolic pathways led to a new era in nutrition research. In parallel, new tools were developed, mainly in the fields of molecular biology and genetics. Molecular biology techniques have traditionally been applied to fields such as physiology and pharmacology to study subjects such as the functions of proteins, signaling cascades, membrane trafficking, and the molecular mechanisms involved in drug actions. These techniques have been steadily adopted by nutrition scientists intrigued by the potential of nutrients as modulators of different signaling pathways and metabolic processes. As a result, it is now clear that foods are not only a source of essential nutrients but also a source of signaling molecules (Müller and Kersten, 2003; Afman and Müller, 2006; Mortensen et al., 2008; Bladé et al., 2010). Nutrients can interact with intracellular signaling pathways, modulating the expression of genes and eventually controlling biological functions. A good example can be found in fatty acids (Seo et al., 2005; Schroeder et al., 2008; Afman and Müller, 2011), which can act as signaling molecules beyond their structural or energetic relevance (Sampath and Ntambi, 2005; Schroeder et al., 2008; Afman and Müller, 2011). Fatty acids have been shown to bind and activate a set of nuclear receptors that can be considered as ligand-activated transcription factors (Nagy and Schwabe, 2004). PPARα, a nuclear receptor that controls a wide set of genes involved in fatty acid metabolism, can be activated by docosahexaenoate (C22:6ω-3) (Pawar and Jump, 2003). LXRα is another nuclear receptor that can bind fatty acids (Piomelli et al., 2007).
This nuclear receptor controls different genes involved in cholesterol, bile acid, and lipoprotein metabolism (Kalaany and Mangelsdorf, 2006). Despite the identification of oxysterols as natural LXRα ligands, some mono and polyunsaturated free fatty acids can bind to this nuclear receptor, inhibiting the binding of oxysterols (Ou et al., 2001). Each nuclear receptor can control and coordinate the expression of a wide set of genes involved in a given metabolic pathway. Thus, PPARα is a key controller of the β-oxidation program (Pyper et al., 2010), whereas LXRα transcriptional activity is essential for the efflux of cholesterol, the synthesis of bile acids, and the metabolism of plasma lipoproteins (Kalaany and Mangelsdorf, 2006). Therefore, the presence of certain fatty acids in the diet can eventually modulate different metabolic pathways. Nevertheless, these are only some of the signaling properties of fatty acids, which can act through other signaling pathways as well (Pyper et al., 2010). Similarly to fatty acids, other nutrients such as some vitamins or steroids can modulate different metabolic pathways (Carlberg and Seuter, 2009; Potier et al., 2009; Noy, 2010; Shirazi-Beechey et al., 2011). More surprisingly, other food components such as polyphenols and phytosterols, which cannot be considered essential nutrients, are also able to interact with intracellular signaling cascades and modulate the expression of several genes and the activity of transcription factors (Plat et al., 2005; Mandel et al., 2008; Del Bas et al., 2009). Thus, the new perspective of nutrients as signaling molecules that are able to modulate different metabolic aspects represented a new challenge for nutrition and food research. In view of these possibilities, the concept of optimal nutrition evolved, in which diet can be considered a potential tool for the prevention of disease and can contribute actively to achieve an increased life span accompanied by healthy aging. Based on this understanding, the food industry has found a way to develop new products with value-added differentiation.
For example, so-called “functional foods” claim to provide health benefits beyond the nutritional value of their traditional counterparts. Further examples are given later in this chapter.

11.1.4 The Postgenomic Era

The sequencing of the human genome, completed in 2003 (Fig. 11.1), was another milestone in biological sciences (Lander et al., 2001; Venter et al., 2001). This achievement triggered the emergence of genomics applied to the study of human genes and boosted the field of functional genomics, mainly concerned with patterns of gene expression in a system under different conditions (Naidoo et al., 2011). The arrival of the so-called omics techniques, also known as high-throughput or profiling techniques, presented the possibility to study a wide array of analytes in one sample. Currently, the list of omics technologies is rapidly increasing (Anon, 2009; Haring, 2012) and among the best-developed technologies in this area are transcriptomics, which allows for the analysis of changes in gene expression in a whole genome; proteomics, focused on the separation and analysis of a wide array of proteins; and metabolomics, which has the potential to analyze a wide set of metabolites of different nature and complexity, ranging from simple hormones to complex lipoproteins (Herrero et al., 2012). These three omics technologies represent the main stages of the biological information pathway (Kussmann et al., 2008; Herrero et al., 2012). Nevertheless, as technology advances, more omics approaches can be added to the list. For example, epigenomics studies the epigenetic changes from a genome-wide perspective (i.e., DNA methylation, chromatin remodeling, or miRNA expression); phosphoproteomics studies changes in phosphoproteins; lipidomics is used to profile lipids; and glycomics is used to profile carbohydrates (Anon, 2009; Ordovás and Smith, 2010; Kussmann et al., 2010) (Fig. 11.2).

[Figure 11.2: diagram. Labels: HUMAN DIET; BIOLOGICAL INFORMATION FLUX; FOODOMICS; Nutrigenetics and Epigenetics (Genome, Epigenome; DNA); NUTRIGENOMICS: Nutritranscriptomics (Transcriptome; RNA), Nutriproteomics (Proteome; PROTEIN), Nutrimetabolomics (Metabolome; METABOLITE); SYSTEM BIOLOGY; OPTIMAL NUTRITION; HUMAN HEALTH.]

FIGURE 11.2 Role of foodomics in human health through the development of optimal nutrition. Foodomics involves the application of omics technologies to nutrition research, including nutrigenetics, epigenomics, and nutrigenomics (nutritranscriptomics, nutriproteomics, and nutrimetabolomics). These powerful methods are employed for the analysis of the genome and epigenome (DNA), the transcriptome (RNA), the proteome (protein), and the metabolome (metabolites) to study the association between human diet and human health and to develop optimal nutrition to promote health and prevent disease.

Subsequently, to study and understand a complex organism, the information obtained by the different omics techniques should be integrated and, more importantly, interpreted. To do that, the development of statistics and mathematical models is needed, as well as bioinformatics tools that allow for the study of the interactions between the different components of the system (Ghosh et al., 2011). The study of such complex problems is taken on in the field of systems biology (Snoep and Westerhoff, 2005). Therefore, after several decades of successfully applying a reductionist approach to study metabolism and its modulation, these technological advances have allowed for the introduction of a holistic perspective to nutritional science. As seen in this review, the history of nutrition research is tightly linked to the evolution of the various disciplines within the biological sciences (Fig. 11.1).
As discussed later in this chapter, a shift in the focus of nutrition research is needed to integrate interindividual genetic variations with the effects of the plethora of bioactive compounds found in a single meal. The holistic paradigm proposed by the omics technologies opens a promising avenue by which to achieve the challenging present objectives of nutritional sciences (van Ommen and Stierum, 2002; Kussmann et al., 2006; Kussmann et al., 2008; Herrero et al., 2012). The optimal diet for the fifth nutritional pattern of human society has not yet been defined. Foodomics represents an invaluable tool for the development of optimal nutrition (Herrero et al., 2012). This chapter discusses the contribution of this discipline in the context of the optimal nutrition concept.

11.2 NUTRIGENOMICS

Diet is an environmental factor of major importance affecting practically all cellular processes, and thus it has a major impact on health and on the development of many diseases such as obesity, cancer, and heart disease. Scientific advances and the development of high-throughput methods (e.g., microarray technology and protein and metabolite analysis) have made it possible to gain knowledge about nutrient–gene interactions and to partially understand the molecular links between nutrition, physiology, and disease. This has led to the development of nutrigenomics, a specialized area of Foodomics that studies how dietary components interact with the genome as well as the resulting changes in proteins and metabolites (Garcia-Cañas et al., 2010; Herrero et al., 2012). Nutrigenomics is mainly built on three high-throughput omics techniques: (1) nutritranscriptomics, the study of the global mRNA expression levels in a cell, tissue, or organism in a given set of nutritional conditions; (2) nutriproteomics, the large-scale study of proteins, particularly their structures and functions, to identify the molecular targets of dietary components; and (3) nutrimetabolomics, the measurement of all metabolites in a biological tissue, biofluid, or cell under a nutritional stimulus (Panagiotou and Nielson, 2009) (Fig. 11.2). The study of the transcripts at the global level indicates what appears to happen, the proteome describes what makes it happen, and the metabolome shows what has happened (Kussmann et al., 2010). Thus, integrating these omics disciplines could allow for the generation of holistic views that provide a better understanding of the interplay of the genome with dietary components.

11.2.1 Nutrigenomics and Prevention of Chronic Diseases: Looking for Health-Related Nutritional Biomarkers

Traditionally, nutrition research focused on classical epidemiology and human intervention studies consisting of large prospective cohorts. These epidemiological studies allowed for the establishment of relationships between certain diseases and the type of nutrition of the population under study. For example, this sort of analysis related the beneficial effects of the Mediterranean diet with the prevention of CVD and led to some consumption recommendations (Knoops et al., 2004; Covas, 2007). Later on, cohort studies enabled the identification of a number of disease biomarkers, such as low-density lipoprotein cholesterol (LDL-C), glucose, and triglycerides. However, in some cases, these markers are less informative because they are also involved in the molecular processes that direct the development of the disease. This is the case for LDL-C and glucose. As a consequence, these markers are suitable for the identification of disease but are not useful for disease prevention through nutrition because when they appear elevated the disease process has already started. Therefore, it is necessary to find early biomarkers that are not involved in the mechanisms that drive the disease to enable disease prediction (van Ommen et al., 2009). These new biomarkers are defined as specific variables (genes, proteins, and metabolites) that can be objectively measured and evaluated as indicators of a biological process, used to identify diseased populations, and most importantly, used to predict the development of future disease (Wehrens et al., 2011). One of the most important characteristics of biomarkers is that they must have clinical utility. Hocquette et al.
(2009) stated that to consider a new biomarker as useful it should be “easier, better, faster, and cheaper.” Therefore, a biomarker must provide information in a simpler and faster way than previous biomarkers to be considered useful. In addition, the new biomarkers must be robust in terms of both identification and quantification and must be highly specific. The dramatic improvement in the equipment used to analyze samples has changed how the search for a biomarker is conceived. Initial investigations were focused on only one set of biomarkers belonging to the same nutritional hypothesis due to the limited number of metabolites that could be monitored from a single sample. These studies were commonly known as following the “targeted hypothesis.” With the improved power of analytical technologies, these restrictions have been overcome and thousands of metabolites and nutrients can now be measured simultaneously. These studies follow the “nontargeted hypothesis,” which means that an overall picture of all the metabolites is taken and all the interactions among metabolites from different pathways can be studied at once. Consequently, it is possible to obtain biomarkers that are better integrated into the biological process and that can be more accurately related to nutrition in terms of health and disease (Koulman and Volmer, 2010). To demonstrate the implications of nutrition in health optimization and disease prevention, nutrigenomics has to cope with the relatively new European Regulation (EC) No 1924/2006 (EU, 2006) on nutrition and health claims made on foods. Taking this legislation into consideration, it is necessary to find new biomarkers that are very close to or even within the range of the healthy state. This challenge is very difficult to address, in part due to the natural tendency of organisms toward homeostasis, which can be defined as the adaptation of the organisms to a certain alteration to maintain normal parameters.
As a consequence of homeostasis, the small changes induced by nutrition are very difficult to quantify because they rapidly return to their normal values. In addition, the problem of interindividual variation has to be taken into consideration because changes among individuals are sometimes larger than changes caused by nutrition, which makes the identification of robust biomarkers difficult (van Ommen et al., 2008). Increased interest in the nutrigenomics field and the need to find new “healthy” biomarkers according to the new European legislation has led the scientific community to focus on identifying and validating “healthy” nutrigenomics-based biomarkers through the quantification of the robustness of the homeostatic mechanisms involved in maintaining optimal health using nutrigenomics tools. The identification and validation of such biomarkers in humans is expected to provide new scientific evidence to help support health claims on food in Europe.
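
The difficulty posed by interindividual variation can be illustrated with a small numerical sketch. The Python example below is not taken from any of the cited studies: the marker values are invented, and the comparison of a paired (within-subject) statistic with an unpaired one is an assumption chosen only to show why a small nutritional effect can be hidden by large differences among individuals.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical plasma marker values (arbitrary units) for six volunteers,
# measured before and after a dietary intervention. Illustrative data only.
before = [4.1, 5.6, 3.2, 6.8, 4.9, 5.3]
after = [4.4, 5.8, 3.6, 7.0, 5.3, 5.5]


def unpaired_t(a, b):
    """Welch t statistic: treats the two visits as independent groups."""
    se = sqrt(stdev(a) ** 2 / len(a) + stdev(b) ** 2 / len(b))
    return (mean(b) - mean(a)) / se


def paired_t(a, b):
    """Paired t statistic: each subject serves as his or her own control."""
    diffs = [y - x for x, y in zip(a, b)]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))


# Spread among individuals at baseline versus the average diet-induced change.
between_subject_sd = stdev(before)
mean_change = mean([y - x for x, y in zip(before, after)])

print(f"between-subject SD: {between_subject_sd:.2f}")
print(f"mean change after intervention: {mean_change:.2f}")
print(f"unpaired t: {unpaired_t(before, after):.2f}")
print(f"paired t: {paired_t(before, after):.2f}")
```

In this invented data set the spread among individuals is several times larger than the average change, so the unpaired statistic is small while the paired statistic is large. This is only one possible way to handle the problem; it presumes that repeated measurements on the same subjects are available.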

11.2.2 Advances in Nutrigenomics

The significant advances in the omics technologies during the last decade have led to many studies in which these high-throughput techniques have been used to identify new biomarkers related to food consumption and to better understand how bioactive food components affect gene expression as well as protein and metabolite levels (Garcia-Cañas et al., 2010; Wittwer et al., 2011; Herrero et al., 2012). This section aims to review the relevant findings of these studies with regard to (a) the achievement of optimal nutrition, (b) the maintenance of a healthy status, (c) the prevention of chronic diseases, and (d) the identification of new molecular biomarkers. Although a wide set of tissues have been used in different nutrigenomic studies, we will mainly focus on those performed using biosamples obtained through minimally invasive techniques (i.e., urine and blood components such as serum, plasma, and peripheral blood mononuclear cells (PBMCs)). Interest is focused on these biosamples because the findings can be directly extrapolated to human studies. The nutrigenomic studies that have been conducted thus far are summarized in Table 11.1 and further explained in the text.

TABLE 11.1 Summary of Nutrigenomic Studies

Dietary Intervention | Method | Results | References

Nutritranscriptomics
Quercetin | Human CO115 colon cancer cells; 100 μM, 24 and 48 h; microarray analysis | Changes in mRNA levels of 5000–7000 genes (apoptosis and xenobiotic metabolism) | Murtaza et al. (2006)
n-3 PUFAs | 16 obese and insulin-resistant subjects; 8 wk; microarray analysis in PBMCs and skeletal muscle | 88% of transcripts co-expressed in PBMCs and skeletal muscle. Strong correlation between transcript expression levels of PBMCs and skeletal muscle | Rudkowska et al. (2011)
Normal fat (chow) and high-fat (cafeteria) diets | Male Wistar rats; 3 experimental conditions: ad libitum feeding, 14 h fasting, and 14 h fasting + 6 h refeeding; microarray analysis in PBMCs | Changes in mRNA levels of genes involved in energy homeostasis and cholesterol metabolism in normal weight rats. Impaired nutritional regulation in diet-induced (cafeteria) obese rats. Identification of slc27a2 as a putative biomarker of overweight development | Caimari et al. (2010a), (2010b), (2010c)
24 and 48 h of fasting | 4 healthy men; acute (24 and 48 h); microarray analysis in PBMCs | Modulation of the mRNA expression of more than 1000 genes, especially those involved in fatty acid β-oxidation and regulated by PPARα | Bouwens et al. (2007)
Low-caloric diet | Obese men; 8 wk; microarray analysis in PBMCs | Changes in the mRNA levels of genes related to body weight, oxidative stress, and inflammation | Crujeiras et al. (2008)
High-protein or high-carbohydrate breakfast | 8 healthy men; acute (2 h); microarray analysis in blood leukocytes | 141 genes mainly involved in the immune response and signal transduction were differentially expressed in response to the two breakfasts | Van Erk et al. (2006)
PUFAs: EPA and DHA | 302 healthy elderly subjects (men and women); 26 wk; microarray analysis in PBMCs | A high EPA + DHA intake changed the expression of 1040 genes, most involved in inflammatory and atherogenic-related pathways | Bouwens et al. (2009)
PUFAs, MUFAs and SFAs | 21 healthy men; acute (6 h); microarray analysis in PBMCs | Differential effect of PUFAs, MUFAs, and SFAs on the mRNA levels of genes related with liver X receptor signaling and cellular stress responses | Bouwens et al. (2010)
PUFAs: DHA and EPA | 10 healthy men; 2 mo; microarray analysis in lymphocytes | DHA-rich fish oil supplementation modified the mRNA expression of 77 genes involved in different signaling pathways (inflammation, cell cycle, and stress metabolism) | Gorjão et al. (2006)
High- or low-phenol virgin olive oil | 20 patients (men and women) with metabolic syndrome; acute (4 h); microarray analysis in PBMCs | 98 genes linked to obesity, dyslipidemia, and type 2 diabetes mellitus were differentially expressed between the groups receiving the different oils | Camargo et al. (2010)
Virgin olive oil | 6 men and 4 women (healthy); 3 wk; microarray analysis in PBMCs | Changes in the expression of genes related to atherosclerosis development and progression | Khymenets et al. (2009)
Olive oil | 6 healthy men; acute (6 h); microarray analysis in PBMCs | Changes in the mRNA levels of genes related to metabolism, cellular processes, cancer, atherosclerosis, inflammation, and DNA damage | Konstantinidou et al. (2009)
Isoflavones | 30 healthy, nonobese, postmenopausal women; 84 d; microarray analysis in lymphocytes | Increased expression of genes associated with cAMP signaling and cell differentiation and decreased expression of genes associated with cyclin-dependent kinase activity and cell division | Niculescu et al. (2007)

Nutriproteomics
Soy isoflavones (50 mg/d) | Postmenopausal women; 8 wk intervention; PBMCs by 2-DE | Alteration of the expression of 29 proteins. Atherosclerotic-preventive activity | Fuchs et al. (2007)
Fish oil (3.5 g/d) | Healthy humans; 8 wk intervention; serum by 2-DE | Downregulation of some proteins. Activation of anti-inflammatory and lipid-modulating mechanisms involved in coronary heart disease | de Roos et al. (2008b)
α-Tocopherol and selenium (400 IU of α-tocopherol and/or 200 μg of selenium/d) | Men with prostate cancer; 3 to 6 wk intervention; serum by MS techniques | Selenium and α-tocopherol induced changes in the proteomic patterns associated with prostate cancer-free status | Kim et al. (2005)
Vitamin C (350 mg/3 times wk) | Hemodialysis patients; 2 mo intervention; plasma by MS techniques | Alteration of polypeptides | Weissinger et al. (2006)
α-Tocopherol (134 or 268 mg/d) | Healthy humans; 28 d intervention; plasma by 2-DE and MALDI-MS | Increase in plasma levels of apolipoprotein A | Aldred et al. (2006)
Cruciferous vegetables (436 g cruciferous or 190 g allium or 270 g apiaceous vegetables/d) | Healthy nonsmoking men/women; 7 d after intervention; serum by MS techniques | Identification of protein biomarkers of vegetable consumption | Mitchell et al. (2005)

Nutrimetabolomics
Soy isoflavones (45 mg/d) | Healthy women; 1 mo treatment; plasma by NMR | Alteration of energy metabolism | Solanky et al. (2003)
Soy isoflavones (60 or 50 g/d of conjugated or unconjugated isoflavones, respectively) | Healthy women; 1 mo treatment; urine by NMR | Effect on osmolyte fluctuation and energy metabolism. Unconjugated isoflavones had a greater effect | Solanky et al. (2005)
Black and green tea (1 g/d) | Healthy nonsmoking men; plasma and urine by NMR | Identification of biomarkers of tea consumption. Effect of green tea on oxidative energy metabolism | Van Dorsten et al. (2006)
Plant sterol esters (3 g/d) | Normo and hypercholesterolemic adults; serum by NMR and HPLC | Decrease in LDL-C and LDL:HDL ratio | Carr et al. (2009)
Dark chocolate (40 g/d) | Healthy humans; 14 d; plasma and urine by NMR and HPLC-MS | Reduction in urinary excretion of cortisol and catecholamines | Martin et al. (2009)
Carbohydrate ingestion (high oat–white bread–potato (OWP) and rye bread–pasta (RP)) | Humans with metabolic syndrome; serum by LC-MS and GC-MS | OWP increased proinflammatory lysoPC. RP increased DHS levels and decreased isoleucine | Lankinen et al. (2010)
Fish consumption (4 meals/wk of fatty or lean fish) | Humans with myocardial infarction or unstable ischemic attack; 8 wk treatment; plasma by LC-MS and GC-MS | Decrease in bioactive lipid species in the fatty fish group. Increase of cholesterol esters and long-chain triacylglycerols in the lean fish group | Lankinen et al. (2009)
Omega-3 polyunsaturated fatty acids (5 g/d) | Obese women; 24 wk intervention; plasma by NMR, LC-MS and GC-MS | Fish oil increased the proportion of phospholipid species and reduced the measured total triacylglycerides | McCombie et al. (2009)

11.2.2.1 Nutritranscriptomics

DNA microarrays are the most widely used technique in transcriptomics. Microarrays are powerful tools with the capability to simultaneously measure up to 50,000 transcripts. In the field of nutrition, the global analysis of transcripts could elucidate the effect of a nutrient or diet on metabolic pathways, identify potential biomarkers of chronic diseases, and determine the impact of a diet and/or a single nutrient on a human pathology (Garcia-Cañas et al., 2010; Masotti et al., 2010; Wittwer et al., 2011). As an example, studies carried out in human CO115 colon cancer cells revealed that quercetin, a natural flavonoid that is widely distributed in food (tea, apples, and onions), modulates gene expression in colon cancer, with approximately 5000–7000 genes differentially expressed between quercetin-exposed and nonexposed cells (Murtaza et al., 2006).

Recent years have seen many transcriptomic studies focused on nutrition using blood as a source of RNA. Although whole blood has been used in some studies, it seems more convenient to use PBMCs because the RNA is derived from a less variable population (monocytes and lymphocytes), thus guaranteeing less inter-subject variation in gene expression profiles (Wittwer et al., 2011). Moreover, various studies performed in both animals and humans have shown that PBMCs capture metabolic changes related to nutrition and that these cells can reflect the metabolic adaptations that occur in different tissues (Bouwens et al., 2007; Bouwens et al., 2010; Caimari et al., 2010a, 2010b; Rudkowska et al., 2011). As an example, Rudkowska et al. (2011) used a nutritranscriptomic approach in obese and insulin-resistant humans to demonstrate that 8-week supplementation with n-3 polyunsaturated fatty acids (PUFAs) induced comparable changes in the majority of transcripts in PBMCs and skeletal muscle tissue. In a whole-genome microarray experiment performed in rats, Caimari et al. (2010a, 2010b) demonstrated that the expression of genes involved in energy homeostasis (Caimari et al., 2010a) and cholesterol metabolism (Caimari et al., 2010b) is regulated in PBMCs in response to fasting and feeding conditions and that this regulation is impaired in diet-induced (cafeteria) obese rats. Moreover, these studies enabled the identification of a gene, solute carrier family 27 (fatty acid transporter), member 2 (slc27a2), as a putative molecular biomarker of overweight/obesity development associated with the intake of a hyperlipidic diet (Caimari et al., 2010c).

In humans, a number of studies have reported effects on PBMC gene expression from fasting (Bouwens et al., 2007) and from the consumption of different diets (van Erk et al., 2006; Crujeiras et al., 2008), nutrients such as PUFAs (Gorjão et al., 2006; Bouwens et al., 2009; 2010; Rudkowska et al., 2011) or bioactive food compounds such as olive oil (Khymenets et al., 2009; Konstantinidou et al., 2009; Camargo et al., 2010) and isoflavones (Niculescu et al., 2007). For example, PBMCs were used to explore the effects of 26-week consumption of the PUFAs eicosapentaenoic acid (EPA) and docosahexaenoic acid (DHA) on whole-genome gene expression (Bouwens et al., 2009).
This study showed that the intake of EPA + DHA induces anti-inflammatory and antiatherogenic gene expression in the PBMCs of healthy elderly subjects. The mRNA profiles of these cells also reflect acute postprandial changes that differ according to the amount of each macronutrient present in the diet (van Erk et al., 2006) or the type of fatty acid consumed (Bouwens et al., 2010). In this sense, Bouwens et al. (2010) demonstrated that the intake of shakes enriched in PUFAs, monounsaturated fatty acids (MUFAs), or saturated fatty acids (SFAs) had different effects on the expression of genes involved in liver X receptor signaling in the PBMCs of healthy young subjects.

Taken together, these findings strongly suggest that the analysis of gene expression profiles in PBMCs can be highly useful to illuminate the capacity of cell systems to interact with nutrients and bioactive food components. In this way, the use of PBMCs can make significant contributions to elucidating the impact of optimized nutrition on phenotypic expression in humans.

11.2.2.2 Nutriproteomics

Proteins are the macromolecules that participate in all cellular processes and carry out structural and mechanical functions. Therefore, proteomics, which quantifies global protein levels to elucidate their cellular localization and identify protein interactions and posttranslational modifications, is highly important to understand the physiological processes that occur in a biological system (Hocquette et al., 2009; García-Cañas et al., 2010; Kussmann et al., 2010; Wittwer et al., 2011). In the context of nutrigenomics, the proteome provides valuable information regarding the impact of a nutrient or diet on a biological system and could be a useful tool to identify biomarkers for a given physiological or pathological condition.

For the discovery of early biomarkers of disease, two-dimensional gel electrophoresis (2-DE) and mass spectrometry (MS)-based technologies are the most widespread methodologies used in proteomic analyses (Zhang et al., 2008). Biological samples such as human plasma, serum, platelets, and PBMCs have been used to search for biomarkers of disease related to nutrition because they have been shown to be excellent platforms for the discovery of qualitative and quantitative changes in physiologically relevant proteins upon dietary interventions (de Roos, 2008; de Roos et al., 2008a).

One example of the application of PBMCs in proteomics studies involving dietary interventions is the study by Fuchs et al. (2007). This study focused on the identification of biomarkers of response to dietary supplementation with an isoflavone extract in postmenopausal women. These authors were able to identify 29 proteins that exhibited significantly modified expression levels in the PBMCs under the soy isoflavone intervention, including a variety of proteins involved in an anti-inflammatory response. Specifically, some proteins that promote increased fibrinolysis were found at increased concentrations. On the other hand, those that mediate the adhesion, migration, and proliferation of vascular smooth muscle cells were found at reduced levels after the consumption of soy extract. Based on the nature of the identified proteins, the authors concluded that soy isoflavones may increase the anti-inflammatory response in PBMCs, contributing to the atherosclerosis-preventive activities of a soy-rich diet.

Later on, de Roos et al. (2008b) carried out an experiment on the serum proteomes of healthy volunteers who followed a 6-week diet supplemented with fish oil. 
With the aid of tandem MS, these authors were able to identify biomarkers of inflammation and lipid modulation (namely, apolipoprotein A1, apolipoprotein L1, zinc-α-2-glycoprotein, haptoglobin precursor, α-1-antitrypsin precursor, anti-thrombin III-like protein, serum amyloid P component, and hemopexin) that were significantly modified by fish oil supplementation. They concluded that these proteins could be useful diagnostic biomarkers to assess the mechanisms by which fish oil prevents the early onset of coronary heart disease.

Some researchers have focused on the applications of proteomics to study the effects that nutrients could exert in vivo. As an example, Kim et al. (2005) investigated the effect of dietary supplementation with vitamin E and/or selenium against prostate cancer. The plasma was pre-fractionated to identify low-molecular-weight proteins using surface-enhanced laser desorption/ionization, followed by principal component analysis to differentiate the plasma proteome of prostate cancer-positive subjects from that of control subjects. In addition, they observed that the combination of selenium and vitamin E induced significant changes in the proteome of prostate cancer patients with a cancer-free status. Other examples of studies on the effects of dietary supplementation on human disease include oral vitamin C supplementation in hemodialysis patients (Weissinger et al., 2006), α-tocopherol for CVD prevention (Aldred et al., 2006), and cruciferous vegetables for insulin resistance (Mitchell et al., 2005). More examples can be seen in the review of Griffiths and Grant (2006).
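The principal component analysis step mentioned for the Kim et al. (2005) study can be illustrated with a small sketch. The following Python code is not the authors' pipeline: it assumes a hypothetical table of peak intensities (rows are plasma samples, columns are protein peaks), uses invented numbers, and extracts only the first principal component by power iteration so that it needs nothing beyond the standard library.

```python
import math

def first_principal_component(samples, iterations=200):
    """Return (loadings, scores) of the first principal component.

    samples: list of equal-length lists (rows = samples, columns = peaks).
    """
    n, p = len(samples), len(samples[0])
    # Center each peak (column) on its mean across samples.
    means = [sum(row[j] for row in samples) / n for j in range(p)]
    centered = [[row[j] - means[j] for j in range(p)] for row in samples]
    # Sample covariance matrix between peaks (p x p).
    cov = [[sum(r[a] * r[b] for r in centered) / (n - 1) for b in range(p)]
           for a in range(p)]
    # Power iteration: converges to the eigenvector with the largest
    # eigenvalue, provided the start vector is not orthogonal to it.
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Project every centered sample onto the component.
    scores = [sum(r[j] * v[j] for j in range(p)) for r in centered]
    return v, scores

# Hypothetical peak intensities (3 peaks) for 3 control and 3 case samples.
controls = [[10.0, 5.0, 1.0], [11.0, 5.0, 1.2], [9.0, 5.1, 0.9]]
cases = [[20.0, 5.0, 3.0], [21.0, 4.9, 3.1], [19.0, 5.1, 2.9]]
loadings, scores = first_principal_component(controls + cases)
control_scores, case_scores = scores[:3], scores[3:]
print("loadings:", loadings)
print("control scores:", control_scores)
print("case scores:", case_scores)
```

In this toy data set the two groups fall on opposite sides of the first component, which is the kind of separation a score plot is used to reveal. Real spectra would need normalization and many more peaks, and a dedicated numerical library would normally replace the hand-written iteration.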

11.2.2.3 Nutrimetabolomics

Metabolomics, the last of the omics techniques to be developed, focuses on the study of the metabolites that are present in biological samples, therefore representing the end point of the omics cascade (from genes to proteins to metabolites). Compared with the other omics technologies, metabolomics has the advantage of considering the dynamic metabolic status of the whole organism. Therefore, as the metabolome reflects past events that include the whole metabolism and the interaction with the environment, metabolomics techniques have the ability to predict phenotypic properties more accurately than the other omics approaches (Nicholson and Wilson, 2003; Roux et al., 2011). This makes metabolomics the best choice for systems biology studies of interactions at different molecular levels (Álvarez-Sánchez et al., 2010). Traditionally, studies have been carried out in biological fluids such as plasma, serum, and urine using two types of analytical platforms: nuclear magnetic resonance (NMR) and MS coupled with either liquid or gas chromatography.

Currently, metabolomics is being used in several studies to relate dietary interventions with health and to identify new biomarkers of food consumption and disease progression. For example, Solanky et al. (2003) used NMR to evaluate the metabolomic changes that occurred as a consequence of a dietary intervention with soy isoflavones in healthy women from 21 to 29 years of age. NMR analysis of plasma samples showed an increase in the lipoprotein fraction and in lactate and a decrease in sugar content. These results led these authors to conclude that soy isoflavones induced an alteration in the energy metabolism of the volunteers. To study the behavior of these isoflavones in more depth, the same research group carried out a complementary assay in which they identified metabolomic changes in urine resulting from the ingestion of conjugated and unconjugated isoflavones (Solanky et al., 2005). 
The NMR results revealed an increase in methylamine pathway intermediates and suggested that the ingestion of soy isoflavones is involved with osmolyte fluctuation and energy metabolism.

Nutritional intervention studies carried out by the ingestion of different natural products have demonstrated that NMR technology is a powerful tool for the assessment of subtle metabolic changes and that it continues to be useful in metabolomics despite the exponential improvements in tandem MS tools. For instance, studies directed by Van Dorsten et al. (2006) used NMR to differentiate samples after the consumption of black and green teas. They showed that these two types of teas had different impacts on endogenous metabolites in urine and plasma. Specifically, green tea led to a greater increase in urinary excretion of citric acid cycle intermediates, suggesting a greater effect on human oxidative energy metabolism and/or biosynthetic pathways. On the other hand, Carr et al. (2009) evaluated the effect of plant sterol ingestion and demonstrated a resulting decrease in serum LDL-C concentrations in adults.

MS techniques have recently come into frequent use for metabolomics studies. Some of these studies take advantage of the high sensitivity of MS for the characterization of specific metabolites after the consumption of specific foods. For example, Martin et al. (2009) investigated the metabolomic changes in urine samples after the ingestion of dark chocolate. In this case, they carried out a combined analysis using NMR and LC-MS/MS. The results led them to conclude that dark chocolate reduced the excretion of the stress hormone cortisol and catecholamines and partially normalized the differences in energy metabolism and gut microbial activities related to stress.

CVD is one of the main causes of death today, resulting in significant scientific efforts directed toward solving this problem. For instance, Lankinen et al. 
(2010) analyzed the effect of a dietary carbohydrate modification on the metabolome. The results highlighted that this modification can contribute to proinflammatory processes and produce changes in insulin and glucose metabolism. In another experiment, the same authors studied the effect of lean fish and fatty fish in volunteers with coronary heart disease. Their results from the serum samples showed a decrease in the levels of bioactive lipids resulting from the ingestion of fatty fish, suggesting a protective effect of fatty fish against CVD (Lankinen et al., 2009). In line with these experiments on the beneficial effects of fatty fish, McCombie et al. (2009) related the intake of omega-3 polyunsaturated fatty acids (by means of a diet supplemented with fish oil) to a change in the plasma triglyceride composition, leading to an effect on CVDs.

11.2.3 Current Limitations of the Omics Techniques

Despite the promising results of the omics technologies, they still have some limitations that have to be overcome. Some of these limitations are technical and others are related to the interpretation of the data.

Most of the technical limitations are common among the different platforms. For example, the existence of background noise makes the detection of low signals difficult, variability problems still need to be addressed, and the wide range of data analysis techniques adds variance among studies. Sample preparation is a critical point that still needs further development, especially in proteomics and metabolomics (Álvarez-Sánchez et al., 2010). Finally, the high costs of these techniques represent a significant limitation (García-Cañas et al., 2010; Masotti et al., 2010; Wittwer et al., 2011).

Significant efforts are underway to address these problems. For example, in nutritranscriptomics the Microarray Quality Control Consortium (MAQC) has contributed to ensuring data quality and to the standardization of microarray procedures. Moreover, the development of the MIAME (Minimum Information About a Microarray Experiment) guideline by the Microarray Gene Expression Data (MGED) organization has been very useful to improve the exchange of microarray data between different platforms and to ensure that microarray experiments can be conveniently interpreted (García-Cañas et al., 2010; Masotti et al., 2010; Wittwer et al., 2011).

On the other hand, other limitations of foodomics are related to the biological meaning of the data. Omics techniques generate thousands of data points that have to be organized, statistically treated, and analyzed to extract conclusions. For instance, microarray analyses provide a high number of transcripts that are up- or downregulated. 
The interpretation of these changes is not an easy matter and requires powerful statistical and bioinformatic software (METACORE, DAVID, PANTHER, etc.) and databases (e.g., Gene Ontology, KEGG) to obtain a suitable biological interpretation of the results (García-Cañas et al., 2010). Specialized software, such as R and MATLAB, is required to extract the information contained within metabolomics data. These software programs allow for the alignment of features and the identification of significant changes after a dietary intervention. After this step, the metabolites that give rise to these changes can be identified by seeking information in freely available databases such as the Human Metabolome Database (HMDB), Metlin, and PubChem.

Moreover, the changes resulting from a nutritional intervention in mRNA, protein, and metabolite levels are usually lower than those resulting from treatment with drugs or medicines, most likely due to the homeostatic robustness of biological systems, increasing the difficulty of obtaining definite conclusions (Wittwer et al., 2011).

An integrative approach combining the information obtained in transcriptomic studies with that obtained in proteomics and metabolomics would strongly contribute to the understanding of the influences of nutrients or bioactive food compounds on biological systems. This approach is known as systems biology (Hocquette et al., 2009; García-Cañas et al., 2010; Masotti et al., 2010; Wittwer et al., 2011; Panagiotou and Nielsen, 2009) (Fig. 11.2).
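When thousands of transcripts or metabolite features are each tested for a change after an intervention, the statistical treatment described in this section usually includes a correction for multiple testing. As a minimal sketch, assuming that one p-value per feature has already been computed, the Benjamini–Hochberg step-up procedure could be written in Python as follows. This is a common choice rather than a method the cited studies are stated to have used, and the function name and the example p-values are hypothetical.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans: True where the feature is declared
    significant at false discovery rate `alpha`."""
    m = len(p_values)
    # Indices of the features, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k whose p-value satisfies p_(k) <= (k / m) * alpha.
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_rank = rank
    # Step-up rule: every feature ranked at or below max_rank is rejected,
    # even if its own p-value missed its individual threshold.
    significant = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            significant[idx] = True
    return significant

# Hypothetical p-values for four features measured before and after a diet.
example = [0.001, 0.03, 0.035, 0.9]
print(benjamini_hochberg(example))
```

In the example, the second feature (p = 0.03) misses its own threshold of 0.025 but is still retained because the third feature (p = 0.035) passes its threshold of 0.0375. Because nutritional effects are small, as noted above, the choice of `alpha` and of the underlying test can strongly affect how many features survive.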

11.2.4 Systems Biology

Systems biology is defined as an integrative approach considering all the information generated by the omics techniques to extract conclusions from a holistic point of view (Fig. 11.2). As previously noted, nutrients and bioactive food compounds can only produce subtle changes in mRNA, protein, and metabolite levels, making the interpretation of the data in a biological context difficult. Therefore, systems biology could provide (1) a better understanding of the molecular mechanisms by which nutrients exert their effects and (2) the identification of new biomarkers (transcripts, proteins, or metabolites) of food consumption and disease progression. However, it is important to highlight that it is not easy to achieve this holistic view. Despite the great recent technological improvements of the different omics techniques, the optimal application of systems biology still requires the development and the improvement of powerful bioinformatic tools allowing for the appropriate integration of the information obtained from the different “omics” levels (Hocquette et al., 2009; García-Cañas et al., 2010; Panagiotou and Nielsen, 2009).

Although a systems biology approach has not been extensively applied to the field of nutrition, there are some interesting examples in the literature, both in murine models and humans (Griffin et al., 2004; Herzog et al., 2004; Dieck et al., 2005; Hwang et al., 2005; Schadt et al., 2005; Schnackenberg et al., 2006; Arbones-Mainar et al., 2007; Rezzi et al., 2007; Ferrara et al., 2008; Bakker et al., 2010). As an example, Bakker et al. (2010) conducted a nutrigenomic approach in healthy overweight men with mildly elevated plasma C-reactive protein (CRP) to determine whether a mixture of different specific dietary components (resveratrol, green tea extract, α-tocopherol, vitamin C, n-3 (omega-3) PUFAs, and tomato extract) was able to reduce low-grade inflammation as well as metabolic and oxidative stress. 
For these purposes, the authors analyzed the transcriptomes of PBMCs and adipose tissue and the levels of 120 plasma proteins and 274 plasma metabolites as well as different inflammatory and oxidative stress markers in plasma and urine. Interestingly, the dietary treatment in this study did not change the plasma levels of the principal inflammatory marker CRP. However, the integrated analysis of the “omics data” revealed a great number of subtle changes that pointed toward a modulation of inflammatory processes after the nutritional treatment. Moreover, the integrative large-scale analysis of gene expression, proteins, and metabolites also revealed an improvement of the endothelial function, changes in oxidative stress pathways, and an increase of fatty acid oxidation in liver. Thus, this study clearly illustrates that although the systems level view of biology is still in development, it is a highly promising method that can be used in the field of nutrition to detect new biomarkers and gain knowledge about the molecular mechanisms involved in the regulation of different metabolic pathways related to health and disease progression.

Therefore, it is expected that in the near future, the integration of the data obtained through transcriptomics, proteomics, and metabolomics, together with progress in nutrigenetics, will provide sufficient knowledge to design optimal and personalized diets that allow for health maintenance and disease prevention in humans (Fig. 11.2).

11.3 NUTRIGENETICS AND PERSONALIZED NUTRITION

As described above, nutrition is one of the primary environmental exposures that determines health. However, the effect of dietary changes on phenotypes (e.g., plasma lipid measures, body weight, and blood pressure) differs significantly between individuals, and the reality of diet and health is that consuming the same diet does not lead all individuals within a population to optimal health. Thus, the use of diet to promote health and prevent disease requires the personalization of diet (Simopoulos, 2010).

One important factor for this diversity in dietary responses to nutrients is the existence of genetic variations among nutritionally relevant genes (i.e., genes that control digestion, absorption, distribution, transformation, storage, and excretion by proteins), leading to different dietary requirements for different individuals (Simopoulos, 2010). In fact, personalized nutrition could be defined as the prescription of individual diets to enable an optimal physiological response according to individual genotypic variation that will help to prevent, mitigate, or cure chronic disease (Lovegrove and Gitau, 2008).

The field of nutrigenetics studies the response to dietary stimuli on the basis of individual genetic makeup. In this respect, nutrigenetics, which is considered a part of the discipline of foodomics (Herrero et al., 2012) (Fig. 11.2), aims to identify and characterize gene variants associated with differential responses to nutrients and to relate this variation to disease states (Raqib and Cravioto, 2009). The ultimate goal of nutrigenetics is to use genetic profiling for the earlier detection of disease risk and the personalization of dietary recommendations provided to individuals or population subgroups (Rimbach and Minihane, 2009). 
There is relatively little variation in genetic makeup among individuals, with all humans sharing 99.9% identity at the gene sequence level and individual genetic variation accounting for the remaining 0.1% (Raqib and Cravioto, 2009). Among the various types of sequence variations, the majority of nutrigenetic efforts are focused on single-nucleotide polymorphisms (SNPs), which account for up to approximately 90% of all human genetic variations. This type of genetic variation consists of single base pair differences in the DNA sequence with a frequency of more than 1% (Wittwer et al., 2011). Of the approximately 10 million SNPs in the human genome, many have functional consequences, including the ability to alter the metabolic responses of individuals to diet and to influence the risk of nutrition-related chronic diseases (Lovegrove and Gitau, 2008; Raqib and Cravioto, 2009; Stover and Caudill, 2008). Moreover, some of these SNPs occur in 5–50% of the population, making them significant for public health. Therefore, SNP analysis provides a powerful molecular tool for studying the role of nutrition in clinical, metabolic, and epidemiological studies to determine optimum food for health (Raqib and Cravioto, 2009).

Phenylketonuria (PKU) was the first “inborn error of metabolism” caused by a single-gene defect that responded to dietary treatment, employing a low-phenylalanine-containing diet for nutrigenetic management (Simopoulos, 2010). This monogenic disorder, among others, illustrates the severe consequences that can result from genetic disruptions but more importantly demonstrates that genetic diseases can be managed and/or alleviated through diet (Stover and Caudill, 2008). 
Nevertheless, single-gene disorders tend to be relatively rare, with the majority of nutrition-related pathologies (obesity, metabolic syndrome, type 2 diabetes, CVD, and some types of cancers) exhibiting polygenic and multifactorial dependence, with their onset and progression affected by different heterogeneous genes and gene variants, as well as by the interaction between genes and environmental factors (Simopoulos, 2010; Virgili and Perozzi, 2008).

One of the best-studied examples of human nutritional intervention using nutrigenetics applied to polygenic disease is methylenetetrahydrofolate reductase (MTHFR) gene variants and folate supplementation in the prevention of CVD risk. MTHFR has a role in supplying 5-methyltetrahydrofolate, which is necessary for the remethylation of homocysteine to form methionine, and folate is essential to the efficient functioning of MTHFR. There is a common polymorphism in the MTHFR gene, present in 5–15% of the general population, in which thymine replaces cytosine at base pair 677 (MTHFR C677T), leading to two forms of the MTHFR enzyme: the wild-type protein (C), which functions normally, and the mutant variant (T), which exhibits significantly reduced activity and stability. Individuals with two copies of the wild-type gene or one copy of each appear to have normal folate metabolism, whereas carriers of two copies of the variant gene (TT) exhibit decreased methionine synthesis from homocysteine. People with this mutated genotype and low folate intake have higher plasma homocysteine levels and an increased risk of CVD. Interestingly, the activity of MTHFR can be modulated by changing the concentration of folate (the MTHFR substrate). 
Folate supplementation has been demonstrated to compensate for the decreased activity of MTHFR in TT individuals, leading to a decrease in plasma homocysteine and helping to overcome the negative health effects of this SNP (Virgili and Perozzi, 2008; Raqib and Cravioto, 2009; Simopoulos, 2010).

A second example of the applicability of nutrigenetics is the large variation in the concentration of serum LDL-C in response to fish oil supplementation. Although there are numerous well-described cardioprotective actions of the fatty acids in fish oil, moderate to high doses (>2 g/d) of EPA and DHA commonly lead to increases in LDL-C in the 5–10% range. Recent evidence strongly suggests that common variants of the apoE gene may be important to this inter-individual difference in response, together with other factors including age, gender, baseline LDL-C levels, disease status, and drug use. The apoE protein has a central role in lipoprotein metabolism, being involved in chylomicron metabolism, very-low-density lipoprotein synthesis and secretion, and in the cellular removal of lipoprotein remnants from the circulation. This gene locus is highly polymorphic, with the apoE epsilon missense mutations among the best known, which give rise to three allelic isoforms named ε2, ε3, and ε4. The proteins produced from these different alleles differ in the amino acids present at residues 112 and 158 (Rimbach and Minihane, 2009). Numerous studies have associated the increase in LDL-C observed following supplementation with EPA + DHA with an apoE4 genotype (Lovegrove and Gitau, 2008; Rimbach and Minihane, 2009). Another study indicated that DHA rather than EPA is the hypercholesterolaemic agent (Rimbach and Minihane, 2009).

The highly variable response of plasma cholesterol to cholesterol feeding is another example of the interaction of this polymorphism with lipid intake. On a low-fat/high-cholesterol diet, individuals with the apoE4/4 genotype exhibit elevated serum cholesterol, whereas those with apoE2/2 or apoE3/2 do not show an increase. On a low-fat/low-cholesterol diet, all variants show a decrease in serum cholesterol (Rimbach and Minihane, 2009). Therefore, individuals with the apoE4 genotype represent a population subgroup that is particularly sensitive to dietary cholesterol and polyunsaturated fatty acids and should be specifically targeted with advice to reduce overall consumption.
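The relationship between the residues at positions 112 and 158 and the three apoE isoforms can be written as a small lookup. The Python sketch below assumes the commonly reported residue combinations (cysteine or arginine at each position), which the text above does not list explicitly; the function names are hypothetical, and the code is an illustration of the genotype logic, not a clinical genotyping tool.

```python
# Assumed residue combinations at positions (112, 158) for each isoform.
ISOFORM_BY_RESIDUES = {
    ("Cys", "Cys"): "e2",
    ("Cys", "Arg"): "e3",
    ("Arg", "Arg"): "e4",
}

def apoe_genotype(allele_a, allele_b):
    """Return a genotype label such as 'e3/e4' from two alleles, each given
    as a (residue_112, residue_158) tuple."""
    isoforms = []
    for allele in (allele_a, allele_b):
        if allele not in ISOFORM_BY_RESIDUES:
            raise ValueError(f"unrecognized residue combination: {allele}")
        isoforms.append(ISOFORM_BY_RESIDUES[allele])
    # Sort so that the label does not depend on the order of the alleles.
    return "/".join(sorted(isoforms))

def carries_e4(genotype):
    """True if at least one allele in the genotype label is e4."""
    return "e4" in genotype.split("/")

# Hypothetical individual with one e4 allele and one e3 allele.
genotype = apoe_genotype(("Arg", "Arg"), ("Cys", "Arg"))
print(genotype, carries_e4(genotype))
```

A flag such as `carries_e4` is the kind of stratification variable that a study of the LDL-C response to EPA + DHA would use to compare subgroups. Any dietary advice derived from it would still depend on the other factors listed above, such as age, baseline LDL-C, and drug use.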

11.3.1 Epigenetics

In addition to changes in the DNA sequence, there exists another category of heritable changes that influences gene expression without altering the DNA sequence. These types of changes are called epigenetic changes (DeBusk, 2010). The totality of genome-wide epigenetic patterns is known as the epigenome and comprises four distinct but closely interacting mechanisms: DNA methylation, chromatin structure, posttranslational histone modifications, and non-coding small RNAs (Kussmann et al., 2010). Therefore, although some of the variation in the effects of diet on phenotype is due to genetic differences, as discussed earlier, epigenetic mechanisms can influence gene expression and are relevant to nutrition (Zeisel, 2011).

An important aspect of epigenetic changes is that they can be modulated by environmental factors, including dietary factors (Wilson, 2008). In fact, epigenetic changes act as a switch turning the expression of specific genes on or off in response to environmental cues. Therefore, genes can be activated or silenced by epigenetic changes, for example, by demethylation or remethylation of their promoter sequence, which may indicate the opportunity for reprogramming, possibly using nutritional means (Kussmann et al., 2010). Epigenetic modification by diet would have important clinical applications to promote health and influence the risk of certain human diseases (DeBusk, 2010).

Although nutritional epigenomics and our understanding of its influence in humans are still in their earliest stages, there are several examples of the effect of diet on the epigenetic machinery and its influence on health status or disease risk. The most prominent example from human studies is that of DNA methylation in the context of cancer. 
For instance, dietary deficiency in folate has been shown to lead to genomic hypomethylation because of the role that this nutrient plays in the generation of methyl groups through one-carbon metabolism, with potential consequences for some types of cancer. Interestingly, folate therapy has been shown to reverse the state of hypomethylation, correcting the patterns of gene expression regulated by this epigenetic mechanism (Mathers, 2008; Wilson, 2008; McKay and Mathers, 2011).

A second example is the regulation of microRNAs in cancer by some natural agents. MicroRNAs are non-coding RNA sequences that downregulate gene expression in a posttranscriptional manner. Numerous studies demonstrate that various microRNAs are deregulated in human cancers. For instance, treatment of human leukemia cells with all-trans-retinoic acid, the most biologically active form of vitamin A, resulted in differential expression of microRNAs that were previously known to be deregulated in this type of cancer. In another example, treatment of rats with colon cancer with fish oil (rich in n-3 polyunsaturated fatty acids) resulted in a reversion of the expression of five known tumor suppressor microRNAs (Parasramka et al., 2012).

The evidence shows that epigenetic modifications may account for the increasingly recognized links between the prenatal and early postnatal nutritional environment and adult health and disease (Raqib and Cravioto, 2009; Kussmann et al., 2010). Numerous studies demonstrate that the fetal environment can influence an individual’s likelihood of developing chronic disorders during adulthood. For instance, the earliest studies relating early-life undernutrition to the later development of obesity were those from victims of the Dutch Hunger Winter of 1944–1945, which showed that females born to women exposed to famine in early gestation were more likely to be obese in later life. Other studies from UK cohorts showed associations between low birth weight and increased body mass index in adulthood (Taylor and Poston, 2007). Given the major changes in epigenetic markers in early development, the epigenome may be especially plastic and susceptible to modification by dietary and other environmental factors during this period. This has led to the hypothesis that altered epigenetic markers may be one of the mechanisms that explains developmental programming in early life by dietary environmental exposures (Taylor and Poston, 2007; Stover and Caudill, 2008).

Several animal studies demonstrate that nutritional interventions can prevent or reverse the adverse effects of impaired early-life nutrition and the associated epigenetic changes. In this sense, it has been shown that dietary supplementation with substrates and cofactors of DNA methylation abrogates the female-line transmission of obesity by maternal overnutrition. Another example is that early treatment with a histone deacetylase inhibitor has been shown to reverse the phenotypic and epigenetic consequences of intrauterine growth retardation for adult-onset diabetes mellitus (Kussmann et al., 2010).

Altogether, there is good reason to think that epigenetics studies will be informative about the mechanisms through which dietary exposures influence human health over long periods. Furthermore, these studies may offer novel opportunities for interventions to prevent, delay, or treat common complex diseases.

11.3.2 The Present of Nutrigenetics and Personalized Nutrition

Although it is becoming evident that genetics influence nutritional needs, the health benefits of nutrigenetics for the prevention or treatment of diseases remain unclear (Virgili and Perozzi, 2008). To be reliable, human nutrigenetics studies should be carried out using a large number of individuals with a known diagnosis and a well-characterized dietary intake and lifestyle. Moreover, genetic variants in many genes related to human disease and diet should be determined using high-throughput genotyping procedures such as gene chip methods and next-generation sequencing technologies. However, the current reality is that large and well-characterized nutrigenetic studies are not common and have contributed little to the understanding of diet–genotype interactions, most likely due to the difficulty in obtaining human samples for research studies. As a result, most human nutrigenetic studies are based on the study of single genes and nutrients, the so-called candidate gene approach based on intervention studies focused on biologically relevant genes related to the phenotype of interest (Rimbach and Minihane, 2009), in which subjects receive a controlled dietary intake. However, these studies do not represent the complexity with which nutrients influence genes, as food is composed of multiple nutrients affecting several genes (Mathers, 2003) and most diet-related pathologies are affected by multiple genes (Simopoulos, 2010).

Moreover, several deficiencies in nutrigenetic studies need to be overcome to make personalized nutrition feasible in the near future. For instance, better understanding is needed of the genetic basis of polygenic pathologies and of how environmental factors determine the susceptibility to develop these multifactorial diseases (Virgili and Perozzi, 2008; Simopoulos, 2010). 
Nutrigenetics studies also need to take into account that the individual response to diet depends not only on environmental factors and extrinsic variances such as sex or race but also on variations arising through life stages (growth, pregnancy, and old age) (Simopoulos and Childs, 1990) and on several other factors such as absorption, digestion, and storage. Furthermore, the bioavailability and coingestion of other nutrients should be considered (Rucker and Tinker, 1986; Williams et al., 1990). Therefore, more large-scale human intervention studies with robustly characterized dietary exposure and other lifestyle factors are needed to thoroughly evaluate the relevant gene–genotype–diet interactions. This necessity should encourage the development of consortia that cross national and continental borders to facilitate the pooling of resources, including biological samples and data. Furthermore, although it is expensive and difficult to collect data in gene variant databases and biobanks, doing so has been proposed as a necessity for progress in nutrigenetic research (Lovegrove and Gitau, 2008). In addition, more nutrigenomics studies are needed because the evidence on gene–nutrient interactions in polygenic disorders is still not sufficient to create personalized nutrition (Bergmann et al., 2008). It is also important to consider whether personalized nutrition will be socially accepted, that is, whether the public will agree to be genotyped and whether they will understand the consequences of the results. Several studies have tried to answer some of these questions. One example is a study carried out in 2003 by the Institute for the Future (Institute for the Future, 2003), which showed that one-third of the public will accept genotyping for personalized nutrition.
In another survey, commissioned by the International Food Information Council in 2007, more than two-thirds of Americans surveyed expressed a favorable opinion toward the idea of using genetic information to develop personalized nutrition recommendations (International Food Information Council, 2007). However, more research is needed to determine whether individuals will want to undergo genetic testing and whether they will want to know that they could be predisposed to developing a disease. Studies have demonstrated that about 90% of those surveyed wish to be informed of results (Wendler and Emanuel, 2002). On the other hand, it should also be considered whether individuals will actually change their diet and lifestyle based solely on the possibility of developing a disease. Another major point that nutrigenetics must consider is whether personalized dietary advice will be cost-effective. Nutrients produce a small effect on health compared with drugs. As a result, nutrigenetics should consider whether the cost of the genetic test and personalized recommendations is justified by the health benefits of the resulting nutritional changes. There is also a risk that personalized dietary advice will not be affordable for the entire population and that it will become a luxury available only to those with money and education. Finally, nutrigenetics and personalized nutritional advice based on individual genotype involve the generation of genetic data and thus carry ethical risks for the individuals concerned. Medical and pharmaceutical human research studies share several ethical issues with human nutritional studies, but nutrigenetic studies additionally raise concerns around privacy rights, the confidentiality of data, the potential dissemination of results, and the ethical issues involved in revealing the DNA sequence (Bergmann et al., 2008).
For instance, these genetic data contain important information, such as health problems or family relationships, that could be used by third parties such as insurance companies or employers. As a result, an ethical committee has been proposed as necessary to protect the data generated in nutrigenetic studies (Roche and Annas, 2001; Knoppers et al., 2006; Bergmann et al., 2008).

In conclusion, for nutrigenetics studies to have an impact, more advances are needed in the knowledge of their health benefits, genetic information, the understanding of gene–diet interactions, and the analysis of the impact of the combined effects of multiple genetic variants and their interactions with the environment. Moreover, attention should also be devoted to surrounding issues such as the ethics and consumer acceptance of genetic profiling, which need to be resolved before this potentially valuable public health tool can be used.

Despite all these outstanding issues, the concept of personalized nutrition as a nutritional recommendation based on genotype is emerging as a new approach for the prevention of diseases related to dietary practices. Indeed, some researchers believe that in the near future, nutritional advice could be given specifically to an individual based on genotype, in this way preventing the development of a certain disease (Kaput and Rodriguez, 2004). This future is encouraging because an individualized nutritional recommendation based on genotype is already possible in some cases, such as in PKU, where the effects of genotype clearly dominate over the effects of other factors (Fenech et al., 2011). Furthermore, current advances in research in nutrigenomics, nutrigenetics, and also epigenomics open up a promising future for personalized nutrition to optimize health and reduce disease risk.

11.4 THE ADDED VALUE OF FOODOMICS FOR THE FOOD INDUSTRY

11.4.1 Foodomics as a New Tool for Linking Nutrition Research and Industry

Beyond nutrition research, the elucidation of the human genome represents a revolutionary breakthrough for the food industry. The food industry has innovated and delivered new products based on this information to fight against the so-called noncommunicable diseases (obesity, CVD, type 2 diabetes, osteoporosis, and certain cancers) that are related to an unhealthy diet linked with a sedentary lifestyle (WHO/FAO, 2003). Innovations in food technology and better knowledge of nutrition, combined with sociodemographic and economic trends, have led to a new understanding of “optimal nutrition” (Ashwell, 2002). Industrial stakeholders are generating new dietary patterns that are becoming new drivers of consumer choice. The industry is addressing the challenge of linking basic research in the areas of food physics, storage, preservation, and fortification. The main industrial driver is the development of health-focused designer foods by improving food nutrition profiles or developing functional foods and/or nutraceuticals (FAO/WHO, 1996; USDA, 2010; Nehir and Simsek, 2012). Health has been the main factor in consumer food choices in past years. Therefore, the food industry (and also the pharmaceutical industry) has heavily invested in functional foods as a driver of innovation in new and high value-added products with specific health benefits (Niva, 2007).

However, it is not clearly defined which foods are considered functional. It is therefore rather difficult to estimate the market for these products, and the results differ depending on the definition of functional foods (Kotilainen et al., 2006). Based on the Hilliam definition of functional foods as foods to which ingredients have been added to increase health value (and this is announced to the consumers), the global market increased from 33 billion US$ in 2000 (Hilliam, 2000) to nearly 167 billion US$ in 2010.
The United States is the largest market for these products, followed by Europe and Japan (Arias-Aranda and Romerosa-Martínez, 2010). In fact, the expansion of functional foods has led to the development of auxiliary services based on R&D. As an example, over 140 European companies participate in the European Functional Food Network (FFNet, 6th Framework Programme (FP)) (Arias-Aranda and Romerosa-Martínez, 2010). The European market was estimated to be between 4 and 8 billion US$ in 2003 (Menrad, 2003) and increased to approximately 15 billion US$ by 2006, with Germany, France, the United Kingdom, and the Netherlands as the largest contributors to the European market for these products (Mäkinen-Aakula, 2006).
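The market figures quoted above imply a growth rate that is easy to work out. The sketch below computes the compound annual growth rate (CAGR) from the rounded values given in the text. The function name `cagr` is illustrative, and because the cited estimates rest on different definitions of functional foods, the result should be read as indicative only.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate between two market estimates."""
    if start_value <= 0 or end_value <= 0 or years <= 0:
        raise ValueError("values and years must be positive")
    return (end_value / start_value) ** (1 / years) - 1


# Global functional food market: 33 billion US$ (2000) to nearly 167 billion US$ (2010).
global_rate = cagr(33, 167, 10)

# European market: 4 to 8 billion US$ (2003) to approximately 15 billion US$ (2006).
europe_rate_low = cagr(8, 15, 3)   # starting from the upper 2003 estimate
europe_rate_high = cagr(4, 15, 3)  # starting from the lower 2003 estimate

print(f"Global 2000-2010: {global_rate:.1%} per year")
print(f"Europe 2003-2006: {europe_rate_low:.1%} to {europe_rate_high:.1%} per year")
```

With these inputs the global figures correspond to growth of roughly 18% per year, while the wide range for Europe reflects the uncertainty in the 2003 estimate.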

In recent years, the European Framework Programmes have attempted to promote links and cooperation between the academic world and industry (Bayona et al., 2004). From this perspective, specific research into new technologies is receiving increasing attention in line with growing expectations for the future development of the industry. More specifically, functional foods research is gaining importance as the interest in nutrition as a positive force for health grows (Niva and Mäkelä, 2007).

11.4.2 Using Foodomics to Achieve Scientific Evidence: A Critical Challenge for the Food Industry

In general, the total cost of developing a conventional new food product is estimated to be up to 1 or 2 million US$, while the development and marketing costs of functional food products may far exceed this level.

Global regulatory systems governing health claims are charged with the responsibility of ensuring consumer health by preventing misleading claims on foods. Comprehensive scientific evidence is required to make these claims, although claims are viewed differently in each country and different standards of scientific evidence are required for a fully authorized claim (Lalor and Wall, 2011). For example, the European regulation, Regulation (EC) No 1924/2006 on nutrition and health claims made on foods (EU, 2006), acknowledges and references the Process for the Assessment of Scientific Support for Claims on Foods (PASSCLAIM) project (Asp and Bryngelsson, 2008), whose main goal was to produce a generic scientific tool for assessing health claims in food products. However, the quality of the evidence underpinning such claims has been variable, and only approximately 10% of claims submitted by companies to the European Food Safety Authority (EFSA) have been positively evaluated, mainly because of the poor scientific quality of the data. This has discouraged European food companies from investing in R&D because they are at a disadvantage compared with companies in other countries (United States and Japan), where the regulatory framework in this area is less restrictive. In any case, no legal framework establishes a system to assess the evidence or specifies how many or what type of studies are needed to substantiate a claim. Human data are commonly considered the core data for the substantiation of a health claim using (an) appropriate outcome measure(s) of the claimed effect.
In addition, statistically validated data from different model systems, non-human data, epidemiological studies, as well as intervention studies on humans need to be present (Menrad, 2003). More extensive evidence from animal and in vitro studies allows for a better explanation of the mechanism of action of a bioactive ingredient or food, supporting the biological plausibility of the specific claim (EFSA, 2011).

Metabolomics offers industry promising methodologies for a targeted research approach. A top-down foodomics approach (from transcriptomics to metabolomics) seems most appropriate as it provides a general overview of the reaction of a cell to a bioactive compound, but the functional food industry instead follows a bottom-up approach (from metabolomics to transcriptomics). This approach is closest to the product/phenotype under study and allows the company to make rational decisions and understand the mechanism of action of the bioactive compound. Moreover, the real benefit of an applied foodomics approach is to obtain comprehensive insight into the biological processes that occur in response to dietary treatments to understand how bioactives enable these processes to be shifted in a desired direction (Van der Werf et al., 2001).

The food industry can take advantage of developments in foodomics because of the enormous versatility in both the type of study and the objective pursued (see Fig. 11.3). First, foodomics methods (e.g., GC–MS, LC–MS, CE–MS, and NMR) can be used to detect new biomarkers of nutrient exposure (see Table 11.2) to verify compliance with a dietary intervention (Puiggròs et al., 2011). Second, foodomics can deliver appropriate new early and intermediate biomarkers for evaluating health outcomes, which is the main challenge for the industry in proving the efficacy of functional foods. Third, foodomics provides robust analytical tools for the characterization of bioactive components and complex mixtures of botanical extracts and the presence of their metabolites in human fluids in bioavailability studies. Finally, the combination of NMR, LC–MS, GC–MS, and other foodomics tools will provide a holistic measure of the metabolome in biofluids, but to overcome the cost implications of this approach, bio-computational data analysis tools must be developed to manage the increased complexity of the results.

TABLE 11.2 Summary of Suggested Biological Markers of Bioactive Compound Exposure After Dietary Treatment in Human Samples and Foodomics Technologies for Their Detection

| Bioactives Family | Food | Biofluid | Suggested Biomarkers | Time After Exposure | Foodomics Method | References |
| --- | --- | --- | --- | --- | --- | --- |
| Polyphenolics | Cocoa beverages | P | Epicatechin-O-sulphate; O-methyl-(epi)-catechin-O-sulphate | 0–2 h | HPLC-PDA-MS | Mullen et al. (2009) |
| | | U | Epicatechin-O-glucuronide; epicatechin-O-sulphate | 0–12 h | LC-MS-MS | Roura et al. (2008) |
| | Pine bark extract (pycnogenol) | P | Catechin; caffeic acid; ferulic acid | 0.5 h–1 h | LC-UV/EDD | Duweler and Rohdewald (2000) |
| | | | Taxifolin | 5–10 h | | |
| | | | δ-(3-methoxy-4-hydroxyphenyl)-γ-valerolactone | 10–14 h | | |
| | Pomegranate juice | P | Ellagic acid | 5 h | LC-MS-MS | Seeram et al. (2006) |
| | | U | Dimethylellagic acid glucuronide | 12–24 h | | |
| | Red wine | P and U | Caffeic acid | 0–1 h | HPLC | Ritchie et al. (2004) |
| | Whole grain | P | Alkylresorcinol homolog C17:0 to C21:0 ratio | 1 wk | GC-FID | Landberg et al. (2008) |
| | Black tea | U | Hippuric acid (major metabolite); 1,3-dihydroxyphenyl-2-O-sulfate (sulfate conjugate of pyrogallol) | 0–24 h | 1H NMR | Daykin et al. (2005) |
| | Coffee | P | 3,4-dimethoxycinnamic acid | 1 h | LC-MS-MS | Nagy et al. (2011) |
| | | | 3,4-dimethoxy dihydrocaffeic acid | 10 h | | |
| | | U | Gallic and 4-O-methylgallic acid, isorhamnetin, kaempferol, hesperetin, naringenin, and phloretin | 0–24 h | | Zubik and Meydani (2003); Mennen et al. (2006) |
| | Sesame oil | U | (1R,2S,5R,6S)-6-(3,4-dihydroxyphenyl)-2-(3,4-methylenedioxyphenyl)-3,7-dioxabicyclo-(3,3,0)octane | 1–12 h | HPLC and 1H and LC-MS | Moazzami et al. (2007) |
| | Orange juice, grapefruit juice | P and U | Naringenin | 0–24 h | HPLC | Vitaglione et al. (2005) |
| Isoflavones | Soy milk or pure glycoside compounds | P | Daidzein | 8–10 h | EIS-MS; GC-MS | Setchell et al. (2002) |
| | Soy foods | U | Trimethylamine N-oxide | 0–24 h | 1H NMR | Solanky et al. (2005) |
| | Soy beverages | P | Glucuronidated or sulfated conjugates of genistein and daidzein | 4 h | LC-MS | Shelnutt et al. (2002) |
| Phytosterols (PSte) and phytostanols (PSta) | Spread enriched with PS sources: soybean oil; tall oil; mix of tall oil and rapeseed oil as fatty acid esters | P | Sitosterol (for tall oil treatment); campesterol (for rapeseed oil treatment) | 6 wk | GC-FID | Clifton et al. (2008) |
| | PST- and PS-enriched margarines | P | Sitosterol and campesterol (increase in PS diet); sitostanol and campestanol (increase in PST diet) | 4 wk | GC-HPLC | Hallikainen et al. (2000) |
| n-3 PUFA | Fish oil supplementation (0.5 g DHA + 0.15 g EPA) | P | DHA | 10 wk | LC | Krauss-Etschmann et al. (2007) |
| Peptides | Cooked meat | U | N2-OH–PhIP–N2-glucuronide; N2-PhIP glucuronide | 0–24 h | LC-MS-MS | Kulp et al. (2004) |
| | Beef | P | Carnosine (β-alanyl-L-histidine) | 1–2.5 h | HPLC | Park et al. (2005) |
| Carotenoids | Kale | P | β-carotene | 6–10 h | LC-MS | Novotny et al. (2005) |
| | | | Lutein | 12–24 h | | |
| | | | Retinol (retinol) | | | |
| | Green-yellow vegetables and fruits | P | β-carotene and β-cryptoxanthin | 1 year (11–14 yr olds) | HPLC | Okuda et al. (2009) |

[Figure 11.3 (diagram). Recoverable labels: Foodomics; Agriculture & Livestock; Food processing; Safety & Quality; Human Health; Human intervention trial; Human Sample; Nutrigenomics (Nutritranscriptomics, Nutriproteomics, Nutrimetabolomics); Nutrigenetics; Traditional dietary assessment (24 h Recall, FFQ); i; r24FFQ; rFFQB; r24B; Biomarker of nutrient exposure; Early mechanistic events: new biomarkers; Intermediate point biomarker; Risk Factor Reduction; Enhanced Function; Healthy State; Optimal Nutrition.]

FIGURE 11.3 Conceptual relationship established between foodomics and all those aspects that may lead to optimal nutrition present in the value chain of food production. Foodomics involves nutrigenomics and nutrigenetics methods to provide data on optimal nutrition. FFQ, food frequency questionnaire; i, true intake; 24 h recall, 24 h recall dietary assessment method; r24FFQ, correlation index between 24 h recall and FFQ; rFFQB, correlation index between FFQ and biomarker; r24B, correlation index between 24 h recall and biomarker. (Adapted from Ocke and Kaaks, 1997; Diplock et al., 1999; Richardson et al., 2003; Biesalski et al., 2011.)
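The three pairwise correlations named in the Figure 11.3 caption (r24FFQ, rFFQB, and r24B) can be combined to estimate how closely each instrument tracks the unobserved true intake i. The following is a minimal sketch under the assumption that the method of triads is the intended calculation; the chapter does not state this explicitly, and the function name and the example correlations are hypothetical.

```python
import math


def triad_validity_coefficients(r_24_ffq, r_ffq_b, r_24_b):
    """Estimate the correlation of each instrument with true intake.

    Assumes the method of triads: the three measurement errors are
    mutually independent and each measure is linearly related to
    true intake. All three pairwise correlations must be positive.
    """
    for r in (r_24_ffq, r_ffq_b, r_24_b):
        if not 0 < r <= 1:
            raise ValueError("each correlation must lie in (0, 1]")
    return {
        "FFQ": math.sqrt(r_24_ffq * r_ffq_b / r_24_b),
        "24h_recall": math.sqrt(r_24_ffq * r_24_b / r_ffq_b),
        "biomarker": math.sqrt(r_ffq_b * r_24_b / r_24_ffq),
    }


# Hypothetical correlations, chosen only to illustrate the arithmetic.
coefficients = triad_validity_coefficients(r_24_ffq=0.5, r_ffq_b=0.4, r_24_b=0.5)
for instrument, value in coefficients.items():
    print(f"{instrument}: {value:.2f}")
```

With sampling error the estimates can exceed 1, so in practice they are usually reported with confidence intervals rather than as point values.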

11.4.3 Broader Approach to Optimal Nutrition: Application of Foodomics Technologies from Farm to Fork

From a broader point of view, the impact of foodomics on optimal nutrition should be extended to encompass the whole food chain. In this area, omics science has recently been applied to the farm-to-fork concept.

11.4.4 Foodomics and Agriculture

Agriculture can benefit from foodomics technologies to understand the genetic basis of commercially important traits. For example, the growth and feeding of domesticated crops and livestock can be altered to maximize their own health and also to improve the nutritional qualities of these first components in the human food chain. Foodomics also includes the identification of dietary signals that boost immunity, eliminating the need for antibiotic use in animal feed, as well as the development of crops or animal products with increased levels of healthful ingredients (Brown and Van der Ouderaa, 2007).

11.4.5 Screening of Novel Bioactive Functional Foods

Innovation in new ingredients will benefit from the availability of rapid screening methods for testing bioactivity. Analogous to the drug discovery approach used in the pharmaceutical industry, foodomics technology will take on a significant role in screening new phytochemicals that can be included in foods.

11.4.6 Livestock and Animal Production

In animals, single essential traits have been identified that determine the variation of a complex phenotype for a range of economically important aspects including growth, fatness, fertility, milk production/composition, meat quality, and health (Harlizius et al., 2004; Gordon et al., 2005). Although research on how genetic factors may interact with diet is still in its infancy, there is considerable excitement about future investigations in this area. Particular interest is focused on the impact that these interactions may have on the nutritional value of animal-derived food products compared with conventional products. For example, genetic factors can exert a significant influence on the degree of enrichment of milk with cis-9, trans-11 CLA or on the content of DHA and EPA in eggs and milk after the supplementation of dairy cattle diets with plant oils high in linoleic and/or linolenic acid (Lock and Bauman, 2004) or n-3 fatty acids, respectively (Bautista and Engler, 2005).

11.4.7 Food Processing and Safety

The identification of molecular markers affected by manufacturing processes (heat, fermentation, withering, etc.) through foodomics technologies should help to improve manufacturing techniques and deliver useful tools to increase the shelf-life of fresh products (Page et al., 2001).

The use of foodomics in food safety is mainly concentrated on the detection of food components (van Ommen and Groten, 2004) and microorganisms that may cause food spoilage or present hazards to human health (Abee et al., 2004). Hazard identification (qualitative risk evaluation) and/or characterization (quantitative risk evaluation) are used both to determine the acceptable daily intake and to elicit an adverse health effect. Toxicological evaluation can benefit from the high-throughput nature of foodomics through the analysis of multiple tissues in a timely and cost-effective manner. By clustering data from the full range of biological responses and comparing metabolomic patterns, foodomics may be a useful tool for hypothesis generation and testing (Reynolds, 2005).

On the other hand, foodomics can be used in situ to detect biomarkers (genes, proteins, and metabolites) representative of pathogenic microorganisms. For example, the identification of metabolites or physicochemical parameters that are critical for the outgrowth of these organisms and their induction of toxin production will be useful to improve control of storage. Furthermore, foodomics can be useful to define the mode of action of bacteria, such as Listeria monocytogenes, Escherichia coli, Clostridium botulinum A, and various Salmonella species (Vogels et al., 1993), and to find new mechanisms that confer stress resistance. This should enable the industry to design food preservation techniques more rationally and to establish data for the points of the manufacturing process that are most susceptible to microbial contamination.

11.4.8 Quality Assurance

Foodomics analysis of foods or food ingredients in combination with cluster analysis allows for the identification of metabolomes or proteomes representative of foods originating from specific geographical areas or with a specific quality. Indeed, the quality of the raw material can be used to predict the quality of the end product, for example, predicting bread quality from analysis of the wheat proteome. Additionally, biomolecules or ratios between specific biomolecules in the raw material that are most critical to the quality of the end product can be identified (Vogels et al., 1993, 1996).
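The cluster-analysis step described for quality assurance can be sketched with a small example. The code below groups samples by their metabolite profiles using k-means, chosen here purely as a stand-in because the chapter does not say which clustering method is meant. The sample names and intensity values are hypothetical, and a real study would use many more metabolites and a validated statistics package.

```python
from math import dist


def kmeans(points, k, max_iterations=50):
    """Plain k-means; the first k points seed the centroids (deterministic)."""
    centroids = [tuple(p) for p in points[:k]]
    labels = []
    for _ in range(max_iterations):
        # Assign each sample to its nearest centroid.
        labels = [min(range(k), key=lambda c: dist(p, centroids[c])) for p in points]
        # Recompute each centroid as the mean of its members.
        updated = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                updated.append(tuple(sum(axis) / len(members) for axis in zip(*members)))
            else:
                updated.append(centroids[c])  # keep the centroid of an empty cluster
        if updated == centroids:
            break  # assignments are stable
        centroids = updated
    return labels, centroids


# Hypothetical relative intensities of two metabolites in six samples
# from two growing regions (A and B).
samples = {
    "A1": (1.0, 5.0), "B1": (5.0, 1.0),
    "A2": (1.2, 4.8), "B2": (4.9, 1.1),
    "A3": (0.9, 5.2), "B3": (5.1, 0.8),
}
labels, centroids = kmeans(list(samples.values()), k=2)
clusters = dict(zip(samples, labels))
print(clusters)
```

In this toy data set the samples from each region fall into the same cluster, which is the pattern one would look for when using metabolite profiles to check geographical origin.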

11.4.9 Personalized Nutrition as a Future Challenge for the Food Industry

It is clear that SNPs will become an important issue in food science in the near future. This might especially be the case in the development of personalized foods that exhibit no detrimental effects on other groups. Although industrial developments in this direction are foreseen, they will likely be preceded by similar approaches for the identification and characterization of drugs. It is currently impossible to predict on what timescale we can expect these new foods to come to the market (Van der Werf et al., 2001).

The immaturity of the science underpinning nutrigenetic testing has already been mentioned. Nevertheless, the food industry considers personalized nutrition a future target of innovation activities. The industry expects the growth of this area to be encouraged by the convergence between increased receptivity among consumers to having their diet tailored to their genetic makeup and the reliable and robust detection of SNPs. In this sense, emerging foodomics technologies are providing new tools for nutrigenetics, and several food ingredient companies have begun to invest in this area and actively engage with European Union-funded research initiatives, including LIPGENE (Nugent, 2005) and DIOGenes (Saris and Harper, 2005).

However, this burgeoning commercial excitement needs to be tempered by a certain degree of caution. Although evidence is rapidly accumulating to support the concept of personalized nutrition, reports on the clinical utility and validity of specific nutrigenetic markers are still rare. In addition to the previously mentioned technical issues facing nutrigenetics, consensus also needs to be reached on a large number of ethical and regulatory issues. Significant questions remain, such as who should administer and receive nutrigenetic tests, information privacy, and industrial discrimination, and these must be discussed in parallel with the development of personalized nutrition.

11.5 CONCLUDING REMARKS

Foodomics developments are contributing to the technological revolution in nutritional science research. Their application will generate an enormous set of novel data and knowledge. Nutrigenomics and nutrigenetics data will aid in the discovery of a set of biomarkers that describe different stages of homeostatic alterations leading up to the alteration of human phenotypes. These data will also refine industrial and research tools to better demonstrate the efficacy of functional foods and nutraceuticals. Moreover, the application of foodomics to nutrigenetics will quickly increase the available data and enable the development of feasible ways to achieve an optimal diet.

REFERENCES

Abee T, van Schaik W, Siezen RJ (2004). Impact of genomics on microbial food safety. Trends in Biotechnology 22:653–60.
Afman L, Müller M (2006). Nutrigenomics: from molecular nutrition to prevention of disease. Journal of the American Dietetic Association 106(4):569–76.
Afman L, Müller M (2011). Human nutrigenomics of gene regulation by dietary fatty acids. Progress in Lipid Research 51(1):63–70.
Aldred S, Sozzi T, Mudway I, Grant MM, Neubert H, Kelly FJ, Griffiths HR (2006). Alpha-tocopherol supplementation elevates plasma apolipoprotein A1 isoforms in normal healthy subjects. Proteomics 6:1695–1703.
Álvarez-Sánchez B, Priego-Capote F, Luque de Castro MD (2010). Metabolomics analysis I. Selection of biological samples and practical aspects preceding sample preparation. Trends in Analytical Chemistry 29:111–119.
Arbones-Mainar JM, Ross K, Rucklidge GJ, Reid M, Duncan G, Arthur JR, Horgan GW, Navarro MA, Carnicer R, Arnal C, Osada J, de Roos B (2007). Extra virgin olive oils increase hepatic fat accumulation and hepatic antioxidant protein levels in APOE−/− mice. Journal of Proteome Research 6(10):4041–4054.
Arias-Aranda D, Romerosa-Martínez MM (2010). Innovation in the functional foods industry in a peripheral region of the European Union: Andalusia (Spain). Food Policy 35:240–246.
Artaud-Wild SM, Connor SL, Sexton G, Connor WE (1993). Differences in coronary mortality can be explained by differences in cholesterol and saturated fat intakes in 40 countries but not in France and Finland. A paradox. Circulation 88(6):2771–2779.
Ashwell M (2002). Concepts of Functional Foods. Washington, DC: ILSI Europe Concise Monograph Series ILSI Press.
Asp NG, Bryngelsson S (2008). Health claims in Europe: new legislation and PASSCLAIM for substantiation. Journal of Nutrition 138(6):1210S–1215S.
Astrup A, Dyerberg J, Elwood P, Hermansen K, Hu FB, Jakobsen MU, Kok FJ, Krauss RM, Lecerf JM, LeGrand P, Nestel P, Risérus U, Sanders T, Sinclair A, Stenders S, Tholstrup T, Willett WC (2011). The role of reducing intakes of saturated fat in the prevention of cardiovascular disease: where does the evidence stand in 2010? The American Journal of Clinical Nutrition 93:684–688.
Bakker GC, van Erk MJ, Pellis L, Wopereis S, Rubingh CM, Cnubben NH, Kooistra T, van Ommen B, Hendriks HF (2010). An antiinflammatory dietary mix modulates inflammation and oxidative and metabolic stress in overweight men: a nutrigenomics approach. American Journal of Clinical Nutrition 91(4):1044–1059.
Bautista MC, Engler MM (2005). The Mediterranean diet: is it cardioprotective? Progress in Cardiovascular Nursing 20:70–76.
Bayona C, Garcia-Marco T, Huerta E (2004). Links between the characteristics of alliances and the applicability of research results. Journal of High Technology Management Research 15(2):215–231.
Bergmann MM, Görman U, Mathers JC (2008). Bioethical considerations for human nutrigenomics. Annual Review of Nutrition 28:447–467.
Biesalski HK, Aggett PJ, Anton R, Bernstein PS, Blumberg J, Heaney RP, Henry J, Nolan JM, Richardson DP, van Ommen B, Witkamp RF, Rijkers GT, Zöllner I (2011). 26th Hohenheim Consensus Conference, September 11, 2010 Scientific substantiation of health claims: evidence-based nutrition. Nutrition 27:S1–S20.
Bladé C, Arola L, Salvadó MJ (2010). Hypolipidemic effects of proanthocyanidins and their underlying biochemical and molecular mechanisms. Molecular Nutrition & Food Research 54(1):37–59.
Bouwens M, Afman LA, Müller M (2007). Fasting induces changes in peripheral blood mononuclear cell gene expression profiles related to increases in fatty acid beta-oxidation: functional role of peroxisome proliferator activated receptor alpha in human peripheral blood mononuclear cells. American Journal of Clinical Nutrition 86(5):1515–1523.
Bouwens M, Grootte Bromhaar M, Jansen J, Müller M, Afman LA (2010). Postprandial dietary lipid–specific effects on human peripheral blood mononuclear cell gene expression profiles. American Journal of Clinical Nutrition 91(1):208–217.
Bouwens M, van de Rest O, Dellschaft N, Bromhaar MG, de Groot LC, Geleijnse JM, Müller M, Afman LA (2009). Fish-oil supplementation induces antiinflammatory gene expression profiles in human blood mononuclear cells. American Journal of Clinical Nutrition 90(2):415–424.
Brown L, van der Ouderaa F (2007). Nutritional genomics: food industry applications from farm to fork. British Journal of Nutrition 97:1027–1035.
Caballero B (2007). The global epidemic of obesity: an overview. Epidemiologic Reviews 29:1–5.
Caimari A, Oliver P, Keijer J, Palou A (2010a). Peripheral blood mononuclear cells as a model to study the response of energy homeostasis-related genes to acute changes in feeding conditions. OMICS 14(2):129–141.
Caimari A, Oliver P, Rodenburg W, Keijer J, Palou A (2010b). Feeding conditions control the expression of genes involved in sterol metabolism in peripheral blood mononuclear cells (PBMC) of normoweight and diet-induced (cafeteria) obese rats. Journal of Nutritional Biochemistry 21(11):1127–1133.
Caimari A, Oliver P, Rodenburg W, Keijer J, Palou A (2010c). Slc27a2 expression in peripheral blood mononuclear cells as a molecular marker for overweight development. International Journal of Obesity 34(5):831–839.
Camargo A, Ruano J, Fernandez JM, Parnell LD, Jimenez A, Santos-Gonzalez M, Marin C, Perez-Martinez P, Uceda M, Lopez-Miranda J, Perez-Jimenez F (2010). Gene expression changes in mononuclear cells in patients with metabolic syndrome after acute intake of phenol-rich virgin olive oil. BMC Genomics 11:253.
Carlberg C, Seuter S (2009). A genomic perspective on vitamin D signaling. Anticancer Research 29(9):3485–3493.
Carpenter KJ (2003a). History of nutrition a short history of nutritional science: part 1 (1785–1885). Journal of Nutrition 133(3):638–645.
Carpenter KJ (2003b). History of nutrition a short history of nutritional science: part 2 (1885–1912). Journal of Nutrition 133(4):975–984.
Carpenter KJ (2003c). History of nutrition a short history of nutritional science: part 3 (1912–1944). Journal of Nutrition 133(10):3023–3032.
Carpenter KJ (2003d). History of nutrition a short history of nutritional science: part 4 (1945–1985). Journal of Nutrition 133(11):3331–3342.
Carr T, Krogstrand K, Schlegel V, Fernandez M (2009). Stearate-enriched plant sterol esters lower serum LDL cholesterol concentration in normo- and hypercholesterolemic adults. Journal of Nutrition 139:1445–1450.
Clifton M, Mano M, Duchateau G, van der Knaap HCM, Trautwein EA (2008). Dose-response effects of different plant sterol sources in fat spreads on serum lipids and C-reactive protein and on the kinetic behavior of serum plant sterols. European Journal of Clinical Nutrition 62:968–977.
Costa V, Casamassimi A, Ciccodicola A (2010). Nutritional genomics era: opportunities toward a genome-tailored nutritional regimen. The Journal of Nutritional Biochemistry 21(6):457–467.
Covas MI (2007). Olive oil and the cardiovascular system. Pharmacological Research 55:175–186.
Crujeiras AB, Parra D, Milagro FI, Goyenechea E, Larrarte E, Margareto J, Martínez JA (2008). Differential expression of oxidative stress and inflammation related genes in peripheral blood mononuclear cells in response to a low-calorie diet: a nutrigenomics study. OMICS 12(4):251–261.
Daykin CA, Van Duynhoven JPM, Groenewegen A, Dachtler M, Van Amelsvoort JMM, Mulder TPJ (2005). Nuclear magnetic resonance spectroscopic based studies of the metabolism of black tea polyphenols in humans. Journal of Agricultural and Food Chemistry 53(5):1428–1434.
DeBusk R (2010). The role of nutritional genomics in developing an optimal diet for humans. Nutrition in Clinical Practice 25:627–633.
Del Bas JM, Ricketts M-L, Vaqué M, Sala E, Quesada H, Ardevol A, Salvadó MJ, Blay M, Arola L, Moore DD, Pujadas G, Fernandez-Larrea J, Bladé C (2009). Dietary procyanidins enhance transcriptional activity of bile acid-activated FXR in vitro and reduce triglyceridemia in vivo in a FXR-dependent manner. Molecular Nutrition & Food Research 53(7):805–814.
de Lange DW (2007). From red wine to polyphenols and back: a journey through the history of the French Paradox. Thrombosis Research 119(4):403–406.
de Roos B (2008). Proteomic analysis of human plasma and blood cells in nutritional studies: development of biomarkers to aid disease prevention. Expert Reviews in Proteomics 5(6):819–826.
de Roos B, Duthie SJ, Polley ACJ, Mulholland F, Bouwman FG, Heim C, Rucklidge GJ, Johnson IT, Mariman EC, Daniel H, Elliott RM (2008a). Proteomic methodological recommendations for studies involving human plasma, platelets, and peripheral blood mononuclear cells. Journal of Proteome Research 7:2280–2290.
de Roos B, Geelen A, Ross K, Rucklidge G, Reid M, Duncan G, Caslake M, Horgan G, Bruwer IA (2008b). Identification of potential serum biomarkers of inflammation and lipid modulation that are altered by fish oil supplementation in healthy volunteers. Proteomics 8:1965–1974.
Dickerson JWT (2001). Aspects of the history of nutrition since 1876. The Journal of the Royal Society for the Promotion of Health 121(2):79–84.
Dieck H, Döring F, Fuchs D, Roth HP, Daniel H (2005). Transcriptome and proteome analysis identifies the pathways that increase hepatic lipid accumulation in zinc-deficient rats. Journal of Nutrition 135(2):199–205.
Diplock AT, Aggett PJ, Ashwell M, Bornet F, Fern EB, Roberfroid MB (1999). Scientific concepts of functional foods in Europe. Consensus document. British Journal of Nutrition 81(Suppl 1):S1–S27.
Duweler KG, Rohdewald P (2000). Urinary metabolites of French maritime pine bark extract in humans. Pharmazie 55(5):364–368.
EFSA NDA Panel (2011). General guidance for stakeholders on the evaluation of Article 13.1, 13.5 and 14 health claims. EFSA Journal 9(4):2135.
EU (2006). EU Regulation (EC) No 1924/2006 of the European Parliament and of the European Council of 20 December 2006 on nutrition and health claims made on foods. Official Journal of the European Union L 12:3–18.
FAO/WHO (1996). Preparation and Use of Food-Based Dietary Guidelines. Nutrition Programme, Geneva: WHO.
Fenech M, El-Sohemy A, Cahill L, Ferguson LR, French TA, Tai ES, Milner J, Koh WP, Xie L, Zucker M, Buckley M, Cosgrove L, Lockett T, Fung KY, Head R (2011). Nutrigenetics

and nutrigenomics: viewpoints on the current status and applications in nutrition research and practice. Journal of Nutrigenetics and Nutrigenomics 4(2):69–89. Ferrara CT, Wang P, Neto EC, Stevens RD, Bain JR, Wenner BR, Ilkayeva OR, Keller MP, Blasiole DA, Kendziorski C, Yandell BS, Newgard CB, Attie AD (2008). Genetic networks of liver metabolism revealed by integration of metabolic and transcriptional profiling. PLoS Genetics 4(3):e1000034. Fuchs D, Vafeiadou K, Hall WL, Daniel H, Williams CM, Schroot JH, Wenzel U (2007). Pro- teomic biomarkers of peripheral blood mononuclear cells obtained from postmenopausal women undergoing an intervention with soy isoflavones. American Journal of Clinical Nutrition 86:1369–1375. Garcia-Canas˜ V, SimoC,Le´ on´ C, Cifuentes A (2010). Advances in nutrigenomics research: novel and future analytical approaches to investigate the biological activity of natural com- pounds and food functions. Journal of Pharmaceutical and Biomedical Analysis 51:290– 304. Ghosh S, Matsuoka Y, Asai Y, Hsin K-Y, Kitano H (2011). Software for systems biology: from tools to integrated platforms. Nature Reviews Genetics 12(12):821–832. Gordon ES, Gordish Dressman HA, Hoffman EP (2005). The genetics of muscle atrophy and growth: the impact and implications of polymorphisms in animals and humans. The International Journal of Biochemistry & Cell Biology 37:2064–2074. Gorjao˜ R, Verlengia R, Lima TM, Soriano FG, Boaventura MF, Kanunfre CC, Peres CM, Sampaio SC, Otton R, Folador A, Martins EF, Curi TC, Portiolli EP, Newsholme P, Curi R (2006). Effect of docosahexaenoic acid-rich fish oil supplementation on human leukocyte function. Clinical Nutrition 25(6):923–938. Griffin JL, Bonney SA, Mann C, Hebbachi AM, Gibbons GF, Nicholson JK, Shoulders CC, Scott J (2004). An integrated reverse functional genomic and metabolic approach to under- standing orotic acid-induced fatty liver. Physiological Genomics 17(2):140–149. Griffiths HR, Grant MM (2006). 
The use of proteomic techniques to explore the holistic effects of nutrients in vivo. Nutrition Research Reviews 19:284–293. Hallikainen MA, Sarkkinen ES, Gylling H, Erkkila AT, Uusitupa MIJ (2000). Comparison of the effects of plant sterol ester and plant stanol ester-enriched margarines in lowering serum cholesterol concentrations in hypercholesterolaemic subjects on a low-fat diet. European Journal of Clinical Nutrition 54(9):715–725. Haring R, Wallaschofski H (2012). Diving through the “-Omics”: the case for deep phenotyping and systems epidemiology. OMICS: A Journal of Integrative Biology 16(5):231–234. Harlizius B, van Wijk R, Merks JW (2004). Genomics for food safety and sustainable animal production. Journal of Biotechnology 113:33–42. Herrero M, SimoC,Garc´ ´ıa-Canas˜ V, Iba´nez˜ E, Cifuentes A (2012). Foodomics: MS-based strategies in modern food science and nutrition. Mass Spectrometry Reviews 31:49–69. Herzog A, Kindermann B, Doring¨ F, Daniel H, Wenzel U (2004). Pleiotropic molecular effects of the pro-apoptotic dietary constituent flavone in human colon cancer cells identified by protein and mRNA expression profiling. Proteomics 4(8):2455–2464. Hill JO, Wyatt HR, Reed GW, Peters JC (2003). Obesity and the environment: where do we go from here?. Science 299:853–855. Hilliam M (2000). Functional food—how big is the market? The World of Food Ingredients 12:50–53. 342 HOW DOES FOODOMICS IMPACT OPTIMAL NUTRITION?

Hocquette JF, Cassar-Malek I, Scalbert A, Guillou F (2009). Contribution of genomics to the understanding of physiological functions. Journal of Physiology and Pharmacology 60(Suppl 3):5–16. Hwang D, Smith JJ, Leslie DM, Weston AD, Rust AG, Ramsey S, de Atauri P, Siegel AF, Bolouri H, Aitchison JD, Hood L (2005). A data integration methodology for systems biology: experimental verification. Proceedings of the National Academy of Sciences of the United States of America 102(48):17302–17307. Institute for the Future (2003). From nutrigenomic science to personalized nutrition Institute for the Future. The market in 2010. Available at http://www.iftf.org/node/785. Accessed 2012 Jan 30. International Food Information Council (2007). Consumer attitudes toward func- tional foods/foods for health executive summary. Available at http://www.ific.org/ research/upload/2005funcfoodsresearch.pdf. Accessed 2012 Jan 30. Kalaany NY, Mangelsdorf DJ (2006). LXRS and FXR: the yin and yang of cholesterol and fat metabolism. Annual Review of Physiology 68:159–191. Kaput J, Rodriguez RL (2004). Nutritional genomics: the next frontier in the postgenomic era. Physiological Genomics 16:166–177. Keys A, editor. (1980). Seven Countries: A Multivariate Analysis of Death and Coronary Heart Disease, Cambridge, MA: Harvard University Press. Kim J, Sun P, Lam YW, Troncoso P, Sabichi AL, Babaian RJ, Pisters LL, Pettaway CA, Wood CG, Lippman SM, McDonnell TJ, Lieberman R, Logothetis C, Ho S-M (2005). Changes in serum proteomic patterns by presurgical a-tocopherol and L-selenomethionine supplementation in prostate cancer. Cancer Epidemiology, Biomarkers and Prevention 14:1697–1702. Khymenets O, Fito´ M, Covas MI, Farre´ M, Pujadas MA, Munoz˜ D, Konstantinidou V, de la Torre R (2009). Mononuclear cell transcriptome response after sustained virgin olive oil consumption in humans: an exploratory nutrigenomics study. OMICS 13(1): 7–19. 
Knoops KT, De Groot LC, Kromhout D, Perrin AE, Moreiras-Varela O, Menotti A, Van Staveren WA (2004). Mediterranean diet, lifestyle factors, and 10-year mortality in elderly European men and women: the HALE project. Journal of the American Medical Association 292(12):1433–1439. Knoppers BM, Joly Y, Simard J, Durocher F (2006). The emergence of an ethical duty to disclose genetic research results: international perspectives. European Journal of Human Genetics 14:1170–1178. Konstantinidou V, Khymenets O, Fito M, De La Torre R, Anglada R, Dopazo A, Covas MI (2009). Characterization of human gene expression changes after olive oil ingestion: an exploratory approach. Folia Biologica 55(3):85–91. Kotilainen L, Rajalahti R, Ragasa C, Pehu E (2006). Health enhancing foods: opportunities for strengthening the sector in developing countries. Agriculture and Rural Development Discussion. Paper 30. Koulman A, Volmer DA (2010). Perspectives for metabolomics in human nutrition: an overview. British Nutrition Foundation Nutrition Bulletin 33:324–330. Krauss-Etschmann S, Shadid R, Campoy C, Hoster E, Demmelmair H, Jimenez M, Gil A, Rivero M, Veszpremi B, Decsi T, Koletzko BV (2007). Effects of fish-oil and folate supplementation of pregnant women on maternal and fetal plasma concentrations of REFERENCES 343

docosahexaenoic acid and eicosapentaenoic acid: a European randomized multicenter trial. American Journal of Clinical Nutrition 85(5):1392–1400. Krehl WA (1956). A concept of optimal nutrition. American Journal of Clinical Nutrition 4(6):634–641. Kulp KS, Knize MG, Fowler ND, Salmon CP, Felton JS (2004). PhIP metabolites in human urine after consumption of well-cooked chicken. Journal of Chromatography B 802(1):143– 153. Kussmann M, Krause L, Siffert W (2010). Nutrigenomics: where are we with genetic and epigenetic markers for disposition and susceptibility? Nutrition Reviews 68 (Suppl 1):S38– S47. Kussmann M, Raymond F, Affolter M (2006). OMICS-driven biomarker discovery in nutrition and health. Journal of Biotechnology 124(4):758–787. Kussmann M, Rezzi S, Daniel H (2008). Profiling techniques in nutrition and health research. Current Opinion in Biotechnology 19(2):83–99. Lalor F, Wall PG (2011). Health claims regulations comparison between USA, Japan and European Union. British Food Journal 113(2):298–313. Landberg R, Kamal-Eldin A, Andersson A, Vessby B, Aman P (2008). Alkylresorcinols as biomarkers of whole-grain wheat and rye intake: plasma concentration and intake estimated from dietary records1. American Journal of Clinical Nutrition 87(4):832–838. Lander ES, Linton LM, Birren B, Nusbaum C, Zody MC, Baldwin J, Devon K, Dewar K, Doyle M, FitzHugh W, et al. (2001). Initial sequencing and analysis of the human genome. Nature 409:860–921. Lankinen M, Schwab U, Erkkila A, Seppanen-Laakso T, Hannila M-L, Mussalo H, Lehto S, Uusitupa M, Gylling H, Oresic M (2009). Fatty fish intake decreases lipids related to inflammation and insulin signaling—a lipidomics approach. PLoS ONE 4 (4):e5258. Lankinen M, Schwab U, Gopalacharyulu PV, Seppanen-Laakso¨ T, Yetukuri L, Sysi-Aho M, Kallio P, Suortti T, Laaksonen DE, Gylling H, Poutanen K, Kolehmainen M, Oresic M (2010). 
Dietary carbohydrate modification alters serum metabolic profiles in individuals with the metabolic syndrome. Nutrition, Metabolism and Cardiovascular Diseases 20:249– 257. Lock AL, Bauman DE (2004). Modifying milk fat composition of dairy cows to enhance fatty acids beneficial to human health. Lipids 39:1197–1206. Lovegrove JA, Gitau R (2008). Nutrigenetics and CVD: what does the future hold?. Journal of Human and Nutrition and Dietetics 67:206–213. Magrone T, Jirillo E (2011). Potential application of dietary polyphenols from red wine to attaining healthy ageing. Current Topics in Medicinal Chemistry 11(14):1780–1796. Makinen-Aakula¨ M (2006). Trends in functional foods dairy market. Paper presented at the 3rd FFNet Meeting on Functional Foods, Budapest, Hungary. Mandel SA, Amit T, Kalfon L, Reznichenko L, Weinreb O, Youdim MBH (2008). Cell signaling pathways and iron chelation in the neurorestorative activity of green tea polyphe- nols: special reference to epigallocatechin gallate (EGCG). Journal of Alzheimer’s Disease 15(2):211–222. Martin FP, Rezzi S, Pere-Trepat´ E, Kamlage B, Collino S, Leibold E, Kastler J, Rein D, Fay LB, Kochhar S (2009). Metabolic effects of dark chocolate consumption on energy, gut microbiota, and stress-related metabolism in free-living subjects. Journal of Proteome Research 8:5568–5579. 344 HOW DOES FOODOMICS IMPACT OPTIMAL NUTRITION?

Masotti A, Da Sacco L, Bottazzo GF, Alisi A (2010). Microarray technology: a promising tool in nutrigenomics. Critical Reviews in Food Science and Nutrition 50(7):693–698. Mathers JC (2003). Nutrition and cancer prevention: diet-gene interactions. Proceedings of the Nutrition Society 62:605–610. Mathers JC (2008). Epigenomics: a basis for understanding individual differences?. Proceed- ings of the Nutrition Society 67:390–394. McCombie G, Browning LM, Titman CM, Song M, Shockcor J, Jebb SA, Griffin JL (2009). Omega-3 oil intake during weight loss in obese women results in remodelling of plasma triglyceride and fatty acids. Metabolomics 5:363–374. McKay JA, Mathers JC (2011). Diet induced epigenetic changes and their implications for health. Acta Physiologica 202:103–118. Mennen LI, Sapinho D, Ito H, Bertrais S, Galan P, Hercberg S, Scalbert A (2006). Urinary flavonoids and phenolic acids as biomarkers of intake for polyphenol-rich foods. British Journal of Nutrition 96(1):191–198. Menrad K (2003). Market and marketing of functional food in Europe. Journal of Food Engineering 56:181–188. Mitchell BL, Yasui Y, Lampe JW, Gafken PR, Lampe PD (2005). Evaluation of matrix- assisted laser desorption/ionization-time of flight mass spectrometry proteomic profiling: identification of a2-HS glycoprotein B-chain as a biomarker of diet. Proteomics 5:2238– 2246. Mortensen A, Sorensen IK, Wilde C, Dragoni S, Mullerova´ D, Toussaint O, Zloch Z, Sgaragli G, Ovesna´ J (2008). Biological models for phytochemical research: from cell to human organism. The British Journal of Nutrition 99(Suppl E):ES118– ES126. Moazzami AA, Andersson RE, Kamal-Eldin A (2007). Quantitative NMR analysis of a sesamin catechol metabolite in human urine. Journal of Nutrition 137:940–944. Muller¨ M, Kersten S (2003). Nutrigenomics: goals and strategies. Nature Reviews Genetics 4(4):315–322. Mullen W, Borges G, Donovan JL, Edwards CA, Serafini M, Lean MEJ, Crozier A (2009). 
Milk decreases urinary excretion but not plasma pharmacokinetics of cocoa flavan-3-ol metabolites in humans. American Journal of Clinical Nutrition 89(6):1784– 1791. Murtaza L, Marra G, Schlapbach R, Patrignani A, Kunzli M, Wagner U, Sabates J, Dutt A (2006). A preliminary investigation demonstrating the effect of quercetin on the expres- sion of genes related to cell-cycle arrest, apoptosis and xenobiotic metabolism in human CO115 colon-adenocarcinoma cells using DNA microarray. Biotechnology and Applied Biochemistry 45:29–36. Nagy K, Redeuil K, Williamson G, Rezzi S, Dionisi F, Longet K, Destaillats F, Renouf M (2011). First identification of dimethoxycinnamic acids in human plasma after coffee intake by liquid chromatography–mass spectrometry. Journal of Chromatography A 1218(3):491– 497. Nagy L, Schwabe JWR (2004). Mechanism of the nuclear receptor molecular switch. Trends in Biochemical Sciences 29(6):317–324. Naidoo N, Pawitan Y, Soong R, Cooper DN, Ku CS (2011). Human genetics and genomics a decade after the release of the draft sequence of the human genome. Human Genomics 5(6):577–622. REFERENCES 345

Nehir-El S, Simsek S (2012). Food technological applications for optimal nutrition: an overview of opportunities for the food industry. Comprehensive Review in Food Science Safety 11:2– 12. Nichols BL, Reeds PJ (1991). Symposium history of nutrition: history and current status research in human energy metabolism. The Journal of Nutrition 121:1889–1890. Nicholson JK, Wilson ID (2003). Understanding ‘global’ systems biology: metabonomics and the continuum of metabolism. Nature Reviews Drug Discovery 2:668–676. Niculescu MD, Pop EA, Fischer LM, Zeisel SH (2007). Dietary isoflavones differentially induce gene expression changes in lymphocytes from postmenopausal women who form equol as compared with those who do not. Journal of Nutritional Biochemistry 18(6):380– 390. Niva M (2007). “All foods affect health”. Understandings of functional foods and healthy eating among health-oriented Finns. Appetite 48:384–393. Niva M, Makel¨ a¨ J (2007). Finns and functional foods. Socio-demographics, health efforts, notions of technology and the acceptability of health-promoting foods. International Jour- nal of Consumer Studies 31:34–45. Novotny JA, Kurilich AC, Britz SJ, Clevidence BA (2005). Plasma appearance of labeled -carotene, lutein, and retinol in humans after consumption of isotopically labeled kale. Journal of Lipid Research 46:1896–1903. Noy N (2010). Between death and survival: retinoic acid in regulation of apoptosis. Annual Review of Nutrition 30:201–217. Nugent AP (2005). LIPGENE: an EU project to tackle the metabolic syndrome. Biochimie 87:129–132. Ocke MJ, Kaaks RJ (1997). Biochemical markers of additional measurements in dietary validity studies: application of the method of triads with examples from the European prospective investigation into cancer and nutrition. American Journal of Clinical Nutrition 65:S1240–S1245. Okuda M, Sasaki S, Bando N, Hashimoto M, Kunitsugu I, Sugiyama S, Terao J, Hobara T (2009). 
Carotenoid, tocopherol, and fatty acid biomarkers and dietary intake estimated by using a brief self-administered diet history questionnaire for older Japanese children and adolescents. Journal of Nutritional Science and Vitaminology 55(3):231–241. Ordovas´ JM, Smith CE (2010). Epigenetics and cardiovascular disease. Nature Reviews Car- diology 7(9):510–519. Ou J, Tu H, Shan B, Luk DeBose-Boyd R, Bashmakov Y, Goldstein JL, Brown MS (2001). Unsaturated fatty acids inhibit transcription of the sterol regulatory element-binding protein- 1c (SREBP-1c) gene by antagonizing ligand-dependent activation of the LXR. Proceedings of the National Academy of Sciences of the United States of America 98(11):6027–6032. Page T, Griffiths G, Buchanan-Wollaston V (2001). Molecular and biochemical characteriza- tion of postharvest senescence in broccoli. Plant Physiology 125:718–727. Panagiotou G, Nielsen J (2009). Nutritional systems biology: definitions and approaches. Annual Review of Nutrition 29:329–339. Parasramka MA, Ho E, Williams DE, Dashwood RH (2012). MicroRNAs, diet, and cancer: new mechanistic insights on the epigenetic actions of phytochemicals. Molecular Car- cionogenesis 51:213–230. Park YJ, Volpe SL, Decker EA (2005). Quantitation of carnosine in humans plasma after dietary consumption of beef. Journal of Agricultural Food Chemistry (12):4736–4739. 346 HOW DOES FOODOMICS IMPACT OPTIMAL NUTRITION?

Pawar A, Jump DB (2003). Unsaturated fatty acid regulation of peroxisome proliferator- activated receptor alpha activity in rat primary hepatocytes. The Journal of Biological Chemistry 278(38):35931–35939. Piomelli D, Astarita G, Rapaka R (2007). A neuroscientist’s guide to lipidomics. Nature Reviews Neuroscience 8(10):743–754. Plat J, Nichols J, Mensink RP (2005). Plant sterols and stanols: effects on mixed micellar composition and LXR (target gene) activation. Journal of Lipid Research 46(11):2468– 2476. Popkin BM (2006). Global nutrition dynamics: the world is shifting rapidly toward a diet linked with noncommunicable diseases. The American Journal of Clinical Nutrition 84(2):289– 298. Potier M, Darcel N, Tome´ D (2009). Protein, amino acids and the control of food intake. Current Opinion in Clinical Nutrition and Metabolic Care 12(1):54–58. Puiggros` F, Sola` R, Blade´ C, Salvado´ M-J, Arola L (2011). Review: nutritional biomarkers and foodomic methodologies for qualitative and quantitative analysis of bioactive ingredients in dietary intervention studies. Journal of Chromatography A 1218:7399–7414. Pyper SR, Viswakarma N, Yu S, Reddy JK (2010). PPARalpha: energy combustion, hypolipi- demia, inflammation and cancer. Nuclear Receptor Signaling 8:1–21. Raqib R, Cravioto A (2009). Nutrition, immunology, and genetics: future perspectives. Nutri- tion Reviews 67:S227–S236. Renaud S, de Lorgeril M (1992). Wine, alcohol, platelets, and the French paradox for coronary heart disease. Lancet 339(8808):1523–1526. Reynolds VL (2005). Applications of emerging technologies in toxicology and safety assess- ment. International Journal of Toxicology 24:135–137. Rezzi S, Ramadan Z, Martin FP, Fay LB, van Bladeren P, Lindon JC, Nicholson JK, Kochhar S (2007). Human metabolic phenotypes link directly to specific dietary preferences in healthy individuals. Journal of Proteome Research 6(11):4469–4477. Richardson DP, Affertsholt T, Asp NG, Bruce A, Grossklaus R, Howlett J (2003). 
PASSCLAIM synthesis and review of existing processes. European Journal of Nutrition 42(Suppl 1):96– 111. Rimbach G, Minihane AM (2009). Nutrigenetics and personalized nutrition: how far have we progressed and are we likely to get there?. Proceedings of the Nutrition Society 68:162–172. Ritchie MR, Morton MS, Deighton N, Blake A, Cummings JH (2004). Plasma and urinary phyto-oestrogens as biomarkers of intake: validation by duplicate diet analysis. British Journal of Nutrition 91(3):447–457. Roche PA, Annas GJ (2001). Protecting genetic privacy. Nature Reviews Genetics 2:392–396. Roura E, Andres-Lacueva C, Estruch R, Bilbao MLM, Izquierdo-Pulido M, Lamuela-Raventos R (2008). The effects of milk as a food matrix for polyphenols on the excretion profile of cocoa (−)-epicatechin metabolites in healthy human subjects. British Journal of Nutrition 100(4):846–851. Roux A, Lison D, Junot C, Heilier J-F (2011). Applications of liquid chromatography coupled to mass spectrometry-based metabolomics in clinical chemistry and toxicology: a review. Clinical Biochemistry 44:119–35. Rucker R, Tinker D (1986). The role of nutrition in gene expression. A fertile field for the application of molecular biology. The Journal of Nutrition 116:177–189. REFERENCES 347

Rudkowska I, Raymond C, Ponton A, Jacques H, Lavigne C, Holub BJ, Marette A, Vohl MC (2011). Validation of the use of peripheral blood mononuclear cells as surrogate model for skeletal muscle tissue in nutrigenomic studies. OMICS 15(1–2):1–7. Sampath H, Ntambi JM (2005). Polyunsaturated fatty acid regulation of genes of lipid metabolism. Annual Review of Nutrition 25:317–340. Saris WH, Harper A (2005). DiOGenes: a multidisciplinary offensive focused on the obesity epidemic. Obesity Reviews 6:175–176. Schadt EE, Lamb J, YangX, Zhu J, Edwards S, Guhathakurta D, Sieberts SK, Monks S, Reitman M, Zhang C, Lum PY, Leonardson A, Thieringer R, Metzger JM, Yang L, Castle J, Zhu H, Kash SF, Drake TA, Sachs A, Lusis AJ (2005). An integrative genomics approach to infer causal associations between gene expression and disease. Nature Genetics 37(7):710– 717. Schnackenberg LK, Jones RC, Thyparambil S, Taylor JT, Han T, Tong W, Hansen DK, Fuscoe JC, Edmondson RD, Beger RD, Dragan YP (2006). An integrated study of acute effects of valproic acid in the liver using metabonomics, proteomics, and transcriptomics platforms. OMICS 10(1):1–14. Schroeder F, Petrescu AD, Huang H, Atshaves BP, McIntosh AL, Martin GG, Hostetler H, Vespa A, Landrock D, Landrock KK, Payne HR, Kier AB (2008). Role of fatty acid binding proteins and long chain fatty acids in modulating nuclear receptors and gene transcription. Lipids 43(1):1–17. Seeram NP, Henning SM, Zhang YJ, Suchard M, Li ZP, Heber D (2006). Pomegranate juice ellagitannin metabolites are present in human plasma and some persist in urine for up to 48 hours. Journal of Nutrition 136(10):2481–2485. Seo T, Blaner WS, Deckelbaum RJ (2005). Omega-3 fatty acids: molecular approaches to optimal biological outcomes. Current Opinion in Lipidology 16(1):11–18. Setchell KDR, Brown NM, Zimmer-Nechemias L, Brashear WT, Wolfe BE, Kirschner AS, Heubi JE (2002). 
Evidence for lack of absorption of soy isoflavone glycosides in humans, supporting the crucial role of intestinal metabolism for bioavailability. American Journal of Clinical Nutrition 76(2):447–453. Shelnutt SR, Cimino CO, Wiggins PA, Ronis MJJ, Badger TM (2002). Pharmacokinetics of the glucuronide and sulfate conjugates of genistein and daidzein in men and women after consumption of a soy beverage1. American Journal of Clinical Nutrition 76(3):588–594. Shirazi-Beechey SP, Moran AW, Batchelor DJ, Daly K, Al-Rammahi M (2011). Glucose sensing and signalling; regulation of intestinal glucose transport. The Proceedings of the Nutrition Society 70(2):185–193. Sies H (2010). Polyphenols and health: update and perspectives. Archives of Biochemistry and Biophysics 501(1):2–5. Simopoulos AP (2010). Nutrigenetics/nutrigenomics. Annual Review of Public Health 31:53– 68. Simopoulos AP, Childs B (1990). Genetic variation and nutrition. World Review of Nutrition and Dietetics 63:1–300. Siri-tarino PW, Sun Q, Hu FB, Krauss RM (2010). Meta-analysis of prospective cohort studies evaluating the association of saturated fat with cardiovascular disease. American Journal of Clinical Nutrition 91(3):535–546. Snoep JL, Westerhoff HV (2005). From isolation to integration, a systems biology approach for building the Silicon Cell. Systems Biology 13:13–30. 348 HOW DOES FOODOMICS IMPACT OPTIMAL NUTRITION?

Solanky KS, Bailey NJ, Beckwith-Hall BM, Davis A, Bingham S, Holmes E, Nicholson JK, Cassidy A (2003). Application of biofluid 1H nuclear magnetic resonance-based metabo- nomic techniques for the analysis of the biochemical effects of dietary isoflavones on human plasma profile. Analytical Biochemistry 323:197–204. Solanky KS, Bailey NJ, Beckwith-Hall BM, Bingham S, Davis A, Holmes E, Nicholson JK, Cassidy A (2005). Biofluid 1H NMR-based metabonomic techniques in nutrition research - metabolic effects of dietary isoflavones in humans. The Journal of Nutritional Biochemistry 16(4):236–244. Stover PJ, Caudill MA (2008). Genetic and epigenetic contributions to human nutrition and health: managing genome-diet interactions. Journal of the American Dietetic Association 108:1480–1487. Taylor PD, Poston L (2007). Developmental programming of obesity in animals. Experimental Physiology 92:287–298. USDA (2010). Dietary guidelines for Americans. U.S. Dept. of Health and Human Services and U.S. Dept. of Agriculture. Vander Werf MJ, Schuren EHJ, VanOmmen B (2001). Nutrigenomics: application of genomics tecnologies in nutritional sciences and food technology. Journal of FoodScience 66(6):772– 780. Van Dorsten FA, Daykin CA, Mulder TPJ, Van Duynhoven JPM (2006). Metabonomics approach to determine metabolic differences between green tea and black tea consumption. Journal of Agricultural and Food Chemistry 54:6929–6938. Van Erk MJ, Blom WA, van Ommen B, Hendriks HF (2006). High-protein and high- carbohydrate breakfasts differentially change the transcriptome of human blood cells. American Journal of Clinical Nutrition 84(5):1233–1241. van Ommen B, Stierum R (2002). Nutrigenomics: exploiting systems biology in the nutrition and health arena. Current Opinion in Biotechnology 13(5):517–521. van Ommen B, Groten JP (2004). Nutrigenomics in efficacy and safety evaluation of food components. In: Simpopoulos AP, Ordovas JM, editors. Nutrigenetics and Nutrigenomics. 
World Review of Nutrition and Dietetics Vol. 93. Basel, Switzerland: Karger. p 134– 152. van Ommen B, Keijer J, Kleemann R, Elliott R, Drevon CA, McArdle H, Gibney M, Muller M (2008). The challenges for molecular nutrition research 2: quantification of the nutritional phenotype. Genes and Nutrition 3:51–59. van Ommen B, Keijer J, Heil SG, Kaput J (2009). Challenging homeostasis to define biomarkers for nutrition related health. Molecular Nutrition and Food Research 53:795–804. Venter JC, Adams MD, Myers EW, Li PW, Mural RJ, Sutton GG, Smith HO, Yandell M, Evans CA, Holt RA, et al. (2001). The sequence of the human genome. Science 291(5507):1304– 1351. Virgili F, Perozzi G (2008). How does nutrigenomics impact human health? IUBMB Life 60:341–344. Vitaglione P, Sforza S, Galaverna G, Ghidini C, Caporaso N, Vescovi PP, Fogliano V, Marchelli R (2005). Bioavailability of trans-resveratrol from red wine in humans. Molecular Nutrition Food Research 49(5):495–504. Vogels JWTE, Tas AC, van den Berg F, van der Greef J (1993). A new method for classifi- cation of wines based on proton and C-13 NMR spectroscopy in combination with pattern recognition techniques. Chemometrics and Intelligent Laboratory System 21:249–258. REFERENCES 349

VogelsJWTE, Terwel L, Tas AC, van den Berg F, Dukel F, van der Greef J (1996). Detection of adulteration in orange juices by a new screening method using proton NMR spectroscopy in combination with pattern recognition techniques. Journal of Agricultural Food Chemistry 44:175–180. Wehrens R, Franceschi P, Vrhovsek U, Mattivi F (2011). Stability-based biomarker selection. Analytica Chimica Acta 705:15–23. Weissinger EM, Nguyen-Khoa T, Fumeron C, Saltiel C, Walden M, Kaiser T, Mischak H, Drueke TB, Lacour B, Massy ZA (2006). Effects of oral vitamin C supplementation in hemodialysis patients: a proteomic assessment. Proteomics 6:993–1000. Welsh S (1994). Atwater to the present: evolution of nutrition education. The Journal of Nutrition 124(Suppl 9):1799S–1807S. Wendler D, Emanuel E (2002). The debate over research on stored biological samples: what do sources think? Archives of Internal Medicine 162:1457–1462. WHO (2000). Obesity: preventing and managing the global epidemic. Report of a WHO consultation. World Health Organization technical report series 894:1–253. WHO/FAO (2003). Diet, nutrition and the prevention of chronic diseases. Report of a joint WHO/FAO expert consultation. Geneva, World Health Organization. WHO Technical Report Series, 916. Available at http://whqlipdoc.who.int/trs/who_trs_916.pdf Williams RR, Hunt SC, Hasstedt SJ, Hopkins PN, Wu LL, Berry TD, Stults BM, Barlow GK, Kuida H (1990). Hypertension: genetics and nutrition. World Review of Nutrition and Dietetics 63:116–130. Wilson AG (2008). Epigenetic regulation of gene expression in the inflammatory response and relevance to common diseases. Journal of Periodontol 79:1514–1519. Wittwer J, Rubio-Aliaga I, Hoeft B, Bendik I, Weber P, Daniel H (2011). Nutrigenomics in human intervention studies: current status, lessons learned and future perspectives. Molecular Nutrition and Food Research 55(3):341–358. Zhang X, Yap Y, Wei D, Chen G, Chen F (2008). Novel omics technologies in nutrition research. 
Biotechnology Advances 26(2):169–176. Zeisel SH (2011). Nutritional genomics: defining the dietary requirements and effects of choline. The Journal of Nutrition 141:531–534. Zubik L, Meydani M (2003). Bioavailability of soybean isoflavones from aglycone and gluco- side forms in American women. American Journal of Clinical Nutrition 77(6):1459–1465. 12 LIPIDOMICS

Isabel Bondia-Pons and Tuulia Hyötyläinen

12.1 DEFINITION AND ANALYTICAL CHALLENGES IN LIPIDOMICS

12.1.1 Lipids: Functions and Classification

Lipidomics is a subdiscipline of metabolomics that focuses on the global study of molecular lipids (i.e., the complete lipid profile within a cell, tissue, or organism), including pathways and networks of cellular lipids in biological systems (Wenk, 2005; Orešič, 2009). It covers not only the analysis of lipid species and their abundance but also their biological activities, subcellular localization, and tissue distribution (Dennis, 2009). Lipids are a diverse group of compounds with multiple key biological functions. The diversity in lipid function is reflected in a huge variation in the structures of lipid molecules, which by some recent estimates comprise hundreds of thousands of distinct lipid molecules (Yetukuri et al., 2008; Buckingham, 2010). Lipids function as energy storage sources, participate in signaling pathways, and constitute the structural building blocks of both cell and organelle membranes (Orešič et al., 2008; Fahy et al., 2011) (Fig. 12.1). Lipids are thus directly involved in membrane trafficking, in regulating membrane proteins, in creating specific subcompartments in membranes that contribute to cellular function (German et al., 2007; Shevchenko and Simons, 2010), and in providing dynamic, highly specialized molecular scaffolds for the construction of the microscopic and macroscopic chemical assemblies needed for life processes (Klose et al., 2010). This makes them an interesting but challenging target from both analytical and biological perspectives, including in the field of nutritional and food lipidomics.

Foodomics: Advanced Mass Spectrometry in Modern Food Science and Nutrition, First Edition. Edited by Alejandro Cifuentes. © 2013 John Wiley & Sons, Inc. Published 2013 by John Wiley & Sons, Inc.


[Figure 12.1: schematic with three panels, "Membrane structure and function", "Energy storage and lipid transport", and "Cell signaling and metabolic homeostasis", annotated with example membrane lipids (e.g., PC(16:0/18:1), SM(d18:1/24:1)), storage lipids (e.g., TG(14:0/16:0/16:0), CE(20:4)), and signaling lipids (e.g., ceramides, PGD2, S1P). Source: Orešič et al., Trends Biotechnol. (2008).]

FIGURE 12.1 Main functions of the lipids. Lipids serve as energy storage sources, participate in cell signaling pathways, and constitute the cellular structural building blocks in both cell and organelle membranes. With permission from Orešič et al. (2008).

Unlike other biomolecules, lipids are not characterized by a single common chemical structure. Table 12.1 shows the main lipid classes and their nomenclature. The most widely used classification is that proposed by Fahy et al. (2005), dividing lipids into eight main categories, namely fatty acyls (FA), glycerolipids (GL), glycerophospholipids (GP), sphingolipids (SP), sterol lipids (ST), prenol lipids (PR), saccharolipids (SL), and polyketides (PK). Each category contains distinct classes and subclasses of lipids.

The fatty acids are the simplest lipid group. In animal tissues, the common fatty acids vary in chain length from 14 to 22 carbon atoms, but on occasion the chain length can vary from 2 to 36, or even more. Most naturally occurring fatty acids have an even number of carbon atoms. Fatty acids from animal tissues may have one to six double bonds and those from algae may have up to five, while those of the higher plants rarely have more than three. Double bonds in fatty acids usually have the cis configuration. The FA group includes the various types of fatty acids, eicosanoids, fatty alcohols, fatty aldehydes, fatty esters, fatty amides, fatty nitriles, fatty ethers, and hydrocarbons. Many lipids in this class, especially the eicosanoids derived from n–6 and n–3 polyunsaturated fatty acids (PUFAs), have distinct biological activities.

FAs are also the major building block of more complex lipids, such as GL, that is, monoacylglycerols (MGs), diacylglycerols (DGs), and triacylglycerols (TGs). These neutral lipids have a glycerol backbone with fatty acid chains attached to the glycerol group. Nutritionally, the TGs are the predominant group in this category. Glycerophospholipids (GP), also referred to as phospholipids (PL), are key components of the lipid bilayer of cells, and are also involved in metabolism and signaling. GPs found in biological membranes include phosphatidylcholines (PC), phosphatidylethanolamines (PE), and phosphatidylserines (PS). Sphingolipids (SP), on the other hand, are a complex family of compounds that share a common structural feature, a sphingoid base backbone. Sterols are a class of lipids that contain a common steroid nucleus of a fused four-ring structure, with a hydrocarbon side chain and an alcohol group. Cholesterol is the primary sterol lipid in animal fat and an important part of the lipid membrane. Other lipids include compounds such as polyketides and saccharolipids.

TABLE 12.1 Lipids Classification

Category               Abbreviation   Subcategory
Fatty acyls            FA             Fatty acids and conjugates; Octadecanoids; Eicosanoids; Docosanoids; Fatty alcohols; Fatty aldehydes; Fatty esters
Glycerolipids          GL             Monoradylglycerols; Diradylglycerols; Triradylglycerols
Glycerophospholipids   GP             Phosphatidic acids; Phosphatidylcholines; Phosphatidylserines; Phosphatidylglycerols; Phosphatidylethanolamines; Phosphatidylinositols; Phosphatidylinositides; Cardiolipins
Sphingolipids          SP             Sphingoid bases; Ceramides; Phosphosphingolipids; Phosphonosphingolipids; Neutral glycosphingolipids; Acidic glycosphingolipids
Sterol lipids          ST             Sterols; Steroids; Secosteroids; Bile acids and derivatives
Prenol lipids          PR             Isoprenoids; Quinones and hydroquinones; Polyprenols
Saccharolipids         SL             Acylaminosugars; Acylaminosugar glycans; Acyltrehaloses; Acyltrehalose glycans
Polyketides            PK             Macrolide polyketides; Aromatic polyketides; Non-ribosomal peptide/polyketide hybrids
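The shorthand names used for molecular lipids in this chapter, such as PC(16:0/18:1) or SM(d18:1/24:1), encode the lipid class and, for each chain, the number of carbon atoms and double bonds. As an illustration only, the following Python sketch parses such names and sums the chain composition. The function and class names (`parse_lipid`, `LipidSpecies`) are hypothetical, and the parser handles only the simple patterns shown here, not the full lipid nomenclature.

```python
import re
from dataclasses import dataclass


@dataclass
class LipidSpecies:
    lipid_class: str   # e.g. "PC", "TG", "Cer"
    chains: list       # list of (carbons, double_bonds) tuples, one per chain

    @property
    def total_carbons(self):
        return sum(carbons for carbons, _ in self.chains)

    @property
    def total_double_bonds(self):
        return sum(bonds for _, bonds in self.chains)


# One chain such as "16:0", "O-16:0" (ether-linked) or "d18:1" (sphingoid base).
# Anything after the carbon:double-bond pair, e.g. "(1Z)", is ignored.
_CHAIN = re.compile(r"^(?:O-|P-|d|t)?(\d+):(\d+)")


def parse_lipid(name):
    """Parse a shorthand name like 'PC(16:0/18:1)' into class and chains."""
    match = re.match(r"^([A-Za-z0-9]+)\((.+)\)$", name.strip())
    if match is None:
        raise ValueError(f"unrecognised lipid name: {name!r}")
    lipid_class, body = match.groups()
    chains = []
    for part in body.split("/"):
        chain = _CHAIN.match(part)
        if chain is None:
            raise ValueError(f"unrecognised chain {part!r} in {name!r}")
        chains.append((int(chain.group(1)), int(chain.group(2))))
    return LipidSpecies(lipid_class, chains)


species = parse_lipid("PC(16:0/18:1)")
print(species.lipid_class, species.total_carbons, species.total_double_bonds)
```

Summing over the chains gives the "sum composition" (here 34 carbons and 1 double bond), which is often all that can be assigned when the individual chains are not resolved.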

12.1.2 The Lipidomics Pipeline

Lipidomics is progressing at great speed through enhancements in mass spectrometry (Sandra et al., 2010; Chait, 2011; Jung et al., 2011), data acquisition (Han et al., 2006; Ståhlman et al., 2009; Sandra et al., 2010), bioinformatics (Niemelä et al., 2009; Song et al., 2009; Orešič, 2011), and systems biology approaches (Orešič, 2009; Gross and Han, 2011), side by side with the other -omics disciplines. The recent advances in both mass spectrometry (MS) and computational methods have strongly influenced the evolution of lipidomics, together with the recognition of the major role that lipids play in many nutrition-related metabolic diseases such as obesity, atherosclerosis, hypertension, and diabetes.

The main lipidomics advances from a technical point of view include tailored condensed-phase separations coupled to MS (Merrill et al., 2005); tandem MS strategies (Liebisch et al., 2004; Han and Gross, 2005); standardized lipid nomenclature, comprehensive lipid database construction, and synthesis of lipid standards (Fahy et al., 2011); and integration of bioinformatics toward automation of data analysis (Niemelä et al., 2009; Fahy et al., 2007). The workflow for lipidomic analyses is presented in Figure 12.2. The analytical workflow starts from sampling and sample preparation, followed by separation and detection. The next steps are data preprocessing and statistical analyses, data mining, and modeling.

12.1.2.1 Sampling and Sample Preparation

In lipidomics, the first steps of the procedure are sampling, storage, and sample preparation, all of which have a crucial role. Poorly optimized protocols can cause contamination, conversion, and/or degradation of the metabolites, leading to biased results. These first steps, that is, sampling and sample preparation, are typically the major cause of variation in the analytical results. Sample preparation also often includes fractionation of lipid classes, for example, PL, TG, and cholesterol esters (CEs). The most common methods for fractionation are thin-layer chromatography (TLC), solid-phase extraction (SPE), and normal phase liquid chromatography (NPLC). For small sample sets, TLC is the most convenient technique for isolation of small amounts of lipid components, as it allows the separation of most of the important lipid classes such as PL, TG, CE, cholesterol, and free fatty acids. However, TLC is not well suited to automation and high-throughput analyses. Both SPE and NPLC procedures can be automated, and particularly with NPLC, very efficient fractionation can be performed.

[Figure 12.2: flow diagram with the stages "Samples (biofluids, cells, tissue)", "Sample preparation, quality control", "Analytical methods" (UPLC-MS, GC-MS, NMR, shotgun MS), "Data preprocessing", "Statistical analysis", "Bioinformatics", and "Biomarkers, biological insight".]

FIGURE 12.2 Workflow in lipidomics. The analytical workflow starts from sampling and sample preparation, followed by separation and detection of metabolites. The next steps are data preprocessing and statistical analyses, data mining, and modeling.

12.1.2.2 Analytical Methods for Lipid Analysis

Due to the high number of lipids, their large concentration range, and their chemical diversity, it is not possible to cover the whole lipidome with a single analytical technique. Two types of approaches are used in lipidomics, namely targeted selective analysis and more comprehensive, nontargeted profiling methods. In targeted analysis, only preselected lipids are analyzed with a carefully planned analytical protocol. While this approach allows very sensitive and robust determination of the selected metabolites, it gives relatively limited information. The nontargeted approaches aim to cover as many lipids as possible in a single analysis. However, these methods are typically only semi-quantitative, and it is not possible to optimize the method for all compounds. At present, the most commonly applied methodologies in lipidomics for final separation and identification are based on mass spectrometry (MS), often combined with chromatographic methods such as liquid chromatography (LC) and gas chromatography (GC). Nuclear magnetic resonance (NMR) is also used in lipidomics.

Tandem or hybrid mass spectrometry (MS/MS) is used both for nontargeted analyses and for increasing the sensitivity and selectivity of quantitative analysis. In MS/MS experiments, the first analyzer is used to select a precursor ion, which is fragmented in a collision cell. The product ions, that is, the fragments generated from the precursor ion by collision-induced dissociation (CID), are then detected in the second mass analyzer. For further identification, this MS/MS process can be repeated iteratively, with sequential selection of resultant ions for fragmentation, in MSn experiments. Suitable MS systems for CID include both quadrupole-based tandem in-space instruments (e.g., triple quadrupole (QqQ) or quadrupole time-of-flight (QTOF)), and ion-trap-based tandem in-time instruments (e.g., quadrupole-ion trap (QIT), linear trap quadrupole (LTQ)–Orbitrap, or linear trap quadrupole Fourier-transform ion cyclotron resonance (LTQ-FT-ICR)).
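Selecting a precursor ion presupposes knowing its expected m/z. The sketch below, which is illustrative and not taken from the chapter, computes the monoisotopic mass of a lipid from its elemental formula and the m/z of its protonated ion, and expresses the deviation of a measured value in parts per million. The formula C42H82NO8P for PC(16:0/18:1) and the atomic masses are standard values; the function names are hypothetical, and only the five elements listed are supported.

```python
import re

# Monoisotopic masses of the most abundant isotopes, in unified atomic mass units.
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503207,
    "N": 14.0030740048,
    "O": 15.99491461956,
    "P": 30.97376163,
}
PROTON = 1.00727646688  # mass of a proton, for [M+H]+ ions


def monoisotopic_mass(formula):
    """Sum the monoisotopic masses for a formula such as 'C42H82NO8P'."""
    mass = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        mass += MONOISOTOPIC[element] * (int(count) if count else 1)
    return mass


def mz_protonated(formula):
    """m/z of the singly charged [M+H]+ ion."""
    return monoisotopic_mass(formula) + PROTON


def ppm_error(measured, theoretical):
    """Relative mass deviation in parts per million."""
    return (measured - theoretical) / theoretical * 1e6


theoretical = mz_protonated("C42H82NO8P")   # PC(16:0/18:1)
print(round(theoretical, 4))                # approximately 760.5851
print(round(ppm_error(760.5870, theoretical), 1))
```

With high-resolution instruments, a candidate assignment is usually retained only if the ppm deviation falls within the mass accuracy of the instrument. The threshold itself depends on the instrument and is not specified here.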

Shotgun Lipidomics

In a direct infusion electrospray ionization (ESI)-based MS approach, the sample (extract) is infused directly into the MS without prior chromatographic separation. Typically, high-resolution mass spectrometers are used in the shotgun approach, for example, hybrid quadrupole time-of-flight MS (Q-TOF) and Fourier transform MS (FTMS) instruments, both the ion cyclotron resonance and the Orbitrap type. Recently, novel hybrid high-resolution instruments, such as the combination of ion mobility and TOFMS (IM–TOFMS), have been launched that are also suitable for shotgun lipidomics. The advantage of the shotgun approach is that it is simple and fast, while its major limitation is ion suppression, because of which compounds present in trace amounts are often not detected (Moco et al., 2007; Jung et al., 2011). Furthermore, the composition of biological samples can vary substantially, and thus the level of ion suppression can change from sample to sample, hampering the reliability of quantitative results for global profiling. Typically, labeled standards are used in the shotgun approach to correct for matrix effects; however, standards are not available for all lipids. It is possible to minimize ion suppression by more careful sample pretreatment, using, for example, fractionation, but then the main advantage of the shotgun approach, that is, simplicity and speed, is lost. Thus, the applicability of shotgun MS in the search for novel, previously unknown lipids is relatively restricted. Nevertheless, its application might be considered in the future as a complementary tool to other lipidomic approaches in the identification of the biochemical mechanisms underlying metabolic diseases. Yang et al. (2011) developed, for instance, an ESI/MS approach for identification and quantitation of the double bond isomers of endogenous FA species or FA chains present in PLs of biological samples by multistage MS (MS3), providing a novel tool for the analysis of a cellular lipidome. Shotgun lipidomics results from both adipose tissue (AT) and skeletal muscle of mice null for calcium-independent phospholipase A2γ (Mancuso et al., 2010) also contributed to identifying the role of this enzyme as a necessary mediator for efficient electron transport chain coupling and energy production through its participation in the alterations of cellular bioenergetics that promote the development of the metabolic syndrome.
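The use of labeled standards to correct for matrix effects can be illustrated with a single-point internal standard calculation. This is a minimal sketch under stated assumptions: the analyte and its labeled standard are taken to be suppressed to the same degree, and the response factor defaults to 1. The function name and parameters are hypothetical and do not describe any specific published protocol.

```python
def quantify_with_internal_standard(analyte_intensity, standard_intensity,
                                    standard_conc, response_factor=1.0):
    """Estimate analyte concentration from its intensity ratio to a spiked standard.

    response_factor is the analyte/standard response ratio at equal
    concentration (assumed to be 1.0 unless determined experimentally).
    """
    if standard_intensity <= 0:
        raise ValueError("internal standard signal must be positive")
    ratio = analyte_intensity / standard_intensity
    return ratio * standard_conc / response_factor


# Without suppression: analyte signal is twice that of a 5 uM standard.
clean = quantify_with_internal_standard(2.0e6, 1.0e6, 5.0)

# With 60% ion suppression acting on both ions, the ratio is unchanged.
suppressed = quantify_with_internal_standard(0.4 * 2.0e6, 0.4 * 1.0e6, 5.0)

print(clean, suppressed)
```

Because both signals are scaled by the same factor, the estimated concentration is unaffected. The correction is only as good as the assumption that the standard behaves like the analyte, which is why the lack of standards for many lipids limits quantitative shotgun work.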

Ultra High Performance Liquid Chromatography Coupled to Mass Spectrometry

Ultra high performance liquid chromatography coupled with mass spectrometry (UHPLC–MS)-based methodologies have been widely used for both targeted and nontargeted analyses, using various types of mass spectrometers, from a simple single quadrupole to hybrid instruments and to high-resolution Orbitrap instruments. The sensitivity in LC–MS is typically high, and identification of novel lipids is possible. The fast UHPLC methodologies, utilizing very high pressures, elevated temperatures, and novel column materials, allow high-throughput analyses with high separation efficiency in a short (10–15 min) analysis time. For global profiling, the best choices are the combination of UHPLC with Q-TOFMS or with tandem ion mobility TOFMS, both of which allow fast, high-resolution MS detection (Yang et al., 2012; Nygren et al., 2011). Typically, up to several hundred lipids can be separated with the UHPLC–MS methodologies used for profiling. Matrix effects are the main challenge in profiling with UHPLC–MS methods, as it is not possible to use labeled standards for all compounds, as is typically done in targeted analyses. The sensitivity is also typically not as high as in targeted methods, both because the methodological conditions cannot be optimized for each compound separately, and because metabolites present in high concentration may hinder the analysis of minor metabolites due to matrix suppression. For targeted analyses, triple quadrupole MS is typically used for detection with UHPLC, usually with selective ion monitoring. With the most recent UHPLC–QqQMS instruments, very high sensitivity can be obtained (picomoles). The targeted lipid methods include methods for eicosanoids and for sterol lipids such as steroids and bile acids (Balazy, 2004; Bobeldijk et al., 2008).
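One common way to put a number on the matrix suppression discussed above is to compare the signal of a standard spiked into extracted matrix with the signal of the same amount in pure solvent. The sketch below follows that convention as an assumption; the chapter itself does not prescribe a formula, and the function names are hypothetical.

```python
def matrix_effect_percent(area_in_matrix, area_in_solvent):
    """Signal in matrix relative to signal in neat solvent, in percent.

    Values below 100 indicate ion suppression; values above 100 indicate
    ion enhancement.
    """
    if area_in_solvent <= 0:
        raise ValueError("solvent reference area must be positive")
    return 100.0 * area_in_matrix / area_in_solvent


def describe_matrix_effect(percent, tolerance=15.0):
    """Label the effect; the 15% tolerance is an arbitrary illustrative choice."""
    if percent < 100.0 - tolerance:
        return "suppression"
    if percent > 100.0 + tolerance:
        return "enhancement"
    return "negligible"


effect = matrix_effect_percent(area_in_matrix=6.0e5, area_in_solvent=1.0e6)
print(effect, describe_matrix_effect(effect))
```

Because the effect can differ from sample to sample and from lipid to lipid, a value obtained for one standard in one matrix should not be assumed to hold for other compounds.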

Lipidomics by Structurally Selective Ion Mobility Spectrometry

An emerging technology that has only recently been applied in lipid analysis is ion mobility–MS (IM–MS) (Shvartsburg and Smith, 2008). Its capabilities and limitations as applied to lipid research have been recently reviewed (Kliman et al., 2011). In short, the term ion mobility refers to the motion of free (gas-phase) ions in the presence of gas collisions. Ion mobility spectrometry shares parallels with MS in that an ionization source, a chamber, and an ion detector are required for both techniques. The fundamental difference is that while in MS the measurement proceeds in vacuum, the measurement of ion mobility occurs within a pressurized chamber, thereby allowing gas collisions. IM analysis can differentiate analytes that are isobaric in mass but differ in structure. To date, few lipid studies by IM–MS have been published, but its role in fundamental lipid characterization, with a special interest in PLs (Jackson et al., 2008; Kim et al., 2009; Trimpin et al., 2009), and in lipidomics of complex biological samples such as tissues (Ridenour et al., 2010) is emerging. In fact, the advances in imaging MS have played an important role in the development of imaging IM–MS for lipid analysis (Woods and Jackson, 2010; Goto-Inoue et al., 2011). The combination of IM–MS experiments with molecular dynamics computational modeling (Van der Spoel et al., 2011) might be a useful tool to elucidate the structure and stability of lipid-incorporated complexes in future foodomic studies. In addition to the drift time ion mobility (DTIM) method, the newly introduced traveling wave ion mobility (TWIM)-based instruments are expected to have an impact on food lipid research over the next few years.
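For a drift tube with a uniform electric field, the mobility of an ion follows directly from its drift time. The sketch below is added for illustration and uses the textbook relations K = L²/(V·t) and the normalization of K to 760 Torr and 273.15 K. The numerical values are arbitrary, the function names are hypothetical, and the relations apply to DTIM only; TWIM instruments generally require calibration instead.

```python
def ion_mobility(drift_length_cm, drift_voltage_v, drift_time_s):
    """Mobility K in cm^2 V^-1 s^-1 for a uniform-field drift tube.

    K = v / E, with drift velocity v = L / t and field E = V / L,
    so K = L^2 / (V * t).
    """
    return drift_length_cm ** 2 / (drift_voltage_v * drift_time_s)


def reduced_mobility(mobility, pressure_torr, temperature_k):
    """Mobility normalised to standard conditions (760 Torr, 273.15 K)."""
    return mobility * (pressure_torr / 760.0) * (273.15 / temperature_k)


# Arbitrary illustrative settings, not taken from any cited study.
k = ion_mobility(drift_length_cm=10.0, drift_voltage_v=1000.0, drift_time_s=0.1)
k0 = reduced_mobility(k, pressure_torr=760.0, temperature_k=298.15)
print(k, round(k0, 3))
```

Two isobaric lipids with different shapes would give the same m/z but different drift times, and hence different reduced mobilities, which is the basis of the structural selectivity described above.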

Multidimensional Approaches Used in Lipidomics

The intrinsic complexity of naturally occurring lipids, such as phospholipids, has highlighted the need for multidimensional chromatography and MS in lipid characterization. As previously mentioned, PLs are essential constituents of cell membranes, providing a hydrophobic environment for membrane protein activity and a source of lipid second messengers (Ledeen and Wu, 2008). The profile of PLs, especially of the minor components, has been shown to respond to changes in the biological activity of the cell. Depending on the cell type and compartment, the distribution of PL classes and subclasses, as well as their quantities, varies dramatically to facilitate their critical role in biological processes. Phospholipid analysis is probably one of the most demanding parts of lipidomics research, involving the identification and quantitation of up to thousands of cellular PL molecular species and their interactions with other lipids, proteins, and metabolites. With its unique capability of detecting intact PLs and their elemental compositions, as well as obtaining structural information using MS/MS, MS-based lipidomics has been shown to meet the increasing demands for high throughput and for accuracy of identification and quantification. The implementation of high-resolution instrumentation such as FT ion cyclotron resonance and Orbitrap MS in combination with a 2D chromatographic separation can, in fact, be very advantageous for rapid PL profiling. It is expected that comprehensive multidimensional LC–MS, which inherits the advantages of the existing methodologies and overcomes the limitations of any individual one, will develop further in the future. Recent developments of multidimensional MS, LC–MS, and chromatographic approaches for lipidomics analysis have been extensively reviewed elsewhere (Guo and Lankmayr, 2010; Han et al., 2012). Coupling ion mobility spectrometry with mass spectrometry (2D IMMS) provides rapid separation of isomers, conformers, and enantiomers, with a resolving power similar to that of capillary GC.

Gas Chromatography-Based Methods

GC-based methods are suitable only for sufficiently volatile compounds; thus, most lipids cannot be analyzed by GC. However, for sufficiently volatile lipids, GC-based methods are a viable option, and GC–MS (and GC–FID) is the most widely used method for the analysis of FAs. GC–MS methods are also used for the analysis of steroids. In the analysis of fatty acids and steroids, the compounds have to be derivatized. For free fatty acids and steroids, silylation is the most common methodology, while esterified fatty acids are typically analyzed as their methyl esters (FAME). Typically, FAMEs are prepared by transesterification using hydrogen chloride, sulfuric acid, or boron trifluoride in methanol. Transesterification involves the extractive transmethylation of lipid class-bound fatty acids with methoxide and heat, which is followed by acidification in methanol to convert esterified and free fatty acids to FAMEs. The physiological role of FA in health and disease has gained appreciation during the last decades, and there has been an intense effort to develop methodologies to quantitatively monitor FA composition in biological samples in a manner that satisfies the requirements for comprehensiveness, sensitivity, and accuracy. The current state-of-the-art quantitative aspects of FA analysis using GC–MS for FA profiling in biological samples, such as cultured and primary cells, tissues, and blood plasma samples, have been reviewed elsewhere (Quehenberger et al., 2011).
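Two small calculations recur in FAME-based fatty acid profiling: deriving the elemental composition of the methyl ester from the fatty acid shorthand, and expressing each fatty acid as a percentage of the total. The following sketch illustrates both. It assumes straight-chain, unsubstituted fatty acids and, for the composition, equal detector response for all FAMEs, which real methods correct with response factors. The function names are hypothetical.

```python
def fame_formula(carbons, double_bonds):
    """Elemental composition (C, H, O) of a fatty acid methyl ester.

    A straight-chain fatty acid n:d is C(n) H(2n - 2d) O2; methylation
    adds one CH2, giving C(n + 1) H(2n + 2 - 2d) O2.
    """
    return (carbons + 1, 2 * carbons + 2 - 2 * double_bonds, 2)


def fatty_acid_composition(peak_areas):
    """Convert FAME peak areas to percentages of the total area."""
    total = sum(peak_areas.values())
    if total <= 0:
        raise ValueError("total peak area must be positive")
    return {name: 100.0 * area / total for name, area in peak_areas.items()}


print(fame_formula(16, 0))   # methyl palmitate: (17, 34, 2)
print(fame_formula(18, 1))   # methyl oleate: (19, 36, 2)

# Hypothetical peak areas for three FAMEs in one chromatogram.
areas = {"16:0": 250.0, "18:1": 500.0, "18:2": 250.0}
print(fatty_acid_composition(areas))
```

Relative composition is convenient for comparing samples, but it cannot reveal changes in absolute amounts; for that, an internal standard added before derivatization is needed.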

Isotopic Enrichment in Specific Lipids

MS also allows stable isotopic enrichment in specific lipids to be used as a means of following a label through entire pathways, so that lipid metabolites can provide truly dynamic, kinetic information (German et al., 2007). Stable isotope tracers are used to assess metabolic flux profiles in living cells. Most methods of measurement average out the isotopic isomer distribution in metabolites throughout the cell, but information about the compartmental organization of the analyzed pathways was recently shown to be crucial for the evaluation of true fluxes (Marin de Mas et al., 2011). The recent concept of fluxolipidomics has been briefly reviewed, proposing a fluxomics approach for lipid molecular species in terms of both compartments and biochemical metabolism (Lagarde et al., 2012). The same authors also developed the example of fluxolipidomics of essential FA toward their enzyme-dependent oxygenated metabolites and their degradation products, expanding the horizons of lipidomics toward new approaches.
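As a simple illustration of how label incorporation can be summarized from MS data, the sketch below converts the intensities of the isotopologues M+0, M+1, ..., M+n of one lipid into fractional abundances and a mean enrichment. It assumes that the intensities have already been corrected for natural isotope abundance, which is a separate step not shown, and the function names are hypothetical.

```python
def isotopologue_fractions(intensities):
    """Fractional abundance of each isotopologue M+0 ... M+n."""
    total = sum(intensities)
    if total <= 0:
        raise ValueError("total intensity must be positive")
    return [value / total for value in intensities]


def mean_enrichment(intensities):
    """Average fraction of labelable positions that carry the label.

    With n labelable atoms and fractions f_0 ... f_n, this is
    sum(i * f_i) / n.
    """
    n_positions = len(intensities) - 1
    if n_positions < 1:
        raise ValueError("need at least M+0 and M+1 intensities")
    fractions = isotopologue_fractions(intensities)
    return sum(i * f for i, f in enumerate(fractions)) / n_positions


# Hypothetical corrected intensities for M+0, M+1, M+2.
print(isotopologue_fractions([50.0, 0.0, 50.0]))
print(mean_enrichment([50.0, 0.0, 50.0]))
```

Following this quantity over time in a precursor and its products gives the kind of kinetic information referred to above. Interpreting it as a flux requires a model of the pathway and its compartments.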

Nuclear Magnetic Resonance

Nuclear magnetic resonance (NMR) spectroscopy is a quantitative, nondestructive technique that provides unique information about molecular structure and dynamics, and it has been widely utilized in metabolomics. However, in lipidomics, the use of NMR is rather incidental, mainly because it is often challenging to directly identify lipids in complex mixtures with NMR. The similarity of the spectra of lipids, which carry limited structural information on the carbon chains, is another challenge in lipid analysis with NMR. An additional limitation is the rather modest sensitivity of NMR. NMR interpretation is also complicated by the considerable number of spin-coupled multiplets that result in spectral crowding. NMR has been utilized in lipid analysis (Lindon and Nicholson, 2008); however, its use has been mainly restricted to the elucidation of molecular structures of purified lipids and the characterization of dynamic lipid–protein interactions (Wenk, 2005). NMR has also been used in structural analysis and quantification of fatty acids and their derivatives.

An NMR proof-of-principle study demonstrated the first application of plasma 1H NMR-based lipidomics for improving the prognosis of diet-induced atherogenesis. The study evaluated the effects of different dairy-based food products on early atherogenesis in hyperlipidemic hamsters (Martin et al., 2009). The NMR approach selectively captured part of the diet-induced metabotypes correlated with aortic cholesteryl esters, with VLDL lipids, cholesterol, and N-acetyl glycoproteins being the most positively correlated metabolites.

Recent advances in high-resolution magic-angle-spinning (HR-MAS) NMR spectroscopy for metabolic profiling of intact tissues (Beckonert et al., 2010) might also become a future tool to monitor tissue-specific or cellular processes in nutritional studies. To date, HR-MAS NMR technology has mainly been applied in exploratory biomedical research, either answering organ-specific questions or investigating interactions between biofluid and organ compartments, as reviewed by Lindon et al. (2009). The statistical total correlation spectroscopy (STOCSY) methodology (Cloarec et al., 2005) is increasingly used for this purpose. However, it is not easy to forecast where the next major improvements in this technique will occur, or how they will be utilized in nutrition-based studies.

12.1.2.3 Data Analysis Tools

The amount of data obtained with current lipidomics methodologies is huge, particularly with the global profiling techniques. It is challenging to link the analytical information with available clinical and genetic data. The first steps of data processing include signal processing, data normalization, transformation, and assessment, followed by application of statistical methods for comparison of groups and the construction of predictive models (Katajamaa and Orešič, 2007; Sumner et al., 2007). Several software packages are available for this data preprocessing before the statistical and bioinformatic analyses (Katajamaa and Orešič, 2007; Sumner et al., 2007; Lange et al., 2008; Pluskal et al., 2010). The main challenge in statistical analyses is that models using information from hundreds of metabolites are, in practice, not realistic. To get robust models, the number of significant variables should be less than ca. 25, derived from a limited number of metabolic markers (3–10 compounds). With a larger number of variables/metabolites, there is a high risk of overfitting the data. However, state-of-the-art analytical techniques produce data on several hundreds or even thousands of compounds. Thus, the first step of the statistical analysis is typically data reduction, so that computations are tractable, model predictive power is improved, and the biochemical interpretation can focus on a small set of relevant lipids.

Two types of pattern recognition processes are typically used in multivariate statistics, namely unsupervised and supervised methods. In unsupervised data analysis, the analysis is done without any preconceptions or preselection, that is, without biasing the results by introducing prior information about the samples. These methods include hierarchical cluster analysis and principal component analysis, and are good at identifying patterns in the data. In the supervised approach, such as principal component regression and neural networks, each sample or metabolite is first associated with an already known class, and this prior information is then utilized in the generation of the clusters of patterns (Katajamaa and Orešič, 2007). Other techniques utilized in lipidomics include artificial neural networks, self-organizing maps, and linear discriminant analysis, among others (Katajamaa and Orešič, 2007; Sumner et al., 2007).
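The preprocessing and unsupervised steps described above can be sketched with NumPy alone. The example is a minimal illustration on synthetic data, not the procedure of any of the cited software packages. The choice of log2 transformation and autoscaling, the function names, and the simulated group difference are all assumptions made for the sake of the example.

```python
import numpy as np


def preprocess(intensities):
    """Log-transform and autoscale a (samples x lipids) intensity matrix."""
    logged = np.log2(intensities + 1.0)          # compress the dynamic range
    centered = logged - logged.mean(axis=0)      # mean-center each lipid
    std = logged.std(axis=0, ddof=1)
    std[std == 0] = 1.0                          # avoid division by zero
    return centered / std                        # unit variance per lipid


def pca(data, n_components=2):
    """Principal component analysis via singular value decomposition."""
    centered = data - data.mean(axis=0)
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]
    explained = s ** 2 / np.sum(s ** 2)          # fraction of variance per PC
    return scores, vt[:n_components], explained[:n_components]


# Synthetic data: 20 samples x 50 lipids, two groups of 10 samples.
rng = np.random.default_rng(0)
group_a = rng.lognormal(mean=10.0, sigma=0.3, size=(10, 50))
group_b = rng.lognormal(mean=10.0, sigma=0.3, size=(10, 50))
group_b[:, :15] *= 4.0                           # 15 lipids elevated in group B

scaled = preprocess(np.vstack([group_a, group_b]))
scores, loadings, explained = pca(scaled)
print(scores.shape, loadings.shape, explained)
```

The score matrix reduces the 50 lipid variables to two coordinates per sample, which is one form of the data reduction discussed above. No class labels are used at any point, which is what makes the analysis unsupervised.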

12.2 LIPIDOMICS IN NUTRITION AND HEALTH RESEARCH

Nutrition plays a crucial role in human health, and dietary choices can both prevent and promote disease. A poor diet promotes several diseases typically linked with metabolic imbalances, including obesity, diabetes, atherosclerosis, hypertension, malignancy, osteoporosis, inflammatory disease, and even infectious diseases. However, the link between an individual's diet and a specific health outcome is poorly understood. Certain individuals are easily affected by their diet, and a poor diet quickly leads to obesity and associated metabolic complications, such as type 2 diabetes, whereas in others the same diet does not. Metabolomics can help track the interaction between nutrients and human metabolism, as well as the involvement of the genome and the gut microbiome, in overall human health. Investigation of individual variation can be used for the development of personalized solutions for interventions. This, in turn, can establish a new framework to enhance human health through increasing the efficacy and safety of diets. Individualizing the metabotype and linking it to diet and health would allow estimation of the nutritional status of an individual; follow-up of the compliance, progress, and success of dietary guidance and intervention; identification of side effects, unexpected metabolic responses, or lack of response to specific dietary changes; and recognition of metabolic shifts in individuals due to environmental changes, lifestyle modifications, and the normal progression of aging.

12.2.1 Lipidomics and Human Nutritional Interventions

The use of lipidomics approaches in human nutritional intervention studies has emerged during the last few years. Multiple bioactive lipid components may, for instance, play a role in the mechanisms by which fish consumption exerts its positive effects on human health. An 8-week parallel controlled pilot study was carried out to investigate how intakes of fatty fish or lean fish affect serum lipidomic profiles in subjects with myocardial infarction or unstable ischemic attack (Lankinen et al., 2009). Lipidomics analyses based on UPLC-ESI-QTOF-MS were performed as described earlier (Katajamaa et al., 2006; Schwab et al., 2008). A total of 307 lipids were identified and quantified by the lipidomics platform. Among them, multiple bioactive lipid species, including ceramides, lysophosphatidylcholines (lysoPC), diacylglycerols (DGs), phosphatidylcholines, and lysophosphatidylethanolamines, decreased significantly in the fatty fish group, whereas in the lean fish group CEs and specific long-chain TG increased significantly (Lankinen et al., 2009). These results, together with the fact that the prevalence of impaired glucose tolerance and type 2 diabetes is lower in populations with a high intake of n–3 fatty acids, support the hypothesis that DGs and ceramides may be the link between n–3 fatty acids and insulin resistance. The proinflammatory cytokines IL-6 and TNF-α are associated with insulin resistance (IR) and the metabolic syndrome (Van Gaal et al., 2006). In a linear regression model, circulating levels of both ceramides and TNF-α showed a significant independent influence on circulating levels of IL-6, altogether accounting for 41% of its variation (p < 0.001), indicating that the link between ceramides, IR, and inflammation is related to the inflammatory marker IL-6 (de Mello et al., 2009). Ceramides may thus contribute to the induction of inflammation involved in IR states that frequently coexist with coronary heart disease. In addition, the decrease in lysoPC in the fatty fish group may be related to anti-inflammatory effects of n–3 FA, as lysoPC is the major bioactive lipid component of oxidized low density lipoproteins (LDL) and may be responsible for many of the inflammatory effects of oxidized LDLs (Aiyar et al., 2007).

Characterizing the status of health and defining the border between health and disease is not a trivial issue. Systematic models might indeed help in understanding individual challenge responses. Early alterations in metabolism might be unmasked by challenging metabolic regulatory processes, testing the individual capacity and flexibility to cope with environmental stressors, such as physical activity or dietary components. In most metabolomic studies, samples are obtained in the fasting state, and few studies report time-resolved changes of the human metabolome in response to a challenge (Shaham et al., 2008; Wopereis et al., 2009; Rubio-Aliaga et al., 2011). Interestingly, significant changes in bile acids were, for instance, linked for the first time to glucose homeostasis by applying an LC–MS/MS metabolic profiling strategy to samples from both healthy individuals and individuals with impaired glucose tolerance after an oral glucose challenge (Shaham et al., 2008). These findings laid the groundwork for using metabolic profiling to define an individual's insulin response profile, which might have value in predicting diabetes and its complications and in guiding therapy, once confirmation with data from larger, prospective clinical studies of prediabetics becomes available. In order to extend the knowledge of the dynamics of the human metabolome in response to different challenges, a recent study, with a special focus on lipid and amino acid changes, was performed in healthy men who underwent a prolonged 36-h fasting period, a standard liquid diet, an oral glucose tolerance test (OGTT), and an oral lipid tolerance test (Krug et al., 2012). Flow injection analysis (FIA)–MS/MS-based analyses were used among other analytical platforms. The study results showed that physiological challenges increased interindividual variation even in phenotypically similar volunteers, revealing metabotypes not observable in baseline metabolite profiles. Plasma free carnitine and acylcarnitines were shown to best define catabolic and anabolic conditions and their transitions, and their ratio was suggested by the authors as a marker for the metabolic state. Readouts from a systematic model of beta-oxidation showed significant and stronger associations with physiological parameters such as fat mass than absolute metabolite concentrations did, suggesting that systematic models might help in understanding individual challenge responses (Krug et al., 2012).
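The statement earlier in this section that ceramides and TNF-α together accounted for 41% of the variation in IL-6 refers to the coefficient of determination (R²) of a linear regression with two predictors. The sketch below shows how such a value is computed. The data are synthetic and generated only for illustration; they are not the data of de Mello et al. (2009), and the function name is hypothetical.

```python
import numpy as np


def fit_linear_model(predictors, response):
    """Ordinary least squares with an intercept; returns coefficients and R^2."""
    design = np.column_stack([np.ones(len(response)), predictors])
    coef, *_ = np.linalg.lstsq(design, response, rcond=None)
    fitted = design @ coef
    ss_res = np.sum((response - fitted) ** 2)           # unexplained variation
    ss_tot = np.sum((response - response.mean()) ** 2)  # total variation
    return coef, 1.0 - ss_res / ss_tot


# Synthetic, standardised values for 40 hypothetical subjects.
rng = np.random.default_rng(1)
n = 40
ceramide = rng.normal(size=n)
tnf_alpha = rng.normal(size=n)
il6 = 0.5 * ceramide + 0.4 * tnf_alpha + rng.normal(scale=0.8, size=n)

coef, r_squared = fit_linear_model(np.column_stack([ceramide, tnf_alpha]), il6)
print(coef, r_squared)
```

An R² of 0.41 would mean that 41% of the variance in the response is reproduced by the fitted linear combination of the predictors. It says nothing by itself about causality or about the remaining 59%.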

12.2.2 Lipidomics and Nutrition-Related Diseases

The origin of obesity and related lipid disturbances is multifactorial. Environmental factors (including nutrition status and dietary patterns) and lifestyle factors play a key role in the development of obesity, in addition to genetic variation, which also influences both body fat accumulation and lipid metabolism. The study of monozygotic (MZ) twins discordant for obesity is probably the best approach, permitting unequivocal distinction between genetic effects and environmental and lifestyle effects. In a study focused on young and healthy obesity-discordant MZ twins, Pietiläinen et al. (2007) showed that obesity, independent of genetic factors, was related to distinct changes in the global serum lipid profile. Global characterization of lipid molecular species in serum was performed by a lipidomics strategy using UPLC/Q-TOFMS. In comparison to nonobese co-twins, the obese co-twins had increased levels of lysophosphatidylcholines (LPCs), which are lipids found in proinflammatory (Yan et al., 2005) and proatherogenic conditions, as well as decreased levels of ether phospholipids, which are known to exert antioxidative properties (Engelmann, 2004). Importantly, these lipid changes were associated with insulin resistance, a pathognomonic characteristic of acquired obesity in these healthy adult twins. The authors therefore pointed out that proper management of obesity, with a new generation of therapies directed at several targets in the lipid metabolism pathways, will most likely correct these abnormalities and favorably modify the risk, course, and outcome of diabetes and cardiovascular diseases. Further lipidomic analyses of AT in the same twin population revealed that the obese twins had increased proportions of palmitoleic and arachidonic acids in their AT, including increased levels of ethanolamine plasmalogens containing arachidonic acid, despite lower dietary PUFA intake (Pietiläinen et al., 2011). Information gathered from these twins and from a separate set of morbidly obese subjects was used for molecular dynamics simulations of lipid bilayers using bioinformatic approaches, and the conclusions were further supported by confirmatory in vitro adipocyte studies. This novel strategy enabled the authors to identify adaptive mechanisms that may lie behind the characteristic remodeling of the AT lipidome in response to positive-energy-balance-induced AT expansion during the evolution of obesity. The simulations suggested that the observed lipid remodeling maintains the biophysical properties of lipid membranes at the price of increasing their vulnerability to inflammation (Pietiläinen et al., 2011).
Research on the AT lipidome, covering a global profile of structurally and functionally diverse lipids, provides a unique tool for accurately and sensitively profiling hundreds of molecular lipids in parallel (Han and Gross, 2003). Adipocyte metabolism is flexible and tightly influenced by energy balance in order to carry out one of the major AT functions, that is, the storage of surplus energy. Excess dietary carbohydrates are transformed to FA by de novo lipogenesis and stored as triglycerides. Saturated fatty acids (SFA) are the end products of de novo lipogenesis, and a high lipogenic activity in AT has indeed been reported to be positively correlated with the content of SFA. An animal study combining time-resolved microarray analyses of mesenteric, subcutaneous, and epididymal white adipose tissue (EWAT) during high-fat feeding of male transgenic apolipoprotein E3 Leiden (ApoE3Leiden) mice with histology and targeted lipidomics reported that the contents of linoleic acid and alpha-linolenic acid in EWAT were increased compared to other depots (Caesar et al., 2010). The authors suggested that the androgen receptor, whose expression was higher in EWAT than in other tissues, may mediate depot-dependent differences in the de novo lipogenesis rate. They proposed that dietary essential fatty acids accumulate in EWAT as a result of sex steroid-mediated suppression of lipogenesis, an adaptive strategy that provides precursors for epididymal PUFA synthesis (Caesar et al., 2010). Using the same mouse model, Wopereis et al. (2012) reported that specific plasma FFA, as well as their ratios, can be used to predict future glucose intolerance (GI) in ApoE3Leiden mice.
As GI is a hallmark of the prediabetic stage, the authors combined lipidomics and transcriptomics with the aim of identifying prognostic biomarkers that predict the risk of developing GI later in life, as well as diagnostic biomarkers reflecting the degree of already-manifested GI. The plasma ratios of C16:1/C16:0, C18:1/C18:0, and C18:2/C22:6 were significantly correlated with the area under the curve derived from the OGTT. In addition, the expression of several white blood cell genes reflected the individual degree of GI during disease progression, and these genes were suggested by the authors as easily accessible markers to diagnose and monitor already existing GI (Wopereis et al., 2012).
The progress in specific animal models and systems biology-based metabolomics might also allow assessment of the effect of Chinese medicine preparations, in which the gap between food and drugs has always been small, and nutrition is seen as a normal part of prevention and healthcare (Chau and Wu, 2006). A remarkably high number of preparations have, for instance, been handed down over the centuries with documented activity related to clinical features of what is now described as metabolic syndrome. In this sense, a recent study focused on plasma and liver lipidomics revealed multiple pathway effects of a multicomponent preparation containing eight natural ingredients (such as Fructus Crataegi (hawthorn berries) and Radix et Rhizoma Rhei (rhubarb root)) on lipid biochemistry in ApoE3Leiden cholesteryl ester transfer protein (ApoE3Leiden.CETP) mice (Wei et al., 2012). The core formula, which is used in China for the treatment of metabolic syndrome and early stage type 2 diabetes with obesity, has been shown to significantly improve insulin sensitivity in prediabetic ApoE3Leiden mice as compared with nontreated controls (Wang et al., 2005).
The observed changes in lipids, mainly in CEs and TGs, were comparable to those obtained with known drugs such as rimonabant, niacin, and atorvastatin. Forthcoming studies should nevertheless include dose titrations and studies on lipid fluxes in human volunteers.
It is thought that many of the adverse sequelae of obesity and type 2 diabetes also result from disruption in the efficiency of transitions of metabolic flux that occur during changes in substrate utilization (such as glucose vs. fatty acid) or changes in energy demand. However, the biochemical mechanisms behind the integration of multiple cell-specific responses are still under study (Huss and Kelly, 2005). Mitochondrial and peroxisomal phospholipases are key actors in the regulation of cellular bioenergetics and signaling (Gadd et al., 2006; Kinsey et al., 2007). A study with mice null for calcium-independent phospholipase A2γ (iPLA2γ−/−) demonstrated that they are completely resistant to high-fat diet-induced weight gain, hyperinsulinemia, and insulin resistance (Mancuso et al., 2010). Shotgun lipidomics of AT from wild-type mice demonstrated a twofold increase in TG content after high-fat feeding, in contrast to the identical adipocyte TG content in iPLA2γ−/− mice fed either a standard diet or a high-fat diet. In addition, shotgun lipidomics of skeletal muscle (Han et al., 2006) revealed a decreased content of cardiolipin with an altered molecular species composition, thereby identifying for the first time the mechanism underlying mitochondrial uncoupling in the iPLA2γ−/− mouse. Tissue macrophage inflammatory pathways have also been shown to contribute to obesity-associated insulin resistance (Xu et al., 2003).
Lipidomics analysis revealed that treatment with a novel anti-inflammatory compound (an analog of a human dehydroepiandrosterone metabolite) reduced liver cholesterol and TG content in Zucker diabetic fatty rats, leading to a feedback elevation of LDL receptor and HMG–CoA reductase expression (Lu et al., 2010).
The risk of inflammatory disease is influenced by both life stage and lifestyle. Circulating levels of inflammatory markers, such as eicosanoids and cytokines, increase post menopause in women and post ovariectomy in rodents (Carlsten, 2005). Resolvins and protectins are a family of lipid mediators with potent anti-inflammatory and proresolving activities derived from omega–3 long-chain PUFAs (n–3 LCPUFAs) (Serhan, 2005). Poulsen et al. (2008) reported the presence of these lipid mediators in murine bone marrow and demonstrated, by using a lipidomics LC–MS/MS approach, that the profile of lipoxygenase (LOX)-pathway lipid mediators is modified by ovariectomy and by dietary intake of the precursor LCPUFAs (eicosapentaenoic acid (EPA), docosahexaenoic acid (DHA), and arachidonic acid (AA)). Supplementation with EPA and DHA increased the percentage of both FA in bone marrow and the proportion of LOX mediators biosynthesized from DHA or EPA, a finding that may be of interest for bone marrow function; its physiological and biological relevance will need to be addressed in future studies.
Several studies have suggested preventive or therapeutic activities of DHA in several neurodegenerative and psychiatric diseases, such as depression (Hibbeln, 2009) and Alzheimer’s disease (Morris, 2009). A lipidomic approach showed that after 1 month of fish oil supplementation with omega–3 FAs in male Wistar rats, PE in the cortex and hippocampus became enriched with DHA at the expense of arachidonyl-containing PE species to a higher degree than PE in the striatum (Lamaziere et al., 2011).
These data might in part explain the mixed therapeutic results obtained in neurological disorders, many of which are likely region-specific.
Unlike rodents, humans preferentially use dietary DHA for building up brain membrane PLs. Losses of DHA caused by dietary constraints are substituted by generation of docosapentaenoic acid (DPA). Brand et al. (2010) have shown in a lipidomics study with pregnant rats that when alpha-linolenic acid nutritional deficiency is imposed, DPA appears to substitute for the losses of DHA not randomly, but in tight linkage with specific saturated and monounsaturated long-chain hydrocarbons at the sn-1 position, sustaining a highly conserved molecular species composition. The importance of this conservation may underscore the possible biochemical consequences of this substitution in the regulation of certain functions in the developing brain.
Hypertension is recognized to be related, among other factors, to unhealthy dietary habits, such as excessive intake of calories, alcohol, and salt (Srinath and Katan, 2004). Large population-based cohort studies have shown that dyslipidemia, which causes endothelial dysfunction, plays a key role in the development of hypertension (Dutro et al., 2007). A plasma lipidomics approach based on LC–IT–TOF MS revealed that lipid metabolism in hypertensive subjects is clearly different from that in normotensive subjects, with PCs and TGs being highly abundant in the plasma of hypertensive patients (Hu et al., 2011). The authors reported that TGs containing three or two SFA chains (TG 48:0, 48:1, 50:0, 50:1, 52:1) were significantly accumulated in hypertensive versus normotensive subjects, indicating possible lipotoxic effects (Hu et al., 2011). In addition, a large number of neutral lipid species were significantly elevated in hypertensive subjects but significantly decreased after treatment with antihypertensive agents.
The study concluded that antihypertensive medication that lowered blood pressure to target levels produced a moderate improvement in the plasma lipid metabolism of patients with hypertension.
The integration of lipidomic data with genetic, proteomic, and metabolomic data is expected to provide a powerful analytical approach for elucidating the mechanisms behind lipid-based diseases, in addition to biomarker screening and monitoring of pharmacological therapy (Griffiths and Wang, 2009). Studies of hepatic lipid metabolism can, for instance, provide mechanistic insights into the development of fatty liver disease, which is associated with chronic alcohol intake. This systems biology approach might also help in identifying potential biomarkers for progression to more severe, related diseases. The integration of gene expression data with targeted lipidomics analyses of plasma and liver from control and alcohol-fed C57BL/6 mice allowed Clugston et al. (2011) to quantify levels of 62 defined lipid species by LC–MS/MS, providing an improved mechanistic understanding of alcohol-induced changes in hepatic lipid levels. This study focused on FA metabolism, measuring hepatic FFA and FA–CoA, which are essential precursors of many liver lipids, as well as FAEE, which are produced by the nonoxidative metabolism of alcohol, and on the metabolism of two lipid-signaling families: SL and endocannabinoids. The study results support the concept that decreased mitochondrial FA oxidation is one of the contributing factors in alcoholic fatty liver disease. Alcohol feeding led to elevated FFA levels, coupled with decreased expression of genes associated with FA oxidation. Clugston et al. (2011) were the first to report broad decreases in FA–CoA levels in the liver of alcohol-fed mice that were associated with decreased expression of FA–CoA-synthesizing genes.
There was also an increase in ceramide levels in the alcohol-fed mice, which was associated with increased levels of the precursor metabolites sphingosine and sphinganine. Further research, however, will be required to elucidate the relative importance of the increased concentration of the endocannabinoid anandamide in the liver of alcohol-fed mice.
A lipidomics approach based on proton (1H-NMR) and phosphorus (31P-NMR) NMR in plasma and liver of male Fischer rats fed alcohol for one month showed significant changes in PPLs (Fernando et al., 2010). Later on, the same authors applied the same lipidomic NMR platforms to rats fed ethanol for a longer period, showing that several hepatic lipids, mainly FA and TG, were increased by long ethanol exposure, whereas PC decreased (Fernando et al., 2011). The unsaturation of FA chains increased in liver, contrary to plasma. The study confirmed that overaccumulation of lipids in ethanol-induced liver steatosis is accompanied by mild inflammation after long-duration ethanol exposure. The authors suggested that plasma metabolic profiling using NMR lipidomics might be used as a noninvasive diagnostic tool for ethanol-induced liver damage in a clinical setting, as changes in various lipid moieties in plasma and liver can help to differentiate various stages of alcoholic liver disease.
In the absence of alcohol abuse, nonalcoholic fatty liver disease (NAFLD) is the most common cause of liver dysfunction characterized by fat infiltration of the liver, and is considered to be the hepatic manifestation of metabolic syndrome. NAFLD is usually related to high-fat, high-cholesterol diets.
The use of 1H NMR lipidomics for quantitative profiling of liver extracts from LDLr−/− mice, a well-documented model of fatty liver disease, showed hepatic inflammation and development of steatosis to be correlated with cholesterol and TG NMR-derived signals, respectively (Vinaixa et al., 2010). This NMR approach might therefore provide information to guide dietary modifications aimed at the reversible components of the associated metabolic derangements.
A new method based on high-resolution LC–MS and high-energy collisional dissociation fragmentation, with a special focus on the characterization of mitochondrial cardiolipins and monolysocardiolipins (MLCL), was applied to a lipidomic profiling analysis of rat liver mitochondrial samples from a nutrition study aiming to test the hypothesis that intraclass shifts of fats and carbohydrates in the diet affect the physiological function and biochemical fingerprint of mitochondria (Bird et al., 2011). The diets were isocaloric and comprised six different fat groups, with the major constituent of each being either SFA, trans-FA, MUFA, or one of three groups of PUFAs varying in the omega–6/omega–3 ratio. Among the identified compounds, two MLCL species, MLCL(18:2)3 and MLCL(18:2)2(18:1), were present in the rat liver mitochondrial samples. MLCL is an intermediate in cardiolipin metabolism as well as a potential byproduct of lipid peroxidation damage. The relative quantitation of MLCL across all rats in the study showed a trend linking the amount of MLCL(18:2)3 present in mitochondria and the major fat component of the diet. The greatest relative percentage of this species was found in the liver mitochondria from rats maintained on diets containing trans-fat as the major constituent. This result might reflect impaired cardiolipin maturation or increased steady-state oxidative stress in the liver mitochondria of animals fed these diets, the biological interpretation of which has to be further explored.
Lipidomics has also been applied in cancer research. It is indeed considered a promising tool to further elucidate the colorectal carcinogenesis-modifying activities of lipids. Colorectal cancer is one of the most common malignancies worldwide and the third leading cause of death among cancers (Siegel et al., 2012). An association between colorectal carcinogenesis and pathways of fatty acid metabolism has been proposed, with long-chain acyl–CoA thioesters as essential intermediates and cyclooxygenases and acyl–CoA synthetases as important enzymes (Gassler et al., 2010). This observation is supported by molecular data indicating the high functional diversity of complex lipids such as ceramides and their use in beta-oxidation and in the regulation of cellular signaling and transcriptional activity. The contribution of dietary lipids has been considered an important additional variable in colorectal carcinogenesis. Diet seems to be a plausible variable in colorectal carcinogenesis because of its direct contact with the intestinal mucosa (Marshall, 2009). An increased risk of cancer development is found in subjects consuming diets high in red and processed meat (Wang et al., 2012). Deep-fried/oxidized fats such as hydroxy- and hydroperoxy fatty acids have been shown to influence lipid metabolism by activation of the transcription factor peroxisome proliferator-activated receptor alpha (PPARα) (Luci et al., 2007). A growing number of studies support the finding that bioactive dietary components containing long-chain PUFAs modulate important determinants that link inflammation to cancer development and tumor progression (Chapkin et al., 2007), while short-chain FA, especially butyrate, which is mainly produced by the microbiome using fermentable dietary polysaccharides, are suggested to be cancer preventive (Louis and Flint, 2009; Scharlau et al., 2009).
The high diversity of lipids has therefore become the focus of intensive research, supported by epidemiological data indicating a link between the intake of dietary lipids and the development of colorectal cancer (Yeh et al., 2006). As the field of lipidomics advances, the role of the lipidome in cellular functions and pathological states is expected to shed light on the future establishment of preventive and therapeutic approaches in the field.
Lipidomic approaches have also been applied to other cancers with a dietary component in their etiology. For instance, a lipidomic study of breast cancer pathogenesis showed that the complete profile of AT lipids in patients with benign and malignant breast tumors, rather than a single lipid, has the capability to quantify the dietary contribution to breast cancer risk and to identify dietary modifications in order to reduce its occurrence (Bougnoux et al., 2008).
The use of systems biology approaches is becoming more common in the study of lipids to elucidate their functions and roles in human health and disease. SL play important roles in the pathophysiology of many diseases, but many of the intermediates of SL biosynthesis are highly bioactive and their quantification is challenging. Modeling of the SL network is imperative for an understanding of SL biology. In this direction, Gupta et al. (2011) developed a quantitative model of the SL pathway by integrating lipidomics and transcriptomics data with legacy knowledge. The model can be applied to design experimental studies of how genetic and pharmacological perturbations alter the flux through this important lipid biosynthetic pathway. Systems biology has thus already been recognized as an indispensable tool in pathway-based drug discovery. A further step will be to elucidate its real potential in the nutrition field, which is expected to be evaluated within a few years.

12.3 LIPIDOMICS AND FOOD SCIENCE

Quality and safety are the two main issues related to the genuineness of both processed and fresh foods. The specificity and sensitivity of MS-based methods are officially recognized by international quality-system control bodies, and the application of multistage ion analysis has become mandatory to adhere to worldwide regulations regarding the recognition of fraud and bad practices in food manipulation (Aiello et al., 2011). During the last 10 years, the search for markers of the authenticity, quality, and safety of foods, as well as the discovery of signature peptides and fraud detection, has sped up thanks to advances in the –omics fields, such as proteomics (Kvasnička, 2003), allergonomics (Kirsch et al., 2009), and lipidomics. In this sense, Foodomics has been defined as a new discipline that studies the food and nutrition domains through the application of advanced omics technologies to improve consumers’ well-being and health (Cifuentes, 2009; Herrero et al., 2010). Herrero and coworkers recently reviewed the MS-based strategies that have been or can be applied in the challenging field of Foodomics (Herrero et al., 2012). This section gives an overview of the applications of lipidomics-based methods in the quality assessment of some food matrices.

12.3.1 Lipidomics and Food Quality

Triacylglycerides present in oils and fats are important constituents of the human diet. The nutritional value of fats depends on the degree of FA saturation. While a regular intake of SFAs is not vital, PUFAs such as linoleic and linolenic acid are essential fatty acids for a healthy diet. The biosynthetic pathways of vegetable oils ensure that each FA occupies a preferential position on the glycerol backbone of TGs (Han and Gross, 2001). In this sense, the stereospecific composition of TGs has been used as a tool to characterize different fats of interest in human health, such as olive oil. In fact, many factors affect an oil’s composition, and therefore its quality and authenticity. Crucial factors include the soil, the climate, and the harvesting, processing, and chemical treatment during storage of the seeds or fruits from which the oil is extracted. Oil samples are routinely characterized by single quadrupole and ion trap mass spectrometers. The progress in structure determination and assay of lipids is mainly due to the application of spray and desorption ionization methods in connection with multistage ion analysis and the isotope dilution approach (McAnoy et al., 2005). Although TGs are neither polar nor volatile, their ammonium or alkali metal adducts have occasionally been detected, usually in tissue imaging experiments. Several vegetable oils have indeed been sampled by DESI. The main advantage of adding an ionizing agent (adduct-forming ion) is that the abundance of one lipid is not partitioned over several signals. In this way, the fluctuation of signal distribution is eliminated and the overall sensitivity of the method improves significantly. Multistage MS3 analysis of ions has also proved to be a powerful and useful approach in the characterization of TGs in complex mixtures (Lin and Arcinas, 2008). Shotgun lipidomics has been successfully applied to evaluate fish quality.
Freshness is fundamental to fish quality and closely linked to the microbial flora, storage temperature, handling, and physiological conditions of the fish (Abbas et al., 2008). Phospholipid changes during storage are indeed among the most important postmortem changes for fish freshness. Oxidation and hydrolysis are the two main reactions responsible for the quality deterioration of fish PLs (Caddy et al., 1995), resulting in a range of substances, some of which have an unpleasant taste or smell. Some of them may contribute to texture changes by binding covalently to fish muscle proteins. Direct-infusion electrospray ionization tandem mass spectrometry (ESI–MS/MS) was, for instance, recently proven to be an effective method for qualitative and quantitative analyses of PLs from the muscle of Ctenopharyngodon idellus during room-temperature storage (Wang and Zhang, 2011). This study might indeed be useful to further understand changes in PL profiles during storage for other fish species. The new method was able to identify more than 100 molecular species of PLs and provided information regarding not only the FA chain compositions, but also their relative positions (sn-1/sn-2) in individual PL classes. Results pointed out that oxidation and hydrolysis were the two main causes of the deterioration of PLs in fish muscle during storage. Interestingly, some PE molecular species of formerly low abundance, such as PE 16:0/16:1, emerged in abundance during fish storage. The authors suggested that those PE species may come from microbial growth in the muscle, a phenomenon that was found and discussed for the first time, implying their relevance as potential markers for fish quality assessment.
Calvano et al. (2010) introduced a new MALDI matrix based on 1H-pteridine-2,4-dione, or lumazine (C6H4N4O2, nominal mass 164), for the MALDI–TOF MS analysis of PLs in positive and negative ion modes. Lumazine is photochemically stable under UV laser irradiation, displays very few matrix-related ions in both positive and negative ion modes, and appears nearly ideal for studying PLs in complex mixtures. Its application to the characterization of crude lipid extracts of cow’s milk, soy milk, and hen egg, where phosphatidylethanolamines, phosphatidylserines, and phosphatidylinositols could additionally be detected, allowed sensitive detection of individual PL classes in the negative ion mode with a relatively low presence of matrix adducts (Calvano et al., 2010).
In recent years, interest in nutraceutical molecules in milk and cheeses has increased due to their health-promoting properties in human nutrition. As is well known, the final lipid composition of the food depends strongly on the material fed to the animals. A recent study evaluated the impact of the Hyblean pasture composition on the lipid profile of Ragusano cheese (La Terra et al., 2011). Ragusano is a cheese made with raw milk produced by cows fed on natural pasture in the south of Sicily. A positive correlation was detected between the feeding time of the cow and the content of several FAs. Interestingly, the presence of some lipid intermediates such as anandamide, oleoylethanolamide, and palmitoylethanolamide was observed in the analyzed samples. The interest in these species in cheese is due to their potential role as food regulators in human nutrition.
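The nominal mass quoted above for lumazine can be checked by summing the integer masses of the most abundant isotope of each element in C6H4N4O2: 6 × 12 + 4 × 1 + 4 × 14 + 2 × 16 = 164. A minimal Python sketch of that arithmetic follows; the helper name `nominal_mass` and the small element table are introduced here for illustration, and the parser assumes a flat formula without brackets or charges.

```python
import re

# Integer (nominal) masses of the most abundant isotopes of elements
# common in lipids; this short table is an illustrative subset.
NOMINAL_MASS = {"H": 1, "C": 12, "N": 14, "O": 16, "P": 31, "S": 32}


def nominal_mass(formula):
    """Sum nominal masses for a flat formula such as 'C6H4N4O2'."""
    total = 0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        # A missing count means a single atom of that element.
        total += NOMINAL_MASS[element] * (int(count) if count else 1)
    return total


print(nominal_mass("C6H4N4O2"))  # lumazine
```

A matrix with such a low mass produces its own ions well below the m/z range of intact PLs, which is consistent with the low matrix interference reported for lumazine.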

12.3.2 Lipidomics and Food Safety

Lipidomics can also be a useful discipline for food safety issues. The full identification of trans-PUFA isomers represents, for example, an analytical challenge that is acquiring increasing relevance due to the food safety regulations active in most industrialized countries (Moss, 2006). Trans-fatty acids have been reported to have a negative role in human nutrition and health. Therefore, monitoring the structures of the various isomers and their effects is required to evaluate the relevance of the different isomers in the habitual diet. The formation of trans-isomers can indeed have important meaning and consequences connected to radical stress. Free radical isomerization of membrane fatty acids has been the subject of research coupling the top-down approach of model studies, such as biomimetic chemistry in liposomes, with the bottom-up approach of examining the cell membrane lipidome in living systems under several physiopathological conditions (Ferreri and Chatgilialoglu, 2009). Partial hydrogenation and deodorization processes used in food processing are the most frequent and well-known causes of double bond alteration (Sébédio and Christie, 1998).

Fish oils usually undergo an industrial deodorization process in order to provide an odorless material used for functional foods or nutraceutical formulations with health claims derived from the beneficial effects of omega–3 fatty acids. However, the heat treatment converts EPA to geometrical trans-isomers (Fournier et al., 2006), and there is still a lack of literature on the correlation between the intake of deodorized fish oils and the specific incorporation of trans-EPA isomers in vivo. Ferreri et al. (2012) have reported for the first time that EPA isomers can be metabolized and incorporated into rat liver mitochondrial phospholipids. The study reported a dual synthetic strategy, providing the characterization of five geometrical monotrans isomers of EPA methyl esters and valuable information on GC and NMR characteristics for further applications in metabolomics and lipidomics. As pointed out by the same authors, the effects of fatty acid isomerism have not yet been connected to the impairment of the mitochondrial respiratory chain or to nutrition-associated liver pathologies such as nonalcoholic fatty liver disease or steatosis. Therefore, more attention to the trans-fatty acid content related to diet and biodistribution is needed in order to evaluate its influence on health at the molecular level.

12.4 FUTURE PERSPECTIVES

Lipidomics, in combination with metabolomics in general, has been increasingly utilized, particularly in nutritional studies but also in the development of food products and in the evaluation of food functionality, bioactivity, and toxicity. Novel analytical techniques, particularly LC–MS-based methods in combination with bioinformatic tools, can give a deep insight into the biological processes in food-related studies. The current analytical methodologies already allow lipid analysis with high throughput, resolution, sensitivity, and ability for structural identification. Further development is still needed in data processing, data mining, and interpretation of the data. In nutritional studies, lipidomics allows sensitive measurement of reporters of complex pathological states related to imbalances in organismal energy and cellular signaling metabolism. Combining the lipidomics data with the individual phenotype can provide relevant information on the molecular events initiated by the ingestion of nutrients and the specific adaptations of the body to an altered flux of certain nutrients through specific metabolic pathways. Lipidomics has already proven to be a viable tool in the identification of individual variability in responses to nutritional maintenance/intervention, and is thus an important aspect of nutritional phenotyping. This, in turn, will allow the development of a more individualized approach to dietary guidance and a shift in the focus of nutrition research from disease treatment and management to disease prevention. In food development, lipidomics can be used in the optimization of the effect of food processing on the dietary value, such as the bioactivity and bioavailability of the food products, and in the evaluation of their health effects and their safety. Lipidomics can also be utilized in the identification of the clinical endpoints of dietary intervention and in the identification of novel biomarkers.
These in turn can be utilized in validation of health claims of functional and health-promoting dietary components. 372 LIPIDOMICS

REFERENCES

Abbas KA, Mohamed A, Jamilah B, Ebrahimian M (2008). A review on correlations between fish freshness and pH during cold storage. American Journal of Biochemistry and Biotechnology 4:416–421.

Aiello D, De Luca D, Gionfriddo E, Naccarato A, Napoli A, Romano E, Russo A, Sindona G, Tagarelli A (2011). Review: multistage mass spectrometry in quality, safety and origin of foods. European Journal of Mass Spectrometry (Chichester, Eng) 17:1–31.

Aiyar N, Disa J, Ao Z, Ju H, Nerurkar S, Willette RN, Macphee CH, Johns DG, Douglas SA (2007). Lysophosphatidylcholine induces inflammatory activation of human coronary artery smooth muscle cells. Molecular and Cellular Biochemistry 295:113–120.

Balazy M (2004). Eicosanomics: targeted lipidomics of eicosanoids in biological systems. Prostaglandins and other Lipid Mediators 73:173–180.

Beckonert O, Coen M, Keun HC, Wang Y, Ebbels TMD, Holmes E, Lindon JC, Nicholson JK (2010). High-resolution magic-angle-spinning NMR spectroscopy for metabolic profiling of intact tissues. Nature Protocols 5:1019–1032.

Bird SS, Marur VR, Sniatynski MJ, Greenberg HK, Kristal BS (2011). Lipidomics profiling by high-resolution LCMS and high-energy collisional dissociation fragmentation: focus on characterization of mitochondrial cardiolipins and monolysocardiolipins. Analytical Chemistry 83:940–949.

Bobeldijk I, Hekman M, de Vries-van der Weij J, Coulier L (2008). Quantitative profiling of bile acids in biofluids and tissues based on accurate mass high resolution LC-FT-MS: compound class targeting in a metabolomics workflow. Journal of Chromatography B Analytical Technologies in the Biomedical and Life Sciences 871:306–313.

Bougnoux P, Hajjaji N, Couet C (2008). The lipidome as a composite biomarker of the modifiable part of the risk of breast cancer. Prostaglandins, Leukotrienes and Essential Fatty Acids 79:93–96.

Brand A, Crawford MA, Yavin E (2010). Retailoring docosahexaenoic acid-containing phospholipid species during impaired neurogenesis following omega-3 α-linolenic acid deprivation. Journal of Neurochemistry 114:1393–1404.

Buckingham J (2010). Dictionary of Natural Products on CD-ROM, Version 19.1.

Caddy JF, Wickens PA, Sugunan VV, Mahon R (1995). Quality and quality changes in fresh fish. Food and Agriculture Organization (FAO) Rome, Italy 348:68–92.

Caesar R, Manieri M, Kelder T, Boekschoten M, Evelo C, Müller M, Kooistra T, Cinti S, Kleemann R, Drevon CA (2010). A combined transcriptomics and lipidomics analysis of subcutaneous, epididymal and mesenteric adipose tissue reveals marked functional differences. PLoS ONE 5:e1525.

Calvano CD, Carulli S, Palmisano F (2010). 1H-pteridine-2,4-dione (lumazine): a new MALDI matrix for complex (phospho)lipid mixtures analysis. Analytical and Bioanalytical Chemistry 398:499–507.

Carlsten H (2005). Immune responses and bone loss: the estrogen connection. Immunological Reviews 208:194–206.

Chait BT (2011). Mass spectrometry in the postgenomic era. Annual Review of Biochemistry 80:239–246.

Chapkin RS, Davidson LA, Ly L, Weeks BR, Lupton JR, McMurray DN (2007). Immunomodulatory effects of (n-3) fatty acids: putative link to inflammation and colon cancer. The Journal of Nutrition 137:200S–204S.

Chau C, Wu S (2006). The development of regulations of Chinese herbal medicines for both medicinal and food uses. Trends in Food Science and Technology 17:313–323.

Cifuentes A (2009). Food analysis and Foodomics. Journal of Chromatography A 1216:7109.

Cloarec O, Dumas ME, Craig A, Barton RH, Trygg J, Hudson J, Blancher C, Gauguier D, Lindon JC, Holmes E, Nicholson J (2005). Statistical total correlation spectroscopy: an exploratory approach for latent biomarker identification from metabolic 1H NMR data sets. Analytical Chemistry 77:1282–1289.

Clugston RD, Jiang H, Lee MX, Piantedosi R, Yuen JJ, Ramakrishnan R, Lewis MJ, Gottesman ME, Huang L, Goldberg IJ, Berk PD, Blaner WS (2011). Altered hepatic lipid metabolism in C57BL/6 mice fed alcohol: a targeted lipidomic and gene expression study. Journal of Lipid Research 52:2021–2031.

de Mello V, Erkkilä A, Schwab U, Pulkkinen L, Kolehmainen M, Atalay M, Mussalo H, Lankinen M, Oresic M, Lehto S, Uusitupa M (2009). The effect of fatty or lean fish intake on inflammatory gene expression in peripheral blood mononuclear cells of patients with coronary heart disease. European Journal of Nutrition 48:447–455.

Dennis EA (2009). Lipidomics joins the omics evolution. Proceedings of the National Academy of Sciences of the United States of America 106:2089–2090.

Dutro MP, Gerthoffer TD, Peterson ED, Tang SSK, Goldberg GA (2007). Treatment of hypertension and dyslipidemia or their combination among US managed-care patients. The Journal of Clinical Hypertension 9:684–691.

Engelmann B (2004). Plasmalogens: targets for oxidants and major lipophilic antioxidants. Biochemical Society Transactions 32:147–150.

Fahy E, Cotter D, Byrnes R, Sud M, Maer A, Li J, Nadeau D, Zhau Y, Subramaniam S (2007). Bioinformatics for lipidomics. Methods in Enzymology 432:273.

Fahy E, Cotter D, Sud M, Subramaniam S (2011). Lipid classification, structures and tools. Biochimica et Biophysica Acta (BBA) - Molecular and Cell Biology of Lipids 1811:637–647.

Fahy E, Subramaniam S, Brown HA, Glass CK, Merrill AH, Murphy RC, Raetz CRH, Russell DW, Seyama Y, Shaw W, Shimizu T, Spener F, van Meer G, VanNieuwenhze MS, White SH, Witztum JL, Dennis EA (2005). A comprehensive classification system for lipids. Journal of Lipid Research 46:839–862.

Fernando H, Bhopale KK, Kondraganti S, Kaphalia BS, Shakeel Ansari GA (2011). Lipidomic changes in rat liver after long-term exposure to ethanol. Toxicology and Applied Pharmacology 255:127–137.

Fernando H, Kondraganti S, Bhopale KK, Volk DE, Neerathilingam M, Kaphalia BS, Luxon BA, Boor PJ, Shakeel Ansari GA (2010). 1H and 31P NMR lipidome of ethanol-induced fatty liver. Alcoholism: Clinical and Experimental Research 34:1937–1947.

Ferreri C, Chatgilialoglu C (2009). Membrane lipidomics and the geometry of unsaturated fatty acids from biomimetic models to biological consequences. Methods in Molecular Biology 579:391–411.

Ferreri C, Grabovskiy SA, Aoun M, Melchiorre M, Kabal’nova N, Feillet-Coudray C, Fouret G, Coudray C, Chatgilialoglu C (2012). Trans fatty acids: chemical synthesis of eicosapentaenoic acid isomers and detection in rats fed a deodorized fish oil diet. Chemical Research in Toxicology 25:687–694.

Fournier V, Juanéda P, Destaillats F, Dionisi F, Lambelet P, Sébédio J, Berdeaux O (2006). Analysis of eicosapentaenoic and docosahexaenoic acid geometrical isomers formed during fish oil deodorization. Journal of Chromatography A 1129:21–28.

Gadd ME, Broekemeier KM, Crouser ED, Kumar J, Graff G, Pfeiffer DR (2006). Mitochondrial iPLA2 activity modulates the release of cytochrome c from mitochondria and influences the permeability transition. Journal of Biological Chemistry 281:6931–6939.

Gassler N, Klaus C, Kaemmerer E, Reinartz A (2010). Modifier-concept of colorectal carcinogenesis: lipidomics as a technical tool in pathway analysis. World Journal of Gastroenterology 16:1820–1827.

German JB, Gillies LA, Smilowitz JT, Zivkovic AM, Watkins SM (2007). Lipidomics and lipid profiling in metabolomics. Current Opinion in Lipidology 18:66–71.

Goto-Inoue N, Hayasaka T, Zaima N, Setou M (2011). Imaging mass spectrometry for lipidomics. Biochimica et Biophysica Acta (BBA) - Molecular and Cell Biology of Lipids 1811:961–969.

Griffiths WJ, Wang Y (2009). Mass spectrometry: from proteomics to metabolomics and lipidomics. Chemical Society Reviews 38:1882–1896.

Gross R, Han X (2011). Lipidomics at the Interface of Structure and Function in Systems Biology. Chemistry & Biology 18:284–291.

Guo X, Lankmayr E (2010). Multidimensional approaches in LC and MS for phospholipid bioanalysis. Bioanalysis 2:1109–1123.

Gupta S, Maurya M, Merrill Jr A, Glass C, Subramaniam S (2011). Integration of lipidomics and transcriptomics data towards a systems biology model of sphingolipid metabolism. BMC Systems Biology 5:26.

Han X, Gross RW (2001). Quantitative analysis and molecular species fingerprinting of triacylglyceride molecular species directly from lipid extracts of biological samples by electrospray ionization tandem mass spectrometry. Analytical Biochemistry 295:88–100.

Han X, Gross RW (2003). Global analyses of cellular lipidomes directly from crude extracts of biological samples by ESI mass spectrometry. Journal of Lipid Research 44:1071–1079.

Han X, Gross RW (2005). Shotgun lipidomics: multidimensional MS analysis of cellular lipidomes. Expert Review of Proteomics 2:253–264.

Han X, Yang K, Yang J, Cheng H, Gross RW (2006). Shotgun lipidomics of cardiolipin molecular species in lipid extracts of biological samples. Journal of Lipid Research 47:864–879.

Han X, Yang K, Gross RW (2012). Multi-dimensional mass spectrometry-based shotgun lipidomics and novel strategies for lipidomics analyses. Mass Spectrometry Reviews 31:134–178.

Herrero M, García-Cañas V, Simó C, Cifuentes A (2010). Recent advances in the application of capillary electromigration methods for food analysis and Foodomics. Electrophoresis 31:205–228.

Herrero M, Simó C, García-Cañas V, Ibáñez E, Cifuentes A (2012). Foodomics: MS-based strategies in modern food science and nutrition. Mass Spectrometry Reviews 31:49–69.

Hibbeln JR (2009). Depression, suicide and deficiencies of omega-3 essential fatty acids in modern diets. World Review of Nutrition and Dietetics 99:17–30.

Hu C, Kong H, Qu F, Li Y, Yu Z, Gao P, Peng S, Xu G (2011). Application of plasma lipidomics in studying the response of patients with essential hypertension to antihypertensive drug therapy. Molecular BioSystems 7:3271–3279.

Huss JM, Kelly DP (2005). Mitochondrial energy metabolism in heart failure: a question of balance. The Journal of Clinical Investigation 115:547–555.

Jackson SN, Ugarov M, Post JD, Egan T, Langlais D, Schultz JA, Woods AS (2008). A study of phospholipids by ion mobility TOFMS. Journal of the American Society for Mass Spectrometry 19:1655–1662.

Jung HR, Sylvänne T, Koistinen KM, Tarasov K, Kauhanen D, Ekroos K (2011). High throughput quantitative molecular lipidomics. Biochimica et Biophysica Acta (BBA) - Molecular and Cell Biology of Lipids 1811:925–934.

Katajamaa M, Miettinen J, Orešič M (2006). MZmine: toolbox for processing and visualization of mass spectrometry based molecular profile data. Bioinformatics 22:634–636.

Katajamaa M, Orešič M (2007). Data processing for mass spectrometry-based metabolomics. Journal of Chromatography A 1158:318–328.

Kim HI, Kim H, Pang ES, Ryu EK, Beegle LW, Loo JA, Goddard WA, Kanik I (2009). Structural characterization of unsaturated phosphatidylcholines using traveling wave ion mobility spectrometry. Analytical Chemistry 81:8289–8297.

Kinsey GR, McHowat J, Beckett CS, Schnellmann RG (2007). Identification of calcium-independent phospholipase A2γ in mitochondria and its role in mitochondrial oxidative stress. American Journal of Physiology - Renal Physiology 292:F853–F860.

Kirsch S, Fourdrilis S, Dobson R, Scippo ML, Maghuin-Rogister G, De Pauw E (2009). Quantitative methods for food allergens: a review. Analytical and Bioanalytical Chemistry 395:57–67.

Kliman M, May JC, McLean JA (2011). Lipid analysis and lipidomics by structurally selective ion mobility-mass spectrometry. Biochimica et Biophysica Acta (BBA) - Molecular and Cell Biology of Lipids 1811:935–945.

Klose C, Ejsing CS, García-Sáez AJ, Kaiser H-, Sampaio JL, Surma MA, Shevchenko A, Schwille P, Simons K (2010). Yeast lipids can phase-separate into micrometer-scale membrane domains. Journal of Biological Chemistry 285:30224–30232.

Krug S, Kastenmüller G, Stückler F, Rist MJ, Skurk T, Sailer M, Raffler J, Römisch-Margl W, Adamski J, Prehn C, Frank T, Engel K-H, Hofmann T, Luy B, Zimmermann R, Moritz F, Schmitt-Kopplin P, Krumsiek J, Kremer W, Huber F, Oeh U, Theis FJ, Szymczak W, Hauner H, Suhre K, Daniel H (2012). The dynamic range of the human metabolome revealed by challenges. The FASEB Journal 26:2607–2619.

Kvasnička F (2003). Proteomics: general strategies and application to nutritionally relevant proteins. Journal of Chromatography B 787:77–89.

La Terra S, Marino VM, Manenti M, Caprino S, Licitra G (2011). Lipidomics of the Ragusano cheese. Scienza E Tecnica Lattiero-Casearia 62:37–43.

Lagarde M, Bernoud-Hubac N, Guichardant M (2012). Expanding the horizons of lipidomics. Towards fluxolipidomics. Molecular Membrane Biology.

Lamaziere A, Richard D, Barbe U, Kefi K, Bausero P, Wolf C, Visioli F (2011). Differential distribution of DHA-phospholipids in rat brain after feeding: a lipidomic approach. Prostaglandins, Leukotrienes and Essential Fatty Acids 84:7–11.

Lange E, Tautenhahn R, Neumann S, Gropl C (2008). Critical assessment of alignment procedures for LC-MS proteomics and metabolomics measurements. BMC Bioinformatics 9:375.

Lankinen M, Schwab U, Erkkilä A, Seppänen-Laakso T, Hannila M, Mussalo H, Lehto S, Uusitupa M, Gylling H, Oresic M (2009). Fatty fish intake decreases lipids related to inflammation and insulin signaling. A lipidomics approach. PLoS ONE 4:e5258.

Ledeen RW, Wu G (2008). Thematic Review Series: sphingolipids. Nuclear sphingolipids: metabolism and signaling. Journal of Lipid Research 49:1176–1186.

Liebisch G, Lieser B, Rathenberg J, Drobnik W, Schmitz G (2004). High-throughput quantification of phosphatidylcholine and sphingomyelin by electrospray ionization tandem mass spectrometry coupled with isotope correction algorithm. Biochimica et Biophysica Acta (BBA) - Molecular and Cell Biology of Lipids 1686:108–117.

Lin J, Arcinas A (2008). Analysis of regiospecific triacylglycerols by electrospray ionization–mass spectrometry 3 of lithiated adducts. Journal of Agricultural and Food Chemistry 56:4909–4915.

Lindon JC, Beckonert OP, Holmes E, Nicholson JK (2009). High-resolution magic angle spinning NMR spectroscopy: application to biomedical studies. Progress in Nuclear Magnetic Resonance Spectroscopy 55:79–100.

Lindon JC, Nicholson JK (2008). Analytical technologies for metabonomics and metabolomics, and multi-omic information recovery. TrAC Trends in Analytical Chemistry 27:194–204.

Louis P, Flint HJ (2009). Diversity, metabolism and microbial ecology of butyrate-producing bacteria from the human large intestine. FEMS Microbiology Letters 294:1–8.

Lu M, Patsouris D, Li P, Flores-Riveros J, Frincke JM, Watkins S, Schenk S, Olefsky JM (2010). A new antidiabetic compound attenuates inflammation and insulin resistance in Zucker diabetic fatty rats. American Journal of Physiology - Endocrinology and Metabolism 298:E1036–E1048.

Luci S, König B, Giemsa B, Huber S, Hause G, Kluge H, Stangl GI, Eder K (2007). Feeding of a deep-fried fat causes PPARalpha activation in the liver of pigs as a non-proliferating species. British Journal of Nutrition 97:872–882.

Mancuso DJ, Sims HF, Yang K, Kiebish MA, Su X, Jenkins CM, Guan S, Moon SH, Pietka T, Nassir F, Schappe T, Moore K, Han X, Abumrad NA, Gross RW (2010). Genetic ablation of calcium-independent phospholipase A2gamma prevents obesity and insulin resistance during high fat feeding by mitochondrial uncoupling and increased adipocyte fatty acid oxidation. The Journal of Biological Chemistry 285:36495–36510.

Marin de Mas I, Selivanov V, Marin S, Roca J, Oresic M, Agius L, Cascante M (2011). Compartmentation of glycogen metabolism revealed from 13C isotopologue distributions. BMC Systems Biology 5:175.

Marshall JR (2009). Nutrition and colon cancer prevention. Current Opinion in Clinical Nutrition & Metabolic Care 12:539–543.

Martin J, Canlet C, Delplanque B, Agnani G, Lairon D, Gottardi G, Bencharif K, Gripois D, Thaminy A, Paris A (2009). 1H NMR metabonomics can differentiate the early atherogenic effect of dairy products in hyperlipidemic hamsters. Atherosclerosis 206:127–133.

McAnoy AM, Wu CC, Murphy RC (2005). Direct qualitative analysis of triacylglycerols by electrospray mass spectrometry using a linear ion trap. Journal of the American Society for Mass Spectrometry 16:1498–1509.

Merrill Jr. AH, Sullards MC, Allegood JC, Kelly S, Wang E (2005). Sphingolipidomics: high-throughput, structure-specific, and quantitative analysis of sphingolipids by liquid chromatography tandem mass spectrometry. Methods 36:207–224.

Moco S, Vervoort J, Moco S, Bino RJ, De Vos RCH, Bino R (2007). Metabolomics technologies and metabolite identification. TrAC Trends in Analytical Chemistry 26:855–866.

Morris MC (2009). The role of nutrition in Alzheimer’s disease: epidemiological evidence. European Journal of Neurology 16:1–7.

Moss J (2006). Labeling of trans fatty acid content in food, regulations and limits – The FDA view. Atherosclerosis Supplements 7:57–59.

Niemelä PS, Castillo S, Sysi-Aho M, Orešič M (2009). Bioinformatics and computational methods for lipidomics. Journal of Chromatography B 877:2855–2862.

Nygren H, Seppänen-Laakso T, Castillo S, Hyötyläinen T, Orešič M (2011). Liquid chromatography-mass spectrometry (LC–MS)-based lipidomics for studies of body fluids and tissues. Methods in Molecular Biology 708:247–257.

Orešič M (2009). Metabolomics, a novel tool for studies of nutrition, metabolism and lipid dysfunction. Nutrition, Metabolism and Cardiovascular Diseases 19:816–824.

Orešič M (2011). Informatics and computational strategies for the study of lipids. Biochimica et Biophysica Acta (BBA) - Molecular and Cell Biology of Lipids 1811:991–999.

Orešič M, Hänninen VA, Vidal-Puig A (2008). Lipidomics: a new window to biomedical frontiers. Trends in Biotechnology 26:647–652.

Pietiläinen KH, Róg T, Seppänen-Laakso T, Virtue S, Gopalacharyulu P, Tang J, Rodriguez-Cuenca S, Maciejewski A, Naukkarinen J, Ruskeepää A-L, Niemelä PS, Yetukuri L, Tan CW, Velagapudi V, Castillo S, Nygren H, Hyötyläinen T, Rissanen A, Kaprio J, Yki-Järvinen H, Vattulainen I, Vidal-Puig A, Orešič M (2011). Association of lipidome remodeling in the adipocyte membrane with acquired obesity in humans. PLoS Biology 9:e1000623.

Pietiläinen KH, Sysi-Aho M, Rissanen A, Seppänen-Laakso T, Yki-Järvinen H, Kaprio J, Orešič M (2007). Acquired obesity is associated with changes in the serum lipidomic profile independent of genetic effects. A monozygotic twin study. PLoS ONE 2:e218.

Pluskal T, Castillo S, Villar-Briones A, Oresic M (2010). MZmine 2: Modular framework for processing, visualizing, and analyzing mass spectrometry-based molecular profile data. BMC Bioinformatics 11:395.

Poulsen RC, Gotlinger KH, Serhan CN, Kruger MC (2008). Identification of inflammatory and proresolving lipid mediators in bone marrow and their lipidomic profiles with ovariectomy and omega-3 intake. American Journal of Hematology 83:437–445.

Quehenberger O, Armando AM, Dennis EA (2011). High sensitivity quantitative lipidomics analysis of fatty acids in biological samples by gas chromatography–mass spectrometry. Biochimica et Biophysica Acta (BBA) - Molecular and Cell Biology of Lipids 1811:648–656.

Ridenour WB, Kliman M, McLean JA, Caprioli RM (2010). Structural characterization of phospholipids and peptides directly from tissue sections by MALDI traveling-wave ion mobility-mass spectrometry. Analytical Chemistry 82:1881–1889.

Rubio-Aliaga I, de Roos B, Duthie SJ, Crosley LK, Mayer C, Horgan G, Colquhoun IJ, Le Gall G, Huber F, Kremer W (2011). Metabolomics of prolonged fasting in humans reveals new catabolic markers. Metabolomics 7:375–387.

Sandra K, Pereira AS, Vanhoenacker G, David F, Sandra P (2010). Comprehensive blood plasma lipidomics by liquid chromatography/quadrupole time-of-flight mass spectrometry. Journal of Chromatography A 1217:4087–4099.

Scharlau D, Borowicki A, Habermann N, Hofmann T, Klenow S, Miene C, Munjal U, Stein K, Glei M (2009). Mechanisms of primary cancer prevention by butyrate and other products formed during gut flora-mediated fermentation of dietary fibre. Mutation Research/Reviews in Mutation Research 682:39–53.

Schwab U, Seppänen-Laakso T, Yetukuri L, Agren J, Kolehmainen M, Laaksonen DE, Ruskeepää A-L, Gylling H, Uusitupa M, Orešič M; GENOBIN Study Group (2008). Triacylglycerol fatty acid composition in diet-induced weight loss in subjects with abnormal glucose metabolism - the GENOBIN Study. PLoS ONE 3:e2630.

Sébédio JL, Christie WW (1998). Trans Fatty Acids in Human Nutrition. Dundee, Scotland: The Oily Press.

Serhan CN (2005). Lipoxins and aspirin-triggered 15-epi-lipoxins are the first lipid mediators of endogenous anti-inflammation and resolution. Prostaglandins, Leukotrienes and Essential Fatty Acids 73:141–162.

Shaham O, Wei R, Wang TJ, Ricciardi C, Lewis GD, Vasan RS, Carr SA, Thadhani R, Gerszten RE, Mootha VK (2008). Metabolic profiling of the human response to a glucose challenge reveals distinct axes of insulin sensitivity. Molecular Systems Biology 4:1–9.

Shevchenko A, Simons K (2010). Lipidomics: coming to grips with lipid diversity. Nature Reviews. Molecular Cell Biology 11:593–598.

Shvartsburg AA, Smith RD (2008). Fundamentals of traveling wave ion mobility spectrometry. Analytical Chemistry 80:9689–9699.

Siegel R, DeSantis C, Virgo K, Stein K, Mariotto A, Smith T, Cooper D, Gansler T, Lerro C, Fedewa S, Lin C, Leach C, Cannady RS, Cho H, Scoppa S, Hachey M, Kirch R, Jemal A, Ward E (2012). Cancer treatment and survivorship statistics. CA-A Cancer Journal for Clinicians 62(4):220–241.

Song H, Ladenson J, Turk J (2009). Algorithms for automatic processing of data from mass spectrometric analyses of lipids. Journal of Chromatography B 877:2847–2854.

Srinath RK, Katan MB (2004). Diet, nutrition and the prevention of hypertension and cardiovascular diseases. Public Health Nutrition 7:167–186.

Ståhlman M, Ejsing CS, Tarasov K, Perman J, Borén J, Ekroos K (2009). High-throughput shotgun lipidomics by quadrupole time-of-flight mass spectrometry. Journal of Chromatography B 877:2664–2672.

Sumner LW, Urbanczyk-Wochniak E, Broeckling CD (2007). Metabolomics data analysis, visualization, and integration. Methods in Molecular Biology 406:409–436.

Trimpin S, Tan B, Bohrer BC, O’Dell DK, Merenbloom SI, Pazos MX, Clemmer DE, Walker JM (2009). Profiling of phospholipids and related lipid structures using multidimensional ion mobility spectrometry-mass spectrometry. International Journal of Mass Spectrometry 287:58–69.

Van der Spoel D, Marklund EG, Larsson DS, Caleman C (2011). Proteins, lipids, and water in the gas phase. Macromolecular Bioscience 11:50–59.

Van Gaal LF, Mertens IL, De Block CE (2006). Mechanisms linking obesity with cardiovascular disease. Nature 444:875–880.

Vinaixa M, Angel Rodriguez M, Rull A, Beltran R, Blade C, Brezmes J, Canellas N, Joven J, Correig X (2010). Metabolomic assessment of the effect of dietary cholesterol in the progressive development of fatty liver disease. Journal of Proteome Research 9:2527–2538.

Wang J, Joshi AD, Corral R, Siegmund KD, Marchand LL, Martinez ME, Haile RW, Ahnen DJ, Sandler RS, Lance P, Stern MC (2012). Carcinogen metabolism genes, red meat and poultry intake, and colorectal cancer risk. International Journal of Cancer 130:1898–1907.

Wang M, Lamers RAN, Korthout HAAJ, van Nesselrooij JHJ, Witkamp RF, van der Heijden R, Voshol PJ, Havekes LM, Verpoorte R, van der Greef J (2005). Metabolomics in the context of systems biology: bridging traditional Chinese medicine and molecular pharmacology. Phytotherapy Research 19:173–182.

Wang Y, Zhang H (2011). Tracking phospholipid profiling of muscle from Ctenopharyngodon idellus during storage by shotgun lipidomics. Journal of Agricultural and Food Chemistry 59:11635–11642.

Wei H, Hu C, Wang M, van den Hoek AM, Reijmers TH, Wopereis S, Bouwman J, Ramaker R, Korthout HAAJ, Vennik M, Hankemeier T, Havekes LM, Witkamp RF, Verheij ER, Xu G, van der Greef J (2012). Lipidomics reveals multiple pathway effects of a multi-components preparation on lipid biochemistry in ApoE*3Leiden.CETP mice. PLoS ONE 7:e30332.

Wenk MR (2005). The emerging field of lipidomics. Nature Reviews Drug Discovery 4:594–610.

Woods AS, Jackson SN (2010). The application and potential of ion mobility mass spectrometry in imaging MS with a focus on lipids. Methods in Molecular Biology 656:99–111.

Wopereis S, Radonjic M, Rubingh C, van Erk M, Smilde A, van Duyvenvoorde W, Cnubben N, Kooistra T, van Ommen B, Kleemann R (2012). Identification of prognostic and diagnostic biomarkers of glucose intolerance in ApoE3Leiden mice. Physiological Genomics 44:293–304.

Wopereis S, Rubingh CM, van Erk MJ, Verheij ER, van Vliet T, Cnubben NHP, Smilde AK, van der Greef J, van Ommen B, Hendriks HFJ (2009). Metabolic profiling of the response to an oral glucose tolerance test detects subtle metabolic changes. PLoS ONE 4:e4525.

Xu H, Barnes GT, Yang Q, Tan G, Yang D, Chou CJ, Sole J, Nichols A, Ross JS, Tartaglia LA, Chen H (2003). Chronic inflammation in fat plays a crucial role in the development of obesity-related insulin resistance. The Journal of Clinical Investigation 112:1821–1830.

Yan S, Chai H, Wang H, Yang H, Nan B, Yao Q, Chen C (2005). Effects of lysophosphatidylcholine on monolayer cell permeability of human coronary artery endothelial cells. Surgery 138:464–473.

Yang B, Zhang A, Sun H, Dong W, Yan G, Li T, Wang X (2012). Metabolomic study of insomnia and intervention effects of Suanzaoren decoction using ultra-performance liquid-chromatography/electrospray-ionization synapt high-definition mass spectrometry. Journal of Pharmaceutical and Biomedical Analysis 58:113–124.

Yang K, Zhao Z, Gross RW, Han X (2011). Identification and quantitation of unsaturated fatty acid isomers by electrospray ionization tandem mass spectrometry: a shotgun lipidomics approach. Analytical Chemistry 83:4243–4250.

Yeh C, Wang J, Cheng T, Juan C, Wu C, Lin S (2006). Fatty acid metabolism pathway play an important role in carcinogenesis of human colorectal cancers by Microarray-Bioinformatics analysis. Cancer Letters 233:297–308.

Yetukuri L, Ekroos K, Vidal-Puig A, Orešič M (2008). Informatics and computational strategies for the study of lipids. Molecular BioSystems 4:121–127.

13 FOODOMICS STUDY OF MICRONUTRIENTS: THE CASE OF FOLATES

Susan J. Duthie

13.1 FOLATES IN THE DIET

Folate is the generic term for a family of highly labile dietary compounds that are critical for human health. Folates are water-soluble members of the B group of vitamins that derive their name from the Latin folium, meaning leaf. The folic acid molecule consists of a 2-amino-4-hydroxy-pteridine moiety linked via a methylene group at the C-6 position to p-aminobenzoic acid (pABA) combined with a variable number of glutamic acid residues. Folate vitamers differ with respect to the oxidation state of the pteridine ring, the nature of the 1-C substituents at the N5 and N10 positions (predominantly methyl or formyl groups, or methylene or methenyl units), and the number of glutamic acids coupled to the pABA moiety via an amide bond. In synthetic folic acid, the pteridine ring is fully oxidized, whilst in naturally occurring folates the pteridine ring is usually fully reduced (Shane, 1995; Wright et al., 2007). The structures of synthetic folic acid and several important dietary and blood folates are shown in Figure 13.1.

FIGURE 13.1 Chemical structure of folic acid and key dietary and blood folates. [Structures shown: tetrahydrofolate, 5-formyltetrahydrofolate, folic acid, 5,10-methylenetetrahydrofolate, and 5-methyltetrahydrofolate.]

Foodomics: Advanced Mass Spectrometry in Modern Food Science and Nutrition, First Edition. Edited by Alejandro Cifuentes. © 2013 John Wiley & Sons, Inc. Published 2013 by John Wiley & Sons, Inc.

Mammals are unable to synthesize folate de novo, as they cannot attach the initial glutamate to pteroic acid or manufacture the pABA residue. Therefore, mammals rely primarily on dietary sources of folate, although some folate is obtained via microbial breakdown in the gut. Reduced folates are the naturally occurring forms, while fully oxidized folic acid is found only in supplements or fortified foods. Rich sources of folate in the diet are leafy green vegetables such as broccoli, spinach, cabbage, and asparagus, as well as liver, certain citrus fruits (and juices), and yeast extracts (McKillop et al., 2002). Other dietary sources contributing to total habitual folate intake include bread, pasta, and cereal (very often fortified with synthetic folic acid) and potatoes. As little as 50% of food folates are bioavailable compared with synthetic folic acid (McKillop et al., 2002). The unstable nature of naturally occurring folates causes significant loss of folate content during harvesting, storage, processing, and preparation. Cooking by boiling can reduce the folate content of vegetables (such as broccoli) by as much as 50%, while steaming has little detrimental effect. Freezing before cooking in most cases has no negative effect, and folate concentrations remain stable for 12 months (Philips et al., 2005).

The most abundant folate vitamers present in foods are 5-methyltetrahydrofolate and 10-formyltetrahydrofolate. Dietary folates are absorbed through the cells of the small intestinal mucosa, where the polyglutamyl chain is removed within the brush border. Absorbed dietary folate is primarily converted to 5-methyltetrahydrofolate before release into the portal circulation. To enable both retention and concentration of folates in tissues, plasma folates are re-converted to polyglutamate derivatives, catalyzed by folylpolyglutamate synthetase (Shane, 1995; Wright et al., 2007). Red blood cells contain higher levels of folate (primarily 5-methyltetrahydrofolate and formyltetrahydrofolate) than plasma. Folate is incorporated into the developing erythroblast during erythropoiesis in the marrow, and retained throughout the life span of the cell. Although fasted plasma folate level is a good indicator of recent folate intake, red cell folate is regarded as a better biomarker of long-term folate status (Gregory et al., 1995, 2000; Sauberlich, 1995; Finch et al., 1998; Ruston et al., 2004).
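The cooking-loss and bioavailability figures quoted in this section can be combined in a rough back-of-envelope estimate. The Python sketch below is illustrative only: the function name, the 100 μg starting value, and the treatment of steaming as zero loss are assumptions made for the example, and the two 50% factors are the limiting figures quoted in the text ("as much as" 50% loss on boiling, "as little as" 50% relative bioavailability), not measured values for any particular food.

```python
def bioavailable_folate_ug(raw_folate_ug, cooking_loss=0.5, relative_bioavailability=0.5):
    """Estimate bioavailable folate (in ug, relative to synthetic folic acid).

    cooking_loss: fraction of folate lost during preparation (0 to 1).
    relative_bioavailability: bioavailability of food folate relative to
        synthetic folic acid (0 to 1).
    Both defaults are the limiting 50% figures quoted in the text.
    """
    if raw_folate_ug < 0:
        raise ValueError("raw_folate_ug must be non-negative")
    for name, value in (("cooking_loss", cooking_loss),
                        ("relative_bioavailability", relative_bioavailability)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1")
    # Folate remaining in the food after cooking.
    retained_ug = raw_folate_ug * (1.0 - cooking_loss)
    # Portion of the retained folate that is bioavailable.
    return retained_ug * relative_bioavailability


# Hypothetical portion containing 100 ug of folate before cooking.
HYPOTHETICAL_RAW_FOLATE_UG = 100.0

# Boiling: up to 50% loss. Steaming: approximated here as no loss (assumption).
boiled = bioavailable_folate_ug(HYPOTHETICAL_RAW_FOLATE_UG, cooking_loss=0.5)
steamed = bioavailable_folate_ug(HYPOTHETICAL_RAW_FOLATE_UG, cooking_loss=0.0)

print(f"boiled:  {boiled:.1f} ug")   # boiled:  25.0 ug
print(f"steamed: {steamed:.1f} ug")  # steamed: 50.0 ug
```

Under these limiting assumptions, only about a quarter of the folate in a boiled vegetable portion would be available relative to the same amount of synthetic folic acid, which illustrates why preparation method and folate form both matter when intake is compared with a reference value.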

13.2 FOLATE AND HUMAN HEALTH

The Reference Nutrient Intake (RNI) for folate varies for different age groups, by gen- der and by country. Currently in the United Kingdom, the RNI for folate is 200 ␮g/day for most groups, rising to 400 ␮g/day for all women who could become pregnant (SACN, 2006). It has been suggested that folic acid deficiency is the most common vitamin deficiency in the world, with 40% of 15–18-year-olds in the United Kingdom exhibiting marginal folate status and folate deficiency common in people over 65 years of age, especially in the institutionalized elderly. In excess of 80% of women (in different age groups), do not achieve the RNI for women who could become pregnant (even accounting for supplement use), while 85% of men and women do not consume the recommended number of fruit and vegetable portions as detailed in the “5 a day” campaign issued by the Department of Health (www.dh.gov.uk). Folates play a critical role in human health and development. Folates are substrates and coenzymes in the acquisition, transport, and enzymatic processing of 1-carbon (1-C) units for both protein and DNA synthesis. Folates act as 1-C donors in the synthesis of purines and thymidylate for DNA synthesis, DNA repair, and DNA methylation. Folates also donate 1-C units for the remethylation of homocysteine to methionine, which is converted to S-adenosylmethionine (SAM), the principal 384 FOODOMICS STUDY OF MICRONUTRIENTS: THE CASE OF FOLATES methyl donor in cellular methylation and protein synthesis. Additionally, folates play an important role in the interconversion of serine and glycine, and in histidine catabolism (Quinlivan et al., 2006). A wide range of human diseases or pathologies is associated with clinical folate deficiency or poor folate status. Severe folate deficiency results in a characteristic form of anemia (megaloblastic anemia) in adults, and congenital defects such as neu- ral tube defects (NTD) in the newborn (Czeizel, 1993). 
NTD are malformations of the embryonic brain and/or spinal cord characterized by incomplete development of the central nervous system (CNS). Periconceptional folic acid supplementation can significantly reduce the risk of occurrence and recurrence of NTD (Honein et al., 2001; Persad et al., 2002). Suboptimal folate status is also a risk factor for cognitive decline and dementia in the elderly. Folate, together with vitamins B12 and B6, is essential for normal CNS function. While acute clinical deficiencies of these vitamins are established risk factors for severe depression, paranoia, neuropathy, and psychosis, low blood cell folate status is also associated with moderate cognitive dysfunction, dementia, and depression (Duthie et al., 2002).

Folate deficiency has been implicated in the development of cancer of the cervix, lung, esophagus, brain, pancreas, and breast (Giovannucci et al., 2002). However, the evidence linking poor folate status and an increased risk of colon cancer is particularly strong. Colorectal cancer (CRC) is a significant health problem in the Western world. Worldwide, CRC represents almost 10% of all incident cancer in men and women. Data from epidemiological studies (retrospective, case-control, and prospective) suggest that individuals with the highest habitual folate intake, or with the highest circulating folate, have a reduced relative risk (RR) of developing colon polyps or tumors (Giovannucci et al., 2002; Sanjoaquin et al., 2005; Kim, 2007). Generally, the majority of human studies suggest a protective role for dietary folate, with risk reductions ranging between 20% and 60% (Sanjoaquin et al., 2005).

Poor folate status is also associated with an increased risk of cardiovascular disease (CVD) (Boushey et al., 1995). Previously, this has been attributed to the induction of hyperhomocysteinemia (elevated circulating homocysteine), which causes vascular endothelial and smooth muscle cell dysfunction and endoplasmic reticulum stress (Splaver et al., 2004).
However, recent prospective studies report high serum folate to be independently and significantly associated with a reduced risk of CVD after adjusting for plasma homocysteine concentration (Voutilainen et al., 2000, 2004).

In the remainder of this chapter, we will explore how various LC–MS/MS-based technologies have been used to examine mechanistically the links between folate status and human health. The importance (and difficulties) of accurately measuring blood folates using LC–MS/MS in human population monitoring will be discussed. The application of proteomics to investigate the effect of folate status on global protein expression and metabolic pathways associated with immune function, genomic stability, and malignant transformation will be described in detail using examples from cell and animal models and from a human intervention study. Finally, the use of LC–MS/MS to investigate the impact of folate status on abnormal DNA methylation as a common mechanism linking altered gene expression and cell proliferation in both colon cancer and CVD will be discussed.

13.3 MEASURING FOLATES IN HUMAN BIOMONITORING

Blood concentrations of folate are an important marker of nutritional status and have been measured in many large-scale national population-based surveys including the National Health and Nutrition Examination Survey (NHANES) in the United States of America and the National Diet and Nutrition Survey (NDNS) in the United Kingdom. Given the negative impact that low dietary folate has on human health, a precise, accurate, and standardized population-based analytical method for monitoring folate status in human blood is vital. Moreover, high circulating levels of unmetabolized folic acid, as a consequence of supplement use or fortification of certain foodstuffs with synthetic folic acid, have recently been linked with adverse effects in certain “at risk” groups including the elderly and individuals with early cancers, making careful and accurate monitoring of individual blood folates essential (SACN, 2006).

The precise quantification of biologically relevant folates in human blood is extremely difficult due to their relatively low abundance (pmol/L–nmol/L depending on folate species and blood fraction), their inherent instability and susceptibility to oxidative, catalytic, and photolytic degradation, and their ability to metabolically interconvert (Shane et al., 1980; Shane, 1995). Several methods, including protein-binding assays, a microbiological assay, and more recently LC–MS/MS, are used to determine blood folate in human studies. There is extensive variability in data generated using different folate assays. A Food-Linked Agro-Industrial Research Program (FLAIR) “round robin” study of serum and whole blood folate data generated across 11 laboratories in 7 European countries described overall CVs of 18–41% (van den Berg et al., 1994). A Centers for Disease Control (CDC) “round robin” of 20 laboratories in the United States reported two- to ninefold differences in folate values between methods and laboratories (Gunter et al., 1996).
An assessment of six commercial serum and whole blood folate assays reported variance between the methods as high as 40% for serum and 250% for whole blood folate (Owen and Roberts, 2003).

The microbiological assay (MA) is currently the reference method for measuring total blood folates in biological samples. The MA relies on the ability of a folate-dependent bacterium (primarily Lactobacillus casei) to grow following exposure to blood samples containing unknown concentrations of folate. Bacterial growth in samples (measured as turbidity) is compared against standard growth curves derived from bacteria incubated with synthetic folic acid or specific reduced folate metabolites (Tamura, 1990). The MA is extremely sensitive, has an extensive dynamic range (0.025–0.5 ng/mL for plasma and whole blood lysates), and is relatively inexpensive. Good indices of precision (<10% inter-assay variability), recovery (102.5% for 5-methyltetrahydrofolate), and linearity (1:12,000 dilution) have been reported (Zhang, 2005). Conversely, the MA is regarded as time consuming and difficult to perform. The most significant drawback, however, is that it only measures total folate and does not routinely distinguish between different folate species.

Most clinical analysis of blood folate is carried out using protein-binding assays. These high-throughput automated assays are designed primarily to establish folate deficiency in medical settings. In the majority of assays, folate-binding protein (FBP) is used to capture folate in the sample. Several different formats can be employed based on detection and quantification by chemiluminescence, ion capture technology, and radio-binding. Automated protein-binding assays are relatively easy to use, inexpensive, and have high throughput. However, precision is low.
Critically, protein-binding assays detect only total folate and cannot discriminate between folate vitamers. Moreover, they cannot accurately quantify nonmethylated folates including formyltetrahydrofolate, which make up a significant proportion of total blood folates in individuals carrying a genetic variant (TT) of the folate-metabolizing MTHFR gene (Molloy et al., 1998).

The most promising emerging technology for accurately measuring blood folate is LC–MS/MS. Introduction of isotope-dilution LC–MS/MS methods, where an isotopically labeled folate metabolite, such as ¹³C₅ 5-methyltetrahydrofolate, is added to standards and samples before extraction, allows for accurate quantification at trace levels. This has the advantage of not requiring complete extraction and recovery of folate vitamers in the sample, as quantification is based on the relative ratios of calibrant to labeled internal standard. Moreover, introduction of a semi-automated 96-well plate format for solid phase extraction (SPE) has increased throughput substantially (Fazili and Pfeiffer, 2004). Samples are retained (and may be concentrated; Kok et al., 2004) and purified on an SPE or solid phase affinity extraction (SPAE) matrix (e.g., FBP) before separation on an LC column. Samples are analyzed using MS/MS detection with either positive or negative ionization. Negative ion electrospray is commonly used to prevent formation of satellite ions such as Na and K adducts, although adduct formation can also be overcome using an acidic mobile phase during LC separation of the folate compounds. Diverse folate species have different optimal ionization and fragmentation voltages. Plasma 5-methyltetrahydrofolate (5-methylTHF), tetrahydrofolate (THF), 5-formyltetrahydrofolate (5-formylTHF), and synthetic folic acid (FA) have been detected using LC–MS/MS with negative ion electrospray (Garbis et al., 2001). No SPE was used in this method.
Instead, hydrophilic interaction chromatography was used for sample clean up “on line” and methotrexate was employed as “internal” standard. Endogenous 5-methylTHF in human plasma was detected within the range 2.2–25.6 µg/L (Garbis et al., 2001). Plasma 5-methylTHF and FA have been quantified using labeled ¹³C₅ 5-methylTHF and ²H₄ FA with LC–MS/MS positive ion electrospray (Kok et al., 2004). The limit of detection (LOD) in this study was 1.2 × 10⁻¹¹ and 5 × 10⁻¹⁰ mol/L for 5-methylTHF and FA, respectively (Kok et al., 2004). 5-methylTHF, 5-formylTHF, and FA have been quantified in human serum with an LOD of 0.13, 0.05, and 0.07 nmol/L, respectively (Pfeiffer et al., 2004).

Few studies have used LC–MS/MS to quantify folate vitamers in whole blood. 5-methylTHF, 5-formylTHF, FA, and 5,10-methenylTHF have been detected and quantified in whole blood lysates by positive electrospray LC–MS/MS (Fazili and Pfeiffer, 2004; Fazili et al., 2008). The LOD was 2.5 and 0.7 nmol/L for THF and 5,10-methenylTHF, respectively. The relative distribution of 5-methylTHF, several nonmethylated folate vitamers (THF, 5,10-methyleneTHF, 5,10-methenylTHF, 5- and 10-formylTHF), and partly oxidized folates (FA and DHF) has been described using positive electrospray LC–MS/MS (Smith et al., 2006). The limit of quantification (LOQ) for both 5-methylTHF and nonmethylated THF was 0.4 nmol/L (Smith et al., 2006).

Preparation of whole blood lysates for quantitation of folates by LC–MS/MS is more critical than for conventional methods of measuring folate. Incubating human whole blood lysates (pH 4) for up to 2 h at 37°C (the regime used conventionally in the MA and radioassay) does not allow complete polyglutamate deconjugation, resulting in underestimation of folate by LC–MS/MS (Fazili et al., 2005).
Less specific assays such as the MA, which measure short chain polyglutamates in addition to monoglutamates, are less influenced by deconjugation regimes (Fazili et al., 2005).

Crucially, LC–MS/MS can detect the wide range of folate concentrations normally associated with folate deficiency, sufficiency, and intake of supplemental folic acid, where unmetabolized folic acid consequently appears (Pfeiffer et al., 2004; Fazili and Pfeiffer, 2004; Nelson et al., 2004, 2005). LC–MS can be comparable with LC–MS/MS in terms of analytical accuracy and precision but is less specific and an order of magnitude less sensitive (Hart et al., 2002; Nelson et al., 2003, 2007; Kok et al., 2004). Moreover, folate (even degraded and oxidized forms) in human urine and serum can be determined as folate equivalents by converting folate into its catabolites para-aminobenzoylglutamic acid (pABG) and acetamidobenzoylglutamate (apABG), followed by quantification by LC–MS/MS (Hannisdal et al., 2008, 2010). While this appears to be a promising technique, allowing detection of femtomolar concentrations of folate in human samples, it is at an early stage of development.

Most importantly, LC–MS/MS can be used to quantify several biologically active folate vitamers and unmetabolized folic acid simultaneously in human blood samples. Conversely, specific folate vitamers may not be recovered fully following sample extraction (Pfeiffer et al., 2004), and specialized and expensive equipment not routinely found in clinical or analytical laboratories is required. While throughput is rapid, the cost of folate analysis on a per sample basis by LC–MS/MS is substantial compared with more routine analyses. The major strengths and weaknesses of LC–MS/MS in measuring folate status in human blood are summarized in Figure 13.2. To date, LC–MS/MS has not been employed in large-scale human monitoring.
However, the suitability of LC–MS/MS (compared with the microbiological assay) for measuring blood folate in nutritional surveys is now being validated for use in the American NHANES survey, and LC–MS/MS has recently been adopted over clinical protein-binding assays as the principal method of choice for quantifying folate status in the UK NDNS (http://foodbase.org.uk/results.php?f_report_id=336).
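
The isotope-dilution quantification described in this section can be illustrated numerically. The following Python sketch is not the procedure of any study cited here: it assumes a simple linear calibration of the analyte-to-internal-standard peak-area ratio against calibrant concentration, and the function names, peak areas, and concentrations are all hypothetical.

```python
def fit_calibration(concentrations, analyte_areas, istd_areas):
    """Fit ratio = slope * concentration + intercept by ordinary least squares."""
    ratios = [a / s for a, s in zip(analyte_areas, istd_areas)]
    n = len(concentrations)
    mean_x = sum(concentrations) / n
    mean_y = sum(ratios) / n
    sxx = sum((x - mean_x) ** 2 for x in concentrations)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(concentrations, ratios))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def quantify(analyte_area, istd_area, slope, intercept):
    """Convert a sample's analyte/internal-standard area ratio to a concentration."""
    return (analyte_area / istd_area - intercept) / slope


# Hypothetical calibrators (nmol/L), each spiked with the same amount of labeled standard.
cal_conc = [1.0, 5.0, 10.0, 20.0]
cal_analyte = [100.0, 500.0, 1000.0, 2000.0]
cal_istd = [1000.0, 1000.0, 1000.0, 1000.0]
slope, intercept = fit_calibration(cal_conc, cal_analyte, cal_istd)

# The same hypothetical sample measured at full recovery and at 60% recovery.
full_recovery = quantify(750.0, 1000.0, slope, intercept)
partial_recovery = quantify(450.0, 600.0, slope, intercept)
print(full_recovery, partial_recovery)
```

Both calls give essentially the same concentration because the analyte and the labeled standard are assumed to be lost in the same proportion during extraction, so their ratio is preserved. If a vitamer and its internal standard were recovered unequally, this correction would no longer hold.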

13.4 FOLATE AND COLON CANCER: ESTABLISHING MECHANISMS OF GENOMIC INSTABILITY USING A COMBINED PROTEOMIC AND FUNCTIONAL APPROACH

The evidence linking low folate status with an increased risk of human colon cancer is convincing. In general, people who habitually consume the highest level of dietary folate, or those with the highest blood folates, are at a significantly reduced risk of developing polyps or cancer (Giovannucci, 2002; Sanjoaquin et al., 2005; Kim, 2007). A recent large-scale meta-analysis of prospective studies (Sanjoaquin et al.,

Strengths of LC–MS/MS

allows measurement of potentially degraded samples stored over a long period (pABA)
detects and quantifies individual folate species, e.g., THF, 5-methylTHF, formylTHF, and unmetabolised folic acid
unaffected by MTHFR genotype
sensitive and specific (0.1–0.4 ng/mL)
accurate (if appropriate stable isotopically-labelled internal standards are used)
low limits of detection for several folate species
unambiguous determination of analytes

Weaknesses of LC–MS/MS

can quantify folates only where standards are available (inclusion of full range of standards impractical due to stability problems)
no external QA system currently in place
assumes full recovery of total folate and equal recovery of different folate species
assumes all folate species are converted to pABA
exogenous pABA (drugs) cause overestimation of total folate
requires expensive and specialised laboratory equipment
analysis expensive on a per sample basis
relatively untried in large-scale human population monitoring

FIGURE 13.2 Major strengths and weaknesses of LC–MS/MS for measuring blood folate status in human biomonitoring.

2005) reported a significantly lower risk for CRC in subjects with a high dietary folate intake compared with low intake (RR 0.75; 95% CI = 0.64–0.89).

Folate deficiency increases cancer risk by inducing genomic instability and abnormal cytosine methylation and gene expression. The impact that folate status has on DNA methylation in relation to both cancer and CVD will be covered in a later section. Folate from the diet is essential for DNA synthesis. Deoxyuridine monophosphate (dUMP) is converted to thymidine monophosphate (TMP) by thymidylate synthase using 5,10-methylenetetrahydrofolate, while 10-formyltetrahydrofolate is involved in the production of both adenosine and guanosine. Continual production of these DNA precursors is essential for normal DNA synthesis, cell proliferation, and DNA repair. However, if folate is limiting, and the balance of these DNA precursors is altered, uracil is misincorporated into DNA, resulting in DNA strand breakage, chromosomal instability, and inhibition of DNA repair (Fig. 13.3).

In this section, we will focus on how proteomic technologies (2-D gel electrophoresis and SDS-PAGE with proteins identified by nano LC–MS/MS), combined with biochemical analyses of folate status and biomarkers of genomic instability, can be used to investigate the mechanisms involved in folate deficiency-mediated malignant transformation.

In the first example, normal human colon cells (NCM460) were cultured for up to 14 days in folate-free medium. Functional biomarkers of uracil misincorporation, DNA strand breakage, apoptosis, and DNA base excision repair (BER) capacity in response to oxidative damage were measured (Duthie et al., 2008). To determine the effect of folate depletion on the total proteome, colon cells (grown either in

[Figure 13.3 diagram labels: Dietary Folate; dUMP, DHF, TS, dTMP, THF; 5,10-methylene THF; 5,10-formyl THF; Pyrimidine biosynthesis (thymidine); Purine biosynthesis (adenosine, guanosine); DNA synthesis, Proliferation, DNA repair; Altered DNA repair; Chromosomal instability; Carcinogenesis.]

FIGURE 13.3 Folate and genomic stability: the role of dietary folate in DNA precursor synthesis and DNA repair. dTMP, deoxythymidine monophosphate; dUMP, deoxyuridine monophosphate; DHF, dihydrofolate; THF, tetrahydrofolate.

folate-depleted or folate-supplemented medium) were scraped into Tris/sucrose buffer (10 mM/250 mM, pH 7.4) and centrifuged at 13,000 rpm. The pellet was weighed and extraction buffer (7 M urea, 2 M thiourea, 4% CHAPS, and 2% Bio-Rad Biolite Ampholyte pH 3–10) added at a ratio of 3 µL buffer per mg of pellet. The sample was homogenized on ice for 30 s and sonicated for 5 s. The homogenate was centrifuged at 16,000 g for 5 min at 4°C and the supernatant frozen at −80°C. Protein concentration was determined using the Bio-Rad RC DC protein assay. A single 2-D gel was run per culture flask (n = 6 per group). Proteins were separated by isoelectric focusing in the first dimension (Bio-Rad immobilized pH gradient (IPG) strips, pI range 3–10) and SDS-PAGE in the second dimension on 18 × 18 cm 8–16% acrylamide gels. Gels were stained with Coomassie blue and imaged on a Bio-Rad GS710 flat bed imager followed by image analysis using PD Quest Version 8.0.1. Spots with densities that significantly differed between treatments (p < .05) were excised from the SDS-PAGE gels using a robotic Bio-Rad spot cutter, trypsinized in a MassPrep station (Waters, MicroMass, Manchester, UK), and analyzed by LC–MS/MS using an Ultimate nano LC capillary chromatography system (LC Packings, Camberley,

FIGURE 13.4 The effect of folate depletion on genomic stability in human colon cells in vitro. NCM460 human colon cells were grown in the presence or absence of folic acid (4 mg/L) for 14 days. Uracil misincorporation (a), DNA strand breakage (b), DNA base excision repair capacity (c), and caspase activity (d) were measured. Results are mean ± SEM for n = 8. *P < .05, where P values refer to differences between cells grown in the presence (circles) or absence (squares) of folic acid. Reproduced with permission from Duthie et al. (2008).

Surrey, UK) combined with an Applied Biosystems 4000 Q-Trap (Warrington, UK). Peptide fragment mass spectra yielded the sequences of separated peptides, which were pasted into the fingerprinting web resource program “Mascot” (Matrix Science Ltd, Boston, USA) for protein identification.

Folate deficiency induced DNA damage in human colon cells progressively. Uracil misincorporation was increased fourfold in cells grown in folate-deficient medium (Fig. 13.4a). DNA strand breakage was increased over the same experimental period (Fig. 13.4b). Moreover, the capacity of human colon cells to repair DNA in response to oxidative damage was reduced (Fig. 13.4c) while the apoptotic potential of the cells was increased (Fig. 13.4d).

Proteomic analyses revealed more than 100 distinct proteins that were significantly altered by folate status in these human colon cells (Duthie et al., 2008). Numerous proteins involved in proliferation, repair, apoptosis, and malignant transformation changed in response to folate depletion (Fig. 13.5). MSH2 is a recognition enzyme in DNA mismatch repair (MMR) that acts to proofread the DNA molecule for errors post replication. XRCC5 enables double-strand break (DSB) rejoining after chromosomal breakage. Both repair enzymes were up-regulated by folate depletion, almost certainly in response to increased DNA damage.
Many apoptotic proteins were altered in response to folate

DNA repair

MSH2: lesion-recognition enzyme in MMR; “proof reads” DNA post replication
XRCC5 (Ku80): rejoining enzyme in DSB repair after chromosomal damage
Both enzymes upregulated in folate-depleted human colon cells
Homeostatic response to increased DNA damage
Functional assays: increased DNA breakage and compromised DNA repair capacity

Apoptosis

Differential expression of pro- and anti-apoptotic proteins
Annexin 1, Diablo, porin, cathepsin D elevated two- to threefold
HSP 70, galectin-3, PCNA downregulated or even absent (pyruvate dehydrogenase E1)
Overall increase in apoptosis
Functional assays: increased apoptotic potential

Malignant transformation and tumor suppressor proteins

Catechol-O-methyltransferase (COMT): associated with breast and colon cancer risk
Nit2: controls cancer cell growth and tumor burden
Prohibitin (non-functional protein): associated with breast and colon cancer risk
Expression of tumor suppressor proteins absent

FIGURE 13.5 A summary of changes in several key proteins involved in malignant transformation and links with functional biomarkers of genomic instability in response to folate depletion in human colon cells in vitro.

depletion, with an overall increase in the apoptotic potential, again probably as a consequence of increased damage and inhibited repair. This agrees strongly with the functional biochemical data, with folate depletion increasing DNA damage, inhibiting DNA repair capacity, and increasing apoptosis. Moreover, there was a significant reduction in the expression of several tumor-suppressor proteins implicated in malignant transformation and metastasis in the human colon, including Nit2, COMT, and prohibitin (Fig. 13.5).

This combined functional and global proteomics approach successfully identified many metabolic processes associated with genomic instability and malignant transformation that were significantly altered by folate deficiency in vitro and, critically, identified several individual molecular protein targets not previously associated with folate depletion (Duthie et al., 2008).

A similar approach was used to determine how folate deficiency alters genomic stability in vivo. Here, rats were fed a folate-free diet for 6 months and various indices of folate status, DNA damage, and DNA repair were measured, together with global protein expression in the colon (method essentially as described above; Duthie et al., 2010a).

TABLE 13.1 The Effect of Long-Term Folate Deficiency on Expression of Rat Colon Proteins
Transcription, Protein Intermediate Cell Metabolism Treatment Effect Synthesis, and Catabolism Acyl-CoA dehydrogenase 1.3 2.4 Contrapsin-like protease inhibitor 1 Acyl coenzyme A thioester 5.9 Contrapsin-like protease hydrolase inhibitor 3 Alcohol dehydrogenase 1.8 structural proteins, cytoskeleton, and cell motility 3-α-hydroxysteroid 1.9 62.5% L-plastin dehydrogenase Aspartate transaminase 1.7 2.5 Tubulin alpha chain Carbonic anhydrase 2 2.9 xenobiotic metabolism, oxidative stress, and redox regulation Carbonyl reductase [NADPH] 1 1.9 1.4 Glutathione synthase Dihydrolipoamide dehydrogenase 1.2 1.3 Glutathione S-transferase P Fumarate hydratase 2.1 5.4 Selenium binding protein 2 Fumarylacetoacetase 2.4 Thioredoxin reductase 1 cell signalling, cell cycle, and nucleotide synthesis GDP-mannose 4,6 dehydratase 2.1 2.2 Adenylate kinase isoenzyme 2 Gpd1l 6.7 1.3 Nucleoside-diphosphate kinase Glycerol-3-phosphate Pdcd6ip protein dehydrogenase Hydroxymethylglutaryl-CoA synthase L-Lactate dehydrogenase A chain 3.7 1.6 Nicotinamide 2.5 phosphoribosyltransferase Phosphoglucomutase 1 2.6 Transaldolase 1.4 UDP-glucose 6-dehydrogenase 1.9

Rats were fed either a folate deficient or control diet for 24 weeks (Duthie et al., 2010a). Change in colon (insoluble) protein expression, grouped according to primary metabolic function, is shown as fold increase, percent decrease (L-plastin) or expression only in folate depleted tissue (italics)
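
Because Table 13.1 reports fold increases for up-regulated proteins but a percent decrease for L-plastin, it can help to place both on one scale when comparing effect sizes. The short Python sketch below is illustrative only: expressing a percent decrease as a fold value and taking log2 are conventions assumed here, not steps reported by Duthie et al. (2010a), and the function names are hypothetical.

```python
import math


def fold_from_percent_decrease(percent_decrease):
    """Express a percent decrease as a fold value (treated / control)."""
    return 1.0 - percent_decrease / 100.0


def log2_fold(fold):
    """Symmetric scale: positive for increases, negative for decreases."""
    return math.log2(fold)


# 62.5% is the decrease reported for L-plastin in Table 13.1.
l_plastin_fold = fold_from_percent_decrease(62.5)
print(l_plastin_fold, log2_fold(l_plastin_fold))

# An illustrative twofold increase, for comparison on the same scale.
print(log2_fold(2.0))
```

On this scale a 62.5% decrease corresponds to a fold value of 0.375, which is a larger change in magnitude than a twofold increase.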

Feeding rats a folate-free diet caused a progressive depletion in blood and colon folate (>50%) and a significant increase in uracil misincorporation and OGG-1 and MGMT protein expression (Duthie et al., 2010a). More than 50 proteins changed significantly in response to folate depletion in the colon. Colon mucosa (insoluble fraction) proteins are grouped broadly by function in Table 13.1. Most proteins affected by folate status are involved in protein synthesis and energy metabolism and were significantly up-regulated by folate depletion. However, a substantial number of proteins involved in Phase 2 xenobiotic metabolism and cellular response to oxidative stress were significantly affected by folate depletion, including glutathione transferase P (GST π) and glutathione synthase. Several proteins were expressed only in the folate-depleted colon, including the pro-apoptotic protein Pdcd6ip. A single protein, L-plastin, was significantly down-regulated in the colon. Dysregulation of several of these proteins is implicated in genomic instability and cancer susceptibility or progression (Duthie et al., 2010a).

These studies, together with many others in the literature, highlight how moderate-to-severe folate deficiency negatively impacts processes and pathways associated with cancer risk. Together with data from epidemiological studies, they suggest that improving dietary folate status in the human population may reduce genomic instability and malignant transformation in the colon. However, there are recent indications that increasing intake of synthetic folic acid may be detrimental and may actually increase human colon cancer incidence by altering immune function and cancer surveillance or by stimulating initiated cancer cell growth (Troen et al., 2006; Mason et al., 2007).
We carried out a human intervention study to determine what effect synthetic folic acid might have on immune function by determining the impact of folic acid on the plasma proteome. Plasma was collected from subjects treated either with placebo or synthetic folic acid (1.2 mg/day for 12 weeks; 10 subjects per group) in a randomized controlled trial (Duthie et al., 2010b). The plasma proteome was assessed by 2-D gel electrophoresis and proteins identified by LC–MS/MS (Duthie et al., 2010b).

Plasma samples were depleted of the 12 most abundant proteins (albumin, IgG, fibrinogen, transferrin, IgA, IgM, HDL Apo AI, HDL Apo AII, haptoglobin, α-1-antitrypsin, α-1-acid glycoprotein, and α-2-macroglobulin) using an IgY-12 high capacity proteome partitioning kit (ProteomeLab, Beckman Coulter). Samples were concentrated using Millipore Ultrafree-0.5 centrifugal spin columns and rehydration buffer (7 M urea, 2 M thiourea, 2% CHAPS, and 0.5% IPG buffer, pH 4–7). Protein concentration was quantified as described previously. Depleted plasma (200 µg of protein) was subsequently loaded per 17 cm 2-D gel. One 2-D gel was run per single plasma sample. Proteins were separated by isoelectric focusing in the first dimension (Bio-Rad immobilized pH gradient (IPG) strips, pI range 4–7) and SDS-PAGE in the second dimension on 8–16% acrylamide gels (18 × 18 cm). Gels were stained with Flamingo Red (Bio-Rad). Imaging, spot cutting, LC–MS/MS analyses, and protein identification were as described earlier in this section.

Plasma folate increased fivefold in supplemented subjects. ApoE A-1, α-1-antichymotrypsin, antithrombin, and serum amyloid P were down-regulated while albumin, IgM C, and complement C3 were up-regulated. More than 60 proteins were highly associated with circulating folate pre- and postintervention (Duthie et al., 2010b).
These were grouped into metabolic pathways related to complement fixation (e.g., C1, C3, C4, Factor H, Factor 1, Factor B, clusterin), coagulation (e.g., antithrombin, α-1-antitrypsin, kininogen), and mineral transport (e.g., transthyretin, haptoglobin, ceruloplasmin). Low folate status, pre- and posttreatment, was associated with lower levels of proteins involved in immune function and coagulation.

[Figure 13.6 diagram labels: Alternative pathway; Classical pathway; antigen/antibody complex; IgG, IgM; C1q, C1r, C1s; C4, C2; spontaneous hydrolysis; C3; C3 convertase; C3a and C5a (anaphylactic, chemoattractant, vasodilator); lysosomal activator; C3b; Factor B, Factor D, Factor H, Factor 1; C5, C5b, C6, C7, C8, C9; clusterin; vitronectin; Cell lysis.]

FIGURE 13.6 Simplified diagram of the classical and alternative complement pathways. Proteins identified as being significantly associated with folate status are shown in black. Reproduced with permission from Duthie et al. (2010b).

The vast majority of proteins influenced by synthetic folic acid were involved in immune recognition and complement fixation, which removes pathogens from the body and which is also involved in modulating inflammation and immune cell function. Two main pathways interact in the complement cascade, the classical and alternative pathways, which act primarily to induce a controlled lysis of foreign cells in the body. More than 30% of proteins critical to complement fixation were identified as responsive to folate status in this study (Fig. 13.6; shown in black). Clearly, increasing intake of synthetic folic acid significantly alters the expression of proteins involved in immune function. Whether this modulates immune surveillance or cancer risk remains to be established.

13.5 FOLATE DEFICIENCY AND ABNORMAL DNA METHYLATION: A COMMON MECHANISM LINKING CANCER AND ATHEROSCLEROSIS

DNA methylation in mammals involves addition of a 1-C group to cytosine residues within CpG dinucleotides. SAM acts as methyl donor in this reaction, which is catalyzed by DNA methyltransferase (DNMT) enzymes. Cytosine methylation changes the structure of the major groove in the DNA molecule and disrupts attachment of DNA binding proteins and transcription factors. In general, methylated genes are either not transcribed or are transcribed at a reduced rate, and translation into the protein for which the gene encodes is reduced. DNA methylation therefore contributes to the control of gene and protein expression (Costello and Plass, 2001).

Aberrant DNA methylation is a common feature in tumors. Undermethylation of DNA (hypomethylation) is associated with increased transcription and expression of proto-oncogenes that stimulate malignant cell growth, migration, and metastasis. Genome-wide DNA hypomethylation occurs early in tumorigenesis and may be causal for cancer progression (Gaudet et al., 2003). There is some suggestion that abnormal DNA methylation may also contribute to human vascular disease (Dong et al., 2002; Zaina et al., 2005; Turunen et al., 2009). The atherosclerotic plaque and the tumor are both monoclonal in origin, with unregulated cell proliferation providing a growth advantage for selected clones. During atherogenesis, aorta smooth muscle cells (SMC) transform from quiescent contractile cells to synthetic cells that migrate to the intima and multiply. Overexpression of pro-proliferative oncogenes drives SMC invasiveness and plaque formation.

Low folate status is associated with an increased risk of both cancer and CVD (Sanjoaquin et al., 2005; McNulty et al., 2008). Folate is essential for DNA methylation as 5-methyltetrahydrofolate remethylates homocysteine to methionine, which is subsequently metabolized to SAM.
Under conditions of low dietary folate, SAM is depleted, causing hypomethylation of newly synthesized DNA. Abnormal DNA methylation, as a consequence of folate deficiency, may therefore be causal in the development of both atherosclerosis and colon cancer (Fig. 13.7).

Severe folate depletion causes DNA hypomethylation and altered gene expression in cultured cells and in rodents, but the effect is strongly influenced by treatment regime, species, and tissue (Kim, 2007). Lymphocyte DNA is hypomethylated in women made experimentally folate-deficient over several weeks (Jacob et al., 1998), and low dietary folate intake (<200 μg/day) correlates with LINE-1 hypomethylation in human colon tumors (Schernhammer et al., 2010). However, there is little conclusive evidence that folate status and genome-wide DNA methylation are correlated in healthy individuals (Fenech et al., 1998; Basten et al., 2006; Kim, 2007). With regard to vascular disease, genomic DNA hypomethylation is observed in cultured vascular cells and in atherosclerotic plaques from mice, rabbits, and humans (Hiltunen et al., 2002). Critically, no studies to date have investigated whether folate status can modify vascular tissue methylation and disease progression.

In order to determine whether folate deficiency-mediated genome-wide hypomethylation is causal for both genomic instability in the colon and atherosclerosis in the vasculature, we employed LC–MS/MS (based on the original method of Friso et al. (2002)) to directly quantify 5-methyl-2-deoxycytidine levels in both colon and vascular tissue from folate-depleted rodents. Previous methods for measuring genomic DNA methylation (including Southern blotting, radioassay, and nucleotide extension assay) generally provide inconsistent and highly variable data. By contrast, LC–MS/MS allows direct quantification of methylated 2-deoxycytidine residues in as little as 1 μg of DNA (Friso et al., 2002).

[Figure 13.7 scheme. Labels: 5,10-methylene THF; 5-methyl THF; homocysteine; methionine; SAM; SAH; DNMT; CH3 transfer to CpG; hypomethylation; proto-oncogene activation; carcinogenesis (epithelial cell proliferation); atherosclerosis (smooth muscle cell proliferation).]

FIGURE 13.7 Folate deficiency, cancer, and CVD: a common mechanism? A simplified scheme describing how folate deficiency may alter normal DNA methylation and how this could impact on the risk of both colon cancer and atherosclerosis. THF, tetrahydrofolate; SAM, S-adenosylmethionine; SAH, S-adenosylhomocysteine; DNMT, DNA methyltransferase; CH3, 1-C methyl group. Reproduced with permission from Duthie SJ (2011).

In the first study, colon was isolated from rats fed a folate-free diet for 24 weeks (Duthie et al., 2010a). Folate, DNA methylation, DNA damage, and DNA repair capacity were measured. In the second experiment, ApoE null mice, which spontaneously develop atherosclerosis, were fed a high fat diet depleted of folic acid for 16 weeks. Plaque formation in the aorta was measured, as were folate status and 5-methyl-2-deoxycytidine levels in the aorta, heart, and aorta adventitial tissue (McNeil et al., 2011).

Colon and vascular tissue DNA was isolated using a Promega Wizard® Genomic DNA Purification kit (Madison, USA) and quantified by Nanodrop (ND-1000 Spectrophotometer, Nanodrop Technologies, Wilmington, USA). Denatured DNA was added to ammonium acetate (0.1 M, pH 5.3) and nuclease P1 and incubated at 45 °C for 2 h. Ammonium bicarbonate and diluted snake venom phosphodiesterase-1 were added to the mixture, which was incubated at 37 °C for a further 2 h. Alkaline phosphatase was then added and the mixture incubated at 37 °C for 1 h. Finally, a stable isotope of deoxycytidine ([15N3]2-deoxycytidine) was added to each sample as internal standard.

LC was performed on an Agilent 1200 Series system using Analyst 1.4.2 software. Separation was carried out on a Phenomenex Jupiter C18 column (150 × 2 mm, 5 μm) with an isocratic mobile phase (10 mM ammonium acetate in 2% acetonitrile with 0.1% acetic acid) over 8 min. MS/MS was performed on an AB Sciex 4000 triple quadrupole instrument with a Turbo-IonSpray source. Methylated DNA was quantitatively analyzed using the Multiple Reaction Monitoring (MRM) method. Transition values for 2-deoxycytidine, 5-methyl-2-deoxycytidine, and the internal standard [15N3]2-deoxycytidine were 228.2 → 111.7, 242.2 → 125.6, and 230.6 → 114.7 m/z, respectively. DNA methylation status was expressed both as the absolute amount of 5-methyl-2-deoxycytidine per microgram of DNA and as a percentage of total 2-deoxycytidine residues in genomic DNA.
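The final quantification step can be illustrated with a minimal Python sketch. It assumes a simple isotope-dilution calculation in which each analyte peak area is normalized to the internal standard peak area; the function names, the unit response factor, and the peak areas in the usage note are hypothetical and are not taken from the published method. Only the MRM transitions come from the text.

```python
# MRM transitions (precursor m/z, product m/z) as reported in the text.
MRM_TRANSITIONS = {
    "2-deoxycytidine": (228.2, 111.7),
    "5-methyl-2-deoxycytidine": (242.2, 125.6),
    "[15N3]2-deoxycytidine": (230.6, 114.7),  # internal standard
}


def amount_from_area_ratio(analyte_area, istd_area, istd_amount, response_factor=1.0):
    """Estimate an analyte amount from its peak area relative to the internal standard.

    The result is in the same (molar) units as istd_amount. response_factor is a
    hypothetical calibration slope; 1.0 assumes equal detector response.
    """
    if istd_area <= 0 or response_factor <= 0:
        raise ValueError("internal standard area and response factor must be positive")
    return (analyte_area / istd_area) * istd_amount / response_factor


def percent_methylation(m5dc_amount, dc_amount):
    """5-methyl-2-deoxycytidine as a percentage of total (methylated + unmethylated) dC."""
    total = m5dc_amount + dc_amount
    if total <= 0:
        raise ValueError("total deoxycytidine must be positive")
    return 100.0 * m5dc_amount / total
```

As a usage illustration with invented peak areas, `amount_from_area_ratio(950_000, 500_000, 100.0)` and `amount_from_area_ratio(45_000, 500_000, 100.0)` give about 190 and 9 pmol for a 100 pmol spike, and `percent_methylation` applied to those two amounts returns roughly 4.5%. Both amounts must be molar for the percentage to refer to residues rather than mass.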
The effect of dietary folate depletion on folate status and genome-wide DNA methylation in both rats and ApoE null mice is shown in Table 13.2. Folate deficiency was associated both with changes in genomic stability in rats (increased DNA damage and altered DNA base excision repair (BER) capacity) and with atherosclerosis (significantly increased plaque formation) in ApoE null mice (Duthie et al., 2010a; McNeil et al., 2011). However, despite a profound decrease in circulating and tissue folate status, elevated total plasma homocysteine, and a decrease in the hepatic ratio of SAM:SAH in both animal models, global DNA methylation remained unchanged in the colon and vascular tissue of rats and ApoE knockout mice, respectively (Table 13.2). These data, from long-term nutritionally relevant rodent models, indicate that accelerated genomic instability and vascular disease progression as a consequence of folate deficiency are probably not driven by abnormal global cytosine methylation.
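The fall in the hepatic SAM:SAH ratio can be checked approximately from the group means in Table 13.2. The short Python sketch below takes the ratio of the mean liver SAM to the mean liver SAH for each group; this is an assumption made for illustration, since a ratio of means is not the same as the mean of per-animal ratios that a study would normally report, and the dictionary and function names are hypothetical.

```python
# Group means (nmol/g tissue) for liver SAM and SAH, copied from Table 13.2.
LIVER_MEANS = {
    "rat C": {"SAM": 81.60, "SAH": 14.53},
    "rat C-F": {"SAM": 79.22, "SAH": 15.78},
    "mouse HF": {"SAM": 84.1, "SAH": 47.6},
    "mouse HF-F": {"SAM": 47.6, "SAH": 68.1},
}


def sam_sah_ratio(group):
    """Ratio of mean SAM to mean SAH for one group (an approximation, see text)."""
    values = LIVER_MEANS[group]
    return values["SAM"] / values["SAH"]


for group in LIVER_MEANS:
    print(f"{group}: SAM:SAH = {sam_sah_ratio(group):.2f}")
```

On these means the ratio is lower in the folate-depleted group of each species, which is consistent with the direction of change described in the text.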

13.6 SUMMARY

Dietary folates are critical for human development and health. Both DNA and protein function are profoundly dependent on adequate folate intake. Folate deficiency and suboptimal folate status are associated with several human pathologies, including NTD in the newborn and cancer and vascular disease in adults. In order to prevent not only overt clinical folate deficiency but also, most importantly, several chronic human diseases that appear dependent on folate status, it is essential to understand fully the mechanisms through which this critical micronutrient acts. The studies described here clearly demonstrate that LC–MS/MS-based technologies for (1) accurately and specifically measuring blood folate(s) in human biomonitoring, (2) establishing the influence of folate status on total tissue protein expression, and (3) quantifying genome-wide DNA methylation, combined with appropriate functional or biochemical measurements of dietary intervention and biomarkers of disease, are powerful tools for interrogating metabolic pathways altered by folate status and involved in either human disease prevention or promotion.

TABLE 13.2 Indices of Folate Status and Global DNA Methylation in Blood and Tissues from Folate-Depleted Rodents

Biomarker                                     Rats C            Rats C-F          Mice HF           Mice HF-F
Plasma folate (ng/mL)                         96.4 ± 2.2        13.3 ± 1.3*       104.9 ± 5.5       15.5 ± 1.7*
Whole blood folate (ng/mL)                    1322 ± 37.4       705.3 ± 36.4*     507.9 ± 26        149.6 ± 9.7*
Liver folate (ng/mg protein)                  136.2 ± 6.9       93.9 ± 16.3*      117.1 ± 8.3       42 ± 2.3*
Colon folate (ng/mg protein)                  24.6 ± 1.9        9.9 ± 1.8*        ND                ND
Plasma tHcy (nmol/L)                          5.01 ± 0.27       6.91 ± 0.32*      6.7 ± 0.3         14.2 ± 1*
Liver SAM (nmol/g tissue)                     81.60 ± 4.93      79.22 ± 4.99*     84.1 ± 4.7        47.6 ± 6.7*
Liver SAH (nmol/g tissue)                     14.53 ± 0.9       15.78 ± 1.06*     47.6 ± 3.1        68.1 ± 3.8*
Colon SAM (nmol/g tissue)                     32.47 ± 4.36      28.36 ± 2.91*     ND                ND
Colon SAH (nmol/g tissue)                     3.57 ± 0.47       4.48 ± 0.52*      ND                ND
Liver 5-methyl-2-deoxycytidine (ng/μg DNA)    2.62 ± 0.32       2.12 ± 0.23       71.5 ± 5.9        68.4 ± 6.5
  (% total DNA)                               (4.63 ± 0.10)     (4.54 ± 0.04)     (3.21 ± 0.05)     (3.20 ± 0.06)
Colon 5-methyl-2-deoxycytidine (ng/μg DNA)    4.56 ± 0.22       3.47 ± 0.51       ND                ND
  (% total DNA)                               (4.87 ± 0.09)     (4.61 ± 0.15)
Aorta 5-methyl-2-deoxycytidine (ng/μg DNA)    ND                ND                116.6 ± 11.5      123.7 ± 7.8
  (% total DNA)                                                                   (4.31 ± 0.12)     (3.99 ± 0.10)
Heart 5-methyl-2-deoxycytidine (ng/μg DNA)    ND                ND                134.7 ± 6.1       157 ± 13.4
  (% total DNA)                                                                   (3.79 ± 0.05)     (3.85 ± 0.12)

Rats were fed either a control (C) or folate-depleted (C-F) diet for 24 weeks (n = 12 per group). ApoE mice were fed a high fat (HF) diet or an HF diet depleted of folic acid (HF-F) for 16 weeks (n = 6–10 per group). Values are means ± SEM. Significance refers to differences between sufficient versus depleted animals at the end of the experiment; *P < .05.

ACKNOWLEDGMENTS

The Scottish Government Rural and Environment Science and Analytical Services Division (RESAS) and the World Cancer Research Fund (WCRF) funded this work.

REFERENCES

Basten GP, Duthie SJ, Pirie LP, Vaughan N, Hill MH, Powers HJ (2006). Sensitivity of markers of DNA stability and DNA repair activity to folate supplementation in healthy volunteers. British Journal of Cancer 94:1942–1947.
Boushey CJ, Beresford SAA, Omenn GS, Motulsky AG (1995). A quantitative assessment of plasma homocysteine as a risk factor for vascular disease: probable benefits of increasing folate intakes. Journal of the American Medical Association 274:1049–1057.
Costello JF, Plass C (2001). Methylation matters. Journal of Medical Genetics 38:285–303.
Czeizel AE (1993). Prevention of congenital abnormalities by periconceptional multivitamin supplementation. British Medical Journal 306:1645–1648.
Dong C, Yoon W, Goldschmidt-Clermont PJ (2002). DNA methylation and atherosclerosis. Journal of Nutrition 132:2406S–2409S.
Duthie SJ, Whalley LJ, Collins AR, Leaper S, Berger K, Deary IJ (2002). Homocysteine, B vitamin status and cognitive function in the elderly. American Journal of Clinical Nutrition 75:908–913.
Duthie SJ, Mavrommatis Y, Rucklidge G, Reid M, Duncan G, Moyer MP, Pirie LP, Bestwick CS (2008). The response of human colonocytes to folate deficiency in vitro: functional and proteomic analysis. Journal of Proteome Research 7:3254–3266.
Duthie SJ, Grant G, Pirie LP, Watson AJ, Margison GP (2010a). Folate deficiency alters hepatic and colon MGMT and OGG-1 DNA repair protein expression in rats but has no effect on genome-wide DNA methylation. Cancer Prevention Research 3:92–100.
Duthie SJ, Horgan G, De Roos B, Rucklidge G, Reid MD, Duncan GJ, Pirie L, Basten G, Powers H (2010b). Blood folate status and expression of proteins involved in immune function, inflammation and coagulation: biochemical and proteomic changes in the plasma of humans in response to long term synthetic folic acid supplementation. Journal of Proteome Research 9:1941–1950.
Duthie SJ (2011). Epigenetic modifications and human pathologies: cancer and CVD. Proceedings of the Nutrition Society 70:47–56.
Fazili Z, Pfeiffer CM (2004). Measurement of folates in serum and conventionally prepared whole blood lysates: application of an automated 96-well plate isotope-dilution tandem mass spectrometry method. Clinical Chemistry 50:2378–2381.
Fazili Z, Pfeiffer CM, Zhang M, Jain R (2005). Erythrocyte folate extraction and quantitative determination by liquid chromatography-tandem mass spectrometry: comparison of results with microbiologic assay. Clinical Chemistry 51:2318–2325.
Fazili Z, Pfeiffer CM, Zhang M, Jain R, Koontz D (2008). Influence of 5,10-methylenetetrahydrofolate reductase polymorphism on whole-blood folate concentrations measured by LC-MS/MS, microbiologic assay, and Bio-Rad radioassay. Clinical Chemistry 54:197–201.
Fenech M, Aitken C, Rinaldi J (1998). Folate, vitamin B12, homocysteine status and DNA damage in young Australian adults. Carcinogenesis 19:1163–1171.
Finch S, Doyle W, Lowe C, Bates CJ, Prentice A, Smithers G, Clarke PC (1998). National Diet and Nutrition Survey: People Aged 65 Years and Over. Volume 1: Report of the Diet and Nutrition Survey. London: The Stationery Office.
Friso S, Choi SW, Girelli D, Mason JB, Dolnikowski GG, Bagley PJ, Olivieri O, Jacques PF, Rosenberg IH, Corrocher R, Selhub J (2002). A common mutation in the 5,10-methylenetetrahydrofolate reductase gene affects genomic DNA methylation through an interaction with folate status. Proceedings of the National Academy of Sciences of the United States of America 99:5606–5611.
Garbis SD, Melse-Boonstra A, West CE, van Breemen RB (2001). Determination of folates in human plasma using hydrophilic interaction chromatography-tandem mass spectrometry. Analytical Chemistry 73:5358–5364.
Gaudet F, Hodgson JG, Eden A, Jackson-Grusby L, Dausman J, Gray JW, Leonhardt H, Jaenisch R (2003). Induction of tumours in mice by genomic hypomethylation. Science 300:489–492.
Giovannucci E (2002). Epidemiologic studies of folate and colorectal neoplasia: a review. Journal of Nutrition 132:2350S–2355S.
Gregory JR, Collins DL, Davies PSW, Hughes JM, Clarke PC (1995). National Diet and Nutrition Survey: Children aged 1½ to 4½ Years. Volume 1: Report of the Diet and Nutrition Survey. London: HMSO.
Gregory J, Lowe S, Bates CJ, Prentice A, Jackson LV, Smithers G, Wenlock R, Farron M (2000). National Diet and Nutrition Survey: Young People Aged 4 to 18 Years. Volume 1: Report of the Diet and Nutrition Survey. London: The Stationery Office.
Gunter EW, Bowman BA, Caudill SP, Twite DB, Adams MJ, Sampson EJ (1996). Results of an international round robin for serum and whole-blood folate. Clinical Chemistry 42:1689–1694.
Hannisdal R, Svardal A, Ueland PM (2008). Measurement of folate in fresh and archival serum samples as p-aminobenzoylglutamate equivalents. Clinical Chemistry 54:665–672.
Hannisdal R, Gislefoss RE, Grimsrud TK, Hustad S, Morkrid L, Ueland PM (2010). Analytical recovery of folate and its degradation products in human serum stored at −25 °C for up to 29 years. Journal of Nutrition 140:522–526.
Hart DJ, Finglas PM, Wolfe CA, Mellon F, Wright AJA, Southon S (2002). Determination of 5-methyltetrahydrofolate (C-13-labeled and unlabeled) in human plasma and urine by combined liquid chromatography mass spectrometry. Analytical Biochemistry 305:206–213.
Hiltunen MO, Turunen MP, Hakkinen TP, Rutanen J, Hedman M, Makinen K, Turunen A-M, Aalto-Setala K, Ylä-Herttuala S (2002). DNA hypomethylation and methyltransferase expression in atherosclerotic lesions. Vascular Medicine 7:5–11.
Honein MA, Paulozzi LJ, Mathews TJ, Erickson JD, Wong LYC (2001). Impact of folic acid fortification of the US food supply on the occurrence of neural tube defects. Journal of the American Medical Association 285:2981–2986.
Jacob RA, Gretz DM, Taylor PC, James SJ, Pogribny IP, Miller BJ, Henning SM, Swendseid ME (1998). Moderate folate depletion increases plasma homocysteine and decreases lymphocyte DNA methylation in postmenopausal women. Journal of Nutrition 128:1204–1212.
Kim YI (2007). Folate and colorectal cancer: an evidence-based critical review. Molecular Nutrition and Food Research 51:267–292.
Kok RM, Smith DEC, Dainty JR, van den Akker JT, Finglas PM, Smulders YM, Jakobs C, de Meer K (2004). 5-methyltetrahydrofolic acid and folic acid measured in plasma with liquid chromatography tandem mass spectrometry: applications to folate absorption and metabolism. Analytical Biochemistry 326:129–138.
Mason JB, Dickstein A, Jacques PF, Haggarty P, Selhub J, Dallal G, Rosenberg IH (2007). A temporal association between folic acid fortification and an increase in colorectal cancer rates may be illuminating important biological principles: a hypothesis. Cancer Epidemiology Biomarkers and Prevention 16:1325–1329.
McKillop DJ, Pentieva K, Daly D, McPartlin JM, Hughes J, Strain JJ, Scott JM, McNulty H (2002). The effect of different cooking methods on folate retention in various foods that are amongst the major contributors to folate intake in the UK diet. British Journal of Nutrition 88:681–688.
McNeil CJ, Beattie JH, Gordon M-J, Pirie LP, Duthie SJ (2011). Differential effects of nutritional folic acid deficiency and moderate hyperhomocysteinemia on aortic plaque formation and genome-wide DNA methylation in vascular tissue from ApoE -/- mice. Clinical Epigenetics 2011:361–368.
McNulty H, Pentieva K, Hoey L, Ward M (2008). Homocysteine, B vitamins and CVD. Proceedings of the Nutrition Society 67:232–237.
Molloy AM, Mills JL, Kirke PN, Whitehead AS, Weir DG, Scott JM (1998). Whole-blood folate values in subjects with different methylenetetrahydrofolate reductase genotypes: differences between the radioassay and microbiological assays. Clinical Chemistry 44:186–188.
Nelson BC, Pfeiffer CM, Margolis SA, Nelson CP (2003). Affinity extraction combined with stable isotope dilution LC/MS for the determination of 5-methyltetrahydrofolate in human plasma. Analytical Biochemistry 313:117–127.
Nelson BC, Pfeiffer CM, Margolis SA, Nelson CP (2004). Solid-phase extraction-electrospray ionization mass spectrometry for the quantification of folate in human plasma or serum. Analytical Biochemistry 325:41–51.
Nelson BC, Sniegoski LT, Welch MJ, Satterfield MB (2005). Analytical quantification of total homocysteine and folate in human serum in relation to cardiovascular disease risk assessment and prevention. FASEB Journal 19:264.
Nelson BC (2007). The expanding role of mass spectrometry in folate research. Current Analytical Chemistry 3:219–231.
Owen WE, Roberts WL (2003). Comparison of five automated serum and whole blood folate assays. American Journal of Clinical Pathology 120:121–126.
Persad VL, Van den Hof MC, Dube JA, Zimmer P (2002). Incidence of open neural tube defects in Nova Scotia after folic acid fortification. Canadian Medical Association Journal 167:241–245.
Pfeiffer CM, Fazili Z, Mccoy L, Zhang M, Gunter EW (2004). Determination of folate vitamers in human serum by stable-isotope-dilution tandem mass spectrometry and comparison with radioassay and microbiologic assay. Clinical Chemistry 50:423–432.
Philips KM, Wunderlick KM, Holden JM, Exler J, Gebhardt SE, Hayrowitz DB, Beecher GR, Doherty RF (2005). Stability of 5-methyltetrahydrofolate in frozen fresh fruits and vegetables. Food Chemistry 92:587–595.
Quinlivan EP, Hanson AD, Gregory JF (2006). The analysis of folate and its metabolic precursors in biological samples. Analytical Biochemistry 348:163–184.
Ruston D, Hoare J, Henderson L, Gregory J (2004). National Diet and Nutrition Survey: Adults Aged 19-64 Years. Volume 4: Nutritional Status (anthropometry and blood analytes, blood pressure and physical activity). London: The Stationery Office.
SACN, Scientific Advisory Committee on Nutrition (2006). Folate and Disease Prevention. London: The Stationery Office.
Sanjoaquin MA, Allen N, Couto E, Roddam AW, Key TJ (2005). Folate intake and colorectal cancer risk: a meta-analytical approach. International Journal of Cancer 113:825–828.
Sauberlich HE (1995). Folate status of U.S. population groups. In: Bailey LB, editor. Folate in Health and Disease. New York: Marcel Dekker. p 171–194.
Schernhammer ES, Giovannucci E, Kawasaki T, Rosner B, Fuchs CS, Ogino S (2010). Dietary folate, alcohol and B vitamins in relation to LINE-1 hypomethylation in colon cancer. Gut 59:794–799.
Shane B, Tamura T, Stokstad ELR (1980). Folate assay – comparison of radioassay and microbiological methods. Clinica Chimica Acta 100:13–19.
Shane B (1995). Folate chemistry and metabolism. In: Bailey LB, editor. Folate in Health and Disease. New York: Marcel Dekker. p 1–22.
Smith DEC, Kok RM, Teerlink T, Jakobs C, Smulders YM (2006). Quantitative determination of erythrocyte folate vitamer distribution by liquid chromatography-tandem mass spectrometry. Clinical Chemistry and Laboratory Medicine 44:450–459.
Splaver A, Lamas GA, Hennekens CH (2004). Homocysteine and cardiovascular disease: biological mechanisms, observational epidemiology and the need for randomised trials. American Heart Journal 148:34–40.
Tamura T (1990). Microbiological assay for folate. In: Picciano MF, Stokstad ELR, Gregory III JF, editors. Folic Acid Metabolism in Health and Disease. New York: Wiley-Liss. p 121–137.
Troen AM, Mitchell B, Sorensen B, Wener MH, Johnston A, Wood B, Selhub J, McTiernan A, Yasui Y, Oral E, Potter JD, Ulrich CM (2006). Unmetabolized folic acid in plasma is associated with reduced natural killer cell cytotoxicity among postmenopausal women. Journal of Nutrition 136:189–194.
Turunen MP, Aavik E, Ylä-Herttuala S (2009). Epigenetics and atherosclerosis. Biochim Biophys Acta 1790:886–891.
van den Berg H, Finglas PM, Bates C (1994). FLAIR intercomparisons on serum and red cell folate. International Journal for Vitamin and Nutrition Research 64:288–293.
Voutilainen S, Lakka TA, Porkkala-Sarataho E, Rissanen T, Kaplan GA, Salonen JT (2000). Low serum folate concentrations are associated with an excess incidence of acute coronary events: the Kuopio Ischaemic Heart Disease Risk Factor Study. European Journal of Clinical Nutrition 54:424–428.
Voutilainen S, Virtanen JK, Rissanen TH, Alfthan G, Laukkanen J, Nyyssonen K, Mursu J, Valkonen V-P, Tuomainen T-P, Kaplan GA, Salonen JT (2004). Serum folate and homocysteine and the incidence of acute coronary events: the Kuopio Ischaemic Heart Disease Risk Factor Study. American Journal of Clinical Nutrition 80:317–323.
Wright AJA, Dainty JR, Finglas PM (2007). Folic acid metabolism in human subjects revisited: potential implications for proposed mandatory folic acid fortification in the UK. British Journal of Nutrition 98:667–675.
Zaina S, Lindholm MW, Lund G (2005). Nutrition and aberrant DNA methylation patterns in atherosclerosis: more than just hyperhomocysteinemia? Journal of Nutrition 153:5–8.
Zhang M (2005). Reevaluation of the traditional microbiologic assay for serum folate measurement by comparison to LC/MS/MC. Clinical Chemistry 51:A194.

14 METABOLOMICS MARKERS IN ACUTE AND ENDURANCE/RESISTANCE PHYSICAL ACTIVITY: EFFECT OF THE DIET

Sonia Medina, Débora Villaño, José Ignacio Gil, Cristina García-Viguera, Federico Ferreres, and Angel Gil-Izquierdo

14.1 INTRODUCTION

Traditionally, studies of exercise metabolism have been designed to measure and compare the concentration of one or a few metabolites in biofluids (plasma, urine) or muscle tissue before and after physical activity. The changes in biomarkers can differ between acute (a single bout of exercise) and endurance/resistance (exercise training) sport practices, depending on the intensity, volume, and duration of the exercise. Acute physical activity (APA) is known to affect human metabolism to a large extent; however, obtaining a global picture of these exercise effects is complicated by the large number of metabolites implicated in the response (Enea et al., 2010). Classically, the levels of some markers (leptin, adiponectin, ghrelin, and cytokines, among others) were used to characterize physical status after a single exercise session, as were the levels of certain hormones (growth hormone, insulin, testosterone, and estrogens) that may influence the carbohydrate and lipid utilization needed to meet energy demands during acute and endurance exercise (Jürimäe et al., 2011). In targeted assays of endurance/resistance physical exercise, cardiovascular risk factor levels decrease owing to weight loss and reduced visceral fat area. However, there is a balanced physiological adaptation to exercise training, with preservation of valvular function (Cederberg et al., 2011; Prakken et al., 2011). Endurance/resistance exercise leads to an increase in growth hormone with lower lactate and cortisol production (upper limits of VO2 and lactate are established at 90.6 mL/min/kg and 4.4 mmol/L), affects parasympathetic nervous system function, and produces no changes in total hemoglobin mass (Burtscher et al., 2011; Daniłowicz-Szymanowicz et al., 2011; Rezaee et al., 2011; Ulrich et al., 2011).

All these targeted experiments have provided partial knowledge of the physiological status of athletes before, during, and/or after exercise. However, the links among all these parameters have been only partially resolved, or are still lacking, because of the complexity of their interconnections and their multiparametric interactions. In addition, the specific weight of these parameters in different sport specialties and training programs (which would allow a better knowledge of the athletes' response to exercise), in the improvement of sport efficiency, and in the health-protecting parameters that prevent mechanical injuries or physiological disorders is still largely unknown. Therefore, in recent years a new analytical approach, metabolomics, has emerged and has been used to investigate the metabolic changes caused by acute and endurance/resistance exercise as a global interaction of multiple physiological pathways. Metabolomics is the measurement of the dynamic changes of all metabolites (the metabolome) without either focusing on or excluding a particular metabolic profile, and it can be used to study how the metabolome varies with exercise. It also identifies exercise-related metabolites and their potential biological function. In this sense, it is important to note the inter-individual variation in the response to exercise and in the subsequent recovery (Kirwan et al., 2009). Metabolomics approaches have also been applied to investigate the effects of exercise on pathological conditions, including cardiovascular disease and diabetes.

Most metabolomic studies have been carried out on blood, serum, or plasma; more recently, however, they have been performed on urine, because it can be obtained noninvasively and reflects the physical-activity-related changes occurring in muscle (Pechlivanis et al., 2010). It must be taken into account that metabolomics techniques are very sensitive at every stage (from sample workup to the analytical platforms and bioinformatic tools). For example, the acute stress caused by fear of blood extraction can by itself modify the metabolome results obtained from human volunteers. The design of a metabolomics experiment is therefore key to obtaining useful and reliable data.

Metabolic profiling can provide a picture of human metabolism before, during, and after physical activity. In fact, there is increasing interest in the role and utility of metabolomics in sport, in order to characterize the global metabolome of athletes and to better understand its behavior during physical activity, including nutrition as an additional stressor (Lac et al., 2004; Lewis et al., 2010). For this reason, the aim of this work is to compile the metabolites whose qualitative and quantitative global profile, together with the metabolic pathways involved, may change in athletes as a result of exercise and nutrition, as revealed by metabolomics.

Foodomics: Advanced Mass Spectrometry in Modern Food Science and Nutrition, First Edition. Edited by Alejandro Cifuentes. © 2013 John Wiley & Sons, Inc. Published 2013 by John Wiley & Sons, Inc.

14.2 METABOLOMICS CONSEQUENCES OF PHYSICAL ACTIVITY: METABOLITES AND PHYSIOLOGICAL PATHWAYS AFFECTED

Metabolomics applied to physical exercise can provide significant health benefits for the general population, as well as for disease prevention, helping to confer cardiovascular protection and to diagnose organ dysfunction early (Lewis et al., 2010; Lehmann et al., 2010). Metabolomics therefore offers a robust analytical platform for monitoring athletes' physiological states, and it can be used to identify exercise-related biomarkers and to explore their potential biological function (Lewis et al., 2010; Yan et al., 2009). The identification of exercise markers with metabolic properties is very important for understanding the beneficial effects of physical activity (Lehmann et al., 2010). Table 14.1 shows the metabolites that have been identified in human plasma or urine in different studies. The studies on metabolomics and exercise reported to date are not abundant, and they distinguish between APA and endurance/resistance physical activity (EPA), which differ significantly in the activation and/or modification of metabolic pathways.

14.2.1 Energy Metabolism

During exercise, the metabolism of glucose in skeletal muscle is very active. Alanine plays an important role in the Cahill cycle (alanine–glucose cycle), and changes in alanine levels may indicate that this cycle is modified (Table 14.1) (Yan et al., 2009). APA increases the concentration of lactate, reflecting anaerobic metabolism. This suggests that glycolysis is altered by physical activity. Lewis et al. (2010) detected plasma metabolic changes in pathways not previously associated with physical exercise, and they identified metabolites such as niacinamide, which enhances insulin release and improves glycemic control (Table 14.1). In the same study, in which samples from skeletal muscle biopsies were analyzed, they found increased levels of intermediates of the tricarboxylic acid (TCA) cycle (succinate, malate, and fumarate), indicating an increase in aerobic energy production in cardiac and skeletal muscle. Tissue changes in response to physical exercise can therefore be detected in plasma by metabolomics. Likewise, intermediates of glycogenolysis (3-phosphoglycerate and glucose-6-phosphate) were increased (Table 14.1) (Lewis et al., 2010).

14.2.2 Oxidative Stress

Metabolic differences have been observed between professional athletes and a control group (Yan et al., 2009). In this respect, some authors have observed that glutathione homeostasis plays an important role in the prooxidant–antioxidant balance during physical activity, because glutathione is synthesized in the γ-glutamyl cycle. The levels of cysteine and glutamic acid (intermediates of this cycle) increased, suggesting that intensive exercise (APA) influences the γ-glutamyl cycle (Table 14.1) (Yan et al., 2009). In addition, uric acid oxidation products, such as allantoin, decreased. This compound has been used as an indicator of oxidative stress and fell after APA (Kirwan et al., 2009). Also in relation to oxidative stress, increases in products of adenine nucleotide catabolism were reported, including adenosine monophosphate (AMP), inosine, xanthine, and hypoxanthine (Table 14.1) (Lewis et al., 2010).

TABLE 14.1 Metabolomics Biomarkers and Related Metabolic Pathways Affected by Physical Activity

Related metabolic pathway             Exercise metabolomics biomarkers    References
Glucose metabolism                    Alanine ↑                           Yan et al., 2009
                                      β-D-methyl glucopyranoside ↑
                                      Lactate ↑                           Pechlivanis et al., 2010; Lewis et al., 2010; Yan et al., 2009
                                      Pyruvate ↑                          Pechlivanis et al., 2010
                                      Niacinamide                         Lewis et al., 2010
Oxidative stress                      Cysteine ↑                          Yan et al., 2009
                                      Glutamic acid ↑
Energy metabolism                     Citric acid ↑                       Yan et al., 2009
                                      Succinate ↑                         Lewis et al., 2010
                                      Malate ↑
                                      Fumarate ↑                          Lewis et al., 2010; Pechlivanis et al., 2010
                                      2-oxoglutarate ↑                    Pechlivanis et al., 2010
                                      Formate ↓
                                      Citrate ↓
Lipid metabolism                      Palmitic acid ↓                     Yan et al., 2009
                                      Linoleic acid ↓
                                      Oleic acid ↓
                                      Acetoacetate ↓                      Lewis et al., 2010
                                      Glycerol ↑                          Lewis et al., 2010; Pechlivanis et al., 2010
                                      Acylcarnitines ↑                    Lehmann et al., 2010
Amino acid metabolism                 Valine ↑                            Yan et al., 2009
                                      Glutamine ↑
                                      Alanine ↑                           Lewis et al., 2010; Pechlivanis et al., 2010
                                      Asparagine ↓                        Pohjanen et al., 2007
                                      Glycine ↓                           Pechlivanis et al., 2010
                                      Phenylalanine ↓
                                      Tryptophan ↓
                                      Taurine ↓
                                      Hippurate ↓
Adenine nucleotide catabolism         Adenosine monophosphate ↑           Lewis et al., 2010
                                      Inosine ↑                           Lewis et al., 2010; Pechlivanis et al., 2010
                                      Hypoxanthine ↑
                                      Xanthine ↑                          Lewis et al., 2010
Uric acid oxidation                   Allantoin ↓                         Lewis et al., 2010; Pechlivanis et al., 2010
Glycogenolysis                        3-phosphoglycerate ↑                Lewis et al., 2010
                                      Glucose-6-phosphate ↑
Branched-chain amino acid metabolism  2-hydroxyisovalerate ↑              Pechlivanis et al., 2010
                                      2-oxoisocaproate ↑
                                      2-oxoisovalerate ↑
                                      3-hydroxyisobutyrate ↑
                                      3-methyl-2-oxovalerate ↑
Trimethylamine metabolism             Trimethylamine N-oxide ↓            Pechlivanis et al., 2010

14.2.3 Amino Acid Metabolism

A study carried out by Pechlivanis et al. (2010) described an increase in five products of the catabolism of branched-chain amino acids (BCAA): degradation products of leucine and isoleucine (2-oxoisocaproate and 3-methyl-2-oxovalerate, respectively) and degradation products of valine (2-oxoisovalerate, 2-hydroxyisovalerate, and 3-hydroxyisobutyrate) (Table 14.1). In skeletal muscle, APA may also cause changes in amino acid concentrations, including an increase in the release of glutamine and alanine to promote ammonia metabolism (Table 14.1). The elevated levels of glutamine may be a consequence of a self-regulatory feedback in athletes, since this amino acid is essential for proper immune function (Pechlivanis et al., 2010). In addition, glycine, phenylalanine, tryptophan, taurine, and hippurate decrease significantly because of the physical activity (Table 14.1) (Pechlivanis et al., 2010).

14.2.4 Fatty Acid Metabolism

It is generally accepted that energy expenditure rises and lipolysis is activated as exercise increases, so that serum free fatty acids (FFA) rapidly increase. Pohjanen and coworkers (2007) observed a postexercise increase in glycerol related to fatty acid breakdown (Table 14.1). In the same way, using LC-q-TOF mass spectrometry (liquid chromatography coupled to a quadrupole time-of-flight mass spectrometer), medium- and long-chain acylcarnitines were identified as exercise biomarkers in a study by Lehmann et al. (2010), which may indicate high rates of fatty acid oxidation and low reliance on glycolysis (Table 14.1).

14.3 METABOLOMICS AND PHYSICAL ACTIVITY: EFFECT OF THE DIET

To date, the majority of studies of exercise-induced metabolic effects reported in the literature have relied on classical assays that target specific associated biomarkers. This gives a partial view of the global metabolic landscape and limits the study of cross-influences among different parameters, metabolic pathways, and individual metabolites. Although metabolomics applied to human nutrition and linked to physical exercise is still in its infancy, it can provide a global overview of the interaction pattern of the majority of the metabolites without losing the sensitivity and precision provided by a targeted analysis. The final goals of investigating physical activity and diet by metabolomics are:

• To monitor and detect habitual dietary patterns
• To investigate nutritional effects
• To develop personalized food and exercise regimes

All of these goals are framed within physical exercise for professional purposes or regular training for protection against, prevention of, or rehabilitation from different metabolic disorders. The first step toward success in this type of research is to design and establish a metabolomics strategy able to develop a predictive model by generating high-quality metabolomics data for multiple sample comparison analysis and by applying chemometric and bioinformatic tools. This procedure should be validated in all steps, from creating the original study design to the selection of sample batches for sample preparation and analysis as well as the selection of training and test sets. In this context, Pohjanen and coworkers (2007) developed a multivariate screening strategy for investigating the effects of strenuous physical exercise in human serum. The application of this methodology to the study of nutrition modulation after a strenuous ergometer cycling regime, with four postexercise beverages (water, low carbohydrate and protein, low carbohydrate, and high carbohydrate), revealed that a low-carbohydrate and protein beverage taken immediately after exercise improved the metabolic status of less fit subjects during recovery, which was linked to anti-catabolic effects (decrease in 3-methylhistidine and increase in pseudouridine) (Chorell et al., 2009). This predictive metabolomics approach was sensitive enough to detect imbalanced metabolism (insulin resistance) at an early stage using the cited exercise and nutrition as stressors. In the same way, and from a clinical point of view, another predictive model was able to distinguish healthy subjects from diabetics by means of low-molecular-weight compounds in plasma, independently of gender or exercise, and to detect exercise-induced differences in the metabolite patterns of both healthy and diabetic subjects (Kuhl et al., 2008).
In the context of preventing oxidative stress through nutrition, the human metabolome before, during, and after strenuous aerobic exercise has been studied using high-dose oral intake of N-acetyl-L-cysteine (Lee et al., 2010). This compound is effective at dampening acute episodes of oxidative stress; however, this clinical intervention study is more pharmacological than nutritional. Research on nutritional metabolomics lacks studies using natural or functional foods. To date, banana intake, compared with a 6% carbohydrate drink, during and after a 75 km cycling performance did not modulate the metabolomic profile of inflammation, oxidative stress, and innate immune metabolites generated by the exercise itself (Nieman et al., 2012). This type of plant food is very rich in potassium but very poor in bioactive compounds such as phenolics. However, a promising application of functional foods to physical exercise demonstrated the potential of polyphenol-rich beverages to improve the energy metabolism of athletes during recovery by postexercise rehydration (Miccheli et al., 2009). The metabolomic approach highlighted that the intake of 500 mL of a green tea-based carbohydrate-hydroelectrolyte drink (with 570 mg of polyphenols) modified the metabolic profiles of plasma and urine, with increases of glucose and citrate in plasma and of acetone and 3-OH-butyrate in urine during the rehydration period, as well as decreases in plasma lactate. The authors hypothesized a stimulation of liver glucose production by the fructose content of the drink, and fat oxidation and ketone production due to the green tea ingestion.

14.4 CONCLUDING REMARKS AND FUTURE PERSPECTIVES

Metabolomics is a powerful tool for analyzing novel biomarkers, such as the metabolites of food intake and physical activity and those useful for the early diagnosis of diseases. Despite this, and to the best of our knowledge, the majority of significant exercise-related metabolites described by metabolomics had already been established by experiments following known targeted markers. More effort is required from metabolomics researchers working on human physiopathology in order to obtain deeper knowledge of new endogenous biomarkers affected by different sport specialties, paying special attention to the significant unknown metabolites detected by global metabolic fingerprinting and bioinformatic tools of the human metabolome. Therefore, further metabolomic studies are needed in order to increase the power of this technique in the understanding of the beneficial effects of physical activity through study and analysis of the human metabolic profile. In the same way, and regarding metabolomics and sport, there are very few studies analyzing the metabolic profile of athletes. Some of these studies have differentiated between APA and EPA metabolites, as their identification is very important for understanding the beneficial health effects of different types of sports and training linked to their intensity grade. EPA and average-intensity sports activate lipolysis, glucose metabolism, glutathione homeostasis, insulin release (niacinamide), and the Cahill and TCA cycles in skeletal muscle, and different metabolites are detected by global comparative metabolite profiling. APA generates some additional metabolic changes that differ from those found for EPA, namely changes in the γ-glutamyl cycle, amino acid levels (mainly alanine and glutamine, which are related to immune function), the uric acid oxidation pathway (allantoin), and an increase in adenine nucleotide catabolism.
Metabolomics, a tool novel to the area of physical exercise, allows us to identify potential biomarkers and to increase knowledge of the physiological mechanisms that occur during normal and intense physical activity; metabolomics is therefore very powerful for monitoring athletes’ physiological status before, during, and after exercise.

On the other hand, metabolomics applied to human nutrition linked to physical exercise is still in its infancy. Some clinical interventions have been conducted using drinks enriched with macronutrients such as carbohydrates and/or proteins. However, some recent studies on the intake of polyphenol-based drinks during or after exercise indicate promising directions for current research.

In conclusion, the metabolomics strategy is having a great impact on the discovery and identification of biomarkers, endogenous or exogenous and irrespective of their origin, with the aim of improving the understanding of changes in physiological mechanisms produced by internal or external factors. To date, metabolomics has provided some additional information about the effects of physical activity. However, further investigation is required, mainly because research has focused on the detection of known metabolites in urine or plasma, but not on new unknown compounds that may help to explain changes in pathways already described or new ones affected. This is the bottleneck of current metabolomics, regardless of the associated topic (disease, nutrition, pharmacology, sport, or others), since it links physiology with the chemical identification of new metabolites in biological samples. Some markers have been established in athletes, but no exclusive markers, such as fingerprint compounds, have been related to a specific sport or physical activity.
From a nutritional point of view, the application of sports drinks enriched with macronutrients and/or micronutrients or plant-based extracts rich in polyphenols provides promising and practical modulation effects on the human metabolome of athletes.

ACKNOWLEDGMENTS

The authors are grateful for the support of the National funding agencies through the Projects AGL2011-23690 (CICYT), CSD007-0063 (CONSOLIDER-INGENIO 2010 ‘Fun-C-Food’), and CSIC 201170E041 (Spanish Ministry of Economy and Competitiveness). They are also grateful to the Fundación Séneca - CARM “Group of Excellence in Research” 04486/GERM/06 and the Ibero-American Programme for Science, Technology and Development (CYTED) – Action 112RT0460 CORNUCOPIA.

REFERENCES

Burtscher M, Nachbauer W, Wilber R (2011). The upper limit of aerobic power in humans. European Journal of Applied Physiology 111:2625–2628.
Cederberg H, Mikkola I, Jokelainen J, Laakso M, Härkönen P, Ikäheimo T, Keinänen-Kiukaanniemi S (2011). Exercise during military training improves cardiovascular risk factors in young men. Atherosclerosis 216:489–495.
Chorell E, Moritz T, Branth S, Antti H, Svensson MB (2009). Predictive metabolomics evaluation of nutrition-modulated metabolic stress responses in human blood serum during the early recovery phase of strenuous physical exercise. Journal of Proteome Research 8:2966–2977.
Daniłowicz-Szymanowicz L, Figura-Chmielewska M, Raczak A, Szwoch M, Ratkowski W (2011). Ocena wpływu długotrwałego intensywnego wysiłku fizycznego na czynność autonomicznego układu nerwowego w grupie sportowców przygotowujących się do startu w zawodach [The assessment of influence of long-term exercise training on autonomic nervous system activity in young athletes preparing for competitions]. Polski Merkuriusz Lekarski 30:19–25.
Enea C, Seguin F, Petitpas-Mulliez J, Boildieu N, Boisseau N, Delpech N, Diaz V, Eugène M, Dugué B (2010). 1H NMR-based metabolomics approach for exploring urinary metabolome modifications after acute and chronic physical exercise. Analytical and Bioanalytical Chemistry 396:1167–1176.
Jürimäe J, Mäestu J, Jürimäe T, Mangus B, Von Duvillard SP (2011). Peripheral signals of energy homeostasis as possible markers of training stress in athletes: a review. Metabolism-Clinical and Experimental 60:335–350.
Kirwan GM, Coffey VG, Niere JO, Hawley JA, Adams MJ (2009). Spectroscopic correlation analysis of NMR-based metabonomics in exercise science. Analytica Chimica Acta 652:173–179.
Kuhl J, Moritz T, Wagner H, Stenlund H, Lundgren K, Båvenholm P, Efendic S, Norstedt, Tollet-Egnell P (2008). Metabolomics as a tool to evaluate exercise-induced improvements in insulin sensitivity. Metabolomics 4:273–282.
Lac G, Maso F (2004). Les marqueurs biologiques pour la surveillance des athlètes à l’entraînement [Biological markers for the follow-up of athletes throughout the training season]. Pathologie Biologie 52:43–49.
Lee R, West D, Phillips SM, Britz-McKibbin P (2010). Differential metabolomics for quantitative assessment of oxidative stress with strenuous exercise and nutritional intervention: thiol-specific regulation of cellular metabolism with N-acetyl-L-cysteine pretreatment. Analytical Chemistry 82:2959–2968.
Lewis GD, Farrell L, Wood MJ, Martinovic M, Arany Z, Rowe GC, Souza A, Cheng S, McCabe EL, Yang E, Shi X, Deo R, Roth FP, Asnani A, Rhee EP, Systrom DM, Semigran MJ, Vasan RS, Carr SA, Wang TJ, Sabatine MS, Clish CB, Gerszten RE (2010). Metabolic signatures of exercise in human plasma. Science Translational Medicine 2:1–10.
Lehmann R, Zhao X, Weigert C, Simon P, Fehrenbach E, Fritsche J, Machann J, Schick F, Wang J, Hoene M, Schleicher ED, Häring HU, Xu G, Niess AM (2010). Medium chain acylcarnitines dominate the metabolite pattern in humans under moderate intensity exercise and support lipid oxidation. PLoS ONE 5:1–12.
Miccheli A, Marini F, Capuani G, Miccheli AT, Delfini M, Di Cocco ME, Puccetti C, Paci M, Rizzo M, Spataro A (2009). The influence of a sports drink on the postexercise metabolism of elite athletes as investigated by NMR-based metabolomics. Journal of the American College of Nutrition 28:553–564.
Nieman DC, Gillitt ND, Henson DA, Sha W, Shanely RA, Knab AM, Cialdella-Kam L, Jin F (2012). Bananas as an energy source during exercise: a metabolomics approach. PLoS ONE 7:1–7.
Pechlivanis A, Kostidis S, Saraslanidis P, Petridou A, Tsalis G, Mougios V, Gika HG, Mikros E, Theodoridis GA (2010). 1H NMR-based metabonomic investigation of the effect of two different exercise sessions on the metabolic fingerprint of human urine. Journal of Proteome Research 9:6405–6416.
Pohjanen E, Thysell E, Jonsson P, Eklund C, Silfer A, Carlsson IB, Lundgren K, Moritz T, Svensson MB, Antti H (2007). A multivariate screening strategy for investigating metabolic effects of strenuous physical exercise in human serum. Journal of Proteome Research 6:2113–2120.
Prakken NH, Velthuis BK, Bosker AC, Mosterd A, Teske AJ, Mali WP, Cramer MJ (2011). Relationship of ventricular and atrial dilatation to valvular function in endurance athletes. British Journal of Sports Medicine 45:178–184.
Rezaee S, Kahrizi S, Hedayati M (2011). Comparison of acute hormonal responses between resistance, endurance and endurance-resistance exercise in healthy young men. Journal of Physiology and Pharmacology 14:445–457.
Ulrich G, Bärtsch P, Friedmann-Bette B (2011). Total haemoglobin mass and red blood cell profile in endurance-trained and non-endurance-trained adolescent athletes. European Journal of Applied Physiology 111:2855–2864.
Yan B, Jiye A, Wang G, Lu H, Huang X, Liu Y, Zha W, Hao H, Zhang Y, Liu L, Gu S, Huang Q, Zheng Y, Sun J (2009). Metabolomic investigation into variation of endogenous metabolites in professional athletes subject to strength-endurance training. Journal of Applied Physiology 106:531–538.

15 MS-BASED OMICS EVALUATION OF PHENOLIC COMPOUNDS AS FUNCTIONAL INGREDIENTS

Débora Villaño, Sonia Medina, José Ignacio Gil, Cristina García-Viguera, Federico Ferreres, Francisco A. Tomás-Barberán, and Angel Gil-Izquierdo

15.1 INTRODUCTION

In the last two decades, the concept of nutrition has evolved from ensuring adequate intake and preventing deficiencies to investigating how diet components can contribute to improving health and reducing the risk of disease. Societal changes in developed countries have been accompanied by an improvement in quality of life and life expectancy, while consumers seem to be more concerned about health maintenance through dietary habits. These changes have contributed to an increased demand for safer, healthier foods that provide benefits beyond their nutritional content, the so-called functional foods. This has opened up new opportunities for the food industry to present claims about the health benefits of their foods or particular ingredients, with large differences in the quality of evidence.

The Concerted Action FUFOSE (Functional Food Science in Europe) gave a working definition of functional food as “a food that beneficially affects one or more target functions in the body beyond adequate nutritional effects in a way that is relevant to either an improved state of health and well-being and/or reduction of risk of disease” (Diplock et al., 1999). Another Concerted Action, PASSCLAIM (Process for the Assessment of Scientific Support for Claims on Foods), contributed guidance on the use of markers to provide a scientific justification for claims relating to food functionality (Richardson et al., 2003). This guidance has been articulated in the global regulatory legislation through Regulation (EC) No 1924/2006 on nutrition and health claims made on foods. The regulation establishes the general principles for scientific substantiation as well as the requisites for nutritional claims and health claims in order to protect consumers from false or misleading claims as well as to stimulate scientific research and innovation in the industry (Regulation (EC) No 1924/2006).

Among the different criteria for scientific substantiation, the study of food functionality implies the use of markers to monitor food consumption and link them with a health/disease outcome. We can identify markers of exposure to a food component, markers of a target function or biological response, markers of an intermediate endpoint (a key stage in development that is unequivocally linked to the endpoint), and final endpoints (an improved state of health or reduction of disease) (Biesalski et al., 2011). Among the different PASSCLAIM criteria, it was noted that markers should be both methodologically valid (accurate, precise, and robust with respect to their analytical characteristics) and biologically valid (they should show a clear relationship to the final outcome, and their variability within the target population must be known). These markers should be feasible, valid, reproducible, sensitive, and specific (Biesalski et al., 2011). Omics technologies, as proposed by Foodomics, can be valuable tools in the development of these valid markers. Foodomics has been defined as “a new discipline that studies the food and nutrition domains through the application of advanced omics technologies to improve consumer well-being, health and confidence” (Cifuentes, 2009). The definition includes omics technologies such as transcriptomics, proteomics, and metabolomics for the study of food functionality and its effect on health. This chapter focuses on metabolomic approaches applied in human dietary intervention studies with polyphenol-rich foods.

Epidemiological evidence suggests that diets rich in plant foods protect against degenerative diseases characterized by high oxidative stress conditions (Valko et al., 2007). This beneficial effect has been partially linked to their high content of polyphenols, which exert a wide variety of actions, including antioxidant mechanisms (Razquin et al., 2009). Polyphenols therefore offer an opportunity to develop functional foods that modulate target functions related to oxidative stress conditions in order to preserve the integrity and functional activity of DNA, proteins, and fatty acids.

Foodomics: Advanced Mass Spectrometry in Modern Food Science and Nutrition, First Edition. Edited by Alejandro Cifuentes. © 2013 John Wiley & Sons, Inc. Published 2013 by John Wiley & Sons, Inc.

15.2 USE OF METABOLOMICS IN NUTRITIONAL TRIALS

Metabolomics aims to provide knowledge about the profile of all the metabolites present in a biological sample (the metabolome). The metabolome has been defined as “the full set of endogenous or exogenous low molecular weight metabolic entities of approximately <1000 Da (metabolites), and the small pathway motifs that are present in a biological system (cell, tissue, organ, organism, or species)” (Trujillo et al., 2006).

The study of the human metabolome involves hundreds of metabolites with chemical diversity, dynamic ranges of concentration, and complex interrelated pathways. These challenges can be managed by different analytical approaches. Metabolomic studies use high-throughput technologies mainly based on 1H nuclear magnetic resonance spectroscopy (1H-NMR) and on mass spectrometry (MS), which is most often coupled with separation techniques such as GC and LC (LC–MS, GC–MS) and, more recently, with capillary electrophoresis (CE–MS). 1H-NMR has been used for the identification of low molecular weight compounds. Sample extracts are dissolved in a deuterated solvent and the resonance frequencies of the hydrogen nuclei in each metabolite are measured. The advantages of NMR are that it involves minimal sample preparation, shows high reproducibility, and is useful in the identification of macromolecules (DNA, proteins). However, each metabolite gives multiple signals from its different hydrogen atoms, and because these signals usually overlap, compound identification may not be possible; in general, NMR is less sensitive than MS (Becker et al., 2012).

MS is more sensitive and selective in the identification and quantification of low to high molecular weight metabolites, with detection levels in the order of ng/mL to pg/mL. In MS techniques, metabolites are identified using their exact molecular mass and molecular formula by comparison with commercial standards as well as with databases such as the Human Metabolome Database, ChemSpider, or MassBank. In the case of polyphenols, GC–MS has been specifically applied to the quantification of phenolic acids that result from polyphenol degradation by gut microbiota (targeted metabolomics) (Van Dorsten et al., 2010).
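The exact-mass comparison described above can be illustrated with a short calculation. The following is a minimal sketch, not the procedure of any study cited here: it assumes a negative-ion [M-H]- measurement, a 5 ppm tolerance, and a small hypothetical in-memory candidate list (the names `CANDIDATES`, `match_candidates`, and the measured m/z value are invented for illustration). A real workflow would query one of the databases named above and confirm the assignment against a commercial standard.

```python
import re

# Monoisotopic masses (Da) of the most abundant isotope of each element.
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503207, "N": 14.0030740048,
                "O": 15.99491461956, "S": 31.97207100}
PROTON = 1.007276466812  # proton mass (Da), used for [M-H]- and [M+H]+ ions


def monoisotopic_mass(formula: str) -> float:
    """Sum isotope masses for a simple formula such as 'C9H9NO3'."""
    total = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += MONOISOTOPIC[element] * (int(count) if count else 1)
    return total


def ppm_error(measured_mz: float, theoretical_mz: float) -> float:
    """Relative mass error in parts per million."""
    return (measured_mz - theoretical_mz) / theoretical_mz * 1e6


def match_candidates(measured_mz, candidates, tol_ppm=5.0, mode="negative"):
    """Return (name, ppm error) pairs whose ion m/z lies within tol_ppm."""
    shift = -PROTON if mode == "negative" else PROTON
    hits = []
    for name, formula in candidates.items():
        theoretical = monoisotopic_mass(formula) + shift
        err = ppm_error(measured_mz, theoretical)
        if abs(err) <= tol_ppm:
            hits.append((name, round(err, 2)))
    return hits


# Hypothetical candidate list (assumption, not taken from a database export).
CANDIDATES = {
    "hippuric acid": "C9H9NO3",
    "vanillic acid": "C8H8O4",
    "homovanillic acid": "C9H10O4",
}

# Hypothetical measured m/z in negative ionization mode.
print(match_candidates(178.0510, CANDIDATES))
```

A mass match alone narrows the list of candidate formulas; it does not distinguish isomers, which is why comparison with standards remains part of the identification.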
The metabolome is influenced by age and gender; furthermore, we can differentiate the endogenous metabolome (the metabolites produced by the organism in tissues and cells), the microbiota metabolome (those metabolites produced by colonic microbiota from xenobiotics and dietary compounds), and the food metabolome (the metabolites derived from digestion of exogenous food compounds). Metabolomic tools enable the study of metabolome fluctuations due to dietary patterns at a molecular level. The study of the food metabolome can provide markers of short- or long-term dietary intake that are more reliable than subjective dietary records (Primrose et al., 2011). Dietary assessment is currently based on food frequency questionnaires and 24-h recalls, but the accuracy of these methods is uncertain (Tucker, 2007). Cultivars, food processing, and storage introduce high variability in the phytochemical composition of plant-based foods. The metabolomic approach reduces these limitations. On the other hand, endogenous metabolites are susceptible to change due to dietary components, and their metabolomic study can be useful to provide markers of effect.

To date, although human nutritional intervention studies using a metabolomics approach are scarce, attempts have been made to search for markers of intake of polyphenol-rich foods (see Table 15.1). Dietary polyphenols have poor bioavailability and undergo extensive phase II metabolism in the small intestine and liver, where they are conjugated with glucuronide, sulfate, and methyl moieties to facilitate their elimination in urine. A major fraction of polyphenols reaches the colon, where it is degraded by intestinal microbiota into phenolic acids. Therefore, metabolomic strategies in the study of dietary polyphenols must consider both host metabolism and microbiome metabolism. As a first step, a chemical characterization of the phenolic composition of the food must be performed. It is commonly followed by a pilot study for the identification of the profile of polyphenol-derived metabolites in biological samples. After that, a comparative placebo-controlled study is performed for a targeted analysis of polyphenol metabolites and to determine changes in the metabolome.

TABLE 15.1 Human Intervention Metabolomics Studies on Polyphenol-Rich Foods

Subjects: 16 healthy (12 treatment, 4 placebo)
Dietary intervention: Placebo-controlled study; 1-d treatment; 10 capsules of almond skin extract (350 mg/capsule); 221 mg gallic acid/g extract, 315 mg cyanidin/g extract
Analytical techniques: LC-ESI-MS/MS
Samples: 24-h urine
Changes of markers: Increases in epicatechin-sulfate, naringenin-O-glucuronide, 5-hydroxymethoxyphenyl-γ-valerolactone sulfate
References: Bartolomé et al., 2010

Subjects: 24 healthy (12 treatment, 12 placebo)
Dietary intervention: Placebo-controlled study; 1-d treatment; 10 capsules of almond skin extract (350 mg/capsule); 221 mg gallic acid/g extract, 315 mg cyanidin/g extract
Analytical techniques: LC-q-TOF
Samples: Urine in interval periods (-2–0 h; 0–2 h; 2–6 h; 6–10 h; 10–24 h)
Changes of markers: Increases in conjugates of hydroxyphenylvaleric, hydroxyphenylpropionic, and hydroxyphenylacetic acids
References: Llorach et al., 2010a

Subjects: 17 healthy men
Dietary intervention: Randomized, cross-over study; 2-d intervention; green tea or black tea (doses similar to 12 cups a day)
Analytical techniques: NMR
Samples: 24-h urine
Changes of markers: Both: increase in hippuric acid, 1,3-dihydroxyphenyl-2-O-sulfate. Green tea: increases in citric acid cycle intermediates
References: Van Dorsten et al., 2006

Subjects: 20 healthy (10 men, 10 women)
Dietary intervention: Pu-erh tea; dose equivalent to 5 cups; 2-wk treatment
Analytical techniques: UPLC-Q-TOF-MS, GC-TOF-MS
Samples: 24-h urine; pre- and postintervention
Changes of markers: Increases in kaempferol, epigallocatechin, and caffeine. Increases in metabolites hippuric acid, 1,7-dimethyluric acid, 7-methylhypoxanthine. Changes in endogenous metabolites: increases in 4-methoxyphenylacetic acid, inositol, 5-hydroxytryptophan; decreases in 3-chlorotyrosine, 2-aminobenzoic acid, and 2,5-dihydroxy-1H-indole
References: Xie et al., 2012

Subjects: 26 healthy (15 men, 11 women)
Dietary intervention: Double-blind, randomized, placebo-controlled cross-over study; 4-wk intervention; Group 1: mix 2:1 red wine and red grape juice extracts (MIX); Group 2: placebo
Analytical techniques: GC–MS
Samples: Urine, fecal water, plasma
Changes of markers: Urine: increases in 3-hydroxyphenylacetic acid, 4-hydroxyphenylacetic acid, vanillic acid, homovanillic acid, hippuric acid. Feces: increases in 3-hydroxybenzoic acid, 4-hydroxybenzoic acid, 4-hydroxyphenylacetic acid, 4-hydroxyphenyl-propionic acid
References: Grün et al., 2008

Subjects: 39 mildly hypertensive patients (23 men, 16 women)
Dietary intervention: Double-blind, two groups, placebo-controlled treatments; 4-wk treatment period; Group 1: mix 2:1 red wine and red grape juice extracts (MIX) and placebo; Group 2: red grape juice dry extract (GJX) and placebo (both containing 800 mg GAE)
Analytical techniques: NMR
Samples: Fecal samples; pre- and postintervention
Changes of markers: MIX: reduction of isobutyrate concentrations with respect to placebo
References: Jacobs et al., 2008

Subjects: 58 mildly hypertensive patients (33 men, 25 women)
Dietary intervention: Double-blind, two groups, placebo-controlled treatments; 4-wk treatment period; Group 1: mix 2:1 red wine and red grape juice extracts (MIX) and placebo; Group 2: red grape juice dry extract (GJX) and placebo (both containing 800 mg GAE)
Analytical techniques: NMR, GC–MS
Samples: 24-h urine; pre- and postintervention
Changes of markers: MIX: significant 35% increase of hippuric acid in urine. Markers of MIX intake: 3-hydroxyphenylpropionic acid, 3,4-dihydroxyphenylpropionic acid, and pyrogallol. Markers of GJX intake: 4-hydroxyphenylacetic acid, homovanillic acid, dihydroferulic acid, and phenylacetylglutamine. Changes in endogenous pathways of metabolism of catecholamines (increase in vanillylmandelic acid and homovanillic acid), phenylalanine (increase in phenylacetylglutamine), and tyramine (increase in 4-hydroxymandelic acid). Increases in TCA cycle intermediates (iso-citrate, cis-aconitate, oxaloacetate)
References: Van Dorsten et al., 2010

Subjects: 35 healthy men
Dietary intervention: Placebo-controlled, randomized, cross-over study; 5-d period; MIX; placebo
Analytical techniques: GC–MS, NMR
Samples: 24-h urine, plasma
Changes of markers: MIX: increases in syringic acid, 3-hydroxyhippuric acid, pyrogallol, 3-hydroxyphenylacetic acid, 3-hydroxyphenylpropionic acid. Changes in amino acid metabolites: increases of indole-3 acetic acid, glucose-1-phosphate, sucrose, nicotinic acid, 1-methylhistidine; decreases in 3-indoxylsulfuric acid, p-cresol sulfate, 3,4-dihydroxyphenylglycol
References: Jacobs et al., 2012

Subjects: 10 healthy volunteers (5 men, 5 women)
Dietary intervention: 1-d intervention; Group CW: 40 g of cocoa powder with 250 mL of water; Group CM: 40 g of cocoa powder with 250 mL of milk; Group NM: 250 mL of milk as a control
Analytical techniques: LC-q-TOF
Samples: Urine in interval periods
Changes of markers: Both CW and CM: increases in theobromine, epicatechin sulfate, O-methyl epicatechin, vanillic acid. Increases in tyrosine, trigonelline, and hydroxynicotinic acid (metabolites of nicotinic acid)
References: Llorach et al., 2009; Llorach et al., 2010b

Subjects: 42 men and women, metabolic syndrome patients
Dietary intervention: 12-wk period; Group 1: 30 g/day nuts; Group 2: control
Analytical techniques: LC-ESI-Q-TOF-MS
Samples: 24-h urine; pre- and postintervention
Changes of markers: Untargeted approach: increase in pyrogallol sulfate, urolithin A glucuronide, urolithin A sulfate, p-coumaroyl derivatives, metabolites from fatty acids (dodecanedioic acid), serotonin metabolites (N-acetyl-serotonine sulfate, hydroxyindole acetic acid). Targeted approach: increase in glucuronides of urolithins A, B, C, and D
References: Tulipani et al., 2011; Tulipani et al., 2012

15.3 STATISTIC TOOLS IN NUTRITIONAL METABOLOMICS

Evaluation of metabolomic data requires multivariate data analysis. As a first step, a Principal Component Analysis (PCA) is commonly applied, which reduces the complex multidimensional data to a smaller number of variables, the principal components (PCs). These PCs are linear combinations of the original variables that maximize sample variation. The data can be visualized in two types of plots: score plots, where each point represents an individual sample, and loading plots, where each point represents an MS reading (retention time and m/z for each analyte) or a single 1H NMR spectral region.

As a second step, Partial Least Squares Discriminant Analysis (PLS-DA) is typically performed to maximize the separation between the predefined sample classes so that the metabolic differences among them are visualized. The quality of the models is judged by the goodness-of-fit parameter (R2) and the predictive ability parameter (Q2). R2 reflects the ability of the model to fit the data, whilst Q2 indicates how well the model predicts class membership; reliability increases as Q2 approaches 1.
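The two quality parameters can be made concrete with the definitions commonly used in chemometrics, R2 = 1 - RSS/TSS (residuals of the model fitted to all samples) and Q2 = 1 - PRESS/TSS (residuals of cross-validated predictions). These formulas are not spelled out in the chapter and are stated here as an assumption. The sketch below does not implement PLS-DA: a one-variable least squares model stands in for it, with leave-one-out cross-validation, and the intensities and 0/1 class labels are invented for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x (stand-in for a PLS-DA model)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return mean_y - slope * mean_x, slope


def r2_q2(xs, ys):
    """Return (R2, Q2): R2 from the full fit, Q2 from leave-one-out predictions."""
    n = len(xs)
    mean_y = sum(ys) / n
    tss = sum((y - mean_y) ** 2 for y in ys)  # total sum of squares

    a, b = fit_line(xs, ys)
    rss = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

    press = 0.0  # predictive residual sum of squares
    for i in range(n):
        # Refit without sample i, then predict the held-out sample.
        a_i, b_i = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (a_i + b_i * xs[i])) ** 2

    return 1 - rss / tss, 1 - press / tss


# Hypothetical data: one metabolite intensity per sample, class coded as 0 or 1.
intensity = [1.0, 1.2, 0.9, 1.1, 2.0, 2.2, 1.9, 2.1]
label = [0, 0, 0, 0, 1, 1, 1, 1]

r2, q2 = r2_q2(intensity, label)
print(f"R2 = {r2:.3f}, Q2 = {q2:.3f}")
```

Because held-out samples are predicted by models that never saw them, Q2 cannot exceed R2 for this kind of model; a large gap between the two is a common warning sign of overfitting.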

15.4 METABOLOMICS FROM CLINICAL TRIALS AFTER INTAKE OF POLYPHENOL-RICH FOODS

In the following subsections, relevant data concerning the ingestion of polyphenol-rich foods using targeted and nontargeted metabolomics tools are detailed and discussed.

15.4.1 Almond Skin Polyphenols as Potential Ingredient for Functional Foods

This approach has been followed in the evaluation of almond skin polyphenols in a targeted analysis of polyphenol metabolites in a human intervention study (Bartolomé et al., 2010). The characterization of almond skin polyphenols was performed using LC-DAD/fluorescence, LC/ESI-MS, and MALDI-TOF-MS technologies as a step prior to studying their bioavailability. Flavan-3-ols were prevalent in almond skins, representing 33–56% of the total compounds identified. The analysis of phenolic metabolites in urine samples was performed by LC-ESI-MS/MS. Samples were first subjected to enzymatic hydrolysis for the discrimination of microbial-derived metabolites. Nonhydrolyzed samples were analyzed for conjugated metabolites of human metabolism. PCA gave a first component (PC1) that explained 37% of the total variance of the data and was more related to metabolites of microbial origin. The second component (PC2) explained 20% of the total variance and was more related to conjugated metabolites. In addition, maximum excretion occurred at different time periods depending on the type of metabolite, related to microbial metabolism (see Table 15.1). A further approach used a validated LC-q-TOF method (Llorach et al., 2010a) to identify conjugates of hydroxyphenyl-valeric, hydroxyphenyl-propionic, and hydroxyphenyl-acetic acids. These metabolites have also been identified after consumption of tea and cocoa, and could represent good biomarkers of flavan-3-ol intake.
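The percentages of variance attributed to PC1 and PC2 above are ratios of each component's variance to the total variance. The following sketch shows that calculation on a small hypothetical intensity matrix (rows are samples, columns are metabolite features). It uses power iteration with deflation on the covariance matrix, which is a simple stand-in chosen for illustration and not the software used in the cited study.

```python
import math


def center_columns(matrix):
    """Subtract each column mean so that every feature has mean zero."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    means = [sum(row[j] for row in matrix) / n_rows for j in range(n_cols)]
    return [[row[j] - means[j] for j in range(n_cols)] for row in matrix]


def covariance_matrix(centered):
    """Sample covariance between features (divides by n - 1)."""
    n_rows, n_cols = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n_rows - 1)
             for j in range(n_cols)] for i in range(n_cols)]


def mat_vec(matrix, vector):
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def leading_eigenpair(cov, iterations=1000):
    """Power iteration: returns the largest eigenvalue and its unit eigenvector."""
    vector = [float(j + 1) for j in range(len(cov))]  # arbitrary non-zero start
    norm = math.sqrt(sum(x * x for x in vector))
    vector = [x / norm for x in vector]
    for _ in range(iterations):
        product = mat_vec(cov, vector)
        norm = math.sqrt(sum(x * x for x in product))
        if norm < 1e-15:  # nothing left to extract
            break
        vector = [x / norm for x in product]
    eigenvalue = sum(a * b for a, b in zip(vector, mat_vec(cov, vector)))
    return eigenvalue, vector


def pca(matrix, n_components=2):
    """Return explained-variance ratio, loadings, and scores per component."""
    centered = center_columns(matrix)
    cov = covariance_matrix(centered)
    size = len(cov)
    total_variance = sum(cov[i][i] for i in range(size))
    components = []
    for _ in range(n_components):
        eigenvalue, loadings = leading_eigenpair(cov)
        scores = [sum(a * b for a, b in zip(row, loadings)) for row in centered]
        components.append({"explained": eigenvalue / total_variance,
                           "loadings": loadings, "scores": scores})
        # Deflation removes the variance already captured by this component.
        cov = [[cov[i][j] - eigenvalue * loadings[i] * loadings[j]
                for j in range(size)] for i in range(size)]
    return components


# Hypothetical intensities: 6 urine samples x 3 metabolite features.
intensities = [
    [2.0, 1.9, 0.5],
    [2.5, 2.4, 0.7],
    [3.1, 3.0, 0.4],
    [3.9, 4.2, 0.6],
    [5.0, 4.8, 0.5],
    [5.6, 5.9, 0.8],
]
components = pca(intensities)
for index, component in enumerate(components, start=1):
    print(f"PC{index}: {100 * component['explained']:.1f}% of total variance")
```

The `scores` list holds one value per sample (the coordinates of a score plot), and `loadings` holds one weight per feature (the coordinates of a loading plot).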

15.4.2 Tea and Sports Drink Enriched in Tea Extract

Green and black tea intake have shown different impacts on endogenous metabolism; PCA demonstrated separate clusters of profiles for green and black tea intake versus a caffeine control in healthy volunteers (Van Dorsten et al., 2006). Green tea significantly increased the urinary levels of citric acid cycle intermediates, which suggests a stimulation of oxidative energy metabolism (Miccheli et al., 2009). This effect is in line with studies that have associated green tea with increased fatty acid oxidation and weight reduction (Rumpler et al., 2001). Attempts have been made to integrate metabolomics and nutrikinetics in order to describe populations with different metabolic phenotypes following a nutritional intervention. A nutrikinetic analysis was performed after a single dose of black tea (800 mg GAE) in 20 healthy men (Van Velzen et al., 2009). Kinetic parameters were calculated assuming a one-compartment model for three main microbial metabolites: hippuric acid, 4-hydroxyhippuric acid, and 1,3-dihydroxyphenyl-2-O-sulfate. The results differentiated between strong and poor metabolizers as well as between fast and slow metabolizers. Studies on Pu-erh tea managed to distinguish between the exogenous tea polyphenols absorbed, the phase II metabolites generated, and the endogenous metabolites altered in response to Pu-erh tea intake (see Table 15.1) (Xie et al., 2012). Changes in endogenous metabolites were in line with the cholesterol- and triglyceride-reducing effects previously shown for Pu-erh tea.
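To make the one-compartment kinetic model mentioned above for the black tea study more concrete, here is a minimal sketch. It assumes first-order appearance and first-order elimination (the classical Bateman function); the source only states that a one-compartment model was used, so this functional form is an assumption. The 800 mg dose echoes the study, but the fraction, volume, and rate constants are hypothetical values chosen for illustration.

```python
import math


def concentration(t, dose, fraction, volume, ka, ke):
    """Concentration at time t (h) with first-order appearance (ka) and elimination (ke).

    Rate constants are in 1/h and must differ (ka != ke).
    """
    scale = fraction * dose * ka / (volume * (ka - ke))
    return scale * (math.exp(-ke * t) - math.exp(-ka * t))


def time_of_maximum(ka, ke):
    """Time at which the concentration peaks, from setting dC/dt = 0."""
    return math.log(ka / ke) / (ka - ke)


def area_under_curve(dose, fraction, volume, ke):
    """Total exposure from time zero to infinity: fraction * dose / (volume * ke)."""
    return fraction * dose / (volume * ke)


# Hypothetical parameters, not taken from the cited study (except the dose).
DOSE_MG = 800.0     # single dose of black tea polyphenols (mg GAE)
FRACTION = 0.1      # assumed fraction converted into the metabolite
VOLUME_L = 40.0     # assumed apparent volume of distribution
KA, KE = 0.25, 0.10  # assumed appearance and elimination rate constants (1/h)

t_max = time_of_maximum(KA, KE)
c_max = concentration(t_max, DOSE_MG, FRACTION, VOLUME_L, KA, KE)
auc = area_under_curve(DOSE_MG, FRACTION, VOLUME_L, KE)
print(f"t_max = {t_max:.2f} h, C_max = {c_max:.3f} mg/L, AUC = {auc:.1f} mg*h/L")
```

Under this reading, differences in total exposure (AUC) would separate strong from poor metabolizers, while differences in timing (for example t_max) would separate fast from slow metabolizers; this mapping is an interpretation, not a statement of the authors' exact procedure.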

15.4.3 Wine and Grape

Metabolomic studies have also been performed on grape-derived polyphenols. A GC–MS method was validated for the study of polyphenol microbiota fermentation products in biological samples such as urine, fecal water, and plasma after a 4-week supplementation with a mix of red wine and red grape juice extracts (MIX) in healthy humans (Grün et al., 2008). The method included enzymatic deconjugation, liquid–liquid extraction, and derivatization prior to GC–MS analysis. Recoveries were in the order of 80% and limits of detection were below 0.1 µg/mL for most phenolics. The method was able to profile the metabolism of polyphenols in different biological fluids (see Table 15.1). As long-term ingestion of polyphenols may modulate the gut microflora, the same authors validated an NMR-based method for the analysis of fecal samples after a 4-week supplementation with the MIX extract, compared to a grape juice extract alone (GJX) and to a placebo, in mildly hypertensive patients (see Table 15.1) (Jacobs et al., 2008). Both MIX and GJX treatments mainly contained oligomeric proanthocyanidins. The metabolite extraction method was validated. Surprisingly, both the MIX and GJX groups showed a quite homogeneous fecal metabolite composition, despite the high inter-individual diversity in gut microbiota; differences were only observed in relative concentrations. The result suggests that different colonic flora share general biochemical characteristics of metabolic patterns. This result is in contrast with urinary outcomes, which in general show high variability from one subject to another. In addition, MIX treatment induced metabolite changes compared to placebo treatment, with a reduction of isobutyrate concentration. The authors hypothesized that this effect could be related to polyphenol-induced inhibition of protein fermentation.
These authors later reported another 4-week double-blind crossover study with a higher number of mildly hypertensive volunteers, including both NMR and GC–MS technologies for urine samples (Van Dorsten et al., 2010). An increase in the urinary excretion of hippuric acid was detected after MIX supplementation by NMR and GC–MS. Phenolic acids, most of them of gut microbial origin, were identified, with syringic acid, 3- and 4-hydroxyhippuric acid, and 4-hydroxymandelic acid as urinary markers of both extracts, compared to placebo. Changes in endogenous urinary metabolites were also observed (see Table 15.1). A recent study has monitored the exogenous and endogenous metabolic effects of a 4-day supplementation with the MIX, in a nontargeted approach (Jacobs et al., 2012). Most metabolites detected (see Table 15.1) originate from gut microbial fermentation of wine and grape polyphenols and are in accordance with previous works (Jacobs et al., 2008; Van Dorsten et al., 2010). MIX treatment had a modulating effect on gut bacterial metabolism of proteins, as changes in urinary concentrations of amino acid metabolites were detected.

15.4.4 Cocoa

The human urine metabolome has also been studied after a nutritional intervention with cocoa (Llorach et al., 2009). Subjects consumed either a single dose of cocoa with milk or water, or milk without cocoa. Urine samples were collected in time periods of 0–6, 6–12, and 12–24 h and were analyzed by liquid chromatography coupled with quadrupole time-of-flight MS (HPLC-q-TOF). Multivariate data analysis was used to integrate the data obtained and cluster samples in an unsupervised way (see Table 15.1). The food matrix did not appear to affect bioavailability, as the model did not discriminate between the milk and water matrices. Instead, differences were observed between the cocoa treatments and milk alone, explainable by the excretion of cocoa metabolites: purine alkaloid metabolites, polyphenol metabolites produced by colonic microbiota as well as by host metabolism, flavor components, nicotinic acid metabolites, and amino acids. Further multivariate statistical analysis of the data with hierarchical clustering (Llorach et al., 2010b) allowed the observed differences to be refined and mass features to be clustered according to the corresponding urine time periods. The correlation matrices obtained could be used to infer metabolite origin.
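The idea of grouping mass features by the similarity of their excretion profiles can be sketched as follows. The example computes Pearson correlations between hypothetical feature intensities and merges features whose correlation exceeds a threshold, which is equivalent to cutting a single-linkage dendrogram at the distance 1 − r. The linkage rule, the threshold, the feature names, and the intensities are all assumptions made for illustration; the cited study does not specify them here.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equally long intensity profiles.

    Profiles must not be constant, otherwise the denominator is zero.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_clusters(profiles, min_r=0.9):
    """Group features connected by correlations >= min_r (single linkage)."""
    names = list(profiles)
    parent = {name: name for name in names}

    def find(name):
        # Follow parent links to the cluster representative, compressing the path.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for i, first in enumerate(names):
        for second in names[i + 1:]:
            if pearson(profiles[first], profiles[second]) >= min_r:
                parent[find(first)] = find(second)

    clusters = {}
    for name in names:
        clusters.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical intensities for two subjects over the 0-6, 6-12, and 12-24 h periods.
profiles = {
    "feature_A": [9.0, 4.0, 1.0, 8.0, 3.5, 1.2],  # excreted early
    "feature_B": [8.5, 4.2, 0.8, 7.6, 3.9, 1.0],  # excreted early
    "feature_C": [1.0, 5.0, 9.0, 1.5, 4.5, 8.0],  # excreted late
    "feature_D": [0.8, 4.6, 9.5, 1.2, 5.0, 8.4],  # excreted late
}
clusters = correlation_clusters(profiles)
print(clusters)
```

Features that fall in the same cluster share a time course, which is one way correlation patterns can hint at a common origin, for example early host-derived conjugates versus later microbial metabolites.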

15.4.5 Nut

A nontargeted metabolomics approach was followed in an intervention trial in which subjects with metabolic syndrome consumed nuts (30 g/day) for a 12-week period, compared to individuals given a control diet (Tulipani et al., 2011). Metabolites were analyzed by LC-Q-TOF-MS and identified with human metabolome databases. After multivariate data analysis it was possible to discriminate between control and nut treatment (Table 15.1). There were 20 potential markers of nut intake, divided into three groups: fatty acid conjugated metabolites (due to the high PUFA content of nuts), phenolic metabolites from ellagitannins (both phase II and microbial-derived), and serotonin metabolites that might derive from the serotonin content of walnuts. The nontargeted approach could help to detect these unexpected markers of nut consumption. When a targeted approach was followed in the same study (Tulipani et al., 2012), different glucuronide isomers of urolithins A, B, C, and D were detected.

15.5 HUMAN METABOLOME IN LOW AND NORMAL POLYPHENOL DIETARY INTAKE

Walsh et al. (2007) investigated the role of dietary phytochemicals in shaping human urinary metabolomic profiles in volunteers following a normal diet (ND), a 2-day low-phytochemical diet (LPD), and a 2-day standardized phytochemical diet (SPD). Intra- and inter-individual metabolic variation was evaluated. Multivariate analysis indicated differences, with higher hippurate excretion with the SPD and ND treatments and higher creatinine and methylhistidine excretion with the LPD. Increased excretion of hippurate has been related to polyphenol consumption, as it derives from the conjugation in the kidney or liver of glycine with benzoic acid produced by colon microbiota (Mulder et al., 2005). Standardization of diet resulted in a reduction of inter-individual variability in urine but not in plasma and saliva.

15.6 CONCLUDING REMARKS AND FUTURE PERSPECTIVES

These studies demonstrate the potential of MS (and NMR) technologies to detect subtle metabolic changes after dietary polyphenol consumption; they can be valuable tools to decipher the complex interrelationships between diet, molecular endogenous processes, and gut microbiota metabolism, necessary for the assessment of the nutritional effects of candidate functional foods in human interventions.

However, to the best of our knowledge, only five polyphenol-rich foods and derived products (including functional foods or future ingredients for these types of foods) have been studied in a metabolomics context. Therefore, nutritional metabolomics, alone or in the more global context described by Foodomics, can be considered to be still in its infancy. Further studies are required to confirm and corroborate these previous studies, from the reproducibility of the results to the design of the corresponding predictive metabolomics models. Current efforts have focused on obtaining new outcomes on polyphenol metabolites as the primary metabolomics compounds useful for determining biomarkers of intake. However, closer attention is required to endogenous metabolites altered as a consequence of the ingestion of polyphenol-rich foods, within the framework of metabolomics. Such work would give mechanistic support to previous results obtained by classical targeted assays linked to health effects after intake of polyphenol-rich foods.

ACKNOWLEDGMENTS

The authors are grateful for the support of the National funding agencies through the Projects AGL2011-23690 (CICYT), CSD007-0063 (CONSOLIDER-INGENIO 2010 ‘Fun-C-Food’), and CSIC 201170E041 (Spanish Ministry of Economy and Competitiveness). They are also grateful to the Fundación Séneca-CARM “Group of Excellence in Research” 04486/GERM/06 and the Ibero-American Programme for Science, Technology and Development (CYTED), Action 112RT0460 CORNUCOPIA.

REFERENCES

Bartolomé B, Monagas M, Garrido I, Gómez-Cordovés C, Martín-Álvarez PJ, Lebrón-Aguilar R, Urpí-Sardá M, Llorach R, Andrés-Lacueva C (2010). Almond (Prunus dulcis (Mill.) D.A. Webb) polyphenols: From chemical characterization to targeted analysis of phenolic metabolites in humans. Archives of Biochemistry and Biophysics 501:124–133.
Becker S, Kortz L, Helmschrodt C, Thiery J, Ceglarek U (2012). LC–MS-based metabolomics in the clinical laboratory. Journal of Chromatography B 883–884:68–75.
Biesalski HK, Aggett PJ, Anton R, Bernstein PS, Blumberg J, Heaney RP, Henry J, Nolan JM, Richardson D, Van Ommen B, Witkamp RF, Rijkers GT, Zöllner I (2011). 26th Hohenheim Consensus Conference, September 11, 2010. Scientific substantiation of health claims: evidence-based nutrition. Nutrition 27:S1–S20.
Cifuentes A (2009). Food analysis and foodomics. Journal of Chromatography A 1216:7109–7110.
Diplock AT, Aggett PJ, Ashwell M, Bornet F, Fern EB, Roberfroid MB (1999). Scientific concepts of functional foods in Europe. Consensus document. British Journal of Nutrition 81:S1–27.
Grün CH, van Dorsten FA, Jacobs DM, Le Belleguic M, van Velzen EJ, Bingham MO, Janssen HG, van Duynhoven JP (2008). GC-MS methods for metabolic profiling of microbial fermentation products of dietary polyphenols in human and in vitro intervention studies. Journal of Chromatography B 871:212–219.
Jacobs D, Deltimple N, Van Velzen E, Van Dorsten FA, Bingham M, Vaughan EE, Van Duynhoven J (2008). 1H NMR metabolite profiling of feces as a tool to assess the impact of nutrition of the human microbiome. NMR in Biomedicine 21:615–626.
Jacobs DM, Fuhrmann JC, Van Dorsten FA, Rein D, Peters S, Van Velzen WJJ, Hollebrands B, Draijer R, Van Duynhoven J, Garczarek U (2012). Impact of short-term intake of red wine and grape polyphenol extract on the human metabolome. Journal of Agricultural and Food Chemistry 60:3078–3085.
Llorach R, Urpí-Sardá M, Jauregui O, Monagas M, Andrés-Lacueva C (2009). An LC-MS-based metabolomics approach for exploring urinary metabolome modifications after cocoa consumption. Journal of Proteome Research 8:5060–5068.
Llorach R, Garrido I, Monagas M, Urpí-Sardá M, Tulipani S, Bartolomé B, Andrés-Lacueva C (2010a). Metabolomics study of human urinary metabolome modifications after intake of almond (Prunus dulcis (Mill.) D.A. Webb) skin polyphenols. Journal of Proteome Research 9:5859–5867.
Llorach R, Jauregui O, Urpí-Sardá M, Andrés-Lacueva C (2010b). Methodological aspects for metabolome visualization and characterization. A metabolomic evaluation of the 24 h evolution of human urine after cocoa powder consumption. Journal of Pharmaceutical and Biomedical Analysis 51:373–381.
Miccheli A, Marini M, Capuani G, Miccheli AT, Delfini M, Di Cocco ME, Puccetti C, Paci M, Rizzo M, Spartaro A (2009). The influence of a sports drink on the postexercise metabolism of elite athletes as investigated by NMR-based metabolomics. Journal of the American College of Nutrition 28:552–564.
Mulder TP, Rietveld AG, van Amelsvoort JM (2005). Consumption of both black tea and green tea results in an increase in the excretion of hippuric acid into urine. American Journal of Clinical Nutrition 81:256S–260S.
Primrose S, Draper J, Elsom R, Kirkpatrick V, Mathers JC, Seal C, Beckmann M, Haldar S, Beattie JH, Lodge JK, Jenab M, Keun H, Scalbert A (2011). Metabolomics and human nutrition. British Journal of Nutrition 105:1277–1283.
Razquin C, Martinez JA, Martinez-Gonzalez MA, Mitjavila MT, Estruch R, Marti A (2009). A 3 years follow-up of a Mediterranean diet rich in virgin olive oil is associated with high plasma antioxidant capacity and reduced body weight gain. European Journal of Clinical Nutrition 63:1387–1393.
Regulation (EC) No 1924/2006 of the European Parliament and of the Council of 20 December (2006) on nutrition and health claims made on foods.
Richardson DP, Afftersholt T, Asp NG, Bruce A, Grossklaus R, Howlett J, Pannemans D, Ross R, Verhagen H, Viechtbauer V (2003). PASSCLAIM-synthesis and review of existing processes. European Journal of Nutrition 42:I96–111.
Rumpler W, Seale J, Clevidence B, Judd J, Wiley E, Yamamoto S, Komatsu T, Sawaki T, Ishikura Y, Hosoda K (2001). Oolong tea increases metabolic rate and fat oxidation in men. Journal of Nutrition 131:2848–2852.
Tulipani S, Llorach R, Jauregui O, López-Uriarte P, García-Aloy M, Bulló M, Salas-Salvadó J, Andrés-Lacueva C (2011). Metabolomics unveils urinary changes in subjects with metabolic syndrome following 12-week nut consumption. Journal of Proteome Research 10:5047–5058.
Tulipani S, Urpí-Sardá M, García-Villalba R, Rabassa M, López-Uriarte P, Bulló M, Jauregui O, Tomás-Barberán F, Salas-Salvadó J, Espín JC, Andrés-Lacueva C (2012). Urolithins are the main urinary microbial-derived phenolic metabolites discriminating a moderate consumption of nuts in free-living subjects with diagnosed metabolic syndrome. Journal of Agricultural and Food Chemistry dx.doi.org/10.1021/jf301509w.
Trujillo E, Davis C, Millner J (2006). Nutrigenomics, proteomics, metabolomics and the practice of dietetics. Journal of the American Dietetic Association 106:403–413.
Tucker KL (2007). Assessment of usual dietary intake in population studies of gene-diet interaction. Nutrition, Metabolism & Cardiovascular Diseases 17:74e81.
Valko M, Leibfritz D, Moncol J, Cronin MT, Mazur M, Telser J (2007). Free radicals and antioxidants in normal physiological functions and human disease. International Journal of Biochemistry and Cellular Biology 39:44–84.
Van Dorsten FA, Daykin CA, Mulder TPJ, Van Duynhoven JPM (2006). Metabonomics approach to determine metabolic differences between green tea and black tea consumption. Journal of Agricultural and Food Chemistry 54:6929–6938.
Van Dorsten FA, Grün CH, Van Velzen EJJ, Jacobs D, Draijer R, Van Duynhoven JPM (2010). The metabolic fate of red wine and grape juice polyphenols in humans assessed by metabolomics. Molecular Nutrition and Food Research 54:897–908.
Van Velzen EJJ, Westerhuis JA, Van Duynhoven JPM, Van Dorsten FA, Grün CH, Jacobs DM, Duchateau GSMJE, Vis DIJ, Smilde AK (2009). Phenotyping tea consumers by nutrikinetic analysis of polyphenolic end-metabolites. Journal of Proteome Research 8:3317–3330.
Walsh MC, Brennan L, Pujos-Guillot E, Sébédio J-L, Scalbert A, Fagan A, Higgins DG, Gibney M (2007). Influence of acute phytochemical intake on human urinary metabolomic profiles. American Journal of Clinical Nutrition 86:1687–1693.
Xie G, Zhao A, Zhao L, Chen T, Chen H, Qi X, Zheng X, Ni Y, Cheng Y, Lan K, Yao C, Qiu M, Jia W (2012). Metabolic fate of tea polyphenols in humans. Journal of Proteome Research 11:3449–3457.

16 METABOLOMICS OF DIET-RELATED DISEASES

Marcela A. Erazo, Antonia García, Francisco J. Rupérez, and Coral Barbas

16.1 INTRODUCTION

Diet and lifestyle are potentially modifiable and both are directly related to certain diseases. Epidemiological and clinical studies have concluded that many diseases with high rates of morbidity and mortality worldwide, including cardiovascular disease (heart disease and stroke), diabetes, and some cancers, are related to diet. The term diet-related diseases includes a wide variety of diseases and disorders affecting different organs and systems. There are low-complexity alterations, limited to a single organ or system, such as dental disorders. A second, more complex group, which includes diabetes, atherosclerosis, obesity, osteoporosis, and others, involves systemic effects and associated complications. In addition, cardiovascular disease and cancer are the leading causes of mortality, because they compromise vital organs such as the kidneys, liver, heart, and/or brain, or because of the magnitude of their complications. From the etiological point of view, two types of diet-related diseases can be described: cancer, cardiovascular diseases, and diabetes are related to disorders (mostly excess) in food intake, whereas the lack of nutrients (proteins, vitamins, and minerals) may give rise to specific complications known as deficiency diseases (Fig. 16.1). Nevertheless, it is not possible to establish clear boundaries between the factors, the associated disorders, and the diseases, because complex interrelationships can be found among them. For instance, metabolic syndrome is associated with a higher incidence of diabetes, but it is neither a sequential process nor a systematic situation; dietary fiber intake is strongly associated with some types of cancer, and the incidence of cardiovascular diseases (CVDs) has also been related to adequate fiber intake.

FIGURE 16.1 Diet-related diseases.

Foodomics: Advanced Mass Spectrometry in Modern Food Science and Nutrition, First Edition. Edited by Alejandro Cifuentes. © 2013 John Wiley & Sons, Inc. Published 2013 by John Wiley & Sons, Inc.

Food is a complex mixture that does not act on single molecular targets but modulates many biochemical pathways simultaneously, and the mechanisms of diet-related pathogenesis involve complex interactions of different processes. Metabolic disorders such as diabetes not only have a genetic component but are also related to adaptive changes for survival. This gave rise to the thrifty genotype theory, which refers to the changes that our ancestors had to make to improve metabolic efficiency and survive long periods of famine (Neel, 1962). These changes were genetically transmitted, such as the ability to store energy as fat, the promotion of hepatic gluconeogenesis and lipogenesis, and the development of insulin resistance in muscle and hyperinsulinemia. In addition, maternal malnutrition during pregnancy associated with low birth weight (less than 2500 g) is a risk factor for developing cardiovascular disease, hypertension, dyslipidemia, diabetes, atherosclerosis, and impaired glucose tolerance in adulthood; it appears to reflect programming in utero that prepares the offspring for subsequent reduced energy intake (Hales and Barker, 1992; Barker, 1995). In modern life, this means that increased caloric intake associated with decreased physical activity and the presence of thrifty genes, theoretically adapted to enhance energy storage efficiency, will result in metabolic changes leading to weight gain, obesity, metabolic syndrome, and diabetes.

In addition to diabetes and cardiovascular diseases, a recent report (World Cancer Research Fund/American Institute for Cancer Research, 2007) concluded that the risk of major cancers, including those of the mouth, throat, esophagus, lung, breast, endometrium, stomach, colon, and rectum, is modified by food and nutrition (including alcohol), and by physical activity and body composition. Moreover, the National Cancer Institute of the USA estimates that one-third of all cancer deaths may be diet and/or lifestyle related. Furthermore, it has been proposed that chronic diseases of any system of the body, including the nervous system, may be affected by diet and associated factors, although epidemiological research on such diseases is unconvincing. The application of “omics” technology, particularly metabolomics, has revealed the metabolic changes associated with diet-related diseases and also the consequences of diet intervention in a global, untargeted way. Despite this progress, more research on this topic is still needed, focused on the development of ideal dietary models that can elucidate the sequence of events that starts with the interaction between dietary habits and genetic adaptations, continues with the changes induced in metabolism, and ends with the associated diseases, together with the potential therapeutics. Such models would help restore balance in previously unbalanced organisms and expand knowledge of health status.

16.2 ANALYSIS OF THE METABOLOME: METABOLOMICS

The metabolome is the complete set of small molecules (typically, less than 1500 Da) coming from protein activity (anabolism and catabolism) in living systems. There are three major approaches used in metabolomics studies: (i) targeted analysis, (ii) metabolite profiling, and (iii) metabolic fingerprinting. Targeted analysis is the classical analytical approach to measuring metabolites. It is used to measure precisely the concentration of a limited number of known metabolites. Metabolite profiling is the measurement of a set of related metabolites that are either chemically or biochemically correlated. Metabolic fingerprinting does not attempt to identify or precisely quantify all the metabolites in the sample. Rather, it considers a total profile, or fingerprint, as a unique pattern characterizing a snapshot of the metabolism in a particular cell line or tissue. Pattern recognition tools are used to classify the fingerprints and identify the specific features of the profile that are characteristic of each pattern. Metabolic fingerprinting is most useful in biomarker discovery and diagnostics and is the type of analysis that will be considered in this chapter. Due to the large variability in the physico-chemical properties of such analytes, together with the enormous differences in concentrations, there is no single analytical technique that can fulfill all the requirements to provide an adequate signal for all of them. Metabolome analysis is generally conducted through gas chromatography–mass spectrometry (GC–MS), liquid chromatography–mass spectrometry (LC–MS), capillary electrophoresis–mass spectrometry (CE–MS), and/or nuclear magnetic resonance (NMR). Usually, these technologies are applied in a “differential display” mode, that is, by comparing two situations (e.g., diseased vs. healthy) in order to reduce the complexity of the data by examining only differences.
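As a minimal sketch of the “differential display” idea, the example below compares each feature between two groups using a log2 fold change and a Welch t-statistic, then ranks features by the size of the difference. These two statistics are a common choice assumed here for illustration, not a method prescribed by the chapter, and the feature names and intensities are hypothetical.

```python
import math
from statistics import mean, variance


def log2_fold_change(case, control):
    """log2 of the ratio of group means; positive when the case group is higher."""
    return math.log2(mean(case) / mean(control))


def welch_t(case, control):
    """Welch t-statistic, which does not assume equal variances in the two groups."""
    standard_error = math.sqrt(variance(case) / len(case)
                               + variance(control) / len(control))
    return (mean(case) - mean(control)) / standard_error


# Hypothetical intensities of two features in diseased and healthy subjects.
features = {
    "feature_1": {"diseased": [12.1, 11.8, 12.6, 12.3], "healthy": [6.0, 6.4, 5.8, 6.1]},
    "feature_2": {"diseased": [5.1, 4.9, 5.3, 5.0], "healthy": [5.0, 5.2, 4.8, 5.1]},
}

summary = {
    name: {
        "log2_fc": log2_fold_change(groups["diseased"], groups["healthy"]),
        "t": welch_t(groups["diseased"], groups["healthy"]),
    }
    for name, groups in features.items()
}

# Features with the largest absolute t-statistic come first.
ranked = sorted(summary, key=lambda name: abs(summary[name]["t"]), reverse=True)
for name in ranked:
    print(f"{name}: log2 FC = {summary[name]['log2_fc']:.2f}, t = {summary[name]['t']:.1f}")
```

In a real study with thousands of features, such per-feature statistics would need correction for multiple testing before any feature is reported as a candidate marker.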

Findings in a metabolome study will depend on different factors related to the analytical methodology: selection of the sample, sample pretreatment procedure, and analytical instrumentation. Once the most appropriate biological fluid has been selected (or the one available for the study), some general considerations should be taken into account. Briefly, although sensitivity is poorer in NMR, its structural elucidation capabilities are unquestionable; the NMR profile can contain qualitative and quantitative information on hundreds of different small molecules present in the sample. Regarding high-resolution separation techniques, GC is highly appropriate for volatile derivatives of metabolites; amino acids, monosaccharides, fatty acids, disaccharides, and cholesterol can be easily identified in the chromatogram. HPLC shows some advantages, such as minimal sample treatment requirements and minimal alteration or hydrolysis of the metabolites during the analysis. HPLC in reversed-phase mode is especially suited for medium- and low-polarity metabolites but is quite limited for polar metabolites such as sugars or amino acids, while HPLC in hydrophilic interaction chromatography (HILIC) mode and capillary electrophoresis can offer a good separation of polar metabolites. Mass spectrometry-based approaches are inherently more sensitive than NMR techniques, providing access to lower-concentration metabolites. In addition, mass spectrometry detection permits a highly sensitive estimation of metabolite abundance in biofluids and also structural elucidation based on spectral libraries and tandem mass spectrometry.
Regarding new developments, next-generation screening of disease-related metabolomic phenotypes will require monitoring of both metabolite levels and turnover rates (Nemutlu et al., 2012). Stable isotope 18O-assisted 31P NMR and mass spectrometry uniquely allow simultaneous measurement of phosphometabolite levels and turnover rates in tissue and blood samples. The 18O labeling procedure is based on the incorporation of one 18O into inorganic phosphate from [18O]H2O with each act of ATP hydrolysis and the distribution of 18O-labeled phosphoryls among phosphate-carrying molecules. This enables simultaneous recording of ATP synthesis and utilization, phosphotransfer fluxes through adenylate kinase, creatine kinase, and glycolytic pathways, as well as mitochondrial substrate shuttle, urea and Krebs cycle activity, glycogen turnover, and intracellular energetic communication.

16.3 DIET-RELATED DISEASES

16.3.1 Diabetes

Diabetes mellitus (DM) refers to a group of metabolic disorders that present with hyperglycemia. There are distinct types of DM, caused by a wide variety of genetic and environmental factors. The metabolic alterations that can cause hyperglycemia are reduced insulin secretion, decreased glucose utilization, and increased glucose production. The metabolic alterations associated with DM are responsible for pathophysiologic changes in target organs that worsen the prognosis. DM is classified as type 1 when it presents with insulin deficiency and as type 2 when genetic and metabolic defects in insulin action or secretion lead to hyperglycemia. There are other types of DM, including gestational DM and mitochondrial DM. Although the management of all types of diabetes must include careful control of food intake, nutrition disorders are associated only with type 2 diabetes, and most of the data mentioned herein relate to it.

The worldwide prevalence of DM is increasing; the total number of people with diabetes is projected to rise from 171 million in 2000 to 366 million in 2030 (Wild et al., 2004). The principal risk factors are family history of diabetes (i.e., parent or sibling with type 2 diabetes), obesity (BMI ≥25 kg/m2), race (e.g., African American, Latino, Native American, Asian American, Pacific Islander), previously identified impaired fasting glucose (IFG) or impaired glucose tolerance (IGT), hypertension (blood pressure ≥140/90 mmHg), and physical inactivity (ADA, 2003). Chronic complications of DM can affect many organs and are responsible for the morbidity and mortality associated with the disease (Table 16.1).

TABLE 16.1 Chronic Complications of Diabetes Mellitus

Vascular
  Microvascular: retinopathy, neuropathy, nephropathy
  Macrovascular: coronary artery disease, peripheral arterial disease, cerebrovascular disease
Nonvascular: gastroparesis, infections, skin alterations

A prevailing model for the development of diet-induced insulin resistance holds that mitochondrial fatty acid oxidation is inadequate to deal with the large load of dietary fat, thus leading to an accumulation of lipid-derived metabolites such as diacylglycerols (DAGs) and ceramides that can activate stress kinases to interfere with insulin action (Savage et al., 2007). NMR-based metabolomic analysis in conjunction with multivariate statistics was applied (Salek et al., 2007) to examine the urinary metabolic changes in two rodent models of type 2 diabetes mellitus as well as in unmedicated human patients. The db/db mouse and the obese Zucker (fa/fa) rat have autosomal recessive defects in the leptin receptor gene, causing type 2 diabetes. The study demonstrated metabolic similarities among the three species examined, including metabolic responses associated with general systemic stress, changes in the TCA cycle, and perturbations in nucleotide metabolism and in methylamine metabolism.
All three species demonstrated profound changes in nucleotide metabolism, including that of N-methylnicotinamide and N-methyl-2-pyridone-5-carboxamide, which may provide unique biomarkers for following type 2 diabetes mellitus progression. In humans, a group applied nontargeted LC-QTOF-MS analysis during oral glucose tolerance testing of 16 normal individuals and found by multivariate statistical analysis that free fatty acids, acylcarnitines, bile acids, and lysophosphatidylcholines were the most discriminating biomarkers of the glucose bolus (Zhao et al., 2009). A key contribution of metabolomics in diabetes is finding diagnostic markers. By the time diabetes is diagnosed, irreversible pathology is typically present, challenging therapeutic intervention. A reliable test for predicting diabetes risk could allow earlier implementation of intervention measures. Increased blood concentrations of amino acids are now suggested to predict risk of diabetes (Wang et al., 2011), and amino acid profiling might also provide mechanistic insights into this disease. Wang et al. used high-throughput metabolomics to uncover significant associations between the concentrations of five branched-chain (leucine, isoleucine, and valine) and aromatic (phenylalanine and tyrosine) amino acids in blood and predisposition to diabetes. Morbidity and mortality related to complications of DM can be reduced with changes in lifestyle and improved disease control. Genomics, proteomics, and metabolomics may help in providing new biomarkers, and the associations between them, related to general metabolism, glycemia, oxidative stress, nutrient status, lipid management, and endothelial inflammation (McKillop and Flatt, 2011). The goal in metabolomics diabetes research is to improve the search for biomarkers that would permit early diagnosis and adequate follow-up, including early markers of target organ damage.

16.3.2 Cardiovascular Disease

The term CVD covers an extended group of diseases that may include ischemic heart disease (e.g., myocardial angina, heart attack), cerebrovascular disease (e.g., stroke), hypertension, heart failure, and rheumatic heart disease. According to WHO, CVD is currently the most common cause of death, accounting for 16.1 million (30%) of total deaths, and is a major cause of morbidity worldwide. Of these deaths, 7.6 million are due to heart attack and 5.7 million to stroke; in addition, more than 80% of CVD deaths occur in developing countries (WHO, 2008). CVD is a multifactorial disease. Disorders such as metabolic syndrome, impaired glucose tolerance, diabetes, obesity, and dyslipidemia increase the risk of cardiovascular events; likewise, predisposing factors such as nutrient uptake, genetic profile, and environmental factors (alcohol consumption, smoking, sedentary lifestyle) can modify the risk (D’Agostino et al., 2008).

In the early twenty-first century, with the need to understand the pathophysiological processes by which diseases present, the emergence of “omics” technology (proteomics, genomics, lipidomics, and metabolomics) brought the concept of metabolic phenotype. It refers to the set of metabolites in various biofluids that reflect the health of an individual; the metabolic phenotype is genetically defined and modified by external factors such as diet, exercise, exposure to environmental factors, and microbiome action (Holmes et al., 2008). As described below, within the international study of macro–micro nutrients and blood pressure (INTERMAP Study), this metabolic phenotype was studied through the application of metabonomics, and metabolic changes related to diet in different world populations, and their relation to different cardiovascular diseases in the Chinese population, were identified (Yap et al., 2010).
Metabolomics with MS has been used to assess whether metabolites discriminate coronary artery disease (CAD) and predict risk of cardiovascular events (Shah et al., 2010). To evaluate the discriminative capabilities of metabolites for CAD, two groups were profiled: 174 CAD cases and 174 sex/race-matched controls (“initial”), and 140 CAD cases and 140 controls (“replication”). To evaluate the capability of metabolites to predict cardiovascular events, cases were combined (“event” group); of these, 74 experienced death/myocardial infarction during follow-up. A third independent group was profiled (“event-replication” group; n = 63 cases with cardiovascular events, 66 controls). Two principal components analysis-derived factors were associated with CAD: one comprising branched-chain amino acid (BCAA) metabolites and one comprising urea cycle metabolites. A factor composed of dicarboxylacylcarnitines predicted death/myocardial infarction and was associated with cardiovascular events in the event-replication group.

Because of the growing number of cases of CVD, tools that can be used as predictors of CVD are necessary to avoid its devastating effects through personalized prevention. Biomarkers for early diagnosis and for the follow-up of medium- and long-term therapeutic interventions are also relevant. In that sense, metabolomics has been proposed as a tool to determine the risk profile in healthy populations. Bernini et al. examined, in a comprehensive study, the metabolic profile of 864 healthy volunteers; the aim of the research was to identify new biomarkers by investigating the NMR plasma fingerprints and their relationship to metabolites not previously associated with cardiovascular risk.
They found that the subjects classified by common markers as at high or low risk showed a different NMR fingerprint: the subjects with high plasma levels of LDL presented decreased levels of α-ketoglutarate and dimethylglycine; those with high triglyceride levels presented, in addition to the decreased levels of α-ketoglutarate and dimethylglycine, decreased levels of serine; and those with high plasma levels of HDL presented low levels of creatinine and threonine and high levels of 3-hydroxybutyrate. They concluded that discrimination was due to a complex NMR fingerprint composed not only of the common markers (total cholesterol, triglycerides, LDL, HDL) but also of other metabolites not previously associated with CVD that may be related to the biochemistry of cardiovascular risk (Bernini et al., 2011).
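
Several of the studies above (for example, the PCA-derived factors of Shah et al., 2010) reduce many correlated metabolite measurements to a few composite factors. The following is only a minimal sketch of that idea, not the procedure used in any cited study: it extracts the first principal component of a small, invented intensity table by power iteration. The function name, the toy values, and the choice of power iteration are all assumptions made for illustration.

```python
import math


def first_principal_component(samples, iterations=200):
    """Return (loadings, scores) of the first principal component.

    samples: list of rows, one per subject, each a list of metabolite
    intensities (hypothetical values; not data from any cited study).
    """
    n = len(samples)
    p = len(samples[0])
    # Mean-center every metabolite (column).
    means = [sum(row[j] for row in samples) / n for j in range(p)]
    centered = [[row[j] - means[j] for j in range(p)] for row in samples]
    # Sample covariance matrix (p x p).
    cov = [[sum(row[i] * row[j] for row in centered) / (n - 1)
            for j in range(p)] for i in range(p)]
    # Power iteration: repeated multiplication by the covariance matrix
    # converges to its dominant eigenvector (the PC1 loadings).
    start = [1.0 / (j + 1) for j in range(p)]
    start_norm = math.sqrt(sum(x * x for x in start))
    v = [x / start_norm for x in start]
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # degenerate case: no variance along v
            break
        v = [x / norm for x in w]
    # Project each centered sample onto the loadings to obtain its score.
    scores = [sum(row[j] * v[j] for j in range(p)) for row in centered]
    return v, scores


# Toy table: metabolites 1 and 2 vary together, metabolite 3 barely varies.
profiles = [
    [1.0, 2.0, 5.0],
    [2.0, 4.0, 5.1],
    [3.0, 6.0, 4.9],
    [4.0, 8.0, 5.0],
]
loadings, scores = first_principal_component(profiles)
print([round(x, 3) for x in loadings])
print([round(x, 3) for x in scores])
```

In this toy table the first two metabolites dominate the loadings, which is the sense in which a factor can be said to "comprise" a group of related metabolites; real studies typically scale the variables and use established statistical software.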

16.3.2.1 Ischemic Heart Disease

Ischemic heart disease (IHD) is the term used to group the diseases characterized by inadequate blood supply to a portion of the myocardium, leading to an imbalance between the supply and consumption of oxygen by the tissue. The main cause of myocardial ischemia is atherosclerotic disease of the coronary arteries, which supply the myocardium. In developed countries, IHD is the leading cause of death and disability and generates high economic costs for care and long-term rehabilitation of sufferers. Its incidence is increasing not only in developed countries but also in developing countries, where it has been estimated that between 1990 and 2020 it will rise by 120% in women and 127% in men (Yusuf et al., 2001). This increased incidence has been linked to environmental factors such as high-fat diet, smoking, and sedentary lifestyle; conditions such as obesity, hyperlipidemia, type 2 diabetes mellitus, and insulin resistance increase the risk (Yusuf et al., 2001; WHO, 2008).

Sabatine et al. described a metabolic profiling strategy for identification of novel biomarkers of myocardial ischemia based on several reversed- and normal-phase HPLC methods coupled to a triple quadrupole MS detector. A total of 477 parent/daughter ion pairs were monitored through six selected reaction monitoring (SRM) experiments for each sample. They investigated whether metabolic profiling could be used to accurately distinguish patients with ischemia from those without it. A metabolic ischemia risk score was created based on differences in some metabolites before and after exercise stress testing in 18 patients and 18 controls. The score yielded a highly statistically significant relation to the probability of ischemia (p < .0001). The authors recommended testing in larger cohorts for further confirmation (Sabatine et al., 2005).

Metabolomics has aided the study of ischemic heart disease by allowing identification of biomarkers useful for diagnosis. Vallejo et al. (2009) used GC–MS to evaluate the metabolic changes associated with acute coronary syndrome and atherosclerosis in a group of 29 subjects: 9 with acute coronary syndrome without ST segment elevation (NSTEACS), 10 with stable atherosclerotic disease of the carotid, and 10 controls. Clear differences were obtained between the profiles of cases and controls, with a significant decrease in the levels of citric acid in NSTEACS patients, in agreement with the results of Sabatine et al. mentioned previously. Hydroxyproline was also found to be decreased in the NSTEACS patients, which could be related to atheromatous plaque instability and increased risk of coronary heart disease. Lin et al. (2009) studied silent myocardial ischemia (SMI), a form of coronary heart disease, in a group of 39 human adults and 25 controls by ultra-performance liquid chromatography coupled with quadrupole time-of-flight MS (UPLC-Q-TOF-MS).
They found that differences in the plasma concentrations of four kinds of phospholipids were closely related to the occurrence of SMI; among these, 1-linoleoyl glycerophosphocholine (C18:2) was significantly decreased in the SMI population. The plasma phospholipid changes preceded enzymatic alterations in SMI and might therefore be a useful complementary reference to facilitate SMI diagnosis.

Metabolite fingerprinting is also useful to evaluate the prognosis of patients as well as the effect of the therapy instituted, whether lifestyle modification or pharmacological management. Teul et al. (2011) conducted a study of the metabolomic pattern of patients with acute coronary syndrome (ACS) at 0 days, 4 days, 2 months, and 6 months using GC–MS and found 27 metabolites that differed significantly from controls. The fatty acid profile was studied in parallel, showing that free fatty acids are increased in the first hours after ACS and that their levels decrease markedly afterward. Increased levels of isocitrate, malate, and succinate at different time points were also found and interpreted as a consequence of decreased activity of malate decarboxylase.

16.3.2.2 Cerebrovascular Disease

Cerebrovascular disease (CBVD) is one of the major health problems around the world: it is responsible for approximately 5.7 million deaths annually (WHO, 2008) and carries a high social burden because of the great deterioration in the quality of life of sufferers. CBVDs are the result of problems with the blood vessels inside the brain; the most common types of CBVD are stroke, transient ischemic attack (TIA), subarachnoid hemorrhage, and vascular dementia. The major risk factors for CBVD are high blood pressure, diabetes, high cholesterol, obesity, alcohol consumption, fatty diet, and smoking.

So far there are no tools that allow early diagnosis, and the pathophysiology of CBVD is not completely understood; the use of metabolomics in the study of CBVD has helped to clarify its pathogenesis. Hattori et al. (2010) described a metabolomic study of cerebral artery occlusion in a mouse model through the application of imaging mass spectrometry by matrix-assisted laser desorption ionization (MALDI)-MS and capillary electrophoresis-electrospray ionization (CE-ESI)-MS. Together, the two complementary techniques made it possible to analyze a major fraction of the metabolites, including ATP, ADP, and AMP (all of them polar or ionic), by CE-ESI-MS, and to resolve the spatial distribution of the molecules by MALDI-MS. The authors distinguished two metabolically distinct areas in the brain after CBVD. The penumbra, the area surrounding the lesion, is disturbed but not damaged and is recoverable; the changes in this area are related to the metabolism of adenine. The core, the damaged area that is not recoverable, has elevated NADH due to ATP depletion. This study made an important contribution to the development of new strategies to treat patients with CBVD.

Metabolomics strategies have also identified changes in metabolites that may become useful for early diagnosis. Jung et al., using proton magnetic resonance spectroscopy (1H-NMR), found changes in CBVD patients. Metabolic profiles of plasma and urine from patients with cerebral infarctions were associated with folic acid deficiency and anaerobic glycolysis (Jung et al., 2011).

16.3.3 Cardiovascular Diseases and/or Diabetes Associated Disorders

Several common pathologies and conditions predispose to cardiovascular disease and/or diabetes. Although some of these cannot be interpreted as a direct cause of CVD and/or diabetes, their relation to lifestyle and nutrition is extensive.

16.3.3.1 Metabolic Syndrome and Obesity

Metabolic syndrome is a group of metabolic conditions that increase the risk of CVD and DM; it is also labeled syndrome X or insulin resistance syndrome. The syndrome is clinically recognized by hypertriglyceridemia, low levels of high-density lipoprotein (HDL), hyperglycemia, hypertension, and central obesity. According to the Third National Health and Nutrition Examination Survey (NHANES III), about 47 million adults in the United States have metabolic syndrome (Ford et al., 2002). The main risk factors are overweight/obesity (currently epidemic in many countries worldwide), sedentary lifestyle, aging, and lipodystrophy. The pathogenesis of metabolic syndrome is still unknown; the principal recognized cause is insulin resistance, that is, the inability of insulin to increase the uptake and utilization of glucose by peripheral tissues, especially liver, skeletal muscle, and adipose tissue. Metabolic syndrome is treatable, and changes in lifestyle can reduce the risk.

Liver and serum metabolites of obese and lean mice fed high-fat or normal diets were analyzed using UPLC–QTOF and GC–MS (Kim et al., 2010) together with partial least-squares-discriminant analysis (PLS-DA). Obese and lean groups were clearly discriminated from each other on the PLS-DA score plot, and the major metabolites contributing to the discrimination were assigned as lipid metabolites (fatty acids, phosphatidylcholines (PCs), and lysophosphatidylcholines (lysoPCs)), lipid metabolism intermediates (betaine, carnitine, and acylcarnitines), amino acids, acidic compounds, monosaccharides, and serotonin. A high-fat diet increased lipid metabolites but decreased the lipid metabolism intermediates and the NAD/NADH ratio, indicating that abnormal lipid and energy metabolism induced by a high-fat diet resulted in fat accumulation via decreased β-oxidation.
In addition, this study revealed that the levels of many metabolites, including serotonin, betaine, pipecolic acid, and uric acid, were positively or negatively related to obesity-associated diseases.

An intriguing application of 1H-NMR-based metabolic analysis has been to study the influence of intestinal bacteria (microbiota) on the development of obesity and metabolic diseases (Dumas et al., 2006). 1H-NMR-based metabolic profiling of plasma and urine samples from a mouse strain known to be susceptible to hepatic steatosis and insulin resistance (129S6) versus a strain with relative resistance (BALB/c) revealed low circulating levels of plasma phosphatidylcholine and high levels of methylamines in urine in the 129S6 strain. The authors proposed that the increased propensity of the 129S6 strain for metabolic disease could be due to increased metabolism of phosphatidylcholine to methylamines by intestinal bacteria, resulting in a reduced pool for the assembly of VLDL particles and leading to the deposition of triglycerides in liver. This hypothesis remains to be tested.

Profiling of obese (median BMI 37 kg/m²) versus lean (median BMI 23 kg/m²) humans revealed a BCAA-related metabolite signature that differentiates the two groups, is suggestive of increased catabolism of BCAA, and correlates with insulin resistance (Newgard et al., 2009). The signature includes several metabolites that are byproducts of BCAA catabolism, such as glutamate, α-ketoglutarate, C3 acylcarnitine (propionylcarnitine), and C5 acylcarnitines (α-methylbutyryl and isovalerylcarnitines). A subsequent cross-sectional study in sedentary hyperlipidemic subjects of varying BMI (range 25–35 kg/m²) identified several metabolite clusters comprising BCAA and related metabolites (Huffman et al., 2009).
To test the possible relevance of this finding for development of obesity-related insulin resistance, rats were fed one of several diets: high fat (HF), HF supplemented with BCAA (HF/BCAA), or standard diet. Despite having reduced food intake and weight gain equivalent to the standard diet group, HF/BCAA rats were as insulin resistant as HF rats. Moreover, HF/BCAA-induced insulin resistance was reversed by the mTOR inhibitor rapamycin. These findings show that, in the context of a poor dietary pattern of increased fat consumption, BCAAs make an independent contribution to the development of obesity-associated insulin resistance.

Nontargeted LC-QTOF-MS was used to study changes in the metabolome produced by obesity (Kim et al., 2010). Overweight/obese men showed higher levels of homeostatic model assessment-insulin resistance (HOMA-IR), triglycerides, total cholesterol, and LDL-cholesterol, and lower levels of HDL-cholesterol and adiponectin than lean men. Overweight/obese men showed a higher proportion of stearic acid and a lower proportion of oleic acid in serum phospholipids. In addition, overweight/obese individuals showed higher fat intake and a lower ratio of polyunsaturated fatty acids to saturated fatty acids (SFA). Three lysophosphatidylcholines (lysoPCs) were identified as potential plasma markers, and eight known metabolites were confirmed, for overweight/obese men. In particular, overweight/obese subjects showed higher levels of lysoPC C14:0 and lysoPC C18:0 and lower levels of lysoPC C18:1 than lean subjects. Results confirmed abnormal metabolism of two BCAAs, two aromatic amino acids, and fatty acid synthesis and oxidation in overweight/obese men. In addition, the amount of dietary saturated fat may influence the proportion of SFA in serum phospholipids and the degree of saturation of the constituent acyl group of plasma lysoPC.
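
The HOMA-IR index mentioned above is not defined in the text. As a hedged illustration, the sketch below uses the widely cited approximation of the homeostatic model (fasting insulin in µU/mL multiplied by fasting glucose in mmol/L, divided by 22.5); the function names and the example values are hypothetical, and the unit conversion factor for glucose is rounded.

```python
def glucose_mg_dl_to_mmol_l(glucose_mg_dl):
    """Convert glucose from mg/dL to mmol/L (approximate factor of 18)."""
    return glucose_mg_dl / 18.0


def homa_ir(fasting_insulin_uU_ml, fasting_glucose_mmol_l):
    """Commonly used HOMA-IR approximation.

    Assumption: insulin in microunits per mL, glucose in mmol/L.
    This formula is not given in the chapter; it is supplied here
    only to make the index concrete.
    """
    return fasting_insulin_uU_ml * fasting_glucose_mmol_l / 22.5


# Hypothetical subject: insulin 10 uU/mL, glucose 4.5 mmol/L.
print(homa_ir(10.0, 4.5))  # → 2.0

# Same glucose entered in mg/dL (81 mg/dL is 4.5 mmol/L with the factor above).
print(homa_ir(10.0, glucose_mg_dl_to_mmol_l(81.0)))
```

Higher values indicate greater insulin resistance, which is the direction of the difference reported for overweight/obese versus lean men; cut-off values vary between populations and are not implied here.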
Because no single analytic technique covers the entire spectrum of the human metabolome, a case/control study (Suhre et al., 2010) in males aged over 55 years with type 2 diabetes combined metabolomics data collected on a complementary set of platforms, covering nuclear magnetic resonance and tandem mass spectrometry. Key observations include perturbations of metabolic pathways linked to kidney dysfunction (3-indoxyl sulfate), lipid metabolism (glycerophospholipids, free fatty acids), and interaction with the gut microflora (bile acids). The study suggests that metabolic markers hold the potential to detect diabetes-related complications already under subclinical conditions in the general population.

16.3.3.2 Atherosclerosis

Atherosclerosis, also known as atherosclerotic vascular disease (ASVD), is characterized by the accumulation of lipids in large arteries. It is estimated that over 25 million people in the United States have clinical manifestations of ASVD (Faxon et al., 2004). The process of atherosclerosis involves lipid disturbances, platelet activation, endothelial dysfunction, chronic inflammation, oxidative stress, and altered matrix metabolism; as a result, the vessel wall thickens and blood flow is altered. ASVD affects various regions of the circulation, including brain, heart, kidneys, mesentery, and peripheral arteries (limbs), and the clinical manifestations depend on the circulatory bed affected. The major risk factors are cigarette smoking and diabetes; other risk factors are dyslipidemia, family history, and age. Increased dietary cholesterol intake is associated with atherosclerosis.

Atherosclerosis development requires a lipid component and an inflammatory component, but it is unclear where and how the inflammatory component develops. Recently it has been discovered that, at the molecular level, the presence of lipids in the endothelium activates signaling pathways, producing an inflammatory response that involves leukocytes, macrophages, and lymphocytes; these produce mediators such as cytokines and other substances that promote local damage and thrombotic states. To assess the role of the liver in the evolution of inflammation, Kleemann et al. (2007) treated ApoE3 Leiden mice with cholesterol-free, low-, and high-cholesterol diets, scored early atherosclerosis, and profiled the (patho)physiological state of the liver using novel whole-genome and metabolome technologies. The high-cholesterol diet evoked changes in specific proinflammatory pathways involving specific transcriptional master regulators, some of which are established, others newly identified.
Notably, several of these regulators control both lipid metabolism and inflammation, and thereby link the two processes.

In another animal model, a comprehensive dataset of protein and metabolite changes during atherogenesis was obtained to identify changes in vessels of apolipoprotein E−/− mice on normal chow diet (Mayr et al., 2005). 1H-NMR revealed a decline in alanine and a depletion of the adenosine nucleotide pool in vessels of 10-week-old apolipoprotein E−/− mice. Attenuation of lesion formation was associated with alterations of NADPH-generating malate dehydrogenase, which provides reducing equivalents for lipid synthesis and glutathione recycling, and with successful replenishment of the vascular energy pool.

16.3.3.3 Hypertension

The relationship between high blood pressure (hypertension) and the emergence of diseases such as CBVD and CVD is straightforward. To study the relationship between components of the diet and increased risk of specific diseases, epidemiological studies were initiated worldwide. One of them, the INTERSALT study, included 10,000 men and women between 20 and 59 years of age from 32 countries and showed that sodium intake, estimated from urinary sodium concentration, is positively related to increased blood pressure: very low sodium concentrations are associated with minimal increases in systolic and diastolic pressure, whereas high concentrations are associated with increases in blood pressure of 0.5 mmHg annually (Elliott et al., 1996). Results of other epidemiological studies confirm the relationship between diet and hypertension, in particular the influence of dietary nutrients (macro and micro), sodium chloride, potassium, alcohol intake, and caloric imbalance.

As previously mentioned, the goal of the INTERMAP Study is to identify the etiopathogenic mechanisms of hypertension and its associated pathologies through metabonomic study of the metabolic phenotype (Holmes et al., 2008; Yap et al., 2010). A 1H-NMR spectroscopy-based metabolome-wide association approach was used to identify urinary metabolites that discriminate between northern and southern Chinese populations. Urinary metabolites significantly higher in northern than in southern Chinese populations included dimethylglycine, alanine, lactate, BCAAs (isoleucine, leucine, valine), N-acetyls of glycoprotein fragments (including uromodulin), N-acetyl neuraminic acid, pentanoic/heptanoic acid, and methylguanidine; metabolites significantly higher in the south were gut microbial cometabolites (hippurate, 4-cresyl sulfate, phenylacetylglutamine, 2-hydroxyisobutyrate), succinate, creatine, scyllo-inositol, proline-betaine, and trans-aconitate.
These findings indicate the importance of environmental influences (e.g., diet), endogenous metabolism, and mammalian-gut microbial cometabolism, which according to the authors may help explain north–south China differences in cardiovascular disease risk.

16.3.3.4 Dietary Factors, CVD, Diabetes

The role of nutrition as a modifiable factor to prevent CVD or improve its prognosis has long been known, and its role in reducing the risk of CVD has been widely studied for about a decade. The first studies that correlated atherosclerosis and nutrition in animals were performed in 1908 by Dr. Nikolai N. Anichkov, who showed that replacing the vegetarian diet of rabbits with a diet containing meat and eggs caused atheromatous plaques in blood vessels (Konstantinov et al., 2006). In the last decades of the twentieth century, attention to the relationship between food and CVD increased because of the rising incidence of CVD and its social and economic impact.

As mentioned in previous paragraphs, most of the conditions that have an important role in the development of CVD, such as hypertension (high blood pressure), hyperlipidemia, hyperglycemia, and obesity/overweight, are associated with the diet and can be avoided with changes in lifestyle and dietary patterns. All of these conditions not only elevate lipid levels in the bloodstream but also are related to endothelial damage. In obese individuals, the levels of adiponectin are low and are related to insulin resistance (Chaudhary et al., 2012). Obesity and metabolic syndrome are also associated with increased levels of ROS and with oxidative stress (Fenech et al., 2011).

A considerable number of publications have reviewed the effects of different nutrients on metabolic disorders. Some of them are contradictory; nevertheless, their contribution is very important and has made it possible to establish new alternatives for managing these disorders, including the development of new drugs (Hu and Willett, 2002). Defining healthy levels of dietary fat, that is, how much of the total calorie intake should come from fat and in what proportions of SFA, monounsaturated fatty acids (MUFA), and polyunsaturated fatty acids (PUFA), is difficult.
There is still no consensus; however, a balance among SFA, MUFA, and PUFA of 1:1.3:1, respectively, is currently recommended, with less than 1% of trans fatty acids and an adequate omega-6/omega-3 balance, closer to the 1:1 that may have characterized the diet of Paleolithic man than to the estimated 20–30:1 of Western diets. Najbjerg et al. (2011) studied the effects of dietary fat on metabolism, working with cell cultures of the HepG2 line exposed to various fatty acids representing both different chain lengths and cis/trans configurations. 1H-NMR spectroscopy provided evidence of cellular uptake of the fatty acids; inspection of the spectra revealed differences that could be ascribed to the uptake of trans fatty acids, and the cells converted them into conjugated fatty acids. The metabolic response was clearly dependent on fatty acid chain length, which was related to the fact that short-chain fatty acids (C4–C6) undergo β-oxidation immediately whereas medium- and long-chain fatty acids (C12–C16) are deposited in the cell.

It is known that whole-grain cereals and diets with a low glycemic index may protect against the development of type 2 diabetes and heart disease, but the mechanisms are poorly understood. The effect of carbohydrate modification on the serum metabolic profile has been studied with LC–MS metabolomics (Lankinen et al., 2010). Results suggest that dietary carbohydrate modification alters the serum metabolic profile, especially in lysoPC species, and may thus contribute to proinflammatory processes, which in turn promote adverse changes in insulin and glucose metabolism. Dietary fiber intake, mainly from whole-grain products, has been linked to reduced risk of obesity, type 2 diabetes, dyslipidemia, hypertension, coronary heart disease, and colorectal cancer (Lattimer and Haub, 2010); recent studies show that for every 10 g of additional fiber added to a diet, the mortality risk of CVD is reduced by 17–35% (Streppel et al., 2008).
Also, it has been claimed that a diet with high soluble fiber showed health benefits lowering the levels of total and LDL:HDL cholesterol (Jenkins et al., 2002), although the mechanisms responsible are not fully elucidated.
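
The 1:1.3:1 balance among SFA, MUFA, and PUFA mentioned earlier in this section can be made concrete with simple arithmetic. The sketch below splits a daily fat energy budget according to that ratio; the function name and the 660 kcal figure are hypothetical and chosen only to give round numbers, not a dietary recommendation.

```python
def split_fat_energy(total_fat_kcal, ratio=(1.0, 1.3, 1.0)):
    """Split fat energy into (SFA, MUFA, PUFA) shares for a given ratio.

    The default ratio is the 1:1.3:1 balance quoted in the text;
    total_fat_kcal is a hypothetical daily energy intake from fat.
    """
    parts = sum(ratio)
    return tuple(total_fat_kcal * r / parts for r in ratio)


# Hypothetical example: 660 kcal/day from fat.
sfa, mufa, pufa = split_fat_energy(660.0)
print(round(sfa), round(mufa), round(pufa))  # → 200 260 200
```

Under this ratio SFA and PUFA each supply about 30% of fat energy and MUFA about 39%.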

Fardet et al. (2007), using a 1H-NMR-based metabonomic approach, studied the metabolic changes in two groups of rats, one fed a diet with whole-grain wheat flour and the other a diet with refined wheat flour. The urinary excretion of some tricarboxylic acid cycle intermediates, aromatic amino acids, and hippurate was significantly greater in rats fed the whole-grain wheat flour diet, and these results were related to metabolic changes that may offer protection against oxidative stress. The effect of whole grain (in this case, rye) was also studied with 1H-NMR by Bertram et al. (2006, 2009) in pigs. Significant differences in the amount of betaine in plasma and urine (together with differences in the creatine/creatinine ratio) were found when the animals received the whole-grain diet.

16.3.4 Cancer

Cancer is a term used for a group of diseases in which there is unregulated growth of cells that invade other tissues (metastasis) through the blood or lymphatic system. The American Cancer Society estimated a total of 1,638,910 new cancer cases and 577,190 deaths from cancer in the United States in 2012 (Siegel et al., 2012). Cancer is the second leading cause of death, behind heart disease. The most significant risk factor is aging; the World Cancer Research Fund/American Institute for Cancer Research (WCRF/AICR), based on correlation studies, suggested that the principal causes of cancer are environmental and that cancer incidence could be attenuated by modification of lifestyle habits, mainly diet, nutrition, and physical activity (World Cancer Research Fund/American Institute for Cancer Research, 2007).

The correct morphology and function of tissues and organs depend on a perfect balance between cell proliferation, differentiation, and death; the normal growth, function, and death of cells are regulated by a complex system that involves oncogenes and tumor suppressor genes. Cancer is a genetic disease that results from the clonal expansion of a single progenitor cell that has undergone a genetic lesion (mutation). Considerable evidence indicates that the diet, and particularly the consumption of industrialized food, increases the risk of some cancers, notably those of the colon, rectum, breast, ovary, endometrium, and prostate (World Cancer Research Fund/American Institute for Cancer Research, 2007). Furthermore, epidemiological studies have suggested that the consumption of diets rich in whole cereals reduces the risk of cancer, diabetes, and atherosclerosis (Slavin et al., 2001).

The application of metabolomics in cancer research has provided important information that contributes to a better understanding of the disease from the biological and metabolic standpoints, as well as to the monitoring of treatment and prognostic evaluation (Fig. 16.2).
FIGURE 16.2 Contributions of metabolomics to cancer research.

An interesting example from the cancer research field (Sreekumar et al., 2009) used LC and GC coupled with MS to perform nontargeted profiling of >1100 individual metabolites in prostate tumor explants, blood, and urine from biopsy-positive cancer patients and biopsy-negative control subjects. Statistically meaningful increases were found in a small subset of metabolites in tumor explants, particularly in metastatic tumors relative to benign prostate. Six metabolites were found to increase with progression from benign prostate to localized cancer to metastatic cancer, including sarcosine, a glycine metabolite. Importantly, the authors then developed a targeted stable isotope-dilution method for quantitative measurement of sarcosine and found it to be elevated by 10- to 20-fold in metastatic tumors compared with benign prostate. They also showed that manipulation of enzymes of sarcosine metabolism influenced prostate cancer invasion. These results show that nontargeted MS methods are able to detect changes in metabolites within the tissue of origin of the metabolic variability.

Another study of prostate carcinoma using 1H-NMR found a correlation between an increase of choline, a decrease of citrate, and the Gleason score (used to evaluate prognosis); a correlation between decreased levels of citrate, spermine, and myo-inositol in human prostatic secretions and prostatic carcinoma was also found (Serkova et al., 2008). A study of breast tumors with 1H-NMR concluded that total choline (tCho) and choline metabolism intermediates can be considered markers of malignancy, with tCho detected in malignant breast cancer with a sensitivity of around 83% and a specificity of 85% (Sah et al., 2011). By means of a metabolomics approach with GC-TOF, healthy colon and colorectal cancer tissues were compared, and the results showed that in this cancer phenotype the TCA cycle intermediates and lipids were decreased whereas metabolites of the urea cycle, purines, and pyrimidines were elevated (Denkert et al., 2008).
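
The sensitivity and specificity figures quoted above for tCho follow the standard definitions based on true/false positives and negatives. The sketch below applies those definitions to invented counts that happen to reproduce 83% and 85%; the counts are purely hypothetical and are not the data of Sah et al. (2011).

```python
def sensitivity(true_pos, false_neg):
    """Fraction of diseased cases correctly flagged by the marker."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Fraction of non-diseased cases correctly left unflagged."""
    return true_neg / (true_neg + false_pos)


# Hypothetical counts: 100 malignant lesions (83 marker-positive) and
# 100 benign lesions (85 marker-negative).
print(sensitivity(true_pos=83, false_neg=17))  # → 0.83
print(specificity(true_neg=85, false_pos=15))  # → 0.85
```

The two measures are computed on different groups (cases and non-cases), so a marker can score well on one and poorly on the other.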

Seeking to correlate general changes in metabolites in the different phenotypes of cancer, a pilot study was performed on ten specimens obtained during surgery, including five soft tissue sarcomas and five paired normal samples. Researchers used liquid chromatography tandem targeted mass spectrometry (LC/MS/MS) via SRM of a total of 249 endogenous water soluble metabolites. In the metabolomic study of the formalin-fixed paraffin-embedded (FFPE) tissues, the HPLC method was based on HILIC mode of aqueous solutions obtained from the methanolic tissue extracts. They detected significant changes in an average of 106 metabolites, most of them related to changes in glucose metabolism, including glycolysis, glutamate metabolism, and the citric acid cycle; although the pretreatment of samples with formalin leads to degradation, the findings are correlated to the published literature on metabolic alterations present in cancer (Kelly et al., 2011). A metabolomics approach was also used to characterize cell lines and tumors ex vivo and in vivo, applying 1H-NMR to identify that brain tumor cells display low concentration of N-acetyl aspartate, ␥-amino butyrate (GABA), and taurine (Florian et al., 1995). Magnetic resonance imaging (MRI) also called “metabolomics in vivo,”isan imaging method for detecting and measuring metabolic activity at the cellular level; three neural cell types have been studied concluding that they are not only distinguish- able by morphological and immunocytochemical characteristics but also through their 1H MRS metabolite profiles. Its application in cancer research has allowed identi- fying major metabolites related to brain tumors as choline-containing metabolites, total creatine, alanine, taurine, glutamate, lactate, and CH2/CH3 lipids (Griffin and Kauppinen, 2007). Asiago et al. 
(2010) described a comprehensive metabolomic study including 257 retrospective serial serum samples from 56 previously diagnosed and surgically treated breast cancer patients. With a combination of NMR and two-dimensional gas chromatography–mass spectrometry (GC × GC−MS) it was possible to demonstrate reductions in blood levels of choline, formate, histidine, proline, N-acetyl-glycine, and 3-hydroxy-2-methyl-butanoic acid. These findings have been interpreted as markers of early recurrence of breast cancer, because their decrease has been linked to the increased demand of amino acids by tumors (Asiago et al., 2010). Applying 1H-MRS to cell cultures of prostate and breast tumors, metabolic changes have been detected when the cells were treated with LY294002 (phosphoinositide 3-kinase inhibitor) and 17AAG (heat shock protein 90 inhibitor). Although different cell lines showed differences in metabolic profiles, a common decrease in lactate, fumarate, and alanine was found, which confirms the alterations of glucose uptake present in cancer cells. Furthermore, citrate, which is typically observed in normal prostate tissue but not in tumors, increased following the 17AAG treatment in prostate cells. These findings are important because they raise the possibility of monitoring drug treatment in vivo with specific biomarkers and evaluating changes in metabolic pathways (Lodi and Ronen, 2011).

16.3.4.1 Diet and Cancer

Cancer cells have metabolic hallmarks that differ from those of normal cells and are related to glucose metabolism, amino acid metabolism, and fatty acid metabolism. The use of metabonomics/metabolomics has helped to determine the effect of the components of diet on cancer development or control. The following table (Table 16.2) summarizes some of the bioactive compounds identified by metabolomics in foods and their relation to cancer.

TABLE 16.2 Bioactive Compounds in Food and Cancer

Component                    Food Sources                        Action                               References
Delphinidin anthocyanidin    Grapes, cranberries, pomegranates   Cancer prevention                    Barrios et al., 2010
Silibinin                    Extracted from milk thistle         Prostatic tumors                     Raina et al., 2009
Genistein                    Soy                                 Suppress cancer cell proliferation   Li et al., 2010; Khan et al., 2011
Tea catechin/EGCG            Green tea                           Metabolic changes in cancer          Khan et al., 2011
Docosahexaenoic acid (DHA)   Fish oil                            Metabolic changes in cancer          Gleissman et al., 2010
β-glucan                     Barley and mushrooms                Metabolic changes in cancer          Chan et al., 2009
Resveratrol                  Red grapes                          Metabolic changes in cancer          Khan et al., 2011
Luteolin                     Fresh vegetable                     Metabolic changes in cancer          Fernandez-Arroyo et al., 2012

Lifestyle and dietary changes are recommended for men diagnosed with early-stage prostate cancer (PC), the most common cancer in the Western world. It has been shown that a diet rich in whole-grain (WG) rye reduces the progression of early-stage PC, but the underlying mechanism is not clear. A study (Moazzami et al., 2011) sought to identify changes in the metabolic signature of plasma in patients with early-stage PC following intervention with a diet rich in WG rye and rye bran product (RP) compared with refined white wheat product (WP) as a tool for mechanistic investigation of the beneficial health effects of RP on PC progression. Seventeen PC patients received 485 g RP or WP in a randomized, controlled, crossover design during a period of 6 weeks with a 2-week washout period. At the end of each intervention period, plasma was collected after fasting and used for 1H-NMR-based metabolomics. The metabolomics analysis of plasma showed an increase in five metabolites (3-hydroxybutyric acid, acetone, betaine, N,N-dimethylglycine, and dimethyl sulfone) after RP. To understand these metabolic changes, fasting plasma homocysteine, leptin, adiponectin, and glucagon were measured separately. The plasma homocysteine concentration was lower (p = .017) and that of leptin tended to be lower (p = .07) after RP intake compared to WP intake. The increase in plasma 3-hydroxybutyric acid and acetone after RP suggests a shift in energy metabolism from anabolic to catabolic status, which could explain some of the beneficial health effects of WG rye, that is, reduction in prostate-specific antigen and reduced 24 h insulin secretion. In addition, the increase in betaine and N,N-dimethylglycine and the decrease in homocysteine show a favorable shift in homocysteine metabolism after RP intake.

Evidence from clinical trial outcomes, epidemiological observations, preclinical models, and cell culture systems has provided clues about the biochemistry of disease prevention. Additional research needs concerning diet and disease prevention include: identification and validation of biomarkers and markers of dietary exposure; investigation of the exposure/temporal relationship between food component intakes and disease prevention; examination of possible tissue specificity in response to dietary factors; and examination of interactions among bioactive food components as determinants of response. Other emerging areas that require greater attention include understanding the interaction between diet and the microbiome, as well as how bioactive food components modulate inflammatory processes. The studies described here demonstrate how metabolomics can provide a more detailed picture of the metabolic status of normal and diseased subjects, which with further development could contribute to a more exact subclassification of different forms of diseases, leading to more judicious and effective use of drug therapies. The paramount challenge of the next phase of metabolomics investigation is to better harvest the information from large datasets to create knowledge about metabolic regulatory mechanisms, perhaps leading to a better understanding of perturbations in chronic diseases and conditions such as type 2 diabetes, obesity, CVD, and cancer.

REFERENCES

ADA Expert Committee on the Diagnosis and Classification of Diabetes Mellitus (2003). Report of the expert committee on the diagnosis and classification of diabetes mellitus. Diabetes Care 26(Suppl 1):S5–S20.
Asiago VM, Alvarado LZ, Shanaiah N, Gowda GA, Owusu-Sarfo K, Ballas RA, Raftery D (2010). Early detection of recurrent breast cancer using metabolite profiling. Cancer Research 70(21):8309–8318.
Barker DJ (1995). Fetal origins of coronary heart disease. British Medical Journal (Clinical Research Ed.) 311(6998):171–174.
Barrios J, Cordero CP, Aristizabal F, Heredia FJ, Morales AL, Osorio C (2010). Chemical analysis and screening as anticancer agent of anthocyanin-rich extract from uva caimarona (Pourouma cecropiifolia mart.) fruit. Journal of Agricultural and Food Chemistry 58(4):2100–2110.
Bernini P, Bertini I, Luchinat C, Tenori L, Tognaccini A (2011). The cardiovascular risk of healthy individuals studied by NMR metabonomics of plasma samples. Journal of Proteome Research 10(11):4983–4992.
Bertram HC, Bach Knudsen KE, Serena A, Malmendal A, Nielsen NC, Frette XC, Andersen HJ (2006). NMR-based metabonomic studies reveal changes in the biochemical profile of plasma and urine from pigs fed high-fibre rye bread. The British Journal of Nutrition 95(5):955–962.
Bertram HC, Malmendal A, Nielsen NC, Straadt IK, Larsen T, Knudsen KE, Laerke HN (2009). NMR-based metabonomics reveals that plasma betaine increases upon intake of high-fiber rye buns in hypercholesterolemic pigs. Molecular Nutrition & Food Research 53(8):1055–1062.

Chan GC, Chan WK, Sze DM (2009). The effects of beta-glucan on human immune and cancer cells. Journal of Hematology & Oncology 2:25.
Chaudhary N, Nakka KK, Maulik N, Chattopadhyay S (2012). Epigenetic manifestation of metabolic syndrome and dietary management. Antioxidants & Redox Signaling 17(2):254–281.
D’Agostino RB, Vasan RS, Pencina MJ, Wolf PA, Cobain M, Massaro JM, Kannel WB (2008). General cardiovascular risk profile for use in primary care: the Framingham heart study. Circulation 117(6):743–753.
Denkert C, Budczies J, Weichert W, Wohlgemuth G, Scholz M, Kind T, Niesporek S, Noske A, Buckendahl A, Dietel M, Fiehn O (2008). Metabolite profiling of human colon carcinoma – deregulation of TCA cycle and amino acid turnover. Molecular Cancer 7:72.
Dumas ME, Barton RH, Toye A, Cloarec O, Blancher C, Rothwell A, Fearnside J, Tatoud R, Blanc L, Lindon JC, Mitchell SC, Holmes E, McCarthy MI, Scott J, Gauguier D, Nicholson JK (2006). Metabolic profiling reveals a contribution of gut microbiota to fatty liver phenotype in insulin-resistant mice. Proceedings of the National Academy of Sciences of the United States of America 103(33):12511–12516.
Elliott P, Stamler J, Nichols R, Dyer AR, Stamler R, Kesteloot H, Marmot M (1996). Intersalt revisited: further analyses of 24 hour sodium excretion and blood pressure within and across populations. Intersalt Cooperative Research Group. British Medical Journal (Clinical Research Ed.) 312(7041):1249–1253.
European Association for Cardiovascular Prevention & Rehabilitation; ESC Committee for Practice Guidelines: 2008–2010 and 2010–2012 Committees (2011). ESC/EAS guidelines for the management of dyslipidaemias: the task force for the management of dyslipidaemias of the European Society of Cardiology (ESC) and the European Atherosclerosis Society (EAS). European Heart Journal 32(14):1769–1818.
Fardet A, Canlet C, Gottardi G, Lyan B, Llorach R, Remesy C, Mazur A, Paris A, Scalbert A (2007). Whole-grain and refined wheat flours show distinct metabolic profiles in rats as assessed by a 1H NMR-based metabonomic approach. The Journal of Nutrition 137(4):923–929.
Faxon DP, Creager MA, Smith SC Jr, Pasternak RC, Olin JW, Bettmann MA, Criqui MH, Milani RV, Loscalzo J, Kaufman JA, Jones DW, Pearce WH, American Heart Association (2004). Atherosclerotic vascular disease conference: executive summary: Atherosclerotic vascular disease conference proceeding for healthcare professionals from a special writing group of the American Heart Association. Circulation 109(21):2595–2604.
Fenech M, El-Sohemy A, Cahill L, Ferguson LR, French TA, Tai ES, Milner J, Koh WP, Xie L, Zucker M, Buckley M, Cosgrove L, Lockett T, Fung KY, Head R (2011). Nutrigenetics and nutrigenomics: viewpoints on the current status and applications in nutrition research and practice. Journal of Nutrigenetics and Nutrigenomics 4(2):69–89.
Fernandez-Arroyo S, Gomez-Martinez A, Rocamora-Reverte L, Quirantes-Pine R, Segura-Carretero A, Fernandez-Gutierrez A, Ferragut JA (2012). Application of nano LC-ESI-TOF-MS for the metabolomic analysis of phenolic compounds from extra-virgin olive oil in treated colon-cancer cells. Journal of Pharmaceutical and Biomedical Analysis 63:128–134.
Florian CL, Preece NE, Bhakoo KK, Williams SR, Noble M (1995). Characteristic metabolic profiles revealed by 1H NMR spectroscopy for three types of human brain and nervous system tumours. NMR in Biomedicine 8(6):253–264.

Ford ES, Giles WH, Dietz WH (2002). Prevalence of the metabolic syndrome among US adults: findings from the third national health and nutrition examination survey. JAMA: The Journal of the American Medical Association 287(3):356–359.
Gleissman H, Yang R, Martinod K, Lindskog M, Serhan CN, Johnsen JI, Kogner P (2010). Docosahexaenoic acid metabolome in neural tumors: identification of cytotoxic intermediates. FASEB Journal: Official Publication of the Federation of American Societies for Experimental Biology 24(3):906–915.
Griffin JL, Kauppinen RA (2007). A metabolomics perspective of human brain tumours. The FEBS Journal 274(5):1132–1139.
Hales CN, Barker DJ (1992). Type 2 (non-insulin-dependent) diabetes mellitus: the thrifty phenotype hypothesis. Diabetologia 35(7):595–601.
Hattori K, Kajimura M, Hishiki T, Nakanishi T, Kubo A, Nagahata Y, Ohmura M, Yachie-Kinoshita A, Matsuura T, Morikawa T, Nakamura T, Setou M, Suematsu M (2010). Paradoxical ATP elevation in ischemic penumbra revealed by quantitative imaging mass spectrometry. Antioxidants & Redox Signaling 13(8):1157–1167.
Holmes E, Loo RL, Stamler J, Bictash M, Yap IK, Chan Q, Ebbels T, De Iorio M, Brown IJ, Veselkov KA, Daviglus ML, Kesteloot H, Ueshima H, Zhao L, Nicholson JK, Elliott P (2008). Human metabolic phenotype diversity and its association with diet and blood pressure. Nature 453(7193):396–400.
Hu FB, Willett WC (2002). Optimal diets for prevention of coronary heart disease. JAMA: The Journal of the American Medical Association 288(20):2569–2578.
Huffman KM, Shah SH, Stevens RD, Bain JR, Muehlbauer M, Slentz CA, Tanner CJ, Kuchibhatla M, Houmard JA, Newgard CB, Kraus WE (2009). Relationships between circulating metabolic intermediates and insulin action in overweight to obese inactive men and women. Diabetes Care 32(9):1678–1683.
Jenkins DJA, Kendall CWC, Vuksan V, Vidgen E, Parker T, Faulkner D, Mehling CC, Garsetti M, Testolin G, Cunnane SC, Ryan MA, Corey PN (2002). Soluble fiber intake at a dose approved by the US Food and Drug Administration for a claim of health benefits: serum lipid risk factors for cardiovascular disease assessed in a randomized controlled crossover trial. American Journal of Clinical Nutrition 75(5):834–839.
Jung JY, Lee HS, Kang DG, Kim NS, Cha MH, Bang OS, Ryu do H, Hwang GS (2011). 1H-NMR-based metabolomics study of cerebral infarction. Stroke: A Journal of Cerebral Circulation 42(5):1282–1288.
Kelly AD, Breitkopf SB, Yuan M, Goldsmith J, Spentzos D, Asara JM (2011). Metabolomic profiling from formalin-fixed paraffin-embedded tumor tissue using targeted LC/MS/MS: application in sarcoma. PloS One 6(10):e25357.
Khan SI, Zhao J, Khan IA, Walker LA, Dasmahapatra AK (2011). Potential utility of natural products as regulators of breast cancer-associated aromatase promoters. Reproductive Biology and Endocrinology 9:91.
Kim JY, Park JY, Kim OY, Ham BM, Kim HJ, Kwon DY, Jang Y, Lee JH (2010). Metabolic profiling of plasma in overweight/obese and lean men using ultra performance liquid chromatography and Q-TOF mass spectrometry (UPLC-Q-TOF MS). Journal of Proteome Research 9(9):4368–4375.
Kleemann R, Verschuren L, van Erk MJ, Nikolsky Y, Cnubben NH, Verheij ER, Smilde AK, Hendriks HF, Zadelaar S, Smith GJ, Kaznacheev V, Nikolskaya T, Melnikov A, Hurt-Camejo E, van der Greef J, van Ommen B, Kooistra T (2007). Atherosclerosis and liver inflammation induced by increased dietary cholesterol intake: a combined transcriptomics and metabolomics analysis. Genome Biology 8(9):R200.
Konstantinov IE, Mejevoi N, Anichkov NM (2006). Nikolai Anichkov and his theory of atherosclerosis. Texas Heart Institute Journal 33(4):417–423.
Lankinen M, Schwab U, Gopalacharyulu PV, Seppanen-Laakso T, Yetukuri L, Sysi-Aho M, Kallio P, Suortti T, Laaksonen DE, Gylling H, Poutanen K, Kolehmainen M, Oresic M (2010). Dietary carbohydrate modification alters serum metabolic profiles in individuals with the metabolic syndrome. Nutrition, Metabolism and Cardiovascular Diseases: NMCD 20(4):249–257.
Lattimer JM, Haub MD (2010). Effects of dietary fiber and its components on metabolic health. Nutrients 2(12):1266–1289.
Lin H, Zhang J, Gao P (2009). Silent myocardial ischemia is associated with altered plasma phospholipids. Journal of Clinical Laboratory Analysis 23(1):45–50.
Li W, Frame LT, Hirsch S, Cobos E (2010). Genistein and hematological malignancies. Cancer Letters 296(1):1–8.
Lodi A, Ronen SM (2011). Magnetic resonance spectroscopy detectable metabolomic fingerprint of response to antineoplastic treatment. PloS One 6(10):e26155.
Mayr M, Chung Y, Mayr U, Yin X, Ly L, Troy H, Fredericks S, Hu Y, Griffiths JR, Xu Q (2005). Proteomic and metabolomic analyses of atherosclerotic vessels from apolipoprotein E-deficient mice reveal alterations in inflammation, oxidative stress, and energy metabolism. Arteriosclerosis, Thrombosis, and Vascular Biology 25(10):2135–2142.
McKillop AM, Flatt PR (2011). Emerging applications of metabolomic and genomic profiling in diabetic clinical medicine. Diabetes Care 34(12):2624–2630.
Moazzami AA, Zhang JX, Kamal-Eldin A, Aman P, Hallmans G, Johansson JE, Andersson SO (2011). Nuclear magnetic resonance-based metabolomics enable detection of the effects of a whole grain rye and rye bran diet on the metabolic profile of plasma in prostate cancer patients. The Journal of Nutrition 141(12):2126–2132.
Najbjerg H, Young JF, Bertram HC (2011). NMR-based metabolomics reveals that conjugated double bond content and lipid storage efficiency in HepG2 cells are affected by fatty acid cis/trans configuration and chain length. Journal of Agricultural and Food Chemistry 59(16):8994–9000.
Neel JV (1962). Diabetes mellitus: a “thrifty” genotype rendered detrimental by “progress”? American Journal of Human Genetics 14:353–362.
Nemutlu E, Zhang S, Gupta A, Juranic NO, Macura SI, Terzic A, Jahangir A, Dzeja P (2012). Dynamic phosphometabolomic profiling of human tissues and transgenic models by 18O-assisted 31P NMR and mass spectrometry. Physiological Genomics 44(7):386–402.
Newgard CB, An J, Bain JR, Muehlbauer MJ, Stevens RD, Lien LF, Haqq AM, Shah SH, Arlotto M, Slentz CA, Rochon J, Gallup D, Ilkayeva O, Wenner BR, Yancy WS Jr, Eisenson H, Musante G, Surwit RS, Millington DS, Butler MD, Svetkey LP (2009). A branched-chain amino acid-related metabolic signature that differentiates obese and lean humans and contributes to insulin resistance. Cell Metabolism 9(4):311–326.
Raina K, Serkova NJ, Agarwal R (2009). Silibinin feeding alters the metabolic profile in TRAMP prostatic tumors: 1H-NMRS-based metabolomics study. Cancer Research 69(9):3731–3735.

Sabatine MS, Liu E, Morrow DA, Heller E, McCarroll R, Wiegand R, Berriz GF, Roth FP, Gerszten RE (2005). Metabolomic identification of novel biomarkers of myocardial ischemia. Circulation 112(25):3868–3875.
Sah RG, Sharma U, Parshad R, Seenu V, Mathur SR, Jagannathan NR (2011). Association of estrogen receptor, progesterone receptor, and human epidermal growth factor receptor 2 status with total choline concentration and tumor volume in breast cancer patients: an MRI and in vivo proton MRS study. Magnetic Resonance in Medicine. DOI:10.1002/mrm.24117.
Salek RM, Maguire ML, Bentley E, Rubtsov DV, Hough T, Cheeseman M, Nunez D, Sweatman BC, Haselden JN, Cox RD, Connor SC, Griffin JL (2007). A metabolomic comparison of urinary changes in type 2 diabetes in mouse, rat, and human. Physiological Genomics 29(2):99–108.
Savage DB, Petersen KF, Shulman GI (2007). Disordered lipid metabolism and the pathogenesis of insulin resistance. Physiological Reviews 87(2):507–520.
Serkova NJ, Gamito EJ, Jones RH, O’Donnell C, Brown JL, Green S, Sullivan H, Hedlund T, Crawford ED (2008). The metabolites citrate, myo-inositol, and spermine are potential age-independent markers of prostate cancer in human expressed prostatic secretions. The Prostate 68(6):620–628.
Shah SH, Bain JR, Muehlbauer MJ, Stevens RD, Crosslin DR, Haynes C, Dungan J, Newby LK, Hauser R, Ginsburg GS, Newgard CB, Kraus WE (2010). Association of a peripheral blood metabolic profile with coronary artery disease and risk of subsequent cardiovascular events. Circulation Cardiovascular Genetics 3(2):207–214.
Siegel R, Naishadham D, Jemal A (2012). Cancer statistics, 2012. CA: A Cancer Journal for Clinicians 62(1):10–29.
Slavin JL, Jacobs D, Marquart L, Wiemer K (2001). The role of whole grains in disease prevention. Journal of the American Dietetic Association 101(7):780–785.
Sreekumar A, Poisson LM, Rajendiran TM, Khan AP, Cao Q, Yu J, Laxman B, Mehra R, Lonigro RJ, Li Y, Nyati MK, Ahsan A, Kalyana-Sundaram S, Han B, Cao X, Byun J, Omenn GS, Ghosh D, Pennathur S, Alexander DC, Berger A, Shuster JR, Wei JT, Varambally S, Beecher C, Chinnaiyan AM (2009). Metabolomic profiles delineate potential role for sarcosine in prostate cancer progression. Nature 457(7231):910–914.
Streppel MT, Ocke MC, Boshuizen HC, Kok FJ, Kromhout D (2008). Dietary fiber intake in relation to coronary heart disease and all-cause mortality over 40 y: the Zutphen study. The American Journal of Clinical Nutrition 88(4):1119–1125.
Suhre K, Meisinger C, Doring A, Altmaier E, Belcredi P, Gieger C, Chang D, Milburn MV, Gall WE, Weinberger KM, Mewes HW, Hrabe de Angelis M, Wichmann HE, Kronenberg F, Adamski J, Illig T (2010). Metabolic footprint of diabetes: a multiplatform metabolomics study in an epidemiological setting. PloS One 5(11):e13953.
Teul J, Garcia A, Tunon J, Martin-Ventura JL, Tarin N, Bescos LL, Egido J, Barbas C, Ruperez FJ (2011). Targeted and non-targeted metabolic time trajectory in plasma of patients after acute coronary syndrome. Journal of Pharmaceutical and Biomedical Analysis 56(2):343–351.
Vallejo M, Garcia A, Tunon J, Garcia-Martinez D, Angulo S, Martin-Ventura JL, Blanco-Colio LM, Almeida P, Egido J, Barbas C (2009). Plasma fingerprinting with GC-MS in acute coronary syndrome. Analytical and Bioanalytical Chemistry 394(6):1517–1524.
Wang TJ, Larson MG, Vasan RS, Cheng S, Rhee EP, McCabe E, Lewis GD, Fox CS, Jacques PF, Fernandez C, O’Donnell CJ, Carr SA, Mootha VK, Florez JC, Souza A, Melander O, Clish CB, Gerszten RE (2011). Metabolite profiles and the risk of developing diabetes. Nature Medicine 17(4):448–453.
WHO (2008). Cardiovascular Disease. The World Health Report Geneva: September 30th.
Wild S, Roglic G, Green A, Sicree R, King H (2004). Global prevalence of diabetes: estimates for the year 2000 and projections for 2030. Diabetes Care 27(5):1047–1053.
World Cancer Research Fund/American Institute for Cancer Research (2007). Food Nutrition Physical Activity and the Prevention of Cancer: a Global Perspective. Washington DC: The American Institute for Cancer Research.
Yap IK, Brown IJ, Chan Q, Wijeyesekera A, Garcia-Perez I, Bictash M, Loo RL, Chadeau-Hyam M, Ebbels T, De Iorio M, Maibaum E, Zhao L, Kesteloot H, Daviglus ML, Stamler J, Nicholson JK, Elliott P, Holmes E (2010). Metabolome-wide association study identifies multiple biomarkers that discriminate north and south Chinese populations at differing risks of cardiovascular disease: INTERMAP study. Journal of Proteome Research 9(12):6647–6654.
Yusuf S, Reddy S, Ounpuu S, Anand S (2001). Global burden of cardiovascular diseases: part I: general considerations, the epidemiologic transition, risk factors, and impact of urbanization. Circulation 104(22):2746–2753.
Zhao X, Peter A, Fritsche J, Elcnerova M, Fritsche A, Haring HU, Schleicher ED, Xu G, Lehmann R (2009). Changes of the plasma metabolome during an oral glucose tolerance test: is there more than glucose to look at? American Journal of Physiology, Endocrinology and Metabolism 296(2):384–393.

17 MS-BASED METABOLOMICS APPROACHES FOR FOOD SAFETY, QUALITY, AND TRACEABILITY

María Castro-Puyana, José A. Mendiola, Elena Ibáñez, and Miguel Herrero

17.1 INTRODUCTION

Metabolomics is nowadays attracting huge interest from different fields of research. Its application to food analysis is no exception, as a great variety of methods and approaches are constantly being developed with the aim of solving challenging analytical problems in this field. Metabolomics is often defined as the study of all the metabolites present in a particular system, typically below the 1500 Da limit. Consequently, the food metabolome is composed of a variety of components belonging to a great number of chemical classes. Besides, among the different compounds included in the food metabolome, the quantitative differences are even more important. Some compounds are very common and abundant, and can be found at millimolar concentrations (e.g., sugars). Others, such as vitamins, exist in extremely small amounts and might be present at concentrations as low as femtomolar. This wide concentration range poses an important analytical challenge, as the most abundant components of the food metabolome might interfere with the correct characterization of the compounds present in far lower amounts.

Keeping all these difficulties in mind, two main different but complementary approaches are used to characterize the food metabolome. The first one is based on profiling. This approach depends on previous knowledge of the sample to be analyzed and is focused on the analysis of a group of related metabolites. The relationship among them is frequently based on chemical classes. The other approach, commonly called fingerprinting, is focused on comparing patterns of metabolites among samples. The subsequent statistical treatment of the obtained results may ideally allow the grouping of the different samples according to their nature, treatment, or any other desired characteristic. Therefore, it is not the main aim of fingerprinting to identify all the involved compounds, but to establish patterns among them. Profiling and fingerprinting can be used alone or in combination, as they can offer complementary information. Regarding the aim of the present chapter, both approaches are used in different applications regarding food safety, quality, and traceability.

Foodomics: Advanced Mass Spectrometry in Modern Food Science and Nutrition, First Edition. Edited by Alejandro Cifuentes. © 2013 John Wiley & Sons, Inc. Published 2013 by John Wiley & Sons, Inc.

Up to now, the analytical techniques most widely used for food metabolomics are nuclear magnetic resonance (NMR)- and mass spectrometry (MS)-based techniques. Although NMR has been employed since the beginning of the -omics era, at present MS is gaining importance. MS provides some interesting advantages, such as improved sensitivity and the possibility of establishing more robust couplings with separation techniques. This increase in sensitivity is fundamental when dealing with compounds present at low concentrations. Moreover, the development of high-resolution MS instruments as well as the use of MSn procedures increase the ability of this technique to identify unknown compounds in target analysis.

Compound identification is mostly carried out through comparison of high-resolution MS data with metabolite databases. At present, different public databases available online offer data regarding the metabolites’ exact mass as well as fragmentation information. Therefore, matching against those databases is one of the most important tools to achieve a positive identification of an unknown metabolite present in a food sample. On the other hand, when identification of the metabolites is not sought, several chemometric tools are employed. In fact, to analyze food metabolomic data, several steps are performed before multivariate statistical analysis, including peak detection, integration, and data alignment. Principal components analysis (PCA) is one of the most-used multivariate statistical techniques. This technique re-expresses the data from the food samples in terms of new variables called principal components in order to find correlations among the different samples. Partial least squares (PLS) regression is also employed to achieve sample discrimination by reduction of dimensionality while maximizing correlation between variables.

Thanks to the development and fine-tuning of all the above-mentioned techniques, the use of MS-based methodologies in food metabolomics is gaining attention. This is even more relevant within the scope of the present chapter; MS-based metabolomics is gradually becoming one of the most relevant procedures to assess food quality, food safety, and traceability in modern food science and Foodomics. As will be shown in the following sections, the implementation of these methodologies is providing tools that help solve some of the great challenges in those fields; among them are the determination of biomarkers of food origin and authenticity, the development of powerful analytical methods to guarantee consumers’ well-being and confidence, and the detection of food safety issues before they might pose a risk to consumers’ health.
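As an illustration of the database-matching step described in this introduction, the following minimal Python sketch compares a measured m/z value against a small list of candidate metabolites within a mass tolerance expressed in parts per million (ppm). It is only a sketch under stated assumptions: the tiny in-memory "database", the function names, the choice of the protonated ion [M+H]+, and the 5 ppm tolerance are all hypothetical and do not correspond to any particular public database or software.

```python
# Minimal sketch of exact-mass matching (hypothetical names and tolerance).
PROTON_MASS = 1.007276  # Da, mass added when forming [M+H]+

# Illustrative mini "database": monoisotopic neutral masses in Da.
# A real workflow would query a public metabolite database instead.
DATABASE = {
    "glucose": 180.06339,      # C6H12O6
    "citric acid": 192.02700,  # C6H8O7
    "alanine": 89.04768,       # C3H7NO2
    "sarcosine": 89.04768,     # C3H7NO2 (isomer of alanine)
    "caffeine": 194.08038,     # C8H10N4O2
}


def ppm_error(measured, theoretical):
    """Relative mass error in parts per million."""
    return (measured - theoretical) / theoretical * 1e6


def match_mz(measured_mz, database, tol_ppm=5.0):
    """Return (name, ppm error) for candidates whose [M+H]+ m/z lies
    within tol_ppm of the measured value, best match first."""
    hits = []
    for name, neutral_mass in database.items():
        theoretical_mz = neutral_mass + PROTON_MASS
        err = ppm_error(measured_mz, theoretical_mz)
        if abs(err) <= tol_ppm:
            hits.append((name, round(err, 2)))
    return sorted(hits, key=lambda hit: abs(hit[1]))


if __name__ == "__main__":
    print(match_mz(181.0707, DATABASE))  # single candidate expected
    print(match_mz(90.0550, DATABASE))   # two isomers share this mass
```

The second query returns both alanine and sarcosine, which illustrates why exact mass alone cannot distinguish isomers and why fragmentation information or a chromatographic separation is also needed.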

17.2 MS-BASED METABOLOMICS FOR FOOD SAFETY

Food safety is a field of utmost importance in food science and technology. In fact, a vast body of legislation focuses on the substances that can or cannot be added to a particular food and, where they can, usually states maximum limits of use. Typical examples of these kinds of components are the antibiotics and pesticides used in farming and agriculture to increase production rates and benefits and to assure the correct growth of animals and crops, respectively. As a consequence, accurate, precise, and robust analytical methods are needed in order to determine whether the products found in the market comply with the legislation. The application of MS-based metabolomics approaches to this field has brought about a great increase in sensitivity and selectivity, making the detection of these potentially harmful components sometimes even more sensitive than the official requirements. Moreover, within this field, not only is the detection of totally or partially forbidden compounds sought, but also the determination of potentially dangerous compounds (e.g., pollutants and toxins) or organisms that could be present in food and might pose a risk to consumers’ health.

17.2.1 Detection of Contaminants in Food

For the determination of pesticides and antibiotics in food products, the vast majority of the MS-based metabolomics applications developed have been based on the use of hyphenated techniques. The coupling of MS to a separation technique significantly increases the capability of MS alone to implement new multiresidue methods. In these kinds of methods, a great number of related compounds are simultaneously determined. The development of this particular area of food metabolomics has been favored by the successive technological improvements produced both in separation science and in mass spectrometry. Thanks to these improvements, nowadays it is possible to simultaneously determine 148 different pesticides via ultrahigh pressure liquid chromatography (UPLC) coupled with tandem mass spectrometry (UPLC–MS/MS) in less than 15 min (Wang et al., 2010). In these cases, the use of modern UPLC instruments permits the use of liquid chromatographic (LC) columns, packed with sub-2-μm particles, at relatively high flow rates; this type of operation generates backpressures higher than those attainable using conventional LC instruments. Separations run under these conditions provide higher efficiencies and fast analysis. However, it would not be possible to make the most of these powerful capabilities without the coupling of this separation mode to a fast mass analyzer. In this sense, triple quadrupole analyzers are the most commonly used for the detection and quantification of pesticides and antibiotics in food products, independently of the separation technique employed. One of the main reasons for the success of these kinds of analyzers in these types of applications is related to Commission Decision 2002/657/EC, which establishes that four identification points have to be earned to consider a substance positively identified.
An identification point is commonly earned thanks to the retention time (compared to a commercial reference standard) whereas 1.5 points are related to a parent–product ion transition. Modern triple quadrupole analyzers are commonly fast enough to monitor two ion transitions from a single parent ion, thus allowing the attainment of the required identification points. Other state-of-the-art high-resolution MS instruments might provide accurate mass values which could also increase the level of certainty of the identification. However, this possibility is not, at present, included in the legislation, and thus the use of high-resolution mass spectrometers might not be enough to comply with the stated requirements. Multiple reaction monitoring (MRM) is the most frequently employed mode to determine the two ion transitions. The most intense product ion is designated as the quantifier ion whereas the less intense one is designated as the qualifier ion. This latter product ion is not used for the quantification but only to confirm the positive identification of the substance being determined. To establish these two transitions, commercial standards of the target compounds are first infused directly into the MS instrument. This way, an optimization of the detection conditions for each compound is carried out more or less automatically. Different parameters must be set for each studied compound, the most important ones being the different voltages that should be applied to achieve proper fragmentation of the parent ions into the selected product ions. Thus, collision-induced dissociation parameters have to be closely studied because they might have a critical impact on the whole detection process. Using this detection method coupled to a separation technique, limits of detection (LODs) in the low μg/kg range can usually be obtained.
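The bookkeeping described above can be sketched in a few lines of Python. This is a minimal illustration that follows the simplified counting given in this section (one point for a retention-time match, 1.5 points per monitored parent–product ion transition, four points required); it is not a full implementation of Commission Decision 2002/657/EC, which should be consulted for the actual scoring rules. The function names and the example m/z values and intensities are hypothetical.

```python
# Minimal sketch (hypothetical names; simplified point values taken from
# the description in the text, not from the full legal text).

def assign_transitions(product_ions):
    """Pick quantifier and qualifier ions for one parent ion.

    product_ions maps product-ion m/z to observed intensity. The most
    intense product ion becomes the quantifier and the second most
    intense becomes the qualifier.
    """
    ranked = sorted(product_ions, key=product_ions.get, reverse=True)
    if len(ranked) < 2:
        raise ValueError("at least two product ions are required")
    return ranked[0], ranked[1]


def identification_points(retention_time_matches, n_transitions,
                          rt_points=1.0, transition_points=1.5):
    """Sum the identification points earned by one analyte."""
    points = rt_points if retention_time_matches else 0.0
    return points + transition_points * n_transitions


def is_confirmed(points, required=4.0):
    """True when the required number of identification points is reached."""
    return points >= required


if __name__ == "__main__":
    # Hypothetical product-ion intensities from infusing a standard.
    ions = {175.1: 8.0e4, 132.0: 2.5e5, 104.0: 1.2e4}
    quantifier, qualifier = assign_transitions(ions)
    points = identification_points(True, n_transitions=2)
    print(quantifier, qualifier, points, is_confirmed(points))
```

With a retention-time match and two monitored transitions the tally reaches 1 + 2 × 1.5 = 4 points, which is why monitoring two transitions per parent ion is sufficient under this counting.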
Once the detection parameters are optimized, it is also important to consider the influence of the studied matrix. In fact, it has been shown in a great number of applications that the components present in the food product can significantly interfere with the ionization and/or detection of the analytes. These components may act in both directions, that is, enhancing or suppressing the ionization of the analytes. For this reason, matrix-matched calibrations are usually performed in order to statistically assess whether the matrix has an influence. In those cases in which a matrix effect is observed, the calibration curves of all the studied compounds should be constructed by spiking “blank samples” rather than by using standards in solvent.

As stated above, the approach most commonly employed for the determination of pesticides and antimicrobials consists of coupling MS to a separation technique. A great array of multiresidue methods has already been applied to this aim. For instance, gas chromatography (GC)-based approaches have been applied to the determination of pesticides in cereals (Walorczyk, 2008), fruits (Wong et al., 2010a), or tea (Huang et al., 2007). However, LC has been even more extensively employed, being a useful technique to analyze a wide range of compounds from fruits (Wong et al., 2010b), wines (Economou et al., 2009), and vegetables (Ferrer and Thurman, 2007), and to determine antimicrobials in meats and milk (Kinsella et al., 2009). In general, GC–MS provides a series of advantages over LC–MS, such as excellent resolution, good sensitivity, and the good confirmation abilities of MS detection performed after electron ionization (EI). In fact, the use of EI in GC–MS yields highly reproducible mass spectra, independently of the instrument and method employed.
This advantage, which cannot be achieved in LC-based methods using other ionization techniques (such as atmospheric pressure ionization techniques), permits the development of mass spectral libraries that are very useful for identification purposes when analyzing real samples. On the other hand, one of the possible shortcomings of GC–MS is the limited volume of sample that can be injected (usually a few microliters), which can have a negative influence on method sensitivity. To partially solve this problem, and in order to achieve limits of quantification (LOQs) lower than the maximum residue limits established for some pesticides and antibiotics, different approaches have been tested. One of them consists of increasing the amount of sample injected through the use of large-volume injection (LVI). This solution has proved useful to detect more than 200 pesticides in diverse vegetable food samples (Xu et al., 2009). In that case, 25 µL of sample was injected into a programmed temperature vaporization device using a stomach-shaped inlet liner that allowed the evaporation of the injection solvent according to a rising temperature program. Under these experimental conditions, appropriate linearity and sensitivity were obtained for all the studied compounds.

LC–MS/MS methods using a triple quadrupole MS analyzer also allow the attainment of low LOQs, generally with shorter analysis times than GC. This capability can be illustrated by the method developed for the analysis of more than 190 pesticides in less than 15 min using MRM detection (Wong et al., 2010b). A typical reconstructed chromatogram obtained from one of these analyses is shown in Figure 17.1. Nowadays, the different technological improvements in separation science, mainly in column technology and instrument development, allow separations with high resolution and efficiency and short analysis times to be obtained by using UPLC systems. In fact, a growing proportion of the LC-based methods developed to analyze pesticides and antibiotics in food samples are based on the use of UPLC conditions.

FIGURE 17.1 Typical reconstructed MRM chromatograms of the 191 pesticides and 6 ILISs analyzed by LC–MS/MS (x-axis: time, min; y-axis: intensity, cps). Panels: (a) methanol:water 70:30; (b) methanol:water 30:70. Labeled peaks (retention time, min): 1, methamidophos (2.00); 2, acephate (2.25); 3, omethoate (2.88); 4, mesotrione (3.05); 5, aldicarb sulfoxide (3.46); 6, butoxycarboxim (3.58). Reproduced with permission from Wong et al. (2010a).

Although the separation and detection conditions employed obviously have a strong influence on the capabilities of the developed method, a key step in the determination of contaminants in food is sample preparation. Food samples are generally very complex matrices in which a great variety of components coexist. As a consequence, in order to properly analyze contaminants that are frequently present in extremely low amounts, one or several sample preparation steps are commonly required. A wide variety of sample preparation techniques can be employed in combination with GC–MS or LC–MS to determine contaminants in food, although minimal sample preparation is usually preferred. The reader is referred to published reviews for a deeper knowledge of the sample preparation techniques most frequently employed (Gilbert-Lopez et al., 2009; Marazuela and Bogialli, 2009). Among the available techniques, solid-phase extraction (SPE) and the so-called QuEChERS (Quick, Easy, Cheap, Effective, Rugged, and Safe) are the most common methodologies used today prior to the LC–MS/MS analysis of contaminants in food. For solid samples, the sample preparation step involves the transfer of the food metabolome compounds of interest from the solid state to a liquid phase.
In this sense, it is important to remark that each and every sample treatment step will have a direct influence on the metabolomic profile obtained. In the case of multiresidue methods developed to analyze contaminants, the sample preparation step has to be carefully selected to demonstrate the proper extraction of the target compounds from the matrix while assuring the reproducibility of the whole procedure. Although the importance of sample preparation is sometimes overlooked, this stage has a critical influence on MS-based metabolomics analysis.

Besides antibiotics and pesticides, other contaminants are also analyzed in foods. Among them, the most important are persistent organic pollutants (POPs). These halogen-substituted compounds are well known for their ability to persist in the environment, from which they can be transferred into foods and accumulate in the lipid phase or tissue thanks to their lipophilicity. Because of their toxicity to humans and animals, their use tends to be restricted, although some of them can still be employed in some countries. This fact, together with their bioaccumulation, makes it necessary to implement analytical methods to determine their presence in foods and thus assure food safety. In this regard, different profiling methods have been developed in a similar way to those implemented for pesticide analysis. Moreover, some recent developments have aimed at the simultaneous determination of some of these compounds together with other pesticides. As the group of target compounds becomes wider, the analytical challenge of analyzing all the components rises. A possible solution for improving the separation power, as well as the subsequent detection, is the multidimensional coupling of separation techniques. In this regard, comprehensive two-dimensional gas chromatography (GC × GC) has already been pointed out as a powerful technique to analyze very complex samples. GC × GC is based on an initial separation under conventional conditions coupled, through a modulator device, to a fast separation in a short column. The modulator continuously transfers small fractions from the first dimension to the second dimension, significantly increasing the separation power compared with a monodimensional separation. Besides, when orthogonal separations are used, compounds are organized in the two-dimensional plane according to their chemical classes, facilitating the identification of unknown components. As an additional advantage, the sample might be concentrated in the modulator, further increasing the sensitivity of the system. GC × GC has been employed, for instance, to determine some pesticides and organic pollutants in milk via coupling to a TOF analyzer (Hayward et al., 2010). The two-dimensional chromatogram obtained in this separation is shown in Figure 17.2.
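To make the role of the modulator more concrete, the following sketch shows how a single detector trace might be folded into a two-dimensional chromatogram, with one second-dimension slice per modulation period. It is a minimal illustration that assumes a constant acquisition rate and a constant modulation period; the function name and the numbers in the usage example are hypothetical and do not correspond to any of the cited methods.

```python
from typing import List


def fold_trace(signal: List[float], acquisition_rate_hz: float,
               modulation_period_s: float) -> List[List[float]]:
    """Fold a 1D detector trace into rows of one modulation period each.

    The row index corresponds to the first-dimension retention time (in units
    of modulation periods); the column index corresponds to the
    second-dimension retention time (in units of 1 / acquisition_rate_hz s).
    """
    points_per_period = round(acquisition_rate_hz * modulation_period_s)
    if points_per_period <= 0:
        raise ValueError("modulation period too short for this acquisition rate")
    n_periods = len(signal) // points_per_period  # incomplete last period dropped
    return [
        signal[i * points_per_period:(i + 1) * points_per_period]
        for i in range(n_periods)
    ]


if __name__ == "__main__":
    # Hypothetical trace: 2 points per second, 3 s modulation -> 6 points per row.
    trace = [float(i) for i in range(14)]
    plane = fold_trace(trace, acquisition_rate_hz=2.0, modulation_period_s=3.0)
    print(len(plane), len(plane[0]))
```

Each row of the resulting plane is one fast second-dimension separation, so compounds that coelute in the first dimension can still appear at different column positions within the same row.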

FIGURE 17.2 Comprehensive two-dimensional separation for pesticides and internal standards fortified in milk at 10 µg/kg obtained by GC × GC using a 30 m × 0.25 mm i.d. 5% phenyl column (HP5-ms) and a 1.5 m × 0.15 mm i.d. 50% phenyl column (SGE BPX-50). Reproduced with permission from Hayward et al. (2010).

17.2.2 Detection of Pathogens and Toxins in Food

MS-based metabolomics has recently been demonstrated to be a useful tool to detect the presence of pathogens in foods and, thus, to assess their microbiological safety. Different approaches have been devised with this aim. One of them is to use direct MS detection to reveal the presence of pathogenic microorganisms in foods by detecting some of their metabolites. This method was employed to reveal Escherichia coli contamination in spinach samples through the detection of some metabolic markers (Chen et al., 2007). This rapid online detection method combined atmospheric pressure desorption sampling by a gentle stream of air or nitrogen with extractive electrospray ionization (EESI) and mass spectrometric analysis. With this methodology, typical molecular markers might be identified using MS/MS, whereas differences between closely related samples are easily visualized by applying principal component analysis (PCA) to the mass spectral data.

Another possibility is to detect metabolites derived from microbial growth in foods by using a separation technique coupled to MS. GC–MS demonstrated its potential to detect the presence of Salmonella and E. coli O157:H7 at levels of 7 ± 2 CFU/25 g of food through a metabolomics approach (Cevallos et al., 2011). In this case, a headspace solid-phase microextraction (HS-SPME) protocol was employed to sample the generated compounds. Using chemometrics, it was possible to construct prediction models based on the profile of the metabolites detected in those samples. Although no single compound could be associated with the growth of a particular bacterium, the analysis of the whole profile allowed the correct classification of ground beef and chicken samples contaminated with the studied microorganisms.

Another important aspect regarding the presence of pathogens in food is the detection of these microorganisms through the toxins they generate.
In this case, MS-based metabolomics methods are quite similar to those developed for the determination of antibiotics or pesticides. One of the most remarkable examples is the detection of mycotoxins. These compounds are toxic secondary metabolites produced by some fungi that might be found contaminating foods. The presence of these compounds poses a great risk for human health, and consequently, their detection at low levels is of great importance for food safety. The development of MS instruments, as well as their coupling with separation techniques, has provided the ideal tools to achieve the required sensitivity. Thanks to the advantages of this kind of coupling, fast multitarget methods have been developed and applied to the detection of a wide group of toxins. For instance, the sensitive detection of 21 different toxins in less than 13 min was possible by coupling LC to MS/MS (Rupert et al., 2012). The method could be applied to the detection of these toxins in baby foods, attaining detection limits as low as 0.05 ng/g. Figure 17.3 shows an example of these kinds of determinations. Generally, LC-based methods are preferred for multiclass mycotoxin analysis (Capriotti et al., 2012) and, although most published works are based on MRM detection implemented through triple quadrupole analyzers, high-resolution MS instruments, such as Orbitrap analyzers, have also been employed, for example, for the detection of 32 mycotoxins (Zachariasova et al., 2010). High-resolution MS increases the accuracy of mass determination without losing sensitivity. However, the generalization of methods based on these analyzers is limited by their current high cost, as well as by the current European legislation regarding the detection and quantification of contaminants and forbidden substances in foodstuffs; as mentioned above, the system for earning identification points according to European legislation only considers the monitoring of ion transitions, whereas the attainment of high-resolution molecular mass values is not considered.

FIGURE 17.3 (a) Typical chromatogram of spiked sample at LOQ concentration level (x-axis: time, min; y-axis: intensity, cps). Peak identification: (1) NIV; (2) DON; (3) FUS X; (4) NEO; (5) DOM-1; (6) 15-ADON; (7) 3-ADON; (8) AFG2; (9) AFM2; (10) AFG1; (11) AFB2; (12) AFB1; (13) DAS; (14) FB1; (15) HT-2; (16) FB3; (17) T-2; (18) FB2; (19) ZOL; (20) OTA; (21) ZEN; (22) STER; (23) BEA. (b) 1. Chromatogram of beauvericin (BEA) positive rice-based baby food; 2. Confirmation by the accomplishment of Q/q ratios and EPI (enhanced product ion) scan. (c) Confirmation of positive samples: 1. Fumonisin B1 (FB1); and 2. Aflatoxin G2 (AFG2). Reproduced with permission from Rupert et al. (2012).
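The Q/q ratio confirmation mentioned in the caption of Figure 17.3 compares the observed qualifier-to-quantifier intensity ratio with the ratio expected from a reference standard. A minimal sketch of such a check follows; the function name, the example peak areas, and the 25% relative tolerance are illustrative assumptions, and the tolerance actually applicable depends on the regulation followed and on the relative ion intensity.

```python
def ion_ratio_ok(quantifier_area: float, qualifier_area: float,
                 expected_ratio: float, rel_tolerance: float = 0.25) -> bool:
    """Check the observed qualifier/quantifier ratio against the expected one.

    rel_tolerance is an assumed, illustrative value (0.25 = +/-25% relative).
    """
    if quantifier_area <= 0 or expected_ratio <= 0:
        raise ValueError("quantifier_area and expected_ratio must be positive")
    observed_ratio = qualifier_area / quantifier_area
    return abs(observed_ratio - expected_ratio) <= rel_tolerance * expected_ratio


if __name__ == "__main__":
    # Hypothetical peak areas: observed ratio 0.55 against an expected 0.60.
    print(ion_ratio_ok(10000.0, 5500.0, expected_ratio=0.60))
    # Observed ratio 0.20 against an expected 0.60 falls outside the window.
    print(ion_ratio_ok(10000.0, 2000.0, expected_ratio=0.60))
```

In this sketch the qualifier transition plays no part in quantification; it only confirms or rejects the identity of the peak, in line with the role of the qualifier ion described for MRM methods.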

17.3 MS-BASED METABOLOMICS TO ASSESS FOOD QUALITY

The term “food quality” covers multiple aspects of food. Because quality is not a completely objective parameter, what is considered appropriate quality will strongly depend on the particular food, on the product itself, or even on the consumer. Among the aspects related to food quality, food composition, aroma, flavor, taste, and other food properties can be pointed out. For this reason, assessing food quality is a complex task that may imply different types of analysis depending on the particular food or product. As in other fields of food science, as complexity increases, MS-based approaches, including MS-based metabolomics, are gaining attention.

One of the key aspects in assessing food composition is aroma determination. In fact, the volatiles contained in some foods are highly relevant for the perceived overall quality of a particular product. As an example, it is well known that grape type and quality, correct aging, and other factors determine the volatile composition of wine and, therefore, its aroma, which is one of the most important parameters of wine quality. The same occurs with other products. For instance, the coupling of powerful separation techniques to high-resolution mass spectrometry has produced volatile fingerprints of cacao samples that might be directly related to their quality (Humston et al., 2009). Specifically, GC × GC was employed after HS-SPME with the aim of achieving a complete separation of the volatiles contained in the cacao beans. This separation technique provided high peak capacities by combining a long nonpolar column in the first dimension with a short polar column in the second dimension. The separated volatiles were detected by TOFMS. This study demonstrated the possibility of statistically correlating a particular volatile profile with adequate quality levels of the samples (Humston et al., 2009).
Similarly, the coupling of MS to other separation techniques has also been fruitful in generating metabolite fingerprints to assess food quality. This has been the case of LC–MS, used to profile, for instance, flavonols and anthocyanins in grapes (Mattivi et al., 2006), and of monodimensional GC–MS, used to identify undesirable compounds (Vikram et al., 2006). The latter approach assesses food quality from a different perspective: instead of determining the volatile compounds that contribute to food quality, unwanted volatiles were determined in order to predict a loss of quality related to post-harvest spoilage. It was demonstrated that particular volatiles in carrot samples were closely related to specific diseases (Vikram et al., 2006). GC–MS also makes it possible to perform metabolomics-based studies dealing with a high number of compounds in order to determine a particular metabolite profile for different vegetable varieties. This technique was employed in combination with chemometrics to characterize different potato (Solanum tuberosum) varieties and cultivars (Dobson et al., 2010). LC–MS has also been used for the development of metabolite databases in order to reveal the complete food metabolome of tomato (Moco et al., 2006). Thanks to the development of these kinds of databases, the metabolites expected in a particular food product can be described. Consequently, these tools may offer new advantages for food quality determination. However, the use of a single analytical technique introduces a bias by itself; that is, not all the compounds contained in the food are actually analyzed. To cope with this potential issue, more comprehensive data collection can be performed. In fact, combining different analytical tools provides extensive data that might reveal, upon elaboration, different relationships.
This approach has been employed to obtain more than 2000 metabolite signatures from melon fruit, combining the use of GC–MS to analyze polar compounds with LC–QTOFMS and GC–MS for the nontargeted profiling of semipolar and volatile compounds, respectively (Moing et al., 2011). Once the different compounds are determined, powerful chemometric tools are used. To dissect different analyte patterns within the huge complexity of the obtained data, K-means clustering was applied. Subsequently, for each relevant cluster, Spearman correlation coefficients between analytes were calculated. Examples of these correlations are shown in Figure 17.4. This in-depth characterization not only helped to reveal metabolite interactions but may also ensure the validity of targeted breeding efforts to improve fruit quality.
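As an illustration of the correlation step, the sketch below computes a Spearman correlation coefficient as the Pearson correlation of average ranks, using only the Python standard library. It is a generic textbook formulation, not the code used by Moing et al. (2011), and the analyte intensities in the usage example are invented.

```python
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0  # mean of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman coefficient: Pearson correlation computed on the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation undefined for constant input")
    return cov / (var_x * var_y) ** 0.5


if __name__ == "__main__":
    # Invented intensities: analyte_b rises monotonically with analyte_a,
    # although the relationship is far from linear.
    analyte_a = [1.2, 3.4, 2.2, 8.9, 5.0]
    analyte_b = [10.0, 40.0, 20.0, 900.0, 55.0]
    print(spearman(analyte_a, analyte_b))
```

Because the coefficient depends only on ranks, it captures monotonic but nonlinear relationships between analytes, which is one reason a rank-based measure is a natural choice for heterogeneous multi-platform data.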

FIGURE 17.4 Correlation networks of two metabolite clusters. Vertices are colored according to the analytical technique employed (1H–NMR, LC–PDA–FL, LC–MS, GC–MS, or ICP–MS). (a) Correlation network of 190 analytes, and (b) correlation network of 184 analytes. GlD, glutamine derivative; DHpseudoionone, dihydropseudoionone; D-t-indene, 2,3-dihydro-1,1,4,7-tetramethyl-1H-indene; T-t-benzof., 5,6,7,7a-tetrahydro-4,4,7a-trimethyl-2(4H)-benzofuranone (R). Reproduced with permission from Moing et al. (2011).

Direct MS analysis has also shown promising capabilities for food quality assessment. Recently, several ambient ionization techniques applied to analyze intact samples have been presented. These techniques permit the analysis of samples without prior sample preparation or other analytical steps. Among them, desorption electrospray ionization (DESI), direct analysis in real time (DART), and atmospheric-pressure solids analysis probe (ASAP) may be pointed out. Several works describing these techniques in more depth have already been published (Hajslova et al., 2011; Nielen et al., 2011). The applicability of these new techniques covers a wide range of possibilities, among them the attainment of metabolite fingerprints and/or profiles from foods and food products. Subsequently, these profiles/fingerprints can be effectively employed to reveal the quality parameters associated with particular metabolite compositions. This strategy has been followed to monitor the quality of different specialty beers through the application of DART–TOFMS together with chemometrics (Cajka et al., 2011). In DART analysis, condensed-phase analytes are thermally desorbed by a stream of hot gas that carries active species derived from a plasma discharge. Subsequently, an APCI-like ionization takes place, enabling the acquisition of the resulting mass spectrum. The influence of the gas beam temperature was demonstrated. Whereas at lower temperatures (150°C) mainly low-molecular-weight compounds were desorbed and ionized, at high temperatures (350°C) the observed ion intensities of some of these compounds decreased, while other ions appeared. Acquiring mass spectra of the different beer samples in both ionization modes allowed the correct classification of each type according to a statistical approach. PLS and artificial neural networks with multilayer perceptrons (ANN-MLP) classifications provided the best results. The construction of databases containing a high number of profiles/fingerprints obtained with this approach would enable the critical assessment of beer quality. Besides, it is also important to remark that each analysis required less than 1 min to be completed (Cajka et al., 2011).

Another MS-based technique that can be employed to profile minor food components in order to assess food quality is inductively coupled plasma-mass spectrometry (ICP–MS). This technique allows the simultaneous determination of the elemental composition of foods and food products at very low detection levels. The elements determined include not only those that are nutritionally necessary but also others that might be found as contaminants and that may reduce the overall food quality. This strategy has been employed, for instance, to determine 20 minor and trace elements in soy and dairy fermented products (Llorent-Martinez et al., 2012).

17.4 MS-BASED METABOLOMICS STRATEGIES FOR FOOD TRACEABILITY

Food traceability is employed to know precisely the composition and origin of a particular food product during the whole manufacturing process, providing, for example, the continuous monitoring of a food from farm to fork. As can be deduced, this topic is closely related to other aspects such as quality. MS-based metabolomics strategies are of great help for food traceability. In fact, the great analytical capabilities of these techniques allow the simultaneous, and often continuous, determination of wide groups of components during manufacture. As an example, UPLC combined with high-resolution QTOFMS was employed to monitor how water-soluble metabolites changed during a 2-month fermentation of soybean-based products (Kang et al., 2011). The information collected was relevant because metabolites including amino acids, urea cycle intermediates, nucleosides, and organic acids had a significant influence on the overall nutritional and sensory quality of the resulting fermented products. Other powerful hyphenated techniques such as GC × GC–TOFMS have also been used for the traceability of food products; for instance, the determination of 56 monoterpenoids present in a particular variety of white grape has been proposed as a useful method to subsequently verify the composition of musts and wines claimed to be produced from this variety, thus making it possible to trace their varietal origin (Rocha et al., 2007).

Besides, food origin assessment is an extremely important topic in food science. Food origin is critical in terms of food quality, food adulteration, and legislation, among others. As the origin of a food or food product can have important economic implications, geographical origin traceability is necessary to verify that a product sold under a protected designation of origin (PDO) or a protected geographical indication (PGI) has actually been produced in the protected area. In this regard, MS-based approaches have facilitated the detection of adulterations and the origin assessment of food products in a variety of applications. Figure 17.5 shows the typical analytical steps required for food origin assessment. Typical products within the European Union that are frequently protected according to their origin are extra-virgin olive oil and wines. Different MS-based methods have already been developed to assess the geographical origin of these valuable food products. Generally, one or more analytical techniques, based on separation techniques and mass spectrometry, are combined to analyze a group of target metabolites. Chemometric tools are then needed to analyze the typically huge amount of data generated and to reveal relationships among metabolites that can be used to group samples according to their geographical origin. Examples of this approach are the use of GC–MS and GC × GC–MS with linear discriminant analysis and ANN to classify olive oils according to their origin on the basis of their volatile patterns (Cajka et al., 2010) and the use of HPLC–Q–TOFMS coupled to advanced data mining and multivariate analysis for the discrimination of wines according to their variety on the basis of their metabolomic profiles (Vaclavik et al., 2011). Multielement fingerprinting obtained by ICP–MS combined with statistical tools and chemometrics has also been utilized to classify products with protected geographical indications, such as honeys (Chudzinska and Baralkiewicz, 2011) or onions (Furia et al., 2011).
This combination allows the determination of markers that would enable the detection of frauds.

FIGURE 17.5 General workflow using MS-based metabolomics to assess food origin and authentication.

However, an MS-based approach that has found a great number of applications in assessing food origin and authentication is isotope ratio mass spectrometry (IRMS). This technique measures small differences in stable isotope ratios that can be effectively correlated with different geographical origins. The carbon isotope ratio (13C/12C) has been the most commonly employed in this type of study. By using this methodology, not only can the particular geographical origin of a product be determined, but possible adulterations with other less valuable products/ingredients from different origins can also be detected. This technique has been widely employed in combination with GC and a combustion method to determine, for example, the geographical origin of different unifloral honey types (Kropf et al., 2010) and cattle diet and origin in China (Guo et al., 2010), or to assess, together with other complementary techniques such as enantiomeric separation–GC, gas chromatography–flame ionization detection (GC–FID), and high-performance liquid chromatography–diode array detection (HPLC–DAD), the genuineness of mandarin essential oil (Schipilliti et al., 2010). Recently, LC–IRMS methods have also been introduced, for example, to simultaneously determine 13C isotope ratios of glucose, fructose, glycerol, and ethanol in sweet and semisweet wines for authenticity purposes (Guyon et al., 2011). Although the results obtained so far are promising, more research is needed to fully confirm the validity of this approach. On the other hand, proper statistical treatment of the obtained data also has to be performed after IRMS analysis in order to classify the different samples.
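Stable isotope ratios measured by IRMS are conventionally reported in delta notation, that is, as the per mil deviation of the sample ratio from that of a reference standard. The following sketch shows that conversion; the function name is hypothetical, and the ratio values in the usage example are placeholders chosen for illustration rather than measured data or certified standard values.

```python
def delta_per_mil(r_sample: float, r_standard: float) -> float:
    """Return the delta value (in per mil) of a sample isotope ratio.

    delta = (R_sample / R_standard - 1) * 1000, where R is the ratio of the
    heavy to the light isotope (for carbon, 13C/12C).
    """
    if r_standard <= 0:
        raise ValueError("r_standard must be positive")
    return (r_sample / r_standard - 1.0) * 1000.0


if __name__ == "__main__":
    # Placeholder ratios, not certified values: a sample depleted in 13C
    # relative to the standard gives a negative delta.
    r_standard = 0.011200
    r_sample = 0.010920
    print(round(delta_per_mil(r_sample, r_standard), 1))
```

Expressing results as deviations from a common standard is what makes the small differences between samples of different origin comparable across laboratories and instruments.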

17.5 CONCLUSIONS AND FUTURE OUTLOOK

MS-based metabolomics approaches possess a series of characteristics that make them ideal for the assessment of food safety, food quality, and traceability. In fact, MS-based techniques offer good sensitivity and selectivity, allowing the proper detection of target or marker compounds contained in very complex matrices, such as foods. However, at present, these techniques have not reached full maturity within the food analysis field. Profiling and fingerprinting approaches combined with chemometrics have already been employed in some food-quality-related applications. As the use of these strategies increases, new information will be collected that will enhance the knowledge of food metabolomes and of how they correlate with food quality. Metabolite fingerprinting will also increasingly become a powerful tool for the evaluation of food traceability, an aspect closely related to the detection of frauds and adulterations. With PDO products becoming more important in the food market, new MS-based methods will help to identify metabolite patterns that allow the appropriate identification of food products according to their geographical origin, or even according to the type of soil or, in the case of foods of animal origin, the diet employed.

On the food safety side, instrumental developments will surely mean that lower LODs will be reached in the near future. As the technology improves, new analytical methods will be developed that will extend the current possibilities for the sensitive and selective detection of contaminants and their related metabolites using multiresidue methods. In other areas, these instrumental advances will make it possible to detect even more metabolites in a robust and fast way. In any case, the development of these methodologies is expected to be closely related not only to the development of MS technologies and their application as direct detection methods, but also to the improvement of the hyphenation of separation techniques with MS. The development of more robust interfaces for these couplings is also expected. Consequently, technological improvements on this side will have an important influence on the related applications. In this sense, the use of high-throughput, high-separation-power analytical techniques coupled with state-of-the-art MS analyzers will increase the amount and quality of the data available for subsequent statistical and chemometric analyses. Therefore, clearer and stronger relationships between metabolite fingerprints/profiles are expected to be found.

The application of these new developments to food safety, quality, and traceability will also permit the use of a global perspective for the interpretation of the obtained results. The use of global Foodomics approaches, including MS-based metabolomics, will be of critical importance to discover new metabolite patterns that can be used to appropriately establish new food quality patterns and/or biomarkers. In fact, the determination of the metabolites that are relevant from a food quality point of view is the first step toward the subsequent detection of those metabolites in the studied foods and food products. The integration of metabolomics knowledge in food science through the implementation and use of metabolite databases will also help in the determination of the relevant metabolites linked to a particular food quality parameter.

ACKNOWLEDGMENTS

MCP thanks MICINN for her “Juan de la Cierva” contract. MH would like to thank MICINN for a “Ramón y Cajal” research contract. The authors thank AGL2011-29857-C03-01 (Ministerio de Ciencia e Innovación, Spain), and CSD2007-00063 FUN-C-FOOD projects (Programa CONSOLIDER, Ministerio de Educación y Ciencia, Spain) for their support.

18 GREEN FOODOMICS

Jose A. Mendiola, María Castro-Puyana, Miguel Herrero, and Elena Ibáñez

18.1 BASIC CONCEPTS OF FOODOMICS (AND HOW TO MAKE IT GREENER)

Foodomics has been defined as a new discipline that studies the food and nutrition domains through the application of advanced omics technologies in order to improve consumers’ well-being, health, and confidence (Cifuentes, 2009; Herrero et al., 2010a, 2012). Foodomics tries to provide new answers to the new challenges that researchers and society face in the 21st century. One of these challenges is to preserve sustainability as well as food quality and safety as a way to improve consumers’ well-being and confidence. Another is to contribute to the rational design and development of new foods with targets beyond nutrition, more focused on health improvement and disease prevention, and to support these answers with sound scientific evidence. These goals are basically “green” in themselves, since reaching them will make it possible to have safer and healthier foods while decreasing contamination and chemical hazards. Foodomics can help in this goal since most of the methodologies employed (and subdisciplines involved) can be considered basically green: -omics technologies, bioinformatics, advanced analytical chemistry (for food quality and safety), food production and design (through the development of functional foods and nutraceuticals), etc. In the present chapter, we will try to show how to make a green discipline such as foodomics even greener by applying basic concepts of green chemistry in all the areas included in foodomics (see Fig. 18.1, in which the concepts marked with a star are those that can be greened). Therefore, we will discuss different green alternatives for functional food ingredients production (based on the use of green solvents and integrated processes that produce less waste and consume less energy), ways to green analytical methods for food quality, safety, and traceability measurements (through the use of miniaturized sample preparation techniques or greener solvents and the development of new alternatives for greener separation techniques) and, finally, ways to influence -omics technologies (mainly proteomics and metabolomics) to reduce even further the preparation steps and the consumption of solvents while improving data reliability. By “thinking green” it will be possible to reach foodomics goals with an additional benefit for our society.

Foodomics: Advanced Mass Spectrometry in Modern Food Science and Nutrition, First Edition. Edited by Alejandro Cifuentes. © 2013 John Wiley & Sons, Inc. Published 2013 by John Wiley & Sons, Inc.

FIGURE 18.1 Concepts related to foodomics. Concepts marked with a star are those that can be greened.

18.2 BASIC CONCEPTS OF GREEN CHEMISTRY

Developed societies are becoming increasingly concerned about the environment and how it is affected (and will be affected in the future) by different chemical and engineering activities at both industrial and laboratory scale. Since the early 1990s, the green chemistry movement has been exploring ways to reduce the risks of chemical exposure to humans and the environment. Simply stated, green chemistry reduces or eliminates the use or generation of hazardous substances from chemical products and processes and improves all types of chemical products and processes by reducing impacts on human health and the environment. As defined by Anastas and Warner (1998), “Green Chemistry is the use of chemistry techniques and methodologies that reduce or eliminate the use or generation of feedstocks, products, by-products, solvents, reagents, etc. that are hazardous to human health or the environment.” Green chemistry technologies involve all types of chemical processes, including synthesis, catalysis, reactions, separations, analysis, and monitoring. Three main aspects dominate the twelve principles of green chemistry: waste, hazard (health, environmental, and safety), and energy. Green chemistry is not a simple subdivision of chemistry like organic or organometallic chemistry; it has environmental, technological, and social goals and is linked to the wider sustainability movement (Winterton, 2001). Green chemistry has set itself the goal of making chemical technology more environmentally benign by “efficiently using (preferably renewable) raw materials, eliminating waste and avoiding the use of toxic and/or hazardous reagents and solvents in the manufacture and application of chemical products” (Sheldon, 2000).

18.2.1 Green Processes

As seen for green chemistry, green engineering is the “development and commercialization of industrial processes that are economically feasible and reduce the risk to human health and environment.” Both green concepts are intimately related to sustainability, which means using methods, systems, and materials that will not deplete resources or harm natural cycles (Anastas and Zimmerman, 2003).
An approach to quantitatively and systematically evaluate synthetic organic reactions and processes was described by Curzons et al. (2001). The results of their work indicated that close attention to effective use and reuse of solvents results in the largest gains for reducing life cycle impacts in batch chemical operations. The most common strategy is to perform life cycle assessment (LCA). LCA is a standardized methodology, according to ISO 14044 (2006), for assessing the environmental impacts associated with a product, process, or service over its entire life cycle. It is an excellent tool to quantify fluxes of materials and energy and to assign them to different environmental impact categories (Lindahl et al., 2010).
As mentioned, one of the goals of foodomics is to provide scientific evidence that demonstrates the real effects of newly developed functional foods; in this sense, there is a growing interest in the use and development of new green processes to obtain such bioactives. In Section 18.3 of this chapter, new approaches to the extraction of bioactive components from different natural sources will be considered. These new processes involve both the use of green solvents and a decrease in energy requirements by improving efficiency and shortening extraction time. At the end of the section, new processes based on the use of compressed fluids (subcritical water and supercritical CO2) will be discussed in terms of LCA for the production of a potent antioxidant extract from rosemary leaves with application in the development of new functional foods.

FIGURE 18.2 Steps of the analytical process to be considered in the frame of ecological paradigm. Reprinted from de la Guardia and Armenta (2011c) with permission from Elsevier.

18.2.2 Green Analytical Chemistry

Keeping in mind the LCA strategies, analytical methods can be considered as processes in which preliminary information and knowledge, solvents, reagents, samples, energy, and instrument measurements are used as inputs to solve a specific problem. The outputs of those processes are the qualitative and/or quantitative composition of the analytes. However, analytical methodologies can also have side effects (e.g., energy consumption, wastes that could create risks for operators and damage the environment, etc.). These side effects of analytical methods, together with waste generation and management, are also the responsibility of method developers and users (Garrigues et al., 2010). All these aspects can be seen in Figure 18.2, and have become the key features to consider during the development of a green analytical process.
Although the analytical community was environmentally sensitive and the idea of improving analytical methods by reducing consumption of solvents and reagents predates the theoretical developments, the first descriptions of “green analytical chemistry” methods (or clean analytical methods) appeared by the mid-1990s (Armenta et al., 2008), as can be seen in Figure 18.3, in which the evolution of the number of research papers dealing with “green analytical chemistry” is shown.

FIGURE 18.3 Evolution of research papers dealing with “Green Analytical Chemistry”. Obtained from SciVerse-Scopus database searching the strings: “clean analytical chemistry” or “Green Analytical Chemistry” or “environmentally friendly analytical method.” Copyright 2012 Elsevier B.V. SciVerse® is a registered trademark of Elsevier Properties S.A., used under license. Scopus® is a registered trademark of Elsevier B.V.

The key points regarding the adverse environmental impact of analytical methodologies have been the following:

– sample pretreatment: reduction of the amount of solvents required;
– amount and toxicity of solvents: reduce solvents and reagents employed in the measurement step, especially by miniaturization;
– development of alternative direct analytical methodologies not requiring solvents or reagents (Armenta et al., 2008).

In this sense, these three main areas should be considered when developing a new analytical method or improving an existing one. Moreover, other important aspects should be considered for a successful use of green analytical chemistry, which was defined by L.H. Lawrence as: “the use of analytical chemistry techniques and methodologies that reduce or eliminate solvents, reagents, preservatives and other chemicals that are hazardous to human health or the environment and that may also enable faster and more energy-efficient analysis without compromising performance criteria” (Sandra et al., 2010). This definition makes clear that the hazard has to be reduced while keeping or improving the analytical performance of the method. This is undoubtedly a difficult task and has limited the translation of conventional methods to greener ones in recent years. In this sense, new approaches such as the greening of sample preparation through new green solvents, miniaturization, or solvent-free techniques are of paramount importance and will be discussed in detail in Section 18.4. For a deeper and more general knowledge of green sample preparation techniques, some other reviews are recommended (Curyło et al., 2007; Pawliszyn and Lord, 2010; de la Guardia and Armenta, 2011a).
The growth of -omics technologies is helping the other two key areas to develop clean analytical methods. The combination of modern analytical techniques with breakthroughs in microelectronics and miniaturization allows the development of powerful analytical devices for effective control of processes and pollution. Combining miniaturization in analytical systems with advances in chemometrics is also of interest. The evolution of chemometrics has supported the development of solvent-free methodologies based on the mathematical treatment of signals obtained by direct measurements on untreated solid or liquid samples. The growth and evolution of chemometrics has shown that spectroscopic/spectrometric methods could be the best way toward green analytical chemistry (Armenta et al., 2008).

18.3 GREEN PROCESSES TO PRODUCE FUNCTIONAL FOOD INGREDIENTS

The development of green processes for extracting functional ingredients from natural sources is one of the fields of interest in foodomics. When searching for functional ingredients, an important aspect to consider is how they are obtained. Traditional extraction techniques (Soxhlet, sonication, solid–liquid extraction (SLE), liquid–liquid extraction (LLE)) require long extraction times and large amounts of sample, provide low selectivity and, generally, low extraction yields, and need high volumes of organic solvents, generating large quantities of solvent waste that can cause environmental problems. Because the use of hazardous solvents should be avoided from the green and sustainable point of view, selective and environmentally friendly extraction procedures that use food-grade solvents to isolate bioactive compounds from natural sources are required.
In this regard, there is an enormous interest in the application of more environmentally friendly techniques that can overcome the drawbacks of traditional extraction procedures. Among the green extraction procedures, ultrasound-assisted extraction (UAE) and microwave-assisted extraction (MAE) are versatile approaches because they can be used with several solvents of different polarities, allow fast extractions, and decrease the amount of solvent used. UAE is based on acoustic cavitation, which disrupts cell walls, thus reducing the particle size and increasing the contact between the solvent and the compounds. MAE uses microwave radiation to cause motion of polar molecules and rotation of dipoles, which heats the solvent and promotes the transfer of target compounds from the matrix to the solvent.
On the other hand, the development of advanced pressurized extraction techniques such as supercritical fluid extraction (SFE), pressurized liquid extraction (PLE), or pressurized hot water extraction (PHWE, also called subcritical water extraction (SWE)), which comply with the principles of green chemistry and green engineering, could represent a key point in sustainable development. In that sense, being able to switch to greener solvents such as CO2, ethanol, or water is an advantage. The possibility of modifying the physicochemical properties of the solvents (density, diffusivity, viscosity, dielectric constant) by changing the pressure and/or temperature of the extraction, which in turn modifies their selectivity and solvating power, gives these pressurized techniques high versatility. Besides, they also offer the possibility of eliminating additional postextraction procedures (centrifugation and filtration) since they retain the sample inside the extraction cell (Rostagno et al., 2010).
The typical system used to perform SFE, PLE, or PHWE basically consists of a solvent supply and a pump to deliver it, a heater for the solvent, a pressure vessel where the extraction is carried out, a pressure controller, and a device for collecting the extract (see Fig. 18.4). For more detail on the design of pressurized fluid extractors, readers are referred to Pronyk and Mazza (2009); Pereira and Meireles (2010); Teo et al. (2010); Turner and Ibáñez (2011); Mustafa and Turner (2011).
SFE is based on the use of solvents at temperatures and pressures above their critical points.
CO2 is the solvent of choice to extract functional ingredients from natural sources by SFE. Although other solvents (such as propane, butane, or dimethyl ether) have also been proposed, none of them fulfills the principles of green chemistry as well as CO2, which is inexpensive and environmentally friendly, is generally recognized as safe (GRAS) for use in the food industry, and has easily attainable critical conditions. In addition, it can be easily removed after extraction by reducing the pressure. Given the low polarity of supercritical CO2, SFE is more suitable for the extraction of compounds with low polarity. However, its polarity can be changed by combining CO2 with cosolvents (polar modifiers such as ethanol or methanol) that increase its solvating power, making supercritical CO2 able to extract more polar analytes. SFE has long been used as a technique to extract bioactive compounds (mainly antioxidants) from different sources, such as plants, food by-products, or algae (Herrero et al., 2006; Herrero et al., 2010b). To name some examples, phenolic compounds have been extracted from rosemary (Herrero et al., 2010c; Carvalho et al., 2005), pomegranate seeds (Liu et al., 2009), or rice wine lees (Wu et al., 2009); carotenoids from tomato pomace (Shi et al., 2009), carrots (Sun and Temelli, 2006), or algae (Mendiola et al., 2005); and omega-3 from hake by-products (Rubio-Rodríguez et al., 2008).
PLE is broadly recognized as a green extraction approach, mainly due to its low organic solvent consumption. In this extraction procedure, pressure is applied to allow the use of liquids at temperatures higher than their normal boiling point. The combined use of high pressures and temperatures provides faster extractions.
Besides, high temperature can increase analyte solubility and decrease the viscosity and surface tension of the solvents, helping them reach areas of the matrix more easily and improving the extraction rate (Mustafa and Turner, 2011). PLE is more flexible than SFE in terms of the bioactive compounds that can be extracted, since it is more versatile in terms of the extraction solvents that can be used, which are selected depending on the polarity of the target compounds. However, it is less selective than SFE, and therefore interferences may be found in the extract. A high number of applications of PLE for obtaining bioactive compounds from different sources can be found in the literature. For a deeper knowledge of different applications of PLE, some interesting and recent reviews are recommended (Mendiola et al., 2007; Mendiola et al., 2008; Mustafa and Turner, 2011; Wijngaard et al., 2012; Sun et al., 2012).

FIGURE 18.4 Basic scheme of pressurized liquid extractor (a), and supercritical fluid extractor (b). Components labeled in (a): pump, oven, extraction cell, N2, extract. Components labeled in (b): CO2 pump, cosolvent pump, oven, extraction cell, collection vial, CO2 recycling.

PHWE is a green process that uses water as an extracting solvent at a high temperature (above its atmospheric boiling point (100 °C) and below its critical temperature (374 °C)) and at a pressure high enough to keep the water in the liquid state. Therefore, it can be considered a particular application of PLE with water as the extracting agent.
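The operating windows that distinguish SFE, PLE, and PHWE in the text above can be summarized in a few lines of code. The following Python sketch is only an illustration: the function name and the "other" label are hypothetical, the critical constants for CO2 and ethanol and the ethanol boiling point are rounded literature values not given in this chapter, and the check that the solvent remains liquid is simplified to "pressure above atmospheric" rather than a comparison with the vapor pressure curve.

```python
# Critical points as (temperature in °C, pressure in bar); rounded literature values.
# Only the water temperatures (100 °C, 374 °C) are quoted in the chapter itself.
CRITICAL_POINT = {
    "CO2": (31.0, 73.8),
    "water": (374.0, 220.6),
    "ethanol": (240.8, 61.4),
}

# Normal (atmospheric) boiling points in °C for solvents that are liquid at room conditions.
BOILING_POINT_C = {"water": 100.0, "ethanol": 78.4}

ATMOSPHERIC_BAR = 1.013


def classify_extraction(solvent: str, temp_c: float, pressure_bar: float) -> str:
    """Return 'SFE', 'PHWE', 'PLE', or 'other' for the given operating conditions."""
    t_crit, p_crit = CRITICAL_POINT[solvent]

    # SFE: both temperature and pressure are above the critical point.
    if temp_c > t_crit and pressure_bar > p_crit:
        return "SFE"

    # PLE/PHWE: a liquid used above its normal boiling point but below its critical
    # temperature. Simplification: any pressure above atmospheric is accepted here;
    # a real check would compare against the vapor pressure at temp_c.
    boiling = BOILING_POINT_C.get(solvent)
    if boiling is not None and boiling < temp_c < t_crit and pressure_bar > ATMOSPHERIC_BAR:
        return "PHWE" if solvent == "water" else "PLE"

    return "other"


if __name__ == "__main__":
    for case in [("CO2", 40.0, 150.0), ("water", 200.0, 100.0), ("ethanol", 150.0, 100.0)]:
        print(case, "->", classify_extraction(*case))
```

Under these assumptions, water at 200 °C and 100 bar falls in the PHWE window, while the same fluid at room temperature and pressure is classified as "other".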

Physical and chemical properties of water change dramatically at high temperatures. For instance, its dielectric constant decreases from around 80 at 25 °C to around 33 at 200 °C (i.e., close to that of a polar organic solvent such as methanol), and viscosity and surface tension are both reduced while diffusivity increases; under these conditions, the extraction process is enhanced in terms of efficiency and speed. Besides, the solubility of different bioactive compounds is also modified by temperature, favoring their transfer from the matrix to the heated liquid water. In terms of green chemistry and green engineering, water is the greenest solvent that can be used since it has negligible environmental effect, is nontoxic to health and the environment, and is safe to work with and to transport. The reader is referred to different reviews (Herrero et al., 2006; Mendiola et al., 2007; Wiboonsirikul and Adachi, 2008; Teo et al., 2010; Wijngaard et al., 2012) and a book chapter (Turner and Ibáñez, 2011) for a deeper knowledge of the extraction of bioactive compounds from different sources by PHWE.
An interesting comparison of the environmental impacts associated with the above-mentioned pressurized extraction procedures (namely, SFE and PHWE) and a traditional extraction technique (Soxhlet with hexane as extracting solvent) for the production of 1 g of antioxidant extract from rosemary leaves is presented here in terms of LCA (using the software SimaPro 7.33 (PRé Consultants, The Netherlands)). To be able to compare the three techniques, the optimum conditions for providing dry rosemary extracts with high antioxidant activity have been considered (Herrero et al., 2010c; Lagouri et al., 2010). Figure 18.5 shows the diagram of the three extraction processes considered to obtain antioxidants from rosemary leaves, including system boundaries and optimum extraction conditions.
Table 18.1 shows the key inventory data for production of dry rosemary extracts (1 g) by the three processes. The steps before extraction and those after the production stage are not included, for simplicity (they are assumed to be identical for all the processes studied), and the energy consumption of each component employed in the extraction process was calculated from its specification (for commercial equipment) and uptime. The characterization method used was CML 2 baseline 2000 V2.05. Along with the impact categories included in this method, the cost derived from the energy employed in each process was added, using the energy price for industrial consumers published by Europe’s Energy Portal (€ per kWh for a consumption of 1 GWh/year; http://www.energy.eu/, checked in May 2012). Figure 18.6 shows the environmental impact in the different categories considered by the LCA approach for the three extraction processes (bars have been normalized taking Soxhlet extraction as 100%). An analysis of the possible factors affecting all the environmental impact categories demonstrates the high significance of the electricity consumption of the three processes. Besides, it is clear that the extraction solvents used in PHWE and SFE have no important impact (they are green solvents and are used in low volumes), whereas the impact of hexane in Soxhlet extraction is higher in all the categories, mainly in ozone layer depletion.
As a step to the future, the idea of multiple integrated processes arrives with enormous potential; these processes involve the development of multiunit operations with the possibility of using different fluids. This approach can provide advantages in the development of a green processing platform able to face some of the challenges in our society such as environmental impact, sustainability, energy preservation, and health (King and Srinivas, 2009; Turner and Ibáñez, 2011; Ibáñez et al., 2012a). This green platform should work with environmentally benign solvents such as liquefied or supercritical CO2, for nonpolar to moderately polar solutes, and with pressurized hot water (between its boiling and critical points) for a wider range of polarities, considering also the use of ethanol as a cosolvent together with water or carbon dioxide. Several examples can be found in the literature of integrated processes that may favor the extraction and purification of bioactives. Among them, some deal with green processes to extract bioactive compounds (Liau et al., 2010) and others can be used as a basis for converting the reported processes into greener, more sustainable, and more efficient ones (Athukorala et al., 2006; Moreda-Piñeiro et al., 2007; Siriwardhana et al., 2008).

FIGURE 18.5 Diagram of extraction processes to obtain antioxidants from rosemary leaves. (a) PHWE, (b) SFE, and (c) Soxhlet.

TABLE 18.1 Key Inventory Data for Production of Rosemary Extracts (1 g) by PHWE, SFE, and Soxhlet Extraction With Hexane (a)

                          PHWE        SFE           Soxhlet
Products
  Rosemary extract        1 g         1 g           1 g
Inputs
 From nature
  Rosemary                2.6 g       15.4 g        23.3
 From technosphere
  Water                   47.5 g      –
  Nitrogen                0.12 kg     –
  Hexane                  –           –             230.2 g
  Carbon dioxide          –           27.7 g (b)    –
  Ethanol                 –           1.9 g (b)     –
  Electricity             10.7 kWh    3.9 kWh       12.2 kWh
Outputs
 Emissions to air
  Water                   –           –             –
  Nitrogen                –           –             –
  Carbon dioxide          –           27.7 kg (b)
 Waste to treatment
  Solid waste             1.6 g       14.4 g
  Waste water             47.5 g      –
  Solvents mixture        –           1.9 g (b)
  Residue                 –           –             22.3
  Hazardous waste         –           –             230.2 g

(a) Data to perform the LCA were taken from three different databases: Ecoinvent 2.0, ELCD, and LCA Food DK.
(b) The amounts of CO2 and ethanol correspond to the net value used, taking into account a recycling of 95% and a loss of 5% of the initial amounts (0.5 kg and 38.8 g of CO2 and ethanol, respectively).

FIGURE 18.6 Impact assessment comparison of 1 g of rosemary extract by SFE, PHWE, and Soxhlet extraction. Results normalized considering Soxhlet = 100%.
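The normalization used in Figure 18.6 (reference process = 100%) can be illustrated with a short sketch. The impact scores behind the figure are not reproduced in the text, so the example below applies the same normalization to the electricity row of Table 18.1 purely as a stand-in; the function name is hypothetical.

```python
def normalize_to_reference(values, reference):
    """Express each value as a percentage of the reference process."""
    ref = values[reference]
    if ref == 0:
        raise ValueError("reference value must be non-zero")
    return {name: 100.0 * value / ref for name, value in values.items()}


# Electricity demand per 1 g of extract, from Table 18.1 (kWh).
# This is inventory data, not one of the impact categories of Figure 18.6.
electricity_kwh = {"PHWE": 10.7, "SFE": 3.9, "Soxhlet": 12.2}

relative = normalize_to_reference(electricity_kwh, "Soxhlet")
for name, percent in relative.items():
    print(f"{name}: {percent:.1f}% of Soxhlet")
```

On this single inventory item, PHWE comes out at roughly 88% and SFE at roughly 32% of the Soxhlet value; a full impact assessment would apply the same step to each impact category.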

Processes related to enzymatic hydrolysis and extraction can be easily included in the green platform. Some studies have demonstrated that enzymatic hydrolysis is accelerated under pressurized conditions; this fact gives stronger support to the possibility of improving processes through the use of integrated pressurized fluid technologies (Moreda-Piñeiro et al., 2007). For instance, Turner’s research group demonstrated the viability of a process combining enzymatic hydrolysis in hot water, using a thermostable β-glucosidase to catalyze hydrolysis of quercetin glucosides in onion waste, plus extraction with water at high temperatures. This process was preferred over more conventional extraction/hydrolysis processes regarding primary energy consumption and global warming potential (Turner et al., 2006; Lindahl et al., 2010). It should be borne in mind that the extraction of bioactive compounds is an important step in developing a foodomics platform able to improve our understanding of how these compounds interact at the molecular and cellular levels. The above-mentioned green pressurized technologies have been used in several research works as sample preparation methods in analytical platforms that allowed proteomics, metabolomics, or transcriptomics studies to be carried out to evaluate the health benefits of functional food ingredients (Ong et al., 2004; Leon et al., 2009; Ruperez et al., 2009; Ibáñez et al., 2012b).

18.4 DEVELOPMENT OF GREEN ANALYTICAL PROCESSES FOR FOODOMICS

As mentioned throughout the book, the use of -omics techniques such as transcriptomics, proteomics, and metabolomics in foodomics derives from the search for massive information at different expression levels (transcriptome, proteome, metabolome) able to provide a better understanding of the molecular effects of, for instance, a functional food in a certain organism, or to assess, for instance, food quality, authenticity, and safety. Undoubtedly, each analytical process will include several steps, and it is impossible to approach each of them individually. In this section, some strategies applied to general steps will be discussed in terms of how to improve them from a green point of view. The main idea is to provide the reader with some of the new advances and tools to make the foodomics discipline greener; the use of some of them in a particular analytical process will depend on the application itself and on the goal established. By no means can this be considered an exhaustive evaluation of all the processes involved in such a complex discipline. The analytical chemist will have to decide how and when to implement them, without sacrificing the main purpose of the analytical determination. Therefore, ideas about how to green the sample preparation step and how to improve analytical methodologies under the umbrella of green analytical chemistry will be presented, together with sound applications in the proteomics and metabolomics fields.

18.4.1 Direct Analysis of Samples: The Greenest Approach

Undoubtedly, the clearest way to reduce wastes, energy use, and the consumption of solvents and reagents is through the direct analysis of the samples. This approach is not always possible, since most samples need to be in solution or the analytes of interest need to be selectively extracted, and therefore no direct determination can be done. The direct-analysis methodologies most employed in foodomics involve MS-based procedures (such as DART (Direct Analysis in Real Time)-MS, PTR (Proton Transfer Reaction)-MS, and IM (Ion Mobility)-MS, among others) and NMR (Nuclear Magnetic Resonance). However, other techniques, e.g., spectroscopic techniques such as NIR (Near Infrared) and Raman spectroscopy, can also provide interesting information for foodomics evaluations.

Among spectroscopic techniques, NIR gives chemical and physical information about samples from molecular vibrations detected in the infrared region closest to visible light, at wavelengths below 2500 nm. It is nondestructive, needs no sample preparation, and allows analysis in a very short time. The main drawbacks are the high limit of detection and the complexity of the spectra. The development of the technique, together with chemometric tools, has allowed, for example, the measurement of green tea quality as affected by different manufacturing processes and green tea varieties (Ikeda et al., 2007). Recent works have also developed new software that allowed its use for metabolic fingerprinting of food (Ikeda et al., 2009). On the other hand, Raman spectroscopy (which considers the scattered radiation of frequencies different from that of the incident monochromatic radiation) has been used to obtain molecular fingerprints of the samples under study.
This technique has been employed, for instance, for in vivo lipidomics, to profile the oil produced by a microalgae cell in a direct, quantitative, and fast way (Wu et al., 2011). The possibility of using Raman spectroscopy, together with NMR and MS-based methods, for metabolic fingerprinting in disease diagnostics has been reviewed (Ellis et al., 2007).

Among the different techniques that can be used for a direct analysis of the samples, NMR and MS-based methods are the most employed in metabolomics and food applications. In fact, more than 1300 records can be found in the literature dealing with NMR and metabolomics (SciVerse-Scopus database, 2012 Elsevier B.V.). NMR is based on the different resonance frequencies exhibited by nuclei (mainly hydrogen and carbon atoms) placed in a strong magnetic field. NMR has been used, for instance, to assess the quality and traceability of mozzarella cheese (Mazzei and Piccolo, 2012) and to predict the sensorial quality of canned tomatoes by means of metabolomics fingerprints correlated to sensory descriptors such as bitterness, sweetness, sourness, and saltiness (Malmendal et al., 2011). Moreover, NMR together with chemometrics has been employed for identifying urinary metabolite profiles able to discriminate the dietary intake of protein during a dietary intervention (Rasmussen et al., 2012). It has also been suggested as one of the methods for gathering scientific evidence from clinical trials in dietary intervention studies, in a foodomics approach (Puiggròs et al., 2011).

MS-based methods are probably the most used for metabolite profiling/fingerprinting. Frequently, samples are subjected to extensive preparation prior to their introduction into the MS system or into the separation system coupled online to MS.
Recently, a new sample ionization source, called DART, has been developed, allowing a direct and rapid identification of analytes in different types of samples (including solid samples) without any sample treatment (Cody et al., 2005). DART coupled to different types of MS analyzers has been used for measuring food authenticity, quality, and safety; Hajslova et al. have discussed different applications of DART in complex food matrices (Hajslova et al., 2011), including optimization of the operating conditions and the use of DART for pesticides, detection of adulteration by melamine, mycotoxins, migration of packaging materials into food products, food authentication, etc. For instance, DART-TOFMS has been employed for metabolomics fingerprinting/profiling of beer origin in real time (Cajka et al., 2011).

On the other hand, PTR-MS has been suggested as a potent high-throughput technique for metabolomics, targeting volatile analysis without a previous sample preparation step and able to provide a fast response and ultrahigh detection sensitivity. This technique, coupled to TOF-MS, has been used, for instance, for fruit metabolomics (Cappellin et al., 2012) and for fingerprinting of food samples in processes relevant to the food industry such as coffee roasting, acrylamide production during the Maillard reaction, metabolic and catabolic reactions of fruits and meat during storage, and in vivo monitoring of flavor release during consumption, directly related to food perception; a review has recently been published by Biasioli et al. (2011).

The last MS-based technique that will be presented in this section is ion mobility spectrometry (IMS), which allows the direct introduction of solid and liquid samples into the MS analyzer by thermal desorption; the vapor generated is ionized by atmospheric pressure chemical ionization (APCI).
This approach has been used, for example, for metabolic profiling of human blood (Dwivedi et al., 2010); the methodology allowed the detection of around 1100 metabolite ions, among which amino acids, organic acids, fatty acids, carbohydrates, purines, etc. were observed.

Thus, it is clear that there are many options to work without sample preparation and, therefore, to meet the requirements of green analytical chemistry in terms of minimizing the use of solvents and, thus, reducing wastes and operator risks. Nevertheless, the above-mentioned techniques (and others that have not been included) are not always available. Moreover, samples are usually too complex for direct analysis, since the compounds of interest might not be present in the sample at the minimum levels required for their direct detection and quantification without any previous extraction and preconcentration step. In those cases, sample preparation must be undertaken and greened using different strategies, such as the employment of new green solvents and techniques and the automation and miniaturization of the sample preparation procedures and systems.
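As a numerical aside to the NIR description earlier in this section, the 2500 nm upper wavelength limit can be expressed in the units more often used in vibrational spectroscopy. The sketch below is a plain unit conversion; the function names are illustrative.

```python
SPEED_OF_LIGHT_M_S = 299_792_458.0  # exact SI value


def wavelength_nm_to_wavenumber_cm1(wavelength_nm):
    """Convert a wavelength in nm to a wavenumber in cm^-1 (1 cm = 1e7 nm)."""
    if wavelength_nm <= 0:
        raise ValueError("wavelength must be positive")
    return 1.0e7 / wavelength_nm


def wavelength_nm_to_frequency_thz(wavelength_nm):
    """Convert a wavelength in nm to a frequency in THz."""
    if wavelength_nm <= 0:
        raise ValueError("wavelength must be positive")
    return SPEED_OF_LIGHT_M_S / (wavelength_nm * 1.0e-9) / 1.0e12


nir_upper_limit_nm = 2500.0  # upper NIR wavelength limit quoted in the text
print(wavelength_nm_to_wavenumber_cm1(nir_upper_limit_nm))
print(round(wavelength_nm_to_frequency_thz(nir_upper_limit_nm), 1))
```

A wavelength of 2500 nm corresponds to 4000 cm^-1, or about 120 THz, which marks the long-wavelength boundary of the NIR region referred to above.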

18.4.2 Green Sample Preparation Techniques

As previously seen, analytical methods themselves constitute a potential risk through the repetition of operations that require solvents, chemicals, and energy and that yield wastes. “Green analytical chemistry” principles can be implemented at each stage of the analytical process. Sample preparation is typically considered one of the “bottlenecks” of any analytical procedure, not only in throughput but also in terms of greening the analysis. Sample preparation operations are characterized by their complexity, diversity, tediousness, and difficulty of automation. The desire to spend less time, effort, and resources on sample preparation has created a trend toward more selective sample preparation procedures that achieve better cleanup and improved analysis at lower concentrations. It is also important to keep in mind that the best clean-up treatment is no treatment but, as previously seen, this is not possible most of the time.

Sample preparation procedures that take greenness into account are not always easy to develop. In fact, a closer look at the scientific literature shows that sample treatment has been the most evaluated analytical step in terms of greenness. The advancement of sample preparation tools pursues the following goals (de la Guardia and Armenta, 2011a):

(a) reduce amount of sample to treat
(b) reduction or elimination of pollutant solvents/acids (miniaturization)
(c) multiple compound extraction simultaneously
(d) increase automation and throughput determination.

The advent of advanced MS technologies in recent years has had several consequences for our analytical work, including the sample preparation step. The higher sensitivity and selectivity of modern mass spectrometers combined with LC and/or GC make it possible to simplify and miniaturize sample preparation. There is no great need for enrichment in the sample preparation step, which in the past was of vital importance because samples needed to undergo a chain of specific treatments to make them compatible with the sensitivity of the analytical techniques used (Sandra et al., 2011). There are several reviews and publications dealing with green sample preparation techniques for environmental analysis (Tobiszewski et al., 2009; Curyło et al., 2007) and for food analysis (Sandra et al., 2008), but not many works published to date apply to -omics.

When dealing with metabolomics and/or foodomics, some of the analytes are nonvolatile or semi-volatile and the matrices include solid and liquid samples. The first step is, normally, a solvent extraction step to enrich the target solutes from the matrix. Soxhlet extraction was introduced in 1879 and requires large amounts of solvent, energy, and time. As previously seen, more environmentally friendly techniques such as UAE, SFE, PLE, PHWE, MAE, and matrix solid-phase dispersion (MSPD, or its variant QuEChERS) are being used (Sandra et al., 2011). All of these techniques have in common a drastic reduction in the amount of solvents used, since other physical processes such as pressure, temperature, microwaves, and high-frequency acoustic waves, among others, are applied to improve the efficiency and to speed up the extraction process as compared to the conventional ones. These new methods also reduce the amount of wastes generated and help reduce energy consumption.

When dealing with volatile analytes, solvent-free techniques can be used.
Gas phase sampling is intensively used in food analysis and, recently, in metabolomics (breath analysis) (Guamán et al., 2012; Kim et al., 2012a). The main volatile analysis techniques used are static headspace sampling (SHS) or dynamic headspace sampling (DHS), in-tube extraction (ITEX), purge and trap (P&T), gas phase stripping, solid phase microextraction (SPME), and headspace sorptive extraction (HSSE).

For example, among the different procedures that can be used in comprehensive omics-based analyses providing insights into complex metabolic networks of biological systems, MALDI-imaging mass spectrometry (MALDI-IMS) has recently been employed to visualize the spatial distribution of biomolecules without extraction, purification, separation, or labeling of biological samples (Goto-Inoue et al., 2011). This advanced technique can use several sample preparation procedures, among which the greenest are cryomicrotome and freeze–fracture techniques, but sublimation has recently appeared as a fast and solvent-free sample preparation technique for MALDI-IMS -omics analysis (Hankin et al., 2007).

In other cases it becomes necessary to derivatize samples, which usually increases the environmental impact of the analysis because reagents and solvents are required. For green foodomics, the ideal situation is to eliminate the need for derivatization. However, if derivatization is still required for analysis, the use of less hazardous chemicals is a step toward a greener methodology (Keith et al., 2007). For instance, Fabbri et al. (2005) developed a method for the derivatization of fatty acids that is greener than the traditional methylation method using BF3-methanol. Their method consisted of dissolving an aliquot of the vegetable oil in dimethyl carbonate, which is pyrolyzed with TiSiO4 online with gas chromatography.
Another advantage of the above-mentioned processes is the possibility of automation, thus increasing the speed of sample measurement processes, and of miniaturization, thus further reducing the amount of reagents used. Overall, these processes, including the simultaneous treatment of samples and the multianalyte determination in a single run, help achieve greener processes and contribute to the sustainability of the method. It must be clear that these approaches should only be undertaken when analytical features are not compromised in terms of selectivity, accuracy, representativeness, and sensitivity. If the new green method does not meet the quality criteria needed, it should not be considered an alternative.

18.4.3 Green Separation Techniques

Among the separation techniques most used in foodomics, chromatographic and electrophoretic techniques are excellent platforms for the separation, profiling, and quantification of target compounds in complex samples. In this section, the different approaches used to make separation techniques (including gas chromatography, GC; liquid chromatography, LC; supercritical fluid chromatography, SFC; capillary electrophoresis, CE; and microanalytical systems) greener for metabolomics and proteomics will be addressed. For simplification, all techniques will be considered online and off-line with MS for the separation, detection, identification, and quantitation of metabolites from different complex samples, since at present this is the most common approach for -omics studies.

18.4.3.1 Gas Chromatography and Supercritical Fluid Chromatography

Gas chromatography is undoubtedly the technique of choice when dealing with volatile and semi-volatile analytes. GC is inherently a green separation technique compared to LC since it does not use toxic organic solvents as mobile phase and, therefore, no wastes and no toxic hazards for the operators are expected. Moreover, GC can be even greener if solvent-free sample preparation techniques are implemented, as previously mentioned. On the other hand, the main drawbacks of GC are its inability to analyze certain compounds (highly polar or nonvolatile) without derivatization and the high temperatures used for analyte elution, which require high energy consumption. Therefore, GC should be selected from a green point of view for certain applications, but not always: for some metabolomics studies in which polar compounds are involved, LC can be more convenient both to meet the performance criteria and from a green standpoint.

One of the ways to contribute to “green analytical chemistry” is through the reduction of analysis time while maintaining separation and resolution. To do so, shorter columns with smaller internal diameters are suggested, which provide similar column efficiencies and plate numbers. In this way, energy consumption and GC operating costs can be reduced, since the time per analysis is also reduced. Using this approach, the metabolic fingerprinting of green tea leaf has been evaluated (Jumtee et al., 2009). Another approach to reduce energy consumption has been the use of low thermal mass (LTM) technology, which can be used for ultrafast GC. This mode of chromatography should only be employed when resolution is not compromised (Luong and Gras, 2006).
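The statement that a shorter, narrower GC column can preserve the plate number can be made concrete with a rough calculation. The sketch below relies on a common rule of thumb, which is an assumption and not something stated in this chapter: for thin-film open-tubular columns operated near the optimum carrier gas velocity, the minimum plate height is approximately equal to the column internal diameter, so the plate number is approximately L / i.d. The column dimensions are illustrative.

```python
def approximate_plate_number(length_m, internal_diameter_mm):
    """Approximate plate number, ASSUMING minimum plate height ~ column i.d."""
    plate_height_m = internal_diameter_mm * 1.0e-3  # rule-of-thumb assumption
    return length_m / plate_height_m


def equivalent_length_m(length_m, id_original_mm, id_new_mm):
    """Length of a narrower column giving the same approximate plate number."""
    return length_m * id_new_mm / id_original_mm


# Illustrative dimensions: a 30 m x 0.25 mm column replaced by a 0.10 mm i.d. column.
conventional_n = approximate_plate_number(30.0, 0.25)
narrow_length_m = equivalent_length_m(30.0, 0.25, 0.10)
narrow_n = approximate_plate_number(narrow_length_m, 0.10)

print(f"conventional column: N ~ {conventional_n:.0f}")
print(f"narrow-bore column: {narrow_length_m:.1f} m, N ~ {narrow_n:.0f}")
```

Under this assumption, a 12 m column of 0.10 mm i.d. gives about the same plate number as a 30 m column of 0.25 mm i.d., which is the basis for the shorter run times and lower energy use mentioned above. Real columns deviate from the rule depending on film thickness, retention, and carrier gas.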

Supercritical fluid chromatography (SFC) was introduced in the late 1980s as an alternative to normal-phase LC (NP-LC); its main advantages over NP-LC are the increased diffusivity and resolution, the reduced viscosity, which allows faster separations, and the lower solvent consumption, since only carbon dioxide plus small amounts (up to 20%) of polar solvents and/or additives are used during the analysis. Therefore, using SFC is always greener than employing LC, mainly considering the toxic organic solvents employed in NP-LC. Another advantage of SFC is the possibility of scaling up the separation processes to semipreparative or preparative scale, with an important reduction in the amount of solvents used. Due to its nonpolar character, SFC has mainly been used for metabolomics purposes to analyze lipid profiles in different types of complex samples, such as soybeans (Lee et al., 2012). Phospholipids, glycolipids, neutral lipids, and sphingolipids have also been analyzed by SFC (Bamba et al., 2008). However, the most important field of application of SFC has undoubtedly been chiral separations for drug discovery, with about 260 references dealing with SFC and chiral analysis found in the SciVerse-Scopus database in June 2012. Some of these new approaches have also been suggested for the separation of urinary metabolite isomers by SFC with chiral stationary phases (Wang et al., 2006).

18.4.3.2 Liquid Chromatography

A search of the scientific literature shows that LC is the separation technique most extensively used for metabolomics; in June 2012, more than 500 references were found in the SciVerse-Scopus database using the keywords “LC–MS and metabolom*.” HPLC is an efficient separation technique that can be used to separate different groups of compounds of different chemical classes, such as hydrophilic and hydrophobic compounds, salts, acids, bases, etc. HPLC, as opposed to GC, is not limited to the separation of thermally stable volatile or semi-volatile compounds; its separation mode depends on the chemical nature of the target solute(s). These modes include RP (reversed phase), NP (normal phase), ion exchange, chiral, size exclusion, hydrophilic interaction liquid chromatography (HILIC), and mixed modes. The choice of mobile and stationary phases and of separation mechanism will depend on the metabolites of interest, their concentration in the sample, and the presence of interfering compounds. In terms of “green analytical chemistry,” several approaches have been used to develop green strategies applied to LC. The different strategies are shown in Figure 18.7 and can be divided into:

(i) replacement of toxic organic solvents;
(ii) minimizing use of solvents through the use of monolithic and nonporous stationary phases, and
(iii) minimizing use of solvents through the miniaturization of the technique (UHPLC, nano-LC).

FIGURE 18.7 Strategies for greener liquid chromatography.

Replacement of Toxic Organic Solvents

At present, most of the mobile phases used in RP-HPLC consist of binary mixtures, acetonitrile/water or methanol/water. Both methanol and acetonitrile have favorable properties for LC use, such as compatibility with water, low UV absorbance (in a wide λ range), relatively low viscosity, high purity, and low reactivity. Nevertheless, from an environmental point of view, they have several drawbacks, such as high toxicity and high disposal costs. Although methanol is considered less toxic than acetonitrile for analytical applications, methanol cannot always be used instead of acetonitrile, mainly because of its different selectivity. Undoubtedly, the selection of other solvents such as water, ethanol, or even acetone can make the chromatographic process greener, mainly when selectivity can be optimized through correct method development for the separation of highly complex samples.

A new approach based on HILIC for the separation of polar and ionizable compounds in metabolomics was introduced in 2002 by Fiehn’s group (Tolstikov and Fiehn, 2002) and compared to NP-LC. The possibility of substituting the toxic organic compounds commonly employed in NP-LC with other, less toxic solvents is also an advantage of the HILIC technique. Recently, the use of ethanol mixed with CO2 in HILIC (dos Santos Pereira et al., 2010) in an enhanced fluidity mode (water/ethanol/CO2) was confirmed for the separation of nucleobases with the same performance as water/acetonitrile, thus demonstrating the viability of certain greener alternatives, such as enhanced fluidity, to the use of organic solvents.

A different way to replace toxic organic solvents is the employment of elevated temperatures in LC separations. The associated advantages are a reduction in analysis time and in the amount of organic modifier in the mobile phase.
Increasing water temperature results in a reduced viscosity and an important decrease in the dielectric constant (and thus in the polarity of the water), meaning that water starts dissolving less polar compounds as temperature increases. Therefore, in an ideal situation, pure water alone could be used to elute both polar and nonpolar metabolites by means of temperature programming (a thermal gradient) during the analysis. For a comprehensive review on the topic, readers are referred to Greibrokk and Andersen (2003). Although the theory is clear, there are only a few pioneering works demonstrating the possibility of using high temperature LC for metabolomics, such as that of Gika et al. (2008), which presents the application of high temperature LC to the global metabolite profiling of the plasma and urine of normal and Zucker (fa/fa) obese rats.
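The drop in the dielectric constant of water with temperature can be illustrated numerically. The sketch below uses an empirical cubic polynomial commonly attributed to Malmberg and Maryott (1956) for liquid water at atmospheric pressure; both the coefficients and the 0 to 100 °C validity range are assumptions brought in for illustration and should be checked against a primary source. The polynomial does not cover the pressurized, superheated conditions used in high temperature LC, so it only shows the direction and rough size of the trend.

```python
def water_dielectric_constant(temperature_c):
    """Static dielectric constant of liquid water at atmospheric pressure.

    Empirical polynomial (coefficients ASSUMED from Malmberg and Maryott, 1956),
    used here only within 0-100 degrees Celsius.
    """
    if not 0.0 <= temperature_c <= 100.0:
        raise ValueError("polynomial is only used here between 0 and 100 degrees C")
    t = temperature_c
    return 87.740 - 0.40008 * t + 9.398e-4 * t**2 - 1.410e-6 * t**3


for temperature in (25.0, 50.0, 75.0, 100.0):
    print(f"{temperature:5.1f} C: epsilon ~ {water_dielectric_constant(temperature):.1f}")
```

Over this range the computed value falls from about 78 at 25 °C to about 56 at 100 °C, consistent with the statement that hot water behaves as a progressively less polar solvent.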

Minimizing Use of Solvents Through the Use of Monolithic and Nonporous Stationary Phases

Monolithic columns have recently been suggested for LC separations. These columns are specially designed for independent control of pore sizes (meso- and macropores) in order to improve separation in terms of mass transfer kinetics and to shorten analysis time, owing to their high permeability and low backpressures. The use of monolithic columns provides an important reduction in solvent consumption, since separations are faster (by a factor of 4) than with conventional LC columns. Although an interesting application of this type of column has been suggested in plant metabolomics (Tolstikov et al., 2003), its main use is in the field of proteomics, in which more than 170 documents have been found concerning the use of such columns for human proteome analysis (Van de Meent and De Jong, 2011; El Deeb, 2011; Iwasaki et al., 2012). Important developments are being carried out in this field in terms of the evaluation and design of new materials for the production of monolithic columns (Alzahrani and Welham, 2011; Calleri et al., 2012).

Partially porous stationary phases, formed by particles with a solid core and a relatively narrow layer of porous material, have several advantages over conventional, completely porous stationary phases in terms of “green analytical chemistry.” Since these types of columns have a shorter diffusion path (the main part of the particle is nonporous and thus the compound cannot penetrate it), faster analyses are expected with conventional LC systems, which is ideal for high-throughput analysis. This approach has been used for the profiling of lipids in human and mouse plasma (Hu et al., 2008) and for comprehensive proteomics (François et al., 2009).

Minimizing Use of Solvents Through the Miniaturization of the Technique (UHPLC, Nano-LC)

A general alternative that can always be used to minimize reagents and wastes (and, therefore, disposal costs) is the miniaturization of the techniques. LC has been one of the techniques most studied for downscaling, mainly due to the associated advantages: (1) better resolving power in shorter time, (2) less sample volume necessary for the analysis, and (3) reduction of the cost and toxicity of solvents (de la Guardia and Armenta, 2011b). One of the strategies for LC miniaturization is the reduction of either the internal diameter of the column or the particle size. The use of smaller internal diameters, together with shorter columns and smaller particles, is one of the best ways to reduce solvent consumption. When using such microscale columns, one way to reduce solvent consumption without losing efficiency is to decrease the flow rate. In order to keep the same efficiency in microscale columns as that obtained using conventional columns, working conditions should allow the maximum plate number (N) and, therefore, the optimum velocity ($u_{\mathrm{opt}}$) should be employed. To work at $u_{\mathrm{opt}}$, the flow rate should be decreased by a factor $F = (\mathrm{i.d.}_{\mathrm{conventional}} / \mathrm{i.d.}_{\mathrm{downscaled}})^{2}$.

Using this theory, it can be seen that when switching from 4.6 mm columns to 2.1 mm columns, F equals 4.8 and the flow rate might be decreased from 1 mL/min to 0.2 mL/min without the separation being compromised. An additional gain in sensitivity is obtained under these conditions, since less dilution of the solutes in the mobile phase occurs. Moreover, the required sample volume is also reduced, which is important in -omics applications dealing with biological samples. The use of microscale columns is now general in most laboratories and does not pose a challenge in terms of LC instrumentation, as it only requires minimizing extra-column volumes in the system to maintain the separation efficiency and performance.

Another option to decrease solvent consumption is to decrease the column particle diameter in combination with the column length; in this way, the column efficiency can be maintained while analysis time is decreased, giving faster separations with lower solvent consumption. Moreover, since the curves of efficiency versus mobile phase velocity are flatter with this type of column, it is possible to increase the flow rate even further and therefore to obtain faster separations with the same efficiency. However, the use of particle diameters lower than 2 µm generates higher backpressures along the column, which means that instruments able to operate at very high pressures are needed. The use of ultra high pressure liquid chromatography (UHPLC) is also common now in most laboratories, mainly in relation to -omics applications.

A further reduction in solvent consumption can be achieved by working with capillary or nano-columns in LC; in those cases, flow rates decrease substantially, to 10–2500 nL/min for nano-LC.
It is important to emphasize that in those cases, specific pumps able to deliver micro- or nanoliter flows should be used, and that working with high volume pumps operated in split-flow mode is not green and should be avoided. Capillary and nano-LC are typically used in proteomics (with more than 400 results in the SciVerse-Scopus database dealing with “nano-LC and proteome*”) (Rosenling et al., 2011; Faurobert et al., 2009) and metabolomics (Myint et al., 2009) since, together with higher efficiencies, a high sensitivity is achieved for small sample sizes due to the concentration-sensitive nature of nanospray-ESI-MS. Applications to proteomics include the use of nano-LC-MS/MS for identifying protein spots detected by 2-DE (Huerta-Ocampo et al., 2012), the study of embryo development in rice through proteomic analysis (Xu et al., 2012a), and studies related to biomarker discovery in cancer (Yu et al., 2011; Roberts et al., 2012). The technique has also been used for peptidomics (Ueda et al., 2011) and metabolomics, for example, for identifying metabolites from in vitro and in vivo samples (Liu et al., 2012), among many other studies.
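The column downscaling arithmetic described in this subsection (a flow-rate reduction factor equal to the squared ratio of internal diameters) can be sketched in a few lines. The column diameters and the conventional flow rate are those quoted in the text; the 30 min run time and the 300 nL/min nano-LC flow rate are illustrative assumptions chosen within the quoted 10–2500 nL/min range, and the function names are hypothetical.

```python
def downscaling_factor(id_conventional_mm, id_downscaled_mm):
    """F = (i.d. conventional / i.d. downscaled)^2, keeping the linear velocity constant."""
    return (id_conventional_mm / id_downscaled_mm) ** 2


def scaled_flow_rate_ml_min(flow_ml_min, id_conventional_mm, id_downscaled_mm):
    """Flow rate on the narrower column that preserves the linear velocity."""
    return flow_ml_min / downscaling_factor(id_conventional_mm, id_downscaled_mm)


def solvent_per_run_ml(flow_ml_min, run_time_min):
    """Mobile phase consumed in one run at constant flow."""
    return flow_ml_min * run_time_min


factor = downscaling_factor(4.6, 2.1)                 # diameters from the text
narrow_flow = scaled_flow_rate_ml_min(1.0, 4.6, 2.1)  # 1 mL/min on the 4.6 mm column

RUN_TIME_MIN = 30.0          # assumed run time, for illustration only
NANO_FLOW_ML_MIN = 300e-6    # assumed 300 nL/min nano-LC flow

print(f"F = {factor:.1f}, scaled flow = {narrow_flow:.2f} mL/min")
for label, flow in (("4.6 mm", 1.0), ("2.1 mm", narrow_flow), ("nano-LC", NANO_FLOW_ML_MIN)):
    print(f"{label}: {solvent_per_run_ml(flow, RUN_TIME_MIN):.3f} mL per run")
```

The factor evaluates to about 4.8 and the scaled flow to about 0.21 mL/min, in line with the 0.2 mL/min quoted above. For the assumed 30 min run, solvent use falls from 30 mL to roughly 6 mL on the 2.1 mm column, and to about 0.009 mL at the assumed nano-LC flow.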

18.4.3.3 Capillary Electrophoresis

In terms of green analytical chemistry, CE allows the replacement of established methodologies with greener methods that consume lower amounts of solvents. CE, which uses an aqueous buffer to separate charged analytes, is very appealing as a replacement for LC in some cases, since it provides very high efficiencies, short analysis times, low sample and electrolyte consumption, and ease of operation and automation. Moreover, CE can be used in different modes, thus covering a wide range of analytes. Figure 18.8 shows a comparison of CE and LC in terms of green analytical chemistry requirements.

FIGURE 18.8 Comparison of typical LC versus CE based on the green analytical concepts. Adapted with permission from Armenta et al. (2008). Copyright 2012 Elsevier B.V.

CE has been widely used in -omics technologies (more than 280 articles in the SciVerse-Scopus database under “capillary electroph*” and “metabolom*”). Relevant applications can be found in disease biomarker discovery (Kumar et al., 2012; Ban et al., 2012; Fuchs and Hewitt, 2011; Issaq et al., 2011), in the study of food treatments and their effects on the metabolome (Sugimoto et al., 2012), in nontargeted profiling analyses of herbal medicine extracts (Iino et al., 2012), and in the metabolic assessment of the nutraceutical effect of different natural extracts in animal models (Balderas et al., 2010; Moraes et al., 2011; Godzien et al., 2011), etc. However, it is mainly in the field of proteomics that CE is most active. For instance, in a very recent review, Xu et al. (2012b) discussed the advantages of different platforms for proteomics based on the use of 2D CE, CE coupled with capillary LC, and microfluidic devices. The advantages mentioned are faster analysis, higher separation efficiency, and less sample and solvent consumption than conventional methods based on LC or slab gel electrophoresis. Moreover, advantages of CE–MS as compared to LC–MS have been observed in the field of proteomics and peptidomics (Mullen et al., 2012). In general, an extensive number of references (more than 900 under “capillary electroph*” and “proteom*”) were obtained, although most of them showed the employment of CE in microfluidic systems, as will be mentioned in the following section.

18.4.3.4 Analytical Microsystems (Microfluidic, Lab-on-a-Chip, μTAS) Micro total analysis systems (μTAS), also known as “lab-on-a-chip,” aim to integrate at the microscale, in a single device, all the analytical steps (sample preparation, analyte separation, and detection) needed to carry out a complete analysis (Burns et al., 1998; Manz et al., 1990; Ríos et al., 2006). The challenges and difficulties in developing μTAS have been reviewed by Ríos et al. (2006); the main benefits of these systems are the analytical improvements associated with scaling down the size of the device, the minimized consumption of reagents and solvents, the increased automation, the reduced manufacturing costs, and the use of an integrated platform that may improve, in terms of green analytical chemistry, the different steps of a whole analysis (Kock et al., 2000).

Microfluidic devices such as nano-LC/MS, CE-MS, etc. have been widely applied in proteomics and metabolomics. Nano-LC systems have demonstrated their ability in different fields, such as the determination of certain drugs of abuse and metabolites in human hair (Zhu et al., 2012) or biomarker discovery (Houbart et al., 2011; Bai et al., 2011; Armenta et al., 2009; Horvatovich et al., 2007), among others. On the other hand, capillary electrophoresis has been widely applied in lab-on-a-chip systems. Using these types of chips, it is possible to obtain high separation efficiencies in a very short time. As mentioned in the review by Ríos et al. (2006), there are several key reasons for the dominance of CE microchips over chromatographic techniques: those related to CE analytical performance (rapid analysis, high sample throughput, small sample volumes, and separation efficiency that is maintained or even increases as the scale decreases), and those intrinsic to miniaturization and technological developments (the easy fabrication of miniaturized devices, the existence of electro-osmotic flow in glass and polymers, the chemistry involved in fabricating devices based on glass and polymers, and the convenient use of electrokinetic phenomena to move fluids through the device).

One interesting application of these microdevices is single-cell analysis (Kim et al., 2012b). A very recent review discusses the latest developments in microfluidics aimed at total single-cell analysis on chip, from an individual live cell to its genes and proteins (Yin and Marshall, 2012); work on the profiling of metabolites and peptides in single cells has also been reported (Rubakhin et al., 2011). Microchips have also been used as programmable diagnostic devices able to measure DNA, proteins, and small molecules in the same system (Jokerst and McDevitt, 2010), which can be used for the diagnosis and prognosis of cancer, heart disease, etc.

18.5 COMPARATIVE LCA STUDY OF GREEN ANALYTICAL TECHNIQUES: CASE STUDY

Throughout this chapter, several ways of making analyses cleaner, and their implications, have been described. In this section we show some examples of how the green profile of analytical techniques used in foodomics can be quantified. For this purpose, LCA methodologies are used. In fact, Gaber et al. (2011) suggested that greenness should be incorporated during analytical method development, along with the conventional standards of accuracy, robustness, selectivity, and reproducibility. The American Chemical Society Green Chemistry Institute has introduced a greenness profile for analytical methods, which can be found on its website devoted to environmental analysis (www.nemi.gov). This profile is based on four categories: persistent, bioaccumulative, and toxic (PBT); hazardous; corrosive; and waste amount (Keith et al., 2007), as shown in Figure 18.9. Additional indicators should probably be included that can differentiate among green methods in terms of the energy and reagents consumed and the volumes of waste generated; a different methodology is therefore needed.

Greenness profile symbol: Explanation of quadrants
Upper quadrants: PBT*, Hazardous
Lower quadrants: Corrosive, Waste amount
Each quadrant represents a greenness criterion on which all methods were rated. Green-filled quadrant = the method passes the selection criteria defined for that quadrant.
* PBT stands for “persistent, bioaccumulative, and toxic”

FIGURE 18.9 Greenness profile proposed by The American Chemical Society Green Chemistry Institute that can be found at http://www.nemi.gov (last accessed October 2012).

For profiling purposes, six advanced analytical methods developed in our laboratory for the characterization of rosemary antioxidant extracts have been selected, namely, HPLC-DAD and MEKC-DAD (Ibáñez et al., 2000), UHPLC-DAD-MS (Herrero et al., 2010c), CE–MS (Herrero et al., 2005), SFC-FID (Ramírez et al., 2004), and GC-FID (Ramírez et al., 2007). In all of them, only the analytical part has been considered for LCA purposes, excluding sample preparation. For comparison purposes, two sample preparation techniques have been included, one “green” (SFE) and one “nongreen” (Soxhlet with hexane). Table 18.2 shows the key inventory used; the functional unit employed for comparison is one analysis, including the conditioning time of each technique. The software used for LCA was SimaPro 7.33 (PRé Consultants, The Netherlands). The data needed to perform the LCA were taken from two databases: Ecoinvent 2.0 (life cycle inventory 2007, www.ecoinvent.org) and ELCD database 2.0 (http://lct.jrc.ec.europa.eu). All the methods studied can be considered highly green according to the rules used by the American Chemical Society Green Chemistry Institute. The only category that fails is the use of PBT compounds, and only in some of the methods.

TABLE 18.2 Key Inventory of Products Considered to Perform Comparative LCA of Analytical Methods

HPLC-DAD
Reagents used: acetonitrile 7.96 g; water 16.67; acetic acid 0.20 g
Column: Zorbax C18 column, 3.5 μm particle, 4.6 × 150 mm
Total analysis time: 45
Energy used: 0.39 kWh
Wastes: acetonitrile 7.96 g, water 16.67, acetic acid 0.20 g (in liquid solution)

MEKC-DAD
Reagents used: sodium dodecyl sulfate 0.03 g; sodium deoxycholate 0.04 g; boric acid/sodium tetraborate hydrate 0.05 g; water 3 g; nitrogen 0.05 g
Column: fused-silica capillary, 27 cm
Total analysis time: 13
Energy used: 0.16 kWh
Wastes: sodium dodecyl sulfate 0.03 g, sodium deoxycholate 0.04 g, boric acid/sodium tetraborate hydrate 0.05 g, water 3 g (in liquid solution); N2 0.05 g (as gas)

UHPLC-MS
Reagents used: acetonitrile 0.46 g; water 3.01; formic acid 0.01 g; nitrogen 0.5 g
Column: Hypersil Gold column (50 mm × 2.1 mm, d.p. 1.9 μm)
Total analysis time: 7
Energy used: 0.58 kWh
Wastes: acetonitrile 0.46 g, water 3.01, formic acid 0.01 g (in liquid solution); N2 0.5 g (as gas)

CE-MS
Reagents used: ammonium acetate 0.02 g; ammonium hydroxide 0.02 g; water 4 g; 2-propanol 0.02 g; nitrogen 0.05 g
Column: fused-silica capillary, 27 cm
Total analysis time: 25
Energy used: 1.96 kWh
Wastes: ammonium acetate 0.02 g, ammonium hydroxide 0.02 g, triethylamine 0.01 g, water 4 g, 2-propanol 0.02 g (in liquid solution); N2 0.05 g (as gas)

SFC-FID
Reagents used: CO2 16 g; H2 0.05 g
Column: fused-silica capillary, 27 cm, with SE54
Total analysis time: 32
Energy used: 0.81 kWh
Wastes: CO2 16 g, water 0.2 g (as gas)

GC-FID
Reagents used: He 15 g; H2 0.5 g
Column: fused-silica capillary, 30 m, with SE54
Total analysis time: 70
Energy used: 2.10 kWh
Wastes: He 15 g, water 0.2 g (as gas)

A closer look at the impacts of each analysis, obtained by LCA, can provide a better understanding of what lies behind them and of the practical consequences of selecting each method. A first approach is a comparative assessment of the human toxicity and ecotoxicity of each analytical method. For this purpose, the IMPACT 2002+ calculation method was selected. IMPACT 2002+ proposes a feasible implementation of a combined midpoint/damage approach, linking all types of life cycle inventory results via 14 midpoint categories to four damage categories (Jolliet et al., 2003). Both human toxicity and ecotoxicity effect factors are based on mean responses rather than on conservative assumptions. In this calculation method, all midpoint scores are expressed in units of a reference substance and related to the four damage categories: human health, ecosystem quality, climate change, and resources.

FIGURE 18.10 Comparative assessment of human toxicity and ecotoxicity impacts of each analytical method performed using IMPACT 2002+ calculation method.

As can be seen in Figure 18.10, all the tested methods have a certain impact, but this impact is much lower than that associated with the sample preparation step. Among the analytical methods studied in the present comparison, the GC-FID analysis of the volatile components of rosemary is the method with the highest impacts. Although this method does not use PBT compounds, the sample is dissolved in a few microliters of ethanol, and the mobile phase is helium, it produces the highest impacts, probably because of the high energy consumption per analysis (mainly from heating the oven to high temperatures). On the other hand, the MEKC-DAD analysis yielded the lowest impacts, even though several compounds and additives are present in the mobile phase. The case of UHPLC-MS is worth underlining: it is a low-impact method able to provide a great deal of information about the sample.

A closer look at each category can be taken based on ISO 14044 (2006) using well-known indicators. The calculation method used can ungroup those categories, and the result is shown in Figure 18.11. As seen before, MEKC-DAD provided the lowest impacts. One important point to take into account is the higher production of carcinogens by HPLC in comparison to UHPLC, which is due to the larger amount of acetonitrile used in each run; nevertheless, other indicators suggest lower impacts.

FIGURE 18.11 LCA comparative assessment of impacts derived from each analytical method. Results normalized considering GC-FID = 100%.

In general terms, the main impacts observed are related to the energy consumption of each analytical method. Power supply is the key factor in the energy consumption of analytical instruments and apparatus. The use of high-temperature steps (GC) involves a high demand for electricity and contributes to the environmental impact of the analytical steps. In this regard, it is clear that increasing productivity through the automation of methods or multianalyte determinations helps reduce the energy consumption per analysis (de la Guardia and Armenta, 2011c). It can therefore be concluded that the best analytical method among those compared is UHPLC-MS, not only because of the lower impacts produced by each analysis (short time, low volumes used) but also because of the large amount of information that can be obtained from each run. It fits the green and -omics philosophy better than any of the others.
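The normalization used in Figure 18.11 (GC-FID = 100%) and the effect of multianalyte determinations on energy per analysis can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the LCA impact scores behind Figure 18.11 are not tabulated in the text, so the sketch applies the same normalization to the "Energy used" values of Table 18.2 instead. The function names and the figure of 10 analytes per run are hypothetical.

```python
# Energy used per analysis (kWh), copied from the "Energy used" row of Table 18.2.
# These are inventory data, not the LCA impact scores plotted in Figure 18.11.
ENERGY_KWH = {
    "HPLC-DAD": 0.39,
    "MEKC-DAD": 0.16,
    "UHPLC-MS": 0.58,
    "CE-MS": 1.96,
    "SFC-FID": 0.81,
    "GC-FID": 2.10,
}


def normalize_to_reference(values, reference="GC-FID"):
    """Express each value as a percentage of the reference method (reference = 100%)."""
    ref = values[reference]
    return {method: 100.0 * value / ref for method, value in values.items()}


def energy_per_analyte(energy_kwh, analytes_per_run):
    """Share of the run energy attributable to each analyte determined in that run."""
    if analytes_per_run < 1:
        raise ValueError("analytes_per_run must be at least 1")
    return energy_kwh / analytes_per_run


relative = normalize_to_reference(ENERGY_KWH)
for method, percent in sorted(relative.items(), key=lambda item: item[1]):
    print(f"{method:9s} {percent:6.1f} % of GC-FID energy")

# Hypothetical example: one UHPLC-MS run that determines 10 analytes at once.
print(f"{energy_per_analyte(ENERGY_KWH['UHPLC-MS'], 10):.3f} kWh per analyte")
```

On energy alone, MEKC-DAD is the lowest and GC-FID the highest, in line with the ranking discussed above; the full LCA indicators also account for reagents and wastes, which this sketch ignores.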

18.6 CONCLUSION

As described throughout this chapter, modern foodomics approaches include different steps that can be modified to obtain greener processes (see Fig. 18.1). The sustainability and eco-friendliness of a particular process are not just an additional advantage but a goal in themselves. In this regard, different approaches have to be closely considered so that greener processes and analytical methods are developed. One of the main aims is to reduce the use of organic solvents and chemicals that might be toxic and/or hazardous. However, even though this point is of utmost importance, the need to develop processes and methods that consume fewer resources (e.g., power) should not be underestimated.

Among the sample preparation techniques, modern pressurized extraction methods stand out, as they offer additional advantages while using significantly smaller amounts of solvents. Miniaturized extraction methods are also gaining importance in this regard. The development of integrated approaches will also help in the future to obtain more environmentally friendly processes under the green chemistry domain. At the same time, different strategies might be followed for sample analysis. Although this part can be less important from a quantitative point of view, several advances have been made toward greener analytical methods, such as the use of novel column technologies and the adaptation of conventional methodologies to others that use smaller amounts of solvents. Nevertheless, this field will surely continue to evolve in order to develop new, greener alternatives for the analysis of complex materials; miniaturization and the use of water at high temperatures could be some of these possibilities.

In general, the development of greener processes and analytical techniques might be further explored through the application of LCA. The use of this tool is expected to increase in the future as a way of efficiently calculating the environmental impact of the different available procedures. Thanks to LCA, each analytical technique or process might be characterized not only from a throughput perspective but also from a greenness point of view.
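As a rough illustration of characterizing methods from both a throughput and a greenness perspective, the following Python sketch lists the methods of Table 18.2 that are not outperformed on both criteria by any other method. It rests on assumptions that are not in the chapter: the "Total analysis time" values are taken to be minutes (the table gives no unit), energy per analysis is used as a simple stand-in for a greenness score rather than a full LCA result, and all names in the code are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Method:
    name: str
    time_min: float    # "Total analysis time" from Table 18.2, ASSUMED to be minutes
    energy_kwh: float  # "Energy used" from Table 18.2, a proxy for greenness here


METHODS = [
    Method("HPLC-DAD", 45, 0.39),
    Method("MEKC-DAD", 13, 0.16),
    Method("UHPLC-MS", 7, 0.58),
    Method("CE-MS", 25, 1.96),
    Method("SFC-FID", 32, 0.81),
    Method("GC-FID", 70, 2.10),
]


def runs_per_hour(method):
    """Throughput: number of analyses that fit in one hour."""
    return 60.0 / method.time_min


def dominates(a, b):
    """True if a is at least as good as b on both criteria and strictly better on one."""
    at_least_as_good = (runs_per_hour(a) >= runs_per_hour(b)
                        and a.energy_kwh <= b.energy_kwh)
    strictly_better = (runs_per_hour(a) > runs_per_hour(b)
                       or a.energy_kwh < b.energy_kwh)
    return at_least_as_good and strictly_better


def non_dominated(methods):
    """Methods for which no other method is both faster and less energy-demanding."""
    return [m for m in methods if not any(dominates(other, m) for other in methods)]


for m in non_dominated(METHODS):
    print(f"{m.name}: {runs_per_hour(m):.1f} runs/h, {m.energy_kwh} kWh per run")
```

Under these assumptions only MEKC-DAD and UHPLC-MS remain: one uses the least energy per run and the other is the fastest. A real assessment would replace the energy proxy with the LCA indicators and would also weigh the information obtained per run.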

ACKNOWLEDGMENTS

M.C.P. thanks MICINN for her “Juan de la Cierva” contract. M.H. would like to thank MICINN for a “Ramón y Cajal” research contract.

REFERENCES

Alzahrani E, Welham K (2011). Design and evaluation of synthetic silica-based monolithic materials in shrinkable tube for efficient protein extraction. Analyst 136:4321–4327.
Anastas PT, Warner JC (1998). Green Chemistry: Theory and Practice. New York: Oxford University Press, 152 pages. ISBN: 978-0198506980.
Anastas PT, Zimmerman JB (2003). Design through the twelve principles of green engineering. Environmental Science and Technology 37:94A–101A.
Armenta S, Garrigues S, de la Guardia M (2008). “Green Analytical Chemistry” – review. TrAC - Trends in Analytical Chemistry 27:497–511.
Armenta JM, Dawoud AA, Lazar IM (2009). Microfluidic chips for protein differential expression profiling. Electrophoresis 30:1145–1156.
Athukorala Y, Kim KN, Jeon YJ (2006). Antiproliferative and antioxidant properties of an enzymatic hydrolysate from brown alga, Ecklonia cava. Food and Chemical Toxicology 44:1065–1074.
Bai H-Y, Lin S-L, Chung Y-T, Liu T-Y, Chan S-A, Fuh M-R (2011). Quantitative determination of 8-isoprostaglandin F2α in human urine using microfluidic chip-based nano-liquid chromatography with on-chip sample enrichment and tandem mass spectrometry. Journal of Chromatography A 1218:2085–2090.
Balderas C, Villaseñor A, García A, Rupérez FJ, Ibáñez E, Señoráns J, Guerrero-Fernández J, González-Casado I, Gracia-Bouthelier R, Barbas C (2010). Metabolomic approach to the nutraceutical effect of rosemary extract plus ω-3 PUFAs in diabetic children with capillary electrophoresis. Journal of Pharmaceutical and Biomedical Analysis 53:1298–1304.

Bamba T, Shimonishi N, Matsubara A, Hirata K, Nakazawa Y, Kobayashi A, Fukusaki E (2008). High throughput and exhaustive analysis of diverse lipids by using supercritical fluid chromatography-mass spectrometry for metabolomics. Journal of Bioscience and Bioengineering 105:460–469.
Ban E, Park SH, Kang M-J, Lee H-J, Song EJ, Yoo YS (2012). Growing trend of CE at the omics level: The frontier of systems biology - An update. Electrophoresis 33:2–13.
Biasioli F, Gasperi F, Yeretzian C, Märk TD (2011). PTR-MS monitoring of VOCs and BVOCs in food science and technology. TrAC - Trends in Analytical Chemistry 30:968–977.
Burns MA, Johnson BN, Brahmasandra SN, Handique K, Webster JR, Krishnan M, Sammarco TS, Man PM, Jones D, Heldsinger D, Mastrangelo CH, Burke DT (1998). An integrated nanoliter DNA analysis device. Science 282:484–487.
Cajka T, Riddellova K, Tomaniova M, Hajslova J (2011). Ambient mass spectrometry employing a DART ion source for metabolomic fingerprinting/profiling: a powerful tool for beer origin recognition. Metabolomics 7:500–508.
Calleri E, Ambrosini S, Temporini C, Massolini G (2012). New monolithic chromatographic supports for macromolecules immobilization: challenges and opportunities. Journal of Pharmaceutical and Biomedical Analysis 69:64–76.
Cappellin L, Soukoulis C, Aprea E, Granitto P, Dallabetta N, Costa F, Viola R, Märk TD, Gasperi F, Biasioli F (2012). PTR-ToF-MS and data mining methods: a new tool for fruit metabolomics. Metabolomics 8(5):761–770.
Carvalho Jr RN, Moura LS, Rosa PTV, Meireles MAA (2005). Supercritical fluid extraction from rosemary (Rosmarinus officinalis): kinetic data, extract’s global yield, composition and antioxidant activity. Journal of Supercritical Fluids 35:197–204.
Cifuentes A (2009). Food analysis and foodomics foreword. Journal of Chromatography A 43:7109–7109.
Cody RB, Laramee JA, Dupont Durst H (2005). Versatile new ion source for the analysis of materials in open air under ambient conditions. Analytical Chemistry 77:2297–2302.
Curyło J, Wardencki W, Namieśnik J (2007). Green aspects of sample preparation - a need for solvent reduction. Polish Journal of Environmental Studies 16:5–16.
Curzons AD, Constable DJC, Mortimer DN, Cunningham VL (2001). So you think your process is green, how do you know? - Using principles of sustainability to determine what is green - A corporate perspective. Green Chemistry 3:1–6.
de la Guardia M, Armenta S (2011a). Greening sample treatments. Comprehensive Analytical Chemistry 57:87–120.
de la Guardia M, Armenta S (2011b). Downsizing the methods. Comprehensive Analytical Chemistry 57:157–184.
de la Guardia M, Armenta S (2011c). The basis of a greener analytical chemistry. Comprehensive Analytical Chemistry 57:25–38.
dos Santos Pereira A, Jiménez Girón A, Admasu E, Sandra P (2010). Green hydrophilic interaction chromatography using ethanol–water–carbon dioxide mixtures. Journal of Separation Science 33:834–837.
Dwivedi P, Schultz AJ, Hill Jr. HH (2010). Metabolic profiling of human blood by high-resolution ion mobility mass spectrometry (IM-MS). International Journal of Mass Spectrometry 298(1–3):78–90.
El Deeb S (2011). Monolithic silica for fast HPLC: current success and promising future. Chromatographia 74:681–691.

Ellis DI, Dunn WB, Griffin JL, Allwood JW, Goodacre R (2007). Metabolic fingerprinting as a diagnostic tool. Pharmacogenomics 8:1243–1266.
Fabbri D, Baravelli V, Chiavari G, Prati S (2005). Profiling fatty acids in vegetable oils by reactive pyrolysis–gas chromatography with dimethyl carbonate and titanium silicate. Journal of Chromatography A 1100:218–222.
Faurobert M, Chaïb J, Barre M, Tricon D, Muños S, Causse M (2009). Genetic and proteomic approach of tomato fruit quality. Acta Horticulturae 817:119–126.
François I, Cabooter D, Sandra K, Lynen F, Desmet G, Sandra P (2009). Tryptic digest analysis by comprehensive reversed phase x two reversed phase liquid chromatography (RP-LC x RP-LC) at different pH’s. Journal of Separation Science 32:1137–1144.
Fuchs TC, Hewitt P (2011). Biomarkers for drug-induced renal damage and nephrotoxicity - An overview for applied toxicology. AAPS Journal 13:615–631.
Gaber Y, Törnvall U, Kumar MA, Ali Amin M, Hatti-Kaul R (2011). HPLC-EAT (Environmental Assessment Tool): a tool for profiling safety, health and environmental impacts of liquid chromatography methods. Green Chemistry 13:2021–2025.
Garrigues S, Armenta S, de la Guardia M (2010). Green strategies for decontamination of analytical wastes. TrAC - Trends in Analytical Chemistry 29:592–601.
Gika HG, Theodoridis G, Extance J, Edge AM, Wilson ID (2008). High temperature-ultra performance liquid chromatography–mass spectrometry for the metabonomic analysis of Zucker rat urine. Journal of Chromatography B: Analytical Technologies in the Biomedical and Life Sciences 871:279–287.
Godzien J, García-Martínez D, Martinez-Alcazar P, Rupérez FJ, Barbas C (2011). Effect of a nutraceutical treatment on diabetic rats with targeted and CE-MS non-targeted approaches. Metabolomics, in press. doi: 10.1007/s11306-011-0351-y.
Goto-Inoue N, Hayasaka T, Zaima N, Setou M (2011). Imaging mass spectrometry for lipidomics. Biochimica et Biophysica Acta - Molecular and Cell Biology of Lipids 1811:961–969.
Greibrokk T, Andersen T (2003). High-temperature liquid chromatography. Journal of Chromatography A 1000:743–755.
Guamán AV, Carreras A, Calvo D, Agudo I, Navajas D, Pardo A, Marco S, Farré R (2012). Rapid detection of sepsis in rats through volatile organic compounds in breath. Journal of Chromatography B 881–882:76–82.
Hajslova J, Cajka T, Vaclavik L (2011). Challenging applications offered by direct analysis in real time (DART) in food-quality and safety analysis. TrAC - Trends in Analytical Chemistry 30:204–218.
Hankin JA, Barkley RM, Murphy RC (2007). Sublimation as a method of matrix application for mass spectrometric imaging. Journal of the American Society for Mass Spectrometry 18:1646–1652.
Herrero M, Arráez-Román D, Segura A, Kenndler E, Gius B, Raggi MA, Ibáñez E, Cifuentes A (2005). Pressurized liquid extraction-capillary electrophoresis-mass spectrometry for the analysis of polar antioxidants in rosemary extracts. Journal of Chromatography A 1084:54–62.
Herrero M, Cifuentes A, Ibáñez E (2006). Sub- and supercritical fluid extraction of functional ingredients from different natural sources: plants, food-by-products, algae and microalgae: a review. Food Chemistry 98:136–148.

Herrero M, García-Cañas V, Simó C, Cifuentes A (2010a). Recent advances in the application of CE methods for food analysis and foodomics. Electrophoresis 31:205–228.
Herrero M, Mendiola JA, Cifuentes A, Ibáñez E (2010b). Supercritical fluid extraction: recent advances and applications. Journal of Chromatography A 1217:2495–2511.
Herrero M, Plaza M, Cifuentes A, Ibáñez E (2010c). Green processes for the extraction of bioactives from Rosemary: chemical and functional characterization via ultra-performance liquid chromatography-tandem mass spectrometry and in-vitro assays. Journal of Chromatography A 1217:2512–2520.
Herrero M, Simó C, García-Cañas V, Ibáñez E, Cifuentes A (2012). Foodomics: MS-based strategies in modern food science and nutrition. Mass Spectrometry Reviews 31:49–69.
Horvatovich P, Govorukhina NI, Reijmers TH, van der Zee AGJ, Suits F, Bischoff RPH (2007). Chip-LC-MS for label-free profiling of human serum. Electrophoresis 28:4493–4505.
Houbart V, Cobraiville G, Lecomte F, Debrus B, Hubert P, Fillet M (2011). Development of a nano-liquid chromatography on chip tandem mass spectrometry method for high-sensitivity hepcidin quantitation. Journal of Chromatography A 1218:9046–9054.
Hu C, Van Dommelen J, Van Der Heijden R, Spijksma G, Reijmers TH, Wang M, Slee E, Lu X, Xu G, Van Der Greef J, Hankemeier T (2008). RPLC-Ion-trap-FTMS method for lipid profiling of plasma: method validation and application to p53 mutant mouse model. Journal of Proteome Research 7:4982–4991.
Huerta-Ocampo JA, Osuna-Castro JA, Lino-López GJ, Barrera-Pacheco A, Mendoza-Hernández G, De León-Rodríguez A, Barba de la Rosa AP (2012). Proteomic analysis of differentially accumulated proteins during ripening and in response to 1-MCP in papaya fruit. Journal of Proteomics 75:2160–2169.
Ibáñez E, Cifuentes A, Crego AL, Señoráns FJ, Cavero S, Reglero G (2000). Combined use of supercritical fluid extraction, micellar electrokinetic chromatography, and reverse phase high performance liquid chromatography for the analysis of antioxidants from Rosemary (Rosmarinus officinalis L.). Journal of Agricultural and Food Chemistry 48:4060–4065.
Ibáñez E, Herrero M, Mendiola JA, Castro-Puyana M (2012a). Extraction and characterization of bioactive compounds with health benefits from marine resources: macro and micro algae, cyanobacteria and invertebrates. In: Hayes M, editor. Marine Bioactive Compound: Sources, Characterization and Applications. New York: Springer. p 55–98.
Ibáñez C, Valdés A, García-Cañas V, Simó C, Celebier M, Rocamora-Reverte L, Gómez-Martínez A, Herrero M, Castro-Puyana M, Segura-Carretero A, Ibáñez E, Ferragut JA, Cifuentes A (2012b). Global foodomics strategy to investigate the health benefits of dietary constituents. Journal of Chromatography A 1248:139–153.
Iino K, Sugimoto M, Soga T, Tomita M (2012). Profiling of the charged metabolites of traditional herbal medicines using capillary electrophoresis time-of-flight mass spectrometry. Metabolomics 8:99–108.
Ikeda T, Kanaya S, Yonetani T, Kobayashi A, Fukusaki E (2007). Prediction of Japanese green tea ranking by Fourier transform near-infrared reflectance spectroscopy. Journal of Agricultural and Food Chemistry 55:9908–9912.
Ikeda T, Altaf-Ul-Amin M, Takahashi H, Fukusaki E (2009). DrEFTIR: the data mining software for Fourier transform near-infrared reflectance spectroscopy focused on food metabolic finger printing. Plant Biotechnology 26:451–457.
ISO 14044 (2006). Environmental management – Life cycle assessment – Requirements and guidelines.

Issaq HJ, Fox SD, Chan KC, Veenstra TD (2011). Global proteomics and metabolomics in cancer biomarker discovery. Journal of Separation Science 34:3484–3492.
Iwasaki M, Sugiyama N, Tanaka N, Ishihama Y (2012). Human proteome analysis by using reversed phase monolithic silica capillary columns with enhanced sensitivity. Journal of Chromatography A 1228:292–297.
Jokerst JV, McDevitt JT (2010). Programmable nano-bio-chips: multifunctional clinical tools for use at the point-of-care. Nanomedicine 5:143–155.
Jolliet O, Margni M, Charles R, Humbert S, Payet J, Rebitzer G, Rosenbaum R (2003). IMPACT 2002+: a new life cycle impact assessment methodology. International Journal of Life Cycle Assessment 8:324–330.
Jumtee K, Bamba T, Fukusaki E (2009). Fast GC-FID based metabolic fingerprinting of Japanese green tea leaf for its quality ranking prediction. Journal of Separation Science 32:2296–2304.
Keith LH, Gron LU, Young JL (2007). Green analytical methodologies. Chemical Reviews 107:2695–2708.
Kim K-H, Jahan SA, Kabir E (2012a). A review of breath analysis for diagnosis of human health. TrAC - Trends in Analytical Chemistry 33:1–8.
Kim SH, Fourmy D, Fujii T (2012b). Expanding the horizons for single-cell applications on lab-on-a-chip devices. Methods in Molecular Biology 853:199–210.
King JW, Srinivas K (2009). Multiple unit processing using sub- and supercritical fluids. Journal of Supercritical Fluids 47:598–610.
Kock M, Evans A, Brunnschweiler A (2000). Microfluidic Technology and Applications. Hertfordshire, UK: Research Studies Press. 340 pages. ISBN: 978-0-86380-244-7.
Kumar BS, Chung BC, Kwon O-S, Jung BH (2012). Discovery of common urinary biomarkers for hepatotoxicity induced by carbon tetrachloride, acetaminophen and methotrexate by mass spectrometry-based metabolomics. Journal of Applied Toxicology 32:505–520.
Lagouri V, Bantouna A, Stathopoulos P (2010). A comparison of the antioxidant activity and phenolic content of nonpolar and polar extract obtained from four endemic Lamiaceae species grown in Greece. Journal of Food Processing and Preservation 34:872–886.
Lee JW, Uchikata T, Matsubara A, Nakamura T, Fukusaki E, Bamba T (2012). Application of supercritical fluid chromatography/mass spectrometry to lipid profiling of soybean. Journal of Bioscience and Bioengineering 113:262–268.
Leon C, Rodríguez-Meizoso I, Lucio M, García-Cañas V, Ibáñez E, Schmitt-Kopplin P, Cifuentes A (2009). Metabolomics of transgenic maize combining Fourier transform-ion cyclotron resonance-mass spectrometry, capillary electrophoresis-mass spectrometry and pressurized liquid extraction. Journal of Chromatography A 1216:7314–7323.
Liau BC, Shen CT, Liang FP, Hong SE, Hsu SL, Jong TT, Chang CMJ (2010). Supercritical fluids extraction and anti-solvent purification of carotenoids from microalgae and associated bioactivity. Journal of Supercritical Fluids 55:169–175.
Lindahl S, Ekman A, Khan S, Wennerberg C, Börjesson P, Sjöberg P, Nordberg Karlsson E, Turner C (2010). Exploring the possibility of using a thermostable mutant of β-glucosidase for rapid hydrolysis of quercetin glucosides in hot water. Green Chemistry 12:159–168.

Liu G, Xu X, Hao Q, Gao Y (2009). Supercritical CO2 optimization of pomegranate (Punica Granatum L) seed oils using response surface methodology. LWT-Food Science and Technology 42:1491–1495.

Liu J, Zhao Z, Teffera Y (2012). Application of on-line nano-liquid chromatography/mass spectrometry in metabolite identification studies. Rapid Communications in Mass Spectrometry 26:320–326.
Luong J, Gras R, Mustacich R, Cortes H (2006). Low thermal mass gas chromatography: principles and applications. Journal of Chromatographic Science 44:253–261.
Malmendal A, Amoresano C, Trotta R, Lauri I, De Tito S, Novellino E, Randazzo A (2011). NMR spectrometers as “magnetic tongues”: prediction of sensory descriptors in canned tomatoes. Journal of Agricultural and Food Chemistry 59:10831–10838.
Manz A, Graber N, Widmer HM (1990). Miniaturized total chemical-analysis systems-a novel concept for chemical sensing. Sensors and Actuators: B. Chemical 1:244–248.
Mazzei P, Piccolo A (2012). 1H HRMAS-NMR metabolomic to assess quality and traceability of mozzarella cheese from Campania buffalo milk. Food Chemistry 132:1620–1627.
Mendiola JA, Marín FR, Hernández SF, Arredondo BO, Señoráns FJ, Ibáñez E (2005). Characterization via liquid chromatography coupled to diode array detector and tandem mass spectrometry of supercritical fluid antioxidant extracts of Spirulina platensis microalga. Journal of Separation Science 28:1031–1038.
Mendiola JA, Herrero M, Cifuentes A, Ibáñez E (2007). Use of compressed fluids for sample preparation: food applications. Journal of Chromatography A 1152:234–246.
Mendiola JA, Rodríguez-Meizoso I, Señoráns FJ, Reglero G, Cifuentes A, Ibáñez E (2008). Antioxidants in plant foods and microalgae extracted using compressed fluids. Electronic Journal of Environmental, Agricultural and Food Chemistry 7(8):3301–3309.
Moraes EP, Rupérez FJ, Plaza M, Herrero M, Barbas C (2011). Metabolomic assessment with CE-MS of the nutraceutical effect of Cystoseira spp extracts in an animal model. Electrophoresis 32:2055–2062.
Moreda-Piñeiro A, Bermejo-Barrera A, Bermejo-Barrera P, Moreda-Piñeiro J, Alonso-Rodriguez E, Muniategui-Lorenzo S, López-Mahía P, Prada-Rodríguez D (2007). Feasibility of pressurization to speed up enzymatic hydrolysis of biological materials for multielement determinations. Analytical Chemistry 79:1797–1805.
Mustafa A, Turner C (2011). Pressurized liquid extraction as a green approach in food and herbal plants extraction: a review. Analytica Chimica Acta 703:8–1.
Mullen W, Albalat A, Gonzalez J, Zerefos P, Siwy J, Franke J, Mischak H (2012). Performance of different separation methods interfaced in the same MS-reflection TOF detector: a comparison of performance between CE versus HPLC for biomarker analysis. Electrophoresis 33:567–574.
Myint KT, Aoshima K, Tanaka S, Nakamura T, Oda Y (2009). Quantitative profiling of polar cationic metabolites in human cerebrospinal fluid by reversed-phase nanoliquid chromatography/mass spectrometry. Analytical Chemistry 81:1121–1129.
Ong ES, Len SM, Lee ACH, Chui P, Chooi KF (2004). Proteomic analysis of mouse liver for the evaluation of effects of Scutellariae radix by liquid chromatography with tandem mass spectrometry. Rapid Communications in Mass Spectrometry 18:2522–2530.
Pawliszyn J, Lord HL (2010). Handbook of Sample Preparation. Hoboken, NJ: John Wiley & Sons. 496 pages. ISBN: 978-0-470-09934-6.
Pereira CG, Meireles MAA (2010). Supercritical fluid extraction of bioactive compounds: fundamentals, applications and economic perspectives. Food and Bioprocess Technology 3:340–372.

Pronyk C, Mazza G (2009). Design and scale-up of pressurized fluid extractors for food and bioproducts. Journal of Food Engineering 95:215–226.
Puiggròs F, Solà R, Bladé C, Salvadó MJ, Arola L (2011). Nutritional biomarkers and foodomic methodologies for qualitative and quantitative analysis of bioactive ingredients in dietary intervention studies. Journal of Chromatography A 1218:7399–7414.
Ramírez P, Señoráns FJ, Ibáñez E, Reglero G (2004). Separation of rosemary antioxidant compounds by supercritical fluid chromatography on coated packed capillary columns. Journal of Chromatography A 1057:241–245.
Ramírez P, Santoyo S, García-Risco MR, Señoráns FJ, Ibáñez E, Reglero G (2007). Use of specially designed columns for antioxidants and antimicrobials enrichment by preparative supercritical fluid chromatography. Journal of Chromatography A 1143:234–242.
Rasmussen LG, Winning H, Savorani F, Toft H, Larsen TM, Dragsted LO, Astrup A, Engelsen SB (2012). Assessment of the effect of high or low protein diet on the human urine metabolome as measured by NMR. Nutrients 4:112–131.
Ríos A, Escarpa A, Gonzalez MC, Crevillen AG (2006). Challenges of analytical microsystems. TrAC - Trends in Analytical Chemistry 25:467–479.
Roberts AS, Campa MJ, Gottlin EB, Jiang C, Owzar K, Kindler HL, Venook AP, Goldberg RM, O’Reilly EM, Patz Jr. EF (2012). Identification of potential prognostic biomarkers in patients with untreated, advanced pancreatic cancer from a phase 3 trial (Cancer and Leukemia Group B 80303). Cancer 118:571–578.
Rosenling T, Stoop MP, Smolinska A, Muilwijk B, Coulier L, Shi S, Dane A, Christin C, Suits F, Horvatovich PL, Wijmenga SS, Buydens LMC, Vreeken R, Hankemeier T, van Gool AJ, Luider TM, Bischoff R (2011). The impact of delayed storage on the measured proteome and metabolome of human cerebrospinal fluid. Clinical Chemistry 57:1703–1711.
Rostagno MA, D’Arrigo M, Martínez JA (2010). Combinatory and hyphenated sample preparation for the determination of bioactive compounds in foods. TrAC - Trends in Analytical Chemistry 29:553–561.
Rubakhin SS, Romanova EV, Nemes P, Sweedler JV (2011). Profiling metabolites and peptides in single cells. Nature Methods 8:S20–S29.
Rubio-Rodríguez N, de Diego SM, Beltran S, Jaime I, Sanz MT (2008). Supercritical fluid extraction of the omega-3 rich oil contained in hake (Merluccius capensis-Merluccius paradoxus) by-products: study of the influence of process parameters on the extraction yield and oil. Journal of Supercritical Fluids 47:215–226.
Rupérez FJ, García-Martínez D, Baena B, Maeso N, Vallejo M, Angulo S, García A, Ibáñez E, Señoráns FJ, Cifuentes A, Barbas C (2009). Dunaliella salina extract effect on diabetic rats: metabolic fingerprinting and target metabolite analysis. Journal of Pharmaceutical and Biomedical Analysis 49:786–792.
Sandra P, David F, Vanhoenacker G (2008). Advanced sample preparation techniques for the analysis of food contaminants and residues. Comprehensive Analytical Chemistry 51:131–174.
Sandra P, Vanhoenacker G, David F, Sandra K, Pereira A (2010). Green chromatography (part 1): introduction and liquid chromatography. LCGC Europe 23:38.
Sandra P, Tienpont B, David F (2011). Green chromatography (part 3): sample preparation techniques. LCGC Europe 24:120–133.
Sheldon RA (2000). Atom utilisation, E factors and the catalytic solution. Comptes Rendus de l’Académie des Sciences - Series IIC - Chemistry 3:541–551.

Shi J, Yi C, Xue SJ, Jiang Y, Ma Y, Lis D (2009). Effect of modifiers on the profile of lycopene extracted from tomato skin by supercritical CO2. Journal of Food Engineering 93:431–436. Siriwardhana N, Kim KN, Lee KW, Kim SH, Ha JH, Song CB, Lee JB, Jeon YJ (2008). Optimisation of hydrophilic antioxidant extraction from Hizikia fusiformis by integrat- ing treatments of enzymes, heat and pH control. International Journal of Food Science Technology 43:587–596. Sugimoto M, Kaneko M, Onuma H, Sakaguchi Y, Mori M, Abe S, Soga T, Tomita M (2012). Changes in the charged metabolite and sugar profiles of pasteurized and unpasteurized Japanese sake with storage. Journal of Agricultural and Food Chemistry 60:2586–2593. Sun M, Temelli F (2006). Supercritical carbon dioxide of carotenoids from carrot using canola oil as a continuous co-solvent. Journal of Supercritical Fluids 37:397–408. Sun H, Ge X, Lv Y, Wang A (2012). Application of accelerated solvent extraction in the analysis of organic contaminants, bioactive and nutritional compounds in food and feed. Journal of Chromatography A 1237:1–23. Teo CC, Tan SN, Yong JWH, Hew CS, Ong ES (2010). Pressurized hot water extraction (PHWE). Journal of Chromatography A 1217:2484–2494. Tobiszewski M, Mechlinska´ A, Zygmunt B, Namiesnik´ J (2009). Green analytical chemistry in sample preparation for determination of trace organic pollutants. TrAC - Trends in Analytical Chemistry 28:943–951. Tolstikov VV, Fiehn O (2002). Analysis of highly polar compounds of plant origin: combina- tion of hydrophilic interaction chromatography and electrospray ion trap mass spectrometry. Analytical Biochemistry 301:298–307. Tolstikov VV, Lommen A, Nakanishi K, Tanaka N, Fiehn O (2003). Monolithic silica-based capillary reversed-phase liquid chromatography/electrospray mass spectrometry for plant metabolomics. Analytical Chemistry 75:6737–6740. Turner C, Turner P, Jacobson G, Almgren K, Waldeback¨ M, Sjoberg¨ P, Karlsson EN, Markides KE (2006). 
Subcritical water extraction and b-glucosidase-catalyzed hydrolysis of quercetin glycosides in onion waste. Green Chemistry 8:949–959. Turner C, Iba´nez˜ E (2011). Pressurized hot water extraction and processing. In: Lebovka N, Vorobiev E, Chemat F, editors. Enhancing Extraction Processes in the Food Industry- Contemporary Food Engineering. Boca Raton, FL: CRC press. p 223–255. Ueda K, Saichi N, Takami S, Kang D, Toyama A, Daigo Y, Ishikawa N, Kohno E, Tamura K, Shuin T, Nakayama M, Sato TA, Nakamura Y, Nakagawa H (2011). A comprehensive peptidome profiling technology for the identification of early detection biomarkers for lung adenocarcinoma. PLoS One 6(4), Art. no. e18567. Vande Meent MHM, De Jong GJ (2011). Novel liquid-chromatography columns for proteomics research. TrAC - Trends in Analytical Chemistry 30:1809–1818. Wang Z, Li S, Jonca M, Lambros T, Ferguson S, Goodnow R, Ho CT (2006). Comparison of supercritical fluid chromatography and liquid chromatography for the separation of urinary metabolites of nobiletin with chiral and non-chiral stationary phases. Biomedical Chromatography 20:1206–1215. Wiboonsirikul J, Adachi S (2008). Extraction of functional substances from agricultural prod- ucts or by-products by subcritical water treatment. Food Science Technology Research 14:319–328. Wijngaard H, Hossain MB, Rai DK, Brunton N (2012). Techniques to extract bioactive com- pounds from food by-products of plant origin. Food Research International 46:505–513. 506 GREEN FOODOMICS

Winterton N (2001). Twelve more green chemistry principles. Green Chemistry 3:G73–G75. Wu JJ, Lin JC, Wang CH, Jong TT, Yang HL, Hsu SL, Chang CMJ (2009). Extraction of antioxidative compounds from wine lees using supercritical fluids and associated anti- tyrosinase activity. Journal of Supercritical Fluids 50:33–41. Wu H, Volponi JV, Oliver AE, Parikh AN, Simmons BA, Singh S (2011). In vivo lipidomics using single-cell Raman spectroscopy. Proceedings of the National Academy of Sciences of the United States of America 108:3809–3814. Xu H, Zhang W, Gao Y, Zhao Y, Guo L, Wang J (2012a). Proteomic analysis of embryo development in rice (Oryza sativa). Planta Medica 235:687–701. Xu X, Liu K, Fan ZH (2012b). Microscale 2D separation systems for proteomic analysis. Expert Review of Proteomics 9:135–147. Yin H, Marshall D (2012). Microfluidics for single cell analysis. Current Opinion in Biotech- nology 23:110–119. Yu CJ, Wang CL, Wang CI, Chen CD, Dan YM, Wu CC, Wu YC, Lee IN, Tsai YH, Chang YS, Yu JS (2011). Comprehensive proteome analysis of malignant pleural effusion for lung cancer biomarker discovery by using multidimensional protein identification technology. Journal of Proteome Research 10:4671–4682. Zhu KY, Leung KW, Ting AKL, Wong ZCF, Ng WYY, Choi RCY, Dong TTX, Wang T, Lau DTW, Tsim KWK (2012). Microfluidic chip based nano liquid chromatography coupled to tandem mass spectrometry for the determination of abused drugs and metabolites in human hair. Analytical and Bioanalytical Chemistry 402:2805–2815. 19 CHEMOMETRICS, MASS SPECTROMETRY, AND FOODOMICS

Thomas Skov and Søren B. Engelsen

19.1 FOODOMICS STUDIES

19.1.1 Introduction

Foodomics has been proposed as a new research area that uses the powerful “omics” technologies to explore food and nutrition systems (Capozzi and Placucci, 2009; Cifuentes, 2009). In foodomics studies, mass spectrometry (MS) techniques are considered the most important because of their extremely high sensitivity and selectivity. Very often MS is used after a separation technique such as liquid chromatography (LC), gas chromatography (GC), or capillary electrophoresis (CE) to ensure that highly complex food samples are separated into less complex parts before reaching the MS detector. Seen from a physical, chemical, and biological perspective, food systems (and biofluids) are complex multifactorial systems containing mixtures of heterogeneous classes of molecules as well as complex physical structures such as amorphous solids, aqueous solutions, gels, macromolecules, macro-organelles, cells, crystals, pores, and cavities. The nature of food samples thus makes extraction and separation on a specific column (GC or LC) combined with a mass separator the logical choice of analytical method. Other techniques such as nuclear magnetic resonance (NMR) spectroscopy and vibrational spectroscopy (near-infrared, infrared, and Raman) can also be applied, but do not match the sensitivity of hyphenated separation and mass spectrometry systems. Applications of foodomics include the genomic, transcriptomic, proteomic, and/or metabolomic studies of foods for compound profiling, authenticity, and/or biomarker detection related to food quality or safety; the development of new transgenic foods, food contaminants, and whole toxicity studies; new investigations on food bioactivity, food effects on human health, etc.

Foodomics: Advanced Mass Spectrometry in Modern Food Science and Nutrition, First Edition. Edited by Alejandro Cifuentes. © 2013 John Wiley & Sons, Inc. Published 2013 by John Wiley & Sons, Inc.

“My hope for nutrigenomics is that, despite the disapproval of the straight-laced, dour-faced, hypothesis-led, elderly scientific community, the genomic technologies will open up the joy of fishing to a younger generation of researchers who will cast out into the pools of genes, proteins and metabolites with eager anticipation and revel in every new and strange piece of information that emerges from the depths.”1
Professor Sue Southon

With the hyphenation of a separation technique and a mass spectrometry instrument (XC-MS) the data output becomes very large, and it can be challenging and time-consuming to extract the proper information directly from the raw data. Moreover, the analysis of XC-MS data is further challenged by artifacts such as unwanted signals from column components, skewness in data due to mass uncertainties and ion suppression, shifted peaks across samples, column deterioration over time, drift in baseline, injection variations across samples, etc. For XC-MS, data artifacts will always be present, and an effort to minimize them through a proper study design and appropriate analytical conditions is crucial. Much attention has thus been given to how to handle these artifacts, especially when measuring biological samples where the information sought can be hidden behind artifacts and thus be impossible to extract. Measuring samples on the same analytical system over a short time (a few days) will reduce most artifacts, but with foodomics data this approach is rarely possible or desirable. When samples are measured over many weeks (maybe even months) the column might be replaced and the mass detector may become slightly worn. So how can we deal efficiently with such analytical problems? Besides running under analytical conditions that are as controlled as possible, the answer very often lies in the use of well-selected standard samples that are run frequently over time and that have the potential to capture some (if not all) of the artifacts present in the data.

A visualization of the overall metabolomics workflow used in foodomics is shown in Figure 19.1. Foodomics studies are often designed to investigate well-defined scientific questions. The problem under investigation can, for example, be: what causes one diet to be healthier than another, where the health beneficial effect is known but the mechanism (for example, the metabolites produced) might not be.
When analyzing data from such studies it is clear what the target is: find the differences in the metabolite profiles that can be related to the experimental design (i.e., the two diets). Even though this question is simple, the complexity and size of metabolomics data in foodomics, with hundreds or even thousands of metabolites and including the biological variation of individuals, makes the data exploration nontrivial. Chemometrics or multivariate data analysis is a powerful method to explore such datasets using only a minimum of a priori assumptions. This can either be done with (targeted) or without (untargeted) a priori knowledge about the experimental design.

[Figure 19.1: workflow diagram. Panel labels: method optimization for metabolite extraction and sample treatment before data acquisition; data acquisition (LC-MS, GC-MS, CE-MS, NMR); raw metabolomic data (Diet 1, Diet 2, Diet 3, Controls); preprocessing (data cleaning, alignment, normalization, scaling); chemometrics with multivariate and multiway techniques: peak deconvolution (PARAFAC and PARAFAC2), classification (PCA, PLS-DA, ECVA, O-PLS-DA), and regression (PLS, iPLS, O-PLS).]

FIGURE 19.1 Metabolomics workflow used in foodomics. (1) Sampling, (2) Data acquisition, (3) The raw multidimensional metabolomics data, (4) Preprocessing of data, (5) Use of chemometric methods (classification, peak deconvolution, and/or regression) to extract information. When all steps are combined, the outcomes of the chemometric methods can be interpreted and, where possible, explained using biological knowledge. Inspired by Bekzod Hakimov.

1 http://www.nugo.org/

One of the advantages of metabolomics data is that many metabolites are measured simultaneously for each sample. However, the large number of metabolites (variables) combined with the relatively few samples analyzed in a study makes the use of robust and efficient chemometric methods important, in order to avoid false (and nonbiological) interpretations/conclusions that arise simply because too many (noisy) metabolites/variables give spurious correlations. Thus, an initial data inspection and cleaning (e.g., threshold values for metabolite intensities, consistency in one metabolite, etc.) is always suggested prior to any subsequent data modeling.
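The risk of spurious correlations when there are many variables and few samples can be made concrete with a small simulation. The sketch below is illustrative only: the sample and variable counts are arbitrary assumptions, the "metabolites" are pure random noise, and the two "diets" have no real effect, so any correlation found is false by construction.

```python
import numpy as np

def max_abs_correlation(n_samples, n_variables, seed=0):
    """Largest |Pearson r| between pure-noise variables and a two-class label."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n_samples, n_variables))   # noise "metabolites"
    y = np.repeat([0.0, 1.0], n_samples // 2)       # two diets, no real effect
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    r = (Xc.T @ yc) / (np.linalg.norm(Xc, axis=0) * np.linalg.norm(yc))
    return float(np.abs(r).max())

# Same number of samples, very different numbers of variables (n_samples must be even)
few_vars = max_abs_correlation(n_samples=20, n_variables=10)
many_vars = max_abs_correlation(n_samples=20, n_variables=5000)
print(f"max |r| with 10 noise variables:   {few_vars:.2f}")
print(f"max |r| with 5000 noise variables: {many_vars:.2f}")
```

With thousands of noise variables and only 20 samples, at least one variable will usually appear strongly related to the class label, which is why validation and an initial cleaning of unreliable variables matter.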

“All models are wrong, but some are useful” George Edward Pelham Box

This chapter deals with the many data issues related to the exploration of data from foodomics studies (mainly metabolomics data) and how these can be handled in an efficient manner in order to exploit new and hidden information content (Fig. 19.1). The main focus will be on the multivariate data modeling (chemometrics) and how this can be performed both with simple and more advanced dedicated methods. The chapter is written as a tutorial with emphasis on the potential of data modeling for different types of data and with different scientific purposes.

19.1.2 Pattern Recognition Techniques: Setting the Scene

How many metabolites do we find in a cell or a biological sample? Theoretically up to 50,000! But the question is: how many of these metabolites are influenced by the parameters in the experimental design? One metabolite found in a biological sample rarely includes enough information in itself to answer a biological question such as, for example, whether a food product is healthy or not. Healthiness is definitely a multivariate case, and in order to elucidate and maybe predict healthiness, several metabolites must be examined. The example in Figure 19.2 illustrates the inadequate information that is often found in single metabolites and how this can be dramatically improved by examining more metabolites at the same time. The variations of the two metabolites by themselves do not provide strong information about the experimental design. However, by simply plotting the two metabolites against each other (scatter plot) it is readily observed that the interaction between the two metabolites provides a pattern that differentiates the two classes in the experimental design. The case presented here is, of course, too simple, and two single metabolites are often not multivariate enough to differentiate between samples in a metabolomics study. More often it is found that the key to success is to be able to determine the pattern(s) that characterizes one food product and differentiates it from another.

To determine a pattern of extracted metabolites, data handling techniques (pattern recognition methods or chemometrics) capable of handling multiple variables are required.

[Figure 19.2: bar charts of the concentration of metabolite 10 and of metabolite 43 in each sample, and a scatter plot of the concentration of metabolite 10 against the concentration of metabolite 43.]

FIGURE 19.2 Example of the inadequate use of single metabolites to differentiate between two classes in the experimental design, for example, a bioactive food product (dark bars and dots) and a reference standard diet (light bars and dots).
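The situation in Figure 19.2 can be imitated with synthetic data. The following sketch does not use the data behind the figure; the concentrations, class sizes, and noise levels are invented for illustration. Each metabolite on its own overlaps heavily between the two classes, while a simple combination of the two (here their difference) separates them.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 30  # samples per class (hypothetical)

# Both metabolites share a large common variation within each class ...
common_a = rng.uniform(20, 80, n)
common_b = rng.uniform(20, 80, n)
# ... but the classes differ in how the two metabolites relate to each other.
class_a = np.column_stack([common_a, common_a + 10 + rng.normal(0, 2, n)])
class_b = np.column_stack([common_b, common_b - 10 + rng.normal(0, 2, n)])

def overlap_fraction(a, b):
    """Fraction of all values lying in the interval where the two class ranges overlap."""
    lo, hi = max(a.min(), b.min()), min(a.max(), b.max())
    both = np.concatenate([a, b])
    return float(np.mean((both >= lo) & (both <= hi)))

# Univariate view: one metabolite at a time
for j, name in enumerate(["metabolite 10", "metabolite 43"]):
    print(name, "overlap:", round(overlap_fraction(class_a[:, j], class_b[:, j]), 2))

# Bivariate view: the difference between the two metabolites
diff_a = class_a[:, 1] - class_a[:, 0]
diff_b = class_b[:, 1] - class_b[:, 0]
print("difference overlap:", round(overlap_fraction(diff_a, diff_b), 2))
```

In real metabolomics data the discriminating combination is not known in advance and usually involves many more than two variables, which is what the multivariate methods in this chapter are meant to find.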

TABLE 19.1 Characterization of the Most Common Separation Techniques (CE, LC, and GC) Coupled to MS Used in Foodomics Studies

Chromatography(a) | Detector | Terminology(b)  | Dimension(s)                          | Size of One Sample | Size of Dataset  | Chemometrics(c)
–                 | MS       | MS              | Mass channel                          | Vector             | Matrix (two way) | PCA like (Section 19.2)
CE, LC, and GC    | SIM      | XC-SIM          | Time                                  | Vector             | Matrix (two way) | PCA like (Section 19.2)
CE, LC, and GC    | MS       | XC-MS           | Time × Mass channel                   | Matrix             | Three way        | PARAFAC like (Section 19.3)
CE, LC, and GC    | MS/MS    | XC-MS/MS        | Time × Mass channel × Mass channel    | Three-way array    | Four way         | PARAFAC like (Section 19.3)
GC × GC           | TOF-MS   | GC × GC-TOF-MS  | Time GC 1 × Time GC 2 × Mass channel  | Three-way array    | Four way         | PARAFAC like (Section 19.3)

(a) LC is used as a generic term for both HPLC and UPLC. CE, capillary electrophoresis; LC, liquid chromatography; GC, gas chromatography.
(b) Terminology: FID, flame ionization detector; SIM, single ion monitoring; TOF, time of flight; HPLC, high performance/pressure liquid chromatography; UPLC, ultra performance/pressure liquid chromatography. XC refers to either CE, LC, or GC separation.
(c) PCA and PARAFAC are used to indicate the family of data handling techniques (two-way data or multiway data). In the text there will be examples of other techniques capable of handling such data.

19.2 XC-MS DATA

Data obtained from hyphenated chromatography mass spectrometry systems are often found in three or more dimensions if several samples are analyzed (Table 19.1). As the subsequent data handling (preprocessing and modeling) is highly dependent on the data size, this issue must be taken care of as soon as possible. Raw XC-MS data can be arranged in a data cube. Ideally a specific metabolite elutes at the same time in all samples, and in the mass spectrum obtained at this time each fragment appears at exactly the same position (m/z). However, in reality, GC–MS and LC–MS do not provide such data due to instrumental imperfections such as column deterioration, mass detector inaccuracies, and solvent irregularities when several samples are being measured. For high resolution instruments a certain fragment will not be present as just one mass, but will have a distribution around a certain mass. When this is the case, the centroid of the distribution can be used as the mass reported, which means that the same fragment will have slightly different masses across samples; all due to the high resolution of the mass spectrometer. Arranging these data in a cube as seen in Figure 19.3 is not feasible without some sort of binning of the uncertain mass determinations. Binning here refers to the process of reducing resolution by simply summing neighboring variables and then using only the sum. By binning to, for example, 0.1 m/z or 0.5 m/z bins (or the mass resolution of the instrument) a common m/z axis is created. However, this may not prevent the same metabolite from being present in more than one bin. Thus, if binning is used, it must be fine-tuned according to the expected mass shifts in order to avoid too many artifacts being created in the binning process. As this is done prior to any data modeling, all model parameters will also be based on binned data, and before reporting any model findings it is crucial to check the raw (unbinned) data. This will reveal whether more than one metabolite has been located in the same bin or whether one metabolite is spread over two or more bins. When analyzing XC-MS data, binning is a natural and essential step and is performed in numerical software using user-developed scripts (e.g., MATLAB) or as part of an instrument-supplied software package.

FIGURE 19.3 The road from XC-MS data over extracted metabolite features to model parameters. How to get through the data analysis depends on the instrumental technique and data analytical strategy, which also determine the actions needed in (a) and (b). (a) Contains preprocessing and peak identification and quantification and (b) the model step.
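The binning step described in this section can be sketched in a few lines. This is a minimal illustration, not the procedure of any particular software package: the function name `bin_spectrum`, the m/z range, the 0.1 m/z bin width, and the two toy centroid spectra are all assumptions made for the example.

```python
import numpy as np

def bin_spectrum(mz, intensity, mz_min=100.0, mz_max=300.0, width=0.1):
    """Sum centroid intensities into fixed-width m/z bins on a common axis."""
    n_bins = int(round((mz_max - mz_min) / width))
    binned, edges = np.histogram(mz, bins=n_bins, range=(mz_min, mz_max),
                                 weights=intensity)
    return edges[:-1], binned  # lower bin edges, summed intensities

# The same two fragments measured in two samples with slightly different centroids
sample_1 = (np.array([150.02, 200.09]), np.array([1000.0, 400.0]))
sample_2 = (np.array([150.04, 200.11]), np.array([900.0, 450.0]))

axis, b1 = bin_spectrum(*sample_1)
_, b2 = bin_spectrum(*sample_2)

shared = np.flatnonzero((b1 > 0) & (b2 > 0))   # fragment lands in one common bin
split = np.flatnonzero((b1 > 0) ^ (b2 > 0))    # fragment straddles a bin boundary
print("bins populated in both samples:", np.round(axis[shared], 1))
print("bins populated in only one sample:", np.round(axis[split], 1))
```

The first fragment ends up in a common bin, whereas the second is split over two neighboring bins because its centroids fall on either side of a bin edge. This is the kind of binning artifact that makes a final check against the raw data necessary.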

19.2.1 From Hypothesis, Study Design Over Experiments to Data Handling

In untargeted/fingerprinting metabolomics analysis where all metabolites are included, the use of a data handling strategy is essential. One important issue is data cleaning, but another issue is how to approach the data. With thousands of metabolites expected in the data, the most obvious start is to get an overview of the data and let this guide the subsequent data analysis. An overall data analytical approach can, for example, be performed with the following steps:

1. Set up a hypothesis for what should be gained from the data analysis (e.g., quantification of all secondary metabolites or polyphenols in a fruit as a function of abiotic stress).
2. Prepare proper standard mixtures to be able to correct for analytical artifacts. These could be placed as every 10th sample in the run order (Section 19.2.2).
3. If possible, randomize the biological samples to avoid confounding of the biological variation with, for example, day-to-day variation.
4. Run the experiment.
5. Extract metabolites using dedicated software (commercial or freeware (Castillo et al., 2011; Katajamaa and Orešič, 2007)) and collect these in a data table comprised of “m/z-elution time” pairs. More advanced options are possible (Section 19.3), but the creation of such a data table is the most common approach when analyzing metabolomics data.
6. Do a preliminary exploration of the data using Principal Component Analysis (PCA) (Section 19.3.2).
7. Investigate the model parameters (scores, loadings, and residuals). If these reflect any of the study design factors, this must be taken into consideration when interpreting the model.
8. Find important metabolites that separate the data according to the hypothesis or other important study design factors.
9. Check/confirm the validity of these metabolites by inspecting the raw data. This can often be done intuitively in the commercial or free software.
10. If a metabolite originates from a perfectly resolved peak with a distinct mass and the intensity is well above the noise level, then this metabolite can be identified (if possible) from databases and biologically interpreted (Krebs cycle, photosynthesis, etc.). If the metabolite peak is problematic (low signal-to-noise ratio, coeluting with other peaks causing no distinct masses, etc.) the reported metabolite concentrations have to be validated with additional experiments or by advanced curve resolution methods.
11. One way to solve the latter is to locate the relevant peak(s) in the raw data and extract the peak region. This could be a small subpart of the elution time (e.g., between 3.21 and 3.23 min); for this region, apply proper curve resolution methods that can qualify and quantify the metabolites (Section 19.3.3).
12. Replace the obtained metabolite concentrations and go back to step 6.
13. If the hypothesis is to characterize the difference between biological classes, run a supervised discriminant/classification method such as partial least squares discriminant analysis (PLS-DA) alongside the initial PCA. Remember to validate to check for unreliable/noisy metabolites (Section 19.3.2).
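Steps 6 to 8 above can be sketched on a synthetic feature table. This is an illustration under stated assumptions rather than a recommended pipeline: the table is random noise in which five features have been given an artificial class effect, PCA is computed by a plain singular value decomposition of the column-centered table, and no scaling of the variables is applied (in practice the choice of scaling matters).

```python
import numpy as np

def pca(X, n_components=2):
    """PCA by SVD of the column-centered table X (samples x features)."""
    Xc = X - X.mean(axis=0)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]
    loadings = Vt[:n_components].T
    residuals = Xc - scores @ loadings.T
    explained = s[:n_components] ** 2 / np.sum(s ** 2)
    return scores, loadings, residuals, explained

rng = np.random.default_rng(0)
n_per_class, n_features = 10, 200
# Hypothetical table of "m/z-elution time" features: 20 samples x 200 features
X = rng.normal(0, 1, size=(2 * n_per_class, n_features))
X[n_per_class:, :5] += 6.0   # five features respond to the second diet

scores, loadings, residuals, explained = pca(X)

# Step 8: features with the largest absolute loading on the first component
top = np.argsort(np.abs(loadings[:, 0]))[::-1][:5]
print("explained variance (PC1, PC2):", np.round(explained, 2))
print("features with largest |loading| on PC1:", np.sort(top))
```

Inspecting the scores shows whether samples group according to the study design, the loadings point to the features responsible, and the residuals show what the model leaves unexplained. Candidate features found this way still need to be checked against the raw data (step 9).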

19.2.2 Quality of the Data

The difference between success and failure is often related to the quality of the data and to the hypothesis of the study. Three types of variation are found in data: (1) variation due to the study hypothesis (control and a perturbation), (2) variation due to the experimental conditions (study design, sampling, analytical instrument, lab technicians, etc.), and (3) variation over time. Comparing a control with a perturbed system that creates completely different and consistent metabolome profiles will often be successful regardless of experimental conditions. However, foodomics is usually applied to more subtle problems, where the hypotheses put forward yield data with greater similarity and where the experimental variation will be a problem if not carefully controlled and corrected for.

One way to correct for changing experimental conditions is to use several standard mixtures, quality-control (QC) samples, replicates, and/or other reference samples. Such standard mixtures can be simple mixtures of real samples (quality-control samples, QC) with distinct peaks/masses that can easily be detected and identified by the subsequent peak detection methods. Or it can be a metabolomic mixture of preselected metabolites in known concentrations, or simply a blank sample with just solvent or water. Sometimes all three types of standard samples (QC, metabolomic mixture, and blank) are used, and this provides data with the best possible chance of catching and correcting any experimental artifacts introduced during the experiment. Besides standard samples, replicates are essential to further ensure good quality in data.

The quality control samples should be designed to compensate for short-term and long-term influence of matrix effects, which can be evaluated by comparing the metabolite coverage and the relative quantification levels to expected values from background knowledge. Only if quantification of a range of well-known target metabolites validates a specific analytical protocol can unbiased analysis be extended to the level of metabolomics and detect and quantify novel metabolite signals. Such integration of classical analytical strategies with modern unbiased data analysis also includes randomized sample measurement sequences (which can be tricky in foodomics studies taking place over a long time).

No analytical or data handling methodology can compensate for improper sample collection or inappropriate experimental design. In an ideal experimental design, unavoidable effects of diurnal variation, age, gender, diet, stress, etc. are considered. Other factors such as sampling (how is the sample acquired and how does it represent the object being sampled), sample storage, sample handling (e.g., centrifugation time, freezing temperature), and sample presentation are all important issues and must be evaluated and standardized in order to obtain data without interfering artifacts.

With multivariate methods, characteristic patterns are found for the experimental samples but also for the standard samples. This means that repeated QC injections should be similar to each other compared to the overall study design samples. This will indicate that the relative differences between the repeated injections (i.e., the variation of the analytical system) are much smaller than the biological differences between the test samples (van der Kloet et al., 2009; Zelena et al., 2009). If this is not the case, the dataset is of little practical use, as this indicates that either the biological variation of the test samples is too small and/or that the analytical system was unstable. Passing such a preliminary test will confirm that sound and informative data have been obtained, which will optimize the chances of finding biological differences amongst the samples.
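The preliminary test described above, that repeated QC injections vary much less than the study samples, can be checked feature by feature. The sketch below uses the relative standard deviation (RSD) as the measure of spread, which is one common choice and an assumption here, not a method prescribed by the text. The data are simulated, with 5% analytical noise and 30% biological variation chosen arbitrarily.

```python
import numpy as np

def rsd(X):
    """Relative standard deviation (%) of each feature (column) across the rows of X."""
    return 100.0 * X.std(axis=0, ddof=1) / X.mean(axis=0)

rng = np.random.default_rng(2)
n_features = 50
true_levels = rng.uniform(100, 1000, n_features)

# Hypothetical data: 8 repeated injections of one pooled QC sample
# (analytical noise only) ...
qc = true_levels * rng.lognormal(0.0, 0.05, size=(8, n_features))
# ... and 30 study samples (biological variation plus the same analytical noise)
study = (true_levels
         * rng.lognormal(0.0, 0.30, size=(30, n_features))
         * rng.lognormal(0.0, 0.05, size=(30, n_features)))

qc_rsd, study_rsd = rsd(qc), rsd(study)
print(f"median RSD, QC injections: {np.median(qc_rsd):.1f}%")
print(f"median RSD, study samples: {np.median(study_rsd):.1f}%")
print("QC tighter than study samples for",
      int(np.sum(qc_rsd < study_rsd)), "of", n_features, "features")
```

Features for which the QC spread approaches or exceeds the spread of the study samples are candidates for removal or closer inspection. The same comparison can be made in a PCA score plot, where the QC injections should cluster tightly relative to the study samples.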

19.2.3 Preprocessing of the Data

With standard samples included, one has the best odds of checking the quality of the data before and after the essential preprocessing step. The task of any preprocessing method is to make standard samples as similar as possible. This provides a means for tuning the preprocessing parameters, which would otherwise be based on a time-consuming trial-and-error approach. Preprocessing is in general performed in one of two ways: (1) using free or commercial software where the outcome is a data table comprised of “m/z-elution time” pairs, or (2) using in-house algorithms where the outcome is metabolite fingerprinting data (see Section 19.3). Whether one uses premade software or in-house algorithms, the steps are the same, but they may differ in how they are carried out (Amigo et al., 2010; Castillo et al., 2011; Katajamaa and Orešič, 2007). The preprocessing steps are:

1. Alignment of the elution time axis; to ensure that the same metabolite is correctly assigned across samples in time.
2. Binning of the mass axis; to ensure that the same metabolite is correctly assigned across samples in m/z values.
3. Filtering or noise reduction; to remove high-frequency interfering signal caused by sources unrelated to the biochemical nature of the sample.
4. Baseline subtraction; to remove systematic artifacts produced by the column or matrix material.
5. Normalization; to correct for systematic variation between spectra due to differences in the metabolite concentration in the sample, degradation over time, and variation in the instrument detector sensitivity.
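Step 1 can be illustrated in its simplest possible form: a single rigid shift of the whole chromatogram, chosen so that its correlation with a reference chromatogram is maximal. This sketch is far simpler than the warping methods cited in this section and is not an implementation of any of them. The Gaussian peaks, the 12-scan delay, and the search window of 50 scans are invented for the example.

```python
import numpy as np

def best_shift(reference, sample, max_shift=50):
    """Integer shift (in scans) that maximizes the overlap with the reference."""
    shifts = np.arange(-max_shift, max_shift + 1)
    overlap = [np.dot(reference, np.roll(sample, s)) for s in shifts]
    return int(shifts[int(np.argmax(overlap))])

def gaussian_peak(t, center, width=8.0, height=1.0):
    """Synthetic chromatographic peak."""
    return height * np.exp(-0.5 * ((t - center) / width) ** 2)

t = np.arange(1000)  # elution time in scans
reference = gaussian_peak(t, 300) + gaussian_peak(t, 620, height=0.6)
# Same peaks, eluting 12 scans late
sample = gaussian_peak(t, 312) + gaussian_peak(t, 632, height=0.6)

shift = best_shift(reference, sample)
# np.roll wraps around at the ends; harmless here because the signal is ~0 there
aligned = np.roll(sample, shift)
print("estimated shift (scans):", shift)
```

A single global shift cannot correct peaks that drift by different amounts in different parts of the chromatogram, which is why interval-based or warping approaches are used on real data.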

While alignment and binning are horizontal changes, normalization is a vertical change in the data signal, as illustrated in Figure 19.4. Horizontal artifacts are detrimental to any subsequent statistical or multivariate data analysis, and removal of these preserves the quantitative and qualitative information in the data. Two general types of correction function for alignment exist in the scientific literature: compression/expansion (C/E) and insertion/deletion (I/D). The C/E model implicitly assumes that peak widths can be correlated to the retention time shift, which is a reasonable assumption in chromatography. Therefore, the C/E model is by far the most commonly used in the alignment of chromatographic signals and has spawned methods such as correlation optimized warping (COW) (Nielsen et al., 1998). In the I/D approach, it is assumed that the peak width is invariant within limited ranges of retention time and remains unchanged in case of shift. This model is used, for example, in dynamic time warping (DTW) (Tomasi et al., 2004) and by the interval correlation optimized shifting (icoshift) (Tomasi et al., 2011). Proper alignment, binning, and baseline subtraction should not change the biological information in a sample (i.e., how intense a peak of a certain metabolite is) and should thus preserve the quantitative relation between the samples.

[Figure 19.4: chromatogram panels plotted against elution time (scans).]

FIGURE 19.4 Illustration of alignment (Left) and normalization (Right). Left: Peaks are shifted due to horizontal analytical artifacts. When aligned toward a reference chromatogram some parts of the sample chromatogram (dotted profile) go left, and some go right, in accordance with the shifts to be corrected. Right: Two replicates that are supposed to be similar, but due to vertical analytical artifacts the intensity varies. Using proper normalization makes the replicates more similar.

Vertical changes should also be removed, but as described below this can affect the quantitative information present in the data in both positive and negative respects. Normalization must be used with care, and the choice of normalization method determines what can be derived from the data after preprocessing. Two types of normalization are mainly used for metabolomics data: (1) normalization using an internal standard (i.e., a chemical component added to all samples in a known concentration), and (2) a mathematical operation which uses some or all signal intensities (e.g., dividing by the sum of all metabolite intensities).
The latter will impose a closure constraint onto each metabolite profile which will allow the individual metabolites to vary, but which will assume that the total intensity of each profile is invariant. This may seem a risky assumption for XC-MS profiles of biological samples. With normalization basically two types of data can be obtained:

1. Qualitative data
   • Standard normal variate normalization (each sample is mean centered and scaled with the standard deviation of all metabolite intensities). This technique removes offset and slope differences between samples (Barnes et al., 1989).
   • Unit area normalization (concentration differences between samples are removed)
   • Unit length normalization (concentration differences between samples are removed)
   • Feature normalization (each sample is normalized with a feature not chemically interesting, that is, the solvent peak or a reagent added to each sample)
2. Quantitative (and qualitative) data
   • Internal standard normalization (concentration differences between samples are removed and it is assumed that all metabolites behave in the same way on the column as the internal standard. If this is not the case several internal standard samples can be used)
   • Feature normalization (see above; depending on the feature used this can also be quantitative)
   • Closest standard sample normalization (each sample is normalized with the standard sample run closest on the column. This can remove changes in column and detector response over time)

With qualitative data it is possible to extract patterns (relative metabolite intensities) that are found in different classes of samples, but it is not possible to extract quantitative information about individual metabolites. So if the hypothesis is to evaluate how much a given perturbation affects the concentration of well-known metabolites, then the data must preserve the quantitative difference between samples, but if the hypothesis is to find relative differences in the metabolite profile, then the data can be normalized using both types of approaches as mentioned above.
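The difference between the two families of normalization can be made concrete with a small numerical sketch. The peak table, the position of the internal standard (column 3), and the function names below are hypothetical and only serve as an illustration; the formulas follow the verbal definitions given in the list above.

```python
import numpy as np

def snv(X):
    """Standard normal variate: center and scale each sample (row)."""
    mean = X.mean(axis=1, keepdims=True)
    std = X.std(axis=1, ddof=1, keepdims=True)
    return (X - mean) / std

def unit_area(X):
    """Divide each sample by the sum of its intensities (closure)."""
    return X / X.sum(axis=1, keepdims=True)

def unit_length(X):
    """Divide each sample by its Euclidean norm."""
    return X / np.linalg.norm(X, axis=1, keepdims=True)

def internal_standard(X, is_index):
    """Divide each sample by the intensity of its internal standard."""
    return X / X[:, [is_index]]

# Hypothetical peak table: 3 samples x 4 metabolites.
# Column 3 is assumed to be an internal standard added in equal amount.
X = np.array([
    [10.0, 20.0, 30.0, 5.0],
    [20.0, 40.0, 60.0, 10.0],  # same sample, twice the detector response
    [10.0, 20.0, 60.0, 5.0],   # metabolite 2 has truly doubled
])

area = unit_area(X)
istd = internal_standard(X, is_index=3)
```

In this sketch both approaches remove the response difference between the first two samples. For the third sample, internal standard normalization keeps metabolite 0 unchanged and reports metabolite 2 as doubled, whereas unit area normalization makes metabolite 0 appear lower because of the closure constraint.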

19.3 DATA STRUCTURES AND MODELS

Extracting information from multivariate and/or multidimensional data can be done in many ways. However, some methods are more frequently used and have shown great potential in foodomics studies. The choice of chemometric methods depends on the structure or arrangement of the raw data, how the data have been manipulated before modeling, and on the limitations of the individual chemometric models. As exemplified in Table 19.1, three types of data are suitable for chemometric modeling: data as a table, in a three-way cube, or in higher order tensors. In foodomics there is a dominant use of hyphenated techniques with only one chromatographic separation, as depicted in Figure 19.3, and thus models suitable for analyzing cubes or data of lower order are the most reported models.

19.3.1 How to Generate Data Suitable for Chemometric Modeling

A data table can be generated in several ways when measuring with XC-MS:

1. MASS FINGERPRINT: The elution time dimension is summed, providing an m/z fingerprint for each sample. Having many metabolites makes such a fingerprint very complex and thus this approach is rarely used for metabolomics data.

DATA: samples × mass channels

2. CHROMATOGRAPHIC FINGERPRINT: At each elution time, the intensities of the measured mass fragments are summed, providing the so-called total ion count (TIC) chromatogram. This is often used to get a visual overview of the peaks present, but also in a PCA model to get a first glance of which parts of the chromatogram are correlated.

DATA: samples × elution times

3. COMBINED FINGERPRINT: Unfold (meaning: concatenate) the data in either the elution time or mass spectral dimension. This approach is rarely used and cannot be recommended due to the strongly increased size and complexity of the concatenated second dimension.

DATA: samples × (mass spectrum at elution time 1 (termed mz(et1)) + mz(et2) + ··· + mz(etn))

or

DATA: samples × (elution profile at mass fragment 1 (termed ET(mz1)) + ET(mz2) + ··· + ET(mzn))

4. TABLE COMPRISED OF M/Z-ELUTION TIME PAIRS: The most often used approach is to detect a peak (i.e., a metabolite) at a certain elution time with a certain mass and register in which samples this metabolite is found, thereby extracting metabolite features. How to get from the raw data to a peak table (Fig. 19.3) is not trivial, but several types of either commercial or free software are available for doing this (Theodoridis et al., 2012; Want, 2009).

DATA: samples × metabolites

5. CUBED/MULTIWAY DATA STRUCTURE: A data cube can be arranged simply by using the raw cleaned XC-MS data as they are. This has several advantages (further discussed in Section 19.3.3) such as performing unique peak deconvolution and ignoring some preprocessing steps as these are handled within the subsequent modeling.

DATA: samples × elution times × mass channels
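The relation between the cube and the first three table layouts can be sketched with array operations. The cube below is random and its dimensions (4 samples, 6 elution times, 5 mass channels) are hypothetical; the peak table of item 4 is not shown because it requires dedicated peak detection software.

```python
import numpy as np

# Hypothetical cleaned XC-MS cube: samples x elution times x mass channels.
rng = np.random.default_rng(0)
n_samples, n_times, n_masses = 4, 6, 5
cube = rng.random((n_samples, n_times, n_masses))

# 1. Mass fingerprint: sum over the elution time dimension.
mass_fingerprint = cube.sum(axis=1)          # samples x mass channels

# 2. Chromatographic fingerprint (TIC): sum over the mass dimension.
tic = cube.sum(axis=2)                       # samples x elution times

# 3. Combined fingerprint: unfold (concatenate) the cube into a table.
# Mass spectra of successive elution times: mz(et1), mz(et2), ...
unfolded_by_time = cube.reshape(n_samples, n_times * n_masses)
# Elution profiles of successive mass channels: ET(mz1), ET(mz2), ...
unfolded_by_mass = cube.transpose(0, 2, 1).reshape(n_samples, n_masses * n_times)
```

Both unfolded tables have 30 columns for only 4 samples in this toy case, which illustrates why the concatenated second dimension quickly becomes large.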

In Figure 19.5 three model structures are shown: one which is suitable for data tables and two which can be applied directly to the XC-MS data structure. From Figure 19.5 it is obvious that the three models have several similarities and they can be used for specific purposes in the data handling of metabolomics data. In Section 19.3.2, methods for dealing with data arranged in a table will be further discussed and in Section 19.3.3, the use of PARAllel FACtor analysis (PARAFAC) and PARAFAC2 models is presented.

FIGURE 19.5 Graphical visualization of a two-factor PCA (data: samples × elution time), PARAFAC (samples × elution time × mass fragment), and PARAFAC2 model (samples × elution time × mass fragment). Following the normal notation, scores (related to samples) are denoted t and a for the PCA and the PARAFAC models, respectively, and the loadings are denoted p for the PCA model and b and c for the PARAFAC models.

19.3.2 Chemometric Methods for Data Tables

In foodomics studies the vast majority of chemometric models are performed on extracted metabolite features which are then organized and aligned in a data table with samples and “m/z-elution time” pairs. For these data tables several bilinear models can be applied, but especially two methods have been applied in foodomics studies. PCA is the method of choice when data structures are explored, and partial least squares regression discriminant analysis (PLS-DA) is used when searching for differences between predefined classes in the experimental design.

19.3.2.1 Principal Component Analysis (PCA)

When exploring metabolomics data the most often used chemometric method is PCA (Hotelling, 1933). The chance of one metabolite containing the maximum information in data is limited and, with several metabolites measured, the most information is often a combination of changes in several metabolites, a so-called metabolite pattern. PCA searches for common patterns in a data table by establishing new directions in the data. This is done by compressing the data table into a lower number of variables called principal components by maximizing the variance in data. The first principal component extracts as much of the variation in the dataset as possible. The second principal component is orthogonal to the first and covers as much of the remaining variation as possible, and so on. The new variables are constructed as linear combinations of the original variables and they are uncorrelated. The principal components can be used to construct a new coordinate system. The data space is projected so that the direction of the largest variance becomes the new coordinate axis. The scores are the projection of the data onto the new coordinate system, whereas the loadings define the size of the contribution of each original variable to the component. PCA does this in a highly efficient manner by extracting a low number of orthogonal principal components, which makes PCA one of the most robust and reliable chemometric methods available. Let us take an example: within foodomics the original variables can, for example, be the intensity as a function of elution times in LC profiles (chromatographic fingerprint), and by plotting the loadings as a line (elution time profile) the underlying elution peaks in overlapping chromatograms may be revealed. If several metabolites are covarying strongly in a systematic manner in a study design this might represent the maximum variation found in data.
If this is the case, PCA will find how much each metabolite contributes to this variation and each metabolite will be given a weight (i.e., importance), a loading, that tells us which metabolites are important and which are not for the given pattern of variation. For the first direction, each sample (from its original position) can be projected onto this, providing a score value. These score values then describe the amount of this new variance direction that is found in each sample. A set of a score and loading vector constitutes a principal component. The second new direction in the data is found orthogonal with respect to the first direction and the second score value for each sample is found in a similar fashion as described above. This is continued as long as systematic (descriptive) variation is described by the successive principal components. The variance explained in each principal component decreases for successive extracted components. The variance left in the data (unexplained variance) is usually related to unsystematic variation (in many foodomics studies this will be the unsystematic biological variation and remaining analytical artifacts) or noise and is termed the residuals. By plotting pairs of scores and loadings in scatter plots, intuitive graphical illustrations of the patterns in data can be made. In Figure 19.5 the notation of a PCA model is shown and in Figure 19.6 an example of the resulting scores and loadings from a PCA on metabolomics data are illustrated. As with all other methods it is important that the chemometric method is interpreted correctly. The exploratory nature of PCA makes the interpretation of the score plot rather subjective, but some guidelines must be followed. Objects far apart in the score plot are different with respect to what patterns the model describes (which metabolites are responsible for the large variation found in data) and objects in close proximity exhibit similar variations. 
[Figure 19.6: scores plot with the sample groups Diet 1, Diet 2, Diet 3, and Controls; loading plot with labeled metabolites 972, 92, 404, 4120, 3, and 40.]

FIGURE 19.6 Illustration of a PCA on a dataset with metabolites measured for several samples. LEFT: Score plot (scores on PC1 vs. scores on PC2) with samples and RIGHT: Loading plot (loadings on PC1 vs. loadings on PC2) with metabolites. The first principal component explains 50% and the second 32%. The control samples could be mixtures of all three diets (reference diets) run frequently on the chromatographic system, making them ideal for checking the quality of data. The visual appearance of the loading plot is typical for metabolomics data where Pareto scaling has been applied.

In PCA one cannot generalize outside the metabolites included, which means that what is observed in the score plot is only valid for the study at hand. However, if enough objects are included it is very often possible to generalize the validity of the model results. Another important issue is the amount of variation covered by the principal components (given in percent of the raw data variation). If this is low the samples in close proximity are only similar in this part of the variation, but if the number is high (as in Fig. 19.6) then the similarity will be much more pronounced. For the loading plot, which is the backbone of the PCA model, the interpretation is slightly more difficult; however, some simple rules exist. Firstly, metabolites in the center (e.g., when looking at the first two components in a two-component model, Fig. 19.6) must be evaluated with care as they do not contribute to the model. Secondly, metabolites that are closely positioned are correlated, meaning that if one metabolite goes up the other will also go up. On the other hand, if metabolites are oppositely positioned across the center the metabolites are negatively correlated. One important thing is that the correlations are only valid for the percentage of the variation described by the first two components. What happens in the remaining part of the data cannot be evaluated from this plot. Combining score and loading plots makes the exploration of data very powerful. The samples placed in the same direction influenced/directed by certain metabolites (e.g., metabolite 3 and 404 in Fig. 19.6), for example, the diet 2 samples, will have a higher content of these metabolites compared to diet 3 samples and a lower content of metabolite 4120. This will be opposite for diet 3 samples that are placed oppositely in the score plot. It is a healthy principle to reevaluate found correlations in the raw data, and to check whether the correlation is direct or indirect, before reporting; otherwise correlations that are too strong, or simply wrong causal effects, may be reported. The result of a PCA of a typical LC–MS foodomics study is shown in Figure 19.7. The study aimed at investigating the old saying:

“an apple a day keeps the doctor away”

that is, how the metabolome is changed when ingesting apple or apple components. In this study LC–MS data were collected from rat urine and the data were cleaned and the metabolites classified as either exposure markers (apple, pectin or control; Figure 19.7a) or effect markers (apple, apple pectin or not; Figure 19.7b). A total of 24 rats were given different apple components (pectin fraction, whole apple, or control) and the LC–MS metabolite profile was investigated using PCA. Inspection of the PCA loading plot reveals that it is not only the features selected as exposure and effect markers that are responsible for the grouping in the score plot. Features that were not selected by the proposed method showed an even stronger effect on the multivariate discrimination between the sample groups. This is because PCA reflects the overall variation in the data across the 24 rats and thus is not sensitive to inhomogeneous responses between the animals. The PCA shows no clear separation between the potential exposure and effect markers in the loading plot, but it can be observed that, for example, upregulated apple effect markers (apple effect markers) and downregulated apple effect markers (control apple effect markers) are separated from each other in the loadings plot (Fig. 19.7, colored marker groups). The single feature that contributes most strongly to distinguish the apple samples from the other samples is highlighted as #1 in the loading plot of Figure 19.7, and its analytical pattern across the 24 samples is shown in the upper right insert. Despite its clear response behavior this feature was not selected as a potential exposure marker by the proposed selection criteria due to several nonzero values in the control group. This example shows how a priori information can help in the interpretation of the PCA model, but at the same time also provide a strong bias (Kristensen et al., 2012).

[Figure 19.7: score plot (Scores on PC 1 (24.9%) vs. Scores on PC 2 (15.0%)) with the groups Apple, Pectin, and Control; loading plot with the legend Unselected features, Apple exposure markers, Control-Apple exposure markers, Pectin exposure markers, Control Pectin exposure markers, Apple effect markers, Control-Apple effect markers, Pectin effect markers, and Control-Pectin effect markers; diagrams 1–3 show the response across samples (Control, Pectin, Apple) for the [M-H]- features 282.122, 480.128, and 384.105, labeled as effect or exposure markers.]

FIGURE 19.7 PCA score (top left) and loading plot (top right) of PC1/PC2 with all features measured in negative mode (n = 4010). Data are mean centered and Pareto scaled. Diagrams 1–3 illustrate the corresponding response pattern of a selected feature or markers from the loading plot. The location of the identified exposure and effect markers in the multivariate space is illustrated as different symbols on the loading plot (Kristensen et al., 2012).
In a PCA model the a priori knowledge can be used to color objects in the score plot, thereby emphasizing potential groupings and/or quantitative gradients found in data (see Fig. 19.7). The a priori knowledge can also be used more actively in the model and this is further discussed later in this section.
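The decomposition into scores, loadings, explained variation, and residuals described in this section can be sketched with a singular value decomposition. This is only one of several ways to compute a PCA; the simulated table (12 samples, 50 metabolites, one pattern in the first five metabolites) and the function name `pca` are hypothetical.

```python
import numpy as np

def pca(X, n_components):
    """PCA of a samples x variables table via SVD of the mean-centered data."""
    Xc = X - X.mean(axis=0)                       # mean centering
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]
    loadings = Vt[:n_components].T                # orthonormal directions
    explained = (s**2 / np.sum(s**2))[:n_components]
    residuals = Xc - scores @ loadings.T          # unexplained variation
    return scores, loadings, explained, residuals

# Hypothetical data: two groups of six samples that differ in a pattern
# of five covarying metabolites, plus unsystematic noise.
rng = np.random.default_rng(1)
pattern = np.zeros(50)
pattern[:5] = 1.0
amount = np.repeat([-1.0, 1.0], 6)
X = 5.0 * np.outer(amount, pattern) + rng.normal(scale=0.5, size=(12, 50))

T, P, explained, E = pca(X, n_components=2)
top_metabolites = np.argsort(-np.abs(P[:, 0]))[:5]  # largest loadings on PC1
```

Plotting the two columns of `T` against each other gives the score plot and the two columns of `P` the loading plot; `explained` holds the fraction of the total variation covered by each component, which should accompany any interpretation of the plots.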

Things to be Aware of When Using PCA

PCA is a powerful explorative method, but some things are important to consider before doing the modeling of metabolomics data where the number of variables largely exceeds the number of objects.

Cleaning. It is important that data are cleaned as much as possible to avoid making false conclusions. For example, if the separating power of metabolite #4120 for diet 2 and 3 in Fig. 19.6 is due to noise (low signals) or an unrandomized run order of samples (diet 2 samples run before diet 3), then that metabolite cannot be trusted and must be removed. Another way to clean the data is to apply a so-called 80% nonzero rule (Bijlsma et al., 2005) where only metabolites with a nonzero signal in 80% of the samples from one class (e.g., diet 1 samples) are maintained. The quality control samples (Section 19.2.2), which are mixtures of real samples, can also be used by inspection of the standard deviations of the found metabolites. Those metabolites that have a real signal not affected by noise will have a low relative standard deviation, whereas metabolites with a noisy signal will have high relative standard deviations (Theodoridis et al., 2012). If these are not excluded it is also possible to scale the data with this information, resulting in a larger weight in the model for the metabolites that hold a signal not affected by noise.
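The two cleaning steps mentioned above can be sketched as simple filters on a peak table. The data, the function names, and the RSD cutoff of 0.3 are hypothetical; the 80% rule is implemented here as "nonzero in at least 80% of the samples of at least one class", which is an interpretation of the verbal description.

```python
import numpy as np

def eighty_percent_rule(X, classes, fraction=0.8):
    """Keep metabolites that are nonzero in at least `fraction` of the
    samples of at least one class."""
    keep = np.zeros(X.shape[1], dtype=bool)
    for c in np.unique(classes):
        nonzero_share = (X[classes == c] != 0).mean(axis=0)
        keep |= nonzero_share >= fraction
    return keep

def qc_rsd(X_qc):
    """Relative standard deviation of each metabolite over QC injections."""
    return X_qc.std(axis=0, ddof=1) / X_qc.mean(axis=0)

# Hypothetical peak table: 10 samples x 3 metabolites, two diets.
X = np.array([
    [5, 3, 0], [6, 2, 1], [7, 4, 0], [5, 0, 0], [6, 3, 2],   # diet 1
    [4, 0, 0], [5, 0, 3], [6, 0, 0], [7, 0, 1], [5, 0, 0],   # diet 2
], dtype=float)
classes = np.array(["diet1"] * 5 + ["diet2"] * 5)
keep = eighty_percent_rule(X, classes)

# Hypothetical repeated injections of a pooled quality control sample.
X_qc = np.array([
    [100.0, 50.0, 5.0],
    [102.0, 55.0, 0.0],
    [98.0, 45.0, 12.0],
    [100.0, 50.0, 1.0],
])
rsd = qc_rsd(X_qc)
stable = rsd < 0.3   # assumed cutoff, to be chosen per study
```

Metabolite 1 is retained by the 80% rule because it is present in four of the five diet 1 samples, although it is absent from diet 2; metabolite 2 fails both filters.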

Vertical scaling. Another issue is how to scale the data as the metabolites can have very different intensities. The normal procedure is to scale the metabolites with the standard deviation (scaling to unit variance), which gives all metabolites the same chance of influencing the model. However, this might emphasize the noisy metabolites too much, which in turn may increase the risk of false conclusions. On the contrary, if data are not scaled (just mean centered) the high-intensity metabolites will dominate the model. In several metabolomics studies it has been suggested to use Pareto scaling, in which the square root of the standard deviation is used as the scaling factor. Pareto scaling still emphasizes the more intense metabolites, but low-intensity metabolites are not penalized as much as when no scaling is used (van den Berg et al., 2005). However, it is always recommended to run a PCA trial with just mean-centered data (metabolite-wise subtracting the average of all samples from each individual sample) because big systematic variations are often important, if not for the study design then because they describe the dynamic range by which the methods are challenged. The scaling methods described above use the standard deviation or an associated measure as a scaling factor. However, the standard deviation is just one way to measure the data spread and to correct for intensity differences. In biological studies the scaling measure can also be chosen to be dependent on the biological range, on level ranges, or to emphasize metabolites that fluctuate the least (van den Berg et al., 2005). Different scaling methods also provide loading plots that can be quite different due to the nature of the scaling method (Fig. 19.8). All scaling methods have consequences that deserve some consideration before they are applied. Pareto scaling has been selected as the default method in commercial software, but gives no guarantee that all information is recovered from the data, especially from low-intensity metabolites. Depending on data cleaning, data setup, and biological hypothesis, several scaling methods should be tested to evaluate different intensity classes of metabolites. If one has a quantitative parameter such as a metabolite concentration measured by a reference method, a simple PLS prediction method can be used to optimize the vertical scaling methods (Rasmussen et al., 2012).

[Figure 19.8: two loading plots with labeled metabolites 972, 92, 3, 412, 4120, and 33.]

FIGURE 19.8 Typical loading plots where RIGHT: no scaling has been applied and LEFT: scaling to unit variance has been applied. In Figure 19.7, a model using Pareto scaling has been illustrated. The result of no scaling is very often a too simple model where only the major metabolites can be observed (and interpreted). In such models the loading plot is easy to interpret, but will often be too simple. Scaling to unit variance for each metabolite may seem the most objective way of performing explorative data analysis, but this approach is severely affected by noisy metabolites and thus data should be cleaned before the model step. The loading plot of such a model can be almost impossible to investigate as many metabolites will have an influence on the model.
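The three scaling options discussed above differ only in the divisor applied after mean centering. The sketch below uses a hypothetical two-metabolite table in which the standard deviations differ by a factor of 100; the function name `scale` and its `method` labels are illustrative.

```python
import numpy as np

def scale(X, method="pareto"):
    """Column-wise (metabolite-wise) centering and scaling of a peak table."""
    Xc = X - X.mean(axis=0)               # mean centering
    sd = X.std(axis=0, ddof=1)
    if method == "center":                # no scaling
        return Xc
    if method == "autoscale":             # unit variance
        return Xc / sd
    if method == "pareto":                # square root of the standard deviation
        return Xc / np.sqrt(sd)
    raise ValueError(f"unknown scaling method: {method}")

# Hypothetical table: an intense metabolite (sd = 100) and a weak one (sd = 1).
X = np.array([
    [0.0, 0.0],
    [100.0, 1.0],
    [200.0, 2.0],
])

spread = {m: scale(X, m).std(axis=0, ddof=1)
          for m in ("center", "autoscale", "pareto")}
```

After mean centering alone the spread of the two metabolites remains 100 and 1, after scaling to unit variance it is 1 and 1, and after Pareto scaling it is 10 and 1, so Pareto scaling sits between the two extremes.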

Variation extracted. While PCA extracts the main variation in cleaned data there is no guarantee that this variation reflects the hypothesis put forward (e.g., what are the differences between object class 1 and 2). Measuring metabolites in biological samples such as biofluids from humans provides data with multiple relevant and irrelevant types of variation such as sex, eating habits, cohabitation, age, genetics, medication, environmental factors, and lifestyle (Rasmussen et al., 2011). These variations can, depending on the design of the study, be a major part of the total variation and thus be extracted by the PCA model in the first components. Being able to minimize these using a proper experimental design, balanced data, randomized samples, and optimized instrumental conditions is of prime importance in foodomics studies, where the sought information otherwise can be hidden. When principal components are extracted, a part of the variation in the original data is captured by these components. These parts are always reported (in percentage of the total variation) together with model parameters and should be used when interpreting the model. For foodomics studies, the relevant study design patterns can be relatively poorly represented in the data compared to noise, biological variations, instrumental artifacts, study design factors, etc. Thus, if two classes are perfectly separated in the score plot (e.g., when plotting component one vs. two) this must be evaluated in the light of how much variation these two components account for. Also, when two metabolites are highly correlated in the corresponding loadings plot, it is important to remember that this is not the same as when plotting the raw univariate data for the same metabolites.

Normalization. Depending on the data and the expected output, the data can be normalized in several ways. This was discussed in Section 19.2.3 and must be evaluated prior to any data analysis. In commercial software, a default normalization method is often selected and the user should check if this method is in agreement with the target of the data analysis.

Validation. It is normally not critical to validate a PCA model since the important score plots remain relatively unperturbed by different validation methods. However, the variance described by the principal components will be strongly influenced by the validation method (Section 19.3.2), wherefore rigorous validation is strongly recommended when the PCA models are to be evaluated in, for example, scientific literature. The validation often goes hand in hand with the explained variance of the principal components. If very little variance is described by the first components then this indicates that multivariate patterns are not consistently present in data, for example, due to inhomogeneous or noisy data. In such a situation it is crucial to avoid too strong conclusions that cannot be supported by either data or model. A new set of samples or the use of an alternative instrumental technique (e.g., NMR) will help to indicate if the results found can be validated or should be abandoned.

FIGURE 19.9 Illustration of how metabolites found in several samples can be linked to a dummy variable representing the a priori knowledge of the foodomics study. In the dummy variable a 1 indicates that the urine sample comes from a person who has been eating diet 1 and a 0 indicates that diet 2 has been eaten.

19.3.2.2 Partial Least Squares Regression Discriminant Analysis (PLS-DA)

In foodomics studies, it is common to have a priori knowledge about the data, typically from a controlled experimental design. This can be used actively in the model by introducing a so-called dummy variable that contains the a priori knowledge, as exemplified in Figure 19.9. The PLS-DA method is a classification method where the dummy variable is predicted in the best possible way using the information found in the metabolite table (Ståhle and Wold, 1987). The result could be a certain metabolite pattern found in a foodomics study responsible for the discrimination between the intervention and the control. This is closely related to a normal PLS prediction model where a continuous parameter (e.g., cholesterol level) is predicted from, for example, an NMR spectrum, but the main difference is that PLS-DA is a classification task (Wold et al., 1983). In a normal PLS, the variation in the continuous parameter is described in the best possible way (providing the best obtainable prediction) from a linear combination of the metabolites, which are weighted (regression coefficients) according to their importance in the prediction model of the parameter. Where a normal PLS model is optimized according to the prediction error (e.g., RMSECV), the PLS-DA should be optimized based on classification parameters (e.g., rate or percentage of misclassified samples). The use of PLS-DA models is widespread (they are more often used than classical PLS in foodomics studies) and will also dominate future foodomics studies because of their very strong classification performance, but there are several things to be aware of before presenting the outcome of a PLS-DA model.

Things to be Aware of When Using PLS-DA

All the things to be aware of when using PCA must also be considered when using PLS-DA. In addition, a few other things are essential for PLS-DA, mainly related to the supervised nature of the model.

“When you ask for discrimination— you will get it!” Lars Nørgaard, Danish chemometrician

Where PCA represents an unbiased, untargeted, and unsupervised data exploration, PLS-DA represents the opposite. It is a supervised method where patterns are extracted so they describe the dummy variable in the best possible way (i.e., best possible classification). Another major difference is that PCA results can often be presented without considering validation. This is not the case for PLS models, as these must always be validated before presenting scores, loadings, and prediction errors. For PLS-DA it is even worse since spurious correlations can often lead to excellent but false classifications. Without validation, the numerous metabolites extracted (some being noisy, some random, and some real) combined with the relatively few samples make the model outcome nonsense. In fact it has been shown that a nonvalidated PLS-DA on sound metabolite data containing no information about a given class relationship is able to show a clear separation of the classes in the score plot (Westerhuis et al., 2008). Unfortunately, such extreme classification results have been reported several times in metabolomics studies, but this can relatively easily be detected if validation is included in a PLS-DA model. In Figure 19.10, an example of a score plot from a PLS-DA model on random data assigned two classes is shown (a similar model could have been found if sound data were arbitrarily divided into two classes (Westerhuis et al., 2008)). The low explained variation of the two first components indicates that only a small fraction of the random data is used to separate the classes. Such an observation, as was also discussed for PCA models, must be evaluated further and for PLS-DA models the only acceptable approach is to include a rigorous validation step to confirm or refute the class separation. Without going into details with validation, several cross-validation methods can be used depending on data structure and setup of study design (Bro et al., 2008; Esbensen and Geladi, 2010). Regardless of validation or not, the scores and loading plots would be similar and thus these plots cannot be used to assess the classification performance of a PLS-DA model. Having selected an appropriate validation technique, the PLS-DA model can be investigated by inspection of the classification properties in the validated model. Most software packages report a misclassification rate which can be studied when the number of PLS-DA components is increased. This is shown in Figure 19.11. The model that is not validated provides a perfect classification (no misclassified samples) using three PLS components, but when validating, the misclassification rate is almost 10% using the same number of components. This difference is due to overfitting the data and thus the model appears better than it is. The degree of overfitting can be much worse than in the presented example and only when validating rigorously can this degree be evaluated. In this case the model could only give a 10% misclassification rate and depending on the data and the hypothesis criteria this might be acceptable.

FIGURE 19.10 Score plot (PLS component 1 vs. 2) from a PLS-DA model of random data assigned to either class 1 or class 2 where PLS component one explains 2.5% and component two explains 2.1%. An excellent but meaningless class separation is obtained.

FIGURE 19.11 Misclassification rate (value between 0 and 1) for nonvalidated and validated PLS-DA models when increasing the number of PLS-DA components.
The aim of validating a PLS-DA model is to test the data and evaluate how many PLS-DA components must be included to provide a representative misclassification error when using the model to predict new biological samples. The ultimate classification test/validation is the use of an independent test set; that is, apply the developed classification model on some new samples that have not been included in the model building and test if they are classified correctly.
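The gap between nonvalidated and cross-validated misclassification rates can be reproduced on random data with a few lines of code. The sketch below is a minimal single-response PLS (NIPALS) written for illustration only, with a cutoff of 0.5 on the predicted dummy variable; it is not the implementation used for Figures 19.10 and 19.11, and the data size (20 samples, 500 random variables), the segment layout, and all function names are assumptions.

```python
import numpy as np

def pls1_fit(X, y, n_components):
    """Fit PLS1 with the NIPALS algorithm; y is the 0/1 dummy variable."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    E, f = X - x_mean, y - y_mean            # centered data and dummy variable
    W, P, q = [], [], []
    for _ in range(n_components):
        w = E.T @ f
        w = w / np.linalg.norm(w)            # weight vector
        t = E @ w                            # scores
        p = E.T @ t / (t @ t)                # X loadings
        q_a = f @ t / (t @ t)                # y loading
        E = E - np.outer(t, p)               # deflate X
        f = f - q_a * t                      # deflate y
        W.append(w); P.append(p); q.append(q_a)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    b = W @ np.linalg.solve(P.T @ W, q)      # regression coefficients
    return b, x_mean, y_mean

def pls_da_predict(model, X):
    """Assign class 1 when the predicted dummy value exceeds 0.5."""
    b, x_mean, y_mean = model
    return ((X - x_mean) @ b + y_mean > 0.5).astype(int)

def misclassification(X, y, n_components, segments=None):
    """Misclassification rate; cross-validated if segments are given."""
    if segments is None:                     # nonvalidated (fitted) result
        model = pls1_fit(X, y, n_components)
        return np.mean(pls_da_predict(model, X) != y)
    errors = 0
    for seg in segments:                     # leave one segment out at a time
        train = np.setdiff1d(np.arange(len(y)), seg)
        model = pls1_fit(X[train], y[train], n_components)
        errors += np.sum(pls_da_predict(model, X[seg]) != y[seg])
    return errors / len(y)

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 500))               # random data, no class information
y = np.repeat([0.0, 1.0], 10)                # arbitrary dummy variable
segments = [np.array([i, i + 5, i + 10, i + 15]) for i in range(5)]

fit_error = misclassification(X, y, 3)
cv_error = misclassification(X, y, 3, segments)
```

With many more variables than samples, the fitted error is expected to be far lower than the cross-validated error, which for random data should be close to chance level; comparing `fit_error` and `cv_error` over an increasing number of components gives a plot of the type shown in Figure 19.11.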

Other Simple Validation Rules Are:

1. In general you want to perturb your data as much as possible when applying cross-validation (Wold, 1978) and yet be able to obtain a valid model.
2. Do not use full cross-validation (leave-out-one-sample-at-a-time) unless you have very few samples (Martens and Dardenne, 1998).
3. Replicates should always end up in the same cross-validation segments.
4. If you have several design groups, try to use the other groups as segments in the cross-validation, that is, harvest year, sex, location, batch, etc.
5. The segments should be representative for what you would like to use the model for.
6. Be conservative and select the cross-validation model that gives the worst prediction performance (using sound cross-validation).
7. Be conservative and select the model that is more parsimonious.
8. Different cross-validation methods may suggest different numbers of PLS-DA components. Try more validation techniques to test the data and model or repeat a method where samples are randomly split into segments.

The application of other more advanced validation methods like double cross-validation, permutation, and Monte Carlo testing often adds complementary insight into the model performance (Westerhuis et al., 2008). However, in most cases following the rules above will be adequate to assess if the data hold information on the classes investigated.
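Rules 3 and 8 above can be combined in a small helper that builds cross-validation segments from sample identifiers, so that replicates of the same sample are never split between model building and prediction. The function name, the identifiers, and the round-robin assignment are hypothetical choices made for this sketch.

```python
import random
from collections import defaultdict

def replicate_segments(sample_ids, n_segments, seed=0):
    """Split row indices into cross-validation segments so that all
    replicates of one sample end up in the same segment."""
    groups = defaultdict(list)
    for index, sample_id in enumerate(sample_ids):
        groups[sample_id].append(index)
    ids = sorted(groups)
    random.Random(seed).shuffle(ids)          # change the seed to repeat the split
    segments = [[] for _ in range(n_segments)]
    for k, sample_id in enumerate(ids):
        segments[k % n_segments].extend(groups[sample_id])
    return segments

# Hypothetical run list: six rats, each measured in duplicate.
sample_ids = ["rat1", "rat1", "rat2", "rat2", "rat3", "rat3",
              "rat4", "rat4", "rat5", "rat5", "rat6", "rat6"]
segments = replicate_segments(sample_ids, n_segments=3)
```

Calling the function with different seeds gives repeated random splits, which can be used to check whether the suggested number of components is stable. Grouping by a design factor such as batch or harvest year (rule 4) works the same way if that factor is passed instead of the sample identifier.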

19.3.2.3 Other Types of Classification Methods

While PLS-DA is by far the most abundant classification method in foodomics studies, other classification methods such as orthogonalization methods, soft independent modeling of class analogy (SIMCA) (Wold, 1976), and canonical variates analysis (CVA) (Rao, 1952) are available. In order to improve the interpretation of the classification results, some classification methods include an orthogonalization step in which the variation in the data is divided into two parts, either before the model as a preprocessing step or as an integral part of the modeling step. Basically, orthogonalization divides the measured data into one part that is orthogonal (uncorrelated) to the study design factor (e.g., diet 1 and 2), and thus irrelevant for the design factor, and one part that contains the relevant information of the investigated design factor. Subsequently, the two parts can be analyzed separately. This strategy will sometimes result in an easier and perhaps more intuitive interpretation, but it does not provide a better classification of the samples (Tapp and Kemsley, 2009). Two of the most used methods within this category are O-PLS-DA (Bylesjø et al., 2006) and multilevel PLS-DA (van Velzen et al., 2008).

“A like-for-like comparison will show that O-PLS-DA never outperforms PLS-DA, just as O-PLS will never outperform PLS” (Tapp and Kemsley, 2009)
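The basic idea of dividing the data into a design-related part and an orthogonal part can be shown with a plain least-squares projection. Note the assumption: this is not the O-PLS-DA algorithm, only a simplified sketch of the principle, and the function name `split_by_design` and the simulated data are hypothetical.

```python
import numpy as np

def split_by_design(X, y):
    """Split column-centered X into a part that covaries with the design
    vector y and a remainder that is orthogonal (uncorrelated) to y."""
    Xc = X - X.mean(axis=0)
    yc = (y - y.mean()).reshape(-1, 1)
    # Least-squares projection of every variable onto the centered design vector
    X_related = yc @ (yc.T @ Xc) / (yc.T @ yc)
    X_orthogonal = Xc - X_related
    return X_related, X_orthogonal

# Simulated example: 12 samples, 30 variables, two diets coded 0 and 1
rng = np.random.default_rng(0)
y = np.repeat([0.0, 1.0], 6)
X = rng.normal(size=(12, 30))
X[y == 1, :3] += 2.0          # three variables respond to the diet

X_related, X_orthogonal = split_by_design(X, y)
```

The two parts add up to the centered data, so nothing is gained or lost by the split. This is consistent with the statement above that orthogonalization can ease interpretation without improving classification.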

Another powerful classification method is CVA, and in particular its extension that can handle datasets with more variables than objects: extended canonical variates analysis (ECVA) (Nørgaard et al., 2006). In analogy to CVA, ECVA optimizes the criterion of within-class variation divided by between-class variation by finding new multivariate directions. ECVA has great potential within foodomics classification analysis, for example, in authenticity testing and metabolomics. However, it should be noted that there will not necessarily be a large difference in the misclassification rate between different methods such as SIMCA, PLS-DA, O-PLS-DA, PCA-CVA, and ECVA, as they all have their advantages and disadvantages. Unlike, for example, discriminant PLS, ECVA can also handle situations where three groups are separated along one direction. It is generally found that ECVA provides very robust solutions (classification errors tend not to increase with slight overfitting) and that it is less sensitive to irrelevant variables than the discriminant PLS methods.
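For reference, the CVA criterion is conventionally written as a ratio of scatter matrices. The notation below (a direction w, a between-class scatter matrix, and a within-class scatter matrix) is assumed here and is not taken from the chapter. Maximizing the between-to-within ratio shown is equivalent to minimizing the within-to-between ratio named in the text.

```latex
J(\mathbf{w}) =
\frac{\mathbf{w}^{\mathsf T}\,\mathbf{S}_{\text{between}}\,\mathbf{w}}
     {\mathbf{w}^{\mathsf T}\,\mathbf{S}_{\text{within}}\,\mathbf{w}},
\qquad
\mathbf{S}_{\text{within}}^{-1}\,\mathbf{S}_{\text{between}}\,\mathbf{w} = \lambda\,\mathbf{w}
```

With more variables than objects the within-class scatter matrix is singular and its inverse does not exist, which is the situation the extended method is designed to handle.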

19.3.2.4 Variable Selection: How to Get Rid of Poor Metabolites

When measuring many metabolites, several of them will be irrelevant; thus, in addition to the ones removed during data cleaning, it can be an advantage to get rid of the irrelevant metabolites. In foodomics studies, irrelevant often means unsystematic or noisy metabolites, or metabolites containing information other than that being explored (e.g., the difference between two diets). When variable selection is performed on an already established model, the model tends to become slightly better (better classification or better visual depiction of the difference between two classes) and the interpretation is enhanced significantly (i.e., fewer metabolites in the loadings plot to consider). However, when performing data reduction and variable selection, the strategy must be carefully considered.

“If you keep only the data you think are relevant you will confirm what you already “know” is important and this will reduce your chances of innovation” Frank Westad, Norwegian chemometrician

Variable selection is a whole branch of chemometrics and is outside the scope of this chapter. In foodomics studies, variable selection can give considerable advantages in interpretation and performance. However, the vast number of variables (metabolite signals) compared to the relatively few objects in XC-MS foodomics is likely to generate spurious correlations when “hardcore” variable selection methods such as forward selection or genetic algorithms are applied. In any case, combining a supervised model such as PLS-DA with a variable selection method carries a high risk of overfitting the model, and a true external test set validation is required. Interval PLS-DA (iPLS-DA) and interval ECVA (iECVA) have proven efficient in improving and simplifying classification models by breaking the data up into smaller intervals (each consisting of either many metabolites or a single metabolite) (Di Anibal et al., 2011; Ferrari et al., 2011; Nørgaard et al., 2000; Savorani et al., 2010; Winning et al., 2009). Interval modeling is generally a sound principle when analyzing raw XC-MS datasets and not “just” metabolite tables. When few intervals and/or fewer metabolites are found to be optimal for the best classification, this also makes the subsequent biological interpretation simpler.
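The risk of spurious correlations with many variables and few objects is easy to demonstrate on simulated data. The sketch below uses assumed dimensions (20 samples, 5000 variables) and pure random noise, so by construction no variable is related to the class labels.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_variables = 20, 5000

X = rng.normal(size=(n_samples, n_variables))   # pure noise "metabolite signals"
y = np.repeat([0.0, 1.0], n_samples // 2)        # two arbitrary classes

# Pearson correlation between the class vector and every variable
Xc = X - X.mean(axis=0)
yc = y - y.mean()
r = (yc @ Xc) / (np.linalg.norm(yc) * np.linalg.norm(Xc, axis=0))

best = np.argsort(-np.abs(r))[:10]   # the ten "most discriminating" noise variables
max_abs_r = np.abs(r).max()
```

Even though the data contain no class information, the highest-ranked variables show sizeable correlations with the classes. A model built on variables picked this way would look good unless the selection step itself is included in the validation or checked on an external test set.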

19.3.3 Chemometrics Methods for Data Cubes

In metabolomics data, several metabolites will often be correlated, creating characteristic patterns that are related to different phenomena (e.g., study design factors). This has been shown to be efficiently handled by PCA, where characteristic patterns are described. In PCA, individual metabolites are allowed to (and often will) influence not only one but several of the extracted principal components/patterns (e.g., difference between diets, gender differences, age differences). However, this feature is suboptimal when resolution of coeluting peaks into individual metabolites is required. This could, for example, concern a metabolite found to be important in an initial PCA model, but where the raw data indicate that it coelutes with another, less important metabolite. Then the importance of the metabolite(s) must be evaluated (and validated). This is only possible if the contributions from the two peaks can be separated (i.e., identified and quantified). Such a separation of chemical components complies well with how analytical signals (from, e.g., XC-MS) are built up and how they can be decomposed. According to Lambert Beer’s law, the analytical signal is composed of component-specific terms, and even closely eluting or strongly correlated metabolites in the analytical signal can be modeled if the extinction coefficients are known (or found) (Fig. 19.12). This feature can be used for peak deconvolution if the applied model can find and extract individual metabolite signals. Some chemometric techniques are capable of peak deconvolution, which means that coeluting metabolites can be characterized and quantified. Such models do not assume the model components to be orthogonal (as in PCA), but they require that the data are kept in their original form (cubed data).
Used appropriately, these models can describe the data in the same way as Lambert Beer’s law, and thus individual chemical information can be characterized and quantified from each model component. For comparison, the composition of Lambert Beer’s law, PCA, and the most often used cubed (or multiway) model, PARAFAC (described below), is shown in Figure 19.12. The success of Lambert Beer’s law and PARAFAC lies in the fact that the analytical components are extracted and described without the mathematical (and often nonchemical) orthogonality constraint of PCA. Without this constraint, and if the extinction coefficient ε, or a (elution time profile) and b (mass spectral profile), are different for different metabolites, and if these can be found consistently across samples, then the concentration term c will hold the relative concentration of the individual metabolites.

FIGURE 19.12 The resemblance between Lambert Beer’s law, PCA, and PARAFAC. All three methods assume that the extinction coefficient ε, the loadings p, and the profiles a and b, respectively, are common across all samples. The concentration measure is then found in t for PCA and in c for Lambert Beer’s law and PARAFAC. The term e represents the information in the data that is not captured by the extracted chemical component or pattern and will often be noise or irrelevant information.

In PCA, properties of metabolites are often included in several loadings (p), making the corresponding concentration term (t) a mixture of many metabolites. This is not ideal for evaluating individual chemical features, but optimal for exploring and finding patterns in data. In many ways, PARAFAC can be considered a multiway extension of PCA, and scores and loadings are obtained just as for PCA (Fig. 19.5). However, PARAFAC focuses more on chemistry and is thus very suitable for characterizing parts of the data where PCA has limitations. Methods suitable for cubed data also operate on data with more than three dimensions (e.g., Samples × GC × GC × TOFMS and others as shown in Table 19.1). With hyphenated chromatographic mass spectrometry data, multiway methods have been used most frequently on cubed data such as XC-MS, and more specifically they have been most successful when modeling a small part of the elution time axis in which data behave in a trilinear fashion (illustrated in Fig. 19.13).
In such a subregion (or low-rank part) of the data (e.g., looking at a few metabolites across samples and all mass fragments), these methods are able to perform a unique peak deconvolution so that individual chemical features are extracted separately (Booksh and Kowalski, 1994; Boque and Ferre, 2004; Bro, 2003; Bro et al., 2010; Comas et al., 2004; Escandar et al., 2007; Ortiz and Sarabia, 2007; Rinnan et al., 2007; Skov and Bro, 2008), which would otherwise require a more comprehensive calibration.
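The three decompositions compared in Figure 19.12 can be written side by side. This is a sketch in assumed notation: the symbols ε, c, t, p, a, b, and e follow the figure caption, while the indices (i for samples, j for the measured channel or elution time, k for mass channels, f for components, F for the number of components) are chosen here for illustration only.

```latex
\begin{aligned}
\text{Lambert Beer's law:}\quad & x_{ij} = \sum_{f=1}^{F} c_{if}\,\varepsilon_{jf} + e_{ij} \\
\text{PCA:}\quad & x_{ij} = \sum_{f=1}^{F} t_{if}\,p_{jf} + e_{ij},
  \qquad \mathbf{p}_f^{\mathsf T}\mathbf{p}_g = 0 \ \text{for}\ f \neq g \\
\text{PARAFAC:}\quad & x_{ijk} = \sum_{f=1}^{F} c_{if}\,a_{jf}\,b_{kf} + e_{ijk}
\end{aligned}
```

Only the PCA line carries an orthogonality constraint, which is why c can be read as the relative concentration of one metabolite whereas t is generally a mixture of several.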

FIGURE 19.13 Four cases using PARAFAC and PARAFAC2 to extract peak areas for subsequent data analysis using PCA. Case 1: The same metabolite eluting at different places (i.e., shifted data) in four samples: one-factor PARAFAC2 model. Case 2: Two different metabolites that are perfectly resolved and where each metabolite elutes at the same time (i.e., no shifts in data) across samples: two-factor PARAFAC model. Case 3: Two overlapping metabolites that do not show shifts: two-factor PARAFAC model. Case 4: Two overlapping metabolites eluting at different places: two-factor PARAFAC2 model. Besides providing relative concentrations, the multiway models also estimate elution time and mass spectral loadings. The latter are especially important for identification of the metabolites, and the loadings can be directly compared with libraries such as NIST.

Multiway models such as PARAFAC provide a unique solution to a given low-rank problem, which for XC-MS is normally a system of few peaks (i.e., metabolites). The uniqueness is due to the models being able to find characteristic mass spectral and elution time profiles for the individual analytes in the system investigated, provided that the proper number of factors is included in the model. Unique profiles do not require selective variables, for example, selective mass channels (Sinha et al., 2004), which is otherwise an essential criterion if, for example, univariate models or classical statistics are used. PARAFAC provides chemically meaningful solutions even if no selective mass channels exist, provided that the ratios of the intensities of the individual mass channels are different (i.e., different patterns). This will often be the case even for closely eluting metabolites. The principles of PARAFAC were originally proposed in 1970 independently by Harshman (1970) and Carroll and Chang (1970). PARAFAC has been used to model multiway chromatographic data to obtain both qualitative and quantitative information (Bro, 1997; Bylund et al., 2002; Hoggard and Synovec, 2007; Johnson et al., 2004) and has recently also been applied within metabolomics (Hendriks et al., 2011; Liland, 2011; Richards et al., 2010). The way PARAFAC can be used for metabolomics data is to model problematic regions, either in a targeted study or for metabolites found to be important in an initial PCA/PLS-DA screening of design factors. These metabolites can be evaluated further by looking into a subpart of the data consisting of only a few metabolites (including the important one). 
The use of these multiway models is most powerful for “difficult” peaks where coelution is severe, where data are too misaligned to be handled by standard software, or where some samples contain too small an amount of a metabolite (in such cases standard software may wrongly report zero intensity if the peak is below a threshold). In Figure 19.13, two examples of the use of PARAFAC can be seen. PARAFAC assumes invariant elution time and mass spectral profiles across the samples, and as such, one prerequisite is that data are aligned prior to modeling. Besides synchronized profiles, another important requirement is that peak shapes must be similar for individual peaks across samples. These two restrictions can be a challenge in real-world XC-MS data, but a less constrained version of PARAFAC can be applied to handle both unsystematic shifts along the elution time axis and changes in peak shapes. This model, called PARAFAC2 (Harshman, 1972; Kiers et al., 1999), allows the natural structure of chromatographic data (e.g., shifted peaks) to be present in the data while still providing a chemically meaningful unique solution. If the elution time shifts are confined to a certain extent, only the PARAFAC2 algorithm is able to find and model the shifted peaks of the same chemical compounds, profiting from the fact that they have the same mass spectral profile (or vice versa). Figure 19.13 demonstrates the use of PARAFAC and PARAFAC2 on a set of chromatographic data. Applying PARAFAC2 to the first region in Figure 19.13 with a shifted peak provides a perfect description using a one-factor model, whereas a PARAFAC model will fail (misaligned data). For the next two regions, where no shifts are observed, PARAFAC is the simplest model. For the last region, PARAFAC2 models two coeluting and shifted peaks perfectly. 
For all four regions, the model outcome is one or two sets of relative concentrations, each belonging to a single metabolite, creating a data table of five metabolites found in four samples. An important feature of the concentrations found by PARAFAC/PARAFAC2 is that they are relative concentrations, which only need to be scaled to obtain the real concentrations. It is often experienced that, in cases with difficult peaks (shifts or coelution), these concentrations are significantly better than the relative concentrations extracted using commercial software. For the “easy” peaks, such as those in the second region in Figure 19.13, the two approaches will often provide very similar results. Thus, multiway methods should be used as complementary methods, especially for the metabolites found to be incorrectly handled by traditional software (which happens surprisingly frequently). The concentrations and identification of important metabolites in a PCA or PLS-DA model that elute in regions affected by noise, other peaks, or background signal should also be further validated before biological interpretations are carried out.
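The peak deconvolution described in this section can be illustrated with a small simulation. The sketch below is not the implementation used by the authors: it is a plain alternating least squares (ALS) fit of a two-factor PARAFAC model written with NumPy, without nonnegativity constraints or convergence checks, and all names, profiles, and concentrations are hypothetical. Two coeluting, unshifted metabolites that share mass channels are simulated in six samples (as in Case 3 of Figure 19.13).

```python
import numpy as np

def khatri_rao(U, V):
    """Column-wise Kronecker product; rows run over (row of U, row of V)."""
    return (U[:, None, :] * V[None, :, :]).reshape(-1, U.shape[1])

def parafac_als(X, n_factors, n_iter=500, seed=0):
    """Plain ALS fit of a three-way PARAFAC model (illustrative only).

    X has shape (samples, elution times, mass channels).
    Returns C (relative concentrations), A (elution profiles), B (mass spectra).
    """
    I, J, K = X.shape
    rng = np.random.default_rng(seed)
    A = rng.random((J, n_factors))
    B = rng.random((K, n_factors))
    X1 = X.reshape(I, J * K)                      # samples x (time, mass)
    X2 = X.transpose(1, 0, 2).reshape(J, I * K)   # time x (sample, mass)
    X3 = X.transpose(2, 0, 1).reshape(K, I * J)   # mass x (sample, time)
    for _ in range(n_iter):
        # Update each mode in turn by least squares, keeping the others fixed
        C = X1 @ khatri_rao(A, B) @ np.linalg.pinv((A.T @ A) * (B.T @ B))
        A = X2 @ khatri_rao(C, B) @ np.linalg.pinv((C.T @ C) * (B.T @ B))
        B = X3 @ khatri_rao(C, A) @ np.linalg.pinv((C.T @ C) * (A.T @ A))
    return C, A, B

# Simulated region: two overlapping Gaussian peaks, no retention time shifts
t = np.arange(40)
A_true = np.column_stack([np.exp(-0.5 * ((t - 17) / 3.0) ** 2),
                          np.exp(-0.5 * ((t - 22) / 3.0) ** 2)])
# Mass spectra that share two channels (5 and 14) but differ in pattern
B_true = np.zeros((20, 2))
B_true[[2, 5, 9, 14], 0] = [1.0, 0.6, 0.3, 0.8]
B_true[[5, 7, 14, 18], 1] = [0.5, 1.0, 0.4, 0.7]
C_true = np.array([[1.0, 2.0], [2.0, 0.5], [0.5, 1.0],
                   [1.5, 3.0], [3.0, 0.3], [0.2, 1.2]])

X = np.einsum('if,jf,kf->ijk', C_true, A_true, B_true)
X += 1e-3 * np.random.default_rng(1).normal(size=X.shape)   # small noise

C, A, B = parafac_als(X, n_factors=2)

# Correlation between true and estimated concentration profiles
# (components may come out in any order and with arbitrary scale)
corr = np.corrcoef(C_true.T, C.T)[:2, 2:]
```

Because the model has scale and order ambiguities, the estimated columns of `C` are relative concentrations that must be matched to the metabolites and scaled, in line with the remark above. For real data, a dedicated multiway toolbox with constraints and diagnostics would normally be preferred over this bare sketch.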

19.4 CONCLUSION

Foodomics studies are complex, and studies dealing with metabolomics data are concerned with multifactorial problems. As these are analyzed with multifactorial sensors and separation methods, multivariate data handling methods are required to extract and describe the information in the data. The modern data analytical platforms generate vast amounts of data in a very short time, and the analyst risks being flooded with noninformative data.

“Too much data, too little information” Harald Martens, Norwegian chemometrician

This is particularly problematic when the number of variables exceeds the number of objects investigated by orders of magnitude. We therefore emphasize that metabolomics data should be carefully explored using both unsupervised and supervised methods. PCA is a very efficient method for investigating how the information is perturbed by different data cleaning and preprocessing methods. If the experimental design is aimed at differentiating between classes, additional classification methods have proven to be very powerful for metabolomics data. However, when a priori knowledge is introduced in the modeling step, the outcome must be rigorously validated before biological interpretation and presentation of the results. If data are complex and coeluting peaks are present, multiway models can be applied to perform unique peak deconvolution. Data cleaning, preprocessing, modeling, and validation are all essential parts of the chemometrics that should be applied in any foodomics study. Knowledge of chemometric methods will provide the best platform for getting the most from foodomics studies and for avoiding incorrect approaches to data and nonvalidated interpretations.

REFERENCES

Amigo JM, Skov T, Bro R (2010). ChroMATHography: solving chromatographic issues with mathematical models and intuitive graphics. Chemical Reviews 110:4582–4605.
Barnes RJ, Dhanoa MS, Lister SJ (1989). Standard normal variate transformation and de-trending of near-infrared diffuse reflectance spectra. Applied Spectroscopy 43:772–777.
Bijlsma S, Bobeldijk I, Verheij ER, Ramaker R, Kochhar S, Macdonald IA, van Ommen B, Smilde AK (2005). Large-scale human metabolomics studies: a strategy for data (pre-)processing and validation. Analytical Chemistry 78:567–574.
Booksh KS, Kowalski BR (1994). Theory of analytical chemistry. Analytical Chemistry 66:A782–A791.
Boque R, Ferre J (2004). Using second-order data in chromatographic analysis. LC GC Europe 17:402–407.
Bro R (1997). PARAFAC. Tutorial and applications. Chemometrics and Intelligent Laboratory Systems 38:149–171.
Bro R (2003). Multivariate calibration. What is in chemometrics for the analytical chemist? Analytica Chimica Acta 500:185–194.
Bro R, Kjeldahl K, Smilde A, Kiers H (2008). Cross-validation of component models: a critical look at current methods. Analytical and Bioanalytical Chemistry 390:1241–1251.
Bro R, Viereck N, Toft M, Toft H, Hansen PI, Engelsen SB (2010). Mathematical chromatography solves the cocktail party effect in mixtures using 2D spectra and PARAFAC. TrAC Trends in Analytical Chemistry 29:281–284.

Bylesjø M, Rantalainen M, Cloarec O, Nicholson JK, Holmes E, Trygg J (2006). OPLS discriminant analysis: combining the strengths of PLS-DA and SIMCA classification. Journal of Chemometrics 20:341–351.
Bylund D, Danielsson R, Malmquist G, Markides KE (2002). Chromatographic alignment by warping and dynamic programming as a pre-processing tool for PARAFAC modelling of liquid chromatography-mass spectrometry data. Journal of Chromatography A 961:237–244.
Capozzi F, Placucci G (2009). 1st International Conference in Foodomics, Cesena, Italy.
Carroll JD, Chang JJ (1970). Analysis of individual differences in multidimensional scaling via an N-way generalization of Eckart-Young decomposition. Psychometrika 35:283.
Castillo S, Gopalacharyulu P, Yetukuri L, Orešič M (2011). Algorithms and tools for the preprocessing of LC-MS metabolomics data. Chemometrics and Intelligent Laboratory Systems 108:23–32.
Cifuentes A (2009). Food analysis and foodomics. Journal of Chromatography A 1216:7109.
Comas E, Gimeno RA, Ferre J, Marce RM, Borrull F, Rius FX (2004). Quantification from highly drifted and overlapped chromatographic peaks using second-order calibration methods. Journal of Chromatography A 1035:195–202.
Di Anibal CV, Callao MP, Ruisanchez I (2011). (1)H NMR variable selection approaches for classification. A case study: the determination of adulterated foodstuffs. Talanta 86:316–323.
Esbensen KH, Geladi P (2010). Principles of proper validation: use and abuse of re-sampling for validation. Journal of Chemometrics 24:168–187.
Escandar GM, Faber NKM, Goicoechea HC, de la Pena AM, Olivieri AC, Poppi RJ (2007). Second- and third-order multivariate calibration: data, algorithms and applications. TrAC Trends in Analytical Chemistry 26:752–765.
Ferrari E, Foca G, Vignali M, Tassi L, Ulrici A (2011). Adulteration of the anthocyanin content of red wines: perspectives for authentication by Fourier transform-near infrared and (1)H NMR spectroscopies. Analytica Chimica Acta 701:139–151.
Harshman RA (1970). Foundations of the PARAFAC procedure: model and conditions for an explanatory multi-mode factor analysis. UCLA working papers in phonetics 16:1–84.
Harshman RA (1972). PARAFAC2: mathematical and technical notes. UCLA working papers in phonetics 22:30–44.
Hendriks MMWB, Eeuwijk FA, Jellema RH, Westerhuis JA, Reijmers TH, Hoefsloot HCJ, Smilde AK (2011). Data-processing strategies for metabolomics studies. TrAC Trends in Analytical Chemistry 30:1685–1698.
Hoggard JC, Synovec RE (2007). Parallel factor analysis (PARAFAC) of target analytes in GCxGC-TOFMS data: automated selection of a model with an appropriate number of factors. Analytical Chemistry 79:1611–1619.
Hotelling H (1933). Analysis of a complex of statistical variables into principal components. Journal of Educational Psychology 24:417–441.
Johnson KJ, Rose-Pehrsson SL, Morris RE (2004). Monitoring diesel fuel degradation by gas chromatography-mass spectroscopy and chemometric analysis. Energy & Fuels 18:844–850.
Katajamaa M, Orešič M (2007). Data processing for mass spectrometry-based metabolomics. Journal of Chromatography A 1158:318–328.

Kiers HAL, ten Berge JMF, Bro R (1999). PARAFAC2 - Part I. A direct fitting algorithm for the PARAFAC2 model. Journal of Chemometrics 13:275–294.
Kristensen M, Engelsen SB, Dragsted L (2012). LC-MS metabolomics top-down approach reveals new exposure and effect biomarkers of apple and apple-pectin intake. Metabolomics 8:64–73.
Liland KH (2011). Multivariate methods in metabolomics from pre-processing to dimension reduction and statistical analysis. TrAC Trends in Analytical Chemistry 30:827–841.
Martens HA, Dardenne P (1998). Validation and verification of regression in small data sets. Chemometrics and Intelligent Laboratory Systems 44:99–121.
Nielsen N-PV, Carstensen JM, Smedsgaard J (1998). Aligning of single and multiple wavelength chromatographic profiles for chemometric data analysis using correlation optimised warping. Journal of Chromatography A 805:17–35.
Nørgaard L, Bro R, Westad F, Engelsen SB (2006). A modification of canonical variates analysis to handle highly collinear multivariate data. Journal of Chemometrics 20:425–435.
Nørgaard L, Saudland A, Wagner J, Nielsen JP, Munck L, Engelsen SB (2000). Interval partial least-squares regression (iPLS): a comparative chemometric study with an example from near-infrared spectroscopy. Applied Spectroscopy 54:413–419.
Ortiz MC, Sarabia L (2007). Quantitative determination in chromatographic analysis based on n-way calibration strategies. Journal of Chromatography A 1158:94–110.
Rao CR (1952). Advanced Statistical Methods in Biometric Research. New York: John Wiley Subscription Services, Inc., A Wiley Company.
Rasmussen L, Savorani F, Larsen T, Dragsted L, Astrup A, Engelsen SB (2011). Standardization of factors that influence human urine metabolomics. Metabolomics 7:71–83.
Rasmussen LG, Winning H, Savorani F, Toft H, Larsen TM, Dragsted LO, Astrup A, Engelsen SB (2012). Assessment of the effect of high or low protein diet on the human urine metabolome as measured by NMR. Nutrients 4:112–131.
Richards SE, Dumas ME, Fonville JM, Ebbels TMD, Holmes E, Nicholson JK (2010). Intra- and inter-omic fusion of metabolic profiling data in a systems biology framework. Chemometrics and Intelligent Laboratory Systems 104:121–131.
Rinnan A, Riu J, Bro R (2007). Multi-way prediction in the presence of uncalibrated interferents. Journal of Chemometrics 21:76–86.
Savorani F, Picone G, Badiani A, Fagioli P, Capozzi F, Engelsen SB (2010). Metabolic profiling and aquaculture differentiation of gilthead sea bream by 1H NMR metabonomics. Food Chemistry 120:907–914.
Sinha AE, Prazen BJ, Synovec RE (2004). Trends in chemometric analysis of comprehensive two-dimensional separations. Analytical and Bioanalytical Chemistry 378:1948–1951.
Skov T, Bro R (2008). Solving fundamental problems in chromatographic analysis. Analytical and Bioanalytical Chemistry 390:281–285.
Ståhle L, Wold S (1987). Partial least squares analysis with cross-validation for the two-class problem: a Monte Carlo study. Journal of Chemometrics 1:185–196.
Tapp HS, Kemsley EK (2009). Notes on the practical utility of OPLS. TrAC Trends in Analytical Chemistry 28:1322–1327.
Theodoridis GA, Gika HG, Want EJ, Wilson ID (2012). Liquid chromatography mass spectrometry based global metabolite profiling: a review. Analytica Chimica Acta 711:7–16.

Tomasi G, Van Den Berg F, Andersson C (2004). Correlation optimized warping and dynamic time warping as preprocessing methods for chromatographic data. Journal of Chemometrics 18:231–241.
Tomasi G, Savorani F, Engelsen SB (2011). icoshift: an effective tool for the alignment of chromatographic data. Journal of Chromatography A 1218:7832–7840.
van den Berg F, Tomasi G, Viereck N (2005). Warping: investigation of NMR pre-processing and correction. In: Engelsen SB, Belton PS, Jakobsen HJ, editors. Magnetic Resonance in Food Science. Cambridge, UK: The Royal Society of Chemistry. p 131–138.
van der Kloet FM, Bobeldijk I, Verheij ER, Jellema RH (2009). Analytical error reduction using single point calibration for accurate and precise metabolomic phenotyping. Journal of Proteome Research 8:5132–5141.
van Velzen EJJ, Westerhuis JA, van Duynhoven JPM, van Dorsten FA, Hoefsloot HCJ, Jacobs DM, Smit S, Draijer R, Kroner CI, Smilde AK (2008). Multilevel data analysis of a crossover designed human nutritional intervention study. Journal of Proteome Research 7:4483–4491.
Want E (2009). Challenges in applying chemometrics to LC-MS-based global metabolite profile data. Bioanalysis 1:805–819.
Westerhuis JA, Hoefsloot HCJ, Smit S, Vis DJ, Smilde AK, van Velzen EJJ, van Duijnhoven JPM, van Dorsten FA (2008). Assessment of PLSDA cross validation. Metabolomics 4:81–89.
Winning H, Roldán-Marín E, Dragsted LO, Viereck N, Poulsen M, Sánchez-Moreno C, Cano MP, Engelsen SB (2009). An exploratory NMR nutri-metabonomic investigation reveals dimethylsulfone as a dietary biomarker for onion intake. Analyst 134:2344–2351.
Wold S (1976). Pattern recognition by means of disjoint principal components models. Pattern Recognition 8:127–139.
Wold S (1978). Cross-validatory estimation of number of components in factor and principal components models. Technometrics 20:397–405.
Wold S, Martens H, Wold H (1983). The multivariate calibration problem in chemistry solved by the PLS methods. In: Ruhe A, Kågstrøm B, editors. Lecture Notes in Mathematics, Proceedings of the Conference on Matrix Pencils. Heidelberg, Germany: Springer Verlag. p 286–293.
Zelena E, Dunn WB, Broadhurst D, Francis-McIntyre S, Carroll KM, Begley P, Hagan S, Knowles JD, Halsall A, Wilson ID, Kell DB (2009). Development of a robust and repeatable UPLC-MS method for the long-term metabolomic study of human serum. Analytical Chemistry 81:1357–1364.

20 SYSTEMS BIOLOGY IN FOOD AND NUTRITION RESEARCH

Matej Orešič

20.1 SYSTEMS BIOLOGY—NEW OPPORTUNITY FOR FOOD AND NUTRITION RESEARCH

20.1.1 Emergence of Systems Biology: An Overview

It is generally appreciated that biological systems are complex, involving genetic and molecular interactions across multiple levels of biological organization. Given this, it is not surprising that systems-level thinking in biology is not new (von Bertalanffy, 1969). Much work was already devoted in the 1960s and 1970s to concepts such as “metabolic control analysis” (Kacser and Burns, 1973), which viewed and modeled metabolism as a complex system of enzymatic reactions. In fact, Henrik Kacser stated, “But one thing is certain: To understand the whole you must look at the whole” (Kacser, 1986). However, the early modeling approaches were also severely limited: the parameters of the models needed reliable experimental data, preferably at the system-wide level. Instead, measurements such as enzyme kinetics or metabolite concentrations and fluxes could only be obtained in an isolated manner at the time. Consequently, the early approaches aimed at systems-level understanding of biological systems were limited to qualitative models. These models may have advanced the conceptual understanding of specific aspects of biology but had poor ability to quantitatively predict, for example, responses to interventions.

Foodomics: Advanced Mass Spectrometry in Modern Food Science and Nutrition, First Edition. Edited by Alejandro Cifuentes. © 2013 John Wiley & Sons, Inc. Published 2013 by John Wiley & Sons, Inc.

The emergence of molecular biology as a dominant tool of life science research from about the 1970s onward has contributed new techniques for the study of genes and proteins, which inevitably also changed the experimental paradigms in terms of how biological systems are viewed and studied, as well as how biologists are educated. The resulting reductionist approach to life science has led to the perception that many if not most complex phenotypes can be explained by the function of specific genes and their products, and that the “wiring” of the biological system can be learned in experimental settings where these genes are manipulated in vitro or in vivo, such as in gene silencing experiments. Even today, this view dominates life science and biomedical research, although its limitations have been recognized (Lazebnik, 2002). Notably, like the early theoretically driven system-level models, the models that can be derived from such a reductionist approach are qualitative. In addition, they lack a holistic view of how the system functions as a whole. Nevertheless, much invaluable knowledge has been acquired over the past decades about gene and protein function, and the sequencing of the human genome marks the peak of this period (Lander et al., 2001; Venter et al., 2001), as well as the beginning of a new era. The “omics” revolution of the 1990s has provided many new tools, which were only wished for by the early systems biologists. While many technological challenges and opportunities still lie ahead, particularly when it comes to conducting measurements in vivo and in real time, tools are now available that enable comprehensive, quantitative, and sensitive measurements of molecular components of biological systems such as DNA, RNA, proteins, and metabolites, as well as their interactions. 
Using the “omics” approach, one can therefore generate a molecular snapshot of the biological system and study the changes of these molecular profiles over time in the context of genetic or environmental changes. The “omics” data present a new challenge for biologists due to their high dimensionality. Caution and proper statistical treatment are required in order to reliably interpret the data and avoid bias (Ransohoff, 2004; Ransohoff, 2005). A systems biology approach is also required, that is, shifting the focus from single components to how they contribute together in a network to produce a specific phenotype or biological function (Ideker et al., 2001). The latter cannot be achieved using the above-mentioned and still dominant reductionist experimental paradigm. Joyner and Pedersen, in fact, recently noted that “... fundamentally narrow and reductionist perspective about the contribution of genes and genetic variants to disease is a key reason “omics” has failed to deliver the anticipated breakthroughs” and emphasized the “critical utility of key concepts from physiology like homeostasis, regulated systems, and redundancy as major intellectual tools to understand how whole animals adapt to the real world” (Joyner and Pedersen, 2011).

20.1.2 Need for Systems Biology in Food and Nutrition Research

Understanding living organisms in the context of coordinated gene and molecular function, and translating this knowledge to improve human health, is a great challenge and certainly one of the central aims of medical systems biology. The interface between biological systems and nutritional as well as other environmental factors represents the next level of complexity, involving time-dependent interactions between nutrients, host metabolism, and gut microbiota that still have to be understood (van Ommen et al., 2008). Systems biology in the context of food and nutrition research thus requires bridging across multiple levels and concepts, for example, cellular–organismal, host–microbial, and short-term versus long-term effects. While the lessons of the past years have taught us that the gene-centric and reductionist approach is not the way to pursue nutritional systems biology, or systems biology in general (Joyner and Pedersen, 2011), the question remains how to tackle it instead. It is clear that no single model can encompass the complexity of human metabolism in the context of nutrition. One attractive view is to consider a platform-based approach in which multiple physiological levels relevant to diet are studied in the context of nutrition. Such a platform would need to include at least the following four levels:

1. “Omics” platforms for the detailed characterization of food components.
2. Platform to characterize and model the production of food-derived molecules (e.g., metabolites) in the gut, which enter the enterohepatic circulation and thus affect the host physiology.
3. Platforms to characterize and model the effects of food-derived molecules on host physiology at the cellular, tissue, as well as organismal levels.
4. Sensitive platform to characterize the host physiology and health status.

The last level in the list is what one would commonly consider as “medical systems biology”, that is, aiming to establish the molecular networks behind health and disease, while the first three levels deal with how specific foods and diets in general may perturb these networks. It is evident from this list alone what makes nutritional systems biology particularly challenging. Biomedical research has mostly dealt with “extreme phenotypes”, for example, healthy state versus disease state, while much less effort has been spent on understanding the molecular networks behind the maintenance of health and responses to regular environmental challenges. In other words, we understand the networks behind chronic diseases much better than the networks that keep us healthy. Related to this, much more effort has been spent on identifying specific targets for pharmacological interventions to treat disease than on prevention. Even when sophisticated systems approaches are used to identify the disease-specific networks of pharmacological relevance (Schadt, 2009), this is still a much simpler problem than understanding how a specific diet, comprising thousands of molecular components entering the body, affects the host’s physiology and health. The subsequent sections of this chapter will cover some ideas and examples of how one can approach nutritional systems biology, and how the different components listed above may be integrated into a single platform for nutrition research.

20.2 SYSTEMS APPROACH TO IDENTIFY MOLECULAR NETWORKS BEHIND HEALTH AND DISEASE

20.2.1 Understanding Health and Disease—A Case for “Omics” and Systems Approach

One of the important topics of today’s nutrition research is how specific diets and foods may help us stay healthy. But what is health? Is it merely an absence of disease? The World Health Organization (WHO) defined health in 1946 as “a state of complete physical, mental, and social well-being and not merely the absence of disease or infirmity”. While disease may be characterized by specific dysfunctional molecular networks, health cannot be reduced to any specific networks because “normality” may vary from individual to individual, also depending on environmental factors. Progression from healthy state to a specific disease is generally a slow and complex process which advances through several stages, most of them happening before any clinical symptoms occur. Using a systems biology view, dynamic molecular networks in a healthy person are perturbed when the person starts progressing toward a disease (Fig. 20.1), either by changing the concentrations/activities of specific molecular components or by changing the connectivity of a specific network. However, in this early stage the network is still compensated and may perform the specific physiological functions properly. In contrast, in disease state the network is dysregulated beyond the organismal ability to compensate, leading to specific physiological dysfunction. From the nutritional point of view, the key is to understand how a specific diet or food (1) modulates the molecular networks in healthy state and (2) modulates the already perturbed but still compensated networks to return to healthy state (second stage in Fig. 20.1), that is, how to prevent disease in at-risk individuals. One would therefore need a highly sensitive analytical approach to identify any changes of potential pathophysiological relevance in the absence of any clinical symptoms. Given the comprehensive molecular panels covered and the sensitivity, this is exactly where “omics” platforms may play a key role.

FIGURE 20.1 Progression from healthy state to disease—a network view. The progression is illustrated using an oversimplified “disease severity index”, as dependent on time. When advancing from healthy state toward a disease, the underlying molecular networks are progressively perturbed, at first still compensating for the specific physiological function, but ultimately breaking down in overt disease. [Figure: disease severity index (arbitrary) plotted against time (age), with three stages: healthy state (“normal” molecular network, within the normal range), early signs of disease (perturbed molecular network), and overt disease (dysfunctional molecular network).]

20.2.2 Metabolomics—A Phenotyping Platform for Systems Biology

It has been known for several decades that small changes in the activities of individual enzymes may lead to only small changes in metabolic fluxes but can lead to large changes in metabolite concentrations (Kacser and Burns, 1973; Kell, 2006). Metabolite levels as measured by metabolomics platforms may therefore be seen as amplified responses to subtle and physiologically relevant perturbations of the system. It is thus not surprising that metabolomics research over the past decade has found that circulating metabolite levels in humans are sensitive to many factors that play important roles in the maintenance of health, including genotype (Illig et al., 2010), immune system status (Oresic et al., 2008), diet (Holmes et al., 2008), gut microbiota (Velagapudi et al., 2010; Nicholson et al., 2012), development (Nikkila et al., 2008), and age (Yu et al., 2012). Furthermore, the metabolome is highly dynamic (Krug et al., 2012), that is, metabolites may respond sensitively and distinctly to specific challenges such as physical exercise (Lewis et al., 2010), oral glucose tolerance test (Shaham et al., 2008), or fasting (Krug et al., 2012). Some of the distinct features of the metabolome of potential pathophysiological relevance may therefore reveal themselves only when the organism is put under a specific challenge (Krug et al., 2012).

Metabolomics has already established itself as a sensitive platform for the characterization of complex phenotypes (Oresic et al., 2006), and many metabolomics studies in the nutritional field have been performed (Oresic, 2009). In contrast, while proteomics has over the past decade played an important role in functional and systems biology studies as a tool for molecular and cellular biology, its applications in the clinical domain have been lagging behind metabolomics and genomics. While this may on the one hand be due to the specific suitability of metabolomics as a phenotyping platform, as argued above, the main challenge with implementing proteomics has been that the available proteomic techniques have not been adequate to deal with the wide dynamic range of the circulating proteome (Anderson et al., 2004). However, this is about to change with the advent of targeted quantitative proteomics (Picotti and Aebersold, 2012), so one can expect that in the future applications of metabolomics and genomics platforms in clinical studies (Gieger et al., 2008) will be increasingly complemented with proteomics as well.
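The amplification argument at the start of this section can be made concrete with a toy calculation. The sketch below is not the model of Kacser and Burns; it assumes a hypothetical two-step pathway in which a metabolite S is supplied by a reversible step and consumed by a nearly saturated Michaelis-Menten enzyme. All parameter values and the function name are illustrative assumptions.

```python
import math


def steady_state(vmax, supply=1.0, k_back=0.01, km=1.0):
    """Steady state of the hypothetical pathway  X0 -> S -> product.

    Step 1: v1 = supply - k_back * S    (reversible, mass action)
    Step 2: v2 = vmax * S / (km + S)    (Michaelis-Menten)
    Setting v1 = v2 gives a quadratic in S; returns (S, flux).
    """
    a = k_back
    b = k_back * km + vmax - supply
    c = -supply * km
    # Only the positive root is physically meaningful.
    s = (-b + math.sqrt(b * b - 4 * a * c)) / (2 * a)
    return s, supply - k_back * s


s_ref, j_ref = steady_state(vmax=1.2)
s_low, j_low = steady_state(vmax=1.08)  # enzyme activity reduced by 10%

flux_change = (j_low - j_ref) / j_ref
conc_change = (s_low - s_ref) / s_ref
print(f"flux change: {flux_change:+.1%}, metabolite change: {conc_change:+.1%}")
```

Under these assumed parameters, a 10% reduction in enzyme activity changes the pathway flux by only a few percent, while the steady-state concentration of S rises by more than half. The size of the effect depends entirely on the chosen kinetics, so the numbers should be read as an illustration of the principle rather than as a physiological estimate.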

20.3 FOOD METABOLOME AND ITS EFFECT ON HOST PHYSIOLOGY

To study the effects of specific food on host physiology, one first needs to understand how the food is metabolized and which components derived from food then enter the enterohepatic circulation and potentially play a biological role. One approach to link foods with their metabolized products is the establishment of nutritional intake markers (Manach et al., 2009; Primrose et al., 2011). Using this strategy, dietary intake of specific foods (in humans or animal models) is accompanied by “omics” (primarily metabolomics) analysis of biofluids such as plasma or urine, and statistical associations are established between the diet and metabolic profiles. The advantage of such an approach is that it can directly associate the food with specific markers in the clinical context. On the negative side, little mechanistic insight can be gained and the statistical analysis may suffer from low power due to the high number of variables included in the models relative to the number of samples analyzed. The “mechanistic” aspect is important in nutrition research because, in order to understand how a specific diet alters host physiology, one needs to study how the diet modulates specific physiological processes, not only how it affects the circulating markers.

A more mechanistic, although indirect, approach, complementary to the use of intake markers, is to use a panel of in vitro and in vivo systems aimed at studying specific levels of physiology relevant to the diet. One model system which is particularly suitable for studying colon metabolism of specific dietary components in vitro is an anaerobic in vitro colon model, for example, the one developed at VTT (Heinonen et al., 2004; Aura et al., 2005; Aura et al., 2006). This model simulates microbial conversion by using pooled human feces from at least four healthy donors to provide reproducible microbiota. The model is run under strictly anaerobic conditions at human body temperature with rapid sampling.
The time course and the extent of microbial metabolite formation can be measured from complex food mixtures and isolated compounds with tailor-made approaches depending on the conversion rate of the substrates and resistance of the substrate matrix. The application of the in vitro colon model can be combined with metabolomics to identify and study metabolites derived from food and influenced by the microbial metabolism in the colon. For example, in a recent study, the in vitro colon model and metabolomics were applied to characterize microbial metabolism of Syrah grape products (Aura et al., 2012). As a limitation of this approach, not all dietary components are metabolized in the colon. Lipids are mainly absorbed in the small intestine, where they enter the lipoprotein metabolism pathway. Different experimental strategies are therefore needed in order to model how lipids from diet enter the systemic lipid metabolism (Adiels et al., 2005; Boren et al., 2012).

Once the metabolic products from food are identified, one needs to understand how they affect the host physiology. Selected in vitro systems are one attractive option as they allow application of in-depth “omics” studies complemented, for example, by metabolic modeling (Lizarraga et al., 2011; Matito et al., 2011). However, although in vitro studies may provide valuable functional and mechanistic insights into specific subsystems, in vivo studies (e.g., animal models) can provide further insights into their role in physiological settings. In vivo systems also offer the opportunity to combine genetic and environmental manipulations to gain mechanistic insights about specific (patho)physiological phenomena. We have, for example, shown in an obese mouse model that a specific protein-rich diet markedly improves the lipidomic profile in the liver and leads to lowering of liver fat (Pilvi et al., 2008). Many animal models have been utilized in nutrition research (Baker, 2008), but the challenge still remains how to better integrate the in vivo studies with the applications of systems biology approaches.
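The time course and extent of microbial metabolite formation measured in the in vitro colon model, as described above, can be summarized with a few simple quantities. The sketch below is one possible way to do this, not the procedure used in the cited studies; the function name, the sampling times, and the concentrations are made-up illustrative values.

```python
def formation_summary(times_h, conc):
    """Summarize a metabolite time course.

    Returns (extent, max_rate, auc): final minus initial concentration,
    the steepest increase between consecutive samples, and the
    trapezoidal area under the concentration curve.
    """
    if len(times_h) != len(conc) or len(times_h) < 2:
        raise ValueError("need at least two matched time points")
    pairs = list(zip(times_h, conc))
    rates, auc = [], 0.0
    for (t0, c0), (t1, c1) in zip(pairs, pairs[1:]):
        dt = t1 - t0
        rates.append((c1 - c0) / dt)      # formation rate in this interval
        auc += 0.5 * (c0 + c1) * dt       # trapezoid for this interval
    return conc[-1] - conc[0], max(rates), auc


# Made-up illustrative values (hours, arbitrary concentration units).
times_h = [0, 2, 4, 8, 24]
conc = [0.0, 1.0, 3.0, 5.0, 6.0]
extent, max_rate, auc = formation_summary(times_h, conc)
print(extent, max_rate, auc)
```

Comparing such summaries across substrates would give a rough indication of which food components are converted quickly and which resist microbial conversion, although a real analysis would also need replicates and appropriate controls.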

20.4 BUILDING A SYSTEMS BIOLOGY PLATFORM FOR FOOD AND NUTRITION RESEARCH

Sections 20.2 and 20.3 introduced some of the key components which could comprise a comprehensive nutritional systems biology platform, covering all levels, from food composition to health effects. The challenge then becomes how to connect all these components into a single platform for nutrition and food research. A European project of the 7th Framework Programme, “Characterization and modeling of dietary effects mediated by gut microbiota on lipid metabolism” (ETHERPATHS, www.etherpaths.org), is one example of platform building for nutritional systems biology, focusing primarily on the dietary effects on host lipid metabolism (Fig. 20.2). In ETHERPATHS, the clinical endpoint of interest (i.e., “disease” in Fig. 20.1) is metabolic comorbidities of obesity such as metabolic syndrome and type 2 diabetes. Given the central role of lipids in this context (Virtue and Vidal-Puig, 2010; Pietilainen et al., 2011), lipid metabolism is the key physiological system being investigated in the context of nutrition. Diets rich in components which can modulate systemic lipid metabolism, such as omega-3 fatty acids (Lankinen et al., 2009) or polyphenols (Bravo, 1998), have been selected for investigation. The overall scientific strategy of ETHERPATHS is, as described in earlier sections, to apply specific model systems to link the food intake and its metabolic products with host physiology and ultimately health outcomes. Multiple “omics” platforms and modeling approaches are being applied at multiple levels, from in vitro systems (colon model, adipocyte, and hepatocyte cell lines), to animal models related to lipid metabolism, and clinical studies (kinetic studies as well as a larger nutritional intervention study). These studies are designed so that the output from one level is an input to another.
For example, metabolites identified as products of food metabolism in the in vitro colon model are being used (in isolation or in fractions) as interventions for in vitro studies in adipocyte and hepatocyte cell lines, and the metabolic outcomes from these studies are being compared with those of tissue-specific studies in vivo. In parallel, ETHERPATHS has two research lines dedicated to the development of “omics” platforms, mainly focusing on metabolomics and lipidomics, and to integrative bioinformatics and modeling, aiming to facilitate the information flow between different levels.

Needless to say, there are many ways to build a systems biology platform for nutrition research, and the ultimate choices depend on the key research questions and problems being tackled. It is unlikely that one size fits all or that a generic platform and model can be built. ETHERPATHS is an example of a platform constructed around a specific clinical outcome and careful selection of physiological levels and model systems, aimed both at addressing the research questions and at facilitating the building of a platform which can, in part or in full, also be applied to other problems in nutrition research. Nevertheless, “omics” technologies combined with modeling at the selected levels are likely to play a key role in any systems biology approach tackling human nutrition.

FIGURE 20.2 The components of ETHERPATHS platform for nutritional systems biology. [Figure: six linked components, each with a key focus. Model integration: integrative bioinformatics, standardization, software engineering, model dependencies. In vitro systems and cellular modeling: influence of colonic products on host cell metabolism (hepatocytes and adipocytes) using stable isotope tracer data and kinetic and flux models. In vivo systems and pathway reconstruction: characterization of plasmalogen deficient mouse models and potential role of gut microbiota; reconstruction of affected tissue-specific pathways. Physiological models: dynamic models of systemic lipid metabolism; human intervention trials including in vivo tracer studies. Food products and nutritional intervention trials: nutritional intervention aiming to alter lipid homeostasis, including ω-3 fatty acid supplementation and polyphenols. OMICS technology development: new instrumentation and platforms for mass spectrometry-based analysis of lipids and hydrophilic metabolites.]
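The design principle described in this section, that the output from one level is an input to another, can be sketched as a simple chain of stages. The example below is purely illustrative: the stage functions, their names, and the placeholder strings are hypothetical and do not represent ETHERPATHS software or data.

```python
from typing import Callable, Dict, List

Stage = Callable[[List[str]], List[str]]


def colon_model(food_components: List[str]) -> List[str]:
    # Placeholder: pretend each component yields one microbial metabolite.
    return [f"metabolite_of_{c}" for c in food_components]


def cell_line_study(metabolites: List[str]) -> List[str]:
    # Placeholder: pretend each metabolite yields one cellular readout.
    return [f"cell_response_to_{m}" for m in metabolites]


def in_vivo_comparison(cell_readouts: List[str]) -> List[str]:
    # Placeholder: pair each in vitro readout with an in vivo check.
    return [f"in_vivo_check_of_{r}" for r in cell_readouts]


def run_platform(stages: List[Stage], inputs: List[str]) -> Dict[str, List[str]]:
    """Feed the output of each stage into the next and keep every result."""
    results: Dict[str, List[str]] = {}
    data = inputs
    for stage in stages:
        data = stage(data)
        results[stage.__name__] = data
    return results


results = run_platform(
    [colon_model, cell_line_study, in_vivo_comparison],
    ["polyphenol_fraction"],
)
for level, output in results.items():
    print(level, "->", output)
```

Keeping the intermediate result of every level, rather than only the final one, mirrors the role of the integrative bioinformatics line: each level remains available for comparison with the others.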

20.5 FUTURE PERSPECTIVES

Systems biology brings together the concepts and tools from physiology and systems theory, “omics” tools, as well as vast amounts of knowledge gathered over the decades of molecular biology and genomic research. While even today reductionist approaches dominate the domains of life sciences and medicine, the area of nutrition is the one where one has no other choice but to embrace a systems approach, if one is to unleash the power of “omics” technologies in full. This is because foods are complex, and their metabolism produces large numbers of components which together modulate the host physiology. Applying a traditional reductionist experimental paradigm where specific components are studied in isolation is not practical and will not be able to address the synergistic effects of different dietary components. On the other hand, “omics” studies alone may not provide sufficient insights either. For example, metabolomic profiling of biofluids may lead to specific profiles associated with specific clinical outcomes or food intake, but that alone may give little clue about the underlying physiology. The near future will likely bring more emphasis on using innovative experimental settings (e.g., challenge experiments, prospective studies) and experimental models in the context of nutrition studies, so that the power of “omics” platforms can be utilized to its full potential, and the quantitative readouts from these platforms can lead to better mathematical models of specific physiological systems of interest. Given the human complexity and the elusive concept of health, there is still much research to be done before the dream of a systems approach, as envisioned by the pioneers decades ago, can come true also in the domain of human nutrition.

REFERENCES

Adiels M, Packard C, Caslake MJ, Stewart P, Soro A, Westerbacka J, Wennberg B, Olofsson SO, Taskinen MR, Borén J (2005). A new combined multicompartmental model for apolipoprotein B-100 and triglyceride metabolism in VLDL subfractions. Journal of Lipid Research 46(1):58–67.
Anderson NL, Polanski M, Pieper R, Gatlin T, Tirumalai RS, Conrads TP, Veenstra TD, Adkins JN, Pounds JG, Fagan R, Lobley A (2004). The human plasma proteome: a non-redundant list developed by combination of four separate sources. Molecular and Cellular Proteomics 3(4):311–326.
Aura AM, Martin-Lopez P, O’Leary KA, Williamson G, Oksman-Caldentey KM, Poutanen K, Santos-Buelga C (2005). In vitro metabolism of anthocyanins by human gut microflora. European Journal of Nutrition 44(3):133–142.
Aura AM, Oikarinen S, Mutanen M, Heinonen SM, Adlercreutz HC, Virtanen H, Poutanen KS (2006). Suitability of a batch in vitro fermentation model using human faecal microbiota for prediction of conversion of flaxseed lignans to enterolactone with reference to an in vivo rat model. European Journal of Nutrition 45(1):45–51.
Aura AM, Mattila I, Hyötyläinen T, Gopalacharyulu P, Cheynier V, Souquet JM, Bes M, Le Bourvellec C, Guyot S, Orešič M (2012). Characterization of microbial metabolism of Syrah grape products in an in vitro colon model using targeted and non-targeted analytical approaches. European Journal of Nutrition. doi: 10.1007/s00394-012-0391-8. [Epub ahead of print].
Baker DH (2008). Animal models in nutrition research. Journal of Nutrition 138(2):391–396.
Boren J, Taskinen MR, Adiels M (2012). Kinetic studies to investigate lipoprotein metabolism. Journal of Internal Medicine 271(2):166–173.
Bravo L (1998). Polyphenols: chemistry, dietary sources, metabolism, and nutritional significance. Nutrition Reviews 56(11):317–333.

Gieger C, Geistlinger L, Altmaier E, Hrabé de Angelis M, Kronenberg F, Meitinger T, Mewes HW, Wichmann HE, Weinberger KM, et al. (2008). Genetics meets metabolomics: a genome-wide association study of metabolite profiles in human serum. PLoS Genetics 4(11):e1000282.
Heinonen SM, Wahala K, Liukkonen KH, Aura AM, Poutanen K, Adlercreutz H (2004). Studies of the in vitro intestinal metabolism of isoflavones aid in the identification of their urinary metabolites. Journal of Agricultural and Food Chemistry 52(9):2640–2646.
Holmes E, Loo RL, Stamler J, Bictash M, Yap IK, Chan Q, Ebbels T, De Iorio M, Brown IJ, Veselkov KA, et al. (2008). Human metabolic phenotype diversity and its association with diet and blood pressure. Nature 453(7193):396–400.
Ideker T, Galitski T, Hood L (2001). A new approach to decoding life: systems biology. Annual Review of Genomics and Human Genetics 2:343–372.
Illig T, Gieger C, Zhai G, Römisch-Margl W, Wang-Sattler R, Prehn C, Altmaier E, Kastenmüller G, Kato BS, et al. (2010). A genome-wide perspective of genetic variation in human metabolism. Nature Genetics 42(2):137–141.
Joyner MJ, Pedersen BK (2011). Ten questions about systems biology. Journal of Physiology 589(Pt 5):1017–1030.
Kacser H (1986). On parts and wholes in metabolism. In: Welch GR, Clegg JS, editors. The Organization of Cell Metabolism. New York: Plenum Press. p 327–337.
Kacser H, Burns JA (1973). The control of flux. Symposia of the Society for Experimental Biology 27:65–104.
Kell DB (2006). Theodor Bücher Lecture. Metabolomics, modelling and machine learning in systems biology—towards an understanding of the languages of cells. Delivered on 3 July 2005 at the 30th FEBS Congress and 9th IUBMB conference in Budapest. FEBS Journal 273(5):873–894.
Krug S, Kastenmuller G, Stückler F, Rist MJ, Skurk T, Sailer M, Raffler J, Römisch-Margl W, Adamski J, Prehn C, et al. (2012). The dynamic range of the human metabolome revealed by challenges. FASEB Journal 26(6):2607–2619.
Lander ES, Linton LM, Birren B, Nusbaum C, Zody MC, Baldwin J, Devon K, Dewar K, Doyle M, FitzHugh W, et al. (2001). Initial sequencing and analysis of the human genome. Nature 409(6822):860–921.
Lankinen M, Schwab U, Erkkilä A, Seppänen-Laakso T, Hannila ML, Mussalo H, Lehto S, Uusitupa M, Gylling H, Oresic M (2009). Fatty fish intake decreases lipids related to inflammation and insulin signaling—a lipidomics approach. PLoS ONE 4(4):e5258.
Lazebnik Y (2002). Can a biologist fix a radio?—or, what I learned while studying apoptosis. Cancer Cell 2(3):179–182.
Lewis GD, Farrell L, Wood MJ, Martinovic M, Arany Z, Rowe GC, Souza A, Cheng S, McCabe EL, Yang E, et al. (2010). Metabolic signatures of exercise in human plasma. Science Translational Medicine 2(33):33ra37.
Lizarraga D, Vinardell MP, Noé V, van Delft JH, Alcarraz-Vizán G, van Breda SG, Staal Y, Günther UL, Carrigan JB, Reed MA, et al. (2011). A lyophilized red grape pomace containing proanthocyanidin-rich dietary fiber induces genetic and metabolic alterations in colon mucosa of female C57BL/6J mice. Journal of Nutrition 141(9):1597–1604.
Manach C, Hubert J, Llorach R, Scalbert A (2009). The complex links between dietary phytochemicals and human health deciphered by metabolomics. Molecular Nutrition and Food Research 53(10):1303–1315.

Matito C, Agell N, Sanchez-Tena S, Torres JL, Cascante M (2011). Protective effect of structurally diverse grape procyanidin fractions against UV-induced cell damage and death. Journal of Agricultural and Food Chemistry 59(9):4489–4495.
Nicholson JK, Holmes E, Kinross J, Burcelin R, Gibson G, Jia W, Pettersson S (2012). Host-gut microbiota metabolic interactions. Science 336(6086):1262–1267.
Nikkila J, Sysi-Aho M, Ermolov A, Seppänen-Laakso T, Simell O, Kaski S, Oresic M (2008). Gender-dependent progression of systemic metabolic states in early childhood. Molecular Systems Biology 4:197.
Oresic M (2009). Metabolomics, a novel tool for studies of nutrition, metabolism and lipid dysfunction. Nutrition, Metabolism and Cardiovascular Diseases 19(11):816–824.
Oresic M, Simell S, Sysi-Aho M, Näntö-Salonen K, Seppänen-Laakso T, Parikka V, Katajamaa M, Hekkala A, Mattila I, Keskinen P, et al. (2008). Dysregulation of lipid and amino acid metabolism precedes islet autoimmunity in children who later progress to type 1 diabetes. Journal of Experimental Medicine 205(13):2975–2984.
Oresic M, Vidal-Puig A, Hänninen V (2006). Metabolomic approaches to phenotype characterization and applications to complex diseases. Expert Review of Molecular Diagnostics 6(4):575–585.
Picotti P, Aebersold R (2012). Selected reaction monitoring-based proteomics: workflows, potential, pitfalls and future directions. Nature Methods 9(6):555–566.
Pietilainen KH, Rog T, Seppänen-Laakso T, Virtue S, Gopalacharyulu P, Tang J, Rodriguez-Cuenca S, Maciejewski A, Naukkarinen J, Ruskeepää AL, et al. (2011). Association of lipidome remodeling in the adipocyte membrane with acquired obesity in humans. PLoS Biology 9(6):e1000623.
Pilvi TK, Seppanen-Laakso T, Simolin H, Finckenberg P, Huotari A, Herzig KH, Korpela R, Oresic M, Mervaala EM (2008). Metabolomic changes in fatty liver can be modified by dietary protein and calcium during energy restriction. World Journal of Gastroenterology 14(28):4462–4472.
Primrose S, Draper J, Elsom R, Kirkpatrick V, Mathers JC, Seal C, Beckmann M, Haldar S, Beattie JH, Lodge JK, et al. (2011). Metabolomics and human nutrition. British Journal of Nutrition 105(8):1277–1283.
Ransohoff DF (2004). Rules of evidence for cancer molecular-marker discovery and validation. Nature Reviews Cancer 4(4):309–314.
Ransohoff DF (2005). Bias as a threat to the validity of cancer molecular-marker research. Nature Reviews Cancer 5(2):142–149.
Schadt EE (2009). Molecular networks as sensors and drivers of common human diseases. Nature 461(7261):218–223.
Shaham O, Wei R, Wang TJ, Ricciardi C, Lewis GD, Vasan RS, Carr SA, Thadhani R, Gerszten RE, Mootha VK (2008). Metabolic profiling of the human response to a glucose challenge reveals distinct axes of insulin sensitivity. Molecular Systems Biology 4:e214.
van Ommen B, Fairweather-Tait S, Freidig A, Kardinaal A, Scalbert A, Wopereis S (2008). A network biology model of micronutrient related health. British Journal of Nutrition 99(Suppl 3):S72–S80.
Velagapudi VR, Hezaveh R, Reigstad CS, Gopalacharyulu P, Yetukuri L, Islam S, Felin J, Perkins R, Borén J, Oresic M, et al. (2010). The gut microbiota modulates host energy and lipid metabolism in mice. Journal of Lipid Research 51(5):1101–1112.

Venter JC, Adams MD, Myers EW, Li PW, Mural RJ, Sutton GG, Smith HO, Yandell M, Evans CA, Holt RA, et al. (2001). The sequence of the human genome. Science 291(5507):1304–1351.
Virtue S, Vidal-Puig A (2010). Adipose tissue expandability, lipotoxicity and the metabolic syndrome—An allostatic perspective. Biochimica Et Biophysica Acta 1801(3):338–349.
von Bertalanffy L (1969). General Systems Theory. New York: George Braziller.
Yu Z, Zhai G, Singmann P, He Y, Xu T, Prehn C, Römisch-Margl W, Lattka E, Gieger C, Soranzo N, et al. (2012). Human serum metabolic profiles are age dependent. Aging Cell. doi: 10.1111/j.1474-9726.2012.00865.x.

INDEX


Foodomics: Advanced Mass Spectrometry in Modern Food Science and Nutrition, First Edition. Edited by Alejandro Cifuentes. © 2013 John Wiley & Sons, Inc. Published 2013 by John Wiley & Sons, Inc.


Biomarker, 174, 246–249, 252–5, 259–260, Choline, 4 310–3, 331–2, 491–3 lysophosphatidylcholine, 259 early, 311, 319 Chromatin, 4 of nutrient exposure, 331 Chromatography, 76, 78 Body mass index (BMI), 287 2D, 76, 88 Bottom-up, see Proteomics nanoHPLC chip, 78 Bovine milk, see Milk nanoRP-HPLC, 88 Breast milk, see Milk Citrate, 408 Breast feeding, 172 Citric acid, 408 Butyrate, 223 Citric acid cycle, 418, 422 Classification methods, 526 C reactive protein, 263 PLS-DA, 509, 526–9 Caco-2 cell monolayers, 81 other classifications methods, 529 Caffeine, 418, 422 Cocoa powder, see Diet intervention Cahill cycle, 407, 411 Colon model, 544 Caloric restriction, 257 Combinatorial peptide ligand libraries Campesterol, 259 (Proteominer), 90 Cancer, 9–10, 233, 246, 259–61, 429, 430, Comparative safety assessment, 192, 198 442, 491, 493 Compounds bioactive, 9, 246, 253, 259–64 breast, 444 Consumer, 167, 329, 330, 337, 471, 479 colon, 9–10, 261, 263 Contaminants, 455 mouth, 442 Coronary artery disease (CAD), 434 prostate, 443 Coumaric acid, 259 risk factors, 442 Crohn’s disease, 260 Capillary electrophoresis, see Cross-reactive carbohydrate determinants Electrophoresis (CCDs), 84 Carbohydrates, 234 Cysteine, 408 metabolism, 246, 257 Cystoseiraspp, 260 Carbonylation, 102–117 Cardiovascular disease (CVD), 233, 263, 2-DGE, see Two-dimensional gel 271, 286–7, 289, 429, 434 electrophoresis prognosis, 440 DART, see Direct Analysis in Real Time risk profile, 435 Data analysis, 512–3 Carob, see Diet intervention Data cleaning, 509, 523 Casein, 172 Data integration, 9, 11, 246, 249 micelle, 173 Data quality (see Quality of data) Celiac disease, 86–9 Data structures, 517–8 Cell, 493 Databases metabolomics, see Metabolomics culture, 248, 253, 260 Deficiency disease, 429 cycle, 263 Delivery systems, 4 disruption, 248 Dental disease, 429 signaling, 168 Depletion, 170 Cerebrovascular disease (stroke), 430, 434, DESI, see Desorption electrospray 436 ionization risk factors, 437 Desorption 
electrospray ionization, 464 Chemometrics, 454, 465 Diabetes, 172, 232, 246, 259, 260, 271, 429, methods for data cubes, 531 430, 432 methods for data tables, 519 chronic complications, 433 Chocolate, see Diet intervention insulin resistance, 286–7 INDEX 553

  prevalence, 433
  risk factors, 433
  type 2 diabetes mellitus (T2DM), 279
Diagnostic marker, 168
Diarrhea, 172
Diet, 8, 167
  intervention, 248, 252–64
  mediterranean, 307
  western, 306
Dietary, 329, 331–2, 335, 337
  pattern, 329
  treatment, 331, 332
Differential in gel electrophoresis (DIGE), 17, 22–4, 202
Differential stable isotope labeling, 23–26
DIGE, see Differential in gel electrophoresis
Direct Analysis in Real Time, 464, 483–4
DNA, 4, 493
  methylation, 4, 171
  sequencing, 4
Dynamic range, 169, 246
Dyslipidemia, 286

Ecosystem, 167, 496
Epigenetics, 171
EFSA, see European Food Safety Authority
Electron ionization, 465
Electrophoresis, 173, 487, 491–3
  CE-MS, 5, 195, 198, 204, 251, 260, 492, 494–5
  differential gel (DIGE), 81
  two-dimensional, 76, 81
Electrospray ionization (ESI), 17–9, 33, 41, 44, 46–7
Endurance/resistance physical activity (EPA), 407, 411
Energy metabolism, 256, 260
Environmental impact, 473, 475, 479, 481, 486, 497
Enzyme-linked immunosorbent assays (ELISA), 75, 83, 85, 88, 89
Epicatechin, 259
  sulfate, 418, 420
  O-methyl, 420
Epigenetic, 167
Epigenome, 168
Epitope, 73–74
  conformational, 73–4
  linear, 73–4
  mapping, 79
ESI-IT, 128
European Food Safety Authority, 9
Exercise intervention, 257–8
Expression Proteomics, 72, 81, 83

Faecalibacterium prausnitzii, 223, 225, 233, 235
Fat, see Diet intervention
Fatty acids, 253, 255, 257–9, 263, 438–9
  metabolism, 444
  monounsaturated (MUFA), 441
  omega-6/omega-3, 440
  polyunsaturated (PUFA), 441
  short chain, 223–4, 233
Fecapentaenes, 234
FIA-MS/MS, 252
Fiber, 430
  intake, 430, 440
Fingerprinting, 5, 454
  metabolic, 247–8
Flow injection analysis, see FIA-MS/MS
Folate, 381–403
  absorption & metabolism, 383
  blood status, 383, 385
    LC MS/MS, 386–7, 388
    microbiological assay, 385
    protein binding assay, 385
  chemical structure, 381–2
  dietary sources, 381, 383
  DNA methylation and gene expression, 394
    cancer, colon, 395–7
    CVD, 395–7
  genomic stability, 387–93
  global DNA methylation & LC MS/MS, 394–7
    ApoE null mice, folate deficient, 397–8
    rats, folate depleted, 397–8
  human disease, 383–4, 397, 399
    carbon metabolism, 383
    cancer, colorectal (CRC), 384, 387, 393
    cardiovascular disease (CVD), 384

Folate (Continued)
    cognitive decline, 384
    neural tube defects (NTDs), 384
  protein expression & proteomic analyses, 388–94
    colon cells, human, 388–91
    colon, rat, 391–3
    plasma, human, 393–4
Food
  adulteration, 132–4
  allergens, 76–81
    “hidden”, 75, 89
    IgE-mediated, 72
    non-IgE-mediated, 72
    processing of, 86, 88
    simulated gastrointestinal digestion of, 79–80
    standardization of, 91
    “type I” or complete, 72
    “type II” or incomplete, 73
    uptake of, 81
  allergy, 69
    prevalence, 71
    immunopathology, 72–3
  authentication, 132–5
  conservation, 147–8
  contaminants, 455
  functional, 3, 329, 330–1, 335, 411, 471–3, 477–83
  industry, 329–31, 336–7
  labelling, 132–3
  metabolic pathways, 149
  metabolome, 453
  new foods, 6
  origin, 465
  physiological activity, 150
  processing, 144, 147–8
  product, 330
  production, 331
  proteome, 172
  quality, 6, 144, 462–4
  safety, 1, 6, 132, 139, 142, 147, 455–61
  science, 167, 336
  spoilage, 142
  storage, 147–8
  technology, 329
  traceability, 1, 6, 132–4, 464–6
  transgenic foods, 6–7, 191–212
Foodomics, 2, 125, 167, 329–37, 416–25
  applications, 4
  basic concepts, 471
  bioinformatics, 6
  colon cancer study, 10
  definition, 2
  epigenomics, 4
  future trends, 11
  GMO, 198, 212
  Green, 471–98
  metabolome, 5
  metabolomics, 5, 245, 261, 263
  nutrition and health research, 7
  proteomics, 4
  scheme, 8
  tools, 4
  transcriptomics, 4
Formate, 408
Fragmentation, 170
Fragmentation methods, 40
  collision-activated dissociation (CAD), 41, 43
  collision-induced dissociation (CID), 32, 40–1, 47–8
  electron capture dissociation (ECD), 41
  electron transfer dissociation (ETD), 41, 43, 46
  high energy collisional dissociation (HCD), 41, 46–7
Fruit ripening, 3
Fumarate, 257, 408
Functional proteomics, 72, 83

Gallocatechin, 259
Gas Chromatography, 487
  comprehensive two-dimensional GC, 458–9, 465
  GC-FID, 208, 494–7
  GC-MS, 5, 195, 205, 210, 251, 254–6, 260, 262, 456
  large volume injection, 457
Gas-phase fractionation, 170
Gastrointestinal tract, 11, 167
GC, see Gas Chromatography
Gene
  analysis, 493
  expression, 253, 312, 318, 325–6
  microarray, 199, 212

  ontology, 173
  regulation, 171
Generally Recognized As Safe, 477
Genetic
  predisposition, 167, 171
  susceptibility, 171
Genetically modified organisms (GMO), 191
  plant, 261
  profiling methodologies, 193, 198
  regulation, 192, 212
  targeted analysis, 193–4, 198
Genome-Wide Association Studies (GWAS), 288–9
Germ-free mice, 262
Glucagon, 445
Gluconeogenesis, 430
Glucose-10-phosphate, 408
Glucose metabolism, 253, 257
Glutamate, 263
Glutamic acid, 408
Glutamine, 263, 408
Glycemia, 434
Glycerol, 259, 408
Glycine, 408
Glycogenolysis, 407–8
Glycolysis, 407, 409
Glycomic, 173
Glycosylation of proteins, 32–3, 43
GM crop, see Genetically modified organism
  barley, 195, 206
  cucumber, 206
  grapevine, 200, 206
  maize, 191, 195, 200, 206
  pea, 195, 200
  potato, 191, 195, 200
  raspberries, 206
  rice, 191, 195, 200
  soybean, 191, 195, 200
  tomato, 195, 200
  wheat, 191, 195, 200
GRAS, see Generally Recognized As Safe
Grape juice, 254, 262
Green Analytical Chemistry, 474–6, 482–8, 490–3
Green tea (see Diet intervention), 411
Gut microbiota, 256, 260, 262

Hazelnut, see Diet intervention
Health, 167, 310–11, 329–31, 335–6, 471, 473, 475, 482, 496
  claims, 330
  disease prevention, 310–12
  trajectory, 167
Herbicide tolerance, 191, 195, 200
HILIC/LC, see Hydrophilic interaction liquid chromatography
Hippurate, 408
Hippuric acid, 259
Histone, 4
  code, 171
Homeostasis, 4, 256, 311, 313, 318
Homovanillic acid, 419
Host metabolism, 167
HPLC, see Liquid Chromatography
Hydroxybenzoic acid, 419
Hydroxyisobutyrate, 409
Hydroxyisovalerate, 409
Hydroxymethoxyphenyl-γ-valerolactone sulfate, 418
Hydroxyphenylacetic acid, 418–9
Hydroxyphenylpropionic acid, 418–20
Hydroxyphenylvaleric acid, 418
Hydrophilic interaction liquid chromatography, 260, 488–9
Hypertension, 287, 440
Hypoxanthine, 408

Identification metabolites, see Metabolomics
Immobilized metal affinity chromatography (IMAC), 32–3
Immune
  protection, 172
  response, 168
Immunochemical detection, 194
Inductively coupled plasma, see Mass spectrometry, ICP-MS
Inflammation, 233, 434
Inflammatory bowel disease, 172
Inosine, 408
Insects resistance, 191, 195, 200
Insulin, 252
  resistance, 430, 437
Interindividual variability, 167
Intermap, 440
Intersalt, 440
Ion Mobility Spectrometry, 484

Ionization
  electron, 456
  direct, 464
Ischemic heart disease, 430, 434–5
  prognosis, 436
Isobaric tag for relative and absolute quantitation (iTRAQ), 22, 24–5, 27, 37, 53
Isoelectric focusing (IEF), 16, 21, 170
Isogenic, 257
Isotope-coded affinity tags (ICAT), 22–5, 37, 53
Isotope-coded protein label (ICPL), 22, 24–6, 37, 53
Isotopic pattern, 252
Isotopic ratio mass spectrometry, 465
iTRAQ, 196, 203

Kidney cancer, 260
Krebs cycle, 432

Lactate, 223, 233, 253, 263, 408
Lactation, 172
Lactoglobulin beta, 172
Lambert Beer’s law, 531–2
Large volume injection, 457
LCA, see Life Cycle Assessment
LC-MS, see Liquid Chromatography
Leptin, 445
Life Cycle Assessment, 473–4, 479, 481, 493–8
Lifestyle, 8, 262, 429, 431
Linoleic acid, 408
Lipidomics, 173, 277, 351, 483
  analysis, 360
  in food science, 368–74
    in food quality, 369–70
    in food safety, 370–1
Lipids, 351–4
  analytical methods, 355–60
  classification, 352–4
  functions, 351–2
  metabolism, 250, 253, 256, 257, 259
Lipogenesis, 430
Lipopolysaccharides (LPS), 233
Liquid Chromatography (LC), 169, 487–9, 494–6
  LC-IRMS, 466
  LC-MS, 5, 195–6, 199, 250–62, 457, 488, 491–6
  LC-NMR, 5
  Ultra high Pressure Liquid Chromatography, 491
  UPLC-MS, 204, 250–62, 455
Loadings, 520–1, 524
Luteolin, 445
Lysine, 260
Lysophosphatidylcholine, 259

Macronutrient, 167
Malate, 257, 408
Malate dehydrogenase, 263
MALDI, see Matrix Assisted Laser Desorption Ionization
Mass accuracy, 169, 252
Mass spectrometer, 169, 485
Mass spectrometry, 103–6, 125, 127–8, 171, 226, 454
  atmospheric pressure chemical ionization (APCI), 277
  CE-MS, 5, 195, 198, 204, 251, 260, 492, 494–5
  DART-MS, 464
  DESI-MS, 464
  desorption ionization on silicon (DIOS), 277
  direct ionization, 464
  electron impact ionization (EI), 273
  electrospray ionization (ESI), 76
  FT-ICR MS, 174, 206, 211, 250, 252, 254, 255, 261
  GC-MS, 5, 195, 205, 210, 251, 254–6, 260, 262, 456
  GCxGC-MS, 465
  ICP-MS, 464, 465
  Imaging, 486
  ion trap analyzers, 84, 88, 277
  IRMS, 465
  LC-MS, 5, 195–6, 199, 204, 250–62, 457, 488, 491–6
  Multiple Reaction Monitoring (MRM), 74, 77–78, 90, 456, 457, 460
  MSn, 250
  Orbitrap analyzer, 85, 86, 88, 89
  Q-TOF hybrid analyzer, 82, 86
  tandem, 76
  triple quadrupole analyzer, 78, 84
  UPLC-MS, 455

Matrix Assisted Laser Desorption Ionization (MALDI), 17–9, 128, 172, 195, 199, 203, 486
Matrix effects, 456
Metabolite, 272–4, 276–8, 284–8, 290–1, 320–2, 331–2
Metabolic, see also Metabolomics
  fingerprinting, 431, 435, 436
  imprinting, 171
  phenotype, 434
  profiling, 431, 436
  programming, 167
  syndrome, 286–7, 291, 429, 437
Metabolism
  disorders, 432
  lipid, 438
  methylamine, 433
  nucleotide, 432
Metabolome, 168, 245–264, 482, 492
  food, 453
Metabolomics, 5, 225, 271–4, 453, 484–91, 543–5
  databases, 252
  data processing, 251, see also Statistical analysis
  definition, 453
  fingerprinting, 5, 247–8, 454
  human studies, 418
  identification, 250–3
  nutrimetabolomics, 279, 289
  pattern, 252, 257–8
  pharmacometabolomics, 279
  plant metabolomics, 261
  profiling, 5, 247, 254–6, 259–64, 453
  target analysis, 5, 246–7, 417
  untargeted, 424, 513
  workflow, 509
Metagenome, 172
Metagenomics, 224–5
Metaproteome, 172
Metatranscriptomics, 225
Methionine, 4
β-D-methyl glucopyranoside, 408
3-methyl-2-oxovalerate, 409
MIAPE (Minimum information about a proteomics experiment), 55
Microarray technology, 4, 310–5, 318, 321
  gene microarrays, 199, 212
  protein microarrays, 5
Microbe, 171
Microbiome, 11, 262, 434, 438
Microbiota, 11, 167, 172, 222, 260, 262
  gut, 279–81, 284–91
Microfluidic, 492–3
Micronutrient, 167, 279, 283
Milk
  bovine, 173
  breast, 172–3, 255, 262
  formula-fed, 172–3
Minerals, 274, 281
MRM, 131
mRNA, 310, 313, 318, 322
MS, see Mass Spectrometry
Multidimensional Protein Identification (MudPIT), 22, 34–5
Multiple reaction monitoring (MRM), 27, 34–5, 37–40, 456, 457, 460
Multiplexed proteomics (MP), 33
Multivariate approach, 510, 517
Myocardial infarction, 435

Nanoflow, 173
Naringenin-O-glucuronide, 418
Necrotizing enterocolitis, 172
Network biology, 542–3
Next-generation sequencing, 199
Niacinamide, 408
NMR, see Nuclear Magnetic Resonance
Nonmodified counterparts, 191, 203
Non specific lipid transfer proteins (nsLTPs), 80
Normalization, 509, 515–7, 525
Northern blotting, 199
Nuclear Magnetic Resonance (NMR), 5, 198, 204, 212, 254–256, 483–4
  imaging (MRI), 444
  LC-NMR, 5
Nut, see Diet intervention
Nutraceuticals, 6, 260, 471–92
Nutrients, 429
  signaling, 307
Nutriproteomics, 150–1
Nutrigenetics, 2, 245
Nutrigenomics, 3, 171, 245, 263, 310–2, 327, 329
Nutrimetabolomics, 310, 316, 320, 330, 331, 336

Nutriproteomics, 310, 315, 318
Nutrition, 329–31, 335–7, 361–8
  evolution of, 304
  history, 304–8
  human interventions, 361–2
  -related diseases, 362–8
    cancer, 367–8
    fatty liver disease, 366–7
    hypertension, 365
    inflammatory disease, 365
    obesity, 362–3
    type 2 diabetes mellitus, 364
  optimal, 303–8, 329, 331, 335
  patterns, 304, 306
  personalized, 9, 245, 253, 271–2, 278–81, 336–7
  profiles, 329
  research, 329
Nutritranscriptomics, 310, 312–3, 313, 321
Nutritional
  intervention, 167, 252–64
  proteomics, 167

Obesity, 172, 232, 246, 256, 286–9, 291, 429, 437
Oil, see also Diet intervention
  corn, 255, 257
  fish, 254
  olive, 255, 257
Oleic acid, 408
Omega 3, see Diet intervention
Omics
  history, 308, 539–40
  limitations, 321, 539–40
  platforms, 11, 245–6, 262, 482
  techniques, 320–2
Oral allergy syndrome, 72
Osteoporosis, 429, 430
Oxidative stress, 407–11, 434
2-oxoisocaproate, 409
2-oxoisovalerate, 409
2-oxoglutarate, 408

Palmitic acid, 408
Pantothenate, 257
PARAFAC, 509, 519, 531–4
PARAFAC2, 509, 519, 531–4
Pareto scaling, 521, 524
Partial Least Squares regression, see Statistical analysis
  discriminant analysis (see PLS-DA), 510
Pathogen, 459
Pathway analysis metabolomics, 252–3
PCA, 509, 519, 520–6
  things to be aware of when using PCA, 523
PCR, see Polymerase chain reaction
Pectin, 224
Peak picking, 251
PeptideAtlas, 169
Peptide
  data processing, 169
  labeling, 170
  mass fingerprinting (PMF), 18, 49, 51
  proteotypic, 77–8, 89–90
  sequencing, 169
Peripheral mononuclear blood cells (PBMC), 312–5, 318
Persistent organic pollutants, 458
Pesticide, 455
Phenylalanine, 408
Phenolic acids, 229
Phenotype, 174, 246–7
Phenylacetic acid, 229
Phenylacetylglutamine, 419
Phenylpropanoid, 228
3-Phosphoglycerate, 408
Phospholipid, 259
Phosphoproteomics, 32
Phytochemicals, 261
PHWE, see Pressurized Hot Water Extraction
PLE, see Pressurized Liquid Extraction
PLS-DA, 509, 526–8
  Things to be aware of when using PLS-DA, 527
  Validation, 527
  Validation rules, 529
Polyamines, 227, 261
Polymerase chain reaction, 194
  Real-time PCR, 4
Polymorphisms, 9, 245
Polyphenol, 253–5, 259–63
  rich beverages, 411
Polyunsaturated fatty Acid (PUFA), 271, 288

Post-genomic, 246
Postharvest, 3
Post-mortem changes, 144–6
Post-translational modifications (PTMs), 18–21, 26, 31–3, 131, 169
  computational methods, 50, 52
  instrumental methods, 41, 43–6
Precursor ion, 170
Pre-processing, 515
  Alignment, 509, 515–6
  Binning, 509, 511, 515
  Filtering, 509, 515
  Baseline subtraction, 509, 515
  Normalization, 509, 515–7, 525
  Validation, 525
Pressurized Hot Water Extraction, 477–82, 486
Pre-slaughter conditions, 146–7
Pressurized Liquid Extraction, 477–8, 486
PRIDE (PRoteomics IDEntifications), 51, 53–4
Principal components analysis, see Statistical analysis and PCA
Profiling, 5, 453, 487, 490, 492
  metabolic, 5, 247, 254–6, 259–64, 493
Prolamin superfamily, 80
Prolinebetaine, 255
Propionate, 233
Protected designation of origin, 465
Protected geographical indication, 465
Proteins
  annotation, 171
  bioavailability, 171
  bioefficacy, 171
  coding gene, 171
  deep sequencing, 172
  fish, 135, 137–8, 146, 148
  identification, 128–9, 169, 491, 493
  information and knowledge extractor (PIKE), 53
  isoform, 169
  meat, 139–40, 144–6, 147, 148
  metabolism, 227, 234
  microarrays, 5
  milk, effect of processing and storage, 148
  milk, identification of pathogens, 143
  milk, interactome map, 149
  shellfish, 138–9
Proteome, 310, 318–9
Proteomics, 4–5, 101, 103, 171, 260, 263, 472, 482–3, 487, 490–2
  bottom-up, 5, 19–21, 34, 126–7, 199
  gel-based, 126
  label-free, 22, 27–8, 30–2, 37, 53
  quantitative, 129
  shotgun, 30, 34–5, 38, 76–7, 127, 174, 203
  targeted, 34–40
  top-down, 5, 19–21, 41, 46, 126
Pyrogallol, 419–20
Pyruvate, 408

Quadrupole, 19, 36–41, 43–48
Qualitative data, 516–7
Quality control samples (QC samples), 514, 523
Quality control sample set, 250
Quality of the data, 514–5
Quantitative data, 517
QuEChERS, 458
Quenching, 247
Quinic acid, 259

Real-time PCR, see Polymerase chain reaction
Red wine, 254, 255, 262
Regulation homeostatic, 4, 256
Regulatory system, 330
Repositories, proteomics data, 36, 53–4
Resveratrol, see Diet intervention
RNA non-coding, 4
RNA-Seq, see Next-generation sequencing
Rosemary extract, 260–1, 473, 477, 479–82, 493, 496

Sample preparation, 5, 247, 249–50, 472, 482, 484, 492, 494, 496–7
  Green, 476, 483, 485–7
Scores, 520–2
Selected reaction monitoring (SRM), 31–2, 34, 36–9
Serotonin, 253, 420, 424, 438
SFE, see Supercritical Fluid, Extraction
Shotgun, see Proteomics
Single nucleotide polymorphism (SNP), 7, 280, 288
Sitosterol beta, see Beta sitosterol

SMIM, 131
SNPs, see Single nucleotide polymorphism
Solid phase extraction, 458
Spectral counting, 28–31
Spectrum libraries, 144
SRM, 131
Stable isotope labeling by amino acids in cell culture (SILAC), 22, 24, 26, 37, 53
Stable isotope labeling of mammals (SILAM), 22, 24
Stable isotope standards and capture by anti-peptide antibodies (SISCAPA), 27, 39
Starch, resistant, 224
Statistical analysis
  HCA, 173, 252, 253
  PCA, 252, 454
  PLS-DA, 252, 253, 257–258, 454
Structural elucidation, 252
Substantial equivalence, 192, 209
Succinate, 257, 408
Sulfate reducing bacteria, 234
Supercritical Fluid
  chromatography, 487–8
  extraction, 477–82, 486, 494, 496
Surface-enhanced laser desorption ionization (SELDI), 18, 35
Systems biology, 11, 132, 262–3, 283, 289, 320, 322, 541–7
  platform, 545–6
Systems level, 167
Systems Response Profiles (SRPs), 283

Tandem mass spectrometry (MS/MS), 19–21, 33–5, 39–41
  bioinformatics tools, 50–1, 54
  new instrumental methods, 43–9
  quantification, 24, 26–31
Tandem mass tag (TMT), 23–5, 27
Targeted proteomics, 131–2
Taurine, 259, 408
Terpenoids, 228
Thrifty genotype, 430
Tocopherol, 259, 263
Tomato, see Diet intervention
Top-down, see Proteomics
Toxin, 459
Tract intestinal, see Gastrointestinal tract
Training sample set, 250
Transcriptome, 168
Transgenic foods, 191
Treatment
  heating, 147–8
  high-pressure, 148
Triacylglycerol, 259
Tricarboxylic acid cycle (TCA), 257, 407, 411
Trimethylamine N-oxide, 409
Tryptophan, 408
Two-dimensional gas chromatography, see Gas chromatography
Two-dimensional gel electrophoresis (2-DGE), 15–7, 22–3, 199, 212

Unintended effects, 192–3, 198, 204
UPLC, see Liquid Chromatography, Ultra High Pressure
Urea cycle, 435
Urolithin, 420–4

Validation, 525, 527–9
Validation sample set, 250
Valine, 408
Vanillic acid, 419–20
Variable selection, 530
Vitamins, 228, 281, 283
Vitamin B6, 4
Vitamin B12, 4
Vitamin C, 259, 263
Vitamin D, 280
Vitamin E, see Tocopherol

Well-being, 167
Whey, 172
Wine (see Red wine), 419, 422–3
Workflow, 354–5

Xanthine, 408
XC-MS data, 509, 511–2

WILEY SERIES ON MASS SPECTROMETRY

Series Editors Dominic M. Desiderio Departments of Neurology and Biochemistry University of Tennessee Health Science Center

Nico M. M. Nibbering Vrije Universiteit Amsterdam, The Netherlands

• John R. de Laeter. Applications of Inorganic Mass Spectrometry
• Michael Kinter and Nicholas E. Sherman. Protein Sequencing and Identification Using Tandem Mass Spectrometry
• Chhabil Dass. Principles and Practice of Biological Mass Spectrometry
• Mike S. Lee. LC/MS Applications in Drug Development
• Jerzy Silberring and Rolf Ekman. Mass Spectrometry and Hyphenated Techniques in Neuropeptide Research
• J. Wayne Rabalais. Principles and Applications of Ion Scattering Spectrometry: Surface Chemical and Structural Analysis
• Mahmoud Hamdan and Pier Giorgio Righetti. Proteomics Today: Protein Assessment and Biomarkers Using Mass Spectrometry, 2D Electrophoresis, and Microarray Technology
• Igor A. Kaltashov and Stephen J. Eyles. Mass Spectrometry in Structural Biology and Biophysics: Architecture, Dynamics, and Interaction of Biomolecules, Second Edition
• Isabella Dalle-Donne, Andrea Scaloni, and D. Allan Butterfield. Redox Proteomics: From Protein Modifications to Cellular Dysfunction and Diseases
• Silas G. Villas-Boas, Ute Roessner, Michael A.E. Hansen, Jorn Smedsgaard, and Jens Nielsen. Metabolome Analysis: An Introduction
• Mahmoud H. Hamdan. Cancer Biomarkers: Analytical Techniques for Discovery
• Chhabil Dass. Fundamentals of Contemporary Mass Spectrometry
• Kevin M. Downard (Editor). Mass Spectrometry of Protein Interactions
• Nobuhiro Takahashi and Toshiaki Isobe. Proteomic Biology Using LC-MS: Large Scale Analysis of Cellular Dynamics and Function
• Agnieszka Kraj and Jerzy Silberring (Editors). Proteomics: Introduction to Methods and Applications
• Ganesh Kumar Agrawal and Randeep Rakwal (Editors). Plant Proteomics: Technologies, Strategies, and Applications
• Rolf Ekman, Jerzy Silberring, Ann M. Westman-Brinkmalm, and Agnieszka Kraj (Editors). Mass Spectrometry: Instrumentation, Interpretation, and Applications
• Christoph A. Schalley and Andreas Springer. Mass Spectrometry and Gas-Phase Chemistry of Non-Covalent Complexes
• Riccardo Flamini and Pietro Traldi. Mass Spectrometry in Grape and Wine Chemistry
• Mario Thevis. Mass Spectrometry in Sports Drug Testing: Characterization of Prohibited Substances and Doping Control Analytical Assays
• Sara Castiglioni, Ettore Zuccato, and Roberto Fanelli. Illicit Drugs in the Environment: Occurrence, Analysis, and Fate Using Mass Spectrometry
• Ángel García and Yotis A. Senis (Editors). Platelet Proteomics: Principles, Analysis, and Applications
• Luigi Mondello. Comprehensive Chromatography in Combination with Mass Spectrometry
• Jian Wang, James MacNeil, and Jack F. Kay. Chemical Analysis of Antibiotic Residues in Food
• Walter A. Korfmacher (Editor). Mass Spectrometry for Drug Discovery and Drug Development
• Alejandro Cifuentes (Editor). Foodomics: Advanced Mass Spectrometry in Modern Food Science and Nutrition

Foodomics: Advanced Mass Spectrometry in Modern Food Science and Nutrition, First Edition. Edited by Alejandro Cifuentes. © 2013 John Wiley & Sons, Inc. Published 2013 by John Wiley & Sons, Inc.