Development of a Metabolomics Strategy for Novel Natural Product Discovery

and its Application to the Study of Defense Responses

Dissertation

Presented in Partial Fulfillment of the Requirements for the Degree Doctor of

Philosophy in the Graduate School of The Ohio State University

By

Jiye Cheng, M.A.S.

Graduate Program in

The Ohio State University

2011

Dissertation Committee:

Terrence Lee Graham, Advisor

A. Douglas Kinghorn

Laurence Vincent Madden

Pierluigi Bonello

Copyright by

Jiye Cheng

2011

Abstract

Plant secondary product metabolic pathways produce an astonishing wealth of metabolites with important biological functions. These secondary metabolites play critical roles in plant defense responses, including defense against herbivores, pests and pathogens. Plant natural products are also the source of numerous pharmaceuticals. In the areas of cancer and infectious disease, 60 and 75%, respectively, of new drugs originate from plants. Metabolomics, which aims at the qualitative and quantitative analysis of the full complement of metabolites in biological samples, has established itself as a valuable tool for plant functional genomics and studies of plant biochemical composition. However, its potential to find new natural products remains untapped because of the diversity and complexity of natural products. Optimized metabolomics strategies for the discovery of new natural products are needed. Therefore, we proposed a series of simple operations for new natural product discovery based on metabolomic studies. The main steps are (1) LC-ESI-MS profiling analysis, (2) peak detection, supervised retention time alignment and peak matching for multiple groups of samples, (3) selection of the metabolites that are unknown induced natural products, (4) purification of target metabolites by preparative HPLC and, finally, (5) structure elucidation based on NMR. This strategy was applied to the metabolite profiling and systematic identification of defense-response-induced secondary metabolites in soybean cotyledons. We were able to simultaneously detect and identify 13 isoflavones, 2 and 6 . In total, 5 compounds were discovered as natural products for the first time in soybean. Comparative metabolite profiling of soybean cultivars resistant and susceptible to Phytophthora sojae (Kauf. and Gerde.) was performed. Principal component analysis clearly demonstrated a separation of elicitor-activated resistant and susceptible soybean cultivars. 3’-Prenylisoafrormosin, glyceocarpin, glyceofuran, 3'-prenylgenistein and phaseol were identified as the key secondary metabolites accounting for the separation. The metabolite profiling results present the most complete analysis of soybean induced secondary metabolites to date and can be further utilized to evaluate the chemical components of soybean samples for plant biology, food science and pharmaceutical studies. Our results also provide additional knowledge of the soybean secondary metabolite pathways involved in defense. Moreover, the proposed strategy is a promising approach for novel compound discovery, even in relatively well-studied plants.

In soybean, phenylpropanoids play critical roles in defense responses. They regulate certain aspects of oxidative stress and hypersensitive cell death and act as phytoalexins, which directly inhibit pathogen growth. Interestingly, some of the phenylpropanoids from soybean also have many reported activities in animal cells. In particular, is one of the most potent and has been reported as a good lead for anticancer activity. Because of these important activities, an in-depth study of how phenylpropanoid pathways are regulated was conducted. Over 25 compounds, including exogenous elicitors, signal molecules and signal transduction regulators, were examined for their effects on wound-, light- and glucan defense elicitor-induced phenylpropanoid responses in soybean. Among the exogenous elicitors, many of the Pathogen Associated Molecular Patterns (PAMPs) tested (chitin oligomers, LPS, wall glucan elicitor, mycolaminaran) induced the production of glyceollin, as expected. Interestingly, all chemicals that are H+ and/or K+ ion effectors (vanadate, monensin, valinomycin, nigericin and fusicoccin) caused a massive accumulation of inducible secondary metabolites. Additionally, 2-methoxy-3,9-dihydroxycoumestone, 1-methoxy-3,9-dihydroxycoumestone, 8-methoxy-3,9-dihydroxycoumestone, and 7,4’-dihydroxy-5’-methoxycoumaronochromone were discovered as natural products for the first time in soybean in this study. Overall, the results of this study suggest that phenylpropanoid metabolism may be controlled by manipulating the transmembrane potential.

Dedication

This thesis is dedicated to my parents.

They are the reason for my success, the reason for my life. I love you.

Acknowledgements

My foremost gratitude goes to my adviser Dr. Terrence Lee Graham whose guidance and support were crucial for me to accomplish this thesis. I thank him for his patience, encouragement, and consideration that carried me on through difficult times, and for his insights and suggestions on my research.

I gratefully acknowledge my student advisory committee members, Dr. A. Douglas Kinghorn, Dr. Laurence Vincent Madden and Dr. Pierluigi Bonello, who advised me and helped me in various aspects of my thesis.

It has been a precious experience to work with Madge Graham, Michelle Sinden and Kara Riggs, and I thank them for their valuable suggestions for my research, their collegiality and their technical support in the lab. I would like to thank Kyle Benzle and Chunxue Cao for their assistance in the collection and treatment of soybean cotyledons for my study. I also give special thanks to Dr. Chunhua Yuan and Dr. Stephen Opiyo. Dr. Yuan provided exceptional NMR training and support in metabolite structure elucidation. Dr. Opiyo did a tremendous job organizing and maintaining the LC/MS instrument, which was critical for completing my project on time.

Finally, I am forever indebted to my parents and my fiancée for their understanding, endless patience and encouragement when it was most required.

Vita

2006 ...... B.S. Pharmacy, Shanghai Jiaotong University

2010 ...... M.A.S. Statistics, The Ohio State University

2006 to present...... Graduate Research Associate, Department of Plant Pathology, The Ohio State University

Fields of Study

Major Field: Plant Pathology

Table of Contents

Abstract

Dedication

Acknowledgements

Vita

Table of Contents

List of Tables

List of Figures

List of Abbreviations

Chapter 1: Introduction

Chapter 2: Potential defense-related prenylated isoflavones in lactofen-induced soybean

Introduction

Materials and methods

Results and discussion

Conclusion

Chapter 3: Systematic identification and comparative metabolite profiling of soybean (Glycine max) defense-related secondary metabolites in Phytophthora sojae resistant and susceptible cultivars

Introduction

Materials and methods

Results and discussion

Conclusion

Chapter 4: A metabolomic strategy for new natural product discovery

Introduction

Results and discussion

Materials and methods

Conclusion

References

List of Tables

Table 1.1 Software and databases for LC-MS tentative compound identification

Table 1.2 Known soybean defense-related metabolites observed by LC-MS in previous publications

Table 2.1 1H- (chemical shifts and coupling constants) and 13C-NMR spectroscopic data at 25°C for isoflavone compound (2) and (3) dissolved in methanol-d4

Table 3.1 Metabolites identified and profiled in 200 µg/ml glucan-treated Williams 82 soybean cotyledons

Table 3.2 MSn data of pterocarpans and phaseol. Only major fragments are listed in each MS

Table 3.3 Comparison of the top seven metabolites between Williams 82 and Williams that contribute to PC1 in Figure 3.7. All results are presented as medians (25th, 75th percentile) of selective m/z intensity

Table 4.1 All tested exogenous elicitors, signal molecules and signal transduction regulators and their functions and effects on the induced accumulation of and II

Table 4.2 Known soybean metabolites observed by LC-MS in previous (chapter 3) and their possible pseudomolecular species in ESI positive-ion mode

List of Figures

Figure 1.1 Publications on Metabolomics. Journal articles on metabolomics by year of publication, as retrieved from SciFinder™ using the research topic metabolomics or metabonomics

Figure 1.2 Basic steps of handling metabolomics mass raw data for generating variables

Figure 1.3 Peaks of isomers in two samples. Pseudomolecular ions of phaseol and glyceollin IV form side-by-side isobaric peaks in positive-ion mode LC-MS chromatography due to similar molecular structures

Figure 2.1 Structures of lactofen (1) and compounds (2-9)

Figure 2.2 Base ion chromatograms from positive ion LC-ESI-MS analyses: (A), an extract of cotyledons that were treated with lactofen. (B), an extract of cotyledons that were treated with water. Tissues were harvested at 48 h. Assigned peaks were predominantly induced by lactofen treatment. Peaks (1)-(6) were unknown. Peaks (7) and (8) were identified as glyceollin I (8) and II (9) by comparison to previous reports

Figure 2.3 (A) The molecular structure of compound (7). The multiple-bond correlations from 1H to 13C mentioned in the text are indicated by arrows. (B) Partial 2D 1H-13C HMBC spectrum showing the long-range correlations attributed to the aromatic spins with the most downfield 13C resonances (150-160 ppm)

Figure 2.4 Effects of lactofen and glucan on the accumulations of compounds (2-7) in soybean cotyledons: Compounds (2-7) were semi-quantified by LC-MS profiling of tissues harvested 48 h after treatment with 200 µM lactofen (1) or 200 µg/ml glucan. Values of compounds (2-7) were the average of three experiments ± standard error on the scale of selective m/z intensity

Figure 3.1 Overlaid base ion chromatograms from positive ion HPLC-ESI-MS analysis of the extract of cotyledons which were treated with 200 µg/ml WGE and water control. Tissues were harvested at 48 h

Figure 3.2 Structures of identified isoflavones from glucan-induced soybean cotyledons. The β-d-glucosyl, prenyl and 6”-malonyl groups are designated as G, P and M, respectively

Figure 3.3 Structures of identified coumestans and pterocarpans from glucan-induced soybean cotyledons

Figure 3.4 Summary of NMR observations on the sample of glyceocarpin dissolved in DMSO-d6: (A) ROE spatial interactions, (B) J-coupling correlations, and (C) multiple-bond correlations from proton to carbon

Figure 3.5 Proposed mechanism for the formation of the glyceollin IV pseudomolecule in positive-ion mode mass spectrometry

Figure 3.6 Principal component analysis (PCA) score plot of water and WGE treated Williams and Williams 82 samples. The plot shows that the metabolomes of the four groups of samples were clearly different. WGE-treated Williams 82 moved further from the two water control groups than WGE-treated Williams did, consistent with a potential difference related to the presence of the Rps-1K locus in Williams 82

Figure 3.7 Loading plot of PC1 vs PC2. The loading of an individual metabolite represents its contribution to the separation of the four groups of samples. The top 7 metabolites in PC1 are labeled with the peak numbers that can be found in Figure 3.1

Figure 4.1 The workflow for the discovery of unknown induced plant metabolites

Figure 4.2 Overview of signaling events in host-pathogen interactions

Figure 4.3 Overlaid base ion chromatograms from positive-ion HPLC-ESI-MS analysis of the extract of cotyledons that were treated with 50 µM nigericin and water control. Tissues were harvested after 4 days

Figure 4.4 Overlaid chromatograms of peaks 1 and 2 in Figure 4.2 from two samples of cotyledons that were treated with 50 µM nigericin. Due to the retention shift, peak 2 in the black sample could be matched to both peak 1 and 2 in the red sample

Figure 4.5 ESI-MS spectra of (A) peak 1 and (B) peak 2 in the positive-ion mode

Figure 4.6 Chemical structures of (10) 2-methoxy-3,9-dihydroxycoumestone (Rajani and Sarma 1988), (11) 1-methoxy-3,9-dihydroxycoumestone (Hatano et al., 2000), (12) 8-methoxy-3,9-dihydroxycoumestone (Lee et al., 2008), and (13) oblonginol (coumaronochromone derivative) (Lin and Kuo, 1993)

Figure 4.7 Heteronuclear multiple-bond correlations (HMBC) spectrum recorded on compound (13) dissolved in DMSO-d6 showing long-range correlations between the aromatic protons and those oxygenated sp2 carbon resonances appearing in the 13C downfield region. The data set was recorded using a spectral width of 36 ppm in the 13C dimension (138.5 to 174.5 ppm), and the unfolded cross-peaks were boxed

List of Abbreviations

δ (ppm): chemical shift in parts per million

δC: carbon-13 chemical shift

δH: proton chemical shift

1D, 2D: one- or two-dimensional

ACN: acetonitrile

Calcd: calculated

CDCl3: deuterated chloroform

CHCl3: chloroform

COSY: correlation spectroscopy

DEPT: distortionless enhancement by polarization transfer

DMSO: dimethyl sulfoxide

EC50: effective concentration that inhibits a response by 50% relative to a control

Gal: β-d-galactopyranose

Glc: β-d-glucopyranose

HMBC: heteronuclear multiple bond correlation spectroscopy

HPLC: high-performance liquid chromatography

HRESIMS: high-resolution electrospray ionization mass spectrometry

HSQC: heteronuclear single-quantum coherence spectroscopy

Hz: hertz

IC50: sample concentration that inhibits cell growth by 50% compared to control

J: coupling constant

M: molar concentration

MeOH: methanol

min: minute

m/z: mass-to-charge ratio

NMR: nuclear magnetic resonance

NOESY: nuclear Overhauser enhancement spectroscopy

OPLS-DA: orthogonal partial least squares-discriminant analysis

PCA: principal component analysis

PLS-DA: partial least squares-discriminant analysis

UV: ultraviolet

WGE: the wall glucan elicitor from the cell wall of Phytophthora sojae

Chapter 1

Introduction

The soybean [Glycine max (L.) Merr.] is one of the most commonly consumed legumes worldwide. Its wide cultivation in all climatic zones of the world makes it one of the most valuable farm products. In 2009, worldwide soybean production reached 222 million metric tons (FAO 2010). The United States ranks first in soybean production, with 31.4 million hectares of planted area and a total value exceeding $31.7 billion in 2009. Ohio ranks 6th in soybean production among U.S. states. There are 26,000 soybean farmers who plant 4.1 million acres of soybean in Ohio, with a value of over $1.9 billion (OSC 2008).

Soybean is not only the principal oil crop used for biodiesel in the USA (Dizge et al., 2009; Liu et al., 2010), but also an important food resource. Soybean is a rich source of isoflavones, phenolic compounds and proteins that have many beneficial effects on human health. Thus, soybean is processed into various products such as soymilk powder, soymilk, tofu, soy sauce, and soy flour (Kurzer, 2003; Takahashi et al., 2004; Low Dog, 2005; Tyug et al., 2010).

With the rise in world population and rapid economic development, food and biodiesel production must increase. Crops like soybean and corn will play a more significant role in the future. However, soybean yields in the USA have been severely suppressed by soybean diseases, including soybean rust, soybean cyst nematode, Phytophthora root and stem rot, seedling diseases, charcoal rot, Sclerotinia stem rot, frogeye leaf spot and sudden death syndrome. It is estimated that about 15% of soybean yield is lost to these diseases (Wrather and Koenning, 2006). In order to boost soybean production, numerous investigations have been conducted on soybean defense reactions.

Results have clearly shown that multiple changes in secondary metabolites participate in soybean defense reactions. The changes include hydrolysis of conjugates of the isoflavones genistein and daidzein (Graham and Graham, 1996). Released free genistein is directly toxic to some fungal pathogens, such as Phytophthora sojae (Riveravargas et al., 1993), while free daidzein is the precursor of well-established inducible antibiotic phytoalexins (Dewick et al., 1970; Dewick and Martin, 1979; Ebel, 1986; Garcez et al., 2000), including and the that are induced under disease stress. Recently, we demonstrated that prenylation of isoflavones also participates in soybean defense responses (Cheng et al., 2011). Another induced metabolite, glyceocarpin, shows fungitoxic activity against Cladosporium sphaerospermum (Garcez et al., 2000).

Plants produce an astonishing wealth of metabolites with important biological functions from secondary metabolic pathways. Natural products have been the source of many drugs used in modern therapeutics; in the case of anticancer drugs in particular, more than 50% originally came from natural products (Kim et al., 2010). Interestingly, soybean defense-related metabolites show potential for benefiting human health as well. Soybean isoflavones, mainly genistein and daidzein, have shown beneficial effects on various chronic conditions such as cancer, coronary heart disease, diabetes and menopausal discomfort (Arjmandi et al., 1998; Ho et al., 2000; Mahn et al., 2005; Qin et al., 2009; Hsu et al., 2010). The glyceollins exhibit marked antiestrogenic effects on receptors and may be useful in the prevention or treatment of breast and ovarian carcinoma (Salvo et al., 2006). Glyceofuran and coumestrol were reported as low-density lipoprotein antioxidants (Lee et al., 2006). Thus, comprehensive profiling of soybean constitutive and inducible defense-related metabolites will be of paramount importance to both soybean and human health.

To make full use of plant natural products, effective ways to assess the diversity of these structurally complex compounds are urgently needed. Various approaches have been initiated in the last decade, largely because of tremendous advances in instrumentation and data handling capabilities. Among them, metabolomics has proven to be a powerful method for investigating complex biological mixtures. Metabolomics, after genomics and proteomics, is one of the latest promising platforms for systems biology. It aims to identify and quantify the full complement of low-molecular-weight compounds in a biological system (or a subset thereof) under genetic or environmental perturbations. This approach has gained ever-increasing interest and has established itself as a valuable tool for plant functional genomics and studies of plant biochemical composition (Fiehn et al., 2000; Dixon and Sumner, 2003). An exponential increase in the number of metabolomics studies is apparent (Figure 1.1).

[Bar chart: number of metabolomics journal articles published per year, 2001 to 2010; vertical axis from 0 to 1200, with bar labels rising from 11 to 980.]

Figure 1.1. Publications on Metabolomics. Journal articles on metabolomics by year of publication, as retrieved from SciFinder™ using the research topic metabolomics or metabonomics.

Metabolomics experiments start with the acquisition of comprehensive metabolic profiles using various analytical platforms, mainly NMR and MS. The selection of the most appropriate methodology can be considered a compromise between the chemical selectivity, sensitivity and speed of the different techniques (Saito and Matsuda, 2010). MS is currently the most widely applied technology in metabolomics because of its ability to detect metabolites present at low levels (Roux et al., 2011). Moreover, the specificity of MS (through high resolution and/or multidimensional MSn techniques) can even facilitate elucidation of the chemical structures of potential metabolites of interest (i.e., identification of biomarkers) (Antignac et al., 2011). Among the variety of MS techniques, liquid chromatography-mass spectrometry (LC/MS) offers the best combination of sensitivity and broad detection range, and is therefore of particular importance for non-targeted plant metabolomics (Ara et al., 2009; Allwood and Goodacre, 2010). LC/MS has already been established as a powerful tool for the analysis of a wide range of plant secondary metabolites (Tolstikov et al., 2003; Beekwilder et al., 2005; Allwood et al., 2006; Dixon et al., 2006; Hall, 2006; Moco et al., 2006; De Vos et al., 2007).

However, LC/MS-based plant metabolomic analyses still face challenges. The main challenges are the complexity of plant metabolites and limitations in the capacity for their rapid identification (Moco et al., 2007; Bottcher et al., 2008). In contrast to LC/MS, NMR techniques have low sensitivity, although they provide rich structural information and are capable of rigorous metabolite identification. Therefore, the combination of NMR and LC/MS may offer complementary advantages in plant metabolomic analysis. The advantages of this combination of technologies have been shown in the detection, identification, and quantification of known and, more importantly, unknown compounds in complex biomatrices (Bobzin et al., 2000; Exarchou et al., 2003; Exarchou et al., 2005; Lin et al., 2008). Thus, in the present research, LC/MS was employed as the primary tool to detect the multiplicity of changes in soybean defense-related metabolites, and NMR was used to conduct rigorous metabolite identification.

Since metabolomic experiments focus on comprehensive metabolite profiling, metabolomic data sets are usually large and complex, not only because they comprise large numbers of chromatographic peaks that are separated according to varying periods of time, but also because of fluctuations in the detected mass-to-charge ratio of ions between scans and the high contribution of chemical and instrumental noise to the acquired data (Aberg et al., 2008). The handling of data sets has a great impact on the quality of the identification and quantification of potential metabolites of interest (i.e., identification of biomarkers), and therefore on the resulting biological interpretation (Boccard et al., 2010). Any problem during data processing (e.g., peak detection, peak alignment and peak matching) can lead to loss of information and erroneous statistical analysis. The basic steps of handling mass raw data for generating variables were summarized by Boccard et al. (2010) and are shown in Figure 1.2.

Figure 1.2. Basic steps of handling metabolomics mass raw data for generating variables.

The use of bioinformatics tools for handling such large metabolomic data sets is unavoidable (Katajamaa and Oresic, 2007). Hence, several free software tools, such as MZmine (Katajamaa et al., 2006; Pluskal et al., 2010) and XCMS (Smith et al., 2006), have been developed in recent years to convert the initial three-dimensional raw data to a two-dimensional data table reporting time- and mass-aligned abundances of chromatographic peaks. Many instrument manufacturers have produced their own software [e.g., MarkerLynx (Waters), MassProfiler (Agilent) and MarkerView (Applied Biosystems/MDS SCIEX)]. The outcomes of these bioinformatics tools depend mainly on the definition of the software parameters for filtering, peak detection and matching in retention time. One drawback of such bioinformatics tools is that they may sometimes appear to be a "black box", and it can be difficult to identify which parameters are likely to have a strong influence on the results obtained (Antignac et al., 2011). Another drawback of the available free bioinformatics tools is that their algorithms require the deviation in retention time from sample to sample to be no greater than the time between two adjacent peaks (Smith et al., 2006), so they cannot handle the closely spaced isobaric peaks that are very common in plants. This problem is demonstrated in Figure 1.3. Since isomer peaks are usually very close to each other in chromatography, a slight time shift will cause mismatching across different samples.

Figure 1.3 Peaks of isomers in two samples. Pseudomolecular ions of phaseol and glyceollin IV form side-by-side isobaric peaks in positive-ion mode LC-MS chromatography due to similar molecular structures.
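The mismatch can be made concrete with a minimal sketch in Python. The retention times, the size of the shift, and the nearest-neighbor rule below are all illustrative assumptions; they are not taken from the data in Figure 1.3 or from the matching code of any particular software package.

```python
def nearest_match(reference, sample):
    """For each sample peak, return the index of the nearest reference peak."""
    return [
        min(range(len(reference)), key=lambda i: abs(reference[i] - rt))
        for rt in sample
    ]

# Hypothetical retention times (min) of two isobaric peaks in a reference run.
reference_rt = [20.0, 20.4]

# The same two peaks in a second run, eluting 0.3 min later (assumed shift).
shift = 0.3
sample_rt = [rt + shift for rt in reference_rt]

# Matching without alignment: the shift is comparable to the peak spacing.
matches = nearest_match(reference_rt, sample_rt)

# Matching after the shift has been removed by retention time alignment.
corrected = nearest_match(reference_rt, [rt - shift for rt in sample_rt])

print("unaligned:", matches)
print("aligned:  ", corrected)
```

With these assumed values, both shifted peaks fall closest to the second reference peak, so two distinct isomers collapse onto a single match; the correct one-to-one assignment is recovered only when the shift is removed before matching.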

Therefore, a custom-designed bioinformatics tool with a novel peak matching algorithm was developed in the present research to handle these complex isobaric peaks. The program was implemented in MATLAB version R2008a (The MathWorks, Inc., Natick, MA). The workflow includes reading the raw MS data, finding peaks in an unbiased way, aligning peaks and matching peaks across samples. In differential analysis, the peak lists of the treated and control groups were compared to create a differential peak list.
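As a rough illustration of the last step of this workflow, the sketch below builds a differential peak list from two peak lists that are assumed to be already detected and aligned. It is written in Python rather than Matlab, and the `Peak` type, the tolerances, the fold-change threshold and the example values are hypothetical choices made for illustration; they are not the parameters or the algorithm of the program developed in this research.

```python
from typing import NamedTuple

class Peak(NamedTuple):
    mz: float         # mass-to-charge ratio
    rt: float         # retention time in minutes
    intensity: float  # peak height or area, arbitrary units

def find_match(peak, peaks, mz_tol=0.5, rt_tol=0.2):
    """Return the peak closest in retention time within both tolerances, or None."""
    candidates = [
        p for p in peaks
        if abs(p.mz - peak.mz) <= mz_tol and abs(p.rt - peak.rt) <= rt_tol
    ]
    return min(candidates, key=lambda p: abs(p.rt - peak.rt), default=None)

def differential_peaks(treated, control, min_fold=5.0):
    """Keep treated peaks that are absent from the control or at least min_fold higher."""
    induced = []
    for peak in treated:
        match = find_match(peak, control)
        if match is None or peak.intensity >= min_fold * match.intensity:
            induced.append(peak)
    return induced

# Hypothetical peak lists (values invented for illustration).
control = [Peak(255.1, 12.0, 1000.0), Peak(271.1, 14.5, 800.0)]
treated = [
    Peak(255.1, 12.1, 1100.0),  # present in both groups at a similar level
    Peak(271.1, 14.6, 6000.0),  # strongly increased by the treatment
    Peak(339.1, 21.3, 2500.0),  # found only in the treated group
]

induced = differential_peaks(treated, control)
for peak in induced:
    print(peak)
```

In this sketch a peak enters the differential list either because no control peak lies within the tolerances or because its intensity exceeds the matched control peak by the chosen factor; both criteria would need to be tuned to the instrument and the data.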

The outputs of metabolomic bioinformatics tools are data sets that include a large number of variables (metabolites). In order to extract the relevant information from this vast number of variables, univariate methods (e.g., ANOVA and Student t-test) and multivariate methods (e.g., PCA, PLS and OPLS) are widely used in metabolomics studies (Mendes et al., 2005; Sumner et al., 2007). Univariate methods can be used to discover potential candidate compounds whose abundance differs significantly between two sub-groups of samples. Multivariate methods are believed to be capable of capturing correlation patterns between variables so that the analytical information of biological relevance can be highlighted (Okada et al., 2010; Eliasson et al., 2011).

There are two types of multivariate methods, namely unsupervised and supervised methods. Unsupervised methods aim to reveal the structure of the data without measuring or observing any related outcome. As there is no specified class label or response, the results of unsupervised methods show samples that naturally fall together. In other words, unsupervised methods find the natural partitions of patterns to facilitate the understanding of the relationship between the samples and to highlight the variables that contribute to these relationships. In contrast, supervised methods require class labels to indicate sub-groups. The classification procedure aims to produce the best separation between sub-groups based on a training set of samples that are identified by known class labels, and to highlight the variables that contribute to this separation. Because of this feature, supervised methods can always generate score plots with better separation between the sub-groups than unsupervised methods. Although supervised methods are widely used in metabolomics, this feature also makes their score plots too optimistic to be relied upon (Westerhuis et al., 2008). The problem is that metabolomics data sets almost always span a high-dimensional multivariate space, so that a perfect separation is almost always possible among the relatively small number of samples. Supervised methods have no difficulty finding this separation (Lai et al., 2006; Klement et al., 2008; Calle and Urrea, 2011). More interestingly, Lai et al. (2006) illustrated that, contrary to several previous studies, univariate selection approaches yielded consistently better results than multivariate approaches in five of seven data sets. They concluded that the correlation structures, if present, are difficult to extract because of the small number of samples (Lai et al., 2006). From a statistical point of view, this so-called dimensionality problem arises because the sample size is much smaller than the number of variables, which is nearly always the case in metabolomics.
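The dimensionality problem can be illustrated with a small simulation. In the sketch below, a simple perceptron stands in for a supervised classifier (it is not PLS-DA or OPLS-DA), the data are random noise, and the class labels are assigned arbitrarily; the sample and variable counts are assumptions chosen only to mimic a data set with far more variables than samples.

```python
import random

def train_perceptron(rows, labels, max_epochs=1000):
    """Fit a linear rule sign(w.x + b) with the classic perceptron update."""
    weights = [0.0] * len(rows[0])
    bias = 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for x, y in zip(rows, labels):
            score = sum(w * v for w, v in zip(weights, x)) + bias
            if y * score <= 0:  # misclassified, or exactly on the boundary
                weights = [w + y * v for w, v in zip(weights, x)]
                bias += y
                mistakes += 1
        if mistakes == 0:  # every training sample is on the correct side
            break
    return weights, bias

def accuracy(rows, labels, weights, bias):
    """Fraction of samples placed on the side given by their label."""
    correct = sum(
        1 for x, y in zip(rows, labels)
        if y * (sum(w * v for w, v in zip(weights, x)) + bias) > 0
    )
    return correct / len(rows)

random.seed(0)
n_samples, n_variables = 6, 50  # assumed sizes: far more variables than samples

# Pure noise: no variable carries any class information.
noise = [[random.gauss(0.0, 1.0) for _ in range(n_variables)]
         for _ in range(n_samples)]

# Arbitrary labels assigned without reference to the data.
arbitrary_labels = [1, -1, 1, -1, 1, -1]

weights, bias = train_perceptron(noise, arbitrary_labels)
training_accuracy = accuracy(noise, arbitrary_labels, weights, bias)
print("training accuracy on noise:", training_accuracy)
```

Because a handful of points in general position in a 50-dimensional space can be separated by a hyperplane for any labeling, the fitted rule is expected to classify every training sample correctly even though the variables carry no class information. A clean separation in a supervised score plot is therefore weak evidence on its own.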

Therefore, only the unsupervised PCA method was employed in the present research to visualize sub-groups. Any potential variable of interest indicated by PCA was checked with the Student t-test to confirm that its abundance differed significantly between two sub-groups.
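A minimal sketch of this two-step procedure is given below. The intensity table is invented for illustration, the first principal component is approximated by power iteration on mean-centered, unscaled data, and the helper names are hypothetical; the sketch does not reproduce the scaling or the software settings used in this research.

```python
import math
import statistics

def center_columns(rows):
    """Subtract the column mean from every entry."""
    means = [statistics.fmean(column) for column in zip(*rows)]
    return [[value - mean for value, mean in zip(row, means)] for row in rows]

def first_component(rows, iterations=100):
    """Approximate the PC1 loadings and scores by power iteration on X^T X."""
    centered = center_columns(rows)
    n_vars = len(centered[0])
    loading = [1.0 / math.sqrt(n_vars)] * n_vars
    for _ in range(iterations):
        scores = [sum(v * l for v, l in zip(row, loading)) for row in centered]
        updated = [sum(s * row[j] for s, row in zip(scores, centered))
                   for j in range(n_vars)]
        norm = math.sqrt(sum(u * u for u in updated))
        loading = [u / norm for u in updated]
    scores = [sum(v * l for v, l in zip(row, loading)) for row in centered]
    return loading, scores

def student_t(group_a, group_b):
    """Two-sample Student t statistic with pooled variance."""
    na, nb = len(group_a), len(group_b)
    pooled = ((na - 1) * statistics.variance(group_a)
              + (nb - 1) * statistics.variance(group_b)) / (na + nb - 2)
    difference = statistics.fmean(group_a) - statistics.fmean(group_b)
    return difference / math.sqrt(pooled * (1 / na + 1 / nb))

# Hypothetical intensities: rows are samples, columns are three metabolites.
control = [[100.0, 50.0, 30.0], [110.0, 52.0, 29.0], [90.0, 48.0, 31.0]]
treated = [[900.0, 51.0, 30.0], [1000.0, 49.0, 31.0], [1100.0, 50.0, 29.0]]

loading, scores = first_component(control + treated)

# The variable with the largest absolute PC1 loading is the candidate of interest.
top = max(range(len(loading)), key=lambda j: abs(loading[j]))

# Confirm the candidate with a univariate test on the original intensities.
t_value = student_t([row[top] for row in treated], [row[top] for row in control])
print("candidate variable:", top, "t statistic:", round(t_value, 2))
```

The t statistic is then compared with the critical value for the appropriate degrees of freedom (2.776 for a two-sided test at the 0.05 level with 4 degrees of freedom, as in this example with three samples per group).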

The most difficult task in LC-MS based metabolomics is to identify potential biomarkers of interest (Wikoff et al., 2009). The most efficient and reliable means of identification is co-characterization with authentic compounds. However, the biggest challenge in plant metabolomics is the lack of authentic compounds, particularly when untargeted metabolomics is conducted. Dedicated software and databases are therefore now starting to be implemented on a large scale to facilitate tentative compound identification by LC-MS. They are summarized in Table 1.1. Most of the software is designed to predict and analyze compound fragments and to link these directly to open source databases. These databases allow a search on the basis of molecular mass and tandem mass or MSn fragment information, which are essential, but not usually sufficient by themselves, to identify a compound structure. Indeed, there are sometimes hundreds of possible structures even for a mass entered with three-digit precision. Another problem is that fragmentation differs among the instruments on the market, and even for the same instrument under different operating conditions. The type of spectrometer and the operating conditions should be considered carefully to avoid misinterpretation when possible structures are assigned by comparing experimental data with databases. Consequently, the assignment of metabolites from experimental data still requires intensive manual effort (Moco et al., 2007).

Table 1.1. Software and databases for LC-MS tentative compound identification

Software              Basic features
ChemDraw              Calculates exact masses for structures and fragments
MathSpec              De novo structure determination from exact mass fragmentation
Mass Frontier         Interpretation of MSn spectra and predicted fragmentation patterns and pathways
ACD MS Manager        Tandem MS interpretation

Metabolite databases  Website
METLIN                http://metlin.scripps.edu/
HMDB                  http://www.hmdb.ca/search/
KEGG                  http://www.genome.jp/kegg/ligand.html#rdb
MassBank              http://www.massbank.jp/index.html

Thus, compound identification procedures in the research described here included molecular formula calculations, metabolite database searches, co-characterization with authentic compounds, and NMR for de novo structure elucidation. First, the elemental composition can be computed from accurate-mass determination and isotopic patterns. Candidate molecular formulae were filtered according to the seven golden rules (Kind and Fiehn, 2007). If the molecular formula did not correspond to any known soybean metabolite observed by LC-MS in previous publications (Table 1.2), the metabolite was purified by preparative HPLC and NMR was then employed to elucidate its structure.

Table 1.2. Known soybean defense-related metabolites observed by LC-MS in previous publications.

Columns: Identified compounds, Molecular weight, Observed m/z

416 417 [M+H]+ 7-O-Glucosyl-6”-O-malonate daidzein 254 255 [M+H]+ 432 433 [M+H]+ 7-O-Glucosyl-6”-O-malonate genistein 518 519 [M+H]+ 7-O-Glucosyl-6”-O-malonate 516 517 [M+H]+ Daidzein 254 255 [M+H]+ Genistein 270 271 [M+H]+ Coumestrol 268 269 [M+H]+ Formononetin 268 269 [M+H]+
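As an illustration of the molecular formula step, the sketch below computes the monoisotopic mass of a candidate formula and the m/z expected for its [M+H]+ ion, and then expresses the deviation of an observed value in parts per million. The helper names and the observed m/z value are hypothetical; the formulae C15H10O4 and C15H10O5 are those of daidzein and genistein, two of the compounds listed in Table 1.2.

```python
import re

# Monoisotopic masses (u) of the most abundant isotope of each element.
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "N": 14.00307401, "O": 15.99491462}
PROTON = 1.00727647  # mass of a proton (u)

def monoisotopic_mass(formula):
    """Sum monoisotopic masses for a simple formula such as 'C15H10O4'."""
    total = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += MONOISOTOPIC[element] * (int(count) if count else 1)
    return total

def protonated_mz(formula):
    """m/z of the singly charged [M+H]+ ion."""
    return monoisotopic_mass(formula) + PROTON

def ppm_error(observed, theoretical):
    """Relative mass deviation in parts per million."""
    return (observed - theoretical) / theoretical * 1e6

daidzein_mz = protonated_mz("C15H10O4")
genistein_mz = protonated_mz("C15H10O5")

# Hypothetical accurate-mass reading for a peak suspected to be daidzein.
observed_mz = 255.0660
deviation = ppm_error(observed_mz, daidzein_mz)

print(f"daidzein  [M+H]+ = {daidzein_mz:.4f}")
print(f"genistein [M+H]+ = {genistein_mz:.4f}")
print(f"deviation of the observed peak = {deviation:.1f} ppm")
```

The nominal values of these ions (255 and 271) agree with the observed m/z entries for daidzein and genistein in Table 1.2. In practice a candidate formula would be retained only if the deviation falls within the mass accuracy of the instrument, and the isotopic pattern would be checked as well.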

The first objective of the present research was to comprehensively study the soybean metabolites induced by lactofen, a herbicide that induces disease resistance in soybean. Lactofen has been observed to induce soybean resistance in the field against white mold (caused by Sclerotinia sclerotiorum) (Dann et al., 1999) and the soybean cyst nematode (Levene et al., 1998). Previous studies on the biochemical mechanisms of induced resistance in soybean have been based on targeted HPLC analysis, mainly focusing on the well-known phytoalexins, the glyceollins, and on isoflavonoids (Dann et al., 1999; Landini et al., 2002; Nelson et al., 2002). Increases in glyceollin and isoflavonoid levels were the result of enhanced expression of secondary metabolite biosynthetic genes (Graham, 2005). Based on these studies, it was clear that the soybean secondary metabolite pathways could be strongly affected by lactofen. Therefore, it was hypothesized that a successful global metabolomic study could find more lactofen-induced soybean metabolites, so that a better understanding of soybean disease resistance could be achieved.

With the successful completion of this first objective, more inducible soybean defense-related metabolites were discovered. Although inducible soybean secondary metabolites are important for soybean defense responses and have many beneficial effects on human health, no method had previously been developed to examine these inducible defense-related metabolites comprehensively. The second objective of this research was thus to develop an approach utilizing HPLC-ESI-MS and NMR for comprehensive profiling, purification and systematic identification of soybean defense-related metabolites, and to use this approach to study the metabolite profiles of Phytophthora sojae-resistant (Williams 82) and susceptible (Williams) soybean cultivars treated with the major defense-inducing elicitor, the cell wall glucan from Phytophthora. The hypothesis was that there are key soybean secondary metabolites contributing to the differences in resistance to Phytophthora sojae. The outcome of this work will not only add structural information on soybean secondary metabolites that should greatly benefit future soybean chemical component studies, but also provide a better understanding of how inducible secondary metabolites may be involved in Phytophthora sojae resistance.

The realization of the first two objectives illustrates the ability of soybean to produce new metabolites with important bioactivities. Since these metabolites are all inducible, and most of them share very similar structures that hamper general metabolomics analysis, the third objective of the present research was to optimize a metabolomics strategy for the discovery of new natural products and to apply the optimized strategy to examine over 25 exogenous elicitors, signal molecules, and signal transduction regulators for their effects on wound-, light- and glucan defense elicitor-induced metabolites in soybean. My expectation was that the results of this proposed work would lead to the discovery of more unknown metabolites in soybean and provide more details of the mechanisms involved in the induction of soybean secondary metabolites.


Chapter 2

Potential defense-related prenylated isoflavones in lactofen-induced soybean

Abstract

An integrated LC-MS and NMR metabolomic study was conducted to investigate metabolites whose formation was induced by lactofen (1), a soybean disease resistance-inducing herbicide. First, LC-MS analyses of control and lactofen (1)-induced soybean extracts were performed. The LC-MS raw data were then processed by a custom-designed bioinformatics program to detect the induced metabolites. Finally, structures of unknown induced metabolites were determined on the basis of their 1D and 2D NMR spectroscopic data. Two previously unreported compounds, 7,8-dihydroxy-4'-methoxy-3'-prenylisoflavone (2) and 7-hydroxy-4',8-dimethoxy-3'-prenylisoflavone (3), were elucidated together with four known prenylated compounds, 3'-prenyldaidzein (4), 8-prenyldaidzein (5), 3'-prenylgenistein (6), and 4-prenylcoumestrol (7). Compounds (2-6) are reported for the first time in soybean. The 13C NMR chemical shift assignments for compound (7) are reported for the first time. Formation of these six prenylated compounds was also induced by the primary defense glucan elicitor from the cell wall of the pathogen Phytophthora sojae (Kauf. and Gerde.), further suggesting a potential role in soybean defense. These results highlight the metabolic flexibility within soybean secondary product pathways and suggest that prenylation may be associated with defense responses. Moreover, this study demonstrates a promising future approach using metabolomics on elicitor-induced plants for discovery of unknown compounds even in relatively well studied plants.

Introduction

Lactofen (1) is a selective, broad-spectrum herbicide that is used as a postemergence treatment for control of broadleaf weeds in soybean fields (Al-Khatib et al., 2000). In the field, application of lactofen (1) has been observed to induce soybean resistance against several diseases such as white mold (caused by Sclerotinia sclerotiorum) (Dann et al., 1999) and the soybean cyst nematode (Levene et al., 1998). The biochemical mechanisms of the induced resistance have been studied. The glyceollins, which are the primary soybean phytoalexins, were reported to accumulate in lactofen (1)-treated leaves in the field (Dann et al., 1999; Nelson et al., 2002). Further metabolite profiling of lactofen (1)-treated soybean showed that lactofen (1) induces multiple and dramatic effects on overall isoflavonoid metabolism in soybean (Landini et al., 2003). The increase in isoflavone levels induced by lactofen (1) was the result of enhanced expression of secondary metabolite synthetic genes (Graham, 2005). Together these results indicate that the soybean secondary metabolite grid is highly affected by lactofen (1). Therefore, we reasoned that a comprehensive study of the soybean metabolites induced by lactofen (1) might not only lead to a better understanding of lactofen (1)-induced disease resistance, but also to the discovery of new metabolites in soybean.

Metabolomics aims to study the full complement of metabolites present in an organism under given biological conditions. This approach has gained an ever-increasing interest and has truly established itself as a valuable tool for plant functional genomics and studies of plant biochemical composition (Fiehn et al., 2000; Dixon and Sumner, 2003). Among a variety of analytical techniques, liquid chromatography coupled with mass spectrometry (LC-MS) offers the best combination of sensitivity and broad detection range, and therefore is of particular importance for non-targeted plant metabolomics (Ara et al., 2009; Allwood and Goodacre, 2010). LC-MS has already been established as a powerful tool for the analysis of a wide range of plant secondary metabolites (Tolstikov et al., 2003; Beekwilder et al., 2005; Allwood et al., 2006; Dixon et al., 2006; Hall, 2006; Moco et al., 2006; De Vos et al., 2007). However, LC-MS based plant metabolomic analyses still face challenges. The main challenges are the complexity of plant secondary metabolites and limitations in the capacity for their rapid identification (Moco et al., 2007; Bottcher et al., 2008). In contrast to LC-MS, NMR techniques are of low sensitivity, although they provide rich structural information and are capable of rigorous metabolite identification. Therefore, the combination of NMR and LC-MS offers complementary advantages in plant metabolomic analysis.

In this chapter, formation of the soybean metabolites induced by lactofen (1) was investigated using complementary LC-MS and NMR approaches. Through our discovery of prenylated isoflavones previously unreported in soybean, our knowledge of the biochemical complexity of lactofen (1)-induced responses and potential soybean defense responses was extended significantly. Additionally, the potential use of elicitor-induced plants for the discovery of previously uncharacterized compounds was demonstrated.

Materials and methods

General instrumentation

LC-ESI-MS analysis was carried out on a Varian 500-MS system, which consists of a Varian 212-LC Binary Gradient LC-MS pump, a Prostar 420 Autosampler and an Ion Trap Mass Spectrometer equipped with an ESI source. LC was achieved on a C18 reversed-phase column (LiChrosorb RP-18, 10 µ, 250 mm X 4.6 mm, Alltech Associates, Deerfield, Illinois). High resolution ESIMS (positive mode) was performed on a Micromass Q-TOF mass spectrometer. NMR experiments were conducted at 25°C on a Bruker DMX-600 spectrometer equipped with a 5 mm triple-resonance probe and three-axis gradient coils, operating at 600.13 MHz for 1H and 150.90 MHz for 13C. The analysis of compound (2) was repeated on a Bruker DRX-800 due to a low sample concentration. This spectrometer, operating at 800.13 MHz for 1H and 201.19 MHz for 13C, was equipped with a 5 mm triple-resonance cryoprobe and a z-axis gradient coil. The experiments generally included (1) 1D 1H NMR and 1D 13C NMR, (2) 2D 1H homonuclear phase-sensitive or magnitude-mode COSY, NOESY or ROESY (240 ms mixing time), and TOCSY (DIPSI2 with 60 ms mixing time), and (3) 2D 1H-13C heteronuclear HMBC (optimized for small J-observation) and HSQC. The solvent peak was used as internal reference for both 1H and 13C NMR chemical shifts.
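As a quick consistency check on the operating frequencies quoted above, the 13C frequency of a spectrometer can be estimated from its 1H frequency using the ratio of the two nuclei's resonance frequencies. The sketch below assumes a 13C/1H frequency ratio of 0.25145020 (the commonly tabulated value relative to TMS); the function name is illustrative and is not part of the original work.

```python
# Assumed 13C/1H resonance-frequency ratio (tabulated value of about 25.145020 %).
C13_TO_H1_RATIO = 0.25145020


def carbon_frequency_mhz(proton_frequency_mhz: float) -> float:
    """Estimate the 13C operating frequency from the 1H operating frequency."""
    return proton_frequency_mhz * C13_TO_H1_RATIO


# The two 1H frequencies stated in the text for the DMX-600 and DRX-800.
for proton_mhz in (600.13, 800.13):
    print(f"{proton_mhz:.2f} MHz 1H -> {carbon_frequency_mhz(proton_mhz):.2f} MHz 13C")
```

Under this assumption the estimates agree with the 150.90 MHz and 201.19 MHz values given for the two instruments.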

Chemicals and elicitor preparation

Unless otherwise noted, all chemicals were purchased from Sigma Chemical Company, St. Louis, MO, USA. Lactofen (1) was obtained from Valent Technologies (Walnut Creek, CA, USA). For application to plant tissues, it was dissolved in i-PrOH, and then diluted with H2O to a final concentration of 0.5% i-PrOH. Intact wall glucan elicitor (WGE) was prepared from cell walls of 1 of Phytophthora sojae as described previously (Graham and Graham, 1996). Before use, the unfractionated and insoluble wall glucan preparation was sonicated and then autoclaved for 3 h in deionized double distilled water (Ayers et al., 1976).

Plants

Soybean seeds (cultivar Williams) were kindly provided by Dr. Anne Dorrance (Department of Plant Pathology, The Ohio State University and the Ohio Agricultural Research and Development Center, Wooster, OH). Seedlings were grown as described previously (Graham and Graham, 1996) at 26 °C with 500 µE m-2 s-1 of light and a 14-hour photoperiod. Flats were immediately watered very thoroughly for . Plants were watered every other day from the top with ½ teaspoon per gallon of Peter’s (20-20-20) fertilizer.

Snapped cotyledon assay

The snapped cotyledon assay was performed as described previously (Graham and Graham, 1996). Cotyledons were harvested from seven-day-old seedlings in small batches and used immediately. Each individual cotyledon was snapped into two pieces at a point around 1/3 of the length away from the point of attachment of the petiole. The smaller section was placed petiole side down into 0.55% water agar. 10 µl of the elicitor solution was applied across the exposed surface of each cotyledon. After treatment, incubation was in constant light at 400 µE/m2/s. Samples from the cotyledon assays were harvested at 48 h as described below.

LC-ESI-MS analysis

The uppermost 0.5 mm of the treated surface of each cotyledon was sliced off to provide the proximal cell layer (Graham and Graham, 1991, 1996). The cell layers from 10 replicate cotyledons were then pooled as one sample, and each sample was weighed directly into a 2 ml microfuge tube. Extraction was performed directly in the microfuge tubes in EtOH-H2O (80:20, v/v, 800 µL per 50 mg tissue) using a polypropylene pestle (Kontes) mounted on a rechargeable hand drill. Extracts were centrifuged at 18,000 g for 5 min and 50 µL of supernatant was analyzed by LC-ESI-MS. An 80 minute run was developed based on a previous HPLC method (Graham, 1991). The flow rate was 0.3 ml/min. Solvent A = 0.01% HCO2H and solvent B = CH3CN. The linear gradient program was as follows: A = 85% at t = 0 min; A = 45% at t = 35 min; A = 0% at t = 65 min; A = 0% at t = 70 min; A = 85% at t = 80 min. The 0.01% HCO2H was included to protonate metabolites and serve as an ion-pairing agent for positive-ion mode MS. The positive-ion mode was used due to its higher sensitivity for the vast majority of soybean metabolites. Other MS parameters were drying gas 20 psi at 350 °C, nebulizing gas 40 psi, needle voltage 5000 V, capillary voltage 80 V, and spray shield voltage 600 V. These were optimized based on examination of the full scan intensity, stability, and the product ion spectra.
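The linear gradient program above can be read as a piecewise-linear function of time. The following is a minimal sketch, assuming straight-line interpolation between the stated breakpoints; the names GRADIENT, percent_a and percent_b are illustrative and do not come from the instrument software.

```python
# Gradient breakpoints from the text: (time in min, percent solvent A).
GRADIENT = [(0, 85), (35, 45), (65, 0), (70, 0), (80, 85)]


def percent_a(t_min: float) -> float:
    """Percent of solvent A (0.01% HCO2H) at time t_min, by linear interpolation."""
    if not GRADIENT[0][0] <= t_min <= GRADIENT[-1][0]:
        raise ValueError("time is outside the 0-80 min run")
    for (t0, a0), (t1, a1) in zip(GRADIENT, GRADIENT[1:]):
        if t0 <= t_min <= t1:
            return a0 + (a1 - a0) * (t_min - t0) / (t1 - t0)
    raise AssertionError("unreachable: breakpoints cover the whole run")


def percent_b(t_min: float) -> float:
    """Percent of solvent B (CH3CN); A and B are assumed to sum to 100 %."""
    return 100.0 - percent_a(t_min)


# Example: solvent composition halfway through the second gradient segment.
print(percent_a(50), percent_b(50))
```

Such a helper is convenient for estimating the solvent composition at which a given peak elutes, for example when transferring the method to a preparative run.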

LC-MS Data processing and differential analysis

LC-MS raw data were processed by a custom-designed bioinformatics program (Chen et al., 2006) with modifications. The program was run in Matlab version R2008a (The Mathworks, Inc., Natick, MA). The workflow included reading the raw MS data, finding peaks in an unbiased way, aligning peaks and matching peaks across samples. In the differential analysis, peaks with signal-to-noise (S/N) values lower than 10 were rejected to avoid interference from noise. Eight replicates were used for each of the treated and control groups to exclude peak differences caused by sample variation. Only peaks appearing in all eight replicates of a group were kept after the peak matching step. Finally, the peak lists of the treated and control groups were compared, creating a differential peak list that contained the eight peaks annotated in Figure 2.2.
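The filtering logic of the differential analysis can be sketched as follows. This is only an illustration in Python, not the Matlab program used in this work: it assumes that peak matching has already assigned a shared ID to each peak, and that the differential list is the set of peaks kept in the treated group but absent from the kept control peaks. All names and the toy data are hypothetical.

```python
from typing import Dict, Hashable, List, Set

# One replicate: matched peak ID -> signal-to-noise ratio.
Replicate = Dict[Hashable, float]


def reproducible_peaks(replicates: List[Replicate], min_snr: float = 10.0) -> Set[Hashable]:
    """Peak IDs with S/N >= min_snr in every replicate of one group."""
    kept = [
        {peak for peak, snr in replicate.items() if snr >= min_snr}
        for replicate in replicates
    ]
    return set.intersection(*kept) if kept else set()


def differential_peaks(treated: List[Replicate], control: List[Replicate],
                       min_snr: float = 10.0) -> Set[Hashable]:
    """Peaks reproducibly present in the treated group but not in the control group."""
    return reproducible_peaks(treated, min_snr) - reproducible_peaks(control, min_snr)


# Toy data with two replicates per group (the study used eight); IDs are invented.
treated = [{"p1": 50.0, "p2": 30.0, "p3": 12.0}, {"p1": 45.0, "p2": 28.0, "p3": 8.0}]
control = [{"p1": 48.0}, {"p1": 52.0, "p2": 4.0}]
print(sorted(differential_peaks(treated, control)))
```

In the toy data, p3 is dropped because it falls below the S/N threshold in one treated replicate, and p1 is dropped because it is also reproducible in the control group.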

Compound purification

In order to obtain adequate amounts of purified unknown compounds so that 2D NMR spectroscopy could be used, 2000 lactofen (1)-treated cotyledons were prepared using the cut cotyledon protocol described previously (Graham and Graham, 1991). Approximately 100 g of fresh tissue was ground and extracted with EtOH-H2O (1 L, 95:5, v/v) using a mixer (Sorvall 17150 Omni Newtown, Connecticut) for 1 h at room temperature. After centrifugation at 5000 g for 10 minutes, the supernatant was reduced in volume to give a 50 ml suspension by rotary evaporation. The first step of isolation was done by the HPLC method described below. A manual injection valve with a 2 ml loop was connected to an analytical C18 reversed-phase column (LiChrosorb RP-18, 10 µ, 250 mm X 4.6 mm, Alltech Associates, Deerfield, Illinois). The injected 2 ml sample was eluted at 1.5 ml/min with a linear gradient. Solvent A = H2O and solvent B = CH3CN. The linear gradient program was as follows: A = 100% at t = 0 min; A = 100% at t = 1 min; A = 30% at t = 5 min; A = 18% at t = 16 min; A = 0% at t = 16 min, held for 6 minutes. Fractions from 11 min to 17 min that contained compounds (2-7) were collected and lyophilized, giving 128 mg of residue. The residue was dissolved in MeOH (10 ml) and diluted with H2O to 50 ml. The second isolation step used the same equipment, but with the linear gradient described in LC-ESI-MS analysis. Since a manual injection valve with a 2 ml loop was added, each peak eluted 7 min later than the retention time shown in Figure 2.2. Repetitive isolation produced compound (2) (1.5 mg) at 61 min, compound (3) (2.5 mg) at 68 min, compound (6) (5.5 mg) at 64 min, compound (7) (21 mg) at 63 min and compounds (4-5) (18 mg) at 58 min. Additional steps were then taken to separate compounds (4) and (5) by gradient elution with A = H2O and B = MeOH. The linear gradient program was as follows: A = 30% at t = 0 min and A = 20% at t = 44 min. Compound (4) (8 mg) was collected at 38 min and compound (5) (3 mg) was collected at 41 min.

7,8-dihydroxy-4'-methoxy-3'-prenylisoflavone (2)

For 1H and 13C NMR spectroscopic data: see Table 2.1; LC-ESI-MS: m/z 353 [M+H]+ (100), 285 [M+H-68]+ (7), 297 [M+H-56]+ (7); HRESIMS: m/z 353.1378 [M+H]+ (calc. for C21H21O5, 353.1389).

7-hydroxy-4',8-dimethoxy-3'-prenylisoflavone (3)

For 1H and 13C NMR spectroscopic data: see Table 2.1; LC-ESI-MS: m/z 367 [M+H]+ (100), 299 [M+H-68]+ (7), 337 [M+H-30]+ (5), 311 [M+H-56]+ (7); HRESIMS: m/z 367.1549 [M+H]+ (calc. for C22H23O5, 367.1545).
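The agreement between the observed and calculated HRESIMS values for compounds (2) and (3) can be expressed as a mass error in parts per million. The sketch below is illustrative only: it assumes standard monoisotopic atomic masses and simply sums them over the ion formula without an electron-mass correction, which appears to reproduce the calculated values quoted above. The helper names are hypothetical.

```python
import re

# Monoisotopic masses (u) of the most abundant isotopes.
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503207, "O": 15.99491461956}


def formula_mass(formula: str) -> float:
    """Sum of monoisotopic atomic masses for a simple formula such as 'C21H21O5'."""
    total = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += MONOISOTOPIC[element] * (int(count) if count else 1)
    return total


def ppm_error(observed: float, calculated: float) -> float:
    """Mass error in parts per million."""
    return (observed - calculated) / calculated * 1e6


# Ion formulae and observed m/z values as reported for compounds (2) and (3).
for label, formula, observed in (("(2)", "C21H21O5", 353.1378),
                                 ("(3)", "C22H23O5", 367.1549)):
    calc = formula_mass(formula)
    print(f"{label} {formula}: calc {calc:.4f}, error {ppm_error(observed, calc):+.1f} ppm")
```

Both errors come out within a few ppm, which is the kind of tolerance typically used when narrowing candidate molecular formulae.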

4-prenylcoumestrol (7)

13C NMR (DMSO-d6, 150.90 MHz) δ 17.8 (C-4´), 21.9 (C-1´), 25.4 (C-5´), 98.6 (C-10), 101.6 (C-6a), 104.1 (C-11b), 112.8 (C-2), 113.8 (C-8), 114.5 (C-6b), 115.6 (C-4), 119.6 (C-1), 120.5 (C-7), 121.2 (C-2´), 131.5 (C-3´), 152.2 (C-4a), 155.8 (C-10a), 156.9 (C-9), 157.5 (C-6), 158.8 (C-3), 159.9 (C-11a). LC-ESI-MS: m/z 337 [M+H]+ (100), 281 [M+H-56]+ (15).

Figure 2.1. Structures of lactofen (1) and compounds (2-9). Substituents on the isoflavone skeleton of compounds (2-6):

(2) R1 = OH, R2 = H, R3 = CH2CH=C(CH3)2, R4 = CH3
(3) R1 = OCH3, R2 = H, R3 = CH2CH=C(CH3)2, R4 = CH3
(4) R1 = H, R2 = H, R3 = CH2CH=C(CH3)2, R4 = H
(5) R1 = CH2CH=C(CH3)2, R2 = H, R3 = H, R4 = H
(6) R1 = H, R2 = OH, R3 = CH2CH=C(CH3)2, R4 = H


Results and discussion

Lactofen-induced metabolites detected by LC-MS coupled with a bioinformatics-based analysis

While the effects of chemicals on metabolites are often examined in cell culture, we wished to follow the effects of lactofen (1) treatment on whole plant tissues. The snapped soybean cotyledon protocol was specifically developed to examine the effects of various chemical and biotic elicitors under highly defined conditions. This protocol, which has been very useful in cellular and biochemical studies on soybean defense metabolism (Graham and Graham, 1996), involves direct treatment of freshly exposed and minimally wounded mesophyll parenchyma cells with a test chemical. The response of such snapped cotyledon cells to lactofen (1) was examined at a concentration of 200 µM, which was chosen based on dose-response studies (Landini et al., 2003).

A comprehensive study of soybean secondary metabolites requires high resolution and sensitivity. Electrospray ionization (ESI) is particularly well suited to the analysis of a wide range of metabolites (Allwood and Goodacre, 2010). Therefore, LC-ESI-MS was employed and all parameters of this system were optimized with consideration of both sensitivity and resolution. The base ion chromatograms from the final LC-ESI-MS method are illustrated in Figure 2.2. The LC-ESI-MS protocol on crude soybean aqueous ethanol extracts resulted in a large amount of data. To find all new inducible soybean metabolites, a modified bioinformatics program (Chen et al., 2006), which picks all new metabolites in a treated group compared to a control group, was used. Results, with metabolites of interest labeled, are shown in Figure 2.2. The strongly induced formation of at least eight metabolites was identified. Among them, peaks (7) and (8) were identified as corresponding to glyceollin I (8) and II (9) through comparison to previous reports (Graham, 1991; Landini et al., 2003).

[Figure 2.2, panels (A) and (B): base ion chromatograms plotting intensity against retention time.]

Figure 2.2. Base ion chromatograms from positive ion LC-ESI-MS analyses: (A) an extract of cotyledons that were treated with lactofen; (B) an extract of cotyledons that were treated with water. Tissues were harvested at 48 h. Assigned peaks were those predominantly induced by lactofen treatment. Peaks (1)-(6) were unknown. Peaks (7) and (8) were identified as glyceollin I (8) and II (9) by comparison to previous reports.


Two previously unidentified and four known prenylated compounds were detected by NMR spectroscopy

The six unknown induced components (peaks 1-6) were purified by preparative HPLC, and their structures were elucidated by 1D and 2D NMR spectroscopic analysis, as exemplified by the novel compound (2) dissolved in methanol-d4. Excluding the solvent signal, the 13C-NMR spectrum displayed a total of 21 resonances, of which 11 were proton-attached, as evident in the 2D 1H-13C HSQC spectrum. The number of non-labile protons was determined to be 18, including those from three methyl resonances and six aromatic spins. The latter were grouped into “A” [δ 8.19], “AB” [δ 6.96 (d, J=8.8 Hz), and 7.60 (d, J=8.8 Hz)], and “ABX” [δ 6.98 (d, J=8.2 Hz), 7.29 (d, J=2.3 Hz), and 7.33 (dd, J=8.2, 2.3 Hz)] spin systems on the basis of J-coupling correlations. First, the aromatic singlet at δ 8.19, correlating to a 13C NMR signal at δ 154.3 in the HSQC spectrum, is a characteristic H-2 resonance of an isoflavone, thus suggesting the presence of an isoflavone skeleton. Second, the H-2 signal exhibited an NOE cross-peak to the proton at δ 7.29 and another weaker one to the proton at δ 7.33, indicating that the “ABX” spin system is on the B-ring. These two yet-to-be-determined aromatic resonances were meta-coupled with a small J coupling constant, while one of them (δ 7.33) was also ortho-coupled with the resonance at δ 6.98, leading to the assignment of H-2´ (δ 7.29), H-6´ (δ 7.33), and H-5´ (δ 6.98). Last, the two remaining aromatic spins were ortho-coupled and thus could only be located at C-5 and C-6 of the A-ring. Since the spin at δ 7.60 showed a strong long-range correlation to the most downfield 13C NMR resonance at δ 178.4 (C-4), it was consequently assigned to H-5 and that at δ 6.96 to H-6.

The presence of a γ,γ-dimethylallyl group [–CH2CH=C(CH3)2] was readily established from its distinctive chemical shifts: two methyl signals at δH 1.72 and 1.73, a methylene doublet at δH 3.34, and a triplet at δH 5.31. The two methyl resonances exhibited NOE correlations to H-2″ and H-1″, respectively, enabling the assignment of H-5″ (δ 1.72) and H-4″ (δ 1.73). An –OCH3 substituent was evident from the observation of a cross-peak at δH 3.87 (s, 3H) and δC 55.9 in the HSQC spectrum. Because this methoxy group showed an NOE correlation to H-5´, whereas H-1″ of the above prenyl group had an NOE to H-2´, a 3'-prenyl and 4´-methoxy disubstitution pattern was inferred for the B-ring. Additional confirmation came from HMBC data. For instance, a strong long-range correlation was observed between H-2´ and C-1″. Finally, the large downfield shift of C-8 (δC 133.9) with respect to an unsubstituted carbon [i.e., C-8 at δ 103.4 in compound (4)], together with C-7 resonating at δ 151.4, was diagnostic for a C-7 and C-8 oxygenated A-ring. Taking account of the overall molecular weight, hydroxyl substituents were assigned at both positions. The 13C chemical shifts of the ten quaternary carbons (e.g., C-3, C-4 and C-8a) were subsequently obtained on the basis of HMBC analysis. Thus compound (2) was assigned as 7,8-dihydroxy-4'-methoxy-3'-prenylisoflavone.

The structural determination protocol for compounds (3-6) followed the same approach and is only briefly described here. Compound (3) had the same spin systems as compound (2), but contained two methoxy groups, one on the A-ring and one on the B-ring. The substitution pattern of the B-ring was deduced essentially the same way as in compound (2), while in the A-ring, the absence of NOE connectivity between the substituent and H-6 implied that –OCH3 was placed at C-8. In compound (4), there were two ABX aromatic spin systems. In compound (5), two AB aromatic spin systems were observed, one of which contained 2H signals. In compound (6), one AX and one ABX aromatic spin system were identified. While the spectroscopic data recorded on a sample of low concentration did not permit the assignments of C-4a, C-5, and C-7, they closely matched the reported spectra of 5,7,4´-trihydroxy-3´-prenylisoflavone (Iinuma et al., 1992). In conclusion, the five molecular structures are depicted in Figure 2.1. Compounds (2) and (3) represent prenylated isoflavonoids that, to our knowledge, have never been reported previously (Table 2.1), while compounds (4-6) were identified for the first time in soybean, with their assignments in good agreement with the literature data (Nakayama et al., 1975; Hakamatsuka et al., 1991; Iinuma et al., 1992; Nkengfack et al., 1994; Emami et al., 2007; Khaomek et al., 2008).

Compound (7) was examined in both methanol-d4 and DMSO-d6 solution. The 13C-NMR spectrum showed the presence of 20 carbons, whereas the 1H-NMR spectrum identified 16 protons, comprising two labile ones, five aromatic spins, and nine in a prenyl group. Taken together with the mass measurement, the compound was formulated as C20H16O5. A survey of chemical shifts and through-bond/through-space correlations in the 2D NMR spectra suggested a 3,9-dihydroxy coumestrol skeleton (Morandi and Lequere, 1991; Pahari and Rohr, 2009). Because the 1H-NMR spectrum displayed AB and ABX aromatic spin systems, two possible structures could be proposed, with the prenyl substituent at either C-4 (phaseol) or C-10 (isosojagol) (Figure 2.3A). However, phaseol, whose structure determination was aided by UV spectra in early studies (Oneill, 1983), exhibits 1H-NMR data closely matching those of isosojagol (Furstner et al., 2007), while its 13C chemical shift assignments have never been reported (Oneill, 1983; Caballero et al., 1986; Abe et al., 1987; Asada et al., 1999; Furstner et al., 2007), rendering the differentiation of these two compounds difficult without more rigorous analysis. In this work, it was noticed that the assignments of H-1 and H-7 would hold the key to the structural elucidation. Therefore, an HMBC experiment was optimized and acquired with high quality, giving well-resolved aromatic spins as well as the expected correlations. Firstly, a total of six resonances were observed in the downfield region of δ 150 to 165 in the 13C-NMR spectrum, attributable to six oxygenated sp2-carbons (C-3, C-4a, C-6, C-9, C-10a, and C-11a). Secondly, by structural inspection, it was anticipated that H-1 in either phaseol or isosojagol should have three (the most) strong long-range correlations (cis- or trans-3JCH) to those 13C resonances (C-3, C-4a, and C-11a), followed by H-7 with two, to C-9 and C-10a. As shown in Figure 2.3B, the peak at δH 7.74 was consequently assigned to H-1, and the one at δH 7.71 to H-7. Lastly, since H-1 displayed an AB spin system with the resonance at δ 7.00 (H-2), while H-7 formed an ABX system with the peaks at δ 6.96 (H-8) and δ 7.18 (H-10), the data unequivocally established the position of the prenyl substituent at C-4.


Table 2.1. 1H- (chemical shifts and coupling constants) and 13C-NMR spectroscopic data at 25°C for isoflavone compounds (2) and (3) dissolved in methanol-d4. δH entries are given in ppm as δH (number of H, multiplicity, J in Hz); δC entries are given in ppm.

Position   Compound (2) δH           Compound (2) δC   Compound (3) δH           Compound (3) δC
2          8.19 (1, s)               154.3             8.21 (1, s)               154.4
3                                    125.0                                       125.7
4                                    178.4                                       178.0
4a                                   118.7                                       119.1
5          7.60 (1, d, 8.8)          117.4             7.82 (1, d, 8.8)          122.0
6          6.96 (1, d, 8.8)          115.4             7.01 (1, d, 8.8)          116.2
7                                    151.4                                       156.4
8                                    133.9                                       136.1
8a                                   148.2                                       152.6
1′                                   125.1                                       124.9
2′         7.29 (1, d, 2.3)          131.1             7.30 (1, d, 1.8)          131.2
3′                                   130.7                                       130.8
4′                                   158.5                                       158.5
5′         6.98 (1, d, 8.2)          111.3             6.98 (1, d, 8.2)          111.1
6′         7.33 (1, dd, 8.2, 2.3)    128.9             7.34 (1, dd, 8.2, 1.8)    128.6
1′′        3.34 (2, d, 7.0)          29.6              3.33 (2, d, 7.0)          29.3
2′′        5.31 (1, t, 7.0)          123.5             5.31 (1, t, 7.0)          123.3
3′′                                  133.0                                       133.1
4′′        1.73 (3, s)               17.3              1.72 (3, s)               17.6
5′′        1.72 (3, s)               25.8              1.72 (3, s)               26.1
8-OCH3                                                 3.97 (3, s)               61.7
4′-OCH3    3.87 (3, s)               55.9              3.86 (3, s)               55.8

[Figure 2.3: (A) structure drawing of compound (7) with atom numbering; (B) HMBC contour plot of 1H chemical shift (ppm) versus 13C chemical shift (ppm), with correlations labeled H1, H7, H10, H2 and H8 against C4a, C10a, C9, C6, C3 and C11a.]

Figure 2.3. (A) The molecular structure of compound (7). The multiple-bond correlations from 1H to 13C mentioned in the text are indicated by arrows. (B) Partial 2D 1H-13C HMBC showing the long-range correlations attributed to the aromatic spins with the most downfield 13C resonances (150-160 ppm).


Potential relevance of the newly uncovered prenylated isoflavones to soybean defense

For plants, prenylated compounds are produced as valuable functional metabolites with roles as phytoalexins against pathogenic microorganisms and herbivores (Yazaki et al., 2009). Although the bioactivities of the two novel compounds were not tested in this study due to the limited amounts purified, the known compounds (4-7) have all been reported to possess antifungal, antibacterial and insect-resistant activities (Caballero et al., 1986; Abe et al., 1987; Nkengfack et al., 1994; Queiroz et al., 2002; Yin et al., 2004; Chacha et al., 2005; Khaomek et al., 2008). Among them, compounds (4) and (6) showed antibacterial activities comparable to those of two well-known natural antimicrobial agents, bakuchiol and magnolol. Thus, it is possible that the production of these prenylated compounds contributes to lactofen (1)-induced resistance. To test whether these prenylated compounds are induced only by lactofen (1) or are part of a more generalized soybean defense response, glucan elicitors from the cell wall of the pathogen Phytophthora sojae were applied in the snapped cotyledon assay. The bioactivities of the glucan elicitor (Ayers et al., 1976) have been thoroughly characterized, and it is known to induce the full range of defense responses seen in infected tissues (Graham, 1991; Graham and Graham, 1996; Landini et al., 2003; Graham, 2005). Compounds (2-7) were also detected in glucan-induced tissues by semi-quantitative LC-MS analysis. Results are shown in Figure 2.4. The induced accumulation of these prenylated compounds by the highly characterized pathogen-derived glucan elicitor is consistent with their potential involvement in soybean defense responses. However, the actual involvement and role of these metabolites in defense will require further study.

It is interesting that the prenylation of the isoflavones is selective. Free formononetin accumulated to large amounts under glucan or lactofen (1) induction, but prenylated formononetin was not detected. Conversely, the non-prenylated forms of compounds (2) and (3) were not detected.

[Figure 2.4: bar chart of intensity for compounds 2-7 under lactofen and glucan treatments.]

Figure 2.4. Effects of lactofen and glucan on the accumulation of compounds (2-7) in soybean cotyledons. Compounds (2-7) were semi-quantified by LC-MS profiling of tissues harvested 48 h after treatment with 200 µM lactofen (1) or 200 µg/ml glucan. Values for compounds (2-7) are the average of three experiments ± standard error on the scale of selective m/z intensity.
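The "average ± standard error" summary used in Figure 2.4 can be computed as in the following minimal sketch. It assumes the standard error is the sample standard deviation divided by the square root of the number of experiments; the intensity values are hypothetical and are not data from this study.

```python
import math
import statistics


def mean_and_standard_error(values):
    """Return (mean, standard error of the mean) for a list of replicate values."""
    if len(values) < 2:
        raise ValueError("need at least two values to estimate a standard error")
    mean = statistics.mean(values)
    standard_error = statistics.stdev(values) / math.sqrt(len(values))
    return mean, standard_error


# Hypothetical selective m/z intensities from three experiments.
mean, se = mean_and_standard_error([1200.0, 1500.0, 1350.0])
print(f"{mean:.0f} ± {se:.0f}")
```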

Conclusions

A complementary LC-MS and NMR metabolomic approach was used to investigate soybean metabolites in response to elicitation by the disease resistance-inducing herbicide, lactofen (1). Six prenylated compounds whose formation was induced by lactofen (1) treatment were identified. Among them, compounds (2-3) were previously unreported and compounds (4-6) were reported for the first time in soybean. The structure of compound (7) was confirmed by 13C NMR spectroscopic analysis for the first time. With the discovery of these prenylated compounds, our knowledge of the complexity of potential biochemical responses that may contribute to lactofen (1)-induced resistance and soybean defense responses was significantly extended.

Many prenylated flavonoids have been identified as active components in medicinal plants, with biological activities ranging from anti-cancer, anti-androgen and anti-leishmania activity to inhibition of nitric oxide production. Due to their beneficial effects on human health, prenylated flavonoids are of particular interest as lead compounds for producing drugs (Yazaki et al., 2009). Our studies have demonstrated the potential of integrated LC-MS and NMR metabolomic studies on elicitor-induced plants for the discovery of new compounds of potential pharmacological interest. Studies focused on the bioactivities of these new compounds are ongoing.


Chapter 3

Systematic identification and comparative metabolite profiling of soybean (Glycine max) defense-related secondary metabolites in Phytophthora sojae resistant and susceptible cultivars

Abstract

An integrated approach utilizing multiple levels of fragmentation HPLC-ESI-MS and NMR was used for the metabolite profiling and systematic identification of Phytophthora sojae (Kauf. and Gerde.) elicitor-induced defense secondary metabolites in soybean. It was possible to detect and identify simultaneously 13 isoflavones, 2 coumestans and 6 pterocarpans. All identifications were based upon either the co-characterization of authentic compounds or 2D NMR spectroscopic data. Comparative metabolite profiling of Phytophthora resistant and susceptible soybean cultivars was performed. Principal component analysis clearly demonstrated a separation of the Phytophthora sojae elicitor-induced cultivars. 3'-Prenylisoafrormosin, glyceocarpin, glyceofuran, 3'-prenylgenistein and phaseol were identified as the key secondary metabolites accounting for the separation. Although the glyceollins have been emphasized in the past, these metabolites provide additional insight into the soybean defense responses. The combined HPLC-ESI-MS and NMR results present the most complete analysis of WGE-induced soybean secondary metabolites to date, which can be further utilized to evaluate chemical components of soybean samples from plant biology, food science and pharmaceutical studies.

Introduction.

Soybean, belonging to the Leguminosae family, is an important agricultural and commercial crop consumed in large quantities globally. It has served not only as an important source of food and oil, but also as a dietary supplement based on its potential beneficial effects on human health (Kurzer, 2003; Dog, 2005). Due to its significant value, numerous investigations have been conducted on both soybean disease resistance and its human health effects.

Recent results have clearly shown that multiple changes of secondary metabolites participate in soybean defense reactions. These changes include hydrolysis of conjugates of the isoflavones genistein and daidzein (Graham and Graham, 1996). Released free genistein is directly toxic to some fungal pathogens, such as Phytophthora sojae (Riveravargas et al., 1993), while daidzein is the precursor of well established inducible antibiotic phytoalexins (Dewick et al., 1970; Dewick and Martin, 1979; Ebel, 1986; Garcez et al., 2000), including coumestrol and the glyceollins, which are induced under disease stress. Recently, we demonstrated that prenylation of isoflavones also likely participates in soybean defense responses (Cheng et al., 2011). Another induced metabolite, glyceocarpin, shows fungitoxic activity against Cladosporium sphaerospermum (Garcez et al., 2000).

Interestingly, these soybean defense-related metabolites show potential for human health as well. Soybean isoflavones, mainly genistein and daidzein, have shown beneficial effects on various chronic diseases such as cancer, coronary heart disease, diabetes and menopausal discomfort (Arjmandi et al., 1998; Ho et al., 2000; Mahn et al., 2005; Lee et al., 2006; Salvo et al., 2006; Qin et al., 2009; Hsu et al., 2010; Liu et al., 2010; Park et al., 2010). The glyceollins exhibit marked antiestrogenic effects on estrogen receptors and may be useful in the prevention or treatment of breast and ovarian carcinoma (Salvo et al., 2006). Glyceofuran and coumestrol were reported as low-density lipoprotein-antioxidants (Lee et al., 2006). Thus, a truly comprehensive metabolite profiling of soybean constitutive and inducible defense-related metabolites is of paramount importance to both soybean and human health.

To more systematically study the multiplicity of changes of soybean secondary metabolites, several previous methods were optimized for soybean metabolite profiling. While earlier methods provided significant structural and identity information based on mass and UV spectral data (Graham, 1991; Garcia-Villalba et al., 2008; Wu et al., 2008), no previous method was specifically designed to cover all classes of inducible defense-related metabolites, including those with weak UV absorbance. Among the analytical methods available, the coupling of liquid chromatography with electrospray ionization (ESI) mass spectrometric detection has been demonstrated as a powerful tool for the identification and quantification of soybean metabolites in plant, plasma and urine samples (Holder et al., 1999; Griffith and Collison, 2001; Franke et al., 2008). This approach coupled with NMR has recently been used successfully in the discovery of novel isoflavonoids and phytoalexins in soybean (Cheng et al., 2011).

As to the induction of defense metabolites, the interaction of Phytophthora sojae and soybean has been a useful association to elucidate various induced biochemical events (Graham et al., 2003; Moy et al., 2004). The glucan elicitor from the cell wall of P. sojae (WGE) is well characterized (Yoshikawa et al., 1981) and appears to function as a global elicitor of soybean defense responses (Graham and Graham, 1991; Graham et al., 2003). More importantly, soybean suffers severe yield losses from Phytophthora root and stem rot (Ayers et al., 1976; Sharp et al., 1984). Yield losses to P. sojae have been limited by incorporating into cultivars Rps alleles for resistance to prevalent races of the pathogen (Wilcox and St Martin, 1998). Genetic studies showed that this race-specific resistance requires the participation of isoflavone synthase genes (Subramanian et al., 2005). However, little is known about which specific secondary metabolites of the many potentially produced by this pathway and others contribute to this race-specific resistance.

This study reports an approach utilizing multiple levels of fragmentation HPLC-ESI-MS and NMR for comprehensive profiling, purification and systematic identification of soybean defense-related metabolites. Data dependent scanning mass spectrometry (MSn) and NMR results are provided to add to the structural information of soybean secondary metabolites. These results should greatly benefit future soybean chemical component studies. Further, metabolite profiling of WGE-induced changes in P. sojae resistant (Williams 82, Rps1K) and susceptible (Williams) soybean cultivars was performed. Coupled with principal component analysis (PCA), the profiling results allowed the clear separation of the responses of the two soybean cultivars to WGE elicitor and the identification of the key metabolites with significant impacts on the separation.

Materials and methods

Plant Materials. Soybean seeds (cultivars Williams and Williams 82) were acquired from Dr. Anne Dorrance (Department of Plant Pathology, The Ohio State University and the Ohio Agricultural Research and Development Center, Wooster, OH). Seedlings were grown as described previously (Graham and Graham, 1996). The flats were immediately watered very thoroughly for germination. The plants were then watered every other day from the top with ½ teaspoon per gallon of Peter’s (20-20-20) fertilizer. The classical cut cotyledon elicitor-response assay was performed as described previously (Graham and Graham, 1991). Cotyledons were harvested from seven-day-old seedlings and used immediately. Ten cotyledons were used per petri plate.

Chemicals. HPLC-grade acetonitrile and water were supplied by Fisher Scientific (Pittsburgh, PA). Genistin, genistein, daidzin, daidzein, formononetin, coumestrol, methanol-d4, and formic acid were purchased from Sigma-Aldrich (St. Louis, MO). Purified standards of 7-O-glucosyl-6”-O-malonate daidzein, 7-O-glucosyl-6”-O-malonate genistein, 7-O-glucosyl-6”-O-malonate formononetin, phaseol, 3'-prenyldaidzein, 3'-prenylretusin, 8-prenyldaidzein, 3’-prenylgenistein and 3’-prenylisoafrormosin were obtained from our previous soybean studies (Cheng et al., 2011). The intact cell wall glucan elicitor was prepared from the cell walls of race 1 of Phytophthora sojae as described previously (Ayers et al., 1976).

Liquid Chromatography Mass Spectrometry Analysis. The cut cotyledon sample collection was done as previously described (Graham and Graham, 1991). A number 1 cork borer was used to ensure that each replicate represented a treated surface area of equal size. The cell layers from ten replicate cotyledons were then pooled as one sample and weighed directly into a 1.5 ml microfuge tube. Extraction was performed directly in the microfuge tube in 80% ethanol (800 µL per 50 mg tissue) using a polypropylene pestle (Kontes) mounted on a rechargeable hand drill. Extracts were centrifuged at 18,000 g for 5 min and 40 µL of the supernatant was analyzed by HPLC-ESI-MS on a Varian 500-MS system, which consisted of a Varian 212-LC Binary Gradient LC/MS pump, a Prostar 420 Autosampler and an Ion Trap Mass Spectrometer equipped with an ESI source. LC was achieved on a C18 reverse-phase column (LiChrosorb RP-18, 10 µm, 250 mm × 4.6 mm, Alltech Associates, Deerfield, Illinois) at a flow rate of 0.3 ml/min. Mobile phases A and B were water and acetonitrile, respectively, both with 0.01% formic acid added. The linear gradient program was as follows: A = 85% at t = 0 min; A = 10% at t = 110 min; A = 85% at t = 120 min. The positive-ion mode was used over the range m/z 50-800. Other MS parameters were drying gas 20 psi at 350 ˚C, nebulizing gas 40 psi, needle voltage 5000 V, capillary voltage 80 V, and spray shield voltage 600 V. MSn results were generated by data dependent scanning analysis of the 500-MS Varian system.
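Because the gradient program is piecewise linear, the mobile-phase composition at any time point follows from interpolation between the stated breakpoints. The short Python sketch below illustrates this calculation. It is not part of the instrument method; the helper name `percent_a` and the tuple layout of the program are assumptions introduced here for illustration.

```python
# Breakpoints of the linear gradient as (time in min, %A), taken from the text.
GRADIENT = ((0.0, 85.0), (110.0, 10.0), (120.0, 85.0))


def percent_a(t_min, program=GRADIENT):
    """Return %A at time t_min by linear interpolation (hypothetical helper)."""
    if not program[0][0] <= t_min <= program[-1][0]:
        raise ValueError("time lies outside the gradient program")
    # Walk through consecutive breakpoint pairs and interpolate in the
    # segment that contains t_min.
    for (t0, a0), (t1, a1) in zip(program, program[1:]):
        if t0 <= t_min <= t1:
            return a0 + (a1 - a0) * (t_min - t0) / (t1 - t0)


if __name__ == "__main__":
    for t in (0, 55, 110, 115, 120):
        a = percent_a(t)
        print(f"t = {t:>3} min: A = {a:.1f}%, B = {100 - a:.1f}%")
```

Under these assumptions, the composition halfway through the main ramp (t = 55 min) is 47.5% A and 52.5% B.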

Compound Isolation. In order to obtain sufficient quantities of purified unknown compounds for NMR, 2,000 glucan-treated cotyledons were prepared using the cut cotyledon protocol. The extraction and first step of isolation were the same as previously described (Cheng et al., 2011). Fractions from 8 min to 14 min were collected and lyophilized, giving 218 mg of residue. The residue was resuspended in 10 ml methanol and then diluted with water to 50 ml. The second isolation step used the profiling linear gradient described in the above section. Since the injections utilized a manual injection valve with a 2 ml loop, every peak was 7 min later than the retention times in Table 3.1. Repetitive isolation using this gradient produced glyceofuran (16 mg at 41 min), glyceocarpin (11 mg at 61 min), glyceollin III (18 mg at 64 min), glyceollin II (6 mg at 65 min), glyceollin I (12 mg at 66 min) and glyceollin IV (5 mg at 78 min).

Nuclear magnetic resonance (NMR) Analysis. The HPLC-purified samples were lyophilized and reconstituted either in 500 μl methanol-d4 or in 500 μl DMSO-d6. NMR experiments were conducted at room temperature on a Bruker DMX-600 spectrometer (Bruker, Karlsruhe, Germany) operating at 600.13 MHz (1H) and 150.90 MHz (13C). The experiments generally included (1) 1D 1H NMR and 1D 13C NMR, (2) 2D 1H homonuclear COSY and ROESY (240 ms mixing time), (3) 2D 1H-13C heteronuclear HMBC (optimized for small J-observation) and HSQC. Data were processed with NMRPipe and visualized by NMRView software. The solvent peaks were used as internal reference for the calibration of 1H and 13C NMR chemical shifts.


Glyceollin I. 1H NMR (methanol-d4, 600.13 MHz) δ 1.38 (s, 3H, H-16), 1.39 (s, 3H, H-15), 3.94 (d, 1H, J=11.49 Hz, H-6), 4.17 (d, 1H, J=11.49 Hz, H-6), 5.17 (s, 1H, H-11a), 5.62 (d, 1H, J=9.96 Hz, H-13), 6.23 (d, 1H, J=2.01 Hz, H-10), 6.40 (m, 1H, J=2.01, 8.17 Hz, H-8), 6.47 (d, 1H, J=8.43 Hz, H-2), 6.60 (d, 1H, J=9.96 Hz, H-12), 7.17 (d, 1H, J=8.17 Hz, H-7), 7.21 (d, 1H, J=8.43 Hz, H-1); 13C NMR (methanol-d4, 150.90 MHz) δ 28.2 (C-15, 16), 71.3 (C-6), 77.3 (C-6a, 14), 86.1 (C-11a), 99.1 (C-10), 109.5 (C-8), 111.4 (C-4), 111.7 (C-2), 114.3 (C-11b), 117.6 (C-12), 121.3 (C-6b), 125.3 (C-7), 130.5 (C-13), 132.4 (C-1), 152.0 (C-4a), 155.4 (C-3), 161.3 (C-9), 162.3 (C-10a).

Glyceollin II. 1H NMR (methanol-d4, 600.13 MHz) δ 1.37 (s, 3H, H-15 or H-16), 1.38 (s, 3H, H-16 or H-15), 3.91 (d, 1H, J=11.51 Hz, H-6), 4.11 (d, 1H, J=11.47 Hz, H-6), 5.16 (s, 1H, H-11a), 5.61 (d, 1H, J=9.80 Hz, H-13), 6.23 (d, 1H, J=2.01 Hz, H-10), 6.24 (s, 1H, H-4), 6.37 (d, 1H, J=9.80 Hz, H-12), 6.39 (m, 1H, J=2.01, 8.17 Hz, H-8), 7.10 (s, 1H, H-1), 7.15 (d, 1H, J=8.17 Hz, H-7); 13C NMR (methanol-d4, 150.90 MHz) δ 28.4 (C-15 or C-16), 28.5 (C-16 or C-15), 71.1 (C-6), 77.3 (C-6a), 77.8 (C-14), 85.9 (C-11a), 99.1 (C-10), 105.4 (C-4), 109.5 (C-8), 114.5 (C-11b), 118.0 (C-2), 121.3 (C-6b), 122.8 (C-12), 125.3 (C-7), 130.0 (C-1), 130.4 (C-13), 155.9 (C-3), 157.2 (C-4a), 161.3 (C-9), 162.3 (C-10a).


Glyceollin III. 1H NMR (methanol-d4, 600.13 MHz) δ 1.74 (s, 3H, H-16), 2.98 (m, 1H, J=9.5, 15.5 Hz, H-12), 3.34 (m, 1H, J=8.1, 15.5 Hz, H-12), 3.89 (d, 1H, J=11.70 Hz, H-6), 4.10 (d, 1H, J=11.70 Hz, H-6), 4.88 (s, 1H, H-15), 5.05 (s, 1H, H-15), 5.18 (s, 1H, H-11a), 5.20 (d, 1H, J=8.10 Hz, H-13), 6.22 (d, 1H, J=2.20 Hz, H-10), 6.27 (s, 1H, H-4), 6.39 (m, 1H, J=2.20, 8.10 Hz, H-8), 7.15 (d, 1H, J=8.10 Hz, H-7), 7.24 (s, 1H, H-1); 13C NMR (methanol-d4, 150.90 MHz) δ 17.4 (C-16), 35.0 (C-12), 71.3 (C-6), 77.3 (C-6a), 86.5 (C-11a), 88.1 (C-13), 98.7 (C-4), 99.1 (C-10), 109.5 (C-8), 112.3 (C-15), 114.0 (C-11b), 121.4 (C-6b), 122.5 (C-2), 125.3 (C-7), 128.1 (C-1), 145.9 (C-14), 157.0 (C-4a), 161.6 (C-9), 162.1 (C-10a), 162.6 (C-3).

Glyceollin IV. 1H NMR (methanol-d4, 600.13 MHz) δ 1.72 (s, 3H, H-16), 1.75 (s, 3H, H-15), 3.24 (d, 2H, J=7.30 Hz, H-12), 3.77 (s, 3H, 3-OCH3), 3.89 (d, 1H, J=11.40 Hz, H-6), 4.11 (d, 1H, J=11.40 Hz, H-6), 5.16 (s, 1H, H-11a), 5.28 (t, 1H, J=7.30 Hz, H-13), 6.23 (d, 1H, J=2.20 Hz, H-10), 6.40 (m, 1H, J=2.20, 8.10 Hz, H-8), 6.43 (s, 1H, H-4), 7.14 (s, 1H, H-1), 7.15 (d, 1H, J=8.10 Hz, H-7); 13C NMR (methanol-d4, 150.90 MHz) δ 18.0 (C-16), 26.1 (C-15), 28.9 (C-12), 56.0 (OCH3-3), 71.1 (C-6), 77.4 (C-6a), 86.0 (C-11a), 99.1 (C-10), 100.1 (C-4a), 109.5 (C-8), 113.2 (C-11b), 121.4 (C-7), 124.0 (C-13), 125.3 (C-2, 7), 132.3 (C-1), 133.3 (C-14), 155.7 (C-4a), 160.0 (C-3), 161.3 (C-9), 162.3 (C-10a).


Glyceocarpin. 1H NMR (DMSO-d6, 600.13 MHz) δ 1.69 (s, 3H, H-16), 1.71 (s, 3H, H-15), 3.17 (b, 2H, H-12), 3.86 (d, 1H, J=11.22 Hz, H-6), 3.97 (d, 1H, J=11.22 Hz, H-6), 5.14 (s, 1H, H-11a), 5.27 (t, 1H, J=7.48 Hz, H-13), 5.86 (s, 1H, OH-6a), 6.18 (b, 1H, H-10), 6.28 (s, 1H, H-4), 6.34 (b, 1H, J=8.10 Hz, H-8), 7.05 (s, 1H, H-1), 7.13 (d, 1H, J=8.10 Hz, H-7), 9.47 (s, 1H, OH-9), 9.57 (s, 1H, OH-3); 13C NMR (DMSO-d6, 150.90 MHz) δ 17.7 (C-16), 25.6 (C-15), 27.4 (C-12), 69.5 (C-6), 74.1 (C-6a), 84.4 (C-11a), 97.5 (C-10), 102.3 (C-4), 107.7 (C-8), 111.2 (C-11b), 120.3 (C-6b), 121.9 (C-2), 123.0 (C-13), 124.4 (C-7), 131.1 (C-14), 131.4 (C-1), 153.5 (C-4a), 156.0 (C-3), 159.5 (C-9), 160.3 (C-10a). 1H NMR (methanol-d4, 600.13 MHz) δ 1.72 (s, 3H, H-16), 1.76 (s, 3H, H-15), 3.24 (d, 2H, J=7.35 Hz, H-12), 3.85 (d, 1H, J=11.45 Hz, H-6), 4.07 (d, 1H, J=11.45 Hz, H-6), 5.14 (s, 1H, H-11a), 5.32 (t, 1H, J=7.35 Hz, H-13), 6.23 (d, 1H, J=2.09 Hz, H-10), 6.29 (s, 1H, H-4), 6.39 (m, 1H, J=2.09, 8.15 Hz, H-8), 7.09 (s, 1H, H-1), 7.14 (d, 1H, J=8.15 Hz, H-7); 13C NMR (methanol-d4, 150.90 MHz) δ 17.7 (C-16), 26.1 (C-15), 28.5 (C-12), 70.6 (C-6), 77.0 (C-6a), 85.8 (C-11a), 98.7 (C-10), 103.2 (C-4), 108.9 (C-8), 112.2 (C-11b), 121.1 (C-6b), 123.7 (C-2, 13), 124.7 (C-7), 132.1 (C-1), 132.6 (C-14), 155.0 (C-4a), 157.4 (C-3), 160.9 (C-9), 162.0 (C-10a).

Glyceofuran. 1H NMR (DMSO-d6, 600.13 MHz) δ 1.49 (s, 6H, H-15, 16), 4.04 (s, 2H, H-6), 5.36 (s, 1H, 14-OH), 5.39 (s, 1H, H-11a), 5.99 (s, 1H, OH-6a), 6.17 (d, 1H, J=1.80 Hz, H-10), 6.35 (m, 1H, J=1.80, 8.13 Hz, H-8), 6.61 (s, 1H, H-12), 7.04 (s, 1H, H-4), 7.18 (d, 1H, J=8.13 Hz, H-7), 7.65 (s, 1H, H-1), 9.49 (s, 1H, OH-9); 13C NMR (DMSO-d6, 150.90 MHz) δ 28.7 (C-15, 16), 67.4 (C-14), 70.0 (C-6), 75.4 (C-6a), 85.0 (C-11a), 97.2 (C-10), 98.7 (C-4), 99.4 (C-12), 107.7 (C-8), 117.3 (C-11b), 120.0 (C-6b), 122.9 (C-1), 123.0 (C-2), 124.3 (C-7), 152.1 (C-4a), 154.5 (C-3), 159.6 (C-9), 160.4 (C-10a), 164.8 (C-13). 1H NMR (methanol-d4, 600.13 MHz) δ 1.60 (s, 6H, H-15, 16), 3.98 (d, 1H, J=11.41 Hz, H-6), 4.16 (d, 1H, J=11.41 Hz, H-6), 5.37 (s, 1H, H-11a), 6.23 (d, 1H, J=2.00 Hz, H-10), 6.41 (m, 1H, J=2.00, 8.18 Hz, H-8), 6.61 (s, 1H, H-12), 6.98 (s, 1H, H-4), 7.18 (d, 1H, J=8.18 Hz, H-7), 7.64 (s, 1H, H-1); 13C NMR (methanol-d4, 150.90 MHz) δ 28.8 (C-15, 16), 69.6 (C-14), 71.6 (C-6), 77.5 (C-6a), 86.7 (C-11a), 98.8 (C-10), 100.0 (C-4), 101.0 (C-12), 109.4 (C-8), 118.3 (C-11b), 121.1 (C-6b), 124.0 (C-1), 124.9 (C-2), 125.2 (C-7), 154.0 (C-4a), 156.7 (C-3), 161.1 (C-9), 162.2 (C-10a), 165.2 (C-13).

Data Preprocessing and Analysis. The raw LC/MS data from analytical runs on untreated and WGE-treated tissues were processed by a custom designed bioinformatics program (Chen et al., 2006; Cheng et al., 2011). The program was used to read the raw MS data, find peaks in an unbiased way, align peaks and output a peak list in which columns represent the samples, rows represent the peaks and each entry is a peak area. Principal component analysis was performed in Matlab version R2008a (The Mathworks, Inc., Natick, MA).
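The layout of the peak list and its use in principal component analysis can be illustrated with a small Python sketch. This is not the custom program or the Matlab routine used in this work: the peak areas are hypothetical, mean-centering without further scaling is an assumption, and power iteration is swapped in here as a simple stand-in that recovers only the first principal component.

```python
import math


def pca_first_component(peak_table, iterations=200):
    """Sketch of PCA (first component only) on a peaks-by-samples table."""
    # Transpose so that each row is one sample and each column one peak.
    samples = [list(col) for col in zip(*peak_table)]
    n, p = len(samples), len(samples[0])

    # Mean-center every peak (assumption: no further scaling is applied).
    means = [sum(s[j] for s in samples) / n for j in range(p)]
    centered = [[s[j] - means[j] for j in range(p)] for s in samples]

    # Sample covariance matrix of the peaks.
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(p)]
           for i in range(p)]

    # Power iteration converges to the leading eigenvector (PC1 loadings).
    v = [float(j + 1) for j in range(p)]
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]

    # Rayleigh quotient gives the variance captured by PC1.
    eigenvalue = sum(v[i] * sum(cov[i][j] * v[j] for j in range(p))
                     for i in range(p))
    explained = eigenvalue / sum(cov[i][i] for i in range(p))
    scores = [sum(r[j] * v[j] for j in range(p)) for r in centered]
    return v, scores, explained


# Hypothetical peak areas: rows are peaks, columns are samples
# (two water controls followed by two elicitor-treated samples).
PEAK_TABLE = [
    [100.0, 110.0, 300.0, 320.0],
    [50.0, 55.0, 50.0, 52.0],
    [10.0, 12.0, 90.0, 95.0],
]

if __name__ == "__main__":
    loadings, scores, explained = pca_first_component(PEAK_TABLE)
    print("PC1 loadings:", [round(x, 3) for x in loadings])
    print("PC1 scores:", [round(x, 1) for x in scores])
    print(f"PC1 explains {100 * explained:.1f}% of the variance")
```

In this toy table the two induced peaks dominate the PC1 loadings, and the control and treated samples receive PC1 scores of opposite sign, which is the kind of separation a score plot visualizes.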


Figure 3.1 Overlaid base ion chromatograms from positive ion HPLC-ESI- MS analysis of the extract of cotyledons which were treated with 200 µg/ml WGE and water control. Tissues were harvested at 48 h.

Results and discussion

Metabolite profiling of soybean defense related secondary metabolites. To study the full complement of inducible soybean secondary metabolites, the classical cut cotyledon protocol was used in this study. In this assay, the nonpenetrable surface of soybean cotyledons was sliced off to facilitate the application of wall glucan elicitors. This wound assay, activating soybean defense competency, has been utilized successfully in research on the multiplicity and coordination of cellular defense responses to various elicitors (Graham and Graham, 1991, 1996; Graham et al., 2003). HPLC–ESI–MS was employed to comprehensively profile the secondary metabolites present in glucan-treated and control cut cotyledon samples. The method reported here kept the same MS parameters as our earlier paper (Cheng et al., 2011). However, we extended the gradient from 80 min to 120 min in order to get a better separation for those structurally similar inducible metabolites (Figure 3.2).

Samples from WGE and water treatments of the P. sojae race 1-resistant soybean cultivar Williams 82 were analyzed by this profiling method. The base ion chromatograms are overlaid (Figure 3.1) to allow visualization of metabolome-wide differences between glucan-treated and water control samples. Compared to the water control, there were more peaks in the glucan-treated samples. Theoretically, compounds will form pseudomolecular species (e.g. [M+H]+, [M+Na]+) in the positive-ion ESI mass spectra. Initial peak identification was made by comparing mass spectra and retention times with those of commercially available standards and pure compounds obtained in our previous studies. As a result, 13 isoflavones and 2 coumestans were identified. Their structures are shown in Figure 3.2 and Figure 3.3. The identities, retention times and MS data of the confirmed compounds are summarized in Table 3.1. However, six peaks labeled with an asterisk in Figure 3.1 did not match any reported soybean metabolites based on their mass information. Thus, these were purified by repeated preparative HPLC and their structures were determined by 2D NMR spectroscopy.

Figure 3.2. Structures of identified isoflavones from glucan-induced soybean cotyledons. The β-d-glucosyl, prenyl and 6”-malonyl groups are designated as G, P and M, respectively.


[Figure 3.3: chemical structures, with atom numbering and ring labels A-D, of glyceollin I, glyceollin II, glyceollin III, glyceollin IV, glyceocarpin, glyceofuran, coumestrol and phaseol.]

Figure 3.3 Structures of identified coumestans and pterocarpans from glucan-induced soybean cotyledons.


Table 3.1. Metabolites identified and profiled in 200 µg/ml glucan-treated Williams 82 soybean cotyledons.

Compound number   Compound                                   Rt (min)   Molecular weight   Observed m/z
1                 Daidzin                                    16.3       416                417 [M+H]+
2                 7-O-Glucosyl-6”-O-malonate daidzein        20.3       502                255 [M+H]+
3                 Genistin                                   21.5       432                433 [M+H]+
4                 7-O-Glucosyl-6”-O-malonate genistein       26.1       518                519 [M+H]+
5                 7-O-Glucosyl-6”-O-malonate formononetin    32.6       516                517 [M+H]+
6                 Daidzein                                   34.5       254                255 [M+H]+
7                 Glyceofuran                                37.4       354                337 [M-H2O+H]+
8                 Genistein                                  42.4       270                271 [M+H]+
9                 Coumestrol                                 46.2       268                269 [M+H]+
10                Formononetin                               52.9       268                269 [M+H]+
11                Glyceocarpin                               54.1       340                323 [M-H2O+H]+
12                Glyceollin III                             57.7       338                321 [M-H2O+H]+
13                Glyceollin II                              58.8       338                321 [M-H2O+H]+
14                Glyceollin I                               59.6       338                321 [M-H2O+H]+
15                3'-Prenyldaidzein                          62.4       322                323 [M+H]+
16                8-Prenyldaidzein                           63.1       322                323 [M+H]+
17                3'-Prenylretusin                           66         352                353 [M+H]+
18                Phaseol                                    69.9       336                337 [M+H]+
19                Glyceollin IV                              71         354                337 [M-H2O+H]+
20                3'-Prenylgenistein                         73.9       338                339 [M+H]+
21                3'-Prenylisoafrormosin                     82.5       366                367 [M+H]+

Structure elucidation of soybean pterocarpans. All six compounds were revealed to be pterocarpans, and their structures are summarized in Figure 3.3. The steps in structural elucidation are exemplified by the work on glyceocarpin, for which NMR experiments were performed with the sample dissolved in methanol-d4 as well as in DMSO-d6. First, the 1D 13C NMR spectrum revealed a total of 20 resonances excluding the solvent signal, among which 11 were proton-attached, as evidenced in the 2D 1H-13C HSQC spectrum. On the basis of the 1D 1H spectrum recorded in methanol-d4, the number of non-labile protons was determined to be 17, including a pair of methyl groups and five aromatic protons. The latter were grouped into spin systems of two “A” [δ 7.09 ppm and δ 6.29 ppm] and one “ABX” [δ 7.14 (d, J = 8.15 Hz), 6.39 (dd, J = 8.15, 2.09 Hz), and 6.23 (d, J = 2.09 Hz)] on the basis of J-coupling correlations observed in the 1D 1H NMR spectrum as well as in the 2D COSY spectrum (Figure 3.4 A). Second, the evidence for a prenylated compound was revealed by the signal for a γ,γ-dimethylallyl group [–CH2CH=C(CH3)2] from the 1H and 13C NMR chemical shifts distinguished in the 2D 1H-13C HSQC spectrum: two allylic methyl signals at δH 1.72 and 1.76 ppm, one methylene doublet at δH 3.24 ppm, and one allylic triplet at δH 5.61 ppm. The ROE interactions (Figure 3.4 B) observed between H-13 and H-15 and between H-12 and H-16 enabled the stereospecific assignment of C-15 and C-16. Third, in the 1D 1H NMR spectrum recorded with the sample in DMSO-d6, three hydroxy protons were identified. The one at 9.47 ppm showed HMBC correlations with C-8 and C-10 in the ABX spin system (Figure 3.4 C), leading to the assignment of the aromatic ring D (Figure 3.4 A). On the other hand, that at 9.57 ppm showed an HMBC correlation with C-4, to which the aromatic proton at 6.29 ppm was attached, and another HMBC correlation to C-3 (156.0 ppm in DMSO-d6), to which this hydroxy group is attached. Since the latter also had HMBC correlations with both H-1 and H-4, whereas the methylene protons of the prenyl group correlated with C-1, it was concluded that -CH2CH=C(CH3)2 and –OH are located at C-2 and C-3 of aromatic ring A, respectively (Figure 3.4 A). Finally, the chemical shifts of H-6/C-6 and H-11a/C-11a were indicative of the ring B and C skeleton observed in glyceollins. Further confirmation came from ROE assignments and, most importantly, HMBC correlation assignments as summarized in Figure 3.4 C. The identification of the other five pterocarpans followed the same procedure, and it is important to note that the chemical shifts obtained are consistent with previous reports (Lyne and Mulheirn, 1978; Ingham et al., 1981; Lee et al., 2006; Khupse and Erhardt, 2008).

[Figure 3.4: structure of glyceocarpin with atom numbering and ring labels A-D, drawn in three panels (A), (B) and (C).]

Figure 3.4. Summary of NMR observations on the sample of glyceocarpin dissolved in DMSO-d6: (A) ROE spatial interactions, (B) J-coupling correlations, and (C) multiple-bond correlations from proton to carbon


MSn information of soybean pterocarpans. Once they were identified, it was confirmed that all six pterocarpans are known inducible soybean metabolites. Among them, glyceollins I-III are well-known soybean phytoalexins, although no mass-based analytical method has previously been described for these pterocarpans. The reasons might be that (1) none of these metabolites is available commercially and (2) pterocarpans form a particular pseudomolecular ion in the mass detector, [M-H2O+H]+, as observed in the present study. Thus, we propose a mechanism for the formation of the pseudomolecular ion in Figure 3.5. Since all six pterocarpans lose H2O, the only possible OH group that could account for this loss is at C-6a. There are two possible ways to form a double bond, either at C-6 or at C-11a. Phaseol, glyceocarpin and glyceollin IV have similar structures in their B, C and D rings, and their MSn results share the same fragmentation pattern (see Figure 3.3 and Table 3.2). Therefore, the double bond should be formed at C-11a, which forms the same ring structure as phaseol. MSn results of these pterocarpans and phaseol are presented in Table 3.2 to provide useful structural information for these commercially unavailable compounds.
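The relationship between molecular weight and the observed ion can be checked with simple nominal-mass arithmetic. The following Python sketch is illustrative only; the function name `nominal_mz` and the dictionary layout are assumptions introduced here, and integer (nominal) masses are used because Table 3.1 reports integer values. The molecular weights are those listed in Table 3.1.

```python
# Nominal (integer) mass shifts for the ion species discussed in the text.
ADDUCT_SHIFT = {
    "[M+H]+": 1,        # protonation
    "[M+Na]+": 23,      # sodium adduct
    "[M-H2O+H]+": -17,  # loss of water (18) plus protonation (+1)
}


def nominal_mz(molecular_weight, species):
    """Return the nominal m/z of a singly charged ion (hypothetical helper)."""
    return molecular_weight + ADDUCT_SHIFT[species]


# Molecular weights of the six pterocarpans as listed in Table 3.1.
PTEROCARPANS = {
    "Glyceofuran": 354,
    "Glyceocarpin": 340,
    "Glyceollin I": 338,
    "Glyceollin II": 338,
    "Glyceollin III": 338,
    "Glyceollin IV": 354,
}

if __name__ == "__main__":
    for name, mw in PTEROCARPANS.items():
        print(f"{name}: MW {mw}, [M-H2O+H]+ at m/z "
              f"{nominal_mz(mw, '[M-H2O+H]+')}")
    # Phaseol (MW 336) is listed in Table 3.1 as a protonated molecule.
    print("Phaseol: [M+H]+ at m/z", nominal_mz(336, "[M+H]+"))
```

Note that glyceollin IV as [M-H2O+H]+ and phaseol as [M+H]+ share the same nominal m/z of 337, so the nominal mass alone does not distinguish them.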


Table 3.2. MSn data of pterocarpans and phaseol. Only major fragments are listed in each MS.

Peak Compound MS1 MS2 MS3 MS4 235 263 309 248 281 7 Glyceofuran 337 247 304 319 275 291 211 11 Glyceocarpin 323 267 239 165 167 251 223 195 12 Glyceollin III 321 259 287 306 274 277 249 167 13 Glyceollin II 321 251 223 195 229 247 232 303 232 275 247 14 Glyceollin I 321 234 289 259 306 178 291 263 195 281 251 18 Phaseol 337 223 269 197 19 Glyceollin IV 337 281 253 225

[Figure 3.5: reaction scheme showing glyceollin IV losing H2O to give the pseudomolecular ion.]

Figure 3.5 Proposed mechanism for the formation of the glyceollin IV pterocarpan pseudomolecular ion in positive-ion mode mass spectrometry.


Comparative study of WGE-elicited responses of P. sojae resistant and susceptible soybean cultivars. The soybean cultivar Williams 82, which carries the Rps1K gene for resistance to race 1 of P. sojae, is the backcross derivative of the cross between Williams and Kingwa (Kasuga et al., 1997). There are several reasons that Williams and Williams 82 were chosen for comparative studies with WGE. Recent gene silencing studies have demonstrated two important and related phenomena. First, silencing of genes for isoflavone synthase suppressed the hypersensitive response conditioned by the Rps 1K locus (Graham et al., 2007). Second, treatment of tissues with WGE and silencing of the elicitor-releasing glucanase (a PR-2), which releases active elicitors from WGE, demonstrated that WGE is a likely participant in the HR cell death conditioned by the Rps 1K locus (Graham et al., 2007). Finally, Williams 82 and Williams are near-isogenic lines, and comparative metabolomic studies may yield distinct differences associated with the interaction of WGE with the Rps 1K locus. Thus, 32 samples (eight replicates for each of the four groups: water and WGE-treated soybean Williams and Williams 82) were profiled using the HPLC-ESI-MS metabolite profiling method combined with principal component analysis (PCA). Raw mass data acquisition, peak discovery and peak alignment were performed by our in-house program (Chen et al., 2006; Cheng et al., 2011). The differences in defense responses between Williams 82 and Williams were assessed by PCA for the entire data set. A clear separation of glucan-treated groups was observed in the PCA score plot (Figure 3.6). The two-component PCA model cumulatively accounted for 89.54% of the variation. As shown in Figure 3.6, the water control groups clustered together, indicating that the basal metabolomes of the two cultivars are very similar. Under elicitation conditions, the metabolomes of the two cultivars showed clear differences. According to the score plot, WGE-treated Williams 82, WGE-treated Williams and the two water control groups were clearly separated by PC1 (79.82%), which mainly explains the variation among different groups. Moreover, WGE-treated Williams 82 moved further from the two water control groups than WGE-treated Williams did, consistent with a potential difference related to the presence of the Rps-1K locus in Williams 82. PC2 (9.72%) mainly explains the variation within each group.


[Figure 3.6: PCA score plot; x-axis PC1 (79.82%), y-axis PC2 (9.72%); legend: Williams 82 water control, Williams water control, glucan-treated Williams, glucan-treated Williams 82.]

Figure 3.6 Principal component analysis (PCA) score plot of water and WGE-treated Williams and Williams 82 samples. The plot shows that the metabolomes of the four groups of samples were clearly different. WGE-treated Williams 82 moved further from the two water control groups than WGE-treated Williams did, consistent with a potential difference related to the presence of the Rps-1K locus in Williams 82.


[Figure 3.7: loading plot; x-axis PC 1, y-axis PC 2; labeled peaks: 20, 7, 9, 21, 19, 15 and 11.]

Figure 3.7 Loading plot of PC1 vs PC2. The loading of each metabolite represents its contribution to the separation of the four groups of samples. The top seven metabolites in PC1 are labeled with the peak numbers, which can be found in Figure 3.1.

Table 3.3 Comparison of the top seven metabolites between Williams 82 and Williams that contribute to PC1 in Figure 3.7. All results are presented as medians (25th, 75th percentile) of selective m/z intensity.

Peak   Compound                 Williams 82                   Williams                      P value
20     3'-Prenylgenistein       6434955 (4683425, 8182198)    3307085 (2938301, 4767503)    0.0074
15     3'-Prenyldaidzein        3643674 (3490248, 4692588)    2274198 (1621260, 2996954)    0.0028
19     Glyceollin IV            898962 (463628, 1009554)      316063 (278634, 365884)       0.0092
7      Glyceofuran              800564 (714907, 933830)       514824 (409201, 673749)       0.0061
21     3'-Prenylisoafrormosin   463553 (388823, 544577)       258188 (217473, 323068)       0.0014
11     Glyceocarpin             471847 (420376, 689916)       225437 (185608, 297318)       0.0008
9      Coumestrol               247768 (182486, 331230)       144399 (117032, 224347)       0.0239

The loading plot provides the contribution of each metabolite to a PC. Investigation of the contribution of individual metabolites in PC1 revealed metabolites with significant impacts on the separation. The top seven metabolites in PC1 can be seen in Figure 3.7. All seven metabolites were identified using the above method and were 3'-prenylgenistein (peak 20), 3'-prenyldaidzein (peak 15), glyceollin IV (peak 19), glyceofuran (peak 7), 3'-prenylisoafrormosin (peak 21), glyceocarpin (peak 11) and coumestrol (peak 9). Mann–Whitney U tests were used to compare each of these metabolites in Williams 82 versus Williams. It was clear that soybean Williams 82 produced more of these seven metabolites than soybean Williams did under WGE-induction conditions (Table 3.3). It is proposed that these seven metabolites may play two roles in P. sojae resistance. One role is that some of them function directly as phytoalexins. Glyceocarpin showed fungitoxic activity against Cladosporium sphaerospermum (Garcez et al., 2000). Glyceollin IV also functions as a phytoalexin (Lyne et al., 1976). 3'-Prenyldaidzein and 3'-prenylgenistein, the two highest PC1 loadings, showed antibacterial activities comparable to those of two well-known natural antimicrobial agents, bakuchiol and magnolol (Yin et al., 2004). The other potential role in defense is that some of these metabolites may function as active antioxidants that reduce the damaging effects of reactive oxygen species generated by hypersensitive response cell death involved in disease resistance. Interestingly, the isoflavanones, glyceofuran and coumestrol were reported as low-density lipoprotein-antioxidants (Lee et al., 2006).
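For readers unfamiliar with the Mann–Whitney U test, the following Python sketch shows how the U statistic and a two-sided p-value can be computed for two small groups. It is a minimal illustration and not the analysis code used in this work: the replicate intensities are hypothetical (only medians and percentiles are reported in Table 3.3), and the exact permutation p-value computed here is an assumption, since the text does not state whether exact or asymptotic p-values were used.

```python
from itertools import combinations


def midranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def mann_whitney_u(x, y):
    """Return (U for group x, exact two-sided permutation p-value)."""
    n1, n2 = len(x), len(y)
    ranks = midranks(list(x) + list(y))
    offset = n1 * (n1 + 1) / 2.0
    u1 = sum(ranks[:n1]) - offset
    mean_u = n1 * n2 / 2.0
    observed = abs(u1 - mean_u)
    # Enumerate every way of assigning n1 of the pooled ranks to group x.
    extreme = total = 0
    for idx in combinations(range(n1 + n2), n1):
        u = sum(ranks[i] for i in idx) - offset
        total += 1
        if abs(u - mean_u) >= observed - 1e-12:
            extreme += 1
    return u1, extreme / total


if __name__ == "__main__":
    # Hypothetical peak intensities for eight replicates per cultivar.
    resistant = [640, 468, 818, 702, 655, 590, 760, 810]
    susceptible = [330, 293, 476, 351, 402, 310, 445, 388]
    u, p = mann_whitney_u(resistant, susceptible)
    print(f"U = {u}, two-sided exact p = {p:.4f}")
```

With eight replicates per group, the enumeration covers 12,870 rank assignments, so the exact calculation remains inexpensive.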

A very interesting observation of our studies is that the most highly studied soybean phytoalexins, glyceollins I, II and III, did not contribute significantly to the separation of glucan-treated Williams 82 and Williams. This might indicate that glyceollins I-III are general soybean phytoalexins and that the seven metabolites synergistically provide Phytophthora disease resistance specifically.

Conclusions

HPLC–ESI–MS and NMR were used for the metabolite profiling and systematic identification of defense-response-induced secondary metabolites in soybean cotyledons. All together, 13 isoflavones, two coumestans and six pterocarpans were indentified. The six pterocarpans are not commercially available and very limited knowledge of their structural information has been reported in the literature. Thus, we provided extensive

MSn and NMR structural information to facilitate future studies. To the best of our knowledge, the HPLC-ESI-MS metabolite profiling studies reported here provide the most complete analysis of soybean induced secondary metabolites to date. This will benefit further evaluation of the chemical components of soybean-related products in several areas. Using this profiling method, our results demonstrated that there were significant differences in the metabolic compositions of WGE-treated soybean cultivars resistant and susceptible to P. sojae race 1.

The key metabolites contributing to the differences were identified. Although our current studies were limited to comparisons of the near isogenic lines,

Williams and Williams 82, they provide a model and a template for future studies with other resistance genes in soybean. Combining these results with future genomic studies to form a holistic picture of the metabolic pathways may facilitate the identification of new targets for efficient metabolic engineering efforts to improve not only soybean disease resistance, but also the composition of these health-promoting phytochemicals.


Chapter 4

A metabolomic strategy for new natural product discovery

Abstract

Metabolomics, which aims at the qualitative and quantitative analysis of the full complement of metabolites in biological samples, has the potential to find unknown natural products even in relatively well-studied plants. However, this potential remains untapped due to the diversity and complexity of natural products. An optimized metabolomics strategy for the discovery of new natural products is needed. Therefore, we propose a series of simple operations for plant natural product discovery based on metabolomic studies.

The main steps include (1) LC-ESI-MS profiling analysis, (2) peak detection, supervised retention time alignment and peak matching for multiple groups of samples, (3) selection of the metabolites that are unknown induced natural products, (4) purification of target metabolites by preparative HPLC and finally,

(5) structure elucidation based on NMR. This strategy was applied to examine exogenous elicitors, signal molecules and signal transduction regulators for their effects on wound-, light- and glucan defense elicitor-induced metabolites in soybean. 2-Methoxy-3,9-dihydroxycoumestone, 1-methoxy-3,9-dihydroxycoumestone, 8-methoxy-3,9-dihydroxycoumestone, and 7,4’-dihydroxy-5’-methoxycoumaronochromone were discovered as secondary metabolites for the first time in soybean. The results illustrate the feasibility of the proposed metabolomics strategy in plant natural product discovery and also indicate that phenylpropanoid metabolism may be controlled by manipulating the trans-membrane potential.

Introduction

Plants have been utilized as medicines by human societies for millennia

(Zhang et al., 2010; Coates and Meyers, 2011). The chemical bases of their medicinal applications are natural products that have long been a productive source of lead molecules or even drugs themselves (Balunas and Kinghorn,

2005). In the areas of cancer and infectious disease, 60 and 75%, respectively, of new drugs originate from natural sources (Gullo et al., 2006).

The traditional way of finding a lead from natural products is based on bioactivity-guided fractionation. Basically, plants are extracted with different solvents in order to obtain constituents of different polarities, and each extract is then examined with specific bioassays (Kinghorn et al., 2011). This procedure covers most constitutive plant natural products. However, induced plant natural products may be overlooked.

Fifteen to twenty percent of the genes in Arabidopsis, for instance, are predicted to encode enzymes of secondary metabolism. These are far more than necessary to produce the metabolites identified so far in Arabidopsis

(Facchini et al., 2004; D'Auria and Gershenzon, 2005; Bottcher et al., 2008).

In order to survive, plants exposed to suboptimal growth conditions will use these secondary metabolism-related genes to produce certain induced plant natural products to overcome the threat. In our previous studies of soybean defense responses, several induced secondary metabolites were discovered (Cheng et al., 2011). These induced secondary metabolites not only play a very important role in soybean defense but also have beneficial effects on human health. Coumestrol, glyceocarpin, glyceofuran and the glyceollins are well-known inducible antibiotic phytoalexins that protect plants from microbial attack (Dewick et al., 1970; Dewick and Martin, 1979; Garcez et al., 2000). At the same time, these soybean-defense-related metabolites have shown potential for human health maintenance. Glyceollins exhibit marked antiestrogenic effects on estrogen receptors and are useful in the prevention or treatment of breast and ovarian carcinoma (Salvo et al., 2006). Glyceofuran and coumestrol are reported to be low-density lipoprotein antioxidants (Lee et al., 2006). Thus, an optimized strategy for induced natural product research will be of paramount importance to both plant physiology and human health.

Metabolomics has emerged as a powerful tool for functional genomics and systems biology (Sumner et al., 2003; Fukushima et al., 2009; Dudley et al., 2010). The major task of metabolomic analysis is to detect significant sample differences and patterns that identify specific biological conditions by quantitatively analyzing complete profiles of small molecules in biological samples, which makes it an ideal tool for studying induced plant metabolites.

Typical metabolomics analytical procedures involve metabolite profiling using mainly NMR spectroscopy or mass spectrometry, peak detection, alignment of retention time, peak matching across samples, unsupervised and/or supervised statistical analysis to identify significant differences between sub-groups, and compound identification by co-characterization with authentic compounds or database searches (Katajamaa and Oresic, 2007). To handle large metabolomic data sets, several free and commercial tools [e.g., MZmine (Katajamaa et al., 2006; Pluskal et al., 2010), XCMS (Smith et al., 2006), MarkerLynx (Waters), MassProfiler (Agilent) and MarkerView (Applied Biosystems)] have been developed to process raw mass spectrometric data and report a data table of aligned chromatographic peaks in all samples. These procedures have been successfully applied in drug evaluation (Lan and Jia, 2010), disease diagnosis

(Soga et al., 2011), plant physiology (Jackson et al., 2009), nutrition science

(Primrose et al., 2011), and toxicology (Chen et al., 2006). However, there are currently relatively few studies attempting to identify unknown induced compounds and no generalized strategy for unknown induced compounds has been proposed by others.

The objective of this project was to optimize a metabolomics strategy for the discovery of unknown induced compounds in plants. A series of simple procedures was proposed: (1) LC-ESI-MS profiling analysis, (2) peak detection, supervised retention time alignment and peak matching for multiple groups of samples, (3) selection of the metabolites that are unknown induced


natural products, (4) purification of target metabolites by preparative HPLC and finally, (5) structure elucidation based on NMR spectroscopy. Additionally, a novel algorithm has been designed for correctly matching isomer peaks.

The proposed strategy has been applied to examine exogenous elicitors, signal molecules and signal transduction regulators for their effects on wound-, light- and glucan defense elicitor-induced metabolites in soybean. Four induced metabolites were discovered as natural products for the first time in soybean and the results of this study suggest that phenylpropanoid metabolism may be controlled by manipulating the trans-membrane potential.

Results and Discussion

The workflow for the discovery of unknown induced plant metabolites is shown in Figure 4.1. The experiment imposes a stress that induces plant metabolite production.

LC-ESI-MS is the ideal tool to comprehensively profile the metabolite differences between induced and control samples. Peak detection finds peaks in each extracted ion chromatogram (EIC). Supervised retention time (Rt) alignment corrects Rt shifts. Peak matching identifies the same metabolites across all of the samples based on their retention time similarity and mass spectrum similarity. Selection of unknown metabolites determines the target metabolites. Preparative HPLC is then used to enrich and purify the target metabolites. Finally, structure elucidation is done using NMR spectroscopy. Next, we describe the details of each procedure, illustrated by our work on the discovery of unknown induced metabolites in soybean.


Figure 4.1 Workflow for the discovery of unknown induced plant metabolites.

Experimental Rationale: To discover unknown induced metabolites, the experiment should create conditions under which plant secondary metabolite pathways are strongly altered. In our study of unknown induced metabolite discovery in soybean, we tested over 25 elicitors using the classical cut cotyledon protocol, which has been successfully utilized in research on the multiplicity and coordination of cellular defense responses to various elicitors (Graham and Graham, 1991; Landini et al., 2003; Graham, 2005; Subramanian et al., 2005). An overview of signaling events in host-pathogen interactions is shown in Figure 4.2 (Graham, unpublished data).

All the tested elicitors (Table 4.1) were chosen based on the criterion that each should have an effect on at least one signaling event in host-pathogen interactions. Among the tested elicitors, nigericin, fumonisin, fusicoccin, monensin, vanadate and nystatin, all of which can disturb the steady state of the trans-membrane potential, share the capability to comprehensively activate soybean secondary metabolite production. As an example, the base ion chromatograms of nigericin (50 µM)-treated and water control samples are overlaid (Figure 4.3) to differentiate the metabolomes.

Figure 4.2 Overview of signaling events in host-pathogen interactions.


Table 4.1 All tested exogenous elicitors, signal molecules and signal transduction regulators and their functions and effects on the induced accumulation of glyceollins I and II.

Elicitor            Function                         Effect      Reference
vanadate            Disrupt transmembrane potential  Activation  (Cantley et al., 1978)
fusicoccin          Disrupt transmembrane potential  Activation  (Tode and Luthen, 2001)
A23187              Disrupt transmembrane potential  Activation  (Abbott et al., 1979)
nigericin           Disrupt transmembrane potential  Activation  (Daniele et al., 1978)
monensin            Disrupt transmembrane potential  Activation  (Mollenhauer et al., 1990)
valinomycin         Disrupt transmembrane potential  Activation  (Tosteson et al., 1967)
nystatin            Disrupt transmembrane potential  Activation  (Cass et al., 1970)
amphotericin        Disrupt transmembrane potential  Activation  (Cass et al., 1970)
lanthanum chloride  Disrupt transmembrane potential  Activation  (Zheng et al., 2000)
2,4-Dinitrophenol   Disrupt transmembrane potential  Activation  (Bielawski et al., 1966)
glutathione         Disrupt redox potential          NA          (Hothorn et al., 2006)
lactofen            Disrupt redox potential          Activation  (Graham, 2005)
rose bengal         Disrupt redox potential          Activation  (Hearse et al., 1989)
mycolaminaran       PAMP                             Activation  (Yoshikawa et al., 1983)
P. sojae WGE        PAMP                             Activation  (Graham et al., 2003)
LPS                 PAMP                             Activation  (van Loon et al., 2008)
Chitosan            PAMP                             Activation  (Bhatnagar and Sillanpaa, 2009)
Juglone             Allelopathic effect              NA          (Paulsen and Ljungman, 2005)
                    Allelopathic effect              NA          (He et al., 2009)
jasmonic acid       Defense signaling                NA          (Creelman et al., 1992)
arachidonic acid    Cellular signaling               NA          (Clark et al., 1991)
NPA                 Inhibit auxin transport          Inhibition  (Ruegger et al., 1997)
ACC                 Ethylene biosynthesis            NA          (Adams and Yang, 1979)
salicylic acid      Defense signaling                Inhibition  (Delaney et al., 1994)
epibrassinolide     Plant hormone                    NA          (Park, 1998)

PAMP: Pathogen-associated molecular pattern

LPS: Lipopolysaccharides

NPA: Naphthylphthalamic acid

ACC: 1-Aminocyclopropane-1-carboxylic acid

Activation: Induces accumulation of glyceollin I and II

NA: No effect on accumulation of glyceollin I and II

Inhibition: Inhibits WGE induced accumulation of glyceollin I and II


Figure 4.3 Overlaid base ion chromatograms from positive-ion HPLC-ESI-MS analysis of the extract of cotyledons treated with 50 µM nigericin and water control. Tissues were harvested after 4 days.

LC-ESI-MS profile analysis. No single technology currently available (or likely to be available in the near future) can detect all compounds found in plants (Hall, 2006; Fernie, 2007; Saito and Matsuda, 2010). However, gas chromatography-MS, liquid chromatography-MS, and NMR spectroscopy can each detect hundreds of metabolite signals simultaneously in one run and are sensitive and selective enough to cover most small-molecule plant natural products. The features, advantages and disadvantages of these analytical instruments are beyond the scope of this study and have been well summarized elsewhere (Issaq et al., 2009; Amantonico et al., 2010). For the purpose of unknown induced metabolite discovery, LC-ESI-MS is the best choice for two reasons. First, this method provides the best combination of detection range and sensitivity compared to GC/MS and NMR spectroscopy (Griffiths and Wang, 2009; Roux et al., 2011). Second, the liquid chromatography method used for LC-ESI-MS can be easily transferred to preparative HPLC, so that no separate preparative HPLC method development step is needed.

As to LC methods, the optimized metabolomics strategy for unknown induced metabolite discovery should take the opposite approach to a common metabolomics study. In a common metabolomics study, especially a clinical study, hundreds of LC/MS runs are often needed in order to find differences between, e.g., diseased and healthy patients. Therefore, the shorter the run times, the more efficient the study will be. However, plants under certain stress conditions might produce structurally similar induced metabolites (see Figure 4.6) that cannot be separated in a relatively short run. Thus, we extended our previous 80 min gradient (Cheng et al., 2011) to a 120 min run in order to obtain better separation. Since the purpose of LC-ESI-MS profiling is to detect whether induced metabolites occur in treated groups compared to a control group, the number of samples can be quite small, e.g., three treated samples and three control samples. Longer run times are therefore affordable.


Peak Detection. With the rapid development of metabolomics and proteomics, there has been significant progress in the development of peak detection algorithms, which have been examined thoroughly and compared in previous reviews (Tautenhahn et al., 2008; Yang et al., 2009). Since many compounds of interest in common metabolomics studies (such as biomarkers) occur at low concentrations, any free or commercial software designed for such studies can also be used as a peak detection tool for discovering induced peaks, which should be present at higher concentrations and are therefore easier to detect. Usually, the peak intensity threshold is an important parameter required in peak detection.

We suggest that the threshold be set to a relatively high level, such as 10% of the most abundant peak. The rationale is that even if induced peaks of low abundance are located, it would be very difficult to enrich them to the milligram quantities required for NMR spectroscopic de novo structure elucidation. In this study, peak detection was done using custom-designed bioinformatics tools (Chen et al., 2006; Cheng et al., 2011).
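The effect of a relative intensity threshold can be sketched as follows. This is a minimal illustration only, not the custom tools cited above: it assumes a simple local-maximum rule, and the function name, the parameter name and the intensity values are hypothetical.

```python
def detect_peaks(intensities, relative_threshold=0.10):
    """Return indices of local maxima whose intensity is at least
    relative_threshold times the most abundant point in the trace."""
    if not intensities:
        return []
    cutoff = relative_threshold * max(intensities)
    peaks = []
    for i in range(1, len(intensities) - 1):
        y = intensities[i]
        # A flat-topped peak is reported once, at its left edge.
        if y >= cutoff and y > intensities[i - 1] and y >= intensities[i + 1]:
            peaks.append(i)
    return peaks


# Hypothetical extracted ion chromatogram (arbitrary intensity units)
eic = [0, 2, 5, 2, 0, 20, 100, 30, 1, 0, 8, 15, 9, 0]
print(detect_peaks(eic))        # [6, 11]
print(detect_peaks(eic, 0.01))  # [2, 6, 11]
```

With the 10% threshold, the small maximum at index 2 is discarded, which mirrors the argument that low-abundance peaks would be hard to enrich for NMR analysis.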

Supervised retention time alignment and peak matching. The purpose of metabolomics is to compare the abundances of metabolites across samples in different groups (e.g., treated and control groups). Correct peak matching is the most crucial step when using bioinformatics tools for metabolomics. The criteria for defining the same metabolite are based on the similarities of the observed mass spectra and retention times. As a result of the development of advanced mass analyzers, mass shifts between runs are at an acceptable level and can be handled by a proper binning procedure (Smith et al., 2006; Parsons et al., 2009), which is a standard step in all of the metabolomics software mentioned above.

However, a retention time shift can be appreciable and nonlinear over the extent of a chromatogram (Nordstrom et al., 2006; Robinson et al., 2007; Chae et al., 2008). Retention time shifts from run to run are caused by uncontrolled experimental variables, such as column aging and instabilities in the flow rates of mobile phases and in the shape of thermal or mobile-phase gradients (Bylund et al., 2002; Nordstrom et al., 2004), and they hamper the recognition of the same metabolites across samples in different groups.

A number of algorithms have been developed for Rt shift correction

(Johnson et al., 2003; Jeffries, 2005; Wong et al., 2005; Fischer et al., 2006;

Lange et al., 2007; Hoffmann and Stoye, 2009; Kong and Reilly, 2009;

Podwojski et al., 2009). These algorithms fall into two categories depending on whether or not they use landmarks. Methods that do not use landmarks aim to obtain the best correlation between different chromatograms by warping. Methods that use landmarks attempt to define landmarks (features) across different chromatograms and use them as references to perform alignments. As illustrated in Figure 4.3, the chromatograms of induced and control samples are quite different. As was discussed above, the number of samples in this study is less than 10. We therefore used a supervised retention time alignment method that allows the user to separate the chromatogram into a certain number of segments and to choose one landmark in each segment. This method relies on two premises: (1) the judgment of the user is better than that of any algorithm for landmark detection and (2) peaks eluting near one another tend to show similar Rt shifts (Chae et al., 2008).
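A landmark-based correction of this kind can be sketched as below. The sketch assumes that the Rt shift is interpolated linearly between user-chosen landmarks and held constant outside them; the interpolation rule, the function name and the landmark values are assumptions made for illustration, since the text does not specify them.

```python
from bisect import bisect_right


def align_retention_times(rts, landmarks):
    """Map observed retention times onto the reference time axis.

    landmarks: (observed_rt, reference_rt) pairs, one per user-defined
    segment, with distinct observed values. The shift is interpolated
    linearly between landmarks; outside them, the shift of the nearest
    landmark is applied.
    """
    if not landmarks:
        raise ValueError("at least one landmark is required")
    points = sorted(landmarks)
    observed = [p[0] for p in points]
    shifts = [p[1] - p[0] for p in points]
    aligned = []
    for rt in rts:
        if rt <= observed[0]:
            shift = shifts[0]
        elif rt >= observed[-1]:
            shift = shifts[-1]
        else:
            j = bisect_right(observed, rt)  # observed[j-1] <= rt < observed[j]
            frac = (rt - observed[j - 1]) / (observed[j] - observed[j - 1])
            shift = shifts[j - 1] + frac * (shifts[j] - shifts[j - 1])
        aligned.append(rt + shift)
    return aligned


# Hypothetical landmarks (minutes): one per segment of a 120 min run
landmarks = [(10.0, 10.2), (50.0, 50.6), (100.0, 100.4)]
result = align_retention_times([5.0, 30.0, 75.0, 110.0], landmarks)
print([round(t, 2) for t in result])  # [5.2, 30.4, 75.5, 110.4]
```

Interpolating the shift, rather than the time itself, reflects the second premise above: peaks that elute close together receive nearly the same correction.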

After Rt shift correction, we used a unique method that we developed to match chemically identical peaks across samples in different groups using both spectral similarity and retention time similarity. To our knowledge, available LC-MS peak matching methods require that the deviation in retention time from sample to sample be no greater than the time between two adjacent peaks (Smith et al., 2006). However, our previous results (Cheng et al., 2011) showed that plants produce many isomers that generate very closely eluting adjacent peaks, leading to mismatching (Figure 4.4). To correctly match induced isomers, our peak matching method first relies on retention time: two peaks from two samples are matched if they are within the maximum retention time shift that is acceptable for the same metabolite. Then, as illustrated in Figure 4.4, when one peak can be matched to two peaks, a remedy is triggered that collects the mass fragment information of all three peaks. The peak is then matched to the candidate peak with which it shares the higher mass spectral similarity. The rationale behind this is that separable isomers generate different fragment patterns under ESI-MS conditions. Figure 4.5 shows the differences in the fragment patterns of isobaric peaks 1 and 2.
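The two-stage matching rule described above can be sketched as follows. This is not the M-file implementation used in this work: the text does not name the similarity measure, so cosine similarity is used here as an assumption, and the function names, retention times and fragment intensities are hypothetical.

```python
import math


def cosine_similarity(spec_a, spec_b):
    """Cosine similarity of two spectra given as {nominal m/z: intensity}."""
    shared = set(spec_a) & set(spec_b)
    dot = sum(spec_a[mz] * spec_b[mz] for mz in shared)
    norm_a = math.sqrt(sum(v * v for v in spec_a.values()))
    norm_b = math.sqrt(sum(v * v for v in spec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def match_peak(query, candidates, max_rt_shift):
    """Return the index of the candidate that matches query, or None.

    Stage 1: keep candidates within max_rt_shift of the query Rt.
    Stage 2: if several remain, choose the most similar spectrum.
    """
    rt_q, spec_q = query
    in_window = [i for i, (rt, _) in enumerate(candidates)
                 if abs(rt - rt_q) <= max_rt_shift]
    if not in_window:
        return None
    if len(in_window) == 1:
        return in_window[0]
    return max(in_window,
               key=lambda i: cosine_similarity(spec_q, candidates[i][1]))


# Hypothetical isobaric peaks: (Rt in min, {m/z: relative intensity})
query = (47.5, {299: 100, 284: 40, 256: 10})
candidates = [
    (47.2, {299: 100, 271: 60, 243: 20}),
    (47.7, {299: 100, 284: 35, 256: 12}),
    (57.0, {299: 100}),
]
print(match_peak(query, candidates, max_rt_shift=0.5))  # 1
```

Both of the first two candidates fall inside the retention time window, so the decision is made by the fragment pattern, as in the situation shown in Figure 4.4.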

Figure 4.4 Overlaid chromatograms of peaks 1 and 2 in Figure 4.3 from two samples of cotyledons treated with 50 µM nigericin. Due to the retention time shift, peak 2 in the black sample could potentially be matched to both peaks 1 and 2 in the red sample.


Figure 4.5 ESI-MS spectra of (A) peak 1 and (B) peak 2 in the positive-ion mode.

Selection of unknown metabolites and preparative HPLC. Successful peak matching will determine differences between treated and control groups, generating a list of all induced peaks. To find unknown induced metabolites in soybean, the peaks that have already been reported in soybean need to be excluded. An in-house database collecting the LC-MS information of all reported soybean secondary metabolites with molecular weights less than 800 has been constructed. Electrospray ionization in LC-ESI-MS might produce pseudomolecular ions and adducts such as [M+H]+, [M+Na]+ and others. Thus, some common adducts of the reported metabolites were calculated and are summarized in Table 4.2. If an induced peak matched any metabolite in the table, peak identification was conducted by comparing mass spectra and retention times with those of commercially available standards and pure compounds obtained in our previous studies. The peaks that could not be identified by this procedure and those that did not match any metabolite in the table were defined as unknown induced peaks. In this study, three unknown peaks were found; they are marked in Figure 4.3. The details of the repeated preparative HPLC work to purify these are discussed in the Materials and Methods section.
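The exclusion step can be illustrated with a short sketch that computes the nominal m/z values of the five pseudomolecular species listed in Table 4.2 and separates induced peaks into candidates for known metabolites and unknowns. The nominal mass offsets reproduce the values in Table 4.2; the function names and the list of induced m/z values are hypothetical, and only four database entries are included to keep the example short.

```python
# Nominal mass offsets of the pseudomolecular species in Table 4.2
ADDUCT_OFFSETS = {
    "[M+H]+": 1,
    "[M-H2O+H]+": -17,
    "[M+Na]+": 23,
    "[M+K]+": 39,
    "[M+ACN+H]+": 42,
}


def adduct_table(known):
    """Map each expected nominal m/z to its (compound, adduct) sources."""
    table = {}
    for name, mw in known.items():
        for adduct, offset in ADDUCT_OFFSETS.items():
            table.setdefault(mw + offset, []).append((name, adduct))
    return table


def split_induced_peaks(induced_mz, known):
    """Split induced peaks into candidate known metabolites and unknowns."""
    table = adduct_table(known)
    candidates = {mz: table[mz] for mz in induced_mz if mz in table}
    unknown = [mz for mz in induced_mz if mz not in table]
    return candidates, unknown


# Subset of Table 4.2 (compound: nominal molecular weight)
known = {"Daidzein": 254, "Genistein": 270,
         "Coumestrol": 268, "Glyceollin I": 338}
candidates, unknown = split_induced_peaks([255, 339, 299, 291], known)
print(candidates[291])  # [('Coumestrol', '[M+Na]+')]
print(unknown)          # [299]
```

A hit in the table is only a candidate assignment, because isomers share nominal masses; as described above, confirmation still requires comparison with standards.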


Table 4.2 Known soybean metabolites observed by LC-MS in our previous work (chapter 3) and their possible pseudomolecular species in ESI positive-ion mode.

Identified compound                       Molecular weight  [M+H]+  [M-H2O+H]+  [M+Na]+  [M+K]+  [M+ACN+H]+
Daidzin                                   416               417     399         439      455     458
7-O-Glucosyl-6''-O-malonate daidzein      254               255     237         277      293     296
Genistin                                  432               433     415         455      471     474
7-O-Glucosyl-6''-O-malonate genistein     518               519     501         541      557     560
7-O-Glucosyl-6''-O-malonate formononetin  516               517     499         539      555     558
Daidzein                                  254               255     237         277      293     296
Glyceofuran                               354               355     337         377      393     396
Genistein                                 270               271     253         293      309     312
Coumestrol                                268               269     251         291      307     310
Formononetin                              268               269     251         291      307     310
Glyceocarpin                              340               341     323         363      379     382
Glyceollin III                            338               339     321         361      377     380
Glyceollin I                              338               339     321         361      377     380
Glyceollin II                             338               339     321         361      377     380
3'-Prenyldaidzein                         322               323     305         345      361     364
8-Prenyldaidzein                          322               323     305         345      361     364
3'-Prenylretusin                          352               353     335         375      391     394
Phaseol                                   336               337     319         359      375     378
Glyceollin IV                             354               355     337         377      393     396
3'-Prenylgenistein                        338               339     321         361      377     380
3'-Prenylisoafrormosin                    366               367     349         389      405     408

NMR structure elucidation. NMR data showed that peak 1 in Figure 4.3 contained two isomers. The four structures elucidated by NMR are summarized in Figure 4.6. While none of them is new, they are all previously unknown in soybean, and we report for the first time the 13C NMR assignments of compound (11) (Rajani and Sarma, 1988) and compound (12) (Pocas et al., 2006). Furthermore, we corrected the misassignment of C-11b in compound (10), likely caused by resonance overlap with C-4 (Hatano et al., 2000), and resolved previously ambiguous assignments in compound (13) (Lin and Kuo, 1993). The latter is used to exemplify the structural elucidation process.

The NMR experiments were performed with separate samples dissolved in methanol-d4 as well as in DMSO-d6. First, the 1D 13C NMR spectrum showed a total of 16 resonances, and the 1D 1H NMR spectrum revealed two labile protons, one methyl group, and another five protons in the aromatic region. Taken together with the mass spectrometric measurement, the compound was formulated as C16H10O6. The 2D 1H-13C HSQC spectrum confirmed the nature of one methoxy moiety and five aromatic resonances. The latter were grouped into spin systems of two “A” [δ 7.173 and δ 7.436 in DMSO-d6] and one “ABX” [δ 8.055 (d, J = 8.65 Hz), 7.049 (d, J = 2.19 Hz), and 7.013 (dd, J = 8.65, 2.19 Hz) in DMSO-d6] on the basis of J-coupling correlations. Second, the ortho-coupled proton at 8.055 ppm was the only signal that had an HMBC correlation to the carbonyl carbon at 172.5 ppm, indicating the peri-position of this proton with respect to the carbonyl group (Figure 4.7). This proton also showed strong HMBC correlations to two other carbons (C-7 and C-9) resonating at 154.4 and 162.5 ppm, respectively, attributable to oxygenated sp2 carbons. Together with the J-coupling information and HMBC correlations observed for the signals at 7.049 and 7.013 ppm, a fused ring configuration of rings A and B was deduced. The hydroxy proton at 10.883 ppm was subsequently assigned to OH-7 based on its HMBC correlations to both C-6 and C-8. Third, the signal at 7.436 ppm showed an NOE with the methoxy protons, whereas that at 7.173 ppm had an NOE to the hydroxy proton at 9.460 ppm. Both of these substituents had HMBC correlations to a carbon at 146.6 ppm (C-5´), leading to the assignment of the substitution pattern of ring D followed by that of ring C. Finally, the relative orientations of ring C with respect to ring B and of ring D with respect to ring C were established on the basis of the following observations: (1) other than the carbonyl carbon, there were only six oxygenated sp2 carbons in the downfield 13C NMR spectrum (140 – 175 ppm); and (2) in addition to its correlations to C-4´ and C-5´, H-6´ exhibited another strong long-range correlation to a downfield 13C NMR resonance at 143.1 ppm, indicating a trans-3JCH correlation between H-6´ and this oxygenated sp2 carbon (C-2´). This compound was identified as the previously reported oblongin (Lin and Kuo, 1993). While both the 1H and 13C chemical shifts agreed well, the current study corrected the 13C NMR resonance assignments of C-3, C-6, C-8, C-10, C-1´, and C-3´.
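The atom counts quoted at the start of this elucidation can be checked against the proposed formula with simple arithmetic. The sketch below is only a consistency check using nominal atomic masses, not part of the elucidation procedure itself; the function and variable names are hypothetical.

```python
def nominal_mass(c, h, o):
    """Nominal mass of a CcHhOo formula (C = 12, H = 1, O = 16)."""
    return 12 * c + h + 16 * o


def double_bond_equivalents(c, h):
    """Rings plus double bonds for a formula containing only C, H and O."""
    return (2 * c + 2 - h) / 2


# Counts reported from the 1D spectra of compound (13)
carbon_resonances = 16
proton_counts = {"labile OH": 2, "methoxy CH3": 3, "aromatic CH": 5}

hydrogens = sum(proton_counts.values())
print(hydrogens)                                              # 10
print(nominal_mass(carbon_resonances, hydrogens, 6))          # 298
print(double_bond_equivalents(carbon_resonances, hydrogens))  # 12.0
```

Twelve double-bond equivalents are consistent with a four-ring skeleton carrying two benzene rings, one further C=C bond and one carbonyl group.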


[Structure drawings of compounds (10)-(13). Coumestone skeleton with substituents R1, R2 and R3: (10) R1 = H, R2 = OCH3, R3 = H; (11) R1 = OCH3, R2 = H, R3 = H; (12) R1 = H, R2 = H, R3 = OCH3. Compound (13): coumaronochromone skeleton with rings A, B, C and D.]

Figure 4.6 Chemical structures of 2-methoxy-3,9-dihydroxycoumestone (10) (Rajani and Sarma, 1988), 1-methoxy-3,9-dihydroxycoumestone (11) (Hatano et al., 2000), 8-methoxy-3,9-dihydroxycoumestone (12) (Lee et al., 2008), and oblonginol (coumaronochromone derivative) (13) (Lin and Kuo, 1993).


[HMBC spectrum of compound (13). Horizontal axis: 1H chemical shift (ppm), 7.0 to 8.0, with H-6', H-3', H-5, H-8 and H-6 labeled. Vertical axis: 13C chemical shift (ppm), 145 to 170, with C-2', C-4', C-5', C-9, C-7, C-2 and C-4 labeled.]

Figure 4.7 Heteronuclear multiple-bond correlation (HMBC) spectrum of compound (13) dissolved in DMSO-d6, showing the long-range correlations between the aromatic protons and the oxygenated sp2 carbon resonances appearing in the downfield region of the 13C NMR spectrum. The data set was recorded using a spectral width of 36 ppm in the 13C dimension (138.5 – 174.5 ppm), and the unfolded cross-peaks are boxed.


Conclusions. An optimized metabolomics strategy for unknown induced compound discovery in plants has been presented. In particular, a novel algorithm has been designed for correctly matching isomer peaks. A five-step strategy comprising (1) LC-ESI-MS profiling analysis, (2) peak detection, supervised retention time alignment and peak matching for multiple groups of samples, (3) selection of metabolites that are unknown induced natural products, (4) purification of target metabolites by preparative HPLC and, finally, (5) structure elucidation based on NMR spectroscopy has been applied successfully to the study of soybean-induced secondary metabolites. 2-Methoxy-3,9-dihydroxycoumestone, 1-methoxy-3,9-dihydroxycoumestone, 8-methoxy-3,9-dihydroxycoumestone, and 7,4’-dihydroxy-5’-methoxycoumaronochromone were discovered as secondary metabolites for the first time in soybean. Together with our previous study (Cheng et al., 2011), nine compounds previously unknown in soybean have been identified in this relatively well-studied plant, which illustrates the feasibility of using metabolomics for unknown plant natural product discovery and the ability of plants to provide additional novel compounds.

By examining 25 exogenous elicitors, signal molecules, and signal transduction regulators for their effects on wound-, light- and glucan defense elicitor-induced metabolites in soybean, it was found that nigericin, fumonisin, fusicoccin, monensin, vanadate and nystatin could comprehensively activate soybean secondary metabolite production. This result suggests that


phenylpropanoid metabolism may be controlled by manipulating the trans-membrane potential. Follow-up genetic work will be carried out to provide a systematic understanding of soybean-induced metabolite production.

Materials and methods

Plant Materials. Soybean seeds (Williams 82) were acquired from Dr.

Anne Dorrance (Department of Plant Pathology, The Ohio State University and the Ohio Agricultural Research and Development Center, Wooster, OH).

Seedlings were grown as described previously (Graham and Graham, 1996).

The flats were immediately watered very thoroughly for germination. The plants were watered every other day from the top with ½ teaspoon per gallon of Peter’s (20-20-20) fertilizer. Cotyledons were harvested from seven-day-old seedlings and used immediately. The cut cotyledon assays were performed as described previously (Graham and Graham, 1991). After treatment, the cotyledons were incubated in constant light at 400 µE/m2/s.

Chemicals. HPLC-grade acetonitrile and water were supplied by Fisher Scientific (Pittsburgh, PA). Vanadate, fusicoccin, A23187, nigericin, monensin, valinomycin, nystatin, amphotericin, lanthanum chloride, 2,4-dinitrophenol, glutathione, lactofen, Rose Bengal, mycolaminaran, lipopolysaccharide (LPS), chitosan, juglone, catechin, okadaic acid, calyculin A, K-252a, jasmonic acid, arachidonic acid, NPA, ACC, salicylic acid, epibrassinolide, methanol-d4, and formic acid were purchased from Sigma-Aldrich (St. Louis, MO). The intact wall glucan elicitor was prepared from the cell walls of race 1 of Phytophthora sojae as described previously (Ayers et al., 1976).

Liquid Chromatography Mass Spectrometry Analysis. All the samples were prepared and analyzed following previously published protocols (Cheng et al., 2011). In brief, cell layers from 10 replicate cotyledons were pooled as one sample and weighed directly into 2 ml microfuge tubes. Extraction was performed directly in the microfuge tubes in 80% ethanol (800 µL per 50 mg tissue) using a polypropylene pestle (Kontes) mounted on a rechargeable hand drill. Extracts were centrifuged at 18,000 × g for 5 min, and 40 µL of the supernatant was analyzed by HPLC-ESI-MS on a Varian 500-MS system consisting of a Varian 212-LC binary gradient LC/MS pump, a Prostar 420 autosampler and an ion trap mass spectrometer equipped with an ESI source. LC separation was achieved on a C18 reverse phase column (LiChrosorb RP-18, 10 µm, 250 mm × 4.6 mm, Alltech Associates, Deerfield, Illinois) at a flow rate of 0.3 ml/min. Mobile phases A and B were water and acetonitrile, both with 0.01% formic acid added. The linear gradient program was as follows: A = 85% at t = 0 min; A = 10% at t = 110 min; A = 85% at t = 120 min. The positive-ion mode was used over the range m/z 50-800. Other MS parameters were drying gas 20 psi at 350 ˚C, nebulizing gas 40 psi, needle voltage 5000 V, capillary voltage 80 V, and spray shield voltage 600 V. MSn results were generated by data-dependent scanning analysis on the Varian 500-MS system.
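For readers who wish to relate a retention time to the mobile phase composition, the linear gradient program above can be written as a small interpolation function. This is an illustrative sketch only; the names are hypothetical, and instrument dwell volume and column dead time are ignored.

```python
# (time in min, percent of mobile phase A) from the gradient program above
GRADIENT = [(0.0, 85.0), (110.0, 10.0), (120.0, 85.0)]


def percent_a(t, program=GRADIENT):
    """Percent of mobile phase A (water, 0.01% formic acid) at time t."""
    if t < program[0][0] or t > program[-1][0]:
        raise ValueError("time outside the gradient program")
    for (t0, a0), (t1, a1) in zip(program, program[1:]):
        if t0 <= t <= t1:
            return a0 + (a1 - a0) * (t - t0) / (t1 - t0)


for minute in (0, 55, 110, 115):
    a = percent_a(minute)
    print(minute, a, 100.0 - a)  # time, %A, %B (acetonitrile)
```

For example, the program gives 47.5% A both at 55 min, during the main gradient, and at 115 min, during the return to starting conditions.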

Peak detection, supervised retention time alignment and peak matching. LC/MS raw data were imported into Matlab version R2008a (The Mathworks, Inc., Natick, MA) as described before (Chen et al., 2006; Cheng et al., 2011). Peak detection was performed with our previously described M-files. Supervised retention time alignment and peak matching were accomplished with new M-files, which are available upon request.

Preparative HPLC. In order to obtain at least 1 mg of each pure compound, as required for de novo structure elucidation by NMR, 2000 cotyledons were treated with Rose Bengal and kept for 4 days. The extraction and the first isolation step were the same as described previously (Cheng et al., 2011). Fractions eluting from 8 min to 14 min were collected and lyophilized, giving 244 mg of residue. The residue was resuspended by dissolving it in 10 ml methanol and diluting with water to 50 ml. The second isolation step used the profiling linear gradient described in the section above. Because a manual injection valve with a 2 ml loop was added, every peak eluted 7 min later than the retention times shown in Figure 4.2. Repetitive isolation produced 2-methoxy-3,9-dihydroxycoumestone and 1-methoxy-3,9-dihydroxycoumestone (5 mg at 47.2 min), 8-methoxy-3,9-dihydroxycoumestone (2 mg at 47.7 min), and 7,4’-dihydroxy-5’-methoxycoumaronochromone (4 mg at 57 min).
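The 7 min offset is consistent with the extra volume that the injection loop places ahead of the column. As a back-of-the-envelope check, the sketch below assumes that the full 2 ml loop volume is swept at the profiling flow rate of 0.3 ml/min; the text does not state how the delay was determined, so this is an assumption, and the function names are hypothetical. The same sketch computes the nominal methanol content of the loading suspension (10 ml methanol made up to 50 ml).

```python
def loop_delay_min(loop_volume_ml, flow_rate_ml_per_min):
    """Time needed to sweep the loop volume at the given flow rate."""
    return loop_volume_ml / flow_rate_ml_per_min


def methanol_percent(methanol_ml, final_volume_ml):
    """Nominal methanol content (% v/v), ignoring volume contraction."""
    return 100.0 * methanol_ml / final_volume_ml


delay = loop_delay_min(2.0, 0.3)          # roughly 6.7 min, i.e. about 7 min
loading = methanol_percent(10.0, 50.0)    # nominally 20% methanol
print(f"loop delay: {delay:.1f} min; loading solvent: {loading:.0f}% methanol")
```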

Nuclear magnetic resonance (NMR) analysis. The HPLC-purified samples were lyophilized and reconstituted in either 500 μl methanol-d4 or 500 μl DMSO-d6. It is important to note that compounds (10) and (11) were obtained as a mixture in a ratio of roughly 0.8:1; further preparative separation proved to be extremely difficult. The NMR experiments were conducted at room temperature on either a Bruker DRX-800 spectrometer (Bruker, Karlsruhe, Germany) operating at 800.13 MHz for 1H and 201.19 MHz for 13C, or a Bruker DMX-600 spectrometer operating at 600.13 MHz for 1H and 150.9 MHz for 13C. The experiments generally included 1D 1H NMR, 1D 13C NMR, 2D homonuclear COSY and NOESY (240 ms mixing time), and 2D heteronuclear 1H-13C HMBC (optimized for observation of small J couplings) and HSQC. The mixture was also subjected to 1D selective NOESY and TOCSY experiments, which offered the resolution needed to resolve ambiguous assignments. The residual solvent peaks were used as internal references for the determination of 1H and 13C NMR chemical shifts.
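Chemical shifts in this section are reported in ppm and coupling constants in Hz. The two scales are related through the spectrometer frequency (difference in Hz = difference in ppm × frequency in MHz). As a minimal sketch with a hypothetical helper name, the following shows how far apart the H-2 and H-4 doublets of compound (10) in DMSO-d6 (6.514 and 6.518 ppm) lie in Hz at the two 1H frequencies used here:

```python
def ppm_to_hz(delta_ppm, spectrometer_mhz):
    """Convert a chemical shift difference in ppm to Hz."""
    return delta_ppm * spectrometer_mhz


sep_600 = ppm_to_hz(6.518 - 6.514, 600.13)  # about 2.4 Hz
sep_800 = ppm_to_hz(6.518 - 6.514, 800.13)  # about 3.2 Hz
print(f"{sep_600:.2f} Hz at 600.13 MHz, {sep_800:.2f} Hz at 800.13 MHz")
```

A separation of this size is comparable to the 1.96 Hz coupling reported for those two doublets.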

Compound (10): 1-methoxy-3,9-dihydroxycoumestone. 1H NMR (DMSO-d6, 600.13 MHz) δ 3.978 (s, 3H, OCH3-1), 6.514 (d, 1H, J=1.96 Hz, H-2), 6.518 (d, 1H, J=1.96 Hz, H-4), 6.934 (m, 1H, J=1.96, 8.37 Hz, H-8), 7.122 (d, 1H, J=1.96 Hz, H-10), 7.683 (d, 1H, J=8.37 Hz, H-7), 9.960 (s, 1H, OH-9), 10.712 (s, 1H, OH-3); 13C NMR (DMSO-d6, 150.90 MHz) δ 56.3 (OCH3-1), 95.7 (C-11b), 95.8 (C-4), 96.1 (C-2), 98.4 (C-10), 101.2 (C-6a), 113.8 (C-8), 114.2 (C-6b), 120.2 (C-7), 155.3 (C-4a), 155.7 (C-10a), 156.6 (C-1), 156.7 (C-9), 157.4 (C-6), 159.0 (C-11a), 161.6 (C-3).

1H NMR (methanol-d4, 600.13 MHz) δ 4.015 (s, 3H, OCH3-1), 6.472 (d, 1H, J=1.86 Hz, H-2), 6.487 (d, 1H, J=1.86 Hz, H-4), 6.916 (m, 1H, J=2.01, 8.36 Hz, H-8), 7.075 (d, 1H, J=2.01 Hz, H-10), 7.729 (d, 1H, J=8.36 Hz, H-7); 13C NMR (methanol-d4, 150.90 MHz) δ 56.9 (OCH3-1), 97.1 (C-2, 4), 97.8 (C-11b), 99.6 (C-10), 101.6 (C-6a), 114.9 (C-8), 116.2 (C-6b), 121.9 (C-7), 157.4 (C-4a), 158.0 (C-10a), 158.4 (C-9), 158.8 (C-1), 160.7 (C-6), 161.8 (C-11a), 163.6 (C-3).

Compound (11): 2-methoxy-3,9-dihydroxycoumestone. 1H NMR (DMSO-d6, 600.13 MHz) δ 3.929 (s, 3H, OCH3-2), 6.952 (m, 1H, J=2.00, 8.38 Hz, H-8), 6.983 (s, 1H, H-4), 7.179 (d, 1H, J=2.00 Hz, H-10), 7.416 (s, 1H, H-1), 7.705 (d, 1H, J=8.38 Hz, H-7), 10.030 (s, 1H, OH-9), 10.448 (s, 1H, OH-3); 13C NMR (DMSO-d6, 150.90 MHz) δ 56.1 (OCH3-2), 98.6 (C-10), 102.2 (C-6a), 102.4 (C-1), 103.2 (C-11b), 103.7 (C-4), 113.9 (C-8), 114.7 (C-6b), 120.6 (C-7), 145.9 (C-2), 148.6 (C-4a), 151.1 (C-3), 155.9 (C-10a), 157.0 (C-9), 157.6 (C-6), 159.4 (C-11a).

1H NMR (methanol-d4, 600.13 MHz) δ 3.998 (s, 3H, OCH3-2), 6.932 (s, 1H, H-4), 6.933 (m, 1H, J=2.00, 8.36 Hz, H-8), 7.095 (d, 1H, J=2.00 Hz, H-10), 7.417 (s, 1H, H-1), 7.747 (d, 1H, J=8.36 Hz, H-7); 13C NMR (methanol-d4, 150.90 MHz) δ 57.0 (OCH3-2), 99.7 (C-10), 103.1 (C-1), 104.1 (C-6a), 105.0 (C-4), 115.1 (C-8), 116.7 (C-6b), 122.3 (C-7), 147.7 (C-2), 150.8 (C-4a), 152.8 (C-3), 158.2 (C-10a), 158.7 (C-9), 161.0 (C-6), 161.9 (C-11a).

Compound (12): 8-methoxy-3,9-dihydroxycoumestone. 1H NMR (DMSO-d6, 800.13 MHz) δ 3.889 (s, 3H, OCH3-8), 6.913 (d, 1H, J=2.23 Hz, H-4), 6.934 (m, 1H, J=2.23, 8.51 Hz, H-2), 7.252 (s, 1H, H-10), 7.330 (s, 1H, H-7), 7.852 (d, 1H, J=8.51 Hz, H-1), 9.625 (s, 1H, OH-9), 10.670 (s, 1H, OH-3); 13C NMR (DMSO-d6, 150.90 MHz) δ 56.1 (OCH3-8), 99.2 (C-10), 101.8 (C-7), 102.4 (C-6a), 103.1 (C-4), 104.4 (C-11b), 113.8 (C-2), 113.9 (C-6b), 122.7 (C-1), 146.7 (C-9), 146.9 (C-8), 149.5 (C-10a), 154.5 (C-4a), 157.8 (C-6), 159.2 (C-11a), 161.0 (C-3).

1H NMR (methanol-d4, 600.13 MHz) δ 3.976 (s, 3H, OCH3-8), 6.894 (d, 1H, J=2.17 Hz, H-4), 6.934 (m, 1H, J=2.17, 8.54 Hz, H-2), 7.174 (s, 1H, H-10), 7.464 (s, 1H, H-7), 7.863 (d, 1H, J=8.54 Hz, H-1); 13C NMR (methanol-d4, 150.90 MHz) δ 57.0 (OCH3-8), 99.9 (C-10), 103.2 (C-7), 104.3 (C-4, 6a), 106.4 (C-11b), 115.1 (C-2), 115.9 (C-6b), 123.8 (C-1), 148.3 (C-9), 148.4 (C-8), 151.8 (C-10a), 156.5 (C-4a), 160.9 (C-6), 161.8 (C-11a), 162.9 (C-3).

Compound (13): Oblongin. 1H NMR (DMSO-d6, 800.13 MHz) δ 3.880 (s, 3H, OCH3-5´), 7.013 (m, 1H, J=2.19, 8.65 Hz, H-6), 7.049 (d, 1H, J=2.19 Hz, H-8), 7.173 (s, 1H, H-3´), 7.436 (s, 1H, H-6´), 8.055 (d, 1H, J=8.65 Hz, H-5), 9.460 (s, 1H, OH-4´), 10.883 (s, 1H, OH-7); 13C NMR (DMSO-d6, 200.10 MHz) δ 56.2 (OCH3-5´), 98.6 (C-3), 99.3 (C-3´), 103.0 (C-8, 6´), 113.3 (C-1´), 114.9 (C-6), 115.7 (C-10), 127.2 (C-5), 143.1 (C-2´), 145.6 (C-4´), 146.6 (C-5´), 154.4 (C-9), 162.5 (C-7), 163.8 (C-2), 172.5 (C-4).

1H NMR (methanol-d4, 600.13 MHz) δ 3.976 (s, 3H, OCH3-5´), 7.016 (m, 1H, J=2.05, 8.83 Hz, H-6), 7.024 (d, 1H, J=2.05 Hz, H-8), 7.100 (s, 1H, H-3´), 7.642 (s, 1H, H-6´), 8.164 (d, 1H, J=8.83 Hz, H-5); 13C NMR (methanol-d4, 200.10 MHz) δ 57.1 (OCH3-5´), 99.7 (C-3), 100.0 (C-3´), 104.1 (C-8), 104.5 (C-6´), 115.1 (C-1´), 116.3 (C-6), 117.2 (C-10), 128.6 (C-5), 145.6 (C-2´), 147.4 (C-4´), 148.2 (C-5´), 156.7 (C-9), 164.7 (C-7), 166.4 (C-2), 176.0 (C-4).

References

Abbott BJ, Fukuda DS, Dorman DE, Occolowitz JL, Debono M, Farhner L

(1979) Microbial transformation of A23187, a divalent cation ionophore

antibiotic. Antimicrob Agents Chemother 16: 808-812

Abe N, Sato H, Sakamura S (1987) Antifungal Stress Compounds from

Adzuki Bean, Vigna-Angularis, Treated with Cephalosporium-

Gregatum Type-B. Agric Biol Chem 51: 349-353

Aberg KM, Torgrip RJ, Kolmert J, Schuppe-Koistinen I, Lindberg J (2008)

Feature detection and alignment of hyphenated chromatographic-mass

spectrometric data. Extraction of pure ion chromatograms using

Kalman tracking. J Chromatogr A 1192: 139-146

Adams DO, Yang SF (1979) Ethylene biosynthesis: Identification of 1-

aminocyclopropane-1-carboxylic acid as an intermediate in the

conversion of methionine to ethylene. Proc Natl Acad Sci U S A 76:

170-174

Al-Khatib K, Peterson DE, Regehr DL (2000) Control of imazethapyr-

resistant common sunflower (Helianthus annuus) in soybean (Glycine

max) and corn (Zea mays). Weed Technol 14: 133-139

Allwood JW, Ellis DI, Heald JK, Goodacre R, Mur LAJ (2006) Metabolomic

approaches reveal that phosphatidic and phosphatidyl glycerol

phospholipids are major discriminatory non-polar metabolites in

responses by Brachypodium distachyon to challenge by Magnaporthe

grisea. Plant J 46: 351-368

Allwood JW, Goodacre R (2010) An introduction to liquid chromatography-

mass spectrometry instrumentation applied in plant metabolomic

analyses. Phytochem Anal 21: 33-47

Amantonico A, Urban PL, Zenobi R (2010) Analytical techniques for single-

cell metabolomics: state of the art and trends. Anal Bioanal Chem 398:

2493-2504

Antignac JP, Courant F, Pinel G, Bichon E, Monteau F, Elliott C, Le Bizec

B (2011) Mass spectrometry-based metabolomics applied to the

chemical safety of food. Trac-Trends in Analytical Chemistry 30: 292-

301

Ara T, Sakurai N, Tange Y, Morishita Y, Suzuki H, Aoki K, Saito K,

Shibata D (2009) Improvement of the quantitative differential

metabolome pipeline for gas chromatography-mass spectrometry data

by automated reliable peak selection. Plant Biotechnology 26: 445-449

Arjmandi BH, Getlinger MJ, Goyal NV, Alekel L, Hasler CM, Juma S,

Drum ML, Hollis BW, Kukreja SC (1998) Role of soy protein with

normal or reduced isoflavone content in reversing bone loss induced

by ovarian hormone deficiency in rats. American Journal of Clinical

Nutrition 68: 1358S-1363S

Asada Y, Li W, Yoshikawa T (1999) The first prenylated biaurone,

licoagrone from hairy root cultures of Glycyrrhiza glabra.

Phytochemistry 50: 1015-1019

Ayers AR, Ebel J, Valent B, Albersheim P (1976) Host-pathogen

interactions. 10. Fractionation and biological-activity of an elicitor

isolated from mycelial walls of Phytophthora-Megasperma var sojae.

Plant Physiol 57: 760-765

Balunas MJ, Kinghorn AD (2005) Drug discovery from medicinal plants. Life

Sci 78: 431-441

Beekwilder J, Jonker H, Meesters P, Hall RD, van der Meer IM, de Vos

CHR (2005) Antioxidants in raspberry: On-line analysis links

antioxidant activity to a diversity of individual metabolites. J Agric Food

Chem 53: 3313-3320

Bhatnagar A, Sillanpaa M (2009) Applications of chitin- and chitosan-

derivatives for the detoxification of water and wastewater--a short

review. Adv Colloid Interface Sci 152: 26-38

Bielawski J, Thompson TE, Lehninger AL (1966) The effect of 2,4-

dinitrophenol on the electrical resistance of phospholipid bilayer

membranes. Biochem Biophys Res Commun 24: 948-954

Bobzin SC, Yang ST, Kasten TP (2000) Application of liquid

chromatography-nuclear magnetic resonance spectroscopy to the

identification of natural products. Journal of Chromatography B 748:

259-267

Boccard J, Veuthey JL, Rudaz S (2010) Knowledge discovery in

metabolomics: an overview of MS data handling. J Sep Sci 33: 290-

304

Bottcher C, von Roepenack-Lahaye E, Schmidt J, Schmotz C, Neumann

S, Scheel D, Clemens S (2008) Metabolome analysis of biosynthetic

mutants reveals a diversity of metabolic changes and allows

identification of a large number of new compounds in Arabidopsis.

Plant Physiol 147: 2107-2120

Bylund D, Danielsson R, Malmquist G, Markides KE (2002)

Chromatographic alignment by warping and dynamic programming as

a pre-processing tool for PARAFAC modelling of liquid

chromatography-mass spectrometry data. J Chromatogr A 961: 237-

244

Caballero P, Smith CM, Fronczek FR, Fischer NH (1986) Isoflavones from

an insect-resistant variety of soybean and the molecular-structure of

afrormosin. J Nat Prod 49: 1126-1129

Calle ML, Urrea V (2011) Letter to the editor: Stability of random forest

importance measures. Brief Bioinform 12: 86-89

Cantley LC, Jr., Cantley LG, Josephson L (1978) A characterization of

vanadate interactions with the (Na,K)-ATPase. Mechanistic and

regulatory implications. J Biol Chem 253: 7361-7368

Cass A, Finkelstein A, Krespi V (1970) The ion permeability induced in thin

lipid membranes by the polyene antibiotics nystatin and amphotericin B.

J Gen Physiol 56: 100-124

Chacha M, Bojase-Moleta G, Majinda RR (2005) Antimicrobial and radical

scavenging flavonoids from the stem wood of Erythrina latissima.

Phytochemistry 66: 99-104

Chae M, Shmookler Reis RJ, Thaden JJ (2008) An iterative block-shifting

approach to retention time alignment that preserves the shape and

area of gas chromatography-mass spectrometry peaks. BMC

Bioinformatics 9 Suppl 9: S15

Chen M, Su M, Zhao L, Jiang J, Liu P, Cheng J, Lai Y, Liu Y, Jia W (2006)

Metabonomic study of aristolochic acid-induced nephrotoxicity in rats.

J Proteome Res 5: 995-1002

Cheng J, Yuan C, Graham TL (2011) Potential defense-related prenylated

isoflavones in lactofen-induced soybean. Phytochemistry 72: 875-881

Clark JD, Lin LL, Kriz RW, Ramesha CS, Sultzman LA, Lin AY, Milona N,

Knopf JL (1991) A novel arachidonic acid-selective cytosolic PLA2

contains a Ca(2+)-dependent translocation domain with homology to

PKC and GAP. Cell 65: 1043-1051

Coates PM, Meyers CM (2011) The National Institutes of Health investment

in research on botanicals. Fitoterapia 82: 11-13

Creelman RA, Tierney ML, Mullet JE (1992) Jasmonic acid/methyl

jasmonate accumulate in wounded soybean hypocotyls and modulate

wound gene expression. Proc Natl Acad Sci U S A 89: 4938-4941

D'Auria JC, Gershenzon J (2005) The secondary metabolism of Arabidopsis

thaliana: growing like a weed. Curr Opin Plant Biol 8: 308-316

Daniele RP, Holian SK, Nowell PC (1978) A potassium ionophore (Nigericin)

inhibits stimulation of human lymphocytes by mitogens. J Exp Med 147:

571-581

Dann EK, Diers BW, Hammerschmidt R (1999) Suppression of Sclerotinia

stem rot of soybean by lactofen herbicide treatment. Phytopathology

89: 598-602

De Vos RCH, Moco S, Lommen A, Keurentjes JJB, Bino RJ, Hall RD

(2007) Untargeted large-scale plant metabolomics using liquid

chromatography coupled to mass spectrometry. Nature Protocols 2:

778-791

Delaney TP, Uknes S, Vernooij B, Friedrich L, Weymann K, Negrotto D,

Gaffney T, Gut-Rella M, Kessmann H, Ward E, Ryals J (1994) A

central role of salicylic acid in plant disease resistance. Science 266:

1247-1250

Dewick PM, Barz W, Grisebach H (1970) Biosynthesis of coumestrol in

Phaseolus aureus. Phytochemistry 9: 775-779

Dewick PM, Martin M (1979) Biosynthesis of pterocarpan, isoflavan and

coumestan metabolites of Medicago sativa - chalcone, isoflavone and

isoflavanone precursors. Phytochemistry 18: 597-602

Dixon RA, Gang DR, Charlton AJ, Fiehn O, Kuiper HA, Reynolds TL,

Tjeerdema RS, Jeffery EH, German JB, Ridley WP, Seiber JN

(2006) Applications of metabolomics in agriculture. J Agric Food Chem

54: 8984-8994

Dixon RA, Sumner LW (2003) Legume natural products: Understanding and

manipulating complex pathways for human and animal health. Plant

Physiology 131: 878-885

Dizge N, Aydiner C, Imer DY, Bayramoglu M, Tanriseven A, Keskinler B

(2009) Biodiesel production from sunflower, soybean, and waste

cooking oils by transesterification using lipase immobilized onto a

novel microporous polymer. Bioresour Technol 100: 1983-1991

Dudley E, Yousef M, Wang Y, Griffiths WJ (2010) Targeted metabolomics

and mass spectrometry. Adv Protein Chem Struct Biol 80: 45-83

Ebel J (1986) Phytoalexin synthesis - the biochemical-analysis of the

induction process. Annual Review of Phytopathology 24: 235-264

Eliasson M, Rannar S, Trygg J (2011) From data processing to multivariate

validation - essential steps in extracting interpretable information from

metabolomics data. Curr Pharm Biotechnol 12: 996-1004

Emami SA, Amin-Ar-Ramimeh E, Ahi A, Kashy MRB, Schneider B,

Iranshahi M (2007) Prenylated flavonoids and flavonostilbenes from

Sophora pachycarpa roots. Pharmaceutical Biology 45: 453-457

Exarchou V, Godejohann M, van Beek TA, Gerothanassis IP, Vervoort J

(2003) LC-UV-solid-phase extraction-NMR-MS combined with a

cryogenic flow probe and its application to the identification of

compounds present in Greek oregano. Analytical Chemistry 75: 6288-

6294

Exarchou V, Krucker M, van Beek TA, Vervoort J, Gerothanassis IP,

Albert K (2005) LC-NMR coupling technology: recent advancements

and applications in natural products analysis. Magnetic Resonance in

Chemistry 43: 681-687

Facchini PJ, Bird DA, St-Pierre B (2004) Can Arabidopsis make complex

alkaloids? Trends in Plant Science 9: 116-122

Fernie AR (2007) The future of metabolic phytochemistry: larger numbers of

metabolites, higher resolution, greater understanding. Phytochemistry

68: 2861-2880

Fiehn O, Kopka J, Dormann P, Altmann T, Trethewey RN, Willmitzer L

(2000) Metabolite profiling for plant functional genomics. Nature

Biotechnology 18: 1157-1161

Fischer B, Grossmann J, Roth V, Gruissem W, Baginsky S, Buhmann JM

(2006) Semi-supervised LC/MS alignment for differential proteomics.

Bioinformatics 22: e132-140

Food Agriculture Organization of the United Nations (FAO), FAOSTAT,

http://faostat.fao.org/site/567/DesktopDefault.aspx?PageID=567#ancor

Accessed: Feb. 2011.

Franke AA, Halm BM, Ashburn LA (2008) Isoflavones in children and adults

consuming soy. Archives of Biochemistry and Biophysics 476: 161-170

Fukushima A, Kusano M, Redestig H, Arita M, Saito K (2009) Integrated

omics approaches in plant systems biology. Curr Opin Chem Biol 13:

532-538

Furstner A, Heilmann EK, Davies PW (2007) Total synthesis of the

antibiotic erypoegin H and cognates by a PtCl2-catalyzed

cycloisomerization reaction. Angewandte Chemie-International Edition

46: 4760-4763

Garcez WS, Martins D, Garcez FR, Marques MR, Pereira AA, Oliveira LA,

Rondon JN, Peruca AD (2000) Effect of spores of saprophytic fungi

on phytoalexin accumulation in seeds of frog-eye leaf spot and stem

canker-resistant and -susceptible soybean (Glycine max L.) cultivars.

Journal of Agricultural and Food Chemistry 48: 3662-3665

Garcia-Villalba R, Leon C, Dinelli G, Segura-Carretero A, Fernandez-

Gutierrez A, Garcia-Canas V, Cifuentes A (2008) Comparative

metabolomic study of transgenic versus conventional soybean using

capillary electrophoresis-time-of-flight mass spectrometry. Journal of

Chromatography A 1195: 164-173

Graham MY (2005) The diphenylether herbicide lactofen induces cell death

and expression of defense-related genes in soybean. Plant Physiology

139: 1784-1794

Graham MY, Graham TL (1991) Rapid accumulation of anionic peroxidases

and phenolic polymers in soybean cotyledon tissues following

treatment with Phytophthora megasperma f. sp. glycinea wall glucan.

Plant Physiol 97: 1445-1455

Graham MY, Weidner J, Wheeler K, Pelow MJ, Graham TL (2003) Induced

expression of pathogenesis-related protein genes in soybean by

wounding and the Phytophthora sojae cell wall glucan elicitor. Physiol

Mol Plant Pathol 63: 141-149

Graham TL (1991) A rapid, high resolution high performance liquid

chromatography profiling procedure for plant and microbial aromatic

secondary metabolites. Plant Physiol 95: 584-593

Graham TL, Graham MY (1991) Glyceollin elicitors induce major but

distinctly different shifts in isoflavonoid metabolism in proximal and

distal soybean cell-populations. Mol Plant-Microbe Interact 4: 60-68

Graham TL, Graham MY (1996) Signaling in soybean phenylpropanoid

responses - Dissection of primary, secondary, and conditioning effects

of light, wounding, and elicitor treatments. Plant Physiol 110: 1123-

1133

Graham TL, Graham MY, Subramanian S, Yu O (2007) RNAi silencing of

genes for elicitation or biosynthesis of 5-deoxyisoflavonoids

suppresses race-specific resistance and hypersensitive cell death in

Phytophthora sojae infected tissues. Plant Physiol 144: 728-740

Griffith AP, Collison MW (2001) Improved methods for the extraction and

analysis of isoflavones from soy-containing foods and nutritional

supplements by reversed-phase high-performance liquid

chromatography and liquid chromatography-mass spectrometry. J

Chromatogr A 913: 397-413

Griffiths WJ, Wang Y (2009) Mass spectrometry: from proteomics to

metabolomics and lipidomics. Chem Soc Rev 38: 1882-1896

Gullo VP, McAlpine J, Lam KS, Baker D, Petersen F (2006) Drug discovery

from natural products. J Ind Microbiol Biotechnol 33: 523-531

Hakamatsuka T, Ebizuka Y, Sankawa U (1991) Induced Isoflavonoids from

copper chloride-treated stems of Pueraria lobata. Phytochemistry 30:

1481-1482

Hall RD (2006) Plant metabolomics: from holistic hope, to hype, to hot topic.

New Phytol 169: 453-468

Hatano T, Aga Y, Shintani Y, Ito H, Okuda T, Yoshida T (2000) Minor

flavonoids from licorice. Phytochemistry 55: 959-963

He WM, Feng Y, Ridenour WM, Thelen GC, Pollock JL, Diaconu A,

Callaway RM (2009) Novel weapons and invasion: biogeographic

differences in the competitive effects of Centaurea maculosa and its

root exudate (+/-)-catechin. Oecologia 159: 803-815

Hearse DJ, Kusama Y, Bernier M (1989) Rapid electrophysiological

changes leading to arrhythmias in the aerobic rat heart.

Photosensitization studies with rose bengal-derived reactive oxygen

intermediates. Circ Res 65: 146-153

Ho SC, Woo JLF, Leung SSF, Sham ALK, Lam TH, Janus ED (2000)

Intake of soy products is associated with better plasma lipid profiles in

the Hong Kong Chinese population. Journal of Nutrition 130: 2590-

2593

Hoffmann N, Stoye J (2009) ChromA: signal-based retention time alignment

for chromatography-mass spectrometry data. Bioinformatics 25: 2080-

2081

Holder CL, Churchwell MI, Doerge DR (1999) Quantification of soy

isoflavones, genistein and daidzein, and conjugates in rat blood using

LC/ES-MS. J Agric Food Chem 47: 3764-3770

Hothorn M, Wachter A, Gromes R, Stuwe T, Rausch T, Scheffzek K (2006)

Structural basis for the redox control of plant glutamate cysteine ligase.

J Biol Chem 281: 27557-27565

Hsu A, Bray TM, Helferich WG, Doerge DR, Ho E (2010) Differential effects

of whole soy extract and soy isoflavones on apoptosis in prostate

cancer cells. Exp Biol Med 235: 90-97

Iinuma M, Tanaka T, Mizuno M, Yamamoto H, Kobayashi Y, Yonemori S

(1992) Phenolic constituents in Erythrina x bidwilli and their activity

against oral microbial organisms. Chem Pharm Bull (Tokyo) 40: 2749-

2752

Ingham JL, Keen NT, Mulheirn LJ, Lyne RL (1981) Inducibly-formed

isoflavonoids from leaves of soybean. Phytochemistry 20: 795-798

Issaq HJ, Van QN, Waybright TJ, Muschik GM, Veenstra TD (2009)

Analytical and statistical approaches to metabolomics research. J Sep

Sci 32: 2183-2199

Jackson MB, Ishizawa K, Ito O (2009) Evolution and mechanisms of plant

tolerance to flooding stress. Ann Bot 103: 137-142

Jeffries N (2005) Algorithms for alignment of mass spectrometry proteomic

data. Bioinformatics 21: 3066-3073

Johnson KJ, Wright BW, Jarman KH, Synovec RE (2003) High-speed

peak matching algorithm for retention time alignment of gas

chromatographic data for chemometric analysis. J Chromatogr A 996:

141-155

Kasuga T, Salimath SS, Shi JR, Gijzen M, Buzzell RI, Bhattacharyya MK

(1997) High resolution genetic and physical mapping of molecular

markers linked to the Phytophthora resistance gene Rps1-k in soybean.

Mol Plant-Microbe Interact 10: 1035-1044

Katajamaa M, Miettinen J, Oresic M (2006) MZmine: toolbox for processing

and visualization of mass spectrometry based molecular profile data.

Bioinformatics 22: 634-636

Katajamaa M, Oresic M (2007) Data processing for mass spectrometry-

based metabolomics. J Chromatogr A 1158: 318-328

Khaomek P, Ichino C, Ishiyama A, Sekiguchi H, Namatame M,

Ruangrungsi N, Saifah E, Kiyohara H, Otoguro K, Omura S,

Yamada H (2008) In vitro antimalarial activity of prenylated flavonoids

from Erythrina fusca. J Nat Med 62: 217-220

Khupse RS, Erhardt PW (2008) Total syntheses of racemic, natural (-) and

unnatural (+) glyceollin I. Org Lett 10: 5007-5010

Kim HK, Wilson EG, Choi YH, Verpoorte R (2010) Metabolomics: a tool for

anticancer lead-finding from natural products. Planta Med 76: 1094-

1102

Kind T, Fiehn O (2007) Seven golden rules for heuristic filtering of molecular

formulas obtained by accurate mass spectrometry. BMC Bioinforma 8:

105

Kinghorn AD, Chai HB, Sung CK, Keller WJ (2011) The classical drug

discovery approach to defining bioactive constituents of botanicals.

Fitoterapia 82: 71-79

Klement S, Mamlouk AM, Martinetz T (2008) Reliability of cross-validation

for SVMs in high-dimensional, low sample size scenarios. Artificial

Neural Networks - Icann 2008, Pt I 5163: 41-50

Kong X, Reilly C (2009) A Bayesian approach to the alignment of mass

spectra. Bioinformatics 25: 3213-3220

Kurzer MS (2003) Phytoestrogen supplement use by women. J Nutr 133:

1983S-1986S

Lai C, Reinders MJT, van't Veer LJ, Wessels LFA (2006) A comparison of

univariate and multivariate gene selection techniques for classification

of cancer datasets. BMC Bioinformatics 7: 235-244

Lan K, Jia W (2010) An integrated metabolomics and pharmacokinetics

strategy for multi-component drugs evaluation. Curr Drug Metab 11:

105-114

Landini S, Graham MY, Graham TL (2003) Lactofen induces isoflavone

accumulation and glyceollin elicitation competency in soybean.

Phytochemistry 62: 865-874

Lange E, Gropl C, Schulz-Trieglaff O, Leinenbach A, Huber C, Reinert K

(2007) A geometric approach for the alignment of liquid

chromatography-mass spectrometry data. Bioinformatics 23: i273-281

Lee JH, Lee BW, Kim JH, Jeong TS, Kim MJ, Lee WS, Park KH (2006)

LDL-antioxidant pterocarpans from roots of Glycine max (L.) Merr. J

Agric Food Chem 54: 2057-2063

Levene BC, Owen MDK, Tylka GL (1998) Influence of herbicide application

to soybeans on soybean cyst nematode egg hatching. Journal of

Nematology 30: 347-352

Lin YL, Kuo YH (1993) Two new coumaronochromone derivatives, oblongin

and oblonginol from the roots of Derris oblonga Benth. Heterocycles 36:

1501-1507

Lin YQ, Schiavo S, Orjala J, Vouros P, Kautz R (2008) Microscale LC-MS-

NMR platform applied to the identification of active cyanobacterial

metabolites. Anal Chem 80: 8045-8054

Liu X, Piao X, Wang Y, Zhu S (2010) Model study on transesterification of

soybean oil to biodiesel with methanol using solid base catalyst. J

Phys Chem A 114: 3750-3755

Liu ZM, Chen YM, Ho SC, Ho YP, Woo J (2010) Effects of soy protein and

isoflavones on glycemic control and insulin sensitivity: a 6-mo double-

blind, randomized, placebo-controlled trial in postmenopausal Chinese

women with prediabetes or untreated early diabetes. Amer J Clin Nutr

91: 1394-1401

Low Dog T (2005) Menopause: a review of botanical dietary supplements.

Am J Med 118 Suppl 12B: 98-108

Lyne RL, Mulheirn LJ (1978) Minor Pterocarpinoids of Soybean.

Tetrahedron Lett: 3127-3128

Lyne RL, Mulheirn LJ, Leworthy DP (1976) New pterocarpinoid

phytoalexins of soybean. J Chem Soc-Chem Commun: 497-498

Mahn K, Borras C, Knock GA, Taylor P, Khan IY, Sugden D, Poston L,

Ward JP, Sharpe RM, Vina J, Aaronson PI, Mann GE (2005) Dietary

soy isoflavone-induced increases in antioxidant and eNOS gene

expression lead to improved endothelial function and reduced blood

pressure in vivo. FASEB Journal 19: 1755-1772

Mendes P, Camacho D, de la Fuente A (2005) Modelling and simulation for

metabolomics data analysis. Biochem Soc Trans 33: 1427-1429

Moco S, Bino RJ, De Vos RCH, Vervoort J (2007) Metabolomics

technologies and metabolite identification. Trac-Trends in Analytical

Chemistry 26: 855-866

Moco S, Bino RJ, Vorst O, Verhoeven HA, de Groot J, van Beek TA,

Vervoort J, de Vos CHR (2006) A liquid chromatography-mass

spectrometry-based metabolome database for tomato. Plant Physiol

141: 1205-1218

Mollenhauer HH, Morre DJ, Rowe LD (1990) Alteration of intracellular traffic

by monensin; mechanism, specificity and relationship to toxicity.

Biochim Biophys Acta 1031: 225-246

Morandi D, Lequere JL (1991) Influence of nitrogen on accumulation of

isosojagol (a newly detected coumestan in soybean) and associated

isoflavonoids in roots and nodules of mycorrhizal and nonmycorrhizal

soybean. New Phytol 117: 75-79

Moy P, Qutob D, Chapman BP, Atkinson I, Gijzen M (2004) Patterns of

gene expression upon infection of soybean plants by Phytophthora

sojae. Mol Plant-Microbe Inter 17: 1051-1062

Nakayama M, Hayashi S, Masumura M, Horie T, Tsukayama M, Yamada T

(1975) Synthesis of neobavaisoflavone. Chem Lett: 281-282

Nelson KA, Renner KA, Hammerschmidt R (2002) Effects of

protoporphyrinogen oxidase inhibitors on soybean (Glycine max L.)

response, Sclerotinia sclerotiorum disease development, and

phytoalexin production by soybean. Weed Technol 16: 353-359

Nkengfack AE, Kouam J, Vouffo TW, Meyer M, Tempesta MS, Fomum ZT

(1994) An isoflavanone and a coumestan from Erythrina sigmoidea.

Phytochemistry 35: 521-526

Nordstrom A, O'Maille G, Qin C, Siuzdak G (2006) Nonlinear data

alignment for UPLC-MS and HPLC-MS based metabolomics:

quantitative analysis of endogenous and exogenous metabolites in

human serum. Anal Chem 78: 3289-3295

Nordstrom A, Tarkowski P, Tarkowska D, Dolezal K, Astot C, Sandberg

G, Moritz T (2004) Derivatization for LC-electrospray ionization-MS: a

tool for improving reversed-phase separation and ESI responses of

bases, ribosides, and intact nucleotides. Anal Chem 76: 2869-2877

Ohio Soybean Council (OSC), Industry Overview,

http://associationdatabase.com/aws/OHSOY/pt/sp/osc_industryoverview

Accessed: Feb. 2011

Okada T, Afendi FM, Altaf-ul-Amin M, Takahashi H, Nakamura K, Kanaya

S (2010) Metabolomics of medicinal plants: the importance of

multivariate analysis of analytical chemistry data. Curr Comput Aided

Drug Des 6: 179-196

O'Neill MJ (1983) Aureol and Phaseol, 2 New Coumestans from Phaseolus

aureus Roxb. Z Naturforsch C-a Journal 38: 698-700

Pahari P, Rohr J (2009) Total synthesis of psoralidin, an anticancer natural

product. J Org Chem 74: 2750-2754

Park S, Ahn IS, Kim JH, Lee MR, Kim JS, Kim HJ (2010) Glyceollins, one

of the phytoalexins derived from soybeans under fungal stress,

enhance insulin sensitivity and exert insulinotropic actions. J Agric

Food Chem 58: 1551-1557

Park WJ (1998) Effect of epibrassinolide on hypocotyl growth of the tomato

mutant diageotropica. Planta 207: 120-124

Parsons HM, Ekman DR, Collette TW, Viant MR (2009) Spectral relative

standard deviation: a practical benchmark in metabolomics. Analyst

134: 478-485

Paulsen MT, Ljungman M (2005) The natural toxin juglone causes

degradation of p53 and induces rapid H2AX phosphorylation and cell

death in human fibroblasts. Toxicol Appl Pharmacol 209: 1-9

Pluskal T, Castillo S, Villar-Briones A, Oresic M (2010) MZmine 2: modular

framework for processing, visualizing, and analyzing mass

spectrometry-based molecular profile data. BMC Bioinform 11: 395

Pocas ES, Lopes DV, da Silva AJ, Pimenta PH, Leitao FB, Netto CD,

Buarque CD, Brito FV, Costa PR, Noel F (2006) Structure-activity

relationship of wedelolactone analogues: structural requirements for

inhibition of Na+, K+ -ATPase and binding to the central

benzodiazepine receptor. Bioorg Med Chem 14: 7962-7966

Podwojski K, Fritsch A, Chamrad DC, Paul W, Sitek B, Stuhler K, Mutzel

P, Stephan C, Meyer HE, Urfer W, Ickstadt K, Rahnenfuhrer J

(2009) Retention time alignment algorithms for LC/MS data must

consider non-linear shifts. Bioinformatics 25: 758-764

Primrose S, Draper J, Elsom R, Kirkpatrick V, Mathers JC, Seal C,

Beckmann M, Haldar S, Beattie JH, Lodge JK, Jenab M, Keun H,

Scalbert A (2011) Metabolomics and human nutrition. Br J Nutr 105:

1277-1283

Qin WY, Zhu WZ, Shi HD, Hewett JE, Ruhlen RL, MacDonald RS,

Rottinghaus GE, Chen YC, Sauter ER (2009) Soy isoflavones have

an antiestrogenic effect and alter mammary promoter hypermethylation

in healthy premenopausal women. Nutr Cancer Journal 61: 238-244

Queiroz EF, Wolfender JL, Atindehou KK, Traore D, Hostettmann K

(2002) On-line identification of the antifungal constituents of Erythrina

vogelii by liquid chromatography with tandem mass spectrometry,

ultraviolet absorbance detection and nuclear magnetic resonance

spectrometry combined with liquid chromatographic micro-fractionation.

J Chromatogr A 974: 123-134

Rajani P, Sarma PN (1988) A coumestone from the roots of Tephrosia

hamiltonii. Phytochemistry 27: 648-649

Riveravargas LI, Schmitthenner AF, Graham TL (1993) Soybean flavonoid

effects on and metabolism by Phytophthora sojae. Phytochemistry 32:

851-857

Robinson MD, De Souza DP, Keen WW, Saunders EC, McConville MJ,

Speed TP, Likic VA (2007) A dynamic programming approach for the

alignment of signal peaks in multiple gas chromatography-mass

spectrometry experiments. BMC Bioinformatics 8: 419

Roux A, Lison D, Junot C, Heilier JF (2011) Applications of liquid

chromatography coupled to mass spectrometry-based metabolomics in

clinical chemistry and toxicology: A review. Clin Biochem 44: 119-135

Ruegger M, Dewey E, Hobbie L, Brown D, Bernasconi P, Turner J,

Muday G, Estelle M (1997) Reduced naphthylphthalamic acid binding

in the tir3 mutant of Arabidopsis is associated with a reduction in polar

auxin transport and diverse morphological defects. Plant Cell 9: 745-

757

Saito K, Matsuda F (2010) Metabolomics for functional genomics, systems

biology, and biotechnology. Annu Rev Plant Biol 61: 463-489

Salvo VA, Boue SM, Fonseca JP, Elliott S, Corbitt C, Collins-Burow BM,

Curiel TJ, Srivastav SK, Shih BY, Carter-Wientjes C, Wood CE,

Erhardt PW, Beckman BS, McLachlan JA, Cleveland TE, Burow

ME (2006) Antiestrogenic glyceollins suppress human breast and

ovarian carcinoma tumorigenesis. Clin Cancer Res 12: 7159-7164

Sharp JK, McNeil M, Albersheim P (1984) Host-pathogen interactions. 27.

The primary structures of one elicitor-active and 7 elicitor-inactive

hexa(beta-D-glucopyranosyl)-D-glucitols isolated from the mycelial

walls of Phytophthora megasperma f. sp. glycinea. J Biol Chem 259:

1321-1336

Smith CA, Want EJ, O'Maille G, Abagyan R, Siuzdak G (2006) XCMS:

Processing mass spectrometry data for metabolite profiling using

nonlinear peak alignment, matching, and identification. Anal Chem 78: 779-787

Soga T, Sugimoto M, Honma M, Mori M, Igarashi K, Kashikura K, Ikeda S,

Hirayama A, Yamamoto T, Yoshida H, Otsuka M, Tsuji S, Yatomi Y,

Sakuragawa T, Watanabe H, Nihei K, Saito T, Kawata S, Suzuki H,

Tomita M, Suematsu M (2011) Serum metabolomics reveals gamma-

glutamyl dipeptides as biomarkers for discrimination among different

forms of liver disease. J Hepatol

Subramanian S, Graham MY, Yu O, Graham TL (2005) RNA interference of

soybean isoflavone synthase genes leads to silencing in tissues distal

to the transformation site and to enhanced susceptibility to

Phytophthora sojae. Plant Physiol 137: 1345-1353

Sumner LW, Mendes P, Dixon RA (2003) Plant metabolomics: large-scale

phytochemistry in the functional genomics era. Phytochemistry 62:

817-836

Sumner LW, Urbanczyk-Wochniak E, Broeckling CD (2007) Metabolomics

data analysis, visualization, and integration. Methods Mol Biol 406:

409-436

Takahashi K, Kamada Y, Hiraoka-Yamamoto J, Mori M, Nagata R,

Hashimoto K, Aizawa T, Matsuda K, Kometani T, Ikeda K, Yamori

Y (2004) Effect of a soybean product on serum lipid levels in female

university students. Clin Exp Pharmacol Physiol 31 Suppl 2: S42-43

Tautenhahn R, Bottcher C, Neumann S (2008) Highly sensitive feature

detection for high resolution LC/MS. BMC Bioinformatics 9: 504

Tode K, Luthen H (2001) Fusicoccin- and IAA-induced elongation growth

share the same pattern of K+ dependence. J Exp Bot 52: 251-255

Tolstikov VV, Lommen A, Nakanishi K, Tanaka N, Fiehn O (2003)

Monolithic silica-based capillary reversed-phase liquid

chromatography/electrospray mass spectrometry for plant

metabolomics. Anal Chem 75: 6737-6740

Tosteson DC, Cook P, Andreoli T, Tieffenberg M (1967) The effect of

valinomycin on potassium and sodium permeability of HK and LK

sheep red cells. J Gen Physiol 50: 2513-2525

Tyug TS, Prasad KN, Ismail A (2010) Antioxidant capacity, phenolics and

isoflavones in soybean by-products. Food Chem 123: 583-589

van Loon LC, Bakker PA, van der Heijdt WH, Wendehenne D, Pugin A

(2008) Early responses of tobacco suspension cells to rhizobacterial

elicitors of induced systemic resistance. Mol Plant Microbe Interact 21:

1609-1621

Westerhuis JA, Hoefsloot HCJ, Smit S, Vis DJ, Smilde AK, van Velzen

EJJ, van Duijnhoven JPM, van Dorsten FA (2008) Assessment of

PLSDA cross validation. Metabolomics 4: 81-89

Wikoff WR, Anfora AT, Liu J, Schultz PG, Lesley SA, Peters EC, Siuzdak

G (2009) Metabolomics analysis reveals large effects of gut microflora

on mammalian blood metabolites. Proc Natl Acad Sci U S A 106:

3698-3703

Wilcox JR, St Martin SK (1998) Soybean genotypes resistant to

Phytophthora sojae and compensation for yield losses of susceptible

isolines. Plant Dis 82: 303-306

Wong JW, Cagney G, Cartwright HM (2005) SpecAlign--processing and

alignment of mass spectra datasets. Bioinformatics 21: 2088-2090

Wrather JA, Koenning SR (2006) Estimates of disease effects on soybean

yields in the United States 2003 to 2005. J Nematol 38: 173-180

Wu W, Zhang Q, Zhu YM, Lam HM, Cai ZW, Guo DJ (2008) Comparative

metabolic profiling reveals secondary metabolites correlated with

soybean salt tolerance. J Agric Food Chem 56: 11132-11138

Yang C, He Z, Yu W (2009) Comparison of public peak detection algorithms

for MALDI mass spectrometry data analysis. BMC Bioinformatics 10: 4

Yazaki K, Sasaki K, Tsurumaru Y (2009) Prenylation of aromatic

compounds, a key diversification of plant secondary metabolites.

Phytochemistry 70: 1739-1745

Yin S, Fan CQ, Wang Y, Dong L, Yue HM (2004) Antibacterial prenylflavone

derivatives from Psoralea corylifolia, and their structure-activity

relationship study. Bioorg Med Chem 12: 4387-4392

Yoshikawa M, Keen NT, Wang MC (1983) A receptor on soybean

membranes for a fungal elicitor of phytoalexin accumulation. Plant

Physiol 73: 497-506

Yoshikawa M, Matama M, Masago H (1981) Release of a soluble

phytoalexin elicitor from mycelial walls of Phytophthora megasperma

var. sojae by soybean tissues. Plant Physiol 67: 1032-1035

Zhang A, Sun H, Wang Z, Sun W, Wang P, Wang X (2010) Metabolomics:

towards understanding traditional Chinese medicine. Planta Med 76:

2026-2035

Zheng HL, Zhao ZQ, Zhang CG, Feng JZ, Ke ZL, Su MJ (2000) Changes in

lipid peroxidation, the redox system and ATPase activities in plasma

membranes of rice seedling roots caused by lanthanum chloride.

Biometals 13: 157-163
