
Seeding Multi-omic Improvement of Apple

Thesis

Presented in Partial Fulfillment of the Requirements for the Degree Master of Science in

the Graduate School of The Ohio State University

By

Emma Bilbrey

Graduate Program in Horticulture and Crop Science

The Ohio State University

2020

Thesis Committee

Jessica L. Cooperstone, Advisor

Jonathan Fresnedo Ramirez

Diane Miller

Emmanuel Hatzakis


Copyrighted by

Emma Bilbrey

2020


Abstract

Apples are one of the most commonly consumed fruits in America, and “an apple a day keeps the doctor away” is a well-known adage. The commercial and nutritional importance of apples has prompted interest in varietal improvement. However, progress is limited by a long juvenile period, which delays fruit evaluation for quality traits, such as phytochemical composition. To minimize this drawback, apple breeders have begun using marker-assisted selection (MAS) for some traits, but breeding strategies for fruit phytochemicals have yet to be developed.

In response, we have developed an integrated genomic-metabolomic platform to better understand gene-phytochemical associations in breeding-relevant apple germplasm. Phytochemicals that are potentially health beneficial, contribute disease resistance, or improve fruit quality can be characterized using metabolomics, providing a foundation to study apple’s breeding potential. The platform is based on high-throughput genomic and metabolomic assessment of 173 unique apples, including members of three pedigree-connected families alongside diverse and wild selections. Single nucleotide polymorphism (SNP) data was obtained from the 20K SNP array for apple and integrated with metabolomic datasets from high-resolution mass spectrometry and nuclear magnetic resonance (NMR) spectroscopy analyses of polar/semi-polar apple fruit extracts.

Metabolite genome-wide association studies (mGWAS) were conducted with 11,165 SNPs for two LC-MS data sets of 4,000+ features each and an NMR data set of 756 bins. Novel schemes for prioritizing results from mGWAS indicated 519 (LC-MS (+)), 726 (LC-MS (-)), and 177 (NMR) significant marker-trait associations across the apple genome (LC-MS: p < .00001, NMR: p < .0001). These results were then sifted to select features to analyze with a more powerful pedigree-based analysis (PBA) in FlexQTL™ with 6,034 SNPs to identify metabolite quantitative trait loci (mQTL), genomic areas exerting genetic control over phytochemical production. An mQTL for chlorogenic acid was identified on the bottom of chromosome 17 across all three metabolomic data sets and was used as a proof-of-concept example to demonstrate the applicability of the platform. Determining gene-phytochemical relationships in apple will inform breeding and facilitate future MAS for improved nutrition along with attributes related to flavor and disease resistance etiology.


Dedication

To my family, friends, and, above all, to the Creator who has furnished the world with wonders beyond worthy of our study.


Acknowledgments

To my advisor Dr. Jessica Cooperstone, I owe much in terms of skills and scientific understanding. More importantly, I owe her great gratitude for her support and encouragement throughout the course of my master’s work. Her passion and strong resolve as a woman in science are unmatched.

I would like to thank my committee thoroughly for their integral involvement in this highly collaborative project. Dr. Fresnedo Ramírez receives credit for honing my understanding of tree crop genomics and pushing me to dig deep in my understanding of the concepts and their implications. I thank Dr. Diane Miller for her dedicated collection of apple germplasm and insight into commercial apple breeding and its origins. Dr. Emmanuel Hatzakis, I thank for his NMR expertise and kindness throughout the project.

A group that deserves so much thanks is the Cooperstone Lab. It was ever a pleasure to work alongside them and enjoy many Brassica lunches. My thanks also go to Katie Williamson in the Hatzakis Lab for running my NMR samples and helping me process them with Dr. Matthias Klein.

I am grateful to the Midwest Apple Improvement Association (MAIA) for providing all of the apple germplasm studied here. We could not have answered any of these questions without their support. Additionally, I am thankful to Ohio State’s Foods for Health Theme for funding this project. I would not have received this degree without the financial support of the University Fellowship from the Graduate School along with the OARDC Director’s Associateship Award.

Finally, to my family, friends, and our Athens home of Brookfield Church, I owe unending gratitude for their love throughout these years of my life. To my husband Clark I owe thanks for the title “Apple Scientist” and enduring support.


Vita

June 2013 ...... Valparaiso High School

June 2016 – August 2016 ...... Forestry and Horticulture Internship, Taltree Arboretum and Gardens, Valparaiso, IN

June 2017 ...... B.S. General Biology, Summa cum Laude, Union University, Jackson, TN

July 2017 – July 2018 ...... Manufacturing Technician II, Bioreagents Department, Quidel Corporation, Athens, OH

August 2018 – July 2019 ...... University Fellow, The Ohio State University, Columbus, OH

August 2019 – Present ...... OARDC Director’s Award Recipient, Columbus, OH

August 2018 – Present ...... Graduate Research Associate, Department of Horticulture and Crop Science, The Ohio State University


Fields of Study

Major Field: Horticulture and Crop Science


Table of Contents

Abstract
Dedication
Acknowledgments
Vita
Fields of Study
List of Tables
List of Figures
Chapter 1. Literature Review
1.1 Apples in History
1.2 Traditional Apple Breeding
1.3 Apples and Human Health
1.3.1 Cancer
1.3.2 Cardiovascular Disease
1.3.3 Cognitive Decline
1.3.4 Polyphenols
1.3.4.1 Anthocyanins
1.3.4.2 Dihydrochalcones
1.3.4.3 Flavonols
1.3.4.4 Flavanols (Flavan-3-ols)
1.4 Nutrition-Driven Breeding
1.5 Apples and Genomics
1.6 Apples and Metabolomics
1.7 Multi-omic Integration
1.7.1 Bi-Parental Mapping Populations
1.7.2 Metabolite Genome Wide Association Studies (mGWAS)
1.7.3 Pedigree-Based Analysis
1.7.4 Apple Integrated Genetic Linkage Map (iGLmap)
1.8 Specific Aims
Chapter 2. Seeding Multi-omic Apple Improvement
2.1 Abstract
2.2 Introduction
2.3 Materials and Methods
2.3.1 Sample Selection
2.3.2 Genomics
2.3.2.1 DNA Extraction and SNP Array Processing
2.3.2.2 Pedigree Confirmation
2.3.2.3 Marker File Preparation for mGWAS
2.3.2.4 Marker File Preparation for FlexQTL™ Analyses
2.3.3 Metabolomics
2.3.3.1 Chemicals
2.3.3.2 Apple Fruit Extraction
2.3.3.3 UHPLC-QTOF-ESI-MS Full Scan Experiments
2.3.3.4 Iterative UHPLC-QTOF-ESI-MS/MS Experiments
2.3.3.5 LC-MS Full Scan Data Deconvolution and Processing
2.3.3.6 Iterative LC-MS/MS Data Deconvolution and Processing
2.3.3.7 1D 1H NMR Spectroscopy Experiments
2.3.3.8 NMR Spectral Processing
2.3.3.9 Data Visualization and Analysis
2.3.3.10 Feature Identification – LC-MS
2.3.3.11 Feature Identification – NMR
2.3.4 Omics Integration – mQTL Detection
2.3.4.1 Metabolite genome wide association studies (mGWAS)
2.3.4.2 Prioritizing significant SNP-feature associations
2.3.4.3 Pedigree-based analysis (PBA) in FlexQTL™
2.4 Results and Discussion
2.4.1 Genomics
2.4.1.1 Pedigree Confirmed and Revised
2.4.1.2 PCA of SNP data confirmed genetic variation of selected germplasm
2.4.2 Metabolomics
2.4.2.1 High-throughput untargeted metabolomic analysis and processing of 232 extracts via LC-MS produced high-quality data
2.4.2.2 PCA showed distinct metabolome profiles across apple selections in all metabolomics data sets
2.4.3 Omics Integration
2.4.3.1 Prioritization scheme enabled high-throughput omics integration
2.4.3.2 Parallel results across complementary metabolomic datasets increased confidence in compound ID and detection of significant loci: mQTL counts per chromosome
2.4.3.3 Parallel results across complementary metabolomic datasets increase confidence in compound ID and detection of significant loci: Composite mQTL chromosome map
2.4.3.4 Parallel results across complementary metabolomic datasets increase confidence in compound ID and detection of significant loci: Binary SNP-feature association heat map and hierarchical clustering
2.4.3.5 mQTLs detected for peaks across the NMR spectrum
2.4.3.6 mQTL hotspots on chromosomes 16 and 17
2.4.3.7 Classical molecular networking in GNPS provides putative IDs for compounds analyzed with iterative LC-MS/MS
2.4.3.8 Authentic standards verify metabolite identification
2.4.3.9 PBA in FlexQTL™ allows strong validation of mQTLs proposed by mGWAS
2.4.3.10 Proof-of-Concept: Chlorogenic Acid mQTL on Chromosome 17
2.5 Conclusion
Bibliography
Appendix A: Metabolomics-Related Documents
A.1 LC-MS
A.1.1 Proteowizard msConvert Parameters
A.1.1.1 Full Scan
A.1.1.2 Iterative MS/MS
A.1.2 mzMine2.51 Parameters for Full Scan LC-MS Data
A.1.3 Excel Feature Post-Processing
A.1.4 GNPS Classical Molecular Networking Parameters for Iterative MS/MS Data
A.2 NMR
A.2.1 CombiDancer Extract Drying Parameters
A.2.2 mrbin Code for Spectral Binning
A.3 Data Visualization
A.3.1 PCA
Appendix B: Omics Integration-Related Documents
B.1 mGWAS
B.1.1 AGHmatrix Kinship Matrix
B.1.2 SNP Principal Components Analysis and Elbow Plots
B.1.3 Dividing SNPs by Chromosome
B.1.4 Unix Batch Script Code for mGWAS using rrBLUP in OSC
B.1.5 File Import and Export for OSC and Results Compilation with Command Line in Terminal
B.1.6 Filtering for Significant SNP-Feature Associations in R
B.1.7 Venning Significant Features from mGWAS of Three Populations
B.1.8 Visualizing Complementary Omics Datasets
B.1.8.1 Number of mQTL per Chromosome
B.1.8.2 Significant SNP-Feature Association Composite Chromosome Map
B.1.8.3 Number of Significantly Associated Features per SNP
B.1.8.4 Presence Absence Heatmap and Hierarchical clustering for SNP-feature associations
B.1.8.5 Significant SNP-Bin Associations across NMR Spectra
B.1.9 SNP Names Conversion Reference

List of Tables

Table 1. Progress in molecular testing for apple disease resistance and reduced physiological disorders according to the RosBREED 2018 Annual Report.

Table 2. Progress in molecular testing for apple quality traits according to the RosBREED 2018 Annual Report.

Table 3. Apple fruit extraction samples and modifications.

Table 4. FlexQTL™ results for the pedigree-based analysis (PBA) of chlorogenic acid abundance data for LC-MS (+), (-), and NMR data sets. Number of positive mQTL was determined by a minimum Bayes Factor of 2. Genetic interval and narrow sense heritability estimates are recorded in ranges determined from three replicates of the FlexQTL™ runs for each data set.

List of Figures

Figure 1. Flavonoid biosynthesis pathway.

Figure 2. Pedigree chart for three pedigree-connected families.

Figure 3. Depiction of the three nested populations used for mGWAS analyses with colors as follows: purple – progenies, orange – progenies plus pedigree-connected individuals, green – progenies plus pedigree-connected individuals and other diverse selections.

Figure 4. Workflow for integration of one metabolomics dataset with a genomics dataset. This workflow was applied to each of the three metabolomics datasets. Three separate mGWAS analyses (A) with different subsets of individuals were conducted in order to detect real SNP-feature associations present across diverse germplasm and in segregating progeny. This was achieved by filtering results (B) of each for strong signal (C) and then venning the results from the three populations (D). Overlapping features with significant SNP associations (E) were extracted for further analysis and identification. This resulted in a corresponding collection of overlapping significant features (F) for each of the three metabolomic datasets.

Figure 5. A continued figure. Elbow plots of the percent variation explained by each principal component (PC) in a principal component analysis (PCA) of SNP data used for mGWAS for (A) the diverse population, (B) the pedigree subset, and (C) the progeny subset. The red point indicates the last PC included in the mGWAS model (A: PC = 10, B: PC = 6, C: PC = 3). A graph of the first two PCs is paired with each population subset where points are apple types (A: n = 124, B: n = 98, C: n = 75).

Figure 6. A Unix batch script was written and executed for the three population sets within each metabolomics dataset, resulting in a total of nine batch scripts.

Figure 7. Principal components analysis (PCA) scree plots of positive (A) and negative (B) ionization mode data collected via LC-MS. Tight clustering of pooled QCs (n = 33) indicated stability of data quality within the two experiments. Each point represents one sample (n = 226) and is color-coded to represent general classes within the selected apple varieties. Missing values were imputed, data was log2-transformed then scaled and centered to perform PCA.

Figure 8. A continued figure. Principal components analysis (PCA) scree plots of positive (A) and negative (B) ionization mode data collected via LC-MS and (C) NMR data. Each point represents one apple sample (n = 193) and is color-coded to represent general classes within the selected apple varieties. Each PCA indicates metabolomic variation in germplasm chosen for analysis in this study. Missing values were imputed, data was log2-transformed then scaled and centered to perform PCA.

Figure 9. (A) A table of the total counts of metabolomic features that remained after -log(p) value filters for each population. Those features were then venned for each metabolomics data set to capture a list of those that were significant in all three populations. Corresponding Venn diagrams for (B) LC-MS (+), (C) LC-MS (-), and (D) NMR metabolomic features are shown with the center category containing the number of features that passed on for further analysis.

Figure 10. Bar plots showing the number of putative mQTL detected per chromosome via mGWAS. Counts are determined by a SNP-feature -log(p) value minimum of 5 for LC-MS and 4 for NMR.

Figure 11. Composite chromosome map of the 17 apple chromosomes. Horizontal lines indicate the location of a SNP found to have a significant association with at least one metabolomic feature. Lines are colored based on the origin of the metabolomic feature.

Figure 12. A continued figure. Binary heat maps with SNPs organized on the x-axis by chromosome and then genetic distance, and metabolomic features organized on the y-axis by hierarchical clustering. The intersection of each SNP-feature combination is either colored (significant) or black (non-significant). Thresholds for significance were -log(p) of 5 for LC-MS data sets and 4 for NMR.

Figure 13. 1D 1H NMR spectrum of the apple extract pooled QC. Yellow lines indicate each bin that was significantly associated with at least one SNP. Dashed lines approximately divide the spectrum according to the type of compounds that elicit peaks at that chemical shift. The aromatic region and amino acid region are in much lower abundance than the sugar region, so magnified inserts are also presented.

Figure 14. A continued figure. Plots displaying the number of significantly associated metabolomic features per SNP on chromosome 16. Significance is determined as -log(p) ≥ 5 for LC-MS data sets and ≥ 4 for NMR. SNPs are plotted based on their genetic position (cM) on the x-axis. SNPs with a minimum of 1 significant feature association are labeled with their study index number. Top SNPs include: 13681 (SNP_FB_1074682), 13685 (RosBREEDSNP_SNP_CT_1540624_Lg16_LAR1_MAF40_1618769_exon2), and 13675 (SNP_FB_0335535). Additional index-to-SNP name conversions, including synonyms from the 480K SNP array, are available in Appendix B.1.9.

Figure 15. A continued figure. Plots displaying the number of significantly associated metabolomic features per SNP on chromosome 17. Significance is determined as -log(p) ≥ 5 for LC-MS data sets and ≥ 4 for NMR. SNPs are plotted based on their genetic position (cM) on the x-axis. SNPs with a minimum of 1 significant feature association are labeled with their study index number. Top SNPs include: 15109 (RosBREEDSNP_SNP_AG_20028330_Lg17_01298_MAF50_1664885_exon1), 15123 (SNP_FB_1114677), and 15133 (SNP_FB_0398770). Additional index-to-SNP name conversions, including synonyms from the 480K SNP array, are available in Appendix B.1.9.

Figure 16. Spectral evidence for feature identification as chlorogenic acid where (A) is an extracted ion chromatogram (EIC) of 353.0879 m/z ([M-H]-) in a stock solution of chlorogenic acid authentic standard and (B) is an EIC of the same m/z in a pooled apple extract QC run at the same time as the standard. The retention time match was further validated by matching MS/MS spectra of the mass for both samples in (C) and (D).

Figure 17. A continued figure. Manhattan plots of chlorogenic acid phenotypic measurements from (A) LC-MS (+), (B) (-), and (C) NMR. Alternating colors were used to help delineate neighboring chromosomes. The dashed line indicates an FDR-corrected q-value equivalent to p = .05.

Figure 18. NMR spectra of the apple extract pooled QC and zoomed subsets of areas of interest. In the full spectrum (C), yellow lines indicate ppm of bins significantly associated with SNP 15109 and matching with expected peaks for chlorogenic acid. In subsets (A) and (B), yellow arrows point to the specific peaks for chlorogenic acid.

Chapter 1. Literature Review

1.1 Apples in History

Tree crop cultivation has been dominated by the domesticated apple (Malus × domestica Borkh., family Rosaceae). Over centuries, apples and their products have been consumed across the globe and have remained prominent in art, mythology, and culture. In 2017, apples were the most consumed fruit in America, with a mean intake of 27.2 pounds per person, including both fresh apples and processed products (USDA Economic Research Service 2017). This bests oranges and bananas at 21 and 14.1 lbs/person, respectively.

Originating in Central Asia, M. domestica has a combined genetic makeup derived from several wild apple species. From the Tian Shan Mountains of Central Asia, M. sieversii (Lebed.) M. Roem. is the source of the domestic apple genome, and the wild European crabapple M. sylvestris (L.) Mill. was found to be a second key contributor (Cornille et al. 2012). Two other wild crabapples, interfertile and closely related to M. domestica, are M. baccata (L.) Borkh. from Siberia and M. orientalis Uglitzk. from the Caucasian region. These four wild species underwent hybridization as fruit was carried along the Silk Road (Cornille et al. 2014).

Apple breeding advanced in Europe through the 19th century ahead of North American apple cultivation (Cummins and Aldwinckle 1983). In 19th- and 20th-century America, many farmers maintained orchards to produce fruit for consumption, preservation, and trade. As homesteaders moved west across America, apple production was a key component of human health due to the unsafe water conditions of the land. In the early 19th century, John Chapman, otherwise known as Johnny Appleseed, began apple seedling nurseries across the Midwest. He sold these seedlings and distributed seeds to many homesteaders settling there or passing through on their way farther west. The spread of seedlings continued after President Abraham Lincoln’s Homestead Act of 1862, which offered free land conditional on proof of cultivation. Apple orchards were favorable for this purpose due to the long-term investment of growing trees and dietary benefits of fruit production. From the Midwest, apple production has since spread and become concentrated in the northeast and northwest of the US.

1.2 Traditional Apple Breeding

As a tree crop, apple cultivation has developed fundamentally differently from that of annual seed-propagated crops. Apple trees have a long juvenile period, preventing them from producing fruit for 5-10 years. Trees can be fruit-bearing for up to half a century. In the 19th century, US apple breeding originated with grafting of open-pollinated chance seedlings exhibiting a desirable phenotype, as necessitated by self-incompatibility (Cummins and Aldwinckle 1983). Nurserymen and hobbyists carried out selection in this fashion for about 100 years, with rare purposeful hybridization. Widespread apple breeding programs at universities and agricultural experiment stations were established in the late 19th century into the early 1920s. Horticultural traits such as yield, precocity, disease resistance, and tree architecture were the primary focus of these efforts.

Largely beginning in the mid-nineteenth century in Europe, rootstocks were established as an avenue to address the challenge of large tree habit in apple breeding (Cummins and Aldwinckle 1983). Dwarfing rootstocks, e.g., Malling (M.) 9 and M.26, have been used to great effect in allowing high-density planting. Reduced tree size has benefits of higher yields due to increased canopy light penetration and more manageable harvesting and maintenance (Norelli, Jones, and Aldwinckle 2003).

Further development of pest and pathogen resistance in apple has been a major breeding objective. The long lifespan and juvenile period preclude quick turnover in apple orchards. Saplings that are planted are meant to last for decades. Disease pressure can climb in orchards, causing epidemics through swathes of trees. Infection causes several years of reduced yields before the trees are killed (Norelli et al. 2003; Zhu, Shin, and Mazzola 2016). This has a large economic impact on growers because each tree represents decades of investment in maintenance. The major fungal diseases are apple scab (Venturia inaequalis (Cke) Wint.), powdery mildew (Podosphaera leucotricha (Ellis & Everh.) Salm), and postharvest blue mold (Penicillium expansum (Link) Thom.) (Baumgartner et al. 2015; Sun et al. 2017). The bacterial disease fire blight (Erwinia amylovora (Burrill) Winslow et al.) is sporadic and can be devastating to orchards when infections begin due to limited treatment options (Norelli et al. 2003).

Overall, domestication of apple has led to a narrow germplasm base (Way et al. 1991), and current apple cultivation is largely comprised of only a few dozen cultivars (Cornille et al. 2014; Laurens et al. 2010). Wild species encompassing the extent of diversity in Malus are currently being investigated as sources of additional genetic variation to introduce variety in appearance and taste as well as disease resistance (RosBREED 2018). Wild species have already demonstrated advantages in disease resistance. A Japanese crabapple, clone 821, was originally used as a source of apple scab resistance through the Rvi6/Vf gene (Koller et al. 1994), but resistance has started to be overcome by a new V. inaequalis strain, particularly in Europe (Parisi et al. 1993).

In addition to institutional apple breeding programs, growers themselves are coordinating efforts to breed more regionally adapted varieties. One such grower participatory apple breeding program is the Midwest Apple Improvement Association (MAIA), founded in 1997. With goals of improved cropping reliability, disease resistance, and flavor/quality attributes, the MAIA has released several commercial varieties (http://www.midwestapple.com/). ‘EverCrisp’ is their current hallmark variety.

Released in 2014 from ‘Honeycrisp’ × ‘Fuji’ hybridization, ‘EverCrisp’ produces fruit late in the growing season with exceptional crispness, although no disease resistance.

Willing to adapt and advance with growing genomic technology and consumer desire for healthful foods, the MAIA partnered with The Ohio State University for this project to investigate genetic control of health beneficial compounds in apples.

1.3 Apples and Human Health

Fruit and vegetable consumption, particularly of apples, has long been associated with good health, as captured in the maxim “An apple a day keeps the doctor away.” The health benefits from fruits and vegetables are commonly attributed to their various phytochemical compositions (Boyer and Liu 2004). In apples, the major phytochemical classes of interest for their roles in bioactivity have been polyphenols, including flavan-3-ols, flavonols, anthocyanidins, dihydrochalcones and hydroxycinnamic acids (Tsao et al. 2005; Vrhovsek et al. 2004).

The first review of apple health benefits by Boyer and Liu (2004) outlines studies suggesting decreases in risk of chronic diseases including cancer, cardiovascular disease, pulmonary disease, and diabetes as a result of apple consumption. An updated review several years later adds apple-associated improved outcomes in weight management, gut and bone health, and cognitive decline (Hyson 2011). Supplementing these reviews, Rana and Bhushan (2016) give specific outlines of in vitro and in vivo research in apple-associated health benefits and conclude by stressing the potential of apple phytochemicals as nutraceuticals only if their synergism, bioavailability, and bioactivity are comprehensively studied. Overall, although apples have been associated with beneficial health outcomes, there is still no consensus on a bioactive compound or group of compounds within apple that are eliciting these benefits.

1.3.1 Cancer

The effects of apple products (fresh fruit, juice, and extracts) have been studied in regard to cancer with emphasis on breast, lung, and colon cancer. An epidemiological study found an inverse association between cancer and reported intake of ≥1 apple/day when comparing 8,209 patients with various types of cancer to a group of 6,629 patients with no incidence of cancer (Gallus et al. 2005). Rats consuming whole apple extracts, comparable to one, three, and six apples per day in human consumption, showed a significant decrease (p < .01) in number of mammary tumors compared with the control (Liu, Liu, and Chen 2005). Apple consumption (1 serving/day) was significantly inversely correlated with risk of lung cancer in the women cohort of the Nurses’ Health Study, but no association was found with men (Feskanich et al. 2000). In a study of colon carcinogenesis in rats, administration of a procyanidin fraction of apple extract produced a significant decrease (p < .01) in hyperproliferative crypt count and number of aberrant crypt foci compared to rats receiving only water (Gossé et al. 2005). Similarly, cloudy apple juice significantly decreased crypt cell proliferation (p < .001) and genotoxicity in colonocytes (p < .01) compared to a water control in a colon carcinogenesis rat model (Barth et al. 2007).

1.3.2 Cardiovascular Disease

Due to high consumption and high concentrations of polyphenols, the role of apples has been investigated with regard to risk of cardiovascular disease (CVD). Whole apples, apple pomace, and cloudy apple juice significantly decreased low-density lipoprotein cholesterol (LDL-C) and total serum cholesterol (TC) in a group of 23 healthy participants. However, clear apple juice consumption had the inverse effect, indicating the importance of cell wall components in decreasing CVD risk factors (Ravn-Haren et al. 2013). Bondonno et al. (2018) compared cardiovascular risk factors in 30 participants in a four-week cross-over trial consuming apples with peel and without peel. They determined that participants consuming apple with peel, compared with no peel, had significantly higher flavonoid metabolite plasma levels and improved endothelial function. They largely attributed this outcome to the higher concentration of flavonoids in peel than in flesh.

Similarly, a clinical trial with subjects (n = 250) consuming one apple per day for four months reported significant decreases in TC, LDL-C, and an increase in high-density lipoprotein cholesterol (HDL-C) (Tenore, Caruso, Buonomo, D’Avino, et al. 2017).

Using the ‘Annurca’ apples tested in the study, a polyphenolic extract capsule was developed for an additional clinical trial (Tenore, Caruso, Buonomo, D’Urso, et al. 2017).

Subjects (n = 250) taking two capsules (400 mg/capsule) per day for one month had decreased mean TC (-24.9%, p = .0011) and LDL-C (-37.5%, p = .0021) and increased mean HDL-C (+49.2%, p = .0030). Importantly, the reductions in TC and LDL-C noted here are in the same range of effect size as the use of prescription statins. A subsequent study with the nutraceutical supplements demonstrated an in vitro decrease in cholesterol micellar solubility (-85.7%) and, in a clinical trial, significantly increased fecal cholesterol excretion (+35%) (Tenore et al. 2018).

In a separate study testing the impact of polyphenolic extracts on cardiovascular factors, mice induced with cardiovascular disorders by a high fat, high fructose diet were fed extracts from apple peel or apple flesh for 28 days (Tian et al. 2018). After the treatment, mice receiving the extracts had significantly lower blood pressure and serum glucose levels as well as attenuated effects on endothelin-1 reduction and nitric oxide decrease as compared to the high fat, high fructose diet alone.

In contrast, a human clinical trial with administration of a polyphenolic apple extract, reportedly high in epicatechin and flavan-3-ol oligomers, to subjects with mild hypertension (n = 60) did not yield significant differences in brachial artery flow-mediated or nitrate-mediated vasodilation (FMD and NMD respectively) compared to the placebo group (Saarenhovi et al. 2017). Similarly, a study of the effects of consuming isolated apple flavanol and procyanidin extracts on blood pressure in 42 healthy adults with moderately elevated blood pressure found no significant changes in blood pressure or any CVD biomarkers (Hollands et al. 2018).

Seeing the impact on health of whole apples and even polyphenolic extracts, researchers have been interested in determining the specific bioactive compounds within apples responsible for the positive changes in cardiovascular health. However, as indicated by the last two studies mentioned, there have not been convincing results when compounds are tested in isolation. This may suggest that the active agent or combination of agents in apple has not yet been evaluated.

1.3.3 Cognitive Decline

The group of T.B. Shea at the University of Massachusetts has conducted many studies of the relationship between oxidative stress and age-related cognitive decline. Several of these have investigated apple juice as a dietary supplement to improve cognitive decline outcomes related to aging. In vitro neuronal cell line studies demonstrated the antioxidant capacity of apple juice concentrate (AJC) (Ortiz and Shea 2004) and its stimulation of organized signaling patterns in mouse cortical neurons (Serra and Shea 2009).

Supplementation with AJC in genetically or dietarily deficient mouse models has been shown to reduce oxidative species in murine brain tissue, improve cognitive performance (Rogers et al. 2004; Chan et al. 2011; Tchantchou et al. 2005), and suppress Alzheimer’s disease indicators, including high amyloid-β levels (Chan and Shea 2009), increased glutathione synthase transcription and activity (Tchantchou et al. 2004), decline of acetylcholine (Chan, Graves, and Shea 2006), and overexpression of presenilin-1 (Chan and Shea 2006).

Institutionalized individuals (n = 21) with moderate-to-late stage Alzheimer’s disease participated in a pilot clinical trial investigating the cognitive and behavioral impact of daily 8-oz apple juice consumption for one month (Remington et al. 2010). While no changes in cognitive symptoms were observed, there was a significant improvement in behavioral symptoms such as anxiety, agitation, and delusion, as evaluated by the Neuropsychiatric Inventory.

1.3.4 Polyphenols

The fruit of Malus species contains a variety of polyphenols, largely flavonoids. Major functions of polyphenolics in plants include plant-herbivore interactions and pollinator attraction. The chemical structure of polyphenols allows them to act as antioxidants in plants, quenching reactive oxygen species produced by the cleaving of H2O in photosynthesis. Many assessments of antioxidant capacity and activity of various phenolic compounds from apple have been conducted and are reviewed by Biedrzycka and Amarowicz (2008). In vitro studies have demonstrated growth inhibition of human liver (HepG2) and colon (Caco-2) cancer cell lines with apple extract application (Wolfe, Wu, and Liu 2003; Eberhardt, Lee, and Liu 2000). Major classes of phenolics found in apples include anthocyanins, dihydrochalcones, flavonols, and flavanols. These are intermediate products of the phenylpropanoid pathway, including the flavonoid pathway branch (Figure 1).

1.3.4.1 Anthocyanins

Anthocyanins are polar plant pigments that give blue, red, and purple color to plants. Cyanidin-3-galactoside is the pigment responsible for the red color seen in most apple varieties today and is more abundant in areas of blush on certain varieties. Apple anthocyanins, recently reviewed in Matsuoka (2019), accumulate rapidly in the peel within the few weeks before the apple reaches maturity (Chalmers, Faragher, and Raff 1973). Anthocyanin production is dependent on enzymes of the flavonoid biosynthetic pathway and is therefore under genetic control (Takos et al. 2006). In addition to genetics, accumulation is influenced by environmental factors such as light exposure, temperature, and nutrient availability (Proctor 1974; Honda et al. 2014). The relationship between apple anthocyanins and health benefits has been a subject of study due to the proposed antioxidant properties of this class of flavonoids, summarized by Biedrzycka and Amarowicz (2008).

1.3.4.2 Dihydrochalcones

The most abundant dihydrochalcone present in apples is phloridzin (phloretin 2′-O-glucoside, phlorizin). Throughout the plant kingdom, this compound is only accumulated in high abundances in Malus, where it is the predominant phenolic compound (Gosch, Halbwirth, and Stich 2010). A substantial review by Ehrenkranz et al. (2005) summarizes the history of phloridzin use in renal physiology and its effect on human metabolism. In mammals, phloridzin can block intestinal glucose absorption and renal glucose resorption (Alvarado and Crane 1962; Vick, Diedrich, and Baumann 1973). Due to its role in glucose resorption, it has been heavily studied in relation to diabetes. Phloridzin has demonstrated the ability to normalize insulin sensitivity and reduce hyperglycemia by inhibiting sodium glucose co-transporters in the proximal tubule brush border and gastrointestinal tract (Gaisano et al. 2002; Abdul-Ghani and DeFronzo 2008). Apple consumption has been correlated with reduced risk of type 2 diabetes mellitus in several studies (Song et al. 2005; Muraki et al. 2013) and was supported by a recent meta-analysis of prospective cohort studies by Guo et al. (2017).

[Figure 1: pathway diagram not reproduced. Compounds shown: p-coumaroyl-CoA, malonyl-CoA, chalcone, dihydrochalcone (phloretin), flavanone, dihydroflavonol, flavonol (quercetin glycosides), leucoanthocyanidin, flavan-3-ol (catechin), anthocyanidin, flavan-3-ol (epicatechin), anthocyanin (cyanidin 3-glycosides), and proanthocyanidins (condensed tannins). Enzymes shown: chalcone synthase, hydroxycinnamoyl-CoA double-bond reductase, chalcone isomerase, flavanone-3β-hydroxylase, flavonol synthase, an unknown glycosyl transferase, dihydroflavonol-4-reductase, leucoanthocyanidin reductase, leucoanthocyanidin dioxygenase, anthocyanidin reductase, and UDP-glycose:flavonoid-3-O-glycosyltransferase.]

Figure 1. Flavonoid biosynthesis pathway.

1.3.4.3 Flavonols

Flavonols, thought to protect plants against excessive solar radiation and widely distributed in fruits and vegetables (Merzlyak et al. 2005), have been characterized in apple with the main compounds being quercetin glycosides (Rana and Bhushan 2016). A review by Perez-Vizcaino and Duarte (2010) describes the thorough in vitro and in vivo studies of flavonols, specifically quercetin, as key compounds for cardiovascular health benefits. They underscore the “undisputed” nature of quercetin’s anti-hypertensive and anti-atherogenic effects as displayed in animal models of disease. Anti-inflammatory effects of quercetin-3-O-glycoside apple extractions were examined in a rat-feeding trial in which the rats were fed a high-fat diet (Sekhon-Loodu et al. 2014). The group fed the apple extract had significantly lower concentrations of the proinflammatory biomarkers C-reactive protein and interleukin-6, as well as lower serum triglycerides and LDL-C with higher HDL-C, compared to the control.

1.3.4.4 Flavanols (Flavan-3-ols)

Flavanols make up the largest class of polyphenols in apple fruit and are also abundant in cocoa, tea, and wine (Vrhovsek et al. 2004; Perez-Vizcaino and Fraga 2018). The flavan-3-ols (−)-epicatechin and (+)-catechin, both abundant in apples, are precursors of the polymeric A- and B-type proanthocyanidins (condensed tannins), respectively, which contribute to astringency in fruit and fruit products (Ottaviani et al. 2018). Gene transcription for synthesis of these compounds in apple peel is regulated separately from the other flavonoid genes involved in the pathway (Takos et al. 2006). Epicatechin-rich apple extracts have been noted to potentially increase bioavailability of nitric oxide, which is inversely correlated with blood pressure (Hollands et al. 2013).

1.4 Nutrition-Driven Breeding

Despite the proposed links between phytochemicals and improved health outcomes in apple and other horticultural crops, nutrition-driven breeding outcomes have not been a priority (Patil et al. 2012). Conventional breeding has been used in staple crops with germplasm containing sufficient genetic variation for genetic gains in nutritional traits (Nestel et al. 2006). Genetic modification has been used in biofortification of crops without sufficient trait variability for increased nutrition or to quickly supersede levels reached by traditional breeding (Beyer 2010). Patil et al. (2012) point out provitamin A carotenoids and flavonoids as two classes of secondary metabolites that have driven breeding efforts for increased levels of bioactive compounds.

Vitamin A (retinol) is essential for key biological processes in human growth and development, such as eyesight, and deficiency is common in developing countries (De Moura, Miloff, and Boy 2015). The carotenoid β-carotene, a provitamin A phytochemical, has been the subject of many biofortification projects. Golden Rice is the best-known example. Without natural variants with the capacity for increased provitamin A production, the β-carotene biosynthesis pathway was integrated into the rice (Oryza sativa L.) germplasm via genetic modification (Ye et al. 2002). Genes were introduced for phytoene synthase and β-cyclase from daffodil (Narcissus pseudonarcissus L.) and bacterial phytoene desaturase from Erwinia uredovora (Pon) Dye. (Ye et al. 2002).

Success in rice sparked similar projects in many other staple crops. Cassava (Manihot esculenta Crantz.), a staple crop in much of Africa, makes up a large portion of the diet there but is not high in micronutrients (De Moura et al. 2015). Both conventional breeding and transgenic approaches have been used in cassava carotenoid biofortification, led largely by the BioCassava Plus Program (Sayre et al. 2011). Traditional and transgenic efforts to increase carotenoid content of maize (Zea mays L.) have also been successful (Naqvi et al. 2009). Increased carotenoid accumulation in sweet potato (Ipomoea batatas (L.) Lam.) has been achieved through conventional, molecular, and transgenic breeding methods (Kang et al. 2017). Transgenic modification of potato (Solanum tuberosum L.) has improved β-carotene and lutein levels (Ducreux et al. 2005; Diretto et al. 2007).

Increased carotenoid and flavonoid content have both been objectives in tomato breeding conducted with traditional and transgenic methods (Patil et al. 2012). Other biofortification efforts for increased flavonoid content in crops have been successful. Interest in the nutraceutical value of increased flavonoids led to conventional breeding methods able to produce anthocyanin-rich purple carrots (Nelson 1999). In onions (Allium spp. L.), increasing the potentially health-beneficial flavonoid quercetin has been a breeding target. The red onion variety ‘Quer-rich’ has demonstrated a successful increase in quercetin content by conventional breeding (Muro et al. 2010).

The rich opportunity for nutrition-focused breeding in horticultural crops in the 21st century to address human health concerns was summarized by Bliss in 1999. However, even with transgenic approaches available today, horticultural crop breeding continues to focus on disease resistance and fruit quality characteristics other than nutrition (Xiong, Ding, and Li 2015). Phytochemical research of fruits and vegetables has concentrated on characterizing the metabolomes of horticultural crops, but this work has not yet been translated into breeding. Programs considering phytochemical content for tomato, apple, onion, strawberry, cranberry, and raspberry are currently underway as nutrition and diet become increasingly important to consumers (Farneti, Khomenko, Cappellin, Ting, Romano, et al. 2015). Biofortification has been pursued in crops that are highly consumed and will therefore be able to make an impact on health worldwide. As previously stated, apple is the most consumed fruit in America and one of the most consumed globally. Increasing health-beneficial compounds in apple would therefore also have a large impact.

1.5 Apples and Genomics

It is not for lack of tools that horticultural crop breeding has yet to make widespread progress in advancing nutrition. Whole genome sequencing has enabled significant progress in DNA-informed plant breeding (Peace et al. 2019). The ‘Golden Delicious’ apple variety was the tenth plant genome to be sequenced (Daccord et al. 2017; Peace et al. 2019). Now a second variety has been sequenced: HFTH1, a line derived from popular Chinese variety ‘Hanfu’ (Zhang et al. 2019). Marker identification and trait mapping are most advanced in apple out of the Rosaceous crops (Laurens et al. 2010).

In light of the long juvenile period and evaluation processes of apple, which result in high cost for breeding, genetic resources can enable a shorter and more efficient breeding cycle that will reduce cost. Despite substantive output of genetics research within apple breeding programs, implementation of these tools has been rare (Laurens et al. 2010; Iezzoni et al. 2010). An international consortium, FruitBreedomics, was established in 2010 to connect Rosaceous crop resources and breeders, with an emphasis on apple and peach (Laurens et al. 2010). Along similar lines, the U.S.-based group RosBREED was formed as a coordinated effort of breeders, geneticists, and bioinformaticians with the goal of improving research and utilization of the tools produced (Iezzoni et al. 2010).

Work in apple genomics has produced genome-wide 8K Infinium®, 20K Infinium®, and Axiom® Apple480K SNP marker arrays for apple genotyping (Chagné et al. 2012; Bianco et al. 2014; Bianco et al. 2016). Additionally, genetic and physical maps have been generated for cultivated apple (Di Pierro et al. 2016). The RosBREED 2018 annual report details the progress in molecular testing for apple disease resistance, fruit maturity and quality, and productivity (Table 1, Table 2). Noticeably absent are breeding advances in apple nutrition. Elements of genomic study in apples with the purpose of integrating phenotypic information will be discussed below.

Table 1. Progress in molecular testing for apple disease resistance and reduced physiological disorders according to the RosBREED 2018 Annual Report.

Disease/Pathogen | Locus/Loci Discovered* | Valuable Alleles Identified | DNA Test Developed | DNA Test in Use
Blue Mold | M. sieversii PI | yes | Md-Pe3-SSR | yes
Foliar powdery mildew | M. sieversii LG8 | yes | Md-Plw-SSR | yes
Scab | LG1 | yes | CHVf1, VfSNP1, RviHC | yes
Scab | Honeycrisp (+Wildung) LG1 & Honeycrisp LG15 | yes | in progress |
Scab | Vh2, Wh4, Vh8 | yes | in progress |
Fire Blight | LG7 & LG13 | yes | in progress |
Soft scald / Soggy breakdown | Honeycrisp LG2 & LG16 | yes | in progress |
Zonal leaf chlorosis | Honeycrisp LG7 | yes | in progress |
Fire Blight | (3 loci) | in progress | |

*Linkage Group (LG)

Table 2. Progress in molecular testing for apple quality traits according to the RosBREED 2018 Annual Report.

Traits | Locus/Loci Discovered* | Valuable Alleles Identified | DNA Test Developed | DNA Test in Use
Fruit Texture (ethylene) | MdACS1 (LG15) | yes | Md-ACS1-indel, ACS1SNP1 | yes
Fruit Texture (ethylene) | MdACO1 (LG10) | yes | Md-ACO1-indel, ACO1SNP1 | yes
Fruit Texture (firmness) | MdPG1 (LG10) | yes | Md-PG1SSR10Kd, PG1SNP1 | yes
Fruit Texture (firmness) | MdExp7 (LG1) | yes | Md-Exp7-SSR | yes
Fruit Texture (crispness) | Ma (LG16) | yes | Ma-indel, Ma1SNP1 | yes
Fruit Texture (acidity) | Ma (LG16 + LG8) | yes | Ma × A Acidity | yes
Fruit Sweetness | Md-LG1Fru (LG1) | yes | Md-LG1Fru-SSR | yes
Bitter Pit Susceptibility | Bp1 (LG16) | yes | BP16-indel | yes
Fruit skin blush | Rf (LG9) | yes | Md-Rf-SSR | yes
Fruit flesh color | MYB110 (LG17) | yes | Md-S3-indel | yes
Cross-compatibility | S locus (LG17) | yes | S-geno-SNP | yes
Soft scald | LG2 | yes | in progress |

*Linkage Group (LG)

1.6 Apples and Metabolomics

Much of the metabolomics research in apple thus far relates to fruit quality concerns in storage, so a summary is given here. Also, the majority of the work in apple has been conducted with gas chromatography coupled with mass spectrometry (GC-MS). Although GC-MS experiments do differ from the liquid chromatography-mass spectrometry (LC-MS) experiments that will be conducted in our present project, it is important to understand the course of metabolomics work in apple.

In the interest of maintaining fruit quality after harvest, a large portion of the apple metabolite research has aimed to elucidate changes in the fruit during storage. Volatile compound analysis via GC-MS has been primarily used for this objective. Effects of hypoxic controlled atmosphere storage have been studied in various varieties, including ‘’, ‘’, ‘Fuji’, ‘’, and ‘ (Lee et al. 2012; Farneti, Khomenko, Cappellin, Ting, Costa, et al. 2015; Brizzolara et al. 2017).

Apple metabolome profiles related to post-harvest disease and physiological disorders have also been characterized with the goal of determining early indicators for disorders that commonly occur in apple storage. No volatile biomarkers were found for early identification of internal browning in ‘’ apples (Hatoum et al. 2016). However, analysis with NMR spectroscopy revealed differences in organic acid concentrations between ‘Braeburn’ apples affected and unaffected by internal browning (Vandendriessche et al. 2013). Studying long-term storage of ‘Fuji’ apples with GC-MS, Lee et al. (2017) identified 1-butanol and 2-methylbutanol, along with four esters (butyl acetate, butyl butanoate, butyl 2-methylbutanoate, and 2-methylbutyl acetate), as potential indicators of fruit quality. Several volatile compounds (fluoroethene, 3,4-dimethyl-1-hexene, butanoic acid butyl ester, 4-methyl-1-hexene, 2-methyltetrazole, and acetic acid methyl ester) were found to allow early indication of common fungal diseases in ‘McIntosh’ apple storage (Vikram et al. 2004).

Volatile compounds composing apple aroma have also been studied as they relate to customer preferences. Using an artificial “chewing device”, Farneti, Khomenko, Cappellin, Ting, Costa, et al. (2015) evaluated volatiles of ripening ‘Golden Delicious’, ‘Fuji’, and ‘Granny Smith’ apples with proton transfer reaction-time of flight-mass spectrometry. With a traditional GC-MS/MS approach, Vrhovsek et al. (2014) identified 69 volatile compounds important in apple aroma. These analyses have been paired with sensory panels to discern associations between compounds and flavor perception (Aprea et al. 2012).

Additionally, exploratory metabolomics experiments with apple fruit, juice, and pomace have been conducted using GC-MS, LC-MS, and NMR to capture a general understanding of the apple metabolome and differences between varieties, with emphasis on the variance of phenolic compounds. Using GC-MS, Aprea et al. (2011) discriminated between ‘Golden Delicious’, ‘’, ‘Red Delicious’, and ‘Granny Smith’ based on variance of 10 compounds. Similarly, Cuthbertson et al. (2012) were able to differentiate between six varieties with GC-MS. In Belgium, an LC-MS analysis of 47 apple cultivars found that new and heritage varieties could be distinguished with phenolic profiles of the peel, which also suggested genetic relationships (De Paepe et al. 2015). The study also found distinctly high levels and variety of phenolics in the disease-resistant heritage cultivars that may be important for breeding resistant varieties. An NMR analysis of 14 closely related apple varieties showed unambiguous grouping of all samples from a given variety with hierarchical clustering (Eisenmann et al. 2016). However, evaluation of the selected varieties differing in scab and mildew resistance did not show clustering based on metabolite differences.

Due to the interest in apples as a health-beneficial fruit based mainly on phenolic content, a practical assessment of apple consumption is desirable. Food frequency questionnaires have been found to be unreliable (Shim, Oh, and Kim 2014), so biomarker studies aim to produce a quantitative consumption metric of specific foods. The urine metabolomes of apple-fed rats were analyzed for biomarkers (Kristensen, Engelsen, and Dragsted 2012). Markers of apple exposure were found to be quinic acid, m-coumaric acid, (−)-epicatechin, and hippuric acid. More recently, Saenger, Hübner, and Humpf (2017) conducted a human study (n = 30) to identify short-term urine biomarkers of low (1 apple), medium (2 apples), and high (4 apples) apple intake. Low and medium consumption groups could be differentiated from the high consumption group with phloretin (a dihydrochalcone found in high concentrations only in apple) as well as epicatechin and procyanidin B2, neither of which is unique to apple but both of which could be used as supplementary indicators.

1.7 Multi-omic Integration

Determining relationships between genotype and phenotype is the foundation of genetic-based breeding. Quantitative trait loci (QTL) analysis is used to identify genotype-phenotype associations (Collard et al. 2005). Genotype is determined by use of genetic markers, and phenotype must be accurately and consistently measured. Genetic loci mapping for phenotypic traits has traditionally been performed with bi-parental families, only allowing characterization of the segregating traits present in those specific parental varieties. Significant group differences between phenotypic means associated with a marker indicate that the marker is linked to a QTL for that trait. Flanking markers of detected QTLs are a springboard for development and validation of markers tightly linked to the gene of interest (Collard et al. 2005).
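The idea of testing for group differences between phenotypic means at a marker can be illustrated with a minimal single-marker sketch. This is only an illustration under stated assumptions, not the procedure of any study cited here: the function name `single_marker_f`, the genotype labels, and the toy phenotype values are all hypothetical, and a one-way ANOVA F statistic is assumed as the test of group differences.

```python
from statistics import mean


def single_marker_f(genotypes, phenotypes):
    """One-way ANOVA F statistic for a phenotype grouped by marker genotype.

    genotypes  : genotype class per individual at one marker (e.g. "AA", "AB")
    phenotypes : measured trait value per individual, in the same order
    A large F suggests the genotype classes differ in mean phenotype.
    """
    # Collect phenotype values by genotype class.
    groups = {}
    for g, y in zip(genotypes, phenotypes):
        groups.setdefault(g, []).append(y)

    n = len(phenotypes)
    k = len(groups)
    if k < 2 or n <= k:
        raise ValueError("need at least two genotype classes and n > k")

    grand_mean = mean(phenotypes)
    # Variation explained by genotype class (between groups).
    ss_between = sum(len(v) * (mean(v) - grand_mean) ** 2 for v in groups.values())
    # Residual variation within each genotype class.
    # Note: if this is zero (no within-class variation), the division below fails.
    ss_within = sum((y - mean(v)) ** 2 for v in groups.values() for y in v)

    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Hypothetical toy data: six individuals, two genotype classes at one marker.
marker = ["AA", "AA", "AA", "AB", "AB", "AB"]
trait = [1, 2, 3, 4, 5, 6]
print(single_marker_f(marker, trait))
```

In practice the F statistic would be converted to a p-value and the test repeated at every marker, with interval or multiple-QTL mapping methods typically preferred over this single-marker form.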

1.7.1 Bi-Parental Mapping Populations

Previous QTL analysis of apple has focused on disease resistance. Resistance to scab (Calenge et al. 2004), fire blight (Calenge et al. 2005), powdery mildew (Calenge and Durel 2006), and blue mold (Norelli et al. 2017) has been studied, and QTLs have been identified as a step toward developing markers for breeding. Apple QTL analysis has since broadened to incorporate investigations of apple quality traits.

The first genomic-metabolomic integration in apple was a QTL analysis of an F1 mapping population from ‘Discovery’ × ‘Prima’ (Dunemann et al. 2009). Aroma compounds of the progeny were analyzed via headspace solid-phase microextraction gas chromatography. Amplified fragment length polymorphism (AFLP) and simple sequence repeat (SSR) markers were used for genotyping and genetic map construction. Twenty-seven apple fruit volatiles were putatively associated with 50 QTLs.

Several years later, a bi-parental, segregating population from crosses of ‘Prima’ and ‘’ was used by Khan, Chibon, et al. (2012) to map metabolite QTLs (mQTL). Untargeted metabolomic analysis of the peel and flesh was conducted with LC-MS. They reported 418 peel metabolites and 254 flesh metabolites detected. Various types of markers were used in genotyping. However, the software used for their mQTL analysis was designed for homozygous inbred RIL populations, so they transformed their data through several steps to allow the analysis. Mapping produced 669 significant mQTLs. Hotspots on four linkage groups were in areas regulating metabolites involved in the phenylpropanoid pathway. In a follow-up study, the group found a significant correlation between the LG 16 mQTL hotspot of flavanols and procyanidins and the structural gene leucoanthocyanidin reductase (LAR1) using transcription data and the reference apple genome (Khan, Schaart, et al. 2012).

In the same year, fruit from 170 individuals from the bi-parental mapping population ‘Royal ’ × ‘Braeburn’ were analyzed via UHPLC for abundance of 23 polyphenols (Chagné, Krieger, et al. 2012). Metabolites measured included chlorogenic acid, p-coumaroyl quinic acid, cyanidin-glycosides, catechin, epicatechin, procyanidins, quercetin-glycosides, and phloridzin-xyloside. Metabolite abundance data was then integrated with 511 SNP markers using multiple QTL mapping (MQM) analysis, resulting in detection of 69 mQTL for 17 compounds on nine of the 17 apple chromosomes. Eight of the mQTL, all for flavanols, were located on chromosome 16. Chromosome 17 displayed signal for chlorogenic acid and quercetin-3-O-rutinoside. Investigation into genes co-locating with the mQTL produced potential results on chromosomes 16 and 17. The gene hydroxy cinnamate transferase/hydroxy quinate transferase (HCT/HQT) was found on the bottom of chromosome 17 where the chlorogenic acid signal was located. Additionally, leucoanthocyanidin reductase (LAR1) co-located with the flavanol mQTL at the top of chromosome 16. This result paralleled the findings of Khan, Schaart, et al. (2012).

Given that limited information exists on integration of genomic and metabolomic datasets in apple, similar analyses conducted in other crops are summarized here. Genomic and untargeted metabolomic analysis was conducted in a mapping population of 179 doubled haploid lines of wheat (Triticum aestivum L.) (Hill et al. 2015). Metabolite data was obtained with LC-ESI-QTOF-MS experiments, resulting in 197 putatively identified compounds, including alkaloids, flavonoids, terpenoids, and organic acids. Loci were observed to exhibit coordinated genetic control of several groups of metabolites. In addition to QTL mapping, correlations were investigated between metabolite concentration and agronomic traits. Correlations based on common genetic control or linkage could inform identification of metabolites as biomarkers for advancing breeding of agronomic traits.

1.7.2 Metabolite Genome Wide Association Studies (mGWAS)

Recently, McClure et al. (2019) sought to investigate genetic control of polyphenols in a diverse breeding population (n = 136) over two years to understand whether the polyphenol mQTLs detected in bi-parental populations (Chagné, Krieger, et al. 2012; Khan, Chibon, et al. 2012) are transferable to breeding-relevant germplasm. HPLC analysis was used to measure 14 specific polyphenols along with collective phenotypes, including total phenolics, total hydroxycinnamic acids (HCA), total flavonols, total fluorescence, total anthocyanins, and total phloretin-like compounds. These traits were integrated with ~100,000 single nucleotide polymorphisms (SNPs) in metabolite genome wide association studies (mGWAS). Resulting signals indicated a hotspot on chromosome 16 for flavanols within the effect loci determined in Khan, Chibon, et al. (2012) and Chagné, Krieger, et al. (2012). Contrary to the findings for chlorogenic acid in the two mapping population analyses, McClure et al. (2019) identified mQTL on chromosomes 5 and 15 instead of the previously reported chromosome 17. They did indicate that a strong but non-significant signal was present at the expected locus on chromosome 17.
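The distinction between a significant and a "strong but non-significant" signal depends on testing every SNP against a threshold corrected for the number of tests. The following is a minimal sketch of that logic for a single metabolite, under assumptions that are not drawn from McClure et al. (2019) or any other cited study: association is measured by Pearson correlation between allele dosage (0, 1, 2) and abundance, the p-value uses the Fisher z-transform normal approximation, and the threshold is a Bonferroni correction. The function names and the toy data are hypothetical. Real mGWAS models also account for population structure and kinship, which this sketch omits.

```python
from math import atanh, sqrt
from statistics import NormalDist, mean


def pearson_r(x, y):
    """Pearson correlation; returns 0.0 if either variable has no variance."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # e.g. a monomorphic SNP carries no association signal
    return sxy / sqrt(sxx * syy)


def approx_p_value(r, n):
    """Two-sided p-value via the Fisher z-transform (normal approximation)."""
    if n <= 3:
        raise ValueError("need more than three individuals")
    r = max(min(r, 0.999999), -0.999999)  # keep atanh finite
    z = atanh(r) * sqrt(n - 3)
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def mgwas_one_metabolite(snp_dosages, abundance, alpha=0.05):
    """Return {snp_id: p} for SNPs passing a Bonferroni-corrected threshold.

    snp_dosages : dict mapping SNP id to a list of 0/1/2 dosages per individual
    abundance   : list of abundances of one metabolite, in the same order
    """
    threshold = alpha / len(snp_dosages)  # Bonferroni correction
    hits = {}
    for snp_id, dosage in snp_dosages.items():
        p = approx_p_value(pearson_r(dosage, abundance), len(abundance))
        if p < threshold:
            hits[snp_id] = p
    return hits


# Hypothetical toy data: 12 individuals, two SNPs, one metabolite.
dosages = {
    "snpA": [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2],
    "snpB": [0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2],
}
abundance = [1.0, 1.2, 0.9, 1.1, 2.0, 2.1, 1.9, 2.2, 3.0, 3.1, 2.9, 3.2]
print(sorted(mgwas_one_metabolite(dosages, abundance)))
```

With thousands of metabolite features each tested against thousands of SNPs, the correction becomes far more stringent than in this two-SNP example, which is one way a visually strong signal can fall short of significance.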

1.7.3 Pedigree-Based Analysis

Advancement in QTL mapping methodologies was needed to better represent the complex genetics of species like apple. Peace et al. (2019) detailed the need to move beyond bi-parental populations in apple trait loci analyses. Due to the heavily heterozygous nature of the apple genome and the diversity available in wild and cultivated species, bi-parental populations would hamper characterization of possible alleles. Breeding programs commonly contain many alleles for a given trait, whereas a bi-parental population can account for at most four alleles if both parents are heterozygous (e.g. AB × CD), and usually only two when the parents are homozygous for distinct alleles (e.g. AA × BB). Selection based on information about two alleles then leads to loss of efficiency and genetic erosion when other genotypes are ignored (van de Weg et al. 2004). Additionally, loci that are characterized in one mapping population are commonly not transferable to other families. Characterizing loci anew in each family is also unfeasible, as the long juvenile period would stretch the production of segregating populations to many decades.
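The allele counts in the two example crosses can be checked by enumerating the gametes each parent contributes at a single locus. This is a small illustrative sketch only; the function names are hypothetical, and each parent is assumed to be diploid and written as a two-letter string of allele labels.

```python
from itertools import product


def progeny_genotypes(parent1, parent2):
    """Set of possible progeny genotypes at one locus.

    Each parent is a two-character string of allele labels (e.g. "AB").
    Every progeny receives one allele from each parent; genotypes are
    stored with alleles sorted so that "AC" and "CA" count as the same.
    """
    return {"".join(sorted(a + b)) for a, b in product(parent1, parent2)}


def alleles_segregating(parent1, parent2):
    """Distinct alleles that the cross can represent at this locus."""
    return set(parent1) | set(parent2)


# Both parents heterozygous with no shared alleles: four alleles represented.
print(sorted(progeny_genotypes("AB", "CD")), len(alleles_segregating("AB", "CD")))

# Parents homozygous for distinct alleles: two alleles, one progeny genotype.
print(sorted(progeny_genotypes("AA", "BB")), len(alleles_segregating("AA", "BB")))
```

Any allele present in the wider breeding germplasm but absent from both parents cannot be evaluated in such a cross, which is the limitation described above.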

To overcome these limitations, QTL mapping via a pedigree-based analysis (PBA) approach has been implemented to analyze breeding germplasm. The PBA approach analyzes several pedigree-connected families, taking advantage of identity by descent (IBD) (Bink et al. 2014). The principle of IBD is based on knowledge of haplotype inheritance over generations. It evaluates alleles of recent varieties based on alleles of founding varieties (van de Weg et al. 2004). Important breeding parents and progeny make up the pedigree-related families, so QTLs are evaluated in a more breeding-relevant context than in bi-parental populations. This allows evaluation of individuals from a variety of genetic and even environmental backgrounds because samples are taken from existing breeding programs.

Bink et al. (2014) delivered a proof of concept evaluation of a Bayesian PBA QTL analysis of apple firmness using the FlexQTL™ software (Bink 2002; Bink et al. 2002; www.flexqtl.nl). Genotyping with SSR markers was conducted for 27 full-sib, pedigree-connected families and their known ancestral progenitors. Evidence was found for 14 fruit firmness QTLs, with several being previously reported. Genomic breeding values produced were 90% correlated with phenotype, on average. The predictability of traits based on QTLs in breeding parents is necessary for effective marker-assisted breeding.

Also in apple, PBA was used for QTL discovery for individual sugars and soluble solids content of multiple breeding populations (Guan et al. 2015). The 8K Infinium® SNP array was used for genotyping and generated 1,416 polymorphic SNPs. Fructose, glucose, sucrose, and sorbitol were identified and quantified using GC-MS at harvest and several post-harvest time points for two successive seasons. Genotypic-phenotypic integration was performed using FlexQTL™ software. Stable QTLs were identified on LG1 for fructose and sucrose.

In another Rosaceous tree crop, peach (Prunus persica L.), a Bayesian approach was used with PBA for QTL mapping of eight complex traits: days to bloom, fruit diameter, fruit weight, fruit development period, pH, soluble solids concentration, and titratable acidity (Fresnedo-Ramírez et al. 2015). Fifty-two QTL were positively identified with up to 98% phenotypic variance explained by QTL. Similarly, five QTL for peach fruit size and weight were researched with the same PBA approach within a Bayesian framework (Fresnedo-Ramírez et al. 2016). This study included material from several peach breeding programs, and therefore demonstrated the robustness of the Bayesian PBA approach for QTL identification.

1.7.4 Apple Integrated Genetic Linkage Map (iGLmap)

Accurate marker-loci order is necessary for QTL studies. The order of markers is traditionally determined by creating a genetic linkage map for the population being studied (Di Pierro et al. 2016). These populations are typically bi-parental full-sib families that are segregating for the trait of interest. However, this approach to QTL mapping provides results that may not be representative of genetic control of the phenotype across a wider range of germplasm, as it only considers the alleles of two parents (Peace et al. 2019).

Instead, interest in QTLs is moving towards larger, less-constrained populations of pedigree-connected individuals with multiple founders (Peace et al. 2019). These populations allow a more diverse genetic background and allelic variation to be represented in QTL analysis, improving likelihood of QTL detection and wide-ranging applicability across variable germplasm. For marker ordering in such populations, the alternative to a population-specific linkage map is a high-quality reference genome or consensus map for the species. To meet this need in apple, a reference genetic linkage map was developed in 2014 by Di Pierro et al. (2016). The integrated genetic linkage map (iGLmap) was based on genotypes for 21 full-sib families from the 20K Infinium SNP array for apple (Bianco et al. 2014). The apple iGLmap provides the order of marker-loci across all 17 linkage groups, which is necessary for conducting QTL studies.

1.8 Specific Aims

The overall goal of this work is to develop an integrated platform utilizing genomic and metabolomic data from diverse apple selections to better understand genotype-metabolite associations. By using pedigree-connected apples in combination with selections of additional varieties and wild accessions, we hypothesize that both genotypic and phenotypic diversity will enable discovery of genotype-metabolite relationships. This will be accomplished via a three-fold approach: characterizing and analyzing the genome and metabolome data individually, then integrating the two -omics disciplines with mGWAS and PBA for mQTL discovery.

Aim 1 – Genomics: Genotype a segregating pool of apples (n = 199), including individuals from 3 pedigree-connected families (n = 123), additional varieties (n = 32), and wild accessions (n = 44) chosen because of their expected genetic and phytochemical diversity.

- Individuals were genotyped using the genome-wide apple 20K Infinium® SNP array, which is well-characterized and allied with resources, such as linkage maps and gene annotations.

Aim 2 – Metabolomics: Conduct untargeted metabolomic analysis for comprehensive metabolic profiling of apple fruit extracts using high-resolution mass spectrometry (MS) and nuclear magnetic resonance (NMR) spectroscopy approaches.

- Polar/semi-polar extracts of apple fruits using MeOH:H2O were analyzed using UHPLC-QTOF-MS.

  o We expect metabolomic analysis of these extracts will capture analytes, including terpenoids and polyphenolic compounds, such as flavonols, flavanols, anthocyanins, dihydrochalcones, and hydroxycinnamic acids.

- Polar/semi-polar extracts of apple fruits using MeOH:H2O were analyzed using NMR.

  o We expect metabolomic analysis of these extracts will capture analytes, including sugars, amino acids, organic acids, and high-abundance phenolic compounds.

Aim 3 – Omics Integration:

a) In apple germplasm ranging from commercial to wild, integrate genotypic and metabolomic datasets using mGWAS in order to identify mQTL as indicated by significant genotype-metabolite associations.

b) In pedigree-connected apple varieties, integrate genomic and metabolomic data leveraging identity-by-descent in a pedigree-based analysis (PBA) using a Bayesian approach to identify mQTL linked to specific metabolites of interest as identified through mGWAS.

The long-term goal of the project is to use these genotype-metabolite associations to inform future marker-assisted breeding (MAB) efforts for health-beneficial compounds in apple. Marker-assisted selection (MAS) would allow breeders to choose parents with high heritability for the desired trait. Growers could then screen progeny for the markers linked with the genetic region of interest. With this early-seedling selection, growers could cull seedlings without the trait and concentrate their time and resources on care of the trees confirmed to have the desirable phenotype. The platform being developed in this work has far-reaching applicability to detect mQTLs for metabolite-related traits, including nutritional content, flavor, texture, appearance, fruit quality, and disease resistance.


Chapter 2. Seeding Multi-omic Apple Improvement

Emma A. Bilbrey1, Kathryn Williamson2, Emmanuel Hatzakis2, Diane Miller3, Jonathan Fresnedo-Ramírez3, Jessica L. Cooperstone1,2

1 Department of Horticulture and Crop Science, The Ohio State University, Columbus, OH, USA
2 Department of Food Science and Technology, The Ohio State University, Columbus, OH, USA
3 Department of Horticulture and Crop Science, The Ohio State University, Wooster, OH, USA


2.1 Abstract

Apples are one of the most commonly consumed fruits in America, and “an apple a day keeps the doctor away” is a well-known adage. The commercial and nutritional importance of apples has prompted interest in varietal improvement. However, progress is limited by a long juvenile period, which delays fruit evaluation for quality traits, such as phytochemical composition. To minimize this drawback, apple breeders have begun using marker-assisted selection (MAS) for some traits, but breeding strategies for fruit phytochemicals have yet to be developed.

In response, we have developed an integrated genomic-metabolomic platform to better understand gene-phytochemical associations in breeding-relevant apple germplasm. Phytochemicals that are potentially health beneficial, contribute disease resistance, or improve fruit quality can be characterized using metabolomics, providing a foundation to study apple’s breeding potential. The platform is based on high-throughput genomic and metabolomic assessment of 173 unique apples, including members of three pedigree-connected families alongside diverse and wild selections. Metabolite genome-wide association studies (mGWAS) were conducted with 11,165 SNPs for two LC-MS data sets of 4,000+ features each and an NMR data set of 756 bins. Novel schemes for prioritizing results from mGWAS indicated 519 (LC-MS (+)), 726 (LC-MS (-)), and 177 (NMR) significant marker-trait associations across the apple genome (LC-MS: p < .00001, NMR: p < .0001). These results were then sifted to select features to analyze with a more powerful pedigree-based analysis (PBA) in FlexQTL™ with 6,034 SNPs to identify metabolite quantitative trait loci (mQTL). An mQTL for chlorogenic acid was identified on the bottom of chromosome 17 across all three metabolomic data sets and was used as a proof-of-concept example to demonstrate the applicability of the platform. Determining gene-phytochemical relationships in apple will inform breeding and facilitate future MAS for improved nutrition along with attributes related to flavor and disease resistance etiology.

2.2 Introduction

Apples (Malus × domestica Borkh.) are one of the most consumed fruits worldwide, ranked number one in the US (USDA Economic Research Service 2017). Consuming apples, which are rich in various phytochemicals, has been associated with many positive health outcomes, such as decreasing risk of cancer, cardiovascular disease, pulmonary disease, and cognitive decline (Boyer and Liu 2004; Hyson 2011). With increasing incidence of obesity, cardiovascular disease, and cancer (World Health Organization, 2014), there is enormous potential for crop breeding to respond with efforts to increase levels of health-beneficial compounds in fruits and vegetables. With already high consumption rates, increasing healthfulness of apples represents an opportunity to impact the global diet.

One barrier to apple breeding, including nutrition-driven breeding, is the long juvenile period (5-10 years to first fruit), which lengthens the breeding cycle. Additionally, as an out-crossing species, the self-incompatibility of apple prohibits creation of inbred lines, leaving apples with heavily heterozygous genomes and making it difficult to lock in desired alleles. Thus, apple breeding is a daunting task using traditional breeding methods. Instead, marker-assisted selection (MAS) must be implemented in the breeding cycle to allow breeders to choose parents with high heritability for the desired trait. Growers could then screen progeny for the markers linked with the genetic loci of interest, be it a gene or regulatory sequences. With this early-seedling selection, growers could cull seedlings without the favorable allele and concentrate time and resources on care of the trees known to have the desirable phenotype.

Reaching this eventual goal requires us to develop our foundational understanding of (1) the genetic variability and (2) the phytochemical profiles that exist in apple germplasm. Genetic data can be gathered with single nucleotide polymorphism (SNP) arrays to characterize genotypes of chosen apple varieties. Global phytochemical assessment via untargeted metabolomics approaches can be obtained by using high-resolution mass spectrometry (HRMS) as well as nuclear magnetic resonance (NMR) spectroscopy analyses. The larger challenge comes with (3) the integration of the genomic and metabolomic data sets to determine the genotype-metabolite relationships that are the basis of marker development.

Quantitative trait loci (QTL) analysis is used to identify genotype-phenotype associations (Collard et al. 2005). Genotype is determined by use of genetic markers, and phenotype must be accurately and consistently measured. QTL are detected when the pattern of variance of a genotype mirrors the pattern of variance in a phenotype across the population being studied. QTL mapping for phenotypic traits has traditionally been performed with bi-parental families. This is also true in metabolite QTL (mQTL) studies in apple (Dunemann et al. 2009; Chagné, Krieger, et al. 2012; Khan, Chibon, et al. 2012).


Due to the complex genetics of a species like apple, Peace et al. (2019) detailed the need to move beyond bi-parental populations in apple trait loci analyses. Due to the heavily heterozygous nature of the apple genome and diversity available in wild and cultivated species, bi-parental populations would hamper characterization of possible alleles. Breeding programs commonly contain many alleles for a given trait, but a bi-parental population can account for at most four alleles if the parents are heterozygous (e.g. AB × CD), and usually only two when the parents are homozygous for distinct alleles (e.g. AA × BB). Selection based on information about two alleles then leads to loss of efficiency and genetic erosion when other genotypes are ignored (van de Weg et al. 2004). Additionally, loci that are characterized in one mapping population are commonly not transferrable to other families. Developing additional mapping populations is often impractical because the long juvenile period would stretch the production of segregating populations to many decades.

To overcome these limitations, QTL mapping via a pedigree-based analysis (PBA) approach has been implemented to analyze breeding germplasm. The PBA approach analyzes several pedigree-connected families, taking advantage of identity by descent (IBD) (Bink et al. 2014). The principle of IBD is based on knowledge of haplotype inheritance over generations. It evaluates alleles of recent varieties based on alleles of founding varieties (van de Weg et al. 2004). Important breeding parents and progeny make up the pedigree-related families, so QTLs are evaluated in a more breeding-relevant context than in bi-parental populations. This allows evaluation of individuals from a variety of genetic and even environmental backgrounds because samples are taken from existing breeding programs.


Accordingly, this study presents a platform for genomic-metabolomic integration using mGWAS and PBA mQTL analysis in a diverse collection of breeding-relevant apple germplasm. The pipeline developed here enables exploration of the relationships between apple genomics and metabolomics, largely based on visualization strategies for the results of multi-omic integration. It also facilitates seeking answers to a priori questions about specific metabolites or genetic loci of interest. Developing a platform to inform breeding for increased levels of phytochemicals can not only be applied to compounds for health interests but also to phytochemicals involved in other breeding interests such as disease resistance, flavor, texture, fruit quality, and post-harvest characteristics.

2.3 Materials and Methods

2.3.1 Sample Selection

Apples were chosen for the study based on the expected diversity of their genetic and metabolic profiles as well as interest to breeders. Three sets of experimental progeny, ‘Honeycrisp’ × ‘Fuji’ (HC×FJ, n = 27), ‘Goldrush’ × ‘Sweet 16’ (GR×S16, n = 28), and ‘Honeycrisp’ × ‘MSH-10’ (HC×M10, n = 19), with varied flavor and texture profiles were selected along with 22 members of their pedigree-connected families (Figure 2). These three families include commercial and heritage apple varieties as well as advanced selections. Wild accessions (n = 44) from Central Asian M. niedzwetzkyana and M. sieversii along with additional apples (n = 31) with traits of breeding interest were also included to capture the wide variety of apple germplasm.


The majority of apples for the study were collected from one of three commercial orchards with membership in the Midwest Apple Improvement Association (MAIA) grower-participatory breeding program: Lynd Fruit Farm, Johnstown, OH (n = 28); White House Fruit Farm, Canfield, OH (n = 77); and David Doud’s Countyline Orchard, Wabash, IN (n = 41) (Supplement 1). Additional varieties not available at the MAIA sites were obtained from Purdue University in West Lafayette, IN (n = 5); the USDA Apple Germplasm Repository in Geneva, NY (n = 2); The Dawes Arboretum in Newark, OH (n = 43); or other locations in Ohio (n = 3) (Supplement 1).

Leaf tissue and apple fruit were collected from one tree for the majority of varieties included in the study, with a few exceptions. Four apple varieties (‘Honeycrisp’, ‘Goldrush’, ‘Golden Delicious’, and ‘EverCrisp’) were collected at each of the three MAIA sites. In addition, 13 of the 27 HC×FJ progeny from David Doud’s Countyline Orchard in Wabash, IN, were also collected from clonal replicates at White House Fruit Farm in Canfield, OH. Leaf tissue was collected in summer 2018 for DNA extraction. Fruits were harvested in fall 2018 for metabolomic analysis (Supplement 1).

A minimum of three apples were collected from each selected tree when optimally ripe, as determined by expert opinion based on ground color and traditional ripening timeline for specific varieties. All samples were sent to the Ohio Agricultural Research and Development Center in Wooster, OH. Apples were stored in a cooler at 4°C for up to seven days. At least three apples were rinsed, dried, and cored for each selection. Eight slices were chosen at random then flash frozen in liquid nitrogen. Fruits too small to be cored were cut with a knife or simply frozen whole. The frozen slices were placed in labeled freezer storage bags and kept at -20°C until extraction. Duplicate samples of eight slices are stored, with one set in Columbus, OH, and one set in Wooster, OH. Comprehensive metadata concerning the samples chosen for the project are available in Supplement 1.

Figure 2. Pedigree chart for three pedigree-connected families.

2.3.2 Genomics

2.3.2.1 DNA Extraction and SNP Array Processing

DNA extraction from leaf tissue was carried out using the Omega E-Z 96 Plant DNA Kit (Omega Bio-tek, Inc., Norcross, GA, USA). Samples were then genotyped at Michigan State University using an apple 20K Infinium® SNP array (Bianco et al. 2014). SNP calling and filtering were performed using GenomeStudio version 2.0.4 (Illumina Inc., San Diego, CA, USA; http://www.illumina.com). Marker order was determined by integrating markers with the apple iGLmap (Di Pierro et al. 2016). If DNA was extracted and genotyped for multiple individuals of an apple variety, the highest quality set of genotype calls, determined by highest genescore, was chosen as representative for that variety. This was the case with the four standards and the HC×FJ replicated progeny, such as D01 and D01OH.

2.3.2.2 Pedigree Confirmation

A highly-polymorphic subset of 1,648 SNPs distributed across the genome was used for pedigree verification of 110 genotyped pedigree members in FRANz 2.0 (Riester, Stadler, and Klemm 2009), as in Fresnedo-Ramírez et al. (2015). The minimum minor allele frequency (MAF) was set at 0.1 to ensure inclusion of polymorphic markers across the pedigree, allowing no more than 1% missing data. Otherwise, default parameters were used. The estimated date/year of “birth” of the majority of individuals in the dataset was provided to indicate that individuals which are contemporaneous or no more than 5 years older cannot be parents of other individuals. Also, genotyped individuals included as male or female parents exclusively were indicated as such to facilitate rapid convergence of the Markov Chain Monte Carlo (MCMC) procedure. Parentage was corrected for accessions with high probability (>0.95) of distinct parentage.

2.3.2.3 Marker File Preparation for mGWAS

Less stringent filtering was applied to the marker dataset to determine SNPs to use for the metabolome-based genome-wide association study (mGWAS). Markers were first matched to those included in the iGLmap, giving 15,260 markers. These were then filtered for MAF > 0.05 and a maximum missingness of 5% across all members that were both phenotyped and genotyped (n = 124). This resulted in 11,165 polymorphic, genome-wide markers to be used for mGWAS analyses.
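The filtering step above can be sketched as follows. This is a minimal illustration, not the GenomeStudio or FlexQTL™ procedure used in the study. The 0/1/2 dosage coding, the function names, and the toy marker names are all assumptions made for the example; only the MAF > 0.05 and 5% missingness thresholds come from the text.

```python
# Hypothetical sketch of MAF and missingness filtering for SNP markers.
# Assumption: genotypes are coded as the count of one allele (0, 1, or 2),
# with None marking a missing call. Marker names below are made up.


def minor_allele_frequency(calls):
    """Return the minor allele frequency among non-missing calls."""
    observed = [c for c in calls if c is not None]
    if not observed:
        return 0.0
    allele_freq = sum(observed) / (2 * len(observed))
    return min(allele_freq, 1.0 - allele_freq)


def missing_rate(calls):
    """Return the fraction of individuals with a missing call."""
    return sum(c is None for c in calls) / len(calls)


def filter_markers(genotypes, min_maf=0.05, max_missing=0.05):
    """Keep markers with MAF > min_maf and missingness <= max_missing."""
    return [
        marker
        for marker, calls in genotypes.items()
        if minor_allele_frequency(calls) > min_maf
        and missing_rate(calls) <= max_missing
    ]


toy_genotypes = {
    "snp_a": [0, 1, 2, 1, 0, 1, 2, 1, 0, 1],        # polymorphic, complete
    "snp_b": [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],        # monomorphic
    "snp_c": [0, 1, None, None, 2, 1, 0, 1, 2, 1],  # 20% missing calls
}
kept = filter_markers(toy_genotypes)
print(kept)
```

In this toy set only the first marker survives: the second is monomorphic and the third exceeds the missingness limit.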

2.3.2.4 Marker File Preparation for FlexQTL™ Analyses

Markers with MAF > 0.07 and missingness of up to 10% in the pedigree members were kept. These parameters yielded the maximum number of informative markers that also matched with those represented in the iGLmap. For FlexQTL™, each allele is reported independently, resulting in two calls for each locus. The markers were analyzed in FlexQTL™ on the Owens Supercomputer (Ohio Supercomputer Center 2016) at the Ohio Supercomputer Center (OSC) (Ohio Supercomputer Center 1987) to check for double recombinations and an excessive number of genotyping inconsistencies. Markers with >3 genotyping inconsistencies, as well as any obvious double recombinants, were removed, resulting in a final set of 6,034 markers for use in the FlexQTL™ routine.

2.3.3 Metabolomics

2.3.3.1 Chemicals

Solvents were of LC-MS grade from Fisher Scientific (Pittsburgh, PA), including methanol, water, acetonitrile, and formic acid. NMR-grade deuterated methanol and trimethylsilylpropanoic acid (TSP) as well as sodium phosphate monobasic monohydrate and sodium phosphate dibasic heptahydrate were also purchased from Fisher Scientific. Authentic standards were purchased from Sigma-Aldrich (St. Louis, MO).

2.3.3.2 Apple Fruit Extraction

A total of 199 unique apple samples were prepared for metabolomic analysis. Sample extraction order was randomized by random selection of sample bags from freezer storage totes. Several representative slices were removed from each sample freezer bag containing eight slices. For each apple sample, peel was weighed to 1.0 g ± 0.05 g. Pieces of apple flesh were cut from the slices and added to the peel to reach a combined weight of 5.0 g ± 0.05 g. Apples with inadequate sample collection (n = 5) were weighed to half the value of the standard: peel 0.5 g ± 0.05 g and flesh 2.5 g ± 0.05 g. Separate weighing of peel and flesh was used to account for the different sizes of the apples. Small apples would have a higher peel:flesh ratio than larger apples. This would be problematic in extraction because there is a higher concentration of phytochemicals in the apple peel compared to flesh. Weight matching peel and flesh separately allowed us to see differences in chemical abundance based on true differences instead of differences imparted by fruit size.

Weight-matched peel and flesh were then placed together in a 50 mL tube with two 3/8″ × 7/8″ angled ceramic cutting stones (W.W. Grainger, Lake Forest, IL; Item no. 5UJX2). Using a bottle top dispenser (Q-sep Bottle Top Solvent Dispenser, 2.5-30 mL, Restek Corp., Bellefonte, PA, USA), 15 mL of methanol was immediately added to extract polar/semi-polar metabolites and inhibit enzymatic and non-enzymatic oxidation reactions. Tubes were stored overnight at -80°C.

Tubes were placed in a sample homogenizer (SPEX® SamplePrep Geno/Grinder®, NJ, USA) for grinding. Samples were then centrifuged at 2,800 × g for 3 minutes to pellet insoluble material. Using 10 mL Luer-lock syringes, ~6-10 mL of supernatant was removed from the pelleted tubes and then syringe-filtered (0.22 µm PTFE) into a new 50 mL tube to remove remaining particulates. Filtered extract was then dispensed for UHPLC-QTOF-MS and NMR analyses according to Table 3. NMR samples were dried via a vortex vacuum evaporator (Combidancer, Hettich AG, Baech, Switzerland). Drying parameters can be found in Appendix A.2.1.

Table 3. Apple fruit extraction samples and modifications.

Extract      Volume    Modification                         Vial
LC-MS (+)    1.0 mL    Diluted to 50% MeOH – 0.5 mL H2O     2 mL LC-MS Vial
LC-MS (-)    1.0 mL    Diluted to 50% MeOH – 0.5 mL H2O     2 mL LC-MS Vial
Extra        1.0 mL    -                                    1.5 mL Tube
NMR          0.5 mL    Dried                                1.5 mL Tube

A 1.0 mL aliquot of each sample extract was pooled to create a bulk quality control (QC) to be used in the UHPLC-MS experiments. The pooled QC solution was diluted with H2O to 50% MeOH then aliquoted. Several process blanks were made by conducting the extraction method with all the same physical elements except the apple fruit. Analyzing the process blank allowed detection of the presence of any residues or contamination with compounds from the pipette tips, tubes, or pellets used in the extractions. All extracts were stored at -20°C until analysis.

2.3.3.3 UHPLC-QTOF-ESI-MS Full Scan Experiments

Polar extracts were analyzed with an Agilent 1290 Infinity II series UHPLC coupled to an Agilent 6545 quadrupole time of flight mass spectrometer with electrospray ionization (ESI-QTOF-MS) (Agilent, Santa Clara, CA). Injection volume was 3 µL per sample. Reverse phase chromatography was conducted with a Waters Acquity UPLC HSS T3 column (2.1 x 50 mm, 1.8 μm particle size) maintained at 40°C (Waters, Milford, MA, USA). Full-scan spectral data was collected separately in both positive and negative ionization modes for complete metabolite coverage. Mobile phases consisted of water with 0.1% formic acid (A) and acetonitrile with 0.1% formic acid (B). Flowing constantly at 0.5 mL/min, the gradient was as follows: 0-0.5 min 0% B; 0.5-8 min increase to 100% B; 8-9 min hold 100% B; 9.01-10.0 min isocratic at 0% B. Settings of the MS were as follows: gas temp 350°C, gas flow 10 L/min, nebulizer 35 psig, sheath gas temp 375°C, sheath gas flow 11 L/min, VCap 4500 V, nozzle voltage 500 V, fragmentor 100, skimmer1 45, octopoleRFPeak 750, and scan rate 2 spectra/s with a mass range of 100-1700 m/z.

Before injecting any apple extracts, solvent blanks (1:1 water:methanol) were injected to equilibrate the instrument. Next, process blanks were run to give a baseline of chemical noise present in the instrument and from the extraction process. This allowed removal of extraneous peaks coming from any residues or contamination of samples with compounds from the pipette tips and tubes used in the extraction. Pooled QCs were then injected repeatedly until base peak chromatograms were consistent. Then, order of sample analysis followed the random order established by sample weighing so that samples which were extracted first were also analyzed first. This randomization reduced the effect of instrument variation throughout the experiments. Identical pooled QCs were analyzed every seventh injection, providing a way to monitor instrument stability and data quality across the experiments.
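The injection sequence described above can be sketched as a simple list-building routine. This is an illustrative assumption of how such a sequence might be laid out, not the worklist used in the study: the function names, the labels, and the numbers of blanks and conditioning QCs are hypothetical. Only the ordering (solvent blanks, process blanks, conditioning QCs, then samples in extraction order with a pooled QC at every seventh injection) comes from the text.

```python
# Hypothetical sketch of an LC-MS run order with a pooled QC at every
# seventh injection of the sample block. Counts of blanks and conditioning
# QCs are placeholders; the text does not state them.


def interleave_qcs(samples, qc_interval=7, qc_label="pooled_QC"):
    """Insert a QC so that every qc_interval-th injection is a pooled QC."""
    block = []
    for sample in samples:
        block.append(sample)
        # If the next position is a multiple of qc_interval, fill it with a QC.
        if (len(block) + 1) % qc_interval == 0:
            block.append(qc_label)
    return block


def build_run_order(samples, n_solvent_blanks=2, n_process_blanks=2,
                    n_conditioning_qcs=5):
    """Blanks and conditioning QCs first, then the QC-interleaved samples."""
    lead_in = (
        ["solvent_blank"] * n_solvent_blanks
        + ["process_blank"] * n_process_blanks
        + ["pooled_QC"] * n_conditioning_qcs
    )
    return lead_in + interleave_qcs(samples)


# Samples are assumed to already be in the randomized extraction order.
samples = [f"apple_{i:02d}" for i in range(1, 13)]
sample_block = interleave_qcs(samples)
run_order = build_run_order(samples)
for position, name in enumerate(run_order, start=1):
    print(position, name)
```

Keeping the sample list in extraction order preserves the randomization established at weighing, while the fixed QC spacing gives evenly distributed reference points for monitoring drift.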

2.3.3.4 Iterative UHPLC-QTOF-ESI-MS/MS Experiments

Using the same gradient and QTOF settings as above, the pooled QC samples were also analyzed with iterative MS/MS experiments (MassHunter B09 acquisition software, Agilent Technologies) in which data-dependent MS/MS spectra were collected. The goal was to collect MS/MS spectra of as many features from the apple extracts as possible for use in manual compound identification and molecular networking using Global Natural Products Social Molecular Networking (GNPS) (Wang et al. 2016), both based on fragmentation. To balance comprehensive coverage with good-quality spectra, the same QC sample was injected repeatedly, and the most abundant ions throughout the chromatogram were triggered for MS/MS data collection. Those ions were automatically added to an exclusion list, such that the next most abundant ions could then be selected for MS/MS. This allowed collection of fragmentation data for features across a range of ion intensities. These experiments were conducted in both positive and negative ionization mode with five injections for each of two collision energies, 20 and 40 eV.
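The exclusion-list logic described above can be illustrated with a short sketch. This is a deliberately simplified model and not the vendor acquisition algorithm: it ignores retention time, thresholds, and purity settings, and the ion names, abundances, and function name are made up for the example.

```python
# Hypothetical, simplified model of iterative MS/MS precursor selection:
# each injection fragments the most abundant ions not yet on the exclusion
# list, then adds them to the list so later injections reach weaker ions.


def iterative_precursor_selection(abundances, n_injections=5, per_injection=2):
    """Return, per injection, the ions selected for MS/MS."""
    excluded = set()
    rounds = []
    for _ in range(n_injections):
        candidates = [ion for ion in abundances if ion not in excluded]
        chosen = sorted(candidates, key=lambda ion: abundances[ion],
                        reverse=True)[:per_injection]
        if not chosen:
            break  # every ion already has an MS/MS spectrum
        rounds.append(chosen)
        excluded.update(chosen)
    return rounds


# Made-up ion labels and abundances (counts) for illustration only.
toy_ions = {
    "ion_a": 900000,
    "ion_b": 400000,
    "ion_c": 200000,
    "ion_d": 80000,
    "ion_e": 30000,
}
selection = iterative_precursor_selection(toy_ions, n_injections=5)
for injection, ions in enumerate(selection, start=1):
    print(injection, ions)
```

With each pass, lower-abundance ions become eligible, which is how repeated injections of the same QC extend fragmentation coverage across a range of intensities.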

MS settings were as above in full scan experiments. Parameters for AutoMS2 scans were as follows: MS minimum range 40 m/z, MS maximum range 1700 m/z, MS scan rate 3 spectra/s, MS/MS minimum range 40 m/z, MS/MS maximum range 1700 m/z, MS/MS scan rate 1 spectra/s, isolation width narrow (~1.3 amu), and decision engine advanced. Terms for precursor selection were max precursors per cycle 2, threshold (absolute) 10,000, threshold (relative)(%) 0.100, precursor abundance based scan speed – yes, target 100,000 counts/spectrum, use MS/MS accumulation time limit – no, use dynamic precursor rejection – no, purity stringency 100%, purity cutoff 30%, common isotope model, active exclusion enabled – yes, active exclusion excluded after 2 spectra, active exclusion released after 0.12 min, and sort precursors by abundance only.

2.3.3.5 LC-MS Full Scan Data Deconvolution and Processing

Agilent files (*.d) of raw spectral data were converted to *.mzML using ProteoWizard (Chambers et al. 2012) (parameters available in Appendix A.1.1.1). Raw spectral data were deconvoluted in MZmine 2.51 (Pluskal et al. 2010) with a pipeline including mass detection using wavelets (ADAP), Automated Data Analysis Pipeline (ADAP) chromatogram building (Myers et al. 2017), feature detection, isotope grouping, alignment, gap filling, and filtering according to various parameters recorded in Appendix A.1.2. This workflow was used to produce a data matrix of signal intensities, retention time, and m/z ratio for each feature in each sample. Manual data cleanup included removal of features with intensity less than 10 times that detected in the process blank and those with >30% coefficient of variation across the pooled QC samples. Intensities were doubled for those samples that were weighed to half the target weight for peel and flesh. For each remaining feature, missing values were imputed with half the minimum intensity.
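The manual cleanup steps above can be sketched in a few small functions. This is a minimal illustration under stated assumptions rather than the code used in the study: the data layout, function names, and toy values are hypothetical, and the blank filter is assumed here to compare the mean sample intensity against 10 times the process blank, which the text does not specify.

```python
# Hypothetical sketch of LC-MS feature cleanup: process-blank filter,
# pooled-QC CV filter, doubling of half-weight samples, and
# half-minimum imputation of missing values (None).
import statistics


def passes_blank_filter(sample_values, blank_value, fold=10.0):
    """Assumption: mean sample intensity must be >= fold x the process blank."""
    observed = [v for v in sample_values if v is not None]
    return bool(observed) and statistics.mean(observed) >= fold * blank_value


def qc_cv_percent(qc_values):
    """Coefficient of variation (%) across pooled QC injections."""
    return 100.0 * statistics.stdev(qc_values) / statistics.mean(qc_values)


def scale_and_impute(sample_values, half_weight_samples):
    """Double half-weight samples, then fill gaps with half the minimum."""
    scaled = {
        name: (value * 2 if value is not None and name in half_weight_samples
               else value)
        for name, value in sample_values.items()
    }
    fill = min(v for v in scaled.values() if v is not None) / 2
    return {name: fill if v is None else v for name, v in scaled.items()}


def clean_features(features, half_weight_samples, max_cv=30.0):
    """Apply the filters, then scale and impute each surviving feature."""
    cleaned = {}
    for name, feat in features.items():
        values = list(feat["samples"].values())
        if not passes_blank_filter(values, feat["blank"]):
            continue
        if qc_cv_percent(feat["qc"]) > max_cv:
            continue
        cleaned[name] = scale_and_impute(feat["samples"], half_weight_samples)
    return cleaned


# Made-up intensities for three features in three samples.
toy_features = {
    "feat_1": {"samples": {"A": 1000.0, "B": None, "C": 400.0},
               "qc": [500.0, 520.0, 480.0], "blank": 20.0},
    "feat_2": {"samples": {"A": 100.0, "B": 120.0, "C": 90.0},
               "qc": [100.0, 101.0, 99.0], "blank": 50.0},   # near blank level
    "feat_3": {"samples": {"A": 300.0, "B": 200.0, "C": 100.0},
               "qc": [100.0, 200.0, 300.0], "blank": 1.0},   # unstable in QCs
}
cleaned = clean_features(toy_features, half_weight_samples={"C"})
print(cleaned)
```

Doubling is applied before imputation here so that the half-minimum fill value is computed on the weight-corrected scale, following the order in which the steps are listed in the text.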

2.3.3.6 Iterative LC-MS/MS Data Deconvolution and Processing

Agilent files (*.d) of raw full scan and MS/MS data were converted to *.mzML format using ProteoWizard (Chambers et al. 2012) (parameters available in Appendix A.1.1.2). Peak picking was performed simultaneously to centroid the data. Negative and positive ionization modes were processed separately. Within each mode, spectra from the two collision energies, 20 and 40 eV, were processed separately.

2.3.3.7 1D 1H NMR Spectroscopy Experiments

To reconstitute the dried polar/semi-polar samples, an 80:20 solvent system of deuterated methanol (CD3OD) and 0.2 mM aqueous phosphate buffer was prepared. Trimethylsilylpropanoic acid (TSP) was added to the bulk solvent to serve as an internal reference for chemical shift. For each dried sample, 800 µL of the solvent system was added. Samples were then vortexed to help dissolve extract and placed in a sonication water bath at room temperature for 5 minutes. For each sample, 600 µL was then transferred to a 5 mm NMR tube.


1D 1H NMR experiments were performed using a Bruker Avance III spectrometer (Bruker, Ettlingen, DE), operating at 700.13 MHz, equipped with a TXO helium-cooled 5 mm triple-resonance observe probe with samples at 25 ± 0.1°C. Spectra were acquired with the following parameters: 64 scans and 4 dummy scans, 64K data points, 90° pulse angle (10.5 μs), relaxation delay 3 s, spectral width 15 ppm. The spectra were acquired without spinning the NMR tube in order to avoid artifacts, such as spinning side bands of the first or higher order.

2.3.3.8 NMR Spectral Processing

The spectra were processed with the Topspin software package provided by Bruker BioSpin. A fourth-order polynomial function was applied for baseline correction in order to achieve accurate quantitative measurements when integrating signals of interest. Phase correction and baseline correction were adjusted manually. Chemical shifts are reported in ppm from TSP (δ=0).

Post-processing of spectra was performed using R package mrbin v1.3.0 (Klein 2020). Spectra were binned with a bucket size of 0.01 ppm across the range of the spectrum (9.5-0.5 ppm). Regions for water (5.0-4.6 ppm) and methanol (3.335-3.305 ppm) peaks were excluded. Three unstable peaks affected by slight pH shifts were each summed (2.68-2.45, 2.87-2.73, and 4.45-4.25 ppm). Negative intensities were converted to positive values with an affine transformation. Bins were filtered with a noise threshold of 0.01 and a signal-to-noise-ratio (SNR) of 10. Full reproducible code is available in Appendix A.2.2. Intensities were doubled for those samples that were weighed to half the target weight for peel and flesh.
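The binning and region-exclusion steps above can be illustrated with a short sketch. This is not the mrbin implementation, whose code is in Appendix A.2.2. The function name, the representation of a spectrum as (ppm, intensity) points, and the toy values are assumptions for the example; peak summing, the affine transformation, and noise filtering are omitted.

```python
# Hypothetical sketch of uniform NMR binning (0.01 ppm buckets over
# 9.5-0.5 ppm) with the water and methanol regions excluded.
# Bin 0 covers the highest chemical shifts, as spectra are read from 9.5 down.


def bin_spectrum(points, upper=9.5, lower=0.5, width=0.01,
                 excluded=((5.0, 4.6), (3.335, 3.305))):
    """Sum intensities of (ppm, intensity) points into fixed-width bins."""
    n_bins = round((upper - lower) / width)
    bins = [0.0] * n_bins
    for ppm, intensity in points:
        if not (lower <= ppm <= upper):
            continue  # outside the binned range
        if any(low <= ppm <= high for high, low in excluded):
            continue  # inside a solvent region
        index = min(int((upper - ppm) / width), n_bins - 1)
        bins[index] += intensity
    return bins


# Made-up spectral points for illustration only.
toy_points = [
    (9.495, 1.0), (9.493, 2.0),  # fall in the same first bin
    (4.8, 100.0),                # water region, excluded
    (3.32, 50.0),                # methanol region, excluded
    (1.005, 4.0),                # aliphatic region
    (0.2, 9.0),                  # below the binned range
]
binned = bin_spectrum(toy_points)
print(len(binned), binned[0], sum(binned))
```

Dropping points inside the solvent regions before summing leaves the corresponding bins at zero, so they can be removed afterward without affecting neighboring bins.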


2.3.3.9 Data Visualization and Analysis

Samples that were extracted and analyzed only to aid preliminary understanding of metabolite changes with different harvest times were not included in the rest of the analysis (D07OHa, D07OHb, D24OHa, D24OHb, J17a). Additionally, we lacked confidence in the recorded weight for KG61, unsure if it reached the full 5.0 ± 0.05 g. It was therefore discarded from further analysis. This resulted in data from a final 193 selections to be analyzed.

Remaining features from LC-MS and NMR analyses were log2-transformed then analyzed in R 3.6.2 (R Development Core Team 2008). In addition to Base R, packages used include tidyverse v1.3.0 (Wickham et al. 2019) and ggplot2 v3.3.0 (Wickham 2016).

Unsupervised principal components analyses (PCA) were conducted for each of the three final metabolomics datasets: LC-MS (+), LC-MS (-), and NMR.

2.3.3.10 Feature Identification – LC-MS

Databases, such as the Human Metabolome Database (HMDB) (Wishart et al. 2018), were searched for potential matches for m/z values for features of interest. If fragmentation data had been obtained in the untargeted iterative MS/MS experiments, MS/MS spectral data were also compared against experimental or predicted MS/MS spectra in databases as well as fragmentation data or spectra in the literature.

Generation of putative identities was broadly applied to all collected iterative MS/MS spectra using tools in the GNPS infrastructure (Wang et al. 2016). Using file transfer protocol (FTP) client WinSCP (https://winscp.net/eng/docs/library_install), the raw spectral files in *.mzML format were imported into GNPS. They were then analyzed using the molecular networking platform (parameters reported in Appendix A.1.4). This platform performed library searches for MS/MS spectra to provide identifications and conducted molecular networking by assessing pair-wise spectral similarity for each feature.

For compounds of interest, authentic standards were purchased if available for targeted MS/MS analysis. To prepare each stock solution, a needle was dipped into the standard then into a vial of 1:1 MeOH:H2O. The amount was not weighed because quantification of compounds in the apple fruit samples was not the goal of these experiments, simply identity confirmation. Stock solutions were analyzed with full scan LC-MS and MS/MS with collision energies of 20 and 40 eV. Resultant spectra from the standard were compared with the m/z and retention time of the peak of interest in a pooled QC sample.

2.3.3.11 Feature Identification – NMR

Identities of NMR peaks from the complex mixture were investigated primarily through comparison with identities determined in the literature for polar apple extracts as well as other fruits. Chemical shift and multiplicity were also compared to online databases containing 1D 1H NMR spectral data for pure compounds, namely HMDB (Wishart et al. 2018).

For future compound ID verification in NMR, spike experiments with authentic standards will be analyzed with additional 1D 1H NMR analysis. Here, the analytical standard will be added to a pooled QC sample and analyzed to determine if the intensity increases for the peaks of interest at specific ppm. Additionally, 2D 1H-13C HSQC NMR analysis of a pooled QC will be performed for additional structural elucidation for compounds of interest.

2.3.4 Omics Integration – mQTL Detection

With the goal of identifying quantitative trait loci for metabolites (mQTLs), genomics data was integrated with metabolomics data. Pedigree-based analysis (PBA) in FlexQTL would be the most powerful method for mQTL detection in the pedigree-connected population studied here; however, due to the intensive nature of the FlexQTL routines, even using supercomputing, not all metabolomic features (n = 10,331) could be realistically analyzed via this method. Instead, a workflow was developed to prioritize putative genotype-phenotype associations for final analysis in FlexQTL. First, less computationally intensive metabolome-based genome wide association studies (mGWAS) were conducted to determine associations between SNPs and metabolomic features. The details of those analyses and further feature prioritization steps are discussed below.

2.3.4.1 Metabolite genome wide association studies (mGWAS)

In plant breeding, GWAS is commonly used to identify QTL for only a few traits at a time, such as plant height, flowering time, and disease resistance. In these cases, due to the smaller number of analyses, each model to detect genotype-trait associations can be optimized for the chosen trait. In this project, instead of a few phenotypes, 10,331 metabolomic features were analyzed as traits in metabolite genome-wide association studies (mGWAS) to understand genotype-metabolite relationships. The two untargeted metabolomics LC-MS data sets (positive and negative ionization modes) contributed 4,872 and 4,703 features, respectively, and the NMR data set contributed 756 bins. To state clearly, 10,331 mGWAS analyses were conducted in which each metabolomic feature (n = 10,331) was considered the phenotype and was examined for association with each SNP (n = ~10,000) in the genomic dataset. The high-throughput nature of the analyses required implementing general strategies of model selection and parameterization in place of the usual optimization for each phenotype.

Due to the unique population structure, which combines three sets of progenies, their pedigree-related families, as well as heritage and wild species, a novel approach was taken to determine features of interest in each of the three metabolomic-genomic combinations. Within each combination, three separate analyses of certain individuals were conducted: progenies only (Progeny, n = 75), progenies plus pedigree-related individuals (Pedigree, n = 98), and a full, diverse analysis of all individuals for which metabolomic and genomic data were collected (Diverse, n = 124) (Figure 3).

Figure 3. Depiction of the three nested populations used for mGWAS analyses: Diverse (n = 124), Pedigree (n = 98), and Progeny (n = 75). Colors are as follows: purple – progenies, orange – progenies plus pedigree-connected individuals, green – progenies plus pedigree-connected individuals and other diverse selections.

This rationale was adopted based on the assumption that genetic segregation of metabolite production should have clear patterns in the progenies. Adding pedigree-related individuals, including wild members such as M. floribunda, and unrelated apples of distinct parentage or wild origin, which are phenotypic outliers from the rest of the pedigree, would likely bias certain metabolite-genotype relationships. However, despite the introduction of bias for some marker-metabolite associations, those that retain signal across all three iterations of the analysis should be robust across a wide variety of apple germplasm, as represented by the individuals in this study. The advantage of this approach is generalized applicability of marker-metabolite associations in a wide range of breeding germplasm.

Therefore, a workflow was developed to assess SNP-feature associations within three population sets for each of the three metabolomics datasets (Figure 4). This resulted in a total of nine multivariate sets of mGWAS analyses performed using R package rrBLUP v4.6.1 (Endelman 2011) (Figure 4A). In order to model known pedigree and genetic relationships, R package AGHmatrix v1.0.2 (Amadeu et al. 2016) was used to generate a combined relationship matrix (H matrix). This H matrix was constructed with an additive relationship matrix calculated based on pedigree (A matrix) corrected by a relationship matrix determined by genetic marker information (G matrix) (code available in Appendix B.1.1). Data formatting in R also used package DescTools v0.99.35 (Signorell et mult. al. 2020). To correct for additional structure within the populations analyzed, elbow plots were constructed and examined to determine the appropriate number of principal components to include in the analysis (Figure 5). In order to generate these plots, missing marker information was imputed with R package impute v1.60.0 (Hastie et al. 2019) (code available in Appendix B.1.2).

Figure 4. Workflow for integration of one metabolomics dataset with a genomics dataset. This workflow was applied to each of the three metabolomics datasets. Three separate mGWAS analyses (A) with different subsets of individuals were conducted in order to detect real SNP-feature associations present across diverse germplasm and in segregating progeny. This was achieved by filtering results (B) of each for strong signal (C) and then comparing the results from the three populations in a Venn diagram (D). Overlapping features with significant SNP associations (E) were extracted for further analysis and identification. This resulted in a corresponding collection of overlapping significant features (F) for each of the three metabolomic datasets.

Figure 5. Elbow plots of the percent variation explained by each principal component (PC) in a principal component analysis (PCA) of SNP data used for mGWAS for (A) the diverse population, (B) the pedigree subset, and (C) the progeny subset. The red point indicates the last PC included in the mGWAS model (A: PC = 10, B: PC = 6, C: PC = 3). A graph of the first two PCs is paired with each population subset, where points are apple types (A: n = 124, B: n = 98, C: n = 75).

Averages of metabolomic features were calculated for any apple variety that had metabolomic data collected for multiple individuals, such as D01 and D01OH and the four standards. Here, although ‘GoldRush’ was collected from a fourth site, Purdue University, the values for GRPU were not included in the average to keep data treatment consistent with the other three standards, which were only averaged from the MAIA orchards.
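The averaging step described above can be sketched as follows. This is a minimal Python illustration rather than the R code used in the study. The function name, the sample-to-variety mapping, and the abundance values are hypothetical; only the exclusion of GRPU from the ‘GoldRush’ average follows the text.

```python
from collections import defaultdict

def average_by_variety(samples, sample_to_variety, excluded=()):
    """Average each feature across samples that map to the same variety.

    samples: dict of sample ID -> dict of feature name -> abundance.
    sample_to_variety: dict of sample ID -> variety name.
    excluded: sample IDs left out of every average.
    """
    grouped = defaultdict(list)
    for sample_id, features in samples.items():
        if sample_id in excluded:
            continue
        grouped[sample_to_variety[sample_id]].append(features)

    averages = {}
    for variety, replicates in grouped.items():
        feature_names = replicates[0].keys()
        averages[variety] = {
            name: sum(rep[name] for rep in replicates) / len(replicates)
            for name in feature_names
        }
    return averages

# Hypothetical abundances for one feature (values are invented).
samples = {
    "D01": {"feature_1": 10.0},
    "D01OH": {"feature_1": 14.0},
    "GR_site1": {"feature_1": 8.0},
    "GR_site2": {"feature_1": 12.0},
    "GRPU": {"feature_1": 100.0},
}
sample_to_variety = {
    "D01": "D01", "D01OH": "D01",
    "GR_site1": "GoldRush", "GR_site2": "GoldRush", "GRPU": "GoldRush",
}
averaged = average_by_variety(samples, sample_to_variety, excluded={"GRPU"})
```

Passing the excluded sample explicitly keeps the rule visible at the call site instead of hiding it in the data.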

Supercomputing was utilized to increase efficiency of the thousands of mGWAS analyses being conducted. Nine Unix batch scripts (Figure 6) were executed on one node across 16 processors within the Owens Supercomputer (Ohio Supercomputer Center 2016) at the Ohio Supercomputer Center (OSC) (Ohio Supercomputer Center 1987). In addition to PBS directives to the supercomputer, each script contained R code to execute the mGWAS model. Scripts expedited analysis by separately analyzing each chromosome (n = 17) across 16 cores in parallel (batch scripts and related code available in Appendix B.1.3-4).

From each of the nine mGWAS versions, results for separate chromosomes were collated to produce a single data frame of -log10(p) values for each pairwise SNP-feature association, resulting in nine unique SNP-by-feature data frames (Figure 4B) (command line code available in Appendix B.1.5). The nine data frames were transferred from the OSC to a local server via sftp for further processing and analysis (command line code available in Appendix B.1.5).

Figure 6. A Unix batch script was written and executed for the three population sets within each metabolomics dataset, resulting in a total of nine batch scripts.


2.3.4.2 Prioritizing significant SNP-feature associations

In R, SNP-feature associations were filtered for significance with -log10(p) thresholds (Figure 4C). For LC-MS (+) and (-) data sets, a threshold of ≥ 4 was used for the progeny results and ≥ 5 for the pedigree and diverse results. A threshold of ≥ 4 was used for each of the three populations of NMR analyses due to the much smaller volume of data compared with the LC-MS output. A -log10(p) value of 4 corresponds to p = .0001 and 5 to p = .00001. Less stringent thresholds could have been applied for significance filtering, but the large number of results would have made further analysis difficult. Additional study of SNP-feature associations with lesser significance could be conducted in the future and would likely yield interesting and applicable results. Values were not subjected to a multiple test correction, but significance thresholds displayed on all Manhattan plots represent p = .05 with a false discovery rate (FDR) correction for that feature. These plots were constructed using the rrBLUP v4.6.1 (Endelman 2011) package in R.
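The threshold filter can be sketched in Python as follows. This is not the R code used in the study. The function names and the -log10(p) values are hypothetical, and the nested-dictionary layout is an assumed stand-in for the SNP-by-feature data frames. Only the thresholds of 4 and 5 come from the text.

```python
import math

def neg_log10(p_value):
    """Convert a p-value to the -log10(p) scale used for filtering."""
    return -math.log10(p_value)

def significant_features(scores, threshold):
    """Return features with at least one SNP at or above the threshold.

    scores: dict of feature name -> dict of SNP name -> -log10(p).
    """
    return {
        feature
        for feature, snp_scores in scores.items()
        if any(score >= threshold for score in snp_scores.values())
    }

# Hypothetical -log10(p) values for three features and two SNPs.
scores = {
    "feature_a": {"snp_1": 5.3, "snp_2": 1.2},
    "feature_b": {"snp_1": 4.4, "snp_2": 0.8},
    "feature_c": {"snp_1": 2.1, "snp_2": 3.9},
}
progeny_hits = significant_features(scores, threshold=4)   # LC-MS progeny cutoff
pedigree_hits = significant_features(scores, threshold=5)  # LC-MS pedigree/diverse cutoff
```

In this toy input, feature_b passes the progeny cutoff but not the stricter pedigree and diverse cutoff. This is the pattern that produces the differing list sizes compared in the next step.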

Lists of features with significant associations with one or more SNPs from each mGWAS were compared via Venn diagrams of the three populations from the same metabolomics dataset (Figure 4D). Thus, three such Venn diagrams were constructed with R package eulerr v6.1.0 (Larsson 2020) (code available in Appendix B.1.7). A subset was then taken of each of the three result data frames to include only the features found to be significant across all three populations – the overlapping center of the Venn diagram (Figure 4E). This process resulted in three core datasets, one per type of metabolomics data, of -log10(p) values for features significantly associated with at least one SNP (Figure 4F).
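The overlap step amounts to a three-way set intersection followed by a subset of the result table. The Python sketch below illustrates this under assumptions. The feature names and values are hypothetical, and the study's own implementation used R with the eulerr package for the diagrams.

```python
def venn_center(diverse, pedigree, progeny):
    """Features significant in all three population analyses."""
    return set(diverse) & set(pedigree) & set(progeny)

def subset_to_core(scores, core_features):
    """Keep only the features (rows) of a result table found in the core set."""
    return {
        feature: snp_scores
        for feature, snp_scores in scores.items()
        if feature in core_features
    }

# Hypothetical lists of significant features from the three populations.
diverse_hits = {"f1", "f2", "f3"}
pedigree_hits = {"f2", "f3", "f4"}
progeny_hits = {"f2", "f3", "f5"}
core = venn_center(diverse_hits, pedigree_hits, progeny_hits)

# Hypothetical -log10(p) table for the diverse population.
diverse_scores = {
    "f1": {"snp_1": 5.1},
    "f2": {"snp_1": 6.0},
    "f3": {"snp_1": 5.4},
}
core_scores = subset_to_core(diverse_scores, core)
```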


Due to the high volume of significant SNP-feature associations identified from the mGWAS workflow, one feature was chosen to highlight the value of this process and continue in the pipeline to the pedigree-based analysis (PBA) in FlexQTL™. This feature had a strong putative identification based on an MS/MS database match, which was confirmed by comparison with an authentic standard. The metabolite was also of interest as a potentially health-beneficial compound.

2.3.4.3 Pedigree-based analysis (PBA) in FlexQTL™

Pedigree-based analyses (PBA) under a Bayesian framework were conducted using FlexQTL™ version 0.99130 for Linux (Bink 2002; Bink et al. 2002). Like the mGWAS approach, the PBA method incorporated pedigree relationships, genotype calls, and log2-transformed metabolite abundance data with missing data points imputed with half the lowest value. The analysis was performed using a model that estimated additive and dominance effects within the pedigree. The effect of environment was omitted from the analyses, as it was not practical to increase the model complexity.
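The data treatment described above can be sketched in Python as follows. Two assumptions are made that the text does not state: that "the lowest value" is the lowest observed abundance of the same feature, and that imputation precedes the log2 transformation. The function name and the values are hypothetical.

```python
import math

def impute_half_min_and_log2(values):
    """Replace missing abundances with half the lowest observed value, then log2.

    values: list of raw abundances for one feature, with None for missing.
    Assumes the minimum is taken within the feature and that imputation
    happens before the transformation.
    """
    observed = [v for v in values if v is not None]
    fill = min(observed) / 2
    return [math.log2(fill if v is None else v) for v in values]

raw = [8.0, None, 32.0, 4.0]  # hypothetical abundances with one missing value
transformed = impute_half_min_and_log2(raw)
```

Here the missing entry is filled with 2.0, half of the lowest observed value of 4.0, before the transformation.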

A minimum of 50,000 Markov chain Monte Carlo (MCMC) sweeps were run for each routine, thinning every 50 iterations with a thin screen of 1,000. The minimum effective sample size (ESS) to consider that convergence was achieved in the model was set to 101. If at 50,000 iterations, ESS < 101, the software continued iterations, up to 100,000, to ensure convergence and sound statistical foundation for findings. The Finite Polygenic Model (FPM) was also run to identify major (Mendelian) genes. The algorithms were applied to the data within the Owens Supercomputer (OSC 2016) at the OSC (OSC 1987). Three replications were conducted for each trait input to provide additional certainty in estimations and statistical robustness for later replication of the results by others.

For each compound of interest, all instances within a metabolomics data set were analyzed for linkage separately. For example, chlorogenic acid was present in LC-MS (+) and (-) mode data as well as NMR, so its abundance values for each pedigree sample from each of these data sets were utilized in the analysis. Bayes Factors of 2-5 were considered positive, 5-10 strong, and >10 decisive. Narrow sense heritability was calculated by dividing the weighted additive genetic variance by the phenotypic variance, taken as the sum of the weighted additive genetic variance and the sample residual variance.

2.4 Results and Discussion

2.4.1 Genomics

2.4.1.1 Pedigree Confirmed and Revised

Pedigree relationships were checked against expectation in FRANz 2.0, which identified certain individuals with disparate parentage. Twelve progenies were removed from the study due to <95% identity (D20, L01, L09, L11, L16, L19, L20, L24, L25, L28, L39, L40). Five progenies were discovered to have distinct parentage from the expected crosses but were kept in the analysis due to phenotypic interest (J15, J32, J39, J40, L06). Additionally, the traditional lineage of the advanced selection ‘Co-op 17’ was revised to include ‘Crandall’ as a progenitor. This change was confirmed by investigations into the lineage of ‘Co-op 17’ by colleagues at Wageningen University. Parentage for all individuals can be found in supplement S1. The FRANz parentage result for ‘Honeycrisp’ was given as ‘Northern Spy’ because it was the closest ancestor to ‘Honeycrisp’ with SNP data available for analysis at the time of this pedigree confirmation. This parentage was not used for subsequent analysis; instead, the ‘Honeycrisp’ parentage was set as ‘Keepsake,’ according to the findings of Howard et al. (2017).

Disparate parentage was not unexpected given how crosses are made in apple breeding. Breeders make crosses by opening flowers of the mother variety by hand before the petals open on their own and expose the flower to natural pollinators. They then use a paintbrush to apply pollen from the father variety to the stigmas of the opened flowers. The hope is that fertilization is successful and that all seeds produced in the fruit from that flower represent the desired cross. However, it is quite possible for hand pollination to be unsuccessful. In this case, natural pollinators that arrive afterward can bring pollen from any number of nearby varieties, which then takes the place of the expected pollen parent without the breeder’s knowledge. Overall, as there is always a potential for errors in pollination, pedigree confirmation is vital to developing a pedigree-related set of germplasm.

2.4.1.2 PCA of SNP data confirmed genetic variation of selected germplasm

Using PCA as a way to reduce the dimensionality of the SNP data allowed a broad visualization of the genetic variance present in the selected apple varieties (Figure 5). All three PCAs showed clusters of the distinct progeny sets, indicating a good selection of polymorphic SNPs that capture the differences between individuals. With clustering of the progenies in Figure 5.A and B, it was clear that the filtered genetic data was still able to differentiate the progenies from the pedigree-members and diverse selections when all selections were considered together.

The elbow plots used to determine the number of principal components to incorporate into the mGWAS model followed expected trends (Figure 5). When considering the progeny (Figure 5.C), the fact that three PCs modeled the majority of the variance reflected the expected number because three distinct sets of progenies were considered in the analysis. Similarly, with increasing variation between the pedigree-related individuals and even more so with diverse selections, it followed that higher numbers of PCs, six and 10 respectively, were needed to explain the variation in the SNP data (Figure 5.B and A). Likewise, the amount of variation accounted for by PC 1 was highest in the progeny (23.00%) compared to the pedigree (18.71%) and diverse populations (14.03%). Here, increasing diversity in the population resulted in an expected decrease in the ability of a single PC to model variation.

2.4.2 Metabolomics

Data processing of untargeted metabolomic analysis of 199 apple extracts resulted in 4,872 molecular features for LC-MS (+), 4,703 for LC-MS (-), and 756 bins for NMR. These numbers represent the final list of features used in subsequent analysis of metabolomics data as well as integration with genomics data. It is typical for positive mode LC-MS data to yield more features than negative mode because most compounds ionize more readily in positive mode. It was also expected that many more compounds would be detected in LC-MS approaches than NMR given the higher inherent sensitivity of MS-based methods.

2.4.2.1 High-throughput untargeted metabolomic analysis and processing of 232 extracts via LC-MS produced high-quality data

Both positive and negative mode experiments were conducted via LC-MS, with each analysis occurring in one continuous batch. Despite long experiment periods and potential of retention time shifting common to LC-MS analysis, regularly injected pooled quality control samples (QCs) indicated stable data quality due to tight clustering of all 33 QCs when each ionization mode was examined separately via principal components analysis (PCA) (Figure 7). The same strategy of repeated QC analysis was not employed for NMR data collection due to the innate stability of NMR analysis across time.

2.4.2.2 PCA showed distinct metabolome profiles across apple selections in all metabolomics data sets

In order to utilize metabolite data in mGWAS analyses, the datasets needed to capture variability in metabolite profiles between apple varieties in this study. PCA was conducted for broad assessment of metabolite variability across all apples analyzed (Figure 8). The spread of points between the generalized classifications of wild, pedigree-connected, and other diverse selections confirmed the hypothesis that the apples selected for this study have metabolic variety.

In each PCA of Figure 8, PC1 explained the difference between wild and cultivated apple selections, as a majority of the diverse selections are commercial or heritage varieties. The pedigree-classified individual clustering with the wild accessions in each PCA was M. floribunda, a wild species that was incorporated into breeding germplasm to confer scab resistance. This indicated that much of the M. floribunda metabolome has been lost with breeding of commercial varieties over the generations. The overlap of wild and pedigree selections in the plots was plausible due to the presence of M. floribunda in the ancestry of many varieties of the pedigree-related individuals. Overlap between wild and diverse selections was also expected, as some of the diverse individuals were first generation crosses between commercial and wild species.

Figure 7. Principal components analysis (PCA) scores plots of positive (A) and negative (B) ionization mode data collected via LC-MS. Tight clustering of pooled QCs (n = 33) indicated stability of data quality within the two experiments. Each point represents one sample (n = 226), and points are color-coded to represent general classes within the selected apple varieties. Missing values were imputed, data was log2-transformed then scaled and centered to perform PCA.

Furthermore, there was clear variability within each generalized class. It might have been expected that variation between wild accessions would dwarf variation in commercial apples, but this was not observed. We hypothesized that dispersion of points in the wild selections would be larger due to their greater diversity in size, flesh color, peel color, taste, and other characteristics. Although commercial apples were segregated from the wild samples, the spread of the points on both PC1 and PC2 in each PCA indicated sizable variety in the apple metabolomes within commercial apple germplasm. This was observed in the LC-MS (+), (-), and NMR data sets.

These conclusions were strengthened by parallel evidence across all three metabolomics data sets. One distinction between these plots was seen in the amount of variation explained by PC1. In the PCA of the NMR metabolomic analysis (Figure 8.C), PC1 explained much more variation (38.00%) than in the two LC-MS analyses ((+): 19.99% and (-): 20.64%). The LC-MS PCAs represent many more metabolomic features (~5,000 each) compared to the ~750 bins inputted for the NMR PCA. Thus, the LC-MS PCAs had many more variables differentiating the samples, making it difficult to capture a large amount of variation in one component. This difference could also plausibly be explained by the nature of NMR analysis in contrast to LC-MS. NMR captures high abundance metabolites such as sugars, amino acids and phenolic acids, whereas reverse phase LC-MS experiments using C18 columns do not reliably capture sugars and other extremely polar metabolites. LC-MS experiments do, however, capture low abundance metabolites that are undetectable via NMR analysis. In this case, the difference in variation explained by PC1 for the two approaches may indicate that the compounds detected in NMR analyses were those that differed more greatly between the wild and commercial fruits.

Figure 8. Principal components analysis (PCA) scores plots of positive (A) and negative (B) ionization mode data collected via LC-MS and (C) NMR data. Each point represents one apple sample (n = 193), and points are color-coded to represent general classes within the selected apple varieties. Each PCA indicates metabolomic variation in germplasm chosen for analysis in this study. Missing values were imputed, data was log2-transformed then scaled and centered to perform PCA.

In most untargeted metabolomic analyses, PCA serves as a jumping-off point to determine differences between groups and examine loadings plots to understand which metabolites are driving the separation between groups. However, in the design of this project, discrete groups of apples were not chosen, such as disease resistant and susceptible. Instead, the samples for this study were chosen for breeding interest and to represent a specific pedigree. In these PCAs general classifications of “diverse,” “pedigree,” and “wild” simply serve as a visual gauge for metabolome variability in the selected apples. Many additional variables, such as species, skin color, flesh color, wild vs. cultivated, fruit size, or disease resistance could be used to understand the separation between groups, though this was not the goal here.

2.4.3 Omics Integration

This study represents a proof-of-concept report on the feasibility and benefits of multi-omic data set integration in apple. Genomic and metabolomic data sets were leveraged simultaneously to gain insight into genetic control of metabolite production in fruits. By using three metabolomic data sets in tandem, the approach proved to be even more powerful. Parallel results across genomic-metabolomic integrations increased confidence in accurate identification of mQTLs as well as metabolite identity.

2.4.3.1 Prioritization scheme enabled high-throughput omics integration

Thousands of mGWAS analyses resulted in tens of millions of outputs for SNP-feature combinations. Due to this huge amount of data, strategies were developed to sift the results to prioritize SNP-feature associations for further analysis. Steps needed to be taken to reduce the scope of the data from the original 4,872 LC-MS (+), 4,703 LC-MS (-), and 756 NMR molecular features.

Using strict -log10(p) value cutoffs of 4 (p ≤ .0001) for the progeny analyses and 5 (p ≤ .00001) for the pedigree and diverse analyses of LC-MS data sets, and 4 for all NMR population analyses, was the first step in shrinking the focus to the most significant findings, resulting in the totals found in Figure 9A. For each metabolomic approach, these lists were then compared via Venn diagrams to determine which metabolomic features were significantly associated with SNPs across all three populations. The Venn diagrams in Figure 9B-D illustrate the overlap in significant features between different populations. The center category for each diagram indicates the number of features that were passed to subsequent stages of analysis: LC-MS (+) 519, LC-MS (-) 726, and NMR 177.

The resulting counts were interesting in that there were more significant features from the LC-MS (-) prioritization than from LC-MS (+) despite the positive mode data set having more features. This difference might be explained by the fact that flavonoids, a diverse class of compounds found in apple, ionize better in negative mode than positive (López-Fernández et al. 2020). If the chosen apple varieties represent a high diversity of flavonoids, this could account for more significant features passing the thresholds for LC-MS (-). This observation was strengthened by subsequent investigations into the significant features and will be discussed below.

(A)
             Diverse   Pedigree   Progeny
LC-MS (+)      1,186        953     1,787
LC-MS (-)      1,370      1,187     1,962
NMR              374        385       281

Figure 9. (A) A table of the total counts of metabolomic features that remained after -log(p) value filters for each population. Those features were then compared via Venn diagrams for each metabolomics data set to capture a list of those that were significant in all three populations. Corresponding Venn diagrams for (B) LC-MS (+), (C) LC-MS (-), and (D) NMR metabolomic features are shown with the center category containing the number of features that passed on for further analysis.

2.4.3.2 Parallel results across complementary metabolomic datasets increased confidence in compound ID and detection of significant loci: mQTL counts per chromosome

Although the stringent filtering reduced the number of features of interest from thousands to hundreds, the size of the data remained a bottleneck for gleaning conclusions from the study. Manual inspection of results would be too time-consuming for this platform to be widely used for exploratory purposes, though this approach could be used if specific QTL or features were selected for investigation in an a priori manner.

To advance with the 519 (+), 726 (-), and 177 (NMR) important features, additional approaches for broad data visualization were adopted. These higher-level depictions were expected to reveal trends in the significant data that would then direct further analyses.

First, bar plots were constructed to get a basic understanding of where putative metabolite-QTLs (mQTLs) were detected (Figure 10). Potential mQTLs were discovered on each of the 17 chromosomes. A large number of metabolomic features from each of the three metabolomic approaches were associated with SNPs located on chromosome 16: LC-MS (+) 284, LC-MS (-) 443, and NMR 177. This was interesting because of the polyphenol hotspot previously identified on chromosome 16 by Khan, Chibon, et al. (2012), Chagné, Krieger, et al. (2012), and McClure et al. (2019). Additionally, the difference of ~150 mQTL on chromosome 16 between LC-MS (+) and (-) represents most of the total difference of ~200 significant SNP-feature associations detected in the two approaches. These differences on chromosome 16 support our hypothesis that the LC-MS (-) data captured more phenolic features.

Figure 10. Bar plots showing the number of putative mQTL detected per chromosome via mGWAS (y-axis: number of mQTL detected per chromosome). Counts are determined by a SNP-feature -log(p) value minimum of 5 for LC-MS and 4 for NMR.
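A per-chromosome tally of this kind can be sketched in Python as follows. The sketch assumes that each count is the number of distinct features with at least one significant SNP on that chromosome, which is one reading of the text. The function name, SNP names, chromosome assignments, and scores are hypothetical.

```python
from collections import defaultdict

def features_per_chromosome(scores, snp_chromosome, threshold):
    """Count features with at least one significant SNP on each chromosome.

    scores: dict of feature name -> dict of SNP name -> -log10(p).
    snp_chromosome: dict of SNP name -> chromosome number.
    """
    features_by_chrom = defaultdict(set)
    for feature, snp_scores in scores.items():
        for snp, score in snp_scores.items():
            if score >= threshold:
                features_by_chrom[snp_chromosome[snp]].add(feature)
    return {chrom: len(feats) for chrom, feats in features_by_chrom.items()}

# Hypothetical inputs: three features, three SNPs on two chromosomes.
scores = {
    "feature_a": {"snp_1": 6.0, "snp_2": 1.0},
    "feature_b": {"snp_1": 5.5, "snp_3": 5.1},
    "feature_c": {"snp_2": 5.2, "snp_3": 0.4},
}
snp_chromosome = {"snp_1": 16, "snp_2": 17, "snp_3": 16}
counts = features_per_chromosome(scores, snp_chromosome, threshold=5)
```

Using the counts reported in this section, the chromosome 16 gap between ionization modes is 443 - 284 = 159 features, against an overall gap of 726 - 519 = 207.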

Chromosome 17 had the second most mQTL counts for the LC-MS data sets, although no remarkable mQTL hotspot on chromosome 17 has yet been reported. A similar untargeted metabolomics experiment by Khan, Chibon, et al. (2012) reported hotspots on chromosomes 1, 8, 13, and 16 when using a mapping population from ‘Prima’ × ‘Fiesta’. In a targeted study of 17 polyphenolic compounds, Chagné, Krieger, et al. (2012) found one mQTL for chlorogenic acid at the bottom of chromosome 17. Another targeted study of apple phenolics conducted by McClure et al. (2019) located a hotspot on chromosome 16 but only a suggestive mQTL for chlorogenic acid on 17.

2.4.3.3 Parallel results across complementary metabolomic datasets increase confidence in compound ID and detection of significant loci: Composite mQTL chromosome map

In order to investigate the distribution of these significant SNPs within each chromosome, a depiction of significant SNP-feature associations across all three metabolomic platforms was devised to create a composite map of the genome (Figure 11). This diagram allowed us to assess simultaneously the areas of the genome that housed significantly associated SNPs using the three metabolomics data sets. mQTL associated with features from the LC-MS (+) are blue, LC-MS (-) are coral, and NMR are yellow. Here, hotspots on chromosome 16 and 17 are easily visible.

An important facet of this visualization technique was that it not only displayed genomic areas of interest, but it also showed SNPs associated across the metabolomic data sets, demonstrating the advantage of applying high-resolution MS and NMR together. Loci with signal across all three metabolomic approaches indicate association with compounds that can be analyzed via both ionization modes in LC-MS as well as NMR. It is likely that metabolites significant across all three data sets exist in sufficient concentration (>1 μmol/L) to be observed via NMR, which is the least sensitive approach. For example, at the bottom of chromosome 17, one can see mQTL that are associated with features across the three data sets.

Conversely, at the tops of chromosomes 2 and 3, there are many significant mQTL found only using the MS data sets. It is likely that these features are present in concentrations too low (<1 μmol/L) to be well-characterized by NMR. Even areas in which a single data set was eliciting signal possibly indicate that the compounds represented there are best captured by one metabolomic approach over the others. SNP associations with compounds that do not easily lose a proton are more likely to be coming from the LC-MS (+) data, and vice versa. Similarly, areas displaying significance for NMR data only could indicate an area controlling abundant compounds that do not ionize well using MS and are therefore preferentially analyzed by NMR, such as sugars.

These observations were important clues to strengthen confidence in understanding the classes of compounds being controlled by certain areas of the apple genome. A great advantage of this approach was also to see that parallel results in mQTL detection were evident across mGWAS analyses. This increased our confidence in detection of true SNP-feature associations as opposed to signal resulting from artifacts due to the volume of analyses being conducted. Overall, these conclusions further strengthened the belief that multi-omic integration enables additional leverage for compound identification as well as confidence in results.

Figure 11. Composite chromosome map of the 17 apple chromosomes, titled “SNPs Significantly Associated with at least one Metabolomic Feature” (x-axis: chromosome, 1 to 17; y-axis: genetic distance in cM). Horizontal lines indicate the location of a SNP found to have a significant association with at least one metabolomic feature. Lines are colored based on the origin of the metabolomic feature: LC-MS (+), LC-MS (−), or NMR.

2.4.3.4 Parallel results across complementary metabolomic datasets increase confidence in compound ID and detection of significant loci: Binary SNP-feature association heat map and hierarchical clustering

To further conceptualize how metabolites were associated with particular SNPs, binary heat maps were developed that also integrated hierarchical clustering of metabolomic features (Figure 12). The binary element of the plots indicates a ‘true’ or ‘false’ to the statement, ‘there is a significant relationship between this SNP and this feature.’ The minimum threshold for significance remained at -log(p) value of 5 for LC-MS data sets and 4 for NMR. R packages used to construct the plot include magicfor v0.1.0 (Makiyama 2016), reshape2 v1.4.4 (Wickham 2007), cluster v2.1.0 (Maechler et al. 2019), ggdendro v0.1-20 (de Vries and Ripley 2016), and grid from base R (R Core Team 2019) (code available in Appendix B.1.8.4). The heat maps use color to represent a significant (colored) or non-significant (black) relationship. Each plot is organized vertically by binary-based hierarchical clustering of metabolomic features. Applying hierarchical clustering based on the same presence-absence matrix organized the features in a way that grouped metabolites significantly associated with the same SNPs. Sorting the SNPs by chromosome and then genetic distance gave order to evaluate clusters of significant SNP-feature associations with common genetic locations and metabolite associations (code and high-resolution figures available in Appendix B.1.8.4). With this intentional organization, the plots revealed blocks of SNPs from clusters of loci that were all associated with the same metabolites.
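The presence-absence matrix behind such a heat map can be sketched in Python as follows. This is not the R code used in the study. The function names, SNP positions, and scores are hypothetical, and the Jaccard distance is an assumed choice of binary dissimilarity that the text does not specify. The pairwise distances could then be passed to any hierarchical clustering routine.

```python
def binary_matrix(scores, snp_order, threshold):
    """Rows are features, columns are SNPs in snp_order; 1 marks significance."""
    return {
        feature: [
            1 if snp_scores.get(snp, 0.0) >= threshold else 0
            for snp in snp_order
        ]
        for feature, snp_scores in scores.items()
    }

def jaccard_distance(row_a, row_b):
    """1 - (SNPs significant for both features / SNPs significant for either)."""
    both = sum(1 for a, b in zip(row_a, row_b) if a and b)
    either = sum(1 for a, b in zip(row_a, row_b) if a or b)
    return 0.0 if either == 0 else 1.0 - both / either

# Hypothetical SNP positions as (chromosome, genetic distance in cM).
snp_positions = {"snp_1": (16, 2.5), "snp_2": (2, 10.0), "snp_3": (16, 0.7)}
# Sort SNPs by chromosome and then by genetic distance, as in the text.
snp_order = sorted(snp_positions, key=snp_positions.get)

# Hypothetical -log10(p) values.
scores = {
    "feature_a": {"snp_1": 6.0, "snp_2": 1.0, "snp_3": 5.5},
    "feature_b": {"snp_1": 5.2, "snp_2": 0.5, "snp_3": 2.0},
    "feature_c": {"snp_1": 0.3, "snp_2": 5.8, "snp_3": 0.1},
}
matrix = binary_matrix(scores, snp_order, threshold=5)
dist_ab = jaccard_distance(matrix["feature_a"], matrix["feature_b"])
dist_ac = jaccard_distance(matrix["feature_a"], matrix["feature_c"])
```

In this toy input, feature_a and feature_b share a significant SNP on chromosome 16 and so sit closer together than feature_a and feature_c, which share none.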

The most striking observation from inspection of these three heat maps was the channel of continuous color for a group of SNPs on the right half of the diagram. Additional investigation into these SNPs found them all to be located at the top of chromosome 16. This aligned with the expected counts per chromosome (Figure 10) and mirrored the hotspot on chromosome 16 from the composite mQTL map (Figure 11). The same observations apply to the smaller but clear cluster of colored squares against the right edge of the diagram, which corresponds to chromosome 17. When considering the plot as a whole, it is important to remember that each colored square indicates a significant SNP-feature relationship. Thus, portions of the diagram with smaller clusters or even a single colored square are ripe for investigation. Thin, vertical lines of color indicate a group of metabolites that are all significantly associated with a single SNP or just a few SNPs, examples of which can be seen on the top edge of the LC-MS (+) heat map. Alternatively, thin, horizontal lines of color denote a cluster of SNPs with which one or a few metabolites are significantly associated. An example can be seen at the bottom-middle of the NMR heat map.
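The visual patterns described here correspond to simple row and column sums of the binary matrix. The following Python sketch (standard library only) shows one way such summaries could be computed. The SNP names, chromosome assignments, and matrix values are hypothetical toy data, not results from this study.

```python
from collections import Counter

# Hypothetical binary matrix: rows are features, columns are SNPs
# (1 = significant SNP-feature association, 0 = not significant).
snps = ["snp_a", "snp_b", "snp_c", "snp_d"]
chromosome = {"snp_a": 2, "snp_b": 16, "snp_c": 16, "snp_d": 17}
matrix = {
    "feat_1": [0, 1, 1, 0],
    "feat_2": [0, 1, 1, 1],
    "feat_3": [0, 1, 0, 0],
}


def features_per_snp(matrix, snps):
    """Column sums: a high count corresponds to a vertical line of color."""
    return {s: sum(row[i] for row in matrix.values())
            for i, s in enumerate(snps)}


def snps_per_feature(matrix):
    """Row sums: a high count corresponds to a horizontal line of color."""
    return {f: sum(row) for f, row in matrix.items()}


def associations_per_chromosome(matrix, snps, chromosome):
    """Total significant SNP-feature pairs on each chromosome."""
    totals = Counter()
    for s, n in features_per_snp(matrix, snps).items():
        totals[chromosome[s]] += n
    return dict(totals)


col = features_per_snp(matrix, snps)
row = snps_per_feature(matrix)
chrom = associations_per_chromosome(matrix, snps, chromosome)
```

Ranking the per-chromosome totals gives a numeric counterpart to the hotspots seen by eye, and SNPs or features with a count of one flag the isolated colored squares.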

It is helpful to examine these plots both individually and in tandem. Recognizing similar horizontal patterns of SNPs that are significantly related to multiple metabolites can help to connect compound classes and identities across metabolomic approaches. For example, we can recognize that the long vertical string of color in the NMR heat map is


[Figure 12, panel (A) LC-MS (+): binary heat map. y-axis: Pos Metabolites (feature labels such as 975.1705_2.11); x-axis: SNPs (numeric SNP labels). Axis tick labels omitted.]

Figure 12. Binary heat maps (figure continued across pages) with SNPs organized on the x-axis by chromosome and then genetic distance, and metabolomic features organized on the y-axis by hierarchical clustering. The intersection of each SNP-feature combination is either colored (significant) or black (non-significant). Thresholds for significance were -log(p) of 5 for LC-MS data sets and 4 for NMR.

Figure 12 continued

[Binary heat map panel. y-axis: Neg Metabolites (feature labels such as 995.2464_2.23). Axis tick labels omitted.]
535.1252_3.428 742.184_2.565 742.184_2.565 741.1793_2.623 741.1793_2.623 879.2035_2.75 879.2035_2.75 739.2031_3.198 739.2031_3.198 611.0958_1.748 611.0958_1.748 605.1283_2.363 605.1283_2.363 883.1755_2.609 883.1755_2.609 1008.7196_2.915 1008.7196_2.915 592.1211_2.201 592.1211_2.201 882.1856_2.553 882.1856_2.553 882.1856_2.469 882.1856_2.469 641.1463_2.317 641.1463_2.317 1017.2276_2.325 1017.2276_2.325 633.1954_3.315 633.1954_3.315 617.1375_2.739 617.1375_2.739 879.1727_2.47 879.1727_2.47 863.6845_2.001 863.6845_2.001 1283.2943_2.477 1283.2943_2.477 868.2171_2.598 868.2171_2.598 587.117_2.892 587.117_2.892 451.1194_2.077 451.1194_2.077 1027.2551_2.067 1027.2551_2.067 856.1907_2.688 856.1907_2.688 1000.2216_2.713 1000.2216_2.713 1444.3334_2.572 1444.3334_2.572 1441.3237_2.572 1441.3237_2.572 578.1346_2.787 578.1346_2.787 1251.2459_2.5 1251.2459_2.5 915.2579_2.136 915.2579_2.136 1241.2738_2.243 1241.2738_2.243 317.0688_3.155 317.0688_3.155 1220.2928_2.235 1220.2928_2.235 677.2232_3.596 677.2232_3.596 751.1528_1.979 751.1528_1.979 711.1531_2.099 711.1531_2.099 613.1172_2.787 613.1172_2.787 740.1962_1.812 740.1962_1.812 1028.247_2.391 1028.247_2.391 591.1486_3.245 591.1486_3.245 594.111_2.38 594.111_2.38 661.2309_4.181 661.2309_4.181 245.0811_2.446 245.0811_2.446 949.2921_3.579 949.2921_3.579 1001.2339_2.705 1001.2339_2.705 897.7048_2.281 897.7048_2.281 659.2138_3.622 659.2138_3.622 930.2239_2.225 930.2239_2.225 591.1148_2.198 591.1148_2.198 879.1795_2.505 879.1795_2.505 661.2309_3.717 661.2309_3.717 641.2059_2.484 641.2059_2.484 647.2107_3.439 647.2107_3.439 864.1853_2.337 864.1853_2.337 1000.7257_2.72 1000.7257_2.72 720.1569_2.573 720.1569_2.573 587.6202_2.359 587.6202_2.359 865.6986_2.611 865.6986_2.611 568.1294_2.651 568.1294_2.651 1452.8159_2.578 1452.8159_2.578 1165.2513_2.539 1165.2513_2.539 1453.3152_2.575 1453.3152_2.575 875.6812_2.612 875.6812_2.612 876.6898_2.607 876.6898_2.607 979.7751_2.451 979.7751_2.451 963.2697_3.165 963.2697_3.165 875.1874_2.612 
875.1874_2.612 1309.7888_2.549 1309.7888_2.549 864.6964_2.612 864.6964_2.612 864.1955_2.612 864.1955_2.612 917.0996_2.463 917.0996_2.463 731.1522_2.56 731.1522_2.56 1175.2423_2.509 1175.2423_2.509 918.1107_2.489 918.1107_2.489 357.0622_2.444 357.0622_2.444 1320.2801_2.544 1320.2801_2.544 635.0924_2.099 635.0924_2.099 600.1179_2.1 600.1179_2.1 579.1384_2.101 579.1384_2.101 637.0918_2.101 637.0918_2.101 575.621_1.827 575.621_1.827 667.0283_2.101 667.0283_2.101 615.1076_2.1 615.1076_2.1 677.0603_2.1 677.0603_2.1 661.0879_2.1 661.0879_2.1 721.6619_2.573 721.6619_2.573 675.1123_2.097 675.1123_2.097 594.6144_2.488 594.6144_2.488 1529.336_2.259 1529.336_2.259 901.2377_2.051 901.2377_2.051 431.0612_3.155 431.0612_3.155 755.1986_3.08 755.1986_3.08 1151.2412_1.828 1151.2412_1.828 857.1979_2.691 857.1979_2.691 856.6991_2.693 856.6991_2.693 667.1015_2.1 667.1015_2.1 1199.2355_2.102 1199.2355_2.102 691.1286_2.099 691.1286_2.099 919.2699_2.546 919.2699_2.546 929.2698_2.325 929.2698_2.325 640.1319_2.095 640.1319_2.095 615.1076_2.102 615.1076_2.102 963.1683_2.488 963.1683_2.488 637.0917_2.787 637.0917_2.787 850.2085_2.639 850.2085_2.639 1138.266_2.639 1138.266_2.639 632.0642_2.448 632.0642_2.448 352.0682_2.449 352.0682_2.449 452.1247_2.061 452.1247_2.061 451.1269_2.061 451.1269_2.061 565.1168_2.02 565.1168_2.02 593.1284_2.535 593.1284_2.535 849.2051_2.638 849.2051_2.638 1137.2632_2.643 1137.2632_2.643 1157.2771_2.333 1157.2771_2.333 1155.2767_2.333 1155.2767_2.333 561.1398_2.534 561.1398_2.534 675.1075_2.788 675.1075_2.788 616.1182_1.974 616.1182_1.974 721.1644_2.573 721.1644_2.573 1464.3066_2.554 1464.3066_2.554 577.6331_2.43 577.6331_2.43 1463.3059_2.549 1463.3059_2.549 579.1383_2.332 579.1383_2.332 720.6603_2.573 720.6603_2.573 1165.7561_2.54 1165.7561_2.54 594.1195_2.538 594.1195_2.538 1321.2898_2.491 1321.2898_2.491 903.1697_2.565 903.1697_2.565 599.1219_2.104 599.1219_2.104 576.1299_2.53 576.1299_2.53 425.0892_2.333 425.0892_2.333 576.6308_2.534 576.6308_2.534 691.1286_2.785 
691.1286_2.785 635.0923_2.787 635.0923_2.787 600.1179_2.787 600.1179_2.787 640.1319_2.337 640.1319_2.337 757.2079_2.72 757.2079_2.72 727.6603_2.667 727.6603_2.667 327.046_2.445 327.046_2.445 1158.2826_2.331 1158.2826_2.331 577.1318_2.789 577.1318_2.789 577.1361_2.789 577.1361_2.789 613.1087_2.785 613.1087_2.785 640.1318_2.786 640.1318_2.786 667.0282_2.786 667.0282_2.786 599.1132_2.102 599.1132_2.102 137.0248_2.448 137.0248_2.448 1154.263_2.532 1154.263_2.532 1153.2591_2.532 (B) LC-MS (-) 1153.2591_2.532 1155.2673_2.533 1155.2673_2.533 289.075_2.446 289.075_2.446 291.0768_2.447 291.0768_2.447 580.1428_2.332 580.1428_2.332 614.1163_2.786 614.1163_2.786 615.1075_2.337 615.1075_2.337 615.1076_2.786 615.1076_2.786 901.1646_2.573 901.1646_2.573 901.175_2.572 901.175_2.572 591.1485_2.863 591.1485_2.863 613.109_2.101 613.109_2.101 1157.2704_2.537 1157.2704_2.537 979.1983_2.571 979.1983_2.571 616.119_2.784 616.119_2.784 722.1638_2.569 722.1638_2.569 615.1077_2.786 615.1077_2.786 712.6647_2.678 712.6647_2.678 1028.2581_2.082 1028.2581_2.082 5 101 126 170 216 226 245 294 347 368 488 517 537 541 547 550 553 716 764 819 847 865 873 874 891 893 903 909 985 993 996 1188 1372 1379 1382 1403 1433 1441 1449 1458 1527 1536 1676 1757 1782 1795 1818 1823 1824 1825 1826 1829 1830 1852 1906 1922 1925 1929 1935 1936 1999 2010 2276 2388 2396 2398 2413 2418 2445 2741 2979 3144 3177 3647 3648 3663 3710 3715 3721 3723 3725 3736 3749 3752 3753 3754 3826 3892 3936 3944 3948 3951 3953 4016 4024 4068 4125 4145 4146 4165 4187 4196 4208 4212 4253 4258 4259 4260 4263 4279 4287 4288 4289 4290 4295 4298 4299 4301 4308 4309 4321 4350 4375 4410 4447 4764 4777 4778 4861 4870 4911 5057 5074 5088 5114 5123 5125 5127 5139 5176 5191 5203 5299 5326 5431 5478 5496 5520 5559 5575 5583 5596 5600 5604 5605 5606 5607 5608 5609 5611 5612 5616 5622 5624 5625 5627 5635 5637 5649 5673 5814 5817 5849 5936 5945 5952 5953 5967 5969 5985 6011 6023 6038 6195 6298 6311 6652 6735 6741 6751 7105 7143 7150 7154 7287 7290 7504 
7540 7546 7547 7548 7552 7553 7554 7555 7557 7571 7578 7586 7597 7609 7629 7662 7818 7918 8029 8123 8125 8127 8128 8147 8161 8169 8208 8209 8211 8227 8231 8422 8428 8455 8458 8470 8480 8482 8493 8494 8509 8511 8514 8517 8529 8531 8535 8541 8549 8626 8633 8636 8637 8640 8687 8721 8730 8845 8888 8889 8914 8940 9081 9138 9143 9168 9184 9228 9230 9231 9234 9250 9252 9253 9276 9358 9359 9361 9494 9669 9719 9868

79 10003 10018 10203 10213 10215 10227 10240 10245 10260 10266 10275 10280 10282 10294 10296 10299 10302 10303 10304 10305 10309 10318 10322 10328 10330 10332 10333 10336 10339 10340 10343 10344 10368 10376 10493 10498 10500 10506 10512 10513 10515 10516 10584 10593 10609 10621 10644 10660 10714 11159 11386 11496 11505 11763 11779 11790 11791 11793 11838 11872 11892 11907 11932 11940 11951 11967 11980 12021 12040 12044 12051 12057 12073 12096 12101 12111 12138 12145 12172 12241 12442 12446 12451 12615 12620 12625 12628 12629 12636 12640 12705 12774 12868 13075 13352 13617 13623 13625 13627 13628 13629 13630 13631 13635 13636 13648 13650 13655 13657 13660 13663 13666 13672 13674 13675 13677 13678 13681 13684 13685 13689 13690 13694 13696 13708 13709 13710 13715 13721 13723 13725 13727 13730 13733 13734 13750 13755 13761 13774 13778 13992 14004 14017 14034 14046 14074 14090 14095 14487 14549 14613 14623 14629 14679 14740 14756 14765 14784 14791 14844 14854 14855 14873 14874 14890 14897 14935 14939 14957 14958 14961 14975 14988 14990 15010 15033 15046 15086 15090 15095 15096 15098 15099 15100 15102 15103 15109 15110 15111 15113 15114 15115 15118 15119 15122 15123 15124 15125 15126 15133 15134 15138 15147 15182 15186 15195 15200 15203 SNPs

Figure 12 continued 6.92.6.91 6.92.6.91 6.87.6.86 6.87.6.86 7.14.7.13 7.14.7.13 6.91.6.9 6.91.6.9 6.9.6.89 6.9.6.89 6.74.6.73 6.74.6.73 6.68.6.67 6.68.6.67 6.66.6.65 6.66.6.65 6.13.6.12 6.13.6.12 6.67.6.66 (C) NMR 6.67.6.66 6.12.6.11 6.12.6.11 6.15.6.14 6.15.6.14 6.55.6.54 6.55.6.54 6.79.6.78 6.79.6.78 6.75.6.74 6.75.6.74 6.8.6.79 6.8.6.79 6.86.6.85 6.86.6.85 6.89.6.88 6.89.6.88 6.88.6.87 6.88.6.87 6.81.6.8 6.81.6.8 6.7.6.69 6.7.6.69 6.76.6.75 6.76.6.75 6.69.6.68 6.69.6.68 6.77.6.76 6.77.6.76 6.78.6.77 6.78.6.77 6.16.6.15 6.16.6.15 6.64.6.63 6.64.6.63 6.71.6.7 6.71.6.7 6.04.6.03 6.04.6.03 6.83.6.82 6.83.6.82 6.85.6.84 6.85.6.84 5.98.5.97 5.98.5.97 6.01.6 6.01.6 5.97.5.96 5.97.5.96 6.82.6.81 6.82.6.81 6.5.99 6.5.99 6.65.6.64 6.65.6.64 6.54.6.53 6.54.6.53 5.99.5.98 5.99.5.98 6.1.6.09 6.1.6.09 6.09.6.08 6.09.6.08 6.62.6.61 6.62.6.61 5.95.5.94 5.95.5.94 6.63.6.62 6.63.6.62 6.07.6.06 6.07.6.06 6.06.6.05 6.06.6.05 6.08.6.07 6.08.6.07 6.05.6.04 6.05.6.04 5.96.5.95 5.96.5.95 5.9.5.89 5.9.5.89 7.06.7.05 7.06.7.05 6.57.6.56 6.57.6.56 6.73.6.72 6.73.6.72 7.13.7.12 7.13.7.12 6.52.6.51 6.52.6.51 6.03.6.02 6.03.6.02 7.16.7.15 7.16.7.15 6.53.6.52 6.53.6.52 7.15.7.14 7.15.7.14 7.12.7.11 7.12.7.11 6.72.6.71 6.72.6.71 6.5.6.49 6.5.6.49 5.89.5.88 5.89.5.88 6.36.6.35 6.36.6.35 6.99.6.98 6.99.6.98 6.45.6.44 6.45.6.44 6.44.6.43 6.44.6.43 6.6.6.59 6.6.6.59 5.93.5.92 5.93.5.92 6.51.6.5 6.51.6.5 7.05.7.04 7.05.7.04 6.98.6.97 6.98.6.97 5.94.5.93 5.94.5.93 6.11.6.1 6.11.6.1 6.61.6.6 6.61.6.6 6.56.6.55 6.56.6.55 5.92.5.91 5.92.5.91 5.91.5.9 5.91.5.9 6.02.6.01 6.02.6.01 6.58.6.57 6.58.6.57 80 6.14.6.13 6.14.6.13 6.59.6.58 6.59.6.58 6.18.6.17 6.18.6.17 6.17.6.16 6.17.6.16 0.66.0.65 0.66.0.65

0.56.0.55 0.56.0.55 1.03.1.02 1.03.1.02 8.91.8.9 8.91.8.9 1.28.1.27 1.28.1.27 NMR Bins 1.27.1.26 1.27.1.26 1.45.1.44 1.45.1.44 2.91.2.9 2.91.2.9 2.12.2.11 2.12.2.11 3.24.3.23 3.24.3.23 5.32.5.31 5.32.5.31 4.07.4.06 4.07.4.06 5.86.5.85 5.86.5.85 6.95.6.94 6.95.6.94 5.87.5.86 5.87.5.86 7.81.7.8 7.81.7.8 1.17.1.16 1.17.1.16 2.94.2.93 2.94.2.93 3.37.3.36 3.37.3.36 5.38.5.37 5.38.5.37 5.41.5.4 5.41.5.4 5.2.5.19 5.2.5.19 4.17.4.16 4.17.4.16 7.76.7.75 7.76.7.75 7.75.7.74 7.75.7.74 8.54.8.53 8.54.8.53 8.29.8.28 8.29.8.28 7.7.7.69 7.7.7.69 7.69.7.68 7.69.7.68 0.65.0.64 0.65.0.64 0.68.0.67 0.68.0.67 1.44.1.43 1.44.1.43 5.53.5.52 5.53.5.52 8.42.8.41 8.42.8.41 1.32.1.31 1.32.1.31 1.31.1.3 1.31.1.3 1.35.1.34 1.35.1.34 0.89.0.88 0.89.0.88 5.05.5.04 5.05.5.04 5.02.5.01 5.02.5.01 5.85.5.84 5.85.5.84 2.13.2.12 2.13.2.12 6.94.6.93 6.94.6.93 6.93.6.92 6.93.6.92 6.29.6.28 6.29.6.28 7.04.7.03 7.04.7.03 7.02.7.01 7.02.7.01 1.43.1.42 1.43.1.42 1.42.1.41 1.42.1.41 5.42.5.41 5.42.5.41 8.9.8.89 8.9.8.89 1.58.1.57 1.58.1.57 1.57.1.56 1.57.1.56 1.55.1.54 1.55.1.54 1.47.1.46 1.47.1.46 1.56.1.55 1.56.1.55 1.65.1.64 1.65.1.64 1.66.1.65 1.66.1.65 1.67.1.66 1.67.1.66 1.83.1.82 1.83.1.82 1.6.1.59 1.6.1.59 1.59.1.58 1.59.1.58 1.74.1.73 1.74.1.73 1.73.1.72 1.73.1.72 1.76.1.75 1.76.1.75 1.75.1.74 1.75.1.74 1.23.1.22 1.23.1.22 0.64.0.63 0.64.0.63 0.55.0.54 0.55.0.54 0.59.0.58 0.59.0.58 0.63.0.62 0.63.0.62 0.58.0.57 0.58.0.57 0.53.0.52 0.53.0.52 0.54.0.53 0.54.0.53 0.62.0.61 0.62.0.61 4.3.99 4.3.99 3.68.3.67 3.68.3.67 4.05.4.04 4.05.4.04 3.99.3.98 3.99.3.98 3.98.3.97 3.98.3.97 5.8.5.79 5.8.5.79 5.79.5.78 5.79.5.78 8.67.8.66 8.67.8.66 7.28.7.27 7.28.7.27 7.18.7.17 7.18.7.17 7.27.7.26 7.27.7.26 5.4.5.39 5.4.5.39 7.64.7.63 7.64.7.63 6.33.6.32 6.33.6.32 2.17.2.16 2.17.2.16 7.1.7.09 7.1.7.09 2.15.2.14 2.15.2.14 2.14.2.13 80 2.14.2.13 101 250 339 477 478 480 486 493 516 517 525 529 541 1188 1237 1246 1252 1540 1676 2170 2200 2201 2274 2697 2726 2735 2740 2801 2979 2998 3031 3421 3450 3887 4125 4145 4146 4299 
4389 4513 4585 4587 4621 4874 4887 5126 5145 5217 5331 5637 5849 6298 6743 6951 7207 7357 8346 8379 8386 8387 8399 8404 8413 8421 8442 8470 8513 8529 8530 8534 8535 8542 8549 8571 8611 8622 8626 8627 8635 8636 8637 8979 9288 9351 9770 9771 9814 10019 10342 10506 10644 10777 10833 10891 10893 11327 11641 11647 11649 11658 11768 11797 11827 11837 11838 11839 11844 11850 11852 11855 11857 11871 11875 12021 12035 12085 12101 12103 12105 12383 12442 12446 12803 12811 12815 12868 12936 13009 13066 13092 13617 13627 13628 13629 13630 13631 13635 13637 13648 13650 13657 13660 13663 13666 13672 13674 13675 13677 13678 13681 13684 13685 13689 13690 13696 13708 13715 13721 13723 13725 13727 13730 13733 13750 13755 13761 13778 13952 13965 13981 13992 14324 14645 14701 14737 14748 14756 14765 14854 14874 15091 15095 15096 15102 15109 15110 15111 15113 15114 15115 15116 15123 15124 15133 15134 15138 15147 15186 SNPs coming from bins in the ~8-6 ppm range, indicating that they are aromatic compounds, likely phenolics (Eisenmann et al. 2016; Iaccarino et al. 2019). This finding can then be translated when viewing the heat maps for LC-MS data sets. Consulting the x-axis to find the same SNPs that are carrying signal from the NMR phenolic region will lead us to the same extremely long region in both LC-MS heat maps, characterizing them as phenolics without endeavoring to systematically identify each one. This demonstrates the power of genomic integration with multiple metabolomic approaches.

2.4.3.5 mQTLs detected for peaks across the NMR spectrum

To our knowledge, this study represents the first mGWAS using NMR-based untargeted metabolomics in plants. Therefore, new approaches for understanding and visualizing the NMR-bin associations had to be developed. We found 177 bins that were significantly associated with at least one SNP from the mGWAS results prioritization pipeline; these were plotted over the 1D 1H NMR spectrum of the apple extract pooled QC (Figure 13). The significant bins are distributed across the entire spectrum, which, in NMR, indicates genotype-metabolite relationships with a variety of different chemistries, as the chemical shift on the x-axis is determined by compound structure and therefore chemical class.

It is important to note that the binning procedure for NMR data does not aim to capture one compound per bin. Instead, binning is used in complex mixtures to capture distinct portions of the peaks in an attempt to parse signals from hydrogens that originate from different compounds but exist in very similar chemical environments. As a result, several consecutive bins commonly comprise a single signal derived from one metabolite. This multiple-bins-per-peak effect is compounded by the fact that each hydrogen with a distinct chemical environment in a metabolite elicits its own signal, so a single compound produces peaks at several chemical shifts. Therefore, the 177 bins do not represent 177 different compounds significantly associated with a SNP; the number includes bins at multiple chemical shifts that come from the same compound.

Although this does increase the number of comparisons occurring in the mGWAS, thus lowering the power of the analysis, there are benefits to this facet of genomics-NMR integration. First, if a compound truly exhibits a significant relationship with a SNP, this signal should be paralleled at multiple chemical shifts in the NMR spectrum. Indeed, even the region where one peak is located may include several consecutive bins with mQTL signal, giving confidence that it is not an artifact of the analysis.
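The idea that runs of adjacent significant bins likely belong to one peak can be illustrated with a minimal sketch. It assumes each significant bin is described by its upper edge in ppm and that bins are 0.01 ppm wide; the function name and the example bin values are illustrative only and are not taken from the analysis pipeline.

```python
def group_consecutive_bins(upper_edges_ppm, width_ppm=0.01):
    """Group significant NMR bins into runs of adjacent bins.

    upper_edges_ppm: upper edge (ppm) of each significant bin (assumed input format).
    Returns (high_ppm, low_ppm, n_bins) tuples ordered from high to low ppm.
    """
    # Convert to integer multiples of the bin width to avoid float comparisons.
    idx = sorted({round(edge / width_ppm) for edge in upper_edges_ppm}, reverse=True)
    runs = []
    for i in idx:
        if runs and runs[-1][-1] - i == 1:
            runs[-1].append(i)   # directly adjacent to the previous bin
        else:
            runs.append([i])     # gap found: start a new run
    return [
        (round(r[0] * width_ppm, 4), round((r[-1] - 1) * width_ppm, 4), len(r))
        for r in runs
    ]


# Illustrative input: three adjacent aromatic bins, one isolated bin,
# and two adjacent low-ppm bins.
regions = group_consecutive_bins([6.92, 6.91, 6.90, 6.74, 1.28, 1.27])
for high, low, n in regions:
    print(f"{high:.2f}-{low:.2f} ppm: {n} bin(s)")
```

Collapsing bins this way gives a count of contiguous regions rather than compounds, since one compound can still contribute several separate regions.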

Interpreting this figure from right to left according to the chemical shift, the very low ppm peaks (0.8-0.5 ppm) with significant mQTLs likely represent sterols or methyl groups of terpenoids. Following this area, amino acids, such as alanine and valine, appear to have signal for mQTLs from 1.75-1.25 ppm. Around 2 ppm, we would expect to see peaks from high abundance organic acids. The two large spikes from 3-2.5 ppm are areas that exhibited unstable chemical shifts across the 200 consecutive NMR analyses. The same issue was observed at another peak near 4.4 ppm. These peaks represent malic and citric acid, both of which are expected to have slightly unstable chemical shifts across experiments despite inclusion of a buffer to maintain pH of the samples. In all three cases, the intensities of peaks within the range containing the shifting peaks were summed into a single bin to avoid representing the same signal across many different bins.

Figure 13. 1D 1H NMR spectrum of the apple extract pooled QC. Yellow lines indicate each bin that was significantly associated with at least one SNP. Dashed lines approximately divide the spectrum according to the type of compounds that elicit peaks at that chemical shift. The aromatic region and amino acid region are in much lower abundance than the sugar region, so magnified insets are also presented.

As the goal of this project was platform development, time did not permit a thorough examination of the spectra to determine chemical shifts, multiplicity, and J-coupling for peak identities. Proton NMR analyses of apple juice by Belton et al. (1997) and Iaccarino et al. (2019), as well as of apple fruit pulp and peel extracts by Eisenmann et al. (2016), provide pertinent identities for peaks in the apple spectra represented in this study.

2.4.3.6 mQTL hotspots on chromosomes 16 and 17

The abundance of SNP-feature associations detected on chromosome 16 prompted further investigation. An even distribution of mQTLs across the 17 chromosomes would place about 5.9% (1/17) of the total mQTLs on each chromosome. Chromosome 16 is a clear hotspot because 54-60% of the mQTLs detected per metabolomic approach are located on it. Plots were constructed to visualize the number of significantly associated features per SNP within chromosome 16 (Figure 14). The goal of these plots was to understand if many disparate SNPs were eliciting signal or if a few SNPs were associated with many metabolomic features. The plots also gave a sense of pattern repetition across the three metabolomic approaches.
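The size of the chromosome 16 hotspot relative to an even spread can be worked through in a short sketch. It uses only the 17 chromosomes and the 54-60% range stated above; the function names are illustrative, and the uniform expectation is a simplifying assumption that ignores differences in chromosome length and SNP density.

```python
N_CHROMOSOMES = 17  # apple chromosomes, as stated in the text


def expected_share(n_chromosomes=N_CHROMOSOMES):
    """Share of mQTLs per chromosome under a perfectly even distribution."""
    return 1.0 / n_chromosomes


def fold_enrichment(observed_share, n_chromosomes=N_CHROMOSOMES):
    """Observed share divided by the share expected under an even distribution."""
    return observed_share / expected_share(n_chromosomes)


print(f"Expected share per chromosome: {expected_share():.1%}")
for share in (0.54, 0.60):  # range reported for chromosome 16
    print(f"Observed {share:.0%} -> {fold_enrichment(share):.1f}x the even share")
```

Under this assumption, chromosome 16 carries roughly nine to ten times the share of mQTLs expected from an even distribution.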

The results were striking, as a set of SNPs on the top of chromosome 16 was significantly associated with hundreds of features across the three analysis methods. In particular, three SNPs, 13681 (SNP_FB_1074682), 13685 (RosBREEDSNP_SNP_CT_1540624_Lg16_LAR1_MAF40_1618769_exon2), and 13675 (SNP_FB_0335535), maintained top positions for LC-MS (+), LC-MS (-), and NMR. This indicated that genetic elements in this locus exert control over metabolite production; the compounds involved are most likely a collection of structurally similar phytochemicals. These SNPs were also consistent with those eliciting the large band of significance across each of the metabolomic approaches in Figure 12. Based on the complementary nature of the metabolomics data sets, that band was determined to represent phenolic compounds. If this is truly the case, these SNPs may be good candidates for further development as markers for certain phenolic compounds. The genomic region should be recharacterized for a better understanding of what genes or regulatory elements are present. This phenolic mQTL hotspot on chromosome 16 is consistent with results from mQTL mapping studies by Khan, Chibon, et al. (2012) and Chagné, Krieger, et al. (2012) as well as the mGWAS in McClure et al. (2019).

Figure 14. A continued figure. Plots displaying the number of significantly associated metabolomic features per SNP on chromosome 16. Significance is determined as -log(p) ≥ 5 for LC-MS data sets and ≥ 4 for NMR. SNPs are plotted based on their genetic position (cM) on the x-axis. SNPs with a minimum of 1 significant feature association are labeled with their study index number. Top SNPs include: 13681 (SNP_FB_1074682), 13685 (RosBREEDSNP_SNP_CT_1540624_Lg16_LAR1_MAF40_1618769_exon2), and 13675 (SNP_FB_0335535). Additional index-to-SNP name conversions, including synonyms from the 480K SNP array, are available in Appendix B.1.9.
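The per-SNP counts behind plots such as Figure 14 can be sketched as a simple tally. The thresholds of 5 (LC-MS) and 4 (NMR) come from the figure caption; the base-10 logarithm, the input format, the feature names, and the p-values below are all assumptions made for illustration and are not results from this study.

```python
import math
from collections import Counter


def count_significant_features(associations, min_neg_log_p):
    """Count significantly associated features per SNP.

    associations: iterable of (snp_index, feature_id, p_value) tuples
                  (hypothetical layout of mGWAS output).
    min_neg_log_p: threshold on -log10(p), e.g. 5 for LC-MS or 4 for NMR.
    """
    counts = Counter()
    for snp, _feature, p in associations:
        # Base-10 log is assumed here; the caption only states "-log(p)".
        if p > 0 and -math.log10(p) >= min_neg_log_p:
            counts[snp] += 1
    return counts


# Made-up p-values for three SNP indices named in the text.
toy = [
    (13681, "feature_a", 1e-7),
    (13681, "feature_b", 3e-6),
    (13685, "feature_a", 2e-5),
    (13675, "feature_c", 1e-3),
]
lcms_counts = count_significant_features(toy, min_neg_log_p=5)
nmr_style_counts = count_significant_features(toy, min_neg_log_p=4)
print(dict(lcms_counts), dict(nmr_style_counts))
```

Plotting these counts against genetic position (cM) would then distinguish a few SNPs with many associated features from many SNPs with a few each.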

Similarly, the large number of signals present on chromosome 17 was examined with the same plots to visualize the number of significantly associated features per SNP (Figure 15). LC-MS approaches provided more mQTL signals than NMR: chromosome 17 accounted for 18-19% of total mQTL in the LC-MS data sets versus only 4% for NMR. This suggests that the bulk of the compounds with mQTL on chromosome 17 are not abundant enough to be detected via NMR metabolomics. Once again, similar results were seen across each metabolomic approach, with the top SNPs being 15109 (RosBREEDSNP_SNP_AG_20028330_Lg17_01298_MAF50_1664885_exon1), 15123 (SNP_FB_1114677), and 15133 (SNP_FB_0398770). The whole genomic region surrounding these SNPs should be recharacterized to understand what genes or regulatory elements in this area are eliciting an association with so many metabolites. Subsequently, markers could be developed for this region if the compounds are of interest for breeding purposes.

Figure 15. A continued figure. Plots displaying the number of significantly associated metabolomic features per SNP on chromosome 17. Significance is determined as -log(p) ≥ 5 for LC-MS data sets and ≥ 4 for NMR. SNPs are plotted based on their genetic position (cM) on the x-axis. SNPs with a minimum of 1 significant feature association are labeled with their study index number. Top SNPs include: 15109 (RosBREEDSNP_SNP_AG_20028330_Lg17_01298_MAF50_1664885_exon1), 15123 (SNP_FB_1114677), and 15133 (SNP_FB_0398770). Additional index-to-SNP name conversions, including synonyms from the 480K SNP array, are available in Appendix B.1.9.

Chromosome 17 has not been noted as remarkable in previous studies, but there is some consistency in reports of mQTL in this region for phenolic metabolites. Three studies identified an mQTL on the bottom of linkage group 17 for chlorogenic acid (Chagné, Krieger, et al. 2012; Khan, Chibon, et al. 2012; McClure et al. 2019). Chagné, Krieger, et al. (2012) also detected an mQTL for quercetin-3-O-rutinoside along the entirety of chromosome 17. Our study may be the first to identify an mQTL hotspot on this chromosome in apple.

2.4.3.7 Classical molecular networking in GNPS provides putative IDs for compounds analyzed with iterative LC-MS/MS

Data files in which comprehensive MS/MS spectral information was collected were submitted to the classical molecular networking platform in the Global Natural Products Social Molecular Networking (GNPS) infrastructure to search for putative matches based on m/z and fragmentation patterns (Wang et al. 2016). Four separate networking analyses were performed, each on five data-dependent MS/MS files, one for each combination of two ionization modes and two collision energies (ESI+ 20 eV, ESI+ 40 eV, ESI- 20 eV, ESI- 40 eV). The search with ESI+ 20 eV data yielded 109 MS/MS matches, 11 of which were for masses that had putative mQTLs as identified by the mGWAS analysis. The ESI+ MS/MS spectra obtained with a collision energy of 40 eV yielded 41 matches, 10 of which were masses with putative mQTLs. The compounds identified at the two collision energies overlapped, giving a total of 117 unique compounds, 12 of which were related to mQTLs. For ESI- MS/MS spectra, 42 matches were found at 20 eV and 29 at 40 eV, with nine and seven features, respectively, matching those with mQTLs. Combining the two ESI- results and removing duplicate identifications gave a total of 50 unique compounds from ESI-. Nine of these matched metabolomic features associated with putative mQTLs.

After cross-referencing the ESI+ and ESI- data, 154 different compounds were identified. From this total, 16 were masses for which mQTL had been identified: catechin, chlorogenic acid, kaempferol 3-α-L-arabinopyranoside, kaempferol 3-O-glucoside, melezitose, p-hydroxyphenylpyruvic acid, peonidin 3-O-glucoside, phloretin, procyanidin B2, pyraclostrobin, quercetin, quercitrin, rosmarinic acid, kaempferol-3-O-arabinoside, luteolin-3-diglycoside, and quinic acid. All tables with IDs and GNPS metadata can be found in Appendix A.1.4.
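The deduplication across runs described above amounts to a set union, and the reported totals imply the size of each overlap. The sketch below assumes that every match within a run is a distinct compound; the overlap sizes are derived from the reported counts rather than stated in the text, and the toy compound sets are illustrative assignments, not the actual per-run results.

```python
def merge_matches(*match_sets):
    """Union of compound identifications across runs, removing duplicates."""
    merged = set()
    for matches in match_sets:
        merged |= set(matches)
    return merged


def implied_overlap(n_a, n_b, n_union):
    """Inclusion-exclusion: |A ∩ B| = |A| + |B| - |A ∪ B|."""
    return n_a + n_b - n_union


# Toy example (hypothetical assignment of compounds to runs).
run_20ev = {"catechin", "chlorogenic acid", "phloretin"}
run_40ev = {"chlorogenic acid", "quercetin"}
unique_compounds = merge_matches(run_20ev, run_40ev)
print(len(unique_compounds))

# Overlaps implied by the counts reported in the text.
print(implied_overlap(109, 41, 117))   # ESI+ 20 eV vs ESI+ 40 eV
print(implied_overlap(42, 29, 50))     # ESI- 20 eV vs ESI- 40 eV
print(implied_overlap(117, 50, 154))   # ESI+ vs ESI-
```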

2.4.3.8 Authentic standards verify metabolite identification

The authentic standard for chlorogenic acid was purchased, and a stock solution was analyzed via LC-MS/MS experiments. The m/z and retention time of the precursor molecule, as well as the fragmentation pattern, were compared with those from targeted analysis of the same mass in the pooled QC of apple extracts. Additional compounds were purchased for which MS/MS experiments have been run, but data analysis has not been completed. Some of these identifications are for metabolomic features for which significant SNP associations were discovered; others simply help to understand what compounds the methods were able to detect. Further analysis could also use molecular networking with these identified compounds to discover the identities of related compounds.

2.4.3.9 PBA in FlexQTL™ allows strong validation of mQTLs proposed by mGWAS

In the genomics-metabolomics integration platform, mGWAS and subsequent prioritization schemes were used for high-throughput discovery of putative mQTL. Although a kinship matrix and principal components were used to correct the model, mGWAS does not capture the inheritance of each allele through a pedigree. Additionally, the mGWAS analyses included more individuals than the three pedigree-connected families, so the pedigree relationships were only valuable to a portion of the analyses. mGWAS are also susceptible to high false positive rates, so additional steps were taken to ensure an accurate mQTL signal for compounds of interest.

For stronger characterization of the location and variance explained by an mQTL for a specific phytochemical, a pseudo-validation of mQTL for features of interest was then applied by conducting a Bayesian pedigree-based analysis (PBA) in FlexQTL™. This analysis takes advantage of identity-by-descent (IBD) to track inheritance of haplotypes from ancestors to progeny in a known pedigree (Bink et al. 2014). Therefore, only genotype and metabolite abundance data for pedigree-connected individuals were used.

Although analyses were time consuming, even with the use of supercomputing, the majority of runs converged within 100,000 iterations. Replicate analyses for the same metabolite were consistent in determining phenotypic variance explained by the mQTL as well as the genetic interval where it was detected. FlexQTL™ results for the proof-of-concept metabolite are presented in the following section.

2.4.3.10 Proof-of-Concept: Chlorogenic Acid mQTL on Chromosome 17

Here, we show how our pipeline was able to discover the relationship between chlorogenic acid and an mQTL on chromosome 17. Chlorogenic acid is of interest because of its association with health promotion (Tajik et al. 2017), though its identity and genetic relationship were not known a priori. To begin with, this compound was simply a feature, represented by an m/z and retention time for both LC-MS (+) and (-) and multiple 0.01 ppm-wide bins in the raw metabolomic data. No identification had been given to the features. After the mGWAS analyses, it passed all the filters to remain in the core feature lists extracted from the overlapping section of the Venn diagram for each metabolomic approach.

It followed that the data-dependent MS/MS analysis captured fragmentation data that was then analyzed for spectral matches in the classical molecular networking platform in GNPS for both LC-MS (+) and (-). This search yielded a match to chlorogenic acid for both data sets. This putative match based on MS/MS similarity was then confirmed by comparing LC-MS/MS analysis in ESI (-) of a purchased authentic standard with analysis of the feature (353.0879 m/z) in the apple pooled QC (Figure 16).

Figure 16. Spectral evidence for feature identification as chlorogenic acid where (A) is an extracted ion chromatogram (EIC) of 353.0879 m/z ([M-H]-) in a stock solution of chlorogenic acid authentic standard and (B) is an EIC of the same m/z in a pooled apple extract QC run at the same time as the standard. The retention time match was further validated by matching MS/MS spectra of the mass for both samples in (C) and (D).

Retention time and fragmentation were consistent between the standard and the pooled QC sample. The fragments themselves also give confidence in the chlorogenic acid identification. Fragments of note were 191.0574 m/z and 161.0247 m/z, which correspond to the constituent molecules of chlorogenic acid (3-O-caffeoylquinic acid): quinic acid ([M-H]- 191.0562 m/z) and caffeic acid ([M-H2O-H]- 161.0239 m/z), respectively.
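The agreement between the observed fragments and the expected fragment masses can be expressed as a mass error in parts per million. The sketch below uses only the four m/z values quoted above and assumes that the values given in parentheses are the theoretical masses; the function name `ppm_error` is chosen here for illustration.

```python
def ppm_error(observed_mz: float, theoretical_mz: float) -> float:
    """Signed mass error of an observed m/z relative to a theoretical m/z, in ppm."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


# (observed, assumed theoretical) m/z pairs taken from the text above
fragments = {
    "quinic acid [M-H]-": (191.0574, 191.0562),
    "caffeic acid [M-H2O-H]-": (161.0247, 161.0239),
}

for name, (observed, theoretical) in fragments.items():
    print(f"{name}: {ppm_error(observed, theoretical):+.1f} ppm")
```

Under these assumptions both fragments fall within roughly 5 to 6 ppm of the expected masses. What error is acceptable depends on the instrument and is not specified here.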

The mQTL detected for this feature, now identified as chlorogenic acid, was investigated further. Manhattan plots were made to visualize the mQTL signal for chlorogenic acid (Figure 17A and B). A clearly significant signal (FDR-corrected p < .05) was evident on chromosome 17, along with suggestive signals on chromosomes 3 and 9. Further analysis showed SNP 15109 (RosBREEDSNP_SNP_AG_20028330_Lg17_01298_MAF50_1664885_exon1) to be the most significant locus in the chromosome 17 signal. This was notable because the visualization strategy used for Figure 15 had revealed SNP 15109 as one of the SNPs with the most significantly associated features. These findings suggested that the cluster of mQTLs detected near the bottom of chromosome 17 might be metabolically related to chlorogenic acid. Additional validation is needed to confirm this hypothesis, but putatively identified members of the phenylpropanoid pathway with apparent mQTLs, such as caffeic acid, co-locate at this same locus.
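The significance threshold above refers to FDR-corrected p-values. This section does not state which FDR procedure was applied, so the following is only a minimal sketch assuming the Benjamini-Hochberg step-up method; the function name `benjamini_hochberg` and the example p-values are illustrative and are not taken from the study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values (q-values) in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone q-values
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        q_values[idx] = running_min
    return q_values


# Illustrative p-values only (not results from this study)
example_p = [0.01, 0.04, 0.03, 0.005]
example_q = benjamini_hochberg(example_p)
significant = [q < 0.05 for q in example_q]
```

A SNP-feature association would then be called significant when its q-value falls below .05, which corresponds to the dashed line in the Manhattan plots.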

Figure 17. Manhattan plots of chlorogenic acid phenotypic measurements from (A) LC-MS (+), (B) LC-MS (-), and (C) NMR. In each panel, the x-axis is chromosome (1-17) and the y-axis is -log(p). Alternating colors were used to help delineate neighboring chromosomes. The dashed line indicates an FDR-corrected q-value equivalent to p = .05.

After determining the identity and strong mQTL signal for chlorogenic acid from the LC-MS data sets, the NMR data were investigated to learn whether the same results were evident. The findings in Figure 15C were leveraged to help determine which bins in the NMR analysis, if any, corresponded to signals from chlorogenic acid hydrogen atoms. There, SNP 15109 was indicated as having a significant association with five metabolomic features. The results were further supported by inspection of the binary heat map for SNP-bin associations from NMR data (Figure 12), where the bins associating with SNP 15109 are clustered in the bottom right corner. The names of these bins extracted from the data frame were as follows: 7.1-7.09, 6.33-6.32, 2.17-2.16, 2.15-2.14, and 2.14-2.13 ppm. The chemical shift distribution and multiplicity of these peaks in the 1D 1H NMR spectra of the apple pooled QC (Figure 18) closely matched those of the database spectra for chlorogenic acid in HMDB (Wishart et al. 2018). Subsequent graphing of the Manhattan plot for a representative chlorogenic acid NMR bin (Figure 17C) showed a pattern consistent with those for the LC-MS data sets.
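Because the bin names encode their ppm boundaries, neighboring bins that share a boundary can be merged into contiguous spectral regions before comparison with a reference spectrum. The sketch below illustrates this for the five bins listed above. The helper names `parse_bin` and `merge_adjacent` are hypothetical, and the merging step is an illustration rather than part of the reported workflow.

```python
from decimal import Decimal

# Bin names as reported in the text ("high-low" in ppm)
BINS = ["7.1-7.09", "6.33-6.32", "2.17-2.16", "2.15-2.14", "2.14-2.13"]


def parse_bin(name):
    """Split a 'high-low' bin name into exact decimal ppm boundaries."""
    high, low = (Decimal(part) for part in name.split("-"))
    return high, low


def merge_adjacent(bin_names):
    """Merge bins that share a boundary into contiguous (high, low) regions."""
    bins = sorted((parse_bin(n) for n in bin_names), reverse=True)
    regions = [list(bins[0])]
    for high, low in bins[1:]:
        if high == regions[-1][1]:
            regions[-1][1] = low  # extend the current region downfield-to-upfield
        else:
            regions.append([high, low])
    return [tuple(region) for region in regions]


for high, low in merge_adjacent(BINS):
    print(f"{high}-{low} ppm")
```

Decimal arithmetic is used so that boundaries such as 2.14 compare exactly, which would not be guaranteed with binary floating point.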

As stated previously, other studies have identified an mQTL at the bottom of linkage group 17 for chlorogenic acid (Chagné, Krieger, et al. 2012; Khan, Chibon, et al. 2012; McClure et al. 2019). Two of these studies were based on bi-parental mapping populations (Chagné, Krieger, et al. 2012; Khan, Chibon, et al. 2012). In contrast, a strong but non-significant signal was detected for chlorogenic acid in a breeding-relevant population using mGWAS (McClure et al. 2019). The stronger mQTL signal we observed could be due to the specific breeding-relevant germplasm used for this study. Another reason for the more sensitive mQTL detection in our mGWAS may be the core of pedigree-connected individuals chosen for the study, which enabled an informative kinship matrix to be included in the model. This suggests that, although it is important to use breeding-relevant germplasm in mQTL analysis, including a set of progeny and their pedigree-related individuals strengthens detection power to near that of a bi-parental mapping population.

Figure 18. NMR spectra of the apple extract pooled QC and zoomed subsets of areas of interest. In the full spectrum (C), yellow lines indicate ppm of bins significantly associated with SNP 15109 and matching with expected peaks for chlorogenic acid. In subsets (A) and (B), yellow arrows point to the specific peaks for chlorogenic acid.

Finally, the log2-transformed metabolite abundances for the chlorogenic acid peak in all three metabolomic data sets were analyzed with the Bayesian PBA in FlexQTL™. For NMR, chlorogenic acid is represented by several bins, so the bin with the highest -log(p) value from the mGWAS, 2.15-2.14 ppm, was chosen for analysis. The NMR spectrum was examined to ensure this bin represented only one peak from chlorogenic acid and not a combination of several compounds.

The findings from the FlexQTL™ output are listed in Table 4. Both LC-MS (+) and (-) showed one mQTL for chlorogenic acid. The Bayes factors (2ln(BF)) for replicate FlexQTL™ routines for each ionization mode ranged from strong (2ln(BF) 5-10) to decisive (2ln(BF) >10). The NMR results consistently detected a positive (2ln(BF) 2-5) mQTL on chromosome 17 that neared the strong classification. Additionally, a positive signal on chromosome 3 was detected in one replicate of the NMR routines. In all other FlexQTL™ runs for chlorogenic acid, the signal on chromosome 3 fell just short of the positive threshold. The table further outlines a large but consistent genetic interval for the chlorogenic acid mQTL on chromosome 17. Additionally, narrow sense heritability was high for each metabolomic data set, with an overall range of 0.53-0.80.
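The evidence categories quoted above (positive 2-5, strong 5-10, decisive >10) can be written as a small lookup. This is an illustrative sketch only: the function name `classify_2lnbf`, the label for values below 2, and the treatment of values falling exactly on a boundary are assumptions and are not FlexQTL™ output.

```python
def classify_2lnbf(value: float) -> str:
    """Map a 2ln(BF) value to the evidence category used in the text.

    Boundary handling (exactly 2, 5, or 10) is an assumption made here.
    """
    if value > 10:
        return "decisive"
    if value >= 5:
        return "strong"
    if value >= 2:
        return "positive"
    return "below positive threshold"


# Illustrative values only (not results from this study)
for example in (1.9, 3.0, 7.5, 12.0):
    print(example, classify_2lnbf(example))
```

Under this scheme, a signal that falls just short of the positive threshold, as described for chromosome 3 in most runs, would be any value slightly below 2.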

2.5 Conclusion

This study provides a pipeline for high-throughput processing and integration of genomic and metabolomic data sets in diverse, breeding-relevant apple germplasm. The genomic and metabolomic diversity of the selected apple varieties led to detection of mQTLs across the apple genome. The use of three parallel metabolomic approaches offered complementary interpretation of results that increased confidence in compound identification and detection of significant mQTL through mGWAS and PBA. This workflow enabled detection of 519, 726, and 177 putative mQTLs in the LC-MS (+), LC-MS (-), and NMR data sets, respectively. Furthermore, chlorogenic acid represented a proof-of-concept example that this approach was able to characterize and detect consistent mQTLs via mGWAS and PBA for metabolites relevant to nutrition-driven breeding.

Table 4. FlexQTL™ results for the pedigree-based analysis (PBA) of chlorogenic acid abundance data for LC-MS (+), (-), and NMR data sets. Number of positive mQTL was determined by a minimum Bayes Factor of 2. Genetic interval and narrow sense heritability estimates are recorded in ranges determined from three replicates of the FlexQTL™ runs for each data set.

Chlorogenic Acid                  LC-MS (+)    LC-MS (-)    NMR
Number of Positive mQTL           1            1            2
Chromosome                        17           17           3         17
Genetic Interval (cM)             29-72        28-56        26-58     15-67
Narrow Sense Heritability (h2)    0.62-0.72    0.63         0.59      0.53-0.8

The pipeline could be extended to other tree crops and adapted for use with other perennial and annual crops as well. Platforms that enable feasible assessment of genotype-metabolite relationships are imperative as crop research moves toward metabolome-based improvement of nutrition, disease resistance, post-harvest quality, and consumer likability. This is of even greater importance in perennial tree crops, such as apple, which will rely upon marker-assisted breeding to overcome a long breeding cycle and achieve marked advances that parallel, or ideally influence, consumer desires and needs.


Bibliography

Abdul-Ghani, MA and RA DeFronzo. 2008. “Inhibition of Renal Glucose Reabsorption: A Novel Strategy for Achieving Glucose Control in Type 2 Diabetes Mellitus.” Endocrine Practice 14:782–90. Alvarado, Francisco and Robert K. Crane. 1962. “Phlorizin as a Competitive Inhibitor of the Active Transport of Sugars by Hamster Small Intestine, in Vitro.” Biochimica et Biophysica Acta 56:170–72. Amadeu, Rodrigo R., Catherine Cellon, James W. Olmstead, Antonio A. F. Garcia, Marcio F. R. Resende Jr., and Patricio R. Muñoz. 2016. “AGHmatrix: R Package to Construct Relationship Matrices for Autotetraploid and Diploid Species: A Blueberry Example.” The Plant Genome 9(3). Aprea, Eugenio, Maria Laura Corollaro, Emanuela Betta, Isabella Endrizzi, Maria Luisa Demattè, Franco Biasioli, and Flavia Gasperi. 2012. “Sensory and Instrumental Profiling of 18 Apple Cultivars to Investigate the Relation between Perceived Quality and Odour and Flavour.” Food Research International 49:677–86. Aprea, Eugenio, Helen Gika, Silvia Carlin, Georgios Theodoridis, Urska Vrhovsek, and Fulvio Mattivi. 2011. “Metabolite Profiling on Apple Volatile Content Based on Solid Phase Microextraction and Gas-Chromatography Time of Flight Mass Spectrometry.” Journal of Chromatography A 1218:4517–24. Barth, Stephan W., Christine Faehndrich, Achim Bub, Bernhard Watzl, Frank Will, Helmut Dietrich, Gerhard Rechkemmer, and Karlis Briviba. 2007. “Cloudy Apple Juice Is More Effective than Apple Polyphenols and an Apple Juice Derived Cloud Fraction in a Rat Model of Colon Carcinogenesis.” Journal of Agricultural and Food Chemistry 55:1181–87. Baumgartner, Isabelle O., Andrea Patocchi, Jürg E. Frey, Andreas Peil, and Markus Kellerhals. 2015. “Breeding Elite Lines of Apple Carrying Pyramided Homozygous Resistance Genes against Apple Scab and Resistance against Powdery Mildew and Fire Blight.” Plant Molecular Biology Reporter 33:1573–83. Belton, P. S., I. Delgadillo, A. M. Gil, P. Roma, F. Casuscelli, I. J. Colquhoun, M. J. 
Dennis, and M. Spraul. 1997. “High‐field Proton NMR Studies of Apple Juices.” Magnetic Resonance in Chemistry 35:S52–60. Beyer, Peter. 2010. “Golden Rice and ‘Golden’ Crops for Human Nutrition.” New Biotechnology 27(5):478–81. Bianco, Luca, Alessandro Cestaro, Gareth Linsmith, Hélène Muranty, Caroline Denancé, Anthony Théron, Charles Poncet, Diego Micheletti, Emanuela Kerschbamer, Erica A. Di Pierro, Simone Larger, Massimo Pindo, Eric van de Weg, Alessandro Davassi, François Laurens, Riccardo Velasco, Charles Eric Durel, and Michela Troggio. 2016. “Development and Validation of the Axiom®Apple480K SNP Genotyping Array.” The Plant Journal 86:62–74. Bianco, Luca, Alessandro Cestaro, Daniel James Sargent, Elisa Banchi, Sophia Derdak, Mario Di Guardo, Silvio Salvi, Johannes Jansen, Roberto Viola, Ivo Gut, Francois Laurens, David Chagné, Riccardo Velasco, Eric van de Weg, and Michela Troggio. 102

2014. “Development and Validation of a 20K Single Nucleotide Polymorphism (SNP) Whole Genome Genotyping Array for Apple (Malus × Domestica Borkh).” PLoS ONE 9(10):e110377. Biedrzycka, Elzbieta and Ryszard Amarowicz. 2008. “Diet and Health: Apple Polyphenols as Antioxidants.” Food Reviews International 24(2):235–51. Bink, M. C. A. M. 2002. “On Flexible Finite Polygenic Models for Multiple-Trait Evaluation.” Genetics Research 80:245–56. Bink, M. C. A. M., J. Jansen, M. Madduri, R. E. Voorrips, C. E. Durel, A. B. Kouassi, F. Laurens, F. Mathis, C. Gessler, D. Gobbin, F. Rezzonico, A. Patocchi, M. Kellerhals, A. Boudichevskaia, F. Dunemann, A. Peil, A. Nowicka, B. Lata, M. Stankiewicz-Kosyl, K. Jeziorek, E. Pitera, A. Soska, K. Tomala, K. M. Evans, F. Fernández-Fernández, W. Guerra, M. Korbin, S. Keller, M. Lewandowski, W. Plocharski, K. Rutkowski, E. Zurawicz, F. Costa, S. Sansavini, S. Tartarini, M. Komjanc, D. Mott, A. Antofie, M. Lateur, A. Rondia, L. Gianfranceschi, and W. E. van de Weg. 2014. “Bayesian QTL Analyses Using Pedigreed Families of an Outcrossing Species, with Application to Fruit Firmness in Apple.” Theoretical and Applied Genetics 127:1073–90. Bink, M. C. A. M., P. Uimari, M. J. Sillanpää, L. L. G. Janss, and R. C. Jansen. 2002. “Multiple QTL Mapping in Related Plant Populations via a Pedigree-Analysis Approach.” Theoretical and Applied Genetics 104:751–62. Bliss, Fredrick A. 1999. “Nutritional Improvement of Horticultural Crops through Plant Breeding.” HortScience 34(7):1163–67. Bondonno, P., Catherine P. Bondonno, Lauren C. Blekkenhorst, Michael J. Considine, Ghassan Maghzal, Roland Stocker, Richard J. Woodman, Natalie C. Ward, Jonathan M. Hodgson, and Kevin D. Croft. 2018. “Flavonoid-Rich Apple Improves Endothelial Function in Individuals at Risk for Cardiovascular Disease: A Randomized Controlled Clinical Trial.” Molecular Nutrition and Food Research 62(3):1–10. Boyer, Jeanelle and Rui Hai Liu. 2004. 
“Apple Phytochemicals and Their Health Benefits.” Nutrition Journal 3(5). Brizzolara, Stefano, Claudio Santucci, Leonardo Tenori, Maarten Hertog, Bart Nicolai, Stefan Stürz, Angelo Zanella, and Pietro Tonutti. 2017. “A Metabolomics Approach to Elucidate Apple Fruit Responses to Static and Dynamic Controlled Atmosphere Storage.” Postharvest Biology and Technology 127:76–87. Calenge, F., D. Drouet, C. Denancé, W. E. van de Weg, M. N. Brisset, J. P. Paulin, and C. E. Durel. 2005. “Identification of a Major QTL Together with Several Minor Additive or Epistatic QTLs for Resistance to Fire Blight in Apple in Two Related Progenies.” Theoretical and Applied Genetics 111:128–35. Calenge, F. and C. E. Durel. 2006. “Both Stable and Unstable QTLs for Resistance to Powdery Mildew Are Detected in Apple after Four Years of Field Assessments.” Molecular Breeding 17:329–39. Calenge, F., A. Faure, M. Goerre, C. Gebhardt, W. E. van de Weg, L. Parisi, and C. E. Durel. 2004. “Quantitative Trait Loci (QTL) Analysis Reveals Both Broad- Spectrum and Isolate-Specific QTL for Scab Resistance in an Apple Progeny 103

Challenged with Eight Isolates of Venturia Inaequalis.” Phytopathology 94(4):370– 79. Chagné, David, Ross N. Crowhurst, Michela Troggio, Mark W. Davey, Barbara Gilmore, Cindy Lawley, Stijn Vanderzande, Roger P. Hellens, Satish Kumar, Alessandro Cestaro, Riccardo Velasco, Dorrie Main, Jasper D. Rees, Amy Iezzoni, Todd Mockler, Larry Wilhelm, Eric van de Weg, Susan E. Gardiner, Nahla Bassil, and Cameron Peace. 2012. “Genome-Wide SNP Detection, Validation, and Development of an 8K SNP Array for Apple.” PLoS ONE 7(2):e31745. Chagné, David, Célia Krieger, Maysoon Rassam, Mike Sullivan, Jenny Fraser, Christelle André, Massimo Pindo, Michela Troggio, Susan E. Gardiner, Rebecca A. Henry, Andrew C. Allan, Tony K. McGhie, and William A. Laing. 2012. “QTL and Candidate Gene Mapping for Polyphenolic Composition in Apple Fruit.” BMC Plant Biology 12(12). Chalmers, DJ, JD Faragher, and JW Raff. 1973. “Changes in Anthocyanin Synthesis as an Index of Maturity in Red Apple Varieties.” Journal of Horticultural Science 48:387–92. Chambers, Matthew C., Brendan MacLean, Robert Burke, Dario Amodei, Daniel L. Ruderman, Steffen Neumann, Laurent Gatto, Bernd Fischer, Brian Pratt, Jarrett Egertson, Katherine Hoff, Darren Kessner, Natalie Tasman, Nicholas Shulman, Barbara Frewen, Tahmina A. Baker, Mi Youn Brusniak, Christopher Paulse, David Creasy, Lisa Flashner, Kian Kani, Chris Moulding, Sean L. Seymour, Lydia M. Nuwaysir, Brent Lefebvre, Frank Kuhlmann, Joe Roark, Paape Rainer, Suckau Detlev, Tina Hemenway, Andreas Huhmer, James Langridge, Brian Connolly, Trey Chadick, Krisztina Holly, Josh Eckels, Eric W. Deutsch, Robert L. Moritz, Jonathan E. Katz, David B. Agus, Michael MacCoss, David L. Tabb, and Parag Mallick. 2012. “A Cross-Platform Toolkit for Mass Spectrometry and Proteomics.” Nature Biotechnology 30(10):918–20. Chan, Amy, Valerie Graves, and Thomas B. Shea. 2006. 
“Apple Juice Concentrate Maintains Acetylcholine Levels Following Dietary Compromise.” Journal of Alzheimer’s Disease 9(3):287–91. Chan, Amy, Daniela Ortiz, E. Rogers, and Thomas B. Shea. 2011. “Supplementation with Apple Juice Can Compensate for Folate Deficiency in a Mouse Model Deficient in Methylene Tetra Hydrofolate Reductase.” The Journal of Nutrition, Health & Aging 15(3):221–25. Chan, Amy and Thomas B. Shea. 2006. “Supplementation with Apple Juice Attenuates Presenilin-1 Overexpression during Dietary and Genetically-Induced Oxidative Stress.” Journal of Alzheimer’s Disease 10(4):353–58. Chan, Amy and Thomas B. Shea. 2009. “Dietary Supplementation with Apple Juice Decreases Endogenous Amyloid-β Levels in Murine Brain.” Journal of Alzheimer’s Disease 16(1):167–71. Collard, B. C. Y., M. Z. Z. Jahufer, J. B. Brouwer, and E. C. K. Pang. 2005. “An Introduction to Markers, Quantitative Trait Loci (QTL) Mapping and Marker- Assisted Selection for Crop Improvement: The Basic Concepts.” Euphytica 142:169–96. 104

Cornille, Amandine, Tatiana Giraud, Marinus J. M. Smulders, Isabel Roldán-Ruiz, and Pierre Gladieux. 2014. “The Domestication and Evolutionary Ecology of Apples.” Trends in Genetics 30(2):57–65. Cornille, Amandine, Pierre Gladieux, Marinus J. M. Smulders, Isabel Roldán-Ruiz, François Laurens, Bruno Le Cam, Anush Nersesyan, Joanne Clavel, Marina Olonova, Laurence Feugey, Ivan Gabrielyan, Xiu-Guo Zhang, Maud I. Tenaillon, and Tatiana Giraud. 2012. “New Insight into the History of Domesticated Apple: Secondary Contribution of the European Wild Apple to the Genome of Cultivated Varieties.” PLoS Genetics 8(5):e1002703. Cummins, James N. and Herb S. Aldwinckle. 1983. “Breeding Apple Rootstocks.” Pp. 294–394 in Plant Breeding Reviews, edited by J. Janick. The Avi Publishing Company, Inc. Cuthbertson, Daniel, Preston K. Andrews, John P. Reganold, Neal M. Davies, and B. Markus Lange. 2012. “Utility of Metabolomics toward Assessing the Metabolic Basis of Quality Traits in Apple Fruit with an Emphasis on Antioxidants.” Journal of Agricultural and Food Chemistry 60(35):8552–60. Daccord, Nicolas, Jean Marc Celton, Gareth Linsmith, Claude Becker, Nathalie Choisne, Elio Schijlen, Henri van de Geest, Luca Bianco, Diego Micheletti, Riccardo Velasco, Erica Adele Di Pierro, Jérôme Gouzy, D. Jasper G. Rees, Philippe Guérif, Hélène Muranty, Charles Eric Durel, François Laurens, Yves Lespinasse, Sylvain Gaillard, Sébastien Aubourg, Hadi Quesneville, Detlef Weigel, Eric Van De Weg, Michela Troggio, and Etienne Bucher. 2017. “High-Quality de Novo Assembly of the Apple Genome and Methylome Dynamics of Early Fruit Development.” Nature Genetics 49(7):1099–1106. Diretto, Gianfranco, Salim Al-Babili, Raffaela Tavazza, Velia Papacchioli, Peter Beyer, and Giovanni Giuliano. 2007. “Metabolic Engineering of Potato Carotenoid Content through Tuber-Specific Overexpression of a Bacterial Mini-Pathway.” PLoS ONE 2(4):e350. Ducreux, Laurence J. M., Wayne L. Morris, Peter E. 
Hedley, Tom Shepherd, Howard V. Davies, Steve Millam, and Mark A. Taylor. 2005. “Metabolic Engineering of High Carotenoid Potato Tubers Containing Enhanced Levels of β-Carotene and Lutein.” Journal of Experimental Botany 56(409):81–89. Dunemann, F., D. Ulrich, A. Boudichevskaia, C. Grafe, and W. E. Weber. 2009. “QTL Mapping of Aroma Compounds Analysed by Headspace Solid-Phase Microextraction Gas Chromatography in the Apple Progeny ‘Discovery’ × ‘Prima.’” Molecular Breeding 23:501–21. Eberhardt, Marian V., Chang Yong Lee, and Rui Hai Lui. 2000. “Antioxidant Activity of Fresh Apples.” Nature 405:903–4. Ehrenkranz, Joel R. L., Norman G. Lewis, C. Ronald Kahn, and Jesse Roth. 2005. “Phlorizin: A Review.” Diabetes/Metabolism Research and Reviews 21(1):31–38. Eisenmann, Philipp, Mona Ehlers, Christoph H. Weinert, Pavleta Tzvetkova, Mara Silber, Manuela J. Rist, Burkhard Luy, and Claudia Muhle-Goll. 2016. “Untargeted NMR Spectroscopic Analysis of the Metabolic Variety of New Apple Cultivars.” Metabolites 6(29). 105

Endelman, Jeffrey B. 2011. “Ridge Regression and Other Kernels for Genomic Selection with R Package rrBLUP.” The Plant Genome 4(3):250–55. Farneti, Brian, Iuliia Khomenko, Luca Cappellin, Valentina Ting, Guglielmo Costa, Franco Biasioli, and Fabrizio Costa. 2015. “Dynamic Volatile Organic Compound Fingerprinting of Apple Fruit during Processing.” LWT - Food Science and Technology 63:21–28. Farneti, Brian, Iuliia Khomenko, Luca Cappellin, Valentina Ting, Andrea Romano, Franco Biasioli, Guglielmo Costa, and Fabrizio Costa. 2015. “Comprehensive VOC Profiling of an Apple Germplasm Collection by PTR-ToF-MS.” Metabolomics 11:838–50. Feskanich, D., R. G. Ziegler, D. S. Michaud, E. L. Giovannucci, F. E. Speizer, W. C. Willett, and G. A. Colditz. 2000. “Prospective Study of Fruit and Vegetable Consumption and Risk of Lung Cancer among Men and Women.” Journal of the National Cancer Institute 92(22):1812–23. Fresnedo-Ramírez, Jonathan, Marco C. A. M. Bink, Eric van de Weg, Thomas R. Famula, Carlos H. Crisosto, Terrence J. Frett, Ksenija Gasic, Cameron P. Peace, and Thomas M. Gradziel. 2015. “QTL Mapping of Pomological Traits in Peach and Related Species Breeding Germplasm.” Molecular Breeding 35(166). Fresnedo-Ramírez, Jonathan, Terrence J. Frett, Paul J. Sandefur, Alejandra Salgado- Rojas, John R. Clark, Ksenija Gasic, Cameron P. Peace, Natalie Anderson, Timothy P. Hartmann, David H. Byrne, Marco C. A. M. Bink, Eric van de Weg, Carlos H. Crisosto, and Thomas M. Gradziel. 2016. “QTL Mapping and Breeding Value Estimation through Pedigree-Based Analysis of Fruit Size and Weight in Four Diverse Peach Breeding Programs.” Tree Genetics and Genomes 12(25). Gaisano, Herbert Y., Claes Goran Ostenson, Laura Sheu, Michael B. Wheeler, and Suad Efendic. 2002. 
“Abnormal Expression of Pancreatic Islet Exocytotic Soluble N- Ethylmaleimide-Sensitive Factor Attachment Protein Receptors in Goto-Kakizaki Rats Is Partially Restored by Phlorizin Treatment and Accentuated by High Glucose Treatment.” Endocrinology 143(11):4218–26. Gallus, S., R. Talamini, A. Giacosa, M. Montella, V. Ramazzotti, S. Franceschi, E. Negri, and C. La Vecchia. 2005. “Does an Apple a Day Keep the Oncologist Away?” Annals of Oncology 16:1841–44. Gosch, Christian, Heidi Halbwirth, and Karl Stich. 2010. “Phloridzin: Biosynthesis, Distribution and Physiological Relevance in Plants.” Phytochemistry 71:838–43. Gossé, Francine, Sylvain Guyot, Stamatiki Roussi, Annelise Lobstein, Barbara Fischer, Nikolaus Seiler, and Francis Raul. 2005. “Chemopreventive Properties of Apple Procyanidins on Human Colon Cancer-Derived Metastatic SW620 Cells and in a Rat Model of Colon Carcinogenesis.” Carcinogenesis 26(7):1291–95. Guan, Yingzhu, Cameron Peace, David Rudell, Sujeet Verma, and Kate Evans. 2015. “QTLs Detected for Individual Sugars and Soluble Solids Content in Apple.” Molecular Breeding 35(135). Guo, Xiao fei, Bo Yang, Jun Tang, Jia Jing Jiang, and Duo Li. 2017. “Apple and Pear Consumption and Type 2 Diabetes Mellitus Risk: A Meta-Analysis of Prospective Cohort Studies.” Food and Function 8:927–34. 106

Hastie, Trevor, Robert Tibshirani, Balasubramanian Narasimhan, and Gilbert Chu. 2019. “Impute: Impute: Imputation for Microarray Data. R Package Version 1.60.0.” Hatoum, Darwish, Maarten L. A. T. M. Hertog, Annemie H. Geeraerd, and Bart M. Nicolai. 2016. “Effect of Browning Related Pre- and Postharvest Factors on the ‘Braeburn’ Apple Metabolome during CA Storage.” Postharvest Biology and Technology 111:106–16. Hill, Camilla B., Julian D. Taylor, James Edwards, Diane Mather, Peter Langridge, Antony Bacic, and Ute Roessner. 2015. “Detection of QTL for Metabolic and Agronomic Traits in Wheat with Adjustments for Variation at Genetic Loci That Affect Plant Phenology.” Plant Science 233:143–54. Hollands, Wendy J., David J. Hart, Jack R. Dainty, Oliver Hasselwander, Kirsti Tiihonen, Richard Wood, and Paul A. Kroon. 2013. “Bioavailability of Epicatechin and Effects on Nitric Oxide Metabolites of an Apple Flavanol-Rich Extract Supplemented Beverage Compared to a Whole Apple Puree: A Randomized, Placebo-Controlled, Crossover Trial.” Molecular Nutrition and Food Research 57:1209–17. Hollands, Wendy J., Henri Tapp, Marianne Defernez, Natalia Perez Moral, Mark S. Winterbone, Mark Philo, Alice J. Lucey, Mairead E. Kiely, and Paul A. Kroon. 2018. “Lack of Acute or Chronic Effects of Epicatechin-Rich and Procyanidin-Rich Apple Extracts on Blood Pressure and Cardiometabolic Biomarkers in Adults with Moderately Elevated Blood Pressure: A Randomized, Placebo-Controlled Crossover Trial.” The American Journal of Clinical Nutrition 108:1006–14. Honda, Chikako, Hideo Bessho, Mari Murai, Hiroshi Iwanami, Shigeki Moriya, Kazuyuki Abe, Masato Wada, Yuki Moriya-Tanaka, Hiroko Hayama, and Miho Tatsuki. 2014. “Effect of Temperature on Anthocyanin Synthesis and Ethylene Production in the Fruit of Early- and Medium-Maturing Apple Cultivars during Ripening Stages.” HortScience 49(12):1510–17. Howard, Nicholas P., Eric van de Weg, David S. Bedford, Cameron P. Peace, Stijn Vanderzande, Matthew D. 
Clark, Soon Li Teh, Lichun Cai, and James J. Luby. 2017. “Elucidation of the ‘Honeycrisp’ Pedigree through Haplotype Analysis with a Multi-Family Integrated SNP Linkage Map and a Large Apple (Malus×domestica) Pedigree-Connected SNP Data Set.” Horticulture Research 4(17003). Hyson, Dianne A. 2011. “A Comprehensive Review of Apples and Apple Components and Their Relationship to Human Health.” Advanced Nutrition 2:408–20. Iaccarino, Nunzia, Camilla Varming, Mikael Agerlin Petersen, Nanna Viereck, Birk Schütz, Torben Bo Toldam-Andersen, Antonio Randazzo, and Søren Balling Engelsen. 2019. “Ancient Danish Apple Cultivars—A Comprehensive Metabolite and Sensory Profiling of Apple Juices.” Metabolites 9(139). Iezzoni, A., C. Weebadde, J. Luby, Chengyan Yue, E. van de Weg, G. Fazio, D. Main, C. P. Peace, N. V. Bassil, and J. McFerson. 2010. “RosBREED: Enabling Marker- Assisted Breeding in Rosaceae.” Pp. 389–94 in Acta Horticulturae. Vol. 859, edited by N. V. Bassil and R. Martin. Corvalis, OR: International Society of Horticultural Sciences. Kang, Le, Sung-Chul Park, Chang Yoon Ji, Ho Soo Kim, Haeng-Soon Lee, and Sang- 107

Soo Kwak. 2017. “Metabolic Engineering of Carotenoids in Transgenic Sweetpotato.” Breeding Science 67:27–34. Khan, Sabaz Ali, Pierre Yves Chibon, Ric C. H. De Vos, Bert A. Schipper, Evert Walraven, Jules Beekwilder, Thijs Van Dijk, Richard Finkers, Richard G. F. Visser, Eric W. Van De Weg, Arnaud Bovy, Alessandro Cestaro, Riccardo Velasco, Evert Jacobsen, and Henk J. Schouten. 2012. “Genetic Analysis of Metabolites in Apple Fruits Indicates an MQTL Hotspot for Phenolic Compounds on Linkage Group 16.” Journal of Experimental Botany 63(8):2895–2908. Khan, Sabaz Ali, Jan G. Schaart, Jules Beekwilder, Andrew C. Allan, Yury M. Tikunov, Evert Jacobsen, and Henk J. Schouten. 2012. “The MQTL Hotspot on Linkage Group 16 for Phenolic Compounds in Apple Fruits Is Probably the Result of a Leucoanthocyanidin Reductase Gene at That Locus.” BMC Research Notes 5(618). Klein, Matthias. 2020. “Mrbin: Magnetic Resonance Binning, Integration and Normalization. R Package Version 1.3.0.” Koller, B., L. Gianfranceschi, N. Seglias, J. McDermott, and C. Gessler. 1994. “DNA Markers Linked to Malus Floribunda 821 Scab Resistance.” Plant Molecular Biology 26:597–602. Kristensen, Mette, Søren B. Engelsen, and Lars O. Dragsted. 2012. “LC-MS Metabolomics Top-down Approach Reveals New Exposure and Effect Biomarkers of Apple and Apple- Intake.” Metabolomics 8:64–73. Larsson, J. 2020. “Eulerr: Area-Proportional Euler and Venn Diagrams with Ellipses. R Package Version 6.1.0.” Laurens, François, Charles-Eric Durel, Andrea Patocchi, Andreas Peil, Silvio Salvi, Stefano Tartarini, Riccardo Velasco, and Eric van de Weg. 2010. “Review on Apple Genetics and Breeding Programmes and Presentation of a New European Initiative to Increase Fruit Breedin Efficiency.” Journal of Fruit Science 27:102–7. Lee, Jangho, Hae Won Jang, Moon Cheol Jeong, Seung Ran Yoo, and Jaeho Ha. 2017. “Analysis of Volatile Compounds as Quality Indicators for Fuji Apples after Cold Storage.” Journal of Food Biochemistry 41(6):1–12. 
Lee, Jinkwood, David R. Rudell, Peter J. Davies, and Christopher B. Watkins. 2012. “Metabolic Changes in 1-Methylcyclopropene (1-MCP)-Treated ‘Empire’ Apple Fruit during Storage.” Metabolomics 8:742–53. Liu, Rui Hai, Jiaren Liu, and Bingqing Chen. 2005. “Apples Prevent Mammary Tumors in Rats.” Journal of Agricultural and Food Chemistry 53:2341–43. López-Fernández, Olalla, Rubén Domínguez, Mirian Pateiro, Paulo E. S. Munekata, Gabriele Rocchetti, and José M. Lorenzo. 2020. “Determination of Polyphenols Using Liquid Chromatography–Tandem Mass Spectrometry Technique (LC– MS/MS): A Review.” Antioxidants 9(479). Maechler, M., P. Rousseeuw, A. Struyf, M. Hubert, and K. Hornik. 2019. “Cluster: Cluster Analysis Basics and Extensions. R Package Version 2.1.0.” Makiyama, Koji. 2016. “Magicfor: Magic Functions to Obtain Results from for Loops. R Package Version 0.1.0.” Matsuoka, K. 2019. “Anthocyanins in Apple Fruit and Their Regulation for Health Benefits.” in Flavonoids - A Coloring Model for Cheering up Life, edited by F. A. 108

Badria and A. Ananga. IntechOpen. McClure, Kendra A., Yuihui Gong, Jun Song, Melinda Vinqvist-Tymchuk, Leslie Campbell Palmer, Lihua Fan, Karen Burgher-MacLellan, ZhaoQi Zhang, Jean-Marc Celton, Charles F. Forney, Zoë Migicovsky, and Sean Myles. 2019. “Genome-Wide Association Studies in Apple Reveal Loci of Large Effect Controlling Apple Polyphenols.” Horticulture Research 6(107). Merzlyak, Mark N., Alexei E. Solovchenko, Alexei I. Smagin, and Anatoly A. Gitelson. 2005. “Apple Flavonols during Fruit Adaptation to Solar Radiation: Spectral Features and Techniques for Non-Destructive Assessment.” Journal of Plant Physiology 162:151–60. De Moura, Fabiana F., Alexander Miloff, and Erick Boy. 2015. “Retention of Provitamin A Carotenoids in Staple Crops Targeted for Biofortification in Africa: Cassava, Maize and Sweet Potato.” Critical Reviews in Food Science and Nutrition 55:1246– 69. Muraki, Isao, Fumiaki Imamura, Joann E. Manson, Frank B. Hu, Walter C. Willett, Rob M. van Dam, and Qi Sun. 2013. “Fruit Consumption and Risk of Type 2 Diabetes: Results from Three Prospective Longitudinal Cohort Studies.” BMJ 347:f5001. Muro, T., Y. Noguchi, M. Morishita, K. Ito, K. Sugiyama, T. Kondo, Y. Kurenuma, and T. Ono. 2010. “‘Quer-Rich’ a New Variety of Hybrid Red Onion with High Quercetin Content.” Research Bulletin of the National Agricultural Research Center for Hokkaido Region 192:25–32. Myers, Owen D., Susan J. Sumner, Shuzhao Li, Stephen Barnes, and Xiuxia Du. 2017. “One Step Forward for Reducing False Positive and False Negative Compound Identifications from Mass Spectrometry Metabolomics Data: New Algorithms for Constructing Extracted Ion Chromatograms and Detecting Chromatographic Peaks.” Analytical Chemistry 89:8696–8703. Naqvi, S., C. Zhu, G. Farre, K. Ramessar, L. Bassie, J. Breitenbach, D. Perez Conesa, G. Ros, G. Sandmann, T. Capell, and P. Christou. 2009. 
“Transgenic Multivitamin Corn through Biofortification of Endosperm with Three Vitamins Representing Three Distinct Metabolic Pathways.” Proceedings of the National Academy of Sciences 106(19):7762–67. Nelson, N. J. 1999. “Purple Carrots, Margarine Laced with Wood Pulp? Nutraceuticals Move into the Supermarket.” Journal of the National Cancer Institute 91:755–57. Nestel, Penelope, Howarth E. Bouis, Jonnalagadda V. Meenakshi, and Wolfgang H. Pfeiffer. 2006. “Biofortification of Staple Food Crops.” Journal of Nutrition 136:1064–67. Norelli, John L., Alan L. Jones, and Herb S. Aldwinckle. 2003. “Fire Blight Management in the Twenty-First Century: Using New Technologies That Enhance Host Resistance in Apple.” Plant Disease 87(7):756–65. Norelli, John L., Michael Wisniewski, Gennaro Fazio, Erik Burchard, Benjamin Gutierrez, Elena Levin, and Samir Droby. 2017. “Genotyping-by-Sequencing Markers Facilitate the Identification of Quantitative Trait Loci Controlling Resistance to Penicillium Expansum in .” PLoS ONE 12(3):e0172949. 109

Ohio Supercomputer Center. 1987. “Ohio Supercomputer Center.” Ohio Supercomputer Center. 2016. “Owens Supercomputer.” Ortiz, Daniela and Thomas B. Shea. 2004. “Apple Juice Prevents Oxidative Stress Induced by Amyloid-Beta in Culture.” Journal of Alzheimer’s Disease 6(1):27–30. Ottaviani, Javier I., Christian Heiss, Jeremy P. E. Spencer, Malte Kelm, and Hagen Schroeter. 2018. “Recommending Flavanols and Procyanidins for Cardiovascular Health: Revisited.” Molecular Aspects of Medicine 61:63–75. De Paepe, Domien, Dirk Valkenborg, Bart Noten, Kelly Servaes, Ludo Diels, Marc De Loose, Bart Van Droogenbroeck, and Stefan Voorspoels. 2015. “Variability of the Phenolic Profiles in the Fruits from Old, Recent and New Apple Cultivars Cultivated in Belgium.” Metabolomics 11:739–52. Parisi, L., Y. Lespinasse, J. Guillaumes, and J. Krüger. 1993. “A New Race of Venturia Inaequalis Virulent to Apples with Resistance Due to the Vf Gene.” Phytopathology 83:533. Patil, B. S., K. Crosby, A. Vikram, and D. Byrne. 2012. “Breeding Vegetables and Fruits to Improve Human Health: A Collaborative Effort of Multidisciplinary Scientists, Stakeholders and Consumers Using a Systems-Based Approach.” Acta Horticulturae 939:19–32. Peace, Cameron P., Luca Bianco, Michela Troggio, Eric van de Weg, Nicholas P. Howard, Amandine Cornille, Charles-Eric Durel, Sean Myles, Zoë Migicovsky, Robert J. Schaffer, Evelyne Costes, Gennaro Fazio, Hisayo Yamane, Steve van Nocker, Chris Gottschalk, Fabrizio Costa, David Chagné, Xinzhong Zhang, Andrea Patocchi, Susan E. Gardiner, Craig Hardner, Satish Kumar, Francois Laurens, Etienne Bucher, Dorrie Main, Sook Jung, and Stijn Vanderzande. 2019. “Apple Whole Genome Sequences: Recent Advances and New Prospects.” Horticulture Research 6(59). Perez-Vizcaino, Francisco and Juan Duarte. 2010. “Flavonols and Cardiovascular Disease.” Molecular Aspects of Medicine 31(6):478–94. Perez-Vizcaino, Francisco and Cesar G. Fraga. 2018. 
“Research Trends in Flavonoids and Health.” Archives of Biochemistry and Biophysics 646:107–12. Di Pierro, Erica A., Luca Gianfranceschi, Mario Di Guardo, Herma Jj Koehorst-van Putten, Johannes W. Kruisselbrink, Sara Longhi, Michela Troggio, Luca Bianco, Hélène Muranty, Giulia Pagliarani, Stefano Tartarini, Thomas Letschka, Lidia Lozano Luis, Larisa Garkava-Gustavsson, Diego Micheletti, Marco C. A. M. Bink, Roeland E. Voorrips, Ebrahimi Aziz, Riccardo Velasco, François Laurens, and W. Eric van de Weg. 2016. “A High-Density, Multi-Parental SNP Genetic Map on Apple Validates a New Mapping Approach for Outcrossing Species.” Horticulture Research 3(16057). Pluskal, Tomáš, Sandra Castillo, Alejandro Villar-Briones, and Matej Orešič. 2010. “MZmine 2: Modular Framework for Processing, Visualizing, and Analyzing Mass Spectrometry-Based Molecular Profile Data.” BMC Bioinformatics 11. Proctor, J. T. A. 1974. “Color Stimulation in Attached Apples with Supplementary Light.” Canadian Journal of Plant Science 54:499–503. R Core Team. 2019. “R: A Language and Environment for Statistical Computing.” 110

R Development Core Team. 2008. “R: A Language and Environment for Statistical Computing.” Rana, Shalika and Shashi Bhushan. 2016. “Apple Phenolics as Nutraceuticals: Assessment, Analysis and Application.” Journal of Food Science and Technology 53(4):1727–38. Ravn-Haren, Gitte, Lars O. Dragsted, Tine Buch-Andersen, N. Jensen, Runa I. Jensen, Mária Németh-Balogh, Brigita Paulovicsová, Anders Bergström, Andrea Wilcks, Tine R. Licht, Jarosław Markowski, and Susanne Bügel. 2013. “Intake of Whole Apples or Clear Apple Juice Has Contrasting Effects on Plasma Lipids in Healthy Volunteers.” European Journal of Nutrition 52:1875–89. Remington, Ruth, Amy Chan, Alicia Lepore, Elizabeth Kotlya, and Thomas B. Shea. 2010. “Apple Juice Improved Behavioral but Not Cognitive Symptoms in Moderate- to-Late Stage Alzheimer’s Disease in an Open-Label Pilot Study.” American Journal of Alzheimer’s Disease and Other Dementias 25(4):367–71. Riester, Markus, Peter F. Stadler, and Konstantin Klemm. 2009. “FRANz: Reconstruction of Wild Multi-Generation Pedigrees.” Bioinformatics 25(16):2134– 39. Rogers, E., S. Milhalik, Daniela Ortiz, and Thomas B. Shea. 2004. “Apple Juice Prevents Oxidative Stress and Impaired Cognitive Performance Caused by Genetic and Dietary Deficiencies in Mice.” The Journal of Nutrition, Health & Aging 8(2):92– 97. RosBREED. 2018. RosBREED Annual Project Report. Saarenhovi, Maria, Pia Salo, Mika Scheinin, Jussi Lehto, Zsófia Lovró, Kirsti Tiihonen, Markus J. Lehtinen, Jouni Junnila, Oliver Hasselwander, Anneli Tarpila, and Olli T. Raitakari. 2017. “The Effect of an Apple Polyphenol Extract Rich in Epicatechin and Flavan-3-Ol Oligomers on Brachial Artery Flow-Mediated Vasodilatory Function in Volunteers with Elevated Blood Pressure.” Nutrition Journal 16(73). Saenger, Theresa, Florian Hübner, and Hans Ulrich Humpf. 2017. “Short-Term Biomarkers of Apple Consumption.” Molecular Nutrition and Food Research 61(3):1–10. Sayre, Richard, John R. Beeching, Edgar B. 
Cahoon, Chiedozie Egesi, Claude Fauquet, John Fellman, Martin Fregene, Wilhelm Gruissem, Sally Mallowa, Mark Manary, Bussie Maziya-Dixon, Ada Mbanaso, Daniel P. Schachtman, Dimuth Siritunga, Nigel Taylor, Herve Vanderschuren, and Peng Zhang. 2011. “The BioCassava Plus Program: Biofortification of Cassava for Sub-Saharan Africa.” Annual Review of Plant Biology 62:251–72. Sekhon-Loodu, Satvir, Adriana Catalli, Marianna Kulka, Yanwen Wang, Fereidoon Shahidi, and H. P. Vasanth. Rupasinghe. 2014. “Apple Flavonols and N-3 Polyunsaturated Fatty Acid-Rich Fish Oil Lowers Blood C-Reactive Protein in Rats with Hypercholesterolemia and Acute Inflammation.” Nutrition Research 34:535– 43. Serra, Michael and Thomas B. Shea. 2009. “Apple Juice Stimulates Organized Synaptic Activity in Cultured Cortical Neurons.” Current Topics in Nutraceutical Research 7(2):93–96. 111

Shim, Jee-Shim, Kyungwon Oh, and Hyeon Chang Kim. 2014. “Dietary Assessment Methods in Epidemiological Studies.” Epidemiology and Health 36:e2014009. Signorell, Andri and et. mult. al. 2020. “DescTools: Tools for Descriptive Statistics. R Package Version 0.99.35.” Song, Yiqing, JoAnn E. Manson, Julie E. Buring, Howard D. Sesso, and Simin Liu. 2005. “Associations of Dietary Flavonoids with Risk of Type 2 Diabetes, and Markers of Insulin Resistance and Systemic Inflammation in Women: A Prospective Study and Cross-Sectional Analysis.” Journal of the American College of Nutrition 24(5):376– 84. Sun, Jianghao, Wojciech J. Janisiewicz, Breyn Nichols, Wayne M. Jurick II, and Pei Chen. 2017. “Composition of Phenolic Compounds in Wild Apple with Multiple Resistance Mechanisms against Postharvest Blue Mold Decay.” Postharvest Biology and Technology 127:68–75. Tajik, Narges, Mahboubeh Tajik, Isabelle Mack, and Paul Enck. 2017. “The Potential Effects of Chlorogenic Acid, the Main Phenolic Components in Coffee, on Health: A Comprehensive Review of the Literature.” European Journal of Nutrition 56:2215–44. Takos, Adam Matthew, Benjamin Ewa Ubi, Simon Piers Robinson, and Amanda Ruth Walker. 2006. “Condensed Tannin Biosynthesis Genes Are Regulated Separately from Other Flavonoid Biosynthesis Genes in Apple Fruit Skin.” Plant Science 170:487–99. Tchantchou, Flaubert, Amy Chan, Lydia Kifle, Daniela Ortiz, and Thomas B. Shea. 2005. “Apple Juice Concentrate Prevents Oxidative Damage and Impaired Maze Performance in Aged Mice.” Journal of Alzheimer’s Disease 8:283–87. Tchantchou, Flaubert, M. Graves, D. Ortiz, E. Rogers, and Thomas B. Shea. 2004. “Dietary Supplementation with Apple Juice Concentrate Alleviates the Compensatory Increase in Glutathione Synthase Transcription and Activity That Accompanies Dietary and Genetically-Induced Oxidative Stress.” The Journal of Nutrition, Health & Aging 8(6):492–96. 
Tenore, Gian Carlo, Alfonso Carotenuto, Domenico Caruso, Giuseppe Buonomo, Maria D’Avino, Diego Brancaccio, Roberto Ciampaglia, Maria Maisto, Connie Schisano, and Ettore Novellino. 2018. “A Nutraceutical Formulation Based on Annurca Apple Polyphenolic Extract Is Effective on Intestinal Cholesterol Absorption: A Randomised, Placebo-Controlled, Crossover Study.” PharmaNutrition 6(3):85–94.

Tenore, Gian Carlo, Domenico Caruso, Giuseppe Buonomo, Maria D’Avino, Pietro Campiglia, Luciana Marinelli, and Ettore Novellino. 2017. “A Healthy Balance of Plasma Cholesterol by a Novel Annurca Apple-Based Nutraceutical Formulation: Results of a Randomized Trial.” Journal of Medicinal Food 20(3):288–300.

Tenore, Gian Carlo, Domenico Caruso, Giuseppe Buonomo, Emanuela D’Urso, Maria D’Avino, Pietro Campiglia, Luciana Marinelli, and Ettore Novellino. 2017. “Annurca (Malus Pumila Miller Cv. Annurca) Apple as a Functional Food for the Contribution to a Healthy Balance of Plasma Cholesterol Levels: Results of a Randomized Clinical Trial.” Journal of the Science of Food and Agriculture 97(7):2107–15.

Tian, Jia, Xiaoyan Wu, Moyang Zhang, Zhongyi Zhou, and Yingfeng Liu. 2018. “Comparative Study on the Effects of Apple Peel Polyphenols and Apple Flesh Polyphenols on Cardiovascular Risk Factors in Mice.” Clinical and Experimental Hypertension 40(1):65–72.

Tsao, Rong, Raymond Yang, Sheery Xie, Emily Sockovie, and Shahrokh Khanizadeh. 2005. “Which Polyphenolic Compounds Contribute to the Total Antioxidant Activities of Apple?” Journal of Agricultural and Food Chemistry 53(12):4989–95.

USDA Economic Research Service. 2017. U.S. per Capita Loss-Adjusted Fruit Availability.

Vandendriessche, Thomas, Hartmut Schäfer, Bert E. Verlinden, Eberhard Humpfer, Maarten L. A. T. M. Hertog, and Bart M. Nicolaï. 2013. “High-Throughput NMR Based Metabolic Profiling of Braeburn Apple in Relation to Internal Browning.” Postharvest Biology and Technology 80:18–24.

Vick, H., DF Diedrich, and K. Baumann. 1973. “Reevaluation of Renal Tubular Glucose Transport Inhibition by Phlorizin Analogs.” American Journal of Physiology 224(3):552–57.

Vikram, A., B. Prithiviraj, H. Hamzehzarghani, and A. C. Kushalappa. 2004. “Volatile Metabolite Profiling to Discriminate Diseases of McIntosh Apple Inoculated with Fungal Pathogens.” Journal of the Science of Food and Agriculture 84:1333–40.

Vrhovsek, Urska, Cesare Lotti, Domenico Masuero, Silvia Carlin, Georg Weingart, and Fulvio Mattivi. 2014. “Quantitative Metabolic Profiling of Grape, Apple and Raspberry Volatile Compounds (VOCs) Using a GC/MS/MS Method.” Journal of Chromatography B: Analytical Technologies in the Biomedical and Life Sciences 966:132–39.

Vrhovsek, Urska, Adelio Rigo, Diego Tonon, and Fulvio Mattivi. 2004. “Quantitation of Polyphenols in Different Apple Varieties.” Journal of Agricultural and Food Chemistry 52:6532–38.

de Vries, Andrie and Brian D. Ripley. 2016. “Ggdendro: Create Dendrograms and Tree Diagrams Using ‘Ggplot2’. R Package Version 0.1-20.”

Wang, Mingxun, Jeremy J. Carver, Vanessa V. Phelan, Laura M. Sanchez, Neha Garg, Yao Peng, Don Duy Nguyen, Jeramie Watrous, Clifford A. Kapono, Tal Luzzatto-Knaan, Carla Porto, Amina Bouslimani, Alexey V. Melnik, Michael J. Meehan, Wei Ting Liu, Max Crüsemann, Paul D. Boudreau, Eduardo Esquenazi, Mario Sandoval-Calderón, Roland D. Kersten, Laura A. Pace, Robert A. Quinn, Katherine R. Duncan, Cheng Chih Hsu, Dimitrios J. Floros, Ronnie G. Gavilan, Karin Kleigrewe, Trent Northen, Rachel J. Dutton, Delphine Parrot, Erin E. Carlson, Bertrand Aigle, Charlotte F. Michelsen, Lars Jelsbak, Christian Sohlenkamp, Pavel Pevzner, Edlund, Jeffrey McLean, Jörn Piel, Brian T. Murphy, Lena Gerwick, Chih Chuang Liaw, Yu Liang Yang, Hans Ulrich Humpf, Maria Maansson, Robert A. Keyzers, Amy C. Sims, Andrew R. Johnson, Ashley M. Sidebottom, Brian E. Sedio, Andreas Klitgaard, Charles B. Larson, Cristopher A. P. Boya, Daniel Torres-Mendoza, David J. Gonzalez, Denise B. Silva, Lucas M. Marques, Daniel P. Demarque, Egle Pociute, Ellis C. O’Neill, Enora Briand, Eric J. N. Helfrich, Eve A. Granatosky, Evgenia Glukhov, Florian Ryffel, Hailey Houson, Hosein Mohimani, Jenan J. Kharbush, Yi Zeng, Julia A. Vorholt, Kenji L. Kurita, Pep Charusanti, Kerry L. McPhail, Kristian Fog Nielsen, Lisa Vuong, Maryam Elfeki, Matthew F. Traxler, Niclas Engene, Nobuhiro Koyama, Oliver B. Vining, Ralph Baric, Ricardo R. Silva, Samantha J. Mascuch, Sophie Tomasi, Stefan Jenkins, Venkat Macherla, Thomas Hoffman, Vinayak Agarwal, Philip G. Williams, Jingqui Dai, Ram Neupane, Joshua Gurr, Andrés M. C. Rodríguez, Anne Lamsa, Chen Zhang, Kathleen Dorrestein, Brendan M. Duggan, Jehad Almaliti, Pierre Marie Allard, Prasad Phapale, Louis Felix Nothias, Theodore Alexandrov, Marc Litaudon, Jean Luc Wolfender, Jennifer E. Kyle, Thomas O. Metz, Tyler Peryea, Dac Trung Nguyen, Danielle VanLeer, Paul Shinn, Ajit Jadhav, Rolf Müller, Katrina M. Waters, Wenyuan Shi, Xueting Liu, Lixin Zhang, Rob Knight, Paul R. Jensen, Bernhard Palsson, Kit Pogliano, Roger G. Linington, Marcelino Gutiérrez, Norberto P. Lopes, William H. Gerwick, Bradley S. Moore, Pieter C. Dorrestein, and Nuno Bandeira. 2016. “Sharing and Community Curation of Mass Spectrometry Data with Global Natural Products Social Molecular Networking.” Nature Biotechnology 34(8):828–37.

Way, RD, Herb S. Aldwinckle, RC Lamb, A. Rejman, S. Sansavini, T. Shen, R. Watkins, MN Westwood, and Y. Yoshida. 1991. “Apples (Malus).” Acta Horticulturae 290:3–46.

van de Weg, W. E., R. E. Voorrips, R. Finkers, L. P. Kodde, J. Jansen, and M. C. A. M. Bink. 2004. “Pedigree Genotyping: A New Pedigree-Based Approach of QTL Identification and Allele Mining.” Acta Horticulturae 663:45–50.

Wickham, H. 2016. “Ggplot2: Elegant Graphics for Data Analysis.”

Wickham, Hadley. 2007. “Reshaping Data with the Reshape Package.” Journal of Statistical Software 21(12):1–20.

Wickham, Hadley, Mara Averick, Jennifer Bryan, Chang, Lucy McGowan, Romain François, Garrett Grolemund, Alex Hayes, Lionel Henry, Jim Hester, Max Kuhn, Thomas Pedersen, Evan Miller, Stephan Bache, Kirill Müller, Jeroen Ooms, David Robinson, Dana Seidel, Vitalie Spinu, Kohske Takahashi, Davis Vaughan, Claus Wilke, Kara Woo, and Hiroaki Yutani. 2019. “Welcome to the Tidyverse.” Journal of Open Source Software 4(43):1686.

Wishart, David S., Yannick Djoumbou Feunang, Ana Marcu, An Chi Guo, Kevin Liang, Rosa Vázquez-Fresno, Tanvir Sajed, Daniel Johnson, Carin Li, Naama Karu, Zinat Sayeeda, Elvis Lo, Nazanin Assempour, Mark Berjanskii, Sandeep Singhal, David Arndt, Yonjie Liang, Hasan Badran, Jason Grant, Arnau Serra-Cayuela, Yifeng Liu, Rupa Mandal, Vanessa Neveu, Allison Pon, Craig Knox, Michael Wilson, Claudine Manach, and Augustin Scalbert. 2018. “HMDB 4.0: The Human Metabolome Database for 2018.” Nucleic Acids Research 46:D608–17.

Wolfe, Kelly, Xianzhong Wu, and Rui Hai Liu. 2003. “Antioxidant Activity of Apple Peels.” Journal of Agricultural and Food Chemistry 51:609–14.

Xiong, Jin Song, Jing Ding, and Yi Li. 2015. “Genome-Editing Technologies and Their Potential Application in Horticultural Crop Breeding.” Horticulture Research 2:15019.

Ye, X., S. Al-Babili, A. Klöti, J. Zhang, P. Lucca, P. Beyer, and I. Potrykus. 2002. “Engineering the Provitamin A (β-Carotene) Biosynthetic Pathway into (Carotenoid-Free) Rice Endosperm.” Science 287:303–5.

Zhang, ManMan, ZengHui Wang, YunFei Mao, YanLi Hu, Lu Yang, YunYun Wang, LuLu Zhang, and Xiang Shen. 2019. “Effects of Quince Pollen Pollination on Fruit Qualities and Phenolic Substance Contents of Apples.” Scientia Horticulturae 256(108628).

Zhu, Yanmin, Sungbong Shin, and Mark Mazzola. 2016. “Genotype Responses of Two Apple Rootstocks to Infection by Pythium Ultimum Causing Apple Replant Disease.” Canadian Journal of Plant Pathology 38(4):483–91.

Appendix A: Metabolomics-Related Documents

A.1 LC-MS

A.1.1 ProteoWizard msConvert Parameters

A.1.1.1 Full Scan

A.1.1.2 Iterative MS/MS

A.1.2 MZmine 2.51 Parameters for Full Scan LC-MS Data

A.1.3 Excel Feature Post-Processing

A.1.4 GNPS Classical Molecular Networking Parameters for Iterative MS/MS Data

A.2 NMR

A.2.1 CombiDancer Extract Drying Parameters

A.2.2 mrbin Code for Spectral Binning

A.3 Data Visualization

A.3.1 PCA

Appendix B: Omics Integration-Related Documents

B.1 mGWAS

B.1.1 AGHmatrix Kinship Matrix

B.1.2 SNP Principal Components Analysis and Elbow Plots

B.1.3 Dividing SNPs by Chromosome

B.1.4 Unix Batch Script Code for mGWAS using rrBLUP in OSC

B.1.5 File Import and Export for OSC and Results Compilation with Command Line in Terminal

B.1.6 Filtering for Significant SNP-Feature Associations in R

B.1.7 Venning Significant Features from mGWAS of Three Populations

B.1.8 Visualizing Complementary Omics Datasets

B.1.8.1 Number of mQTL per Chromosome

B.1.8.2 Significant SNP-Feature Association Composite Chromosome Map

B.1.8.3 Number of Significantly Associated Features per SNP

B.1.8.4 Presence-Absence Heatmap and Hierarchical Clustering for SNP-Feature Associations

B.1.8.5 Significant SNP-Bin Associations across NMR Spectra

B.1.9 SNP Names Conversion Reference
