H OH metabolites OH

Article Traceability of “Tuscan PGI” Extra Virgin Olive Oils by 1H NMR Metabolic Profiles Collection and Analysis

Chiara Roberta Girelli 1 , Laura Del Coco 1 , Samanta Zelasco 2, Amelia Salimonti 2, Francesca Luisa Conforti 3 , Andrea Biagianti 4, Daniele Barbini 4 and Francesco Paolo Fanizzi 1,*

1 Department of Biological and Environmental Sciences and Technologies, University of Salento, Prov.le Lecce-Monteroni, 73100 Lecce, ; [email protected] (C.R.G.); [email protected] (L.D.C.) 2 Council for Agricultural Research and Economics–Research Centre for Olive, Citrus and Tree Fruit C. da Rocchi, 87036 Rende (CS), Italy; [email protected] (S.Z.); [email protected] (A.S.) 3 CNR-Institute of Neurological Sciences, Località Burga, Piano Lago, 87050 Mangone (CS), Italy; [email protected] 4 Certified Origins Italia srl, Località il Madonnino, 58100 , Italy; andrea.biagianti@oleificioolma.it (A.B.); daniele.barbini@oleificioolma.it (D.B.) * Correspondence: [email protected]; Tel.: +39-0832-29265; Fax: +39-0832-298626

 Received: 11 September 2018; Accepted: 28 September 2018; Published: 30 September 2018 

Abstract: According to Coldiretti, Italy still continues to hold the European Quality record in extra virgin olive oils with origin designation and protected geographical indication (PDO and PGI). To date, 46 Italian brands are recognized by the European Union: 42 PDO and 4 PGI (Tuscan PGI, Calabria PGI; Tuscia PGI and PGI Sicily). Specific regulations, introduced for these quality marks, include the designation of both the geographical areas and the plant varieties contributing to the composition of the olive oil. However, the PDO and PGI assessment procedures are currently based essentially on farmer declarations. Tuscan PGI extra virgin olive oil is one of the best known Italian trademarks around the world. Tuscan PGI varietal platform is rather wide including 31 specific olive cultivars which should account for at least 95% of the product. On the other hand, while the characteristics of other popular Italian extra virgin olive oils (EVOOs) cultivars from specific geographical areas have been extensively studied (such as those of Coratina based blends from Apulia), little is still known about Tuscan PGI EVOO constituents. In this work, we performed, for the first time, a large-scale analysis of Tuscan PGI monocultivar olive oils by 1H NMR spectroscopy and multivariate statistical analyses (MVA). After genetic characterization of 217 leaf samples from 24 selected geographical areas, distributed all over the , a number of 202 micro-milled oil samples including 10 PGI cultivars, was studied. The results of the present work confirmed the need of monocultivar genetically certified EVOO samples for the construction of 1H-NMR-metabolic profiles databases suitable for cultivar and/or geographical origin assessment. Such specific PGI EVOOs databases could be profitably used to justify the high added value of the product and the sustainability of the related supply chain.

Keywords: chemometrics; protected designation of origin (PDO); protected geographical indication (PGI); extra virgin olive oil (EVOO); Nuclear Magnetic Resonance (NMR); genetic analysis

Metabolites 2018, 8, 60; doi:10.3390/metabo8040060 www.mdpi.com/journal/metabolites Metabolites 2018, 8, 60 2 of 17

1. Introduction Originally known as oleaster, the olive tree appeared more than 6000 years B.C. for the first time in Asia Minor and successively diffused in the countries of the Mediterranean basin [1]. To date, olive groves spread still continue all around the world and extra virgin olive oil (EVOO), in particular, remains undoubtedly the most important production of Mediterranean countries, due to its nutraceutical, antioxidant and other well-known health properties [2]. Among other vegetable oils, EVOO is a premium price product in the national and international market, leading to the risk of adulteration and mislabelling [3]. Thus, EVOO authenticity and traceability are important for consumer health and commercial purposes. Severe standards on olive oil production, origin and labelling have been established by European Community (EC) Council of Regulation [4]. In particular, traceability was defined as “the ability to trace and follow a food, feed, food-producing animal or substance intended to be, or expected to be incorporated into a food or feed, through all stages of production, processing and distribution” [5]. In order to improve and protect the high-quality products from a particular origin, the European Regulation (EC, 1992, EC, 2006), established rules on the Protection of Designations of Origin (PDO) and Protected Geographical Indications (PGI) of agricultural products and foodstuff [6,7]. In 2012, the EU Regulation 1151/2012 introduced new guidelines on quality system for agricultural product including PDO, PGI and TSG (Traditional Specialty Guaranteed) schemes. For their well-defined geographic origin, some olive cultivars are recognized as higher quality agricultural products and included in the PDO/PGI labelling [8–10]. For Regulation purposes (EU 510/06, Article 2), both the PDO and PGI indicate “the name of a region, a specific place or, in exceptional cases, a country, used to describe an agricultural product or a foodstuff: originating in that region, specific place or country” [6]. They differ in the quality definition, being for PDO: “the quality or characteristics of which are essentially or exclusively due to a particular geographical environment with its inherent natural and human factors” and for PGI “specific quality, reputation or other characteristics attributable to that geographical origin”. Moreover, PGI definition is assigned to agricultural and food products whose at least one stage of the production process must be performed within the defined geographical area. On the contrary, for PDO, the entire production cycle must be carried out in a specific territory [6,10]. Thus, PGI labelling focuses on quality, reputation and specific characteristics related to the geographical origin [10]. To date, world olive oil production is concentrated (98%) in the Mediterranean basin and, in particular, Spain (45%) and Italy (15%) [11]. Together with Spain, Italy accounts for almost all world exports (60% Spain and 20% Italy). Moreover, Italy still continues to hold the European quality record in EVOOs with 40% of protected designation origin and protected geographical indication (PDO and PGI) [12]. In particular, 46 Italian brands are recognized by the European Union, distinguished in 42 PDO and 4 PGI (Tuscan PGI, Calabria PGI, Tuscia PGI and PGI Sicily). Tuscan PGI extra virgin olive oil is one of the best known Italian trademarks around the world, increasingly diffused and commercialized, especially in the U.S.A. market. Tuscan PGI varietal platform is rather wide including 31 specific olive cultivars, which should account for at least 95% of the product [13]. On the other hand, while the characteristics of other popular Italian EVOO cultivars from specific geographical areas have been extensively studied (such as Coratina based blends and “Terra di Bari” PDO EVOOs from Apulia) [14–17], little is still known about Tuscan PGI EVOO constituents. Combination of environmental aspects with olive cultivars genetic characteristics resulted in the metabolic profile of a specific product. It is largely known, that oil characteristics, such as fatty acid composition, minor and volatile compounds, organoleptic and nutraceutical properties, are strictly related with genetic patrimony. Nevertheless, they also depend on environmental conditions, agronomical practises and local adaptation of the different olive growing [1]. Besides official methods of analysis to be used as reference in defining satisfactory physical and chemical characteristics of EVOOs [18], several alternative methodologies have been also proposed for determining oils profiles. Thanks to their screening potential, several instrumental techniques have been used for this purpose, such as GS-MS, UV-VIS, Raman, NIR, Mass, NMR and others [19]. These latter (Mass and NMR), being the most successfully high-throughput among the spectroscopic Metabolites 2018, 8, 60 3 of 17 techniques, have been widely used for food screening and, in particular, for characterization of olive oil authenticity, adulteration and traceability [3,20–23]. In this work we performed, for the first time, a large-scale analysis of Tuscan PGI monocultivar olive oils by 1H NMR spectroscopy and multivariate statistical analyses (MVA). A number of 217 leaf samples were collected from 24 selected geographical areas, distributed all over the Tuscany and at first genetically characterized using a set of 10 microsatellite markers (SSRs). Molecular analysis revealed a ~93% varietal correspondence of oils (202 on 217 accessions) with 10 Tuscan PGI cultivars. Thus, 202 oil samples were obtained by micro-milling from olives collected from genetically certified localized trees. The aim of this work is to verify the possibility of assessing PGI classification by using 1H NMR metabolic profiles databases and MVA (multivariate analysis) beside farmers declarations. This work could also provide a contribution to support the extra virgin olive oil based economy of the local region. A scientifically certified quality could validate the high added value of the product, promoting its use by end customers (both in Italy and on the foreign markets) and buttressing the sustainability of the related supply chain.

2. Results and Discussion A preliminary genetic characterisation of the plant material (217 olive leaf samples, Table S1) was successfully performed in order to check the correct cultivar declaration of the samples analysed. Genetic analysis revealed that 157 samples were correctly assigned to the declared cultivars, (Table S2) even if a few cases of intra-cultivar variation was shown (Table S3). A number of 45 samples declared has been reassigned to Tuscan cultivars (Table S2). It is worth noting that the cultivars declared “Morchiaio” corresponded to the Tuscan variety called “Giogolino” which is not included in the Tuscan PGI specification (Tables S3 and S4). Overall, 202 accessions corresponded to Tuscan PGI varietal platform while 12 varieties were found to be outside of the composition of the Tuscan PGI production disciplinary and 3 accessions were not identified (Table S2). In particular, two unidentified accessions were genetically identical each other for the 10 SSR markers used and they could be to consider known variety (Table S3). The CREA-OFA internal database includes all the Tuscan PGI olive varieties except for two —named “Scarlinese” and “Melaiolo”—making impossible the genetic identification of these cultivars. The main Tuscan PGI varieties found were Frantoio, Leccino, Moraiolo, Pendolino followed by Maurino, Leccio del Corno while not more than 1 accession of minor varieties such as Mignolo, Mignolo cerretano, Olivastra Seggianese, Rossellino was found. The Tuscan PGI specification includes 31 specific olive cultivars, which should account for at least 95% of the product. In this study, we found in a sample of 217 accessions collected in different geographical areas of Tuscany, a correct PGI varietal composition of about 90% of accessions. All the varieties not belonging to Tuscan PGI specification were discarded. In order to investigate the general trend of data grouping, the whole NMR dataset (obtained from 202 micro-milled olive samples) was studied. An explorative unsupervised PCA analysis of the NMR dataset was first performed. The PCA model obtained with five components gave R2X = 0.89 and Q2 = 0.77). The PCA t[1]/t[2] scores plot for the model, with samples labelled according to the declared cultivars (Figure1a) or geographical origin (Figure1b) showed that no specific clustering among samples could be observed. Metabolites 2018, 8, 60 4 of 17 Metabolites 2018, 8, 60 4 of 17

0.4 Maurino

0.3 Razzaio Moraiolo 0.2 Frantoio Leccino 0.1 Leccio del Corno 0 Maremmano t[2] Morchiaio -0.1 Pendolino -0.2 Rossellino Olivastra Seggianese -0.3 Lazzero

-0.4 Mignolo Cerretano

-0.5 -0.5 -0.4 -0.3 -0.2 -0.1 0 0.1 0.2 0.3 0.4 t[1] a) Bassa Maremma‐ Capalbio ‐ North Grosseto‐ Hillslope ‐ Grosseto‐ Hill country Campagnatico‐ Hinterland‐Hill country Castiglione della Pescaia‐ Grosseto‐ Hillslope Castiglione della Pescaia ‐ Grosseto‐ Flat area Colline‐ Metallifere‐ Massa Marittima Basso Merse ‐ Hill country Montalbano Monte Amiata Tufo‐ Pitigliano Tufo‐ Pomonte‐ Hillslope‐ Saturnia Pistoia Province Province Follonica Grancia‐ South Grosseto Cecina/San Vincenzo Coast Follonica/Piombino Coast Montelattaia‐Madonnino‐Grosseto Monti dell’Uccellina ‐ Scansano‐ Hill country b) Rosignano Solvay San Casciano Valdi Pesa‐ Montelupo Fiorentino Va l l e r ot a na ‐ Roselle Terme‐Grosseto

Figure 1. Principal Component Analysis (PCA) t[1]/t[2] scores plot (t[1] and t[2] explain 56.7% and 14.5%Figure of the 1. Principal total variance, Component respectively) Analysis for (PCA) micro-milled t[1]/t[2] scores olive oilplot samples (t[1] and labelled t[2] explain according 56.7% to and declared14.5% cultivarof the total (a) and variance, geographical respectively) area (b ).for micro‐milled olive oil samples labelled according to declared cultivar (a) and geographical area (b). Nevertheless, a rough separation of samples in two main groups, apparently independent on both declaredNevertheless, cultivar or geographical a rough separation origin, was of observed,samples in specifically two main along groups, the apparently first principal independent component on t[1].both In order declared to identify cultivar a possibleor geographical clustering origin, among was samples observed, according specifically to cultivar, along an the unsupervised first principal 2 2 analysiscomponent (PCA fivet[1]. componentsIn order to identify give R X a = possible 0.89; Q clustering= 0.73) was among performed, samples by consideringaccording to only cultivar, olive an oilunsupervised samples belonging analysis to the (PCA four five main components cultivars (those give present R2X = in0.89; a representative Q2 = 0.73) was number performed, for each by cultivarconsidering and therefore only olive statistically oil samples more significant): belonging Moraiolo,to the four Frantoio, main Leccinocultivars and (those Pendolino. present Also in a in thisrepresentative case, the PCA number scores for plot each showed cultivar a dispersion and therefore of the samples statistically without more any significant): specific separation Moraiolo, amongFrantoio, the cultivars. Leccino and However, Pendolino. the Also existence in this of case, two macro-groupsthe PCA scores discriminated plot showed a along dispersion the main of the componentsamples without was confirmed. any specific A more separation compact among group the was cultivars. found at However, negative values the existence of PC1 component,of two macro‐ distinctgroups from discriminated a dispersed along macro the area, main this component last at positive was confirmed. values of PC1 A more (Figure compact2a). The group supervised was found 2 2 2 OPLS-DAat negative analysis values (3 of + 3PC1 + 0; component, R X = 0.91; RdistinctY = 0.24; from Q a =dispersed 0.13) did macro not improve area, this the last separation at positive among values 2 theof cultivars PC1 (Figure and the 2a). resulting The supervised model lacked OPLS in‐DA a significant analysis (3 predictive + 3 + 0; R capability2X = 0.91; R (Q2Y = 0.13)0.24; (FigureQ2 = 0.13)2b). did not improve the separation among the cultivars and the resulting model lacked in a significant predictive capability (Q2 = 0.13) (Figure 2b). Metabolites 2018, 8, 60 5 of 17 Metabolites 2018, 8, 60 5 of 17 Metabolites 2018, 8, 60 5 of 17

a) b) a) b) Figure 2. (a) PCA t[1]/t[2] (t[1] and t[2] explain 57.3% and 14.6% of the total variance, respectively) Figureand ( 2.b) ( aOPLS) PCA‐DA t[1]/t[2]t[1]/t[2] (3 + 3 (t[1] (t[1]+ 0; andR and2X =t[2] t[2] 0.91; explain explain R2Y = 57.3% 57.3% 0.24; Qand and2 = 0.13) 14.6% t[1]/t[2] of the total scores variance, plots for respectively) main cultivar 2 2 2 andmicro (b) OPLS-DAOPLS‐milled‐DA olive (3(3 +oil+ 33 samples. ++ 0;0; RR2X == 0.91;0.91; RR2Y = 0.24; Q 2 = 0.13) t[1]/t[2] scores scores plots plots for main cultivar micro-milledmicro‐milled olive oil samples. It should be noted that, a different behaviour was found in the case of four of the main cultivars fromIt should Apulia be Region noted that, (Southern a different Italy) behaviour [16]. A clear was founddiscrimination in the case among of four the of themost main popular cultivars olive fromcultivar Apulia of the Region Apulia (Southern(Southern region, Italy)Italy)Coratina [[16].16]. and A A clear three discrimination popular local cultivarsamong the used most as “sweeteners” popular olive in cultivarCoratina of ‐ thebased Apulia blends region, (Ogliarola, Coratina Cima and di Mola three and popular Peranzana) local cultivars from the used Bari and as “sweeteners” Foggia provinces in Coratina-basedCoratina(Apulia‐based region, blends Southern (Ogliarola, Italy) was Cima observed di Mola in and two Peranzana) different harvesting from the Bari years and [15,24]. Foggia With provinces the aim (Apuliato obtain region,region, similar SouthernSouthern results Italy) Italy) also was wasin the observed observed case of in inPGI two two Tuscan different different oils, harvesting harvesting the analysis years years was [15 [15,24].,24 focused]. With With theof thethe aim aim main to obtainto referenceobtain similar similar areas, results resultswhich also werealso in the in represented case the ofcase PGI of Tuscanwith PGI at Tuscan oils,least the 9 oils,samples analysis the analysisper was area. focused was Despite offocused the the main sampleof the reference mainclasses areas,referencereduction, which areas, werethe which corresponding represented were represented with unsupervised at least with 9 samples at PCA least model 9 per samples area. (5 components, Despite per area. the Despite sample R2X = the 0.93 classes sample and reduction, Q classes2 = 0.82) 2 2 thereduction,showed corresponding againthe corresponding and unsupervised even more unsupervised PCA clearly model the (5PCAexistence components, model of (5a samplecomponents, R X = 0.93 clustering and R2 QX =in= 0.93 0.82)two and showeddifferent Q2 = again 0.82)macro andshowedgroups, even again more specifically and clearly even along the more existence the clearlyfirst of principal athe sample existence component clustering of a samplet[1] in two (Figure differentclustering 3) . macro in two groups, different specifically macro alonggroups, the specifically first principal along component the first principal t[1] (Figure component3). t[1] (Figure 3) .

Cecina /San Vincenzo Coast

CecinaSan Casciano/San Vincenzo Val di Coast Pesa/ Montelupo Fiorentino SanSiena Casciano Province Val di Pesa/ Montelupo Fiorentino SienaBassa Province Maremma COMPACT BassaMonti Maremma dell’Uccellina Montalbano COMPACT Monti dell’Uccellina MontalbanoFollonica FollonicaColline Metallifere ‐Massa Marittima

t[2] Colline Metallifere ‐Massa Marittima t[2]

SCATTERED SCATTERED

FigureFigure 3. 3.PCA PCA t[1]/t[2] t[1]/t[2] scores scores plot plot for for micro-milled micro‐milled olive olive oil oil samples samples from from the the main main geographical geographical origin area (t[1] and t[2] explain 70% and 9% of the total variance, respectively). Compact and scattered Figureorigin 3. areaPCA (t[1] t[1]/t[2] and scorest[2] explain plot for 70% micro and‐milled 9% of olive the totaloil samples variance, from respectively). the main geographical Compact and macro groups were identified by green and red circles respectively. originscattered area macro(t[1] and groups t[2] wereexplain identified 70% and by 9%green of andthe redtotal circles variance, respectively. respectively). Compact and scattered macro groups were identified by green and red circles respectively. Metabolites 2018, 8, 60 6 of 17 Metabolites 2018, 8, 60 6 of 17

A compact group was observed at negative values of t[1], which was clearly distinct from the other scattered cluster, found at positive values of thethe samesame principalprincipal componentcomponent t[1]. In the first first group (compact (compact macro macro area) area) it it was was possible possible to to identify identify some some geographical geographical areas areas of origin of origin for forthe theoil samples:oil samples: Montalbano Montalbano (30 (30samples), samples), Cecina Cecina San San Vincenzo Vincenzo Coast Coast (25 (25samples), samples), Bassa Bassa Maremma Maremma di Capalbiodi Capalbio (10 (10 samples), samples), Monti Monti dell’ dell’ Uccellina Uccellina (10 (10 samples), samples), Follonica Follonica (5 (5 samples) samples) areas areas and and San Casciano Val di Pesa-MontelupoPesa‐Montelupo Fiorentino (4 samples). On On the the contrary, contrary, at positive values of the firstfirst component t[1],t[1], thethe scatteredscattered cluster cluster consisted consisted of of oils oils from from olives olives collected collected in in San San Casciano Casciano Val Val di diPesa/Montelupo Pesa/Montelupo Fiorentino Fiorentino (15 samples), (15 samples), Siena province Siena (13 province samples) and(13 Collinesamples) Metallifere/Massa and Colline Metallifere/MassaMarittima (16 samples) Marittima areas, (16 Follonica samples) (4 samples)areas, Follonica and Montalbano (4 samples) (4 and samples). Montalbano It should (4 samples). be noted Itthat should oils frombe noted Cecina that San oils Vincenzofrom Cecina Coast, San BassaVincenzo Maremma Coast, Bassa di Capalbio Maremma and di Monti Capalbio dell’ and Uccellina Monti weredell’ Uccellina observed were only observed in the compact only in the macroarea, compact macroarea, while samples while from samples Siena from province Siena province and Colline and CollineMetallifere-Massa Metallifere Marittima‐Massa Marittima were identified were identified only in theonly scattered in the scattered area (Figure area4 (Figure). 4).

COMPACT GROUP Colline Metallifere ‐Massa Marittima

SCATTERED GROUP Follonica

Montalbano

Monti dellʹUccellina

Bassa Maremma ‐ Capalbio

Siena Province

San Casciano Va l di Pesa‐ Montelupo fiorentino

Cecina ‐ San Vincenzo coast

40 35 30 25 20 15 10 5 0

Figure 4.4. BarsBars chart chart representing representing percentage percentage distribution distribution of samples of samples from from each maineach main geographical geographical origin originareas into areas two into identified two identified macro groups. macro Green groups. and redGreen rectangles and red indicate rectangles percentage indicate contribution percentage to contributionthe compact andto the scattered compact macro-groups and scattered respectively. macro‐groups X axis respectively. reported theX axis number reported of samples. the number of samples. Therefore, by considering oil samples from the main reference areas, at least a certain degree of separationTherefore, among by the considering PGI Tuscan oil oils samples on the from basis the of the main geographical reference originareas, couldat least be a obtained. certain degree In order of separationto deeply analyse among thisthe samplesPGI Tuscan distribution, oils on the unsupervised basis of the geographical and supervised origin analyses could were be obtained. performed In orderagain, byto consideringdeeply analyse now this separately samples the distribution, identified macro-groups. unsupervised The and OPLS-DA supervised supervised analyses analysis, were performedbuilt with the again, samples by belongingconsidering to thenow compact separately macro-group the identified and considering macro‐groups. the most The representative OPLS‐DA supervisedcultivars (Frantoio, analysis, Moraiolo,built with Leccino),the samples gave belonging a good to descriptive the compact and macro predictive‐group model and considering (2 + 4 + 0 thecomponents, most representative R2X = 0.83, R 2cultivarsY = 0.74 and(Frantoio, Q2 = 0.57) Moraiolo, (Figure 5a)Leccino), revealing gave a certain a good degree descriptive of separation and predictiveamong the model main representative(2 + 4 + 0 components, cultivars. R2 InX = particular, 0.83, R2Y = Leccino 0.74 and and Q2 Frantoio = 0.57) (Figure samples 5a) were revealing clearly a certaindistinct degree from each of separation other along among the first the predictive main representative component cultivars. (t[1]), while In particular, the Moraiolo Leccino class wasand Frantoiofound along samples the second were clearly component distinct t[2] from and each located other in along a central the positionfirst predictive of the graph,component differently (t[1]), whilefrom thethe otherMoraiolo two class oil groups. was found Pendolino along the oil second samples component were excluded t[2] and from located the modelin a central because position they wereof the too graph, scattered. differently The corresponding from the other loading two oil plot groups. for the Pendolino model allowed oil samples to highlight were excluded the molecular from thecomponents model because responsible they were for thetoo classscattered. separation. The corresponding In particular, loading Leccino plot cultivar for the was model characterized allowed to highlightby a high the content molecular of saturated components fatty acids responsible (1.26 ppm), for whilethe class a high separation. content of In oleic particular, acid (1.30 Leccino ppm) cultivar was characterized by a high content of saturated fatty acids (1.26 ppm), while a high content of oleic acid (1.30 ppm) characterized the Frantoio class. Finally, the polyunsaturated fatty acids Metabolites 8 Metabolites 20182018,, 8,, 60 60 77 ofof 1717 Metabolites 2018, 8, 60 7 of 17 (PUFA) (2.06, 2.78, 1.58 ppm) were responsible for separation of the Moraiolo class (Figure 5b) from characterized(PUFA) (2.06, the2.78, Frantoio 1.58 ppm) class. were Finally, responsible the polyunsaturated for separation fatty of the acids Moraiolo (PUFA) class (2.06, (Figure 2.78, 1.58 5b) ppm)from the other two classes. werethe other responsible two classes. for separation of the Moraiolo class (Figure5b) from the other two classes.

0.015 0.1 0.015 0.1 0.01 2,06 1,58 0.01 2,06 1,58 0.05 2,78 0.005 2,78 0.05 0.005 1,38 1,38 0 5,34 0 0 5,34 0 1,34 0,9 1,22 -0.005 1,34 0,9 2,02 1,22 2,02 -0.05 -0.005 1,3 -0.05 -0.01 1,3 -0.01

-0.1 -0.015 -0.1 -0.015

-0.02 -0.15 -0.02 -0.015 -0.01 -0.005 0 0.005 0.01 0.015 -0.15 -0.1 -0.05 0 0.05 0.1 -0.02 -0.15 -0.02 -0.015 -0.01 -0.0050.0387092 0 * pq[1] 0.005 0.01 0.015 -0.15 -0.1 -0.051.00355 0 * t[1] 0.05 0.1 0.0387092 * pq[1] 1.00355 * t[1] a) b) a) b) 2 2 2 Figure 5. (a) OPLS‐DA (2 + 4 + 0 components give R X2 = 0.83, R Y =2 0.74, Q = 0.57)2 scores plot for main Figure 5. (a) OPLS OPLS-DA‐DA (2 (2 + + 4 4 + + 0 0components components give give R2 RX =X 0.83, = 0.83, R2Y R = Y0.74, = 0.74, Q2 = Q 0.57)= 0.57)scores scores plot for plot main for cultivar micro‐milled olive oil samples from the observed compact macro group. (b) Loading scatter maincultivar cultivar micro micro-milled‐milled olive oil olive samples oil samples from the from observed the observed compact compact macro macrogroup. group. (b) Loading (b) Loading scatter plot for the model indicating the molecular component responsible for the cultivar separation. NMR scatterplot for plot the formodel the modelindicating indicating the molecular the molecular component component responsible responsible for the for cultivar the cultivar separation. separation. NMR spectra with detailed assignment of discriminating metabolites are reported as Figure S1 in SI. NMRspectra spectra with detailed with detailed assignment assignment of discriminating of discriminating metabolites metabolites are reported are reported as Figure as Figure S1 in S1SI. in SI. The OPLS‐DA model built using the most representative olive cultivars of the compact macro‐ The OPLS OPLS-DA‐DA model model built built using using the the most most representative representative olive olive cultivars cultivars of the ofcompact the compact macro‐ group (Leccino, Frantoio and Moraiolo) was also successfully used as a prediction model, in order to macro-groupgroup (Leccino, (Leccino, Frantoio Frantoio and Moraiolo) and Moraiolo) was also wassuccessfully also successfully used as a prediction used as a predictionmodel, in order model, to assign some test samples (specifically monovarietal oils), supplied by the provider. From a simple inassign order some to assign test samples some test (specifically samples (specifically monovarietal monovarietal oils), supplied oils), supplied by the provider. by the provider. From a Fromsimple a visual inspection of the OPLS‐DA predicted scores plot, it was possible to correctly assign the oil test simplevisual inspection visual inspection of the OPLS of the‐DA OPLS-DA predicted predicted scores plot, scores it plot,was possible it was possible to correctly to correctly assign the assign oil test the samples in the model (Figure 6). oilsamples test samples in the model in the model(Figure (Figure 6). 6). 1.01275 * tPS[2] 1.01275 * tPS[2]

2 2 2 FigureFigure 6.6. OPLS-DAOPLS‐DA (2 (2 + + 44 ++ 00 componentscomponents give give R R2XX == 0.83,0.83, RR2Y == 0.74, Q 2 = 0.57) predicted scores plotplot Figure 6. OPLS‐DA (2 + 4 + 0 components give R2X = 0.83, R2Y = 0.74, Q2 = 0.57) predicted scores plot forfor mainmain PGIPGI cultivarcultivar micro-milledmicro‐milled oliveolive oiloil fromfrom thethe observedobserved compactcompact macro-group.macro‐group. TheThe predictedpredicted for main PGI cultivar micro‐milled olive oil from the observed compact macro‐group. The predicted samplessamples areare indicatedindicated as as five five points points stars stars coloured coloured as as declared declared cultivar cultivar oils. oils. samples are indicated as five points stars coloured as declared cultivar oils. Moreover, this was confirmed by analyzing the confusion matrix for the prediction model (Table1), Moreover, this was confirmed by analyzing the confusion matrix for the prediction model (Table in whichMoreover, the correctly this was classified confirmed samples by analyzing in the predictionthe confusion set matrix were shown. for the prediction Therefore, model in principle, (Table 1), in which the correctly classified samples in the prediction set were shown. Therefore, in principle, 1), in which the correctly classified samples in the prediction set were shown. Therefore, in principle, as already observed for 100% Italian [15] and PDO EVOOs [14] this model could be also profitably as already observed for 100% Italian [15] and PDO EVOOs [14] this model could be also profitably Metabolites 2018, 8, 60 8 of 17

used in order to assess blends of the specific cultivars originating from these specific geographical Metabolites 2018, 8, 60 8 of 17 areas. as alreadyTable observed 1. Misclassification for 100% Italiantable for [15 the] and model. PDO Y EVOOs predicted [14 value] this estimates model could class be affiliation also profitably and the used in orderlimit to of assess 0.65 was blends chosen of thefor the specific assignment cultivars of observations originating to from a specific these class. specific The geographical observations with areas. no Y predicted below 0.65 were not assigned (no class column). Each observation was assigned to the Tablenearest 1. class.Misclassification table for the model. Y predicted value estimates class affiliation and the limit of 0.65 was chosen for the assignment of observations to a specific class. The observations with no No Class Y predicted below 0.65 wereMembers not assigned Correct (no class Moraiolo column). EachFrantoio observation Leccino was assigned to the (YPred ≤ 0.65) nearest class. Moraiolo 19 100% 19 0 0 0 No Class Frantoio Members23 Correct100% Moraiolo0 Frantoio23 Leccino0 0 (YPred ≤ 0.65) Leccino 19 100% 0 0 19 0 MoraioloNo class 194 100% 190 03 01 0 0 Frantoio 23 100% 0 23 0 0 LeccinoTotal 1965 100% 19 0 026 1920 0 0 Fisher’sNo class probability 1 5.5E4‐022 0 3 1 0 1 Fisher’sTotal exact test is derived 65 from the 100% probability 19 of the particular 26 classification 20 result and 0 all 1 × −22 Fisher’soutcomes probability more extreme5.5 than10 the one observed. All probabilities more extreme than the observed 1 patternFisher’s are exact computed test is derived and fromsummed the probability to give the of the probability particular classificationof the table result occurring and all by outcomes chance more [25]. extreme than the one observed. All probabilities more extreme than the observed pattern are computed and summed toAssignments give the probability were of theperformed table occurring according by chance to [25Naïve–Bayes]. Assignments classification were performed (Software according to WEKA Naïve–Bayes 3.8, classificationUniversity of (Software Waikato WEKA New 3.8, Zealand), University [26,27] of Waikato (Tables New S5). Zealand), [26,27] (Tables S5).

TheThe supervisedsupervised analysisanalysis (OPLS-DA)(OPLS‐DA) waswas thenthen appliedapplied consideringconsidering onlyonly thethe samplessamples fallingfalling inin thethe scattered scattered macro-group macro‐group of of the the PCA PCA scores scores plot. plot. No No specific specific separation separation amongamong thethe mainmain reference reference cultivarscultivars (Leccino, (Leccino, Frantoio Frantoio and and Moraiolo) Moraiolo) was was observed observed (data (data not not sown). sown). OnOn thethe otherother hand,hand, aa clear clear separationseparation forfor thesethese oiloil samplessamples couldcould bebe observedobserved byby OPLS-DA,OPLS‐DA, accordingaccording toto geographicalgeographical origin.origin. Indeed,Indeed, looking looking at at the the twotwo mainmain geographicalgeographical areas,areas, exclusivelyexclusively presentpresent inin thethe scatteredscattered macromacro group,group, thethe Siena Siena province province could could be be clearlyclearly differentiateddifferentiated from from Colline Colline Metallifere–Massa Metallifere–Massa Marittima Marittima samples samples withwith goodgood descriptive and and predictive predictive capabilities capabilities of of the the statistical statistical model model (1 + (1 1 ++ 10; +R2 0;X = R 0.74,2X = R 0.74,2Y = R0.75,2Y = Q 0.75,2 = 0.65)Q2 = (Figure 0.65) (Figure 7). This7 ).suggests This suggests that for that the forsamples the samples of the scattered of the scattered macro‐group macro-group of the PCA of thescores PCA plot scores the plot geographical the geographical origin, origin, rather rather than than olive olive cultivars cultivars was was the the most predominantpredominant discriminatingdiscriminating factor factor on on cluster cluster separation. separation.

Siena Province Colline Metallifere ‐ Massa Marittima

Figure 7. 2 2 2 Figure 7.OPLS-DA OPLS‐DA (1 (1 + + 1 1 + + 0 0 components components give give R RX2X = = 0.74, 0.74, R R2YY == 0.75,0.75, QQ2 = 0.65) scores plotplot forfor samplessamples from two main referenced geographical areas and exclusively present in the scattered macro group. from two main referenced geographical areas and exclusively present in the scattered macro group. A crosscheck was also performed in order to assess the significance of micro-milled samples used in this study with respect to the commercial ones. The unsupervised analysis (PCA five components Metabolites 2018, 8, 60 9 of 17 Metabolites 2018, 8, 60 9 of 17

MetabolitesA crosscheck2018, 8, 60 was also performed in order to assess the significance of micro‐milled samples9 of 17 used Ain crosscheck this study was with also respect performed to the in commercial order to assess ones. the The significance unsupervised of micro analysis‐milled (PCA samples five componentsused in this give study R2 Xwith = 0.88, respect Q2 = 0.75)to the carried commercial out on bothones. the The oil samplesunsupervised obtained analysis from laboratory(PCA five 2 2 2 2 givemicrocomponents R ‐Xmilling = 0.88, give and Q R =theX 0.75) =commercial 0.88, carried Q = out 0.75) bottled on carried both of themulti out oil ‐ on samplesand both monovarietal the obtained oil samples from PGI laboratoryTuscanobtained oils from micro-milling did laboratory not show andanymicro the relevant‐milling commercial separation and the bottled commercial among of multi- the differentbottled and monovarietal of olive multi oil‐ andextraction PGI monovarietal Tuscan procedures. oils PGI did TuscanActually, not show oils as any didobserved relevantnot show for separationmicroany relevant milled among separationoil samples, the different among the scores olive the different oil plot extraction which olive includes procedures. oil extraction the commercial Actually, procedures. as samples, observed Actually, showed for as micro observed the milled same for oildistributionmicro samples, milled the in oil two scores samples, macro plot‐ whichgroups,the scores includes one plot more which the compact commercial includes and the samples,one commercial more showed dispersed samples, the (Figure same showed distribution 8). the same in twodistribution macro-groups, in two one macro more‐groups, compact one and more one compact more dispersed and one (Figure more dispersed8). (Figure 8).

Micro‐milled oil samples CommercialMicro‐milled oil oil samples samples Commercial oil samples t[2] t[2]

Figure 8. PCA t[1]/t[2] scores plot (t[1] and t[2] explain 59.5% and 13.2% of the total variance, Figure 8. PCA t[1]/t[2] scores plot (t[1] and t[2] explain 59.5% and 13.2% of the total variance, respectively)Figure 8. PCA for t[1]/t[2] micro‐ milledscores oliveplot (t[1]oil samplesand t[2] andexplain commercial 59.5% and bottled 13.2% protected of the total geographical variance, respectively) for micro-milled olive oil samples and commercial bottled protected geographical indicationrespectively) (PGI) for EVOOs, micro‐ milledsupplied olive by Certifiedoil samples Origins and Italia commercial srl. bottled protected geographical indicationindication (PGI) (PGI) EVOOs, EVOOs, supplied supplied by byCertified Certified Origins Origins Italia Italia srl srl.. Finally,Finally, a a comparison comparison among among the the PGI PGI micro micro milled milled oils oils and and the the main main popular popular Apulian Apulian cultivar cultivar Finally, a comparison among the PGI micro milled oils and the main popular Apulian cultivar (Coratina)(Coratina) [ 16[16]] was was performed. performed. The The OPLS-DA OPLS‐DA supervised supervised analysis analysis (1 (1 + + 1 1− − 0)0) gave gave a a model model with with good good (Coratina) [16] was performed.2 The OPLS2 ‐DA supervised2 analysis (1 + 1 − 0) gave a model with good significancesignificance parameters parameters (R(R2XX = 0.75, R R2YY = = 0.85, 0.85, Q Q 2= =0.85) 0.85) and and showed showed a clear a clear separation separation between between the significance parameters (R2X = 0.75, R2Y = 0.85, Q2 = 0.85) and showed a clear separation between the theTuscan Tuscan oils oils and and the the Apulian Apulian Coratina Coratina cultivar cultivar (Figure (Figure 99a).a). This indicatesindicates that,that, despite despite being being more more Tuscan oils and the Apulian Coratina cultivar (Figure 9a). This indicates that, despite being more disperseddispersed and and complicated, complicated, the the Tuscan Tuscan PGI PGI EVOOs EVOOs here here studied studied can can be be clearly clearly distinguished distinguished with with dispersed and complicated, the Tuscan PGI EVOOs here studied can be clearly distinguished with respectrespect to to the the popular popular and and easily easily available available Apulian, Apulian, [ 12[12]] Coratina Coratina oils. oils. The TheS S-plot‐plot (Figure (Figure9 b)9b) for for the the respect to the popular and easily available Apulian, [12] Coratina oils. The S‐plot (Figure 9b) for the modelmodel identified identified the the molecular molecular component component responsible responsible for for the the separation separation between between the the cultivar. cultivar. model identified the molecular component responsible for the separation between the cultivar.

2 2 2 FigureFigure 9. 9.( a()a) OPLS-DA OPLS‐DA (1 (1 + + 1 1 + + 0 0 components components give give R RX2X = = 0.75, 0.75, R RY2Y = = 0.85, 0.85, Q Q2= = 0.85) 0.85) scores scores plot plot for for 2 2 2 CoratinaCoratinaFigure 9. cultivar cultivar (a) OPLS [ 16[16]‐]DA and and (1 micro-milled micro+ 1 + ‐0milled components olive olive oil oil give samples samples R X ( =b( b)0.75,)S S-line‐line R plotY plot = 0.85, for for the theQ model =model 0.85) displaying displayingscores plot the thefor predictivepredictiveCoratina loadingscultivar loadings [16] coloured coloured and micro according according‐milled to to olive the the correlation correlationoil samples scaled scaled(b) S loading‐line loading plot [p(corr)]. for[p(corr)]. the model displaying the predictive loadings coloured according to the correlation scaled loading [p(corr)]. Metabolites 2018, 8, 60 10 of 17 MetabolitesMetabolites 20182018, ,88, ,60 60 1010 of of 17 17 As already known [16], Coratina samples were characterized by a high relative content of As monounsaturatedalready known [16], (i.e., Coratina oleic acid) samples (loadings were atcharacterized 1.30, 2.02 ppm) by a andhigh polyunsaturated relative content fattyof acids As already known [16], Coratina samples were characterized by a high relative content of monounsaturated(linoleic and (i.e., linolenic oleic acid)acids) (loadings (loadings atat 1.34,1.30, 5.342.02 ppm), ppm) whileand polyunsaturatedPGI Tuscan EVOOS fatty showed acids relative monounsaturated (i.e., oleic acid) (loadings at 1.30, 2.02 ppm) and polyunsaturated fatty acids (linoleic (linoleic highand linolenicvalues of acids) saturated (loadings fatty acidsat 1.34, (1.22 5.34 and ppm), 1.26 while ppm). PGI Tuscan EVOOS showed relative and linolenic acids) (loadings at 1.34, 5.34 ppm), while PGI Tuscan EVOOS showed relative high values high values of saturated fatty acids (1.22 and 1.26 ppm). of saturated3. Materials fatty acids and (1.22 Methods and 1.26 ppm). 3. Materials and Methods 3. Materials3.1. Sampling and Methods 3.1.3.1. Sampling Sampling A number of 217 samples (for both olives and leaves, Table S1), supplied by Certified Origins Italia s.r.l., were collected during the harvesting period 2016–2017, from 24 different georeferenced AA number number of 217217 samplessamples (for (for both both olives olives and and leaves, leaves, Table Table S1), S1), supplied supplied by Certifiedby Certified Origins Origins Italia selected Tuscany areas (Figure 10). Italias.r.l., s.r.l were., were collected collected during during the harvesting the harvesting period period 2016–2017, 2016–2017, from 24 from different 24 different georeferenced georeferenced selected selected Tuscany areas (Figure 10). Tuscany areas (Figure 10).

Figure 10. The samples collected from each area are indicated with map markers in the expansion of Figure 10. The samples collected from each area are indicated with map markers in the expansion the Tuscan region (Italy). (from http://www.progettott.info/www/MappaNMR.php; Figureof the Tuscan10. The regionsamples (Italy). collected (from from http://www.progettott.info/www/MappaNMR.php each area are indicated with map markers in the expansion; https://en. of https://en.wikipedia.org/wiki/Tuscany). thewikipedia.org/wiki/Tuscany Tuscan region (Italy).). (from http://www.progettott.info/www/MappaNMR.php; https://en.wikipedia.org/wiki/Tuscany). SamplesSamples came essentially came essentially from eight from geographical eight geographical areas with areas a highwith levela high of level geomorphologic of geomorphologic heterogeneitySamplesheterogeneity came [28]: essentially Montalbano, [28]: Montalbano, from Cecina eight Cecina San geographical Vincenzo San Vincenzo Coast,areas with SanCoast, Cascianoa highSan Cascianolevel Val of di geomorphologic Pesa-Montelupo Val di Pesa‐Montelupo heterogeneityFiorentino,Fiorentino, Colline [28]: Montalbano,Colline Metallifere-Massa Metallifere Cecina ‐SanMassa Marittima, Vincenzo Marittima, Siena Coast, Province,Siena San CascianoProvince, Monti Val dell’Monti di Pesa Uccellina, dell’‐Montelupo Uccellina, Bassa Bassa Fiorentino,Maremma-CapalbioMaremma Colline‐Capalbio Metallifere and Follonica and‐Massa Follonica (Figure Marittima, (Figure11). 11). Siena Province, Monti dell’ Uccellina, Bassa Maremma‐Capalbio and Follonica (Figure 11). Tufo‐Pomonte‐ Saturnia Monte Amiata CastiglioneTufo‐Pomonte della Pescaia‐ Saturnia‐ Grosseto‐Flat area Castiglione dellaMonte Pescaia Amiata‐ Grosseto ‐Hill Slope Castiglione della PescaiaCampagnatico‐ Grosseto‐ Hinterland‐Flat area ‐ Hill country Castiglione della Pescaia‐ GrossetoCaldana‐Hill‐ Grosseto Slope ‐ Hill country Campagnatico‐ Hinterland ‐BatignanoHill country ‐ North Grosseto Caldana‐VallerotanaGrosseto ‐ Hill ‐ Roselle country Terme ‐ Grosseto Batignano ‐ North GrossetoTufo‐ Pitigliano Vallerotana ‐ Roselle Terme ‐GranciaGrosseto‐ South Grosseto Tufo‐FollonicaPitigliano‐Piombino coast Grancia‐ SouthBasso Grosseto Merse ‐ Hill country Follonica‐Piombino coastRosignano Solvay Basso MersePreselle ‐ Hill‐ Scansanocountry ‐ Hill country Rosignano SolvayPistoia province Preselle‐ScansanoMontelattaia‐ Hill country‐Madonnino‐Grosseto Pistoia province Follonica Montelattaia‐MadonninoBassa‐Grosseto Maremma‐Capalbio FollonicaMonti dellʹUccellina Bassa Maremma‐CapalbioSiena province CollineMonti Metallifere dellʹUccellina‐ Massa Marittima San Casciano Va l diSiena Pesa‐ provinceMonteLupo Fiorentino Colline Metallifere‐ MassaCecina Marittima‐ San Vincenzo coast San Casciano Va l di Pesa‐ MonteLupo Fiorentino Montalbano Cecina‐ San Vincenzo coast 0 1020304050 Montalbano 0 1020304050 Figure 11.FigureBars-chart 11. Bars representing‐chart representing samples samples (supplied (supplied by Certified by OriginsCertified Italia Origins s.r.l Italia.) distribution s.r.l .) distribution in the in the geographical reference areas. Figure 11.geographical Bars‐chart representing reference areas. samples (supplied by Certified Origins Italia s.r.l.) distribution in the geographical reference areas. Metabolites 2018, 8, 60 11 of 17 Metabolites 2018, 8, 60 11 of 17

The most representative declared olive cultivars were Frantoio,Frantoio, Leccino,Leccino, Moraiolo, Pendolino and Maurino, with 57, 55, 42, 37 and and 11 11 oil oil samples, samples, respectively. respectively. Other minor declared cultivars were Leccio del Corno (4 samples), Rossellino (3 samples), Morchiaio (3 samples), Lazzero, Maremmano, Mignolo cerretanocerretano Olivastra Olivastra seggianese seggianese and and Razzaio, Razzaio, these these last last with with 1 samples 1 samples for each for cultivar. each cultivar. About 70%About of the70% most of the representative most representative olive cultivars olive samples cultivars were samples collected were from collected the main from geographical the main areasgeographical (Figure 12areas). (Figure 12).

Follonica MAURINO PENDOLINO LECCINO Bassa Maremma‐Capalbio MORAIOLO FRANTOIO Monti dellʹUccellina

Siena province

Colline Metallifere‐ Massa Marittima

San Casciano Va l di Pesa‐ MonteLupo Fiorentino

Cecina‐ San Vincenzo coast

Montalbano

012345678910 Figure 12. Bars-chartBars‐chart representing samples (supplied by CertifiedCertified Origins Italia s.r.l.).) distribution of most representative olive cultivars in the main geographicalgeographical referencereference areas.areas.

3.2. SSR Analysis and Varietal IdentificationIdentification Molecular characterizationcharacterization was was conducted conducted on 217 on olive 217 leafolive samples, leaf samples, with a set with of 10 microsatellitea set of 10 markers.microsatellite Sampled markers. leaves Sampled were collected leaves were from olivecollected plants from and olive immediately plants and placed immediately in a paper placed envelope in a withpaper silica envelope gel. Samples with silica were gel. kept Samples in a box were for kept a dehydration in a box for process. a dehydration 5 mg of process. dried leaf 5 mg tissue of dried was µ groundleaf tissue using was Tissuelyser ground using II (QIAGEN) Tissuelyser and II (QIAGEN) subsequently and resuspended subsequently in resuspended 100 L of distilled in 100 μ sterileL of µ waterdistilled and sterile then water vortexed and for then 30 vortexed s, 1 L of for each 30 leafs, 1 μ sampleL of each in distilledleaf sample sterile in distilled water was sterile amplified. water PCRswas amplified. were performed PCRs were using performed KAPA3G using Plant KAPA3G DNA polymerase Plant DNA polymerase (KAPA Biosystems) (KAPA Biosystems) in a reaction in mixa reaction with the mix following with the composition:following composition: 2X KAPA 2X Plant KAPA PCR Plant buffer, PCR 100X buffer, KAPA 100X Plant KAPA PCR Plant Enhancer, PCR 25 mM MgCl , 10 mM of pair of primers. Forward primers were labelled with specific fluorochromes Enhancer, 252 mM MgCl2, 10 mM of pair of primers. Forward primers were labelled with specific (6-FAM,fluorochromes VIC, PET (6‐FAM, and NED). VIC, PET Different and NED). combinations Different of combinations three SSR loci of were three used SSR inloci multiplex were used PCR in amplificationmultiplex PCR strategy. amplification DCA3-6Fam, strategy. DCA5-VIC, DCA3‐6Fam, DCA8-VIC, DCA5‐ DCA11-PETVIC, DCA8‐VIC, and DCA18-6FamDCA11‐PET [and29], GAPU71B-6FamDCA18‐6Fam [29], [30 GAPU71B], UDO12-NED‐6Fam and [30], UDO15-NED UDO12‐NED [31 ],and EMO090-6Fam UDO15‐NED [32 [31],] and EMO090 OLEST23-PET‐6Fam [32] [33] µ lociand wereOLEST23 used‐PET in this [33] work. loci KAPA3Gwere used Plant in this DNA work. polymerase KAPA3G 2U Plant was DNA added polymerase in a final volume 2U was of added 25 L. TM ◦ Thein a thermalfinal volume profile, of in25 μ theL. Veriti The thermalthermal profile, cycler in (Appliedthe Veriti Biosystems),TM thermal cycler was 10(Applied min at 95Biosystems),C and 50 ◦ ◦ ◦ ◦ cycleswas 10 composed min at 95 of°C 30and s at 50 95 cyclesC, 15 composed s at 55 C of and 30 s 30 at s 95 at 72°C, 15C withs at 55 a final °C and elongation 30 s at 72 at °C 72 withC for a 1final min, elongation as reported at by72 Migliaro°C for 1 min, et al. as [34 reported]. by Migliaro et al. [34]. AmplificationAmplification products were separated on a Genetic Analyzer 3130xl (Applied Biosystems Inc., Inc., Foster City, CA, USA). The main authenticated Tuscany cultivars held in CREA-OFACREA‐OFA olive tree collection locatedlocated in in Mirto Mirto Crosia Crosia (CS) (CS) were were included included into into the the analysis analysis as as internal internal reference reference to verify to verify the correctnessthe correctness of molecular of molecular data. data. SSR fragments SSR fragments were analyzed were analyzed by Gene by Mapper Gene 3.7Mapper software 3.7 (Appliedsoftware Biosystems,(Applied Biosystems, Foster City, Foster CA, USA).City, CA, The USA). obtained The data obtained by scoring data ofby SSR scoring profiles of SSR were profiles used to were calculate used to calculate a similarity matrix using Dice’s coefficient [35]. The similarity values were utilized to Metabolites 2018, 8, 60 12 of 17 a similarity matrix using Dice’s coefficient [35]. The similarity values were utilized to determine the cluster analysis based an unweighted pair group method with arithmetic mean (UPGMA) using PAST software v.2.12. In order to carry out the varietal identification of the accessions outside of the IGP Tuscan varietal platform, molecular data obtained in this study, were harmonized and compared with those from the internal CREA-OFA standardized database. The harmonization was carried out by shifting of one or more single repeat for each allele in comparison with reference one. For the loci SSR GAPU71b, DCA3, DCA5, DCA18, GAPU71b and EMO090, reference alleles were taken from oleadb database [36], reference alleles for the loci SSR DCA11 and UDO12 came from [37], UDO 15 from [38], while for the locus OLEST 23 were the same used in Reference [39].

3.3. Olive Oil Extraction Oils were extracted from olive samples by using a laboratory scale milling method, in a short time, reducing any type of decomposition due to thermal effects. For each sample, olives (20 g) were plunged into liquid N2 and ground to obtain a paste with a stainless- steel blender. After storing over night at 4 ◦C the past was added of 2–4 mL of distilled water and centrifuged. The oil (about 2–4 mL) was collected from the upper phase and stored in amber vials until NMR analysis.

3.4. 1H NMR Analysis and Data Processing

NMR samples were prepared dissolving ~140 mg of olive oil in CDCl3 and adjusting ratio of olive oil: CDCl3 to 13.5: 86.5 (% w/w). This ratio was chosen to give the best trade-off for sensitivity/solution viscosity in spectral acquisition (Bruker Italia, standardized procedure for olive oil) [16]. Next, 600 µL of the prepared mixture were transferred into a 5-mm NMR tube. 1H NMR spectra were recorded on a Bruker Avance spectrometer (Bruker, Karlsruhe, Germany) operating at 400.13 MHz, T = 300 K, equipped with a PABBI 5-mm inverse detection probe incorporating a z axis gradient coil. NMR experiments were performed after sample randomization to avoid biasing results due to instrument conditions or operator related differences. The entire process was conducted under full automation for the entire process, after loading individual samples on a Bruker Automatic Sample Changer (BACS-60), interfaced with the IconNMR software (Bruker). In order to optimize NMR conditions, automated tuning and matching, locking and shimming and 90◦ hard pulse calibration P(90◦) were done for each sample using standard Bruker routines ATMA, LOCK, TOPSHIM and PULSECAL. After a 5-min waiting period for temperature equilibration, a standard one-dimensional (1H ZG) NMR experiment was performed for each sample. The relaxation delay (RD) and acquisition time (AQ) were set to 4 s and ~3.98 s, respectively, resulting in a total recycle time of ~7.98 s. Free Induction Decays (FIDs) were collected into time domain (TD) = 65,536 (64 k) complex data points by setting: spectral width (SW) = 20.5524 ppm (8223.685 Hz), receiver gain (RG) = 4 and number of scans (NS) = 16, usually used for samples where metabolites are present in high concentrations, as in the case of olive oil analysis [24,40]. NMR data were processed using Topspin 2.1 (Bruker). 1H NMR spectra were obtained by the Fourier Transformation (FT) of the FID (Free Induction Decay), applying an exponential multiplication (EM) with a line broadening factor of 0.3 Hz, automatically phased and baseline corrected. Chemical shifts were reported with respect to the TMS (internal standard) signal set at 0 ppm, obtaining good peak alignment.

3.5. Multivariate Statistical Analysis 1H NMR spectra were processed by segmentation in rectangular fixed (0.04 ppm width) buckets and integration by Amix 3.9.15 (Analysis of Mixture, Bruker BioSpin GmbH, Rheinstetten, Germany) software. Bucketing was performed within 10.00–0.5 ppm region, excluding the residual non-deuterated chloroform signal and its carbon satellites signals (7.6–6.9 ppm). The total sum normalization was applied to minimize small differences due to olive oil concentration and/or experimental conditions among samples [41–43]. The Pareto scaling method (performed by dividing Metabolites 2018, 8, 60 13 of 17 the mean-centered data by the square root of the standard deviation) was then applied to the variables. The prior log transformation of the data (Figure S2) did not improve the final outcome of the MVA [44–46]. Therefore, no further pre-processing, including noise removal [45] was used. The data table generated by all aligned buckets row reduced spectra was used for multivariate data analysis. Each bucket row represents the entire NMR spectrum, with all the molecules in the sample. Moreover, each bucket in a buckets row reduced spectrum is labelled with the value of the central chemical shift for its specific 0.04 ppm width. The variables used as descriptors for each sample in chemometric analyses are the buckets. Multivariate statistical analysis and graphics were obtained using Simca-P version 14 (Sartorius Stedim Biotech, Umeå, Sweden). PCA (Principal Component Analysis), PLS-DA (data not shown) and OPLS-DA (Partial Least Squares and Orthogonal Partial Least Squares Discriminant Analyses, respectively) were applied to the data [47–49]. Principal Component Analysis is at the basis of the multivariate analysis [47] and usually performed to extract and display the systematic variation in a data matrix X formed by rows (the considered observations), in our case the EVOO samples and columns (the variables describing each sample) in our case the buckets from each NMR spectrum. In this work, the PLS-DA method was also performed in order to justify the number of components used in OPLS-DA model [50]. The OPLS-DA analysis is a modification of the usual PLS-DA method which filters out variation that is not directly related to the response and produces models of clearer interpretation, focusing the predictive information in one component, as shown in several recent studies of metabolomics [26,51]. The OPLS-DA is the most recently used technique for the discrimination of samples with different characteristics (such as cultivars and/or geographical origin). The further improvements made by the OPLS-DA in MVA resides in the ability to separate the portion of the variance useful for predictive purposes from the not predictive variance (which is made orthogonal) [52]. OPLS-DA models are useful tools in application of prediction and classification. The related classification list and confusion matrix summarize the probability of belonging to the class models, showing correctness or incorrectness of particular sample classification [53]. In order to evaluate the robustness and predictive ability of the statistical models, a seven-fold cross-validation procedure was performed [54–56]. Moreover, the minimal number of required components can be easily defined by the analysis of R2 and Q2 parameters, which display completely diverging behaviour as the model complexity increases. The R2X, R2Y and Q2, describing the total variation in X, the variation in the response variable Y and the predictive ability of the models, respectively, were calculated [57]. The results were shown by the optimal bidimensional scores plots and relative loadings plots, which were used to identify differences among groups [58].

3.6. Chemicals

All chemical reagents for analysis were of analytical grade. CDCl3 (99.8 atom %D) and tetramethylsilane, TMS (0.03 v/v %) were purchased from Armar Chemicals (Döttingen, Switzerland).

4. Conclusions The present study represents the first large-scale analysis of oils obtained from cultivars and geographical areas specific for the production of Tuscan PGI EVOOs. Analyzing the NMR-based metabolomic profiles of both laboratory micro-milled and commercial oil samples, a distribution of the samples in two main macro-groups was observed by PCA analysis. The first one, which includes more samples, results in a more compact cluster, while the second group gives a more dispersed one. The different statistical models built by considering separately these two groups showed a very different samples distribution characteristics. In the first case, (compact macro-group samples) a separation of the main reference cultivars, Frantoio, Moraiolo and Leccino, appeared the most relevant discriminating feature with satisfactory model parameters and good predictive capabilities. On the other hand, the scattered macro-group samples could be reasonably well separated only on the basis of the geographical areas rather than olive cultivars. These results showed, for the first time, the specificity of the Tuscan PGI EVOOs production. The observed high variability of this product Metabolites 2018, 8, 60 14 of 17 depends not only on the numerous PGI allowed local cultivars but also on the high heterogeneity of the pedoclimatic conditions characteristic of the region. This result is in contrast with the characteristics of the EVOOs coming from extensively studied most popular Apulian cultivars and geographical areas [14–17]. Further studies are required to deeply characterize the Tuscan PGI EVOOs, especially in the scattered macro-group geographical areas such as Siena Province, Colline Metallifere—Massa Marittima area). Nevertheless all the here reported Tuscan PGI EVOOs could be clearly distinguished with respect to the popular and easily available Apulian, [12] Coratina oils. The results of the present work confirmed the need of monocultivar genetically certified EVOO samples for the construction of a 1H-NMR-metabolic profiles database suitable for cultivar and/or geographical origin assessment. Such a specific PGI EVOOs database could be profitably used to justify the high added value of the product and the sustainability of the related supply chain.

Supplementary Materials: The following are available online at http://www.mdpi.com/2218-1989/8/4/60/s1, Table S1: List of olive samples from geographical areas within Tuscany region (Italy), Table S2: Molecular characterization results of the declared accessions collected in eight Tuscany areas, Table S3: Molecular profile of declared accessions correctly identified. In bold: allele variant from the reference cultivar, Table S4: Reference molecular profiles of Tuscan PGI declared cultivars in this work, Table S5: Classifier Output from Weka analysis, Figure S1: Representative 1H NMR spectra of Moraiolo, Leccino, Frantoio olive oil samples. Figure S2: Data pre-treatment. Author Contributions: Data curation, C.R.G., L.D.C., S.Z., A.S. and F.L.C.; Formal analysis, C.R.G., L.D.C., S.Z. and A.B.; Investigation, C.R.G., L.D.C., S.Z., A.S. and F.L.C.; Project administration, D.B. and F.P.F.; Resources, A.B. and D.B.; Supervision, F.P.F.; Writing—original draft, C.R.G. and L.D.C.; Writing—review & editing, C.R.G. Funding: This research received no external funding. Acknowledgments: The authors give special thanks to the plant material and olive oil supplier “Certified Origins Italia srl” (Località il Madonnino, Grosseto, Toscana, Italy) for collaboration and support the present work. Conflicts of Interest: The authors declare no conflict of interest.

References

1. Muzzalupo, I. Olive Germplasm: Italian Catalogue of Olive Varieties; InTechOpen: London, UK, 2012. 2. Elloumi, J.; Ben-Ayed, R.; Aifa, S. An overview of olive oil biomolecules. Curr. Biotechnol. 2012, 1, 115–124. [CrossRef] 3. Girelli, C.R.; Del Coco, L.; Fanizzi, F.P. Tunisian extra virgin olive oil traceability in the EEC market: Tunisian/Italian (Coratina) EVOOs blend as a case study. Sustainability 2017, 9, 1471. [CrossRef] 4. Scientific Workshop on Olive Oil Authentication. Available online: https://ec.europa.eu/agriculture/ events/2013/olive-oil-workshop/newsletteren.pdf (accessed on 28 September 2018). 5. Regulation (EC) No. 178/2002 of the European Parliament and of the Council of 28 January 2002 Laying Down the General Principles and Requirements of Food Law, Establishing the European Food Safety Authority and Laying Down Procedures in Matters of Food Safety. Available online: https://www.ecolex.org/details/legislation/regulation-ec-no-1782002-of-the-european-parliament- and-of-the-council-laying-down-the-general-principles-and-requirements-of-food-law-establishing-the- european-food-safety-authority-and-laying-down-procedures-in-matters-of-food-safety-lex-faoc034771/ (accessed on 28 September 2018). 6. Council Regulation (EC) No. 510/2006 of 20 March 2006 on the Protection of Geographical Indications and Designations of Origin for Agricultural Products and Foodstuffs. Available online: https://eur-lex.europa. eu/legal-content/EN/TXT/?uri=celex:32006R0510 (accessed on 28 September 2018). 7. Council Regulation (EC) No. 2081/92 of 14 July 1992 on the Protection of Geographical Indications and Designations of Origin for Agricultural Products and Foodstuffs. Available online: https://eur-lex.europa. eu/legal-content/EN/TXT/?uri=CELEX:31992R2081 (accessed on 28 September 2018). 8. European Commission, Regulation (EC) No. 1151/2012 of the European Parliament and of the Council of 21 November 2012 on Quality Schemes for Agricultural Products and Foodstuffs. Available online: https: //eur-lex.europa.eu/legal-content/en/TXT/?uri=CELEX%3A32012R1151 (accessed on 28 September 2018). 9. Ben-Ayed, R.; Kamoun-Grati, N.; Rebai, A. An overview of the authentication of olive tree and oil. Compr. Rev. Food Sci. Food Saf. 2013, 12, 218–227. [CrossRef] Metabolites 2018, 8, 60 15 of 17

10. Likudis, Z. Olive oils with protected designation of origin (PDO) and protected geographical indication (PGI). In Products from Olive Tree; InTechOpen: London, UK, 2016. 11. ASA (Associazione Stampa Agroalimentare Italiana). Available online: http://www.asa-press.com/2018/a- 18-ismea-report-EVO.html (accessed on 28 September 2018). 12. Ismea (istituto di servizi per il mercato agricolo alimentare). Available online: http://www.ismeamercati. it/flex/cm/pages/ServeBLOB.php/L/IT/IDPagina/3523#MenuV%20(3),%20373-381 (accessed on 28 September 2018). 13. Ministry of Agricultural, Food and Forestry Policies (MiPAAF), Decree 21 July 1998, Disciplinary of Production of the Protected Geographical Indication of “Toscano” Olive Oil. (GU General Series No. 243 of 17-10-1998—Ordinary Supplement No. 172). Available online: http://www.gazzettaufficiale.it/ atto/serie_generale/caricaDettaglioAtto/originario?atto.dataPubblicazioneGazzetta=1998-10-17&atto. codiceRedazionale=098A8947&elenco30giorni=false (accessed on 28 September 2018). 14. Del Coco, L.; Mondelli, D.; Mezzapesa, G.N.; Miano, T.; De Pascali, S.A.; Girelli, C.R.; Fanizzi, F.P. Protected designation of origin extra virgin olive oils assessment by Nuclear Magnetic Resonance and multivariate statistical analysis: “Terra di Bari”, an Apulian (southeast Italy) case study. J. Am. Oil Chem. Soc. 2016, 93, 373–381. [CrossRef] 15. Girelli, C.R.; Del Coco, L.; Fanizzi, F.P. 1H NMR spectroscopy and multivariate analysis as possible tool to assess cultivars, from specific geographical areas, in EVOOs. Eur. J. Lipid Sci. Technol. 2016, 118, 1380–1388. [CrossRef] 16. Girelli, C.R.; Del Coco, L.; Papadia, P.; De Pascali, S.A.; Fanizzi, F.P. Harvest year effects on Apulian EVOOs evaluated by 1H NMR based metabolomics. PeerJ 2016, 4, e2740. [CrossRef][PubMed] 17. Piccinonna, S.; Ragone, R.; Stocchero, M.; Del Coco, L.; De Pascali, S.A.; Schena, F.P.; Fanizzi, F.P. Robustness of NMR-based metabolomics to generate comparable data sets for olive oil cultivar classification. An inter-laboratory study on Apulian olive oils. Food Chem. 2016, 199, 675–683. [CrossRef][PubMed] 18. Commission Regulation (EU) No. 61/2011 of 24 January 2011 Amending Regulation (EEC) No. 2568/91 on the Characteristics of Olive Oil and Olive-Residue Oil and on the Relevant Methods of Analysis. Available online: https://eur-lex.europa.eu/legal-content/EN/TXT/?uri=CELEX:32011R0061 (accessed on 28 September 2018). 19. Callao, M.P.; Ruisánchez, I. An overview of multivariate qualitative methods for food fraud detection. Food Control 2018, 86, 283–293. [CrossRef] 20. D’Imperio, M.; Mannina, L.; Capitani, D.; Bidet, O.; Rossi, E.; Bucarelli, F.M.; Quaglia, G.B.; Segre, A. NMR and statistical study of olive oils from Lazio: A geographical, ecological and agronomic characterization. Food Chem. 2007, 105, 1256–1267. [CrossRef] 21. Del Coco, L.; Schena, F.P.; Fanizzi, F.P. 1H Nuclear Magnetic Resonance study of olive oils commercially available as Italian products in the United States of America. Nutrients 2012, 4, 343–355. [CrossRef][PubMed] 22. Rongai, D.; Sabatini, N.; Del Coco, L.; Perri, E.; Del Re, P.; Simone, N.; Marchegiani, D.; Fanizzi, F.P. 1H NMR and multivariate analysis for geographic characterization of commercial extra virgin olive oil: A possible correlation with climate data. Foods 2017, 6, 96. [CrossRef][PubMed] 23. Camin, F.; Pavone, A.; Bontempo, L.; Wehrens, R.; Paolini, M.; Faberi, A.; Marianella, R.M.; Capitani, D.; Vista, S.; Mannina, L. The use of IRMS, 1H NMR and chemical analysis to characterize Italian and imported Tunisian olive oils. Food Chem. 2016, 196, 98–105. [CrossRef][PubMed] 24. Del Coco, L.; De Pascali, S.A.; Fanizzi, F.P. 1H NMR Spectroscopy and multivariate analysis of monovarietal EVOOs as a tool for modulating Coratina-based blends. Foods 2014, 3, 238–249. [CrossRef][PubMed] 25. Fisher, R.A. On the interpretation of χ2 from contingency tables, and the calculation of P. J. R. Stat. Soc. 1922, 85, 87–94. [CrossRef] 26. Consonni, R.; Cagliani, L.; Benevelli, F.; Spraul, M.; Humpfer, E.; Stocchero, M. NMR and chemometric methods: a powerful combination for characterization of balsamic and traditional balsamic vinegar of Modena. Anal. Chim. Acta 2008, 611, 31–40. [CrossRef][PubMed] 27. Witten, I.H.; Frank, E.; Hall, M.A.; Pal, C.J. Data Mining: Practical Machine Learning Tools and Techniques; Morgan Kaufmann: Burlington, MA, USA, 2016. Metabolites 2018, 8, 60 16 of 17

28. Bicocchi, G.; D’Ambrosio, M.; Vannocci, P.; Nocentini, M.; Tacconi-Stefanelli, C.; Masi, E.; Carnicelli, S.; Tofani, V.; Catani, F. Preliminary assessment of the factors controlling the geotechnical and hydrological properties in the hillslope deposits of eastern Tuscany (central Italy). In Proceedings of the IAMG 2015 Conference, Freiberg, SN, Germany, 13 May 2015; pp. 867–874. 29. Sefc, K.; Lopes, M.; Mendonça, D.; Dos Santos, M.R.; Machado, M.L.D.C.; Machado, A.D.C. Identification of microsatellite loci in olive (Olea europaea) and their characterization in Italian and Iberian olive trees. Mol. Ecol. 2000, 9, 1171–1173. [CrossRef][PubMed] 30. Carriero, F.; Fontanazza, G.; Cellini, F.; Giorio, G. Identification of simple sequence repeats (SSRS) in olive (Olea europaea L.). Theor. Appl. Genet. 2002, 104, 301–307. [CrossRef][PubMed] 31. Cipriani, G.; Marrazzo, M.; Marconi, R.; Cimato, A.; Testolin, R. Microsatellite markers isolated in olive (Olea europaea L.) are suitable for individual fingerprinting and reveal polymorphism within ancient cultivars. Theor. Appl. Genet. 2002, 104, 223–228. [CrossRef][PubMed] 32. De la Rosa, R.; James, C.; Tobutt, K. Isolation and characterization of polymorphic microsatellites in olive (Olea europaea L.) and their transferability to other genera in the oleaceae. Mol. Ecol. Notes 2002, 2, 265–267. [CrossRef] 33. Mariotti, R.; Cultrera, N.; Mousavi, S.; Baglivo, F.; Rossi, M.; Albertini, E.; Alagna, F.; Carbone, F.; Perrotta, G.; Baldoni, L. Development, evaluation, and validation of new est-EST-SSR markers in olive (Olea europaea L.). Tree Genet. Genomes 2016, 12, 120. [CrossRef] 34. Migliaro, D.; Morreale, G.; Gardiman, M.; Landolfo, S.; Crespan, M. Direct multiplex PCR for grapevine genotyping and varietal identification. Plant Genet. Resour. 2013, 11, 182–185. [CrossRef] 35. Sneath, P.H.; Sokal, R.R. Numerical Taxonomy. The Principles and Practice of Numerical Classification; Oxford University Press: Oxford, UK, 1973; pp. 263–268. 36. Olea Databases. Available online: http://www.oleadb.it/ (accessed on 28 September 2018). 37. Sarri, V.; Baldoni, L.; Porceddu, A.; Cultrera, N.; Contento, A.; Frediani, M.; Belaj, A.; Trujillo, I.; Cionini, P. Microsatellite markers are powerful tools for discriminating among olive cultivars and assigning them to geographically defined populations. Genome 2006, 49, 1606–1615. [CrossRef][PubMed] 38. Trujillo, I.; Ojeda, M.A.; Urdiroz, N.M.; Potter, D.; Barranco, D.; Rallo, L.; Diez, C.M. Identification of the worldwide olive germplasm bank of Córdoba (Spain) using SSR and morphological markers. Tree Genet. Genomes 2014, 10, 141–155. [CrossRef] 39. Mousavi, S.; Mariotti, R.; Regni, L.; Nasini, L.; Bufacchi, M.; Pandolfi, S.; Baldoni, L.; Proietti, P. The first molecular identification of an olive collection applying standard simple sequence repeats and novel expressed sequence tag markers. Front. Plant Sci. 2017, 8, 1283. [CrossRef][PubMed] 40. Barison, A.; Pereira da Silva, C.W.; Campos, F.R.; Simonelli, F.; Lenz, C.A.; Ferreira, A.G. A simple methodology for the determination of fatty acid composition in edible oils through 1H NMR Spectroscopy. Magn. Reson. Chem. 2010, 48, 642–650. [CrossRef][PubMed] 41. Sundekilde, U.; Larsen, L.; Bertram, H. NMR-based milk metabolomics. Metabolites 2013, 3, 204–222. [CrossRef] [PubMed] 42. Gallo, V.; Mastrorilli, P.; Cafagna, I.; Nitti, G.I.; Latronico, M.; Longobardi, F.; Minoja, A.P.; Napoli, C.; Romito, V.A.; Schäfer, H. Effects of agronomical practices on chemical composition of table grapes evaluated by NMR spectroscopy. J. Food Compost. Anal. 2014, 35, 44–52. [CrossRef] 43. van den Berg, R.A.; Hoefsloot, H.C.; Westerhuis, J.A.; Smilde, A.K.; van der Werf, M.J. Centering, scaling, and transformations: Improving the biological information content of metabolomics data. BMC Genom. 2006, 7, 142. [CrossRef][PubMed] 44. Changyong, F.; Hongyue, W.; Naiji, L.; Tian, C.; Hua, H.; Ying, L. Log-transformation and its implications for data analysis. Shanghai Arch. Psychiatry 2014, 26, 105. 45. Kvalheim, O.M.; Aksnes, D.W.; Brekke, T.; Eide, M.O.; Sletten, E. Crude oil characterization and correlation by principal component analysis of 13C Nuclear Magnetic Resonance spectra. Anal. Chem. 1985, 57, 2858–2864. [CrossRef] 46. Emwas, A.-H.; Saccenti, E.; Gao, X.; McKay, R.T.; dos Santos, V.A.M.; Roy, R.; Wishart, D.S. Recommended strategies for spectral processing and post-processing of 1D 1 H-NMR data of biofluids with a particular focus on urine. Metabolomics 2018, 14, 31. [CrossRef][PubMed] 47. Jackson, J.E. A User’s Guide to Principal Components; John Wiley & Sons: Hoboken, NJ, USA, 2005; p. 587. Metabolites 2018, 8, 60 17 of 17

48. Eriksson, L.; Byrne, T.; Johansson, E.; Trygg, J.; Vikström, C. Multi-and Megavariate Data Analysis Basic Principles and Applications; Umetrics Academy: Umea, Sweden, 2013. 49. Lindon, J.C.; Nicholson, J.K.; Holmes, E. The Handbook of Metabonomics and Metabolomics; Elsevier: Amsterdam, The Netherlands, 2011. 50. De Pascali, S.A.; Gambacorta, L.; Oswald, I.P.; Del Coco, L.; Solfrizzo, M.; Fanizzi, F.P. 1H NMR and MVA metabolomic profiles of urines from piglets fed with boluses contaminated with a mixture of five mycotoxins. Biochem. Biophys. Rep. 2017, 11, 9–18. [PubMed] 51. Zotti, M.; De Pascali, S.A.; Del Coco, L.; Migoni, D.; Carrozzo, L.; Mancinelli, G.; Fanizzi, F.P. 1H NMR metabolomic profiling of the blue crab (Callinectes sapidus) from the adriatic sea (SE Italy): A comparison with warty crab (Eriphia verrucosa), and edible crab (Cancer pagurus). Food Chem. 2016, 196, 601–609. [CrossRef] [PubMed] 52. Boccard, J.; Rutledge, D.N. A consensus orthogonal partial least squares discriminant analysis (OPLS-DA) strategy for multiblock omics data fusion. Anal. Chim. Acta 2013, 769, 30–39. [CrossRef][PubMed] 53. Ciosek, P.; Brzózka, Z.; Wróblewski, W.; Martinelli, E.; Di Natale, C.; D’amico, A. Direct and two-stage data analysis procedures based on PCA, PLS-DA and ANN for ISE-based electronic tongue—Effect of supervised feature extraction. Talanta 2005, 67, 590–596. [CrossRef][PubMed] 54. Holmes, E.; Loo, R.L.; Stamler, J.; Bictash, M.; Yap, I.K.; Chan, Q.; Ebbels, T.; De Iorio, M.; Brown, I.J.; Veselkov, K.A. Human metabolic phenotype diversity and its association with diet and blood pressure. Nature 2008, 453, 396. [CrossRef][PubMed] 55. Trygg, J.; Wold, S. Orthogonal projections to latent structures (O-PLS). J. Chemom. 2002, 16, 119–128. [CrossRef] 56. Triba, M.N.; Le Moyec, L.; Amathieu, R.; Goossens, C.; Bouchemal, N.; Nahon, P.; Rutledge, D.N.; Savarin, P. PLS/OPLS models in metabolomics: The impact of permutation of dataset rows on the k-fold cross-validation quality parameters. Mol. Biosyst. 2015, 11, 13–19. [CrossRef][PubMed] 57. Wheelock, Å.M.; Wheelock, C.E. Trials and tribulations of ‘omics data analysis: Assessing quality of SIMCA-based multivariate models using examples from pulmonary medicine. Mol. Biosyst. 2013, 9, 2589–2596. [CrossRef] [PubMed] 58. Sun, L.; Zhang, H.; Wu, L.; Shu, S.; Xia, C.; Xu, C.; Zheng, J. 1H Nuclear Magnetic Resonance-based plasma metabolic profiling of dairy cows with clinical and subclinical ketosis. J. Dairy Sci. 2014, 97, 1552–1562. [CrossRef][PubMed]

© 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).