foods

Review Chemometrics Methods for Specificity, Authenticity and Traceability Analysis of Olive Oils: Principles, Classifications and Applications

Habib Messai 1, Muhammad Farman 2, Abir Sarraj-Laabidi 3, Asma Hammami-Semmar 4 and Nabil Semmar 3,5,*

1 Laboratory of Biomedical and Oncogenetics, Institut Pasteur de Tunis, University of Tunis El Manar, 1002 Tunis, Tunisia; [email protected]
2 Department of , Quaid-i-Azam University, 45320 Islamabad, Pakistan; [email protected]
3 Laboratory of , Biomathematics and (BIMS), Institut Pasteur de Tunis, University of Tunis El Manar, 1002 Tunis, Tunisia; [email protected]
4 National Institute of Applied Sciences and Technology (INSAT), University of Carthage, 1080 Tunis, Tunisia; [email protected]
5 Laboratoire de Biomathématiques, Faculté des Sciences de Saint-Jérôme, Aix-Marseille Université, 13397 Marseilles, France
* Correspondence: [email protected]; Tel.: +216-71-873-366; Fax: +216-71-872-055

Academic Editor: Saskia van Ruth Received: 11 October 2016; Accepted: 10 November 2016; Published: 17 November 2016

Abstract: Background. Olive oils (OOs) show high chemical variability due to several factors of genetic, environmental and anthropic types. Genetic and environmental factors are responsible for natural compositions and polymorphic diversification resulting in different varietal patterns and phenotypes. Anthropic factors, however, are at the origin of different blends’ preparation leading to normative, labelled or adulterated commercial products. Control of complex OO samples requires their (i) characterization by specific markers; (ii) authentication by fingerprint patterns; and (iii) monitoring by traceability analysis. Methods. These and management aims require the use of several multivariate statistical tools: specificity highlighting requires ordination methods; authentication checking calls for classification and methods; traceability analysis implies the use of network-based approaches able to separate or extract mixed information and memorized signals from complex matrices. Results. This chapter presents a review of different chemometrics methods applied for the control of OO variability from metabolic and physical-chemical measured characteristics. The different chemometrics methods are illustrated by different study cases on monovarietal and blended OO originated from different countries. Conclusion. Chemometrics tools offer multiple ways for quantitative evaluations and qualitative control of complex chemical variability of OO in relation to several intrinsic and extrinsic factors.

Keywords: chemometrical methods; olive field; blends; quality control; ordination; clustering; pattern recognition; prediction; chromatographic profiles; spectral

1. Introduction

Olive oils (OOs) are complex food matrices due to their highly variable compositions. Different OO cultivars are associated with different environmental living conditions, and are characterized by different chemical patterns resulting in different organoleptic properties [1–3]. Chemical patterns thus occupy a central position between the influencing environment and the quality of the olive products obtained. Because metabolic profiles reflect both intrinsic regulation and extrinsic sensitivity, they help in the specificity, authenticity and traceability analysis of OO samples and populations [4–8].

Foods 2016, 5, 77; doi:10.3390/foods5040077 www.mdpi.com/journal/foods

Specificity, authenticity and traceability of complex OO samples can be more or less controlled through statistical analysis of variability between metabolic profiles. Links between these three basic concepts and metabolic variability can be organized around three questions (Figure 1):

• What are the roles and status of separated metabolites in the polymorphism of OO samples (Figure 1a)?
• What is the usefulness of metabolic profiles or spectroscopic features for authentication of OO samples (Figure 1b)?
• How, and by how much, do the metabolites' levels vary relative to one another to favour the development or formation of well-distinct OO patterns (Figure 1c)?

[Figure 1 panels: (a) Specificity, plant (individual) scale: how separated metabolites can characterize polymorphic trends; PCA (Un.), CA (Un.). (b) Authenticity, variety (sub-population) scale: how combined metabolites can define chemical fingerprints of polymorphic trends; HCA (Un.), pattern recognition methods LDA, PLS-DA, SIMCA, SVMs, KNNs (S.). (c) Traceability, co-occurrence or blend (network) scale: how metabolites quantitatively and qualitatively vary relative to one another to favour the formation of different polymorphic trends; pattern recognition methods LDA, PLS, PLS-DA, SIMCA, SVMs, KNNs (S.), artificial neural networks (Un., S.), simplex mixture design approach (S.).]

Figure 1. General interests of applications of different statistical methods for specificity (a), authenticity (b) and traceability (c) analysis of olive oil (OO) samples. Some methods are unsupervised (Un) whereas others are supervised (S).

The first question focuses on highlighting high or low regulation levels of some metabolites specifically to some phenotypes, cultivars or environmental conditions. Metabolite-phenotype specificity can be statistically highlighted by ordination methods including principal component analysis (PCA) and correspondence analysis (CA).


The second question refers to authenticity, which consists in combining several metabolites to define chemical fingerprints from which different OO samples can be reliably classified or predicted. Classification of a wide set of OO samples into different authentic groups is statistically carried out by hierarchical cluster analysis (HCA). However, affiliation of outside samples to appropriate groups requires predictive models based on pattern recognition techniques including linear discriminant analysis (LDA), Soft Independent Modelling of Class Analogies (SIMCA), Support Vector Machines (SVMs) and K-Nearest Neighbours (KNNs). Beyond this qualitative aspect, OO blends can be quantitatively evaluated by predicting some target characteristics or components from a high number of input variables (e.g., spectroscopic data). This question can be treated by partial least squares (PLS) regression.

The third question implies the analysis of memorized and interactive variations within heterogeneous matrices. This traceability question requires statistical techniques able to absorb and separate different types of variations associated with (or at the origin of) different polymorphic or multi-aspect patterns. These techniques include artificial neural networks (ANNs) and the simplex mixture design-based approach.

Apart from the goal-based criterion, these different statistical methods can be classified into unsupervised and supervised types. Unsupervised methods do not need any preliminary information and provide neutral (not guided) results which help to understand complex structures of studied systems. These techniques include PCA, CA and HCA. Supervised methods, in contrast, are guided by preliminary information which serves as reference for target final results. Supervised techniques include LDA, PLS, SIMCA, SVMs, KNNs and the simplex approach. Finally, ANNs represent a set of methods including both supervised and unsupervised ones.
This review provides an illustrated presentation of traditional and modern chemometrics methods applied in the olive oil field. Its ultimate objective is to provide a guideline linking the guiding questions of authenticity, specificity and traceability to appropriate chemometrics methods. Basic methodological aspects are complemented by several recent illustrative applications showing the wide interest and perspectives of chemometrics in this food field. Applications essentially concern chromatographic and spectroscopic data including HPLC, GC, UV, NIR, MIR, NMR data, etc., in addition to genetic markers. Chemical data concern several types of metabolites including fatty acids (FAs), triacylglycerols (TGCs), sterols, phenols, volatiles, etc. Key uses of chemometrics analysis of olive oil samples include determination of geographical and varietal origins, control of standard composition (conformity, purity, adulteration), detection of historic handling processes, and evaluation of the proportions of mixed OO cultivars in binary and multivarietal blends. These qualitative pattern recognitions and quantitative evaluations of compositions help to make quality control more precise and to reduce adulteration.

2. Chemometrics Analysis of Specificity

Specificity refers to the search for elementary traits which are specific to or characteristic of an OO sample, leading to its distinction within a polymorphic complex system. Several OO samples associated with different geographical origins/environmental conditions were characterized by different variation ranges and trends of metabolites. Specificity of different samples can be highlighted by ordination methods including principal component analysis (PCA) and correspondence analysis (CA).

Principal Component Analysis and Correspondence Analysis

General Principle

PCA and CA are multivariate methods helping to graphically visualize trends of individuals governed by partial correlations between variables characterizing a polymorphic system [9–13]. This is carried out by compression of the large-dimension and highly dispersed initial dataset into a small and well-structured space. This structuration is provided by new components, called principal components (PCs), combining all the initial variables. PCs represent orthogonal directions along which the total variation is decomposed into structured blocks; such blocks show hierarchically decreasing coverage of total variation. PCA and CA are carried out by analysing variability of both columns (variables) and rows (individuals), leading to a dual analysis which highlights partial links between variables and subsequent behaviours of individuals. In this way, individuals with similar or opposite behaviours are topologically identified by particular levels and trends between some variables. This makes it possible to identify the specificity of several groups of individuals on the basis of high or low levels for some variables (positive or negative trends between them).

Methodological Steps

PCA and CA algorithms can be summarized by three steps consisting of eigenvalues, eigenvectors and PCs calculations (Figure 2).

[Figure 2 panels: data matrix X (n rows, p columns X1, ..., Xj, ..., Xp); (a) eigenvalues λ1, ..., λj, ..., λp; (b) eigenvectors U1, ..., Uj, ..., Up with coordinates u1.j, ..., uj.j, ..., up.j; (c) principal components matrix, with PCj = X.Uj = X1.u1.j + ... + Xj.uj.j + ... + Xp.up.j; (d) scores plot (individuals and their trends along PC1 and PC2) and loading plot (variables X1, X2, Xj, Xp along U1 and U2).]

Figure 2. Different methodological (computational and visualization) steps (a–c and d, respectively) of principal components analysis and correspondence analysis.

The number of PCs is equal to that of eigenvalues and eigenvectors, and is defined by the lowest dimension of data matrix X. If p < n, one expects to calculate p eigenvalues λj, p associated eigenvectors Uj followed by p PCs (PCj).

Eigenvalues λj are weighting values quantifying the relative part of total variation along different PCs (PCj) (Figure 2a). Eigenvectors are orthogonal and define the spatial directions of PCs (Figure 2b,d). PCs contain the new coordinates of individuals and are calculated by linear combinations between the eigenvectors and initial data matrix X (Figure 2c). In the p linear combinations, the coordinates of eigenvectors represent algebraic coefficients quantifying the contributions of the different variables to the p PCs. Finally, the two topological spaces (called score and loading plots) are considered in parallel to analyse the relative behaviours of individuals under the effect of partial correlations between variables (Figure 2d). From the subspace defined by a given factorial plot PCjPCk, different trends are identified from the projection of different points at the extremities of the different PCs. Spatial proximity between projected points generally indicates similar behaviours toward the different variables. On the other hand, in the factorial plot of variables, positive, negative or non-significant correlations are indicated by close, opposite or orthogonal projections of points, respectively (Figure 2d). For example, Figure 2d shows partial negative correlations between X1 and X2 along PC2 and between Xj and Xp along PC1. Finally, by considering both row and column plots, some individuals are specifically characterized by some variables on the basis of their projection in a same subspace. A variable has a particularly high or low level in an individual if the two points (individual point and variable point) are projected in a same subspace or in two opposite subspaces, respectively. These graphical analyses help to identify particular subpopulations that can be indicative of different phenotypes or polymorphic trends. For instance, Figure 2d shows four trends (Tr), among which Tr1 and Tr2 are characterized by high levels of variables X1 and X2, respectively. Moreover, these trends are opposite along the second principal component PC2 because of the negative correlation between X1 and X2.
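As an illustration, the three computational steps (eigenvalues, eigenvectors, PCs) can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the procedure of any cited study: the data matrix is centred by column, the eigen-decomposition is applied to the covariance matrix (PCA case only; the transformation specific to CA is not shown), p < n is assumed, and the helper names (`jacobi_eigen`, `pca`) and the small fatty-acid table `X_demo` are hypothetical. A hand-written Jacobi rotation routine stands in for a library eigen-solver so that the example stays self-contained.

```python
import math

def jacobi_eigen(S, sweeps=100, tol=1e-18):
    """Eigenvalues and eigenvectors of a symmetric matrix by Jacobi rotations."""
    p = len(S)
    A = [row[:] for row in S]
    V = [[float(i == j) for j in range(p)] for i in range(p)]
    for _ in range(sweeps):
        off = sum(A[i][j] ** 2 for i in range(p) for j in range(p) if i != j)
        if off < tol:
            break
        for i in range(p - 1):
            for j in range(i + 1, p):
                if A[i][j] == 0.0:
                    continue
                d = A[j][j] - A[i][i]
                sign = 1.0 if d >= 0 else -1.0
                theta = 0.5 * math.atan2(2.0 * A[i][j] * sign, abs(d))
                c, s = math.cos(theta), math.sin(theta)
                for M in (A, V):      # right-multiply A and V by the rotation
                    for k in range(p):
                        mi, mj = M[k][i], M[k][j]
                        M[k][i], M[k][j] = c * mi - s * mj, s * mi + c * mj
                for k in range(p):    # left-multiply A by the transposed rotation
                    ai, aj = A[i][k], A[j][k]
                    A[i][k], A[j][k] = c * ai - s * aj, s * ai + c * aj
    order = sorted(range(p), key=lambda j: A[j][j], reverse=True)
    eigvals = [A[j][j] for j in order]
    eigvecs = [[V[k][j] for k in range(p)] for j in order]  # one list per eigenvector Uj
    return eigvals, eigvecs

def pca(X):
    """Return eigenvalues, eigenvectors Uj and scores PCj = Xc.Uj (Xc = centred X)."""
    n, p = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(p)]
    Xc = [[row[j] - means[j] for j in range(p)] for row in X]
    S = [[sum(r[a] * r[b] for r in Xc) / (n - 1) for b in range(p)]
         for a in range(p)]
    eigvals, eigvecs = jacobi_eigen(S)
    scores = [[sum(r[k] * u[k] for k in range(p)) for u in eigvecs] for r in Xc]
    return eigvals, eigvecs, scores

# Hypothetical table: 5 oil samples x 3 fatty-acid percentages (made-up values).
X_demo = [
    [12.1, 70.3, 10.2],
    [11.8, 71.0, 9.9],
    [15.2, 63.5, 14.8],
    [14.9, 64.1, 15.1],
    [13.0, 68.0, 12.0],
]
eigvals, eigvecs, scores = pca(X_demo)
total = sum(eigvals)
for j, lam in enumerate(eigvals, start=1):
    print(f"PC{j}: eigenvalue = {lam:.4f}, share of total variation = {lam / total:.1%}")
```

The printed shares correspond to the relative parts of total variation carried by the successive PCs; the rows of `scores` give the coordinates used in a scores plot, and the entries of each eigenvector give the coordinates of variables in the loading plot.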

Application of PCA and CA in OO Field

In the OO field, PCA and CA were used to highlight the best regulated metabolites or characteristic metabolic profiles of different biological varieties, leading to a better distinction between them. CA was applied to extract typical metabolic profiles of three French virgin olive oils (VOOs) characterized by fatty acid profiles: Aglandau (A), Grossane (G) and Salonenque (S) [14]. The three cultivars showed relatively higher regulations toward some fatty acids (FAs): 17:0 and 17:1ω8 for A, 16:1ω7 and 18:1ω7 for G, 18:2ω6 for S (Figure 3a).

Among ordination analyses, PCA was applied for inter-country varietal differentiation from FAs and squalene chromatographic profiles containing major and minor compounds [15]. PCA helped to differentiate:

- Between Tunisian (6 varieties), Algerian (6 varieties) and Moroccan VOOs (1 variety) on one hand,
- Between different Tunisian VOOs on the other hand.

Tunisian and Moroccan VOOs showed clear chemical differentiations compatible with the high geographical distance between the two countries. Also, clear differentiations were highlighted between Algerian and Tunisian cultivars except for the Blanquette variety, which showed some overlapping with the Tunisian Chetoui variety. The same PCA-based work showed different trends of different VOO varieties originating from a same country (Tunisia) (Figure 3b):

- Chemchali occupied a specific topological subspace due to relatively high levels of saturated FAs (20:0, 22:0).
- Chemlali and Zalmati showed relatively high levels of three upstream chained metabolites: 16:0 → 16:1ω7 → 18:1ω7. These two varieties were characterized by C7-monounsaturation vs. C9-monounsaturation in Chetoui and Oueslati.
- These two latter varieties (Chetoui and Oueslati) were characterized by relatively high levels of C9-monounsaturated FAs (16:1ω9, 18:1ω9, 20:1ω9) linked to sequential metabolic elongation processes.

In another study, Semmar et al. (2016) [16] highlighted chemical differentiation between Chetoui and Oueslati by applying correspondence analysis (CA) on FAs (Figure 3c). Although these two varieties showed relatively higher regulation of C9-monounsaturated FAs compared to Chemlali, Chetoui had more affinity for 16:1ω9 and 20:1ω9 than Oueslati, which showed relatively higher regulation of 18:1ω9. Moreover, CA highlighted the lowest 16:1ω7 and 18:1ω7 levels in Chetoui, two minor FAs characterizing Chemlali (because of maximal regulations). However, Oueslati showed the lowest level for 18:2ω6, which occurred at significantly higher levels in Chetoui and Chemlali.

At a higher complexity level, PCA was applied to characterize several blends combining VOO with other vegetable oils. This differential characterization is needed to avoid or reduce adulteration of food oils: (1) some oils are often used as adulterants in OO blends because they are relatively cheap; (2) virgin olive oil (VOO) can be added to other vegetable oils for nutritional value and economic reasons [17,18].

In this field, PCA was applied on UV-spectrometry data to highlight different variation poles specific to different extra virgin olive oils (EVOOs) showing different adulteration types and levels due to presence-absence of corn oil, palm oil, soybean oil and sunflower oil (Figure 3d) [4]. PCA applied on a dataset of mono-, bi- and tri-oil samples revealed that bi-component oil blends containing EVOO were clearly separated from mono-component (pure) oils along the first principal component. This good separation was found to be due to the high content of EVOO-containing blends in compounds absorbing at 200–325 nm. The authors suggested that these compounds could be mono-unsaturated FAs (mainly oleic acid), which are abundant in EVOO [4,19]. The second PC separated pure and EVOO-containing blends from blends adulterated by palm oil (PO). The loading plot of UV variables showed that this separation was specifically due to PO-characterizing compounds absorbing at 325–350 nm.

[Figure 3 panels: (a) French OOs (Aglandau, Grossane, Salonenque); (b) Tunisian OOs (Chemchali; Chemlali and Zalmati; Chetoui and Oueslati); (c) Chetoui, Chemlali and Oueslati; (d) UV absorbances of pure oils, EVOO-vegetable oil blends and blends adulterated by palm oil; (e) fatty acids of OO-sunflower oil blends (60% OO, 40% sunflower oil vs. 40% OO, 60% sunflower oil).]

Figure 3. Different study cases (a–e) on specificity of different OO monocultivars (a–c) or OO blends (d,e) by ordination analysis (principal component analysis (PCA) or correspondence analysis (CA)) applied on different physical-chemical parameters (fatty acids, UV absorbances).


PCA was applied to highlight association between quantitative levels of FAs and binary blends mixing OO with sunflower oil at the proportions of 50%–50%, 60%–40% and 40%–60% (Figure 3e) [18]. Blends containing 60% OO were characterized by relatively high concentrations of linolenic, oleic, arachidic and margaroleic acids. This was explained by the fact that these FAs showed higher content in OO than in sunflower oil. However, blends containing 40% OO were characterized by other FAs including behenic, linoleic, myristic and lignoceric acids, which are more concentrated in sunflower oil.

3. Chemometric Analysis of Authenticity

Authenticity refers to the definition of general patterns of samples integrating different specific characteristics. In the OO field, this helps to classify varieties and to confirm the identities of denominated samples.

Hierarchical Cluster Analysis (HCA)

General Principle

Hierarchical cluster analysis (HCA) is a classification method used to organize a whole heterogeneous population into well-distinct and homogeneous groups (called clusters) [20]. In this way, authentic groups are first defined as clusters which are more or less separated by calculated distances: distances are calculated between individuals, the closest of which are grouped into the same clusters, whereas distant ones are separated into different clusters. In a second technical step, resulting neighbour clusters are merged by applying a given aggregation rule among several possible ones. The combination of distance kind and aggregation rule results in several possible tree-like typologies called dendrograms (Figure 4). In a third step, the different clusters given by the dendrogram are characterized by multivariate patterns due to differential variations and/or non-overlapping variation ranges of different experimental parameters. Finally, the dendrogram helps to identify how many distinct groups make up a whole population on one hand, and how close or distant these groups are from one another on the other hand. This helps to understand the organization of the whole studied population.

Methodological Steps

The first methodological step in HCA consists in calculating distances or similarity levels between individuals. Distances are calculated when data are quantitative, whereas similarity indices are calculated when the system is described by qualitative variables [20–24]. There are several types of distances that can be used including Manhattan, Euclidean, Chi-2, etc. [20,25]. Similarity indices are also numerous including Sorensen-Dice, Jaccard, Simpson, Tanimoto coefficients [11,20]. After distance calculation, all the individuals will be iteratively grouped or separated into different clusters using an aggregation or linkage rule. These rules include single-, complete-, centroid-, average- and Ward-link clustering [20–24] (Figure 4):

• In single link clustering, neighbour clusters are those having the closest individuals (Figure 4a).
• Complete link clustering proceeds by joining clusters having the less distant extreme individuals (Figure 4b).
• In centroid link clustering, neighbour clusters are those having the closest gravity centres (Figure 4c).
• In average link clustering, an average distance is calculated from all the distances separating all the pairs of points. Clusters showing the lowest average distance will be merged (Figure 4d).
• Ward linkage is based on the calculation of the ratio between two variances: variance between clusters over variance within group. Clustering leading to the lowest reduction of this ratio will be applied (Figure 4e).
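The combination of a distance kind with an aggregation rule can be sketched as follows. This is a minimal illustration, assuming clusters are given as lists of coordinate tuples and qualitative profiles as sets of present markers; all function names and the two toy clusters are hypothetical. The Ward criterion is written in its common form (the increase in within-cluster sum of squares caused by a merge), which is an assumption replacing the variance-ratio wording used above, and the Chi-2 distance is not shown.

```python
import math
from itertools import product

# Distance kinds for quantitative profiles
def euclidean(x, y):
    return math.dist(x, y)

def manhattan(x, y):
    return sum(abs(a - b) for a, b in zip(x, y))

# Similarity indices for qualitative (presence/absence) profiles, non-empty sets assumed
def jaccard(a, b):
    a, b = set(a), set(b)
    return len(a & b) / len(a | b)

def sorensen_dice(a, b):
    a, b = set(a), set(b)
    return 2 * len(a & b) / (len(a) + len(b))

# Aggregation (linkage) rules between two clusters
def centroid(cluster):
    """Gravity centre of a cluster."""
    return [sum(col) / len(cluster) for col in zip(*cluster)]

def single_link(a, b, dist=euclidean):
    """Distance between the closest individuals of the two clusters."""
    return min(dist(x, y) for x, y in product(a, b))

def complete_link(a, b, dist=euclidean):
    """Distance between the most distant (extreme) individuals."""
    return max(dist(x, y) for x, y in product(a, b))

def average_link(a, b, dist=euclidean):
    """Mean of all pairwise distances between the two clusters."""
    return sum(dist(x, y) for x, y in product(a, b)) / (len(a) * len(b))

def centroid_link(a, b, dist=euclidean):
    """Distance between the two gravity centres."""
    return dist(centroid(a), centroid(b))

def ward_cost(a, b):
    """Increase in within-cluster sum of squares if a and b were merged."""
    na, nb = len(a), len(b)
    return na * nb / (na + nb) * math.dist(centroid(a), centroid(b)) ** 2

# Two toy clusters on a single axis (made-up values)
cluster_a = [(0.0, 0.0), (1.0, 0.0)]
cluster_b = [(4.0, 0.0), (6.0, 0.0)]
for rule in (single_link, complete_link, average_link, centroid_link, ward_cost):
    print(rule.__name__, rule(cluster_a, cluster_b))
```

At each iteration of an agglomerative procedure, the pair of clusters with the lowest value of the chosen rule would be merged; passing `dist=manhattan` changes the distance kind without changing the rule.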

From the dendrogram, different clusters are identified and interpreted on the basis of two criteria: distinctness between clusters and compactness within clusters (Figure 4f). Distinctness defines the distance separating a cluster from the rest (Figure 4f). Higher distinctness indicates a more distinct (differentiated) cluster. Compactness defines the highest distance existing within a cluster (Figure 4f). The lower the compactness value, the more homogeneous the cluster.
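The two criteria can be computed directly from the points of a cluster, as in the short sketch below. It rests on assumptions that are not stated in the text: compactness is taken as the largest pairwise Euclidean distance inside the cluster, and distinctness as the smallest Euclidean distance between a member of the cluster and any individual outside it. The function names and the toy coordinates are hypothetical.

```python
import math
from itertools import combinations

def compactness(cluster):
    """Highest distance existing within a cluster (0 for a single individual)."""
    if len(cluster) < 2:
        return 0.0
    return max(math.dist(a, b) for a, b in combinations(cluster, 2))

def distinctness(cluster, rest):
    """Distance separating a cluster from the rest (nearest outside individual)."""
    return min(math.dist(a, b) for a in cluster for b in rest)

# Made-up coordinates: one tight cluster and two outside individuals
cluster = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
rest = [(5.0, 0.0), (6.0, 0.0)]
print("compactness :", compactness(cluster))
print("distinctness:", distinctness(cluster, rest))
```

Under these assumptions, a cluster whose distinctness is large relative to its compactness is both well separated and homogeneous.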

Figure 4. (a–e) Different agglomeration rules used in hierarchical cluster analysis (HCA); (f) Dendrogram or tree-like diagram provided by HCA showing different class structures characterized by different distinction and homogeneity levels (distinctness and compactness, respectively).


Application of HCA in OO Field

HCA was applied to classify many populations of olives using different metabolic and physical-chemical variables. HCA was applied on phenol, triacylglycerol and sterol contents to statistically highlight the key role of these metabolites in chemical differentiation of five Tunisian minor olive cultivars [26]. Also, HCA based on Manhattan distance and complete linkage was used to classify and calculate separations between five Italian OO cultivars on the basis of their FA contents [27]. Both the dendrogram and the PCA scores plot showed that Coratina and Oliarola cultivars were the most distant, leading to their strong differentiation. The three other EVOOs (Simone, Olivastro, Leccino) showed intermediate states with some co-occurrence clusters in HCA or overlapping subspaces in PCA.

HCA was also applied on 31P NMR data using Euclidean distance and single linkage to explore similarity and dissimilarity between 59 samples representing three OO groups: 34 Greek extra virgin OOs (EVOO), 13 refined OOs (ROO) and 12 lampante OOs (LOO) [28]. Higher homogeneity (identity) was highlighted for EVOO followed by ROO then LOO.

In another HCA-based work, OO samples were reliably differentiated from non-OO ones on the basis of ATR-FTIR spectra [17]. In this work, 111 pure oil samples were considered including 41 edible vegetable oils and 70 OOs originating from European and American countries (Italy, France, Spain, US, Mexico). The 111 samples were characterized by 2584 wavenumbers covering a spectral region from 650 cm−1 to 3588 cm−1. Using a variance-weighted distance and Ward’s aggregation method, a dendrogram was constructed to highlight three clusters separating OO samples from two other oil categories (Figure 5a): the first and second clusters represented flax seed oils and most of the non-OO samples, respectively. They were clearly separated from a third cluster made up of the 70 OO samples in addition to four non-olive samples (one high oleic sunflower, one safflower, two peanut oils). This suggested that some non-OOs have physical-chemical profiles more similar to OO than other oils. More generally, the HCA results showed that information contained in infrared spectra can be reliably used for OO distinction from other edible oils like canola, corn, soybean, sunflower oils among others. Spectral bands of triglycerides containing unsaturated FAs provided the main discrimination basis.

Beyond authenticity analysis applied at the scale of separated OO and non-OO samples, the same authors applied HCA to carry out classification of pure OO and binary samples mixing OO with another vegetable oil at percentages varying from 10% to 90% [17]. Co-occurring vegetable oils and OOs covered eleven European and American origins. Blends were characterized by infrared (IR) data. The dendrogram obtained with Ward’s method highlighted clear separation between pure OO, non-OO and mixed OO-vegetable oil samples.

Beyond physical-chemical and metabolic markers, HCA was applied on genetic markers (RAPD and ISSR) for inter- and intra-cultivar classification of Portuguese olive trees [29]. The inter-cultivar dendrogram was obtained using the Dice coefficient. It highlighted different similarity-dissimilarity degrees between the eleven studied olive cultivars (Figure 5b): Galega, Madural and Blanqueta cultivars were more dissimilar from all the others, whereas Negrinha and Azeiteira were the most similar to each other. The Galega cultivar was then subjected to intra-cultivar HCA using the Jaccard index. The resulting dendrogram highlighted five clusters which corresponded to five agro-ecological regions of OO production.

[Figure 5 panels: (a) dendrogram of olive oil samples based on infrared spectral bands due to triglycerides containing unsaturated FAs: Cluster 1, flax seed oils; Cluster 2, non-olive oils; Cluster 3, 70 European and American OOs (France, Italy, Spain, Mexico, US). (b) dendrogram of 11 Portuguese olive trees based on RAPD and ISSR genetic markers: Madural, Galega (5 agro-ecological regions), Blanqueta, Azeiteira, Negrinha and others.]

Figure 5. Study cases on authenticity of olive oils (a) and olive trees (b) highlighted by cluster analysis using spectroscopic (a) and genetic (b) variables.

4. Chemometric Analysis of Authenticity and Traceability

Beyond authenticity, which focuses on identification and origin determination of samples, traceability refers to the quantitative evaluation and qualitative discrimination of components in complex matrices. In the OO field, traceability concerned adulteration detection of labelled products and composition evaluations of heterogeneous or multi-varietal blends. A common authenticity-traceability problem concerned the prediction of geographical origins of mono-varietal samples. It is generally treated by linear discriminant analysis among other pattern recognition methods.

Pattern recognition methods include LDA, quadratic discriminant analysis (QDA), stepwise discriminant analysis (SDA), Soft Independent Modelling of Class Analogies (SIMCA) and a combined approach based on partial least-square regression and DA (PLS-DA) [30]. For a given profile, these methods calculate several scores corresponding to all the possible cultivars; then the maximal score is retained to attribute the unknown profile to the appropriate affiliation group.

4.1. Linear Discriminant Analysis

4.1.1. General Principle

Linear discriminant analysis (LDA) belongs to a set of statistical methods called pattern recognition methods. These methods are applied to predict class membership (e.g., cultivars, geographical origins, etc.) of unknown samples (individuals) from quantitative profiles made by several measured variables (e.g., metabolic concentrations). This goal is reached by constructing a statistical model combining p quantitative variables Xj (j = 1, ..., p) to distinguish between q patterns (q classes).

4.1.2. Methodological Steps of LDA

Attribution of quantitative individual profiles to appropriate classes requires several methodological steps: the pattern space is initially constructed, then transformed into a reduced discriminant space in which individuals are finally classified [31]. Pattern space construction is carried out by characterizing each group (class) by its centroid position and dispersion: for a q-group system characterized by p quantitative variables Xj, the centroid of group k is spatially located by the average vector $C_k = (\bar{X}_{1k}, \ldots, \bar{X}_{jk}, \ldots, \bar{X}_{pk})$. The internal dispersion of a group is given by its variance. For the simple example of q = 2 groups (A and B) and p = 2 variables X1, X2, the two centroids are located at the points of coordinates (x1A, x2A) and (x1B, x2B) (Figure 6a). Distances between centroids and distances of individuals to their centroids are used to define two types of variance, called between-class and within-class variances, respectively.
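As a rough illustration of pattern space construction, the sketch below computes group centroids and the between- and within-class dispersions for two groups described by two variables. The function names and the numerical profiles are hypothetical and chosen only for illustration; dispersions are returned as sums of squares, which differ from variances only by a degrees-of-freedom divisor.

```python
from statistics import mean

def centroid(points):
    """Average vector of a group: one mean per variable."""
    return [mean(column) for column in zip(*points)]

def between_within(groups):
    """Return (between-class, within-class) sums of squared distances."""
    everyone = [p for pts in groups.values() for p in pts]
    grand = centroid(everyone)
    between = within = 0.0
    for pts in groups.values():
        c = centroid(pts)
        # Squared distance from group centroid to grand centroid,
        # weighted by group size.
        between += len(pts) * sum((cj - gj) ** 2 for cj, gj in zip(c, grand))
        # Squared distances of individuals to their own centroid.
        within += sum((xj - cj) ** 2 for p in pts for xj, cj in zip(p, c))
    return between, within

# Hypothetical profiles (X1, X2) for two groups; not data from the review.
groups = {
    "A": [(1.0, 2.0), (2.0, 3.0), (3.0, 4.0)],
    "B": [(6.0, 1.0), (7.0, 2.0), (8.0, 3.0)],
}
centroids = {name: centroid(pts) for name, pts in groups.items()}
between, within = between_within(groups)
print(centroids)
print(between, within, between / within)
```

A large between-to-within ratio indicates groups that are compact and far apart, which is the situation LDA tries to reach in the discriminant space.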

Figure 6. (a) Separation of individual points into well-distinct groups by a discriminant function combining discriminant variables Xj; (b) comparative illustration of discriminant and non-discriminant axes; linear coefficients in the discriminant function are determined by maximizing the ratio of between-group variance to within-group variance.

In a second step, the initial pattern space is transformed by means of linear combinations of the p variables Xj (j = 1, ..., p) [31,32]. Each linear combination provides a linear discriminant function (DF) which represents a separating axis between the q groups in the new discriminant space (Figure 6a). The coefficients wj of the variables Xj in the different linear combinations are optimized by maximizing the ratio of between-group variance to within-group variance. This constrains the q groups to be as distinct as possible within a reduced space showing minimum loss of differentiation between the groups (Figure 6b). In a third step, DFs are used to calculate q classification scores (CSs) for each individual. The highest of the q scores defines the affiliation class of the considered individual. The q CSs can also be converted into q probabilities, the maximal value of which is used to attribute the new individual to the most plausible group [33].
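The three steps can be sketched for the simplest case of two variables. The example below is a minimal illustration, assuming equal prior probabilities and a pooled within-group covariance matrix; it uses the classical linear classification functions, and all function names and data are hypothetical rather than taken from the cited studies.

```python
import math

def mean_vec(points):
    """Centroid of a group described by two variables."""
    n = len(points)
    return [sum(p[j] for p in points) / n for j in range(2)]

def pooled_cov(groups, centroids):
    """Pooled within-group covariance matrix (2 x 2)."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    n_total = 0
    for name, pts in groups.items():
        c = centroids[name]
        n_total += len(pts)
        for p in pts:
            d = [p[0] - c[0], p[1] - c[1]]
            for a in range(2):
                for b in range(2):
                    s[a][b] += d[a] * d[b]
    dof = n_total - len(groups)
    return [[v / dof for v in row] for row in s]

def inv2(m):
    """Inverse of a 2 x 2 matrix."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

def classification_scores(x, centroids, s_inv):
    """One linear classification score per group (equal priors assumed)."""
    scores = {}
    for name, c in centroids.items():
        w = [s_inv[0][0] * c[0] + s_inv[0][1] * c[1],
             s_inv[1][0] * c[0] + s_inv[1][1] * c[1]]
        scores[name] = w[0] * x[0] + w[1] * x[1] - 0.5 * (w[0] * c[0] + w[1] * c[1])
    return scores

def to_probabilities(scores):
    """Convert classification scores into probabilities summing to 1."""
    top = max(scores.values())
    expd = {k: math.exp(v - top) for k, v in scores.items()}
    total = sum(expd.values())
    return {k: v / total for k, v in expd.items()}

def classify(x, centroids, s_inv):
    scores = classification_scores(x, centroids, s_inv)
    return max(scores, key=scores.get), to_probabilities(scores)

# Hypothetical learning profiles (X1, X2); not data from the review.
groups = {
    "A": [(1.0, 2.0), (2.0, 1.0), (3.0, 3.0)],
    "B": [(6.0, 5.0), (7.0, 7.0), (8.0, 6.0)],
}
centroids = {name: mean_vec(pts) for name, pts in groups.items()}
s_inv = inv2(pooled_cov(groups, centroids))
label, probs = classify((2.5, 2.0), centroids, s_inv)
print(label, probs)
```

With more than two variables the same logic applies, but the matrix inversion would normally be delegated to a numerical library.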

4.1.3. Applications of LDA in OO Field

LDA has been widely applied for chemical authentication and traceability analysis of different OO classes including mono-varietal VOOs or EVOOs, registered designations of origin (RDOs), geographical areas and harvesting years, etc. Authentications were based on various discriminant variables including FAs, sterols, triacylglycerols, phenols and volatiles [33–39].

Using NMR data on phenolic and FA compositional parameters, Petrakis et al. (2008) [37] applied discriminant analysis for pattern recognition of different Greek EVOOs, taking into account different spatio-temporal scales: (i) three geographical divisions (Crete, Peloponnesus, Zakynthos) including (ii) six sites of origin (Sitia, Heraklion, Chania, Messinia, Lakonia, Zakynthos) from which (iii) 131 samples were obtained in different harvest years. Some metabolites seemed to play an important role in the discrimination of EVOOs according to geographical location: the three geographical sites of the Crete division were highly discriminated, with clear separation from the Zakynthos division (one site) and an intermediate state for the Peloponnesus division (two sites) (Figure 7).

Figure 7. Chemical discrimination of different Greek extra virgin olive oils (EVOOs) originating from different geographical sites of three divisions (Crete, Peloponnesus, Zakynthos) using NMR-analysed fatty acids and phenolic compounds.

The high discrimination of Cretan sites was due to their higher concentrations of total and free hydroxytyrosol and total tyrosol. In addition to phenolic compounds, linoleic (LA) and oleic (OA) acids provided significant markers of geographic origin. Traceability performance was linked to two genes, OeFDA2 and OeFDA6, encoding the key enzymes of the LA and OA pathway. These genes seem to have survived the extensive gene flow. In Mediterranean olive trees, gene flow seems to have obscured the phylogenetic signal but not the geographic one.

LDA was also applied to 1H NMR spectra of phenolic extracts to reliably separate Italian OOs according to both cultivar types and geographical origins [35]. In another work, LDA was applied to mass-spectrometry and UV data to differentiate between three geographical nominations of Italian (Ligurian) EVOOs [40].

Beyond authenticity of separated OO samples, LDA was applied for authenticity analysis of OO blends. One study used normalized Raman spectra to discriminate between pure OO, pure soybean oil (SO) and OO-SO blends submitted to different temperatures (20–90 °C) [41]. Pure OO and OO-SO blends (i.e., adulterated OO) at 90 °C were clearly better discriminated by LDA than those at 20 °C (Figure 8); this highlighted an interactive discriminant effect linked to temperature elevation. This temperature-dependent effect was associated with varied spectral features of the 1265 cm−1 band, which is related to the degree of unsaturation.

Figure 8. Physical-chemical discrimination of pure OO and adulterated OO at a temperature of 90 °C, due to variation of the 1265 cm−1 spectral feature manifesting under temperature elevation.

Authenticity analysis of OO blends was also carried out by LDA to predict five French VOO registered designations of origin (RDOs) made by combining different cultivars as primary or secondary components. The RDOs consisted of Nyons (Y, n = 126 samples), Vallée des Baux (VB, n = 98), Aix-en-Provence (PA, n = 99), Haute Provence (HP, n = 85) and Nice (C, n = 131) (Figure 9) [34]. The five RDOs were chemically characterized by FA and triacylglycerol (TGC) contents (14 FAs and 19 TGCs). Application of LDA to the whole chemical dataset of 539 samples belonging to the five RDOs showed that PA and VB were the closest origins because of the minimal Mahalanobis distance separating them. However, Y was the most distant RDO, followed by C, leading to higher authenticity and identity. Although all the variables were useful for pattern recognition, some proved to be particularly influential in the discrimination of RDOs because of higher standardized discriminant coefficients. These concerned:

• Triolein (OOO), linoleoyl-dioleoyl-glycerol (LOO), palmitoyl-dioleoyl-glycerol (POO) and dilinoleoyl-oleoyl-glycerol (LOL) for TGCs.
• Monounsaturated FAs sum, oleic, palmitic, stearic and hypogeic acids for FAs.

Figure 9. Chemical discrimination of five French VOO registered designations of origin (RDOs) made by mixing different OO cultivars and characterized by 14 fatty acids (FAs) and 19 triacylglycerols (TGCs), among which four FAs and four TGCs were the most discriminant. Legend: OOO, triolein; LOO, linoleoyl-dioleoyl-glycerol; POO, palmitoyl-dioleoyl-glycerol; LOL, dilinoleoyl-oleoyl-glycerol.

At a higher complexity level, LDA was applied to FA profiles to predict multivarietal quantitative compositions of OO blends varying in the weights (proportions) of co-occurring French and Tunisian cultivars [14,16]. Such a problem goes beyond qualitative discrimination of separated cultivars and RDO (blend) types because it requires evaluating the percentages of several co-occurring OOs in the same blend from the chemical profile of that blend. This application is presented in Section 4.7.3 [14,16].

4.2. Partial Least Square Regression

4.2.1. General Principle

PLS can be applied to rank many input variables according to their relevance for the prediction of the output variable Y. This helps to extract a small number of reliable predictors of Y from a large mass of candidate input variables X, which is particularly interesting for datasets containing few individuals and many variables [42,43]. Moreover, PLS is a method of choice for handling possibly correlated predictors, which are combined and condensed with all the initial variables into new composite variables called latent variables (LVs) (or components, or factors). Like the principal components (in PCA or CA), LVs are newly constructed variables which weight and condense the initial variables to handle their highly diversified and/or correlated aspects. However, PLS regression (PLSR) differs from PC regression (PCR) in that it incorporates information on both X and Y in the definition of LVs:

- In PLSR, LVs are calculated such that the covariance between Y and X is maximal. In this way, LVs are built on the basis of both the input (X) and output (Y) variables.
- In PCR, one focuses on maximization of the variance of X in different spaces. Only the input variables X are used for LV construction.

4.2.2. Methodology of PLS Regression

Initially, PLS decomposes the predictor matrix X into orthogonal scores T and loadings P:

X = T·P

This allows the output variable Y to be regressed on the first columns of the scores instead of on X. Prediction of Y by PLSR is then carried out iteratively by adding a single LV at each step. In this way, several nested PLSR models are obtained, which predict the output variable Y through linear functions of the first k LVs. The best model is the one giving the minimal prediction error after cross-validation. The final PLS model has the double advantage of:

- Predicting the output variable Y from a small number of LVs.
- Helping to identify the most contributive input variables (e.g., spectral wavenumbers).

Thus, PLSR offers an efficient chemometric tool for variable selection, making it possible to extract the most pertinent predictors of complex and highly variable systems such as adulterated OO blends. Before quantitative evaluation of the adulteration level (proportion), LDA can be applied to determine the nature of the adulterant.
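The iterative construction of LVs can be sketched as follows for a single output variable (often called PLS1). This is a minimal NIPALS-style illustration written in plain Python, not the implementation used in the cited studies; the function names and the small dataset (three correlated predictors, with the third equal to the sum of the first two) are hypothetical.

```python
def pls1_fit(X, y, n_lv):
    """Fit a single-output PLS model with up to n_lv latent variables."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(p)] for row in X]  # centred X
    f = [v - y_mean for v in y]                                # centred y
    W, P, Q = [], [], []
    for _ in range(n_lv):
        # Weights: direction of X with maximal covariance with y.
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(p)]
        norm = sum(v * v for v in w) ** 0.5
        if norm < 1e-12:  # nothing left to explain
            break
        w = [v / norm for v in w]
        t = [sum(E[i][j] * w[j] for j in range(p)) for i in range(n)]  # scores
        tt = sum(v * v for v in t)
        load = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(p)]
        q = sum(f[i] * t[i] for i in range(n)) / tt  # regression of y on t
        # Deflation: remove what this LV explains from X and y.
        E = [[E[i][j] - t[i] * load[j] for j in range(p)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        W.append(w)
        P.append(load)
        Q.append(q)
    return {"x_mean": x_mean, "y_mean": y_mean, "W": W, "P": P, "Q": Q}

def pls1_predict(model, x):
    """Predict y for one new profile x by replaying the deflation steps."""
    e = [xj - mj for xj, mj in zip(x, model["x_mean"])]
    y_hat = model["y_mean"]
    for w, load, q in zip(model["W"], model["P"], model["Q"]):
        t = sum(ej * wj for ej, wj in zip(e, w))
        y_hat += q * t
        e = [ej - t * lj for ej, lj in zip(e, load)]
    return y_hat

# Hypothetical data: x3 = x1 + x2 (collinear), y = 2*x1 + x2 without noise.
X = [(1.0, 2.0, 3.0), (2.0, 1.0, 3.0), (3.0, 4.0, 7.0),
     (4.0, 3.0, 7.0), (5.0, 6.0, 11.0)]
y = [4.0, 5.0, 10.0, 11.0, 16.0]
model = pls1_fit(X, y, n_lv=2)
prediction = pls1_predict(model, (2.0, 2.0, 4.0))
print(prediction)
```

In practice the number of LVs would be chosen by cross-validation, as described above, rather than fixed in advance.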

4.2.3. Application of PLS Regression in OO Field

Olive oils can be adulterated by adding cheap vegetable oils at different proportions to commercial blends. Such adulterations are qualitatively and quantitatively analysed by means of spectroscopic features including near-infrared (NIR), mid-infrared (MIR) and Raman data. These techniques provide highly condensed information on the mixture matrices through large numbers of recorded wavenumbers (hundreds or thousands of variables). Many chemometric studies have treated IR data by PLSR in order to predict the adulteration level of commercial EVOOs by other vegetable oils added at variable proportions. The vegetable oils concerned include canola, hazelnut, pomace, sunflower, corn, soybean and walnut oils, among others [6,44–49].

4.3. PLS-DA Approach

4.3.1. General Principle

PLS-DA refers to a method combining partial least square (regression) and linear discriminant analysis. Like LDA, it aims to optimize the separation between different groups characterized by several quantitative variables [50]. It differs from LDA in that the q groups are initially represented by q separate indicator columns (0, 1), whereas in LDA all the groups are represented by one column containing q labels (Figure 10). Optimization of the separation between groups is therefore carried out by linking the (0, 1)-binary matrix Y to the quantitative matrix X containing p discriminant variables.

4.3.2. Methodological Steps of PLS-DA

Initially, q class membership columns are prepared by entering 1 or 0 depending on whether or not individuals belong to the corresponding classes. Then, optimization of the separation between the q groups is performed by maximizing the covariance between the p discriminant variables Xj and the q candidate groups (Figure 10).

Figure 10. Difference between linear discriminant analysis (LDA) and partial least square regression-discriminant analysis (PLS-DA) based on different initial labelling of groups.
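The difference in labelling can be sketched as follows. The example builds the q indicator columns used by PLS-DA from a single column of group labels, and shows the maximum-score assignment rule described in this subsection. The helper names and the predicted row are hypothetical; the predicted values stand in for the continuous output that a fitted PLS-DA model would return.

```python
def indicator_matrix(labels):
    """Turn one column of q labels into q indicator (1/0) columns."""
    classes = sorted(set(labels))
    Y = [[1 if lab == c else 0 for c in classes] for lab in labels]
    return classes, Y

def assign_class(y_pred, classes):
    """Assign an individual to the class with the maximum predicted value."""
    best = max(range(len(classes)), key=lambda k: y_pred[k])
    return classes[best]

# LDA-style labelling: a single column of group labels.
labels = ["A", "A", "B", "C", "C", "D"]

# PLS-DA-style labelling: one indicator column per group.
classes, Y = indicator_matrix(labels)
for lab, row in zip(labels, Y):
    print(lab, row)

# Hypothetical continuous prediction for one unknown individual.
predicted_row = [0.10, 0.70, 0.15, 0.05]
assigned = assign_class(predicted_row, classes)
print(assigned)
```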

Space reduction is carried out by calculating integrative variables, called latent variables (LVs), from linear combinations of X (LVX) and Y (LVY), respectively. The coefficients of the linear combinations are calculated under the constraint of maximizing the covariance of LVX and LVY. This leads to a linear classifier which has been proved to be statistically equivalent to that of LDA, but with the advantages of noise reduction and variable selection [51].

Finally, the PLS-DA model predicts continuous values of the Y variables, generally lying between 0 and 1. Values closer to 0 indicate that the individual does not belong to the corresponding class, whereas the closer the value is to 1, the higher the chance that the individual belongs to the class concerned. Therefore, individuals are assigned to the class with the maximum score in the Y-vector. Alternatively, threshold values (between 0 and 1) can be defined for the different classes using Bayes' theorem [51].

4.3.3. Application of PLS-DA in OO Field

PLS-DA models were applied to spectroscopic data (MIR, NIR, FTIR, 1H NMR) to predict authenticity and traceability of different EVOOs, protected designations of origin (PDOs), registered designations of origin (RDOs) and geographically-linked samples [52–56].

Bevilacqua et al. (2012) [52] applied PLS-DA to MIR and NIR data for pattern recognition of geographical origins of Italian EVOOs divided into Sabina and non-Sabina (other sites pooled together) (Figure 11).

Figure 11. Prediction of geographical origin of Italian EVOOs by PLS-DA using mid- (MIR) (a) and near- (NIR) (b) infrared data.

Before PLS-DA, MIR and NIR data were subjected to different pre-treatments for baseline correction consisting of: (a) linear; (b) quadratic (Q); (c) multiplicative scatter (MS); (d) detrending (D); (e) first derivative (1d); (f) second derivative (2d); (g) MS with 1d; (h) MS + 2d; (i) MS + Q; (j) MS + D. NIR and MIR datasets issued from the pre-treatments were modelled using calibration (Cbr) and cross-validation (CrV) techniques.

Using MIR data, the best predictive PLS-DA model was built with MS pre-treatment (followed by QB). This model predicted Sabina at 100% and 92.3% in Cbr and CrV, respectively, vs. 95% for other origins in both Cbr and CrV. Moreover, outside samples were submitted to the PLS-DA model (for final validation), which predicted Sabina and other origins at 85.7% and 86.7%, respectively. Analysis of the discrimination ability of MIR variables showed that C=O double bond stretching (around 1700 cm−1) and C–H bending (within the 650–750 cm−1 range) contributed the most to the PLS-DA model (Figure 11a). They were followed by significant (but lower) contributions of C–H stretching (2800–3100 cm−1) and C–O single bond stretching (1100 cm−1).

Using NIR data, four PLS-DA models showed high predictive ability in both Cbr and CrV. They concerned MS, D, 1d and MS + D, which predicted Sabina and other origins at 100% and 95.5%, respectively. Final validation based on outside samples gave predictive abilities of: (i) 100% for both Sabina and other origins with the 1d-based model; and (ii) 100% for Sabina vs. 93.3% for other origins with the three other models. Discriminant variables included all the spectral regions (Figure 11b):

• 4500–5000 cm−1, due to combination bands of C=C and C–H stretching vibrations of cis unsaturated FAs.
• 5650 and 6000 cm−1, due to combination bands and first overtone of C–H of methylene of aliphatic groups of oil.
• 7074 and 7180 cm−1, linked to the C–H band of methylene.
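Some of the pre-treatments listed before the MIR and NIR results can be sketched in a few lines. The example below is a simplified illustration under stated assumptions: the first derivative is a plain finite difference (smoothing filters are commonly used in practice), detrending removes a straight line fitted against the variable index, and multiplicative scatter correction regresses each spectrum on the mean spectrum. All function names and spectra are hypothetical.

```python
def linear_fit(x, y):
    """Least-squares slope and intercept of y regressed on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    return slope, my - slope * mx

def first_derivative(spectrum):
    """Finite-difference first derivative (1d)."""
    return [b - a for a, b in zip(spectrum, spectrum[1:])]

def detrend(spectrum):
    """Detrending (D): subtract a straight line fitted along the spectrum."""
    idx = list(range(len(spectrum)))
    slope, intercept = linear_fit(idx, spectrum)
    return [v - (intercept + slope * i) for i, v in zip(idx, spectrum)]

def msc(spectra):
    """Multiplicative scatter correction (MS) against the mean spectrum."""
    n = len(spectra)
    ref = [sum(s[j] for s in spectra) / n for j in range(len(spectra[0]))]
    corrected = []
    for s in spectra:
        slope, intercept = linear_fit(ref, s)
        corrected.append([(v - intercept) / slope for v in s])
    return corrected

# Hypothetical spectra: the second is the first with an offset and a scaling.
base = [1.0, 2.0, 4.0, 3.0, 1.0]
scaled = [0.5 + 2.0 * v for v in base]
corrected = msc([base, scaled])
derivative = first_derivative(base)
flat = detrend([1.0, 3.0, 5.0, 7.0])
print(corrected)
print(derivative)
print(flat)
```

After correction the two hypothetical spectra coincide, which is the intended effect of removing additive and multiplicative scatter differences.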

Apart from DA, PLS was combined with linear regression (PLSR) to predict proportions (0% to 5%) of EVOO and palm oil (PO) in three vegetable oil blends based on corn oil (CO), soybean oil (SO) and sunflower oil (SFO), respectively [4]. EVOO and PO represent desired and undesired (adulteration) oils in vegetable oil blends. Blends were initially characterized by UV spectra containing many wavelength-absorbance variables. The high number of these variables was reduced by applying PCA, which provided principal components used as latent variables in PLSR. PLSR models predicted EVOO proportions in EVOO-vegetable oil blends with determination coefficients R2 = 0.86, 1.00 and 0.98 in EVOO-CO, EVOO-SO and EVOO-SFO blends, respectively. For the adulterant PO in EVOO-vegetable oils, PLSR predicted its proportions with R2 = 0.85 (in EVOO-CO), 1.00 (in EVOO-SO) and 0.99 (in EVOO-SFO).

4.4. Soft Independent Modeling of Class Analogies

4.4.1. General Principle

Soft Independent Modelling of Class Analogies (SIMCA) is a pattern recognition method commonly used in chemometrics. The term "Soft" means that no hypothesis is made on the distribution of variables. As a consequence, overlapping of classes is not problematic, and individuals can be flexibly affiliated to more than one class. Flexibility is also provided by the concept of independence, whereby the classes are modelled separately. Specific models of the different classes are made by applying PCA to each class. In the end, each class is modelled by a specific PC-based model which delimits it in a region of space. A new individual is affiliated to class k if its distance to the corresponding region k is lower than its distances to the other regions (classes).

4.4.2. Methodological Steps

From each separately applied PCA, a class model is built from the PCs that best fit the variation within the corresponding class. The number of significant PCs in each class-model can be determined by cross-validation. Figure 12 illustrates two class models based on one and two PCs, respectively.

Figure 12. Illustration of the principle of Soft Independent Modelling of Class Analogies (SIMCA) based on independent application of principal component analysis to each class. In the example, class A is well predicted by the first PC (PC1(A)), whereas class B needs the first two PCs (PC1(B) and PC2(B)) to be well predicted.

The collection of q models defines q subspaces characterizing the q classes. Projection of a new individual i is followed by the calculation of q geometric distances to the q subspaces. The distance dik between individual i and class k is calculated by a combination of the distance within the model space (T² or leverage) and the orthogonal distance to the model space (Q² statistics or squared residuals) [52]. Individual i is affiliated to class k if dik is lower than a threshold value.
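The class-by-class modelling can be sketched as follows. This is a deliberately simplified illustration, assuming one principal component per class (extracted by power iteration) and using only the orthogonal residual distance to each class model; the within-model distance and the threshold test described above are omitted, and the new individual is simply affiliated to the nearest class. Function names and data are hypothetical.

```python
def fit_class_model(points, n_iter=200):
    """One-PC model of a class: (mean vector, first principal axis)."""
    n, p = len(points), len(points[0])
    mean = [sum(r[j] for r in points) / n for j in range(p)]
    centered = [[r[j] - mean[j] for j in range(p)] for r in points]
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(p)]
           for a in range(p)]
    v = [1.0] * p  # starting vector for the power iteration
    for _ in range(n_iter):
        v = [sum(cov[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = sum(x * x for x in v) ** 0.5
        v = [x / norm for x in v]
    return mean, v

def residual_distance(x, model):
    """Orthogonal distance from profile x to the one-PC class model."""
    mean, v = model
    e = [xj - mj for xj, mj in zip(x, mean)]
    t = sum(ej * vj for ej, vj in zip(e, v))  # score on the class PC
    return sum((ej - t * vj) ** 2 for ej, vj in zip(e, v)) ** 0.5

def simca_assign(x, models):
    """Affiliate x to the class whose model leaves the smallest residual."""
    distances = {k: residual_distance(x, m) for k, m in models.items()}
    return min(distances, key=distances.get), distances

# Hypothetical learning profiles for two classes; not data from the review.
training = {
    "A": [(1.0, 1.1), (2.0, 1.9), (3.0, 3.1), (4.0, 3.9)],
    "B": [(1.0, 8.0), (2.0, 8.1), (3.0, 7.9), (4.0, 8.0)],
}
models = {k: fit_class_model(pts) for k, pts in training.items()}
label_1, dist_1 = simca_assign((2.5, 2.6), models)
label_2, dist_2 = simca_assign((3.0, 8.05), models)
print(label_1, dist_1)
print(label_2, dist_2)
```

Each class model is fitted independently of the others, which is the "independent" part of the method; a threshold on the distance would additionally allow an individual to be rejected by all classes or accepted by several.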

4.4.3. Application of SIMCA in OO Field

SIMCA was applied for pattern recognition of different VOO and EVOO varieties using chromatographic (FAs, triglycerides, sterols) and spectrometric data (NIR, UV, MS, NMR) [15,40,52,54,57,58]. Analogous work using PLS-DA was illustrated in Figure 11. SIMCA-based models were developed to predict origins of Italian EVOOs from Sabina and other sites. This helped in the traceability analysis of a protected designation of origin vs. oils from other origins [52]. Several predictive models of Sabina were developed on the basis of NIR and MIR data after different types of baseline correction. The number of latent variables of each model was determined from the highest geometric mean of sensitivity and specificity obtained in ten cross-validation iterations. Models built from NIR data showed higher sensitivity and specificity than MIR-based ones. Among the pre-treatments, the best performances were obtained with detrending (D) and with multiplicative scatter correction (MS) combined with first derivative (1d), both of which resulted in 100% sensitivity and 95.45% specificity in calibration-based models, and 76.92% sensitivity and 95.45% specificity in cross-validation. Final validation on predicted outside patterns gave 100% sensitivity and 93.33% specificity under D vs. 71.43% sensitivity and 86.67% specificity under MS with 1d.

Finally, data analysis by SIMCA vs. PLS-DA showed different optimal pre-treatments in the two approaches [52]. This was due to the fact that the two methods are differently influenced by the shapes of class distributions, which can be directly linked to their different principles: SIMCA focuses on the similarity of individuals within the same class; the q classes are considered separately from one another, leading to q specific categorical spaces. Thus, with q classes, one obtains q class-models based on appropriate numbers of latent variables selected among all the principal components of a preliminary ordination analysis. DA (PLS-DA), however, operates by evaluating the differences between classes; individuals are attributed to appropriate classes by maximizing the ratio of between-class to within-class variance. DA operates by decomposing the hyperspace of the variables into q subspaces associated with the q predicted classes.

In another work, SIMCA was applied to discriminate West Ligurian EVOOs on the basis of volatile terpenoid hydrocarbons (VTHs), which were analysed by GC-MS in 105 OO samples originating from Italy (West Liguria, Puglia), Greece, Spain and Tunisia [59]. Eight VTHs were separated: α-pinene, limonene, trans-β-ocimene, 4,8-dimethyl-1,3,7-nonatriene, α-copaene, eremophyllene, α-muurolene and α-farnesene. SIMCA models showed high predictive ability for Ligurian EVOOs vs. each of the six other origins, with sensitivity and specificity varying in the ranges 81%–90.9% and 92%–100%, respectively. Finally, among the eight VTHs, α-copaene, α-muurolene and α-farnesene showed high discriminant values in the separation of Ligurian samples from the other geographical origins.
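The sensitivity and specificity figures quoted in this subsection, and their geometric mean used to select the number of latent variables, can be computed as in the sketch below. The function name and the labels are hypothetical and do not reproduce the data of the cited study.

```python
def sensitivity_specificity(y_true, y_pred, target):
    """Sensitivity, specificity and their geometric mean for one class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == target and p == target)
    fn = sum(1 for t, p in pairs if t == target and p != target)
    tn = sum(1 for t, p in pairs if t != target and p != target)
    fp = sum(1 for t, p in pairs if t != target and p == target)
    sensitivity = tp / (tp + fn)  # share of target samples accepted
    specificity = tn / (tn + fp)  # share of other samples rejected
    return sensitivity, specificity, (sensitivity * specificity) ** 0.5

# Hypothetical validation labels: four target samples and four others.
y_true = ["Sabina"] * 4 + ["other"] * 4
y_pred = ["Sabina", "Sabina", "Sabina", "other"] + ["other"] * 4
sens, spec, geo_mean = sensitivity_specificity(y_true, y_pred, "Sabina")
print(sens, spec, geo_mean)
```

Repeating this calculation for models with increasing numbers of latent variables, and keeping the model with the highest geometric mean, corresponds to the selection rule described above.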

4.5. Support Vector Machines

4.5.1. General Principle Support vector machines (SVMs) are supervised methods used to attribute unknown features to one among two possible candidate classes (e.g., class A vs. class B for binary system, or A vs. not A for multiclass system) [60,61]. For that aim, SVMs search optimal hyperplane separating well the two subspaces representing the two feature subsets. To be optimal, the hyperplane needs to have appropriate spatial location and angulation (Figure 13). Foods 2016, 5, 77 20 of 35

As shown intuitively in Figure 13, there is an infinity of possible lines separating the two subspaces, but only one provides the best separation.

Figure 13. Simplistic illustration of the concept of optimal hyperplane (the best separating line) separating two classes (two subspaces) among infinity of possible (not optimal) hyperplanes.

4.5.2. Methodological Steps

Construction of the optimal separating hyperplane between two classes requires a learning dataset L of N pairs $(x_i, y_i)$ (i = 1, …, N) where (Figure 14a):

$x_i$ are quantitative vectors of dimension p (p variable features; $x_i \in \mathbb{R}^{p}$);

$y_i$ are binary values equal to +1 or −1 depending on the class to which $x_i$ belongs.

The hyperplane is algebraically defined by a separation function f(x) taking positive or negative real values, through the line equation (Figure 14b):

$$f(x) = \beta_0 + x^{T}\beta = 0$$

where $\beta$ is a normal vector to the hyperplane and $\beta_0$ is the offset.

The line serves as reference to attribute any point x to one of the two classes according to the classifier rule S(x) (Figure 14c):

$$S(x) = \mathrm{sign}[f(x)] = \begin{cases} +1 & \text{if } f(x) > 0 \ (\rightarrow y = +1) \\ -1 & \text{if } f(x) < 0 \ (\rightarrow y = -1) \end{cases}$$

Separation between the two classes is reinforced by establishing two limits, one at each side of the hyperplane line f(x) = 0 (Figure 14d). The two limits are defined by the shortest distances $d_-$ and $d_+$ from the hyperplane to the nearest points of both classes. They are characterized by the hyperplane line equations:

$$f(x) = \beta_0 + x^{T}\beta = -1 \quad \text{and} \quad f(x) = \beta_0 + x^{T}\beta = +1$$

Points located on these limits (the nearest points) are called support vectors (Figure 14e).

Finally, the parameter values $\beta_0$ and $\beta$ are determined by maximizing the shortest distances $d_-$ and $d_+$. These distances define a maximized margin d (the width of the band around the hyperplane) which is equal to $2/\|\beta\|$ ($d = d_- + d_+ = 1/\|\beta\| + 1/\|\beta\| = 2/\|\beta\|$). Moreover, margin maximization obeys the constraint $y(\beta_0 + x^{T}\beta) \ge 1$, with equality for the support vectors, because y = −1 and +1 for $\beta_0 + x^{T}\beta \le -1$ and $\beta_0 + x^{T}\beta \ge +1$, respectively.

By considering all the pairs of points $(x_i, y_i)$ of the binary system, the SVM algorithm corresponds to a convex optimization problem that finds $\beta_0$ and $\beta$ minimizing $\frac{1}{2}\|\beta\|^{2}$ under the constraint:

$$y_i(\beta_0 + x_i^{T}\beta) \ge +1 \quad \forall\, i = 1, \dots, N.$$
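The quantities defined above (separation function f(x), classifier rule S(x), margin 2/‖β‖ and support vectors) can be checked numerically on a toy case. The following Python sketch assumes a hypothetical two-dimensional dataset and hand-picked parameters β0 and β; it does not solve the optimization problem, it only evaluates a given hyperplane.

```python
from math import sqrt

def decision_function(x, beta0, beta):
    """Separation function f(x) = beta0 + x^T beta."""
    return beta0 + sum(b * v for b, v in zip(beta, x))

def classify(x, beta0, beta):
    """Classifier rule S(x) = sign[f(x)] (a point with f(x) = 0 is sent to -1 here)."""
    return 1 if decision_function(x, beta0, beta) > 0 else -1

def margin_width(beta):
    """Margin d = 2 / ||beta||."""
    return 2.0 / sqrt(sum(b * b for b in beta))

def support_vectors(X, y, beta0, beta, tol=1e-12):
    """Points lying on the limits f(x) = -1 or f(x) = +1, i.e., y * f(x) = 1."""
    return [xi for xi, yi in zip(X, y)
            if abs(yi * decision_function(xi, beta0, beta) - 1.0) < tol]

# Hypothetical toy data: two features, two classes.
X = [(3.0, 1.0), (4.0, 2.0), (1.0, 1.0), (0.0, 3.0)]
y = [+1, +1, -1, -1]

# Hand-picked hyperplane x1 = 2, written as f(x) = -2 + 1*x1 + 0*x2.
beta0, beta = -2.0, (1.0, 0.0)

predictions = [classify(xi, beta0, beta) for xi in X]
constraints_ok = all(yi * decision_function(xi, beta0, beta) >= 1 for xi, yi in zip(X, y))
print(predictions, constraints_ok, margin_width(beta), support_vectors(X, y, beta0, beta))
```

In this toy case the two nearest points of opposite classes, (3, 1) and (1, 1), lie exactly on the two limits, and the margin equals their separation along the first axis.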

Figure 14. Different components (a–e) for Support Vector Machines. (a) Information training source; (b) separating hyperplane function to be defined; (c) binary classification rule; (d) optimisation condition for determining hyperplane function; (e) margin separation limits issued from optimization process.

Beyond the basic concept of linearly separable classes, feature recognition can be improved by introducing the concept of a soft margin based on a more flexible formulation of the problem (Figure 15). This helps to overcome the misclassification problem due to overlapping between classes, i.e., infiltration of some points of one class into the space of the other class.


Figure 15. Improvement of separation between classes by SVMs using slack variable ξ. The five ξi (ξ1, …, ξ5) are associated with the points that violate the constraints of hyperplanes H+ and H−.

From a technical standpoint, the soft margin between classes is constructed by introducing a nonnegative slack variable ξ which takes N values for the pairs $(x_i, y_i)$: $\xi = (\xi_1, \dots, \xi_N) \ge 0$. The slack variable ξ is introduced to allow violation of the margin constraints of the hyperplane.

Therefore, $\beta_0$, $\beta$ and ξ are determined by solving the soft-margin optimization problem:

$$\min \left[ \frac{1}{2}\|\beta\|^{2} + C \sum_{i=1}^{N} \xi_i \right]$$

under the constraints:

$$\xi_i \ge 0 \quad \text{and} \quad y_i(\beta_0 + x_i^{T}\beta) \ge 1 - \xi_i \quad \text{for } i = 1 \text{ to } N$$

where C is a control parameter of the slack variable sizes (called regularisation meta-parameter). It controls the error of misclassification by compromising between two conflicting objectives: minimizing the training error vs. maximizing the margin. Higher and lower C values emphasize the error minimization and the margin maximization, respectively.

Optimization of margin distribution provides significant improvement of class separation in the case of non-separable data in the feature space.

When two classes are nonlinearly separable, their discrimination by SVM can be performed by using a mapping function ϕ. This function serves as the basis of a kernel which transforms the original feature space into a higher dimensional space in which a separating hyperplane can be found to linearly separate the two initially overlapping classes (Figure 16).
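The soft-margin problem stated above can be sketched in Python by replacing each slack value with its optimal expression ξi = max(0, 1 − yi f(xi)) and minimizing the resulting objective by sub-gradient descent. This is only one simple way to approach the problem, used here for illustration; dedicated SVM solvers usually rely on quadratic programming instead. The toy dataset, learning rate and number of epochs are assumptions.

```python
def decision(x, beta0, beta):
    """f(x) = beta0 + x^T beta."""
    return beta0 + sum(b * v for b, v in zip(beta, x))

def train_soft_margin_svm(X, y, C=1.0, lr=0.01, epochs=200):
    """Sub-gradient descent on 0.5*||beta||^2 + C * sum_i max(0, 1 - y_i f(x_i))."""
    p = len(X[0])
    beta0, beta = 0.0, [0.0] * p
    for _ in range(epochs):
        grad_beta = list(beta)      # gradient of the term 0.5*||beta||^2
        grad_beta0 = 0.0
        for xi, yi in zip(X, y):
            if yi * decision(xi, beta0, beta) < 1:   # point with positive slack
                for t in range(p):
                    grad_beta[t] -= C * yi * xi[t]
                grad_beta0 -= C * yi
        beta = [b - lr * g for b, g in zip(beta, grad_beta)]
        beta0 -= lr * grad_beta0
    return beta0, beta

def slack_values(X, y, beta0, beta):
    """xi_i = max(0, 1 - y_i f(x_i)): zero for points outside the margin."""
    return [max(0.0, 1.0 - yi * decision(xi, beta0, beta)) for xi, yi in zip(X, y)]

# Hypothetical toy data (two features, two classes).
X = [(2.0, 2.0), (3.0, 3.0), (2.0, 3.0), (-2.0, -2.0), (-3.0, -3.0), (-2.0, -3.0)]
y = [+1, +1, +1, -1, -1, -1]

beta0, beta = train_soft_margin_svm(X, y, C=1.0)
predictions = [1 if decision(xi, beta0, beta) > 0 else -1 for xi in X]
slacks = slack_values(X, y, beta0, beta)
print(predictions, beta0, beta)
```

With overlapping classes, some slack values remain strictly positive after training; increasing C then pushes the solution towards fewer margin violations at the cost of a narrower margin.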


Figure 16. Two illustrative examples (a,b) showing how two nonlinearly separable classes can be separated in a higher dimension space reached by a data transformation based on a kernel function. The illustrative cases (a) and (b) correspond to two states requiring two different data transformations (using two different kernel functions) for final linear separation of the two classes.

In the SVM algorithm, this transformation involves scalar products between input vectors $x_i$, $x_j$. Given the mapping function $\varphi: x \rightarrow \varphi(x)$, the scalar product is:

$$k(x_i, x_j) = \langle \varphi(x_i), \varphi(x_j) \rangle$$

where $k(x_i, x_j)$ is called the kernel function.

The list of kernel functions is wide, including polynomial kernel, radial basis function kernel, sigmoid kernel, Pearson VII universal kernel (PUK), etc. In practice, choosing the appropriate kernel is not an easy task, and tuning the parameters of the chosen kernel is essential to get good performance from the SVM algorithm.

4.5.3. Application of SVMs in OO Field

SVMs were applied to Italian OOs characterized by near- and mid-infrared spectra in order to discriminate between Ligurian and non-Ligurian origins [62].

In another work, FT-IR data (input spectroscopic features $x_i$) were treated by SVMs combined with a kernel function to successfully discriminate between (i) Italian and non-Italian OOs; and between (ii) Ligurian and other Italian regions (Figure 17) [63]. Comparison between predicted results from SVMs combined with Pearson VII universal kernel (PUK) and Gaussian kernel (GK) showed higher performances of the SVM-PUK method. This led to the conclusion that PUK has a higher mapping power for discrimination of binary systems by SVMs.
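The kernel concept of Section 4.5.2 can be illustrated by a short Python sketch. It assumes that the transformation of Figure 16a is the explicit map ϕ(x) = (x, x²), which is a reading of the figure rather than a statement of the source, and it uses a hypothetical one-dimensional dataset. A Gaussian kernel with an assumed width parameter gamma is added as an example of a kernel evaluated without any explicit map.

```python
from math import exp

def phi(x):
    """Assumed explicit map for the one-dimensional case: x -> (x, x^2)."""
    return (x, x * x)

def dot(u, v):
    """Scalar product of two vectors."""
    return sum(a * b for a, b in zip(u, v))

def kernel_from_map(xi, xj):
    """k(xi, xj) = <phi(xi), phi(xj)> computed without building phi explicitly."""
    return xi * xj + (xi * xj) ** 2

def gaussian_kernel(u, v, gamma=0.5):
    """Gaussian (radial basis function) kernel; gamma is a tuning parameter."""
    return exp(-gamma * sum((a - b) ** 2 for a, b in zip(u, v)))

# Hypothetical 1-D data: class -1 lies between the two groups of class +1,
# so no single threshold on x separates the classes.
x_values = [-5.0, -4.0, -1.0, 0.0, 1.0, 4.0, 5.0]
labels = [+1, +1, -1, -1, -1, +1, +1]

# In the mapped space (x, x^2), the horizontal line x^2 = 9 separates them linearly.
predictions = [1 if phi(x)[1] - 9.0 > 0 else -1 for x in x_values]
print(predictions == labels)
print(kernel_from_map(2.0, 3.0), dot(phi(2.0), phi(3.0)))
```

The equality between `kernel_from_map` and the scalar product of the mapped vectors is what allows SVMs to work in the higher dimensional space while only evaluating k on the original inputs.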

Figure 17. Application of SVMs combined with kernel function to discriminate between (a) Italian and not Italian olive oils and between (b) Ligurian and other Italian regions using FT-IR data (1193 variables over 700–3000 cm−1; 668 OO samples) [63]. PUK: Pearson VII Universal Kernel.

4.6. K-Nearest Neighbours

4.6.1. General Principle

K-nearest-neighbour method (K-NN) is a pattern recognition method used to attribute new features to appropriate classes among several candidate classes. Due to its nonparametric aspect, it can be applied to classify new patterns with little or no prior knowledge on data distributions [64].

Application of K-NN needs a learning dataset containing N points representing N features which are spatially separated from one another by local distances that can be Euclidean or of other types. The type of distance used depends on the type of data.

Attribution of a new point U to the appropriate class by K-NN is based on the evaluation of its location by reference to its K nearest neighbours (Figure 18). This results in K calculated distances between the unlabelled (new) point U and the K nearest neighbours. Therefore, the K-NN decision rule consists in attributing U to the class to which most of the K neighbours belong.

Figure 18. Two examples illustrating the K-Nearest Neighbours principle and the influence of the neighbouring number K on classification results. Parameter K needs to be optimized.


However, under this majority rule, the K nearest neighbours have equal influence despite their different distances from U. Standard K-NN can therefore be improved by weighted K-NN, which uses the K individual distances to attribute different weight values to the K neighbours. In this way, higher voting weights are attributed to the closest (less distant) neighbours.
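The difference between the majority rule and a distance-weighted vote can be sketched in Python as follows. The weighting 1/distance is only one common choice, assumed here for illustration, and the learning points are hypothetical.

```python
from collections import Counter, defaultdict
from math import dist

def nearest_neighbours(train, u, k):
    """Return the k learning points (vector, label) closest to u (Euclidean distance)."""
    return sorted(train, key=lambda item: dist(item[0], u))[:k]

def majority_knn(train, u, k):
    """Standard K-NN: each neighbour has one vote."""
    labels = [label for _, label in nearest_neighbours(train, u, k)]
    return Counter(labels).most_common(1)[0][0]

def weighted_knn(train, u, k, eps=1e-12):
    """Weighted K-NN: each neighbour votes with weight 1/distance (assumed weighting)."""
    votes = defaultdict(float)
    for x, label in nearest_neighbours(train, u, k):
        votes[label] += 1.0 / (dist(x, u) + eps)
    return max(votes, key=votes.get)

# Hypothetical learning set: one very close A neighbour, three moderately close B neighbours.
train = [
    ((0.5, 0.0), "A"), ((0.0, 2.0), "A"),
    ((1.5, 0.0), "B"), ((0.0, -1.6), "B"), ((-1.7, 0.0), "B"),
    ((5.0, 5.0), "A"), ((-5.0, 5.0), "B"),
]
u = (0.0, 0.0)

plain = majority_knn(train, u, k=5)
weighted = weighted_knn(train, u, k=5)
print(plain, weighted)
```

With K = 5, the three B neighbours win the unweighted vote, whereas the closest A neighbour dominates once votes are weighted by inverse distance.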

4.6.2. Methodological Steps

Formally, K-NN initially uses a training or learning set L defined by N multivariate features xi belonging to C classes yi (yi = 1, . . . , C):

$$L = \{(x_i, y_i),\ i = 1, \dots, N\}$$

where $x_i$ denotes a vector containing p predictor variables:

$$x_i = (x_{i1}, x_{i2}, \dots, x_{ip})^{T}$$

Pattern classification of an unlabelled point U requires a distance function d to evaluate the neighbouring (similarity) between U and the K points representing its nearest neighbours (NNs) among the N points of the learning dataset. For instance, the Minkowski distance provides a general formula giving different distances depending on the parameter q [65]:

$$d(x_i, x_j) = \left[ \sum_{t=1}^{p} \left| x_{it} - x_{jt} \right|^{q} \right]^{1/q}$$

where $x_i$, $x_j$ are two features i and j characterized by p variables $x_1, \dots, x_p$. The formula provides several types of distances according to the q value, including the Euclidean distance for q = 2 and the absolute distance for q = 1. To avoid high variance effects, the different variables $x_t$ are initially standardized to mean 0 and variance 1 by applying:

$$\frac{x_{it} - \bar{x}_t}{SD_t}$$

where $x_{it}$ is the value of variable $x_t$ in feature i, and $\bar{x}_t$ and $SD_t$ are the mean and the standard deviation of variable $x_t$, respectively, initially calculated on raw data.

After calculation of N distance values between U and the N learning points $x_i$, the K lowest values are kept and associated with the K NNs. Therefore, the C classes will be concerned by different numbers $K_r$ (among K) due to the different belongings of the K NNs:

$$\sum_{r=1}^{C} K_r = K$$

Therefore, the new point U will be predicted into the class m showing the majority of neighbours:

$$K_m = \max_{r = 1, \dots, C} (K_r)$$
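The steps above (standardization, Minkowski distance, class counts Kr and majority class m) can be assembled in a minimal Python sketch. The learning set is hypothetical, and the sample standard deviation (n − 1 denominator) is an assumed choice; the population formula would work equally well for this purpose.

```python
from collections import Counter
from statistics import mean, stdev

def minkowski(xi, xj, q=2):
    """Minkowski distance: q = 2 gives the Euclidean distance, q = 1 the absolute distance."""
    return sum(abs(a - b) ** q for a, b in zip(xi, xj)) ** (1.0 / q)

def fit_scaler(rows):
    """Mean and standard deviation of each variable, computed on raw learning data."""
    columns = list(zip(*rows))
    return [mean(c) for c in columns], [stdev(c) for c in columns]

def apply_scaler(row, means, sds):
    """Standardize one feature vector: (x_it - mean_t) / SD_t."""
    return tuple((v - m) / s for v, m, s in zip(row, means, sds))

def knn_predict(train_X, train_y, u, k, q=2):
    """Return the majority class m and the counts K_r among the k nearest neighbours."""
    order = sorted(range(len(train_X)), key=lambda i: minkowski(train_X[i], u, q))
    counts = Counter(train_y[i] for i in order[:k])   # K_r for each class r
    return counts.most_common(1)[0][0], counts

# Hypothetical learning set with two variables measured on very different scales.
raw_X = [(1.0, 10.0), (2.0, 20.0), (1.5, 12.0), (6.0, 60.0), (7.0, 70.0), (6.5, 65.0)]
labels = ["A", "A", "A", "B", "B", "B"]
raw_u = (2.0, 15.0)

means, sds = fit_scaler(raw_X)
train_X = [apply_scaler(row, means, sds) for row in raw_X]
u = apply_scaler(raw_u, means, sds)

predicted, counts = knn_predict(train_X, labels, u, k=3)
print(predicted, dict(counts))
```

Note that the unlabelled point is standardized with the means and standard deviations of the learning set, so that all distances are computed in the same space.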

Note that the predicted class of a new feature U can be significantly influenced by the chosen neighbouring number K (Figure 18). For instance, in the illustrative case of Figure 18, choosing K = 1 makes the unlabelled point U be classified in class A because its single nearest neighbour belongs to A. However, using K = 5, the point U will be attributed to class B, which becomes the majority with three B points vs. two A points.

Therefore, the neighbouring number K needs to be chosen among several candidate values using some global error minimization criterion. Moreover, several K-NN algorithms (including weighted K-NN) were developed from the standard one to (i) avoid equal importance for all neighbours and (ii) overcome imbalanced data problems [66].
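One possible global criterion for choosing K is the leave-one-out misclassification rate on the learning set. The Python sketch below assumes this criterion and a hypothetical one-variable dataset containing one atypical B point located inside class A; other criteria (e.g., k-fold cross-validation) could be used in the same way.

```python
from collections import Counter
from math import dist

def knn_predict(train, u, k):
    """Standard K-NN majority vote over (vector, label) learning points."""
    neighbours = sorted(train, key=lambda item: dist(item[0], u))[:k]
    return Counter(label for _, label in neighbours).most_common(1)[0][0]

def leave_one_out_error(data, k):
    """Fraction of learning points misclassified when each is predicted from the others."""
    errors = 0
    for i, (x, label) in enumerate(data):
        rest = data[:i] + data[i + 1:]
        if knn_predict(rest, x, k) != label:
            errors += 1
    return errors / len(data)

def choose_k(data, candidates):
    """Return the candidate K with the lowest error (the first one in case of ties)."""
    errors = {k: leave_one_out_error(data, k) for k in candidates}
    return min(candidates, key=lambda k: errors[k]), errors

# Hypothetical 1-D learning set; the B point at 1.4 lies inside the A group.
data = [((0.0,), "A"), ((1.0,), "A"), ((2.0,), "A"), ((3.0,), "A"), ((1.4,), "B"),
        ((10.0,), "B"), ((11.0,), "B"), ((12.0,), "B"), ((13.0,), "B")]

best_k, errors = choose_k(data, candidates=[1, 3, 5])
print(best_k, errors)
```

In this toy case K = 1 is misled by the atypical point, whereas K = 3 and K = 5 only misclassify the atypical point itself; the smaller of the two equally good values is retained.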

4.6.3. Application of K-NNs in OO Field

Bajoub et al. (2017) [67] applied K-NN to predict seven EVOO varieties cultivated in Morocco and characterized by phenolic HPLC profiles: Arbequina, Arbosana, Cornicabra, Frantoio, Picholine de Languedoc (PL), Picholine Marocaine (PM) and Picual. Phenolic fingerprints of the seven varieties (represented by 140 samples in all) were acquired with two types of detection, diode array (DAD) and fluorescence (FLD). Detection conditions were 280 nm for DAD, and 280 nm (λ excitation) and 339 nm (λ emission) for FLD. Application of K-NN to the HPLC-DAD dataset provided high prediction accuracy for the seven EVOO cultivars; accuracies ranged within 93.2%–100% for the training set, 91.26%–100% for the cross-validation and 94.60%–100% for the test prediction set. The best prediction rates concerned the variety Cornicabra whereas the lowest ones concerned PL and PM. Using HPLC-FLD data, K-NN models gave discrimination rates >96.12%, 97.09% and 94.60% for training, cross-validation and prediction sets, respectively. The best discrimination results concerned the varieties Cornicabra, Arbequina and Arbosana; the lowest accuracy values concerned the PM cultivar.

4.7. Artificial Neural Networks

4.7.1. General Principle

Artificial neural networks (ANNs) represent a set of learning approaches, supervised when trained on samples of known class or response, used to build non-linear models predicting q classes or responses from iterative combinations of p weighted control variables [68]. ANNs are sophisticated methods for both classification and prediction, providing a visualization of how different categorical variables (e.g., OO cultivars, blends, geographical origins, etc.) are separated in different hyperspace areas with different shapes serving as reference for pattern recognition of new samples. On this basis, ANNs represent efficient methods for variation absorption and signal separation leading to reliable traceability predictive models.

4.7.2. Methodological Steps of ANNs

ANN models are built by iterative training algorithms which organize network systems into three neuron layers: input, hidden and output layers (Figure 19). During training, individuals enter the network separately and successively as multivariate profiles with p learning variables. This input process occurs at the first network layer, made of p neurons associated with the p variables (Figure 19a). The system is iteratively trained on several profiles to gradually structure and stabilize it by learning new or confirmed information: the input profiles (with p variables) enter the neural network and bring signals which are detected and categorized by the internal or hidden layers (Figure 19b). Input and cumulative information are transformed by a non-linear monotonic transfer function giving response weight values between 0 and 1. The set of response weight values varies iteratively until confirmed specialized fields are formed. Such fields prove to be sensitive to different signals or categories to which different individuals can be affiliated. The stable system structure is thus shaped by non-linear links between the p input control variables and the q revealed output states of the system. Finally, the q categorized and confirmed signals are received by the output layer made of q feature-collecting or response-producing neurons (Figure 19c).

Figure 19. Representation of artificial neural network based on three neural layers (input, hidden, output) (a, b, c) and applied to carry out non-linear predictive models of polymorphic response or state system from different input control variables.
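The three-layer structure of Figure 19 can be sketched as a simple forward pass in Python. The logistic function is assumed here as the non-linear monotonic transfer function giving values between 0 and 1; the layer sizes (3 inputs, 4 hidden neurons, 3 outputs) and all weight values are hypothetical, and no training is performed in this sketch.

```python
from math import exp

def logistic(z):
    """Non-linear monotonic transfer function with values between 0 and 1 (assumed choice)."""
    return 1.0 / (1.0 + exp(-z))

def layer(inputs, weights, biases):
    """One neuron layer: each neuron transforms a weighted sum of its inputs."""
    return [logistic(b + sum(w * x for w, x in zip(row, inputs)))
            for row, b in zip(weights, biases)]

def forward(x, w_hidden, b_hidden, w_output, b_output):
    """Input layer -> hidden layer -> output layer."""
    hidden = layer(x, w_hidden, b_hidden)
    return layer(hidden, w_output, b_output)

# Hypothetical weights: 4 hidden neurons (rows) x 3 inputs, then 3 outputs x 4 hidden neurons.
w_hidden = [[0.2, -0.5, 0.1], [0.7, 0.3, -0.4], [-0.6, 0.1, 0.9], [0.05, -0.2, 0.3]]
b_hidden = [0.0, 0.1, -0.1, 0.2]
w_output = [[0.5, -0.3, 0.8, 0.1], [-0.7, 0.2, 0.4, -0.5], [0.3, 0.6, -0.2, 0.9]]
b_output = [0.0, 0.0, 0.0]

profile = [1.2, 0.4, 3.1]   # one input profile with p = 3 variables
outputs = forward(profile, w_hidden, b_hidden, w_output, b_output)
print(outputs)
```

Training algorithms such as back-propagation iteratively adjust the weights and biases so that these outputs approach the known responses of the learning profiles.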

4.7.3. Application of ANNs in OO Field

ANNs have been applied for classification of 572 Italian OOs from FAs profiles (input variables) [69]. In another work, geographical traceability of different European VOOs (Spain, Italy, Portugal) was modelled in relation to FAs profiles [70]. In the same work, predictions of regions, provinces and PDOs required more input variables including sterols, alcohols and hydrocarbons in addition to FAs.

At a more complex level, ANNs were used to predict levels (proportions) of OOs in bi-varietal blends [71,72] (Figure 20).

Coupling two ANN methods (self-organizing map and multilayer feed-forward), Marini et al. (2007) [72] developed predictive models giving proportions of five Italian mono-cultivar OOs co-occurring in different binary mixtures. The five OO cultivars were Carboncella (C), Frantoio (F), Leccino (L), Moraiolo (M) and Pendolino (P). They were chemically characterized by relative levels of different FAs, phytosterols and triglycerides in addition to the UV extinction coefficient K270. These variables resulted in 18 measured or calculated characteristics which were used as 18 unit signals (18 nodes or neurons) in the input ANN layer. Ten binary mixture types were simulated by combining 153 samples representative of the five mono-cultivar OOs, leading to several thousands of bivarietal blends of types: C-F, C-L, C-M, C-P, F-L, F-M, F-P, L-M, L-P, M-P. Optimization analysis, initialized with 6 to 15 nodes, set the best number of nodes to 9 within a single hidden layer. Preliminary studies showed that there was no need to add more than a single hidden layer. Finally, the output layer consisted of two nodes providing the proportions of the two mixed cultivars in the considered binary blend. Training was performed by the back-propagation algorithm using a hyperbolic tangent transfer function. The ten ANN models associated with the ten simulated binary blends predicted the percentages of the paired OO cultivars with good accuracy, indicated by high cross-validation determination coefficients Q2 (Q2 = 0.91–0.96).
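The simulation of binary blends used as training material can be sketched in Python. The sketch enumerates the ten pairwise blend types from the five cultivar codes and generates blend profiles with known proportions. Linear mixing of the mono-cultivar profiles is an assumption made here for illustration, and the two-variable profiles are hypothetical (the cited study used 18 variables).

```python
from itertools import combinations

cultivars = ["C", "F", "L", "M", "P"]

# Ten pairwise blend types: one ANN model per type in the cited application.
blend_types = ["-".join(pair) for pair in combinations(cultivars, 2)]

def mix(profile_a, profile_b, w_a):
    """Blend profile under an assumed linear mixing rule, with proportions w_a and 1 - w_a."""
    return [w_a * a + (1.0 - w_a) * b for a, b in zip(profile_a, profile_b)]

def simulate_blends(profile_a, profile_b, steps=10):
    """Return (blend profile, (w_a, w_b)) pairs for proportions from 0 to 1."""
    simulated = []
    for s in range(steps + 1):
        w_a = s / steps
        simulated.append((mix(profile_a, profile_b, w_a), (w_a, 1.0 - w_a)))
    return simulated

# Hypothetical mono-cultivar profiles described by two variables only.
profile_c = [10.0, 2.0]
profile_f = [20.0, 4.0]

training_pairs = simulate_blends(profile_c, profile_f, steps=10)
print(blend_types)
print(training_pairs[5])
```

Each simulated pair associates an input profile with the two target proportions that the two output nodes of the corresponding network are trained to reproduce.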


Figure 20. Prediction of proportions of Italian OO cultivars’ pairwise combined in ten different blend types by artificial neural networks (ANNs) using physical and chemical measured variables.

Cajka et al. (2010) [39] applied ANNs with multilayer perceptron (MLP) and back propagation to predict geographical origins of Ligurian and non-Ligurian VOOs from volatile compounds patterns. In all, 44 volatiles were analysed by GC-MS including alcohols, aldehydes, ketones, esters, carboxylic acids and hydrocarbons. The olive oil dataset consisted of 210 Ligurian and 704 non-Ligurian samples originating from different countries (Italy, Spain, France, Greece, Cyprus, Turkey).

A preliminary PCA on the dataset (914 samples × 44 volatiles) showed wide distribution and overlapping between different OO samples. LDA applied for prediction of Ligurian origin gave a prediction ability of 61.7%. This relatively low value was significantly improved by applying ANNs, which provided an advanced chemometric tool to treat systems with complex variability and relationships between predictor and predicted variables. The ANN model consisted of input, hidden and output layers containing 44, 25 and 1 neuron(s), respectively. The 44 input neurons corresponded to the 44 volatiles, whereas the single output neuron was reserved for geographical origin prediction (Ligurian vs. non-Ligurian). The ANN model showed recognition ability of 91.4% and 84.0% for training and selection subsets, respectively. External validation on the test subset showed a prediction ability of 81.1%, with a sensitivity and specificity of 84.1% and 80.7%, respectively (vs. 89.9% and 58.2%, respectively, in LDA).

4.8. Simplex Mixture Networks

4.8.1. General Principle

Simplex mixture networks are particularly advantageous for quantitative control of compositions of mixture systems. Beyond binary systems, simplex spaces are particularly appropriate for variability analysis of high-dimension mixtures (e.g., multivarietal blends). They provide a robust way for analysis and control of proportions of several (q) co-occurring components (e.g., OO cultivars) in complex blends. In this framework, a new simplex-based approach was developed to predict proportions of mixed groups from quantitative multivariate profiles of their blends [14,16,73].

Theoretically, N blends can be represented by N combinations of q groups k (k = 1 to q) with q proportions wk linked by the unit sum rule Σwk = 1. The weights' multiplet (w1, ..., wk, ..., wq) represents q coordinates attributing the corresponding blend to a well-defined spatial location within a simplex space (Figure 21). The simplex approach was developed to predict the weights' multiplets of different blends from characteristic quantitative profiles. Model construction requires a training step in which each mixture point needs to be learned by a large set of representative blend profiles. This implies statistical exploration of variability between and within groups.

4.8.2. Methodological Steps of Simplex-Based Approach

In the simplex-based approach, each blend i combining q groups (q OO cultivars) is initially represented by a row i with q weights' columns (w1, ..., wk, ..., wq)i (i = 1 to N). The set of N possible blends is given by a mixture design called Scheffé's matrix (Figure 21a) [11,14,74]. This matrix combines q groups according to gradually variable proportions using an initially defined increment.

[Figure 21: schematic showing the groups' profiles, the mixture design (Scheffé's matrix) giving the proportions (wA, wB, wC) of q = 3 mixed groups, the simplex space with vertices Group A, Group B and Group C, the T repeated response matrices (iterations 1 to T) of N (= 21) average profiles, and the resulting discriminant predictive model of proportion multiplets (pA, pB, pC) built from the set of T repeated quantitative profiles.]

Scheffé's matrix shown in the figure (N = 21 mixtures of q = 3 groups):

Num   Group A   Group B   Group C
1     1         0         0
2     0.8       0         0.2
3     0.8       0.2       0
4     0.6       0.2       0.2
5     0.6       0         0.4
6     0.6       0.4       0
7     0.4       0.6       0
8     0.4       0.4       0.2
9     0.4       0         0.6
10    0.4       0.2       0.4
11    0.2       0.8       0
12    0.2       0.4       0.4
13    0.2       0.2       0.6
14    0.2       0.6       0.2
15    0.2       0         0.8
16    0         0.8       0.2
17    0         0.6       0.4
18    0         0.4       0.6
19    0         1         0
20    0         0.2       0.8
21    0         0         1

Figure 21. Principle and parameters of simplex approach applied to predict proportions of different mixed groups from quantitative profiles of resulting blends. (a) Stratification of OO samples; (b) application of Scheffé’s mixture design and calculation of response matrix; (c) iterative application for training process before building of proportions’ predictive model.

At the output of the mixture design, N average profiles are calculated to represent N gradually variable blends resulting from gradual weighting of the q groups (Figure 21b). This elementary scheme is iterated T times by bootstrapping so that the response matrix of N average profiles is trained on inter- and intra-group variations (Figure 21c). The set of T response matrices, representing (T × N) average profiles, is finally used to build predictive models of groups' weights (e.g., OO cultivars' proportions) from quantitative profiles of blends (e.g., chromatographic profiles). By considering each blend as a multiplet of weights, prediction of weights from quantitative profiles can be carried out by means of discriminant analysis.
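The steps described in this subsection (mixture design, average profiles, T bootstrap iterations) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the group profiles are invented two-variable vectors, individuals are drawn with replacement (the text only says "bootstrapping"), and the function names `scheffe_lattice`, `average_profile` and `response_matrices` are hypothetical. The final discriminant analysis step is not included.

```python
import random
from itertools import product

def scheffe_lattice(q, w):
    """All q-tuples of non-negative integers summing to w.

    Dividing each tuple by w gives one row of the mixture design
    (proportions with increment 1/w)."""
    return [c for c in product(range(w + 1), repeat=q) if sum(c) == w]

def average_profile(counts, groups, rng):
    """Average profile of one blend: counts[k] individuals are drawn
    (with replacement, an assumption) from group k, then averaged."""
    drawn = []
    for n_k, group in zip(counts, groups):
        drawn.extend(rng.choice(group) for _ in range(n_k))
    n_vars = len(drawn[0])
    return [sum(p[j] for p in drawn) / len(drawn) for j in range(n_vars)]

def response_matrices(groups, w, T, seed=0):
    """Repeat the mixture design T times; return the lattice and the
    T x N (weights, average profile) training pairs."""
    rng = random.Random(seed)
    lattice = scheffe_lattice(len(groups), w)
    training = []
    for _ in range(T):
        for counts in lattice:
            weights = tuple(c / w for c in counts)
            training.append((weights, average_profile(counts, groups, rng)))
    return lattice, training

# Hypothetical profiles (two variables each) for three groups A, B, C.
groups = [
    [[10.0, 1.0], [11.0, 1.2], [9.5, 0.9]],
    [[5.0, 3.0], [5.5, 3.3], [4.8, 2.9]],
    [[1.0, 6.0], [1.2, 6.4], [0.9, 5.8]],
]
lattice, training = response_matrices(groups, w=5, T=30)
```

With q = 3 and an increment of 0.2 (w = 5), the lattice contains the 21 mixtures of Figure 21, and T = 30 iterations give 630 (weights, profile) pairs on which a discriminant model could then be trained.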

4.8.3. Application of Simplex-Based Approach in OO Field

The simplex approach was recently applied in the OO field for composition analysis of French and Tunisian trivarietal blends by predicting proportions of the three co-occurring VOO cultivars [14,16]: Aglandau (A), Grossane (G), Salonenque (S) for French VOOs [14], and Chemlali (Cm), Chetoui (Ct), Oueslati (Ou) for Tunisian VOOs [16] (Figure 22a,b). The approach is easily extensible to blends containing more than three groups.

[Figure 22: simplex mixture networks and model calibration. (a) French cultivars Aglandau (A), Grossane (G) and Salonenque (S), with binary (A-G, A-S, G-S) and ternary (A-G-S) blends; the blend's FA profile (16:1ω7, 17:0, 17:1ω8, 18:1ω7, 18:2ω6, 20:0) is linked to the weights' triplet of cultivars (wA, wG, wS). (b) Tunisian cultivars Chemlali (Cm), Chetoui (Ct) and Oueslati (Ou), with binary (Cm-Ct, Cm-Ou, Ct-Ou) and ternary (Cm-Ct-Ou) blends; the blend's FA profile (16:0, 16:1ω7, 16:1ω9, 17:0, 18:1ω9, 20:1ω9) is linked to the weights' triplet of cultivars (wCm, wCt, wOu).]

Figure 22. Application of simplex network for prediction of weights' triplets (w1, w2, w3) giving proportions of three French (a) and three Tunisian (b) mixed OO cultivars in trivarietal OO blends characterized by fatty acid profiles.

Proportions' triplets of (A, G, S) and (Cm, Ct, Ou) were predicted from calibration models based on complete sets of N = 66 and 231 mixtures, respectively. These numbers resulted from combinations of q = 3 VOO components by blocks of w = 10 and 20 samples, respectively. Each of the N (66 or 231) blends was characterized by a weights' triplet (wA, wG, wS) or (wCm, wCt, wOu). The proportion of VOO k is given by dividing wk by the constant sum w = Σwk = 10 (French blends) or 20 (Tunisian blends). The higher total weight w = 20 in Tunisian blends was due to the fact that more contrasted or overlapped variations were initially observed between and within Tunisian VOOs compared to the three French ones. It was shown that an increase in w allows more variability between and within mixed VOOs to be absorbed, leading to reduced prediction errors.

At the output of each mixture (w1, w2, w3), an average FA profile was calculated from the w profiles shared between w1, w2, w3 individuals randomly sampled from the 1st, 2nd and 3rd VOO, respectively. For the complete set of N = 66 (A, G, S) and 231 (Cm, Ct, Ou) blends, 66 and 231 average FA profiles were calculated as barycentre responses of the three combined VOOs. These two elementary response matrices of 66 and 231 average FA profiles were iterated 30 times to absorb (integrate) chemical variations within and between cultivars. The number of repetitions was fixed to 30 because averages of different FAs in different blends stabilized from the 25th iteration. Finally, the 30 × 66 (1980 French) and 30 × 231 (6930 Tunisian) simulated blends were used as an extensive background chemical basis to predict the 66 (A, G, S) and 231 (Cm, Ct, Ou) weights' triplets, respectively. Predictive models were performed by linear discriminant analysis linking the ordinal weights' triplets (w1, w2, w3) to a set of discriminant FAs. The six most discriminant FAs were used in French and Tunisian models: 16:1ω7, 18:1ω7, 17:0, 17:1ω8, 18:2ω6, 20:0 for French samples and 16:0, 16:1ω7, 16:1ω9, 17:0, 18:1ω9, 20:1ω9 for Tunisian samples. Prediction errors of VOOs' proportions were ≤10% both for French and Tunisian blends.

Beyond prediction of cultivars' proportions in blends, the simplex mixture network approach was used to determine the minimal errors' area in simplex space, leading to identification of blends with potentially high traceability for future RDO or PDO development. Error minimization applied on Tunisian VOOs showed that the most predictable blends were:

• Pure Cm among the monovarietal blends.
• 50% Ct, 50% Ou for bivarietal blends.
• 10% Cm, 30%–35% Ct, 55%–60% Ou for trivarietal blends.
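The mixture counts quoted earlier in this subsection (N = 66 and 231) can be checked with a short calculation. The sketch below assumes that N is the number of ways to split w samples among q cultivars, i.e., the binomial coefficient C(w + q − 1, q − 1); this counting rule is a standard combinatorial identity and is not stated in the text, and the function name `n_lattice_points` is hypothetical.

```python
from math import comb

def n_lattice_points(q, w):
    """Number of q-tuples of non-negative integers summing to w."""
    return comb(w + q - 1, q - 1)

french = n_lattice_points(q=3, w=10)     # blocks of 10 samples
tunisian = n_lattice_points(q=3, w=20)   # blocks of 20 samples

# Simulated blends after 30 iterations of each elementary response matrix.
french_simulated = 30 * french
tunisian_simulated = 30 * tunisian
```

Under this assumption the counts reproduce the reported 66 and 231 mixtures, and 30 iterations give 30 × 66 = 1980 and 30 × 231 = 6930 simulated blends, respectively.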

5. Conclusions

Olive oils are complex matrices characterized by high chemical variability due to multiple composition-influencing factors including:

• Olive growing (cultivation) conditions, including varietal types and geographical origins.
• OO preparation ways giving pure, mixed and adulterated samples.
• Variable proportions of different constitutive components of final OO blends.

To better control the effects of these potential factors, OO samples are chemically analysed then statistically treated in order to:

• Qualitatively determine monovarietal (pure), binary (mixed) or multivarietal (blend) aspects.
• Quantitatively evaluate proportions of different mixed components.
• Conclude about the most discriminant or predictive chemical variables in the built qualitative and quantitative models.

Chemical analyses of OO include chromatographic profiles of several metabolic families (fatty acids, phenols, volatiles, etc.) and spectroscopic features containing several hundreds of recorded wavenumbers. Chemometric analyses of chromatographic and spectroscopic data can be carried out by several methods that help to highlight (i) specific characteristics, (ii) authentic classifications and (iii) reliable predictive fingerprints of different OO samples. Specific characteristics associated with differentiation poles of OO samples (i) can be highlighted by topological analysis including PCA and CA. Authenticity of different OO samples (ii) can be defined by classification methods based on distance calculation including HCA. Fingerprint (traceability) analysis of different co-occurring OO components in blends (iii) can be qualitatively performed by means of different pattern recognition methods and network-based approaches. Pattern recognition methods include LDA, PLS-DA, SIMCA, SVM and KNN. Network approaches cover many algorithms based on ANNs. Under the quantitative aspect, traceability of OO blends can be analysed by predicting proportions of several co-occurring components using the simplex approach. In the case of a high number of candidate predictive variables (particularly for spectroscopic data), PLS can be used to select the most significant ones.

Finally, results issued from the different chemometrics methods are helpfully used for (i) routine quality control of OO samples with respect to known compositions and origins, (ii) adulteration detection by reference to well-defined quality (consumption) norms, (iii) outlining new labelled commercial products within a statistically surrounded framework (e.g., RDO, PDO), etc.

Conflicts of Interest: The authors declare no conflict of interest.

References

1. Caponio, F.; Gomes, T.; Pasqualone, A. Phenolic compounds in virgin olive oils: Influence of the degree of olive ripeness on organoleptic characteristics and shelf-life. Eur. Food Res. Technol. 2001, 212, 329–333.
2. Boggia, R.; Evangelisti, F.; Rossi, N.; Salvadeo, P.; Zunin, P. Chemical composition of olive oils of the cultivar Colombaia. Grasas y Aceites 2005, 56, 276–283.
3. Shaker, M.A.; Azza, A.A. Relationship between volatile compounds of olive oil and sensory attributes. Int. Food Res. J. 2013, 20, 197–204.
4. Jiang, L.; Zheng, H.; Lu, H. Application of UV spectrometry and chemometric models for detecting olive oil-vegetable oil blends adulteration. J. Food Sci. Technol. 2015, 52, 479–485.
5. Dais, P.; Hatzakis, E. Quality assessment and authentication of virgin olive oil by NMR spectroscopy: A critical review. Anal. Chim. Acta 2013, 765, 1–27.
6. Maggio, R.M.; Cerretani, L.; Chivaro, E.; Kaufman, T.S.; Bendini, A. A novel chemometric strategy for the estimation of extra virgin olive oil adulteration with edible oils. Food Control 2010, 21, 890–895.
7. Montealegre, C.; Luisa, M.; Alegre, M.; Garcia-Ruiz, C. Traceability Markers to the Botanical Origin in Olive Oils. J. Agric. Food Chem. 2010, 58, 28–38.
8. Gómez-Caravaca, A.M.; Maggio, R.M.; Cerretani, L. Chemometric applications to assess quality and critical parameters of virgin and extra-virgin olive-oil. A review. Anal. Chim. Acta 2016, 913, 1–21.
9. Jolliffe, I. Principal Component Analysis, 2nd ed.; Springer: New York, NY, USA, 2002; p. 487.
10. Greenacre, M.J. Correspondence Analysis in Practice, 2nd ed.; Chapman & Hall/CRC: Boca Raton, FL, USA, 2007; p. 280.
11. Semmar, N. Computational Metabolomics, 1st ed.; Nova Science Publishers: New York, NY, USA, 2011; p. 238.
12. Legendre, P.; Legendre, L. Numerical Ecology, 2nd ed.; Elsevier: Amsterdam, The Netherlands, 2000; p. 853.
13. Aguado, D.; Monoya, T.; Borras, L.; Seco, A.; Ferrer, J. Using SOM and PCA for analysing and interpreting data from a P-removal SBR. Eng. Appl. Artif. Intell. 2008, 21, 919–930.
14. Semmar, N.; Artaud, J. A New Simplex-Based Approach Predicting Olive Oil Blend Compositions from Fatty Acid Data. J. Food Compos. Anal. 2015, 43, 149–159.
15. Laroussi-Mezghani, S.; Vanloot, P.; Molinet, J.; Hammami, M.; Grati-Kamoun, N.; Artaud, J. Authentification of Tunisian virgin olive oil by chemometric analysis of fatty acid composition and NIR spectra. Comparison with Maghrebian and French olive oils. Food Chem. 2015, 173, 122–132.
16. Semmar, N.; Laroussi-Mezghani, S.; Grati-Kamoun, N.; Hammami, M.; Artaud, J. A new simplex chemometric approach to identify olive oil blends with potentially high traceability. Food Chem. 2016, 208, 150–160.
17. De la Mata, P.; Dominiguez-Vidal, A.; Bosque-Sendra, J.M.; Ruiz-Medina, A.; Cuadros-Rodriguez, L.; Ayora-Canada, M.J. Olive oil assessment in edible oil blends by means of ATR-FTIR and chemometrics. Food Control 2012, 23, 449–455.
18. Monfreda, M.; Gobbi, L.; Grippa, A. Blends of olive oil and sunflower oil: Characterization and olive oil quantification using fatty acid composition and chemometric tools. Food Chem. 2012, 134, 2283–2290.
19. Maggio, R.M.; Kaufman, T.S.; Carlo, M.D.; Cerretani, L.; Bendini, A.; Cichelli, A.; Compagnone, D. Monitoring of fatty acid composition in virgin olive oil by Fourier transformed infrared spectroscopy coupled with partial least square. Food Chem. 2009, 114, 1549–1554.
20. Gordon, A.D. Classification, 2nd ed.; Chapman and Hall/CRC: Boca Raton, FL, USA, 1999; p. 272.
21. Everitt, B.S.; Landau, S.; Leese, M.; Stahl, D. Cluster Analysis, 5th ed.; Arnold Publishers: London, UK, 2011; p. 346.
22. Arabie, P.; Hubert, L.J.; De Soete, G. (Eds.) Clustering and Classification; World Scientific Pub. Co. Inc.: River Edge, NJ, USA, 1996; p. 500.
23. Jain, A.K.; Murty, M.N.; Flynn, P.J. Data clustering: A review. ACM Comput. Surv. 1999, 31, 264–323.
24. Milligan, W.G.; Cooper, M.C. Methodology review: Clustering methods. Appl. Psychol. Meas. 1987, 11, 329–354.
25. Semmar, N.; Bruguerolle, B.; Boullu-Ciocca, S.; Simon, N. Cluster Analysis: An Alternative Method for Covariate Selection in Population Pharmacokinetic Modelling. J. Pharmacokinet. Pharmacodyn. 2005, 32, 333–358.
26. Manai-Djebali, H.; Krichène, D.; Ouni, Y.; Gallardo, L.; Sanchez, J.; Osorio, E.; Daoud, D.; Guido, F.; Zarrouk, M. Chemical profiles of five minor olive oil varieties grown in central Tunisia. J. Food Compos. Anal. 2012, 27, 109–119.
27. Sacco, A.; Brescia, M.A.; Liuzzi, V.; Reniero, F.; Guillou, C.; Ghelli, S.; van der Meer, P. Characterization of Italian Olive Oils Based on Analytical and Nuclear Magnetic Resonance Determinations. J. Am. Oil Chem. Soc. 2000, 77, 619–625.
28. Fragaki, G.; Spyros, A.; Siragakis, G.; Salivaras, E.; Dais, P. Detection of Extra Virgin Olive Oil Adulteration with Lampante Olive Oil and Refined Olive Oil Using Nuclear Magnetic Resonance Spectroscopy and Multivariate Statistical Analysis. J. Agric. Food Chem. 2005, 53, 2810–2816.
29. Gemas, V.J.V.; Almadanim, M.C.; Tenreiro, R.; Martins, A.; Fevereiro, P. Genetic diversity in the Olive tree (Olea europaea, L. subsp. europaea) cultivated in Portugal revealed by RAPD and ISSR markers. Genet. Resour. Crop Evol. 2004, 51, 501–511.
30. Sliwinska, M.; Wisniewska, P.; Dymerski, T.; Namiesnik, J.; Wardencki, W. Food Analysis Using Artificial Senses. J. Agric. Food Chem. 2014, 62, 1423–1448.
31. Coomans, D.; Massart, D.L.; Kaufman, L. Optimization by statistical Linear Discriminant Analysis in analytical chemistry. Anal. Chim. Acta 1979, 112, 97–122.
32. McLachlan, G. Discriminant Analysis and Statistical Pattern Recognition; Wiley: New York, NY, USA, 2004; p. 552.
33. Bucci, R.; Magri, A.D.; Magri, A.L.; Marini, D.; Marini, F. Chemical Authentication of Extra Virgin Olive Oil Varieties by Supervised Chemometric Procedures. J. Agric. Food Chem. 2002, 50, 413–418.
34. Ollivier, D.; Artaud, J.; Pinatel, C.; Durbec, J.P.; Guérère, M. Differentiation of French virgin olive oil RDOs by sensory characteristics, fatty acid and triacylglycerol compositions and chemometrics. Food Chem. 2006, 97, 382–393.
35. Dankowska, A.; Małeckaa, M.; Kowalewski, W. Discrimination of edible olive oils by means of synchronous fluorescence spectroscopy with multivariate data analysis. Grasas y Aceites 2013, 64, 425–431.
36. Damiani, P.; Cossignani, L.; Simonetti, M.S.; Campisi, B.; Favretto, L.; Favretto, L.G. Stereospecific analysis of the triacylglycerol fraction and linear discriminant analysis in a climatic differentiation of Umbrian extra-virgin olive oils. J. Chromatogr. A 1997, 758, 109–116.
37. Petrakis, P.V.; Agiomyrgianaki, A.; Christophoridou, S.; Spyros, A.; Dais, P. Geographical Characterization of Greek Virgin Olive Oils (Cv. Koroneiki) Using 1H and 31P NMR Fingerprinting with Canonical Discriminant Analysis and Classification Binary Trees. J. Agric. Food Chem. 2008, 56, 3200–3207.
38. Dıraman, H.; Saygı, H.; Hısıl, Y. Classification of three Turkish olive cultivars from Aegean region based on their fatty acid composition. Eur. Food Res. Technol. 2011, 233, 403–411.
39. Cajka, T.; Riddellova, K.; Klimankova, E.; Cerna, M.; Pudil, F.; Hajslova, J. Traceability of olive oil based on volatiles pattern and multivariate analysis. Food Chem. 2010, 121, 282–289.
40. Casale, M.; Armanino, C.; Casolino, C.; Forina, M. Combining information from headspace and visible spectroscopy in the classification of the Ligurian olive oils. Anal. Chim. Acta 2007, 589, 89–95.
41. Kim, M.; Lee, S.; Chang, K.; Chung, H.; Jung, Y.M. Use of temperature dependent Raman spectra to improve accuracy for analysis of complex oil-based samples: Lube base oils and adulterated olive oils. Anal. Chim. Acta 2012, 748, 58–66.
42. Massart, D.; Vandeginste, B.; Buydens, L.; De Jong, S.; Lewi, P.; Smeyers-Verbeke, J. Handbook of Chemometrics and Qualimetrics; Elsevier: Amsterdam, The Netherlands, 1998; p. 732.
43. Geladi, P.; Kowalski, B. Partial least-squares regression: A tutorial. Anal. Chim. Acta 1986, 185, 1–17.
44. Ozen, B.; Mauer, L. Detection of hazelnut oil adulteration using FT-IR spectroscopy. J. Agric. Food Chem. 2002, 50, 3898–3901.
45. Lerma-Garcia, M.J.; Ramis-Ramos, G.; Herro-Martines, J.M.; Simo-Alfonso, E.F. Authentication of extra virgin olive oils by Fourier-transform infrared spectroscopy. Food Chem. 2010, 118, 78–83.
46. Sun, X.; Lin, W.; Li, X.; Shen, Q.; Luo, H. Detection and quantification of extra virgin olive oil adulteration with edible oils by FT-IR spectroscopy and chemometrics. Anal. Methods 2015, 7, 3939–3945.
47. Ozdemir, D.; Ozturk, B. Near infrared spectroscopic determination of olive oil adulteration with sunflower and corn oil. J. Food Drug Anal. 2007, 15, 40–47.
48. Oussama, A.; Elabadi, F.; Platikanov, S.; Kzaiber, F.; Tauler, R. Detection of Olive Oil Adulteration Using FT-IR Spectroscopy and PLS with Variable Importance of Projection (VIP) Scores. J. Am. Oil Chem. Soc. 2012, 89, 1807–1812.
49. Christy, A.A.; Kasemsumran, S.; Du, Y.; Ozaki, Y. The detection and quantification of adulteration in olive oil by near-infrared spectroscopy and chemometrics. Anal. Sci. 2004, 20, 935–940.
50. Wold, S.; Sjostrom, M.; Eriksson, L. PLS-regression: A basic tool of chemometrics. Chemom. Intell. Lab. Syst. 2001, 58, 109–130.
51. Barker, M.; Rayens, W.S. Partial least squares for discrimination. J. Chemom. 2003, 17, 166–173.
52. Bevilacqua, M.; Bucci, R.; Magrì, A.D.; Magrì, A.L.; Marini, F. Tracing the origin of extra virgin olive oils by infrared spectroscopy and chemometrics: A case study. Anal. Chim. Acta 2012, 717, 39–51.
53. Dupuy, N.; Galtier, O.; Ollivier, D.; Vanloot, P.; Artaud, J. Comparison between NIR, MIR, concatenated NIR and MIR analysis and hierarchical PLS model. Application to virgin olive oil analysis. Anal. Chim. Acta 2010, 666, 23–31.
54. Mannina, L.; Marini, F.; Gobbino, M.; Sobolev, A.P.; Capitani, D. NMR and chemometrics in tracing European olive oils: The case study of Ligurian samples. Talanta 2010, 80, 2141–2148.
55. Hennessy, S.; Downey, G.; O’Donnell, C.P. Confirmation of food origin claims by Fourier transform infrared spectroscopy and chemometrics: Extra virgin olive oil from Liguria. J. Agric. Food Chem. 2009, 57, 1735–1741.
56. Woodcock, T.; Downey, G.; O’Donnell, C.P. Confirmation of declared provenance of European extra virgin olive oil samples by NIR spectroscopy. J. Agric. Food Chem. 2008, 56, 11520–11525.
57. Diaz, T.G.; Meras, I.D.; Casas, J.S.; Franco, M.F.A. Characterization of virgin olive oils according to its triglycerides and sterols composition by chemometric methods. Food Control 2005, 16, 339–347.
58. Monfreda, M.; Gobbi, L.; Grippa, A. Blends of olive oil and seeds oils: Characterisation and olive oil quantification using fatty acids composition and chemometric tools. Part II. Food Chem. 2014, 145, 584–592.
59. Zunin, P.; Boggia, R.; Salvadeo, P.; Evangelisti, F. Geographical traceability of West Liguria extravirgin olive oils by the analysis of volatile terpenoid hydrocarbons. J. Chromatogr. A 2005, 1089, 243–249.
60. Vapnik, V. Statistical Learning Theory; Wiley: New York, NY, USA, 1998; p. 733.
61. Cortes, C.; Vapnik, V. Support-Vector Networks. Mach. Learn. 1995, 20, 273–297.
62. Devos, O.; Downey, G.; Duponchel, L. Simultaneous data re-processing and SVM classification based on a parallel genetic algorithm applied to spectroscopic data of olive oils. Food Chem. 2014, 148, 124–130.
63. Caetano, S.; Üstün, B.; Hennessy, S.; Smeyers-Verbeke, J.; Melssen, W.; Downey, G.; Buydens, L.; Heyden, Y.V. Geographical classification of olive oils by the application of CART and SVM to their FT-IR. J. Chemom. 2007, 21, 324–334.
64. Cover, T.M.; Hart, P.E. Nearest neighbour pattern classification. IEEE Trans. Inf. Theory 1967, 13, 21–27.
65. Gertheiss, J.; Tutz, G. Feature selection and weighting by nearest neighbour ensembles. Chemom. Intell. Lab. Syst. 2009, 99, 30–38.
66. Bhatia, N.; Vandana. Survey of Nearest Neighbor Techniques. Int. J. Comput. Sci. Inf. Secur. 2010, 8, 302–304.
67. Bajoub, A.; Medina-Rodriguez, S.; Gomez-Romeo, M.; Ajal, E.A.; Bagur-Gonzalez, M.G.; Fernandez-Gutierrez, A.; Carrasco-Poncorbo, A. Assessing the varietal origin of extra-virgin olive oil using liquid chromatography fingerprints of phenolic compound, data fusion and chemometrics. Food Chem. 2017, 215, 245–255.
68. Zupan, J.; Gasteiger, J. Neural networks: A new method for solving chemical problems or just a passing phase? Anal. Chim. Acta 1991, 248, 1–30.
69. Zupan, J.; Novic, M.; Li, X.; Gasteiger, J. Classification of multicomponent analytical data of olive oils using different neural networks. Anal. Chim. Acta 1994, 292, 219–234.
70. Garcia-Gonzalez, D.L.; Luna, G.; Morales, M.T.; Aparicio, R. Stepwise geographical traceability of virgin olive oils by chemical profiles using artificial neural network models. Eur. J. Lipid Sci. Technol. 2009, 111, 1003–1013.
71. Marini, F.; Balestrieri, F.; Bucci, R.; Magrí, A.D.; Magrí, A.L.; Marini, D. Supervised pattern recognition to authenticate Italian extra virgin olive oil varieties. Chemom. Intell. Lab. Syst. 2004, 73, 85–93.
72. Marini, F.; Magrì, A.L.; Bucci, R.; Magrì, A.D. Use of different artificial neural networks to resolve binary blends of monocultivar Italian olive oils. Anal. Chim. Acta 2007, 599, 232–240.
73. Semmar, N.; Jay, M.; Farman, M.; Roux, M. A new approach to plant diversity assessment combining HPLC data, simplex mixture design and discriminant analysis. Environ. Model. Assess. 2008, 13, 17–33.
74. Scheffé, H. Simplex centroid designs for experiments with mixtures. J. R. Stat. Soc. B 1963, 25, 235–263.

© 2016 by the authors; licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC-BY) license (http://creativecommons.org/licenses/by/4.0/).