Comparison Between Packages Shapes and Genstat ® in Ordinary Procrustes Analysis for Sensory Profiling of Food Products
Total Page:16
File Type:pdf, Size:1020Kb
doi: http://dx.doi.org/10.5892/ruvrv.2012.102. 12 8135 Comparison between packages Shapes and Genstat ® in ordinary procrustes analysis for sensory profiling of food products Eric Batista FERREIRA 1 Marcelo Silva de OLIVEIRA 2 1Professor do Instituto de Ciências Exatas. Universidade Federal de Alfenas. [email protected]. 2Professor do Departamento de Ciências Exatas. Universidade Federal de Lavras. [email protected]. Recebido em: 16/10/2012 - Aprovado em: 12/12/2012 - Disponibilizado em: 30/12/2012 ABSTRACT: In modern management of companies, the quantitative knowledge is determinant for defining marketing strategies and product images. Particularly for industries of products that strongly stimulate the human senses, the statistical methods that allow quantitative evaluations are joined under the science chunk called Sensometrics. One of those methods is the Ordinary or Classical Procrustes Analysis (OPA), that consists in finding the best way one matrix can be fitted to a reference matrix by minimizing the sum of squared distances. Since this methodology begun to be use in sensory evaluation many specific and non-specific software have been developed. The high costs involved to purchase such programs lead some researchers to look for free software. R is a free statistical and mathematical package that already has an implementation of OPA and GPA (Generalized Procrustes Analysis) in its Shapes package (a statistical shape analysis package). The aim of this paper is to interpret, describe and compare R output to a non-specific software, GenStat ®, normally used in sensory analysis. Under simulated data the software were compared and lead to same conclusions and interpretations. Therefore, package Shapes from R can be used in sensory analysis, rather than Statistical Shape analysis. Key words: Ordinary Procrustes Analysis, software R, data simulation, Food products. INTRODUCTION methods that allow quantitative evaluations In the modern management of are joint under the science chunk called companies, the quantitative knowledge of Sensometrics, which deals with several the consumer preferences is determinant categories of problems like consumer for defining marketing strategies and preference, sensory profile of a product product images, impacting also in the and assessors training (FERREIRA; definition of competitive advantage by OLIVEIRA, 2007). production (SLACK; CHAMBERS; One of the greatest needs of Food JOHNSTON, 1997). Further, knowing Sensorial Analysis, especially in dairy such preferences is strictly related to the industries, is a satisfactory statistical implemented programs and politics, aims analysis of the panelist’s scores. For these and objectives of the quality management reason, along the years, a lot of tools have of the company. been tested. The multivariate techniques Particularly for industries of had been demonstrated to be efficient to products that strongly stimulate the human provide good answers about consumer’s senses (food, cosmetics and perfumes preference and food quality (FERREIRA industry, among others), the statistical et al., 2008). 128 Revista da Universidade Vale do Rio Verde, Três Corações, v. 10, n. 2, p. 128-135, ago./dez. 2012 This paper deals with the particular ||sX 1Q - X 2|| . (1) problem of identifying the possible consensus of a trained assessor and an Here, Q refers to a orthogonal expert. It is reasonable to assume that if the matrix (SCHÖENEMANN, 1966), and s is assessor agrees with the expert, his/her a scaling factor that enables stretching or training can be considered satisfactory. shrinking the configuration (GOWER, Cheese sensory data is used here for such 1975). evaluation. In dairy industry is natural to One of the suitable multivariate suppose that an expert panelist train a techniques for such evaluation is the beginner. In order to verify the efficiency Procrustes Analysis. Particularly, the of the training an OPA must be performed Ordinary Procrustes Analysis (OPA) (or (fitting the trainee score matrix to the Classical Procrustes Analysis) is a expert score matrix) to evaluate the multivariate technique that is used when residual sum of squared distances between one want to find the best way one matrix these two matrices. Also, agreements and can fit another. For instance, how similar disagreements between the two panelists one score matrix is to another. While the can be identified in some two axes OPA allow to match just two score graphics (FERREIRA et al., 2007). matrices, with Generalized Procrustes There are a lot of specific and non- Analysis (GPA) we can match m matrices specific software that perform this analysis at the same time (GOWER, 1975). satisfactorily, but all of the commonly used The minimization problem of are so expensive. R is a free statistical and transforming one given matrix X1 by a mathematical program. It has some basic matrix, say Q, such that best fits a given packages (default) and many others target matrix X2 is called a Procrustes packages to be loaded in accordance with Problem since Procrustes was an evil the user needs (R DEVELOPMENT personage of the Greek Mythology which CORE TEAM, 2011). In R there isn’t a used to fit his guests in a so called magical package specific for Procrustes analysis, bed by sawing or stretching them. The but there is a package called Shapes , for term Procrustes Problem is due to Hurley statistical shape analysis, that brings OPA and Cattel (1962) that suggested for the and GPA algorithms implemented first time the problem of transforming one (DRYDEN, 2009). matrix into another by minimizing the Residual Sum of Squares (RSS) 129 Revista da Universidade Vale do Rio Verde, Três Corações, v. 10, n. 2, p. 128-135, ago./dez. 2012 GenStat is statistical software that context. The reason we choose analyze just two attributes was to facilitate the building is able to perform the Procrustes analysis, of graphics. enabling many graphic tools (GENSTAT, 2003). Despite its high purchase cost, it is Data simulation commonly used in sensory analysis Graybill (1976) presents an easy (GAINS; THOMSON, 1990; MCEWAN; manner to generate data from a bivariate SCHLICH, 1992). Normal distribution ( X, Z ), where Z|X=x is This proposed situation was found by a linear relation of X. performed in the programs R and GenStat This way, to simulate data from a (7 th edition) in order to compare the bivariate Normal distribution one has to outputs. We also built graphics with R and assume known five parameters, since got some practical information. (X, Z) ~ N 2(µZ, µX, σZZ , σXX , σZX ) (2) where µZ, µX are the means of Z e X; σZZ , METHODOLOGY σXX the variances of Z e X and σZX the Ordinary Procrustes Analysis covariance between Z e X; and where (OPA) consists in find the best way one σZX = ρσ ZσX. matrix fits a reference matrix by The linear relationship between Z e minimizing the sum of squared distances X is the mathematical expectation of the between the points (lines) in a fixed matrix conditional Z|X = x and a shifted and rotated to-be-fitted-one σ ZX (3) EZXx[|==+ ]µZ ( x − µββ X ) =+0 1 x (GOWER, 1975). σ XX A reference matrix (expert scores) was established and a data matrix (trained and with variance 2 assessor) was simulated in R (like V[Z|X = x ] = σZZ (1-ρ ) (4) demonstrated below). So, one value of the variable Z|X=x is It was considered four brands of given by Gorgonzola cheese, each line of the matrix Z|X= µZ|X=x + e(x) (5) representing a brand. The judges scored 2 where e(x) ~ N(0, σZZ (1-ρ )) . these four samples at two attributes Data simulation was necessary only (appearance and characteristic flavor), to get the trained assessor's scores. We each column of the matrix representing an assumed that the trained assessor scores attribute. So, this simulated sensory were random values from four bivariate analysis occurred in a fixed vocabulary 130 Revista da Universidade Vale do Rio Verde, Três Corações, v. 10, n. 2, p. 128-135, ago./dez. 2012 Normal distributions ( X, Z ), where X is the Later, the configuration graphics appearance and Z is the characteristic were made utilizing R. Figure 1 presents flavor. The means of these four bivariate the configurations of the both assessors, Normal distributions are the expert scores. before and after Procrustes analysis. In The expert scores were arbitrarily Figure 1 one can observe how OPA established and it was considered a allowed a better view of the agreements positive correlation between the attributes and disagreements of such assessors. (appearance and characteristic flavor). We Looking at the right hand side of assumed the following covariance matrix Figure 1 one can observe that both 100 120 assessors agree more about brands 2 and 3 R = 120 225 than about brands 1 and 4. Considering Therefore, there was a positive first the appearance axis, for both correlation between attributes ( ρ=0,8) but assessors, the decreasing relation of brands not between cheese brands. is ordered like 4, 1, 2 and 3; but when one consider the characteristic flavor axis, the MAIN RESULTS decreasing relation of brands for expert is ordered like 4, 1, 2 and 3; and for the After data simulation, the trained assessor the relation of brands is established (A) and simulated matrices (B) ordered like 1, 4, 2 and 3. So, concerning (score matrices from expert and trained to characteristic flavor, they disagree in assessor, respectively) were: first and second positions. However, both 8082 7175 agree that brand 4 is the better whole 7072 and 6259 sample and brand 3 is the worst. A = B = 3050 36 44 9095 7772 131 Revista da Universidade Vale do Rio Verde, Três Corações, v. 10, n. 2, p. 128-135, ago./dez. 2012 (a) (b) FIGURE 1. Configurations of the both assessors: expert (1a, 2a, 3a, 4a) e trained assessor (1b, 2b, 3b, 4b), before (left) and after (right) Ordinary Procrustes Analysis.