Simple and Multiple Ordered Correspondence Analysis to Evaluate Customer Satisfaction

Rosaria Lombardo* and Eric J. Beh**

*The Second University of Naples, Via del Gran Priorato di Malta, 81043 Capua, Italy

**The University of Newcastle, Callaghan, NSW 2308

Abstract

In marketing research, to evaluate customer satisfaction, individuals are often required to fill in questionnaires where the responses are ordered. To analyze the association between these ordered categorical variables, an alternative to the usual multiple correspondence analysis has been adopted. It is based not on the classical singular value decomposition alone but on a hybrid decomposition.

Keywords: Correspondence analysis, ordered categorical variables, singular value decomposition, bivariate moment decomposition, hybrid decomposition.

Third Annual ASEARC Conference, December 7-8, 2009, Newcastle, Australia

1. Introduction

In many scientific investigations, including marketing research for customer or job satisfaction evaluations, surveying models (like SERVQUAL/SERVPERF, [12]) include ordered responses to multiple questions inspecting the good or bad quality aspects of a service/product or job environment. In the past ten years, to deal with two ordered categorical variables, an alternative method of decomposition (bivariate moment decomposition) to those traditionally used in categorical data analysis has been shown to be applicable to simple symmetric [1, 3, 4] and non-symmetric correspondence analysis [9]. Recently a generalisation to more than two ordered categorical variables has been considered in the literature (see, for example, [10, 11]). This new decomposition is called hybrid decomposition (HD) and combines the features of the classical singular value decomposition (SVD) and of the bivariate moment decomposition (BMD).

After briefly reviewing the classical approach to simple and multiple correspondence analysis [7, 8] in section 2, referred to here as CA and MCA respectively, we describe the hybrid decomposition and its properties in the context of ordered simple CA (OCA) and ordered MCA (OMCA). In section 3, we illustrate the application of OCA on customer satisfaction survey data.

2. Simple and Multiple Ordered Correspondence Analysis of Indicator Matrices

Suppose that a study consists of p variables; let Xk be an n × Jk matrix with n units and Jk categories, for k = 1, 2, ..., p. The matrix Xk is an indicator matrix of 0's and 1's. Let X = [X1 | ... | Xp] be the concatenation of the Xk matrices, forming a super-indicator matrix of dimension n × J, where J = J1 + ... + Jp is the total number of categories of the p variables. Here X summarizes the categorical response (agree/disagree) of each of the n individuals/units that are classified into p ordered categorical variables. Suppose that the jk-th marginal (column) frequency of Xk is denoted as [...] and D is the super-diagonal matrix of dimension J × J, whose generic diagonal "element" is the diagonal matrix [...]. Denote the matrix of column singular vectors (factors) of X by [...] and the row singular vectors by [...]. These vectors exist in R^J and R^n space, respectively.

In classical MCA, the analysis of X is conducted using the singular value decomposition so that

[...]   (1)

where [...] and [...] are subject to the constraints [...] and [...] D [...] = I respectively, and [...] is a diagonal matrix whose (m, m)th element, [...], is the mth largest singular value of [...].

For the MCA of ordered categorical variables a different decomposition, called hybrid decomposition (HD), has been proposed [10, 11]. The HD involves computing singular vectors, [...], for the n individuals, and orthogonal polynomials [6] for the J ordered categories. The matrices of orthogonal polynomials for the ordered variables, [...], are arranged on the diagonal of a super-matrix, [...], of dimension J × J, whose off-diagonal blocks are zero matrices. This super-matrix of polynomials is subject to the constraint [...]. The main advantage of the hybrid decomposition is that the total inertia can be partitioned into orthogonal polynomial components, thereby allowing the user to determine the dominant (linear, quadratic, etc.) component of each principal axis. HD also allows the individuals/units to be clustered in as many clusters as there are categories. In general, for the analysis of X with p variables via the hybrid decomposition, we write

[...]   (2)

where [...] is of dimension [...] and [...] is the matrix of orthogonal polynomials subject to the constraint [...]. The generic (m, vk)th element of Z is given by

[...]

where [...] is the mth singular vector associated with the row (or individual) i of X obtained from the SVD (for m = 1, ..., M = min(n, J) - 1). Similarly, [...] is the orthogonal polynomial of order vk, for [...], and is associated with the jk-th category of the generic variable k. It can be shown that the total inertia of the point cloud can be expressed as [...]. Using the HD implies that the total inertia of the data is not only partitioned into polynomial components, but can also be partitioned into M singular values and singular vectors. The (m, vk)th value of Z defines the contribution of the vk-th order bivariate moment of the ordered variable k to the mth singular value (or principal inertia).

When vk = 1, the element describes the importance of the location component for the kth variable on the mth axis of a classical correspondence plot. Therefore the overall location component of the categories of the kth variable can be determined by calculating [...]. If this component is significant, then there is a significant variation in the location of those categories. The quadratic, or dispersion, component of the categories can be calculated by [...] and reflects the variation in the spread of the categories of the kth variable. In order to test for significant components from the decomposition of the total inertia using Z, the mathematical equivalence between this inertia and the Pearson chi-squared statistic (related to formula (1)) has been shown [11].

As in classical MCA, when the number of variables is p = 2 this approach simplifies to simple ordered CA (OCA, [11]) and clear relationships can be found with doubly ordered CA (DOCA, [1, 2, 5]). When a two-way contingency table, P (of dimension J1 × J2), consists of two ordered variables, the simple CA procedure using the bivariate moment decomposition (BMD) is referred to as (simple) DOCA [1, 2]. Such a CA approach involves the BMD being applied to the matrix of Pearson contingencies such that

[...]

where [...] is the vector that contains the J1 row marginal relative proportions and [...] is the vector containing the J2 column marginal proportions, [...] is the diagonal matrix consisting of the elements of [...] and [...] the diagonal matrix with elements [...] and [...]. Note that, for classical CA, the relationship between the mth singular value of the two-way contingency matrix, P, and the mth singular value of the super-indicator matrix, X, is [...]. Therefore the mth singular value associated with the contingency table can be computed by [...] as [...]. When using the BMD, the relationship between the mth singular value and the orthogonal polynomial of degree [...] is [...], for [...].

In general, the correspondence plots obtained from generating coordinates using orthogonal polynomials allow users to determine the nature of the association in terms of location, dispersion and higher order moments, where each axis reflects one of these moments. Comparing results from BMD and HD, the same significant components of inertia are obtained, but graphical displays from DOCA (involving the analysis of P) and from OCA (involving the analysis of X) do not provide the same kind of information. In fact, the positions of the categories of X using OCA are identical to the category coordinates obtained from a classical MCA, but are different from those in a DOCA plot. Furthermore, while in classical MCA the coordinates for individuals are often left out of a CA involving X, in OCA (and OMCA) the unit coordinates, those obtained using orthogonal polynomials, permit one to focus on distinct clusters of individuals. Assuming that all variables consist of the same number of ordered categories, jk = j for all k = 1, 2, ..., p, and that these categories are assigned equivalent scores to reflect the ordinal structure of their variables, the correspondence plot will consist of j clusters of individuals.
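The partition of the inertia into polynomial components for two ordered variables can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the 3 × 3 table of counts is hypothetical, the function name orthonormal_polynomials is ours, and the polynomials are built by weighted Gram-Schmidt on powers of the natural scores rather than by the recurrence of [6] (the two should agree up to sign). The matrix Z below holds the unscaled bivariate moments of the table; the sketch checks that the sum of squares of its non-trivial elements equals the total inertia (Pearson chi-squared divided by n).

```python
import math

def orthonormal_polynomials(scores, weights):
    """Return polynomials of degree 0..len(scores)-1 evaluated at the scores,
    orthonormal with respect to the given weights (weighted Gram-Schmidt)."""
    basis = []
    for degree in range(len(scores)):
        vec = [s ** degree for s in scores]
        for b in basis:
            # Remove the weighted projection onto each lower-degree polynomial.
            proj = sum(w * v * bj for w, v, bj in zip(weights, vec, b))
            vec = [v - proj * bj for v, bj in zip(vec, b)]
        norm = math.sqrt(sum(w * v * v for w, v in zip(weights, vec)))
        basis.append([v / norm for v in vec])
    return basis  # basis[u][i] is the degree-u polynomial at category i

# Hypothetical 3 x 3 table of counts for two ordered variables (illustrative only).
counts = [[20, 10, 5],
          [10, 20, 10],
          [5, 10, 30]]
n = sum(map(sum, counts))
P = [[x / n for x in row] for row in counts]
r = [sum(row) for row in P]            # row marginal proportions
c = [sum(col) for col in zip(*P)]      # column marginal proportions

a = orthonormal_polynomials([1, 2, 3], r)   # natural scores, row variable
b = orthonormal_polynomials([1, 2, 3], c)   # natural scores, column variable

# Bivariate moments: Z[u][v] = sum over i, j of a_u(i) * p_ij * b_v(j).
Z = [[sum(a[u][i] * P[i][j] * b[v][j] for i in range(3) for j in range(3))
      for v in range(3)] for u in range(3)]

# Total inertia from the non-trivial moments and from the Pearson contingencies.
inertia_from_Z = sum(Z[u][v] ** 2 for u in (1, 2) for v in (1, 2))
pearson_inertia = sum((P[i][j] - r[i] * c[j]) ** 2 / (r[i] * c[j])
                      for i in range(3) for j in range(3))

# Location (degree 1) and dispersion (degree 2) components of the column variable.
chi_squared = n * pearson_inertia
column_location = n * sum(Z[u][1] ** 2 for u in (1, 2))
column_dispersion = n * sum(Z[u][2] ** 2 for u in (1, 2))
print(round(chi_squared, 2), round(column_location, 2), round(column_dispersion, 2))
```

With three categories per variable there are only two non-trivial polynomial degrees, so the location and dispersion components of a variable add up to the full chi-squared statistic; with more categories the remainder would be the higher order (or error) component.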

3. Customer Satisfaction in Health Care Services

To illustrate OCA and DOCA analysis, we study the quality of specific aspects of patient satisfaction in a hospital located in a province of Naples, Italy (June 2004, Second University of Naples). For the survey, the SERVPERF questionnaire was used. It consisted of 15 questions synthesised in 5 variables related to five service quality aspects: Tangibility - related to the quality of the buildings, rooms, bathrooms, etc.; Reliability - the trust and precision of the services offered by the hospital; Capability of Response - the quality of response to the needs of the patients; Capability of Assurance - the quality of competency and courtesy of the hospital and its staff; and Empathy - the quality of personal attention given to the patient. The aim of these questions was to gauge the quality of five key characteristics of the hospital based on a sample of 511 patients that were admitted to the hospital.

Figure 1: a) DOCA plot b) Classical CA plot.

For the application of simple OCA, suppose we now focus on only two of these characteristics: Tangibility (Tang) and Capability of Assurance (Cras). The patients were asked to provide a judgment of each characteristic on a five point scale, where 1 refers to a poor level of satisfaction and 5 refers to an excellent level of satisfaction. Natural scores (1, 2, 3, 4 and 5) were used to reflect the ordered structure of the responses. For example, patient responses to Tangibility on the five point scale are reflected by the labels Tang1, Tang2, Tang3, Tang4 and Tang5. In order to study the association between patient satisfaction towards these two characteristics, DOCA and OCA were performed. From these analyses, graphical displays are obtained and compared. Figure 1a is a DOCA plot via BMD using the location and dispersion polynomials and Figure 1b is the classical CA plot via SVD. Furthermore, Figures 1a and 2a show that for both Tangibility and Capability of Assurance the first principal axis is dominated by the location component of the two variables. This is consistent with the horseshoe effect apparent for OCA in Figure 1a. Similarly, the second axis of Figures 1a and 2a is dominated by the dispersion component.

Figure 2: a) OCA plot - responses b) OCA plot - individuals.

Figure 2b shows the correspondence plot of the individuals from an OCA, where the five different individual clusters provide an indication of how people have responded to the two variables. Those belonging to cluster A (9.98%) are individuals who generally rated the two services offered by the hospital as poor. Those individuals belonging to cluster B (5.68%) are associated with the score 2, cluster C (10.18%) is associated with 3, and cluster D (17.42%) with 4. Finally, those individuals belonging to cluster E (56.75%) rated the Tangibility and Assurance characteristics as consistently excellent. The five distinct clusters of individuals are in correspondence with the five ordered categories.

After considering the graphical displays, we now look at the total inertia and its decompositions. Table 1 summarises the partition of the total inertia for the OCA and DOCA analyses. Excluding the trivial solution, the total inertia obtained from DOCA is 0.267. With a chi-squared statistic of 136.25 (p-value < 0.0001), there is a significant association between the two variables. Each of the OCA singular values can be obtained from the DOCA singular values. For example, the first OCA principal inertia, [...], can be obtained in the following manner: [...]. Using the indicator matrix X to perform OCA, the total inertia, excluding the trivial solution, is 4 (Table 1).

Table 1: Principal inertia of each axis for OCA and DOCA.

Axis m         OCA      DOCA
0 (trivial)    1.000    1.000
1              0.725    0.203
2              0.612    0.050
3              0.557    0.013
4              0.512    0.001
5              0.488    -
6              0.443    -
7              0.388    -
8              0.275    -
9              0.000    -
Total          4.00     0.267
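The link between the two columns of Table 1 can be checked numerically with a short sketch. It assumes the standard relation for two variables between a principal inertia of the contingency table, rho^2, and the paired principal inertias of the indicator matrix, (1 + rho)/2 and (1 - rho)/2; this relation is an assumption on our part, since the exact expression used by the authors is not legible in the text, but it reproduces the reported values to within rounding. The variable and function names are ours.

```python
import math

n = 511                                         # sample size reported in the text
doca = [0.203, 0.050, 0.013, 0.001]             # Table 1, DOCA column (axes 1 to 4)
oca = [0.725, 0.612, 0.557, 0.512,
       0.488, 0.443, 0.388, 0.275]              # Table 1, OCA column (axes 1 to 8)

def oca_pair(rho_squared):
    """Assumed relation: each contingency-table inertia rho^2 yields two
    indicator-matrix inertias, (1 + rho) / 2 and (1 - rho) / 2."""
    rho = math.sqrt(rho_squared)
    return (1 + rho) / 2, (1 - rho) / 2

upper, lower = zip(*(oca_pair(x) for x in doca))
oca_from_doca = list(upper) + list(reversed(lower))

# The reverse direction is better conditioned against rounding: rho^2 = (2*lambda - 1)^2.
doca_from_oca = [(2 * lam - 1) ** 2 for lam in oca[:4]]

# Chi-squared statistic as n times the total DOCA inertia (0.267 is itself rounded).
chi_squared_approx = n * sum(doca)

for reported, rebuilt in zip(oca, oca_from_doca):
    print(f"{reported:.3f}  {rebuilt:.3f}")
print(round(chi_squared_approx, 2))
```

The rebuilt values differ from the table only in the third decimal place, which is expected because the DOCA inertias are rounded to three decimals; the same rounding explains why n times 0.267 is close to, but not exactly, 136.25.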

By DOCA and OCA it is also possible to assess the significance of polynomial components. The Pearson chi-squared statistic of 136.25 can be partitioned into components that reflect location, dispersion and higher order sources of variation for each variable; see Table 2. This table shows that the location component is a significant and dominant source of variation for both variables. Further sources of association can be identified by examining the variation within each variable in terms of moments higher than the location.

Table 2: Partition of the total inertia by DOCA.

Source of variation      df    Chi-squared    P-value    %
Tangibility
  Location                4      100.58        0.00       73.82
  Dispersion              4       13.22        0.01        9.70
  Error                   8       22.45        0.00       16.48
  Total                  16      136.25        0.00      100.00
Capability of Assurance
  Location                4       93.26        0.02       68.45
  Dispersion              4       21.64        0.01       15.88
  Total                  16      136.25        0.00      100.00

Finally, OCA analysis can be viewed as a confirmatory approach to classical CA and MCA. The main advantages of the technique concern the possibility of formally testing for significant sources of association among categorical variables and the ability to provide clusters of units ordered with respect to the ordered categories.

Table 3: Decomposition of the first two non-trivial eigenvalues of OCA for Tangibility and Capability of Assurance.

Component                Eigenvalue 1    % of inertia    Eigenvalue 2    % of inertia
Tangibility
  Location                  0.321           88.43           0.013            4.25
  Dispersion                0.002            0.55           0.245           80.07
  Skewness                  0.037           10.19           0.047           15.36
  Kurtosis                  0.003            0.83           0.001            0.32
  Total (k=1)               0.363          100.0            0.306          100.0
Capability of Assurance
  Location                  0.351           96.69           0.000            0.00
  Dispersion                0.005            1.38           0.139           45.42
  Skewness                  0.004            1.10           0.037           12.09
  Kurtosis                  0.003            0.83           0.130           42.48
  Total (k=2)               0.363          100.0            0.306          100.0
Total                       [...]                           [...]

References

[1] E. J. Beh, "Simple correspondence analysis of ordinal cross-classifications using orthogonal polynomials", Biometrical Journal, 39, 589-613, 1997.

[2] E. J. Beh, "A comparative study of scores for correspondence analysis with ordered categories", Biometrical Journal, 40, 413-429, 1998.

[3] E. J. Beh, "Partitioning Pearson's chi-squared statistic for singly ordered two-way contingency tables", The Australian and New Zealand Journal of Statistics, 43, 327-333, 2001.

[4] E. J. Beh, "Simple correspondence analysis: A bibliographic review", International Statistical Review, 72, 257-284, 2004.

[5] E. J. Beh, "Simple correspondence analysis of nominal-ordinal contingency tables", Journal of Applied Mathematics and Decision Sciences, Article ID 218140, 17-34, 2008.

[6] P. L. Emerson, "Numerical construction of orthogonal polynomials from general recurrence formula", Biometrics, 24, 696-701, 1968.

[7] M. J. Greenacre, Theory and Application of Correspondence Analysis, Academic Press: London, 1984.

[8] L. Lebart, A. Morineau, K. M. Warwick, Multivariate Descriptive Statistical Analysis, Wiley: New York, 1984.

[9] R. Lombardo, E. J. Beh, L. D'Ambra, "Non-symmetric correspondence analysis with ordinal variables using orthogonal polynomials", Computational Statistics & Data Analysis, 52, 566-577, 2007.

[10] R. Lombardo, E. J. Beh, "Simple and multiple correspondence analysis for ordinal scale variables", Journal of Applied Statistics, in press, 2009.

[11] R. Lombardo, J. Meulman, "Multiple correspondence analysis via polynomial transformations of ordered categorical variables", Journal of Classification, in press, 2009.

[12] A. Parasuraman, V. A. Zeithaml, L. Berry, "A conceptual model of service quality and its implications for future research", Journal of Marketing, 49, 41-50, 1985.
