Factor Analysis: Illustrations/Graphics/Tables Using Painters Data
One way to do pairwise plots for the painters4a data is to use 'splom'. painters4a is just the first four columns of painters (MASS library) plus a binary variable that distinguishes School D from the other Schools; see the painters documentation. The commands used here were:

> splom(~painters4a[painters4a[,5]==0,1:4])
> par(new=T)
> splom(~painters4a[painters4a[,5]==1,1:4],pch="+")
> mtext("Splom plot of painters4a data, school D points as '+'",line=3)

[Figure: splom (scatterplot matrix) of Composition, Drawing, Colour, and Expression for the painters4a data, with the School D points overlaid as '+'.]

Study this plot carefully in relation to the correlation matrix for these data:

> rnd2(cor(painters4a))
            Composition Drawing Colour Expression Schl.nD
Composition        1.00     .42   -.10        .66    -.29
Drawing             .42    1.00   -.52        .57    -.36
Colour             -.10    -.52   1.00       -.20     .53
Expression          .66     .57   -.20       1.00    -.45
Schl.nD            -.29    -.36    .53       -.45    1.00

Next, principal components analysis (unweighted) is compared with common factor analysis for these data; be sure you follow all steps carefully.

PCA results for painters4a data

The unrotated coefficients matrix for 2 components is:

> pca.coef2 <- eigen(cor(painters4a))$ve[,1:2]%*%diag(
+ sqrt(eigen(cor(painters4a))$va[1:2]))
> rnd2(pca.coef2)
      [,1]  [,2]
[1,]  -.68  -.58
[2,]  -.80   .07
[3,]   .62  -.70
[4,]  -.81  -.41
[5,]   .71  -.35

And its ROTATED counterpart is:

> -rnd2(rotate(pca.coef2)$rma)
      [,1]  [,2]
[1,]   .90  -.04
[2,]   .54  -.59
[3,]   .02   .93
[4,]   .87  -.26
[5,]  -.28   .74

with T matrix:

> rotate(pca.coef2)$tm
      [,1]  [,2]
[1,]  .732 -.681
[2,]  .681  .732

REASONS for choosing TWO components/factors are noted below.

------------- COMMON FACTOR ANALYSIS NEXT -------------

Compare with the COMMON FACTOR coefficients matrix:

> cfa.coef2 <- cf.wad(cor(painters4a),2,54,3)$f
[[cf.wad uses the computation: S Q_2 (D_lambda - arr I)^(1/2)]]
> cfa.coef2
               1     2
Composition  .76  -.09
Drawing      .50  -.56
Colour      -.03   .80
Expression   .81  -.26
Schl.nD     -.30   .62

Note that the magnitudes of the PCA coefficients are generally (but not always) larger than their common factor counterparts.
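For reference, the painters4a matrix used above can be rebuilt from the painters data in the MASS library. A minimal sketch in R; the 0/1 coding of the fifth column (here taken as 1 = School D, inferred from the splom subsetting above) is an assumption, and the course's actual construction may differ:

## Hedged sketch: rebuild painters4a from MASS::painters.
library(MASS)
data(painters)
## First four rating columns plus a School D indicator; the name Schl.nD
## matches the correlation matrix above, and 1 = School D is an assumption.
painters4a <- cbind(painters[, 1:4],
                    Schl.nD = as.numeric(painters$School == "D"))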
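The rotation step can also be checked in plain R, where 'varimax' plays the role of the S-Plus 'rotate' function. A sketch, with the caveats that varimax applies Kaiser normalization by default and that column signs are arbitrary, so the output may be flipped relative to the matrices above:

## Unrotated 2-component coefficients from the eigendecomposition of R,
## then a varimax rotation (cf. rotate(pca.coef2)$rma and $tm above).
R <- cor(painters4a)
e <- eigen(R)
pca.coef2 <- e$vectors[, 1:2] %*% diag(sqrt(e$values[1:2]))
vm <- varimax(pca.coef2)
round(unclass(vm$loadings), 2)   # rotated coefficients
round(vm$rotmat, 3)              # the T (rotation) matrix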
Also, compare the LACK OF FIT matrices for PCA and CFA; separately examine the diagonal and off-diagonal values.

> rnd2(cor(painters4a)-pca.coef2%*%t(pca.coef2))
            Composition Drawing Colour Expression Schl.nD
Composition         .19    -.09   -.08       -.14    -.01
Drawing            -.09     .36    .02       -.05     .23
Colour             -.08     .02    .14        .02    -.15
Expression         -.14    -.05    .02        .17    -.01
Schl.nD            -.01     .23   -.15       -.01     .37

> cfa.coef2 <- cf.wad(cor(painters4a),2,54,3)$f
> rnd2(cor(painters4a)-cfa.coef2%*%t(cfa.coef2))
            Composition Drawing Colour Expression Schl.nD
Composition         .41    -.01    .00        .02    -.01
Drawing            -.01     .44   -.05        .02     .14
Colour              .00    -.05    .36        .03     .03
Expression          .02     .02    .03        .28    -.04
Schl.nD            -.01     .14    .03       -.04     .53

An example of a 'scree' plot based on the eigenvalues of R:

> plot(1:5,eigen(cor(painters4a))$va,ylim=c(0,3),type='b')
> abline(h=mean(eigen(cor(painters4a))$va[3:5]),lty=3)
> title("An example of a 'scree' plot based on eigenvalues of R")

[Figure: scree plot of the five eigenvalues of R, with a dotted horizontal line at the mean of the three smallest eigenvalues.]

Compare with the common factor counterpart, a 'scree' plot based on the eigenvalues of the rescaled correlation matrix S^(-1) R S^(-1):

> plot(1:5,svd.da(painters4a)$rts.r,ylim=c(0,6),type='b')
> abline(h=mean(svd.da(painters4a)$rts.r[3:5]),lty=3)

[Figure: scree plot of the five eigenvalues of rescaled R, again with a dotted line at the mean of the three smallest.]
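Because svd.da is a course function, here is a hedged sketch of one way to obtain such rescaled eigenvalues in plain R, assuming a Harris-type rescaling in which S^2 is taken as 1/diag(R^(-1)); svd.da may differ in detail:

## Eigenvalues of S^(-1) R S^(-1), with S^(-1) = diag(sqrt(diag(solve(R)))).
R <- cor(painters4a)
s.inv <- sqrt(diag(solve(R)))
rts <- eigen(diag(s.inv) %*% R %*% diag(s.inv))$values
plot(1:5, rts, ylim = c(0, max(rts)), type = 'b')
abline(h = mean(rts[3:5]), lty = 3)
title("Scree plot based on eigenvalues of rescaled R")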
Next, examine the 'biplot' display of the painters4a data, done simply with principal components analysis:

> biplot(princomp(painters4a))

[Figure: "Display based on 'biplot(princomp(painters4a))'"; each painter appears as a labeled point (Rubens, Rembrandt, Raphael, Titian, Le Brun, and so on), and the variables Composition, Drawing, Colour, Expression, and Schl.nD appear as vectors.]

The special virtue of this type of display is that it shows simultaneously both the data points, in this case the painters with their names identified, and the variables as vectors. In this case the biplot has been done without rotation of the component axes, but that could be done rather easily if desired. A biplot can also be done for a common factor solution, and there would be some advantage in doing that for some data sets (but it makes relatively little difference in this case).

Note that any interpretation of such a display can take account of a great deal of information. To give an example, we see that Expression and Composition, the two highest-loading variables on factor/component I (after rotation), largely define the first component, the horizontal axis here; painters such as Le Brun, Domenichino, and Raphael differ especially from painters such as Bellini, Murillo, and Caravaggio (to name a few) at the opposite side of the display; that is, they differ especially with respect to the Expression and Composition ratings. Painters whose names are in close proximity to one another tend to be similar with respect to the information summarized by both components, and painters far from one another may differ with respect to both components or just one (the same would be true for common factors). Biplot displays like this are not confined to two-factor solutions: the term biplot does not refer to two-dimensional displays, but rather to two kinds of information, coefficients and scores.

Take some time to relate what you see summarized here to what you can see in the data matrix itself for the painters4a data. That matrix is given at the end of this handout ON THE WEB ONLY.

Summary: Comparisons of results for common factor analysis (CFA) and principal components analysis (PCA) generally show findings similar to those here: better fits to the OFF-DIAGONAL entries of the correlation matrix when using CFA; worse fits to the DIAGONALS (ones) with CFA (but we almost never care about fitting the diagonals, so CFA may be preferable). The coefficients matrix generated using CFA has more clearly interpretable values, as each coefficient represents a simple correlation between a (row) variable and a (column) common factor, with little bias; the PCA coefficients are spurious (often larger in magnitude) in the sense that these correlations are between variables and derived components, where the components are (merely) linear combinations of the variables themselves. Common factors are 'latent' or 'unobservable' variates, postulated in a common factor model; components are not latent, so they are computable.

To elaborate further, the common factor model postulates that individual scores (ratings in this example) are composed of two (orthogonal) parts: the common factor parts (thought to be 'less error-full') and the 'uniqueness' parts, essentially the error parts. When one entertains a common factor model in the context of an analysis of a data matrix, one supposes that individual scores (such as ratings) tend to be 'fallible', and that in isolation such scores may well misrepresent the 'true' rating that might be conceived of as the 'true common part' of the data. When the number of common factors, the value m, is chosen, one makes a choice for the dimensionality of the common factor space, which we might hope will be relatively low so as to enhance the interpretability and reduce the complexity of the account given of the data by way of common factors.

Principal components analysis is better thought of as just a data reduction method: there is no model for it, just an algebraic way to decompose the data (think of the svd) into separable parts. With PCA one has little guidance of the kind available in CFA for choosing the number of components to use in data reduction. Nevertheless, PCA is used more often than CFA (which has MANY different forms and algorithms, and is in some ways controversial in its assumption of underlying or latent factors), and PCA does often produce results (see the biplot) that may be very close to their common factor counterparts; so it is almost inevitable that there will be a good deal of confusion between PCA and CFA in the minds of those not highly knowledgeable about common factor analysis.

Factor and component rotation (see the function 'rotate' in S-Plus, or 'varimax' in R) is easily accomplished and is worth further study if you aim to analyze real data with two or more factors, especially if the 'model' for CFA seems credible for your data.

Homework: Carry out a CFA starting either from a rectangular data matrix (such as painters4a [below], or another data set of your choosing [see the MASS library, or do help.search("datasets") in R]) or from a correlation matrix (see Harman74.cor in R, for which you will need to see its documentation [use data(Harman74.cor), then Harman74.cor$cov]) to get started.
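As a starter for the homework, here is a hedged sketch using R's built-in factanal, a maximum likelihood CFA that differs in method from the course's cf.wad but covers both starting points:

## Route 1: start from a rectangular data matrix.
fa1 <- factanal(painters4a, factors = 2, scores = "regression")
print(fa1$loadings, cutoff = 0)
## fa1$scores and fa1$loadings could also feed a common factor biplot,
## e.g. biplot(fa1$scores, fa1$loadings).
## Route 2: start from a correlation matrix; the Harman74.cor list
## already carries the n.obs needed for the likelihood computations.
data(Harman74.cor)
fa2 <- factanal(factors = 4, covmat = Harman74.cor)
fa2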