Behavior Research Methods, Instruments, & Computers 2000, 32 (3), 389-395

Determining the number of factors to retain: A Windows-based FORTRAN-IMSL program for parallel analysis

JENNIFER D. KAUFMAN and WILLIAM P. DUNLAP
Tulane University, New Orleans, Louisiana

Parallel analysis (PA; Horn, 1965) is a technique for determining the number of factors to retain in exploratory factor analysis that has been shown to be superior to more widely known methods (Zwick & Velicer, 1986). Despite its merits, PA is not widely used in the psychological literature, probably because the method is unfamiliar and because modern, Windows-compatible software to perform PA is unavailable. We provide a FORTRAN-IMSL program for PA that runs on a PC under Windows; it is interactive and designed to suit the size of problems encountered in most psychological research. Furthermore, we provide sample output from the PA program in the form of tabled values that can be used to verify the program operation, or that can be used either directly or with interpolation to meet specific needs of the researcher.

Selecting the correct number of factors to retain in exploratory factor analysis is of vital importance to researchers as an aid in the measurement and interpretation of psychological constructs (Cota, Longman, Holden, Fekken, & Xinaris, 1993). Failure to select the "correct" number of factors may seriously distort reported research results (Cota et al., 1993). Although factor analysis is an important tool of many researchers in the psychological sciences, the methods used by most researchers to determine the number of factors to retain are less than optimal.

To date, Kaiser's (1960) rule is the most common method used in determining the number of factors (Lautenschlager, 1989). Kaiser's rule is simply to retain factors whose eigenvalues are greater than 1, on the assumption that retaining a factor that explains less variance than a single original variable is not psychometrically reasonable. Although Kaiser's "rule of thumb" is both simple and widely applied in psychological research, it is often criticized. For example, application of Kaiser's rule suggests that an eigenvalue of 1.01 represents a meaningful factor, whereas an eigenvalue of .99 does not. Using a rule that dictates a strict criterion eigenvalue of 1.00, regardless of the data or constructs being measured, is overly simplistic.

In addition to Kaiser's rule, another common method used in factor analysis is the scree test (Cattell, 1966). The scree test requires a plot of the obtained eigenvalues that is then subjected to visual inspection and interpretation. The researcher looks for a discontinuity in the plot of the eigenvalues that separates meaningful factors or components from the "scree," that is, the remaining inconsequential values. The scree test rests on the assumption that a researcher's visual inspection will clearly distinguish the discontinuity in the plot that separates eigenvalues that delineate major factors from those that represent noise or scree. In practice, visual interpretation of scree plots is inherently subjective and has been criticized by researchers, who often have difficulty locating discontinuities in the plots (Turner, 1998).

Although there are other options for factor determination, such as Lawley and Maxwell's (1963) maximum-likelihood method of factor extraction, Kaiser's rule and the scree plot are the techniques most frequently reported in the literature. One desirable alternative to both is a factor-analytic technique referred to as parallel analysis (PA; Horn, 1965). To date, PA has shown the most promising results as a method for determining the correct number of factors to retain in factor analysis (Fabrigar, Wegener, MacCallum, & Strahan, 1999; Humphreys & Montanelli, 1975; Zwick & Velicer, 1986). For example, Zwick and Velicer showed that PA recovered the correct number of factors in simulated data more accurately than other methods. An example is provided in Figure 1. Data representing 369 respondents and 44 variables were taken from a widely used multivariate textbook (Tabachnick & Fidell, 1996). The scree plot depicts the actual eigenvalues from the data, together with the criteria from both Kaiser's rule and parallel analysis. As can be seen from viewing the scree plot, a judgment can be made at the break in the plotted values somewhere between the 3rd and 6th eigenvalues, whereas parallel analysis clearly suggests keeping 5 factors, and Kaiser's rule suggests retaining 12 factors.

Despite the merits of PA, very little published factor-analytic research reports application of this factor-determining technique. We contend that PA is used so seldom, at least in part, because of the inaccessibility of programs that perform PA.

Correspondence should be addressed to W. P. Dunlap, Department of Psychology, Tulane University, New Orleans, LA 70118 (e-mail: dunlap@tulane.edu).
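To make Kaiser's rule concrete, a minimal sketch follows (ours, not part of the original article). It applies the eigenvalue-greater-than-1 criterion to a vector of hypothetical eigenvalues and shows how arbitrary the 1.01 versus .99 boundary is:

import numpy as np

# Hypothetical eigenvalues from a principal component analysis.
eigenvalues = np.array([4.20, 2.10, 1.30, 1.01, 0.99, 0.40])

# Kaiser's rule: retain every component whose eigenvalue exceeds 1.0.
n_retained = int(np.sum(eigenvalues > 1.0))
print(n_retained)  # 4 -- the 1.01 component is kept, but the .99 one is not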


[Figure 1. Plot of actual eigenvalues, parallel analysis values, and Kaiser's rule of 1, using data from Tabachnick and Fidell (1996).]

None of the commonly used existing software packages (i.e., SAS, SPSS, BMDP) has been equipped to handle PA, although O'Connor (2000) has provided a tutorial on how to use PA with SPSS and SAS. In addition, we could find very few readily available, easy-to-use PA computer programs described in the current literature. The only operational existing PA program in the literature is now 10 years old (Longman, Cota, Holden, & Fekken, 1989). The logic of Longman et al.'s program parallels that used in the development of the present program. Although the program by Longman et al. provides accurate values, its lack of current popularity may result from its requiring the researcher to know FORTRAN and to be able to compile FORTRAN programs. Therefore, the purpose of the present article and program is to build on and improve Longman et al.'s program and to provide researchers with a simple, updated, faster, Windows-based personal computer program for conducting PA.

Specifically, our PA program is simpler and faster than that provided by Longman et al. (1989) in the following ways: (1) Longman et al.'s program requires the researcher to change the program parameters, whereas the present program is interactive; (2) Longman et al.'s program must be recompiled each time a problem is run, and its output is sent to "unit 7," whereas ours appears on the researcher's computer monitor; (3) Longman et al.'s program requires the researcher to set the seed for random number generation, whereas ours uses an internal seed; (4) Longman et al.'s program uses double-precision subroutines, whereas the current program uses single-precision subroutines, the 5-6 decimal places of accuracy that single precision provides being more than adequate for the accuracy of the Monte Carlo approximations; and (5) Longman et al.'s program stores and sorts the eigenvalue cutoffs at differing probability levels, a feature that slows the program and requires more storage. The present program, presented in the Appendix, reports mean eigenvalues based on 1,000 iterations; however, if percentile information is desired, an alternative program is also provided that computes percentile results for eigenvalues specified by the user. Finally, (6) unlike Longman et al.'s program, the present program may be run in a Windows-based environment, and both programs may be sent to the researcher as compiled programs, rather than requiring a FORTRAN compiler or the IMSL subroutine package and the knowledge required to use them.

In the present article we not only provide an updated program for PA but also present tables of PA values for a range of sample sizes and numbers of variables that covers a substantial proportion of most psychological research needs. The following section provides a brief overview and demonstration of PA.

Parallel Analysis
PA involves a Monte Carlo approximation of the expected values of the ordered eigenvalues that would result from random independent normal variates, given the sample size and the original number of variables. The values are approximate because no closed mathematical solution for these values is known.
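To make the procedure concrete, the following sketch (a modern Python re-expression of the idea, not the authors' FORTRAN-IMSL program) carries out the Monte Carlo approximation just described: generate random independent N(0,1) data with the same N and p, compute the correlation matrix and its ordered eigenvalues, and average over iterations. It also records 95th percentile cutoffs of the kind mentioned in point (5) above:

import numpy as np

def parallel_analysis(n_subjects, n_vars, n_iter=1000, seed=1):
    # Mean and 95th-percentile eigenvalues of correlation matrices
    # computed from random independent N(0,1) data.
    rng = np.random.default_rng(seed)
    eigs = np.empty((n_iter, n_vars))
    for k in range(n_iter):
        data = rng.standard_normal((n_subjects, n_vars))
        corr = np.corrcoef(data, rowvar=False)
        eigs[k] = np.sort(np.linalg.eigvalsh(corr))[::-1]  # descending order
    return eigs.mean(axis=0), np.percentile(eigs, 95, axis=0)

means, p95 = parallel_analysis(50, 5)
print(np.round(means, 3))  # should approximate the N = 50 row of Table 1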
The rationale assumes that meaningful components extracted from actual data will have larger eigenvalues than those obtained from random normal variates generated with the same sample size and number of variables (Lautenschlager, 1989). Several articles have offered regression equations for approximating the expected eigenvalues used as PA criteria. Some of the earliest work was done by Montanelli and Humphreys (1976), who placed the squared multiple correlations of the correlation matrix being analyzed on the diagonal, which is the first-step approximation to principal axis factor analysis. In order to determine the number of factors to retain, Montanelli and Humphreys plotted the eigenvalues from the random and actual correlation matrices on the same axes. The point at which the curves cross indicates the number of factors to retain. This "crossover" point separates meaningful factors from noise factors, because it is assumed that a researcher will not be interested in a factor that accounts for less variance than the same factor obtained from a matrix of random independent normal data.

A decade later, Allen and Hubbard (1986) modified Montanelli and Humphreys's (1976) approach by applying PA to principal component analysis, using 1s rather than squared multiple Rs on the matrix diagonal. Lautenschlager, Lance, and Flaherty (1989) then provided revised regression equations that were superior to those of Allen and Hubbard because they made more exact estimates of the first eigenvalue. This improvement rests on the fact that the regression equations are recursive; initial eigenvalues affect estimates of later eigenvalues. Lautenschlager (1989) subsequently generated tables from these improved equations and lifted the restriction that eigenvalues must be estimated for a specific N (number of subjects) and p (number of variables) combination by providing linear interpolation methods to estimate nontabled N and p combinations. The approach in the present article is to avoid both the problem of using complex regression equations and that of fixed values of N and p, by providing a program that directly estimates expected eigenvalues for the needed values of N and p entered interactively by the user.

Despite the advances in the development of PA, application of this superior method of factor determination has been limited (Fabrigar et al., 1999). Lautenschlager's (1989) tables are based on sample sizes (Ns) ranging from 50 to 2,000 and ps ranging from 5 to 80. Although the choice of such values mirrors the sentiments of many researchers who have suggested that samples used in factor analysis should consist of 500 or more subjects (Comrey & Lee, 1992) and an N:p ratio of at least 10 (Everitt, 1975), in practice such conservative rules are often ignored. Furthermore, MacCallum, Widaman, Zhang, and Hong (1999) showed that rules of thumb such as these regarding sample size in factor analysis are not always valid. MacCallum et al. suggested that the smallest N, or the minimum N:p ratio, needed to ensure "good recovery of population factors is not constant across studies, but rather, is dependent on certain aspects of the variables and design in a given study" (p. 96).

In practice, factor analysis is often based on much smaller N and p combinations than the recommended rules of thumb discussed above. Fabrigar et al. (1999) surveyed two top-tier psychology journals, the Journal of Personality and Social Psychology (JPSP) and the Journal of Applied Psychology (JAP), from 1991 to 1995. The authors concluded that exploratory factor analysis continues to be an extremely popular statistical procedure in psychological research. They found that 159 of the 883 articles published in JPSP and 58 of the 455 articles published in JAP reported the use of exploratory factor analysis. Thus, the typical issue of these journals contained two or three articles using exploratory factor analysis. Despite the popularity of exploratory factor analysis, PA was used in only a single article in 5 years' publication of these two top-tier journals. We also surveyed JAP and another social psychology journal (Personality and Social Psychology Bulletin) for a single year, 1998. Consistent with the findings reported by Fabrigar et al. (1999), we found that exploratory factor analysis was used in 23 articles; however, none of the 23 reported the use of PA. In addition to noting the method of factor extraction used, we also recorded information on the mean, median, and standard deviation of the number of subjects used in these analyses. We found a mean N of 644, a median N of 180, and an SD of 1,700. The mean number of variables was 14, the median was 10, and the standard deviation was 11. The average number of identified factors was 2, the median was 2, and the standard deviation was 1. Our brief survey of the psychological literature revealed that about 23% of factor-analytic results were based on fewer than 100 subjects, and more than half of all reported factor analyses were based on samples of 180 or fewer. More than half of the studies were based on fewer than 10 variables, and 83% of the studies resulted in the identification of 2 or fewer factors (43% identified a single factor).

Thus, the results of our survey suggest that much exploratory factor-analytic research is based on relatively small numbers of subjects, variables, and factors. Therefore, one goal of the present article is to provide a table of chance eigenvalues that covers the smaller sample sizes common to much psychological research. On the basis of our preliminary survey, we present tabled values including Ns ranging from 30 to 200, in increments of 10. The tables also contain values for p ranging from 5 to 15, in increments of 5. In order to ensure the accuracy of our tabled estimates, all reported values are averaged over 1,000 Monte Carlo iterations.

A second goal of this article is to provide researchers with an interactive and easy-to-use PA program. It is our hope that the tabled values will meet the needs of many researchers; however, we realize that other N and p combinations will be needed. In order to estimate chance eigenvalues for specific N and p combinations not provided in the tables, one alternative is to perform linear interpolation based on the values that are provided. Another alternative is to compute the estimated values directly by entering the specific N and p combination into a PA program. Thus, we provide a simple, interactive FORTRAN PA computer program, available as source code or as a compiled version, that runs under PC Windows 95 and higher.

Tables and PA Program
For the tables, expected ordered eigenvalues were generated in a series of Monte Carlo simulations in which the number of variables (p) included 5, 10, and 15; sample sizes (N) ranged from 30 to 200 in increments of 10; and 1,000 iterations were used for each point. The program inputs a particular N and p combination and generates random normal data matrices using subroutines from the International Mathematical and Statistical Libraries (IMSL, 1982). IMSL was used to generate random data for N cases on each of p independent variables from an N(0,1) population, create a correlation matrix based on these data, and compute the eigenvalues of each correlation matrix. Output of the program described here has been checked against existing tables (Lautenschlager, 1989); the values differ by at most 1 point in the second decimal place, due to rounding (e.g., 1.392 vs. 1.407). Because our values are based on more Monte Carlo replications, we feel they are likely more accurate than previous tabled values. On average, the eigenvalues obtained using our program differ by at most approximately 2 points in the third decimal place when multiple runs of the same problem are compared.

The FORTRAN source code for this PA program, written to estimate eigenvalues for specific N and p combinations, is presented in the Appendix. Sample output from the PA program is presented in Tables 1-3. The program takes less than 5 sec to compute smaller problems (e.g., N < 100, p = 5) and approximately 30 sec to produce output for larger N and p combinations (e.g., N = 500, p = 10). We believe that our PA program is simple to use and that it will accommodate research problems with samples as large as 1,000 and as many as 50 variables; the source code can also be modified to fit larger requirements.

Table 1
Table of Eigenvalues From 1,000 Monte Carlo Simulations for N = 30-200 and p = 5

                       Ordered eigenvalue
Subjects (N)      1      2      3      4      5
  30           1.542  1.209   .972   .751   .526
  40           1.467  1.183   .977   .786   .587
  50           1.407  1.165   .987   .814   .626
  60           1.374  1.151   .987   .832   .657
  70           1.349  1.139   .987   .842   .683
  80           1.323  1.131   .990   .855   .701
  90           1.306  1.124   .991   .864   .715
 100           1.286  1.114   .994   .873   .732
 110           1.274  1.112   .991   .877   .744
 120           1.262  1.107   .992   .883   .756
 130           1.251  1.102   .993   .889   .764
 140           1.244  1.104   .993   .890   .770
 150           1.230  1.096   .995   .896   .782
 160           1.227  1.094   .993   .899   .787
 170           1.218  1.091   .995   .904   .793
 180           1.211  1.087   .994   .907   .801
 190           1.208  1.085   .995   .907   .804
 200           1.201  1.084   .995   .909   .811
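As a worked example of the linear interpolation alternative mentioned above (our illustration), chance eigenvalues for a nontabled N = 45 with p = 5 can be estimated from the N = 40 and N = 50 rows of Table 1:

# Linear interpolation between tabled PA criteria (Table 1, p = 5).
row_40 = [1.467, 1.183, 0.977, 0.786, 0.587]   # N = 40 row
row_50 = [1.407, 1.165, 0.987, 0.814, 0.626]   # N = 50 row
frac = (45 - 40) / (50 - 40)                   # position of N = 45
row_45 = [a + frac * (b - a) for a, b in zip(row_40, row_50)]
print([round(v, 3) for v in row_45])
# values fall midway between the tabled rows, e.g., 1.437 for the first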

Table 2
Table of Eigenvalues From 1,000 Monte Carlo Simulations for N = 30-200 and p = 10

                                      Ordered eigenvalue
Subjects (N)      1      2      3      4      5      6      7      8      9     10
  30           2.048  1.673  1.408  1.191   .999   .825   .672   .528   .395   .260
  40           1.873  1.578  1.352  1.172  1.015   .868   .727   .597   .474   .342
  50           1.778  1.514  1.324  1.157  1.017   .883   .761   .643   .524   .400
  60           1.695  1.464  1.294  1.149  1.019   .902   .789   .678   .565   .444
  70           1.649  1.432  1.278  1.141  1.019   .906   .802   .697   .594   .482
  80           1.594  1.405  1.259  1.131  1.020   .918   .821   .721   .621   .510
  90           1.560  1.374  1.244  1.126  1.023   .924   .830   .737   .644   .537
 100           1.528  1.359  1.232  1.118  1.021   .929   .842   .752   .662   .557
 110           1.507  1.342  1.220  1.117  1.020   .933   .847   .769   .669   .574
 120           1.483  1.330  1.212  1.112  1.024   .939   .855   .771   .686   .575
 130           1.459  1.310  1.202  1.107  1.020   .943   .863   .786   .703   .608
 140           1.443  1.301  1.194  1.104  1.023   .945   .868   .792   .711   .619
 150           1.429  1.291  1.190  1.101  1.023   .942   .874   .799   .721   .629
 160           1.411  1.281  1.183  1.099  1.021   .949   .877   .807   .729   .642
 170           1.402  1.272  1.178  1.099  1.019   .952   .880   .811   .737   .651
 180           1.387  1.266  1.174  1.093  1.021   .953   .885   .818   .743   .659
 190           1.378  1.255  1.167  1.090  1.020   .955   .889   .824   .753   .669
 200           1.366  1.250  1.163  1.089  1.020   .955   .891   .828   .760   .677

Table 3
Table of Eigenvalues From 1,000 Monte Carlo Simulations for N = 30-200 and p = 15

                                      Ordered eigenvalue
Subjects (N)      1      2      3      4      5      6      7      8      9     10
  30           2.448  2.066  1.782  1.544  1.341  1.164   .999   .854   .720   .596
  40           2.230  1.897  1.671  1.475  1.312  1.160  1.023   .892   .775   .665
  50           2.075  1.804  1.601  1.428  1.282  1.149  1.029   .917   .808   .711
  60           1.966  1.724  1.547  1.396  1.263  1.140  1.031   .931   .832   .742
  70           1.883  1.666  1.507  1.369  1.246  1.137  1.036   .940   .850   .764
  80           1.822  1.617  1.470  1.344  1.23   1.132  1.039   .947   .866   .784
  90           1.771  1.579  1.442  1.322  1.220  1.128  1.038   .953   .877   .799
 100           1.727  1.556  1.421  1.307  1.213  1.122  1.038   .957   .882   .807
 110           1.688  1.522  1.398  1.295  1.202  1.116  1.038   .963   .892   .822
 120           1.657  1.501  1.384  1.283  1.196  1.114  1.037   .966   .896   .831
 130           1.627  1.478  1.365  1.271  1.188  1.111  1.038   .968   .904   .840
 140           1.601  1.462  1.353  1.263  1.182  1.108  1.038   .970   .906   .843
 150           1.580  1.438  1.342  1.253  1.176  1.105  1.037   .973   .912   .851
 160           1.561  1.427  1.328  1.244  1.169  1.100  1.036   .974   .914   .856
 170           1.543  1.412  1.321  1.237  1.167  1.101  1.037   .976   .918   .863
 180           1.525  1.404  1.312  1.234  1.163  1.097  1.035   .977   .921   .866
 190           1.506  1.389  1.301  1.225  1.157  1.095  1.037   .979   .923   .871
 200           1.497  1.381  1.294  1.218  1.157  1.093  1.035   .980   .926   .874

Note. Because of space limitations, the table presents only the first 10 eigenvalues, but only the first p/2 values are usually needed.

Discussion
A recent multivariate text by Sharma (1996) recommends the use of PA as the best way of determining the number of factors; however, that text recommends application of the Allen and Hubbard (1986) regression equations. We feel that, because such equations are difficult for researchers to use, the psychological literature rarely reports application of the PA technique, despite its merits. It is hoped that the interactive program described herein will ameliorate this problem.

The goals of the present article were twofold: (1) to increase the general level of knowledge concerning PA and (2) to provide a PA computer program. In addition, we provided tabled values that may be used to ensure that the program runs correctly and to serve as references for estimated eigenvalues based on a range of N and p combinations commonly found in the psychological literature. Application of the PA program, however, allows researchers greater flexibility and the ability to address N and p combinations not provided in the tables.

While our program was in the process of review, this journal received another article (O'Connor, 2000) describing SAS and SPSS programs for accomplishing PA, as well as similar programs for Velicer's (1976) minimum average partial (MAP) test. We and O'Connor both recognized that, although various statistical articles have documented the advantages of PA and MAP relative to the older procedures used in statistical program packages, these superior procedures were seldom used in published exploratory factor analysis. We both felt that a probable cause for the disparity between theory and practice was the lack of modern PC-compatible software to accomplish the more modern procedures. O'Connor's approach was to employ the computing power of the SPSS and SAS packages, whereas we provide a PC-compatible stand-alone program in FORTRAN. Both provide basically the same answers for PA, although ours is a bit faster, has interactive parameter specification, and provides only mean values, because computation of percentiles requires storage and sorting, which is both space- and time-consuming; such computations could be easily added to the present software.

Program Availability
The PA program software may be obtained by sending a PC-formatted 3.5-in. disk together with a self-addressed, stamped disk mailer addressed to the authors at 2007 Percival Stern Hall, Department of Psychology, Tulane University, New Orleans, LA 70118-5698. J.D.K. may also be reached by e-mail at jkaufma@mailhost.tcs.tulane.edu, and the program may be obtained as an e-mail attachment (an executable file) or downloaded from the following web site: http://studentweb.tulane.edu/~jkaufma. The PA program is capable of running on Windows 95 and higher.

REFERENCES

ALLEN, S. J., & HUBBARD, R. (1986). Regression equations for the latent roots of random data correlation matrices with unities on the diagonal. Multivariate Behavioral Research, 21, 393-398.
CATTELL, R. B. (1966). The scree test for the number of factors. Multivariate Behavioral Research, 1, 245-276.
COMREY, A. L., & LEE, H. B. (1992). A first course in factor analysis. Hillsdale, NJ: Erlbaum.
COTA, A. A., LONGMAN, R. S., HOLDEN, R. R., FEKKEN, G. C., & XINARIS, S. (1993). Interpolating 95th percentile eigenvalues from random data: An empirical example. Educational & Psychological Measurement, 53, 585-596.
EVERITT, B. S. (1975). Multivariate analysis: The need for data, and other problems. British Journal of Psychiatry, 126, 237-240.
FABRIGAR, L. R., WEGENER, D. T., MACCALLUM, R. C., & STRAHAN, E. J. (1999). Evaluating the use of exploratory factor analysis in psychological research. Psychological Methods, 4, 272-299.
HORN, J. L. (1965). A rationale and test for the number of factors in factor analysis. Psychometrika, 30, 179-185.
HUMPHREYS, L. G., & MONTANELLI, R. G., JR. (1975). An investigation of the parallel analysis criterion for determining the number of common factors. Multivariate Behavioral Research, 10, 193-205.
INTERNATIONAL MATHEMATICAL AND STATISTICAL LIBRARIES (1982). International mathematical and statistical libraries reference manual (9th ed.). Houston: Author.
KAISER, H. F. (1960). The application of electronic computers to factor analysis. Educational & Psychological Measurement, 20, 141-151.
LAUTENSCHLAGER, G. J. (1989). A comparison of alternatives to conducting Monte Carlo analyses for determining parallel analysis criteria. Multivariate Behavioral Research, 24, 365-395.
LAUTENSCHLAGER, G. J., LANCE, C. E., & FLAHERTY, V. L. (1989). Parallel analysis criteria: Revised equations for estimating the latent roots of random data correlation matrices. Educational & Psychological Measurement, 49, 339-345.
LAWLEY, D. N., & MAXWELL, A. E. (1963). Factor analysis as a statistical method. London: Butterworth.
LONGMAN, R. S., COTA, A. A., HOLDEN, R. R., & FEKKEN, G. C. (1989). PAM: A double precision FORTRAN routine for the parallel analysis method in principal components analysis. Behavior Research Methods, Instruments, & Computers, 21, 477-480.
MACCALLUM, R. C., WIDAMAN, K. F., ZHANG, S., & HONG, S. (1999). Sample size in factor analysis. Psychological Methods, 4, 84-99.
MONTANELLI, R. G., JR., & HUMPHREYS, L. G. (1976). Latent roots of random data correlation matrices with squared multiple correlations on the diagonal: A Monte Carlo study. Psychometrika, 41, 341-348.
O'CONNOR, B. P. (2000). SPSS and SAS programs for determining the number of components using parallel analysis and Velicer's MAP test. Behavior Research Methods, Instruments, & Computers, 32, 396-402.
SHARMA, S. (1996). Applied multivariate techniques. New York: Wiley.
TABACHNICK, B. G., & FIDELL, L. S. (1996). Using multivariate statistics. New York: HarperCollins.
TURNER, N. E. (1998). The effect of common variance and structure pattern on random data eigenvalues: Implications for the accuracy of parallel analysis. Educational & Psychological Measurement, 58, 541-568.
VELICER, W. F. (1976). Determining the number of components from the matrix of partial correlations. Psychometrika, 41, 321-327.
ZWICK, W. R., & VELICER, W. F. (1986). Comparison of five rules for determining the number of components to retain. Psychological Bulletin, 99, 432-442.

APPENDIX

c expected eigenvalues (for parallel analysis)
c uses "rnset, rnnof, rnnoa, corvc, & evasf" from IMSL
      use numerical_libraries
      dimension x(50),dat(1000,50),r(50,50),inc(1,1),e(50),al(50)
      logical sml
      sml=.false.
c reset the random number generator & discard 1st random number
      call rnset(0)
      gb=rnnof()
c set the number of iterations
      nitr=1000
      xitr=nitr
c enter problem parameters (number of variables and subjects)
    1 write(*,2)
    2 format(/' Enter no. variables, no. subjects (0,0 to stop) ',$)
      read(*,*)m,n
      if(m.le.0.or.n.le.0)stop
      if(m.le.50.and.n.le.1000)goto 4
      write(*,3)
    3 format(' Problem exceeds current maximum of 50 variables'/
     +' or 1000 subjects')
      stop
    4 continue
c zero the accumulators for the ordered eigenvalues
      do 5 i=1,m
      al(i)=0.0
    5 continue
c start iterations here: generate random N(0,1) data, correlate,
c extract eigenvalues, and accumulate them across iterations
      do 8 k=1,nitr
      do 6 i=1,n
      call rnnoa(m,x)
      do 6 j=1,m
      dat(i,j)=x(j)
    6 continue
      call corvc(0,n,m,dat,1000,0,0,0,2,x,r,50,inc,1,no,nm,gb)
      call evasf(m,m,r,50,sml,e)
      do 7 j=1,m
      al(j)=al(j)+e(j)
    7 continue
    8 continue
c average the accumulated eigenvalues over all iterations
      do 9 j=1,m
      al(j)=al(j)/xitr
    9 continue
      write(*,10)
   10 format(' Expected eigenvalues (1000 iterations)')
      write(*,11)(al(j),j=1,m)
   11 format(1x,10f7.3)
      goto 1
      end
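For reference, a run of the compiled program might look like the following (illustrative only; the expected values shown are the N = 50, p = 5 entries from Table 1, and individual runs will differ slightly in the third decimal place):

 Enter no. variables, no. subjects (0,0 to stop) 5,50
 Expected eigenvalues (1000 iterations)
   1.407  1.165  0.987  0.814  0.626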

(Manuscript received May 28, 1999; revision accepted for publication March 12, 2000.)