Modern Robust Data Analysis Methods: Measures of Central Tendency
Psychological Methods, 2003, Vol. 8, No. 3, 254–274
Copyright 2003 by the American Psychological Association, Inc. 1082-989X/03/$12.00 DOI: 10.1037/1082-989X.8.3.254

Rand R. Wilcox, Department of Psychology, University of Southern California
H. J. Keselman, Department of Psychology, University of Manitoba

Various statistical methods, developed after 1970, offer the opportunity to substantially improve upon the power and accuracy of the conventional t test and analysis of variance methods for a wide range of commonly occurring situations. The authors briefly review some of the more fundamental problems with conventional methods based on means; provide some indication of why recent advances, based on robust measures of location (or central tendency), have practical value; and describe why modern investigations dealing with nonnormality find practical problems when comparing means, in contrast to earlier studies. Some suggestions are made about how to proceed when using modern methods.

Author note: Work on this article by H. J. Keselman was supported by a grant from the Natural Sciences and Engineering Research Council of Canada. We are grateful to M. Earleywine and David Schwartz for comments on an earlier version of this article. Correspondence concerning this article should be addressed to Rand R. Wilcox, Department of Psychology, University of Southern California, Los Angeles, California 90089-1061. E-mail: [email protected]

The advances and insights achieved during the last half century in statistics and quantitative psychology provide an opportunity for substantially improving psychological research. Recently developed methods can provide substantially more power when the standard assumptions of normality and homoscedasticity are violated. They also help deepen our understanding of how groups differ. The theoretical and practical advantages of modern technology have been documented in several books (e.g., Hampel, Ronchetti, Rousseeuw, & Stahel, 1986; Hoaglin, Mosteller, & Tukey, 1983, 1985; Huber, 1981; Rousseeuw & Leroy, 1987; Wilcox, 1997a, 2001, 2003) and journal articles. Yet, most applied researchers continue to believe that conventional methods for making inferences about means perform well in terms of both controlling the Type I error rate and maintaining a relatively high level of statistical power. Although several classic articles describe situations in which this view is correct, more recent publications provide a decidedly different picture of the robustness of conventional techniques. In terms of avoiding actual Type I error probabilities larger than the nominal level (e.g., α = .05), modern methods and conventional techniques produce similar results when groups have identical distributions. However, when distributions differ in skewness or have unequal variances, modern methods can have substantially more power, they can have more accurate confidence intervals, and they can provide better control over the probability of a Type I error. We also indicate why some commonly used strategies for correcting problems with methods based on means fail, and we summarize some recent strategies that appear to have considerable practical value.

Articles summarizing some of the basic problems with conventional methods and how they might be addressed have appeared in technical journals (e.g., Wilcox, 1998a) and basic psychology journals (e.g., Wilcox, 1998b). Wilcox (2001) also provided a nontechnical description of practical problems with conventional methods and how modern technology addresses these issues. Our goal here is to expand on this previous work by summarizing some recent advances.¹ But first, we review basic principles motivating modern methods.

¹ This article does not discuss recent advances related to rank-based methods, but this is not to suggest that they have no practical value. For a summary of important and useful developments related to rank-based techniques, see Brunner, Domhof, and Langer (2002); Cliff (1996); and Wilcox (2003).

We do not promote a single approach to data analysis but rather argue that modern technology as a whole has much to offer psychological research. No single statistical method is ideal in all situations encountered in applied work. In terms of maximizing power, for example, modern methods often have a considerable advantage, but the optimal method depends in part on how the groups differ in the population, which will be unknown to the researcher. Modern methods can also provide useful new perspectives that help us develop a better understanding of how groups differ. In the Appendix to this article, we provide an overview of statistical software that can implement the modern methods to be described.

Some Basic Problems

We begin with the one-sample case in which the goal is either (a) to test

H₀: μ = μ₀,

the hypothesis that the population mean μ is equal to some specified constant μ₀, or (b) to compute a confidence interval for μ. When data in a single sample are analyzed, two departures from normality cause problems: skewness and outliers.

Skewness

First, we illustrate the effects of skewness. Imagine we want to assess an electroencephalographic (EEG) measure at a particular site in the brain for individuals convicted of murder. If we randomly sample n participants, the most commonly used approach is to estimate the population mean μ with the sample mean, M. The conventional 1 − α confidence interval for μ is

M ± t_{1−α/2}(SD/√n),  (1)

where SD is the sample standard deviation and t_{1−α/2} is the 1 − α/2 quantile of Student's t distribution with n − 1 degrees of freedom. This confidence interval is based on the assumption that

T = (M − μ)/(SD/√n)  (2)

has a Student's T distribution. If this assumption is reasonably correct, control over the probability of a Type I error, when hypotheses are tested, will be achieved.²

² We are following the convention that uppercase letters represent random variables and lowercase letters represent specific values (e.g., Hogg & Craig, 1970). So T represents Student's T random variable, and t represents a specific value, such as the .975 quantile.
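To make Equations 1 and 2 concrete, here is a minimal sketch of how they can be computed with standard scientific Python tools; the code and function names are our illustration, not part of the article.

```python
import numpy as np
from scipy import stats

def mean_ci(x, alpha=0.05):
    """Conventional 1 - alpha confidence interval for the mean (Equation 1)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    m = x.mean()                                   # sample mean M
    sd = x.std(ddof=1)                             # sample standard deviation SD
    t_crit = stats.t.ppf(1 - alpha / 2, df=n - 1)  # 1 - alpha/2 quantile, n - 1 df
    half_width = t_crit * sd / np.sqrt(n)
    return m - half_width, m + half_width

def t_statistic(x, mu0):
    """Student's T statistic for testing H0: mu = mu0 (Equation 2)."""
    x = np.asarray(x, dtype=float)
    return (x.mean() - mu0) / (x.std(ddof=1) / np.sqrt(x.size))
```

Under normality this interval has probability 1 − α of containing μ; the examples that follow show how skewness undermines that guarantee.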
Following an example by Westfall and Young (1993), suppose that, unknown to us, observations are sampled from the skewed (lognormal) distribution shown in the left panel of Figure 1. The dotted line in the right panel of Figure 1 shows the actual distribution of T when n = 20. The smooth symmetrical curve shows the distribution of T when a normal distribution is sampled.

Figure 1. The left panel shows a lognormal distribution. The right panel shows an approximation of the distribution of Student's T when 20 observations are sampled from a lognormal distribution. The solid line represents the distribution under normality.

As is evident, there is a serious discrepancy between the actual distribution of T and the distribution under normality, and this results in very poor control over the probability of a Type I error when n = 20. Westfall and Young (p. 40) noted that problems with controlling the probability of a Type I error persist even when n = 160. When the stated alpha is equal to .10, and when one is performing a two-tailed test, the actual probability of a Type I error for the lower tail is .11 (vs. α/2 = .05). For the upper tail, the actual probability of a Type I error is .02. Thus, when one is performing the usual two-tailed test at the .10 level, the actual probability of a Type I error is .11 + .02 = .13.
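These tail probabilities are easy to check by simulation. The sketch below is our illustration, not code from the article: it approximates the actual distribution of T when sampling from a lognormal distribution, using n = 160 as in the Westfall and Young example, and estimates each one-tailed rejection rate of the nominal .10-level two-tailed test. The replication count and seed are arbitrary choices.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
n, reps, alpha = 160, 50_000, 0.10
mu = np.exp(0.5)  # true mean of a standard lognormal distribution

# Simulate the sampling distribution of T (Equation 2) under lognormal data.
samples = rng.lognormal(mean=0.0, sigma=1.0, size=(reps, n))
m = samples.mean(axis=1)
sd = samples.std(axis=1, ddof=1)
t_vals = (m - mu) / (sd / np.sqrt(n))

# Each tail rejection rate should be alpha/2 = .05 if T really followed a
# Student's t distribution; the article reports roughly .11 and .02 instead.
t_crit = stats.t.ppf(1 - alpha / 2, df=n - 1)
lower = np.mean(t_vals < -t_crit)
upper = np.mean(t_vals > t_crit)
print(f"lower tail: {lower:.3f}, upper tail: {upper:.3f}, total: {lower + upper:.3f}")
```

Because the lognormal distribution is skewed to the right, the distribution of T is skewed to the left, which inflates the lower-tail rejection rate and deflates the upper-tail rate.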
Any hypothesis-testing method is said to be unbiased if the probability of rejecting is minimized when the null hypothesis is true; otherwise, it is said to be biased. In cases such as the example above, where for the lower tail the actual probability of a Type I error is less than α/2, power can actually decrease as the true mean increases (Wilcox, 2003). This means that for a two-tailed test, Student's T is biased. That is, there are situations in which we are more likely to reject when the null hypothesis is true than when it is false.

Extending the Westfall and Young (1993) results (based on 10,000 T values), we find that for the same skewed, lognormal distribution, if an actual Type I error probability less than or equal to .08 is deemed adequate, nonnormality is not an issue in terms of Type I error probabilities and accurate confidence intervals with a sample size of at least 200.

Now, to expand on the problems with Student's T summarized by Westfall and Young (1993), suppose observations are sampled from the skewed distribution³ shown in Figure 2. In samples from this distribution, outliers are more common than they are for the lognormal distribution in Figure 1. The dotted line in the left panel of Figure 3 shows 5,000 T values with n = 20. The smooth symmetrical curve is the distribution of T assuming normality. Under normality there is a .95 probability that T will be between −2.09 and 2.09; these are the .025 and .975 quantiles of the distribution of T. However, when one samples from the distribution …

³ This distribution arises by sampling from a chi-square distribution having 4 degrees of freedom and, with probability .1, multiplying an observation by 10. (That is, generate X from a chi-square distribution, generate u from a uniform distribution, and if u < .1, multiply X by 10.) When one is sampling from this distribution with n = 20, the median number of outliers is 2; the mean is about 1.9; and with probability about .95, the number of outliers is less than or equal to 3.
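Because footnote 3 spells out the generating mechanism, the contaminated chi-square distribution of Figure 2 is straightforward to sample from. The following sketch is our own rendering of that recipe; the function name and its keyword defaults are ours.

```python
import numpy as np

def contaminated_chisq(n, df=4, p=0.1, factor=10.0, rng=None):
    """Sample from the skewed, outlier-prone distribution of footnote 3.

    Generate X from a chi-square distribution with `df` degrees of freedom,
    generate u from a uniform distribution, and multiply X by `factor`
    whenever u < p.
    """
    rng = np.random.default_rng() if rng is None else rng
    x = rng.chisquare(df, size=n)
    u = rng.uniform(size=n)
    return np.where(u < p, factor * x, x)

# With n = 20, the article notes a median of 2 outliers per sample and,
# with probability about .95, no more than 3.
sample = contaminated_chisq(20, rng=np.random.default_rng(1))
```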