ROBUST WITHIN GROUPS ANOVA: DEALING WITH MISSING VALUES

Jinxia Ma1, Rand R. Wilcox1

1Dept. of Psychology, University of Southern California, Los Angeles, CA 90089-1061, United States

Abstract  The paper considers the problem of testing the hypothesis that J ≥ 2 dependent groups have equal population measures of location when using a robust estimator and there are missing values. For J = 2, methods have been studied based on trimmed means. But the methods are not readily extended to the case J > 2. Here, two alternative test statistics were considered, one of which performed poorly in some situations. The one method that performed well in simulations is based on a very simple test statistic with the null distribution approximated via a basic bootstrap technique. The method uses all of the available data to estimate each of the marginal (population) trimmed means. Other robust measures of location were considered, for which imputation methods have been derived, but in simulations the actual Type I error probability was estimated to be substantially less than the nominal level, even when there are no missing values.

Keywords  Trimmed means, Minimum Covariance Determinant, OGK estimator, TBS estimator, Bootstrap methods

1 Introduction

Consider J ≥ 2 dependent groups and let θj (j = 1, ..., J) be some robust measure of location associated with the jth marginal distribution. The paper considers the problem of testing

H0 : θ1 = θ2 = ··· = θJ    (1)

when there are missing values. A simple strategy is to use a complete case analysis. That is, exclude any rows of data where there are missing values and analyze the data that remain. But this approach might result in a reduction in power when testing hypotheses (e.g., [15]).

For the special case where θj is taken to be the population mean, various methods for handling missing observations have been derived and studied (e.g., [1, 20, 5, 29]). Approaches include a maximum likelihood method [12, 13], a weighted adjustment method [4], single imputation [23] and multiple imputation [28, 16]. When testing (1), a simple strategy is to impute missing values and then use some conventional test statistic in the usual manner. It is known, however, that this approach can be unsatisfactory, as noted for example by Liang et al. [15] as well as Wang and Rao [32].

Ludbrook [18] suggested using a permutation test when comparing groups. A positive feature of this permutation test is that when testing the hypothesis of identical distributions, the exact probability of a Type I error can be determined. But as a method for testing (1), it can be unsatisfactory even when there are no missing values and the goal is to compare measures of location (e.g., [2, 25]).

Let (X1, Y1), ..., (Xn, Yn) be a random sample of size n from some bivariate distribution and let Di = Xi − Yi (i = 1, ..., n). Let µD be the corresponding population mean of D. An approach to missing values using an empirical likelihood method, based on single imputation, was derived by Liang et al. [15], which is readily adapted to the problem of computing a confidence interval for µD even when there are no missing values. But for heavy-tailed distributions, it can be unsatisfactory. For example, suppose D has the contaminated normal distribution

H(x) = (1 − ε)Φ(x) + εΦ(x/10),    (2)

where Φ(x) is the cumulative distribution function of the standard normal distribution. Based on a simulation with 5000 replications, if the sample size is n = 100 and ε = 0.1, the actual Type I error is estimated to be 0.084 when testing at the .05 level. Using the Bartlett correction studied by DiCiccio et al. [8], the actual probability coverage is 0.078. Here it was found that even with a few missing values, the probability coverage deteriorates, and so this approach was abandoned.

Even when there are no missing values, there are well known concerns about inferential methods based on means. A basic concern is that the population mean is not robust (e.g., [11, 30]). Moreover, the sample mean has a breakdown point of only 1/n, roughly meaning that even a single outlier can result in the sample mean being arbitrarily large or small. Also, outliers can result in very poor power relative to methods based on some robust measure of location (e.g., [34]). Indeed, even a small departure from normality can result in a relatively large standard error and poor power.

One robust estimator that has been studied extensively is a 20% trimmed mean. A reason for considering 20% trimming, rather than other amounts of trimming, is that its efficiency compares well to the mean under normality, but unlike the mean, its standard error remains relatively small as we move toward more heavy-tailed distributions [26].

Consequently, under normality, power is nearly the same as for hypothesis testing methods based on means. For J = 2, methods for handling missing values, when using a 20% trimmed mean, have been studied that perform well in simulations [33]. But it is unclear how these methods might be generalized to J > 2. (Arguments for considering other robust estimators can be made, but for the situation at hand they were found to be unsatisfactory for reasons outlined in Section 2.0.1.)

Here, two test statistics were considered, coupled with several robust measures of location. Among these methods, only one performed well in simulations in terms of controlling the probability of a Type I error in a reasonably accurate manner. Details of this method are given in Section 2 along with a brief outline of the methods that were considered and abandoned.

2 Methods That Were Considered

The one measure of location that performed well in simulations is the 20% trimmed mean. Momentarily consider a single random sample: X1, ..., Xn. The γ-trimmed mean is computed as follows. Let X(1) ≤ ··· ≤ X(n) be the values written in ascending order and let g = [nγ], 0 ≤ γ < 0.5, where [nγ] is the greatest integer less than or equal to nγ. Then the γ-trimmed mean is

X̄t = (X(g+1) + ··· + X(n−g)) / (n − 2g).

A 20% trimmed mean corresponds to γ = 0.2.

Now consider a random sample from some unknown J-variate distribution: (Xi1, ..., XiJ), i = 1, ..., n. Rather than discard any row of data having one or more missing values, all of the data are used to compute each marginal trimmed mean. So if the jth column of data has no missing values, all n values are used even if there are missing values in some other column. If the jth column contains, say, m missing values, then the trimmed mean is based on the n − m values that are available.

To test (1), where now θj is taken to be the population trimmed mean, the test statistic is

Q = Σj (X̄tj − X̄t)²,

where X̄t = Σj X̄tj / J and X̄tj is the trimmed mean for the jth group. The strategy is to approximate the null distribution of Q via a bootstrap method. In more detail, let Cij = Xij − X̄tj. That is, shift the empirical distributions so that the null hypothesis is true. Next, a bootstrap sample is obtained by resampling, with replacement, n rows of data from the matrix

C11 ··· C1J
 ⋮       ⋮
Cn1 ··· CnJ

yielding

C*11 ··· C*1J
 ⋮        ⋮
C*n1 ··· C*nJ.

For the jth column of the bootstrap sample just generated, compute X̄*tj, the 20% trimmed mean. Note that all of the non-missing values are being used. Then compute

Q* = Σj (X̄*tj − X̄*t)²,

where X̄*t = Σj X̄*tj / J. Repeat this process B times yielding Q*1, ..., Q*B. Put these B values in ascending order yielding Q*(1) ≤ ··· ≤ Q*(B). Let α be the nominal Type I error probability and reject the hypothesis of equal 20% trimmed means if Q > Q*(c), where c = (1 − α)B rounded to the nearest integer.

Here, B = 599 was used, which has been found to perform well, in terms of controlling the Type I error probability, when testing other hypotheses based on some robust estimator [34]. But in terms of power, a larger choice for B might have practical value. Racine and MacKinnon [22] discuss this issue at length. (Also see [14].) Davidson and MacKinnon [7] proposed a pretest procedure for choosing B.

2.0.1 Alternative Methods That Performed Poorly

Although the trimmed mean used here guards against the deleterious effects of outliers among the marginal distributions, a possible criticism is that it does not deal with outliers in a manner that takes into account the overall structure of the data. There are several (affine equivariant) estimators that deal with this issue (e.g., Wilcox, 2012, Section 6.3). For some of these estimators, imputation methods have been derived (e.g., [31, 6]). So a simple strategy would be to impute missing values and then use the test statistic Q in conjunction with the bootstrap method just described. However, even with no missing values, control over the Type I error probability was found to be poor. The estimators considered were the minimum covariance determinant estimator (see, for example, [27]), the Donoho and Gasko [9] trimmed mean, the orthogonal Gnanadesikan-Kettenring (OGK) estimator derived by Maronna and Zamar [19], and the translated-biweight S-estimator proposed by Rocke [24]. A basic concern is that when using Q and testing at the .05 level, the actual level can be less than .01 even with n = 60. (When using the method in [6], via the R package GSE, bootstrap samples were encountered where the measure of location could not be computed.) Consequently, further details are omitted.

An alternative testing method, based on trimmed means, was considered, details of which can be found in Wilcox (2012, p. 393). Missing data were handled as previously indicated. But situations were found where the actual level was estimated to be much higher than the nominal level, so for brevity, further details are omitted.

3 Simulation Results

Simulations were used to study the finite sample properties of the method in Section 2. It is known that, generally, both skewness and kurtosis can impact the probability of a Type I error when comparing measures of location.
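The bootstrap test of (1) based on the statistic Q, as described in Section 2, can be sketched as follows. This is an illustrative Python reimplementation under stated assumptions (groups as columns of a matrix, missing values coded as NaN), not the authors' R code; the function names are invented for the sketch.

```python
import numpy as np

def trim20(x):
    """20% trimmed mean of the non-missing (NaN-coded) entries of x."""
    x = np.sort(x[~np.isnan(x)])   # use all available values in this column
    g = int(0.2 * x.size)          # greatest integer <= 0.2 * n
    return x[g:x.size - g].mean()

def boot_test(X, B=599, alpha=0.05, seed=None):
    """Sketch of the bootstrap test of equal population 20% trimmed means
    for the J dependent groups given by the columns of X (missing = NaN).
    Returns (Q, bootstrap critical value, reject?)."""
    rng = np.random.default_rng(seed)
    n, J = X.shape
    tbar = np.array([trim20(X[:, j]) for j in range(J)])
    Q = np.sum((tbar - tbar.mean()) ** 2)
    C = X - tbar                   # shift columns so the null is true
    Qstar = np.empty(B)
    for b in range(B):
        Cb = C[rng.integers(0, n, size=n)]   # resample n whole rows
        tb = np.array([trim20(Cb[:, j]) for j in range(J)])
        Qstar[b] = np.sum((tb - tb.mean()) ** 2)
    Qstar.sort()
    c = round((1 - alpha) * B)     # index rounded to the nearest integer
    crit = Qstar[c - 1]            # Q*(c) in the 1-based notation of the text
    return Q, crit, Q > crit
```

Each bootstrap sample resamples whole rows, preserving the dependence among groups, and each marginal trimmed mean uses whatever non-missing values the resampled column contains, mirroring the description in Section 2.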

Data were generated where the marginal distributions have one of four distributions: normal, skewed and light-tailed, symmetric and heavy-tailed, and skewed and heavy-tailed. More precisely, a g-and-h distribution [10] was used to generate the data. To elaborate, let Z be a random variable generated from a standard normal distribution. Then

W = ((exp(gZ) − 1)/g) exp(hZ²/2),  if g > 0,
W = Z exp(hZ²/2),                  if g = 0,

has a g-and-h distribution. The parameters g and h determine the skewness and kurtosis of the distribution. (Data were generated with the R function rmul in the R package WRS.)

The four distributions considered here are: standard normal (g = 0, h = 0), asymmetric light-tailed (g = 0.2, h = 0), symmetric heavy-tailed (g = 0, h = 0.2), and asymmetric heavy-tailed (g = 0.2, h = 0.2). Table 1 summarizes some properties of the g-and-h distribution, where κ1 indicates skewness and κ2 indicates kurtosis.

Table 1. Some properties of the g-and-h distribution

  g    h     κ1      κ2
 0.0  0.0   0.00    3.00
 0.0  0.2   0.00   21.46
 0.2  0.0   0.61    3.68
 0.2  0.2   2.81  155.98

Here the sample size used was n = 30. Let mj be the number of missing values in the jth group. For J = 2, 4 and 6 groups, the number of missing values was taken to be (m1, m2) = (5, 5), (m1, m2, m3, m4) = (5, 5, 5, 0) and (m1, m2, m3, m4, m5, m6) = (5, 5, 5, 5, 5, 0), respectively. For the first group, the first m1 values are taken to be missing. More precisely, for the first five rows of data, the values for variable 1 are missing but the values of the other variables are available. For rows 6-10, the values for the second group are missing, but the values for the other groups are observed, etc. Table 2 reports the estimated Type I error probabilities based on 2000 replications. Based on results in Pratt (1968), the hypothesis that the actual level is .05 would be rejected if the estimated level is less than or equal to .04 or greater than or equal to .06. Power, if the actual level is .03, is equal to .995. If the actual level is .07, power is .959.

Table 2. Estimated Type I error probabilities when there are missing values, α = 0.05, n = 30

  g    h    J = 2   J = 4   J = 6
 0.0  0.0   .067    .046    .041
 0.2  0.0   .066    .043    .038
 0.0  0.2   .056    .032    .024
 0.2  0.2   .055    .030    .023

Although the seriousness of a Type I error depends on the situation, Bradley [3] suggests that, as a general guide, at a minimum the actual Type I error probability should be between 0.025 and 0.075 when testing at the .05 level. All indications are that this criterion is met among the situations considered when using the bootstrap method in Section 2 based on the test statistic Q.

4 Conclusions

As is evident, simulation results cannot guarantee that the bootstrap method considered here, used in conjunction with a 20% trimmed mean and the test statistic Q, will always provide reasonably accurate control over the Type I error probability. The main result is that it is the only method that performed reasonably well among the situations and location estimators that were considered. Indeed, the other (affine equivariant) robust estimators that were considered did not perform well even with no missing values.

References

[1] Allison, P. D. (2001). Missing Data. Thousand Oaks, CA: Sage.

[2] Boik, R. J. (1987). The Fisher-Pitman permutation test: A non-robust alternative to the normal theory F test when variances are heterogeneous. British Journal of Mathematical and Statistical Psychology, 40, 26–42.

[3] Bradley, J. V. (1978). Robustness? British Journal of Mathematical and Statistical Psychology, 31, 144–152.

[4] Cochran, W. G. (1977). Sampling Techniques (3rd Ed.). New York: Wiley.

[5] Daniels, M. J. & Hogan, J. W. (2008). Missing Data in Longitudinal Studies: Strategies for Bayesian Modeling and Sensitivity Analysis. Boca Raton, FL: Chapman & Hall/CRC.

[6] Danilov, M., Yohai, V. J. & Zamar, R. H. (2012). Robust estimation of multivariate location and scatter in the presence of missing data. Journal of the American Statistical Association, 107, 1178–1186.

[7] Davidson, R. & MacKinnon, J. G. (2000). Bootstrap tests: How many bootstraps? Econometric Reviews, 19, 55–68.

[8] DiCiccio, T., Hall, P. & Romano, J. (1991). Empirical likelihood is Bartlett-correctable. Annals of Statistics, 19, 1053–1061.

[9] Donoho, D. L. & Gasko, M. (1992). Breakdown properties of the location estimates based on halfspace depth and projected outlyingness. Annals of Statistics, 20, 1803–1827.

[10] Hoaglin, D. C. (1985). Summarizing shape numerically: The g-and-h distributions. In D. Hoaglin, F. Mosteller & J. Tukey (Eds.), Exploring Data Tables, Trends, and Shapes (pp. 461–515). New York: Wiley.

[11] Huber, P. J. & Ronchetti, E. (2009). Robust Statistics, 2nd Ed. New York: Wiley.

[12] Ibrahim, J. G., Chen, M. H. & Lipsitz, S. R. (1999). Monte Carlo EM for missing covariates in parametric regression models. Biometrics, 55, 591–596.

[13] Ibrahim, J. G., Lipsitz, S. R. & Horton, N. (2001). Using auxiliary data for parameter estimation with nonignorable missing outcomes. Applied Statistics, 50, 361–373.

[14] Jöckel, K.-H. (1986). Finite sample properties and asymptotic efficiency of Monte Carlo tests. Annals of Statistics, 14, 336–347.

[15] Liang, H., Su, H. & Zou, G. (2008). Confidence intervals for a common mean with missing data with applications in an AIDS study. Computational Statistics & Data Analysis, 53, 546–553.

[16] Little, R. J. A. & Rubin, D. (2002). Statistical Analysis with Missing Data, 2nd Ed. New York: Wiley.

[17] Liu, R. G. & Singh, K. (1997). Notions of limiting P values based on data depth and bootstrap. Journal of the American Statistical Association, 92, 266–277.

[18] Ludbrook, J. (2008). Outlying observations and missing values: how should they be handled? Clinical and Experimental Pharmacology & Physiology, 35, 670–678.

[19] Maronna, R. A. & Zamar, R. H. (2002). Robust estimates of location and dispersion for high-dimensional data sets. Technometrics, 44, 307–317.

[20] McKnight, P. E., McKnight, K. M., Sidani, S. & Figueredo, A. J. (2007). Missing Data: A Gentle Introduction. New York: Guilford Press.

[21] Pratt, J. W. (1968). A normal approximation for binomial, F, beta, and other common, related tail probabilities, I. Journal of the American Statistical Association, 63, 1457–1483.

[22] Racine, J. & MacKinnon, J. G. (2007). Simulation-based tests that can use any number of simulations. Communications in Statistics–Simulation and Computation, 36, 357–365.

[23] Rao, J. N. K. & Sitter, R. R. (1995). Variance estimation under two-phase sampling with application to imputation for missing data. Biometrika, 82, 453–460.

[24] Rocke, D. M. (1996). Robustness properties of S-estimators of multivariate location and shape in high dimension. Annals of Statistics, 24, 1327–1345.

[25] Romano, J. P. (1990). On the behavior of randomization tests without a group invariance assumption. Journal of the American Statistical Association, 85, 686–692.

[26] Rosenberger, J. L. & Gasko, M. (1983). Comparing location estimators: Trimmed means, medians, and trimean. In D. Hoaglin, F. Mosteller & J. Tukey (Eds.), Understanding Robust and Exploratory Data Analysis (pp. 297–336). New York: Wiley.

[27] Rousseeuw, P. J. & van Driessen, K. (1999). A fast algorithm for the minimum covariance determinant estimator. Technometrics, 41, 212–223.

[28] Rubin, D. B. (1987). Multiple Imputation for Nonresponse in Surveys. New York: Wiley.

[29] Schafer, J. L. (1997). Analysis of Incomplete Multivariate Data. London: Chapman and Hall.

[30] Staudte, R. G. & Sheather, S. J. (1990). Robust Estimation and Testing. New York: Wiley.

[31] Todorov, V., Templ, M. & Filzmoser, P. (2011). Detection of multivariate outliers in business survey data with incomplete information. Advances in Data Analysis and Classification, 5, 37–56.

[32] Wang, Q. H. & Rao, J. N. K. (2002). Empirical likelihood-based inference in linear models with missing data. Scandinavian Journal of Statistics, 29, 563–576.

[33] Wilcox, R. R. (2011). Comparing two dependent groups: Dealing with missing values. Journal of Data Science, 9, 1–13.

[34] Wilcox, R. R. (2012). Introduction to Robust Estimation and Hypothesis Testing, 3rd Ed. San Diego, CA: Academic Press.
