NESUG 17 Analysis

Perusing, Choosing, and Not Mis-using: Non-parametric vs. Parametric Tests in SAS®
Venita DePuy and Paul A. Pappas, Duke Clinical Research Institute, Durham, NC

ABSTRACT
Most commonly used statistical procedures, such as the t-test, are based on the assumption of normality. The field of non-parametric statistics provides equivalent procedures that do not require normality, but often require assumptions such as equal variances. Parametric tests (which assume normality) are often used on non-normal data; even non-parametric tests are used when their assumptions are violated. This paper will provide an overview of parametric tests and their non-parametric equivalents; what assumptions are required for each test; how to perform the tests in SAS®; and guidelines for when both sets of assumptions are violated.

Procedures covered will include PROCs ANOVA, CORR, NPAR1WAY, TTEST and UNIVARIATE. Discussion will include assumptions and assumption violations, robustness, and exact versus approximate tests.

INTRODUCTION
Many statistical tests rely heavily on distributional assumptions, such as normality. When these assumptions are not satisfied, commonly used statistical tests often perform poorly, resulting in a greater chance of committing an error. Non-parametric tests are designed to have desirable statistical properties when few assumptions can be made about the underlying distribution of the data. In other words, when the data are obtained from a non-normal distribution or one containing outliers, a non-parametric test is often a more powerful statistical tool than its parametric 'normal theory' equivalent.

For example, the Likert scale data frequently used in social sciences typically violate the assumption of normality necessary for parametric tests. The ordinal scale also violates the frequent assumption that data are from a continuous distribution. Various authors (Micceri, Breckler) have found, in reviews of data sets and journal articles, that the majority of behavioral sciences data violate the assumption of normality, yet the articles rarely address that concern.

In this paper, we explore the use of parametric and non-parametric tests for one- and two-sample location differences, two-sample dispersion differences, and a one-way layout analysis. We will also examine testing for general differences between populations, and look at different measures of correlation.

MEASURES OF LOCATION AND SPREAD
Mean and variance are typically used to describe the center and spread of normally distributed data. If the data are not normally distributed or contain outliers, these measures may not be robust enough to accurately describe the data. The median is a more robust measure of the center of a distribution, in that it is not as heavily influenced by outliers or skewed data. As a result, the median is typically used with non-parametric tests. The spread is less easy to quantify but is often represented by the interquartile range, which is simply the difference between the first and third quartiles.

DETERMINING NORMALITY (OR LACK THEREOF)
One of the first steps in test selection should be investigating the distribution of the data. PROC UNIVARIATE can be used to help determine whether or not your data are normal. This procedure generates a variety of summary statistics, such as the mean and median, as well as numerical representations of properties such as skewness and kurtosis.

If the population from which the data are obtained is normal, the mean and median should be equal or close to equal. The skewness coefficient, which is a measure of symmetry, should be near zero. Positive values for the skewness coefficient indicate that the data are right skewed, and negative values indicate that the data are left skewed. The kurtosis coefficient, which describes how peaked or heavy-tailed the distribution is relative to the normal, should also be near zero. Positive values for the kurtosis coefficient indicate that the distribution of the data is steeper than a normal distribution, and negative values indicate that it is flatter than a normal distribution.

The NORMAL option in PROC UNIVARIATE produces a table with tests for normality. In general, if the p-values are less than 0.05, then the data should be considered non-normally distributed. However, it is important to remember that these tests are heavily dependent on sample size. Strikingly non-normal data may have a p-value greater than 0.05 due to a small sample size. Therefore, graphical representations of the data should always be examined.

Low-resolution plots and high-resolution histograms are both available in PROC UNIVARIATE. The PLOTS option in PROC UNIVARIATE creates low-resolution stem-and-leaf, box, and normal probability plots. The stem-and-leaf plot is used to visualize the overall distribution of the data, and the box plot is a graphical representation of the 5-number summary. The normal probability plot is designed to investigate whether a variable is normally distributed. If the data are normal, then the plot should display a straight diagonal line. Different departures from the straight diagonal line indicate different types of departures from normality.

The HISTOGRAM statement in PROC UNIVARIATE will produce high-resolution histograms. When the NORMAL option is specified on the HISTOGRAM statement, the histogram will have a line indicating the shape of a normal distribution with the same mean and variance as the sample.

PROC UNIVARIATE is an invaluable tool in visualizing and summarizing data in order to gain an understanding of the underlying populations from which the data are obtained. To produce these results, the following code can be used. Omitting the VAR statement will run the analysis on all the numeric variables in the dataset.

   PROC UNIVARIATE data=file1 normal plots;   /* normality tests and low-resolution plots */
      Histogram / normal;                     /* histogram with fitted normal curve */
      Var var1 var2...varn;
   Run;

The determination of the normality of the data should result from evaluation of the graphical output in conjunction with the numerical output. In addition, the user might wish to look at subsets of the data; for example, a CLASS statement might be used to stratify by gender.

GENERAL GUIDELINES FOR CHOOSING TESTS
If your data meet the assumptions of a parametric test, you should use it. Parametric tests are generally more powerful than their non-parametric counterparts when used appropriately.

If you can transform the dependent variable, such as by using the log or square root transformation, to make it normally distributed, this may be a good alternative. SAS/INSIGHT® is the easiest method to explore these options, due to its interactive nature. Manning and Mullahy (2001) discuss concerns regarding heteroscedasticity and log transformations, and the biases possible when using transformations.

If the data are not normally distributed but the populations have the same spread and similarly shaped distributions, and other assumptions are met, the non-parametric tests are typically the best options.

If neither the parametric nor the non-parametric test assumptions are met, great care should be taken when selecting tests. Things to consider include differing sample sizes, differing variances, and differing distributional shapes. Zimmerman has published a variety of papers addressing different aspects of this, some of which are referenced. Specifics regarding each test are listed after the SAS code.

DIFFERENCES IN DEPENDENT POPULATIONS
Testing for the difference between two dependent

If the p-value for the paired t-test is less than the specified alpha level, then there is evidence to suggest that the population means of the two variables differ. In the case of measurements taken before and after a treatment, this would suggest a treatment effect. PROC UNIVARIATE also calculates a t-test for the difference being equal to zero, which is equivalent to the paired t-test (code given in next section).

While the paired t-test is robust to departures from normality when the two distributions are the same shape, the Type I error is inflated when the distributions are skewed and also have unequal variances. Therefore, care should be taken when variances appear unequal.

SIGN TEST AND SIGNED RANK TEST
The Signed Rank test and the Sign test are non-parametric equivalents to the paired t-test. Neither of these tests requires the data to be normally distributed, but both tests require that the observed differences between the paired observations be mutually independent, and that each of the observed paired differences comes from a continuous population with a common median. The Signed Rank test additionally requires each of these populations to be symmetric about that median. However, the observed paired differences do not necessarily have to be obtained from the same underlying distribution.

To calculate these tests, the difference between measurements must be calculated for each subject. This "difference" variable will tend to be non-zero if there is a difference between the two groups. The Sign test is calculated by counting the number of positive differences and the number of negative differences. The Signed Rank test is calculated by ranking all differences by their absolute value, from least to greatest. If the two populations do not have significantly different centers, the numbers of positives and negatives in the Sign test, or the sums of the ranks of the positive and negative differences in the Signed Rank test, should be roughly equal.

For small samples, the exact p-value of these tests can be determined manually by comparing the test statistics to previously determined critical values, such as those found in texts like Hollander & Wolfe. For larger samples, the p-value is typically calculated using a large-sample approximation.
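
As an illustration of what such a large-sample approximation can look like, the standard textbook normal approximations for the two statistics are sketched below. The symbols n, B, and T+ are notation introduced here, not taken from the paper: n is the number of non-zero paired differences, B is the number of positive differences, and T+ is the sum of the ranks attached to the positive differences. The sketch assumes no tied absolute differences and no continuity correction; the approximation that SAS actually applies may differ from this form.

```latex
% Sign test: under H0 each non-zero difference is positive with probability 1/2
\[
  B \sim \mathrm{Binomial}\!\left(n, \tfrac{1}{2}\right), \qquad
  z_B = \frac{B - n/2}{\sqrt{n/4}}
\]

% Signed Rank test: T^+ is the sum of the ranks of |d_i| over the positive d_i
\[
  \mathrm{E}[T^+] = \frac{n(n+1)}{4}, \qquad
  \mathrm{Var}[T^+] = \frac{n(n+1)(2n+1)}{24}, \qquad
  z_T = \frac{T^+ - n(n+1)/4}{\sqrt{n(n+1)(2n+1)/24}}
\]
```

Each z statistic is compared to the standard normal distribution. For example, with n = 10 non-zero differences, the Signed Rank statistic has expected value 10(11)/4 = 27.5 and variance 10(11)(21)/24 = 96.25 under the null hypothesis, so an observed T+ far from 27.5 relative to a standard deviation of about 9.8 suggests a shift in location.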