Critical Values for the Two Independent Samples Winsorized T Test
Total Page:16
File Type:pdf, Size:1020Kb
Wayne State University Wayne State University Dissertations 1-1-2010 Critical Values For The woT Independent Samples Winsorized T Test Piper Alycee Farrell-Singleton Wayne State University, Follow this and additional works at: http://digitalcommons.wayne.edu/oa_dissertations Part of the Statistics and Probability Commons Recommended Citation Farrell-Singleton, Piper Alycee, "Critical Values For The wT o Independent Samples Winsorized T Test" (2010). Wayne State University Dissertations. Paper 86. This Open Access Dissertation is brought to you for free and open access by DigitalCommons@WayneState. It has been accepted for inclusion in Wayne State University Dissertations by an authorized administrator of DigitalCommons@WayneState. CRITICAL VALUES FOR THE TWO INDEPENDENT SAMPLES WINSORIZED T TEST by PIPER A. FARRELL-SINGLETON DISSERTATION Submitted to the Graduate School of Wayne State University, Detroit, Michigan in partial fulfillment of the requirements for the degree of DOCTOR OF PHILOSOPHY 2010 MAJOR: EDUCATION, EVALUATION AND RESEARCH Approved by: ___________________________________ Advisor Date ___________________________________ ___________________________________ ___________________________________ ©COPYRIGHT BY PIPER FARRELL-SINGLETON 2010 All Rights Reserved DEDICATION To my family, my friends and The Most High God. ii ACKNOWLEDGEMENTS I would like to first give honor to God, for without Him, this work would not have been possible. Second, I would like to thank my mother, the late Marie Farrell-Donaldson, whose courage and ability to face the unknown inspired me to pursue this work. To my grandmother, Lorine “Mother Love” Morgan, your love and encouragement are invaluable. To my precious daughter, Aris Singleton, who helped keep me focused; To my stepfather, Dr. Clinton L. Donaldson, thank you for inspiring me to use education as a key to unlock the door to my dreams; My father, Joseph Farrell, your wisdom, love and life lessons helped me to see this project through to the end. To my mentor, Apostle Briggie Stansberry, your unfailing love, prayers and support helped me to realize the destiny within me. A special thanks to Howard Booker for his much appreciated guidance, support and dedication to keeping me focused; my advisor, Dr. Shlomo Sawilowsky, for your patience, understanding and humor; My dissertation committee members, Dr. Gail Fahoome, Dr. Michael Addonizio and Dr. Monte Piliawsky for your support and dedication to this project; and to the countless other friends and family members who have waited patiently at the finish line for me- Ready or not, here I come! iii TABLE OF CONTENTS Dedication ...................................................................................................................... ii Acknowledgments......................................................................................................... iii List of Tables .................................................................................................................. v List of Figures ................................................................................................................ vi CHAPTER 1 – Introduction............................................................................................ 1 CHAPTER 2 – Review of Literature .............................................................................. 9 CHAPTER 3 – Methodology ........................................................................................ 35 CHAPTER 4 – Results.................................................................................................. 40 CHAPTER 5 – Discussion ............................................................................................ 48 References ..................................................................................................................... 53 Abstract ......................................................................................................................... 61 Autobiographical Statement.......................................................................................... 62 iv LIST OF TABLES Table 1: Efficiencies for trimmed and winsorized samples .......................................... 32 Table 2: Sample data used to verify algorithm…… ..................................................... 36 Table 3: Monte Carlo summary results for sample size n1=n2=2 ................................ 37 Table 4: Maximum winsorized values per side for the two sample winsorized t......... 38 Table 5: Critical values for the two sample winsorized t test ....................................... 43 Table 6: Random data used for comparison ................................................................. 49 Table 7: Summary of sample A and B data for n = 25 ................................................. 50 v LIST OF FIGURES Figure 1: Effects of variations in the probability of occurrence of an outlier on power functions of the Student’s t test,Zimmerman (1994) ......................... 14 Figure 2: Effects of variations in the probability of occurrence of an outlier on power functions of the Mann-Whitney-Wilcoxon test, Zimmerman (1994) ........................................................................................ 14 Figure 3: Probability of Type I errors of the Student t test when the null hypothesis is true, Zimmerman (1994) ............................................................................ 16 Figure 4: Probability of Type I errors of the Mann-Whitney-Wilcoxon test when the null hypothesis is true, Zimmerman (1994) ................................... 16 Figure 5: Weibull Distribution characteristics .............................................................. 30 vi 1 CHAPTER 1 CRITICAL VALUES FOR THE TWO INDEPENDENT SAMPLES WINSORIZED T TEST Introduction According to Barnett and Lewis (1984, p. 4), an outlier is an observation (or subset of observations), in a set of data which appears to be inconsistent with the remainder of that set of data. One of the earliest references to outliers was suggested by Boscovich (1755) in an attempt to determine the ellipticity of the earth by averaging measures of excess of the polar degree over the equatorial. In his study, Boscovich determined that two of the ten measured values exceeded the normal range. In an attempt to obtain the best estimate of the mean, Boscovich proceeded to compute the mean minus the two extreme scores in an effort to adjust for the effects of the outlying scores. It was later proposed by Bernoulli (1755) that the practice of removing outliers should not be condemned but that the determination should be left to the satisfaction of the observer and that extreme observations should not be removed or rejected simply because they appear inconsistent with remaining data values. In 1838, subsequent attempts to address the presence and effects of outliers were made by a German mathematician and astrologer named W. F. Bessel. In his work with outliers, Bessel (1838) acknowledged that “he had never rejected an observation simply because of its large residual, and that all completed observations should be given equal weight and consideration and allowed to contribute to the results” (Ascombe, 1960, p. 125). Peirce (1852) later published the first objective test for anomalous observations, which was later followed by the publication of a test for a single doubtful observation by Chauvenet (1863). Their methodology, however, sparked much controversy until 1884 2 when Wright (1884) proposed that the best method for dealing with outliers in astronomical readings was for the non observer to reject any observation whose residual exceeded in magnitude five times the probable error, or 3.37 times the standard deviation (Ascombe, 1960, p.125). Basing his reasoning on the Gaussian law of error being satisfied, Wright (1884) succumbed that “minimal damage would be incurred due to the fact that only about one observation in a thousand would be rejected” (Ascombe, 1960, p.125). Since then, identifying and treating outliers has become so critical to the study of statistics that many suggestions have been made as to what criteria should be used to identify outliers, as well as how they should be treated for purposes of statistical analyses. Identification and treatment of outliers is crucial to statistical research because if left unchecked, outliers can increase error variance, reduce the power of statistical tests, decrease normality (if non-randomly distributed), violate assumptions of sphericity and multivariate normality (in multivariate analyses), as well as significantly bias or influence estimates that may be of considerable interest (Osborne & Overbay, 2004 ). With recent advancements in modern statistical methods, however, the process of identifying and treating outliers has become increasingly simplified. Problem The two sample t test is the best-known and most popular method for comparing two groups according to Wilcox (1996). In the presence of outliers, however, the test becomes inexact and the likelihood of Type I error inflations (or deflations) is significantly increased. Over the years, numerous recommendations have been made as to how to implement the two sample t test in the presence of outliers in an effort to obtain 3 the most valid and reliable statistical estimates. This decision is of particular importance because removal of outliers has been linked to problems such as increased sampling error, particularly when the underlying distribution is unknown or contaminated,