Transformed Tests for Homogeneity of Variances and Means
TRANSFORMED TESTS FOR HOMOGENEITY OF VARIANCES AND MEANS

Md. Khairul Islam

A Dissertation Submitted to the Graduate College of Bowling Green State University in partial fulfillment of the requirements for the degree of DOCTOR OF PHILOSOPHY

August 2006

Committee:
Hanfeng Chen, Advisor
Michael Zickar, Graduate Faculty Representative
Arjun K. Gupta
Truc T. Nguyen
Junfeng Shang

ABSTRACT

Hanfeng Chen, Advisor

The analysis of variance (ANOVA) is one of the most important and useful techniques in a variety of fields, such as agriculture, sociology and medicine, for comparing different groups or treatments with respect to their means. A set of assumptions, such as normal error distribution, homogeneity of variances and independence of observations, has to be made to employ an F test for equality of the treatment means. It is now well established that violation of the assumption of homogeneity of variances can have severe effects on inference about the population means, especially in the case of unequal sample sizes. In fact, the conventional ANOVA F test generally provides poor control over both Type I and Type II error rates under a wide variety of variance heterogeneity conditions. Therefore, the problem of homogeneity of variances has to be settled before conducting an ANOVA. While a good number of tests are available for testing homogeneity of variances, the Bartlett test and four versions of the Levene test are still popular in the one-way ANOVA setting. It is evident that the Bartlett test is not as robust as the Levene tests against violation of the normality assumption. On the other hand, the Levene tests are less powerful than the Bartlett test. In this dissertation, we propose a transformed version of the Bartlett test, where the transformation is intended to achieve, to some extent, normality of the data and independence of the observations.
It is evident from the simulation that the transformed Bartlett test is more robust than the untransformed Bartlett test against violation of the normality assumption. It also follows that the transformed Bartlett test strikes a balance between the Bartlett test and the four versions of the Levene test in terms of Type I error rate and power. Where the estimation of a location parameter is of concern, a modified version of the trimmed mean has been proposed as an alternative to the trimmed mean when the distribution is skewed or contains outliers. It is evident from the simulation study that the modified trimmed mean outperforms the trimmed mean in terms of the coverage probability of the interval estimate. It is also evident that the test based on the modified trimmed mean maintains a suitable estimated Type I error rate and is more powerful than the tests based on the mean and the trimmed mean in the presence of outliers. Finally, an alternative method of estimating the transformation parameter has been proposed, using the chi-squared goodness of fit criterion, which could be useful for testing equality of two population means.

ACKNOWLEDGMENTS

I would like to thank Professor Hanfeng Chen, my supervisor, for his thoughtful suggestions, guidance and constant support during this research. I would also like to express my thanks and gratitude to the members of my committee, Professors Arjun K. Gupta, Truc T. Nguyen, Michael Zickar and Junfeng Shang, for their time and advice. I am grateful to the Department of Mathematics and Statistics for providing me with a teaching assistantship and a wonderful research environment. My special thanks go to Mary Busdeker, Marcia Seubert and Cyndi Patterson for their assistance and cooperation. I am thankful to Sherwin and Mike for their cooperation and company during my stay at Bowling Green State University. I certainly have to thank my elder brother Md. Azhar Ali Miah and my sister-in-law, who inspired me throughout my education.
Without their sacrifice this work could never have come into existence. I also extend my gratitude to my parents-in-law and my children, Sadia and Towfique, for their love and patience, which I badly needed to complete my work. Finally, my sincere and deepest gratitude goes to my wife Tanweer Shapla for her encouragement and support throughout my graduate study.

Bowling Green, Ohio
Md. Khairul Islam
August, 2006

TABLE OF CONTENTS

CHAPTER 1: INTRODUCTION AND SUMMARY
1.1 Introduction
1.2 The Classical Model
1.3 The Violation of Assumptions in Classical Model
1.4 Developing the Transformation
1.5 The Box-Cox Transformation and Some Alternative Versions
1.6 Estimation of λ with Box-Cox Transformed Model
1.7 Thesis Summary

CHAPTER 2: TESTING HOMOGENEITY OF VARIANCES
2.1 Introduction
2.2 Tests for Homogeneity of Variances
2.2.1 Bartlett χ² test
2.2.2 Levene 1 test
2.2.3 Levene 2 test
2.2.4 Levene 3 test
2.2.5 Levene 4 test
2.2.6 Jackknife test
2.2.7 Box Test
2.3 New Transformed Test Proposed
2.3.1 The choice of transformation
2.3.2 Transformed Bartlett test
2.4 Applications to Real Life Situations
2.4.1 Example 1: Two-sample case
2.4.2 Example 2: Three-sample case
2.4.3 Example 3: Four-sample case
2.5 Monte Carlo Study
2.5.1 Simulation technique
2.5.2 Simulation results and discussions

CHAPTER 3: ESTIMATION BY MODIFIED TRIMMED MEAN
3.1 Introduction
3.2 Sample Quantile
3.3 Trimmed Mean
3.4 Modified Trimmed Mean
3.5 Properties of Modified Trimmed Mean
3.5.1 Some useful results
3.5.2 Asymptotic distribution of modified trimmed mean
3.6 Applications
3.7 Monte Carlo Study
3.7.1 Simulation technique
3.7.2 Results and discussion

CHAPTER 4: TRANSFORMED TESTS FOR MEANS
4.1 Introduction
4.2 Box-Cox Transformation for Positive Random Variables
4.2.1 Transformed trimmed t-test
4.2.2 Transformed modified trimmed t-test
4.3 Monte Carlo Study
4.3.1 Simulation technique
4.3.2 Simulations results
4.3.3 Concluding remarks

CHAPTER 5: TRANSFORMATION BY GOODNESS OF FIT
5.1 Introduction
5.2 Estimation of λ by Goodness of Fit Test
5.2.1 Motivation for goodness of fit test
5.2.2 χ² goodness of fit in estimation of λ
5.3 Applications to Real life
5.4 Simulation and Discussion

REFERENCES

Appendix A: CODE FOR HOMOGENEITY OF VARIANCES
A.1 Computation of Power in Three-sample Case

Appendix B: CODE FOR ESTIMATION AND TESTING FOR MEANS
B.1 Coverage Probability and Mean Length of CI
B.2 Computation of Power in Two-sample t Tests

LIST OF TABLES

2.1 Survival times for bile duct cancer patients
2.2 Some descriptive statistics for data in Table 2.1
2.3 Results of Lilliefors test for data in Table 2.1
2.4 Different test statistics with corresponding p-values for data in Table 2.1
2.5 Lifetimes of steel specimens tested at different stress levels
2.6 Skewness and sample variances for data in Table 2.5
2.7 Summary statistics by Lilliefors test for data in Table 2.5
2.8 Results of different tests with corresponding p-values for data in Table 2.5
2.9 Times (in hours) between successive failures of air conditioning equipment by four Boeing 720 aircrafts
2.10 Some descriptive statistics for data in Table 2.9
2.11 Summary statistics by Lilliefors test for data in Table 2.9
2.12 Different tests with corresponding p-values for data in Table 2.9
2.13 Characteristics of distributions used in simulations
2.14 Estimated testing size for two-sample cases
2.15 Estimated testing power for variance ratio 1:4
2.16 Estimated testing size and power for K = 3 and LN(0, 0.5)
2.17 Estimated testing size and power for K = 3 and B(6, 1.5)
2.18 Estimated testing size and power for K = 3 and G(2, 1)
2.19 Estimated testing size and power for K = 3 and Exp(1)
2.20 Estimated testing size and power for K = 3 and χ²₅
3.1 Lifetimes of transistors in an accelerated life test
3.2 The 95% confidence intervals for transistor data
3.3 The time of appearance of carcinoma in rats
3.4 The 95% confidence intervals for carcinoma data
3.5 Some characteristics of the distributions used in simulations
3.6 Simulated coverage probability and mean length of interval estimators
3.7 Simulated coverage probability and mean length of interval estimators with constant outlier in the population
4.1 Estimated size of different tests
4.2 Estimated power of different tests
4.3 Estimated