Confidence Intervals for Functions of Quantiles Using Linear Combinations of Order Statistics
CONFIDENCE INTERVALS FOR FUNCTIONS OF QUANTILES USING LINEAR COMBINATIONS OF ORDER STATISTICS

by Seth Michael Steinberg

Department of Biostatistics
University of North Carolina at Chapel Hill

Institute of Statistics Mimeo Series No. 1433
March 1983

A Dissertation submitted to the faculty of The University of North Carolina at Chapel Hill in partial fulfillment of the requirements for the degree of Doctor of Philosophy in the Department of Biostatistics, School of Public Health.

Chapel Hill
1983

Approved by: [signature], Advisor

ABSTRACT

SETH MICHAEL STEINBERG. Confidence Intervals for Functions of Quantiles Using Linear Combinations of Order Statistics. (Under the direction of C.E. DAVIS)

Estimators for quantiles based on linear combinations of order statistics have been proposed by Harrell and Davis (1982) and Kaigh and Lachenbruch (1982). Both estimators have been demonstrated to be at least as efficient for small-sample point estimation as an ordinary sample quantile estimator based on one or two order statistics. Distribution-free confidence intervals for quantiles can be constructed using either of the two approaches. By means of a simulation study, these confidence intervals have been compared with several other methods of constructing confidence intervals for quantiles in small samples. For the median, the Kaigh and Lachenbruch method performed the best overall. For other quantiles, no method performed better than the method which uses pairs of order statistics.

The interquantile difference is often useful as a measure of dispersion. Both the Harrell-Davis and Kaigh-Lachenbruch estimators are modified to estimate interquantile differences. Theoretical developments needed to establish large-sample use of the normal distribution for these estimators are presented.
Both of these methods are used to form pivotal quantities with asymptotic normal distributions, and thus are readily used for construction of confidence intervals.

The point estimators of interquantile difference are compared through simulations on the basis of relative mean squared errors. The estimator based on the Harrell-Davis method generally performed best in this regard. Confidence intervals are constructed and compared with a method based on pairs of order statistics. This order statistic method produced very conservative intervals. The performance of the other estimators varied, and was better for symmetric distributions. Neither method could consistently produce intervals of the desired confidence.

Finally, an example using data from the Lipid Research Clinics Program is presented to illustrate use of the new estimators for point and interval estimation of quantiles and interquantile differences.

ACKNOWLEDGEMENTS

My committee, chaired by Dr. C.E. Davis, was extraordinary in the amount of time and effort put forth to assist with this project. I sincerely thank Dr. Davis for his availability to discuss my work, and for his suggestions and guidance throughout my writing of the dissertation. I want to thank the other members of the committee, Drs. Shrikant Bangdiwala, Frank Harrell, Abdel Omran, and Dana Quade, for their comments and suggestions, and for maintaining a strong interest in the project.

The past few years in Chapel Hill have been very enjoyable. This is due in large part to the many wonderful friends I have made here. I thank them all for their support along the way. My parents deserve special thanks for encouraging me to obtain a worthwhile education and for supporting me throughout the whole process.

I would like to thank Dr. P.K. Sen for initially suggesting my investigation of this area of research, and for providing helpful information when it was needed. Dr.
William Kaigh of The University of Texas at El Paso and Dr. Bruce Schmeiser of Purdue made available some results of their own research; their help is gratefully acknowledged. Data for the example in Chapter V are used with permission of the National Heart, Lung, and Blood Institute.

Finally, I would like to thank Ernestine Bland for providing superb, speedy typing services for this manuscript, and the entire faculty and staff of the Department of Biostatistics for making my experience pleasant and rewarding.

Funding was provided by NICHD training grant #5-T32-HD07102-05, and by Survey Design, Inc.

TABLE OF CONTENTS

ACKNOWLEDGEMENTS
LIST OF TABLES

CHAPTER

I  INTRODUCTION AND REVIEW OF THE LITERATURE
   1.1 Introduction
   1.2 Review of the Literature
       1.2.1 Simple Point and Interval Estimators
       1.2.2 Various Median Estimation Methods
       1.2.3 Estimators for the p-th Quantile
       1.2.4 Quantile Estimators for Specific Distributions
             1.2.4.1 Normal Distribution Quantiles
             1.2.4.2 Exponential Distribution Quantiles
             1.2.4.3 Quantile Estimation for Other Distributions
       1.2.5 Estimation of Quantile Intervals
       1.2.6 Estimation of Quantile Differences
   1.3 Outline of the Research Proposal

II  A COMPARISON OF CONFIDENCE INTERVALS FOR QUANTILES
   2.1 Introduction
   2.2 Selection of Interval Estimators for Comparison
   2.3 Note on the Use of the Kaigh and Lachenbruch Estimator
   2.4 Evaluation of Confidence Intervals
       2.4.1 Exact Confidence Intervals
             2.4.1.1 Determination of Confidence
             2.4.1.2 Expected Length of Confidence Intervals
       2.4.2 Simulated Confidence Intervals
             2.4.2.1 Determination of Confidence
             2.4.2.2 Expected Lengths of Intervals
       2.4.3 Selection of Distribution for Pivotal Quantity
   2.5 Details of the Simulation Process
   2.6 Results from Simulated or Theoretical Construction of Intervals
   2.7 Conclusions

III  THEORY FOR ESTIMATION OF AN INTERQUANTILE DIFFERENCE
   3.1 Introduction
   3.2 Theory for the L-COST Estimator of Interquantile Difference
       3.2.1 The L-COST Interquantile Difference Estimator
       3.2.2 Theoretical Framework for Convergence to Normality
             3.2.2.1 L-estimators and the L-COST Estimator
             3.2.2.2 Establishing Conditions for Convergence
       3.2.3 Convergence Theorems for the L-COST Estimator of Interquantile Difference
       3.2.4 Confidence Interval Estimator Based on L-COST Interquantile Difference Estimator
   3.3 Theory for the Kaigh and Lachenbruch (1982) Estimator of an Interquantile Difference
       3.3.1 The K-L Interquantile Difference Estimator
       3.3.2 Convergence Theorems for the K-L Estimator of Interquantile Difference
       3.3.3 Confidence Interval Estimator for the K-L Interquantile Difference Estimator

IV  A COMPARISON OF POINT AND CONFIDENCE INTERVAL ESTIMATORS OF INTERQUANTILE DIFFERENCES
   4.1 Introduction
   4.2 Point Estimators for the Interquantile Difference
   4.3 Evaluation of Point Estimators
       4.3.1 Methodology for Comparisons
       4.3.2 Results of Comparisons
   4.4 Evaluation of Confidence Intervals
   4.5 Results from Simulated Confidence Intervals
   4.6 Conclusion and Summary

V  EXAMPLE OF QUANTILE ESTIMATION METHODS
   5.1 Introduction
   5.2 Comparison of Results for the Example
   5.3 Conclusion

VI  SUMMARY AND SUGGESTIONS FOR FURTHER RESEARCH
   6.1 Summary
   6.2 Suggestions for Further Research

BIBLIOGRAPHY
APPENDIX

LIST OF TABLES

TABLE

2.1 Order Statistics X(j), X(k) Comprising a Confidence Interval (with Theoretical Confidence) for Various Quantiles and Sample Sizes
2.2 Expected Lengths of 95% Confidence Intervals (and Theoretical or Observed Confidence) Computed for Various Quantiles of the Uniform Distribution, with Three Sample Sizes
2.3 Expected Lengths of 99% Confidence Intervals (and Theoretical or Observed Confidence) Computed for Various Quantiles of the Uniform Distribution, with Three Sample Sizes
2.4 Expected Lengths of 95% Confidence Intervals (and Theoretical or Observed Confidence) Computed for Various Quantiles of the Normal Distribution, with Three Sample Sizes
2.5 Expected Lengths of 99% Confidence Intervals (and Theoretical or Observed Confidence) Computed for Various Quantiles of the Normal Distribution, with Three Sample Sizes
2.6 Expected Lengths of 95% Confidence Intervals (and Theoretical or Observed Confidence) Computed for Various Quantiles of the Cauchy Distribution, with Three Sample Sizes
2.7 Expected Lengths of 99% Confidence Intervals (and Theoretical or Observed Confidence) Computed for Various Quantiles of the Cauchy
Distribution, with Three Sample Sizes
2.8 Expected Lengths of 95% Confidence Intervals (and Theoretical or Observed Confidence) Computed for Various Quantiles of the Exponential Distribution, with Three Sample Sizes
2.9 Expected Lengths of 99% Confidence Intervals (and Theoretical or Observed Confidence) Computed for Various Quantiles of the Exponential Distribution, with Three Sample