A Study of Some Issues of Goodness-Of-Fit Tests for Logistic Regression Wei Ma
Florida State University Libraries
Electronic Theses, Treatises and Dissertations
The Graduate School
2018

FLORIDA STATE UNIVERSITY
COLLEGE OF ARTS AND SCIENCES

A STUDY OF SOME ISSUES OF GOODNESS-OF-FIT TESTS FOR LOGISTIC REGRESSION

By WEI MA

A Dissertation submitted to the Department of Statistics in partial fulfillment of the requirements for the degree of Doctor of Philosophy

2018

Copyright © 2018 Wei Ma. All Rights Reserved.

Wei Ma defended this dissertation on July 17, 2018. The members of the supervisory committee were:

Dan McGee, Professor Co-Directing Dissertation
Qing Mai, Professor Co-Directing Dissertation
Cathy Levenson, University Representative
Xufeng Niu, Committee Member

The Graduate School has verified and approved the above-named committee members, and certifies that the dissertation has been approved in accordance with university requirements.

ACKNOWLEDGMENTS

First of all, I would like to express my sincere gratitude to my advisors, Dr. Dan McGee and Dr. Qing Mai, for their encouragement, patient guidance, and continuous support of my PhD study. I could not have completed this dissertation without their help and immense knowledge. I have been extremely lucky to have them as my advisors. I would also like to thank the rest of my committee members, Dr. Cathy Levenson and Dr. Xufeng Niu, for their support, comments, and help with my thesis. I would like to thank all the staff and graduate students in my department. During the five years of my PhD, they helped me greatly with my studies, research, and life. I have been so lucky to attend FSU for my doctoral degree.
Last but not least, I would like to thank my family members for their support and encouragement during my PhD study, especially my parents, who gave birth to me in the first place and have supported me spiritually throughout my life.

TABLE OF CONTENTS

List of Tables
List of Figures
Abstract
1 Introduction
  1.1 Review of Logistic Regression
  1.2 Goodness-of-fit Test
  1.3 Two Issues of Hosmer-Lemeshow Test
2 Goodness-of-fit Test
  2.1 Pearson's Chi-square and Deviance Tests
  2.2 Tsiatis Test
  2.3 Unweighted Sum of Squares Test
  2.4 Hosmer-Lemeshow Test
  2.5 Motivating Example
3 Grouping Test
  3.1 Majority Vote Method
  3.2 Minimum P Method
  3.3 P Values Combined Method
  3.4 Averaging Statistics Method
4 Simulation Studies of Grouping Test
  4.1 Type I Error
  4.2 Power
  4.3 Conclusion
5 Interaction Test
  5.1 Global Interaction Test
  5.2 Local Interaction Test
  5.3 Generalization of the Binary Covariate
6 Simulation Studies of Interaction Test
  6.1 Type I Error
  6.2 Power
7 Analysis of Real Data
8 Summary and Future Work
Appendices
  A Calculation of the Expectation for Five Grouping Tests
  B IRB Approval
Bibliography
Biographical Sketch

LIST OF TABLES

2.1 Contingency table for goodness-of-fit tests
2.2 HL tests for bone mineral density data
4.1 Average rate of HL tests agreeing with each other across number of groups
4.2 Type I error of HL tests for different number of groups with 500 replications, *m = 802 is not applicable for n = 500, 1000, 2000, 4000 as n/m is too small, so the results under these combinations are empty
4.3 Type I error of majority vote with 500 replications
4.4 Type I error of minimum p method and Bonferroni correction with 500 replications
4.5 Type I error of p values combined methods with independent assumption with 500 replications
4.6 Type I error of p values combined and averaging statistics using bootstrap approach with 500 replications and 1000 bootstrap samples
4.7 Power of HL tests for different number of groups with 500 replications, *m = 802 is not applicable for n = 500, 1000, 2000, 4000 as n/m is too small, so the results under these combinations are empty
4.8 Power of majority vote with 500 replications
4.9 Power of minimum p method with 500 replications
4.10 Power of p values combined and averaging statistics using bootstrap approach with 500 replications and 1000 bootstrap samples
4.11 Summary of the performance of grouping tests
5.1 Summary of five grouping tests and their expectations
5.2 Summary of five local LR and their expectations
5.3 Summary of five local LR and their expectations for categorical variable
6.1 Type I error of interaction tests with 500 replications and 1000 bootstrap samples
6.2 Power of different tests for Model 1, case 1 with 500 replications and 1000 bootstrap samples
6.3 Power of different tests for Model 2, case 1 with 500 replications and 1000 bootstrap samples
6.4 Power of different tests for Model 3, case 1 with 500 replications and 1000 bootstrap samples
6.5 Power of different tests for Model 4, case 1 with 500 replications and 1000 bootstrap samples
6.6 Power of different tests for Model 1, case 2 with 500 replications and 1000 bootstrap samples
6.7 Power of different tests for Model 2, case 2 with 500 replications and 1000 bootstrap samples
6.8 Power of different tests for Model 3, case 2 with 500 replications and 1000 bootstrap samples
7.1 Descriptive summary for BMD
7.2 Grouping tests for model (7.2)
7.3 Goodness-of-fit tests of model (7.4) for bone mineral density data, used 6 to 12 groups and 1000 bootstrap samples
7.4 Likelihood ratio test for bone mineral density data
7.5 Grouping tests of model (7.5) for bone mineral density data, used 6 to 12 groups and 1000 bootstrap samples

LIST OF FIGURES

4.1 Type I error of minimum p method and Bonferroni correction with 500 replications
4.2 Comparison of distribution of T between bootstrap and true samples with N = 1000
4.3 Type I error of p values combined and averaging statistics using bootstrap approach with 500 replications and 1000 bootstrap samples
4.4 Power of majority vote with 500 replications
4.5 Power comparison of majority vote and minimum p with Bonferroni Correction using 3 to 12, 18 groups with 500 replications
4.6 Power of minimum p with BC using 3 to 12 and 18 groups with 500 replications
4.7 Power of minimum p with BC using 6 to 12 groups with 500 replications
4.8 Power of p values combined and averaging statistics with 500 replications and 1000 bootstrap samples
6.1 Type I error of interaction tests with 500 replications and 1000 bootstrap samples
6.2 Power of different tests for Model 1, case 1 with 500 replications and 1000 bootstrap samples
6.3 Power of different tests for Model 2, case 1 with 500 replications and 1000 bootstrap samples
6.4 Power of different tests for Model 3, case 1 with 500 replications and 1000 bootstrap samples
6.5 Power of different tests for Model 4, case 1 with 500 replications and 1000 bootstrap samples
6.6 Power of different tests for Model 1, case 2 with 500 replications and 1000 bootstrap samples
6.7 Power of different tests for Model 2, case 2 with 500 replications and 1000 bootstrap samples
6.8 Power of different tests for Model 3, case 2 with 500 replications and 1000 bootstrap samples
7.1 Histogram of BMD
7.2 Scatter plot of BMD versus age
7.3 Box plot of BMD versus gender
7.4 Box plot of BMD vs race-ethnicity
7.5 Logit plot for bone mineral density data with 10 groups
7.6 Goodness-of-fit tests of model (7.4) for bone mineral density data, used 6 to 12 groups and 1000 bootstrap samples
7.7 Grouping tests of model (7.5) for bone mineral density data, used 6 to 12 groups and 1000 bootstrap samples

ABSTRACT

Goodness-of-fit.