Mixture Modeling with Applications in Alzheimer's Disease
Total Page:16
File Type:pdf, Size:1020Kb
University of Kentucky UKnowledge Theses and Dissertations--Epidemiology and Biostatistics College of Public Health 2017 MIXTURE MODELING WITH APPLICATIONS IN ALZHEIMER'S DISEASE Frank Appiah University of Kentucky, [email protected] Digital Object Identifier: https://doi.org/10.13023/ETD.2017.100 Right click to open a feedback form in a new tab to let us know how this document benefits ou.y Recommended Citation Appiah, Frank, "MIXTURE MODELING WITH APPLICATIONS IN ALZHEIMER'S DISEASE" (2017). Theses and Dissertations--Epidemiology and Biostatistics. 14. https://uknowledge.uky.edu/epb_etds/14 This Doctoral Dissertation is brought to you for free and open access by the College of Public Health at UKnowledge. It has been accepted for inclusion in Theses and Dissertations--Epidemiology and Biostatistics by an authorized administrator of UKnowledge. For more information, please contact [email protected]. STUDENT AGREEMENT: I represent that my thesis or dissertation and abstract are my original work. Proper attribution has been given to all outside sources. I understand that I am solely responsible for obtaining any needed copyright permissions. I have obtained needed written permission statement(s) from the owner(s) of each third-party copyrighted matter to be included in my work, allowing electronic distribution (if such use is not permitted by the fair use doctrine) which will be submitted to UKnowledge as Additional File. I hereby grant to The University of Kentucky and its agents the irrevocable, non-exclusive, and royalty-free license to archive and make accessible my work in whole or in part in all forms of media, now or hereafter known. I agree that the document mentioned above may be made available immediately for worldwide access unless an embargo applies. I retain all other ownership rights to the copyright of my work. I also retain the right to use in future works (such as articles or books) all or part of my work. I understand that I am free to register the copyright to my work. REVIEW, APPROVAL AND ACCEPTANCE The document mentioned above has been reviewed and accepted by the student’s advisor, on behalf of the advisory committee, and by the Director of Graduate Studies (DGS), on behalf of the program; we verify that this is the final, approved version of the student’s thesis including all changes required by the advisory committee. The undersigned agree to abide by the statements above. Frank Appiah, Student Dr. Richard J. Charnigo, Major Professor Dr. Steven Browning, Director of Graduate Studies MIXTURE MODELING WITH APPLICATIONS IN ALZHEIMER'S DISEASE ABSTRACT OF DISSERTATION A dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy in the College of Public Health at the University of Kentucky By Frank Appiah Lexington, Kentucky Co-Directors: Dr. Richard Charnigo, Professor of Statistics and Biostatistics and Dr. David Fardo, Associate Professor of Biostatistics Lexington, Kentucky 2017 Copyright© Frank Appiah 2017 ABSTRACT OF DISSERTATION MIXTURE MODELING WITH APPLICATIONS IN ALZHEIMER'S DISEASE This dissertation involves an application of mixture of regression models to 114 indi- viduals who are cognitively intact (from the Alzheimer's Disease and Neuroimaging Initiative-ADNI, data). The correct number of components in the model were esti- mated with the Singular BIC (SBIC), marking the first time it has been applied to such a problem. The smallest true model in conjunction with the approximation of SBIC was fixed at 1. The resulting posterior probabilities from the model were used to estimate the probability of a person transitioning and risk plots were obtained that could in principle be used by clinicians to identify patients at risk. This work also proposed a model selection criterion for mixture of regression models with ap- plication to the ADNI data. Finally simulation studies were conducted to compare the performance of the novel model selection and existing criteria. KEYWORDS: Mixture models, cognitively intact, Alzheimer's disease, model com- plexity/component, selection criteria Frank Appiah Author's signature April 25, 2017 Date MIXTURE MODELING WITH APPLICATIONS IN ALZHEIMER'S DISEASE By Frank Appiah Richard Charnigo Co-Director of Dissertation David Fardo Co-Director of Dissertation Steven Browning Director of Graduate Studies April 25, 2017 Date This work is entirely dedicated to the memory of my late mother Paulina Appiah, my wife Leslie and my children Aiden and Naomi. ACKNOWLEDGMENTS I am indebted to my wife Leslie A. Appiah and my children Aiden O. Appiah and Naomi A. Appiah for their unwavering support and love throughout the years of my graduate education. I am also very appreciative of my advisor, Dr. Charnigo and co- chair Dr. Fardo, for providing great insights and motivation for this work. A similar heartfelt thanks go to my advising committee members Drs. Abner and Mays, for their patience, directions and thought provoking questions. This acknowledgement will be incomplete without giving a special thanks to the woman who with the help of God made this day a possibility. This woman is none but my late mother Mrs. Paulina Appiah, who passed away on March 19, 2014. I am very thankful for all the work and 'painful sacrifices' she made for me. Mom, I dedicate this work in its entirety to your memory. God bless. iii TABLE OF CONTENTS Acknowledgments.................................. iii TableofContents.................................. iv List of Figures . v ListofTables.................................... vi Chapter 1 Introduction . 1 1.1 Definition of Mixture Models and Examples . 1 1.2 Review of Applications . 4 1.3 Review of Mixture Model Applications to Alzheimer's Disease . 15 1.4 Review of Existing Model Selection Criteria and Mixture of Regression 19 Chapter 2 An Application of A Bivariate Normal Mixture Model . 22 2.1 Motivation and Objectives . 23 2.2 Methodologies, Cognitive Assessment and Review of Related Concepts 27 2.3 Results and Discussion . 38 2.4 Limitations and Future Directions . 47 2.5 Acknowledgements . 51 Chapter 3 Application of Mixture of Linear Regressions Models And the Ap- proximate Singular Bayesian Information Criterion . 76 3.1 Introduction................................ 76 3.2 General Overview of Mixture of Regression Models With An Illustration 77 3.3 Overview of Primary Objectives . 82 3.4 Methodologies and Overview of AIC . 84 3.5 Results and Discussions . 89 3.6 Limitations and Future Directions . 104 3.7 Illustrative Computations for A1-A3 in Drton and Plummer(2016) for Mixture of Regression Models . 135 Chapter 4 A Singular Flexible Information Criterion From A Mixture of Lin- ear Regressions Perspective . 158 4.1 Introduction . 158 iv 4.2 Deduction of Within and Between Covariance Matrices . 159 4.3 Definition and Derivation of SFLIC . 163 4.4 Consistency of SFLIC . 166 4.5 Application of SFLIC to the ADNI data . 169 Chapter 5 Simulation Studies . 170 5.1 Introduction . 170 5.2 Overview of Approach . 170 5.3 Simulation Design and Results . 172 Chapter 6 Supplementary Chapter . 178 6.1 Introduction . 178 6.2 Review of Other Comparable Models Fitted As Part of This Work . 178 Vita ......................................... 188 Oral Presentations (Delivered by First author) . 188 Poster Presentations (Delivered by First author) . 189 External Funding . 190 Awards..................................... 192 Professional Organizations . 192 LeadershipSkills................................ 192 v LIST OF FIGURES 2.1 Original Biomarkers . 30 2.2 Derived Biomarker Histogram . 31 2.3 BNM Component Membership Probability Distribution . 53 2.4 BNM Predicted Components Scatterplot . 54 2.5 BNM Component Kaplan Meier Plots . 55 2.6 Component contour plots . 56 2.7 The empty circles identify potential outliers among the eight participants 57 2.8 Joint Biomarker contour plot . 58 2.9 Joint Biomarker density plot . 59 2.10 ROC plot. AUC: Area under the curve . 60 2.11 Risk Boundaries . 61 2.12 Risk Boundaries . 62 2.13 Risk Differentiation Boundaries . 63 2.14 BNM Four Component Membership Probability Distribution . 64 2.15 BNM Four Component Kaplan Meier Plots . 65 2.16 BNM Predicted Four Component Plots . 66 2.17 Comparison of rtaubeta between CN and MCI/AD groups . 67 2.18 Comparison of rptaubeta between CN and MCI/AD groups . 68 2.19 Contours embedded on risk components . 69 3.1 Relationship between biomarkers ratios and race. + represents the grand mean...................................... 106 3.2 Relationship between biomarker ratios and Apoe4 + represents the grand mean...................................... 107 3.3 Relationship between biomarkers, race and Apoe4. + is grand mean. 108 3.4 Joint Distribution of rtaubeta and rptaubeta Given Race . 109 3.5 Posterior probability plot with race as predictor in the mixture regression model indicates that the model is well separated . 110 3.6 Individual biomarker ratios grouped into different risk regions. Here race is the predictor variable . 111 3.7 The survivability of the two groups over time given in weeks. Here race is the predictor variable . 112 3.8 Risk strata for Blacks from the posterior probabilities obtained from the mixture of linear regression with race as covariate . 113 vi 3.9 Risk strata for Whites from the posterior probabilities obtained from the mixture of linear regression with race as covariate . 114 3.10 Joint Distribution of rtaubeta and rptaubeta Given Apoe4 . 115 3.11 Posterior probability plot indicates that the mixture of regression model with Apoe4 as predictor model is comparatively less well separated . 116 3.12 Individual biomarker ratios grouped into different risk regions. Here Apoe4 is the predictor variable. 117 3.13 The survivability of the two groups over time given in weeks. Here Apoe4 is the predictor variable. 118 3.14 This corresponds to the risk strata for Apoe4 carriers derived from the posterior probabilities obtained from the mixture of linear regression with Apoe4 as predictor variable. 119 3.15 This corresponds to the risk strata for none-Apoe4 carriers derived from the posterior probabilities obtained from the mixture of linear regression with Apoe4 as predictor variable.