Functional Component Analysis and Regression Using Elastic Methods
Total Page:16
File Type:pdf, Size:1020Kb
FLORIDA STATE UNIVERSITY COLLEGE OF ARTS AND SCIENCES FUNCTIONAL COMPONENT ANALYSIS AND REGRESSION USING ELASTIC METHODS By J. DEREK TUCKER A Dissertation submitted to the Department of Statistics in partial fulfillment of the requirements for the degree of Doctor of Philosophy Degree Awarded: Summer Semester, 2014 J. Derek Tucker defended this dissertation on May 20, 2014. The members of the supervisory committee were: Anuj Srivastava Professor Co-Directing Dissertation Wei Wu Professor Co-Directing Dissertation Eric Klassen University Representative Fred Huffer Committee Member The Graduate School has verified and approved the above-named committee members, and certifies that the dissertation has been approved in accordance with university requirements. ii I dedicate this work to my wife and my children. Your unconditional support, love, and patience made this work possible and allowed me to achieve my dreams. iii ACKNOWLEDGMENTS I would like to acknowledge the support of my colleagues at the Naval Surface Warfare Center Panama City Division (NSWC PCD) in the pursuit of my PhD degree. It has been a great honor to further my education while employed at the lab. I am thankful for the support of Dr. Frank Crosby and Dr. Quyen Huynh, at NSWC PCD who supported me and introduced me to my advisor, Dr. Anuj Srivastava. Furthermore, I would like to thank my advisors, Dr. Anuj Srivastava and Dr. Wei Wu for their continual support, patience, technical discussions, and dedication to my success. From them I have learned and accomplished much that will benefit me in my future endeavors. iv TABLE OF CONTENTS List of Tables ............................................. vii List of Figures ............................................ viii Abstract ................................................ xii 1 Introduction 1 1.1 Motivation ......................................... 2 1.2 Contributions ........................................ 5 2 Literature Review 7 2.1 Functional Data Analysis ................................. 7 2.1.1 Summary Statistics ................................. 7 2.1.2 Smoothing Functional Data ............................ 8 2.1.3 Functional Principal Component Analysis .................... 11 2.1.4 Functional Regression ............................... 11 2.2 Phase-Amplitude Separation ............................... 12 2.2.1 Previous Work ................................... 13 2.2.2 Phase and Amplitude Separation Using Elastic Analysis ............ 17 2.2.3 Karcher Mean and Function Alignment ..................... 19 2.2.4 Functional Component Analysis with Alignment ................ 23 2.2.5 Functional Linear Regression with Alignment .................. 24 3 Modeling Phase and Amplitude Components 26 3.1 Phase-Variability: Analysis of Warping Functions .................... 27 3.2 Amplitude Variability: Analysis of Aligned Functions .................. 29 3.3 Modeling of Phase and Amplitude Components ..................... 31 3.3.1 Gaussian Models on fPCA Coefficients ...................... 32 3.3.2 Non-parametric Models on fPCA Coefficients .................. 32 3.4 Modeling Results ...................................... 33 3.5 Classification using Phase and Amplitude Models .................... 34 3.6 Classification Results .................................... 37 3.6.1 Pairwise Distances ................................. 37 3.6.2 Phase and Amplitude Models ........................... 43 4 Joint Alignment and Component Analysis 50 4.1 Functional Principal Component Analysis ........................ 50 4.1.1 Numerical Results ................................. 53 4.2 Functional Partial Least Squares ............................. 59 4.2.1 Optimization over fγig ............................... 61 4.2.2 Full Optimization ................................. 64 4.2.3 Numerical Results ................................. 65 v 5 Joint Alignment and Functional Regression 73 5.1 Elastic Functional Linear Regression ........................... 73 5.1.1 Maximum-Likelihood Estimation Procedure ................... 76 5.1.2 Prediction ...................................... 79 5.1.3 Experimental Results ............................... 79 5.2 Elastic Functional Logistic Regression .......................... 82 5.2.1 Maximum-Likelihood Estimation Procedure ................... 84 5.2.2 Prediction ...................................... 85 5.2.3 Experimental Results ............................... 86 5.3 Elastic Functional Multinomial Logistic Regression ................... 92 5.3.1 Maximum-Likelihood Estimation Procedure ................... 93 5.3.2 Prediction ...................................... 95 5.3.3 Experimental Results ............................... 96 6 Elastic Regression with Open Curves 102 6.1 Elastic Shape Analysis of Open Curves .......................... 102 6.2 Elastic Linear Regression using Open Curves ...................... 104 6.2.1 Maximum-Likelihood Estimation Procedure ................... 105 6.2.2 Prediction ...................................... 107 6.3 Elastic Logistic Regression using Open Curves ..................... 107 6.3.1 Maximum-Likelihood Estimation Procedure ................... 108 6.3.2 Prediction ...................................... 109 6.3.3 Experimental Results ............................... 110 6.4 Elastic Multinomial Logistic Regression using Open Curves .............. 113 6.4.1 Maximum-Likelihood Estimation Procedure ................... 114 6.4.2 Prediction ...................................... 117 6.4.3 Experimental Results ............................... 118 7 Conclusion and Future Work 121 7.1 Discussion .......................................... 121 7.2 Future Work ........................................ 122 References ............................................... 124 Biographical Sketch ......................................... 130 vi LIST OF TABLES 2.1 The comparison of the amplitude variance and phase variance for different alignment algorithms on the Unimodal and SONAR data sets. ................... 23 3.1 Classification rates versus amount of smoothing applied. ................. 41 3.2 Mean classification rate and standard deviation (in parentheses) for 5-fold cross- validation on the signature data. .............................. 46 3.3 Mean classification rate and standard deviation (in parentheses) for 5-fold cross- validation on the iPhone data. ............................... 48 3.4 Mean classification rate and standard deviation (in parentheses) for 5-fold cross- validation on SONAR data. ................................. 49 4.1 The comparison of the amplitude variance and phase variance for different fPCA algorithms on the Simulated and Berkley Growth data set. ............... 57 4.2 Resulting singular values percentage of cumulative energy on simulated and growth data from Algorithm 4.1 and [37]. ............................. 61 4.3 Resulting singular values on Simulated Data from Algorithm 4.3. ............ 67 4.4 Resulting singular values percentage of cumulative energy on Gait data from Algo- rithm 4.3. ........................................... 71 5.1 Mean prediction error using 5-fold cross-validation and standard deviation in paren- theses for simulated data using functional regression. ................... 83 5.2 Mean probability of classification using 5-fold cross-validation and standard deviation in parentheses for simulated data using functional logistic regression. .......... 89 5.3 Mean probability of classification using 5-fold cross-validation and standard deviation in parentheses for the medical data using functional logistic regression. ......... 92 5.4 Mean probability of classification using 5-fold cross-validation and standard deviation in parentheses for simulated data using functional mulinomial logistic regression. ... 99 5.5 Mean probability of classification using 5-fold cross-validation and standard deviation in parentheses for the medical data using functional multinomial logistic regression. 101 6.1 Mean probability of classification using 5-fold cross-validation and standard deviation in parentheses for MPEG-7 data using curve logistic regression. ............. 113 6.2 Mean probability of classification using 5-fold cross-validation and standard deviation in parentheses for MPEG-7 data using curve logistic regression. ............. 119 vii LIST OF FIGURES 1.1 Examples of functional data which includes (a) the average Canadian temperature measured at 35 different sites over 365 days, (b) the growth velocity for 21 different girls, and (c) the gyrometer in the x direction measured using a iPhone 4 while riding a bicycle for 30 subjects. .................................. 1 1.2 Samples drawn from a Gaussian model fitted to the principal components for the un-aligned and aligned data. ................................ 4 2.1 Example of smoothing of functional data by changing the number of basis elements. 9 2.2 Example of smoothing of functional data by changing the amount of smoothing penalty. 11 2.3 Example of data with a) phase- and amplitude-variability and b) aligned data. .... 13 2 2.4 Demonstration of pinching problems in F space under (a) the L distance and (b) the 2 L norm. ........................................... 16 2.5 Alignment of the simulated data set using Algorithm 2.1. ................ 21 2.6 Comparison of alignment algorithms on a difficult unimodal data set (second row) and a real SONAR data set (bottom row). ........................... 22 3.1 Depiction of the SRSF space of warping functions as a sphere and a tangent space at identity id. .........................................