Diffusion Maps and Transfer Subspace Learning
Total Page:16
File Type:pdf, Size:1020Kb
DIFFUSION MAPS AND TRANSFER SUBSPACE LEARNING A dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy by OLGA L. MENDOZA-SCHROCK B.S., University of Puget Sound, 1998 M.S., University of Kentucky, 2004 M.S., Wright State University, 2013 2017 Wright State University WRIGHT STATE UNIVERSITY GRADUATE SCHOOL July 25, 2017 I HEREBY RECOMMEND THAT THE DISSERTATION PREPARED UNDER MY SUPERVISION BY Olga L. Mendoza-Schrock ENTITLED Diffusion Maps and Transfer Subspace Learning BE ACCEPTED IN PARTIAL FULFILL- MENT OF THE REQUIREMENTS FOR THE DEGREE OF Doctor of Philosophy Mateen M. Rizki, Ph.D. Dissertation Director Michael L. Raymer, Ph.D. Director, Ph.D. in Computer Science and Engineering Program Robert E. W. Fyffe, Ph.D. Vice President for Research and Dean of the Graduate School Committee on Final Examination John C. Gallagher, Ph.D. Frederick D. Garber, Ph.D. Michael L. Raymer, Ph.D. Vincent J. Velten, Ph.D. ABSTRACT Mendoza-Schrock, Olga L., Ph.D., Computer Science and Engineering Ph.D. Program, De- partment of Computer Science and Engineering, Wright State University, 2017. Diffusion Maps and Transfer Subspace Learning. Transfer Subspace Learning has recently gained popularity for its ability to perform cross-dataset and cross-domain object recognition. The ability to leverage existing data without the need for additional data collections is attractive for Aided Target Recognition applications. For Aided Target Recognition (or object assessment) applications, Trans- fer Subspace Learning is particularly useful, as it enables the incorporation of sparse and dynamically collected data into existing systems that utilize large databases. In this dis- sertation, Manifold Learning and Transfer Subspace Learning are combined to create new Aided Target Recognition systems capable of achieving high target recognition rates for cross-dataset conditions and cross-domain applications. The Manifold Learning technique used in this dissertation is Diffusion Maps, a nonlinear dimensionality reduction technique based on a heat diffusion analogy. The Transfer Subspace Learning technique used is Transfer Fishers Linear Discriminative Analysis. The new Aided Target Recognition sys- tems introduced in this dissertation are (i) Manifold Transfer Subspace Learning, which combines Manifold Learning and Transfer Subspace Learning sequentially, and (ii) Trans- fer Diffusion Maps, which simultaneously integrates Manifold Learning and Transfer Sub- space Learning. Finally, the ability of the new techniques to achieve high target recognition rates for cross-dataset and cross-domain applications is illustrated using a variety of diverse datasets. iii Abbreviations and Symbols Throughout this dissertation numerous abbreviations and symbols are used. While the definitions can be found in surrounding text, this section provides a quick reference. List of Abbreviations AFRL Air Force Research Laboratory ARL Army Research Laboratory BAA Broad Agency Announcements BKR Biomedical Knowledge Repository BRCA Breast Invasive Carcinoma CCR Correct Classification Rate CRC Classification, Regression, and Clustering DARPA Defense Advanced Research Projects Agency DM Diffusion Maps FLDA Fisher’s Linear Discriminative Analysis FLIR Forward Looking infrared GTL Genetic Transfer Learning HPC High Performance Computer ICML International Conference on Machine Learning IPTO Information Processing Technology Office IR Infrared KKT Karush–Kuhn–Tucker Lung Adenocarcionoma (LUAD) MNIST Modified National Institute of Standards and Technology MTSL Manifold Transfer Subspace Learning NIST National Institute of Standards and Technology NLM National Library of Medicine NIPS Neural Information Processing Systems OSE Out-Of-Sample Extension PC Principal Component PCA Principal Component Analysis RMSE Root Mean Squared Error SVD Singular Value Decomposition SVM Support Vector Machines TRACE Target Recognition and Adaptation in Contested Environments TrDM Transfer Diffusion Maps TrFLDA Transfer Fisher’s Linear Discriminative Analysis TNL Transfer Network Learning TSL Transfer Subspace Learning iv Contents 1 Introduction 1 1.1 Motivation . .3 1.2 Contributions . .5 1.2.1 Improved TrFLDA Algorithm . .7 1.2.2 Manifold Transfer Subspace Learning (MTSL) . .7 1.2.3 Transfer Diffusion Maps (TrDM) . .8 1.2.4 Published Papers Related to this Dissertation . .9 1.3 Outline of Dissertation . 10 1.4 Notation . 10 2 Background 12 2.1 Relevant Scholarship . 12 2.1.1 Aided Target Recognition . 12 2.1.2 Manifold Learning . 13 2.1.3 Transfer Learning . 15 3 Transfer Fisher’s Linear Discriminant Analysis (TrFLDA) Enhancements 19 3.1 Transfer Fisher’s Linear Discriminant Analysis . 19 3.2 Enhancements . 22 v 3.2.1 Implementation of Relative Weight . 22 3.2.2 Reformulation of the Optimization Problem . 25 3.3 Improvements to Kernel Density Estimation (KDE) Technique . 27 4 Manifold Transfer Subspace Learning (MTSL) 29 4.1 Diffusion Maps . 29 4.2 Manifold Transfer Subspace Learning (MTSL) Implementation . 32 4.2.1 The Manifold Learning Step . 32 4.2.2 The Transfer Subspace Learning Step . 34 4.2.3 Selection of Parameters . 35 5 Transfer Diffusion Maps (TrDM) 38 5.1 Eigenvalue Problems and Optimization . 39 5.2 Diffusion Maps and Optimization . 40 5.3 Transfer Diffusion Maps . 48 6 Datasets and Experimental Design 53 6.1 Description of Datasets . 53 6.1.1 Electro-optical synthetic civilian vehicle data domes (EO-SCVDD) 54 6.1.2 Geometric shapes dataset . 55 6.1.3 Cancer Data . 57 6.1.4 MNIST . 60 6.1.5 Comanche Dataset . 61 6.2 Experimental Design . 62 7 Experimental Results 67 7.1 Verification of the Efficacy of Diffusion Maps and TrFLDA . 67 7.1.1 Geometric Shapes using Diffusion Maps and TrFLDA . 68 7.1.2 Electro-optical synthetic civilian vehicle data domes (EO-SCVDD) 77 vi 7.2 Performance of MTSL and TrDM . 85 7.2.1 (Electro-optical synthetic civilian vehicle data domes (EO-SCVDD) 85 7.2.2 MNIST Handwritten Digits . 89 7.2.3 Gene Expressions . 101 7.3 Performance predictors . 106 7.4 Analysis . 111 7.4.1 Observations . 111 7.4.2 Experimental Robustness . 112 7.4.3 Computational Requirements for New Algorithm . 115 8 Closing Remarks 116 8.1 Summary of Contributions and Expected Impact . 117 8.1.1 Manifold Transfer Subspace Learning (MTSL) . 117 8.1.2 Transfer Diffusion Maps (TrDM) . 119 8.2 Future Research . 121 Bibliography 123 vii List of Figures 1.1 Types of Transfer Learning (a) cross-dataset learning, (b) partial cross- domain learning, and (c) cross-domain learning. .5 6.1 (a) The distribution of lighting positions in blue triangles and camera po- sitions in red circles; lighting condition 16 is highlighted as it is the nadir position at (0; 0; 41) (b) Sample vehicle images. 55 6.2 The five shapes in the geometric shapes dataset (a) Cube with dimensions (1; 1; 1), (b) Box (1; 1:5; 1), (c) Longer box (1; 2; 1), (d) Tall box (1; 1; 1:5), and (e) Sphere (1; 1; 1)............................. 56 6.3 The mesh in Blender. 57 6.4 The Cube under the three different lighting conditions . 58 6.5 Graphical view of average root mean square error (RMSE) for the geomet- ric shapes. 59 7.1 Diffusion maps for all five shapes. (a) Displays the Box and Tall box, (b) Box, Long Box, and Sphere and (c) Cube, Box, and Tall Box. 69 7.2 Diffusion maps for cube under all three lighting conditions. 70 7.3 The first three dimensions of the diffusion map for the cube under the first lighting condition. 71 viii 7.4 The first three dimensions of the diffusion map for the cube under the sec- ond lighting condition. 71 7.5 The first three dimensions of the diffusion map for the cube under the third lighting condition. 72 7.6 The different elevations shown in the diffusion map for the cube under the first lighting condition. 72 7.7 The different elevations shown in the diffusion map for the cube under the second lighting condition. 73 7.8 The different elevations shown in the diffusion map for the cube under the third lighting condition. 73 7.9 Graphical view of Hausdorff distance for the geometric shapes. 74 7.10 Graphical view of modified hausdorff distance for the geometric shapes. 75 7.11 (a) The k-nearest-neighbors (kNN) accuracy for the validation experiment. (b) The objective function TrFLDA results for the validation experiment. 75 7.12 (a) The TrFLDA results for λ = 256 for the validation experiment. (b) The TrFLDA results for λ = 256 for the validation experiment when shapes are inverted. 76 7.13 Nearest neighbors accuracy for the first experiment where the source data is the cube and the tall box while the target data is the sphere and the tall box. 76 7.14 (a) The TrFLDA results for λ = 2 for the cross-domain experiment. (b) The TrFLDA results for λ = 512 for the cross-domain experiment. 78 7.15 (a) The KNN accuracy for the cross-domain experiment. (b) The objective function TrFLDA results for the cross-domain experiment. 78 7.16 Example images from the Toyota Avalon swath used in the study, image (a) one, (b) 178, (c) 356, and (d) 533-the last image in the swath. 79 ix 7.17 The diffusion maps for the Toyota Avalon and Nissan Sentra in three dif- ferent lighting conditions. (a) shows both targets (b) shows just the Toyota Avalon, and (c) displays just the Nissan Sentra. 80 7.18 Flow chart for cross-dataset experiments (a) TrFLDA for Avalon and Sentra from LC-1 to LC-9. (b) TrFLDA for Avalon and Sentra from LC-1 to LC- 10. Solid lines indicate notional PDFs for source data and dashed lines indicate notional PDFs for target data. 81 7.19 Experimental set-up for the partial cross-domain experiment using the Air Force EO Vehicle Data Domes. 86 7.20 Diffusion map for MNIST Data . 91 7.21 Dimensionality vs. Eigenvalues for the MNIST Data in graphical format.