Discriminative Models for Robust Image Classification
The Pennsylvania State University
The Graduate School

DISCRIMINATIVE MODELS FOR ROBUST IMAGE CLASSIFICATION

A Dissertation in Electrical Engineering
by Umamahesh Srinivas

© 2013 Umamahesh Srinivas

Submitted in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

August 2013

arXiv:1603.02736v1 [stat.ML] 8 Mar 2016

The dissertation of Umamahesh Srinivas was reviewed and approved* by the following:

Vishal Monga, Monkowski Assistant Professor of Electrical Engineering, Dissertation Advisor, Chair of Committee
William E. Higgins, Distinguished Professor of Electrical Engineering
Ram M. Narayanan, Professor of Electrical Engineering
Robert T. Collins, Associate Professor of Computer Science and Engineering
Kultegin Aydin, Professor of Electrical Engineering, Head of the Department of Electrical Engineering

*Signatures are on file in the Graduate School.

Abstract

A variety of real-world tasks involve the classification of images into pre-determined categories. Designing image classification algorithms that exhibit robustness to acquisition noise and image distortions, particularly when the available training data are insufficient to learn accurate models, is a significant challenge. This dissertation explores the development of discriminative models for robust image classification that exploit underlying signal structure, via probabilistic graphical models and sparse signal representations.

Probabilistic graphical models are widely used in many applications to approximate high-dimensional data in a reduced-complexity set-up. Learning graphical structures to approximate probability distributions is an area of active research. Recent work has focused on learning graphs in a discriminative manner with the goal of minimizing classification error.
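Tree-structured graphical models of the kind discussed above are classically learned generatively via the Chow-Liu algorithm, which builds a maximum-weight spanning tree over pairwise mutual information; the discriminative schemes developed later in this dissertation start from such tree learning. As background only, here is a minimal numpy sketch of Chow-Liu structure learning for binary features (this is the classical generative algorithm, not the dissertation's discriminative variant; all function names are illustrative):

```python
import numpy as np

def mutual_information(x, y):
    """Empirical mutual information (in nats) between two binary columns."""
    mi = 0.0
    for a in (0, 1):
        for b in (0, 1):
            p_ab = np.mean((x == a) & (y == b))
            p_a, p_b = np.mean(x == a), np.mean(y == b)
            if p_ab > 0:
                mi += p_ab * np.log(p_ab / (p_a * p_b))
    return mi

def chow_liu_tree(data):
    """Edges of the maximum-weight spanning tree over pairwise MI.

    data: (n_samples, n_features) binary array.
    """
    n_feat = data.shape[1]
    # Weight every candidate edge by empirical mutual information.
    edges = [(mutual_information(data[:, i], data[:, j]), i, j)
             for i in range(n_feat) for j in range(i + 1, n_feat)]
    edges.sort(reverse=True)
    # Kruskal's algorithm with union-find keeps the graph a tree.
    parent = list(range(n_feat))
    def find(u):
        while parent[u] != u:
            parent[u] = parent[parent[u]]
            u = parent[u]
        return u
    tree = []
    for _, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            tree.append((i, j))
    return tree
```

A discriminative learner would replace the mutual-information edge weights with a classification-driven objective while keeping the same spanning-tree machinery.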
In the first part of the dissertation, we develop a discriminative learning framework that exploits the complementary yet correlated information offered by multiple representations (or projections) of a given signal/image. Specifically, we propose a discriminative tree-based scheme for feature fusion by explicitly learning the conditional correlations among such multiple projections in an iterative manner. Experiments reveal the robustness of the resulting graphical model classifier to training insufficiency.

The next part of this dissertation leverages the discriminative power of sparse signal representations. The value of parsimony in signal representation has long been recognized, most recently in the emergence of compressive sensing. A recent significant contribution to image classification has incorporated the analytical underpinnings of compressive sensing for classification tasks via class-specific dictionaries. In continuation of our theme of exploiting information from multiple signal representations, we propose a discriminative sparsity model for image classification applicable to a general multi-sensor fusion scenario. As a specific instance, we develop a color image classification framework that combines the complementary merits of the red, green, and blue channels of color images. Here signal structure manifests itself in the form of block-sparse coefficient matrices, leading to the formulation and solution of new optimization problems.

As a logical culmination of these ideas, we explore the possibility of learning discriminative graphical models on sparse signal representations. Our efforts are inspired by exciting ongoing work towards uncovering fundamental relationships between graphical models and sparse signals. First, we show the effectiveness of the graph-based feature fusion framework wherein the trees are learned on multiple sparse representations obtained from a collection of training images.
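The sparse representation-based classification idea referenced above, namely class-specific dictionaries with assignment by smallest reconstruction residual, can be sketched as follows. For brevity this illustration substitutes a per-class least-squares fit for the l1-regularized recovery step used in full sparse representation-based classification, and all names are illustrative:

```python
import numpy as np

def class_residuals(y, dictionaries):
    """Reconstruction residual of test sample y under each class dictionary.

    dictionaries: list of (d, n_c) arrays, one sub-dictionary per class,
    whose columns are (vectorized) training images of that class.
    """
    residuals = []
    for D in dictionaries:
        # Least-squares coefficients; full SRC would instead solve an
        # l1-regularized problem over the stacked dictionary to enforce
        # sparsity of the coefficient vector.
        alpha, *_ = np.linalg.lstsq(D, y, rcond=None)
        residuals.append(np.linalg.norm(y - D @ alpha))
    return np.array(residuals)

def classify(y, dictionaries):
    """Assign y to the class whose dictionary reconstructs it best."""
    return int(np.argmin(class_residuals(y, dictionaries)))
```

The block-sparse, multi-channel models in this dissertation generalize this residual rule: one coefficient matrix is recovered jointly across channels (e.g., red, green, blue), with sparsity enforced on blocks of rows rather than individual entries.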
A caveat for the success of sparse classification methods is the requirement of abundant training data, whereas many practical situations offer only limited training. We therefore revisit the sparse representation-based classification problem from a Bayesian perspective. We show that using class-specific priors in conjunction with class-specific dictionaries leads to better discrimination. We employ spike-and-slab graphical priors to simultaneously capture the class-specific structure and sparsity inherent in the signal coefficients. We demonstrate that using graphical priors in a Bayesian set-up alleviates the burden on training set size for sparsity-based classification methods.

An important goal of this dissertation is to demonstrate the wide applicability of these algorithmic tools to practical applications. To that end, we consider important problems in the areas of:

1. Remote sensing: automatic target recognition using synthetic aperture radar images, hyperspectral target detection and classification
2. Biometrics for security: human face recognition
3. Medical imaging: histopathological images acquired from mammalian tissues

Table of Contents

List of Figures
List of Tables
Acknowledgments

Chapter 1: Introduction
1.1 Motivation
1.2 Overview of Dissertation Contributions
1.2.1 Discriminative Graphical Models
1.2.2 Discriminative Sparse Representations
1.3 Organization

Chapter 2: Feature Fusion for Classification via Discriminative Graphical Models
2.1 Introduction
2.2 Background
2.2.1 Probabilistic Graphical Models
2.2.2 Boosting
2.3 Discriminative Graphical Models for Robust Image Classification
2.3.1 Feature Extraction
2.3.2 Initial Disjoint Tree Graphs
2.3.3 Feature Fusion via Boosting on Disjoint Graphs
2.3.4 Multi-class Image Classification
2.4 Application: Automatic Target Recognition
2.4.1 Introduction
2.4.2 ATR Algorithms: Prior Art
2.5 Experiments and Results
2.5.1 Experimental Set-up
2.5.2 Recognition Accuracy
2.5.2.1 Standard Operating Condition (SOC)
2.5.2.2 Extended Operating Conditions (EOC)
2.5.3 ROC Curves: Outlier Rejection Performance
2.5.4 Classification Performance as Function of Training Size
2.5.5 Summary of Results
2.6 Conclusion

Chapter 3: Application: Learning Graphical Models on Sparse Features
3.1 Introduction
3.2 Sparsity in Signal Processing
3.2.1 Compressive Sensing
3.2.2 Sparse Representation-based Classification
3.3 Hyperspectral Imaging
3.3.1 Introduction
3.3.2 Motivation and Contribution
3.4 Experimental Results and Discussion
3.4.1 Hyperspectral Target Detection
3.4.2 Hyperspectral Target Classification
3.4.2.1 AVIRIS Data Set: Indian Pines
3.4.2.2 ROSIS Urban Data Over Pavia, Italy
3.5 Robust Face Recognition
3.5.1 Introduction
3.5.2 Motivation
3.6 Face Recognition Via Local Decisions From Locally Adaptive Sparse Features
3.7 Experiments and Discussion
3.7.1 Presence of Registration Errors
3.7.2 Recognition Under Random Pixel Corruption
3.7.3 Outlier Rejection
3.8 Conclusion

Chapter 4: Structured Sparse Representations for Robust Image Classification
4.1 Introduction
4.2 Histopathological Image Classification: Overview
4.2.1 Prior Work
4.2.2 Motivation and Challenges
4.2.3 Overview of Contributions
4.3 A Simultaneous Sparsity Model for Histopathological Image Representation and Classification
4.4 LA-SHIRC: Locally Adaptive SHIRC
4.5 Validation and Experimental Results
4.5.1 Experimental Set-Up: Image Data Sets
4.5.2 Validation of Central Idea: Overall Classification Accuracy
4.5.3 Detailed Results: Confusion Matrices and ROC Curves
4.5.4 Performance as Function of Training Set Size
4.6 Structured Sparse Priors for Image Classification
4.6.1 Related Work in Model-based Compressive Sensing
4.6.2 Overview of Contribution
4.7 Design of Discriminative Spike-and-slab Priors
4.7.1 Set-theoretic Interpretation
4.7.2 Spike-and-slab Priors
4.7.3 Analytical Development
4.7.4 SSPIC: Some Observations
4.7.5 Parameter Learning
4.8 Experimental Results
4.9 Conclusion

Chapter 5: Conclusions and Future Directions
5.1 Summary of Main Contributions
5.2 Suggestions for Future Research
5.2.1 Discriminative Graph Learning
5.2.2 Joint Sparsity Models
5.2.3 Design of Class-specific Priors

Appendix A: A Greedy Pursuit Approach to Multi-task Classification
Appendix B: Parameter Learning for SSPIC
Bibliography

List of Figures

1.1 Schematic of fingerprint verification
2.1 Illustration of (2.4)-(2.5)
2.2 Two-stage framework for designing discriminative graphical models
2.3 Receiver operating characteristic curves
2.4 Classification error vs. training sample size: SOC
2.5 Classification error vs. training sample size: EOC
3.1 Hyperspectral image classification using discriminative graphical models
3.2 Target detection: ROC for FR-I
3.3 Difference maps for AVIRIS Indian Pine data set
3.4 Performance of different approaches as a function of number of training samples
3.5 Representation of a block in the test image from a locally adaptive dictionary
3.6 Proposed framework for face recognition
3.7 An example of rotated test images
3.8 Recognition rate for rotated test images
3.9 An example of scaled test images
3.10 ROC curves for outlier rejection
4.1 Color channel-specific dictionary design for SHIRC
4.2 Illustration to motivate LA-SHIRC
4.3 Local cell regions selected from the IBL data set
4.4 LA-SHIRC: Locally adaptive variant of SHIRC
4.5 Sample images from the ADL data set
4.6 Sample breast lesion images from the IBL data set
4.7 Bar graphs indicating the overall classification accuracies of the competing methods
4.8 Receiver operating characteristic curves for different organs
4.9 Overall classification accuracy as a function of training set size for three different scenarios: low, limited and adequate training
4.10 SSPIC: Set-theoretic comparison
4.11 Probability density function of a spike-and-slab prior