<<

Dual Principal Component Pursuit

and

Filtrated Algebraic Subspace Clustering

by

Manolis C. Tsakiris

A dissertation submitted to The Johns Hopkins University in conformity with the

requirements for the degree of Doctor of Philosophy.

Baltimore, Maryland

March, 2017

c Manolis C. Tsakiris 2017

All rights reserved Abstract

Recent years have witnessed an explosion of data across scientific fields enabled by ad- vances in sensor technology and distributed hardware systems. This has given rise to the challenge of efficiently processing such data for performing various tasks, such as face recognition in images and videos. Towards that end, two operations on the data are of fun- damental importance, i.e., dimensionality reduction and clustering, and the idea of learning one or more linear subspaces from the data has proved a fruitful notion in that context. Nev- ertheless, state-of-the-art methods, such as Robust Principal Component Analysis (RPCA) or Sparse Subspace Clustering (SSC), operate under the hypothesis that the subspaces have dimensions that are small relative to the ambient , and fail otherwise.

This thesis attempts to advance the state-of-the-art of subspace learning methods in the regime where the subspaces have high relative dimensions. The first major contribution of this thesis is a single subspace learning method called Dual Principal Component Pursuit

(DPCP), which solves the robust PCA problem in the presence of outliers. Contrary to sparse and low- state-of-the-art methods, the theoretical guarantees of DPCP do not place any constraints on the dimension of the subspace. In particular, DPCP computes the

ii ABSTRACT

of the subspace, thus it is particularly suited for subspaces of low

. This is done by solving a non-convex cosparse problem on the sphere, whose

global minimizers are shown to be vectors normal to the subspace. Algorithms for solving

the non-convex problem are developed and tested on synthetic and real data, showing that

DPCP is able to handle higher subspace dimensions and larger amounts of outliers than

existing methods. Finally, DPCP is extended theoretically and algorithmically to the case

of multiple subspaces.

The second major contribution of this thesis is a subspace clustering method called

Filtrated Algebraic Subspace Clustering (FASC), which builds upon algebraic geometric

ideas of an older method, known as Generalized Principal Component Analysis (GPCA).

GPCA is naturally suited for subspaces of large dimension, but it suffers from two weak-

nesses: sensitivity to noise and large computational complexity. This thesis demonstrates

that FASC addresses successfully the robustness to noise. This is achieved through an

equivalent formulation of GPCA, which uses the idea of filtrations of unions of subspaces.

An algebraic geometric analysis establishes the theoretical equivalence of the two meth-

ods, while experiments on synthetic and real data reveal that FASC not only dramatically

improves upon GPCA, but also upon existing methods on several occasions.

Primary Reader: Rene´ Vidal

Secondary Reader: Daniel P. Robinson

iii Dedication

This thesis is dedicated to my advisor, Rene´ Vidal, for marveling me with his genius, rigor, endurance and enthusiasm.

iv Acknowledgments

First of all, I am thankful to my advisor, Prof. Rene´ Vidal, whose influence has been enor- mous: working with him has been the best academic experience in my life. Secondly, I am thankful to Prof. Daniel P. Robinson, for many useful conversations, for always be- ing supportive and friendly, and, quite importantly, for not being easily convinced: this led to the identification of a few inaccuracies and the improvement of several arguments in this thesis. I also thank Prof. Aldo Conca of the department of the Uni- versity of Genova for enthusiastically answering many questions that i had regarding the

Castelnuovo-Mumford regularity of subspace arrangements, as well as Prof. Glyn Harman for pointing out a Koksma-Hlawka inequality for integration on the unit sphere.

Then I thank Miss Debbie Race of the ECE department for being extremely helpful and patient with several administrative issues. I thank the Center for Imaging Science (CIS) for being a cozy home for the last 4 years and the ECE department for always being supportive and admitting me as a PhD student to begin with.

I thank all people who were nice to me during my stay in Baltimore. Special thanks go to sensei Ebon Phoenix for passionately teaching me martial arts, and to Tony Hatzigeor-

v ACKNOWLEDGMENTS galis and Jose Torres for being like brothers to me. I also thank the JHU newbie, Christos

Sapsanis, from whom, remarkably, the JHU graduate community has a lot to benefit. And

Guilherme Franca and Tao Xiong for being great friends.

Finally, i thank my best friend Dimitris Lountzis for being who he is, and most impor- tantly i thank my family, Chris, Evi and Michalis, for their infinite love.

vi Contents

Abstract ii

Acknowledgments v

List of Tables xiii

List of Figures xv

1 Introduction 1

1.1 Modeling data with linear subspaces ...... 1

1.1.1 Modeling data with a single subspace ...... 1

1.1.2 Modeling data with multiple subspaces ...... 3

1.2 Challenges and the role of dimension ...... 4

1.3 Contributions of this thesis ...... 5

1.3.1 Dual principal component pursuit (DPCP) ...... 6

1.3.2 Filtrated algebraic subspace clustering ...... 8

1.4 Notation ...... 10

vii CONTENTS

2 Prior Art 13

2.1 Learning a single subspace in the presence of outliers ...... 15

2.2 Learning multiple subspaces ...... 20

2.3 Challenges in high relative dimensions ...... 25

3 Dual Principal Component Pursuit 29

3.1 Introduction ...... 29

3.2 Single subspace learning with outliers via DPCP ...... 32

3.2.1 Problem formulation ...... 32

3.2.1.1 Data model ...... 32

3.2.1.2 Conceptual formulation ...... 33

3.2.1.3 pursuit by `1 minimization ...... 34

3.2.2 Theoretical analysis of the continuous problem ...... 37

3.2.2.1 The underlying continuous problem ...... 38

3.2.2.2 Conditions for global optimality and convergence . . . . 42

3.2.3 Theoretical analysis of the discrete problem ...... 48

3.2.3.1 Discrepancy bounds between continuous and discrete prob-

lems ...... 49

3.2.3.2 Conditions for global optimality of the discrete problem 56

3.2.3.3 Conditions for convergence of the discrete recursive al-

gorithm ...... 66

3.2.4 Algorithmic contributions ...... 71

viii CONTENTS

3.2.4.1 Relaxed DPCP and DPCA algorithms ...... 71

3.2.4.2 Relaxed and denoised DPCP ...... 74

3.2.4.3 Denoised DPCP ...... 76

3.2.4.4 DPCP via iteratively reweighted least-squares ...... 79

3.2.5 Experimental evaluation ...... 80

3.2.5.1 Computing a single dual principal component ...... 81

3.2.5.2 Outlier detection using synthetic data ...... 84

3.2.5.3 Outlier detection using real face and object images . . . . 90

3.3 Learning a hyperplane arrangement via DPCP ...... 96

3.3.1 Problem overview ...... 96

3.3.2 Data model ...... 97

3.3.3 Theoretical analysis of the continuous problem ...... 98

3.3.3.1 Derivation, interpretation and basic properties of the con-

tinuous problem ...... 99

3.3.3.2 The cases of i) two and ii) orthogonal hy-

perplanes ...... 102

3.3.3.3 The case of three equiangular hyperplanes ...... 105

3.3.3.4 Conditions of global optimality for an arbitrary hyper-

plane arrangement ...... 114

3.3.4 Theoretical analysis of the discrete problem ...... 120

3.3.5 Algorithms ...... 131

ix CONTENTS

3.3.5.1 Learning a hyperplane arrangement sequentially . . . . . 132

3.3.5.2 K-Hyperplanes via DPCP ...... 132

3.3.6 Experimental evaluation ...... 134

3.3.6.1 Synthetic data ...... 134

3.3.6.2 3D plane clustering of real Kinect data ...... 142

3.4 Conclusions ...... 148

4 Advances in Algebraic Subspace Clustering 155

4.1 Review of algebraic subspace clustering ...... 156

4.1.1 Subspaces of codimension 1 ...... 157

4.1.2 Subspaces of equal dimension ...... 160

4.1.3 Known number of subspaces of arbitrary dimensions ...... 161

4.1.4 Unknown number of subspaces of arbitrary dimensions ...... 165

4.1.5 Computational complexity and recursive ASC ...... 167

4.1.6 Instability in the presence of noise and spectral ASC ...... 168

4.1.7 The challenge ...... 170

4.2 Filtrated algebraic subspace clustering (FASC) ...... 171

4.2.1 Filtrations of subspace arrangements: geometric overview . . . . . 171

4.2.2 Filtrations of subspace arrangements: theory ...... 177

4.2.2.1 Data in general position in a subspace arrangement . . . 177

4.2.2.2 Constructing the first step of a filtration ...... 182

4.2.2.3 Deciding whether to take a second step in a filtration . . . 185

x CONTENTS

4.2.2.4 Taking multiple steps in a filtration and terminating . . . 188

4.2.2.5 The FASC algorithm ...... 193

4.3 Filtrated spectral algebraic subspace clustering ...... 195

4.3.1 Implementing robust filtrations ...... 195

4.3.2 Combining multiple filtrations ...... 198

4.3.3 The FSASC algorithm ...... 199

4.3.4 A distance-based affinity ...... 200

4.3.5 Discussion on the computational complexity ...... 203

4.4 Experiments ...... 205

4.4.1 Experiments on synthetic data ...... 206

4.4.2 Experiments on real motion ...... 217

4.5 Algebraic clustering of affine subspaces ...... 219

4.5.1 Motivation ...... 219

4.5.2 Problem statement and traditional approach ...... 221

4.5.3 Algebraic geometry of unions of affine subspaces ...... 224

4.5.3.1 Affine subspaces as affine varieties ...... 224

4.5.3.2 The projective of affine subspaces ...... 227

4.5.4 Correctness theorems for the homogenization trick ...... 232

4.6 Conclusions ...... 238

4.7 Appendix ...... 239

4.7.1 Notions from commutative algebra ...... 239

xi CONTENTS

4.7.2 Notions from algebraic geometry ...... 241

4.7.3 Subspace arrangements and their vanishing ideals ...... 244

5 Conclusions 255

Bibliography 257

Vita 274

xii List of Tables

3.1 Mean running times in seconds, corresponding to the experiment of Figure 3.13 for data balancing parameter α =1...... 142 3.2 3D plane clustering error for a of the real Kinect dataset NYUdepthV2. n is the number of fitted planes. GC(0) and GC(1) refer to clustering error without or with spatial smoothing, respectively...... 149

4.1 Mean subspace clustering error in % over 100 independent trials for syn- thetic data randomly generated in three random subspaces of R9 of di- mensions (d1,d2,d3). There are 200 points associated to each subspace, which are corrupted by zero-mean additive white noise of standard devia- tion σ =0, 0.01 and support in the orthogonal complement of the subspace. 207 4.2 Mean subspace clustering error in % over 100 independent trials for syn- thetic data randomly generated in three random subspaces of R9 of di- mensions (d1,d2,d3). There are 200 points associated to each subspace, which are corrupted by zero-mean additive white noise of standard de- viation σ = 0.03, 0.05 and support in the orthogonal complement of the subspace...... 208 4.3 Mean intra-cluster connectivity over 100 independent trials for synthetic data randomly generated in three random subspaces of R9 of dimensions (d1,d2,d3). There are 200 points associated to each subspace, which are corrupted by zero-mean additive white noise of standard deviation σ and support in the orthogonal complement of each subspace...... 209 4.4 Mean inter-cluster connectivity in % over 100 independent trials for syn- thetic data randomly generated in three random subspaces of R9 of dimen- sions (d1,d2,d3). There are 200 points associated to each subspace, which are corrupted by zero-mean additive white noise of standard deviation σ and support in the orthogonal complement of each subspace...... 210

xiii LIST OF TABLES

4.5 Mean running time of each method in seconds over 100 independent trials for synthetic data randomly generated in three random subspaces of R9 of dimensions (d1,d2,d3). There are 200 points associated to each subspace, which are corrupted by zero-mean additive white noise of standard devia- tion σ =0.01 and support in the orthogonal complement of each subspace. The reported running time is the time required to compute the affinity ma- trix, and it does not include the spectral clustering step. The experiment is run in MATLAB on a standard Macbook-Pro with a dual core 2.5GHz Processor and a total of 4GB Cache memory...... 211 4.6 Mean subspace clustering error in % over 100 independent trials for syn- thetic data randomly generated in four random subspaces of R9 of di- mensions (8, 8, 5, 3). There are 200 points associated to each subspace, which are corrupted by zero-mean additive white noise of standard devia- tion σ = 0, 0.01, 0.03, 0.05 and support in the orthogonal complement of each subspace...... 215 4.7 Mean clustering error (E) in %, intra-cluster connectivity (C1), and inter- cluster connectivity (C2) in % for the Hopkins155 data ...... 218

xiv List of Figures

3.1 Various vectors and angles appearing in the proof of Theorem 3.4...... 44 3.2 Geometry of the optimality condition (3.98) and (3.114) for the case d = 1,D = 2,M = N = 5. The polytope Mob∗ + Conv( o ) + Span(b∗) ± 1 misses the point N Sign(xˆ>b∗)xˆ and so the optimality condition can not − be true for both b∗ = Span(xˆ) and φ large...... 63 3.3 Geometry of the optimality6⊥ S condition (3.98) and (3.114) for the case d = 1,D = 2,M = N = 5. A critical b∗ exists, but its angle from is 6⊥ S S small, so that the polytope Mob∗ +Conv( o )+Span(b∗) can contain the ± 1 point N Sign(xˆ>b∗)xˆ. However, b∗ can not be a global minimizer, since small− angles from yield large objective values...... 64 3.4 Geometry of the optimalityS condition (3.98) and (3.114) for the case d = 1,D = 2,N << M. Critical points b∗ do exist and moreover they can have large angle from . This is because6⊥ S N is small and so the poly- S tope Mob∗ +Conv( o1)+Span(b∗) contains the point N Sign(xˆ>b∗)xˆ. Moreover, such critical± points can be global minimizers.− Condition (3.97) of Theorem 3.12 prevents such cases from occuring...... 65 3.5 Various quantities associated to the performance of DPCP-r; see 3.2.5.1. Figure 3.5(a) shows whether condition (3.97) is true (white) or not§ (black). Figure 3.5(b) shows the angle from of nˆ 10 after 10 iterations of DPCP- r when (3.97) is true; the other casesS are mapped to black. Figure 3.5(c)

shows whether a random nˆ 0 satisfies φ0 >φ0∗, where φ0∗ is as in Thm. 3.15. Figure 3.5(d) shows the angle from of nˆ 10 for random nˆ 0. Figure 3.5(e) shows the angle from of the rightS singular vector of X˜ corresponding to the smallest singular value,S Figure 3.5(f) shows the corresponding angle of

nˆ 10, and Figure 3.5(g) plots φ0∗...... 83

xv LIST OF FIGURES

3.6 Outlier/Inlier separation in the absence of noise over 10 independent trials. The horizontal axis is the outlier ration defined as M/(N + M), where M is the number of outliers and N is the number of inliers. The vertical axis is the relative inlier subspace dimension d/D; the dimension of the ambient space is D = 30. Success (white) is declared by the existence of a threshold that, when applied to the output of each method, perfectly separates inliers from outliers...... 85 3.7 ROC curves as a of noise standard deviation σ and outlier percent- age R, for subspace dimension d = 25 in ambient dimension D = 30. The horizontal axis is False Positives ratio and the vertical axis is True Positives ratio. The number associated with each curve is the area above the curve; smaller numbers reflect more accurate performance...... 88 3.8 ROC curves as a function of noise standard deviation σ and outlier percent- age R, for subspace dimension d = 29 in ambient dimension D = 30. The horizontal axis is False Positives ratio and the vertical axis is True Positives ratio. The number associated with each curve is the area above the curve; smaller numbers reflect more accurate performance...... 89 3.9 Average ROC curves and areas over the curves for different percentages R of outliers; see 3.2.5.3. Both inliers and outliers come from EYaleB. C1 means data are centered (C0 not centered), N1 means data are normalized (N0 not normalized)...... 91 3.10 Average ROC curves and areas over the curves for different percentages R of outliers; see 3.2.5.3. Inliers come from EYaleB, outliers from Cal- tech101. C1 means data are centered (C0 not centered), N1 means data are normalized (N0 not normalized)...... 92 3.11 ROC curves for three different projection dimensions, when there are 33% face outliers; data are centered but not normalized (C1 N0)...... 93 3.12 ROC curves for three different projection dimensions, when− there are 33% outliers from Caltech101; data are centered but not normalized (C1 N0). . 93 3.13 Clustering accuracy as a function of the number of hyperplanes n vs− rela- tive dimension d/D vs data balancing (α). White corresponds to 1, black to 0...... 135 3.14 Clustering accuracy as a function of the number of hyperplanes n vs rela- tive dimension d/D. Data balancing parameter is set to α =0.8...... 136 3.15 Clustering accuracy as a function of the number of hyperplanes n vs outlier ratio vs data balancing (α). White corresponds to 1, black to 0...... 137 3.16 Clustering accuracy as a function of the number of hyperplanes n vs outlier ratio. Data balancing parameter is set to α =0.8...... 138 3.17 Segmentation into planes of 5 in dataset NYUdepthV2 without spa- tial smoothing. Numbers are segmentation errors...... 150 3.18 Segmentation into planes of image 5 in dataset NYUdepthV2 with spatial smoothing. Numbers are segmentation errors...... 151

xvi LIST OF FIGURES

3.19 Segmentation into planes of image 2 in dataset NYUdepthV2 without spa- tial smoothing. Numbers are segmentation errors...... 152 3.20 Segmentation into planes of image 2 in dataset NYUdepthV2 with spatial smoothing. Numbers are segmentation errors...... 153

4.1 A union of two lines and one plane in general position in R3...... 161

4.2 The geometry of the unique degree-2 polynomial p(x)=(b1>x)(f >x) that vanishes on 1 2 3. b1 is the normal vector to plane 1 and f is the normal vectorS to∪S the∪S plane spanned by lines and .S ...... 163 H23 S2 S3 4.3 (a): The plane spanned by lines 2 and 3 intersects the plane 1 at the line . (b): Intersection of the originalS subspaceS arrangement =S S4 A S1 ∪S2 ∪S3 with the intermediate ambient space (1), giving rise to the intermediate V1 subspace arrangement (1) = . (c): Geometry of the unique A1 S2 ∪S3 ∪S4 degree-3 polynomial p(x)=(b2>x)(b3>x)(b4>x) that vanishes on 2 3 (1) S ∪S ∪ 4 as a variety of the intermediate ambient space 1 . bi i, i =2, 3, 4. . 175 4.4 ClusteringS error ratios for both 2 and 3 motionsV in Hopkins155,⊥S ordered increasingly for each method. Errors start from the 90-th smallest error of each method...... 219

xvii Chapter 1

Introduction

1.1 Modeling data with linear subspaces

1.1.1 Modeling data with a single subspace

In many fields of science and engineering, datasets can naturally be viewed as of

a coordinate RD, with each data point being a vector of D coordinates. For

example, a digital grayscale image can be viewed as a coordinate vector having as many

coordinates as the number of pixels. It is then intuitively expected that if a collection of

images have similar content, then their representation in RD would not be arbitrary, rather it would exhibit a special structure, reflecting the fact that these points correspond to similar images. It is further expected, that the more similar the images are, the simpler the structure

1 CHAPTER 1. INTRODUCTION of their representation in RD will be. Then knowledge of this structure can potentially be used to simplify the representation of the dataset, i.e., for using less than D coordinates, or for comparing the dataset to another dataset, and so on.

Since RD is a geometric object naturally generalizing the three-dimensional space, it is reasonable, in our effort to formalize concepts such as structure and its attributes special or simpler, to consider geometric objects of RD itself. Among various possibilities, linear subspaces have proved both simple and effective. Indeed, the question of finding a linear subspace of dimension d < D, that passes as close as possible to a given collection of points of RD, is an old one, and its classical solution, known as Principal Component Analysis

(PCA) [50, 78], is still one of the most popular techniques in data analysis, in areas as diverse as engineering [74], economics and sociology [116], chemistry [57], physics [68], and genetics [79] to name a few; see [55] for more applications.

The success of PCA is largely due to two facts: First, the hypothesis that data lie close to a proper linear subspace of the ambient space is a valid one in many applications, e.g., face images of the same individual under fixed pose but varying illumination conditions lie close to a 9-dimensional linear subspace [4], trajectories across video frames of image points that belong to the same rigid body lie close to a 3-dimensional affine subspace [94], and point correspondences between two views of the same static scene lie close to an 8- dimensional linear subspace [47]. Second, the error metric used in PCA to penalize the modeling errors is the Euclidean distance. This allows for computation of the optimal subspace in closed form solely via linear algebraic algorithms, in particular via the Singular

2 CHAPTER 1. INTRODUCTION

Value Decomposition (SVD), for which very efficient software has been developed through the years.

1.1.2 Modeling data with multiple subspaces

Even though the data model of a single linear subspace is a fundamental one, there are many occasions, where modeling with multiple subspaces is more precise. Intuitively, this is the case when there are more than one classes present in the data.

As a practical example, consider a moving surveillance camera taking a video of a moving car, with the rest of the background being fixed. As already mentioned, point trajectories corresponding to the car lie close to a 3-dimensional affine subspace, because they belong to the same rigid body motion. On the other hand, the motion of the camera induces a motion to the background, so that in fact there are two linear subspaces associ- ated with the data: one corresponding to the motion of the car (original motion of the car superimposed with the motion of the camera), and one corresponding to the motion of the background. Thus, the right data model for this example is a union of two 3-dimensional affine subspaces. As another example, a collection of face images of n individuals under

fixed pose but varying illumination, is naturally modeled by a union of n 9-dimensional linear subspaces [4].

As it turns out, there is a variety of applications across diverse areas of study, where a union of subspaces, also known as a subspace arrangement, is a more appropriate model than a single subspace. Important applications include computer vision problems, such as

3 CHAPTER 1. INTRODUCTION

motion segmentation [107, 110], structure from motion [46] and multiple view geometry

[47], face clustering [27], and 3D point cloud analysis [81], as well as genomics [103], document clustering [87], and system identification [71]. As a consequence, the problem of fitting a subspace arrangement to a given set of points of RD has gained significant attention over the past 15 years [9,15,27,28,30,31,42,64,66,69,93,102,106,109–111,119], giving rise to the field of subspace clustering [105], also known as Generalized Principal

Component Analysis [112].

1.2 Challenges and the role of dimension

There are many challenges associated with learning linear subspaces from a given dataset.

To begin with, suppose that we are given a dataset that perfectly lies in the union of n linear

subspaces of known dimensions, and suppose that the goal is to find the subspaces and clus-

ter the data according to their dataset membership. If the subspaces themselves were given,

then we could simply cluster the data points based on their distance to the given subspaces

(if a point has zero distance from a subspace, then it belongs to that subspace). If, on the

other hand, the clustering of the data according to their subspace membership were known,

then we could estimate the subspaces by applying PCA to each cluster. Nevertheless, if

both the subspaces and the clustering are unknown, then this is an especially challenging

problem, since it is not clear how to set up a rigorous solution.

On top of the intrinsic difficulty of subspace learning, most real world datasets are

contaminated with outliers and noise, and they may have missing entries. Moreover, both

4 CHAPTER 1. INTRODUCTION the number of subspaces and their dimensions are usually unavailable, and they need to be estimated from the data.

When the underlying subspaces are low-dimensional, additional structures are present in the dataset, such as low-rank components [31, 64, 66, 106, 118] or sparse self-expressive patterns [27, 28, 30, 54, 77, 84, 120], which can be elegantly used to overcome many of the aforementioned challenges. As an example, when the data lie close to a low-dimensional subspace and they are corrupted in an entry-wise fashion, the robust PCA method of [12] detects and corrects these corruptions, by decomposing the observed data to the sum of a low-rank and a . However, when the underlying subspaces have high relative dimensions (the relative dimension of a subspace is the of the subspace dimension over the dimension of the ambient space), these additional structures disappear, which makes dealing with outliers, handling unknown a-priori knowledge of the subspace dimensions, and designing scalable algorithms even more challenging, and in fact these challenges are currently largely unsolved.

1.3 Contributions of this thesis

This thesis addresses several challenges associated with learning subspaces of high relative dimension. The first part of the thesis is devoted to a new subspace learning method called

Dual Principal Component Pursuit, which is shown to advance the state-of-the-art in learn- ing a single subspace of high relative dimension in the presence of outliers, as well as in clustering multiple hyperplanes (subspaces of maximal relative dimension). The second

5 CHAPTER 1. INTRODUCTION part of the thesis is devoted to a new subspace clustering method called Filtrated Algebraic

Subspace Clustering, which is shown to advance the state-of-the-art of Algebraic Subspace

Clustering (ASC) [109–111], the latter being one of the best theoretically justifiable meth- ods for clustering data drawn from a union of subspaces of high relative dimension.

1.3.1 Dual principal component pursuit (DPCP)

The first part of this thesis (Chapter 3) introduces a new subspace learning method called

Dual Principal Component Pursuit (DPCP) [96, 98, 99]. The adjective dual refers to the fact that DPCP searches for a for the orthogonal complement of a subspace. As such, DPCP is naturally suited for subspaces of maximal relative dimension, for which, the orthogonal complement is a low-dimensional subspace. Nevertheless, DPCP can in principle be applied to subspaces of any dimension.

In a noiseless setting, the basic idea of DPCP is to search for a hyperplane that contains as many points of the dataset as possible. Such a hyperplane will be called a maximal hy- perplane. As it turns out, any maximal hyperplane tends to contain all points coming from the same subspace, a property that proves to be valuable for subspace learning, particu- larly in high relative dimensions. For example, if the dataset consists of points lying in a single hyperplane and is corrupted by many outliers, then, if the data points are in general position, there is a unique maximal hyperplane coinciding with the hyperplane associated to the inliers. On the other hand, if the data are drawn from a union of hyperplanes, and once again are in general position (inside their respective hyperplanes), then any maximal

6 CHAPTER 1. INTRODUCTION hyperplane must be one of the hyperplanes associated to the data. This concept general- izes to the case where the subspaces are not hyperplanes, in which case the objective is to compute a basis for the orthogonal complement of each of the underlying subspaces.

We cast the computational search for maximal hyperplanes as an `1 minimization prob- lem on the sphere, whose global minimizer is ideally the normal vector to such a hyper- plane. While the objective function is convex, the constraint is not, which renders the problem non-convex. An extensive mathematical analysis culminates in theorems that de- scribe conditions, under which any global solution of the non-convex problem is orthogonal to the underlying subspace (or orthogonal to the dominant one if there are more than one subspaces), thus serving as a guarantee for a recursive computation of the orthogonal com- plement of the subspace.

As far as the non-convex optimization problem is concerned, the thesis investigates several different techniques for solving it. Among them, emphasis is placed on minimizing the `1 objective on an adaptively learned of tangent spaces to the sphere, which computationally amounts to solving a recursion of linear programs. The mathematical analysis provides theorems that establish convergence of the recursion in a finite number of iterations to a vector orthogonal to the underlying subspace. Other scalable techniques for solving the non-convex problem are also investigated.

The thesis discusses extensive experiments on synthetic data, which demonstrate the superiority of DPCP to the state-of-the-art in learning a single subspace of high relative dimension in the presence of outliers, or clustering data drawn from a union of hyperplanes.

7 CHAPTER 1. INTRODUCTION

Experiments using real data show that DPCP is competitive to the state-of-the-art in fitting

planes to 3D point clouds, or detecting outliers that corrupt a collection of face images of a

single individual.

1.3.2 Filtrated algebraic subspace clustering

The second part of this thesis (Chapter 4) is concerned with Algebraic Subspace Clustering

(ASC) [109–111], which is one of the most theoretically appropriate methods for clustering

data drawn from a union of subspaces of high relative dimension. Moreover, it has very

interesting connections to algebraic geometry [45,48] and commutative algebra [1,25,73],

which are fascinating branches of mathematics very rarely used in machine learning (see

also [67] for another recent such related method). As such, ASC is a very appealing method

to study and improve towards making progress in learning subspaces of high relative di-

mension.

The main idea behind ASC is that a (transversal) union of linear subspaces is uniquely

characterized by a set of vanishing polynomials, which can be estimated from the data.

Once the vanishing polynomials are available (or their estimates), a subspace associated to the data can be obtained as the linear subspace generated by the gradients of all the polynomials evaluated at any (non-singular) point in the subspace.

There are two main challenges that have been preventing ASC from being applicable to modern datasets. First, one needs to estimate the correct number of linearly independent vanishing polynomials of degree equal to the number of the subspaces. Numerically, this

8 CHAPTER 1. INTRODUCTION

corresponds to estimating the rank of a matrix, which is a precarious task in the presence of

even moderate amounts of noise, particularly, since, it is known that the clustering accuracy

is very sensitive to the estimation of this rank. The second main challenge of ASC is

that, even though the vanishing polynomials can be computed using the Singular Value

Decomposition (SVD), the SVD itself is to be applied on a multivariate Vandermode matrix

whose dimension is exponential in the number of subspaces and the ambient dimension. In

other words, ASC is characterized by an exponential computational complexity.

The main contribution of the second part of this thesis is a new algebraic algorithm,

called Filtrated Algebraic Subspace Clustering (FASC) [95, 101]. The key idea behind

FASC is to construct a nested sequence := of subspace arrangements A A0 ⊃ A1 ⊃ ··· , ,... (i.e., a descending filtration), with each arrangement embedded in an inter- A0 A1 Ai

mediate ambient space i of strictly smaller dimension than i 1, where is the original V V − A union of subspaces and = RD is the original ambient space. Given a point in one of V0 the subspaces, say x , the method constructs a nested sequence of subspace ar- ∈S⊂A rangements with the following properties: (1) is contained in A0 ⊃ A1 ⊃···⊃Ac S each intermediate arrangement, i.e., ; (2) the sequence stabilizes at after a finite S⊂Ai S number of steps, i.e., = , and (3) the codimension of the subspace is equal to the num- S Ac ber of steps of the filtration, i.e., c = codim( ). As a consequence, one can identify the S subspace containing each data point by constructing a filtration at that point. The numerical implementation of FASC is based on using the filtration of two distinct points in the dataset to determine whether they lie in the same subspace or not. This is the basic concept behind

9 CHAPTER 1. INTRODUCTION

defining a pairwise affinity between all points in the dataset, and subsequently obtaining

the data clusters by means of standard spectral clustering. Interestingly, the resulting al-

gorithm, called Filtrated Spectral Algebraic Subspace Clustering (FSASC) [97], not only

dramatically improves upon the performance of earlier ASC methods, thus addressing the

challenge of robustness of ASC to noise, but also exhibits state-of-the-art performance in

the Hopkins155 dataset [94], which is a benchmark dataset for motion segmentation.

A second contribution of this second part of this thesis is a rigorous algebraic-geometric

study of the algebraic clustering of affine subspaces [100]. In fact, very little theory exists

in the subspace clustering literature as far as dealing with affine subspaces is concerned.

Usually, affine subspaces are treated in the machine learning literature by reduction to the

case of linear subspaces, by means of appending an extra coordinate in the coordinate rep-

resentation of the data [104]. This is known as projectivization, homogenization or more

casually as the homogenization trick. Even though homogenization has been successful, it has been viewed only as a heuristic, not supported by any actual theory. This thesis es- tablishes the theoretical correctness of dealing with affine subspaces through homogeneous coordinates, when clustering is done algebraically.

1.4 Notation

We begin with some general notation. The notation ∼= stands for isomorphism in whatever category the objects lying to the left and right of the symbol belong to. The notation ' denotes approximation. For any positive n let [n] := 1, 2,...,n . For any positive { } 10 CHAPTER 1. INTRODUCTION number α let α denote the smallest integer that is greater than α. For sets , , the d e A B set is the set of all elements of that do not belong in . The right null space A\B A B of a matrix B is denoted by (B). If is a subspace of RD, then dim( ) denotes the N S S dimension of and π : RD is the orthogonal projection of RD onto . For vectors S S → S S D D b, b0 R we let ∠b, b0 be the angle between b and b0. If b is a vector of R and ∈ S a linear subspace of RD, the principal angle of b from is ∠b,π (b). The symbol S S ⊕ denotes of subspaces. The orthogonal complement of a subspace in RD is S D D ⊥. If y ,..., y are elements of R , we denote by Span(y ,..., y ) the subspace of R S 1 s 1 s D 1 D D spanned by these elements. S − denotes the unit sphere of R . For a vector w R we ∈ define wˆ := w/ w , if w = 0, and wˆ := 0 otherwise. With a mild abuse of notation k k2 6 we will be treating on several occasions matrices as sets, i.e., if X is D N and x a point × of RD, the notation x X signifies that x is a column of X . Similarly, if O is a D M ∈ × matrix, the notation X O signifies the points of RD that are common columns of X ∩ and O. Also, the shorthand RHS stands for Right-Hand-Side, and similarly for LHS. The notation Sign denotes the sign function Sign : R 1, 0, 1 defined as → {− }

x/ x if x =0, Sign(x)=  | | 6 (1.1)   0 if x =0.

 The subdifferential of the `1-norm 

D z =(z ,...,z )> z = z (1.2) 1 D 7→ k k1 | i| i=1 X

11 CHAPTER 1. INTRODUCTION is a set-valued function on RD defined as

Sign(x) if x =0, Sgn(x)=  6 (1.3)   [ 1, 1] if x =0. −  Next, we establish some more specialized notation in support of the algebraic-geometric aspects of the thesis. We let R[x] = R[x1,...,xD] be the polynomial ring over the real numbers in D variables. We use x to denote the vector of variables x = (x1,...,xD),

D while we reserve x to denote a data point x = (χ1,...,χD) of R . We denote by R[x]`

1 the set of all homogeneous polynomials of degree ` and similarly R[x] ` the set of all ≤ homogeneous polynomials of degree less than or equal to `. R[x] is an infinite dimensional real vector space, while R[x]` and R[x] ` are finite dimensional subspaces of R[x] of di- ≤

`+D 1 `+D mensions (D) := − and , respectively. We denote by R(x) the field of M` ` `   all rational functions over R and variables x ,...,x . If p ,...,p is a subset of R[x], 1 D { 1 s} we denote by p ,...,p the ideal generated by p ,...,p (see Definition 4.49). If is a h 1 si 1 s A subset of RD, we denote by the vanishing ideal of , i.e., the set of all elements of R[x] IA A that vanish on and similarly ,` := R[x]` and , ` := R[x] `. Finally, for A IA IA ∩ IA ≤ IA ∩ ≤ D a point x R , and a set R[x] of polynomials, x is the set of gradients of all the ∈ I ⊂ ∇I| elements of evaluated at x. I

1A polynomial in many variables is called homogeneous if all monomials appearing in the polynomial have the same degree.

12 Chapter 2

Prior Art

The importance of modeling data with linear subspaces had been recognized more than 100 years ago with the seminal work of Pearson on “Lines and planes of closest fit to systems of points in space” [78]. This led to the famous Principal Component Analysis (PCA) [50,

78], whose solution is achieved in closed form through the Singular Value Decomposition

D N D (SVD). Specifically, if the data matrix is X R × , consisting of N points in R , then ∈ the linear subspace of dimension d that passes closest to all points of X in the Euclidean sense, is the linear space spanned by the first d left singular vectors of X 1.

D M However, the presence of even a few outliers O R × may significantly bias the ∈ estimated subspace away from the true underlying subspace. Indeed, it is known that this

`2-optimal linear subspace is affected by the contribution of each and every point in the

1In principle, one should first center the data to have zero mean, and then search for a linear subspace that passes close to them.

13 CHAPTER 2. PRIOR ART

˜ D (N+M) now corrupted dataset X = [X O]Γ R × , where Γ is an unknown permutation ∈ indicating that we do not know which point is an inlier and which point is an outlier. Over

the past 36 years many research efforts have been devoted to robustly learning a linear

subspace from the data, while mitigating the effect of outliers. Traditional methods, such as

RANSAC [34], Influence-based Detection, Multivariate Trimming and M-Estimators [52,

55], use techniques from robust statistics; such methods are usually based on non-convex optimization, are sensitive to initialization and admit limited theoretical guarantees. On other hand, over the past 10 years methods based on convex optimization [12, 62, 84, 118] have gained significant popularity, due to their efficient implementations and theoretical guarantees. The reader is referred to 2.1 for a brief review of single subspace learning § methods. This review is by no means intended to be exhaustive; rather it focuses on either highly popular methods, such as RANSAC [34], or methods that are conceptually distinct from other methods, and whose techniques are intellectually related with the techniques used in this thesis.2

When the outliers are themselves well modeled by a linear subspace, as is for ex- ample the case of background subtraction in video surveillance [53], or when there are multiple classes associated to the data, as is the case in motion segmentation [94], it is more precise to learn multiple subspaces from the data, instead of a single subspace. This has led to a research field known as subspace clustering [105]. In its most fundamen- tal form, the subspace clustering problem assumes that the data are drawn from n lin-

2 For an extensive literature review on single subspace learning the reader is referred to [61], or to [3, 33] and references therein for online methods.

14 CHAPTER 2. PRIOR ART

D ˜ D N ear subspaces ,..., of R , and so the data matrix X R × has the structure S1 Sn ∈

˜ D Ni X = [X X X ]Γ, where X R × are N points coming from subspace , 1 2 ··· n i ∈ i Si i =1,...,n, and Γ is an unknown permutation indicating that we do not know which point comes from which subspace. Then the goal is to find a basis for each of the underlying sub- spaces , as well as cluster the data points X˜ according to their subspace membership. By Si now, a large variety of subspace clustering methods has appeared in the literature including algebraic [109–111], statistical [42, 93], information-theoretic [70], iterative [9, 102, 121], geometric and spectral techniques [6,15,17,27,28,30,31,56,64,66,106,114]; see 2.1 for § a brief review.

Overall, when the subspace associated to the data is low-dimensional, methods such as [84,118] are known to successfully detect outliers present in the dataset. Similarly, when the data lie close to a union of low-dimensional subspaces, methods such as [30] have been shown to cluster the data with high accuracy. On the other hand, the case of high relative dimensions is considerably harder and a satisfactory solution remains yet to be established.

This is due to a number of challenges associated to subspaces of high relative dimension, which are described in 2.3. §

2.1 Learning a single subspace in the presence of outliers

RANSAC. One of the oldest and most popular outlier detection methods in PCA is Ran- dom Sampling Consensus (RANSAC) [34]. The idea behind RANSAC is simple: alternate between randomly sampling a small subset of the dataset and computing a subspace model

15 CHAPTER 2. PRIOR ART

for this subset, until a model is found that maximizes the number of points in the entire

dataset that fit to it within some error. RANSAC is usually characterized by high learning

performance. However, it requires a high computational time, since it is often the case

that exponentially many trials are required in order to sample outlier-free subsets, and thus

obtain reliable models. Additionally, RANSAC requires as input an estimate for the di-

mension of the subspace as well as a thresholding parameter, which is used to distinguish

outliers from inliers; naturally the performance of RANSAC is very sensitive to these two

parameters.

`2,1-RPCA. Contrary to the classic principles that underlie RANSAC, modern methods for outlier detection in PCA are primarily based on convex optimization. One of the earliest and most important such methods, to be referred to as `2,1-RPCA, is the method of [118], which is in turn inspired by the Robust PCA algorithm of [12]. `2,1-RPCA computes a

3 (` + `2,1)-norm decomposition of the data matrix, instead of the (` + `1)-decomposition ∗ ∗

in [12]. More specifically, `2,1-RPCA solves the optimization problem

min L + λ E 2,1 , (2.1) L,E: X˜=L+E k k∗ k k which attempts to decompose the data matrix X˜ = [X O]Γ into the sum of a low-rank matrix L, and a matrix E that has only a few non-zero columns. The idea is that L is associated with the inliers, having the form L = [X 0D M ]Γ, and E is associated with ×

the outliers, having the form E = [0D N O]Γ. The optimization problem (2.1) is convex × and admits theoretical guarantees and efficient ADMM [7] implementations. However, it

3 Here `∗ denotes the nuclear norm, which is the sum of the singular values of the matrix. Also, `2,1 is defined as the sum of the euclidean norms of the columns of a matrix.

16 CHAPTER 2. PRIOR ART

is expected to succeed only when the intrinsic dimension d of the inliers is small enough

(otherwise [X 0D M ] will not be low-rank), and the outlier ratio is not too large (otherwise ×

[0D N O] will not be column-sparse). Finally, notice that `2,1-RPCA does not require as × input the subspace dimension d, because it does not directly compute an estimate for the

subspace. Rather, the subspace can be obtained subsequently by doing classic PCA on L, and now one does need an estimate for d.

SE-RPCA. Another state-of-the-art method, referred to as SE-RPCA, is based on the self-expressiveness property of the data matrix, a notion popularized by the work of [29,30] in the area of subspace clustering [105]. More specifically, observe that if a column of X˜ is an inlier, then it can in principle be expressed as a of d other columns of X˜, which are inliers. If the column is instead an outlier, then it will in principle require

D other columns to express it as a linear combination. The self expressiveness matrix C can be obtained as the solution to the convex optimization problem

min C s.t. X˜ = X˜C, Diag(C)= 0. (2.2) C k k1

Having computed the matrix of coefficients C, and under the hypothesis that d/D is small, ˜ a column of X is declared as an outlier, if the `1 norm of the corresponding column of C

is large; see [84] for an explicit formula. SE-RPCA admits theoretical guarantees [84] and

efficient ADMM implementations [30]. However, as is clear from its description, it is ex-

pected to succeed only when the relative dimension d/D is sufficiently small. Nevertheless,

in contrast to `2,1-RPCA, which in principle fails in the presence of a very large number of

outliers, SE-RPCA is still expected to perform well, since the existence of sparse subspace-

17 CHAPTER 2. PRIOR ART

preserving self-expressive patterns does not depend on the number of outliers present. Also,

similarly to `2,1-RPCA, SE-RPCA does not directly require an estimate for the subspace dimension d. Nevertheless, knowledge of d is necessary if one wants to furnish an actual subspace estimate. This would entail removing the outliers (a judiciously chosen threshold would also be necessary here) and doing PCA on the remaining points.

REAPER. A recently proposed single subspace learning method that admits an inter- esting theoretical analysis is the REAPER [62], which is conceptually associated with the optimization problem

L min (I Π)x˜ , s.t. Π is an orthogonal projection, Trace (Π)= d, (2.3) Π k D − jk2 j=1 X ˜ where x˜j is the j-th column of X . The matrix Π appearing in (2.3) can be thought of as

D d the product Π = UU >, where U R × contains in its columns an orthonormal basis ∈ for a d-dimensional linear subspace . As (2.3) is non-convex, [62] relaxes it to the convex S semi-definite program

L

min (ID P )x˜j , s.t. 0 P ID, Trace (P )= d, (2.4) P k − k2 ≤ ≤ j=1 X whose global solution P ∗ is subsequently projected in an ` sense onto the space of rank-d ∗

orthogonal projectors. It is shown in [62] that the orthoprojector Π∗ obtained in this way

is within a neighborhood of the orthoprojector corresponding to the true underlying inlier

subspace. One advantage of REAPER with respect to `2,1-RPCA and SE-RPCA, is that

its theoretical conditions do not explicitly require the inlier dimension d to be small. On

the other hand, contrary to `2,1-RPCA, REAPER does require a-priori knowledge of the

18 CHAPTER 2. PRIOR ART

inlier dimension d. Moreover, the semi-definite program (2.4) may become prohibitively

expensive to solve even for moderate values of the ambient dimension D. As a conse- quence, [62] proposed an Iteratively Reweighted Least Squares (IRLS) scheme to obtain a numerical solution of (2.4). Interestingly, it was shown in [62] that the objective value of this IRLS scheme converges to a neighborhood of the optimal objective value of problem

(2.4); nevertheless no other properties of this scheme seem to be known.

R1-PCA. Another related method is the so-called R1-PCA [23], which attempts to solve the following problem:

˜ min X UV , s.t. U >U = I, (2.5) U,V − 2,1

where U is an orthonormal basis for the estimated subspace, and V contains in its columns the low-dimensional representations of the points. Besides for alternating minimization with a power iteration scheme that converges to a local minimum, little else is known about how to solve the non-convex problem (2.5) to global optimality.

L1-PCA∗. Finally, the method L1-PCA∗ of [11] works with the orthogonal complement

of the subspace, but it is slightly unusual in that it learns `1 hyperplanes, i.e., hyperplanes that minimize the `1 distance to the points, as opposed to the Euclidean distance, e.g., used by the classic PCA, R1-PCA, or REAPER. More specifically, an `1 hyperplane learned by the data is a hyperplane with normal vector b that solves the problem

L ˜ min xj yj s.t. yj>b =0, j [L], (2.6) b SD−1; y RD,j [L] − 1 ∀ ∈ ∈ j ∈ ∈ j=1 X

˜ where yj is the representation of point xj in the hyperplane. Overall, no theoretical guaran-

19 CHAPTER 2. PRIOR ART

tees seem to be known for L1-PCA∗, as far as the subspace learning problem is concerned.

In , L1-PCA∗ requires the solution to quadratically many linear programs of size equal to the ambient dimension, which makes it computationally expensive.

2.2 Learning multiple subspaces

Self-expressiveness-based methods. One of the most popular families of subspace clus- tering methods is based on applying spectral clustering [115] to an N N affinity matrix × W built by exploiting the self-expressiveness property of the data. This latter property says

˜ ˜ D Ni that when the data matrix X has the structure X = [X X X ], where X R × 1 2 ··· n i ∈ are N points coming from subspace , i = 1,...,n, then every column x˜ of X˜ can i Si j ˜ be written as a linear combination x˜j = X cj of other points in the dataset, with the j-th entry of the vector of coefficients cj being zero. Then the idea is that if the relationship

X˜ = X˜C, Diag(C) = 0 is enforced together with a suitable regularization on C, the points that every point x˜j selects will be from the same subspace. As it turns out, defin- ing W jj0 to be the absolute value of the j0-th entry of cj has proved to be a simple yet powerful affinity. Popular choices for the regularization on C are the `1-norm, the nuclear norm, the Frobenius norm, or combinations thereof, leading to what is known as Sparse

Subspace Clustering (SSC) [27, 28, 30], Low-Rank Subspace Clustering [31, 64, 66, 106],

Least-Squares Subspace Clustering [69], and Elastic Net Subspace Clustering [54,77,120].

Under the typical assumption that the subspaces are low-dimensional and sufficiently sep- arated, this family of methods admits theoretical guarantees regarding the correctness of

20 CHAPTER 2. PRIOR ART

the affinity W , and its robustness to outliers and noise. Moreover, efficient algorithmic

implementations are available.

Spectral Curvature Clustering (SCC). Another yet conceptually distinct method from

the ones discussed so far is Spectral Curvature Clustering (SCC) [15], which is theoreti-

cally suitable for clustering subspaces of equal dimension d. The main idea of SCC is to

build a (d+1)-fold as follows. For each (d+1)-tuple of distinct points in the dataset,

say xj1 ,..., xjd+1 , the value of the tensor is set to

2 c (x ,..., x ) A p j1 jd+1 (2.7) (j1,...,jd+1) = exp 2 , − 2σ  !

where cp(xj1 ,..., xjd+1 ) is the polar curvature of the points xj1 ,..., xjd+1 (see [15] for an explicit formula) and σ is a tuning parameter. Intuitively, the polar curvature is a multiple of the volume of the simplex of the d +1 points, which becomes zero if the points lie in the same d-dimensional subspace, and the further the points lie from any such subspace the larger their volume becomes. SCC obtains the subspace clusters by unfolding the tensor

A to an affinity matrix, upon which spectral clustering is applied. As with ASC, the main

N bottleneck of SCC is computational, since in principle all d+1 entries of the tensor need to be computed. Even though the combinatorial complexity of SCC can be reduced, this comes at the cost of significant performance degradation.

RANSAC. On the other end of the spectrum, the classical method of RANSAC, de- scribed in 2.1, is often used to learn more than one linear subspaces. More specifically, § RANSAC attempts to identify a single subspace at a time, by treating points lying in Si other subspaces as outliers. Once a subspace estimate for one subspace, say , is ob- S1 21 CHAPTER 2. PRIOR ART

tained, points lying close to the subspace are removed and a second subspace is sought. As

in the case of a single subspace with outliers, RANSAC is sensitive to its thresholding pa-

rameter, and moreover, its efficiency depends on how big the probability is that d randomly

selected points lie close to the same underlying subpsace. In turn, this probability depends

on how large D is as well as how balanced or unbalanced the clusters are. If D is small,

then RANSAC is likely to succeed with few trials. The same is true if one of the clusters,

say X , is highly dominant, i.e., N >> N , i 2, since in such a case, identifying is 1 1 i ∀ ≥ S1 likely to be achieved with only a few trials. On the other hand, if D is large and the Ni are of the same order of magnitude, then exponentially many trials are required, and RANSAC becomes inefficient.

K-Subspaces. Another classical method for clustering subspaces is the so-called K-

Subspaces, which was proposed in [9, 102]. K-Subspaces attempts to minimize the non-

convex objective function

n N 2 (B ,..., B ; s ,...,s ) := s (i) B>x , (2.8) JKS 1 n 1 N j i j 2 i=1 " j=1 # X X

where s : [n] 0, 1 is the subspace assignment of point x , i.e., s (i)=1 if and j → { } j j

D ci only if point x has been assigned to subspace i, and B R × is an orthonormal basis j i ∈ for the orthogonal complement of subspace .4 Because of the non-convexity of (2.8), Si the typical way to perform the optimization is by alternating between assigning points to

clusters, i.e., given the B assigning x to its closest subspace (in the euclidean sense), { i} j 4It should be noted here that an alternative formulation exists, where one searches directly for the basis of the subspaces together with the representation of each point to its closest subspace; the two formulations are equivalent, with the one chosen in the main text being the more efficient in the case of high relative dimensions.

22 CHAPTER 2. PRIOR ART and fitting subspaces, i.e., given the segmentation s , computing the best ` subspace for { j} 2 each cluster by means of Singular Value Decomposition on each cluster.

The theoretical guarantees of K-Subspaces are limited to convergence to a local mini- mum in a finite number of steps. Moreover, knowledge of the subspace dimensions di (or c = D d ) is required. Finally, even though the alternating minimization i − i in K-Subspaces is computationally efficient, in practice several restarts are typically used, in order to select the best among multiple local minima. In fact, the higher the ambient dimension D is, the more restarts are required, which significantly increases the compu- tational burden of K-Subspaces. Moreover, K-Subspaces is robust to noise, but it is not robust to outliers, since the update of the subspaces is traditionally done in an `2 sense.

For this reason, it has been recently considered to replace the `2-norm with the `1-norm, leading to what is known as Median K-Flats [60, 121].

RANSAC/K-Subspaces Hybrids. In principle, any single subspace learning method can be used to perform subspace clustering, either via a RANSAC-style or a via a K-

Subspaces-style scheme or a combination of both. For example, if is a method that takes M a dataset and fits to it a linear subspace, for instance REAPER ( 2.1), then one can use § M to compute a first subspace, remove the points in the dataset lying close to it, then compute a second subspace and so on (RANSAC-style). Alternatively, one can start with a random guess for n subspaces, cluster the data according to their distance to these subspaces, and then use (instead of the classic SVD) to fit a new subspace to each cluster, and so on M (K-Subspaces-style). Generally speaking, such strategies can be powerful, provided that

23 CHAPTER 2. PRIOR ART

one has a good initialization in the case of K-Subspaces variants, but more importantly,

an accurate estimate for the subspace dimensions. In practice, this latter requirement is

usually a weakness, because estimating the subspace dimensions is a hard problem.

Algebraic Subspace Clustering. ASC was originally proposed in [110] with the goal of solving the subspace clustering problem in closed form, which was first achieved in [110] for the case of a union of hyperplanes (which are subspaces of maximal relative dimension), and later in [111] for subspaces of any dimension.

The idea behind ASC is to fit a polynomial p(x ,...,x ) R[x ,...,x ] of degree n 1 D ∈ 1 D to the data, where n is the number of hyperplanes, and x1,...,xD are polynomial indeter- minates. In the absence of noise, this polynomial can be shown to have the form

p(x)=(b>x) (b>x), x := [x , ,x ]> , (2.9) 1 ··· n 1 ··· D where bi is the normal vector to the i-th hyperplane. This reduces the problem of learn- ing the n hyperplanes to that of factorizing p(x) to the product of linear factors; this was elegantly done in [110]. When the data are contaminated by noise, the fitted polynomial need no longer be factorizable. This difficulty was circumvented in [111], where it was shown that the gradient of the polynomial evaluated at point xj is a good estimate for the normal vector of the hyperplane that xj lies closest to. This idea generalizes to the case

of subspaces of arbitrary dimensions, and an elegant theorem [20, 72, 111] assures that if

p1,...,ps is a basis of polynomials of degree n that vanish on the union of the subspaces

, and y is a point of , then = Span( p y , , p y )⊥; remarkably, S1 ∪···∪Sn i Si Si ∇ 1| i ··· ∇ s| i this formula is valid irrespectively of what the subspace dimensions are.

24 CHAPTER 2. PRIOR ART

Even though the theoretical guarantees of ASC are fascinating, its practical application

has been limited mainly due to two reasons. First, available implementations of ASC are

sensitive to noise or are suitable only for hyperplanes. Second, computing vanishing poly-

nomials of degree n is a task whose complexity is exponential in the number of subspaces n and ambient dimension D.

2.3 Challenges in high relative dimensions

When the subspaces associated to the data have low dimension, the task of learning them can be successfully done by existing methods [12, 30, 84, 118]. Despite this, comparably little is known about the high relative dimension setting, which only a few methods seem able to handle (mainly on an ad-hoc level). In this paragraph some of the key challenges in the high relative dimension setting are summarized.

1. Theoretical guarantees: Subspaces of high relative dimension tend to intersect,

e.g., two general hyperplanes in ambient dimension D always intersect in a (D 2)- − dimensional subspace. As a consequence, clustering points from the union of such

subspaces is intrinsically harder than in low relative dimensions, as the membership

of points lying close to the intersection is difficult to be resolved. This fact vio-

lates a major assumption of most modern subspace clustering methods (e.g., sparse

and low-rank methods), which require the subspaces to be sufficiently separated

[27, 28, 30, 31, 54, 64, 66, 69, 77, 106, 120] in order to provide guarantees of correct-

25 CHAPTER 2. PRIOR ART

ness. On the other hand, methods that do admit theoretical guarantees for the high

relative dimension setting, such as Algebraic Subspace Clustering (ASC) [109–111]

and Spectral Curvature Clustering (SCC) [15], are plagued by non-scalable imple-

mentations. Finally, although the K-Subspaces [9, 102] method is applicable to the

high relative dimension setting and scales reasonably well, it is based on a non-

convex objective function, lacks sufficient theoretical guarantees and is sensitive to

initialization.

2. Robustness to outliers: Informally, the average angle of a random point from a

randomly chosen subspace decreases as the subspace dimension increases. As a

consequence, distinguishing inliers from outliers becomes a challenge in the high

relative dimension regime. For example, state-of-the-art robust PCA methods [84,

118] perform well only when the subspace dimension is sufficiently small. Moreover,

RANSAC [34] can deal with high relative dimensions, but outliers are troublesome

unless the ambient dimension is small. Finally, ASC [109–111] and K-Subspaces

[9,102] are sensitive to outliers since they rely on SVD-based PCA, which is known

to lack robustness to outliers.

3. A priori knowledge of subspace dimensions: When the subspaces have low relative

dimension, the dataset exhibits additional structure, such as low-rank components

[31,64,66,106,118] or sparse self-expressive patterns [27,28,30,54,77,84,120]. This

additional structure makes it possible to detect outliers or cluster the data without

26 CHAPTER 2. PRIOR ART

needing to know the subspace dimensions in advance. On the other hand, when the

subspaces are of high relative dimension, this additional structure disappears. Thus,

methods such as RANSAC [34], K-Subspaces [9,102], SCC [15] and REAPER [62]

require a priori knowledge of the subspace dimensions. Remarkably, ASC [111] does

not require such knowledge, but this advantage comes at the cost of an exponential

computational complexity.

4. Non-convex optimization: Existing methods that are in principle suitable for sub-

spaces of high relative dimension and at the same time scale well, e.g., K-Subspaces

[9, 102] or Median K-Flats (MKF) [121], involve non-convex optimization. As a

consequence, the theoretical guarantees of such methods are usually limited to con-

vergence to a local minimum. In addition, one can empirically observe that their

performance is often very sensitive to initialization; this sensitivity becomes particu-

larly pronounced in large ambient dimensions.

5. Scalability: Some methods overcome several of the above challenges, but this comes

at the cost of high computational complexity. For example, ASC [109–111] has

strong theoretical guarantees but also exponential complexity, due to the fitting of

polynomials of degree n to the data. RANSAC [34] is robust to outliers but may

require an exponentially large number of trials in order to sample outlier-free subsets

of the data. SCC [15] can, in theory, cluster subspaces of equal high relative di-

mension but it requires the computation of exponentially many relations between the

27 CHAPTER 2. PRIOR ART

data. On the other hand, K-Subspaces [9,102] scales fairly well, but it is sensitive to

initialization, outliers, and requires advanced knowledge of the subspace dimensions.

In summary, while there has been much progress in subspace learning and clustering, most of the existing work does not apply to the case of subspaces of high relative dimension.

The goal of this thesis is to address several of the fundamental challenges mentioned above, by introducing Dual Principal Component Pursuit (Chapter 3) and Filtrated Algebraic

Subspace Clustering (Chapter 4) .

28 Chapter 3

Dual Principal Component Pursuit

3.1 Introduction

This chapter contains the first main contribution of this thesis, which is a method for ro-

bust subspace learning and clustering, called Dual Principal Component Pursuit (DPCP).

DPCP aims at learning a subspace by explicit computation of its orthogonal complement; this is a natural approach, when the subspace has high relative dimension, in which case its orthogonal complement is a low-dimensional linear subspace. Even though traditional methods such as PCA [50, 55, 78], RANSAC [34] or K-Subspaces [9, 60, 102, 121] can be configured to estimate the orthogonal complement of a subspace, the advantage of DPCP over such methods is that DPCP computes a basis for the orthogonal complement via solv-

29 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT

ing a non-convex problem to global optimality.1

Consider for a moment the case of single subspace learning in the presence of outliers,

and suppose that there is no noise, i.e., that the inliers perfectly lie inside a linear subspace

of dimension d < D (here d can be anything). Then the key idea of DPCP comes from S

the fact that the inliers lie inside any hyperplane = Span(b )⊥ that contains the linear H1 1 subspace associated to the inliers. This suggests that, instead of attempting to fit directly S a low-dimensional linear subspace to the entire dataset, as done e.g. in [118], we can search for a maximal hyperplane that contains as many points of the dataset as possible. When H1 the data are in general position2, such a maximal hyperplane will contain the entire set of inliers together with possibly a few outliers. After removing the points that do not lie in that hyperplane, the robust subspace learning problem is reduced to one with a potentially much smaller outlier percentage than in the original dataset. In fact, the number of outliers in the new dataset will be precisely D d 1 D 2, and this upper bound can be used − − ≤ − to dramatically facilitate the outlier detection process using existing methods.

As an example, suppose we have N = 1000 inliers lying in general position inside a

90-dimensional linear subspace of R100. Suppose that the dataset is corrupted by M =

1000 outliers lying in general position in R100. Then among all hyperplanes of R100, the hyperplanes that contain as many points as possible from the dataset (maximal hyperplanes) must contain all 1000 inliers. On the other hand, since the dimensionality of the inliers is

1A recent method that also works with the orthogonal complement of the subspace is [11], for which, however, no theoretical guarantees seem to be known; see 2.1. 2By general position we mean that there are no relations§ between the data points, other than those implied by the membership of the inliers to the subspace . S

30 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT

90 and the dimensionality of the hyperplane is 99, there are only 99 90 = 9 linearly − independent directions left for the hyperplane to fit, i.e., such a hyperplane will contain

exactly 9 outliers (it can not contain more outliers since this would violate the general

position hypothesis). If we remove the points of the dataset that do not lie in the hyperplane,

then we are left with 1000 inliers and only 9 outliers; interestingly, a simple application of

RANSAC will identify the remaining 9 outliers in only a few trials.

Alternatively, one can consider finding a sequence of c = D d orthogonal maximal −

hyperplanes = Span(b )⊥ with b b , i = j, thus leading to a Dual Principal Com- Hi i i ⊥ j 6 ponent Analysis (DPCA) of the dataset, in the sense that the inlier subspace is precisely S equal to c . Finally, when the data come from multiple subspaces, every maximal i=1 Hi hyperplaneT contains all points coming from one of the subspaces and the ideas presented

above can be adapted for the purpose of subspace clustering.

It follows from the above discussion that the problem of searching for maximal hyper-

planes is an important one. As it turns out, we can formalize this task as an `0 cosparsity-

type problem [75], which we will further relax to a non-convex `1 problem on the sphere. A large part of this chapter is devoted to showing that every global solution of this latter DPCP problem is a vector orthogonal to a subspace associated with the data. We further consider solving this problem by a recursion of convex relaxations, each of which is computation- ally equivalent to a linear program. We show that under mild conditions this recursion converges in a finite number of steps to a vector orthogonal to a subspace associated to the data. In particular, for the case of hyperplanes, the recursion converges to the global

31 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT

minimum of the `1 non-convex problem. Alternative methods suitable for large-scale data

are also explored for solving the DPCP problem, and extensive experiments are performed

on both synthetic and real data, demonstrating the merit of DPCP.

3.2 Single subspace learning with outliers via DPCP

3.2.1 Problem formulation

In this section we formulate the problem of interest. We begin by establishing our data

model in 3.2.1.1, while in 3.2.1.2 we motivate the problem at a conceptual level. Finally, § § in 3.2.1.3 we formulate our problem as an optimization problem. §

3.2.1.1 Data model

We employ a deterministic noise-free data model, under which the given data is

˜ D L X =[X O]Γ =[x˜ ,..., x˜ ] R × , (3.1) 1 L ∈

D N D 1 where the inliers X =[x ,..., x ] R × lie in the intersection of the unit sphere S − 1 N ∈ with an unknown proper subspace of RD of unknown dimension 1 d D 1, and S ≤ ≤ − D M D 1 the outliers consist of M points O = [o ,..., o ] R × that lie on the sphere S − . 1 M ∈ The unknown permutation Γ indicates that we do not know which point is an inlier and which point is an outlier. Finally, we assume that the points X˜ are in general position, in the sense that there are no relations between the columns of X˜ except the ones implied by the inclusions X and X˜ RD. In particular, every D-tuple of columns of X˜ such ⊂ S ⊂ 32 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT

that at most d points come frorm X is linearly independent. Notice that as a consequence

every d-tuple of inliers and every D-tuple of outliers are linearly independent, and also

X O = . Finally, to avoid degenerate situations we will assume that N d +1 and ∩ ∅ ≥ M D d.3 ≥ −

3.2.1.2 Conceptual formulation

Notice that we have made no assumption on the dimension of : indeed, can be anything S S from a line to a (D 1)-dimensional hyperplane. Ideally, we would like to be able to − partition the columns of X˜ into those that lie in and those that don’t. But under such S generality, this is not a well-posed problem since X lies inside every subspace that contains

, which in turn may contain some elements of O. In other words, given X˜ and without S any other a priori knowledge, it may be impossible to correctly partition X˜ into X and O.

Instead, it is meaningful to search for a linear subspace of RD that contains all of the inliers

and perhaps a few outliers. Since we do not know the intrinsic dimension d of the inliers, a natural choice is to search for a hyperplane of RD that contains all the inliers.

Problem 1. Given the dataset X˜ = [X O] Γ, find a hyperplane that contains all the H inliers X .

Notice that hyperplanes that contain all the inliers always exist: any non-zero vector

b inside the orthogonal complement ⊥ of the linear subspace associated to the inliers S S 3If the number of outliers is less than D d, then the entire dataset is degenerate, in the sense that it lies in a proper hyperplane of the ambient space.− In such a case we can reduce the coordinate representation of the data and eventually satisfy the stated condition.

33 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT

defines a hyperplane (with normal vector b) that contains all inliers X . Having such a hyperplane at our disposal, we can partition our dataset as X˜ = X˜ X˜ , where X˜ H1 1 ∪ 2 1 are the points of X˜ that lie in and X˜ are the remaining points. Then by definition of H1 2 , we know that X˜ will consist purely of outliers, in which case we can safely replace H1 2 ˜ ˜ our original dataset X with X 1 and reconsider the problem of robust subspace learning on

X˜ . We emphasize that X˜ will contain all the inliers X together with precisely D d 1 1 1 − − outliers 4, a number which may be dramatically smaller than the original number of outliers, especially when d is large. If d is small, then one may apply existing methods such as [118],

[84] or [34] to identify the remaining outliers. Alternatively, if d is known, one may repeat the above process c = codim = D d times, until c linearly independent hyperplanes S − ,..., have been found that contain all the inliers, in which case = c . H1 Hc S k=1 Hk T

3.2.1.3 Hyperplane pursuit by `1 minimization

In this section we propose an optimization framework for computing a hyperplane that contains all the inliers. To proceed, we need a definition.

Definition 3.1. A hyperplane of RD is called maximal with respect to the dataset X˜, if H ˜ it contains a maximal number of data points in X , i.e., if for any other hyperplane † of H D ˜ ˜ R we have that Card(X ) Card(X †). ∩H ≥ ∩H

In principle, hyperplanes that are maximal with respect to X˜ always solve Problem 1.

4This comes from the general position hypothesis.

34 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT

Proposition 3.2. Suppose that N d +1 and M D d, and let be a hyperplane that ≥ ≥ − H is maximal with respect to the dataset X˜. Then contains all the inliers X . H

Proof. By the general position hypothesis on X and O, any hyperplane that does not con- tain X can contain at most D 1 points from X˜. We will show that there exists a hyperplane − that contains more than D 1 points of X˜. Indeed, take d inliers and D d 1 outliers and − − − let be the hyperplane generated by these D 1 points. Denote the normal vector to that H − hyperplane by b. Since contains d inliers, b will be orthogonal to these inliers. Since X H is in general position, every d-tuple of inliers is a basis for Span(X ). As a consequence, b will be orthogonal to Span(X ), and in particular b X . This implies that X and so ⊥ ⊂H will contain N + D d 1 d +1+ D d 1 > D 1 points of X˜. H − − ≥ − − −

In view of Proposition 3.2, we may restrict our search for hyperplanes that contain all

the inliers X to the subset of hyperplanes that are maximal with respect to the dataset X˜.

The advantage of this approach is immediate: the set of hyperplanes that are maximal with

respect to X˜ is in principle computable, since it is precisely the set of solutions of the

following optimization problem

˜ min X >b s.t. b =0. (3.2) b 0 6

The idea behind (3.2) is that a hyperplane = Span(b)⊥ contains a maximal number of H columns of X˜ if and only if its normal vector b has a maximal cosparsity level with respect

to the matrix X˜>, i.e., the number of non-zero entries of X˜>b is minimal.

Since (3.2) is a combinatorial problem, which in general can not be solved efficiently,

35 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT

we consider its natural relaxation

X˜> min b s.t. b 2 =1, (3.3) b 1 k k

which we will refer to as Dual Principal Component Pursuit (DPCP). A major question

that arises, to be answered in Theorem 3.12, is under what conditions every global solution

of (3.3) is orthogonal to the inlier subspace Span(X ). A second major question, raised

D 1 by the non-convexity of the constraint b S − , is how to efficiently solve (3.3) with ∈ theoretical guarantees for global optimality.

We emphasize here that the optimization problem (3.3) is far from new; interestingly, its earliest appearance in the literature that we are aware of is in [85], where the authors proposed to solve it by means of the recursion of convex problems given by 5

˜> nk+1 := argmin X b . (3.4) b>nˆ =1 1 k

Notice that the optimization problem in (3.4) is computationally equivalent to a linear pro- gram; this makes the recursion (3.4) a very appealing candidate for solving the non-convex

(3.3). Even though [85] proved the very interesting result that the sequence n generated { k} by (3.4) converges to a critical point of (3.3) in a finite number of steps (see Theorem 3.14), there is no reason to believe that in general this limit point is a global minimum of (3.3).

Interestingly, this thesis adopts the recursion (3.4) of [85] for solving (3.3), and shows that it converges to a normal vector to the subspace (Theorem 3.15), which, in the special case of d = D 1, implies convergence to the global minimum of (3.3). − 5Being unaware of the work of [85], we independently proposed the same recursion in [96].

36 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT

Other work in which optimization problem (3.3) appears are [80, 86, 88–91]. More specifically, [86] proposes to solve (3.3) by replacing the quadratic constraint b>b = 1 with a linear constraint b>w = 1 for some vector w. In [80, 89] (3.3) is approximately solved by alternating minimization, while a Riemannian trust-region approach is employed in [90]. Finally, we note that problem (3.3) is closely related to the non-convex problem

(2.3) associated with REAPER. To see this, suppose that the REAPER orthoprojector Π appearing in (2.3), represents the orthogonal projection to a hyperplane with unit-` H 2 normal vector b. In such a case I Π = bb>, and it readily follows that problem (2.3) D − becomes identical to problem (3.3).

3.2.2 Theoretical analysis of the continuous problem

In this section we begin our theoretical investigation of the non-convex problem (3.3) as well as the recursion of convex relaxations (3.4), by associating to these two problems certain underlying continuous problems ( 3.2.2.1). We will refer to (3.3) and (3.4) as dis- § crete problems, since they involve a finite set of inliers and outliers. In sharp contrast, the continuous problems depend on uniform distributions on the respective inlier and outlier spaces, which makes them easier to analyze. The analysis of this section reveals that both the continuous analogue of (3.3) as well the continuous recursion corresponding to (3.4) are naturally associated with vectors orthogonal to the inlier subspace ( 3.2.2.2). This S § suggests that under certain conditions on the distribution of the data, the same must be true for the discrete problem (3.3) and recursion (3.4) (to be established in 3.2.3). §

37 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT

3.2.2.1 The underlying continuous problem

In this section we show that the problems of interest (3.3) and (3.4) can be viewed as dis-

crete versions of certain continuous problems, which we analyze. To begin with, consider

D 1 D 1 given outliers O = [o ,..., o ] S − and inliers X = [x ,..., x ] S − , 1 M ⊂ 1 N ⊂ S∩ and recall the notation X˜ = [X O]Γ, where Γ is an unknown permutation. Next, for any

D 1 D 1 b S − define the function fb : S − R by fb(z) = b>z . Define also discrete ∈ → D 1 measures µO and µX on S − associated with the outliers and inliers respectively, as

1 M 1 N µO(z)= δ(z o ) and µX (z)= δ(z x ), (3.5) M − j N − j j=1 j=1 X X D 1 where δ( ) is the Dirac function on S − , which is defined through the property ·

g(z)δ(z z0)dµSD−1 = g(z0), (3.6) z SD−1 − Z ∈

D 1 D 1 for every g : S − R and every z S − ; where µ D−1 is the uniform measure on → 0 ∈ S D 1 S − .

With these definitions, we have that the objective function X˜>b appearing in (3.3) 1

and (3.4) is the sum of the weighted expectations of the function under the measures O fb µ

38 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT

and µX , i.e.,

M N ˜ X >b = O>b + X >b = b>oj + b>xj (3.7) 1 1 1 j=1 j=1 X X M N = b>z δ(z oj)dµSD−1 + b>z δ(z xj)dµSD−1 (3.8) z SD−1 − z SD−1 − j=1 Z ∈ j=1 Z ∈ X X M N = b>z δ(z oj) dµSD−1 + b>z δ(z xj) dµSD−1 (3.9) z SD−1  −  z SD−1  −  Z ∈ j=1 Z ∈ j=1 X X     = M EµO (fb)+ N EµX (fb). (3.10)

Hence, the optimization problem (3.3), which we repeat here for convenience,

˜ min X >b s.t. b>b =1, (3.11) b 1

is equivalent to the problem

min [M EµO (fb)+ N EµX (fb)] s.t. b>b =1. (3.12) b

Similarly, the recursion (3.4), repeated here for convenience,

˜> nk+1 = argmin X b s.t. b>nˆ k =1, (3.13) b 1

is equivalent to the recursion

nk+1 = argmin [M EµO (fb)+ N EµX (fb)] s.t. b>nˆ k =1. (3.14) b

Now, the discrete measures µO,µX of (3.5), are discretizations of the continuous measures

D 1 µSD−1 , and µSD−1 respectively, where the latter is the uniform measure on S − . ∩S ∩S Hence, for the purpose of understanding the properties of the global minimizer of (3.12) and the limiting point of (3.14), it is meaningful to replace in (3.12) and (3.14) the discrete

39 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT

measures µO and µX by their continuous counterparts µSD−1 and µSD−1 , and study the ∩S resulting continuous problems

min M Eµ (fb)+ N Eµ (fb) s.t. b>b =1, (3.15) b SD−1 SD−1∩S  

n = argmin M E (f )+ N E (f ) s.t. b>nˆ =1. (3.16) k+1 µSD−1 b µSD−1∩S b k b   It is important to note that if these two continuous problems have the geometric proper- ties of interest, i.e., if every global solution of (3.15) is a vector orthogonal to the inlier subspace, and similarly, if the sequence of vectors n produced by (3.16) converges { k} to a vector nk∗ orthogonal to the inlier subspace, then this correctness of the continuous problems can be viewed as a first theoretical verification of the correctness of the discrete formulations (3.3) and (3.4). The objective of the rest of this section is to establish that this is precisely the case.

Before stating and proving our main two results in this direction, we note that the con- tinuous objective function

(b)= M Eµ (fb)+ N Eµ (fb) (3.17) J SD−1 SD−1∩S can be re-written in a simple form. Writing b = b bˆ, and letting R be a rotation that k k2 takes bˆ to the first standard basis vector e1, we see that the first expectation in (3.17)

40 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT

becomes equal to

D−1 EµSD−1 (fb)= fb(z)dµS (3.18) z SD−1 Z ∈

= b>z dµSD−1 (3.19) z SD−1 Z ∈

ˆ> = b 2 b z dµSD−1 (3.20) k k z SD−1 Z ∈

1 ˆ = b 2 z>R− Rb dµSD−1 (3.21) k k z SD−1 Z ∈

= b 2 z>e1 dµSD−1 (3.22) k k z SD−1 | | Z ∈

= b 2 z1 dµSD−1 = b 2 cD, (3.23) k k z SD−1 | | k k Z ∈ where z = (z1,...,zD)> is the coordinate representation of z, and cD is the mean height of the unit hemisphere of RD, given in closed form by

2 (D 2)!! π if D even, cD = −  (3.24) (D 1)!! ·  −  1 if D odd,  where the double factorial is defined as 

k(k 2)(k 4) 4 2 if k even, k!! :=  − − ··· · (3.25)   k(k 2)(k 4) 3 1 if k odd. − − ··· ·  To see what the second expectation in (3.17) evaluates to, decompose b as b = π (b)+ S

π ⊥ (b), and note that because the support of the measure µSD−1 is contained in , we S ∩S S

41 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT

must have that

E (f )= b>z dµ D−1 (3.26) µSD−1∩S b S z SD−1 ∩S Z ∈

= b>z dµSD−1 (3.27) z SD−1 ∩S Z ∈ ∩S

= (π (b))> z dµSD−1 (3.28) z SD−1 S ∩S Z ∈ ∩S

\ > = π (b) 2 π (b) z dµSD−1 . (3.29) k S k z SD−1 S ∩S Z ∈ ∩S  

Writing z0 and b0 for the coordinate representation of z and π\( b) with respect to a basis S

of , and noting that µSD−1 = µSd−1 , we have that S ∩S ∼

\ > π (b) z dµSD−1 = z0>b0 dµSd−1 = cd, (3.30) z SD−1 S ∩S z0 Sd−1 Z ∈ ∩S Z ∈   d where now cd is the average height of the unit hemisphere of R . Finally, noting that

π (b) = b cos(φ), (3.31) k S k2 k k2

where φ [0,π/2] is the principal angle of b from the subspace , we have that ∈ S

Eµ (fb)= b cd cos(φ). (3.32) SD−1∩S k k2

Putting everything together, we arrive at the final form of our continuous objective function:

(b)= M Eµ (fb)+ N Eµ (fb)= b (McD + Ncd cos(φ)) . (3.33) J SD−1 SD−1∩S k k2

3.2.2.2 Conditions for global optimality and convergence

We are now in a position to state and prove our main results about the continuous problems

(3.15) and (3.16).

42 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT

Theorem 3.3. Any global solution to problem (3.15) must be orthogonal to . S

Proof. Because of the constraint b>b = 1 in (3.15), and using (3.33), problem (3.15) can

be written as

min [McD + Ncd cos(φ)] s.t. b>b =1, (3.34) b where φ [0,π/2] is the principal angle of b from . It is then immediate that the global ∈ S minimum is equal to McD and it is attained if and only if φ = π/2, which corresponds to b . ⊥S

D 1 Theorem 3.4. Consider the sequence nk k 0 generated by recursion (3.16), nˆ 0 S − . { } ≥ ∈ Let φ be the principal angle of n from , and define α := Nc /Mc . Then, as long 0 0 S d D

as φ0 > 0, the sequence nk k 0 converges to a unit `2-norm element of ⊥ in a finite { } ≥ S

number k∗ of iterations, where k∗ = 0 if φ = π/2, k∗ = 1 if tan(φ ) 1/α, and 0 0 ≥ −1 tan (1/α) φ0 k∗ −1 − +1 otherwise. ≤ sin (α sin(φ0))   Proof. At iteration k the optimization problem associated with (3.16) is

min (b)= b 2 (McD + Ncd cos(φ)) s.t. b>nˆ k =1, (3.35) b RD J k k ∈

where φ [0,π/2] is the principal angle of b from the subspace . ∈ S Let φ be the principal angle of nˆ from (Figure 3.1), and let n be a global k k S k+1 minimizer of (3.35), with principal angle from equal to φ [0,π/2]. We show that S k+1 ∈

43 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT

⊥ S

nˆ k⊥

nˆ k†

nˆ k ψk φ hˆ S k k

Figure 3.1: Various vectors and angles appearing in the proof of Theorem 3.4.

φ φ . To see this, note that the decrease in the objective function at iteration k is k+1 ≥ k

(nˆ ) (n ) :=Mc nˆ + Nc nˆ cos(φ ) J k −J k+1 D k kk2 d k kk2 k Mc n Nc n cos(φ ). (3.36) − D k k+1k2 − d k k+1k2 k+1

Since n> nˆ = 1, we must have that n 1 = nˆ . Now if φ < φ , k+1 k k k+1k2 ≥ k kk2 k+1 k then cos(φ ) > cos(φ ). But then (3.36) implies that (n ) > (nˆ ), which is a k+1 k J k+1 J k contradiction on the optimality of n . Hence it must be the case that φ φ , and so k+1 k+1 ≥ k the sequence φ is non-decreasing. In particular, since φ > 0 by hypothesis, we must { k}k 0 also have φ > 0, i.e., nˆ , k 0. k k 6∈ S ∀ ≥

Letting ψ be the angle of b from nˆ , the constraint b>nˆ =1 gives 0 ψ < π/2 and k k k ≤ k b =1/ cos(ψ ), and so we can write the optimization problem (3.35) equivalently as k k2 k

McD + Ncd cos(φ) min s.t. b>nˆ k =1. (3.37) b RD cos(ψ ) ∈ k

If nˆ is orthogonal to , i.e., φ = π/2, then (nˆ ) = Mc (b), b : b>nˆ = 1, k S k J k D ≤ J ∀ k with only if b = nˆ . As a consequence, n 0 = nˆ , k0 > k, and in particular if k k k ∀

φ0 = π/2, then k∗ =0.

44 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT

So suppose that φk < π/2 and let nˆ k⊥ be the normalized orthogonal projection of nˆ k onto ⊥ (Figure 3.1). We will prove that every global minimizer of problem (3.37) must lie S in the two-dimensional plane := Span(nˆ , nˆ ⊥). To see this, let b have norm 1/ cos(ψ ) H k k k for some ψ [0,π/2). If ψ > π/2 φ , then such a b can not be a global minimizer of k ∈ k − k

(3.37), as the feasible vector nˆ ⊥/ sin(φ ) already gives a smaller objective, since k k ∈H

McD McD McD + Ncd cos(φ) (nˆ ⊥/ sin(φ )) = = < = (b). (3.38) J k k sin(φ ) cos(π/2 φ ) cos(ψ ) J k − k k

Thus, it must be the case that ψ π/2 φ . Denote by hˆ the normalized projection k ≤ − k k of nˆ onto and by nˆ † the vector that is obtained from nˆ by rotating it towards nˆ ⊥ by k S k k k

ψ (Figure 3.1). Note that both hˆ and nˆ † lie in . Letting Ψ [0,π] be the spherical k k k H k ∈

angle between the spherical arc formed by nˆ k, bˆ and the spherical arc formed by nˆ k, hˆ k,

and denoting by ∠b, hˆ k the angle between b and hˆ k, the first spherical law of cosines gives

cos(∠b, hˆ k) = cos(φk)cos(ψk) + sin(φk)sin(ψk) cos(Ψk). (3.39)

Now, Ψ is equal to π if and only if nˆ , hˆ , b are coplanar, i.e., if and only if b . k k k ∈ H Suppose that b . Then Ψ <π, and so cos(Ψ ) > 1, which implies that 6∈ H k k −

cos(∠b, hˆ ) > cos(φ )cos(ψ ) sin(φ )sin(ψ ) = cos(φ + ψ ). (3.40) k k k − k k k k

This in turn implies that the principal angle φ of b from is strictly smaller than φ + ψ , S k k and so

McD + Ncd cos(φ) McD + Ncd cos(φk + ψk) (b)= > = (nˆ k† / cos(ψk)), (3.41) J cos(ψk) cos(ψk) J

45 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT

i.e., the feasible vector nˆ † / cos(ψ ) gives strictly smaller objective than b. k k ∈H

To summarize, for the case where φk < π/2, we have shown that any global minimizer b of (3.37) must i) have angle ψ from nˆ less or equal to π/2 φ , and ii) it must lie in k k − k

Span(nˆ k, nˆ k⊥). Hence, we can rewrite (3.37) in the equivalent form

McD + Ncd cos(φk + ψ) min k(ψ) := , (3.42) ψ [ π/2+φk,π/2 φk] J cos(ψ ) ∈ − − k where now ψk takes positive values as b approaches nˆ k⊥ and negative values as it approaches hˆ . The function is continuous and differentiable in the interval [ π/2+ φ ,π/2 φ ], k Jk − k − k with derivative given by

∂ Mc sin(ψ) Nc sin(φ ) Jk = D − d k . (3.43) ∂ψ cos2(ψ)

Setting the derivative to zero gives

sin(ψ)= α sin(φk). (3.44)

If α sin(φ ) sin(π/2 φ ) = cos(φ ), or equivalently tan(φ ) 1/α, then is strictly k ≥ − k k k ≥ Jk decreasing in the interval [ π/2+φ ,π/2 φ ], and so it must attain its minimum precisely − k − k at ψ = π/2 φ , which corresponds to the choice n = nˆ ⊥/ sin(φ ). Then by an earlier − k k+1 k k argument we must have that nˆ 0 , k0 k +1. If, on the other hand, tan(φ ) < 1/α, k ⊥S ∀ ≥ k then the equation (3.44) defines an angle

1 ψ∗ := sin− (α sin(φ )) (0,π/2 φ ), (3.45) k k ∈ − k at which must attain its global minimum, since Jk 2 ∂ k 1 ∗ (3.46) J2 (ψk)= > 0. ∂ψ cos(ψk∗)

46 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT

As a consequence, if tan(φk) < 1/α, then

1 φk+1 = φk + sin− (α sin(φk)) < π/2. (3.47)

We then see inductively that as long as tan(φk) < 1/α, φk increases by a quantity which is

1 bounded from below by sin− (α sin(φ0)). Thus, φk will keep increasing until it becomes greater than the solution to the equation tan(φ)=1/α, at which point the global minimizer will be the vector n = nˆ ⊥/ sin(φ ), and so nˆ 0 = nˆ , k0 k +1. Finally, under k+1 k k k k+1 ∀ ≥ 1 the hypothesis that φk < tan− (1/α), we have

k 1 − 1 1 φ = φ + sin− (α sin(φ )) φ + k sin− (α sin(φ )), (3.48) k 0 j ≥ 0 0 j=0 X from where it follows that the maximal number of iterations needed for φk to become

−1 1 tan (1/α) φ0 larger than tan− (1/α) is −1 − , at which point at most one more iteration will sin (α sin(φ0))   be needed to achieve to . S

Notice the remarkable fact that according to Theorem 3.4, the continuous recursion

(3.16) converges to a vector orthogonal to the inlier subspace in a finite number of steps. S Moreover, if the relation

tan(φ ) 1/α, (3.49) 0 ≥ holds true, then this convergence occurs in a single step. One way to interpret (3.49) is to notice that as long as the angle φ0 of the initial estimate nˆ 0 from the inlier subspace is positive, and for any arbitrary but fixed number of outliers M, there is always a sufficiently

large number N of inliers, such that (3.49) is satisfied and thus convergence occurs in

47 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT

one step. Conversely, for any fixed number of inliers N and outliers M, there is always a sufficiently large angle φ0 such that (3.49) is true, and thus (3.16) again converges in a

single step. More generally, even when (3.49) is not true, the larger φ0,N are, the smaller

the quantity

1 tan− (1/α) φ0 1 − (3.50) sin− (α sin(φ ))  0  is, and thus according to Theorem 3.3 the faster (3.16) converges.

3.2.3 Theoretical analysis of the discrete problem

In this section we analyze the discrete problem (3.3) and the associated discrete recursion

(3.4), where the adjective discrete refers to the fact that (3.3) and (3.4) depend on a finite set

˜ D 1 of points X =[X O]Γ sampled from the union of the space of outliers S − and the space

D 1 of inliers S − . In 3.2.2 we showed that these problems are discrete versions of the ∩S § continuous problems (3.15) and (3.16), for which we further showed that they possess the geometric property of interest, i.e., every global minimizer of (3.15) must be an element

D 1 of ⊥ S − (Theorem 3.3), and the recursion (3.16) produces a sequence of vectors S ∩ D 1 which converges to an element of ⊥ S − in a finite number of steps (Theorem 3.4). In S ∩ this section we show that under some deterministic conditions on the distribution of X =

[x1,..., xN ] and O = [o1,..., oM ], a similar statement holds for the discrete problems

(3.3) and (3.4). More specifically, in 3.2.3.1 we establish quantitative bounds that relate § the discrete objective function to its continuous counterpart, and in 3.2.3.2 and 3.2.3.3 § § we use these bounds to study the global minimizers of (3.3) and the limit point of (3.4).

48 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT

3.2.3.1 Discrepancy bounds between continuous and discrete problems

The heart of our analysis framework is to bound the deviation of some underlying geometric

quantities, which we call the average outlier and the average inlier with respect to b, from

their continuous counterparts. To begin with, recall our discrete objective function

˜ discrete(b)= X >b = O>b + X >b (3.51) J 1 1 1

and its continuous counterpart

(b)= b (Mc + Nc cos(φ)) , (3.52) Jcontinuous k k2 D d

the latter derived in 3.2.2.1, equation (3.33). Now, notice that the term of the discrete § objective that depends on the outliers O can be written as

M M O>b o>b b> o>b o b> o (3.53) 1 = j = Sign( j ) j = M b, j=1 j=1 X X where Sign( ) is the sign function and · 1 M o := Sign(b>o )o (3.54) b M j j j=1 X D 1 D is the average outlier with respect to b. Defining a vector valued function f : S − R b → D 1 f b by z S − Sign(b>z)z, we notice that ∈ 7−→ 1 M o = f (o ), (3.55) b M b j j=1 X and so ob is a discrete approximation to the integral z SD−1 f b(z)dµSD−1 . The value of ∈ that integral is given by the next Lemma. R

49 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT

D 1 Lemma 3.5. For any b S − we have ∈

f b(z)dµSD−1 = Sign(b>z)zdµSD−1 = cD b, (3.56) z SD−1 z SD−1 Z ∈ Z ∈

where cD is defined in (3.24).

Proof. Letting R be a rotation that takes b to the first canonical vector e1, i.e., Rb = e1, we have that

Sign(b>z)zdµSD−1 = Sign(b>R>Rz)zdµSD−1 (3.57) z SD−1 z SD−1 Z ∈ Z ∈

= Sign(e1>z)R>zdµSD−1 (3.58) z SD−1 Z ∈

= R> Sign(z1)zdµSD−1 , (3.59) z SD−1 Z ∈

where z1 is the first cartesian coordinate of z. Recalling the definition of cD in equation

(3.24), we see that

Sign(z1)z1dµSD−1 = z1 dµSD−1 = cD. (3.60) z SD−1 z SD−1 | | Z ∈ Z ∈ Moreover, for any i> 1, we have

Sign(z1)zidµSD−1 =0. (3.61) z SD−1 Z ∈ Consequently, the integral in (3.59) becomes

Sign(b>z)zdµSD−1 = R> Sign(z1)zdµSD−1 (3.62) z SD−1 z SD−1 Z ∈ Z ∈

= R> (cDe1)= cDb. (3.63)

50 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT

Now we define O to be the maximum among all possible approximation errors as b varies

D 1 on S − , i.e.,

O := max cD b ob 2 , (3.64) b SD−1 k − k ∈ and we establish that the more uniformly distributed O is the smaller O becomes.

D 1 The notion of uniformity of O = [o ,..., o ] S − that we employ here is a 1 M ⊂ deterministic one, and is captured by the spherical cap discrepancy of the set O, defined

as [40, 41]

1 M SD(O) := sup I (oj) µSD−1 ( ) . (3.65) M C − C C j=1 X D 1 In (3.65) the supremum is taken over all spherical caps of the sphere S − , where a C D 1 D spherical cap is the intersection of S − with a half-space of R , and I ( ) is the indicator C · function of , which takes the value 1 inside and zero otherwise. The spherical cap C C discrepancy SD(O) is precisely the supremum among all errors in approximating integrals of indicator functions of spherical caps via averages of such indicator functions on the point set O. Intuitively, SD(O) captures how close the discrete measure µO (see equation (3.5))

associated with O is to the measure µSD−1 , and we will be referring to O as being uniformly

D 1 distributed on S − , when SD(O) is small.

Remark 3.6. As a function of the number of points M, SD(O) decreases with a rate

of [5,22]

1 1 log(M)M − 2 − 2(D−1) . (3.66) p 51 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT

As a consequence, to show that uniformly distributed points O correspond to small O, it suffices to bound the maximum integration error O from above by a quantity proportional

to the spherical cap discrepancy SD(O). Inequalities that bound from above the approx- imation error of the integral of a function in terms of the variation of the function and the discrepancy of a finite set of points (not necessarily the spherical cap discrepancy; there are several types of discrepancies) are widely known as Koksma-Hlawka inequalites [49, 58].

Even though such inequalities exist and are well-known for integration of functions on the unit hypercube [0, 1]D [44, 49, 58], similar inequalities for integration of functions on the

D 1 unit sphere S − seem not to be known in general [41], except if one makes additional as-

sumptions on the distribution of the finite set of points [10, 40]. Nevertheless, the function

fb : z b>z that is associated to O is simple enough to allow for a Koksma-Hlawka 7−→ | | inequality, as described in the next lemma.6

D 1 Lemma 3.7. Let O =[o1,..., oM ] be a finite subset of S − . Then

√ O O = max cDb ob 2 5SD( ), (3.67) b SD−1 k − k ≤ ∈

where cD, ob and SD(O) are defined in (3.24), (3.54) and (3.65) respectively.

D 1 Proof. For any b S − we can write ∈

c b ob = ρ b + ρ ζ, (3.68) D − 1 2

D 1 2 2 for some vector ζ S − orthogonal to b, and so it is enough to show that ρ + ρ ∈ 1 2 ≤ p √5S (O). Let us first bound from above ρ in terms of S (O). Towards that end, D | 1| D 6The author is grateful to Prof. Glyn Harman, who suggested that the proof of Theorem 1 in [44] can be easily adapted to establish the result of Lemma 3.7.

52 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT observe that

1 M ρ = b>(c b ob)= c b>o (3.69) 1 D − D − M j j=1 X 1 M = fb(z)dµSD−1 fb(oj), (3.70) D−1 − M z S j=1 Z ∈ X where the equality follows from the definition of cD in (3.24) and recalling that fb(z) =

D 1 b>z . In other words, ρ1 is the error in approximating the integral of fb on S − by the

average of fb on the point set O.

D 1 Now, notice that each super-level set z S − : fb(z) α for α [0, 1], is the ∈ ≥ ∈ union of two spherical caps, and also that 

sup fb(z) inf fb(z)=1 0=1. (3.71) D z SD−1 − z S −1 − ∈ ∈

We these in mind, repeating the entire argument of the proof of Theorem 1 in [44] that lead to inequality (9) in [44], but now for a measurable function with respect to µSD−1 (that would be fb), leads directly to

ρ S (O). (3.72) | 1|≤ D

For ρ2 we have that

ρ = ζ> (c b) ζ>ob (3.73) 2 D − 1 M = Sign b>z ζ>zdµSD−1 Sign b>oj ζ>oj (3.74) z SD−1 − M Z ∈ j=1  X  1 M = gb,ζ(z)dµSD−1 gb,ζ(oj), (3.75) D−1 − M z S j=1 Z ∈ X

53 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT

D 1 where gb ζ : S − R is defined as gb ζ(z) = Sign b>z ζ>z. Then a similar argument , → ,  as for ρ1, with the difference that now

sup gb,ζ(z) inf gb,ζ(z)=1 ( 1)=2, (3.76) D z SD−1 − z S −1 − − ∈ ∈ leads to

ρ 2S (O). (3.77) | 2|≤ D

In view of (3.72), inequality (3.77) establishes that ρ2 + ρ2 √5S (O), which con- 1 2 ≤ D p cludes the proof of the lemma.

We now turn our attention to the inlier term X˜>b of the discrete objective function 1

(3.51), which is slightly more complicated than the outlier term. We have

N N X >b x>b b> x>b x b> x (3.78) 1 = j = Sign( j ) j = N b, j=1 j=1 X X where

1 N 1 N x := Sign(b>x )x = f (x ) (3.79) b N j j N b j j=1 j=1 X X

is the average inlier with respect to b. Thus, xb is a discrete approximation of the integral

x SD−1 f b(x)dµSD−1 , whose value is given by the next lemma. ∈ ∩S R D 1 Lemma 3.8. For any b S − we have ∈

f b(x)dµSD−1 = Sign(b>x)xdµSD−1 = cd vˆ, (3.80) x SD−1 x SD−1 Z ∈ ∩S Z ∈ ∩S

where cd is given by (3.24) after replacing D with d, and v is the orthogonal projection of

v onto . S 54 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT

Proof. Since x lies in , we have f (x)= f (x)= f (x), so that S b v vˆ

Sign(b>x)xdµSD−1 = Sign(vˆ>x)xdµSD−1 . (3.81) x SD−1 x SD−1 Z ∈ ∩S Z ∈ ∩S Now express x and vˆ on a basis of , use Lemma 3.5 replacing D with d, and then switch S back to the standard basis of RD.

Next, we define X to be the maximum among all possible approximation errors as b

D 1 varies on S − , which is the same as the maximum of all approximation errors as b varies

D 1 on S − , i.e., ∩S

\ X := max cd π (b) xb = max cd b xb 2 . (3.82) b SD−1 S − 2 b SD−1 k − k ∈ ∈ ∩S

Then an almost identical argument as the one that established Lemma 3.7 gives that

X √5S (X ), (3.83) ≤ d

where now the discrepancy Sd(X ) of the inliers X is defined exactly as in (3.65) with the

D 1 d 1 only difference that the supremum is taken over all spherical caps of S − = S − . ∩S ∼

As a consequence of our definitions of O and X we obtain lower and upper bounds of our discrete objective function (3.51) in terms of its continuous counterpart (3.52), that will be repeatedly used in the sequel.

D 1 Lemma 3.9. Let b S − ⊥, and let φ [0,π/2) be its principal angle from . Then ∈ \S ∈ S

M(c + O) O>b M(c O), (3.84) D ≥ 1 ≥ D −

N(c + X ) cos( φ) X >b N(c X ) cos(φ). (3.85) d ≥ 1 ≥ d −

55 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT

Proof. We only prove the second inequality as the first is even simpler. Let v = 0 be the 6

orthogonal projection of b onto . By definition of X , there exists a vector ξ of ` S ∈ S 2 norm less or equal to X , such that

1 N x = x = Sign(b>x )x = c vˆ + ξ. (3.86) v b N j j d j=1 X Taking inner product of both sides with b gives

1 X >b = c cos(φ)+ b>ξ. (3.87) N 1 d

Now, the result follows by noting that b>ξ X cos(φ), since the principal angle of b ≤ from Span(ξ) can not be less then φ.

3.2.3.2 Conditions for global optimality of the discrete problem

In this section we give our main theorem regarding the global minimizers of (3.3). We

begin with a definition.

D 1 Definition 3.10. Given a set Y = [y ,..., y ] S − and an integer K L, define 1 L ⊂ ≤

Y to be the maximum circumradius among all polytopes of the form R ,K

K α y : α [ 1, 1] , (3.88) ji ji ji ∈ − ( i=1 ) X

where j1,...,jK are distinct in [L], and the circumradius of a bounded subset of

RD is the infimum over the radii of all Euclidean balls of RD that contain that subset.

Next, we state a result already known in [85], and for completeness we also give the proof.

56 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT

Proposition 3.11 ( [85]). Let Y = [y ,..., y ] be a D L matrix of rank D. Then any 1 L × global solution b∗ to

Y >b (3.89) min 1 , b>b=1

must be orthogonal to (D 1) linearly independent columns of Y . −

Proof. Let b∗ be an optimal solution of (3.89). Then b∗ must satisfy the first order optimal- ity relation

0 Y Sgn(Y >b∗)+ λb∗, (3.90) ∈ where λ is a Lagrange multiplier parameter, and Sgn is the sub-differential of the

`1 norm. Without loss of generality, let y1,..., yK be the columns of Y to which b∗ is orthogonal. Then equation (3.90) implies that there exist real numbers α ,...,α 1 K ∈ [ 1, 1] such that − K L 0 αjyj + Sign(yj>b∗)yj + λb∗ = . (3.91) j=1 j=K+1 X X Now, suppose that the span of y ,..., y is of dimension less than D 1. Then there exists 1 K − D 1 a unit norm vector ζ S − that is orthogonal to all y ,..., y , b∗, and multiplication of ∈ 1 K

(3.91) from the left by ζ> gives

L

Sign(yj>b∗)ζ>yj =0. (3.92) j=K+1 X Furthermore, we can choose a sufficiently small ε> 0, such that

Sign(y>b∗ + εy>ζ) = Sign(y>b∗), j [L]. (3.93) j j j ∀ ∈

57 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT

The above equation is trivially true for all j such that yj>b∗ = 0, because in that case y>ζ =0 by the definition of ζ. On the other hand, if y>b∗ =0, then a small perturbation j j 6

 will not change the sign of yj>b∗. Consequently, we can write

y>(b∗ + εζ) = y>b∗ + ε Sign(y>b∗)y>ζ, j [L] (3.94) j j j j ∀ ∈

and so

L Y > b∗ ζ Y >b∗ y>b∗ ζ>y Y >b∗ (3.95) ( + ε ) 1 = 1 + ε Sign( j ) j = 1 . j=K+1 X

However,

b∗ + εζ = √1+ ε2 > 0, (3.96) k k2

and normalizing b∗ + εζ to have unit `2 norm, we get a contradiction on b∗ being a global

solution.

We are now ready to establish the main theorem of this section. Theorem 3.12 is the

discrete analogue of Theorem 3.3, and it says that if both inliers and outliers are sufficiently

uniformly distributed, then every global solution of (3.3) must be orthogonal to the inlier

subspace . More precisely, S

Theorem 3.12. Suppose that the condition

M c X c X ( O + X ) /N γ := < min d − , d − − R ,K1 R ,K2 , (3.97) N 2 O O  

holds for all non-negative integers K ,K such that K + K = D 1,K d 1. Then 1 2 1 2 − 2 ≤ − any global solution b∗ to (3.3) must be orthogonal to Span(X ).

58 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT

Proof. Let b∗ be an optimal solution of (3.3). Then b∗ must satisfy the first order optimality

relation

˜ ˜ 0 X Sgn(X >b∗)+ λb∗, (3.98) ∈

where λ is a scalar Lagrange multiplier parameter, and Sgn is the sub-differential of the `1 norm. For the sake of contradiction, suppose that b∗ . If b∗ , then using Lemma 6⊥ S ∈ S 3.9 we have

O O>b O>b∗ X >b∗ McD + M min 1 1 + 1 ≥ b ,b>b=1 ≥ ⊥S

Mc MO + Nc NX , (3.99) ≥ D − d −

which violates hypothesis (3.97). Hence, we can assume that b∗ . 6∈ S

Now, by the general position hypothesis as well as Proposition 3.11, b∗ will be orthog-

onal to precisely D 1 points, among which K points belong to O, say o ,..., o , and − 1 1 K1 0 K d 1 points belong to X , say x ,..., x . Then there must exist real numbers ≤ 2 ≤ − 1 K2 1 α , β 1, such that − ≤ j j ≤ K1 M K2 N 0 αjoj + Sign(oj>b∗)oj + βjxj + Sign(xj>b∗)xj + λb∗ = . (3.100) Xj=1 j=XK1+1 Xj=1 j=XK2+1

Since Sign(o>b∗)=0, j K and similarly Sign(x>b∗)=0, j K , we can equiva- j ∀ ≤ 1 j ∀ ≤ 2 lently write

K1 M K2 N 0 αjoj + Sign(oj>b∗)oj + βjxj + Sign(xj>b∗)xj + λb∗ = (3.101) j=1 j=1 j=1 j=1 X X X X or more compactly

ξO + M ob∗ + ξX + N xvˆ∗ + λb∗ = 0, (3.102)

59 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT

where

K1 K2

ξO := αjoj and ξX := βjxj, (3.103) j=1 j=1 X X vˆ∗ is the normalized projection of b∗ onto (which is nonzero since b∗ by hypothesis), S 6⊥ S and

1 M 1 N o ∗ := Sign(o>b∗)o , x ∗ := Sign(x>v∗)x . (3.104) b M j j v N j j j=1 j=1 X X

Now, from the definitions of O and X in (3.64) and (3.82) respectively, we have that

ob∗ = c b∗ + ηO, ηO O (3.105) D || ||2 ≤

xv∗ = c vˆ∗ + ηX , ηX X , (3.106) ˆ d || ||2 ≤

and so (3.102) becomes

ξO + McD b∗ + M ηO + ξX + Ncd vˆ∗ + N ηX + λb∗ = 0. (3.107)

Since b∗ , we have that b∗, v∗ are linearly independent. Define := Span(b∗, vˆ∗) and 6∈ S U project (3.107) onto to get U

π (ξO)+ McD b∗ + Mπ (ηO)+ π (ξX )+ Ncd vˆ∗ + Nπ (ηX )+ λb∗ = 0. U U U U (3.108)

Notice that every vector u in the image of π can be written as a linear combination of U

60 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT

b∗ and vˆ∗: u =[u]b∗ b∗ +[u]vˆ∗ vˆ∗. Using this decomposition, we can write (3.108) as

[π (ξO)]b∗ b∗ +[π (ξO)]vˆ∗ vˆ∗ + McD b∗ U U

+ M [π (ηO)]b∗ b∗ + M [π (ηO)]vˆ∗ vˆ∗ U U

+[π (ξX )]b∗ b∗ +[π (ξX )]vˆ∗ vˆ∗ + Ncd vˆ∗ U U 0 + N [π (ηX )]b∗ b∗ + N [π (ηX )]vˆ∗ vˆ∗ + λb∗ = . (3.109) U U

Since is a two-dimensional space, there exists a vector ζˆ that is orthogonal to b∗ but U ∈U not orthogonal to vˆ∗. Projecting the above equation onto the line spanned by ζˆ, we obtain the one-dimensional equation

ˆ> ([π (ξO)]vˆ∗ + M [π (ηO)]vˆ∗ +[π (ξX )]vˆ∗ + Ncd + N [π (ηX )]vˆ∗ ) ζ vˆ∗ = 0. (3.110) U U U U ·

Since ζˆ is not orthogonal to vˆ∗, the above equation implies that

[π (ξO)]vˆ∗ + M [π (ηO)]vˆ∗ +[π (ξX )]vˆ∗ + Ncd + N [π (ηX )]vˆ∗ =0, (3.111) U U U U

which, together with ξO O , ξX X , ηO O and ηX X , k k2 ≤ R ,K1 k k2 ≤ R ,K2 k k2 ≤ k k2 ≤ implies that

Nc O + MO + X + NX , (3.112) d ≤R ,K1 R ,K2 which violates the hypothesis (3.97). Consequently, the initial hypothesis of the proof that b∗ can not be true, and the theorem is proved. 6⊥ S

A Geometric View of the Proof of Theorem 3.12. Let us provide some geometric intu- ition that underlies the proof of Theorem 3.12. It is instructive to begin our discussion by

61 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT

considering the case d = 1,D = 2, i.e. the inlier space is simply a line and the ambient

space is a 2-dimensional plane. Since all points have unit `2-norm, every column of X will

be of the form xˆ or xˆ for a fixed vector xˆ 1 that spans the inlier space . In this − ∈ S S

setting, let us examine a global solution b∗ of the optimization problem (3.3). We will start by assuming that such a b∗ is not orthogonal to , and intuitively arrive at the conclusion S that this can not be the case as long as there are sufficiently many inliers.

We will argue on an intuitive level that if b∗ , then the principal angle φ of b∗ from 6⊥ S

needs to be small. To see this, suppose b∗ ; then b∗ will be non-orthogonal to every S 6⊥ S inlier, and by Proposition 3.11 orthogonal to D 1=1 outlier, say o . The optimality − 1 condition (3.98) specializes to

M N 0 α1o1 + Sign(oj>b∗)oj + Sign(xj>b∗)xj + λb∗ = , (3.113) j=1 j=1 X X Mob∗

where 1 α 1. Notice| that{z the third} term is simply N Sign(xˆ>b∗)xˆ, and so − ≤ 1 ≤

α o + M ob∗ + λb∗ = N Sign(xˆ>b∗)xˆ. (3.114) 1 1 −

Now, what (3.114) says is that the point N Sign(xˆ>b∗)xˆ must lie inside the set −

Conv( o )+ Mob∗ + Span(b∗)= α o + Mob∗ + λb∗ : α 1,λ R , ± 1 { } { 1 1 | 1|≤ ∈ } (3.115) where the + operator on sets is the Minkowski sum. Notice that the set Conv( o )+Mob∗ ± 1 is the translation of the line segment (polytope) Conv( o ) by Mob∗ . Then (3.114) says ± 1 that if we draw all affine lines that originate from every point of Conv( o )+ Mob∗ and ± 1 62 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT

have direction b∗, then one of these lines must meet the point N Sign(xˆ>b∗)xˆ. Let us −

illustrate this for the case where M = N = 5 and say it so happens that b∗ has a rather

large angle φ from , say φ = 45◦. Recall that ob∗ is concentrated around c b∗ and for S D 2 the case D = 2 we have cD = π . As illustrated in Figure 3.2, because φ is large, the

Mob∗ + Conv( o ) + Span(b∗) ± 1

Mob∗

Mob∗ + Conv( o ) ± 1 o1 b∗ φ N Sign(xˆ>b∗)xˆ S −

Figure 3.2: Geometry of the optimality condition (3.98) and (3.114) for the case d =

1,D = 2,M = N = 5. The polytope Mob∗ + Conv( o ) + Span(b∗) misses the point ± 1

N Sign(xˆ>b∗)xˆ and so the optimality condition can not be true for both b∗ = − 6⊥ S Span(xˆ) and φ large.

unbounded polytope Mob∗ + Conv( o ) + Span(b∗) misses the point N Sign(xˆ>b∗)xˆ ± 1 − thus making the optimality equation (3.114) infeasible. This indicates that critical vectors b∗ having large angles from are unlikely to exist. 6⊥ S S

On the other hand, critical points b∗ may exist, but their angle φ from needs 6⊥ S S to be small, as illustrated in Figure 3.3. However, such critical points can not be global

63 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT

Mob∗ + Conv( o ) + Span(b∗) ± 1 o1 o ∗ b∗ M b N Sign(xˆ>b∗)xˆ S −

Mob∗ + Conv( o ) ± 1

Figure 3.3: Geometry of the optimality condition (3.98) and (3.114) for the case d =

1,D = 2,M = N = 5. A critical b∗ exists, but its angle from is small, so that 6⊥ S S the polytope Mob∗ + Conv( o ) + Span(b∗) can contain the point N Sign(xˆ>b∗)xˆ. ± 1 −

However, b∗ can not be a global minimizer, since small angles from yield large objective S values. minimizers, because small angles from yield large objective values.7 S

Hence the only possibility that critical points b∗ that are also global minimizers 6⊥ S do exist is that the number of inliers is significantly less than the number of outliers, i.e.

N <

We should note here that the picture for the general setting is analogous to what we de- scribed above, albeit harder to visualize: with reference to equation (3.259), the optimality condition says that every feasible point b∗ must have the following property: there 6⊥ S exist 0 K d 1 inliers x ,..., x and 0 K D 1 K outliers o ,..., o ≤ 2 ≤ − 1 K2 ≤ 1 ≤ − − 2 1 K1

K1 to which b∗ is orthogonal, and two points ξO α o : α [ 1, 1] + ob∗ and ∈ j=1 j j j ∈ − n o 7On a more techincal level, it can be verified that if suchP a critical point is a global minimizer, then its angle φ must be large in the sense that cos(φ) 2O, contradicting the necessary condition that φ be small in the first place. ≤

64 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT

Mob∗ + Conv( o ) + Span(b∗) ± 1

Mob∗

Mob∗ + Conv( o ) ± 1 o1 b∗

N Sign(xˆ>b∗)xˆ −

Figure 3.4: Geometry of the optimality condition (3.98) and (3.114) for the case d =

1,D =2,N <b∗)xˆ. Moreover, such critical points can be global mini- − mizers. Condition (3.97) of Theorem 3.12 prevents such cases from occuring.

K2 ξX α x : α [ 1, 1] + xb∗ that are joined by an affine line that is parallel to ∈ j=1 j j j ∈ − nP o the line spanned by b∗. In fact in our proof of Theorem 3.12 we reduced this general case to

the case d =1,D =2 described above: this reduction is precisely taking place in equation

(3.108), where we project the optimality equation onto the 2-dimensional subspace . The U arguments that follow this projection consist of nothing more than a technical treatment of

the intuition given above.

Discussion of Theorem 3.12. Towards interpreting Theorem 3.12, consider first the asymp-

totic case where we allow N and M to go to infinity, while keeping the ratio γ constant,

and notice that the quantities O , X are always bounded from above by D 1. R ,K1 R ,K2 −

65 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT

Assuming that both inliers and outliers are perfectly well distributed in the limit, i.e., un- der the hypothesis that limN Sd(X )=0 and limM SD(O)=0, Lemma 3.7 and →∞ →∞ inequality (3.83) give that limN X = 0 and limM O = 0, in which case (3.97) is →∞ →∞ satisfied. This suggests the interesting fact that (3.3) is possible to give a normal to the inliers even for arbitrarily many outliers, and irrespectively of the subspace dimension d.

Along the same lines, for a given γ and under the point set uniformity hypothesis, we can always increase the number of inliers and outliers (thus decreasing X and O), while keep- ing γ constant, until (3.97) is satisfied, once again indicating that (3.3) is possible to yield a normal to the space of inliers irrespectively of their intrinsic dimension. Notice that the intrinsic dimension d of the inliers manifests itself through the quantity cd, which we recall is a decreasing function of d. Consequently, the smaller d is the larger the RHS of (3.97) becomes, and so the easier it is to satisfy (3.97).

3.2.3.3 Conditions for convergence of the discrete recursive algorithm

In this section we study the discrete recursion (3.4). We begin by stating and proving two results already known in [85].

Lemma 3.13 ( [85]). Let Y = [y ,..., y ] be a D L matrix of rank D. Then problem 1 L × min > Y >b admits a computable solution n that is orthogonal to (D 1) b nˆ k=1 1 k+1 − linearly independent points of Y .

Proof. Let n be a solution to min > Y >b that is orthogonal to less than D 1 k+1 b nˆ k=1 1 − linearly independent points of Y . Then we can find a unit norm vector ζ that is orthogonal

66 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT to the same points of Y that n is orthogonal to, and moreover ζ n . In addition, k+1 ⊥ k+1 by similar arguments as in the proof of Proposition 3.11, we can find a sufficiently small

ε> 0 such that

Y >(nk+1 + εζ) = Y >nk+1 + ε Sign(yj>nk+1)ζ>yj, (3.116) 1 1 j: nk+1 yj X6⊥ where

Sign(yj>nk+1)ζ>yj =0. (3.117) j: nk y X+16⊥ j

Since nk+1 is optimal, it must be the case that

Sign(yj>nk+1)ζ>yj =0, (3.118) j: nk y X+16⊥ j and so

Y > n ζ Y >n (3.119) ( k+1 + ε ) 1 = k+1 1 .

By (3.119) we see that as we vary ε the objective remains unchanged. Notice also that varying ε preserves all zero entries appearing in the vector Y >nk+1. Furthermore, because of (3.118), it is always possible to either decrease or increase ε until an additional zero is achieved, i.e., until nk+1 + εζ becomes orthogonal to a point of Y that nk+1 is not orthogonal to. Then we can replace nk+1 with nk+1 + εζ and repeat the process, until we get some n that is orthogonal to D 1 linearly independent points of Y . k+1 −

Theorem 3.14 ( [85]). Let Y = [y ,..., y ] be a D L matrix of rank D. Suppose 1 L × that for each problem min > Y >b , a solution n is chosen such that n is b nˆ k=1 1 k+1 k+1

67 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT

orthogonal to D 1 linearly independent points of Y , in accordance with Lemma 3.13. −

Then the sequence n converges to a critical point of problem min > Y >b in a { k} b b=1 1

finite number of steps.

Proof. If nk+1 = nˆ k, then inspection of the first order optimality conditions of the two problems, reveals that nˆ is a critical point of min > Y >b . If n = nˆ , then k b b=1 1 k+1 6 k

n > 1, and so Y >nˆ < Y >nˆ . As a consequence, if n = nˆ , then k k+1k2 k+1 1 k 1 k+1 6 k nˆ k can not arise as a solution for some k0 > k. Now, because of Lemma 3.13, for each k,

there is a finite number of candidate directions nk+1. These last two observations imply

that the sequence n must converge in a finite number of steps to a critical point of { k}

> Y >b . minb b=1 1

We are now ready for the main theorem of this section, Theorem 3.15. Before stating

and proving the theorem, note that according to Theorem 3.3, the continuous recursion

converges in a finite number of iterations to a vector that is orthogonal to Span(X ) = , S as long as the initialization nˆ does not lie in (equivalently φ > 0). Intuitively, one 0 S 0 should expect that in passing to the discrete case, the conditions for the discrete recursion

(3.4) to converge to an element of ⊥ should be at least as strong as the conditions of S

Theorem 3.12, and strictly stronger than the condition φ0 > 0 of Theorem 3.4. Our next result formalizes this intuition.

Theorem 3.15. Suppose that condition (3.97) holds true and consider the vector sequence

nˆ k k 0 generated by the recursion (3.4). Let φ0 be the principal angle of nˆ 0 from { } ≥

68 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT

Span(X ) and suppose that

1 cd X 2 γO φ0 > cos− − − . (3.120) c + X  d 

Then after a finite number of iterations the sequence nˆ k k 0 converges to a unit `2-norm { } ≥ vector that is orthogonal to Span(X ).

Proof. We start by establishing that nˆ does not lie in the inlier space . For the sake of k S contradiction suppose that nˆ for some k > 0. Note that k ∈S

˜> ˜> ˜> ˜> X nˆ 0 X n1 X nˆ 1 X nˆ k . (3.121) 1 ≥ 1 ≥ 1 ≥···≥ 1

Suppose first that nˆ . Then (3.121) gives 0 ⊥S

O>nˆ O>nˆ + X >vˆ , (3.122) 0 1 ≥ k 1 k 1

where vˆ is the normalized projection of nˆ onto (and since nˆ , these two are k k S k ∈ S equal). Using Lemma 3.9, we take an upper bound of the LHS and a lower bound of the

RHS of (3.122), and obtain

Mc + MO Mc MO + Nc NX , (3.123) D ≥ D − d − or equivalently

c X γ d − , (3.124) ≥ 2 O which contradicts (3.97). Consequently, nˆ . Then (3.121) implies that 0 6⊥ S

O>nˆ + X >nˆ O>nˆ + X >nˆ , (3.125) 0 1 0 1 ≥ k 1 k 1

69 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT

or equivalently

O>nˆ + cos(φ ) X >vˆ O>nˆ + X >vˆ , (3.126) 0 1 0 0 1 ≥ k 1 k 1

where vˆ is the normalized projection of nˆ onto . Once again, Lemma 3.9 is used to 0 0 S furnish an upper bound of the LHS and a lower bound of the RHS of (3.126), and yield

Mc + MO +(Nc + NX )cos(φ ) Mc MO + Nc NX , (3.127) D d 0 ≥ D − d −

which contradicts (3.120).

Now let us complete the proof of the theorem. We know by Theorem 3.14 that the

sequence n converges to a critical point n ∗ of problem (3.3) in a finite number of { k} k

steps, and we have already shown that n ∗ (see the beginning of the proof). k 6∈ S

Then an identical argument as in the proof of Theorem 3.12 (with nk∗ in place of b∗) shows

that n ∗ must be orthogonal to . k S

Discussion of Theorem 3.15. First note that if (3.97) is true, then the expression of (3.120)

always defines an angle between 0 and π/2. Moreover, Theorem 3.15 can be interpreted

using the same asymptotic arguments as Theorem 3.12; notice in particular that the lower

bound on the angle φ0 tends to zero as M,N go to infinity with γ constant, i.e., the more uniformly distributed inliers and outliers are, the closer n0 is allowed to be to Span(X ).

We also emphasize that Theorem 3.15 asserts the correctness of the linear program- ming recursions (3.4) as far as recovering a vector n ∗ orthogonal to := Span(X ) is k S concerned. Even though this was our initial motivation for posing problem (3.3), Theorem

3.15 does not assert in general that nk∗ is a global minimizer of problem (3.3). However,

70 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT this is indeed the case, when the inlier subspace is a hyperplane, i.e., d = D 1. This is S − D 1 because, up to a sign, there is a unique vector b S − that is orthogonal to (the normal ∈ S vector to the hyperplane), which, under conditions (3.97) and (3.120), is the unique global minimizer of (3.3), as well as the limit point nk∗ of Theorem 3.15.

3.2.4 Algorithmic contributions

In this section we discuss algorithmic formulations based on the ideas presented so far.

Specifically, 3.2.4.1 contains the basic Dual Principal Component Pursuit and Analysis § algorithms, which can be implemented by linear programming, while 3.2.4.2- 3.2.4.4 dis- § § cuss alternative DPCP algorithms suitable for noisy or high-dimensional data.

3.2.4.1 Relaxed DPCP and DPCA algorithms

Theorem 3.15 suggests a mechanism for obtaining an element b of ⊥, where = 1 S S Span(X ): run the sequence of linear programs (3.4) until the sequence nˆ converges { k} and identify the limit point with b1. Due to computational constraints, in practice one usu-

˜> ally terminates the recursions when the objective value X nˆ k converges within some 1

small ε, or a maximal number Tmax of recursions is reached, and obtains an approximate normal b1. The resulting Algorithm 3.1 is referred to as DPCP-r, standing for relaxed

DPCP.

We emphasize that step 6 of Algorithm 3.1 can be canonically solved by linear pro-

71 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT

Algorithm 3.1 Relaxed Dual Principal Component Pursuit ˜ 1: procedure DPCP-R(X ,ε,Tmax)

2: k 0;∆ ; ← J ← ∞

X˜> 3: nˆ 0 argmin b =1 b ; ← k k2 2

4: while kε do max J 5: k k +1; ←

X˜> 6: nk argminb>nˆ =1 b ; ← k−1 1

7: nˆ n / n ; k ← k k kk2

˜> ˜> ˜> 9 8: ∆ X nˆ k 1 X nˆ k / X nˆ k 1 + 10− ; J ← − 1 − 1 − 1    

9: end while

10: return nˆ k;

11: end procedure gramming. More specifically, we can rewrite it in the form

u+ min 1 1   , such that (3.128) b,u+,u− 1 N 1 N  × ×  u−     u+ ˜> IN IN X   0N 1 − − × +   u =   , u , u− 0, (3.129)  − ≥ 0 0    1 N 1 N nˆ k>     1   × ×         b        and solve it efficiently with an optimized general purpose linear programming solver, such as Gurobi [43]. Having computed a b1 with Algorithm 3.1, there are two possibilities: either is a hyperplane of dimension D 1 or dim < D 1. In the first case we can S − S −

72 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT

Algorithm 3.2 Relaxed Dual Principal Component Analysis ˜ 1: procedure DPCA-R(X ,c,ε,Tmax)

2: ; B ← ∅ 3: for i =1: c do

4: k 0;∆ ; ← J ← ∞

X˜> 5: nˆ 0 argminb b ; ← ⊥B 2

6: while k T and ∆ >ε do ≤ max J 7: k k +1; ← ˜ 8: n > X >b ; k argminb nˆ k =1,b ← −1 ⊥B 1

9: nˆ n / n ; k ← k k kk2

˜> ˜> ˜> 9 10: ∆ X nˆ k 1 X nˆ k / X nˆ k 1 + 10− ; J ← − 1 − 1 − 1    

11: end while

12: b nˆ ; i ← k 13: b ; B ← B ∪ { i} 14: end for

15: return ; B 16: end procedure

identify our subspace model with the hyperplane defined by the normal b1. If on the other hand dim < D 1, we can proceed to find a b b that is approximately orthogonal S − 2 ⊥ 1 to , and so on; this naturally leads to the relaxed Dual Principal Component Analysis of S Algorithm 3.2.

73 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT

In Algorithm 3.2, c is an estimate for the codimension D d of the inlier subspace − Span( ). If c is rather large, then in the computation of each b , it is more efficient to be X i reducing the coordinate representation of the data to D (i 1) coordinates, by project- − −

ing the data orthogonally onto Span(b1,..., bi 1)⊥ and solving the linear program in the − projected space.

Notice further how the algorithm initializes n0: This is precisely the right singular vector of X˜> that corresponds to the smallest singular value, after projection of X˜ onto

Span(b1,..., bi 1)⊥. As it will be demonstrated in section 3.2.5.1, this choice has the −

effect that the angle of n0 from the inlier subspace is typically large; more precisely, it

is often larger than the smallest initial angle (3.120) required for the success of the linear

programming recursions.

3.2.4.2 Relaxed and denoised DPCP

When the data are corrupted by noise, one no longer expects the product X˜>b to be sparse, even if b is orthogonal to the underlying inlier space. Instead, our expectation is that, for such a b, X˜>b should be the sum of a sparse vector with a dense vector of small euclidean norm. This motivates us to replace the optimization problem

˜> min X b , s.t. b>nˆ k 1 =1, (3.130) b 1 −

which appears in Algorithm 3.1, with the denoised optimization problem

1 2 X˜> min τ y 1 + y b , s.t. b>nˆ k 1 =1, (3.131) y,b k k 2 − 2 −  

74 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT

where τ is a positive parameter. Observe, that if the optimal b were known, then y would ˜ be given by the element-wise soft thresholding S (X >b), where the function S : R R τ τ → is defined by S (α) = Sign(α) max(0, α τ). If on the other hand y were known, then τ · | |−

b would be given by b = nˆ k 1 + U k 1z, where U k 1 is a D (D 1) matrix containing − − − × −

in its columns an orthonormal basis for the orthogonal complement of nˆ k 1, and z is the − solution to the standard least-squares problem

2 ˜> ˜> min y X nˆ k 1 X U k 1z . (3.132) z RD−1 − − − − 2 ∈

Observe that solving problem (3.132) requires solving a linear system of equations with

˜> > ˜> coefficient matrix X U k 1 X U k 1. The dependence of this matrix on the iteration − −   index k, may become a computational issue when D is large. This dependence can be

circumvented as follows. First, we treat the computation of b given y as a constrained

problem

2 ˜> min y X b , , s.t. b>nˆ k 1 =1, (3.133) b − 2 −

and thinking in terms of a Lagrange multiplier λ associated to the constraint b>nˆ k 1 = 1, − we see that λ, b must satisfy the relation

˜ ˜ ˜> X y X X b λnˆ k 1 =0. (3.134) − − −

Noting that X˜X˜> is always invertible under our data model, we must have that

1 1 ˜ ˜> − ˜ ˜> − b = X X y λ X X nˆ k 1. (3.135) − −    

75 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT

Now, multiplying the above equation from the left with nˆ k> 1, we obtain −

1 1 X˜X˜> − X˜X˜> − 1= nˆ k> 1 y λnˆ k> 1 nˆ k 1, (3.136) − − − −     or equivalently

1 X˜X˜> − nˆ k> 1 y 1 λ = − − , (3.137)   1 X˜X˜> − nˆ k> 1 nˆ k 1 − −   which upon substitution into (3.135) gives our final formula for b:

1 X˜X˜> − 1 nˆ k> 1 y 1 1 ˜ ˜> − − − ˜ ˜> − b = X X y X X nˆ k 1. (3.138)    1  − − X˜X˜> −   nˆ k> 1 nˆ k 1    − −      1 1 ˜ ˜> − ˜ ˜> − Notice that the quantities X X y and X X nˆ k 1 can be obtained as solutions −     to linear systems of equations with common coefficient matrix X˜X˜>, which themselves can be solved very efficiently by backward and forward substitution, assuming that a pre- computed Cholesky factorization of X˜X˜> is available.

To summarize, we have shown how to approximately solve problem (3.130) by a very efficient alternating minimization scheme, which involves soft-thresholding and forward- backward substitution; we will be referring to the resulting DPCP algorithm as DPCP-r-d, which stands for relaxed and denoised DPCP.

3.2.4.3 Denoised DPCP

Interestingly, DPCP-r-d is very closely related to the scheme proposed in [80], where the authors study problem (3.3) in the very different context of dictionary learning. To approx-

76 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT imately solve (3.3), the authors of [80] solve its denoised version

1 2 X˜> min τ y 1 + y b , (3.139) b,y: b 2=1 k k 2 − 2 || ||   by alternating minimization. Given b, the optimal y is given by X˜>b , where is the Sτ Sτ   soft-thresholding operator applied element-wise on the vector X˜>b. Given y the optimal b is a solution to the quadratically constrained least-squares problem

2 X˜> min y b , s.t. b 2 =1. (3.140) b RD − 2 k k ∈

In the context of [80], the coefficient matrix of the least-squares problem (X˜> in our no- tation) has orthonormal columns. As a consequence, the solution to (3.140) is obtained in closed form by projecting the solution of the unconstrained least-squares problem

˜ min y X >b (3.141) b RD − 2 ∈ onto the unit sphere. However, in our context the assumption that X˜> has orthonormal columns is strongly violated, so that the optimal b is no longer available in closed form.

In fact, problem (3.140) is well known in the literature [26,35,38], and the standard way to solve it is by means of Lagrange multipliers. This involves solving a non- for the Lagrange multiplier, which is known to be challenging [26]. For this reason we leave exact approaches for solving (3.140) to future investigations, and we instead propose to obtain an approximate b as in [80]. We will call the resulting Algorithm 3.3 DPCP-d, which stands for denoised DPCP.

Notice that DPCP-d is very efficient, since the least-squares problems that appear in the various iterations have the same coefficient matrix X˜X˜>, a factorization of which can be

77 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT

Algorithm 3.3 Denoised Dual Principal Component Pursuit ˜ 1: procedure DPCP-D(X ,ε,Tmax,δ,τ)

˜ ˜> 2: Compute a Cholesky factorization LL> = X X + δID;

3: k 0;∆ ; ← J ← ∞

X˜> 4: b argminb RD: b =1 b ; ← ∈ k k2 2

˜> 5: 0 τ X b ; J ← 1

6: while kε do max J 7: k k +1; ← ˜ 8: y X >b ; ←Sτ   ˜ 9: b solution of LL>ξ = X y by backward/forward propagation; ← 10: b b/ b ; ← k k2 2 1 X˜> 11: k τ y 1 + 2 y b ; J ← k k − 2

9 12: ∆ ( k 1 k) / ( k 1 + 10− ); J ← J − −J J − 13: end while

14: return (y, b);

15: end procedure

precomputed 8. As we will see in section 3.2.5 , the performance of DPCP-d is remarkably close to that of DPCP-r, for which we have guarantees of global optimality, suggesting that

DPCP-d converges to a global minimum. We leave theoretical investigations of DPCP-d to future research. 8The parameter δ in Algorithm 3.3 is a small positive number, typically 10−6, which helps avoiding solving ill-conditioned linear systems.

78 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT

Finally, notice that Algorithm 3.3 computes a single normal vector b1. As with DPCP-r, it is trivial to adjust Algorithm 3.3 to compute a second normal vector b2, since one only needs to incorporate the linear constraint b>b1 =0, and so on. We will again refer to such an algorithm that computes c 1 normals as DPCP-d. ≥

3.2.4.4 DPCP via iteratively reweighted least-squares

Even though DPCP-d and DPCP-r-d are very attractive from a computational point of view,

they only produce approximate normal vectors (even in the absence of noise), and have the

additional disadvantage that their performance depends on the parameter τ, whose tuning

is not well understood. On the other hand, DPCP-r was shown to produce exact normal

vectors, yet solving linear programs of the form (3.128)-(3.129) can be inefficient when the

data are high-dimensional. This motivates us to propose a direct IRLS algorithm for solving

the DPCP problem (3.3). In fact, since we are interested in obtaining an orthonormal basis

for the orthogonal complement of the inlier subspace, we propose an Iteratively Reweighted

Least-Squares (IRLS) scheme for solving problem

˜> min X B , s.t. B>B = Ic, (3.142) B RD×c 1,2 ∈

which is a generalization of the DPCP problem for multiple normal vectors. More specifi-

cally, given a D c orthonormal matrix Bk 1, we define for each point x˜j a weight × −

1 wj,k := , (3.143) max δ, Bk> 1x˜j − 2 

79 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT

where δ > 0 is a small constant that prevents division by zero. Then we obtain Bk as the solution to the quadratic problem

L 2 min wj,k B>x˜j , s.t. B>B = Ic, (3.144) B RD×c 2 ∈ j=1 X which is readily seen to be the c right singular vectors corresponding to the c smallest

˜> singular values of the weighted data matrix W kX , where W is a diagonal matrix with

√wj,k at position (j,j). We refer to the resulting Algorithm 3.4 as DPCP-IRLS; a study

of its theoretical properties is deferred to future work. We note here that the technique

of solving an optimization problem (convex or non-convex) through IRLS is a common

one. In fact, REAPER [62] solves through IRLS a convex relaxation of precisely problem

(3.142). Other prominent instances of IRLS schemes from compressed sensing are [13,14,

19].

3.2.5 Experimental evaluation

In this section we evaluate the proposed algorithms experimentally. In 3.2.5.1 we inves- § tigate the performance of our principal algorithmic proposal, i.e., DPCP-r, described in

Algorithm 3.1. In 3.2.5.2 we compare DPCP-r, DPCP-d, DPCP-r-d 9 and DPCP-IRLS § with state-of-the-art robust PCA algorithms using synthetic data, and similarly in 3.2.5.3 § using real images.

9To avoid confusion, we will slightly abuse our terminology and refer to DPCA-r (DPCA-d, DPCA-r-d) as DPCP-r (DPCP-d, DPCP-r-d), even when c> 1. The distinction of whether a single versus multiple dual principal components are being computed will be clear from the context.

80 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT

Algorithm 3.4 Dual Principal Component Pursuit via Iteratively Reweighted Least Squares ˜ 1: procedure DPCP-IRLS(X ,c,ε,Tmax,δ)

2: k 0;∆ ; ← J ← ∞

X˜> 3: B0 argminB RD×c B , s.t. B>B = c; ← ∈ 2 I

4: while kε do max J 5: k k +1; ← X˜ 6: wx˜ 1/ max δ, Bk> 1x˜ , x˜ ; ← − 2 ∈  2 7: B D×c ˜ B>x s.t. B>B ; k argminB R x˜ X wx˜ ˜ 2 = c ← ∈ ∈ I

˜> P ˜> ˜> 9 8: ∆ X Bk 1 X Bk / X Bk 1 + 10− ; J ← − 1 − 1 − 1    

9: end while

10: return Bk;

11: end procedure

3.2.5.1 Computing a single dual principal component

We begin by investigating the behavior of the DPCP-r Algorithm 3.1 in the absence of

noise, for random subspaces of varying dimensions d =1:1:29 and varying outlier S percentages R := M/(M + N)=0.1:0.1:0.9. We fix the ambient dimension D = 30,

D 1 sample N = 200 inliers uniformly at random from S − and M outliers uniformly S ∩ D 1 3 at random from S − . We set  = 10− and Tmax = 10 in Algorithm 3.1. Our main interest is in examining the ability of DPCP-r in recovering a single normal vector to the subspace (c =1). The results over 10 independent trials are shown in Figure 3.5, in which

81 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT

the vertical axis denotes the relative dimension of the subspace, defined as d/D.

Figure 3.5(a) shows whether the theoretical condition (3.97) is satisfied (white) or not

(black). In checking this condition, we estimate the abstract quantities

O,X , O , X (3.145) R ,K1 R ,K2

by Monte-Carlo simulation. Whenever the condition is true, we choose nˆ 0 in a controlled fashion, so that its angle φ0 from the subspace is larger than the minimal angle φ0∗ of (3.120); then we run DPCP-r. If on the other hand (3.97) is not true, we do not run DPCP-r and report a 0 (black). Fig 3.5(b) shows the angle of nˆ 10 from the subspace. We see that whenever

(3.97) is true, DPCP-r returns a normal after only 10 iterations. Fig 3.5(c) shows that if we initialize randomly nˆ 0, then its angle φ0 from the subspace tends to become less than the

minimal angle φ0∗, as d increases. Even so, Figure 3.5(d) shows that DPCP-r still yields a

numerical normal, except for the regime where both d and R are very high. Notice that this

is roughly the regime where we have no theoretical guarantees, according to Figure 3.5(a).

˜> Figure 3.5(e) shows that if we initialize nˆ 0 as the right singular vector of X corresponding

to the smallest singular value, then φ0 > φ0∗ is true for most cases, and the corresponding

performance of DPCP-r in Figure 3.5(f) improves further. Finally, Figure 3.5(g) plots φ . 0∗ We see that for very low d this angle is almost zero, i.e. DPCP-r does not depend on the

initialization, even for large R. As d increases though, so does φ0∗, and in the extreme case

o of the upper rightmost regime, where d and R are very high, φ0∗ is close to 90 , verifying

our expectation that DPCP-r will succeed only if nˆ is very close to ⊥. 0 S

82 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT

0.97 0.97 0.97 0.83 0.83 0.83

0.67 0.67 0.67

0.5 0.5 0.5

0.33 0.33 0.33 relative dimension relative dimension 0.17 relative dimension 0.17 0.17 0.03 0.03 0.03 0.1 0.5 0.9 0.1 0.5 0.9 0.1 0.5 0.9 outlier ratio outlier ratio outlier ratio

∗ ∗ (a) eq. (3.97) (b) DPCP-r(φ0 >φ0) (c) φ0 >φ0

0.97 0.97 0.97 0.83 0.83 0.83

0.67 0.67 0.67

0.5 0.5 0.5

0.33 0.33 0.33

relative dimension 0.17 relative dimension 0.17 relative dimension 0.17 0.03 0.03 0.03 0.1 0.5 0.9 0.1 0.5 0.9 0.1 0.5 0.9 outlier ratio outlier ratio outlier ratio

∗ (d) DPCP-r(random φ0) (e) φ0,SVD >φ0 (f) DPCP-r(φ0,SVD)

0.97 0.83

0.67

0.5

0.33

relative dimension 0.17 0.03 0.1 0.5 0.9 outlier ratio

∗ (g) φ0

Figure 3.5: Various quantities associated to the performance of DPCP-r; see 3.2.5.1. Fig- § ure 3.5(a) shows whether condition (3.97) is true (white) or not (black). Figure 3.5(b) shows the angle from of nˆ after 10 iterations of DPCP-r when (3.97) is true; the other S 10 cases are mapped to black. Figure 3.5(c) shows whether a random nˆ 0 satisfies φ0 > φ0∗,

where φ∗ is as in Thm. 3.15. Figure 3.5(d) shows the angle from of nˆ for random nˆ . 0 S 10 0 Figure 3.5(e) shows the angle from of the right singular vector of X˜ corresponding to S

the smallest singular value, Figure 3.5(f) shows the corresponding angle of nˆ 10, and Figure

3.5(g) plots φ . 0∗ 83 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT

3.2.5.2 Outlier detection using synthetic data

In this section we begin by using the same synthetic experimental set-up as in 3.2.5.1 § (except that now N = 300) to demonstrate the behavior of several methods relative to each other under uniform conditions, in the context of outlier rejection in single subspace learning. In particular, we test DPCP-r, DPCP-d, DPCP-r-d, DPCP-IRLS, the IRLS version of REAPER [62], RANSAC [34], SE-RPCA [84], and `2,1-RPCA [118]; see Chapter 2 for details on existing methods.

For the methods that require an estimate of the subspace dimension d, such as REAPER,

RANSAC, and all DPCP variants, we provide as input the true subspace dimension. The

3 convergence accuracy of all methods is set to 10− . For REAPER we set the regularization

6 parameter δ equal to 10− and the maximal number of iterations equal to 100. For DPCP-

r we set τ = 1/√N + M as suggested in [80] and maximal number of iterations 1000.

3 For RANSAC we set its thresholding parameter equal to 10− , and for fairness, we do not let it terminate earlier than the running time of DPCP-r. Both SE-RPCA and `2,1-

RPCA are implemented with ADMM, with augmented Lagrange parameters 1000 and 100 respectively. For `2,1-RPCA λ is set to 3/(7√M), as suggested in [118]. DPCP variants are initialized as in Algorithm 3.2, and the parameters of DPCP-r are as in 3.2.5.1. § Absence of noise. We investigate the potential of each of the above methods to per- fectly distinguish outliers from inliers in the absence of noise 10. Note that each method

10We do not include the results of DPCP-d and DPCP-r-d for this experiment, since they only approx- imately solve the DPCP optimization problem, and hence they can not be expected to perfectly separate inliers from outliers, even when there is no noise.

84 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT

0.97 0.97 0.97 0.83 0.83 0.83

0.67 0.67 0.67

0.5 0.5 0.5

0.33 0.33 0.33 relative dimension relative dimension relative dimension 0.17 0.17 0.17 0.03 0.03 0.03 0.1 0.3 0.5 0.7 0.9 0.1 0.3 0.5 0.7 0.9 0.1 0.3 0.5 0.7 0.9 outlier ratio outlier ratio outlier ratio

(a) REAPER (b) RANSAC (c) SE-RPCA

0.97 0.97 0.97 0.83 0.83 0.83

0.67 0.67 0.67

0.5 0.5 0.5

0.33 0.33 0.33 relative dimension relative dimension relative dimension 0.17 0.17 0.17 0.03 0.03 0.03 0.1 0.3 0.5 0.7 0.9 0.1 0.3 0.5 0.7 0.9 0.1 0.3 0.5 0.7 0.9 outlier ratio outlier ratio outlier ratio

(d) `2,1-RPCA (e) DPCP-IRLS (f) DPCP-r

Figure 3.6: Outlier/Inlier separation in the absence of noise over 10 independent trials.

The horizontal axis is the outlier ration defined as M/(N + M), where M is the number

of outliers and N is the number of inliers. The vertical axis is the relative inlier subspace

dimension d/D; the dimension of the ambient space is D = 30. Success (white) is declared

by the existence of a threshold that, when applied to the output of each method, perfectly

separates inliers from outliers.

returns a signal α RN+M , which can be thresholded for the purpose of declaring outliers ∈ +

and inliers. For SE-RPCA, α is the `1-norm of the columns of the coefficient matrix C,

while for `2,1-RPCA it is the `2-norm of the columns of E. Since REAPER, RANSAC,

DPCP-r and DPCP-IRLS directly return subspace models, for these methods α is simply

the distances of all points to the estimated subspace.

85 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT

In Figure 3.6 we depict success (white) versus failure (black), where success is inter-

preted as the existence of a threshold on α that perfectly separates outliers and inliers.

First observe that, as expected, SE-RPCA and `2,1-RPCA succeed only when the subspace dimension d is small. In particular, the more outliers are present the lower the dimension of the subspace needs to be for the methods to succeed. The same is true for RANSAC, except when there are only very few outliers (10%), in which case the probability of sam- pling outlier-free points is high. Finally notice that SE-RPCA is the best method among these three in dealing with large percentages of outliers (> 70%). This is not a surprise, because the theoretical guarantees of SE-RPCA do not place an explicit upper bound on the number of outliers, in contrast to `2,1-RPCA and RANSAC. Next, notice that REAPER per- forms uniformly better than RANSAC, SE-RPCA and `2,1-RPCA. In particular REAPER can handle higher dimensions and higher outlier percentages; for example it succeeds over all trials for hyperplanes when there are 10% outliers.

In summary, none of REAPER, RANSAC, SE-RPCA, and `2,1-RPCA can deal with hyperplanes with more than 20% outliers or with subspaces of medium relative dimension

(d > 13) for as many as 90% outliers. This gap is filled by the two proposed meth- ods DPCP-r and DPCP-IRLS. Notice that DPCP-r is the only method that succeeds irre- spectively of subspace dimension with almost 70% outliers, while DPCP-IRLS is the only method that succeeds when d 19 and R = 90%. ≤ Presence of Noise. Next, we keep D = 30 and investigate the performance of the

methods, adding DPCP-d and DPCP-r-d in the mix, in the presence of varying levels of

86 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT

noise and outliers for two cases of high-dimensional subspaces, i.e., d = 25 and d =

29. The inliers are corrupted by additive white gaussian noise of zero mean and standard deviation σ = 0.02, 0.06, 0.1, with support in the orthogonal complement of the inlier subspace, while the percentage of outliers varies as R = 20%, 33%, 50%. Finally, for

DPCP-d and DPCP-r-d we set τ = max σ, 1/√N + M , while for RANSAC we set its  threshold equal to σ.

We evaluate the performance of each method by its corresponding ROC curve. Each

point of an ROC curve corresponds to a certain value of a threshold, with the vertical

coordinate of the point giving the percentage of inliers being correctly identified as inliers

(True Positives), and the horizontal coordinate giving the number of outliers erroneously

identified as inliers (False Positives). As a consequence, an ideal ROC curve should be

concentrated to the top left of the first quadrant, i.e., the area over the curve should be zero.

The ROC curves for the case d = 25 are given in Figure 3.7, where for each curve we also report the area over the curve. As expected, the low-rank methods RANSAC,

SE-RPCA and `2,1-RPCA perform poorly with RANSAC being the worst method for 50% outliers and SE-RPCA being the weakest method otherwise. On the other hand REAPER,

DPCP-d, DPCP-IRLS, DPCP-r-d and DPCP-r perform almost perfectly well, with DPCP-

IRLS actually giving zero error across all cases. Notice the interesting fact that DPCP-r performs slightly better across all cases than both DPCP-d and DPCP-r-d, despite the fact that DPCP-d and DPCP-r-d are intuitively more suitable for noisy data than DPCP-r.

The ROC curves for the case d = 29 are given in Figure 3.8. As expected, the low-rank

87 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT

1 1 1

0.8 0.8 0.8 RANSAC(0.038) (0.147) (0.401) ℓ 0.6 21-RPCA(0.067) 0.6 (0.132) 0.6 (0.202) SE-RPCA(0.109) (0.173) (0.230) REAPER(0) (0.000) (0.010) 0.4 DPCP-r(0) 0.4 (0) 0.4 (0.001) DPCP-r-d(0) (0.000) (0.012) 0.2 DPCP-d(0) 0.2 (0) 0.2 (0.007) DPCP-IRLS(0) (0) (0) 0 0.2 0.4 0.6 0.8 0 0.2 0.4 0.6 0.8 0 0.2 0.4 0.6 0.8 1

(a) σ = 0.02,R = 20% (b) σ = 0.02,R = 33% (c) σ = 0.02,R = 50%

1 1 1

0.8 0.8 0.8 (0.054) (0.143) (0.269) (0.069) 0.6 0.6 (0.138) 0.6 (0.201) (0.112) (0.179) (0.233) (0) (0.000) (0.009) 0.4 (0) 0.4 (0) 0.4 (0.000) (0) (0.000) (0.006) 0.2 (0) 0.2 (0) 0.2 (0.002) (0) (0) (0.000) 0 0.2 0.4 0.6 0.8 0 0.2 0.4 0.6 0.8 0 0.2 0.4 0.6 0.8 1

(d) σ = 0.06,R = 20% (e) σ = 0.06,R = 33% (f) σ = 0.06,R = 50%

1 1 1 0.8 0.8 (0.077) 0.8 (0.163) (0.278) 0.6 (0.080) (0.139) 0.6 (0.207) (0.125) 0.6 (0.181) (0.236) (0.000) (0.001) (0.013) 0.4 0.4 (0.000) 0.4 (0.000) (0.005) (0.000) (0.001) (0.015) (0.000) (0.007) 0.2 0.2 (0.000) 0.2 (0.000) (0.000) (0.000) 0 0.2 0.4 0.6 0.8 0 0.2 0.4 0.6 0.8 0 0.2 0.4 0.6 0.8 1

(g) σ = 0.1,R = 20% (h) σ = 0.1,R = 33% (i) σ = 0.1,R = 50%

Figure 3.7: ROC curves as a function of noise standard deviation σ and outlier percentage

R, for subspace dimension d = 25 in ambient dimension D = 30. The horizontal axis

is False Positives ratio and the vertical axis is True Positives ratio. The number associ-

ated with each curve is the area above the curve; smaller numbers reflect more accurate

performance.

88 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT

1 1 1

0.8 0.8 0.8 RANSAC(0.276) (0.417) (0.468) ℓ (0.406) 0.6 21-RPCA(0.355) 0.6 0.6 (0.427) SE-RPCA(0.381) (0.421) (0.436) REAPER(0.033) (0.111) (0.175) 0.4 DPCP-r(0.016) 0.4 (0.014) 0.4 (0.016) DPCP-r-d(0.017) (0.019) (0.058) 0.2 DPCP-d(0.016) 0.2 (0.015) 0.2 (0.018) DPCP-IRLS(0.018) (0.016) (0.020) 0 0.2 0.4 0.6 0.8 1 0 0.2 0.4 0.6 0.8 1 0 0.2 0.4 0.6 0.8 1

(a) σ = 0.02,R = 20% (b) σ = 0.02,R = 33% (c) σ = 0.02,R = 50%

1 1 1

0.8 0.8 0.8 (0.302) (0.388) (0.448) (0.424) 0.6 (0.365) 0.6 (0.397) 0.6 (0.387) (0.409) (0.430) (0.053) (0.111) (0.190) 0.4 (0.039) 0.4 (0.042) 0.4 (0.045) (0.038) (0.047) (0.085) 0.2 (0.038) 0.2 (0.040) 0.2 (0.048) (0.040) (0.045) (0.052) 0 0.2 0.4 0.6 0.8 1 0 0.2 0.4 0.6 0.8 1 0 0.2 0.4 0.6 0.8 1

(d) σ = 0.06,R = 20% (e) σ = 0.06,R = 33% (f) σ = 0.06,R = 50%

1 1 1

0.8 0.8 0.8 (0.312) (0.392) (0.456) 0.6 (0.369) 0.6 (0.404) 0.6 (0.428) (0.392) (0.416) (0.434) (0.086) (0.129) (0.188) 0.4 (0.071) 0.4 (0.072) 0.4 (0.077) (0.072) (0.077) (0.105) 0.2 (0.068) 0.2 (0.069) 0.2 (0.077) (0.073) (0.075) (0.084) 0 0.2 0.4 0.6 0.8 1 0 0.2 0.4 0.6 0.8 1 0 0.2 0.4 0.6 0.8 1

(g) σ = 0.1,R = 20% (h) σ = 0.1,R = 33% (i) σ = 0.1,R = 50%

Figure 3.8: ROC curves as a function of noise standard deviation σ and outlier percentage

R, for subspace dimension d = 29 in ambient dimension D = 30. The horizontal axis

is False Positives ratio and the vertical axis is True Positives ratio. The number associ-

ated with each curve is the area above the curve; smaller numbers reflect more accurate

performance.

89 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT

methods RANSAC, SE-RPCA and `2,1-RPCA fail even for low noise (σ =0.02) and mod-

erate outliers (20%). Notice that for 50% outliers and for any threshold, there is roughly an

equal chance of identifying a point as inlier or as outlier, i.e., the performance of the meth-

ods is almost the same as that of a random guess. On the other hand, DPCP-r, DPCP-d,

and DPCP-IRLS are very robust to variations of the noise level and outlier percentages and

are the best methods. DPCP-r-d performs almost identically with these methods, except in

the case of 50% outliers, where it performs less accurately. Finally, notice that the perfor-

mance of REAPER degrades significantly as soon as the outlier percentage exceeds 20%,

indicating that REAPER is not the best method for subspaces of very low codimension.

3.2.5.3 Outlier detection using real face and object images

In this section we use the Extended-Yale-B real face dataset [36] as well as the real image

dataset Caltech101 [32] to compare the proposed algorithms DPCP-r, DPCP-d, DPCP-r-d

and DPCP-IRLS to REAPER, RANSAC, `2,1-RPCA, and SE-RPCA. We recall that the

Extended-Yale-B dataset contains 64 face images for each of 38 distinct individuals. We

use the first 19 individuals from the Extended-Yale-B dataset and the first half images of

each category in Caltech101 for gaining intuition and tuning the parameters of each method

(training set), while the remaining part of the datasets serve as a test set.

In the Extended-Yale-B dataset, all face images correspond to the same fixed pose,

while the illumination conditions vary. Such images are known to lie in a 9-dimensional

linear subspace, with each individual having its own corresponding subspace [4]. In this

90 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT

1 1 1

0.8 0.8 0.8 RANSAC(0.256) (0.250) (0.260) ℓ -RPCA(0.043) 0.6 21 0.6 (0.069) 0.6 (0.101) SE-RPCA(0.078) (0.095) (0.153) REAPER(0.051) (0.061) (0.072) 0.4 DPCP-r(0.070) 0.4 (0.068) 0.4 (0.080) DPCP-r-d(0.049) (0.065) (0.087) 0.2 DPCP-d(0.049) 0.2 (0.065) 0.2 (0.087) DPCP-IRLS(0.068) (0.069) (0.074) 0 0.2 0.4 0.6 0.8 1 0 0.2 0.4 0.6 0.8 1 0 0.2 0.4 0.6 0.8 1

(a) R = 20%,C0 N0 (b) R = 33%,C0 N0 (c) R = 50%,C0 N0 − − −

1 1 1

0.8 0.8 0.8 (0.233) (0.238) (0.239) 0.6 (0.137) 0.6 (0.157) 0.6 (0.197) (0.165) (0.182) (0.238) (0.125) (0.137) (0.165) 0.4 (0.166) 0.4 (0.151) 0.4 (0.172) (0.127) (0.136) (0.173) 0.2 (0.141) 0.2 (0.145) 0.2 (0.176) (0.149) (0.141) (0.164) 0 0.2 0.4 0.6 0.8 1 0 0.2 0.4 0.6 0.8 1 0 0.2 0.4 0.6 0.8 1

(d) R = 20%,C0 N1 (e) R = 33%,C0 N1 (f) R = 50%,C0 N1 − − −

1 1 1

0.8 0.8 0.8 (0.136) (0.133) (0.161) 0.6 (0.045) 0.6 (0.060) 0.6 (0.091) (0.050) (0.083) (0.150) (0.055) (0.050) (0.064) 0.4 (0.069) 0.4 (0.052) 0.4 (0.075) (0.050) (0.050) (0.076) 0.2 (0.050) 0.2 (0.050) 0.2 (0.076) (0.084) (0.055) (0.067) 0 0.2 0.4 0.6 0 0.2 0.4 0.6 0 0.2 0.4 0.6 0.8

(g) R = 20%,C1 N0 (h) R = 33%,C1 N0 (i) R = 50%,C1 N0 − − −

1 1 1

0.8 0.8 0.8 (0.129) (0.133) (0.141) 0.6 (0.051) 0.6 (0.065) 0.6 (0.088) (0.047) (0.078) (0.146) (0.059) (0.062) (0.064) 0.4 (0.070) 0.4 (0.065) 0.4 (0.071) (0.063) (0.060) (0.075) 0.2 (0.072) 0.2 (0.062) 0.2 (0.078) (0.079) (0.066) (0.068) 0 0.2 0.4 0.6 0 0.2 0.4 0.6 0 0.2 0.4 0.6 0.8

(j) R = 20%,C1 N1 (k) R = 33%,C1 N1 (l) R = 50%,C1 N1 − − −

Figure 3.9: Average ROC curves and areas over the curves for different percentages R of outliers; see 3.2.5.3. Both inliers and outliers come from EYaleB. C1 means data are centered (C0 not centered), N1 means data are normalized (N0 not normalized). 91 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT

1 1 1

0.8 0.8 0.8 RANSAC(0.184) (0.152) (0.090) ℓ 0.6 21-RPCA(0.008) 0.6 (0.003) 0.6 (0.048) SE-RPCA(0.337) (0.028) (0.033) REAPER(0.059) (0.011) (0.006) 0.4 DPCP-r(0.170) 0.4 (0.036) 0.4 (0.039) DPCP-r-d(0.045) (0.007) (0.043) 0.2 DPCP-d(0.045) 0.2 (0.007) 0.2 (0.043) DPCP-IRLS(0.165) (0.034) (0.011) 0 0.2 0.4 0.6 0.8 1 0 0.2 0.4 0.6 0.8 1 0 0.2 0.4 0.6 0.8

(a) R = 20%,C0 N0 (b) R = 33%,C0 N0 (c) R = 50%,C0 N0 − − −

1 1 1

0.8 0.8 0.8 (0.212) (0.198) (0.189) 0.6 (0.047) 0.6 (0.100) 0.6 (0.224) (0.395) (0.178) (0.177) (0.093) (0.126) (0.143) 0.4 (0.150) 0.4 (0.156) 0.4 (0.172) (0.090) (0.134) (0.185) 0.2 (0.114) 0.2 (0.143) 0.2 (0.187) (0.118) (0.138) (0.147) 0 0.2 0.4 0.6 0.8 1 0 0.2 0.4 0.6 0.8 1 0 0.2 0.4 0.6 0.8 1

(d) R = 20%,C0 N1 (e) R = 33%,C0 N1 (f) R = 50%,C0 N1 − − −

1 1 1

0.8 0.8 0.8 (0.093) (0.067) (0.042) (0.002) (0.004) (0.040) 0.6 0.6 0.6 (0.003) (0.019) (0.030) (0.052) (0.012) (0.004) 0.4 (0.108) 0.4 (0.037) 0.4 (0.033) (0.035) (0.008) (0.035) 0.2 (0.035) 0.2 (0.008) 0.2 (0.035) (0.159) (0.043) (0.009) 0 0.1 0.2 0.3 0 0.05 0.1 0.15 0 0.05 0.1 0.15

(g) R = 20%,C1 N0 (h) R = 33%,C1 N0 (i) R = 50%,C1 N0 − − −

1 1 1

0.8 0.8 0.8 (0.089) (0.060) (0.043) 0.6 (0.025) 0.6 (0.013) 0.6 (0.082) (0.014) (0.018) (0.043) (0.022) (0.018) (0.012) 0.4 (0.086) 0.4 (0.031) 0.4 (0.064) (0.031) (0.022) (0.070) 0.2 (0.045) 0.2 (0.023) 0.2 (0.070) (0.079) (0.021) (0.016) 0 0.2 0.4 0 0.05 0.1 0.15 0.2 0 0.05 0.1 0.15 0.2 0.25

(j) R = 20%,C1 N1 (k) R = 33%,C1 N1 (l) R = 50%,C1 N1 − − −

Figure 3.10: Average ROC curves and areas over the curves for different percentages R of outliers; see 3.2.5.3. Inliers come from EYaleB, outliers from Caltech101. C1 means data are centered (C0 not centered), N1 means data are normalized (N0 not normalized). 92 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT

1 1 1

0.8 0.8 0.8 RANSAC(0.211) (0.133) (0.145) ℓ 0.6 21-RPCA(0.165) 0.6 (0.060) 0.6 (0.492) SE-RPCA(0.181) (0.083) (0.078) REAPER(0.069) (0.050) (0.055) 0.4 DPCP-r(0.113) 0.4 (0.052) 0.4 (0.059) DPCP-r-d(0.137) (0.050) (0.053) 0.2 DPCP-d(0.137) 0.2 (0.050) 0.2 (0.053) DPCP-IRLS(0.100) (0.055) (0.060) 0 0.2 0.4 0.6 0.8 0 0.2 0.4 0.6 0 0.2 0.4 0.6 0.8 1

(a) D = 15 (b) D = 50 (c) D = 150

Figure 3.11: ROC curves for three different projection dimensions, when there are 33% face outliers; data are centered but not normalized (C N ). 1 − 0

1 1 1

0.8 0.8 0.8 RANSAC(0.078) (0.067) (0.059) ℓ21-RPCA(0.089) (0.004) (0.435) 0.6 0.6 0.6 SE-RPCA(0.123) (0.019) (0.008) REAPER(0.003) (0.012) (0.011) 0.4 DPCP-r(0.064) 0.4 (0.037) 0.4 (0.033) DPCP-r-d(0.077) (0.008) (0.008) 0.2 DPCP-d(0.077) 0.2 (0.008) 0.2 (0.008) DPCP-IRLS(0.039) (0.043) (0.046) 0 0.1 0.2 0.3 0.4 0 0.05 0.1 0.15 0 0.2 0.4 0.6 0.8

(a) D = 15 (b) D = 50 (c) D = 150

Figure 3.12: ROC curves for three different projection dimensions, when there are 33% outliers from Caltech101; data are centered but not normalized (C N ). 1 − 0 experiment we use the 42 48 cropped images that were also used in [30]. Thus, the images × 42 48 2016 of each individual lie approximately in a 9-dimensional linear subspace of R × ∼= R . It is then natural to take as inliers all the images of one individual. For outliers, we consider two possibilities: either the outliers consist of images randomly chosen from the rest of the individuals, or they consist of images randomly chosen from Caltech101; we recall here that Caltech101 is a database of 101 image categories, such as images of airplanes or

93 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT

images of brains. We will consider three different levels of outliers: 20%, 33% and 50%.

There are three important preprocessing steps that one may or may not chose to per- form. The first such step is dimensionality reduction. In our case, we choose to use pro- jection of the data matrix X˜ onto its first D = 50 principal components. This choice is justified by noting that, since the inliers and outliers are each at most 64, the data matrix is at most of rank 128. The choice d = 50 avoids situations where the entire data matrix is of low-rank to begin with, or cases where the outliers themselves span a low-dimensional sub- space; e.g., this would be true if we were working with the original dimensionality of 2016.

These instances would be particularly unfavorable for methods such as SE-RPCA and `2,1-

RPCA. On the other hand, methods such as REAPER, DPCP-IRLS, DPCP-d, DPCP-r-d and DPCP-r work with the orthogonal complement of the subspace, and so the lower the codimension the more efficient these methods are. We will shortly see how the methods behave for various projection dimensions.

Another pre-processing step is that of centering the data, i.e., forcing the entire dataset ˜ X to have zero mean; we will be writing C0 to denote that no centering takes place and C1 otherwise. Note here that we do not consider centering the inliers separately, as was done in [62]; this is unrealistic, since it requires knowledge of the ground truth, i.e., which image is an inlier and which is an outlier. Finally, one may normalize the columns of X˜ to have unit norm or not. We will denote these two possibilities as N1 and N0 respectively.

Various possibilities for the parameters of all algorithms were considered by experi- menting with the training set, and the ones that minimize the average area under the cor-

94 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT

responding ROC curves were chosen, for the case of 33% outliers, for 50 independent experimental instances (which individual plays the role of the inlier subspace is a random event). The results for the two different choices of outliers and all four pre-processing

C N ,C N ,C N and C N , over 50 independent trials on the test set, are 0 − 0 0 − 1 1 − 0 1 − 1 reported in Figs. 3.9 and 3.10.

Our first observation is that, on the average, all methods perform about the same, with

REAPER being the best method and RANSAC the worse. Evidently, normalizing the data

without centering them, leads to uniform performance degradation for all methods, for both

outlier scenarios. On the other hand, all methods seem to be robust to the remaining three

combinations of centering and normalization, with perhaps C N being the best for 1 − 0 this experiment. A second observation is that the ROC curves are better for all methods,

when the outliers come from Caltech101. This is indeed expected, since, in that case, not

only are the outliers chosen from a different dataset, but also their content is very different

from that of the inliers. On the other hand, when the outliers are face images themselves,

it is intuitively expected that the inlier/outlier separation problem becomes harder, simply

because the outliers are of similar nature with the inliers.

Notice also the interesting phenomenon of all methods behaving worse when the out-

liers are minimal. For example, when R = 20%, the area over the curves for all methods in

Figure 3.10(a) is bigger than when R = 33% (Figure 3.10(b)). In fact, REAPER, RANSAC and DPCP-IRLS are becoming better as the number of outliers increases from 20% to 50%

(Figs. 3.10(a)-3.10(c)). This phenomenon is partially explained by our theory: separating

95 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT

inliers from outliers is easier when the outliers are uniformly distributed in the ambient

space, and this latter condition is more easily achieved when the outliers are large in num-

ber.

Finally, figures 3.11 and 3.12 show how the methods behave if we vary the projection

dimension to D = 15 or D = 150, without adjusting any parameters. Evidently, REAPER is once again the most robust method, while `2,1-RPCA is the least robust. Interestingly, when going from D = 50 to D = 150 for the case of face outliers, only SE-RPCA shows a slight improvement; the rest of the methods become slightly worse.

3.3 Learning a hyperplane arrangement via DPCP

3.3.1 Problem overview

In 3.2 we saw that when the data consist of a set of points uniformly distributed in a § deterministic sense on the unit sphere of a linear subspace , together with a set of outliers S uniformly distributed on the unit sphere of the ambient space, then the DPCP problem

(3.3) has the remarkable property that every global solution is a vector b orthogonal to the subspace . A natural question that arises is whether the DPCP problem has the same S property when the data comes from a subspace arrangement, i.e., whether every global solution of the DPCP problem is orthogonal to one of the subspaces of the arrangement. If this were to be true, then DPCP could prove a useful tool for clustering subspaces of high relative dimension, in particular for learning arrangements of hyperplanes.

96 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT

Hyperplane arrangements arise in several applications, with important examples being

projective motion segmentation [108,113], 3D point cloud analysis [81] and hybrid system

identification [2]. In such applications the dataset X consists of N points of RD, such

D Ni that N points of X , say X R × , lie close to a hyperplane = x : b>x =0 , i i ∈ Hi i  where bi is the normal vector to the hyperplane. It is then of interest to learn the underlying hyperplane arrangement, i.e., learn a set of n normal vectors b1,..., bn, and cluster the

data points X according to their hyperplane membership. Clearly, this is a special case of

the subspace clustering problem, which we will be referring to as hyperplane clustering.

The purpose of this section is to establish both theoretically and experimentally that

DPCP a useful new tool for clustering hyperplanes.11 Indeed, a large part of this section

is devoted to understanding under what conditions is every solution of the DPCP problem

the normal vector to one of the hyperplanes associated to the data. As we will see, the

conditions entail the hyperplanes to be sufficiently separated or one of the hyperplanes to

be sufficiently dominant. The rest of the section explores hyperplane clustering algorithms

using synthetic as well as real 3D data.

3.3.2 Data model

The data structure of interest in this section is an arrangement = n RD of A i=1 Hi ⊂ D D S D 1 n hyperplanes of R , given by = x R : x>b =0 , i [n], where b S − Hi ∈ i ∈ i ∈  is the normal vector to hyperplane . We assume that we are given a collection of N Hi 11The general case is left to future investigations.

97 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT

D N D 1 data points X = [x ,..., x ] R × that are in general position in S − . By 1 N ∈ A∩ general position we mean two things. First, we mean that there are no relations among the

points other than the ones induced by their membership to the hyperplanes; in particular,

every (D 1) points of X are linearly independent. Second, we mean that the points − X uniquely define the arrangement , in the sense that is the only arrangement of A A n hyperplanes that contains X . We assume that for every i [n], precisely N points ∈ i of X , denoted by X = [x(i),..., x(i)], belong to , with n N = N. With that i 1 Ni Hi i=1 i P notation, X = [X 1,..., X n]Γ, where Γ is an unknown permutation matrix, indicating

that the hyperplane membership of the points is unknown. Finally, we assume an ordering

N N N , and we refer to as the dominant hyperplane. 1 ≥ 2 ≥···≥ n H1

3.3.3 Theoretical analysis of the continuous problem

Similarly to the case of data from a single subspace corrupted with outliers ( 3.2.2.1), § certain important insights regarding the DPCP problem (3.3) with respect to hyperplane

clustering can be gained by examining an associated continuous problem. As it turns out,

this problem has a fascinating geometric interpretation, that is interesting in its own right,

and this section is devoted to its study. In 3.3.3.1 we derive the continuous problem, § discuss its geometric interpretation and give a basic property of its global minimizers. In

3.3.3.2 we completely characterize the global minimizers of the continuous problem for § the special cases of i) two hyperplanes and ii) orthogonal hyperplanes, while in 3.3.3.3 § we completely characterize the global minimizers of an arrangement of three equiangular

98 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT

hyperplanes. Finally, in 3.3.3.4 we give optimality conditions for an arbitrary hyperplane § arrangement.

3.3.3.1 Derivation, interpretation and basic properties of the continuous problem

To see what is that the continuous problem associated with DPCP for the case of hyperplane

D 1 D 1 arrangements, let ˆ = S − , and note first that for any b S − we have Hi Hi ∩ ∈

Ni 1 (i) b>x b>x (3.146) j dµ ˆi , Ni ' x ˆi H j=1 Z H X 1 where the LHS of (3.146) is precisely X >b and can be viewed as a discretization Ni i 1

via the point set X of the integral on the RHS of (3.146), with denoting the uniform i µ ˆi H measure on ˆ .12 Letting θ be the principal angle between b and b , for any x we Hi i i ∈ Hi have

ˆ > b>x = b>π i (x)=(π i (b))> x = hi,>bx = sin(θi) hi,bx. (3.147) H H

Hence,

b>x hˆ > x (3.148) dµ ˆi = i,b dµ ˆi sin(θi) x ˆi H x ˆi H Z ∈H Z ∈H 

= x1 dµSD−2 sin(θi) (3.149) x SD−2 | | Z ∈ 

= c sin(θi), (3.150)

D 1 where c is the average height of the unit hemisphere of R − (in the notation of equation

(3.24) we have c = cD 1). As a consequence, we can view the objective function of (3.3), − 12See 3.2.2.1 for a detailed measure-theoretic discussion of 3.146 in the context of data from a single subspace§ corrupted with outliers. Similar arguments apply here but are omitted for the sake of brevity.

99 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT

which is given by

n n Ni 1 (i) X >b = X >b = N b>x , (3.151) 1 i 1 i N j i=1 i=1 i j=1 ! X X X

as a discretization via the point set X of the function

n n (3.150) b b>x (3.152) ( ) := Ni dµ ˆi = Ni c sin(θi). J x ˆi H i=1 Z ∈H  i=1 X X

In that sense, the continuous counterpart of problem (3.3) is precisely

D 1 min (b)= N1 c sin(θ1)+ + Nn c sin(θn), s.t. b S − , (3.153) b J ··· ∈ where now the integer N is interpreted as a positive weight assigned to hyperplane , i Hi penalizing the distance of b from bi.

Remark 3.16. Geometrically, a solution b∗ to problem (3.153) can be interpreted as a weighted median of the lines spanned by b1,..., bn. Medians in Riemmannian , and in particular in the , are an active subject of research [24,37].

However, we are not aware of any work in the literature that defines a median by means of

(3.153), nor any work that studies (3.153).

The advantage of working with (3.153) instead of (3.3), is that the solution set of the continuous problem (3.153) depends solely on the weights N N N assigned 1 ≥ 2 ≥···≥ n to the hyperplane arrangement, as well as on the geometry of the arrangement, captured by the principal angles φij between bi and bj. In contrast, the solutions of the discrete problem (3.3) may also depend on the distribution of the points X . From that perspective, understanding when problem (3.153) has a unique solution that coincides with the normal

100 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT

b to the dominant hyperplane , is essential for understanding the potential of (3.3) ± 1 H1 for hyperplane clustering. Towards that end, we provide a series of results pertaining to the

continuous problem (3.153).

To begin with, we note that the objective function of (3.153) is everywhere differen-

tiable except at the points b ,..., b , where its partial derivatives do not exist. For any ± 1 ± n D 1 b S − distinct from b , the gradient at b is given by ∈ ± i n bi>b b = 1 bi. (3.154) ∇ J − 2 i=1 1 (b>b)2 X − i

Now let b∗ be a global solution of (3.153) and suppose that b∗ = b , i [n]. Then b∗ 6 ± i ∀ ∈ must satisfy the first order optimality condition

b b∗ + λ∗ b∗ = 0, (3.155) ∇ J| where λ∗ is a Lagrange multiplier. Equivalently, we have

n 1 2 − 2 N b>b∗ 1 b>b∗ b + λ∗ b∗ = 0, (3.156) − i i − i i i=1 X     which implies that

n 1 n 1 2 2 − 2 2 − 2 N b>b∗ 1 b>b∗ b∗ = N b>b∗ 1 b>b∗ b , (3.157) i i − i i i − i i i=1 i=1 X     X     from which the next lemma follows.

Lemma 3.17. Let b∗ be a global solution of (3.153). Then b∗ Span(b ,..., b ). ∈ 1 n

Proof. If b∗ is equal to some b , then the statement of the lemma is certainly true. If ± i b∗ = b , i [n], then b∗ satisfies (3.157), from which again the statement is true. 6 ± i ∀ ∈ 101 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT

3.3.3.2 The cases of i) two hyperplanes and ii) orthogonal hyperplanes

In this section we characterize the global minimizers of the continuous problem (3.153) for two special cases of hyperplane arrangements. The first configuration that we examine is that of two hyperplanes. As it turns out in that case, the weighted geometric median of the two lines spanned by the normals to the hyperplanes, always corresponds to one of the normals, as the next theorem shows.

Theorem 3.18. Let b , b be an arrangement of two hyperplanes in RD, with weights N 1 2 1 ≥

N2. Then the set B∗ of global minimizers of (3.153) satisfies:

1. If N = N , then B∗ = b , b . 1 2 {± 1 ± 2}

2. If N >N , then B∗ = b . 1 2 {± 1}

Proof. By Lemma 3.17 any global solution must lie in the plane Span(b1, b2), and so our problem becomes planar, i.e., we may as well assume that the hyperplane arrangement b , b is a line arrangement of R2. Note that b , b S1 partition S1 in two arcs, and 1 2 1 2 ∈ among these, only one arc has length φ strictly less than π; we denote this arc by a. Next, recall that the continuous objective function for two hyperplanes can be written as

1 1 2 2 2 2 1 (b)= N 1 (b>b) + N 1 (b>b) , b S . (3.158) J 1 − 1 2 − 2 ∈   Let b∗ be a global solution, and suppose that b∗ a. If b∗ a, then we can replace b , b 6∈ − ∈ 1 2 by b , b , an operation that does not change neither the arrangement nor the objective. − 1 − 2

After this replacement, we have that b∗ a. Finally suppose that neither b∗ nor b∗ are ∈ − 102 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT

inside a. Then replacing either b with b or b with b , leads to b∗ a. Consequently, 1 − 1 2 − 2 ∈

without loss of generality we may assume that b∗ lies in a. Moreover, subject to a rotation

and perhaps exchanging b1 with b2, we can assume that b1 is aligned with the positive x-axis and that the angle φ between b1 and b2, measured counter-clockwise, lies in (0,π).

Then b∗ is a global solution to

1 1 2 2 2 2 1 (b)= N 1 (b>b) + N 1 (b>b) , b S a. (3.159) J 1 − 1 2 − 2 ∈ ∩   Now, for any vector b S1 a, let θ ,θ = φ θ be the angle between b and b , b ∈ ∩ 1 2 − 1 1 2 respectively. Then our objective can be written as

(b)= ˜(θ )= N sin(θ )+ N sin(φ θ ), θ [0,φ]. (3.160) J J 1 1 1 2 − 1 1 ∈

Taking first and second derivatives, we have

∂ ˜ J = N1 cos(θ1) N2 cos(φ θ1) (3.161) ∂θ1 − − ∂2 ˜ J2 = N1 sin(θ1) N2 sin(φ θ1). (3.162) ∂θ1 − − −

Since the second derivative is everywhere negative on [0,φ], ˜(θ ) is strictly concave on J 1

[0,φ] and so its minimum must be achieved at the boundary θ1 =0 or θ1 = φ. This means that either b∗ = b1 or b∗ = b2.

Notice that when N1 > N2, problem (3.153) recovers the normal b1 to the dominant

hyperplane, irrespectively of how separated the two hyperplanes are, since, according to

Proposition 3.18, the principal angle φ1,2 between b1, b2 does not play a role. The continu-

ous problem (3.153) is equally favorable in recovering normal vectors as global minimizers

103 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT

in another extreme situation, where the arrangement consists of up to D perfectly separated

(orthogonal) hyperplanes, as asserted by the next theorem.

Theorem 3.19. Let b ,..., b be an orthogonal hyperplane arrangement, i.e, φ = π/2, i = 1 n ij ∀ 6 D j, of R , with n D, and weights N N N . Then the set B∗ of global mini- ≤ 1 ≥ 2 ≥···≥ n mizers of (3.153) can be characterized as follows:

1. If N = N , then B∗ = b ,..., b . 1 n {± 1 ± n}

2. If N = = N >N N , for some ` [n 1], then B∗ = b ,..., b . 1 ··· ` `+1 ≥··· n ∈ − {± 1 ± `}

Proof. For the sake of simplicity we assume n = 3, the general case follows in a similar

2 fashion. Letting x := b>b and y := 1 x , (3.157) can be written as i i i − i 2 2 2 p x1 x2 x3 x1 x2 x3 N + N + N b∗ = N b + N b + N b . (3.163) 1 y 2 y 3 y 1 y 1 2 y 2 3 y 3  1 2 3  1 2 3

Taking inner products of (3.163) with b1, b2, b3 we respectively obtain

2 2 2 x1 x2 x3 x1 x2 x3 N + N + N x = N + N (b>b )+ N (b>b ), (3.164) 1 y 2 y 3 y 1 1 y 2 y 1 2 3 y 1 3  1 2 3  1 2 3 2 2 2 x1 x2 x3 x1 x2 x3 N + N + N x = N (b>b )+ N + N (b>b ), (3.165) 1 y 2 y 3 y 2 1 y 2 1 2 y 3 y 2 3  1 2 3  1 2 3 2 2 2 x1 x2 x3 x1 x2 x3 N + N + N x = N (b>b )+ N (b>b )+ N . (3.166) 1 y 2 y 3 y 3 1 y 3 1 2 y 3 2 3 y  1 2 3  1 2 3

Since by Lemma 3.17 b∗ is a linear combination of b1, b2, b3, we can assume that D = 3.

Suppose that b∗ = b , i [n]. Now, suppose that x =0. Then we can not have either 6 ± i ∀ ∈ 3 x = 0 or x = 0, otherwise b∗ = b or b∗ = b respectively. Hence x ,x = 0. Then 1 2 2 1 1 2 6 equations (3.164)-(3.166) imply that

N1 N2 2 2 = and x1 + x2 =1. (3.167) y1 y2

104 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT

2 2 Taking into consideration the relations xi + yi =1, we deduce that

N1 N2 y1 = , y2 = . (3.168) 2 2 2 2 N1 + N2 N1 + N2 p p Then

2 2 (b∗)= N y + N y + N y = N + N + N > (b )= N + N , (3.169) J 1 1 2 2 3 3 1 2 3 J 1 2 3 q which is a contradiction on the optimality of b∗. Similarly, none of the x1,x2 can be zero, i.e. x ,x ,x =0. Then equations (3.164)-(3.166) imply that 1 2 3 6

2 2 2 N1 N2 N3 x1 + x2 + x3 =1, = = , (3.170) y1 y2 y3 which give

Ni√2 yi = , i =1, 2, 3. (3.171) 2 2 2 N1 + N2 + N3

2 p2 2 But then (b∗) = 2(N + N + N ) > (b ) = N + N . This contradiction shows J 1 2 3 J 1 2 3 p that our hypothesis b∗ = b , i [n] is not valid, i.e., B∗ b , b , b . The rest 6 ± i ∀ ∈ ⊂ {± 1 ± 2 ± 3} of the theorem follows by comparing the values (b ), i [3]. J i ∈

3.3.3.3 The case of three equiangular hyperplanes

Theorems 3.18 and 3.19 were not hard to prove, since for two hyperplanes the objective

function is strictly concave, while for orthogonal hyperplanes the objective function is sep-

arable. In contrast, the problem becomes considerably harder for n > 2 non-orthogonal

hyperplanes. Even when n =3, characterizing the global minimizers of (3.153) as a func-

tion of the geometry and the weights seems hard. Nevertheless, when the three hyperplanes

105 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT

are equiangular and their weights are equal, the symmetry of the configuration allows us to

fully characterize the geometric median as a function of the angle of the arrangement.

To begin with, without loss of generality, we can describe an equiangular arrangement

of three hyperplanes of RD, with an equiangular arrangement of three planes of R3, with

normals b1, b2, b3 given by

> b1 := µ 1+ α α α (3.172)   > b2 := µ α 1+ α α (3.173)   > b3 := µ α α 1+ α (3.174)   1 µ := (1 + α)2 +2α2 − 2 , (3.175)   with α a positive that determines the angle φ (0,π/2] of the arrangement, ∈ given by

2α(1 + α)+ α2 2α +3α2 cos(φ) := = . (3.176) (1 + α)2 +2α2 1+2α +3α2

Since N1 = N2 = N3, so our objective function essentially becomes

1 1 1 2 2 2 2 2 2 2 (b)= 1 (b>b) + 1 (b>b) + 1 (b>b) , b S (3.177) J − 1 − 2 − 3 ∈    = sin(θ1) + sin(θ2) + sin(θ3), (3.178)

where θi is the principal angle of b from bi. The next lemma shows that any global mini- mizer b∗ must have equal principal angles from at least two of the b1, b2, b3.

106 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT

3 Lemma 3.20. Let b1, b2, b3 be an arrangement of equiangular planes in R , with angle

φ and weights N1 = N2 = N3. Let b∗ be a global minimizer of (3.153) and let xi :=

2 b>b∗, y = 1 x i =1, 2, 3. Then either y = y or y = y or y = y . i i − i 1 2 1 3 2 3 p Proof. If b∗ is one of b , b , b , then the statement clearly holds, since if say b∗ = b , ± 1 ± 2 ± 3 1

then y = y = sin(φ). So suppose that b∗ = b , i [3]. Then x ,y must satisfy 2 3 6 ± i ∀ ∈ i i 2 2 equations (3.164)-(3.166), together with xi + yi = 1. Allowing for yi to take the value

zero, the xi,yi must satisfy

p := x y y y + x y [z x x ]+ x y [z x x ]=0, (3.179) 1 1 1 2 3 2 3 − 1 2 3 2 − 1 3 p := x y [z x x ]+ x y y y + x y [z x x ]=0, (3.180) 2 1 3 − 1 2 2 1 2 3 3 1 − 2 3 p := x y [z x x ]+ x y [z x x ]+ x y y y =0 (3.181) 3 1 2 − 1 3 2 1 − 2 3 3 1 2 3

q := x2 + y2 1, (3.182) 1 1 1 − q := x2 + y2 1, (3.183) 2 2 2 − q := x2 + y2 1, (3.184) 3 3 3 − where z := cos(φ). Viewing the above system of equations as polynomial equations in

the variables x1,x2,x3,y1,y2,y3,z, standard Groebner basis computations reveal that the

polynomial

g := (1 z)(y2 y2)(y2 y2)(y2 y2)(y + y + y ) (3.185) − 1 − 2 1 − 3 2 − 3 1 2 3

lies in the ideal generated by pi,qi, i = 1, 2, 3. In simple terms, this means that b∗ must

satisfy g(xi,yi,z = cos(φ))=0. However, the yi are by construction non-negative and can

107 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT

not be all zero. Moreover, φ> 0 so 1 z =0. This implies that − 6

(y2 y2)(y2 y2)(y2 y2)=0, (3.186) 1 − 2 1 − 3 2 − 3

which in view of the non-negativity of the yi implies

(y y )(y y )(y y )=0. (3.187) 1 − 2 1 − 3 2 − 3

The next lemma says that a global minimizer of (b) is not far from the arrangement. J

3 Lemma 3.21. Let b1, b2, b3 be an arrangement of equiangular planes in R , with angle

φ and weights N1 = N2 = N3. Let Ci be the spherical cap with center bi and radius φ.

Then any global minimizer of (3.178) must lie (up to a sign) either on the boundary or the

interior of C C C . 1 ∩ 2 ∩ 3

Proof. First of all notice that b , b , b lie on the boundary of C C C . Let b∗ be 1 2 3 1 ∩ 2 ∩ 3

a global minimizer. If φ = π/2, we have already seen in Theorem 3.19 that b∗ has to

be one of the vertices b1, b2, b3 (up to a sign); so suppose that φ < π/2. Let θi∗ be the

principal angle of b∗ from bi. Then at least two of θ1∗,θ2∗,θ3∗ must be less or equal to φ; for if say θ1∗,θ2∗ > φ, then b3 would give a smaller objective than b∗. Hence, without loss of generality we may assume that θ∗,θ∗ φ. In addition, because of Lemma 3.20, we can 1 2 ≤ further assume without loss of generality that θ1∗ = θ2∗. Let ζ be the vector in the small arc

1 that joins 1 and b and has angle from b , b equal to θ∗. Since (b∗) (ζ), it must √3 3 1 2 1 J ≤J

be the case that the principal angle θ3∗ is less or equal to φ (because the angle of ζ from b3

108 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT

is φ). We conclude that θ∗,θ∗,θ∗ φ. Consequently, there exist i = j such that up to a ≤ 1 2 3 ≤ 6

sign b∗ C C . Let us assume without loss of generality that b∗ C C , i.e., θ∗,θ∗ ∈ i ∩ j ∈ 1 ∩ 2 1 2

are the angles of b∗ from b1, b2 (notice that now it may no longer be the case that θ1∗ = θ2∗).

Notice that the boundaries of C1 and C2 intersect at two points: b3 and its reflection b˜3 with respect to the plane spanned by b , b . In fact, divides C C in two halves, H1,2 1 2 H1,2 1 ∩ 2 , ˜, with being the reflection of ˜ with respect to . Letting C˜ be the spherical cap Y Y Y Y H1,2 3

of radius φ around b˜3, we can write

C C =(C C C ) (C C C˜ ). (3.188) 1 ∩ 2 1 ∩ 2 ∩ 3 ∪ 1 ∩ 2 ∩ 3

If b∗ C C C we are done, so let us assume that b∗ C C C˜ . Let b˜∗ be the ∈ 1 ∩ 2 ∩ 3 ∈ 1 ∩ 2 ∩ 3

reflection of b∗ with respect to . This reflection preserves the angles from b and b . H1,2 1 2

˜∗ ˜ We will show that b has a smaller principal angle θ3∗ from b3 than b∗. In fact the spherical

˜∗ ˜ ˜ angle of b from b3 is θ3∗ itself, and this is precisely the angle of b∗ from b3. Denote by H3,3˜

˜ ¯∗ the plane spanned by b3 and b3, b the spherical projection of b∗ onto H3,3˜, γ the angle

between b¯∗ and b∗, α the angle between b¯∗ and b3, and α˜ the angle between b¯∗ and b˜3.

Then the spherical law of cosines gives

˜ cos(θ3∗) = cos(˜α)cos(γ), (3.189)

cos(θ3∗) = cos(α)cos(γ). (3.190)

Letting 2ψ be the angle between b3 and b˜3, we have that

α = ψ +(ψ α˜). (3.191) −

109 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT

By hypothesis α<ψ˜ and so α>ψ. If 2ψ π/2, then α is an acute angle and cos(˜α) > ≤ cos(α). If 2ψ >π/2, then cos(˜α) cos(α) only when π (2ψ α˜) α˜ ψ π/2. ≤ − − ≤ ⇔ ≥ But by construction ψ π/2 and equality is achieved only when φ = π/2. Hence, we ≤ conclude that cos(˜α) > cos(α) , which implies that cos(θ˜ ) > cos(θ ) . This in turn | | 3 | 3 | means that (b˜∗) < (b∗), which is a contradiction. J J

The next lemma ensures a every global minimizer satisfies a certain symmetry property.

3 Lemma 3.22. Let b1, b2, b3 be an arrangement of equiangular planes in R , with angle

φ and weights N1 = N2 = N3. Let b∗ be a global minimizer of (3.153) and let xi :=

bi>b∗, i =1, 2, 3. Then either x1,x2,x3 are all non-negative or they are all non-positive.

Proof. By Lemma 3.21, we know that either b∗ C C C or b∗ C C C . In ∈ 1 ∩ 2 ∩ 3 − ∈ 1 ∩ 2 ∩ 3

the first case, the angles of b∗ from b , b , b are less or equal to φ π/2. 1 2 3 ≤

Finally, we arrive at our main result about three equiangular hyperplanes.

Theorem 3.23. Let b , b , b be an equiangular hyperplane arrangement of RD, D 3, 1 2 3 ≥ with φ = φ = φ = φ (0,π/2] and weights N = N = N . Let B∗ be the set of 1,2 13 23 ∈ 1 2 3 global minimizers of (3.153). Then B∗ satisfies the following phase transition:

1. If φ> 60◦, then B∗ = b , b , b . {± 1 ± 2 ± 3}

1 2. If φ = 60◦, then B∗ = b , b , b , 1 . ± 1 ± 2 ± 3 ± √3 n o 1 3. If φ< 60◦, then B∗ = 1 . ± √3 n o

110 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT

Proof. Lemmas 3.20 and 3.22 show that b∗ is a global minimizer of problem

2 2 2 min 1 (b1>b) + 1 (b2>b) + 1 (b3>b) (3.192) b S2 − − − ∈ q q q  if and only if it is a global minimizer of problem

2 2 2 min 1 (b1>b) + 1 (b2>b) + 1 (b3>b) , (3.193) b S2 − − − ∈ q q q 

s.t. b>b = b>b, for some i = j [3]. (3.194) i j 6 ∈

So suppose without loss of generality that b∗ is a global minimizer of (3.194) corresponding to indices i =1,j =2. Then b∗ lives in the vector space

1 0     = Span , , (3.195) V1,2 1 0             0 1         which consists of all vectors that have equal angles from b1 and b2. Taking into considera-

2 tion that b∗ also lies in S , we have the parametrization

v 1   b∗ = . (3.196) √ 2 2 v  2v + w       w     The choice v = 0, corresponding to b∗ = e3 (the third standard basis vector), can be excluded, since b3 always results in a smaller objective: moving b from e3 to b3 while staying in the plane results in decreasing angles of b from b , b , b . Consequently, we V1,2 1 2 3 can assume v =1, and our problem becomes an unconstrained one, with objective

2[(2 + w2)(1+2α +3α2) (αw +2α + 1)2]1/2 + √2 a aw +1 (w)= − | − |. (3.197) J [(2 + w2)(1+2α +3α2)]1/2

111 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT

Now, it can be shown that:

The following quantity is always positive •

u := (2+ w2)(1+2α +3α2) (αw +2α + 1)2. (3.198) −

The choice w =1+1/α corresponds to b∗ = b , and that is precisely the only point • 3 where (w) is non-differentiable. J

1 The choice w =1 corresponds to b∗ = 1. • √3

The choice α =1/3 corresponds to φ = 60◦. •

(b )= 1 1 precisely for α =1/3. • J 3 J √3   Since for α = 0 the theorem has already been proved (orthogonal case), we will assume that α> 0. We proceed by showing that for α (0, 1/3) and for w =1+1/a, it is always ∈ 6 the case that (w) > (1+1/a). Expanding this last inequality, we obtain J J 1/2 2 (2 + w2)(1 + 2α + 3α2) (αw + 2α + 1)2 + √2 a aw + 1 2√1 + 4α + 6α2 − | − | > , 2 2 1/2 α α2  [(2 + w )(1 + 2α + 3α )] 1 + 2 + 3 (3.199) which can be written equivalently as

p < 4√2u1/2 α αw +1 (1+2α +3α2), where (3.200) 1 | − | p := 4(2+ w2)(1+4α +6α2) (1+2α +3α2) 4u + 2(α αw + 1)2 , (3.201) 1 − −   and u has been defined in (3.198). Viewing p1 as a polynomial in w, p1 has two real roots

given by

1 7α + α2 + 15α3 r(1) :=1+1/α > r(2) := − − . (3.202) p1 p1 α(7+22α + 15α2)

112 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT

Since the leading coefficient of p1 is always a negative function of α (for α > 0), (3.200)

(2) (1) will always be true for w [rp ,rp ], in which interval p is strictly negative. Conse- 6∈ 1 1 1 (2) (1) quently, we must show that as long as α (0, 1/3), (3.200) is true for every w [rp ,rp ). ∈ ∈ 1 1

For such w, p1 is non-negative and by squaring (3.200), we must show that

p >0, w [r(2),r(1)), α (0, 1/3), (3.203) 2 ∀ ∈ p1 p1 ∀ ∈ p :=32u(α αw + 1)2(1+2α +3α2) p2. (3.204) 2 − − 1

Interestingly, p2 admits the following factorization

p = 4( 1 α + αw)2p , (3.205) 2 − − − 3 p := 7 18α 49α2 204α3 441α4 162α5 + 81α6 3 − − − − − − + (30α + 238α2 + 612α3 + 468α4 162α5 162α6)w − − +( 8 48α 111α2 12α3 + 270α4 + 324α5 + 81α6)w2 (3.206) − − − −

The discriminant of p3 is the following 10-degree polynomial in α:

∆(p ) = 32( 7 60α 226α2 312α3 + 782α4 + 5160α5 + 13500α6+ 3 − − − − + 21816α7 + 22761α8 + 14580α9 + 4374α10). (3.207)

By Descartes rule of signs, ∆(p3) has precisely one positive root. In fact this root is equal to

1/3. Since the leading coefficient of ∆(p ) is positive, we must have that ∆(p ) < 0, α 3 3 ∀ ∈

(0, 1/3), and so for such α, p3 has no real roots, i.e. it will be either everywhere negative

or everywhere positive. Since p (α = 1/4,w = 1) = 80327/4096, we conclude that as 3 − long as α (0, 1/3), p is everywhere negative and as long as w =1+1/α, p is positive, ∈ 3 6 2 113 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT

i.e. we are done. Moving on to the case α =1/3, we have

128 p (α =1/3,w)= (w 4)2(w 1)2, (3.208) 2 9 − −

which shows that for such α the only global minimizers are b and 1 1. In a similar ± 3 ± √3 fashion, we can proceed to show that (w) > 1 1 , for all w = 1 and all α J J √3 6 ∈   (1/3, ). However, the roots of the polynomials that arise are more complicated functions ∞ of α and establishing the inequality (w) > 1 1 analytically, seems intractable; this J J √3   can be done if one allows for numeric computation of polynomial roots.

3.3.3.4 Conditions of global optimality for an arbitrary hyperplane arrangement

Theorem 3.23 suggests that when the hyperplanes are sufficiently separated, then only nor-

mals can be global minimizers, otherwise the only global minimizer lies at the center of the arrangement. This is in striking similarity with the results regarding the Fermat point of a planar or even spherical triangle [37]. We note that when the symmetry of the ar- rangement in Theorem 3.23 is removed, either by not requiring the principal angles φij

to be equal or/and by not requiring the weights Ni to be equal, then our proof technique

no longer applies, and the problem seems even harder. However, under sufficiently strong

conditions (which are not too strong), b is the unique solution of (3.153). To see what ± 1 these conditions are, we need two lemmas.

D 1 Lemma 3.24. Let b1,..., bn be vectors of S − , with pairwise principal angles φij. Then

1/2 2 max N1 b1>b + + Nn bn>b Ni +2 NiNj cos(φij) . (3.209) b>b=1 ··· ≤ " i i=j #   X X6

114 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT

Proof. Let b† be a maximizer of N b>b + + N b>b . Then b† must satisfy the first 1 1 ··· n n order optimality condition, which is

λ†b† = Ni Sgn(bi>b†)bi, (3.210) i X where λ† is a Lagrange multiplier and Sgn(bi>b†) is the subdifferential of bi>b† . Then

λ†b† = N s†b + + N s† b , (3.211) 1 1 1 ··· n n n where s† = Sign(b>b†), if b>b† =0, and s† [ 1, 1] otherwise. Recalling that b† =1, i i i 6 i ∈ − 2 and taking equality of 2-norms on both sides of (3.212), we get

N1s1†b1 + + Nnsn† bn b† = ··· . (3.212) N1s1†b1 + + Nnsn† bn ··· 2

Now

> Ni bi>b† = Ni bi>b† = Nisi†bi>b† = b† Nisi†bi † †  †  i i:b bi i:b bi i:b bi X X6⊥ X6⊥  X6⊥  

= b† > N s†b + N s†b = b† > N s†b  i i i i i i i i i † † ! i:b bi i:b bi i  X6⊥ X⊥  X   (3.212) = N1s1†b1 + + Nnsn† bn ··· 2 1/2 2

= si†Ni +2 NiNjsi†sj†bi>bj " i i=j # X   X6 1/2 N 2 +2 N N cos(φ ) . (3.213) ≤ i i j ij " i i=j # X X6

115 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT

D Lemma 3.25. Let b1,..., bn be a hyperplane arrangement of R with integer weights

D 1 N ,...,N assigned. For b S − , let θ be the principal angle between b and b . Then 1 n ∈ i i

2 min Ni sin(θi) Ni σmax [NiNj cos(φij)]i,j , (3.214) b SD−1 ≥ − ∈ r X X   where σ [N N cos(φ )] denotes the maximal eigenvalue of the n n matrix, whose max i j ij i,j ×   (i, j) entry is N N cos(φ ) and 1 i, j n. i j ij ≤ ≤

Proof. For any vector ξ we have that ξ ξ . Let ψ [0, 180◦] be the angle between k k1 ≥ k k2 i ∈ b and bi. Then

1 2 N sin(θ )= N sin(ψ ) k·k ≥k·k N 2 sin2(ψ ) (3.215) i i i | i | ≥ i i X X qX = N 2 N 2 cos2(ψ ). (3.216) i − i i qX X 2 2 Hence Ni sin(θi) is minimized when Ni cos (ψi) is maximized. But P 2 2 2 Ni cos (ψi)= b> Ni bibi> b, (3.217) X X  2 2 and the maximum value of Ni cos (ψi) is equal to the maximal eigenvalue of the matrix P 2 > Ni bibi> = N b N b N b N b , (3.218) 1 1 ··· n n 1 1 ··· n n X    which is the same as the maximal eigenvalue of the matrix

> N1b1 Nnbn N1b1 Nnbn =[NiNj cos(ψij)]i,j , (3.219)  ···   ···  where ψ is the angle between b , b . Now, if A is a matrix and we denote by A the ij i j | | matrix that arises by taking absolute values of each element in the matrix A, then it is known that σ ( A ) σ (A). Hence the result follows by recalling that cos(ψ ) = max | | ≥ max | ij | cos(φij).

116 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT

The next result gives sufficient conditions under which there is a unique global mini-

mizer of the continuous problem, equal to the normal vector of the dominant hyperplane.

Theorem 3.26. Let b ,..., b be an arrangement of n 3 hyperplanes in RD, with pair- 1 n ≥ wise principal angles φ . Let N N N be positive integer weights assigned ij 1 ≥ 2 ≥···≥ n to the arrangement. Suppose that N1 is large enough, in the sense that

2 2 N1 > α + β , where (3.220) p

α := N sin(φ ) N 2 σ [N N cos(φ )] 0, (3.221) i 1,i − i − max i j ij i,j>1 ≥ i>1 s i>1 X X   2 β := Ni +2 NiNj cos(φij), (3.222) i>1 i=j, i,j>1 sX 6 X with σ [N N cos(φ )] denoting the maximal eigenvalue of the (n 1) (n 1) max i j ij i,j>1 − × −   matrix, whose (i 1,j 1) entry is N N cos(φ ) and 1 < i, j n. Then any global − − i j ij ≤ minimizer b∗ of problem (3.153) must satisfy b∗ = b , for some i [n]. If in addition, ± i ∈

γ := min Ni sin(φi0,i) Ni sin(φ1,i) > 0, (3.223) i0=1 − 6 i=i0 i>1 X6 X then problem (3.153) admits a unique up to sign global minimizer b∗ = b . ± 1

Proof. Let b∗ be a global solution of (3.153). Suppose for the sake of a contradiction that b∗ , i [n], i.e., b∗ = b , i [n]. Consequently, is differentiable at b∗ and so 6⊥ Hi ∀ ∈ 6 ± i ∀ ∈ J b∗ must satisfy (3.156), which we repeat here for convenience:

n 1 2 − 2 N b>b∗ 1 b>b∗ b + λ∗ b∗ = 0. (3.224) − i i − i i i=1 X     117 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT

Projecting (3.224) orthogonally onto the hyperplane ∗ defined by b∗ we get H n 1 2 − 2 0 Ni bi>b∗ 1 bi>b∗ π ∗ (bi)= . (3.225) − − H i=1 X     Since b∗ = bi, i [n], it will be the case that hi := π ∗ (bi) = 0, i [n]. Since 6 ∀ ∈ H 6 ∀ ∈ 1 2 2 π ∗ (bi) = 1 bi>b∗ > 0, (3.226) k H k2 −    equation (3.225) can be written as

n ˆ 0 Ni bi>b∗ hi = , (3.227) i=1 X  which in turn gives

ˆ N1 b1>b∗ Ni bi>b∗ hi (3.228) ≤ i>1 2 X 

N b>b∗ (3.229) ≤ i i i>1 X

max Ni bi>b (3.230) ≤ b>b=1 i>1 X Lem.3.24 β. (3.231) ≤

Since by hypothesis N1 > β, we can define an angle θ1† by

β cos(θ1†) := , (3.232) N1 and so (3.231) says that θ can not drop below θ†. Hence (b∗) can be bounded from 1 1 J below as follows:

(b∗)= N1 sin(θ1∗)+ Ni sin(θi∗) N1 sin(θ1†)+ min Ni sin(θi) (3.233) J ≥ b>b=1 i>1 i>1 X X Lem.3.25 2 N sin(θ†)+ N σ [N N cos(φ )] . (3.234) ≥ 1 1 i − max i j ij i,j>1 s i>1 X  

118 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT

By the optimality of b∗, we must also have (b ) (b∗), which in view of (3.234) gives J 1 ≥J

2 N sin(φ ) N sin(θ†)+ N σ [N N cos(φ )] . (3.235) i 1i ≥ 1 1 i − max i j ij i,j>1 i>1 s i>1 X X   Now, a little algebra reveals that this latter inequality is precisely the negation of hypothesis

N > α2 + β2. This shows that b∗ has to be b , for some i [n]. For the last 1 ± i ∈ p statement of the theorem, notice that condition γ > 0 is equivalent to saying that (b ) < J 1 (b ), i> 1. J i ∀

Let us provide some intuition about the meaning of the quantities α, β and γ in Theorem

3.26. To begin with, the first term in α is precisely equal to (b ), while the second J 1 term in α is a lower bound on the objective function N sin(θ )+ + N sin(θ ), if one 2 2 ··· n n β discards hyperplane 1. Moving on, the quantity admits a nice geometric interpretation: H N1 1 β cos− is a lower bound on how small the principal angle of a critical point b† from b N1 1   can be, if b† = b . Interestingly, the larger N is, the larger this minimum angle is, which 6 ± 1 1 shows that critical hyperplanes † (i.e., hyperplanes defined by critical points b†) that are H distinct from , must be sufficiently separated from . Finally, the second term in γ is H1 H1 (b ), while the first term is the smallest objective value that corresponds to b = b ,i> 1, J 1 i and so (3.223) simply guarantees that (b ) < (b ), i> 1. J 1 J i ∀

2 2 Next, notice that condition N1 > α + β of Theorem 3.26 is easier to satisfy when p is close to the rest of the hyperplanes (which leads to small α), while the rest of the H1 hyperplanes are sufficiently separated (which leads to small α and small β). Regardless,

119 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT

one can show that

√2 N α2 + β2, (3.236) i ≥ i>1 X p and so if

N1 > √2 Ni, (3.237) i>1 X then any global minimizer of (3.153) has to be one of the normals, irrespectively of the φij.

Finally, condition (3.223) is consistent with condition (3.220) in that it requires to be H1 close to ,i> 1 and , to be sufficiently separated for i, j > 1. Once again, (3.223) Hi Hi Hj can always be satisfied irrespectively of the φij, by choosing N1 sufficiently large, since only the positive term in the definition of γ depends on N1.

3.3.4 Theoretical analysis of the discrete problem

We now turn our attention to the discrete problem of hyperplane clustering via DPCP, i.e., to

problems (3.3) and (3.4), for the case where X =[X 1,..., X n]Γ, with X i being Ni points

in , as described in section 3.2.1.1. As we did in the case of single subspace learning Hi D 1 with outliers ( 3.2.3), for any i [n] and b S − , we write the quantity X >b as § ∈ ∈ || i ||1

Ni Ni (i) (i) (i) X >b b>x b> b>x x b>x (3.238) i 1 = j = Sign j j = Ni i,b, j=1 j=1   X X

where

Ni 1 (i) (i) x := Sign b>x x (3.239) i,b N j j i j=1 X  

120 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT

X is the average point of i with respect to the orthogonal projection hi,b := π i (b) of b H onto . Then by similar arguments as the ones that established Lemma 3.8 and inequality Hi

(3.83), we have that xi,b is a discrete approximation to an integral that evaluates to chi,b, which leads us to define b

i := max xi,b c hi,b (3.240) b SD−1 − 2 ∈ b D 1 as the maximum approximation error as b varies on S − . In turn, i is bounded above as

i √5SD 1(X i), (3.241) ≤ −

where SD 1(X i) is the spherical cap discrepancy of X i; see the defining equation (3.65) − and the discussion surrounding (3.83), which adapts it for points that lie in a proper linear subspace. As a consequence, the more uniformly distributed the points X i are, the smaller

SD 1(X i) is (by definition), and hence the same is true for the uniformity parameter i. − Before stating our main result regarding the properties of problem (3.3) when the data lie in a union of hyperplanes, we need a definition analogous to Definition 3.10.

D 1 Definition 3.27. For a set Y =[y ,..., y ] − and integer K L, define Y to 1 L ⊂S ≤ R ,K be the maximum circumradius among the circumradii of all polytopes of the form

K α y : α [ 1, 1] , (3.242) ji ji ji ∈ − ( i=1 ) X where j1,...,jK are distinct integers in [L]. Using this notation, we now define

n

:= max i,Ki . (3.243) R K1+ +Kn=D 1 RX 0 ···Ki D 2− i=1 ≤ ≤ − X 121 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT

We note that it is always the case that i,Ki Ki, with this upper bound achieved when RX ≤ contains K colinear points. Combining this fact with the constraint K = D 1 Xi i i i − in (3.243), we get that D 1, and the more uniformly distributedP are the points X R ≤ − inside the hyperplanes, the smaller is (even though does not go to zero). R R The theorem that follows is the discrete counterpart of Theorem 3.26, and it says that if 1) the dominant hyperplane is sufficiently dominant, 2) the remaining hyperplanes are sufficiently separated, and 3) the points are uniformly distributed in their respective hyper- planes, then there is a unique up to sign global minimizer of problem (3.3), the normal vector to the dominant hyperplane.

Theorem 3.28. Let b∗ be a solution of (3.3) with X = [X 1,..., X n]Γ, and suppose that c> √21. If

2 2 N1 > α¯ + β¯ , where (3.244) q 1 α¯ := α + c− 1N1 +2 iNi , and (3.245) i>1 ! X 1 β¯ := β + c− +  N , (3.246) R i i  X 

with α, β as in Theorem 3.26, then b∗ = b for some i [n]. Furthermore, b∗ = b , if ± i ∈ ± 1

1 γ¯ := γ c−  N +  N +2  N > 0. (3.247) − 1 1 2 2 i i i>2 ! X (1) Proof. Let us first derive an upper bound θmax on how large θ1∗ can be. Towards that end,

we derive a lower bound on the objective function (b) in terms of θ : For any vector J 1

122 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT

D 1 b S − we can write ∈

(b)= X >b = X >b = N b>x b (3.248) J 1 i 1 i i, X X = cN sin( θ )+ N b>η , η  (3.249) i i i i,b i,b 2 ≤ i X X c N sin(θ )  N (3.250) ≥ i i − i i X X = cN sin(θ )+ c N sin(θ )  N (3.251) 1 1 i i − i i i>1 X X

cN1 sin(θ1)+ c min Ni sin(θi) iNi (3.252) ≥ b>b=1 − " i>1 # X X Lem.3.25 cN sin(θ )+ c N 2 σ [N N cos(φ )]  N . (3.253) ≥ 1 1 i − max i j ij i,j>1 − i i s i>1 X   X Next, we derive an upper bound on (b ): J 1

(b )= X >b = N b>x b (3.254) J 1 i 1 1 i 1 i, 1 i>1 i>1 X X

= cN sin(φ )+ N b>η , η  (3.255) i 1i i 1 i,b1 i,b1 2 ≤ i i>1 i>1 X X

c N sin(φ )+  N . (3.256) ≤ i 1i i i i>1 i>1 X X Since any vector b for which the corresponding lower bound (3.253) on (b) is strictly J larger than the upper bound (3.256) on (b ), can not be a global minimizer (because it J 1 (1) gives a larger objective than b1), θ1∗ must be bounded above by θmax, where the latter is defined, in view of (3.244), by

1 α + c− 1N1 +2 iNi sin θ(1) := i>1 , (3.257) max N 1 P   where α is as in Theorem 3.28 . Now let b∗ be a global minimizer, and suppose for the sake

(1) of contradiction that b∗ , i [n]. We will show that there exists a lower bound θ 6⊥ Hi ∀ ∈ min 123 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT

(1) (1) on θ1, such that θmin >θmax, which is of course a contradiction. Towards that end, the first

order optimality condition for b∗ can be written as

0 X Sgn(X >b∗)+ λb∗, (3.258) ∈

where λ is a Lagrange multiplier and Sgn(α) = Sign(α) if α =0 and Sgn(0) = [ 1, 1], is 6 − the subdifferential of the function . Since the points X are general, any hyperplane of |·| H RD spanned by D 1 points of X such that at most D 2 points come from X , i [n], − − i ∀ ∈

does not contain any of the remaining points of X . Consequently, by Proposition 3.11 b∗

X will be orthogonal to precisely D 1 points ξ1,..., ξD 1 , from which at most − − ⊂  K D 2 lie in . Thus, we can write relation (3.258) as i ≤ − Hi

D 1 n − 0 αjξj + Ni xi,b∗ + λb∗ = , (3.259) j=1 i=1 X X for real numbers 1 α 1, j [D 1]. Using the definition of  , we can write − ≤ j ≤ ∀ ∈ − i

x b∗ = c hˆ b∗ + η ∗ , i [n], (3.260) i, i, i,b ∀ ∈

with η ∗  . Note that since b∗ , i [n], we have hˆ b∗ = 0. Substituting i,b 2 ≤ i 6⊥ Hi ∀ ∈ i, 6

(3.260) in (3.259) we get

D 1 n n − ˆ 0 αjξj + c Ni hi,b∗ + Ni ηi,b∗ + λb∗ = , (3.261) j=1 i=1 i=1 X X X and projecting (3.261) onto the hyperplane b∗ with normal b∗, we obtain H

D 1 n n − ˆ ∗ 0 π b∗ αjξj + c Niπ b∗ hi,b + Ni π b∗ ηi,b∗ = . (3.262) H H H j=1 ! i=1 i=1 X X   X 

124 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT

ˆ ∗ Let us analyze the term π b∗ hi,b . We have H   π i (b∗) b∗ bi>b∗ bi ˆ ∗ π b∗ hi,b = π b∗ H = π b∗ − (3.263) H H H π i (b∗) 2 b∗ bi>b∗ bi !   k H k  −  2 2 b∗ bi>b∗ bi b∗ bi>b∗ bi 1 cos (θi) = π b∗ − = − − b∗ (3.264) H sin(θ ) sin(θ ) − sin(θ ) i  ! i   i  b>b∗ b>b∗ b∗ b i i i ˆ = − = bi>b∗ ζi, ζi = π b∗ (bi). (3.265) sin(θi)  − H  Using (3.265), (3.262) becomes

D 1 n n − ˆ 0 π b∗ αjξj Ni c bi>b∗ ζi + Ni π b∗ ηi,b∗ = . (3.266) H − H j=1 ! i=1 i=1 X X  X  Isolating the term that depends on i = 1 to the LHS and moving everything else to the

RHS, and taking norms, we get

n cN cos(θ ) cN cos(θ )+ 1 1 ≤ i i i>1 X K n

+ π b∗ αjξj + Ni π b∗ ηi,b∗ . (3.267) H H 2 j=1 ! i=1 X 2 X  K Since ηi,b∗ i, we have that π b∗ ηi,b∗ i. Next, the quantity αjξj 2 ≤ H 2 ≤ j=1  P can be decomposed along the index i, based on the hyperplane membership of the ξj.

For instance, if ξ , then replace the term α ξ with α(1)ξ(1), where the superscript 1 ∈ H1 1 1 1 1 (1) denotes association to hyperplane . Repeating this for all ξ and after a possible · H1 j re-indexing, we have

D 1 n Ki − (i) (i) αjξj = αj ξj . (3.268) j=1 i=1 j=1 X X X Now, by Definition 3.27 we have that

Ki (i) (i) αj ξj i,Ki . (3.269) ≤R j=1 2 X

125 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT

As a consequence, the upper bound (3.267) can be extended to

n cN cos(θ ) cN cos(θ )+  N + . (3.270) 1 1 ≤ i i i i R i>1 i X X Finally, Lemma 3.24 provides a bound

n N cos(θ ) β, (3.271) i i ≤ i>1 X where β is as in Theorem 3.26. In turn, this can be used to extend (3.270) to

1 β + c− ( + iNi) (1) cos(θ1) R =: cos θmin . (3.272) ≤ N1 P   (1) ¯ Note that the angle θmin of (3.272) is well-defined, since by hypothesis N1 > β, and that

(1) what (3.272) effectively says, is that θ1 never drops below θmin. It is then straightforward

2 ¯2 (1) (1) to check that hypothesis N1 > α¯ + β implies θmin > θmax, which is a contradiction. p In other words, b∗ must be equal up to sign to one of the bi, which proves the first part of the theorem. The second part follows from noting that condition γ¯ > 0 guarantees that

(b ) < min (b ). J 1 i>1 J i

2 2 Notice the similarity of conditions N1 > α¯ + β¯ , γ¯ > 0 of Theorem 3.28 with

2 2 p conditions N1 > α + β ,γ > 0 of Theorem 3.26. In fact α¯ >α, β¯ >β, γ<γ¯ , which p implies that the conditions of Theorem 3.28 are strictly stronger than those of Theorem

3.26. This is no surprise since, as we have already remarked, the solution set of (3.3) depends not only on the geometry (φij) and the weights (Ni) of the arrangement, but also on the distribution of the data points (parameters  and ). i R

We note that in contrast to condition (3.220) of Theorem 3.26, N1 now appears in both sides of condition (3.244) of Theorem 3.28 . Nevertheless, under the assumption

126 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT

c > √21, (3.244) is equivalent to the positivity of a quadratic polynomial in N1, whose leading coefficient is positive, and hence it can always be satisfied for sufficiently large N1.

Another interesting connection of Theorem 3.26 to Theorem 3.28 , is that the former can be seen as a limit version of the latter: dividing (3.244) and (3.247) by N1, letting

N ,...,N go to infinity while keeping each ratio N /N fixed, and recalling that  0 1 n i 1 i → as N and D 1, we recover the conditions of Theorem 3.26. i → ∞ R≤ − Next, we consider the linear programming recursion (3.4). At a conceptual level, the main difference between the linear programming recursion in (3.4) and the continuous and discrete problems (3.153) and (3.3), respectively, is that the behavior of (3.4) depends highly on the initialization nˆ 0. Intuitively, the closer nˆ 0 is to b1, the more likely the recur- sion will converge to b1, with this likelihood becoming larger for larger N1. The precise technical statement is as follows.

Theorem 3.29. Let nˆ be the sequence generated by the linear programming recursion { k} D 1 (3.4) by means of the simplex method, where nˆ S − is an initial estimate for b , with 0 ∈ 1 principal angle from b equal to θ . Suppose that c> √5 , and let φ(1) = min φ . i i,0 1 min i>1 { 1i}

If θ1,0 is small enough, i.e.,

(1) 1 2 1 sin(θ ) < min sin φ 2 , 1 (c  ) 2c−  , (3.273) 1,0 min − 1 − − 1 − 1 n   p o

and N1 is large enough in the sense that

ν + ν2 +4ρτ N1 > max µ, , where (3.274) ( p2τ )

127 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT

1 1 i>1 Ni sin(θi,0)+ c− jNj + i=1,j Ni 2c− i sin(φij) µ := max 6 − , (3.275) j=1 sin(φ ) sin(θ ) 2c 1 6 (P 1j − P1,0 − − 1 )

1 1 1 ν :=2c− 1 β + c− + c− iNi R ! Xi>1 1 1 + 2 sin(θ1,0) + 2c− 1 α + 2c− iNi , (3.276) i>1 !  2  X 2 1 1 1 ρ := α + 2c− iNi + β + c− + c− iNi , (3.277) ! R ! Xi>1 Xi>1 2 1 1 2 τ := cos (θ ) 4c−  sin(θ ) 5(c−  ) , (3.278) 1,0 − 1 1,0 − 1

with α, β as in Theorem 3.26, then n converges to b in a finite number of steps. { k} ± 1

Proof. First of all, it follows from the theory of the simplex method, that if nk+1 is obtained via the simplex method, then it will satisfy the conclusion of Lemma 3.13. Then Theorem

3.14 guarantees that n converges to a critical point of problem (3.3) in a finite number { k} of steps; denote that point by nk∗ . In other words, nk∗ will satisfy equation (3.224) and it will have unit ` norm. Now, if n ∗ = b for some j > 1, then 2 k ± j

(nˆ ) (b ), (3.279) J 0 ≥J j or equivalently

N nˆ >x n N b>x b . (3.280) i 0 i,ˆ 0 ≥ i j i, j i=j X X6 Substituting the concentration model

\ xi,nˆ 0 = cπ i (n0)+ ηi,0, ηi,0 i, (3.281) H 2 ≤

\ xi,bj = cπ i (bj)+ ηij, ηij i, (3.282) H 2 ≤

128 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT

into (3.280), we get

N c sin(θ )+ N nˆ >η N c sin(φ )+ N b>η . (3.283) i i,0 i 0 i,0 ≥ i ij i j ij i=j X X X6 X Bounding the LHS of (3.283) from above and the RHS from below, we get

N c sin(θ )+  N N c sin(φ )  N . (3.284) i i,0 i i ≥ i ij − i i i=j X X X6 X But this very last relation is contradicted by hypothesis N > µ, i.e., none of the b for 1 ± j

j > 1 can be n ∗ . We will show that n ∗ has to be b . So suppose for the sake of a k k ± 1

contradiction that that n ∗ is not colinear with b , i.e., n ∗ , i [n]. Since n ∗ k 1 k 6⊥ Hi ∀ ∈ k satisfies (3.224), we can use part of the proof of Theorem 3.28 , according to which the

(1) (1) principal angle θ1, of nk∗ from b1 does not become less than θmin, where θmin is as in ∞ (3.272). Consequently, and using once again the concentration model, we obtain

Ni c sin(θi,0)+ iNi (nˆ 0) (nk∗ ) Ni c sin(θi, ) iNi ≥J ≥J ≥ ∞ − X X X X N c sin θ(1) + c N 2 σ [N N cos(φ )]  N . ≥ 1 min i − max i j ij i,j>1 − i i s i>1   X   X (3.285)

A little algebra reveals that the outermost inequality in (3.285) contradicts (3.274).

The quantities appearing in Theorem 3.29 are harder to interpret than those of Theorem

3.28 , but we can still give some intuition about their meaning. To begin with, the two

inequalities in (3.274) represent two distinct requirements that we enforced in our proof,

which when combined, guarantee that the limit point of (3.4) is b . ± 1

129 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT

The first requirement is that no b can be the limit point of (3.4) for i > 1; this is ± i captured by a linear inequality of the form

µN1 +(terms not depending on N1) > 0, (3.286)

which is satisfied either for N1 sufficiently large (if µ > 0) or for N1 sufficiently small (if

µ < 0). To avoid pathological situations where N1 is required to be negative or less than

D 1, it is natural to enforce µ to be positive. This is precisely achieved by inequality − sin(θ ) < sin φ(1) 2 in (3.273), which is a quite natural condition itself: the initial 1,0 min − 1   estimate nˆ 0 needs to be closer to b1 than any other normal bi for i > 1, and the more well-distributed the data X are inside (smaller  ), the further nˆ can be from b . 1 H1 1 0 1 The second requirement that we employed in our proof is that the limit point of (3.4) is one of the b ,..., b ; this is captured by requiring that a certain quadratic polynomial ± 1 ± n

p(N ) := τN 2 νN ρ (3.287) 1 1 − 1 −

in N1 is positive. To avoid situations where the positivity of this polynomial contradicts the relation N1 > µ, it is important that we ask its leading coefficient τ to be positive, so that the second requirement is satisfied for N1 large enough, and thus is compatible with

N1 > µ. As it turns out, τ is positive only if the data X 1 are sufficiently well distributed in , which is captured by condition c > √5 of Theorem 3.29. Even so, this latter H1 1

1 2 1 condition is not sufficient; instead sin(θ ) < 1 (c  ) 2c−  is needed (as in 1,0 − − 1 − 1 p (3.273)), which is once again very natural: the more well-distributed the data X 1 are inside

(smaller  ), the further nˆ from b can be. H1 1 0 1 130 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT

Next, notice that the conditions of Theorem 3.29 are not directly comparable to those

of Theorem 3.28 . Indeed, it may be the case that b is not a global minimizer of the ± 1

non-convex problem (3.3), yet the recursions (3.4) do converge to b1, simply because nˆ 0 is close to b1. In fact, by (3.273) nˆ 0 must be closer to b1 than bi to b1 for any i > 1, i.e.,

(1) φmin > θ1,0. Similarly to Theorems 3.26 and 3.28 , the more separated the hyperplanes

, are for i, j > 1, the easier it is to satisfy condition (3.274). In contrast, needs Hi Hj H1 to be sufficiently separated from for i > 1, since otherwise µ becomes large. This has Hi an intuitive explanation: the less separated is from the rest of the hyperplanes, the less H1

resolution the linear program (3.4) has in distinguishing b1 from bi,i> 1. To increase this

resolution, one needs to either select nˆ 0 very close to b1, or select N1 very large. The acute

reader may recall that the quantity α appearing in (3.277) becomes larger when becomes H1 separated from ,i> 1. Nevertheless, there are no inconsistency issues in controlling Hi the size of µ and ρ. This is because α is always bounded from above by i>1 Ni, i.e., α P does not increase arbitrarily as the φ1i increase. Another way to look at the consistency

of condition (3.274), is that its RHS does not depend on N1; hence one can always satisfy

(3.274) by selecting N1 large enough.

3.3.5 Algorithms

There are at least two ways in which DPCP can be used to learn a hyperplane arrangement;

either through a sequential (RANSAC-style) scheme, or through a parallel (K-Subspaces-

style) scheme. These two cases are described next.

131 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT

3.3.5.1 Learning a hyperplane arrangement sequentially

Since at its core DPCP is a single subspace learning method, we may as well use it to learn

n hyperplanes in the same way that RANSAC [34] is used: learn one hyperplane from

the entire dataset, remove the points close to it, then learn a second hyperplane and so on.

The main weakness of this technique is well known, and consists of its sensitivity to the

thresholding parameter, which is necessary in order to remove points.

To alleviate the need of knowing a good threshold, we propose to replace the process of

removing points by a process of appropriately weighting the points. In particular, suppose

we solve the DPCP problem (3.3) on the entire dataset X and obtain a unit `2-norm vector b1. Now, instead of removing the points of X that are close to the hyperplane with normal vector b1 (which would require a threshold parameter), we weight each and every point xj

X of by its distance b1>xj from that hyperplane. Then to compute a second hyperplane

with normal b2 we apply DPCP on the weighted dataset b1>xj xj . To compute a third  hyperplane, the weight of point xj is chosen as the smallest distance of xj from the already

computed two hyperplanes, i.e., DPCP is now applied to mini=1,2 bi>xj xj . After   n hyperplanes have been computed, the clustering of the points is obtained based on their

distances to the n hypeprlanes. The resulting scheme is listed in Algorithm 3.5.

3.3.5.2 K-Hyperplanes via DPCP

Another way to do hyperplane clustering via DPCP, is to modify the classic K-Subspaces

[9,102,121] (see also Chapter 2) by computing the normal vector of each cluster by DPCP;

132 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT

Algorithm 3.5 Sequential Hyperplane Learning via DPCP D N 1: procedure SHL-DPCP(X =[x , x ,..., x ] R × ,n) 1 2 N ∈ 2: i 0; ← 3: w 1, j =1,...,N; j ← 4: for i =1: n do

5: Y [w x w x ]; ← 1 1 ··· N N

6: bi argminb RD Y >b , s.t. b>b =1 ; ← ∈ 1   7: w min b> x , j =1,...,N; j ← k=1,...,i k j

8: end for

9: C x X : i = argmin b>x , i =1,...,n; i ← j ∈ k=1,...,n k j  n 10: return (b , C ) ; { i i }i=1 11: end procedure

see Algorithm 3.6. It is worth noting that since DPCP minimizes the `1-norm of the dis- tances of the points to a hyperplane, consistency dictates that the stopping criterion for KH-

DPCP be governed by the sum over all points of the distance of each point to its assigned hyperplane (instead of the traditional sum of squares [9, 102]); in other words the global objective function minimized by KH-DPCP is the same as that of Median K-Flats [121].

Finally, it is important to note that K-Hyperplanes algorithms require in principle a good initialization. Interestingly, such initialization may be provided by the output of Alg. 3.5.

133 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT

Algorithm 3.6 K-Hyperplanes via Dual Principal Component Pursuit

1: procedure KH-DPCP(X =[x1, x2,..., xN ] , b1,..., bn,ε,Tmax)

2: , ∆ , t 0; Jold ← ∞ J ← ∞ ← 3: while tε do max J 4: 0, t = t +1; Jnew ←

5: C x X : i = argmin b>x , i =1,...,n; i ← j ∈ k=1,...,n k j  n 6: b>x ; new = i=1 xj Ci i j J ∈ P P 9 7: ∆ ( )/ ( + 10− ), ; J ← Jold −Jnew Jold Jold ←Jnew

8: b argmin C>b , s.t. b>b =1 , i =1,...,n; i ← b i 1   9: end while

n 10: return (b , C ) ; { i i }i=1 11: end procedure

3.3.6 Experimental evaluation

In this section we evaluate experimentally Algorithms 3.5 and 3.6 using both synthetic

( 3.3.6.1) and real data ( 3.3.6.2). § §

3.3.6.1 Synthetic data

Dataset design. We begin by evaluating experimentally Algorithm 3.5 using synthetic data. The coordinate dimension D of the data is inspired by major applications where hyperplane arrangements appear. In particular, we recall that

134 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT

0.97 0.97 0.97

0.89 0.89 0.89 d/D d/D d/D

0.75 0.75 0.75

2 3 4 2 3 4 2 3 4 n n n

(a) RANSAC, α = 1 (b) RANSAC, α = 0.8 (c) RANSAC, α = 0.6

0.97 0.97 0.97

0.89 0.89 0.89 d/D d/D d/D

0.75 0.75 0.75

2 3 4 2 3 4 2 3 4 n n n

(d) REAPER, α = 1 (e) REAPER, α = 0.8 (f) REAPER, α = 0.6

0.97 0.97 0.97

0.89 0.89 0.89 d/D d/D d/D

0.75 0.75 0.75

2 3 4 2 3 4 2 3 4 n n n

(g) DPCP-r, α = 1 (h) DPCP-r, α = 0.8 (i) DPCP-r, α = 0.6

0.97 0.97 0.97

0.89 0.89 0.89 d/D d/D d/D

0.75 0.75 0.75

2 3 4 2 3 4 2 3 4 n n n

(j) DPCP-IRLS, α = 1 (k) DPCP-IRLS, α = 0.8 (l) DPCP-IRLS, α = 0.6

Figure 3.13: Clustering accuracy as a function of the number of hyperplanes n vs relative dimension d/D vs data balancing (α). White corresponds to 1, black to 0.

135 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT

0.97 0.97 0.97

0.89 0.89 0.89 d/D d/D d/D

0.75 0.75 0.75

2 3 4 2 3 4 2 3 4 n n n

(a) RANSAC (b) REAPER (c) DPCP-r

0.97 0.97 0.97

0.89 0.89 0.89 d/D d/D d/D

0.75 0.75 0.75

2 3 4 2 3 4 2 3 4 n n n

(d) DPCP-r-d (e) DPCP-d (f) DPCP-IRLS

Figure 3.14: Clustering accuracy as a function of the number of hyperplanes n vs relative dimension d/D. Data balancing parameter is set to α =0.8.

In 3D point cloud analysis, the coordinate dimension is 3, but since the various planes • do not necessarily pass through a common origin, i.e., they are affine, one may work

with homogeneous coordinates, which increases the coordinate dimension of the data

by 1 (see 4.5), i.e., D =4.

In two-view geometry one works with correspondences between pairs of 3D points. • Each such correspondence is treated as a point itself, equal to the of

the two 3D corresponding points, thus having coordinate dimension D =9.

136 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT

0.5 0.5 0.5 ) ) ) M M M

+ 0.3 + 0.3 + 0.3 N N N ( ( ( M/ M/ M/ 0.1 0.1 0.1

2 3 4 2 3 4 2 3 4 n n n

(a) RANSAC, α = 1 (b) RANSAC, α = 0.8 (c) RANSAC, α = 0.6

0.5 0.5 0.5 ) ) ) M M M

+ 0.3 + 0.3 + 0.3 N N N ( ( ( M/ M/ M/ 0.1 0.1 0.1

2 3 4 2 3 4 2 3 4 n n n

(d) REAPER, α = 1 (e) REAPER, α = 0.8 (f) REAPER, α = 0.6

0.5 0.5 0.5 ) ) ) M M M

+ 0.3 + 0.3 + 0.3 N N N ( ( ( M/ M/ M/ 0.1 0.1 0.1

2 3 4 2 3 4 2 3 4 n n n

(g) DPCP-r, α = 1 (h) DPCP-r, α = 0.8 (i) DPCP-r, α = 0.6

0.5 0.5 0.5 ) ) ) M M M

+ 0.3 + 0.3 + 0.3 N N N ( ( ( M/ M/ M/ 0.1 0.1 0.1

2 3 4 2 3 4 2 3 4 n n n

(j) DPCP-IRLS, α = 1 (k) DPCP-IRLS, α = 0.8 (l) DPCP-IRLS, α = 0.6

Figure 3.15: Clustering accuracy as a function of the number of hyperplanes n vs outlier ratio vs data balancing (α). White corresponds to 1, black to 0.

137 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT

0.5 0.5 0.5 ) ) ) M M M

+ 0.3 + 0.3 + 0.3 N N N ( ( ( M/ M/ M/ 0.1 0.1 0.1

2 3 4 2 3 4 2 3 4 n n n

(a) RANSAC (b) REAPER (c) DPCP-r

0.5 0.5 0.5 ) ) ) M M M + + 0.3 + 0.3 0.3 N N N ( ( ( M/ M/ M/ 0.1 0.1 0.1

2 3 4 2 3 4 2 3 4 n n n

(d) DPCP-r-d (e) DPCP-d (f) DPCP-IRLS

Figure 3.16: Clustering accuracy as a function of the number of hyperplanes n vs outlier

ratio. Data balancing parameter is set to α =0.8.

As a consequence, we choose D = 4, 9 as well as D = 30, where the choice of 30 is

a-posteriori justified as being sufficiently larger than 4 or 9, so that the clustering problem becomes more challenging. For each choice of D we randomly generate n = 2, 3, 4 hy- perplanes of RD and sample each hyperplane as follows. For each choice of n the total number of points in the dataset is set to 300n, and the number Ni of points sampled from

i 1 hyperplane i> 1 is set to Ni = α − Ni 1, so that −

n n 1 N = (1+ α + + α − )N = 300n. (3.288) i ··· 1 i=1 X Here α (0, 1] is a parameter that controls the balancing of the clusters: α = 1 means ∈

138 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT

the clusters are perfectly balanced, while smaller values of α lead to less balanced clusters.

In our experiment we try α = 1, 0.8, 0.6. Having specified the size of each cluster, the points of each cluster are sampled from a zero-mean unit-variance Gaussian distribution with support in the corresponding hyperplane. To make the experiment more realistic, we corrupt points from each hyperplane by adding white Gaussian noise of standard deviation

σ = 0.01 with support in the direction orthogonal to the hyperplane. Moreover, we cor- rupt the dataset by adding 10% outliers sampled from a standard zero-mean unit-variance

Gaussian distribution with support in the entire ambient space.

Algorithms and parameters. In Algorithm 3.5 we solve the DPCP problem by using all four variations introduced in 3.2.4, i.e., DPCP-r, DPCP-r-d, DPCP-d and DPCP-IRLS, § thus leading to four different versions of the algorithm. All DPCP algorithms are set to terminate if either a maximal number of 20 iterations for DPCP-r or 100 iterations for

DPCP-r-d,DPDP-d, DPCP-IRLS is reached, or if the algorithm converges within accuracy

3 of 10− . We also compare with the REAPER analog of Algorithm 3.5, where the computa- tion of each normal vector is done by the IRLS version of REAPER (see chapter 2) instead of DPCP. As with the DPCP algorithms, its maximal number of iterations is 100 and its

3 convergence accuracy is 10− .

Finally, we compare with RANSAC, which is the predominant method for clustering hyperplanes in low ambient dimensions (e.g., for D = 4, 9). For fairness, we implement a version of RANSAC which involves the same weighting scheme as Algorithm 3.5, but instead of weighting the points, it uses the normalized weights as a discrete probability

139 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT

distribution on the data points; thus points that lie close to some of the already computed

hyperplanes, have a low probability of being selected. Moreover, we control the running

time of RANSAC so that it is as slow as DPCP-r, the latter being the slowest among the

four DPCP algorithms.

Results. Since not all results can fit in a single figure, we show the mean clustering accuracy over 50 independent experiments in Figure 3.13 only for RANSAC, REAPER,

DPCP-r and DPDP-IRLS (i.e., not including DPCP-r-d and DPCP-d), but for all values

α = 1, 0.8, 0.6, as well as in Figure 3.14 for all methods but only for α = 0.8. The

accuracy is normalized to range from 0 to 1, with 0 corresponding to black, and 1 to white.

First, observe that the performance of almost all methods improves significantly as the

clusters become more unbalanced (α = 1 α = 0.6). This is intuitively expected, → as the smaller α is the more dominant is the i-th hyperplane with respect to hyperplanes i +1,...,n, and so the more likely it is to be identified at iteration i of the sequential

algorithm. The only exception to this intuitive phenomenon is RANSAC, which appears to

be insensitive to the balancing of the data. This is because RANSAC is configured to run

a relatively long amount of time, approximately equal to the running time of DPCP-r, and

as it turns out this compensates for the unbalancing of the data, since very many different

samplings take place, thus leading to approximately constant behavior across different α.

In fact, notice that RANSAC is the best among all methods when D = 4, with mean clustering accuracy ranging from 99% when n = 2, to 97% when n = 4. On the other hand, RANSAC’s performance drops dramatically when we move to higher coordinate

140 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT

dimensions and more than 2 hyperplanes. For example, for α = 0.8 and n = 4, the mean clustering accuracy of RANSAC drops from 97% for D = 4, to 44% for D = 9, to 28% for D = 30. This is clearly due to the fact that the probability of sampling D 1 points − from the same hyperplane decreases dramatically as D increases.

Secondly, the proposed Algorithm 3.5 using DPCP-r is uniformly the best method (and using DPCP-IRLS is the second best), with the slight exception of D = 4, where its clus- tering accuracy ranges for α = 0.8 from 99% for n = 2 (same as RANSAC), to 89% for

n = 4, as opposed to the 97% of RANSAC for the latter case. In fact, all DPCP variants

were superior than RANSAC or REAPER in the challenging scenario of D = 30,n = 4, where for α =0.6, DPCP-r, DPCP-IRLS, DPCP-d and DPCP-r-d gave 86%, 81%, 74% and

52% accuracy respectively, as opposed to 28% for RANSAC and 42% for REAPER.

Table 3.1 reports running times in seconds for α = 1 and n = 2, 4. It is readily seen that DPCP-r is the slowest among all methods (recall that RANSAC has been configured to be as slow as DPCP-r). Remarkably, DPCP-d and REAPER are the fastest among all methods with a difference of approximately two orders of magnitude from DPCP-r. How- ever, as we saw above, none of these methods performs nearly as well as DPCP-r. From that perspective, DPCP-IRLS is interesting, since it seems to be striking a balance between running time and performance.

Moving on, we fix D = 9 and vary the outlier ratio as 10%, 30%, 50% (in the pre- vious experiment the outlier ratio was 10%). Then the mean clustering accuracy over 50 independent trials is shown in Figure 3.15 and Figure 3.16. In this experiment the hierar-

141 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT

Table 3.1: Mean running times in seconds, corresponding to the experiment of Figure 3.13 for data balancing parameter α =1.

n = 2 n = 4

D = 4 D = 9 D = 30 D = 4 D = 9 D = 30

RANSAC 1.18 1.25 1.76 8.27 11.61 16.00 REAPER 0.05 0.04 0.04 0.18 0.17 0.19 DPCP-r 1.18 1.24 1.75 8.21 11.55 15.89 DPCP-d 0.02 0.02 0.05 0.10 0.16 0.42 DPCP-IRLS 0.12 0.14 0.21 0.77 0.81 0.82 chy of the methods is clear: Algorithm 3.5 using DPCP-r and using DPCP-IRLS are the best and second best methods, respectively, while the rest of the methods perform equally poorly. As an example, in the challenging scenario of n = 4,D = 30 and 50% outliers, for α = 0.6, DPCP-r gives 74% accuracy, while the next best method is DPCP-IRLS with

58% accuracy; in that scenario RANSAC and REAPER give 38% and 41% accuracy re- spectively, while DPCP-r-d and DPCP-d give 41% and 40% respectively. Moreover, for n = 2,D = 30 and α = 0.8 DPCP-r and DPCP-IRLS give 95% and 86% accuracy, while all other methods give about 65%.

3.3.6.2 3D plane clustering of real Kinect data

Dataset and objective. In this section we explore various hyperplane clustering algorithms using the benchmark dataset NYUdepthV2 [83]. This dataset consists of 1449 RGBd data instances acquired using the Microsoft kinect sensor. Each instance of NYUdepthV2 corre-

142 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT

sponds to an indoor scene, and consists of the 480 640 3 RGB data together with depth × × data for each pixel, i.e., a total of 480 640 depth values. In turn, the depth data can be used · to reconstruct a 3D point cloud associated to the scene. In this experiment we use such 3D point clouds to learn plane arrangements and segment the pixels of the corresponding RGB images based on their plane membership. This is an important problem in robotics, where estimating the geometry of a scene is essential for successful robot navigation.

Manual annotation. While the coarse geometry of most indoor scenes can be approx- imately described by a union of a few ( 9) 3D planes, many points in the scene do not lie ≤ in these planes, and may thus be viewed as outliers. Moreover, it is not always clear how many planes one should select or which these planes are. In fact, NYUdepthV2 does not contain any ground truth annotation based on planes, rather the scenes are annotated seman- tically with the task of object recognition in mind. For this reason, among a total of 1449 scenes, we manually annotated 92 scenes, in which the dominant planes are well-defined and capture most of the scene; see for example Figs. 3.19(a)-3.19(b) and 3.17(a)-3.17(b).

More specifically, for each of the 92 images, at most 9 dominant planar regions were man- ually marked in the image and the set of pixels within these regions were declared inliers, while the remaining pixels were declared outliers. For each planar region a ground truth normal vector was computed using DPCP-r. Finally, two planar regions that were consid- ered distinct during manual annotation, were merged if the absolute inner product of their corresponding normal vector was above 0.999.

Pre-processing. For computational reasons, the hyperplane clustering algorithms that

143 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT

we use (to be described in the next paragraph) do not act directly on the original 3D point

cloud, rather on a weighted subset of it, corresponding to a superpixel representation of

each image. In particular, each 480 640 3 RGB image is segmented to about 1000 × × superpixels and the entire 3D point sub-cloud corresponding to each superpixel is replaced

by the point in the geometric center of the superpixel. To account for the fact that the

planes associated with an indoor scene are affine, i.e., they do not pass through a common

origin, we work in homogeneous coordinates, i.e., we append a fourth coordinate to each

3D point representing a superpixel and normalize it to unit `2-norm. Finally, a weight is assigned to each representative 3D point, equal to the number of pixels in the underlying

superpixel. The role of this weight is to regulate the influence of each point in the modeling

error (points representing larger superpixels should have more influence).

Algorithms. The first algorithm that we test is the sequential RANSAC algorithm,

also included in the experiments of 3.3.6.1. Secondly, we explore a family of variations § of the K-Hyperplanes (KH) algorithm based on SVD, DPCP, REAPER and RANSAC. In

particular, KH(2)-SVD indicates the classic KH algorithm which computes normal vectors

through the Singular Value Decomposition (SVD), and minimizes an `2 objective. KH(1)-

DPCP-r-d, KH(1)-DPCP-d and KH(1)-DPCP-IRLS, denote KH variations of DPCP ac- cording to Algorithm 3.6, depending on which method is used to solve the DPCP problem

(3.3) 13. Similarly, KH(1)-REAPER and KH(1)-RANSAC denote the obvious adaptation of KH where the normals are computed with REAPER and RANASC, respectively, and an

13KH(1)-DPCP-r was not included since it was slowing down the experiment considerably, while its per- formance was similar to the rest of the DPCP methods.

144 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT

`1 objective is minimized.

A third method that we explore is a hybrid between Algebraic Subspace Clustering

(ASC), RANSAC and KH, (ASC-RANSAC-KH). First, the vanishing polynomial associ- ated to ASC (see Chapter 4) is computed with RANSAC instead of SVD, which is the traditional way; this ensures robustness to outliers. Then spectral clustering applied on the angle-based affinity associated to ASC (see equation (4.17)) yields n clusters. Finally, one iteration of KH-RANSAC refines these clusters and also yields a normal vector per cluster

(the normal vectors are necessary for the post-processing step).

Post-processing. The algorithms described above, are generic hyperplane clustering

algorithms. On the other hand, we know that nearby points in a 3D point cloud have a high

chance of lying in the same plane, simply because indoor scenes are spatially coherent.

Thus to associate a spatially smooth image segmentation to each algorithm, we use the

normal vectors b1,..., bn that the algorithm produced to minimize a Conditional-Random-

Field [92] type of energy function, given by

N E(y ,...,y ) := d(b , x )+ λ w(x , x )δ(y = y ). (3.289) 1 N yj j j k j 6 k j=1 k j X X∈N

In (3.289) y 1,...,n is the plane label of point x , d(by , x ) is a unary term that j ∈ { } j j j measures the cost of assigning 3D point xj to the plane with normal byj , w(xj, xk) is a

pairwise term that measures the similarity between points xj and xk, λ > 0 is a chosen

parameter, indexes the neighbors of x , and δ( ) is the indicator function. The unary Nj j ·

term is defined as d(b , x ) = b> x , which is the Euclidean distance from point x to yj j | yj j| j

145 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT

the plane with normal byj , and the pairwise term is defined as

x x 2 k j − kk2 − 2σ2 w(xj, xk) := CBj,k e d , (3.290) where x x is the Euclidean distance from x to x , and CB is the length of the k j − kk2 j k j,k common boundary between superpixels j and k. The minimization of the energy function is done via Graph-Cuts [8].

Parameters. For the thresholding parameter of RANSAC, denoted by τ, we test the values 0.1, 0.01, 0.001. For the parameter τ of KH(1)-DPCP-d and KH(1)-DPCP-r-d we test the values 0.1, 0.01, 0.001. We also use the same values for the thresholding parameter of RANSAC, which we also denote by τ. The rest of the parameters of the DPCP variants and REAPER are set as in 3.3.6.1. § 3 The convergence accuracy of the KH algorithms is set to 10− . Moreover, the KH algorithms are configured to allow for a maximal number of 10 random restarts and 100 iterations per restart, but the overall running time of each KH algorithm should not exceed

5 seconds; this latter constraint is also enforced to the sequential RANSAC and ASC-

RANSAC-KH.

The parameter σd in (3.289) is set to the mean distance between 3D points representing neighboring superpixels. The parameter λ in (3.289) is set to the inverse of twice the maximal row-sum of the pairwise matrix w(x , x ) ; this is to achieve a balance between { j k } unary and pairwise terms.

Evaluation. Recall that none of the algorithms considered in this section is explicitly configured to detect outliers, rather it assigns each and every point to some plane. Thus

146 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT

we compute the clustering error as follows. First, we restrict the output labels of each

algorithm to the indices of the dominant ground-truth cluster, and measure how far are

these restricted labels from being identical (identical labels would signify that the algorithm

identified perfectly well the plane); this is done by computing the ratio of the restricted

labels that are different from the dominant label. Then the dominant label is disabled and

a similar error is computed for the second dominant ground-truth plane, and so on. Finally

the clustering error is taken to be the weighted sum of the errors associated with each

dominant plane, with the weights proportional to the size of the ground-truth cluster.

We evaluate the algorithms in several different settings. First, we test how well the

algorithms can cluster the data into the first n dominant planes, where n is 2, 4 or equal to

the total number of annotated planes for each scene. Second, we report the clustering error

before spatial smoothing, i.e., without refining the clustering by minimizing (3.289), and

after spatial smoothing. The former case is denoted by GC(0), indicating that no graph-cuts

takes place, while the latter is indicated by GC(1). Finally, to account for the randomness

in RANSAC as well as the random initialization of KH, we average the clustering errors

over 10 independent experiments.

Results. The results are reported in Table 3.2, where the clustering error of the methods

that depend on the parameter τ are shown for each value of τ individually, as well as averaged over all three values.

As a first observation, notice that spatial smoothing improves the clustering accuracy considerably (GC(0) vs GC(1)); e.g., the clustering error of the traditional KH(2)-SVD for

147 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT

all ground-truth planes drops from 26.22% to 16.71%, when spatial smoothing is employed.

Moreover, as it is intuitively expected, the clustering error increases when fitting more

planes (larger n) is required; e.g., for the GC(1) case, the error of KH(2)-SVD increases

from 9.96% for n =2 to 16.71% for all planes (n 9). ≈ Next, we note the remarkable insensitivity of the DPCP-based methods KH(1)-DPCP-d

and KH(1)-DPCP-r-d to variations of the parameter τ. In sharp contrast, RANSAC is very sensitive to τ; e.g., for τ =0.01 and n =2, RANSAC is the best method with 6.27%, while

for τ =0.1, 0.001 its error increases to 16.01% and 15.26% respectively. Interestingly, the

hybrid KH(1)-RANSAC is significantly more robust; in fact, in terms of clustering error

it is the best method. On the other hand, by looking at the lower part of Table 3.2, we

conclude that on average the rest of the methods have very similar behavior.

Figs. 3.17-3.20 show some segmentation results for two scenes, with and without spa-

tial smoothing. It is remarkable that, even though the segmentation in Figure 3.17 contains

artifacts, which are expected due to the lack of spatial smoothing, its quality is actually very

good, in that most of the dominant planes have been correctly identified. Indeed, applying

spatial smoothing (Figure 3.18) further drops the error for most methods only by about 1%.

3.4 Conclusions

In this Chapter of the thesis we introduced a new single subspace learning method termed

Dual Principal Component Pursuit (DPCP), which is based on a non-convex sparse op- timization problem on the sphere. In sharp contrast to current state-of-the-art subspace

148 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT

Table 3.2: 3D plane clustering error for a subset of the real Kinect dataset NYUdepthV2. n is the number of fitted planes. GC(0) and GC(1) refer to clustering error without or with spatial smoothing, respectively.

n 2 n 4 all ≤ ≤ GC(0) GC(1) GC(0) GC(1) GC(0) GC(1)

1 (τ = 10− ) ASC-RANSAC-KH 16.25 12.68 28.22 19.56 32.95 19.83 RANSAC 16.01 12.89 29.63 23.96 36.00 28.70 KH(1)-RANSAC 12.73 9.12 22.16 14.56 28.61 18.22 KH(1)-DPCP-r-d 12.25 8.72 19.78 13.15 24.37 15.55 KH(1)-DPCP-d 12.45 8.95 19.91 13.30 24.66 16.04

2 (τ = 10− ) ASC-RANSAC-KH 9.50 5.19 19.72 10.29 25.15 12.18 RANSAC 6.27 3.29 12.84 6.75 19.17 9.81 KH(1)-RANSAC 7.97 5.06 14.37 8.78 20.34 12.43 KH(1)-DPCP-r-d 12.70 9.07 21.46 13.98 25.94 16.20 KH(1)-DPCP-d 12.70 9.08 21.50 14.03 26.04 16.22

3 (τ = 10− ) ASC-RANSAC-KH 8.75 4.80 20.35 10.72 24.46 11.95 RANSAC 15.26 8.34 25.89 11.49 33.08 13.90 KH(1)-RANSAC 7.48 4.79 13.86 8.79 19.39 12.07 KH(1)-DPCP-r-d 12.93 9.33 21.06 13.60 25.65 16.27 KH(1)-DPCP-d 12.93 9.33 21.06 13.59 25.63 16.23

(mean) ASC-RANSAC-KH 11.50 7.56 22.76 13.52 27.52 14.65 (mean) RANSAC 12.51 8.17 22.78 14.07 29.42 17.47 (mean) KH(1)-RANSAC 9.39 6.32 16.80 10.71 22.78 14.24 (mean) KH(1)-DPCP-r-d 12.63 9.04 20.77 13.58 25.32 16.01 (mean) KH(1)-DPCP-d 12.69 9.12 20.83 13.64 25.45 16.16 KH(2)-SVD 13.36 9.96 21.85 14.40 26.22 16.71 KH(1)-REAPER 12.45 8.98 20.94 13.71 25.52 16.27 KH(1)-DPCP-IRLS 12.47 9.01 20.77 13.64 25.38 16.10

149 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT

(a) original image (b) annotation

(c) ASC-RANSAC-KH (24.7%) (d) RANSAC (9.11%) (e) KH(1)-RANSAC (11.9%)

(f) KH(1)-DPCP-r-d (9.88%) (g) KH(1)-DPCP-d (10.24%) (h) KH(2)-SVD (9.78%)

(i) KH(1)-REAPER (9.05%) (j) KH(1)-DPCP-IRLS (9.78%)

Figure 3.17: Segmentation into planes of image 5 in dataset NYUdepthV2 without spatial smoothing. Numbers are segmentation errors.

150 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT

(a) original image (b) annotation

(c) ASC-RANSAC-KH (7.98%) (d) RANSAC (3.24%) (e) KH(1)-RANSAC (8.36%)

(f) KH-DPCP-ADM (8.36%) (g) KH(1)-DPCP-d (8.057%) (h) KH(2)-SVD (8.36%)

(i) KH(1)-REAPER (8.07%) (j) KH(1)-DPCP-IRLS (8.36%)

Figure 3.18: Segmentation into planes of image 5 in dataset NYUdepthV2 with spatial smoothing. Numbers are segmentation errors.

151 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT

(a) original image (b) annotation

(c) ASC-RANSAC-KH (23.6%) (d) RANSAC (7.9%) (e) KH(1)-RANSAC (8.6%)

(f) KH(1)-DPCP-r-d (12.24%) (g) KH(1)-DPCP-d (12.73%) (h) KH(2)-SVD (43.0%)

(i) KH(1)-REAPER (22.82%) (j) KH(1)-DPCP-IRLS (13.9%)

Figure 3.19: Segmentation into planes of image 2 in dataset NYUdepthV2 without spatial smoothing. Numbers are segmentation errors.

152 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT

(a) original image (b) annotation

(c) ASC-RANSAC-KH (12.3%) (d) RANSAC (5.91%) (e) KH(1)-RANSAC (9.39%)

(f) KH(1)-DPCP-r-d (10.05%) (g) KH(1)-DPCP-d (10.05%) (h) KH(2)-SVD (32.37%)

(i) KH(1)-REAPER (13.70%) (j) KH(1)-DPCP-IRLS (10.0%)

Figure 3.20: Segmentation into planes of image 2 in dataset NYUdepthV2 with spatial smoothing. Numbers are segmentation errors.

153 CHAPTER 3. DUAL PRINCIPAL COMPONENT PURSUIT learning methods, which assume low-dimensional subspaces, DPCP was shown to be theo- retically applicable irrespectively of subspace dimension, and even more surprisingly, to be naturally suited for subspaces of high relative dimensions. DPCP was further extended in the context of multiple subspaces, leading to new DPCP-based hyperplane clustering algo- rithms. The correctness of DPCP was established by both extensive theoretical analysis as well as experiments on synthetic and real data; in the case of synthetic data the performance of DPCP-based algorithms was seen to be dramatically superior to existing methods.

There are many open research directions regarding DPCP across all levels of theory, algorithms and applications. For example, on the theoretical level, it is of interest to per- form a probabilistic analysis, which is expected to lead to tighter bounds on the tolerable level of outliers. In terms of optimization, it is important to prove convergence of the It- eratively Reweighted Least-Squares (IRLS) implementation of DPCP. As far as algorithms are concerned, it is crucial to have scalable algorithms suitable for high-dimensional data.

Finally, it is interesting to further explore potential applications, particularly in connection with deep networks.

154 Chapter 4

Advances in Algebraic Subspace

Clustering

From a theoretical point of view, Algebraic Subspace Clustering (ASC), also known as

Generalized Principal Component Analysis (GPCA) [110, 111], is one of the most ap- propriate methods for learning linear subspaces of high relative dimension. However, its practical applicability has been limited due to mainly two reasons. First, most existing im- plementations of ASC are sensitive to noise, and those which are not, are suitable only for hyperplanes. Second, fitting polynomials to the data is a task of exponential complexity, which becomes cumbersome even more moderately scaled data.

The main contribution of this chapter is to address the lack of robustness to noise in

ASC, which primarily stems from the need to estimate the dimension of the space of poly-

155 CHAPTER 4. ADVANCES IN ALGEBRAIC SUBSPACE CLUSTERING nomials of degree n that vanish on the data. The solution that this chapter proposes is based on the idea of filtrations of subspace arrangements, which can be used to compute an underlying linear subspace of the original subspace arrangement, in a sequential fashion, as the intersection of certain hyperplanes. A rigorous theory is developed leading to Fil- trated Algebraic Subspace Clustering (FASC). As it turns out, determining whether to take one more step in the filtration or to terminate, which is equivalent to deciding whether to add one more hyperplane in the intersection or not, is a numerically much easier task than estimating the dimension of vanishing polynomials of degree n. Such robust filtrations are then used as a means of computing a pairwise affinity between the data points, upon which spectral clustering gives the clusters. The resulting algorithm, called Filtrated Spectral Al- gebraic Subspace Clustering (FSASC), not only dramatically improves the performance of earlier ASC methods, but it exhibits state-of-the-art performance on real and synthetic data.

As a second contribution in this chapter, a rigorous algebraic-geometric study of the algebraic clustering of affine subspaces is performed. Traditionally, affine subspaces have been dealt with algebraically by reduction to the linear subspace case, by working with homogeneous coordinates. This chapter establishes the correctness of this approach.

4.1 Review of algebraic subspace clustering

This section reviews the main ideas behind ASC, which solves the following problem:

Definition 4.1 (Algebraic subspace clustering problem). Given a finite set of points X

156 CHAPTER 4. ADVANCES IN ALGEBRAIC SUBSPACE CLUSTERING

= x ,..., x lying in general position1 inside a transversal subspace arrangement2 { 1 N } = n , decompose into its irreducible components, i.e., find the number of sub- A i=1 Si A spacesSn and a basis for each subspace , i =1,...,n. Si

For the sake of simplicity, we first discuss ASC in the case of hyperplanes ( 4.1.1) and § subspaces of equal dimension ( 4.1.2), for which a closed form solution can be found using § a single polynomial. In the case of subspaces of arbitrary dimensions, the picture becomes

more involved, but a closed form solution from multiple polynomials is still available when

the number of subspaces n is known ( 4.1.3) or an upper bound m for n is known ( 4.1.4). § § In 4.1.5 we discuss one limitation of ASC due to computational complexity and a partial § solution based on a recursive ASC algorithm. In 4.1.6 we discuss another limitation of § ASC due to sensitivity to noise and a practical solution based on spectral clustering. We

conclude in 4.1.7 with the main challenge that this chapter aims to address. §

4.1.1 Subspaces of codimension 1

The basic principles of ASC can be introduced more smoothly by considering the case

where the union of subspaces is the union of n hyperplanes = n in RD. Each A i=1 Hi hyperplane is uniquely defined by its unit length normal vectorS b RD as = Hi i ∈ Hi D x R : b>x = 0 . In the language of algebraic geometry this is equivalent to saying { ∈ i }

that is the zero set of the polynomial b>x or equivalently is the algebraic variety Hi i Hi

defined by the polynomial equation b>x =0, where b>x = b x + b x with b := i i i,1 1 ··· i,D D i 1We will define formally the notion of points in general position in Definition 4.12. 2We will define formally the notion of a transversal subspace arrangement in Definition 4.4.

157 CHAPTER 4. ADVANCES IN ALGEBRAIC SUBSPACE CLUSTERING

(b ,...,b )>,x := (x ,...,x )>. We write this more succinctly as = (b>x). i,1 i,D 1 D Hi Z i We then observe that a point x of RD belongs to n if and only if x is a root of i=1 Hi S the polynomial p(x)=(b>x) (b>x), i.e., the union of hyperplanes is the algebraic 1 ··· n A variety = (p) (the zero set of p). Notice the important fact that p is homogeneous A Z of degree equal to the number n of distinct hyperplanes and moreover it is the product of linear homogeneous polynomials bi>x, i.e., a product of linear forms, each of which defines a distinct hyperplane via the corresponding normal vector b . Hi i Given a set of points X = x N in general position in the union of hyperplanes, { j}j=1 ⊂A the classic polynomial differentiation algorithm proposed in [109,111] recovers the correct number of hyperplanes as well as their normal vectors by

1. embedding the data into a higher-dimensional space via a polynomial map,

2. finding the number of subspaces by analyzing the rank of the embedded data matrix,

3. finding the polynomial p from the null space of the embedded data matrix,

4. finding the hyperplane normal vectors from the derivatives of p at a nonsingular point

x of .3 A

More specifically, observe that the polynomial p(x)=(b>x) (b>x) can be written as a 1 ··· n linear combination of the set of all monomials of degree n in D variables,

n n 1 n 1 n 1 n x ,x − x ,x − x ...,x x − ,...,x , i.e., (4.1) { 1 1 2 1 3 1 D D} 3A nonsingular point of a subspace arrangement is a point that lies in one and only one of the subspaces that constitute the arrangement.

158 CHAPTER 4. ADVANCES IN ALGEBRAIC SUBSPACE CLUSTERING

n1 n2 nD p(x)= c x x x = c>ν (x). (4.2) n1,n2,...,nD 1 2 ··· D n n1+n2+ nD=n X··· In the above expression, c RMn(D) is the vector of all coefficients c , and ν is ∈ n1,n2,...,nD n the Veronese or Polynomial embedding of degree n, as it is known in the algebraic geometry and machine learning literature, respectively. It is defined by taking a point of RD to a point

n(D) of RM under the rule

νn n n 1 n 1 n 1 n (x ,...,x )> x ,x − x ,x − x ...,x x − ,...,x > , (4.3) 1 D 7−→ 1 1 2 1 3 1 D D  where (D) is the dimension of the space of homogeneous polynomials of degree n in Mn D indeterminates. The image of the data set X under the Veronese embedding is used to form the so-called embedded data matrix

ν (X ):=[ν (x ) ν (x )]> . (4.4) ` ` 1 ··· ` N

It is shown in [111] that when there are sufficiently many data points that are sufficiently well distributed in the subspaces, the correct number of hyperplanes is the smallest degree

` for which ν`(X ) drops rank by 1: n = min` 1 ` : Rank(ν`(X )) = M`(D) 1 . ≥ { − } Moreover, it is shown in [111] that the polynomial vector of coefficients c is the unique up to scale vector in the one-dimensional null space of νn(X ).

It follows that the task of identifying the normals to the hyperplanes from p is equivalent to extracting the linear factors of p. This is achieved4 by observing that if we have a point

4A direct factorization has been shown to be possible as well [110]; however this approach has not been generalized yet to the case of subspaces of different dimensions.

159 CHAPTER 4. ADVANCES IN ALGEBRAIC SUBSPACE CLUSTERING

x i i0=i i0 , then the gradient p x of p evaluated at x ∈H −∪ 6 H ∇ |

n

p x = b (b>0 x) (4.5) ∇ | j j j=1 j0=j X Y6

is equal to bi up to a scale factor because bi>x =0 and hence all the terms in the sum vanish

except for the ith (see Proposition 4.76 for a more general statement). Having identified the

normal vectors, the task of clustering the points in X is straightforward.

4.1.2 Subspaces of equal dimension

Let us now consider a more general case, where we know that the subspaces are of equal

and known dimension d. Such a case can be reduced to the case of hyperplanes, by noticing

that a union of n subspaces of dimension d of RD becomes a union of hyperplanes of Rd+1 after a generic projection π : RD Rd+1. We note that any random orthogonal projection d → will almost surely preserve the number of subspaces and their dimensions, as the set of

projections πd that do not have this preserving property is a zero measure subset of the set

(d+1) D of orthogonal projections πd R × : πdπd> = I(d+1) (d+1) . ∈ ×  When the common dimension d is unknown, it can be estimated exactly by analyz-

ing the right null space of the embedded data matrix, after projecting the data generically

onto subspaces of dimension d0 +1, with d0 = D 1,D 2,... [104]. More specifi- − −

cally, when d0 > d, we have that dim (ν (π 0 (X ))) > 1, while when d0 < d we have N n d

dim (ν (π 0 (X ))) = 0. On the other hand, the case d0 = d is the only case for which the N n d

null space is one-dimensional, and so d = d0 : dim (ν (π 0 (X ))) = 1 . { N n d }

160 CHAPTER 4. ADVANCES IN ALGEBRAIC SUBSPACE CLUSTERING

Finally, when both n and d are unknown, one can first recover d as the smallest d0 such that there exists an ` for which dim (ν (π 0 (X ))) > 0, and subsequently recover n as the N ` d smallest ` such that dim (ν (π (X ))) > 0; see [104] for further details. N ` d

4.1.3 Known number of subspaces of arbitrary dimensions

When the dimensions of the subspaces are unknown and arbitrary, the problem becomes

much more complicated, even if the number n of subspaces is known, which is the case

examined in this subsection. In such a case, a union of subspaces = A S1 ∪···∪Sn of RD, henceforth called a subspace arrangement, is still an algebraic variety. The main

difference with the case of hyperplanes is that, in general, multiple polynomials of degree

n are needed to define , i.e., is the zero set of a finite collection of homogeneous A A polynomials of degree n in D indeterminates.

Example 4.2. Consider the union of a plane and two lines , in general position A S1 S2 S3 in R3 (Figure 4.1). Then = is the zero set of the degree-3 homogeneous A S1 ∪S2 ∪S3

2 S3 S

b1

S1

Figure 4.1: A union of two lines and one plane in general position in R3.

161 CHAPTER 4. ADVANCES IN ALGEBRAIC SUBSPACE CLUSTERING

polynomials

p1 := (b1>x)(b2>,1x)(b3>,1x), p2 := (b1>x)(b2>,1x)(b3>,2x), (4.6)

p3 := (b1>x)(b2>,2x)(b3>,1x), p4 := (b1>x)(b2>,2x)(b3>,2x), (4.7)

where b is the normal vector to the plane and b , j = 1, 2, are two linearly indepen- 1 S1 i,j dent vectors that are orthogonal to the line , i = 2, 3. These polynomials are linearly Si independent and form a basis for the vector space ,3 of the degree-3 homogeneous poly- IA nomials that vanish on .5 A

In contrast to the case of hyperplanes, when the subspace dimensions are different, there

may exist vanishing polynomials of degree strictly less than the number of subspaces.

Example 4.3. Consider the setting of Example 4.2. Then there exists a unique up to scale

vanishing polynomial of degree 2, which is the product of two linear forms: one form is b>x, where b is the normal to the plane , and the other is f >x, where f is 1 1 S1 the normal to the plane defined by the lines and (Figure 4.2). S2 S3

As Example 4.2 shows, all the relevant geometric information is still encoded in the

6 factors of some special basis of ,n, that consists of degree-n homogeneous polynomials IA that factorize into the product of linear forms. However, computing such a basis remains, to

the best of our knowledge, an unsolved problem. Instead, one can only rely on computing

5The interested reader is encouraged to prove this claim. 6Strictly speaking, this is not always true. However, it is true if the subspace arrangement is general enough, in particular if it is transversal; see Definition 4.4 and Theorem 4.78.

162 CHAPTER 4. ADVANCES IN ALGEBRAIC SUBSPACE CLUSTERING

2 3 S S H23 f

b1

S1

Figure 4.2: The geometry of the unique degree-2 polynomial p(x)=(b1>x)(f >x) that vanishes on . b is the normal vector to plane and f is the normal vector to S1 ∪S2 ∪S3 1 S1 the plane spanned by lines and . H23 S2 S3

(or be given) a general basis for the vector space ,n. In our example such basis could be IA

p + p , p p , p + p , p p (4.8) 1 4 1 − 4 2 3 2 − 3

and it can be seen that none of these polynomials is factorizable into the product of linear

forms. This difficulty was not present in the case of hyperplanes, because there was only

one vanishing polynomial (up to scale) of degree n and it had to be factorizable.

In spite of this difficulty, a solution can still be achieved in an elegant fashion by re-

sorting to polynomial differentiation. The key fact that allows this approach is that any

homogeneous polynomial p of degree n that vanishes on the subspace arrangement is a A linear combination of vanishing polynomials, each of which is a product of linear forms,

with each distinct subspace contributing a vanishing linear form in every product (Theo-

rem 4.78). As a consequence (Proposition 4.76), the gradient of p evaluated at some point

x i i0=i i0 lies in i⊥ and the of the gradients at x of all such p is pre- ∈ S −∪ 6 S S

cisely equal to ⊥. We can thus recover , remove it from and then repeat the procedure Si Si A

163 CHAPTER 4. ADVANCES IN ALGEBRAIC SUBSPACE CLUSTERING

to identify all the remaining subspaces. As stated in Theorem 4.6, this process is provably

correct as long as the subspace arrangement is transversal, as defined next. A

Definition 4.4 (Transversal subspace arrangement [20]). A subspace arrangement = A n D i=1 i of R is called transversal, if for any subset I of [n], the codimension of i I i is S ∈ S Sthe minimum between D and the sum of the codimensions of all , i I. T Si ∈

Remark 4.5. Transversality is a geometric condition on the subspaces, which in particular requires the dimensions of all possible intersections among subspaces to be as small as the dimensions of the subspaces allow (see Appendix 4.7.3 for a discussion).

Theorem 4.6 (ASC by polynomial differentiation when n is known, [72, 111]). Let = A n D i=1 i be a transversal subspace arrangement of R , let x i i0=i i0 be a nonsin- S ∈S − 6 S S S gular point in , and let ,n be the vector space of all degree-n homogeneous polynomials A IA that vanish on . Then is the orthogonal complement of the subspace spanned by all A Si vectors of the form p x, where p ,n, i.e., i = Span( ,n x)⊥. ∇ | ∈IA S ∇IA |

Theorem 4.6 and its proof are illustrated in the next example.

Example 4.7. Consider Example 4.2 and recall that p1 =(b1>x)(b2>,1x)(b3>,1x), p2 = (b1>x)(b2>,1x)(b3>,2x), p3 = (b1>x)(b2>,2x)(b3>,1x), and p4 = (b1>x)(b2>,2x)(b3>,2x). Let x be a generic point in . Then 2 S2 −S1 ∪S3

p x = p x = b , p x = p x = b . (4.9) ∇ 1| 2 ∼ ∇ 2| 2 ∼ 2,1 ∇ 3| 2 ∼ ∇ 4| 2 ∼ 2,2

164 CHAPTER 4. ADVANCES IN ALGEBRAIC SUBSPACE CLUSTERING

Hence b2,1, b2,2 Span( ,3 x2 ) and so 2 Span( ,3 x2 )⊥. Conversely, let p ∈ ∇IA | S ⊃ ∇IA | ∈ 4 ,3. Then there exist αi R, i =1,..., 4, such that p = αipi and so IA ∈ i=1

4 P

p x = α p x Span(b , b )= ⊥. (4.10) ∇ | 2 i∇ i| 2 ∈ 2,1 2,2 S2 i=1 X

Hence ,3 x2 2⊥, and so Span( ,3 x2 )⊥ 2. ∇IA | ⊂S ∇IA | ⊃S

4.1.4 Unknown number of subspaces of arbitrary dimensions

As it turns out, when the number of subspaces n is unknown, but an upper bound m n is ≥ given, one can obtain the decomposition of the subspace arrangement from the gradients of

the vanishing polynomials of degree m, precisely as in Theorem 4.6, simply by replacing

n with m.

Theorem 4.8 (ASC when an upper bound on n is known, [72,111]). Let = n be a A i=1 Si D S transversal subspace arrangement of R , let x i i0=i i0 be a nonsingular point in ∈S − 6 S S , and let ,m be the vector space of all degree-m homogeneous polynomials that vanish A IA on , where m n. Then is the orthogonal complement of the subspace spanned by all A ≥ Si

vectors of the form p x, where p ,m, i.e., i = Span( ,m x)⊥. ∇ | ∈IA S ∇IA |

Example 4.9. Consider the setting of Examples 4.2 and 4.3. Suppose that we have the

upper bound m =4 on the number of underlying subspaces (n = 3). It can be shown that

165 CHAPTER 4. ADVANCES IN ALGEBRAIC SUBSPACE CLUSTERING

7 the vector space ,4 has dimension 8 and is spanned by the polynomials IA

3 2 q1 := (b1>x)(f >x) , q5 := (b1>x)(f >x)(b3>x) , (4.11)

2 2 2 q2 := (b1>x) (f >x) q6 := (b1>x)(b2>x) (f >x), (4.12)

3 2 q3 := (b1>x) (f >x), q7 := (b1>x)(b2>x) (b3>x), (4.13)

2 2 q4 := (b1>x)(f >x) (b3>x), q8 := (b1>x)(b2>x)(b3>x) , (4.14) where b is the normal to , f is the normal to the plane defined by lines and , and 1 S1 S2 S3 b is a normal to line that is linearly independent from f, for i = 2, 3. Hence = i Si S1

Span(b )⊥ and = Span(f, b )⊥, i =2, 3. Then for a generic point x , 1 Si i 2 ∈S2 −S1 ∪S3 we have that

q x = q x = q x = q x = q x =0, (4.15) ∇ 1| 2 ∇ 2| 2 ∇ 4| 2 ∇ 6| 2 ∇ 7| 2

q x = q x = f, q x = b . (4.16) ∇ 3| 2 ∼ ∇ 5| 2 ∼ ∇ 8| 2 ∼ 2

Hence f, b2 Span( ,4 x2 ) and so 2 Span( ,4 x2 )⊥. Similarly to Example ∈ ∇IA | S ⊃ ∇IA |

4.7, since every element of ,4 is a linear combination of the q`,` = 1,..., 8, we have IA

2 = Span( ,4 x2 )⊥. S ∇IA |

Remark 4.10. Notice that both Theorems 4.6 and 4.8 are statements about the abstract subspace arrangement , i.e., no finite subset X of is explicitly considered. To pass A A from to X and get similar theorems, we need to require X to be in general position in A , in some suitable sense. As one may suspect, this notion of general position must entail A 7This can be verified by applying the dimension formula of Corollary 3.4 in [20].

166 CHAPTER 4. ADVANCES IN ALGEBRAIC SUBSPACE CLUSTERING that polynomials of degree n for Theorem 4.6, or of degree m for Theorem 4.8, that vanish on X must also vanish on and vice versa. In that case, we can compute the required A basis for ,n, simply by computing a basis for X ,n, by means of the Veronese embedding IA I described in 4.1.1, and similarly for ,m. We will make the notion of general position § IA precise in Definition 4.12.

4.1.5 Computational complexity and recursive ASC

Although Theorem 4.8 is quite satisfactory from a theoretical point of view, using an upper bound m n for the number of subspaces comes with the practical disadvantage that the ≥ dimension of the Veronese embedding, Mm(D), grows exponentially with m. In addition, increasing m also increases the number of polynomials in the null space of νm(X ), some which will eventually, as m becomes large, be polynomials that simply fit the data X but do not vanish on . To reduce the computational complexity of the polynomial differentiation A algorithm, one can consider vanishing polynomials of smaller degree, m

167 CHAPTER 4. ADVANCES IN ALGEBRAIC SUBSPACE CLUSTERING the data, computing the gradients of these vanishing polynomials to cluster the data into m0 n groups, and then repeating the procedure for each group until the data from each ≤ group can be fit by polynomials of degree 1, in which case each group lies in single linear subspace. While this recursive ASC algorithm is very intuitive, no rigorous proof of its correctness has appeared in the literature. In fact, there are examples where this recursive method provably fails in the sense of producing ghost subspaces in the decomposition of

. For instance, when partitioning the data from Example 4.3 into two planes and , A S1 H23 we may assign the data from the intersection of the two planes to . If this is the case, H23 when trying to partition further the data of , we will obtain three lines: , and the H23 S2 S3 ghost line = (see Figure 4.3(a)). S4 S1 ∩H23

4.1.6 Instability in the presence of noise and spectral ASC

Another important issue with Theorem 4.8 from a practical standpoint is its sensitivity to noise. More precisely, when implementing Theorem 4.8 algorithmically, one is required to estimate the dimension of the null space of νm(X ), which is an extremely challenging prob- lem in the presence of noise. Moreover, small errors in the estimation of dim (ν (X )) N m have been observed to have dramatic effects in the quality of the clustering, thus render- ing algorithms that are directly based on Theorem 4.8 unstable. While the recursive ASC algorithm of [51, 111] is more robust than such algorithms, it is still sensitive to noise, as considerable errors may occur in the partitioning process. Moreover, the performance of the recursive algorithm is always subject to degradation due to the potential occurrence of

168 CHAPTER 4. ADVANCES IN ALGEBRAIC SUBSPACE CLUSTERING

ghost subspaces.

To enhance the robustness of ASC in the presence of noise and obtain a stable working

algebraic algorithm, the standard practice has been to apply a variation of the polynomial

differentiation algorithm based on spectral clustering. More specifically, given noisy data

X lying close to a union of n subspaces , one computes an approximate vanishing poly- A nomial p whose coefficients are given by the right singular vector of νn(X ) corresponding

to its smallest singular value. Given p, one computes the gradient of p at each point in

X (which gives a normal vector associated with each point in X ), and builds an affinity

matrix between points xj and xj0 as the cosine of the angle between their corresponding

normal vectors, i.e.,

p xj p xj0 Cjj0,angle = ∇ | , ∇ | . (4.17) p xj p xj0 D||∇ | || ||∇ | ||E

This affinity is then used as input to any spectral clustering algorithm (see [115] for a

X n tutorial on spectral clustering) to obtain a clustering = i=1 xi. We call this Spectral

ASC method with angle-based affinity as SASC-A. S

To gain some intuition about C, suppose that is a union of n hyperplanes and that A

there is no noise in the data. Then p must be of the form p(x)=(b>x) (b>x). In 1 ··· n

this case Cjj0 is simply the cosine of the angle between the normals to the hyperplanes

that are associated with points xj and xj0 . If both points lie in the same hyperplane, their

normals must be equal, and hence Cjj0 = 1. Otherwise, Cjj0 < 1 is the cosine of the

angles between the hyperplanes. Thus, assuming that the smallest angle between any two

hyperplanes is sufficiently large and that the points are well distributed on the union of the

169 CHAPTER 4. ADVANCES IN ALGEBRAIC SUBSPACE CLUSTERING

hyperplanes, applying spectral clustering to the affinity matrix C will in general yield the correct clustering.

Even though SASC-A is much more robust in the presence of noise than purely alge- braic methods for the case of a union of hyperplanes, it is fundamentally limited by the fact that, theoretically, it applies only to unions of hyperplanes. Indeed, if the orthogonal complement of a subspace has dimension greater than 1, there may be points x, x0 inside §

such that the angle between p x and p x0 is as large as 90◦. In such instances, points S ∇ | ∇ | associated to the same subspace may be weakly connected and thus there is no guarantee for the success of spectral clustering.

4.1.7 The challenge

As the discussion so far suggests, the state of the art in ASC can be summarized as follows:

1. A complete closed form solution to the abstract subspace clustering problem (Prob-

lem 4.1) exists and can be found using the polynomial differentiation algorithm im-

plied by Theorem 4.8.

2. All known algorithmic variants of the polynomial differentiation algorithm are sen-

sitive to noise, especially for subspaces of arbitrary dimensions.

3. The recursive algorithm of 4.1.5 does not in general solve the abstract subspace § clustering problem (Problem 4.1), and is in addition sensitive to noise.

4. The spectral algebraic algorithm described in 4.1.6 is less sensitive to noise, but is § 170 CHAPTER 4. ADVANCES IN ALGEBRAIC SUBSPACE CLUSTERING

theoretically justified only for unions of hyperplanes.

The above list reveals the challenge that we will be addressing next: Develop an ASC algorithm, that solves the abstract subspace clustering problem for perfect data, while at the same time it is robust to noisy data.

4.2 Filtrated algebraic subspace clustering (FASC)

4.2.1 Filtrations of subspace arrangements: geometric overview

In this paragraph we provide an overview of the proposed Filtrated Algebraic Subspace

Clustering (FASC) algorithm, which conveys the geometry of the key idea of this chapter while keeping technicalities at a minimum. To that end, let us pretend for a moment that we have access to the entire set , so that we can manipulate it via set operations such A as taking its intersection with some other set. Then the idea behind FASC is to construct a descending filtration of the given subspace arrangement RD, i.e., a sequence of A ⊂ inclusions of subspace arrangements, that starts with and terminates after a finite number A of c steps with one of the irreducible components of :8 S A

=: = . (4.18) A A0 ⊃A1 ⊃A2 ⊃···⊃Ac S 8 We will also be using the notation =: 0 1 2 , where the arrows denote embeddings. A A ←A ←A ←···

171 CHAPTER 4. ADVANCES IN ALGEBRAIC SUBSPACE CLUSTERING

The mechanism for generating such a filtration is to construct a strictly descending filtration

of intermediate ambient spaces, i.e.,

, (4.19) V0 ⊃V1 ⊃V2 ⊃··· such that = RD, dim( ) = dim( ) 1, and each contains the same fixed V0 Vs+1 Vs − Vs irreducible component of . Then the filtration of subspace arrangements is obtained by S A intersecting with the filtration of ambient spaces, i.e., A

:= := := . (4.20) A0 A⊃A1 A∩V1 ⊃A2 A∩V2 ⊃···

This can be seen equivalently as constructing a descending filtration of pairs ( , ), Vs As where is a subspace arrangement of : As Vs

D D 1 D 2 (R , ) ( = R − , ) ( = R − , ) . (4.21) A ← V1 ∼ A1 ← V2 ∼ A2 ←···

But how can we construct a filtration of ambient spaces (4.19), that satisfies the ap-

parently strong condition , s? The answer lies at the heart of ASC: to construct Vs ⊃ S ∀ pick a suitable polynomial p vanishing on and evaluate its gradient at a nonsingu- V1 1 A

lar point x of . Notice that x will lie in some irreducible component x of . Then A S A take to be the hyperplane of RD defined by the gradient of p at x. We know from V1 1

Proposition 4.76 that must contain x. To construct we apply essentially the same V1 S V2 procedure on the pair ( , ): take a suitable polynomial p that vanishes on , but does V1 A1 2 A1

not vanish on 1, and take 2 to be the hyperplane of 1 defined by π 1 ( p2 x). As we V V V V ∇ |

will show in Section 4.2.2, it is always the case that π 1 ( p2 x) x and so 2 x. V ∇ | ⊥ S V ⊃ S 172 CHAPTER 4. ADVANCES IN ALGEBRAIC SUBSPACE CLUSTERING

D ∼= R x V0 A0 S

D 1 ∼= R − x V1 A1 S

D 2 ∼= R − x V2 A2 S (4.22) . . . .

D c+1 ∼= R − c 1 c 1 x V − A − S

D c ∼= ∼= ∼= R − x Vc Ac S

Now notice, that after precisely c such steps, where c is the codimension of x, will be S Vc D a (D c)-dimensional linear subspace of R that by construction contains x. But x is − S S

also a (D c)-dimensional subspace and the only possibility is that = x. Observe − Vc S also that this is precisely the step where the filtration naturally terminates, since there is

no polynomial that vanishes on x but does not vanish on . The relations between the S Vc intermediate ambient spaces and subspace arrangements are illustrated in the commutative diagram of (4.22). The filtration in (4.22) will yield the irreducible component := x of S S that contains the nonsingular point x that we started with. We will be referring to A ∈ A

such a point as the reference point. We can also take without loss of generality x = . S S1

Having identified , we can pick a nonsingular point x0 x and construct a filtration S1 ∈ A−S of as above with reference point x0. Such a filtration will terminate with the irreducible A component x0 of containing x0, which without loss of generality we take to be . Pick- S A S2

173 CHAPTER 4. ADVANCES IN ALGEBRAIC SUBSPACE CLUSTERING

Algorithm 4.1 Filtrated Algebraic Subspace Clustering (FASC) - Geometric Version 1: procedure FASC( ) A 2: L ; ; ← ∅ L ← ∅ 3: while = do A − L 6 ∅ 4: pick a nonsingular point x in ; A−L 5: RD; V ← 6: while ( do V∩A V

7: find polynomial p that vanishes on but not on , s.t. p x = 0; A∩V V ∇ | 6

8: let be the orthogonal complement of π ( p x) in ; V V ∇ | V 9: end while

10: L L ; ; ← ∪ {V} L←L∪V 11: end while

12: return L;

13: end procedure

ing a new reference point x00 x x0 and so on, we can identify the entire list of ∈A−S ∪S irreducible components of , as described in Algorithm 4.1. A

Example 4.11. Consider the setting of Examples 4.2 and 4.3. Suppose that in the first

filtration the algorithm picks as reference point x . Suppose further that the ∈S2 −S1 ∪S3 algorithm picks the polynomial p(x)=(b>x)(f >x), which vanishes on but certainly 1 A not on R3. Then the first ambient space of the filtration associated to x is constructed V1 3 as = Span( p x)⊥. Since p x = f, this gives that is precisely the plane of R V1 ∇ | ∇ | ∼ V1

174 CHAPTER 4. ADVANCES IN ALGEBRAIC SUBSPACE CLUSTERING

2 2 S3 S S3 S 3 2 S 23 S b2 Hf S4 S4 b 1 4 b4 S (1) (1) 1 V1 V b3 S1

(a) (b) (c)

Figure 4.3: (a): The plane spanned by lines and intersects the plane at the S2 S3 S1 line . (b): Intersection of the original subspace arrangement = S4 A S1 ∪ S2 ∪ S3 with the intermediate ambient space (1), giving rise to the intermediate subspace ar- V1 rangement (1) = . (c): Geometry of the unique degree-3 polynomial A1 S2 ∪ S3 ∪ S4

p(x)=(b>x)(b>x)(b>x) that vanishes on as a variety of the intermedi- 2 3 4 S2 ∪S3 ∪S4 ate ambient space (1). b , i =2, 3, 4. V1 i ⊥Si

with normal vector f. Then is = , and consists of the union of three lines A1 A1 A∩V1 , where is the intersection of with (see Figs. 4.3(a) and 4.3(b)). S2 ∪S3 ∪S4 S4 V1 S1 Since ( , the algorithm takes one more step in the filtration. Suppose that the A1 V1

algorithm picks the polynomial q(x)=(b2>x)(b3>x)(b4>x), where bi is the unique normal

vector of that is orthogonal to , for i =2, 3, 4 (see Fig 4.3(c)). Because of the general V1 Si position assumption, none of the lines , , is orthogonal to another. Consequently, S2 S3 S4

q x =(b3>x)(b4>x)b2 =0. Moreover, since b2 1, we have that π 1 ( q x)= q x = ∇ | 6 ∈V V ∇ | ∇ | ∼ b defines a line in that must contain . Intersecting with we obtain = 2 V1 S2 A1 V2 A2 A1 ∩

= and the filtration terminates with output the irreducible component x = = V2 V2 S S2 V2

175 CHAPTER 4. ADVANCES IN ALGEBRAIC SUBSPACE CLUSTERING

of associated to reference point x. A

Continuing, the algorithm now picks a new reference point x0 x, say x0 . ∈A−S ∈S1

A similar process as above will identify as the intermediate ambient space = x0 of S1 V1 S the filtration associated to x0 that arises after one step. Then a third reference point will be chosen as x00 x x0 and will be identified as the intermediate ambient ∈ A−S ∪S S3 space = x00 of the filtration associated to x00 that arises after two steps. Since the V2 S set x x0 x00 is empty, the algorithm will terminate and return x, x0 , x00 , A−S ∪S ∪S {S S S } which is up to a permutation a decomposition of the original subspace arrangement into its constituent subspaces.

Strictly speaking, Algorithm 4.1 is not a valid algorithm in the computer-science the- oretic sense, since it takes as input an infinite set , and it involves operations such as A checking equality of the infinite sets and . Moreover, one may reasonably ask: V A∩V

1. Why is it the case that through the entire filtration associated with reference point x

we can always find polynomials p such that p x =0? ∇ | 6

2. Why is it true that even if p x =0 then π ( p x) =0? ∇ | 6 V ∇ | 6

We address all issues above and beyond in the next Section, which is devoted to rigorously establishing the theory of the FASC algorithm.9

9At this point the reader unfamiliar with algebraic geometry is encouraged to read the appendices before proceeding.

176 CHAPTER 4. ADVANCES IN ALGEBRAIC SUBSPACE CLUSTERING

4.2.2 Filtrations of subspace arrangements: theory

This section formalizes the concepts outlined in 4.2. 4.2.2.1 formalizes the notion of a § § set X being in general position inside a subspace arrangement . 4.2.2.2-4.2.2.4 estab- A § lish the theory of a single filtration of a finite subset X lying in general position inside a transversal subspace arrangement , and culminate with the Algebraic Descending Filtra- A tion (ADF) algorithm for identifying a single irreducible component of (Algorithm 4.2) A and the theorem establishing its correctness (Theorem 4.28). The ADF algorithm naturally leads us to the core theoretical contribution of this chapter in 4.2.2.5, which is the FASC § algorithm for identifying all irreducible components of (Algorithm 4.3) and the theorem A establishing its correctness (Theorem 4.29).

4.2.2.1 Data in general position in a subspace arrangement

From an algebraic geometric point of view, a union of linear subspaces is the same as A the set of polynomial functions that vanish on . However, from a computer-science- IA A theoretic point of view, and are quite different: is an infinite set and hence it can A IA A not be given as input to any algorithm. On the other hand, even though is also an infinite IA set, it is generated as an ideal by a finite set of polynomials, which can certainly serve as

input to an algorithm.That said, from a machine-learning point of view, both and are A IA often unknown, and one is usually given only a finite set of points X in , from which we A wish to compute its irreducible components ,..., . S1 Sn To lend ourselves the power of the algebraic-geometric machinery, while providing an

177 CHAPTER 4. ADVANCES IN ALGEBRAIC SUBSPACE CLUSTERING

algorithm of interest to the machine learning and computer science communities, we adopt

the following setting. The input to our algorithm will be the pair (X ,m), where X is a

finite subset of an unknown union of linear subspaces := n of RD, and m is an A i=1 Si upper bound on n. To make the problem of recovering the decompositionS = n A i=1 Si from X well-defined, it is necessary that be uniquely identifiable form X . InS other A words, X must be in general position inside , as defined next. A

Definition 4.12 (Points in general position). Let X = x ,..., x be a finite subset of a { 1 N } subspace arrangement = . We say that X is in general position in with A S1 ∪···∪Sn A

respect to degree m, if m n and = ( X ), i.e., if is precisely the zero locus of all ≥ A Z I ,m A homogeneous polynomials of degree m that vanish on X .

The intuitive geometric condition = ( X ) of Definition 4.12 guarantees that A Z I ,m there are no spurious polynomials of degree less or equal to m that vanish on X .

Proposition 4.13. Let X be a finite subset of an arrangement of n linear subspaces A of RD. Then X lies in general position inside with respect to degree m if and only if A

,k = X ,k, k m. IA I ∀ ≤

Proof. ( ) We first show that ,m = X ,m. Since X , every homogeneous polyno- ⇒ IA I A ⊃

mial of degree m that vanishes on must vanish on X , i.e., ,m X ,m. Conversely, the A IA ⊂I

hypothesis = ( X ) implies that every polynomial of X must vanish on , i.e., A Z I ,m I ,m A

,m X ,m. IA ⊃I

Now let k

178 CHAPTER 4. ADVANCES IN ALGEBRAIC SUBSPACE CLUSTERING

direction, suppose for the sake of contradiction that there exists some p X that does ∈ I ,k not vanish on . This means that there must exist an irreducible component of , say , A A S1 such that p does not vanish on . Let ζ be a vector of RD non-orthogonal to , i.e., the S1 S1 linear form g(x) = ζ>x does not vanish on . Since p vanishes on X so will the degree S1 m k m k m polynomial g − p, i.e., g − p X ,m. But we have already shown that X ,m = ,m, ∈I I IA m k m k and so it must be the case that g − p ,m. Since g − p vanishes on , it must vanish ∈ IA A m k on 1, i.e., g − p 1 . Since by hypothesis p 1 , and since 1 is a prime ideal (see S ∈ IS 6∈ IS IS m k 4.73), it must be the case that g − 1 . Because 1 is a prime ideal, we must have that ∈IS IS

g 1 . But this is true if and only if ζ 1⊥, contradicting the definition of ζ. ∈IS ∈S

( ) Suppose ,k = X ,k, k m. We will show that = ( X ,m). But this is the ⇐ IA I ∀ ≤ A Z I same as showing that = ( ,m), which is true, by Proposition 4.75. A Z IA

The next Proposition ensures the existence of points in general position with respect to

any degree m n. ≥

Proposition 4.14. Let be an arrangement of n linear subspaces of RD and let m be any A integer n. Then there exists a finite subset X that is in general position inside ≥ ⊂ A A with respect to degree m.

Proof. By Proposition 4.75 is generated by polynomials of degree m. Then by IA ≤

Theorem 2.9 in [72], there exists a finite set X such that ,k = X ,k, k m, ⊂ A IA I ∀ ≤ which concludes the proof in view of Proposition 4.13.

Notice that there is a price to be paid by requiring X to be in general position, which is

179 CHAPTER 4. ADVANCES IN ALGEBRAIC SUBSPACE CLUSTERING

that we need the cardinality of X to be artificially large, especially when m n is large. In −

particular, since the dimension of X ,m must match the dimension of ,m, the cardinality I IA of X must be at least Mm(D) dim( ,m). − IA

Example 4.15. Let Φ = be the union of two planes of R3 with normal vectors S1 ∪S2 b , b , and let X = x , x , x , x be four points of Φ, such that, x , x and 1 2 { 1 2 3 4} 1 2 ∈ S1 −S2 x , x . Let and be the planes spanned by x , x and x , x respectively, 3 4 ∈S2−S1 H13 H24 1 3 2 4 and let b13, b24 be the normals to these planes. Then the polynomial q(x)=(b13> x)(b24> x) certainly vanishes on X . But q does not vanish on Φ, because the only (up to a scalar)

X homogeneous polynomial of degree 2 that vanishes on Φ is p(x)=(b1>x)(b2>x). Hence is not in general position in Φ. The geometric reasoning is that two points per plane are not enough to uniquely define the union of the two planes; instead a third point in one of the planes is required.

The next result will be useful in the sequel.

Lemma 4.16. Suppose that X is in general position inside with respect to degree m. Let A (n0) n0 n0 < n. Then the set X := X X lies in general position inside the subspace − i=1 i (n0) S arrangement := 0 with respect to degree m n0. A Sn +1 ∪···∪Sn −

Proof. We begin by noting that m n0 is an upper bound on the number of subspaces − of the arrrangement (n0). According to Proposition 4.13, it is enough to prove that a A (n0) homogeneous polynomial p of degree less or equal than m n0 vanishes on X if and − only if it vanishes on (n0). So let p be a homogeneous polynomial of degree less or equal A

180 CHAPTER 4. ADVANCES IN ALGEBRAIC SUBSPACE CLUSTERING

(n0) (n0) than m n0. If p vanishes on , then it certainly vanishes on X . It remains to prove − A (n0) the converse. So suppose that p vanishes on X . Suppose that for each i = 1,...,n0 we have a vector ζ , such that ζ 0 ,..., . Next, define the polynomial i ⊥ Si i 6⊥ Sn +1 Sn r(x)=(ζ>x) (ζ>0 x)p(x). Then r has degree m and vanishes on X . Since X is in 1 ··· n ≤ general position inside , r must vanish on . For the sake of contradiction suppose that A A p does not vanish on (n0). Then p does not vanish say on . On the other hand r does A Sn

vanish on n, hence r n or equivalently (ζ1>x) (ζn>0 x)p(x) n . Since n is a S ∈ IS ··· ∈ IS IS

prime ideal we must have either ζi>x n for some i [n0] or p n . Now, the latter ∈ IS ∈ ∈ IS

can not be true by hypothesis, thus we must have ζi>x n for some i [n0]. But this ∈ IS ∈ implies that ζ , which contradicts the hypothesis on ζ . Hence it must be the case i ⊥ Sn i that p vanishes on (n0). A

To complete the proof we show that such vectors ζi, i = 1,...,n0 always exist. It is enough to prove the existence of ζ . If every vector of RD orthogonal to were orthogonal 1 S1 to, say 0 , then we would have that ⊥ ⊥0 , or equivalently, 0 . But this Sn +1 S1 ⊂ Sn +1 S1 ⊃ Sn +1 contradicts the transversality of . A

Remark 4.17. Notice that the notion of points X lying in general position inside a sub- space arrangement is independent of the notion of transversality of (Definition 4.4). A A Nevertheless, to facilitate the technical analysis by avoiding degenerate cases of subspace arrangements, in the rest of Section 4.2.2 we will assume that is transversal. For a ge- A ometric interpretation of transversality as well as examples, the reader is encouraged to consult Appendix 4.7.3.

181 CHAPTER 4. ADVANCES IN ALGEBRAIC SUBSPACE CLUSTERING

4.2.2.2 Constructing the first step of a filtration

We will now show how to construct the first step of a descending filtration associated with a

single irreducible component of , as in (4.22). Once again, we are given the pair (X ,m), A where X is a finite set in general position inside with respect to degree m, is transver- A A sal, and m is an upper bound on the number n of irreducible components of ( 4.2.2.1). A § To construct the first step of the filtration, we need to find a first hyperplane of RD V1 that contains some irreducible component of . According to Proposition 4.76, it would Si A be enough to have a polynomial p that vanishes on the irreducible component together 1 Si

with a point x . Then p x would be the normal to a hyperplane containing . ∈ Si ∇ 1| V1 Si Since every polynomial that vanishes on necessarily vanishes on , i = 1,...,n, a A Si ∀

reasonable choice is a vanishing polynomial of minimal degree k, i.e., some 0 = p1 ,k, 6 ∈IA where k is the smallest degree at which is non-zero. Since X is assumed in general IA

position in with respect to degree m, by Proposition 4.13 we will have ,k = X ,k, A IA I

and so our p1 can be computed as an element of the right null space of the embedded data

matrix νk(X ). The next lemma ensures that given any such p1, there is always a point x in

X such that p x =0. ∇ 1| 6

Lemma 4.18. Let 0 = p X be a vanishing polynomial of minimal degree. Then 6 1 ∈ I ,k

there exists 0 = x X such that p x = 0, and moreover, without loss of generality 6 ∈ ∇ 1| 6 x . ∈S1 − i>1 Si S Proof. We first establish the existence of a point x X such that p x = 0. For the ∈ ∇ 1| 6

182 CHAPTER 4. ADVANCES IN ALGEBRAIC SUBSPACE CLUSTERING

sake of contradiction, suppose that no such x X exists. Since 0 = p X , p can ∈ 6 1 ∈ I ,k 1 not be a constant polynomial, and so there exists some j [D] such that the degree k 1 ∈ −

∂p1 polynomial is not the zero polynomial. Now, by hypothesis p1 = 0, x X , ∂xj ∇ x ∀ ∈

∂p1 ∂p1 hence = 0, x X . But then, 0 = X ,k 1 and this would contradict the ∂xj x ∀ ∈ 6 ∂xj ∈ I − hypothesis that k is the smallest index such that X =0. Hence there exists x X such I ,k 6 ∈ that p x = 0. To show that x can be chosen to be non-zero, note that if k =1, then p ∇ 1| 6 ∇ 1 is a constant vector and we can take x to be any non- of X . If k > 1 then

p 0 = 0 and so x must necessarily be different from zero. ∇ 1| Next, we establish that x . Without loss of generality we can assume ∈ S1 − i>1 Si that x X := X . For the sakeS of contradiction, suppose that x for ∈ 1 ∩S1 ∈ S1 ∩Si some i > 1. Since x = 0, there is some index j [D] such that the jth coordinate of x, 6 ∈ n k denoted by χ , is different from zero. Define g(x) := x − p (x). Then g X and by the j j 1 ∈I ,n general position assumption we also have that g ,n. Since is assumed transversal, ∈ IA A by Theorem 4.78, g can be written in the form

g = c l l , (4.23) r1,...,rn r1,1 ··· rn,n ri [ci],i [n] ∈ X∈ where c R is a scalar coefficient, l is a linear form vanishing on , and the r1,...,rn ∈ ri,i Si summation runs over all multi-indices (r ,...,r ) [c ] [c ]. Then evaluating 1 n ∈ 1 ×···× n the gradient of the expression on the right of (4.23) at x, and using the hypothesis that x for some i > 1, we see that g x = 0. However, evaluating the gradient ∈ S1 ∩Si ∇ | n ` n k of g at x from the formula g(x) := x − p (x), we get g x = χ − p x = 0. This j 1 ∇ | j ∇ 1| 6 contradiction implies that the hypothesis x for some i > 1 can not be true, i.e., ∈ S1 ∩Si 183 CHAPTER 4. ADVANCES IN ALGEBRAIC SUBSPACE CLUSTERING x lies only in the irreducible component . S1

D Using the notation established so far and setting b = p x, the hyperplane of R 1 ∇ 1| given by = Span(b )⊥ = (b>x) contains the irreducible component of associated V1 1 Z 1 A with the reference point x, i.e., . Then we can define a subspace sub-arrangement V1 ⊃S1 of by A1 A

:= = ( ) ( ). (4.24) A1 A∩V1 S1 ∪ S2 ∩V1 ∪···∪ Sn ∩V1

Observe that can be viewed as a subspace arrangement of , since (see A1 V1 A1 ⊂ V1 also the commutative diagram of eq. (4.22)). Certainly, our algorithm can not manipulate directly the infinite sets and . Nevertheless, these sets are algebraic varieties and as A V1 a consequence we can perform their intersection in the algebraic domain. That is, we can obtain a set of polynomials defining , as shown next.10 A∩V1

Lemma 4.19. $\mathcal{A}_1 := \mathcal{A} \cap \mathcal{V}_1$ is the zero set of the ideal generated by $\mathcal{I}_{\mathcal{X},m}$ and $\mathbf{b}_1^\top x$, i.e.,
$$\mathcal{A}_1 = \mathcal{Z}(\mathfrak{a}_1), \qquad \mathfrak{a}_1 := \langle \mathcal{I}_{\mathcal{X},m} \rangle + \langle \mathbf{b}_1^\top x \rangle. \tag{4.25}$$

Proof. $(\Rightarrow)$: We will show that $\mathcal{A}_1 \subset \mathcal{Z}(\mathfrak{a}_1)$. Let $w$ be a polynomial of $\mathfrak{a}_1$. Then by definition of $\mathfrak{a}_1$, $w$ can be written as $w = w_1 + w_2$, where $w_1 \in \langle \mathcal{I}_{\mathcal{X},m} \rangle$ and $w_2 \in \langle \mathbf{b}_1^\top x \rangle$. Now take any point $\mathbf{y} \in \mathcal{A}_1$. Since $\mathbf{y} \in \mathcal{A}$, and $\mathcal{I}_{\mathcal{X},m} = \mathcal{I}_{\mathcal{A},m}$, we must have $w_1(\mathbf{y}) = 0$. Since $\mathbf{y} \in \mathcal{V}_1$, we must have that $w_2(\mathbf{y}) = 0$. Hence $w(\mathbf{y}) = 0$, i.e., every point of $\mathcal{A}_1$ is inside the zero set of $\mathfrak{a}_1$. $(\Leftarrow)$: We will show that $\mathcal{A}_1 \supset \mathcal{Z}(\mathfrak{a}_1)$. Let $\mathbf{y} \in \mathcal{Z}(\mathfrak{a}_1)$, i.e., every element of $\mathfrak{a}_1$ vanishes on $\mathbf{y}$. Hence every element of $\mathcal{I}_{\mathcal{X},m}$ vanishes on $\mathbf{y}$, i.e., $\mathbf{y} \in \mathcal{Z}(\mathcal{I}_{\mathcal{X},m}) = \mathcal{A}$. In addition, every element of $\langle \mathbf{b}_1^\top x \rangle$ vanishes on $\mathbf{y}$, in particular $\mathbf{b}_1^\top \mathbf{y} = 0$, i.e., $\mathbf{y} \in \mathcal{V}_1$.

In summary, the computation of the vector $\mathbf{b}_1 \perp \mathcal{S}_1$ completes algebraically the first step of the filtration, which gives us the hyperplane $\mathcal{V}_1$ and the sub-variety $\mathcal{A}_1$. Then, there are two possibilities: $\mathcal{A}_1 = \mathcal{V}_1$ or $\mathcal{A}_1 \subsetneq \mathcal{V}_1$. In the first case, we need to terminate the filtration, as explained in § 4.2.2.3, while in the second case we need to take one more step in the filtration, as explained in § 4.2.2.4.

4.2.2.3 Deciding whether to take a second step in a filtration

If $\mathcal{A}_1 = \mathcal{V}_1$, we should terminate the filtration because in this case $\mathcal{V}_1 = \mathcal{S}_1$, as Lemma 4.20 shows, and so we have already identified one of the subspaces. Lemma 4.21 will give us an algebraic procedure for checking if the condition $\mathcal{A}_1 = \mathcal{V}_1$ holds true, while Lemma 4.22 will give us a computationally more friendly procedure for checking the same condition.

Lemma 4.20. $\mathcal{V}_1 = \mathcal{A}_1$ if and only if $\mathcal{V}_1 = \mathcal{S}_1$.

Proof. $(\Rightarrow)$: Suppose $\mathcal{V}_1 = \mathcal{A}_1 = \mathcal{S}_1 \cup (\mathcal{S}_2 \cap \mathcal{V}_1) \cup \cdots \cup (\mathcal{S}_n \cap \mathcal{V}_1)$. Taking the vanishing-ideal operator on both sides, we obtain
$$\mathcal{I}_{\mathcal{V}_1} = \mathcal{I}_{\mathcal{S}_1} \cap \mathcal{I}_{\mathcal{S}_2 \cap \mathcal{V}_1} \cap \cdots \cap \mathcal{I}_{\mathcal{S}_n \cap \mathcal{V}_1}. \tag{4.26}$$
Since $\mathcal{V}_1$ is a linear subspace, $\mathcal{I}_{\mathcal{V}_1}$ is a prime ideal by Proposition 4.73, and so by Proposition 4.52 $\mathcal{I}_{\mathcal{V}_1}$ must contain one of the ideals $\mathcal{I}_{\mathcal{S}_1}, \mathcal{I}_{\mathcal{S}_2 \cap \mathcal{V}_1}, \ldots, \mathcal{I}_{\mathcal{S}_n \cap \mathcal{V}_1}$. Suppose that $\mathcal{I}_{\mathcal{V}_1} \supset \mathcal{I}_{\mathcal{S}_i \cap \mathcal{V}_1}$ for some $i > 1$. Taking the zero-set operator on both sides, and using Proposition 4.64 and the fact that linear subspaces are closed in the Zariski topology, we obtain $\mathcal{V}_1 \subset \mathcal{S}_i \cap \mathcal{V}_1$, which implies that $\mathcal{V}_1 \subset \mathcal{S}_i$. Since $\mathcal{S}_1 \subset \mathcal{V}_1$, we must have that $\mathcal{S}_1 \subset \mathcal{S}_i$, which contradicts the assumption of transversality on $\mathcal{A}$. Hence it must be the case that $\mathcal{I}_{\mathcal{V}_1} \supset \mathcal{I}_{\mathcal{S}_1}$. Taking the zero-set operator on both sides we get $\mathcal{V}_1 \subset \mathcal{S}_1$, which implies that $\mathcal{V}_1 = \mathcal{S}_1$, since $\mathcal{S}_1 \subset \mathcal{V}_1$. $(\Leftarrow)$: Suppose $\mathcal{V}_1 = \mathcal{S}_1$. Then $\mathcal{V}_1 = \mathcal{S}_1 \subset \mathcal{A}_1 \subset \mathcal{V}_1$ and so $\mathcal{S}_1 = \mathcal{A}_1 = \mathcal{V}_1$.

Knowing that a filtration terminates if $\mathcal{A}_1 = \mathcal{V}_1$, we need a mechanism for checking this condition. The next lemma shows how this can be done in the algebraic domain.

Lemma 4.21. $\mathcal{V}_1 = \mathcal{A}_1$ if and only if $\mathcal{I}_{\mathcal{X},m} \subset \langle \mathbf{b}_1^\top x \rangle_m$.

Proof. $(\Rightarrow)$: Suppose $\mathcal{A}_1 = \mathcal{V}_1$. Then $\mathcal{A} \supset \mathcal{V}_1$ and by taking vanishing ideals on both sides we get $\mathcal{I}_{\mathcal{A}} \subset \mathcal{I}_{\mathcal{V}_1} = \langle \mathbf{b}_1^\top x \rangle$. Since $\mathcal{I}_{\mathcal{X},m} = \mathcal{I}_{\mathcal{A},m} \subset \mathcal{I}_{\mathcal{A}}$, it follows that $\mathcal{I}_{\mathcal{X},m} \subset \langle \mathbf{b}_1^\top x \rangle_m$. $(\Leftarrow)$: Suppose $\mathcal{I}_{\mathcal{X},m} \subset \langle \mathbf{b}_1^\top x \rangle_m$ and for the sake of contradiction suppose that $\mathcal{A}_1 \subsetneq \mathcal{V}_1$. In particular, from Lemma 4.20 we have that $\mathcal{S}_1 \subsetneq \mathcal{V}_1$. Hence, there exists a vector $\boldsymbol{\zeta}_1$ linearly independent from $\mathbf{b}_1$ such that $\boldsymbol{\zeta}_1 \perp \mathcal{S}_1$. Now for any $i > 1$, there exists $\boldsymbol{\zeta}_i$ linearly independent from $\mathbf{b}_1$ such that $\boldsymbol{\zeta}_i \perp \mathcal{S}_i$. For if not, then $\mathcal{I}_{\mathcal{S}_i} \subset \mathcal{I}_{\mathcal{V}_1}$ and so $\mathcal{S}_i \supset \mathcal{V}_1$, which leads to the contradiction $\mathcal{S}_i \supset \mathcal{S}_1$. Then the polynomial $(\boldsymbol{\zeta}_1^\top x) \cdots (\boldsymbol{\zeta}_n^\top x)$ is an element of $\mathcal{I}_{\mathcal{A},n} = \mathcal{I}_{\mathcal{X},n}$, and by the hypothesis that $\mathcal{I}_{\mathcal{X},m} \subset \langle \mathbf{b}_1^\top x \rangle_m$ we must have that $(\boldsymbol{\zeta}_1^\top x)^{m-n+1}(\boldsymbol{\zeta}_2^\top x) \cdots (\boldsymbol{\zeta}_n^\top x) \in \langle \mathbf{b}_1^\top x \rangle$. But $\langle \mathbf{b}_1^\top x \rangle$ is a prime ideal and so one of the factors of $(\boldsymbol{\zeta}_1^\top x) \cdots (\boldsymbol{\zeta}_n^\top x)$ must lie in $\langle \mathbf{b}_1^\top x \rangle$. So suppose $\boldsymbol{\zeta}_j^\top x \in \langle \mathbf{b}_1^\top x \rangle$, for some $j \in [n]$. This implies that there must exist a polynomial $h$ such that $\boldsymbol{\zeta}_j^\top x = h \cdot (\mathbf{b}_1^\top x)$. By degree considerations, we conclude that $h$ must be a constant, in which case the above equality implies $\boldsymbol{\zeta}_j \sim \mathbf{b}_1$. But this contradicts the definition of $\boldsymbol{\zeta}_j$, thus it can not be that $\mathcal{A}_1 \subsetneq \mathcal{V}_1$.

Notice that checking the condition $\mathcal{I}_{\mathcal{X},m} \subset \langle \mathbf{b}_1^\top x \rangle_m$ in Lemma 4.21 requires computing a basis of $\mathcal{I}_{\mathcal{X},m}$ and checking whether each element of the basis is divisible by the linear form $\mathbf{b}_1^\top x$. Equivalently, to check the inclusion of finite dimensional vector spaces $\mathcal{I}_{\mathcal{X},m} \subset \langle \mathbf{b}_1^\top x \rangle_m$ we need to compute a basis $B_{\mathcal{X},m}$ of $\mathcal{I}_{\mathcal{X},m}$ as well as a basis $B$ of $\langle \mathbf{b}_1^\top x \rangle_m$ and check whether the rank equality $\mathrm{Rank}([B_{\mathcal{X},m}\;\; B]) = \mathrm{Rank}(B)$ holds true. Note that a basis of $\langle \mathbf{b}_1^\top x \rangle_m$ can be obtained in a straightforward manner by multiplying all monomials of degree $m-1$ with the linear form $\mathbf{b}_1^\top x$. On the other hand, computing a basis of $\mathcal{I}_{\mathcal{X},m}$ by computing a basis for the right nullspace of $\nu_m(\mathcal{X})$ can be computationally expensive, particularly when $m$ is large. If however, the points $\mathcal{X} \cap \mathcal{S}_1$ are in general position in $\mathcal{S}_1$ with respect to degree $m$, then checking the condition $\mathcal{I}_{\mathcal{X},m} \subset \langle \mathbf{b}_1^\top x \rangle_m$ can be done more efficiently, as we now explain. Let $V_1 = [\mathbf{v}_1, \ldots, \mathbf{v}_{D-1}]$ be a basis for the vector space $\mathcal{V}_1$. Then $\mathcal{V}_1$ is isomorphic to $\mathbb{R}^{D-1}$ under the linear map $\sigma_{V_1} : \mathcal{V}_1 \to \mathbb{R}^{D-1}$ that takes a vector $\mathbf{v} = \alpha_1 \mathbf{v}_1 + \cdots + \alpha_{D-1}\mathbf{v}_{D-1}$ to its coordinate representation $(\alpha_1, \ldots, \alpha_{D-1})^\top$. Then the next result says that checking the condition $\mathcal{V}_1 = \mathcal{A}_1$ is equivalent to checking whether the embedded data matrix $\nu_m(\sigma_{V_1}(\mathcal{X} \cap \mathcal{V}_1))$ is rank-deficient, which is computationally a simpler task than computing the right nullspace of $\nu_m(\mathcal{X})$.

Lemma 4.22. Suppose that $\mathcal{X}_1$ is in general position inside $\mathcal{S}_1$ with respect to degree $m$. Then $\mathcal{V}_1 = \mathcal{A}_1$ if and only if the embedded data matrix $\nu_m(\sigma_{V_1}(\mathcal{X} \cap \mathcal{V}_1))$ is full rank.

Proof. The statement is equivalent to the statement "$\mathcal{V}_1 = \mathcal{A}_1$ if and only if $\mathcal{I}_{\mathcal{X}\cap\mathcal{V}_1,m} = \langle \mathbf{b}_1^\top x \rangle_m$", which we now prove. $(\Rightarrow)$: Suppose $\mathcal{V}_1 = \mathcal{A}_1$. Then by Lemma 4.20 $\mathcal{V}_1 = \mathcal{S}_1$, which implies that $\mathcal{I}_{\mathcal{S}_1} = \langle \mathbf{b}_1^\top x \rangle$. This in turn implies that $\mathcal{I}_{\mathcal{S}_1,m} = \langle \mathbf{b}_1^\top x \rangle_m$. Now $\mathcal{I}_{\mathcal{X}\cap\mathcal{V}_1,m} = \mathcal{I}_{\mathcal{X}\cap\mathcal{S}_1,m} = \mathcal{I}_{\mathcal{X}_1,m}$. By the general position hypothesis on $\mathcal{X}_1$ we have $\mathcal{I}_{\mathcal{X}_1,m} = \mathcal{I}_{\mathcal{S}_1,m}$. Hence $\mathcal{I}_{\mathcal{X}\cap\mathcal{V}_1,m} = \langle \mathbf{b}_1^\top x \rangle_m$. $(\Leftarrow)$: Suppose that $\mathcal{I}_{\mathcal{X}\cap\mathcal{V}_1,m} = \langle \mathbf{b}_1^\top x \rangle_m$. For the sake of contradiction, suppose that $\mathcal{A}_1 \subsetneq \mathcal{V}_1$. Since $\mathcal{A}_1$ is an arrangement of at most $m$ subspaces, there exists a homogeneous polynomial $p$ of degree at most $m$ that vanishes on $\mathcal{A}_1$ but does not vanish on $\mathcal{V}_1$. Since $\mathcal{X} \cap \mathcal{V}_1 \subset \mathcal{A}_1$, $p$ will vanish on $\mathcal{X} \cap \mathcal{V}_1$, i.e., $p \in \mathcal{I}_{\mathcal{X}\cap\mathcal{V}_1,m}$, or equivalently $p \in \langle \mathbf{b}_1^\top x \rangle_m$ by hypothesis. But then $p$ vanishes on $\mathcal{V}_1$, which is a contradiction; hence it must be the case that $\mathcal{V}_1 = \mathcal{A}_1$.
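A minimal MATLAB sketch of the rank test of Lemma 4.22 is given below; it is an illustration only (the helper veronese is the hypothetical one sketched after the introduction of Lemma 4.18, and the function name is an assumption).

    function stop = filtration_should_stop(X1, b1, m)
    % X1: D x N points currently in the filtration; b1: normal vector of V1.
    % Express the points in coordinates of V1 = span(b1)^perp and test whether
    % their degree-m Veronese embedding is full column rank (Lemma 4.22).
        B1 = null(b1');                     % D x (D-1) orthonormal basis of V1
        Y  = B1' * X1;                      % coordinates sigma_{V1}(X ∩ V1)
        Vm = veronese(Y, m);                % N x M_m(D-1) embedded data matrix
        stop = (rank(Vm) == size(Vm, 2));   % full rank  <=>  V1 = A1
    end

In the noiseless setting, stop being true signals that $\mathcal{V}_1 = \mathcal{S}_1$ has been identified.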

4.2.2.4 Taking multiple steps in a filtration and terminating

If $\mathcal{A}_1 \subsetneq \mathcal{V}_1$, then it follows from Lemma 4.20 that $\mathcal{S}_1 \subsetneq \mathcal{V}_1$. Therefore, subspace $\mathcal{S}_1$ has not yet been identified in the first step of the filtration and we should take a second step.

As before, we can start constructing the second step of our filtration by choosing a suitable vanishing polynomial p2, such that its gradient at the reference point x is not colinear with b1. The next lemma shows that such a p2 always exists.

Lemma 4.23. $\mathcal{X}$ admits a homogeneous vanishing polynomial $p_2$ of degree $\ell \leq n$, such that $p_2 \notin \mathcal{I}_{\mathcal{V}_1}$ and $\nabla p_2|_{\mathbf{x}} \notin \mathrm{Span}(\mathbf{b}_1)$.

Proof. Since $\mathcal{A}_1 \subsetneq \mathcal{V}_1$, Lemma 4.20 implies that $\mathcal{S}_1 \subsetneq \mathcal{V}_1$. Then there exists a vector $\boldsymbol{\zeta}_1$ that is orthogonal to $\mathcal{S}_1$ and is linearly independent from $\mathbf{b}_1$. Since $\mathbf{x} \in \mathcal{S}_1 - \bigcup_{i>1}\mathcal{S}_i$, for each $i > 1$ we can find a vector $\boldsymbol{\zeta}_i$ such that $\boldsymbol{\zeta}_i \not\perp \mathbf{x}$ and $\boldsymbol{\zeta}_i \perp \mathcal{S}_i$. Notice that the pairs $\mathbf{b}_1, \boldsymbol{\zeta}_i$ are linearly independent for $i > 1$, since $\mathbf{b}_1 \perp \mathbf{x}$ but $\boldsymbol{\zeta}_i \not\perp \mathbf{x}$. Now, the polynomial $p_2 := (\boldsymbol{\zeta}_1^\top x) \cdots (\boldsymbol{\zeta}_n^\top x)$ has degree $n$ and vanishes on $\mathcal{A}$, hence $p_2 \in \mathcal{I}_{\mathcal{X},\leq m}$. Moreover, $\nabla p_2|_{\mathbf{x}} = (\boldsymbol{\zeta}_2^\top \mathbf{x}) \cdots (\boldsymbol{\zeta}_n^\top \mathbf{x})\, \boldsymbol{\zeta}_1 \neq 0$, since by hypothesis $\boldsymbol{\zeta}_i^\top \mathbf{x} \neq 0$, $\forall i > 1$. Since $\boldsymbol{\zeta}_1$ is linearly independent from $\mathbf{b}_1$, we have $\nabla p_2|_{\mathbf{x}} \notin \mathrm{Span}(\mathbf{b}_1)$. Finally, $p_2$ does not vanish on $\mathcal{V}_1$, by a similar argument to the one used in the proof of Lemma 4.21.

Remark 4.24. Note that if $\ell$ is the degree of $p_2$ as in Lemma 4.23, and if $q_1, \ldots, q_s$ is a basis for $\mathcal{I}_{\mathcal{X},\ell}$, then at least one of the $q_i$ satisfies the conditions of the lemma. This is important algorithmically, because it implies that the search for our $p_2$ can be done sequentially. We can start by first computing a minimal-degree polynomial in $\mathcal{I}_{\mathcal{A},k}$, and see if it satisfies our requirements. If not, then we can compute a second linearly independent polynomial and check again. We can continue in that fashion until we have computed a full basis for $\mathcal{I}_{\mathcal{X},k}$. If no suitable polynomial has been found, we can repeat the process for degree $k+1$, and so on, until we have reached degree $n$, if necessary.
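A hedged MATLAB sketch of this sequential search is shown below; it is not the thesis' implementation, and it relies on the illustrative veronese helper (with its exponent matrix E) introduced earlier and on the poly_gradient helper defined here.

    function q = pick_next_vanishing_poly(X, x, B, nmax)
    % Scan degrees l = 1, ..., nmax and return the coefficients q of a vanishing
    % polynomial whose gradient at the reference point x escapes Span(B).
        for l = 1:nmax
            [Vl, E] = veronese(X, l);        % embedded data and exponent vectors
            Q = null(Vl);                    % basis of I_{X,l} (coefficient vectors)
            for idx = 1:size(Q, 2)
                g = poly_gradient(Q(:, idx), E, x);
                r = g - B * (B \ g);         % residual of g w.r.t. Span(B)
                if norm(r) > 1e-10 * norm(g)
                    q = Q(:, idx);           % gradient not in Span(B): accept
                    return;
                end
            end
        end
        error('no suitable polynomial found up to degree nmax');
    end

    function g = poly_gradient(c, E, x)
    % Gradient at x of the polynomial with coefficients c over the monomial basis
    % whose exponent vectors are the rows of E.
        [~, D] = size(E);
        g = zeros(D, 1);
        for d = 1:D
            Ed = E; Ed(:, d) = max(E(:, d) - 1, 0);      % lower the d-th exponent
            g(d) = sum(c .* E(:, d) .* prod(x(:)'.^Ed, 2));
        end
    end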

By using a polynomial $p_2$ as in Lemma 4.23, Proposition 4.76 guarantees that $\nabla p_2|_{\mathbf{x}}$ is orthogonal to $\mathcal{S}_1$. Recall though that for the purpose of the filtration, we want to construct a hyperplane $\mathcal{V}_2$ of $\mathcal{V}_1$. Since there is no guarantee that $\nabla p_2|_{\mathbf{x}}$ is inside $\mathcal{V}_1$ (thus defining a hyperplane of $\mathcal{V}_1$), we must project $\nabla p_2|_{\mathbf{x}}$ onto $\mathcal{V}_1$ and guarantee that this projection is still orthogonal to $\mathcal{S}_1$. The next lemma ensures that this is always the case.

Lemma 4.25. Let $0 \neq p_2 \in \mathcal{I}_{\mathcal{X},\leq m} - \mathcal{I}_{\mathcal{V}_1}$ be such that $\nabla p_2|_{\mathbf{x}} \notin \mathrm{Span}(\mathbf{b}_1)$. Then $0 \neq \pi_{\mathcal{V}_1}(\nabla p_2|_{\mathbf{x}}) \perp \mathcal{S}_1$.

Proof. For the sake of contradiction, suppose that $\pi_{\mathcal{V}_1}(\nabla p_2|_{\mathbf{x}}) = 0$. Setting $\mathbf{b}_{11} := \mathbf{b}_1$, let us augment $\mathbf{b}_{11}$ to a basis $\mathbf{b}_{11}, \mathbf{b}_{12}, \ldots, \mathbf{b}_{1c}$ for the orthogonal complement of $\mathcal{S}_1$ in $\mathbb{R}^D$. In fact, we can choose the vectors $\mathbf{b}_{12}, \ldots, \mathbf{b}_{1c}$ to be a basis for the orthogonal complement of $\mathcal{S}_1$ inside $\mathcal{V}_1$. By Proposition 4.72, $p_2$ must have the form
$$p_2(x) = q_1(x)(\mathbf{b}_{11}^\top x) + q_2(x)(\mathbf{b}_{12}^\top x) + \cdots + q_c(x)(\mathbf{b}_{1c}^\top x), \tag{4.27}$$
where $q_1, \ldots, q_c$ are homogeneous polynomials of degree $\deg(p_2) - 1$. Then
$$\nabla p_2|_{\mathbf{x}} = q_1(\mathbf{x})\mathbf{b}_{11} + q_2(\mathbf{x})\mathbf{b}_{12} + \cdots + q_c(\mathbf{x})\mathbf{b}_{1c}. \tag{4.28}$$
Projecting the above equation orthogonally onto $\mathcal{V}_1$ we get
$$\pi_{\mathcal{V}_1}(\nabla p_2|_{\mathbf{x}}) = q_2(\mathbf{x})\mathbf{b}_{12} + \cdots + q_c(\mathbf{x})\mathbf{b}_{1c}, \tag{4.29}$$
which is zero by hypothesis. Since $\mathbf{b}_{12}, \ldots, \mathbf{b}_{1c}$ are linearly independent vectors of $\mathcal{V}_1$, it must be the case that $q_2(\mathbf{x}) = \cdots = q_c(\mathbf{x}) = 0$. But this implies that $\nabla p_2|_{\mathbf{x}} = q_1(\mathbf{x})\mathbf{b}_{11}$, which contradicts the non-colinearity of $\nabla p_2|_{\mathbf{x}}$ with $\mathbf{b}_{11}$. Hence it must be the case that $0 \neq \pi_{\mathcal{V}_1}(\nabla p_2|_{\mathbf{x}})$. The fact that $\pi_{\mathcal{V}_1}(\nabla p_2|_{\mathbf{x}}) \perp \mathcal{S}_1$ follows from (4.29) and the fact that by definition $\mathbf{b}_{12}, \ldots, \mathbf{b}_{1c}$ are orthogonal to $\mathcal{S}_1$.

At this point, letting $\mathbf{b}_2 := \pi_{\mathcal{V}_1}(\nabla p_2|_{\mathbf{x}})$, we can define $\mathcal{V}_2 = \mathrm{Span}(\mathbf{b}_1, \mathbf{b}_2)^\perp$, which is a subspace of codimension $1$ inside $\mathcal{V}_1$ (and hence of codimension $2$ inside $\mathcal{V}_0 := \mathbb{R}^D$). As before, we can define a subspace sub-arrangement $\mathcal{A}_2$ of $\mathcal{A}_1$ by intersecting $\mathcal{A}_1$ with $\mathcal{V}_2$. Once again, this intersection can be realized in the algebraic domain as $\mathcal{A}_2 = \mathcal{Z}(\mathcal{I}_{\mathcal{X},m}, \mathbf{b}_1^\top x, \mathbf{b}_2^\top x)$. Next, we have a similar result as in Lemmas 4.20 and 4.21, which we now prove in general form:


Lemma 4.26. Let $\mathbf{b}_1, \ldots, \mathbf{b}_s$ be $s$ vectors orthogonal to $\mathcal{S}_1$ and define the intermediate ambient space $\mathcal{V}_s := \mathrm{Span}(\mathbf{b}_1, \ldots, \mathbf{b}_s)^\perp$. Let $\mathcal{A}_s$ be the subspace arrangement obtained by intersecting $\mathcal{A}$ with $\mathcal{V}_s$. Then the following are equivalent:

(i) $\mathcal{V}_s = \mathcal{A}_s$

(ii) $\mathcal{V}_s = \mathcal{S}_1$

(iii) $\mathcal{S}_1 = \mathrm{Span}(\mathbf{b}_1, \ldots, \mathbf{b}_s)^\perp$

(iv) $\mathcal{I}_{\mathcal{X},m} \subset \langle \mathbf{b}_1^\top x, \ldots, \mathbf{b}_s^\top x \rangle_m$.

Proof. (i) $\Rightarrow$ (ii): By taking vanishing ideals on both sides of $\mathcal{V}_s = \mathcal{S}_1 \cup \bigcup_{i>1}(\mathcal{S}_i \cap \mathcal{V}_s)$ we get $\mathcal{I}_{\mathcal{V}_s} = \mathcal{I}_{\mathcal{S}_1} \cap \bigcap_{i>1} \mathcal{I}_{\mathcal{S}_i \cap \mathcal{V}_s}$. By using Proposition 4.52 in a similar fashion as in the proof of Lemma 4.20, we conclude that $\mathcal{V}_s = \mathcal{S}_1$. (ii) $\Rightarrow$ (iii): This is obvious from the definition of $\mathcal{V}_s$. (iii) $\Rightarrow$ (iv): Let $h \in \mathcal{I}_{\mathcal{X},m}$. Then $h$ vanishes on $\mathcal{A}$ and hence on $\mathcal{S}_1$, and by Proposition 4.72 we must have that $h \in \mathcal{I}_{\mathcal{S}_1} = \langle \mathbf{b}_1^\top x, \ldots, \mathbf{b}_s^\top x \rangle$. (iv) $\Rightarrow$ (i): $\mathcal{I}_{\mathcal{X},m} \subset \langle \mathbf{b}_1^\top x, \ldots, \mathbf{b}_s^\top x \rangle_m$ can be written as $\mathcal{I}_{\mathcal{X},m} \subset \mathcal{I}_{\mathcal{V}_s}$. By the general position assumption $\mathcal{I}_{\mathcal{A},m} = \mathcal{I}_{\mathcal{X},m}$ and so we have $\mathcal{I}_{\mathcal{A},m} \subset \mathcal{I}_{\mathcal{V}_s}$. Taking zero sets on both sides we get $\mathcal{A} \supset \mathcal{V}_s$, and intersecting both sides of this relation with $\mathcal{V}_s$, we get $\mathcal{A}_s \supset \mathcal{V}_s$. Since $\mathcal{A}_s \subset \mathcal{V}_s$, this implies that $\mathcal{V}_s = \mathcal{A}_s$.

Similarly to Lemma 4.22 we have:

Lemma 4.27. Let $V_s = [\mathbf{v}_1, \ldots, \mathbf{v}_{D-s}]$ be a basis for $\mathcal{V}_s$, and let $\sigma_{V_s} : \mathcal{V}_s \to \mathbb{R}^{D-s}$ be the linear map that takes a vector $\mathbf{v} = \alpha_1\mathbf{v}_1 + \cdots + \alpha_{D-s}\mathbf{v}_{D-s}$ to its coordinate representation $(\alpha_1, \ldots, \alpha_{D-s})^\top$. Suppose that $\mathcal{X}_1$ is in general position inside $\mathcal{S}_1$ with respect to degree $m$. Then $\mathcal{V}_s = \mathcal{A}_s$ if and only if $\nu_m(\sigma_{V_s}(\mathcal{X} \cap \mathcal{V}_s))$ is full rank.

Algorithm 4.2 Algebraic Descending Filtration (ADF)
 1: procedure ADF($p$, $\mathbf{x}$, $\mathcal{X}$, $m$)
 2:   $B \leftarrow \nabla p|_{\mathbf{x}}$;
 3:   while $\mathcal{I}_{\mathcal{X},m} \not\subset \langle \mathbf{b}^\top x : \mathbf{b} \in B \rangle$ do
 4:     find $p \in \mathcal{I}_{\mathcal{X},\leq m} - \langle \mathbf{b}^\top x : \mathbf{b} \in B \rangle$ s.t. $\nabla p|_{\mathbf{x}} \notin \mathrm{Span}(B)$;
 5:     $B \leftarrow B \cup \{\pi_{\mathrm{Span}(B)^\perp}(\nabla p|_{\mathbf{x}})\}$;
 6:   end while
 7:   return $B$;
 8: end procedure

By Lemma 4.26, if $\mathcal{I}_{\mathcal{X},m} \subset \langle \mathbf{b}_1^\top x, \mathbf{b}_2^\top x \rangle$, the algorithm terminates the filtration with output the orthogonal basis $\{\mathbf{b}_1, \mathbf{b}_2\}$ for the orthogonal complement of the irreducible component $\mathcal{S}_1$ of $\mathcal{A}$. If on the other hand $\mathcal{I}_{\mathcal{X},m} \not\subset \langle \mathbf{b}_1^\top x, \mathbf{b}_2^\top x \rangle$, then the algorithm picks a basis element $p_3$ of $\mathcal{I}_{\mathcal{X},m}$ such that $p_3 \notin \mathcal{I}_{\mathcal{V}_2}$ and $\nabla p_3|_{\mathbf{x}} \notin \mathrm{Span}(\mathbf{b}_1, \mathbf{b}_2)$, and defines a subspace $\mathcal{V}_3$ of codimension $1$ inside $\mathcal{V}_2$ using $\pi_{\mathcal{V}_2}(\nabla p_3|_{\mathbf{x}})$.$^{11}$ Setting $\mathbf{b}_3 := \pi_{\mathcal{V}_2}(\nabla p_3|_{\mathbf{x}})$, the algorithm uses Lemma 4.26 to determine whether to terminate the filtration or take one more step, and so on.

The principles established in the previous paragraphs formally lead us to the Algebraic Descending Filtration (Algorithm 4.2) and its correctness result, Theorem 4.28.

11 The proof of existence of such a p3 is similar to the proof of Lemma 4.23 and is omitted.


Theorem 4.28 (Correctness of Algorithm 4.2). Let $\mathcal{X} = \{\mathbf{x}_1, \ldots, \mathbf{x}_N\}$ be a finite set of points in general position (Definition 4.12) with respect to degree $m$ inside a transversal (Definition 4.4) arrangement $\mathcal{A}$ of at most $m$ linear subspaces of $\mathbb{R}^D$. Let $p$ be a polynomial of minimal degree that vanishes on $\mathcal{X}$. Then there always exists a nonsingular $\mathbf{x} \in \mathcal{X}$ such that $\nabla p|_{\mathbf{x}} \neq 0$, and for such an $\mathbf{x}$, the output $B$ of Algorithm 4.2 is an orthogonal basis for the orthogonal complement in $\mathbb{R}^D$ of the irreducible component of $\mathcal{A}$ that contains $\mathbf{x}$.

4.2.2.5 The FASC algorithm

In § 4.2.2.2-4.2.2.4 we established the theory of a single filtration, according to which one starts with a nonsingular point $\mathbf{x} := \mathbf{x}_1 \in \mathcal{X} \cap \mathcal{A}$ and obtains an orthogonal basis $\mathbf{b}_{11}, \ldots, \mathbf{b}_{1c_1}$ for the orthogonal complement of the irreducible component $\mathcal{S}_1$ of $\mathcal{A}$ that contains the reference point $\mathbf{x}_1$. To obtain an orthogonal basis $\mathbf{b}_{21}, \ldots, \mathbf{b}_{2c_2}$ corresponding to a second irreducible component $\mathcal{S}_2$ of $\mathcal{A}$, our approach is the natural one: remove $\mathcal{X}_1$ from $\mathcal{X}$ and run a filtration on the set $\mathcal{X}^{(1)} := \mathcal{X} - \mathcal{X}_1$. All we need for the theory of § 4.2.2.2-4.2.2.4 to be applicable to the set $\mathcal{X}^{(1)}$ is that $\mathcal{X}^{(1)}$ be in general position inside the arrangement $\mathcal{A}^{(1)} := \mathcal{S}_2 \cup \cdots \cup \mathcal{S}_n$. But this has been proved in Lemma 4.16. With Lemma 4.16 establishing the correctness of recursive application of a single filtration, the correctness of the FASC Algorithm 4.3 follows at once, as in Theorem 4.29. Note that in Algorithm 4.3, $n$ is the number of subspaces, while $D$ and $L$ are ordered sets, such that, up to a permutation, the $i$-th element of $D$ is $d_i = \dim \mathcal{S}_i$, and the $i$-th element of $L$ is an orthogonal basis for $\mathcal{S}_i^\perp$.


Algorithm 4.3 Filtrated Algebraic Subspace Clustering (FASC)
 1: procedure FASC($\mathcal{X} \in \mathbb{R}^{D \times N}$, $m$)
 2:   $n \leftarrow 0$; $D \leftarrow \emptyset$; $L \leftarrow \emptyset$;
 3:   while $\mathcal{X} \neq \emptyset$ do
 4:     find polynomial $p$ of minimal degree that vanishes on $\mathcal{X}$;
 5:     find $\mathbf{x} \in \mathcal{X}$ s.t. $\nabla p|_{\mathbf{x}} \neq 0$;
 6:     $B \leftarrow$ ADF($p$, $\mathbf{x}$, $\mathcal{X}$, $m$);
 7:     $L \leftarrow L \cup \{B\}$;
 8:     $D \leftarrow D \cup \{D - \mathrm{Card}(B)\}$;
 9:     $\mathcal{X} \leftarrow \mathcal{X} - \mathrm{Span}(B)^\perp$;
10:     $n \leftarrow n + 1$; $m \leftarrow m - 1$;
11:   end while
12:   return $n$, $D$, $L$;
13: end procedure

Theorem 4.29 (Correctness of Algorithm 4.3). Let $\mathcal{X} = \{\mathbf{x}_1, \ldots, \mathbf{x}_N\}$ be a set in general position with respect to degree $m$ (Definition 4.12) inside a transversal (Definition 4.4) arrangement $\mathcal{A}$ of at most $m$ linear subspaces of $\mathbb{R}^D$. For such an $\mathcal{X}$ and $m$, Algorithm 4.3 always terminates with output a set $L = \{B_1, \ldots, B_n\}$, such that, up to a permutation, $B_i$ is an orthogonal basis for the orthogonal complement of the $i$th irreducible component $\mathcal{S}_i$ of $\mathcal{A}$, i.e., $\mathcal{S}_i = \mathrm{Span}(B_i)^\perp$, $i = 1, \ldots, n$, and $\mathcal{A} = \bigcup_{i=1}^n \mathcal{S}_i$.


4.3 Filtrated spectral algebraic subspace clustering

In this section we show how FASC (Sections 4.2-4.2.2) can be adapted to a working subspace clustering algorithm that is robust to noise. As we will soon see, the success of such an algorithm depends on being able to 1) implement a single filtration in a robust fashion, and 2) combine multiple robust filtrations to obtain the clustering of the points.

4.3.1 Implementing robust filtrations

Recall that the filtration component ADF (Algorithm 4.2) of the FASC Algorithm 4.3 is based on computing a descending filtration of ambient spaces $\mathcal{V}_1 \supset \mathcal{V}_2 \supset \cdots$. Recall that $\mathcal{V}_1$ is obtained as the hyperplane of $\mathbb{R}^D$ with normal vector $\nabla p|_{\mathbf{x}}$, where $\mathbf{x}$ is the reference point associated with the filtration, and $p$ a polynomial of minimal degree $k$ that vanishes on $\mathcal{X}$. In the absence of noise, the value of $k$ can be characterized as the smallest $\ell$ such that $\nu_\ell(\mathcal{X})$ drops rank (see Section 4.1.1 for notation). In the presence of noise, and assuming that $\mathcal{X}$ has cardinality at least $\binom{m+D-1}{m}$, there will in general be no vanishing polynomial of degree $\leq m$, i.e., the embedded data matrix $\nu_\ell(\mathcal{X})$ will have full column rank for any $\ell \leq m$. Hence, in the presence of noise we do not know a-priori what the minimal degree $k$ is. On the other hand, we do know that $m \geq n$, which implies that the underlying subspace arrangement $\mathcal{A}$ admits vanishing polynomials of degree $m$. Thus a reasonable choice for an approximate vanishing polynomial $p_1 := p$ is the polynomial whose coefficients are given by the right singular vector of $\nu_m(\mathcal{X})$ that corresponds to the smallest singular value.


Recall also that in the absence of noise we chose our reference point $\mathbf{x} \in \mathcal{X}$ such that $\nabla p_1|_{\mathbf{x}} \neq 0$. In the presence of noise this condition will almost surely be true for every point $\mathbf{x} \in \mathcal{X}$; then one can select the point that gives the largest gradient, i.e., we can pick as reference point an $\mathbf{x}$ that maximizes the norm of the gradient $\|\nabla p_1|_{\mathbf{x}}\|_2$.

Moving on, ADF constructs the filtration of $\mathcal{X}$ by intersecting $\mathcal{X}$ with the intermediate ambient spaces $\mathcal{V}_1 \supset \mathcal{V}_2 \supset \cdots$. In the presence of noise in the dataset $\mathcal{X}$, such intersections will almost surely be empty. As it turns out, we can replace the operation of intersecting $\mathcal{X}$ with the intermediate spaces $\mathcal{V}_s$, $s = 1, 2, \ldots$, by projecting $\mathcal{X}$ onto $\mathcal{V}_s$. In the absence of noise, the norm of the points of $\mathcal{X}$ that lie in $\mathcal{V}_s$ will remain unchanged after projection, while points that lie outside $\mathcal{V}_s$ will witness a drop in their norm upon projection onto $\mathcal{V}_s$. Points whose norm is reduced can then be removed, and the end result of this process is equivalent to intersecting $\mathcal{X}$ with $\mathcal{V}_s$. In the presence of noise one can choose a threshold $\delta > 0$, such that if the distance of a point from the subspace $\mathcal{V}_s$ is less than $\delta$, then the point is maintained after projection onto $\mathcal{V}_s$, otherwise it is removed. But how to choose $\delta$? One reasonable way to proceed is to consider the polynomial $p$ that corresponds to the right singular vector of $\nu_m(\mathcal{X})$ of smallest singular value, and then consider the quantity
$$\beta(\mathcal{X}) := \frac{1}{N}\sum_{j=1}^{N} \frac{\mathbf{x}_j^\top \nabla p|_{\mathbf{x}_j}}{\|\mathbf{x}_j\|_2\, \|\nabla p|_{\mathbf{x}_j}\|_2}. \tag{4.30}$$
Notice that in the absence of noise $\dim\mathcal{N}(\nu_m(\mathcal{X})) > 0$ and subsequently $\beta(\mathcal{X}) = 0$. In the presence of noise however, $\beta(\mathcal{X})$ represents the average distance of a point $\mathbf{x}$ in the dataset to the hyperplane that it produces by means of $\nabla p|_{\mathbf{x}}$ (in the absence of noise this distance is zero by Proposition 4.76). Hence intuitively, $\delta$ should be of the same order of magnitude as


$\beta(\mathcal{X})$; a natural choice is to set $\delta := \gamma \cdot \beta(\mathcal{X})$, where $\gamma$ is a user-defined parameter taking values close to $1$. Having projected $\mathcal{X}$ onto $\mathcal{V}_1$ and removed points whose distance from $\mathcal{V}_1$ is larger than $\delta$, we obtain a second approximate polynomial $p_2$ from the right singular vector of smallest singular value of the embedded data matrix of the remaining projected points, and so on.
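A minimal MATLAB sketch of this threshold selection is shown below; it is only an illustration of eq. (4.30), assuming the points are the columns of X and that G(:,j) already holds $\nabla p|_{\mathbf{x}_j}$ (computed, e.g., with the hypothetical poly_gradient helper above), with gamma the user-defined parameter.

    function [delta, beta] = choose_threshold(X, G, gamma)
    % Evaluate beta(X) of eq. (4.30) and set the filtration threshold delta.
        N = size(X, 2);
        beta = 0;
        for j = 1:N
            beta = beta + (X(:, j)' * G(:, j)) / (norm(X(:, j)) * norm(G(:, j)));
        end
        beta = beta / N;          % average normalized point-to-hyperplane term
        delta = gamma * beta;     % threshold used when projecting onto V_s
    end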

It remains to devise a robust criterion for terminating the filtration. Recall that the criterion for terminating the filtration in ADF is $\mathcal{I}_{\mathcal{X},m} \subset \langle \mathbf{b}_1^\top x, \ldots, \mathbf{b}_s^\top x \rangle_m$, where $\mathcal{V}_s = \mathrm{Span}(\mathbf{b}_1, \ldots, \mathbf{b}_s)^\perp$. Checking this criterion is equivalent to checking the inclusion $\mathcal{I}_{\mathcal{X},m} \subset \langle \mathbf{b}_1^\top x, \ldots, \mathbf{b}_s^\top x \rangle_m$ of finite dimensional vector spaces. In principle, this requires computing a basis for the vector space $\mathcal{I}_{\mathcal{X},m}$. Now recall from Section 4.1.6 that it is precisely this computation that renders the classic polynomial differentiation algorithm unstable to noise; the main difficulty being the correct estimation of $\dim(\mathcal{I}_{\mathcal{X},m})$, and the dramatic dependence of the quality of the clustering on this estimate. Consequently, for the purpose of obtaining a robust algorithm, it is imperative to avoid such a computation. But we know from Lemma 4.27 that, if $\mathcal{X}_i := \mathcal{X} \cap \mathcal{S}_i$ is in general position inside $\mathcal{S}_i$ with respect to degree $m$ for every $i \in [n]$, then the criterion for terminating the filtration is equivalent to checking whether, in the coordinate representation of $\mathcal{V}_s$, the points $\mathcal{X} \cap \mathcal{V}_s$ admit a vanishing polynomial of degree $m$. But this is computationally equivalent to checking whether $\mathcal{N}(\nu_m(\sigma_{V_s}(\mathcal{X} \cap \mathcal{V}_s))) \neq 0$; see the notation in Lemma 4.27. This is a much easier problem than estimating $\dim(\mathcal{I}_{\mathcal{X},m})$, and we solve it implicitly as follows. Recall that in the absence of noise, the norm of the reference point remains unchanged as it passes through the filtration. Hence, it is natural to terminate the filtration at step $s$ if the distance from the projected reference point$^{12}$ to $\mathcal{V}_{s+1}$ is more than $\delta$, i.e., if the projected reference point is among the points that are being removed upon projection from $\mathcal{V}_s$ to $\mathcal{V}_{s+1}$. To guard against overestimating the number of steps in the filtration, we enhance the termination criterion by additionally deciding to terminate at step $s$ if the number of points that survived the projection from $\mathcal{V}_s$ to $\mathcal{V}_{s+1}$ is less than a pre-defined integer $L$, which is to be thought of as the minimum number of points in a cluster.

$^{12}$Here by projected reference point we mean the image of the reference point under all projections up to step $s$.

4.3.2 Combining multiple filtrations

Having determined a robust algorithmic implementation for a single filtration, we face the following issue: In general, two points lying approximately in the same subspace $\mathcal{S}$ will produce different hyperplanes that approximately contain $\mathcal{S}$ with different levels of accuracy. In the noiseless case any point would be equally good. In the presence of noise though, the choice of the reference point $\mathbf{x}$ becomes significant. How should $\mathbf{x}$ be chosen? To deal with this problem in a robust fashion, it is once again natural to construct a single filtration for each point in $\mathcal{X}$ and define an affinity between points $j$ and $j'$ as
$$C_{jj',\mathrm{FSASC}} = \begin{cases} \big\|\pi^{(j)}_{s_j} \circ \cdots \circ \pi^{(j)}_{1}(\mathbf{x}_{j'})\big\| & \text{if } \mathbf{x}_{j'} \text{ remains,} \\ 0 & \text{otherwise,} \end{cases} \tag{4.31}$$
where $\pi^{(j)}_{s}$ is the projection from $\mathcal{V}_s$ to $\mathcal{V}_{s+1}$ associated to the filtration of point $\mathbf{x}_j$ and $s_j$ is the length of that filtration. This affinity captures the fact that if points $\mathbf{x}_j$ and $\mathbf{x}_{j'}$ are


in the same subspace, then the norm of $\mathbf{x}_{j'}$ should not change from step $0$ to step $c$ of the filtration computed with reference point $\mathbf{x}_j$, where $c = D - \dim(\mathcal{S})$ is the codimension of the irreducible component $\mathcal{S}$ associated to reference point $\mathbf{x}_j$. Otherwise, if $\mathbf{x}_j$ and $\mathbf{x}_{j'}$ are in different subspaces, the norm of $\mathbf{x}_{j'}$ is expected to be reduced by the time the filtration reaches step $c$. In the case of noiseless data, only the points in the correct subspace survive step $c$ and their norms are precisely equal to one. In the case of noisy data, the affinity defined above will only be approximate.

4.3.3 The FSASC algorithm

Having an affinity matrix as in eq. (4.31), standard spectral clustering techniques can be

applied to obtain a clustering of X into n groups. We emphasize that in contrast to the

abstract case of Algorithm 4.3, the number n of clusters must be given as input to the

algorithm. On the other hand, the algorithm does not require the subspace dimensions to

be given: these are implicitly estimated by means of the filtrations. Finally, one may choose

to implement the above scheme for $M$ distinct values of the parameter $\gamma$ and choose the affinity matrix that leads to the largest $n$-th eigengap (a minimal sketch of this selection step is given after the notation list below). The above discussion leads to the Filtrated Spectral Algebraic Subspace Clustering (FSASC) Algorithm 4.4, in which

• SPECTRUM(NL($C + C^\top$)) denotes the spectrum of the normalized Laplacian matrix of $C + C^\top$,

• SPECCLUST($C^* + (C^*)^\top$, $n$) denotes spectral clustering being applied to $C^* + (C^*)^\top$ to obtain $n$ clusters,

• VANISHING($\nu_n(\mathcal{X})$) is the polynomial whose coefficients are the right singular vector of $\nu_n(\mathcal{X})$ corresponding to the smallest singular value,

• $\pi \leftarrow [\mathbb{R}^d \rightarrow \mathcal{H} \xrightarrow{\sim} \mathbb{R}^{d-1}]$ is to be read as "$\pi$ is assigned the composite linear transformation $\mathbb{R}^d \rightarrow \mathcal{H} \xrightarrow{\sim} \mathbb{R}^{d-1}$, where the first arrow is the orthogonal projection of $\mathbb{R}^d$ to hyperplane $\mathcal{H}$, and the second arrow is the linear isomorphism that maps a basis of $\mathcal{H}$ in $\mathbb{R}^d$ to the standard coordinate basis of $\mathbb{R}^{d-1}$".
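The MATLAB sketch below illustrates one way the eigengap-based model selection and the final spectral clustering step could be realized; the function names and the use of kmeans (Statistics and Machine Learning Toolbox) are illustrative assumptions, not the thesis' code.

    function labels = select_and_cluster(Cs, n)
    % Cs: cell array of candidate affinity matrices (one per value of gamma).
        best_gap = -inf;
        for k = 1:numel(Cs)
            W = Cs{k} + Cs{k}';                        % symmetrize the affinity
            lam = sort(eig(normalized_laplacian(W)));
            if lam(n+1) - lam(n) > best_gap            % keep the largest n-th eigengap
                best_gap = lam(n+1) - lam(n);
                Wbest = W;
            end
        end
        L = normalized_laplacian(Wbest);
        [V, lamL] = eig(L);
        [~, idx] = sort(diag(lamL));
        U = V(:, idx(1:n));                            % n bottom eigenvectors
        U = U ./ max(sqrt(sum(U.^2, 2)), eps);         % row-normalize
        labels = kmeans(U, n, 'Replicates', 10);       % spectral clustering step
    end

    function L = normalized_laplacian(W)
        d = sum(W, 2);
        Dinv = diag(1 ./ sqrt(max(d, eps)));
        L = eye(size(W, 1)) - Dinv * W * Dinv;
        L = (L + L') / 2;                              % enforce exact symmetry
    end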

4.3.4 A distance-based affinity

Observe that the FSASC affinity (4.31) between points $\mathbf{x}_j$ and $\mathbf{x}_{j'}$,$^{13}$ can be interpreted as the distance of point $\mathbf{x}_{j'}$ to the orthogonal complement of the final ambient space $\mathcal{V}_{s_j}$ of the filtration corresponding to reference point $\mathbf{x}_j$. If all irreducible components of $\mathcal{A}$ were hyperplanes, then the optimal length of each filtration would be $1$. Inspired by this observation, we may define a simple distance-based affinity, alternative to the angle-based affinity of eq. (4.17), by
$$C_{jj',\mathrm{dist}} := 1 - \frac{\mathbf{x}_{j'}^\top \nabla p|_{\mathbf{x}_j}}{\|\nabla p|_{\mathbf{x}_j}\|_2}. \tag{4.32}$$
The affinity of eq. (4.32) is theoretically justified only for hyperplanes, as $C_{jj',\mathrm{angle}}$ is; yet as we will soon see in the experiments, $C_{jj',\mathrm{dist}}$ is much more robust than $C_{jj',\mathrm{angle}}$ in the case of subspaces of different dimensions. We attribute this phenomenon to the fact that, in the absence of noise, it is always the case that $C_{jj',\mathrm{dist}} = 1$ whenever $\mathbf{x}_j, \mathbf{x}_{j'}$ lie in the same

$^{13}$We will henceforth be assuming that all points $\mathbf{x}_1, \ldots, \mathbf{x}_N$ are normalized to unit $\ell_2$-norm.


Algorithm 4.4 Filtrated Spectral Algebraic Subspace Clustering (FSASC)
 1: procedure FSASC($\mathcal{X}$, $D$, $n$, $L$, $\{\gamma_m\}_{m=1}^{M}$)
 2:   if $N < \mathcal{M}_n(D)$ then
 3:     return ('Not enough points');
 4:   else
 5:     eigengap $\leftarrow 0$; $C^* \leftarrow 0_{N \times N}$;
 6:     $\mathbf{x}_j \leftarrow \mathbf{x}_j / \|\mathbf{x}_j\|$, $\forall j \in [N]$;
 7:     $p \leftarrow$ VANISHING($\nu_n(\mathcal{X})$);
 8:     $\beta \leftarrow \frac{1}{N}\sum_{j=1}^{N} \big\langle \mathbf{x}_j,\, \nabla p|_{\mathbf{x}_j} / \|\nabla p|_{\mathbf{x}_j}\| \big\rangle$;
 9:     for $k = 1 : M$ do
10:       $\delta \leftarrow \beta \cdot \gamma_k$; $C \leftarrow 0_{N \times N}$;
11:       for $j = 1 : N$ do
12:         $C_{j,:} \leftarrow$ FILTRATION($\mathcal{X}$, $\mathbf{x}_j$, $p$, $L$, $\delta$, $n$);
13:       end for
14:       $\{\lambda_s\}_{s=1}^{N} \leftarrow$ SPECTRUM(NL($C + C^\top$));
15:       if (eigengap $< \lambda_{n+1} - \lambda_n$) then
16:         eigengap $\leftarrow \lambda_{n+1} - \lambda_n$; $C^* \leftarrow C$;
17:       end if
18:     end for
19:     $\{\mathcal{Y}_i\}_{i=1}^{n} \leftarrow$ SPECCLUST($C^* + (C^*)^\top$, $n$);
20:     return $\{\mathcal{Y}_i\}_{i=1}^{n}$;
21:   end if
22: end procedure

23: function FILTRATION($\mathcal{X}$, $\mathbf{x}$, $p$, $L$, $\delta$, $n$)
24:   $d \leftarrow D$; $\mathcal{J} \leftarrow [N]$; $q \leftarrow p$; $\mathbf{c} \leftarrow 0_{1 \times N}$;
25:   flag $\leftarrow 1$;
26:   while ($d > 1$) and (flag $= 1$) do
27:     $\mathcal{H} \leftarrow \langle \nabla q|_{\mathbf{x}} \rangle^\perp$; $\pi \leftarrow [\mathbb{R}^d \rightarrow \mathcal{H} \xrightarrow{\sim} \mathbb{R}^{d-1}]$;
28:     if $(\|\mathbf{x}\| - \|\pi(\mathbf{x})\|)/\|\mathbf{x}\| > \delta$ then
29:       if $d = D$ then
30:         $\mathbf{c}(j') \leftarrow \|\pi(\mathbf{x}_{j'})\|$, $\forall j' \in [N]$;
31:       end if
32:       flag $\leftarrow 0$;
33:     else
34:       $\mathcal{J} \leftarrow \big\{ j' \in [N] : \big(\|\mathbf{x}_{j'}\| - \|\pi(\mathbf{x}_{j'})\|\big)/\|\mathbf{x}_{j'}\| \leq \delta \big\}$;
35:       if $|\mathcal{J}| < L$ then
36:         flag $\leftarrow 0$;
37:       else
38:         $\mathbf{c}(j') \leftarrow \|\pi(\mathbf{x}_{j'})\|$, $\forall j' \in \mathcal{J}$;
39:         $\mathbf{c}(j') \leftarrow 0$, $\forall j' \in [N] - \mathcal{J}$;
40:         if $|\mathcal{J}| < \mathcal{M}_n(d)$ then
41:           flag $\leftarrow 0$;
42:         else
43:           $d \leftarrow d - 1$; $\mathbf{x} \leftarrow \pi(\mathbf{x})$;
44:           $\mathbf{x}_{j'} \leftarrow \pi(\mathbf{x}_{j'})$, $\forall j' \in \mathcal{J}$;
45:           $\mathcal{X} \leftarrow \{\mathbf{x}_{j'} : j' \in \mathcal{J}\}$;
46:           $q \leftarrow$ VANISHING($\nu_n(\mathcal{X})$);
47:         end if
48:       end if
49:     end if
50:   end while
51:   return ($\mathbf{c}$);
52: end function

irreducible component; as mentioned in Section 4.1.6, this need not be the case for $C_{jj',\mathrm{angle}}$.

We will be referring to the Spectral ASC method that uses affinity (4.32) as SASC-D.
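A minimal MATLAB sketch of the SASC-D affinity computation is given below; it assumes unit-norm data columns and precomputed gradients, and is only an illustration of eq. (4.32), not the code used for the experiments.

    function C = sasc_d_affinity(X, G)
    % X: D x N unit-norm data points; G(:,j): gradient of the fitted vanishing
    % polynomial at x_j. Returns the distance-based affinity of eq. (4.32).
        N = size(X, 2);
        C = zeros(N, N);
        for j = 1:N
            g = G(:, j) / norm(G(:, j));      % unit normal of the hyperplane at x_j
            C(j, :) = 1 - (X' * g)';          % eq. (4.32) for all j'
        end
    end

The resulting C can then be symmetrized and fed to spectral clustering exactly as in the FSASC sketch above.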

4.3.5 Discussion on the computational complexity

As mentioned in § 4.1, the main object that needs to be computed in algebraic subspace clustering is a vanishing polynomial $p$ in $D$ variables of degree $n$, where $D$ is the ambient dimension of the data and $n$ is the number of subspaces. This amounts to computing a right null-vector of the $N \times \mathcal{M}_n(D)$ embedded data matrix $\nu_n(\mathcal{X})$, where $\mathcal{M}_n(D) := \binom{n+D-1}{n}$ and $N \geq \mathcal{M}_n(D)$. In practice, the data are noisy and there are usually no vanishing polynomials of degree $n$; instead one needs to compute the right singular vector of the embedded data matrix that corresponds to the smallest singular value. Approximate iterative methods for performing this task do exist [63, 82, 117], and in this work we use the MATLAB function svds.m, which is based on an inverse-shift iteration technique; see, e.g., the introduction of [63]. Even though svds.m is in principle more efficient than computing the full SVD of $\nu_n(\mathcal{X})$ via the MATLAB function svd.m, the complexity of both functions is of the same order
$$N\,\mathcal{M}_n(D)^2 = N \binom{n+D-1}{n}^2, \tag{4.33}$$
which is the well-known complexity of SVD [39] adapted to the dimensions of $\nu_n(\mathcal{X})$. This is because svds.m requires at each iteration the solution to a linear system of equations whose coefficient matrix has size of the same order as the size of $\nu_n(\mathcal{X})$.

Evidently, the complexity of (4.33) is prohibitive for large $D$ even for moderate values of $n$. If we discount the spectral clustering step, this is precisely the complexity of SASC-A of § 4.1.6 as well as of SASC-D of § 4.3.4. On the other hand, FSASC (Algorithm 4.4) is even more computationally demanding, as it requires the computation of a vanishing polynomial at each step of every filtration, and there are as many filtrations as the total number of points. Assuming for simplicity that there is no noise and that the dimensions of all subspaces are equal to $d < D$, then the complexity of a single filtration in FSASC is of the order of
$$\sum_{i=0}^{D-d} N\,\big(\mathcal{M}_n(D-i)\big)^2 = N \sum_{i=0}^{D-d} \binom{n+D-i-1}{n}^2. \tag{4.34}$$
Since FSASC computes a filtration for each and every point, its total complexity (discounting the spectral clustering step and assuming that we are using a single value for the parameter $\gamma$) is
$$N \sum_{i=0}^{D-d} N\,\big(\mathcal{M}_n(D-i)\big)^2 = N^2 \sum_{i=0}^{D-d} \binom{n+D-i-1}{n}^2. \tag{4.35}$$
Even though the filtrations are independent of each other, and hence fully parallelizable, the

complexity of FSASC is still prohibitive for large scale applications even after paralleliza-

tion. Nevertheless, when the subspace dimensions are small, then FSASC is applicable

after one reduces the dimensionality of the data by means of a projection, as will be done

in § 4.4.2. In any case, we hope that the complexity issue of FSASC will be addressed in future research.
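For concreteness, the back-of-the-envelope MATLAB snippet below evaluates the estimates (4.33) and (4.35) for example sizes; the chosen values of N, D, n, d are illustrative assumptions, not taken from the experiments.

    N = 600; D = 9; n = 3; d = 3;          % example problem sizes (assumed)
    Mn = @(D) nchoosek(n + D - 1, n);      % M_n(D): number of degree-n monomials in D variables
    cost_single = N * Mn(D)^2;             % eq. (4.33): one SVD of nu_n(X)
    cost_fsasc  = N^2 * sum(arrayfun(@(i) Mn(D - i)^2, 0:(D - d)));   % eq. (4.35)
    fprintf('single vanishing polynomial: %.2e,  FSASC total: %.2e\n', ...
            cost_single, cost_fsasc);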

4.4 Experiments

In this section we evaluate experimentally the proposed methods FSASC (Algorithm 4.4) and SASC-D (§ 4.3.4) and compare them to other state-of-the-art subspace clustering methods, using synthetic data (§ 4.4.1), as well as real motion segmentation data (§ 4.4.2).


4.4.1 Experiments on synthetic data

We begin by randomly generating $n = 3$ subspaces of various dimension configurations $(d_1, d_2, d_3)$ in $\mathbb{R}^9$. The choice $D = 9$ for the ambient dimension is motivated by applications in two-view geometry [47, 113]. Once the subspaces are randomly generated, we use a zero-mean unit-variance Gaussian distribution with support on each subspace to randomly sample $N_i = 200$ points per subspace. The points of each subspace are then corrupted by additive zero-mean Gaussian noise with standard deviation $\sigma \in \{0, 0.01, 0.03, 0.05\}$ and support in the orthogonal complement of the subspace. All data points are subsequently normalized to have unit euclidean norm.
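A minimal MATLAB sketch of this data-generation protocol is shown below; it is an illustration of the stated procedure, and the variable names and the specific dimension configuration are assumptions.

    D = 9; n = 3; dims = [2 3 4]; Ni = 200; sigma = 0.01;   % one example configuration
    X = []; labels = [];
    for i = 1:n
        U  = orth(randn(D, dims(i)));            % random d_i-dimensional subspace basis
        Y  = U * randn(dims(i), Ni);             % zero-mean unit-variance samples on the subspace
        Nc = null(U');                           % basis of the orthogonal complement
        E  = Nc * (sigma * randn(D - dims(i), Ni));   % noise supported in the orthogonal complement
        Xi = Y + E;
        Xi = Xi ./ sqrt(sum(Xi.^2, 1));          % normalize each point to unit euclidean norm
        X = [X, Xi];  labels = [labels, i * ones(1, Ni)];
    end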

Using data as above, we compare the proposed methods FSASC (Algorithm 4.4) and SASC-D (§ 4.3.4) to the state-of-the-art SASC-A (§ 4.1.6) from algebraic subspace clustering methods, as well as to state-of-the-art self-expressiveness-based methods, such as Sparse Subspace Clustering (SSC) [30], Low-Rank Representation (LRR) [64, 66], Low-Rank Subspace Clustering (LRSC) [106] and Least-Squares Regression subspace clustering (LSR) [69]. For FSASC we use $L = 10$ and $\gamma = 0.1$. For SSC we use the Lasso version with $\alpha_z = 20$, where $\alpha_z$ is defined above equation (14) in [30], and $\rho = 0.7$, where $\rho$ is the thresholding parameter of the SSC affinity (see MATLAB function thrC.m provided by the authors of [30]). For LRR we use the ADMM version provided by its first author with $\lambda = 4$ in equation (7) of [65]. For LRSC we use the ADMM method proposed by its authors with $\tau = 420$ and $\alpha = 4000$, where $\alpha$ and $\tau$ are defined at problem (P) of


Table 4.1: Mean subspace clustering error in % over 100 independent trials for synthetic data randomly generated in three random subspaces of $\mathbb{R}^9$ of dimensions $(d_1, d_2, d_3)$. There are 200 points associated to each subspace, which are corrupted by zero-mean additive white noise of standard deviation $\sigma = 0, 0.01$ and support in the orthogonal complement of the subspace.

method   (2,3,4) (4,5,6) (6,7,8) (2,5,8) (3,3,3) (6,6,6) (7,7,7) (8,8,8)

σ = 0
FSASC       0      0      0      0      0      0      0      0
SASC-D      0      0      0      0      0      0      0      0
SASC-A     42     39      6     14     37     24     12      0
SSC         0      1     18     49      0      3     14     55
LRR         0      3     39      5      0      9     42     51
LRR-H       0      3     36      6      0      8     38     51
LRSC        0      3     39      5      0      9     42     51
LSR         0      3     39      5      0      9     42     51
LSR-H       0      3     32      6      0      8     38     51

σ = 0.01
FSASC       0      0      0      1      0      0      0      5
SASC-D      0      0      1      1      0      0      0      3
SASC-A     54     45      8     24     57     36     13      3
SSC         2      2     18     49      0      3     13     55
LRR         0      3     38      5      0      9     42     51
LRR-H       0      3     36      7      0      8     38     51
LRSC        0      3     38      5      0      9     42     51
LSR         0      3     39      5      0      9     42     51
LSR-H       0      3     32      6      0      8     38     51


Table 4.2: Mean subspace clustering error in % over 100 independent trials for synthetic data randomly generated in three random subspaces of $\mathbb{R}^9$ of dimensions $(d_1, d_2, d_3)$. There are 200 points associated to each subspace, which are corrupted by zero-mean additive white noise of standard deviation $\sigma = 0.03, 0.05$ and support in the orthogonal complement of the subspace.

method   (2,3,4) (4,5,6) (6,7,8) (2,5,8) (3,3,3) (6,6,6) (7,7,7) (8,8,8)

σ = 0.03
FSASC       0      0      1      2      0      0      1     10
SASC-D      0      0      4      3      0      1      2      6
SASC-A     57     46     13     31     58     37     15      7
SSC         0      1     20     48      0      3     13     55
LRR         0      3     39      5      0      9     42     51
LRR-H       0      3     36      8      0      8     38     51
LRSC        0      3     39      5      0     10     42     51
LSR         0      3     39      5      0     10     42     51
LSR-H       0      3     32      6      0      8     37     51

σ = 0.05
FSASC       1      0      2      3      1      0      2     14
SASC-D      1      1      7      5      1      2      5     10
SASC-A     58     46     17     36     60     39     17     11
SSC         0      2     20     49      0      3     15     55
LRR         1      3     39      6      0     10     42     51
LRR-H       1      3     36     13      0      8     38     52
LRSC        1      3     39      6      0     10     42     51
LSR         1      3     39      6      0     10     42     51
LSR-H       1      3     32      7      0      8     38     51


Table 4.3: Mean intra-cluster connectivity over 100 independent trials for synthetic data randomly generated in three random subspaces of $\mathbb{R}^9$ of dimensions $(d_1, d_2, d_3)$. There are 200 points associated to each subspace, which are corrupted by zero-mean additive white noise of standard deviation $\sigma$ and support in the orthogonal complement of each subspace.

method   (2,3,4)  (4,5,6)  (6,7,8)  (2,5,8)  (3,3,3)  (6,6,6)  (7,7,7)  (8,8,8)

σ = 0
FSASC       1        1        1        1        1        1        1        1
SASC-D      1        1        1        1        1        1        1        1
SASC-A    0.37     0.37     0.37     0.39     0.34     0.41     0.37       1
SSC      1e-3      0.01     1e-4     1e-3     0.01     0.02     1e-3     1e-7
LRR      0.59      0.37     0.43     0.31     0.64     0.41     0.45     0.50
LRR-H    0.28      0.23     0.23     0.19     0.31     0.24     0.24     0.26
LRSC     0.59      0.37     0.43     0.31     0.64     0.41     0.45     0.50
LSR      0.59      0.37     0.42     0.31     0.64     0.41     0.45     0.50
LSR-H    0.28      0.24     0.24     0.21     0.31     0.25     0.25     0.27

σ = 0.01
FSASC    0.05      0.35     0.43     0.10     0.09     0.43     0.42     0.43
SASC-D   0.91      0.93     0.85     0.84     0.94     0.91     0.87     0.85
SASC-A   0.32      0.30     0.12     0.14     0.30     0.29     0.24     0.07
SSC      1e-3      0.01     1e-4     1e-3     0.01     0.02     1e-3     1e-7
LRR      0.42      0.37     0.43     0.31     0.51     0.41     0.45     0.50
LRR-H    0.13      0.23     0.23     0.17     0.22     0.24     0.24     0.26
LRSC     0.42      0.37     0.43     0.31     0.52     0.41     0.45     0.50
LSR      0.41      0.37     0.42     0.31     0.51     0.41     0.45     0.50
LSR-H    0.11      0.24     0.24     0.18     0.21     0.25     0.25     0.27


Table 4.4: Mean inter-cluster connectivity in % over 100 independent trials for synthetic data randomly generated in three random subspaces of $\mathbb{R}^9$ of dimensions $(d_1, d_2, d_3)$. There are 200 points associated to each subspace, which are corrupted by zero-mean additive white noise of standard deviation $\sigma$ and support in the orthogonal complement of each subspace.

method   (2,3,4) (4,5,6) (6,7,8) (2,5,8) (3,3,3) (6,6,6) (7,7,7) (8,8,8)

σ = 0
FSASC       0      0      1      1      0      0      0      2
SASC-D     60     60     60     60     60     60     60     60
SASC-A     55     55     38     43     55     50     42     35
SSC         0      2     22      2      0      7     23     46
LRR         1     49     60     45      0     55     60     63
LRR-H       0     18     43      9      0     32     44     55
LRSC        2     49     60     45      2     55     60     63
LSR         2     49     60     43      2     56     60     64
LSR-H       0     11     24      6      0     19     25     30

σ = 0.01
FSASC       2      4     22     18      2      6     15     35
SASC-D     62     61     60     61     62     60     60     60
SASC-A     63     58     46     51     64     55     47     39
SSC       0.1      1     23      3    0.1      7     23     46
LRR        17     49     60     45     16     55     60     63
LRR-H       1     18     43      9      1     32     44     55
LRSC       17     49     60     45     16     55     60     63
LSR        17     49     60     46     16     55     60     64
LSR-H     0.1     11     24      6    0.1     19     25     30


Table 4.5: Mean running time of each method in seconds over 100 independent trials for synthetic data randomly generated in three random subspaces of $\mathbb{R}^9$ of dimensions $(d_1, d_2, d_3)$. There are 200 points associated to each subspace, which are corrupted by zero-mean additive white noise of standard deviation $\sigma = 0.01$ and support in the orthogonal complement of each subspace. The reported running time is the time required to compute the affinity matrix, and it does not include the spectral clustering step. The experiment is run in MATLAB on a standard Macbook-Pro with a dual core 2.5GHz Processor and a total of 4GB Cache memory.

method   (2,3,4)  (4,5,6)  (6,7,8)  (2,5,8)  (3,3,3)  (6,6,6)  (7,7,7)  (8,8,8)

σ = 0.01
FSASC    13.57    12.11     8.34    13.90    13.69    10.67     8.55     6.01
SASC-D    0.03     0.03     0.03     0.03     0.03     0.03     0.03     0.03
SASC-A    0.03     0.03     0.03     0.03     0.03     0.03     0.03     0.03
SSC       5.01     4.84     5.06     6.59     4.90     4.71     4.80     5.03
LRR       0.54     0.36     0.34     0.45     0.53     0.34     0.34     0.34
LRR-H     0.65     0.48     0.45     0.61     0.65     0.46     0.46     0.45
LRSC      0.01     0.01     0.01     0.01     0.01     0.01     0.01     0.01
LSR       0.05     0.05     0.05     0.07     0.05     0.05     0.05     0.05
LSR-H     0.25     0.25     0.24     0.32     0.24     0.24     0.24     0.24


page 2 in [106]. Finally, for LSR we use equation (16) in [69] with λ = 0.0048. For both

LRR and LSR we also report results with the heuristic post-processing of the affinity matrix proposed by the first author of [65] in their MATLAB function lrr_motion_seg.m; we denote these versions of LRR and LSR by LRR-H and LSR-H respectively.

Notice that all compared methods are spectral methods, i.e., they produce a pairwise affinity matrix C upon which spectral clustering is applied. To evaluate the quality of the produced affinity, besides reporting the standard subspace clustering error, which is the percentage of misclassified points, we also report the intra-cluster and inter-cluster con-

nectivities of the affinity matrices C. As an intra-cluster connectivity we use the minimum

algebraic connectivity among the subgraphs corresponding to the ground truth clusters.

The algebraic connectivity of a subgraph is the second smallest eigenvalue of its normal-

ized Laplacian, and measures how well connected the graph is. In particular, values close to

1 indicate that the subgraph is indeed well-connected (single connected component), while

values close to 0 indicate that the subgraph tends to split to at least two connected compo-

nents. Clearly, from a clustering point of view, the latter situation is undesirable, since it

may lead to over-segmentation. Finally, as inter-cluster connectivity we use the percentage

of the $\ell_1$-norm of the affinity matrix $C$ that corresponds to erroneous connections, i.e., the quantity $\sum_{\mathbf{x}_j \in \mathcal{S}_i,\, \mathbf{x}_{j'} \in \mathcal{S}_{i'},\, i \neq i'} |C_{jj'}| \,/\, \|C\|_1$. The smaller the inter-cluster connectivity is, the fewer erroneous connections the affinity contains. To summarize, a high-quality affinity

matrix is characterized by high intra-cluster and low inter-cluster connectivity, which is

then expected to lead to small spectral clustering error.


Tables 4.1-4.4 show the clustering error, and the intra-cluster and inter-cluster connectivities associated with each method, averaged over 100 independent experiments. Inspection of Table 4.1 reveals that, in the absence of noise ($\sigma = 0$), FSASC gives exactly zero error across all dimension configurations. This is in agreement with the theoretical results of § 4.2.2, which guarantee that, in the absence of noise, the only points that survive the filtration associated with some reference point are precisely the points lying in the same subspace as the reference point. Indeed, notice that in Table 4.3 and for $\sigma = 0$ the connectivity attains its maximum value 1, indicating that the subgraphs corresponding to the ground truth clusters are fully connected. Moreover, in Table 4.4 we see that for $\sigma = 0$ the erroneous connections are either zero or negligible. This practically means that each point is connected to each and every other point from the same subspace, while not connected to any other points, which is the ideal structure that an affinity matrix should have.

Remarkably, the proposed SASC-D, which is much simpler than FSASC, also gives zero error for zero noise. Table 4.3 shows that SASC-D achieves perfect intra-cluster connectivity, while Table 4.4 shows that the inter-cluster connectivity associated with SASC-D is very large. This is clearly an undesirable feature, which nevertheless seems not to be affecting the clustering error in this experiment, perhaps because the intra-cluster connectivity is very high. As we will see later (§ 4.4.2), the situation is different for real data, for which SASC-D performs worse than FSASC.

Going back to Table 4.1 and σ = 0, we see that the improvement in performance of the proposed FSASC and SASC-D over the existing SASC-A is dramatic: indeed, SASC-A


succeeds only in the case of hyperplanes, i.e., when d1 = d2 = d3 =8. This is theoretically

expected, since in the case of hyperplanes there is only one normal direction per subspace,

and the gradient of the vanishing polynomial at a point in the hyperplane is guaranteed to

recover this direction. However, when the subspaces have lower-dimensions, as is the case,

e.g., for the dimension configuration (4, 5, 6), then there are infinitely many orthogonal

directions to each subspace. Hence a priori, the gradient of a vanishing polynomial may

recover any such direction, and such directions could be dramatically different even for

points in the same subspace (e.g., they could be orthogonal), thus leading to a clustering

error of 39%.

As far as the rest of the self-expressiveness methods are concerned, Table 4.1 (σ = 0)

shows what we expect: the methods give a perfect clustering when the subspace dimensions

are small, e.g., for dimension configurations (2, 3, 4) and (3, 3, 3), they start to degrade as

the subspace dimensions increase ((4, 5, 6), (6, 6, 6)), and eventually they fail when the

subspace dimensions become large enough ((6, 7, 8),(7, 7, 7),(8, 8, 8)). To examine the ef-

fect of the subspace dimension on the connectivity, let us consider SSC and the dimension

configurations (2, 3, 4) and (2, 5, 8): Table 4.3 (σ = 0) shows that for both of these con-

figurations the intra-cluster connectivity has a small value of $10^{-3}$. This is expected, since

SSC computes sparse affinities and it is known to produce weakly connected clusters. Now,

Table 4.4 (σ =0) shows that the inter-cluster connectivity of SSC for (2, 3, 4) is zero, i.e.,

there are no erroneous connections, and so, even though the intra-cluster connectivity is as

small as $10^{-3}$, spectral clustering can still give a zero clustering error. On the other hand,


Table 4.6: Mean subspace clustering error in % over 100 independent trials for synthetic data randomly generated in four random subspaces of $\mathbb{R}^9$ of dimensions (8, 8, 5, 3). There are 200 points associated to each subspace, which are corrupted by zero-mean additive white noise of standard deviation $\sigma = 0, 0.01, 0.03, 0.05$ and support in the orthogonal complement of each subspace.

method / σ      0      0.01     0.03     0.05

FSASC           0      2.19     5.08     7.65
SASC-D      22.88     17.83    15.93    17.44
SASC-A      22.88     27.21    31.43    36.36
SSC         64.39     64.17    64.36    64.13
LRR         42.86     42.88    43.04    42.91
LRR-H       42.08     42.06    42.23    42.21
LRSC        42.85     42.88    43.05    42.90
LSR         42.84     42.85    43.00    42.93
LSR-H       38.72     38.74    38.96    39.86

for the case (2, 5, 8) the inter-cluster connectivity is 2%, which, even though small, when coupled with the small intra-cluster connectivity of $10^{-3}$, leads to a spectral clustering error of 49%. Finally, notice that for the case of (8, 8, 8) the intra-cluster connectivity is $10^{-7}$ and the inter-cluster connectivity is 46%, indicating that the quality of the produced affinity is very poor, thus explaining the corresponding clustering error of 55%.

When the data are corrupted by noise (σ =0.01, 0.03, 0.05), the rest of the Tables 4.1-

4.4 show that FSASC is the best method, with the exception of the case of hyperplanes. In this latter case, i.e., when d1 = d2 = d3 =8, the best method is SASC-D with a clustering


error of 6% when $\sigma = 0.03$, as opposed to 10% for FSASC. This is expected, since for the case of codimension-1 subspaces the length of each filtration should be precisely 1, since in theory the length of the filtration is equal to the codimension of the subspace associated to the reference point. Since FSASC automatically determines this length based on the data and the value of the parameter $\gamma$, it is expected that when the data are noisy, errors will be made in the estimation of the filtration length. On the other hand, SASC-D is equivalent to FSASC with an a priori configured filtration length equal to 1, thus performing better than FSASC. Certainly, giving as input to FSASC more than one value for $\gamma$, as shown in Algorithm 4.4, is expected to address this issue, but also to increase the running time of

FSASC (see Table 4.5 for average running times of the methods in the current experiment).

We conclude this section by demonstrating the interesting property of FSASC of being able to give the correct clustering by using vanishing polynomials of degree strictly less than the true number of subspaces. Towards that end, we consider a similar situation as above, except that now we have n = 4 subspaces of dimensions (8, 8, 5, 3). Contrary to

SASC-D and SASC-A, for which the theory requires degree-4 polynomials, FSASC is still applicable if one works with polynomials of degree 3: the crucial observation is that for the dimension configuration (8, 8, 5, 3), the corresponding subspace arrangement always admits vanishing polynomials of degree 3, and the same is true for every intermediate arrangement occurring in a filtration. For example, if one lets b1 be a normal vector to one of the 8-dimensional subspaces, and b2 a normal vector to the other, and b3 a normal vector to the 8-dimensional subspace spanned by both the 5-dimensional and 3-dimensional


subspace, then the polynomial $p(x) = (\mathbf{b}_1^\top x)(\mathbf{b}_2^\top x)(\mathbf{b}_3^\top x)$ has degree 3 and vanishes on the

entire arrangement of the four subspaces. Interestingly, Table 4.6 shows that FSASC gives

zero error in the absence of noise and 7.65% error for the worst case σ = 0.05, while

all other methods fail. In particular, the other two algebraic methods, i.e., SASC-D and

SASC-A, are not able to cluster the data using a single vanishing polynomial of degree 3.

4.4.2 Experiments on real motion sequences

We evaluate different methods on the Hopkins155 motion segmentation data set [94], which

contains 155 videos of n =2, 3 moving objects, each one with N = 100-500 feature point

trajectories of dimension D = 56-80. While SSC, LRR, LRSC and LSR can operate di-

rectly on the raw data, algebraic methods require $\mathcal{M}_n(D) \leq N$. Hence, for algebraic methods, we project the raw data onto the subspace spanned by their $D$ principal components, where $D$ is the largest integer $\leq 8$ such that $\mathcal{M}_n(D) \leq N$, and then normalize each point to have unit norm. We apply SSC to i) the raw data (SSC-raw) and ii) the raw points projected onto their first 8 principal components and normalized to unit norm (SSC-proj). For FSASC we use $L = 10$ and $\gamma = 0.001, 0.005, 0.01, 0.05, 0.1, 0.5, 1, 5, 10$. LRR, LRSC and LSR use the same parameters as in § 4.4.1, while for SSC the parameters are $\alpha = 800$ and $\rho = 0.7$.

The clustering errors and the intra/inter-cluster connectivities are reported in Table 4.7 and Fig. 4.4. Notice the clustering errors of about 5% and 37% for SASC-A for two and three motions respectively. Notice how replacing the angle-based with the distance-based


Table 4.7: Mean clustering error (E) in %, intra-cluster connectivity (C1), and inter-cluster connectivity (C2) in % for the Hopkins155 data set.

                2 motions            3 motions            all motions
method        E     C1    C2       E     C1    C2       E     C1    C2

FSASC       0.80  0.18     4     2.48  0.10    10     1.18  0.16     5
SASC-D      5.65  0.82    26     14.0  0.80    46     7.59  0.81    31
SASC-A      4.99  0.35     5     36.8  0.09    35     12.2  0.29    12
SSC-raw     1.53  0.05     2     4.40  0.04     3     2.18  0.05     2
SSC-proj    5.87  0.04     3     5.70  0.03     3     5.83  0.03     3
LRR         4.26  0.25    19     7.78  0.25    28     5.05  0.25    21
LRR-H       2.25  0.05     2     3.40  0.04     3     2.51  0.05     2
LRSC        3.38  0.25    19     7.42  0.24    28     4.29  0.25    21
LSR         3.60  0.24    18     7.77  0.23    28     4.54  0.23    21
LSR-H       2.73  0.04     1     2.60  0.03     2     2.70  0.04     1

affinity, SASC-D already gives errors of around 5.5% and 14%. But most dramatically,

notice how FSASC further reduces those errors to 0.8% and 2.48%. Moreover, even though

the dimensions of the subspaces ($d_i \in \{1, 2, 3, 4\}$ for motion segmentation) are low relative to the ambient space dimension ($D = 56$-$80$) - a case that is specifically suited for SSC, LRR, LRSC, LSR - projecting the data to $D \leq 8$, which makes the subspace dimensions comparable to the ambient dimension, is sufficient for FSASC to get superior performance relative to the best performing algorithms on Hopkins 155. We believe that this is because, overall, FSASC produces a much higher intra-cluster connectivity, without increasing the inter-cluster connectivity too much.


[Figure 4.4 plots, for each method, the sorted clustering error rates over the Hopkins155 sequences (x-axis: sequence index, 90-150; y-axis: clustering error rate, 0-0.6), with mean errors: FSASC (1.18%), SASC-D (7.59%), SASC-A (12.2%), SSC-raw (2.18%), SSC-proj (5.83%), LRR (5.05%), LRR-H (2.51%), LRSC (4.29%), LSR (4.54%), LSR-H (2.70%).]

Figure 4.4: Clustering error ratios for both 2 and 3 motions in Hopkins155, ordered increasingly for each method. Errors start from the 90-th smallest error of each method.

4.5 Algebraic clustering of affine subspaces

4.5.1 Motivation

In several important applications, such as motion segmentation, the underlying subspaces do not pass through the origin, i.e., they are affine. Subspace clustering methods such as K-subspaces [9, 102] and mixtures of probabilistic PCA [93] can trivially handle this case. Likewise, the spectral clustering method of [122] can handle affine subspaces by constructing an affinity that depends on the distance from a point to a subspace. However, these methods do not come with theoretical conditions under which they are guaranteed to


give the correct clustering.

One existing work that comes with theoretical guarantees, albeit for a very restricted

class of unions of affine subspaces, is Sparse Subspace Clustering (SSC) [27, 28, 30].

Specifically, [27] exploits the fact that after embedding the data $\{\mathbf{x}_1, \ldots, \mathbf{x}_N\} \subset \mathbb{R}^D$ into homogeneous coordinates
$$\begin{bmatrix} 1 & 1 & \cdots & 1 \\ \mathbf{x}_1 & \mathbf{x}_2 & \cdots & \mathbf{x}_N \end{bmatrix}, \tag{4.36}$$
the embedded points live in a union of linear subspaces (see § 4.5.2 for details). The work of [27] shows that when the linear subspaces are independent, the sparse representation of the embedded points produced by SSC is subspace preserving, i.e., points from different subspaces lie in distinct connected components of the affinity graph. Even so, this is not enough to guarantee the correct clustering, since the intra-cluster connectivity could be weak, which could lead to oversegmentation [76].

Returning to ASC, the traditional way to handle points from a union of affine subspaces

(see [104] for details) is to use homogeneous coordinates as in (4.36), and subsequently apply ASC to the embedded data. We will refer to this two-step approach as Affine ASC (AASC). Although AASC has been observed to perform well in practice, it lacks a sufficient theoretical justification. On one hand, while it is true that the embedded points live in a union of associated linear subspaces, it is obvious that they have a very particular structure inside these subspaces. In particular, even if the original points are generic, in the sense that they are randomly sampled from the affine subspaces, the embedded points are


clearly non-generic, in the sense that they always lie in the zero-measure intersection of the

union of the associated linear subspaces with the hyperplane x0 =1.

Thus, even in the absence of noise, one may wonder whether this non-genericity of the

embedded points will affect the behavior of AASC and to what extent. On the other hand,

even if the affine subspaces are transversal, there is no guarantee that the associated linear

subspaces are also transversal. Thus, it is natural to ask for conditions on the affine sub-

spaces and the data points under which AASC is guaranteed to give the correct clustering.

4.5.2 Problem statement and traditional approach

In this section we define the problem of clustering unions of affine subspaces, and analyze

the traditional algebraic approach, whose correctness is far from obvious.

Let $\mathcal{X} = \{\mathbf{x}_1, \ldots, \mathbf{x}_N\}$ be a finite set of points living in a union $\Psi = \bigcup_{i=1}^n \mathcal{A}_i$ of $n$ affine subspaces of $\mathbb{R}^D$, where for simplicity we assume that $n$ is known. Each affine subspace $\mathcal{A}_i$ is the translation by some vector $\boldsymbol{\mu}_i \in \mathbb{R}^D$ of a $d_i$-dimensional linear subspace $\mathcal{S}_i$, i.e., $\mathcal{A}_i = \mathcal{S}_i + \boldsymbol{\mu}_i$. The affine subspace clustering problem involves clustering the points $\mathcal{X}$ according to their subspace membership, and finding a parametrization of each affine subspace $\mathcal{A}_i$ by finding a translation vector $\boldsymbol{\mu}_i$ and a basis for its linear part $\mathcal{S}_i$, for all $i = 1, \ldots, n$. Note that there is an inherent ambiguity in determining the translation vectors $\boldsymbol{\mu}_i$, since if $\mathcal{A}_i = \mathcal{S}_i + \boldsymbol{\mu}_i$, then $\mathcal{A}_i = \mathcal{S}_i + (\mathbf{s}_i + \boldsymbol{\mu}_i)$ for any vector $\mathbf{s}_i \in \mathcal{S}_i$. Consequently, the best we can hope for is to determine the unique component of $\boldsymbol{\mu}_i$ in the orthogonal complement $\mathcal{S}_i^\perp$ of $\mathcal{S}_i$.


Since the inception of ASC, the standard algebraic approach to cluster points living in a

union of affine subspaces has been to embed the points into $\mathbb{R}^{D+1}$ and subsequently apply ASC [104]. The precise embedding $\phi_0 : \mathbb{R}^D \hookrightarrow \mathbb{R}^{D+1}$ is given by
$$\alpha = (\alpha_1, \ldots, \alpha_D) \overset{\phi_0}{\longmapsto} \tilde{\alpha} = (1, \alpha_1, \ldots, \alpha_D). \tag{4.37}$$
To understand the effect of this embedding and why it is meaningful to apply ASC to the embedded points, let $\mathcal{A} = \mathcal{S} + \boldsymbol{\mu}$ be a $d$-dimensional affine subspace of $\mathbb{R}^D$, with $\mathbf{u}_1, \ldots, \mathbf{u}_d$ being a basis for its linear part $\mathcal{S}$. As noted earlier, we can also assume that $\boldsymbol{\mu} \in \mathcal{S}^\perp$. For $\mathbf{x} \in \mathcal{A}$, there exists $\mathbf{y} \in \mathbb{R}^d$ such that
$$\mathbf{x} = U\mathbf{y} + \boldsymbol{\mu}, \qquad U := [\mathbf{u}_1, \ldots, \mathbf{u}_d] \in \mathbb{R}^{D \times d}. \tag{4.38}$$
Then the embedded point $\tilde{\mathbf{x}} := \phi_0(\mathbf{x})$ can be written as
$$\tilde{\mathbf{x}} = \begin{bmatrix} 1 \\ \mathbf{x} \end{bmatrix} = \tilde{U}\begin{bmatrix} 1 \\ \mathbf{y} \end{bmatrix}, \qquad \tilde{U} := \begin{bmatrix} 1 & 0 & \cdots & 0 \\ \boldsymbol{\mu} & \mathbf{u}_1 & \cdots & \mathbf{u}_d \end{bmatrix}. \tag{4.39}$$
Equation (4.39) clearly indicates that the embedded point $\tilde{\mathbf{x}}$ lies in the linear $(d+1)$-dimensional subspace $\tilde{\mathcal{S}} := \mathrm{Span}(\tilde{U})$ of $\mathbb{R}^{D+1}$, and the same is true for the entire affine subspace $\mathcal{A}$. From (4.39) one sees immediately that $(\mathbf{u}_1, \ldots, \mathbf{u}_d, \boldsymbol{\mu})$ can be used to construct a basis of $\tilde{\mathcal{S}}$. The converse is also true: given any basis of $\tilde{\mathcal{S}}$ one can recover a basis for the linear part $\mathcal{S}$ and the translation vector $\boldsymbol{\mu}$ of $\mathcal{A}$. Hence, the embedding $\phi_0$$^{14}$ takes a union of affine subspaces $\Psi = \bigcup_{i=1}^n \mathcal{A}_i$ into a union of linear subspaces $\tilde{\Phi} = \bigcup_{i=1}^n \tilde{\mathcal{S}}_i$ of

222 CHAPTER 4. ADVANCES IN ALGEBRAIC SUBSPACE CLUSTERING

RD+1, in a way that there is a 1 1 correspondence between the parameters of (a basis − Ai for the linear part and the translation vector) and the parameters of ˜ (a basis) i [n]. Si ∀ ∈ To the best of the author’s knowledge, the correspondence between and ˜ has been Ai Si the sole theoretical justification so far in the subspace clustering literature for the traditional

Affine ASC (AASC) approach for dealing with affine subspaces, which consists of

1. applying the embedding $\phi_0$ to the points $\mathcal{X}$ in $\Psi$,

2. computing a basis $p_1, \ldots, p_s$ for the vector space $\mathcal{I}_{\tilde{\mathcal{X}},n}$ of homogeneous polynomials of degree $n$ that vanish on the embedded points $\tilde{\mathcal{X}} := \phi_0(\mathcal{X})$,

3. for $\tilde{x}_i \in \tilde{\mathcal{X}} \cap \tilde{\mathcal{S}}_i - \bigcup_{i' \neq i} \tilde{\mathcal{S}}_{i'}$, estimating $\tilde{\mathcal{S}}_i$ via the formula

$$\tilde{\mathcal{S}}_i = \operatorname{Span}\big( \nabla p_1|_{\tilde{x}_i}, \ldots, \nabla p_s|_{\tilde{x}_i} \big)^\perp, \quad \text{and} \qquad (4.40)$$

4. extracting the translation vector of $\mathcal{A}_i$, and a basis for its linear part, from a basis of $\tilde{\mathcal{S}}_i$ (a minimal numerical sketch of these four steps is given below).
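To make the four steps above concrete, here is a minimal numerical sketch for two affine lines in $\mathbb{R}^2$. The toy data, the degree-$2$ Veronese construction used to obtain the vanishing polynomials, and all variable names are assumptions of this sketch (using numpy), not an implementation from the thesis.

```python
# A minimal numerical sketch of the four AASC steps above, for two affine
# lines in R^2.  Toy data and names are illustrative assumptions.
import numpy as np
from itertools import combinations_with_replacement

rng = np.random.default_rng(0)
D, n = 2, 2                                      # ambient dimension, number of subspaces
X1 = np.stack([rng.uniform(-3, 3, 30), np.ones(30)])      # A_1: the line {y = 1}
X2 = np.stack([2 * np.ones(30), rng.uniform(-3, 3, 30)])  # A_2: the line {x = 2}
X = np.hstack([X1, X2])                          # D x N data matrix

# Step 1: the embedding phi_0 of (4.37), prepending the coordinate x_0 = 1.
Xt = np.vstack([np.ones(X.shape[1]), X])         # (D+1) x N

# Step 2: a basis of the degree-n homogeneous polynomials vanishing on the
# embedded points, via the left null space of the degree-n Veronese map.
monos = list(combinations_with_replacement(range(D + 1), n))
V = np.stack([np.prod(Xt[list(m), :], axis=0) for m in monos])
U, s, _ = np.linalg.svd(V)
coeffs = U[:, s < 1e-8 * s[0]]                   # one column per vanishing polynomial

# Step 3: gradients of the vanishing polynomials at a point of A_1 only,
# cf. (4.40).  A degree-2 form is x~^T C x~ with C symmetric, so grad = 2 C x~.
x1t = Xt[:, 0]                                   # an embedded point from A_1
grads = []
for c in coeffs.T:
    C = np.zeros((D + 1, D + 1))
    for cm, (i, j) in zip(c, monos):
        C[i, j] += cm if i == j else cm / 2
        if i != j:
            C[j, i] += cm / 2
    grads.append(2 * C @ x1t)
B_tilde = np.stack(grads, axis=1)                # spans the normal space of S~_1

# Step 4: split each normal as (gamma_k, b_k) and recover A_1 via (4.67).
gamma1, B1 = B_tilde[0], B_tilde[1:]
mu1 = -B1 @ np.linalg.lstsq(B1.T @ B1, gamma1, rcond=None)[0]
print("normal of the linear part of A_1:", (B1 / np.linalg.norm(B1)).ravel())
print("recovered translation of A_1:", mu1)      # expect approximately (0, 1)
```

With this construction, the recovered translation is the component of $\mu_1$ in $\mathcal{S}_1^\perp$, in agreement with the ambiguity discussed in §4.5.2.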

According to Theorem 4.6, the above process will succeed, if i) the embedded points $\tilde{\mathcal{X}}$ are in general position in $\tilde{\Phi}$ (with respect to degree $n$ in the sense of Definition 4.12; since $n$ is known, we will henceforth omit the "with respect to degree $n$" part), and ii) the union of linear subspaces $\tilde{\Phi}$ is transversal. Note that these conditions need not be satisfied a-priori because of the particular structure of both the embedded data in (4.36) and the basis in (4.39). This gives rise to the following reasonable questions:

Question 4.30. Under what conditions on $\mathcal{X}$ and $\Psi$ will $\tilde{\mathcal{X}}$ be in general position in $\tilde{\Phi}$?


Question 4.31. Under what conditions on $\Psi$ will $\tilde{\Phi}$ be transversal?

In what follows, we rigorously answer these two questions. Our insights are drawn from

the algebraic geometric properties of unions of affine subspaces, which we study next.

4.5.3 Algebraic geometry of unions of affine subspaces

In §4.5.3.1 we describe the basic algebraic geometry of affine subspaces and unions thereof, in analogy to the case of linear subspaces. In particular, we show that a single affine subspace is the zero-set of polynomial equations of degree $1$, and a union $\Psi$ of affine subspaces is the zero-set of polynomial equations of degree $n$. In §4.5.3.2 we study more closely the embedding $\mathcal{A} \overset{\phi_0}{\longrightarrow} \tilde{\mathcal{S}}$ of an affine subspace $\mathcal{A} \subset \mathbb{R}^D$ into its associated linear subspace $\tilde{\mathcal{S}} \subset \mathbb{R}^{D+1}$, which will lead to a deeper understanding of the embedding $\Psi \overset{\phi_0}{\longrightarrow} \tilde{\Phi}$ of a union of affine subspaces $\Psi \subset \mathbb{R}^D$ into its associated union of linear subspaces $\tilde{\Phi} \subset \mathbb{R}^{D+1}$. As we will see, $\Psi$ is dense in $\tilde{\Phi}$ in a very precise sense, and the algebraic manifestation of this relation (Proposition 4.41) will be of frequent use later in §4.5.4.

4.5.3.1 Affine subspaces as affine varieties

Let $\mathcal{A} = \mathcal{S} + \mu$ be an affine subspace of $\mathbb{R}^D$ and let $b_1, \ldots, b_c$ be a basis for the orthogonal complement $\mathcal{S}^\perp$ of $\mathcal{S}$. The first important observation is that a vector $x$ belongs to $\mathcal{S}$ if and only if $x \perp b_k$, $\forall k = 1, \ldots, c$. In the language of algebraic geometry this is the same as saying that $\mathcal{S}$ is the zero set of $c$ linear polynomials:

$$\mathcal{S} = \mathcal{Z}\big(b_1^\top x, \ldots, b_c^\top x\big), \qquad x := [x_1, \ldots, x_D]^\top. \qquad (4.41)$$

One may wonder if the linear polynomials $b_i^\top x$, $i = 1, \ldots, c$, form some sort of basis for the vanishing ideal $\mathcal{I}_{\mathcal{S}}$ of $\mathcal{S}$ (see Definition 4.56). In fact this is true (see Proposition 4.72 for a proof) and can be formalized by saying that these linear polynomials are generators of $\mathcal{I}_{\mathcal{S}}$ over the polynomial ring $\mathcal{R} = \mathbb{R}[x_1, \ldots, x_D]$. This means that every polynomial that belongs to $\mathcal{I}_{\mathcal{S}}$ can be written as a linear combination of $b_1^\top x, \ldots, b_c^\top x$ with polynomial coefficients, i.e.,

$$p(x) = p_1(x)\,(b_1^\top x) + \cdots + p_c(x)\,(b_c^\top x), \qquad (4.42)$$

where $p_1, \ldots, p_c$ are some polynomials in $\mathcal{R}$. More compactly,

$$\mathcal{I}_{\mathcal{S}} = \langle b_1^\top x, \ldots, b_c^\top x \rangle, \qquad (4.43)$$

which reads as: $\mathcal{I}_{\mathcal{S}}$ is the ideal generated by the polynomials $b_1^\top x, \ldots, b_c^\top x$ as in (4.42). Moving on, the second important observation is that $x \in \mathcal{A}$ if and only if $x - \mu \in \mathcal{S}$. Equivalently,

$$x \in \mathcal{A} \;\Leftrightarrow\; b_k \perp x - \mu, \quad \forall k = 1, \ldots, c, \qquad (4.44)$$

or in algebraic geometric terms

$$\mathcal{A} = \mathcal{Z}\big(b_1^\top x - b_1^\top \mu, \ldots, b_c^\top x - b_c^\top \mu\big). \qquad (4.45)$$

In other words, the affine subspace $\mathcal{A}$ is an algebraic variety of $\mathbb{R}^D$. In fact, we say that $\mathcal{A}$ is an affine variety, since it is defined by non-homogeneous polynomials. To describe the

vanishing ideal $\mathcal{I}_{\mathcal{A}}$ of $\mathcal{A}$, note that a polynomial $p(x)$ vanishes on $\mathcal{A}$ if and only if $p(x + \mu)$ vanishes on $\mathcal{S}$. This, together with (4.43), gives

$$\mathcal{I}_{\mathcal{A}} = \langle b_1^\top x - b_1^\top \mu, \ldots, b_c^\top x - b_c^\top \mu \rangle. \qquad (4.46)$$

Next, we consider a union $\Psi = \bigcup_{i=1}^n \mathcal{A}_i$ of affine subspaces $\mathcal{A}_i = \mathcal{S}_i + \mu_i$, $i \in [n]$, of $\mathbb{R}^D$. The next result describes $\Psi$ as the zero-set of non-homogeneous polynomials of degree $n$, showing that $\Psi$ is an affine variety of $\mathbb{R}^D$.

Proposition 4.32. Let $\Psi = \bigcup_{i=1}^n \mathcal{A}_i$ be a union of affine subspaces of $\mathbb{R}^D$, where each affine subspace $\mathcal{A}_i$ is the translation of a linear subspace $\mathcal{S}_i$ of codimension $c_i$ by a translation vector $\mu_i$. For each $\mathcal{A}_i = \mathcal{S}_i + \mu_i$, let $b_{i1}, \ldots, b_{i c_i}$ be a basis for $\mathcal{S}_i^\perp$. Then $\Psi$ is the zero set of all degree-$n$ polynomials of the form

$$\prod_{i=1}^n \big(b_{i j_i}^\top x - b_{i j_i}^\top \mu_i\big) \;:\; (j_1, \ldots, j_n) \in [c_1] \times \cdots \times [c_n]. \qquad (4.47)$$

Proof. Denote the set of all polynomials of the form (4.47) by $\mathcal{P}$.

First, we show that $\Psi \subset \mathcal{Z}(\mathcal{P})$. Take $x \in \Psi$; we will show that $x \in \mathcal{Z}(\mathcal{P})$. Since $\Psi = \mathcal{A}_1 \cup \cdots \cup \mathcal{A}_n$, $x$ belongs to at least one of the affine subspaces, say $x \in \mathcal{A}_i$, for some $i$. For every polynomial $p$ of $\mathcal{P}$, there is a linear factor $b_{i j_i}^\top x - b_{i j_i}^\top \mu_i$ of $p$ that vanishes on $\mathcal{A}_i$ and thus on $x$. Hence $p$ itself will vanish on $x$. Since $p$ was an arbitrary element of $\mathcal{P}$, this shows that every polynomial of $\mathcal{P}$ vanishes on $x$, i.e., $x \in \mathcal{Z}(\mathcal{P})$.

Next, we show that $\mathcal{Z}(\mathcal{P}) \subset \Psi$. Let $x \in \mathcal{Z}(\mathcal{P})$; we will show that $x \in \Psi$. If $x$ is a root of all polynomials $p_{1j}(x) = b_{1j}^\top x - b_{1j}^\top \mu_1$, then $x \in \mathcal{A}_1$ and we are done. Otherwise, one of these linear polynomials does not vanish on $x$, say $p_{11}(x) \neq 0$. Now suppose that


$x \notin \Psi$. By the above argument, for every affine subspace $\mathcal{A}_i$ there must exist some linear polynomial $b_{i1}^\top x - b_{i1}^\top \mu_i$, which does not vanish on $x$. As a consequence, the polynomial

$$p(x) = \prod_{i=1}^n \big(b_{i1}^\top x - b_{i1}^\top \mu_i\big) \qquad (4.48)$$

does not vanish on $x$, i.e., $p(x) \neq 0$. But because of the definition of $\mathcal{P}$, we must have that $p \in \mathcal{P}$. Since $x$ was selected to be an element of $\mathcal{Z}(\mathcal{P})$, we must have that $p(x) = 0$, which is a contradiction, as we just saw that $p(x) \neq 0$. Consequently, the hypothesis that $x \notin \Psi$ must be false, i.e., $\mathcal{Z}(\mathcal{P}) \subset \Psi$, and the proof is concluded.
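As a quick sanity check of Proposition 4.32, the following sketch (toy arrangement and names are illustrative assumptions, not from the thesis) evaluates the single product polynomial of the form (4.47) for two affine lines in $\mathbb{R}^2$: it vanishes on points of $\Psi$ and not on a generic point off $\Psi$.

```python
# Numerical check of Proposition 4.32 for two affine lines in R^2.
import numpy as np

rng = np.random.default_rng(1)
b1, mu1 = np.array([0.0, 1.0]), np.array([0.0, 1.0])   # A_1 = {y = 1}, c_1 = 1
b2, mu2 = np.array([1.0, 0.0]), np.array([2.0, 0.0])   # A_2 = {x = 2}, c_2 = 1

def p(x):
    # The single product polynomial of the form (4.47) for this arrangement.
    return (b1 @ x - b1 @ mu1) * (b2 @ x - b2 @ mu2)

on_A1 = np.array([rng.uniform(-3, 3), 1.0])            # a point of A_1
on_A2 = np.array([2.0, rng.uniform(-3, 3)])            # a point of A_2
off   = np.array([0.5, 0.5])                           # a point off Psi
print(p(on_A1), p(on_A2), p(off))                      # 0.0, 0.0, nonzero
```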

The reader may wonder what the vanishing ideal $\mathcal{I}_\Psi$ of $\Psi$ is and what its relation is to the linear polynomials whose products generate $\Psi$, as in Proposition 4.32. In fact, this question

is still partially open even in the simpler case of a union of linear subspaces [16, 20, 21].

As it turns out, $\mathcal{I}_\Psi$ is intimately related to $\mathcal{I}_{\tilde{\Phi}}$, where $\tilde{\Phi} = \bigcup_{i=1}^n \tilde{\mathcal{S}}_i$ is the union of linear subspaces associated to $\Psi$ under the embedding $\phi_0$ of (4.37). It is precisely this relation that

will enable us to prove Theorem 4.44, and to elucidate it we need the notion of projective

closure that we introduce next.$^{15}$

4.5.3.2 The projective closure of affine subspaces

Let $\phi_0(\mathcal{A})$ be the image of $\mathcal{A} = \mathcal{S} + \mu$ under the embedding $\phi_0 : \mathbb{R}^D \hookrightarrow \mathbb{R}^{D+1}$ in (4.37). Let $\tilde{\mathcal{S}}$ be the $(d+1)$-dimensional linear subspace of $\mathbb{R}^{D+1}$ spanned by the columns of $\tilde{U}$

$^{15}$Of course, the notion of projective closure is a well-known concept in algebraic geometry; here we introduce it in a self-contained fashion in the context of unions of affine subspaces, dispensing with unnecessary abstractions.


(see (4.39)). A basis for the orthogonal complement of $\tilde{\mathcal{S}}$ in $\mathbb{R}^{D+1}$ is

$$\tilde{b}_1 := \begin{bmatrix} -b_1^\top \mu \\ b_1 \end{bmatrix}, \; \ldots, \; \tilde{b}_c := \begin{bmatrix} -b_c^\top \mu \\ b_c \end{bmatrix}, \qquad (4.49)$$

since $\operatorname{codim}(\tilde{\mathcal{S}}) = \operatorname{codim}(\mathcal{S})$, and the $\tilde{b}_i$ are linearly independent because the $b_i$ are. In algebraic geometric terms

$$\tilde{\mathcal{S}} = \mathcal{Z}\big(b_1^\top x - (b_1^\top \mu)x_0, \ldots, b_c^\top x - (b_c^\top \mu)x_0\big) = \mathcal{Z}\big(\tilde{b}_1^\top \tilde{x}, \ldots, \tilde{b}_c^\top \tilde{x}\big), \qquad \tilde{x} := [x_0, x_1, \ldots, x_D]^\top. \qquad (4.50)$$

By inspecting equations (4.45) and (4.50), we see that every point of $\phi_0(\mathcal{A})$ satisfies the equations (4.50) of $\tilde{\mathcal{S}}$. Since these equations are homogeneous, it will in fact be true that for any point $\tilde{x} \in \phi_0(\mathcal{A})$ the entire line of $\mathbb{R}^{D+1}$ spanned by $\tilde{x}$ will still lie in $\tilde{\mathcal{S}}$. Hence, we may as well think of the embedding $\phi_0$ as mapping a point $x \in \mathbb{R}^D$ to a line of $\mathbb{R}^{D+1}$. To formalize this concept, we need the notion of the real projective space [18, 48]:

Definition 4.33. The real projective space $\mathbb{P}^D$ is defined to be the set of all lines through the origin in $\mathbb{R}^{D+1}$. Each non-zero vector $\alpha$ of $\mathbb{R}^{D+1}$ defines an element $[\alpha]$ of $\mathbb{P}^D$, and two elements $[\alpha], [\beta]$ of $\mathbb{P}^D$ are equal in $\mathbb{P}^D$ if and only if there exists a nonzero $\lambda \in \mathbb{R}$ such that we have an equality $\alpha = \lambda \beta$ of vectors in $\mathbb{R}^{D+1}$. For each point $[\alpha] \in \mathbb{P}^D$, we call the point $\alpha \in \mathbb{R}^{D+1}$ a representative of $[\alpha]$.

Now we can define a new embedding $\hat{\phi}_0 : \mathbb{R}^D \rightarrow \mathbb{P}^D$, that behaves exactly as $\phi_0$ in (4.37), except that it now takes points of $\mathbb{R}^D$ to lines of $\mathbb{R}^{D+1}$, or more precisely, to elements of $\mathbb{P}^D$:

$$(\alpha_1, \alpha_2, \ldots, \alpha_D) \overset{\hat{\phi}_0}{\longmapsto} [(1, \alpha_1, \alpha_2, \ldots, \alpha_D)]. \qquad (4.51)$$

A point $x$ of $\mathcal{A}$ is mapped by $\hat{\phi}_0$ to a line inside $\tilde{\mathcal{S}}$, or more specifically, to the point $[\tilde{x}]$ of $\mathbb{P}^D$, whose representative $\tilde{x}$ satisfies the equations (4.50) of $\tilde{\mathcal{S}}$. The set of all lines of $\mathbb{R}^{D+1}$ that live in $\tilde{\mathcal{S}}$, viewed as elements of $\mathbb{P}^D$, is denoted by $[\tilde{\mathcal{S}}]$, i.e.,

$$[\tilde{\mathcal{S}}] = \big\{ [\alpha] \in \mathbb{P}^D : \alpha \in \tilde{\mathcal{S}} \big\}. \qquad (4.52)$$

The representative $\alpha$ of every element $[\alpha] \in [\tilde{\mathcal{S}}]$ satisfies by definition the equations (4.50) of $\tilde{\mathcal{S}}$, and so $[\tilde{\mathcal{S}}]$ naturally has the structure of an algebraic variety of $\mathbb{P}^D$, which is called a projective variety. We emphasize that even though the varieties $\tilde{\mathcal{S}}$ and $[\tilde{\mathcal{S}}]$ live in different spaces, $\mathbb{R}^{D+1}$ and $\mathbb{P}^D$ respectively, they are defined by the same equations. In fact, every algebraic variety $\mathcal{Y}$ of $\mathbb{R}^{D+1}$ that is a union of lines, which is true if and only if $\mathcal{Y}$ is defined by homogeneous equations, gives rise to a projective variety $[\mathcal{Y}]$ of $\mathbb{P}^D$ defined by the same equations.

Example 4.34. Recall from Section 4.1 that a union $\tilde{\Phi}$ of linear subspaces is defined as the zero-set of homogeneous polynomials. Then $\tilde{\Phi}$ gives rise to a projective variety $[\tilde{\Phi}]$ of $\mathbb{P}^D$ defined by the same equations as $\tilde{\Phi}$, which can be thought of as the set of lines through the origin in $\mathbb{R}^{D+1}$ that live in $\tilde{\Phi}$.

Returning to our embedding $\hat{\phi}_0$, the formal relation between $\hat{\phi}_0(\mathcal{A})$ and $\tilde{\mathcal{S}}$ is revealed with the help of the Zariski topology [18, 48] (Definition 4.57):


Proposition 4.35. In the Zariski topology, the set $\hat{\phi}_0(\mathcal{A})$ is open and dense in $[\tilde{\mathcal{S}}]$; in particular, $[\tilde{\mathcal{S}}]$ is the closure$^{16}$ of $\hat{\phi}_0(\mathcal{A})$ in $\mathbb{P}^D$.

The projective variety $[\tilde{\mathcal{S}}]$ is called the projective closure of $\mathcal{A}$: it is the smallest projective variety that contains $\hat{\phi}_0(\mathcal{A})$. We now characterize the projective closure of a union of affine subspaces.

Proposition 4.36. Let $\Psi = \bigcup_{i=1}^n \mathcal{A}_i$ be a union of affine subspaces of $\mathbb{R}^D$. Then the projective closure of $\Psi$ in $\mathbb{P}^D$, i.e., the smallest projective variety that contains $\hat{\phi}_0(\Psi)$, is

$$\bigcup_{i=1}^n [\tilde{\mathcal{S}}_i] = \left[ \bigcup_{i=1}^n \tilde{\mathcal{S}}_i \right] = [\tilde{\Phi}], \qquad (4.53)$$

where $\tilde{\mathcal{S}}_i$ is the linear subspace of $\mathbb{R}^{D+1}$ corresponding to $\mathcal{A}_i$ under the embedding $\phi_0$ of (4.37).

The geometric fact that $[\tilde{\Phi}] \subset \mathbb{P}^D$ is the smallest projective variety of $\mathbb{P}^D$ that contains $\hat{\phi}_0(\Psi)$ manifests itself algebraically in $\mathcal{I}_\Psi$ being uniquely defined by $\mathcal{I}_{\tilde{\Phi}}$ and vice versa, in a very precise fashion. To describe this relation, we need a definition.

Definition 4.37 (Homogenization - Dehomogenization). Let $p \in \mathcal{R} = \mathbb{R}[x_1, \ldots, x_D]$ be a polynomial of degree $n$. The homogenization of $p$ is the homogeneous polynomial

$$p^{(h)} = x_0^n \, p\!\left( \frac{x_1}{x_0}, \frac{x_2}{x_0}, \ldots, \frac{x_D}{x_0} \right) \qquad (4.54)$$

of $\tilde{\mathcal{R}} = \mathbb{R}[x_0, x_1, \ldots, x_D]$ of degree $n$. Conversely, if $P \in \tilde{\mathcal{R}}$ is homogeneous of degree $n$, its dehomogenization is $P_{(d)} = P(1, x_1, \ldots, x_D)$, a polynomial of $\mathcal{R}$ of degree $\leq n$.

$^{16}$It can further be shown that $[\tilde{\mathcal{S}}] = \hat{\phi}_0(\mathcal{A}) \cup [\mathcal{S}]$: intuitively, the set that we need to add to $\hat{\phi}_0(\mathcal{A})$ to get a closed set is the slope $[\mathcal{S}]$ of $\mathcal{A}$.


Example 4.38. Let $P = x_0^2 x_1 + x_0 x_2^2 + x_1 x_2 x_3$ be a homogeneous polynomial of degree $3$. Its dehomogenization is the degree-$3$ polynomial $P_{(d)} = x_1 + x_2^2 + x_1 x_2 x_3$, and the homogenization of $P_{(d)}$ is

$$\big(P_{(d)}\big)^{(h)} = x_0^3 \left( \frac{x_1}{x_0} + \frac{x_2^2}{x_0^2} + \frac{x_1 x_2 x_3}{x_0^3} \right) = P.$$
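Definition 4.37 and Example 4.38 can be reproduced symbolically. The sketch below uses sympy; the helper functions homogenize and dehomogenize are illustrative assumptions of this sketch that implement (4.54) and the dehomogenization directly.

```python
# A small sympy illustration of Definition 4.37, reproducing Example 4.38.
import sympy as sp

x0, x1, x2, x3 = sp.symbols('x0 x1 x2 x3')

def homogenize(p, n, x0, xs):
    # p^(h) = x0^n * p(x1/x0, ..., xD/x0), cf. (4.54)
    return sp.expand(x0**n * p.subs({x: x / x0 for x in xs}))

def dehomogenize(P, x0):
    # P_(d) = P(1, x1, ..., xD)
    return P.subs(x0, 1)

P = x0**2 * x1 + x0 * x2**2 + x1 * x2 * x3            # homogeneous of degree 3
Pd = dehomogenize(P, x0)                              # x1 + x2**2 + x1*x2*x3
print(Pd)
print(sp.simplify(homogenize(Pd, 3, x0, [x1, x2, x3]) - P))   # 0, i.e. (P_(d))^(h) = P
```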

The next result from algebraic geometry is crucial for our purpose.

Theorem 4.39 (Chapter 8 in [18]). Let $\mathcal{Y}$ be an affine variety of $\mathbb{R}^D$ and let $\bar{\mathcal{Y}}$ be its projective closure in $\mathbb{P}^D$ with respect to the embedding $\hat{\phi}_0$ of (4.51). Let $\mathcal{I}_{\mathcal{Y}}, \mathcal{I}_{\bar{\mathcal{Y}}}$ be the vanishing ideals of $\mathcal{Y}, \bar{\mathcal{Y}}$ respectively. Then $\mathcal{I}_{\bar{\mathcal{Y}}} = \mathcal{I}_{\mathcal{Y}}^{(h)}$, i.e., every element of $\mathcal{I}_{\bar{\mathcal{Y}}}$ arises as a homogenization of some element of $\mathcal{I}_{\mathcal{Y}}$, and every element of $\mathcal{I}_{\mathcal{Y}}$ arises as the dehomogenization of some element of $\mathcal{I}_{\bar{\mathcal{Y}}}$.

We have already seen that $\tilde{\Phi}$ and $[\tilde{\Phi}]$ are given as algebraic varieties by identical equations.

It is also not hard to see that the vanishing ideals of these varieties are identical as well.

Lemma 4.40. Let $\tilde{\Phi} = \bigcup_{i=1}^n \tilde{\mathcal{S}}_i$ be a union of linear subspaces of $\mathbb{R}^{D+1}$, and let $[\tilde{\Phi}] = \bigcup_{i=1}^n [\tilde{\mathcal{S}}_i]$ be the corresponding projective variety of $\mathbb{P}^D$. Then $\mathcal{I}_{\tilde{\Phi},k} = \mathcal{I}_{[\tilde{\Phi}],k}$, i.e., a degree-$k$ homogeneous polynomial vanishes on $\tilde{\Phi}$ if and only if it vanishes on $[\tilde{\Phi}]$.

As a Corollary of Theorem 4.39 and Lemma 4.40, we obtain the key result of this section:

Proposition 4.41. Let $\Psi = \bigcup_{i=1}^n \mathcal{A}_i$ be a union of affine subspaces of $\mathbb{R}^D$. Let $\tilde{\Phi} = \bigcup_{i=1}^n \tilde{\mathcal{S}}_i$ be the union of linear subspaces of $\mathbb{R}^{D+1}$ associated to $\Psi$ under the embedding $\phi_0$ of (4.37). Then $\mathcal{I}_{\tilde{\Phi}}$ is the homogenization of $\mathcal{I}_\Psi$.


4.5.4 Correctness theorems for the homogenization trick

We are now in a position to address Questions 4.30 and 4.31. Regarding Question 4.30, one may be tempted to conjecture that $\tilde{\mathcal{X}}$ is in general position in $\tilde{\Phi}$ if the components of the points $\mathcal{X}$ along the union $\Phi := \bigcup_{i=1}^n \mathcal{S}_i$ of the linear parts of the affine subspaces are in general position inside $\Phi$. However, this conjecture is not true, as illustrated by the next example.

Example 4.42. Suppose that $\Psi = \mathcal{A}_1 \cup \mathcal{A}_2$ is a union of two affine planes $\mathcal{A}_i = \mathcal{S}_i + \mu_i$ of $\mathbb{R}^3$. Then $\Phi = \mathcal{S}_1 \cup \mathcal{S}_2$ is a union of $2$ planes in $\mathbb{R}^3$ and as argued in Example 4.15, we can find $5$ points in general position in $\Phi$. However, $\tilde{\Phi} = \tilde{\mathcal{S}}_1 \cup \tilde{\mathcal{S}}_2$ is a union of $2$ hyperplanes in $\mathbb{R}^4$ and any subset of $\tilde{\Phi}$ in general position must consist of at least $\mathcal{M}_2(4) - 1 = \binom{2+3}{2} - 1 = 9$ points.$^{17}$

Thanks to Proposition 4.32 we can define points $\mathcal{X}$ to be in general position in $\Psi$, in analogy to Definition 4.12.

Definition 4.43. Let $\Psi$ be a union of $n$ affine subspaces of $\mathbb{R}^D$ and $\mathcal{X}$ a finite subset of $\Psi$. We will say that $\mathcal{X}$ is in general position in $\Psi$, if $\Psi$ can be recovered as the zero set of all polynomials of degree $n$ that vanish on $\mathcal{X}$. Equivalently, a polynomial of degree $n$ vanishes on $\Psi$ if and only if it vanishes on $\mathcal{X}$.

We are now ready to answer our Question 4.30.

$^{17}$Otherwise one can fit a polynomial of degree $2$ to the points which does not vanish on $\tilde{\Phi}$.


Theorem 4.44. Let $\mathcal{X}$ be a finite subset of a union of $n$ affine subspaces $\Psi = \bigcup_{i=1}^n \mathcal{A}_i$ of $\mathbb{R}^D$, where $\mathcal{A}_i = \mathcal{S}_i + \mu_i$, with $\mathcal{S}_i$ a linear subspace of $\mathbb{R}^D$ of codimension $0 < c_i < D$. Then $\mathcal{X}$ is in general position in $\Psi$ if and only if $\tilde{\mathcal{X}}$ is in general position in $\tilde{\Phi}$.

Proof. ($\Rightarrow$) Suppose that $\mathcal{X}$ is in general position in $\Psi$. We need to show that $\tilde{\mathcal{X}}$ is in general position in $\tilde{\Phi}$. In view of Proposition 4.13, and the fact that $\mathcal{I}_{\tilde{\Phi},n} \subset \mathcal{I}_{\tilde{\mathcal{X}},n}$, it is sufficient to show that $\mathcal{I}_{\tilde{\Phi},n} \supset \mathcal{I}_{\tilde{\mathcal{X}},n}$. To that end, let $P$ be a homogeneous polynomial of degree $n$ in $\mathbb{R}[x_0, x_1, \ldots, x_D]$ that vanishes on the points $\tilde{\mathcal{X}}$, i.e., $P \in \mathcal{I}_{\tilde{\mathcal{X}},n}$. Then for every point $\tilde{\alpha} = (1, \alpha_1, \ldots, \alpha_D)$ of $\tilde{\mathcal{X}}$, we have

$$P(\tilde{\alpha}) = P(1, \alpha_1, \ldots, \alpha_D) = P_{(d)}(\alpha_1, \ldots, \alpha_D) = 0, \qquad (4.55)$$

i.e., the dehomogenization $P_{(d)}$ of $P$ vanishes on all points of $\mathcal{X}$, i.e., $P_{(d)} \in \mathcal{I}_{\mathcal{X}}$. There are two possibilities: either $P_{(d)}$ has degree $n$, in which case $P = \big(P_{(d)}\big)^{(h)}$, or $P_{(d)}$ has degree strictly less than $n$, say $n - k$, $k \geq 1$, in which case $P = x_0^k \big(P_{(d)}\big)^{(h)}$. If $P_{(d)}$ has total degree $n$, by the general position assumption on $\mathcal{X}$, $P_{(d)}$ must vanish on $\Psi$. Then by Proposition 4.41, $\big(P_{(d)}\big)^{(h)} \in \mathcal{I}_{\tilde{\Phi},n}$, and so $P \in \mathcal{I}_{\tilde{\Phi},n}$. If $\deg P_{(d)} = n - k$, $k \geq 1$, suppose we can find a linear form $G = \tilde{\zeta}^\top \tilde{x}$ that does not vanish on any of the $\tilde{\mathcal{S}}_i$, $i \in [n]$, and is not divisible by $x_0$. Then $G_{(d)}$ will have degree $1$ and will not vanish on any of the $\mathcal{A}_i$, $i \in [n]$. Also, $G_{(d)}^k P_{(d)}$ has degree $n$ and vanishes on $\mathcal{X}$. Since $\mathcal{X}$ is in general position in $\Psi$, we will have that $G_{(d)}^k P_{(d)}$ vanishes on $\Psi$. Then by Proposition 4.41, $\big(G_{(d)}^k P_{(d)}\big)^{(h)} \in \mathcal{I}_{\tilde{\Phi},n}$.

Since $\mathcal{I}_{\tilde{\Phi}} = \bigcap_{i=1}^n \mathcal{I}_{\tilde{\mathcal{S}}_i}$, we must have that $G^k \big(P_{(d)}\big)^{(h)} \in \mathcal{I}_{\tilde{\mathcal{S}}_i}$, $\forall i \in [n]$. Since $\mathcal{I}_{\tilde{\mathcal{S}}_i}$ is a prime ideal (Proposition 4.73) and $G \notin \mathcal{I}_{\tilde{\mathcal{S}}_i}$, it must be the case that $\big(P_{(d)}\big)^{(h)} \in \mathcal{I}_{\tilde{\mathcal{S}}_i}$, $\forall i \in [n]$, i.e., $\big(P_{(d)}\big)^{(h)} \in \mathcal{I}_{\tilde{\Phi}}$. But $P = x_0^k \big(P_{(d)}\big)^{(h)}$, which shows that $P \in \mathcal{I}_{\tilde{\Phi},n}$.

It remains to be shown that there exists a linear form $G$ non-divisible by $x_0$ that does not vanish on any of the $\tilde{\mathcal{S}}_i$. Suppose this is not true; thus if $G = b^\top x + \alpha x_0$ is a linear form non-divisible by $x_0$, i.e., $b \neq 0$, then $G$ must vanish on some $\tilde{\mathcal{S}}_i$. In particular, for any non-zero vector $b$ of $\mathbb{R}^D$, $b^\top x = b^\top x + 0\,x_0$ must vanish on some $\tilde{\mathcal{S}}_i$. Recall from §4.5.2 that if $u_{i1}, \ldots, u_{i d_i}$ is a basis for $\mathcal{S}_i$, the linear part of $\mathcal{A}_i = \mathcal{S}_i + \mu_i$, then

$$\begin{bmatrix} 1 & 0 & \cdots & 0 \\ \mu_i & u_{i1} & \cdots & u_{i d_i} \end{bmatrix} \qquad (4.56)$$

is a basis for $\tilde{\mathcal{S}}_i$. Since $b^\top x$ vanishes on $\tilde{\mathcal{S}}_i$, it must vanish on each basis vector of $\tilde{\mathcal{S}}_i$.

In particular, $b^\top u_{i1} = \cdots = b^\top u_{i d_i} = 0$, which implies that the linear form $b^\top x$, now viewed as a function on $\mathbb{R}^D$, vanishes on $\mathcal{S}_i$, i.e., $b^\top x \in \mathcal{I}_{\mathcal{S}_i}$. To summarize, we have shown that for every $0 \neq b \in \mathbb{R}^D$, there exists an $i \in [n]$ such that $b^\top x \in \mathcal{I}_{\mathcal{S}_i}$. Taking $b$ equal to the standard basis vector $e_1$ of $\mathbb{R}^D$, we see that the linear form $x_1$ must vanish on some $\mathcal{S}_i$, and similarly for the linear forms $x_2, \ldots, x_D$. This in turn means that the ideal $\mathfrak{m} := \langle x_1, \ldots, x_D \rangle$ generated by the linear forms $x_1, \ldots, x_D$ must lie in the union $\bigcup_{i=1}^n \mathcal{I}_{\mathcal{S}_i}$. But it is known from Proposition 1.11(i) in [1] that if an ideal $\mathfrak{a}$ lies in the union of finitely many prime ideals, then $\mathfrak{a}$ must lie in one of these prime ideals. Applying this result to our case, we see that, since the $\mathcal{I}_{\mathcal{S}_i}$ are prime ideals, $\mathfrak{m} \subset \mathcal{I}_{\mathcal{S}_i}$ for some $i \in [n]$. But this says that for any vector in $\mathcal{S}_i$ all of its coordinates must be zero, i.e., $\mathcal{S}_i = 0$, which


violates the assumption $d_i > 0$, $\forall i \in [n]$. This contradiction proves the existence of our linear form $G$.

($\Leftarrow$) Now suppose that $\tilde{\mathcal{X}}$ is in general position in $\tilde{\Phi}$. We need to show that $\mathcal{X}$ is in general position in $\Psi$. To that end, let $p$ be a vanishing polynomial of $\Psi$ of degree $n$; then clearly $p \in \mathcal{I}_{\mathcal{X}}$. Conversely, let $p \in \mathcal{I}_{\mathcal{X}}$ of degree $n$. Then for each point $\alpha \in \mathcal{X}$

$$0 = p(\alpha) = p(\alpha_1, \ldots, \alpha_D) = p^{(h)}(1, \alpha_1, \ldots, \alpha_D) = p^{(h)}(\tilde{\alpha}), \qquad (4.57)$$

i.e., the homogenization $p^{(h)}$ vanishes on $\tilde{\mathcal{X}}$. By hypothesis $\tilde{\mathcal{X}}$ is in general position in $\tilde{\Phi}$, hence $p^{(h)} \in \mathcal{I}_{\tilde{\Phi},n}$. Then by Proposition 4.41, the dehomogenization of $p^{(h)}$ must vanish on $\Psi$. But notice that $\big(p^{(h)}\big)_{(d)} = p$, and so $p$ vanishes on $\Psi$. $\square$

Our second theorem answers Question 4.31.

Theorem 4.45. Let $\Psi = \bigcup_{i=1}^n \mathcal{A}_i$ be a union of $n$ affine subspaces of $\mathbb{R}^D$, with $\mathcal{A}_i = \mathcal{S}_i + \mu_i$ and $\mu_i = B_i a_i$, where $B_i \in \mathbb{R}^{D \times c_i}$ is a basis for $\mathcal{S}_i^\perp$ with $c_i = \operatorname{codim} \mathcal{S}_i$. If $\Phi = \bigcup_{i=1}^n \mathcal{S}_i$ is transversal and $a_1, \ldots, a_n$ do not lie in the zero-measure set of a proper algebraic variety of $\mathbb{R}^{c_1} \times \cdots \times \mathbb{R}^{c_n}$, then $\tilde{\Phi}$ is transversal.

Proof. Let $b_{i1}, \ldots, b_{i c_i}$ be an orthonormal basis for $\mathcal{S}_i^\perp$; then

$$\tilde{b}_{i1}, \ldots, \tilde{b}_{i c_i}, \qquad \tilde{b}_{i j_i} := [\,-b_{i j_i}^\top B_i a_i \;\; b_{i j_i}^\top\,]^\top, \qquad (4.58)$$

is a basis for $\tilde{\mathcal{S}}_i^\perp$. Suppose that $\tilde{\Phi}$ is not transversal. Then there exists some index set


$J \subset [n]$, say without loss of generality $J = \{1, \ldots, \ell\}$, $\ell \leq n$, such that (see §4.7.3)

$$\operatorname{Rank}(\tilde{B}_J) < \min\Big\{ D+1, \sum_{i \in J} c_i \Big\}, \qquad (4.59)$$
$$\tilde{B}_J := \big[ \tilde{B}_1, \ldots, \tilde{B}_\ell \big], \qquad \tilde{B}_i := [\tilde{b}_{i1}, \ldots, \tilde{b}_{i c_i}], \qquad (4.60)$$

where we have used the fact that $\operatorname{codim} \tilde{\mathcal{S}}_i = \operatorname{codim} \mathcal{S}_i = c_i$, $\forall i \in [n]$. Since $\Phi$ is transversal, we must have either $\operatorname{Rank}(B_J) = D$ or $\operatorname{Rank}(B_J) = \sum_{i \in J} c_i$. Suppose the latter condition is true; then $\sum_{i \in J} c_i \leq D$. Then all columns of $B_J$ are linearly independent, which implies that the same will be true for the columns of $\tilde{B}_J$, and so $\operatorname{Rank}(\tilde{B}_J) = \sum_{i \in J} c_i$. Since by hypothesis $\sum_{i \in J} c_i \leq D$, we must have

$$\operatorname{codim} \bigcap_{i \in J} \tilde{\mathcal{S}}_i = \operatorname{Rank}(\tilde{B}_J) = \min\Big\{ D+1, \sum_{i \in J} c_i \Big\}, \qquad (4.61)$$

and so the transversality condition is satisfied for $J$, which is a contradiction to the hypothesis (4.59). Consequently, it must be the case that $\operatorname{Rank}(B_J) = D < \sum_{i \in J} c_i$. Since $B_J$ is a submatrix of $\tilde{B}_J$, we must have that $\operatorname{Rank}(\tilde{B}_J) \geq D$. On the other hand, because of (4.59) we must have $\operatorname{Rank}(\tilde{B}_J) \leq D$, i.e., $\operatorname{Rank}(\tilde{B}_J) = D$. Now $\tilde{B}_J$ is a $(D+1) \times \big(\sum_{i \in J} c_i\big)$ matrix, with the smaller dimension being $(D+1)$. Since its rank is $D$, it must be the case that all $(D+1) \times (D+1)$ minors of $\tilde{B}_J$ vanish. The vanishing of these minors defines an algebraic variety $\mathcal{W}_J$ of the parametric space $\prod_{i=1}^n \mathbb{R}^{c_i}$, and $\tilde{\Phi}$ is non-transversal if and only if $(a_1, \ldots, a_n) \in \mathcal{W} := \bigcup_{J \subset [n]} \mathcal{W}_J$. Since $\mathcal{W}$ is a finite union of algebraic varieties, it must be an algebraic variety itself, i.e., defined by a set of polynomial equations in the variables $a_1, \ldots, a_n$.


One may wonder if some of the $\mu_i$ can be zero and $\tilde{\Phi}$ still be transversal. This depends on the $c_i$, as the next example shows.

Example 4.46. Let $\mathcal{A}_1 = \operatorname{Span}(b_{11}, b_{12})^\perp + \mu_1$ be an affine line and $\mathcal{A}_2 = \operatorname{Span}(b_2)^\perp + \mu_2$ an affine plane of $\mathbb{R}^3$. Suppose that $\Phi = \operatorname{Span}(b_{11}, b_{12})^\perp \cup \operatorname{Span}(b_2)^\perp$ is transversal. Then $\tilde{\Phi} = \tilde{\mathcal{S}}_1 \cup \tilde{\mathcal{S}}_2$ is transversal if and only if the matrix

$$\tilde{B}_{[3]} = \begin{bmatrix} -b_{11}^\top \mu_1 & -b_{12}^\top \mu_1 & -b_2^\top \mu_2 \\ b_{11} & b_{12} & b_2 \end{bmatrix} \in \mathbb{R}^{4 \times 3} \qquad (4.62)$$

has rank $3$. But $\operatorname{Rank}\big(\tilde{B}_{[3]}\big) = 3$ irrespectively of what the $\mu_i$ are, simply because the matrix $B_{[3]} = [b_{11} \; b_{12} \; b_2]$ is full rank (by the transversality assumption on $\Phi$). Now let us

replace the affine plane $\mathcal{A}_2$ with a second affine line $\mathcal{A}_2 = \operatorname{Span}(b_{21}, b_{22})^\perp + \mu_2$. Then $\tilde{\Phi}$ is transversal if and only if

$$\tilde{B}_{[3]} = \begin{bmatrix} -b_{11}^\top \mu_1 & -b_{12}^\top \mu_1 & -b_{21}^\top \mu_2 & -b_{22}^\top \mu_2 \\ b_{11} & b_{12} & b_{21} & b_{22} \end{bmatrix} \in \mathbb{R}^{4 \times 4} \qquad (4.63)$$

has rank $4$, which is impossible if both $\mu_1, \mu_2$ are zero.
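The rank test of Example 4.46 is easy to check numerically. The sketch below (a toy numpy construction analogous to the second case of the example; the specific lines and all names are assumptions of this sketch) builds the matrix of embedded normals for two affine lines in $\mathbb{R}^3$ and compares generic translations with $\mu_1 = \mu_2 = 0$.

```python
# Checking transversality of the embedded arrangement via the rank of the
# matrix of embedded normals, cf. (4.58) and (4.63).  Toy data assumed.
import numpy as np

rng = np.random.default_rng(2)

def B_tilde(pairs):
    # Stack the embedded normals b~ = [-b^T mu ; b] for each (b, mu), cf. (4.49).
    return np.column_stack([np.concatenate([[-b @ mu], b]) for b, mu in pairs])

b11, b12 = np.array([1., 0, 0]), np.array([0., 1, 0])   # A_1: an affine line along e3
b21, b22 = np.array([0., 1, 0]), np.array([0., 0, 1])   # A_2: an affine line along e1
mu1, mu2 = rng.standard_normal(3), rng.standard_normal(3)

Bt = B_tilde([(b11, mu1), (b12, mu1), (b21, mu2), (b22, mu2)])   # 4 x 4
print(np.linalg.matrix_rank(Bt))     # 4 for generic translations: transversal
Bt0 = B_tilde([(b11, 0 * mu1), (b12, 0 * mu1), (b21, 0 * mu2), (b22, 0 * mu2)])
print(np.linalg.matrix_rank(Bt0))    # 3 when mu_1 = mu_2 = 0: not transversal
```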

As a corollary of Theorems 4.6, 4.44 and 4.45, we get the correctness theorem of ASC for the case of affine subspaces, whose proof is straightforward and is thus omitted.

Theorem 4.47. Let $\Psi = \bigcup_{i=1}^n \mathcal{A}_i$ be a union of affine subspaces of $\mathbb{R}^D$, with $\mathcal{A}_i = \mathcal{S}_i + \mu_i$ and $\mu_i = B_i a_i$, where $B_i \in \mathbb{R}^{D \times c_i}$ is a basis for $\mathcal{S}_i^\perp$ with $c_i = \operatorname{codim} \mathcal{S}_i$. Let $\tilde{\Phi} = \bigcup_{i=1}^n \tilde{\mathcal{S}}_i$ be the union of $n$ linear subspaces of $\mathbb{R}^{D+1}$ induced by the embedding $\phi_0 : \mathbb{R}^D \hookrightarrow \mathbb{R}^{D+1}$ of (4.37). Let $\mathcal{X}$ be a finite subset of $\Psi$ and denote by $\tilde{\mathcal{X}} \subset \tilde{\Phi}$ the image of $\mathcal{X}$ under $\phi_0$.


Let $p_1, \ldots, p_s$ be a basis for $\mathcal{I}_{\tilde{\mathcal{X}},n}$, the vector space of homogeneous polynomials of degree $n$ that vanish on $\tilde{\mathcal{X}}$. Let $x \in \mathcal{X} \cap \mathcal{A}_1 - \bigcup_{i>1} \mathcal{A}_i$, and denote $\tilde{x}_1 = \phi_0(x)$. Define

$$\tilde{b}_k := \nabla p_k |_{\tilde{x}_1} \in \mathbb{R}^{D+1}, \qquad k = 1, \ldots, s, \qquad (4.64)$$

and without loss of generality, let $\tilde{b}_1, \ldots, \tilde{b}_\ell$ be a maximal linearly independent subset of $\tilde{b}_1, \ldots, \tilde{b}_s$. Define further $(\gamma_k, b_k) \in \mathbb{R} \times \mathbb{R}^D$ and $(\gamma_1, B_1) \in \mathbb{R}^\ell \times \mathbb{R}^{D \times \ell}$ as

$$\tilde{b}_k =: \begin{bmatrix} \gamma_k \\ b_k \end{bmatrix}, \quad k = 1, \ldots, \ell, \qquad (4.65)$$
$$\gamma_1 := [\gamma_1, \ldots, \gamma_\ell]^\top, \qquad B_1 := [b_1, \ldots, b_\ell]. \qquad (4.66)$$

If $\mathcal{X}$ is in general position in $\Psi$, $\Phi = \bigcup_{i=1}^n \mathcal{S}_i$ is transversal, and $a_1, \ldots, a_n$ do not lie in the zero-measure set of a proper algebraic variety of $\mathbb{R}^{c_1} \times \cdots \times \mathbb{R}^{c_n}$, then

$$\mathcal{A}_1 = \operatorname{Span}(B_1)^\perp - B_1 \big(B_1^\top B_1\big)^{-1} \gamma_1. \qquad (4.67)$$

Remark 4.48. The acute reader may notice that we still need to answer the question of whether $\Psi$ admits a finite subset $\mathcal{X}$ in general position, to begin with. This answer is affirmative: if $\Psi$ satisfies the hypothesis of Theorem 4.45, then $\tilde{\Phi}$ will be transversal, and so by Proposition 4.41 $\mathcal{I}_\Psi$ is generated in degree $\leq n$, in which case the existence of $\mathcal{X}$ follows from Theorem 2.9 in [72].

4.6 Conclusions

In this chapter of the thesis we revisited the algebraic algorithm for learning a union of linear subspaces and advanced its state-of-the-art both theoretically and algorithmically. Our main theoretical contribution was to introduce the idea of filtrations of subspace arrangements in the context of subspace clustering, and to use it to develop a new algebraic algorithm called Filtrated Algebraic Subspace Clustering (FASC). The main advantage of FASC over the classic polynomial differentiation algebraic algorithm is that its implementation does not require estimating the rank of a matrix. As a consequence, the numerical adaptation of FASC, called Filtrated Spectral Algebraic Subspace Clustering (FSASC), not only dramatically improves over the performance of earlier algebraic clustering algorithms, but on several occasions involving both synthetic and real data it is also competitive with state-of-the-art subspace clustering algorithms. As a second theoretical contribution, we demonstrated theoretically the correctness of algebraically clustering affine subspaces by means of homogeneous coordinates.

Overall, this chapter addressed one of the two main issues associated with Algebraic Subspace Clustering (ASC), i.e., its robustness to noise. Still, the second issue of ASC, i.e., its exponential complexity, remains an open problem, which we hope will be resolved by future research, perhaps by means of ideas such as those developed in Chapter 3 of this thesis.

4.7 Appendix

4.7.1 Notions from commutative algebra

A central concept in the theory of polynomial algebra is that of an ideal:


Definition 4.49 (Ideal). A subset $\mathcal{I}$ of the ring $\mathbb{R}[x] := \mathbb{R}[x_1, \ldots, x_D]$ of polynomials is called an ideal if for every $p, q \in \mathcal{I}$ and every $r \in \mathbb{R}[x]$ we have that $p + q \in \mathcal{I}$ and $rp \in \mathcal{I}$.

If $p_1, \ldots, p_n$ are elements of $\mathbb{R}[x]$, then the ideal generated by these elements is the set of all linear combinations of the $p_i$ with coefficients in $\mathbb{R}[x]$.

A polynomial $f \in \mathbb{R}[x]$ is called homogeneous of degree $r$ if all the monomials that appear in $f$ have degree $r$. An ideal $\mathcal{I}$ is called homogeneous if it is generated by homogeneous elements, i.e., $\mathcal{I} = \langle f_1, \ldots, f_s \rangle$ where $f_i$ is a homogeneous polynomial of degree $r_i$. The reader can check that an ideal $\mathcal{I}$ is homogeneous if and only if $\mathcal{I} = \bigoplus_{k \geq 0} \mathcal{I}_k$, where $\mathcal{I}_k = \mathcal{I} \cap \mathbb{R}[x]_k$. It is not hard to see that the intersection and the sum of two (homogeneous) ideals is a (homogeneous) ideal. In performing algebraic operations with ideals it is also

useful to have a notion of product of ideals:

Definition 4.50 (Product of ideals). Let $\mathcal{I}_1, \mathcal{I}_2$ be ideals of $\mathbb{R}[x]$. The product $\mathcal{I}_1 \mathcal{I}_2$ of $\mathcal{I}_1, \mathcal{I}_2$ is defined to be the set of all elements of the form $p_1 q_1 + \cdots + p_m q_m$ for any $m \in \mathbb{N}$, $p_i \in \mathcal{I}_1$, $q_i \in \mathcal{I}_2$.

The notion of a prime ideal is a natural generalization of the notion of a prime number.

Prime ideals play a fundamental role in the study of the structure of general ideals, in

analogy to the role that prime numbers have in the structure of integers.

Definition 4.51 (Prime ideal). An ideal $\mathfrak{p}$ of $\mathbb{R}[x]$ is called prime if whenever $pq \in \mathfrak{p}$ for some $p, q \in \mathbb{R}[x]$, then either $p \in \mathfrak{p}$ or $q \in \mathfrak{p}$.

We note that if $\mathfrak{p}$ is a homogeneous ideal, then in order to check whether $\mathfrak{p}$ is prime, it is enough to consider homogeneous polynomials $f, g$ in the above definition.

Proposition 4.52. Let $\mathfrak{p}, \mathcal{I}_1, \ldots, \mathcal{I}_n$ be ideals of $\mathbb{R}[x]$ with $\mathfrak{p}$ being prime. If $\mathfrak{p} \supset \mathcal{I}_1 \cap \cdots \cap \mathcal{I}_n$, then $\mathfrak{p} \supset \mathcal{I}_i$ for some $i \in [n]$.

Proof. Suppose $\mathfrak{p} \not\supset \mathcal{I}_i$ for all $i$. Then for every $i$ there exists $x_i \in \mathcal{I}_i - \mathfrak{p}$. But then $\prod_{i=1}^n x_i \in \bigcap_{i=1}^n \mathcal{I}_i \subset \mathfrak{p}$ and, since $\mathfrak{p}$ is prime, some $x_j \in \mathfrak{p}$, a contradiction.

A final notion that we need is that of a radical ideal:

Definition 4.53. An ideal $\mathcal{I}$ of $\mathbb{R}[x]$ is called radical if whenever some $p \in \mathbb{R}[x]$ satisfies $p^\ell \in \mathcal{I}$ for some $\ell$, then it must be the case that $p \in \mathcal{I}$.

Radical ideals have a very nice structure:

Theorem 4.54. Every radical ideal $\mathcal{I}$ of $\mathbb{R}[x]$ can be written uniquely as the finite intersection of prime ideals. Conversely, the intersection of a finite number of prime ideals is

always a radical ideal.

For further information on commutative algebra we refer the reader to [1] and [25] or

to the more advanced treatment of [73].

4.7.2 Notions from algebraic geometry

The central object of algebraic geometry is that of an algebraic variety:


Definition 4.55 (Algebraic variety). A subset $\mathcal{Y}$ of $\mathbb{R}^D$ is called an algebraic variety or algebraic set if it is the zero-locus of some ideal $\mathfrak{a}$ of $\mathbb{R}[x]$, i.e.,

$$\mathcal{Y} = \big\{ y \in \mathbb{R}^D : p(y) = 0, \; \forall p \in \mathfrak{a} \big\}. \qquad (4.68)$$

A standard notation is to write $\mathcal{Y} = \mathcal{Z}(\mathfrak{a})$, where the operator $\mathcal{Z}(\cdot)$ denotes zero set.

If $\mathcal{Y} = \mathcal{Z}(\mathfrak{a})$ is an algebraic variety, then certainly every polynomial of $\mathfrak{a}$ vanishes on the entire $\mathcal{Y}$ (by definition). However, there may be more polynomials with that property, and they have a special name:

Definition 4.56 (Vanishing ideal). The vanishing ideal of a subset $\mathcal{Y}$ of $\mathbb{R}^D$, denoted $\mathcal{I}_{\mathcal{Y}}$, is the set of all polynomials of $\mathbb{R}[x]$ that vanish on every point of $\mathcal{Y}$, i.e.,

$$\mathcal{I}_{\mathcal{Y}} = \{ p \in \mathbb{R}[x] : p(y) = 0, \; \forall y \in \mathcal{Y} \}. \qquad (4.69)$$

It can be shown that the algebraic varieties induce a topology on $\mathbb{R}^D$:

Definition 4.57 (Zariski topology). The Zariski topology on $\mathbb{R}^D$ is the topology generated by defining the closed sets to be all the algebraic varieties.

Applying the definition of an irreducible topological space in the context of the Zariski topology, we obtain:

Definition 4.58 (Irreducible algebraic variety). An algebraic variety $\mathcal{Y}$ is called irreducible if it cannot be written as the union of two proper subsets of $\mathcal{Y}$ that are closed in the induced topology of $\mathcal{Y}$.$^{18}$

$^{18}$We note that certain authors (e.g. [48]) reserve the term algebraic variety to refer to an irreducible closed set.


The following theorem is one of many interesting connections between geometry and

algebra:

Theorem 4.59. An algebraic variety $\mathcal{Y} = \mathcal{Z}(\mathfrak{a})$ is irreducible if and only if its vanishing ideal $\mathcal{I}_{\mathcal{Y}}$ is prime.

Perhaps not surprisingly, irreducible varieties are the fundamental building blocks of

general varieties:

Theorem 4.60 (Irreducible decomposition). Every algebraic variety $\mathcal{Y}$ of $\mathbb{R}^D$ can be uniquely written as $\mathcal{Y} = \mathcal{Y}_1 \cup \cdots \cup \mathcal{Y}_n$, where the $\mathcal{Y}_i$ are irreducible varieties and there are no inclusions $\mathcal{Y}_i \subset \mathcal{Y}_j$ for $i \neq j$. The varieties $\mathcal{Y}_i$ are referred to as the irreducible components of $\mathcal{Y}$.

Proposition 4.61. If $\mathcal{Y}_1 = \mathcal{Z}(\mathfrak{a}_1)$, $\mathcal{Y}_2 = \mathcal{Z}(\mathfrak{a}_2)$ are algebraic varieties such that $\mathfrak{a}_1 \subset \mathfrak{a}_2$, then $\mathcal{Y}_1 \supset \mathcal{Y}_2$.

Theorem 4.62. If two subsets $\mathcal{Y}_1, \mathcal{Y}_2$ of $\mathbb{R}^D$ satisfy the inclusion $\mathcal{Y}_1 \supset \mathcal{Y}_2$, then their vanishing ideals will satisfy the reverse inclusion $\mathcal{I}_{\mathcal{Y}_1} \subset \mathcal{I}_{\mathcal{Y}_2}$.

Proposition 4.63. Let $\mathcal{Y}_1 = \mathcal{Z}(\mathfrak{a}_1)$, $\mathcal{Y}_2 = \mathcal{Z}(\mathfrak{a}_2)$ be varieties of $\mathbb{R}^D$. Then $\mathcal{Y}_1 \cap \mathcal{Y}_2 = \mathcal{Z}(\mathfrak{a}_1 + \mathfrak{a}_2)$.

The final theorem that we present characterizes the set of all points that arise as the zero set of the vanishing ideal of an arbitrary subset $\mathcal{Y}$ of $\mathbb{R}^D$.

Proposition 4.64. Let $\mathcal{Y}$ be a subset of $\mathbb{R}^D$ and $\mathcal{I}_{\mathcal{Y}}$ its vanishing ideal. Then $\mathcal{Z}(\mathcal{I}_{\mathcal{Y}}) = \mathcal{Y}^{cl}$, where $\mathcal{Y}^{cl}$ is the topological closure of $\mathcal{Y}$ in the Zariski topology.


Finally, it should be noted that most of classic and modern algebraic geometry [48] assumes that the underlying field (in this thesis $\mathbb{R}$) is algebraically closed [59]. An example of an algebraically closed field is the complex numbers $\mathbb{C}$. Consequently, one should be careful when using results such as Hilbert's Nullstellensatz.

4.7.3 Subspace arrangements and their vanishing ideals

We begin by defining the main mathematical object of interest in this chapter.

Definition 4.65 (Subspace arrangement). A union $\mathcal{A} = \bigcup_{i=1}^n \mathcal{S}_i$ of linear subspaces $\mathcal{S}_1, \ldots, \mathcal{S}_n$ of $\mathbb{R}^D$, with $D \geq 1$, $n \geq 1$, is called a subspace arrangement.

It is often technically convenient to work with subspace arrangements that are as general as possible. One way to capture this notion is by the following definition.

Definition 4.66 (Transversal subspace arrangement [20]). A subspace arrangement $\mathcal{A} = \bigcup_{i=1}^n \mathcal{S}_i \subset \mathbb{R}^D$ is called transversal if for any subset $I$ of $[n]$, the codimension of $\bigcap_{i \in I} \mathcal{S}_i$ is the minimum between $D$ and the sum of the codimensions of all $\mathcal{S}_i$, $i \in I$, i.e.,

$$\operatorname{codim}\left( \bigcap_{i \in I} \mathcal{S}_i \right) = \min\Big\{ D, \sum_{i \in I} c_i \Big\}, \qquad (4.70)$$

where $c_i = \operatorname{codim} \mathcal{S}_i$.

Transversality is a geometric condition on the subspaces $\mathcal{S}_1, \ldots, \mathcal{S}_n$ that requires all possible intersections among the subspaces to be as small as possible, as allowed by the dimensions of the subspaces. To see this, let $I$ be a subset of $[n]$, which without loss of


generality can be taken to be $I = \{1, 2, \ldots, \ell\} = [\ell]$, where $\ell \leq n$. For every $i \in I$ let $B_i$ be a $D \times c_i$ matrix whose columns form a basis for $\mathcal{S}_i^\perp$, where $c_i = \operatorname{codim} \mathcal{S}_i := D - \dim \mathcal{S}_i$, and let $B = [B_1 \ \ldots \ B_\ell]$. Then the intersection $\bigcap_{i \in I} \mathcal{S}_i$ can be described algebraically as

$$x \in \bigcap_{i \in I} \mathcal{S}_i \;\Leftrightarrow\; B^\top x = 0. \qquad (4.71)$$

From (4.71) it is clear that the dimension of $\bigcap_{i \in I} \mathcal{S}_i$ is equal to the dimension of the right nullspace of $B$, or equivalently

$$\operatorname{codim}\left( \bigcap_{i \in I} \mathcal{S}_i \right) = \operatorname{Rank}(B). \qquad (4.72)$$

Now, $B$ is a $D \times \big(\sum_{i \in I} c_i\big)$ matrix and so its rank will satisfy

$$\operatorname{Rank}(B) \leq \min\Big\{ D, \sum_{i \in I} c_i \Big\}, \qquad (4.73)$$

which in conjunction with (4.72) justifies the geometric interpretation of Definition 4.4.

In fact, if $\mathcal{A}$ is not transversal, then there exists some subset $I \subset [n]$ for which $B$ is rank-deficient, which shows that certain algebraic relations must be satisfied among the parametrizations $B_1, \ldots, B_n$ of the subspaces $\mathcal{S}_1, \ldots, \mathcal{S}_n$. This is essentially the argument behind the proof of the next proposition, which shows that transversality is not a strong condition; rather, it will be satisfied almost surely.

Proposition 4.67. Let $\mathcal{A}$ be a subspace arrangement consisting of $n$ linear subspaces of $\mathbb{R}^D$ of dimensions $d_1, \ldots, d_n$. If $\mathcal{A}$ is chosen uniformly at random, then $\mathcal{A}$ will be transversal with probability $1$.


Example 4.68. An arrangement $\mathcal{A} = \mathcal{S}_1 \cup \mathcal{S}_2 \cup \mathcal{S}_3 \subset \mathbb{R}^D$ such that $\mathcal{S}_1 \subset \mathcal{S}_2$ is non-transversal, since $\operatorname{codim}(\mathcal{S}_1 \cap \mathcal{S}_2) = \operatorname{codim} \mathcal{S}_1 = c_1 < \min\{D, c_1 + c_2\}$. Note that when choosing $\mathcal{S}_1, \mathcal{S}_2, \mathcal{S}_3$ uniformly at random, the event $\mathcal{S}_1 \subset \mathcal{S}_2$ has probability zero.

Example 4.69. An arrangement of three planes $\mathcal{A} = \mathcal{H}_1 \cup \mathcal{H}_2 \cup \mathcal{H}_3$ of $\mathbb{R}^3$ that intersect on a line is non-transversal, because $\operatorname{codim}(\mathcal{H}_1 \cap \mathcal{H}_2 \cap \mathcal{H}_3) = 2 < \min\{3, 1+1+1\}$. When $\mathcal{H}_1, \mathcal{H}_2, \mathcal{H}_3$ are chosen uniformly at random, which is equivalent to choosing their normal vectors $b_1, b_2, b_3$ uniformly at random, the three planes intersect on a line only if $b_1, b_2, b_3$ are linearly dependent, which is a probability zero event.
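The rank characterization (4.70)-(4.72) also gives a direct numerical test of transversality. The following sketch (names and the random construction are assumptions of this sketch, using numpy) checks Definition 4.66 for a random arrangement, which is transversal almost surely as in Proposition 4.67, and for a degenerate arrangement with an inclusion as in Example 4.68.

```python
# Numerically checking transversality via codim(intersection) = Rank(B), cf. (4.72).
import numpy as np
from itertools import combinations

rng = np.random.default_rng(3)
D = 4
codims = [1, 2, 2]                                     # c_i = codim S_i
Bs = [np.linalg.qr(rng.standard_normal((D, c)))[0] for c in codims]  # bases of S_i^perp

def is_transversal(Bs, D):
    # Definition 4.66: for every index set I, Rank([B_i]_{i in I}) = min{D, sum c_i}.
    for r in range(1, len(Bs) + 1):
        for I in combinations(range(len(Bs)), r):
            B = np.hstack([Bs[i] for i in I])
            expected = min(D, sum(Bs[i].shape[1] for i in I))
            if np.linalg.matrix_rank(B) != expected:
                return False
    return True

print(is_transversal(Bs, D))                           # True almost surely
# A degenerate arrangement: the second subspace shares a normal direction with
# the first (so one subspace contains the other, as in Example 4.68).
Bs_bad = [Bs[0], np.hstack([Bs[0], Bs[1][:, :1]]), Bs[2]]
print(is_transversal(Bs_bad, D))                       # False
```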

Another notion of subspace arrangements in general position that is closely related to transversal arrangements is that of linearly general subspaces.

Definition 4.70 (Linearly general subspace arrangement [16]). A subspace arrangement

$\mathcal{A} = \bigcup_{i=1}^n \mathcal{S}_i$ is called linearly general if for every subset $I \subset [n]$ we have

$$\dim\left( \sum_{i \in I} \mathcal{S}_i \right) = \min\Big\{ D, \sum_{i \in I} d_i \Big\}, \qquad (4.74)$$

where $d_i = \dim \mathcal{S}_i$.

As the reader may suspect, the notions of transversal and linearly general are dual to each other in the following sense.

Proposition 4.71. A subspace arrangement $\bigcup_{i=1}^n \mathcal{S}_i$ is transversal if and only if the subspace arrangement $\bigcup_{i=1}^n \mathcal{S}_i^\perp$ is linearly general.


Proof. This follows by noting that, with reference to the matrix $B$ constructed below Definition 4.4, we have

$$\operatorname{codim}\left( \bigcap_{i \in I} \mathcal{S}_i \right) = \operatorname{Rank}(B) = \dim\left( \sum_{i \in I} \mathcal{S}_i^\perp \right), \qquad (4.75)$$

and that $\operatorname{codim} \mathcal{S}_i = \dim \mathcal{S}_i^\perp$.

In order to understand some important properties of subspace arrangements, it is necessary to examine the algebraic-geometric properties of a single subspace $\mathcal{S}$ of $\mathbb{R}^D$ of dimension $d$. Let $b_1, \ldots, b_c$ be a basis for the orthogonal complement of $\mathcal{S}$, where $c = D - d$, and define the polynomials $p_i(x) = b_i^\top x$, $i = 1, \ldots, c$. Notice that $p_i(x)$ is homogeneous

of degree $1$ and is thus also referred to as a linear form. If a point $x$ belongs to $\mathcal{S}$, then $p_i(x) = 0$, $\forall i$. Conversely, if a point $x \in \mathbb{R}^D$ satisfies $p_i(x) = 0$, $\forall i$, then $x \in \mathcal{S}$. This shows that $\mathcal{S} = \mathcal{Z}(p_1, \ldots, p_c)$, i.e., $\mathcal{S}$ is an algebraic variety. Notice that the set of linear forms that vanish on $\mathcal{S}$ is a vector space and the polynomials $p_i$, $i = 1, \ldots, c$, form a basis. The proposition that follows asserts that the vanishing ideal of $\mathcal{S}$, i.e., the set of all polynomials that vanish at every point of $\mathcal{S}$, is in fact generated by the polynomials $p_i(x)$, $i = 1, \ldots, c$.

Proposition 4.72 (Vanishing Ideal of a Subspace). Let $\mathcal{S} = \operatorname{Span}(b_1, \ldots, b_c)^\perp$ be a subspace of $\mathbb{R}^D$ defined as the orthogonal complement of the space spanned by $\{b_1, \ldots, b_c\}$

over $\mathbb{R}$. Then $\mathcal{I}_{\mathcal{S}}$ is generated over $\mathbb{R}[x]$ by the linear forms $b_1^\top x, \ldots, b_c^\top x$.

Proof. Let $\{b_1, \ldots, b_c\}$ be a basis for the orthogonal complement of $\mathcal{S}$ and augment it to a basis $\{b_1, \ldots, b_c, h_1, \ldots, h_{D-c}\}$ of $\mathbb{R}^D$, where $h_1, \ldots, h_{D-c}$ is a basis for $\mathcal{S}$. Now define a transformation $\phi : \mathbb{R}^D \rightarrow \mathbb{R}^D$, which maps the basis $\{b_1, \ldots, b_c, h_1, \ldots, h_{D-c}\}$ to the canonical basis $\{e_1, \ldots, e_D\}$ of $\mathbb{R}^D$, where $e_i$ is the $i$-th column of the $D \times D$ identity matrix. Notice that $b_i$ is mapped to $e_i$ and as a consequence $\mathcal{S}$ is mapped to the orthogonal complement of the vectors $e_1, \ldots, e_c$. Since $\phi$ is a vector space isomorphism, we do not lose generality if we assume from the beginning that

$\mathcal{S} = \operatorname{Span}(e_1, \ldots, e_c)^\perp = \operatorname{Span}(e_{c+1}, \ldots, e_D)$ and the vector space of linear forms that vanish on $\mathcal{S}$ is generated by $x_1, \ldots, x_c$. Notice that $x \in \mathcal{S}$ if and only if the first $c$ coordinates of $x$ are zero.

Now let $g \in \mathcal{I}_{\mathcal{S}}$. We can write $g(x) = \bar{g}(x_{c+1}, \ldots, x_D) + \sum_{i=1}^c x_i g_i(x)$. By hypothesis we have $g(0, \ldots, 0, a_{c+1}, \ldots, a_D) = 0$ for any real numbers $a_{c+1}, \ldots, a_D$, which implies

that $\bar{g}(a_{c+1}, \ldots, a_D) = 0$, $\forall a_{c+1}, \ldots, a_D \in \mathbb{R}$. This in turn implies that $\bar{g}$ is the zero polynomial.$^{19}$ Hence $g(x) = \sum_{i=1}^c x_i g_i(x)$, which shows that $g$ is inside the ideal generated by the linear forms that vanish on $\mathcal{S}$.

In algebraic-geometric notation, the above proposition can be concisely stated as

$$\mathcal{I}_{\mathcal{Z}(b_1^\top x, \ldots, b_c^\top x)} = \langle b_1^\top x, \ldots, b_c^\top x \rangle. \qquad (4.76)$$

Interestingly, the vanishing ideal of a subspace is a prime ideal:

Proposition 4.73. Let $\mathcal{S}$ be a subspace of $\mathbb{R}^D$. Then $\mathcal{S}$ is irreducible in the Zariski topology of $\mathbb{R}^D$, or equivalently, $\mathcal{I}_{\mathcal{S}}$ is a prime ideal of $\mathbb{R}[x]$.

$^{19}$We can prove by induction on $d$ that if $F$ is an infinite field and $g(x_1, \ldots, x_d) = 0$, $\forall x_1, \ldots, x_d \in F$, then $g = 0$.


Proof. As in the proof of Proposition 4.72 we can assume that $(x_1, \ldots, x_c)$ is a basis for the linear forms of $\mathbb{R}[x]$ that vanish on $\mathcal{S}$. Then $\mathcal{I}_{\mathcal{S}} = \langle x_1, \ldots, x_c \rangle$ and our task is to show that $\mathcal{I}_{\mathcal{S}}$ is prime. So let $f, g$ be homogeneous polynomials such that $fg \in \mathcal{I}_{\mathcal{S}}$ and suppose that $f \notin \mathcal{I}_{\mathcal{S}}$. We will show that $g \in \mathcal{I}_{\mathcal{S}}$. We can write $f = f_1 + f_2$, where $f_1, f_2$ are polynomials such that $f_1 \in \mathcal{I}_{\mathcal{S}}$ and $f_2 \in \mathbb{R}[x_{c+1}, \ldots, x_D]$. Similarly $g = g_1 + g_2$, with $g_1 \in \mathcal{I}_{\mathcal{S}}$ and $g_2 \in \mathbb{R}[x_{c+1}, \ldots, x_D]$. Since by hypothesis $f \notin \mathcal{I}_{\mathcal{S}}$, it must be the case that $f_2 \neq 0$. To show that $g \in \mathcal{I}_{\mathcal{S}}$, it is enough to show that $g_2 = 0$.

Towards that end, notice that $fg = (f g_1 + f_1 g_2) + f_2 g_2$, where $f g_1 + f_1 g_2 \in \mathcal{I}_{\mathcal{S}}$. Since by hypothesis $fg \in \mathcal{I}_{\mathcal{S}}$, we also have that $f_2 g_2 \in \mathcal{I}_{\mathcal{S}}$. This means that there exist polynomials $h_1, \ldots, h_c \in \mathbb{R}[x_1, \ldots, x_D]$, such that $f_2 g_2 = x_1 h_1 + \cdots + x_c h_c$. However, none of the variables $x_1, \ldots, x_c$ appear on the left hand side of this equation, and so this equation is true only when both sides are equal to zero. Since by hypothesis $f_2 \neq 0$, this implies that $g_2 = 0$, and so $g \in \mathcal{I}_{\mathcal{S}}$.

Alternative proof: A more direct proof exists if we assume familiarity of the reader with quotient rings. In particular, it is known that an ideal $\mathcal{I}$ of a commutative ring $R$ is prime if and only if the quotient ring $R/\mathcal{I}$ has no zero-divisors [1]. By noticing that $\mathbb{R}[x_1, \ldots, x_D] / \langle x_1, \ldots, x_c \rangle \cong \mathbb{R}[x_{c+1}, \ldots, x_D]$ we immediately see that $\langle x_1, \ldots, x_c \rangle$ is prime.

Returning to subspace arrangements, we see that a subspace arrangement $\mathcal{A} = \mathcal{S}_1 \cup \cdots \cup \mathcal{S}_n$ is the union of irreducible algebraic varieties $\mathcal{S}_1, \ldots, \mathcal{S}_n$. This immediately suggests that the subspace arrangement itself is an algebraic variety. This was established


in [72] via an alternative argument. Additionally, in view of Theorem 4.60, the irreducible

components of $\mathcal{A}$ are precisely its constituent subspaces $\mathcal{S}_1, \ldots, \mathcal{S}_n$, which also proves that a subspace arrangement can be uniquely written as the union of subspaces among which

there are no inclusions. We summarize these observations in the following theorem:

Theorem 4.74. Let $\mathcal{S}_1, \ldots, \mathcal{S}_n$ be subspaces of $\mathbb{R}^D$ such that no inclusions exist between any two subspaces. Then the arrangement $\mathcal{A} = \mathcal{S}_1 \cup \cdots \cup \mathcal{S}_n$ is an algebraic variety and its irreducible components are $\mathcal{S}_1, \ldots, \mathcal{S}_n$.

The vanishing ideal of a subspace arrangement $\mathcal{A} = \bigcup_{i=1}^n \mathcal{S}_i$ is readily seen to relate to the vanishing ideals of its irreducible components via the formula

$$\mathcal{I}_{\mathcal{A}} = \mathcal{I}_{\mathcal{S}_1} \cap \cdots \cap \mathcal{I}_{\mathcal{S}_n}. \qquad (4.77)$$

Since $\mathcal{I}_{\mathcal{S}_i}$ is a prime ideal, Theorem 4.54 implies that $\mathcal{I}_{\mathcal{A}}$ is radical and that $\mathcal{A}$ uniquely determines the ideals $\mathcal{I}_{\mathcal{S}_1}, \ldots, \mathcal{I}_{\mathcal{S}_n}$, assuming that there are no inclusions between the subspaces. Hence, retrieving the irreducible components of a subspace arrangement is equivalent to computing the prime factors of its vanishing ideal $\mathcal{I}_{\mathcal{A}}$.

Since the ideal of a single subspace $\mathcal{S}_1$ is generated by linear forms, i.e., it is generated in degree $1$, one may be tempted to conjecture that the ideal $\mathcal{I}_{\mathcal{A}}$ of a union of $n$ subspaces is generated in degree less than or equal to $n$. In fact, this is true:

Proposition 4.75. Let $\mathcal{A}$ be an arrangement of $n$ linear subspaces of $\mathbb{R}^D$. Then its vanishing ideal $\mathcal{I}_{\mathcal{A}}$ is generated in degree $\leq n$.


Proof. By [21] the Castelnuovo-Mumford regularity$^{20}$ of $\mathcal{I}_{\mathcal{A}}$ is bounded above by $n$. But by construction, the CM-regularity of an ideal bounds from above the maximal degree of a

generator of the ideal.

A crucial property of a subspace arrangement $\mathcal{A}$ in relation to the theory of Algebraic Subspace Clustering is that for any non-zero vanishing polynomial $p$ on $\mathcal{A}$, the orthogonal complement of the space spanned by the gradient of $p$ at some point $x \in \mathcal{A}$ contains the subspace to which $x$ belongs.

Proposition 4.76. Let $\mathcal{A} = \bigcup_{i=1}^n \mathcal{S}_i$ be a subspace arrangement of $\mathbb{R}^D$, $p \in \mathcal{I}_{\mathcal{A}}$ and $x \in \mathcal{A}$, say $x \in \mathcal{S}_i$ for some $i \in [n]$. Then $\nabla p|_x \perp \mathcal{S}_i$.

Proof. Take $p \in \mathcal{I}_{\mathcal{A}}$. From $\mathcal{I}_{\mathcal{A}} = \mathcal{I}_{\mathcal{S}_1} \cap \cdots \cap \mathcal{I}_{\mathcal{S}_n}$ we have that $\mathcal{I}_{\mathcal{A}} \subset \mathcal{I}_{\mathcal{S}_i}$. Hence $p \in \mathcal{I}_{\mathcal{S}_i}$. Now, from Proposition 4.72 we know that $\mathcal{I}_{\mathcal{S}_i}$ is generated by a basis among all linear forms that vanish on $\mathcal{S}_i$, i.e., by a basis of $\mathcal{I}_{\mathcal{S}_i,1}$. If $(b_{i1}, \ldots, b_{i c_i})$ is an $\mathbb{R}$-basis for $\mathcal{S}_i^\perp$, then $(b_{i1}^\top x, \ldots, b_{i c_i}^\top x)$ is an $\mathbb{R}$-basis for $\mathcal{I}_{\mathcal{S}_i,1}$ and a set of generators for $\mathcal{I}_{\mathcal{S}_i}$ over $\mathbb{R}[x]$. Hence we can write $p(x) = \sum_{j=1}^{c_i} (b_{ij}^\top x)\, g_j(x)$, where $g_j(x) \in \mathbb{R}[x]$. Taking the gradient of both sides of the above equation we get $\nabla p = \sum_{j=1}^{c_i} g_j(x)\, b_{ij} + \sum_{j=1}^{c_i} (b_{ij}^\top x)\, \nabla g_j$. Now let $x$ be any point of $\mathcal{S}_i$. Evaluating both sides at $x$ we have $\nabla p|_x = \sum_{j=1}^{c_i} g_j(x)\, b_{ij} + \sum_{j=1}^{c_i} (b_{ij}^\top x)\, \nabla g_j|_x$. By hypothesis we have $b_{ij}^\top x = 0$, $\forall j$, and so we obtain $\nabla p|_x = \sum_{j=1}^{c_i} g_j(x)\, b_{ij} \in \mathcal{S}_i^\perp$.

One may wonder when it is the case that the gradient of a vanishing polynomial on a subspace arrangement $\mathcal{A}$ is zero at every point of $\mathcal{A}$. This is answered by

$^{20}$Please see [25], [16], [21] or [20] for the definition of Castelnuovo-Mumford regularity.


Proposition 4.77. Let $\mathcal{A} = \bigcup_{i=1}^n \mathcal{S}_i$ be a subspace arrangement of $\mathbb{R}^D$ and let $p \in \mathcal{I}_{\mathcal{A}}$. Then $\nabla p|_x = 0$, $\forall x \in \mathcal{A}$, if and only if $p \in \bigcap_{i=1}^n \mathcal{I}_{\mathcal{S}_i}^2$.

Proof. ($\Rightarrow$) Suppose that $p \in \mathcal{I}_{\mathcal{A}}$ is such that $\nabla p|_x = 0$, $\forall x \in \mathcal{A}$. Since $\mathcal{I}_{\mathcal{A}} \subset \mathcal{I}_{\mathcal{S}_i}$, $\forall i \in [n]$, by Proposition 4.72, $p(x)$ can be written as

$$p(x) = \sum_{j=1}^{c_i} g_{i,j}(x)\,(b_{i,j}^\top x), \qquad (4.78)$$

where $c_i$ is the codimension of $\mathcal{S}_i$, $(b_{i,1}, \ldots, b_{i,c_i})$ is a basis for $\mathcal{S}_i^\perp$ and the $g_{i,j}(x)$ are polynomials. Now the hypothesis $\nabla p|_x = 0$, $\forall x \in \mathcal{A}$, implies that $\partial p / \partial x_k |_x = 0$, $\forall x \in \mathcal{A}$, $\forall k \in [D]$. Thus $\partial p / \partial x_k \in \mathcal{I}_{\mathcal{A}}$ and so $\partial p / \partial x_k \in \mathcal{I}_{\mathcal{S}_i}$. Hence, again by Proposition 4.72, $\partial p / \partial x_k$ can be written as

$$\partial p / \partial x_k = \sum_{j=1}^{c_i} h_{i,j,k}(x)\,(b_{i,j}^\top x). \qquad (4.79)$$

Differentiating equation (4.78) with respect to $x_k$ gives

$$\partial p / \partial x_k = \sum_{j=1}^{c_i} (\partial g_{i,j} / \partial x_k)\,(b_{i,j}^\top x) + \sum_{j=1}^{c_i} g_{i,j}(x)\, b_{i,j}(k). \qquad (4.80)$$

From equations (4.79), (4.80) we obtain

$$\sum_{j=1}^{c_i} g_{i,j}(x)\, b_{i,j}(k) = \sum_{j=1}^{c_i} \big( h_{i,j,k}(x) - \partial g_{i,j} / \partial x_k \big)\,(b_{i,j}^\top x), \qquad (4.81)$$

which can equivalently be written as

$$\sum_{j=1}^{c_i} b_{i,j}(k)\, g_{i,j}(x) = \sum_{j=1}^{c_i} q_{i,j,k}(x)\,(b_{i,j}^\top x), \qquad (4.82)$$


where $q_{i,j,k}(x) := h_{i,j,k}(x) - \partial g_{i,j} / \partial x_k$. Note that equation (4.82) is true for every $k \in [D]$. We can write these $D$ equations in matrix form

$$\big[ b_{i,1} \;\; b_{i,2} \;\; \cdots \;\; b_{i,c_i} \big] \begin{bmatrix} g_{i,1}(x) \\ g_{i,2}(x) \\ \vdots \\ g_{i,c_i}(x) \end{bmatrix} = Q(x) \begin{bmatrix} b_{i,1}^\top x \\ b_{i,2}^\top x \\ \vdots \\ b_{i,c_i}^\top x \end{bmatrix}, \qquad (4.83)$$

where $Q(x)$ is a $D \times c_i$ polynomial matrix with entries in $\mathbb{R}[x]$. We can view equation (4.83) as a linear system of equations over the field $\mathbb{R}(x)$. Define $B_i := [\, b_{i,1} \;\; b_{i,2} \;\; \cdots \;\; b_{i,c_i} \,]$. The columns of $B_i$ form a basis of $\mathcal{S}_i^\perp$, and so they will be linearly independent over $\mathbb{R}$.

Consequently, the matrix $B_i^\top B_i$ will be invertible over $\mathbb{R}$ and its inverse will also be the inverse of $B_i^\top B_i$ over the larger field $\mathbb{R}(x)$. Multiplying both sides of equation (4.83)

from the left with $(B_i^\top B_i)^{-1} B_i^\top$, we obtain

$$\begin{bmatrix} g_{i,1}(x) \\ g_{i,2}(x) \\ \vdots \\ g_{i,c_i}(x) \end{bmatrix} = (B_i^\top B_i)^{-1} B_i^\top Q(x) \begin{bmatrix} b_{i,1}^\top x \\ b_{i,2}^\top x \\ \vdots \\ b_{i,c_i}^\top x \end{bmatrix}. \qquad (4.84)$$

Note that $(B_i^\top B_i)^{-1} B_i^\top Q(x) \in (\mathbb{R}[x])^{c_i \times c_i}$, and so equation (4.84) gives that $g_{i,j}(x) \in \mathcal{I}_{\mathcal{S}_i}$, $\forall j \in [c_i]$. Returning back to equation (4.78), we readily see that $p \in \mathcal{I}_{\mathcal{S}_i}^2$, $\forall i \in [n]$, which implies that $p \in \bigcap_{i=1}^n \mathcal{I}_{\mathcal{S}_i}^2$.

($\Leftarrow$) Suppose that $p \in \bigcap_{i=1}^n \mathcal{I}_{\mathcal{S}_i}^2$. Since $\bigcap_{i=1}^n \mathcal{I}_{\mathcal{S}_i}^2 \subset \bigcap_{i=1}^n \mathcal{I}_{\mathcal{S}_i} = \mathcal{I}_{\mathcal{A}}$, we see that $p$ must be a vanishing polynomial. Since $p \in \mathcal{I}_{\mathcal{S}_i}^2$, by Proposition 4.72 we can write $p(x) =$

$\sum_{j,j'=1}^{c_i} g_{j,j'}(x)\,(b_{i,j}^\top x)(b_{i,j'}^\top x)$, from which it follows that $\nabla p|_{x_i} = 0$, $\forall x_i \in \mathcal{S}_i$. Since this holds for any $i \in [n]$, we get that $\nabla p|_x = 0$, $\forall x \in \mathcal{A}$.

We conclude with a theorem lying at the heart of Algebraic Subspace Clustering.

Theorem 4.78. Let $\mathcal{A} = \bigcup_{i=1}^n \mathcal{S}_i$ be a transversal subspace arrangement of $\mathbb{R}^D$ with vanishing ideal $\mathcal{I}_{\mathcal{A}}$. Let $\mathcal{J}_{\mathcal{A}}$ be the product ideal $\mathcal{J}_{\mathcal{A}} = \mathcal{I}_{\mathcal{S}_1} \cdots \mathcal{I}_{\mathcal{S}_n}$. Then the two ideals are equal at degrees $\ell \geq n$, i.e., $\mathcal{I}_{\mathcal{A},\ell} = \mathcal{J}_{\mathcal{A},\ell}$, $\forall \ell \geq n$.

Theorem 4.78 implies that every polynomial of degree $n$ that vanishes on a transversal subspace arrangement $\mathcal{A}$ of $n$ subspaces is a linear combination of products of linear forms vanishing on $\mathcal{A}$, a fundamental fact that is used repeatedly in the main text of this chapter. Theorem 4.78 was first proved in Proposition 3.4 of [16], in the context of the Castelnuovo-

Mumford regularity of products of ideals generated by linear forms. It was later reproved in

[20] using a Hilbert series argument and the result from [21] on the Castelnuovo-Mumford regularity of a subspace arrangement.
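Theorem 4.78 can be checked numerically in a small transversal case. The sketch below (an assumed construction with numpy, not thesis code) samples two random hyperplanes of $\mathbb{R}^3$, computes the degree-$2$ vanishing polynomials of the sampled union via a Veronese null space, and verifies that this space is one-dimensional and spanned by the product of the two defining linear forms, i.e., that $\mathcal{I}_{\mathcal{A},2} = \mathcal{J}_{\mathcal{A},2}$ in this instance.

```python
# Numerical check of Theorem 4.78 for two random hyperplanes in R^3.
import numpy as np
from itertools import combinations_with_replacement

rng = np.random.default_rng(4)
b1, b2 = rng.standard_normal(3), rng.standard_normal(3)    # normals of S_1, S_2

def sample_plane(b, m=40):
    # Sample m points on the hyperplane {b^T x = 0}.
    Q = np.linalg.qr(np.column_stack([b, rng.standard_normal((3, 2))]))[0][:, 1:]
    return Q @ rng.standard_normal((2, m))

X = np.hstack([sample_plane(b1), sample_plane(b2)])        # points on S_1 U S_2

# Degree-2 Veronese map; its left null space represents I_{A,2}.
monos = list(combinations_with_replacement(range(3), 2))
V = np.stack([np.prod(X[list(m), :], axis=0) for m in monos])
U, s, _ = np.linalg.svd(V)
print("dim I_{A,2} =", int(np.sum(s < 1e-8 * s[0])))       # 1

# Coefficients of the product (b1^T x)(b2^T x) in the same monomial basis.
prod_coeffs = np.array([b1[i] * b2[j] + (b1[j] * b2[i] if i != j else 0)
                        for (i, j) in monos])
c = U[:, -1]                                               # the null direction
cosang = abs(c @ prod_coeffs) / (np.linalg.norm(c) * np.linalg.norm(prod_coeffs))
print(np.isclose(cosang, 1.0))                             # True: the two spans agree
```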

Chapter 5

Conclusions

This thesis attempted to advance the state-of-the-art of subspace learning methods, in the

regime where the subspaces have high relative-dimension, i.e., when their dimension is

comparable to that of the ambient space. This is a regime where most state-of-the-art

subspace learning methods are theoretically inapplicable.

Towards that end, the first part of this thesis introduced a new subspace learning method,

called Dual Principal Component Pursuit (DPCP), which is based on non-convex optimization, and is naturally suited for subspaces of high relative-dimension. The thesis gave an extensive theoretical analysis of DPCP for the case of single subspace learning in the presence of outliers, as well as the case of learning an arrangement of hyperplanes. Extensive experiments demonstrated that on real data DPCP is competitive with state-of-the-art methods, while on synthetic data DPCP is significantly superior.


The second part of the thesis improved the already existing method of Algebraic Subspace Clustering (ASC), one of the theoretically most suitable methods for learning subspaces of high relative-dimension. In particular, the thesis introduced Filtrated Algebraic

Subspace Clustering (FASC) and its numerical implementation Filtrated Spectral Algebraic

Subspace Clustering (FSASC), which resolved the lack of robustness to noise of traditional

ASC. The thesis also gave a theoretical justification for the correctness of algebraically clustering affine subspaces through homogeneous coordinates.

There are many open directions for future research. These include solving the computational complexity bottleneck of FSASC (inherited from that of ASC), and it is an interesting question to what extent the ideas surrounding DPCP can prove helpful. Regarding

DPCP itself, future research will deal with probabilistic analysis, proving convergence of

the Iteratively-Reweighted-Least-Squares algorithm implementing DPCP, designing scalable

DPCP algorithms suitable for high-dimensional data, and exploring applications, particularly in connection with deep networks.

Bibliography

[1] M.E. Atiyah and I.G. MacDonald. Introduction to Commutative Algebra. Westview

Press, 1994.

[2] L. Bako. Identification of switched linear systems via sparse optimization. Automat-

ica, 47(4):668–677, 2011.

[3] L. Balzano, R. Nowak, and B. Recht. Online identification and tracking of subspaces

from highly incomplete information. In Communication, Control, and Computing

(Allerton), 2010 48th Annual Allerton Conference on, pages 704–711. IEEE, 2010.

[4] R. Basri and D. W. Jacobs. Lambertian reflectance and linear subspaces. IEEE

transactions on pattern analysis and machine intelligence, 25(2):218–233, 2003.

[5] J. Beck. Sums of distances between points on a sphere—an application of the theory

of irregularities of distribution to discrete geometry. Mathematika, 31(01):33–41,

1984.


[6] T.E. Boult and L.G. Brown. Factorization-based segmentation of motions. In IEEE

Workshop on Motion Understanding, pages 179–186, 1991.

[7] S. Boyd, N. Parikh, E. Chu, B. Peleato, and J. Eckstein. Distributed optimization and

statistical learning via the alternating direction method of multipliers. Foundations

and Trends in Machine Learning, 3(1):1–122, 2010.

[8] Y. Boykov, O. Veksler, and R. Zabih. Fast approximate energy minimization via

graph cuts. IEEE Transactions on Pattern Analysis and Machine Intelligence,

23(11):1222–1239, 2001.

[9] P. S. Bradley and O. L. Mangasarian. k-plane clustering. Journal of Global Opti-

mization, 16(1):23–32, 2000.

[10] J. S. Brauchart and P. J. Grabner. Distributing many points on spheres: minimal

energy and designs. Journal of Complexity, 31(3):293–326, 2015.

[11] J.P. Brooks, J.H. Dula,´ and E.L. Boone. A pure l1-norm principal component anal-

ysis. Computational statistics & data analysis, 61:83–98, 2013.

[12] E. Candès, X. Li, Y. Ma, and J. Wright. Robust principal component analysis? Journal of the ACM, 58(3), 2011.

[13] E. Candès, M. Wakin, and S. Boyd. Enhancing sparsity by reweighted $\ell_1$ minimization. Journal of Fourier Analysis and Applications, 14(5):877–905, 2008.


[14] R. Chartrand and W. Yin. Iteratively reweighted algorithms for compressive sensing.

In 2008 IEEE International Conference on Acoustics, Speech and Signal Processing,

pages 3869–3872. IEEE, 2008.

[15] G. Chen and G. Lerman. Spectral curvature clustering (SCC). International Journal

of Computer Vision, 81(3):317–330, 2009.

[16] A. Conca and J. Herzog. Castelnuovo-mumford regularity of products of ideals.

Collectanea Mathematica, 54(2):137–152, 2003.

[17] J. Costeira and T. Kanade. A multibody factorization method for independently

moving objects. International Journal of Computer Vision, 29(3):159–179, 1998.

[18] D.A. Cox, J. Little, and D. O’Shea. Ideals, Varieties, and Algorithms. Springer,

2007.

[19] I. Daubechies, R. DeVore, M. Fornasier, and C. S. Güntürk. Iteratively reweighted least squares minimization for sparse recovery. Communications on Pure and Applied Mathematics, 63(1):1–38, 2010.

[20] H. Derksen. Hilbert series of subspace arrangements. Journal of Pure and Applied

Algebra, 209(1):91–98, 2007.

[21] H. Derksen and J. Sidman. A sharp bound for the castelnuovo-mumford regularity

of subspace arrangements. Advances in Mathematics, 172:151–157, 2002.


[22] J. Dick. Applications of geometric discrepancy in numerical analysis and statistics.

Applied Algebra and Number Theory, 2014.

[23] Chris Ding, Ding Zhou, Xiaofeng He, and Hongyuan Zha. R1-pca: rotational in-

variant l1-norm principal component analysis for robust subspace factorization. In

Proceedings of the 23rd international conference on Machine learning, pages 281–

288. ACM, 2006.

[24] B. Draper, M. Kirby, J. Marks, T. Marrinan, and C. Peterson. A flag representation

for finite collections of subspaces of mixed dimensions. and its

Applications, 451:15–32, 2014.

[25] D. Eisenbud. Commutative Algebra with a View Toward Algebraic Geometry.

Springer, 2004.

[26] L. Elden. Solving quadratically constrained least squares problems using a

differential-geometric approach. BIT Numerical Mathematics, 42(2):323–335, 2002.

[27] E. Elhamifar and R. Vidal. Sparse subspace clustering. In IEEE Conference on

Computer Vision and Pattern Recognition, pages 2790–2797, 2009.

[28] E. Elhamifar and R. Vidal. Clustering disjoint subspaces via sparse representation. In

IEEE International Conference on Acoustics, Speech, and Signal Processing, 2010.

[29] E. Elhamifar and R. Vidal. Robust classification using structured sparse representa-

tion. In IEEE Conference on Computer Vision and Pattern Recognition, 2011.


[30] E. Elhamifar and R. Vidal. Sparse subspace clustering: Algorithm, theory, and

applications. IEEE Transactions on Pattern Analysis and Machine Intelligence,

35(11):2765–2781, 2013.

[31] P. Favaro, R. Vidal, and A. Ravichandran. A closed form solution to robust subspace

estimation and clustering. In IEEE Conference on Computer Vision and Pattern

Recognition, pages 1801 –1807, 2011.

[32] L. Fei-Fei, R. Fergus, and P. Perona. Learning generative visual models from few

training examples: An incremental bayesian approach tested on 101 object cate-

gories. Comput. Vis. Image Underst., 106(1):59–70, April 2007.

[33] J. Feng, H. Xu, and S. Yan. Online robust pca via stochastic optimization. In Ad-

vances in Neural Information Processing Systems, pages 404–412, 2013.

[34] M. A. Fischler and R. C. Bolles. RANSAC random sample consensus: A paradigm

for model fitting with applications to image analysis and automated cartography.

Communications of the ACM, 26:381–395, 1981.

[35] W. Gander. Least squares with a quadratic constraint. Numerische Mathematik,

36(3):291–307, 1980.

[36] A.S. Georghiades, P.N. Belhumeur, and D.J. Kriegman. From few to many: Illu-

mination cone models for face recognition under variable lighting and pose. IEEE

Trans. Pattern Anal. Mach. Intelligence, 23(6):643–660, 2001.


[37] K. Ghalieh and M. Hajja. The fermat point of a spherical triangle. The Mathematical

Gazette, 80(489):561–564, 1996.

[38] G. H Golub and U. Von Matt. Quadratically constrained least squares and quadratic

problems. Numerische Mathematik, 59(1):561–580, 1991.

[39] G.H. Golub and c. F. Van Loan. Matrix computations, volume 3. Johns Hopkins

Univ Pr, 1996.

[40] P. J. Grabner, B. Klinger, and R.F. Tichy. Discrepancies of point sequences on the

sphere and numerical integration. Mathematical Research, 101:95–112, 1997.

[41] P. J. Grabner and R.F. Tichy. Spherical designs, discrepancy and numerical integra-

tion. Math. Comp., 60(201):327–336, 1993.

[42] A. Gruber and Y. Weiss. Multibody factorization with uncertainty and missing data

using the EM algorithm. In IEEE Conference on Computer Vision and Pattern

Recognition, volume I, pages 707–714, 2004.

[43] Inc. Gurobi Optimization. Gurobi optimizer reference manual, 2015.

[44] G. Harman. Variations on the koksma-hlawka inequality. Uniform Distribution

Theory, 5(1):65–78, 2010.

[45] J. Harris. Algebraic Geometry: A First Course. Springer-Verlag, 1992.

[46] R. Hartley and R. Vidal. The multibody trifocal tensor: Motion segmentation from 3 perspective views. In IEEE Conference on Computer Vision and Pattern Recognition, volume I, pages 769–775, 2004.

[47] R. Hartley and A. Zisserman. Multiple View Geometry in Computer Vision. Cambridge, 2nd edition, 2004.

[48] R. Hartshorne. Algebraic Geometry. Springer, 1977.

[49] E. Hlawka. Discrepancy and Riemann integration. Studies in Pure Mathematics, pages 121–129, 1971.

[50] H. Hotelling. Analysis of a complex of statistical variables into principal components. Journal of Educational Psychology, 24:417–441, 1933.

[51] K. Huang, Y. Ma, and R. Vidal. Minimum effective dimension for mixtures of subspaces: A robust GPCA algorithm and its applications. In IEEE Conference on Computer Vision and Pattern Recognition, volume II, pages 631–638, 2004.

[52] P. Huber. Robust Statistics. John Wiley & Sons, New York, 1981.

[53] S. Javed, S. H. Oh, T. Bouwmans, and S. K. Jung. Handbook of robust low-rank and sparse matrix decomposition: Applications in image and video processing. Chapman and Hall/CRC, pages 457–480, 2016.

[54] B. Jin, D. Lorenz, and S. Schiffler. Elastic-net regularization: Error estimates and active set methods. Inverse Problems, 25(11), 2009.

[55] I. Jolliffe. Principal Component Analysis. Springer-Verlag, 2nd edition, 2002.

[56] K. Kanatani. Motion segmentation by subspace separation and model selection. In IEEE International Conference on Computer Vision, volume 2, pages 586–591, 2001.

[57] W. Ku, R. H. Storer, and C. Georgakis. Disturbance detection and isolation by dynamic principal component analysis. Chemometrics and Intelligent Laboratory Systems, 30:179–196, 1995.

[58] L. Kuipers and H. Niederreiter. Uniform distribution of sequences. Courier Corporation, 2012.

[59] S. Lang. Algebra. Springer, 2005.

[60] G. Lerman and T. Zhang. Robust recovery of multiple subspaces by geometric `p minimization. Annals of Statistics, 39(5):2686–2715, 2011.

[61] G. Lerman and T. Zhang. `p-recovery of the most significant subspace among multiple subspaces with outliers. Constructive Approximation, 40(3):329–385, 2014.

[62] G. Lerman, M. B. McCoy, J. A. Tropp, and T. Zhang. Robust computation of linear models by convex relaxation. Foundations of Computational Mathematics, 15(2):363–410, 2015.

[63] Q. Liang and Q. Ye. Computing singular values of large matrices with an inverse-free preconditioned Krylov subspace method. Electronic Transactions on Numerical Analysis, 42:197–221, 2014.

[64] G. Liu, Z. Lin, S. Yan, J. Sun, and Y. Ma. Robust recovery of subspace structures by low-rank representation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 35(1):171–184, January 2013.

[65] G. Liu, Z. Lin, S. Yan, J. Sun, Y. Yu, and Y. Ma. Robust recovery of subspace structures by low-rank representation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2012.

[66] G. Liu, Z. Lin, and Y. Yu. Robust subspace segmentation by low-rank representation. In International Conference on Machine Learning, pages 663–670, 2010.

[67] R. Livni, D. Lehavi, S. Schein, H. Nachliely, S. Shalev-Shwartz, and A. Globerson. Vanishing component analysis. In International Conference on Machine Learning, pages 597–605, 2013.

[68] S. Lloyd, M. Mohseni, and P. Rebentrost. Quantum principal component analysis. Nature Physics, (10):631–633, 2014.

[69] C-Y. Lu, H. Min, Z-Q. Zhao, L. Zhu, D-S. Huang, and S. Yan. Robust and efficient subspace segmentation via least squares regression. In European Conference on Computer Vision, 2012.

[70] Y. Ma, H. Derksen, W. Hong, and J. Wright. Segmentation of multivariate mixed data via lossy coding and compression. IEEE Transactions on Pattern Analysis and Machine Intelligence, 29(9):1546–1562, 2007.

[71] Y. Ma and R. Vidal. Identification of deterministic switched ARX systems via identification of algebraic varieties. In Hybrid Systems: Computation and Control, pages 449–465. Springer-Verlag, 2005.

[72] Y. Ma, A. Y. Yang, H. Derksen, and R. Fossum. Estimation of subspace arrangements with applications in modeling and segmenting mixed data. SIAM Review, 50(3):413–458, 2008.

[73] H. Matsumura. Commutative Ring Theory. Cambridge Studies in Advanced Mathematics, 2006.

[74] B. C. Moore. Principal component analysis in linear systems: Controllability, observability, and model reduction. IEEE Transactions on Automatic Control, 26(1):17–32, 1981.

[75] S. Nam, M. E. Davies, M. Elad, and R. Gribonval. The cosparse analysis model and algorithms. Applied and Computational Harmonic Analysis, 34(1):30–56, 2013.

[76] B. Nasihatkon and R. Hartley. Graph connectivity in sparse subspace clustering. In IEEE Conference on Computer Vision and Pattern Recognition, 2011.

[77] Y. Panagakis and C. Kotropoulos. Elastic net subspace clustering applied to pop/rock music structure analysis. Pattern Recognition Letters, 38:46–53, 2014.

[78] K. Pearson. On lines and planes of closest fit to systems of points in space. The London, Edinburgh and Dublin Philosophical Magazine and Journal of Science, 2:559–572, 1901.

[79] A. Price, N. Patterson, R. M. Plenge, M. E. Weinblatt, N. A. Shadick, and D. Reich. Principal components analysis corrects for stratification in genome-wide association studies. Nature Genetics, (38):904–909, 2006.

[80] Q. Qu, J. Sun, and J. Wright. Finding a sparse vector in a subspace: Linear sparsity using alternating directions. In Advances in Neural Information Processing Systems, pages 3401–3409, 2014.

[81] A. Sampath and J. Shan. Segmentation and reconstruction of polyhedral building roofs from aerial lidar point clouds. IEEE Transactions on Geoscience and Remote Sensing, 48(3):1554–1567, 2010.

[82] H. Schwetlick and U. Schnabel. Iterative computation of the smallest singular value and the corresponding singular vectors of a matrix. Linear Algebra and its Applications, 371:1–30, 2003.

[83] N. Silberman, P. Kohli, D. Hoiem, and R. Fergus. Indoor segmentation and support inference from RGBD images. In European Conference on Computer Vision, 2012.

[84] M. Soltanolkotabi and E. J. Candès. A geometric analysis of subspace clustering with outliers. Annals of Statistics, 40(4):2195–2238, 2012.

[85] H. Späth and G. A. Watson. On orthogonal linear `1 approximation. Numerische Mathematik, 51(5):531–543, 1987.

[86] D. A. Spielman, H. Wang, and J. Wright. Exact recovery of sparsely-used dictionaries. In Proceedings of the 23rd international joint conference on Artificial Intelligence, pages 3087–3090. AAAI Press, 2013.

[87] M. Steinbach, G. Karypis, V. Kumar, et al. A comparison of document clustering techniques. KDD workshop on text mining, 400(1):525–526, 2000.

[88] J. Sun, Q. Qu, and J. Wright. Complete dictionary recovery over the sphere. In International Conference on Sampling Theory and Applications (SampTA), pages 407–410. IEEE, 2015.

[89] J. Sun, Q. Qu, and J. Wright. Complete dictionary recovery over the sphere I: Overview and the geometric picture. arXiv preprint arXiv:1511.03607, 2015.

[90] J. Sun, Q. Qu, and J. Wright. Complete dictionary recovery over the sphere II: Recovery by Riemannian trust-region method. arXiv preprint arXiv:1511.04777, 2015.

[91] J. Sun, Q. Qu, and J. Wright. Complete dictionary recovery using nonconvex optimization. In Proceedings of the 32nd International Conference on Machine Learning (ICML-15), pages 2351–2360, 2015.

[92] C. Sutton and A. McCallum. An introduction to conditional random fields for relational learning. In Introduction to Statistical Relational Learning, volume 2. MIT Press, 2006.

[93] M. Tipping and C. Bishop. Mixtures of probabilistic principal component analyzers. Neural Computation, 11(2):443–482, 1999.

[94] R. Tron and R. Vidal. A benchmark for the comparison of 3-D motion segmentation algorithms. In IEEE Conference on Computer Vision and Pattern Recognition, pages 1–8, 2007.

[95] M. C. Tsakiris and R. Vidal. Abstract algebraic-geometric subspace clustering. In Asilomar Conference on Signals, Systems and Computers, 2014.

[96] M. C. Tsakiris and R. Vidal. Dual principal component pursuit. In ICCV Workshop on Robust Subspace Learning and Computer Vision, pages 10–18, 2015.

[97] M. C. Tsakiris and R. Vidal. Filtrated spectral algebraic subspace clustering. In ICCV Workshop on Robust Subspace Learning and Computer Vision, pages 28–36, 2015.

[98] M. C. Tsakiris and R. Vidal. Dual principal component pursuit. (preprint), 2016.

[99] M. C. Tsakiris and R. Vidal. Hyperplane clustering via dual principal component pursuit. (preprint), 2016.

[100] M. C. Tsakiris and R. Vidal. Algebraic clustering of affine subspaces. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017 (to appear).

[101] M. C. Tsakiris and R. Vidal. Filtrated algebraic subspace clustering. SIAM Journal on Imaging Sciences, 2017 (to appear).

[102] P. Tseng. Nearest q-flat to m points. Journal of Optimization Theory and Applications, 105(1):249–252, 2000.

[103] D. Ucar, Q. Hu, and K. Tan. Combinatorial chromatin modification patterns in the human genome revealed by subspace clustering. Nucleic Acids Research, page gkr016, 2011.

[104] R. Vidal. Generalized Principal Component Analysis (GPCA): an Algebraic Geometric Approach to Subspace Clustering and Motion Segmentation. PhD thesis, University of California, Berkeley, August 2003.

[105] R. Vidal. Subspace clustering. IEEE Signal Processing Magazine, 28(3):52–68, March 2011.

[106] R. Vidal and P. Favaro. Low rank subspace clustering (LRSC). Pattern Recognition Letters, 43:47–61, 2014.

[107] R. Vidal and R. Hartley. Motion segmentation with missing data by PowerFactorization and Generalized PCA. In IEEE Conference on Computer Vision and Pattern Recognition, volume II, pages 310–316, 2004.

[108] R. Vidal and R. Hartley. Three-view multibody structure from motion. IEEE Transactions on Pattern Analysis and Machine Intelligence, 30(2):214–227, February 2008.

[109] R. Vidal, Y. Ma, and J. Piazzi. A new GPCA algorithm for clustering subspaces by fitting, differentiating and dividing polynomials. In IEEE Conference on Computer Vision and Pattern Recognition, volume I, pages 510–517, 2004.

[110] R. Vidal, Y. Ma, and S. Sastry. Generalized Principal Component Analysis (GPCA). In IEEE Conference on Computer Vision and Pattern Recognition, volume I, pages 621–628, 2003.

[111] R. Vidal, Y. Ma, and S. Sastry. Generalized Principal Component Analysis (GPCA). IEEE Transactions on Pattern Analysis and Machine Intelligence, 27(12):1–15, 2005.

[112] R. Vidal, Y. Ma, and S. Sastry. Generalized Principal Component Analysis. Springer-Verlag, 2016.

[113] R. Vidal, Y. Ma, S. Soatto, and S. Sastry. Two-view multibody structure from motion. International Journal of Computer Vision, 68(1):7–25, 2006.

[114] R. Vidal, R. Tron, and R. Hartley. Multiframe motion segmentation with missing data using PowerFactorization and GPCA. International Journal of Computer Vision, 79(1):85–105, 2008.

[115] U. von Luxburg. A tutorial on spectral clustering. Statistics and Computing, 17, 2007.

[116] S. Vyas and L. Kumaranayake. Constructing socio-economic status indices: how to use principal components analysis. Health Policy and Planning, 21:459–468, 2006.

[117] L. Wu and A. Stathopoulos. A preconditioned hybrid SVD method for accurately computing singular triplets of large matrices. SIAM Journal on Scientific Computing, 37(5):S365–S388, 2015.

[118] H. Xu, C. Caramanis, and S. Sanghavi. Robust PCA via outlier pursuit. In Advances in Neural Information Processing Systems, pages 2496–2504, 2010.

[119] J. Yan and M. Pollefeys. A general framework for motion segmentation: Independent, articulated, rigid, non-rigid, degenerate and non-degenerate. In European Conference on Computer Vision, pages 94–106, 2006.

[120] C. You, C.-G. Li, D. Robinson, and R. Vidal. Oracle based active set algorithm for scalable elastic net subspace clustering. In IEEE Conference on Computer Vision and Pattern Recognition, 2016.

[121] T. Zhang, A. Szlam, and G. Lerman. Median k-flats for hybrid linear modeling with many outliers. In Workshop on Subspace Methods, pages 234–241, 2009.

[122] T. Zhang, A. Szlam, Y. Wang, and G. Lerman. Hybrid linear modeling via local best-fit flats. International Journal of Computer Vision, 100(3):217–240, 2012.

Vita

Manolis C. Tsakiris received his undergraduate degree in Electrical and Computer Engineering from the National Technical University of Athens in 2007. He received his M.S. degree in Communications and Signal Processing from Imperial College London in 2009, where he was ranked first in the program and received the best M.S. thesis award. In 2009-2010 he was a research student at the University of São Paulo, where he was awarded a second M.S. degree. His scientific interests lie in the intersection of machine learning and signal processing with algebra, geometry and optimization. In 2014 he won the best student paper award at the Asilomar Conference on Signals, Systems and Computers for his work "Abstract algebraic-geometric subspace clustering."
