Angular Embedding: a New Angular Robust Principal Component Analysis

Angular Embedding: A New Angular Robust Principal Component Analysis

Shenglan Liu*, Yang Yu
School of Computer Science and Technology, Faculty of Electronic Information and Electrical Engineering, Dalian University of Technology, Dalian 116024, Liaoning, China. {liusl, [email protected]

arXiv:2011.11013v1 [cs.LG] 22 Nov 2020

*Corresponding author.
Copyright © 2021, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.

Abstract

As a widely used method in machine learning, principal component analysis (PCA) shows excellent properties for dimensionality reduction. It is a serious problem that PCA is sensitive to outliers, which has been addressed by numerous Robust PCA (RPCA) versions. However, the existing state-of-the-art RPCA approaches cannot easily remove or tolerate outliers in a non-iterative manner. To tackle this issue, this paper proposes Angular Embedding (AE) to formulate a straightforward RPCA approach based on angular density, which is further adapted for large-scale or high-dimensional data. Furthermore, a trimmed AE (TAE) is introduced to deal with data containing large-scale outliers. Extensive experiments on both synthetic and real-world datasets with vector-level or pixel-level outliers demonstrate that the proposed AE/TAE outperforms the state-of-the-art RPCA-based methods.

Introduction

As machine learning is widely used in many applications, principal component analysis (PCA) (Wold, Esbensen, and Geladi 1987) has become a remarkable method for dimensionality reduction (Vasan and Surendiran 2016; Adiwijaya et al. 2018), computer vision (Bouwmans et al. 2018), etc. However, one of the most important issues of PCA is that the principal components (PCs) are sensitive to outliers (Zhao et al. 2014), which cannot be well addressed under the ℓ2-norm. RPCA-based methods (Candès et al. 2011; Zhao et al. 2014) enhance the robustness of PCA through low-rank decomposition. Besides, iterative subspace learning (Roweis 1998; Hauberg, Feragen, and Black 2014) is another approach to obtain robust PCs for data mining tasks (e.g., data classification (Xia et al. 2013), clustering (Ding et al. 2006) and information retrieval (Wang et al. 2015)). However, most existing robust methods for PCA rely on iterative optimization: the outliers cannot be easily removed or tolerated by non-iterative approaches, which limits their applications to real-world problems (e.g., image analysis in medical science (Lazcano et al. 2017) or on FPGAs (Fernandez et al. 2019)).

To tackle the problems above, this paper proposes a straightforward RPCA approach named Angular Embedding (AE), which optimizes PCs with angular density via a quadratic optimization of the cosine value on the hypersphere manifold. Based on the cosine measurement, AE enhances the robustness of ℓ2-norm based PCA and outperforms the existing methods. As an eigendecomposition-based PCA approach, AE is further improved to reduce the computational complexity by weighing the dimensionality against the sample size. Theoretically, we prove the superiority of the quadratic cosine optimization in AE by analyzing its effects on the determination of PCs and on outlier suppression. In practice, the experimental results of AE on both synthetic and real data show its effectiveness. Furthermore, to address data with large-scale outliers, we propose a pre-trimming theory based on the cosine measurement and propose trimmed AE (TAE). Based on the TAE theory, experiments on background modeling and shadow removal tasks show its superiority.

Related Work

In recent years, robust extensions of PCA have attracted increasing attention due to their wide applications. Most existing methods fall into the three categories below.

RPCA with low-rank representation. Robust PCA (Candès et al. 2011) provides a robust approach to recover both the low-rank and the sparse components by decomposing the data matrix as X = L + S with Principal Component Pursuit. Based on RPCA, many improved versions have been proposed, including the tensor version (Lu et al. 2019) (which requires more memory for SVD results), to reduce the error of the low-dimensional representation caused by outliers. Methods such as (Candès et al. 2011) require iterative calculation of the SVD of the data, with cubic time complexity. (Xue et al. 2018; Yi et al. 2016) utilize gradient descent to avoid frequent matrix decompositions. Although the above dimensionality reduction methods are robust, it is difficult to fit them on large datasets and to design a robust mathematical model for capturing the linear transformation matrix between latent and observed variables. All these reasons limit the applications of the RPCA methods above.

RPCA for subspace learning. Subspace learning aims to obtain a low-dimensional subspace spanned by a robust projection matrix. ℓ1-norm based subspace estimation (Ke and Kanade 2005) is a robust version of standard PCA using alternative convex programming. The determination of robust subspaces with ℓ1-norm based methods, however, is computationally expensive. Besides, questionable results may be produced in some tasks like clustering, since the global solutions are not rotationally invariant. To this end, R1-PCA (Ding et al. 2006) was proposed to improve ℓ1-norm PCA. Unfortunately, R1-PCA is also time-consuming, similar to another work named Deterministic High-dimensional Robust PCA (Feng, Xu, and Yan 2012), which updates the weighted covariance matrix with frequent matrix decompositions. Furthermore, based on statistical features such as covariance, Roweis proposed EM PCA (Roweis 1998), which uses the EM algorithm to obtain PCs for Gaussian data.

RPCA with angle measurement. Recently, angle-based methods (Graf, Smola, and Borer 2003; Liu, Feng, and Qiao 2014; Wang et al. 2018, 2017) have become new approaches in machine learning. The angle metric between samples is less sensitive to outliers and has been applied in many domains (e.g., RNA structure analysis (Sargsyan, Wright, and Lim 2012), texture mapping in computer vision (Wilson et al. 2014)). The non-iterative linear dimensionality reduction method of (Liu, Feng, and Qiao 2014), which utilizes the cosine value to obtain a robust projection matrix, motivates the angle-based RPCA methods. Angle PCA (Wang et al. 2017) employs the cotangent value and iterative matrix decomposition to realize the idea of PCA, which is robust but computationally expensive. Based on EM PCA (Roweis 1998) and (trimmed) averages, a more scalable approach is introduced by the Grassmann Average (GA) (Hauberg, Feragen, and Black 2014), which is also related to the angle measurement. For GA, the drawback is lower parallelization when calculating large numbers of PCs, because of the iterative matrix multiplications and the orthogonalization of PCs. In fact, most RPCA methods are based on iterative solutions, which are limited by time-consuming steps (e.g., matrix decomposition in PCA) and leave computational resources for parallelization on CPUs, GPUs, etc. unused.

Angular Robust Principal Component Analysis on Hypersphere Manifold

ℓ2-norm based PCA is sensitive to outliers and can be misled by the Euclidean distance. Much research on robust PCA pays attention to characterizing the error with the ℓ1-norm or other approaches. The proposed AE, which follows the ℓ2-norm, instead determines the principal components (PCs) based on angular density rather than on distance. In practice, angular density is not misled by outliers with large Euclidean distance, especially directional outliers (far from the direction of the PCs).

Angular Density Framework on Hypersphere

Given a set of zero-mean data X = {x_1, x_2, ..., x_n} ⊂ R^D under the assumption of a Gaussian distribution, the angular density can be defined as the number of samples within a unit angle. It is measured through the geodesic distance D_ij between any two normalized samples u_i and u_j on the unit (radius R = 1) hypersphere manifold,

    D_ij = R β_ij = β_ij,                                  (1)

where β_ij = ⟨u_i, u_j⟩ is in radians. That is, the angular density between two samples can be quantified by measuring the surface density on a unit hypersphere manifold of codimension one in D dimensions. The input samples can thus first be normalized by mapping the original D-dimensional zero-mean samples onto a unit hypersphere manifold. For i = 1, 2, ..., n, the unit vector u_i corresponding to each sample x_i is obtained by computing

    u_i = x_i / ||x_i||_2.                                 (2)

The normalized (no longer zero-mean) inputs on the (D−1)-sphere are denoted U = {u_1, u_2, ..., u_n}; below, U also denotes the D × n matrix [u_1, ..., u_n] stacking these unit vectors as columns.

The leading PC. Let q be the leading PC, which corresponds to the position of zero angle in D dimensions; then each sample u_i can be represented by a directional angle θ_i ∈ (−π, π], in radians. To further simplify the calculation, the optimization of angular density is carried out using the sine value of the directional variable θ_i instead of the geodesic distance D_ij on the hypersphere. The leading PC can then be determined by formulating

    q = arg min_q Σ_{i=1}^n sin²θ_i
      = arg max_q Σ_{i=1}^n cos²θ_i                        (3)
      = arg max_q qᵀUUᵀq.

Multiple PCs. Let Q = {q_1, ..., q_d} ⊂ R^D be the top-d orthogonal PCs. The orthogonal projection û_i = QQᵀu_i of the sample u_i in R^D can be represented as û_i = Σ_{j=1}^d (u_iᵀq_j) q_j / (q_jᵀq_j). We define θ_i to be the angle between the sample u_i and its projection û_i. Each q_j, j ∈ {1, ..., d}, contributes to θ_i under cos²θ_i = Σ_{j=1}^d cos²ϑ_ij, where ϑ_ij = ⟨u_i, q_j⟩ denotes the angle between u_i and q_j in D dimensions. The PCs can then be determined by formulating angular-density-based PCA as

    Q = arg max_Q Σ_{i=1}^n Σ_{j=1}^d cos²ϑ_ij
      = arg max_Q trace(QᵀUUᵀQ).                           (4)
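Under Eqs. (2)–(4), once the samples are normalized, the AE PCs are simply the top-d eigenvectors of the D × D matrix UUᵀ. A minimal sketch, assuming rows-as-samples input; the function name and interface are hypothetical, and the paper's complexity reduction for large D and the TAE trimming step are omitted:

```python
import numpy as np

def angular_embedding(X, d=1):
    """Sketch of AE: PCs as the top-d eigenvectors of U U^T (Eqs. 2-4).

    X : (n, D) array of zero-mean samples, one sample per row.
    Returns Q : (D, d) array with orthonormal columns (the PCs).
    """
    # Eq. (2): map each zero-mean sample onto the unit hypersphere.
    U = X / np.linalg.norm(X, axis=1, keepdims=True)
    # Eq. (4): maximize trace(Q^T U U^T Q); with rows-as-samples the
    # D x D matrix U U^T of the paper is U.T @ U here.
    C = U.T @ U
    w, V = np.linalg.eigh(C)            # eigenvalues in ascending order
    return V[:, np.argsort(w)[::-1][:d]]  # top-d eigenvectors
```

Since the objective is a trace of a quadratic form, the maximizer over orthonormal Q is given directly by eigendecomposition; no iteration is needed, which is the non-iterative property the paper emphasizes.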
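To see why the angular objective in Eq. (3) resists outliers of large Euclidean norm: after the normalization of Eq. (2), every sample contributes at most 1 to qᵀUUᵀq, so a single distant outlier cannot dominate the objective the way it dominates the ℓ2 covariance. A toy comparison on synthetic data (not from the paper's experiments):

```python
import numpy as np

rng = np.random.default_rng(1)
# Inliers: zero-mean, spread along e1 (std 10) with small e2 noise (std 0.5).
inliers = rng.normal(size=(200, 2)) * np.array([10.0, 0.5])
# One extreme outlier far along e2. The data stays approximately zero-mean
# by construction; in general a robust centering step would precede Eq. (2).
X = np.vstack([inliers, [[0.0, 1e4]]])

def top_pc(M):
    # Unit-norm leading eigenvector of a symmetric matrix.
    w, V = np.linalg.eigh(M)
    return V[:, np.argmax(w)]

pc_pca = top_pc(X.T @ X)                          # standard l2 PCA direction
U = X / np.linalg.norm(X, axis=1, keepdims=True)  # Eq. (2): unit hypersphere
pc_ae = top_pc(U.T @ U)                           # AE leading PC, Eq. (3)

print(abs(pc_pca[1]))  # near 1: the single outlier hijacks the PCA direction
print(abs(pc_ae[0]))   # near 1: AE still follows the inlier direction
```

The outlier is exactly the directional kind the text describes: far from the direction of the PCs, with huge Euclidean norm but unit angular weight.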
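For comparison, the low-rank-plus-sparse baseline described in Related Work, Principal Component Pursuit (Candès et al. 2011), solves min ||L||_* + λ||S||_1 s.t. X = L + S. A sketch via a standard ADMM iteration (singular-value thresholding for L, soft thresholding for S); the defaults λ = 1/√max(m, n) and the μ heuristic follow common practice and are assumptions here, not values from this paper:

```python
import numpy as np

def pcp(X, n_iter=500):
    """Sketch of Principal Component Pursuit via ADMM (illustrative only)."""
    m, n = X.shape
    lam = 1.0 / np.sqrt(max(m, n))        # common default for the l1 weight
    mu = m * n / (4.0 * np.abs(X).sum())  # common step-size heuristic
    L = np.zeros_like(X); S = np.zeros_like(X); Y = np.zeros_like(X)
    shrink = lambda M, t: np.sign(M) * np.maximum(np.abs(M) - t, 0.0)
    for _ in range(n_iter):
        # L-step: singular value thresholding of X - S + Y/mu.
        Uv, s, Vt = np.linalg.svd(X - S + Y / mu, full_matrices=False)
        L = (Uv * shrink(s, 1.0 / mu)) @ Vt
        # S-step: elementwise soft thresholding.
        S = shrink(X - L + Y / mu, lam / mu)
        # Dual ascent on the constraint X = L + S.
        Y += mu * (X - L - S)
    return L, S
```

Note the contrast with AE above: every iteration of PCP requires a full SVD, which is exactly the cubic-cost bottleneck the text attributes to this family of methods.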
