Robust and Efficient Computation of Eigenvectors in a Generalized

Robust and Efficient Computation of Eigenvectors in a Generalized Spectral Method for Constrained Clustering Chengming Jiang Huiqing Xie Zhaojun Bai University of California, Davis East China University of University of California, Davis [email protected] Science and Technology [email protected] [email protected] Abstract 1 INTRODUCTION FAST-GE is a generalized spectral method Clustering is one of the most important techniques for constrained clustering [Cucuringu et al., for statistical data analysis, with applications rang- AISTATS 2016]. It incorporates the must- ing from machine learning, pattern recognition, im- link and cannot-link constraints into two age analysis, bioinformatics to computer graphics. Laplacian matrices and then minimizes a It attempts to categorize or group date into clus- Rayleigh quotient via solving a generalized ters on the basis of their similarity. Normalized eigenproblem, and is considered to be sim- Cut [Shi and Malik, 2000] and Spectral Clustering ple and scalable. However, there are two [Ng et al., 2002] are two popular algorithms. unsolved issues. Theoretically, since both Laplacian matrices are positive semi-definite Constrained clustering refers to the clustering with and the corresponding pencil is singular, it a prior domain knowledge of grouping information. is not proven whether the minimum of the Here relatively few must-link (ML) or cannot-link Rayleigh quotient exists and is equivalent to (CL) constraints are available to specify regions an eigenproblem. Computationally, the lo- that must be grouped in the same partition or be cally optimal block preconditioned conjugate separated into different ones [Wagstaff et al., 2001]. gradient (LOBPCG) method is not designed With constraints, the quality of clustering could for solving the eigenproblem of a singular be improved dramatically. In the past few years, pencil. In fact, to the best of our knowl- constrained clustering has attracted a lot of at- edge, there is no existing eigensolver that is tentions in many applications such as transductive immediately applicable. In this paper, we learning [Chapelle et al., 2006, Joachims, 2003], com- provide solutions to these two critical issues. munity detection [Eaton and Mansbach, 2012, We prove a generalization of Courant-Fischer Ma et al., 2010] and image segmentation variational principle for the Laplacian singu- [Chew and Cahill, 2015, Cour et al., 2007, lar pencil. We propose a regularization for Cucuringu et al., 2016, Eriksson et al., 2011, the pencil so that LOBPCG is applicable. Wang et al., 2014, Xu et al., 2009, Yu and Shi, 2004]. We demonstrate the robustness and efficiency Existing constrained clustering methods based on of proposed solutions for constrained image spectral graph theory can be organized in two segmentation. The proposed theoretical and classes. One class is to impose the constraints computational solutions can be applied to on the indicator vectors (eigenspace) explicitly, eigenproblems of positive semi-definite pen- namely, the constraints are either encoded in a cils arising in other machine learning algo- linear form [Cour et al., 2007, Eriksson et al., 2011, rithms, such as generalized linear discrim- Xu et al., 2009, Yu and Shi, 2004] or in a bilinear inant analysis in dimension reduction and form [Wang et al., 2014]. The other class of the multisurface classification via eigenvectors. methods implicitly incorporates the constraints into Laplacians. The semi-supervised normalized cut Proceedings of the 20th International Conference on Artifi- [Chew and Cahill, 2015] casts the constraints as a low- cial Intelligence and Statistics (AISTATS) 2017, Fort Laud- rank matrix to the Laplacian of the data graph. In a erdale, Florida, USA. JMLR: W&CP volume 54. Copy- recent proposed generalized spectral method (FAST- right 2017 by the author(s). GE) [Cucuringu et al., 2016], the ML and CL con- Robust and Efficient Computation of Eigenvectors for Constrained Clustering straints are incorporated by two Laplacians LG and We note that the essence of the proposed theoret- LH . The clustering is realized by solving the opti- ical and computational solutions in this paper is mization problem about mathematically rigorous and computationally effective treatments of large sparse symmetric posi- xT L x inf G ; (1) tive semi-definite pencils. The proposed results in n T x2 x LH x this paper could be applied to eigenproblems aris- T R x LH x>0 ing in other machine learning techniques, such as which subsequently is converted to the problem of generalized linear discriminant analysis in dimen- computing a few eigenvectors of the generalized eigen- sion reduction [He et al., 2005, Park and Park, 2008, problem Zhu and Huang, 2014] and multisurface proximal sup- port vector machine classification via generalized LGx = λLH x: (2) eigenvectors [Mangasarian and Wild, 2006]. It is shown that the FAST-GE algorithm works in nearly-linear time and provides some theoret- 2 PRELIMINARIES ical guarantees for the quality of the clusters [Cucuringu et al., 2016]. FAST-GE demonstrates A weighted undirected graph G is represented by a pair its superior quality to the other spectral ap- (V; W ), where V = fv ; v ; : : : ; v g denotes the set of proaches, namely CSP [Wang et al., 2014] and COSf 1 2 n vertices, and W = (w ) is a symmetric weight matrix, [Rangapuram and Hein, 2012]. ij such that wij > 0 and wii = 0 for all 1 6 i; j 6 n. The However, there are two critical unsolved issues as- pair (vi; vj) is an edge of G iff wij > 0. The degree di sociated with the FAST-GE algorithm. The first of a vertex vi is the sum of the weights of the edges one is theoretical. Since both Laplacians LG and adjacent to vi: n LH are symmetric positive semi-definite and the pen- X di = wij: cil LG − λLH is singular, i.e., det(LG − λLH ) ≡ 0 for all λ. The Courant-Fischer variational principle i=j [Golub and Van Loan, 2012, sec.8.1.1] is not applica- The degree matrix is D = diag(d1; d2; :::dn). The ble. Consequently, it is not proven whether the in- Laplacian L of G is defined by L = D − W and has fimum of the Rayleigh quotient (1) is equivalent to the following well-known properties: T 1 P 2 the smallest eigenvalue of problem (2). The second (a) x Lx = 2 i;j wij(xi − xj) ; issue is computational. LOBPCG is not designed for (b) L 0 if wij > 0 for all i; j; the generalized eigenproblem of a singular pencil. In (c) L · 1 = 0; fact, to the best of our knowledge, there is no exist- (d) Let λi denote the ith smallest eigenvalue of L. If ing eigensolver that is applicable to the large sparse the underlying graph of G is connected, then 0 = λ1 < eigenproblem (2). λ2 6 λ3 6 ::: 6 λn, and dim(N (L)) = 1, where N (L) denotes the nullspace of L. In this paper, we address these two critical issues. Theoretically, we first derive a canonical form of the Let A be a subset of vertices V and A¯ = V=A, the singular pencil L − λL and show the existence of quantity G H X the finite real eigenvalues. Then we generalize the cutG(A) = wij (3) Courant-Fischer variational principle to the singular vi2A;vj 2A¯ pencil LG − λLH and prove that the infimum of the is called the cut of A on graph G. The volume of A is Rayleigh quotient (1) can be replaced by the minimum, the sum of the weights of all edges adjacent to vertiecs namely, the existence of minima is guaranteed. Based in A: on these theoretical results, we can claim that the opti- n X X mization problem (1) is indeed equivalent to the prob- vol(A) = wij: lem of finding the smallest finite eigenvalue and as- vi2A j=1 sociated eigenvectors of the generalized eigenproblem It can be shown, see for example [Gallier, 2013], that (2). Computationally, we propose a regularization to xT Lx transform the singular pencil LG − λLH to a positive cut (A) = ; (4) definite pencil K−σM, where K and M are symmetric G (a − b)2 and M is positive definite. Consequently, we can di- where x is the indicator vector such that its i element rectly apply LOBPCG and other eigensolvers, such as x = a if v 2 A, and x = b if v 2= A, and a; b are two Lanczos method in ARPACK [Lehoucq et al., 1998]. i i i i distinct real numbers. We demonstrate the robustness and efficiency of the proposed approach for constrained segmentations of a For a given weighted graph G = (V; W ), the k-way set of large size images. partitioning is to find a partition (A1;:::;Ak) of V , Chengming Jiang, Huiqing Xie, Zhaojun Bai such that the edges between different subsets have different Vi form the CL constraints, the objective of very low weight and the edges within a subset have FAST-GE is to find a partition (A1;A2;:::;Ak) of V very high weight. The Normalized Cut (Ncut) method such that Vi ⊆ Ai, and the edges between different [Shi and Malik, 2000] is a popular method for uncon- subsets have very low weight, the edges within a sub- strained partitioning. For the given weighted graph G, set have very high weight. the objective of Ncut for a 2-way partition (A; A¯) is to Let cut (A ) be the cut of the subset A on G minimize the quantity GD p p D defined in (3), and we introduce a graph cut(A) cut(A¯) Ncut(A) = + : (5) G = (V; W ); (9) vol(A) vol(A¯) M M Pk It is shown that where the weight matrix WM = `=1 WM` : The en- tries of WM are defined as follows: if vi and vj are T ` x Lx in the same V , then W (i; j) = d d =(d d ); min Ncut(A) = min (6) ` M` i j min max x T A x Dx where di and dj are the degrees of vi and vj in G , d = min d and d = max d .

Robust and Efficient Computation of Eigenvectors in a Generalized

Segment-Based Clustering of Hyperspectral Images Using Tree-Based Data Partitioning Structures

Accelerating the LOBPCG Method on Gpus Using a Blocked Sparse Matrix Vector Product

Image Segmentation Based on Multiscale Fast Spectral Clustering Chongyang Zhang, Guofeng Zhu, Minxin Chen, Hong Chen, Chenjian Wu

Learning on Hypergraphs: Spectral Theory and Clustering

Fast Approximate Spectral Clustering

An Algorithmic Introduction to Clustering

Spectral Clustering: a Tutorial for the 2010’S

Spatially-Encouraged Spectral Clustering: a Technique for Blending Map Typologies and Regionalization

Semi-Supervised Spectral Clustering for Image Set Classification

Solving Symmetric Semi-Definite (Ill-Conditioned)

Fast Approximate Spectral Clustering

On the Performance and Energy Efficiency of Sparse Linear Algebra on Gpus