Journal of Statistical Mechanics: Theory and Experiment
PAPER: Interdisciplinary statistical mechanics

Multi-level spectral graph partitioning method

Muhammed Fatih Talu
Computer Science Department, Inonu University, Malatya, Turkey
E-mail: [email protected]

Received 29 March 2017
Accepted for publication 31 July 2017
Published 27 September 2017
Online at stacks.iop.org/JSTAT/2017/093406
https://doi.org/10.1088/1742-5468/aa85ba

To cite this article: Muhammed Fatih Talu J. Stat. Mech. (2017) 093406

© 2017 IOP Publishing Ltd and SISSA Medialab srl

Abstract. In this article, a new method for the multi-level and balanced division of undirected graphs (MSGP) is introduced. The method takes a spectral approach based on the eigenvectors of the Laplacian matrix of the graph; its ability to divide the graph globally gives it an advantage over local methods (Kernighan–Lin and Fiduccia–Mattheyses). Spectral bisection divides a graph in two by using the Fiedler vector, while the recursive version of this method can divide the graph into multiple levels.
However, spectral methods have two disadvantages: (1) high processing cost; (2) the sub-graphs are divided independently of one another. By making better use of the eigenvectors of the whole graph and the hidden information they carry, MSGP can divide a graph into balanced, multi-level partitions without recursive processing. Inspired by Haar wavelets, it arranges the eigenvectors in a binary heap tree. Comparisons with seven existing methods (some of which are community detection algorithms) on regular and irregular graphs show that MSGP produces a proper partitioning result about 14.4 times faster than the others.

Keywords: random graphs, networks, clustering techniques, heuristics algorithms

Contents
1. Introduction
2. Spectral partitioning (SP)
3. The proposed method (MSGP)
4. Applications with MSGP
5. Comparison results
6. Conclusion
Acknowledgments
References

1. Introduction

Graphs (G) are mathematical structures used to describe objects (nodes or vertices) and the relationships (edges) between them. Edges may be directed or undirected, depending on the type of relationship. A graph can be written in the form G = (V, E), with node set V and edge set E, without loops or multiple edges. The graph partitioning problem is to divide G into smaller subsets such that each subset has about the same size and the cost of the edges spanning subsets is minimal. For example, a k-way partition splits the node set V of G into k smaller subsets $V_1, V_2, \ldots, V_k$ [3].

Graph partitioning is a widely researched topic, and many books [3, 21] and papers [29] have been published on it. It has been used in many applications, such as VLSI circuit layout [17], solving linear systems [22] and distributing workloads for parallel computation [15]. In recent years, graph partitioning has gained importance through its use in grouping and discovering structure in social, biological, pathological and cyber security networks (see figure 1). It allows many practical applications to be modelled. For example, in natural language processing, once the verbal arguments are expressed as vertices and the similarities between them as edges, graph partitioning methods help to find arguments with the same or unique semantic roles [19]. Another example is the extraction of building patterns (collinear, curvilinear, parallel groups and grids) in [8], where clusters of buildings with similar area, shape and visual features are detected using graph partitioning.

The graph partitioning problem is known to be NP-hard [5]. Even for special graphs such as trees and grids (without holes), no reasonable approximation algorithms exist. There is an unavoidable trade-off between runtime and partition quality (balance) [9]. Some algorithms are fast but produce poor partitions; others produce high-quality partitions but are very slow.

Figure 1. Graph partitioning applications.

In the literature, graph partitioning methods fall into two broad categories, local and global. Widely used local methods are the algorithms of Kernighan–Lin [18] and Fiduccia–Mattheyses [10]. They are efficient 2-way partitioning algorithms based on local search strategies. Their major disadvantage is the arbitrary initial split of the node set, which influences partition quality considerably. Global partitioning approaches use the information of the entire graph and do not need an arbitrary initial split to start from. Well-known examples of global approaches are the spectral scheme (SS) [15, 30] and the multilevel scheme (MS) [16].
In SS, the graph is partitioned using the eigenvalues and eigenvectors of its Laplacian (or adjacency) matrix. MS consists of coarsening, partitioning and uncoarsening phases. In [23], SSs and MSs are compared on GPUs in terms of partition quality and cost (time) for fast graph partitioning. All experiments in that study used the nvGRAPH library [24] with CUDA 8, introduced by NVIDIA. The numerical results showed that SSs can obtain significantly higher quality partitions than MSs for networks with high-degree nodes (especially social networks), while the time taken by both schemes is essentially the same. In applications such as medical image segmentation, the partition quality has a significant impact on the result [27]. The higher partition quality of SSs compared with MSs is therefore very valuable.

The major disadvantage of SSs, especially in multi-level partitioning, is the need to recalculate eigenvalues and eigenvectors. The eigenvalue calculation in spectral graph partitioning methods is well known to be time-consuming. Many solvers, such as Lanczos, Jacobi–Davidson and LOBPCG, have been developed to speed it up.

In this paper, a new multi-level and balanced spectral graph partitioning method (MSGP), which does not need to calculate eigenvalues and eigenvectors iteratively, is proposed. By placing the eigenvectors in a binary heap tree in a particular order and in optimal time, it exposes the information needed for division at each level.

In section 2, classical SP methods are summarized. MSGP is introduced in section 3. The comparison results between MSGP and the existing methods on specific graphs (three regular, one irregular) are given in section 4. The last section concludes the work.

2. Spectral partitioning (SP)

Because they can be solved effectively with linear algebra, SP methods outperform traditional clustering algorithms such as k-means, mean-shift and PCA [11]. Because of this advantage, SP is used to solve problems on complex graphs of unspecified structure [20]. It is a partitioning approach based on the eigenvalues and eigenvectors of the Laplacian matrix L(G), a square, symmetric and sparse matrix defined by

$$
L(G) = D(G) - A(G) =
\begin{cases}
\deg(i) & i = j \\
0 & i \neq j,\ (i,j) \notin E \\
-w_{i,j} & (i,j) \in E
\end{cases},
\qquad i, j = 1, \ldots, n
\tag{1}
$$

where A(G) is the adjacency matrix (square) and D(G) is the node degree matrix (diagonal), defined as follows:

$$
A(G) =
\begin{cases}
0 & (i,j) \notin E \\
w_{i,j} & (i,j) \in E
\end{cases},
\qquad i, j = 1, \ldots, n
\tag{2}
$$

$$
D(G) =
\begin{cases}
\deg(i) & i = j \\
0 & i \neq j
\end{cases},
\qquad i, j = 1, \ldots, n
\tag{3}
$$

where $w_{i,j}$ denotes the edge weight between nodes i and j, and $\deg(i)$ refers to the number of nodes to which node i is connected (for a weighted graph, the sum of the weights of the edges incident to node i). Because L(G) is a symmetric matrix, its eigenvalues are real and its eigenvectors can be chosen to be perpendicular (orthogonal) to each other. After solving $L(G)\,v = \lambda v$, the eigenvector corresponding to the second smallest eigenvalue of L(G) is known as the Fiedler vector [14]:

$$
v = \begin{bmatrix} v_1 & \underbrace{v_2}_{\text{Fiedler}} & \cdots & v_n \end{bmatrix},
\qquad
\lambda = \begin{bmatrix} \lambda_1 & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & \lambda_n \end{bmatrix}.
\tag{4}
$$

Using the Fiedler vector, G is divided into two sub-graphs (G1 and G2). The division uses either the signs or the median of the Fiedler vector. The graph partitioning algorithm that uses the Fiedler vector is known as the spectral bisection (SB) algorithm, shown in algorithm 1. Applying SB recursively to the resulting sub-graphs gives recursive spectral bisection (RSB), shown in algorithm 2.

Algorithm 1. Spectral bisection (SB).
INPUT: a graph, G = (V, E)
OUTPUT: two sub-graphs, G1 and G2
Compute L(G)
Compute the eigenvalues and eigenvectors of L(G): [v, λ] = eig(L)
Fiedler vector: v2
for each node i of G do
    if v2(i) ≤ median(v2) then
        Put node i in G1
    else
        Put node i in G2
    end if
end for
return G1 and G2

Algorithm 2. Recursive spectral bisection (RSB).
INPUT: a graph, G = (V, E), and a partition number, k
OUTPUT: k sub-graphs, G1, G2, ..., Gk
[G1, G2] = SB(G)
if (k/2) > 1 then
    RSB(G1, k/2)
    RSB(G2, k/2)
end if
return G1, G2, ..., Gk
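
The SB and RSB procedures above can be sketched in Python with NumPy as follows. This is a minimal illustration under stated assumptions, not the implementation used in the paper: it assumes a symmetric (weighted) adjacency matrix, a connected graph whose Fiedler vector has no ties at the median, and a partition number k that is a power of two. It uses a dense eigendecomposition (`numpy.linalg.eigh`) rather than the Lanczos or LOBPCG solvers mentioned in the introduction, and the function names `laplacian`, `spectral_bisection`, `recursive_spectral_bisection` and `path_graph` are hypothetical names chosen for this sketch.

```python
import numpy as np


def laplacian(A):
    """Return L = D - A for a symmetric (weighted) adjacency matrix A."""
    A = np.asarray(A, dtype=float)
    return np.diag(A.sum(axis=1)) - A


def spectral_bisection(A, nodes):
    """Split `nodes` (indices into A) in two using the Fiedler vector
    of the sub-graph induced by those nodes (algorithm 1)."""
    nodes = np.asarray(nodes)
    sub = np.asarray(A, dtype=float)[np.ix_(nodes, nodes)]
    # eigh returns eigenvalues in ascending order, so column 1 is the
    # eigenvector of the second smallest eigenvalue. For a disconnected
    # sub-graph this column is not uniquely defined (assumption: connected).
    _, vecs = np.linalg.eigh(laplacian(sub))
    fiedler = vecs[:, 1]
    mask = fiedler <= np.median(fiedler)
    return nodes[mask], nodes[~mask]


def recursive_spectral_bisection(A, k, nodes=None):
    """Split the graph into k parts by repeated bisection (algorithm 2).
    Every level recomputes an eigendecomposition of each sub-graph."""
    if nodes is None:
        nodes = np.arange(len(A))
    if k <= 1 or len(nodes) < 2:
        return [nodes]
    g1, g2 = spectral_bisection(A, nodes)
    return (recursive_spectral_bisection(A, k // 2, g1)
            + recursive_spectral_bisection(A, k // 2, g2))


def path_graph(n):
    """Adjacency matrix of an unweighted path 0 - 1 - ... - (n-1)."""
    A = np.zeros((n, n))
    idx = np.arange(n - 1)
    A[idx, idx + 1] = 1.0
    A[idx + 1, idx] = 1.0
    return A


if __name__ == "__main__":
    parts = recursive_spectral_bisection(path_graph(8), 4)
    print([p.tolist() for p in parts])
```

The sign of an eigenvector is arbitrary, so the order in which the parts are returned may differ between runs or platforms; only the grouping of nodes is meaningful. Note that each recursion level recomputes an eigendecomposition for every sub-graph, which is the cost identified in the introduction as the main drawback of recursive spectral schemes.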