A Fast Kernel-based Multilevel Algorithm for Graph Clustering
INDERJIT DHILLON, YUQIANG GUAN, AND BRIAN KULIS
Data Mining Lab, Department of Computer Science, University of Texas at Austin
Austin, TX 78712 USA
{inderjit, yguan, kulis}@cs.utexas.edu
ABSTRACT
Graph clustering (also called graph partitioning), the clustering of the nodes of a graph, is an important problem in diverse data mining applications. Traditional approaches involve optimization of graph clustering objectives such as normalized cut or ratio association; spectral methods are widely used for these objectives, but they require eigenvector computation, which can be slow. Recently, graph clustering with a general

2. Graph Clustering
Given a graph $G = (\mathcal{V}, \mathcal{E}, A)$, consisting of a set of vertices $\mathcal{V}$, a set of edges $\mathcal{E}$, and an affinity matrix $A$ whose entry $A_{ij}$ represents the similarity between nodes $i$ and $j$, the goal of graph clustering is to partition the nodes such that between-cluster connection is weak and within-cluster connection is strong.
Denote $\mathrm{links}(\mathcal{A}, \mathcal{B}) = \sum_{i \in \mathcal{A},\, j \in \mathcal{B}} A_{ij}$ and $\mathrm{degree}(\mathcal{A}) = \mathrm{links}(\mathcal{A}, \mathcal{V})$. Here

3.2 Initial Clustering Phase
• Eventually, the graph is coarsened to the point where very few nodes remain in the graph.
• We use the spectral algorithm of Yu and Shi [5], which we generalize to work with arbitrary weights [3]. Thus our initial clustering is "customized" for different graph clustering objectives.

• Quality and computation time of our multilevel methods compared with the benchmark spectral algorithm.
[Figure: two panels, each titled "64 clusters", with curves for spectral, mlkkm(0), and mlkkm(20).]
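The links and degree notation of Section 2, together with the normalized cut and ratio association objectives named in the abstract, can be illustrated with a small NumPy sketch. This is not the authors' code: the function names, the toy graph, and the two candidate partitions are hypothetical, and the objectives are written in their standard textbook forms (ratio association as the sum over clusters of links(V_c, V_c)/|V_c|, normalized cut as the sum over clusters of links(V_c, V \ V_c)/degree(V_c)).

```python
import numpy as np

def links(A, rows, cols):
    """Sum of affinities A[i, j] over i in rows and j in cols."""
    return float(A[np.ix_(rows, cols)].sum())

def degree(A, nodes):
    """degree(nodes) = links(nodes, V): total affinity from nodes to all vertices."""
    return links(A, nodes, np.arange(A.shape[0]))

def ratio_association(A, clusters):
    """Sum over clusters of within-cluster links divided by cluster size (maximize)."""
    return sum(links(A, c, c) / len(c) for c in clusters)

def normalized_cut(A, clusters):
    """Sum over clusters of outgoing links divided by cluster degree (minimize)."""
    all_nodes = np.arange(A.shape[0])
    total = 0.0
    for c in clusters:
        rest = np.setdiff1d(all_nodes, c)  # vertices outside cluster c
        total += links(A, c, rest) / degree(A, c)
    return total

# Hypothetical toy graph: two triangles joined by one weak edge (2, 3).
A = np.zeros((6, 6))
edges = [(0, 1, 1.0), (0, 2, 1.0), (1, 2, 1.0),
         (3, 4, 1.0), (3, 5, 1.0), (4, 5, 1.0),
         (2, 3, 0.1)]
for i, j, w in edges:
    A[i, j] = A[j, i] = w  # undirected graph, symmetric affinity matrix

good = [[0, 1, 2], [3, 4, 5]]  # respects the two triangles
bad = [[0, 1, 3], [2, 4, 5]]   # splits both triangles

print("normalized cut   :", normalized_cut(A, good), normalized_cut(A, bad))
print("ratio association:", ratio_association(A, good), ratio_association(A, bad))
```

On this toy graph the partition that keeps each triangle intact has the lower normalized cut and the higher ratio association, matching the stated goal of weak between-cluster and strong within-cluster connection.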