Modular Community Detection in Networks
Proceedings of the Twenty-Second International Joint Conference on Artificial Intelligence

Wenye Li, Macao Polytechnic Institute, Macao SAR, China
Dale Schuurmans, University of Alberta, Edmonton, Canada

Abstract

Network community detection, the problem of dividing a network of interest into clusters for intelligent analysis, has recently attracted significant attention in diverse fields of research. To discover intrinsic community structure, a quantitative measure called modularity has been widely adopted as an optimization objective. Unfortunately, modularity is inherently NP-hard to optimize, and approximate solutions must be sought if tractability is to be ensured. In practice, a spectral relaxation method is most often adopted, after which a community partition is recovered from relaxed fractional values by a rounding process. In this paper, we propose an iterative rounding strategy for identifying the partition decisions that is coupled with a fast constrained power method that sequentially achieves tighter spectral relaxations. Extensive evaluation with this coupled relaxation-rounding method demonstrates consistent and sometimes dramatic improvements in the modularity of the communities discovered.

1 Introduction

Many important systems can be represented as networks, with entities represented by vertices and relationships represented by edges. Prominent examples include the world wide web, social networks, biological networks, communication networks, etc. [Easley and Kleinberg, 2010]. Research on networks has attracted significant recent interest, particularly in computing sciences and artificial intelligence, in response to the rapid increase in size and availability of real-world networks and the practical need to analyze them.

When analyzing such networks, an important question has often been "How many communities are there and what are the memberships?" Community (i.e. cluster) structure seems to be inherent in real-world networks: vertices tend to cluster in groups where vertex connections within the same group are dense, while the connections are sparser between vertices from different groups. The ability to find and analyze such groups has proved invaluable in understanding network structure.

Computationally, the quality of a partition obtained depends on the quality of the objective function being used [Leskovec et al., 2010]. Recently, the modularity function, $Q$, which measures the quality of a particular grouping of vertices in a network, has been widely accepted. Girvan and Newman [2002] have shown across a variety of simulated and real-world networks that larger $Q$ values are correlated with better graph vertex groupings.

Unfortunately, maximizing $Q$ is fundamentally difficult, hence heuristic approximation methods have been proposed for locally optimizing it. Among them, a spectral method proposed by Newman [2006] has attracted broad attention. After relaxation, the method computes a decision vector where each element corresponds to the partition assignment of a vertex. To recover a hard partition from such a relaxed solution, it has been standard practice to round each element individually based simply on its sign. Although simple, this conventional rounding strategy has achieved good empirical results and has been deployed extensively in the analysis of real-world networks and other graph partition applications.

In this paper we propose an iterative rounding strategy for recovering the final decisions. Unlike conventional rounding, which operates purely on the individual signs, we take the magnitude of each element into consideration in a sequential manner. That is, in successive rounds only a portion of the elements, those with large magnitudes, are rounded to hard decisions. The remaining elements are then re-optimized in the next iteration by solving a residual problem. The solution to the residual problem is again partially rounded into decisions, and so on. At the core of our proposal is a new constrained power method that achieves fast computation of the residual problem. This sequential approach more tightly approximates the global modularity objective by interleaving partial rounding with tighter spectral relaxation of the successive residual problems. In extensive evaluations, the iterative rounding method shows significant and consistent improvement over the conventional approach.

2 Preliminaries

Modularity is the standard objective function used in network cluster analysis. It quantifies the quality of a given division of a network into communities. Good divisions, which have high modularity values, are those with dense edge connections between the vertices within a community but sparse connections between vertices in different communities.

Consider an undirected graph $G = (V, E)$ where $V = \{v_1, v_2, \cdots, v_n\}$ is a set of vertices and $E$ is a set of edges between vertex pairs. Let $w_{ij}$ be an element of the adjacency matrix $W$ of the network, which gives the number of edges between vertices $v_i$ and $v_j$. We further denote $d_i = \sum_j w_{ij}$ as the degree of $v_i$ and $m = \frac{1}{2} \sum_i d_i$ as the total number of edges.

For a candidate partition of the vertices into clusters, the modularity is defined to be the portion of the edge connections within the same cluster minus the expected portion if the connections were distributed randomly. Assuming the degree $d_i$ associated with each vertex $v_i$ is preserved, under uniform random selection the expected number of edges between two vertices $v_i$ and $v_j$ is $\frac{d_i d_j}{2m}$. Thus the observed number minus the expected number is $w_{ij} - \frac{d_i d_j}{2m}$. Summing over all pairs of vertices within the same group, the modularity, denoted by $Q$, is given by

$$Q = \frac{1}{2m} \sum_{ij} \left( w_{ij} - \frac{d_i d_j}{2m} \right) \delta(c_i, c_j)$$

where $c_i$ is the group to which vertex $v_i$ belongs, and $\delta$ is the Kronecker delta function.

The value of $Q$ lies in the range $[-1, 1]$. It is positive when the observed connections within the same group exceed the expected number under random connections. Given a larger than expected portion of connections, one can reasonably infer the presence of an underlying cluster structure. Thus, the cluster structure can be searched precisely by checking the network divisions that have large modularity values.

An equivalent formulation is often used. Define $s_{ir}$ to be 1 if vertex $v_i$ belongs to group $r$ and 0 otherwise. Then $\delta(c_i, c_j) = \sum_r s_{ir} s_{jr}$ and hence

$$Q = \frac{1}{2m} \sum_{ij} \sum_{r} \left( w_{ij} - \frac{d_i d_j}{2m} \right) s_{ir} s_{jr} = \frac{1}{2m} \operatorname{tr}\left( S^T B S \right)$$

where $\operatorname{tr}$ denotes the trace of a matrix, $S$ is the matrix having elements $s_{ir}$, and $B$ is the modularity matrix having elements

$$b_{ij} = w_{ij} - \frac{d_i d_j}{2m}.$$

All rows and columns of the modularity matrix sum to zero, which means that the modularity of an undivided graph is always zero.

Unlike most statistical clustering models or graph partition techniques, which require a prior setting of partition numbers or group sizes [Jain et al., 1999; Shi and Malik, 2000; Ng et al., 2002], the modularity score determines the partition number and the group size automatically without manual intervention. This measure also allows the possibility that no good division of a network exists, corresponding to the case that the modularity value is zero (and cannot be increased by further division of vertices).

3 Spectral Modularity Maximization

Maximizing $Q$ is NP-hard [Brandes et al., 2006], therefore researchers have sought approximate solutions.

3.1 Two-Way Partitions

To first understand the spectral method, consider a simple case where the graph is divided into two groups. One defines $s_i = \pm 1$ to indicate the group membership of $v_i$, yielding

$$Q = \frac{1}{4m} \sum_{ij} b_{ij} s_i s_j = \frac{1}{4m} s^T B s$$

where $s$ is the column vector with elements $s_i$.

The vector $s$ can be expressed as a linear combination of the normalized eigenvectors $u_i$ of the modularity matrix $B$, so that $s = \sum_{i=1}^{n} a_i u_i$ with $a_i = u_i^T s$. Then one obtains

$$Q = \frac{1}{4m} \sum_{i} a_i u_i^T B \sum_{j} a_j u_j = \frac{1}{4m} \sum_{i=1}^{n} \left( u_i^T s \right)^2 \lambda_i,$$

where $\lambda_i$ is the eigenvalue of $B$ corresponding to the eigenvector $u_i$.

Assume that the eigenvalues are labeled non-increasingly, $\lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_n$. To maximize $Q$, the assignment vector $s$ needs to concentrate as much weight as possible in the terms involving the leading (largest algebraic) eigenvalues, which, if $s$ were unconstrained, could be achieved by setting $s$ proportional to the leading eigenvector $u_1$. But with the "$\pm 1$" constraints, $s$ cannot be chosen freely, which makes the optimization difficult.

Fortunately there is a convenient approximation available. Ignoring the inconvenient fact that it is not possible to make $s$ perfectly parallel to $u_1$, one simply divides the vertices into two groups according to the signs of each element of $u_1$. Although this approximation is straightforward, it has often been found to give reasonable results in practice.

3.2 Multi-Way Partitions

The simple two-way partition method can be extended recursively to a multi-way partition method. That is, using successive two-way partitions that divide the graph into subgraphs, the process can be continued on each subgraph until no further increases in $Q$ can be found.

Formally, for each subgraph $G' = (V', E')$ with $n'$ vertices we define an $n' \times n'$ subgraph modularity matrix $B'$ with elements

$$b'_{ij} = w_{ij} - \frac{d_i d_j}{2m} - \delta(i, j) \left( d'_i - d_i \frac{m'}{m} \right),$$

where $d'_i = \sum_{j : v_j \in V'} w_{ij}$ and $m' = \frac{1}{2} \sum_{i : v_i \in V'} d_i$. The subgraph modularity is given by $Q' = \frac{1}{4m} s'^T B' s'$, where $s'$ is a column vector with $n'$ elements. Maximizing $Q'$ with respect to $s'$ gives the further contribution to the modularity $Q$ obtained by subdividing the subgraph. When $G' = G$, $B'$ reduces to $B$ since $d'_i \to d_i$ and $m' \to m$ in that case.

The division process on each subgraph is halted when there exists no division that further increases the graph modular-
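As an illustration of the modularity definition in Section 2 and the sign-based two-way split in Section 3.1, the following is a minimal Python sketch rather than the implementation used in this paper. The function names, the toy graph (two triangles joined by one edge), and the shifted power iteration used to approximate the leading eigenvector are all assumptions made for this example. In particular, the plain power iteration below is not the constrained power method proposed in this paper, and the sketch assumes a graph with at least one edge.

```python
def modularity_matrix(W):
    """Return B with b_ij = w_ij - d_i * d_j / (2m) for a symmetric adjacency matrix W."""
    n = len(W)
    d = [sum(row) for row in W]  # vertex degrees d_i
    two_m = sum(d)               # 2m equals the sum of all degrees
    return [[W[i][j] - d[i] * d[j] / two_m for j in range(n)] for i in range(n)]


def modularity(W, labels):
    """Q = (1/2m) * sum of b_ij over all ordered pairs (i, j) in the same group."""
    B = modularity_matrix(W)
    two_m = sum(sum(row) for row in W)
    n = len(W)
    same = sum(B[i][j] for i in range(n) for j in range(n) if labels[i] == labels[j])
    return same / two_m


def leading_eigenvector(B, iters=500):
    """Approximate the leading eigenvector of B by power iteration on B + shift * I.

    The shift (largest absolute row sum, a Gershgorin bound) makes every
    eigenvalue of the shifted matrix non-negative, so its dominant eigenvector
    belongs to the largest algebraic eigenvalue of B. This is an illustrative
    choice, not the paper's constrained power method.
    """
    n = len(B)
    shift = max(sum(abs(x) for x in row) for row in B)
    v = [float(i + 1) for i in range(n)]  # arbitrary non-constant start vector
    for _ in range(iters):
        w = [sum(B[i][j] * v[j] for j in range(n)) + shift * v[i] for i in range(n)]
        norm = max(abs(x) for x in w)
        v = [x / norm for x in w]
    return v


def two_way_split(W):
    """Set s_i = +1 or -1 according to the sign of the i-th entry of the leading eigenvector."""
    u1 = leading_eigenvector(modularity_matrix(W))
    return [1 if x >= 0 else -1 for x in u1]


def example_graph():
    """Hypothetical toy graph: triangles {0, 1, 2} and {3, 4, 5} joined by edge (2, 3)."""
    edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
    W = [[0] * 6 for _ in range(6)]
    for i, j in edges:
        W[i][j] = W[j][i] = 1
    return W


W = example_graph()
s = two_way_split(W)
print(s, modularity(W, s))
```

On this toy graph the sign split separates the two triangles. Six of the seven edges fall inside a group and each group holds half of the total degree, so the split has modularity 6/7 - 1/2 = 5/14, while the undivided graph has modularity zero, as stated in Section 2.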