Local Community Detection in Complex Networks Based on Maximum Cliques Extension
Hindawi Publishing Corporation
Mathematical Problems in Engineering
Volume 2014, Article ID 653670, 12 pages
http://dx.doi.org/10.1155/2014/653670

Research Article

Meng Fanrong, Zhu Mu, Zhou Yong, and Zhou Ranran

School of Computer Science and Technology, China University of Mining and Technology, Xuzhou 221116, China

Correspondence should be addressed to Zhu Mu; [email protected]

Received 6 January 2014; Accepted 25 February 2014; Published 15 April 2014

Academic Editor: Guanghui Wen

Copyright © 2014 Meng Fanrong et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Detecting local community structure in complex networks is an appealing problem that has attracted increasing attention in various domains. However, most current local community detection algorithms, on one hand, are influenced by the state of the source node and, on the other hand, cannot effectively identify the multiple communities linked with overlapping nodes. We propose a novel local community detection algorithm based on maximum clique extension called LCD-MC. The proposed method first finds the set of all maximum cliques containing the source node and initializes them as the starting local communities; then, it extends each unclassified local community by greedy optimization until a certain objective is satisfied; finally, the expected local communities are obtained once all maximum cliques have been assigned to a community. An empirical evaluation using both synthetic and real datasets demonstrates that our algorithm outperforms some of the state-of-the-art approaches.

1. Introduction

In recent years, more and more research has begun to pay attention to large complex networks, such as social networks, protein interaction networks, citation networks, and the WWW. Extensive research has indicated that community structure universally exists in complex networks and that the connections between nodes within a community are closer than those between communities. Meanwhile, these nodes often have similar attributes or play a similar role. Therefore, community detection has become one of the basic tasks of complex network analysis and is of important theoretical significance and practical value.

Research on community detection has mostly focused on detecting all community structures in a whole network from a global viewpoint [1–5]. However, the scale of complex networks in many real applications is enormous. For example, the friend relation networks on Facebook and Twitter contain hundreds of millions of nodes [6], and detecting the community structures in such huge complex networks costs tremendous time and space. In addition, as the nodes and links of many complex networks are dynamically evolving [7, 8], it is often hard to acquire the complete network information, which further increases the difficulty of global community detection. Therefore, many scholars have begun to focus on local community detection in complex networks.

Different from global community detection, which classifies a whole complex network, local community detection only inquires about the community structure in which a designated node (source node) is located. A complex network is essentially divided into two parts, namely, the community where the designated node is located and the rest. Furthermore, the local community where the node is located has close internal connections but a relatively loose relation with the outside. Local community detection need not know all information about a complex network in advance. It starts from a node, gradually extends from that node, and gradually acquires the local information around the current community during the extension process. The representative algorithms for local community detection include [9–13].

However, most available local community detection algorithms have two restrictions. Firstly, they start directly from a source node, continuously select the best node from the candidates by a greedy algorithm, and add it to the local community until the detected local community satisfies all specified conditions; this makes it easy to deviate from the real local community, thus reducing the accuracy of local community detection. Secondly, in this way only a unique local community structure can finally be obtained, so when the source node is an overlapped (hub) node connecting multiple communities, these algorithms are unable to obtain all local communities.

To solve the above two problems, we propose a local community detection algorithm based on maximum cliques extension (LCD-MC), which includes finding maximum cliques and extending local communities. Its main advantages are as follows.

(i) Instead of taking the source node as input directly, finding all maximum cliques containing the source node is made the start of local community extension, thus increasing the stability of local community detection.

(ii) The flexibility in identifying overlapped node-involved local communities is realized by extending all maximum cliques satisfying certain conditions, respectively.

(iii) The experimental results on both synthetic and real networks demonstrate that, compared with the state-of-the-art local community detection algorithms, LCD-MC, on one side, can obtain better local community quality and, on the other side, can effectively identify multiple local community structures connected with the overlapped node.

2. Related Work

Let a complex network be $G = (V, E)$, where $V$ represents the node set, $E$ the edge set, and $n$ and $m$ the number of nodes and the number of edges in the network, respectively. Different from global algorithms, which divide $G$ into a number of closely connected community structures, local community detection designates a node $v_0$ and explores the community structures in close relation with the node $v_0$.

Clauset first proposed the formal definition of local community detection [9]. Assume that we know a community structure $C$ (initially, $C$ contains only the node $v_0$) composed of some nodes; the set $U$ is connected with nodes in $C$ but does not belong to the node set of community $C$. The process of local community detection is to continuously select nodes from $U$ and add them to the current community $C$ until the predefined local modularity $R$ reaches its maximum value. To define the objective function $R$, Clauset also defined the boundary $B$ of community $C$, that is, the set of nodes in $C$ that have at least one neighbour in $U$, as shown in Figure 1.

[Figure 1: Local community detection model, with regions labelled C, B, and U.]

Assume that the given $C$, $U$, and $B$ are known; $R$ is defined by

$$R = \frac{\sum_{i,j} B_{ij}\,\delta(i,j)}{\sum_{i,j} B_{ij}}, \tag{1}$$

where $B_{ij} = 1$ if $(v_i, v_j) \in E$ and either node $v_i$ or node $v_j$ exists in $B$; otherwise, $B_{ij} = 0$. When either node $v_i$ or node $v_j$ exists in $B$ and the other node exists in $C$, then $\delta(i,j) = 1$; otherwise, $\delta(i,j) = 0$. The larger the $R$ value is, the better the detected local community structure will be. Initially, $C = \{v_0\}$, and Clauset used a greedy optimization algorithm for the local modularity $R$ to find the local community structure where the designated $v_0$ is located.

Similar to the Clauset algorithm, Luo et al. proposed the LWP algorithm [10], in which $R$ is replaced by a new local modularity $M$, as shown in the following:

$$M = \frac{e_{\mathrm{in}}}{e_{\mathrm{out}}} = \frac{(1/2)\sum_{i,j} A_{ij}\,\delta(i,j)}{\sum_{i,j} A_{ij}\,\lambda(i,j)}, \tag{2}$$

where $A_{ij}$ denotes the adjacency matrix entry for nodes $v_i$ and $v_j$, and $\delta(i,j) = 1$ if both node $v_i$ and node $v_j$ exist in community $C$; otherwise, $\delta(i,j) = 0$. And $\lambda(i,j) = 1$ means that only one, either node $v_i$ or node $v_j$, exists in community $C$. In addition to a different objective function, the LWP algorithm also includes addition and deletion operations, making it possible to add into or delete from community $C$ the nodes that can increase the $M$ value. Besides, the LWP algorithm need not predefine the size of community $C$ in advance.

LMF [11] is a local-global community detection algorithm, which proposes a fitness function, as shown in the following:

$$f_C = \frac{k_{\mathrm{in}}^{C}}{\left(k_{\mathrm{in}}^{C} + k_{\mathrm{ex}}^{C}\right)^{\alpha}}, \tag{3}$$

where $k_{\mathrm{in}}^{C}$ and $k_{\mathrm{ex}}^{C}$ refer to the sums of the internal node degrees and external node degrees of community $C$, respectively, and $\alpha$ is a resolution parameter used for controlling the size of the local community. This algorithm is similar to the LWP algorithm in that, according to the given $\alpha$, it achieves the objective of making the fitness function reach its local maximum by addition and deletion.

Wu et al. [12] proposed a local community detection algorithm based on link similarity (LS). The algorithm first defines the similarity between a single node and a local community and then carries out local community detection in decreasing order of the calculated similarity values. In addition, this algorithm's search process is composed of greedy clustering, optimization, and trimming.

Chen et al. [13] proposed a local community detection algorithm based on local degree center node (LMD). Though the objective of local community detection is to find the community structure where the given node is located, it was held by the authors that, for some given nodes, the detection

while a leaf node represents a corresponding maximum clique.

Algorithm 1 gives the pseudocode of the FindMC algorithm for finding maximum cliques. In the initialization phase, FindMC stores only node $v_0$ in one set and only $v_0$'s neighbours in another. This is because the nodes in the maximum clique where node $v_0$ is located can only be $v_0$'s neighbours and any search outside