The Thirty-Third AAAI Conference on Artificial Intelligence (AAAI-19)

Graph Convolutional Networks Meet Markov Random Fields: Semi-Supervised Community Detection in Attribute Networks

Di Jin,1 Ziyang Liu,1 Weihao Li,2 Dongxiao He,1 Weixiong Zhang3,4
1College of Intelligence and Computing, Tianjin University, Tianjin 300350, China
2Visual Learning Lab, Heidelberg University, Heidelberg 69120, Germany
3Institute for Systems Biology, Jianghan University, Wuhan 430056, China
4Department of Computer Science and Engineering, Washington University, St. Louis, MO 63130, USA
{jindi, liuziyang}@tju.edu.cn, [email protected], [email protected], [email protected]

Copyright © 2019, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.

Abstract

Community detection is a fundamental problem in network science with various applications. The problem has attracted much attention and many approaches have been proposed. Among the existing approaches are the latest methods based on Graph Convolutional Networks (GCN) and on statistical modeling of Markov Random Fields (MRF). Here, we propose to integrate the techniques of GCN and MRF to solve the problem of semi-supervised community detection in attributed networks with semantic information. Our new method takes advantage of salient features of GCN and MRF and exploits both network topology and node semantic information in a complete end-to-end deep network architecture. Our extensive experiments demonstrate the superior performance of the new method over state-of-the-art methods and its scalability on several large benchmark problems.

1. Introduction

Many complex systems can be abstracted in the form of networks, including, e.g., the Internet, the World-Wide Web and power grids. Networks are both a representation tool and an analytic vehicle for gaining deep insights into complex systems. Modular or community structures of networks are important properties of the underlying systems. A community in a network consists of nodes that are connected more tightly with one another than with nodes in different communities. The objective of community detection is to assign every node in a network to a community based on network topology (Girvan and Newman 2002). Community detection can help reveal and comprehend significant hidden properties of complex systems, e.g., the organizational principles of an institution, the units and their functions of an organization, and the structures and possibly vulnerable spots of a power grid.

In practice, besides network topological properties, two additional pieces of information can also be exploited in community detection. The first is the prior community membership that some of the nodes in a network may have, e.g., students and faculty in a university. The second is the semantic information on nodes, e.g., information on personal interests. Utilizing the prior membership of even a small number of nodes can dramatically increase the fidelity of the final communities identified, which also turns the problem into one of semi-supervised community detection. Utilizing node semantics expands the envelope of community detection to encompass attribute networks, a problem that has drawn little attention.

Many community detection methods have been proposed, including hierarchical clustering (Jia et al. 2015), statistical modeling (Chen et al. 2018), network embedding (Tu et al. 2018), etc.

We would like to highlight the two most recent approaches to community detection, one based on Graph Convolutional Networks (GCN) and the other on Markov Random Fields (MRF). GCN is a Convolutional Neural Network (CNN) extended to the problem of semi-supervised classification of nodes in a graph (Kipf and Welling 2016; Chen et al. 2018) and thus can be adopted for community detection. GCN defines a spectral graph convolution by multiplying a graph signal with a spectral filter in the Fourier domain. It uses two graph convolution layers to derive a network embedding, and then applies the Softmax function to classify nodes into different categories. In training, the prior information of community memberships of a few nodes, the network topology and node attributes are used together to learn the weight parameters of the neural network.
Similar to CNN, GCN has an excellent global search capability, i.e., it is able to extract complex features or patterns from a myriad of local features by a stack of convolution operations. However, GCN has at least two drawbacks. First, GCN aims primarily at deriving a network embedding of the input data in the hidden layers of CNN. However, such an embedding is not community oriented and does not consider community properties. More seriously, GCN can only obtain a relatively coarse community result since it lacks smoothness constraints to reinforce similar or nearby nodes to have compatible community labels.

Markov Random Fields (MRF) is a statistical representation and modeling tool, which has enjoyed much success on structured data, particularly on images where pixels are arranged in well-defined grid structures (Krähenbühl and Koltun 2011). An essential ingredient of a MRF model is an objective (or energy) function consisting of unary potentials and pairwise potentials. For example, unary potentials for image segmentation quantify the total cost for individual pixels to be assigned to specific classes (e.g., the background and a cat), and the pairwise potentials specify constraints among adjacent pixels based on their properties such as colors. For image processing problems, the unary potentials are able to give rise to a coarse solution and the pairwise potentials typically help refine the coarse solution (Zheng et al. 2015). We recently extended MRF to community detection in networks with ill-defined structures (He et al. 2018). Different from a traditional MRF that has one graph in its model, our network-specific MRF (i.e., NetMRF (He et al. 2018)) uses three graphs in an attempt to characterize hidden communities in a given network. They are the original network, an expected graph from a random-graph null model of the given network that serves as a baseline for contrasting community structures, and an auxiliary complete graph that is used as a graphical representation of the MRF model. A network-specific belief propagation algorithm is then introduced for model inference.

NetMRF has two eminent features. It is designed to accommodate modular structures, so that it is community oriented. And since the MRF model formulates the community detection problem as a probabilistic inference problem that incorporates assumptions such as community label agreement between nearby nodes, it offers a smooth labeling among nearby nodes and is able to refine coarsely labeled communities. Nevertheless, NetMRF does not consider information on nodes and requires a substantial amount of computation for learning the model.

Since GCN and MRF have complementary features, it is ideal to combine the two to take advantage of their strengths for community detection. A straightforward combination is a two-stage scheme, i.e., running the GCN method first to obtain a coarse community structure for a given network, and subsequently running the NetMRF method as a post-processing step to refine the GCN result. However, this naive combination is a greedy strategy that trains the GCN and MRF models separately. As a result, it is unlikely to identify high-quality community structures, as it does not tightly integrate the features of the two methods, e.g., setting the parameters in the GCN model is not affected by the information in the MRF model.

We propose an end-to-end deep learning method to combine the GCN and MRF methods for semi-supervised community detection on attribute networks. In this new method, we cast the MRF model as a new convolutional layer and incorporate it as the last layer of the GCN model. We then train the whole integrated model altogether, so that the semantic information used in the GCN model can be exploited in the MRF model and the latter can refine the coarse community result. To this end, we first extend NetMRF to eMRF (extended MRF) by adding unary potentials and content information, and by reparameterizing the MRF model in order to make it fit the GCN architecture. We use mean field approximation for model inference in eMRF. By doing so, we can formulate the steps of the mean field inference as a series of convolution operations, and thus convert eMRF into a convolutional layer of GCN. The integrated GCN and converted eMRF constitute an end-to-end deep CNN (Fig. 1). In this end-to-end model, the last convolutional layer (i.e., eMRF) can refine the coarse output from the previous layers in the forward pass, and meanwhile, during training it can back-propagate the differential error to the GCN layers to update their parameters, thus achieving a truly end-to-end integration of the GCN and MRF methods.

2. Preliminaries

2.1 Notations and the Problem

Consider an undirected and attributed network G = (A, X) specified by an n×n adjacency matrix A over n nodes and an n×m attribute (content) matrix X of m attributes per node. There are e edges in G. Some but not all of the nodes are known to belong to some communities, which are indexed by a set of k labels {l_1, l_2, ..., l_k}. In other words, network G is partially labeled. The problem of semi-supervised community detection is then to label the remaining unlabeled nodes in G, and as a result to form k communities of nodes.

2.2 Graph Convolutional Networks

Spectral Graph Convolutional Neural Networks (GCN), as a type of CNN, were proposed by (Bruna et al. 2014) to analyze graph data. Following spectral graph theory (Chung 1997), a network can be considered as a signal x in the time domain, which is transformed to the frequency or spectral domain by the graph Fourier transformation $U^T x$, where U is the matrix of eigenvectors of the normalized graph Laplacian $L = I_n - D^{-1/2} A D^{-1/2} = U \Lambda U^T$, where $I_n$ is the identity matrix, D the matrix of node degrees, and $\Lambda$ the diagonal matrix of eigenvalues. The graph signal in the frequency or spectral domain can then convolve with a function $g_\theta(\Lambda)$ of the diagonal matrix of L's eigenvalues.
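As an illustration of the transform just defined, the following minimal NumPy sketch (our own illustration, not the authors' code; the toy graph is a hypothetical example) builds the normalized Laplacian of a small graph and uses its eigenvectors as the graph Fourier basis:

```python
import numpy as np

# Toy symmetric adjacency matrix of a 4-node graph (hypothetical example).
A = np.array([[0., 1., 1., 0.],
              [1., 0., 1., 0.],
              [1., 1., 0., 1.],
              [0., 0., 1., 0.]])

deg = A.sum(axis=1)                          # node degrees
D_inv_sqrt = np.diag(1.0 / np.sqrt(deg))     # D^{-1/2}
L = np.eye(4) - D_inv_sqrt @ A @ D_inv_sqrt  # L = I_n - D^{-1/2} A D^{-1/2}

lam, U = np.linalg.eigh(L)    # L = U diag(lam) U^T: Fourier basis and spectrum

x = np.random.rand(4)         # a graph signal, one scalar per node
x_hat = U.T @ x               # graph Fourier transform U^T x
assert np.allclose(U @ x_hat, x)   # the inverse transform recovers the signal
```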

The spectral graph convolution can then be performed as

$$g_\theta * x = U g_\theta(\Lambda) U^T x \qquad (1)$$

where x is a network represented by a vector of n scalars, each of which consists of the attributes of a node, and $U^T(\cdot)$ and $U(\cdot)$ represent the graph Fourier transformation and the inverse graph Fourier transformation, respectively.

Because the eigendecomposition of L is complicated and requires $O(n^2)$ time, (Defferrard et al. 2016) suggested using the K-th order Chebyshev polynomial expansion $T_k(\tilde\Lambda)$ (where $\tilde\Lambda = (2/\lambda_{\max})\Lambda - I_n$ and $\lambda_{\max}$ is the largest eigenvalue of L) to approximate $g_\theta(\Lambda)$ and reduce the complexity to O(e), that is, $g_{\theta'}(\Lambda) \approx \sum_{k=0}^{K} \theta'_k T_k(\tilde\Lambda)$, where $\theta'_k$ is the k-th Chebyshev coefficient. Substituting this into formula (1) and using $\tilde L = (2/\lambda_{\max})L - I_n$, we have $g_{\theta'} * x \approx \sum_{k=0}^{K} \theta'_k T_k(\tilde L)\, x$.

Furthermore, (Kipf and Welling 2016) proposed to use $K = 1$ and $\lambda_{\max} = 2$, and derived a simplified graph convolution operation of GCN: $g_\theta * x \approx \theta\,(I_n + D^{-1/2} A D^{-1/2})\, x$, where $\theta$ is the only Chebyshev coefficient left. By defining $\hat A = \tilde D^{-1/2} \tilde A \tilde D^{-1/2}$ (here $\tilde A = A + I_n$ and $\tilde D_{ii} = \sum_j \tilde A_{ij}$), the final output for the assignment of node labels is

$$Z = f(X, A) = \mathrm{softmax}\big(\hat A\, \mathrm{ReLU}(\hat A X W^{(0)})\, W^{(1)}\big) \qquad (2)$$

where $W^{(0)}$ ($W^{(1)}$) and ReLU (Softmax) are the weight parameters and the activation function, respectively, of the first (second) convolutional layer. The Adam optimizer and back-propagation are used to train the GCN model.
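To make (2) concrete, here is a minimal NumPy sketch of the renormalization trick and the two-layer forward pass; the random weights and the toy inputs are placeholders of our own, not trained parameters:

```python
import numpy as np

def renormalize(A):
    """A_hat = D~^{-1/2} (A + I_n) D~^{-1/2}, the renormalized adjacency in (2)."""
    A_tilde = A + np.eye(len(A))
    d_inv_sqrt = 1.0 / np.sqrt(A_tilde.sum(axis=1))
    return A_tilde * np.outer(d_inv_sqrt, d_inv_sqrt)

def softmax(Z):
    e = np.exp(Z - Z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def gcn_forward(A, X, W0, W1):
    """Z = softmax(A_hat ReLU(A_hat X W0) W1), as in formula (2)."""
    A_hat = renormalize(A)
    H = np.maximum(A_hat @ X @ W0, 0.0)      # first layer, ReLU activation
    return softmax(A_hat @ H @ W1)           # second layer, row-wise Softmax

# Hypothetical sizes: n = 4 nodes, m = 5 attributes, h = 16 hidden units, k = 3 labels.
rng = np.random.default_rng(0)
A = np.array([[0., 1., 1., 0.],
              [1., 0., 1., 0.],
              [1., 1., 0., 1.],
              [0., 0., 1., 0.]])
X = rng.random((4, 5))
Z = gcn_forward(A, X, rng.normal(size=(5, 16)), rng.normal(size=(16, 3)))
print(Z.sum(axis=1))   # each row of Z is a distribution over the k labels
```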

2.3 Network-specific Markov Random Fields

Markov Random Fields (MRF), an undirected probabilistic graphical model, has been successfully used to solve many problems such as image segmentation (Feng et al. 2010; Zheng et al. 2015). A complete MRF model is represented by an energy function, consisting of unary potentials $\sum_i \phi(y_i^u)$ and pairwise potentials $\sum_{i \ne j} \psi(y_i^u, y_j^v)$. The unary term $\phi(y_i^u)$ for an individual i measures the cost that it has label u. The pairwise term $\psi(y_i^u, y_j^v)$ for i and j represents the cost that they have labels u and v, respectively.

We recently extended MRF to community detection (He et al. 2018). The difficulty of using MRF for a graph problem is how to define the pairwise potentials $\sum_{i \ne j} \psi(y_i^u, y_j^v)$, which is the main focus of the work in (He et al. 2018), without considering the unary potentials. Our network-specific pairwise MRF (NetMRF) is designed to 1) reward the edges between nodes in the same community, 2) penalize the edges across different communities, 3) penalize missing edges (nonedges) between nodes in the same community, and 4) reward nonedges across different communities. The pairwise potential between nodes i and j is then defined as

$$\psi(y_i^u, y_j^v) = (-1)^{\delta(u,v)} \left( \frac{d_i d_j}{2e} - a_{ij} \right) \qquad (3)$$

where $a_{ij} = 1$ if there is an edge between nodes i and j, or 0 otherwise; $d_i$ is the degree of node i; u and v are the community labels of i and j, respectively; and $\delta(u,v)$ equals 1 if u = v, or 0 otherwise. A network-specific belief propagation algorithm was introduced to learn the NetMRF model.

3. The Method

3.1 Overview

Our new approach falls into the framework of GCN with a MRF model added as a new convolutional layer (Fig. 1), so we name it MRFasGCN. The central piece of the new method is a transformation of a MRF model and its inference to a convolutional layer to be added as the third (the last) layer to the original two-layer architecture of GCN (Fig. 1). By doing so, we are able to integrate and train these two types of models together to develop a truly end-to-end deep learning method for community detection.

Fig. 1. The MRFasGCN architecture. Given an attribute network with adjacency matrix A and node attribute matrix X, the renormalized adjacency matrix $\hat A$ of A and the similarity matrix of nodes K based on A and X are calculated in advance; these are used as the inputs of the neural network along with X. MRFasGCN is composed of three convolution layers. The first two layers are from GCN, which use $\hat A$ to convolve the attribute matrix X. The result of the first two layers is a coarse community labeling. The third convolutional layer performs the MRF inference to refine the result from the first two layers. The final output is obtained by subtracting the refined result from GCN's result. The model is trained via the Adam optimizer, which calculates gradients to update the weight parameters (i.e., $W^{(2)}$, $W^{(1)}$ and $W^{(0)}$).

To be specific, the original two convolutional layers of GCN are used as the first two layers of the MRFasGCN model. That is, in the first layer, we take the attribute matrix X as a signal of the neural network in the time domain, and renormalize the adjacency matrix (i.e., $\hat A$) to contain topology information. In essence, $\hat A X$ can be considered as a convolution between the graph signal in the frequency domain and a first-order approximation of localized spectral filters on the graph. Furthermore, to let the model capture non-linear relationships, we adopt ReLU as the activation function for the first convolutional layer. We use the same convolution mechanism of the first layer in the second convolutional layer.

However, since the input to the third convolutional layer (i.e., MRF) requires probabilistic community memberships, we use Softmax as the activation function in the second layer.

The third convolutional layer, which is designed for MRF, is the most important part of our MRFasGCN. It takes advantage of the MRF's pairwise potentials to make the new model community oriented and to perform a smooth refinement of the coarse results from GCN. To this end, we extend NetMRF to eMRF as follows. First, to let eMRF handle the results from GCN, we add unary potentials, attribute information and trainable weight parameters. The unary potentials are used to link the MRF layer to GCN, i.e., the output of the previous convolutional layer from GCN, i.e., $X^{(2)}$, is taken as an initial value of the unary potentials as a preliminary assignment of communities. To incorporate attribute information, we then compute a similarity matrix of nodes K based on network topology and node attributes (introduced in later sections), instead of using the original similarity matrix based on the topology alone. We then add weight parameters $W^{(2)}$ to better describe communities, which depict latent similarity relationships between communities. Second, we adopt a mean field approximation for inference in eMRF and formulate it as a convolution operation, which is represented as $K X^{(2)} W^{(2)}$. The activation function of the third layer ensures that the output is a set of probabilistic community memberships. The overall end-to-end deep neural network is trained by the Adam optimizer (Kingma and Ba 2015), a well-known optimizer for deep learning. The loss function is the cross-entropy loss between the predicted community labels and the partial ground-truth labels.

3.2 Formulation of MRF as GCN

This is the key to our method. We first extend NetMRF to eMRF to be integrated with GCN, then use a mean field approximation for model inference, and finally formulate eMRF as a convolution layer to form MRFasGCN.

3.2.1 The eMRF Model

In order to extend NetMRF (He et al. 2018) to eMRF, we introduce unary potentials, weight parameters and attribute information to the new model. The unary potentials are used to connect with GCN. Since the unary potential $\phi(y_i^u)$ measures the cost of node i having label u, we use $\phi(y_i^u) = -p(y_i^u)$, where $p(y_i^u)$ is the probability that node i has label u, whose initial value comes from GCN's result.

For pairwise potentials, we note that (He et al. 2018) only considers the topological difference between intra- and inter-community relationships, but ignores the semantic similarities between communities that may also be critical for community detection. For example, a 'politics' community is more similar to an 'economy' community than to a 'sports' community. To solve this issue, we add weight parameters $W^{(2)}_{u,v}$ to the pairwise potential of NetMRF and revise the pairwise potential to

$$\psi'(y_i^u, y_j^v) = \mu(u, v)\, b(i, j) \qquad (4)$$

where $b(i,j) = d_i d_j / 2e - a_{ij}$ as before, and $\mu(u,v) = (-1)^{\delta(u,v)} W^{(2)}_{u,v}$, in which $(-1)^{\delta(u,v)}$ can be seen as the prior of $\mu(u,v)$ and $W^{(2)}_{u,v}$ represents the trainable similarity relationship between the communities of nodes i and j.

Note that the similarity potential b(i, j) was originally defined based on network topology alone in NetMRF. We further extend it to incorporate attribute information. Specifically, we first use the cosine similarity $s(i,j) = x_i x_j^T / (|x_i| \cdot |x_j|)$ between the node attributes of nodes i and j to measure their similarity, where $x_i$ is the i-th row of the attribute matrix X. To properly define every node's similarity, we introduce an asymmetric regularization term to balance the difference of the sum of similarities on every node, i.e., $R_i(s(i,j)) = s(i,j) / \sum_{t=1}^{n} s(i,t)$. Combining the topology and attribute information, the similarity between nodes i and j is

$$k(i,j) = b(i,j) + \beta \cdot R_i\big(x_i x_j^T / (|x_i| \cdot |x_j|)\big) \qquad (5)$$

where $\beta$ is a parameter to make a tradeoff between network topology and attributes, which can be set to 1 when no prior is available. The final pairwise potential is

$$\psi''(y_i^u, y_j^v) = \mu(u, v)\, k(i, j) \qquad (6)$$

Let $C = (C_1, C_2, \ldots, C_n)$ be a partition of network G, where $C_i$ denotes the community label to which node i belongs. Then, the final energy function of eMRF is defined as

$$E(C \mid A, X) = -\sum_i p(y_i^u) + \alpha \sum_{i \ne j} \mu(u, v)\, k(i, j) \qquad (7)$$

where $\alpha$ is the parameter to make a tradeoff between the unary and pairwise terms, which can also generally be set to 1. The weight parameters $W^{(2)}$ can be learned in model training. The similarity propensities of nodes $K = (k(i,j))_{n \times n}$ can be calculated using (5) based on both network topology and attribute information. In (7), $p(y_i^u)$ comes from the result of GCN, so that the unary terms $-\sum_i p(y_i^u)$ serve as an interface between GCN and MRF, and consequently eMRF can be included in the GCN architecture.

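Read literally, (4)-(7) can be assembled with a few matrix operations. Below is a minimal NumPy sketch of the similarity matrix K of formula (5), under our reading of the equations ($\beta$ = 1 as the default noted above, and nonzero attribute rows assumed); it is an illustration, not the authors' implementation:

```python
import numpy as np

def similarity_matrix(A, X, beta=1.0):
    """K = (k(i,j)) of formula (5): null-model topology term b(i,j) plus
    the row-regularized cosine similarity of node attributes."""
    d = A.sum(axis=1)
    e = A.sum() / 2.0
    b = np.outer(d, d) / (2.0 * e) - A        # b(i,j) = d_i d_j / 2e - a_ij
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    s = (X @ X.T) / (norms @ norms.T)         # cosine similarity s(i,j)
    R = s / s.sum(axis=1, keepdims=True)      # asymmetric regularization R_i
    return b + beta * R                       # formula (5), beta = 1 by default
```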
3.2.2 Mean Field Approximation for eMRF's Inference

To retrofit the eMRF model into the GCN paradigm, we formulate the model fitting procedure of MRF by proper convolution operations. The pairwise part of eMRF's energy function in (7) is fully connected. As a result, the exact distribution P(C | A, X) is difficult to compute. We use mean field approximate inference to derive a decomposable iterative update equation for P(C | A, X), making it feasible to be formulated as suitable convolution operations.

In mean field inference, we replace the exact probability distribution P(C) (denoting P(C | A, X) for short) with an approximate distribution Q(C) that is used to minimize the KL-divergence D(Q || P). Here, Q(C) is decomposable, i.e., $Q(C) = \prod_i Q_i(C_i)$. For simplicity, we use $E_{U \sim Q}$ to denote the expected value under distribution Q. We can then write $D(Q \| P) = E_{U \sim Q}[\log Q(U)] - E_{U \sim Q}[\log P(U)]$. The Gibbs distribution is $P(C \mid A, X) = \frac{1}{Z} \exp(-E(C \mid A, X))$, so $D(Q \| P) = E_{U \sim Q}[E(U)] + E_{U \sim Q}[\log Q(U)] + \log Z$ (here Z is a normalization constant). Because $Q(C) = \prod_i Q_i(C_i)$, we have $E_{U \sim Q}[\log Q(U)] = \sum_i E_{U_i \sim Q_i}[\log Q_i(U_i)]$ and then

$$D(Q \| P) = E_{U \sim Q}[E(U)] + \sum_i E_{U_i \sim Q_i}[\log Q_i(U_i)] + \log Z \qquad (8)$$

To optimize $Q_i(C_i)$, we define a Lagrangian that consists of all terms in D(Q || P) that involve $Q_i(C_i)$, i.e.,

$$L(Q_i) = E_{U \sim Q}[E(U)] + \sum_{C_i} Q_i(C_i) \log Q_i(C_i) + \lambda \Big( \sum_{C_i} Q_i(C_i) - 1 \Big) \qquad (9)$$

where the Lagrange multiplier $\lambda$ corresponds to the constraint that assures all $Q_i(C_i)$ to be probability distributions. We now take derivatives of (9) with respect to $Q_i(C_i)$:

$$\partial L(Q_i) / \partial Q_i(C_i) = E_{U \sim Q}[E(C) \mid C_i] + \log Q_i(C_i) + 1 + \lambda \qquad (10)$$

where $E_{U \sim Q}[E(C) \mid C_i] = \phi(C_i) + \sum_{j \ne i} E_{U_j \sim Q_j}[\psi(C_i, U_j)]$. By setting the derivative to 0 and reordering all terms, we have

$$\log Q_i(C_i) = -(\lambda + 1) - E_{U \sim Q}[E(C) \mid C_i] \qquad (11)$$

As $\lambda + 1$ is a constant relative to $C_i$ and can be handled by renormalization, the final formula for $Q_i(C_i)$ is then

$$Q_i(C_i) = \frac{1}{Z_i} \exp\Big\{ -\phi(C_i) - \sum_{j \ne i} E_{U_j \sim Q_j}[\psi(C_i, U_j)] \Big\} \qquad (12)$$

Substituting $\psi''(y_i^u, y_j^v)$ of (6) into (12), we have

$$Q_i(C_i) = \frac{1}{Z_i} \exp\Big\{ -\phi(C_i) - \sum_{j \ne i} E_{U_j \sim Q_j}[\mu(C_i, U_j)\, k(i,j)] \Big\} = \frac{1}{Z_i} \exp\Big\{ -\phi(C_i) - \sum_{j \ne i} \sum_{C_j \in L} Q_j(C_j)\, \mu(C_i, C_j)\, k(i,j) \Big\} \qquad (13)$$

After transposition, the final iterative update equation is

$$Q_i(C_i) = \frac{1}{Z_i} \exp\Big\{ -\phi(C_i) - \sum_{j \ne i} \sum_{C_j \in L} \mu(C_i, C_j)\, k(i,j)\, Q_j(C_j) \Big\} \qquad (14)$$

That is, by minimizing the KL-divergence and constraining Q(C) and $Q_i(C_i)$ to be valid distributions, we derive

$$Q_i(C_i) = \frac{1}{Z_i} \exp\Big\{ -\phi(C_i) - \sum_{j \ne i} \sum_{C_j \in L} \mu(C_i, C_j)\, k(i,j)\, Q_j(C_j) \Big\} \qquad (15)$$

3.2.3 Incorporating eMRF's Inference into GCN Training

Based on the update equation of eMRF in (15), we formulate it as convolution operations to finalize the last component of the integrated end-to-end deep neural network. The update equation in (15) can be split into four steps, i.e., initialization, message passing, addition of unary potentials and normalization, each of which can be formulated as convolutional operations. For clarity, we use $Q_i(u)$ to denote the approximate probability of node i having community label u, and we denote the updated $Q_i(u)$ in each step by $Q_i^1(u)$, $Q_i^2(u)$, $Q_i^3(u)$ and $Q_i^4(u)$, respectively.

1. Initialization. Assume $C_i = u$; the initialization is done by $Q_i^1(u) \leftarrow \frac{1}{Z_i} \exp\{-\phi(y_i^u)\}$. It is equivalent to applying Softmax to the output of GCN, that is, $\phi(y_i^u) = -p(y_i^u)$.

2. Message passing. The process of message passing consists of two parts: message passing between nodes and message passing between communities. For the first part, the formula of message passing from node j to node i is $Q_i^1(u) \leftarrow \sum_{j \ne i} k(i,j)\, Q_j^1(u)$. This can be taken as multiplying the similarity matrix $K = (k(i,j))_{n \times n}$ calculated in (5) by the initialized Q matrix. For the second part, the formula of message passing from community v to community u is $Q_i^2(u) \leftarrow \sum_{v \in L} \mu(u,v)\, Q_i^1(v)$. Because the weights $W^{(2)}_{u,v}$ used in $\mu(u,v)$ can be learned in training, this can also be taken as a convolution operation executed on $Q_i^1(u)$, where the receptive field of the convolution filter is 1×1, and the numbers of input and output channels are both k.

3. Addition of unary potentials. This step is carried out by $Q_i^3(u) \leftarrow -(\phi(y_i^u) + Q_i^2(u))$, which is equal to adding the unary term to $Q_i^2(u)$ and then taking its negative.

4. Normalization. The last step is performed as $Q_i^4(u) \leftarrow \frac{1}{Z_i} \exp\{Q_i^3(u)\}$. This can be taken as applying Softmax to normalize the output from the previous step.

Following the above steps, we transform eMRF's inference into a convolution process which is compatible with GCN. By adding eMRF as the last CNN layer to the original two-layer structure, we thus integrate GCN and MRF in a joint end-to-end learning system, MRFasGCN.
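The four steps map directly onto matrix operations. Here is a sketch of one such refinement pass in our own notation (P is the n×k GCN output, K the similarity matrix from (5), and W2 the trainable k×k compatibility matrix); it follows our reading of the steps above rather than the released code:

```python
import numpy as np

def softmax(Z):
    e = np.exp(Z - Z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def mrf_layer(P, K, W2):
    """One mean-field refinement written as the four steps of Sec. 3.2.3.
    P  : n x k GCN output, i.e. the result of step 1 (Softmax of -phi)
    K  : n x n similarity matrix from formula (5)
    W2 : k x k trainable community-compatibility weights."""
    Q = K @ P          # step 2a: message passing between nodes
    Q = Q @ W2         # step 2b: message passing between communities (1x1 conv)
    Q = P - Q          # step 3: add the unary term and negate, X(2) - K X(2) W(2)
    return softmax(Q)  # step 4: normalization
```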

As we define $X^{(1)}$, $X^{(2)}$ and Z as the outputs of the first, second and third convolution layers, and $W^{(0)}$, $W^{(1)}$ and $W^{(2)}$ as the weights of the first, second and third layers, respectively, the forward model (i.e., the model in the forward pass of our designed deep neural network) is:

$$X^{(1)} = f_1(X, \hat A) = \mathrm{ReLU}(\hat A X W^{(0)}),$$
$$X^{(2)} = f_2(X^{(1)}, \hat A) = \mathrm{softmax}(\hat A X^{(1)} W^{(1)}),$$
$$Z = f_3(X^{(2)}, A, X) = \mathrm{softmax}(X^{(2)} - K X^{(2)} W^{(2)}) \qquad (16)$$

where $K = (k(i,j))_{n \times n}$ is the similarity matrix of nodes in (5), and $W^{(2)} = [(-1)^{\delta(u,v)} W^{(2)}_{u,v}]_{k \times k}$. Then, MRFasGCN can be trained by minimizing the cross entropy between the predicted and the (partial) ground-truth community labels under parameters $\theta = \{W^{(0)}, W^{(1)}, W^{(2)}\}$, i.e.,

$$\arg\min_\theta L(Z, Y) = \arg\min_\theta \Big( -\sum_{i=1}^{n'} \sum_{j=1}^{k} Y_{ij} \ln Z_{ij} \Big) \qquad (17)$$

where $n'$ is the number of labeled nodes and Y the prior community memberships. The model can then be trained by using the back-propagation algorithm as done in GCNs or CNNs.
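Putting (16) and (17) together, here is a self-contained NumPy sketch of the forward model and the masked cross-entropy loss; the weights and the `labeled` index set are illustrative placeholders of our own:

```python
import numpy as np

def softmax(Z):
    e = np.exp(Z - Z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def mrfasgcn_forward(A_hat, K, X, W0, W1, W2):
    """Forward model of formula (16)."""
    X1 = np.maximum(A_hat @ X @ W0, 0.0)   # X(1) = ReLU(A_hat X W(0))
    X2 = softmax(A_hat @ X1 @ W1)          # X(2) = softmax(A_hat X(1) W(1))
    return softmax(X2 - K @ X2 @ W2)       # Z = softmax(X(2) - K X(2) W(2))

def cross_entropy(Z, Y, labeled):
    """Loss of formula (17), summed over the n' labeled nodes only."""
    return -np.sum(Y[labeled] * np.log(Z[labeled] + 1e-12))
```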
3.3 Complexity Analysis

We ran our approach on TensorFlow with an efficient GPU implementation, i.e., sparse-dense tensor multiplication. In the first convolutional layer, both $\hat A$ and X are sparse tensors, and $W^{(0)}$ has a shape of m×h, so the computational complexity of this layer is O(emh), where h is the number of hidden units of the first layer. In the second layer, $W^{(1)}$ has a shape of h×k, so the computational complexity of the first two layers is O(emhk), where e, m and k are the numbers of edges, attributes and communities, respectively. In the third layer, K is a sparse tensor and $W^{(2)}$ has a size of k×k. The total complexity for all three layers in (16) is $O(emhk^2)$. This time complexity is linear in the number of edges on large and sparse networks. Because we used a sparse representation for the adjacency matrix, the space complexity of our approach is O(e), which is the same as that of the original GCN and is linear in the number of edges.
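The O(e) space claim corresponds to keeping the n×n operators in sparse form; e.g., with SciPy (an illustration of the storage argument, not the authors' TensorFlow code; the graph below is hypothetical):

```python
import numpy as np
import scipy.sparse as sp

n = 100_000
A_hat = sp.random(n, n, density=2e-5, format="csr")   # hypothetical sparse operator
print(A_hat.data.nbytes + A_hat.indices.nbytes)       # memory grows with e, not n^2

X2 = np.random.rand(n, 8)    # n x k dense community memberships (k = 8 here)
out = A_hat @ X2             # sparse-dense multiply, roughly O(e k) work
```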

4 Experiments

We now analyze why MRFasGCN works, and compare it with several state-of-the-art (semi-supervised) community detection methods for evaluation.

In training, we adopted the widely-used Adam optimizer and ran the experiments on TensorFlow. We set the learning rate to 0.03, the dropout rate to 0.8, and the maximum number of epochs to 200. We terminated the training process when the loss function failed to decrease in 10 consecutive epochs. We used the same scheme as in GCN to initialize the weights of MRFasGCN's first two CNN layers; we initialized the weights of the third layer uniformly, i.e., the weights were randomly chosen from a uniform distribution. The number of hidden units of the first convolution layer is 16 (i.e., h = 16), except for NELL (see below), which has 64 units.
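For reference, here is the training setup above collected into one place; the loop is a generic early-stopping skeleton of our own, and the `run_epoch` callback is hypothetical (the paper's exact script is not shown):

```python
# Hyperparameters reported above (hidden_units = 64 for NELL instead of 16).
config = {
    "optimizer": "adam",
    "learning_rate": 0.03,
    "dropout": 0.8,
    "max_epochs": 200,
    "patience": 10,        # stop when loss fails to decrease for 10 epochs
    "hidden_units": 16,
}

def train(run_epoch, cfg=config):
    """Generic early-stopping loop; run_epoch(i) performs one training
    epoch and returns the loss (a hypothetical callback)."""
    best, wait = float("inf"), 0
    for i in range(cfg["max_epochs"]):
        loss = run_epoch(i)
        if loss < best:
            best, wait = loss, 0
        else:
            wait += 1
            if wait >= cfg["patience"]:
                break
    return best
```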
4.1 Why MRFasGCN Works

4.1.1 Quantitative Analysis

To understand MRFasGCN, we compared it with GCN (Kipf and Welling 2016) and the naive two-stage scheme that we outlined earlier (GCN+MRF), i.e., running GCN first and then applying eMRF to refine its results. Note that eMRF can also run individually. We did not compare MRFasGCN with eMRF alone because GCN+MRF is an enhanced eMRF that uses GCN's results to define its unary potentials.

The datasets are shown in Table 1. Citeseer, Cora, Pubmed and NELL are from (Kipf and Welling 2016), where they were used to validate GCN; the proportion of semi-supervised information that was made available was set according to (Kipf and Welling 2016). Cornell, Texas, Washington, Wisconsin and UAI2010 are from (Wang et al. 2018), where they were also used to validate community detection methods; the proportion of their semi-supervised information was set according to (Wang et al. 2018). We used Accuracy (AC) and Normalized Mutual Information (NMI) (Liu et al. 2012) as metrics for performance evaluation.

Table 1. Dataset descriptions. Here, rate (%) denotes the proportion of semi-supervised information.

Datasets     Nodes   Edges    Communities  Attributes  Rate (%)
Cornell      195     286      5            1,703       20
Texas        187     298      5            1,703       20
Washington   230     417      5            1,703       20
Wisconsin    265     479      5            1,703       20
UAI2010      3,363   45,006   19           4,972       3.0
Citeseer     3,327   4,732    6            3,703       3.6
Cora         2,708   5,429    7            1,433       5.2
Pubmed       19,717  44,338   3            500         0.3
NELL         65,755  266,144  210          5,414       0.1

Table 2. The results of GCN, GCN+MRF (G+M for short) and MRFasGCN (MasG for short) in terms of accuracy (AC), normalized mutual information (NMI) and runtime.

             AC (%)              NMI (%)             Time (seconds)
Datasets     GCN   G+M   MasG    GCN   G+M   MasG    GCN   G+M    MasG
Cornell      46.3  47.6  63.4    9.1   14.7  38.8    2.4   5.5    3.1
Texas        57.1  55.4  71.1    5.0   5.0   45.0    2.0   5.0    2.2
Washington   54.9  58.8  70.6    16.3  23.4  41.4    1.9   4.5    2.2
Wisconsin    55.6  53.8  72.7    17.8  18.5  52.2    2.5   5.3    3.0
UAI2010      16.4  18.3  50.6    3.8   9.1   40.5    42.0  185.2  120.3
Citeseer     70.3  71.7  73.2    42.3  46.2  46.3    7.0   18.0   14.0
Cora         81.4  82.5  84.3    54.5  64.4  66.2    4.0   10.0   6.0
Pubmed       79.0  79.6  79.6    26.0  38.2  40.7    38.0  63.0   51.0
NELL         45.4  63.1  66.4    76.7  78.2  78.3    48.0  280.0  240.0

As shown in Table 2, MRFasGCN has the best performance and GCN+MRF outperforms GCN, confirming that the two-stage method GCN+MRF improves upon GCN through MRF's refinement. It also shows that MRFasGCN improves upon GCN+MRF through end-to-end deep learning, so that their complementary features can be tightly integrated to find optimal or near-optimal solutions. While GCN+MRF and MRFasGCN have roughly similar magnitudes of running time (Table 2), MRFasGCN runs faster than GCN+MRF, indicating that the integration of the two can also help speed up the process of finding the right weights of all three layers for model fitting.

4.1.2 Qualitative Analysis

To further validate the refining function of MRFasGCN's third convolution layer, we illustrate in Fig. 2 two nodes in Cora which were mistaken by GCN but corrected by MRFasGCN. Each node's color denotes its predicted community label. A node containing an asterisk is labeled; otherwise it is unlabeled. Node id2395 in Fig. 2(a) was erroneously assigned to community 7 by GCN. The reason for the mistake is that one of its neighbors is labeled and this label directly affects node id2395's predicted label. In comparison, MRFasGCN correctly assigned node id2395 to its correct community 1. This is because MRFasGCN uses the MRF layer to refine the coarse results from GCN by using information on the nodes adjacent to node id2395. We observe another node (id572) that was also mistaken by GCN (Fig. 2(b)). In this case, supervised information misguided GCN again: node id572 was wrongly assigned to community 7 because one of its 2-hop neighbors is originally labeled as community 7. Fortunately, the pairwise potentials of MRFasGCN consider the label consistency between nearby nodes and correctly adjust this node's label to community 4.

Fig. 2. Two examples of wrongly-divided nodes in Cora.

4.2 Comparison with the Existing Methods

We compared MRFasGCN with three types of state-of-the-art methods. The first includes DCSBM (Karrer and Newman 2011) and NetMRF (He et al. 2018), which both use network topology alone. The second type includes PCLDC (Yang et al. 2009), SCI (Wang et al. 2016) and NEMBP (He et al. 2017), which use both topology and attribute information. The third type includes WSCDSM (Wang et al. 2018) and DIGCN (Li et al. 2018), which are semi-supervised methods on attribute networks.

Table 3. Comparison of prediction accuracy (AC). Bold font indicates the best result. '-' indicates that the runtime exceeds 24 hours or the method runs out of memory. Our method is denoted by MasG and the other methods are denoted by their first three characters for short.

             AC (%)
Datasets     DCS   Net   PCL   SCI   NEM   DIG   WSC   MasG
Cornell      37.9  31.8  30.3  36.9  47.2  46.3  53.9  63.4
Texas        48.1  30.6  38.8  49.7  53.6  57.1  77.5  71.1
Washington   31.8  35.0  30.0  46.1  42.9  50.0  58.3  70.6
Wisconsin    32.8  28.6  30.2  46.4  63.4  55.6  61.9  72.7
UAI2010      2.6   31.1  28.8  29.5  46.3  17.3  27.2  50.6
Citeseer     26.6  22.2  24.9  34.4  49.5  70.1  47.6  73.2
Cora         38.5  58.1  34.1  41.7  57.6  81.7  53.7  84.3
Pubmed       53.6  55.5  63.6  -     65.7  79.2  -     79.6
NELL         -     -     -     -     -     45.2  -     66.4

Table 4. Comparison of the 8 community detection methods in NMI.

             NMI (%)
Datasets     DCS   Net   PCL   SCI   NEM   DIG   WSC   MasG
Cornell      9.7   7.3   7.2   6.8   18.7  9.1   21.6  38.8
Texas        16.6  5.5   10.4  12.5  35.1  5.0   53.7  45.0
Washington   9.9   5.8   5.7   6.8   21.2  10.5  24.1  41.4
Wisconsin    3.1   3.2   5.0   13.3  38.0  18.8  31.4  52.2
UAI2010      31.2  25.8  26.9  23.4  47.2  3.7   24.3  40.5
Citeseer     4.1   1.2   3.0   9.2   24.3  45.4  35.3  46.3
Cora         17.1  37.2  17.5  17.8  44.1  62.5  52.5  66.2
Pubmed       12.3  16.9  26.8  28.3  -     38.0  -     40.7
NELL         -     -     -     -     -     71.9  -     78.3

Measured in AC, MRFasGCN performs the best on 8 of the 9 networks and the second best on the remaining network, Texas (Table 3). In general, the methods that use both topology and attribute information perform better than those using topology alone, and the semi-supervised methods outperform the ones that do not use prior membership information. We highlight that among the semi-supervised methods, MRFasGCN is on average 15.1% and 14.4% more accurate than WSCDSM and DIGCN, respectively.

The NMI results follow a similar trend as AC (Table 4). MRFasGCN performs the best on 7 of the 9 networks, and is also competitive on the remaining 2 networks. Compared with the semi-supervised methods, it is 12.5% and 20.5% more accurate than WSCDSM and DIGCN on average. This further validates its effectiveness over the existing methods.

5 Related Work

Our work was inspired by the recent work on GCN and MRF for community detection and image processing.

Graph Convolutional Network. The Graph Convolutional Network (GCN) method, proposed by (Kipf and Welling 2016), applied spectral graph theory and some simplifications to build a two-layer graph convolutional network. It focuses primarily on semi-supervised node classification, which can be adopted for community detection. The original GCN method has been extended. For example, (Chen et al. 2018) proposed to augment GCN with a non-backtracking operator, defined on the line graph of edge adjacencies. (Li et al. 2018) proposed a co-training and self-training approach to train GCNs, which we refer to as DIGCN. In contrast, our new approach incorporates a different type of statistical model, i.e., MRF, into GCN by converting MRF into a convolutional layer and the model fitting operations of MRF into convolution operations. The objective of introducing a MRF into the GCN framework is to refine the coarse solution from GCN using MRF, which is fundamentally different from the existing methods.

Network-Specific Markov Random Fields. As an undirected probabilistic graphical model, MRF has been used in many applications such as image segmentation. Despite its success in image processing and other related problems, MRF was only extended very recently to community detection (He et al. 2018). We presented a network MRF (NetMRF) model that encodes the modular properties of an irregular network in the energy function to describe communities. In this paper, we converted MRF to a convolutional layer and incorporated it into GCN. Furthermore, some detailed differences between the converted MRF and the original NetMRF should be highlighted. First, NetMRF uses network topology (and pairwise potentials in the energy function) alone to describe communities, while the new method uses both topology and attributes, and accommodates unary potentials in the energy function in addition to pairwise potentials. Second, NetMRF does not consider the semantic similarities among communities, while our approach does take such similarities into account in its model.

Joint Method of CNN and MRF for Image Data. Some works on image processing are also loosely related. The most relevant is the CRF-RNN model for image segmentation (Zheng et al. 2015). Its main idea is to append to FCN (Fully Convolutional Networks for image segmentation) a recurrent neural network (RNN) which is transformed from the fully connected pairwise MRF model presented by (Krähenbühl and Koltun 2011). The RNN layers mimic the sharp boundary delineation capabilities of MRF while exploiting the feature representation power of FCN. However, this method has only been used on image data. It does not seem straightforward to extend it to graph problems such as community detection, for at least three reasons.

First, different from images, networks are non-Euclidean in that they do not have the fixed topological structures that are essential for convolution operations. Second, there is no in-depth study of MRF for community detection; the only one is our work in (He et al. 2018), which ignores the information on nodes and requires a substantial amount of computation for learning the model. Third, the construction of the joint training of different types of network-specific statistical models is also a non-trivial task. We overcame these challenges and successfully developed a joint model of GCN and MRF for network community detection.

6 Conclusion

We proposed an end-to-end deep learning method, namely MRFasGCN, to integrate GCN and MRF for the problem of semi-supervised community detection in attributed networks. It has an architecture of three convolutional layers, with GCN as the first two convolution layers and MRF as the third layer. The MRF component utilizes the coarse output of GCN to construct MRF's unary potentials, and then enhances the network-specific pairwise potentials to find better communities. In addition to the integrated GCN and MRF architecture, another significant contribution of our work is the novel transformation of MRF to CNN. We analyzed and demonstrated the strength of our model in both accuracy and running time. Experimental results showed that it outperformed the state-of-the-art community detection methods on many large real-world networks.

Acknowledgments

This work was supported by the National Key R&D Program of China (2017YFB1401201) and the Natural Science Foundation of China (61876128, 61772361, 61502334, U1736103).

References

Bruna, J.; Zaremba, W.; Szlam, A.; and LeCun, Y. 2014. Spectral Networks and Locally Connected Networks on Graphs. In International Conference on Learning Representations. Banff, Canada.

Chen, Z.; Li, X.; and Bruna, J. 2018. Supervised Community Detection with Hierarchical Graph Neural Networks. arXiv preprint arXiv:1705.08415.

Chung, F. R. K. 1997. Spectral Graph Theory. American Mathematical Society.

Defferrard, M.; Bresson, X.; and Vandergheynst, P. 2016. Convolutional Neural Networks on Graphs with Fast Localized Spectral Filtering. In Neural Information Processing Systems, 3844-3852. Barcelona, Spain.

Feng, W.; Jia, J.; and Liu, Z. Q. 2010. Self-Validated Labeling of Markov Random Fields for Image Segmentation. IEEE Transactions on Pattern Analysis & Machine Intelligence 32(10): 1871.

Girvan, M.; and Newman, M. E. J. 2002. Community Structure in Social and Biological Networks. Proceedings of the National Academy of Sciences 99(12): 7821-7826.

Hammond, D. K.; Vandergheynst, P.; and Gribonval, R. 2011. Wavelets on Graphs via Spectral Graph Theory. Applied and Computational Harmonic Analysis 30(2): 129-150.

He, D.; Feng, Z.; Jin, D.; Wang, X.; and Zhang, W. 2017. Joint Identification of Network Communities and Semantics via Integrative Modeling of Network Topologies and Node Contents. In Proceedings of the 31st AAAI Conference on Artificial Intelligence. San Francisco, California, USA: AAAI Press.

He, D.; You, X.; Feng, Z.; Jin, D.; Yang, X.; and Zhang, W. 2018. A Network-Specific Markov Random Field Approach to Community Detection. In Proceedings of the 32nd AAAI Conference on Artificial Intelligence. New Orleans, Louisiana, USA: AAAI Press.

Jia, S.; Gao, L.; Gao, Y.; Nastos, J.; Wang, Y.; Zhang, X.; and Wang, H. 2015. Defining and Identifying Cograph Communities in Complex Networks. New Journal of Physics 17(1): 013044.

Karrer, B.; and Newman, M. E. J. 2011. Stochastic Blockmodels and Community Structure in Networks. Physical Review E 83: 016107.

Kingma, D. P.; and Ba, J. 2015. Adam: A Method for Stochastic Optimization. In International Conference on Learning Representations. San Diego, CA.

Kipf, T. N.; and Welling, M. 2016. Semi-Supervised Classification with Graph Convolutional Networks. In International Conference on Learning Representations. San Juan, Puerto Rico.

Krähenbühl, P.; and Koltun, V. 2011. Efficient Inference in Fully Connected CRFs with Gaussian Edge Potentials. In The Twenty-Fifth Annual Conference on Neural Information Processing Systems, 109-117. Granada, Spain.

Li, Q.; Han, Z.; and Wu, X. M. 2018. Deeper Insights into Graph Convolutional Networks for Semi-Supervised Learning. In Thirty-Second AAAI Conference on Artificial Intelligence. New Orleans, Louisiana, USA: AAAI Press.

Liu, H.; Wu, Z.; Li, X.; Cai, D.; and Huang, T. S. 2012. Constrained Nonnegative Matrix Factorization for Image Representation. IEEE Transactions on Pattern Analysis & Machine Intelligence 34(7): 1299-1311.

Tu, K.; Cui, P.; Wang, X.; Wang, F.; and Zhu, W. 2018. Structural Deep Embedding for Hyper-Networks. In Thirty-Second AAAI Conference on Artificial Intelligence. New Orleans, Louisiana, USA: AAAI Press.

Wang, W.; Liu, X.; Jiao, P.; Chen, X.; and Jin, D. 2018. A Unified Weakly Supervised Framework for Community Detection and Semantic Matching. In Pacific-Asia Conference on Knowledge Discovery and Data Mining, 218-230.

Wang, X.; Jin, D.; Cao, X.; Yang, L.; and Zhang, W. 2016. Semantic Community Identification in Large Attribute Networks. In Thirtieth AAAI Conference on Artificial Intelligence. Phoenix, Arizona, USA: AAAI Press.

Yang, T.; Jin, R.; Chi, Y.; and Zhu, S. 2009. Combining Link and Content for Community Detection: A Discriminative Approach. In Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 927-936. New York, NY: ACM Press.

Zheng, S.; Jayasumana, S.; Romera-Paredes, B.; Vineet, V.; Su, Z.; Du, D.; Huang, C.; and Torr, P. H. S. 2015. Conditional Random Fields as Recurrent Neural Networks. In IEEE International Conference on Computer Vision, 1529-1537.
