The Thirty-Fourth AAAI Conference on Artificial Intelligence (AAAI-20)

Effective Decoding in Graph Auto-Encoder Using Triadic Closure

Han Shi,1 Haozheng Fan,1,2 James T. Kwok1
1Department of Computer Science and Engineering, Hong Kong University of Science and Technology, Hong Kong
2Amazon
{hshiac, jamesk}@cse.ust.hk, [email protected]

Abstract

The (variational) graph auto-encoder and its variants have been popularly used for representation learning on graph-structured data. While the encoder is often a powerful graph convolutional network, the decoder reconstructs the graph structure by considering only two nodes at a time, thus ignoring possible interactions among edges. On the other hand, structured prediction, which considers the whole graph simultaneously, is computationally expensive. In this paper, we utilize the well-known triadic closure property, which is exhibited in many real-world networks. We propose the triad decoder, which considers and predicts the three edges involved in a local triad together. The triad decoder can be readily used in any graph-based auto-encoder. In particular, we incorporate it into the (variational) graph auto-encoder. Experiments on link prediction, node clustering and graph generation show that the use of triads leads to more accurate prediction and clustering, and better preservation of the graph characteristics.

Introduction

With the proliferation of online social networks, an enormous number of people are now connected digitally. Besides people, almost everything else is also increasingly connected, either physically or by all sorts of relationships, leading to the recent popularity of the internet of things and knowledge graphs. In today's big data era, it is thus important for organizations to derive the most value and insight out of this colossal amount of data entities and inter-relationships.

In general, nodes in a graph represent data entities, while edges represent all sorts of fine-grained relationships. For example, in a social network, the nodes are users that are connected by edges denoting pairwise friendships. In an author collaboration network, the edges denote co-authorship relationships. Besides social networks and knowledge graphs, data in domains such as chemistry and natural language semantics are often naturally represented as graphs.

There are a number of important tasks in graph analytics. A prominent example is link prediction (Wang, Chen, and Li 2017), which predicts whether an edge should exist between two given nodes. Since its early success at LinkedIn, link recommendation has attracted significant attention in social networks, and also in predicting associations between molecules in biology and in discovering relationships in a terrorist network. Other popular graph analytics tasks include the clustering of nodes (Wang et al. 2017) and automatic graph generation (Bojchevski et al. 2018). Node clustering aims to partition graph nodes into a set of clusters such that the intra-cluster nodes are much more closely related (densely connected) to each other than to the inter-cluster nodes. Such cluster structures occur frequently in many domains, such as computer networks, sociology, and physics. Graph generation refers to the task of generating similar output graphs given an input graph. It can be used in the discovery of molecule structures and in dynamic network prediction.

However, graphs are typically difficult to analyze because they are large and highly sparse. Recently, there has been a surge of interest in learning better graph representations (Goyal and Ferrara 2018; Hamilton, Ying, and Leskovec 2017). Different approaches have been proposed to embed the structural and attribute information of a graph into a low-dimensional vector space, such that both neighborhood similarity and community membership are preserved (Zhang et al. 2018). A particularly successful unsupervised representation learning model on graphs is the variational auto-encoder (VAE) (Kingma and Welling 2014), and its variants such as the variational graph auto-encoder (VGAE), graph auto-encoder (GAE) (Kipf and Welling 2016b), adversarially regularized graph auto-encoder (ARGA), and adversarially regularized variational graph auto-encoder (ARVGA) (Pan et al. 2018). All these auto-encoder models consist of an encoder, which learns the latent representation, and a decoder, which reconstructs the graph-structured data based on the learned representation. The encoder is often based on the powerful graph convolutional network (GCN) (Kipf and Welling 2016a). However, the decoder is relatively primitive, and prediction of a link is based simply on the inner product between the latent representations of the two nodes involved.

As nodes in a graph are interrelated, instead of predicting each link separately, all the candidate links in the whole graph should be considered together, leading to a structured prediction problem. However, learning of structured prediction models is often NP-hard (Globerson et al. 2015). A recent model along this direction is the structured prediction energy network (SPEN) (Belanger and McCallum 2016). However, the design of the underlying energy function is still an open question, and computation of the underlying Hessian is computationally expensive.

On the other hand, an interesting property exhibited in many real-world networks is the so-called triadic closure, which was first proposed by Simmel (1908). This mechanism states that for any three nodes {i, j, k} in a graph, if there are edges between (i, j) and (i, k), it is likely that an edge also exists between j and k. It was later popularized by Granovetter (1973), who demonstrated empirically that triadic closure can be used to characterize link connections in many domains. For example, if two people (A and B) in a friendship network have a common friend, it is likely that A and B are also friends. If two papers in a citation network cite the same paper (suggesting that they belong to the same topic), it is likely that one also cites the other. Besides, triadic closure is also fundamental to the understanding and prediction of network evolution and community growth (Caplow 1968; Bianconi et al. 2014; Zhou et al. 2018).
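As a concrete illustration (not from the paper), the short NumPy sketch below measures how strongly a graph exhibits triadic closure: it computes the fraction of wedges, i.e. paths j-i-k through a common neighbor i, that are closed by a direct edge (j, k). This is exactly the global clustering coefficient; the function name triadic_closure_ratio is our own.

```python
import numpy as np

def triadic_closure_ratio(A: np.ndarray) -> float:
    """Fraction of wedges (paths j-i-k through a common neighbor i)
    that are closed by a direct edge (j, k), i.e. the global
    clustering coefficient.  A is a symmetric 0/1 adjacency matrix
    with a zero diagonal."""
    A = (A > 0).astype(np.int64)
    # (A @ A)[j, k] counts length-2 paths j-i-k; multiplying by A keeps
    # only the wedges whose endpoints are also adjacent (closed wedges).
    closed = np.triu((A @ A) * A, k=1).sum()
    deg = A.sum(axis=1)
    wedges = int((deg * (deg - 1) // 2).sum())  # all wedges, open or closed
    return closed / wedges if wedges else 0.0

# A triangle {0, 1, 2} plus a dangling node 3: 3 of the 5 wedges are closed.
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]])
print(triadic_closure_ratio(A))  # 0.6
```

In a network obeying triadic closure, this ratio is markedly higher than in a random graph of the same density.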
In this paper, we utilize this triadic closure property as an efficient tradeoff between structured prediction (which considers the whole graph simultaneously but is expensive) and individual link prediction (which is simple but ignores interactions among edges). Specifically, we propose the triad decoder, which predicts the three edges involved in a triad together. The triad decoder can readily replace the vanilla decoder in any graph-based auto-encoder. In particular, we incorporate it into VGAE and GAE, leading to the triad variational graph auto-encoder (TVGA) and the triad graph auto-encoder (TGA). Experiments are performed on link prediction, node clustering and graph generation using a number of real-world data sets. The prediction and clustering results are more accurate, and the generated graphs preserve more characteristics of the input graph, demonstrating the usefulness of triads in graph analytics.

Related Work: Graph Auto-Encoder

The variational graph auto-encoder (VGAE) and its non-variational variant, the graph auto-encoder (GAE), are introduced in (Kipf and Welling 2016b). Built on the variational auto-encoder (VAE) framework (Kingma and Welling 2014), they are unsupervised deep learning models consisting of an encoder and a decoder.

GCN Encoder

Let the graph be $G = (V, E)$, where $V$ is a set of $N$ nodes and $E$ is the set of edges. Let its adjacency matrix be $A$. The encoder is a graph convolutional network (GCN) (Kipf and Welling 2016a), a deep learning model for graph-structured data. Each GCN layer takes the form $H^{(l+1)} = f(H^{(l)}, A) = \mathrm{ReLU}(\tilde{A} H^{(l)} W^{(l)})$, where $\tilde{A}$ is the normalized adjacency matrix, $W^{(l)}$ is the layer's weight matrix, and $\mathrm{ReLU}(x) = \max(0, x)$ is the ReLU activation function. $H^{(0)}$ is the matrix of node feature vectors $X$, and $H^{(L)}$ is the (deterministic) graph embedding matrix $Z$. Often, $L = 2$ (Kipf and Welling 2016b), leading to the following encoder:

$H^{(1)} = f(X, A), \quad Z = f(H^{(1)}, A)$.   (1)

In VGAE, the embedding (encoder output) is probabilistic. Let $z_i$ be the embedding of node $i$. It is assumed to follow the normal distribution:

$q(z_i \mid X, A) = \mathcal{N}(\mu_i, \mathrm{diag}(\sigma_i^2))$,   (2)

where $\mu_i$ and $\log \sigma_i$ are outputs from two GCNs that share the first-layer weights. The distribution for all the embedding vectors is then $q(Z \mid X, A) = \prod_{i=1}^{N} q(z_i \mid X, A)$.

Inner Product Decoder

The decoder is often a vanilla model based on the inner product between the latent representations of two nodes. For GAE, the graph adjacency matrix $\hat{A}$ is reconstructed from the inner products of the node embeddings as

$\hat{A} = \sigma(Z Z^{\top})$,   (3)

where $\sigma$ is the sigmoid activation function. For VGAE, the decoder is also based on inner products, but is probabilistic:

$p(\hat{A} \mid Z) = \prod_{i=1}^{N} \prod_{j=1}^{N} p(\hat{A}_{ij} \mid z_i, z_j)$,

where $p(\hat{A}_{ij} \mid z_i, z_j) = \sigma(z_i^{\top} z_j)$.
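The following is a minimal PyTorch sketch of the two-layer encoder in (1) and (2). It assumes a dense adjacency matrix for readability, and the class and helper names (VGAEEncoder, normalize_adj) are our own choices rather than the paper's code.

```python
import torch
import torch.nn as nn

def normalize_adj(A: torch.Tensor) -> torch.Tensor:
    """Symmetrically normalized adjacency with self-loops,
    D^{-1/2} (A + I) D^{-1/2}, as in Kipf & Welling's GCN."""
    A_tilde = A + torch.eye(A.size(0))
    d_inv_sqrt = A_tilde.sum(dim=1).pow(-0.5)
    return d_inv_sqrt.unsqueeze(1) * A_tilde * d_inv_sqrt.unsqueeze(0)

class VGAEEncoder(nn.Module):
    """Two-layer GCN encoder: a shared first layer, then two separate
    output layers producing mu and log(sigma) as in Eq. (2)."""
    def __init__(self, in_dim: int, hid_dim: int, z_dim: int):
        super().__init__()
        self.W0 = nn.Linear(in_dim, hid_dim, bias=False)  # shared first-layer weights
        self.W_mu = nn.Linear(hid_dim, z_dim, bias=False)
        self.W_logsig = nn.Linear(hid_dim, z_dim, bias=False)

    def forward(self, X: torch.Tensor, A_norm: torch.Tensor):
        H1 = torch.relu(A_norm @ self.W0(X))       # H^{(1)} = f(X, A)
        mu = A_norm @ self.W_mu(H1)                # GAE uses this as Z directly
        log_sigma = A_norm @ self.W_logsig(H1)
        # Reparameterization trick: z_i ~ N(mu_i, diag(sigma_i^2)).
        Z = mu + torch.randn_like(mu) * log_sigma.exp()
        return Z, mu, log_sigma
```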
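Continuing the sketch, the vanilla inner-product decoder of (3) is a single line. Note that every entry of $\hat{A}$ is computed from one pair of embeddings, independently of all other candidate edges:

```python
import torch

def inner_product_decoder(Z: torch.Tensor) -> torch.Tensor:
    """Eq. (3): A_hat = sigmoid(Z Z^T).  Entry (i, j) depends only on
    z_i and z_j, so interactions among edges are ignored."""
    return torch.sigmoid(Z @ Z.t())
```

This per-pair independence is precisely what the triad decoder in the next section is designed to overcome.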
Triad Decoder

Based on triadic closure, the presence of a particular edge in a triad depends on whether the other two edges are present. Specifically, consider the triad $\Delta = (i, j, k)$. Let $I_{ij}$ be the indicator function representing whether nodes $i$ and $j$ are connected (i.e., $I_{ij} = 1$ if $i$ and $j$ are connected, and 0 otherwise). As the presences of the edges $(i, j)$, $(i, k)$ and $(j, k)$ are interrelated, we propose to predict the three edge probabilities $\{P(I_{ij} \mid \Delta), P(I_{ik} \mid \Delta), P(I_{jk} \mid \Delta)\}$ (denoted $\{e_{ij}(\Delta), e_{ik}(\Delta), e_{jk}(\Delta)\}$) together, using as inputs the three embeddings $z_i, z_j, z_k$, where the $z_i$'s are the latent representations of the nodes learned by the graph encoder.

Structure

The structure of the proposed triad decoder is shown in Figure 1. First, we perform a $1 \times 1$ convolution, which corresponds to a linear transform, on the three node embeddings
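The excerpt breaks off here, describing only the first step of the triad decoder (the 1 × 1 convolution, i.e. a shared linear transform applied to each of the three stacked embeddings). The sketch below is therefore a hypothetical reading of the interface rather than the paper's exact architecture: the joint MLP head mapping the transformed triad to the three probabilities $\{e_{ij}(\Delta), e_{ik}(\Delta), e_{jk}(\Delta)\}$ is a placeholder of our own.

```python
import torch
import torch.nn as nn

class TriadDecoder(nn.Module):
    """Jointly predicts the three edge probabilities of a triad (i, j, k).
    Only the initial 1x1 convolution follows the text above; the joint
    scoring head is an illustrative placeholder."""
    def __init__(self, z_dim: int, hid_dim: int = 64):
        super().__init__()
        # A 1x1 convolution over the stacked embeddings is the same
        # linear transform applied to each node embedding.
        self.lin = nn.Linear(z_dim, hid_dim)
        self.head = nn.Sequential(
            nn.Linear(3 * hid_dim, hid_dim), nn.ReLU(),
            nn.Linear(hid_dim, 3),   # logits for e_ij, e_ik, e_jk
        )

    def forward(self, z_i, z_j, z_k):
        h = torch.relu(self.lin(torch.stack([z_i, z_j, z_k])))  # (3, hid_dim)
        # The three edges are scored together, so each probability can
        # depend on the evidence for the other two edges of the triad.
        return torch.sigmoid(self.head(h.flatten()))
```

Because all three logits are produced from one joint representation, the predicted probability for edge (j, k) can rise when the evidence for (i, j) and (i, k) is strong, which is exactly the triadic closure behavior that the independent inner-product decoder cannot express.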