Dynamic Hypergraph Neural Networks
Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence (IJCAI-19)

Jianwen Jiang1,2, Yuxuan Wei1,2, Yifan Feng3, Jingxuan Cao1,2 and Yue Gao1,2*
1Beijing National Research Center for Information Science and Technology (BNRist)
2KLISS, School of Software, Tsinghua University, Beijing, China
3School of Information Science and Engineering, Xiamen University, Xiamen, China
{jjw17, [email protected], [email protected], [email protected], [email protected]}
*Corresponding author.

Abstract

In recent years, graph/hypergraph-based deep learning methods have attracted much attention from researchers. These deep learning methods take the graph/hypergraph structure as prior knowledge in the model. However, hidden and important relations are not directly represented in the inherent structure. To tackle this issue, we propose a dynamic hypergraph neural networks framework (DHGNN), which is composed of stacked layers of two modules: dynamic hypergraph construction (DHG) and hypergraph convolution (HGC). Considering that the initially constructed hypergraph is probably not a suitable representation for the data, the DHG module dynamically updates the hypergraph structure on each layer. Then hypergraph convolution is introduced to encode the high-order data relations in a hypergraph structure. The HGC module includes two phases: vertex convolution and hyperedge convolution, which are designed to aggregate features among vertices and hyperedges, respectively. We have evaluated our method on standard datasets, the Cora citation network and the Microblog dataset. Our method outperforms state-of-the-art methods. More experiments are conducted to demonstrate the effectiveness and robustness of our method on diverse data distributions.

Figure 1: Dynamically constructed hyperedges. When the embedding evolves from the l-th layer to the (l+1)-th layer, hyperedge e^l = {v0, v1, v2, v6} disappears while hyperedge e^(l+1) = {v2, v3, v4, v5} comes into existence.
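For illustration, the two hyperedges of Figure 1 can be written down with a minimal hypergraph data structure: a vertex set plus a list of hyperedges, each hyperedge being an arbitrary-size vertex set. The class and method names below (`Hypergraph`, `degree`, `is_simple_graph`) are our own and are not from the paper.

```python
class Hypergraph:
    """Minimal sketch: a hypergraph is a vertex set and a hyperedge set."""

    def __init__(self, vertices, hyperedges):
        self.vertices = set(vertices)
        # Each hyperedge may contain a flexible number of vertices.
        self.hyperedges = [frozenset(e) for e in hyperedges]

    def degree(self, e_idx):
        # Degree of a hyperedge = number of vertices it contains.
        return len(self.hyperedges[e_idx])

    def is_simple_graph(self):
        # If every hyperedge has degree 2, the hypergraph
        # degenerates to a simple graph.
        return all(len(e) == 2 for e in self.hyperedges)


# The two hyperedges shown in Figure 1:
hg = Hypergraph(range(7), [{0, 1, 2, 6}, {2, 3, 4, 5}])
print(hg.degree(0))          # 4
print(hg.is_simple_graph())  # False
```

When every hyperedge is restricted to degree 2, `is_simple_graph` returns `True`, which is exactly the sense in which simple graphs are a subset of hypergraphs.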
1 Introduction

Graphs are widely used to model pair-wise relations, including paper citations, personal contacts and protein-protein interactions. However, besides pair-wise relations, there exist a large number of non-pair-wise relations that simple graphs are unable to model, for example, the communities in social networks and the clusters in feature embeddings. The hypergraph is a generalized structure for relation modeling. A hypergraph is composed of a vertex set and a hyperedge set, where a hyperedge contains a flexible number of vertices. Therefore, hyperedges are able to model the non-pair-wise relations mentioned above. The number of vertices a hyperedge contains is defined as the degree of the hyperedge. In particular, if the degree of every hyperedge is restricted to 2, the hypergraph degenerates to a simple graph, indicating that simple graphs are a subset of hypergraphs.

Recently, graph/hypergraph-based deep learning methods have received more and more attention from researchers. Inspired by the convolutional neural network (CNN) [Krizhevsky et al., 2012] in computer vision, researchers have designed graph-based neural networks for semi-supervised learning, such as GCN [Kipf and Welling, 2017] and GAT [Veličković et al., 2018]. Furthermore, HGNN [Feng et al., 2018] is the first hypergraph neural network model. In a neural network model, the feature embedding generated from a deeper layer of the network carries higher-order relations that the initial structure fails to capture. The major drawback of existing graph/hypergraph-based neural networks is that they only employ the initial graph/hypergraph structures while neglecting the dynamic modification of such structures from the adjusted feature embedding.

Dynamic hypergraph structure learning (DHSL) [Zhang et al., 2018] has been proposed to deal with this problem. DHSL uses the raw input data to optimize the hypergraph structure iteratively.
Nonetheless, DHSL only updates the hypergraph structure on the initial feature embedding, thus failing to exploit high-order relations among features. Also, the iterative optimization in DHSL suffers from an expensive cost in time and space.

Figure 2: DHGNN framework. The first frame describes the hypergraph construction process on a centroid vertex (the star) and its neighbors. For instance, two hyperedges are generated from two clusters (dashed ellipses). In the second frame, the features of the vertices contained in a hyperedge are aggregated into a hyperedge feature through vertex convolution, and the features of adjacent hyperedges are aggregated into the centroid vertex feature through hyperedge convolution. After performing such operations for all vertices on the current layer's feature embedding, we obtain the new feature embedding on which the new hypergraph structure will be constructed, as shown in the third frame.

To tackle these issues, we propose a dynamic hypergraph neural networks (DHGNN) framework, which is composed of stacked layers of a dynamic hypergraph construction (DHG) module and a hypergraph convolution (HGC) module. In the DHG module, we utilize the k-NN method and the k-means clustering method to update the hypergraph structure based on local and global features,
respectively, during a single inference process. Furthermore, we propose a hypergraph convolution method in the HGC module as a stack of vertex convolution and hyperedge convolution. For vertex convolution, we use a transform matrix to permute and weight the vertices in a hyperedge; for hyperedge convolution, we utilize an attention mechanism to aggregate the adjacent hyperedge features to the centroid vertex. Compared with the hypergraph-based deep learning method HGNN, our convolution module better fuses the information from local and global features provided by our DHG module.

We have applied our model to data with and without an inherent graph structure. For data with an inherent graph structure, we conducted an experiment on a citation network benchmark, the Cora dataset [Sen et al., 2008], for the node classification task. In this experiment, we used DHGNN to jointly learn embeddings from the given graph structure and a hypergraph structure from the feature space. For data without an inherent graph structure, an experiment was conducted on a social media dataset, the Microblog dataset [Ji et al., 2019], for the sentiment prediction task. In this experiment, a multi-hypergraph was constructed to model the complex relations among the multimodal data.
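As a rough illustration of the two HGC phases described above, the sketch below applies a given transform matrix to the vertices of a hyperedge (vertex convolution) and then aggregates hyperedge features with dot-product attention (hyperedge convolution). The function names, the averaging step, and the score vector `a` are our own simplifications; in the paper both the transform matrix and the attention weights are learned, so this is not the exact parameterization.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]


def vertex_conv(X, T):
    """Vertex convolution sketch: T (k x k) permutes and weights the k
    vertices of a hyperedge; the transformed rows are then averaged into
    one d-dimensional hyperedge feature."""
    k, d = len(X), len(X[0])
    TX = [[sum(T[i][j] * X[j][c] for j in range(k)) for c in range(d)]
          for i in range(k)]
    return [sum(row[c] for row in TX) / k for c in range(d)]


def hyperedge_conv(edge_feats, a):
    """Hyperedge convolution sketch: attention scores (here a plain dot
    product with a vector a) weight the adjacent hyperedge features,
    which are summed into the new centroid-vertex feature."""
    scores = [sum(ai * hi for ai, hi in zip(a, h)) for h in edge_feats]
    w = softmax(scores)
    d = len(edge_feats[0])
    return [sum(w[e] * edge_feats[e][c] for e in range(len(edge_feats)))
            for c in range(d)]


# Toy example: one hyperedge with k=2 vertices and d=2 features.
X = [[1.0, 0.0], [0.0, 1.0]]
T = [[1.0, 0.0], [0.0, 1.0]]              # identity transform, for illustration
h = vertex_conv(X, T)                     # -> [0.5, 0.5]
v = hyperedge_conv([h, [1.0, 1.0]], a=[0.0, 0.0])  # uniform attention
```

With a zero score vector the attention reduces to a uniform average; a trained score function would instead emphasize the most informative adjacent hyperedges.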
Our contributions are summarized as follows:

1. We propose a dynamic hypergraph construction method, which adopts the k-NN method to generate the basic hyperedge and extends the adjacent hyperedge set by a clustering algorithm, i.e., k-means clustering. With this dynamic hypergraph construction method, both local and global relations are extracted.

2. We conducted experiments on network-based classification and social media sentiment prediction. On the network-based task, our method outperforms state-of-the-art methods and shows higher robustness to different data distributions. On social media sentiment prediction, we observe a performance improvement against state-of-the-art methods.

The rest of the paper is organized as follows. Section 2 introduces related work in graph-based deep learning and hypergraph learning. Section 3 explains the proposed dynamic hypergraph neural networks method. Applications and experimental results are presented in Section 4. Finally, we draw conclusions in Section 5.

2 Related Work

In this section, we give a brief review of graph-based deep learning and hypergraph learning.

2.1 Graph-based Deep Learning

Semi-supervised learning on graphs has long been an active research field in deep learning. DeepWalk [Perozzi et al., 2014] and Planetoid [Yang et al., 2016] view sampled paths in graphs as random sequences and learn vector embeddings from these sequences.

After the great success of convolutional neural networks [Krizhevsky et al., 2012] in image processing, researchers have been devoted to designing convolutional methods for graph-based data. Existing graph neural network methods can be divided into two main categories: spectral methods and spatial methods.

Based on spectral graph theory, spectral graph convolutional methods use the graph Laplacian eigenvectors as the graph Fourier basis. After transforming the features into the spectral domain, a spectral convolution operation is conducted on the spectral features. To overcome the expensive computational cost of Laplacian factorization, ChebyshevNet introduces Chebyshev polynomials to approximate the Laplacian eigenvectors [Defferrard et al., 2016]. GCN further simplifies the process and uses a first-order polynomial on each layer [Kipf and Welling, 2017].

Different from spectral methods, spatial graph convolution methods leverage a spatial sampler and aggregator to generate the neighborhood feature embedding. MoNet defines a generic spatial convolution framework for deep learning on non-

Algorithm 1 Hypergraph Construction
Input: Input embedding X; hyperedge size k; adjacent hyperedge set size S
Output:
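Algorithm 1 is truncated at this point, so the following is only a hedged sketch of the construction it names, matching its stated inputs (embedding X, hyperedge size k, adjacent hyperedge set size S): a k-NN hyperedge around each centroid vertex for local relations, plus cluster-based hyperedges from a plain k-means for global relations. The function names and the simple Lloyd-style k-means are our own illustration, not the paper's algorithm.

```python
import random


def dist2(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def knn_hyperedge(X, v, k):
    """Basic hyperedge: centroid vertex v together with its nearest
    neighbors in the current embedding X (v itself is at distance 0,
    so the hyperedge has size k)."""
    order = sorted(range(len(X)), key=lambda u: dist2(X[v], X[u]))
    return set(order[:k])


def kmeans_hyperedges(X, n_clusters, iters=10, seed=0):
    """Global hyperedges: each k-means cluster forms one hyperedge.
    A minimal Lloyd iteration, for illustration only."""
    rng = random.Random(seed)
    centers = [list(X[i]) for i in rng.sample(range(len(X)), n_clusters)]
    d = len(X[0])
    for _ in range(iters):
        clusters = [[] for _ in range(n_clusters)]
        for i, x in enumerate(X):
            c = min(range(n_clusters), key=lambda j: dist2(x, centers[j]))
            clusters[c].append(i)
        for j, members in enumerate(clusters):
            if members:  # keep the old center if a cluster empties out
                centers[j] = [sum(X[i][c] for i in members) / len(members)
                              for c in range(d)]
    return [set(m) for m in clusters if m]


# Toy embedding: two well-separated groups of vertices.
X = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [5.0, 5.0], [5.1, 5.0]]
print(knn_hyperedge(X, 0, k=3))  # {0, 1, 2}
```

Here the k-NN hyperedge captures the local neighborhood of vertex 0, while the k-means hyperedges partition all vertices into cluster-level groups; stacking both kinds per layer is what lets the DHG module mix local and global relations.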