Learning over Families of Sets - Hypergraph Representation Learning for Higher Order Tasks

Balasubramaniam Srinivasan (Purdue University), Da Zheng (Amazon Web Services), George Karypis (Amazon Web Services)

Copyright © 2021 by SIAM. Unauthorized reproduction of this article is prohibited.

Abstract

Graph representation learning has made major strides over the past decade. However, in many relational domains, the input data are not suited for simple graph representations, as the relationships between entities go beyond pairwise interactions. In such cases, the relationships in the data are better represented as hyperedges (sets of entities) of a non-uniform hypergraph. While there have been works on principled methods for learning representations of nodes of a hypergraph, these approaches are limited in their applicability to tasks on non-uniform hypergraphs (hyperedges with different cardinalities). In this work, we exploit the incidence structure to develop a hypergraph neural network to learn provably expressive representations of variable sized hyperedges which preserve local-isomorphism in the line graph of the hypergraph, while also being invariant to permutations of its constituent vertices. Specifically, for a given vertex set, we propose frameworks for (1) hyperedge classification and (2) variable sized expansion of partially observed hyperedges, which captures the higher order interactions among vertices and hyperedges. We evaluate performance on multiple real-world hypergraph datasets and demonstrate consistent, significant improvement in accuracy over state-of-the-art models.

1 Introduction

Deep learning on graphs has been a rapidly evolving field due to its widespread applications in domains such as e-commerce, personalization, fraud & abuse, life sciences, and social network analysis. However, graphs can only capture interactions involving pairs of entities, whereas in many of the aforementioned domains any number of entities can participate in a single interaction. For example, more than two substances can interact at a specific instance to form a new compound, study groups can contain more than two students, recipes contain multiple ingredients, shoppers purchase multiple items together, etc. Graphs, therefore, can be an oversimplified depiction of the input data (which may result in loss of significant information). Hypergraphs [7] (see Figure 1(a) for an example), which serve as the natural extension of dyadic graphs, form the obvious solution.

Due to the ubiquitous nature of hypergraphs, learning on hypergraphs has been studied for more than a decade [1, 28]. Early works on learning on hypergraphs employed random walk procedures [15, 5, 8], and the vast majority of them were limited to hypergraphs whose hyperedges have the same cardinality (k-uniform hypergraphs). More recently, with the growing popularity and success of message passing graph neural networks [14, 12], message passing hypergraph neural network learning frameworks have been proposed [10, 4, 24, 27, 25]. These works rely on constructing the clique expansion (Figure 1(c)), star expansion (Figure 1(d)), or other expansions of the hypergraph that preserve partial information. Subsequently, node representations are learned using GNNs on the graph constructed as a proxy of the hypergraph. These strategies are insufficient as either (1) there does not exist a bijective transformation between a hypergraph and the constructed clique expansion (loss of information); (2) they do not accurately model the underlying dependency between a hyperedge and its constituent vertices (for example, a hyperedge may cease to exist if one of the nodes were deleted); or (3) they do not directly model the interactions between different hyperedges. The primary goal of this work is to address these issues and to build models which better represent hypergraphs.

Corresponding to the adjacency matrix representation of the edge set of a graph, a hypergraph is commonly represented as an incidence matrix (Figure 1(b)), in which a row is a vertex, a column is a hyperedge, and an entry in the matrix is 1 if the vertex belongs to the hyperedge. In this work, we directly seek to exploit the incidence structure of the hypergraph to learn representations of nodes and hyperedges. Specifically, for a given partially observed hypergraph, we synchronously learn vertex and hyperedge representations that simultaneously take into consideration both the line graph (Figure 1(e)) and the set of hyperedges that a vertex belongs to, in order to learn provably expressive representations. The jointly learned vertex and hyperedge representations are then used to tackle higher-order tasks such as expansion of partially observed hyperedges and classification of unobserved hyperedges.

While the task of hyperedge classification has been studied before, set expansion for relational data has largely been unexplored. For example, given a partial set of substances which are constituents of a single drug, hyperedge expansion entails completing the set of all constituents of the drug while having access to the composition of multiple other drugs. A more detailed example for each of these tasks is presented in the Appendix (arXiv version), Section 7.1 [20]. For the hyperedge expansion task, we propose a GAN framework [11] to learn a probability distribution over the vertex power set (conditioned on a partially observed hyperedge), which maximizes the point-wise mutual information between a partially observed hyperedge and other disjoint vertex subsets in the vertex power set.

Our contributions can be summarized as follows: (1) We propose a hypergraph neural network which exploits the incidence structure and hence works on real-world sparse hypergraphs which have hyperedges of different cardinalities. (2) We provide provably expressive representations of vertices and hyperedges, as well as of the complete hypergraph, which preserve properties of hypergraph isomorphism. (3) We introduce a new task on hypergraphs, namely variable sized hyperedge expansion, and also perform variable sized hyperedge classification. Furthermore, we demonstrate improved performance over existing baselines on the majority of the hypergraph datasets using our proposed model.

[Figure 1 panels: (a) Hypergraph, (b) Incidence Matrix, (c) Clique Expansion, (d) Star Expansion, (e) Line Graph. The incidence matrix shown in panel (b), with rows $v_1, \ldots, v_5$ and columns $e_1, e_2, e_3$, is $\begin{pmatrix} 1 & 1 & 0 \\ 1 & 1 & 0 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \\ 0 & 0 & 1 \end{pmatrix}$.]

Figure 1: A hypergraph (a) with 5 nodes $v_1, v_2, \ldots, v_5$ and 3 hyperedges $e_1 = \{v_1, v_2\}$, $e_2 = \{v_1, v_2, v_3\}$, $e_3 = \{v_3, v_4, v_5\}$, its incidence matrix (b), its clique expansion (c), its star expansion (d) and its line graph (e).

2 Preliminaries

In our notation henceforth, we shall use capital case characters (e.g., $A$) to denote a set or a hypergraph, bold capital case characters (e.g., $\mathbf{A}$) to denote a matrix, and capital characters with a right arrow over them (e.g., $\overrightarrow{A}$) to denote a sequence with a predefined ordering of its elements. We shall use lower case characters (e.g., $a$) to denote an element of a set and bold lower case characters (e.g., $\mathbf{a}$) to denote vectors. Moreover, we shall denote the $i$-th row of a matrix $\mathbf{A}$ with $\mathbf{A}_{i\cdot}$, the $j$-th column of the matrix with $\mathbf{A}_{\cdot j}$, and use $A_m$ to denote a subset of the set $A$ of size $m$, i.e., $A_m \subseteq A$, $|A_m| = m$.

(Hypergraph) Let $H = (V, E, \mathbf{X}, \mathbf{E})$ denote a hypergraph $H$ with a finite vertex set $V = \{v_1, \ldots, v_n\}$, corresponding vertex features $\mathbf{X} \in \mathbb{R}^{n \times d}$, $d > 0$, a finite hyperedge set $E = \{e_1, \ldots, e_m\}$, where $E \subseteq P^{*}(V) \setminus \{\emptyset\}$ and $\bigcup_{i=1}^{m} e_i = V$, where $P^{*}(V)$ denotes the power set on the vertices, and the corresponding hyperedge features $\mathbf{E} \in \mathbb{R}^{m \times d}$, $d > 0$. We use $E(v)$ (termed the star of a vertex) to denote the hyperedges incident on a vertex $v$ and use $S_H$, a set of tuples, to denote the family of stars of $H$, where $S_H = \{(v, E(v)) : \forall v \in V\}$. When explicit vertex and hyperedge features and weights are unavailable, we will consider $\mathbf{X} = \mathbf{1}_n \mathbf{1}_n^T$, $\mathbf{E} = \mathbf{1}_m \mathbf{1}_m^T$, where $\mathbf{1}$ represents an $n \times 1$ or $m \times 1$ vector of ones respectively. The vertex and edge sets $V, E$ of a hypergraph can equivalently be represented with an incidence matrix $\mathbf{I} \in \{0, 1\}^{|V| \times |E|}$, where $\mathbf{I}_{ij} = 1$ if $v_i \in e_j$ and $\mathbf{I}_{ij} = 0$ otherwise. Isomorphic hypergraphs either have the same incidence matrix or a row/column/row and column permutation of the incidence matrix, i.e., the matrix $\mathbf{I}$ is separately exchangeable. We use $L_H$ to denote the line graph (Figure 1(e)) of the hypergraph and $H^{\star}$ to denote the dual of a hypergraph. Additionally, we define a function $LG_H$, a multi-valued function termed the line hypergraph of a hypergraph, which generalizes the concepts of the line graph and the dual of a hypergraph and defines the spectrum of values which lies between them. For the scope of this work, we limit $LG_H$ to be a dual-valued function using only the two extremes, such that $LG_H(0) = L_H$ and $LG_H(1) = H^{\star}$.

Also, we use $\Sigma_{n,m}$ to denote the set of all possible attributed hypergraphs $H$ with $n$ nodes and $m$ hyperedges.

... operations that can be performed on the $m$ symbols. Since the total number of such permutation operations is $m!$, the order of $S_m$ is $m!$.

(Group Action (left action)) If $A$ is a set and $G$ is a group, then $A$ is a $G$-set if there is a function $\phi : G \times A \to A$, denoted by $\phi(g, a) \mapsto ga$, such that:
(i) $1a = a$ for all $a \in A$, where $1$ is the identity element of the group $G$
(ii) $g(ha) = (gh)(a)$ for all $g, h \in G$ and $a \in A$
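
The objects defined in the Preliminaries can be made concrete on the Figure 1 example. The following is a minimal Python sketch, not the authors' implementation: the helper names (`incidence_matrix`, `stars`, `clique_expansion`, `line_graph`, `edges_from_incidence`) and the 0-based vertex and hyperedge indices are assumptions introduced here for illustration. It builds the incidence matrix, the stars, the dual (as the transposed incidence matrix) and the line graph, shows that the clique expansion can coincide for two different hypergraphs, and checks that a row and column permutation of the incidence matrix describes the same hypergraph up to relabelling.

```python
from itertools import combinations

# Hyperedges of the Figure 1 example; vertices v1..v5 are stored as indices 0..4.
hyperedges = [{0, 1}, {0, 1, 2}, {2, 3, 4}]
n_vertices = 5

def incidence_matrix(edges, n):
    """n x m 0/1 matrix whose entry (i, j) is 1 iff vertex i belongs to edge j."""
    return [[1 if i in edge else 0 for edge in edges] for i in range(n)]

def stars(inc):
    """Map each vertex to the set of hyperedge indices incident on it."""
    return {i: {j for j, bit in enumerate(row) if bit} for i, row in enumerate(inc)}

def transpose(inc):
    """Incidence matrix of the dual: vertices and hyperedges swap roles."""
    return [list(col) for col in zip(*inc)]

def clique_expansion(edges):
    """Set of vertex pairs that co-occur in at least one hyperedge."""
    return {frozenset(p) for e in edges for p in combinations(sorted(e), 2)}

def line_graph(edges):
    """Set of hyperedge index pairs whose hyperedges share at least one vertex."""
    return {(a, b) for a, b in combinations(range(len(edges)), 2)
            if edges[a] & edges[b]}

def edges_from_incidence(inc):
    """Recover the list of hyperedges (as vertex sets) from an incidence matrix."""
    return [{i for i, row in enumerate(inc) if row[j]} for j in range(len(inc[0]))]

inc = incidence_matrix(hyperedges, n_vertices)
dual_inc = transpose(inc)

# Removing e1 = {v1, v2} changes the hypergraph but not its clique expansion,
# because e2 already links v1 and v2: the expansion is not invertible.
same_clique = clique_expansion(hyperedges) == clique_expansion(hyperedges[1:])

# Permute rows (vertices) and columns (hyperedges); relabelling the vertices
# back through row_perm recovers the original family of hyperedges.
row_perm, col_perm = [4, 3, 2, 1, 0], [2, 0, 1]
inc_permuted = [[inc[i][j] for j in col_perm] for i in row_perm]
relabelled = {frozenset(row_perm[i] for i in e)
              for e in edges_from_incidence(inc_permuted)}
same_hypergraph = relabelled == {frozenset(e) for e in hyperedges}
```

Here `same_clique` illustrates the loss of information attributed to the clique expansion in the Introduction, and `same_hypergraph` illustrates why a representation built on the incidence matrix should not depend on the chosen row or column ordering.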
