Scalable Label Propagation for Multi-Relational Learning on the Tensor Product of Graphs
Zhuliu Li*, Raphael Petegrosso*, Shaden Smith, David Sterling, George Karypis, Rui Kuang†

Z. Li, R. Petegrosso, S. Smith, G. Karypis and R. Kuang are with the Department of Computer Science and Engineering, University of Minnesota Twin Cities, MN 55455, USA. E-mail: {lixx3617, peteg001, shaden, karypis, kuang}@umn.edu. D. Sterling is with the Department of Radiation Oncology, University of Minnesota Twin Cities, MN 55455, USA. E-mail: [email protected]. (*Co-first author, †Corresponding author)

Abstract—Multi-relational learning on knowledge graphs infers high-order relations among the entities across the graphs. This learning task can be solved by label propagation on the tensor product of the knowledge graphs, which learns the high-order relations as a tensor. In this paper, we generalize a widely used label propagation model to the normalized tensor product graph, and propose an optimization formulation and a scalable Low-rank Tensor-based Label Propagation algorithm (LowrankTLP) to infer multi-relations for two learning tasks: hyperlink prediction and multiple graph alignment. The optimization formulation minimizes the upper bound of the noisy tensor estimation error for multiple graph alignment by learning with a subset of the eigen-pairs in the spectrum of the normalized tensor product graph. We also provide a data-dependent transductive Rademacher bound for binary hyperlink prediction. We accelerate LowrankTLP with parallel tensor computation, which enables label propagation on a tensor product of 100 graphs, each of size 1000, in less than half an hour in simulation. LowrankTLP was also applied to predicting author-paper-venue hyperlinks in publication records, aligning segmented regions across up to 26 CT-scan images, and aligning protein-protein interaction networks across multiple species. The experiments demonstrate that LowrankTLP indeed well approximates the original label propagation, with better scalability and accuracy. Source code: https://github.com/kuanglab/LowrankTLP

Index Terms—multi-relational learning, tensor product graph, label propagation, hyperlink prediction, multiple graph alignment, tensor decomposition and completion

1 INTRODUCTION

Label propagation has been widely used for semi-supervised learning on the similarity graph of labeled and unlabeled samples [1], [2], [3]. As illustrated in Figure 1 (A), label propagation propagates training labels on a graph to learn a vector $y$ predicting the labels of the vertices. A generalization of label propagation on the Kronecker product of two graphs (also called bi-random walk) can infer the pairwise relations in a matrix $Y$ between the vertices of the two graphs, as shown in Figure 1 (B). This approach has been applied to aligning biological and biomedical networks [4], [5]. Similar learning problems on Kronecker product graphs also exist in link prediction [6], [7], image matching [8], image segmentation [9], collaborative filtering [10], citation network analysis [11] and multi-language translation [12].
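As a minimal sketch of the two models that Figure 1 (A) and (B) refer to, the code below implements the classic label propagation update of [3] and its two-graph bi-random-walk generalization. The propagation parameter alpha, the iteration count, and the use of symmetric normalization are our assumptions for illustration, not the paper's exact settings.

```python
import numpy as np

def normalize(W):
    """Symmetric normalization S = D^{-1/2} W D^{-1/2} of an adjacency matrix."""
    d = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(d, 1e-12))  # guard isolated vertices
    return W * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def label_propagation(W, y0, alpha=0.9, iters=50):
    """Classic label propagation [3]: y <- alpha*S*y + (1-alpha)*y0,
    which converges to the closed form (1-alpha)*(I - alpha*S)^{-1} y0."""
    S = normalize(W)
    y = y0.copy()
    for _ in range(iters):
        y = alpha * (S @ y) + (1 - alpha) * y0
    return y

def bi_random_walk(W1, W2, Y0, alpha=0.9, iters=50):
    """Two-graph generalization (Figure 1 (B)): propagate a pairwise
    relation matrix Y using only the two small normalized graphs."""
    S1, S2 = normalize(W1), normalize(W2)
    Y = Y0.copy()
    for _ in range(iters):
        Y = alpha * (S1 @ Y @ S2.T) + (1 - alpha) * Y0
    return Y
```

Note that the two-graph update never materializes the Kronecker product: by the identity $\mathrm{vec}(S_1 Y S_2^T) = (S_2 \otimes S_1)\,\mathrm{vec}(Y)$, propagating $Y$ with the two small graphs is equivalent to propagating $\mathrm{vec}(Y)$ on the product graph.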
When applied on the tensor product of $n$ graphs to predict the matching across the $n$ graphs in an $n$-way tensor $\mathcal{Y}$, as shown in Figure 1 (C), label propagation accomplishes $n$-way relational learning from knowledge graphs [13]. Label propagation on the tensor product graph explores the graph topologies to associate vertices across the graphs, assuming that the global relations among the vertices reveal the vertex identities [5]. However, the tensor formulation of label propagation is computationally intensive to solve. Empirically, most of the existing methods are only scalable enough to learn 3-way relations in large graphs, even if the graphs are sparse [6], [11]. In particular, each multiplication with the tensor exponentially increases the number of nonzero entries, and after one or two iterations a dense tensor is expected. The main objective of this study is to provide a principled and theoretical approximation approach, together with scalable algorithms and implementations, to tackle this scalability issue. Our contributions in this paper are summarized as follows:

• We propose a novel optimization formulation to approximate the transformation matrix in the closed-form solution of label propagation on the tensor product graph, by learning with a subset of eigen-pairs from the normalized tensor product graph. We provide a theoretical justification that the globally optimal solution of the optimization problem minimizes an estimation error bound of recovering the true tensor structured by the tensor product graph manifolds, for multiple graph alignment. We also provide a data-dependent error bound using the transductive Rademacher complexity for binary hyperlink prediction.
• We develop an efficient eigenvalue selection algorithm to sequentially select the eigen-pairs from each individual normalized graph, considering the global spectrum of the transformation matrix. We then show that the spectrum of the low-rank normalized tensor product graph constructed from the selected eigen-pairs is guaranteed to be the globally optimal solution to the proposed optimization formulation; a sketch of the underlying spectral fact follows this list.
• We propose a Low-rank Tensor-based Label Propagation algorithm (LowrankTLP) that uses efficient tensor operations to compute the approximated solution based on the eigen-pairs selected from the knowledge graphs. We provide an efficient parallel implementation of LowrankTLP using the SPLATT library [14] with shared-memory parallelism to increase the scalability by a large margin.
• Our work comprehensively generalizes the label propagation model proposed in [3] to the tensor product of multiple graphs in several aspects, including the normalized tensor product graph, the regularization framework, the iterative tensor-based label propagation algorithm and the computation of the closed-form solution.
• We validate the effectiveness, efficiency and scalability of LowrankTLP on simulated data by controlling the graph size and topology.
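The spectral fact behind the eigenvalue selection above is that the eigenvalues of a Kronecker (tensor) product of graphs are the products of the per-graph eigenvalues, with eigenvectors given by the Kronecker products of the per-graph eigenvectors. The sketch below illustrates this; the brute-force enumeration and the scoring by the closed-form weight $1/(1-\alpha\lambda)$ are stand-ins for the paper's sequential selection algorithm (which avoids enumerating all $\prod_i I_i$ products), and the function name and alpha are hypothetical.

```python
import numpy as np
from itertools import product

def topk_product_eigenpairs(graphs, k, alpha=0.9):
    """Assemble a rank-k spectrum of the normalized tensor product graph
    from per-graph eigen-pairs, without ever forming the product graph.
    Brute-force enumeration here is only a stand-in for the paper's
    sequential selection algorithm."""
    eigs = []
    for W in graphs:
        d = W.sum(axis=1)
        d_inv_sqrt = 1.0 / np.sqrt(np.maximum(d, 1e-12))
        S = W * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
        lam, U = np.linalg.eigh(S)        # eigen-pairs of each normalized graph
        eigs.append((lam, U))
    # Every product of one eigenvalue per graph is an eigenvalue of the TPG;
    # score each by its weight in the closed form (I - alpha*S)^{-1}.
    candidates = []
    for combo in product(*[range(len(lam)) for lam, _ in eigs]):
        lam_prod = np.prod([eigs[i][0][j] for i, j in enumerate(combo)])
        score = abs(1.0 / (1.0 - alpha * lam_prod))   # hypothetical criterion
        candidates.append((score, lam_prod, combo))
    candidates.sort(reverse=True)
    # Each combo indexes one eigenvector per graph; the TPG eigenvector is
    # their Kronecker product, which is never materialized.
    return candidates[:k]
```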
We also demonstrate the practical use of LowrankTLP on three real datasets for hyperlink prediction and multiple graph alignment across a large number of knowledge graphs.

Fig. 1. Label propagation generalized on tensor product graphs. (A) Label propagation on a graph predicts the labels of the vertices for semi-supervised learning; (B) Label propagation on the Kronecker product of two graphs predicts links between the vertices across the two graphs; (C) Label propagation on an n-way tensor product graph learns n-way multi-relations across the vertices in n graphs. Each n-way tuple of graph vertices in the same color is represented as an entry in the n-way tensor.

2 PROBLEM FORMULATION

We first define the notations and the tensor product graph. We are given $n$ distinct symmetric matrices, where each $I_i \times I_i$ matrix $W^{(i)}$ denotes the adjacency matrix of the $i$-th undirected graph with $I_i$ vertices. The tensor product graph (TPG) adjacency matrix is defined as $W = \otimes_{i=1}^{n} W^{(i)}$, with dimension $(\prod_{i} I_i) \times (\prod_{i} I_i)$. The edge $W_{(a_1,a_2,\ldots,a_n),(b_1,b_2,\ldots,b_n)} = \prod_{l=1}^{n} W^{(l)}_{a_l,b_l}$ of $W$ encodes the similarity between a pair of $n$-way tuples of graph vertices $(a_1, a_2, \ldots, a_n)$ and $(b_1, b_2, \ldots, b_n)$, $\forall a_l, b_l \in [1, I_l]$, as illustrated in Figure 1 (C); a numeric check of this edge identity appears at the end of this section. The network properties of the TPG, including the spectral distribution, are discussed in [16].

Now, we introduce the two multi-relational learning problems studied in this paper. The objective is to score the queried $n$-way relations among the vertices across multiple undirected knowledge graphs $\{W^{(i)} : i = 1, \ldots, n\}$, for either hyperlink prediction or multiple graph alignment, given as input either 1) the labels of a small number of observed $n$-way relations, or 2) the similarity scores between the vertices of every pair of graphs, respectively, as shown in Figure 2 (A).

Task 1: hyperlink prediction
• Input relations: a small set $\Theta_h = \{(i_1, i_2, \ldots, i_n) : i_j \in [1, I_j],\ \forall j = 1, \ldots, n\}$ of labeled $n$-way relations (hyperlinks) among the vertices across the $n$ graphs, where $i_j$ indexes a vertex of graph $W^{(j)}$.
• Queried multi-relations: a set $\Omega_h = \{(j_1, j_2, \ldots, j_n) : j_l \in [1, I_l],\ \forall l = 1, \ldots, n\}$ of queried $n$-way relations (hyperlinks), chosen according to the user's interests.
• Learning task: given the $n$ knowledge graphs and the set $\Theta_h$ of labeled hyperlinks, predict the link strengths of the queried set $\Omega_h$ of hyperlinks in a sparse tensor $\mathcal{O}_h \in \mathbb{R}^{I_n \times I_{n-1} \times \cdots \times I_1}$ with $|\Omega_h|$ nonzero entries.

Task 2: multiple graph alignment
• Input relations: a set $\Theta_g = \{R_{ij} \in \mathbb{R}_{+}^{I_i \times I_j} : \forall i, j \in [1, n] \text{ and } i < j\}$ of non-negative matrices, where $R_{ij}$ holds the similarity scores between the vertices of the pair of graphs $W^{(i)}$ and $W^{(j)}$.
• Queried multi-relations: a set $\Omega_g = \{(j_1, j_2, \ldots, j_n) : j_l \in [1, I_l],\ \forall l = 1, \ldots, n\}$ of queried $n$-way relations, which can be derived from the pairwise relations by heuristics as in [17] and [18].
• Learning task: given the $n$ knowledge graphs and the set $\Theta_g$ of pairwise relations, predict the alignment scores of the queried set $\Omega_g$ of $n$-way tuples of graph vertices in a sparse tensor $\mathcal{O}_g \in \mathbb{R}^{I_n \times I_{n-1} \times \cdots \times I_1}$ with $|\Omega_g|$ nonzero entries.

As outlined in Figure 2, given either the labels of the
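Returning to the tensor product graph defined at the start of this section, the following is a minimal numpy check of the edge identity $W_{(a_1,\ldots,a_n),(b_1,\ldots,b_n)} = \prod_{l} W^{(l)}_{a_l,b_l}$ on three small random graphs. The Kronecker factor ordering $W^{(n)} \otimes \cdots \otimes W^{(1)}$, chosen here to match the tensor shape $I_n \times I_{n-1} \times \cdots \times I_1$, is our assumed convention.

```python
import numpy as np

rng = np.random.default_rng(0)
I = [3, 4, 2]                      # graph sizes I_1, I_2, I_3
Ws = []
for m in I:
    A = rng.random((m, m))
    Ws.append((A + A.T) / 2)       # symmetric adjacency matrices W^(1..3)

# Assemble W = W^(3) kron W^(2) kron W^(1) (ordering is our assumption).
W = np.kron(np.kron(Ws[2], Ws[1]), Ws[0])

a = (1, 2, 0)                      # tuple (a_1, a_2, a_3), zero-based indices
b = (2, 0, 1)                      # tuple (b_1, b_2, b_3)
# Flatten each vertex tuple into a row/column index of the TPG matrix.
row = (a[2] * I[1] + a[1]) * I[0] + a[0]
col = (b[2] * I[1] + b[1]) * I[0] + b[0]

# The TPG edge weight equals the product of the per-graph edge weights.
assert np.isclose(W[row, col], np.prod([Ws[l][a[l], b[l]] for l in range(3)]))
```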