Multi-Layered Network Embedding
Total Page:16
File Type:pdf, Size:1020Kb
Multi-Layered Network Embedding Jundong Li∗y Chen Chen∗y Hanghang Tong∗ Huan Liu∗ Abstract tasks. To mitigate this problem, recent studies show Network embedding has gained more attentions in re- that through learning general network embedding rep- cent years. It has been shown that the learned low- resentations, many subsequent learning tasks could be dimensional node vector representations could advance greatly enhanced [17, 34, 39]. The basic idea is to learn a a myriad of graph mining tasks such as node classifi- low-dimensional node vector representation by leverag- cation, community detection, and link prediction. A ing the node proximity manifested in the network topo- vast majority of the existing efforts are overwhelmingly logical structure. devoted to single-layered networks or homogeneous net- The vast majority of existing efforts predomi- works with a single type of nodes and node interactions. nately focus on single-layered or homogeneous net- 1 However, in many real-world applications, a variety of works . However, real-world networks are much more networks could be abstracted and presented in a multi- complicated as cross-domain interactions between dif- layered fashion. Typical multi-layered networks include ferent networks are widely observed, which naturally critical infrastructure systems, collaboration platforms, form a type of multi-layered networks [12, 16, 35]. Crit- social recommender systems, to name a few. Despite the ical infrastructure systems are a typical example of widespread use of multi-layered networks, it remains a multi-layered networks (left part of Figure 1). In this daunting task to learn vector representations of different system, the power stations in the power grid are used types of nodes due to the bewildering combination of to provide electricity to routers in the autonomous sys- both within-layer connections and cross-layer network tem network (AS network) and vehicles in the trans- dependencies. In this paper, we study a novel prob- portation network; while the AS network, in turn, needs lem of multi-layered network embedding. In particular, to provide communication mechanisms to keep power we propose a principled framework - MANE to model grid and transportation network work in order. On the both within-layer connections and cross-layer network other hand, for some coal-fired or gas-fired power sta- dependencies simultaneously in a unified optimization tions, a well-functioning transportation network is re- framework for embedding representation learning. Ex- quired to supply fuel for those power stations. There- periments on real-world multi-layered networks corrob- fore, the three layers in the system form a triangu- orate the effectiveness of the proposed framework. lar dependency network [11]. Another example is the organization-level collaboration platform (right part of 1 Introduction Figure 1), where the team network is supported by the The prevalence of various information systems make social network, connecting its employee pool, which fur- networked data ubiquitous to our daily life, examples ther interacts with the information network, linking to include infrastructure networks, social media networks, its knowledge base. Furthermore, the social network brain networks, to name a few. Recent years have layer could have an embedded multi-layered structure witnessed many attempts to gain insights from these (e.g., each of its layers represents a different collabora- networked data by performing different graph learn- tion type among different individuals); and so does the ing tasks such as node classification [4, 21, 27], com- information network. In this case, different layers form munity detection [15, 45] and link prediction [1, 26]. a tree-structured dependency network. As the very first step, most of these algorithms need The availability and widespread of multi-layered to design hand-crafted features to enable the down- networks in real-world has motivated a surge of research. stream learning problems. One critical issue of these Recent work shows that cross-layer network dependen- hand-crafted features is that they are problem depen- cies are of vital importance in understanding the whole dent, and cannot be easily generalized to other learning systems and they have an added value over within-layer connectivities. In addition, a small portion cross-layer ∗Computer Science and Engineering, Arizona State Univer- sity, Tempe, AZ, USA. fjundongl, chen chen, hanghang.tong, [email protected] 1networks with only a single type of nodes and node-node yIndicates Equal Contribution interactions Copyright c 2018 by SIAM Unauthorized reproduction of this article is prohibited Notations Definitions or Descriptions .)& G the layer-layer dependency matrix /2012102404(01 /)0.&,)4502.. ,)4502. A = fA1; :::; Agg within-layer connectivity matrices 1)23)% !)(-%)& D = fDi;j ; (i 6= j)g cross-layer dependency matrices *$&,)4502. $07(08&,)4502. g number of layers ni number of nodes in the i-th layer ,0.)%-/0) ")/)% d1; d2; :::; dg embedding dimensions #05)2& 2() 41502.04(01&,)4502. Fi embedding for the i-th layer Figure 1: Two typical examples of multi-layered net- Table 1: Symbols. works: critical infrastructure systems and organization- 2 Problem Definition and Preliminaries level collaboration platform. We first summarize the main symbols used in this network dependencies could improve the performance paper. We use bold uppercase for matrices (e.g., A), of many learning tasks such as link prediction remark- bold lowercase for vectors (e.g., a), normal lowercase ably [11]. Despite the fundamental importance of study- characters for scalars (e.g., a), and calligraphic for sets ing multi-layered dependencies, the development of a so- (e.g., A). Also, we follow the matrix settings in Matlab phisticated learning model which could embed nodes on to represent the i-th row of matrix A as A(i; :), the j- multi-layered networks into a continuous vector space th column as A(:; j), the (i; j)-th entry as A(i; j), the is still in its infancy. To bridge the gap, in this paper, transpose of matrix A as A0, trace of matrix A as tr(A) we study a novel problem on how to perform network if it is a square matrix, Frobenius norm of matrix A as embedding for nodes on multi-layered networks. It is a kAkF . I denotes the identity matrix. challenging problem mainly because of the following rea- Next, we introduce the following terminology to sons: (1) on multi-layered networks, cross-layer network ease the understanding of multi-layered networks. Let dependencies are introduced to the system in addition matrix G denotes the g × g layer-layer dependencies to the within-layer connectivities in each layer. Needless in a typical multi-layered network with g layers, where G(i; j) = 1 if the j-th layer depends on the i-th to say, those cross-layer network dependencies play an layer, otherwise G(i; j) = 0. Then we use a set of g important role as they also encode the node proximity matrices A = fA1; :::; Agg to represent the proximity to some extent. Hence, embedding multi-layered net- among nodes within each layer. Last, we use a set of works would be a non-trivial extension from single lay- matrices D = fDi;j; (i; j = 1; :::; g)(i 6= j)g to denote ered network embedding, since the latent node features the cross-layer network dependencies between different need to capture both within-layer connections and cross- layers. In particular, the matrix Di;j describes the layer network dependencies; (2) nodes on multi-layered cross-layer dependencies between the i-th layer and the networks come from heterogeneous data sources. Even j-th layer if G(i; j) = 1; otherwise Di;j is absent. The though they are presented in different modalities, they main symbols are summarized in Table 1. With the above notations, the problem of multi-layered network are not mutually independent and could influence each embedding is defined as follows. other. Network embedding algorithms should be able to seize their interconnections to learn a unified embedding Problem 1. Multi-Layered Network Embedding representation. To tackle the above challenges, we pro- Given: the embedding dimension d1; d2; :::; dg for dif- pose a novel multi-layered network embedding frame- ferent layers; a multi-layered network with: (1) a set work - MANE. The main contributions are as follows: of g within-layer adjacency matrices A = fA1; :::; Agg ni×ni where Ai 2 f0; 1g (i=1,...,g); and (2) observed • We formally define the problem of multi-layered cross-layer dependency matrices D = fDi;j; (i; j = ni×nj network embedding; 1; :::; g)(i 6= j)g where Dij 2 f0; 1g denotes the cross-layer network dependency between Ai and Aj; • We provide a unified optimization framework to ni×di Output: the embedding representation Fi 2 R for model both within-layer connections and cross- all nodes in the i-th layer (i = 1; :::; g). layer network dependencies for embedding repre- sentation learning on multi-layered networks; 3 Proposed Algorithm and Analysis In this section, we present a novel multi-layered network • We provide an effective alternating optimization embedding framework - MANE, which models both algorithm for the proposed MANE framework; within-layer connections and cross-layer network depen- • We evaluate the effectiveness of the proposed dencies for embedding representation learning. We first MANE framework with two real-world multi- formulate the multi-layered network embedding prob- layered networks. lem as an optimization problem and then introduce an effective alternating optimization algorithm for the pro- Copyright c 2018 by SIAM Unauthorized reproduction of this article is prohibited posed framework. At last, we investigate the time com- In the previous subsection, we enforce the embedding plexity of the MANE framework. representations of two nodes to be close to each other in the Euclidean space if they are connected in the same 3.1 Within-Layer Connections Modeling The layer.