Measuring Network Centrality Using Hypergraphs
Total Page:16
File Type:pdf, Size:1020Kb
Measuring Network Centrality Using Hypergraphs Sanjukta Roy Balaraman Ravindran Chennai Mathematical Institute Indian Institute of Technology Madras [email protected] [email protected] ABSTRACT real- world systems. For example, if we model a collabora- Networks abstracted as graph lose some information related tion network as a simple graph, we would have to reduce the to the super-dyadic relation among the nodes. We find nat- interaction into a set of two way interactions, and if there ural occurrence of hyperedges in co-authorship, co-citation, are multiple actors collaborating on the same project, the social networks, e-mail networks, weblog networks etc. Treat- representation will be a clique on all those actors. As we ing these networks as hypergraph preserves the super-dyadic will show, we lose information about the set of actors col- relations. But the primal graph or Gaifmann graph associ- laborating on the same project in the clique construction. A ated with hypergraphs converts each hyperedge to a clique natural way to model these interaction is through a hyper- losing again the n-ary relationship among nodes. We aim to graph. In a simple graph, an edge is represented by a pair measure Shapley Value based centrality on these networks of vertices, whereas an hyperedge is a non-empty subset of without losing the super-dyadic information. For this pur- vertices. pose, we use co-operative games on single graph representa- Multi-way interactions, known as super-dyadic relations, tion of a hypergraph such that Shapley value can be com- characterize many real world applications and there has been puted efficiently[1]. We propose several methods to generate much interest lately in modeling such relations, especially simpler graphs from hypergraphs and study the efficacy of using hypergraphs [3, 4, 5, 6]. While there have been several the centrality scores computed on these constructions. approaches proposed for using various graph properties, such as connectivity, centrality, modularity, etc., for modeling in- teractions between actors, the extensions of these notions to Categories and Subject Descriptors hypergraphs is still an active area of research. The typical G.2.2 [Hypergraphs]: Network Centrality approach is to construct a simple graph from the hypergraph and compute the properties on the regular graph. The ap- Keywords proaches differ in the construction of the simple graph. A hypergraph of tag co-occurrences has been used for de- Hypergraph, Shapley value, Centrality tecting communities using betweenness centrality [3]. Puzis et al. [4] used betweenness as a centrality measure on hy- 1. INTRODUCTION pergraphs and gave an efficient approach using Brandes al- The study of networks represents an important area of gorithm. Zhou et al. [7] pointed out the information loss multidisciplinary research involving Physics, Mathematics, occurs when hypergraphs are treated as simple graphs and Chemistry, Biology, Social sciences, and Information sci- utilised hypergraphs to device clustering technique. Kather- ences. These systems are commonly represented using sim- ine Faust [5] used non-dyadic affiliation relationships which ple graphs in which the nodes represent the objects under in- allowed to quantify the centrality of a subset of actors or a vestigation, e.g., people, proteins, molecules, computer sys- subset of events. She calculated an actor's centrality as a tems etc., and the edges represent interactions between the function of the centrality of hyperedges of events to which nodes. But, there are occasions in which interactions in- the actor belongs and calculated the centrality of events as a volve more than two actors. For example, deliberations that function of centrality of the actors. Hypergraphs have been take place in a committee are multi-way interactions [2]. In used to understand team assembly mechanisms to determine such circumstances, using simple graphs to represent com- collaboration [8, 9, 10]. Jain et al. [11] studied Indian Rail plex networks does not provide complete description of the network using hypergraph. Of particular interest to us in this work are centrality mea- sures. A centrality measure identifies nodes that are in the Permission to make digital or hard copies of all or part of this work for `center'of a network. In practice, identifying exactly what personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear we mean by `center'depends on the application of interest. this notice and the full citation on the first page. Copyrights for components In this work, we mainly look at this from the view point of of this work owned by others than ACM must be honored. Abstracting with information propagation. Independent Cascade and Linear credit is permitted. To copy otherwise, or republish, to post on servers or to Threshold are the two information diffusion models proposed redistribute to lists, requires prior specific permission and/or a fee. Request in [12]. To find a seed set of k nodes that maximizes influ- permissions from [email protected]. CODS ’15, March 18 - 21, 2015, Bangalore, India ence propagation is in general a hard problem and [12] gave Copyright 2015 ACM 978-1-4503-3436-5/15/03 $15.00. an (1 − 1=e − ) approximation under the assumption that http://dx.doi.org/10.1145/2732587.2732595 the activation functions are submodular. We use the diffu- 2 i j sion models of [12] to define measures of effectiveness of the different centrality scores that we investigate. 1 There have been some earlier attempts at computing cen- trality scores on hypergraphs. Faust [5] looked at a hy- 1 2 pergraph as a bipartite graph[6] between actors and events 3 when computing closeness centrality. Puzis et al.[4] essen- tially converted each hyperedge to a clique and computed be- tweenness centrality on the modified graph efficiently. Rep- l k resenting each hyperedge as a clique converts the hypergraph 1 to a general graph. We call this conversion as clique con- struction. Clique construction is the method that is widely Figure 1: wt-clique construction for H and H : used for computations on hypergraph. But in this work we 1 2 W (H ) and W (H ) are the same argue that the clique construction loses the non-dyadic in- 1 2 formation represented by the hyperedges. Another, slightly different way of looking at the hypergraph is using the edge multiplicity between two nodes as the weight on the edge sider a cooperative game with transferable utilities, (N; ν), between those nodes in the clique construction. This con- where N = 1; 2; : : : ; n is the set of players and ν : 2N !R is version we call as wt-clique construction. a characteristic function, that assigns a value to each coali- tion (subset of N) , with ν(φ) = 0. The Shapley value (SV) The following is a toy example where we look at a small of player i 2 N, is network so that we can easily measure influence of each node. 1 X This example shows that the traditional computations on SV (i) = : ν(S (π) [ fig) − ν(S (π)) (1) n! i i hypergraphs indeed give away some information. π2Ω Example 1. where Ω is the set of all permutations over N , and Si(π) is the set of players appearing before the ith player in permu- Let H = (V; E ) be the hypergraph with V = fi; j; k; lg 1 1 tation π. and We use the same game (Topk Game) that is described in E = ffi; jg; fi; kg; fj; kg; fi; k; lg; fi; j; kg; fj; lgg and H = 1 2 [13]. Then we extend the game for multi-hop on weighted (V; E ) be another hypergraph with 2 graphs as done in Topk Game with weights in [1]. We mea- E = ffi; j; kg; fi; j; k; lg; fi; kgg. 2 sure influence of a node using Shapley Value (SV) of the These two hypergraphs can be thought of as two collab- node as in [13, 1]. This computation is fast on graphs. As oration networks where i; j; k; l are authors and each edge shown in [1], it is linear in number of edges and vertices. We represents a document the authors have written. For exam- give an equally efficient algorithm for computing ranking on ple, in the first network (given by H ), the first edge i; j 1 hypergraphs. We compute these ranking using clique con- represents a document written by author i and j together. struction; wt-clique construction and three of our proposed Notice here, in both the networks each author has written a methods and compare the results using diffusion models. document with every one else but the number of times two The main contributions of this work are: (1) Different authors have written a document together is different. For methods of reducing hypergraphs to simpler forms for the example, in the second network (given by H ) i has written 2 purposes of centrality computations. These reductions could one document with l and three documents with k. Let us possibly yield more efficient mechanisms for computing other assume we want to measure the centrality of nodes in these network properties; (2) Computation of game-theoretic cen- two networks to calculate fluency of information propaga- trality measures on super-dyadic relations. These have been tion between two authors. Our philosophy is that it must shown to be efficient to compute on simple graphs. And, this be easy to propagate information from k to i than from l to extension enables us to compute centrality scores on large i. super-dyadic networks; and (3) Empirical study of how cen- If we convert the hyperedges of H and H to clique (clique 1 2 trality scores are distributed over the nodes of the network construction) for both H and H , we get the same graph, 1 2 under different graph constructions.