Closeness Centrality for Networks with Overlapping Community Structure

Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence (AAAI-16) Closeness Centrality for Networks with Overlapping Community Structure Mateusz K. Tarkowski1 and Piotr Szczepanski´ 2 Talal Rahwan3 and Tomasz P. Michalak1,4 and Michael Wooldridge1 2Department of Computer Science, University of Oxford, United Kingdom 2 Warsaw University of Technology, Poland 3Masdar Institute of Science and Technology, United Arab Emirates 4Institute of Informatics, University of Warsaw, Poland Abstract One aspect of networks that has been largely ignored in the literature on centrality is the fact that certain real-life Certain real-life networks have a community structure in which communities overlap. For example, a typical bus net- networks have a predefined community structure. In public work includes bus stops (nodes), which belong to one or more transportation networks, for example, bus stops are typically bus lines (communities) that often overlap. Clearly, it is im- grouped by the bus lines (or routes) that visit them. In coau- portant to take this information into account when measur- thorship networks, the various venues where authors pub- ing the centrality of a bus stop—how important it is to the lish can be interpreted as communities (Szczepanski,´ Micha- functioning of the network. For example, if a certain stop be- lak, and Wooldridge 2014). In social networks, individuals comes inaccessible, the impact will depend in part on the bus grouped by similar interests can be thought of as members lines that visit it. However, existing centrality measures do not of a community. Clearly for such networks, it is desirable take such information into account. Our aim is to bridge this to have a centrality measure that accounts for the prede- gap. We begin by developing a new game-theoretic solution fined community structure. Yet, to the best of our knowl- concept, which we call the Configuration semivalue, in order to have greater flexibility in modelling the community struc- edge, only one such measure has been developed to date ture compared to previous solution concepts from coopera- (Szczepanski,´ Michalak, and Wooldridge 2014), which ex- tive game theory. We then use the new concept as a building tends Degree centrality to networks with community struc- block to construct the first extension of Closeness centrality ture. Despite this recent development, one important aspect to networks with community structure (overlapping or other- of real-life networks remains missing from existing central- wise). Despite the computational complexity inherited from ity measures: the ability to consider overlapping communi- the Configuration semivalue, we show that the corresponding ties. Take social networks, for example, where such overlaps extension of Closeness centrality can be computed in poly- are widespread due to the various affiliations and interests nomial time. We empirically evaluate this measure and our of the individuals involved (Kelley et al. 2012). Likewise, in algorithm that computes it by analysing the Warsaw public our example of transportation networks, a bus stop may be transportation network. on the route of multiple (i.e., overlapping) bus lines. If such a stop becomes inaccessible, then all the bus lines that visit Introduction it would no longer function properly. As such, the impor- One of the key problems in network science involves identi- tance of a bus stop clearly depends (at least partially) on the fying the most important (or central) nodes (Freeman 1979; importance of the bus lines to which it belongs. Dezso˝ and Barabasi´ 2002; Keinan et al. 2004; Page et al. In an attempt to define a centrality measure that accounts 1999). The four best-known centrality measures are De- for overlapping communities, we focused on game-theoretic gree, Betweenness, Closeness and Eigenvector centralities centrality measures.1 The inspiration behind this line of re- (Bonacich 1972; Freeman 1979), each of which views cen- search comes from solution concepts in cooperative game trality from a different perspective, focusing on certain traits theory. In essence, given a set of players, a cooperative so- that make nodes important, or, “central,” to the function- lution concept typically defines a payoff for each player by ing of a network (Brandes and Erlebach 2005; Koschutzki comparing his or her contribution to the various groups of et al. 2005). Our focus in this paper is on Closeness cen- players (more on this in the next section). The rich repos- trality. This measure considers the important nodes to be itory of solution concepts has been extensively refined and those that are relatively close to all other nodes in the expanded over the past decades, making it an ideal toolkit network: the closer a node is to the others, the higher for quantifying the importance of individuals in a setting its centrality. Closeness centrality has many applications, where those individuals co-exist and operate in groups. In from coauthorship networks (Yan and Ding 2009), through the context of game-theoretic network centrality, the indi- tourism (Shih 2006), to social networks (Barabasi 2003; Karinthy 2006). 1See www.game-theoretic-centrality.com and Copyright c 2016, Association for the Advancement of Artificial www.network-centrality.com for an overview of this line Intelligence (www.aaai.org). All rights reserved. of research. 622 Solution Concept Degree Closeness Betweenness Game-Theoretic Concepts Shapley value 22 3 A cooperative game consists of a set of players N = Semivalues 44 3 {1, 2,...,n} and a characteristic function ν :2N → R 5 Owen value This Paper Open such that ν(∅)=0. This function assigns to each coali- 5 Coalitional semivalues This Paper Open tion of players its payoff (i.e., an indication of its perfor- Configuration value Open This Paper Open Configuration semivalues Open This Paper Open mance). We will henceforth refer to a game simply by ν.A coalition structure, CS = {Q1,Q2,...,Qm}, is a partition Table 1: The table outlines the papers that used various so- of N into disjoint coalitions. One of the key questions in lution concepts to extend Degree, Closeness, or Betweenness cooperative game theory is the following: Given a game ν centralities, and computed them in polynomial time. and a coalition structure CS that the players have formed, how do we divide the payoff of each coalition among its viduals correspond to the nodes of the network, while the members? In this context, assuming that CS = {N}, i.e., groups correspond to the subgraphs of the network. With assuming that the players have formed the grand coalition, this mapping, any solution concept from cooperative game Shapley (1953) proposed a solution concept—now known theory can be readily applied as a centrality measure, ex- as the Shapley value—to fairly divide the payoff from co- cept for one remaining obstacle: computing the solution con- operation among the players. Banzhaf (1965) proposed an- cept. In fact, most solution concepts are inherently hard to other solution concept—now known as the Banzhaf index— compute (Chalkiadakis, Elkind, and Wooldridge 2011). For- which is similar to the Shapley value except for a subtle tunately, however, in the context of network centrality, we difference in the way contributions are weighed. To gener- typically focus on a certain closed-form formula, specify- alize the aforementioned solution concepts, Weber (1979) ing how each group is evaluated. In certain cases, fixing proposed semivalues—a family of solution concepts that in- the group-evaluation function makes it possible to obtain cludes both the Shapley value and Banzhaf index. Formally, a polynomial-time algorithm that computes the resulting, let MC (C, i)=ν(C ∪{i}) − ν(C) be the marginal con- game-theoretic centrality measure. tribution of player i to coalition C ⊆ N \{i}. Denoting by With this in mind, we propose the first extension of Close- β(k) the probability that any player makes a marginal con- ness centrality to networks with either overlapping or non- tribution to a coalition of size k, the semivalue of i is: overlapping community structure. To this end, we propose |N|−1 four game-theoretic variants of Closeness centrality, three of k ψi(ν)= β(k)E MC (C ,i) , (1) which are based on existing solution concepts, namely: the k=0 Owen value (Owen 1977), the configuration value (Albizuri, k Aurrecoechea, and Zarzuelo 2006), and the coalitional semi- where C is a random variable over subsets of size k chosen value (Szczepanski,´ Michalak, and Wooldridge 2014). The from the set N \{i} with uniform probability, and E is the fourth and most general variant is based on a new solution expected value operator for this variable. The Shapley value concept proposed in this paper, which generalises the afore- and Banzhaf index are two semivalues, defined by β(k)= |N|−1 |N|−1 mentioned three concepts, and offers greater flexibility in 1/|N| and β(k)= k /2 , respectively. modelling the underlying community structure. We call it Importantly, all semivalues assume that CS = {N}, i.e., the Configuration semivalue. that the grand coalition is formed. To relax this assumption, Crucially, all of the aforementioned solution concepts are Owen (1977) introduced a solution concept—now known hard to compute given an arbitrary group-evaluation func- as the Owen value—that divides the payoff of any a pri- tion. Nevertheless, for the purpose of extending Closeness ori coalition structure CS. Now, when CS = {N} or centrality, we propose polynomial-time algorithms for com- CS = {{i}}i∈N , the Owen value is equivalent to the Shap- puting the corresponding game-theoretic extension. This re- ley value. As such, the Owen value is a generalization of the sult fills several gaps in the computational-complexity anal- Shapley value; one that does not generalize the β function ysis of game-theoretic centrality measures (see Table 1). (as semivalues do), but rather generalizes the assumed coali- Finally, to demonstrate the applicability of our approach, tion structure CS.

Closeness Centrality for Networks with Overlapping Community Structure

Detecting Statistically Significant Communities

Centrality in Modular Networks

Incorporating Network Structure with Node Contents for Community Detection on Large Networks Using Deep Learning

Review of Community Detection Over Social Media: Graph Prospective

A Centrality Measure for Networks with Community Structure Based on a Generalization of the Owen Value

Structural Diversity and Homophily: a Study Across More Than One Hundred Big Networks

Community Extraction in Multilayer Networks with Heterogeneous Community Structure

Hidden Community Detection in Online Forums

Communities in Networks Mason A

Testing Community Structure for Hypergraphs

Modularity Based Community Detection with Deep Learning

Ordered Community Structure in Networks