Closeness Centrality for Networks with Overlapping Community Structure

Total Page:16

File Type:pdf, Size:1020Kb

Closeness Centrality for Networks with Overlapping Community Structure Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence (AAAI-16) Closeness Centrality for Networks with Overlapping Community Structure Mateusz K. Tarkowski1 and Piotr Szczepanski´ 2 Talal Rahwan3 and Tomasz P. Michalak1,4 and Michael Wooldridge1 2Department of Computer Science, University of Oxford, United Kingdom 2 Warsaw University of Technology, Poland 3Masdar Institute of Science and Technology, United Arab Emirates 4Institute of Informatics, University of Warsaw, Poland Abstract One aspect of networks that has been largely ignored in the literature on centrality is the fact that certain real-life Certain real-life networks have a community structure in which communities overlap. For example, a typical bus net- networks have a predefined community structure. In public work includes bus stops (nodes), which belong to one or more transportation networks, for example, bus stops are typically bus lines (communities) that often overlap. Clearly, it is im- grouped by the bus lines (or routes) that visit them. In coau- portant to take this information into account when measur- thorship networks, the various venues where authors pub- ing the centrality of a bus stop—how important it is to the lish can be interpreted as communities (Szczepanski,´ Micha- functioning of the network. For example, if a certain stop be- lak, and Wooldridge 2014). In social networks, individuals comes inaccessible, the impact will depend in part on the bus grouped by similar interests can be thought of as members lines that visit it. However, existing centrality measures do not of a community. Clearly for such networks, it is desirable take such information into account. Our aim is to bridge this to have a centrality measure that accounts for the prede- gap. We begin by developing a new game-theoretic solution fined community structure. Yet, to the best of our knowl- concept, which we call the Configuration semivalue, in order to have greater flexibility in modelling the community struc- edge, only one such measure has been developed to date ture compared to previous solution concepts from coopera- (Szczepanski,´ Michalak, and Wooldridge 2014), which ex- tive game theory. We then use the new concept as a building tends Degree centrality to networks with community struc- block to construct the first extension of Closeness centrality ture. Despite this recent development, one important aspect to networks with community structure (overlapping or other- of real-life networks remains missing from existing central- wise). Despite the computational complexity inherited from ity measures: the ability to consider overlapping communi- the Configuration semivalue, we show that the corresponding ties. Take social networks, for example, where such overlaps extension of Closeness centrality can be computed in poly- are widespread due to the various affiliations and interests nomial time. We empirically evaluate this measure and our of the individuals involved (Kelley et al. 2012). Likewise, in algorithm that computes it by analysing the Warsaw public our example of transportation networks, a bus stop may be transportation network. on the route of multiple (i.e., overlapping) bus lines. If such a stop becomes inaccessible, then all the bus lines that visit Introduction it would no longer function properly. As such, the impor- One of the key problems in network science involves identi- tance of a bus stop clearly depends (at least partially) on the fying the most important (or central) nodes (Freeman 1979; importance of the bus lines to which it belongs. Dezso˝ and Barabasi´ 2002; Keinan et al. 2004; Page et al. In an attempt to define a centrality measure that accounts 1999). The four best-known centrality measures are De- for overlapping communities, we focused on game-theoretic gree, Betweenness, Closeness and Eigenvector centralities centrality measures.1 The inspiration behind this line of re- (Bonacich 1972; Freeman 1979), each of which views cen- search comes from solution concepts in cooperative game trality from a different perspective, focusing on certain traits theory. In essence, given a set of players, a cooperative so- that make nodes important, or, “central,” to the function- lution concept typically defines a payoff for each player by ing of a network (Brandes and Erlebach 2005; Koschutzki comparing his or her contribution to the various groups of et al. 2005). Our focus in this paper is on Closeness cen- players (more on this in the next section). The rich repos- trality. This measure considers the important nodes to be itory of solution concepts has been extensively refined and those that are relatively close to all other nodes in the expanded over the past decades, making it an ideal toolkit network: the closer a node is to the others, the higher for quantifying the importance of individuals in a setting its centrality. Closeness centrality has many applications, where those individuals co-exist and operate in groups. In from coauthorship networks (Yan and Ding 2009), through the context of game-theoretic network centrality, the indi- tourism (Shih 2006), to social networks (Barabasi 2003; Karinthy 2006). 1See www.game-theoretic-centrality.com and Copyright c 2016, Association for the Advancement of Artificial www.network-centrality.com for an overview of this line Intelligence (www.aaai.org). All rights reserved. of research. 622 Solution Concept Degree Closeness Betweenness Game-Theoretic Concepts Shapley value 22 3 A cooperative game consists of a set of players N = Semivalues 44 3 {1, 2,...,n} and a characteristic function ν :2N → R 5 Owen value This Paper Open such that ν(∅)=0. This function assigns to each coali- 5 Coalitional semivalues This Paper Open tion of players its payoff (i.e., an indication of its perfor- Configuration value Open This Paper Open Configuration semivalues Open This Paper Open mance). We will henceforth refer to a game simply by ν.A coalition structure, CS = {Q1,Q2,...,Qm}, is a partition Table 1: The table outlines the papers that used various so- of N into disjoint coalitions. One of the key questions in lution concepts to extend Degree, Closeness, or Betweenness cooperative game theory is the following: Given a game ν centralities, and computed them in polynomial time. and a coalition structure CS that the players have formed, how do we divide the payoff of each coalition among its viduals correspond to the nodes of the network, while the members? In this context, assuming that CS = {N}, i.e., groups correspond to the subgraphs of the network. With assuming that the players have formed the grand coalition, this mapping, any solution concept from cooperative game Shapley (1953) proposed a solution concept—now known theory can be readily applied as a centrality measure, ex- as the Shapley value—to fairly divide the payoff from co- cept for one remaining obstacle: computing the solution con- operation among the players. Banzhaf (1965) proposed an- cept. In fact, most solution concepts are inherently hard to other solution concept—now known as the Banzhaf index— compute (Chalkiadakis, Elkind, and Wooldridge 2011). For- which is similar to the Shapley value except for a subtle tunately, however, in the context of network centrality, we difference in the way contributions are weighed. To gener- typically focus on a certain closed-form formula, specify- alize the aforementioned solution concepts, Weber (1979) ing how each group is evaluated. In certain cases, fixing proposed semivalues—a family of solution concepts that in- the group-evaluation function makes it possible to obtain cludes both the Shapley value and Banzhaf index. Formally, a polynomial-time algorithm that computes the resulting, let MC (C, i)=ν(C ∪{i}) − ν(C) be the marginal con- game-theoretic centrality measure. tribution of player i to coalition C ⊆ N \{i}. Denoting by With this in mind, we propose the first extension of Close- β(k) the probability that any player makes a marginal con- ness centrality to networks with either overlapping or non- tribution to a coalition of size k, the semivalue of i is: overlapping community structure. To this end, we propose |N|−1 four game-theoretic variants of Closeness centrality, three of k ψi(ν)= β(k)E MC (C ,i) , (1) which are based on existing solution concepts, namely: the k=0 Owen value (Owen 1977), the configuration value (Albizuri, k Aurrecoechea, and Zarzuelo 2006), and the coalitional semi- where C is a random variable over subsets of size k chosen value (Szczepanski,´ Michalak, and Wooldridge 2014). The from the set N \{i} with uniform probability, and E is the fourth and most general variant is based on a new solution expected value operator for this variable. The Shapley value concept proposed in this paper, which generalises the afore- and Banzhaf index are two semivalues, defined by β(k)= |N|−1 |N|−1 mentioned three concepts, and offers greater flexibility in 1/|N| and β(k)= k /2 , respectively. modelling the underlying community structure. We call it Importantly, all semivalues assume that CS = {N}, i.e., the Configuration semivalue. that the grand coalition is formed. To relax this assumption, Crucially, all of the aforementioned solution concepts are Owen (1977) introduced a solution concept—now known hard to compute given an arbitrary group-evaluation func- as the Owen value—that divides the payoff of any a pri- tion. Nevertheless, for the purpose of extending Closeness ori coalition structure CS. Now, when CS = {N} or centrality, we propose polynomial-time algorithms for com- CS = {{i}}i∈N , the Owen value is equivalent to the Shap- puting the corresponding game-theoretic extension. This re- ley value. As such, the Owen value is a generalization of the sult fills several gaps in the computational-complexity anal- Shapley value; one that does not generalize the β function ysis of game-theoretic centrality measures (see Table 1). (as semivalues do), but rather generalizes the assumed coali- Finally, to demonstrate the applicability of our approach, tion structure CS.
Recommended publications
  • Detecting Statistically Significant Communities
    1 Detecting Statistically Significant Communities Zengyou He, Hao Liang, Zheng Chen, Can Zhao, Yan Liu Abstract—Community detection is a key data analysis problem across different fields. During the past decades, numerous algorithms have been proposed to address this issue. However, most work on community detection does not address the issue of statistical significance. Although some research efforts have been made towards mining statistically significant communities, deriving an analytical solution of p-value for one community under the configuration model is still a challenging mission that remains unsolved. The configuration model is a widely used random graph model in community detection, in which the degree of each node is preserved in the generated random networks. To partially fulfill this void, we present a tight upper bound on the p-value of a single community under the configuration model, which can be used for quantifying the statistical significance of each community analytically. Meanwhile, we present a local search method to detect statistically significant communities in an iterative manner. Experimental results demonstrate that our method is comparable with the competing methods on detecting statistically significant communities. Index Terms—Community Detection, Random Graphs, Configuration Model, Statistical Significance. F 1 INTRODUCTION ETWORKS are widely used for modeling the structure function that is able to always achieve the best performance N of complex systems in many fields, such as biology, in all possible scenarios. engineering, and social science. Within the networks, some However, most of these objective functions (metrics) do vertices with similar properties will form communities that not address the issue of statistical significance of commu- have more internal connections and less external links.
    [Show full text]
  • Centrality in Modular Networks
    Ghalmane et al. EPJ Data Science (2019)8:15 https://doi.org/10.1140/epjds/s13688-019-0195-7 REGULAR ARTICLE OpenAccess Centrality in modular networks Zakariya Ghalmane1,3, Mohammed El Hassouni1, Chantal Cherifi2 and Hocine Cherifi3* *Correspondence: hocine.cherifi@u-bourgogne.fr Abstract 3LE2I, UMR6306 CNRS, University of Burgundy, Dijon, France Identifying influential nodes in a network is a fundamental issue due to its wide Full list of author information is applications, such as accelerating information diffusion or halting virus spreading. available at the end of the article Many measures based on the network topology have emerged over the years to identify influential nodes such as Betweenness, Closeness, and Eigenvalue centrality. However, although most real-world networks are made of groups of tightly connected nodes which are sparsely connected with the rest of the network in a so-called modular structure, few measures exploit this property. Recent works have shown that it has a significant effect on the dynamics of networks. In a modular network, a node has two types of influence: a local influence (on the nodes of its community) through its intra-community links and a global influence (on the nodes in other communities) through its inter-community links. Depending on the strength of the community structure, these two components are more or less influential. Based on this idea, we propose to extend all the standard centrality measures defined for networks with no community structure to modular networks. The so-called “Modular centrality” is a two-dimensional vector. Its first component quantifies the local influence of a node in its community while the second component quantifies its global influence on the other communities of the network.
    [Show full text]
  • Incorporating Network Structure with Node Contents for Community Detection on Large Networks Using Deep Learning
    Neurocomputing 297 (2018) 71–81 Contents lists available at ScienceDirect Neurocomputing journal homepage: www.elsevier.com/locate/neucom Incorporating network structure with node contents for community detection on large networks using deep learning ∗ Jinxin Cao a, Di Jin a, , Liang Yang c, Jianwu Dang a,b a School of Computer Science and Technology, Tianjin University, Tianjin 300350, China b School of Information Science, Japan Advanced Institute of Science and Technology, Ishikawa 923-1292, Japan c School of Computer Science and Engineering, Hebei University of Technology, Tianjin 300401, China a r t i c l e i n f o a b s t r a c t Article history: Community detection is an important task in social network analysis. In community detection, in gen- Received 3 June 2017 eral, there exist two types of the models that utilize either network topology or node contents. Some Revised 19 December 2017 studies endeavor to incorporate these two types of models under the framework of spectral clustering Accepted 28 January 2018 for a better community detection. However, it was not successful to obtain a big achievement since they Available online 2 February 2018 used a simple way for the combination. To reach a better community detection, it requires to realize a Communicated by Prof. FangXiang Wu seamless combination of these two methods. For this purpose, we re-examine the properties of the mod- ularity maximization and normalized-cut models and fund out a certain approach to realize a seamless Keywords: Community detection combination of these two models. These two models seek for a low-rank embedding to represent of the Deep learning community structure and reconstruct the network topology and node contents, respectively.
    [Show full text]
  • Review of Community Detection Over Social Media: Graph Prospective
    (IJACSA) International Journal of Advanced Computer Science and Applications, Vol. 10, No. 2, 2019 Review of Community Detection over Social Media: Graph Prospective Pranita Jain1, Deepak Singh Tomar2 Department of Computer Science Maulana Azad National Institute of Technology Bhopal, India 462001 Abstract—Community over the social media is the group of globally distributed end users having similar attitude towards a particular topic or product. Community detection algorithm is used to identify the social atoms that are more densely interconnected relatively to the rest over the social media platform. Recently researchers focused on group-based algorithm and member-based algorithm for community detection over social media. This paper presents comprehensive overview of community detection technique based on recent research and subsequently explores graphical prospective of social media mining and social theory (Balance theory, status theory, correlation theory) over community detection. Along with that this paper presents a comparative analysis of three different state of art community detection algorithm available on I-Graph package on python i.e. walk trap, edge betweenness and fast greedy over six different social media data set. That yield intersecting facts about the capabilities and deficiency of community analysis methods. Fig 1. Social Media Network. Keywords—Community detection; social media; social media Aim of Community detection is to form group of mining; homophily; influence; confounding; social theory; homogenous nodes and figure out a strongly linked subgraphs community detection algorithm from heterogeneous network. In strongly linked sub- graphs (Community structure) nodes have more internal links than I. INTRODUCTION external. Detecting communities in heterogeneous networks is The Emergence of Social networking Site (SNS) like Face- same as, the graph partition problem in modern graph theory book, Twitter, LinkedIn, MySpace, etc.
    [Show full text]
  • A Centrality Measure for Networks with Community Structure Based on a Generalization of the Owen Value
    ECAI 2014 867 T. Schaub et al. (Eds.) © 2014 The Authors and IOS Press. This article is published online with Open Access by IOS Press and distributed under the terms of the Creative Commons Attribution Non-Commercial License. doi:10.3233/978-1-61499-419-0-867 A Centrality Measure for Networks With Community Structure Based on a Generalization of the Owen Value Piotr L. Szczepanski´ 1 and Tomasz P. Michalak2 ,3 and Michael Wooldridge2 Abstract. There is currently much interest in the problem of mea- cated — approaches have been considered in the literature. Recently, suring the centrality of nodes in networks/graphs; such measures various methods for the analysis of cooperative games have been ad- have a range of applications, from social network analysis, to chem- vocated as measures of centrality [16, 10]. The key idea behind this istry and biology. In this paper we propose the first measure of node approach is to consider groups (or coalitions) of nodes instead of only centrality that takes into account the community structure of the un- considering individual nodes. By doing so this approach accounts for derlying network. Our measure builds upon the recent literature on potential synergies between groups of nodes that become visible only game-theoretic centralities, where solution concepts from coopera- if the value of nodes are analysed jointly [18]. Next, given all poten- tive game theory are used to reason about importance of nodes in the tial groups of nodes, game-theoretic solution concepts can be used to network. To allow for flexible modelling of community structures, reason about players (i.e., nodes) in such a coalitional game.
    [Show full text]
  • Structural Diversity and Homophily: a Study Across More Than One Hundred Big Networks
    KDD 2017 Research Paper KDD’17, August 13–17, 2017, Halifax, NS, Canada Structural Diversity and Homophily: A Study Across More Than One Hundred Big Networks Yuxiao Dong∗ Reid A. Johnson Microso Research University of Notre Dame Redmond, WA 98052 Notre Dame, IN 46556 [email protected] [email protected] Jian Xu Nitesh V. Chawla University of Notre Dame University of Notre Dame Notre Dame, IN 46556 Notre Dame, IN 46556 [email protected] [email protected] ABSTRACT ACM Reference format: A widely recognized organizing principle of networks is structural Yuxiao Dong, Reid A. Johnson, Jian Xu, and Nitesh V. Chawla. 2017. Struc- tural Diversity and Homophily: A Study Across More an One Hundred homophily, which suggests that people with more common neigh- Big Networks. In Proceedings of KDD ’17, August 13–17, 2017, Halifax, NS, bors are more likely to connect with each other. However, what in- Canada., , 10 pages. uence the diverse structures embedded in common neighbors (e.g., DOI: hp://dx.doi.org/10.1145/3097983.3098116 and ) have on link formation is much less well-understood. To explore this problem, we begin by characterizing the structural diversity of common neighborhoods. Using a collection of 120 large- 1 INTRODUCTION scale networks, we demonstrate that the impact of the common neighborhood diversity on link existence can vary substantially Since the time of Aristotle, it has been observed that people “love across networks. We nd that its positive eect on Facebook and those who are like themselves” [2]. We now know this as the negative eect on LinkedIn suggest dierent underlying network- principle of homophily, which asserts that people’s propensity to ing needs in these networks.
    [Show full text]
  • Community Extraction in Multilayer Networks with Heterogeneous Community Structure
    Journal of Machine Learning Research 18 (2017) 1-49 Submitted 12/16; Revised 9/17; Published 11/17 Community Extraction in Multilayer Networks with Heterogeneous Community Structure James D. Wilson∗ [email protected] Department of Mathematics and Statistics University of San Francisco San Francisco, CA 94117-1080 John Palowitch∗ [email protected] Shankar Bhamidi [email protected] Andrew B. Nobel [email protected] Department of Statistics and Operations Research University of North Carolina at Chapel Hill Chapel Hill, NC 27599 Editor: Edo Airoldi Abstract Multilayer networks are a useful way to capture and model multiple, binary or weighted relationships among a fixed group of objects. While community detection has proven to be a useful exploratory technique for the analysis of single-layer networks, the develop- ment of community detection methods for multilayer networks is still in its infancy. We propose and investigate a procedure, called Multilayer Extraction, that identifies densely connected vertex-layer sets in multilayer networks. Multilayer Extraction makes use of a significance based score that quantifies the connectivity of an observed vertex-layer set through comparison with a fixed degree random graph model. Multilayer Extraction di- rectly handles networks with heterogeneous layers where community structure may be different from layer to layer. The procedure can capture overlapping communities, as well as background vertex-layer pairs that do not belong to any community. We estab- lish consistency of the vertex-layer set optimizer of our proposed multilayer score un- der the multilayer stochastic block model. We investigate the performance of Multi- layer Extraction on three applications and a test bed of simulations.
    [Show full text]
  • Hidden Community Detection in Online Forums
    Hidden Community Detection in Online Forums Daniel Salz Nicholas Benavides Jonathan Li Stanford University Stanford University Stanford University [email protected] [email protected] [email protected] Abstract cial networks, communities take on a very re- latable meaning, as communities of users are This project investigated both dominant and densely connected and may share common in- hidden community structure on a network of in- terests, backgrounds, or experiences. Online fo- teractions between subreddits on Reddit.com. We rums, such as Reddit, are generally divided into identified moderate community structure using communities based on the topic of the forum (ie the Louvain algorithm, and we also detected some r/nba for NBA fans or r/politics for those inter- hidden community structure with the HICODE ested in political news and discussions). The algorithm. We also explored a method to de- users in a specific forum can be thought of as tect future communities that involved predicting members of a community, but we can also think new edges and their weights, adding the predicted about groups of these forums as communities in edges to the existing network, and re-running the their own right. Louvain and HICODE algorithms to determine Within the field of community detection, there changes in the community structure. While the has been some interesting research conducted on edge weight prediction was a difficult task, we the topic of hidden community detection, which found that a linear regression model using 256- attempts to identify weaker community structure dimension node embeddings yielded the best re- that often goes undetected by the popular commu- sults.
    [Show full text]
  • Communities in Networks Mason A
    Communities in Networks Mason A. Porter, Jukka-Pekka Onnela, and Peter J. Mucha Introduction: Networks and Communities for an example). Each node has a degree given by “But although, as a matter of history, statistical the number of edges connected to it and a strength mechanics owes its origin to investigations in given by the total weight of those edges. Graphs thermodynamics, it seems eminently worthy of can represent either man-made or natural con- an independent development, both on account of structs, such as the World Wide Web or neuronal the elegance and simplicity of its principles, and synaptic networks in the brain. Agents in such because it yields new results and places old truths networked systems are like particles in traditional in a new light in departments quite outside of statistical mechanics that we all know and (pre- thermodynamics.” sumably) love, and the structure of interactions — Josiah Willard Gibbs, between agents reflects the microscopic rules that Elementary Principles in Statistical Mechanics, govern their behavior. The simplest types of links 1902 [47] are binary pairwise connections, in which one only cares about the presence or absence of a tie. How- rom an abstract perspective, the term ever, in many situations, links can also be assigned network is used as a synonym for a math- a direction and a (positive or negative) weight to ematical graph. However, to scientists designate different interaction strengths. across a variety of fields, this label means Traditional statistical physics is concerned with so much more [13,20,44,83,88,120,124]. the dynamics of ensembles of interacting and FIn sociology, each node (or vertex) of a network noninteracting particles.
    [Show full text]
  • Testing Community Structure for Hypergraphs
    Submitted to the Annals of Statistics TESTING COMMUNITY STRUCTURE FOR HYPERGRAPHS By Mingao Yuan‡, Ruiqi Liu§, Yang Feng¶,∗ and Zuofeng Shang‖,† North Dakota State University‡, Texas Tech University§, New York University¶ and New Jersey Institute of Technology ‖ Many complex networks in the real world can be formulated as hypergraphs where community detection has been widely used. How- ever, the fundamental question of whether communities exist or not in an observed hypergraph remains unclear. This work aims to tackle this important problem. Specifically, we systematically study when a hypergraph with community structure can be successfully distin- guished from its Erd¨os-R´enyi counterpart, and propose concrete test statistics when the models are distinguishable. The main contribution of this paper is threefold. First, we discover a phase transition in the hyperedge probability for distinguishability. Second, in the bounded- degree regime, we derive a sharp signal-to-noise ratio (SNR) thresh- old for distinguishability in the special two-community 3-uniform hy- pergraphs, and derive nearly tight SNR thresholds in the general two-community m-uniform hypergraphs. Third, in the dense regime, we propose a computationally feasible test based on sub-hypergraph counts, obtain its asymptotic distribution, and analyze its power. Our results are further extended to non-uniform hypergraphs in which a new test involving both edge and hyperedge information is proposed. The proofs rely on Janson's contiguity theory [40], a high-moments driven asymptotic normality result by Gao and Wormald [36], and a truncation technique for analyzing the likelihood ratio. 1. Introduction. Community detection is a fundamental problem in network data analysis.
    [Show full text]
  • Modularity Based Community Detection with Deep Learning
    Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence (IJCAI-16) Modularity Based Community Detection with Deep Learning 1,2,3 1 3, 1 4 5,6 Liang Yang, Xiaochun Cao, Dongxiao He, ⇤ Chuan Wang, Xiao Wang, Weixiong Zhang 1State Key Laboratory of Information Security, Institute of Information Engineering, Chinese Academy of Sciences 2School of Information Engineering, Tianjin University of Commerce 3School of Computer Science and Technology, Tianjin Key Laboratory of Cognitive Computing and Application, Tianjin University 4Department of Computer Science and Technology, Tsinghua University 5College of Math and Computer Science, Institute for Systems Biology, Jianghan University 6Department of Computer Science and Engineering, Washington University in St. Louis yangliang,caoxiaochun,wangchuan @iie.ac.cn, [email protected], { [email protected], [email protected]} Abstract model [Psorakis et al., 2011] focuses on deriving generative models of networks. Such a generative model in essence maps Identification of module or community structures a network to an embedding in a low-dimensional latent space. is important for characterizing and understanding The mapping can be done by, e.g., nonnegative matrix fac- complex systems. While designed with different ob- torization (NMF) [Wang et al., 2008]. Unlike the stochastic jectives, i.e., stochastic models for regeneration and model, the modularity maximization model [Newman, 2006], modularity maximization models for discrimination, as the name suggests, attempts to maximize a modularity func- both these two types of model look for low-rank tion on network substructures. The optimization can be done embedding to best represent and reconstruct net- by eigenvalue decomposition (EVD), which is equivalent to work topology.
    [Show full text]
  • Ordered Community Structure in Networks
    Ordered community structure in networks Steve Gregory Department of Computer Science, University of Bristol, Bristol BS8 1UB, England Abstract. Community structure in networks is often a consequence of homophily, or assortative mixing, based on some attribute of the vertices. For example, researchers may be grouped into communities corresponding to their research topic. This is possible if vertex attributes have discrete values, but many networks exhibit assortative mixing by some continuous-valued attribute, such as age or geographical location. In such cases, no discrete communities can be identified. We consider how the notion of community structure can be generalized to networks that are based on continuous-valued attributes: in general, a network may contain discrete communities which are ordered according to their attribute values. We propose a method of generating synthetic ordered networks and investigate the effect of ordered community structure on the spread of infectious diseases. We also show that community detection algorithms fail to recover community structure in ordered networks, and evaluate an alternative method using a layout algorithm to recover the ordering. Keywords . Community structure, assortative mixing, homophily, spatial networks. 1. Introduction Networks often exhibit a community structure, whereby groups of vertices (communities) exist such that the density of edges within a community is higher than average. In other words, an edge between any two vertices is more likely if the vertices are in the same community (or communities) than if they are not. Several types of community structure have been identified – disjoint, overlapping, hierarchical, etc. – but they all have the property that a network is partitioned into a finite number of (possibly overlapping) sets of vertices.
    [Show full text]