Edge Direction and the Structure of Networks

Total Page:16

File Type:pdf, Size:1020Kb

Edge Direction and the Structure of Networks Edge direction and the structure of networks Jacob G. Foster1, David V. Foster, Peter Grassberger, and Maya Paczuski Complexity Science Group, Department of Physics and Astronomy, University of Calgary, 2500 University Drive NW, Calgary, AB, Canada T2N 1N4 Edited by H. Eugene Stanley, Boston University, Boston, MA, and approved April 6, 2010 (received for review November 2, 2009) Directed networks are ubiquitous and are necessary to represent In directed networks, an edge from source to target (A → B) complex systems with asymmetric interactions—from food webs represents an asymmetric interaction; for example, that Web site to the World Wide Web. Despite the importance of edge direction A contains a hyperlink to Web site B, or organism A is eaten for detecting local and community structure, it has been disregarded by organism B. Edge direction is essential to evaluate and explain in studying a basic type of global diversity in networks: the tendency local structure in such networks. For instance, motif analysis of nodes with similar numbers of edges to connect. This tendency, (13, 14) identifies local connection patterns that appear more called assortativity, affects crucial structural and dynamic properties frequently in the real-world network than in ensembles of rando- of real-world networks, such as error tolerance or epidemic spread- mized networks. In this context, edge direction distinguishes ing. Here we demonstrate that edge direction has profound effects functional units like feed-forward and feedback loops. Takingedge on assortativity. We define a set of four directed assortativity mea- direction into account also overturns the simple picture of the sures and assign statistical significance by comparison to rando- World Wide Web (WWW) as having a short average distance mized networks. We apply these measures to three network between all Web pages (15) in favor of a richer picture of link classes—online/social networks, food webs, and word-adjacency flow into and out of a dense inner core (16). More recently, networks. Our measures (i) reveal patterns common to each class, attempts to identify communities in directed networks have (ii) separate networks that have been previously classified together, demonstrated that ignoring edge direction misses key organiza- – and (iii) expose limitations of several existing theoretical models. tional features of community structure in networks (17 19). Hence We reject the standard classification of directed networks as purely it is striking that assortativity in directed networks has been studied assortative or disassortative. Many display a class-specific mixture, only by ignoring edge direction entirely (8) or by measuring a likely reflecting functional or historical constraints, contingencies, subset of the four possible degree-degree correlations (9, 20). and forces guiding the system’s evolution. All four degree-degree correlations were addressed in the specific contexts of earthquake recurrences (21) and the WWW (22) using the average neighbor degree, e.g., hk0outi ðkinÞ, as a measure omplex networks reveal essential features of the structure, nn rather than the Pearson correlation. However, it is easier to inter- function, and dynamics of many complex systems (1–4). C pret and assign statistical significance to the Pearson correlation. While networks from diverse fields share various properties Moreover, the average neighbor degree cannot be easily used (3, 5–7) and universal patterns (1, 3), they also display enormous to quantify the diversity of a given network or to compare networks structural, functional, and dynamical diversity. A basic measure of various sizes, unlike the Pearson correlation. Incorporating of diversity is assortativity by degree (hereafter assortativity): the edge direction into familiar assortativity measures based on the tendency of nodes to link to other nodes with a similar number of Pearson correlation is an essential step to better characterize, edges (4, 8, 9). Despite its importance, no disciplined approach to understand, and model directed networks. Indeed, since they scale assortativity in directed networks has been proposed. Here we as OðEÞ, where E is the number of edges in the network, our present such an approach and show that measures of directed directed assortativity measures can be evaluated for large assortativity provide a number of insights into the structure of networks that are beyond the reach of current motif analysis or directed networks and key factors governing their evolution. community detection algorithms. Assortativity is a standard tool in analyzing network structure Here we analyze online and social networks, food webs, and (4) and has a simple interpretation. In assortative networks with word-adjacency networks. Classes of directed networks show symmetric interactions (i.e., undirected networks), high degree common patterns across the four directed assortativity measures: nodes, or nodes with many edges, tend to connect to other high rðout; inÞ; rðin; outÞ; rðout; outÞ; and rðin; inÞ. The first element in degree nodes. Hence, assortative networks remain connected the parentheses labels the degree of the source node of the despite node removal or failure (9), but are hard to immunize directed edge, and the second labels the degree of the target against the spread of epidemics (10). In disassortative networks, node. Thus rðin; outÞ quantifies the tendency of nodes with high conversely, high degree nodes tend to connect to low degree in-degree to connect to nodes with high out-degree, and so on; PHYSICS nodes (8, 9); these networks limit the effects of node failure see Fig. 1. because important nodes (with many edges) are isolated from We compare the real-world network with an ensemble of ran- each other (11). Assortativity has a convenient global measure: domized networks. This comparison allows us to assign statistical the Pearson correlation (r) between the degrees of nodes sharing significance to each measure.* We use that significance to define an edge (8, 9). It ranges from −1 to 1, with (r>0) in assortative an Assortativity Significance Profile for each network. This pro- networks and (r < 0) in disassortative ones. Earlier work pro- file allows us to distinguish between networks grouped together posed a simple classification of networks on the basis of assorta- tivity, in which social networks are assortative and biological and Author contributions: J.G.F., D.V.F., P.G., and M.P. designed research; J.G.F., D.V.F., P.G., and technological networks are disassortative (4, 8, 9). Recent work M.P. performed research; J.G.F., D.V.F., P.G., and M.P. analyzed data; and J.G.F., D.V.F., P.G., suggests that this classification does not hold for undirected and M.P. wrote the paper. networks: Many online social networks are disassortative (12). The authors declare no conflict of interest. We go further, demonstrating that the simple assortative/disas- This article is a PNAS Direct Submission. sortative dichotomy misses fundamental features of networks *To our knowledge, statistical significance has been assigned to assortativity measures in where edge direction plays a crucial role. In fact, we show that only one publication (23). many networks are neither purely assortative nor disassortative, 1To whom correspondence should be addressed. E-mail: [email protected]. but display a mixture of both tendencies. These patterns provide a This article contains supporting information online at www.pnas.org/lookup/suppl/ classification scheme for networks with asymmetric interactions. doi:10.1073/pnas.0912671107/-/DCSupplemental. www.pnas.org/cgi/doi/10.1073/pnas.0912671107 PNAS ∣ June 15, 2010 ∣ vol. 107 ∣ no. 24 ∣ 10815–10820 Downloaded by guest on September 24, 2021 by other measures; indeed, we find that online and social net- r α; β − r α; β Z α; β rwð Þ h randð Þi : [2] works, which have similar motif structure (14), have substantially ð Þ¼ σ r α; β ½ randð Þ different assortativity profiles. The class-specific profiles point to forces or constraints that may guide the structure, function and This quantifies the difference between the assortativity measure of r α; β growth of that class (14, 24, 25). We also uncover limitations of the real-world network rwð Þ and its average value in the ran- r α; β several theoretical network models. For example, neither of two domized ensemble h randð Þi in units of the standard deviation, σ r α; β Z plausible models of word-adjacency networks [one proposed by ½ randð Þ. Larger networks typically have larger scores (see Milo et al. (14), the other in this paper] can reproduce the direc- Table S2). To compare networks of various sizes, the Z scores are ted assortativity profile we observe in the real-world networks. A normalized (14) by defining an Assortativity Significance Profile 2 1∕2 standard model of the WWW (26) is similarly unsuccessful. On (ASP), where ASPðα; βÞ¼Zðα; βÞ∕½∑α;βZðα; βÞ . This quan- the other hand, the food web models (27) examined here repro- tity is directly related to the Z score, and for a given network duce the pattern of assortativity seen in different food webs. the normalization does not change the relative size of the signifi- Hence our measures provide useful benchmarks to test models cance measures. To separate less significant correlations, we indi- of network formation. cate jZðα; βÞj < 2 in all figures by an appropriately colored Table S1 provides descriptions and sources for all networks asterisk. A positive Zðα; βÞ or ASPðα; βÞ (“Z assortative”) indicates examined in this paper; Table S2 collects the full results including that the real-world network is more assortative in that measure error estimates. than expected based on the degree sequence. A negative Zðα; βÞ or ASPðα; βÞ (“Z disassortative”) means that the original Results and Discussion network is less assortative than expected. Since nodes in directed networks have both an in-degree and an out-degree, we introduce a set of four directed assortativity Online and Social Networks.
Recommended publications
  • A Network Approach of the Mandatory Influenza Vaccination Among Healthcare Workers
    Wright State University CORE Scholar Master of Public Health Program Student Publications Master of Public Health Program 2014 Best Practices: A Network Approach of the Mandatory Influenza Vaccination Among Healthcare Workers Greg Attenweiler Wright State University - Main Campus Angie Thomure Wright State University - Main Campus Follow this and additional works at: https://corescholar.libraries.wright.edu/mph Part of the Influenza Virus accinesV Commons Repository Citation Attenweiler, G., & Thomure, A. (2014). Best Practices: A Network Approach of the Mandatory Influenza Vaccination Among Healthcare Workers. Wright State University, Dayton, Ohio. This Master's Culminating Experience is brought to you for free and open access by the Master of Public Health Program at CORE Scholar. It has been accepted for inclusion in Master of Public Health Program Student Publications by an authorized administrator of CORE Scholar. For more information, please contact library- [email protected]. Running Head: A NETWORK APPROACH 1 Best Practices: A network approach of the mandatory influenza vaccination among healthcare workers Greg Attenweiler Angie Thomure Wright State University A NETWORK APPROACH 2 Acknowledgements We would like to thank Michele Battle-Fisher and Nikki Rogers for donating their time and resources to help us complete our Culminating Experience. We would also like to thank Michele Battle-Fisher for creating the simulation used in our Culmination Experience. Finally we would like to thank our family and friends for all of the
    [Show full text]
  • Detecting Statistically Significant Communities
    1 Detecting Statistically Significant Communities Zengyou He, Hao Liang, Zheng Chen, Can Zhao, Yan Liu Abstract—Community detection is a key data analysis problem across different fields. During the past decades, numerous algorithms have been proposed to address this issue. However, most work on community detection does not address the issue of statistical significance. Although some research efforts have been made towards mining statistically significant communities, deriving an analytical solution of p-value for one community under the configuration model is still a challenging mission that remains unsolved. The configuration model is a widely used random graph model in community detection, in which the degree of each node is preserved in the generated random networks. To partially fulfill this void, we present a tight upper bound on the p-value of a single community under the configuration model, which can be used for quantifying the statistical significance of each community analytically. Meanwhile, we present a local search method to detect statistically significant communities in an iterative manner. Experimental results demonstrate that our method is comparable with the competing methods on detecting statistically significant communities. Index Terms—Community Detection, Random Graphs, Configuration Model, Statistical Significance. F 1 INTRODUCTION ETWORKS are widely used for modeling the structure function that is able to always achieve the best performance N of complex systems in many fields, such as biology, in all possible scenarios. engineering, and social science. Within the networks, some However, most of these objective functions (metrics) do vertices with similar properties will form communities that not address the issue of statistical significance of commu- have more internal connections and less external links.
    [Show full text]
  • Geodesic Distance Descriptors
    Geodesic Distance Descriptors Gil Shamai and Ron Kimmel Technion - Israel Institute of Technologies [email protected] [email protected] Abstract efficiency of state of the art shape matching procedures. The Gromov-Hausdorff (GH) distance is traditionally used for measuring distances between metric spaces. It 1. Introduction was adapted for non-rigid shape comparison and match- One line of thought in shape analysis considers an ob- ing of isometric surfaces, and is defined as the minimal ject as a metric space, and object matching, classification, distortion of embedding one surface into the other, while and comparison as the operation of measuring the discrep- the optimal correspondence can be described as the map ancies and similarities between such metric spaces, see, for that minimizes this distortion. Solving such a minimiza- example, [13, 33, 27, 23, 8, 3, 24]. tion is a hard combinatorial problem that requires pre- Although theoretically appealing, the computation of computation and storing of all pairwise geodesic distances distances between metric spaces poses complexity chal- for the matched surfaces. A popular way for compact repre- lenges as far as direct computation and memory require- sentation of functions on surfaces is by projecting them into ments are involved. As a remedy, alternative representa- the leading eigenfunctions of the Laplace-Beltrami Opera- tion spaces were proposed [26, 22, 15, 10, 31, 30, 19, 20]. tor (LBO). When truncated, the basis of the LBO is known The question of which representation to use in order to best to be the optimal for representing functions with bounded represent the metric space that define each form we deal gradient in a min-max sense.
    [Show full text]
  • Introduction to Network Science & Visualisation
    IFC – Bank Indonesia International Workshop and Seminar on “Big Data for Central Bank Policies / Building Pathways for Policy Making with Big Data” Bali, Indonesia, 23-26 July 2018 Introduction to network science & visualisation1 Kimmo Soramäki, Financial Network Analytics 1 This presentation was prepared for the meeting. The views expressed are those of the author and do not necessarily reflect the views of the BIS, the IFC or the central banks and other institutions represented at the meeting. FNA FNA Introduction to Network Science & Visualization I Dr. Kimmo Soramäki Founder & CEO, FNA www.fna.fi Agenda Network Science ● Introduction ● Key concepts Exposure Networks ● OTC Derivatives ● CCP Interconnectedness Correlation Networks ● Housing Bubble and Crisis ● US Presidential Election Network Science and Graphs Analytics Is already powering the best known AI applications Knowledge Social Product Economic Knowledge Payment Graph Graph Graph Graph Graph Graph Network Science and Graphs Analytics “Goldman Sachs takes a DIY approach to graph analytics” For enhanced compliance and fraud detection (www.TechTarget.com, Mar 2015). “PayPal relies on graph techniques to perform sophisticated fraud detection” Saving them more than $700 million and enabling them to perform predictive fraud analysis, according to the IDC (www.globalbankingandfinance.com, Jan 2016) "Network diagnostics .. may displace atomised metrics such as VaR” Regulators are increasing using network science for financial stability analysis. (Andy Haldane, Bank of England Executive
    [Show full text]
  • A Centrality Measure for Electrical Networks
    Carnegie Mellon Electricity Industry Center Working Paper CEIC-07 www.cmu.edu/electricity 1 A Centrality Measure for Electrical Networks Paul Hines and Seth Blumsack types of failures. Many classifications of network structures Abstract—We derive a measure of “electrical centrality” for have been studied in the field of complex systems, statistical AC power networks, which describes the structure of the mechanics, and social networking [5,6], as shown in Figure 2, network as a function of its electrical topology rather than its but the two most fruitful and relevant have been the random physical topology. We compare our centrality measure to network model of Erdös and Renyi [7] and the “small world” conventional measures of network structure using the IEEE 300- bus network. We find that when measured electrically, power model inspired by the analyses in [8] and [9]. In the random networks appear to have a scale-free network structure. Thus, network model, nodes and edges are connected randomly. The unlike previous studies of the structure of power grids, we find small-world network is defined largely by relatively short that power networks have a number of highly-connected “hub” average path lengths between node pairs, even for very large buses. This result, and the structure of power networks in networks. One particularly important class of small-world general, is likely to have important implications for the reliability networks is the so-called “scale-free” network [10, 11], which and security of power networks. is characterized by a more heterogeneous connectivity. In a Index Terms—Scale-Free Networks, Connectivity, Cascading scale-free network, most nodes are connected to only a few Failures, Network Structure others, but a few nodes (known as hubs) are highly connected to the rest of the network.
    [Show full text]
  • Network Science
    This is a preprint of Katy Börner, Soma Sanyal and Alessandro Vespignani (2007) Network Science. In Blaise Cronin (Ed) Annual Review of Information Science & Technology, Volume 41. Medford, NJ: Information Today, Inc./American Society for Information Science and Technology, chapter 12, pp. 537-607. Network Science Katy Börner School of Library and Information Science, Indiana University, Bloomington, IN 47405, USA [email protected] Soma Sanyal School of Library and Information Science, Indiana University, Bloomington, IN 47405, USA [email protected] Alessandro Vespignani School of Informatics, Indiana University, Bloomington, IN 47406, USA [email protected] 1. Introduction.............................................................................................................................................2 2. Notions and Notations.............................................................................................................................4 2.1 Graphs and Subgraphs .........................................................................................................................5 2.2 Graph Connectivity..............................................................................................................................7 3. Network Sampling ..................................................................................................................................9 4. Network Measurements........................................................................................................................11
    [Show full text]
  • Latent Distance Estimation for Random Geometric Graphs ∗
    Latent Distance Estimation for Random Geometric Graphs ∗ Ernesto Araya Valdivia Laboratoire de Math´ematiquesd'Orsay (LMO) Universit´eParis-Sud 91405 Orsay Cedex France Yohann De Castro Ecole des Ponts ParisTech-CERMICS 6 et 8 avenue Blaise Pascal, Cit´eDescartes Champs sur Marne, 77455 Marne la Vall´ee,Cedex 2 France Abstract: Random geometric graphs are a popular choice for a latent points generative model for networks. Their definition is based on a sample of n points X1;X2; ··· ;Xn on d−1 the Euclidean sphere S which represents the latent positions of nodes of the network. The connection probabilities between the nodes are determined by an unknown function (referred to as the \link" function) evaluated at the distance between the latent points. We introduce a spectral estimator of the pairwise distance between latent points and we prove that its rate of convergence is the same as the nonparametric estimation of a d−1 function on S , up to a logarithmic factor. In addition, we provide an efficient spectral algorithm to compute this estimator without any knowledge on the nonparametric link function. As a byproduct, our method can also consistently estimate the dimension d of the latent space. MSC 2010 subject classifications: Primary 68Q32; secondary 60F99, 68T01. Keywords and phrases: Graphon model, Random Geometric Graph, Latent distances estimation, Latent position graph, Spectral methods. 1. Introduction Random geometric graph (RGG) models have received attention lately as alternative to some simpler yet unrealistic models as the ubiquitous Erd¨os-R´enyi model [11]. They are generative latent point models for graphs, where it is assumed that each node has associated a latent d point in a metric space (usually the Euclidean unit sphere or the unit cube in R ) and the connection probability between two nodes depends on the position of their associated latent points.
    [Show full text]
  • Exploring Network Structure, Dynamics, and Function Using Networkx
    Proceedings of the 7th Python in Science Conference (SciPy 2008) Exploring Network Structure, Dynamics, and Function using NetworkX Aric A. Hagberg ([email protected])– Los Alamos National Laboratory, Los Alamos, New Mexico USA Daniel A. Schult ([email protected])– Colgate University, Hamilton, NY USA Pieter J. Swart ([email protected])– Los Alamos National Laboratory, Los Alamos, New Mexico USA NetworkX is a Python language package for explo- and algorithms, to rapidly test new hypotheses and ration and analysis of networks and network algo- models, and to teach the theory of networks. rithms. The core package provides data structures The structure of a network, or graph, is encoded in the for representing many types of networks, or graphs, edges (connections, links, ties, arcs, bonds) between including simple graphs, directed graphs, and graphs nodes (vertices, sites, actors). NetworkX provides ba- with parallel edges and self-loops. The nodes in Net- sic network data structures for the representation of workX graphs can be any (hashable) Python object simple graphs, directed graphs, and graphs with self- and edges can contain arbitrary data; this flexibil- loops and parallel edges. It allows (almost) arbitrary ity makes NetworkX ideal for representing networks objects as nodes and can associate arbitrary objects to found in many different scientific fields. edges. This is a powerful advantage; the network struc- In addition to the basic data structures many graph ture can be integrated with custom objects and data algorithms are implemented for calculating network structures, complementing any pre-existing code and properties and structure measures: shortest paths, allowing network analysis in any application setting betweenness centrality, clustering, and degree dis- without significant software development.
    [Show full text]
  • Centrality in Modular Networks
    Ghalmane et al. EPJ Data Science (2019)8:15 https://doi.org/10.1140/epjds/s13688-019-0195-7 REGULAR ARTICLE OpenAccess Centrality in modular networks Zakariya Ghalmane1,3, Mohammed El Hassouni1, Chantal Cherifi2 and Hocine Cherifi3* *Correspondence: hocine.cherifi@u-bourgogne.fr Abstract 3LE2I, UMR6306 CNRS, University of Burgundy, Dijon, France Identifying influential nodes in a network is a fundamental issue due to its wide Full list of author information is applications, such as accelerating information diffusion or halting virus spreading. available at the end of the article Many measures based on the network topology have emerged over the years to identify influential nodes such as Betweenness, Closeness, and Eigenvalue centrality. However, although most real-world networks are made of groups of tightly connected nodes which are sparsely connected with the rest of the network in a so-called modular structure, few measures exploit this property. Recent works have shown that it has a significant effect on the dynamics of networks. In a modular network, a node has two types of influence: a local influence (on the nodes of its community) through its intra-community links and a global influence (on the nodes in other communities) through its inter-community links. Depending on the strength of the community structure, these two components are more or less influential. Based on this idea, we propose to extend all the standard centrality measures defined for networks with no community structure to modular networks. The so-called “Modular centrality” is a two-dimensional vector. Its first component quantifies the local influence of a node in its community while the second component quantifies its global influence on the other communities of the network.
    [Show full text]
  • Average D-Distance Between Vertices of a Graph
    italian journal of pure and applied mathematics { n. 33¡2014 (293¡298) 293 AVERAGE D-DISTANCE BETWEEN VERTICES OF A GRAPH D. Reddy Babu Department of Mathematics Koneru Lakshmaiah Education Foundation (K.L. University) Vaddeswaram Guntur 522 502 India e-mail: [email protected], [email protected] P.L.N. Varma Department of Science & Humanities V.F.S.T.R. University Vadlamudi Guntur 522 237 India e-mail: varma [email protected] Abstract. The D-distance between vertices of a graph G is obtained by considering the path lengths and as well as the degrees of vertices present on the path. The average D-distance between the vertices of a connected graph is the average of the D-distances between all pairs of vertices of the graph. In this article we study the average D-distance between the vertices of a graph. Keywords: D-distance, average D-distance, diameter. 2000 Mathematics Subject Classi¯cation: 05C12. 1. Introduction The concept of distance is one of the important concepts in study of graphs. It is used in isomorphism testing, graph operations, hamiltonicity problems, extremal problems on connectivity and diameter, convexity in graphs etc. Distance is the basis of many concepts of symmetry in graphs. In addition to the usual distance, d(u; v) between two vertices u; v 2 V (G) we have detour distance (introduced by Chartrand et al, see [2]), superior distance (introduced by Kathiresan and Marimuthu, see [6]), signal distance (introduced by Kathiresan and Sumathi, see [7]), degree distance etc. 294 d. reddy babu, p.l.n. varma In an earlier article [9], the authors introduced the concept of D-distance be- tween vertices of a graph G by considering not only path length between vertices, but also the degrees of all vertices present in a path while de¯ning the D-distance.
    [Show full text]
  • Robustness and Assortativity for Diffusion-Like Processes in Scale
    epl draft Robustness and Assortativity for Diffusion-like Processes in Scale- free Networks G. D’Agostino1, A. Scala2,3, V. Zlatic´4 and G. Caldarelli5,3 1 ENEA - CR ”Casaccia” - Via Anguillarese 301 00123, Roma - Italy 2 CNR-ISC and Department of Physics, University of Rome “Sapienza” P.le Aldo Moro 5 00185 Rome, Italy 3 London Institute of Mathematical Sciences, 22 South Audley St Mayfair London W1K 2NY, UK 4 Theoretical Physics Division, Rudjer Boˇskovi´cInstitute, P.O.Box 180, HR-10002 Zagreb, Croatia 5 IMT Lucca Institute for Advanced Studies, Piazza S. Ponziano 6, Lucca, 55100, Italy PACS 89.75.Hc – Networks and genealogical trees PACS 05.70.Ln – Nonequilibrium and irreversible thermodynamics PACS 87.23.Ge – Dynamics of social systems Abstract – By analysing the diffusive dynamics of epidemics and of distress in complex networks, we study the effect of the assortativity on the robustness of the networks. We first determine by spectral analysis the thresholds above which epidemics/failures can spread; we then calculate the slowest diffusional times. Our results shows that disassortative networks exhibit a higher epidemi- ological threshold and are therefore easier to immunize, while in assortative networks there is a longer time for intervention before epidemic/failure spreads. Moreover, we study by computer sim- ulations the sandpile cascade model, a diffusive model of distress propagation (financial contagion). We show that, while assortative networks are more prone to the propagation of epidemic/failures, degree-targeted immunization policies increases their resilience to systemic risk. Introduction. – The heterogeneity in the distribu- propagation of an epidemic [13–15].
    [Show full text]
  • Ollivier Ricci Curvature of Directed Hypergraphs Marzieh Eidi1* & Jürgen Jost1,2
    www.nature.com/scientificreports OPEN Ollivier Ricci curvature of directed hypergraphs Marzieh Eidi1* & Jürgen Jost1,2 Many empirical networks incorporate higher order relations between elements and therefore are naturally modelled as, possibly directed and/or weighted, hypergraphs, rather than merely as graphs. In order to develop a systematic tool for the statistical analysis of such hypergraph, we propose a general defnition of Ricci curvature on directed hypergraphs and explore the consequences of that defnition. The defnition generalizes Ollivier’s defnition for graphs. It involves a carefully designed optimal transport problem between sets of vertices. While the defnition looks somewhat complex, in the end we shall be able to express our curvature in a very simple formula, κ = µ0 − µ2 − 2µ3 . This formula simply counts the fraction of vertices that have to be moved by distances 0, 2 or 3 in an optimal transport plan. We can then characterize various classes of hypergraphs by their curvature. Principles of network analysis. Network analysis constitutes one of the success stories in the study of complex systems3,6,11. For the mathematical analysis, a network is modelled as a (perhaps weighted and/or directed) graph. One can then look at certain graph theoretical properties of an empirical network, like its degree or motiv distribution, its assortativity or clustering coefcient, the spectrum of its Laplacian, and so on. One can also compare an empirical network with certain deterministic or random theoretical models. Successful as this analysis clearly is, we nevertheless see two important limitations. One is that many of the prominent concepts and quantities used in the analysis of empirical networks are node based, like the degree sequence.
    [Show full text]