Constructing, Comparing, and Reconstructing Networks

Constructing, comparing, and reconstructing networks

by Brennan Klein
B.A. in Cognitive Science & Psychology, Swarthmore College

A dissertation submitted to The Faculty of the College of Science of Northeastern University in partial fulfillment of the requirements for the degree of Doctor of Philosophy

November 19, 2020

Dissertation Committee
Alessandro Vespignani, Chair
Samuel V. Scarpino, Co-chair
Tina Eliassi-Rad
Laurent Hébert-Dufresne

Acknowledgements

The thanks I give in this section will never, and can never, fully capture the extent of my gratitude. I cherish the friends, mentors, collaborators, and altogether supportive people in my life. Because of them, I have grown immensely as a scientist. Because of them, I have grown even more as a person. Because of them, I can go forth into this next stage of my life, full of a deep faith in what's to come, supported by a network of endlessly kind and brilliant people.

My dissertation committee (Laurent Hébert-Dufresne, Tina Eliassi-Rad, Sam Scarpino, and Alex Vespignani) is a perfect example of this network of support. It has been a privilege to share this dissertation with them over these last few years.

One of the greatest joys in my life has been my friendship with Conor Heins. We grew up together in science, and there are ideas that I simply cannot grasp without his presence. If there is one thing I have learned throughout my short career in science, it is the irreplaceable role that friendship has in driving discovery.

To my parents, Marsha and Don, I owe so much. This dissertation would not exist, and I would not be a scientist, without my brother, Jason.

Documents like these are, ironically, never comprehensive enough. I spent months compiling a list of all the people I wanted to thank, all the memories I wanted to share, to reminisce over, to try and inspire through.
At the same time, I am writing this document in the midst of a year of devastation from the COVID-19 pandemic. For some strange and sad reason, I cannot bring myself to write the names of every person I want to acknowledge. As a result, this section may seem artificially short or otherwise rushed. In place of a fuller list of acknowledgements, I make this promise to these cherished people in my life: that the acknowledgements will come in person, sporadically, surprisingly, over the next several years of our lives together. I hope we recognize each other.

Abstract of Dissertation

Complex networks are the syntax of complex systems; they are models that allow us to study phenomena across nature and society. And because they are models, the famous "all models are wrong, but some are useful" quotation rings especially true. We need to use the right networks to properly study complex systems, and in order to do so, the methods we use to create and analyze networks must be fit for purpose. This motivation has guided much of my dissertation, and in it, I explore three related themes around constructing, comparing, and reconstructing complex networks.

In the first chapter, I describe a theoretical and computational infrastructure that allows us to ask whether a given network captures the most informative scale to model the dynamics in the system. We see that many real world networks (especially heterogeneous networks) exhibit an information holarchy whereby a coarse-grained, macroscale representation of the network has more effective information than the original microscale network.

In the next chapter, I consider the challenging problem of comparing pairs of networks and quantifying their differences. These tools are broadly referred to as "graph distance" measures, and there are dozens used throughout Network Science.
However, unlike in other domains of Network Science where rigorous benchmarks have been established to compare our surplus of tools, there is still no theoretically-grounded benchmark for characterizing these tools. To address this, collaborators and I proposed that simple, well-understood ensembles of random networks are natural benchmarks for network comparison methods. In this chapter, I characterize over 20 different graph distance measures, and I show how this simple within-ensemble graph distance can lead to the development of new tools for studying complex networks.

The final chapter is an example of exactly that: I show how the within-ensemble graph distance can be used to characterize and evaluate different techniques for reconstructing networks from time series data. Tying together the original theme of using the "right" network, this chapter addresses one of the most fundamental challenges in Network Science: how to study networks when the network structure is not known. Whether it's reconstructing the network of neurons from time series of their activity, or identifying whether one stock's price fluctuations cause changes in another's, this problem is ubiquitous when studying complex systems; not only that, there are (again) dozens of techniques for transforming time series data into a network. In this chapter, I measure the within-ensemble graph distance between pairs of networks that have been reconstructed from time series data using a given reconstruction technique. What I find is that different reconstruction techniques have characteristic distributions of distances and that certain techniques are either redundant or underspecified given other more comprehensive methods.

Ultimately, the goal of this dissertation is to stress the importance of rigorous standards for the suite of tools we have in Network Science, which ultimately becomes an argument about how to make Network Science more useful as a science.
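The within-ensemble graph distance described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption chosen for illustration rather than taken from the dissertation: the ensemble is Erdős–Rényi G(n, p), the distance is a Jaccard distance on edge sets, and the function names (`sample_gnp`, `jaccard_distance`, `within_ensemble_distance`) are hypothetical. The idea is only to show the general recipe: sample independent pairs of graphs from one ensemble, measure the distance between each pair, and summarize the resulting distribution.

```python
import itertools
import random
import statistics


def sample_gnp(n, p, rng):
    """Sample a G(n, p) random graph as a set of undirected edges (i < j)."""
    return {
        (i, j)
        for i, j in itertools.combinations(range(n), 2)
        if rng.random() < p
    }


def jaccard_distance(edges_a, edges_b):
    """Return 1 - |A & B| / |A | B| on edge sets; 0.0 if both graphs are empty."""
    union = edges_a | edges_b
    if not union:
        return 0.0
    return 1.0 - len(edges_a & edges_b) / len(union)


def within_ensemble_distance(n, p, n_pairs=200, seed=0):
    """Mean and standard deviation of the distance over independent graph pairs."""
    rng = random.Random(seed)
    distances = [
        jaccard_distance(sample_gnp(n, p, rng), sample_gnp(n, p, rng))
        for _ in range(n_pairs)
    ]
    return statistics.mean(distances), statistics.stdev(distances)


if __name__ == "__main__":
    # Sweep the ensemble parameter and report how the distance responds.
    for p in (0.1, 0.5, 0.9):
        mean, std = within_ensemble_distance(n=50, p=p)
        print(f"p={p:.1f}  mean distance={mean:.3f}  std={std:.3f}")
```

For this particular (assumed) pairing of ensemble and distance, the mean should fall as p grows, roughly following 1 - p / (2 - p) for large n, since denser graphs share a larger fraction of their edges. Swapping in a different distance function or ensemble sampler leaves the rest of the recipe unchanged.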
Table of Contents

Acknowledgements
Abstract of Dissertation
Table of Contents
List of Figures
List of Tables

Chapter 1 Introduction
  1.1 Science in Network Science
    1.1.1 What makes a science a science?
  1.2 Theory in Network Theory
    1.2.1 Networks as data objects
    1.2.2 Networks as generative models of data
    1.2.3 Networks as hypotheses
  1.3 The current dissertation

Chapter 2 Constructing: Informative higher scales in complex networks
  2.1 Introduction
  2.2 Results
    2.2.1 Effective information
    2.2.2 Determinism and degeneracy
    2.2.3 Effective information in real networks
    2.2.4 Causal emergence in complex networks
    2.2.5 Network macroscales
    2.2.6 Causal emergence reveals the scale of networks
    2.2.7 Causal emergence in real networks
  2.3 Discussion
  2.4 Materials and Methods
    2.4.1 Selection of real networks
    2.4.2 Creating consistent macro-nodes
    2.4.3 Greedy algorithm for causal emergence
  2.5 Follow-up research: Biological networks
    2.5.1 Background: Noise in biological systems
    2.5.2 Effectiveness of interactomes across the tree of life
    2.5.3 Causal emergence across the tree of life
    2.5.4 Resilience of macroscale interactomes
    2.5.5 Discussion
    2.5.6 Protein interactome data
    2.5.7 Robustness of causal emergence

Chapter 3 Comparing: The within-ensemble graph distance
  3.1 Introduction
    3.1.1 Formalism of graph distances
    3.1.2 This study
  3.2 Methods
    3.2.1 Ensembles
    3.2.2 Graph distance measures
    3.2.3 Description of experiments
  3.3 Results
    3.3.1 Results for homogeneous graph ensembles
    3.3.2 Results for sparse heterogeneous ensembles
  3.4 Discussion

Chapter 4 Reconstructing: Comparing ensembles of reconstructed networks
  4.1 Introduction to the netrd package
    4.1.1 Network reconstruction from time series data
    4.1.2 Simulated network dynamics
    4.1.3 Comparing networks using graph distances
    4.1.4 Related software packages
  4.2 Introduction to the (G, D, R) ensemble
    4.2.1 Framing: A distribution of ground truths
    4.2.2 The (G, D, R) ensemble
  4.3 Methods
    4.3.1 The standardized graph distance
    4.3.2 Description of experiments
  4.4 Results
  4.5 Discussion

Chapter 5 Conclusion

References

Appendices
  6.1 Chapter 2 Appendix
    6.1.1 Table of key terms
    6.1.2 Effective information calculation
    6.1.3 Effective information of common network structures
    6.1.4 Network motifs as causal relationships
    6.1.5 Table of
Recommended publications
  • A Network Approach to Define Modularity of Components In
    A Network Approach to Define Modularity of Components in Complex Products. Manuel E. Sosa, Technology and Operations Management Area, INSEAD, 77305 Fontainebleau, France, e-mail: [email protected]; Steven D. Eppinger, Sloan School of Management, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139; Craig M. Rowles, Pratt & Whitney Aircraft, East Hartford, Connecticut 06108. Modularity has been defined at the product and system levels. However, little effort has gone into defining and quantifying modularity at the component level. We consider complex products as a network of components that share technical interfaces (or connections) in order to function as a whole and define component modularity based on the lack of connectivity among them. Building upon previous work in graph theory and social network analysis, we define three measures of component modularity based on the notion of centrality. Our measures consider how components share direct interfaces with adjacent components, how design interfaces may propagate to nonadjacent components in the product, and how components may act as bridges among other components through their interfaces. We calculate and interpret all three measures of component modularity by studying the product architecture of a large commercial aircraft engine. We illustrate the use of these measures to test the impact of modularity on component redesign. Our results show that the relationship between component modularity and component redesign depends on the type of interfaces connecting product components. We also discuss directions for future work. [DOI: 10.1115/1.2771182] 1 Introduction: Previous research on product architecture has defined modular- [...] The need to measure modularity has been highlighted implicitly by Saleh [12] in his recent invitation "to contribute to the growing field of flexibility in system design" (p.
  • Practical Applications of Community Detection
    Volume 6, Issue 4, April 2016 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Practical Applications of Community Detection Mini Singh Ahuja1, Jatinder Singh2, Neha3 1 Research Scholar, Department of Computer Science, Punjab Technical University, Punjab, India 2 Professor, Department of Computer Science, KC Group of Institutes Nawashahr, Punjab, India 3 Student (M.Tech), Department of Computer Science, Regional Campus Gurdaspur, Punjab, India Abstract: Network is a collection of entities that are interconnected with links. The widespread use of social media applications like youtube, flicker, facebook is responsible for evolution of more complex networks. Online social networks like facebook and twitter are very large and dynamic complex networks. Community is a group of nodes that are more densely connected as compared to nodes outside the community. Within the community nodes are more likely to be connected but less likely to be connected with nodes of other communities. Community detection in such networks is one of the most challenging tasks. Community structures provide solutions to many real world problems. In this paper we have discussed about various applications areas of community structures in complex network. Keywords: complex network, community structures, applications of community structures, online social networks, community detection algorithms……. etc. I. INTRODUCTION Network is a collection of entities called nodes or vertices which are connected through edges or links. Computers that are connected, web pages that link to each other, group of friends are basic examples of network. Complex network is a group of interacting entities with some non trivial dynamical behavior [1].
  • Evolution Leads to Emergence: an Analysis of Protein
    bioRxiv preprint doi: https://doi.org/10.1101/2020.05.03.074419; this version posted May 3, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license. Evolution leads to emergence: An analysis of protein interactomes across the tree of life. Erik Hoel1*, Brennan Klein2,3, Anshuman Swain4, Ross Grebenow5, Michael Levin1. 1 Allen Discovery Center at Tufts University, Medford, MA, USA; 2 Network Science Institute, Northeastern University, Boston, MA, USA; 3 Laboratory for the Modeling of Biological & Sociotechnical Systems, Northeastern University, Boston, USA; 4 Department of Biology, University of Maryland, College Park, MD, USA; 5 Drexel University, Philadelphia, PA, USA. * Author for Correspondence: 200 Boston Ave., Suite 4600, Medford, MA 02155, e-mail: [email protected]. Running title: Evolution leads to higher scales. Keywords: emergence, information, networks, protein, interactomes, evolution. Abstract: The internal workings of biological systems are notoriously difficult to understand.
  • (2020) Small Worlds and Clustering in Spatial Networks
    PHYSICAL REVIEW RESEARCH 2, 023040 (2020). Small worlds and clustering in spatial networks. Marián Boguñá,1,2,* Dmitri Krioukov,3,4,† Pedro Almagro,1,2 and M. Ángeles Serrano1,2,5,‡ 1 Departament de Física de la Matèria Condensada, Universitat de Barcelona, Martí i Franquès 1, E-08028 Barcelona, Spain; 2 Universitat de Barcelona Institute of Complex Systems (UBICS), Universitat de Barcelona, Barcelona, Spain; 3 Network Science Institute, Northeastern University, 177 Huntington avenue, Boston, Massachusetts 022115, USA; 4 Department of Physics, Department of Mathematics, Department of Electrical & Computer Engineering, Northeastern University, 110 Forsyth Street, 111 Dana Research Center, Boston, Massachusetts 02115, USA; 5 Institució Catalana de Recerca i Estudis Avançats (ICREA), Passeig Lluís Companys 23, E-08010 Barcelona, Spain. (Received 20 August 2019; accepted 12 March 2020; published 14 April 2020) Networks with underlying metric spaces attract increasing research attention in network science, statistical physics, applied mathematics, computer science, sociology, and other fields. This attention is further amplified by the current surge of activity in graph embedding. In the vast realm of spatial network models, only a few reproduce even the most basic properties of real-world networks. Here, we focus on three such properties (sparsity, small worldness, and clustering) and identify the general subclass of spatial homogeneous and heterogeneous network models that are sparse small worlds and that have nonzero clustering in the thermodynamic
  • Social Network Analysis with Content and Graphs
    Social Network Analysis with Content and Graphs. William M. Campbell, Charlie K. Dagli, and Clifford J. Weinstein. Social network analysis has undergone a renaissance with the ubiquity and quantity of content from social media, web pages, and sensors. This content is a rich data source for constructing and analyzing social networks, but its enormity and unstructured nature also present multiple challenges. Work at Lincoln Laboratory is addressing the problems in constructing networks from unstructured data, analyzing the community structure of a network, and inferring information from networks. Graph analytics have proven to be valuable tools in solving these challenges. Through the use of these tools, Laboratory researchers have achieved promising results on real-world data. As a consequence of changing economic and social realities, the increased availability of large-scale, real-world sociographic data has ushered in a new era of research and development in social network analysis. The quantity of content-based data created every day by traditional and social media, sensors, and mobile devices provides great opportunities and unique challenges for the automatic analysis, prediction, and summarization in the era of what has been dubbed "Big Data." Lincoln Laboratory has been investigating approaches for computational social network analysis that focus on three areas: constructing social networks, analyzing the structure and dynamics of a community, and developing inferences from social networks. Network construction from general, real-world data presents several unexpected challenges owing to the data domains themselves, e.g., information extraction and preprocessing, and to the data structures used for knowledge
  • Networks in Cognitive Science
    Networks in Cognitive Science. Andrea Baronchelli1,*, Ramon Ferrer-i-Cancho2, Romualdo Pastor-Satorras3, Nick Chater4 and Morten H. Christiansen5,6. 1 Laboratory for the Modeling of Biological and Socio-technical Systems, Northeastern University, Boston, MA 02115, USA; 2 Complexity & Quantitative Linguistics Lab, TALP Research Center, Departament de Llenguatges i Sistemes Informàtics, Universitat Politècnica de Catalunya, Campus Nord, Edifici Omega, E-08034 Barcelona, Spain; 3 Departament de Física i Enginyeria Nuclear, Universitat Politècnica de Catalunya, Campus Nord B4, E-08034 Barcelona, Spain; 4 Behavioural Science Group, Warwick Business School, University of Warwick, Coventry, CV4 7AL, UK; 5 Department of Psychology, Cornell University, Uris Hall, Ithaca, NY 14853, USA; 6 Santa Fe Institute, 1399 Hyde Park Road, Santa Fe, NM 87501, USA. *Corresponding author: email [email protected] Abstract: Networks of interconnected nodes have long played a key role in cognitive science, from artificial neural networks to spreading activation models of semantic memory. Recently, however, a new Network Science has been developed, providing insights into the emergence of global, system-scale properties in contexts as diverse as the Internet, metabolic reactions or collaborations among scientists. Today, the inclusion of network theory into cognitive sciences, and the expansion of complex systems science, promises to significantly change the way in which the organization and dynamics of cognitive and behavioral processes are understood. In this paper, we review recent contributions of network theory at different levels and domains within the cognitive sciences. Humans have more than 10^10 neurons and between 10^14 and 10^15 synapses in their nervous system [1]. Together, neurons and synapses form neural networks, organized into structural and functional sub-networks at many scales [2].
  • Multi-Layered Network Embedding
    Multi-Layered Network Embedding. Jundong Li∗†, Chen Chen∗†, Hanghang Tong∗, Huan Liu∗. Abstract: Network embedding has gained more attentions in recent years. It has been shown that the learned low-dimensional node vector representations could advance a myriad of graph mining tasks such as node classification, community detection, and link prediction. A vast majority of the existing efforts are overwhelmingly devoted to single-layered networks or homogeneous networks with a single type of nodes and node interactions. However, in many real-world applications, a variety of networks could be abstracted and presented in a multi-layered fashion. Typical multi-layered networks include critical infrastructure systems, collaboration platforms, social recommender systems, to name a few. Despite the widespread use of multi-layered networks, it remains a daunting task to learn vector representations of different types of nodes due to the bewildering combination of both within-layer connections and cross-layer network dependencies. [...] tasks. To mitigate this problem, recent studies show that through learning general network embedding representations, many subsequent learning tasks could be greatly enhanced [17, 34, 39]. The basic idea is to learn a low-dimensional node vector representation by leveraging the node proximity manifested in the network topological structure. The vast majority of existing efforts predominately focus on single-layered or homogeneous networks. However, real-world networks are much more complicated as cross-domain interactions between different networks are widely observed, which naturally form a type of multi-layered networks [12, 16, 35]. Critical infrastructure systems are a typical example of multi-layered networks (left part of Figure 1). In this system, the power stations in the power grid are used to provide electricity to routers in the autonomous system network (AS network) and vehicles in the trans-
  • An Ecological Approach to Software Supply Chain Risk Management
    PROC. OF THE 15th PYTHON IN SCIENCE CONF. (SCIPY 2016). An Ecological Approach to Software Supply Chain Risk Management. Sebastian Benthall‡§∗, Travis Pinney‡¶, JC Herz∗∗‡, Kit Plummer‖‡ https://youtu.be/6UnuPhTPdnM Abstract: We approach the problem of software assurance in a novel way inspired by an analytic framework used in natural hazard risk mitigation. Existing approaches to software assurance focus on evaluating individual software projects in isolation. We demonstrate a technique that evaluates an entire ecosystem of software projects, taking into account the dependency structure between packages. Our model analytically separates vulnerability and exposure as elements of software risk, then makes minimal assumptions about the propagation of these values through a software supply chain. Combined with data collected from package management systems, our model indicates "hot spots" in the ecosystem of higher expected risk. We demonstrate this model using data collected from the Python Package Index (PyPI). Our results suggest that Zope and Plone related projects carry the highest risk of all PyPI packages because they are widely used and their core libraries are no longer maintained. With a small number of analytic assumptions about the propagation of vulnerability and exposure through the software dependency network, we have developed a model of ecosystem risk that predicts "hot spots" in need of more investment. In this paper, we demonstrate this model using real software dependency data extracted from PyPI using Ion Channel [IonChannel]. Prior work: [Verdon2004] outline the diversity of methods used for risk analysis in software design. Their emphasis is on architecture-level analysis and its iterative role in software development.
  • Exponential Random Graph Models An
    UNIVERSITY OF CALIFORNIA, LOS ANGELES. Network Statistics and Modeling the World Trade Network: Exponential Random Graph Models and Latent Space Models. A thesis submitted in partial satisfaction of the requirements for the degree Master of Science in Statistics by Anthony James Howell, 2012. © Copyright by Anthony James Howell 2012. ABSTRACT OF THE THESIS: Network Statistics and Modeling the Global Trade Economy: Exponential Random Graph Models and Latent Space Models: Is Geography Dead? by Anthony James Howell, Master of Science in Statistics, University of California, Los Angeles, 2012. Professor David Rigby, Chair. Due to advancements in physics and computer science, networks have become increasingly applied to study a diverse set of interactions, including P2P, neural mapping, transportation, migration and global trade. Recent literature on the world trade network relies only on descriptive network statistics, and few attempts are made to statistically analyze the trade network using stochastic models. To fill this gap, I specify several models using international trade data and apply network statistics to determine the likelihood that a trade tie between two countries is established. I also use latent space models to test the 'geography is dead' thesis. There are two main findings of the paper. First, the "rich club phenomenon" identified in previous works using descriptive statistics no longer holds true when controlling for homophily and transitivity. Second, results from the latent space model refute the 'geography is dead' thesis. The thesis of Anthony James Howell is approved. Nicolas Christou, Mark Handcock, David Rigby (Committee Chair). University of California, Los Angeles, 2012. DEDICATION: This work is dedicated to my loving family who has been instrumental in the completion of this thesis through their unconditional love and support.
  • Modularity and Dynamics on Complex Networks
    Modularity and Dynamics on Complex Networks. Cambridge Elements. DOI: 10.xxxx/xxxxxxxx (do not change). First published online: MMM DD YYYY (do not change). Renaud Lambiotte, University of Oxford; Michael T. Schaub, RWTH Aachen University. Abstract: Complex networks are typically not homogeneous, as they tend to display an array of structures at different scales. A feature that has attracted a lot of research is their modular organisation, i.e., networks may often be considered as being composed of certain building blocks, or modules. In this book, we discuss a number of ways in which this idea of modularity can be conceptualised, focusing specifically on the interplay between modular network structure and dynamics taking place on a network. We discuss, in particular, how modular structure and symmetries may impact on network dynamics and, vice versa, how observations of such dynamics may be used to infer the modular structure. We also revisit several other notions of modularity that have been proposed for complex networks and show how these can be related to and interpreted from the point of view of dynamical processes on networks. Several references and pointers for further discussion and future work should inform practitioners and researchers, and may motivate further studies in this area at the core of Network Science. Keywords: Network Science; modularity; dynamics; community detection; block models; network clustering; graph partitions. JEL classifications: A12, B34, C56, D78, E90. © Renaud Lambiotte, Michael T. Schaub, 2021. ISBNs: xxxxxxxxxxxxx (PB)
  • Configuration Models As an Urn Problem
    Configuration Models as an Urn Problem: The Generalized Hypergeometric Ensemble of Random Graphs. Giona Casiraghi ([email protected]), ETH Zurich; Vahan Nanumyan, ETH Zurich. Research Article. Keywords: fundamental issue of network data science, Monte-Carlo simulations, closed-form probability distribution. Posted Date: March 5th, 2021. DOI: https://doi.org/10.21203/rs.3.rs-254843/v1 License: This work is licensed under a Creative Commons Attribution 4.0 International License. Configuration models as an urn problem: the generalized hypergeometric ensemble of random graphs. Giona Casiraghi1,* and Vahan Nanumyan1. 1 Chair of Systems Design, ETH Zürich, Zürich, 8092, Switzerland. *[email protected] ABSTRACT: A fundamental issue of network data science is the ability to discern observed features that can be expected at random from those beyond such expectations. Configuration models play a crucial role there, allowing us to compare observations against degree-corrected null-models. Nonetheless, existing formulations have limited large-scale data analysis applications either because they require expensive Monte-Carlo simulations or lack the required flexibility to model real-world systems. With the generalized hypergeometric ensemble, we address both problems. To achieve this, we map the configuration model to an urn problem, where edges are represented as balls in an appropriately constructed urn. Doing so, we obtain a random graph model reproducing and extending the properties of standard configuration models, with the critical advantage of a closed-form probability distribution. Introduction: Essential features of complex systems are inferred by studying the deviations of empirical observations from suitable stochastic models. Network models, in particular, have become the state of the art for complex systems analysis, where systems' constituents are represented as vertices, and their interactions are viewed as edges and modelled by means of edge probabilities in a random graph.
  • Networktoolbox
    NetworkToolbox: Methods and Measures for Brain, Cognitive, and Psychometric Network Analysis in R. This manuscript is accepted pending a minor revision to the R Journal. Alexander P. Christensen, University of North Carolina at Greensboro, Department of Psychology, P.O. Box 26170, University of North Carolina at Greensboro, Greensboro, NC, 27402-6170, USA. E-mail: [email protected] ORCiD: 0000-0002-9798-7037. October 8, 2018. This article introduces the NetworkToolbox package for R. Network analysis offers an intuitive perspective on complex phenomena via models depicted by nodes (variables) and edges (correlations). The ability of networks to model complexity has made them the standard approach for modeling the intricate interactions in the brain. Similarly, networks have become an increasingly attractive model for studying the complexity of psychological and psychopathological phenomena. NetworkToolbox aims to provide researchers with state-of-the-art methods and measures for estimating and analyzing brain, cognitive, and psychometric networks. In this article, I introduce NetworkToolbox and provide a tutorial for applying some of the package's functions to personality data. Keywords: psychology, network analysis, graph theory, R. 1 Introduction: Open science is ushering in a new era of psychology where multi-site collaborations are common and big data are readily available. Often, in these data, noise is mixed in with relevant information. Thus, researchers are faced with a challenge: deciphering information from the noise and maintaining the inherent structure of the data while reducing its complexity. Other areas of science have tackled this challenge with the use of network science and graph theory methods.