Generative Modelling and Inverse Problem Solving for Networks in Hyperbolic Space
Total Page:16
File Type:pdf, Size:1020Kb
Generative modelling and inverse problem solving for networks in hyperbolic space Alessandro Muscoloni DISSERTATION to achieve the academic degree Doktoringenieur (Dr.–Ing.) 5th July 2019 Faculty of Computer Science Technische Universität Dresden Committee: Prof. Thorsten Stufe (Chair) Dr. Carlo Vittorio Cannistraci (Subject expert) Prof. Michael Schroeder (1. Reviewer) Prof. Giuseppe Mangioni (2. Reviewer) Prof. Gerhard Weber (Committee member) 1 Acknowledgements After having reached this scientific and career achievement, I would like to personally and warmly thank all the people that contributed to make this possible. To my group leader Carlo, for having guided and supported me through these three years of continuous growth, as well as for sharing my same exploratory mentality during our travels. To my academic supervisor Michael, for the wise suggestions during our meetings, in particular towards the preparation of the dissertation. To the secretaries Gloria, Michelle, Janett and Claudia, for the patient support on the boring bureaucracy. To all my colleagues, in particular to Claudio, Aldo and Alberto for the wonderful time spent together also outside work, as well as to Sara, Phine, Ali, Umberto, Ilyes and all the people that helped to create a positive working environment. To my girlfriend Ceciel, for the lovely experiences shared in the last years. To my family, for always being close despite the distance. 2 Papers & Contributions The three papers of the publication-oriented dissertation are listed below. The detailed contributions are reported for each paper. Paper A A. Muscoloni, J. M. Thomas, S. Ciucci, G. Bianconi, C. V. Cannistraci. (2017). Machine learning meets complex networks via coalescent embedding in the hyperbolic space. Nature Communications, 8, 1615. Alessandro Muscoloni’s contributions: - Design and implementation of the code for the coalescent embedding algorithm and for the evaluation measures HD-corr, C-score and GR-score. - Computational simulations for HD-corr, C-score and GR-score evaluations, community detection and rich-club analyses. - Analysis and interpretation of the results. - Realization of figures and tables. - Analysis of the computational complexity of the coalescent embedding variants. - Drafting of the article. 3 Paper B A. Muscoloni, C. V. Cannistraci. (2018). A nonuniform popularity-similarity optimization (nPSO) model to efficiently generate realistic complex networks with communities. New Journal of Physics, 20, 052002. Alessandro Muscoloni’s contributions: - Design of the nonuniform distributions for the nPSO model. - Design and implementation of the code for the nPSO model. - Computational simulations. - Analysis and interpretation of the results. - Realization of figures and tables. - Mathematical demonstration of the equivalence of the implementations for link generation. - Mathematical analysis of the computational complexity. - Drafting of the article. Paper C A. Muscoloni, C. V. Cannistraci. (2018). Leveraging the nonuniform PSO network model as a benchmark for performance evaluation in community detection and link prediction. New Journal of Physics, 20, 063022. Alessandro Muscoloni’s contributions: - Implementation of the code. - Computational simulations. - Analysis and interpretation of the results. - Realization of figures and tables. - Drafting of the article. 4 Table of Contents Acknowledgements .................................................................................................................... 2 Papers & Contributions .............................................................................................................. 3 Table of Contents ....................................................................................................................... 5 Abstract ...................................................................................................................................... 6 Part I: Research Summary ......................................................................................................... 7 1. Introduction ........................................................................................................................ 7 2. Related work .................................................................................................................... 12 3. Results .............................................................................................................................. 22 4. Discussion ........................................................................................................................ 32 Part II: Papers of the dissertation ............................................................................................. 37 Paper A: Machine learning meets complex networks via coalescent embedding in the hyperbolic space ................................................................................................................... 37 A.1 Introduction ............................................................................................................................ 38 A.2 Results .................................................................................................................................... 39 A.3 Discussion .............................................................................................................................. 48 A.4 Methods.................................................................................................................................. 51 A.5 Figures and Tables ................................................................................................................. 70 A.7 Supplementary Information ................................................................................................... 85 Paper B: A nonuniform popularity-similarity optimization (nPSO) model to efficiently generate realistic complex networks with communities .................................................... 129 B.1 Introduction .......................................................................................................................... 130 B.2 Methods ................................................................................................................................ 131 B.3 Results and Discussion ......................................................................................................... 140 B.4 Conclusion ............................................................................................................................ 151 B.5 Figures .................................................................................................................................. 155 B.6 Supplementary Information .................................................................................................. 173 Paper C: Leveraging the nonuniform PSO network model as a benchmark for performance evaluation in community detection and link prediction ..................................................... 199 C.1 Introduction .......................................................................................................................... 200 C.2 Results and Discussion ......................................................................................................... 201 C.3 Conclusion ............................................................................................................................ 209 C.4 Methods ................................................................................................................................ 210 C.5 Figures and Tables ............................................................................................................... 225 C.6 Supplementary Information .................................................................................................. 234 Bibliography .......................................................................................................................... 244 5 Abstract The investigation of the latent geometrical space behind complex network topologies is a fervid topic in current network science and the hyperbolic space is one of the most studied, because it seems associated to the structural organization of many real complex systems. The popularity-similarity-optimization (PSO) generative model is able to grow random geometric graphs in the hyperbolic space with realistic properties such as clustering, small-worldness, scale-freeness and rich-clubness. However, it misses to reproduce an important feature of real complex systems, which is the community organization. Here, we introduce the nonuniform PSO (nPSO) generative model, a generalization of the PSO model with a tailored community structure, and we provide an efficient algorithmic implementation with a O(EN) time complexity, where N is the number of nodes and E the number of edges. Meanwhile, in recent years, the inverse problem has also gained increasing attention: given a network topology, how to provide an accurate mapping into its latent geometrical space. Unlike previous attempts based on a computationally expensive maximum likelihood optimization (whose time complexity is between O(N3) and O(N4)), here we show that a class of methods based on nonlinear dimensionality reduction can solve the problem with higher precision and reducing the time complexity to O(N2). 6 Part I: Research Summary 1. Introduction The Oxford English Dictionary (OED, 2019) provides several definitions for the term network,