0848736-Bachelorproject Peter Verleijsdonk

Eindhoven University of Technology
Department of Mathematics and Computer Science

Site Percolation on the Hierarchical Configuration Model

Bachelor Thesis

P. Verleijsdonk

Supervisors: prof. dr. R.W. van der Hofstad, C. Stegehuis (MSc)

Eindhoven, March 2017 (award date: 2017)

Abstract

This paper extends the research on percolation on the hierarchical configuration model. The hierarchical configuration model is a configuration model where single vertices are replaced by small community structures. We study site percolation on the hierarchical configuration model, as well as the critical percolation value, the size of the giant component and the distance distribution after percolation. For this we use analytical methods and a stochastic simulation.
Contents

1 Introduction
  1.1 Report Structure
2 Model Description
  2.1 Configuration model
  2.2 Hierarchical Configuration Model
  2.3 Community structure
  2.4 Model Assumptions
3 Site Percolation
  3.1 Site Percolation On Special Community Structures
    3.1.1 Household Communities
    3.1.2 Star Communities
    3.1.3 Line Communities
  3.2 Site Percolation on the Configuration Model
  3.3 Site Percolation on the Hierarchical Configuration Model
  3.4 Existence of a Giant Component after Site Percolation
4 Simulation
  4.1 Simulation Motivation
  4.2 Simulation Description
5 Simulation Results
  5.1 Size of the Giant Component
  5.2 Distances
6 Conclusions and discussion
7 Appendix
  7.1 Table of Definitions
References

1 Introduction

This thesis is about community structures in graphs. Random graphs constructed using the so-called configuration model are often studied, but they do not match real-world graph structures well. A configuration model is a random graph whose degree distribution is known beforehand. Every vertex has outgoing half-edges, which are paired using a random permutation of a suitable vector that has one entry for every open half-edge. The resulting pairing is called a configuration. A more formal definition of this algorithm is given later.
Since configuration models are widely studied, many results exist for this kind of graph structure. In order to use these results, we study an extension of the configuration model, called the hierarchical configuration model. On the large scale these random graphs are a configuration model, but they are built from small communities: the hierarchical configuration model replaces every vertex by a small community and then connects these communities in the same way as the configuration model. The advantage of such random graphs is that they contain more short cycles and triangles than configuration model random graphs, which have few short cycles or community structures. A triangle is a set of three vertices which are all adjacent to each other.

Real-life problems often depend on how people or networks are connected [8]. The configuration model does not include structures like family relations or important vertices. For example, a certain vertex may play a more important or different role than other vertices inside a neighborhood. Graph structures are important in studying mathematical models. A realistic example of such a mathematical model is the spreading of a virus through a network, using the Reed-Frost model with a connectivity matrix. A connectivity matrix can be interpreted as the blueprint of a graph, and hence introducing community structures might improve the accuracy of the model.

We introduce community structures which are found in real-life networks such as Facebook networks or routing networks. The Facebook network is widely studied, for example by Ugander et al. [9]. Facebook networks are locally highly connected and consist of many triangles and short cycles.
To illustrate this, the mean distance between two randomly chosen users was just 4.7, as calculated in May 2011. This holds for the entire social graph; when restricted to the United States, the mean distance between two randomly chosen people was only 4.3 [9]. Local routing networks are traditionally introduced as star networks. A classic computer network model consists of a central hub and adjacent components. The main disadvantage of such networks is that the central hub is a single point of failure [7].

The main research question is how introducing community structures changes model specifics. We compare both models by looking at graph properties such as connected components and distances. We derive new results in percolation theory, namely sufficient conditions for the existence of a giant component after site percolation for specific community structures, and we show that introducing community structures does not necessarily increase the critical percolation threshold or the mean distance. Site or bond percolation can be seen as the process of deleting vertices or edges, respectively, with a given probability. Furthermore, we analyze the empirical distance distributions obtained from stochastic simulation and present a statistical analysis.

1.1 Report Structure

This report is structured as follows. First, the mathematical objects are formally introduced and we explain which properties are studied and why. It is important to describe precisely how the hierarchical configuration model extends a configuration model random graph and under what conditions the derived results hold. This is done in Section 2. In Section 3, we study several community structures using an analytical approach. Since these properties are suitable for a stochastic simulation, the next main section, Section 4, is dedicated to explaining this sort of simulation.
Then these simulation results are presented, analyzed and compared to the prior knowledge in Section 5.

2 Model Description

Here the mathematical objects which will be studied are introduced. For ease of reference, a list of definitions is included in the appendix. A configuration model or a hierarchical configuration model is a random graph model. We start with the definition of an arbitrary graph.

Definition 1. A graph $G = (V, E)$ is an ordered pair where $V$ and $E \subset V^2$ represent the set of vertices and the set of edges, respectively. If $(w, v) \in E$, we write $w \sim v$ as shorthand.

Thus a graph consists of a set of vertices and a set that describes which vertices are connected.

Definition 2. The neighborhood $N_G(v) = \{u \in V(G) \mid u \sim v\} \subset V$ of a vertex $v$ in a graph $G$ is the set of all vertices which are adjacent to $v$.

A simple graph is a special kind of graph:

Definition 3. A graph $G = (V, E)$ is called simple if and only if $\sim$ is a symmetric and anti-reflexive relation. Furthermore, every element in $E$ is unique.

So a simple graph is an unweighted, undirected graph in which self-loops and multiple edges are not allowed. We now introduce a special kind of graph structure, the configuration model.

2.1 Configuration model

The definition of a configuration model is straightforward [4]. The configuration model is an algorithm which produces a random graph $G$ with $N$ vertices where every vertex $v$ has $k_v$ outgoing half-edges. The sequence $(k_i)$ is called the degree sequence and is assumed to be known beforehand. The first step in constructing the graph is fixing $N$. The next step is attaching $k_i$ half-edges to every vertex $i$ in the network. The graph is completed by matching the half-edges uniformly at random. This can be done either by applying a random permutation to a suitable vector which has one entry for every half-edge, or by iteratively selecting random half-edges until all half-edges are connected. It follows that $\sum_i k_i$ must be even. Note that
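
The half-edge matching described in Section 2.1 can be illustrated with a minimal Python sketch. It uses the random-permutation variant: a vector with one entry per half-edge is shuffled and consecutive entries are paired. The function names `configuration_model` and `is_simple`, the edge-list representation and the `seed` parameter are assumptions made for this illustration; they are not taken from the thesis or from its simulation code.

```python
import random


def configuration_model(degrees, seed=None):
    """Pair half-edges uniformly at random (illustrative sketch).

    degrees[v] is the number of half-edges k_v attached to vertex v.
    Returns a list of edges (u, v); self-loops and multiple edges may occur.
    """
    if sum(degrees) % 2 != 0:
        # Every edge uses exactly two half-edges, so the total must be even.
        raise ValueError("the sum of the degrees must be even")
    rng = random.Random(seed)
    # One entry per half-edge: vertex v appears degrees[v] times.
    half_edges = [v for v, k in enumerate(degrees) for _ in range(k)]
    # A uniformly random permutation of the half-edge vector.
    rng.shuffle(half_edges)
    # Consecutive entries of the permuted vector form an edge.
    return [(half_edges[i], half_edges[i + 1])
            for i in range(0, len(half_edges), 2)]


def is_simple(edges):
    """Check Definition 3: no self-loops and no repeated edges."""
    seen = set()
    for u, v in edges:
        if u == v:
            return False  # self-loop
        key = (min(u, v), max(u, v))  # undirected edge, order-independent
        if key in seen:
            return False  # multiple edge
        seen.add(key)
    return True
```

For example, `configuration_model([3, 2, 2, 1], seed=1)` returns four edges in which vertex 0 appears three times, while `configuration_model([1, 2])` raises a `ValueError` because the degree sum is odd. The output is in general a multigraph, so `is_simple` can be used to test whether a particular configuration happens to be a simple graph.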
